Here's another issue I couldn't find a simple answer to. I'm building a website in PHP, and I want a simple front controller. In other words, I want every bit of traffic to run through a central script that farms out the traffic by parsing the URL. And I want it to be silent – I don't want the user to know the redirects are happening. Now, I could do this with query strings – that's like
http://mysite.com/index.php?page=foo&action=bar
But that's ugly, and it gets complicated fast.
Instead, I want to do it in the user-friendly, semantically meaningful, SEO-ok way, like:
http://mysite.com/foo/bar
I want to use the path structure instead of the query string. This is a common pattern for Ruby and Python, but maybe less so in PHP. So, here's how I did it.
(Note: This isn't rocket science. Yet it was still hard to find. And I'm sure there are 1000 ways this could be done better, cleaner, faster, more. But I'm a functional coder. I don't care if it's the most elegant, I care if it works. And this works.)
There are two parts to the controller. The first is the .htaccess file. I'm using XAMPP on Windows Vista as my development environment. If you need help getting .htaccess to work there, check out my earlier post.
RewriteEngine on
RewriteCond $1 !^(index\.php|lib|parts|pages|images|robots\.txt)
RewriteRule ^(.*)$ /your/webroot/index.php/$1 [L]
Here's what this does. (Full disclosure: I adapted this from CodeIgniter's model…) The first line enables mod_rewrite. It's a must. The second line sets the conditions for the rewrite. It says, rewrite everything except what's listed in the parentheses. If you have additional directories that you keep images, CSS files, or other things in, you just add them to the list in parens with a '|' between. Finally, the last line sends everything through the main controller, which is index.php, but without actually changing the URL in the address bar. This part is what makes the whole thing transparent to the user. Good stuff.
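If you want to sanity-check which paths the condition lets through, here's a rough PHP sketch that applies the same pattern to a few sample paths. The function name isRewritten is made up for illustration, and I'm assuming the path is relative to the web root with no leading slash, which is how the rule sees it inside .htaccess.

```php
<?php
// Illustrative only: mirrors the RewriteCond pattern from the .htaccess rule.
// Assumes $path is relative to the web root, with no leading slash.
function isRewritten(string $path): bool
{
    $excluded = '/^(index\.php|lib|parts|pages|images|robots\.txt)/';

    // The rule fires only when the path does NOT start with an excluded name
    return preg_match($excluded, $path) !== 1;
}

var_dump(isRewritten('foo/bar'));         // bool(true)
var_dump(isRewritten('images/logo.png')); // bool(false)
var_dump(isRewritten('library/books'));   // bool(false)
```

Note the last case: the pattern only anchors the start of the path, so anything that merely begins with 'lib' (like 'library') gets skipped too. Worth keeping in mind when you name your pages.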
Ok, so now we can type something like 'http://mysite.com/foo/bar' and Apache will silently rewrite it to 'http://mysite.com/index.php/foo/bar'. All that's left is to set up index.php to handle the incoming request. There are plenty of more object-oriented ways to do this, but for my purposes, the simple procedural way works best: a switch statement.
//A file with all the common configuration, like web roots, security,
//database, language, etc.
//Your header, the same regardless of the page
//This is the content area that will change based on the URL.
//My div is called 'textarea'. Yours might be called something else.
//The switch statement
$url = substr($_SERVER['REQUEST_URI'], strlen(URLROOT));
//the default is an error!
//End your content div here
//Your footer, the same regardless of the page
That's it. No mystery. In that first line of PHP ($url = …), I get the path string that's in the address bar – so this will show what the user typed before we did the redirect. Then, in my config.php I've defined a constant called 'URLROOT' that corresponds to the path in my local environment. Using substr, I snip out only the part of the path that comes after the root, and feed that to my switch statement.
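To make the switch idea concrete, here's a minimal sketch of the routing step on its own. It's not my exact index.php: the function name resolvePage, the URLROOT value, and the page file names are all placeholders I've made up for illustration. It also strips any query string first, which the one-liner above doesn't do.

```php
<?php
// Hypothetical value; the real one would be defined in config.php.
define('URLROOT', '/your/webroot/');

// Map the requested URI to a page file. The page names are placeholders.
function resolvePage(string $requestUri, string $urlRoot): string
{
    // Keep only the path part, dropping any query string
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '';
    }

    // Snip off the root prefix, then any leading or trailing slashes
    $url = trim((string) substr($path, strlen($urlRoot)), '/');

    switch ($url) {
        case '':
            return 'pages/home.php';
        case 'foo/bar':
            return 'pages/foo_bar.php';
        default:
            // the default is an error!
            return 'pages/404.php';
    }
}

echo resolvePage('/your/webroot/foo/bar', URLROOT), "\n"; // pages/foo_bar.php
```

In index.php you'd call this with $_SERVER['REQUEST_URI'] and include the returned file between your header and footer. Returning a file name instead of including inside the switch just makes the routing easy to test by itself.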
Like I said, I'm sure a more experienced coder has 1000 better ways to do this. But, give it a try. It worked for me!
Sat 4 Oct 2008
Recently I was trying to get .htaccess control working under Windows and XAMPP. I found it surprisingly hard to find the answer, even though it's a simple one. You've got to make two changes to the httpd.conf file, which for me was found in C:/xampp/apache/conf.
- Search the httpd.conf file for 'mod_rewrite'. You'll find the statement that loads that module is commented out. Remove the '#' at the start of the line to un-comment it.
- By default, XAMPP on Windows does not allow .htaccess files to override the httpd.conf file. Search the httpd.conf file for 'AllowOverride'. You should find one or more statements that say 'AllowOverride None'. Wherever you find it (or more selectively, if you're savvy like that), change it to 'AllowOverride All'.
Don't forget to restart Apache, and you're done!
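One way to confirm the first change took effect is to ask PHP which Apache modules are loaded. This is only a sketch: hasModRewrite is a helper name I made up, and apache_get_modules() is only available when PHP runs as an Apache module (the usual XAMPP setup, as far as I know), so the script guards for that.

```php
<?php
// Hypothetical helper: report whether mod_rewrite appears in a module list.
function hasModRewrite(array $modules): bool
{
    return in_array('mod_rewrite', $modules, true);
}

// apache_get_modules() only exists when PHP runs as an Apache module
if (function_exists('apache_get_modules')) {
    echo hasModRewrite(apache_get_modules())
        ? "mod_rewrite is loaded\n"
        : "mod_rewrite is NOT loaded\n";
} else {
    echo "Can't check: PHP isn't running as an Apache module\n";
}
```

Drop it in a file under your web root and load it in the browser after restarting Apache. It won't tell you anything about AllowOverride, though; for that, the simplest test is whether your .htaccess rules actually fire.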
Thu 26 Jun 2008
In an easily recognizable, but nonetheless idiotic, ploy to sell magazines, Wired's editor-in-chief Chris Anderson has published a short article called The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. In it he claims that the mere availability of data on a huge scale means that theories and models are unnecessary. As long as we have statistics that can pick trends and correlations out of the madness, we don't need the scientific method anymore.
I'll let this excellent rebuttal by John Timmer at Ars Technica do most of the work in explaining why Anderson's argument is so flawed it should never have been printed. (Ah hah! More evidence against Andrew Keen's argument for the return of the old-school editor! Anderson's crap would never have passed muster on Wikipedia.)
For me, the most important rebuttal is about falsification and repair. Without theories that we can test, how can we know we're wrong? What will being wrong look like? Without reasoned explanations for why things happen, how will we know what to do when things break? The reason that scientists are so wary of correlations is that they offer no explanatory power – they're misleading as often as not. If we act on them, completely ignorant of the underlying mechanism, we don't learn anything at all. Anderson's most staggeringly ignorant move is to suggest that theories and models are somehow unnecessary simply because they're often wrong. Wha? I guess the benefit of never having a model or a theory is that if you make no assumptions or predictions, no one can ever disprove you.
I'd be willing to dismiss Anderson entirely if he hadn't tackled what, presented differently, is an interesting topic. As Timmer says, certainly the availability of massive data is changing the way we do science. But the end of theory? C'mon, Chris, that's ridiculous, and a transparent attempt to appeal to the data-heads that read Wired. This point of view is SO common, at least around San Francisco. I'm amazed that otherwise smart people would adopt such an ignorant, arrogant point of view. Fighting this kind of thinking is depressing. It reminds me of something Anthony Bourdain said about his most hated chef nemesis, Rachael Ray. (I noted this in a previous post.)
Complain all you want. It’s like railing against the pounding surf. She only grows stronger and more powerful. Her ear-shattering tones louder and louder. We KNOW she can’t cook… She’s a friendly, familiar face who appears regularly on our screens to tell us that “Even your dumb, lazy ass can cook this!” Wallowing in your own crapulence on your Cheeto-littered couch you watch her and think, “Hell…I could do that. I ain’t gonna…but I could–if I wanted! Now where’s my damn jug a Diet Pepsi?”
A lazy, soulless, superficial, inexplicably popular idea. Its days are numbered, though. I predict that inside of 5 years, Google is going to hit the wall on its data-center driven problem solving. They'll call for more cowbell, find there's none to be had, and return to the land of the living where the rest of us live.
Sat 16 Feb 2008
Take a look at these results, recently released by Hitwise and reported on TechCrunch. Couldn't be more skeptical, I must say. Hitwise's methodology seems pretty typical of web survey and analytics companies. They're subject to a huge number of biases to begin with, and systematically over-represent certain parts of the population and certain contexts, despite their best efforts. I know Hitwise is doing everything they can to combat these biases, but they don't go into enough detail on their website for us to be sure of how. I have a strong suspicion, for example, that these results are not so much representative of SES (socioeconomic status) as of geography. But SES makes for a better story. That big purple blob at the bottom right probably represents suburban areas in a very few markets like San Francisco, New York, Boston. The top left quadrant, that's middle America. And none of this is news – Yahoo knows the heartland is their wheelhouse. Plus, what's 'Varying Lifestyles'? Is that the catch-all for all the people they can't pigeonhole?
We'll continue to see analytics like this, of course, but I think recent news should make us all more skeptical. Advertising is a business that has been run on analytics from the beginning, and unsurprisingly, they got it very, very wrong from the beginning. The knowledge that a small percentage of individuals do most of the clicking (and very little of the buying) should shake the industry up, but it won't. So sad!
Thu 6 Dec 2007
The Economist recently reported the results of Radiohead's bold experiment in giving their new album away for free on the internet. It turns out 60% of people paid nothing for the album – unsurprising if you believe, as many economists do, that a rational, self-interested individual would not pay for something he could get for free. And yet, 40% of people paid something for the album, quite a few of them more than they would have paid if they'd been able to download the album from iTunes or Amazon. Who are these people?
One window into that question might be opened by looking at the pricing data longitudinally. A few weeks after the album was released, I remember reading that the number of free-riders was only about 30%, though I can't remember the source. But still, it puts the question out there: how did the distribution of prices change over time?
My completely unsupported guess is that the vast majority of the high outliers came right away – motivated fans, ideological supporters of new music models, enemies of the big record companies. Even if we take the narrow view of pure rationalism, we can call these people 'rational zealots' – we must factor belief in, and promotion of, a valuable cause into the price they were willing to pay. I'd also guess that the percentage of non-payers increased dramatically as time went on, and that these days most people download the album for free. Who's got the data for us to check?! Any way you slice it, this is a cool experiment.
Fri 12 Oct 2007
Via TechCrunch, I read about a fascinating piece of work by Robert Rohde that seems to suggest that Wikipedia's astonishing rate of growth over the last few years is slowing down a bit. Check out this page, complete with interesting infographics like the one below.
Reading through the comments about this both on the Wikipedia talk pages and WikiEN-l archive is pretty revealing. Many people are extremely passionate about Wikipedia and unwilling to accept the validity of any argument that is seen to besmirch its good name. That leads to a lot of silly counter-arguments based on rhetoric and ideology rather than data.
The interesting thing is that I wouldn't jump to the conclusion that the slowdown, if it really does exist, is actually a bad thing for Wikipedia. It's very hard to interpret statistical analyses of logfiles. For instance, Rohde's analysis seems to show that overall edits are down slightly, and that a higher percentage of edits are reverts of earlier versions. Without knowing something more about the qualitative nature of these edits, it's hard to assume this is some kind of 'Mid-Life Crisis' slowdown as TechCrunch suggests. This could be a sign of maturity – more well-reasoned edits overall, perhaps. Or it could be a sign of change in the nature of contributions (and contributors). Wikipedia may be attracting a large proportion of users who make fewer, more substantive edits rather than many tiny corrections. We just don't know.
Fantastic food for thought, though.
Sat 24 Feb 2007
I recently found this commentary by Jason Calacanis on what he calls 'Wikipedia's Technological Obscurification'. Basically, Jason argues that there are three primary factors that keep many folks from contributing to Wikipedia:
- The lack of a WYSIWYG editor (what you see is what you get)
- The use of discussion pages that are hard to understand
- The use of IRC for many of the meta-discussions about how Wikipedia is run
Jason quite rightly points out that these are not intentional mechanisms for blocking participation, and neither are they necessarily a bad thing. He comes down with the point that these conditions are remnants of old technologies, and that Wikipedia has lacked the resources to move to more modern ones. I don't buy this last argument, and I want to reframe the question a bit.
First, I think this example shows us that although Wikipedia's rhetoric has been all about openness, in practice it doesn't really get there. My adviser Coye Cheshire often points out that even the most open of public goods on the internet end up enacting what he calls 'rings of hegemony'. In other words, we can think of the openness as starting only once we get to a certain point on the totem pole. Above that, there are still hierarchies of power that grow out of the need to do things like make rules, pay bills, and manage servers. I don't think this is a knock against Wikipedia at all. It's just a reality check.
Second, I take a completely different view from Jason about why these technologies persist. A lack of resources may be a part of it, but more important is the fact that Wikipedia is a culture with entrenched practices. Its core contributors apparently ascribe some meaning and value to technologies like Wiki markup and IRC that help them persist even when they become outmoded. This reminds us that many of these open projects are driven by a small group of zealots, even when the number of contributors overall gets very large.
Finally, Coye and I have recently written an article (forthcoming) which makes the point that online public goods systems may tend to move from less order to more order over time and as they get larger. We define order as the degree to which the process by which the public good is produced and the product which constitutes it are clearly defined. Certainly wiki markup and IRC present a barrier to entry (which may or may not be intentional), but we can also think of stubbornly adhering to older technologies as one way of imposing order. By doing nothing, they essentially set up a structured process that is controlled by access to certain skills. Another way to impose order, of course, is to create new technologies and hierarchies (which they are also doing).
Think about it this way: why does Congress continue to adhere to a set of highly complex and arcane rules and procedures? Because they're necessary? Probably not. I'd argue it's more because the fact that they're so hard to understand gives a measure of power to more experienced lawmakers, thereby implying an order to the body. Imposing new rules to provide order would work too, but it would not necessarily privilege older lawmakers.
Tue 8 Aug 2006
Foolish though it is to reblog Slashdot posts, I wanted to toss in a comment about a recent WSJ article posted there. In 'Many Companies Still Cling to Big Hits To Drive Earnings' Lee Gomes argues that big hits, blockbusters, and the top 5% still dominate the market today. The Long Tail, he argues, isn't as relevant as we might think.
The thing is, Gomes was just trying to give some sound business advice: don't think you have to switch to a brand new model right away: the tried and true one still works. But in the process he's misrepresented the whole idea of the Long Tail. The idea isn't that blockbusters won't be important anymore, just that they won't be the only important thing. When many folks have access to the channels of distribution, and diverse content is always available, it creates another option for those who want it. The blockbuster has some competition.
The Barnes & Nobles of the world are in no immediate danger from the competition of the Long Tail. But if they're smart, they'll realize, unlike Mr. Gomes, that the two aren't mutually exclusive. Be a hits-oriented company, but cultivate a symbiotic community of producers, consumers, commentators, searchers, stockpilers, and experts. After all, today's blockbusters will be sliding down that curve before long. Why lose their value entirely?
Thu 11 May 2006
Today and tomorrow are the culmination of a long, busy semester of working on Mycroft. I'm really proud of what we've accomplished so far, and excited that we finally get to show it off. There are two end-of-semester events at the iSchool that you might come to, if you're so inclined. For a quick overview of what my cohorts and I have been doing, you can come to the Final Project Showcase, from 4-7 tonight, in South Hall Room 110. (Check out all the project descriptions.) For a deeper look, tomorrow morning's got a series of 30-minute talks about all the projects. Check out the schedule here. We'll be giving away (some of) the deepest, darkest secrets about Mycroft at our talk at 11:30 in Room 202. Hope to see you there!