March 2009


If you don't get it, it wouldn't be funny anyway (via Tamar, via William):

Chat-en-Oeuf


(Click for larger image.)

Tamar and I were just in Santa Fe for the Society for Applied Anthropology conference. The conference was meh, but Santa Fe is fairly great. In particular, we both love the food there. New Mexico cuisine is heavily influenced by Tex-Mex, or straight-up Mexican, but it's all about the New Mexico chiles. Green, red, or Christmas (that's both). Put that sauce on cardboard and it would taste great.

But I digress. Five days at a conference means a lot of restaurants, which means a lot of searching on Yelp, TripAdvisor, ChowHound, etc. As anyone who's looked for a restaurant without local tips knows, it can be a chore. In the past, it's been argued that a big problem with online reviews is that they attract the extremes: people who love a place, and people who hate it. So, the meta-rating inevitably consists of a raft of 1's and a raft of 5's that average out to a solid, mediocre 3. When every place is a 3, online reviews aren't much more useful than a phone book – which, granted, is a little useful.

But in the process of using a variety of online review sites in Santa Fe, I noticed that the nature of the reviews could be changing. For one thing, it didn't appear to me that everyone was so extreme. I saw lots of 2's, 3.5's, 4's, etc. Lots of thoughtful reviews, people weighing their priorities, making substantive comments. This made me wonder if the extremes problem was at least partly an issue of growing pains. When online review sites were for early adopters and tech-savvy folks, they were primarily used as venues for rants and raves. Now that they're much more mainstream, it could be that the moderate, balanced opinions are becoming the norm.

Once every meta-review isn't a 3, those scores can start to be really useful. But that points out another key weakness of online review sites. Review aggregators are all about the wisdom of crowds – that's the grand, cantankerous idea that a person is dumb, but people are smart. That individuals can be biased and wrong, but given a diverse enough group, all the biases cancel each other out and what's left is the good stuff. The wisdom.

But the dirty secret about the wisdom is that it's only as good as the crowd. As Tamar wisely pointed out, one good thing about experts is that we can find one that we agree with. We seek out someone who we feel shares our taste in food, wine, restaurant experience, and we trust them. With a meta-review, we know that the biases should cancel each other out to reveal the truth about the restaurant. But there is no truth (spoon). There's only the preference of the population. And without knowing anything about those preferences, the meta-review loses most of its value.

So, realizing the trouble, what does your average review-reader do then? Of course, we turn from the meta-review to the reviews themselves. It's a reasonable course of action, a logical next step, and it feels good. But there's no wisdom in it. Once we start reading individual stories and experiences, the wisdom of crowds is gone. Now we're just getting idiosyncratic little snippets of experience that are probably not representative of the restaurant. Our search will inevitably be biased by the way that the reviews are sorted, or by the happenstance chronology of whether a bad, mediocre, or good review was the last one posted. Worse than that, our search will be subject to all kinds of social psychological biases that are interesting and appealing, but useless if we're looking for a good restaurant. We'll do things like give more weight to the first and last reviews we read, and specifically (but unconsciously) seek out reviews that validate things we already think.

Put these two issues together, and we've got a big problem for online review sites. The meta-review is of limited use because it lies: it purports to represent wisdom, but without knowing the crowd we don't know how much. The individual reviews feel good, but the wisdom doesn't lie there. (See what I did with the title? I hate myself.) The latter is a big problem for the Yelps out there, because part of what makes us so ready to devote time to reviews is knowing that our story is out there, that our words will be read, and that what we think matters.

There are good solutions to both of these problems – solutions that I think will drastically improve online review sites. First, meta-reviews will be more and more useful the more we know about the underlying population. Review sites should start surveying their users to find out their priorities about whatever is being rated. This sounds boring, but there are lots of creative ways to get this type of info. Jane cares a lot about the food, and isn't bothered by slow service because she doesn't mind sitting and chatting. Billy Bob isn't picky about food, he'll enjoy almost anything you serve him, but he thinks what he's really paying for is the service, so it'd better be Johnny on the spot. Peter won't go to a restaurant that doesn't allow corkage and have good stemware no matter how good the food and service are. With this kind of information, I'll be able to filter reviews based on my own preferences.
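To make the filtering idea concrete, here's a minimal sketch in Python. Everything in it is an assumption for illustration: the priority profiles (weights that add up to 1), the `similarity` measure, and the function names are all hypothetical, and no review site works this way as far as I know. It simply weights each star rating by how closely the reviewer's priorities match mine.

```python
def similarity(mine, theirs):
    """Return 1.0 for identical priority profiles and 0.0 for opposite ones.

    Assumes each profile maps a criterion to a weight and the weights sum to 1.
    """
    keys = mine.keys() | theirs.keys()
    distance = sum(abs(mine.get(k, 0.0) - theirs.get(k, 0.0)) for k in keys)
    return 1.0 - distance / 2.0


def personalized_rating(reviews, my_priorities):
    """Average the star ratings, weighting each by reviewer similarity."""
    total = 0.0
    weight_sum = 0.0
    for stars, reviewer_priorities in reviews:
        weight = similarity(my_priorities, reviewer_priorities)
        total += weight * stars
        weight_sum += weight
    # If nobody shares any of my priorities there is nothing to report.
    return total / weight_sum if weight_sum else None


# Hypothetical reviewers: Jane only cares about food, Billy Bob about service.
jane = {"food": 1.0}
billy_bob = {"service": 1.0}
reviews = [(5, jane), (1, billy_bob)]
```

The plain average of those two reviews is the usual mediocre 3. A reader who only cares about food gets Jane's 5, one who only cares about service gets Billy Bob's 1, and someone who splits the difference is back at 3.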

Individual reviews have their place too, but primarily as expert-finding mechanisms. Tamar was saying that when she reads reviews, she looks for certain adjectives, certain things about the ways that people write that give her confidence. These are, essentially, things that help her find experts. Once she's found them, if they're regular contributors she can subscribe. You can already do this sort of thing on many review sites, but it's secondary. Individual reviews need to be abstracted from meta-reviews somehow. Not hidden, but divorced from the search-flow in which reading the reviews inevitably follows looking at the meta-review. Doing these things would make review sites 10x better.

An important rule I apply to writing academic papers is this: let the reader decide what's interesting and exciting. I read too many papers in which the authors deem their results interesting themselves, as though this were self-evident to everyone. I understand the urge, and I often fight it, sometimes unsuccessfully. I want to share with people what I find particularly exciting in my own findings. But the truth is, that's not for me to say. It's for others to decide what's interesting and exciting. That's not to say authors shouldn't highlight what they think is notable or surprising.

Now, this is an unrealistic goal – everyone breaks this rule in their papers. But I think it's important to shoot for. This is a 'show, don't tell' idea. I believe that if I've done good research, presented my topic and my arguments well, and written in an engaging style, I shouldn't have to tell readers what's interesting. And when I read papers that do too much of it, it detracts from the paper, and it loses some legitimacy in my eyes. I instantly think: If this were an interesting finding, I would have thought that already. The authors telling me they think it's interesting only makes me suspicious.

The same thing goes for wine labels. This weekend we had a nice visit from Tamar's parents, and we went wine tasting up in the Russian River. Ending up in Healdsburg, we tasted at Williamson's tasting room just off the town's main square. These were fairly good wines, and the tasting room does a unique thing by pairing each wine in the tasting with a little bit of food. Something different for each wine. But the bottles are so tacky. They tell all about the winery's Australian ex-pat owners, but don't even tell us what grapes are in the wines. What's worse, the description of the wine breaks my cardinal rule. It says something like (I'm making this up as I don't have the bottle in front of me…):

This exquisite Merlot shows hints of blueberry and ripe cherry on the nose, along with interesting undertones of earth and oak. The balanced tannins and long finish of this wine are ample evidence of excellent structure.

To me, this is tacky. A wine label should tell me a little bit about the winery, maybe much, much less about its owners, and focus on the wine. It should tell me what's in the wine, even if it's 100% Merlot. And it should absolutely describe what the wine-maker thinks about the characteristics of the wine, what foods it should be paired with, etc. But leave the interesting part to the drinker. Let me decide if it has excellent structure, etc. etc., etc. You get the point.

I have to deal with fairly large database imports from time to time, which can be a bit of a challenge. phpMyAdmin fails pretty miserably when dealing with large database dumps or database imports. Either the SQL files are too large, or the import process takes too long. There are lots of ways around this, but after trying most of them I've found what I think is the easiest way to handle large database imports into MySQL on Windows.

All credit due to Greg at dittio.net who posted a tutorial on this. My only contribution is to say that this works on Windows as well. Here's how I did it:

  1. To make things easy, I moved the SQL file I was importing to a root directory – C:\ or similar.
  2. Run mysql. I don't have it in my path, so I had to navigate to the proper directory to find the .exe. I use the excellent XAMPP, so for me that was C:\xampp\mysql\bin\. On the laptop that I use for development I have no password set for the root user (security alert! OMG!), so the command was just 'mysql -u root'. If you had a password set the command would be 'mysql -u root -p'.
  3. This tutorial assumes there's already a database you want to use. If not, look here for a tutorial on how to create one. Otherwise, switch to that database using the command 'USE {databasename}'.
  4. Then just identify the source SQL file using 'SOURCE {pathtofile};'. So, for me that was 'SOURCE C:\dbDump.sql;'. Don't forget that semicolon!

That's it. The import of a 3 MB SQL file took about a second. Swoosh.
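If you'd rather script the import than type it, here's a rough sketch in Python. It doesn't use SOURCE at all. Instead it relies on the standard non-interactive form, `mysql -u root databasename < dump.sql`, which feeds the file to the client on standard input. The function names are my own invention, and the sketch assumes the database already exists.

```python
import subprocess
from pathlib import Path


def build_import_command(database, user="root", ask_password=False, mysql_exe="mysql"):
    """Build the argument list for a non-interactive mysql import."""
    cmd = [mysql_exe, "-u", user]
    if ask_password:
        cmd.append("-p")  # mysql will prompt for the password
    cmd.append(database)  # assumed to exist already
    return cmd


def import_dump(sql_path, database, **options):
    """Pipe the dump file into the mysql client on standard input."""
    with Path(sql_path).open("rb") as dump:
        # check=True raises CalledProcessError if mysql exits with an error.
        subprocess.run(build_import_command(database, **options), stdin=dump, check=True)
```

With my XAMPP setup the call would look something like `import_dump(r"C:\dbDump.sql", "mydb", mysql_exe=r"C:\xampp\mysql\bin\mysql.exe")`, where `mydb` is a placeholder for your own database name.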

Wordle is a very cool (and very popular) tool for visualizing word counts in text. It creates these beautiful, stylish little word clouds. See below for an example. It has one fatal flaw, however, that makes it pretty, but pretty useless in a practical sense: you can't specify phrases that the cloud should keep together.

For most words I wouldn't need to do this, but a few key phrases need to stay together. In the cloud below, for example, I need to tell Wordle to keep 'social dilemma' and 'public goods' together. This simple function would allow me to add huge amounts of meaning to the visualization. This would be so easy to implement that I can't see why Wordle doesn't do it.

A secondary fun feature, probably not so fatal, would be to include some simple stemming features to combine frequencies of the same words used in multiple tenses, plurals, etc. There are a variety of open-source packages for doing this.
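To show how little the phrase feature would take, here's a small Python sketch. It is not how Wordle works internally (I have no idea how it does), and the function name and sample text are made up for the example. It counts each listed phrase as a single term before counting the leftover words. Stemming is left out, though a stemmer from one of those open-source packages could be applied to the leftover words before counting.

```python
import re
from collections import Counter


def count_terms(text, phrases=()):
    """Count words, treating each listed phrase as one term."""
    text = text.lower()
    counts = Counter()
    # Handle longer phrases first so shorter ones cannot split them.
    for phrase in sorted(phrases, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        # Remove every occurrence of the phrase and remember how many there were.
        text, found = re.subn(pattern, " ", text)
        if found:
            counts[phrase.lower()] += found
    # Whatever is left gets counted word by word.
    counts.update(re.findall(r"[a-z']+", text))
    return counts


sample = "Social dilemma research: a social dilemma involves public goods and social norms."
counts = count_terms(sample, ["social dilemma", "public goods"])
```

Here 'social dilemma' is counted twice and 'public goods' once, while the lone 'social' in 'social norms' still shows up as its own word.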

Competence Wordle
(Click for a larger version)

Today's big news story around here is the California Supreme Court hearing in downtown San Francisco on Proposition 8. For those of you who have been living under a rock, Proposition 8 passed as a ballot referendum last November by 52.2% to 47.8% – that's a margin of 4.4 percentage points. The famous 14 words of the constitutional amendment approved by Prop. 8 are: "Only marriage between a man and a woman is valid or recognized in California."

Today's hearing was about a technical issue: does Prop. 8 represent an amendment to the constitution (which ballot propositions make all the time) or a revision, which can only be done with the approval of 2/3 of both the state house and the state senate, or by constitutional convention?

The lawyers arguing against Prop. 8 argued what I think is a simple and irrefutable point: 1. last year the California Supreme Court called the right to marry "inalienable." 2. Stripping a group of people of their inalienable rights is not an amendment, it's a major revision. There's no precedent for that.

Now, I happen to agree, but I hope the lawyers thought about another point. News reports are that many of the justices were skeptical, that they somehow think this is just a matter of language, and that nothing has really been taken from gay people. Here are two reasons why that's crap:

  1. Language counts in laws, doesn't it? Don't we spend untold time debating the meaning of language in the Constitution? In case law, to determine where precedent applies? If that's the case, then using the word 'marry' seals the deal. Otherwise, they should have said something like "access to legal unions with all its associated rights and responsibilities is an inalienable right."
  2. Even if the court is going to dismiss the importance of the word "marry" in their earlier decision, they have to recognize the importance of the word in society. Marry and marriage are words that carry a lot of weight. They are symbols of a set of rights and responsibilities that, whether we know it or not, pretty much everyone knows about. So, we've got two choices here. Either we give same-sex couples the right to marry, or else we go out there and find every law, every county ordinance, every plaque, every corporate policy that refers to marriage, and change it to 'marriage or civil union' or similar. Anything less than that would be completely unequal.

From my point of view, the immediate fight is about time. How soon will same-sex couples get their rights back? If it's not within 90 days – the length of time the court has to make a decision – it's going to be within a few years. If Prop. 8 stands, this issue *will* be on the next CA state ballot, and it *will* be overturned. There is no way, no possible way, that California will vote the same way again. This has become a national issue, and it's one that has shocked everyone out of complacency (to use a tired phrase). Still, a few years is too long to wait to give same-sex couples a right they should have had years ago.

This hilarious picture is making the rounds today (via Ashkan). It basically sums up my feelings about Twitter, even if that does make me horribly, horribly uncool.

24w7ed0
(Click for larger version)