More random R tips that I had a hard time Googling.

When you're writing SQL queries you will often want to do something like SELECT * FROM table1 WHERE id IN (SELECT id FROM table2). Of course you don't need to do it with a sub-query. But the point is, often you will want to pull out only those rows that have an id which appears in another result set.

I need to do this all the time in R. For example, I have a dataframe in which each row has a unique ID. I do a bunch of processing and end up with a vector of ids that meet my criteria, and I would like to pull out only those rows from the original dataframe whose ids are in this new vector. Of course there are many ways to do this, but I think the SQL-like way is straightforward, and it turns out it's easy to do it in R, if only you can find the damned documentation! Well, here you go: use the %in% operator. It works just like IN does in a SQL query:

result <- data[data$id %in% subset.ids,]

This says: select all columns from data where the data$id field is in the vector called subset.ids. I find this to be super useful. Hopefully you will too!
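If you want to try it out, here's a minimal sketch on a toy dataframe. The column names and values are made up for illustration; only data, id and subset.ids come from the line above.

```r
# Toy data frame; the columns and values are invented for illustration.
data <- data.frame(id = 1:6, score = c(10, 20, 30, 40, 50, 60))

# Pretend some earlier processing produced this vector of ids.
# 99 does not appear in data$id, so it simply matches nothing.
subset.ids <- c(2, 4, 5, 99)

# %in% returns one TRUE/FALSE per element of data$id
keep <- data$id %in% subset.ids

result <- data[keep, ]    # like SQL: WHERE id IN (...)
rest   <- data[!keep, ]   # like SQL: WHERE id NOT IN (...)

print(result)
```

Negating the logical vector with ! gives you the NOT IN version for free, which I end up using almost as often.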

File this one under "Oh my god I will seriously jam this pen in my eye."

About a week ago I noticed a warning from Symantec Endpoint Protection telling me virus definitions were out of date. (I am running Windows 7 64-bit, incidentally.) Running LiveUpdate manually threw an error. Err, what?

So, suspecting some form of abstract corruption, I used the Symantec Cleanwipe Tool to remove all traces and reinstalled. No dice. The installation would get only so far and then roll back with an annoyingly non-specific error.

I tried many fixes and finally found the right one. The aborted installation and the original failure of LiveUpdate were both caused by a DCOM conflict. Solution is detailed here. I have a feeling that, had I known, I could have implemented this solution without reinstalling SEP to begin with. Argh.

If you haven't seen Klout, take a look. Here's how it works: you give Klout access to Twitter, Facebook, LinkedIn via APIs, Klout does some magic analytics on your social media streams and then gives you a score. The score is supposed to encapsulate how influential you are on a given topic. My Klout score is a modest 37.

Ok, great. I love this effort to cut through the cruft and identify who matters with respect to a given topic. I guess I'm just having trouble answering the question: matters for what? Recently, some media outlets (e.g. The Guardian) have made a big deal of Klout's claim that Justin Bieber is more influential than Barack Obama. (OMG!) Klout CEO Joe Fernandez responds on the Klout Blog. Joe says:

At Klout we measure influence across the social web. The point isn’t that Justin Bieber is more influential in the world than Obama, but that he is using social media more effectively to drive more actions from his network than anyone else right now… In the last 24 hours both Obama and Bieber have tweeted links to Youtube videos. The video shared by Bieber has generated nearly 10x the number of views the Obama video has.

Ok, here's my axe to grind, and here is why measuring influence with algorithms, without much context (except a vague topic area like "iPhone"), and without real social connections and interactions is a problem. Fernandez thinks Bieber is more influential because he gets teens to tweet more or watch YouTube videos more. But convincing a teen to tweet is like convincing a pothead to eat Cheetos: EASY. Likewise, if I run into you in the parking lot of 7-Eleven and suggest that you go in, you wouldn't call me very influential, even though you'll have a Slurpee in your hand in 90 seconds. Noticing a statistical correlation between these things will get you a "Master of the Obvious" badge.

I don't make any claims about Klout's algorithms, or about the statistics of influence. But I do know that an influential person is someone who persuades you to do something you would not otherwise have done. So any serious discussion of my influence has to compare the likelihood of your doing something in the presence of my influence and in its absence.
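To make that comparison concrete, here's a tiny sketch in R. The function name and every number in it are hypothetical, invented purely to illustrate the idea. This is not anything Klout claims to compute.

```r
# Influence as the change in the probability of an action
# with my nudge versus without it. All numbers are hypothetical.
influence <- function(p.with, p.without) {
  p.with - p.without
}

# Parking-lot case: you were almost certainly going in anyway.
slurpee <- influence(p.with = 0.95, p.without = 0.90)

# Hard sell: a low absolute rate, but almost none of it happens without me.
hard.sell <- influence(p.with = 0.30, p.without = 0.02)

print(c(slurpee = slurpee, hard.sell = hard.sell))
```

The raw rate of action is much higher in the first case, but the difference I actually made is much bigger in the second.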

It does not appear that Klout is doing this, which leaves me confused about what they are really measuring. (In fairness, maybe they are doing this, but this is not how they're pitching it…) I have a feeling the people they identify are more like the people at the top of information cascades. I guess that's a kind of influence – you get people all lined up in social network channels according to their interests, and then try to figure out where the cascades usually begin. This sort of thing is very important to marketers and brands, because it helps them know who to go to if they want to get something started. Likewise finding these people can be useful to you and me if we want to cut to the chase and hear it from the horse's mouth. But this is really not very much like offline influence.

The Wikimedia Foundation has produced a set of 4 short and beautiful videos that highlight the experiences and identities of everyday Wikipedians. Love it!





I've been on the verge of seppuku for an hour now because of problems trying to install Eclipse plugins on Win 7. I use the same automatic install feature I always have, installing from an update site. Everything seems to work fine, but when Eclipse restarts, there's no plugin. It appears in the list of installed software, but no plugin. Checking out the Eclipse directories, I see that the proper plugin files aren't there. Argh!

Well, it appears that on my machine one needs administrator permission to change the Program Files directory. I'm not sure why it became necessary, but the fix is simple: Run Eclipse as an administrator. To do that, just navigate to the directory, right click on the file, and click "Run as Administrator." You can also set it to run that way permanently by right-clicking on the file, choosing "Properties", then the "Compatibility" tab, and checking "Run this program as an administrator".

Bleargh!

Usually I appreciate the commentary on TechCrunch. Even though it's often short-sighted and hyperbolic, I usually think they get the big picture ideas right and hit on the stuff that we really should be debating. But Paul Carr's recent article called Facebook Breached My Privacy, And Other Things That Whiny, Entitled Dipshits Say is so stupid, but at the same time so indicative of the ways that many tech. folks are stupid, that I just have to point it out.

Usually, I know, I'd lay down 750 words on it. No one ever accused me of brevity. But this is actually pretty simple. I'll encapsulate Carr's argument in a few sentences, then present my own.

Carr: People who complain about privacy on the web should shut up. They are deluded about what today's social systems are really like. They shouldn't put anything about themselves on the internet that might be a problem, and they should control all others who might do the same. "Blaming Facebook’s flaky approach to privacy for the ills of the exhibitionist generation is just yelling at the stable door, long after the horse has bolted."

Me: Carr sounds like an ignorant elitist jackass calling all the rest of us "whiny, entitled dipshits" just because we don't want to live by the lowest common denominator of privacy, whatever Facebook decides is best for its bottom line. It's ridiculous for a geeky, tech-savvy internet journalist who spends all his waking hours trying to understand online social systems to crap on people who do other things with their time by calling them whiny and entitled. Get a clue, buddy. People like you might code the web, but it's people like us who make it work. Learn to live by our rules, not the other way around. Expecting people to learn how to 100% control all the content they share online, and then do the same for everyone else around them is pure fantasy. If the horse has bolted, then lock the fucking stable door and we'll just hang with the chickens and the pigs.

Amidst news that Internet Explorer 6 is still making up a 13% share of the browser market during peak business hours (see Corporate IT Just Won't Let IE6 Die), this cracked me up:

Dilbert.com

The last week has been full of news about Facebook's new moves. Expanded product offerings, rampant privacy violations and the like. The big question is whether Facebook can get away with statements like this:

"People have really gotten comfortable not only sharing more information and different kinds, but more openly and with more people," Zuckerberg said at a technology awards show in January. "That social norm is just something that has evolved." (via The LA Times)

FALSE. Objectively. Some people are comfortable, but many / most are not. The question is, can Facebook dictate that norm to the web by making business-first decisions now, worrying about the consequences after? Increasingly I believe the answer is yes.

I hear many people say that Facebook is destined to go the way of MySpace, and be superseded by the next big thing in social networking. But I don't believe that anymore. There was Lycos and Altavista and the crew, and then Google came along and people thought the next thing would be along soon. Even in the last few years, there were the people who predicted that Bing or Cuil, or Powerset, or Wolfram or whatever would be the next big thing. But no one's stealing Google's market share on search (although Bing is doing ok…). Google has become a standard, and it will be very hard to shake.

Well, I think Facebook is moving towards that same position. Facebook's idea this past week has been to explode its walls. Facebook wants to be the social graph that powers the web. There will still be new, cool sites for users to get involved in, but why re-invent the wheel? Facebook will allow these sites to slice off a part of the Facebook graph for their users and populate it with their own content. All the while, of course, Facebook is keeping track, expanding its own graph, making a mint. Facebook knows things are going this way, and so this week they slapped down their trump card and said "just you try and stop us!"

We've seen a pretty big backlash in internet terms, but nothing strong enough to lead to anything but minor concessions on Facebook's part. The only things that will stop them at this point might be action from Congress or the courts. At least a few folks in Washington seem to be paying attention.

In the meantime, I think the kind of protest and resistance we're seeing is useful and necessary. I'll be interested to see if Facebook really takes notice. I'm guessing no. So for most of us, our real decision is whether to accept a public life with Facebook, or log off for good. As for me, I'm not thinking of logging off yet, but only because I always assume information about me is public and widely shared without my knowledge. I decided long ago not to put anything on Facebook (or elsewhere) that I wouldn't want to share with the world. But that's me. Facebook allows me to manage my privacy the way I'd like by default. But it should do the same for others too, rather than forcing them into potentially dangerous and uncomfortable choices.


Via The Null Device Blog

Fire! Brimstone! Hyperbole! An unpopular opinion! Apple is not really doomed. But they're in trouble. Yes, the iPad just came out, and the internet had a giant geistgasm. There's no denying, it's a sexy device.

Here's the problem – who wants it? Right now, Mac fanboys (and girls). Soon, a few others who will convert once it's on to v2 with some of the kinks ironed out, next OS version (multi-tasking!!), and the inevitable camera. Kids will love it. Ok, so that's kind of a crowd. Remember, people, this post is about hyperbole! God! You're all so dense!

So, Apple may expand its market a bit, and bring in a few converts who have a need that the iPad matches. But here's the problem. There are two reasons why Apple's introduction of the iPad is a big step backwards for the company:

  1. Apple has always been good at opening genres. They pick a niche – mobile music, smart phones, slates – and they knock v1.0 out of the park. That's what the Cupertino brand of perfectionism and attention to detail in user experience and hardware will get you. In the case of the iPad, they didn't just crack the door on a new genre, they kicked it wide open. And many, many, many others will come pouring through. Soon there will be lots of cheaper, faster, more feature-rich competitors that will run a wider variety of software. So some people will buy an iPad, but others will wait for the HP Slate, or whatever comes next. This was true for the iPhone too, but it took a *really* long time for anyone to rival the iPhone experience. But now we have Android, and soon we will have Windows Phone 7. Whatever you think of those two OSs, take away the App Store (which is Apple's ace-in-the-hole), make this about devices and OS, and iPhone is not so clearly better. It won't take nearly as long for all the slates to make their way to market. Just a few months from now we'll see them hitting stores, and in a year we'll see what Apple has really gained.
  2. But here's the bigger issue. The ideological issue. Just like Kim Jong Il, Apple has a viciously tight ecosystem, built on secrecy, that has draconian and seemingly arbitrary policies that they enforce through code. Also like Kim Jong Il, they'll tell you it's all for your benefit, comrade user. It makes for a better experience, it allows Apple to make the perfect society… err, phone and keep it that way, free of the imperialist influences of free markets and free culture. That's all good and well, except… well, except that the biggest opportunity for the iPad to open a new market for Apple is in the education space, but those are the very people who will hate the North Korean strategy. Apple is locking out competitors and enforcing arbitrary limitations on free speech, and this is probably just the beginning. It's the Apple way, or the highway. Well, educators, educational activists, parents won't stand for that. Why would they? Oh, they could just join the enterprise developer program or what have you and circumvent Apple's process. But that will limit educational innovations to private, circumscribed groups. And why get involved with a company that might endanger your ability to teach what you want when you can get a cheaper, faster device that has none of those restrictions? Apple shot itself in the foot. It had a chance to release a groundbreaking device and capture a new market. But that chance is slipping away, further and further each time they do some crazy shit. Apple, you make me so crazy. If you'd only open your fist I would take your hand!!

Update: Wow! This post has generated some angry response (see below). A few responses:

    1. Note that the entire post is tongue-in-cheek hyperbole, which I admit and joke about in the very first sentences. Some people seem to have taken it very, very seriously!
    2. Yes, I compared Apple's policies to Kim Jong Il and Apple to North Korea (see #1). That's name calling. But while I said extreme things about Apple, many of these commenters are saying them about me. Guys, you don't know me. Control yourselves! I'm just a blogger that no one reads! Strange what the internet does sometimes. Grr! You criticize Apple! Me criticize YOU!
    3. Many folks seem to have missed the entire point of my post, which is: (1) the iPad will face serious competitors with equal or better hardware and OS much more quickly than they did with the iPhone, and (2) Apple had a chance to open a huge new market with the iPad, but is shooting itself in the foot with its draconian policies.
    4. Cool out, people. Writing these serious, angry diatribes makes you look a little silly, and detracts from the good arguments you write.

Here's something I never thought I'd have to do, but a recent journal submission requires that tables be submitted in .doc format, despite the fact that they fire everything back into LaTeX when they're typesetting the final journal anyway! Well, I guess there's just no arguing, so I had to figure out how to convert a LaTeX table to Word format. This process turned out to be surprisingly easy. Here's the easiest way I've found:

    1. Get Latex2RTF, which is an open source converter for both Windows and Linux. Install it.
    2. The GUI front-end that came with the software didn't work, so I skipped directly to the command line. I navigated to the proper directory – for me that's C:\Program Files\latex2rtf. The basic command worked great for me: "latex2rtf {path to table}\table1.tex" Remember that on Windows, if you have spaces in your directory names you may need to put the whole path to the file in quotes. It'll do this for you automatically if you use the tab autocomplete when typing the path.
    3. Latex2RTF seems pretty complete, but there are still one or two LaTeX commands that it just won't handle. For example, I had to remove the @{} format from my table declaration, which was no big deal. Finding the commands that bonk is easy: just use the debugging option at level 4, like "latex2rtf -d4 {path to table}\table1.tex"
    4. I got an almost perfect RTF out of this process, but make sure to check the table carefully. I found, for example, that it didn't convert my text daggers, so I had to add them back. After a minimum of fiddling with font sizes, row spacing, etc., save as .doc, CELEBRATE!
