…or, what happens when some Manager Guy gets a few Big Ideas that some perceive as evil and some others as completely mad.

A while ago, some weird guy from HBGary Federal wanted to track down some Anonymous members, and failed spectacularly at the task. It’s a long story, but the basic idea is this: The company provides intel to government agencies. In the wake of the WikiLeaks mess, the central figure thought that he just might be able to develop tools that identify anonymous Internet users through social networks.

Of course, the small problem here is that the actual theories that were put forth were either nonexistent or complete bull, but the whole idea certainly seems plausible. I mean, in the hands of actual clueful scientists, this could be a viable research project. These people are obviously not clueful scientists. Anyone with half a brain would see that yes, social network data could lend itself to datamining - but dammit, you’ve got to have some sort of an idea how to actually do that stuff. After the grandiose idea was thrown in the air, it was time to come up with the theory. “Um…” went the great analyst, “…show that there’s some correlation between thingies?” …right.

The programmer who was hired for the project started to protest, saying that this stuff was a little bit loopy. But apparently, much loopier things were just over the horizon. Those would have to wait until after some embarrassing security flaws were handled through ritual ownage. (Obligatory jab: The tale involves rather oblivious behaviour from a Nokia guy, who handed over the root passwords a bit too eagerly. I guess Microsoft security policies were implemented fast.)

After the dust had settled, some people found some curious bits in the leaked HBGary Federal emails (via reddit). Basically, the idea was to build some sort of a sockpuppet management system.

Geez, no wonder the programmer said this guy was a loony.

The first warning sign is that this grandiose plan for sockpuppet management involves a bunch of USB drive switching:

For this purpose we custom developed either virtual machines or thumb drives for each persona. This allowed the human actor to open a virtual machine or thumb drive with an associated persona and have all the appropriate email accounts, associations, web pages, social media accounts, etc. pre-established and configured with visual cues to remind the actor which persona he/she is using so as not to accidentally cross-contaminate personas during use.

This is obviously done because swapping USB drives around is something that your average manager can comprehend, so this system can be sold to top executives who are gullible enough to get intel from these guys. This is a US Government Contractor. If you want to convince people in positions of power, you just can’t dump all the data on a hard drive somewhere. Government types want dedicated machines and physical components. I mean, everyone knows that.

And to make best use of this fascinating physical isolation, they… um…

These accounts are maintained and updated automatically through RSS feeds, retweets, and linking together social media commenting between platforms.

…automate the whole shit. If you said “But wait! How do they get the profile data for automated posting if the data resides on some USB drive somewhere?”, you probably know more about computers than these guys.

So now, we have here a system that is supposed to… what? Manage multiple accounts? Maintain session data for multiple users? Automatically post stuff to multiple accounts at the same time?

% firefox -profileManager

Kiddie stuff.

Nowhere does it say that it should be used by multiple people, nor does it actually go into detail on how to make one person look like multiple people.

Now, I don’t claim to be an expert moderator anywhere (I’m mostly just a spam janitor on Wikipedia these days), but I know one small detail where the system, as proposed, would fail: People look at usage patterns and idiosyncrasies.

And every time you speak of automation, you are also probably speaking of repetition. People are lazy. Lazy people rely on automation. Automation makes you careless. Automation is conspicuous. People can smell a spam bot from a mile away - why wouldn’t they notice a guy with a piece of sockpuppetware?

Using the assigned social media accounts we can automate the posting of content that is relevant to the persona.

These guys were supposed to be some kind of datamining experts. Would they kindly answer this question: What safeguards does this system have against datamining? I mean, surely this epic spam system doesn’t just have an army of sockpuppets doing the exact same things, because that’d be ridiculously simple to discover? Surely this system can’t be beaten by one Google query that gives you a nice list of all spam socks?…
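Just to show how little effort "ridiculously simple to discover" can mean, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the account names, the sample posts, and the helper names `normalize` and `find_shared_posts` are all made up, and nothing here comes from the leaked material. It groups posts by their normalized text and lists every text that more than one account has posted.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial edits do not hide a copy."""
    return " ".join(text.lower().split())


def find_shared_posts(posts, min_accounts=2):
    """Map each normalized text to the accounts that posted it.

    `posts` is an iterable of (account, text) pairs. Only texts posted by
    at least `min_accounts` distinct accounts are returned.
    """
    accounts_by_text = defaultdict(set)
    for account, text in posts:
        accounts_by_text[normalize(text)].add(account)
    return {
        text: sorted(accounts)
        for text, accounts in accounts_by_text.items()
        if len(accounts) >= min_accounts
    }


# Hypothetical sample data, invented for this example only.
sample_posts = [
    ("alice_real", "Had a great lunch today."),
    ("sock_01", "Company X did nothing wrong!"),
    ("sock_02", "company x did  nothing wrong!"),
    ("sock_03", "Company X did nothing wrong!"),
    ("bob_real", "Traffic was terrible this morning."),
]

suspects = find_shared_posts(sample_posts)
for text, accounts in suspects.items():
    print(f"{accounts} all posted: {text!r}")
```

This only catches near-verbatim copies, of course. Anyone serious about it would also look at posting times, link targets and phrasing habits, but the point stands: identical automated output is the easy case.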

Some words from the insightful guy at Daily Kos:

And consensus is a powerful persuader. What has more effect, one guy saying BP is not at fault? Or 20 people saying it? For the weak minded, the number can make all the difference.

Know one thing: Consensus is not set in stone, even less so when scammers are involved. Opinion can change overnight.

I remember a case from Wikipedia a few years ago. I forget the guy’s name, but he was some sort of a self-published author. His article was put up for deletion. Everyone probably jumped to the immediate conclusion: “wow, this guy has articles about himself in, what, dozens of languages? If he has achieved some sort of world fame, surely he’s notable for that reason alone?” That’s what the gut feeling tells you: The number of language links can be a powerful indication that there’s a “consensus” that someone is notable. Turned out the guy had posted stub articles to many Wikipedia language versions, with a little bit of help from machine translation and dictionary browsing. As the situation dawned on the user communities of the individual language versions, the “consensus” quickly changed.

People hate sockpuppet shills. When they notice that someone’s obviously a sockpuppet shill, they’ll start digging. And if someone’s lazy enough to use a bit of software for automated sockpuppeting, chances are that even your great-grandmother can spot them. And most of all, people love to hear about big failures in astroturfing, so getting caught sockpuppeting carries a heavy risk.

The article does raise a point, though: “there’s many people who agree with X on Twitter” is a dubious form of argumentation, because it’s an appeal to popularity. Also, Twitter isn’t a bloody instant poll system; for every politically active user screaming about the latest issues on Twitter, there are a dozen who don’t, and that’s before we even get to which parties and issues we’re talking about. You can’t assume that a quick search on some hashtag gives you an instant, reliable, representative view of some subject. Less so if half of the stuff comes from spamming gits.

And, incidentally, there’s a certain irony in a company trying to map anonymous people to real-world people while simultaneously producing software that tries to mask the identities of sockpuppeteers. Again, I’m not a great wise all-knowing scientist, but trying to analyse data that you’re polluting yourself as part of another experiment is not always the greatest way to do science.