Dear Readers, You Can’t Sue a Spider
Oh dear, my head hurts. I’d appreciate it if all the site owners out there would go and grab a pen and piece of paper real quick; we’re going to have a short technical review. (I don’t want to hear a peep out of you, McGee).
If you don't want your content spidered, a short little text file is the answer to your prayers. Meet your friend, the robots.txt file. Drop it in your site's root directory and paste in these two lines to tell every spider to keep its eight legs off your content:

User-agent: *
Disallow: /

If you don't want your content archived, put this meta tag in your pages' <head> section instead:

<meta name="googlebot" content="noarchive">

If you don't want archiving or snippets, use this one:

<meta name="googlebot" content="nosnippet">
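And if you're curious what a well-behaved spider actually does with that file, here's a minimal sketch using Python's standard urllib.robotparser (the site and URL are made up for illustration):

```python
# A polite spider's first stop: parse robots.txt before fetching anything.
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
# Feed it a robots.txt that blocks everything (a hypothetical site's rules).
rules.parse([
    "User-agent: *",
    "Disallow: /",
])

# A well-behaved crawler asks permission for every URL it wants to fetch.
allowed = rules.can_fetch("ia_archiver", "http://example.com/secret-page.html")
print(allowed)  # a blanket Disallow means the answer is False
```

One line of Disallow, and any spider that plays by the rules stays out. No lawyers required.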
See how easy that is? What you can’t, or at least shouldn’t, do is sue a search spider, Google (wave to our Belgian readers!), or anyone else for your inability to type just one little line. Not that it stops people from trying.
A Colorado woman is leaning on the most absurd piece of legislation ever, the Uniform Electronic Transactions Act, to argue that the Internet Archive's "Wayback Machine" spider entered into a binding contract when it arrived on her site and started snooping around the way spiders are known to do. By entering the site, the deaf, blind and mute spider "agreed" that it wouldn't copy or distribute any of the material found on her Web site. Or at least that's what she's claiming. Personally, I find her and her line of thinking somewhat ridiculous. It would be like taking action against poor Jack Jack for not using his litter box. I mean, I stapled the note right above his food dish warning that refusal to use the litter box meant he was going to get re-neutered, but he went ahead and ignored it. I did my part, right? Get me my letter opener!
Yes, it's crazy, but this is America. People sue when their morning coffee is too hot and it burns their sensitive little taste buds. Only this time it's not even a real person being sued. Or even a fake one; it's an intangible spider that doesn't even exist in the physical world, a piece of software. Yes, this woman has sued the Internet Archive and its spider for conversion, civil theft, breach of contract, and violations of the Racketeer Influenced and Corrupt Organizations Act (RICO) and the Colorado Organized Crime Control Act (COCCA). Oh, so it's not just any spider, it's a mob spider!
The racketeering charge just kills me, probably because I'm Italian and I immediately envision a man on his knees about to get hit in the back of the head with a shovel. I apologize; I'll work harder at suppressing those childhood memories. Moving along…
Here’s my take. You can’t sue the Internet Archive because you didn’t take the appropriate steps to protect your Web content. If you don’t want your content spidered or made available on the Web, use your robots.txt file to tell the spiders that. Or, if you can’t be bothered to copy and paste one line of code, create a roadblock for the engines. Make them click a box or agree not to archive the content BEFORE letting them on your site. This ensures visitors know in advance and it prevents spiders from being able to access your content, because, you know, they’re BLIND and can’t read.
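For the roadblock crowd, the gate logic doesn't have to be fancy. Here's a hedged sketch of what a server might run on every request; the cookie name and page text are invented for illustration, not anyone's actual implementation:

```python
# Sketch of a click-through gate: no agreement token, no content.
# The cookie name and page bodies are hypothetical.
TERMS_PAGE = "You must agree to our terms before entering. [I Agree]"

def serve(path: str, cookies: dict) -> str:
    """Return the terms page unless the visitor has clicked through."""
    if cookies.get("agreed_to_terms") != "yes":
        return TERMS_PAGE  # humans see the checkbox; spiders stall right here
    return f"Actual content for {path}"

print(serve("/articles/1", {}))                          # a spider gets the wall
print(serve("/articles/1", {"agreed_to_terms": "yes"}))  # a human who clicked
```

A spider that never clicks "I Agree" never gets a page worth archiving in the first place, which makes the whole contract question moot.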
But the Colorado woman didn't do that. Users had to click through the site in order to even read the full notice. Users may have been able to understand what they read and either agree or disagree to the terms, but a spider can't. A search spider, like Posh Spice and my Jack, is illiterate, regardless of where it's coming from.
And even more to the point, instead of suing a spider or demanding $100,000 for archiving the content, why not follow the Internet Archive's easy instructions on how to remove content and prevent it from being copied in the future? That seems far easier than a year-long court process.
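As I recall the Archive's removal instructions at the time, the Wayback Machine honored robots.txt exclusions aimed at its crawler, ia_archiver; two little lines were supposed to both block future copies and pull existing ones out of the archive:

```
User-agent: ia_archiver
Disallow: /
```

Two lines versus two years of litigation. You do the math.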
So far the courts have thrown out all charges except the breach of contract charge. The question at hand now is whether a Web spider can be held accountable for indexing content that has not been blocked by the site owner. Basically, it once again brings up the issue of today's Web being opt-out instead of opt-in.
More and more I’m finding myself pro-opt-out. Part of being a citizen of the Web is having your content indexed and made available online. If not, why do you even have a Web site? If you DON’T want your content made available, then I think the responsibility falls on you to tell the spiders that. Suing a search spider is ridiculous.
It'll be interesting to see where this one goes. Should this woman win and the courts uphold the idea that spiders are able to enter into binding contracts, it could change the way the search engines spider the content they find. I'm really not looking forward to the day when you have to invite the engines to come spider your site; sites have a hard enough time getting into the index as is.