I’ve had 9,000 hits today. 3,941 of those are from MSN’s crappy bot, which is downloading each and every comments RSS feed from WordPress over and over and over again, to the tune of 6.6MB of traffic today. It keeps overloading my server, which falls back into chill-out mode for a few minutes before starting up the web server again… only to be overwhelmed by MSNbot again.

I see on Technorati that a lot of people are having this problem. I’m just banning MSNbot from my network by filtering out their IP subnet. Screw ’em. It’s a crappy search engine, anyhow.
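
For anyone wondering what "filtering out their IP subnet" boils down to, here is a rough Python sketch of the check. It is only an illustration of the idea, not my actual firewall rule: the subnet below is a reserved documentation range I picked as a placeholder (not MSNbot's real address block), and `is_blocked` is a made-up helper name.

```python
import ipaddress

# Hypothetical subnet: 192.0.2.0/24 is a reserved documentation range,
# used here as a stand-in. It is NOT MSNbot's real address block.
BLOCKED_SUBNETS = [ipaddress.ip_network("192.0.2.0/24")]


def is_blocked(client_ip, blocked=BLOCKED_SUBNETS):
    """Return True if client_ip falls inside any of the blocked subnets."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    for ip in ("192.0.2.77", "198.51.100.5"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Blocking by subnet rather than by user agent string means the ban still holds if the bot shows up under a different name, at the cost of needing to know which address range it actually crawls from.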

  1. The Robot Exclusion Protocol was actually never formalized or accepted by any standards body. It just came out of the consensus on the robots mailing list in the mid-90s. There’s no enforcement mechanism, and there have been no real updates to that initial “standard” since then.

    MSN last attempted to spider the site 20 minutes ago. My .htaccess keeps them away, as if the robots.txt wasn’t a big enough hint.
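
To make the "hint" part concrete, here is a small sketch of how a well-behaved crawler is supposed to read a robots.txt, using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URL, and the `may_fetch` helper are assumptions for illustration, not a copy of what is on this server.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out msnbot entirely, allow everyone else.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Disallow:
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/comments/feed/"  # placeholder URL
    for agent in ("msnbot/1.0", "SomeOtherBot"):
        print(agent, "may fetch" if may_fetch(agent, url) else "must not fetch")
```

The catch is that the crawler has to run a check like this on its own. Nothing on the server side enforces robots.txt, which is why the .htaccess block is doing the real work here.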

