For some weeks now, my websites have been extremely slow. I was so busy with exams, and then away on vacation (from which I returned yesterday, facing the gaping maw of The Rest of My Life), that I was totally unable to debug the problem. This evening I spent several hours doing so, and I’m pleased to say that I (seem to) have fixed the problem. I recite the solution here for googlers who may follow.
No other services on my server have been slow — just HTTP. So I fired up ab to benchmark the server, picking this blog as a test example and a 1×1-pixel .gif as the test image. After 20 iterations, the download time averaged 2 seconds. Two freaking seconds to download a single pixel. “Download” is too strong a word, since it’s on the same bloody server. Let’s go with “retrieve.” It should take a few milliseconds. So I knew something was up there.
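For the record, the benchmark was nothing fancy. It amounted to something like the following sketch — the URL is a stand-in, and the ab report shown is illustrative, not my actual numbers:

```shell
# 20 requests for the test image; "Time per request" is the number to watch.
# (Example URL; substitute the real site.)
# ab -n 20 http://www.example.com/pixel.gif

# Given ab's report, the relevant figure can be pulled out like so
# (here fed sample output inline, for illustration):
awk -F': *' '/^Time per request.*mean\)$/ { print $2 }' <<'EOF'
Time per request:       2003.412 [ms] (mean)
Time per request:       2003.412 [ms] (mean, across all concurrent requests)
EOF
```

Anything much over a few milliseconds for a one-pixel image served locally is a red flag.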
On a hunch, I tried downloading the same pixel image from the server’s IP address. It took 3 milliseconds. Then I tried the same image from cvillenews.com. 3 seconds. Ah-ha, I figured — so it’s the virtual servers. I spent an hour or so mucking around with those, and got nowhere. So I decided to perform the same test on all two dozen sites hosted on my server. After testing a half dozen, I found that the troublesome ones all had WordPress installed. I deleted the .htaccess for a few sample sites, and the benchmark stats went through the roof. Of course — the .htaccess file in each directory that makes the URLs all pretty. That was doing it.
So, dutiful geek that I am, I moved the contents of each .htaccess into httpd.conf and triumphantly restarted Apache. I ran ab again and found…no improvement. Hmm. So the problem wasn’t keeping the RewriteRules in an external file. It was the RewriteRules themselves.
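For anyone attempting the same move, the relocation looks roughly like this. This is a hypothetical sketch with a made-up path and a generic WordPress-style rule, not my actual configuration:

```apache
# In httpd.conf, rules that lived in /var/www/example.com/.htaccess
# go inside a matching <Directory> block:
<Directory "/var/www/example.com">
    RewriteEngine On
    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]

    # With the rules relocated, per-directory overrides can be
    # switched off, sparing Apache an .htaccess check per request:
    AllowOverride None
</Directory>
```

That spares Apache the per-request .htaccess lookups, though, as it turned out, that wasn’t where my time was going.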
I trolled through WordPress’ support pages, seeking somebody with the same problem. Nobody. Hmm. I returned to my process of benchmarking all sites hosted on the server and found that several other sites with .htaccess files, but without WordPress, had the same problem. So perhaps it was bad RewriteRules?
Now I was getting somewhere. I used this site as the sample and cautiously commented out bits of the RewriteRules. None of them made a lick of difference. How curious.
But wait — what was this bit of code at the bottom of most of my .htaccess files? Code to block trackback spammers and bandwidth-sucking robots that index audio and video. Could that be it? I commented it out and — boom — speedy server.
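The offending block looked something like this — a reconstruction with example hostnames, not my actual blocklist:

```apache
# Block trackback spammers and bandwidth-sucking robots.
# Hostname-based denies like these were the culprit:
<Limit GET POST>
    order allow,deny
    allow from all
    deny from spammer.example.com
    deny from badbot.example.net
</Limit>
```

Innocuous-looking, but note that every deny is keyed on a domain name rather than an IP address.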
Ah. The problem is that I’m dumb.
Long ago, I had disabled hostname lookups in Apache (which convert IP addresses into domain names for the logs), since they really slow down the server. But my <Limit GET POST> deny directives were blocking based on domain name, not IP address — and a deny keyed on a hostname forces Apache to do a reverse DNS lookup on the client’s IP, regardless of the HostnameLookups setting. Presumably, Apache had to perform that lookup as each hit came in, rather than handling lookups on a per-session basis as part of the main HTTP process, because of the conflict between httpd.conf and my poorly-written .htaccess files.
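The fix, then, is to block by IP address instead, so no DNS lookup is needed at request time. A hypothetical replacement (the addresses are documentation examples, not real spammers):

```apache
# Hostname lookups stay off; the logs get bare IPs.
HostnameLookups Off

<Limit GET POST>
    order allow,deny
    allow from all
    # Blocking by IP address or network requires no DNS at request time:
    deny from 192.0.2.14
    deny from 198.51.100.0/24
</Limit>
```

The trade-off is that blocked parties who change IP addresses slip through, but that beats a DNS round-trip on every single request.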
Worse still, I’d been keeping several sites within the directory that hosts my blog. cvillenews.com, for example, was in /var/www/waldo.jaquith.org/cvillenews/. Which meant that loading cvillenews.com didn’t just go through all of the (extensive) rewrites and limits in its own .htaccess, but also all of those in waldo.jaquith.org. So cvillenews.com was twice as slow as waldo.jaquith.org.
Hierarchy flattened. Domain-based blocks removed. Problem solved. Server speedy. The world is a happy place.