You may have noticed some downtime last night/this morning. We were moving servers. Again.
TL;DR this probably only interests system administrators, and things should *finally* be fixed.
To give a quick rundown: the old Pocketables server existed on a very fast and stable hosting platform in the Crowdgather data center. The only problem with it was we couldn’t edit anything. It was locked down to the point we had to make requests constantly to do simple things (such as update 9-month-old exploitable plugins).
For nearly three years we campaigned to get off of that server as our traffic went from “whoa!” to “it’s ok, you’ll do better next time”. In this particular set of instances we could trace the traffic decline mostly to our SEO, or lack thereof. Some of it was also that we were another Android blog in an ever-expanding pool, and some of it was that all these sites that were jammed together didn’t mesh particularly well for some readers.
But mostly it was serious problems which prevented us from being searchable.
We got put on a server which allowed me to update the plugins, make the site mobile friendly, do all sorts of things that would fix this.
Unfortunately this server attracted the attention of some botnets and they downed it last week. We had scraper crawlers hitting every one of the 14,000 or so articles we have, and generally that would have been fine on the old server.
Not the new one. The new one was underpowered. It could handle normal traffic fine, but not the traffic the bots were generating.
We put up Cloudflare in front to serve up images… it’s free, it does things… that should have lowered the strain on the back end servers, but it didn’t by much.
The load was absurd. The CPU times (which should be 2 or below) reached 168. To give you an idea of what that is in real terms, it’s the difference between a four second page load and your browser finally timing out loading an image.
Cloudflare stopped a very large number of attacks, but as of yesterday the load times were still extremely high. We had disabled CPU-hungry plugins and run plugin profilers, but none of it was really working.
I’d been working with a CG web admin to figure out what was going on and finally got some more detail on the specs of the server. Long and short of it is that the specs Pocketables was on were so underpowered that I’d had to abandon them on my personal website a year ago for the same reasons, and I get a lot less traffic than Pocketables does.
It’s not that the server was absurdly slow, it’s that each operation takes time and processor power. A page could be served quickly, but the server wasn’t really able to generate three pages at the same time. Something would get put on the “we’ll get to that in a minute” pile. As such, wait times increased … slowly at first, then faster as the backlog fed on itself and the load got absurd.
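To make that pile-up concrete, here is a minimal sketch of a server that can only build one page at a time. The numbers (one second per page, the gaps between arrivals) and the function name are made up for illustration; they are not measurements from the Pocketables server.

```python
def simulate_waits(arrival_gap, service_time, requests):
    """Return how long each request waits before the server starts on it.

    Assumes a single worker handling requests strictly in arrival order.
    All three parameters are hypothetical illustration values.
    """
    waits = []
    server_free_at = 0.0  # time at which the worker finishes its current page
    for i in range(requests):
        arrival = i * arrival_gap
        start = max(arrival, server_free_at)  # wait if the worker is still busy
        waits.append(start - arrival)
        server_free_at = start + service_time
    return waits


if __name__ == "__main__":
    # Pages take 1.0s to build. Requests arriving slower than that never queue...
    print(simulate_waits(arrival_gap=2.0, service_time=1.0, requests=5))
    # ...but requests arriving faster than that each wait longer than the last.
    print(simulate_waits(arrival_gap=0.5, service_time=1.0, requests=5))
```

In the second run every new request waits half a second longer than the one before it, which is the “slowly at first, then absurd” curve in miniature.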
The web server had to be restarted many, many times because we weren’t even able to get in to try anything.
And then it got worse. All that SEO work I’d done paid off, and bit us in the ass. Real traffic started increasing. Search engines started indexing us. And right in the midst of all of that the server took a spectacular nosedive.
I mention these CPU times… they should be under 2 for what we do. My personal site runs at about 0.13, usually peaking at just under 1. If the server’s doing nothing, you’re still going to have a little CPU usage because you’re using CPU to look at CPU, but it shouldn’t be much.
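Assuming the CPU times here are the load averages that `uptime` and `top` report on a Unix box (that is an assumption; the tool isn’t named above), a quick check against the “under 2” figure might look like this sketch. The function name and the wording of its output are made up for illustration.

```python
import os


def load_status(load, healthy_limit=2.0):
    """Classify a load figure against a limit (2.0 is the figure quoted above)."""
    if load <= healthy_limit:
        return "fine"
    return f"overloaded ({load / healthy_limit:.1f}x the limit)"


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    print(f"1 min:  {one_min:.2f} -> {load_status(one_min)}")
    print(f"5 min:  {five_min:.2f} -> {load_status(five_min)}")
    print(f"15 min: {fifteen_min:.2f} -> {load_status(fifteen_min)}")
```

Fed the numbers from this post, `load_status(0.13)` comes back as fine and `load_status(168)` comes back as 84 times over the limit.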
I looked at a perfect time. I hit that rare unicorn moment when nobody was on Pocketables (I can tell from an Apache status page), and one person was on the site that shared the Pocketables server. One person’s load should have been nothing.
It was over 3. That was whack.
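For the sysadmins: Apache’s status page comes from mod_status, which also offers a machine-readable view at `server-status?auto`. As a rough sketch, this is one way to pull the busy and idle worker counts out of that kind of output. The sample text is invented for illustration, not captured from the Pocketables server, and the function name is hypothetical.

```python
def parse_status(text):
    """Parse 'Key: value' lines from mod_status ?auto style output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip anything that is not a 'Key: value' line
            fields[key.strip()] = value.strip()
    return fields


# Invented sample, shaped like mod_status machine-readable output.
SAMPLE = """\
BusyWorkers: 1
IdleWorkers: 9
Scoreboard: W_________
"""

if __name__ == "__main__":
    status = parse_status(SAMPLE)
    busy = int(status["BusyWorkers"])
    idle = int(status["IdleWorkers"])
    print(f"{busy} busy, {idle} idle workers")
```

A near-idle worker count next to a load over 3 is the mismatch described above.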
Decision was made to move, and we did. For the second time in a few months. In two months we’ve now had about a week of server-induced downtime. A week of me being gone for CES. A week of me not writing much due to hospital and snow.