Skip to main content

How To Learn From a Traffic Surge

I want to say a few things for my own benefit. Maybe that's
the only thing I do here. As always, I hope something I have
is useful to someone else. In this case, if you're in any
position to deal with a big surge on a small site, you might
get something useful, or at least enjoy, what I have to from
my experience getting a bump from some guy named Mike
Arrington with a little blog called TechCrunch.

This is about reaction and what was the right and wrong way
to react to the impact of a weeks traffic in a couple hours.
Should natural means have brought our typical traffic to
these levels (time will bring this) the means to handle it
on a day to day basis would have been put in place.

The sudden increase began to timeout our FastCGI processes
and this was alerted to me quickly. I confirmed this and my
first response was to initiate a restart cycle, restarting
each process in turn, which did nothing to help. I brought
up a new instance on EC2 and prepared to roll it out as a
new production machine, with the same steps I use for every
rollout of software updates. The new instance ran fine, so
I initiated the rollout, associating our public IP with the
new instance to begin taking traffic. Immediately, the
staging machine, now in produciton, stumbled and began
behaving exactly the same.

My next thought was the obvious thing both machines shared:
the database. I started looking at any metrics I could, and
with nothing obvious and the site already failing to respond,
it seemed a safe bet to restart the database, after some comments from the fine folks in ##postgresql, it became possible that badly terminated transactions might have been hanging processes and I was advised to restart PG, which is a disruptive action. When it finally cycled my staging machine seemed fine and I deployed it, only to watch it start to suffer once again.

This was when I got a message that we had gotten the bump from Mike Arrington, over at TechCrunch. Everything suddenly made sense, and dropping into logs showed me a huge surge in traffic. There are things I could probably improve about our setup, but I'm mostly satisfied with its progress. Still, this surge was well over what it was prepared for at the rate it was coming in and it would actually be unreasonable to expect a site this size to scale that quickly for such a large and relatively short burst (a few hours).

In the end, my final call is that the biggest problem that happened is that I didn't have the information obvious to me that it was the traffic and not the system causing the problem. Everything I did was only making the problem worse, and my best course of action should have been to step back and cross my fingers. I'm looking at short term reports I can consult to give me a better overview of the recent activities, traffic rates over the last hour and server error ratios that can tell me what's going on without spending too much time digging into it. The more time it takes to figure out what's going on, the more likely someone is going to jump to a conclusion in an attempt to get a solution moving as quickly as possible.

Comments

Popular posts from this blog

The Insidiousness of The Slow Solution

In software development, slow solutions can be worse than no progress at all. I'll even say its usually worse and if you find yourself making slow progress on a problem, consider stopping while you're a head.

Its easy to see why fast progress is better: either you solve the problem or you prove a proposed solution wrong and find a better one. Even a total standstill in pushing forward on a task or a bug or a request can force you to seek out new information or a second opinion.

Slow solutions, on the other hand, is kind of sneaky. Its insidious. Slow solution is related the Sunk Cost Fallacy, but maybe worse. Slow solutions have you constantly dripping more of your time, energy, and hope into a path that's still unproven, constantly digging a hole. Slow solutions are deceptive, because they still do offer real progress. It is hard to justify abandoning it or trying another route, because it is "working", technically.

We tend to romanticize the late night hacking…

Finding "One Game A Month"

I was really excited about the One Game A Month challenge as soon as I heard about it.
For about two years I've struggled in fits and starts to make my way into game development. This hasn't been productive in any of the ways I hoped when I started. Its really difficult to be fairly experienced as a developer, which I believe I am in my day job as a web developer, while struggling really hard at an area in which your experience just doesn't exist.
Its like being a pilot who doesn't know how to drive.

But this challenge provided a new breath to this little hobby of mine. It gave me a scaffolding to experiment, to learn, to reflect on finished projects. I had spent far too much time on game projects that stretched on far past their exciting phases, bogged down by bad decisions and regret.
And it has worked.
I have a lot to learn. I have a lot of experience to gain through trial and error and mistake and discovery. I have a lot of fun to be had making more small games t…

On Pruning Your Passions

We live in a hobby-rich world. There is no shortage of pastimes to grow a passion for. There is a shortage of one thing: time to indulge those passions. If you're someone who pours your heart into that one thing that makes your life worthwhile, that's a great deal. But, what if you've got no shortage of interests that draw your attention and you realize you will never have the time for all of them?

If I look at all the things I'd love to do with my life as a rose bush I'm tending, I realize that careful pruning is essential for the best outcome. This is a hard lesson to learn, because it can mean cutting beautiful flowers and watching the petals fall to the ground to wither. It has to be done.

I have a full time job that takes a lot of my mental energy. I have a wife and a son and family time is very important in my house. I try to read more, and I want to keep up with new developments in my career, and I'm trying to make time for simple, intentional relaxing t…