Wednesday, September 30, 2009

How To Invest in Poor Decision Makers

I couldn't think of a better title to fit my "How To ..." pattern. The point is, I wanted to respond to the 37signals post, which I found a little harsh. Sure, if you were able to build your company up without investors, that's a great thing! It doesn't make taking a boost in the early stages a terrible thing, and it doesn't give you a license to insult people trying to pay the bills and put children through college.

Making great products is something a lot of us aspire to. Frankly, that simply isn't all of us, and there really are good developers out there who are only in it for the money. I don't know that that is the case with Mint.com, but neither does anyone over at 37signals. Belittling them for taking a quick-cash option assumes a lot about their intentions that may be completely wrong.

Now with a chunk of change, maybe the founders are planning to jump ship in a couple years and self-fund their real dreams.

On the matter of start-up investment itself, I do want to make some comments. Full disclosure: I've never been involved in a venture backed startup and I'm completely making this up from my own opinions about the world!

Pretend I'm from your bank and call you back after a loan application. You're taking out a small business loan to build an additional room in your home for a new child. Everything looks good, and I've got a few questions to go over before approving the loan.

"I'd like to make you an offer for 10% ownership in exchange for this investment in your new venture," I begin.

"What the hell are you talking about?" you quizzically respond.

"We're talking about a significant investment in a potentially very profitable new enterprise. This child may well become a doctor or lawyer and if we're going to help with the initial costs of raising this from the ground up, we all feel it is a reasonable request to share part ownership and benefit from that share over the lifetime of its profitability."

"Umm... I thought I'd pay the loan back. Plus interest, even. I don't even think I would own the child myself, technically. This is very strange..."

"Pay us back? A guarantee of interest accumulated as profit on our contribution? We'd rather take a chance of nothing or you paying us regularly for the rest of the child's entire lifespan. Oh, and all of it's children, of course."

*click*

If we look at everything in our world with neutral eyes that aren't used to our ways, things look weird. Does our investment model make sense, in this industry or any other? Why are initial investments not set up as high-risk, high-interest loans, most likely with some initial grace period to await profitability? Of course, we could make some comments about predatory loans and paying a cut of income for the rest of one's life, but at least banks pretend that isn't the deal upfront.

It isn't like this is an unusual idea. People get small business loans all the time. The tech sector seems to have skewed expectations that lead to dangerous and strange arrangements for funding. Still, I can't help but wonder if there are independent investors who would or do take such a (relatively) altruistic route. I imagine something like a traditional investment round, mandating some grace period of 1-2 years, a repayment schedule requiring full reimbursement, and interest accumulation that tapers off after repayment of the initial investment.

The basic foundation could be extended to view all initial players as investors, be they investors of time or money. Invest your time to get a business started, helped by monetary investments from others, and after repaying yourself and those individuals the company becomes its own entity. It is not burdened with paying out profit shares to you or anyone else. Yes, you'll still make your salary and you'll still run the company, but it might be a better one for it.

Sunday, September 27, 2009

How To Learn From a Traffic Surge

I want to say a few things for my own benefit. Maybe that's the only thing I do here. As always, I hope something I have is useful to someone else. In this case, if you're in any position to deal with a big surge on a small site, you might get something useful from, or at least enjoy, what I have to say about my experience getting a bump from some guy named Mike Arrington with a little blog called TechCrunch.

This is about reaction, and what was the right and wrong way to react to the impact of a week's traffic in a couple of hours. Had natural growth brought our typical traffic to these levels (time will get us there), the means to handle it on a day-to-day basis would already have been in place.

The sudden increase began to time out our FastCGI processes, and I was alerted to this quickly. I confirmed it, and my first response was to initiate a restart cycle, restarting each process in turn, which did nothing to help. I brought up a new instance on EC2 and prepared to roll it out as a new production machine, with the same steps I use for every rollout of software updates. The new instance ran fine, so I initiated the rollout, associating our public IP with the new instance to begin taking traffic. Immediately, the staging machine, now in production, stumbled and began behaving exactly the same.

My next thought was the obvious thing both machines shared: the database. I started looking at any metrics I could, and with nothing obvious and the site already failing to respond, it seemed a safe bet to restart the database. After some comments from the fine folks in ##postgresql, it seemed possible that badly terminated transactions were hanging processes, and I was advised to restart PG, which is a disruptive action. When it finally cycled, my staging machine seemed fine and I deployed it, only to watch it start to suffer once again.

This was when I got a message that we had gotten the bump from Mike Arrington, over at TechCrunch. Everything suddenly made sense, and dropping into the logs showed me a huge surge in traffic. There are things I could probably improve about our setup, but I'm mostly satisfied with its progress. Still, this surge was well over what it was prepared for, at the rate it was coming in, and it would be unreasonable to expect a site this size to scale that quickly for such a large and relatively short burst (a few hours).

In the end, my call is that the biggest problem was that it wasn't obvious to me that the traffic, not the system, was causing the trouble. Everything I did only made the problem worse, and my best course of action would have been to step back and cross my fingers. I'm now looking at short-term reports I can consult for a better overview of recent activity: traffic rates over the last hour and server error ratios that tell me what's going on without spending too much time digging. The longer it takes to figure out what's going on, the more likely someone is to jump to a conclusion in an attempt to get a solution moving as quickly as possible.
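To make that concrete, here's the kind of quick-and-dirty report I have in mind. This is only a sketch: it assumes combined-format access logs at a made-up path, and it's something to glance at, not a monitoring system.

import re
from collections import defaultdict

LOG_PATH = "/var/log/nginx/access.log"  # made-up location; use whatever your server writes

# Pull the timestamp and status code out of a combined-format log line.
LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3})')

def summarize(path=LOG_PATH):
    per_minute = defaultdict(int)
    errors = total = 0
    for line in open(path):
        match = LINE.search(line)
        if not match:
            continue
        total += 1
        # "30/Sep/2009:14:05:12 -0400" -> bucket by "30/Sep/2009:14:05"
        per_minute[match.group("ts")[:17]] += 1
        if match.group("status").startswith("5"):
            errors += 1
    print "peak requests/minute:", max(per_minute.values() or [0])
    print "server error ratio: %.2f%%" % (100.0 * errors / max(total, 1))

if __name__ == "__main__":
    summarize()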

Saturday, September 26, 2009

How To Turn Web Development Around (Part 3)

When I complained about the problem, I promptly outlined some ideas about solving it, vaguely. Now, I want to narrow that outline into systems I actually use. I do most of my work with Django, some hobby time is spent with App Engine and Twisted, and I enjoy Amazon Web Services, so I'm thinking from these perspectives as I approach this. Parts one and two were broad, but some of this may apply to fewer of you. Either ignore those parts or adapt them to whatever you use.

Django's cache layer sucks. Simply stated and simply true. Any time I decide I can cache something, I should ask myself whether I could have built it before I even had the request in the first place. Doing that with the template caches simply isn't possible. It should be possible, and it should be the first path you take, instead of forcing us to go out of our way to do the better thing. Anything I might want to cache, I also might want to be sure I'm not building in more than one place, and forcing cache blocks inline in my templates does not help. The template caches imply a copy-and-paste method of reuse when a cached portion is used in more than one place. When I define a cache block, I name it and I specify a set of keys. That is exactly the information that, when changed, should trigger generating that block as a static snippet to be inserted. If it weren't for the lacking reuse mechanics, I would advocate parsing all your templates for cache blocks and pre-generating them. Instead, we need to pull the cached contents out of the normal templates and use the existing names and keys to find the generated snippets.
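To show what I mean by generating the block ahead of time, here's a rough sketch. The template path, cache key, and function names are all mine, not anything Django hands you; the only real pieces are render_to_string and the cache API.

from django.core.cache import cache
from django.template.loader import render_to_string

def rebuild_tag_cloud_snippet(tags):
    # Render the fragment once, outside any request, and store it under a
    # predictable key. `tags` is whatever data the cloud needs, built by the
    # same job that calls this.
    html = render_to_string("snippets/tag_cloud.html", {"tags": tags})
    cache.set("snippet:tag_cloud", html)
    return html

def tag_cloud_snippet():
    # What the template tag or view actually does: a plain read, no rendering.
    return cache.get("snippet:tag_cloud", "")

The page never decides to build anything; it only reads the snippet that was already built for it.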

On a more basic level, there are some abstractions that need to be injected into Django proper to really be useful, by means of what they would standardize. We have no current means of standardizing our cache keys in a way that lets different applications cooperate about what data is where and how to get it. Even the types that are taken for granted in Django have no useful standards here. If they did, I would be able to drop a QuerySet object into the cache in a way that another query could find and reuse. And, since memcached is by far the most likely cache backend in use, we would be providing a mechanism that abstracted away its limitations on entry size, allowing us to trust dropping our QuerySet in safely.
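As a rough illustration of abstracting that size limit away, something like the following could sit under a standardized key scheme. The chunk size and the key convention are assumptions on my part, not anything Django or memcached defines.

import cPickle as pickle
from django.core.cache import cache

CHUNK = 900 * 1024  # stay under a roughly 1MB per-entry limit

def set_large(key, value):
    # Pickle the value and spread it across several cache entries.
    data = pickle.dumps(value, pickle.HIGHEST_PROTOCOL)
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    cache.set(key, len(chunks))
    for i, chunk in enumerate(chunks):
        cache.set("%s:%d" % (key, i), chunk)

def get_large(key):
    count = cache.get(key)
    if count is None:
        return None
    parts = [cache.get("%s:%d" % (key, i)) for i in range(count)]
    if any(part is None for part in parts):
        return None  # a piece expired out from under us; treat it as a miss
    return pickle.loads("".join(parts))

# e.g. set_large("qs:blog.Post:front_page", list(Post.objects.all()[:100]))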

Denormalization should be normal. I have revision tracking in a document system, and from a normalization perspective it makes sense that each version hold a foreign key to either its previous or next version, but not both. From a practicality perspective, if I have one version I want to know the previous and next versions without doing a new query. Our Resources might offer a solution, by giving us some place outside of our model to allow denormalized data. I could generate a record of my documents with all the revision information queried and built and stored in one flat record, while keeping my base model clean.
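Here's the shape I imagine for that, with everything hypothetical: resources stands in for whatever store backs these Resources, and the record layout is just one I'd find convenient.

def build_document_record(document):
    # Flatten the revision chain so one read answers "previous" and "next"
    # for any version, without another query. Field names are made up.
    revisions = list(document.revisions.order_by("created"))
    return {
        "id": document.pk,
        "title": document.title,
        "revisions": [
            {
                "id": rev.pk,
                "previous": revisions[i - 1].pk if i > 0 else None,
                "next": revisions[i + 1].pk if i + 1 < len(revisions) else None,
            }
            for i, rev in enumerate(revisions)
        ],
    }

def store_document_record(resources, document):
    resources.set("document:%d" % document.pk, build_document_record(document))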

Queuing work should be as accessible as doing work. There is little or nothing inhibiting a developer from dropping one little query or action into an existing operation. I've recently built a weighted sort to replace our basic date-and-time-based ordering for posts. This means generating scores for all the posts and updating those when posts or votes change. Now, whenever we calculate scores, we account for the age of all votes and the relative scores and age of all posts and votes together. In other words, this is something I'd prefer not to add to the cost of a user actually posting content or voting on something. It would have been extremely easy for me to call one generate_scores() function inline, but it takes thought, planning, and infrastructure to have this done after the request is handled.

Borrowing from existing Python canon makes sense, so I think multiprocessing is a candidate for use here, in one form or another. multiprocessing.Pool.apply_async() without a result returned fits the bill for an interface to call some function at another time, possibly in another process. Any function that works when passed through multiprocessing into another process should also work when queued up for execution at some later time, so borrowing here reuses existing semantics developers should already be familiar with.
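As a sketch of that interface, modeled on apply_async() but ignoring the result: the queue and worker below are an in-process stand-in for real job infrastructure, and defer() is a name I'm making up.

from multiprocessing import Process, Queue

_jobs = Queue()

def defer(func, *args, **kwargs):
    # Queue func(*args, **kwargs) to run outside the current request,
    # mirroring Pool.apply_async(func, args) when no result is wanted.
    _jobs.put((func, args, kwargs))

def worker(jobs):
    while True:
        func, args, kwargs = jobs.get()
        func(*args, **kwargs)

# At startup, somewhere: Process(target=worker, args=(_jobs,)).start()
# In the request handler, instead of calling generate_scores() inline:
#     defer(generate_scores)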


Friday, September 25, 2009

How To Adopt/Kidnap a Project

Distributed version control is a good thing. I've started wondering, abstractly, about removing the middle word of that phrase. In other words, how are we being affected by "distributed control" and how will the landscape of free software politics change as it becomes more predominant and we all become more comfortable with it?

Even centralized version control began the distribution of control. At least, it made it easier for more than one person to control the changes of a codebase. In the old days of e-mailing patches around, it was pretty much a requirement that a single person be responsible for merging patches into any single codebase (or any section of that codebase). Source control allowed multiple developers to commit changes and began to put less burden and less power in any one person's hands.

Anything that makes the submission of new code easier is going to thin that power even more. When anyone can come along and submit changes to change functionality or add something new, it takes a little bit of control away from the owners of that project. At some point, you start to feel that the community runs your project as much or more than you do. This has its good and its bad sides, but it is a shift we see more and more.

A few years ago there was a rift in the development team of the XFree86 project, and from it we got our current fork, X.org. The story is well known and it brings to light an important political power of open source: fork and run. Even if you own a project, you'll begin to lose power, both to make users and other developers happy and to keep control at all. A strong enough disagreement could mean everyone else just leaving you behind and taking the project with them, under a slightly different name and a forked codebase. This can be scary and obviously could be harmful, but like any democracy we trade that for the benefits willingly.

Today, forking is easier than ever. Project hosts like github and launchpad promote forks as the primary means of submitting patches. No longer do you submit your changes for scrutiny and wait for acceptance or denial. These are the days of "I liked your project, and I have my own version of it. Take it or leave it." Other developers are as welcome to use your version as the original. This begs the question, when does the original project stop mattering and when do we come to realize that all forks are created equal?

The big question here is when the use of a fork with a few patches, either yet to be pulled into the original or rejected for differences in opinion, becomes as reputable as using the original. This can only happen if we get past looking at forks as either replacing or diverging and understand them as ongoing versions with differences for good reasons. Should I find that I want to make some modification to a library I'm using that the current maintainer doesn't want to accept, there should be no social issue with those two branches, the original and my own, existing and being used in parallel. For others, the choice of which to use should carry about as much weight as its configuration options.

When we reduce to zero the social cost of taking a project for your own to make the changes that fit your needs, we make many things easier. Abandoned projects become far easier to adopt, without feeling obligated to go through all due process to contact the creator and honor their wishes. Difficult-to-work-with maintainers no longer hold control over users and developers who disagree, because, even more democratic than a democracy, we can allow everyone to truly get what they want.

Thursday, September 24, 2009

How To Turn Web Development Around (Part 2)

I did my best to outline the problem in Part 1. Now I have to stand up and propose some kind of solution. Otherwise, I'm just complaining and contributing nothing of real value.

Our frameworks make certain things easier. They don't provide tools to help us with other things. For some other set of activities, they may actually prohibit them. The problem here is a combination. Django makes it easy to query your database and wrap functionality up into re-usable template tags. While I'm thankful for that, I am also realizing that the ease of one thing can prohibit another. When one path is made easier, it creates the perception of greater difficulty in other paths. I think this is why, when our web frameworks give us all these tools to respond to a web request, we completely neglect everything we could do aside from that request.

How can we make it easier to work outside the web request?

We need some idea of what working outside the web request means. We also need to define these in terms that are useful when we do get around to that request handling we've already got.

Going back to the tag cloud example, look at the resources created when we generate one. Aside from the HTML snippet of the tag cloud itself, we build the data used in the cloud, consisting of all the unique tags and their counts. This is the kind of data that makes sense to store in your cache, but it fails the normal cache use case. We don't want to lose these generated resources when caches reset, so we need something less ephemeral. Any decent key-value store would be a good solution here.
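Something like this is all I'm after. The store object is a stand-in for whichever key-value store you trust, and the key name is made up.

from collections import defaultdict

def build_tag_counts(taggings):
    # `taggings` is an iterable of (object_id, tag_name) pairs.
    counts = defaultdict(int)
    for _obj, tag in taggings:
        counts[tag] += 1
    return dict(counts)

def refresh_tag_counts(store, taggings):
    # Run whenever taggings change, never during a page request.
    counts = build_tag_counts(taggings)
    store.set("resource:tag_counts", counts)
    return counts

def get_tag_counts(store):
    return store.get("resource:tag_counts") or {}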

Unfortunately, basic Django signals alone are lacking. Another means of triggering the resource generation at the right times, with the right parameters, has to be found. It makes sense to use the existing signals, but only to add work to a job queue.
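Wiring that up might look roughly like this. post_save is real Django; queue_job, rebuild_tag_cloud, and the Tagging model are placeholders for whatever your setup actually has.

from django.db.models.signals import post_save

def on_tagging_saved(sender, instance, **kwargs):
    # The request only pays for the enqueue; the regeneration happens later.
    queue_job(rebuild_tag_cloud)

# Tagging is whatever model your tagging app saves when taggings change.
post_save.connect(on_tagging_saved, sender=Tagging)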

The few remaining parts to give us easy mechanisms for inserting snippets into templates or grabbing generated datasets in views are all very simple. Together, the three layers come together to give us what our frameworks are leaving out today. Resources, to store non-cheap data. Jobs, to generate resources. Finally, Tools to acquire and use those resources. If I were an egotistical man, I might try to coin my own acronym and name this RJT.

I know this is nothing new. Rather than make the situation better, that actually makes it worse. As any project grows and matures, the cut corners need to be filled in. Everything here is eventually built, in different variations and with probably a lot more forethought (or a lot less, depending on the pressure). The only difference is that large-scale applications need to divert more resources to pushing instead of pulling, whereas smaller-scale applications simply should do it anyway, because the benefits exist in either case. We won't all need to grow at exponential rates, but we should be doing better with whatever resources and whatever workload our application is given, small or large.

Wednesday, September 23, 2009

How To Turn Web Development Around (Part 1)

Something has been bothering me this past week. I've been taking some stabs at reducing the maximum render time of a site, when the caches are all empty. I cache certain components and queries, and when the caches are primed the render time is under 500ms, which I think is pretty good. That worst case scenario, however, is just not acceptable: worse than a couple of seconds, and that isn't time that should be taken. I dug in and found a really bad pattern.

It isn't hard to make a page faster, but the default is to be as slow as possible. We have to understand this pattern. I am looking at this in relation to Django, but I have a feeling there are similar patterns other places.

The common tagging application is a good example. It makes it really easy to tag objects, count them, query by them, and build those clever little clouds. You're given lots of new wrappers for all the common tag-related queries you'd need to do. This may be a source of the problem. We've gotten into a rut of complacency with components that give us more rope than we need to hang ourselves. Abstraction hides the cost of operations.
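Here's a made-up but typical example of how that cost hides. The models are hypothetical, but the pattern should look familiar.

from django.db import models

class Tag(models.Model):
    name = models.CharField(max_length=50)

class Post(models.Model):
    title = models.CharField(max_length=200)
    tags = models.ManyToManyField(Tag)

    @property
    def tag_count(self):
        # Reads like a plain attribute in a template ({{ post.tag_count }}),
        # but it's a COUNT query. Repeat it across 50 posts on one page and
        # the page quietly runs 50 extra queries.
        return self.tags.count()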

Are we asking the really simple question: Why are we pulling when we could be pushing? With one of the most read-heavy information systems in the world, everything revolves around the needs and demands of the almighty HTTP request. A browser asks a question and then we go and figure out the answer. We are, by default, building our software around the read when we should be building around the write.

Caching is lazy. We should be pro-active. How often is a tag cloud going to change? Only when the taggings change, of course. No page request should ever generate a tag cloud. We should be building the cloud, as a static html snippet, every time the tags change. When we're actually rendering a page, we should just insert that current snippet.

The problem is that we make lots of tiny little increases in the pulling we do, and we do it all over the place. We hide it behind innocent-looking functions and properties, end up using a few of those inside one element that gets repeated, and it piles up. The amount of work in one page becomes insane, for what that page is. The problem isn't that it is so difficult to do something better, but that the default should be better. I would like some practical answers to making that default better.

Friday, September 11, 2009

How To Track Changes in the Location Hash

As the web becomes more "2.0" we're collapsing collections of pages into fewer, and even single, more dynamic pages. Instead of ten pages of event listings, we might have one page that loads further items dynamically as you scroll. The state that was once static to a page is now loose and can alter over the lifetime of a page, which grows longer every day. Parameters of the page state have always sat in either the path, in URLs like http://myblog.us/post/how-are-you, or the querystring, in cases like http://www.coolstore.com/view.html?product=4356.

Neither approach works when those parameters are changing for the life of the page, and where a single URL needs to be able to represent these multiple parameter values at any time. In most uses so far, the bullet is simply bitten. The user can browse to your site and click around, and if they bookmark or send the link to a friend, they'll always come back to the front page, because the state of the page is no longer held in that URL. This wasn't acceptable for long, and a few projects, including GMail, have taken to tossing some state information into the hash or anchor of the URL, after the # symbol. This has traditionally told a browser to scroll to an <a> tag with that name, but if none exists it becomes, essentially, a no-op. We now have a place in the URL where we can store state without causing page loads, persisting the state in links and bookmarks. There still aren't great or standard ways to deal with this, though.

A couple years ago I started my own attempt to make this easier, when I found existing libraries outdated or just not really doing what I hoped for. They either seemed to depend on obsolete versions of other libraries, or only give a little trigger when the hash changed. I thought we needed something more than that, because this is really replacing everything we used to use querystrings for. Sure, I could toss #2 or #43 at the end of the URL depending on what page of results you saw, but what if the state was more than a single number? Querystrings can store lots of variables. This is what I wanted within the hash.

Born was hashtrack.js!

The API is pretty simple. You can check and set variables in the hash with hashtrack.getVar() and hashtrack.setVar(). Changes to the hash or to specific variables in it can be registered with callbacks via hashtrack.onhashchange() and hashtrack.onhashvarchange(). You can view the full documentation, including embedded interactive examples, at the github pages hosting it.

Tuesday, September 08, 2009

How To Select from a Range

I had some down time today to relax, and in true obsessive fashion I spent it coding for the hell of it. I got something in my head and whipped up a demo of the idea. Do you ever need to let someone select a range of things? Maybe they need to pick which and how many items to show in a search result or which letters of names they want to see from an address book? I wanted to allow selection of both "what" and "how much" in one click.


click for demo

The range being selected from can be anything: numbers, letters, weeks of the year, etc. Users can click through it like a list of page numbers, as they would expect. I think this would work well in situations where you don't need the entry to be exact, although it can be used for precise entries. Multiple quick selections would also be easy here, maybe quickly changing the range you're viewing in an analytics app. I'd also like to look at adding a "zoom" feature, so that one selection fills the entire widget and then you can select within that to narrow down on the exact range or specific item you want.

Fork away! Especially if you're the kind of developer/designer who can make this not look like government grade bread.

Github: http://github.com/ironfroggy/rangeselection/
Demo: http://ironfroggy.github.com/rangeselection/
License: MIT

Wednesday, July 22, 2009

How To Fail At Upload APIs

Youtube, what the hell is up with your upload APIs? Here we are, hacking along and being all "Hey, we got accounts syncing and thumbnails popping up and videos getting attached to blog posts and its nifty-pie, oh yeah." when we make the move to be even more web 2.0y and add full authentication with youtube accounts and integrate the full upload cycle into the media selection. We take some little video upload tests and we're happy about that. Oh, that was a nice milestone to hit in FogBugz, I tell you what.

The week passes and suddenly I'm asked why all the video uploads keep failing. All the usual things are checked, but the upload tokens validate and the headers are correct. All our tests are within the upload limits, too. These aren't friendly "Hey, I really hate to tell you this, but the video you just uploaded didn't go so well. Can you try it again in a bit? Thanks"-errors, either. These are "Fuck off. I just reset your HTTP connection, bitch"-errors. Other times, we get 502 Bad Gateway from Youtube's servers, and that isn't the kind of nice error you expect from a professional service like that. With the errors happening on the client browser, I'm left with nothing to do by way of responding nicely. Our own machines never get a single byte from Youtube on the matter, much less the nicely formatted response-requests they promise to use to tell us all about the success or failure of the uploads we set our users up with.

What gives with the weird errors, and what gives with them not being nicer about it?

Through support forums, our problem is matched up not with some obscure thing, but with many, many posts. It seems like everyone and their dog gets reset connections and 502 errors during uploads. At this point, I'm absolutely questioning whether Youtube was ready to release this part of the API when they did, because it is obviously not mature. Now, I happen to know on good authority that some people do upload videos to Youtube, from time to time. You know, with that little uploader they wrote called The Youtube Site Itself. So, the theories I have are that they either aren't eating their own dogfood or their own uploader is doing smart-ish upload resuming when their internal upload API chokes on them constantly. In the end, I have to ask, "What gives?"

Apparently, according to support responses and our own tests, if you upload to uploads2.gdata.youtube.com instead of uploads.gdata.youtube.com, it works fine. This is their "new upload system", but what does that mean? The API is completely identical, so does "new upload system" mean a new specific set of boxes that handle uploads? Specific boxes don't seem very cloud-like for Google, don't you agree?

Tuesday, July 21, 2009

Review: FogBugz 7.0

While ignoring completely that I was promised access to the FogBugz OnDemand 7.0 beta program and then somehow forgotten, I'm going to come out and say, one day in with the official release, that I'm more excited than ever to be a Fog Creek customer. Yes, I am still a card-carrying free software nut-bag. I'm absolutely sure certain individuals will get aggravated at me, again, for not using Trac like a good geek.

Call me a fanboy, but boy-oh-boy is this a sweet release. It is a shining example of knocking a release out of the park and impressing everyone (who could be impressed at all, and thus disregarding those who will never be impressed by a commercial, for-profit bug tracker, ever, no matter what, not in a million years).

The experience is absolutely slick. Faster, brighter, shinier. Packing new features, improved features, and bug fixing in a new package is a great way to make the functional improvements stand out. Even if we have a good design, any product should take a note from this book and spruce up the design just to highlight that change is in the air.

I'm actually struggling to think of something that has been added that I didn't want, or that I wanted which was not added. I'm sure there are people on both sides, but I'm still thrilled to apparently be an exact match for the unsatisfied customer they were targeting. Even though I have really liked FogBugz for some time, I have also struggled to make it represent my work flow. I've worked out different sets of behaviors with different clients.

I keep a milestone that never gets a due date and only exists to hold cases that are approved to be done "some time" and we get to a few of them between each actual milestone. Today, I can drop that and prioritize cases in the backlog directly. I also added custom case statuses to "propose" and "reject", so I can track what I think we should do and what has been approved.

Was it a bug or a feature when I needed to clean up the content form? Next time something like that comes along, I'll enter the case as an "Improvement", the new category I added for such in-betweens. I'll probably tag it for organization, too. Maybe, I'll add a custom tag to track the branch I'm working in. I'm really looking forward to getting more and more mileage out of this release. I really have to commend everyone that worked so hard to bring this iteration to the public. Thank you, so much!

Monday, July 20, 2009

How To Recurse Your Foundation

Or, the working title: How To Look Down At The Tower of Turtles

We're a recursive bunch. There is no shortage of writing that it's caches all the way down or that we're repeating the mainframe/dumb terminal era. I have an argument that our entire profession hinges on repeating ourselves.

Repetition is in the DNA of what we do. Software is the ultimate commodity, approaching zero-cost production. Solve one problem and the solution is applied to a thousand problems. Generalize and solve a million. Everything we do is repeated and is about repeating things. At the core, we're just moving little bits around and we repeat that action over and over, with very slight alterations. We abstract the repetition, and then we repeat the abstraction so much we need to abstract that.

We could continue to make individual observations, like the mammoth stack of caches every bit goes through or the abstractions we build up over and over on top of our languages and toolsets. Look at the core of this and you find an axiom in everything we do. We're all about doing whatever we do a lot as efficiently as possible. When we realized a block of code might need to be used in different places, we created functions and subroutines. When we needed to fetch and refetch the same data from memory, we built caches inside our CPUs. Libraries helped us reuse code, and version control systems helped us apply one developer's changes to the whole team's workstations. Google needed to do roughly the same thing on thousands of machines and abstracted the whole thing with not just MapReduce, but some of the smartest, most effective sysadmin work we've ever seen.

We should accept and appreciate the overall pattern that has been driving hundreds of individual observations. The difficult part is to benefit from the knowledge. How do we make what we do better by understanding such a core axiom that drives everything we do?

Why You Should Stop Complaining

Things change, and the work you do is always going to change. In some businesses, this is slow. In ours, it is very quick. I've seen people complaining about high-level languages. There are some who are quick to ignore the claims of benefits from such things as Cloud Computing and other new things they believe to be worthless. We are in danger of being stubborn. You cannot become entrenched in tradition or "the way we've always done it" in an industry that moves this fast. While traditional databases and static typing have served us well for decades, that is no mark against the value of other concepts (both new and simply revisited).

The relational database, as traditionally envisioned, often hits a very predictable and known wall: the bounds of any single machine. Yes, there is master-master replication. Yes, there are clustering techniques that can take advantage of additional hardware in particular ways. Yes, you can shard the data across multiple database machines. The growth of the database from a single machine to many is indicative of the greater pattern we see over and over again: the need to do something repeatedly commoditizes the individual acts and components.

Stretching your database over dozens and hundreds of machines and more, and maintaining a high growth rate over that cluster, and eventually over super clusters, does something interesting to your view of the individual databases: they barely matter. When everything is managed by a single machine running PostgreSQL, Oracle, or MySQL, there is a tendency to do a decent amount of specific tuning. What kind of indexing do you need to build on which fields? What is the most efficient column layout for this table? These are questions that matter. Now, when you need to store several dozens of terabytes across hundreds of machines, these are details you'll think about as often as Java developers think about CPU registers.

There is no shortage of developers who will soundly tell you just how buzzword loving and stupid everyone who enjoys Cloud Computing is. Databases are always going to be important, they tell us. Most of us don't need to scale like Amazon or Google or eBay, they tell us. They are correct, but they miss the point.

There are two reasons commodity scale computing benefits developers and a group of developers for each reason.

Why That Guy In His Basement Cares About This

No, the hobbyist making little web apps doesn't need to scale to huge loads, high traffic, or enormous datasets. However, those who do need it drive every aspect of dealing with all the details involved into commodity status. This is not special to our industry. There are independent car companies, thousands of t-shirt companies, and restaurant opening costs driven down so far that they're barely profitable. Isn't business grand?

Why That Guy In The Corner Office Cares About This

Imagine the growing company in the late 90s building out their website and investing in a dozen or so heavy machines to run nice Oracle databases, which are obviously good choices because they're expensive and therefore good. The DBA team makes careful estimates of the needs their machines will face and plans the roles of each box carefully. They map out the schema, build the databases, establish their procedures and policies. Everything has its place.

Then one of the machines dies, thanks to a rare but statistically inevitable hardware failure. There is no saving it. The data was backed up, and easily retrievable, but downtime is still inevitable.

Contrast this to the cloud mentality's most important aspect: individuals don't matter. Individual machines don't matter, because functionality and data are spread out and replicated. Individual processes don't matter, because state is persisted and broken up into many services and workers, who can drop and spawn at the drop of a hat.

Sunday, July 19, 2009

How To Teach Software Development: Why Good Developers Should Care - Part Two

How To Teach Software Development
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

What's the Point?

Some opinions, while held, are held softly. The understanding seems to be that the opinion changes nothing and you aren't doing anything about it, so giving a damn is pointless. You may call it apathy, but I call it misunderstanding the nature of information. Information spreads from those who have it to those who do not, and those in agreement grease the wheels of that distribution.

Of course, there are good developers who don't care if there are bad developers. I'm not convinced they're still reading, at this point. If they are, then the reasons we can make a difference should help convince you to care about that difference in the first place.

The more widely held the beliefs that we can and should do something to improve the quality of this industry, the more likely anything will happen. You might not be lecturing in the classroom, but I'm sure you've pointed something out to a colleague, new or old, so remember that education never ends and we're talking about life-long improvement, not what people start with.

Your opinion spreads like a bad rumor... but, good! You could point out to that guy in the next cubicle that the subprocess module is a cleaner solution than the popen*() functions, but many may think it doesn't matter for the code that already exists, so why bother him about it? We might ignore that cleaning up code makes it easier to come back to for fixes and improvements down the road. We can't ignore that pushing him to do the right thing today makes it more likely he'll think about it twice tomorrow. It also makes it more likely he'll return that push to you when you slip, or to the next person he notices with room to improve (we all have some). We have a collective momentum, and together we decide if it goes up or down.
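For instance, that subprocess nudge might be as small as showing the old habit next to the cleaner way:

import os
import subprocess

# The old habit: a shell string and no real error handling.
listing = os.popen("ls -l /tmp").read()

# The cleaner way: explicit arguments, no shell, and a return code to check.
proc = subprocess.Popen(["ls", "-l", "/tmp"], stdout=subprocess.PIPE)
listing, _ = proc.communicate()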

The issues at hand are more than the initial state of entry levels. We have an investment in our fellow coders, graphic designers, UI experts, testers, and managers, like it or not. Within your own teams the need for quality is obvious: your job is better when you have better code to work with, more understanding managers supporting you, and an environment that supports more than just good enough.

Our group motivation carries outside of our own bubbles, as bubbles are mostly illusion. Think of every third party tool or library you've had a problem with and remember Kevin Bacon. Think of every new member of any team you've had a problem with, too. All of these frustrations come from people and those are people you have influence over, because we are all connected. Maybe you don't believe there's anything you, personally, can do to improve our sad state of affairs. Remember that we all have an effect on everyone else, even in the most indiscernible, indirect manner. It is very easy to downplay and completely ignore those many but tiny influences we all make, and we do it in nearly every context of our lives, but I want you to know that you do make a difference. You make a difference because we all make those small differences together, and when they are in alignment they are more powerful than even the most public figure with a metaphorical bullhorn.

Even if you will only grease the wheels of change, if you care about the change at all do not let those wheels squeal!

Saturday, July 18, 2009

How To Teach Software Development: Why Good Developers Should Care - Part One

How To Teach Software Development
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

Doctors and lawyers in the United States have the American Medical Association and the American Bar Association, respectively, and surely have analogous organizations in other countries. As representatives of their professions, they work collectively with their colleagues toward a reasonable and useful goal: quality control. It is in the interest of doctors that their med students are not idiots. It is in the interest of lawyers that their opponents are not (more) unethical. There are other, perhaps less admirable uses of these controls (too many doctors would lower all their salaries).

There is no such quality control in our industry. We have individuals at the lowest end of the ability spectrum and at the highest peaks of skill, and we have the teachings of many years of expensive higher education on some of our shoulders and only the passion of self-teaching on others. Neither dimension seems an accurate predictor of the other, and this should raise an automatic red flag. What is our education doing for us, if it doesn't let anyone trust our ability without the first-hand experience, evidence, and other proof we could give them just as easily without that education?

I'll own up to my fair share of complaining about the problem, in exchange for anyone agreeing to do something about it with me. This isn't just an annoying situation that crops up in forums from time to time or explains why a silly "newbie" asked a silly question. No, this is something costing the economy billions (trillions?), making all of our jobs more difficult, and actually killing people in the rarest situations. We aren't just being elitist snobs when we complain about someone taking a route that seems inadequate compared to what we think we can do. Further, it shouldn't be considered bad to tell them so. If any of us have difficulty being told we could improve, we don't hold our job in the dignity it deserves.

Of course, this might seem pointless. Do you think I'm just venting? If so, you may either believe there isn't anything we can do to improve the situation or that there isn't any point in improving it. Either way, my aim is to convince you.

Next: How To Teach Software Development: Developer Motivation
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

Friday, July 17, 2009

How To Click It Like You Mean It

Yes, this is a screenshot of a screenshot. Stick with me, I really do have a point to this! I have to admit, publicly, that I clicked the button. The one in the screenshot. The one that isn't a button, just a PNG image. I should be glad it wasn't a pop-up!


I realized my mistake at the moment I was clicking on it, but it happened too fast to stop. I had to sit and think for a moment. Why did I do that? It drove me to write this pretty much immediately and do a couple of mock-ups for solutions. I never want to let my users lose information or control over it. That is, we don't want them to OK a message away and neglect to actually read it, and we don't want them to click "send" before they're really, really ready to confess their never-ending love to Glenn Beck.

Those are two distinct safety nets: information the user misses by being click-happy, and actual actions within the application they might have wanted to avoid. Any reversible actions, like closing a dialog box or deleting something (if a copy is kept around for safety), should be given easy undo options. Even closing an entire window, if made easy, should be something you can undo.

Of course, you can't undo sending an email or formatting a USB drive. You can undo an archive-and-compress operation that replaces the original files, by extracting them (even if the extraction is bound to an undo button), but if the undo is sufficiently expensive, give me the chance to avoid it in the first place, please. Make me pause and think about what I'm doing first.

Of course, a lot of us are doing web apps today, so it gives us some limitations. It also means, if you want to be friendly to your users, you probably shouldn't use default dialog boxes at all. Now, we might look at wrapping some. An alert_with_undo() javascript function, anyone?

Thursday, July 16, 2009

How To Care If BSD, MIT, or GPL Licenses Are Used

The two recent posts about some individuals' choice of GPL versus others' preference for BSD and MIT style licensing have caused a lot of debate and response. I've seen all of it as an interesting combination of very important topics being taken far too seriously and far too personally. All involved need to take a few steps back.

For the uninitiated and as a clarifier for the initiated, we're dealing with (basically) three categories of licensing when someone releases software (and/or its code):
  1. Closed Source. Easiest to explain, because you just get nothing.
  2. GPL. If you get the software, you get the source code, you get to change it, and anything you combine it with must be under the same terms.
  3. MIT and BSD. If you get the software, you might get the source code, you get to change it, and you have no obligations about anything else you combine it with.
The situation gets stickier when we look at those combinations and the transitions between them.

Use GPL code with Closed Source code

So long as you don't distribute your software this is fine. It is a perfectly OK thing to do for software running servers or only running in-house. However, if you want to distribute your software to end users, the terms of GPL code require that the GPL also applies to your own code, so you've got to give that code away, rather than keep it closed. Further, you have to let the users modify and redistribute it.

Returning modifications upstream?

Go ahead. As the owner of the closed source, if you decide to take portions that have modified the GPL code and return that to the project as a thank you, it is your right. You don't have to release your entire project's code to do this. Similarly, if you want to release other portions of your code for use, it is likely required to be GPL, itself.

Use MIT/BSD code with Closed Source code

This happens a lot, in the same kind of situation above, but also in distributed software, because that is OK. In some cases, a notice that you use the code is required, but you aren't required to put your own code under any particular rules or license.

Returning modifications upstream?

Just like GPL, this is fine. However, you have more freedom about releasing other components of your code under any license you see fit.

Use GPL code with MIT/BSD code

Oh, no! Now you have a problem, because the release of your own code under MIT and BSD style licensing is forbidden if you include or link it (the terms can be fuzzy with modern runtimes) with GPL code. You probably just can't use any GPL code if your own is MIT/BSD style.

Use MIT/BSD code with GPL code

Sure, go ahead. The GPL is fairly receptive. If you release an application under the GPL and it requires or includes MIT/BSD style licensed libraries, that is just fine.

Conclusions

If you're a closed source, server side or in-house project, you don't have much to worry about. You aren't distributing, so little of this matters to you. If you're a closed source, distributed product, then GPL is off limits for you. As the lead of an open source project, you still need to worry about GPL code. Either it can limit how people can use your code, by forcing it to become GPL, or you could face limited use by making the decision yourself. In short, while it's an acceptable license for its uses, it happens to be the most limiting under these factors.

If you release some GPL code, I probably can't use it. Period. End of story (ignoring these commentaries about the story). Now, maybe you don't care if I can't use it, but isn't that why you're releasing it? The GPL is meant to protect us, but who and what does it protect us from? I can't release it in a closed source product, and I don't want to, but you're also keeping honest, open source enthusiast developers from using your project. You aren't limiting us for technical or legal reasons, but only for our choice of another license. Someone releasing GPL code can say everyone has the freedom to choose their own license, and that is true, but you can't escape that your own choice specifically limits who else can interoperate, based entirely on whether they agree with you.

Wednesday, July 15, 2009

How To Use the Youtube Data API: Authentication

After a couple of days' trouble with the Youtube Data API and the provided Python wrappers around it, I thought it would be good to collect my findings on what works and what doesn't, and to fill in the gaps that I see in the docs. I really hope this series will be useful to some others in my position.

Some Doors Are Locked and Some Doors Are Ajar

A lot of the API's use requires no authentication, not even a developer key. This makes a lot of the most common, read-only integrations a snap. However, I think this makes it more difficult to adjust when the need to authenticate for other integration comes along. This did some damage to my schedule, so I'm going to help others avoid the problem.

Public operations are simple. Youtube gives us resources in the form of feeds and images and other things at API locations, like http://i.ytimg.com/vi/FedVhnHYn-Y/0.jpg to get the first thumbnail of a video. Just plug in your video ID and go.
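For example, grabbing that thumbnail needs nothing but the URL (the video ID is the one from the example above):

import urllib2

video_id = "FedVhnHYn-Y"
url = "http://i.ytimg.com/vi/%s/0.jpg" % video_id
# Fetch the first thumbnail and save it locally; no key or token involved.
open("%s.jpg" % video_id, "wb").write(urllib2.urlopen(url).read())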

When you get into the realm of authenticated requests, you've got to get a bit of foundation in place to start. I don't recommend crafting any requests yourself, so grab the client library for gdata. You'll also want to register your site and get a developer key.

With that all set up, gdata.youtube.service.YouTubeService is going to be your friend. The service object begins unauthenticated. With user authorization, it can be upgraded for single-use and long-term authenticated use. There is a ClientLogin path, intended for desktop applications, where you actually ask for their username and password. We won't be covering that.

We'll use AuthSub instead. With this method, the user is directed from our site to a crafted URL at YouTube, essentially telling it "Hey, I want this user to let me access their account. Is that alright with them?" The user has the job of deciding if you are trusted or not. If they agree, Youtube generates a special token to send to a URL you provided. The token you've been given is good for one request, so make it a good one! The best use of that one-time token is usually going to be exchanging it for a session token that you can keep using forever, until the user revokes your rights to their account. These are the steps we're going to see next.

import urllib2

def authsub_url(self, request):
    # Where Youtube sends the user (and the one-time token) after they approve.
    base = '/return/path/at/my/website/'
    next = 'http://%s%s?next=%s' % (
        request.get_host(),
        base,
        urllib2.quote(request.build_absolute_uri()))
    scope = 'http://gdata.youtube.com'
    secure = False   # unsigned (insecure) tokens are fine for this example
    session = True   # allow the token to be upgraded to a session token

    return yt_service.GenerateAuthSubURL(next, scope, secure, session)

The function generates and returns a unique URL to direct our user to. It takes a request because we need the host, and in my own usage I pass the current absolute URI as the return destination coming back from Youtube. You can also assume here that yt_service is an instance of gdata.youtube.service.YouTubeService, of course. Of note is the session parameter, passed as True, which enables the token we receive to be upgraded to a session token. The user will get a different message from Youtube depending on this parameter, so they know what you might be doing and how much access they're authorizing.

Your callback URL will be brought up by the user with a token parameter added to the querystring, and you'll be expected to keep track of that.

    # `token` is the one-time token pulled from the callback's querystring.
    yt_service.SetAuthSubToken(token)
    yt_service.UpgradeToSessionToken()
    session_token = yt_service.current_token.get_token_string()

This part tripped me up for a bit. Because the official docs are split among the official guide, the Python guide, and the actual definitive(ish) API reference, it wasn't clear that the single-use token and the session token are distinct tokens, rather than the original becoming a session token, which is what I understood at first. It would be a lot more clear, I think, if UpgradeToSessionToken() actually returned that new token. Of course, this isn't important if you're just using the yt_service right now. If you need to store that token for future use, however, then it happens to be really important information.

Later, if you saved this token, you can easily use it again:

    yt_service.SetAuthSubToken(session_token)


Summary

The ease of use is pretty nice. Generate the authorization URL and direct the user there, take the returned token and upgrade it for session use, and from then on you can do lots of fun things with their account.

Tuesday, July 14, 2009

How To Overcompensate For Something

In the spirit of the old name of this blog, Ranting Techno Rave, this is a rant about a personal experience. This happened in the line of duty, so it is on topic. Has anyone else dealt with this kind of thing? Tell me about it.

This title is purposefully "provoking" and if you're the one I'm talking about, you know who you are. This might even apply to you if you're someone else with the same kind of behavior. Maybe you know or have to work with someone that exhibits the particular personality traits I've had to deal with. In whatever way this applies to you now or in the future, beware as much if you are this type of coder as if you have to deal with one of them.

The lone ranger was a terrible cowboy.

Assertive personalities are important. They point out mistakes, instead of allowing problems through inaction. There is an issue of tact, though, a line one needs to watch while walking the road of the assertive. Code review requires assertion as you tell someone, "You're doing it wrong."

Rather than try to artfully explain and avoid the background of this post, I'm going to just present you with A List of Rules When Joining a Team:
  • Don't insult the code you were hired to work on. Don't insult the coders you were hired to work with. This was actually legacy stuff I was trying to replace, myself, but "What kind of an idiot wrote this?" was a bad enough question when you only thought I wrote it. If I had, I would have removed you immediately (and I should have, anyway).
  • Before you write a single line of code, don't claim you can write all of it yourself.
  • When your new team's lead developer leaves you with a set of bugs before leaving on a pre-scheduled holiday, don't let him return to find the existing code base deleted and a bunch of new stub files checked into a new repository.
  • Respond to email.
  • Actually do your job before taking the money.
  • Last, but not least, please, please, please let me be in the position to yay or nay your application a second time.

Monday, July 13, 2009

How To Teach Software Development

How To Teach Software Development
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

Education is broken. Education about software development is even more broken. It is a sad observation of the industry, from where I stand. I have come to see good developers from what should be great educations as survivors, more than anything. Do they get a head start from their education, or do they overcome it?

This is the first part in a series on software education. I want to open a discussion here. Please comment if you have thoughts. Blog about it yourself. Write about how you disagree with me. Write more if you don't. We have a troubled industry. We care enough to do something about it. We harp on the bad developers the way people used to point at freak shows, but we only hurt ourselves by not improving the situation. We have to deal with their bad code. We are the twenty percent and we can't talk to the eighty percent, by definition, so we need to improve the ratio that comes out of the factory, because we can't touch them once they are set loose on the world. Fix this problem at its source, with me, please.

For Students This Means...

You're paying for what you aren't getting. Either you really care about the field you're spending all this money to get indoctrinated into, or you expect to be honestly prepared for a career you think is lucrative. Neither case is true.

For Schools This Means...

You aren't producing the impressive minds and individuals that make a school stand out.

For Businesses This Means...

You front the cost for the bad performance and overcoming of a lacking education, so consider this problem a fiscal one.

For Good Developers This Means...

You have to put up with these poor saps.

Sunday, July 12, 2009

How To Work a Sigmoid - Part Two

Software Development in Really Big Steps
  1. How To Work a Sigmoid
  2. How To Work a Sigmoid - Part Two

The last time I wrote about the curvature of project estimations, I was just speculating. Since then, I've discovered that FogBugz does track estimation over time, with a daily estimation record, and offers a graph of the 0, 50, and 100 percent estimates over time. I've been watching this develop for a short while, working more with tracked estimates, and I think some expansion on my thoughts is ready.

You can see my own estimation graph here, and it demonstrates exactly what I predicted. I suspect a more complex plot would emerge as the project gets longer, but I have a few curiosities about how this would expand over time. The basic prediction holds: a generally unchanging estimate at the start, faster growth in the estimate through the middle, and a final calming and flattening onto the system's best guesses, as you slow down how many cases you file for every case that you close.

Steep hills in the estimation happen because for every case you close, you file some bugs, related features, and other cases that were brought to light or just gotten around to filing at that time. You can break down the states of case closure versus creation into three.

When you complete work in line with estimates, things are On Track. This is misleading, but a good state at any rate. If you have ten hours worth of cases, spend 4 hours, and close about 4 hours worth of estimated cases, the target times on the project remain steady. If you keep this up until all the cases are closed and the project is finished, you can consider your estimations successful. Of course, it is more complicated than that.

As the design and plans are fleshed out, you'll find developers file more bugs than they close. The estimation is pushed further and further back. This isn't because the project got more complicated or fell behind, although it could be so, but because the bulk of the cases needed to represent the entire work of the project haven't been filed yet. If we could design the entire thing up front, enter the cases, and never change them, we could keep a static estimation, provided we remained On Track. We know that we can not and should not design everything up front, so we need to understand and work with changing estimations.

I'm going to make a second prediction about the estimation curve. I predict the curve presents itself in many steps. There are likely to be spurts of case filing and periods of working steadily on the cases that exist. Different developers may have these steps overlap. Taking a few steps back, the steps will smooth into a larger, similar curve for the entire project. Each of these filing spurts will be the start, work, and wrapping up of some component inside the greater breadth of the project.
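To make the shape concrete, here is a toy simulation of that idea: cases get closed at a steady rate while new cases arrive in spurts, and the spurts die off toward the end of the project. This is only an illustration of the stepped curve, not FogBugz's actual evidence-based scheduling, and every number in it is invented.

    import random

    random.seed(1)

    total_estimate = 40.0   # hours of cases filed so far
    closed = 0.0            # hours of cases closed so far

    for day in range(1, 31):
        # Close roughly four hours of estimated work per day.
        closed += min(4.0, total_estimate - closed)
        # Filing spurts at the start of each component; they stop late in the project.
        if day < 20 and day % 7 in (1, 2):
            total_estimate += random.uniform(6, 12)
        remaining = total_estimate - closed
        print("day %2d: total estimate %5.1fh, remaining %5.1fh" % (day, total_estimate, remaining))

Plot the total-estimate column over time and you get the flat start, the steps through the middle, and the flattening at the end.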



Saturday, July 11, 2009

How To Recognize "Software Development" Is Step One

We're all "making software," but what's that mean? There is no shortage of resources on writing code. Debates rage on about this library and that, emacs versus vi, or nix versus windows versus osx. How much of it matters? We're arguing what car dealership gives us the best deal, automatic versus manual transmissions, and shades of colors to promote the best feelings when you see that shiny new car. Great, you've got the nice car (we all do), now you've got to drive the damn thing and keep it maintained for its lifetime. Who is paying attention here?

We spend thousands of hours discussing how to write software and millions of dollars helping ourselves do it, but most of us have no clue how to keep that code around and get it in the hands of users. I won't make this a post about "The Cloud", but I will say it's largely successful because it solves a problem most developers either ignore or are never properly exposed to.

I won't blame PHP, but it fits the bill to describe what is either a symptom or a cause of the problem: dump-it-and-forget-it deployment, while useful, has made a generation of developers unaware of what may well be the majority of work in their chosen profession, if you look at it right. How many people deploy their site by copying some files via FTP, even today? A frighteningly large number! How do you think those same individuals debug? Do they even know what the word means?
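For contrast, even a small script is a step up from hand-copying files over FTP, because it is repeatable and it fails loudly. This is only a sketch; the host, paths, service name, and health-check URL below are placeholders, not anything from a real project.

    import subprocess
    import sys

    HOST = "deploy@example.com"          # placeholder server
    TARGET = "/srv/myapp/current"        # placeholder path

    def run(cmd):
        print("+ " + " ".join(cmd))
        subprocess.check_call(cmd)

    def deploy():
        # Sync the working tree, restart the app, then hit a health-check URL
        # so a broken deploy is noticed immediately instead of by the users.
        run(["rsync", "-az", "--delete", "--exclude", ".git", "./", HOST + ":" + TARGET + "/"])
        run(["ssh", HOST, "sudo service myapp restart"])
        run(["curl", "--fail", "--silent", "http://example.com/health/"])

    if __name__ == "__main__":
        try:
            deploy()
        except subprocess.CalledProcessError as exc:
            sys.exit("deploy failed: %s" % exc)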

The problems here extend beyond the code slingers to the cash slingers as well. Have you ever tried to convince a client that the time spent building deployment, logging, and diagnostics facilities upfront really isn't just a way to bloat your invoices?

I want to take a time out here to admit I'm not really sure where I'm trying to go with this...

Let's have fun and be completely arbitrary in the comments: What percentage of the job do you expect to be writing code when you start and what is the reality?

Friday, July 10, 2009

How To Respond to Google Chrome OS

UPDATE: Fixed 'Response' to 'Respond' in title. Sorry about that.

We all have to do it, so I might as well take my turn.

First impression: no surprise here.

There are expectations in two forms here. We can expect certain things to come of this, and we can expect certain things to disappoint us about this. There is a third, external expectation: that techies will divide into a camp of people who think it's Rilly, Rilly Important and a camp that thinks you're all wasting your time. I mean, gosh, it's almost like this is exactly like any other topic we split down some arbitrary middle. Get over it.

I Expect To Like:
  • Cheaper netbooks
  • Installing Chrome OS on old hardware
I Expect To Dislike:
  • Feeling like I have an OS that won't let me install anything but a browser
  • Not being able to install Android Apps
  • Not being able to run real Chrome on Android
  • Having no way to persist the state of a Javascript VM, so that I can close applications or save memory on long running ones and resume my work later
  • Still not being able to sync my bookmarks and open tabs and page states properly (or at all) so that applications that are just websites can easily move from my little netbook to my desktop
  • Not getting Android on netbooks, because Chrome OS gets pushed, instead
I Expect To Be Let Down About:
  • Getting Chrome OS on Tegra hardware with O3D
  • Google doing a funny video in Times Square asking "What is an operating system?"
  • Never having Google Notebook on a Google Netbook
The fact that none of the pros in these lists have anything to do with Chrome OS itself is not lost on me. I'm actually excited about it. I think it's a really good thing. The availability of what will certainly be a quality project will do great things for our perception of the web, the price points of netbooks, and Christmas in a down economy. The thing is, Chrome OS, at least initially, will be great for what it is not, rather than what it is.

Wednesday, July 08, 2009

How To Like What You See on the Frontpage

Some suggestions for improving a content voting system sparked some thoughts about the idea, and I wanted to write them down. The initial move was to remove down voting. No one uses it, and negatives are, well, negative. So we'll drop "vote down" and replace "vote up" with "like", because what is more friendly than liking something? You know, it's like you're in first grade and the article is that cute girl eating paste.

At the same time we were discussing sorting. Everything is chronological, but people might want to see popular things. Is it popular because people vote up on it or because lots of people read it? Of course, lots of places weight these today (like Reddit), so that was discussed.
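As a back-of-the-envelope example of what "weighting these" might mean, even a dumb linear mix of votes and views gives you a knob to turn. The weights below are made up, and real systems like Reddit also decay scores by age, which is left out here.

    def popularity(votes, views, vote_weight=5.0, view_weight=1.0):
        # A vote is a much stronger signal than a passive view.
        return votes * vote_weight + views * view_weight

    stories = [("cute paste-eating story", 12, 300), ("boring memo", 2, 900)]
    stories.sort(key=lambda s: popularity(s[1], s[2]), reverse=True)
    print(stories)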

Third, given the relatively higher traffic we're seeing on video content (duh, Youtube generation), adding a second row of video thumbs to the front page makes sense. I also rolled the idea in my head of adding a little randomness into this section, to get more mileage out of old videos.

Resulting conclusion: we don't care about sorting, we care about clicks (duh, again).

In other words, I shouldn't be looking for how to weight the sort order of videos and stories by popularity, which is the first obvious thing to do. What I need to ask is "which videos, placed in this section on this page, will have the highest chance of being clicked?" The first thought I had going down this road is the two obvious classes of users: new and existing. New users need to get caught, so show them something flashy. Show new users pillar content, a nice video introducing the site, and generally popular things. Existing users, most easily identified by having them log in, have already had the candy and now they want some potatoes. Show them new stuff, things being discussed, and things based on their preferences, if you've got that kind of thing set up.

Another consideration is the predictability of item selection. If I'm going to show eight videos on the front page, why should I pick only eight of them? Why not pick sixteen and alternate? Not back and forth, but moderately random selections each page load. Really good videos might always be there, and "bottom of the top" videos might show up just now and then. Frequent anonymous visitors, the ones thinking "I'm not sure I like this site enough to sign up yet," get exposed to a better range of videos and are hopefully more inclined to stick around and sign up.

In the opposite manner, can we figure out what to start excluding? After a user has seen the same story twenty times without clicking on it, maybe you stop showing it to them. That space could be used for something they might actually be interested in.
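Here is a sketch of what the "pick sixteen, show eight" rotation plus the exclusion rule could look like, assuming we track per-user impressions and clicks somewhere. The function and field names are invented for illustration, and the weights would come from whatever "really good" means for the site.

    import random

    def front_page_videos(candidates, impressions, clicks, slots=8, pool_size=16):
        """candidates: list of (video_id, weight); higher weight for pillar content."""
        # Drop items this user has seen many times without ever clicking.
        eligible = [(vid, w) for vid, w in candidates
                    if not (impressions.get(vid, 0) >= 20 and clicks.get(vid, 0) == 0)]
        # Keep the strongest pool, then make a moderately random pick each page
        # load, biased so the best videos are usually (but not always) present.
        pool = sorted(eligible, key=lambda item: item[1], reverse=True)[:pool_size]
        ids = [vid for vid, _ in pool]
        weights = [w for _, w in pool]
        chosen = []
        while ids and len(chosen) < slots:
            pick = random.choices(ids, weights=weights)[0]
            i = ids.index(pick)
            chosen.append(ids.pop(i))
            weights.pop(i)
        return chosen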

Of course, I know I'm not inventing everything here, but I wonder if anything is a fresh idea. Obviously plenty of sites are learning to keep popular things around. Is anyone hiding ignored items? I don't know if the things I'm talking about are just "things some people are doing" or if there is real math behind it, with hard terms and concepts I can study to do it right. Hopefully, I'll be able to write more about solid results soon.

Saturday, June 20, 2009

How To Own Your Mistakes

Today was a very troubling and frustrating day for both myself and one of my best clients. This is my declaration of ownership of my own failure to keep today from happening. The short story is that right after declaring the "make the site more stable" milestone complete and shipping out an invoice, the site spent its most unstable day ever being frantically put on stilts and duct-taped to the wall by me. For the long version, read on.

I had already spent roughly a week and a half working on an impromptu milestone in the project to increase the reliability and stability of the site, as well as being greenlit to apply hours to better build, test, and deployment processes. This is a good thing and it still stands as such. Now, the site wasn't fragile before, but a couple of incidents understandably gave concern about long-term quality. We had a few instances of corrupt MySQL logs, ran out of space on our EBS volume, and, embarrassingly, I had occasion to deploy code and find bugs, even a broken page, despite testing locally and trying to be careful. The choice to spend time specifically on a better foundation was a good one.

This isn't about that time I spent, but another post may be.

Thursday we flipped the switch to the new system, running all new instances on EC2, migrated to PostgreSQL, and with a whole new deployment process that includes spawning a "staging" instance that clones our production web server and lets us test new versions before rolling them out to the public. Everything looked good, I spent some time correcting a couple of hiccups, and at the end of the day, when things had been running and seemed stable and golden, I declared the milestone complete (and in this arrangement, that means invoicing for a payment, so it's not just an ego issue).

I woke up the next morning to find the site had been down for a few hours. It was unavailable about a dozen times throughout the rest of the day, and I clocked about 7.5 hours today getting everything in line. It has been running for longer than that now, without problem, and we seem to be in the clear.

Situations like this require us to look inward and ask what we could have done differently to avoid the escalation of a problem into a crisis, and I've spent much of today, while working on the issues and afterwards, trying to understand this. Much of what I can do now is speculation. While there are many things I could have or should have done, there are few of them that I can know for certain would have been "the" things to make a difference.

Priorities are one area where I am confident a different choice could have avoided what happened today. A service should not run without thorough watchdogs. Websites should be given realistic traffic tests. I can test my code and comment it well, but the upfront work needs to be in place to ensure that my new code is actually servicing requests.

Can you always make these claims?
  • Our site's resources are tested automatically, and broken pages and other issues are reported to us
  • We can test our production environment before it is actually production for new code
  • If something goes wrong, our server processes are restarted and we are informed, before the users know and even if they never know
I know, from now on, I will.
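The first claim is the easiest to start on. A minimal external check, run from cron on a different machine, can fetch a few key pages and complain loudly when one breaks. This is only a sketch: the URLs and the "alert" below are placeholders, and a real setup would also supervise the server processes and actually page someone.

    import urllib.request

    PAGES = [
        "http://example.com/",
        "http://example.com/videos/",
        "http://example.com/health/",
    ]

    def check(url, timeout=10):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200
        except Exception as exc:
            print("ALERT: %s failed: %s" % (url, exc))   # stand-in for email or a pager
            return False

    if __name__ == "__main__":
        broken = [url for url in PAGES if not check(url)]
        raise SystemExit(1 if broken else 0)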

Tuesday, May 12, 2009

Windows 7: How To Ignore Reports of Danger

I am running Windows 7 via VirtualBox, and I skipped Vista completely, so some of my comments might also apply to Vista and thus be outdated. Too bad.

You can probably expect a few other short pieces as I find something I like and something I don't.

So, we see Internet Explorer here trying to help you out and tell you the download seems safe. Of course, it also lets you report that the download is, in fact, unsafe! This will no doubt be fed back into their SmartScreen Filter service, and when enough users report something, future users will be warned on downloading whatever bit of malware it might be. What a great way to protect your users.

Now, the only obvious place to report the download is right here, in the download dialog box, which disappears as soon as the download completes, before you can open or run the file and actually discover anything threatening about it to warn others about.

My Windows 7 review will eventually be the composite of many small pieces. I'll build up a score card along the way, along with a table of links between the series.

The Good: 0
The Bad: 1

Monday, April 27, 2009

How To Win By Not Mattering

This is all about the strange and confusing state of win we see repeatedly today, where a brand or concept gains such control and mindshare that no one even recognizes it as a thing anymore. Few people think of Q-tips as a brand versus just being the name of a thing. PepsiCo executives probably grind their teeth thinking about moviegoers ordering "just a coke" when Pepsi products are prominently and solely for sale. Most Internet Explorer users go beyond not understanding what IE is; they don't even understand what a browser is!

Today I want to talk about something newer, more specific, and less certain. The direction is visible: Mercurial is being given steps (it is important to phrase it this way, as I'll explain) toward not mattering, and that is precisely why it will win.

CVS still matters, which is precisely why it has lost so utterly in the imaginary battle for geek mindshare. If you are using CVS, you have to remember that fact along the way, because it affects how you work and what you can do. Subversion still matters, but less so, as it stays largely out of your way.

None of the players in the DVCS arena matters very much, because none of them is very different from the others. Git and Mercurial and Darcs? They all behave similarly enough that none of them offers anything different, beyond community and how failures are handled. Now Google announces upcoming Mercurial support for Google Code, but the real thing that stood out to me is that they built their own implementation over Bigtable. They are not supporting Mercurial; they are supporting the Mercurial format.

It wouldn't be difficult to do the same thing and implement any one of them in any one of the others. I think by next year you'll see git and mercurial doing push/pull between one another.

Note: This was a crappy post and I try to stay away from posting just to post, but I'm getting back into the swing of things. Give me a break, yeah?

Next Post: How To Give Up to Succeed (Maybe public commitments will force me to write, lest I be publicly humiliated!)

Saturday, April 25, 2009

How To Install Google Gears on 64-bit Linux

This is just a quick tutorial for anyone else hitting the problem I had: installing Google Gears on 64-bit Linux. Google has not released official support for this, but is apparently working on it, and while there have been no official beta releases to try out, someone posted a 64-bit build in a forum thread. This gives you an XPI, the packaging format used by Firefox extensions, but it won't work right away. Firefox needs special instructions to make a link actually install something, and the forum post doesn't include them. There is no obvious way to tell Firefox, "Install this XPI at this URI," so what are we to do?

It turns out that it will initiate an installation if you select the XPI from the "Open File..." option in the File menu. So, download it locally and then open it in this manner, and you'll restart Firefox with a working Google Gears extension. Enjoy.

I've got to get back to work now. So little time to post these days!

Thursday, January 01, 2009

How To Prove Code Review is Important

The infamous Zune 30GB failures were traced to a leap-year issue, and apparently they use some code we can see in the Freescale codebase. Take a look at the following sample of code, which determines the year from the day number (counting from January 1, 1980). I don't know about you, but the infinite loop is immediately obvious. In a leap year, the main loop keeps spinning when days is 366: the subtraction and increment are never reached, because days > 366 fails.

Am I naive to think that even a casual code review would have caught this in a moment?

year = ORIGINYEAR; /* = 1980 */
while (days > 365)
{
    if (IsLeapYear(year))
    {
        if (days > 366)
        {
            days -= 366;
            year += 1;
        }
        /* No else branch: when days == 366 in a leap year, nothing
           changes and the while loop never terminates. */
    }
    else
    {
        days -= 365;
        year += 1;
    }
}
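For reference, here is the same logic restated in Python with the missing branch filled in. The else/break is one commonly suggested fix, stopping the loop once fewer than a full year of days remains; it is not necessarily the patch Microsoft shipped. The day count starts at January 1, 1980, per the original comment.

    ORIGIN_YEAR = 1980

    def is_leap_year(year):
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def year_from_days(days):
        """Return (year, day_of_year) for a day count starting at January 1, 1980."""
        year = ORIGIN_YEAR
        while days > 365:
            if is_leap_year(year):
                if days > 366:
                    days -= 366
                    year += 1
                else:
                    break   # day 366 of a leap year: the branch the original code is missing
            else:
                days -= 365
                year += 1
        return year, days

    # December 31, 2008 is day 366 of a leap year; the original loop never returns here.
    days = sum(366 if is_leap_year(y) else 365 for y in range(1980, 2008)) + 366
    print(year_from_days(days))   # (2008, 366)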


UPDATE: Fixed formatting issues. It looked fine when I posted it, honest!
I write here about programming, how to program better, things I think are neat and are related to programming. I might write other things at my personal website.

I am happily employed by the excellent Caktus Group, located in beautiful and friendly Carrboro, NC, where I work with Python, Django, and Javascript.
