Wednesday, July 22, 2009

How To Fail At Upload APIs

YouTube, what the hell is up with your upload APIs? Here we are, hacking along and being all "Hey, we got accounts syncing and thumbnails popping up and videos getting attached to blog posts and it's nifty-pie, oh yeah." when we make the move to be even more web 2.0y and add full authentication with YouTube accounts and integrate the full upload cycle into the media selection. We run some little video upload tests and we're happy about that. Oh, that was a nice milestone to hit in FogBugz, I tell you what.

The week passes and suddenly I'm asked why all the video uploads keep failing. All the usual things are checked, but the upload tokens validate and the headers are correct. All our tests are within the upload limits, too. These aren't friendly "Hey, I really hate to tell you this, but the video you just uploaded didn't go so well. Can you try it again in a bit? Thanks"-errors, either. These are "Fuck off. I just reset your HTTP connection, bitch"-errors. Other times, we get 502 Bad Gateway from YouTube's servers and that isn't the kind of nice error you expect from a professional service like that. With the errors happening in the client's browser, I'm left with no way to respond nicely. Our own machines never get a single byte from YouTube on the matter, much less the nicely formatted response-requests they promise to use to tell us all about the success or failure of the uploads we set our users up with.

What gives with the weird errors, and what gives with them not being nicer about it?

Through support forums our problem is matched up not with some obscure thing, but with many, many posts. It seems like everyone and their dog gets reset connections and 502 errors during uploads. At this point, I'm seriously questioning whether YouTube was ready to release this part of the API when they did, because it is obviously not mature. Now, I happen to know on good authority that some people do upload videos to YouTube, from time to time. You know, with that little uploader they wrote called The YouTube Site Itself. So, the theories I have are that they either aren't eating their own dogfood or their own uploader is doing smart-ish upload resuming when their internal upload API chokes on them constantly. In the end, I have to ask, "What gives?"
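For what it's worth, the kind of "smart-ish" retrying I'm guessing at could look something like the sketch below. This is not YouTube's uploader and not anything from the gdata library. The names retry_upload, TransientUploadError, and the send callable are all made up for illustration, and the backoff numbers are arbitrary.

```python
import time


class TransientUploadError(Exception):
    """Hypothetical stand-in for a reset connection or a 502 response."""


def retry_upload(send, attempts=3, delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    send is any zero-argument callable that performs one upload try and
    raises TransientUploadError when the failure looks temporary.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except TransientUploadError:
            if attempt == attempts:
                raise  # out of tries, let the caller see the failure
            sleep(delay * 2 ** (attempt - 1))  # wait longer each time


if __name__ == "__main__":
    outcomes = iter([TransientUploadError("reset"),
                     TransientUploadError("502"),
                     "ok"])

    def flaky_send():
        # Fails twice, then succeeds, to imitate a flaky upload endpoint.
        result = next(outcomes)
        if isinstance(result, Exception):
            raise result
        return result

    print(retry_upload(flaky_send, sleep=lambda seconds: None))
```

Something like this only helps when our own server is the one talking to the upload endpoint. When the failure happens in the user's browser, we never get the chance.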

Apparently, according to support responses and our own tests, if you upload to uploads2.gdata.youtube.com instead of uploads.gdata.youtube.com, it works fine. This is their "new upload system", but what does that mean? The API is completely identical, so does "new upload system" mean a new specific set of boxes that handle uploads? Specific boxes don't seem very cloud-like for Google, don't you agree? Well, i

Tuesday, July 21, 2009

Review: FogBugz 7.0

While completely ignoring that I was promised access to the FogBugz On Demand 7.0 beta program and then somehow forgotten, I'm going to come out and say it, one day into the official release: I'm more excited than ever to be a Fog Creek customer. Yes, I am still a card-carrying free software nut-bag. I'm absolutely sure certain individuals will get irritated with me, again, for not using Trac like a good geek.

Call me a fanboy, but boy-oh-boy is this a sweet release. It is a shining example of knocking a release out of the park and impressing everyone (who could be impressed at all, and thus disregarding those who will never be impressed by a commercial, for-profit bug tracker, ever, no matter what, not in a million years).

The experience is absolutely slick. Faster, brighter, shinier. Packing new features, improved features, and bug fixing in a new package is a great way to make the functional improvements stand out. Even if we have a good design, any product should take a note from this book and spruce up the design just to highlight that change is in the air.

I'm actually struggling to think of something that has been added that I didn't want or that I wanted which was not added. I'm sure there are people on both sides, but I'm still thrilled to apparently be an exact match for their target unsatisfied customer to satisfy. Even though I have really liked FogBugz for some time, I have also struggled with it to represent my work flow. I've worked out different sets of behaviors with different clients.

I keep a milestone that never gets a due date and only exists to hold cases that are approved to be done "some time" and we get to a few of them between each actual milestone. Today, I can drop that and prioritize cases in the backlog directly. I also added custom case statuses to "propose" and "reject", so I can track what I think we should do and what has been approved.

Was it a bug or a feature when I needed to clean up the content form? Next time something like that comes along, I'll enter the case as an "Improvement", the new category I added for such in-betweens. I'll probably tag it for organization, too. Maybe, I'll add a custom tag to track the branch I'm working in. I'm really looking forward to getting more and more mileage out of this release. I really have to commend everyone that worked so hard to bring this iteration to the public. Thank you, so much!

Monday, July 20, 2009

How To Recurse Your Foundation

Or, the working title: How To Look Down At The Tower of Turtles

We're a recursive bunch. We're more repetitive than . There is no shortage of writing that it's caches all the way down or that we're repeating the mainframe/dumb terminal era. I have an argument that our entire profession hinges on repeating ourselves.

Repetition is in the DNA of what we do. Software is the ultimate commodity, approaching zero-cost production. Solve one problem and the solution is applied to a thousand problems. Generalize and solve a million. Everything we do is repeated and is about repeating things. At the core, we're just moving little bits around and we repeat that action over and over, with very slight alterations. We abstract the repetition, and then we repeat the abstraction so much we need to abstract that.

We could continue to make individual observations, like the mammoth stack of caches every bit goes through or the abstractions we build up over and over on top of our languages and toolsets. Look at the core of this and you find an axiom in everything we do. We're all about doing whatever we do a lot as efficiently as possible. When we realized a block of code might need to be used in different places, we created functions and subroutines. When we needed to fetch and refetch the same data from memory, we built caches inside our CPUs. Libraries helped us reuse code and version control systems helped us apply one developer's changes to the whole team's workstations. Google needed to do roughly the same thing on thousands of machines and abstracted the whole thing with not just MapReduce, but some of the smartest, most effective sysadmin work we've ever seen.

We should accept and appreciate the overall pattern that has been driving hundreds of individual observations. The difficult part is to benefit from the knowledge. How do we make what we do better by understanding such a core axiom that drives everything we do?

Why You Should Stop Complaining

Things change and the work you do is always going to change. In some businesses, this is slow. In ours, it is very quick. I've seen people complaining about high-level languages. There are some who are quick to ignore the claims of benefits from such things as Cloud Computing and other new things they believe to be worthless. We are in danger of being stubborn. You cannot become entrenched in tradition or "the way we've always done it" in an industry that moves this fast. While traditional databases and static typing have served us well for decades, that takes nothing away from the value of other concepts (both new and simply revisited).

The relational database, as traditionally envisioned, often hits a very predictable and known wall: the bounds of any single machine. Yes, there is master-master replication. Yes, there are clustering techniques that can take advantage of additional hardware in particular ways. Yes, you can shard the data across multiple database machines. The growth of the database from a single machine to many is indicative of the greater pattern we see over and over again: the need to do something over and over again commoditizes the individual acts and components.

Stretching your database over double-digits and triple-digits and more of hardware, and maintaining a high growth rate over that cluster, and eventually over super clusters, does something interesting to your view of the individual databases: they barely matter. When everything is managed by a single PostgreSQL, Oracle, or MySQL running machine, there is a tendency to do a decent amount of specific tuning. What kind of indexing do you need to build on which fields? What is the most efficient column layout for this table? These are questions that matter. Now, when you need to store several dozens of terabytes across hundreds of machines, these are details you'll think about as often as Java developers think about CPU registers.

There is no shortage of developers who will soundly tell you just how buzzword loving and stupid everyone who enjoys Cloud Computing is. Databases are always going to be important, they tell us. Most of us don't need to scale like Amazon or Google or eBay, they tell us. They are correct, but they miss the point.

There are two reasons commodity scale computing benefits developers and a group of developers for each reason.

Why That Guy In His Basement Cares About This

No, the hobbyist making little web apps doesn't need to scale to huge loads, high traffic, or enormous datasets. However, those who do drive every aspect of dealing with all the details involved into commodity status. This is not special to our industry. There are independent car companies, thousands of t-shirt companies, and restaurant opening costs driven down so far that they're barely profitable. Isn't business grand?

Why That Guy In The Corner Office Cares About This

Imagine the growing company in the late 90s building up their website and investing in a dozen or so heavy machines to run nice Oracle databases, which are obviously good choices because they're expensive and therefore good. The DBA team makes careful estimates of the needs their machines will face and plans the roles of each box carefully. They map out the schema, build the databases, establish their procedures and policies. Everything has its place.

Then one of the machines dies, thanks to a rare but statistically inevitable hardware failure. There is no saving it. The data was backed up, and easily retrievable, but downtime is still inevitable.

Contrast this to the cloud mentality's most important aspect: individuals don't matter. Individual machines don't matter, because functionality and data are spread out and replicated. Individual processes don't matter, because state is persisted and broken up into many services and workers, who can drop and spawn at the drop of a hat.

Sunday, July 19, 2009

How To Teach Software Development: Why Good Developers Should Care - Part Two

How To Teach Software Development
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

What's the Point?

Some opinions, while held, are held softly. I believe the thinking goes that the opinion changes nothing and you aren't doing anything about it, so giving a damn is pointless. You may call it apathy, but I call it misunderstanding the nature of information. Information spreads from those who have it to those who do not, and those in agreement grease the wheels of that distribution.

Of course, there are good developers who don't care if there are bad developers. I'm not convinced they're still reading, at this point. If you are one of them and still here, then the reasons we can make a difference should help convince you to care about that difference in the first place.

The more widely held the beliefs that we can and should do something to improve the quality of this industry, the more likely anything will happen. You might not be lecturing in the classroom, but I'm sure you've pointed something out to a colleague, new or old, so remember that education never ends and we're talking about life-long improvement, not what people start with.

Your opinion spreads like a bad rumor... but, good! While you could point out to that guy in the next cubicle that the subprocess module is a cleaner solution than the popen*() functions, many may think it doesn't matter for the code that already exists, so don't bother him about it. We might ignore that cleaning up code makes it easier to come back to for fixes and improvements down the road. We can't ignore that pushing him to do the right thing today makes it more likely he'll think about it twice tomorrow. It also makes it more likely he'll return that push to you when you slip or to the next person he notices with room to improve (we all do). We have a collective momentum and together we decide if it goes up or down.
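To make the cubicle example concrete, here is a rough sketch of the subprocess way of running a command and reading its output. It uses subprocess.run, which arrived in later Python versions than the ones around when this was written, and the child command is just the current interpreter printing a line, picked only so the example runs anywhere. The helper name run_and_capture is mine.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command given as a list of arguments and return its stdout.

    Passing a list avoids going through a shell, so nothing in the
    arguments gets reinterpreted, and check=True raises an exception
    on a non-zero exit instead of failing silently.
    """
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str instead of bytes
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    print(run_and_capture([sys.executable, "-c", "print('hello from a child')"]))
```

One call, one place to look for errors, and no guessing which of the popen variants hands back which pipes.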

The issues at hand are more than the initial state of entry levels. We have an investment in our fellow coders, graphic designers, UI experts, testers, and managers, like it or not. Within your own teams the need for quality is obvious, and your job gets better when you have better code to work with, more understanding managers supporting you, and an environment that supports more than just good enough.

Our group motivation carries outside of our own bubbles, as bubbles are mostly illusion. Think of every third party tool or library you've had a problem with and remember Kevin Bacon. Think of every new member of any team you've had a problem with, too. All of these frustrations come from people and those are people you have influence over, because we are all connected. Maybe you don't believe there's anything you, personally, can do to improve our sad state of affairs. Remember that we all have an effect on everyone else, even in the most indiscernible, indirect manner. It is very easy to downplay and completely ignore those many but tiny influences we all make, and we do it in nearly every context of our lives, but I want you to know that you do make a difference. You make a difference because we all make those small differences together, and when they are in alignment they are more powerful than even the most public figure with a metaphorical bullhorn.

Even if you will only grease the wheels of change, if you care about the change at all do not let those wheels squeal!

Saturday, July 18, 2009

How To Teach Software Development: Why Good Developers Should Care - Part One

How To Teach Software Development
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

Doctors and lawyers in the United States have the American Medical Association and the American Bar Association, respectively, and surely have analog organizations in other countries. As representatives of their professions, they work collectively with their colleagues at a reasonable and useful goal: quality control. It is in the interest of doctors that their med students are not idiots. It is in the interest of lawyers that their opponents are not (more) unethical. There are other, perhaps less admirable uses of these controls (too many doctors would lower all their salaries).

There is no such quality control in our industry. We have individuals at the lowest end of the ability spectrum and at the highest peaks of skill, and we have the teachings of many years of expensive higher education on some of our shoulders and only the passion of self-teaching for others. Neither dimension seems an accurate predictor of the other, and this should raise an automatic red flag. What is our education doing for us, if it doesn't let anyone trust your ability without first hand experience, evidence, and other proof you could give them as easily without that education?

I'll own up to my fair share of complaining about the problem in exchange for anyone agreeing to do something about it with me. This isn't just an annoying situation that crops up in forums from time to time or that explains why a silly "newbie" asked a silly question. No, this is something costing the economy billions (trillions?), making all of our jobs more difficult, and actually killing people in the rarest situations. We aren't just being elitist snobs when we complain about someone taking a route seemingly inadequate compared to what we think we can do. Further, it shouldn't be considered bad to tell them so. If any of us have any difficulty being told we could improve, we don't have the dignity in our job that it deserves.

Of course, this might seem pointless. Do you think I'm just venting? If so, you may either believe there isn't anything we can do to improve the situation or that there isn't any point in improving it. Either way, my aim is to convince you.

Next: How To Teach Software Development: Developer Motivation

Friday, July 17, 2009

How To Click It Like You Mean It

Yes, this is a screenshot of a screenshot. Stick with me, because I really do have a point to this! I have to admit, publicly, that I clicked the button. The one in the screenshot. The one that isn't a button, just a PNG image. I should be glad it wasn't a pop-up!


I realized my mistake at the moment I was clicking on it, but it happened too fast to stop. I had to sit and think for a moment. Why did I do that? It drove me to write this pretty immediately and do a couple mock ups for solutions. I never want to let my users lose information or control over it. That is, we don't want them to OK a message away and neglect to actually read it and we don't want them to click "send" before they're really, really ready to confess their never ending love to Glenn Beck.

Those are two distinct safety nets: information the user misses by being click-happy, and actual actions within the application they might have wanted to avoid. Any reversible action, like closing a dialog box or deleting something (if a copy is kept around for safety), should be given an easy undo option. Even closing an entire window, if made easy, should be something you can undo.

Of course, you can't undo sending an email or formatting a USB drive. You can undo an archive and compress operation that replaces the original files, by extracting them (even if the extraction is bound to an undo button), but if the undo is sufficiently expensive, give me the chance to avoid it in the first place, please. Make me pause and think about what I'm doing first.

Of course, a lot of us are doing web apps today, so that gives us some limitations. It also means, if you want to be friendly to your users, you probably shouldn't use default dialog boxes at all. Now, we might look at wrapping some. An alert_with_undo() JavaScript function, anyone?
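The real thing would live in the browser, but the idea behind it is small enough to sketch anywhere, so here it is in Python: every reversible action hands over the recipe for reversing itself at the moment it runs. UndoStack and its do and undo methods are names I made up for this sketch, not any existing library.

```python
class UndoStack:
    """Remember how to reverse each action so the user can back out."""

    def __init__(self):
        self._undo_steps = []

    def do(self, action, undo_action):
        """Run action now and keep undo_action for later."""
        result = action()
        self._undo_steps.append(undo_action)
        return result

    def undo(self):
        """Reverse the most recent action; return False if none is left."""
        if not self._undo_steps:
            return False
        self._undo_steps.pop()()
        return True


if __name__ == "__main__":
    visible_dialogs = ["Your session is about to expire"]
    stack = UndoStack()

    message = visible_dialogs[0]
    # The user clicks OK without reading: dismiss it, but stay reversible.
    stack.do(lambda: visible_dialogs.remove(message),
             lambda: visible_dialogs.append(message))
    print(visible_dialogs)
    stack.undo()
    print(visible_dialogs)
```

The dismissed message comes back on undo, which is exactly the safety net a click-happy user needs for the first of the two cases.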

Thursday, July 16, 2009

How To Care If BSD, MIT, or GPL Licenses Are Used

The two recent posts about some individuals' choice of GPL versus others' preference for BSD and MIT style licensing have caused a lot of debate and response. I've seen the whole thing as an interesting combination of very important topics being taken far too seriously and far too personally. All involved need to take a few steps back.

For the uninitiated and as a clarifier for the initiated, we're dealing with (basically) three categories of licensing when someone releases software (and/or its code):
  1. Closed Source. Easiest to explain, because you just get nothing.
  2. GPL. If you get the software, you get the source code, you get to change it, and anything you combine it with must be under the same terms.
  3. MIT and BSD. If you get the software, you might get the source code, you get to change it, and you have no obligations about anything else you combine it with.
The situation gets stickier when we look at those combinations and the transitions between them.

Use GPL code with Closed Source code

So long as you don't distribute your software, this is fine. It is a perfectly OK thing to do for software running on servers or only running in-house. However, if you want to distribute your software to end users, the terms of the GPL require that the GPL also applies to your own code, so you've got to give that code away, rather than keep it closed. Further, you have to let the users modify and redistribute it.

Returning modifications upstream?

Go ahead. As the owner of the closed source, if you decide to take portions that have modified the GPL code and return that to the project as a thank you, it is your right. You don't have to release your entire project's code to do this. Similarly, if you want to release other portions of your code for use, it is likely required to be GPL, itself.

Use MIT/BSD code with Closed Source code

This happens a lot, in the same kind of situation above, but also in distributed software, because that is OK. In some cases, a notice that you use the code is required, but you aren't required to put your own code under any particular rules or license.

Returning modifications upstream?

Just like GPL, this is fine. However, you have more freedom about releasing other components of your code under any license you see fit.

Use GPL code with MIT/BSD code

Oh, no! Now you have a problem, because the release of your own code under MIT and BSD style licensing is forbidden if you include or link it (the terms can be fuzzy with modern runtimes) with GPL code. You probably just can't use any GPL code if your own is MIT/BSD style.

Use MIT/BSD code with GPL code

Sure, go ahead. The GPL is fairly receptive. If you release an application under the GPL and it requires or includes MIT/BSD style licensed libraries, that is just fine.

Conclusions

If you're a closed source, server side or in-house project, you don't have much to worry about. You aren't distributing, so little of this matters to you. If you're a closed source, distributed product, then GPL is off limits for you. As the lead of an open source project, you still need to worry about GPL code. Either it can limit how people can use your code, by forcing it to become GPL, or you could face limited use by making the decision yourself. In short, while it's an acceptable license for its uses, it happens to be the most limiting under these factors.

If you release some GPL code, I probably can't use it. Period. End of story (ignoring these commentaries about the story). Now, maybe you don't care if I can't use it, but isn't that why you're releasing it? The GPL is meant to protect us, but who and what does it protect us from? I can't release it in a closed source product, and I don't want to, but you're also keeping honest, enthusiastic open source developers from using your project. You aren't limiting us for technical or legal reasons, but only for our choice of another license. A GPL licensee can say anything about everyone having the freedom to choose their license, and this is true, but you can't escape that your own choice specifically limits who else can interoperate, based entirely on whether they agree with you.

Wednesday, July 15, 2009

How To Use the Youtube Data API: Authentication

After a couple days trouble with the Youtube Data API and the provided Python wrappers around it, I thought it would be good to collect my findings on what works and doesn't and to fill in the gaps that I see in the docs. I really hope this series will be useful to some others in my position.

Some Doors Are Locked and Some Doors Are Ajar

A lot of the API's use requires no authentication, not even a developer key. This makes a lot of the most common, read-only integrations a snap. However, I think this makes it more difficult to adjust when the need to authenticate for other integrations comes along. This did some damage to my schedule, so I'm going to help others avoid the problem.

Public operations are simple. Youtube gives us resources in the form of feeds and images and other things at API locations, like http://i.ytimg.com/vi/FedVhnHYn-Y/0.jpg to get the first thumbnail of a video. Just plug in your video ID and go.
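As a tiny illustration of "plug in your video ID and go", the sketch below just fills in the URL pattern from the example above. The helper name thumbnail_url is mine, and the index argument assumes the other numbered thumbnails follow the same pattern as 0.jpg, which is not something I've verified here.

```python
THUMBNAIL_TEMPLATE = "http://i.ytimg.com/vi/%s/%d.jpg"


def thumbnail_url(video_id, index=0):
    """Build the URL of a video's thumbnail; no authentication involved."""
    return THUMBNAIL_TEMPLATE % (video_id, index)


if __name__ == "__main__":
    print(thumbnail_url("FedVhnHYn-Y"))
```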

When you get into the realm of authenticated requests, you've got to get a bit of foundation in place, to start. I don't recommend crafting any requests yourself, so grab the client library for gdata. You'll also want to register your site and get a developer key.

With that all set up, gdata.youtube.service.YouTubeService is going to be your friend. The service object begins unauthenticated. With the user's authorization, it can be upgraded for single-use and long-term authenticated use. There is a ClientLogin path, intended for desktop applications, where you actually ask for the user's username and password. We won't be covering that.

AuthSub is going to be used. With this method, the user is directed from our site to a crafted URL at YouTube, essentially telling them "Hey, I want this user to let me access their account. Is that alright with them?" The user has the job of deciding if you are trusted or not. When they do, Youtube generates a special token to send to a URL you provided. The token you've been given is good for one request, so make it a good one! The best use of that one-time token is usually going to be exchanging it for a session token that you can keep using forever, until the user revokes your rights to their account. These are the steps we're going to see next.

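    import urllib  # Python 2, which is what the gdata client library targets

    def authsub_url(self, request):
        # Path on our own site that Youtube sends the user back to.
        base = '/return/path/at/my/website/'
        # Carry the page the user started on along in our own "next" parameter.
        next_url = 'http://%s%s?next=%s' % (
            request.get_host(),
            base,
            urllib.quote(request.build_absolute_uri()))
        scope = 'http://gdata.youtube.com'
        secure = False   # plain, unsigned AuthSub requests
        session = True   # allow the token to be upgraded to a session token

        return yt_service.GenerateAuthSubURL(next_url, scope, secure, session)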

The function generates and returns a unique URL to direct our user to. It takes a request because we need the host, and because in my own usage I pass the current page's absolute URI along, so the user can be sent back to where they started after returning from Youtube. You can also assume here that yt_service is an instance of gdata.youtube.service.YouTubeService, of course. Of note is the session parameter, passed as True, which allows the token we receive to be upgraded to a session token. The user will get a different message from Youtube, depending on this parameter, so they know what you might be doing and how much access they're authorizing.
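If you're curious what kind of URL comes out of that call, the standalone sketch below builds one by hand with nothing but the standard library. The endpoint and the next, scope, secure, and session parameter names are my reading of the AuthSub documentation, not a copy of what the gdata library does internally, so treat them as assumptions and let the library do this for real. The function name authsub_request_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed AuthSub request endpoint, per my reading of the AuthSub docs.
AUTHSUB_REQUEST = "https://www.google.com/accounts/AuthSubRequest"


def authsub_request_url(next_url, scope, secure=False, session=True):
    """Build an authorization URL to send the user to.

    next_url is where the user comes back to, with a token added to the
    querystring. secure and session are sent as 0 or 1.
    """
    params = [
        ("next", next_url),
        ("scope", scope),
        ("secure", int(secure)),
        ("session", int(session)),
    ]
    return AUTHSUB_REQUEST + "?" + urlencode(params)


if __name__ == "__main__":
    url = authsub_request_url(
        "http://example.com/return/path/at/my/website/",
        "http://gdata.youtube.com")
    print(url)
    # Reading the querystring back shows the values survive the encoding.
    print(parse_qs(urlsplit(url).query))
```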

Your callback URL will be brought up by the user with a token parameter added to the querystring, and you'll be expected to keep track of that.

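    # token is the single-use value taken from the callback's querystring.
    yt_service.SetAuthSubToken(token)
    # Exchange it for a session token; the service now holds the new token.
    yt_service.UpgradeToSessionToken()
    session_token = yt_service.current_token.get_token_string()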

This part tripped me up for a bit. Because of the way the official docs are split among the official guide, the Python guide, and the actual definitive(ish) API reference, it wasn't clear that the single-use token and the session token were distinct tokens, rather than the original becoming a session token, which was my understanding at first. It would be a lot more clear, I think, if UpgradeToSessionToken() actually returned that new token. Of course, this isn't important if you're just using the yt_service right now. If you need to store that token for future use, however, then it happens to be really important information.

Later, if you saved this token, you can easily use it again:

    yt_service.SetAuthSubToken(session_token)


Summary

The ease of use is pretty nice. Generate the authorization URL and direct the user there, take the returned token and upgrade it for session use, and from then on, you can do lots of fun things with their account.

Tuesday, July 14, 2009

How To Overcompensate For Something

In the spirit of the old name of this blog, Ranting Techno Rave, this is a rant about a personal experience. This happened in the line of duty, so it is on topic. Has anyone else dealt with this kind of thing? Tell me about it.

This title is purposefully "provoking" and if you're the one I'm talking about, you know who you are. This might even apply to you if you're someone else with the same kind of behavior. Maybe you know or have to work with someone that exhibits the particular personality traits I've had to deal with. In whatever way this applies to you now or in the future, beware as much if you are this type of coder as if you have to deal with one of them.

The lone ranger was a terrible cowboy.

Assertive personalities are important. They point out mistakes, instead of allowing problems through inaction. There is an issue of tact, as a line one needs to watch as they walk the road of the assertive. Code review requires assertion as you tell someone, "You're doing it wrong."

Rather than try to artfully explain and avoid the background of this post, I'm going to just present you with A List of Rules When Joining a Team:
  • Don't insult the code you were hired to work on. Don't insult the coders you were hired to work with. This was actually legacy stuff I was trying to replace, myself, but "What kind of an idiot wrote this?" was a bad enough question when you only thought I wrote it. If I had, I would have removed you immediately (and I should have, anyway)
  • Before you write a single line of code, don't claim you can write all of it yourself.
  • When your new team's lead developer leaves you with a set of bugs before leaving on a pre-scheduled holiday, don't let him return to find the existing code base deleted and a bunch of new stub files checked into a new repository.
  • Respond to email.
  • Actually do your job before taking the money.
  • Last, but not least, please, please, please let me be in the position to yay or nay your application a second time.

Monday, July 13, 2009

How To Teach Software Development

How To Teach Software Development
  1. Introduction
  2. Developers
    Quality Control
    Motivation
    Execution
  3. Businesses
  4. Students
  5. Schools

Education is broken. Education about software development is even more broken. It is a sad observation of the industry from my eyes. I come to see good developers from what should be great educations as survivors, more than anything. Do they get a headstart from their education or do they overcome it?

This is the first part in a series on software education. I want to open a discussion here. Please comment if you have thoughts. Blog about it, yourself. Write about how you disagree with me. Write more if you don't. We have a troubled industry. We care enough to do something about it. We harp on the bad developers the way people used to point at freak shows, but we only hurt ourselves by not improving the situation. We have to deal with their bad code. We are the twenty percent and we can't talk to the eighty percent, by definition, so we need to improve the ratio that comes out of the factory, because we can't touch them once they are set loose on the world. Fix this problem at its source, with me, please.

For Students This Means...

You're paying for what you aren't getting. Either you really care about the world you're spending all this money to get indoctrinated into or you expect to be honestly prepared for a career you think is lucrative. Neither case is true.

For Schools This Means...

You aren't producing the impressive minds and individuals that make a school stand out.

For Businesses This Means...

You front the cost for the bad performance and overcoming of a lacking education, so consider this problem a fiscal one.

For Good Developers This Means...

You have to put up with these poor saps.

Sunday, July 12, 2009

How To Work a Sigmoid - Part Two

Software Development in Really Big Steps
  1. How To Work a Sigmoid
  2. How To Work a Sigmoid - Part Two

The last time I wrote about the curvature of project estimations, I was just speculating. Since then, I've discovered that FogBugz does track estimation over time, with a daily estimation record, and offers a graph of the 0, 50, and 100 percent estimates over time. I've been watching this develop for a small time, working more with tracked estimates, and I think some expansion on my thoughts is ready.

You can see my own estimation graph here and it demonstrates exactly what I predicted. I suspect a more complex plotting of points would emerge with the length of the project, and I have a few curiosities about how this would expand over time. The basic prediction holds: a generally unchanging estimation at the start, faster growth in the estimation through the middle, and a calming and final flattening on the system's best guesses at the end, as you slow down how many cases you file for every case that you close.

Steep hills in the estimation happen because for every case you close, you file some bugs, related features, and other cases that were brought to light or just gotten around to filing at that time. You can break down the states of case closure versus creation into three.

When you complete work in line with estimates, then things are On Track. This is misleading, but a good state at any rate. If you have ten hours worth of cases, spend 4 hours, and close about 4 hours worth of estimated cases, the target times on the project remain steady. If you keep this up until all the cases are closed and the project is finished, you can consider your estimations successful. Of course, it is more complicated.

As the design and plans are fleshed out, you'll find developers file more bugs than they close. The estimation is pushed further and further back. This isn't because the project gets more complicated or behind, although it could be so, but that the bulk of the estimation cases needed to represent the entire work of the project hasn't been filed yet. If we could design the entire thing up front, enter the cases, and never change them, we could keep a static estimation, if we remained On Track. We know that we can not and should not design everything up front, so we need to understand and work with changing estimations.
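The arithmetic behind "On Track" versus a growing estimate can be sketched in a few lines. This is only an illustration under my own assumptions: the function name, the tuple layout, and every number other than the ten-hour, four-hour example are made up here, and none of it reflects how FogBugz actually computes its estimates.

```python
def projected_totals(start_hours, days):
    """Return the projected total project hours after each day.

    `days` is a list of (hours_worked, hours_closed, hours_filed) tuples.
    These field names are hypothetical, chosen only for this sketch.
    The projection is simply hours already spent plus hours still open.
    """
    spent = 0
    remaining = start_hours
    totals = []
    for worked, closed, filed in days:
        spent += worked
        remaining += filed - closed
        totals.append(spent + remaining)
    return totals


# On Track: ten hours of cases, spend 4 hours, close 4 hours' worth,
# file nothing new. The projection stays where it started.
on_track = projected_totals(10, [(4, 4, 0)])

# Design still being fleshed out: work keeps pace with its estimates,
# but new cases are filed every day, so the projection keeps moving out.
growing = projected_totals(10, [(4, 4, 2), (4, 4, 6), (4, 4, 1)])

print(on_track)  # [10]
print(growing)   # [12, 18, 19]
```

In the second run the developer is never behind on any single case; the total moves only because cases that always belonged to the project are finally being written down.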

I'm going to make a second prediction about the estimation curve. I predict the curve presents itself in many steps. There are likely to be spurts of case filing and periods of working steadily on those that exist. The developers may have these steps in overlap. Taking some steps back, the steps will smooth into a larger, similar curve for the entire project. Each of these filing spurts will be the start, work, and wrapping up of some component inside the greater breadth of the project.



Saturday, July 11, 2009

How To Recognize "Software Development" Is Step One

We're all "making software," but what's that mean? There is no shortage of resources on writing code. Debates rage on about this library and that, emacs versus vi, or nix versus windows versus osx. How much of it matters? We're arguing what car dealership gives us the best deal, automatic versus manual transmissions, and shades of colors to promote the best feelings when you see that shiny new car. Great, you've got the nice car (we all do), now you've got to drive the damn thing and keep it maintained for its lifetime. Who is paying attention here?

We spend thousands of hours discussing how to write software and millions of dollars helping us do it, but most of us have no clue how to keep that code around and get it in the hands of users. I won't make this a post about "The Cloud", but I will say it's largely successful, because it solves a problem most developers either ignore or are never properly exposed to.

I won't blame PHP, but it fits the bill to describe what is either a symptom or a cause of the problem: dump it and forget it deployment, while useful, has made a generation of developers unaware of what may well be the majority of work in their chosen line of profession, if you look at it right. How many people deploy their site by copying some files via FTP, even today? A frighteningly larger number than you might think! How do you think those same individuals debug? Do they even know what the word means?

The problems here extend beyond the code slingers to the cash slingers as well. Have you ever tried to convince a client that the time spent building deployment, logging, and diagnostics facilities upfront really isn't just a way to bloat your invoices?

I want to take a time out here to admit I'm not really sure where I'm trying to go with this...

Let's have fun and be completely arbitrary in the comments: What percentage of the job do you expect to be writing code when you start and what is the reality?

Friday, July 10, 2009

How To Respond to Google Chrome OS

UPDATE: Fixed 'Response' to 'Respond' in title. Sorry about that.

We all have to do it, so I might as well take my turn.

First impression: no surprise here.

There are expectations in two forms here. We can expect certain things to come of this and we can expect certain things to disappoint us about this. There is a third, external expectation that techies will divide into a camp of people who think it's Rilly, Rilly Important and a camp who thinks you're all wasting your time. I mean, gosh, it's almost like this is exactly like any other topic we split down some arbitrary middle about. Get over it.

I Expect To Like:
  • Cheaper netbooks
  • Installing Chrome OS on old hardware
I Expect To Dislike:
  • Feeling like I have an OS that won't let me install anything but a browser
  • Not being able to install Android Apps
  • Not being able to run real Chrome on Android
  • Having no way to persist the state of a Javascript VM, so that I can close applications or save memory on long running ones and resume my work later
  • Still not being able to sync my bookmarks and open tabs and page states properly (or at all) so that applications that are just websites can easily move from my little netbook to my desktop
  • Not getting Android on netbooks, because Chrome OS gets pushed, instead
I Expect To Be Let Down About:
  • Getting Chrome OS on Tegra hardware with O3D
  • Google doing a funny video in Times Square asking What is an operating system?
  • Never having Google Notebook on a Google Netbook
My lack of pros in these lists that have anything to do with Chrome OS itself is not lost on me. I'm actually excited about it. I think it's a really good thing. The availability of this certainly quality project will do great things for our perception of the web, the price points of netbooks, and Christmas in a down economy. The thing is, Chrome OS, at least initially, will be great for what it is not, rather than what it is.

Wednesday, July 08, 2009

How To Like What You See on the Frontpage

Some suggestions to improve a content voting system sparked some thoughts about the idea and I wanted to write them down to record my thoughts. The initial move was to remove down voting. No one uses it and negatives are, well, negative. So we'll drop "vote down" and replace "vote up" with "like", because what is more friendly than liking something? You know, it's like you're in first grade and the article is that cute girl eating paste.

At the same time we were discussing sorting. Everything is chronological, but people might want to see popular things. Is it popular because people vote up on it or because lots of people read it? Of course, lots of places weight these today (like Reddit), so that was discussed.

Third, given the relatively higher traffic we're seeing on video content (duh, Youtube generation), adding a second row of video thumbs to the front page makes sense. I also rolled the idea in my head of adding a little randomness into this section, to get more mileage out of old videos.

Resulting conclusion: we don't care about sorting, we care about clicks (duh, again).

In other words, I shouldn't be looking for how to weight the sort order of videos and stories by popularity, which is the first obvious thing to do. What I need to ask is "which videos, placed in this section on this page, will have the highest chance of being clicked?" The first thought I had going down this road is the two obvious classes of users: new and existing. New users need to get caught, so show them something flashy. Show new users pillar content, a nice video introducing the site, and generally popular things. Existing users, most easily identified by having them log in, have already had the candy and now they want some potatoes. Show them new stuff, things being discussed, and things based on their preferences, if you've got that kind of thing set up.
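A minimal sketch of the new-versus-existing split, assuming each item is a plain dict with hypothetical `title`, `clicks`, and `posted` keys (none of these names come from a real schema): anonymous visitors get the generally popular things, logged-in users get the newest.

```python
def front_page_candidates(items, logged_in, limit=8):
    """Order items for the front page based on who is looking.

    Logged-in users see the newest items first; everyone else sees the
    most-clicked items first. `posted` is assumed to be an ISO date
    string, so plain string comparison sorts it chronologically.
    """
    if logged_in:
        sort_key = lambda item: item["posted"]
    else:
        sort_key = lambda item: item["clicks"]
    return sorted(items, key=sort_key, reverse=True)[:limit]


videos = [
    {"title": "Site intro", "clicks": 900, "posted": "2009-01-05"},
    {"title": "Old favorite", "clicks": 400, "posted": "2009-03-12"},
    {"title": "Fresh upload", "clicks": 15, "posted": "2009-07-07"},
]

for_new_visitor = [v["title"] for v in front_page_candidates(videos, logged_in=False)]
for_member = [v["title"] for v in front_page_candidates(videos, logged_in=True)]
```

Preferences and "things being discussed" would need more data than this sketch pretends to have, so they are left out.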

Another consideration is the predictability of item selection. If I'm going to show eight videos on the front page, why should I pick only eight of them? Why don't I pick sixteen and alternate? Not back and forth, but moderately random selections each page load. Really good videos might always be there, and "bottom of the top" videos might show up just now and then. Frequently anonymous users, who think "I'm not sure I like this site enough to sign up yet," get exposed to a better range of videos and are hopefully more inclined to stick around and sign up.
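One way to get "moderately random" is a weighted draw without replacement. The sketch below is only that, a sketch: the pool, the scores, and the function name are invented for illustration, and the weighting scheme is my own assumption rather than anything measured.

```python
import random


def pick_front_page(pool, k=8, rng=random):
    """Pick up to k distinct video ids from pool, favoring higher scores.

    `pool` maps a video id to a positive score (a hypothetical popularity
    number). Each draw is weighted by score, and a chosen video is removed
    so it cannot be picked twice.
    """
    candidates = dict(pool)
    chosen = []
    while candidates and len(chosen) < k:
        ids = list(candidates)
        weights = [candidates[video_id] for video_id in ids]
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del candidates[pick]
    return chosen


# Sixteen candidates, scored 16 down to 1; eight are shown per page load.
pool = {"video-%d" % n: 16 - n for n in range(16)}
picks = pick_front_page(pool, k=8, rng=random.Random(42))
```

Passing a seeded `random.Random` makes a single draw repeatable for testing; in a real page load you would leave `rng` alone so each visit gets a different mix. High scorers show up on most loads, and the bottom of the top only now and then.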

In the opposite manner, can we figure out what to start excluding? After seeing the same story twenty times and not clicking on it, maybe you stop showing it to them. That space could be used for something they might be interested in.
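Hiding ignored items could be as simple as counting impressions per user and item. A minimal in-memory sketch, assuming a made-up `ImpressionLog` class and taking the threshold of twenty straight from the paragraph above; a real site would keep these counts in a database or cache rather than in a Python object.

```python
from collections import Counter

IGNORE_THRESHOLD = 20  # "twenty times" from the text; an arbitrary cutoff


class ImpressionLog:
    """Track how often each user was shown an item and whether they clicked."""

    def __init__(self):
        self.shown = Counter()  # (user, item) -> number of times shown
        self.clicked = set()    # (user, item) pairs clicked at least once

    def record_shown(self, user, item):
        self.shown[(user, item)] += 1

    def record_click(self, user, item):
        self.clicked.add((user, item))

    def is_ignored(self, user, item):
        key = (user, item)
        return key not in self.clicked and self.shown[key] >= IGNORE_THRESHOLD

    def visible(self, user, items):
        """Filter out the items this user keeps seeing and never clicks."""
        return [item for item in items if not self.is_ignored(user, item)]


log = ImpressionLog()
for _ in range(20):
    log.record_shown("alice", "story-a")
    log.record_shown("alice", "story-b")
log.record_click("alice", "story-b")

remaining = log.visible("alice", ["story-a", "story-b", "story-c"])
print(remaining)  # ['story-b', 'story-c']
```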

Of course, I know I'm not inventing everything here, but I wonder if anything is a fresh idea. Obviously plenty of sites are learning to keep popular things around. Is anyone hiding ignored items? I don't know if the things I'm talking about are just "things some people are doing" or if there are real maths behind it and hard terms and concepts I can study to do it right. Hopefully, I'll be able to write more about solid results soon.
I write here about programming, how to program better, things I think are neat and are related to programming. I might write other things at my personal website.

I am happily employed by the excellent Caktus Group, located in beautiful and friendly Carrboro, NC, where I work with Python, Django, and Javascript.
