Tuesday, December 11, 2007

How To Walk Backwards to HTML 5

The more peeks I get at the HTML 5 spec, the more I dread it. We have barely shaken the last strongholds of crap-HTML since gaining some sanity some years ago. We put content in pages and we control style and layout in CSS, supposedly. Now, we see upcoming tags like article and nav and section, and all of it harks back to the days that were so dark in the web. I don't understand it.

If anything, we should take the suggestions of Douglas Crockford to heart. I want semantics in my content, not layout or anything related to it. I want themes and templates understood by the standard, not developed by a thousand projects in parallel, squandering resources.

Any complaints I make about the upcoming HTML spec are completely trivialized by the fact that there is an upcoming HTML spec. Do you know how long it has been since any major shift in web formats? We're talking pre-Mozilla days here. I can't imagine the migration required with an internet the size we have today. The web makes a great platform, in my eyes, but upgrading the platform itself is working with the world's most ineffectively administrable network. Deploying can take years, even close to a decade.

We need to hold ourselves steady on the standards we can't even agree on today and just stop jumping on them. Break the foundations, crumble them down. Browsers are great at just working with what they're given. There is no such thing as an error, only being less affirmative. If we can take advantage of that, through arbitrary tags and attributes, we can really build something out of less.

Sunday, December 09, 2007

How To Not Open Your API Enough

So, I didn't see any opening for contributing in any useful way to the discussion about Google's new Chart API, until I read this post. How dare they call this service open. They should have been clear that their greed has led them to secretly hide abilities of the service from the public, in an obvious attempt to corner the market on really cool graphs in web sites.

My theory is that they hope everyone uses their "Open" Chart API, which doesn't include the full service's abilities, so that their own charts, using the entire breadth of charting power, are inherently better than yours. Beware the wickedness of the corporate greed, my friend.

My sarcasm drips onto the floor. Now, I mean no disrespect to Marty, but this kind of post really does get under my skin from time to time. Maybe it just struck me at the wrong time. So what if Google has features they didn't document? So what if they use a different URL to access the API? Maybe the undocumented features are still in flux. Maybe they like to see how many people outside Google are using the charts. There are plenty of good reasons for everything he describes them doing. He claims it all has some anti-open nature, but I just don't see any of it.

For Google's Motivations This Means...

None of it really matters, in the end. Use the API or don't, but I don't see a gain for them in the parts of the API they are letting us use, nor do I care if they do gain. Gaining from something doesn't negate your ability to do it for the reasons outside your gain. My job involves writing software for a company that helps low-income families find affordable housing. I get paid for my job, so does that mean I can't lay claim to any good nature behind it?

Friday, December 07, 2007

How To Break Twitter


So I hadn't realized it, but my twitters from the last day or so have gone unposted. Twitter pod is great, but it's completely silent about authentication errors, apparently. The root of the problem is that I have two twitter accounts: ironfroggy and pythoncoders, which I intended as an aggregator of python twitter accounts. At the very least, a place to find other pythoners on twitter. Now, I haven't been keeping up on pythoncoders, so I decided to log in and update the followings, only to realize I forgot the password. Herein lies the cause of my troubles.

The twitter "reset password" form does not take a username. Oh, no, it takes an e-mail address, and I have two accounts with the same one. I crossed my fingers, went through the process, and seemed to be brought back to my ironfroggy account, so I decided to figure it out later.

Somehow, this process leaves the account with an unknown password, probably random or even null, or for some reason the fact that they accept multiple accounts with the same e-mail is not taken into account in code paths that should care about it.

I'm waiting for a solution, Twitter.

Wednesday, December 05, 2007

How To Join The Python/XKCD Rejoicing

I am saying nothing new here, but I do this not for reddit votes, diggs, comments, or hits, but for love. This is about a great online comic that completes my geek morning three days a week and the thing that lets me make all the big bucks: XKCD and Python, together at last.

Congratulations!

Tuesday, December 04, 2007

How To Insensitively File People Into Two Types

The development sub-blogosphere is abuzz with responses and responses to responses on the debate over splitting up developers into two camps. The core idea is that 20% of us care about software development and 80% of us just do our job and go home. We all like to think we're in the 20% and we probably are, because the 80% doesn't care enough to recognize the distinction. They might recognize those few geeks at the office who don't seem to have a life, of course. Is the debate centering around who is in what group and what it means, or around the fact that we are grouping so bluntly in the first place? Well, we are a binary loving people, after all.

For the 20% This Means...

We love to self comment. We're a relatively small slice of the populace spending an unusual amount of time talking about ourselves and this whole deal just exposes that. Who reads about the split between the developers that care and the developers that just pay the bills? Not the bill payers. Even if this all centers around what Mort can accomplish and what motivates him, he'll never read a word of it. Not much about this affects the 20% directly, so it leaves you wondering in this debate: what the hell is the point?

For the 80% This Means...

It means nothing to those who don't pay attention to what we're saying, and that is a defining characteristic of the people this whole thing centers around. We might be an inwardly reflecting collective of people, but this is the beginnings of the most important realization of our industry: that we are not our industry. Democratically speaking, we just want to do the job and go home. If we want all the improvements to process and quality we strive for, we need to make it for those of us, most of us, who don't give a damn about those very things.

I have friends doing this job because they "know about computers" so it seemed like a good fit or simply heard that "programmers make a lot of money." They do not read blogs and they refuse to stop using Notepad. I'm finally seeing a small interest in running Linux, but that's only because it might be getting to be an easier alternative, not a more powerful one. I can't get them to read books, talk about software, or understand why version control is better than tossing the code into a zip file every now and then, if they even do that.

What this means for the Morts is only what we make it for them. None of the things we're saying matter in their world, so the only differences are going to be what we actively decide to do about it. If we really care about the end result of quality in the work we produce, then we're going to need to stop talking and start walking. Put policies in place, work your way into management, and plaster the bathrooms with Google Testing material. Witness in the name of giving a crap about the software you write!

In the End it all Means...

In the end it should teach us to be less self reflective. We can't keep debating inward. We need outreach programs. Inner city schools are taught the dangers of guns and drugs, but the code pushers of the world need us to push on them all the things we don't realize aren't known.

Worse Than Failure has all of its material because the 20 and the 80 never talk.

Sunday, December 02, 2007

How To Recover Lost Git Branches

Daniel, at work, ran into a problem of accidentally removing a branch he had just created and made a commit to, thus losing the day's work. This was actually the fault of our internal scripts to manage the branching and merging policy we've set up. By "internal" I mean that I wrote them and it was my fault his whole day of work was gone, so that left it up to me to figure out how to repair the situation and salvage the current commit back from the ether. I thought it might be good to document, in case anyone else needs to do this.

This works when branch A exists and branch B, which was created from A, has been removed after a single commit on it. This means we know commit A and we need to find an unnamed commit, the one that was B, to recover it.

I can demonstrate the recovery process with a simple transcript.

ironfroggy:/tmp ironfroggy$ mkdir A
ironfroggy:/tmp ironfroggy$ cd A
ironfroggy:/tmp/A ironfroggy$ git init
Initialized empty Git repository in .git/
ironfroggy:/tmp/A ironfroggy$ cat >> test
a b c
ironfroggy:/tmp/A ironfroggy$ git add test
ironfroggy:/tmp/A ironfroggy$ echo "commit 1" | git commit -m test
Created initial commit 5b2401e: test
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 test
ironfroggy:/tmp/A ironfroggy$ cat >> test
1 2 3
ironfroggy:/tmp/A ironfroggy$ git add test
ironfroggy:/tmp/A ironfroggy$ echo "commit 2" | git commit -m test
Created commit 08217dc: test
1 files changed, 1 insertions(+), 0 deletions(-)
ironfroggy:/tmp/A ironfroggy$ git reset --hard HEAD^
HEAD is now at 5b2401e... test
ironfroggy:/tmp/A ironfroggy$ git reflog show master
5b2401e... master@{0}: reset --hard HEAD^
08217dc... master@{1}: commit: test
ironfroggy:/tmp/A ironfroggy$ git log master
commit 5b2401e38d400c3039a53a036f2d98f75d544056
Author: Calvin Spealman
Date: Sun Dec 2 13:27:25 2007 -0500

test
ironfroggy:/tmp/A ironfroggy$ git log 08217dc
commit 08217dc6ef8f8117d6c9e4bca6fe7f18f78559b6
Author: Calvin Spealman
Date: Sun Dec 2 13:28:01 2007 -0500

test

commit 5b2401e38d400c3039a53a036f2d98f75d544056
Author: Calvin Spealman
Date: Sun Dec 2 13:27:25 2007 -0500

test
ironfroggy:/tmp/A ironfroggy$ git merge 08217dc6ef8f8117d6c9e4bca6fe7f18f78559b6
Updating 5b2401e..08217dc
Fast forward
test | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
ironfroggy:/tmp/A ironfroggy$ cat test
a b c
1 2 3


The git reset --hard HEAD^ line (shown in red in the original) is the dangerous move that lost your commit. Now, nothing is actually removed from the repository, except when you perform a garbage collection, so I knew we could recover this somehow. After a short time looking through the docs and asking a question or two on IRC, I arrived at the solution. There are logs kept on the changes that happen to refs (or branches), so I could use those to see when that commit happened. In the reflog output you can see the commit and, above it, the entry where the reset occurs and undoes our commit. The reflog shows the commit id in its short version, but if we look at the log of that commit, we get the full ID and can simply merge it back into our branch. A very clean solution, indeed.
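The same lookup can be scripted. What follows is only a minimal sketch, assuming reflog output shaped like the two lines in the transcript above. The function name find_lost_commit is made up for illustration and is not part of git.

```python
import re

# Matches one line of `git reflog show <branch>` as printed in the transcript,
# e.g. "08217dc... master@{1}: commit: test". Newer git versions print the
# short id without the trailing dots, so the dots are optional here.
REFLOG_LINE = re.compile(
    r"^(?P<sha>[0-9a-f]+)(?:\.\.\.)?\s+(?P<ref>\S+)@\{(?P<n>\d+)\}:\s+(?P<action>.*)$"
)


def find_lost_commit(reflog_text):
    """Return the short id the branch pointed at just before its latest reset.

    The reflog lists the newest entry first, so the entry right after the
    first "reset" line is where the branch was before the reset happened.
    Returns None when no such entry is found.
    """
    entries = []
    for line in reflog_text.splitlines():
        match = REFLOG_LINE.match(line.strip())
        if match:
            entries.append((match.group("sha"), match.group("action")))
    for index, (_sha, action) in enumerate(entries):
        if action.startswith("reset") and index + 1 < len(entries):
            return entries[index + 1][0]
    return None


transcript = """\
5b2401e... master@{0}: reset --hard HEAD^
08217dc... master@{1}: commit: test
"""
print(find_lost_commit(transcript))
```

The id it returns is what gets handed to git log and git merge in the transcript.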

For Daniel This Means...

He didn't lose his whole day's work, but he should be doing local commits more often so that any problems with our scripts can't show up in the first place. At least the work was recovered, and in short time. He is saved, for one more day at least, from being a disgruntled co-worker.

For Git This Means...

I'd love to contribute, so I'm thinking the git-fsck command needs something that can do this for you. Some operation to locate child commits would be really useful in these situations. Once again, I'm very happy with the move to Git.

Friday, November 30, 2007

How To Predict The Solid Web

Developers from both Opera and Mozilla have recently blogged about 3D rendering contexts for the canvas element, confirming my year-old predictions. Of course, the news is a bit saddened by the decision from Opera to support a new, non-GL-based API. I understand the desire for something more high level, but putting well known GL functions underneath is a perfectly acceptable idea. One side or a third party needs to provide a compatibility layer, or they need to decide on one of these APIs. I really hope OpenGL ES makes the win here. This also ties into the OpenGL APIs on Android, accessible through WebKit, so it only makes sense that Firefox, Opera, and all the WebKit-based browsers should standardize, before Redmond releases DirectX 1.0 Web Edition Premium.

For Users This Means...

We're going to see some fun web applications taking advantage of this, but there isn't a lot we'll see that we didn't have with Flash, for years now. I think some of the most interesting effects will come when we can use a canvas as a 3D texture and can render DOM elements into the canvas. When we reach this, we'll see lots of page effects, from folding elements to crumpled elements being deleted to rotating text and interface units.

We're going to see a lot of ugly abuse.

For Developers This Means...

Just one more thing where we'll wait years before specialists are expected, and until then you again need to be a jack of all trades. Now you need to understand some code, a little database theory, CSS styling, artistic design, business layout, and 3D modeling and texturing. Have fun with it.

Wednesday, November 28, 2007

How To Work a Sigmoid

Software Development in Really Big Steps
  1. How To Work a Sigmoid
  2. How To Work a Sigmoid - Part Two

I've written about my use of FogBugz, driven by their great time tracking and estimation features. Using these, I've come across what I think is probably common and should be a goal for estimating the time of a project.

There are two estimations of a project. When you start, you can make some wild guess, pulled from the ether, weeks or months ahead of when you think it will be complete. This is the number that is notoriously and unequivocally wrong. This kind of prediction is simply an invitation to make a smart person look dumb, since so few of us realize that he was never able to make that estimate in the first place. The larger the project, the worse your chances of making this estimate, and they get worse exponentially. This is not new to any of us.

The second estimate is the running estimate, compiled from the tasks the project has been broken down into. Now, the pro of this running estimate is that it is bound to be more accurate than the wild guess you started with, especially if computed with some of the fancy number working FogBugz does to account for how good different developers actually are at estimating their time. However, every pro has a con and this one has a big one: the running estimate, although more accurate, is incomplete. You can only estimate for the tasks you've broken the project into, and that is a fluxing set of tasks. As you develop you break larger tasks into smaller ones, learn new things you need to do, change requirements, find bugs in the work you've done already or the dependencies you use, and continue to iron out the design. This is even more true if you use agile techniques, where you don't design a lot up front, but design on the go. Not to say this isn't a good thing, but it is a thing to be aware of.

The project starts at 0.0 and it ends at 1.0. Your guess is somewhere below or above 1.0, but mathematically cannot be equal to it (because you can't guess!). As you accelerate your collecting of tasks to do, the running estimate begins to increase toward 1.0 very quickly, until you start to level out and complete more tasks than you create. The running estimate follows a sigmoid curve, winding up from nothing and leveling off at the best real estimate that can be given with the real data at hand. Now, I grabbed this image from some place and I didn't add the flat line that represents your initial guess. This is both because I didn't have the time and because that guess is completely useless.

Great, so we work a sigmoid. So what?

The world is flooded with useless information and I don't want to contribute, so this is the part where I make my revelation somewhat useful. At least, theoretically. A good estimation system, like the Evidence-Based Scheduling from Fog Creek, is really great. But what if we included estimation of estimations? Oh, that sounds recursive, for sure. Suppose that, in addition to computing the weighted estimates and the running estimates of release after compiling all the information that can be taken usefully into account, we also track the running estimates as they change over time. If we graph these, I suspect they would roughly follow the curve of the sigmoid. If we find this or any other pattern to be true, we can estimate the estimations. The further along a project goes, the better we can estimate the future of the curve and make moderately intelligent guesses about where the estimation will go. Weighting for how different teams and individual developers estimate, the system can train itself for accuracy.
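To make "estimating the estimations" a little more concrete, here is a rough sketch of the idea, not anything FogBugz does. It assumes the running estimate follows a logistic curve and picks the best-fitting curve from a coarse grid of candidates. The snapshot numbers are invented, and the names logistic and fit_plateau are mine.

```python
import math


def logistic(t, plateau, rate, midpoint):
    """Sigmoid that rises from near 0 and levels off at `plateau`."""
    return plateau / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_plateau(snapshots, plateaus, rates, midpoints):
    """Grid-search the logistic that best matches (week, estimate) snapshots.

    Returns the (plateau, rate, midpoint) candidate with the smallest
    squared error against the recorded running estimates.
    """
    best_params, best_error = None, float("inf")
    for plateau in plateaus:
        for rate in rates:
            for midpoint in midpoints:
                error = sum(
                    (logistic(week, plateau, rate, midpoint) - estimate) ** 2
                    for week, estimate in snapshots
                )
                if error < best_error:
                    best_params, best_error = (plateau, rate, midpoint), error
    return best_params


# Invented snapshots: the running estimate (in hours) recorded in weeks 0 to 5
# of a project whose task list is still growing. They are generated from a
# known curve so the fit can be checked against it.
snapshots = [(week, logistic(week, 400.0, 0.8, 4.0)) for week in range(6)]

plateau, rate, midpoint = fit_plateau(
    snapshots,
    plateaus=[float(p) for p in range(200, 801, 50)],
    rates=[r / 10.0 for r in range(2, 21, 2)],
    midpoints=[float(m) for m in range(1, 11)],
)
print("latest running estimate: {:.0f} hours".format(snapshots[-1][1]))
print("projected final estimate: {:.0f} hours".format(plateau))
```

A grid search is crude, but it needs nothing outside the standard library. Real data would be noisy, and a proper curve-fitting routine would be the better tool once there is real data to feed it.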

I'm already into my current FogBugz tracked project, but my next will be set up to grab the estimate data periodically and I'm just itching to test out my theories. We can't predict when a project is going to be complete, if it ever is, but we can damn sure do better than pulling numbers out of the air.

How To Combine Your Own Efforts

I came across the invite-only beta for Onaswarm.com and I've had some good conversations since having my invite request answered by one of the developers. The upcoming plans are looking really interesting, and I like being able to combine my feeds. I'm still wanting some aggregation for things like bookmark feeds, and it would be nice if trivials like Twitter or Jaiku didn't put full posts into my final feed, but annotated the existing posts with my status. There are a lot of ways this could all go and I'm really interested in this.

You can view my Onaswarm feed. I'm going to continue keeping an ear on things over there and try to get in on this bandwagon. This is more my kind of social network, because it deals with the things I already care about. What am I talking about, and where have I surfed? It lets me pull all this from what I am already doing. I think this can turn out very well.

I have a related surprise to announce here, probably after the holidays. This is a developing side project of mine, which goes in line with things like this and is also quite a different beast. I'm really excited.

Monday, November 26, 2007

How To Have Too Much To Do

I've got a lot of things I'd like to tackle and I just wanted to lay out some of the things on my mind lately. Many of them are small, so maybe I'll even complete a few before 2008.
  • Launch a small, free service that uses a del.icio.us account to take social bookmarks and forward them to Twitter or Jaiku or Onaswarm automatically.
  • Learn how to write a Firefox sidebar application and replace the crappy TwitBin. I want to only see the most recent status from each user and to remember my preferences better.
  • Develop a small desktop tool to grab my bugs from FogBugz and let me track time offline. This will come in handy when I travel around the holidays.
  • Get reacquainted with Nevow and Athena for a few small games, like TicTacToe and Squares.
  • And, as I write this, I want something that will take highlighted text and replace it with a link to a Google search for the text. Easier than looking up the links to everything I just mentioned!
It really seems like an OK plate, now that I have it written out.
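The last item on that list is small enough to sketch right here. This is only the text-to-link half, assuming the editor integration that grabs the highlighted text lives somewhere else. The function name google_search_link is just something I picked.

```python
from html import escape
from urllib.parse import quote_plus


def google_search_link(text):
    """Wrap `text` in an HTML link to a Google search for that text."""
    url = "https://www.google.com/search?q=" + quote_plus(text)
    # Escape both pieces so quotes or angle brackets in the highlighted
    # text cannot break the markup.
    return '<a href="{}">{}</a>'.format(escape(url), escape(text))


print(google_search_link("Nevow Athena"))
```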

How To Enjoy a Week of FogBugz

I have been on an eternal struggle to find the right tools to keep me organized and on track with my projects. Flying blind is just not something I can do, with such a wandering mind. I especially like time tracking tools, because if I am tracking my time in a task, I am far more likely to focus on it until it is complete. Distractions make a liar out of me. When Joel Spolsky blogged about the Evidence-Based Scheduling in the newest release of his FogBugz product, I finally decided to try the service out for a new project I am starting on over the holiday. It has been about a week and I already have some really good impressions.

As far as bug tracking goes, FogBugz seems to bear a good deal of similarity to Bugzilla, but is still very familiar to a Trac fan. They've even added a Wiki, although I've not used it. I'm working solo on my FogBugz trial, right now. (More on that later.) I do wish for a dependency field on cases, instead of just linking to them in the comments. Overall I don't have many wishes for the case tracking itself, and I'm barely using the features available.

The listing is very customizable and I've taken advantage of a few different configurations already, so I can definitely see myself finding more that are useful. There have been some things I haven't found the right fit for. Notably, areas and releases have been a little awkward. Many things cross over different areas of work, so I don't have a clear separation there. I kind of wish for tags, instead. As for releases, there simply are not good uses for those when a project is so internal. I can just make a release for when we decide it is done, but then the field is as useful as not existing at all. I tried to make pseudo-releases for different milestones of functionality, but I am not sure if that is a proper fit.

Time tracking, the very thing that drove me to try FogBugz, is possibly my favorite part. Seeing what you guess and what you actually take is revealing. I seem to guess over, usually, but I wonder if I'll see my task estimates actually getting better as I use this over a period of time. The feedback may train me. I've even found, so far, that the release estimations seem to be pretty well calculated and I've hit the dates it's estimated pretty well. I want to write more about my thoughts on estimation and how well you can estimate what you can't design until you've done many of the things you need to estimate in the first place.

I really am loving it, but I know I need to wait out my trial before making any final call. I think it is well worth the cost over the free Trac and others, even for personal use.

Saturday, November 24, 2007

How To Really Want To Love Flock 1.0

Everyone is abuzz with the release of Flock 1.0 and I so wanted to get on the bandwagon. I just couldn't make the jump. I tried to like it, very hard. Twitbin trumps Flock's Twitter integration and the del.icio.us bookmarks extension does a better job of helping me find my bookmarks. I find it very annoying that I wanted Flock for integration with online services, yet I couldn't remove the local bookmarks I care nothing for from the bookmark sidebar.

Now, there were some things I really liked. I loved that logging into a supported service configured it automatically. I want to see more of that.

Thursday, November 08, 2007

How To Demo With Zero Barrier

When one browses to MindMeister and looks at the nicely designed page, the user will notice a nice screenshot of the service. This is not a screenshot, but an anonymous, live embedding of the actual mind mapping service. Right at the first page, you get to start messing around with things. I think all Web 2.0 apps need to provide this kind of immediate use. We can provide such a low barrier to use, with no installation, but we've really lowered the bar, so to speak. The users won't jump very high for us these days. Let them trip and fall right into our arms.

For Web 2.0 This Means...

Web-based applications need to provide anonymous access to their application on the front page of the website. If you have a to-do application, let the user start interacting before they do anything. Even registration is a barrier to entry. Of course, if you take what they did anonymously and migrate it when they register, you get a gold star. You get two gold stars if you also keep their anonymous data around when they return with a cookie.

For Development Frameworks This Means...

Frameworks need to provide the infrastructure to do this easily. We build things in the context of a user, so sometimes there is a barrier we have to cross ourselves to provide this. Built-in ways to create anonymous users, promote them with credentials, and expire anonymous accounts: all will let developers provide this Siren call to users at little cost.
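As a sketch of what that infrastructure might look like, here is a toy in-memory version. It is not taken from any real framework: AccountStore and its methods are hypothetical names, and password handling is left out entirely.

```python
import time
import uuid


class AccountStore:
    """In-memory stand-in for whatever user storage a framework provides."""

    def __init__(self, anonymous_ttl_seconds=7 * 24 * 3600):
        self.anonymous_ttl_seconds = anonymous_ttl_seconds
        self.accounts = {}  # cookie token -> account dict

    def create_anonymous(self, now=None):
        """Create an account with no credentials; the token goes in a cookie."""
        token = uuid.uuid4().hex
        self.accounts[token] = {
            "username": None,
            "data": [],
            "last_seen": time.time() if now is None else now,
        }
        return token

    def promote(self, token, username):
        """Attach a username to an anonymous account, keeping its data."""
        account = self.accounts[token]
        account["username"] = username
        return account

    def expire_anonymous(self, now=None):
        """Drop anonymous accounts idle longer than the TTL; return how many."""
        now = time.time() if now is None else now
        stale = [
            token
            for token, account in self.accounts.items()
            if account["username"] is None
            and now - account["last_seen"] > self.anonymous_ttl_seconds
        ]
        for token in stale:
            del self.accounts[token]
        return len(stale)


store = AccountStore(anonymous_ttl_seconds=60)
visitor = store.create_anonymous(now=0)
store.accounts[visitor]["data"].append("buy milk")  # used before registering
drive_by = store.create_anonymous(now=0)            # never comes back

store.promote(visitor, "ironfroggy")
removed = store.expire_anonymous(now=120)
print(removed, store.accounts[visitor]["data"])
```

The point of keeping the anonymous account and the registered account as the same record is that promotion costs nothing: whatever the visitor did before registering is already theirs.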

For Users This Means...

More choices, because I can try more things. I don't need to give out my information and remember credentials just to try out yet another twitter clone, to-do app, or mind mapping software.

Tuesday, November 06, 2007

How To Git Away From CVS

James went along with the idea of moving away from CVS quicker than I thought and we put the plan into action last week. I put in the time to the project and started off with the default CVS replacement: Subversion. I really was looking forward to using it at work, until a friend made a subtle suggestion to look closer at the git project, which Linus Torvalds is heading as the version control system of choice for some little thing he's writing called Linux. Needless to say, I was skeptical, given the track record of the developer.

Quicker than I realized, I was falling head over heels for the examples of git use I was seeing. I cooked up a build of the latest stable release for OS X, as the installer I found was 130MB from the 1MB source tarball. Universal binaries on a project that generates about 145 distinct executables is a real bitch. I whipped up a little script to name git and toss into $PATH, while keeping all the git-* executables and other files tucked away in /usr/local/git/. I've cleaned that up and released it for anyone else interested in a clean git install on OS X. I may be releasing more git related work in the near future, if you read on.
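The wrapper idea is simple enough to sketch. The real script was a shell script and is not reproduced here; this is a Python approximation under an assumed layout, where the actual binaries sit in a bin/ directory under /usr/local/git/. Only the /usr/local/git/ prefix comes from the post.

```python
import os

# Assumed layout: the real binaries live under /usr/local/git/bin. The bin/
# subdirectory is a guess made for this sketch.
GIT_PREFIX = "/usr/local/git"


def wrapped_environment(environ, prefix=GIT_PREFIX):
    """Return a copy of `environ` with the private git directory first in PATH."""
    env = dict(environ)
    private_bin = os.path.join(prefix, "bin")
    existing = env.get("PATH", "")
    env["PATH"] = private_bin + os.pathsep + existing if existing else private_bin
    return env


def main(argv):
    """Replace this process with the real git, passing all arguments through."""
    env = wrapped_environment(os.environ)
    real_git = os.path.join(GIT_PREFIX, "bin", "git")
    os.execve(real_git, [real_git] + argv[1:], env)


# Show the PATH the real git would see; main() is not called here because it
# would replace the running process.
print(wrapped_environment({"PATH": "/usr/bin:/bin"})["PATH"])
```

Saved as an executable named git somewhere on $PATH and ending with a call to main(sys.argv), this would leave a single git command visible while the git-* helpers stay out of the way.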

For Development Sanity This Means...


As a team grows out of a few developers and reaches nearly a handful, you need to start thinking more about development processes. Fixing a bug quick while in the middle of a half-finished change is a real problem, especially if you've committed already. Multiple developers working on separate projects can also be difficult to manage, if you don't think about things. These were among our concerns. Also among these concerns was the benefit of making the switch before bringing anyone else on board.

We had seven CVS modules and I developed a script that imported each of them to a new git repository and then merged them in the same layout as we had by checking out all our modules into the same directory. Keeping them together was a good move, and we kept our whole history.

Branching was one of the core driving motivators and I was taken aback when I found branching not doing what I expected in git. Branches in git are local, and although you may often push or pull between branches named the same on different repositories, they are not really related. I got to work on developing a set of branching tools, and I'm very happy with what I did in a small amount of time. The functionality is pretty complete. We can create, share, merge, and switch branches easily. I've even implemented automatic stashing and restoring of working copy changes that haven't been committed when switching between branches!
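The stash-on-switch behavior can be sketched as a small planning function. This is not the actual tool, which is a set of unreleased shell scripts. It assumes stashes are labeled with an "auto-stash <branch>" message, a convention invented for this example, and it uses the current git stash push and git stash pop commands.

```python
def plan_branch_switch(current, target, dirty, stash_list):
    """Return the git commands for switching branches without losing edits.

    `dirty` says whether the working copy has uncommitted changes, and
    `stash_list` is the output of `git stash list`, split into lines, as it
    looked before any of the returned commands run.
    """
    commands = []
    if dirty:
        # Label the stash with the branch it belongs to.
        commands.append(["git", "stash", "push", "-m", "auto-stash " + current])
    commands.append(["git", "checkout", target])

    marker = ": auto-stash " + target
    for line in stash_list:
        if line.endswith(marker):
            # Lines look like "stash@{0}: On master: auto-stash master".
            index = int(line[len("stash@{"):line.index("}")])
            if dirty:
                index += 1  # the stash pushed above shifts older entries down
            commands.append(["git", "stash", "pop", "stash@{%d}" % index])
            break  # newest matching stash only
    return commands


existing = ["stash@{0}: On master: auto-stash master"]
for command in plan_branch_switch("feature", "master", True, existing):
    print(" ".join(command))
```

Returning the commands instead of running them keeps the logic easy to check by eye before anything touches a working copy.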

For Released Work This Means...


The tools are proving fun to work on when some issue comes up where they could be improved. They began life as a set of shell scripts, but they will very likely evolve into Python scripts to facilitate their future improvement. They are also very likely to leave the repository which they themselves are in control of. That was a headache to get proper.

When they are converted to Python I am going to release them. I really want to actually publish the git repository, but I need to figure out how to do that and if we can do it securely from work somehow. It would be nice to do that, such that the repository is always up to date with what we have. I'll probably mirror them to http://repo.or.cz/.

I've also placed the git binaries for OS X in two bzip2 compressed tarballs. git-1.5.3-osx-intel.tar.bz2 is everything you really need, placing only a single git executable into the path. You can grab git-1.5.3-remote-osx-intel.tar.bz2 to expose the tools needed to access your local repository remotely, via ssh.

For SocialServe This Means...


The three of us currently here are now happily branching for our development. We've done hot fixes in the middle of larger branches, without disrupting our work. We've pulled bug fixes from the master branch into our own things, shared branches, merged them into the production line, and are generally having a really good experience.

I think that, in the end, this really means more productivity. I'm able to be more flexible with my project, because I don't need to keep every commit in a state that can be pushed into production if James suddenly needs to fix something unrelated. Branch based development is great, and our scripts help manage the branches very well. I can't wait to release them.

Sunday, October 21, 2007

How To Fulfill The (Geek) Rockstar Dream

I've been putting a lot of thought into my ongoing desire to write something in the way of a video game. This was my original foray into programming and I just didn't stick with it. Turns out I am such a geek that I actually found database design and protocols more interesting than first-person shooters. Go figure. Still, the old dream burns inside me. I've spoken with a few people here and there that could gain interest if I started something, and I'm thinking the time is arriving that I buckle down into the nights and see what I can do.

I've been looking at pygame versus pyglet and hoping to find a ready-to-use accelerated sprite library. Although I really want to write a straight Python, installable game, the lure of the web is strong. There are a lot of fun ideas I could try there, and probably a much larger audience I would reach. Of course, there are pros and cons to both.

















Web-Based
Pros
  • Zero installation
  • Higher number of users
  • One target platform (for the server software)
Cons
  • Nearly impossible to charge players
  • Limited capabilities
  • Disperse browser platforms

Installable
Pros
  • More powerful result
  • Allow mods easier
  • More justified to charge for the game
Cons
  • Less people will play the game
  • More capabilities to waste my time on
  • Disperse target platforms


My options really aren't very clear. I don't know which I'll go with. Either way, I'm sure I'll bring Python into the mix on some level. Of course, I don't necessarily have to choose one or the other. I'm considering the option of taking both routes. The development time would take longer, but I could try an interesting approach of a demo or slim version of the game for free use, probably supported with advertising. Anyone who enjoys the game enough can buy a full version for download.

There are even techniques to share a significant amount of the development effort between the two versions. I'm sure that would give me some interesting things to blog about and perhaps some fun pieces of code to share.

Of course, all of these options don't even get into the questions of platform support, or javascript versus Flash for the web development. The different choices are really a bit much.

Saturday, October 20, 2007

How To Consider Chicago in February

Brett Cannon is considering an import tutorial for PyCon '08, focusing on his new work in the area. I've caught word here and there about talks people are working on, and even had a suggestion to make a talk proposal myself, which is silly. I haven't a clue what I would be able to talk about. I sure would love to listen, and watch, and chat with everyone else. I'm really wondering how likely it is that I could make PyCon '08 the one I am finally able to attend.

For Work This Means...

I have a pretty flexible schedule at work and the boss is a great guy. (No, Van, I am not just saying that because I know you read my blog.) Still, I have no idea what prospects I would have for taking the time to attend PyCon, but I'll deal with that when I decide for sure that I want to try to go. Well, I know I want to go, but I have to make sure that I personally can go, before I figure out if I can professionally go.

For Family This Means...

Either Heather will want to come along with me to the cold of Chicago in February or she has to stay home and take care of Caelan all by herself for a few days. Of course, he'll be almost two by then and gets easier to take care of every day. I wouldn't mind them coming along, but what would they do with all the time that I'm at the convention? I suspect they would find something to occupy those days, like visiting one of her friends or something else that would take them away from the house while I'm gone.

How To Move Down Three Flights Of Stairs

Well, we finally moved into the new office space. It doesn't feel very strange, of course, as I had only been in the old offices for a day over two weeks. Still, it is some feeling. We've got a nice, open space. I'm really loving the use of a long ledge along the one wall as a single, extremely long book shelf. The power and network running down poles at two points in the space allow us to circle desks around them for great collaboration. I'm a little worried about the lack of any personal space barriers, but I don't see that it will be a great problem. I really wish I had gotten one of the two desks with their backs to a corner, but Daniel grabbed it up pretty quick. He's been there as many months as I have been there in weeks, so I can't really complain about his getting first dibs. Now that we have the new space, I have my keys and I can leave a little early to avoid most of the traffic.

The old leaser of the space was an embalming office. It makes the existence of a loading dock a little creepy.

Saturday, October 13, 2007

First Week At SocialServe

My first full week at Social Serve ended yesterday with me jetting out early to try and avoid the rush hour traffic on top of race weekend congestion going past Lowe's Motor Speedway. Turns out, everyone else had the same idea. Still, most days I seem to be able to make good time if I just get out before the real traffic clogs the roads. I've learned a few tricks, as well, like avoiding I-77 completely on the way home in favor of Brookshire all the way to the 85 junction. Avoiding the ramps between three interstates makes a huge difference.

The work itself has been good. I still am learning my way around the existing code, and making my fair share of suggestions to improve things with all the stuff I see from the Python community. I'm looking forward to doing some fun stuff.

I've joined the commit team on the GeoPy project to push some patches and further work I've done on it for Social Serve. Where things are is a surprisingly difficult problem to deal with.

Wednesday, October 10, 2007

How To Work At SocialServe.com

Here are my instructions to anyone else who may want to work at SocialServe.com:
  1. Have a strong enough interest and passion for development to start a contracting business without any formal training. Support yourself for about a year and a half working for one client and the next.
  2. Move to an area populated enough to start a user group for your favorite development language, tool, or concept. (Mine was Python)
  3. Be suggested to send in a resume to the company of one of the first members.
  4. Sweat your way through the first real interview for the kind of job you've wanted your whole life.
  5. Cross your fingers not to screw it up.
So, that's my story. I had my interview last Tuesday, called Thursday, and started Friday. I've enjoyed it a lot. Learning my way around the codebase has been going pretty well and I've already got my first couple of commits in, as well as two small projects. I like to think I'm moving along nicely.

One of the things I need to get used to is that all of our development boxes are Macs. I have a company-issued MacBook Pro (2.33GHz dual Intel, 2GB RAM, 17") and I'm really enjoying the Mac life. The UNIX background is great and the interface is just slick. Installing applications is just fun. I've got the entire KDE suite installed, so I've got a lot of my favorite tools and toys right there.

The time away from the house, while somewhat nice, is probably the biggest downside. Caelan misses me a lot while I'm at work and I miss him and his mother quite a bit. After all this time at home, and all of his life so far with me there every hour, it is tough. It may be harder for me than him.

Finalizing the whole picture is my commute: 30 minutes or less to work and an hour to an hour and a half to get home. What's up with that? I need to see about leaving early or I might just leave an hour late. I'd still get home at the same time, so I might as well spend it in a comfortable chair instead of my ugly car.

Monday, September 17, 2007

How To Stop Me From Buying An iPod Touch

So I've been looking at the new line of iPods and thinking about how much I wanted an iPhone without the phone, so the iPod Touch seemed to be exactly what I wanted. Thankfully, Morgan Webb pointed me to this story about why I might be reconsidering. Now, I run both Windows and Linux. My wife is the one in the household who refuses to run Windows at all, how about that? On both operating systems I already have players that can use the old iPod lines. I use WinAmp and Amarok, although I am anxious for Amarok on Windows to be stable enough to move to. Sorry, Nullsoft! Where am I left now? I guess I need to look at my options, and I'm not really excited over the UI itself on the Touch, but the hardware. Do Nokia Internet Tablets include an MP3 player with decent battery time? I could import a miniOne from China. Thank you eastern friends!

How quickly can Apple break their high horse's legs?

Friday, September 14, 2007

How to Confuse _ and locals()["_[1]"]

So, after posting about the exception raising list comprehensions, I got this:
Kevin has left a new comment on your post "How to Add Memory Leaks to Python":

Doesn't '_' only exist in the interactive interpreter?
Kevin has a misunderstanding here, in that there is a huge difference between the expressions _ and locals()["_[1]"]. You might spot why they are so different, or you might not. The second, the one from the list comprehension, cannot be accessed by name directly. You can only get at it via the locals() and globals() functions, depending on your scope (locals() always works right after the LC in question, though). The name is intentionally something that, if resolved as an actual name in the scope, won't find the object in question. Python will look for _ and then do a subscript lookup on key 1 on it. This is completely different from actually looking up the name _[1] in the dictionary where names are stored and grabbing the value it's bound to.

So, Kevin, the answer to your question is that it's irrelevant, because we aren't dealing with _ at all. Also, open your profile so I can respond to you directly, next time.
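To make the distinction concrete, here is a minimal sketch. It assumes a plain dictionary standing in for the scope's name storage, and it plants the "_[1]" key by hand, since later Python versions no longer create that hidden name on their own. The names namespace, by_key, and by_expression are only for illustration.

```python
# A plain dict standing in for the dictionary where a scope's names are stored.
namespace = {}

# Planted by hand: the hidden name the old list comprehension used.
namespace["_[1]"] = [0, 1, 2]

# An ordinary name "_" that happens to support subscripting.
namespace["_"] = {1: "subscript result"}

# Looking up the literal key "_[1]" in the name dictionary finds the list.
by_key = namespace["_[1]"]

# Evaluating the expression _[1] resolves the name "_" first,
# then does a subscript lookup with key 1 on whatever "_" is bound to.
by_expression = eval("_[1]", namespace)

print(by_key)         # the planted list
print(by_expression)  # the value stored under key 1 of "_"
```

The two lookups never touch the same object, which is the whole point: the key "_[1]" cannot be reached by writing _[1] in code.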

How to Pretend Zope Doesn't Exist

I'm doing my best to ignore that I just read Well-kept secrets of Zope, which lists all sorts of things I'm interested in seeing developed and how Zope did it. Zope is the Simpsons to our Python community's South Park. All of this ORM wildfire and the debates about Unicode support and web frameworks seem really petty when someone reminds us that Zope, which most of us never really look at, has been doing some of these things for over ten years. That's practically a decade in software time!

Will I settle back into a comfortable hole in the ground, cover my ears, and chant "There's no such thing as Zope" or am I going to swallow some pride and surround myself with Zope? The first talk for a CharPy meeting is going to be on Zope, and I'm going to ask a lot of questions about why it gets ignored and how someone really familiar with Python can get into Zope so late in the game. Now, I am not planning to drop everything else. I still think CouchDB has some interesting ideas that I'm pretty sure aren't part of ZODB. I'm also a huge Twisted fan, where it applies. We've had some segmenting problems in Python with Zope and non-Zope, Twisted and non-Twisted and, more recently, Django and non-Django. We seem to be gaining a habit of frameworks that gain a really large following and are well known to those who don't use them, yet those same people continually ignore their developments. If Zope and Twisted played together better, both in code and community, would Django ever have even surfaced to fill the gap? Would Rails have been irrelevant?

We need to build some bridges, so who wants to shake hands?



How to Add Memory Leaks to Python

One of our greatest bragging rights is the lack of memory management in our Python code and the wonder of garbage collection, so when we find a way to get a memory leak in Python, it should be made well known. I don't know if this is already known, or not. In actuality, these situations are known as reference leaks, sometimes, and they are cases where we forget to remove a reference to an object we don't want to keep around anymore. The following session will cause this problem.


Python 2.4.3 (#2, Oct 6 2006, 07:52:30)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def f():
...     for i in xrange(100):
...         yield i
...     raise Exception("oh no!")
...
>>> [x for x in f()]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 4, in f
Exception: oh no!
>>> globals()['_[1]']
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
>>>


Now there is a global sitting around, referencing a potentially very large list, and we won't be aware of it. It will be overwritten if another list comprehension runs in the same scope, and removed if that new LC succeeds. If we do our LCs in functions, the local will be cleaned up on return or raise. Of course, you can always just pass a generator expression to list() and avoid the problem entirely.
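The list() alternative mentioned above could look something like the following. This is a sketch written for current Python (range instead of xrange), where list comprehensions no longer leave the hidden name behind anyway, so it mainly shows the shape of the workaround. The names build, values, error_message, and leaked are made up for the example.

```python
def f():
    # Same generator as in the session: yields 100 values, then fails.
    for i in range(100):
        yield i
    raise Exception("oh no!")


def build():
    # A generator expression handed to list(): the partial list lives only
    # inside list() and is discarded when the exception propagates.
    return list(x for x in f())


try:
    values = build()
except Exception as exc:
    values = None
    error_message = str(exc)

# No hidden "_[1]" name is left behind in the module's globals.
leaked = "_[1]" in globals()
print(values, error_message, leaked)
```

Wrapping the work in a function also follows the other advice above: anything local is cleaned up on return or raise.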

Just keep an eye out, if you build global constants with list comprehensions.

Thursday, September 13, 2007

Python, Concurrency, and My Two Cents Today

This is not the first time and it will not be the last that I write about the state of concurrency in Python, comment about some debate going on in the community, and outline what I think we need to solve any apparent problems and capitalize on what a lot of us think is the future of software development. Anyone following the Python blogs is bound to have caught wind of the Guido-Eckel debate about what Python 3.0 has become, compared to what it could have been. This was followed immediately by an open letter from Juergen of SnapLogic about the GIL. I feel sure this has all happened before and that all parties involved are just playing some recorded macros.

We have two compelling cases right now against the arguments to remove the GIL. Firstly, it was already done! A branch of Python removed the GIL many years ago and actually found a two core/cpu system would run the same code slower, due to all the locking involved to protect mutable structures. So, while people continually say that the GIL needs to be removed, gets in their way, and generally is a wart on Python, we need to remind them that it's been done and it was a bad idea. The GIL is not being kept in as a product of laziness.

Secondly, threading is not the definitive answer to concurrency needs! This is a really important one, because one of the areas where I always see the Python community excel is finding the right way to replace the popular way. Threads are very popular in a lot of circles, but there is a huge consensus that they are simply a misrepresented de facto standard with little to justify the use they see. The Java world, in particular, seems to think that throwing threads at a problem can solve it. What we have to realize is how many problems are caused by threading and whether they outweigh the benefits. A lot of us can't see those benefits through all that cost, so we've looked in other places for concurrency, and we've found it. In some ways, the GIL acts as a deterrent to force us into finding a better way. I, for one, am all for keeping it around just for that reason.

Where can I really go with this? Not far. I could ramble and rant about processes being better concurrency primitives than threads, but I don't feel this is the time or the place. But, please, can we stop asking for the GIL to be removed? No one is going to listen to that plea. The issue is going to come up again, and that is absolutely a promise. We're going to see this again and again, until we have something solid, standard, and powerful enough to distract the thread lovers from the GIL issues. I don't know what that solution is, but we need to figure it out soon. Guido is right, of course, and this is a library issue, not a language issue. However, we can't deny what value this library issue has for the language, and a little encouragement or name dropping on his part might do well to push a good answer to the forefront. Eventually, something needs to get to the point that we can bring it into the standard library and say "This is how you do concurrency in Python."

Who wants to answer the great question? Step up.



Sunday, September 09, 2007

How to Shop for a Portable Device

What do I want? Do I even need a phone? When the iPhone came out, my first reaction was wishing it wasn't a phone. If I want a touch screen in my pocket, shouldn't I save money and open my options with a Nokia Internet Tablet? The iPod Touch and the iPhone will be open, eventually. They can be coerced into running new things, already. The Nokia has a huge library of software, and can be made to run Linux with lots of apps I already use. The price is undeniably better, of course.

Everything says "Don't get the expensive Apple device," and yet, I want to buy it. I even want the phone model, when I don't want a phone. What great marketing.

If I could find any decently open e-paper device, I'd jump on it before anything else.

Saturday, September 08, 2007

How to Exploit All of You For Traffic

People go through a few different stages of programming Python, and one of the last is learning to optimize well and without sacrificing the quality of the code. When a piece of code is bottlenecking, it comes time to look at how you can really turn a turtle into a hare. Or, should that be the other way around?

I want to showcase ways this transformation is possible, so I'm going to make a call for anyone to submit code that needs to be optimized. The next post in this series will show how the code was optimized, what techniques might have been tried and would have failed, and maybe some tips about why the changes worked. There will also be a sample of unoptimized code at the end, with the challenge for improvements to be sent in. From there, if the series has interest, it will continue and maybe evolve.

Send in samples of code you think could be faster. They can be real world or fake, as long as they are realistic. It doesn't matter how poorly written they are, but I need to know what each one does, and it needs to actually work. The best submissions will be a single function and a docstring that tells me what it can be called with and what it should be expected to do. Things that unittest well and don't rely on anything from the outside are best.
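As an illustration of the shape I mean, here is a made-up submission: one self-contained function, a docstring saying what it takes and returns, and deliberately slow code. The function name find_duplicates and its behavior are invented for this example, not taken from anyone's real submission.

```python
def find_duplicates(items):
    """Return the values that appear more than once in `items`.

    Call it with any list of comparable values, for example
    find_duplicates([3, 1, 3, 2, 1]). Each duplicated value is reported
    once, in the order it first appears in the input.
    """
    duplicates = []
    # Deliberately naive: compares every pair of positions, and the
    # "not in duplicates" check scans a list each time.
    for i in range(len(items)):
        for j in range(len(items)):
            if i != j and items[i] == items[j] and items[i] not in duplicates:
                duplicates.append(items[i])
    return duplicates


print(find_duplicates([3, 1, 3, 2, 1]))
```

It works, it depends on nothing outside itself, and it is easy to write a unittest around, which makes it easy to prove an optimized version still does the same thing.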

So, impress me, everyone, with the worst code you've got.

How to Improve on CouchDB

The CouchDB project has been getting a lot of talk lately, all over the blogosphere and particularly in Python circles, where people like JSON and REST and are excited about the new move to them from XML. I really liked what I saw. I also knew I could have liked it a lot more.

You know something interesting is happening when someone with my anti-XML track record says the following: XML should not have been dropped from CouchDB.

I can clarify and reaffirm that I am not crazy by saying that this does not mean JSON should not have been added, but that it should not have replaced anything. In other words, both are fine. While we're at it, why does it matter what format the "document" is in? Be it an XML document, a JSON serialization, or a photo, anything could benefit from the basic architecture of CouchDB.

Looking into the details, I'm disappointed that the distribution model is just smarter replication. Something like this really should be able to do sharding out of the box, with its really unique identifiers and revisioned nature.
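To show what I mean by sharding on identifiers, here is a rough sketch. It says nothing about how CouchDB is actually built; shard_for, node_for, and the node names are all hypothetical, and hashing the ID with SHA-1 is just one simple way to spread documents around.

```python
import hashlib

# Hypothetical storage nodes; a real deployment would have its own list.
NODES = ["node-a", "node-b", "node-c"]


def shard_for(doc_id, shard_count):
    """Map a document ID to a shard number in range(shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Hash the ID so that similar-looking IDs still spread across shards.
    digest = hashlib.sha1(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def node_for(doc_id):
    """Pick the node responsible for a document, based only on its ID."""
    return NODES[shard_for(doc_id, len(NODES))]


print(node_for("some-document-id"))
```

Because the placement depends only on the document's identifier, any client can work out where a document lives without asking a central index. A plain modulo like this reshuffles most documents when the node count changes, which is one of the things a real design would have to deal with.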

Can CouchDB satisfy my requirements or do I need to write SofaCouchDB?

Friday, September 07, 2007

Charlotte Python Group: First Meeting

We had the first meeting of CharPy, the Charlotte Python Group, this morning at Nova cafe downtown. There were only three of us, but most user groups only start with a handful of people, and I'm looking forward to how it grows. We're going to keep this going on the 1st and 3rd Fridays, and anyone else is welcome to join us.

Tuesday, September 04, 2007

Brain Dump 20070904

When things build up and I haven't posted, what to write about is a difficult thing to decide. This is a dump of recent or queued topics just to get them out of the way for new things.

What's Up With This Hip Cat:
  • Looks like we're getting Charlotte Python Group (CharPy) together, and hopefully having some coffee one morning this week. Let me know if you're interested!
  • I'm stretching my legs at a bigger desk in the home office.
  • I'm trying to consider how to approach an open Python position I found in this area versus a possibility with someone I had a great time working with in the past versus moving to Australia down the road.
  • Knytt and Oasis are great little indie games. I wish I could have taken part in PyWeek, but I am so busy! Pyglet looks interesting these days.
  • Took on a tight deadline project at the same time as work on Python Magazine ramped up, because I'm just that willing to bend over for money.
I know I don't write enough. I have to write my column for the October issue of Python Magazine and maybe that will jog something in my brain.

Sunday, August 12, 2007

Bookmarks for August 5 to 11

  1. wzdd: On computer programming
    Is this really what we do? funny programming

  2. YouTube - PostSecret Video made by the owner of postsecret.com Post Secret is the bleeding of an entire species. art video blog

  3. armstrong_thesis_2003.pdf (application/pdf Object) erlang programming concurrency distributed thesis

  4. Wikidbase: the ultimate groupware application, possibly Very interesting project here. Wiki with formatting that controls a database.

  5. Teach a Kid to Program / Wired How To's programming education parenting

  6. Cleversafe Open Source Community storage distributed grid

  7. eBay My World - crystal-forest

  8. WinSplit Revolution Still trying to find better ways to utilize my screens. This looks promising. display productivity monitor software windows free

  9. diagrammes modernes: Friendly Readable ID Strings python algorithm generation debugging

  10. The Independent Gaming Source's (Opinionated) Guide to Indie Gaming Really good list of great indie games. games indie list free fun

  11. The Wii Remote API - Opera Developer Community web wii interface javascript

  12. Multiverse mmorpg games development

  13. Paul Buchheit: The first thing that you need to understand about humans brain people interesting psychology

  14. Not Worried About the Future future religion children intelligence

Extra, Extra, Google Storage Service, Who Cares?

Google is rolling out storage services for their products, and I just wish they'd take my Amazon S3 keys, instead. Go figure.

Friday, August 10, 2007

Less for More in Media

I think this is appropriate, because my favorite software is a transport for media. YouTube, music downloads, blogs, and web comics are all old media turned new. TV and movies are old and web videos are new. CDs are old and downloads are new. Books and magazines are old and blogging is new. Newspaper comics are old and web comics are new. Why did I single out comics? Because Scott Adams, of Dilbert fame (who has an excellent, non-comic-centric blog), took fledgling cartoonist Scott Meyer under his wing. This is old teaching new, and is interesting to watch.

On the first post from Adams, I commented about how many cartoonists might not even want the traditional route of syndication, and will choose to stay with web formats. On part 2, I commented as follows:

It is becoming one of the defining characteristics of the New Media that more people can make less money. To the eyes of the Old Media, this is obviously a Bad Thing. No one gets quite as much attention or makes quite as much money, but if you look at how many more people can make it at least to a good level, and you sum it all up, I'd be sure the overall industry makes more. To add to that, huge chunks of the money aren't going to syndication agencies and other central entities. More of the less money stays with the artists. The same is happening in moves from newspaper comics to web comics, music from CD to download, and sixty dollar video games being pushed aside for dozens of ten to twenty dollar smaller titles, each. The end is more variety, and a better chance of finding something that you like, more people make a living on what they love, and more of the profits staying with the people who are actually doing the creating. The old media will not go away for a long time, and we still need it, but the model simply changes. Cartoonists aren't supposed to make a million dollars a year any more, and that's OK if, instead, twenty or more cartoonists can make a very decent living with their craft, don't you agree?

Wednesday, August 08, 2007

Why An Empty Inbox Is Terrible For Productivity

Mark Hurst of Good Experience has had a deceptive experience. I feel for his sense of accomplishment, but I'm here to let him and everyone know that an empty inbox is a terrible recipe for lost productivity.

It was incredible. It was so stunning to me. I had written off the idea of ever having a totally empty inbox.

- Mark Hurst
The problem here is a subtle lie that none of us realize we tell ourselves or others when we talk about the virtues of keeping a clean inbox. The lie is in the word "inbox". I did all of this some time back, and I was really excited, just like Mark. It wasn't long before I realized I had only made things worse.

Some people may know that I slowly slipped away from the Python-Dev lists, and what you wouldn't know is one of the key causes of my lack of participation in mailing lists for some time now. The problem started when I added filters for all my mailing lists and started regimens to empty out my inbox to 0. I sat, looked at my wondrous empty inbox, and spent the next few months continually ignoring the non-empty labels hidden off in the label listing.

Sure, my inbox can stay empty these days, but all we're doing is hiding the problem. Hiding a problem is not a solution.

Friday, July 06, 2007

Pausing in High Definition

This is a tale of the worst customer service I have ever been witness to. I am the victim in the story. Were it not for my love of On Demand, I would have been on the phone with DirecTV days ago. I'm still considering it, but it depends on some things.

All I wanted was High Definition television and a DVR box from my new cable company. Now, for background, I am renting my current house from my mother-in-law, to keep the house for her, while she works a temporary position with her company's training infrastructure. We're trying to keep as much as possible in her name, so that her move back is easier.

Here is the tale, in bullet point:
  • Call and schedule an appointment and backup appointment for HD DVR and a cable modem installation.
  • Miss the first appointment and figure they were busy.
  • Call first thing on the day of the backup, to remove the cable modem from the order. The phone company gave me a discount to keep the DSL, that saved me more than a bundle from the cable company.
  • Find out they never made the appointment. I waited a week for nothing.
  • I ask if I, not being the account holder, can go in and pick up the box. I am told that I can.
  • Going into the location, I'm told I can't pick up the box, even though I have all the information.
  • My wife is added to the account, and we're told I can pick up the box, being married to an account holder.
  • Second trip is responded to negatively. When I say the 800 number OK'ed my trip, I am told "You don't need to listen to them, you need to listen to me. I have the boxes."
  • My wife goes to the location with a friend.
  • She's told she is not on the account, but that there is a note that her mother called in to add her. Somehow, that was not good enough.
  • "Isn't that good enough," her friend asks.
    "Who is this," the clerk asks, pointing at the friend and not looking away from my wife. "She needs to not talk."
  • On that last point, I shit you not.
  • They take my wife into the back office to tell her she cannot get the box. That is strange.
  • When my wife gets home, I call them again and tell them the story, only to have it confirmed that my wife is absolutely on the account. Their HQ contacts the retail location and tells the manager to expect us and have the equipment ready.
  • We go back, pick up the box, and bring it home.
  • They forgot to give us a power cable.
  • We make a fifth trip to get a power cable.
No one would tell me how much storage the box had, so I still don't know how much I can record on the thing.

Wednesday, July 04, 2007

Me, Too, Python Magazine!

Thankfully, Brian Jones has finally gone public with the Python Magazine he is launching. Why am I so thankful for this? I can talk about it now, and about how I'm a technical editor and columnist, and will most likely author an article in some issues aside from the column. The working title for my column is "And Now For Something Completely Different" and I think I will stick with that.

It is all very exciting. Being a huge fan of education of and through Python, the magazine is really fantastic in my eyes. Whatever I can do to help it along and turn it into a positive move for the community is worth the effort.

Keep your eyes out here for a surprise about the magazine before the first issue comes out.

See his original post:
Python Magazine Lives


Monday, July 02, 2007

Page Chunking, Like Chunky Milk, Is Bad

Search results suck past the first page. Google might have a billion results for some search, but it won't give them all to you in the result page. You are probably only interested in the first five or so results. To be nice, you get a whole ten results on the page. If you want more, you need to go to page after page of ten results at a time, possibly millions of pages worth to get every single result. Obviously, you won't do that, and for two reasons:
  1. You don't care about all ten of the results on the first page, much less the thousands or millions of other result pages.
  2. Refining your search is far easier than going through one page at a time.
Having or bringing the information you want to the top of the listing is better than looking for it further down in the listing. That being the case, our solutions should center around making it easier to bring information up from the mountain of results, instead of finding ways to bury you inside of it.

Some interesting headway has been made with the universal search features launched by Google. You can shift your search focus to their different specialized searches. Ask.com has some of the most interesting result filtering, with their Narrow and Expand search suggestions. Rather than paging through results or manually trying to alter your criteria, they will split the results into logical segments, and point you to what your current results might be a segment of.

Another interesting filter tool could be result voting. I imagine a small "-" link on each result, which when clicked will remove the result, along with any very similar results in the entire set, and will reorder the remaining ones based on how similar they are to something you have deemed completely irrelevant. This would be a great way to filter similarly termed, but logically different concepts. There are rumors that Google is testing such a feature, but I have not seen proof of this yet.
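A toy version of that voting idea might look like this. It is only a sketch under my own assumptions: results are plain title strings, similarity is word overlap (Jaccard), and the 0.5 cutoff is arbitrary. The names jaccard and vote_down are invented here, and no real search engine is claimed to work this way.

```python
def jaccard(a, b):
    """Word-overlap similarity between two strings, from 0.0 to 1.0."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def vote_down(results, rejected, threshold=0.5):
    """Drop `rejected` and anything very similar, then reorder the rest.

    Remaining results are sorted so the ones least similar to the
    rejected result come first.
    """
    kept = [r for r in results if jaccard(r, rejected) < threshold]
    return sorted(kept, key=lambda r: jaccard(r, rejected))


results = [
    "python snake care guide",
    "python snake feeding guide",
    "python programming tutorial",
    "learn programming with python",
]
print(vote_down(results, "python snake care guide"))
```

One click on the snake result takes out both snake pages and leaves the programming results, which is exactly the "similarly termed, but logically different" case.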

What other ways can we dig through the mountains we are mining?



Monday, June 25, 2007

Validation for MySpace Hating

The hating of MySpace is not unique, but any professional-seeming information to back it up is rare. The findings are probably dead on with what I would expect, and I don't even see Facebook or know anyone on it. I do see the people on MySpace and the kind of people that I definitely do not see there. Social classes in the United States are always interesting, because there is a different dynamic than the expected class lines. Although income certainly comes into play, it is not the definitive factor.



Reading about the division reaffirms my desire to use Facebook. However, I don't know anyone on Facebook. All of my friends are on MySpace, including those running their own businesses, those with families, any of the younger members of my family, and the ones making far more money than I. The social divisions that mark MySpace are also what tie me to it.



apophenia: viewing American class divisions through Facebook and MySpace

Saturday, June 23, 2007

Factual Google

Google is building fact mining into the search engine. Coming across a little article over at The Best Article Every Day, I got wind that Google Spreadsheets can do lookup of certain statistical and financial information. You can have formulas that include things like the latest Microsoft stock quote or the boiling point of sodium. This seemed interesting, so I played with it a bit, but changing the formula quickly to play with it was awkward. "Can I just Google this stuff," I thought? Yes. Read on for my findings.

The documentation for the Spreadsheet function, GoogleLookup, talks about entities and attributes. "Pluto" is an entity and "mass" is an attribute. As it turns out, you can just search for "mass of Pluto" or "birth rate in Canada" and are presented with a new type of search result.
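The entity and attribute split can be pictured with a small sketch. This is not how Google does it; FACTS is a tiny hand-made table with approximate values, and parse_query and lookup are names invented for the example. It only handles queries shaped like "attribute of entity" or "attribute in entity".

```python
import re

# Hand-made fact table keyed by (entity, attribute). Values are approximate
# and are here only to have something to look up.
FACTS = {
    ("pluto", "mass"): "about 1.3e22 kg",
    ("sodium", "boiling point"): "about 883 degrees C",
}


def parse_query(query):
    """Split 'attribute of entity' into an (entity, attribute) pair."""
    match = re.match(r"^\s*(.+?)\s+(?:of|in)\s+(.+?)\s*$", query, re.IGNORECASE)
    if match is None:
        return None
    attribute, entity = match.groups()
    return entity.lower(), attribute.lower()


def lookup(query):
    """Return the stored fact for a query, or None when nothing is known."""
    key = parse_query(query)
    if key is None:
        return None
    return FACTS.get(key)


print(parse_query("mass of Pluto"))
print(lookup("boiling point of sodium"))
print(lookup("boiling point of mercury"))
```

A query only gets a fact back when both halves match something in the table, so an entity or attribute nobody has recorded simply comes back empty.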

We can see that Google seems to be pulling facts from the websites they index. They are structuring the information into subjects and properties about them. The feature has some large holes of missing functionality. "boiling point of sodium" gives a fact, but the system fails to parse any of the hits for "boiling point of mercury". The information we can get seems a little hit and miss. The community needs to put in some effort to document all of the entities and attributes.

One interesting result is that searching for "mass of Pluto" doesn't just give us a fact result, but what appears to be a Google calculator result. This means they are recognizing the mass in both value and units. We can even use "mass of Pluto" in any calculation we would give to Google calculator.

As the shift is made from finding relevant documents to just giving us the information directly, we might wonder what the future of the search engine is. I expect we'll see someone in the next year bring Google to court for yet another lawsuit about what Google can or cannot scrape from a website. When you have a nice site with good information, and Google just gives the users the data, you probably worry about the effect on your traffic. If it does affect traffic, then will the sites Google is grabbing the information from even remain active? Where will they get facts from when their fact pulling eliminates their sources?

Thursday, June 21, 2007

The Stand Up Desk

My back and legs hurt, but this might be a solution: the Stand Up Desk. There are different ways to implement this. Some people shell out the money for adjustable desks. You could place a shelf with an extra monitor and keyboard at standing height, attached to your machine with splitters. I'm looking into the kind of adjustable mount arm attached to portables in hospitals, to install behind my desk and allow my screen and a keyboard and mouse to adjust up easily, without needing to move the rest of the desk. I think alternating sitting and standing will be nice. Until then, I'll stand up to read.



The Stand Up Desk - lifehack.org

Wednesday, June 20, 2007

Implicit Interfaces and the Web

The best interface to software might be doing nothing at all. Implicit interfaces are gaining mindshare. This is not a new idea. Amazon improves your experience based on your habits, for example. Google increasingly employs subtle, personal weighting of our search results. In The Implicit Web, Alex Iskold talks about the services of Amazon, Google, and Last.fm. All of them take advantage of the implicit actions of their users. Last.fm lets us track, publish, and find songs we listen to and like, and after installation, I forget it most of the time I use it.

Implicit Today

A number of services have risen that really should be implicit, but are not. This might be caused by implicit interfaces' very nature of being unseen. Although they can be wonderful ways to interact with our networks, they are difficult to deploy. Developing the algorithms to translate user behavior into user interaction, without hindering the user experience, can be difficult. Even coming up with an idea for employing implicitness is difficult.

The ultimate implicit application might be Google, when taken in terms of number of users. Their intuitive PageRank system takes millions of web pages linking to one another and turns them into a social ranking system. Digg, reddit, and their clones are hot news these days; however, we can't deny that they have done little more than turn what was implicit into something explicit. The change has good and bad qualities. An ironic note: Google seems completely unimpressed with social services, being the only major player expressing no interest in a service like social bookmarks. At least, this might appear to be the case, at first glance. However, when we take note that Google's entire business is built on the idea of utilizing the links on our web pages as votes, we find they were ahead of the game and have the largest social bookmarking site on the internet. The only missing feature is associating the websites with actual people.
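To make "links as votes" concrete, here is a minimal sketch of a simplified PageRank-style calculation. It is not Google's actual system; the function name, the damping value of 0.85, the fixed iteration count, and the example.com-style page names are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Every page starts with an equal share of the total rank.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Each outgoing link is a vote, weighted by the voter's own rank.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: two pages link to c, and c links back to a.
links = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}
ranks = pagerank(links)
```

Nobody "voted" explicitly here. The page with the most incoming links ends up on top simply because of how the other pages were written.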

Why the Explicitness

If Google was so successful with the first massively deployed implicit interface, why would sites adapt the pattern into explicit voting systems? The migration from searching to sifting is a probable cause. The original Google model works great for mostly static content. Asking the popular search engine "What's new?" is not easy, and this is an angle explicit services employ. Social networks are nothing new, but the personal and explicit aspects are newly pushed. Search engines tell you which webpages are popular, but treat knowing who agrees as less important. They also have a hard time distinguishing between things you like and things you do not like.

Implicit Tomorrow

We need to evaluate what makes a good system, which explicit interfaces can become implicit, and what naturally implicit features to improve. Embracing the implicit areas leads to a higher level of user involvement, because users can be involved even when they are unaware of it. However, making the user aware of the effects of their implicit interactions might be exactly what the user needs to understand these services are actually there and valuable. There is little market for sites that ask you to manually rank books and movies and then recommend more to you. Amazon made its business on doing just that, because it takes information automatically and makes it obvious to the user what value they are getting. I routinely buy books from my Amazon page, because I know my habits have tuned it into a great place for me to find what I need. The implicit is there, but I explicitly take advantage of it.
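As a rough illustration of recommendations built from implicit behavior, the sketch below scores items by how often they were bought by people whose purchases overlap with yours. This is only an assumed toy approach, not how Amazon works; the function name, the user names, and the book titles in the sample data are made up.

```python
from collections import Counter


def recommend(history, purchases_by_user, limit=3):
    """Suggest items bought by users who share purchases with `history`."""
    owned = set(history)
    scores = Counter()
    for other in purchases_by_user.values():
        overlap = owned & set(other)
        if not overlap:
            continue  # this user has nothing in common with us
        for item in set(other) - owned:
            # The more we have in common, the more their purchase counts.
            scores[item] += len(overlap)
    return [item for item, _ in scores.most_common(limit)]


# Hypothetical purchase logs, collected without anyone rating anything.
purchases = {
    "ann": ["dune", "neuromancer", "snow crash"],
    "bob": ["dune", "snow crash"],
    "cat": ["cooking basics"],
}
suggestions = recommend(["dune"], purchases)
```

The user never ranked a thing. The purchases themselves were the input.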

Monday, June 18, 2007

Google Your Spellchecker

Feature volume rises as applications and services merge and soon we will need the power of Google within single applications. Of course, there are reasons for this that lend to the idea that we will not have single applications in the future. As applications migrate into services, and services combine and interact, the whole of software is evolving into a massive software ecosystem. Every piece of software can integrate with, broadcast to, and pull from a host of other global services. The number of "features" available at any point is rocketing to unimaginable heights. Until we can automate the integration, filtering, and aggregation of the mass of services we have for working with the same data set, we do not benefit as fully from their availability.

Jeff Atwood brought this up in the context of Office 2007's Ribbon and the Scout plug-in that may not see the light of day, for internal political reasons at Redmond. The apparent story is that adding a feature to search their interface, even optionally, would undermine their attempts at marketing the glory that is the Ribbon. Of course, a searchable Ribbon is leagues beyond the traditional mess of menus and toolbars. Embracing this concept would do nothing but benefit them, and give them a head start in giving users a compass to navigate the ocean of features coming to them. Usability is about to transform from a gentle drift to a tidal wave.

I want to expand on this, but it is for another post. Features adapt into web services. Microformats and service discovery replace plug-in systems. The interfaces of our applications will become a search engine of features, contextualized to the present task. When I can gather some information and thoughts on these subjects, I want to produce something interesting that collects the ideas in one place.
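A "search engine of features" can be sketched very simply: index each feature by its name and a few keywords, then match the user's query against that index. The feature names, keywords, and function name below are hypothetical, and this says nothing about how Scout itself was built.

```python
def search_features(features, query):
    """Return feature names whose name or keywords contain every query word."""
    words = query.lower().split()
    matches = []
    for name, keywords in features.items():
        haystack = " ".join([name] + keywords).lower()
        if all(word in haystack for word in words):
            matches.append(name)
    return matches


# Hypothetical feature index for a word processor.
FEATURES = {
    "Insert Hyperlink": ["link", "url", "web address"],
    "Insert Table": ["grid", "rows", "columns"],
    "Word Count": ["statistics", "length"],
}

link_matches = search_features(FEATURES, "link")
insert_matches = search_features(FEATURES, "insert")
```

Contextualizing results to the present task would mean filtering or reordering these matches based on what the user is doing, which this sketch leaves out.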

Office 2007 and Blogging

I finally started running my copy of Office 2007, and I wish I had abandoned Open Office earlier.

Everything is a lot snappier and more responsive than I expected. The common wisdom of each new version of Office requiring hardware upgrades seems unwarranted in the face of this. Certainly, it is furiously faster than Open Office. I don't expect to make as much use of Google Docs and Spreadsheets, either. Word is taking up 20 megabytes in memory, while Firefox is eating 300 MB. Which one I prefer to keep running is obvious.

Now, I tried to write blogs with Open Office, but I found no plug-ins to get it to post to Blogger. You would really think I could use Google Docs, but somehow they don't properly support posting to their own blogging service from their own word processor service! Multiple blogs on one account are not supported. Posting draws the title from the first line in the document, even if the title is present and differs from this, meaning the title appears repeated in the final post. Meanwhile, Word 2007 actually includes support to operate with Blogger, a competitor's service, and supports multiple blogs. This is out of the box, as well.

Lately, I took some heat for my hard views on the whole IronPython versus Python issue, so I want to clear up some things about my opinion and my open mindedness. I will be looking at IronPython for writing plug-ins for Office, and here it doesn't bother me that things will be missing, because I am not using the other things. My first hopeful project: a free, and actually available version of Scout, the ribbon search that politics killed.

One thing that has disappointed me is the static nature of the Ribbon, which is not how I understood it to be. This could be the product of my usage patterns thus far, but several times I have expected it to adapt to what I was doing, and it did not. For example, when I select some text during the writing of a blog post, the hyperlink options should appear. It just seems that is not how the Ribbon works, but am I alone in thinking that was the whole idea?

Object Orientation Has Little to Do With “Objects”

I would like to declare that the word "Object" in "Object Oriented Programming" is damaging to its benefits. If this seems counter-intuitive, you should keep reading. This is a case where the title is harmful to the subject. Some people take things too far and imagine some requirement for the concept of an object, and forbid anything outside their definition. If we understand the real benefits of OOP, the inappropriateness of such object-enthusiasm becomes clear.

Do objects matter? Using a traffic simulation example, we'll say we have instances of a Car class. We add lots of methods, such as accelerate(speed_diff), and implement logic to stop the virtual car at a virtual red light. The non-OO alternative would be functions operating on data describing the state of the vehicle. When we add motorcycles, the non-OO version requires a new function to operate on the new kind of data; or, so we are told. We know the OO way of doing things is to create a Vehicle class and inherit it in both Car and Motorcycle. Somewhere along the way, we lose sight of the effect we actually benefit from.

We benefit from the interface of the "objects", not their virtue of being objects. Too often the consensus you hear focuses on completely irrelevant aspects. Methods, classes, and objects are completely without value, if you do not employ the real benefits. The real benefit is that objects have shape, and multiple objects can have the same shape. This can manifest as a single function operating on both cars and motorcycles, for example. Having accelerate() instead of accelerate_car() and accelerate_motorcycle() is an obvious benefit. It does not matter that we pass something you can call an object to the function, but that we can pass different things which act similarly enough to be handled uniformly. A very non-OO way would be a function which takes the current speed and the acceleration, and returns the new speed. The caller would need to get the information, call the function, and change the speed wherever it is stored. Here, the user is stuck depending on the internals, rather than the shape of the externals.

There are some common situations where I hear complaints from newcomers to the Python language. The misunderstanding of what OO means, and what a language should do, leads to misunderstandings of Python as a language.
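A small sketch of the point about shape. Car, Motorcycle, and accelerate(speed_diff) come from the example above; green_light and new_speed are names made up for illustration. Note that the two classes deliberately share no base class.

```python
class Car:
    def __init__(self):
        self.speed = 0

    def accelerate(self, speed_diff):
        self.speed += speed_diff


class Motorcycle:
    def __init__(self):
        self.speed = 0

    def accelerate(self, speed_diff):
        self.speed += speed_diff


def green_light(vehicles, speed_diff=10):
    """Works on anything with an accelerate() method, whatever its class."""
    for vehicle in vehicles:
        vehicle.accelerate(speed_diff)


def new_speed(current_speed, speed_diff):
    """The 'very non-OO' version: the caller must fetch and store the speed."""
    return current_speed + speed_diff


traffic = [Car(), Motorcycle()]
green_light(traffic)
speeds = [vehicle.speed for vehicle in traffic]
```

green_light never asks what it was given. With new_speed, every caller has to know where each kind of vehicle keeps its speed, which is exactly the dependence on internals described above.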

Getting the length of an object is a great example. You find many Ruby and Java programmers confused or upset that we have no length property on all our objects with a length. The interesting part is the claim that this actually makes Python "less Object Oriented." The fact here is we have a perfectly acceptable model, with common interfaces in a variety of different Python objects. Duck typing is, perhaps, the ultimate goal of object orientation. Mappings, sequences, and iterations are other great examples of shape importance in Python.
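As a short sketch of that common interface: the built-in len() works on any object that defines __len__, so one function can handle lists, strings, and our own types alike. The Playlist class and describe function are hypothetical examples.

```python
class Playlist:
    """A made-up container that takes part in the length and iteration protocols."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        return len(self._songs)

    def __iter__(self):
        return iter(self._songs)


def describe(collection):
    # len() only cares that the object has the right shape, not what it is.
    return "%d items" % len(collection)


results = [
    describe([1, 2, 3]),
    describe("abc"),
    describe(Playlist(["first song", "second song"])),
]
```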

Two top reasons are code reuse and design sanity. Centering on interfaces gives us both cheaply. We can reuse code, because we only care about how it acts, and not what it is. The design of the code is cleaner, because we can remove all reference and care related to what we are dealing with and treat it uniformly.

Saturday, June 16, 2007

The Software Prosumer

The effects of prosumerism are well documented in the evolving economy of content, but the pattern applies equally well and valuably to software creation.

You are most likely a consumer, and if you're an American you think that is a Really Good Thing, most likely. The producers pushed that, and benefit from it. That is not to say we don't benefit from the relationship. Have you seen the price of tube socks at Wal-Mart? I can live with that.

Nonetheless, someone always has to complain, attack the norm, and think they know better. Anyone betting on the dominant future of the “prosumer” is likely right on that current negativity. We don't just read the news. We filter, amend, and combine it. Every novel today spawns even more words of fan fiction. Slashdot would be completely worthless without its prosumer users. The barriers between those who produce and those who consume are blurring and the two are mingling. The party is just getting started.

Prosumerism in Software Development

We are already seeing the effects and benefits of adapting the prosumer identity to software developers. Things like free software and Greasemonkey, which allows anyone with a little JavaScript know-how to alter existing websites, are good examples of the software prosumer. Ideally, the prosumer will consume more than produce, and what is produced can be consumed on the same level by peers. I think we have seen this with Greasemonkey and user scripts. I might not even use Firefox if it were not for the user scripts I employ. Obviously you can't say Firefox is somehow above or better than extensions to it which are more important to the user than the product itself.

There is an obvious lack of interest and motivation to promote the prosumer by software developers. You can attribute that to a subtle fear by developers of providing the users with a way of replacing them. The more development can be done among the users, the fewer real developers we need. However, we can also realize that the more development can be done among the users, the more can be done for the product by all. The more flexible we make our products, the more work the users will do for us. Prosumer software cultures are also great ways to get free marketing, dedicated users, and to satisfy user needs you could never find time for or even be aware of.

Plugin systems are a great way to encourage prosumerism. Some products are developed almost entirely as extendable frameworks, with all the “real” work done in plugins. When the traditional producers work in the same environment as their prosumer base, the ability of those prosumers will rise. When the traditional consumers' only path to prosumerism is fraught with difficulty, hackish patching, and little or no producer support, the results do nothing but harm both sides of the equation.

The next time you're doing a project, keep some things in mind. If you need a new component and you could open the plug-in API a little to allow making it an extension, do so. If you can develop in an accessible (not compiled) language, do so. When you have some spare cycles, set up a repository of user contributions. A few small steps can go a long way.
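Opening a plug-in API "a little" does not have to be a big project. Here is a minimal sketch of a registry where built-in features and user contributions register the same way. The names PLUGINS, plugin, and run are assumptions for illustration, not any particular product's API.

```python
PLUGINS = {}


def plugin(name):
    """Decorator that registers a function as a named plug-in."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


# A feature written by the original developers...
@plugin("upper")
def shout(text):
    return text.upper()


# ...and one a user could drop in, using exactly the same mechanism.
@plugin("reverse")
def reverse(text):
    return text[::-1]


def run(name, text):
    """Look up a plug-in by name and apply it."""
    return PLUGINS[name](text)


output = run("upper", "hello")
```

Because the producers' own features go through the registry too, the prosumer's path is the same one the producers walk, rather than a hackish side door.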

Some examples and references:

Wednesday, June 13, 2007

Advertising Forgot You Remember

Do you remember when you were walking down the street and you saw that billboard for an injury law firm, so you punched the billboard and were teleported magically to their offices?

How about that commercial break during an episode of Friends, for a brand of toothpaste? Do you remember kicking your TV and bottles of toothpaste falling out of it with the shattered glass and smoke?

If you don't remember these events, why do online advertisers want you to hit their banners right then and there, which is so different from how you are used to getting advertisements? Because, you know and they know, that if they do their job right, you'll remember them later, when you need to.

In this light, I propose a new advertising model for the Web: AdMarks. I see ads all the time for things that I would buy, or want to buy, but that doesn't mean I can buy them right now, or have an immediate need for them. I'm not about to follow the banner, open a new window up, read about it, bookmark it, and come back weeks later to my bookmarked ad. I might if it were easier, though. We need advertisements that bookmark when clicked.

Of course, we need useful things to do with those bookmarks later. We should be able to search them, for when we want to buy that thing we saw a week ago. When I search a shopping meta-search site, stuff I AdMarked should come up immediately before anything else. When I'm walking around in Target with my smartphone, the bluetooth chips in the shelves should tell my phone to alert me about something I was interested in. When I happen on the website of the thing, it should know I was interested in one of their products and tell me more about it. When I've earned enough Customer Reward Points, they should just send me one, since they know I want it anyway.
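The "AdMarked items come up first" idea could be as simple as the sketch below. AdMarks is only a proposal, so everything here is hypothetical: the function name, the product names, and the assumption that an AdMark is just a saved product name.

```python
def rank_results(results, admarks):
    """Put results the user has AdMarked ahead of everything else."""
    marked = set(admarks)
    # sorted() is stable, so the original order is kept within each group.
    # False sorts before True, so marked products (not in marked == False) lead.
    return sorted(results, key=lambda product: product not in marked)


search_results = ["blender", "standing desk", "toaster"]
my_admarks = ["standing desk"]
ranked = rank_results(search_results, my_admarks)
```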

I'd love to see Google do this with AdSense.

Python, IronPython, Apples, and Oranges

While Fuzzyman is over at the voidspace, talking about how great it is that, in IronPython, str and unicode are the same things, I'm over here getting more worried every day about the segmentation of Python and IronPython.
IronPython is a new implementation of the Python ... maintaining full compatibility with the Python language.
From the IronPython homepage.

They should go ahead and drop that last qualifier. I want to make something very clear, and that is that I absolutely hate writing this post. The IronPython project is really great, and I've been impressed by what it has done, and by Microsoft's embrace of the language. Admiration does not trump worry, in this case. A number of issues make IronPython simply not Python. I've been advocating this issue more and more recently, so it is about time I wrote at a moderate length about the issue.

In IronPython, str is unicode

Now, it may be true that Python plans to drop the current behavior, make str unicode, and add a separate type specifically for dealing with byte strings (See PEP 358). However, that is not the case yet, and jumping the gun and making str and unicode the same type is an absolutely incorrect non-solution. This is not just a matter of taste, but a situation where IronPython is absolutely wrong. I can make two arguments against this.

IronPython does not encode or decode between str and unicode

One of the most important issues about dealing with unicode is the difference between unicode strings of text and encoded strings, or bytestreams containing encoded text, which may be decoded into understandable unicode (Joel has covered all this). IronPython implicitly cannot do this. A str with a non-ASCII "byte" cannot be decoded by Python, if you don't tell it the encoding being used. This is no flaw, it is the law. IronPython, having no str type, effectively, just assumes the bytes above 127 are taken as the corresponding codepoints. Outside of Latin-1, there is no encoding in which this is the correct behavior. That's right. They just give you a known bad result, and let it go.
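The difference is easy to see in a sketch. This uses Python 3 syntax, where bytes and text are already separate types, rather than the str and unicode types of the Python discussed in this post. The helper bytes_as_codepoints is a made-up name that imitates the "each byte is a codepoint" assumption described above; it is not IronPython code.

```python
# The word "café" encoded as UTF-8: the é becomes the two bytes C3 A9.
data = b"caf\xc3\xa9"


def bytes_as_codepoints(raw):
    """Imitate treating every byte as the codepoint with the same number."""
    return "".join(chr(byte) for byte in raw)


# Refusing to guess: decoding as ASCII fails loudly on the non-ASCII bytes.
try:
    data.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Being told the encoding gives the right text.
correct = data.decode("utf-8")

# Guessing silently produces the wrong text for this data.
mangled = bytes_as_codepoints(data)
```

Here the error is the helpful outcome. The mangled string contains two wrong characters where the é should be, and nothing ever told you.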

When There Is No Bytestring, You Have to Look Elsewhere

So what happens when you truly need to work with byte strings in IronPython, which pretends byte strings are unicode strings? Well, you have to look elsewhere. Of course, the entire .Net API is at your fingertips, so look no further than System.Byte and System.Array. Sounds easy, but the danger here should be obvious. Any Python code assuming, correctly, that str is a byte string type is subject to implosion within IronPython, and any IronPython code "properly" handling byte data simply can't be imported outside IronPython at all.

Language and Library

Does syntax alone make a language? Maybe one day it could, but those days died out. Python is far more than its clean, beautiful syntax. The libraries that come in the standard library provide even more value. As a foundation for all the software built on top, these packages are fundamental to the success of Python. Yes, your code looks beautiful all on its own, but all on its own it does not have an embedded database, configuration parser, and mail and web servers. Right there you have a basis for a huge number of applications, without even leaving the language's vanilla installation.

IronPython does not include any of these, so if you write software using them, don't expect it to run on the .Net runtime, just because IronPython claims compatibility. You can probably access all the same facilities, but you have to do so through the .Net APIs of similar facilities. I am not even sure that the same facilities are provided there. The sad fact about a lot of this is that many of the libraries not included in IronPython would actually work perfectly, without change, if they were included in the distribution.

Because of this, we have to resort to things I consider terrible, like two different Python scripts, both doing some basic HTTP downloads, and both being completely incompatible because they rely on entirely different APIs: IronPython through .Net APIs and the real Python through urllib2 or httplib.

Conclusion


IronPython takes the syntax, but stops short of the language. The problem is one for both Python and IronPython lovers. In Python land, we're seeing what appears to be an influx of interest from the IronPython (also, via Silverlight) world, but all those new developers are creating completely incompatible code. IronPython advocates, on the other hand, look silly to think they are promoting the Python language, and are completely missing out on hundreds of great libraries, years of built up community, and synergy that isn't just a buzzword.

I really want this to all work out. IronPython, can we get along?
I write here about programming, how to program better, things I think are neat and are related to programming. I might write other things at my personal website.

I am happily employed by the excellent Caktus Group, located in beautiful and friendly Carrboro, NC, where I work with Python, Django, and Javascript.
