New Job, Fun Projects, and Amazon S3

I haven't posted in a while, but things have been going on, so I thought I'd post about some of the more interesting aspects. I've recently begun a fairly regular contracting deal with an interesting company, and I'll have to be a little vague on some aspects because of NDAs and such.

During one of my usual nights of aiding the Pythoners of #python on irc.freenode.net, I was discussing a project someone was trying to complete, and the long debate about the various routes that could be taken led to me being contracted for the job, which I've had fun with. I've been contracted to build a FUSE package that uses Amazon's S3 as its storage mechanism. It is a fun system to work with because of its simplicity and its challenging limitations. For example, all data is stored redundantly, but operations are not atomic: the same data could be changed in two places at the same time, and it's unpredictable how the changes would propagate across Amazon's redundant network. Mostly this hasn't been an issue, because you have about the same level of trust in file locks on a local system anyway, and the only real issues have been how to ensure integrity within directory entries and file node chains.
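
To make the directory-entry concern concrete, here is a minimal sketch of a lost update. It does not talk to S3 at all: `FakeBucket`, `add_name`, and the newline-separated listing format are made up for illustration and are not how the package stores directories. It only assumes that a write replaces the whole object and that the last writer wins.

```python
# Hypothetical stand-in for a bucket: a plain dict with whole-object get/put.
class FakeBucket:
    def __init__(self):
        self._objects = {}

    def get(self, key):
        return self._objects.get(key, b"")

    def put(self, key, data):
        # Whole-object overwrite: the last writer wins.
        self._objects[key] = data


def add_name(listing, name):
    """Return a new newline-separated directory listing with name appended."""
    names = listing.decode().split("\n") if listing else []
    names.append(name)
    return "\n".join(names).encode()


bucket = FakeBucket()
bucket.put("dir/", b"a.txt")

# Two clients read the same directory entry before either one writes.
seen_by_one = bucket.get("dir/")
seen_by_two = bucket.get("dir/")

# Each adds a file to its own stale copy and writes the whole entry back.
bucket.put("dir/", add_name(seen_by_one, "b.txt"))
bucket.put("dir/", add_name(seen_by_two, "c.txt"))

final = bucket.get("dir/").decode().split("\n")
print(final)  # b.txt is gone: the second write replaced the first
```

Neither client did anything wrong on its own. The entry is only safe if the read-modify-write cycle is protected somehow, which is exactly what a store without atomic operations does not give you for free.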

This aspect of the work is to be released under the GPL upon completion, and hopefully I can factor out some of the things I've developed for this project that will be useful for other uses of S3. I'll try to factor out modules for the following features:
  • Define classes that represent a type of data stored in an S3 entry
  • Easily define meta-attributes, with coercers and defaults
  • Unique IDs generated for entries
  • "Sub-Entry" concept, where one entry is owned by another
  • Caching of data both on disk and over memcache, with an open API for implementing other cache types, like local-memory caches or even other web services.
  • Node entries, which can span data across multiple entries for more efficient (and cost effective) reads and writes that do not involve the entire data buffer.
  • Test facilities for both BitBucket (a Python S3 access package I use) and Python-MemCached, which I use for offline testing. Both mirror all (read: most) of the functionality of the related projects, so code can be tested against them without actual network use.
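
As a rough idea of how a few of these pieces could fit together, here is a small sketch, not the actual package. Every name in it (`Attr`, `Entry`, `FileEntry`, `write_chunks`, `read_range`, the `id/index` key scheme) is hypothetical, and a plain dict stands in for the bucket. It shows meta-attributes with coercers and defaults, a generated unique ID, and data spanned across several entries so that a read only touches the chunks it needs.

```python
import uuid


class Attr:
    """A meta-attribute with a coercer (any callable) and a default value."""
    def __init__(self, coerce=str, default=None):
        self.coerce = coerce
        self.default = default


class Entry:
    """Base class for one type of stored entry (hypothetical design)."""
    attrs = {}

    def __init__(self, **meta):
        self.id = uuid.uuid4().hex  # unique ID generated for each entry
        self.meta = {}
        for name, attr in self.attrs.items():
            # Fall back to the default, then coerce to the declared type.
            self.meta[name] = attr.coerce(meta.get(name, attr.default))


class FileEntry(Entry):
    attrs = {
        "size": Attr(coerce=int, default=0),
        "owner": Attr(coerce=str, default="nobody"),
    }


CHUNK_SIZE = 4  # tiny on purpose, so the example is easy to follow


def write_chunks(store, entry_id, data, chunk_size=CHUNK_SIZE):
    """Split data across several keys, one per chunk; return the chunk count."""
    count = 0
    for offset in range(0, len(data), chunk_size):
        store["%s/%d" % (entry_id, count)] = data[offset:offset + chunk_size]
        count += 1
    return count


def read_range(store, entry_id, offset, length, chunk_size=CHUNK_SIZE):
    """Read length bytes at offset, fetching only the chunks that overlap."""
    if length <= 0:
        return b""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    parts = [store.get("%s/%d" % (entry_id, i), b"") for i in range(first, last + 1)]
    start = offset - first * chunk_size
    return b"".join(parts)[start:start + length]


store = {}                    # a dict standing in for the bucket
entry = FileEntry(size="10")  # metadata arrives as strings and gets coerced
chunks = write_chunks(store, entry.id, b"0123456789")
middle = read_range(store, entry.id, 3, 4)  # touches chunks 0 and 1 only
```

With a chunk size of 4, the ten bytes land in three chunks, and reading four bytes from offset 3 fetches two of them instead of the whole buffer. On a service that charges per request and per byte transferred, that is where the cost savings would come from.
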
My work with this project has led to the beginning of a long-term working relationship with the company, which I am very excited about. I can't talk about the specifics of the work I will be doing until the company launches in a few months. As soon as that happens, I'll be blogging extensively about the aspects I can divulge, and about any additional software that might be released freely (I don't know if there will be any).

If you are interested, look forward to the S3 packages I'll be wrapping up this weekend. Hopefully, someone will find them useful.

Comments

Anonymous said…
you rock
