
Duckling, A Date-Parsing Library That Makes Me Rethink Parsing

Earlier this week I ran across a really fascinating project called Duckling. This isn't the Duckling project that I work on, but the shared name is probably what caught my attention! Duckling is a date parsing library for Clojure, but it handles date parsing in a fairly unusual fashion.

From the Duckling website:
Duckling is “almost” a Probabilistic Context Free Grammar.
Although I am no NLP expert (it is on my long and growing list of things to study one of these days), I was able to get the gist from the explanation and the examples combined. Just look at some of the strings Duckling is able to successfully parse:
“the 1st of march”, “last week”, “a quarter to noon”, “thirty two celsius”, “2 inches”, “the day before labor day 2020”
These don't even have to be dates. Duckling's approach is generalized in such a way that the library itself doesn't require special handling of dates, only that its training set includes sufficient samples of date (and other) text.
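
To make that idea a little more concrete, here is a toy sketch in Python. It is not how Duckling is implemented, and it is not Duckling's API: the rule table, the `parse` function, and the dimension names are all made up for illustration, and there is nothing probabilistic or trained about it. The only point is that the parsing loop knows nothing about dates. Dates, distances, and temperatures are just entries in a table of rules.

```python
import re

# Hypothetical rule table for illustration only; not Duckling's grammar.
# Each rule: (dimension name, pattern, function that builds a value from a match).
MONTHS = {"january": 1, "february": 2, "march": 3}

RULES = [
    ("time",
     re.compile(r"the (\d+)(?:st|nd|rd|th) of (january|february|march)"),
     lambda m: {"month": MONTHS[m.group(2)], "day": int(m.group(1))}),
    ("distance",
     re.compile(r"(\d+) inch(?:es)?"),
     lambda m: {"value": int(m.group(1)), "unit": "inch"}),
    ("temperature",
     re.compile(r"(\d+) celsius"),
     lambda m: {"value": int(m.group(1)), "unit": "celsius"}),
]


def parse(text):
    """Return a (dimension, value) pair for every rule match found in text."""
    results = []
    lowered = text.lower()
    # The loop treats every dimension the same way; no date-specific code here.
    for dimension, pattern, produce in RULES:
        for match in pattern.finditer(lowered):
            results.append((dimension, produce(match)))
    return results


if __name__ == "__main__":
    print(parse("the 1st of march"))
    print(parse("2 inches"))
```

Adding a new kind of thing to recognize means adding a row to `RULES`, not touching `parse`. A real system like Duckling goes much further than this, but the separation between the engine and what it knows is the part that caught my eye.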

What stands out to me is that libraries like this are not just solving a problem, but are actually solving the problem of solving the problem. Programmers shouldn't spend their time handling the million different ways language can describe the same or very similar things, because software can do it for us. And, as programmers, we need to be more aware of what the computers we work with every day are really capable of. When the compiler was invented, programmers worried their jobs would become obsolete, but look at us: we have barely progressed since, and sometimes I worry that is on purpose.

These little problems don't have to be hard, but by insisting that we keep re-solving them in the most difficult and manual ways, we're severely limiting the upward potential of our craft.

Along similar lines, I recently came across Fix My JS, which automatically lints and actually fixes errors in your JavaScript. More of this please! Programming tools can be so much more advanced than they are today, but instead of seeing any real progress, we just see new text editors copying some new combination of features from older text editors.



We can do so much better. Let's see more of this!

