Undefined Python

Will McGugan has a post about the smell of languages, using an example of C code whose behavior is completely undefined: no one knows what it should do. The code in question is this:

int main() { int i = 5; i = ++i + ++i; printf ("%d", i); }
One of the commenters talks about how some constructs in Python can be undefined, but I disagree. My response is below.

There can certainly be cases where you aren’t sure how something in Python behaves, but nothing is undefined in the sense that C can be. I can counter the examples given.

“While a list is being sorted, the effect of attempting to mutate, or even inspect, the list is undefined.”

- While the list is being sorted, any inspection or mutation could only be occurring in some other thread. Multiple threads are, by design, indeterminable.

“Formfeed characters occurring elsewhere in the leading whitespace have an undefined effect (for instance, they may reset the space count to zero).”

- This is caused by improper formatting of a text file. If the file is not formatted properly, you can’t expect magic its-OK-ness.

“super is undefined for implicit lookups using statements or operators such as “super(C, self)[name]””

- Undefined? I actually don’t agree. At least, not in the sense of “undefined” used in this post. Implicit lookups like this are looked up on the type of the object in question, which in this case is the super builtin type. The methods are undefined in the sense that the type does not define them, so they don’t exist. You can’t look them up. This is not “undefined” as in not knowing the behavior.
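A minimal sketch of that point, written for Python 3. The classes Base and C and the methods implicit and explicit are made up for illustration. The operator form fails with an ordinary TypeError because the super type has no __getitem__ of its own, while the explicit method call goes through super's normal proxying:

```python
class Base:
    def __getitem__(self, name):
        return "Base item: " + name


class C(Base):
    def __getitem__(self, name):
        return "C item: " + name

    def implicit(self, name):
        # The [] operator looks up __getitem__ on the type of the
        # super object, and the super type does not define it.
        return super(C, self)[name]

    def explicit(self, name):
        # An explicit attribute lookup is proxied to Base as usual.
        return super(C, self).__getitem__(name)


c = C()
explicit_result = c.explicit("x")

implicit_error = None
try:
    c.implicit("x")
except TypeError as exc:
    # A well-defined failure, not unknowable behavior.
    implicit_error = exc
```

The failure is the same one you would get from indexing any object whose type lacks __getitem__.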

“If the transformed name is extremely long (longer than 255 characters), implementation defined truncation may happen.”

- This is about private name mangling. The mangled names should be considered an implementation detail. You should never use them or try to create them manually, so any implementation-specific differences are completely irrelevant.
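For readers who have not seen mangling in action, here is a small sketch. The Account class and its attribute are hypothetical. Code inside the class uses the short private name, and the transformed name only shows up if you go looking in the instance dictionary:

```python
class Account:
    def __init__(self):
        # Inside the class body, __balance is rewritten by the compiler.
        self.__balance = 100

    def balance(self):
        # Normal code never spells out the mangled name.
        return self.__balance


acct = Account()
current = acct.balance()

# Peeking at the instance dictionary reveals the transformed name.
stored_names = sorted(vars(acct))
has_plain_name = hasattr(acct, "__balance")
```

As long as access goes through the class's own methods, the exact form of the stored name never matters.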


Brian said…
You don't have to be in another thread to mutate a list when sorting. Consider things like the key or cmp arguments to list.sort. These can execute arbitrary code during the sort, so it would be perfectly possible (though rather silly) to give a callback that mutates the list.
Stan Seibert said…
Suppose you have a special cmp_using_list function which takes a list to define an ordering of arbitrary objects:

>>> cmp_using_list = lambda l,x,y: cmp(l.index(x), l.index(y))
>>> foo, bar, baz = list(), '', dict() # arbitrary objects
>>> ordering = [foo, bar, baz]
>>> mycmp = lambda x,y: cmp_using_list(ordering, x, y)
>>> mycmp(foo,foo)
0
>>> mycmp(foo,bar)
-1
>>> mycmp(baz,foo)
1
>>> sorted([foo, foo, baz, bar, baz, foo], cmp=mycmp)
[[], [], [], '', {}, {}]

So far so good, but now this statement is undefined:

>>> sorted(ordering, cmp=mycmp)

mycmp accesses ordering while ordering is being sorted. Of course, by construction, ordering is already sorted, so this happens to work anyway. (And, from a practical perspective, why would you ever need to do this?)
Stan Seibert said…
Hah, figures I would screw that example up. Of course, I should be using sort() instead of sorted() as sorted constructs a new list, thereby eliminating the self reference. If you try it, things do blow up:
>>> ordering.sort(cmp=mycmp)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <lambda>
File "<stdin>", line 1, in <lambda>
ValueError: list.index(x): x not in list
Calvin Spealman said…
I found out why Stan's example blows up when I was toying with this myself. It looks like during sorting, the list is empty (at least in 2.5.1 CPython, so don't depend on it). It's probably an implementation that empties the list, holds an array of object pointers, sorts them, and refills the array. I suppose it's to avoid the heavy GIL locking that would be needed to continually shift things about inside the list.
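A rough way to see this for yourself, sketched for Python 3 (where cmp is gone, so a key function is used instead) and relying on CPython-specific behavior that other implementations need not share. The helper names spying_key and mutating_key are invented for the example:

```python
data = [3, 1, 2]
seen_lengths = []


def spying_key(item):
    # Inspect the list from inside the sort. CPython makes the list
    # appear empty for the duration; don't depend on it elsewhere.
    seen_lengths.append(len(data))
    return item


data.sort(key=spying_key)

other = [3, 1, 2]
mutation_error = None


def mutating_key(item):
    # Mutate the list that is currently being sorted.
    other.append(item)
    return item


try:
    other.sort(key=mutating_key)
except ValueError as exc:
    # CPython raises ValueError when it can detect the mutation.
    mutation_error = exc
```

Inspection goes unnoticed and simply sees an empty list, while the mutation is caught and reported once the sort finishes.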
Marius said…
How about a simple example: the order of keys returned by a_dict.keys() is not defined in Python.

