My favorite Python WTF "feature" is that integers can have the same reference, but only sometimes:
>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False
>>> a = 257; b = 257
>>> a is b
True
Sometimes I think of Python as the Nash Equilibrium[a] of programming languages:
It's never the absolute best language for anything, but it's hard to improve it on any front (e.g., execution speed) without hindering it on other fronts (e.g., ad-hoc interactivity), and it's never the absolute worst language for anything. Often enough, it's good enough.
Python might well be "the least-worst language for everything."
> Sometimes I think of Python as the Nash Equilibrium of programming languages
FYI: What you're describing is not a Nash equilibrium but a Pareto-optimal point [1]. They are similar in that you can't do any better, but a Nash equilibrium is defined in terms of whether any player would gain by changing strategy, while Pareto optimality is only about trading off different features/dimensions.
Think of the developers as players competing against each other trying to get their ideas (PEPs) incorporated into the language, seeking individual recognition, credit, etc., and also think of languages competing against each other for developer attention, and then it will make a bit more sense why I called it a "Nash Equilibrium" :-)
Nash equilibria are mainly interesting when they are not Pareto optimal. Both the developers and the users of a language, if they are rational, should prefer languages to be on the Pareto frontier, but where on that frontier depends on how you weight the trade-offs.
As other commenters have pointed out, this is an implementation-specific optimization rather than a property of Python as a language.
It is, at first glance, a bit weird. But the way to look at it is that Python the language doesn't say the two integers have the same identity, and you shouldn't assume they will. It also doesn't say they can't be the same object. Python integers are immutable, so having two variables reference the same object can't create side effects unless you're directly playing with identities and making assumptions in your code that you shouldn't make. The implementation is therefore free to have the two variables reference the same object as an optimization without breaking anything.
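A minimal sketch of that distinction, assuming CPython for the identity results. The integers are built at runtime with int() so the compiler cannot merge identical literals; the variable names are only for illustration.

```python
# Build the integers at runtime so identical literals cannot be merged.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

# Equality is guaranteed by the language.
print(small_a == small_b, large_a == large_b)  # True True

# Identity is an implementation detail. CPython currently caches -5 through
# 256, so there the first result is typically True and the second False.
print(small_a is small_b, large_a is large_b)
```

Only the first print is something code should rely on.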
But this is using the seemingly harmless keyword "is" that you're supposed to use sometimes. A programmer could stumble upon one of these statements and think it's going to work reliably after it works the first time.
I used to test for None by doing what seemed to work:
    if my_variable:
        ...  # do something
until I discovered it doesn't work if my_variable = 0 or some other falsy value besides None.
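To make the difference concrete, here is a small sketch comparing the truthiness test with an explicit None test. The two helper functions are hypothetical names used only for this illustration.

```python
def describe_truthy(value):
    # Treats every falsy value (None, 0, "", []) as missing.
    return "set" if value else "missing"


def describe_not_none(value):
    # Only None counts as missing; other falsy values are kept.
    return "set" if value is not None else "missing"


for candidate in (None, 0, "", [], 42):
    print(repr(candidate), describe_truthy(candidate), describe_not_none(candidate))
```

The two helpers agree on None and 42 and disagree on 0, "" and [].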
You could use '== None' instead, but it's generally recommended to use 'is None' (supposedly this is slightly faster). I don't think I've ever encountered anything else relying on 'is'. IMO the 'is' keyword was a poor language decision, given how rarely it's ever used.
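Speed aside, one reason 'is None' is usually preferred is that '==' can be overridden by the object being compared, while 'is' cannot. A minimal sketch, using a hypothetical class invented for this example:

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
print(obj == None)  # True, because __eq__ decides the answer
print(obj is None)  # False, identity cannot be overridden
```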
`is` is for identity whereas `==` is for equality. You rarely want `is` unless you're asking if two references are the same object. This is almost exclusively used for `x is False/True`, but sometimes used to denote "missing" arguments (where true/false may be valid):
    missing = object()
    def func(a=missing):
        if a is missing:
            raise ValueError('you must pass a')
This "numbers less than 256 are the same objects" is fairly common on lists of "wtf python", but I've never understood it. You don't use `is` like that and you would never use it in code, because the operator you're using is not the right one for the thing you're trying to do.
Plus if this is the biggest wtf then that's pretty good going.
The "numbers less than 256 are the same objects" wasn't done so you could use "is" on them, that's just a side effect. It was done as an optimization, because those small integers (in CPython, -5 through 256) are far more common than the larger ones. You save space, because you need only one copy of those small integers. And you save time, because those objects are never destroyed or recreated.
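A toy model of such a cache in pure Python, as a sketch only. BoxedInt and its of() method are hypothetical names made up for this example and are not how CPython implements it internally; the -5 through 256 range mirrors what CPython currently uses.

```python
class BoxedInt:
    """Hypothetical integer box, used only for illustration."""

    __slots__ = ("value",)
    _cache = {}

    def __init__(self, value):
        self.value = value

    @classmethod
    def of(cls, value):
        # Return the shared pre-built object for small values,
        # allocate a fresh one otherwise.
        cached = cls._cache.get(value)
        return cached if cached is not None else cls(value)


# Pre-populate once at startup, so small values are never recreated.
BoxedInt._cache = {n: BoxedInt(n) for n in range(-5, 257)}

print(BoxedInt.of(256) is BoxedInt.of(256))  # True
print(BoxedInt.of(257) is BoxedInt.of(257))  # False
```

Callers that compare by value never notice the cache; only identity checks do.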
The "numbers less than 256 are the same objects" reminds me of the existence of the IntegerCache in Java, with an array storing the numbers from -128 to 127.
> It's never the absolute best language for anything, but it's hard to improve it on any front (e.g., execution speed) without hindering it on other fronts (e.g., ad-hoc interactivity),
This belief seems common, but I always wonder if anyone with familiarity with dynamic programming languages that were implemented by people who knew what they were doing (as implementers) thinks so. Self, Smalltalk and Common Lisp, for example, are doing much better on the ad-hoc interactivity front in non-trivial ways whilst offering implementations with vastly better performance, preceding (C)Python by many years. The fact that Python has terrible execution speed is mostly due to a lack of relevant skills in the community, not some conscious engineering trade-off.
Having said that, I don't think you are wrong on Python being "the least worst language for everything" -- very few other languages have an ecosystem of remotely comparable expansiveness and quality (the top minds in several disciplines mostly use Python for their work), which alone kills off huge swathes of would-be competitors.
> Having said that, I don't think you are wrong on Python being "the least worst language for everything" -- very few other languages have an ecosystem of remotely comparable expansiveness and quality (the top minds in several disciplines mostly use Python for their work), which alone kills off huge swathes of would-be competitors.
Yes, I agree. The ecosystem is part of what makes the language "the least worst language for everything."
In [2]: (1, 2) is (1, 2)
Out[2]: True
In [3]: a, b = (1, 2), (1, 2)
In [4]: a is b
Out[4]: True
In [7]: a = (1, 2)
In [8]: b = (1, 2)
In [9]: a is b
Out[9]: False
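If you run that in a script, you get True for all three comparisons. Reason: when running a file, the interpreter (at least CPython) compiles the entire script as one unit and can reuse a single object for equal constants, since tuples of constants are immutable. In the REPL each input is compiled separately, which is why the assignments in In [7] and In [8] produce two distinct objects.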
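You can mimic the "script" case from an interactive session by compiling the statements as a single unit. This is a sketch that assumes CPython's behavior of sharing equal constants within one code object; the result of the identity check is not guaranteed by the language.

```python
source = "a = (1, 2)\nb = (1, 2)\nsame = a is b\n"

# Compile all three statements as one unit, as happens when a file is run.
code = compile(source, "<script>", "exec")
namespace = {}
exec(code, namespace)

print(namespace["same"])  # typically True on CPython, but not guaranteed
print(code.co_consts)     # on CPython the tuple typically appears only once
```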
> Sometimes I think of Python as the Nash Equilibrium[a] of programming languages:
I think you can say that about almost any language. Each feature has its advantages and disadvantages, and even the most hated features of some languages have some reasoning behind them, so changing them would hurt some use case.
Language design is sometimes more about reasonable compromises than genius ideas.
This is outside of the spec... "is" is for testing the exact same reference and it is only coincidence that to speed things up they made smaller integers the same objects in memory. See:
He names references in his post. I highly doubt he’s confused about the difference between is and ==. It’s a weird leak of interpreter details that could, in very narrow situations, cause a bug.
Java has the same "problem" when boxing an int into a java.lang.Integer. Small integers will have the same reference (==) because there is a cache table, but larger ones won't.
--
[a] https://en.wikipedia.org/wiki/Nash_equilibrium