Hacker News new | past | comments | ask | show | jobs | submit login

> Our API server was a Python Tornado service... and one of the most frequently accessed endpoint was used to retrieve user by their name or id. Because it supported retrieval by either name or id, it set default values for both parameters as empty lists. This is a super reasonable thing to do! However, Python only initializes default parameters when the function is first evaluated, which means that the same list is used for every call to the function. As a result, if you mutate those values, the mutations span across invocations.

How on earth did such a well-known Python footgun ever make it into production? This is the kind of thing that should leap out of the screen for even a mid-level Python developer - and once you've been trained by bitter experience it's very easy to spot.




Sure it's a known footgun, but I wouldn't be so harsh without knowing more. It could be the result of a series of only slightly bad decisions.

A junior developer creates a function with default of [] instead of (), but otherwise no mutations:

    def get_user(unchecked_ids=[]):
        ids = get_valid_ids(unchecked_ids)
        if not ids: ids = [current_user_id]
        return query_users(ids)
Alice introduces a mutation in a place that is currently safe.

    fa55099 - 45 minutes ago - Alice - Avoid creating unnecessary new list
     def get_user(unchecked_ids=[]):
         ids = get_valid_ids(unchecked_ids)
    -    if not ids: ids = [current_user_id]
    +    if not ids: ids.append(current_user_id)
         return query_users(ids)
Meanwhile Bob simplifies the code in a separate branch.

    f4e704c - 30 minutes ago - Bob - Fix caller who was sending invalid ids
    + def get_user(ids=[]):
    - def get_user(unchecked_ids=[]):
    -     ids = get_valid_ids(unchecked_ids)
          if not ids: ids = [current_user_id]
          return query_users(ids)
Then Charlie does something else on the branch, tries to merge it into master, and Git auto-resolves the conflict because there's no overlap between the changes.

    def get_user(ids=[]):
        if not ids: ids.append(current_user_id)
        return query_users(ids)
And now everyone sees the data of a random user, and your foot is missing.

Is it good code? No. Good version control hygiene? Also no. Should you crucify the developer who made this mistake? Of course not, especially once you add the boilerplate chaff that was omitted here. That's why it's called a footgun, it's easy to misuse.


Humans aren't perfect. Open source projects have missed unescaped spaces in directory paths that caused the deletion of /usr (Bumblebee), video games have forgotten to check the cwd and deleted vital windows boot files (Eve Online) and operating systems have forgotten to check passwords (MacOS).

Given this was 2010 I wouldn't be surprised if they had a less mature development process that doesn't use things people take for granted today, like linters, and pre-merge code reviews.

If you assume perfection of humans, maybe 95/100 times it works out and 4/100 it's a small oops, but 1/100 times you get something embarrassing like this


I was like "hold on, 2010 wasn't that long ago!" but thinking about it now, things like Github and its code review flow via merge requests only came into my radar at about 2012, and even then it took a long time before it was widely adopted.

Around that time I was in the middle of the Java ecosystem though, which had tons of tooling for validation, linting, verification, books full of best practices, etc.

Still a far cry from what is normal these days, but we did do things like unit tests, end-to-end tests and code reviews back then. But, that was enterprise, the SF web companies I think weren't as enterprisey as the bank I worked for at the time.

It's ironic though; the old fashioned companies I've worked for adopted a lot of the more fast-and-loose-feeling practices from SF, think extreme programming, agile, taking a chance on rewriting the whole back-end to a new language, things like that.


Completely disagree with the aproach humans are not perfect. This was not a one opportunity only event. Every test case they did not run, every load simulation they did not perform, every chaos test they decided to overlook was an example of humans failing repeatedly. Massive websites were running since the early nineties...And robust software practices are a thing of the sixties.


Agreed, but this was not a solid company in the first place. Rather, Digg was (according to TFA) a failing startup losing money hand over fist that was launching v4 as a hail mary. They were so hard up that no hardware was available for hosting the new environment, so they started reusing v3 servers for v4 while the migration was ongoing.

They bet it all on this v4 thing, and lost.


Funny you should mention that there was no hardware available. Digg had a second cage full of servers in VA as part of a failed attempt to go multi-DC. If that project had ever worked out there would have been 50% more capacity. Those servers and network gear just sat there unused until they finally got liquidated for pennies on the dollar.


> Humans aren't perfect.

Humans are good at designing processes and procedures that compensate for their lack of perfection. In this case that capability was not used apparently.

Anyone can make a mistake and that's fine, but a company would be expected to have processes and enough eyes to make sure that a bunch of other people need to make the same mistake before it makes its way into prod at which point the likelihood of mistake becomes really low...


Not knowing Python, but I have a feeling things like this weren't as well known back in 2010 and the reason they're so well known is from people doing it at sites like Digg and then telling everyone that's a really bad idea. There are so many things that I know now for programming that were pretty much unknown 10-years ago because no one had really encountered them in a large scale setup.


Nah, this has been very well known since decades and is very easy to spot.


Yeah this is covered in pretty much every style guide, tutorial, and reference. And is an easy screener in an interview. I'm also surprised anyone using Python as more than a toy experiment had this issue, especially in such a critical service.


I've worked on a lot of Python as a hobbyist and student, much of it in real world use - probably at least 100kloc - and I've never encountered this. Where is this information? I'm worried that I'm missing some other "obvious" things. I've of course run into python's shallow vs deep copies, but I don't remember default values being shared between invocations.


If you do a search for "python common gotchas" it's almost certain to come up, usually pretty prominently.


Still... did no one with even moderate Python experience even glance at this very important endpoint at any stage during those 4 weeks? Like I say, this is the kind of thing that jumps out of the screen for an experienced developer.


In 2010? You'd be surprised before widespread adoption of git at how many places the process was individual developers committing changes unsupervised. I'm not sure we even disagree that's the problem - you feel a second person should have looked at it, I feel the process should require that. A code review might be something that they'd sit down with a selected piece of code that they felt was risky on a biweekly basis or something.


I’m 2010 access to prod over ftp:// might have still been a thing even at digg.


Forget code review. This is one of their most important endpoints. They had severe issues for 4 weeks. Did no one not even glance over the code and spot this glaring screw-up?


Considering the context, most people were probably overloaded with work and a lot of changes must have been made with little to no oversight.


Having done this same mistake myself, it's just such an easy mistake to make. Most likely you're writing two or three programming languages in the same project, and the others work differently from Python. So it's very easy to miss an errant `foo={}` in a function definition when reviewing a commit of several hundred lines. Especially as the only indicator of something being wrong is the lowest priority warning PyCharm has, meaning the background colour `{}` is only slightly different.

These days I'd set up a linter that enforces these things do not happen, but it's easy to be smart after the fact. You say "once you've been trained by bitter experience it's very easy to spot", which I feel is true. You just need that bitter experience first, and for me it came in the form of a production issue.


Forget about production, how did this make it into python? This is so unbelievably stupid it makes PHP look like a sensible language.


This is the question people should be asking rather than trying to identify which Digg employee should be retroactively thrown under the bus. Just a completely insane design decision.


This is one of those things that are so bad it warrants breaking backwards compability, possibly by releasing a python v4.


If they start today they could complete the migration by the mid-2040s!


It's one of the top python posts on S/O, every python developer finds their way to this discovery https://stackoverflow.com/questions/1132941/least-astonishme...


It's an interesting decision to be sure. Then again it's not really that arcane, the question is simply whether you calculate function defaults once or on every invocation. Python made the choice to run it once.

What I don't get here is why they didn't simply use 'None' instead. If a parameter is optional then 'None' makes way more sense than an empty list.


It is perfectly sensible… if your language is guaranteed immutable everywhere


I suspect because it's easier to implement defaults as values than as expressions.

The scoping of list comprehensions in Python 2 was far less justifiable, IMO.


This. Should have been taken out in Python 3, but nah, still present even today.


It feels like they had someone developing this service that hasn’t done this kind of thing before. Which is a pretty startupy thing to do. I’ve been there before, even with that exact footgun. Though a much smaller feature and testing revealed it.


You can work in python for years and never know this. How many devs know the entire spec of their working language by heart?

I've been bitten by this exact thing. It's not strictly obvious IMO. Though it didn't make it through testing.


Footguns are still footguns.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: