PyPI Security (talkpython.fm)
50 points by SethMLarson on Oct 25, 2023 | 21 comments



I'm really glad this is being taken seriously. It's often been an uphill battle to convince people that Python supply chain security is a serious issue.

In the ML (now called AI) space for example, it's not uncommon to download random binaries from the internet containing model weights, scripts, etc. Sometimes even at runtime (!!!). There are lots of bad practices across the industry that wouldn't be tolerated in other contexts.


ML/AI is a hellfire in itself.

IME, half the instructions got tested just once prior to publication, on the developer's laptop or, if you're lucky, an AWS instance, and that's it. No pinned versions of anything, so if you come across something you want to try (or rather, if it gets pushed to your desk by someone higher-up accompanied by a "we need AI now!!!" note) you'll first have to spend hours upon hours trying to pin down Python package versions just to get the Python part to install, then you gotta mess with CUDA versions because obviously these haven't been documented either... Nasty shit.
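
Once a working combination is found, snapshotting it saves the next person the hunt. A minimal sketch using only the standard library's `importlib.metadata`; `freeze_installed` is a made-up name, the output is roughly what `pip freeze` gives you, and it says nothing about CUDA or other system libraries:

```python
from importlib.metadata import distributions


def freeze_installed() -> list[str]:
    """Return 'name==version' lines for every distribution in this environment."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Redirect this into a requirements file to record the environment.
    print("\n".join(freeze_installed()))
```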


While we're on that topic, what is people's strategy for playing with Stable Diffusion safely?

I still haven't found any way to run it in a VM using consumer hardware (GPU continues to refuse to work). A second install of the OS on a second drive is so insanely clunky to switch between, I'd really like to not have to keep doing that.


I've not actually done it, but from what I understand using two GPUs is the way to go - you use one for your actual display etc., and the other is just passthrough to the VM.

(I was looking into it in the context of running Fusion 360 in a Windows VM though, not Stable Diffusion or any ML.)


Why not run it on your main OS? Otherwise, Docker is fine.


Because it installs like 100,000 python scripts of mystery origin that run with full privileges. Even if the maintainers are unlikely to be malicious on purpose, it only takes one person accidentally putting a typo in a dependencies file in one of the hundreds of packages it imports... many of which are not commonly used.


Why are you running it with full privileges? It's one command to create a user on Linux, and another command to switch to it.


It's better than nothing but is it enough to run potentially malicious code?

I haven't checked recently but a while ago most distros defaulted to letting anyone peep into other users' home dirs. Moreover, there have been so many exploits over the years letting a user gain root privileges that, for the purpose of security, unix users are akin to a bathroom lock.
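
The home-directory part is at least easy to check on a given machine. A minimal sketch, assuming a POSIX system; `others_can_browse` is a made-up helper, and it only looks at the classic permission bits, not ACLs or anything like SELinux:

```python
import os
import stat


def others_can_browse(path: str) -> bool:
    """True if users outside the owner and group can list and enter `path`."""
    mode = os.stat(path).st_mode
    needed = stat.S_IROTH | stat.S_IXOTH  # "other" read + execute bits
    return (mode & needed) == needed


if __name__ == "__main__":
    home = os.path.expanduser("~")
    print(home, "browsable by other users:", others_can_browse(home))
```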


> I haven't checked recently but a while ago most distros defaulted to letting anyone peep into other users' home dirs.

Yeah, no.


no it doesn't


gpu passthrough is normally pretty solid these days

in 2013 I had to buy hardware for it but by 2018 everything supported it out of the box


supply chain security is super important in a lot of areas but seemingly quite a hard problem to broach/solve. hopefully awareness continues to be raised/more tooling/etc


I haven't listened to the interview but are they going to add namespaces? That's the only good solution I can see to the current unfixed dependency confusion issue.

By that I mean, you want to use a private pip repo in your company, you upload `yourcompany_secretproject` to it and tell people to install it. Now the only way to prevent yourself being hacked is to publicly register an empty package `yourcompany_secretproject` on pypi.org. Oh and also hope the admins don't notice it and remove it because it's empty (which they have said they will).

Insane situation.
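
To make the collision concrete: when pip is pointed at both a private index and pypi.org, it compares candidates by name and version, not by which index they came from, and names are compared after normalization. A rough sketch of that name check, assuming you already have a list of public names from somewhere (fetching it is left out); `normalize` follows the rule in PEP 503, `find_collisions` and the sample data are made up:

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_collisions(private_names, public_names):
    """Return private names whose normalized form also appears in public_names."""
    public = {normalize(n) for n in public_names}
    return sorted(n for n in private_names if normalize(n) in public)


if __name__ == "__main__":
    private = ["yourcompany_secretproject", "yourcompany-internal-utils"]
    # Hypothetical data: pretend someone registered this name publicly.
    public = ["YourCompany.SecretProject", "requests"]
    print(find_collisions(private, public))
```

Note that the public name does not have to match character for character; different case or a `.` instead of `_` still counts as the same project.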


Why can’t you ask people to install directly from the GitHub repo and skip publishing on PyPI? Curious to know.


You can, but it's not nearly as nice and it also builds the package from source instead of installing a pre-built binary which can be a lot slower.


Building from source will be slower, but GitHub allows you to publish a package as well.


Does pip support installing that? Can you specify dependencies in `pyproject.toml` to be binaries from GitHub?

Genuinely asking.
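
For context, pip accepts PEP 508 "direct references" of the form `name @ url`, both on the command line and in the `dependencies` list of `pyproject.toml`. The URL can be a `git+https://` repo (built from source) or a direct link to a wheel file. A small sketch that just builds those strings; the helper names, org, repo, and wheel URL are all hypothetical:

```python
def github_requirement(name: str, org: str, repo: str, ref: str) -> str:
    """Build a direct reference that pip resolves by cloning the repo at `ref`."""
    return f"{name} @ git+https://github.com/{org}/{repo}.git@{ref}"


def wheel_requirement(name: str, wheel_url: str) -> str:
    """Build a direct reference to a pre-built wheel hosted at `wheel_url`."""
    return f"{name} @ {wheel_url}"


if __name__ == "__main__":
    # Hypothetical org/repo/tag; paste the result into `dependencies = [...]`
    # or pass it (quoted) to `pip install`.
    print(github_requirement("secretproject", "example-org", "secretproject", "v1.2.0"))
    print(wheel_requirement(
        "secretproject",
        "https://example.com/wheels/secretproject-1.2.0-py3-none-any.whl",
    ))
```

As far as I know, PyPI rejects uploads whose metadata contains direct references, so this only helps for projects that stay off PyPI.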


Check out https://chrisholdgraf.com/blog/2022/install-github-from-pypr... which breaks down a few ways to do this.


This is a great interview! Mike (and Seth, who is tasked with addressing the non-PyPI security needs of the Python ecosystem) have been doing a great job both documenting and expanding the Python ecosystem’s security capabilities and outstanding needs.

PyPI’s security features have undergone a significant expansion since the backend rewrite back in 2017; I think it’s accurate to say that, since then, it has consistently been at the forefront (amongst its peer indices) in terms of adding scopeable API tokens, MFA, secret scanning, and most recently trusted publishing.

(FD: The company I work for helped add some of those features[1][2].)

[1]: https://blog.trailofbits.com/2019/06/20/getting-2fa-right-in...

[2]: https://blog.trailofbits.com/2023/05/23/trusted-publishing-a...


Thank you so much for including a transcript!!! I hate when audio or video content doesn't. I personally prefer to read rather than listen, but there are plenty of users with disabilities who don't even have an option.

transcript link: https://talkpython.fm/episodes/transcript/435/pypi-security


Good to finally see people working on this.



