We made a python module to run complicated non-python code in python (like stuff that needs ridiculous environmental gymnastics). It's called sidomo (Simple Docker Module) and it's used to easily make python modules that are actually docker containers (well, docker images). These containers take an input, then hit it with the contained code, and send the output back to pure python.
The hello world:
    from sidomo import Container

    with Container('ubuntu') as c:
        for line in c.run('echo hello from the;echo other side;'):
            print(line)
Going one step further, using http://CommonWL.org you can wrap Dockerized command line tools into callable Python functions, abstracting away all the details of stdin/stdout redirection and of getting files into and out of the container.
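As a rough sketch of that idea (assuming cwltool is installed, and with `tool.cwl` and the input name standing in for a real CWL description of a Dockerized tool):

    import json
    import subprocess

    def run_cwl_tool(cwl_file, **inputs):
        # Build a cwltool invocation; each keyword becomes a "--name value" input.
        args = ["cwltool", cwl_file]
        for name, value in inputs.items():
            args += [f"--{name}", str(value)]
        # cwltool pulls the Docker image, stages files in and out, and prints the
        # tool's output object as JSON on stdout.
        result = subprocess.run(args, capture_output=True, text=True, check=True)
        return json.loads(result.stdout)

    # e.g. outputs = run_cwl_tool("tool.cwl", input_file="sample.fastq")  # hypothetical names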
I've been looking for something like this. My use case is that I have a bunch of really old bioinformatics programs that are a pain to install and I want to run them from a web app. Instead of bundling all of the weird dependencies with the web app, I want to run them in containers using background workers (rails/sidekiq in this case).
This is an awesome use case. Back when I was in particle physics we had to use ROOT (https://root.cern.ch/) for everything, and configuring it in a new environment would take at least a day.
What kind of bioinformatics software are you plugging into a webapp?
That's a neat way to tie both worlds together, and I can see it being useful in cases like testing.
Nonetheless, it is important to distinguish the need to communicate between programs from the need to programmatically run a piece of software like ffmpeg and get its output.
For the second case, especially in more complex architectures where you need to "interact with software written in another language", it makes sense to explicitly separate this interaction, for example through a broker [0]. In the end, all you need is a way for Program A to tell Program B that there is some sort of job to do, and that message can be a simple string pointing to a raw video file in storage like S3, not necessarily the raw file itself.
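A minimal sketch of that hand-off, assuming Redis as the broker (since [0] doesn't prescribe one) and made-up bucket/queue names:

    import json
    import redis  # any queue/broker would do; Redis is just an assumption here

    broker = redis.Redis(host="localhost", port=6379)

    # Program A: enqueue a pointer to the work, not the raw bytes.
    job = {"tool": "ffmpeg", "input": "s3://my-bucket/raw/video.mp4"}
    broker.rpush("jobs", json.dumps(job))

    # Program B (a worker, possibly a Docker/sidomo process): pop and handle it.
    _, payload = broker.blpop("jobs")
    print("got job:", json.loads(payload))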
I'm struggling to see the advantage of this. Surely running the entire thing in the container and just invoking the command with subprocess would achieve the same effect....
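For reference, the subprocess version of the hello world would look something like this (it only works where the tool is installed alongside the Python app):

    import subprocess

    # Same output as the hello world, but run in the app's own environment.
    out = subprocess.run(
        ["sh", "-c", "echo hello from the; echo other side;"],
        capture_output=True,
        text=True,
    )
    for line in out.stdout.splitlines():
        print(line)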
That's totally true, but it means the python app can only run in the same container as the process.
I thought there were essentially 2 really cool features of this: you don't have to clean up after your (sub)processes and your docker daemon could be running remotely. (e.g., you could distribute tasks to a bunch of servers running your containers)
you can! I found it difficult to work with and decided to make something new after reading this post: http://blog.bordage.pro/avoid-docker-py/
It's a few years old but I think docker-py still needs some love for it to really shine.
Absolutely, we use it especially because our servers are linux and our personal machines are mac. Most of our app is not containerized but sidomo helps us make the 'fiddly bits' (e.g., ffmpeg) super portable.
Hmm, this could be a good way to use postgresql during unit tests for python applications, as a cheap alternative to sqlite://:memory: and ramdisks. Cleanup becomes a container management task instead of having to add the postgresql package to the linux distro.
I feel you on the cleanup side--there was another interesting docker app on HN today that you may want to look at if you're running a large DB in your container: https://github.com/muthu-r/horcrux.
Our Container class is built so that if you use the `with` statement, container termination is handled automatically even if there's a program fault.
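So something like this sketch (the raised error is just for illustration) still leaves no container behind:

    from sidomo import Container

    try:
        with Container('ubuntu') as c:
            for line in c.run('echo still cleaned up'):
                print(line)
            raise RuntimeError("simulated program fault")
    except RuntimeError:
        pass  # the with-block exit has already terminated the container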
Your Python app doesn't have to run in a container at all. If you chose to run it in a container, and you wanted to use sidomo, you would have two options:
1. The container would need to be privileged so that it could run a docker daemon and containers (sidomo processes) within itself.
2. (the "right" way) Use the host's docker daemon from within the first container by bind-mounting the host's docker.sock into it. The first container can then start and stop others that run next to it, instead of inside it (see the sketch below). This way there's no recursion, and no container needs root privileges.
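A rough sketch of option 2, using the Docker SDK for Python directly (assumptions: this code runs inside a container that was started with the host's /var/run/docker.sock bind-mounted in, and the `docker` package is installed; sidomo would sit one level above this):

    import docker

    # from_env() picks up the mounted unix socket, so this talks to the HOST daemon.
    client = docker.from_env()

    # Anything started here becomes a sibling container on the host, not a nested one.
    logs = client.containers.run("ubuntu", "echo hello from a sibling", remove=True)
    print(logs.decode())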
Looking at the code, I can't seem to find a way of running the containers with options such as --net, --dns, etc. Am I missing something, or is it just not part of the plan?
It probably wouldn't be much effort to add an options object that is a direct pass-through to the Docker API. That would expose all of the options and help future-proof it.
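Purely as a hypothetical sketch of what that pass-through could look like (none of these keyword arguments exist in sidomo today; the names just mirror Docker's host-config fields):

    from sidomo import Container  # real import; the options below are made up

    with Container(
        'ubuntu',
        network_mode='host',  # hypothetical pass-through for --net
        dns=['8.8.8.8'],      # hypothetical pass-through for --dns
    ) as c:
        for line in c.run('cat /etc/resolv.conf'):
            print(line)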
I'm not a Python developer, but I've had success doing exactly this in node with the `dockerode` module.