Mitogen, an infrastructure code baseline that sucks less (pythonsweetness.tumblr.com)
108 points by kiyanwang on Sept 28, 2017 | 23 comments



Fascinating information and ideas presented here.

As someone with many years of experience (and tears) with Ansible, I'm looking forward to exploring Mitogen, and the possibility of dropping in my existing Ansible playbooks is pretty compelling.


Cool. This is essentially about creating a single-image cluster that runs Python. I suppose if one can accept the need for Python and a Python module to be installed on all "nodes", this would start to look a lot like Parallel Python:

http://www.parallelpython.com/

From an architectural point of view, an SI cluster could be viewed as code+communication (+data, but the data could be "wrapped" in code). SSH and shell provide this - shell basically wraps libc and binary blobs as callable code (built-in + assumed-available code) - and, if the cluster isn't too heterogeneous, it can be augmented with anything from C/Go code (compiler assumed available) to machine code (scp /usr/bin/<someprog>...).

And it is of course possible to build this on top of python. But the result can be much more than "just" for infrastructure - if it really works it becomes a transparent cluster platform.


A zero-byte Python script can grow up to be anything in the absence of constraints. The infrastructure label makes it simple to say no when thinking about including a feature, and eases selecting one implementation strategy over another. The nerd within would love to implement subtree detachment ( https://github.com/dw/mitogen/blob/master/docs/images/discon... ): this is basically the beginnings of an Internet worm, but it has zero practical use for infrastructure, so it's complexity I'd prefer to avoid.

Commercially, the killer feature for me in this library will be its Python 2.4 support. I need that for consulting work, but that constraint stands in direct contrast to a general-purpose framework for use in application development. Developers want shiny new Python 3 and suchlike; commercially I've never needed that and am unlikely to need it for a few more Debian/RHEL/Ubuntu releases (I'm guessing, probably 2019).

I don't want to rewrite Pyro4 or end up with a design as complex as CORBA (with all the pain transparent object proxies entail). It should have just enough complexity to make it ideal for the target use case.


"How Mitogen Works" is fantastic reading - http://mitogen.readthedocs.io/en/latest/howitworks.html - an unbelievable collection of devious hacks.
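The central hack that document describes can be sketched without Mitogen at all: start an interpreter with a one-line `-c` bootstrap that reads a length-prefixed source blob from its stdin and exec()s it, so no temp files or fragile shell quoting are needed. This is a toy local stand-in (a child process instead of an SSH channel), not Mitogen's actual bootstrap code:

```python
import subprocess
import sys

# The program we want to run on the "remote" side. In Mitogen this is
# shipped over an SSH connection; here a local child process stands in.
remote_source = """
import sys
data = sys.stdin.read()          # further payloads keep arriving on stdin
sys.stdout.write('got %d bytes' % len(data))
"""

# Stage 1: a tiny bootstrap passed via -c. It reads a length-prefixed
# source blob from stdin and exec()s it; the only shell-visible code is
# this single line.
bootstrap = (
    "import sys;"
    "n=int(sys.stdin.readline());"
    "exec(compile(sys.stdin.read(n),'remote','exec'))"
)

proc = subprocess.Popen(
    [sys.executable, '-c', bootstrap],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
)
blob = remote_source.encode()
payload = b'hello'
out, _ = proc.communicate(str(len(blob)).encode() + b'\n' + blob + payload)
print(out.decode())  # -> got 5 bytes
```

After the bootstrap runs, stdin is a free communication channel to the exec()'d program, which is the seed of the full message-passing stream Mitogen builds on top.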


> given a remote machine and an SSH connection, just magically make Python code run on that machine, with no hacks involving error-prone shell snippets, temporary files, or hugely restrictive single use request-response shell pipelines, and suchlike.

I wonder if some of this functionality would be easier with a language that inherently supported distributing functions between nodes, e.g. Joe Armstrong's "favorite Erlang Program", the Universal Server:

https://joearms.github.io/published/2013-11-21-My-favorite-e...

I'm unclear on a few points, though:

* how to bootstrap this onto a system with very limited assumptions about resources/access

* performance

* market effects of being able to re-use a bunch of existing libraries/scripts/tools written for configuration management or what-have-you in popular python

edit - and there's documentation for erlang over ssh:

* http://erlang.org/doc/apps/ssh/using_ssh.html

* http://erlang.org/pipermail/erlang-questions/2007-January/02...

* http://erlang.org/doc/man/slave.html


Nice! I've been experimenting with running Ansible on AWS Lambda for some time now, and I'm wondering if this wouldn't be even better for that. I had to pin Ansible to an old version and do some weird stuff when building the deployment zip, but it does work. I'm not all that tied to my playbooks, and would actually prefer to write Python rather than YAML (incidentally, this is the same reason I still like Chef's use of Ruby so much, even though Ansible being agentless is more useful to me).

Seems like this could potentially do the same thing, even faster. Though I admit I have not dived in all that deep yet. Anyone tried it or see anything that would prevent it from running that way?

If not I guess I'll give it a go this weekend.


Ansible is strange like that: lots of stuff only works if you don't touch it after you get it working. For a period of time the 'apt' module was broken, so I had to do something similar.

It's fantastic software but it can be a bit fickle.


With Kubernetes and other container schedulers gaining popularity and the host OS requiring less effort to be production ready, there is much less need for configuration management tooling. I don't see a lot of appeal for another tool in this space.


There'll always be a need for better tooling in this space: what configures your Kubernetes cluster?


This looks amazing. I know the "getting started" doc is missing at the moment, but any chance of a 4 line example of what I'd type on my computer to start trying this out?


Hi Simon,

The n-liner is:

    import subprocess
    import logging

    import mitogen.utils


    def main(router):
        # http://mitogen.readthedocs.io/en/latest/api.html#context-factories
        # Connect to the VPS over SSH, then escalate to root via sudo
        # through that same connection.
        remote_host = router.ssh(hostname='myvps.mycorp.com')
        root = router.sudo(via=remote_host, username='root', password='god')

        # Run a stdlib function on the remote machine as root.
        rc = root.call(subprocess.check_call, 'ifconfig lo', shell=True)
        print 'Command return code was:', rc

    logging.basicConfig(level=logging.INFO)
    mitogen.utils.run_with_router(main)
Which gives me:

    INFO:mitogen.ctx.local.22050:stdout: lo        Link encap:Local Loopback
    INFO:mitogen.ctx.local.22050:stdout:           inet addr:127.0.0.1  Mask:255.0.0.0
    INFO:mitogen.ctx.local.22050:stdout:           inet6 addr: ::1/128 Scope:Host
    INFO:mitogen.ctx.local.22050:stdout:           UP LOOPBACK RUNNING  MTU:65536  Metric:1
    INFO:mitogen.ctx.local.22050:stdout:           RX packets:323421007 errors:0 dropped:0 overruns:0 frame:0
    INFO:mitogen.ctx.local.22050:stdout:           TX packets:323421007 errors:0 dropped:0 overruns:0 carrier:0
    INFO:mitogen.ctx.local.22050:stdout:           collisions:0 txqueuelen:1
    INFO:mitogen.ctx.local.22050:stdout:           RX bytes:109087018382 (101.5 GiB)  TX bytes:109087018382 (101.5 GiB)
    INFO:mitogen.ctx.local.22050:stdout:
    Command return code was: 0
Seems context name generation has broken :/ But it's running on the remote machine, you get the idea.

I'm working on a "mitop" example utility that implements a multi-host top replacement with hierarchical statistical roll-ups, and documenting its implementation will be the basis for the Getting Started guide. But it depends on a bunch of stuff I'm not ready to work on (async connect), there are still critical issues where hangs may occur due to an exception, and basic stuff like module preloading is incomplete (currently day 13 of working on that).

It will come eventually, but it will likely encourage many more bug reports than my hands can respond to once it is written, so it is kind of a feature that it is missing right now. :)


The appeal of Ansible is that you can declaratively spec infrastructure and it happens (more or less). I don't see how this competes with that.


If I understand it correctly, this is intended to replace the dependencies or subsystems of applications like Ansible rather than Ansible itself.


This looks very clever, but is it a good idea?


A thought occurs: with an AoT compiler that compiles a playbook down to pure Python and feeds that into Mitogen, this could be made backwards compatible with Ansible playbooks, and as a migration path it would look indistinguishable from Ansible from the perspective of a playbook user.
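A toy sketch of what such a playbook-to-Python compiler might look like. Everything here is hypothetical: the task dicts mimic what a YAML parser would produce, and the generated `run(modules)` function and the stub module table are invented for illustration, not Ansible's or Mitogen's API:

```python
# A parsed playbook: a list of task dicts, as a YAML loader would produce.
PLAYBOOK = [
    {"name": "install nginx", "apt": {"name": "nginx", "state": "present"}},
    {"name": "start nginx", "service": {"name": "nginx", "state": "started"}},
]

def compile_playbook(tasks):
    # Emit standalone Python source: one call per task, dispatched through
    # a table of module implementations supplied at run time.
    lines = ["def run(modules):"]
    for task in tasks:
        module, args = next((k, v) for k, v in task.items() if k != "name")
        lines.append("    # task: %s" % task["name"])
        lines.append("    modules[%r](**%r)" % (module, args))
    return "\n".join(lines)

source = compile_playbook(PLAYBOOK)
print(source)

# The generated function could then be shipped to a remote interpreter.
# Here we exec it locally against stub module implementations:
namespace = {}
exec(compile(source, "<playbook>", "exec"), namespace)
calls = []
stubs = {"apt": lambda **kw: calls.append(("apt", kw)),
         "service": lambda **kw: calls.append(("service", kw))}
namespace["run"](stubs)
print(calls)
```

The hard part a real compiler would face is not this dispatch but Ansible's templating, conditionals, loops, and handler semantics, which is presumably why the connection-plugin route is the more practical migration path.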


I believe this is not meant to replace Ansible, but work like an "engine" that Ansible and other similar tools can run under the hood. This is quite low level code even for an everyday Python developer.


Ansible internally abstracts all its SSH bits into a 'connection plug-in'; the initial goal is implementing another one of those. So with one or two extra lines in ansible.cfg you will get a much faster Ansible, with no other changes necessary.

I've no desire to replace Ansible; I'm just sick of wasting hours waiting for it to finish.
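For a sense of scale, the "one or two extra lines" the Mitogen-for-Ansible extension eventually documented look roughly like this (the path is a placeholder; check the project's docs for the exact, current configuration, which shipped as a strategy plugin rather than a connection plugin):

```ini
# ansible.cfg - path is a placeholder for wherever mitogen is checked out
[defaults]
strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategy
strategy = mitogen_linear
```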


I think infrastructure as code is mostly overrated. I can't tell you the number of startups wasting time cargo-culting best practices without knowing why or how. Most people would be better off just launching shit and keeping it simple.


If you're doing something with a couple of machines and it's simple, sure.

However, the benefit of 'infrastructure as code' is that you have a _description_ of your infrastructure. Once you get above a small number of machines, keeping things consistent (because humans are humans) when you're managing pets, not cattle, is difficult and time-consuming; rolling out changes is difficult and time-consuming; understanding why something is the way it is is difficult and time-consuming. OTOH, if you have descriptions of your infrastructure in version control, you can read a description of how everything is set up and know how a given machine or VM with a given role ought to look. You can also dig through history, even if the people committing changes were lazy, and get an idea of why something ended up the way it did.

Sure, doing it as a cargo cult is a bad idea, but that's a criticism that can be levelled at just about anything a start-up might do with regard to the infrastructure side of things, because they'll tend to skimp too much on people with an ops background. But that doesn't mean that it's overrated, just that it's something that you should understand the reasons for doing before doing it.


I'd like to add that a first cut at infrastructure-as-code could be as simple as putting your manual setup tasks (including SSHing into a host) into a shell script and committing it to your repo.

But it takes only a trivial amount of additional time to set up an Ansible playbook (which abstracts away SSHing to a host) that then runs a shell script concerned only with actual setup tasks. That's enough for shipping code, and a good starting point for using the declarative aspects of Ansible to make your setup/deploy processes more automated and less error-prone.

It is possible to cargo-cult infrastructure best practices and make the mistake of investing too many engineering resources in making your deployment practices perfect (for some definition of perfect). Infrastructure-as-code should be understood as a continual process, not something to get completely right on day one, but that applies to any part of your code/product/infrastructure.

Edited: typos


Keeping it simple is good advice, but that's not shell scripts once you have more than ~10 boxes.

Picking a good config management and deployment tool, getting everything checked into git, and being disciplined about how you do things will save you mountains of time in the long run.

I'm currently on saltstack and it's a nice system; it eliminates a lot of the slowness I experienced with Ansible while conceptually working similarly (playbooks -> states, templates, inventory -> pillar, roles, etc). Everything is committed in git; sensitive data like keys are encrypted with a secret not in the repo using nacl. When I deploy, I know every machine is in exactly the same state running the same software and packages -- that is much much harder with ssh and shell scripts.


It's super useful if you embrace disposable environments. The easier it is to build a system or cluster, the more different situations you can assist your internal testing teams in replicating.


Most environments that persist data are not disposable. Works great for web servers and microservices, though.



