GitHub: Scaling on Ruby, with a nomadic tech team (medium.com/s-c-a-l-e)
176 points by jssjr on Aug 27, 2015 | 41 comments



> For a long time, very key bits of our infrastructure were strung together with Shell scripts and simple scripting, and it’s surprisingly effective and still works really very well for us.

Key bits of the world's infrastructure still run on a bunch of flimsy shell scripts that seem like they'd break all the time but don't. If a computer program works reliably at all it will probably work reliably indefinitely, assuming the environment it lives in does not change substantially.

We probably all trust our lives to high end routers and medical equipment powered by shell scripts (or worse) every day. The code would scare you but, surprisingly, it will probably never fail you.


It's interesting why shell scripts are seen as so "fragile" or "flimsy." Why would bash be any more inherently flimsy than any other interpreted language? Why would the global mass of Ruby, JavaScript, Perl, Python, Java, C++, whatever be any less fragile? The whole world is held together by duct tape. At least most shell scripts keep it simple.


> Why would bash be any more inherently flimsy than any other interpreted language?

I'd say that it at least feels this way, because the "global state" in a shell program depends on a lot more properties of the system than the "global state" of a Python program. C/C++ or even entirely static binaries from Go for example (I'm not a Go expert) all eliminate state being 'imported' from the host machine.


Sure, if the external tools used by the script change, the script's behavior will change... Other languages somewhat solve this with versioned dependencies, but even that's not quite sufficient, which is why reproducible builds are all the rage. But yeah, good point.


Shell scripts have a lot of implicit dependencies.

They string together other programs with widely differing interfaces. For example BSD sed and GNU sed work very differently, at least when you accidentally use a non-standard option.

Ruby, JavaScript, Perl, Python, Java, C++ all have a stable base vocabulary (core library).

bash doesn't.


Speaking of implicit dependencies, here are a few: the environment, the PATH, and any executable file either in the current directory or on the search path. The environment includes functions and aliases, and can be modified by any shell script. The PATH is an implicit dependency in the sense that any chmod +x file in any of its directories can affect the shell, which means the shell also depends on the file system itself.
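
To make the search-path point concrete, here is a minimal Python sketch, assuming a POSIX system. The tool name `mytool` and both helper functions are hypothetical, made up for illustration. It creates two executable placeholder files with the same name in two temporary directories and shows that the one a PATH-style lookup picks depends only on the directory order.

```python
import os
import shutil
import stat
import tempfile


def make_fake_tool(directory: str, name: str) -> str:
    """Create an executable placeholder file called `name` inside `directory`."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\necho placeholder\n")
    # Equivalent of `chmod +x`: add execute bits to the existing mode.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


def resolve(command: str, search_path: str):
    """Return the file a PATH-style lookup would pick, or None if not found."""
    return shutil.which(command, path=search_path)


def demo() -> tuple:
    """Resolve the same (hypothetical) command name under two directory orders."""
    with tempfile.TemporaryDirectory() as first, \
            tempfile.TemporaryDirectory() as second:
        tool_a = make_fake_tool(first, "mytool")
        tool_b = make_fake_tool(second, "mytool")
        a_then_b = resolve("mytool", os.pathsep.join([first, second]))
        b_then_a = resolve("mytool", os.pathsep.join([second, first]))
        # Each element is True when the first directory in the path "won".
        return (a_then_b == tool_a, b_then_a == tool_b)
```

Nothing in the calling script changes between the two lookups; only the search path does, which is the kind of state a script silently inherits from its host.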


True enough, though as soon as you have dependencies, you can't guarantee cross-platform or cross-version portability with those other languages either.


They all come with a dependency manager. The problem with sh (or bash? got your bashisms ready all the time?) is that the moment you introduce dependencies is "immediate".


One of my pet peeves about shell scripts (speaking as a former UNIX admin) is the lack of static typing or test frameworks. So not only is the language less elegant than modern programming languages, but it's harder to test for bugs and you don't even get type safety.


I'm currently in the process of putting together a first version of bash-specs, a Bash testing framework which provides features found in other languages' test frameworks, like

- a DSL for describing specifications

- the ability to mock functions and commands

- detailed, human-readable output

You can find some sample-tests in the spec/ folder (bash-specs eats its own dog food by testing itself ;)).

Documentation is lacking and the mocking is currently broken, but both will be fixed within the next few days.

https://github.com/helpermethod/bash-specs/


Testing might be less of a common practice for quick shell scripts, but it can be done and there's nothing particularly difficult about it, compared to other languages. Especially with "pure" scripts that don't revolve around system effects... which is the case with other scripting languages too.

As for static typing, agreed, though considering that shell is basically a scriptable FFI, real static typing would probably add a lot of complexity? Attempts at "typed shell" don't seem to go over well, maybe due to "worse is better" dynamics.

BTW, re: testing shell scripts: https://github.com/sstephenson/bats


The creator of bup (https://github.com/bup/bup/) also created wvtest (https://github.com/apenwarr/wvtest) to "unit test" any code in any combination of language on any platform. It's actually kind of cheating: all you do in your code is write lines with a specific format, and then the wvthing parses that to display stats.
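
A rough Python sketch of that "print lines in a fixed format, then parse them" idea. The line format below is an assumption made up for this example and is not claimed to be wvtest's exact format; `RESULT_LINE`, `SAMPLE`, and `summarize` are hypothetical names.

```python
import re
from collections import Counter

# Hypothetical result-line format, assumed for illustration only:
#   ! <file>:<line>  <description>  ok|FAILED
RESULT_LINE = re.compile(
    r"^!\s+(?P<where>\S+:\d+)\s+(?P<what>.*?)\s+(?P<status>ok|FAILED)\s*$"
)

# Output as any program in any language might print it while testing itself.
SAMPLE = """\
Testing backup script
! backup.sh:12  archive exists  ok
! backup.sh:19  checksum matches  FAILED
some unrelated log line
! restore.sh:4  exit code is 0  ok
"""


def summarize(output: str) -> Counter:
    """Count result lines by status, ignoring lines that are not results."""
    counts = Counter()
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            counts[match.group("status")] += 1
    return counts
```

Because the only contract is the text format, the code under test can be shell, C, or anything else that can print a line.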


Shell scripts are normally based on scraping the output of other commands via regexes. Properly made Python or Ruby scripts are not.


If a "properly made" Python or Ruby script uses more advanced parsers, what prevents you from "properly making" a shell script?


A few things:

- Most people who write shell scripts really don't want to write a 'ps' or 'ip' (or for non-Linux, 'ifconfig') replacement. On the other hand, Python and Ruby have libraries that talk to /proc and the network ioctls directly.

- most shell script authors don't even use bash arrays, and higher level data types don't exist. It's hard to represent a C struct in shell. C libraries are therefore exposed to shell scripts by scraping their stdout.
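
As a small illustration of the first point, a Python sketch that reads process details from /proc as structured data instead of scraping `ps` output. It assumes Linux and the `Key:<tab>value` layout of /proc/<pid>/status; the function names are made up here, and it reads the file by hand rather than through any particular library.

```python
from pathlib import Path


def parse_status(text: str) -> dict:
    """Turn 'Key:<tab>value' lines (the /proc/<pid>/status layout) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no "key: value" shape
            fields[key.strip()] = value.strip()
    return fields


def own_process_name():
    """Read this process's name from /proc on Linux; return None elsewhere."""
    status_file = Path("/proc/self/status")
    if not status_file.exists():
        return None
    return parse_status(status_file.read_text()).get("Name")
```

The result is a dict keyed by field name, so later code can look up a field directly instead of depending on the column positions of a command's output.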


My favorite quote:

    this is actually a really pragmatic set of hackers
    that just hack on Ruby, hack on C and spend their
    time working on more interesting things using a
    more stable stack, rather than chasing after the
    latest and shiny tech.


There's also the 'Ruby syntax and performance at the same time' option - which used to be limited to JRuby but is quickly being replaced with Elixir.


Elixir's syntax is only superficially similar to Ruby's, and its semantics are radically different. I don't think there's any evidence that people who picked up JRuby for performance reasons are notably likely to leave Ruby for Elixir versus other languages.


Also worth noting that within their stack, they don't chase the newest versions. They were running Rails 2.3 until about a year ago, when they upgraded to Rails 3.0:

http://shayfrendt.com/posts/upgrading-github-to-rails-3-with...


I liked this too.

I definitely prefer working with a stack I know how to wield, and using it to solve problems. New things are good and useful, but having a focus means you can understand all the nuances that go along with it.


Very nice read. Simple, stable and steady wins platform battles. It seems to do well at allowing them to keep focus and careful control of their product.

It's great to hear these core values are baked into their culture. I remember recently reading they were running GitHub Pages on something like a pair of dedicated servers lol. Hats off, GitHub rocks.

Our team recently started using Jira for issues instead of GitHub's. I have to say, if GitHub offered a better issue/wiki tier for $1 more per repo, I'd switch to that in a second.


My favorite quote: "GitHub’s largely officeless workplace — about 60 percent of its employees work remotely, using a powerful homemade chatbot, called Hubot, to collaborate." It's really cool to work wherever you like without losing the team's efficiency.


> A month ago I was working from a cabin in the woods in Wisconsin.

OT, but has anyone else tried this? I've often fantasized about working while embedded in a beautiful nature setting, but I imagine it's tough to find good internet in such a place. I'm in California ... any recommendations?

I've also thought of copying Antony Garrett Lisi's 'science hostel' idea [1] but for coders / entrepreneurs :)

[1]: https://en.wikipedia.org/wiki/Antony_Garrett_Lisi#Science_ho...


I've been surprised how good the internet was out there actually. I'd totally recommend it. Waking up, going for a cycle ride through the woods before working on a porch in the sun is kind of amazing.


California seems to have some of the worst Internet and cell service (at least the SV area). I worked a week last month from the Outer Banks in NC with no problems.


I bought an RV and did this for a while. It was fun but I wasn't any more productive. In the past I also tried moving to a remote beach resort town in Australia with a cofounder. It was fun and beautiful but again no more productive than we were back home.


Is that the point though? To be more productive? I'd be more interested to know if you were at least equally as productive and more importantly happier? I would think it's about improving your work/life balance and long term does that make you more fulfilled?


Most people seem to think the lack of distractions will make them more productive, so for a lot of people that is the point. A pretty view won't improve your work/life balance either, unfortunately.

I was less productive and equally happy. All existences become the norm quickly. Even living in a tropical paradise, or surrounded by snow capped mountain wilderness.


/r/digitalnomad


Depending on what you're doing, poor internet access can be a boon.

Otherwise you'll just be sitting there in your cabin reading Hacker News.


How do you go from DB Admin to Director of Technology in 2 years!? That's pretty impressive.


Thank you. It was a mix of working on impactful stuff (https://github.com/blog/1880-making-mysql-better-at-github), making an effort to work on more general issues like availability, having a culture that allowed me to be ambitious, and solid mentorship from our CTO.


One thought is the title Director is everywhere nowadays. Not to take anything away from this guy but it could be 2+ rungs below CTO. Probably just 1 though.


Another thing that would be interesting to learn more about is their front-end structure. From what I can gather they keep things extremely simple (jQuery, mostly) and make extensive use of pjax and server-generated JavaScript that gets evaluated by the client. If this is all there is to it then it is really impressive, considering how many other companies of similar or smaller size quickly add a JavaScript MVC framework on top of their stack the minute the interaction goes past simple forms (Airbnb comes to mind, which to my knowledge is using Backbone for some aspects of their app).

To be clear I think both approaches are excellent and all depends on the in-house talent, but nonetheless it's interesting to see how GitHub is sticking to a traditional, Rails-inspired document-based approach that minimizes front end complexity while other companies do not, even when both have similar UX complexity.


I'd be really curious to read more about how the GitHub team's intuition around polyglot programming and "Right Tool for the Job" rhetoric changed over time. I feel like I read a blog post about this one or two years ago but I can't remember which team member wrote it :)



I'd love to see their compute and software stack provisioning tools. Are they open source?


hubot is open source and you can write hubot scripts for things you want to do.


I wonder what they use for a chat system? Slack? IRC? Jabber? Their own app suite?


IIRC, they use Campfire.

https://campfirenow.com/


One of the best articles I've read in a while.



