That has been the story of every dynamic language since forever; thankfully, the whole AI focus has finally made JITs matter in the CPython world as well.
Personally, I learned this lesson back in the 2000s, in the age of AOLServer, Vignette, and our own Safelayer product. All based on Apache, IIS, and Tcl.
We were early adopters of .NET, back when it was only available to MSFT Partners, and never again used scripting languages without compilers for full-blown applications.
Those learnings are the foundation of OutSystems: the same ideas, built with a powerful runtime, with the hindsight of our experiences.
The push for Python performance and JIT compilation has little to do with AI and more to do with Python's explosion in adoption for backend server applications in the 2010s, as well as the dedication of smaller projects like PyPy that existed largely because it was possible to make them exist. The ML/AI boom helped spread Python even farther and wider, yes, but none of the core language performance improvements are all that relevant for ML or AI.
As another commenter pointed out, the performance bottlenecks in AI specifically have essentially nothing to do with CPython runtime performance. The only exception is the pre-processing of very large text corpora, and that alone has hardly been a blip on the radar of the people working on CPython performance.
Moreover, most of the "Python performance" projects that do sit closer to machine learning use cases (Cython-NumPy integration, Numba, Nuitka) are more or less orthogonal to the more recent push for Python interpreter performance.
Cython itself and MypyC are mainly relevant because they are intended to be general-ish purpose performance boosters for CPython, and in doing so helped fill the need for greater performance in "hot and loopy" code such as network protocols, linters, and iterators. Cython also acted as a convenient glue layer for ad-hoc C library binding. But neither project is all that closely related to AI or to the various JIT compilers that have arisen over the years.
Not at all, given Facebook's and Microsoft's involvement in making the CPython folks finally accept that a JIT has to be part of the story, coupled with NVIDIA's and Intel's work on GPU JIT DSLs for Python.
Yeah, but how much of the Microsoft and Facebook effort was due to AI directly, as opposed to the general popularity of Python? The latter is undoubtedly driven by AI nowadays, but indirectly.
> Personally, I learned this lesson back in the 2000s, in the age of AOLServer, Vignette, and our own Safelayer product. All based on Apache, IIS, and Tcl.
Woah, your mention of “Vignette” just brought back a flood of memories I think my subconscious may have blocked out to save my sanity.
The C/C++ code is shipped in the form of well-established libraries like NumPy and PyTorch. Very few end users ever interact with the C/C++ parts, except for specialists with special requirements and the library contributors themselves.
Can you name specific "un-fashionable" AI projects that are dependent on Python code for things that have any significant performance impact, which are seeing significant benefits from Python JIT implementations?
What's a scripting language? Also, I'm not sure about Tcl (https://news.ycombinator.com/item?id=24390937 claims it's had a bytecode compiler since around 2000), but the main Python and Ruby implementations do have compilers (they compile to bytecode, then interpret the bytecode). Apparently Ruby recently got an optional (has to be enabled) JIT compiler, and Python has an experimental JIT in the latest release (3.13).
"... the distinguishing feature of interpreted languages is not that they are not compiled, but that any eventual compiler is part of the language runtime and that, therefore, it is possible (and easy) to execute code generated on the fly."
No, I worked with the founders at a previous startup, Intervento, which was acquired by EasyPhone, which was later renamed Altitude Software alongside other acquisitions.
They eventually left and founded OutSystems with what we had learned since the Intervento days. OutSystems is one of the greatest startup stories in the Portuguese industry.
This was all during the dotcom wave of the 2000s; I instead left for CERN.
During their Black Friday / Cyber Monday load peak, Shopify averaged between ~0.85 and ~1.94 back-to-back RPS per CPU core. Take from that what you will.
You seem to imply that everything they run is Ruby, but they're talking about 2.4 million CPU cores on their K8s cluster, where other stuff presumably runs as well, like their Kafka clusters [1] and Airflow [2]?
Obviously you meant the whole infrastructure: Ruby/Rails workers, MySQL, Kafka, whatever other stuff their app needs (Redis, memcache, etc.), load balancers, infrastructure monitoring, and so on.
Just to reiterate what was said in the other comments, because your comment perhaps deliberately misrepresents what was said in the thread.
Their entire cluster was 2.4 million CPU cores (with no more info on what those cores were). This includes not only the Ruby web applications that handle requests, but also other infrastructure: asynchronous processing, database servers, message queue processing, data workflows, etc., etc. You cannot run a back-of-the-envelope calculation, arrive at 0.85 requests per second per core, and conclude that this is why they're optimising Ruby. While that might be the end result, and a commentary on contemporary software architecture as a whole, it does not tell you much about the performance of the Ruby part of the equation in isolation.
They had bursts of 280 million rpm (4.6 million rps) with an average of 2.8 million rps; dividing the burst figure by the 2.4 million cores is where the ~1.94 rps per core quoted above comes from.
> It does not tell you much about the performance of the Ruby part of the equation in isolation.
Indeed, it doesn't. However, it would be a fairly safe bet to assume it was the slowest part of their architecture. I keep wondering how the numbers would change if Ruby were to be replaced with something else.
Shopify invests heavily in Ruby and writes plenty of stuff in lower-level languages where they need to squeeze out performance. They were heavily involved in Ruby's new JIT architecture and invested in building their own tooling to try to make Ruby act more like a static language (Sorbet, Bootsnap).
Runtime performance is just one part of a complex equation in a tech stack. It's actually a safe bet that their Ruby stack is pretty fucking solid, because they've invested in it, and hiring Ruby and JS engineers is still 1000x easier than hiring a C++ or Rust expert to do basic CRUD APIs.
Since we're insinuating, I bet you that Ruby is not their chief bottleneck. You won't get much more RPS if you wait on an SQL query or RPC/HTTP API call.
In my experience, when you have a bottleneck in the actual Ruby code (not talking about n+1s or heavy SQL queries or other IO), the code itself is written in such a way that it would be slow in whatever language. Again, in my experience this involves lots of (often unnecessary) allocations and slow data transformations.
Usually this is preceded by a slow, heavy SQL query. You fix the query, get a speed-up from 0.8 rps to 40 rps, add a TODO entry saying "the following code needs to be refactored", but you've already run out of estimate, so you mark the issue as resolved. A couple of months later, the optimization has allowed the result set to grow, and the new bottleneck is memory use and the naive algorithm and lack of appropriate data structures in the data transformation step... again, in the same code you diligently TODOed... Tell me how this is Ruby's fault.
Another example is the classic 'oh, we'll just introduce a Redis-backed cache to finally make use of shared caching and alleviate the DB bottleneck'. Implementation and validation took weeks. Finally all tests were green, but the test suite ran for half an hour longer. The issue was traced to latency to the Redis server and starvation due to locking between parallel workers. The task was quietly shelved afterwards, without ever hitting production or being mentioned again, in a prime example of learned helplessness. If only we had used an actual real programming language and not Ruby, we would not be hitting this issue (/s)
I wish most performance problems would be solved by just using a """fast language"""...
Effective use of IO at such scale implies a high-quality DB driver accompanied by a performant concurrent runtime that can multiplex many outstanding IO requests over a few threads in parallel. This is significantly influenced by the language of choice and the particular patterns it and its libraries encourage.
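For illustration, a minimal sketch of that multiplexing in Ruby using the async gem (one fiber-based option among several; the sleep stands in for a hypothetical DB round trip):

    require "async"   # fiber-based concurrency; gem install async

    # Five "queries" issued concurrently on one thread: each fiber parks
    # while waiting, so the reactor multiplexes the outstanding IO.
    Async do |task|
      pending = (1..5).map do |i|
        task.async do
          sleep(0.1)           # non-blocking here, thanks to the fiber scheduler
          "result #{i}"
        end
      end
      p pending.map(&:wait)    # completes in ~0.1s total, not ~0.5s
    end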
I can assure you - databases like MySQL are plenty fast and e.g. single-row queries are more than likely to be bottlenecked on Ruby's end.
> the code itself is written in such a way that it would be slow in whichever language. Again, in my experience this involves lots of (oft unnecessary) allocations and slow data transformations.
Inefficient data transformations with a high number of transient allocations will run at least 10 times faster in many of Ruby's alternatives. Good ORM implementations will also be able to optimize the queries, or their API is likely to encourage more performance-friendly choices.
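As a sketch of the transient-allocation churn being described (rows is a hypothetical array of comma-separated strings):

    # Each chained step materializes a whole intermediate array.
    ids = rows.map { |r| r.split(",") }
              .select { |f| f[2] == "US" }
              .map { |f| f[0].to_i }

    # One pass, one output array, far fewer short-lived objects.
    ids = rows.each_with_object([]) do |r, out|
      f = r.split(",")
      out << f[0].to_i if f[2] == "US"
    end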
> I wish most performance problems would be solved by just using a """fast language"""...
Many testimonies on Rust do just that. A lot of it comes down to the particular choices Rust forces you to make. There is no free lunch or magic bullet, but this also carries over to languages which offer more productivity by way of less decision fatigue: defaults that might not be as performant in a particular scenario, but at the same time don't sacrifice performance drastically either.
You know, if I were flame-baiting, I would go ahead and say 'there goes the standard "performance is more important than actually shipping" comment'. I won't, and I will address your notes even though they're unsubstantiated.
> Effective use of IO at such scale implies a high-quality DB driver accompanied by a performant concurrent runtime that can multiplex many outstanding IO requests over a few threads in parallel. This is significantly influenced by the language of choice and the particular patterns it and its libraries encourage.
In my experience, the bottleneck is mostly on the 'far side' of the IO from the app's PoV.
> I can assure you - databases like MySQL are plenty fast and e.g. single-row queries are more than likely to be bottlenecked on Ruby's end.
I can assure you, Ruby apps have no issues whatsoever with single-row queries. Even if they did, the speed-up from a faster language would be at most a constant factor.
> Inefficient data transformations with a high number of transient allocations will run at least 10 times faster in many of Ruby's alternatives. Good ORM implementations will also be able to optimize the queries, or their API is likely to encourage more performance-friendly choices.
Or it could be O(n^2) times faster if you actually stop writing shit code in the first place.
Good ORMs do not magically fix shit algorithms or DB schema design. Rails' ORM does in fact point out common mistakes like trivial n+1 queries. It does not ask you: "Are you sure you want me to execute this query that seq-scans the ever-growing-but-currently-20-million-record table to return 5000 records as part of your artisanal hand-crafted n+1 masterpiece (of shit), for you to then manually cross-reference and transform it, and finally serialise it as JSON, just to go ahead and blame the JSON lib (which is in C, btw) for the slowness".
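For anyone who hasn't hit it, a sketch of the n+1 shape in question (hypothetical Post/Author models; includes is ActiveRecord's eager-loading API):

    # Naive loop: 1 query for the posts + N queries for the authors.
    Post.limit(50).each { |post| puts post.author.name }

    # Eager loading: two queries total, regardless of N.
    Post.includes(:author).limit(50).each { |post| puts post.author.name }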
> Many testimonies on Rust do just that. A lot of it comes down to the particular choices Rust forces you to make. There is no free lunch or magic bullet, but this also carries over to languages which offer more productivity by way of less decision fatigue: defaults that might not be as performant in a particular scenario, but at the same time don't sacrifice performance drastically either.
I am by no means going to dunk on Rust the way you do on Ruby, as I've only toyed with it; however, I doubt that I could right now make the performance/productivity trade-off come out in Rust's favour for any new non-trivial web application.
To summarise, my points were: whatever language you write in, if you have IO, you will sooner or later be bottlenecked by IO, and that is the best case. The realistic case is that you will never scale enough for any of this to matter. Even if you do, you will be bottlenecked by your own shit code and/or shit architectural decisions long before the IO; both of these are also language-agnostic.
Just-in-time compilation of Ruby lets you elide a lot of the overhead of dynamic language features and execute optimized machine code instead of running in the VM's bytecode interpreter.
For example, doing some loop unrolling for a piece of code with a known and small enough fixed iteration count. As another example, doing away with some dynamic dispatch / method lookup at a call site, or inlining methods, which is especially handy given Ruby's first-class support for dynamic code generation, execution, and redefinition (monkey patching).
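To illustrate (a made-up example, not from YJIT's docs): if a call site only ever sees one class, the JIT can skip the dynamic lookup and inline a trivial method like this outright:

    class Item
      def shippable?
        true   # trivial constant-returning method: an ideal inlining target
      end
    end

    # If every element is an Item, this call site stays monomorphic and the
    # JIT can replace method dispatch with a direct (or inlined) call.
    def count_shippable(items)
      items.count(&:shippable?)
    end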
> In particular, YJIT is now able to better handle calls with splats as well as optional parameters, it’s able to compile exception handlers, and it can handle megamorphic call sites and instance variable accesses without falling back to the interpreter.
> We’ve also implemented specialized inlined primitives for certain core method calls such as Integer#!=, String#!=, Kernel#block_given?, Kernel#is_a?, Kernel#instance_of?, Module#===, and more. It also inlines trivial Ruby methods that only return a constant value such as #blank? and specialized #present? from Rails. These can now be used without needing to perform expensive method calls in most cases.
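Concretely, that covers code like this hypothetical hot path (assumes ActiveSupport is loaded for #blank?); these calls can now be specialized or inlined rather than going through full method dispatch:

    def display_name(value)
      return "(none)" if value.blank?              # NilClass#blank? just returns true
      value.is_a?(String) ? value : value.inspect  # Kernel#is_a? is a specialized primitive
    end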
It makes Ruby code faster than the equivalent C code inside CRuby (the JIT can't optimize across calls into C), so they are moving toward rewriting a lot of the core Ruby internals in Ruby to take advantage of it. That kind of runtime performance work makes the language much faster.
Same as the benefits of JIT compilers for any dynamic language: it makes a lot of things faster without you changing your code, by turning hot paths into natively compiled code.
That's certainly not what I get out of what they said.
Shopify has introduced a bunch of very nice improvements to the usability of the Ruby language, and their contributions have been seen in a very positive light.
Also, I'm pretty sure Shopify's work on Ruby and Facebook's work on their custom PHP stuff are both considered good moves.
If I cannot refactor my services, I shall refactor Ruby instead.