This will synchronously block until ‘chat.ask’ returns, though. Be prepared to pay for tens to low hundreds of MB of your whole app's memory being held alive, doing nothing other than handling new chunks, until whatever streaming API this uses under the hood has finished streaming.
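A minimal sketch of that blocking shape (the `Chat` class and its `ask` method here are hypothetical stand-ins for the library, with `sleep` simulating network latency):

```ruby
# Hypothetical stand-in for a streaming client: `ask` yields chunks
# and only returns once the stream is exhausted, so the calling
# thread (and every object it keeps reachable) is pinned until then.
class Chat
  def ask(prompt)
    %w[Hello world].each do |chunk|
      sleep 0.01 # stand-in for waiting on the network
      yield chunk
    end
    :done
  end
end

chunks = []
result = Chat.new.ask("hi") { |c| chunks << c } # blocks until the stream ends
puts chunks.join(" ")
```

Everything the caller holds stays reachable for the whole duration of the stream, which is where the memory cost comes from.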



Threads?


Rails is a hot ball of global mutable state. Good luck with threads.


The default Rails application server is Puma, and it uses threads.


Yes, it does. Ruby has a global interpreter lock (GIL) that prevents multiple threads from executing Ruby code at the same time. So Puma does have threads; they just can’t run Ruby code in parallel. They can hide IO, though.
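A quick way to see the first half of that claim (a sketch on CRuby; the exact timings are illustrative): CPU-bound pure-Ruby work gains nothing from threads, because the GIL serializes them.

```ruby
require 'benchmark'

# CPU-bound work: pure Ruby bytecode, so the GIL serializes it.
work = -> { 3_000_000.times { |i| i * i } }

serial   = Benchmark.realtime { 2.times { work.call } }
threaded = Benchmark.realtime do
  2.times.map { Thread.new(&work) }.each(&:join)
end

# Under the GIL the threaded version takes roughly as long as the
# serial one; the two threads never run this Ruby code simultaneously.
puts format("serial: %.2fs, threaded: %.2fs", serial, threaded)
```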


The GIL is released during common IO operations, like the HTTP requests that power LLM communication.
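This is easy to demonstrate without a network: `sleep` releases the GIL the same way a blocking HTTP read does, so it serves as a stand-in here.

```ruby
# During blocking IO, CRuby releases the GIL, so threads overlap.
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
4.times.map { Thread.new { sleep 0.1 } }.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

# Four 0.1s waits overlap into roughly 0.1s total, not 0.4s.
puts format("elapsed: %.2fs", elapsed)
```

The same overlap applies to Puma threads blocked on an upstream LLM response: they wait concurrently, even though they can't execute Ruby concurrently.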


The Rails documentation has lots of info about this: https://guides.rubyonrails.org/tuning_performance_for_deploy...

Concurrency support is missing from the language syntax, and from this particular library as a concept. This is by design, so as not to distract from beautiful code. Your request will make zero progress and hold memory while waiting for the LLM answer. Other threads might make progress on other requests, but in real-world deployments that is a handful (<10). This server will handle tens of requests per second where something written in JS or Go would handle many thousands.
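The "handful" figure maps directly onto a default Puma configuration, something like this sketch of a `config/puma.rb` (values here are the common defaults, not anything specific to this app):

```ruby
# config/puma.rb (sketch). Each long-lived streaming request pins one
# thread for its entire duration, so with 5 threads per worker, five
# concurrent LLM streams fully saturate a worker process.
threads 5, 5
workers 2
preload_app!
```

Long-lived streaming responses are close to the worst case for this model: a request that takes 30 seconds to stream occupies a slot that could otherwise have served hundreds of short requests.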

It’s amazing how the Ruby community argues against their own docs and doesn’t acknowledge the design choices their language creators have made.



