
It's just an example, but it's great to see smolagents in practice. I wonder how well the import-whitelist approach works for code interpreter security.
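
For anyone curious, the whitelist is exposed as a constructor argument on the agent. A rough sketch (the extra modules, the model choice, and the prompt below are illustrative assumptions, not anything from the article):

  # Hedged sketch: CodeAgent's local Python executor only runs whitelisted imports.
  from smolagents import CodeAgent, HfApiModel

  agent = CodeAgent(
      tools=[],
      model=HfApiModel(),  # any supported model works; this choice is illustrative
      additional_authorized_imports=["requests", "bs4"],  # extends the default whitelist
  )

  # Generated code that imports anything outside the whitelist should be rejected
  # by the interpreter rather than executed.
  agent.run("Fetch https://example.com and report the page title")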



I know part of the point of this is running things locally, but for agent workflows like this, some of it seems like a solved problem: just run it on a throwaway VM. There are lots of ways to do that quickly.


A VM is not the right abstraction because of performance and resource requirements. VMs are used because nothing else exists that provides the same or better isolation. Using a throwaway VM for each AI agent would be highly inefficient (think wasted compute and other resources, which is the opposite of what DeepSeek exemplified).


To which performance and resource requirements are you referring? A cloud VM runs as long as the agent runs, then stops running.


I mean the performance overhead of an OS process running in a VM (vs. no VM) and the additional resource requirements of running a VM, including memory and an additional kernel. You can pull relevant numbers from academic papers.


A linear bar graph comparing compute/memory requirements?

  - OS process
  - virtual machine
  - LLM inference
Could have longevity as a PC master race meme template.


OK. Thanks for clarifying. I think you're pretty wrong on this one, for what it's worth.


Is “DeepSeek” going to be the new trendy way of saying “don’t be wasteful”? I don’t think DS is a good example here, mostly because it’s a trendy thing and the company still has $1B in capex spend to get there.

Firecracker has changed the nature of “VMs” into something cheap and easy to spin up and throw away while maintaining isolation. There’s no reason not to use it (besides complexity, I guess).

Besides, the entire rest of this is a Python notebook. With headless browsers. Using LLMs. This is entirely setting silicon on fire. The overhead from a VM is the least of the compute-efficiency problems. Just hit a quick cloud API, run your Python or browser automation in isolation, and move on.
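
To the Firecracker point above, driving it really is just a few calls against its API socket. A minimal sketch, assuming a firecracker process is already listening on /tmp/fc.sock and the kernel/rootfs paths exist (the socket path, file paths, and the third-party requests-unixsocket helper are all assumptions here):

  # Boot a throwaway microVM over Firecracker's HTTP-over-unix-socket API.
  import requests_unixsocket

  session = requests_unixsocket.Session()
  api = "http+unix://%2Ftmp%2Ffc.sock"  # percent-encoded path to the API socket

  session.put(f"{api}/machine-config", json={"vcpu_count": 1, "mem_size_mib": 256})
  session.put(f"{api}/boot-source", json={
      "kernel_image_path": "vmlinux",
      "boot_args": "console=ttyS0 reboot=k panic=1",
  })
  session.put(f"{api}/drives/rootfs", json={
      "drive_id": "rootfs",
      "path_on_host": "rootfs.ext4",  # per-run copy, deleted afterwards
      "is_root_device": True,
      "is_read_only": False,
  })
  session.put(f"{api}/actions", json={"action_type": "InstanceStart"})
  # Tear-down is just killing the firecracker process and deleting the rootfs copy.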


I think you are assuming that inference happens on the same machine/VM that executes code generated by an AI agent.


I'm not even talking about Firecracker; for as long as things like these run, you could get a satisfactory UX with basic EC2.
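
e.g. with boto3, roughly this is the whole lifecycle for a disposable instance per agent run (the AMI ID, instance type, and the SSH/SSM step are placeholders, and credentials/region are assumed to be configured):

  # Hedged sketch: one throwaway EC2 instance per agent run.
  import boto3

  ec2 = boto3.resource("ec2")
  instance = ec2.create_instances(
      ImageId="ami-xxxxxxxxxxxxxxxxx",  # placeholder AMI
      InstanceType="t3.micro",          # placeholder size
      MinCount=1,
      MaxCount=1,
  )[0]
  instance.wait_until_running()

  # ... ship the agent's code to the instance (SSH/SSM) and let it run ...

  instance.terminate()
  instance.wait_until_terminated()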


The rise of captchas on regular content, no longer just for posting content, could ruin this. Cloudflare and other companies have set things up so that crawling goes through a few hand-selected scrapers, and only they will be able to offer AI browsing and research services.


I think the opposite problem is going to occur with captchas, for whatever it's worth: LLMs are going to obsolete them. It's an arms race where the defender has a huge constraint the attacker doesn't (pissing off real users); in that way, it's kind of the opposite of the dynamic that password hashes exploit.


I’m not sure about that. There’s a lot of runway left for obstacles that are easy for humans and hard/impossible for AI, such as direct manipulation puzzles. (AI models have latency that would be impossible to mask.) On the other hand, a11y needs do limit what can be lawfully deployed…


> There’s a lot of runway left for obstacles that are easy for humans and hard/impossible for AI, such as direct manipulation puzzles.

That's irrelevant. Humans totally hate CAPTCHAs and they are an accessibility and cultural nightmare. Just forget about them. Forget about making better ones, forget about what AI can and can't do. We moved on from CAPTCHAs for all those reasons. Everyone else needs to.


Agreed. When I open a link and get a Cloudflare CAPTCHA I just close the tab.


We eliminated all CAPTCHA use at Cloudflare in September 2023: https://blog.cloudflare.com/turnstile-ga/


OK, what you now call Turnstile. If I get one of those screens I just close the tab rather than wait several seconds for the algorithm to run and give me a green checkbox to proceed.


>AI models have latency

So do humans, or can my friend with cerebral palsy not use the internet any longer?


Totally different type of latency. A person with a motor disability dragging a puzzle piece with their finger will look very different from an AI model being called frame by frame.



That's a great video. To be clear, I'm not defending Captchas - I just don't know if I believe they're dead yet.


Cloudflare is more than captchas; it's centralized monitoring of them too. What do you think happens when your research assistant solves 50 captchas in 5 minutes from your home IP? It has to slow down to human research speeds.


What about Cloudflare itself? It might be something of an abuse of their leadership position, but couldn’t they dominate the AI research/agent market if they wanted to? (Or maybe that’s what you were implying too.)



