it's just an example, but it's great to see smolagents in practice. I wonder how...

tptacek · 2025-02-04T20:21:42 1738700502

I know some of the point of this is running things locally, but for agent workflows like this some of this seems like a solved problem: just run it on a throwaway VM. There are lots of ways to do that quickly.

ATechGuy · 2025-02-04T20:30:41 1738701041

A VM is not the right abstraction because of performance and resource requirements. VMs are used because nothing else exists that provides the same or better isolation. Using a throwaway VM for each AI agent would be highly inefficient (think wasted compute and other resources, which is the opposite of what DeepSeek exemplified).

tptacek · 2025-02-04T20:59:34 1738702774

To which performance and resource requirements are you referring? A cloud VM runs as long as the agent runs, then stops running.

ATechGuy · 2025-02-04T22:00:27 1738706427

I mean the performance overhead of an OS process running in a VM (vs. no VM) and the additional resource requirements of running a VM, including memory and an additional kernel. You can pull relevant numbers from academic papers.

transpute · 2025-02-05T02:40:39 1738723239

A linear bar graph comparing compute/memory requirements?

  - OS process
  - virtual machine
  - LLM inference

Could have longevity as a PC master race meme template.

tptacek · 2025-02-05T01:36:16 1738719376

OK. Thanks for clarifying. I think you're pretty wrong on this one, for what it's worth.

vineyardmike · 2025-02-04T21:13:44 1738703624

Is “DeepSeek” going to be the new trendy way of saying “don’t be wasteful”? I don’t think DS is a good example here. Mostly because it’s a trendy thing, and the company still has $1B in capex spend to get there.

Firecracker has changed the nature of “VMs” into something cheap and easy to spin up and throw away while maintaining isolation. There’s no reason not to use it (besides complexity, I guess).

Besides, the entire rest of this is a Python notebook. With headless browsers. Using LLMs. This is entirely setting silicon on fire. The overhead from a VM is the least of the compute efficiency problems. Just hit a quick cloud API and run your Python or browser automation in isolation and move on.

ATechGuy · 2025-02-05T02:49:15 1738723755

I think you are assuming that inference happens on the same machine/VM that executes code generated by an AI agent.

tptacek · 2025-02-04T21:47:03 1738705623

I'm not even talking about Firecracker; for the duration of time things like these run, you could get a satisfactory UX with basic EC2.

cma · 2025-02-04T21:30:02 1738704602

The rise of captchas on regular content, no longer just for posting content, could ruin this. Cloudflare and other companies have set things up to let through a few hand-selected scrapers, and only they will be able to offer AI browsing and research services.

tptacek · 2025-02-04T21:48:29 1738705709

I think the opposite problem is going to occur with captchas, for whatever it's worth: LLMs are going to obsolete them. It's an arms race where the defender has a huge constraint the attacker doesn't (pissing off real users); in that way, it's kind of the opposite of the dynamic that password hashes exploit.

anon373839 · 2025-02-05T00:31:03 1738715463

I’m not sure about that. There’s a lot of runway left for obstacles that are easy for humans and hard/impossible for AI, such as direct manipulation puzzles. (AI models have latency that would be impossible to mask.) On the other hand, a11y needs do limit what can be lawfully deployed…

jgrahamc · 2025-02-05T08:37:45 1738744665

>There’s a lot of runway left for obstacles that are easy for humans and hard/impossible for AI, such as direct manipulation puzzles.

That's irrelevant. Humans totally hate CAPTCHAs and they are an accessibility and cultural nightmare. Just forget about them. Forget about making better ones, forget about what AI can and can't do. We moved on from CAPTCHAs for all those reasons. Everyone else needs to.

nerdralph · 2025-02-05T14:27:39 1738765659

Agreed. When I open a link and get a Cloudflare CAPTCHA I just close the tab.

jgrahamc · 2025-02-05T14:37:51 1738766271

We eliminated all CAPTCHA use at Cloudflare in September 2023: https://blog.cloudflare.com/turnstile-ga/

nerdralph · 2025-02-05T21:55:46 1738792546

OK, what you now call Turnstile. If I get one of those screens I just close the tab rather than wait several seconds for the algorithm to run and give me a green checkbox to proceed.

pixl97 · 2025-02-05T01:08:04 1738717684

>AI models have latency

So do humans, or can my friend with cerebral palsy not use the internet any longer?

anon373839 · 2025-02-05T03:01:05 1738724465

Totally different type of latency. A person with a motor disability dragging a puzzle piece with their finger will look very different from an AI model being called frame by frame.

pixl97 · 2025-02-05T16:51:32 1738774292

Round 2: Begin

https://www.youtube.com/watch?v=WqnXp6Saa8Y

anon373839 · 2025-02-06T03:53:27 1738814007

That's a great video. To be clear, I'm not defending Captchas - I just don't know if I believe they're dead yet.

cma · 2025-02-05T08:02:18 1738742538

Cloudflare is more than captchas; it's centralized monitoring of them too. What do you think happens when your research assistant solves 50 captchas in 5 min from your home IP? It has to slow down to human research speeds.

AznHisoka · 2025-02-04T21:41:25 1738705285

What about Cloudflare itself? It might constitute an abuse of sorts of their leadership position, but couldn’t they dominate the AI research/agent market if they wanted? (Or maybe that’s what you were implying too)