lowsong's comments

I'm astounded that anyone even considered permitting LLM-generated code in a browser engine, let alone a technical committee. Why would you risk using a system that's known to generate logic errors and bugs in a codebase where security and correctness are the highest priorities?


Frankly, I just assumed AI generated code would be treated with the same suspicion as human code and reviewed.


Reviewing code is harder than writing it.

When _writing code_, you achieve a certain level of understanding because you fundamentally need to make all the decisions yourself (and some of them are informed by factual claims you look up).

When _reading code_, a lot of those decision points are easy to miss or take for granted, which means you don't even notice there were alternatives. You also don't look up the underlying factual claims yourself, so you end up with a shallower understanding and no opportunity to check whether those claims are actually true; decisions made on false assumptions slip into the codebase.

Finally, reviewing code (to the same level of depth) is significantly more mentally taxing.


To be honest, the whole premise that AI code needs to be banned sounds like a histrionic caricature to me, so I might not be in the right mindset to accept this, but this feels a bit histrionic too. Maybe it holds in a vacuum: in a codebase, language, and functionality I'm not familiar with, or if I were too inexperienced to be diligent, or if I didn't bother with tests. Or maybe I'm just old, and those seem like necessary preconditions even though I'd have merrily ignored them 12 years ago, and working on my own now predisposes me to being happy with the work.


> … I believe our community should embrace it sooner rather than later — but like all tools and practices, with the right perspective and a measured approach.

There is no such thing as a measured approach. You can either use LLM agents to abdicate your intellectual honesty and produce slop, or you can refuse their use.


This is grossly misogynistic and insulting. It's the same old idea that women "just want to settle down and find a nice man" dressed up with some graphs. The author has even managed to make it about men in the end.


Nearly two years on from this article's publication, 'AI' tools are still as useless as ever for productive work.


In the time since this publication, the Nobel Prize in Chemistry was awarded to an ML Engineer & AI researcher.

While it's accurate that there's a lot of AI slop out there, it's also a reasonable assertion that working at the frontier of AI & ML research can be a worthwhile use of your life.


Curious to hear what kind of work you do, because there are definitely fields where productivity was 10x'd because of AI tools.


This has definitely been true for my work. LLMs have absolutely been useful; I even forked an IDE (Zed) to add my own custom copilot to leverage a deeper integration for my work.

But even if we consider AI beyond just NLP, there's been so much ML you can apply to other more banal day to day tasks. In my org's case, one of the big ones was anomaly detection and fault localization in aggregate network telemetry data. Worked far better than conventional statistical modeling.
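
To make that concrete, here's a minimal sketch of the general shape, using scikit-learn's IsolationForest on synthetic per-interval byte counts. Our actual models and features were different; every name here is made up for illustration:

    # Hypothetical telemetry: mostly normal traffic plus a few spikes.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    normal = rng.normal(loc=1000, scale=50, size=(500, 1))
    spikes = rng.normal(loc=5000, scale=200, size=(5, 1))
    telemetry = np.vstack([normal, spikes])

    # Unsupervised model: flags intervals whose counts look isolated.
    model = IsolationForest(contamination=0.01, random_state=0)
    labels = model.fit_predict(telemetry)  # -1 marks an anomaly

    print("anomalous intervals:", np.where(labels == -1)[0])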

I usually assume the detractors are working backwards from a caricature of "AI tools", one that's often nothing more than a canard aimed at the folks who are actually using AI tooling successfully in their work.


Please give us a source for the 10x.


We never have any proof or source for that, and when we have one (like the Devin thing) it’s always a ridiculous project in JS with a hundred lines of code that I could write in one day.

Give me some refactoring in a C++ code base with 100k lines of code, and we’ll be able to talk.


also curious to hear which fields these are


Not management. AI hasn't learned to play golf for them yet.


Anything involving tools you're not an expert with. If you know how to do things and only use one specific language or framework, there is nothing to use AI for.


being 10x more productivie with ai doesn't mean that someone is now a full engineer. ^^


No, but if "productivie" becomes one of the variable names eventually used in legacy work . . . full engineer achieved!


Yep, I think this is definitely the case for e.g. software engineering.


This whole area is so drenched in bullshit, it's no wonder that the generation of BS and fluff is still the most productive use. Just nothing where reliable facts matter. I do believe that machines can vomit crap 10x as fast as humans.


Seriously? 10x? Either those were cushy jobs or you oversold your claim.


For some jobs, it's more.

I had to sign a 140-page contract of foreign-language legalese. Mostly boilerplate, but I had specific questions about it.

Asking questions to an AI to get the specific page answering it meant I could do the job in 2 hours. Without an AI, it would have taken me 2 days.

For programming, it's very good at creating boilerplate, tests, docs, generic API endpoints, script argument parsing, script one-liners, etc. Basically anything where I, as a human, don't add much value.

It's much faster to generate imperfect things with AI and fix them than to write them myself when there is a lot of volume.

It's also pretty good at fixing typos, translating, giving word definitions, and so on. If you are already in the chat, there's no need to switch to a dedicated tool.

I don't personally get 10x on average (although on specific well-suited tasks I can), but I do get a good 3x on a regular basis.


But what you're doing isn't a real job. Who hands someone who doesn't speak the language a contract to sign? Don't you have a legal department that does this job for you, with people who are specialists in it?

Also, what are you going to do if the AI answered inaccurately and you signed a contract that says something different than what you thought?


Similar to "real work," a "real job" is a definitional problem that attacks people's senses of self.


I am actually pretty sure that the thing described literally isn't a real job, at least not working for a serious employer. I can't imagine a company telling someone to sign contracts in a language they can't speak and to somehow make sense of them.

Either it's their own company and they're doing something unwise, they're doing it without the knowledge of their superiors, or their company shouldn't be trusted with anything.

The point was that "AI helps me translate the contracts I want to sign" isn't a good example of "AI increases my productivity", because that's not something you should ever do.


You are confusing not being able to do something well with being able to do it quickly.


But you shouldn't do some stuff you can't do properly at all, not quickly and not slowly. As a layman, you can't sign a contract in a language you don't speak, even if you have a whole year, unless you can become more-than-fluent in that language in a single year. That's just not something you should do, and the AI isn't reliable enough to help you with it. That's what a legal department is for.


I would never in my whole life sign anything in a foreign language that I don’t understand. It’s the perfect example of what AI is: let’s do anything that looks like a job well done and fuck it. That is not convincing. It’s suicidal.


I don't know that I'd say _useless_ but there's _a long way_ to go. I also suspect that many developers are going to wind up being dependent upon them at the expense of getting familiar with reading code/documentation, exploring with debuggers, writing forum/blog posts, etc.

Here is one example from the last time I asked Claude a question about message filtering on AWS SQS, which is a very common kind of exchange in my (relatively limited) experience with these tools.

# me responding to a suggestion to use a feature which I was pretty sure did not exist

> ... and you're sure this strategy works with SQS FIFO queues?

# Claude apologizing for hallucinating

> I apologize - I need to correct my previous responses. I made a mistake - message filtering with message attributes is NOT supported with FIFO queues. This is an important limitation of FIFO queues.

If I didn't already have familiarity with this service and its feature set, I would have wasted 15/30/60 mins trying to validate the suggestion.

I can't imagine trying to debug code that was automatically generated and inserted into my toolchain using Copilot or whatever else. Again, I'm sure this will all get better but I'm not convinced that any of these tools should be the first ones we reach for.


> Nearly two years on from this article's publication, 'AI' tools are still as useless as ever for productive work.

Fully disagree. When developing, I use LLMs all the time. It's far quicker for me to ask an LLM to build a React component than to type it out and have to remember all the little nuances of a technology I use a couple of times a year.

When prototyping things or doing quick one-offs, using an LLM makes me 10x more productive.


OK, but React development is one of the simplest development tasks out there. It's probably the most popular UI framework, with the most existing solutions for the LLM to generate from. Low-hanging fruit that also provides the least value to companies, since React developers are a dime a dozen.

Also, one could argue that the LLM provides less value, since a simple Google search for most React questions can produce a very viable answer; 90% of what most React developers do has already been done and shared on the Internet.


ChatGPT is really good at small throwaway scripts to accomplish some task that could otherwise have taken a good portion of the day to write (it almost always gets the code right on the first try).


It can add usability features pretty well too. "Look at the variables in the script and add a feature to accept them as command line arguments and environment variables. Make the CONTENT variable mandatory."
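
A minimal sketch of what that prompt might yield, with hypothetical variable names (any real script would differ):

    # Flags that fall back to environment variables; CONTENT is mandatory.
    import argparse
    import os

    parser = argparse.ArgumentParser()
    parser.add_argument("--content", default=os.environ.get("CONTENT"))
    parser.add_argument("--output-dir", default=os.environ.get("OUTPUT_DIR", "."))
    args = parser.parse_args()

    if args.content is None:
        parser.error("--content is required (or set the CONTENT env var)")

    print(f"content={args.content!r}, output_dir={args.output_dir!r}")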

They are also pretty good at software translations nowadays. You can give them a yaml file and they know what's a variable or placeholder and what's text to be translated.


They can be incredibly useful for many kinds of knowledge work. I've been much more productive using LLMs than before. I'm at the point where I don't know how I ever managed without them.


Can you even call it "knowledge work" then?


The difference is that before LLMs you had to use search to find answers to unknowns. The productivity boost of knowing everything immediately doesn't fundamentally change the work itself. I understand over-reliance might become problematic at some point; my solution to that would be a bunch of GPUs so I have an R1 locally, anytime. The future is here, some people just haven't noticed.


The article isn't about AI tools. He's saying that people should consider becoming engineers working on building AI training systems.


Still remarkable, though, that they didn't mean a prompt guesser when they said "ML engineer".


They don't help you with the non-productive parts, freeing up more of your time for productive work?


After crypto-bros, we now have AI-deniers


> …it is pretty sad since this is not a matter of taste or political inclination: there must be a single truth

This is a more salient point than you perhaps realized. In life there is no single, knowable, absolute truth. Philosophy has spent the entire span of human existence grappling with this topic. The real risk with AI is not that we build some humanity-destroying AGI, but that we build a machine that is 'convincing enough', and the most worrying part is the idea that such a machine would be built by people who believe in objective truth.


Depends: if you're a realist [1] (like most people), then there can be such a thing as absolute truth, even if you may not always be able to access it.

[1] https://en.wikipedia.org/wiki/Philosophical_realism?wprov=sf...


This is teleologically false.

A teleological argument that assumes truth is contingent upon a specific worldview would indeed be flawed, because it would make truth an artifact of a given perspective rather than something independent of it.


It's nice to see someone that gets it, and can explain why all of these generative-AI tools are completely pointless so well.


I'm used to certs in Kubernetes, so even 6 days is long-lived. 20 minutes is more like it.


Doesn't that run into their rate limits if you generate a certificate every few minutes, all the time? Or at least it might be a burden, even if it doesn't hit an absolute limit. (I'm assuming you're not the only person in the world doing this, so I mostly mean the collective effect this sort of usage pattern would have.)


Sorry, I should have clarified. You can't do certificates that fast on Let's Encrypt no. I meant running a custom CA inside/alongside Kubernetes, and using that to issue 20-minute validity certs to pods.
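
The mechanics are nothing exotic; here's a minimal sketch using Python's 'cryptography' package with a freshly self-signed CA. Our real setup is an in-cluster issuer, and all the names here are made up:

    import datetime
    from cryptography import x509
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec
    from cryptography.x509.oid import NameOID

    now = datetime.datetime.now(datetime.timezone.utc)

    # Long-lived, self-signed CA (in reality this lives in the cluster).
    ca_key = ec.generate_private_key(ec.SECP256R1())
    ca_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "demo-cluster-ca")])
    ca_cert = (
        x509.CertificateBuilder()
        .subject_name(ca_name)
        .issuer_name(ca_name)
        .public_key(ca_key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=365))
        .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
        .sign(ca_key, hashes.SHA256())
    )

    # Short-lived leaf cert for a pod: valid for only 20 minutes.
    pod_key = ec.generate_private_key(ec.SECP256R1())
    pod_cert = (
        x509.CertificateBuilder()
        .subject_name(
            x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "my-pod.my-ns.svc")])
        )
        .issuer_name(ca_cert.subject)
        .public_key(pod_key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(minutes=20))
        .add_extension(
            x509.SubjectAlternativeName([x509.DNSName("my-pod.my-ns.svc")]),
            critical=False,
        )
        .sign(ca_key, hashes.SHA256())
    )

    print(pod_cert.not_valid_after - pod_cert.not_valid_before)  # 0:20:00

Rotation is then just re-issuing on a timer and swapping the files the pod reads.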


It's often very difficult to get ___domain names in large orgs, but very easy to get public IPs. Getting a static IP and assigning it to a cloud LB in AWS or Google Cloud can be as easy as a couple of button clicks. Domain names usually require choosing a name (without picking one that reveals internal project details), then convincing someone with budget to buy the ___domain, and then someone has to manage it forevermore. For quick demos or simple environments, it's easier to just get a static IP and use that.


Yeah, so what happens when that IP is recycled in the cloud?


I am baffled as to why anyone would want this, or anything like it.

Sure, it's a cool technical demo. But... the point of D&D is a social game played with others. Even if you play a D&D-like video game such as Baldur's Gate 3 (which I'd argue is a fundamentally different experience to playing a tabletop RPG anyway), you're experiencing a world and a story that someone else has crafted.

What's the point of replacing that social interaction, or that connection to another person's creative vision through their art, with an LLM? What value can it ever provide?


I would love a fully generated D&D campaign where I just input some parameters, like a writing prompt, and get a tailored LLM Dungeon Master organizing it. Whether it's the result of someone's creative vision or not is irrelevant to me.


I imagine it's for people like me who don't have anyone nearby willing to play D&D.


Right, but why not play BG3? You know that an LLM is not going to provide any meaningful narrative, or craft a thought provoking story.


> You know that an LLM is not going to provide any meaningful narrative, or craft a thought provoking story

The title of the paper is "Exploring the POTENTIAL of LLM-based agents..." Right now, yeah, just play BG3. But the context of the discussion is its potential, which is certainly worth discussing imo.


I've been in a weekly D&D game for over a decade. I don't keep playing it because of the game of D&D (certainly not, I could rant for a long time about the game's myriad shortcomings across many editions). I play it because of the friends I play it with, because of the social connection that exists in and around the game.

To play a tabletop RPG without that isn't the same thing, and to play with an LLM is bankrupt of meaning. You'd be better off picking up one of those old-school choose-your-own-adventure books, or a video game like BG3. At least those were written by someone with intent; at least there's meaning in the story and setting. Or finding an online group.


People play games for a variety of reasons. While you enjoy the social aspect, others may find that to be the worst aspect of the game (or just don't have a social circle).

I grew up playing D&D alone in my room using LEGO. It was great, and I still have some of the sets I built as dungeons and monsters, modeled on reference pictures in the Monster Manual.

Video games are different from solo role playing. It's like trying to substitute shooting hoops in the driveway with NBA '24.


Would it be fair to say that you also don't appreciate solo rpgs?


Find someone! Make someone! Go online!

Dear god, it's crucial that adults be capable of creating and maintaining relationships. If it's impossible then, well, that's hell on earth.

Go out and make a new relationship! Do you live in Antarctica? And, if you live somewhere where new relationships can't be made, my god, get out of there! Go, go now! Go somewhere you're allowed to be human!


I am not interested in this, but an AI DM that allows human players to play would keep the social aspect. Since being a DM demands a higher threshold of knowledge and time investment, this could unblock a social gathering rather than substitute for it.

Also, if it doesn't try to hold the story to strict guardrails, the creativity and story creation would come essentially from the human players.

I don’t think those arguments are enough to make me want to use it, but I see the point and possible interesting use.


> the point of D&D is a social game played with others

That's one of the points, or a benefit maybe, but it's not the WHOLE point. Some people just enjoy RP and fantasy worlds, so a single-player campaign would still be enjoyable (to me, at least, but maybe I'm weird). I def don't see this replacing DMs in real groups any time soon.


I've found chatgpt useful for trying out D&D settings that I don't normally get to experience in my usual group. It loses the plot after a while, but it's enough for me to figure out whether I dig the setting enough that I would want to pitch it for a real game.


There's a saying in DnD:

"Bad DnD is worse than no DnD"

I'll agree. I've had bad groups and, wow, yeah. I'd rather have just doom scrolled my phone for 4 hours.

So, an AIDnD is wonky now, but it's better than bad DnD (at least with some futzing about that I have done myself).

To really reach here: I think you're placing value on the thing due to the effort involved. That's kind of an old-school Marxist way of assigning prices to goods. I'm trying to point out that the value isn't that way for everyone. The 'new' market way of assigning prices is 'whatever someone will pay for it'. And in that very tortured analogy, AIDnD is just as valid if someone likes it enough to do it. The background effort isn't necessarily part of what determines the value.


Last week I had to caution a junior engineer on my team to only use an LLM for the first pass, and never rely on the output unmoderated.

They're fine as glorified autocomplete, fuzzy search, or in other applications where accuracy isn't required. But to rely on them in any situation where accuracy is important is professional negligence.

