
And that says… what? The entire LLM technology is worthless for all applications, from all implementations?

A company I worked for spent millions on a customer service solution that never worked. I wouldn’t say that contracted software is useless.




I agree. I use LLMs heavily for gruntwork development tasks (porting shell scripts to Ansible is an example of something I just applied them to). For these purposes, they work well. LLMs excel in situations where you need repetitive, simple adjustments on a large scale, e.g. swap every postgres insert query with the corresponding mysql insert query.

A lot of the "LLMs are worthless" talk I see tends to follow this pattern:

1. Someone gets an idea, like feeding papers into an LLM, and asks it to do something beyond its scope and proper use-case.

2. The LLM, predictably, fails.

3. Users declare not that they misused the tool, but that the tool itself is fundamentally flawed.

In my mind it's no different from the steamroller being invented, and people remarking on how well it flattens asphalt. Then a vocal group tries to use this flattening device to iron clothing in bulk, and declares steamrollers useless when it fails at that task.


>swap every postgres insert query, with the corresponding mysql insert query.

If the data and relationships in those insert queries matter, at some unknown future date you may find yourself cursing your choice to use an LLM for this task. On the other hand you might not ever find out and just experience a faint sense of unease as to why your customers have quietly dropped your product.


I hope people do this and royally mess shit up.

Maybe then they’ll snap out of it.

I’ve already seen people completely mess things up. It’s hilarious. Someone who thinks they’re in “founder mode” and a “software engineer” because chatgpt or their cursor vomited out 800 lines of python code.


The vileness of hoping people suffer aside, anyone who doesn’t have adequate testing in place is going to fail regardless of whether bad code is written by LLMs or Real True Super Developers.


What vileness? These are people who are gleefully sidestepping things they don't understand and putting tech debt onto others.

I'd say maybe up to 5-10 years ago, there was an attitude of learning something to gain mastery of it.

Today, it seems like people want to skip levels which eventually leads to catastrophic failure. Might as well accelerate it so we can all collectively snap out of it.


The mentality you're replying to confuses me. Yes, people can mess things up pretty badly with AI. But I genuinely don't understand why people assume that anyone using AI is also skipping basic testing and code review.


Probably better to have AI help you write a script to translate postgres statements to mysql.
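
Something like this rough sketch, say (the statement patterns are mine, and it assumes identifier quoting, ON CONFLICT handling, and boolean literals are the only dialect differences that matter; anything messier wants a proper SQL parser):

    import re
    import sys

    def pg_insert_to_mysql(stmt: str) -> str:
        """Best-effort rewrite of a simple Postgres INSERT into MySQL syntax."""
        # Double-quoted identifiers -> backticks
        stmt = re.sub(r'"([A-Za-z_][A-Za-z0-9_]*)"', r'`\1`', stmt)
        # INSERT ... ON CONFLICT DO NOTHING -> INSERT IGNORE ...
        if re.search(r'ON CONFLICT(\s*\([^)]*\))?\s+DO NOTHING', stmt, re.I):
            stmt = re.sub(r'\s*ON CONFLICT(\s*\([^)]*\))?\s+DO NOTHING', '', stmt, flags=re.I)
            stmt = re.sub(r'^\s*INSERT\b', 'INSERT IGNORE', stmt, flags=re.I)
        # Postgres boolean literals -> MySQL 1/0
        stmt = re.sub(r'\bTRUE\b', '1', stmt, flags=re.I)
        stmt = re.sub(r'\bFALSE\b', '0', stmt, flags=re.I)
        return stmt

    if __name__ == "__main__":
        for line in sys.stdin:
            print(pg_insert_to_mysql(line.rstrip("\n")))

The nice part is you can diff a sample of the output against hand-converted statements, instead of trusting a model to have rewritten thousands of lines faithfully.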


Right, which is why you go back and validate code. I'm not sure why the automatic assumption that implementing AI in a workflow means you blindly accept the outputs. You run the tool, you validate the output, and you correct the output. This has been the process with every new engineering tool. I'm not sure why people assume first that AI is different, and second that people who use it are all operating like the lowest common denominator AI slop-shop.


In this analogy, are all the steamroller manufacturers loudly proclaiming how it 10xes the process of bulk ironing clothes?

And is a credulous executive class en masse buying into that steam roller industry marketing and the demos of a cadre of influencer vibe ironers who’ve never had to think about the longer term impacts of steam rolling clothes?


> porting shell scripts to Ansible

Thank you for mentioning that! What a great example of something an LLM can do pretty well that would otherwise take a lot of time looking up Ansible docs to figure out the best way to do things. I'm guessing the outputs aren't as good as what someone really familiar with Ansible could do, but it's a great place to start! It's such a good idea that it seems obvious in hindsight now :-)


Exactly, yeah. And once you look over the Ansible, it's a good place to start and expand. I'll often have it emit helm charts for me as templates; then, after the tedious setup of the helm chart is done, the rest of it is me manually doing the complex parts and customizing in depth.


Plus, it's a generic question; "give a helm chart for velero that does x y and z" is as proprietary as me doing a Google search for the same, so you're not giving proprietary source code to OpenAI/wherever so that's one fewer thing to worry about.


Yeah, I tend to agree. The main reason that I use AI for this sort of stuff is it also gives me something complete that I can then ask questions about, and refine myself. Rather than the fragmented documentation style "this specific line does this" without putting it in the context of the whole picture of a completed sample.

I'm not sure if it's a facet of my ADHD, or mild dyslexia, but I find reading documentation very hard. It's actually a wonder I've managed to learn as much as I have, given how hard it is for me to parse large amounts of text on a screen.

Having the ability to interact with a conversational type documentation system, then bullshit check it against the docs after is a game changer for me.


That's another thing! People are all "just read the documentation", but the documentation goes on and on about irrelevant details. How do people not see the difference between "do x with library" -> "code that does x", and having to read a bunch of documentation to write a snippet of code that does the same x?


I'm not sure I follow what you mean, but in general yes. I do find "just read the docs" to be a way to excuse not helping team members. Often docs are not great, and tribal knowledge is needed. If you're in a situation where you're either working on your own and have no access to that, or in a situation where you're limited by the team member's willingness to share, then AI is an OK alternative within limits.

Then there's also the issue that examples in documentation are often very contrived, and sometimes more confusing. So there's value in "work up this to do such and such an operation" sometimes. Then you can interrogate the functionality better.


No, it says that people dislike liars. If you are known for making up things constantly, you might have a harder time gaining trust, even if you're right this time.


All of these things can be true at the same time:

1. LLMs have been massively overhyped, including by some of the major players.

2. LLMs have significant problems and limitations.

3. LLMs can do some incredibly impressive things and can be profoundly useful for some applications.

I would go so far as to say that #2 and #3 are hardly even debatable at this point. Everyone acknowledges #2, and the only people I see denying #3 are people who either haven't investigated or are so annoyed by #1 that they're willing to sacrifice their credibility as an intellectually honest observer.


#3 can be true and yet not be enough to make your case. Many failed technologies achieved impressive engineering milestones. Even the harshest critic could probably brainstorm some niche applications for a hallucination machine or whatever.


And yet we keep electing them to public office.


It says that people need training on what the appropriate use-cases for LLMs are.

This is not the type of report I'd use an LLM to generate. I'd use a database or spreadsheet.

Blindly using and trusting LLMs is a massive minefield that users really don't take seriously. These mistakes are amusing, but eventually someone is going to use an LLM for something important and hallucinations are going to be deadly. Imagine a pilot or pharmacist using an LLM to make decisions.

Some information needs to come from authoritative sources in an unmodified format.


If it makes data up, then it is worthless for all implementations. I'd rather it said "I don't have info on this question."


It only makes it worthless for implementations where you require data. There's a universe of LLM use cases that aren't asking ChatGPT to write a report or using it as a Google replacement.


The problem is that, yes, LLMs are great when you're working on some regular thing for the first time. You can get started at a speed never before seen in the tech world.

But as soon as your use case goes beyond that LLMs are almost useless.

The main complaint is that, while it's extremely helpful in that specific subset of problems, it's not actually pushing human knowledge forward. Nothing novel is being created with it.

It has created this illusion of being extremely helpful when in reality it is a shallow kind of help.


> If it makes data up, then it is worthless for all implementations.

Not true. It's only worthless for the things you can't easily verify. If you have a test for a function and ask an LLM to generate the function, it's very easy to say whether it succeeded or not.

In some cases, just being able to generate the function with the right types will mostly mean the LLM's solution is correct. Want a `List(Maybe a) -> Maybe(List(a))`? There's a very good chance an LLM will either write the right function or fail the type check.
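
For anyone who doesn't read Haskell-ish types, that signature is roughly this (a Python sketch; the name `sequence` and using `Optional` for `Maybe` are just my rendering):

    from typing import Optional, TypeVar

    T = TypeVar("T")

    def sequence(xs: list[Optional[T]]) -> Optional[list[T]]:
        """Return the unwrapped list if every element is present, else None."""
        out: list[T] = []
        for x in xs:
            if x is None:
                return None
            out.append(x)
        return out

A type checker, or a test as simple as `sequence([1, None]) is None`, catches most wrong attempts, which is the point: when the output is cheap to verify, hallucination matters much less.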


> all implementations

Are you speaking for yourself or everyone?


Does “it” apply to Homo sapiens as well?


Except value isn't polarised like that.

In a research context, it provides pointers, and keywords for further investigation. In a report-writing context it provides textual content.

Neither of these, nor the thousand other uses, is worthless. It's when you expect a working and complete work product that it's (subjectively, maybe) worthless, but frankly, aiming for that with current-gen technology is a fool's errand.


It says we don't have a lower bound on the effectiveness.

It's (currently) like an ad saying "this product can improve your stuff up to 300%"


It mostly says that one of the seriously difficult challenges with LLMs is a meta-challenge:

* LLMs are dangerously useless for certain domains.

* ... but can be quite useful for others.

* The real problem is: They make it real tricky to tell, because most of all they are trained to sound professional and authoritative. They hallucinate papers because that's what authoritative answers look like.

That already means I think LLMs are far less useful than they appear to be. It doesn't matter how amazing a technology is: If it has failure modes and it is very difficult to know what they are, it's dangerous technology no matter how awesome it is when it is working well. It's far simpler to deal with tech that has failure modes but you know about them / once things start failing it's easy to notice.

Add to it the incessant hype, and, oh boy. I am not at all surprised that LLMs have a ridiculously wide range as to detractors/supporters. Supporters of it hype the everloving fuck out of it, and that hype can easily seem justified due to how LLMs can produce conversational, authoritative sounding answers that are explicitly designed to make your human brain go: Wow, this is a great answer!

... but experts read it and can see the problems there. Lots of tech suffers from this; as a random example, plenty of highly upvoted, apparently fantastically written Stack Overflow answers have problems. For example, it's a great answer... for 10 years ago; it's a bad idea today because the answer has been obsoleted.

But between the fact that it's overhyped and the fact that it's particularly hard to tell when an LLM answer is hallucinated drivel, it's logical to me that experts are hyperbolic when highlighting the problems. That's a natural reaction when you have a thing that SEEMS amazing but actually isn't.


> Stack Overflow answers have problems. For example, it's a great answer... for 10 years ago

To be fair, that's a huge problem with stack overflow and its culture. A better version of stack overflow wouldn't have that particular issue.


You, and the OP, are being unfair in your replies. Obviously it's not worthless for all applications, but when LLMs fail in disastrous ways in some important areas, you can't refute that by going "actually it gives me coding advice and generates images".

That's nice and impressive, but there are still important issues and shortcomings. Obligatory, semi-related xkcd: https://xkcd.com/937/


> And that says… what? The entire LLM technology is worthless for all applications, from all implementations?

You're the first in the thread to have brought that up; there are far more charitable ways to have interpreted the post you're replying to.


That software just didn't work that way. I don't think it tried to convince the users that they were wrong by spouting nonsense that seems legitimate.



