> Like Homer Simpson's button pressing birdie toy? :smackshead:
This comparison is especially apt, given that one of the main use-cases for LLMs is the same kind of... well, fraud: To give the illusion that you did the work of understanding or reviewing something, but actually just (smart-)phoning it in.
In one Apple iPhone advertisement, a famous actor is asked by their agent what they think of a script. They didn't read it, so they ask the LLM assistant to sum it up in a couple of sentences... and then they tell their agent it sounds good.
I think my quip about the toy flew over a lot of heads, so I appreciate that someone got it.
The reality is that most applications and websites don’t expose enough context about what you’re actually doing for AIs to meaningfully infer, from natural language, the steps required to complete a given task.
We humans are very good at filling in the blanks based on whether we’re working in Photoshop or VS Code or Excel. We infer a lot of context from the specific files we’re working on, the particular client, the files’ organization within the file system, or even what month or day it is.
I am skeptical that models will be able to replicate a complex workflow when there’s very little in the way of labels and UI controls even visible.
I know a weekly spreadsheet from a monthly and quarterly, etc. I know the minutiae about which options to use to generate the specific source reports, etc.
Workflows can be quite complex, no matter your role.
I mean I can just see it now: gift receipts being sent to the recipient before their birthday, internal draft proposals prematurely sent to clients, mixing up clients or commingling their data, overwriting or losing data; this whole thing just screams disaster. And I’m not even thinking about people involved with safety, or finance, or legal/regulatory, or medical. Law enforcement?
This kind of thing can be done properly with well defined interfaces, common standards, and reasonable and prudent guardrails.
But it won’t be. It’ll be YOLOed on a paper thin training budget and it’ll be like your own little personal chaos monkey on ketamine.
> I am skeptical that models will be able to replicate a complex workflow when there’s very little in the way of labels and UI controls even visible.
Also, at least from the perspective of internal business software, a significant part of it is trying to get people to know what they're doing. There's a ___domain model being taught at the same time, and it's institutionally important that they're aware of what they're agreeing to. Together this tends to lead to an arrangement of multiple screens, confirmation boxes, etc.
Many individuals instinctively dislike this, and it'll be one of their first choices for "let my LLM assistant do it."
> I mean I can just see it now
Before these LLMs, I felt like Idiocracy had become politically prescient, but now it feels like I actually see a technology that could enable it.
Brawndo is coming: it's got electrolytes and IT'S WHAT PLANTS CRAVE!
Life imitates art indeed.
I am sympathetic to wanting to automate complex workflows. Hell, I'm sympathetic to wanting to automate simple workflows. In fact, I bitch about the stupidity of the things I do at least once a week (no, you see, I take the numbers that show on this monitor, and I type them into a box on that monitor; why no cut & paste? faster to re-type the numbers; sigh).
But people provide context. Sure, an AI might tell you utility costs were up last quarter, but they won't know it was because of a water leak that went unnoticed and tripled the bill. Or it will tell you that wages were up, but not that it was because Bill from Operations had hernia surgery and we had to bring on a temp for 2 months. And it certainly won't tell you that Jim's back on the sauce, so we should probably begin putting out feelers for a new salesman.
So much of what business does is tracking metrics, yes, but the numbers never tell the whole story. There's always a backstory. Things that just can't be captured in raw data and hence can't be summarized by an AI. And AIs can't keep the ship sailing. Every small business has the guy/girl that does all the little things for everyone that absolutely holds the whole damn thing together. I'm not a BigCorp guy, but I imagine most departments are similar.
How about customer feedback? How can a model distill valuable (actionable) meaning from disparate communication mediums other than superficial high-level conclusions?
Expectations are just not realistic right now. There's going to be a lot of disappointment.