
Like all stubborn anti-AI know-it-alls, you sound like you’ve tried a couple of times to do something and have decided to tar all LLMs with the same brush.

What models have you tried, and what are you trying to do with them? Give us an example prompt too, so we can see how you’re coaxing it and rule out a skill issue.

And a big strength LLMs have is summarizing things - I’d like to see you summarize the latest 10 arxiv papers relating to prompt engineering and produce a report geared towards non-techies. And do this every 30 mins please. Also produce social media threads with that info. Is this a task you could do yourself, better than LLMs?




> And a big strength LLMs have is summarizing things - I’d like to see you summarize the latest 10 arxiv papers relating to prompt engineering and produce a report geared towards non-techies. And do this every 30 mins please. Also produce social media threads with that info. Is this a task you could do yourself, better than LLMs?

Right, but this is the part that is silly and sort of disingenuous, and I think it's built on a weird understanding of value and productivity.

Doing more, constantly, isn't inherently valuable. If one human writes a magnificently crafted summary of those papers once and it is promulgated effectively across channels, that is both better and more economical than having an LLM compute one (slightly incorrect) summary for each individual on demand. In fact, all the LLM does in this case is increase the amount of lower-quality noise in the space. The one edge an LLM might have at this stage is that it can generate a summary that accounts for more recent information, getting around the inevitable gradual "out of dateness" of a human-authored summary written at time T, but even that isn't great if the trade-off is polluting the space with a bunch of ever-so-slightly different variants of the same text.

It's such a weird, warped idea of what productivity is; it's basically the lazy middle-manager's idea of what it means to be productive. We need to remember that not all processes are reducible to their outputs. Sometimes the process is the point, not the immediate output (e.g. education).


Who said anything about value? I'd argue the vast majority of human-generated content is valueless - look at Quora and Medium even before ChatGPT blew up. Where else are humans producing this amazing content? Facebook? X? Don't even get me started.

Being able to summarise multiple articles faster than a human can read and digest a single one is obviously more productive. I'm not sure why you're assuming I'm talking about rewriting the papers to produce slightly different variations; it's a summary. Concerned about the lack of "insight" or something? Then add a workflow that takes the summaries and use your imagination - maybe ask it to find potential applications in completely different fields? You already have comprehensive summaries (or the full papers in a vector db). Am I missing something?

Also, the quality of the summary will be linked to the prompts and the way you go about the process (one-shotting the full paper in the prompt, map-reduce, semantically chunked summaries, which model you're using, its context length, etc.) as well as your RAG setup. I'm still working on my implementation, but it's simple as fuck and pretty decent at giving me, well, summaries of papers.
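For what it's worth, the map-reduce version is only a handful of lines. A rough sketch, assuming an OpenAI-compatible client (point it at a local server if you want) and a placeholder model name; the chunking here is plain character slicing rather than true semantic chunking:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set; swap base_url for a local server
    MODEL = "gpt-4o-mini"  # placeholder; any chat model works

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def summarise_paper(full_text: str, chunk_chars: int = 8000) -> str:
        # "map": summarise each chunk of the paper independently
        chunks = [full_text[i:i + chunk_chars] for i in range(0, len(full_text), chunk_chars)]
        partials = [ask(f"Summarise this section of a research paper:\n\n{c}") for c in chunks]
        # "reduce": merge the partial summaries into one plain-language report
        return ask("Combine these section summaries into one summary for non-experts:\n\n"
                   + "\n\n".join(partials))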

I can't articulate it well enough, but your human-curation argument sounds to me like someone dismissing Google because anyone can lie online, while the good old Yellow Pages can never be wrong.


Based on your writing, you are clearly emotionally invested in this technology; consider how that may affect your understanding.

By multiple rewrites, I meant that, to me at least, it is silly to spend N compute on producing effectively the same summary on demand for the Mth chatbot user when, in some cases, we could much more economically generate one summary once and make it available via distribution channels. To be fair, that is sort of orthogonal to whether the "golden" summary is produced by humans or by LLMs. I guess this is more of a critique of the current UX and compute-expenditure model.

Yes, my whole point about the process sometimes being the point is precisely about the lack of insight. It goes back to Searle's Chinese Room argument. A person in a room with a perfect dictionary and grammar reference can productively translate English texts (input) into Chinese texts (output) just by consulting the dictionary, but we wouldn't claim that this person knows Chinese. Using LLMs for "understanding" is the same. If all you care about is immediate material gain and output, sure, why not, but some of us realize that human beings still move and exist in the world and some of us still appreciate that we need to help fashion those human beings into rational ones that are able to use reason to get along, and aren't codependent on the past N years of the internet to answer any and all questions (the same criticism applies to over-reliance on simplistic "answers" from search engines).


I wouldn't say I'm "emotionally invested" in this tech so much as annoyed with people who expect it to be 100% perfect, as if they've taken the snake-oil salesmen at face value and then dismiss all useful applications of it at the first hurdle. Consider that your disdain for these salespeople and their oft-exaggerated claims (which I absolutely despise) may cloud your judgement of the actual technology.

>it is silly to spend N compute on producing effectively the same summary on demand for the Mth chatbot user

Why? The compute is there, unused. Why is it silly to use it the way a user wants to? Is your argument more about our effective use of electrical power across the globe, or about the quality of the summaries? What if the summaries are produced once and then loaded from some sort of cache - does that make it better in your eyes? I'm trying to understand exactly what your point is here... please accept my apologies for not being able to understand, and please don't take my questions as "gotchas" or anything like that. I genuinely want to know the issue.

>A person in a room with a perfect dictionary and grammar reference can productively translate English texts (input) into Chinese texts (output) just by consulting the dictionary, but we wouldn't claim that this person knows Chinese.

Agreed, because you can't really know a language just from its words - you need grammar rules, historical/cultural context, etc. - precisely the kinds of things included in an LLM's training dataset. I'd argue the LLM knows the language better than the human in your example.

Again, I'm not sure how all of this is relevant to using LLMs to summarise long papers? I wouldn't have read them in the first place, because I didn't know they existed, and I don't have time to read them fully. So a summary of the latest papers every day is infinitely better to me than just not knowing in the first place. Now if you want to talk about how LLMs can confidently hallucinate facts or disregard things due to inherent bias in the training datasets, then I'm interested, because those are the things that are stopping me from actually trusting the outputs fully. (Note: I also don't trust human output on the internet, due to the inherent bias within all of us.)

>human beings still move and exist in the world and some of us still appreciate that we need to help fashion those human beings into rational ones that are able to use reason to get along, and aren't codependent on the past N years of the internet to answer any and all questions

Do a simple experiment with the people around you. Ask them about something that happened a few years ago and see if they pull up Google or Wikipedia or whatever. I don't think you realise how few and far between the humans you're talking about are nowadays. Everyone, from teens to pensioners, has been affected by brain rot to some degree, whether it's plain disinformation on Facebook, or sweet nothings from their pastor/imam/rabbi, or inaccurate Google search summaries (which is a valid point against LLMs - I'm also disappointed with how bad their implementation is).

And let's not assume most humans are even capable of being rational when the data in their own brains has been biased and manipulated by institutions and politicians in "democracies".


I basically agree with everything you say here. I guess my chief concern is around reducing brain rot, and I mostly just worry that uncritical application of LLMs will only increase it rather than decrease it.

At least there is one silver lining: your comments are evidence that not everyone has suffered that brain rot, and some of us are still out there using tools critically—thanks for a good conversation on this!


I am really glad we got the chance for this discussion and that it didn't devolve into flaming or bad faith; I also share your sentiments re: brain rot, but for me this tech is cool yet weirdly primitive, hence my excitement (I'm a 90s baby, so I was "new" to the internet around the time AOL was in decline, and this is the first time I feel early to something). I bet you there are ways to steer people away from their stupor using these - you know how a lie travels faster than the truth? What if these things can help equalise that?

Btw, I apologise again if I came across as blunt or rude in our exchange. Upon reflection, I think you were actually right about me being somewhat emotionally invested in this (albeit due to that sliver of hope that these tools can be used for good). Peace be with you.


> And a big strength LLMs have is summarizing things - I’d like to see you summarize the latest 10 arxiv papers relating to prompt engineering and produce a report geared towards non-techies. And do this every 30 mins please. Also produce social media threads with that info. Is this a task you could do yourself, better than LLMs?

I don't mean to nitpick, but how good do you really think the output of this would be? Papers are short and usually have many references; I would expect the LLM to basically miss the important subtleties in every paper it's given, and to misunderstand and misattribute any terms of art it encounters.

I mean, of course LLMs are good at summarizing: the summaries are probably mostly sort of good, and anything I'm summarizing I won't read myself. But for technical and specific texts, what's the point when you're getting a "maybe correct" retelling? Best case scenario you get a pretty paragraph that's maybe good for an introduction, and worst case you get incorrect information that misinforms you.


The quality of the summary is only as good as the effort you put into writing your workflow. If you're simply one-shotting the paper into a message and saying "plz summarise this and I'll reward you with $1m", then of course it's gonna be shit. But if you semantically chunk along sections and do some RAG Q&A summaries before combining them into a well-formatted schema, then it's probably going to be better than the first way.
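Roughly, the chunk-then-Q&A idea could look like this (a sketch only: the section-splitting regex and the question list are illustrative, and `ask` is whatever function sends a prompt to your model):

    import json
    import re

    SECTION_SPLIT = re.compile(r"\n(?=\d+\.?\s+[A-Z])")  # naive split on numbered headings

    QUESTIONS = [
        "What problem does this section address?",
        "What method or result does it describe?",
        "What limitations does it mention?",
    ]

    def summarise_by_sections(paper_text: str, ask) -> str:
        notes = []
        for section in SECTION_SPLIT.split(paper_text):
            # "Q&A summary": answer a fixed set of questions about each section
            answers = {q: ask(f"{q}\n\nSection:\n{section}") for q in QUESTIONS}
            notes.append(answers)
        # combine the per-section notes into one structured report
        return ask(
            "Turn these per-section notes into a short report with the headings "
            "'Problem', 'Method', 'Findings', 'Limitations':\n\n" + json.dumps(notes, indent=2)
        )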

I’m using the summaries as a juicier abstract. I’m not taking them as gospel.

I'm working on following references and adding those papers to a vector db for RAG, so it can actually go a step beyond. It's fun!
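If it helps, the reference-indexing step can be tiny. A sketch assuming chromadb as the vector store and that you've already fetched the referenced papers' text (collection name and chunk size are arbitrary):

    import chromadb

    db = chromadb.Client()
    refs = db.get_or_create_collection("referenced_papers")

    def index_reference(paper_id: str, text: str, chunk_chars: int = 2000) -> None:
        # store the referenced paper as chunks tagged with their source id
        chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
        refs.add(
            ids=[f"{paper_id}-{n}" for n in range(len(chunks))],
            documents=chunks,
            metadatas=[{"paper": paper_id} for _ in chunks],
        )

    def retrieve_context(question: str, k: int = 5) -> list[str]:
        # pull the k chunks most similar to the question, across all indexed references
        hits = refs.query(query_texts=[question], n_results=k)
        return hits["documents"][0]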


> I’m using the summaries as a juicier abstract. I’m not taking them as gospel.

I'm not sure of the value of this. Papers already have abstracts; rewording them using LLMs is just playing with your food. If you're getting use out of it, that's awesome though.


You do have a point, you know. I have actually been thinking about this recently and have decided to focus more on extracting value out of abstracts instead of summarising whole papers, relying on embeddings of the paper in case the answer needs more context.
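One way to wire that up, as a sketch under my own assumptions (`ask` and `retrieve_context` being helpers like the ones sketched upthread, and the NEED_CONTEXT convention just an illustrative trigger for the fallback):

    def answer_from_abstract(question: str, abstract: str, ask, retrieve_context) -> str:
        first_pass = ask(
            "Answer the question using only this abstract. If the abstract does not "
            "contain enough information, reply exactly NEED_CONTEXT.\n\n"
            f"Abstract:\n{abstract}\n\nQuestion: {question}"
        )
        if "NEED_CONTEXT" in first_pass:
            # fall back to retrieving chunks of the full paper from the vector db
            excerpts = "\n\n".join(retrieve_context(question))
            return ask(f"Answer using these excerpts from the paper:\n\n{excerpts}\n\n"
                       f"Question: {question}")
        return first_pass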


Due to unexpected capacity constraints, Claude is unable to reply to this message.


Just as I thought: snark and no real meaningful engagement.

P.S. my script uses local models - no capacity constraints (apart from VRAM!)



