Overfitting can be caused by a lot of different things. Having an over abundance...

ynniv · 2024-12-22T17:17:07 1734887827

These kind of hand wavey statements like “practice,” “grok,” and “nonlinear its capabilities will be” are not very constructive as they don’t have solid meaning wrt language models.

So, here's my hypothesis, as someone who is adjacent ML but haven't trained DNNs directly:

We don't understand how they work, because we didn't build them. They built themselves.

At face value this can be seen as an almost spiritual position, but I am not a religious person and I don't think there's any magic involved. Unlike traditional models, the behavior of DNNs is based on random changes that failed up. We can reason about their structure, but only loosely about their functionality. When they get better at drawing, it isn't because we taught them to draw. When they get better at reasoning, it isn't because the engineers were better philosophers. Given this, there will not be a direct correlation between inputs and capabilities, but some arrangements do work better than others.

If this is the case, high order capabilities should continue to increase with training cycles, as long as they are performed in ways that don't interfere with what has been successfully learned. People lamented the loss of capability that GPT 4 suffered as they increased safety. I think Anthropic has avoided this by choosing a less damaging way to tune a well performing model.

I think these ideas are supported by Wolfram's reduction of the problem at https://writings.stephenwolfram.com/2024/08/whats-really-goi...

dartos · 2024-12-22T19:15:09 1734894909

Your whole argument falls apart at

> We don't understand how they work, because we didn't build them. They built themselves.

We do understand how they work, we did build them. The mathematical foundation of these models are sound. The statistics behind them are well understood.

What we don’t exactly know is which parameters correspond to what results as it’s different across models.

We work backwards to see which parts of the network seem to relate to what outcomes.

> When they get better at drawing, it isn't because we taught them to draw. When they get better at reasoning, it isn't because the engineers were better philosophers.

Isn’t this the exact opposite of reality?

They get better at drawing because we improve their datasets, topologies, and their training methods and in doing so, teach them to draw.

They get better at reasoning because the engineers and data scientists building training sets do get better at philosophy.

They study what reasoning is and apply those learnings to the datasets and training methods.

That’s how CoT came about early on.

comp_throw7 · 2024-12-23T00:07:02 1734912422

> We do understand how they work, we did build them. The mathematical foundation of these models are sound. The statistics behind them are well understood.

We don't understand how they work in the sense that we can't extract the algorithms they're using to accomplish the interesting/valuable "intellectual" labor they're doing. i.e. we cannot take GPT-4 and write human-legible code that faithfully represents the "heavy lifting" GPT-4 does when it writes code (or pick any other task you might ask it to do).

That inability makes it difficult to reliably predict when they'll fail, how to improve them in specific ways, etc.

The only way in which we "understand" them is that we understand the training process which created them (and even that's limited to reproducible open-source models), which is about as accurate as saying that we "understand" human cognition because we know about evolution. In reality, we understand very little about human cognition, certainly not enough to reliably reproduce it in silico or intervene on it without a bunch of very expensive (and failure-prone) trial-and-error.

dartos · 2024-12-23T00:39:56 1734914396

> We don't understand how they work in the sense that we can't extract the algorithms they're using to accomplish the interesting/valuable "intellectual" labor they're doing. i.e. we cannot take GPT-4 and write human-legible code that faithfully represents the "heavy lifting" GPT-4 does when it writes code (or pick any other task you might ask it to do).

I think English is being a little clumsy here. At least I’m finding it hard to express what we do and don’t know.

We know why these models work. We know precisely how, physically, they come to their conclusions (it’s just processor instructions as with all software)

We don’t know precisely how to describe what they do in a formalized general way.

That is still very different from say an organic brain, where we barely even know how it works, physically.

My opinions:

I don’t think they are doing much mental “labor.” My intuition likens them to search.

They seem to excel at retrieving information encoded in their weights through training and in the context.

They are not good at generalizing.

They also, obviously, are able to accurately predict tokens such that the resulting text is very readable.

Larger models have a larger pool of information and that information is in a higher resolution, so to speak, since the larger better preforming models have more parameters.

I think much of this talk of “consciousness” or “AGI” is very much a product of human imagination, personification bias, and marketing.

og_kalu · 2024-12-23T01:39:49 1734917989

>We know why these models work. We know precisely how, physically, they come to their conclusions (it’s just processor instructions as with all software)

I don't know why you would classify this as knowing much of anything. Processor instructions ? Really?

If the average user is given unfettered access to the entire source code of his/her favorite app, does he suddenly understand it ? That seems like a ridiculous assertion.

In reality, it's even worse. We can't pinpoint what weights, how and in what ways and instances are contributing exactly to basic things like whether a word should be preceded by 'the' or 'a' and it only gets more intractable as models get bigger and bigger.

Sure, you could probably say we understand these NNs better than brains but it's not by much at all.

dartos · 2024-12-23T02:14:33 1734920073

> If the average user is given unfettered access to the entire source code of his/her favorite app, does he suddenly understand it ? That seems like a ridiculous assertion.

And one that I didn’t make.

I don’t think when we say “we understand” we’re talking about your average Joe.

I mean “we” as in all of human knowledge.

> We can't pinpoint what weights, how and in what ways and instances are contributing exactly to basic things like whether a word should be preceded by 'the' or 'a' and it only gets more intractable as models get bigger and bigger.

There is research coming out on this subject. I read a paper recently about how llama’s weights seemed to be grouped by concept like “president” or “actors.”

But just the fact that we know that information encoded in weights affects outcomes and we know the underlying mechanisms involved in the creation of those weights and the execution of the model shows that we know much more about how they work than an organic brain.

The whole organic brain thing is kind of a tangent anyway.

My point is that it’s not correct to say that we don’t know how these systems work. We do. It’s not voodoo.

We just don’t have a high level understanding of the form in which information is encoded in the weights of any given model.

og_kalu · 2024-12-23T03:03:51 1734923031

> If the average user is given unfettered access to the entire source code of his/her favorite app, does he suddenly understand it ? That seems like a ridiculous assertion. And one that I didn’t make. I don’t think when we say “we understand” we’re talking about your average Joe. I mean “we” as in all of human knowledge.

It's an analogy. In understanding weights, even the best researchers are basically like the untrained average joe with source code.

>There is research coming out on this subject. I read a paper recently about how llama’s weights seemed to be grouped by concept like “president” or “actors.”

>But just the fact that we know that information encoded in weights affects outcomes and we know the underlying mechanisms involved in the creation of those weights and the execution of the model shows that we know much more about how they work than an organic brain.

I guess i just don't see how "information is encoded in the weights" is some great understanding ? It's as vague and un-actionable as you can get.

For training, the whole revolution of back-propagation and NNs in general is that we found a way to reinforce the right connections without knowing anything about how to form them or even what they actually are.

We no longer needed to understand how eyes detect objects to build an object detecting model. None of that knowledge suddenly poofed into our heads. Back-propagation is basically "reinforce whatever layers are closer to the right answer". Extremely powerful but useless for understanding.

Knowing the Transformer architecture unfortunately tells you very little about what a trained model is actually learning during training and what it has actually learnt.

"Information is encoded in a brain's neurons and this affects our actions". Literally nothing useful you can do with this information. That's why models need to be trained to fix even little issues.

If you want to say we understand models better than the brain then sure but you are severely overestimating how much that "better" is.

dartos · 2024-12-23T20:10:59 1734984659

> It's as vague and un-actionable as you can get.

But it isn’t. Knowing that information is encoded in the weights gives us a route to deduce what a given model is doing.

And we are. Research is being done there.

> "Information is encoded in a brain's neurons and this affects our actions". Literally nothing useful you can do with this.

Different entirely. We don’t even know how to conceptualize how data is stored in the brain at all.

With a machine, we know everything. The data is stored in a binary format which represents a decimal number.

We also know what information should be present.

We can and are using this knowledge to reverse engineer what a given model is doing.

That is not something we can do with a brain because we don’t know how a brain works. The best we can do is see that there’s more blood flow in one area during certain tasks.

With these statistical models, we can carve out entire chunks of their weights and see what happens (interestingly not much. Apparently most weights don’t contribute significantly towards any token and can be ignored with little performance loss)

We can do that with these transformers models because we do know how they work.

Just because we don’t understand every aspect of every single model doesn’t mean we don’t know how they work.

I think we’re starting to run in circles and maybe splitting hairs over what “know how something works” means.

I don’t think we’re going to get much more constructive than this.

I highly recommend looking into LoRas. We can make Loras because we know how these models work.

We can’t do that for organic brains.

mistercheph · 2024-12-23T05:12:43 1734930763

The thing that you are handwaving away as just "which parameters correspond to what results" is precisely the important, the inexorable thing which defines the phenomena, and it is exactly the thing which we don't have access to, and which we did not and could not design, plan or engineer, but which emerged

dartos · 2024-12-23T11:44:01 1734954241

> which we did not and could not design, plan or engineer, but which emerged

We literally designed, planned, and engineered the environment and mechanisms which created those weights.

It’s just code. We can train models by hand too, it’d just take a lot longer.

It’s literally something we made, just from a higher order place.

To understand which exact weights correspond to what output will vary from model to model. There is research going into this subject for llama.

it’s not like we’re in the dark as to the principles that allow LLMs to make predictions.

My whole point is that to say “we don’t know how AI works” is just not true

ynniv · 2024-12-22T22:44:34 1734907474

Please, read the Wolfram blog

dartos · 2024-12-22T23:59:21 1734911961

I gave it a fair skim, but I didn’t really feel like it refuted what I said.

Is there a specific section that comes to mind?

ynniv · 2024-12-23T00:56:12 1734915372

Other than we don't tell it how to get the right answer, or understand how it eventually computes correct answers?

dartos · 2024-12-23T02:23:03 1734920583

I don’t really think you’re understanding my argument…