Hacker News
Large Language Models Are Neurosymbolic Reasoners (arxiv.org)
107 points by optimalsolver on March 13, 2024 | 164 comments



I'm trying to tackle this problem more head-on, by outfitting LLMs with lambda calculus, stacks, queues, etc. directly in their internals, operating over their latent space. [1]

I'll read your paper, but LLMs famously fail horribly at "multi-hop" reasoning, which to me means they can't reason at all. They can merely output a reflection of the human reasoning that was baked into the training data, and they can recombine it combinatorially. Eager to see if you've solved this!

[1] https://github.com/neurallambda/neurallambda


Chain of thought is basically reasoning as humans do it; the only difference is that, unlike humans, the model can't (yet) see that its output is wrong, abandon a line of reasoning, and re-prompt itself.


Various attempts at feeding a model's output back in so it can check itself have shown marked improvements in accuracy.


Multi-agent LLMs talking to each other can already do this. It's just not cost-feasible yet, because it can lead to infinite loops and no solution.
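A minimal sketch of that kind of self-check loop (purely illustrative; `llm` here is a stand-in for whatever chat-completion call you actually use, not a real API):

    def self_check(question, llm, max_rounds=3):
        # `llm` is a hypothetical text-in/text-out callable, not a real library call
        answer = llm(question)
        for _ in range(max_rounds):  # cap the rounds so the loop can't run forever
            critique = llm(f"Q: {question}\nA: {answer}\n"
                           "Is this answer wrong? Reply OK, or explain the error.")
            if critique.strip().upper().startswith("OK"):
                return answer
            answer = llm(f"Q: {question}\nA previous answer was rejected because: "
                         f"{critique}\nGive a corrected answer.")
        return answer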


> they can't reason at all.

I'm pretty sure this is it. They don't understand even negation to begin with.

In this thread someone asked an AI to output images of stereotypical American soccer fans and it drew all the Seattle fans with umbrellas:

https://old.reddit.com/r/MLS/comments/1b10t68/meme_asked_ai_...

Seattle is famously a place where people don't use umbrellas.


Humans also fail much of the time at 'multi-hop' reasoning. You have to prod them.


Except that no human (non-colorblind, at least) past three years old thinks bananas are the same color as the sky (see the example given in the repo; that's a mistake literally no human could make).


For what it's worth, I tried it on ChatGPT and this was its response:

"The color of the daytime sky is commonly blue. The common household fruit that is also blue would be blueberries. Blueberries typically grow in acidic soil. The pH of the soil they grow in is usually between 4.5 and 5.5."


It can get this simple example right if it does chain of thought. If you ask it to output just the answer, without working through the other bullet points, it will very likely get it wrong. Chain of thought is duct tape over actual reasoning, and the errors/hallucinations compound exponentially. Try to get ChatGPT to reason about concurrent-state issues in programming: if it's not a well-worn issue it has already memorized, it'll be useless. It's also near-perfect, for instance, at the Advent of Code puzzles it has memorized, and near 0% accurate on new ones.


But as a human being I could never do it without chain of thought either. First I would have to brute-force my way through fruits that have the required colour. Otherwise I'm just guessing randomly.

Also, I first tried to solve this problem myself and couldn't, because I picture the sky as light blue, and blueberries look very dark to me, if blue at all.


You could also go with grey or white etc for the sky.


Maybe not no human :) But probably 99.99% of them.


You're technically right: my two-and-a-half-year-old has only been familiar with colors for six months. But I think it's fine to say that toddlers haven't yet reached the standard level of human intelligence ;)


May I ask why there are NNs in your project at all? Just to heat up the planet and make Nvidia shareholders even happier? :-)

I mean, what I've seen in the Readme makes sense. But doing basic computer stuff with NNs just makes the resource usage go brr by astronomical factors, for imho no reason, while making the results brittle, random, and often just completely made up.

Also: Do you know about the (already 40 year old!) project Cyc?

https://en.wikipedia.org/wiki/Cyc

This software can indeed "reason". And it does not hallucinate, because it's not based on NNs.


> May I ask why there are NNs in your project at all? Just to heat up the planet [...]

> Also: Do you know about the (already 40 year old!) project Cyc?

Has Cyc accomplished anything so far? Or is it just to heat up the planet? The Wikipedia page makes it sound pretty hopeless:

> Typical pieces of knowledge represented in the Cyc knowledge base are "Every tree is a plant" and "Plants die eventually". When asked whether trees die, the inference engine can draw the obvious conclusion and answer the question correctly.

> Most of Cyc's knowledge, outside math, is only true by default. For example, Cyc knows that as a default parents love their children, when you're made happy you smile, taking your first step is a big accomplishment, when someone you love has a big accomplishment that makes you happy, and only adults have children. When asked whether a picture captioned "Someone watching his daughter take her first step" contains a smiling adult person, Cyc can logically infer that the answer is Yes, and "show its work" by presenting the step-by-step logical argument using those five pieces of knowledge from its knowledge base.


Well, with 40 years of work, I'm sure Cyc has had some really mind-blowing results.


"AI heats the planet"... really? You mean marginally?

I'll assume you're asking in good faith. Using NNs allows this project to stand on the shoulders of giants philosophically, mathematically, and programmatically. I also expect it to plug into OSS LLMs and leverage their knowledge, similar to how a human child first learns via Pavlovian/intuitive responses and only later learns to reason.

Wrt inefficiency, training will be inefficient, but the programs can be extracted to CPU instructions / CUDA kernels during inference. I'm also interested in using straight-through estimators in the forward pass of training, to do this conversion during training too.
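(For readers who haven't met straight-through estimators: the trick is to use a hard, non-differentiable operation in the forward pass while pretending it was the identity in the backward pass. A minimal PyTorch sketch, not taken from neurallambda:)

    import torch

    class STERound(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return torch.round(x)          # hard, discrete op going forward

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output             # gradient passes "straight through"

    x = torch.rand(4, requires_grad=True)
    y = STERound.apply(x)                  # discrete values flow forward
    y.sum().backward()                     # x still receives gradients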

Cyc looks cool, but from my cursory glance, is it capable of learning, or is its knowledge graph largely hand coded? Neurallambda is at least as scalable as an RNN, both in data and compute utilization.


> Wrt inefficiency, training will be inefficient

That's the "heating the planet part" I was referring to. :-)

> but the programs can be extracted to CPU instructions / CUDA kernels during inference

This just makes my original question more pressing: What are the NNs good for if the result will be normal computer programs? (Just created with astronomical overhead!)

> Cyc looks cool, but from my cursory glance, is it capable of learning, or is its knowledge graph largely hand coded?

The whole point is that it can infer new knowledge from known facts through a logical reasoning process.

This inference process has been running for 40 years. The result is the most comprehensive "world knowledge" archive ever created. Of course this wouldn't be possible to create "by hand". And in contrast to NN hallucinations there is real logical reasoning behind it, and everything is explainable.

I still don't get how the "dreamed up" programs from your project are supposed to work. Formal reasoning and NNs don't go well with each other. (One could even say they're opposites.) Imho it's either "real reasoning" OR "dreamed-up stuff". How could "dreamed-up stuff" improve "real reasoning"? Especially as the "dreamed-up stuff" won't be included in the end result anyway, where only the formal things remain. To what effect are the NNs included in your project? (I mean besides the effect that the HW and energy demands will go through the roof, ending up a billion times higher than just doing some lambda calculus directly…)

And yes, these are genuine questions. I just don't get it. To me it looks like "let's do things maximally inefficiently, but at least we can put a 'works with AI' rubber stamp on it", which is maybe good for collecting VC money, but what else?

What do I overlook here?


> This just makes my original question more pressing: What are the NNs good for if the result will be normal computer programs? (Just created with astronomical overhead!)

You know how expensive it is to pay humans to write 'normal' computer programs? In terms of both dollars and CO2.


+1 for cyc, genuinely awesome and overlooked


The authors get LLMs to perform pretty well in a variety of IF-style text-based games. Which is pretty cool; these kinds of games are played and read in natural language, which makes them pretty hard to write AIs for normally.

Something I'd love to see one day is modern AI applied to other kinds of text based games like nethack. Last I checked nobody had managed to solve the problem of nethack AI without using hard coded heuristics and goals!


Is it actually possible to beat Nethack without reading up on some "spoilers" first? I've never heard of anybody who managed to do that.

Even when you read up on all kinds of info about the game before you attempt a run, it's extremely hard to reach the higher levels, let alone beat the game. (I myself never reached any later levels despite knowing some tricks by now. Tricks impossible to infer from just playing the game; you need to look them up…)

Imho there is no winning strategy for Nethack. It's some random stuff "you need to know" to progress even a little bit, paired with being completely at the mercy of the dice, while encountering maximally nonsensical "puzzles".

But OK, maybe I'm just too dumb for this game and don't see the "logic" behind the things the game presents.


> Imho there is no winning strategy for Nethack. It's some random stuff "you need to know" to progress even a little bit, paired with being completely at the mercy of the dice, while encountering maximally nonsensical "puzzles".

In the training data for the LLMs there is probably a significant amount of information about Nethack already, I would think.

So with the right prompt perhaps it could play better than some people, if those people did not have any info about nethack and had not played it before.


I'm not sure. I've beaten it a number of times, but only using tons of spoilers as you say. That said, there are players who consistently win almost every game, which is crazy (there's an online nethack server somewhere, can't remember the name offhand, but you can search player stats and some of them are insane).

Edit: here's one, a player with a 60% win rate, not as crazy as I initially thought but if you've ever played nethack... https://alt.org/nethack/player-stats.php?player=Stroller


Of course one can beat Nethack. Many people have. That's not the point. The question was more: without "spoilers" (a.k.a. "hardcoded heuristics and goals")?

(I never made progress because I didn't try hard enough. It became very boring after finding out that this game is random and quite nonsensical, and one can't come up with a strategy just by playing it often enough. In general I don't enjoy dice games. I prefer games where you can come up with a winning strategy by curious observation and logical thinking.)

Because of the nature of Nethack I don't think it's a good AI test as such.

Maybe it would be if one let the AI read spoilers / walkthroughs and then let it try playing. Such a test could then maybe probe the AI's text comprehension, and its ability to map the gained understanding to concrete actions. But just letting it play Nethack unprepared does not give any insight into the AI's capabilities, imho. It will just fail over and over again. Because it's (imho) impossible to beat Nethack without spoilers. You just can't extract the needed knowledge from playing. Even from playing it millions of times.


Relax, friend, I understand your question =) I'm just pointing out that it's not as random as you think. Mechanics in NetHack are quite predictable for the most part, but they are difficult to _discover_ without dying. Given that AIs can play hundreds of thousands of runs in the time it would take me to play one, I'm a bit more optimistic that they could learn the mechanics eventually.

I think the fundamental problem is that nobody knows how to do exploration-based reward functions effectively. Has Pitfall been solved by modern RL, for instance? As far as I know that's still an open problem (alongside getting to diamonds in Minecraft without hardcoded heuristics, and other things along the same lines).

(edit: just in terms of evidence for the first claim, once I'd done a run successfully with spoilers I was able to beat the game again without looking anything up. So I think it's more a discovery problem than nethack being inherently random)


I think my wording was bad. There are two kinds of "randomness" here at play and I didn't differentiate properly.

For me the game mechanics as such are "random". Because you can't discover them by just playing (imho).

At the same time the game is ruled by the dice. (So even the best players will fail almost 50% of the time which is almost as random as tossing a coin, and strictly not skill based).

> Given that AIs can play hundreds of thousands of runs in the time it would take me to play one, I'm a bit more optimistic that they could learn the mechanics eventually.

My gut feeling says the opposite. How do you infer any kind of rules from almost random events? Especially if the "logic" behind the "non random" parts is actually also quite made up and arbitrary (so in a sense also "random", even to people with reason).

An "exploration-based reward function" wouldn't be enough. Because this would assume that exploration has (more or less) deterministic outcomes. But given the dice in Nethack it actually does not! You can do "everything right" and still die in almost 50% of the cases. How to infer any meaningful "world model" from such events? Imho you can't.

(I can confirm that looking up spoilers will let you make progress in Nethack. That's why I think it's boring. I'd tried hundreds of times prior to looking up spoilers and didn't make any progress. But after biting the bullet and starting to read a walkthrough, it was actually quite easy to reach some deeper levels. Until I hit the next invisible wall. Which would require again some out-of-band knowledge… I know that reading the next spoilers would also make this wall go away. But I've lost any interest in this game after finding out exactly this: it's impossible to play without a walkthrough; and with a full walkthrough it's actually fairly easy, and comes down to "just having luck". At this point I could just toss a coin to determine whether "I won". That's maximally boring. I don't like dice games; and the exploration part in Nethack leads nowhere because the world is arbitrarily made up. You can't discover the mechanics without already knowing them…)


Fair enough - the mechanics are certainly random in the sense that they involve dice rolls.

> So even the best players will fail almost 50% of the time which is almost as random as tossing a coin, and strictly not skill based

Even moderately experienced players will fail close to 100% of the time. So getting to an almost 50% success rate, to my mind, shows a great deal of skill! The difference between this and a coin toss is that two people, no matter how many times they have each respectively tossed a coin, will _still_ always get a 50% success rate.

> An "exploration-based reward function" wouldn't be enough. Because this would assume that exploration has (more or less) deterministic outcomes.

I don't think this is true any more for modern AIs such as AlphaGo and its successors, which learn distributions of possible outcomes rather than deterministic predictions. IIRC the latest versions can even self-play games like Poker to a superhuman level.

I think so long as you are able to sample a given mechanic enough times, you can build a decent estimate of the possible outcomes (and choose your behaviour accordingly). If there is any systematic deviation from pure randomness, enough data will reveal it!


> the mechanics are certainly random in the sense that they involve dice rolls.

Sorry, but that's still not what I've meant.

Dice rolls are randomness in the usual meaning of this word. But "random" can also mean "arbitrary" and/or "illogical" things. Imho Nethack mechanics are also random in this sense. They don't make sense at all… :-)

> Even moderately experienced players will fail close to 100% of the time. So getting to an almost 50% success rate, to my mind, shows a great deal of skill!

OK, you have definitely a point here.

Still not my cup of tea, such games. (And this is strictly personal, and unrelated to the rest of the discussion.) I just don't like games where it's very likely that I will lose despite "doing everything right". I have no problem with games punishing even small mistakes mercilessly. That's OK. But having an outcome that depends mostly on the whim of the RNG is just nothing I enjoy. When I "do everything right" I like to get rewarded appropriately for it. (Of course some level of randomness is still OK. But if the RNG kills you most of the time no matter what you do, that's just too frustrating for me. So I'm clearly not the target audience for Nethack… :-D).

> I don't think this is true any more for modern AIs such as AlphaGo and its successors, which learn distributions of possible outcomes rather than deterministic predictions.

Sure. But how do you learn from a distribution where no matter what you do you will fail in, say, 99.9% of the cases?

> IIRC the latest versions can even self-play games like Poker to a superhuman level.

Do you have some links regarding this? I thought Poker is still one of the games where AIs don't play better than humans. OK, maybe it depends on the Poker variant. There are simpler and more difficult ones.

> If there is any systematic deviation from pure randomness, enough data will reveal it!

I would agree in general.

But now we're back to the initial question: Is there enough systematic deviation from pure randomness in Nethack? Given that even people who know all the mechanics, and know some good end-to-end strategies will fail in most cases (actually, like you said, in almost all cases). And given that it's (imho) impossible to come up with this knowledge about mechanics and strategy just by playing this game. I have my doubts.


> Do you have some links regarding this? I thought Poker is still one of the games where AIs don't play better than humans. OK, maybe it depends on the Poker variant. There are simpler and more difficult ones

Texas Hold'em, one of the most popular variants - have a look at DeepMind's Player of Games, and the general technique of Counterfactual Regret Minimisation. Both are recent advances, but poker is absolutely solved at a human professional level now.

I think one thing to keep in mind is that what you find illogical has very little bearing on whether a neural net can learn to do it. We humans come prebaked with specific priors in our brains (like cognitive biases) that AIs don't necessarily share. I'd be careful making sweeping statements about what is impossible and what isn't, personally. But I guess we'll see =)

> Sure. But how do you learn from a distribution where no matter what you do you will fail in, say, 99.9% of the cases?

You can actually test this yourself if you're interested - try to train a neural net to predict outcomes that are deterministic but with a 99.9% chance of random failure. If the net learns to succeed 0.1% of the time then your premise is false - it has successfully extracted the signal!


> An "exploration-based reward function" wouldn't be enough. Because this would assume that exploration has (more or less) deterministic outcomes. But given the dice in Nethack it actually does not! You can do "everything right" and still die in almost 50% of the cases. How to infer any meaningful "world model" from such events? Imho you can't.

You can actually do that very well. When they train LLMs on text, the same prefix doesn't always lead to the same next token. And they handle that just fine.

Btw, Nethack isn't actually random: the dice use a pretty broken PRNG, and 'luck manipulation' is a thing. A computer might not actually care about the difference between spoiler-y tactics that are legible to humans, and PRNG manipulation.

(The state of the PRNG is a relatively small number of bits. Various actions can advance the PRNG, without causing any other change in the world. So you can basically make sure that you are always maximally lucky, if you can somehow recover the hidden PRNG state from the output of the program, and then model it.)
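A toy illustration of the idea (a simple LCG and a "d20" here are stand-ins, not NetHack's real internals): once you know the generator's hidden state, every future roll is predictable, so an agent can time its actions to land on good rolls.

    # toy linear congruential generator; NetHack's PRNG differs, but the principle holds
    M, A, C = 2**31, 1103515245, 12345

    def next_state(s):
        return (A * s + C) % M

    def predict_rolls(state, n, sides=20):
        rolls = []
        for _ in range(n):
            state = next_state(state)
            rolls.append(state % sides + 1)   # derive a "dice roll" from the state
        return rolls

    # an agent that has recovered the hidden state can foresee the next five rolls
    print(predict_rolls(42, 5))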


No doubt that person is very good at nethack, but even among very good players 60% win rate is survivorship bias.


> Is it actually possible to beat Nethack without reading up some "spoilers" upfront?

If people are playing nethack without reading the source, they're needlessly hobbling themselves.


Human-readable spoilers are enough; no need to dive into the C source. People have already extracted most of the information and tables you need.

Especially if you only care about winning, and not about doing it eg in the fewest number of moves.


That's my point. Imho it's impossible to even come close to a realistic chance to beat this game without knowing more or less every detail of its internal mechanics. And you can't infer this knowledge just from playing. You need to look it up from some out-of-band source.

So only letting an AI play this game will not reveal anything about the "reasoning" capabilities of said AI.

Letting the AI read the source (or some other "spoilers") and then looking at how it performs playing would maybe be a usable test. But that's still a big "maybe" imho, as Nethack is just "too random" I think.


> And you can't infer this knowledge just from playing. You need to look it up from some out-of-band source.

I disagree with this in principle. Given enough time, you can collect enough statistics to infer pretty much anything about the game. I've participated in efforts to reverse-engineer web-only games where neither source nor binary were available -- the server does all the compute and the client only receives information to show the user.

But in practice: how long will it take to first inscribe elbereth, and then notice its effect? Yeah, probably not happening in a few of my lifetimes.


There's https://arxiv.org/abs/2310.00166, which uses an LLM for intrinsic rewards for an RL agent. They use it on nethack. It was discussed on the TalkRL podcast: https://www.talkrl.com/episodes/pierluca-doro-and-martin-kli...


Ooh, really cool! Thanks for the links.


One of the things that LLMs are mainly not great at is understanding the spatial relationships of characters to each other. Get ChatGPT to convert between different forms of chess notation and you'll start seeing errors very quickly. ASCII art is generally fairly bad unless it's regurgitation. It's not too surprising, but it will make games like nethack harder.


Yeah, I've struggled to get GPT to do ASCII art. I wonder if it's a tokenization thing? If we were to group characters in ASCII art in twos or threes we'd have a hard time making it too!
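It largely does seem to be a tokenization thing, as far as I can tell. A quick way to see it (this assumes OpenAI's tiktoken package is installed; other tokenizers behave similarly):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    art = " /\\_/\\ \n( o.o )\n > ^ < "
    tokens = enc.encode(art)
    # the byte-pair tokenizer chops runs of spaces and symbols into uneven chunks,
    # so the model never "sees" an aligned character grid
    print([enc.decode([t]) for t in tokens])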


CICERO, an AI for Diplomacy, uses a LLM: https://ai.meta.com/research/cicero/diplomacy/


Yes, IF-style for sure, but that's not Zork either. Cool work though; having those solve Zork or Mystery House or whatever would be sooo cool!


I did something similar [1], where the text-based game is a Text Interface (TI) that provides the model with a view and a set of actions. The model repeatedly interacts with the TI to achieve a goal.

In the current implementation, the Text Interface allows the model to list files, open a file, and search within a file to find what it needs to satisfy the goal.

With my current set of prompts, the model is able to backtrack when it fails to find relevant info in the current file. However, I couldn't get it to work with GPT-3.5. Only GPT-4 is capable of reasoning its way through the Text Interface.

[1] https://github.com/ash80/backtracking_gpt


I was recently thinking how every neural network is equivalent to a lookup table where the input is all numbers up to what can be expressed within the context window and the output is the result of the arithmetic operations applied to that number. So every neural network is equivalent to T = {(i, f(i)) : i < K} where K is the constant which determines the context window and f is the numerical function implemented by the network. Can someone ask a neural network if my reasoning is valid and correct?

The main practical issue is the size of the table but I don't see any theoretical reasons why this is incorrect. The neural network is simply a compressed representation of the uncompressed lookup table. Given that the two representations are theoretically equivalent and a lookup table does not perform any reasoning we can conclude that no neural network is actually doing any thinking other than uncompressing the table and looking up the value corresponding to the input number.

Modern neural networks have some randomness but that doesn't change the table in any meaningful way because instead of the output being a number it becomes a distribution over some finite range which can again be turned into a table with some tuples.
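To make the claim concrete, here is a toy version of the table (PyTorch, with a tiny random network and a binary "context window" of 3 so the table actually fits in memory; purely illustrative):

    import itertools
    import torch

    torch.manual_seed(0)
    net = torch.nn.Sequential(torch.nn.Linear(3, 4), torch.nn.ReLU(), torch.nn.Linear(4, 1))

    # a "context window" of 3 binary inputs gives K = 2^3 = 8 possible inputs
    table = {bits: net(torch.tensor(bits)).item()
             for bits in itertools.product([0.0, 1.0], repeat=3)}

    # over this input space, the table and the network are interchangeable
    print(table)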


This reminds me of the classic problem in computation, where the simplest form of computation, the lookup table, input -> output, is limited to a finite ___domain. Turing modified the computation to have a finite internal state and infinite external environment (tape), so it becomes a transition function (state, stimulus) -> (new state, response), applied recursively in a feedback loop, allowing it to operate on infinite domains.

Famously a simple lookup table for the transition function then suffices to compute any computable function.
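For example, a complete Turing machine is just a finite transition table plus the feedback loop over an unbounded tape. A sketch of a trivial machine that flips bits until it hits a blank (my own toy example):

    # transition table: (state, symbol) -> (new_state, symbol_to_write, head_move)
    delta = {
        ("scan", "1"): ("scan", "0", +1),
        ("scan", "0"): ("scan", "1", +1),
        ("scan", "_"): ("halt", "_", 0),
    }

    tape, head, state = list("1011") + ["_"], 0, "scan"
    while state != "halt":
        state, tape[head], move = delta[(state, tape[head])]
        head += move

    print("".join(tape))   # 0100_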


Have a look at Post's correspondence problem for even crazier universal models of computation. Or at Fractran.

Simplified for Post's correspondence problem, you have a set of playing cards with text written on the front and back. (You can make copies of cards in your set.)

The question is, can you arrange your cards in such a way, that they spell out the same total text on the front and back?

As an example your cards might be: [1] (a, baa), [2] (ab, aa), and [3] (bba, bb). One solution would be (3, 2, 3, 1) which spells out bbaabbbaa on both sides.

Figuring out whether a set of cards has a solution is undecidable; the problem can encode arbitrary Turing machine computations.
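Checking a proposed solution is trivial (finding one is the hard, undecidable part). A few lines verify the example above:

    cards = {1: ("a", "baa"), 2: ("ab", "aa"), 3: ("bba", "bb")}   # (front, back)

    def spells_same(sequence):
        front = "".join(cards[i][0] for i in sequence)
        back = "".join(cards[i][1] for i in sequence)
        return front, back, front == back

    print(spells_same([3, 2, 3, 1]))   # ('bbaabbbaa', 'bbaabbbaa', True)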


That's a good point.


It sounds like you're asking whether the output of a neural network is a deterministic function of its input. For many LLMs, you can make that answer yes with the right combination of parameters (temperature = 0) and underlying compute (variance in floating point calculations can still introduce randomness in model outputs even when the model should theoretically return the same answer every time).

There are some ways to introduce stochasticity:

1. Add randomness. The temperature or "creativity" hyperparameter in most LLMs does this, as do some decoders. The hardware these models run on can also add randomness.

2. Add some concept of state. RNNs do this, some of the approaches which give the LLM a scratch pad or external memory do this, and continuous pre-training sort of does this.
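To make point 1 concrete, a minimal sketch of how temperature turns a deterministic argmax into a stochastic sample (illustrative only, not any particular vendor's decoder):

    import numpy as np

    def sample(logits, temperature):
        logits = np.asarray(logits, dtype=float)
        if temperature == 0:
            return int(np.argmax(logits))                    # greedy: always the same token
        p = np.exp(logits / temperature)
        p /= p.sum()
        return int(np.random.choice(len(logits), p=p))       # stochastic: varies per call

    logits = [2.0, 1.5, 0.3]
    print([sample(logits, 0) for _ in range(5)])     # [0, 0, 0, 0, 0]
    print([sample(logits, 1.0) for _ in range(5)])   # varies run to run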

How this affects people's perception of LLMs as thinking machines, I don't know. What if someone took every response I ever gave to every question that was ever asked of me in my life and made a Chinese Room[1] version of me? A lookup table that is functionally identical to my entire existence. In what contexts is the difference meaningful?

[1] https://en.wikipedia.org/wiki/Chinese_room


To your last point, https://en.m.wikipedia.org/wiki/Problem_of_induction

A LUT version of you is inductive. Every observed input/output pair does not uniquely identify your current state. Much like a puddle left by a melted ice cube indicates its volume, but little to nothing of its shape.

Post LUT-you genesis, applying property based fuzz testing would quickly reveal that the LUT-you is one of an infinite number of LUT-yous that melts into the puddle of historical data, but not the LUT-you that is the original ice cube.

https://fsharpforfunandprofit.com/posts/property-based-testi...


People can not be reduced to lookup tables even in theory. No one even knows how a single cell does what it does let alone an entire organism like a person.

I'm not making an abstract claim about neural networks because all numerical algorithms like neural networks can be reduced to a lookup table given a large enough hard drive. This is not practical because the space required would exceed the number of atoms in the known universe but the argument is sound. The same isn't true for people unless a person is idealized and abstracted into a sequence of numbers. I'm not saying no one is allowed to think of people as some sequence of numbers but this is clearly an abstraction of what it means to be a person and in the case of the neural network there is no abstraction, it really is a numerical function which can be expanded into a large table which represents its graph.


>People can not be reduced to lookup tables even in theory

Sure you can. Simply enumerate all of the physical states that the atoms in your body could be in. Any finite-sized object has a finite number of possible states, and so can be represented by a finite lookup table.

Your argument is so broad as to be meaningless.


Then give some concrete numbers for the states of the atoms. My argument is not abstract, it is very concrete. Give me a neural network and I can generate the graph and prove the equivalence between the network and its graph representation as a table of tuples.


You said "even in theory" which is obviously wrong, since the (local) universe is finite and deterministic, hence it is itself a giant lookup table.


> the (local) universe is finite and deterministic,

Radioactive decay and spontaneous pair production say otherwise on the deterministic front.


> the (local) universe is finite and deterministic

"It is not possible for the Universe being deterministic at any level. Only theories can be deterministic, practical reality is never"[0]

Q: Can you calculate your local universe's past states given its present state?

[0] https://philosophy.stackexchange.com/questions/99163/is-it-p...


Where are you going to get all that time and space to build a lookup table? Are you sure you're able to measure all state at enough precision to make an accurate table?


Doesn't matter given the original statement spawning this subthread was:

> People can not be reduced to lookup tables even in theory


What theories are you using to solve for:

- consciousness?

- the unknown?

https://en.m.wikipedia.org/wiki/Necessity_and_sufficiency

- the misunderstood?

https://plato.stanford.edu/Entries/perception-problem/

The Science of the Gaps will do I suppose?

Culture could do it though I think.


Can you rephrase that?

It currently reads like shifting goalposts, and I'd like to guess that was not your intention…


Any theory that asserts exhaustive coverage of people would need to take all relevant aspects of reality into consideration, so I suggested some of the trickiest things that are relevant.

Unfortunately for me, they are so tricky that they "don't count" (try, genuinely, to model the reality bending capability of people in a theory, I would love to see that!).


> model the reality bending capability of people

Much to the disappointment of my teenage self who would really have liked the shape-shifting spell to work, I don't see any evidence we can bend reality.

--

> consciousness?

I think this is a red herring. We can talk about P-Zombies, but we lack the means to determine if some random human (let alone AI) is one.

> the unknown?

> https://en.m.wikipedia.org/wiki/Necessity_and_sufficiency

> the misunderstood?

What about them? I still don't know why these are an interesting problem in this scenario.

> https://plato.stanford.edu/Entries/perception-problem/

Isn't one of the big criticisms of AI at the moment the fact that they do this slightly more than humans, and we can point and laugh at them?

(While conveniently forgetting that half of us heard Yanny and the other half Laurel, that half of us saw blue and black while the other half saw white and gold, etc.)

> The Science of the Gaps will do I suppose?

A reference to the ever diminishing role for God in the late 19th century onwards, but I'm not sure how you're using it here?

> Culture could do it though I think.

Banks? Sure, but fictional.


All this considered, and during this process, did you happen to form any conclusions (or ~"update weights"), consciously or unconsciously (in reality)?

I think it's interesting how the human mind can "know" whether things that are unknown can be modeled, or not, and I happen to believe that this phenomenon occurs within reality (where I believe the comments within this thread are), bending that portion of it. I also believe that this phenomenon is fundamental.

But then, there "is" "no evidence" for any of this...and we all know what that means!


Then I would say your theoretical model is wrong or incomplete or makes for a circular argument (it's an assumption and not proven that finite matter evolving through time reduces to a lookup table).


Simply not true. Of course it is comforting for computer people to believe the world they live in is a giant computer, but that is not our real reality.


"""the number of bits required to perfectly recreate the natural matter of the average-sized U.S. adult male human brain down to the quantum level on a computer is about 2.6×10^42 bits of information (see Bekenstein bound for the basis for this calculation).""" - https://en.wikipedia.org/wiki/Orders_of_magnitude_(data)

(That said, I think quantum physics makes it "all a Markov chain" rather than "all a lookup table").
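The quoted figure is easy to sanity-check: the Bekenstein bound is I <= 2*pi*R*E / (hbar*c*ln 2), and plugging in rough, assumed numbers for a brain (about 1.5 kg, radius about 6.7 cm) reproduces it:

    import math

    m, R = 1.5, 0.067                     # rough brain mass (kg) and radius (m)
    c, hbar = 2.998e8, 1.055e-34

    E = m * c**2                          # rest-mass energy in joules
    bits = 2 * math.pi * R * E / (hbar * c * math.log(2))
    print(f"{bits:.1e}")                  # ~2.6e+42 bits, matching the quoted figure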


You are making the assumption that your body consists of a static set of atoms, but your body is a living thing. Your lookup table would end up containing the entire universe to account for extremely remote possibilities.


Those would just be inputs.


My "what if someone made a lookup table of everything I ever said in response to something else" hypothetical is pretty flimsy - I realized that right after writing it.

The point I wanted to make is that concepts of sentience, consciousness, reasoning, intelligence, etc. are very philosophically loaded ideas.

Responding to your comment, I don't think anyone credible is arguing that a human being is somehow the same as a neural network. I think the question at play here is "what constitutes reasoning?" - and more specifically "can a deterministic process reason?"

This is not a new debate at all - an abacus can tell us truths about the world, but we don't consider the abacus intelligent. Is GPT-4 somehow different, or is it a very large abacus?


As a numerical function it can be implemented on an abacus so I don't think it's any different from a large enough abacus. It's practically not feasible but theoretically there is no idealization or abstraction happening when numerical calculations on a computer are transferred to an abacus.


> People can not be reduced to lookup tables even in theory.

Yes they can, this is a direct corollary of the Bekenstein Bound.


Yes, this is one view of machine learning, the idea that you are training some function to map input to output, similar to "looking up" what output is addressed by some input.

And that's why the concept of generalization is so important in machine learning, and, as a consequence, why the internal representation of that "lookup" matters.

By definition a lookup table can only store data it is given. However, the idea of ML systems is actually to predict values for inputs that are similar to, but not present in, their training data.

Interpolation and extrapolation, key components to applying ML systems to new data and therefore critical for actual usage, are enabled by internal representations that allow for modeling the space between and around data points. It so happens that multilayer neural networks accomplish this by general and smoothed (due to regularization tricks and inductive biases) iterative warpings of the representation (embedding) space.

Due to the manifold hypothesis, we can interpret this as determining underlying and semantically meaningful subspaces, and unfolding them to perform generalized operations such as logical manipulations and drawing classification boundaries in some relatively smooth semantic space, then refolding things to drive some output representation (pixels, classes, etc.)

Another view on this is that these manipulations allow a kind of compression by optimizing the representation to make manipulations easier, in other words they re-express the data in a form that allows algorithmic evaluation of some input program. This gives the chance of modeling intrinsic relationships such as infinite sequences as vector programs. (Here I mean things like mathematical recursions, etc.) When this is accomplished, and it happens due to the pressure to optimally compress data, you could say that "understanding" emerges, and the result is a program that extrapolates to unseen values of such sequences. At this point you could say that while the input-output relationship is like a lookup table, functionally it is not the same thing because the need to compress these input-output relationships has led to some representation which allows for extrapolation, aka "intelligence" by some definitions.

The fact that these systems are still very dumb sometimes is simply due to not developing these representations as well as we would like them to, for a variety of reasons. But theoretically this is the idea behind why emergence might occur in an NN but not in a lookup table.
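A toy way to see the difference between memorizing a table and learning a representation that interpolates (here a fitted polynomial stands in for the neural net, just for illustration):

    import numpy as np

    xs = np.linspace(0, 1, 5)
    ys = np.sin(2 * np.pi * xs)

    table = dict(zip(xs.tolist(), ys.tolist()))        # pure lookup: only stored inputs work
    model = np.poly1d(np.polyfit(xs, ys, deg=3))       # fitted function: interpolates

    x_new = 0.33                                       # never seen in "training"
    print(table.get(x_new))                            # None -- the table has nothing to say
    print(model(x_new), np.sin(2 * np.pi * x_new))     # rough interpolated guess vs. truth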


Take a relatively simple large language model like Llama 1. It has a context of 2048 tokens and each token can be one of 32,000 values. So the lookup table would need 32,000^2048 entries. That's not just impractically large, that's larger than cosmically large. There are only estimated to be about 10^80 atoms in the visible universe. So while a 32,000^2048 lookup table might be a valid concept mathematically, it's not anything you can intuit physically, and therefore not something you can say is incapable of reason.
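The "cosmically large" part is quick to quantify:

    import math

    vocab, context = 32_000, 2048
    print(round(context * math.log10(vocab)))   # ~9226: the table needs ~10^9226 rows,
                                                # versus ~10^80 atoms in the visible universe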


Every program is a compressed representation of its output. This is from Kolmogorov complexity, which you learn in any CS complexity theory course.

So, a neural network being a compressor/decompressor is nothing special.

Note, however, that supposing a context window of 1000 binary units, we are looking at K = 2^1000 ≈ 10^301 different entries in the truth table. Somehow, your LLM neural network is the result of compressing a 10^301-scale amount of possible information, which of course could never all be seen -- to compress a JPEG you at least have access to the original image, not just two pixels in it.

Anyways, the philosophical debate is whether you believe programs can think, whether machine intelligence is meaningful at all by definition. Some say yes, others say no. When humans think, are not our abstractions and ideas a kind of compression?


This is an old argument against determinism - I think a serious challenge is that:

1. Modern physics suggests you can implement such a lookup table for any subset of our universe.

2. We are a subset of the universe.

3. Therefore we are representable by lookup tables too.

...so your argument appears to prove too much, namely that humans aren't thinking beings either. Which is fine, but personally I don't think that's a useful definition of "thinking".


We're not a lookup table of the things we're saying or doing, etc. Nor are we looking things up, in this sense, when we act.

I.e., when you compress text into an NN and use it to generate text, the generated text is just a synthesis of the compressed text.

Whereas when I type, I am not synthesising text. Rather I have the skill of typing, I have an interior subjectivity of thoughts, I have memories which aren't text, and so on.

When my fingers move across the keyboard it isn't because they are looking up text.

Our causal properties (experiencing, thinking, seeing, feeling, remembering, moving, speaking, growing, digesting ...) are not each an "index on the total history of prior experience" or an "index on the total history of prior seeing". The world directly causes us to see, for example -- seeing isn't a lookup table of prior seeings.

(Also, the whole of physics is formulated in terms that cannot be made into a lookup table; and there is no evidence, only insistence, of the converse.)


I strongly disagree with your last statement - physics explicitly _is_ formulated in terms that can be made into a lookup table (see phase spaces in classical mechanics, for instance).

My point is that there's a finite light cone of possible causal influences over you at any moment in time, and in principle you can break those down into state variables finely enough to predict future states of a person. This is isomorphic to a lookup table, albeit one we aren't able to construct right now.

I'm not suggesting it's enough to consider just the person in this scenario - the causal factors are part of the lookup.


How do you look up quantum mechanics? Please tell the physicists about your breakthroughs.


No need, physicists already do this all the time - any computer simulation of quantum mechanical systems has to come to terms with the same problems (namely quantising the state space and representing the dynamics deterministically).


Physicists simulate on computers only what can be simulated, which is almost nothing. Consider obtaining the dynamics of water by simulating all its parts: the proton flow, hydrogen bonding etc. of 10^{PHYSICALLY UNCOMPUTABLE} interactions.

The simulations which do exist fail to model vast amounts. This is why, say, climate change is given as a prediction on temperature -- because it can be obtained as a mean which ignores "basically everything".

And it can easily be shown that the assumptions of QM are false if Hilbert space is computable (QM becomes non-linear); likewise those of classical mechanics (which becomes non-deterministic); and so on. I.e., the issue isn't merely 10^{PHYSICALLY UNCOMPUTABLE}, it's that non-computable functions are essential to the formulation.

The assertion that the world is computable is just that: there are no research projects, no textbooks, no experiments, no formalism to replace physics or anything like it -- nothing. All the basic assumptions of physics would have to be false, and we would have to have good reasons for supposing so.

This is just nonsense. The world is geometrical as described by physics. It is not computational as described by the discrete mathematician whose megalomania and platonism know no bounds.


To be honest, I don't really understand what you mean by Hilbert spaces being computable, and what that has to do with the linearity of QM, determinism of classical mechanics, universe being geometrical and not computational etc. I'm familiar with all of those concepts, but not sure how they tie together here. If you have resources you could share I would appreciate it (I had little success with google).


computable = expressible as Int -> Int

Hilbert space = set of functions in Real -> Real

geometrical & non-computable = Reals

determinism = g(x, t_future) fully set by g(x, t_now) and g

if you model a geometric, g : Real -> Real with computable, c : Int -> Int then there are gaps at arbitrarily high precisions, say p (eg., p = delta(g, c) at (x, t))

construct a classical system of arbitrary complexity (eg., 10^BIG interactions), describe each interaction with g. Since 10^BIG are required, "delta(g, c) < BIG" is required in order for the system to remain deterministic (i.e., described by g). We can easily find cases where BIG > delta(g, c), so CM would be non-deterministic if g is replaced by c.

As for QM, these "gaps" cause much deeper contradictions with the premises of QM.

If you replace the wavefunctions, g, with computable ones, c, then they don't sum to solutions of the wave equation, so QM fails to be linear (the delta(g, c) are massive because Hilbert space is infinite-dimensional).

Now it might be that reality is really computable in the sense that there's some c which can replace g, but this would violate the assumptions of physics and has no motivation. Physics might be wrong, but there's no evidence of that.

There are also other issues, but these are just two off the top of my head.

References: look for physical Church-Turing, the Church-Turing thesis, non-det and det in chaos theory, non-det in classical mechanics, physical interpretations of the reals -- this will be in postgrad work, it won't be in popsci books.


> if you model a geometric, g : Real -> Real with computable, c : Int -> Int then there are gaps at arbitrarily high precisions, say p (eg., p = delta(g, c) at (x, t))

Nobody takes "computable approximation to g: R -> R" to mean "a computable function c: R* -> R" where R is the computable reals. There are many mathematical issues with this caused by self-referential programs (realised by Turing himself in "On Computable Numbers"). Typically you would model it as "c: R* x Q -> R*" where Q is a rational describing your desired precision, right?

> Since 10^BIG are required, "delta(g, c) < BIG" is required in order for the system to remain deterministic (ie., described by g).

I'm not sure what you mean by this - the computable approximation "c" is deterministic essentially by definition. If you mean "in order to remain within some bound of g" I can kinda see what you're saying but in that case you can interleave computations with smaller and smaller precisions (the "Q" I mentioned) in order to work around that issue, right? It won't be efficient, but it will certainly be computable.

> References: look for physical Church-Turing, the Church-Turing thesis, non-det and det in chaos theory, non-det in classical mechanics, physical interpretations of the reals -- this will be in postgrad work, it won't be in popsci books.

Thanks! I don't know much chaos theory, I'll have a look around for a good textbook.

Edit: I just want to say - you have a pretty wild way of writing that makes it hard for me to tell if you're a crank or not. Either way, reading your posts here has given me a ton of food for thought =) what's your background?


1 yr medicine, 6 yr physics, 4 yr debating union, 20 yr c programming, 20 yr love of political and stand up comedy, 15 yr software eng, 10 yr data scientist, 15 yr python, 22 yr informal & formal philosophy, 8 yr data sci & software consult/coach to finance/defence/... and maybe soon, 4 yr PhD AI & HCI

Of those, you may decide which is the most relevant to my writing style. The amount of theatrics and irony in a live delivery might change the interpretation.

Replacing R with Q is just replacing it with (Int, Int) -- so be it. My claim concerns whether CM assumes determinism (it does) and therefore requires infinite precision; any gap whatsoever that goes missing means P(t_next | t_now) < 1.

You might say this indicates reality doesn't follow CM, and so that CM is wrong and some (now less hegemonic) views of QM are correct -- reality isn't deterministic.

Fine, but QM makes the situation worse, since its linearity is now under threat: we would not be able to compose QM systems linearly if the wavefunctions didn't have infinite dimension.

One important assumption here is that we ought to take the explicit and implicit assumptions of physics as given as our starting point, i.e., Prior(Physics) = High, and Posterior(NotPhysics|Physics) = Low.

So the dialectical burden is on the "computationalists" to show that there is a workable theory of physics, at every level which preserves either (1) the assumptions of physics; or (2) motivates why those assumptions are wrong non-circularly.

Given the premise on priors above, the argument "physics is wrong because reality is computational" is both circular and unpersuasive (this doesn't mean it's wrong, just that no reason has been presented).


Wild, well, I'm a lowly math PhD so that's where my interests lie =)

I'm _not_ suggesting we replace R with Q. I'm suggesting that you bake in the desired accuracy of your computational approximation as an input. This is how Turing evades self-referential problems in his conception of computational reals, and also perhaps how you evade your criticisms with CM requiring infinite precision.

Similarly - I think it's reasonable in a computational context to assume linearity up to an error bound that is provided as an input. Of course things become non-computable if you ask for exact linearity. Equality itself is non-computable!

Either way I think we agree about physics. I don't believe the universe is describable as a computable function. Merely that we can approximate it to arbitrary degrees of accuracy =P


I think we teach people only what we can write in finite formula, and compute in finite time.

This is, imv, much like teaching people what's under a street light just because everything else is in darkness.

I think, philosophically, we can build inferential telescopes that point to the vast (epistemic) blackness, inside say, a proton, or a cell, or the chaos in water.

As an ameliorative, or therapeutic project, I think people who build computational models too much should meditate on the number of protons flowing free in a drop of water, and what properties their interactions might bring about. And whether it would ever be possible to know them.


Poetic. Perhaps you're right.


I've read this thread exchange with interest, but what about the results that quantum computers are simulatable by classical computers? See David Deutsch 1985. This would reduce the issue of infinite Hilbert spaces to simulation using quantum computers, and in turn, Deutsch's result which says classical Turing machines can actually simulate quantum computers.


You can always make local arguments that, say, some g can be substituted with some c.

The issue is broader than that. It concerns the premises of vast areas of physics -- you have to show they are more likely false than true.

This isn't an argument saying no c can be found for any given g; it's saying "g-c gaps have empirical consequences we haven't observed", and if we did observe them, physics would be foundationally wrong.


When they assert theorems like "classical TMs can simulate quantum TMs" they mean the simulation is gapless. Otherwise they use the term approximation.


> The assertion that the world is computable is just that: there are no research projects, no textbooks, no experiments, no formalism to replace physics or anything like it -- nothing. All the basic assumptions of physics would have to be false, and we would have to have good reasons for supposing so.

I have no idea what this means. Physics must be computable from straightforward physical arguments like the Bekenstein Bound: finite volumes must contain finite information. Any physical object has finite volume at any given time, the universe included, ergo they must contain finite information. Any system consisting of finite information can be modeled as a finite state machine.


Thermodynamic entropy isn't information in the relevant sense.

There's a wide class of computational mysticism born of people going around and equivocating between "information" as it means radically different things where it is used.

thermodynamic entropy (a real number) != information theory entropy (bits) != information in csi != information in stat mech != information in QM != information in a Turing machine != ...

This is basically pseudoscience at this point. If you hear people talking about "information" as if its defined in a general sense (ie., equivocating across physics, computer science, etc.), they have no clue what they're talking about.

Eg., the "entropy" of real-valued quantum states as measured by integer-valued notions of entropy is 1 bit (the measured state is UP,DOWN) -- but QM requires the state be real-valued (having infinite information in the computability sense).

These kinds of information are not measuring the same thing, and largely irrelevant to each other.


I enjoyed the thread the other day, just want to point out there are some basic misunderstandings here - informational entropy and thermodynamic entropy in fact _do_ have a deep connection (Edwin Jaynes famously wrote about this). The key idea is that both are measures of the uncertainty of a system with respect to some distribution over it.

Saying that a real-valued state has infinite information in the computability sense is nonsense - information is a property of a _distribution_, not a _state_. You could talk about the Kolmogorov complexity of a real-valued state, but even this is generally not infinite, as anyone who's written a program to generate the digits of pi can attest.


> but QM requires the state be real-valued (having infinite information in the computability sense).

The unobservable state, which is merely a physical model that may have little resemblance to reality. All observable states necessarily have finite precision and beyond 60-70 digits are effectively undefined due to the uncertainty principle, which is yet another reason why people suggest physics is effectively computable.

While the types of information you mention are not strictly equivalent in some 1:1 sense that I don't think anyone has really suggested, there are formal correspondences, so your explanation ultimately just seems like a lot of special pleading, eg. you can derive a Bekenstein Bound for bits, thermodynamic entropy, information in QM, and so on.


No one here disagrees that measurement produces finite information; that is obvious and necessary -- if it wasn't, we would never be able to know anything. Knowing requires an "early termination".

The issue is that there's no evidence this property of measurement is a property of reality, and all the methods, premises, etc. of physics attribute the opposite to reality.

Here, it is absolutely necessary for QM to work that the unmeasured state is real-valued.

I'd also say that since measurement is finite in this manner, it then follows that large swathes of reality are unknowable… and this makes it clear why we cannot obtain the latent state of a QM system.


> Here, it is absolutely necessary for QM to work that the unmeasured state is real-valued

You're just doubling down on the premise that a theory founded on a formalism based on unbounded numbers requires unbounded numbers to work. Sure, but why is that necessarily reflective of reality? Why does that entail that no other formalism that doesn't embed infinities / continuity is also not possible? I simply see no reason to accept your conclusion. The infinities you see as essential could very well simply be artifacts of our formalisms.

In fact, I'd conjecture that our continuous formalisms are at the very heart of some core problems in physics [1], and that at least some of those problems can be resolved by exploring more discrete formalisms. I suppose we'll see.

[1] https://arxiv.org/abs/1609.01421


"I am not synthesising text"

Then what are you doing?

I think you are falling for the same arguments as the 'mystics'. Somehow your inner thoughts are unexplainable. But nothing in your argument explains it; you are just taking your own inner experience itself as the mystical explanation.

The old 'I think therefore I am' argument.

And where did the 'thinking' come from?


It's entirely explainable; I just explained it in the other comment: "Somatosensory representations are built by the sensory-motor system."

Do you really think the meat of my body is growing so as to record every symbol I've seen, or even an induction across them?

I find this inability to think outside of the switching frequencies of a silicon chip as they model the patterns of text tokens on reddit, absolutely bizarre.

You're an ape, have some appreciation for it. You're much more interesting than what openai can steal from amazon's ebook library.


My view is opposite.

"inability to think outside of the switching frequencies of a silicon chip"

Why can't you "think outside", and conceptualize that this 'internal subjective experience' we are having, is not unique, and could be taking place inside a silicon based NN?

It helps to meditate and observe your own thoughts, and how they arise. You can begin to realize that you don't 'think'. You don't think about what to think and thus think it. Thoughts just come, unbidden.

It's back to the Schopenhauer quote: "A man can do as he will, but not will as he will."

Then, when you begin to notice the mechanistic nature of your own mind, you will have less resistance to how 'silicon' is also reacting. Silicon and Carbon both reacting to inputs, processing.


Organisms are mechanistic, there's nothing "not mechanistic" about the mind.

It's insane to suppose that somatosensory representation building, which requires organic neuroplasticity to be connected to the organically adaptive neuromotor system, etc. etc. etc., can just be instantiated in a bit of sand.

This is deeply mystical, pseudoscience.

You're credulously throwing away any kind of empirical analysis of the world in terms of its properties and their mechanisms for the deeply mystical view that, uniquely amongst all properties of the world, consciousness needs no empirical analysis of the properties of the systems which have it.

Of gold we ask: what makes it shine; of fire: what makes it hot.. and so on for everything empirical in the world.

But with the mind we must stop! No! No! do not do any science! please that might mean we can't be scammed by a VC out of our investment money; I cannot babble endlessly about sci-fi Star Trek episodes! no no! please do not rob me of my sci-fi religion! please please, do not ruin commander data for me!

Well there is no commander data. And the properties of gold are not those of sand, nor those of animals. And just as no bit of silicon will be transmuted into gold by the running of an NN on its electric field; likewise, no bit of silicon will desire or wish or conceive of anything.

We are biological organisms; we do not have souls, even if in your religion, the soul is "a pattern". Your consciousness will not be uploaded; commander data will never visit; and your local VC shyster is on a stock manipulation grift to bamboozle you out of money.


Really? I'm not sure what you are getting at. From your last few sentences, I think we actually agree and are maybe just misunderstanding each other.

I'm agreeing to the mechanistic nature. Nothing mystical. Just the physical world. No souls. I was trying to get you to see the mechanistic nature of mind, and then you tell me to stop being mystical?

You think that generating a mind from sand is 'mystical'. "just be instantiated in a bit of sand?"

But you don't seem to realize that carbon is also just a basic element. So why is any reasoning based on carbon non-mystical, but silicon mystical?

Then some other sentences seem to be </sarcasm>. But in these discussions it is hard to tell when someone is being sarcastic or making a point.

Like this sentence:: "We are biological organisms; we do not have souls, even if in your religion, the soul is "a pattern". Your consciousness will not be uploaded; commander data will never visit; and your local VC shyster is on a stock manipulation grift to bamboozle you out of money."

I totally agree: no soul, it is a pattern. That does not mean we can't create a 'mind' from silicon that is based on a pattern and is equivalent in functionality.

I also agree we will never be uploaded. I don't think we'll ever be able to measure human neurons to a degree that would allow this. But that doesn't mean we'll never build a NN with as many connections as a human brain has, and that it won't also be able to have a 'pattern' of self.

And there are a lot of VC shysters out there; that doesn't mean a lot of real progress is not being made.


You think that reality is a pattern, and that properties obtain from configurations. E.g., that, of course, we can transmute lead into gold. Alchemy.

The problem with this is that patterns have semantics: a pattern of wood does not have the same properties as a pattern of lead. The pattern isn't the important bit.

If you want to turn hydrogen into lead you first have to fire up a star and wait a very, very long time, and as protons and electrons bundle up into ever more complex configurations, interactions between them come to dominate their properties... so that lead is nothing at all like hydrogen.

What is the only known element that enables "weak polymerisation", i.e., adaptive self-replication at the molecular level? Carbon.

What are the properties of all intelligent systems known to science? They're organic.

Why? This is no coincidence. In order to think, you have to grow -- self-replicate at every level, from cells to tissues to organs... the material plasticity of the machine that holds your thoughts has to itself adapt its structure to that of your body (the device which explores the world).

Now, should we expect hydrogen to do that? No. Silicon? No. Why would we? Ah, only because we loathe the idea of being apes, and of oozing... our biological disgust instincts tell us to run far away from it. How much more beautiful if, like in Genesis, we could fashion man out of clay and breathe in life.

But alas, we aren't clay; nor will clay ever think or desire or want... clay cannot take impressions of the world without becoming an impression. It cannot adapt.


"The pattern isn't the important bit."

I was taking 'pattern' to mean the electrical impulses in the brain, which I would say are everything when it comes to thought and thinking.

Human Neurons are just Calcium Voltage potentials. Just like weights in a NN.

Yes, human brain neurons are far more complicated than that. There is a lot of chemical soup -- hormones and modulators -- that factors into the voltage potentials. What you eat can impact the gut, which impacts how many neurotransmitters are produced, which impacts when a neuron fires, which is experienced as a mood, etc. So yes, the human brain cannot be separated from the body.

There is some entity, brain+body. It so happens that it is made from carbon. We call things based on carbon 'organic', and since it can reproduce we call it 'alive'. These are just definitions we have assigned to things.

Just like a CPU can't be removed from its power supply and keep working. There is a 'system' and it has 'parts'.

That doesn't mean, we can't model a brain to a degree close enough that you couldn't tell an AI and Human apart.

I'm just saying that when we do reach that point, then we'll be forced to realize the AI also has an internal subjective experience and is 'conscious', or we'll have to acknowledge that humans do not.

It will be either/or.

Humans and AI both are conscious and have internal subjective experience, or neither do. And if neither do, then humans really are just hallucinating their inner experience, and are not in control. Humans are also deterministic. Nothing special.

I see from your other posts, that you seem to be putting a lot of faith in Chaos Theory or Quantum Mechanics as some underlying explanation. To give humans the special sauce.

I'd just say that introducing randomness does not make a system un-determined. That there is a lot of randomness in our 'chaotic' body does not imbue it with agency. Randomness != Agency.

Edit: I like the clay example. I'd say that is like the NN that goes through training, then released, but the model is static. Doesn't get updated. Eventually these AI's will learn as they go and be continually updating.


Talking to people is a proxy measure of their subjective experience; it isn't a direct measure, since you can talk to a tape recorder and, so long as it plays back responses, you and the tape recorder are likewise engaging as if conversing.

Since computer science isn't a science but a form of (applied, discrete) mathematics, it encourages people to think in terms of functions with purely mathematical semantics, as if "2 + 2 = 4" were the same thing whether it modelled the division of cells or the printing of a book.

What causes people to speak is our intentions, desires, theory of mind, representational ability, imagination... etc. We speak because we are in a shared world of social intentions, and we desire to communicate something about ourselves or this world to others.. we translate, clumsily, these features of our experience into text tokens; and hope that the agency we are speaking to can recover our mind from those tokens.

Over a million years we have specialised a culture of communication to enable this illusion to take place: the illusion that meaning is in the patterns of the symbols we use.

You can, of course, build a system to perfectly imitate these patterns; just as a video game, if you hold the viewer fixed, may appear to contain a world with a table and a glass. But if you reach for that glass of water, it isn't there: it's an illusion.

This is all statistical AI is: a trick. It's a replaying back of our own conversations to each other, as if it was a real conversation with us.

We can determine, as certainly as you like, that there are no goals, intentions, desires, imagination, counterfactual reasoning -- no body, no observation. The machine is not in the world with us, and is not responsive to the world -- the machine generates text, it does not speak.

Your inclination to analogise the machine to a person rests only on the grounds that you are strapped into your chair and, observing the video game, believe you can grab the glass of water inside.

I am not strapped into a chair, nor do you have to be. You can do science: you can build real experimental explanations of how we form representations, intentions, goals, desires, etc. And it is trivial to explain AI; there is no mystery to "compress reddit and query over its space of text tokens". There is only the illusion that the user is subject to -- the belief that the agency lies in this querying process, and not in the redditors who had cause to speak to each other about their experiences of the world.


Everyone is captivated by the current hot thing, LLM/GPT's.

But LLM's are not the whole of AI research.

A lot of your arguments are based on 'embodied' reasoning. Humans live in the world, they need to eat and survive. LLM's just compress what humans generated in the world. Correct, current LLM's are mostly regurgitating, but they don't "speak because we are in a shared world of social intentions".

I'd say game worlds are the frontier, because they are able to simulate a lower resolution world for current AI's to learn in. And in that world, they do embody it and have purpose (rewards/goals), they need to survive.

DeepMind's AlphaGo was when I switched. https://www.wired.com/2016/03/two-moves-alphago-lee-sedol-re... Move 37, it was called alien, creative, inhuman intelligence.

Now, scale that up to our world, with admittedly, thousands/millions of more variables. Put it in a robot body, with vision (the latest studies show AI vision building a world model context). Add some goals. Bam, some dangerous stuff, AI embodied in the world, with a goal to survive.

It might be far away, but where we are now was supposed to take another hundred years. So who knows.

The military is already running simulations in which the AI's goal function led it to kill the soldier operating the AI: it 'learned' to bypass the operator by killing them in order to achieve its goal.

Yes, Hyperbole. But really, not by much.

But back to our discussion. At that point, does the robot have an internal subjective perspective? Did AlphaGo, when it was reasoning about its small, low-variable world?


It never has an "internal subjective experience" -- AlphaGo wasn't reasoning.

Reasoning is a cognitive process in which propositions, which model the world, are considered in turn and subject to ecological rationality (concerns of utility, effort, interest, preference, etc.).

At no point in the flight of an aeroplane does it ever lay eggs.

You are using smoke to establish fire, these ways of measuring internal mental states of animals only work on animals.

If you can produce a robot with no prior conceptual scheme of, say, a novel apartment it is thrown into; a robot which can then determine what is in that apartment, roughly how it works (e.g., light switch -> lights turn on), and give an account of that apartment; explain why it has explored it; show that its behaviour is moderated and caused by these stated goals; ask it for opinions about the apartment, etc. -- then we are actually playing the intelligence game, at least. Rather than stupid magic lanterns.

Now, does this robot have a subjective experience?

Well, I think we need to keep going with our tests: does it have an aversion to toxic stimuli? Is this aversion moderating its goals and behaviour? Are its memories contextualised by these aversions (e.g., does its process of remembering display variety when remembering negative vs. positive experiences)? And so on.

If I can ask, "Did you find my apartment fun?" and it can answer because it did, or did not -- then we're very close.

That is if we can show the reason it says, "yes" or "no" had to do with a history of taste, judgement, preference, curiosity, etc. all built up by itself -- not under "supervision with the right answers" but with no problem-specific answers ever given... and so on.

Questions of this kind aren't even relevant to anything in AI. Any sincere AI engineer will say that they have nothing to do with the goals of the system they're building. All AI that we can actually access has, 100%, no interest, methods or ambitions to deliver any of the above.

AI isn't even in the category of intelligence; it isn't even trying to produce it.


So I'm trying to understand your argument here, but why isn't "Reasoning is a cognitive process" circular logic?

AlphaGo wasn't reasoning, how? I think reasonably AlphaGo has modeled a world, and it is by design subject to the bounded rationality of a game-theoretic optimization problem. So two of your criteria are satisfied.

So - I'm just reading your definition here - AlphaGo wasn't reasoning because it is not a cognitive process. Is that your specific argument? And if so, then what is a cognitive process?

I'm just focusing on this part here and I don't see the argument clearly:

> It never has an "internal subjective experience" -- AlphaGo wasn't reasoning.

> Reasoning is a cognitive process in which propositions, which model the world, are considered in turn and subject to ecological rationality (concerns of utility, effort, interest, preference, etc.).


> Reasoning is a cognitive process

You could have a non-cognitive view of reasoning, or an embodied one. NNs are a-cognitive systems, they do not engage in reasoning of any form.

Reasoning concerns inference across truth-apt propositions (e.g., A->B, A, therefore B). NNs have no propositions, nor are any of their parts truth-apt, true or false. NNs are statistical systems which select answers by weights found through optimisation. No process, either in optimisation or prediction, is an inferential one in the sense of cognition.
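A toy sketch of the contrast being drawn (purely illustrative Python; the "network" below is just made-up weights, not any real model): a symbolic step operates over truth-apt propositions, while a statistical system only produces a score.

    # Symbolic: modus ponens over explicit, truth-apt propositions.
    facts = {"A"}                       # propositions currently held true
    rules = [("A", "B")]                # A -> B
    for antecedent, consequent in rules:
        if antecedent in facts:         # the inferential step: A, A->B, therefore B
            facts.add(consequent)
    print(facts)                        # {'A', 'B'}

    # Statistical: a made-up "network" scores an input; nothing here is a
    # proposition or true/false, it's arithmetic over weights.
    weights = [0.7, -1.2, 0.3]
    x = [1.0, 0.5, 2.0]
    score = sum(w * xi for w, xi in zip(weights, x))
    print(score > 0)                    # a thresholded number, not an inference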

I also deny that the "world", as used by frankly just philosophically incompetent ML researchers whose familiarity with basically anything outside pytorch is grossly lacking, is even relevant to the sense of "world" that propositions bear a truth relation to.

A world in the relevant sense isn't the state space of the training data -- this is an insane supposition which makes the claim "AI has world models" actually circular. The relevant sense of world is the cause of the training data. If the training data is about an abstract game then you collapse the distinction since the rules are the data.

The famous "NNs learn WMs" paper is just this: choose a system whose data is just a restatement of the system; rather than a measure of a world.

NNs do not form representational models of the cause of their measurement data because all they do is induce (ie., compress by function-fitting) across the measurement space. They model the measurements not their causes. This is only "predictive" of features of the causal origin of measurement data in rigged scenarios, and in general, fails catastrophically to be predictive.

Consider running a NN across photos of the sky: it is impossible for this process to produce Newton's law of gravity. The weights are just models of the pixels, and these are not distributed according to that law.

Worse, in general, there is no function from the measurement space to properties of its causal origin -- so it is impossible to build representations by induction. (eg., there is no function Photo->Cat|Dog, the distributions of pixels in photos is ambiguous, and changes over time).
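A toy illustration of "modelling the measurements, not the cause" (made-up numbers, numpy assumed; not meant as a serious physics experiment): fit noisy observations of a falling object, then extrapolate.

    import numpy as np

    # Noisy observations of y = h - 0.5*g*t^2 over the first two seconds.
    rng = np.random.default_rng(0)
    g, h = 9.8, 100.0
    t_obs = np.linspace(0.0, 2.0, 20)
    y_obs = h - 0.5 * g * t_obs**2 + rng.normal(0, 0.5, t_obs.size)

    # A flexible polynomial fits the samples well but knows nothing of the law.
    fit = np.polyfit(t_obs, y_obs, deg=9)

    t_new = 4.0                              # extrapolate beyond the data
    print(np.polyval(fit, t_new))            # typically wildly off
    print(h - 0.5 * g * t_new**2)            # the causal law gives 21.6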

Reasoning, as in cognition, is a (logically) inferential process which considers propositions that bear a truth relation to the world, the world being the causal origin of the concepts the propositions comprise (created by a biogenerative process). It is the activity of an agent with an interior subjectivity and ecological rationality. Reasoning is done by an agent about something of interest to that agent, with motivation towards a goal the agent has, in the service of the agent's preferences, etc.

If reasoning is an abstract pattern, then rice falling to the ground is likewise "reasoning".


This is kind of ignoring other NN's. You're very focused on LLM's as the example.

AlphaGo learned by playing itself. And is able to anticipate multiple moves ahead.

Then, that same 'engine', was able to be applied to Chess, and learned how to beat a master from scratch, by playing itself, in just a few hours.

There were no lookups, or zipping of aggregated data.

A lot of what you are postulating as cognition, humans don't do either. Humans didn't figure out gravity from photos of the sky (unless you mean tracing stars and planets, and then yes an NN probably can figure it out). Many humans go through their whole day without analyzing propositions and inferring reality and what to do next. Humans are similarly un-conscious.

A lot of AI news all the time. This link is from today. A little more in the theme of 'world building models', than this current post about LLM's.

https://news.ycombinator.com/item?id=39692387

There was another one about an NN that could pass an international geometry competition. It was based on propositions and reasoning.


Yes, it is impossible for anyone to figure out gravity by induction. That's the problem with AI.

The way we build explanations is largely by reasoning-by-analogy. We build physical models with our hands, to resolve ambiguities in our environment, ever more complexly -- and then, eventually, land upon the right analogy that then falls away.

Prior to gravity we had crystal spheres -- reasoning by analogy with such things.

Since machines aren't in the world, as in my robot example, they can never build explanatory conceptualisations of it.

Chess et al. are not worlds in any relevant sense. No one cares, or doubts, that a system with a fully mathematically specified "world" can be "learnt" by a computer.

The full specification of a "world" in mathematical terms doesn't require intelligence. At that point you can use the dumb strategies of AlphaGo.

Intelligence is what you do when you don't know what you're doing. The "world models" we're interested in are those that aren't already specified to the machine.

All these formal games are just outcome spaces where every event is known a priori.

As I said, the AI people aren't even operating in the category of intelligence. It is just the profound lack of background knowledge of any field outside pytorch.lol() that makes their megalomania plausible.

Go look at every paper in AI or ML that purports to build a model of anything: can you find a single one where the outcome space cannot be fully specified either formally (as with chess, etc.) or empirically (as with data samples)?

This has nothing to do with intelligence. We do not either start from the answers, or samples of the answers. We have no answers.

The resolution to this problem requires having a body: you have to move in order to think, in direct causal contact with the world beyond the capacities of clay, to think about it.


Guess this is the crux:

"The full specification of a "world" in mathematical terms doesnt require intelligence. At that point you can use the dumb strategies of alphago."

I fall in camp that humans are just glorified amoeba, twitching at stimuli.

Eventually an AI could model 'us' with dumb strategies, because really baked into the human brain/body are just dumb strategies.


Dumb is the wrong word; let's say: cheating.

All AI at the moment is just cheating with a fake UI.

We don't learn about the world by first being told what it is like. If you can fully specify an abstract system in mathematics, or use a historical corpus to answer questions --- you're nothing more than a kid cheating.

You seem to think that cynicism requires believing that animals are not, by construction, any different to incredibly absurdly dumb engineered works of our most over hyped morons.

This isn't cynicism, or scepticism, or erudition or sophistication. It's megalomania.

The whole history of evolution has not produced, in us and most animals, (likely) the most complex object in the entire universe just to do something that AlphaGo is doing. The level of ego here is off the charts.

This view is only a product of a profound ignorance of zoology (and so on) -- and a deep, deep anti-intellectualism which says, "reality is easy to know, just build a computer program".


To focus on one issue, the neural machine that is chosen by optimization is one that "best" fits the photos of the sky. But those multiple optima do not preclude a neural machine whose parameter values are computationally equivalent to, say, a 3D representation of the sky projected onto a 2D perspective -- a kind of partial world theory or world model, that was picked randomly out of many optima. First, it's not impossible, just highly difficult to find at present technology. Second, the papers describing emergent structures or emergent information inside of actually-existing neural nets point to an empirical possibility that these machines are more than their statistical parts. Both these reasons incline me to stay on the fence on whether neural nets are purely stochastic parrots.


> whether neural nets are purely stochastic parrots.

Well, we know how they work; it isn't speculative. All gradient-based algorithms on empirical outcome spaces are just kernel machines (i.e., they weight their training data and take averages across it using a similarity metric).
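Roughly the shape of that claim, as a toy sketch (the data and the RBF kernel here are made up for illustration; the kernel gradient descent actually induces is more involved):

    import numpy as np

    # Predict by weighting the training targets with a similarity measure
    # to the query point, then averaging.
    X_train = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y_train = np.array([0.0, 1.0, 1.0])

    def rbf(a, b, gamma=1.0):
        # similarity metric between two points
        return np.exp(-gamma * np.sum((a - b) ** 2))

    def predict(x):
        w = np.array([rbf(x, xi) for xi in X_train])
        return np.dot(w, y_train) / w.sum()

    print(predict(np.array([0.9, 0.1])))   # dominated by the nearest training point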

Insofar as the output seems as if to reason, it is because the input was produced by reasoning (of people). If you input text documents which have not been structured by reasoning agents, then the system doesn't work.

As for the idea of AI building generative 3D models and then projecting to 2D -- yes, indeed, that's how we did it... but there are very large infinities of 3D models all producing the same 2D.

This is where the "start from known outcome spaces" strategy of all existing AI fails. You cannot scan an infinity, or even sample meaningfully from it.

In other words, the AI has to build such "deep models" circumstantially; it has to have a very limited set of them, and these have to 'somehow' be necessarily close to reality.

How do we do this? No mystery: we are in reality, and so we are in an ecological interplay with our environments. The environment isn't, in Cartesian terms, an evil daemon -- it doesn't lie, and doesn't tell the truth. What it does do is act reliably in reaction to us.

Via these means we explain.


Or, we don't surely know what deep nets are doing. If I give you an LLM or AlphaGo, you cannot look at it and tell me what it does. It's a bunch of parameters and edge weights. The counterargument is something like, deep nets are overparameterized and the gradient descent process does not reflect the final result. You would think that the large infinities of correct/incorrect 3D models are impossible to choose from, but in practice some have found emergent structural properties - like board positions, formal grammar fragments, etc. - enough to at least suggest that we don't understand how they work, and that it is a conflation/reductive error to call deep nets the same kernel or statistical machines as before.

The above isn't my own argument, as I'm not an expert. But theoreticians have been looking at this, and the ones posing this counterargument come from outside the ML community/Google/OpenAI, so you can't attack it as the wild delusions of ML researchers either. The lectures I watched were by an IAS professor in theoretical computer science, not ML people. Another professor's lecture I started watching has a background in signal theory and probability/statistics; if even he says "we don't know what's going on with deep learning", I tend to give that some credence and update my own uncertainty.

Now, I get your argument in that you are repeating everything Chomsky has said regarding explainability, evolution of human cognition and "truth of the world", statistical machines being fed a corpus of human-understandable information be it Internet text or Go game moves. Chomsky's criticism of ML-based "AI" covers all of this and I don't see your argument as introducing anything different from his (feel free to correct me if I've misread your remarks). I myself actually started on his side, now I'm a little on the fence and can see both sides more clearly.


sidebar

"If reasoning is an abstract pattern, then rice falling to the ground is likewise "reasoning"."

Technically, I think some philosophers over the ages have made that point, and argued that 'rice falling' is reasoning, and the rice is 'wanting' to be closer to the 'earth' or some such thing. It does sound wacky. Think Leibniz and monads made some argument like that.

And the various takes on 'Will to Power', Schopenhauer argued the will is a blind force. We don't have control of our own thoughts.

I think you are still giving humans too much credit for this aspect of 'cognition'.

"A man can do what he wills, but he cannot will what he wills." Schopenhauer


Schopenhauer's idealism is indeed very similar, as is all idealism, to this computational mysticism.

What none in this tradition considered possible is that the world exists; each reduced it down to a purely formal pattern one way or another.

Thankfully, today we treat mental illness, rather than raising derealisation and depersonalisation to this status.

I operate in a framework where there's a world and we're in it, and it is one way and not another, and the way it is arises from spatio-temporal properties that aren't equivalent just because the symbols we use in models of them are isomorphic.

In other words, I have passed through my phase of insanity and arrived back in the world where the grass is green because it is green, and the chair heavy because it is massive.


"Before one studies Zen, mountains are mountains and waters are waters;

After one gains insight through the teachings of a master, mountains are no longer mountains and waters are no longer waters;

After enlightenment, mountains are once again mountains and waters are waters."

------

or another i like

"Before enlightenment; chop wood, carry water. After enlightenment; chop wood, carry water.”

---------

Yes. I do get where you are coming from.

guess after doing a lot of meditating on 'no-self'. I started picking apart my own mind, and realizing it is all just electrical sparks and chemicals.

But then my engineer mind jumped in and said, of course we can model this. So then I get into arguments on the internet about how, of course we can model this.

Let's say we both agree on the real world existing, with us living in it. We are just arguing over the degree to which we'll be able to model a human -- a model being, of course, an approximation. And I lean pretty far in the direction that once we model 'us' sufficiently to be indistinguishable from the real 'us', it will have also generated some subjective experience. (I firmly believed this until just recently; after reading 'Blindsight', I'm having doubts.)

I think the Idealists back in Schopenhauer's day were doing their best to describe the real world. Sometimes it sounds mystical, but that was before (or during) the scientific revolution. The terminology is all different, and they didn't know a lot we take for granted. I wouldn't call them mystics just because they didn't have today's knowledge.

The whole 'noumenal' world versus 'phenomenal' world is a valuable concept. Our senses and mind only form an internal 'model' of the real world; it isn't the real world. We can agree water is wet -- but by how much? There are lots of studies about how different people perceive objects as moving faster/slower, etc., based on fear or anxiety. Our inner 'perception' isn't 'accurate'; it isn't the actual real world. Just as a computer's wouldn't be.


> It's insane to suppose that somatosensory representation building, which requires organic neuroplasticity to be connected to the organically adaptive neuromotor system, etc. etc. etc., can just be instantiated in a bit of sand.

Why? That's just an assertion, not an argument. All of those big words and sophisticated concepts you used all reduce to field interactions that we've mostly captured mathematically. Unless you're imbuing these things with some magical essence that can't be observed, but then who's doing the pseudoscience?


I'm just driven mad by how people think that physical objects having specific properties is "magic", but reducing reality down to abstract platonic mathematical forms is "science".

This is the opposite.

Gold isn't lead. Lead isn't wood. Cells are not bits of metal. Bits of metal do not polymerise. Bits of sand do not form weak covalent bonds...

It's kind of exhausting that this mathematical superstition is so prevalent among people who otherwise believe they are somehow against magical thinking.

There is no more magical a kind of thinking than supposing that you can ignore reality, describe it in a formula, reinterpret that formula against some other reality, and expect it all to work out.

As if, "2 + 2 = 4" means the same thing when it's "2 drivers + 2 drivers = 4 deaths" vs. "2 cookies + 2 cookies = 4 happy children"

The idea that reality is essentially mathematical and not essentially physical is pythagorean magical thinking. Science says the opposite.

All the mathematics in explanatory scientific laws are just paraphrases of descriptions of the physical properties of systems. None of it is actually mathematics.

Physics does not study "2". It studies there being earth and the sun and a force between them, summarised as "2 masses" etc.


> Gold isn't lead. Lead isn't wood. Cells are not bits of metal. Bits of metal do not polymerise. Bits of sand do not form weak covalent bonds...

That's not the argument. Both gold and lead are aggregates of fields. They don't have the same macroscopic properties but they do have the same microscopic properties (or attoscopic if you want to nitpick).

> The idea that reality is essentially mathematical and not essentially physical is pythagorean magical thinking. Science says the opposite.

Science says no such thing, and nobody is saying that reality is not physical. Those who adopt a mathematical universe hypothesis, or the like, say that reality is physical but that the physical is a subset of the mathematical.

> Physics does not study "2". It studies there being earth and the sun and a force between them, summarised as "2 masses" etc.

Mathematics is the study of structure. Physics is studying the structure of reality. There is therefore an obvious and inescapable link between mathematics and physics that you are simply not going to refute by repeatedly asserting that mathematical structures have nothing to do with physics. Of course any structures that have a formal correspondence have important equivalences, because that's literally what formal correspondence means.


"Formal equivalence" does not have any causal significance.

> They don't have the same macroscopic properties

Exactly... see the above...

Whenever you talk about "formal" equivalence you mean: ignoring the causal semantics of the formula describing the system.

"2 + 2 = 4" is eqv. to "2 + 2 = 4" even when talking about wars, people, cars, animals, ... sand, cells, brains, bodies...

NO, obviously not. This isn't hard; this is obvious.

It requires some extraordinary magical thinking to suppose that field configurations of sand are equivalent to arbitrary systems... this is obvious nonsense.


Typing is just a medium, it is irrelevant. Seeing and all the other senses that you mentioned are input within a context window.


Uh-huh... and how do you form the inputs into that context window?

Turns out you need to move (indeed, adapt) the body in order to form the very techniques which become the concepts that can be given as inputs.

The eye does not move on its own; it has to be directed to attend to reality as conceptualised -- and where do these conceptualisations come from? Somatosensory representations are built by the sensory-motor system.

Or, simply: in order to first think, we move.


How are people lookup tables? In the case of neural networks the representation of the table is obvious, it's just numbers. What would be the equivalent table for the liver?

My argument isn't abstract. Neural networks really are just numerical functions which can be expanded into their equivalent graph representations.
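For a toy network and a coarse, finite input grid you can literally write the table out (a sketch with made-up weights, nothing to do with any real model):

    import itertools
    import numpy as np

    # A tiny fixed network, tabulated over a finite input grid at 0.1 precision.
    W1 = np.array([[1.0, -1.0], [0.5, 0.5]])
    W2 = np.array([1.0, -2.0])

    def net(x):
        h = np.maximum(0.0, W1 @ x)        # one ReLU layer
        return float(W2 @ h)

    grid = [round(v, 1) for v in np.arange(0.0, 1.01, 0.1)]
    table = {(a, b): net(np.array([a, b])) for a, b in itertools.product(grid, grid)}
    print(len(table))                      # 121 entries stand in for the network
    print(table[(0.5, 0.2)] == net(np.array([0.5, 0.2])))   # True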


Not sure what he's referring to in terms of modern physics saying we're just a lookup table but at the very least, you could say the same thing about the conversation that we're having now. You read words, those words map to meaning representation in our heads, we then generate a response.


Obviously if we are interacting over a digital medium then the responses will be encoded as numbers but there is no way to reduce an entire person to a lookup table. Measured output of human behavior can be expressed as lists of numbers but thinking is not the same as the list of numbers, unlike in the case of neural networks where the graph and the network are actually equivalent.


You could represent all the input on different levels as numbers, e.g. all EM waves hitting our eyes, then all the physical output from our body also as numbers, and everything that causes this output from input within is what you would consider to be a lookup table.


What are the dimensions of the input and output spaces involved in this idealization? In the case of a neural network there is no idealization. The network is software; it's a number. Its inputs and outputs are all bounded and can be expressed as a table of bounded tuples.


You could pick a very large number depending on a reasonable processing capability a human has, which represents all the significant physical interactions on a human body over a certain amount of time. Then take the output over a certain amount of time, being all movements of the body.

If you wanted to focus on thoughts alone, you might want to skip few layers/systems, to give input directly to whatever causes thoughts to happen.

All particles and their interactions could also be represented as numbers. But it just depends on what level we do this, and at what level what kind of complex logic is required.


Ok so give me some concrete number.


5.1536672454... could be the approximate strength of a nervous signal of some sort in a human body, in some unit of measurement.


I think the OP is right. All the input to a human brain can be expressed as numbers, at any given time a specific radiation, vibration, or chemical reaction is hitting our "sensors" and by the law of physics this is just numbers ( in terms of differentiation, brain does not know absolute values ).

Our output ( mechanical and vibrations ) is also fully quantifiable, thus numbers.

One giant lookup table.


Provide some concrete numbers for solar radiation then as a lookup table. You guys are confusing abstraction and idealization with what it means to be a thinking person. There is no such abstraction and idealization happening with software. The software is really just a number, there is no idealization or abstraction happening when I claim that GPT is a sequence of bits representing a numerical function.


You can't really have it both ways, being reductionist when it comes to computers (it's just a finite set of numbers, so there is no reasoning), but not permitting to use the same line of argumentation with humans (it's just a finite set of particles).

At any rate, this is an ages-old discussion in philosophy, so most likely we are not going to settle this in a Hacker News thread.


Abstraction is a property of a description of a thing, not the thing itself. In reality, what we call "GPT" is the highly organised behaviour of many electrons, probably distributed across many computers, each with extremely complex hardware of various kinds, etc etc. Calling it a sequence of bits representing a numerical function is a choice of description - an abstraction, even!

In this case it's a good description, because it correlates with the GPT in reality quite well. But they are not the same thing.


People really are just stacks of molecules that can be broken down into their causal properties - moreover, we know those causal properties to a high degree of accuracy these days.

I'm suggesting that for any given human/environment pair, there is a lookup table that produces that person's actual behaviour in that situation. Modern physics lets us approximate this lookup table, and presumably better physics would give us a better lookup table.

Since human behaviour can in principle be described with a lookup table, I see this as a bad reason to rule out a system as "thinking".

Perhaps there is another way to describe neural nets, one that does not use the language of lookup tables, that makes it feel more like thinking and less like lookups.

One such approach I've seen is looking for embedded world models in neural nets.


Suppose that we used embeddings as the input of the model rather than piece identifiers plus an embedding lookup table. This is possible with every transformer model and some libraries provide an API to do this. Moreover, we convert the parameters and ops to use arbitrary precision types. Then the network cannot be represented as a lookup table. Given that there is an infinite number of inputs, there is also an infinite number of outputs. But the arbitrary-precision network does not operate fundamentally different from the original network. It has the same parameters, ops, etc., yet you cannot store it as a (finite) lookup table.
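For what it's worth, in the Hugging Face transformers library this is the `inputs_embeds` argument (a rough sketch; gpt2 is just an example checkpoint, and the vectors here are random rather than anything meaningful):

    import torch
    from transformers import GPT2Model

    model = GPT2Model.from_pretrained("gpt2")
    hidden = model.config.n_embd                # embedding width, 768 for gpt2

    # Any real-valued vectors of the right width are accepted, not just rows
    # of the learned embedding table.
    embeds = torch.randn(1, 5, hidden)          # batch of 1, sequence of 5
    out = model(inputs_embeds=embeds)
    print(out.last_hidden_state.shape)          # torch.Size([1, 5, 768])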


Even if you increase the precision I can still generate a table T(P) for each fixed precision P. So the table is parametrized by P but it's still a table. The entire table T = colim T(P) is the colimit over all precision values but for every finite precision it is still a table.


I did not say fixed precision. I said arbitrary precision, so P is infinite.

The only counter-argument is that even arbitrary precision is fixed-precision because computer memory is finite. But that's kind of a silly argument, because then you are arguing that computers can never reason, because they have finite memory, and moreover humans cannot reason either, because there is a finite number of brain cells.


P obviously can't be infinite, even in theory, if you want the computation to terminate.


Right, but then as others said, then you are also arguing that humans cannot reason, since the universe is a system with a finite number of particles. Or if we exclude external factors, because humans have a finite number of brain cells.

In the end it all depends on what your definition of reasoning is, which you did not provide.


The bit precision of computation is always finite for halting computations and any finite computation can be turned into a lookup table which does no thinking or reasoning other than comparing two numbers and then extracting the value corresponding to the input key.

My argument carries through for any piece of software so if you think software can think and reason then you can remain unconvinced by my argument.

In any case, I have to drop out of this thread.


Just like to point out that RNNs have internal state which isn't captured in this view, so yes, lots of NNs can be considered this way, but not all. It's the DSP equivalent of FIRs vs IIRs.
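A toy sketch of that FIR/IIR point (made-up weights): a feedforward net's output depends only on the current input, while an RNN's output also depends on its hidden state, i.e. on the input history.

    import numpy as np

    Wx, Wh = 0.8, 0.5

    def rnn_step(x, h):
        # the state feeds back, like an IIR filter
        return np.tanh(Wx * x + Wh * h)

    h = 0.0
    for x in [1.0, 0.0, 0.0]:          # the input 0.0 appears twice...
        h = rnn_step(x, h)
        print(x, h)                    # ...but yields different outputs, because
                                       # the internal state differs each time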


This whole thread on lookup tables seems to be confused.

Isn't this purely math, the equivalence of a function to a lookup table is well studied. And NN as comprised of functions, can be boiled down to table as posted.

How do we get from this math concept of function=table, and get to arguments about consciousness and free-will and state space of the universe...

The table-NN equivalence doesn't seem to help people's understanding of NNs.


People are just outright abusing the terminology. OP's argument would also conclude that a sorting algorithm is not a "real" algorithm because it too can be done by an infinite lookup table.

That said, the general debate is a valid one. Are LLMs just doing fancy statistical compression of data, or are they doing "reasoning" in some important sense, be that merely mechanistic logical reasoning, or "human-level intelligent reasoning"?

For that matter, did the paper authors ever define "Reasoners" in their title, or leave it to the reader?


Sure. I agree.

The debate is good. I just don't see how the 'table-lookup' analogy is helping.

Except maybe by helping people see the non-free-will nature of the universe. But seems like people that reject this, are also ones rejecting the 'table-function' equivalence.


> thinking how every neural network is equivalent to a lookup table where the input is all numbers up to what can be expressed within the context window and the output is the result of the arithmetic operations applied to that number... no neural network is actually doing any thinking other than uncompressing the table and looking up the value corresponding to the input number

You're proposing the lookup table as one possible mechanism in Searle's chinese room, then proposing Searle's conclusion?

“Searle argues that, without ‘understanding’ (or ‘intentionality’), we cannot describe what the machine is doing as ‘thinking’ and, since it does not think, it does not have a ‘mind’ in anything like the normal sense of the word. Therefore, he concludes that the ‘strong AI’ hypothesis is false.‘

https://en.wikipedia.org/wiki/Chinese_room

I think you've said Chinese room, run as many times as it takes to get all possible sequences of Chinese characters to cache the results, then using those run it and ask if it's still or yet ‘thinking’.

PS. Where did the arithmetic operations come from? How did they come to be as they are? Is iterating to an algo that does that, ‘learning’? What's the difference between this and lossy or non-lossy compression of information? Could it be said the arithmetic operations are a compression of the lookup table into that which has the ‘right’ response given the inputs? If two different sets of arithmetic operations give by and large the same outputs from inputs, is one of them more ‘reasoning’ than the other depending how it's derived? What do we mean by ‘learning’ and ‘reasoning’ when applying those words to humans? Are teachers telling students to ‘show your work’ searching for explainable intelligence? :-)


Here's a counterexample. Suppose I create a simple neural network that computes f(x) = x^2 + c (where x and c are complex numbers) and then I run it as an RNN. This RNN will compute the mandelbrot set, which can't be represented by a lookup table.

You can't even know if the RNN will halt for a given input. Neural networks are stronger than lookup tables, they are programs.
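For concreteness, a sketch of the iteration being described (an escape-time test; the finite iteration cap is exactly where the halting subtlety shows up):

    def escapes(c, max_iter=1000):
        z = 0j
        for n in range(max_iter):
            z = z * z + c              # the recurrence z <- z^2 + c fed back into itself
            if abs(z) > 2.0:           # once |z| > 2 the orbit provably diverges
                return n               # steps taken to escape
        return None                    # no escape within the cap: possibly in the set

    print(escapes(0.5 + 0.5j))         # escapes after a few steps
    print(escapes(-0.1 + 0.1j))        # None: this c appears to stay bounded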


Every computable function can be represented by a (possibly infinite)¹ lookup table.

Computer programs can only compute computable functions. Therefore any computer program is (in theory) equivalent to a table lookup.

¹ For finite inputs, the lookup table can be finite, and for infinite inputs, the lookup table can be infinite but still countable, as the set of computable functions is countable.
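The finite-input case is easy to make concrete (a toy sketch; `f` is an arbitrary stand-in function):

    # Tabulate an arbitrary computable function over a finite input domain and
    # check that the table reproduces it everywhere.
    def f(n):
        return (n * n + 3) % 17

    DOMAIN = range(256)                              # e.g. all one-byte inputs
    table = {n: f(n) for n in DOMAIN}                # the lookup table

    assert all(table[n] == f(n) for n in DOMAIN)     # table and program agree
    print(table[42])                                 # (42*42 + 3) % 17 == 16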


This table is not computable. If you had this table, you could solve the halting problem by simply looking up whether the program produced an output.


You've just restated the halting problem.

Nobody claimed that there is an algorithm to translate arbitrary programs into an equivalent lookup table. (Because that's the exact same proposition as stating that there is a program that can compute whether an arbitrary program halts when executed).

The point is: Any specific program can be translated into a lookup table. Computer programs and lookup tables are equivalent!

You claimed that computer programs are somehow "more powerful" than lookup tables. That's just plain wrong. They're exactly equivalent in "power".


I am sorry to be this blunt, but this is really utter and complete nonsense. The claim that the Mandelbrot set can't be represented in a lookup table is as such true, but that is because nothing you do with finite-precision numbers can represent the Mandelbrot set, because it is essentially an infinite object. The function f(x) = x^2 + c as an RNN also cannot compute the Mandelbrot set if the numbers it uses are of finite precision. That is exactly the same limitation the lookup table faces, so there is no fundamental difference between the two.


We can give them both infinite precision, you still can't build a lookup table of the mandelbrot set.

The mandelbrot set is essentially a map of the halting behavior of a specific program. You can't know whether or not the program will halt for a given input, and so cannot build the lookup table. Programs are stronger than input-output mappings.


Infinity is really hard to reason about, are you sure about that?

(For all I know you're a PhD in transfinites, your profile says nothing).


Infinity (of the various kinds) is well understood (see Cantor etc).

The Halting Problem is a central result in computer science, again well understood (especially here I would think!)

Their comment is correct.


I see you are a fan of flying disembodied brains, but this time without a universe surrounding the brain.


Possibly off-topic, but does anyone know where I can read up on LLMs? I've posted an "Ask HN" here, in the hopes some people can inform me about how I can keep up on what's new:

https://news.ycombinator.com/item?id=39688911


Search box in the bottom can help look for introduction tutorials recommendations etc


Yep; as long as it is fed biased data, it can still reason, just based on faulty info.


Not really


This reminds me of a meme that goes something along these lines:

Joins a university;

Studies for their bachelor's degree;

Gets their degree after 3-5+ years;

Studies for their master's;

Gets their master's after 2-4+ years;

Studies for their PhD while working on their thesis for a few more years;

Participates in intensive discussions with their peers, and investigates day and night;

Sends their thesis for peer review;

Reworks their thesis according to the review;

Finally publishes their thesis in an academic journal;

Someone on the internet, reads their thesis title: bulsh*t

--

Would you care to elaborate on your comment?


Is the master’s really 2-4 years? I always thought it was closer to 1 or 2.


Maybe it depends on the master's. The ones I know are 2 years if you follow the standard plan, but I wouldn't be surprised if there were 1 year master's


I...

"It is a tale Told by an idiot, full of sound and fury Signifying nothing."

This is sort of 'water is wet' research. I'm glad they did it, but it's not exactly moving the ball down the field.



