I just spent a few days trying to figure out some linear algebra with the help of ChatGPT. It's very useful for finding conceptual information in the literature (which, at least for a non-professional mathematician, can be really hard to find and decipher). But in the actual math it constantly makes very silly errors, e.g. indexing a vector beyond its dimension, trying to do matrix decompositions on scalars, and insisting on multiplying matrices with mismatched dimensions.
O1 is a lot better at spotting its errors than 4o, but it too still makes a lot of really stupid mistakes. It seems to be quite far from consistently producing results on its own without at least a somewhat clueful human doing the hand-holding.
It also reliably fails basic real analysis proofs, but I think this is not too surprising, since those require a mix of logic and computation that is likely hard to infer just from the statistical likelihood of tokens.
LLMs have been very useful for me in explorations of linear algebra, because I can have an idea and say "what's this operation called?" or "how do I go from this thing to that thing?", and it'll give me the mechanism and an explanation, and then I can go read actual human-written literature or documentation on the subject.
It often gets the actual math wrong, but it is good enough at connecting the dots between my layman's intuition and the "right answer" that I can get myself over humps that I'd previously have been hopelessly stuck on.
It does make those mistakes you're talking about very frequently, but once I'm told that the thing I'm trying to do is achievable with the Gram-Schmidt process, I can go self-educate on that further.
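If it helps anyone following along: here's a minimal NumPy sketch of the textbook (classical) Gram-Schmidt process mentioned above. In practice you'd use the modified variant or np.linalg.qr for numerical stability; this is just the idea.

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize the rows of `vectors`.
    Textbook version only; prefer np.linalg.qr in real code."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        # Subtract the projection of v onto each basis vector found so far.
        for q in basis:
            w -= np.dot(q, v) * q
        norm = np.linalg.norm(w)
        if norm > 1e-12:  # skip (nearly) linearly dependent vectors
            basis.append(w / norm)
    return np.array(basis)

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(A)
print(np.round(Q @ Q.T, 6))  # should be (close to) the identity matrix
```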
The big thing I've had to watch out for is that it'll usually agree that my approach is a good or valid one, even when it turns out not to be. I've learned to ask my questions in the shape of "how do I", rather than "what if I..." or "is it a good idea to...", because most of the time it'll twist itself into shapes to affirm the direction I'm taking rather than challenging and refining it.
When you give it a large math problem and the answer is "seven point one three five ...", and it shows a plot of the result vs. some randomly selected ___domain, well, there could be more I'd like to know.
You can unlock a full derivation of the solution for cases where you say "Solve" or "Simplify", but what I (and I suspect GP) might want is to know why a few of the key steps might work.
It's a fantastic tool that helped get me through my (engineering) grad work, but ultimately the breakthrough inequalities that helped me write some of my best stuff came out of a book I bought in desperation that basically cataloged known linear algebra inequalities and simplifications.
When I try that kind of thing with the best LLM I can use (albeit as of a few months ago), the results can get incorrect pretty quickly.
> [...], but what I (and I suspect GP) might want is to know why a few of the key steps might work.
It's been some time since I've used the step-by-step explainer, and it was for calculus or intro physics problems at best, but IIRC the pro subscription will at least mention the method used to solve each step and link to reference materials (e.g., a clickable tag labeled "integration by parts").
It doesn't exactly explain the why, but it does provide useful keywords, in sequence, that can be used to derive the why.
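To make that concrete, a tag like "integration by parts" points at the standard identity; here's a generic worked instance (my own illustration, not Wolfram's actual output):

```latex
% integration by parts, with one worked example
\int u \, dv = uv - \int v \, du
\quad\Longrightarrow\quad
\int x e^{x} \, dx
  \;\overset{u = x,\; dv = e^{x}dx}{=}\;
  x e^{x} - \int e^{x} \, dx
  = (x - 1)\, e^{x} + C
```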
Its understanding of problems was very bad the last time I used it, meaning it was difficult to communicate what you wanted it to do. Usually I try to write in the Mathematica language, but even that is not foolproof.
Hopefully they have incorporated a more modern LLM since then, but it hasn't been that long.
Wolfram Alpha's "smartness" is often Clippy-level enraging. E.g. it makes assumptions about symbols based on their names (e.g. a is assumed to be a constant, derivatives are taken w.r.t. x). Even with Mathematica syntax it tends to make such assumptions and refuses to lift them even when explicitly directed. Quite often one has to change the variable symbols used just to try to make Alpha do what's meant.
What's surprising to me is that this would surely be in OpenAI's interests, too -- free RLHF!
Of course there would be the risk of adversaries giving bogus feedback, but my gut says it's relatively straightforward to filter out most of this muck.
Wolfram Alpha can solve equations well, but it is terrible at understanding natural language.
For example I asked Wolfram Alpha "How heavy a rocket has to be to launch 5 tons to LEO with a specific impulse of 400s", which is a straightforward application of the Tsiolkovsky rocket equation. Wolfram Alpha gave me some nonsense about particle physics (result: 95 MeV/c^2), GPT-4o did it right (result: 53.45 tons).
Wolfram Alpha knows about the Tsiolkovsky rocket equation, it knows about LEO (low Earth orbit), but I found no way to get a delta-v out of it; again, more nonsense. It tells me about Delta Airlines and mentions satellites that it knows are not in LEO. The "natural language" part is a joke. It is more like an advanced calculator, and for that, it is great.
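For what it's worth, the GPT-4o figure is easy to sanity-check with the rocket equation directly. This sketch assumes a delta-v to LEO of about 9.3 km/s and treats everything except the payload as propellant (my assumptions, not part of the original prompt):

```python
import math

# Tsiolkovsky rocket equation: delta_v = Isp * g0 * ln(m0 / mf)
isp = 400.0       # s, specific impulse from the question
g0 = 9.80665      # m/s^2
delta_v = 9300.0  # m/s, a typical LEO figure including losses (assumed)
payload = 5.0     # tons; dry mass of the rocket itself is neglected here

mass_ratio = math.exp(delta_v / (isp * g0))
m0 = payload * mass_ratio
print(f"mass ratio ~ {mass_ratio:.2f}, launch mass ~ {m0:.1f} tons")
# ~10.7 and ~53.5 tons, in the same ballpark as the 53.45 t above
```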
You know, "You're using it wrong" is usually meant to carry an ironic or sarcastic tone, right?
It dates back to Steve Jobs blaming an iPhone 4 user for "holding it wrong" rather than acknowledging a flawed antenna design that was causing dropped calls. The closest Apple ever came to admitting that it was their problem was when they subsequently ran an employment ad to hire a new antenna engineering lead. Maybe it's time for Wolfram to hire a new language-model lead.
No, “holding it wrong” is the sarcastic version. “You’re using it wrong” is a super common way to tell people they are literally using something wrong.
The problem has always been that you only get good answers if you happen to stumble on a specific question that it can handle. Combining Alpha with an LLM could actually be pretty awesome, but I'm sure it's easier said than done.
Before LLMs exploded nobody really expected WA to perform well at natural language comprehension. The expectations were at the level of "an ELIZA that knows math".
Wolfram Alpha is mostly for "trivia" type problems. Or giving solutions to equations.
I was figuring out some mode decomposition methods such as ESPRIT and Prony, and how to potentially extend/customize them. Wolfram Alpha doesn't seem to have a clue about any of that.
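For reference, the classical Prony method itself fits in a few lines of NumPy; this is just the textbook version (no noise handling), not ESPRIT or the extensions being explored:

```python
import numpy as np

def prony(x, p):
    """Minimal classical Prony's method: model x[n] ~= sum_k A_k * z_k**n
    with p exponentials. Returns poles z_k and amplitudes A_k."""
    N = len(x)
    # 1. Linear prediction: x[n] = -(a_1 x[n-1] + ... + a_p x[n-p])
    M = np.column_stack([x[p - i : N - i] for i in range(1, p + 1)])
    a, *_ = np.linalg.lstsq(M, -x[p:N], rcond=None)
    # 2. Poles are the roots of the prediction polynomial
    z = np.roots(np.concatenate(([1.0], a)))
    # 3. Amplitudes from the Vandermonde system x[n] = sum_k A_k z_k**n
    V = z[np.newaxis, :] ** np.arange(N)[:, np.newaxis]
    amps, *_ = np.linalg.lstsq(V, x, rcond=None)
    return z, amps

# Example: two decaying exponentials, recovered almost exactly.
n = np.arange(30)
x = 2.0 * 0.9**n + 1.0 * 0.5**n
z, amps = prony(x, 2)
print(np.round(z, 4), np.round(amps, 4))  # poles ~ {0.9, 0.5}, amps ~ {2, 1} (order may vary)
```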
Probably mostly not. The errors tend to be logical/conceptual; e.g. mixing up scalars and matrices is unlikely to come from tokenization, especially if you put spaces between the variables and operators, as AFAIK GPTs don't form tokens spanning spaces (although tokens may start or end with them).
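You can check that claim yourself with the tiktoken package and the cl100k_base encoding used by GPT-4-era models (the exact splits in the comment below are my guess; the point is just that no token bridges the space between two symbols):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["A+B", "A + B"]:
    tokens = enc.encode(text)
    pieces = [enc.decode([t]) for t in tokens]
    print(f"{text!r} -> {pieces}")

# Expected shape of the output: no token spans the space between two symbols,
# though a token may carry a leading space, e.g. 'A + B' -> ['A', ' +', ' B'].
```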
The only thing I've consistently had issues with while using AI is graphs. If I ask it to plot some simple function, it produces a really weird image that has nothing to do with the graph I want. It will be a weird swirl of lines and words, and it never corrects itself no matter what I say to it.
Has anyone had any luck with this? It seems like the only thing that it just can't do.
And it works very well - it made me a nice general "draw successively accurate Fourier series approximations given this lambda for the coefficients and this lambda for the constant term" script. PNG output, no real programming errors (I wouldn't remember if it had some stupid error; I'm a Python programmer). Even TikZ in LaTeX isn't hopeless (although I did end up reading the TikZ manual).
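Roughly the shape of script I mean (my reconstruction, not the code ChatGPT actually generated), using a square wave as the example:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_partial_sums(coeff, const, orders, filename="fourier.png"):
    """Plot successive partial sums S_N(x) = const() + sum_{n=1..N} coeff(n, x)."""
    x = np.linspace(-np.pi, np.pi, 1000)
    for N in orders:
        s = np.full_like(x, const())
        for n in range(1, N + 1):
            s += coeff(n, x)
        plt.plot(x, s, label=f"N = {N}")
    plt.legend()
    plt.savefig(filename)  # PNG output, as mentioned above

# Square wave sign(x): its Fourier series has only odd sine terms, b_n = 4/(pi*n).
plot_partial_sums(
    coeff=lambda n, x: (4 / (np.pi * n)) * np.sin(n * x) if n % 2 == 1 else 0 * x,
    const=lambda: 0.0,
    orders=[1, 3, 9, 27],
)
```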
Ask it to plot the graph with Python plotting utilities, not its image generator. I think you need a ChatGPT subscription, though, for it to be able to run Python code.
You seem to get 2(?) free Python program runs per week(?) as part of the O1 preview.
When you visit ChatGPT on the free account, it automatically gives you the best model, then disables it after some amount of work and says to come back later or upgrade.
It was, for a while. I think this is an area where there may have been some regression. It can still write code to solve problems that are a poor fit for the language model, but you may need to ask it to do that explicitly.
The agentic reasoning models should be able to fix this if they have the ability to run code instead of handling every task themselves. "I need to make a graph" -> "LLMs have difficulty graphing novel functions" -> "Call Python instead" is a line of reasoning I would expect after seeing what O1 has come up with on other problems.
Giving AI the ability to execute code is the safety people's nightmare, though. I wonder if we'll hear anything from them, as this is surely coming.