My honest take is that a lot of these famous academics played almost no part in the developments at OpenAI. But they want the limelight. They aren't as relevant as they want to be. In many cases, they were directly wrong about how AI would develop.
Really? Hinton doesn't need OpenAI to be relevant. He literally invented backpropagation. He stuck with deep learning through the 1990s and 2000s when almost all major scientists abandoned it. He was using neural networks for language modeling in 2007-08 when no one knew what it was. And the deep learning wave of the 2010s started when his students created AlexNet by implementing deep learning on GPUs. OpenAI's Chief Scientist Ilya Sutskever was one of his students working on that paper.
He already has a Turing Award and doesn't give a rat's ass about who owns how much search traffic. OpenAI, just like Google, would give him millions of dollars just to be part of the organization.
> Explicit, efficient error backpropagation (BP) in arbitrary, discrete, possibly sparsely connected, NN-like networks apparently was first described in a 1970 master's thesis (Linnainmaa, 1970, 1976), albeit without reference to NNs. BP is also known as the reverse mode of automatic differentiation (e.g., Griewank, 2012), where the costs of forward activation spreading essentially equal the costs of backward derivative calculation. See early BP FORTRAN code (Linnainmaa, 1970) and closely related work (Ostrovskii et al., 1971).
> BP was soon explicitly used to minimize cost functions by adapting control parameters (weights) (Dreyfus, 1973). This was followed by some preliminary, NN-specific discussion (Werbos, 1974, section 5.5.1), and a computer program for automatically deriving and implementing BP for any given differentiable system (Speelpenning, 1980).
> To my knowledge, the first NN-specific application of efficient BP as above was described by Werbos (1982). Related work was published several years later (Parker, 1985; LeCun, 1985). When computers had become 10,000 times faster per Dollar and much more accessible than those of 1960-1970, a paper of 1986 significantly contributed to the popularisation of BP for NNs (Rumelhart et al., 1986), experimentally demonstrating the emergence of useful internal representations in hidden layers.
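To make the "reverse mode" point concrete, here's a minimal sketch of the forward and backward passes for a tiny two-layer network in Python/NumPy. This is illustrative toy code of my own (the squared-error loss, sizes, and variable names are arbitrary choices), not anything from the papers cited above:

    import numpy as np

    # Tiny 2-layer MLP: x -> W1 -> tanh -> W2 -> prediction
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4,))            # input
    y = rng.normal(size=(2,))            # target
    W1 = rng.normal(size=(3, 4)) * 0.1
    W2 = rng.normal(size=(2, 3)) * 0.1

    # Forward pass: keep the intermediates we'll need later
    z1 = W1 @ x
    h1 = np.tanh(z1)
    y_hat = W2 @ h1
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # Backward pass (reverse-mode AD): walk the graph once, in reverse.
    # Its cost is roughly the same as the forward pass, as the quote notes.
    d_y_hat = y_hat - y                  # dL/dy_hat
    d_W2 = np.outer(d_y_hat, h1)         # dL/dW2
    d_h1 = W2.T @ d_y_hat                # dL/dh1
    d_z1 = d_h1 * (1.0 - h1 ** 2)        # back through tanh
    d_W1 = np.outer(d_z1, x)             # dL/dW1

    # One plain gradient-descent step on the weights
    W1 -= 0.1 * d_W1
    W2 -= 0.1 * d_W2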
I mean, he was one of the first to use backprop for training multilayer perceptrons. Their experiments showed that such networks can learn useful internal representations of data [1] (1986). Nevertheless, he is one of the founding fathers of deep learning.
[1] Rumelhart, Hinton & Williams, "Learning representations by back-propagating errors" (1986)
It's really sad how poor attribution is in ML. Hinton certainly made important contributions to backpropagation, but he neither invented backpropagation nor was he even close to the first person to use it for multilayer perceptrons.
You've now gone from one false claim "he literally invented backpropagation", to another false claim "he is one of the first people to use it for multilayer perceptrons", and will need to revise your claim even further.
I don't particularly blame you specifically, as I said the field of ML is so bad when it comes to properly recognizing the teams of people who made significant contributions to it.
This is a marketing problem fundamentally, I'd argue. That the article or any serious piece would use a term such as "Godfather of AI" is incredibly worrying and makes me think it's pushing an agenda or is some sort of paid advertisement with extra steps to disguise it.
I have grown an aversion, and possibly a knee-jerk reaction, to such pieces. I have a lot of trouble taking them seriously, and I am inclined to give them a lot more scrutiny than otherwise.
I’m not convinced that inventing back propagation gives one the authority to opine on more general technological/social trends. Frankly, many of the most important questions are difficult or impossible to know. In the case of neural networks, Hinton himself would never have become as famous were it not for one of those trends (the cost of GPU compute and the breakthrough of using GPUs for training) which was difficult or impossible to foresee.
In an alternate universe, NNs are still slow and compute limited, and we use something like evolutionary algorithms for solving hard problems. Hinton would still be just as smart and backpropagation still just as sound but no one would listen to his opinions on the future of AI.
The point is, he is quite lucky in terms of time and place, and giving outsized weight to his opinions on matters not directly related to his work is a fairly clear example of survivorship bias.
Finally, we also shouldn’t ignore the fact that Hinton’s isn’t the only well-credentialed opinion out there. There are other equally if not more esteemed academics with whom Hinton is at odds. Him inventing backpropagation is good enough to get him in the door to that conversation, but doesn’t give him carte blanche authority on the matter.
Of course he was lucky; you should expect that in general for well-known people, because the selection pressures that determined whether you heard of them at all are likely to involve luck.
That is not at all a slam dunk argument. It’s barely anything.
Well unless you’re claiming the same luck that led to Hinton’s fame will lead to his accuracy on the much broader and less constrained topic of the relationship between automated systems and society, I don’t see how it’s not something.
My main point wasn’t to undermine Hinton by saying he was lucky. I did do that and I stand by it. But my main point was to say that to a large degree the future on this issue is unknowable because it depends on so many crucial yet undetermined factors. And there’s nothing you could know about backpropagation, neural networks, or computer science in general which could resolve those questions.
All people on the leading edge of big things have benefited from a huge amount of luck, and there were likely hundreds of other folks on the leading edge of other potential breakthroughs that didn't happen, each of whom was equally capable in terms of raw problem-solving ability or IQ. The difference is that when you get the chance to ride the wave, and you ride it for 10, 15, 20 years, it gives you a significantly different and improved set of experiences, expertise, and problem-solving ability than the folks who never had that shot but were still capable. The magic is partly that he was smart, partly that he was lucky, and also partly that the experience of pushing the field forward for 20 years, with the field following you, brings you something that very few others have and that is in fact very valuable.
To say Hinton was just lucky short-changes both the work he did and the environment he did it in, and it utterly ignores the amount of pressure to abandon that work because it was considered a dead end by just about everybody else, until it suddenly wasn't.
This sort of reminds me of Bloomberg articles wherein every time there is some "black swan" event, they go and find an analyst or economist that "got it right" and he gets to be prophet for a day: never mind that said analyst/economist may have predicted 100 of the last 3 financial crashes, they were "right" about this one.
It sounds like you’re biased against academics. Not only did Hinton develop some of the fundamental ideas behind AI (winning the Turing award) but also one of his PhD students is now the CTO at OpenAI.
In case anyone is curious, this appears to refer to https://en.wikipedia.org/wiki/Ilya_Sutskever who was a PhD student of Geoffrey Hinton's and is now Chief Scientist at OpenAI.
Wow, the CTO of OpenAI seems to have ~1 yr of hands-on engineering experience, followed by years of product and people management. That's unexpected. I thought the CTO was Brockman.
In addition to what people clarified in this thread, you might be interested in this: neural networks were not a popular research area before 2005. In fact, the AI winter of the 90s left such a bitter taste that most people thought NNs were a dead end, so much so that Hinton could not even get enough funding for his research. If it were not for Canada's (I forgot the institution's name) miraculous decision to fund Hinton, LeCun, and Bengio with $10M for 10 years, they probably wouldn't have been able to continue their research. I was a CS student in the early 2000s at U of T, a pretty informed one too, yet I did not even know about Hinton's work. At that time, most of the professors who did AI research at U of T were into symbolic reasoning. I still remember taking courses like model theory and abstract interpretation from such professors. Yet Hinton persevered and changed history.
I don't think Hinton cares about fame the way you imagine.
This may be true in other cases, but not here. Hinton literally wrote the paper on backpropagation, the way that modern neural networks are trained. He won the Turing award for a reason.
Hinton was critical for the development of ai. But was he critical for the development of openai, the company? Loads of startups get eminent people on their boards largely for advertising.
Has he contributed that much personally? I thought a lot of the success of ChatGPT is some good ideas from lower ranked researchers + great engineering.
I asked the question knowing that he's a co-founder and chief scientist at OpenAI. Being in his position doesn't automatically mean that he's contributed meaningfully.
My experience in "Applied Research" is that often "good ideas from lower ranked researchers" (or good ideas from anyone really) is "I saw this cool paper, let's try and implement that". That doesn't mean top people like Hinton should get all the credit, but let's not kid ourselves and believe most of the ideas didn't origin in academia.
One of OpenAI's recent breakthroughs was switching to FlashAttention, invented at Stanford and the University at Buffalo.
Hinton was not directly responsible for a lot of the AI developments at different companies, and he never had anything to say about those companies; I don't think he's vying for the limelight.
The fact that he never said anything before and the fact that he's saying something now means two things in my mind:
1. He is noticing something different about the current iteration of AI technology. We crossed some threshold.
2. Hinton is being honest.
Your take might be honest, but it's clearly uninformed.
Everyone has been wrong about how ai developed.
It's worth giving "The Bitter Lesson" a read [1] if you haven't yet.
Maybe, but there is another force at play here too. It's that journalists want stories about AI, so they look for the most prominent people related to AI. The ones who the readers will recognize, or the ones who have good enough credentials for the journalists to impress upon their editors and readers that these are experts. The ones being asked to share their story might be trying to grab the limelight or be indifferent or even not want to talk so much about it. In any case I argue that journalism has a role. Probably these professional journalists are skilled enough that they could make any average person look like a 'limelight grabber' if the journalist had enough reason to badger that person for a story.
This isn't the case for everyone. Some really are trying to grab the limelight, like some who are really pushing their research agenda or like the professional science popularizers. It's people like Gary Marcus and Wolfram and Harari and Lanier and Steven Pinker and Malcolm Gladwell and Nassim Taleb, as a short list off the top of my head. I'm not sure I would be so quick to put Hinton among that group, but maybe it's true.
> Together with Yann LeCun, and Yoshua Bengio, Hinton won the 2018 Turing Award for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing
In many cases yes, but definitely not in this. Geoffrey Hinton is as relevant as ever. Ilya Sutskever, Chief Scientist at OpenAI, is a student of Hinton. Hinton also recently won the Turing award.
We are talking about a Turing Award winner known as one of the "godfathers of AI" and your take is that this is just about taking the limelight? The level of cynicism on HN never fails to surprise me.
He played key roles in the development of backprop, ReLU, LayerNorm, dropout, and GPU-assisted deep learning (including AlexNet), was the mentor of OpenAI's Chief Scientist, and contributed many, many other things. These techniques are crucial for transformers, LLMs, generative image modelling, and many other modern applications of AI.
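For anyone who hasn't looked at how those pieces fit together, here's a rough toy sketch (my own code, with made-up sizes; not anything from Hinton's papers or from OpenAI) of a pre-norm feed-forward sublayer of the kind used in transformer blocks, showing where ReLU, dropout, and LayerNorm each slot in:

    import numpy as np

    rng = np.random.default_rng(0)

    def layer_norm(x, eps=1e-5):
        # Normalize each token's features to zero mean / unit variance
        mu = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        return (x - mu) / np.sqrt(var + eps)

    def dropout(x, p=0.1, training=True):
        # Randomly zero activations during training and rescale the rest
        if not training:
            return x
        mask = rng.random(x.shape) >= p
        return x * mask / (1.0 - p)

    def ffn_sublayer(x, W1, W2, training=True):
        # Pre-norm feed-forward sublayer with a residual connection
        h = layer_norm(x)
        h = np.maximum(0.0, h @ W1)      # ReLU
        h = dropout(h @ W2, training=training)
        return x + h

    # Toy usage: 5 tokens, model width 8, hidden width 32
    x = rng.normal(size=(5, 8))
    W1 = rng.normal(size=(8, 32)) * 0.1
    W2 = rng.normal(size=(32, 8)) * 0.1
    out = ffn_sublayer(x, W1, W2)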
Your post suggests that you know almost nothing about how modern deep learning originated.
Regardless of incentives, I don’t see any particular reason to think he has a more informed view than other experts on the trajectory of AI. He’s made several incorrect bets (capsule networks).
I’m sure he’s smart and all. His contributions were valuable. But he’s not special in this particular moment.
Your viewpoint is fascinating. So the inventor of backpropagation, Turing award winner, Google researcher, mentor to the CTO of OpenAI, doesn’t have any special insights about AI and the tech industry that’s forming around it? He might as well be some guy off the street?
Who, in your opinion, _does_ have enough context to be worth our attention?
Because if you’re waiting for Sam Altman or the entire OpenAI team to say “guys, I think we made a mistake here” we’re going to be knee-deep in paperclips.
Someone who is actually doing it would be a lot more authoritative in my opinion. Hinton has been wrong on most of his big ideas in the past decade. He hasn’t actually been involved in the important advances of anything recent. Inventing backprop is great. No discredit to him there. But that’s not a free pass to be seen as someone who is on the cutting edge.
But beyond all of that, what are we really asking? Are we asking about social ramifications? Because I don’t think the OpenAI devs are particularly noteworthy in their ability to divine those either. It’s more of a business question if anything. Are we talking about where the tech goes next? Because then it’s probably the devs or at least indie folks playing with the models themselves.
None of that means Hinton’s opinions are wrong. Form your own opinions. Don’t delegate your thinking.
I'm surprised you'd consider Hinton as not being "someone who is actually doing it".
Are you basically saying that you only trust warnings about AI from people who have pushed the most recent update to the latest headline-grabbing AI system at the latest AI darling unicorn? If so, aren't those people strongly self-selected to be optimistic about AI's impacts, else they might not be so keen on actively building it? And that's even setting aside they would also be financially incentivized against publicly expressing whatever doubts they do hold.
Isn't this is kind of like asking for authoritative opinions on carbon emissions from the people who are actually pumping the oil?
No, that’s the opposite of what I’m saying. Asking Hinton for his opinions on the societal impact of new AI tech is like asking the people who used to pump oil 20 years ago. It’s both out of date and not really relevant to their skill set even if it’s adjacent.
Let me clarify: who does qualify to offer an authoritative opinion, in your view? If, say, only Ilya Sutskever qualifies, then isn't that like asking someone actively pumping oil today about the danger of carbon emissions? If only Sam Altman, then isn't that like asking an oil executive?
If not Geoff Hinton, then, who?
Ultimately the harm is either real or not. If it is real, then the people with the most accurate beliefs and principles will be the ones who never joined the industry in the first place because they anticipated where it would lead, and didn't want to contribute. If it is not real, then the people with the most accurate beliefs will be the ones leading the charge to accelerate the industry. But neither group's opinions carry much credibility as opinions, because it's obvious in advance what opinions each group would self-select to have. (So they can only hope to persuade by offering logical arguments and data, not by the weight of their authoritative opinions.)
In my view, someone who makes landmark contributions to the oil industry for 20 years and then quits in order to speak frankly about their concerns with the societal impacts of their industry... is probably the most credible voice you could ever expect to find expressing a concern, if your measure of credibility involves experience pumping oil.
If you want an authoritative opinion on the societal impact of something I would want the opinion of someone who studies the societal impact of things.
So that seems to me like someone like Stuart Russell or Nick Bostrom? But what Geoff Hinton is saying seems to be broadly in agreement with what those people are saying.
His opinion obviously does matter because he is a founder of the field. No one believes that he is prescient. You are exaggerating and creating a strawman argument, infantilizing the readers here. We don't worship him or outsource our thinking.
You seem to be taking my usage of the word prescient as meaning he can either see the future perfectly or he cannot. That’s… not what it conventionally means. I simply mean his track record of predicting the future trajectory of AI is not great.
Your argument sounds like (and correct me if I'm wrong) something along the lines of "he chose to do X, and afterwards X was the correct choice, so he must be good at choosing correctly."
Isn't that post hoc ergo propter hoc?
That argument would also support the statement "he went all in with 2-7 preflop, and won the hand, so he must be good at poker" -- I assume you and I would both agree that statement is not true. So why does it apply in Geoffrey's case?
I still don't follow. In your example, how would you differentiate between that choice of his being lucky vs. prescient? Or was the intent to just provide a single datapoint of him appearing to make a correct choice?
Nobody was arguing that Hinton should be listened to uncritically. You were the one asserting that he should not be listened to at all.
With respect, you seem to be shifting goalposts, from the indefensible (Hinton doesn't know what he's talking about) to the irrelevant (Hinton doesn't have perfect and complete knowledge of the future).
Authority figures will not matter. This technology, like nuclear weapons, will be pursued to the utmost by all actors capable of marshalling the resources, in secret if necessary. (After all, the 'Hydrogen bomb' was debated pro/con by established authorities, including Oppenheimer and Teller. Did that stop their development?)
The US Senate has a bill drawing the line at AI launching nuclear weapons, but to think the US military, intelligence agencies, and industry will sit out the AI arms race is not realistic.
China's CPC's future existence (imo) depends on AI-based surveillance, propaganda, and real-time behavior conditioning. (Re real-time conditioning: we've already experienced this ourselves, to some extent, by interacting with the recent chatbots. I certainly modulated my interactions to avoid the AI mommy retorts.)
There's something about being first that gives a pioneer a great head start that can't be matched when it comes to considering the implications of their groundbreaking work.
Even if they're too busy doing the work, they're still thinking about what it would be like if it performed successfully, and it does seem to always take more retrospection before a leader can fully raise their head and more carefully consider unintended consequences.
Early success can give the impression that future efforts will have difficulty being as meaningful, but realistically, after that the successful individual no longer needs to struggle to prove themselves the way the less accomplished would be expected to do.
Then there's seniority itself, and maturity levels that can not be gained any other way.
Beyond that, when retirement is within easy reach, you don't really have the same obligation to decorum that you would earlier in a career.
Dr. Hinton seems to do a pretty good job of comparing himself to Oppenheimer.
I don't see how anyone else can question his standing more seriously than that.
You could have written the same thing about NNs for many years and you'd have been right. But the reason why Hinton has a Turing Award to his name and you don't is that he placed a very long-term bet and it paid off, in spite of lots of people saying that he wasn't going anywhere and that he should drop it.
Who knows, maybe a decade or two from now we'll see a resurgence of capsule networks, or maybe not. But I'd be a bit more careful about rejecting Hinton's hunches out of hand, his track record is pretty good.
This is a little harsh. Hinton trudged along with neural networks through the coldest AI winter and helped create the conditions for OpenAI to have all the raw ingredients needed to cook up something powerful.
If you need to build an airplane, would you rather consult Newton, the Wright brothers, or a modern aerospace engineer? Inventing a field and snatching up the low-hanging fruit doesn't mean somebody will be able to consistently create leading-edge output. Most of the advances in deep learning are due to hardware scaling and the success of a few very specific architectures. Yes, credit's due where credit's due, but academic name recognition is very much winner-take-all. For all the criticism Schmidhuber has received, he has a point. The authors of "Attention Is All You Need" (the Transformer paper) and YOLO have nowhere near the name recognition of the Turing Award trio, despite generating similar if not more value through their ideas.
Not having a PhD in ML, it's hard for me to evaluate his claims, but how valid are all the obscure papers that he brings up? Did someone actually invent backprop in 1930 in some random corner of the former Soviet Union? Or is it a case of "true but misses the point"?
Often it is indeed the latter, although it is interesting that sometimes despite that it gets at the core of our contemporary understanding of the concepts in question.
"Formal equivalence" means very little for engineering, to be frank - the implementation is the important thing. If I wanted to be snarky, I'd say that neural networks are "formally equivalent" to Fourier analysis, which is 200 years old. I see that the paper proposes an implementation of linearized attention as well, which many others have done, but none of which seem to have caught on (although FlashAttention at least makes attention O(n) in memory, if not computation).
There are multiple dimensions here - fame and fortune at the very least and whether it is localized or global in scope.
It is still winner-take-all, but if you look at the overall landscape, there are plenty of opportunities where you can have an outsized impact - you can have localized fame and fortune (anyone with AI expertise under their belt has no problem with fortune!)
Yes, we needed clever ideas from scientists to make them scale. In fact, we still need clever ideas to make them scale because the current architectures still have all sorts of problems with length and efficiency.
Going along with that, as long as they are "concerned" about how AI is developing it opens the door to regulation of it. This might just conveniently hobble anyone with an early mover advantage in the market.
Even developers at OpenAI played almost no part in the developments at OpenAI. 99.9999% of the work was done by those who created the content it was trained on.
If that were true, we could have had GPT-3 etc. years ago. It's a bit like saying that college graduates are dumb because, after all, what have they learned but a bunch of knowledge from textbooks.
The success of these LLMs comes down to the Transformer architecture, which was a bit of an accidental discovery - designed for sequence-to-sequence NLP use (e.g. machine translation) by a group of Google researchers (almost all of whom have since left and started their own companies).
The "Attention is all you need" Transformer seq-2-seq paper, while very significant, was an evolution of other seq-2-seq approaches such as Ilya Sutskever's "Sequence to Sequence Learning with Neural Networks". Sutskever is of course one of the OpenAI co-founders and chief scientist. He was also one of Geoff Hinton's students who worked on the AlexNet DNN that won the 2012 ImageNet competition, really kicking off the modern DNN revolution.
Reminds me of a press release by Hochreiter last week.
He claims to have ideas for architectures that could surpass the capabilities of GPT-4, but can't try them for lack of funding in his academic setting. He said his ideas were nothing short of genius.
I don't disagree. But for me, their mistake wasn't in the algorithms or their approach or anything like that.
The problem has always been, and now will likely always be, the hardware. I've written about this at length in my previous comments, but a split happened in the mid-late 1990s with the arrival of video cards like the Voodoo that set alternative computation like AI back decades.
At the time, GPUs sounded like a great way to bypass the stagnation of CPUs and memory buses, which ran at pathetic speeds like 33 MHz. And even today, GPUs can be thousands of times faster than CPUs. The tradeoff is their lack of general-purpose programmability and the way the user is forced to deal with manually moving buffers in and out of GPU memory space. For those reasons alone, I'm out.
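For what it's worth, that explicit host-to-device shuffling looks roughly like this today. This is just a sketch using CuPy as a stand-in (the library choice and array sizes are my assumptions for the example, and it needs a CUDA-capable GPU with CuPy installed):

    import numpy as np
    import cupy as cp

    # Data starts in ordinary host (CPU) memory
    a_host = np.random.rand(1024, 1024).astype(np.float32)
    b_host = np.random.rand(1024, 1024).astype(np.float32)

    # Explicitly copy it into GPU memory...
    a_dev = cp.asarray(a_host)
    b_dev = cp.asarray(b_host)

    # ...run the fast part on the device...
    c_dev = a_dev @ b_dev

    # ...and explicitly copy the result back to host memory.
    c_host = cp.asnumpy(c_dev)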
What we really needed was something like the 3D chip from the Terminator II movie, where a large array of simple CPUs (possibly even lacking a cache) perform ordinary desktop computing with local memories connected into something like a single large content-addressable memory.
Yes, those can be tricky to program, but modern Lisp- and Haskell-style functional languages, and even bare-metal languages like Rust with their strict memory discipline, can do it. And Docker takes away much of the complexity of orchestrating distributed processes.
Anyway, what's going to happen now is that companies will pour billions (trillions?) of dollars into dedicated AI processors that use stuff like TensorFlow to run neural nets. Which is fine. But nobody will make the general-purpose transputers and MIMD (multiple instruction multiple data) under-$1000 chips like I've talked about. Had that architecture kept up with Moore's law, 1000 core chips would have been standard in 2010, and we'd have chips approaching 1 million cores today. Then children using toy languages would be able to try alternatives like genetic algorithms, simulated annealing, etc etc etc with one-liners and explore new models of computation. Sadly, my belief now is that will never happen.
But hey, I'm always wrong about everything. RISC-V might be able to do it, and a few others. And we're coming out of the proprietary/privatization malaise of the last 20-40 years since the pandemic revealed just how fragile our system of colonial-exploitation-powered supply chains really is. A little democratization of AI on commoditized GPUs could spur these older/simpler designs that were suppressed to protect the profits of today's major players. So new developments more than 5-10 years out can't be predicted anymore, which is a really good thing. I haven't felt this inspired by not knowing what's going to happen since the Dot Bomb when I lost that feeling.
>What we really needed was something like the 3D chip from the Terminator II movie, ...
>... Docker takes away much of the complexity of orchestrating distributed processes.
The T-800 running on Docker: After failing to balance its minigun, it falls forward out of the office window, pancaking in the parking lot below. Roll credits.
The foundational technology, e.g. Transformers, was invented outside of OpenAI. OpenAI were the first to put all the bits together. Kudos to them for that, but if we're doing credit attribution, Hinton is definitely not someone who is just unfairly seeking the limelight, he's about as legitimate a voice as you could ask for.