
>> Also, we now have enough compute power to get something done. Kurzweil claims a human brain needs about 20 petaFLOPS. That's about one aisle in a data center today.

It's a bit late over here, so what I want to say may not come out as I mean it, but basically: who cares how much computing power we have? Most people are not trying to simulate the human brain anymore. And even if Kurzweil's maths were good (I doubt it - the whole idea of measuring anything about brains in FLOPS sounds just silly), it would just mean that all the computing power in the world is no good if you don't know how to connect the dots.

And we don't. Lots of computing power is good for machine learning, obviously, but strong AI is never going to happen like that, by brute-forcing crude approximations from huge data sets while pushing our hardware to its limits. That's a dumb way to make progress. Basic computer science says that if you want to go anywhere with a problem, you need to solve it in linear time at the very worst. Look at what we have instead: machine learning algorithms' complexities start at O(n²) and go up. We just don't have good algorithms, we don't have good solutions, we don't have good anything.
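To put rough numbers on that (a back-of-the-envelope illustration in Python, not tied to any particular algorithm):

    # How fast the work grows for linear vs. quadratic algorithms
    # as the dataset size n goes up.
    for n in (10**3, 10**6, 10**9):
        print(f"n={n:>13,}   O(n) steps={n:,}   O(n^2) steps={n**2:,}")

At a billion examples, the quadratic method does a billion times the work of the linear one. That's the wall.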

We have "more powerful computers" today. So what. Today we solve the stuff that's possible to solve, the low-hanging fruit of problems. Tomorrow, the next problem we try to tackle is ten times bigger, a hundred times harder, we need machines a thousand times more powerful... and it takes us another fifty years to make progress.

So, no, it doesn't matter how powerful our computers are. We can't cheat our way around the hard problems. Because they're just that: hard. We're just gonna have to be a lot smarter than we are right now.




> Today we solve the stuff that's possible to solve, the low-hanging fruit of problems. Tomorrow, the next problem we try to tackle is ten times bigger, a hundred times harder, we need machines a thousand times more powerful...

Nah, if you can't handle the huge amount of data, it's possible to just switch to a sparse model or do MC-like sampling. Take AlphaGo as an example: a huge state space, yet it was tractable to beat a human expert. That way the network doesn't scale linearly with the size of the ___domain being learned.
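The core trick is small enough to sketch (a toy Monte Carlo rollout in Python with a made-up game-state interface - play/legal_moves/winner are assumptions, not AlphaGo's actual code):

    import random

    def rollout_value(state, move, n_rollouts=100):
        # Estimate how good `move` is: play it, then finish the game with
        # uniformly random moves, n_rollouts times, and average the outcome.
        # The cost depends on the rollout budget and game length,
        # not on the size of the state space.
        wins = 0
        for _ in range(n_rollouts):
            s = state.play(move)
            while not s.is_terminal():
                s = s.play(random.choice(s.legal_moves()))
            wins += 1 if s.winner() == state.current_player() else 0
        return wins / n_rollouts

You only ever touch the states the rollouts happen to visit, which is why a ___domain as big as Go stays tractable.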

These kinds of solutions don't rely on improving the hardware. What's needed are datasets, competitions and grants to get people to try to improve state-of-the-art results on them. It's been demonstrated that when a good benchmark appears, a lot of papers follow and top results improve massively. Another useful component would be simulators for training agents in RL.

A promising direction is extending neural networks with memory and attention, in order to focus their work more efficiently and to access external knowledge bases. As we improve on these knowledge bases and ontologies, all we have to learn is how to operate on them.
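The attention part is conceptually tiny - here's a rough numpy sketch of attending over an external memory (names and shapes are made up for illustration):

    import numpy as np

    def attend(query, memory_keys, memory_values):
        # query: (d,), memory_keys and memory_values: (n, d)
        # Scores measure how relevant each memory slot is to the query;
        # softmax turns them into weights; the result is a weighted read.
        scores = memory_keys @ query / np.sqrt(query.shape[0])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        return weights @ memory_values

    # e.g. reading from a 4-slot memory of 8-dimensional vectors
    memory = np.random.randn(4, 8)
    print(attend(np.random.randn(8), memory, memory))

The network learns what to write into the memory and what to ask of it; the lookup itself is just this weighted read.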

Thus, improvements can come in various ways - sampling, sparsity, external knowledge bases, better research frameworks - and improving the hardware (a better GPU card or a dedicated device) is just one factor among them.


>> Nah, if you can't handle the huge amount of data, it's possible to just switch to a sparse model or do MC-like sampling... That way the network doesn't scale linearly with the size of the ___domain being learned.

That's useful when your ___domain is finite, like Go in your example. If you're dealing with a non-finite ___domain, like language, MC won't save you. When you sample from a huge but finite ___domain, you eventually get something manageable. When you sample from an infinite ___domain, you get back an infinite ___domain.

That's why approximating infinite processes is hard: because you can only approximate infinity with itself. And all the computing power in the world will not save you.

>> It's been demonstrated that when a good benchmark appears, a lot of papers follow and top results improve massively.

Mnyeah, I don't know about that. It's useful to have a motivator, but on the other hand the competitions become self-fulfilling prophecies: the datasets come with biases that the real world has no obligation to abide by, and the competitors tend to optimise for beating the competition rather than solving the problem per se.

So you read about near-perfect results on a staple dataset, so good that it's meaningless to improve on them - 98.6% or something. Then you wait and wait to see the same results in everyday use, but when the systems are deployed in the real world their performance goes way down. You end up with a system that got 99-ish on the staple dataset but 60-ish in production, as many others did before it. What have we gained, in practice? We learned how to beat a competition. That's just a waste of time.

And it's even worse because it distracts everyone, just like you say: the press, researchers, grant money...

Well, OK, I'm not saying the competitions are a waste of time, as such. But overfitting to them is a big problem in practice.

>> A promising direction is extending neural networks with memory and attention

That's what I'm talking about, isn't it? Just raw computing power won't do anything. We need to get smarter. So I'm not disagreeing with you; I'm disagreeing with the tendency to throw a bunch of data at a bunch of GPUs and say we've made progress because the whole thing runs faster. You may run faster on a bike, but you won't outrun a horse.

(Oh dear, now someone's gonna point me to a video of a man on a bike outrunning a horse. Fine, internets. You win).


Computing power absolutely does matter, because it allows us to run more complicated experiments in a reasonable amount of time, which is crucial for moving research forward. Today we work on the low-hanging fruit so that tomorrow we can reach for something higher. As a side note, your comment about runtime complexity does not make much sense when there exist problems which provably cannot be solved in linear time. It is dangerous to discourage research on that simplistic basis; we could have much more powerful POMDP solvers today (for instance) if people hadn't been scared off by overblown claims of intractability fifteen years ago.


>> your comment about runtime complexity does not make much sense when there exist problems which provably cannot be solved in linear time.

Look, it's obvious the human mind manages to solve such problems in sub-linear time. We can do language, image processing and a bunch of other things still much better than our algorithms. And that's because our algorithms are going the dumb way, trying to learn approximations of probably infinite processes from data, when that's impossible to do in linear time or better. In the short term, sure, throwing lots of computing power at that kind of problem speeds things up. In the long term it just bogs everything down.

Take vision, for instance (my knowledge of image processing is very shaky, but still). CNNs have made huge strides in image recognition etc., and they're wonderful and magickal, but the human mind still does all that a CNN does, in a fraction of the time and with added context and meaning on top. I look at an image of a cat and I know what a cat is. A CNN identifies an image as being the image of a cat and... that's it. It just maps a bunch of pixels to a string. And it takes the CNN a month or two to train at the cost of a few thousand dollars; it takes me a split second at the cost of a few calories.
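To be concrete about "maps a bunch of pixels to a string", this is roughly all that inference with a pretrained CNN looks like (a sketch assuming a recent torchvision; cat.jpg is a made-up path):

    import torch
    from PIL import Image
    from torchvision import models, transforms

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    model = models.resnet18(weights="IMAGENET1K_V1")
    model.eval()

    pixels = preprocess(Image.open("cat.jpg").convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        class_id = model(pixels).argmax(dim=1).item()
    # class_id is just an index into the ImageNet label list, e.g. "tabby cat".

An index into a label list comes out; nothing in there knows what a cat is.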

It would take me less than a second to learn to identify a new animal, or any thing, from an image, and you wouldn't have to show me fifteen hundred different images of the same thing in different contexts, different lighting conditions or different poses. If you show me an image of an aardvark, even a bad-ish drawing of one, I'll know an aardvark when I see it _in the flesh_ with very high probability and very high confidence. Hell, case in point: I know what an aardvark is because I saw one in a Pink Panther cartoon once.

What we do when we train with huge datasets and thousands of GPUs is just wasteful: it's brute-forcing and it's dumb. We're only progressing because the state of the art is primitive and we can make baby steps that look like huge strides.

>> It is dangerous to discourage research on that simplistic basis

It's more dangerous to focus all research efforts on a dead end.


It takes many months to train a human brain to recognize what a cat is, and far more than a few calories - and it needs not only a huge amount of data, but also the ability to experiment; e.g. we have evidence that passive seeing, without any moving or interaction, is not sufficient for a mammalian brain to learn to "see" usefully.

Your argument about classifying images trivially excludes the large amount of data and training that any human brain experiences during early childhood.


>> Your argument about classifying images trivially excludes the large amount of data and training that any human brain experiences during early childhood.

Not at all. That's exactly what I mean when I say that the way our brain does image recognition also takes into account context.

Our algorithms are pushing the limits of our computing hardware, and yet they have no way to deal with the context a human toddler has already collected in his or her brain.

>> It takes many months to train a human brain to recognize what a cat is, and far more than a few calories

I noted it would take _me_ less than a second to learn to identify a new animal from an image. Obviously my brain is already trained, if you like: it has a context, some sort of general knowledge of the world that is still far, far beyond what a computer can handle.

I'm guessing you thought I was talking about something else, a toddler's brain maybe?



