
I don't have an ML or deep learning background (no Masters or PhD), so I'm adding a comment from my experience backtesting trading systems. We collect market data and design algorithms that seem to produce the kind of outcomes we want, then test them on other data sets the algorithms have never been applied to. Many iterations later, you can get a decently profitable algorithm. And yet, if the 'holy grail' algo runs in the market long enough, eventually there will be a severe drawdown and it will go bust. The quality of the algo, and I assume of a deep learning model too, lies in the quality (breadth and depth) of the data, and in how honest the person chooses to be with themselves when modeling it. Time and again there will be new 'black swan' or edge events (remember LTCM), because using machine learning is like using the past to predict the future.
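A minimal sketch of that out-of-sample discipline in Python (the fit/evaluate hooks, names, and split sizes are hypothetical; this just shows the shape of a walk-forward test):

    import pandas as pd

    def walk_forward(prices: pd.Series, fit, evaluate, n_splits=5):
        """Fit on each in-sample window, score only on the unseen window after it."""
        fold = len(prices) // (n_splits + 1)
        scores = []
        for i in range(1, n_splits + 1):
            train = prices.iloc[: i * fold]                 # data the algo may see
            test = prices.iloc[i * fold : (i + 1) * fold]   # data it has never seen
            params = fit(train)
            scores.append(evaluate(params, test))
        return scores  # dispersion here hints at how fragile the 'holy grail' is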

I guess as long as the users' expectations are correct, it can be useful in some very specific areas. Referring to the AlphaGo games last year: I was a Go player for more than a decade, and yet AlphaGo's weird moves inspired new insights that break the conventional structure / thinking framework of a Go player. From that angle, I do think that even though DL is somewhat of a black box, humans can pick up new insights from it, because it explores areas that would normally seem ridiculous for a human with 'common sense' to explore.




> The quality of the algo, and I assume of a deep learning model too, lies in the quality (breadth and depth) of the data, and in how honest the person chooses to be with themselves when modeling it.

I've only dabbled with machine learning here and there for the past 10 years or so, but if there's one thing I've learned so far, it's that the data behind your ML code (and the way it is structured) is responsible for almost all the success or failure of any given ML algorithm. I have a younger colleague at work whom I've started tutoring, and he seems really interested in doing ML work (maybe because of all the recent hype).

I've tried to emphasize to him several times that ML algorithms come and go and that he should focus a lot of his time on the data itself (where does he intend to collect it from? how is it structured? is it reliable? is it "enough"? etc.), but it seems my data-related advice falls on deaf ears every time; he's only interested in me pointing him to the latest cool ML algorithm. I guess he'll live and learn, so to speak.


> if there's one thing I've learned so far, it's that the data behind your ML code (and the way it is structured) is responsible for almost all the success or failure of any given ML algorithm

Data is indeed a necessary condition, but certainly not a sufficient one. You need a good marriage between feature engineering and data to have a good success rate. Learning curves [0] are a good way to tell whether your ML algorithm needs more data or better feature engineering.

[0] http://mlwiki.org/index.php/Learning_Curves
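As a concrete illustration, here's a minimal sketch using scikit-learn's learning_curve (the dataset and model are placeholders): if the training and validation scores converge at a low value, better features are the likely fix; if a large gap remains between them, more data may help.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import learning_curve

    X, y = make_classification(n_samples=2000, random_state=0)
    sizes, train_scores, val_scores = learning_curve(
        LogisticRegression(max_iter=1000), X, y,
        train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

    # validation scores converging up to the training scores -> more data
    # won't help much; a persistent gap between them -> it might
    print(sizes)
    print(train_scores.mean(axis=1))
    print(val_scores.mean(axis=1))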


Much of the programming work in ML has moved towards cleaning, extrapolating and generating the data.

But this type of programming is, miraculously, bug-free. We never hear of data conversions gone wrong, data getting corrupted, or data mining yielding inconclusive results here. Obviously such bugs lack the glamour of security bugs.


It's also very difficult to catch these errors. Your trained model just doesn't work as well as it could, but how would you be able to tell?
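One partial answer is to make at least the detectable data bugs loud before training. A minimal sketch with pandas (the schema and column names are hypothetical):

    import pandas as pd

    def validate(df: pd.DataFrame) -> pd.DataFrame:
        # cheap invariants that silent data bugs tend to violate
        assert df["price"].notna().all(), "missing prices"
        assert (df["price"] > 0).all(), "non-positive prices"
        assert df["timestamp"].is_monotonic_increasing, "out-of-order rows"
        assert not df.duplicated().any(), "duplicate rows"
        return df

It won't catch the subtler problems (label noise, leakage), but it turns one class of "model is mysteriously mediocre" failures into immediate errors.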


> focus a lot of his time on the data itself... where does he intend to collect it from? how is it structured? is it reliable? is it "enough"?

What are the best books on this subject? I suppose it's a very broad topic and thus more difficult to write about than a single "neural network" algorithm.


Interested in what part of that you feel needs to be explained in more depth? I'm not sure reading several books is necessary for learning data collection and data munging... to me it's definitely something best learned by doing.

(I work in data analysis/stats.)


Lots of things are best learned by doing. I just noticed there are dozens of books about machine learning algorithms but none on how to gather data. Of course, both of those things can be learned independently, but I think there's room for at least a few books about data gathering, considering it's so important for good machine learning results.


Here at Manning (we're publishing François's book) we have something on this in our early access program now: https://www.manning.com/books/the-art-of-data-usability


This is the ___domain of statistics, isn't it?


Agreed. AFAIK, only statistics has seriously addressed the questions of information sufficiency in data and the discriminative power of a method. Personally, I think the former is an enormously important subject that isn't addressed well in most ML texts. How much data is necessary to answer a given question in practice? How do you know if your data or method are "good enough"?
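For a flavor of how statistics answers the "how much data" question, here's the classical sample-size formula for estimating a proportion to within a given margin of error (a sketch only; a real study needs a power analysis suited to the actual question):

    import math

    def sample_size_for_proportion(margin=0.05, z=1.96, p=0.5):
        # n >= z^2 * p * (1 - p) / margin^2
        # (95% confidence, worst case p = 0.5)
        return math.ceil(z**2 * p * (1 - p) / margin**2)

    print(sample_size_for_proportion())  # 385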

From what I've seen, statistics addresses these questions better than CS-taught ML does. CS-based ML is treated much like algorithm analysis, yet it suffers from sensitivity to limits inherent in the data, and ML courses often don't address those limits very rigorously. Knowing them is all-important when mining information effectively at a professional level.

If you can't tell the decision maker what you know and what you don't, your inference/prediction really isn't useful. From what I've seen, statistics addresses this best.


Thanks for sharing your experience. I'm happy that my previous exposure to trading algorithms at least helps me understand what the experts here are talking about. I believe the output model is only as good as the data (at least for the deep learning branch of ML). If the dataset does not cover data points that exist in a wider space but within the same ___domain of the problem, or that don't yet have a precedent, then we really can't simply assume it is the algo/model that needs tweaking when shit hits the fan.


This is incredibly true: even with crappy old algorithms you can do A LOT if you have great data.

A recent experience: a company that is building some models based on... a few guys recording a few hours of audio and annotating it. I still can't get over the fact that otherwise smart people think this is going to work at all.


> but it seems my data-related advice falls on deaf ears every time; he's only interested in me pointing him to the latest cool ML algorithm.

So, it seems their learning/planning algorithm fails, even when it is given the right data. That's unfortunate.

Sorry, I can't help but notice that you aren't happy with their brain's algorithm while talking about the importance of data. I'm not saying that data doesn't matter or anything. Just a random observation.


It could actually be their data, right? Imagine you'd only had experience with software engineering. The only data you use when engineering software are the data you gather when using the product or writing tests; it's all the algorithms behind it that are important. So to them, they just don't have data on situations where the data are important.

Wow, that's confusing wording. I hope it makes sense.


It does, but the algorithm doesn't seem to be state-of-the-art; it's more like current ML algorithms, which need lots of data to work successfully in each new ___domain. Well, there's a lot of room for improvement, at least.


The data processing inequality says processing data does not increase its information content.
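For reference, the usual statement (a standard information theory result, in LaTeX notation):

    X \to Y \to Z \ \text{(Markov chain)} \quad \Longrightarrow \quad I(X;Z) \le I(X;Y)

That is, if Z is computed from Y alone, it cannot carry more information about X than Y already does.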


But processing does increase the "obviousness" of the information content.

E.g. projecting the data onto independent dimensions doesn't change the information it contains, but it highlights that those dimensions are indeed independent. Decomposing a multimodal distribution into a mixture of unimodal distributions gives more insight than just viewing it as a bunch of data mushed together. And so on.
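A toy sketch with scikit-learn (synthetic data, illustrative only): the PCA rotation is invertible, so it adds no information, yet it and a mixture decomposition both make the structure much easier to see.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # two overlapping clusters 'mushed together'
    data = np.vstack([rng.normal(0, 1, (500, 2)),
                      rng.normal(4, 1, (500, 2))])

    rotated = PCA(n_components=2).fit_transform(data)   # same information, clearer axes
    gmm = GaussianMixture(n_components=2, random_state=0).fit(data)
    print(gmm.means_)  # roughly recovers the two cluster centres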

I think there should be a branch of information theory that quantifies the obviousness of information and how it is changed by various data processing methods.


The "creative" moves may very well come from the search part of the AlphaGo algorithm, though of course the networks have done their jobs of pruning the search space.


I see... That's true. Though the credit still goes to the algo for choosing that particular weird move out of the entire search space (it's 'weird' in the sense that you'd think it was a move made by a total newbie to the game). I remember that for that whole week, during lunchtime, I would watch the broadcast live on YouTube. How devastated I was to see Lee Sedol losing match after match. It was a moment I will never forget: in my mind, the computer had crossed an imaginary threshold, and it won. I know ML/DL experts will say it only applies to a very specific area. But what's stopping it from mastering enough 'specific' areas that the combined mastery becomes broad enough to pass a Turing test?


Careful, that's the sort of thinking that led to the last 'AI Winter': assuming that if enough rule-based expert systems were built, general-purpose systems could be assembled from them and/or enough could be learned to build general-purpose systems.

Now, it is worth noting that DL models are already being assembled into larger systems (often with a coordinating DL model to switch between them). This can have the advantage of the smaller models being reusable to some extent (certainly more than expert systems ever were) but is not a panacea. The results are still essentially bespoke models rather than general-purpose ones.
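A minimal sketch of that coordination pattern (hypothetical module names, assuming PyTorch): a small gating network learns to weight the outputs of pre-built expert models, mixture-of-experts style.

    import torch
    import torch.nn as nn

    class GatedEnsemble(nn.Module):
        def __init__(self, experts, in_dim):
            super().__init__()
            self.experts = nn.ModuleList(experts)
            self.gate = nn.Linear(in_dim, len(experts))  # the coordinating model

        def forward(self, x):
            w = torch.softmax(self.gate(x), dim=-1)                   # (batch, n_experts)
            outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, out, n_experts)
            return (outs * w.unsqueeze(1)).sum(dim=-1)

    # e.g. two 'expert' networks reused under one gate
    model = GatedEnsemble([nn.Linear(8, 4), nn.Linear(8, 4)], in_dim=8)
    y = model(torch.randn(32, 8))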

Deep Learning obviously has a lot more mileage left in it, given that much human mental labor is 'just' training and using our general-purpose intellects on what amounts to a series of rather narrowly defined tasks, but it won't surprise me if there is a wall of some sort lurking just over the horizon that will require a different approach (albeit one that may still be called 'deep learning') to cross.

OTOH, it does seem as though the folks at DeepMind are fairly aggressively pursuing whatever is on the other side of that particular horizon:

https://deepmind.com/blog/neural-approach-relational-reasoni...

https://deepmind.com/blog/cognitive-psychology/

https://deepmind.com/blog/imagine-creating-new-visual-concep...


We can debate, but I don't think another AI winter will happen in my lifetime. AI work is simply earning too much money for its funding to get cut, and a lot of the funding is currently private too.


I wasn't arguing for another AI winter per se. My warning was more along the lines of pointing out a potential personal "career winter".


I'd be surprised to see inductive learning anytime soon. But I definitely see the next generation of AI systems and robots being implemented across industry. That will rapidly fill out, though, and then we will still be left with self-determination.


My understanding is that the innovation comes from reinforcement learning during self-play (rather than supervised learning on pro games): it can go against the best moves suggested by AlphaGo's policy network, in turn pushing it towards new options.

In a sense, it seems innovation arises when the value network forces the policy network to expand the search space because an apparently unlikely move leads to downstream positions deemed favorable.


It's not that simple. The creativity is that the combination of rollouts, policy and value networks allows for more efficient traversal of the search space, which gets you better exploration of possible paths, meaning more options than a human would consider, and therefore more creativity.
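For the curious, the exploration/exploitation balance in AlphaGo's tree search is roughly a PUCT rule: each step picks the child maximizing Q(s,a) + c * P(s,a) * sqrt(N(s)) / (1 + N(s,a)), where P comes from the policy network and Q is backed up from the value network and rollouts. A simplified Python sketch (field names are hypothetical):

    import math

    def puct_select(children, c_puct=1.0):
        # children: action -> node with prior P, visit count N, total value W
        total_n = sum(ch.N for ch in children.values())
        def score(ch):
            q = ch.W / ch.N if ch.N else 0.0                     # exploitation
            u = c_puct * ch.P * math.sqrt(total_n) / (1 + ch.N)  # exploration
            return q + u
        return max(children, key=lambda a: score(children[a]))

An unlikely move (low P) whose backed-up value stays high (high Q) keeps getting selected, which is plausibly where the 'creative' lines come from.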



