Why is it that Neural Net-based ML only seems to be claiming results with images and natural language?
Maybe I'm out of the loop, but I haven't seen anything demonstrating results on "data" – the kinds of challenges that are actually valuable to businesses.
Why is that? Are those just less sexy / more proprietary in nature, or is there something about those challenges that make NN's less useful to them?
First off, there is something about those challenges. These NNs are good at exploiting the 'spatiality prior' in some types of data, like text and images: features that are close together in the data should be combined as you climb the hierarchy of features. Databases with rows and columns don't have that prior, for instance.
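To make the spatiality prior concrete, here is a minimal sketch (my own illustration, not from the thread): a 1-D convolution combines *neighboring* values, so it only helps when adjacency in the data is meaningful.

```python
import numpy as np

def conv1d(signal, kernel):
    """Slide the kernel over the signal, combining each local window."""
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

# A time series: neighbors are related, so an edge-detecting kernel
# finds the jump wherever it occurs (the spatial prior at work).
series = np.array([0., 0., 0., 1., 1., 1.])
edges = conv1d(series, np.array([-1., 1.]))  # peaks at the step

# A database row: shuffling the columns doesn't change its meaning,
# so a local filter over adjacent columns is arbitrary — no such prior.
row = np.array([37.2, 120000., 3., 0.5])  # age, salary, children, score
```

The same local filter applied to `row` would mix "salary" with whatever column happens to sit next to it, which is why convolutional architectures buy you little on tabular data.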
Second, there is also the peer review problem. You are still trying to explain a very abstract concept to your peers in a paper that is usually limited to 6 or 8 pages. Text and images make for very graspable examples in such a short paper. That's why other data with a spatial prior, like time series or EEG data, is not used as often.
So, there is a combination of those two elements at play.
Only the first reason is correct (NNs are good at data with dimensional relationships).
The second reason is pretty bogus (text/images being more graspable). It's valid if you're talking about mass media / the popular press. But for research papers, 1) images and large snippets of text are actually a negative, since images take up a lot of page space, and 2) the people doing peer review are expert scientists. They know the benchmarks and the theory.
Because this is B2B and usually done by specialized software companies. These companies don't publish or open-source their solutions, because doing so would be against their own or their customers' interests.
Examples are the whole predictive maintenance sector, the medical sector (computer-aided diagnosis), and insurance companies, which use NNs for all kinds of analyses.
There are many good answers already, but my two cents is that there are many standard statistical methods that work on basic "data"-type problems. If your data is spreadsheet-type data, with some number of basic float inputs that lead to a float result, you will probably use ordinary statistical methods like boosting (e.g. gradient-boosted trees). NNs might be able to achieve the same results, but they probably won't do much better.
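For readers unfamiliar with the boosting methods being referred to, here is a minimal sketch (my own toy illustration, not a production library): gradient boosting with depth-1 decision trees ("stumps"), fitting each new stump to the current residual — exactly the spreadsheet-style setting described above.

```python
import numpy as np

def fit_stump(X, residual):
    """Find the (feature, threshold) split that best fits the residual."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue
            lv, rv = residual[left].mean(), residual[~left].mean()
            pred = np.where(left, lv, rv)
            err = ((residual - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:]  # (feature, threshold, left value, right value)

def boost(X, y, rounds=50, lr=0.1):
    """Sequentially add stumps, each fit to what the ensemble still misses."""
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred += lr * np.where(X[:, j] <= t, lv, rv)
        stumps.append((j, t, lv, rv))
    return (y.mean(), lr, stumps)

def predict(model, X):
    base, lr, stumps = model
    pred = np.full(len(X), base)
    for j, t, lv, rv in stumps:
        pred += lr * np.where(X[:, j] <= t, lv, rv)
    return pred
```

In practice you would reach for an off-the-shelf implementation (e.g. gradient-boosted trees in a standard ML library) rather than this sketch, but the structure is the same: each round partitions the rows by thresholding one column, which is a natural fit for tabular data and a poor fit for raw pixels.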
In fact, I remember talking to an ML guy with a PhD who was working on one of these types of problems, and I asked, "why not try NNs on this problem?" He looked at me with disgust and said something akin to "it's provable that NNs can never do better than BOOST, so why use them?"
However, boosted decision trees don't work on image analysis at all, so these types of problems have become the standard for NNs.
It is also worth noting that people are more likely to try image problems if everyone else is trying image problems, because then it is easy to compare multiple algorithms together.