The title has existed for years now at various companies (most with an open source program have one), so whatever you were supposed to do, you're too late now.
There's a grain of truth to what he's saying. We don't write many new actual C++/Java MapReduce jobs; it's mostly Flume. Although there is some momentum behind Go MapReduce.
The key word is pipeline. If you have some analysis that runs in several stages, you'll be taking the output of one stage and connecting it to the next. If you want to compose multiple phases chained together, raw MapReduce isn't going to help you much with the chaining.
What's described in the paper is a clean way to do that chaining. The system takes care of writing the raw MapReduces for you, but it also does a lot of the work on the interconnections between your stages.
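I can't speak to the paper's exact API, but Spark's Scala API has the same flavor: you write the stages as ordinary transformations and the planner works out the physical MapReduce-like steps and the shuffles between them. A rough sketch, where the input/output paths and the field layout are made up for illustration:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PipelineSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("pipeline-sketch").setMaster("local[*]"))

    // Stage 1: parse raw log lines into (user, bytes) pairs.
    val parsed = sc.textFile("hdfs:///logs/requests")   // hypothetical input path
      .map(_.split("\t"))
      .map(fields => (fields(0), fields(1).toLong))

    // Stage 2: aggregate bytes per user (this is the shuffle, i.e. the "reduce" boundary).
    val perUser = parsed.reduceByKey(_ + _)

    // Stage 3: keep only heavy users and format for output.
    val heavy = perUser
      .filter { case (_, bytes) => bytes > (1L << 30) }  // > 1 GB
      .map { case (user, bytes) => s"$user\t$bytes" }

    heavy.saveAsTextFile("hdfs:///reports/heavy-users")  // hypothetical output path
    sc.stop()
  }
}
```

The point isn't the specific transformations; it's that you never hand-wire the output of one MapReduce into the input of the next.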
MapReduce wasn't designed for iterative algorithms or streaming data, whereas Google Dataflow and Spark (http://spark.apache.org/) make iterative algorithms easy. They offer a much simpler programming paradigm, and it allows you to do iterative graph-processing and machine-learning algos (http://spark.apache.org/mllib/) that are impractical on MapReduce.
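To make "iterative" concrete, here's a toy Spark sketch with made-up data: the dataset is cached in memory once and every pass of the loop is just another job over it, whereas classic MapReduce would write to and re-read from disk between passes.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IterativeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("iterative-sketch").setMaster("local[*]"))

    // Made-up data: a few points; we iteratively refine an estimate of their mean.
    val points = sc.parallelize(Seq(1.0, 4.0, 7.0, 10.0)).cache()  // cached once, reused every pass

    var estimate = 0.0
    for (_ <- 1 to 10) {
      val current = estimate  // capture a plain val so only a Double ships with the closure
      // Each pass is another in-memory job, not a fresh disk-to-disk MapReduce round.
      val correction = points.map(p => p - current).mean()
      estimate += correction
    }
    println(s"estimate = $estimate")
    sc.stop()
  }
}
```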
Hey sometimes you just have to do whatever it takes to synergize global channels on virtual platforms. You know, really aggregate extensible markets with repurposed leading-edge metrics.
But how will these leading-edge metrics enable us to deliver paradigm shifting solutions to our customers while simultaneously reducing costs and increasing operational efficiency?
That's the best part - it's automated with a context-sensitive customizability infrastructure. That means you can expand any targeted benchmark in a demand-driven mesh.
Cool. Does this mean Google is moving away from C++ and Java toward languages that allow for easier use and serialization of closures? (For example, Spark uses Scala natively.)
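For what it's worth, here's what the closure story looks like on the Spark/Scala side (toy example, local mode): the lambda you pass to filter captures a driver-side variable, and Spark serializes that closure out to the executors, instead of you writing a separate Mapper class the way you would in Java MapReduce.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ClosureSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("closure-sketch").setMaster("local[*]"))

    val threshold = 42                    // a local variable on the driver
    val data = sc.parallelize(1 to 100)

    // The function literal captures `threshold`; Spark serializes the closure
    // and ships it to the executors.
    val big = data.filter(x => x > threshold).count()

    println(s"$big values above $threshold")
    sc.stop()
  }
}
```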
What do you mean, warehousing? Like, item tracking inside an actual warehouse? Hard to imagine spending more than a kB per unique item -- more per SKU, but less per individual object -- so even if you have 1M items being tracked, the total size would only be a gigabyte. Even if you had a billion unique things in your store, the resulting database would still fit on a single flash drive.
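Spelled out, that back-of-envelope math (sticking with the ~1 kB-per-item assumption above) is:

$$10^{6}\ \text{items} \times 1\,\text{kB} \approx 1\,\text{GB}, \qquad 10^{9}\ \text{items} \times 1\,\text{kB} \approx 1\,\text{TB}.$$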
This is not even true. They have recently published research that involved using MapReduce in their own systems. Example: http://research.google.com/pubs/pub41376.html