I’ve built one of these at FAANG. Generally the different parts of the system are completely separate teams that interact through APIs and ingest systems. Usually there’s a mix of online and offline calculations, where features are stored in a NoSQL DB and some simple model runs in a Tomcat server at inference time, or the offline result is just retrieved. Almost everything is precomputed.
We had an API layer where another team runs inference on their model as new user data comes in, then streams it to our API, which ingests the data.
On top of this, you have extensive A/B testing systems
I have as well, and your comment matches my experience more than the article does. Different teams own different systems, and there's basically no intersection between "things that require a ton of data/computation" and "things that must be computed online".
Yep. The author, as a peddler of recommendation solutions, has an incentive to convince people that this problem is very complicated and that they should hire a consultant.
In practice, good old Matrix Factorization works really well. Can you beat it with a huge team and tons of GPU hours to train fancy neural nets? Probably. Can you set up a nightly MF job on a single big machine and serve results quickly? Sure can.
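As a toy illustration of that nightly MF job, here is a dense ALS matrix factorization in plain numpy. The ratings, rank, and regularization below are made up, and every entry is treated as observed to keep it short; a real job would handle missing entries and implicit feedback (e.g. via the `implicit` library):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ratings matrix: 4 users x 5 items (every entry treated as observed,
# which reduces each ALS update to one ridge-regression solve per side)
R = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [0, 1, 5, 4, 5],
    [1, 0, 4, 5, 4],
], dtype=float)

k, reg = 2, 0.1                                   # latent rank, L2 penalty
U = rng.normal(scale=0.1, size=(R.shape[0], k))   # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))   # item factors

for _ in range(50):
    # Alternate: solve for U holding V fixed, then for V holding U fixed
    U = R @ V @ np.linalg.inv(V.T @ V + reg * np.eye(k))
    V = R.T @ U @ np.linalg.inv(U.T @ U + reg * np.eye(k))

scores = U @ V.T                 # predicted affinity for every (user, item)
top = np.argsort(-scores[0])     # item ids for user 0, best first
```

The whole score matrix can then be dumped to a key-value store overnight, which is the "almost everything is precomputed" part mentioned upthread.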
I think the question of "fancy technique versus simple technique" is beside the point. Assume for the sake of argument that you have a research organization and that it's worth your while to commit their time to a recommendation system.
The point here is that you don't typically need a huge amount of computation power to serve recommendations, even if the underlying model is sophisticated and required a lot of computation to train.
Likewise for data access, the online recommendation system typically does not need full access to the databases that the researchers need access to.
Yeah, same here. What NoSQL DB did you use for these lookups? I'm currently using Postgres for it, but it seems a bit like a waste, even though the array field is nice for feature vectors.
The main issue with deploying these systems right now is the technical overhead of developing them. Existing solutions are either paid and require you to share your valuable data, or open source but either abandoned (RIP Crab) or inextensible (most rely on their own DB or Postgres).
I’d love to see a lightweight, flexible recommendation system at a low level, specifically the scoring portion. There are a few flexible ones (Apache has one), but none are lightweight; they require massive servers (or often clusters). They also can’t be bundled into frontend applications, which makes it difficult for privacy-centric, own-your-data applications to compete with paid, we-own-your-data-and-will-exploit-it applications.
It has everything you need at a platform level to build a production recommendation system, given that it’s the engine that powered a lot of Yahoo products’ search and recommendation capabilities. I have been experimenting with it, and the range of capabilities is immense. It’s really an untapped resource.
> As a result, primary databases (e.g. MySQL, Mongo etc.) almost never work
I mean, they do work. As far as I'm aware, Facebook's ad platform is mostly backed by hundreds of thousands of MySQL instances.
But more importantly this post really doesn't describe issues of scale.
Sure, it covers the stages of recommendation, which may or may not be accurate, but it doesn't describe how all of those processes are scheduled, coordinated, and communicate.
Stuff at scale is normally a result of tradeoffs: sure, you can use an ML model to increase a retention metric by 5%, but it costs an extra 350ms to generate and will quadruple the load on the backend during certain events.
What about the message passing? Is that one monolith making the recommendation (cuts down on latency, kids!) or microservices? What happens if the message doesn't arrive? Do you have a retry? What have you done to stop retry storms?
Did you bound your queue properly?
None of this is covered, and, my friends, that is 90% of the "architecture at scale" that matters.
Normally stuff at scale is "no clever shit" followed by "fine, you can have that clever shit, just document it clearly... oh, you've left", which descends into "god this is scary and exotic", finally leading to "let's spend half a billion making a new one with all the same mistakes."
Meta is relatively open (and open source) in how they handle stuff, including ranking, scoring and filtering described in the original article, but also fast inverted indexes and approximate nearest neighbors in high-dimensional spaces. See, for instance, Unicorn [1,2] or (at a lower level) FAISS [3].
> mostly backed by hundreds of thousands of Mysql instances
Kind of. It's part of the recipe, but one thing you find at these large tech companies (I've worked at FB and GOOG) is that they have the resources to bend even large/standard projects like MySQL to their will, while ideally preserving the good ideas that made them popular in the first place. There are wrappers/layers/modifications/etc. that eventually evolve to subsume the original software, such that it's acting more like a library than a standalone service/application. So, for example, while your data might eventually sit in a MySQL table, you'll never know, and you likely didn't write anything specific to MySQL (or even SQL) to get there.
What you're describing sounds like you mean something on the level of Cockroach, talking the Postgres wire protocol but implemented entirely independently underneath (which came indirectly out of Google). Facebook's MySQL deployment sounds more like a heavily-patched-but-basically-MySQL installation. I think Facebook is overanalogised to Google sometimes, as an engineering org.
(Admittedly I haven't worked at either, whereas you have - though I have at another FAANG, fwiw - but am basing this impression partly on what I hear from friends and partly on plain old stuff I read on the internet.)
FB uses MySQL in two very different ways: for the giant social-network database, MySQL is basically a key-value store used as the storage layer for the graph database built on top. Then for the thousands of small utility databases (small enough to fit on a single machine), it’s used in a very vanilla way.
Do you think part of this is that Netflix has assumed zero effort from user model? My experience has been that Netflix does an OK job of recommendations but fails at the overall discovery experience. There is no way for me to browse or view content from different angles easily. I end up googling for expert opinions or hitting up Rotten Tomatoes to get better reviews. Netflix knows a ton about me and their content, but seems to do a poor job of making that content browsable/discoverable overall. I do like their "more like this" feature, where I can see similar titles.
Perhaps it's because it's a niche that isn't worth investing the resources into? Sometimes narrow problem spaces are harder for a company to justify because of the cost-to-reward ratio. I agree with what you want and would like it myself for music (Spotify, Amazon Music, etc.), but it's a complex problem (recommenders, custom UI, and the glue between them) that is hard to justify compared to incremental small improvements to existing general-purpose recommendations.
It just seems like there's a paucity of signal from which Netflix could come up with anything intelligent. Movies are many hours long, and there are many reasons I could be watching something. What does it mean that I allowed a movie to play to completion? Was I even paying attention? Did I decide I hated it three-quarters of the way through, but finish it just because I cared about the plot?
TikTok, on the other hand, has way more data. Things like time-to-swipe, shares, comments presumably form the basis of some sentiment metric.
Google TV has the best content discovery I've come across so far. Recommendations across most streaming services based on overall similar movies, different slices of the genre, and movies with similar directors/cast members. Plus as soon as you select another movie, you can see all the same "similar" recommendations for that movie.
>Do you think part of this is that Netflix has assumed zero effort from user model?
Talking w/a friend who works at Netflix, it sounds like this is a warranted assumption. The way he told it, they were tearing their hair out at one point b/c users wouldn't put much into it.
What I don't understand about their response is: why not make it configurable? Admittedly this is my philosophy for almost every product I work on - "make it maximally configurable, but make the defaults maximally sane" – but I'm baffled every time I hear someone talking about this 'dilemma'.
You just keep your simple interface, but allow the power users to, say, click through to a particular menu and change their setting – the setting in this case being ~"let me provide feedback / configure how recommendations work". For that kind of user, finding a 'cheat code' is actually a gratifying product experience anyway.
I think it's because the complexity of allowing configurability isn't always worth it. Verifying that it works for all configurations becomes exponentially harder.
I believe it can also have performance implications, especially for things like recommender systems where you depend a lot on caching, precomputation, and training.
I agree, but as aleksiy123 suggests, there is an additional complexity burden, and it is a long journey to teach users to make use of a new technology. I think a lot of "advanced" features get de-prioritized because not many people use them, and it seems like resources could be better spent helping the masses. I think that the importance of "advanced" features is often underrated by traditional engagement models. Wikipedia is a great example, where less than 1% of users click the edit button, but that 1% adds all the value for the other 99%.
I think it's both. I'm usually able to find decent stuff by searching "best on netflix" with some modifiers, but I almost never find new stuff I like by scrolling on netflix.
In some ways it seems like a classic case of trying to solve the wrong problem because the wrong problem potentially has a technical solution. The real problem is making lots of interesting content for people to watch. If you can solve that problem then a simple system of categories is perfectly sufficient for people to discover content. But that’s not a technical problem, and all those engineers have to be given something to do.
This indicates that the problem is difficult to solve at scale and customized per person. Maybe the issue is with our expectations - I find other people are pretty bad at recommending things for me as well.
Maybe. Recommendation systems definitely seem to get worse as they scale. Amazon's was incredible circa 2000. Pandora seems to be getting worse and more repetitive. Netflix kept getting better and better until they ended their contest and since then they seem to have only become worse.
Rotten Tomatoes works fine as a recommendation system. It lists all of the new content coming out in a given week. I just read that every week, file down to what looks interesting based on the premise, and read a few reviews. I can usually tell pretty easily what I'll like. No need for in-app recommendations from any specific streaming service at all. Good old-fashioned human expert curators.
I was actually pretty impressed the other day when searching for "shiloh" (which they didn't have) because it showed a bunch of "related" queries to other dog movies (they also didn't have). The available search results were a little lacking though.
Interesting post. One thing to note: this seems to be about "on request" ranking, e.g. googling something and needing the recommended content within 500ms.
However, a lot of use cases are time-insensitive rankings, like recommending content on Netflix, Spotify, etc. (Spotify's Discover Weekly even has a one-week (!) request time :D).
In which case you can just run your ranking offline, store the recs in your DB, and it's much, much easier.
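A minimal sketch of that precompute-and-serve pattern, with SQLite standing in for whatever store you actually use (the table name, schema, and item ids here are made up):

```python
import sqlite3

# Nightly batch job: write each user's precomputed top-N item ids to a table;
# at request time, serving is just a primary-key lookup.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE recs (user_id TEXT PRIMARY KEY, item_ids TEXT)")

precomputed = {"u1": ["i9", "i4", "i7"], "u2": ["i2", "i9"]}  # from the ranking job
db.executemany(
    "INSERT OR REPLACE INTO recs VALUES (?, ?)",
    [(u, ",".join(items)) for u, items in precomputed.items()],
)

def serve(user_id):
    # Online path: a single indexed lookup, no model inference at all
    row = db.execute("SELECT item_ids FROM recs WHERE user_id = ?", (user_id,)).fetchone()
    return row[0].split(",") if row else []
```

The `INSERT OR REPLACE` means the nightly job can simply overwrite yesterday's results in place.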
This is pretty much what both Netflix and Spotify do. I would argue that there isn't a canonical recommendations stack that FAANG is converging towards, and that's a direct corollary of differing business requirements and organizational structure.
Not a direct answer to your question, but at least in English there isn’t that much stuff on how TikTok’s recommendation system works internally. This is the best breakdown I have found on TikTok’s recommendation system internals: https://leehanchung.github.io/2020-02-18-Tik-Tok-Algorithm/
TikTok seems to be learning from what the user is actually watching and for how long, not just from the user's "Like"/"Not Interested In" actions. However, it still seems to learn from the "Not Interested In" action more than any other platform does.
This is a pretty misinformed take when it’s publicly known that YouTube was already doing this (learn from what the user is watching and for how long) the year Bytedance was founded (2012):
Somehow they're doing it better. At least subjectively, people complain more about the YouTube algo's performance than tiktok. For the latter, the most common complaint is that it's too good.
Does YouTube actually work for someone? I bit the bullet after not really giving any input other than what I watched and started marking "Not interested" or "Don't recommend channel", but I cannot see that it has had any effect at all; it keeps recommending rubbish "sensationalist" videos.
I know that they probably optimize for ads etc. but if they actually showed me videos I would like to see, then I would spend more time on the platform.
These sensationalist videos are a plague for me. It only takes watching two or three regular videos on a specific topic for YouTube to home in on it and start populating my feed with them. I find it very disappointing that they exist in every area I am interested in, including relatively niche ones like functional programming and mechanical keyboards. Although, I have yet to come across a sensationalist video about Prolog ;)
I have the exact same experience. I want to spend time on the platform and I can find videos that I like and would have loved for the algorithm to find for me, but I really have to search very thoroughly for them myself.
Yes, it works great for me. I don't really use the negative feedback. Most recommendations, I guess, are new uploads from creators I like or videos I haven't watched from creators I like. It shows me videos from creators unknown to me that match my interests as they go viral. It shows some random old videos that just randomly go viral again. It's a good experience.
That's interesting, and perhaps I should not be that surprised. If it did not work at all, they would probably have changed it. Their KPI is probably ad revenue, but still, if it worked for no one, I would expect their revenue to decline as people spend their time elsewhere. But then again, where will you go and watch medium to long videos created by "normal" people? Vimeo? Perhaps people's interactions with YouTube happen despite the algorithm and not because of it.
Not directly; most people believe they optimize for session time. It tries to serve you a result that will keep you on the platform watching videos, as opposed to needing to keep scrolling or leaving the platform. They truly do want to serve the best videos they can, the ones they think you will be interested in watching. Thankfully for YouTube, you watching videos and YouTube getting ad revenue is correlated.
>where will you go and watch medium to long videos created by "normal" people?
No one forces you to watch that format of video. You can go on TikTok, Twitch, Netflix, etc and be entertained.
Technically, yes, but when they're talking about this sort of thing, they mean "personal recommendation system" or "content-based recommendation system."
For example, the HN front page is a recommendation system if you literally mean system-that-recommends-web-pages-to-look-at. But it's not personalized; every visitor sees the same front page. This fundamentally makes it a different sort of thing.
If you have 10,000 posts that you have to sort in some way and the user is only going to see 20 of them, then the sorting is the recommendation system. People are just used to thinking of products, movies, and songs, but on those platforms the users are the products.
This hardly seems like a reasonable way to characterize Netflix, which has a personal recommendation system, especially compared to HN, which is ad supported yet gives the same recommendations to everyone.
My reading of their comments is that they are trying to say that social media and news media can be characterised as having recommendation systems too, not just song and movie platforms (I don't know who exactly they're arguing against – I've never heard anyone say that recommendation systems can only be for songs and movies).
I don't think they're really paying much attention to the dimension you're splitting it along, i.e. whether the recommendations are personalised for each user. The huge important idea they have in their head is that recommendations can apply to user-generated social media content too.
Yes, but this is an extremely important dimension to split on. The practical implications are so big that the game changes completely.
* HN and classic Reddit sort their items on a single dimension ("hotness"), calculated using a few input variables and producing a single output variable. This is about as cheap to calculate as recommendation systems get. The XKCD comment recommender is a bit more complex, but still in the same complexity class. Since the whole point of an algorithm like this is to be timely, the naive approach is to compute it on-the-fly, which it's perfectly simple enough to manage.
* At a somewhat more complex level, you get stuff like a basic, uncustomized Similar Items list. If YouTube has no data on you, this is what you get from their sidebar recommender (and their front page would be analogous to Reddit and HN, but sharded by region and language). It's also pretty close to what AdWords used to be, before they started doing user profiling. The thing with this method is, even though it involves some level of AI, it's presenting the same thing to everyone and it's expensive, so the natural solution is to precompute it.
* Personalized recommenders are the worst of both worlds. You can't naively compute it on-the-fly, because it's too slow, but you also can't naively precompute it, because there's a combinatorial explosion of users and items. You actually have to be clever about it.
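For the first bullet, the whole ranking fits in a few lines. This sketch uses the commonly cited approximation of HN's formula; the actual constants and penalties aren't public:

```python
def hotness(points: int, age_hours: float, gravity: float = 1.8) -> float:
    # Score rises with votes and decays polynomially with age
    return (points - 1) / (age_hours + 2) ** gravity

# (title, points, hours since submission) - cheap enough to sort on the fly
posts = [("a", 100, 1.0), ("b", 500, 24.0), ("c", 10, 0.5)]
ranked = sorted(posts, key=lambda p: hotness(p[1], p[2]), reverse=True)
# → order is a, c, b: recency beats raw point totals
```

One input (points), one decay term, one output per item, identical for every visitor, which is exactly why it can run on request without any precomputation.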
All I know is that Facebook's recommendation systems always show me things that I hate to see. I suppose they may "work" at scale, but at an individual level it's epic failure.
FB needs an Ad-Rev-Share-Model with ALL of its users...
Imagine if FB were to pay you a fraction of a percent based on how your data was used...
It may be a small amount, but in super-poor 4th-world countries, it could effect change in their lives...
Now imagine that this becomes big... and it works well.
Now imagine that the populous is aware of the hand of god above them just pressing keys to affect land masses (yes I am referring to the game from the 80s)
but this galvanizes them into union building...
So when the people realize their metrics are the product to feed consumerism for capitalistic profits, and decide to organize, what happens?
Is FB going to need a military force to protect their DCs?
---
With "Zuck Bucks" (I still am not sure if true)
This makes this ultimate "company store"
Tokens?
So how get?
How EARN? (What service on FB GENERATES '$ZB'?)
How spend?
WHAT GET? (NFTs?, Goods? Services?)?
The entire fucking model of EVERYTHING FB DOES is to MAP SENTIMENT!
Sentiment is the tie btwn INTENT and SENTIMENTAL VALUE
The idea is to map interest with emotional drivers which make someone buy (spend resources their time and effort went into building up a store-of)...
---
So map out your emotional response over N topics and forums... Eval your documented online comments, NLP the fuck out of that, see what your demos are, and build this profile of you...
THEN THEN THEN THEN
Offer an "earnable" (i.e. Grindable by farms and bots alike) -- "Zuck Buck" which is a TOKEN (etymology that fucking word for yourself)
of value...
Meaning, zero INTRINSIC value, zero accountability (managed by a central Zuck Bank) <-- yeah, fuck that
And the value is both determined AND made available to you via neither INTRINSIC CONTROL nor VALUE.
>With "Zuck Bucks" (I still am not sure if true)
I expected more from this place than to believe every click bait FB news.
Of all the UX people and tons of money they throw into research... yes, the best option was... "Zuck Bucks".
Don't get played ffs
Sorry, I misread what you posted. I was initially under the wrong impression that you were misquoting me... I didn't realize your comment just wasn't delineated from my quote...
> since the user is waiting for the “page” to load, most recommendation requests have a budget of only 500ms or so and it is only possible to score a few hundred items in a request.
This doesn't make much sense to me, since a recommendation is rarely needed instantly. Why not spend, say, 10 seconds constructing a better recommendation while the user is doing something else, during which the recommendation can simply be blank? Obviously if the user requests a recommendation on first visit, you're out of luck, but I'm thinking the typical use case is for a recommendation after the primary reason for visiting has been completed.
Much of the field seems to be fixated on throwing massive compute resources at models with results that can neither be evaluated nor reproduced.
"the Recommender Systems research community is facing a crisis where a significant number of papers present results that contribute little to collective knowledge […] often because the research lacks the […] evaluation to be properly judged and, hence, to provide meaningful contributions"
That was my thinking - anything of value is product-specific and behind closed doors. It's not my field, but something I see come up from time to time that seems weirdly over-represented in ML articles.
I work on these systems, and if anything my only complaint about the field is the propensity to solve every optimisation problem with ML. I have seen people solve textbook-grade linear, and even differentiable, optimisation problems.
And the reason it happens despite the 'invisible hand' etc is because it still works, it just happens to be horrendously inefficient. I think that's the main area of inefficiency in the industry: not in getting the job done, nor even arguably in accuracy - at least not severely - but in overcomplicating the solution[0] because we've formed a cargo cult around one particular method of optimisation, beyond all nuance.
[0] I mean 'overcomplicating' in absolute terms. Of course the very crux of my point is that, from the data scientist's perspective, it's not overcomplicated - it's less complicated than using e.g. ILP precisely because we have made libraries like TensorFlow so incredibly easy and tempting to use.
Academia thrives on open benchmarks that researchers compete against each other to get the highest score on. How would you replicate that with a recommendation system, when in industry you test recommendations via A/B/... variants and observe the winner? Your benchmark is no longer static, and you rely on users' feedback on the recommendations each algorithm made.
> a machine learning model is trained that takes in all these dozens of features and spits out a score (details on how such a model is trained to be covered in the next post).
This part was the one I was interested in, as most of the rest is obvious.
Model deployment is an important but still tiny part of overall ranking/recommendation systems. The bulk of the complexity stems from two key properties of recommendation systems (which differ from, say, computer vision models):
1. The system operates on user feedback. As a result, it needs to manage flow of lots of data, with at least some subset being managed in realtime.
2. For any single request, there are thousands of things to recommend from. As a result, a single request is not one ML model evaluation but thousands of them - one (or often more; see value modeling in the post) per candidate.
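That per-candidate scoring is usually batched: build one feature matrix over all candidates and run the model in a single vectorized pass rather than thousands of separate calls. A sketch, with a toy linear model standing in for the real ranker and made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)
n_candidates, n_features = 5000, 32

X = rng.normal(size=(n_candidates, n_features))  # one feature row per candidate
w = rng.normal(size=n_features)                  # toy linear "ranker"

scores = X @ w                                   # one pass scores all candidates
top_k = np.argpartition(-scores, 50)[:50]        # top 50 without a full sort
```

The same shape holds for a neural ranker: the per-request cost is dominated by assembling `X` from feature stores, not by the matrix multiply itself.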
How FAANG actually builds their recommendation systems:
Millions of cores of compute, exabyte scale custom data stores. Good recommendations are expensive. If you try to build a similar system on AWS, you will spend a fortune.
Most recommender models just use co-occurrence as a seed; this can actually work pretty well on its own. If you want to get fancy, then build up a vectorized form of the document with something like an autoencoder, then use approximate nearest neighbors to find documents close by. 95% of the compute and storage is just spent on calculating co-occurrence, though.
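The co-occurrence seed is simple enough to sketch: count how often items appear together in a session, then for a given item return its most frequent partners. Toy sessions, raw counts only (a real system would normalize, e.g. with lift or PMI, and shard the counting):

```python
from collections import Counter
from itertools import combinations

# Sessions of item ids; items viewed together are treated as related
sessions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c", "d"],
    ["a", "d"],
]

cooc = Counter()
for s in sessions:
    for x, y in combinations(sorted(set(s)), 2):
        cooc[(x, y)] += 1          # unordered pair counts

def related(item, n=3):
    # Gather this item's co-occurrence counts and return the top partners
    scores = Counter()
    for (x, y), c in cooc.items():
        if x == item:
            scores[y] += c
        elif y == item:
            scores[x] += c
    return [i for i, _ in scores.most_common(n)]
```

At scale this is a counting job (MapReduce or a streaming aggregator), which is where the "95% of the compute" goes.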
> Millions of cores of compute, exabyte scale custom data stores. Good recommendations are expensive. If you try to build a similar system on AWS, you will spend a fortune.
And then it will be gamed, and become as useless as every other recommendation system already going.
Also, 'millions of cores' is a ludicrously shitty, zero-clue answer. It's like asking how Eminem makes music, and saying 'millions of pills'. Like, yes, that's an input, but you're missing the entire method of creation, of converting the crude inputs into the outputs.
For my money - and, for what little it's worth, I work in this field – I think most of the impressive feats of data science attributed to 'machine learning' are really just a function of now having hardware capacity so insanely great that we're able to 'make the map the size of the territory', so to speak. These models are essentially overfitting machines, but that's OK when (a) it's an interpolation problem and (b) your model can just memorise the entire input space (and deal with any inaccuracies by regularisation, oversampling, tweaking parameters till you get the right answers on the validation set, then talking about how 'double descent' is a miracle of mathematics, etc).
Don't get me wrong, neural nets are obviously not rubbish. They are a very good method for non-convex, non-differentiable optimisation problems, especially interpolation. (And I'm grateful for the hype cycle that's let me buy up cheap TPUs from Google and hack on their instruction set to code up linear algebraic ops, but for way more efficient optimisation methods, and also in Rust, lol.) It's just a far more nuanced story than "this method we discovered and hyped up for a decade in the 80s suddenly became the key to AGI".
These recommendation systems take control away from individuals over what content they see and replace that choice with black box algorithms that don't explain why you are seeing the content that you are or what other content was excluded. All of the companies who have deployed these content selection algorithms could have also given you manual choice over the content that you see, but chose instead to let the algorithm solely determine the content of your feed, either removing the manual option entirely or burying it so thoroughly that no one bothers to use it.
These algorithms are not benign. They make choices about what information you consume, whose opinions you read, what movies you watch, what products you are exposed to, even which politicians messages you hear.
When people complain about the takeover of algorithms, they don't mean databases or web interfaces. They mean this: content selection or preference algorithms.
We should be deeply suspicious. We should demand greater accountability. We should require that the algorithms explain themselves and offer alternatives. We should build better. Give control back to the users in meaningful ways.
If software engineering is indeed a profession, our professional responsibilities include tempering the damaging effects of content selection algorithms.
Do you know how a newspaper used to choose what articles it wanted to run?
Do you know how a TV channel decides to schedule stories?
Humans, it's all humans, looking at the metrics and steering stuff that feeds those metrics.
Content filters are dumb and easy to understand. Seriously, open up a fresh account on FB, Instagram, Twitter, or TikTok.
First it'll try and get a list of people you already know. Don't give it that.
Then it'll give you a bunch of super-popular but clickbaity influencers to follow. Why? Because they are the things that drive attention.
If you follow those defaults, you'll get a view of what's shallow and popular: spam, tits, dicks and money.
If you find a subject leader, for example an independent tool maker, cook, pattern maker, or builder, then most of your feed will be full of those subjects, save for about 10% random shit that's there to expand your subject range (mostly tits, dicks, spam or money).
What you'll see is stuff related to what you like and stare at.
And that's the problem: they are dumb mirrors. That's why you don't let kids play with them. That's why you don't let people with eating disorders go on them. That's why mental health care needs to be more accessible, because sometimes holding up a mirror to your dark desires is corrosive.
Could filter designers do more? Fuck yeah, but we also have to be aware that filters are a great whipping boy for other, more powerful things.
Want to dive in to all this stuff but can't find a starting point?
Start with reading my patent!
I was smart enough to see what collaborative filtering (CF) could be early on, and to file a patent that issued. I wasn't smart enough to make it a complicated patent, or to choose the right partners so I could have success with it.
But the patent makes a good way to learn how to get from "what are your desert island 5 favorite music recordings?" over to "here is a list of other music you might like". Basic CF, which is at the core of a lot of this stuff. Enjoy!:
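Not the patented method itself, just a generic illustration of the basic CF idea described: score unseen items by how many similar users list them, weighting each neighbor by how much their favorites overlap with yours. The users and albums here are made up:

```python
from collections import Counter

# Each user lists their "desert island" favorite recordings
favorites = {
    "alice": {"Kind of Blue", "Blue Train", "A Love Supreme"},
    "bob":   {"Kind of Blue", "A Love Supreme", "Mingus Ah Um"},
    "carol": {"Abbey Road", "Revolver", "Pet Sounds"},
}

def recommend(user, n=3):
    """Score items by the favorites-overlap of the users who list them."""
    mine = favorites[user]
    scores = Counter()
    for other, theirs in favorites.items():
        if other == user:
            continue
        overlap = len(mine & theirs)       # crude user-user similarity
        if overlap:
            for item in theirs - mine:     # only recommend unseen items
                scores[item] += overlap
    return [item for item, _ in scores.most_common(n)]
```

Real systems swap the overlap count for cosine or Jaccard similarity and precompute the neighbor lists, but the "people who like what you like, like this" core is the same.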
What? They have absolutely tremendous data, the envy of any data scientist on the planet. I don't understand how you could possibly describe their user data as garbage in any conceivable way. Even search result click-and-query data alone - leaving out Android, Chrome, Cloud, and everything else - is a stupendously invaluable, priceless asset.
If you call that garbage, what on earth - or, for that matter, off it - is not garbage!?
Gentle reminder to anyone reading this that your problems are probably not FAANG problems. If you architect your system trying to solve problems you don't have, you are gonna have a bad time.
"And note that you don’t even have to be at FAANG scale to run into this problem - even if you have a small inventory (say few thousand items) and a few dozen features, you’d still run into this problem. "
Wow, this is something that has been a floater-in-mind for decades.
I'll top it off with a story from an interview at Twitter with the eng mgr, ~2009-ish:
--
Him: "So tell me, how would you do things differently here at Twitter based on your experience?"
Me: "Well, I have no idea what your internal processes, architecture, or problems are, so my previous experience wouldn't be relevant. I'd go for whichever option best suits your goals."
[This was my literal response to a question I thought was a trap, but I answered honestly. As a previous manager of teams, the answer I dreaded hearing was "well, we did it at my last company as such" - don't reply this way.]
Here was his response, a literal quote from a hiring manager for DevOps/Engineering at Twitter:
"Thank god! We have hired so many people from FB, where that was their only job out of school and they had no other experience, and the biggest thing they told me was 'well, the way we did this at FB was... X'"
--
His biggest concern was engineering-culture creep...
> Wow, this is something that has been a floater-in-mind for decades
Have you literally never come across the "you're not Google!!!" trope before now, during the whole ~decade leading up to this very day? Gosh I envy you.
(Also, I am reaaally struggling to understand that story. Who is speaking? It sounds like a story within a story within a story. I can just about piece together the gist, but I'm very confused by all the formatting and nested quotes.)
Wow. That amazes me that anyone would answer that question without knowing anything about the problem space and implemented solutions.
Wait, I got it, I would rewrite everything as AWS Lambdas. That's the right answer! Screw your (almost certainly SQL) DB, let's move it all to DynamoDB too.
I was stating that the eng mgr was relieved to NOT hear an answer of "the way we did it at company X, and it was successful for them, so I assume that the same approach maps to your company"
I'll never understand why people think this is a valid criticism of an article, rather than pointing out an issue they have with the actual content of the article. There's nothing inherently wrong with a company sharing info about the space they operate in. In fact, it should be encouraged as long as what they share is useful.
It's a short-hand for the treatment of the subject being pretty shallow and non-descript, which seems to apply to this article exactly. I read this and didn't learn anything.
Right, but then it starts a meta-conversation about why the article got posted, or even written. It doesn't have the down-the-rabbit hole trait of an individual project of passion, or the sort of authoritative voice of a conference talk or even a Netflix blog post, it doesn't really speak to specific actionable technologies so it's not the kind of onboarding a Toward Data Science post would be. And that meta conversation inevitably leads to, oh, it's a marketing funnel. So just saying "this is content marketing" I think is a shibboleth for the entire conversation that starts with "pretty shallow and non-descript".
Of course I didn't write the original comment and there's something to say for flag-and-move-on or whatever, and other people did enjoy it. I'm just saying I understand the impulse to short-circuit the entire tedious conversation!
It provides more information. The argument is that it's shallow and non-descript because it's an ad. I don't know if I believe that here. It's a blurry line with sponsored content.
Off-topic, but how did Netflix manage to get itself inserted into the FAANG acronym anyway? Their impact on the tech industry is trivial compared to all the others. Sure, if you just take out the N it's offensive, but we could have said "GAFA", or "FAAMG" would be more accurate, swapping Microsoft in their place.
FAANG was created by the TV personality Jim Cramer to talk about high growth tech stocks. At the time Netflix was doubling every year. It was based purely on finance.
It's now been taken over by the tech industry to be shorthand for places that are highly selective in their hiring and tend to work on cutting edge tech at scale.
That being said, the impact of Netflix on tech is pretty big. They pioneered using the cloud to run at massive scale.
It was a lot more than that. They developed systems and techniques that even Amazon adopted and are still adopting to this day. They also created a ton of open source tools for other people to use the cloud:
> FAANG was created by the TV personality Jim Cramer to talk about high growth tech stocks. At the time Netflix was doubling every year. It was based purely on finance.
That, and FAAG had less of a ring to it.
Edit: Dammit, the GP made the same observation. Oh well, I'm keeping it.
The phrase originated with Jim Cramer; it refers to the five best-performing tech stocks (or what were the best-performing at the time). Nothing to do with their impact on the field from a technical perspective, just a business perspective.
There was a point in time when FAANG offered the best compensation packages for engineers (Netflix was one of them), so that's where the term originated. While it's outdated in many respects (Microsoft is not included, Facebook is now Meta, Google is now Alphabet, etc.), it's still sticky for some reason.
Eh, the new parent company names aren't really what people know them as still. I don't think most people are even aware that Google has a parent company.
I have a friend who works at Google, and that's what we say. I don't think he or anyone would ever say he works at Alphabet.
I'm going out of my way to not say Meta for that exact reason. They haven't proved they deserve to shed their horrible reputation yet. They'll always be the Facebook we know and hate until they're less dystopian.
I think the acronym gained prominence before Microsoft's recent 'commitment' to open source. Netflix also seemed to be doing really interesting things scaling out 'disruption' to video delivery at the time. It stuck
FAANG was never about impact on tech industry. Otherwise, MSFT would be part of FAANG. Instead, it's directly related to (1) stock price and (2) compensation.
> "FAAMG" would be more accurate to include Microsoft in their place
In Europe you nearly always see "GNAFAM", which includes Microsoft too. It's certainly weird to exclude MSFT, worth at times more than Amazon+Meta+Netflix combined.
No question they've done some things that have had some impact on others in the industry. But none of them are particularly important. It's all relative. Companies like Twitter, Uber, AirBnb have all released open source projects or figured things out how to solve hard problems in ways that others have emulated.
But for every other one of the FAA(N)G companies, I can barely work a day as a developer without touching every one of their technologies. Yeah, Netflix got into ML years before most, but the netflix prize exists as a distant cautionary memory, and as an ML professional, I'd literally never heard of metaflow before. Just sayin'.
Nowhere was the argument made that somehow Netflix was more influential than Twitter/Uber/AirBnB, but your counter-argument that somehow it's less influential because you haven't heard of/used some projects directly holds no ground.
> your counter-argument that somehow it's less influential because you haven't heard of/used some projects directly holds no ground
Oh come on, they are indisputably right that Microsoft, Twitter, Uber, Airbnb, hell, even Cloudflare are more technically influential than Netflix is.
Apple and Google would make anyone's top 5, that's his point. No argument about it. Their products collectively dominate anyone's life, along with MSFT. Netflix is maybe in your top 10, top 20 for sure, but it's not up there as one of the few 'platform that everyone's lives are built on' techcos.
(Like, Netflix vs Microsoft? Seriously? For that matter, Amazon probably wouldn't be in my top 5 either, and not only because it's not mainly a tech company. I s'pose it depends how you define 'Amazon', and if you include AWS. But for Netflix there's just no argument that they win a spot there.)
What's your argument for Twitter/Uber/AirBnB being indisputably more technologically influential than Netflix? And let's please talk facts rather than opinions.
Tangent, but I was recently thinking about how FAANG is now MAANG, and the definition of mange (from a Google search, lol):
mange /mānj/ (noun): a skin disease of mammals caused by parasitic mites and occasionally communicable to humans. It typically causes severe itching, hair loss, and the formation of scabs and lesions. "Foxes that get mange die in three or four months."
I find it oddly poetic, but this is my last day of magic.