1. At this scale, we’re not just talking about buying GPUs. It requires semiconductor fabs, assembly factories, power plants, batteries/lithium, cooling, water, hazardous waste disposal. These data centers are going to have to be massively geo-engineered arcologies.
2. What are they doing? AGI/ASI is a neat trick, but then what? I’m not asking because I don’t think there is an answer; I’m asking because I want the REAL answer. Larry Ellison was talking about RNA cancer vaccines. Well, I was the one who made the neural network model for the company with the US patent on this technique, and that pitch makes little sense. As the problem is understood today, the computational problems are 99% solved with laptop-class hardware. There are some remaining problems that are not solved by neural networks, but by molecular dynamics, which is done in FP64. Even if FP8 neural structure approximation speeds it up 100x, FP64 will still be 99% of the computation. So what we today call “AI infrastructure” is not appropriate for the task they talk about. What is it appropriate for? Well, I know that Sam is a bit uncreative, so I assume he’s just going to keep following the “HER” timeline and make a massive playground for LLMs to talk to each other and leave humanity behind. I don’t think that is necessarily unworthy of our Apollo-scale commitment, but there are serious questions about the honesty of the project, and what we should demand for transparency. We’re obviously headed toward a symbiotic merger where LLMs and GenAI are completely in control of our understanding of the world. There is a difference between watching a high-production movie for two hours and then going back to reality, versus a never-ending stream of false sensory information engineered individually to specifically control your behavior. The only question is whether we will be able to see behind the curtain of the great Oz. That’s what I mean by transparency. Not financial or organizational, but actual code, data, model, and prompt transparency. Is this a fundamental right worth fighting for?
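The FP8/FP64 point is basically Amdahl’s law. A rough sketch, assuming an illustrative 99/1 runtime split between the FP64 molecular dynamics and the neural-network part:

```python
# Amdahl's-law sketch: if FP64 molecular dynamics already dominates the
# workload, accelerating the neural-network portion barely changes the total.
# The 0.99 / 0.01 split and the 100x factor are illustrative, not measured.
md_fraction = 0.99   # assumed share of runtime in FP64 molecular dynamics
nn_fraction = 0.01   # assumed share of runtime in the neural-network part
nn_speedup = 100     # hypothetical FP8 approximation speedup

new_total = md_fraction + nn_fraction / nn_speedup
print(f"overall speedup: {1 / new_total:.3f}x")                 # ~1.010x
print(f"FP64 share afterwards: {md_fraction / new_total:.4f}")  # ~0.9999
```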
I have an interesting anecdote about that. I was consulting for a very large tech company on their advertising product. They essentially wanted an upsell product to sell to advertisers, like a premium offering to increase their reach. My first step is always to establish a baseline by backtesting their algorithm against simple zeroth- and first-order estimators. Measuring this is a little bit complicated, but it seemed their targeting was worse than naive Bayes by a large factor, especially with respect to customer conversion. I was a pretty good data scientist, but this company paid their DS people an awful lot of money, so I couldn’t have been the first to actually discover this. The short story is that they didn’t want a better algorithm. They wanted an upsell feature. I started getting a lot of work in advertising, and it took me a number of clients to see the general trend: the advertising business is not interested in delivering ads to the people who want the product. Their real interest is in creating a stratification of product offerings that are all roughly as valuable to the advertiser as the price paid for them. They have to find ways to split up the tranches of conversion probability and sell them all separately, without revealing that this is only possible by selling ad placements that are intentionally not as good as they could be. Note that this is not insider knowledge of actual policy, just common observations from analyzing data at different places.
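For concreteness, here is the shape of that kind of baseline backtest: score a held-out conversion set with trivial predictors and see where the production model lands relative to them. Everything below (data, features, names) is synthetic and purely illustrative:

```python
# Baseline backtest sketch: compare held-out conversion predictions from a
# zeroth-order (base-rate) predictor and naive Bayes. Synthetic data only.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(10_000, 20))                             # binary user/placement features
y = ((X[:, 0] & X[:, 1]) | (rng.random(10_000) < 0.05)).astype(int)   # fake conversions

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, model in {"base rate": DummyClassifier(strategy="prior"),
                    "naive Bayes": BernoulliNB()}.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name:>12}: AUC = {auc:.3f}")

# The production targeting model's scores would be evaluated on the same
# held-out set; landing below naive Bayes is the red flag described above.
```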
One thing you know about ad guys—they are really good at tricking people into spending money. I mean, it’s right there in their job description. For some reason their customers don’t seem to think they’ll fall for it themselves, I guess.
In terms of the people with products to advertise being screwed over by the ad industry, I think it is more that they don't see the similarity between the ad industry brainwashing us and the ad industry brainwashing them. Perhaps the disconnect happens because they want to interact with the ad industry, to get their stuff hawked to us, whereas we'd usually rather not.
Another interesting disconnect is that sometimes a person is both the “us” and the “them” in different contexts. I knew someone who would complain about some of these tricks on other sites, but when it was pointed out that his site used some of the same tricks, he'd respond with “yeah, but I need that because …”.
Meh. I have no idea if I am smart or not -- the last several years proved to me I am definitely stupider than I thought -- but I know that with time I started buying only things I directly derive value from or, in the worst-case scenario, will undoubtedly need during the next few months. No cutesy phone cases, no gadgets "because why not", no extra socks "because you never know", no new toaster because the current one is just a tad too big, etc. Almost no unnecessary purchases.
It's much more related to maturing on this or that axis than being smart IMO.
Buddy, “I only buy things that I am convinced I need when I’m convinced I need to buy them” isn’t the demonstration of immunity to advertising that you seem to be trying to make.
Let me guess, when you are convinced that it is time to buy something that you need, you tend to buy from brands that you trust are of high quality or value — in a way that’s totally disconnected from and unrelated to any advertising or marketing efforts though?
You seem to have missed the criteria that I gave: namely that I buy stuff that I'll derive direct value from now or VerySoon™.
As for brands, obviously that's a thing but it's not always due to propaganda / ads -- often it's also about trying a few and sticking to what you have observed works best (or is longer lasting, or both).
Effectively the advertisers could buy less ad space and get the same or better conversion? That is somewhat hilarious, because it means that not only are the end-users "the product", the advertisers are as well. There's only cows for the milking, on either side... and shareholders.
Yes. It works really well. You can do a WHOLE LOTTA ARB(tm)(circle R), buying the crap placements at super low CPMs and selling the performance difference to clients. This is mitigated by those clients who ONLY WANT THE BEST (but of course, sir, right this way) - but there are ways around that, too - like the MFA (made for advertising) domains of all the big-name sites you can think of that solely exist for your RTB machine to pump ads stacked on top of each other, and only visible to bots and crawlers. It doesn't help that on one side, you have folks astute with math (Data Scientists et al.) and on the other, a metric shit ton of Media Planners/Buyers who are just handed a budget and are often pretty naive about the intricacies of how it all works. But it all sort of goes back to the original point - people put on blinders. They just wanna see the metric get hit, the numbers go up. Most of the time they don't care how any of that works as long as they look good to their boss, and the industry mostly obliges.
> They have to find ways to split up the tranches of conversion probability and sell them all separately, without revealing that this is only possible by selling ad placements that are intentionally not as good as they could be.
I worked in the adtech space for almost 10 years and can confirm this is where we landed, too.
>The short story is that they didn’t want a better algorithm. They wanted an upsell feature.
This is why I got out. No one cares about getting the right ad to the right person. There's layers upon layers of hand-waving, fraud, and grift. Adtech is a true embodiment of "The Emperor's New Clothes."
Is there a solution? Obviously those companies are not going to change, so what can everyone else do about it - besides already being very rich, starting a competing ad-tech without funding, managing to get market share, and managing to remain one of the good guys.
The only thing I can think of is to use things like influencer ads on places like Instagram or YouTube, which ironically sound like much better value, as you actually know what you're getting for the money.
One thing I’ve thought about is how AI assistants are actually turning code into literature, and literature into code.
In old-fashioned programming, you can roughly observe a correlation between programmer skill and linear composition of their programs, as in, writing it all out at once from top to bottom without breaks. There was then this pre-modern era where that practice was criticized in favor of things like TDD and doc-first and interfaces, but it still probably holds on the subtasks of those methods. Now there are LLM agents that basically operate the same way. A stronger model will write all at once, while a weaker model will have to be guided through many stages of refinement. Also, it turns the programmer into a literary agent, giving prose descriptions piece by piece to match the capabilities of the model, but still in linear fashion.
And I can’t help but think that this points to an inadequacy of the language. There should be a programming language that enables arbitrary complexity through deterministic linear code, which humans seem to have an innate comfort with. One question I have about this is why postfix notation is so unpopular versus infix or prefix, when complex expressions in postfix read more like literature, with details building up to greater concepts. Is it just because of school? Could postfix fix the STEM/humanities gap?
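To make the notation question concrete, here is the same expression in the three notations, plus a tiny stack evaluator showing how postfix builds details up into larger pieces. Just an illustration, not an argument that it closes any gap:

```python
# infix:   (3 + 4) * 2
# prefix:  * + 3 4 2
# postfix: 3 4 + 2 *
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens):
    """Evaluate a postfix (RPN) expression with a simple stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            b, a = stack.pop(), stack.pop()   # operands already built up...
            stack.append(OPS[tok](a, b))      # ...combine into a larger value
        else:
            stack.append(float(tok))
    return stack.pop()

print(eval_postfix("3 4 + 2 *".split()))  # 14.0
```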
I see LLMs as translators, which is not new because that’s what they were built for, but in this case between two very different structures of language, which is why they must grow in parameters with the size of the task rather than process linearly along a task with limited memory, as in the original spoken language to spoken language task. If mathematics and programming were more like spoken language, it seems the task would be massively simpler. So maybe the problem for us too is the language and not the intelligence.
> If mathematics and programming were more like spoken language, it seems the task would be massively simpler
Mathematics and programming derive from spoken languages. The thing is that spoken languages lack precision: what you say is not what I understand. So we chose a subset, assigned the items precise meanings and rules for how to use them in sentences, then converted the whole thing into a terser form for ease of notation. Starting with the terse notation is the wrong way to go about it, just as one does not start learning music by reading sheet music. Learning programming is easier when you learn what the components are rather than how to write the syntax.
> So maybe the problem for us too is the language and not the intelligence.
The issue is understanding. Behind every piece of bad code, there's someone who either lacked understanding or made hasty decisions (time pressure, tiredness, ...).
> The thing is that spoken languages lack precision
This is true, but extreme precision is most useful when communication only goes one way, i.e. the programmer tells the computer how to operate and the computer does not respond (other than to execute the code). But if there's a dialog, then both parties can ask for clarification and basically perform a binary search to quickly home in on the exact meaning.
The computer does respond. With the result of the query or the operation that I've tasked it to do. If I say `<command>`, the only replies that I expect are: confirmation if the action is dangerous; error messages if the action did not succeed; and success message if it did. I don't want to do philosophy with a tool. I only want to get a task done or create something. All the meanings for the commands are already listed in the manuals for the software that are running on the computer. And you can get training if you want targeted guidance.
> writing it all out at once from top to bottom without breaks
This only happens for toy examples; all real development is iterative, across years and teams. There are a few people who can do Feynman "think very hard and write down the answer", but it's a unique skill of limited availability.
> One question I have about this is why postfix notation is so unpopular versus infix or prefix, where complex expressions in postfix read more like literature where details build up to greater concepts. Is it just because of school?
Developers (you, and the rest of the audience) really need to be able to distinguish between a personal aesthetic preference and some sort of timeless truth.
> Could postfix fix the stem/humanities gap?
I feel fairly confident when anyone posts One Weird Trick nonsense like this that the answer is "no". Especially as postfix is in no way new, it's decades old.
Heck, this is also Anglocentric: there are plenty of human languages which are "postfix", in that they're subject-object-verb rather than subject-verb-object. Such as German and Japanese. Doesn't seem to convey an automatic advantage in either science or literature against the juggernaut of ubiquitous English.
(Mathematical notation tends to use infix operators for some things and prefix operators for others; postfix seems to be rarer? Mostly for units?)
> There should be a programming language that enables arbitrary complexity through deterministic linear code, as humans seem to have an innate comfort with.
I agree that linear code is easier to read and understand. I've noticed that often when my own code gets confusing, it is because it's too nested, or too many things are happening at once, or the order of actions is not clear. After gaining a deeper understanding of the problem, rewriting it in a more linear fashion usually helps, but that's not always possible.
I'm curious how a programming language could enable writing complex code in a linear fashion if the complexity of the code is due to the interconnected nature of all its parts. In other words, there may be no way to connect all the parts in a linear way without oversimplifying.
Of course, sometimes the complexity is incidental, that is, if I were a little smarter or spent more effort, I could reduce the complexity. But some complexity is intrinsic to the problem being solved.
The question that really fascinates me is not why code is non-linear, but why literature isn't?
Exactly what I was thinking of when reading the article. Maybe codeMic comes a year or so too late.
Soon the AI will read this big blob of code and I can ask away, and the AI can explain while jumping to places in the code, which I can hover over for inspection. Then I ask it to refactor it or add this or that functionality, add tests for it, and show the results of the tests.
This is an awesome development. I don’t want to take anything away from the credit due to the product. But I really dislike these bloviated corporate press releases. It reads like a full article generated from a 1-sentence LLM prompt. Perhaps the Internet UX from here will be a competition between AI-based content generation, and AI-based summarization, essentially DECCO instead of CODEC. Kind of like how spam grew to consume 99% of email, so everybody has to run spam filters to get what they want. Technology and the abuse of it move together.
This is the bigger reality. It’s turned almost all business and academic writing into long-winded meaningless trash. Well, more than it already was, I guess. It seems that the way people use it is to expand a few bits of information into many bits of content to convince others that work was done. It’s like the Turing test for laziness. The other issue is that it tends toward agreement on anything it wasn’t trained to specifically disagree about. I can see a smarter and more disagreeable bot doing much worse on LMSys than the sycophant models. Nothing new there, I guess. But it’s spilling over to human norms as well, in that previously normal human deviation from chat-model-style interaction is now anomalous, so everybody has to use the AI, and therefore nobody is providing any more value than the LLM, so everybody is getting laid off, except the disagreeable guy, and he gets fired first. It’s hacking our positive-reinforcement vulnerabilities, ones that get worse the more they’re exploited, but with none of the human resource constraints that previously kept them in check.
Andy Grove flew in Clayton Christensen to let him talk for about 15 seconds before deciding that Intel would disrupt itself by taking huge losses on Celeron. But Celeron did not save Intel; ASCI Red and multicore saved Intel. If he had actually read Clayton’s book, he would have understood that. Otellini got the disruption theory correct and stayed out of mobile. But was that right? Maybe not in the current monetary environment, where investment flows dwarf operating flows. A big mobile market could attract more investment than the losses it would generate. So disruption theory now works in reverse, and I’m not sure how far that implication goes.
I think you need higher algorithmic intensity. Gradient descent is best for monolithic GPUs. There could be other possibilities for layer-distributed training.
“Bitcoin security” is a different notion from that of almost all other popular chains. A prolonged 51% attack on Bitcoin implies the ability to double-spend, but no ability at all to affect prior balances. A 51% attack on most smart contract chains implies the ability to change any and all state arbitrarily.
The simplest solution is to wait until the cost of hashing exceeds the value of your transaction by some reasonable factor. I expect that better solutions will come along by soft fork without adverse effect on supply or decentralization.
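As a back-of-the-envelope version of “wait until the cost of hashing exceeds the value of your transaction”: pick a safety factor and divide by an estimate of what it costs to mine one block. The numbers below are placeholders, not current network figures:

```python
import math

def confirmations_needed(tx_value_usd, cost_per_block_usd, safety_factor=3):
    """Blocks to wait so that redoing the proof-of-work costs more than the payment."""
    return math.ceil(safety_factor * tx_value_usd / cost_per_block_usd)

# e.g. a $1M payment, assuming (hypothetically) ~$300k of hashing cost per block
print(confirmations_needed(1_000_000, 300_000))  # 10 confirmations
```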
How is it different exactly? If I had 51% of the hashing power on the bitcoin network, couldn't I change block history and have a majority of the network agree on that new chain?
No. If you had 51%, you could revert one block of history for every 49 blocks of attack time. In addition, you have no ability to create transactions that were not already signed by the owners, nor create bitcoins more than the block reward. This is because of the UTXO model rather than the state machine model. In Bitcoin, every transaction is verified against history, while the EVM chains only verify transactions against state. So if you control EVM state, you can bootstrap every new node to any state you wish, but UTXO verification requires rewriting the entire history.
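A toy illustration of the difference (heavily simplified: no signatures, scripts, or fees). In the UTXO model a spend must reference outputs that exist in a set derived from the whole prior history, while an account/state model only checks the current balance map, wherever that map came from:

```python
# UTXO-style: inputs must exist in the UTXO set built from all prior blocks.
def validate_utxo_tx(tx, utxo_set):
    inputs = [utxo_set.get(ref) for ref in tx["inputs"]]
    if any(v is None for v in inputs):     # spends something history never created
        return False
    return sum(inputs) >= sum(tx["outputs"].values())

# Account/state-style: only the current balance map is consulted.
def validate_account_tx(tx, balances):
    return balances.get(tx["from"], 0) >= tx["amount"]

utxos = {"txid0:0": 50}
print(validate_utxo_tx({"inputs": ["txid0:0"], "outputs": {"alice": 30, "bob": 20}}, utxos))  # True
print(validate_account_tx({"from": "alice", "to": "bob", "amount": 30}, {"alice": 50}))       # True
# A node handed a balances dict has nothing further to check; a node verifying
# the UTXO set must have replayed every prior block to construct it.
```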
Mostly because there is a MASSIVE oversupply of people that can’t and won’t do anything useful, and insist that their technical incompetence makes them uniquely qualified to be in charge of everybody else that is doing the work.
But at some point, that’s exactly how it has to work out, because CEO is a poorly understood role that is only discoverable through natural selection among legions of technical bozos. One of those crazy idiots can make you rational experts billionaires, but most of them will waste your time, and you probably can’t tell the difference.