I had the same thought. I mean, were they actually using ordinary floating-point numbers to represent amounts in their ledger? This sets off so many alarm bells.
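For anyone who hasn't been bitten by this, here's the failure mode in a few lines of Python, plus the usual fixes (integer minor units or decimal arithmetic):

```python
from decimal import Decimal

# Binary floats can't represent most decimal fractions exactly,
# so arithmetic on amounts drifts.
print(0.1 + 0.2)          # 0.30000000000000004
print(0.1 + 0.2 == 0.3)   # False

# Ten dimes don't add up to a float dollar.
print(sum([0.1] * 10))    # 0.9999999999999999

# Fix 1: integers in the smallest currency unit (cents).
print(sum([10] * 10))     # 100 cents, exactly

# Fix 2: decimal arithmetic.
print(sum([Decimal("0.10")] * 10))  # 1.00, exactly
```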
In some circles there is an irritating tendency to believe that technology can solve every problem. Experts are dismissed because innovation is valued above all else.
Um, is it okay to admit, as an "experienced" programmer, that I often resort to print statements? I mean, compilers are so darn fast these days that adding a print and rebuilding costs almost nothing.
Another trick: for bugs that only show up under rare conditions, write whatever complicated logic is needed to detect the bad state and issue a print statement when it occurs, then use the debugger to break on that print statement.
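A minimal Python sketch of that trick; the record type and the detection condition here are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Order:            # hypothetical record type, just for the example
    id: int
    total: float
    status: str

def process(order: Order) -> None:
    # Whatever complicated logic is needed to isolate the rare bad state.
    # (This particular condition is made up.)
    if order.total < 0 and order.status == "settled":
        # Set a debugger breakpoint on this print: when it fires, you are
        # stopped in exactly the rare state, with the full program state
        # available for inspection.
        print(f"bug isolated: negative settled order id={order.id}")
    # ... normal processing would continue here ...

for o in [Order(1, 9.99, "open"), Order(2, -5.00, "settled")]:
    process(o)
```

Conditional breakpoints do the same thing in principle, but when the condition spans multiple statements or needs helper state, writing it as real code is often easier and faster.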
Intuitively, an overparameterized model will generalize well if its representations capture the essential information that the best model in the model class needs in order to perform well.
The improvements in transformer implementations (e.g. "Flash Attention") have saved gobs of money on training and inference; I'd guess far more than the salaries of the researchers who developed them.
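For anyone curious where the savings come from: the trick is computing attention in tiles inside one fused kernel, never materializing the full seq_len x seq_len score matrix. A rough PyTorch sketch of the contrast (on supported GPUs, `scaled_dot_product_attention` dispatches to a FlashAttention-style fused kernel):

```python
import torch
import torch.nn.functional as F

batch, heads, seq_len, head_dim = 2, 8, 1024, 64
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)
v = torch.randn(batch, heads, seq_len, head_dim)

# Naive attention: materializes an O(seq_len^2) score matrix per head.
scores = (q @ k.transpose(-2, -1)) / head_dim**0.5  # shape (2, 8, 1024, 1024)
naive = torch.softmax(scores, dim=-1) @ v

# Fused path: same math, computed tile by tile without ever storing
# the full score matrix; this is where the memory and bandwidth savings
# come from.
fused = F.scaled_dot_product_attention(q, k, v)

print(torch.allclose(naive, fused, atol=1e-5))  # True, up to float error
```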
> Tesla are all making rapid progress on functionality
The lack of progress with self-driving seems to indicate that Tesla has a serious scaling problem. The investment in enormous compute resources is another red flag (when you run out of ideas, you fall back on brute force), and it points to a fundamental flaw in the model architecture.
Scaling experiments are routinely performed (the results are not encouraging). To say we know nothing about this is wrong.