I think the fact that all (good) LLM datasets are full of licensed/pirated material means we'll never really see a decent open source model under the strict definition. Open weights + open source code is really the best we're going to get, so I'm fine with that co-opting the term open source even if it doesn't fully apply.
> we'll never really see a decent open source model under the strict definition
But there are already a bunch of models like that, where everything (architecture, training data, training scripts, etc.) is open, public and transparent. Since you weren't aware those existed before, but now you know they do, are you willing to change your perspective on it?
> so I'm fine with it coopting the term open source even if it doesn't fully apply
It really sucks that the community seems OK with this. I probably wouldn't have been a developer without FOSS, and I don't understand how it can seem OK to rob other people of this opportunity to learn from FOSS projects.
Not all of the community is OK with this, lots of folks are strongly against OSI's bullshit OSAID for example. Really it should have been more like the Debian Deep Learning Team's Machine Learning Policy, just like last time when the OSI used the Debian Free Software Guidelines (DFSG) to create the Open Source Definition (OSD).
The reason is that the usage is completely different from coroutine based async. With GPUs you want to queue _as many async operations as possible_ and only then synchronize. That is, you would have a program like this (pseudocode):
b = foo(a)
c = bar(b)
d = baz(c)
synchronize()
With coroutines/async await, something like this
b = await foo(a)
c = await bar(b)
d = await baz(c)
would synchronize after every step, which is much less efficient.
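Here's a minimal runnable sketch of the queue-then-synchronize pattern using CuPy (my choice for illustration; foo/bar/baz become arbitrary elementwise ops):

    import cupy as cp

    a = cp.arange(1 << 20, dtype=cp.float32)

    # Each line only *enqueues* a kernel on the current CUDA stream
    # and returns immediately; the host is not blocked.
    b = a * 2.0      # stands in for foo
    c = cp.sqrt(b)   # stands in for bar
    d = c + 1.0      # stands in for baz

    # Only now do we block until everything queued has finished.
    cp.cuda.Stream.null.synchronize()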
It really depends on whether you're dealing with an async stream or a single async result as the input to the next function. If a is an access token needed to access resource b, you cannot fetch a and b at the same time. You have to serialize your operations.
Well, you can and should create multiple coroutines/tasks and then gather them. If you replace CUDA with network calls, it's exactly the same problem. Nothing specific to asyncio.
No, that's a different scenario. In the one I gave there's explicitly a dependency between requests. If you use gather, the network requests would be executed in parallel. If you have dependencies they're sequential by nature because later ones depend on values of former ones.
The 'trick' for CUDA is that you declare all this using buffers as inputs/outputs rather than values and that there's automatic ordering enforcement through CUDA's stream mechanism. Marrying that with the coroutine mechanism just doesn't really make sense.
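To make the dependency point concrete, here's a minimal asyncio sketch (fetch is a hypothetical stand-in for a network call):

    import asyncio

    async def fetch(x):
        await asyncio.sleep(1)  # pretend this is a network round-trip
        return x + 1

    async def main():
        # Dependent: each call needs the previous result, so the
        # awaits are inherently sequential (~3s total).
        b = await fetch(0)
        c = await fetch(b)
        d = await fetch(c)

        # Independent: no data dependencies, so gather runs them
        # concurrently (~1s total).
        x, y, z = await asyncio.gather(fetch(1), fetch(2), fetch(3))

    asyncio.run(main())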
If you compare with e.g. DeepSeek and other hosts, you'll find that OpenAI is actually almost certainly charging very high margins (DeepSeek has an 80% profit margin and they're 10x cheaper than OpenAI).
The training/R&D might make OpenAI burn VC cash, but this isn't comparable to companies like WeWork, whose products actively burn cash.
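Back-of-envelope on those figures (the 80% margin and 10x price gap are from this thread; assuming OpenAI's serving cost per token is comparable to DeepSeek's, which is a big assumption on my part):

    deepseek_price = 1.0                  # normalized price per token
    deepseek_cost = deepseek_price * 0.2  # implied by the 80% margin
    openai_price = 10 * deepseek_price    # "10x cheaper" the other way

    implied_margin = 1 - deepseek_cost / openai_price
    print(implied_margin)  # 0.98, i.e. ~98% margin under these assumptions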
On their subscriptions, specifically the Pro subscription, because it's a flat rate for their most expensive model. The API prices are all much higher. It's unclear whether they're losing money on the normal subscriptions, but if so, probably not by much. Though it's definitely closer to what you described: subsidizing it to gain 'mindshare' or whatever.
Well, I think there are currently many models that are a better bang for the buck than GPT-4o, both per token and per unit of intelligence. Other than OpenAI offering very high rate limits and throughput without a contract negotiated with sales, I don't see much reason to use it instead of Sonnet 3.5 or 3.7, or Google's Flash 2.0.
Perhaps their training cost and their current inference cost are higher, but what you get as a customer is a more expensive product for what it is, IMO.
They surely lose money in some months on some customers, but I expect that globally most subscribers (including me; I recently cancelled) would be much better off migrating to the API.
Everyone I know who has or had a subscription didn't use it very extensively, and that's how it stays profitable overall.
I suspect it's the same for Copilot, especially the business variant. While they definitely lose money on my account, looking at our whole company subscription I wouldn't be surprised if actual usage costs only 30% of what we pay.
Do you take issue with the 'purely empirical' approach (just trying out variants and seeing which sticks) or only with its insufficient documentation?
I don't know how you'd improve on the former. For a lot of it there simply isn't any sound theoretical foundation, so you just end up with flimsy post-hoc rationalizations.
While I agree that it's unfortunate that people often just present magic numbers without explaining where they come from, in my experience providing documentation for how one arrives at these often enough gets punished because it draws more attention to them. That is, reviewers will e.g. complain about preliminary experiments, asking for theoretical analysis or question why only certain variants were tried, whereas magic numbers are just kind of accepted.
Seems pretty clear they aren't objecting to throwing stuff at the wall and seeing what sticks, but with calling the outcome of sticky-wall work "science".
I'd say that's a bit of a strict take on science; one could be generous and compare it to a biologist going out into the forest and coming back with a report on finding a new lichen.
Though admittedly these days the biologist is probably expected to report details about their search strategy, which the sticky-wall researchers don't.
The biologist would be expected to describe the lichen in detail, including where it was found, its expected ecology, its place in the ecosystem, life cycle, structure, etc. It is no longer 1696, when we could go spear some hapless fish, bring back its desiccated body, and let our fellow gentlemen ogle at its weirdness.
I'm not GP, but I don't think they are taking issue with the fact that e.g. layer counts or architecture were arrived at empirically rather than from first principles.
Rather, that when you do come to something empirically, you need to validate your findings by e.g. ablations, hypothesis testing, case studies, etc.
> I don't know how you'd improve on the former. For a lot of it there simply isn't any sound theoretical foundation, so you just end up with flimsy post-hoc rationalizations.
So great science would come up with a sound theoretical foundation, or at least strong arguments as to why no such foundation can exist.
There are even much cheaper services that host it for only slightly more than DeepSeek itself [1]. I'm now very certain that DeepSeek is not offering the API at a loss, so either OpenAI has absurd margins or their model is much more expensive to run.
[1] The cheapest I've found, which also happens to run in the EU, is https://studio.nebius.ai/ at $0.8/million input tokens.
Edit: I just saw that OpenRouter also now has Nebius.
Yes, sorry, I was being maximally-broad in my comment but I would think it's very, very, very likely that OpenAI is currently charging huge margins and markups to help maintain the cachet / exclusivity / and, in some senses, safety of their service. Charging more money for access to their models feels like a pretty big part of their moat.
Also possibly b/c of their sweetheart deal with Azure they've never needed to negotiate enterprise pricing so they're probably calculating margins based on GPU list prices or something insane like that.
The "97.3%" match is probably just the confidence value - I don't think a frequentist interpretation makes sense for this. I'm not an expert in face recognition, but these systems are very accurate, typically like >99.5% accuracy with most of the errors coming from recall rather than precision. They're also not _that_ expensive. Real-time detection on embedded devices has been possible for around a decade and costs for high quality detection have come down a lot in recent years.
Still, you're right that at those scales these systems will invariably slip once in a while, and it's scary to think that this might be enough to be considered a criminal, especially because people often treat these systems as infallible.
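The base-rate problem is easy to see with made-up but plausible numbers:

    # Illustrative numbers only (assumed, not from the article).
    fpr = 0.001              # 0.1% false-positive rate, i.e. 99.9% specificity
    comparisons = 1_000_000  # faces scanned against a watchlist per day

    false_matches = fpr * comparisons
    print(false_matches)  # 1000.0 -- a thousand innocent matches per day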
I wonder if that can be avoided by modifying the training approach. Ideas offhand: group data by topic and train a subset of weights per node; or figure out which layers have the most divergence and reduce the lr on those only.
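For the second idea, a minimal PyTorch sketch of per-layer learning rates via parameter groups (the divergence measure here is a placeholder; a real version would compare weights across nodes):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

    # Placeholder divergence score per parameter tensor; in practice
    # you'd measure the variance of each layer's weights across workers.
    divergence = {name: p.detach().var().item()
                  for name, p in model.named_parameters()}
    threshold = sorted(divergence.values())[len(divergence) // 2]  # median

    # One group per tensor: halve the lr where divergence is high.
    groups = [{"params": [p],
               "lr": 1e-3 if divergence[n] <= threshold else 5e-4}
              for n, p in model.named_parameters()]
    optimizer = torch.optim.SGD(groups, lr=1e-3)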
Really sad to see. Emad pivoting to posting crypto nonsense on twitter made me think the writing is on the wall for Stability, but I still didn't expect it so soon. I expect they'll pivot to closed models and then fade away.
> Really sad to see. Emad pivoting to posting crypto nonsense on twitter made me think the writing is on the wall for Stability
Open-source AI is a race to zero that makes little money, and Stability was facing lawsuits (especially from Getty) with costs mounting into the millions while the company was already burning tens of millions.
Despite being the actual "Open AI", Stability cannot afford to sustain itself doing so.
Wow, much of what he's saying in that video is a lie with regard to the history of latent diffusion, creating an open-source GPT-3, etc. He's just taking credit for a bunch of work he didn't have much to do with.
You know, it never ceases to amaze me how even the most respected fall prey to this money laundering scheme. If people spent even a little time reading about Tether, they would not touch this stuff. It's blood money.
You should probably post some citations for that. Tether most probably has backing for every single dollar they have, and then plenty more. What about it is "blood money"?
I mean, is AI a less sketchy space in 2024 than crypto/blockchain in 2024? Two or three years ago sure, I guess, but today?
The drama around OpenAI is well documented, there are multiple lawsuits and an SEC investigation at least in embryo, Karpathy bounced and Ilya's harder to spot than Kate Middleton (edit: please see below edit in regards to this tasteless quip). NVIDIA is pushing the Dutch East India Company by some measures of profitability with AMD's full cooperation: George Hotz doesn't knuckle under to the man easily and he's thrown in the towel on ever getting usable drivers on "gaming"-class gear. At least now I guess the Su-Huang Thanksgiving dinners will be less awkward.
Of the now over a dozen FAANG/AI "boomerangs" I know, all of them predate COVID hiring or whatever, and all get crammed down on RSU grants they accumulated over years: whether or not phone calls got made, it's pretty clearly on everyone's agenda to wash all the ESOP out, neutron-bomb the Peninsula, and then hire everyone back at dramatically lower TC, all while blowing out EPS quarter after quarter.
Meanwhile the FOMC is openly talking about looser labor markets via open market operations (that's direct government interference in free labor markets to suppress wages, for the pro-capitalism folks; think a little about what capitalism is supposed to mean if you are OK with this). And this against the backdrop of an election between two men having trouble campaigning effectively: one is fighting off dozens of lawsuits including multiple felony charges, and the other is flying back and forth between Kiev and Tel Aviv trying to manage two wars he can't seem to manage. IIRC Biden is in Ukraine right now trying to keep Zelenskyy from drone-bombing any more refineries of Urals crude, because `CL` or whatever is up like 5% in the last three weeks, which is really bad in an election year looking to get nothing but uglier. Is anyone really arguing that some meme on /r/crypto is what's pushing e.g. BTC, and not a pretty shaky-looking Fed?
Meanwhile over in crypto land, over the same period of time that AI and other marquee Valley tech has been turning into a scandal-plagued orgy of ugly headlines on a nearly daily basis, the regulators have actually been getting serious about sending bad actors to jail or leaning on them with the prospect (SBF, CZ), major ETFs and futures serviced by reputable exchanges (e.g. CME) have entered mainstream portfolios, and a new generation of exchanges (`dy/dx`, Vertex, Apex, Orderly) backed by conventional finance investments in robust bridge infrastructure (LayerZero) are now doing standard Island/ARCA-style efficient matching and then using the blockchain for what it's for: printing a Reg NMS/NBBO/SIP-style consolidated tape.
As a freelancer I don't really have a dog in this fight; I judge projects by feasibility, compensation, and minimum ick factor. From the vantage point of my deal flow, the AI projects look sketchier on average and bid below market on average compared to the blockchain projects, a stark reversal from even six months ago.
Edit: I just saw the news about Kate Middleton, I was unaware of this when I wrote the above which is in extremely poor taste in light of that news. My thoughts and prayers are with her and her family.
I and many, many people around me use ChatGPT every single day in our lives. AI has a lot of hype, but it's backed by real crap that's useful. Crypto, on the other hand, never really did anything practical except help people buy drugs and launder money. Or make people money in the name of investments.