
GitHub was an early OpenAI design partner. OpenAI developed a custom LLM for them.

It's so interesting that even with that early-mover advantage, they have to go back to the foundation model providers.

Does this mean that future tech companies have no choice but to do this?




It may not be a model quality issue. It may be that GitHub wants to sell a lot more of Copilot, including to companies who refuse to use anything from OpenAI. Now GitHub can say "Oh that's fine, we have these two other lovely providers to choose from."

Also, after Anthropic and Google sold massive amounts of pre-paid usage credits to companies, those companies want to draw down that usage and get their money's worth. GitHub might allow them to do that through Copilot, and therefore get their business.


I think the credit scenario applies more to OpenAI than to the others. Existing Azure commits can be used to buy OpenAI via the marketplace; it will never be that simple for any non-Azure partner (only GitHub is tying up with Anthropic here, not Azure).

GitHub doesn't even support using those Azure-managed APIs for Copilot today; it is just a license you can currently buy and add to a user license. The best you can do is pay for Copilot with existing Azure commits.

This seems to be about not getting left behind as other models outpace what Copilot can do with its custom OpenAI model, which doesn't seem to be getting updated.


Yes, because transfer learning works. A specialized model for X will be subsumed by a general model for X/Y/Z as the latter gets better at Y/Z. This is why models that learn other languages become better at English.

Custom models still have use cases, e.g. situations requiring cheaper or faster inference. But ultimately The Bitter Lesson holds: your specialized thing will always be overtaken by throwing more compute at a general thing. We'll be chasing foundation models for the foreseeable future, with distilled offshoots bubbling up and dying off along the way.
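To make "distilled offshoots" concrete, here's a minimal sketch of knowledge distillation, assuming toy MLPs and random inputs purely for illustration (a real setup would use an actual teacher LLM and a real corpus): the small student is trained to match the teacher's temperature-softened output distribution rather than hard labels.

```python
# Minimal knowledge-distillation sketch. Everything here (model sizes,
# vocab size, data) is a hypothetical stand-in for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = 100  # hypothetical output vocabulary size

# Large "teacher" and small "student" -- toy stand-ins for real models.
teacher = nn.Sequential(nn.Linear(16, 256), nn.ReLU(), nn.Linear(256, VOCAB))
student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, VOCAB))

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # softmax temperature: softer targets expose more of the teacher's knowledge

for step in range(1000):
    x = torch.randn(64, 16)  # stand-in for real training inputs
    with torch.no_grad():
        teacher_logits = teacher(x)  # teacher is frozen
    student_logits = student(x)
    # KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The student ends up far cheaper to run, which is exactly the "cheaper or faster inference" niche mentioned above, but it inherits the teacher's behavior and so gets left behind the moment a better general model appears.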


> This is why models which learn other languages become better at English.

Do you have a source for that? I'd love to learn more!


Evaluating cross-lingual transfer learning approaches in multilingual conversational agent models[1]

Cross-lingual transfer learning for multilingual voice agents[2]

Large Language Models Are Cross-Lingual Knowledge-Free Reasoners[3]

An Empirical Study of Cross-Lingual Transfer Learning in Programming Languages[4]

That should get you started on transfer learning across languages, but you'll have more fun picking interesting papers yourself than reading a random yahoo's choices. The fire hose of papers is nuts, so you'll never be left wanting.

[1] https://www.amazon.science/publications/evaluating-cross-lin...

[2] https://www.amazon.science/blog/cross-lingual-transfer-learn...

[3] https://arxiv.org/pdf/2406.16655v1

[4] https://arxiv.org/pdf/2310.16937v2


I see no reason why GitHub wouldn't use fine-tuned models from Google or Anthropic.

I think their version of GPT-3.5 was a fine-tune as well. I doubt they had a whole model trained from scratch just for them.



