
One thing I found interesting about this paper [1] on understanding LLMs is how the models associate words and concepts across languages, via what the authors call multilingual circuits.

So the example they give:

English: The opposite of "small" is " → big

French: Le contraire de "petit" est " → grand

Chinese: "小"的反义词是" → 大

Cool graphic for the above [2]
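If you want to poke at this yourself, a quick way is to compare each prompt's top next-token prediction. Here's a minimal sketch using the HuggingFace transformers API (the model name is just a placeholder; you'd want a large multilingual model to actually reproduce the paper's behavior):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model; swap in a larger multilingual causal LM.
    model_name = "gpt2"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # The three equivalent prompts from the paper's example.
    prompts = {
        "en": 'The opposite of "small" is "',
        "fr": 'Le contraire de "petit" est "',
        "zh": '"小"的反义词是"',
    }

    for lang, prompt in prompts.items():
        inputs = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits[0, -1]  # logits for the next token
        top_token = tokenizer.decode(logits.argmax().item())
        print(f"{lang}: {prompt} -> {top_token}")

Note this only checks the behavioral side; the paper's actual evidence comes from tracing shared internal features across the three prompts, not just comparing outputs.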

So while English is the lingua franca of the internet and represents the largest corpus of training data, the primary models being built can use a predominantly English dataset to form associations that carry over to other languages. This could mean significantly stronger AI and reasoning even for languages and regions that lack the data, tech, and resources to build local models.

[1] https://www.anthropic.com/research/tracing-thoughts-language...

[2] https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-...
