BTW, one of the largest collections of texts in the world is the Library of Congress. An old estimate put it at about 100B characters, or very roughly 100B LLM tokens (fewer if a token averages several characters).
Plus, they hold thousands of multimedia items (films, music, etc.) and at one point archived all tweets for historical preservation.
The multimedia and tweets add up to much less volume than the texts, but they add a few extra dimensions that are hard to express in text.
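
For anyone who wants to sanity-check the characters-to-tokens step, here is a minimal sketch. The 100B character figure is the old estimate quoted above; the function name `estimate_tokens` and the ratio of about 4 characters per token are assumptions for illustration (a common rule of thumb for English text, not a measured value for this collection).

```python
def estimate_tokens(num_chars: int, chars_per_token: float = 4.0) -> int:
    """Rough token count from a character count.

    chars_per_token is an assumed average; real tokenizers vary
    by language and by the text itself.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return round(num_chars / chars_per_token)


# Old estimate quoted in the comment above: about 100B characters.
LOC_CHARS = 100_000_000_000

# 1.0 is the upper bound (one token per character);
# 4.0 is the assumed rule-of-thumb ratio for English.
for ratio in (1.0, 4.0):
    tokens = estimate_tokens(LOC_CHARS, ratio)
    print(f"{ratio} chars/token: about {tokens / 1e9:.0f}B tokens")
```

Either way the result stays in the tens to hundred billion range, so the order of magnitude holds.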