I am still dismayed at how quickly we gave up on including the pre-training data as a requirement for "open-source" LLMs.
As someone who thinks of LLMs as akin to Lisp expert systems (but in natural language): it's like including the C source code for your Lisp compiler, but claiming the Lisp applications are merely "data" and shouldn't be included.
You forgot the most egregious term, which is that users have to abide by an acceptable use policy that only allows them to use it for what Meta says they can.
Is there some entropy or randomness at play here? Or some sort of RAG? Even if it were RAG, the "reasoning" is very different and doesn't mention the clear censorship in the initial prompt that the one I linked does.
It's funny they don't have Hugging Face as a partner. Literally the biggest face of open LLMs, sitting right in Europe, but somehow it's not a partner.
When you are the front runner, you don't associate with the also-rans and the wannabes. They will drag you down and drown you in their endless discussions and alignment meetings.
Great to see Unsloth here! How long did the training process take?
Also, a modified version of the same original Colab couldn't get a 135M model to learn the XML tags, so do you think 8 billion parameters should be the minimum to use this?
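To be concrete about what I mean by "learn the XML tags", here's roughly the kind of format reward I have in mind; the tag names and scoring are hypothetical, not the notebook's actual code:

```python
import re

# Hypothetical tag names for illustration; the real notebook may use different ones.
FORMAT_RE = re.compile(
    r"<reasoning>.*?</reasoning>\s*<answer>.*?</answer>", re.DOTALL
)

def format_reward(completions):
    """Give a reward only when the completion follows the XML layout.

    A tiny model (e.g. 135M) can go through GRPO without ever discovering
    this structure, so the format reward stays flat the whole run.
    """
    return [1.0 if FORMAT_RE.search(text) else 0.0 for text in completions]

# Example: only the first completion earns the format reward.
print(format_reward([
    "<reasoning>2+2=4</reasoning><answer>4</answer>",
    "The answer is 4.",
]))  # -> [1.0, 0.0]
```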
I may have misunderstood, but it seems the OP's intent was to get the benefits of RAG, which Cline enables, since it performs what I would consider RAG under the hood.
I ran the distilled models locally; some of the censorship is there.
But on their hosted chat, DeepSeek has some keyword-based filters: the moment it generates the Chinese president's name or other controversial keywords, the "thinking" stops abruptly!
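Roughly what I imagine is happening server-side, as a sketch only; the blocklist and token stream are made up for illustration, not DeepSeek's actual setup:

```python
# Hypothetical keyword filter sitting on top of a streaming response.
BLOCKLIST = {"example-blocked-term"}  # placeholder, not the real list

def filtered_stream(token_stream):
    """Yield tokens until any blocklisted keyword appears in the output so far."""
    emitted = []
    for token in token_stream:
        emitted.append(token)
        if any(term in "".join(emitted) for term in BLOCKLIST):
            # Abort immediately, which is what the abrupt "thinking" cutoff looks like.
            return
        yield token

# Example: generation stops as soon as the blocked term is completed.
tokens = ["The ", "example-", "blocked-term", " is ..."]
print("".join(filtered_stream(tokens)))  # -> "The example-" (cut off mid-stream)
```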
The distilled versions I've run through Ollama are absolutely censored and don't even populate the <think></think> section for some of those questions.
This has been the problem with a lot of long-context use cases. It's not just about the model supporting it, but also about having sufficient compute and inference time. This is exactly why I was excited for Mamba and now possibly Lightning attention.
That said, the new DCA on which these models base their long context could be an interesting area to watch.
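To put rough numbers on the compute/memory side (back-of-the-envelope only; the model dimensions below are assumptions for illustration, not any specific model's config):

```python
# Back-of-the-envelope KV-cache size for standard attention.
def kv_cache_gb(context_len, n_layers=32, n_kv_heads=8, head_dim=128,
                bytes_per_value=2):
    """Memory for keys + values across all layers, in GiB (fp16 = 2 bytes)."""
    per_token = n_layers * n_kv_heads * head_dim * 2 * bytes_per_value  # K and V
    return context_len * per_token / 2**30

for ctx in (8_192, 131_072, 1_048_576):
    print(f"{ctx:>9} tokens -> ~{kv_cache_gb(ctx):.0f} GiB of KV cache")
# ->    8192 tokens -> ~1 GiB
#     131072 tokens -> ~16 GiB
#    1048576 tokens -> ~128 GiB

# Standard attention compute also scales quadratically with context length,
# so going from 8k to 1M tokens is ~128x the cache and ~16,000x the attention
# FLOPs, which is why linear/SSM-style approaches and chunked-attention
# schemes are interesting.
```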
For those who don't know, he is the gg of `gguf`. Thank you for all your contributions! Literally the core of Ollama, LM Studio, Jan, and multiple other apps!
They collaborate! Her name is Justine Tunney; she took her “execute everywhere” work with Cosmopolitan to make Llamafile, using the llama.cpp work that Georgi has done.
She actually stole that code from a user named slaren and was personally banned by Georgi from the llama.cpp repo for about a year because of it. Also, it was just lazy-loading the weights; it wasn't actually a 50% reduction.
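For anyone wondering what "just lazy-loading" means here: a minimal sketch of the idea using mmap, not the actual llama.cpp/llamafile code, with a placeholder weights path. Pages are only faulted into RAM as they're touched, so resident memory right after "loading" looks small even though nothing about the model shrank:

```python
import mmap
import os

def open_weights_lazy(path):
    """Map the weights file; pages are read from disk only when accessed."""
    fd = os.open(path, os.O_RDONLY)
    size = os.fstat(fd).st_size
    mm = mmap.mmap(fd, size, access=mmap.ACCESS_READ)
    os.close(fd)  # mmap keeps its own duplicate of the descriptor
    return mm     # slicing mm[i:j] pages in only those bytes

def open_weights_eager(path):
    """Pull the entire file into RAM up front, for comparison."""
    with open(path, "rb") as f:
        return f.read()

# With the lazy version, memory usage right after loading looks tiny, but the
# same bytes get paged in as soon as inference actually reads the tensors.
```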
But unlike GeoGuessr, it uses web search [1]

[1] https://youtu.be/P2QB-fpZlFk?si=7dwlTHsV_a0kHyMl