> completely irrelevant and I’m only interested in writing (chat, stories, etc)
There's a person keeping track of a few writing prompts and the evolution of the quality of text with each new shiny model. They shared this link somewhere, can't find the source but I had it bookmarked for further reading. Have a look at it and see if it's something you'd like.
Why is the main character named Rhys in most (?) of them? The Llama[1], Claude[3], Mistral[4] & DeepSeek-R1[5] samples all name the main character Rhys, even though that is nowhere specified in the prompt. GPT-4o gives the character a different name[6]. Gemini[2] names the bookshop person Rhys instead! Am I missing something obvious that's right in front of me?
The only measurable flaw I could find was the errant use of an opening quote (‘) in
> He huffed a laugh. "Lucky you." His gaze drifted to the stained-glass window, where rain blurred the world into watercolors. "I bombed my first audition. Hamlet, uni production. Forgot ‘to be or not to be,' panicked, and quoted Toy Story."
It's pretty amazing I can find no fault with the actual text. No grammar errors, I like the writing, it competes with the quality and engagingness of a large swath of written fiction (yikes), I wanna read the next chapter.
> It's pretty amazing I can find no fault with the actual text. No grammar errors, I like the writing, it competes with the quality and engagingness of a large swath of written fiction (yikes), I wanna read the next chapter.
Those outputs are really good and come from DeepSeek-R1 (I assume the full version, not a distilled one).
R1 is quite large (685B params). I’m wondering if you can make a distilled R1 without the coding and math content. 7B works well for me locally. When I go up to 32B I seem to get worse results - I assume it’s just timing out in its think mode… I haven’t had time to really investigate though.
https://eqbench.com/results/creative-writing-v2/deepseek-ai_...