minosu's comments

Mistral 7B runs inference about 18% faster for me as a 4-bit quantized version on an A100. That's definitely relevant when running anything but chatbots.


Are you measuring tokens/sec or words per second?

The difference matters: in my experience, Llama 3, by virtue of its giant vocabulary, generally tokenizes text with 20-25% fewer tokens than something like Mistral. So even if it's 18% slower in terms of tokens/second, it may, depending on the text content, actually output a given body of text faster.
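The point about token rate vs. text rate can be sketched with some back-of-the-envelope arithmetic. The numbers below are illustrative, not benchmarks; `tokens_per_word` values are assumptions chosen to mirror the 20-25% vocabulary-efficiency gap mentioned above.

```python
# Token throughput alone doesn't determine how fast a model produces
# *text*: tokenizers differ in how many tokens they need per word.

def words_per_second(tokens_per_second: float, tokens_per_word: float) -> float:
    """Effective text throughput for a model/tokenizer pair."""
    return tokens_per_second / tokens_per_word

# Illustrative numbers only: give Mistral a faster token rate (100 vs 82,
# i.e. 18% slower for Llama 3), but let Llama 3's tokenizer need ~22%
# fewer tokens for the same text.
mistral_wps = words_per_second(tokens_per_second=100, tokens_per_word=1.5)
llama3_wps = words_per_second(tokens_per_second=82, tokens_per_word=1.5 * 0.78)

print(f"Mistral: {mistral_wps:.1f} words/s, Llama 3: {llama3_wps:.1f} words/s")
```

With these assumed rates the "slower" model actually emits the same text faster, which is why measuring words (or characters) per second matters for cross-model comparisons.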


This is a continuation of previous work done in the godot-dodo project (https://github.com/minosvasilias/godot-dodo), which involved finetuning LLaMA models on GitHub-scraped GDScript code.

Starcoder performs significantly better than LLaMA using the same dataset, and exceeds evaluation scores of both gpt-4 and gpt-3.5-turbo, showing that single-language finetunes of smaller models may be a competitive option for coding assistants, especially for less commonplace languages such as GDScript.

The Twitter thread also details some drawbacks of the current approach, namely increasing occurrences where the model references out-of-scope objects in its generated code, a problem that worsens as the number of training epochs increases.


Yes, the changes introduced by Godot 4 were a prime motivator for this project.

However, it is not quite as clear cut as OpenAI's models simply being trained on Godot 3.x projects only. Not only do they sometimes produce valid 4.x syntax (gpt-4 more often than 3.5-turbo), indicating there were at least some 4.x projects in the training data, they also hallucinate other invalid syntax, such as Python-specific functionality, or simply non-existent methods.

I do think evaluating against Godot 3.x would increase their scores somewhat, but I have not had time to do so yet.


The wrapping prompt is also used during inference. (https://github.com/minosvasilias/godot-dodo/blob/f62b90a4622...) Prompting like this is useful for instruct-finetunes, and similar prompts are used by other projects like stanford-alpaca.
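For readers unfamiliar with instruct-finetune prompt wrappers, here is a minimal sketch of the alpaca-style template the comment refers to. The exact wording in the godot-dodo repo may differ; this is illustrative.

```python
# Alpaca-style instruction wrapper. The key point from the comment:
# the SAME wrapper must be applied at inference time as during
# finetuning, otherwise the model sees out-of-distribution input.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw user instruction in the training-time template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

print(build_prompt("Write a GDScript function that moves a node left."))
```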


Thanks for the clarification, makes sense now!


This repository presents finetuned LLaMA models that try to address the limited ability of existing language models when it comes to generating code for less popular programming languages.

gpt-3.5-turbo and gpt-4 have proven to be excellent coders, but fall off sharply when asked to generate code for languages other than Python, JavaScript, etc. The godot-dodo approach to addressing this: finetune smaller models on a single one of these languages, using human-created code scraped from MIT-licensed GitHub repositories, with existing GPT models generating instructions for each code snippet.

This differs from the dataset generation approach used by projects such as stanford-alpaca or gpt4all in that the output values of the training set remain high-quality human data, while preserving the same instruction-following behavior. This will likely prove more effective the more obscure the language. In this case, GDScript was used, which is the scripting language for the popular open-source game engine Godot. The same approach, however, can be applied to any other language.
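The inversion described above can be sketched as follows. Note that `ask_gpt_for_instruction` is a hypothetical stand-in for an OpenAI API call, not a function from the godot-dodo repo:

```python
# godot-dodo-style dataset construction, reversed from alpaca-style
# generation: the *output* field is real human-written code scraped
# from MIT-licensed repos; a GPT model only writes the *instruction*.
from typing import Callable

def build_example(code_snippet: str,
                  ask_gpt_for_instruction: Callable[[str], str]) -> dict:
    """Pair a human-written snippet with a machine-generated instruction."""
    instruction = ask_gpt_for_instruction(code_snippet)
    return {"instruction": instruction, "output": code_snippet}

# Usage with a stubbed model call:
example = build_example(
    'func _ready():\n\tprint("hello")',
    lambda code: "Write a GDScript snippet that prints hello on ready.",
)
```

Keeping the human code as the training target means the finetuned model imitates real GDScript idioms rather than GPT's (often Python-flavored) guesses at the language.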

Performance is promising, with the 7 billion parameter finetune outperforming GPT models in producing syntax that compiles on first try, while being somewhat less capable at following complex instructions.

A comprehensive evaluation comparing all models can be found here: https://github.com/minosvasilias/godot-dodo/tree/main/models


This sounds like one of those bootstrapping liftoff things. Generating labels had been a big bottleneck, but if we can just find examples and then label them automatically, this could accelerate all sorts of applications.


I'm not sure what MIT licensed code is supposed to do for you. Are you going to cite every repository ingested?


I suppose for the model itself you should indeed do that?

But then maybe not for the actual predictions made by the model, as the MIT license says:

> The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

Arguably e.g. a single function is not a substantial portion of a multi-file project—and, usually, even that function itself is not going to be a verbatim copy but adjusted to your use case regarding variable names etc.


Technically you could do that in a big text file...

