
One of Georgi Gerganov's core design goals for GGUF was to not need any other files. It's literally bullet point #1 in the spec:

>Single-file deployment

>Full information: all information needed to load a model is contained in the model file, and no additional information needs to be provided by the user.

https://github.com/ggml-org/ggml/blob/master/docs/gguf.md

We literally just got rid of that multi-file chaos, only for Ollama to add it back :/
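
To make the "everything is in the one file" point concrete, here is a rough sketch of reading the fixed-size header at the start of a GGUF file. It assumes the version 2+ layout in little-endian byte order (magic, version, tensor count, metadata key/value count); `read_gguf_header` is a name made up for this example, not part of any library, and it stops before the metadata itself.

```python
import struct


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (assumes v2+ layout, little-endian)."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes for a GGUF header")
    # 4-byte magic followed by a uint32 format version.
    magic, version = struct.unpack_from("<4sI", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    # Two uint64 counts: number of tensors, number of metadata key/value pairs.
    tensor_count, metadata_kv_count = struct.unpack_from("<QQ", data, 8)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }


if __name__ == "__main__":
    # Synthetic header for demonstration; a real file would be opened in "rb" mode.
    sample = struct.pack("<4sIQQ", b"GGUF", 3, 2, 5)
    print(read_gguf_header(sample))
```

The metadata key/value pairs that follow this header are where the loader-relevant information lives, which is what lets a loader work from the single file.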






Most of the parameters you would put in Ollama's Modelfile are things you would pass to llama.cpp as command-line flags:

https://github.com/ggml-org/llama.cpp/blob/master/examples/m...

If you only ever have one set of configuration parameters per model (same temp, top_p, system prompt...), then I guess you can put them in the GGUF file (the format is extensible).

But what if you want two different sets? You still need to keep them somewhere. That could be a shell script for llama.cpp, or a Modelfile for Ollama.

(Assuming you don't want to create a new (massive) GGUF file for each permutation of parameters.)
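
As a rough illustration of keeping several parameter sets next to one model file, here is a minimal sketch that builds a llama.cpp command line from a named preset. The preset names and values are invented for the example, the `llama-cli` binary name and `model.gguf` path are assumptions about your setup, and only the `-m`, `--temp` and `--top-p` flags are used.

```python
import shlex

# Hypothetical presets; the values are illustrative, not recommendations.
PRESETS = {
    "precise": {"temp": 0.2, "top_p": 0.9},
    "creative": {"temp": 1.0, "top_p": 0.95},
}


def build_command(model_path, preset_name, binary="llama-cli"):
    """Return an argv list running one GGUF file with the chosen preset."""
    preset = PRESETS[preset_name]
    return [
        binary,
        "-m", model_path,                   # the single model file, shared by all presets
        "--temp", str(preset["temp"]),      # sampling temperature
        "--top-p", str(preset["top_p"]),    # nucleus sampling threshold
    ]


if __name__ == "__main__":
    # Print one shell-ready command per preset instead of executing anything.
    for name in PRESETS:
        print(shlex.join(build_command("model.gguf", name)))
```

The model file stays untouched; only the small preset table changes per use case, which is the same role a shell script or a Modelfile plays.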


This is why we use xdelta3, rdiff, and git.


