As I'm working on a very similar problem right now, the difficulty is that to save a fitted sklearn model you have to pickle it (a pickled random forest of decent size is several megabytes). Then, at classification time, you have to import pickle, sklearn (and numpy), unpickle the object, run the example through the classifier, and extract the output. Perhaps the Openscoring model is more efficient?
You can use `all_model_filenames = joblib.dump(model, filename)` after fitting in your dev environment. joblib will store each numpy array in the model's data structure as an independent file, and `all_model_filenames[0] == filename` refers to the file holding the main pickle structure.
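A minimal sketch of the dump step, assuming a random forest; the training data and the `model.joblib` filename are just placeholders:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
import joblib

# Hypothetical training data; substitute your own.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# joblib.dump returns the list of files it wrote. Depending on the
# joblib version, the numpy arrays inside the model may be written as
# separate sidecar files next to the main pickle or bundled into one file.
all_model_filenames = joblib.dump(model, 'model.joblib')
assert all_model_filenames[0] == 'model.joblib'
```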
Then, on your prediction servers, ensure that you have a copy of all the files in `all_model_filenames` in the same folder. You can then load the model with `model = joblib.load(all_model_filenames[0], mmap_mode='r')`. This makes it possible to use shared memory (memory mapping) for the parameters of a large random forest, so that all the Gunicorn, Celery or Storm worker processes running on the same server use the same memory pages. That makes it a very efficient way to deploy large models on RAM-constrained servers.
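On the prediction side, the load step might look like the sketch below; `model.joblib` again assumes the filename used above, and the feature count is a placeholder:

```python
import joblib

# mmap_mode='r' memory-maps the numpy arrays instead of copying them into
# each process's heap, so concurrent worker processes on the same machine
# share the same read-only memory pages.
model = joblib.load('model.joblib', mmap_mode='r')

# Hypothetical single example with the same number of features as training.
prediction = model.predict([[0.0] * 20])
```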
You can even use Docker to ship the model as part of a container image and treat the model as binary software configuration.
As I said, run a separate service for this. That way you only have to load the model (or even train it) once per service process. That is one thing the Openscoring service also does...
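As an illustration only (the framework choice is my assumption, not part of the answer), a small Flask service that loads the model once per process might look like this; running it under Gunicorn with `mmap_mode='r'` lets the workers share the mapped parameters:

```python
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)

# Loaded once per worker process, at import time; with mmap_mode='r'
# the workers on one machine share the underlying memory pages.
model = joblib.load('model.joblib', mmap_mode='r')

@app.route('/predict', methods=['POST'])
def predict():
    # Expects a JSON body like {"features": [[...], [...]]}.
    features = request.get_json()['features']
    return jsonify(predictions=model.predict(features).tolist())

if __name__ == '__main__':
    app.run()
```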
If, like me, you are more familiar with Python than Java, then that would be the more attractive option.