
You can call `all_model_filenames = joblib.dump(model, filename)` after fitting in your dev environment. joblib stores each large numpy array in the model data structure as an independent file, and `all_model_filenames[0] == filename` is the file holding the main pickle structure.
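For concreteness, a minimal sketch of the dump step, assuming a scikit-learn random forest (the training data, the `clf` name and the `rf_model.joblib` filename are all illustrative):

    import joblib
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical training data and model; any fitted estimator works.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)

    # dump() returns the list of files it wrote: the main pickle first,
    # then (depending on the joblib version) one file per large numpy array.
    all_model_filenames = joblib.dump(clf, 'rf_model.joblib')
    assert all_model_filenames[0] == 'rf_model.joblib'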

Then on your prediction servers, make sure all the files listed in `all_model_filenames` sit together in the same folder, and load the model with `model = joblib.load(all_model_filenames[0], mmap_mode='r')`. This memory-maps the model parameters into shared memory, so for a large random forest all the Gunicorn, Celery or Storm worker processes running on the same server reuse the same memory pages. That makes it a very efficient way to deploy large models on RAM-constrained servers.
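A sketch of the loading side, assuming the same `rf_model.joblib` files were copied over (the feature matrix `X_new` is a stand-in for real inputs):

    import joblib
    import numpy as np

    # mmap_mode='r' maps the stored numpy arrays read-only instead of
    # copying them into each worker's private heap, so every process
    # that loads this file shares the same OS page-cache pages.
    model = joblib.load('rf_model.joblib', mmap_mode='r')

    X_new = np.random.rand(10, 20)  # stand-in for real feature vectors
    print(model.predict(X_new))

Running the same `joblib.load(..., mmap_mode='r')` call in every Gunicorn or Celery worker is enough; the kernel shares the read-only pages across processes.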

You can even use Docker to ship the model files as part of a container image and treat the model as binary software configuration.
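For example, a minimal Dockerfile sketch (the base image, paths and the `app.py` prediction service are assumptions, not anything from the original setup):

    FROM python:3.11-slim
    RUN pip install scikit-learn joblib gunicorn flask
    WORKDIR /app
    # Copy the main pickle plus any companion array files written by dump()
    COPY rf_model.joblib* /app/
    COPY app.py /app/
    CMD ["gunicorn", "--workers", "4", "app:app"]

Rebuilding and redeploying the image is then the only way the model version changes, which is what treating the model as binary configuration buys you.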
