Getting the results is nice, but that's "shareware", not "free software" (or, for a more modern example, it's like companies submitting firmware binary blobs into mainline Linux).
Free software means you have to be able to build the final binary from source. Having 10 TB of text is no problem, but having a data center of GPUs is. Until the training cost comes down there is no way to make it free software.
Free software means that you have the ability, both legal and practical, to customize the tool for your needs. For software, that means you have to be able to build the final binary from source (so you can adapt the source and rebuild); for ML models, it means you need the code and the model weights, which let you fine-tune the model and adapt it to different purposes without spending the compute cost of a full re-train.