Cool man! Thanks for sharing :) I wasn't familiar with the chunking approach. I will read more!
Regarding BERT, it may indeed perform better if fine-tuned correctly. For a baseline, fastText is great because it is super fast and runs on a CPU. It cost me $24 to run a 24-hour autotune on a 16-core CPU machine. fastText is also great out of the box because it builds word vectors for subwords, which helps with typos and domain-specific terms that would otherwise be out of vocabulary.
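To make the subword point concrete, here's a minimal sketch of the character n-gram idea fastText uses (word wrapped in boundary markers, n-grams of length 3–6 by default); the function name and the example words are just illustrative:

```python
def subword_ngrams(word, minn=3, maxn=6):
    # fastText wraps the word in boundary markers before extracting n-grams
    token = f"<{word}>"
    return {token[i:i + n]
            for n in range(minn, maxn + 1)
            for i in range(len(token) - n + 1)}

# A typo still shares most of its n-grams with the correct word, so the
# resulting vectors (built from the n-gram vectors) stay close -- which is
# why OOV words and typos don't just map to an unknown token.
a = subword_ngrams("kubernetes")
b = subword_ngrams("kubernets")  # typo: dropped an 'e'
overlap = len(a & b) / len(a | b)
```

Even with the dropped letter, roughly half the n-grams overlap, which is what keeps the typo's representation near the correct word's.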
I am betting that fine-tuning BERT will cost me at least 10x more. But this project is a chance to try it out :) Looking forward to v2!
Luckily, with Valohai, I get access to GPU credits for open source projects!