
For my thesis I trained a classifier on text from the internal messaging systems and forums of a large consultancy.

Most universities have their own corpora to work with, for example the Brown Corpus, the British National Corpus, and the Penn Treebank.

Similar corpora exist for images and video, usually created in association with national broadcasting services. News video is particularly interesting because it usually contains closed captions, which allow for multi-modal training.



