
Yep. You're building a statistical model of a corpus of text. Given a sequence of words, it tells you what's likely to come next, e.g.,

  [it, is] -> {sunny, 0.75}
  [it, is] -> {raining, 0.25}
That tells you that in the corpus, the phrase "it is" was followed by "sunny" 3/4 of the time and by "raining" the rest of the time.
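
For the curious, here's a rough sketch of how a table like that could be built by counting. The function name `build_model`, the `order` parameter and the toy corpus are all made up for illustration; this isn't how any particular tool does it.

```python
from collections import Counter, defaultdict

def build_model(words, order=2):
    """Map each `order`-word prefix to the probabilities of the word that follows it."""
    counts = defaultdict(Counter)
    # Slide a window over the corpus and count which word follows each prefix.
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        counts[prefix][words[i + order]] += 1
    # Turn the raw counts into relative frequencies.
    model = {}
    for prefix, followers in counts.items():
        total = sum(followers.values())
        model[prefix] = {word: n / total for word, n in followers.items()}
    return model

# Toy corpus (invented): "it is" is followed by "sunny" three times and "raining" once.
corpus = "it is sunny . it is sunny . it is raining . it is sunny .".split()
model = build_model(corpus)
print(model[("it", "is")])  # {'sunny': 0.75, 'raining': 0.25}
```

A larger `order` would presumably give text that sticks closer to the corpus, since each prefix then has fewer distinct followers.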

Once you've got that, you can use it to generate random text that has similar characteristics to the training corpus. You just seed it with a couple of starting words, then repeatedly pick the next word at random according to the probabilities you've recorded.
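
Here's a minimal sketch of that generation loop, under some assumptions of mine: the names `build_successors` and `generate`, the `length` cap and the toy corpus are hypothetical. Instead of storing probabilities explicitly, it keeps a list of every observed follower, so picking uniformly from the list gives the same weighting.

```python
import random
from collections import defaultdict

def build_successors(words, order=2):
    """Map each `order`-word prefix to the list of words seen right after it."""
    successors = defaultdict(list)
    for i in range(len(words) - order):
        # Duplicates are kept on purpose: a word seen 3 times is 3x as likely to be picked.
        successors[tuple(words[i:i + order])].append(words[i + order])
    return successors

def generate(successors, seed, length=20, rng=random):
    """Extend `seed` by up to `length` words, sampling each next word from the table."""
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        followers = successors.get(tuple(out[-order:]))
        if not followers:
            break  # this prefix only appeared at the very end of the corpus
        out.append(rng.choice(followers))
    return " ".join(out)

# Toy corpus (invented for illustration).
corpus = ("it is sunny today . it is sunny again . "
          "it is raining now . it is sunny still .").split()
table = build_successors(corpus)
print(generate(table, ("it", "is")))
```

The seed has to be a prefix that actually occurs in the corpus, otherwise the loop stops immediately and you just get the seed back.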

jwz's got one that you can play with yourself at http://www.jwz.org/dadadodo/



