Hacker News

I'm trying to figure out what I'm looking at. A quick look at what Markov Chains are doesn't really seem to explain it. This is a mashup of these two books, basically, correct?



Yep. You're building a statistical model of a corpus of text. Given an ordered set of words, it tells you what's likely to come next, e.g.,

  [it, is] -> {sunny, 0.75}
  
  [it, is] -> {raining, 0.25}
tells you that after the phrase "it is", the next word was "sunny" 3/4 of the time and "raining" the rest.
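In code, that model is just a table of counts normalized into probabilities. A minimal sketch (the function and corpus here are my own illustration, not from any particular implementation):

  from collections import Counter, defaultdict

  def build_model(words, order=2):
      """Map each `order`-word prefix to a Counter of the words that follow it."""
      model = defaultdict(Counter)
      for i in range(len(words) - order):
          prefix = tuple(words[i:i + order])
          model[prefix][words[i + order]] += 1
      return model

  corpus = "it is sunny . it is sunny . it is sunny . it is raining .".split()
  model = build_model(corpus)
  counts = model[("it", "is")]
  total = sum(counts.values())
  probs = {w: c / total for w, c in counts.items()}
  # probs == {'sunny': 0.75, 'raining': 0.25}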

Once you've got that, you can use it to generate random text with similar characteristics to the training corpus. You just seed it with a couple of starting words, then repeatedly pick the next word at random according to the probabilities you've recorded.
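Generation is the same table run forward. A hedged sketch using Python's random.choices, which samples in proportion to the counts (all names and the toy corpus are my own):

  import random
  from collections import Counter, defaultdict

  def generate(words, n=10, order=2, seed_index=0):
      # Build the prefix -> next-word count table.
      model = defaultdict(Counter)
      for i in range(len(words) - order):
          model[tuple(words[i:i + order])][words[i + order]] += 1
      # Seed with a prefix from the corpus, then sample forward.
      out = list(words[seed_index:seed_index + order])
      for _ in range(n):
          counts = model.get(tuple(out[-order:]))
          if not counts:  # dead end: this prefix only occurs at the corpus end
              break
          next_word, = random.choices(list(counts), weights=list(counts.values()))
          out.append(next_word)
      return " ".join(out)

  corpus = "it is sunny today and it is raining today and it is sunny".split()
  print(generate(corpus, n=8))

There's no grammar here at all; the output only looks plausible because short word sequences from the corpus chain together.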

jwz's got one that you can play with yourself at http://www.jwz.org/dadadodo/


Yes, it's basically a mashup of the two books. Each successive word is chosen randomly, in proportion to how often it appears following the two or so preceding words in the combined text.


Randomly generated and manually culled, yes.



