Extracting Meaning from Millions of Pages (technologyreview.com)
75 points by jaybol on Sept 4, 2011 | 11 comments



I work for the professor from the article (but not on TextRunner).

We're working on extracting meaning from reviews as well: http://revminer.com/

At the moment, it only has reviews of Seattle places (restaurants, hotels, etc.), but we're moving it to mobile. It's written using node.js and socket.io; I'd be interested in hearing any feedback.


Is it also open source?


Not yet, but we may open source the code when we publish the paper.


How does this relate to freebase.org? I see some of the js ajaxing to the freebase API.


I'd say you're extracting information, not meaning.


From the article: "For example, to find the names of people who are CEOs within millions of documents, you'd first need to train the software with other examples, such as 'Steve Jobs is CEO of Apple, Sheryl Sandberg is CEO of Facebook.'"

Sheryl Sandberg? Deliberate or honest mistake? :-]


Looks like the directory index was left open. http://textrunner.cs.washington.edu/


It's open source. Just download it:

http://reverb.cs.washington.edu/


Awesome: code released under the GPL, with several data sets. Good to see this project (which has been under development for a long time) releasing technology for other people to use.


Read The Web at CMU is also a similar system. http://rtw.ml.cmu.edu/rtw/


Hasn't this been out for, like, a long time?



