I plan to dive deeper into text mining this year and am looking for suggestions on the best resources. A friend suggested Text Mining by Weiss et al.: http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-0-387-95433-2
What would you suggest?
Information Retrieval (Manning)
Text Compression (Bell)
Natural Language Processing (Manning)
Natural Language Understanding (Allen)
Speech and Language Processing (Jurafsky)
The Text Mining Handbook (Sanger)
Statistical Machine Translation (Koehn)
Data-Intensive Text Processing with MapReduce (Lin)
Algorithms on Strings (Gusfield)
Jewels of Stringology (Crochemore)
Regular Expressions (Friedl), also: http://swtch.com/~rsc/regexp/regexp1.html and automata theory (Hopcroft)
Practical Text Mining with Perl (Bilisoly)
Natural Language Processing with Python (Bird)
Computational Linguistics (Hausser)
Syntactic Structures (Chomsky)
Also check out these links: http://measuringmeasures.blogspot.com/2010/01/learning-about...
http://measuringmeasures.com/blog/2010/3/12/learning-about-m...
http://www.cs.technion.ac.il/~gabr/resources/resources.html