Bishop, Christopher M., Pattern Recognition and Machine Learning, Information Science and Statistics (New York: Springer, 2006) <https://www.microsoft.com/en-us/research/people/cmbishop/prml-book/>
Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze, Introduction to Information Retrieval (New York: Cambridge University Press, 2008)
Tan, Pang-Ning, Michael Steinbach, and Vipin Kumar, Introduction to Data Mining, 1st ed (Addison-Wesley)
Witten, I. H., Alistair Moffat, and Timothy C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd ed (San Francisco, CA: Morgan Kaufmann, 1999)