BLUE SKY NOTEBOOK
Top 10 DM Algorithms:

1. C4.5: Decision Trees (ID3 and pruning methods)

  • J. Ross Quinlan: Induction of Decision Trees. Machine Learning 1(1): 81-106 (1986). http://www.cs.toronto.edu/~roweis/csc2515-2006/readings/quinlan.pdf
  • J. Ross Quinlan: Simplifying Decision Trees. International Journal of Man-Machine Studies 27(3): 221-234 (1987) http://citeseer.ist.psu.edu/525260.html
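
  A minimal sketch of the information-gain criterion behind ID3/C4.5-style splitting, for orientation only (the toy attribute and labels are illustrative; C4.5 itself normalizes the gain into a gain ratio and adds the pruning discussed in the second paper):

    from collections import Counter
    from math import log2

    def entropy(labels):
        """Shannon entropy of a list of class labels."""
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def information_gain(attribute_values, labels):
        """Expected entropy reduction from splitting on a discrete attribute."""
        n = len(labels)
        groups = {}
        for v, y in zip(attribute_values, labels):
            groups.setdefault(v, []).append(y)
        remainder = sum(len(ys) / n * entropy(ys) for ys in groups.values())
        return entropy(labels) - remainder

    # Toy data: weather attribute vs. play/don't-play labels.
    outlook = ["sunny", "sunny", "overcast", "rain", "rain"]
    play = ["no", "no", "yes", "yes", "no"]
    print(information_gain(outlook, play))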

2. K-Means / Spectral Clustering (taken)

  • Andrew Y. Ng, Michael I. Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. In Advances in Neural Information Processing Systems 14, 2002. http://robotics.stanford.edu/~ang/papers/nips01-spectral.ps
  • D. Arthur, S. Vassilvitskii. k-means++: The Advantages of Careful Seeding. Symposium on Discrete Algorithms (SODA), 2007. http://www.stanford.edu/~sergeiv/papers/kMeansPP-soda.pdf
  • Duda, Hart and Stork. Pattern Classification, 2nd ed. Wiley Interscience, 2000. Section 10.4.3.
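
  A bare-bones sketch of Lloyd's iteration for k-means (NumPy assumed; initialization here is plain uniform sampling, not the k-means++ seeding of Arthur and Vassilvitskii):

    import numpy as np

    def kmeans(X, k, n_iter=100, seed=0):
        """Plain Lloyd's algorithm; returns (centroids, labels)."""
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Assignment step: nearest centroid for every point.
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Update step: cluster means (keep the old centroid if a cluster empties).
            new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centroids[j] for j in range(k)])
            if np.allclose(new, centroids):
                break
            centroids = new
        return centroids, labels

    X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
    centroids, labels = kmeans(X, k=2)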

3. SVM

  • Thorsten Joachims, "Training Linear SVMs in Linear Time", KDD 2006, August 20-23, 2006, Philadelphia, Pennsylvania, USA. http://www.cs.cornell.edu/People/tj/publications/joachims_06a.pdf
  • Haykin, Simon. Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice-Hall, 1999. Chapter 6.
  • Christopher Burges, "A Tutorial on Support Vector Machines for Pattern Recognition", Data Mining and Knowledge Discovery, Vol. 2 (1998), pp. 121-167. http://www.public.asu.edu/~jye02/CLASSES/Spring-2007/Papers/PAPERS/SVM-tutorial.pdf
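
  This is not Joachims's cutting-plane method; it is only a hedged sketch of the primal problem a linear SVM solves, trained here by stochastic subgradient descent on the regularized hinge loss (Pegasos-style step size; names and constants are illustrative):

    import numpy as np

    def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
        """Minimize lam/2 ||w||^2 + (1/n) sum_i max(0, 1 - y_i <w, x_i>)
        by stochastic subgradient descent. Labels y must be in {-1, +1};
        a bias term is folded in as a constant feature."""
        rng = np.random.default_rng(seed)
        Xb = np.hstack([X, np.ones((len(X), 1))])
        w = np.zeros(Xb.shape[1])
        t = 0
        for _ in range(epochs):
            for i in rng.permutation(len(Xb)):
                t += 1
                eta = 1.0 / (lam * t)                  # decaying step size
                margin = y[i] * Xb[i].dot(w)
                grad = lam * w - (y[i] * Xb[i] if margin < 1 else 0)
                w -= eta * grad
        return w

    def predict(w, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return np.sign(Xb.dot(w))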

4. APriori

  • Agrawal R. and Srikant R. "Fast Algorithms for Mining Association Rules", VLDB 1994, Sep 12-15, Santiago, Chile, pp. 487-499. http://www.acm.org/sigmod/vldb/conf/1994/P487.PDF
  • Hand, Mannila and Smyth. Principles of Data Mining. MIT Press, 2000. Section 5.3.2.
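
  A compact, unoptimized sketch of the level-wise Apriori idea: keep only itemsets whose every (k-1)-subset is frequent, then count support. The baskets and min_support value are illustrative:

    from itertools import combinations

    def apriori(transactions, min_support):
        """Frequent itemsets by level-wise candidate generation.
        `transactions` is a list of sets; returns {frozenset: support_count}."""
        transactions = [frozenset(t) for t in transactions]
        counts = {}
        for t in transactions:                       # level 1: single items
            for item in t:
                key = frozenset([item])
                counts[key] = counts.get(key, 0) + 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result = dict(frequent)
        k = 2
        while frequent:
            items = sorted({i for s in frequent for i in s})
            # Candidates: k-itemsets all of whose (k-1)-subsets are frequent.
            candidates = [frozenset(c) for c in combinations(items, k)
                          if all(frozenset(sub) in frequent
                                 for sub in combinations(c, k - 1))]
            counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
            frequent = {s: c for s, c in counts.items() if c >= min_support}
            result.update(frequent)
            k += 1
        return result

    baskets = [{"milk", "bread"}, {"milk", "bread", "beer"},
               {"bread", "beer"}, {"milk", "beer"}]
    print(apriori(baskets, min_support=2))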

5. Expectation Maximization (EM)

  • Jeff A. Bilmes. "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models", International Computer Science Institute, 1998. http://scipp.ucsc.edu/groups/retina/articles/bilmes98gentle.pdf (Discussion of sections 1-3 is sufficient.)
  • Russell and Norvig. Artificial Intelligence: A Modern Approach, 2nd ed. Prentice-Hall, 2003. Section 20.3.
  • Duda, Hart and Stork. Pattern Classification, 2nd ed. Wiley Interscience, 2000. Section 3.9.
  • Hand, Mannila and Smyth. Principles of Data Mining. MIT Press, 2000. Section 8.4.
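
  A minimal EM sketch for a two-component 1-D Gaussian mixture, the setting Bilmes uses to introduce the algorithm; the initialization and the fixed iteration count are simplifying assumptions:

    import numpy as np

    def em_gmm_1d(x, n_iter=100):
        """EM for a two-component 1-D Gaussian mixture.
        Returns (weights, means, variances)."""
        w = np.array([0.5, 0.5])                        # mixing weights
        mu = np.array([x.min(), x.max()], dtype=float)  # crude initial means
        var = np.array([x.var(), x.var()])
        for _ in range(n_iter):
            # E-step: responsibilities r[i, j] = P(component j | x_i).
            dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
            r = dens / dens.sum(axis=1, keepdims=True)
            # M-step: re-estimate parameters from the responsibilities.
            nj = r.sum(axis=0)
            w = nj / len(x)
            mu = (r * x[:, None]).sum(axis=0) / nj
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nj
        return w, mu, var

    x = np.concatenate([np.random.normal(0, 1, 300), np.random.normal(5, 1, 200)])
    print(em_gmm_1d(x))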

6. PageRank / HITS

  • Kleinberg, J. M. 1998. Authoritative sources in a hyperlinked environment. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, California, United States, January 25-27, 1998). Society for Industrial and Applied Mathematics, Philadelphia, PA, 668-677. http://www.cs.cornell.edu/home/kleinber/auth.pdf
  • Corso, Gullí and Romani. Fast PageRank Computation via a Sparse Linear System (Extended Abstract). http://citeseer.ist.psu.edu/719287.html
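
  A small power-iteration sketch of the basic PageRank recurrence with damping factor d and uniform teleportation (the readings above go well beyond this; dangling pages are simply spread uniformly here, and the graph is illustrative):

    import numpy as np

    def pagerank(adj, d=0.85, tol=1e-10):
        """Power iteration on r = (1 - d)/n + d * M r, where M is the
        column-stochastic link matrix built from adjacency `adj`
        (adj[i][j] = 1 means a link from page i to page j)."""
        A = np.asarray(adj, dtype=float)
        n = len(A)
        out = A.sum(axis=1)
        # Row-normalize; dangling rows get a uniform 1/n distribution.
        M = np.where(out[:, None] > 0, A / np.maximum(out, 1)[:, None], 1.0 / n).T
        r = np.full(n, 1.0 / n)
        while True:
            r_new = (1 - d) / n + d * M.dot(r)
            if np.abs(r_new - r).sum() < tol:
                return r_new
            r = r_new

    # Tiny 4-page web: 0 -> {1, 2}, 1 -> 2, 2 -> 0, 3 -> 2.
    adj = [[0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0]]
    print(pagerank(adj))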

7. AdaBoost

  • Freund and Schapire. "A Short Introduction to Boosting", 1999. An introduction to AdaBoost. http://www.site.uottawa.ca/~stan/csi5387/boost-tut-ppr.pdf
  • Duda, Hart and Stork. Pattern Classification, 2nd ed. Wiley Interscience, 2000. Section 9.5.2.
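
  A hedged sketch of AdaBoost with exhaustive threshold stumps as weak learners; the alpha and weight update are the standard AdaBoost recipe described in the tutorial, while the stump search and names are illustrative:

    import numpy as np

    def adaboost_stumps(X, y, n_rounds=20):
        """AdaBoost with axis-aligned threshold stumps; labels y in {-1, +1}.
        Returns a list of (feature, threshold, polarity, alpha)."""
        n, d = X.shape
        w = np.full(n, 1.0 / n)
        model = []
        for _ in range(n_rounds):
            best = None
            # Weak learner: stump with the lowest weighted training error.
            for j in range(d):
                for thr in np.unique(X[:, j]):
                    for polarity in (1, -1):
                        pred = np.where(X[:, j] <= thr, polarity, -polarity)
                        err = w[pred != y].sum()
                        if best is None or err < best[0]:
                            best = (err, j, thr, polarity, pred)
            err, j, thr, polarity, pred = best
            err = max(err, 1e-12)                     # guard against a perfect stump
            alpha = 0.5 * np.log((1 - err) / err)     # vote weight of this stump
            w *= np.exp(-alpha * y * pred)            # up-weight misclassified points
            w /= w.sum()
            model.append((j, thr, polarity, alpha))
        return model

    def adaboost_predict(model, X):
        score = sum(alpha * np.where(X[:, j] <= thr, pol, -pol)
                    for j, thr, pol, alpha in model)
        return np.sign(score)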

8. K Nearest Neighbours (kNN)

  • Hastie, T. and Tibshirani, R. 1996. Discriminant Adaptive Nearest Neighbor Classification. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI). 18, 6 (Jun. 1996), 607-616. http://dx.doi.org/10.1109/34.506411
  • Duda, Hart and Stork. Pattern Classification, 2nd ed. Wiley Interscience, 2000. Sections 4.4-4.6.
  • Hand, Mannila and Smyth. Principles of Data Mining. MIT Press, 2000. Section 10.6.
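
  A plain, unweighted Euclidean kNN sketch, mainly as contrast to the adaptive-metric method of Hastie and Tibshirani above (toy data; the choice of k is illustrative):

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x, k=5):
        """Majority vote among the k nearest training points (Euclidean distance)."""
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dists)[:k]
        return Counter(y_train[nearest]).most_common(1)[0][0]

    X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
    y_train = np.array(["a", "a", "a", "b", "b", "b"])
    print(knn_predict(X_train, y_train, np.array([4.5, 5.0]), k=3))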

9. Naive Bayes/Chow-Liu Tree Model

  • Chow, C. and Liu, C. "Approximating discrete probability distributions with dependence trees", IEEE Transactions on Information Theory, Vol. 14, No. 3 (1968), pp. 462-467. http://ieeexplore.ieee.org/iel5/18/22639/01054142.pdf
  • H. Zhang, "The Optimality of Naive Bayes", presented at the 17th International FLAIRS Conference, Miami Beach, May 17-19, 2004. http://www.cs.unb.ca/profs/hzhang/publications/FLAIRS04ZhangH.pdf
  • Hand, Mannila and Smyth. Principles of Data Mining. MIT Press, 2000. Section 10.8.
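
  A minimal Naive Bayes sketch with independent Gaussian class-conditionals; the Chow-Liu construction, which relaxes the independence assumption with a dependence tree, is not shown. The data and smoothing constant are illustrative:

    import numpy as np

    def fit_gaussian_nb(X, y):
        """Per-class priors, feature means and variances (features assumed independent)."""
        classes = np.unique(y)
        priors = {c: np.mean(y == c) for c in classes}
        means = {c: X[y == c].mean(axis=0) for c in classes}
        vars_ = {c: X[y == c].var(axis=0) + 1e-9 for c in classes}  # variance smoothing
        return priors, means, vars_

    def predict_gaussian_nb(model, x):
        priors, means, vars_ = model
        def log_posterior(c):
            # log P(c) + sum_j log N(x_j | mean_cj, var_cj), up to a constant
            return (np.log(priors[c])
                    - 0.5 * np.sum(np.log(2 * np.pi * vars_[c])
                                   + (x - means[c]) ** 2 / vars_[c]))
        return max(priors, key=log_posterior)

    X = np.array([[1.0, 2.0], [1.2, 1.9], [4.0, 5.0], [4.2, 5.1]])
    y = np.array(["a", "a", "b", "b"])
    print(predict_gaussian_nb(fit_gaussian_nb(X, y), np.array([4.1, 5.0])))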

10. CART
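
  A hedged sketch of the Gini-impurity split search that CART applies recursively when growing a tree (growing the full tree and cost-complexity pruning are not shown; the data is illustrative):

    import numpy as np

    def gini(labels):
        """Gini impurity 1 - sum_c p_c^2 of a label array."""
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def best_split(X, y):
        """Feature/threshold pair minimizing the weighted Gini impurity of the children."""
        n, d = X.shape
        best = (None, None, np.inf)
        for j in range(d):
            for thr in np.unique(X[:, j]):
                left, right = y[X[:, j] <= thr], y[X[:, j] > thr]
                if len(left) == 0 or len(right) == 0:
                    continue
                score = (len(left) * gini(left) + len(right) * gini(right)) / n
                if score < best[2]:
                    best = (j, thr, score)
        return best

    X = np.array([[2.0, 1.0], [3.0, 1.5], [10.0, 2.0], [11.0, 2.5]])
    y = np.array([0, 0, 1, 1])
    print(best_split(X, y))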
