Monday, December 29, 2014
Wednesday, December 17, 2014
Most Popular Data Mining Algorithms
http://www2.cs.uh.edu/~ceick/DM/10Algorithms-08.pdf
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006.
Does anyone know of anything similar that is more recent?
Tuesday, December 16, 2014
K-Means Visualization
Here is the website I used in the last lecture to visualize the K-Means algorithm:
http://www.naftaliharris.com/blog/visualizing-k-means-clustering/
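For anyone who prefers code to animation, here is a minimal sketch of the two steps that K-Means alternates between (assign each point to its nearest centroid, then move each centroid to the mean of its points). The function name `kmeans`, its parameters, and the toy points are my own assumptions for illustration; they are not taken from the website above. The initial centroids are passed in explicitly because the result depends on where they start.

```python
def kmeans(points, centroids, max_iterations=100):
    """Plain K-Means (Lloyd's algorithm) on a list of equal-length tuples.

    `centroids` is the list of starting centroids chosen by the caller.
    Returns the final centroids and the clusters from the last assignment step.
    """
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                mean = tuple(sum(col) / len(cluster) for col in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid

        # Stop when no centroid moved; the clusters then match the centroids.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (1, 0)])
print(centroids)
```

Try different starting centroids on the same points. Some choices converge to a poor clustering, which is the same sensitivity to initialization that you can explore interactively on the website.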
Saturday, December 13, 2014
Boosting vs Bagging
This is a paper that compares bagging and boosting on decision trees:
http://home.eng.iastate.edu/~julied/classes/ee547/Handouts/q.aaai96.pdf
The paper shows that both bagging and boosting improve over individual trees and that boosting usually gives better results than bagging, although in some cases boosting fails, probably due to its tendency to get distracted by noisy records.
Note that Quinlan is an important name in the field of machine learning. He is the one who introduced C4.5.
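To make the difference concrete, here is a minimal sketch of both ideas. It is not the setup used in the paper: instead of C4.5 trees it uses one-level decision stumps on 1-D data, and all function names (`fit_stump`, `bagging`, `adaboost`) are hypothetical helpers written for this post. Bagging trains each stump on a bootstrap sample and takes an unweighted vote. AdaBoost trains on all records but increases the weight of misclassified ones after each round, which is also a plausible reason why it can chase noisy records.

```python
import math
import random


def fit_stump(xs, ys, weights):
    """Return the (threshold, polarity) stump with the lowest weighted error.

    A stump predicts `polarity` when x > threshold and `-polarity` otherwise.
    """
    best = None
    for t in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for x, y, w in zip(xs, ys, weights)
                if (polarity if x > t else -polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best[1], best[2]


def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x > threshold else -polarity


def bagging(xs, ys, rounds=25, seed=0):
    """Train stumps on bootstrap samples and combine them by majority vote."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        sample_x = [xs[i] for i in idx]
        sample_y = [ys[i] for i in idx]
        stumps.append(fit_stump(sample_x, sample_y, [1.0 / n] * n))
    return lambda x: 1 if sum(stump_predict(s, x) for s in stumps) >= 0 else -1


def adaboost(xs, ys, rounds=50):
    """AdaBoost with stumps: reweight the records, then take a weighted vote."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(xs, ys, weights)
        preds = [stump_predict(stump, x) for x in xs]
        err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified records (y * p == -1) gain weight, the rest lose weight.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return lambda x: 1 if sum(a * stump_predict(s, x) for a, s in ensemble) >= 0 else -1


# Labels are +1 inside [3, 6] and -1 outside, so no single stump fits them.
xs = list(range(10))
ys = [1 if 3 <= x <= 6 else -1 for x in xs]
boosted = adaboost(xs, ys)
print([boosted(x) for x in xs])
```

Labels are encoded as +1 and -1 so that the weight update can be written with the product `y * p`.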
Thursday, December 11, 2014
Netflix
Here are a few articles related to the Netflix prize:
Lessons from the Netflix prize challenge (By the contest winners).
The BellKor Solution (2007)
The BellKor Solution (2008)
The BellKor Solution (2009)
The Pragmatic Theory Solution (2009)
The Big Chaos Solution (2009)
De-anonymization of the Netflix Dataset (It turned out to be much easier than I thought!)
On Bootstrapping Vs Cross Validation
Here is a link from Jan about the difference between cross validation and bootstrapping:
http://www.r-bloggers.com/comparing-the-bootstrap-and-cross-validation/
Here is also a link to a very famous paper (published in 1995) that compares bootstrapping and cross validation:
http://www.cs.iastate.edu/~jtian/cs573/Papers/Kohavi-IJCAI-95.pdf
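The practical difference is in how the records are split into training and test sets. Here is a minimal sketch of both splitting schemes, assuming we only need record indices; the function names `k_fold_splits` and `bootstrap_splits` are hypothetical and not taken from either link. In k-fold cross validation every record is tested exactly once. In bootstrapping the training set is drawn with replacement, so it contains repeats, and the records that were never drawn (the out-of-bag records) serve as the test set. On average a bootstrap sample contains about 63.2% of the distinct records.

```python
import random


def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross validation."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def bootstrap_splits(n, rounds, seed=0):
    """Yield (train_indices, test_indices) for bootstrap evaluation.

    The training indices are drawn with replacement; the test indices are
    the out-of-bag records that were never drawn in that round.
    """
    rng = random.Random(seed)
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        in_bag = set(train)
        test = [j for j in range(n) if j not in in_bag]
        yield train, test


for train, test in k_fold_splits(10, 5):
    print("cv       ", sorted(train), sorted(test))
for train, test in bootstrap_splits(10, 3):
    print("bootstrap", sorted(train), sorted(test))
```

To estimate accuracy, train the model on `train`, evaluate it on `test`, and average the results over all splits.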
Monday, December 8, 2014
Model Ensembles Again!
Here is a good, though somewhat dated, reference and experimental comparison of ensemble methods.
Wednesday, December 3, 2014
Model Ensembles
I have added some "borrowed" slides on model ensembles. These slides summarize a good portion of what we have covered in class, but not everything.
Make sure also to review what I write on the board.