# Mining Massive Datasets
- Author(s): Jure Leskovec, Anand Rajaraman, Jeff Ullman
- [Link](http://mmds.org/)
---
## Summary
The book is based on the [Stanford Computer Science](http://cs.stanford.edu) course [CS246: Mining Massive Datasets](http://cs246.stanford.edu) (and [CS345A: Data Mining](http://infolab.stanford.edu/~ullman/mining/2009/index.html)).
This is the third edition of the book. It contains new material on Spark, TensorFlow, minhashing, community-finding, SimRank, graph algorithms, and decision trees. There is a new Chapter 13, covering deep learning.
---
### Stanford big data courses, also discussed on the source linked above
#### CS246
[CS246: Mining Massive Datasets](http://cs246.stanford.edu/) is a graduate-level course that discusses data mining and machine learning algorithms for analyzing very large amounts of data. The emphasis is on MapReduce as a tool for creating parallel algorithms that can process very large amounts of data.
#### CS341
[CS341: Project in Mining Massive Data Sets](http://cs341.stanford.edu/) is an advanced project-based course. Students work on data mining and machine learning algorithms for analyzing very large amounts of data. Both interesting big datasets and computational infrastructure (a large MapReduce cluster) are provided by course staff. Generally, students first take [CS246](http://cs246.stanford.edu/), followed by [CS341](http://cs341.stanford.edu/).
[CS341](http://cs341.stanford.edu/) is generously supported by [Amazon](http://www.amazon.com), which gives the course access to its [EC2](http://aws.amazon.com/ec2/) platform.
#### CS224W
[CS224W: Social and Information Networks](http://cs224w.stanford.edu/) is a graduate-level course that covers recent research on the structure and analysis of large social and information networks, and on models and algorithms that abstract their basic properties. The class explores how to practically analyze large-scale network data and how to reason about it through models for network structure and evolution.
#### You can take Stanford courses!
If you are not a Stanford student, you can still take [CS246](http://cs246.stanford.edu/) as well as [CS224W](http://cs224w.stanford.edu/) or earn a [Stanford Mining Massive Datasets graduate certificate](http://scpd.stanford.edu/public/category/courseCategoryCertificateProfile.do?method=load&certificateId=10555807) by completing a sequence of four Stanford Computer Science courses. A graduate certificate is a great way to keep the skills and knowledge in your field current. More information is available at the [Stanford Center for Professional Development (SCPD)](http://scpd.stanford.edu/public/category/courseCategoryCertificateProfile.do?method=load&certificateId=10555807).
---
#### Related
#methods #data_mining