khanhnamle1994 / Data Mining

Lecture slides and quizzes for Leskovec, Rajaraman, and Ullman's "Mining of Massive Datasets" Stanford course

Mining of Massive Datasets

Jure Leskovec, Anand Rajaraman and Jeff Ullman welcome you to the self-paced version of the on-line course based on the book Mining of Massive Datasets. It is intended for people who have a reasonable undergraduate education in Computer Science, including courses in data structures, algorithms, databases, calculus, statistics, and linear algebra.

In this course, you will learn many of the interesting algorithms that have been developed for efficient processing of large amounts of data in order to extract simple and useful models of that data. These techniques are often used to predict properties of future instances of the same sort of data, or simply to make sense of the data already available. Many people view data mining, or "big data," as machine learning. There are indeed some techniques for processing large datasets that can be considered machine learning, and we shall cover a number of these. But there are also many algorithms and ideas for dealing with big data that are not usually classified as machine learning, and we shall cover many of these as well.

Course Outline

The course is divided into 15 modules of videos and homework, plus a final exam. In the synchronous version of the course, the material is intended to be covered in seven weeks. However, you are free to spend more or less time learning this material. Here is a list of the 15 modules:

  1. MapReduce
  2. Link Analysis (PageRank)
  3. Locality-Sensitive Hashing
  4. Distance Measures and Nearest-Neighbor Learning
  5. Frequent Itemset Analysis
  6. Social-Network Graphs
  7. Algorithms for Data Streams
  8. Recommendation Systems
  9. Dimensionality Reduction
  10. Clustering
  11. Computational Advertising
  12. Machine Learning
  13. More on MapReduce Algorithms
  14. More on Locality-Sensitive Hashing
  15. More on Link Analysis
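
To give a flavor of the first module, here is a minimal single-machine sketch of the MapReduce pattern applied to word counting. It is not taken from the course materials: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and everything runs in one Python process, whereas a real MapReduce system distributes these phases across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    docs = ["big data is big", "data mining"]
    print(word_count(docs))
```

The three functions are kept separate to mirror the phases of the pattern: only the map and reduce steps are problem-specific, while the grouping step stays the same for any key-value job.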

Course Materials

The material found in this course is supported by a free on-line book, with the same title and authors as the course itself. The book is published by Cambridge University Press, but, by courtesy of the publisher, you can download a free copy at www.mmds.org. In addition to the videos provided, the slide sets used in each video can be accessed via the "Handouts" link beneath each video.
