jbrownlee / Datasets

Machine learning datasets used in tutorials on MachineLearningMastery.com

Machine Learning Datasets

This repository contains a copy of machine learning datasets used in tutorials on MachineLearningMastery.com.

This repository was created to ensure that the datasets used in tutorials remain available and are not dependent upon unreliable third parties.

All regression and classification CSV files follow the same conventions: there is no header line, there is no whitespace between columns, the target is the last column, and missing values are marked with a question mark character ('?').
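
The conventions above can be illustrated with a minimal sketch that uses only the Python standard library. The sample rows and the `load_rows` helper are hypothetical and are not taken from any file in the repository; they only mimic the stated layout (no header, comma-separated, target last, '?' for missing values).

```python
import csv
import io

# Hypothetical sample rows written in the repository's CSV convention:
# no header line, no whitespace between columns, target in the last
# column, and '?' marking a missing value.
SAMPLE = "5.1,3.5,?,0\n4.9,?,1.4,1\n6.2,2.9,4.3,1\n"


def load_rows(handle):
    """Split each row into (features, target); '?' cells become None."""
    features, targets = [], []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        cells = [None if cell == "?" else cell for cell in row]
        features.append(cells[:-1])  # every column except the last
        targets.append(cells[-1])    # the last column is the target
    return features, targets


X, y = load_rows(io.StringIO(SAMPLE))
print(X[0])  # ['5.1', '3.5', None]
print(y)     # ['0', '1', '1']
```

Values are left as strings here because some datasets contain categorical columns; converting to numbers is left to the caller. If pandas is available, `pandas.read_csv(path, header=None, na_values='?')` is a common way to get the same handling in one call.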

In many cases, tutorials link directly to the raw dataset URL, so dataset filenames should not be changed once they have been added to the repository.
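
As a rough sketch of why filenames matter, a raw URL is typically just a fixed prefix plus the filename. The prefix below (GitHub's raw-content host and the `master` branch) and the `raw_url` helper are assumptions for illustration and are not stated in this README.

```python
from urllib.parse import quote

# Assumed layout of raw file URLs on GitHub; the host and branch name
# are assumptions, not something this README specifies.
RAW_BASE = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/"


def raw_url(filename):
    """Build the assumed raw URL for a dataset filename in this repository."""
    return RAW_BASE + quote(filename)


print(raw_url("pima-indians-diabetes.csv"))
```

Because the filename is part of the URL, renaming a file would break any tutorial that links to the old address.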

Datasets

This section provides a summary of the datasets in this repository.

Binary Classification Datasets

  • Breast Cancer (Wisconsin) (breast-cancer-wisconsin.csv)
  • Breast Cancer (Yugoslavia) (breast-cancer.csv)
  • Breast Cancer (Haberman's) (haberman.csv)
  • Bank Note Authentication (banknote_authentication.csv)
  • Horse Colic (horse-colic.csv)
  • Ionosphere (ionosphere.csv)
  • Pima Indians Diabetes (pima-indians-diabetes.csv)
  • Sonar Returns (sonar.csv)
  • German Credit (german.csv)
  • Credit Card Fraud (creditcard.csv.zip)
  • Adult Income (adult-all.csv)
  • Mammography (mammography.csv)
  • Oil Spill (oil-spill.csv)
  • Phoneme (phoneme.csv)

Multiclass Classification Datasets

  • Glass Identification (glass.csv)
  • Iris Flower Species (iris.csv)
  • Wheat Seeds (wheat-seeds.csv)
  • Wine (wine.csv)
  • Ecoli (ecoli.csv)
  • Thyroid Gland (new-thyroid.csv)

Regression Datasets

  • Boston Housing (housing.csv)
  • Auto Insurance Total Claims (auto-insurance.csv)
  • Auto Imports Prices (auto_imports.csv)
  • Abalone Age (abalone.csv)
  • Wine Quality Red (winequality-red.csv)
  • Wine Quality White (winequality-white.csv)

Univariate Time Series Datasets

  • Daily Minimum Temperatures in Melbourne (daily-min-temperatures.csv)
  • Daily Maximum Temperatures in Melbourne (daily-max-temperatures.csv)
  • Daily Female Births in California (daily-total-female-births.csv)
  • Monthly International Airline Passengers (monthly-airline-passengers.csv)
  • Monthly Armed Robberies in Boston (monthly-robberies.csv)
  • Monthly Sunspots (monthly-sunspots.csv)
  • Monthly Champagne Sales (monthly_champagne_sales.csv)
  • Monthly Shampoo Sales (monthly-shampoo-sales.csv)
  • Monthly Car Sales (monthly-car-sales.csv)
  • Monthly Mean Temperatures in Nottingham Castle (monthly-mean-temp.csv)
  • Monthly Specialty Writing Paper Sales (monthly-writing-paper-sales.csv)
  • Yearly Water Usage in Baltimore (yearly-water-usage.csv)

Multivariate Time Series Datasets

  • Hourly Pollution Levels in Beijing (pollution.csv)
  • Minutely Individual Household Electric Power Consumption (household_power_consumption.zip)
  • Human Activity Recognition Using Smartphones (HAR_Smartphones.zip)
  • Indoor Movement Prediction (IndoorMovement.zip)
  • Yearly Longley Economic Employment (longley.csv)

Natural Language Processing

  • Flickr 8k Photo Caption Dataset (Flickr8k_Dataset.zip, Flickr8k_text.zip)
  • Movie Review Polarity (review_polarity.tar.gz)
  • German to English Translation (deu-eng.txt)
  • The Republic, by Plato (republic.txt)

ARFF Datasets

  • Weka UCI Datasets (weka-datasets.zip)
  • Weka Numeric Datasets (weka-datasets-numeric.zip)