machinecurve / extra_keras_datasets

📃🎉 Additional datasets for tensorflow.keras

Powered by MachineCurve at www.machinecurve.com 🚀

Hi there, and welcome to the extra-keras-datasets module! This extension to the original tensorflow.keras.datasets module offers easy access to additional datasets, which you load in almost the same way as the built-in ones.

The extra-keras-datasets module is not affiliated, associated, authorized, endorsed by, or in any way officially connected with TensorFlow, Keras, or any of its subsidiaries or its affiliates. The official TensorFlow and Keras websites can be found at https://www.tensorflow.org/ and https://keras.io/.

The names TensorFlow, Keras, as well as related names, marks, emblems and images are registered trademarks of their respective owners.

How to use this module?

Dependencies

This package uses TensorFlow 2.x, specifically tensorflow.keras, so make sure TensorFlow is installed. You can install it as follows:

  • pip install tensorflow

Installation procedure

Installation is easy and can be done with pip: pip install extra-keras-datasets. The package depends on numpy, scipy, pandas and scikit-learn, which are installed automatically.
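If you want to confirm that the dependencies listed above are importable before loading a dataset, a minimal sketch using only the standard library could look like this. The helper name `missing_modules` is hypothetical and not part of this package.

```python
from importlib.util import find_spec

# Import names of the packages mentioned above. Note that scikit-learn is
# imported as "sklearn", which differs from the name used with pip.
REQUIRED_MODULES = ("tensorflow", "numpy", "scipy", "pandas", "sklearn")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the import names from `modules` that cannot be found.

    Hypothetical helper for illustration; it only looks the modules up
    and does not import them.
    """
    return [name for name in modules if find_spec(name) is None]


# An empty list means every dependency was found in this environment.
print(missing_modules())
```

Any name that shows up in the printed list can be installed with pip, as described above.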

Datasets

EMNIST-Balanced

Extended MNIST (EMNIST) contains digits as well as uppercase and lowercase handwritten letters. EMNIST-Balanced contains 131,600 characters across 47 balanced classes.

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='balanced')


EMNIST-ByClass

Extended MNIST (EMNIST) contains digits as well as uppercase and lowercase handwritten letters. EMNIST-ByClass contains 814,255 characters across 62 unbalanced classes.

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='byclass')


EMNIST-ByMerge

Extended MNIST (EMNIST) contains digits as well as uppercase and lowercase handwritten letters. EMNIST-ByMerge contains 814,255 characters across 47 unbalanced classes.

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='bymerge')


EMNIST-Digits

Extended MNIST (EMNIST) contains digits as well as uppercase and lowercase handwritten letters. EMNIST-Digits contains 280,000 characters across 10 balanced classes (digits only).

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='digits')


EMNIST-Letters

Extended MNIST (EMNIST) contains digits as well as uppercase and lowercase handwritten letters. EMNIST-Letters contains 145,600 characters across 26 balanced classes (letters only).

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='letters')


EMNIST-MNIST

Extended MNIST (EMNIST) contains digits as well as uppercase and lowercase handwritten letters. EMNIST-MNIST contains 70,000 characters across 10 balanced classes (equal to keras.datasets.mnist).

from extra_keras_datasets import emnist
(input_train, target_train), (input_test, target_test) = emnist.load_data(type='mnist')
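All six EMNIST variants above are loaded through the same call and differ only in the `type` argument. If you select the variant at runtime, a small wrapper can catch typos before anything is loaded. This is only a sketch: `load_emnist` and its `loader` parameter are hypothetical names introduced here, not part of the package.

```python
# The type names used in the EMNIST examples above.
EMNIST_TYPES = ("balanced", "byclass", "bymerge", "digits", "letters", "mnist")


def load_emnist(variant, loader=None):
    """Load one EMNIST variant after checking that its name is known.

    `loader` is a hypothetical hook that lets you swap in a stand-in
    function for testing; by default emnist.load_data is used.
    """
    if variant not in EMNIST_TYPES:
        raise ValueError(
            f"Unknown EMNIST type {variant!r}; expected one of {EMNIST_TYPES}"
        )
    if loader is None:
        # Imported lazily so the name check works without the package.
        from extra_keras_datasets import emnist
        loader = emnist.load_data
    return loader(type=variant)
```

With the package installed, `load_emnist('letters')` should behave like the EMNIST-Letters example above, while `load_emnist('leters')` raises a `ValueError` straight away.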


KMNIST-KMNIST

Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset: it contains 70,000 28x28 grayscale images of Japanese Kuzushiji characters.

from extra_keras_datasets import kmnist
(input_train, target_train), (input_test, target_test) = kmnist.load_data(type='kmnist')


KMNIST-K49

Kuzushiji-49 extends Kuzushiji-MNIST and contains 270,912 images across 49 classes.

from extra_keras_datasets import kmnist
(input_train, target_train), (input_test, target_test) = kmnist.load_data(type='k49')


SVHN-Normal

The Street View House Numbers dataset (SVHN) contains 32x32 cropped images of house numbers obtained from Google Street View. There are 73,257 digits for training and 26,032 digits for testing. Only noncommercial use is allowed: see the SVHN website for more information.

from extra_keras_datasets import svhn
(input_train, target_train), (input_test, target_test) = svhn.load_data(type='normal')


SVHN-Extra

SVHN-Extra extends SVHN-Normal with 531,131 less difficult samples, for a total of 604,388 digits for training and 26,032 digits for testing. Only noncommercial use is allowed: see the SVHN website for more information.

from extra_keras_datasets import svhn
(input_train, target_train), (input_test, target_test) = svhn.load_data(type='extra')
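The SVHN-Extra training total follows from the numbers above: 73,257 + 531,131 = 604,388. As a quick sanity check after loading, you could compare the split sizes with these figures. The sketch below assumes the returned inputs support `len()` (as NumPy arrays do); `check_split_sizes` is a hypothetical helper, not part of the package.

```python
# Sample counts quoted in the SVHN descriptions above: (train, test).
EXPECTED_SIZES = {
    "svhn-normal": (73_257, 26_032),
    "svhn-extra": (73_257 + 531_131, 26_032),
}


def check_split_sizes(name, input_train, input_test):
    """Return True if both splits have the documented number of samples."""
    expected = EXPECTED_SIZES[name]
    actual = (len(input_train), len(input_test))
    return actual == expected


print(EXPECTED_SIZES["svhn-extra"][0])  # 604388
```

After the SVHN-Extra call above, `check_split_sizes('svhn-extra', input_train, input_test)` is expected to return `True` if the data was loaded completely.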


STL-10

The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning and self-taught learning algorithms. It contains 5,000 training images and 8,000 testing images across 10 classes (airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck).

from extra_keras_datasets import stl10
(input_train, target_train), (input_test, target_test) = stl10.load_data()
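To turn integer targets into readable class names, a lookup like the one below may help. It rests on two assumptions that you should verify against your own data: that labels follow the order in which the classes are listed above, and that you know the smallest label value (pass it as `first_label`; inspecting `target_train.min()` will tell you). `class_name` is a hypothetical helper.

```python
# Class names in the order given in the description above (assumed to
# match the label order; verify this before relying on it).
STL10_CLASSES = (
    "airplane", "bird", "car", "cat", "deer",
    "dog", "horse", "monkey", "ship", "truck",
)


def class_name(label, first_label=0):
    """Map an integer label to a class name.

    `first_label` is the value of the smallest label in your targets.
    """
    index = int(label) - first_label
    if not 0 <= index < len(STL10_CLASSES):
        raise ValueError(f"Label {label!r} is outside the expected range")
    return STL10_CLASSES[index]


print(class_name(0))                  # airplane
print(class_name(10, first_label=1))  # truck
```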


Iris

This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.

Predicted attribute: class of iris plant.

from extra_keras_datasets import iris
(input_train, target_train), (input_test, target_test) = iris.load_data(test_split=0.2)
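With 3 classes of 50 instances there are 150 samples, so `test_split=0.2` should leave roughly 120 samples for training and 30 for testing. The sketch below illustrates the idea of a shuffled split in plain Python. It is not the package's implementation, whose rounding and shuffling may differ; `split_sizes` and `shuffled_split` are hypothetical helpers.

```python
import random


def split_sizes(n_samples, test_split):
    """Return (n_train, n_test) for a given test fraction."""
    n_test = int(round(n_samples * test_split))
    return n_samples - n_test, n_test


def shuffled_split(samples, test_split, seed=0):
    """Shuffle sample indices, then cut them into a train and a test part."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_train, _ = split_sizes(len(samples), test_split)
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


print(split_sizes(150, 0.2))  # (120, 30)
```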


Wine Quality dataset

This dataset covers red and white vinho verde wine samples from the north of Portugal. According to the creators, "[the] goal is to model wine quality based on physicochemical tests". Various chemical properties of the wine are available (inputs), as well as its quality score (targets).

  • Input structure: (fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, wine type)
  • Target structure: quality score between 0 and 10

from extra_keras_datasets import wine_quality
(input_train, target_train), (input_test, target_test) = wine_quality.load_data(which_data='both', test_split=0.2, shuffle=True)
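Because each input is a plain vector of 12 values, it can help to attach the feature names from the input structure above when inspecting a sample. This is a sketch only: `label_features` is a hypothetical helper, the example values are made up for illustration, and how the wine type is encoded is not specified here.

```python
# Feature names in the order of the input structure described above.
WINE_FEATURES = (
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol", "wine type",
)


def label_features(row):
    """Pair one input vector with the feature names, in order."""
    if len(row) != len(WINE_FEATURES):
        raise ValueError(
            f"Expected {len(WINE_FEATURES)} values, got {len(row)}"
        )
    return dict(zip(WINE_FEATURES, row))


# Made-up values, not taken from the dataset.
example = [7.0, 0.3, 0.3, 2.0, 0.05, 30.0, 100.0, 0.99, 3.2, 0.5, 10.0, 1.0]
print(label_features(example)["alcohol"])  # 10.0
```

After loading, `label_features(input_train[0])` would give you the first training sample as a dictionary.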


USPS Handwritten Digits Dataset

This dataset presents thousands of 16x16 grayscale images of handwritten digits, derived from real USPS mail.

  • Input structure: 16x16 image
  • Target structure: digit ranging from 0.0 - 9.0 describing the input

from extra_keras_datasets import usps
(input_train, target_train), (input_test, target_test) = usps.load_data()
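Since the targets are floats from 0.0 to 9.0, you will usually convert them to integer class indices or one-hot vectors before training a classifier. The plain-Python sketch below shows the idea; `to_one_hot` is a hypothetical helper, and in a real project `tensorflow.keras.utils.to_categorical` is the usual tool for this.

```python
def to_one_hot(targets, num_classes=10):
    """Convert float digit labels such as 3.0 into one-hot rows."""
    encoded = []
    for target in targets:
        index = int(target)
        # Reject values that are not whole digits in the expected range.
        if index != target or not 0 <= index < num_classes:
            raise ValueError(f"Unexpected target value {target!r}")
        row = [0.0] * num_classes
        row[index] = 1.0
        encoded.append(row)
    return encoded


print(to_one_hot([3.0], num_classes=5))  # [[0.0, 0.0, 0.0, 1.0, 0.0]]
```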


Contributors and other references

  • EMNIST dataset:
  • KMNIST dataset:
    • Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., & Ha, D. (2018). Deep learning for classical Japanese literature. arXiv preprint arXiv:1812.01718. Retrieved from https://arxiv.org/abs/1812.01718
  • SVHN dataset:
  • STL-10 dataset:
  • Iris dataset:
    • Fisher, R. A. "The use of multiple measurements in taxonomic problems." Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).
  • Wine Quality dataset:
    • P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
  • USPS Handwritten Digits Dataset:
    • Hull, J. J. (1994). A database for handwritten text recognition research. IEEE Transactions on pattern analysis and machine intelligence, 16(5), 550-554.

License

The licensable parts of this repository are licensed under an MIT License, so you're free to use this repo in your machine learning projects, blogs, exercises, and so on. Happy engineering! 🚀
