
ThoroughImages / EasySparse

Licence: MIT
Sparse learning in TensorFlow using data acquired from Spark.

Programming Languages

python
scala

Projects that are alternatives of or similar to EasySparse

Tensorflow template application
TensorFlow template application for deep learning
Stars: ✭ 1,851 (+8714.29%)
Mutual labels:  libsvm, tfrecords
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (+52.38%)
Mutual labels:  deeplearning
learning2hash.github.io
Website for "A survey of learning to hash for Computer Vision" https://learning2hash.github.io
Stars: ✭ 14 (-33.33%)
Mutual labels:  deeplearning
grblas
Python wrapper around GraphBLAS
Stars: ✭ 22 (+4.76%)
Mutual labels:  sparse
MPNet
Motion Planning Networks
Stars: ✭ 139 (+561.9%)
Mutual labels:  deeplearning
InferenceHelper
C++ Helper Class for Deep Learning Inference Frameworks: TensorFlow Lite, TensorRT, OpenCV, OpenVINO, ncnn, MNN, SNPE, Arm NN, NNabla, ONNX Runtime, LibTorch, TensorFlow
Stars: ✭ 142 (+576.19%)
Mutual labels:  deeplearning
Forecasting-Solar-Energy
Forecasting Solar Power: Analysis of using an LSTM Neural Network
Stars: ✭ 23 (+9.52%)
Mutual labels:  deeplearning
Learning-Lab-C-Library
This library provides a set of basic functions for different types of deep learning (and other) algorithms in C. This deep learning library will be constantly updated.
Stars: ✭ 20 (-4.76%)
Mutual labels:  deeplearning
deeplearning
A collection of deep learning study materials, gathered so that many people can reference them while learning.
Stars: ✭ 44 (+109.52%)
Mutual labels:  deeplearning
img ai app boilerplate
An image classification app boilerplate to serve your deep learning models asap!
Stars: ✭ 27 (+28.57%)
Mutual labels:  deeplearning
night image semantic segmentation
[ICIP 2019]: The official GitHub repository for the paper "What's There in The Dark", accepted at the IEEE International Conference on Image Processing 2019 (ICIP 2019), Taipei, Taiwan.
Stars: ✭ 25 (+19.05%)
Mutual labels:  deeplearning
ghiaseddin
Author's implementation of the paper "Deep Relative Attributes" (ACCV 2016)
Stars: ✭ 41 (+95.24%)
Mutual labels:  deeplearning
focalloss
Focal loss for multi-class classification in TensorFlow
Stars: ✭ 75 (+257.14%)
Mutual labels:  deeplearning
OCR
Optical character recognition Using Deep Learning
Stars: ✭ 25 (+19.05%)
Mutual labels:  deeplearning
Chatbot
A Deep-Learning multi-purpose chatbot made using Python3
Stars: ✭ 36 (+71.43%)
Mutual labels:  deeplearning
SpeechEnhancement
Combining Weighted Multi-resolution STFT Loss and Distance Fusion to Optimize Speech Enhancement Generative Adversarial Networks
Stars: ✭ 49 (+133.33%)
Mutual labels:  deeplearning
SwitchNorm Detection
The code of Switchable Normalization for object detection based on Detectron.pytorch.
Stars: ✭ 79 (+276.19%)
Mutual labels:  deeplearning
DLInfBench
CNN model inference benchmarks for some popular deep learning frameworks
Stars: ✭ 51 (+142.86%)
Mutual labels:  deeplearning
KeywordSpotting
Train a 4-layer Convolutional Neural Network to detect trigger word
Stars: ✭ 49 (+133.33%)
Mutual labels:  tfrecords
QmapCompression
Official implementation of "Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform", ICCV 2021
Stars: ✭ 27 (+28.57%)
Mutual labels:  deeplearning

EasySparse

Motivation

In production environments, we find that TensorFlow deals poorly with sparse learning scenarios. Even after reading records out of a TFRecord file, feeding them into a deep learning model can be a hard nut to crack. We have therefore open-sourced this project to help researchers and engineers build their own models on sparse data while hiding the difficulties.

This project naturally fits scenarios in which one uses TensorFlow to build deep learning models from data acquired from Spark; it generalizes easily to other settings.

Data Flow

[Data flow diagram]

Programs

spark_to_libsvm.scala Read data from Spark into a LibSVM file, one-hot encoding features on demand.

libsvm_to_tfrecord.py Convert a LibSVM file into a TFRecord.
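
A LibSVM line has the form "label index:value index:value ...", e.g. "1 3:0.5 17:1.0". As a hedged illustration of what such a conversion can look like (the feature keys 'label', 'index', 'value' and the file names below are assumptions, not necessarily those used by libsvm_to_tfrecord.py):

    # Illustrative sketch of a LibSVM -> TFRecord conversion; feature keys
    # and file names are assumptions, not the repository's actual code.
    import tensorflow as tf

    def libsvm_line_to_example(line):
        # "1 3:0.5 17:1.0" -> label 1.0, sparse indices [3, 17], values [0.5, 1.0]
        tokens = line.strip().split()
        label = float(tokens[0])
        pairs = [t.split(':') for t in tokens[1:]]
        indices = [int(i) for i, _ in pairs]
        values = [float(v) for _, v in pairs]
        return tf.train.Example(features=tf.train.Features(feature={
            'label': tf.train.Feature(float_list=tf.train.FloatList(value=[label])),
            'index': tf.train.Feature(int64_list=tf.train.Int64List(value=indices)),
            'value': tf.train.Feature(float_list=tf.train.FloatList(value=values)),
        }))

    writer = tf.python_io.TFRecordWriter('train.tfrecord')
    with open('train.libsvm') as fin:
        for line in fin:
            writer.write(libsvm_line_to_example(line).SerializeToString())
    writer.close()

Storing the indices and values as separate variable-length features keeps each record sparse instead of materializing the full feature vector.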

train.py Training code for a fully connected neural network, with multi-GPU support.

test.py Test the performance of the trained model.
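
Since the test set is read straight from LibSVM (see the implementation notes below) and scikit-learn is only required here, evaluation plausibly resembles the following sketch; the file name and the AUC metric are assumptions for illustration:

    # Hedged sketch: load the test LibSVM file and score model predictions.
    from sklearn.datasets import load_svmlight_file
    from sklearn.metrics import roc_auc_score

    X_test, y_test = load_svmlight_file('test.libsvm')  # SciPy sparse matrix + labels
    # y_score = run_trained_model(X_test)  # hypothetical: scores from the trained model
    # print(roc_auc_score(y_test, y_score))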

utils.py Contains all the functions used in training and testing.

Usage

  1. Read data from Spark, one-hot encode selected features, and write them to LibSVM. Be sure to manually split the data into three LibSVM files, one each for training, validation, and testing (spark_to_libsvm.scala).
  2. Transform the training LibSVM file into TFRecord (libsvm_to_tfrecord.py).
  3. Run the training program (train.py).
  4. Test the trained model (test.py).

Environment

  1. Spark v2.0
  2. TensorFlow >= v0.12.1

Python Package Requirements

  1. NumPy (required)
  2. scikit-learn (only required in test.py)
  3. TensorFlow (required, >= v0.12.1)

Implementation Notes

  1. During training, train.py reads all the validation data from the LibSVM file into memory and draws shuffled training batches from the TFRecord file. Likewise, during testing, all the test data is read from the LibSVM file. One therefore does not need to convert the validation and test LibSVM files to TFRecords. This implementation may fail when the validation or test set is too large to fit into memory, although that rarely happens, since validation and test sets are usually much smaller than the training set. If it does happen, one needs to write TFRecord file queues for the validation and test sets as well.
  2. All the parameters required for training are defined at the top of train.py. We use a two-layer fully connected network to model the data, trained with AdamOptimizer and an exponentially decayed learning rate (see the sketch after these notes).
  3. One can experiment with various types of deep learning models by modifying only the model definition in train.py.
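
Below is a condensed, hedged sketch of the ideas in notes 1 and 2, written against the queue-based input API of the TensorFlow 0.12/1.x era that this project targets. The feature keys, layer sizes, and hyperparameters are illustrative assumptions, not the values used in train.py.

    # Illustrative sketch only; feature keys, sizes, and hyperparameters
    # are assumptions, not the values in train.py.
    import tensorflow as tf

    NUM_FEATURES = 10000   # assumed input dimensionality
    HIDDEN_UNITS = 256
    BATCH_SIZE = 128

    # Note 1: shuffled training batches come from the TFRecord file.
    filename_queue = tf.train.string_input_producer(['train.tfrecord'])
    reader = tf.TFRecordReader()
    _, serialized = reader.read(filename_queue)
    parsed = tf.parse_single_example(serialized, features={
        'label': tf.FixedLenFeature([1], tf.float32),
        'index': tf.VarLenFeature(tf.int64),
        'value': tf.VarLenFeature(tf.float32),
    })
    labels, sp_ids, sp_vals = tf.train.shuffle_batch(
        [parsed['label'], parsed['index'], parsed['value']],
        batch_size=BATCH_SIZE, capacity=10000, min_after_dequeue=1000)

    # Note 2: a two-layer fully connected network; the sparse first-layer
    # product is computed with embedding_lookup_sparse (sum combiner).
    W1 = tf.Variable(tf.truncated_normal([NUM_FEATURES, HIDDEN_UNITS], stddev=0.01))
    b1 = tf.Variable(tf.zeros([HIDDEN_UNITS]))
    h1 = tf.nn.relu(
        tf.nn.embedding_lookup_sparse(W1, sp_ids, sp_vals, combiner='sum') + b1)
    W2 = tf.Variable(tf.truncated_normal([HIDDEN_UNITS, 1], stddev=0.01))
    b2 = tf.Variable(tf.zeros([1]))
    logits = tf.matmul(h1, W2) + b2
    loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits))
    # (On TF 0.12 this loss op takes positional (logits, targets) instead.)

    # AdamOptimizer with an exponentially decayed learning rate.
    global_step = tf.Variable(0, trainable=False)
    learning_rate = tf.train.exponential_decay(
        0.001, global_step, decay_steps=10000, decay_rate=0.96)
    train_op = tf.train.AdamOptimizer(learning_rate).minimize(
        loss, global_step=global_step)

The embedding_lookup_sparse call with a 'sum' combiner computes the sparse matrix product of the batch with W1 without ever densifying the input, which is the main trick that makes sparse learning tractable here.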

Contribution

Contributions and comments are welcome!

Licence

MIT
