1289 open source projects that are alternatives to or similar to Dist Keras

Griffon Data Science Virtual Machine

Stars: ✭ 128 (-79.12%)

Mutual labels: data-science, hadoop, apache-spark

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL

Stars: ✭ 177 (-71.13%)

Mutual labels: hadoop, apache-spark

DaFlow

Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.

Stars: ✭ 24 (-96.08%)

Mutual labels: apache-spark, hadoop

learning-hadoop-and-spark

Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: ✭ 146 (-76.18%)

Mutual labels: apache-spark, hadoop

aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Stars: ✭ 111 (-81.89%)

Mutual labels: apache-spark, hadoop

Pulsar Spark

When Apache Pulsar meets Apache Spark

Stars: ✭ 55 (-91.03%)

Mutual labels: data-science, apache-spark

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+822.68%)

Mutual labels: data-science, hadoop

Scalable Data Science

Scalable Data Science: course sets in big data using Apache Spark over Databricks, and their mathematical, statistical and computational foundations using SageMath.

Stars: ✭ 142 (-76.84%)

Mutual labels: data-science, apache-spark

Spark With Python

Fundamentals of Spark with Python (using PySpark), code examples

Stars: ✭ 150 (-75.53%)

Mutual labels: hadoop, apache-spark

Learn machine learning

Road to Machine Learning

Stars: ✭ 81 (-86.79%)

Mutual labels: data-science, hadoop

datalake-etl-pipeline

Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations

Stars: ✭ 39 (-93.64%)

Mutual labels: apache-spark, hadoop

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+647.31%)

Mutual labels: data-science, hadoop

leaflet heatmap

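A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved offline for computation and analysis. After the data is computed in parallel with Apache Spark, the heatmap is also rendered with Apache Spark, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git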

Stars: ✭ 13 (-97.88%)

Mutual labels: apache-spark, hadoop

Sparkrdma

RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark

Stars: ✭ 215 (-64.93%)

Mutual labels: hadoop, apache-spark

sparkucx

A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer

Stars: ✭ 32 (-94.78%)

Mutual labels: apache-spark, hadoop

Agile data code 2

Code for Agile Data Science 2.0, O'Reilly 2017, Second Edition

Stars: ✭ 413 (-32.63%)

Mutual labels: data-science, apache-spark

Pysparkling

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

Stars: ✭ 231 (-62.32%)

Mutual labels: data-science, apache-spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.

Stars: ✭ 3,081 (+402.61%)

Mutual labels: data-science, apache-spark

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+3496.74%)

Mutual labels: data-science, hadoop

Awesome Twitter Data

A list of Twitter datasets and related resources.

Stars: ✭ 533 (-13.05%)

Mutual labels: data-science

Hadoop study

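Regularly updated documentation for commonly used big data components in the Hadoop ecosystem, with focus in this order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple demos in Scala, common utility classes, and desensitized train code. Continuously updated!)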

Stars: ✭ 567 (-7.5%)

Mutual labels: hadoop

Interpretable machine learning with python

Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.

Stars: ✭ 530 (-13.54%)

Mutual labels: data-science

Rumale

Rumale is a machine learning library in Ruby

Stars: ✭ 526 (-14.19%)

Mutual labels: data-science

Imbalanced Learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

Stars: ✭ 5,617 (+816.31%)

Mutual labels: data-science

Data Analysis And Machine Learning Projects

Repository of teaching materials, code, and data for my data analysis and machine learning projects.

Stars: ✭ 5,166 (+742.74%)

Mutual labels: data-science

Course V3

The 3rd edition of course.fast.ai

Stars: ✭ 4,785 (+680.59%)

Mutual labels: data-science

Feature Selection

Feature selector based on a self-selected algorithm, loss function and validation method

Stars: ✭ 534 (-12.89%)

Mutual labels: data-science

Datasets For Recommender Systems

A repository of high-quality, topic-centric public data sources for Recommender Systems (RS)

Stars: ✭ 564 (-7.99%)

Mutual labels: data-science

Data Science Your Way

Ways of doing Data Science Engineering and Machine Learning in R and Python

Stars: ✭ 530 (-13.54%)

Mutual labels: data-science

Awesome Ai Usecases

A list of awesome and proven Artificial Intelligence use cases and applications

Stars: ✭ 587 (-4.24%)

Mutual labels: data-science

Lets Plot

An open-source plotting library for statistical data.

Stars: ✭ 531 (-13.38%)

Mutual labels: data-science

Alphapy

Automated Machine Learning [AutoML] with Python, scikit-learn, Keras, XGBoost, LightGBM, and CatBoost

Stars: ✭ 564 (-7.99%)

Mutual labels: data-science

Moderndive book

Statistical Inference via Data Science: A ModernDive into R and the Tidyverse

Stars: ✭ 527 (-14.03%)

Mutual labels: data-science

Smile

Statistical Machine Intelligence & Learning Engine

Stars: ✭ 5,412 (+782.87%)

Mutual labels: data-science

Pachyderm

Reproducible Data Science at Scale!

Stars: ✭ 5,305 (+765.42%)

Mutual labels: data-science

Dapy

Easy-to-use data analysis / manipulation framework for humans

Stars: ✭ 523 (-14.68%)

Mutual labels: data-science

Disk.frame

Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data

Stars: ✭ 517 (-15.66%)

Mutual labels: data-science

Glue

Linked Data Visualizations Across Multiple Files

Stars: ✭ 518 (-15.5%)

Mutual labels: data-science

Vehicle counting tensorflow

🚘 "MORE THAN VEHICLE COUNTING!" This project provides prediction for speed, color and size of the vehicles with TensorFlow Object Counting API.

Stars: ✭ 582 (-5.06%)

Mutual labels: data-science

Data Science Portfolio

Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.

Stars: ✭ 559 (-8.81%)

Mutual labels: data-science

Facebook data analyzer

Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get information about friend rankings by message count, vocabulary, contacts, friends-added statistics, and more

Stars: ✭ 515 (-15.99%)

Mutual labels: data-science

Knowledge Repo

A next-generation curated knowledge sharing platform for data scientists and other technical professions.

Stars: ✭ 4,956 (+708.48%)

Mutual labels: data-science

Nipype

Workflows and interfaces for neuroimaging packages

Stars: ✭ 557 (-9.14%)

Mutual labels: data-science

Heamy

A set of useful tools for competitive data science.

Stars: ✭ 511 (-16.64%)

Mutual labels: data-science

Spacy Stanza

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy

Stars: ✭ 508 (-17.13%)

Mutual labels: data-science

Book sample

another book on data science

Stars: ✭ 611 (-0.33%)

Mutual labels: data-science

Datasheets

Read data from, write data to, and modify the formatting of Google Sheets

Stars: ✭ 593 (-3.26%)

Mutual labels: data-science

Data Science Competitions

The goal of this repo is to provide solutions to all data science competitions (Kaggle, Data Hack, Machine Hack, Driven Data, etc.).

Stars: ✭ 572 (-6.69%)

Mutual labels: data-science

Streaming Readings

Papers and reading material related to streaming systems

Stars: ✭ 554 (-9.62%)

Mutual labels: apache-spark

Atm

Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).

Stars: ✭ 504 (-17.78%)

Mutual labels: data-science

Edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.

Stars: ✭ 4,674 (+662.48%)

Mutual labels: data-science

Data Science With Ruby

Practical Data Science with Ruby based tools.

Stars: ✭ 549 (-10.44%)

Mutual labels: data-science

Awesome Learn Datascience

📈 Curated list of resources to help you get started with Data Science

Stars: ✭ 502 (-18.11%)

Mutual labels: data-science

Snorkel

A system for quickly generating training data with weak supervision

Stars: ✭ 4,953 (+707.99%)

Mutual labels: data-science

Alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

Stars: ✭ 5,379 (+777.49%)

Mutual labels: hadoop

Probabilistic Programming And Bayesian Methods For Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

Stars: ✭ 23,912 (+3800.82%)

Mutual labels: data-science

Awesome R

A curated list of awesome R packages, frameworks and software.

Stars: ✭ 4,858 (+692.5%)

Mutual labels: data-science

Dataframe Go

DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration

Stars: ✭ 487 (-20.55%)

Mutual labels: data-science

Solid

🎯 A comprehensive gradient-free optimization framework written in Python