StormMirror of Apache Storm
Stars: ✭ 6,297 (+2502.07%)
lcbo-apiA crawler and API server for Liquor Control Board of Ontario retail data
Stars: ✭ 152 (-37.19%)
gan deeplearning4jAutomatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-92.15%)
CythonThe most widely used Python to C compiler
Stars: ✭ 6,588 (+2622.31%)
FlameStreamDistributed stream processing model and its implementation
Stars: ✭ 14 (-94.21%)
VizukaExplore high-dimensional datasets and how your algo handles specific regions.
Stars: ✭ 100 (-58.68%)
ngmswissgeol.ch gives you insight into geoscientific data - above and below the surface.
Stars: ✭ 23 (-90.5%)
SamzaMirror of Apache Samza
Stars: ✭ 676 (+179.34%)
automile-netAutomile offers a simple, smart, cutting-edge telematics solution for businesses to track and manage their business vehicles.
Stars: ✭ 24 (-90.08%)
BooksA collection of books covering C & C++, git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.
Stars: ✭ 222 (-8.26%)
iisInformation Inference Service of the OpenAIRE system
Stars: ✭ 16 (-93.39%)
SdcIntel® Scalable Dataframe Compiler for Pandas*
Stars: ✭ 623 (+157.44%)
FIW KRTFamilies In the Wild: A Kinship Recognition Toolbox.
Stars: ✭ 18 (-92.56%)
shiftingA privacy-focused list of alternatives to mainstream services to help the competition.
Stars: ✭ 31 (-87.19%)
H2o 3H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+2237.19%)
HadoopDedup🍉 Large-scale deduplication of massive datasets, based on Hadoop and HBase
Stars: ✭ 27 (-88.84%)
Belajarpython.comOpen Source Indonesian Python Programming Tutorial Site
Stars: ✭ 141 (-41.74%)
ZeppelinWeb-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+2178.1%)
KuduMirror of Apache Kudu
Stars: ✭ 1,360 (+461.98%)
ScannerEfficient video analysis at scale
Stars: ✭ 569 (+135.12%)
merkle-dbHigh-scalability analytics database built on immutable merkle-trees
Stars: ✭ 44 (-81.82%)
FlumeMirror of Apache Flume
Stars: ✭ 2,200 (+809.09%)
metriqlThe metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (-6.2%)
NipypeWorkflows and interfaces for neuroimaging packages
Stars: ✭ 557 (+130.17%)
dislibThe Distributed Computing library for Python, implemented using the PyCOMPSs programming model for HPC.
Stars: ✭ 39 (-83.88%)
LogislandScalable stream processing platform for advanced realtime analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources and sinks is available.
Stars: ✭ 97 (-59.92%)
ThrillThrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
Stars: ✭ 528 (+118.18%)
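cdp-serviceCDP data platform that helps enterprises fully understand their customers and deliver precise, individually personalized marketing.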
Stars: ✭ 30 (-87.6%)
sgdAn R package for large scale estimation with stochastic gradient descent
Stars: ✭ 55 (-77.27%)
BeamApache Beam is a unified programming model for Batch and Streaming
Stars: ✭ 5,149 (+2027.69%)
ytprivYT metadata exporter
Stars: ✭ 28 (-88.43%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+452.89%)
scikit-learn-intelexIntel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
Stars: ✭ 887 (+266.53%)
MagellanGeo Spatial Data Analytics on Spark
Stars: ✭ 507 (+109.5%)
SparkrdmaRDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-11.16%)
bullet-coreBullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Storm, Spark or Flink.
Stars: ✭ 36 (-85.12%)
Stream FrameworkStream Framework is a Python library, which allows you to build news feed, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:
Stars: ✭ 4,576 (+1790.91%)
incubator-tezMirror of Apache Tez (Incubating)
Stars: ✭ 60 (-75.21%)
ReefMirror of Apache REEF
Stars: ✭ 92 (-61.98%)
RedisliteRedis in a python module.
Stars: ✭ 464 (+91.74%)
PoseidonA search engine which can hold 100 trillion lines of log data.
Stars: ✭ 1,793 (+640.91%)
CoursesQuiz & Assignment of Coursera
Stars: ✭ 454 (+87.6%)
Kafka UiOpen-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (-4.96%)
ElandPython Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (-2.89%)
UsqlU-SQL Examples and Issue Tracking
Stars: ✭ 221 (-8.68%)
Couchdb DockerSemi-official Apache CouchDB Docker images
Stars: ✭ 194 (-19.83%)
DatasciencevmTools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (-36.78%)
Hdfs ShellHDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (-51.65%)
big dataA collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-85.95%)