big data: A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+25.93%)
MLBD: Materials for "Machine Learning on Big Data" course
Stars: ✭ 20 (-25.93%)
Data Science Ipython Notebooks: Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+81559.26%)
pyspark-algorithms: PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+166.67%)
Asakusafw: Asakusa Framework
Stars: ✭ 114 (+322.22%)
bullet-core: Bullet is a streaming query engine that can be plugged into any single data stream using a stream processing framework like Apache Storm, Spark or Flink.
Stars: ✭ 36 (+33.33%)
dislib: The Distributed Computing library for Python, implemented using the PyCOMPSs programming model for HPC.
Stars: ✭ 39 (+44.44%)
bagri: XML/Document DB on top of distributed cache
Stars: ✭ 40 (+48.15%)
masc: Microsoft's contributions for Spark with Apache Accumulo
Stars: ✭ 20 (-25.93%)
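cdp-service: CDP data platform that helps enterprises fully understand their customers and deliver precisely targeted, personalized marketing.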
Stars: ✭ 30 (+11.11%)
Clickhouse: ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+78007.41%)
merkle-db: High-scalability analytics database built on immutable merkle-trees
Stars: ✭ 44 (+62.96%)
sgd: An R package for large scale estimation with stochastic gradient descent
Stars: ✭ 55 (+103.7%)
Vue Virtual Scroll List: ⚡️ A Vue component that supports large data lists with high rendering performance and efficiency.
Stars: ✭ 3,201 (+11755.56%)
Data Accelerator: Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+814.81%)
ytpriv: YT metadata exporter
Stars: ✭ 28 (+3.7%)
Aws Etl Orchestrator: A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+807.41%)
Kafka Ui: Open-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (+751.85%)
TiBigData: TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+611.11%)
replicator: MySQL Replicator. Replicates MySQL tables to Kafka and HBase, keeping the history of data changes in HBase.
Stars: ✭ 41 (+51.85%)
learning-hadoop-and-spark: Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+440.74%)
incubator-tez: Mirror of Apache Tez (Incubating)
Stars: ✭ 60 (+122.22%)
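awesome-coder-resources: A pit stop on your programming journey! [Continuously updated; stars welcome, come back often] [Contents: programming, learning, and reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]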
Stars: ✭ 54 (+100%)
metriql: The metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (+740.74%)
scikit-learn-intelex: Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
Stars: ✭ 887 (+3185.19%)
Eland: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+770.37%)
couchdb-pkg: Apache CouchDB Packaging support files
Stars: ✭ 24 (-11.11%)
awesome-tools: Curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+14.81%)
Quantitative-Big-Imaging-2018: (Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018
Stars: ✭ 50 (+85.19%)
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+11174.07%)
Cboard: An easy-to-use, self-service open BI reporting and BI dashboard platform.
Stars: ✭ 2,795 (+10251.85%)
mmtf-spark: Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-25.93%)
Hyperspace: An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+811.11%)
Trafodion: Apache Trafodion
Stars: ✭ 242 (+796.3%)
Clustering4Ever: C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
Stars: ✭ 126 (+366.67%)
Selinon: Advanced distributed task flow management on top of Celery
Stars: ✭ 237 (+777.78%)
lidbox: End-to-end spoken language identification out of the box.
Stars: ✭ 39 (+44.44%)
Books: A collection of books covering C & C++, git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.
Stars: ✭ 222 (+722.22%)
Lite Virtual List: Vue-based virtual list component library supporting waterfall flow layouts
Stars: ✭ 223 (+725.93%)
Nakedtensor: Bare-bones examples of machine learning in TensorFlow
Stars: ✭ 2,443 (+8948.15%)
Usql: U-SQL Examples and Issue Tracking
Stars: ✭ 221 (+718.52%)
Gimel: Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+700%)