TiBigData
TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+611.11%)
replicator
MySQL Replicator. Replicates MySQL tables to Kafka and HBase, keeping the data changes history in HBase.
Stars: ✭ 41 (+51.85%)
Presto Go Client
A Presto client for the Go programming language.
Stars: ✭ 183 (+577.78%)
learning-hadoop-and-spark
Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+440.74%)
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+555.56%)
incubator-tez
Mirror of Apache Tez (Incubating)
Stars: ✭ 60 (+122.22%)
Keyvi
Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 171 (+533.33%)
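awesome-coder-resources
A refueling station on the programming road! [Continuously updated; stars welcome, come back often.] [Contents: programming/learning/reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]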
Stars: ✭ 54 (+100%)
Geopyspark
GeoTrellis for PySpark
Stars: ✭ 167 (+518.52%)
Fluo
Apache Fluo
Stars: ✭ 159 (+488.89%)
Usql
U-SQL Examples and Issue Tracking
Stars: ✭ 221 (+718.52%)
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+5496.3%)
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+462.96%)
Datasciencevm
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (+466.67%)
couchdb-pkg
Apache CouchDB Packaging support files
Stars: ✭ 24 (-11.11%)
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+455.56%)
100daysofmlcode
My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge.
Stars: ✭ 146 (+440.74%)
awesome-tools
Curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+14.81%)
Metamodel
Mirror of Apache Metamodel
Stars: ✭ 143 (+429.63%)
Quantitative-Big-Imaging-2018
(Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018
Stars: ✭ 50 (+85.19%)
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+418.52%)
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+11174.07%)
Sparkling Graph
SparklingGraph provides an easy-to-use set of features that let you process large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (+414.81%)
Cboard
An easy-to-use, self-service open BI reporting and BI dashboard platform.
Stars: ✭ 2,795 (+10251.85%)
mmtf-spark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-25.93%)
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+811.11%)
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+5981.48%)
Tajo
Mirror of Apache Tajo
Stars: ✭ 128 (+374.07%)
Trafodion
Apache Trafodion
Stars: ✭ 242 (+796.3%)
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+9440.74%)
Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Stars: ✭ 126 (+366.67%)
Richdem
High-performance Terrain and Hydrology Analysis
Stars: ✭ 127 (+370.37%)
Selinon
An advanced distributed task flow management system on top of Celery
Stars: ✭ 237 (+777.78%)
lidbox
End-to-end spoken language identification out of the box.
Stars: ✭ 39 (+44.44%)
Books
A collection of books covering C & C++, Git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.
Stars: ✭ 222 (+722.22%)
Hdfs Shell
HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (+333.33%)
Cmak
CMAK is a tool for managing Apache Kafka clusters
Stars: ✭ 10,544 (+38951.85%)
Nakedtensor
Bare-bones examples of machine learning in TensorFlow
Stars: ✭ 2,443 (+8948.15%)
Pythondata
Repo for code published on pythondata.com
Stars: ✭ 113 (+318.52%)
Awkward 0.x
Manipulate arrays of complex data structures as easily as NumPy.
Stars: ✭ 216 (+700%)
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+696.3%)
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+700%)
data-viz-utils
Functions for easily making publication-quality figures with matplotlib.
Stars: ✭ 16 (-40.74%)