451 open source projects that are alternatives to or similar to HadoopDedup

TiBigData
TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+611.11%)
Mutual labels:  cdc
Attic Predictionio Sdk Ruby
PredictionIO Ruby SDK
Stars: ✭ 192 (+611.11%)
Mutual labels:  big-data
replicator
MySQL Replicator. Replicates MySQL tables to Kafka and HBase, keeping the data changes history in HBase.
Stars: ✭ 41 (+51.85%)
Mutual labels:  cdc
Presto Go Client
A Presto client for the Go programming language.
Stars: ✭ 183 (+577.78%)
Mutual labels:  big-data
learning-hadoop-and-spark
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+440.74%)
Mutual labels:  mapreduce
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+555.56%)
Mutual labels:  big-data
incubator-tez
Mirror of Apache Tez (Incubating)
Stars: ✭ 60 (+122.22%)
Mutual labels:  big-data
Keyvi
Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and lookup performance.
Stars: ✭ 171 (+533.33%)
Mutual labels:  big-data
awesome-coder-resources
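A refueling station on the programming road! [Continuously updated... stars welcome, come back often] [Contents: programming/learning/reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]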
Stars: ✭ 54 (+100%)
Mutual labels:  big-data
Geopyspark
GeoTrellis for PySpark
Stars: ✭ 167 (+518.52%)
Mutual labels:  big-data
Social-Network-Analysis-in-Python
Social Network Facebook Analysis (Python, Networkx)
Stars: ✭ 26 (-3.7%)
Mutual labels:  big-data
Fluo
Apache Fluo
Stars: ✭ 159 (+488.89%)
Mutual labels:  big-data
mit-6.824-distributed-systems
Template repository to work on the labs from MIT 6.824 Distributed Systems course.
Stars: ✭ 48 (+77.78%)
Mutual labels:  mapreduce
Usql
U-SQL Examples and Issue Tracking
Stars: ✭ 221 (+718.52%)
Mutual labels:  big-data
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+5496.3%)
Mutual labels:  big-data
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+462.96%)
Mutual labels:  big-data
predictionio-sdk-ruby
PredictionIO Ruby SDK
Stars: ✭ 192 (+611.11%)
Mutual labels:  big-data
Datasciencevm
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Stars: ✭ 153 (+466.67%)
Mutual labels:  big-data
couchdb-pkg
Apache CouchDB Packaging support files
Stars: ✭ 24 (-11.11%)
Mutual labels:  big-data
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+455.56%)
Mutual labels:  big-data
acousticbrainz-server
The server components for the AcousticBrainz project
Stars: ✭ 128 (+374.07%)
Mutual labels:  big-data
100daysofmlcode
My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge.
Stars: ✭ 146 (+440.74%)
Mutual labels:  big-data
awesome-tools
curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+14.81%)
Mutual labels:  big-data
Metamodel
Mirror of Apache Metamodel
Stars: ✭ 143 (+429.63%)
Mutual labels:  big-data
predictionio-template-recommender
PredictionIO Recommendation Engine Template (Scala-based parallelized engine)
Stars: ✭ 80 (+196.3%)
Mutual labels:  big-data
Big Data Study
🐳 big data study
Stars: ✭ 141 (+422.22%)
Mutual labels:  big-data
Quantitative-Big-Imaging-2018
(Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018
Stars: ✭ 50 (+85.19%)
Mutual labels:  big-data
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+418.52%)
Mutual labels:  big-data
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+11174.07%)
Mutual labels:  big-data
Sparkling Graph
SparklingGraph provides an easy-to-use set of features that let you process large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (+414.81%)
Mutual labels:  big-data
pg-logical-replication
PostgreSQL Logical Replication client for node.js
Stars: ✭ 56 (+107.41%)
Mutual labels:  cdc
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (+407.41%)
Mutual labels:  big-data
Cboard
An easy to use, self-service open BI reporting and BI dashboard platform.
Stars: ✭ 2,795 (+10251.85%)
Mutual labels:  big-data
Attic Apex Malhar
Mirror of Apache Apex malhar
Stars: ✭ 131 (+385.19%)
Mutual labels:  big-data
mmtf-spark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-25.93%)
Mutual labels:  big-data
Calcite Avatica
Mirror of Apache Calcite - Avatica
Stars: ✭ 130 (+381.48%)
Mutual labels:  big-data
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+811.11%)
Mutual labels:  big-data
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+5981.48%)
Mutual labels:  big-data
javaer-mind
Mind maps for advanced study by Java programmers
Stars: ✭ 66 (+144.44%)
Mutual labels:  big-data
Tajo
Mirror of Apache Tajo
Stars: ✭ 128 (+374.07%)
Mutual labels:  big-data
Trafodion
Apache Trafodion
Stars: ✭ 242 (+796.3%)
Mutual labels:  big-data
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+9440.74%)
Mutual labels:  big-data
Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Stars: ✭ 126 (+366.67%)
Mutual labels:  big-data
Richdem
High-performance Terrain and Hydrology Analysis
Stars: ✭ 127 (+370.37%)
Mutual labels:  big-data
Selinon
An advanced distributed task flow management on top of Celery
Stars: ✭ 237 (+777.78%)
Mutual labels:  big-data
Hazelcast Nodejs Client
Hazelcast IMDG Node.js Client
Stars: ✭ 124 (+359.26%)
Mutual labels:  big-data
lidbox
End-to-end spoken language identification out of the box.
Stars: ✭ 39 (+44.44%)
Mutual labels:  big-data
Scala Spark Tutorial
Project for James' Apache Spark with Scala course
Stars: ✭ 121 (+348.15%)
Mutual labels:  big-data
Books
A collection of books covering C & C++, git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.
Stars: ✭ 222 (+722.22%)
Mutual labels:  big-data
Hdfs Shell
HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
Stars: ✭ 117 (+333.33%)
Mutual labels:  big-data
predictionio-sdk-python
PredictionIO Python SDK
Stars: ✭ 199 (+637.04%)
Mutual labels:  big-data
Cmak
CMAK is a tool for managing Apache Kafka clusters
Stars: ✭ 10,544 (+38951.85%)
Mutual labels:  big-data
Nakedtensor
Bare bone examples of machine learning in TensorFlow
Stars: ✭ 2,443 (+8948.15%)
Mutual labels:  big-data
bigdata-doc
Big data study notes, learning roadmap, and a collection of technical case studies.
Stars: ✭ 37 (+37.04%)
Mutual labels:  mapreduce
Pythondata
repo for code published on pythondata.com
Stars: ✭ 113 (+318.52%)
Mutual labels:  big-data
accumulo-docker
Apache Accumulo Docker
Stars: ✭ 17 (-37.04%)
Mutual labels:  big-data
Awkward 0.x
Manipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (+700%)
Mutual labels:  big-data
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (+696.3%)
Mutual labels:  big-data
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+700%)
Mutual labels:  big-data
data-viz-utils
Functions for easily making publication-quality figures with matplotlib.
Stars: ✭ 16 (-40.74%)
Mutual labels:  big-data