451 Open source projects that are alternatives of or similar to HadoopDedup

big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (+25.93%)
Mutual labels:  big-data, mapreduce
MLBD
Materials for "Machine Learning on Big Data" course
Stars: ✭ 20 (-25.93%)
Mutual labels:  big-data, mapreduce
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+81559.26%)
Mutual labels:  big-data, mapreduce
Bigdata Notes
A beginner's guide to big data ⭐
Stars: ✭ 10,991 (+40607.41%)
Mutual labels:  big-data, mapreduce
pyspark-algorithms
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Stars: ✭ 72 (+166.67%)
Mutual labels:  big-data, mapreduce
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+162.96%)
Mutual labels:  big-data, mapreduce
Asakusafw
Asakusa Framework
Stars: ✭ 114 (+322.22%)
Mutual labels:  big-data, mapreduce
accumulo-testing
Apache Accumulo Testing
Stars: ✭ 14 (-48.15%)
Mutual labels:  big-data
phoenix-queryserver
Apache Phoenix Query Server
Stars: ✭ 33 (+22.22%)
Mutual labels:  big-data
TT Tech Space
TT Tech Research Notes
Stars: ✭ 21 (-22.22%)
Mutual labels:  big-data
Detecting-Malicious-URL-Machine-Learning
No description or website provided.
Stars: ✭ 47 (+74.07%)
Mutual labels:  big-data
bullet-core
Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Storm, Spark or Flink.
Stars: ✭ 36 (+33.33%)
Mutual labels:  big-data
dislib
The Distributed Computing library for Python, implemented using the PyCOMPSs programming model for HPC.
Stars: ✭ 39 (+44.44%)
Mutual labels:  big-data
bagri
XML/Document DB on top of distributed cache
Stars: ✭ 40 (+48.15%)
Mutual labels:  big-data
big-data-engineering-indonesia
A curated list of big data engineering tools, resources and communities.
Stars: ✭ 26 (-3.7%)
Mutual labels:  big-data
masc
Microsoft's contributions for Spark with Apache Accumulo
Stars: ✭ 20 (-25.93%)
Mutual labels:  big-data
cdp-service
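A CDP data platform that helps enterprises fully understand their customers and deliver individually personalized precision marketing.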
Stars: ✭ 30 (+11.11%)
Mutual labels:  big-data
corpusexplorer2.0
Corpus linguistics has never been so easy...
Stars: ✭ 16 (-40.74%)
Mutual labels:  big-data
Clickhouse
ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+78007.41%)
Mutual labels:  big-data
merkle-db
High-scalability analytics database built on immutable merkle-trees
Stars: ✭ 44 (+62.96%)
Mutual labels:  big-data
sgd
An R package for large scale estimation with stochastic gradient descent
Stars: ✭ 55 (+103.7%)
Mutual labels:  big-data
Vue Virtual Scroll List
⚡️ A Vue component that supports large data lists with efficient, high-performance rendering.
Stars: ✭ 3,201 (+11755.56%)
Mutual labels:  big-data
Data Accelerator
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+814.81%)
Mutual labels:  big-data
ytpriv
YT metadata exporter
Stars: ✭ 28 (+3.7%)
Mutual labels:  big-data
Aws Etl Orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+807.41%)
Mutual labels:  big-data
Kafka Ui
Open-Source Web GUI for Apache Kafka Management
Stars: ✭ 230 (+751.85%)
Mutual labels:  big-data
twitter-archive-reader
Full featured TypeScript Twitter archive reader and browser
Stars: ✭ 43 (+59.26%)
Mutual labels:  big-data
TiBigData
TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+611.11%)
Mutual labels:  cdc
replicator
MySQL Replicator. Replicates MySQL tables to Kafka and HBase, keeping the data changes history in HBase.
Stars: ✭ 41 (+51.85%)
Mutual labels:  cdc
learning-hadoop-and-spark
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+440.74%)
Mutual labels:  mapreduce
incubator-tez
Mirror of Apache Tez (Incubating)
Stars: ✭ 60 (+122.22%)
Mutual labels:  big-data
awesome-coder-resources
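A refueling station for your programming journey! [Continuously updated; stars welcome, and do come back often.] [Contents: programming, learning, and reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]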
Stars: ✭ 54 (+100%)
Mutual labels:  big-data
Social-Network-Analysis-in-Python
Social Network Facebook Analysis (Python, Networkx)
Stars: ✭ 26 (-3.7%)
Mutual labels:  big-data
metriql
The metrics layer for your data. Join us at https://metriql.com/slack
Stars: ✭ 227 (+740.74%)
Mutual labels:  big-data
scikit-learn-intelex
Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
Stars: ✭ 887 (+3185.19%)
Mutual labels:  big-data
Eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+770.37%)
Mutual labels:  big-data
predictionio-sdk-ruby
PredictionIO Ruby SDK
Stars: ✭ 192 (+611.11%)
Mutual labels:  big-data
couchdb-pkg
Apache CouchDB Packaging support files
Stars: ✭ 24 (-11.11%)
Mutual labels:  big-data
acousticbrainz-server
The server components for the AcousticBrainz project
Stars: ✭ 128 (+374.07%)
Mutual labels:  big-data
awesome-tools
A curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (+14.81%)
Mutual labels:  big-data
predictionio-template-recommender
PredictionIO Recommendation Engine Template (Scala-based parallelized engine)
Stars: ✭ 80 (+196.3%)
Mutual labels:  big-data
Quantitative-Big-Imaging-2018
(Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018
Stars: ✭ 50 (+85.19%)
Mutual labels:  big-data
Koalas
Koalas: pandas API on Apache Spark
Stars: ✭ 3,044 (+11174.07%)
Mutual labels:  big-data
pg-logical-replication
PostgreSQL Logical Replication client for node.js
Stars: ✭ 56 (+107.41%)
Mutual labels:  cdc
Cboard
An easy to use, self-service open BI reporting and BI dashboard platform.
Stars: ✭ 2,795 (+10251.85%)
Mutual labels:  big-data
mmtf-spark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-25.93%)
Mutual labels:  big-data
Hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Stars: ✭ 246 (+811.11%)
Mutual labels:  big-data
javaer-mind
Mind maps for Java programmers' advanced study
Stars: ✭ 66 (+144.44%)
Mutual labels:  big-data
Trafodion
Apache Trafodion
Stars: ✭ 242 (+796.3%)
Mutual labels:  big-data
Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Stars: ✭ 126 (+366.67%)
Mutual labels:  big-data
Selinon
Advanced distributed task flow management on top of Celery
Stars: ✭ 237 (+777.78%)
Mutual labels:  big-data
lidbox
End-to-end spoken language identification out of the box.
Stars: ✭ 39 (+44.44%)
Mutual labels:  big-data
Books
A collection of books covering C & C++, git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.
Stars: ✭ 222 (+722.22%)
Mutual labels:  big-data
predictionio-sdk-python
PredictionIO Python SDK
Stars: ✭ 199 (+637.04%)
Mutual labels:  big-data
Lite Virtual List
Virtual list component library supporting waterfall flow based on vue
Stars: ✭ 223 (+725.93%)
Mutual labels:  big-data
Nakedtensor
Bare bone examples of machine learning in TensorFlow
Stars: ✭ 2,443 (+8948.15%)
Mutual labels:  big-data
bigdata-doc
Big data study notes, learning roadmaps, and collected technical case studies.
Stars: ✭ 37 (+37.04%)
Mutual labels:  mapreduce
mit-6.824-distributed-systems
Template repository to work on the labs from MIT 6.824 Distributed Systems course.
Stars: ✭ 48 (+77.78%)
Mutual labels:  mapreduce
Usql
U-SQL Examples and Issue Tracking
Stars: ✭ 221 (+718.52%)
Mutual labels:  big-data
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (+700%)
Mutual labels:  big-data