451 open-source projects that are alternatives to or similar to HadoopDedup

big data

A collection of tutorials on Hadoop, MapReduce, Spark, Docker

Stars: ✭ 34 (+25.93%)

Mutual labels: big-data, mapreduce

MLBD

Materials for "Machine Learning on Big Data" course

Stars: ✭ 20 (-25.93%)

Mutual labels: big-data, mapreduce

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+81559.26%)

Mutual labels: big-data, mapreduce

Bigdata Notes

A getting-started guide to big data ⭐

Stars: ✭ 10,991 (+40607.41%)

Mutual labels: big-data, mapreduce

pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Stars: ✭ 72 (+166.67%)

Mutual labels: big-data, mapreduce

Big Data Engineering Coursera Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Stars: ✭ 71 (+162.96%)

Mutual labels: big-data, mapreduce

Asakusafw

Asakusa Framework

Stars: ✭ 114 (+322.22%)

Mutual labels: big-data, mapreduce

accumulo-testing

Apache Accumulo Testing

Stars: ✭ 14 (-48.15%)

Mutual labels: big-data

phoenix-queryserver

Apache Phoenix Query Server

Stars: ✭ 33 (+22.22%)

Mutual labels: big-data

TT Tech Space

TT Tech Research Notes

Stars: ✭ 21 (-22.22%)

Mutual labels: big-data

Detecting-Malicious-URL-Machine-Learning

No description or website provided.

Stars: ✭ 47 (+74.07%)

Mutual labels: big-data

bullet-core

Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Storm, Spark or Flink.

Stars: ✭ 36 (+33.33%)

Mutual labels: big-data

dislib

The Distributed Computing library for Python, implemented using the PyCOMPSs programming model for HPC.

Stars: ✭ 39 (+44.44%)

Mutual labels: big-data

bagri

XML/Document DB on top of distributed cache

Stars: ✭ 40 (+48.15%)

Mutual labels: big-data

big-data-engineering-indonesia

A curated list of big data engineering tools, resources and communities.

Stars: ✭ 26 (-3.7%)

Mutual labels: big-data

masc

Microsoft's contributions for Spark with Apache Accumulo

Stars: ✭ 20 (-25.93%)

Mutual labels: big-data

cdp-service

CDP data platform that helps enterprises fully understand their customers and deliver precisely targeted, personalized marketing.

Stars: ✭ 30 (+11.11%)

Mutual labels: big-data

corpusexplorer2.0

Corpus linguistics has never been this easy...

Stars: ✭ 16 (-40.74%)

Mutual labels: big-data

Clickhouse

ClickHouse® is a free analytics DBMS for big data

Stars: ✭ 21,089 (+78007.41%)

Mutual labels: big-data

merkle-db

High-scalability analytics database built on immutable merkle-trees

Stars: ✭ 44 (+62.96%)

Mutual labels: big-data

sgd

An R package for large scale estimation with stochastic gradient descent

Stars: ✭ 55 (+103.7%)

Mutual labels: big-data

Vue Virtual Scroll List

⚡️ A Vue component that supports large data lists with high, efficient rendering performance.

Stars: ✭ 3,201 (+11755.56%)

Mutual labels: big-data

Data Accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+814.81%)

Mutual labels: big-data

ytpriv

YT metadata exporter

Stars: ✭ 28 (+3.7%)

Mutual labels: big-data

Aws Etl Orchestrator

A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.

Stars: ✭ 245 (+807.41%)

Mutual labels: big-data

Kafka Ui

Open-Source Web GUI for Apache Kafka Management

Stars: ✭ 230 (+751.85%)

Mutual labels: big-data

twitter-archive-reader

Full featured TypeScript Twitter archive reader and browser

Stars: ✭ 43 (+59.26%)

Mutual labels: big-data

TiBigData

TiDB connectors for Flink/Hive/Presto

Stars: ✭ 192 (+611.11%)

Mutual labels: cdc

replicator

MySQL Replicator. Replicates MySQL tables to Kafka and HBase, keeping the data changes history in HBase.

Stars: ✭ 41 (+51.85%)

Mutual labels: cdc

learning-hadoop-and-spark

Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: ✭ 146 (+440.74%)

Mutual labels: mapreduce

incubator-tez

Mirror of Apache Tez (Incubating)

Stars: ✭ 60 (+122.22%)

Mutual labels: big-data

awesome-coder-resources

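A refueling station on the programming road! [Continuously updated... stars welcome, come back often] [Contents: programming, learning, and reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]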

Stars: ✭ 54 (+100%)

Mutual labels: big-data

Social-Network-Analysis-in-Python

Social Network Facebook Analysis (Python, Networkx)

Stars: ✭ 26 (-3.7%)

Mutual labels: big-data

metriql

The metrics layer for your data. Join us at https://metriql.com/slack

Stars: ✭ 227 (+740.74%)

Mutual labels: big-data

scikit-learn-intelex

Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

Stars: ✭ 887 (+3185.19%)

Mutual labels: big-data

Eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

Stars: ✭ 235 (+770.37%)

Mutual labels: big-data

predictionio-sdk-ruby

PredictionIO Ruby SDK

Stars: ✭ 192 (+611.11%)

Mutual labels: big-data

couchdb-pkg

Apache CouchDB Packaging support files

Stars: ✭ 24 (-11.11%)

Mutual labels: big-data

acousticbrainz-server

The server components for the AcousticBrainz project

Stars: ✭ 128 (+374.07%)

Mutual labels: big-data

awesome-tools

curated list of awesome tools and libraries for specific domains

Stars: ✭ 31 (+14.81%)

Mutual labels: big-data

predictionio-template-recommender

PredictionIO Recommendation Engine Template (Scala-based parallelized engine)

Stars: ✭ 80 (+196.3%)

Mutual labels: big-data

Quantitative-Big-Imaging-2018

(Latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019) The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018

Stars: ✭ 50 (+85.19%)

Mutual labels: big-data

Koalas

Koalas: pandas API on Apache Spark

Stars: ✭ 3,044 (+11174.07%)

Mutual labels: big-data

pg-logical-replication

PostgreSQL Logical Replication client for node.js

Stars: ✭ 56 (+107.41%)

Mutual labels: cdc

Cboard

An easy to use, self-service open BI reporting and BI dashboard platform.