386 open source projects that are alternatives to or similar to BigDataTools

datasqueeze
Hadoop utility to compact small files
Stars: ✭ 18 (-50%)
Mutual labels:  hdfs
HDFS-Netdisc
A distributed cloud storage system based on Hadoop 🌴
Stars: ✭ 56 (+55.56%)
Mutual labels:  hdfs
awesome-coder-resources
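A refueling station on the programming road! [Continuously updated... stars welcome, come back often] [Contents: programming/learning/reading resources, open source projects, interview questions, websites, books, blogs, tutorials, and more]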
Stars: ✭ 54 (+50%)
Mutual labels:  bigdata
logparser
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+286.11%)
Mutual labels:  hive
apiary
Apiary provides modules which can be combined to create a federated cloud data lake
Stars: ✭ 30 (-16.67%)
Mutual labels:  hive
smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+119.44%)
Mutual labels:  hive
HiveRunner
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
Stars: ✭ 244 (+577.78%)
Mutual labels:  hive
flink-learn
Learning Flink: Flink CEP, Flink Core, Flink SQL
Stars: ✭ 70 (+94.44%)
Mutual labels:  bigdata
datacatalog-tag-manager
Python package to manage Google Cloud Data Catalog tags, loading metadata from external sources -- currently supports the CSV file format
Stars: ✭ 17 (-52.78%)
Mutual labels:  bigdata
radiator
Hive Ruby API Client
Stars: ✭ 49 (+36.11%)
Mutual labels:  hive
columnify
Convert record-oriented data to columnar format.
Stars: ✭ 28 (-22.22%)
Mutual labels:  bigdata
SparkTwitterAnalysis
An Apache Spark standalone application using the Spark API in Scala. The application uses Simple Build Tool (SBT) for building the project.
Stars: ✭ 29 (-19.44%)
Mutual labels:  bigdata
Clustering4Ever
C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.
Stars: ✭ 126 (+250%)
Mutual labels:  bigdata
chatnoir-resiliparse
A robust web archive analytics toolkit
Stars: ✭ 26 (-27.78%)
Mutual labels:  bigdata
starlake
Starlake is a Spark Based On Premise and Cloud ELT/ETL Framework for Batch & Stream Processing
Stars: ✭ 16 (-55.56%)
Mutual labels:  hdfs
bigquery-data-lineage
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Stars: ✭ 112 (+211.11%)
Mutual labels:  bigdata
PersonNotes
A personal notes hub: technical notes recorded in a quick-and-dirty style .. 📚☕️⌨️🎧
Stars: ✭ 61 (+69.44%)
Mutual labels:  bigdata
cbass
adding "simple" to HBase
Stars: ✭ 25 (-30.56%)
Mutual labels:  hbase
intersect
Thoughts on an interview question - finding the intersection of 60 million packets and 3 million packets within a 50 MB memory limit
Stars: ✭ 54 (+50%)
Mutual labels:  bigdata
Notes
Learning notes | Java basics, JVM, source code, big data, interview experiences
Stars: ✭ 69 (+91.67%)
Mutual labels:  bigdata
Real-time-log-analysis-system
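🐧 A real-time log processing and analysis system based on spark streaming+flume+kafka+hbase (available as a console version and as a Web UI visualization version built on springboot, Echarts, etc.)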
Stars: ✭ 31 (-13.89%)
Mutual labels:  hbase
Sub-Track
Flutter Application to keep track of Subscriptions
Stars: ✭ 31 (-13.89%)
Mutual labels:  hive
meetups-archivos
Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Semilleros …
Stars: ✭ 60 (+66.67%)
Mutual labels:  bigdata
liquibase-impala
Liquibase extension to add Impala Database support
Stars: ✭ 23 (-36.11%)
Mutual labels:  hive
ucz-dfs
A distributed file system written in Rust.
Stars: ✭ 25 (-30.56%)
Mutual labels:  hdfs
hdocdb
HBase as a JSON Document Database
Stars: ✭ 24 (-33.33%)
Mutual labels:  hbase
disk
A distributed network disk system built on hadoop+hbase+springboot
Stars: ✭ 53 (+47.22%)
Mutual labels:  hbase
teraslice
Scalable data processing pipelines in JavaScript
Stars: ✭ 48 (+33.33%)
Mutual labels:  hdfs
young-examples
Sample code for typical application scenarios encountered in Java learning and projects
Stars: ✭ 21 (-41.67%)
Mutual labels:  bigdata
hadoop-etl-udfs
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-52.78%)
Mutual labels:  hive
awesome-hive
A curated list of awesome Hive resources.
Stars: ✭ 20 (-44.44%)
Mutual labels:  hive
twitter-archive-reader
Full featured TypeScript Twitter archive reader and browser
Stars: ✭ 43 (+19.44%)
Mutual labels:  bigdata
orion
Management and automation platform for Stateful Distributed Systems
Stars: ✭ 77 (+113.89%)
Mutual labels:  hbase
Real Time Social Media Mining
DevOps pipeline for Real Time Social/Web Mining
Stars: ✭ 22 (-38.89%)
Mutual labels:  hdfs
hayabusa
Hayabusa: Simple and Fast Full-Text Search Engine for Massive System Log Data
Stars: ✭ 43 (+19.44%)
Mutual labels:  bigdata
replicator
MySQL Replicator. Replicates MySQL tables to Kafka and HBase, keeping the data changes history in HBase.
Stars: ✭ 41 (+13.89%)
Mutual labels:  hbase
UnROOT.jl
Native Julia I/O package to work with CERN ROOT files
Stars: ✭ 52 (+44.44%)
Mutual labels:  bigdata
DaFlow
Apache Spark-based Data Flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-33.33%)
Mutual labels:  hive
hbase-packet-inspector
Analyzes network traffic of HBase RegionServers
Stars: ✭ 35 (-2.78%)
Mutual labels:  hbase
beekeeper
Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (+19.44%)
Mutual labels:  hive
Spark DB Connector
Use the Scala API to read/write data from different databases: HBase, MySQL, etc.
Stars: ✭ 24 (-33.33%)
Mutual labels:  hbase
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (+2.78%)
Mutual labels:  hive
xdu-cloudcourse-web
Sample web-side code for the Xidian University cloud computing course project
Stars: ✭ 26 (-27.78%)
Mutual labels:  hbase
gan deeplearning4j
Automatic feature engineering using Generative Adversarial Networks with Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-47.22%)
Mutual labels:  bigdata
Lidea
A real-time monitoring platform for large-scale distributed systems
Stars: ✭ 28 (-22.22%)
Mutual labels:  hbase
Spark-MLlib-Tutorial
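A comprehensive walkthrough of the basic algorithms in Spark MLlib, the machine learning library of the Spark big data framework, with a complete set of test files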
Stars: ✭ 32 (-11.11%)
Mutual labels:  bigdata
databricks-dbapi
DBAPI and SQLAlchemy dialect for Databricks Workspace and SQL Analytics clusters
Stars: ✭ 21 (-41.67%)
Mutual labels:  hive
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (-50%)
Mutual labels:  bigdata
NoSQLDataEngineering
NoSQL Data Engineering
Stars: ✭ 25 (-30.56%)
Mutual labels:  hbase
optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Stars: ✭ 1,351 (+3652.78%)
Mutual labels:  bigdata
anovos
Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark
Stars: ✭ 77 (+113.89%)
Mutual labels:  bigdata
beemos
BEE MOnitoring System: create an infrastructure for monitoring beehives
Stars: ✭ 16 (-55.56%)
Mutual labels:  hive
HiveJdbcStorageHandler
No description or website provided.
Stars: ✭ 21 (-41.67%)
Mutual labels:  hive
kafka-connect-fs
Kafka Connect FileSystem Connector
Stars: ✭ 107 (+197.22%)
Mutual labels:  hdfs
fense
Fense is a database proxy written in Java that can connect to databases of different engines at the same time. Key features include authority management, query cache, audit security, current limiting and circuit breaking, onesql, and so on.
Stars: ✭ 22 (-38.89%)
Mutual labels:  hive
phoenix
Apache Phoenix / Hbase Spring Boot Microservices
Stars: ✭ 23 (-36.11%)
Mutual labels:  hbase
hive compared bq
hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.
Stars: ✭ 27 (-25%)
Mutual labels:  hive
codefoundry
Examples for gauravbytes.com
Stars: ✭ 57 (+58.33%)
Mutual labels:  bigdata
workflUX
An open-source, cloud-ready web application for simplified deployment of big data workflows.
Stars: ✭ 26 (-27.78%)
Mutual labels:  bigdata
coolplayflink
Flink: Stateful Computations over Data Streams
Stars: ✭ 14 (-61.11%)
Mutual labels:  bigdata
61-120 of 386 similar projects