Addax: Addax is an open source universal ETL tool that supports most RDBMS and NoSQL databases, helping you transfer data from one place to another.
Stars: ✭ 615 (+2265.38%)
Datax: DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto (Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (+346.15%)
openreplay: 📺 OpenReplay is developer-friendly, open-source session replay.
Stars: ✭ 6,131 (+23480.77%)
hive-jdbc-driver: An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (+19.23%)
LogAnalyzeHelper: Cleansing program for a forum log analysis system (includes an IP rule library, UDF development, MapReduce programs, and log data)
Stars: ✭ 33 (+26.92%)
memex-gate: General Architecture for Text Engineering
Stars: ✭ 47 (+80.77%)
yabr.os: Reading 1C bracket-format files (oscript)
Stars: ✭ 33 (+26.92%)
ClickHouseTools: Maintenance and development tools for Yandex ClickHouse, plus other interesting things
Stars: ✭ 16 (-38.46%)
disq: A library for manipulating bioinformatics sequencing formats in Apache Spark
Stars: ✭ 29 (+11.54%)
darwin: Avro Schema Evolution made easy
Stars: ✭ 26 (+0%)
disk: Distributed network disk system built on Hadoop, HBase, and Spring Boot
Stars: ✭ 53 (+103.85%)
liquibase-impala: Liquibase extension to add Impala Database support
Stars: ✭ 23 (-11.54%)
datalake-etl-pipeline: Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+50%)
sql exporter: Database agnostic SQL exporter for Prometheus
Stars: ✭ 72 (+176.92%)
hadoopoffice: HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (+115.38%)
hadoop-crypto: Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.
Stars: ✭ 38 (+46.15%)
click house: Modern Ruby database driver for ClickHouse
Stars: ✭ 133 (+411.54%)
learning-spark: Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (+7.69%)
iis: Information Inference Service of the OpenAIRE system
Stars: ✭ 16 (-38.46%)
xxhadoop: Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes, code, and demos. Don't fork, just star!
Stars: ✭ 37 (+42.31%)
wasp: WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-26.92%)
uptrace: Open source APM: OpenTelemetry traces, metrics, and logs
Stars: ✭ 1,187 (+4465.38%)
appmetrica-logsapi-loader: A tool for automatic data loading from AppMetrica LogsAPI into (local) ClickHouse
Stars: ✭ 18 (-30.77%)
chtable: Grafana's table plugin for ClickHouse
Stars: ✭ 26 (+0%)
presto: Teradata Distribution of Presto -- A Distributed SQL Query Engine for Big Data
Stars: ✭ 91 (+250%)
corc: An ORC File Scheme for the Cascading data processing platform.
Stars: ✭ 14 (-46.15%)
cds: Data syncing in golang for ClickHouse.
Stars: ✭ 839 (+3126.92%)
pyspark-ML-in-Colab: PySpark in Google Colab: A simple machine learning (Linear Regression) model
Stars: ✭ 32 (+23.08%)
trickster: Open Source HTTP Reverse Proxy Cache and Time Series Dashboard Accelerator
Stars: ✭ 1,753 (+6642.31%)
big-data-exploration: [Archive] Intern project - Big Data Exploration using MongoDB - This Repository is NOT a supported MongoDB product
Stars: ✭ 43 (+65.38%)
UBA: UEBA Solution for Insider Security. This repo is archived. Thanks!
Stars: ✭ 36 (+38.46%)
hadoop-etl-udfs: The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-34.62%)
dbal-clickhouse: Doctrine DBAL driver for ClickHouse database
Stars: ✭ 77 (+196.15%)
fense: Fense is a database proxy written in Java that can connect to databases of different engines at the same time. Key features include authority management, query caching, security auditing, rate limiting and circuit breaking, onesql, and so on
Stars: ✭ 22 (-15.38%)
sparkucx: A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (+23.08%)
dockerfiles: Multiple Docker container images for the main Big Data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ... )
Stars: ✭ 29 (+11.54%)
implyr: SQL backend to dplyr for Impala
Stars: ✭ 74 (+184.62%)
the-apache-ignite-book: All code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Stars: ✭ 65 (+150%)
rastercube: rastercube is a Python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Stars: ✭ 15 (-42.31%)
aaocp: A big data project for analyzing user behavior logs
Stars: ✭ 53 (+103.85%)
hive to es: A small tool for syncing Hive data warehouse data to Elasticsearch
Stars: ✭ 21 (-19.23%)
ClickHouseMigrator: Helps migrate data to ClickHouse, creating the database and table automatically.
Stars: ✭ 58 (+123.08%)
Proton: High performance Pinba server
Stars: ✭ 27 (+3.85%)
smart-data-lake: Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+203.85%)
oci-cloudera: Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)
Stars: ✭ 20 (-23.08%)
learning-hadoop-and-spark: Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Stars: ✭ 146 (+461.54%)
DaFlow: Apache Spark-based Data Flow (ETL) framework that supports multiple read and write destinations of different types and multiple categories of transformation rules.
Stars: ✭ 24 (-7.69%)