bytedance / clickhouse_hadoop

Licence: other
Import data from clickhouse to hadoop with pure SQL

ClickHouse Hadoop

Integrate ClickHouse natively with Hive. Currently only writing (from Hive into ClickHouse) is supported. The project connects Hadoop's massive data storage and deep processing power with the high performance of ClickHouse.

Build the Project

mvn package -Phadoop26 -DskipTests

Run the test cases

The test cases require a clickhouse-server running on localhost.

Usage

Create ClickHouse table

CREATE TABLE hive_test
(
    c1 String,
    c2 Float64,
    c3 String
)
ENGINE = MergeTree()
PARTITION BY c3
ORDER BY c1
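
The Hive external table created in the next step declares its columns with Hive types that pair with the ClickHouse types above: `string` with `String` and `double` with `Float64`. The following is a minimal sketch of that pairing, not the project's own code. The class `TypePairingSketch` and its method are hypothetical names, and only the two pairings that appear in this README are included; whether other types are supported is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of clickhouse_hadoop.
final class TypePairingSketch {
    private static final Map<String, String> HIVE_TO_CLICKHOUSE = new LinkedHashMap<>();

    static {
        // Only the two pairings used by the example tables in this README.
        HIVE_TO_CLICKHOUSE.put("string", "String");
        HIVE_TO_CLICKHOUSE.put("double", "Float64");
    }

    // Returns the ClickHouse type paired with a Hive type name (case-insensitive),
    // or throws if the README example does not cover that type.
    static String clickHouseTypeFor(String hiveType) {
        String key = hiveType.trim().toLowerCase(Locale.ROOT);
        String clickHouseType = HIVE_TO_CLICKHOUSE.get(key);
        if (clickHouseType == null) {
            throw new IllegalArgumentException("No pairing known for Hive type: " + hiveType);
        }
        return clickHouseType;
    }
}
```

For example, `TypePairingSketch.clickHouseTypeFor("double")` returns `"Float64"`, matching column `c2` in both table definitions.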

Create Hive External Table

Before starting the hive-cli, set the environment variable HIVE_AUX_JARS_PATH:

export HIVE_AUX_JARS_PATH=<path-to-your-project>/target/clickhouse-hadoop-<version>.jar

Then start the hive-cli and create the Hive external table:

CREATE EXTERNAL TABLE default.ck_test(
   c1 string,
   c2 double,
   c3 string
)
STORED BY 'data.bytedance.net.ck.hive.ClickHouseStorageHandler'
TBLPROPERTIES('clickhouse.conn.urls'='jdbc:clickhouse://<host1>:<port1>,jdbc:clickhouse://<host2>:<port2>',
'clickhouse.table.name'='hive_test');
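
The `clickhouse.conn.urls` property holds one or more JDBC URLs separated by commas. As a rough illustration of that format, the sketch below splits such a value into individual URLs and rejects entries that do not start with `jdbc:clickhouse://`. It is an assumption about how a value of this shape could be validated, not the storage handler's actual parsing logic; the class `ConnUrlsSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of clickhouse_hadoop.
final class ConnUrlsSketch {
    private static final String PREFIX = "jdbc:clickhouse://";

    // Splits a comma-separated property value into trimmed JDBC URLs.
    // Empty entries are skipped; entries with another scheme are rejected.
    static List<String> parse(String property) {
        List<String> urls = new ArrayList<>();
        if (property == null) {
            return urls;
        }
        for (String part : property.split(",")) {
            String url = part.trim();
            if (url.isEmpty()) {
                continue;
            }
            if (!url.startsWith(PREFIX)) {
                throw new IllegalArgumentException("Not a ClickHouse JDBC URL: " + url);
            }
            urls.add(url);
        }
        return urls;
    }
}
```

Checking the value this way before running the `CREATE EXTERNAL TABLE` statement can catch a stray space or a wrong scheme early, since Hive stores table properties as plain strings.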

Data Ingestion

In the hive-cli:

INSERT INTO default.ck_test
SELECT c1, c2, c3 FROM default.source_table WHERE part = 'part_val';