smart-data-lake: Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (+259.09%)
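dpkb: A collection of big data material covering distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse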
Stars: ✭ 123 (+459.09%)
Hadoop Docker: A Docker-based Hadoop development and test environment that includes Hadoop, Hive, HBase, and Spark
Stars: ✭ 238 (+981.82%)
Mzinga: Open-source software to play the board game Hive.
Stars: ✭ 57 (+159.09%)
data-profiling: A set of scripts to pull metadata and data profiling metrics from relational database systems
Stars: ✭ 57 (+159.09%)
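DataX-src: DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, and ODPS.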
Stars: ✭ 21 (-4.55%)
hive-cube: Data self-exporting and monitoring platform based on the Hive data warehouse. https://hc.smartloli.org
Stars: ✭ 34 (+54.55%)
the-apache-ignite-book: All code samples, scripts, and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Stars: ✭ 65 (+195.45%)
Hive Jdbc Uber Jar: Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Stars: ✭ 188 (+754.55%)
regln: Windows Registry Linking Utility
Stars: ✭ 38 (+72.73%)
hadoopoffice: HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (+154.55%)
radiator: Hive Ruby API Client
Stars: ✭ 49 (+122.73%)
hiveql-parser: HiveQL parser. Parses HiveQL code and prints the AST in JSON format on success; otherwise prints a well-formed syntax error message.
Stars: ✭ 25 (+13.64%)
beemos: BEE MOnitoring System, which creates an infrastructure for monitoring beehives
Stars: ✭ 16 (-27.27%)
DaFlow: Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (+9.09%)
Helicalinsight: Helical Insight software is the world’s first Open Source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.
Stars: ✭ 214 (+872.73%)
fense: Fense is a database proxy written in Java that can connect to databases of different engines at the same time. Key features include authority management, query caching, security auditing, rate limiting and circuit breaking, onesql, and so on.
Stars: ✭ 22 (+0%)
Xsql: Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (+700%)
hadoop-etl-udfs: The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (-22.73%)
Presto: The official home of the Presto distributed SQL query engine for big data
Stars: ✭ 12,957 (+58795.45%)
Hive Third Functions: Some useful custom Hive UDF functions, especially array, JSON, math, and string functions.
Stars: ✭ 151 (+586.36%)
hive to es: A small tool for syncing data from a Hive data warehouse to Elasticsearch
Stars: ✭ 21 (-4.55%)
databricks-dbapi: DBAPI and SQLAlchemy dialect for Databricks Workspace and SQL Analytics clusters
Stars: ✭ 21 (-4.55%)
logparser: Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+531.82%)
aaocp: A big data project that analyzes user behavior logs
Stars: ✭ 53 (+140.91%)
Hive: Fast. Scalable. Powerful. The Blockchain for Web 3.0
Stars: ✭ 142 (+545.45%)
simple-ddl-parser: Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, BigQuery, Snowflake and other dialects) DDL files to JSON/Python dict with full information about columns (types, defaults, primary keys, etc.) and table properties, types, domains, etc.
Stars: ✭ 76 (+245.45%)
TiBigData: TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+772.73%)
herd-mdl: Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
Stars: ✭ 11 (-50%)
HiveRunner: An Open Source unit test framework for Hive queries based on JUnit 4 and 5
Stars: ✭ 244 (+1009.09%)
xxhadoop: Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes/code/demos. Don't fork, just star!
Stars: ✭ 37 (+68.18%)
Sub-Track: Flutter Application to keep track of Subscriptions
Stars: ✭ 31 (+40.91%)
hive-jdbc-driver: An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Stars: ✭ 31 (+40.91%)
common-datax: A general-purpose data synchronization microservice based on DataX; a single RESTful interface handles all general data synchronization
Stars: ✭ 51 (+131.82%)
hive compared bq: hive_compared_bq compares/validates two (SQL-like) tables and graphically shows the rows/columns that are different.
Stars: ✭ 27 (+22.73%)
Hiverunner: An Open Source unit test framework for Hive queries based on JUnit 4 and 5
Stars: ✭ 225 (+922.73%)
awesome-hive: A curated list of awesome Hive resources.
Stars: ✭ 20 (-9.09%)
liquibase-impala: Liquibase extension to add Impala Database support
Stars: ✭ 23 (+4.55%)
Hive: Lightweight and blazing fast key-value database written in pure Dart.
Stars: ✭ 2,681 (+12086.36%)
dockerfiles: Multiple Docker container images for the main Big Data tools (Hadoop, Spark, Kafka, HBase, Cassandra, Zookeeper, Zeppelin, Drill, Flink, Hive, Hue, Mesos, ...)
Stars: ✭ 29 (+31.82%)
Linkis: Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+10459.09%)
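Movie recommend: A Spark-based movie recommendation system comprising a crawler project, a website, a back-office management system, and a Spark recommendation system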
Stars: ✭ 2,092 (+9409.09%)
beekeeper: Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (+95.45%)
last fm: A simple app to demonstrate a testable, maintainable, and scalable architecture for Flutter. flutter_bloc, get_it, hive, and REST API are some of the technologies used in this project.
Stars: ✭ 134 (+509.09%)
waggle-dance: Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
Stars: ✭ 194 (+781.82%)
apiary: Apiary provides modules which can be combined to create a federated cloud data lake
Stars: ✭ 30 (+36.36%)
analyzing-reddit-sentiment-with-aws: Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing Reddit comments in real time. 100-200 level tutorial.
Stars: ✭ 40 (+81.82%)