Datax - DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto(Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (+222.22%)
Quicksql - A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+4958.33%)
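Bigdata Interview - 🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, together with my own answer summaries. Currently covers interview questions and knowledge summaries for the Hadoop/Hive/Spark/Flink/HBase/Kafka/ZooKeeper frameworks.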
Stars: ✭ 857 (+2280.56%)
Sylph - Stream computing platform for bigdata
Stars: ✭ 362 (+905.56%)
fdp-modelserver - An umbrella project for multiple implementations of model serving
Stars: ✭ 47 (+30.56%)
fastdata-cluster - Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-44.44%)
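DataX-src - DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, and ODPS.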
Stars: ✭ 21 (-41.67%)
Dataspherestudio - DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+3219.44%)
logparser - Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+286.11%)
TiBigData - TiDB connectors for Flink/Hive/Presto
Stars: ✭ 192 (+433.33%)
common-datax - A general-purpose data synchronization microservice based on DataX; a single RESTful interface handles all common data synchronization
Stars: ✭ 51 (+41.67%)
Tweet-Analysis-With-Kafka-and-Spark - A real-time analytics dashboard to analyze the trending hashtags and @ mentions at any location using Kafka and Spark Streaming.
Stars: ✭ 18 (-50%)
Janusgraph - JanusGraph: an open-source, distributed graph database
Stars: ✭ 4,277 (+11780.56%)
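Movie recommend - A Spark-based movie recommendation system, including a crawler project, a web site, a back-end management system, and the Spark recommendation system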
Stars: ✭ 2,092 (+5711.11%)
fense - Fense is a database proxy written in Java that can connect to databases of different engines at the same time. The key features are: authority management, query cache, audit security, rate limiting and circuit breaking, onesql, and so on
Stars: ✭ 22 (-38.89%)
hadoopoffice - HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
Stars: ✭ 56 (+55.56%)
ansible-role-test-vms - DEPRECATED - A Vagrant configuration to test Ansible roles against a variety of Linux distributions.
Stars: ✭ 42 (+16.67%)
cobra-policytool - Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-55.56%)
phi - an api-gateway based on openresty
Stars: ✭ 23 (-36.11%)
vagrant-xfce4-ubuntu - Vagrant-based development environment using Ubuntu and the Xfce Desktop Environment
Stars: ✭ 17 (-52.78%)
vulkn - Love your Data. Love the Environment. Love VULKИ.
Stars: ✭ 43 (+19.44%)
BnLMetsExporter - Command Line Interface (CLI) to export METS/ALTO documents to other formats.
Stars: ✭ 11 (-69.44%)
basic-solr-config - A starting point for solr schema, config and xslt.
Stars: ✭ 17 (-52.78%)
magento2-fast-vm - Optimal vagrant developer box for Magento2. Folders synced by nfs/rsync. This box includes Magento developer utilities.
Stars: ✭ 89 (+147.22%)
sig-windows-dev-tools - This is a batteries-included local development environment for Kubernetes on Windows.
Stars: ✭ 52 (+44.44%)
hiveberg - Demonstration of a Hive Input Format for Iceberg
Stars: ✭ 22 (-38.89%)
lua-mailgun - Lua bindings to Mailgun HTTP API
Stars: ✭ 25 (-30.56%)
redis cluster - an openresty nginx lua redis cluster
Stars: ✭ 26 (-27.78%)
rails-development-environment - Development environment for Ruby on Rails based on Vagrant, VirtualBox and Ubuntu 16.04 LTS (Xenial Xerus).
Stars: ✭ 50 (+38.89%)
Spark - Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.
Stars: ✭ 55 (+52.78%)
cmux - A set of commands for managing CDH clusters using Cloudera Manager REST API.
Stars: ✭ 34 (-5.56%)
spark-utils - Basic framework utilities to quickly start writing production-ready Apache Spark applications
Stars: ✭ 25 (-30.56%)
documentr - A naive solution to document schemas
Stars: ✭ 24 (-33.33%)
lua-practice - Practical test cases developed using lua together with redis, mysql, nginx, and others
Stars: ✭ 13 (-63.89%)
RillAdmin - vue + openresty/nodejs web admin
Stars: ✭ 34 (-5.56%)
turing - ✨ 🧬 Turing AI - Semantic Navigation, Chatbot using Search Engine and Many NLP Vendors.
Stars: ✭ 30 (-16.67%)
thrift2-hbase - thrift2-hbase component for Hyperf.
Stars: ✭ 14 (-61.11%)
clickhouse hadoop - Import data from clickhouse to hadoop with pure SQL
Stars: ✭ 26 (-27.78%)
talos - No description or website provided.
Stars: ✭ 37 (+2.78%)
misp-vagrant - Deploy MISP Project software with Vagrant.
Stars: ✭ 37 (+2.78%)
ubuntu-vagrant - Ubuntu Linux Vagrant Base Box (https://app.vagrantup.com/rgl)
Stars: ✭ 25 (-30.56%)
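FlinkTutorial - FlinkTutorial focuses on Flink stream processing technology for big data. It covers basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Developed in Java, with some core code in Scala as well. You are welcome to follow my blog and GitHub.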
Stars: ✭ 46 (+27.78%)
solrq - Python Solr query utility // http://solrq.readthedocs.org/en/latest/
Stars: ✭ 18 (-50%)
dev-with-docker-on-ubuntu - After fighting with Docker on OSX and the need for 2-way syncs, fsevents, etc., I developed a desire to get back to a simple(r) development environment on a Linux-based VM. This project is a jumping-off point.
Stars: ✭ 25 (-30.56%)
TIL - Today I Learned
Stars: ✭ 43 (+19.44%)
alexa-openwebif - Alexa skill to control your openwebif device
Stars: ✭ 25 (-30.56%)