824 open source projects that are alternatives to or similar to Drill

Parquet Cpp
Apache Parquet
Stars: ✭ 339 (-79.06%)
Mutual labels:  big-data, parquet
Bigdata
💎🔥 Big data study notes
Stars: ✭ 488 (-69.86%)
Mutual labels:  hive, hadoop
Ignite
Apache Ignite
Stars: ✭ 4,027 (+148.73%)
Mutual labels:  big-data, hadoop
Maha
A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Stars: ✭ 101 (-93.76%)
Mutual labels:  big-data, hive
Parquet Rs
Apache Parquet implementation in Rust
Stars: ✭ 144 (-91.11%)
Mutual labels:  hadoop, parquet
Amazon S3 Find And Forget
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)
Stars: ✭ 115 (-92.9%)
Mutual labels:  big-data, parquet
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (-90.06%)
Mutual labels:  hive, hadoop
Linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+43.48%)
Mutual labels:  hive, jdbc
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (-77.02%)
Mutual labels:  hive, hadoop
Parquetviewer
Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (-91.04%)
Mutual labels:  big-data, parquet
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+135.52%)
Mutual labels:  big-data, hadoop
Orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Stars: ✭ 389 (-75.97%)
Mutual labels:  big-data, hadoop
God Of Bigdata
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+271.09%)
Mutual labels:  hive, hadoop
Gimel
Big Data Processing Framework - Unified Data API or SQL on Any Storage
Stars: ✭ 216 (-86.66%)
Mutual labels:  big-data, jdbc
Awkward 0.x
Manipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (-86.66%)
Mutual labels:  big-data, parquet
smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Stars: ✭ 79 (-95.12%)
Mutual labels:  hive, hadoop
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-86.72%)
Mutual labels:  big-data, hadoop
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (-75.73%)
Mutual labels:  hadoop, parquet
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+1261.83%)
Mutual labels:  big-data, hadoop
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+249.35%)
Mutual labels:  big-data, hadoop
hive to es
A small tool for syncing Hive data warehouse data to Elasticsearch
Stars: ✭ 21 (-98.7%)
Mutual labels:  hive, hadoop
hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
Stars: ✭ 16 (-99.01%)
Mutual labels:  hive, hadoop
Helicalinsight
Helical Insight software is the world’s first open source business intelligence framework, which helps you make sense of your data and make well-informed decisions.
Stars: ✭ 214 (-86.78%)
Mutual labels:  big-data, hive
terraform-aws-kinesis-firehose
This code creates a Kinesis Firehose in AWS to send CloudWatch log data to S3.
Stars: ✭ 25 (-98.46%)
Mutual labels:  big-data, parquet
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (-97.59%)
Mutual labels:  big-data, hadoop
xxhadoop
Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes/code/demos. Don't fork, just star!
Stars: ✭ 37 (-97.71%)
Mutual labels:  hive, hadoop
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer
Stars: ✭ 32 (-98.02%)
Mutual labels:  big-data, hadoop
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-99.69%)
Mutual labels:  big-data, hadoop
Bigdataguide
Big data learning: learn big data from scratch, including study videos and interview materials for each stage of big data learning
Stars: ✭ 817 (-49.54%)
Mutual labels:  hive, hadoop
Szt Bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Stars: ✭ 826 (-48.98%)
Mutual labels:  hive, hadoop
Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Stars: ✭ 47 (-97.1%)
Mutual labels:  big-data, hadoop
cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-99.01%)
Mutual labels:  hive, hadoop
aaocp
A big data project that analyzes user behavior logs
Stars: ✭ 53 (-96.73%)
Mutual labels:  hive, hadoop
GooglePlay-Web-Crawler
MapReduce project using Hadoop, Nutch, AWS EMR, Pig, Tez, Hive
Stars: ✭ 18 (-98.89%)
Mutual labels:  hive, hadoop
TitanDataOperationSystem
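The best big data project. The Titan Data Operation System is a full-stack, closed-loop system: a web system for data visualization, flume-kafka-flume for reading logs, a data warehouse designed in Hive, Spark code for transforming between warehouse tables and migrating ADS-layer tables to MySQL, and Azkaban for scheduling timed jobs. Technologies used: Java/Scala, Hadoop, Spark, Hive, Kafka, Flume, Azkaban, SpringBoot, Bootstrap, Echart, etc.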
Stars: ✭ 62 (-96.17%)
Mutual labels:  hive, hadoop
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-97.9%)
Mutual labels:  big-data, hadoop
HiveJdbcStorageHandler
No description or website provided.
Stars: ✭ 21 (-98.7%)
Mutual labels:  hive, jdbc
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-94.38%)
Mutual labels:  big-data, hive
Addax
Addax is an open source universal ETL tool that supports most of those RDBMS and NoSQLs on the planet, helping you transfer data from any one place to another.
Stars: ✭ 615 (-62.01%)
Mutual labels:  hive, hadoop
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+51.88%)
Mutual labels:  hive, jdbc
hadoop-data-ingestion-tool
OLAP and ETL of Big Data
Stars: ✭ 17 (-98.95%)
Mutual labels:  big-data, hadoop
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (-82.95%)
Mutual labels:  big-data, parquet
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-99.14%)
Mutual labels:  big-data, hadoop
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-36.69%)
Mutual labels:  big-data, hadoop
implyr
SQL backend to dplyr for Impala
Stars: ✭ 74 (-95.43%)
Mutual labels:  hadoop, jdbc
Tez
Apache Tez
Stars: ✭ 313 (-80.67%)
Mutual labels:  big-data, hadoop
Parquet Format
Apache Parquet
Stars: ✭ 800 (-50.59%)
Mutual labels:  big-data, parquet
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+1852.93%)
Mutual labels:  big-data, jdbc
Docker Spark Cluster
A Spark cluster setup running on Docker containers
Stars: ✭ 57 (-96.48%)
Mutual labels:  big-data, hadoop
Lychee
The most complete and powerful data-binding library and persistence infra for Kotlin 1.3, Android & Splitties Views DSL, JavaFX & TornadoFX, JSON, JDBC & SQLite, SharedPreferences.
Stars: ✭ 102 (-93.7%)
Mutual labels:  jdbc
Dataengineeringproject
Example end to end data engineering project.
Stars: ✭ 82 (-94.94%)
Mutual labels:  big-data
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-94.69%)
Mutual labels:  parquet
Genie
Distributed Big Data Orchestration Service
Stars: ✭ 1,544 (-4.63%)
Mutual labels:  big-data
Pyhive
Python interface to Hive and Presto. 🐝
Stars: ✭ 1,378 (-14.89%)
Mutual labels:  hive
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-94.81%)
Mutual labels:  hive
Snowflake Jdbc
Snowflake JDBC Driver
Stars: ✭ 83 (-94.87%)
Mutual labels:  jdbc
Sparksql Protobuf
Read SparkSQL parquet file as RDD[Protobuf]
Stars: ✭ 82 (-94.94%)
Mutual labels:  parquet
Bigdata Notebook
Stars: ✭ 100 (-93.82%)
Mutual labels:  hadoop
Docker Hadoop Cluster
Multiple node cluster on Docker for self development.
Stars: ✭ 82 (-94.94%)
Mutual labels:  hadoop
Camus
Mirror of Linkedin's Camus
Stars: ✭ 81 (-95%)
Mutual labels:  hadoop