366 open source projects that are alternatives to or similar to hadoop-etl-udfs

Eyerissf
A simulator for the Eyeriss chip (a CNN accelerator researched at MIT) and a new DNN framework, "Hive"
Stars: ✭ 68 (+300%)
Mutual labels:  hive
Stormtweetssentimentd3viz
Computes and visualizes sentiment analysis of tweets from US states in real time using Storm.
Stars: ✭ 25 (+47.06%)
Mutual labels:  hadoop
MLHadoop
This repository contains machine-learning MapReduce code for Hadoop, written from scratch (without using any package or library), e.g. prediction (linear and logistic regression), clustering (K-Means), classification (KNN), etc.
Stars: ✭ 50 (+194.12%)
Mutual labels:  hadoop
albis
Albis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (+17.65%)
Mutual labels:  parquet
Movies-Analytics-in-Spark-and-Scala
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
Stars: ✭ 47 (+176.47%)
Mutual labels:  hadoop
phoenix
Apache Phoenix / HBase Spring Boot Microservices
Stars: ✭ 23 (+35.29%)
Mutual labels:  hadoop
Kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Stars: ✭ 916 (+5288.24%)
Mutual labels:  hadoop
hadoop-crypto
Library for per-file client-side encryption in Hadoop FileSystems such as HDFS or S3.
Stars: ✭ 38 (+123.53%)
Mutual labels:  hadoop
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.
Stars: ✭ 39 (+129.41%)
Mutual labels:  hadoop
miniparquet
Library to read a subset of Parquet files
Stars: ✭ 38 (+123.53%)
Mutual labels:  parquet
Docs4dev
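Documentation for commonly used backend development frameworks, with Chinese translations. Covers the Spring family (Spring, Spring Boot, Spring Cloud, Spring Security, Spring Session), big data (Apache Hive, HBase, Apache Flume), logging (Log4j2, Logback), HTTP servers (NGINX, Apache), Python, and databases (OpenTSDB, MySQL, PostgreSQL), providing the latest official documentation alongside the corresponding Chinese translations.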
Stars: ✭ 974 (+5629.41%)
Mutual labels:  hive
Floating Elephants
Docker containers for Hadoop.
Stars: ✭ 19 (+11.76%)
Mutual labels:  hadoop
hadoop-ecosystem
Visualizations of the Hadoop Ecosystem
Stars: ✭ 20 (+17.65%)
Mutual labels:  hadoop
spark-waimai
A Spark-based big data analytics platform for food delivery
Stars: ✭ 24 (+41.18%)
Mutual labels:  hive
Awkward 0.x
Manipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (+1170.59%)
Mutual labels:  parquet
lib mysqludf redis
Provides MySQL UDF commands to synchronize data from MySQL to Redis.
Stars: ✭ 20 (+17.65%)
Mutual labels:  udf
Sqlite Parquet Vtable
A SQLite vtable extension to read Parquet files
Stars: ✭ 167 (+882.35%)
Mutual labels:  parquet
hivemind
Hive API server (offloads most API calls from hived) implemented using Python+SQL
Stars: ✭ 46 (+170.59%)
Mutual labels:  hive
parquet-extra
A collection of Apache Parquet add-on modules
Stars: ✭ 30 (+76.47%)
Mutual labels:  parquet
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (+541.18%)
Mutual labels:  parquet
hiveberg
Demonstration of a Hive Input Format for Iceberg
Stars: ✭ 22 (+29.41%)
Mutual labels:  hive
docker-hadoop
Docker image for the main Apache Hadoop components (YARN/HDFS)
Stars: ✭ 59 (+247.06%)
Mutual labels:  hadoop
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-70.59%)
Mutual labels:  hadoop
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (+405.88%)
Mutual labels:  parquet
BigDataTools
Tools for big data
Stars: ✭ 36 (+111.76%)
Mutual labels:  hive
columnify
Converts record-oriented data to columnar format.
Stars: ✭ 28 (+64.71%)
Mutual labels:  parquet
Pyetl
Python ETL framework
Stars: ✭ 33 (+94.12%)
Mutual labels:  hive
Winutils
winutils.exe, hadoop.dll and hdfs.dll binaries for Hadoop on Windows
Stars: ✭ 657 (+3764.71%)
Mutual labels:  hadoop
Gcs Tools
GCS support for avro-tools, parquet-tools and protobuf
Stars: ✭ 57 (+235.29%)
Mutual labels:  parquet
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+5823.53%)
Mutual labels:  parquet
Hiverunner
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
Stars: ✭ 225 (+1223.53%)
Mutual labels:  hive
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-5.88%)
Mutual labels:  parquet
Useractionanalyzeplatform
Big data platform for e-commerce user behavior analysis
Stars: ✭ 645 (+3694.12%)
Mutual labels:  hadoop
Skale
High performance distributed data processing engine
Stars: ✭ 390 (+2194.12%)
Mutual labels:  parquet
HDFS-Netdisc
A Hadoop-based distributed cloud storage system 🌴
Stars: ✭ 56 (+229.41%)
Mutual labels:  hadoop
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+1917.65%)
Mutual labels:  parquet
DataX-src
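DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, and ODPS.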
Stars: ✭ 21 (+23.53%)
Mutual labels:  hive
Databook
A Facebook for data
Stars: ✭ 26 (+52.94%)
Mutual labels:  hive
Tony
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 626 (+3582.35%)
Mutual labels:  hadoop
Pystore
Fast data store for Pandas time-series data
Stars: ✭ 325 (+1811.76%)
Mutual labels:  parquet
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+1541.18%)
Mutual labels:  parquet
learning-spark
Tidy up Spark and Hadoop tutorials.
Stars: ✭ 28 (+64.71%)
Mutual labels:  hadoop
Javapdf
🍣 100 Java e-books: technical books in PDF (take pride in downloading and reading; be ashamed of only liking and bookmarking)
Stars: ✭ 609 (+3482.35%)
Mutual labels:  hadoop
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+33170.59%)
Mutual labels:  hadoop
HybridBackend
Efficient training of deep recommenders on cloud.
Stars: ✭ 30 (+76.47%)
Mutual labels:  parquet
Hadoop Attack Library
A collection of pentest tools and resources targeting Hadoop environments
Stars: ✭ 228 (+1241.18%)
Mutual labels:  hadoop
meepo
Data migration across heterogeneous storage systems
Stars: ✭ 29 (+70.59%)
Mutual labels:  parquet
Hive
Lightweight and blazing fast key-value database written in pure Dart.
Stars: ✭ 2,681 (+15670.59%)
Mutual labels:  hive
graphique
GraphQL service for arrow tables and parquet data sets.
Stars: ✭ 28 (+64.71%)
Mutual labels:  parquet
Hadoop Connectors
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Stars: ✭ 218 (+1182.35%)
Mutual labels:  hadoop
parquet-usql
A custom extractor designed to read parquet for Azure Data Lake Analytics
Stars: ✭ 13 (-23.53%)
Mutual labels:  parquet
logparser
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Pig, Flink, Beam, Storm, Drill, ...
Stars: ✭ 139 (+717.65%)
Mutual labels:  hive
Calcite
Apache Calcite
Stars: ✭ 2,816 (+16464.71%)
Mutual labels:  hadoop
beekeeper
Service for automatically managing and cleaning up unreferenced data
Stars: ✭ 43 (+152.94%)
Mutual labels:  hive
sparkucx
A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing the UCX communication layer
Stars: ✭ 32 (+88.24%)
Mutual labels:  hadoop
databricks-dbapi
DBAPI and SQLAlchemy dialect for Databricks Workspace and SQL Analytics clusters
Stars: ✭ 21 (+23.53%)
Mutual labels:  hive
skein
A tool and library for easily deploying applications on Apache YARN
Stars: ✭ 128 (+652.94%)
Mutual labels:  hadoop
qs-hadoop
Learning the big data ecosystem
Stars: ✭ 18 (+5.88%)
Mutual labels:  hadoop
gomrjob
gomrjob - a Go Framework for Hadoop Map Reduce Jobs
Stars: ✭ 39 (+129.41%)
Mutual labels:  hadoop
Dist Keras
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Stars: ✭ 613 (+3505.88%)
Mutual labels:  hadoop
301-360 of 366 similar projects