310 open source projects that are alternatives to or similar to Parquet Go

dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-73.68%)
Mutual labels:  parquet
Floating Elephants
Docker containers for Hadoop.
Stars: ✭ 19 (-83.33%)
Mutual labels:  hadoop
presto-client-php
A Presto client for the PHP programming language.
Stars: ✭ 24 (-78.95%)
Mutual labels:  presto
Apache Spark Hands On
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-35.09%)
Mutual labels:  hadoop
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-87.72%)
Mutual labels:  hadoop
Presto Redis
presto-redis is an experimental SQL layer for Redis
Stars: ✭ 18 (-84.21%)
Mutual labels:  presto
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+2057.02%)
Mutual labels:  presto
Wifi
A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-18.42%)
Mutual labels:  hadoop
BigData-News
A real-time big data system project for a news website, based on Spark 2.2
Stars: ✭ 36 (-68.42%)
Mutual labels:  hadoop
Szt Bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Stars: ✭ 826 (+624.56%)
Mutual labels:  hadoop
Addax
Addax is an open source universal ETL tool that supports most RDBMS and NoSQL databases, helping you transfer data from any one place to another.
Stars: ✭ 615 (+439.47%)
Mutual labels:  hadoop
Atsd
Axibase Time Series Database Documentation
Stars: ✭ 68 (-40.35%)
Mutual labels:  hadoop
fluent-plugin-webhdfs
Hadoop WebHDFS output plugin for Fluentd
Stars: ✭ 57 (-50%)
Mutual labels:  hadoop
Bigdataguide
Big data learning: learn big data from scratch, including study videos and interview materials for every stage of learning
Stars: ✭ 817 (+616.67%)
Mutual labels:  hadoop
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+142.11%)
Mutual labels:  parquet
Pyhive
Python interface to Hive and Presto. 🐝
Stars: ✭ 1,378 (+1108.77%)
Mutual labels:  presto
centurion
Kotlin Bigdata Toolkit
Stars: ✭ 320 (+180.7%)
Mutual labels:  parquet
Winutils
winutils.exe, hadoop.dll and hdfs.dll binaries for Hadoop on Windows
Stars: ✭ 657 (+476.32%)
Mutual labels:  hadoop
meepo
Data migration across heterogeneous storage
Stars: ✭ 29 (-74.56%)
Mutual labels:  parquet
Jumbune
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-43.86%)
Mutual labels:  hadoop
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-82.46%)
Mutual labels:  hadoop
Tony
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 626 (+449.12%)
Mutual labels:  hadoop
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-70.18%)
Mutual labels:  hadoop
Hadoop Mapreduce
Mirror of Apache Hadoop MapReduce
Stars: ✭ 88 (-22.81%)
Mutual labels:  hadoop
big-data-lite
Samples to the Oracle Big Data Lite VM
Stars: ✭ 41 (-64.04%)
Mutual labels:  hadoop
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+4861.4%)
Mutual labels:  hadoop
GooglePlay-Web-Crawler
MapReduce project using Hadoop, Nutch, AWS EMR, Pig, Tez, Hive
Stars: ✭ 18 (-84.21%)
Mutual labels:  hadoop
Petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Stars: ✭ 1,108 (+871.93%)
Mutual labels:  parquet
EngineeringTeam
A place to organize the materials of the YBIGTA engineering team.
Stars: ✭ 41 (-64.04%)
Mutual labels:  hadoop
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-14.91%)
Mutual labels:  parquet
Tf Yarn
Train TensorFlow models on YARN in just a few lines of code!
Stars: ✭ 76 (-33.33%)
Mutual labels:  hadoop
Bigdata Interview
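🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, along with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.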
Stars: ✭ 857 (+651.75%)
Mutual labels:  hadoop
Hadoop Mini Clusters
hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE
Stars: ✭ 265 (+132.46%)
Mutual labels:  hadoop
py-hdfs-mount
Mount HDFS with fuse, works with kerberos!
Stars: ✭ 13 (-88.6%)
Mutual labels:  hadoop
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1528.07%)
Mutual labels:  hadoop
CDH-Install-Manual
CDH installation manual
Stars: ✭ 70 (-38.6%)
Mutual labels:  hadoop
Bigdata
💎🔥 Big data study notes
Stars: ✭ 488 (+328.07%)
Mutual labels:  hadoop
cmux
A set of commands for managing CDH clusters using Cloudera Manager REST API.
Stars: ✭ 34 (-70.18%)
Mutual labels:  hadoop
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-49.12%)
Mutual labels:  parquet
platys-modern-data-platform
Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....
Stars: ✭ 35 (-69.3%)
Mutual labels:  hadoop
School Of Sre
At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.
Stars: ✭ 5,141 (+4409.65%)
Mutual labels:  hadoop
ros hadoop
Hadoop splittable InputFormat for ROS. Process rosbag with Hadoop Spark and other HDFS compatible systems.
Stars: ✭ 92 (-19.3%)
Mutual labels:  hadoop
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common bigdata binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-24.56%)
Mutual labels:  parquet
cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Stars: ✭ 16 (-85.96%)
Mutual labels:  hadoop
Presto Ethereum
Presto Ethereum Connector -- SQL on Ethereum
Stars: ✭ 450 (+294.74%)
Mutual labels:  presto
parquet-usql
A custom extractor designed to read parquet for Azure Data Lake Analytics
Stars: ✭ 13 (-88.6%)
Mutual labels:  parquet
Docker Spark Cluster
A Spark cluster setup running on Docker containers
Stars: ✭ 57 (-50%)
Mutual labels:  hadoop
fsbrowser
Fast desktop client for Hadoop Distributed File System
Stars: ✭ 27 (-76.32%)
Mutual labels:  hadoop
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+5170.18%)
Mutual labels:  hadoop
web-click-flow
Offline log analysis of website clickstreams
Stars: ✭ 14 (-87.72%)
Mutual labels:  hadoop
Bigdata Notebook
Stars: ✭ 100 (-12.28%)
Mutual labels:  hadoop
MLHadoop
This repository contains Machine-Learning MapReduce codes for Hadoop which are written from scratch (without using any package or library). E.g. Prediction (Linear and Logistic Regression), Clustering (K-Means), Classification (KNN) etc.
Stars: ✭ 50 (-56.14%)
Mutual labels:  hadoop
Marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
Stars: ✭ 414 (+263.16%)
Mutual labels:  hadoop
Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (+121.93%)
Mutual labels:  parquet
bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-82.46%)
Mutual labels:  presto
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-78.07%)
Mutual labels:  hadoop
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+948.25%)
Mutual labels:  hadoop
Hadoop Pot
A scalable Apache Hadoop-based implementation of the Pooled Time Series video similarity algorithm, based on the CVPR 2015 paper by M. Ryoo et al.
Stars: ✭ 8 (-92.98%)
Mutual labels:  hadoop
pulse
phData Pulse application log aggregation and monitoring
Stars: ✭ 13 (-88.6%)
Mutual labels:  hadoop
kubesql
A Presto-based tool that uses SQL to query Kubernetes resources such as pods and nodes.
Stars: ✭ 56 (-50.88%)
Mutual labels:  presto