310 open source projects that are alternatives to or similar to Parquet Go

Wifi
A big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-18.42%)
Mutual labels:  hadoop
Szt Bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Stars: ✭ 826 (+624.56%)
Mutual labels:  hadoop
Atsd
Axibase Time Series Database Documentation
Stars: ✭ 68 (-40.35%)
Mutual labels:  hadoop
Bigdataguide
Big data learning: learn big data from scratch, with study videos and interview materials for each stage of learning
Stars: ✭ 817 (+616.67%)
Mutual labels:  hadoop
Pyhive
Python interface to Hive and Presto. 🐝
Stars: ✭ 1,378 (+1108.77%)
Mutual labels:  presto
Winutils
winutils.exe, hadoop.dll, and hdfs.dll binaries for Hadoop on Windows
Stars: ✭ 657 (+476.32%)
Mutual labels:  hadoop
Jumbune
Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,
Stars: ✭ 64 (-43.86%)
Mutual labels:  hadoop
Tony
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Stars: ✭ 626 (+449.12%)
Mutual labels:  hadoop
Hadoop Mapreduce
Mirror of Apache Hadoop MapReduce
Stars: ✭ 88 (-22.81%)
Mutual labels:  hadoop
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+4861.4%)
Mutual labels:  hadoop
Petastorm
Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Stars: ✭ 1,108 (+871.93%)
Mutual labels:  parquet
Gather Deployment
Gathers scalable tensorflow and infrastructure deployment
Stars: ✭ 326 (+185.96%)
Mutual labels:  hadoop
Waterdrop
Production Ready Data Integration Product, documentation:
Stars: ✭ 1,856 (+1528.07%)
Mutual labels:  hadoop
Bigdata
💎🔥 Big data study notes
Stars: ✭ 488 (+328.07%)
Mutual labels:  hadoop
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-49.12%)
Mutual labels:  parquet
School Of Sre
At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.
Stars: ✭ 5,141 (+4409.65%)
Mutual labels:  hadoop
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats such as Parquet, ORC, and AVRO. Supports local file systems, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (-24.56%)
Mutual labels:  parquet
Presto Ethereum
Presto Ethereum Connector -- SQL on Ethereum
Stars: ✭ 450 (+294.74%)
Mutual labels:  presto
Docker Spark Cluster
A Spark cluster setup running on Docker containers
Stars: ✭ 57 (-50%)
Mutual labels:  hadoop
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery begins here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+5170.18%)
Mutual labels:  hadoop
Bigdata Notebook
Stars: ✭ 100 (-12.28%)
Mutual labels:  hadoop
Hadoop Solr
Code to index HDFS to Solr using MapReduce
Stars: ✭ 51 (-55.26%)
Mutual labels:  hadoop
Akkeeper
An easy way to deploy your Akka services to a distributed environment.
Stars: ✭ 30 (-73.68%)
Mutual labels:  hadoop
Cascading
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster. See https://github.com/Cascading/cascading for the release repository.
Stars: ✭ 318 (+178.95%)
Mutual labels:  hadoop
Kafka Connect Hdfs
Kafka Connect HDFS connector
Stars: ✭ 400 (+250.88%)
Mutual labels:  hadoop
Sparksql Protobuf
Read SparkSQL parquet file as RDD[Protobuf]
Stars: ✭ 82 (-28.07%)
Mutual labels:  parquet
Node Parquet
NodeJS module to access apache parquet format files
Stars: ✭ 46 (-59.65%)
Mutual labels:  parquet
Skale
High performance distributed data processing engine
Stars: ✭ 390 (+242.11%)
Mutual labels:  parquet
Avro Hadoop Starter
Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Stars: ✭ 110 (-3.51%)
Mutual labels:  hadoop
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+3244.74%)
Mutual labels:  hadoop
Quilt
Quilt is a self-organizing data hub for S3
Stars: ✭ 1,007 (+783.33%)
Mutual labels:  parquet
Sqlpad
Web-based SQL editor run in your own private cloud. Supports MySQL, Postgres, SQL Server, Vertica, Crate, ClickHouse, Trino, Presto, SAP HANA, Cassandra, Snowflake, BigQuery, SQLite, and more with ODBC
Stars: ✭ 4,113 (+3507.89%)
Mutual labels:  presto
Camus
Mirror of LinkedIn's Camus
Stars: ✭ 81 (-28.95%)
Mutual labels:  hadoop
Wedatasphere
WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+226.32%)
Mutual labels:  hadoop
Weblogsanalysissystem
A big data platform for analyzing web access logs
Stars: ✭ 37 (-67.54%)
Mutual labels:  hadoop
Parquet Cpp
Apache Parquet
Stars: ✭ 339 (+197.37%)
Mutual labels:  parquet
Antsdb
AntsDB is a low latency, high concurrency, MySQL compliant SQL layer for HBase
Stars: ✭ 99 (-13.16%)
Mutual labels:  hadoop
Ozone
Scalable, redundant, and distributed object store for Apache Hadoop
Stars: ✭ 330 (+189.47%)
Mutual labels:  hadoop
Jsr203 Hadoop
A Java NIO file system provider for HDFS
Stars: ✭ 35 (-69.3%)
Mutual labels:  hadoop
Pystore
Fast data store for Pandas time-series data
Stars: ✭ 325 (+185.09%)
Mutual labels:  parquet
Learn machine learning
Road to Machine Learning
Stars: ✭ 81 (-28.95%)
Mutual labels:  hadoop
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-74.56%)
Mutual labels:  parquet
Tez
Apache Tez
Stars: ✭ 313 (+174.56%)
Mutual labels:  hadoop
Hadoop Book
Example source code accompanying O'Reilly's "Hadoop: The Definitive Guide" by Tom White
Stars: ✭ 3,317 (+2809.65%)
Mutual labels:  hadoop
Spline
Data Lineage Tracking And Visualization Solution
Stars: ✭ 306 (+168.42%)
Mutual labels:  hadoop
Kglab
Graph-Based Data Science: an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries – atop Pandas, RDFlib, pySHACL, RAPIDS, NetworkX, iGraph, PyVis, pslpython, pyarrow, etc.
Stars: ✭ 98 (-14.04%)
Mutual labels:  parquet
Docker Spark
🚢 Docker image for Apache Spark
Stars: ✭ 78 (-31.58%)
Mutual labels:  hadoop
Data Algorithms Book
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Stars: ✭ 949 (+732.46%)
Mutual labels:  hadoop
Cloudbreak
A tool for provisioning and managing Apache Hadoop clusters in the cloud. Cloudbreak, as part of the Hortonworks Data Platform, makes it easy to provision, configure and elastically grow HDP clusters on cloud infrastructure. Cloudbreak can be used to provision Hadoop across cloud infrastructure providers including AWS, Azure, GCP and OpenStack.
Stars: ✭ 301 (+164.04%)
Mutual labels:  hadoop
Elasticsearch loader
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (+163.16%)
Mutual labels:  parquet
Storm Camel Example
Real-time analysis and visualization with Storm-AMQ-Camel-Websockets-Highcharts integration.
Stars: ✭ 28 (-75.44%)
Mutual labels:  hadoop
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (+161.4%)
Mutual labels:  hadoop
Behemoth
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Stars: ✭ 286 (+150.88%)
Mutual labels:  hadoop
Chukwa
Mirror of Apache Chukwa
Stars: ✭ 77 (-32.46%)
Mutual labels:  hadoop
Interview Questions Collection
Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more
Stars: ✭ 21 (-81.58%)
Mutual labels:  hadoop
Android Nosql
Lightweight, simple structured NoSQL database for Android
Stars: ✭ 284 (+149.12%)
Mutual labels:  hadoop
Cdc Kafka Hadoop
MySQL to NoSQL real time dataflow
Stars: ✭ 13 (-88.6%)
Mutual labels:  hadoop
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (+144.74%)
Mutual labels:  parquet
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (+142.11%)
Mutual labels:  parquet
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-14.91%)
Mutual labels:  parquet