572 open source projects that are alternatives to or similar to Pucket

Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+1300%)
Mutual labels:  spark, parquet, hdfs
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (+100%)
Mutual labels:  spark, parquet, hdfs
Learning Spark
Learn Spark from scratch; big data learning
Stars: ✭ 37 (+27.59%)
Mutual labels:  spark, hdfs
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+1255.17%)
Mutual labels:  spark, parquet
Ibis
A pandas-like deferred expression system, with first-class SQL support
Stars: ✭ 1,630 (+5520.69%)
Mutual labels:  hdfs, spark
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (+144.83%)
Mutual labels:  spark, hdfs
Oap
Optimized Analytics Package for Spark* Platform
Stars: ✭ 343 (+1082.76%)
Mutual labels:  spark, parquet
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (+455.17%)
Mutual labels:  spark, hdfs
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-51.72%)
Mutual labels:  spark, hdfs
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+5562.07%)
Mutual labels:  spark, parquet
fastdata-cluster
Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)
Stars: ✭ 20 (-31.03%)
Mutual labels:  spark, hdfs
Yandex Big Data Engineering
Stars: ✭ 17 (-41.38%)
Mutual labels:  spark, hdfs
Repository
Personal learning knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (+217.24%)
Mutual labels:  spark, hdfs
experiments
Code examples for my blog posts
Stars: ✭ 21 (-27.59%)
Mutual labels:  spark, parquet
Bigdata Interview
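🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from the web, with my own answer summaries. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.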
Stars: ✭ 857 (+2855.17%)
Mutual labels:  spark, hdfs
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (+275.86%)
Mutual labels:  spark, parquet
parquet-flinktacular
How to use Parquet in Flink
Stars: ✭ 29 (+0%)
Mutual labels:  thrift, parquet
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-44.83%)
Mutual labels:  spark, hdfs
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (+196.55%)
Mutual labels:  parquet, hdfs
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (+234.48%)
Mutual labels:  spark, parquet
Bigdata Notes
Beginner's guide to big data ⭐
Stars: ✭ 10,991 (+37800%)
Mutual labels:  spark, hdfs
wasp
WASP is a framework to build complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze them through complex pipelines, this is the framework for you.
Stars: ✭ 19 (-34.48%)
Mutual labels:  hdfs, parquet
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (+417.24%)
Mutual labels:  spark, hdfs
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-44.83%)
Mutual labels:  spark, parquet
Kyuubi
Kyuubi is a unified multi-tenant JDBC interface for large-scale data processing and analytics, built on top of Apache Spark
Stars: ✭ 363 (+1151.72%)
Mutual labels:  thrift, spark
leaflet heatmap
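A simple visualization of Huzhou call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap drawing step is moved to offline computation and analysis. Apache Spark computes the data in parallel and then draws the heatmap, after which leafletjs loads the OpenStreetMap layer and the heatmap layer to achieve good interactivity. The drawing is currently implemented with Apache Spark; perhaps because Apache Spark is not well suited to this kind of computation, or because I did not design the algorithm well, the parallel computation is slower than single-machine computation. The Apache Spark heatmap drawing and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .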
Stars: ✭ 13 (-55.17%)
Mutual labels:  spark, hdfs
God Of Bigdata
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
Stars: ✭ 6,008 (+20617.24%)
Mutual labels:  spark, hdfs
Sparta
Real Time Analytics and Data Pipelines based on Spark Streaming
Stars: ✭ 513 (+1668.97%)
Mutual labels:  spark, hdfs
Snakebite
A pure python HDFS client
Stars: ✭ 828 (+2755.17%)
Mutual labels:  hdfs
Impala Java Client
Java client to connect directly to Impala using thrift
Stars: ✭ 26 (-10.34%)
Mutual labels:  thrift
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-82.76%)
Mutual labels:  hdfs
Bigdataguide
Big data learning: learn big data from scratch, including study videos and interview materials for each stage of learning
Stars: ✭ 817 (+2717.24%)
Mutual labels:  spark
Rpc proxy
A service registration and discovery framework based on thrift
Stars: ✭ 13 (-55.17%)
Mutual labels:  thrift
Tiledb Vcf
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Stars: ✭ 26 (-10.34%)
Mutual labels:  spark
Zys
high performance service framework based on Yaf or Swoole
Stars: ✭ 812 (+2700%)
Mutual labels:  thrift
Parquet Format
Apache Parquet
Stars: ✭ 800 (+2658.62%)
Mutual labels:  parquet
Spark Swagger
Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
Stars: ✭ 25 (-13.79%)
Mutual labels:  spark
Goodreads etl pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Stars: ✭ 793 (+2634.48%)
Mutual labels:  spark
Spark Redis
A connector for Spark that allows reading and writing to/from Redis cluster
Stars: ✭ 773 (+2565.52%)
Mutual labels:  spark
Interview Questions Collection
Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more
Stars: ✭ 21 (-27.59%)
Mutual labels:  spark
Urhox
Urho3D extension library
Stars: ✭ 13 (-55.17%)
Mutual labels:  spark
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+3103.45%)
Mutual labels:  spark
Sparklyr
R interface for Apache Spark
Stars: ✭ 775 (+2572.41%)
Mutual labels:  spark
Angel
A Flexible and Powerful Parameter Server for large-scale machine learning
Stars: ✭ 6,458 (+22168.97%)
Mutual labels:  spark
Chronicler
Scala toolchain for InfluxDB
Stars: ✭ 24 (-17.24%)
Mutual labels:  spark
Coding Now
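Study notes, plus e-books and video resources I have gone through, and blogs, websites, and tools I have collected and consider good. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more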
Stars: ✭ 750 (+2486.21%)
Mutual labels:  spark
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+2468.97%)
Mutual labels:  spark
Sparkling Titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
Stars: ✭ 12 (-58.62%)
Mutual labels:  spark
Spark Tdd Example
A simple Spark TDD example
Stars: ✭ 23 (-20.69%)
Mutual labels:  spark
Sparkctr
CTR prediction model based on Spark (LR, GBDT, DNN)
Stars: ✭ 740 (+2451.72%)
Mutual labels:  spark
Cdhproject
Usage of the various Hadoop components, continuously updated
Stars: ✭ 733 (+2427.59%)
Mutual labels:  spark
Digitrecognizer
Java Convolutional Neural Network example for Hand Writing Digit Recognition
Stars: ✭ 23 (-20.69%)
Mutual labels:  spark
Kafka Storm Starter
Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Stars: ✭ 728 (+2410.34%)
Mutual labels:  spark
Scrooge
A Thrift parser/generator
Stars: ✭ 724 (+2396.55%)
Mutual labels:  thrift
Heracles
High performance HBase / Spark SQL engine
Stars: ✭ 27 (-6.9%)
Mutual labels:  spark
Flint
A Time Series Library for Apache Spark
Stars: ✭ 878 (+2927.59%)
Mutual labels:  spark
Mlfeature
Feature engineering toolkit for Spark MLlib.
Stars: ✭ 12 (-58.62%)
Mutual labels:  spark
Cluster Pack
A library on top of either pex or conda-pack to make your Python code easily available on a cluster
Stars: ✭ 23 (-20.69%)
Mutual labels:  hdfs
Frameless
Expressive types for Spark.
Stars: ✭ 717 (+2372.41%)
Mutual labels:  spark
Hail
Scalable genomic data analysis.
Stars: ✭ 706 (+2334.48%)
Mutual labels:  spark