452 open source projects that are alternatives to or similar to Oap

Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (-71.72%)
Mutual labels:  spark, parquet
Rumble
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Stars: ✭ 58 (-83.09%)
Mutual labels:  spark, parquet
Parquet Generator
Parquet file generator
Stars: ✭ 16 (-95.34%)
Mutual labels:  spark, parquet
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+14.58%)
Mutual labels:  spark, parquet
Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+18.37%)
Mutual labels:  spark, parquet
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+378.72%)
Mutual labels:  spark, parquet
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-91.55%)
Mutual labels:  spark, parquet
Parquet Index
Spark SQL index for Parquet tables
Stars: ✭ 109 (-68.22%)
Mutual labels:  spark, parquet
experiments
Code examples for my blog posts
Stars: ✭ 21 (-93.88%)
Mutual labels:  spark, parquet
Big Data Rosetta Code
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (-25.95%)
Mutual labels:  spark
Elasticluster
Create clusters of VMs on the cloud and configure them with Ansible.
Stars: ✭ 298 (-13.12%)
Mutual labels:  spark
Book
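This project collects some good books I have read or heard about over the years. I came across them while organizing my files; deleting them seemed a pity, and keeping them took up too much space, so in the spirit of sharing I am making them available, both to provide these books to readers who need them and as a kind of knowledge-base accumulation.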
Stars: ✭ 47 (-86.3%)
Mutual labels:  spark
Spark Jupyter Aws
A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support
Stars: ✭ 259 (-24.49%)
Mutual labels:  spark
Awesome Ada
A curated list of awesome resources related to the Ada and SPARK programming language
Stars: ✭ 299 (-12.83%)
Mutual labels:  spark
laravel-spark-camera
Profile Photo Camera support for Laravel Spark
Stars: ✭ 30 (-91.25%)
Mutual labels:  spark
Cook
Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark
Stars: ✭ 314 (-8.45%)
Mutual labels:  spark
spark-http-stream
Spark Structured Streaming via HTTP communication
Stars: ✭ 17 (-95.04%)
Mutual labels:  spark
Spark Notebook
Interactive and Reactive Data Science using Scala and Spark.
Stars: ✭ 3,081 (+798.25%)
Mutual labels:  spark
daf-kylo
Kylo integration with PDND (previously DAF).
Stars: ✭ 20 (-94.17%)
Mutual labels:  spark
Spotify-Song-Recommendation-ML
UC Berkeley team's submission for RecSys Challenge 2018
Stars: ✭ 70 (-79.59%)
Mutual labels:  spark
Ytk Learn
Ytk-learn is a distributed machine learning library which implements most of the popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Stars: ✭ 337 (-1.75%)
Mutual labels:  spark
Coolplayspark
CoolplaySpark: Spark source code analysis, Spark libraries, and more
Stars: ✭ 3,318 (+867.35%)
Mutual labels:  spark
Ratatool
A tool for data sampling, data generation, and data diffing
Stars: ✭ 279 (-18.66%)
Mutual labels:  parquet
spark-data-sources
Developing Spark External Data Sources using the V2 API
Stars: ✭ 36 (-89.5%)
Mutual labels:  spark
bigkube
Minikube for big data with Scala and Spark
Stars: ✭ 16 (-95.34%)
Mutual labels:  spark
Hbase Rdd
Spark RDD to read, write and delete from HBase
Stars: ✭ 277 (-19.24%)
Mutual labels:  spark
Covid19Tracker
A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.
Stars: ✭ 65 (-81.05%)
Mutual labels:  spark
Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (-26.24%)
Mutual labels:  parquet
Zat
Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark
Stars: ✭ 303 (-11.66%)
Mutual labels:  spark
Succinct
Enabling queries on compressed data.
Stars: ✭ 257 (-25.07%)
Mutual labels:  spark
Sparklint
A tool for monitoring and tuning Spark jobs for efficiency.
Stars: ✭ 316 (-7.87%)
Mutual labels:  spark
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-93.29%)
Mutual labels:  spark
Elasticsearch loader
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (-12.54%)
Mutual labels:  parquet
sparkProjectTemplate.g8
Template for Spark Projects
Stars: ✭ 77 (-77.55%)
Mutual labels:  spark
Parquet Cpp
Apache Parquet
Stars: ✭ 339 (-1.17%)
Mutual labels:  parquet
kafka-spark-streaming-zeppelin-docker
One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)
Stars: ✭ 82 (-76.09%)
Mutual labels:  spark
Spark Hbase Connector
Connect Spark to HBase for reading and writing data with ease
Stars: ✭ 299 (-12.83%)
Mutual labels:  spark
basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-92.71%)
Mutual labels:  spark
Clickhouse Native Jdbc
ClickHouse Native Protocol JDBC implementation
Stars: ✭ 310 (-9.62%)
Mutual labels:  spark
dllib
dllib is a distributed deep learning library running on Apache Spark
Stars: ✭ 32 (-90.67%)
Mutual labels:  spark
Spark Druid Olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.
Stars: ✭ 282 (-17.78%)
Mutual labels:  spark
spark learning
尚硅谷 (Atguigu) Big Data Spark course, latest 2019 edition: Spark learning
Stars: ✭ 42 (-87.76%)
Mutual labels:  spark
Scalnet
A Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs
Stars: ✭ 342 (-0.29%)
Mutual labels:  spark
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Stars: ✭ 54 (-84.26%)
Mutual labels:  spark
Cloudflow
Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
Stars: ✭ 278 (-18.95%)
Mutual labels:  spark
confluent-spark-avro
Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.
Stars: ✭ 18 (-94.75%)
Mutual labels:  spark
Learningsparkv2
This is the github repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Stars: ✭ 307 (-10.5%)
Mutual labels:  spark
blog
blog entries
Stars: ✭ 39 (-88.63%)
Mutual labels:  spark
Parquet Dotnet
🏐 Apache Parquet for modern .NET
Stars: ✭ 276 (-19.53%)
Mutual labels:  parquet
SparkV
🤖⚡ | The most POWERFUL multipurpose chat/meme bot that will boost the activity in your server.
Stars: ✭ 24 (-93%)
Mutual labels:  spark
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (-90.09%)
Mutual labels:  spark
Wirbelsturm
Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.
Stars: ✭ 332 (-3.21%)
Mutual labels:  spark
Crayon
Simple framework agnostic UI router for SPAs
Stars: ✭ 310 (-9.62%)
Mutual labels:  spark
Datavec
ETL Library for Machine Learning - data pipelines, data munging and wrangling
Stars: ✭ 272 (-20.7%)
Mutual labels:  spark
trembita
Model complex data transformation pipelines easily
Stars: ✭ 44 (-87.17%)
Mutual labels:  spark
spark-extension
A library that provides useful extensions to Apache Spark and PySpark.
Stars: ✭ 25 (-92.71%)
Mutual labels:  spark
Helk
The Hunting ELK
Stars: ✭ 3,097 (+802.92%)
Mutual labels:  spark
dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-91.25%)
Mutual labels:  parquet
bigdata-fun
A complete (distributed) BigData stack, running in containers
Stars: ✭ 14 (-95.92%)
Mutual labels:  spark
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+1037.9%)
Mutual labels:  spark