452 open source projects that are alternatives to or similar to Oap

Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.

Stars: ✭ 97 (-71.72%)

Mutual labels: spark, parquet

⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Stars: ✭ 58 (-83.09%)

Mutual labels: spark, parquet

Parquet Generator

Parquet file generator

Stars: ✭ 16 (-95.34%)

Mutual labels: spark, parquet

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (+14.58%)

Mutual labels: spark, parquet

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (+18.37%)

Mutual labels: spark, parquet

Gaffer

A large-scale entity and relation database supporting aggregation of properties

Stars: ✭ 1,642 (+378.72%)

Mutual labels: spark, parquet

Pucket

Bucketing and partitioning system for Parquet

Stars: ✭ 29 (-91.55%)

Mutual labels: spark, parquet

Parquet Index

Spark SQL index for Parquet tables

Stars: ✭ 109 (-68.22%)

Mutual labels: spark, parquet

experiments

Code examples for my blog posts

Stars: ✭ 21 (-93.88%)

Mutual labels: spark, parquet

Big Data Rosetta Code

Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code

Stars: ✭ 254 (-25.95%)

Mutual labels: spark

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-13.12%)

Mutual labels: spark

Book

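This project collects some good books I have read or heard about over the years. I came across them while organizing my files: deleting them seemed a pity, and just keeping them wasted space. In the spirit of sharing, I am making them available, partly to provide these books to readers who need them and partly as a kind of accumulating knowledge base.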

Stars: ✭ 47 (-86.3%)

Mutual labels: spark

Spark Jupyter Aws

A guide on how to set up Jupyter with Pyspark painlessly on AWS EC2 clusters, with S3 I/O support

Stars: ✭ 259 (-24.49%)

Mutual labels: spark

Awesome Ada

A curated list of awesome resources related to the Ada and SPARK programming languages

Stars: ✭ 299 (-12.83%)

Mutual labels: spark

laravel-spark-camera

Profile Photo Camera support for Laravel Spark

Stars: ✭ 30 (-91.25%)

Mutual labels: spark

Cook

Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark

Stars: ✭ 314 (-8.45%)

Mutual labels: spark

spark-http-stream

Spark Structured Streaming via HTTP communication

Stars: ✭ 17 (-95.04%)

Mutual labels: spark

Spark Notebook

Interactive and Reactive Data Science using Scala and Spark.

Stars: ✭ 3,081 (+798.25%)

Mutual labels: spark

daf-kylo

Kylo integration with PDND (previously DAF).

Stars: ✭ 20 (-94.17%)

Mutual labels: spark

Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Stars: ✭ 70 (-79.59%)

Mutual labels: spark

Ytk Learn

Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-1.75%)

Mutual labels: spark

Coolplayspark

Coolplay Spark: Spark source code analysis, Spark libraries, and more

Stars: ✭ 3,318 (+867.35%)

Mutual labels: spark

Ratatool

A tool for data sampling, data generation, and data diffing

Stars: ✭ 279 (-18.66%)

Mutual labels: parquet

spark-data-sources

Developing Spark External Data Sources using the V2 API

Stars: ✭ 36 (-89.5%)

Mutual labels: spark

bigkube

Minikube for big data with Scala and Spark

Stars: ✭ 16 (-95.34%)

Mutual labels: spark

Hbase Rdd

Spark RDD to read, write and delete from HBase

Stars: ✭ 277 (-19.24%)

Mutual labels: spark

Covid19Tracker

A Robinhood style COVID-19 🦠 Android tracking app for the US. Open source and built with Kotlin.

Stars: ✭ 65 (-81.05%)

Mutual labels: spark

Roapi

Create full-fledged APIs for static datasets without writing a single line of code.

Stars: ✭ 253 (-26.24%)

Mutual labels: parquet

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (-11.66%)

Mutual labels: spark

Succinct

Enabling queries on compressed data.

Stars: ✭ 257 (-25.07%)

Mutual labels: spark

Sparklint

A tool for monitoring and tuning Spark jobs for efficiency.

Stars: ✭ 316 (-7.87%)

Mutual labels: spark

spark-structured-streaming-examples

Spark Structured Streaming examples using version 3.0.0

Stars: ✭ 23 (-93.29%)

Mutual labels: spark

Elasticsearch loader

A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch

Stars: ✭ 300 (-12.54%)

Mutual labels: parquet

sparkProjectTemplate.g8

Template for Spark Projects

Stars: ✭ 77 (-77.55%)

Mutual labels: spark

Parquet Cpp

Apache Parquet

Stars: ✭ 339 (-1.17%)

Mutual labels: parquet

kafka-spark-streaming-zeppelin-docker

One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)

Stars: ✭ 82 (-76.09%)

Mutual labels: spark

Spark Hbase Connector

Connect Spark to HBase for reading and writing data with ease

Stars: ✭ 299 (-12.83%)

Mutual labels: spark

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (-92.71%)

Mutual labels: spark

Clickhouse Native Jdbc

ClickHouse Native Protocol JDBC implementation

Stars: ✭ 310 (-9.62%)

Mutual labels: spark

dllib

dllib is a distributed deep learning library running on Apache Spark

Stars: ✭ 32 (-90.67%)

Mutual labels: spark

Spark Druid Olap

Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit.ly/2oBJSpP), an integrated BI platform on Apache Spark.

Stars: ✭ 282 (-17.78%)

Mutual labels: spark

spark learning

Atguigu (尚硅谷) Big Data Spark, latest 2019 edition: learning Spark

Stars: ✭ 42 (-87.76%)

Mutual labels: spark

Scalnet

A Scala wrapper for Deeplearning4j, inspired by Keras. Scala + DL + Spark + GPUs

Stars: ✭ 342 (-0.29%)

Mutual labels: spark

prosto

Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby

Stars: ✭ 54 (-84.26%)

Mutual labels: spark

Cloudflow

Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.

Stars: ✭ 278 (-18.95%)

Mutual labels: spark

confluent-spark-avro

Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.

Stars: ✭ 18 (-94.75%)

Mutual labels: spark

Learningsparkv2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]