RumbleDB / Rumble

Licence: other
⛈️ Rumble 1.11.0 "Banyan Tree"🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more

Programming Languages

Java

Projects that are alternatives to or similar to Rumble

Devops Python Tools
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Stars: ✭ 406 (+600%)
Mutual labels:  json, spark, avro, parquet, hdfs
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+322.41%)
Mutual labels:  json, csv, avro, parquet
Storagetapper
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service
Stars: ✭ 232 (+300%)
Mutual labels:  s3, json, avro, hdfs
Schemer
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Stars: ✭ 97 (+67.24%)
Mutual labels:  json, spark, avro, parquet
Choetl
ETL Framework for .NET / C# (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
Stars: ✭ 372 (+541.38%)
Mutual labels:  json, csv, avro, parquet
qwery
A SQL-like language for performing ETL transformations.
Stars: ✭ 28 (-51.72%)
Mutual labels:  query, csv, avro, s3
Pucket
Bucketing and partitioning system for Parquet
Stars: ✭ 29 (-50%)
Mutual labels:  spark, parquet, hdfs
Just Dashboard
📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+2505.17%)
Mutual labels:  json, csv, data-science
Octosql
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
Stars: ✭ 2,579 (+4346.55%)
Mutual labels:  json, csv, query
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-58.62%)
Mutual labels:  csv, avro, parquet
Roapi
Create full-fledged APIs for static datasets without writing a single line of code.
Stars: ✭ 253 (+336.21%)
Mutual labels:  s3, query, parquet
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+577.59%)
Mutual labels:  spark, avro, parquet
Specs
Technical specifications and guidelines for implementing Frictionless Data.
Stars: ✭ 403 (+594.83%)
Mutual labels:  json, csv, data-science
Bigdata File Viewer
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Stars: ✭ 86 (+48.28%)
Mutual labels:  avro, parquet, hdfs
Elasticsearch loader
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Stars: ✭ 300 (+417.24%)
Mutual labels:  json, csv, parquet
Tiledb
The Universal Storage Engine
Stars: ✭ 1,072 (+1748.28%)
Mutual labels:  s3, data-science, hdfs
Ps Webapi
(Migrated from CodePlex) Lets PowerShell scripts or command-line processes serve as Web APIs. PSWebApi is a simple library for building ASP.NET Web APIs (RESTful services) from PowerShell scripts or batch/executable files out of the box.
Stars: ✭ 24 (-58.62%)
Mutual labels:  json, csv, text
S3proxy
Access other storage backends via the S3 API
Stars: ✭ 952 (+1541.38%)
Mutual labels:  azure, s3
Data Forge Ts
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
Stars: ✭ 967 (+1567.24%)
Mutual labels:  json, csv
Parsrs
CSV, JSON, and XML text parsers and generators written in pure POSIX shell script
Stars: ✭ 56 (-3.45%)
Mutual labels:  json, csv

Rumble

Getting started: a Jupyter notebook that introduces the JSONiq language on top of Rumble is available here. To use it, install the all-in-one data science platform Anaconda, or install Python, Spark, PySpark, and Jupyter manually (with brew, apt, and so on).

The documentation also contains an introduction specific to Rumble and how you can read input datasets, but we have not converted it to Jupyter notebooks yet (this will follow).

The documentation of the latest official release is available here.

The documentation of the current master (for the adventurous and curious) is available here.
