100 Open source projects that are alternatives to or similar to parquet-usql

openmrs-fhir-analytics
A collection of tools for extracting FHIR resources and analytics services on top of that data.
Stars: ✭ 55 (+323.08%)
Mutual labels:  parquet
H2PC TagExtraction
An application made to extract assets from cache files of H2v using BlamLib by KornnerStudios.
Stars: ✭ 12 (-7.69%)
Mutual labels:  extractor
apiary-data-lake
Terraform scripts for deploying Apiary Data Lake
Stars: ✭ 15 (+15.38%)
Mutual labels:  datalake
miniparquet
Library to read a subset of Parquet files
Stars: ✭ 38 (+192.31%)
Mutual labels:  parquet
Sqlite Parquet Vtable
A SQLite vtable extension to read Parquet files
Stars: ✭ 167 (+1184.62%)
Mutual labels:  parquet
pan-cortex-data-lake-python
Python idiomatic SDK for Cortex™ Data Lake.
Stars: ✭ 36 (+176.92%)
Mutual labels:  datalake
gba-mus-ripper
(Not actively maintained) A fork of Bregalad's "GBA Mus Riper" program
Stars: ✭ 50 (+284.62%)
Mutual labels:  extractor
IMCtermite
Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats
Stars: ✭ 20 (+53.85%)
Mutual labels:  parquet
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+1784.62%)
Mutual labels:  parquet
parquet-extra
A collection of Apache Parquet add-on modules
Stars: ✭ 30 (+130.77%)
Mutual labels:  parquet
gr-eventstream
gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and either inserting them into, or extracting them from, normal data streams. It allows for the definition of high-speed time-synchronous C++ burst event handlers, as well as easy bridging to standard GNU Radio Async PDU messages with precise timing.
Stars: ✭ 38 (+192.31%)
Mutual labels:  extractor
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+976.92%)
Mutual labels:  parquet
columnify
Convert record-oriented data to columnar format.
Stars: ✭ 28 (+115.38%)
Mutual labels:  parquet
php-article-extractor
A PHP library to extract article text from web pages
Stars: ✭ 28 (+115.38%)
Mutual labels:  extractor
hadoop-etl-udfs
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (+30.77%)
Mutual labels:  parquet
Uniextract2
Universal Extractor 2 is a tool to extract files from any type of archive or installer.
Stars: ✭ 1,966 (+15023.08%)
Mutual labels:  extractor
proc-that
proc(ess)-that - easily extendable ETL tool for Node.js. Written in TypeScript.
Stars: ✭ 25 (+92.31%)
Mutual labels:  extractor
ISx
ISx is an InstallShield installer extractor
Stars: ✭ 79 (+507.69%)
Mutual labels:  extractor
PowerPointAudio-Extractor
Python script which extracts and joins audio files from powerpoints
Stars: ✭ 12 (-7.69%)
Mutual labels:  extractor
OpenBackupExtractor
A free program for extracting data (like voicemails) from iPhone and iPad backups.
Stars: ✭ 111 (+753.85%)
Mutual labels:  extractor
albis
Albis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (+53.85%)
Mutual labels:  parquet
Parquetjs
Fully asynchronous, pure JavaScript implementation of the Parquet file format
Stars: ✭ 200 (+1438.46%)
Mutual labels:  parquet
databricks-notebooks
Collection of Databricks and Jupyter Notebooks
Stars: ✭ 19 (+46.15%)
Mutual labels:  parquet
Parquet Rs
Apache Parquet implementation in Rust
Stars: ✭ 144 (+1007.69%)
Mutual labels:  parquet
YaEtl
Yet Another ETL in PHP
Stars: ✭ 60 (+361.54%)
Mutual labels:  extractor
parquet-flinktacular
How to use Parquet in Flink
Stars: ✭ 29 (+123.08%)
Mutual labels:  parquet
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+12530.77%)
Mutual labels:  parquet
RecursiveExtractor
RecursiveExtractor is a .NET Standard 2.0 archive extraction Library, and Command Line Tool which can process 7zip, ar, bzip2, deb, gzip, iso, rar, tar, vhd, vhdx, vmdk, wim, xzip, and zip archives and any nested combination of the supported formats.
Stars: ✭ 109 (+738.46%)
Mutual labels:  extractor
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+3269.23%)
Mutual labels:  parquet
wasp
WASP is a framework to build complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Stars: ✭ 19 (+46.15%)
Mutual labels:  parquet
date-extractor
Extract dates from text
Stars: ✭ 58 (+346.15%)
Mutual labels:  extractor
terraform-aws-kinesis-firehose
This code creates a Kinesis Firehose in AWS to send CloudWatch log data to S3.
Stars: ✭ 25 (+92.31%)
Mutual labels:  parquet
qresExtract
Qt binary resource (qres) extractor
Stars: ✭ 26 (+100%)
Mutual labels:  extractor
Parquet.jl
Julia implementation of Parquet columnar file format reader
Stars: ✭ 93 (+615.38%)
Mutual labels:  parquet
seo-audits-toolkit
SEO & Security Audit for Websites. Lighthouse & Security Headers crawler, Sitemap/Keywords/Images Extractor, Summarizer, etc ...
Stars: ✭ 311 (+2292.31%)
Mutual labels:  extractor
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+200%)
Mutual labels:  datalake
galer
A fast tool to fetch URLs from HTML attributes by crawl-in.
Stars: ✭ 138 (+961.54%)
Mutual labels:  extractor
odbc2parquet
A command line tool to query an ODBC data source and write the result into a parquet file.
Stars: ✭ 95 (+630.77%)
Mutual labels:  parquet
spparser
An async ETL tool written in Python.
Stars: ✭ 34 (+161.54%)
Mutual labels:  extractor
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+4938.46%)
Mutual labels:  datalake
WoWHead-PHP-Extractors
Compare your database with WoWHead and find missing data
Stars: ✭ 14 (+7.69%)
Mutual labels:  extractor
DaFlow
Apache Spark-based Data Flow (ETL) framework which supports multiple read and write destinations of different types and also supports multiple categories of transformation rules.
Stars: ✭ 24 (+84.62%)
Mutual labels:  parquet
gettext-extractor
A flexible and powerful Gettext message extractor with support for JavaScript, TypeScript, JSX and HTML.
Stars: ✭ 82 (+530.77%)
Mutual labels:  extractor
undock
Extract contents of a container image in a local folder
Stars: ✭ 119 (+815.38%)
Mutual labels:  extractor
KoishiEx
Source code for the Koishi EX Rabbit edition and for KoishiExAPI
Stars: ✭ 48 (+269.23%)
Mutual labels:  extractor
meta-extractor
Super simple and fast html page meta data extractor with low memory footprint
Stars: ✭ 38 (+192.31%)
Mutual labels:  extractor
Awkward 0.x
Manipulate arrays of complex data structures as easily as Numpy.
Stars: ✭ 216 (+1561.54%)
Mutual labels:  parquet
npk-tools
Mikrotik's NPK files managing tools
Stars: ✭ 63 (+384.62%)
Mutual labels:  extractor
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+1261.54%)
Mutual labels:  parquet
apiary
Apiary provides modules which can be combined to create a federated cloud data lake
Stars: ✭ 30 (+130.77%)
Mutual labels:  datalake
Parquetviewer
Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+1015.38%)
Mutual labels:  parquet
f43.me
A more readable & cleaner feed
Stars: ✭ 60 (+361.54%)
Mutual labels:  extractor
Kartothek
A consistent table management library in Python
Stars: ✭ 144 (+1007.69%)
Mutual labels:  parquet
CTR-tools
Crash Team Racing (PS1) tools - a C# framework by DCxDemo and a set of tools to parse files found in the original kart racing game by Naughty Dog.
Stars: ✭ 93 (+615.38%)
Mutual labels:  extractor
dlink
Dinky is an out of the box one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks including OLAP and Data Lake.
Stars: ✭ 1,535 (+11707.69%)
Mutual labels:  datalake
Spark
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.
Stars: ✭ 55 (+323.08%)
Mutual labels:  parquet
Real-time-Data-Warehouse
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
Stars: ✭ 52 (+300%)
Mutual labels:  datalake
crohme-data-extractor
A modified extractor for the CROHME handwritten math symbols dataset.
Stars: ✭ 18 (+38.46%)
Mutual labels:  extractor
ingredients
Extract recipe ingredients from any recipe website on the internet.
Stars: ✭ 96 (+638.46%)
Mutual labels:  extractor
electron-video-downloader
A minimal Electron application to download videos, e.g. from YouTube, and associated captions (optional). Uses youtube-dl under the hood.
Stars: ✭ 22 (+69.23%)
Mutual labels:  extractor