openmrs-fhir-analytics
A collection of tools for extracting FHIR resources and analytics services on top of that data.
Stars: ✭ 55 (+323.08%)
H2PC TagExtraction
An application made to extract assets from cache files of H2v using BlamLib by KornnerStudios.
Stars: ✭ 12 (-7.69%)
apiary-data-lake
Terraform scripts for deploying Apiary Data Lake
Stars: ✭ 15 (+15.38%)
miniparquet
Library to read a subset of Parquet files
Stars: ✭ 38 (+192.31%)
gba-mus-ripper
(Not actively maintained) A fork of Bregalad's "GBA Mus Riper" program
Stars: ✭ 50 (+284.62%)
IMCtermite
Enables extraction of measurement data from binary files with the extension 'raw' used by the proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats
Stars: ✭ 20 (+53.85%)
Vscode Data Preview
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+1784.62%)
parquet-extra
A collection of Apache Parquet add-on modules
Stars: ✭ 30 (+130.77%)
gr-eventstream
gr-eventstream is a set of GNU Radio blocks for creating precisely timed events and either inserting them into, or extracting them from, normal data streams. It allows for the definition of high-speed, time-synchronous C++ burst event handlers, as well as easy bridging to standard GNU Radio Async PDU messages with precise timing.
Stars: ✭ 38 (+192.31%)
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (+976.92%)
columnify
Convert record-oriented data to columnar format.
Stars: ✭ 28 (+115.38%)
hadoop-etl-udfs
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Stars: ✭ 17 (+30.77%)
Uniextract2
Universal Extractor 2 is a tool to extract files from any type of archive or installer.
Stars: ✭ 1,966 (+15023.08%)
proc-that
proc(ess)-that - an easily extensible ETL tool for Node.js. Written in TypeScript.
Stars: ✭ 25 (+92.31%)
ISx
ISx is an InstallShield installer extractor
Stars: ✭ 79 (+507.69%)
OpenBackupExtractor
A free program for extracting data (like voicemails) from iPhone and iPad backups.
Stars: ✭ 111 (+753.85%)
albis
Albis: High-Performance File Format for Big Data Systems
Stars: ✭ 20 (+53.85%)
Parquetjs
Fully asynchronous, pure JavaScript implementation of the Parquet file format
Stars: ✭ 200 (+1438.46%)
Parquet Rs
Apache Parquet implementation in Rust
Stars: ✭ 144 (+1007.69%)
YaEtl
Yet Another ETL in PHP
Stars: ✭ 60 (+361.54%)
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+12530.77%)
RecursiveExtractor
RecursiveExtractor is a .NET Standard 2.0 archive extraction library and command line tool which can process 7zip, ar, bzip2, deb, gzip, iso, rar, tar, vhd, vhdx, vmdk, wim, xzip, and zip archives and any nested combination of the supported formats.
Stars: ✭ 109 (+738.46%)
qsv
CSVs sliced, diced & analyzed.
Stars: ✭ 438 (+3269.23%)
wasp
WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
Stars: ✭ 19 (+46.15%)
qresExtract
Qt binary resource (qres) extractor
Stars: ✭ 26 (+100%)
Parquet.jl
Julia implementation of Parquet columnar file format reader
Stars: ✭ 93 (+615.38%)
seo-audits-toolkit
SEO & Security Audit for Websites. Lighthouse & Security Headers crawler, Sitemap/Keywords/Images Extractor, Summarizer, etc.
Stars: ✭ 311 (+2292.31%)
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake: SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+200%)
galer
A fast tool to fetch URLs from HTML attributes by crawl-in.
Stars: ✭ 138 (+961.54%)
odbc2parquet
A command line tool to query an ODBC data source and write the result into a parquet file.
Stars: ✭ 95 (+630.77%)
spparser
An async ETL tool written in Python.
Stars: ✭ 34 (+161.54%)
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+4938.46%)
DaFlow
Apache Spark-based Data Flow (ETL) framework which supports multiple read and write destinations of different types and also supports multiple categories of transformation rules.
Stars: ✭ 24 (+84.62%)
gettext-extractor
A flexible and powerful Gettext message extractor with support for JavaScript, TypeScript, JSX and HTML.
Stars: ✭ 82 (+530.77%)
undock
Extract contents of a container image in a local folder
Stars: ✭ 119 (+815.38%)
KoishiEx
Source code for the Rabbit edition of 恋恋のEX, together with the KoishiExAPI source code
Stars: ✭ 48 (+269.23%)
meta-extractor
Super simple and fast HTML page metadata extractor with low memory footprint
Stars: ✭ 38 (+192.31%)
Awkward 0.x
Manipulate arrays of complex data structures as easily as NumPy.
Stars: ✭ 216 (+1561.54%)
npk-tools
Tools for managing MikroTik NPK files
Stars: ✭ 63 (+384.62%)
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (+1261.54%)
apiary
Apiary provides modules which can be combined to create a federated cloud data lake
Stars: ✭ 30 (+130.77%)
Parquetviewer
Simple Windows desktop application for viewing & querying Apache Parquet files
Stars: ✭ 145 (+1015.38%)
f43.me
A more readable & cleaner feed
Stars: ✭ 60 (+361.54%)
Kartothek
A consistent table management library in Python
Stars: ✭ 144 (+1007.69%)
CTR-tools
Crash Team Racing (PS1) tools - a C# framework by DCxDemo and a set of tools to parse files found in the original kart racing game by Naughty Dog.
Stars: ✭ 93 (+615.38%)
dlink
Dinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Streaming & Batch and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect to many big data frameworks, including OLAP and Data Lake.
Stars: ✭ 1,535 (+11707.69%)
Spark
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets. This project will have sample programs for Spark in the Scala language.
Stars: ✭ 55 (+323.08%)
crohme-data-extractor
A modified extractor for the CROHME handwritten math symbols dataset.
Stars: ✭ 18 (+38.46%)
ingredients
Extract recipe ingredients from any recipe website on the internet.
Stars: ✭ 96 (+638.46%)
electron-video-downloader
A minimal Electron application to download videos, e.g. from YouTube, and associated captions (optional). Uses youtube-dl under the hood.
Stars: ✭ 22 (+69.23%)