zingg: Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+3019.05%)
olap: 睿思BI-OLAP, an open-source multidimensional analysis system
Stars: ✭ 101 (+380.95%)
csv-cruncher: Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Stars: ✭ 32 (+52.38%)
elastic-query-export: 🚚 Export Data from ElasticSearch to CSV/JSON using a Lucene Query (e.g. from Kibana) or a raw JSON Query string
Stars: ✭ 56 (+166.67%)
morph-kgc: Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+266.67%)
DataProfiler: What's in your data? Extract schema, statistics and entities from datasets
Stars: ✭ 843 (+3914.29%)
naas: ⚙️ Schedule notebooks, run them like APIs, securely expose your assets: Jupyter as a viable ⚡️ Production environment
Stars: ✭ 219 (+942.86%)
artemis cli: A command-line application for tutors to more productively grade programming exercises on ArTEMiS
Stars: ✭ 12 (-42.86%)
etl: [READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+1228.57%)
zdh server: zdh data collection platform, ETL processing service
Stars: ✭ 53 (+152.38%)
mik: The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservation systems).
Stars: ✭ 32 (+52.38%)
FlowMaster: ETL flow framework in Python, based on YAML configs
Stars: ✭ 19 (-9.52%)
eec: A fast, low-memory Excel read/write tool. It is not built on POI and supports stream processing.
Stars: ✭ 93 (+342.86%)
iex-stocks: ETL for the IEX Stocks API
Stars: ✭ 19 (-9.52%)
ottosocial: 👍 ottosocial is a CLI to schedule tweets via CSV
Stars: ✭ 23 (+9.52%)
awesome-integration: A curated list of awesome system integration software and resources.
Stars: ✭ 117 (+457.14%)
csvlixir: A CSV reading/writing application for Elixir.
Stars: ✭ 32 (+52.38%)
tabtools: 🔧 SQL for CSV files on the UNIX command line with awk.
Stars: ✭ 16 (-23.81%)
chronicle-etl: 📜 A CLI toolkit for extracting and working with your digital history
Stars: ✭ 78 (+271.43%)
import-cli-simple: This is the meta package for Pacemaker Community, a Symfony-based CLI application that provides import functionality for products, categories, attributes, and attribute-sets. The default format is CSV; adapters for XML are also available. The application can be declaratively extended by additional operations, which can be used to reassemble and exe…
Stars: ✭ 69 (+228.57%)
wikirepo: Python-based Wikidata framework for easy dataframe extraction
Stars: ✭ 33 (+57.14%)
convey: CSV processing and mutual conversion of web-related data types
Stars: ✭ 16 (-23.81%)
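DataX-src: DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, and ODPS.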
Stars: ✭ 21 (+0%)
MPowerTCX: Share stationary bike data with Strava, Garmin Connect and Golden Cheetah
Stars: ✭ 22 (+4.76%)
OBIS: A JavaScript framework for downloading bank statements in OFX, QIF, CSV, and JSON. Currently supports HSBC UK Personal Banking.
Stars: ✭ 37 (+76.19%)
CsvTextFieldParser: A simple CSV parser based on Microsoft.VisualBasic.FileIO.TextFieldParser.
Stars: ✭ 40 (+90.48%)
COVID-19-Greece: A Python-generated website for visualizing the novel coronavirus (COVID-19) data for Greece.
Stars: ✭ 21 (+0%)
krawler: A minimalist (geospatial) ETL
Stars: ✭ 51 (+142.86%)
fastapi-csv: 🏗️ Create APIs from CSV files within seconds, using fastapi
Stars: ✭ 46 (+119.05%)
NBi: NBi is a testing framework (add-on to NUnit) for Business Intelligence and Data Access. The main goal of this framework is to let users create tests with a declarative approach based on an XML syntax. With NBi, you don't need to develop C# or Java code to specify your tests! Nor do you need Visual Studio or Eclipse to compile y…
Stars: ✭ 102 (+385.71%)
AirflowETL: Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-4.76%)
singer-runner: A CLI and library to run Singer Taps and Targets
Stars: ✭ 33 (+57.14%)
pandoc-placetable: Pandoc filter to include CSV data (from file or URL)
Stars: ✭ 35 (+66.67%)
openmrs-fhir-analytics: A collection of tools for extracting FHIR resources and analytics services on top of that data.
Stars: ✭ 55 (+161.9%)
id3c: Data logistics system enabling real-time pathogen surveillance. Built for the Seattle Flu Study.
Stars: ✭ 21 (+0%)
Papers4DataAchitect: A collection of papers on data engineering topics such as OLTP, OLAP, ETL, and distributed storage.
Stars: ✭ 17 (-19.05%)
csvtogs: Take a CSV file and create a Google Spreadsheet with the contents
Stars: ✭ 15 (-28.57%)
arrow-datafusion: Apache Arrow DataFusion SQL Query Engine
Stars: ✭ 2,360 (+11138.1%)
covid-19: Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.
Stars: ✭ 14 (-33.33%)
thain: Thain is a distributed flow schedule platform.
Stars: ✭ 81 (+285.71%)
Csv: CSV data manipulation made easy in PHP
Stars: ✭ 2,863 (+13533.33%)
Easy Csv: EasyCSV is a simple Object Oriented CSV manipulation library for PHP 7.2+
Stars: ✭ 253 (+1104.76%)
Miller: Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Stars: ✭ 4,633 (+21961.9%)
Sqlparser: Simple SQL parser meant for querying CSV files
Stars: ✭ 249 (+1085.71%)
Csvtomarkdowntable: Simple JavaScript/Node.js CSV to Markdown Table Converter
Stars: ✭ 249 (+1085.71%)
python mozetl: ETL jobs for Firefox Telemetry
Stars: ✭ 25 (+19.05%)
Vscode Data Preview: Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Stars: ✭ 245 (+1066.67%)
Pxi: 🧚 pxi (pixie) is a small, fast, and magical command-line data processor similar to jq, mlr, and awk.
Stars: ✭ 248 (+1080.95%)
flowtorch: flowTorch - a Python library for analysis and reduced-order modeling of fluid flows
Stars: ✭ 47 (+123.81%)
tabular-stream: Detects tabular data (spreadsheets, dsv or json, 20+ different formats) and emits normalized objects.
Stars: ✭ 34 (+61.9%)
torch-dataframe: Utility class to manipulate a dataset from a CSV file
Stars: ✭ 67 (+219.05%)