zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+1826.47%)
Mutual labels: etl, dataquality
python mozetl
ETL jobs for Firefox Telemetry
Stars: ✭ 25 (-26.47%)
Mutual labels: etl
django-data-migration
Data migration framework for Django that migrates legacy data into your new django app
Stars: ✭ 18 (-47.06%)
Mutual labels: etl
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+67.65%)
Mutual labels: etl
zdh web
Big data collection and extraction platform
Stars: ✭ 292 (+758.82%)
Mutual labels: etl
flock
Flock: A Low-Cost Streaming Query Engine on FaaS Platforms
Stars: ✭ 232 (+582.35%)
Mutual labels: etl
naas
⚙️ Schedule notebooks, run them like APIs, expose securely your assets: Jupyter as a viable ⚡️ Production environment
Stars: ✭ 219 (+544.12%)
Mutual labels: etl
covid-19
Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.
Stars: ✭ 14 (-58.82%)
Mutual labels: etl
CVparser
CVparser is software for parsing or extracting data out of CV/resumes.
Stars: ✭ 28 (-17.65%)
Mutual labels: etl
starlake
Starlake is a Spark Based On Premise and Cloud ELT/ETL Framework for Batch & Stream Processing
Stars: ✭ 16 (-52.94%)
Mutual labels: etl
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+55.88%)
Mutual labels: etl
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+14.71%)
Mutual labels: etl
django-calaccess-raw-data
A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
Stars: ✭ 61 (+79.41%)
Mutual labels: etl
csv-cruncher
Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Stars: ✭ 32 (-5.88%)
Mutual labels: etl
csvplus
csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+97.06%)
Mutual labels: etl
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+126.47%)
Mutual labels: etl
sync-engine-example
Synchronization Algorithm Exploration: Techniques to synchronize a SQL database with external destinations.
Stars: ✭ 17 (-50%)
Mutual labels: etl
nasdaq-symbols
ETL for the NASDAQ symbol file
Stars: ✭ 13 (-61.76%)
Mutual labels: etl
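OpenKettleWebUI
A web-based scheduling and control platform for data processing built on kettle. It supports file repositories and database repositories, controls kettle data transformations through the web platform, and can be integrated into existing systems as middleware.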
Stars: ✭ 138 (+305.88%)
Mutual labels: etl
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+38.24%)
Mutual labels: etl