407 open source projects that are alternatives to or similar to gallia-core

hamilton
A scalable general purpose micro-framework for defining dataflows. You can use it to create dataframes, numpy matrices, python objects, ML models, etc.
Stars: ✭ 612 (+1290.91%)
FIFA-2019-Analysis
A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, and countries using data analysis and data visualization.
Stars: ✭ 28 (-36.36%)
Butterfree
A tool for building feature stores.
Stars: ✭ 126 (+186.36%)
Mutual labels:  etl, data-engineering
naas
⚙️ Schedule notebooks, run them like APIs, securely expose your assets: Jupyter as a viable ⚡️ production environment
Stars: ✭ 219 (+397.73%)
Mutual labels:  etl, data-transformation
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+6.82%)
Mutual labels:  etl, data-engineering
AirflowETL
Blog post on ETL pipelines with Airflow
Stars: ✭ 20 (-54.55%)
Mutual labels:  etl, data-engineering
pangeo-forge-recipes
Python library for building Pangeo Forge recipes.
Stars: ✭ 64 (+45.45%)
Mutual labels:  etl, data-engineering
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+1338.64%)
Mutual labels:  etl, data-engineering
Aws Data Wrangler
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+5320.45%)
Mutual labels:  etl, data-engineering
Benthos
Fancy stream processing made operationally mundane
Stars: ✭ 3,705 (+8320.45%)
Mutual labels:  etl, data-engineering
versatile-data-kit
Versatile Data Kit (VDK) is an open source framework that enables anybody with basic SQL or Python knowledge to create their own data pipelines.
Stars: ✭ 144 (+227.27%)
Mutual labels:  etl, data-engineering
fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
Stars: ✭ 123 (+179.55%)
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+29.55%)
Mutual labels:  etl, data-engineering
Sayn
Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (+79.55%)
Mutual labels:  etl, data-engineering
Aws Serverless Data Lake Framework
Enterprise-grade, production-hardened, serverless data lake on AWS
Stars: ✭ 179 (+306.82%)
Mutual labels:  etl, data-engineering
beneath
Beneath is a serverless real-time data platform ⚡️
Stars: ✭ 65 (+47.73%)
Mutual labels:  etl, data-engineering
arthur-redshift-etl
ELT Code for your Data Warehouse
Stars: ✭ 22 (-50%)
Mutual labels:  etl, data-engineering
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+5754.55%)
zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+1388.64%)
Mutual labels:  etl, data-transformation
AirflowDataPipeline
Example of an ETL Pipeline using Airflow
Stars: ✭ 24 (-45.45%)
Mutual labels:  etl, data-engineering
etl manager
A Python package to create a database on the platform using our MoJ data warehousing framework
Stars: ✭ 14 (-68.18%)
Mutual labels:  etl, data-engineering
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (+79.55%)
Mutual labels:  etl, data-engineering
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (+677.27%)
Mutual labels:  etl, data-engineering
hive-metastore-client
A client for connecting and running DDLs on hive metastore.
Stars: ✭ 37 (-15.91%)
Mutual labels:  etl, data-engineering
etl
[READ-ONLY] PHP - ETL (Extract Transform Load) data processing library
Stars: ✭ 279 (+534.09%)
Mutual labels:  etl, data-engineering
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+11079.55%)
Mutual labels:  etl, data-engineering
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+75%)
Mutual labels:  etl, data-engineering
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+20.45%)
Mutual labels:  etl, data-engineering
sql-to-redis
🔄 Simple tool for ETL. From SQL to Redis.
Stars: ✭ 18 (-59.09%)
Mutual labels:  etl
cubetl
CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED)
Stars: ✭ 21 (-52.27%)
Mutual labels:  etl
Feature-Engineering-for-Fraud-Detection
Implementation of feature engineering from Feature engineering strategies for credit card fraud
Stars: ✭ 31 (-29.55%)
Mutual labels:  feature-engineering
PDAP-Scrapers
Code relating to scraping public police data.
Stars: ✭ 72 (+63.64%)
Mutual labels:  etl
DaFlow
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Stars: ✭ 24 (-45.45%)
Mutual labels:  etl
DQCS
Data quality control system
Stars: ✭ 34 (-22.73%)
Mutual labels:  etl
h4sci-course
ETH PhD Program course
Stars: ✭ 19 (-56.82%)
Mutual labels:  data-engineering
neon-workshop
A Pachyderm deep learning tutorial for conference workshops
Stars: ✭ 19 (-56.82%)
Mutual labels:  data-engineering
50-days-of-Statistics-for-Data-Science
This repository consists of a 50-day program. All the statistics required for a complete understanding of data science will be uploaded to this repository.
Stars: ✭ 19 (-56.82%)
Mutual labels:  feature-engineering
nasdaq-symbols
ETL for the NASDAQ symbol file
Stars: ✭ 13 (-70.45%)
Mutual labels:  etl
autoencoders tensorflow
Automatic feature engineering using deep learning and Bayesian inference using TensorFlow.
Stars: ✭ 66 (+50%)
Mutual labels:  feature-engineering
hrv-analysis
Package for Heart Rate Variability analysis in Python
Stars: ✭ 225 (+411.36%)
Mutual labels:  feature-engineering
wrangle
A data transformation package for deep learning with Autonomio, Keras and TensorFlow.
Stars: ✭ 15 (-65.91%)
Mutual labels:  etl
pyjanitor
Clean APIs for data cleaning. Python implementation of R package Janitor
Stars: ✭ 970 (+2104.55%)
Mutual labels:  data-engineering
tutorials
Short programming tutorials pertaining to data analysis.
Stars: ✭ 14 (-68.18%)
Mutual labels:  data-transformation
Quora-Paraphrase-Question-Identification
Paraphrase question identification using Feature Fusion Network (FFN).
Stars: ✭ 19 (-56.82%)
Mutual labels:  feature-engineering
covid-19
Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.
Stars: ✭ 14 (-68.18%)
Mutual labels:  etl
viewflow
Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code.
Stars: ✭ 110 (+150%)
Mutual labels:  data-engineering
mik
The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservation systems).
Stars: ✭ 32 (-27.27%)
Mutual labels:  etl
OpenKettleWebUI
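A kettle-based web platform for scheduling and controlling data processing. It supports file repositories and database repositories, controls kettle data transformations through the web platform, and can be integrated into existing systems as middleware.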
Stars: ✭ 138 (+213.64%)
Mutual labels:  etl
dominance-analysis
This package can be used for dominance analysis or Shapley Value Regression to find the relative importance of predictors on a given dataset. It can also be used for key driver analysis or marginal resource allocation models.
Stars: ✭ 111 (+152.27%)
Mutual labels:  feature-engineering
etlflow
EtlFlow is an ecosystem of functional libraries in Scala, based on ZIO, for writing various tasks and jobs on GCP and AWS.
Stars: ✭ 38 (-13.64%)
Mutual labels:  etl
csvplus
csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.
Stars: ✭ 67 (+52.27%)
Mutual labels:  etl
python mozetl
ETL jobs for Firefox Telemetry
Stars: ✭ 25 (-43.18%)
Mutual labels:  etl
web-click-flow
Offline log analysis of website clickstreams
Stars: ✭ 14 (-68.18%)
Mutual labels:  etl
dflib
In-memory Java DataFrame library
Stars: ✭ 50 (+13.64%)
Mutual labels:  etl
DataBridge.NET
Configurable data bridge for permanent ETL jobs
Stars: ✭ 16 (-63.64%)
Mutual labels:  etl
dry-transformer
Data transformation toolkit
Stars: ✭ 59 (+34.09%)
Mutual labels:  data-transformation
feathers-versionate
Create and work with nested services.
Stars: ✭ 29 (-34.09%)
Mutual labels:  nesting
CVparser
CVparser is software for parsing or extracting data out of CV/resumes.
Stars: ✭ 28 (-36.36%)
Mutual labels:  etl
IndexedTables.jl
Flexible tables with ordered indices
Stars: ✭ 108 (+145.45%)
Mutual labels:  data-manipulation
dbt-sugar
dbt-sugar is a CLI tool that lets dbt users have fun and more easily perform actions around dbt models
Stars: ✭ 139 (+215.91%)
Mutual labels:  data-engineering