1544 open source projects that are alternatives to or similar to Datavec

basin
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser
Stars: ✭ 25 (-90.81%)
Mutual labels:  spark, pipeline, etl
Stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
Stars: ✭ 64 (-76.47%)
Mutual labels:  pipeline, etl, transformations
Setl
A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-70.96%)
Mutual labels:  spark, pipeline, etl
Omniparser
omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
Stars: ✭ 148 (-45.59%)
Mutual labels:  schema, etl
mydataharbor
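🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware for moving data from any data source to any other data source. It helps users reliably, quickly, and stably perform near-real-time incremental synchronization or scheduled full synchronization of massive data volumes. It primarily targets real-time transaction systems, and can also be used for big data synchronization (the ETL domain).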
Stars: ✭ 28 (-89.71%)
Mutual labels:  pipeline, etl
naas
⚙️ Schedule notebooks, run them like APIs, securely expose your assets: Jupyter as a viable ⚡️ Production environment
Stars: ✭ 219 (-19.49%)
Mutual labels:  pipeline, etl
Pyspark Example Project
Example project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+132.72%)
Mutual labels:  spark, etl
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+1708.46%)
Mutual labels:  pipeline, etl
Spark Bigquery
Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.
Stars: ✭ 65 (-76.1%)
Mutual labels:  schema, spark
Osom
An Awesome [/osom/] Object Data Modeling (Database Agnostic).
Stars: ✭ 68 (-75%)
Mutual labels:  schema, transformations
Bulk Writer
Provides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entities to table columns.
Stars: ✭ 210 (-22.79%)
Mutual labels:  pipeline, etl
etl
M-Lab ingestion pipeline
Stars: ✭ 15 (-94.49%)
Mutual labels:  pipeline, etl
Dataspherestudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+339.34%)
Mutual labels:  spark, etl
Udacity Data Engineering
Udacity Data Engineering Nano Degree (DEND)
Stars: ✭ 89 (-67.28%)
Mutual labels:  spark, etl
sparklanes
A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-93.75%)
Mutual labels:  pipeline, etl
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (+126.1%)
Mutual labels:  pipeline, etl
Phila Airflow
Stars: ✭ 16 (-94.12%)
Mutual labels:  pipeline, etl
Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+36.76%)
Mutual labels:  spark, etl
Graphql Parser
A graphql query language and schema definition language parser and formatter for rust
Stars: ✭ 203 (-25.37%)
Mutual labels:  schema, formatter
Mara Pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+576.84%)
Mutual labels:  pipeline, etl
Metl
mito ETL tool
Stars: ✭ 153 (-43.75%)
Mutual labels:  pipeline, etl
Luigi Warehouse
A luigi powered analytics / warehouse stack
Stars: ✭ 72 (-73.53%)
Mutual labels:  spark, etl
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+32.72%)
Mutual labels:  spark, etl
Transmogrifai
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Stars: ✭ 2,084 (+666.18%)
Mutual labels:  spark, transformations
lineage
Generate beautiful documentation for your data pipelines in markdown format
Stars: ✭ 16 (-94.12%)
Mutual labels:  pipeline, etl
data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Stars: ✭ 34 (-87.5%)
Mutual labels:  spark, transformations
Vue Form Generator
📋 A schema-based form generator component for Vue.js
Stars: ✭ 2,853 (+948.9%)
Mutual labels:  schema
unimport
A linter and formatter for finding and removing unused import statements.
Stars: ✭ 119 (-56.25%)
Mutual labels:  formatter
ploio
Safe, Reliable, and Fast Production Deployments for Kubernetes
Stars: ✭ 11 (-95.96%)
Mutual labels:  pipeline
snakefmt
The uncompromising Snakemake code formatter
Stars: ✭ 78 (-71.32%)
Mutual labels:  formatter
Typed Immutable
Immutable and structurally typed data
Stars: ✭ 263 (-3.31%)
Mutual labels:  schema
Big Data Rosetta Code
Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code
Stars: ✭ 254 (-6.62%)
Mutual labels:  spark
bandar-log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 20 (-92.65%)
Mutual labels:  etl
latent-semantic-analysis
Pipeline for training LSA models using Scikit-Learn.
Stars: ✭ 20 (-92.65%)
Mutual labels:  pipeline
kedro
A Python framework for creating reproducible, maintainable and modular data science code.
Stars: ✭ 6,068 (+2130.88%)
Mutual labels:  pipeline
spark-http-stream
Spark Structured Streaming via HTTP communication
Stars: ✭ 17 (-93.75%)
Mutual labels:  spark
ddquery
Django Debug Query (ddquery) beautiful colored SQL statements for logging
Stars: ✭ 25 (-90.81%)
Mutual labels:  formatter
Formvuelate
Dynamic schema-based form rendering for VueJS
Stars: ✭ 262 (-3.68%)
Mutual labels:  schema
Helk
The Hunting ELK
Stars: ✭ 3,097 (+1038.6%)
Mutual labels:  spark
spark-structured-streaming-examples
Spark Structured Streaming examples using version 3.0.0
Stars: ✭ 23 (-91.54%)
Mutual labels:  spark
grate
A Go native tabular data extraction package. Currently supports .xls, .xlsx, .csv, .tsv formats.
Stars: ✭ 98 (-63.97%)
Mutual labels:  etl
godot-exporter
Godot Engine Automation Pipeline Android – iOS – Linux – MacOS – Windows – HTML5 – Itch.io.
Stars: ✭ 54 (-80.15%)
Mutual labels:  pipeline
laravel-spark-camera
Profile Photo Camera support for Laravel Spark
Stars: ✭ 30 (-88.97%)
Mutual labels:  spark
fform
Flexible and extendable form builder with constructor
Stars: ✭ 26 (-90.44%)
Mutual labels:  schema
Dgsh
Shell supporting pipelines to and from multiple processes
Stars: ✭ 261 (-4.04%)
Mutual labels:  pipeline
hammer
🛠 hammer is a command-line tool for schema management for Google Cloud Spanner.
Stars: ✭ 38 (-86.03%)
Mutual labels:  schema
daf-kylo
Kylo integration with PDND (previously DAF).
Stars: ✭ 20 (-92.65%)
Mutual labels:  spark
dllib
dllib is a distributed deep learning library running on Apache Spark
Stars: ✭ 32 (-88.24%)
Mutual labels:  spark
pyrealtime
Realtime data processing and plotting pipelines in Python
Stars: ✭ 62 (-77.21%)
Mutual labels:  pipeline
Spotify-Song-Recommendation-ML
UC Berkeley team's submission for RecSys Challenge 2018
Stars: ✭ 70 (-74.26%)
Mutual labels:  spark
toml-sort
Toml sorting library
Stars: ✭ 31 (-88.6%)
Mutual labels:  formatter
Seapig
🌊🐷 Utility for generalized composition of React components
Stars: ✭ 269 (-1.1%)
Mutual labels:  schema
Phytouch
Smooth scrolling, rotation, pull to refresh, page transition and any motion for the web: a silky-smooth touch motion solution
Stars: ✭ 2,854 (+949.26%)
Mutual labels:  transformations
Docker Spark Cluster
A simple Spark standalone cluster for your testing environment purposes
Stars: ✭ 261 (-4.04%)
Mutual labels:  spark
currency edittext
Simple currency formatter for Android EditText
Stars: ✭ 64 (-76.47%)
Mutual labels:  formatter
ctdna-pipeline
A simplified pipeline for ctDNA sequencing data analysis
Stars: ✭ 29 (-89.34%)
Mutual labels:  pipeline
BlazorMonaco
Blazor component for Microsoft's Monaco Editor which powers Visual Studio Code.
Stars: ✭ 151 (-44.49%)
Mutual labels:  formatter
etl manager
A Python package to create a database on the platform using our moj data warehousing framework
Stars: ✭ 14 (-94.85%)
Mutual labels:  etl
spark learning
Spark learning materials from the 尚硅谷 (Atguigu) big data Spark course, latest 2019 edition
Stars: ✭ 42 (-84.56%)
Mutual labels:  spark
spark-data-sources
Developing Spark External Data Sources using the V2 API
Stars: ✭ 36 (-86.76%)
Mutual labels:  spark