schic / DQCS

License: LGPL-3.0
Data Quality Control System

Programming Languages

Java, HTML, Scala, CSS, JavaScript, Shell

Projects that are alternatives of or similar to DQCS

zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Stars: ✭ 655 (+1826.47%)
Mutual labels:  etl, dataquality
python_mozetl
ETL jobs for Firefox Telemetry
Stars: ✭ 25 (-26.47%)
Mutual labels:  etl
django-data-migration
Data migration framework for Django that migrates legacy data into your new django app
Stars: ✭ 18 (-47.06%)
Mutual labels:  etl
blockchain-etl-streaming
Streaming Ethereum and Bitcoin blockchain data to Google Pub/Sub or Postgres in Kubernetes
Stars: ✭ 57 (+67.65%)
Mutual labels:  etl
zdh_web
Big data collection and extraction platform
Stars: ✭ 292 (+758.82%)
Mutual labels:  etl
flock
Flock: A Low-Cost Streaming Query Engine on FaaS Platforms
Stars: ✭ 232 (+582.35%)
Mutual labels:  etl
naas
⚙️ Schedule notebooks, run them like APIs, expose securely your assets: Jupyter as a viable ⚡️ Production environment
Stars: ✭ 219 (+544.12%)
Mutual labels:  etl
covid-19
Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.
Stars: ✭ 14 (-58.82%)
Mutual labels:  etl
CVparser
CVparser is software for parsing or extracting data out of CV/resumes.
Stars: ✭ 28 (-17.65%)
Mutual labels:  etl
starlake
Starlake is a Spark Based On Premise and Cloud ELT/ETL Framework for Batch & Stream Processing
Stars: ✭ 16 (-52.94%)
Mutual labels:  etl
polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
Stars: ✭ 53 (+55.88%)
Mutual labels:  etl
datalake-etl-pipeline
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations
Stars: ✭ 39 (+14.71%)
Mutual labels:  etl
django-calaccess-raw-data
A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
Stars: ✭ 61 (+79.41%)
Mutual labels:  etl
csv-cruncher
Treats CSV and JSON files as SQL tables, and exports SQL SELECTs back to CSV or JSON.
Stars: ✭ 32 (-5.88%)
Mutual labels:  etl
csvplus
csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
Stars: ✭ 67 (+97.06%)
Mutual labels:  etl
morph-kgc
Powerful RDF Knowledge Graph Generation with [R2]RML Mappings
Stars: ✭ 77 (+126.47%)
Mutual labels:  etl
sync-engine-example
Synchronization Algorithm Exploration: Techniques to synchronize a SQL database with external destinations.
Stars: ✭ 17 (-50%)
Mutual labels:  etl
nasdaq-symbols
ETL for the NASDAQ symbol file
Stars: ✭ 13 (-61.76%)
Mutual labels:  etl
OpenKettleWebUI
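A Kettle-based web platform for scheduling and controlling data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.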
Stars: ✭ 138 (+305.88%)
Mutual labels:  etl
uptasticsearch
An Elasticsearch client tailored to data science workflows.
Stars: ✭ 47 (+38.24%)
Mutual labels:  etl

Data Quality Control Management System (based on DataCleaner)

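The Sichuan Province Data Quality Control Management System project localizes and extends the open-source software DataCleaner. DataCleaner is an application for data quality analysis, comparison, validation, and monitoring. It consists of a standalone graphical desktop client and a web application: the desktop client handles complex job configuration and analysis through its user interface, and the web application handles job scheduling and monitoring.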

Module structure

The main application modules are:

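  • api - The public API of DataCleaner: the main interfaces and annotations for creating your own extensions.
  • resources - Static resources of DataCleaner.
  • oss-branding - Icons and colors.
  • testware - Helper classes for unit testing DataCleaner and extension code.
  • engine
    • core - The core engine, which executes jobs and components according to the API.
    • xml-config - Utilities for reading and writing DataCleaner job files and configuration files.
    • env - Different/alternative environments that DataCleaner can run in, such as Apache Spark or webapp-cluster.
  • components
    • ... - Many submodules containing built-in components as well as additional components/extensions for use with DataCleaner.
    • standard-components - A container project that depends on all components normally bundled in the DataCleaner community edition.
  • desktop
    • api - The public API of the DataCleaner desktop application.
    • ui - The Swing-based user interface for desktop users.
  • monitor
    • api - API classes and interfaces of the monitor.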
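
The api module is described as a set of interfaces plus declarative annotations for writing extensions. The general pattern can be sketched in plain Java as follows. This is only an illustration: `ComponentName`, `Property`, `ValueTransformer`, `UpperCaseTransformer`, and `describe` are hypothetical names invented for this sketch and are not DataCleaner's actual types, so consult the api module for the real interfaces and annotations.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ComponentSketch {

    // Hypothetical annotation that gives a component a display name.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ComponentName {
        String value();
    }

    // Hypothetical annotation that marks a field as user-configurable.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Property {
    }

    // Hypothetical extension contract: map one input value to one output value.
    interface ValueTransformer {
        String transform(String input);
    }

    // An example extension that declares its name and one configurable property.
    @ComponentName("Upper case")
    static class UpperCaseTransformer implements ValueTransformer {
        @Property
        boolean trim = true;

        @Override
        public String transform(String input) {
            String value = trim ? input.trim() : input;
            return value.toUpperCase(Locale.ROOT);
        }
    }

    // What a host engine could do: read the declared metadata via reflection.
    static String describe(Class<?> type) {
        ComponentName name = type.getAnnotation(ComponentName.class);
        if (name == null) {
            throw new IllegalArgumentException("Not a component: " + type.getName());
        }
        List<String> properties = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Property.class)) {
                properties.add(field.getName());
            }
        }
        return name.value() + " " + properties;
    }

    public static void main(String[] args) {
        System.out.println(describe(UpperCaseTransformer.class));
        System.out.println(new UpperCaseTransformer().transform("  sichuan "));
    }
}
```

Declaring metadata through annotations lets a host application list components and their configurable properties without instantiating them, which is one plausible reason for an annotation-based extension API.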

Continuous integration

Build logs for DataCleaner are available on Travis CI:

https://travis-ci.org/datacleaner/DataCleaner
