ETW2JSON: Tool and library to convert ETW logs to JSON files
Stars: ✭ 66 (-89.22%)
iex-stocks: ETL for the IEX Stocks API
Stars: ✭ 19 (-96.9%)
dominance-analysis: This package can be used for dominance analysis or Shapley Value Regression to find the relative importance of predictors on a given dataset. It can also be used for key driver analysis or marginal resource allocation models.
Stars: ✭ 111 (-81.86%)
dbd: dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Stars: ✭ 30 (-95.1%)
neo4j-jdbc: JDBC driver for Neo4j
Stars: ✭ 110 (-82.03%)
DataXServer: Provides remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) for DataX (https://github.com/alibaba/DataX).
Stars: ✭ 130 (-78.76%)
PracticalMachineLearning: A collection of ML-related material, including notebooks, code, and a curated list of useful resources such as books and software. Almost everything mentioned here is free (as in speech, not as in free food) or open source.
Stars: ✭ 60 (-90.2%)
Chatistics: A WhatsApp chat analyzer and statistics tool.
Stars: ✭ 32 (-94.77%)
wikirepo: Python-based Wikidata framework for easy dataframe extraction
Stars: ✭ 33 (-94.61%)
sparklanes: A lightweight data processing framework for Apache Spark
Stars: ✭ 17 (-97.22%)
Causing: CAUsal INterpretation using Graphs
Stars: ✭ 47 (-92.32%)
react-monitor-dag: A React-based operation/monitoring DAG diagram.
Stars: ✭ 57 (-90.69%)
openrefine-docker: OpenRefine is a free, open source power tool for working with messy data and improving it. This repository contains Dockerbuild files for automated builds.
Stars: ✭ 19 (-96.9%)
etl: M-Lab ingestion pipeline
Stars: ✭ 15 (-97.55%)
zca: ZCA whitening in Python
Stars: ✭ 29 (-95.26%)
daany: Daany (.NET DAta ANalYtics) is a .NET library implementing DataFrame, time series decompositions, and linear algebra routines (BLAS and LAPACK).
Stars: ✭ 49 (-91.99%)
maxwell-sink: Consumes Maxwell-generated messages from Kafka and exports them to another MySQL database.
Stars: ✭ 16 (-97.39%)
koza: Data transformation framework for LinkML data models
Stars: ✭ 21 (-96.57%)
soda-spark: Soda Spark is a PySpark library that helps you test your data in Spark DataFrames.
Stars: ✭ 58 (-90.52%)
ent: No description or website provided.
Stars: ✭ 33 (-94.61%)
dtd2mysql: MySQL / MariaDB import for DTD feeds (fares, timetable and routeing)
Stars: ✭ 25 (-95.92%)
CC33Z: Computer Science course
Stars: ✭ 50 (-91.83%)
dswarm: An open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)
Stars: ✭ 57 (-90.69%)
data-landing-zone: Template to deploy a single Data Landing Zone of the Data Management & Analytics Scenario (formerly Enterprise-Scale Analytics). The Data Landing Zone is a logical construct and a unit of scale in the architecture that enables data retention and execution of data workloads for generating insights and value with data.
Stars: ✭ 136 (-77.78%)
skutil: NOTE: skutil is now deprecated. See its sister project: https://github.com/tgsmith61591/skoot. Original description: A set of scikit-learn and h2o extension classes (as well as caret classes for Python). See more here: https://tgsmith61591.github.io/skutil
Stars: ✭ 29 (-95.26%)
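DataX-src: DataX is a widely used offline data synchronization tool/platform for heterogeneous data. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, OTS, ODPS, and others.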
Stars: ✭ 21 (-96.57%)
persistity: A persistence framework for game developers
Stars: ✭ 34 (-94.44%)
qsv: CSVs sliced, diced & analyzed.
Stars: ✭ 438 (-28.43%)
python mozetl: ETL jobs for Firefox Telemetry
Stars: ✭ 25 (-95.92%)
ITMO: All labs and reports for the Computer Engineering (VT) department, SPPO program, at ITMO University
Stars: ✭ 63 (-89.71%)
chronicle-etl: 📜 A CLI toolkit for extracting and working with your digital history
Stars: ✭ 78 (-87.25%)
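mydataharbor: 🇨🇳 MyDataHarbor is a distributed, highly scalable, high-performance, transaction-level data synchronization middleware that aims to sync data from any source to any destination. It helps users run reliable, fast, and stable near-real-time incremental synchronization or scheduled full synchronization of massive datasets. It is mainly aimed at real-time transaction systems and can also be used for big-data synchronization (the ETL domain).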
Stars: ✭ 28 (-95.42%)
dataframe: Structured data processing in Kotlin
Stars: ✭ 319 (-47.88%)
DL-DB: Deep learning for time-varying multi-entity datasets
Stars: ✭ 17 (-97.22%)
PDAP-Scrapers: Code relating to scraping public police data.
Stars: ✭ 72 (-88.24%)
Data-Science-101: Notes and tutorials on how to use Python, pandas, seaborn, numpy, matplotlib, scipy for data science.
Stars: ✭ 19 (-96.9%)
grailer: Web scraping tool for grailed.com
Stars: ✭ 30 (-95.1%)
go-bqloader: bqloader is a simple ETL framework to load data from Cloud Storage into BigQuery.
Stars: ✭ 16 (-97.39%)
cobrix: A COBOL parser and Mainframe/EBCDIC data source for Apache Spark
Stars: ✭ 109 (-82.19%)
wrangle: A data transformation package for deep learning with Autonomio, Keras and TensorFlow.
Stars: ✭ 15 (-97.55%)
bow: Go data analysis / manipulation library built on top of Apache Arrow
Stars: ✭ 20 (-96.73%)
clink: Clink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators that can be used in both C++ and Java runtimes.
Stars: ✭ 24 (-96.08%)
mik: The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservation systems).
Stars: ✭ 32 (-94.77%)
webspicy: A technology-agnostic specification and test framework that yields better coverage for less testing effort.
Stars: ✭ 42 (-93.14%)
mune: Simple stock price analytics
Stars: ✭ 14 (-97.71%)
feng: feature engineering for machine-learning champions
Stars: ✭ 27 (-95.59%)
singer-runner: A CLI and library to run Singer Taps and Targets
Stars: ✭ 33 (-94.61%)