incubator-linkis: Linkis helps you easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+4453.7%)
benten: A language server for Common Workflow Language
Stars: ✭ 50 (-7.41%)
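DataCon: 🏆 DataCon Big Data Security Analysis Competition, first-place source code for 2019 Track 2 (malicious code detection) and third-place source code for 2020 Track 5 (malicious code analysis)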
Stars: ✭ 69 (+27.78%)
Information-Retrieval: Information Retrieval algorithms developed in Python. To follow the blog posts, click on the link:
Stars: ✭ 103 (+90.74%)
obsplus: A Pandas-Centric ObsPy Expansion Pack
Stars: ✭ 28 (-48.15%)
frovedis: Framework of vectorized and distributed data analytics
Stars: ✭ 59 (+9.26%)
alfred-mailto: Send emails to recipients and groups from Alfred
Stars: ✭ 59 (+9.26%)
Chapter-2: Code examples for Chapter 2 of Data Wrangling with JavaScript
Stars: ✭ 16 (-70.37%)
pytd: Treasure Data Driver for Python
Stars: ✭ 15 (-72.22%)
elegant-git: Elegant Git is an assistant that carefully automates routine work with Git.
Stars: ✭ 38 (-29.63%)
tutorials: Short programming tutorials pertaining to data analysis.
Stars: ✭ 14 (-74.07%)
DataProfiler: What's in your data? Extract schema, statistics and entities from datasets
Stars: ✭ 843 (+1461.11%)
blog: Blog entries
Stars: ✭ 39 (-27.78%)
quickstep: Quickstep project
Stars: ✭ 22 (-59.26%)
zenaton-ruby: 💎 Ruby gem to run and orchestrate background jobs with Zenaton Workflow Engine
Stars: ✭ 32 (-40.74%)
onelinerhub: 2.5k code solutions with clear explanations @ onelinerhub.com
Stars: ✭ 645 (+1094.44%)
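leaflet heatmap: A simple visualization of Huzhou call data. Assuming the data volume is too large to draw the heatmap directly in the browser, the heatmap rendering step is moved to offline computation. The data is processed in parallel with Apache Spark, the heatmap is then drawn with Apache Spark, and leafletjs loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than the single-machine computation. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .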
Stars: ✭ 13 (-75.93%)
autoencoders tensorflow: Automatic feature engineering using deep learning and Bayesian inference using TensorFlow.
Stars: ✭ 66 (+22.22%)
mimir: Data-ish exploration through SQL+Uncertainty
Stars: ✭ 26 (-51.85%)
weaverbird: A visual data pipeline builder with various backends
Stars: ✭ 65 (+20.37%)
ACEseqWorkflow: Allele-specific copy number estimation with whole genome sequencing
Stars: ✭ 19 (-64.81%)
toucan-connectors: Connectors available to retrieve data in Toucan Toco small apps
Stars: ✭ 13 (-75.93%)
kafka-compose: 🎼 Docker Compose files for various Kafka stacks
Stars: ✭ 32 (-40.74%)
fal: Do more with dbt. fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning models.
Stars: ✭ 567 (+950%)
swordfish: Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-35.19%)
my curd: An ultra-lightweight rapid-development scaffold and workflow platform.
Stars: ✭ 38 (-29.63%)
spark-util: Low-level helpers for Apache Spark libraries and tests
Stars: ✭ 16 (-70.37%)
traceml: Engine for ML/Data tracking, visualization, dashboards, and model UI for Polyaxon.
Stars: ✭ 445 (+724.07%)
sparkar-volts: An extensive non-reactive TypeScript framework that eases the development experience in Spark AR
Stars: ✭ 15 (-72.22%)
chatstats: 💬📊 Fun data visualizations for Facebook Messenger chats
Stars: ✭ 18 (-66.67%)
Python-Data-Wrangling: D-Lab's 3-hour introduction to data wrangling in Python. Learn how to import and manipulate dataframes using pandas in Python.
Stars: ✭ 41 (-24.07%)
query2report: Query2Report is a simple open source business intelligence platform that allows users to build reports and dashboards for business analytics or enterprise reporting
Stars: ✭ 43 (-20.37%)
experiments: Code examples for my blog posts
Stars: ✭ 21 (-61.11%)
web-dashboard-demo: The following application contains the DevExpress Dashboard Component for Angular. The client side is hosted on GitHub Pages and gets data from the server side, which is hosted on DevExpress.com.
Stars: ✭ 65 (+20.37%)
tellery: Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
Stars: ✭ 219 (+305.56%)
pre-commit-dbt: 🎣 List of `pre-commit` hooks to ensure the quality of your `dbt` projects.
Stars: ✭ 149 (+175.93%)
openverse-catalog: Identifies and collects data on CC-licensed content across web crawl data and public APIs.
Stars: ✭ 27 (-50%)
machine-learning-capstone-project: This is the final project for the Udacity Machine Learning Nanodegree: predicting article retweets and likes based on the title using machine learning
Stars: ✭ 28 (-48.15%)
harlan: Harlan is the modular system that lets you automate all of your registration-data governance from the cloud.
Stars: ✭ 25 (-53.7%)
Papers4DataAchitect: A collection of papers on data engineering topics such as OLTP, OLAP, ETL, and distributed storage.
Stars: ✭ 17 (-68.52%)
pantab: Read/Write pandas DataFrames with Tableau Hyper Extracts
Stars: ✭ 64 (+18.52%)
iSkyLIMS: An open-source LIMS (Laboratory Information Management System) for Next Generation Sequencing sample management, statistics and reports, and bioinformatics analysis service management.
Stars: ✭ 33 (-38.89%)
five-minute-midas: Predicting Profitable Day Trading Positions using Decision Tree Classifiers. scikit-learn | Flask | SQLite3 | pandas | MLflow | Heroku | Streamlit
Stars: ✭ 41 (-24.07%)
stargate: An Apache Pulsar client written in Elixir
Stars: ✭ 33 (-38.89%)
release-notify-action: GitHub Action that triggers e-mails with release notes when releases are created
Stars: ✭ 64 (+18.52%)