GeniA Clojure dataframe library that runs on Spark
Stars: ✭ 152 (+10.95%)
Drake ExamplesExample workflows for the drake R package
Stars: ✭ 57 (-58.39%)
SetlA simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-42.34%)
DrakeAn R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+849.64%)
Just Dashboard📊 📋 Dashboards using YAML or JSON files
Stars: ✭ 1,511 (+1002.92%)
DataflowjavasdkGoogle Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+523.36%)
TargetsFunction-oriented Make-like declarative workflows for R
Stars: ✭ 293 (+113.87%)
VizukaExplore high-dimensional datasets and how your algo handles specific regions.
Stars: ✭ 100 (-27.01%)
PretzelJavascript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-81.02%)
ClevercsvCleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Stars: ✭ 887 (+547.45%)
Griffon VmGriffon Data Science Virtual Machine
Stars: ✭ 128 (-6.57%)
BiolitmapCode for the paper "BIOLITMAP: a web-based geolocated and temporal visualization of the evolution of bioinformatics publications" in Oxford Bioinformatics.
Stars: ✭ 18 (-86.86%)
Data Science On GcpSource code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017
Stars: ✭ 864 (+530.66%)
TadwAn implementation of "Network Representation Learning with Rich Text Information" (IJCAI '15).
Stars: ✭ 43 (-68.61%)
AttacaRobust, distributed version control for large files.
Stars: ✭ 41 (-70.07%)
BatchtoolsTools for computation on batch systems
Stars: ✭ 127 (-7.3%)
Etherscan MlPython Data Science and Machine Learning Library for the Ethereum and ERC-20 Blockchain
Stars: ✭ 55 (-59.85%)
Steppy ToolkitCurated set of transformers that make your work with steppy faster and more effective 🔭
Stars: ✭ 21 (-84.67%)
Datumbox FrameworkDatumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (+675.91%)
VerticapyVerticaPy is a Python library that exposes sci-kit like functionality to conduct data science projects on data stored in Vertica, thus taking advantage Vertica’s speed and built-in analytics and machine learning capabilities.
Stars: ✭ 59 (-56.93%)
Linkedingiveaway👨🏽🏫You can learn about anything over here. What Giveaways I do and why it's important in today's modern world. Are you interested in Giveaway's?🔋
Stars: ✭ 67 (-51.09%)
SaynData processing and modelling framework for automating tasks (incl. Python & SQL transformations).
Stars: ✭ 79 (-42.34%)
DexDex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Stars: ✭ 1,238 (+803.65%)
PipelinexPipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more
Stars: ✭ 127 (-7.3%)
Pyclusteringpyclustring is a Python, C++ data mining library.
Stars: ✭ 806 (+488.32%)
PrefectThe easiest way to automate your data
Stars: ✭ 7,956 (+5707.3%)
Model Describermodel-describer : Making machine learning interpretable to humans
Stars: ✭ 22 (-83.94%)
Cookbook 2ndIPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Stars: ✭ 704 (+413.87%)
Applied Ml📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
Stars: ✭ 17,824 (+12910.22%)
AutodlAutomated Deep Learning without ANY human intervention. 1'st Solution for AutoDL [email protected]
Stars: ✭ 854 (+523.36%)
FeastFeature Store for Machine Learning
Stars: ✭ 2,576 (+1780.29%)
Mldmпотоковый курс "Машинное обучение и анализ данных (Machine Learning and Data Mining)" на факультете ВМК МГУ имени М.В. Ломоносова
Stars: ✭ 35 (-74.45%)
Dvc🦉Data Version Control | Git for Data & Models | ML Experiments Management
Stars: ✭ 9,004 (+6472.26%)
Php MlPHP-ML - Machine Learning library for PHP
Stars: ✭ 7,900 (+5666.42%)
PycmMulti-class confusion matrix library in Python
Stars: ✭ 1,076 (+685.4%)
Spark Py NotebooksApache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 1,338 (+876.64%)
Graph samplingGraph Sampling is a python package containing various approaches which samples the original graph according to different sample sizes.
Stars: ✭ 99 (-27.74%)
RsparklingRSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-52.55%)
Pyspark Example ProjectExample project implementing best practices for PySpark ETL jobs and applications.
Stars: ✭ 633 (+362.04%)
Tsv UtilseBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Stars: ✭ 1,215 (+786.86%)
Tsrepr TSrepr: R package for time series representations
Stars: ✭ 75 (-45.26%)
ButterfreeA tool for building feature stores.
Stars: ✭ 126 (-8.03%)
Papers Literature Ml Dl Rl AiHighly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Stars: ✭ 1,341 (+878.83%)
CookbookThe Data Engineering Cookbook
Stars: ✭ 9,829 (+7074.45%)
SupersetApache Superset is a Data Visualization and Data Exploration Platform
Stars: ✭ 42,634 (+31019.71%)
Spark R Notebooks R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks
Stars: ✭ 109 (-20.44%)
Pythondatarepo for code published on pythondata.com
Stars: ✭ 113 (-17.52%)
Tennis Crystal BallUltimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Stars: ✭ 107 (-21.9%)
Aws Data WranglerPandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
Stars: ✭ 2,385 (+1640.88%)
Rightmove webscraper.pyPython class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Stars: ✭ 125 (-8.76%)
DataprooferA proofreader for your data
Stars: ✭ 628 (+358.39%)
Data Science CareerCareer Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (+359.85%)
SteppyLightweight, Python library for fast and reproducible experimentation 🔬
Stars: ✭ 119 (-13.14%)