productml / Blurr

Licence: apache-2.0
Data transformations for the ML era

Programming language: Python

Blurr

What is Blurr?

Blurr transforms structured, streaming raw data into features for model training and prediction using a high-level, expressive, YAML-based language called the Blurr Transform Spec (BTS). The BTS merges the schema and the computation model for data processing into a single definition.

The BTS is a data transform definition for structured data. The BTS encapsulates the business logic of data transforms and Blurr orchestrates the execution of data transforms. Blurr is runner-agnostic, so BTSs can be run by event processors such as Spark, Spark Streaming or Flink.
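
To make the idea of merging schema and computation concrete, here is a minimal Python sketch. It is a toy illustration only: the `SPEC` layout, the `run_spec` function, and the event fields (`user_id`, `event`, `score`) are all hypothetical and are not Blurr's BTS syntax or API. Each field declares its type (the schema) alongside the rule that updates it (the computation), and a small runner folds a stream of events into one feature record per identity.

```python
from collections import defaultdict

# Hypothetical spec (NOT real BTS syntax): each field carries both its
# type (schema) and its update rule (computation).
SPEC = {
    "identity": "user_id",
    "fields": [
        {
            "name": "games_played",
            "type": int,
            "when": lambda event: event["event"] == "game_start",
            "update": lambda current, event: current + 1,
        },
        {
            "name": "total_score",
            "type": int,
            "when": lambda event: event["event"] == "game_end",
            "update": lambda current, event: current + event["score"],
        },
    ],
}


def run_spec(spec, events):
    """Fold an iterable of event dicts into one feature record per identity."""
    # New identities start with each field at its type's default value (0 for int).
    features = defaultdict(lambda: {f["name"]: f["type"]() for f in spec["fields"]})
    for event in events:
        record = features[event[spec["identity"]]]
        for field in spec["fields"]:
            if field["when"](event):
                new_value = field["update"](record[field["name"]], event)
                record[field["name"]] = field["type"](new_value)
    return dict(features)


# Made-up sample events for illustration.
events = [
    {"user_id": "a", "event": "game_start"},
    {"user_id": "a", "event": "game_end", "score": 10},
    {"user_id": "b", "event": "game_start"},
    {"user_id": "a", "event": "game_start"},
]
features = run_spec(SPEC, events)
```

Because `run_spec` only needs an iterable of events, the same spec could in principle be applied by different runners (a local loop, or a per-partition function in a cluster engine), which is the spirit of the runner-agnostic claim above.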

Is Blurr for you?

Yes, if you are well along the ML 'curve of enlightenment' and are thinking about how to do online scoring.

Playground

Launch playground

Tutorial and Docs

"Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering." (Andrew Ng)

Read the docs

Streaming BTS Tutorial | Window BTS Tutorial

Preparing data for specific use cases using Blurr:

Contribute to Blurr

Welcome to the Blurr community! We are so glad that you share our passion for building MLOps!

Please create a new issue to begin a discussion. Alternatively, feel free to pick up an existing issue!

Please sign the Contributor License Agreement before raising a pull request.

Data Science 'Joel Test'

Inspired by the (old school) Joel Test to rate software teams, here's our version for data science teams. What's your score?

  1. Data pipelines are versioned and reproducible
  2. Pipelines (re)build in one step
  3. Deploying to production needs minimal engineering help
  4. Successful ML is a long game. You play it like it is
  5. Kaizen. Experimentation and iterations are a way of life

Roadmap

Blurr is currently in Developer Preview. To stay in touch, star this project or email [email protected]

  • Local transformations only
  • Support for custom functions and other python libraries in the BTS
  • Spark runner
  • S3 support for data sink
  • DynamoDB as an Intermediate Store
  • Features server