sql-machine-learning / Sqlflow

Licence: apache-2.0
Brings SQL and AI together.

SQLFlow

What is SQLFlow

SQLFlow is a compiler that compiles a SQL program into a workflow that runs on Kubernetes. The input is a SQL program written in our extended SQL grammar, which supports AI jobs including training, prediction, model evaluation, model explanation, custom jobs, and mathematical programming. The output is an Argo workflow that runs distributed on a Kubernetes cluster.
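
To make the input-to-output mapping concrete, the following Python sketch turns a small SQL program into a dictionary shaped like an Argo Workflow manifest, with one sequential step per statement. It is only an illustration of the idea, not SQLFlow's actual code generator: the function names, the `example/sql-step-runner` image, and the naive split on semicolons are all assumptions made for this example.

```python
def split_statements(program):
    """Naively split a SQL program on semicolons.

    Assumption for this sketch: no semicolons appear inside string literals.
    """
    return [s.strip() for s in program.split(";") if s.strip()]


def to_workflow(program, image="example/sql-step-runner"):
    """Build an Argo-Workflow-shaped dict with one sequential step per statement.

    The image name is a hypothetical placeholder, not a real SQLFlow image.
    """
    templates = []
    steps = []
    for i, stmt in enumerate(split_statements(program)):
        name = f"step-{i}"
        # Each statement gets its own container template.
        templates.append({
            "name": name,
            "container": {"image": image, "args": [stmt]},
        })
        # In Argo, each outer entry of `steps` runs after the previous one.
        steps.append([{"name": name, "template": name}])
    templates.insert(0, {"name": "main", "steps": steps})
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "sql-program-"},
        "spec": {"entrypoint": "main", "templates": templates},
    }


program = """
SELECT * FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;

SELECT * FROM iris.test
TO PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;
"""
workflow = to_workflow(program)
```

Here the training statement becomes `step-0` and the prediction statement becomes `step-1`, so prediction only starts after training has finished. A real compiler would also need to analyze dependencies between statements; this sketch simply assumes they run in program order.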

SQLFlow supports database systems such as MySQL, MariaDB, TiDB, Hive, and MaxCompute, and machine learning toolkits such as TensorFlow, Keras, and XGBoost.

Try SQLFlow now in our playground at https://playground.sqlflow.tech/ and check out the handy tutorials there.

Motivation

Developing ML-based applications today requires a team of data engineers, data scientists, and business analysts, as well as a proliferation of advanced languages and programming tools such as Python, SQL, SAS, SASS, Julia, and R. This fragmentation of tooling and development environments adds engineering difficulty to model training and tuning. What if we marry SQL, the most widely used data management and processing language, with ML and system capabilities, and let engineers with SQL skills develop advanced ML-based applications?

There is already some work in progress in the industry. We can write simple machine learning prediction (or scoring) algorithms in SQL using operators like DOT_PRODUCT. However, this requires copying and pasting model parameters from the training program into SQL statements. In the commercial world, some proprietary SQL engines provide extensions to support machine learning capabilities:

  • Microsoft SQL Server: Microsoft SQL Server has the machine learning service that runs machine learning programs in R or Python as an external script.
  • Teradata SQL for DL: Teradata also provides a RESTful service, which is callable from the extended SQL SELECT syntax.
  • Google BigQuery: Google BigQuery enables machine learning in SQL by introducing the CREATE MODEL statement.
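
As an illustration of the DOT_PRODUCT-style scoring mentioned before the list, a linear model's score is the dot product of its weights with the feature columns plus a bias, which can be spelled out as ordinary SQL arithmetic. The Python sketch below generates such a query from hard-coded parameters. The function name, weights, bias, and the `score` column alias are made-up values for this example; the point is that the parameters have to be copied out of the training program again every time the model is retrained.

```python
def scoring_sql(table, columns, weights, bias):
    """Render a SELECT that scores each row with a linear model.

    The score is sum(weight_i * column_i) + bias, i.e. a dot product
    written out term by term so no engine-specific operator is needed.
    """
    if len(columns) != len(weights):
        raise ValueError("need exactly one weight per column")
    terms = [f"({w!r} * {c})" for c, w in zip(columns, weights)]
    expr = " + ".join(terms + [repr(bias)])
    return f"SELECT {expr} AS score FROM {table};"


# Hypothetical parameters, as if copied by hand from a training run.
sql = scoring_sql(
    table="iris.test",
    columns=["sepal_length", "petal_length"],
    weights=[0.5, -1.25],
    bias=0.1,
)
print(sql)
```

This works for a linear model, but it quickly becomes impractical for deep networks or tree ensembles, where the parameters cannot reasonably be inlined into a query.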

None of the existing solutions solves our pain points. Instead, we want a solution that is fully extensible:

  1. This solution should be compatible with many SQL engines, instead of a specific version or type.
  2. It should support sophisticated machine learning models, including TensorFlow for deep learning and XGBoost for trees.
  3. We also want the flexibility to configure and run cutting-edge ML algorithms, including specifying feature crosses. At the very least, no Python or R code should be embedded in the SQL statements, and the solution should be fully integrated with hyperparameter estimation.

Quick Overview

Here are examples of training a TensorFlow DNNClassifier model on the sample data iris.train and running prediction with the trained model. You can see how cool it is to write some elegant ML code using SQL:

sqlflow> SELECT *
FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;

...
Training set accuracy: 0.96721
Done training
sqlflow> SELECT *
FROM iris.test
TO PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;

...
Done predicting. Predict table : iris.predict
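
The training statement above has a regular clause structure: a standard SELECT, followed by TO TRAIN with an estimator name, WITH attributes, COLUMN features, a LABEL, and an INTO target. The Python sketch below splits such a statement into those parts with a regular expression. It is a toy illustration, not SQLFlow's real parser, and it assumes that every clause is present in exactly the order shown; the names `TRAIN_RE` and `parse_train` are invented for this example.

```python
import re

# Toy pattern for the TO TRAIN statement shape shown above (not SQLFlow's parser).
TRAIN_RE = re.compile(
    r"^(?P<select>.+?)\s+TO\s+TRAIN\s+(?P<estimator>\S+)"
    r"\s+WITH\s+(?P<attrs>.+?)"
    r"\s+COLUMN\s+(?P<columns>.+?)"
    r"\s+LABEL\s+(?P<label>\S+)"
    r"\s+INTO\s+(?P<model>[^\s;]+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_train(statement):
    """Split a TO TRAIN statement into its clauses; return None if no match."""
    m = TRAIN_RE.match(statement.strip())
    if m is None:
        return None
    # Split attributes only on commas that start a new `key = value` pair, so
    # the comma inside a list value such as [10, 20] is left alone.
    pairs = re.split(r",\s*(?=[\w.]+\s*=)", m.group("attrs"))
    attrs = {}
    for pair in pairs:
        key, value = pair.split("=", 1)
        attrs[key.strip()] = value.strip()  # values are kept as raw strings
    return {
        "select": " ".join(m.group("select").split()),
        "estimator": m.group("estimator"),
        "attrs": attrs,
        "columns": [c.strip() for c in m.group("columns").split(",")],
        "label": m.group("label"),
        "model": m.group("model"),
    }


parsed = parse_train("""
SELECT *
FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;
""")
```

The `select` part is plain SQL that any supported database could run on its own, while the remaining fields describe the ML job. That separation is what lets the extended syntax sit on top of different SQL engines.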

How to use SQLFlow

Contributing Guidelines

Roadmap

We would love SQLFlow to support as many mainstream ML frameworks and data sources as possible, but this expansion would be hard to achieve on our own. We would love to hear your opinions on which ML frameworks and data sources you currently use and build upon. Please refer to our roadmap for specific timelines, and let us know your current scenarios and interests around the SQLFlow project so that we can prioritize based on feedback from the community.

Feedback

Your feedback is our motivation to move forward. Please let us know your questions, concerns, and issues by filing GitHub Issues.

License

Apache License 2.0
