840 open source projects that are alternatives to or similar to Shifu

Pydis
A simple longslit spectroscopy pipeline in Python
Stars: ✭ 37 (-82.13%)
Mutual labels:  pipeline
Winutils
winutils.exe hadoop.dll and hdfs.dll binaries for hadoop windows
Stars: ✭ 657 (+217.39%)
Mutual labels:  hadoop
Useractionanalyzeplatform
Big data platform for e-commerce user behavior analysis
Stars: ✭ 645 (+211.59%)
Mutual labels:  hadoop
Core
The safe post-production pipeline - https://getavalon.github.io/2.0
Stars: ✭ 162 (-21.74%)
Mutual labels:  pipeline
Argo Cd
Declarative continuous deployment for Kubernetes.
Stars: ✭ 7,887 (+3710.14%)
Mutual labels:  pipeline
Supra
SUPRA: Software Defined Ultrasound Processing for Real-Time Applications - An Open Source 2D and 3D Pipeline from Beamforming to B-Mode
Stars: ✭ 96 (-53.62%)
Mutual labels:  pipeline
Go Streams
A lightweight stream processing library for Go
Stars: ✭ 615 (+197.1%)
Mutual labels:  pipeline
Eel Sdk
Big Data Toolkit for the JVM
Stars: ✭ 140 (-32.37%)
Mutual labels:  hadoop
Blurr
Data transformations for the ML era
Stars: ✭ 96 (-53.62%)
Mutual labels:  pipeline
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (+185.02%)
Mutual labels:  pipeline
Jenkinsdocs
Jenkins practice documentation. Latest site address: http://www.idevops.site
Stars: ✭ 200 (-3.38%)
Mutual labels:  pipeline
Proposal Pipeline Operator
A proposal for adding a useful pipe operator to JavaScript.
Stars: ✭ 5,899 (+2749.76%)
Mutual labels:  pipeline
Nextflow
A DSL for data-driven computational pipelines
Stars: ✭ 1,337 (+545.89%)
Mutual labels:  pipeline
Hadoop study
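Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos (the project includes Hadoop mind maps, Evernote notes, simple demos in Scala, common utility classes, and desensitized train code; continuously updated!)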
Stars: ✭ 567 (+173.91%)
Mutual labels:  hadoop
Go spider
[Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.
Stars: ✭ 1,745 (+743%)
Mutual labels:  pipeline
Deep Forest
An Efficient, Scalable and Optimized Python Framework for Deep Forest (2021.2.1)
Stars: ✭ 547 (+164.25%)
Mutual labels:  random-forest
Vistrails
VisTrails is an open-source data analysis and visualization tool. It provides a comprehensive provenance infrastructure that maintains detailed history information about the steps followed and data derived in the course of an exploratory task: VisTrails maintains provenance of data products, of the computational processes that derive these products and their executions.
Stars: ✭ 94 (-54.59%)
Mutual labels:  pipeline
Ttyplot
a realtime plotting utility for terminal/console with data input from stdin
Stars: ✭ 532 (+157%)
Mutual labels:  pipeline
Machine Learning Models
Decision Trees, Random Forest, Dynamic Time Warping, Naive Bayes, KNN, Linear Regression, Logistic Regression, Mixture Of Gaussian, Neural Network, PCA, SVD, Gaussian Naive Bayes, Fitting Data to Gaussian, K-Means
Stars: ✭ 160 (-22.71%)
Mutual labels:  random-forest
Hyperparameter Optimization Of Machine Learning Algorithms
Implementation of hyperparameter optimization/tuning methods for machine learning & deep learning models (easy&clear)
Stars: ✭ 516 (+149.28%)
Mutual labels:  random-forest
Wifi
Big data query and analysis system based on information captured over Wi-Fi
Stars: ✭ 93 (-55.07%)
Mutual labels:  hadoop
Machinelearnjs
Machine Learning library for the web and Node.
Stars: ✭ 498 (+140.58%)
Mutual labels:  random-forest
Xlearning
AI on Hadoop
Stars: ✭ 1,709 (+725.6%)
Mutual labels:  hadoop
Gis Tools For Hadoop
The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
Stars: ✭ 485 (+134.3%)
Mutual labels:  hadoop
Mnemonic
Apache Mnemonic - A non-volatile hybrid memory storage oriented library
Stars: ✭ 91 (-56.04%)
Mutual labels:  bigdata
Pdf
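Programming e-books, e-books, and programming books, covering C, C#, Docker, Elasticsearch, Git, Hadoop, HeadFirst, Java, Javascript, jvm, Kafka, Linux, Maven, MongoDB, MyBatis, MySQL, Netty, Nginx, Python, RabbitMQ, Redis, Scala, Solr, Spark, Spring, SpringBoot, SpringCloud, TCPIP, Tomcat, Zookeeper, artificial intelligence, big data, concurrent programming, databases, data mining, new interview questions, architecture design, algorithm series, computer science, design patterns, software testing, refactoring and optimization, and more categories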
Stars: ✭ 12,009 (+5701.45%)
Mutual labels:  hadoop
Chefboost
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and Adaboost w/categorical features support for Python
Stars: ✭ 176 (-14.98%)
Mutual labels:  random-forest
Gaia
Build powerful pipelines in any programming language.
Stars: ✭ 4,534 (+2090.34%)
Mutual labels:  pipeline
Drake
An R-focused pipeline toolkit for reproducibility and high-performance computing
Stars: ✭ 1,301 (+528.5%)
Mutual labels:  pipeline
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+10551.21%)
Mutual labels:  hadoop
Jenkins Pipeline Library
wcm.io Jenkins Pipeline Library for CI/CD
Stars: ✭ 134 (-35.27%)
Mutual labels:  pipeline
The App
Sample application and CD Pipeline for DevOps Dojo
Stars: ✭ 88 (-57.49%)
Mutual labels:  pipeline
Optimus
🚚 Agile Data Preparation Workflows made easy with dask, cudf, dask_cudf and pyspark
Stars: ✭ 986 (+376.33%)
Mutual labels:  bigdata
Circosjs
d3 library to build circular graphs
Stars: ✭ 436 (+110.63%)
Mutual labels:  bigdata
Aws Serverless Cicd Workshop
Learn how to build a CI/CD pipeline for SAM-based applications
Stars: ✭ 158 (-23.67%)
Mutual labels:  pipeline
Cortx
CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.
Stars: ✭ 426 (+105.8%)
Mutual labels:  bigdata
Biglasso
biglasso: Extending Lasso Model Fitting to Big Data in R
Stars: ✭ 87 (-57.97%)
Mutual labels:  bigdata
Rush
A cross-platform command-line tool for executing jobs in parallel
Stars: ✭ 421 (+103.38%)
Mutual labels:  pipeline
Karton
Distributed malware processing framework based on Python, Redis and MinIO.
Stars: ✭ 134 (-35.27%)
Mutual labels:  pipeline
Marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
Stars: ✭ 414 (+100%)
Mutual labels:  hadoop
Text classification
Text Classification Algorithms: A Survey
Stars: ✭ 1,276 (+516.43%)
Mutual labels:  random-forest
Serving
A flexible, high-performance carrier for machine learning models (PaddlePaddle serving deployment framework)
Stars: ✭ 403 (+94.69%)
Mutual labels:  pipeline
Pipeline.rs
☔️ => ⛅️ => ☀️
Stars: ✭ 188 (-9.18%)
Mutual labels:  pipeline
Pex Context
Modern WebGL state wrapper for PEX: allocate GPU resources (textures, buffers), setup state pipelines and passes, and combine them into commands.
Stars: ✭ 117 (-43.48%)
Mutual labels:  pipeline
Weblogsanalysissystem
A big data platform for analyzing web access logs
Stars: ✭ 37 (-82.13%)
Mutual labels:  hadoop
Bio embeddings
Get protein embeddings from protein sequences
Stars: ✭ 86 (-58.45%)
Mutual labels:  pipeline
Pytorch classification
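A complete codebase for image classification with PyTorch: training, prediction, TTA, model ensembling, model deployment, CNN feature extraction, classification with SVM or random forest, and model distillation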
Stars: ✭ 395 (+90.82%)
Mutual labels:  random-forest
Mara Pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Stars: ✭ 1,841 (+789.37%)
Mutual labels:  pipeline
Iceberg
Iceberg is a table format for large, slow-moving tabular data
Stars: ✭ 393 (+89.86%)
Mutual labels:  hadoop
Clusterflow
A pipelining tool to automate and standardise bioinformatics analyses on cluster environments.
Stars: ✭ 85 (-58.94%)
Mutual labels:  pipeline
Orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Stars: ✭ 389 (+87.92%)
Mutual labels:  hadoop
Spacy Wordnet
spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface
Stars: ✭ 156 (-24.64%)
Mutual labels:  pipeline
Learning Spark
Learning Spark from scratch; big data learning
Stars: ✭ 37 (-82.13%)
Mutual labels:  hadoop
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+2276.33%)
Mutual labels:  pipeline
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-27.54%)
Mutual labels:  hadoop
Lastbackend
System for containerized apps management. From build to scaling.
Stars: ✭ 1,536 (+642.03%)
Mutual labels:  pipeline
Mlj.jl
A Julia machine learning framework
Stars: ✭ 982 (+374.4%)
Mutual labels:  pipeline
Jsr203 Hadoop
A Java NIO file system provider for HDFS
Stars: ✭ 35 (-83.09%)
Mutual labels:  hadoop
Datax
DataX is an open source universal ETL tool that supports Cassandra, ClickHouse, DBF, Hive, InfluxDB, Kudu, MySQL, Oracle, Presto(Trino), PostgreSQL, SQL Server
Stars: ✭ 116 (-43.96%)
Mutual labels:  hadoop
Cimonitor
Displays CI statuses on a dashboard and triggers fun modules representing the status!
Stars: ✭ 34 (-83.57%)
Mutual labels:  pipeline