1622 open source projects that are alternatives to or similar to Attaca

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+13695.12%)

Mutual labels: data-science, big-data, distributed

Griffon Data Science Virtual Machine

Stars: ✭ 128 (+212.2%)

Mutual labels: data-science, big-data

VerticaPy is a Python library that exposes scikit-like functionality to conduct data science projects on data stored in Vertica, thus taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.

Stars: ✭ 59 (+43.9%)

Mutual labels: data-science, big-data

Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)

Stars: ✭ 153 (+273.17%)

Mutual labels: data-science, big-data

s3git: git for Cloud Storage. Distributed Version Control for Data. Create decentralized and versioned repos that scale infinitely to 100s of millions of files. Clone huge PB-scale repos on your local SSD to make changes, commit and push back. Oh yeah, it dedupes too and offers directory versioning.

Stars: ✭ 1,287 (+3039.02%)

Mutual labels: distributed, version-control

Hazelcast Go Client

Hazelcast IMDG Go Client

Stars: ✭ 140 (+241.46%)

Mutual labels: big-data, distributed

Explore high-dimensional datasets and how your algo handles specific regions.

Stars: ✭ 100 (+143.9%)

Mutual labels: data-science, big-data

Tennis Crystal Ball

Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction

Stars: ✭ 107 (+160.98%)

Mutual labels: data-science, big-data

Datascience Ai Machinelearning Resources

Alex Castrounis' curated set of resources for artificial intelligence (AI), machine learning, data science, internet of things (IoT), and more.

Stars: ✭ 414 (+909.76%)

Mutual labels: data-science, big-data

InsightEdge Core

Stars: ✭ 22 (-46.34%)

Mutual labels: big-data, distributed

Quiz & Assignment of Coursera

Stars: ✭ 454 (+1007.32%)

Mutual labels: data-science, big-data

Reproducible Data Science at Scale!

Stars: ✭ 5,305 (+12839.02%)

Mutual labels: data-science, big-data

Sciblog support

Support content for my blog

Stars: ✭ 694 (+1592.68%)

Mutual labels: data-science, big-data

Hazelcast Nodejs Client

Hazelcast IMDG Node.js Client

Stars: ✭ 124 (+202.44%)

Mutual labels: big-data, distributed

Ansible Role Git

Ansible Role - Git

Stars: ✭ 153 (+273.17%)

Mutual labels: distributed, version-control

Maze Applied Reinforcement Learning Framework

Stars: ✭ 85 (+107.32%)

Mutual labels: data-science, distributed

Spark Py Notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 1,338 (+3163.41%)

Mutual labels: data-science, big-data

📊 📋 Dashboards using YAML or JSON files

Stars: ✭ 1,511 (+3585.37%)

Mutual labels: data-science, big-data

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)

Stars: ✭ 65 (+58.54%)

Mutual labels: data-science, big-data

hazelcast-csharp-client

Hazelcast .NET Client

Stars: ✭ 98 (+139.02%)

Mutual labels: big-data, distributed

A distributed, fast open-source graph database featuring horizontal scalability and high availability

Stars: ✭ 8,196 (+19890.24%)

Mutual labels: big-data, distributed

Open source production model management tool for data scientists

Stars: ✭ 334 (+714.63%)

Mutual labels: data-science, version-control

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (+11073.17%)

Mutual labels: data-science, big-data

Data Science Career

Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository

Stars: ✭ 630 (+1436.59%)

Mutual labels: data-science, big-data

Efficient video analysis at scale

Stars: ✭ 569 (+1287.8%)

Mutual labels: big-data, distributed

Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.

Stars: ✭ 787 (+1819.51%)

Mutual labels: big-data, distributed

PLynx is a domain agnostic platform for managing reproducible experiments and data-oriented workflows.

Stars: ✭ 192 (+368.29%)

Mutual labels: data-science, distributed

Automated Deep Learning without ANY human intervention. 1st solution for AutoDL.

Stars: ✭ 854 (+1982.93%)

Mutual labels: data-science, big-data

Workflows and interfaces for neuroimaging packages

Stars: ✭ 557 (+1258.54%)

Mutual labels: data-science, big-data

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+53675.61%)

Mutual labels: data-science, big-data

Hazelcast Python Client

Hazelcast IMDG Python Client

Stars: ✭ 92 (+124.39%)

Mutual labels: big-data, distributed

Hazelcast Cpp Client

Hazelcast IMDG C++ Client

Stars: ✭ 67 (+63.41%)

Mutual labels: big-data, distributed

Datumbox Framework

Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.

Stars: ✭ 1,063 (+2492.68%)

Mutual labels: data-science, big-data

A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.

Stars: ✭ 283 (+590.24%)

Mutual labels: data-science, big-data

A simple Spark-powered ETL framework that just works 🍺

Stars: ✭ 79 (+92.68%)

Mutual labels: data-science, big-data

MLBox is a powerful Automated Machine Learning Python library.

Stars: ✭ 1,199 (+2824.39%)

Mutual labels: data-science, distributed

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

Stars: ✭ 10,698 (+25992.68%)

Mutual labels: data-science, distributed

My Journey In The Data Science World

📢 Ready to learn or review your knowledge!

Stars: ✭ 1,175 (+2765.85%)

Mutual labels: data-science, big-data

repo for code published on pythondata.com

Stars: ✭ 113 (+175.61%)

Mutual labels: data-science, big-data

Spark R Notebooks

R on Apache Spark (SparkR) tutorials for Big Data analysis and Machine Learning as IPython / Jupyter notebooks

Stars: ✭ 109 (+165.85%)

Mutual labels: data-science, big-data

The Accelerator is a tool for fast and reproducible processing of large amounts of data.

Stars: ✭ 137 (+234.15%)

Mutual labels: data-science, big-data

CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.

Stars: ✭ 3,254 (+7836.59%)

Mutual labels: big-data, distributed

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Stars: ✭ 18,547 (+45136.59%)

Mutual labels: data-science, distributed

Koalas: pandas API on Apache Spark

Stars: ✭ 3,044 (+7324.39%)

Mutual labels: data-science, big-data

A distributed in-memory key-value storage for billions of small objects.

Stars: ✭ 25 (-39.02%)

Mutual labels: big-data, distributed

Data Science Live Book

An open source book to learn data science, data analysis and machine learning, suitable for all ages!

Stars: ✭ 193 (+370.73%)

Mutual labels: data-science, big-data

A Clojure dataframe library that runs on Spark

Stars: ✭ 152 (+270.73%)

Mutual labels: data-science, big-data

Open-source distributed computation and storage platform

Stars: ✭ 4,662 (+11270.73%)

Mutual labels: big-data, distributed

JavaScript full-stack framework for Big Data visualisation and analysis

Stars: ✭ 26 (-36.59%)

Mutual labels: data-science, big-data

Dataflowjavasdk

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

Stars: ✭ 854 (+1982.93%)

Mutual labels: data-science, big-data

Open Solution Value Prediction

Open solution to the Santander Value Prediction Challenge 🐠

Stars: ✭ 34 (-17.07%)

Mutual labels: data-science

🚀 A Google Chrome / Firefox extension that blocks NSFW images from the web pages that you load using TensorFlow JS.

Stars: ✭ 984 (+2300%)

Mutual labels: data-science

A blockchain-based distributed identity solution that complies with the W3C DID and Verifiable Credential specifications

Stars: ✭ 972 (+2270.73%)

Mutual labels: distributed

Python Training

Python training for business analysts and traders

Stars: ✭ 972 (+2270.73%)

Mutual labels: data-science

Data Polygamy is a topology-based framework that allows users to query for statistically significant relationships between spatio-temporal data sets.

Stars: ✭ 39 (-4.88%)

Mutual labels: data-science

Esper instance for TV news analysis

Stars: ✭ 37 (-9.76%)

Mutual labels: big-data

(deprecated) A fast and memory-efficient Python data engineering framework for machine learning.

Stars: ✭ 33 (-19.51%)

Mutual labels: data-science

Art Data Science

The Art of Data Science

Stars: ✭ 32 (-21.95%)

Mutual labels: data-science

R Package: Parallel Distance Matrix Computation using Multiple Threads

Stars: ✭ 37 (-9.76%)

Mutual labels: data-science

Mljar Supervised

Automated Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning 🚀

Stars: ✭ 961 (+2243.9%)

Mutual labels: data-science
