1030 open source projects that are alternatives or similar to Pysparkling

Awesome Ai Infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
Stars: ✭ 223 (-3.46%)
Mutual labels:  apache-spark
Speedml
Speedml is a Python package to speed start machine learning projects.
Stars: ✭ 192 (-16.88%)
Mutual labels:  data-science
Flaml
A fast and lightweight AutoML library.
Stars: ✭ 205 (-11.26%)
Mutual labels:  data-science
Uci Ml Api
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)
Stars: ✭ 190 (-17.75%)
Mutual labels:  data-science
Dash
Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required.
Stars: ✭ 15,592 (+6649.78%)
Mutual labels:  data-science
Klib
Easy to use Python library of customized functions for cleaning and analyzing data.
Stars: ✭ 192 (-16.88%)
Mutual labels:  data-science
Compose
A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.
Stars: ✭ 203 (-12.12%)
Mutual labels:  data-science
Delbot
It understands your voice commands, searches news and knowledge sources, and summarizes and reads out content to you.
Stars: ✭ 191 (-17.32%)
Mutual labels:  data-science
Statistical Learning
Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Stars: ✭ 223 (-3.46%)
Mutual labels:  data-science
Observations
Tools for loading standard data sets in machine learning
Stars: ✭ 190 (-17.75%)
Mutual labels:  data-science
Tsfel
An intuitive library to extract features from time series
Stars: ✭ 202 (-12.55%)
Mutual labels:  data-science
Vec4ir
Word Embeddings for Information Retrieval
Stars: ✭ 188 (-18.61%)
Mutual labels:  data-science
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (-1.73%)
Mutual labels:  data-science
Pytorch Lightning
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
Stars: ✭ 16,641 (+7103.9%)
Mutual labels:  data-science
Laurae
Advanced High Performance Data Science Toolbox for R by Laurae
Stars: ✭ 203 (-12.12%)
Mutual labels:  data-science
Dtale
Visualizer for pandas data structures
Stars: ✭ 2,864 (+1139.83%)
Mutual labels:  data-science
Jupyterlab templates
Support for jupyter notebook templates in jupyterlab
Stars: ✭ 223 (-3.46%)
Mutual labels:  data-science
Dataaspirant codes
Complete machine learning model codes
Stars: ✭ 185 (-19.91%)
Mutual labels:  data-science
Mmlspark
Simple and Distributed Machine Learning
Stars: ✭ 2,899 (+1154.98%)
Mutual labels:  apache-spark
Vaspy
Manipulating VASP files with Python.
Stars: ✭ 185 (-19.91%)
Mutual labels:  data-processing
Elastic
R client for the Elasticsearch HTTP API
Stars: ✭ 227 (-1.73%)
Mutual labels:  data-science
Anndata
Annotated data.
Stars: ✭ 171 (-25.97%)
Mutual labels:  data-science
Instascrape
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Stars: ✭ 202 (-12.55%)
Mutual labels:  data-science
Texar
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Stars: ✭ 2,236 (+867.97%)
Mutual labels:  data-processing
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Stars: ✭ 218 (-5.63%)
Mutual labels:  data-science
Awesome Computer Science Opportunities
An awesome list of events and fellowship opportunities for Computer Science students
Stars: ✭ 2,445 (+958.44%)
Mutual labels:  data-science
Fastpages
An easy to use blogging platform, with enhanced support for Jupyter Notebooks.
Stars: ✭ 2,888 (+1150.22%)
Mutual labels:  data-science
Lets Plot Kotlin
Kotlin API for Lets-Plot - an open-source plotting library for statistical data.
Stars: ✭ 181 (-21.65%)
Mutual labels:  data-science
Webstruct
NER toolkit for HTML data
Stars: ✭ 230 (-0.43%)
Mutual labels:  data-science
Ml Glossary
Machine learning glossary
Stars: ✭ 2,338 (+912.12%)
Mutual labels:  data-science
Achoo
Achoo uses a Raspberry Pi to predict if my son will need his inhaler on any given day using weather, pollen, and air quality data. If the prediction for a given day is above a specified threshold, the Pi will email his school nurse and me, warning that he may need preemptive treatment. Community-sourced health monitoring!
Stars: ✭ 200 (-13.42%)
Mutual labels:  data-science
Docker Galaxy Stable
🐳📊📚 Docker Images tracking the stable Galaxy releases.
Stars: ✭ 179 (-22.51%)
Mutual labels:  data-science
Quinn
pyspark methods to enhance developer productivity 📣 👯 🎉
Stars: ✭ 217 (-6.06%)
Mutual labels:  apache-spark
Deep Rules
Ten Quick Tips for Deep Learning in Biology
Stars: ✭ 179 (-22.51%)
Mutual labels:  data-science
Ml Auto Baseball Pitching Overlay
⚾🤖⚾ Automatic baseball pitching overlay in realtime
Stars: ✭ 200 (-13.42%)
Mutual labels:  data-science
Soda Sql
Metric collection, data testing and monitoring for SQL accessible data
Stars: ✭ 173 (-25.11%)
Mutual labels:  data-science
Full Stack Data Science
Full Stack Data Science in Python
Stars: ✭ 227 (-1.73%)
Mutual labels:  data-science
Bigdata Playground
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-23.38%)
Mutual labels:  apache-spark
Pytorch Geometric Yoochoose
This is a tutorial for PyTorch Geometric on the YooChoose dataset
Stars: ✭ 198 (-14.29%)
Mutual labels:  data-science
Book list
Python, Machine Learning, Deep Learning and Data Science Books
Stars: ✭ 176 (-23.81%)
Mutual labels:  data-science
Chord
Python package for creating beautiful interactive Chord Diagrams. Pro version available at https://m8.fyi/chord
Stars: ✭ 217 (-6.06%)
Mutual labels:  data-science
Web Database Analytics
Web scraping and related analytics using Python tools
Stars: ✭ 175 (-24.24%)
Mutual labels:  data-science
Analytics Zoo
Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
Stars: ✭ 2,448 (+959.74%)
Mutual labels:  apache-spark
Datasets For Good
List of datasets to apply stats/machine learning/technology to the world of social good.
Stars: ✭ 174 (-24.68%)
Mutual labels:  data-science
Functional intro to python
[tutorial] A functional, data-science-focused introduction to Python
Stars: ✭ 228 (-1.3%)
Mutual labels:  data-science
Deep Spying
Spying using Smartwatch and Deep Learning
Stars: ✭ 172 (-25.54%)
Mutual labels:  data-science
Climate Change Data
🌍 A curated list of APIs, open data and ML/AI projects on climate change
Stars: ✭ 195 (-15.58%)
Mutual labels:  data-science
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-6.93%)
Mutual labels:  apache-spark
100 Days Of Ml Code
A day-to-day plan for this challenge. Covers both theoretical and practical aspects
Stars: ✭ 172 (-25.54%)
Mutual labels:  data-science
Tad
A desktop application for viewing and analyzing tabular data
Stars: ✭ 2,275 (+884.85%)
Mutual labels:  data-science
Data Science Resources
👨🏽‍🏫You can learn about what data science is and why it's important in today's modern world. Are you interested in data science?🔋
Stars: ✭ 171 (-25.97%)
Mutual labels:  data-science
Jaxnet
Concise deep learning for JAX
Stars: ✭ 171 (-25.97%)
Mutual labels:  data-science
Data Science Notebook
📖 Every great idea and every great action has a humble beginning
Stars: ✭ 196 (-15.15%)
Mutual labels:  data-science
Covid19 Severity Prediction
Extensive and accessible COVID-19 data + forecasting for counties and hospitals. 📈
Stars: ✭ 170 (-26.41%)
Mutual labels:  data-science
Data Science Toolkit
Collection of stats, modeling, and data science tools in Python and R.
Stars: ✭ 169 (-26.84%)
Mutual labels:  data-science
Automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Stars: ✭ 223 (-3.46%)
Mutual labels:  data-science
Reddit Hyped Stocks
A web application to explore currently hyped stocks on Reddit
Stars: ✭ 173 (-25.11%)
Mutual labels:  data-science
Imodels
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
Stars: ✭ 194 (-16.02%)
Mutual labels:  data-science
Matplotplusplus
Matplot++: A C++ Graphics Library for Data Visualization 📊🗾
Stars: ✭ 2,433 (+953.25%)
Mutual labels:  data-science
Airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
Stars: ✭ 4,919 (+2029.44%)
Mutual labels:  data-science