Top 1446 data-science open source projects

Awesome Datascience
📝 An awesome Data Science repository to learn from and apply to real-world problems.
Ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Koalas
Koalas: pandas API on Apache Spark
Deep Learning Book
Repository for "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python"
Dash.jl
Dash for Julia - A Julia interface to the Dash ecosystem for creating analytic web applications in Julia. No JavaScript required.
Roger
Golang RServe client. Use R from Go
Tweetfeels
Real-time sentiment analysis in Python using Twitter's streaming API
Cjworkbench
The data journalism platform with built-in training
Retriever
Quickly download, clean up, and install public datasets into a database management system
Opends4all
OpenDS4All project, hosted by LF AI & Data
Pyglmnet
Python implementation of elastic-net regularized generalized linear models
Ntm One Shot Tf
One Shot Learning using Memory-Augmented Neural Networks (MANN) based on Neural Turing Machine architecture in Tensorflow
Data Mining Conferences
Ranking, acceptance rate, deadline, and publication tips
Ploomber
A convention over configuration workflow orchestrator. Develop locally (Jupyter or your favorite editor), deploy to Airflow or Kubernetes.
Pysparkling
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Webstruct
NER toolkit for HTML data
Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
R4ds Exercise Solutions
Exercise solutions to "R for Data Science"
Elastic
R client for the Elasticsearch HTTP API
Gspread Pandas
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Automlpipeline.jl
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Datascienceprojects
The code repository for projects and tutorials in R and Python that cover a variety of topics in data visualization, statistics, sports analytics, and general applications of probability theory.
Statistical Learning
Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Ml Workspace
Machine Learning (Beginners Hub): courses, books, cheat sheets, and live sessions on machine learning, data science, and Python.
Amazing Feature Engineering
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Cardio
CardIO is a library for data science research of heart signals
Chord
Python package for creating beautiful interactive Chord Diagrams. Pro version available at https://m8.fyi/chord
Reddit Hyped Stocks
A web application to explore currently hyped stocks on Reddit
Cartoframes
CARTO Python package for data scientists
Covid19za
Coronavirus COVID-19 (2019-nCoV) Data Repository and Dashboard for South Africa
Eli5
A library for debugging/inspecting machine learning classifiers and explaining their predictions
Compose
A machine learning tool for automated prediction engineering. It allows you to easily structure prediction problems and generate labels for supervised learning.
Scihub
Source code and data analyses for the Sci-Hub Coverage Study
Tsfel
An intuitive library to extract features from time series