Top 1642 data-science open source projects

geometric-smote
Implementation of the Geometric SMOTE over-sampling algorithm.
SyntheticSun
SyntheticSun is a defense-in-depth security automation and monitoring framework which utilizes threat intelligence, machine learning, managed AWS security services, and serverless technologies to continuously prevent, detect and respond to threats.
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
BCG
The BCG Open-Access Data Science & Advanced Analytics Virtual Experience Program
fastML
A Python package built on sklearn for running a series of classification algorithms in a faster and easier way.
data-validator
A tool to validate data built around Apache Spark.
cortana-intelligence-customer360
This repository contains instructions and code to deploy a customer 360 profile solution on Azure stack using the Cortana Intelligence Suite.
introduction-to-python
Notes for the "Introduction to Programming for Data Science" class
Preguntas-Frecuentes-Data-Science-Machine-Learning
All your general questions about data science and machine learning are answered here.
pixiedust-facebook-analysis
A Jupyter notebook that uses the Watson Visual Recognition and Natural Language Understanding services to enrich Facebook Analytics and uses Cognos Dashboard Embedded to explore and visualize the results in Watson Studio
teach-r-online
Materials for the Teaching statistics and data science online workshops in July 2020
ZS-Data-Science-Challenge
A data science challenge, "Mekktronix Sales Forecasting", organised by ZS through the HackerEarth platform. Rank: 223 out of 4743.
policy-data-analyzer
Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
bc-population-indicator
R scripts for an indicator on trends in B.C.'s population size & distribution published on Environmental Reporting BC
zen-do-r
A book about programming for non-programmers.
ISLR-Python
Notes and implementations in Python for ISLR.
tutorials
Git repo for articles on the Ergo Sum blog and the YouTube channel https://www.youtube.com/channel/UCiie9CN--dazA7iT2sry5FA
data vis statistics geosciences
This repository contains the laboratory portion of an upper level undergraduate class in Python on data visualization and statistics for geo & space scientists. Labs are updated when the course is in session through the most recent branch. See master version for current class.
labs-fa17
Lab notebooks for the Fall 2017 offering of Georgia Tech's CSE 6040
Introduction-to-GAN
Introduction to Generative Adversarial Networks
EC-48W-Summer-2019
Boğaziçi University Department of Economics - Repo for Term Projects
skip-thought-gan
Generating Text through Adversarial Training (GAN) using Skip-Thought Vectors
retailhero-recomender-baseline
Baseline for the RetailHero.ai/#2 task by @geffy 💪
opendatasets
A Python library for downloading datasets from Kaggle, Google Drive, and other online sources.
ntds 2016
Material for the EPFL master course "A Network Tour of Data Science", edition 2016.
aduana
Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even when making big crawls (one billion pages).
python-programming-for-data-science
Content from the University of British Columbia's Master of Data Science course DSCI 511.
dcbench
A benchmark of data-centric tasks from across the machine learning lifecycle.
info
All the general information you'll ever need about pursuing AI in Pakistan!
karan36k.github.io
These are all the articles and pages I have in my data science website. I try to transcribe all I learn and post regularly. Please visit and feel free to email me for suggestions.
olliePy
OlliePy is a Python package which can help data scientists in exploring their data and evaluating and analysing their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.