shreyashankar / Datasets For Good
List of datasets to apply stats/machine learning/technology to the world of social good.
Datasets for Social Good Projects
I was inspired to create this after taking many project-based CS and AI classes at Stanford, where I would spend more time finding data for a problem I actually cared about than writing the baseline algorithm.
The list is divided by sector, and each link has a (D), (T), or (C) next to it. (D) represents a dataset; (T) represents a tutorial; (C) represents an online challenge you can download data from and contribute knowledge to.
I am sure there are many great datasets I have missed. If you have datasets to add, please create a pull request!
Health
- Lung Cancer Early Detection Challenge (C)
- Predicting Blood Donations (D)
- Modeling Women's Health Care Decisions (C)
- New York Health Data Portal (D)
- Medicaid Adult Health: Diabetes Information (D)
- US Health Data Portal (D)
- State Medicaid Data (D)
- Youth Tobacco Legislation Data (D)
- US Chronic Disease Indicators (D)
- Broad Institute Cancer Programs Datasets (D)
- Medicare Data (D)
- Mental Health in Tech (C)
- UCI Student Alcohol Consumption Dataset (D)
- NIH Chest X-Ray Dataset (D)
- California Kindergarten Vaccinations (D)
- Classifying Breast Cancer Tumors (T)
Education
- Third Grade Reading Scores for San Mateo County (D)
- Wall Street Journal: Where it Pays to Attend College (D)
- Popular Online edX Courses from Harvard and MIT (D)
- World Bank Education Status Indicators (D)
- Cost of Higher Education in the US (D)
- Brazilian High School National Exam Scores (D)
- Indian Primary and Secondary Education Data (D)
- Visualize the State of Public Education in Colorado (C)
- National Student Loan Data System (D)
- 2010 Federal STEM Education Inventory Dataset (D)
- National School Lunch Assistance Program Data (D)
Environment
- Predicting Faulty Water Pumps in Tanzania (D)
- Air Quality and Pollution (D)
- Lead Testing in School Drinking Water (D)
- US Climate Data (D)
- Commercial Building Energy Dataset (D)
- ETH Zurich Electricity Consumption and Occupancy Dataset (D)
- US Energy Information Administration Electric Power and Fossil Fuel Data (D)
- UN Greenhouse Gas Inventory Data (D)
- UN World Meteorological Organization Standard Normals (D)
Government
- Predicting US Presidential Election Outcomes (T)
- New York City Open Data (D)
- San Francisco Open Data (D)
- Austin Open Data (D)
- Seattle Open Data (D)
- Los Angeles Open Data (D)
- Denver Open Data (D)
- Bureau of Labor Statistics Employment Data (D)
- U.S. Census Bureau’s Small Area Income and Poverty Estimates (D)
- CIA World Factbook (D)
- USDA Food and Nutrition Service: SNAP Vendor Data (D)
- US Open Gov (D)
- American FactFinder (D)
Public Good
- City of Chicago Crime Data (D)
- US Traffic Data (D)
- East Palo Alto Homelessness Data (D)
- Global Terrorism Database (C)
- World Bank World Development Indicators (D)
- Fake News Dataset (D)
- Credit Card Fraud Detection (D)
- Crime in India Dataset (D)
- Fatal Police Shootings in the US (D)
- Crimes Committed in France (D)
- Homelessness in the USA (D)
- Modeling Bias in Age, Race, and Gender (T)
- Classifying Anti-Refugee Tweets (T)
Other Good Lists of Datasets
- https://www.datasciencecentral.com/profiles/blogs/great-github-list-of-public-data-sets
- https://ibmhadoop.devpost.com/details/data
- http://kevinchai.net/datasets
- https://www.kaggle.com/datasets
- http://archive.ics.uci.edu/ml/datasets.html?sort=nameUp&view=list
- https://github.com/rafalab/dslabs/tree/master/data