airalcorn2 / Michael's Guide to Becoming a Data Scientist

I was once asked about transitioning to a career in data science by three different UChicago grad students over a short period of time, so I decided to put together this outline in case anyone else was curious.

Labels

data-science

Michael's Guide to Becoming a Data Scientist

Michael's Guide to Becoming a Data Scientist by Michael A. Alcorn is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Table of Contents

- My CV
- General Information
- Get Experience!
- Curriculum
- Programming
- Databases
- Big Data Tools

Guide

My CV
General Information
- 8 Skills You Need to be a Data Scientist
- What's the difference between a data architect, data analyst, data engineer, and data scientist?
  - A "data analyst" role will probably be less exciting than a "data scientist" role for those with a scientific background.
- Advice from a Data Scientist at Quora
- /r/MachineLearning
Get Experience!
- Intern - this is the best possible thing you can do.
- Try out Kaggle competitions.
- Create a LinkedIn account and keep it updated.
Curriculum
- Free Courses - use them
  - Coursera, edX, Udacity, Saylor, Khan Academy
  - You can use my course history as a guide.
- Math
  - Calculus (at least up to partial derivatives, which is typically Calculus III)
  - Linear Algebra
  - Analysis (advanced)
- Statistics - know Bayesian and frequentist theory
- Algorithms
- Machine Learning - know the big algorithms; natural language processing is probably the most useful subfield to learn
- Other Topics - graphs, game theory, information theory, etc.
Programming
- Must know Python. Almost all data scientist positions require cleansing and transforming data on a large scale, and Python is typically the language of choice for this task.
- Important Python packages/libraries: scikit-learn, NumPy, Keras, TensorFlow, Theano, SciPy, pandas, statsmodels
- Must know R.
- Should know your way around a *nix terminal.
- Version control - should know basics of Git.
- Put personal projects on GitHub.
- Contribute to open source projects.
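
As a rough illustration of the "cleansing and transforming" work mentioned in the Programming list, here is a minimal sketch using only the Python standard library. The input data, column names, and the `clean_rows` helper are all made up for this example; in practice a library such as pandas would usually handle this at scale.

```python
import csv
import io

# Hypothetical messy input; the columns and values are invented for illustration.
RAW = """name, age ,city
 Alice ,34,Chicago
BOB,,chicago
carol,29, New York
"""


def clean_rows(text):
    """Strip whitespace, normalize capitalization, and convert ages to int."""
    reader = csv.DictReader(io.StringIO(text))
    # Header names may carry stray spaces, so normalize them first.
    reader.fieldnames = [name.strip() for name in reader.fieldnames]
    cleaned = []
    for row in reader:
        age = row["age"].strip()
        cleaned.append({
            "name": row["name"].strip().title(),
            # A missing age becomes None instead of an empty string.
            "age": int(age) if age else None,
            "city": row["city"].strip().title(),
        })
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

The same steps (trim, normalize, convert types, mark missing values) are what you would typically repeat on much larger datasets.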
Databases - definitely know SQL, should probably look into NoSQL databases as well (e.g., MongoDB)
- The best way to learn databases is by working with them. Find a database and practice writing queries for it.
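
One low-friction way to practice writing queries is Python's built-in `sqlite3` module, which needs no server. The sketch below is only an example: the `purchases` table and its rows are hypothetical, and any real database you can get access to would serve just as well.

```python
import sqlite3

# An in-memory database is enough for practice; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this example.
conn.execute("CREATE TABLE purchases (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("ann", 12.5), ("bob", 7.0), ("ann", 30.0)],
)

# Total spend per customer, largest first.
totals = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM purchases
    GROUP BY customer
    ORDER BY total DESC
    """
).fetchall()
conn.close()

print(totals)
```

From here, try adding a second table and practicing joins, subqueries, and window functions against it.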
Big Data Tools
- Be familiar with the following: Apache Hadoop, MapReduce, Apache Spark, Apache Pig, Apache Hive, Apache Mahout, Apache Solr, Apache Lucene
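
To get a feel for the MapReduce model before touching a cluster, it can help to walk through the three phases in plain Python. This is a single-machine sketch of the idea, not Hadoop's API; the function names and sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key to get the final word counts."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical input documents.
documents = ["big data tools", "big data big ideas"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts)
```

In a real framework the map and reduce steps run in parallel across many machines, and the shuffle moves data between them over the network.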
