chrislicodes / Udacity-Data-Analyst-Nanodegree

Licence: other

Repository for the projects needed to complete the Data Analyst Nanodegree.

Programming Languages

Jupyter Notebook

11667 projects

75241 projects

Labels

api data text-mining udacity statistics numpy pandas data-visualization seaborn dataset data-analytics data-analysis matplotlib data-wrangling tweepy data-gathering web-crawling data-cleaning data-analyst-nanodegree

Projects that are alternatives of or similar to Udacity-Data-Analyst-Nanodegree

data-analysis-using-python

Data Analysis Using Python: A Beginner’s Guide Featuring NYC Open Data

Stars: ✭ 81 (+161.29%)

Mutual labels: numpy, pandas, seaborn, data-analytics, data-analysis, matplotlib

Data-Analyst-Nanodegree

Kai Sheng Teh - Udacity Data Analyst Nanodegree

Stars: ✭ 42 (+35.48%)

Mutual labels: udacity, numpy, pandas, data-analysis, data-wrangling, data-analyst-nanodegree

The-Data-Visualization-Workshop

A New, Interactive Approach to Learning Data Visualization

Stars: ✭ 59 (+90.32%)

Mutual labels: numpy, pandas, seaborn, matplotlib, data-wrangling

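An AI learning roadmap that collects nearly 200 hands-on cases and projects, with free accompanying course material, for beginners starting from zero and aiming at job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch tensorflow machine-learning,deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv, and more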

Stars: ✭ 4,387 (+14051.61%)

Mutual labels: numpy, pandas, seaborn, data-analysis, matplotlib

Open Machine Learning Course

Stars: ✭ 7,963 (+25587.1%)

Mutual labels: numpy, pandas, seaborn, data-analysis, matplotlib

Mainly a summary of web-crawling and data analysis projects, plus modeling, machine learning, and model evaluation.

Stars: ✭ 142 (+358.06%)

Mutual labels: numpy, pandas, data-analysis, matplotlib

Data-Wrangling-with-Python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Stars: ✭ 90 (+190.32%)

Mutual labels: numpy, pandas, data-analytics, data-wrangling

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Stars: ✭ 967 (+3019.35%)

Mutual labels: pandas, data-analysis, data-wrangling, data-cleaning

Data ETL & Analysis on the global and Mexican datasets of the COVID-19 pandemic.

Stars: ✭ 14 (-54.84%)

Mutual labels: numpy, pandas, seaborn, matplotlib

datascienv is a package that helps you set up your environment in a single line of code with all dependencies. It also includes pyforest, which provides a single-line import of all required ML libraries

Stars: ✭ 53 (+70.97%)

Mutual labels: numpy, pandas, seaborn, matplotlib

Exploratory Data Analysis Visualization Python

Data analysis and visualization with the PyData ecosystem: Pandas, Matplotlib, NumPy, and Seaborn

Stars: ✭ 78 (+151.61%)

Mutual labels: numpy, pandas, seaborn, matplotlib

Seaborn Tutorial

This repository is my attempt to help Data Science aspirants gain necessary Data Visualization skills required to progress in their career. It includes all the types of plot offered by Seaborn, applied on random datasets.

Stars: ✭ 114 (+267.74%)

Mutual labels: numpy, pandas, data-analysis

Data Science For Marketing Analytics

Achieve your marketing goals with the data analytics power of Python

Stars: ✭ 127 (+309.68%)

Mutual labels: numpy, pandas, matplotlib

Machine Learning Projects

This repository consists of all my Machine Learning Projects.

Stars: ✭ 135 (+335.48%)

Mutual labels: numpy, pandas, matplotlib

Stock Market Analysis And Prediction

Stock Market Analysis and Prediction is the project on technical analysis, visualization and prediction using data provided by Google Finance.

Stars: ✭ 112 (+261.29%)

Mutual labels: numpy, pandas, matplotlib

A constantly updated python machine learning cheatsheet

Stars: ✭ 136 (+338.71%)

Mutual labels: numpy, pandas, matplotlib

Opendatawrangling

Public data analysis

Stars: ✭ 148 (+377.42%)

Mutual labels: numpy, pandas, matplotlib

Data Science Types

Mypy stubs, i.e., type information, for numpy, pandas and matplotlib

Stars: ✭ 180 (+480.65%)

Mutual labels: numpy, pandas, matplotlib

Data Science Notebook

📖 Every great idea and every great action has a humble beginning

Stars: ✭ 196 (+532.26%)

Mutual labels: numpy, pandas, data-analysis

100 Pandas Puzzles

100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

Stars: ✭ 1,382 (+4358.06%)

Mutual labels: numpy, pandas, data-analysis

Udacity Data Analyst Nanodegree

Discover insights from data via Python and SQL.

Skills Acquired (Summary)

Prerequisites

You'll need to install:

And additional libraries defined in each project.

Recommended:

Anaconda

Project Overview

P0: Explore Weather Trends

The first chapter was an introduction to the following projects of the Data Analyst Nanodegree.

The project for the first chapter was about weather trends. It required applying (at least) the following steps:

Extract data from a database using a SQL query
Calculate a moving average
Create a line chart

I analyzed local and global temperature data and compared the temperature trends in three German cities to the overall global trend. After cleaning the data, I wrote a function that handles all the tasks needed to plot the data, for example calculating the linear trend and the rolling average. The function also takes various visualization options to produce different graphs.
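The two calculations named above can be sketched in a few lines of pandas and NumPy. This is a minimal illustration, not the function from the project notebook: the column names `year` and `avg_temp`, the helper name `add_trend_columns`, the 10-year window and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def add_trend_columns(df, value_col="avg_temp", window=10):
    """Return a sorted copy of df with rolling-average and linear-trend columns."""
    out = df.sort_values("year").reset_index(drop=True)
    # Moving average over `window` years; the first window-1 rows stay NaN.
    out["rolling_avg"] = out[value_col].rolling(window=window).mean()
    # Least-squares straight line through the raw yearly values.
    slope, intercept = np.polyfit(out["year"], out[value_col], deg=1)
    out["linear_trend"] = slope * out["year"] + intercept
    return out, slope


# Synthetic data for illustration only: a steady rise of 0.01 degrees per year.
years = np.arange(1900, 1950)
demo = pd.DataFrame({"year": years, "avg_temp": 8.0 + 0.01 * (years - 1900)})
result, slope = add_trend_columns(demo, window=10)
```

With matplotlib installed, `result.plot(x="year", y=["avg_temp", "rolling_avg", "linear_trend"])` would draw the three series as a line chart.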

Key findings:

the average global temperature is increasing, and at an accelerating pace
Berlin is the only German city in this dataset with a higher average temperature than the global average

P1: Investigate a Dataset (Gapminder World Dataset)

This chapter was all about the data analysis process as a whole: from gathering, assessing, cleaning and wrangling the data to exploring and visualizing it, along with the programming workflow and the communication of results.

The project therefore included all steps of the typical data analysis process:

posing questions
gathering, wrangling and cleaning data
communicating answers to the questions, assisted by visualizations and statistics

From the project:

This project will examine datasets available at Gapminder. To be more specific, it will take a closer look at the life expectancy of the population of different countries and the influence of other variables. It will also look at how these variables develop over time.

What is Gapminder? "Gapminder is an independent Swedish foundation with no political, religious or economic affiliations. Gapminder is a fact tank, not a think tank. Gapminder fights devastating misconceptions about global development." (https://www.gapminder.org/about-gapminder/)

Here we were confronted with the full joy of a real-life dataset: a hard-to-analyze structure and missing, messy, dirty data, followed, once the data wrangling was finally done, by the reward of interesting insights.
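As a rough sketch of that kind of wrangling, the snippet below reshapes two indicator tables from a wide layout (one column per year) into a long one, merges them, drops incomplete rows and computes a correlation. The wide layout, the country names, the column names and all values are assumptions made for illustration. They are not taken from the project or from Gapminder.

```python
import pandas as pd

# Hypothetical wide tables: one row per country, one column per year.
life_wide = pd.DataFrame(
    {"country": ["A", "B"], "2000": [70.0, 60.0], "2010": [74.0, None]}
)
income_wide = pd.DataFrame(
    {"country": ["A", "B"], "2000": [20000, 5000], "2010": [25000, 6000]}
)


def to_long(wide, value_name):
    """Turn the year columns into rows so each row is one country-year."""
    long = wide.melt(id_vars="country", var_name="year", value_name=value_name)
    long["year"] = long["year"].astype(int)
    return long


# Combine both indicators and drop country-years with missing values.
merged = (
    to_long(life_wide, "life_expectancy")
    .merge(to_long(income_wide, "income"), on=["country", "year"])
    .dropna()
)
correlation = merged["life_expectancy"].corr(merged["income"])
```

A correlation computed this way only describes how the two variables move together. It says nothing about which one influences the other.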

P2: Analyze A/B Test Results

The following chapter was filled with a lot of information. We talked about: Data Types, Notation, Mean, Standard Deviation, Correlation, Data Shapes, Outliers, Bias, Dangers, Probability and Bayes, Distributions, Central Limit Theorem, Bootstrapping, Confidence Intervals, Hypothesis Testing, A/B Tests, Linear Regression, Logistic Regression and more... *heavy breathing*

The goal of the project in this chapter was to get experience with A/B testing, including its difficulties and drawbacks. First, we learned what A/B testing is all about, including metrics like the click-through rate (CTR) and how to analyze them properly. Second, we learned about pitfalls such as the novelty effect and change aversion.

In the end we brought everything we had learned together to analyze this A/B test properly.
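One common way to compare two click-through rates is a two-proportion z-test. The sketch below shows that technique using only the standard library. It is not necessarily the method used in the project notebook, and the function name and the counts are made up for illustration.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference between two click-through rates."""
    ctr_a = clicks_a / views_a
    ctr_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both groups share one CTR.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (ctr_b - ctr_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ctr_a, ctr_b, z, p_value


# Made-up counts for illustration; these are not results from the project.
ctr_a, ctr_b, z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
```

A small p-value alone does not settle the question: effects such as novelty or change aversion can distort the rates during the test period, so the test duration matters as well.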

P3: Gather, Clean and Analyze Twitter Data (WeRateDogs™ (@dog_rates))

This chapter was a deep dive into the data wrangling part of the data analysis process. We learned about the difference between messy and dirty data, what tidy data should look like, about the assessing, defining, cleaning and testing process, etc. Moreover, we talked about many different file types and different methods of gathering data.

In this project we had to deal with the reality of dirty and messy data (again). We gathered data from different sources (for example the Twitter API) and identified tidiness and quality issues in the dataset. Afterwards we had to solve these problems while documenting each step. The end of the project was then focused on the exploration of the data.
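The define, clean and test cycle mentioned above might look like the following in pandas. The table, its column names and its three issues are invented for illustration. They only mimic the kind of problems such a dataset can have.

```python
import pandas as pd

# Hypothetical raw extract with two quality issues and one tidiness issue.
raw = pd.DataFrame({
    "tweet_id": [1, 2, 2, 3],
    "timestamp": [
        "2017-01-01 10:00:00",
        "2017-01-02 11:30:00",
        "2017-01-02 11:30:00",
        "2017-01-03 09:15:00",
    ],
    "rating": ["12/10", "10/10", "10/10", "13/10"],
})

clean = raw.copy()  # work on a copy so the raw data stays untouched

# Define: duplicated rows (quality). Clean: drop them. Test: ids are unique.
clean = clean.drop_duplicates().reset_index(drop=True)
assert clean["tweet_id"].is_unique

# Define: timestamp stored as text (quality). Clean: parse it. Test: check dtype.
clean["timestamp"] = pd.to_datetime(clean["timestamp"])
assert pd.api.types.is_datetime64_any_dtype(clean["timestamp"])

# Define: one column holds two variables (tidiness). Clean: split it in two.
parts = clean["rating"].str.split("/", expand=True).astype(int)
clean["rating_numerator"] = parts[0]
clean["rating_denominator"] = parts[1]
clean = clean.drop(columns="rating")
```

Writing the issue down as a comment before fixing it keeps each step documented, and the assertion right after the fix acts as the test.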

P4: Communicate Data Findings

The final chapter focused on proper visualization of data. We learned about chart junk, uni-, bi- and multivariate visualization, use of color, the data-ink ratio, the lie factor, other encodings, [...].
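Of these concepts, the lie factor is the easiest to put into numbers. As commonly attributed to Edward Tufte, it is the relative change shown in the graphic divided by the relative change in the data, so a value near 1 means the graphic represents the data faithfully. The helper names and the measurements below are assumptions made for the example.

```python
def relative_change(first, second):
    """Size of an effect as a fraction of the starting value."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Relative change shown in the graphic divided by the change in the data."""
    return relative_change(graphic_first, graphic_second) / relative_change(
        data_first, data_second
    )


# Made-up example: bars drawn 2 cm and 6 cm tall for the values 100 and 150.
factor = lie_factor(2.0, 6.0, 100.0, 150.0)
```

Here the bars grow by 200% while the data grows by 50%, so the graphic exaggerates the effect by a factor of 4.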

The task of the final project was to analyze and visualize real-world data. I chose the Ford GoBike dataset.

License

Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Stars: ✭ 31
