leovan / data-science-introduction-with-python

Licence: other
Data Science Introduction with Python



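Introduction

  1. This project is an introductory data science course that uses Python as the analysis language.
  2. Hosted site: https://ds-python.leovan.tech
  3. Repository directory structure:
    • base directory: configuration files for the slides
    • docs directory: other materials
    • Other top-level directories:
    • Second-level directories:
      • *.pdf: slides for the session
      • data: data files required for the session
      • slide: source code of the slides for the session
  4. This project is licensed under CC BY-NC-SA 4.0.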

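Preparation

  1. Operating system: Windows 10+ (x64), macOS 10.12+, or Ubuntu 16.04+
  2. Python: latest version of Anaconda Python 3 (download link)
  3. PyCharm: latest version (download link; Python IDE)
  4. Spyder: latest version (download link; Python IDE, included with Anaconda)
  5. Visual Studio Code: latest version (download link; for browsing and editing code)
  6. Typora: latest version (download link; for viewing Markdown)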

Reference

  1. 《Python 编程从入门到实践》 (Python Crash Course: A Hands-On, Project-Based Introduction to Programming), by Eric Matthes, translated by 袁国忠
  2. 《流畅的 Python》 (Fluent Python), by Luciano Ramalho, translated by 安道 and 吴珂
  3. 《利用 Python 进行数据分析》 (Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython), by Wes McKinney, translated by 徐敬一
  4. 《机器学习实战》 (Machine Learning in Action), by Peter Harrington, translated by 李锐, 李鹏, 曲亚东, and 王斌
  5. 《Python 机器学习》 (Python Machine Learning), by Sebastian Raschka & Vahid Mirjalili, translated by 陈斌
  6. 《统计学习方法》 (Statistical Learning Methods), by Li Hang (李航)
  7. 《机器学习》 (Machine Learning), by Zhou Zhihua (周志华)
  8. 《深度学习》 (Deep Learning), by Ian Goodfellow, Yoshua Bengio & Aaron Courville, translated by 赵申剑, 黎彧君, 符天凡, and 李凯

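Data Science Introduction

  1. Data science concepts
    • Data science
    • Data products
    • Crossing disciplines
  2. The data science toolbox
    • Common data science tools
    • The data science battle: Python vs. R
    • Which language to choose
  3. Roles and workflow in data science
    • Division of roles in data science
    • The data analysis and mining workflow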

Python Introduction

  1. Python environment setup
  2. Python basic syntax
  3. Python data structures
  4. Python coding style guidelines

Data Analytics Introduction - Part 1

  1. Introduction to NumPy
  2. NumPy multidimensional array objects
  3. Data-oriented programming with NumPy

Data Analytics Introduction - Part 2

  1. Introduction to pandas
  2. Loading and storing data with pandas
  3. Data wrangling with pandas

Data Visualization

  1. Data visualization
  2. Matplotlib & Seaborn
  3. plotnine
  4. Web-based plotting libraries

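Statistical Analytics Introduction

  1. Exploratory analysis
    • Descriptive statistics
    • Common distributions
  2. Experimental design
    • Hypothesis testing concepts
    • Common hypothesis tests
  3. Linear regression
    • Simple linear regression
    • Multiple linear regression
    • Generalized linear regression
    • Least squares and gradient descent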

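Feature Engineering

  1. Data preprocessing
    • Data cleaning
    • Handling missing values, duplicates, and outliers
    • Data sampling and dataset splitting
  2. Feature transformation and encoding
    • Feature scaling
    • Binning
    • Categorical feature encoding
  3. Feature extraction, selection, and monitoring
    • Feature extraction
    • Feature selection
    • Feature monitoring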

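Model Evaluation & Hyperparameter Optimization

  1. Model performance evaluation
    • Regression problems
    • Classification problems
    • Clustering problems
  2. Model generation and selection
    • Overfitting
    • Evaluation methods
    • Bias and variance
  3. Hyperparameter optimization
    • Search algorithms
    • Evolutionary and swarm algorithms
    • Bayesian optimization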

Classification Algorithms - Part 1

  1. Logistic regression
  2. Decision trees

Classification Algorithms - Part 2

  1. Bagging
  2. Boosting
  3. Stacking

Time Series Algorithms

  1. Time series
  2. ARIMA models
  3. Seasonality analysis
  4. Prophet

Clustering Algorithms

  1. K-means
  2. Hierarchical clustering
  3. Density-based clustering

Reproducible Research

  1. Reproducible research
  2. Markdown
  3. reStructuredText & Sphinx
  4. Jupyter
  5. Version control
  6. Other tools

Deep Learning Algorithms

  1. Artificial neural networks
  2. Convolutional neural networks
  3. Recurrent neural networks
  4. Deep learning frameworks