rnorm / Book_sample

another book on data science

Programming Languages

python
139335 projects - #7 most used programming language
r
7636 projects

Projects that are alternatives of or similar to Book sample

Fastbook
The fastai book, published as Jupyter Notebooks
Stars: ✭ 13,998 (+2191%)
Mutual labels:  data-science, book
Machine Learning From Scratch
Succinct Machine Learning algorithm implementations from scratch in Python, solving real-world problems (Notebooks and Book). Examples of Logistic Regression, Linear Regression, Decision Trees, K-means clustering, Sentiment Analysis, Recommender Systems, Neural Networks and Reinforcement Learning.
Stars: ✭ 42 (-93.13%)
Mutual labels:  data-science, book
D2l Pytorch
This project reproduces the book Dive Into Deep Learning (https://d2l.ai/), adapting the code from MXNet into PyTorch.
Stars: ✭ 3,810 (+523.57%)
Mutual labels:  data-science, book
D2l En
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 300 universities from 55 countries including Stanford, MIT, Harvard, and Cambridge.
Stars: ✭ 11,837 (+1837.32%)
Mutual labels:  data-science, book
R4ds
R for data science: a book
Stars: ✭ 3,231 (+428.81%)
Mutual labels:  data-science, book
Data Science At The Command Line
Data Science at the Command Line
Stars: ✭ 3,174 (+419.48%)
Mutual labels:  data-science, book
D2l Vn
An interactive book on deep learning with source code, math, and discussions. It covers several popular frameworks (TensorFlow, PyTorch & MXNet) and is used at 175 universities.
Stars: ✭ 402 (-34.21%)
Mutual labels:  data-science, book
Data Analysis And Machine Learning Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
Stars: ✭ 5,166 (+745.5%)
Mutual labels:  data-science
Pl Compiler Resource
Resources on programming languages and compiler technology (continuously updated)
Stars: ✭ 578 (-5.4%)
Mutual labels:  book
Data Science Portfolio
Portfolio of data science projects completed by me for academic, self learning, and hobby purposes.
Stars: ✭ 559 (-8.51%)
Mutual labels:  data-science
Nipype
Workflows and interfaces for neuroimaging packages
Stars: ✭ 557 (-8.84%)
Mutual labels:  data-science
Alphapy
Automated Machine Learning [AutoML] with Python, scikit-learn, Keras, XGBoost, LightGBM, and CatBoost
Stars: ✭ 564 (-7.69%)
Mutual labels:  data-science
Vehicle counting tensorflow
🚘 "MORE THAN VEHICLE COUNTING!" This project provides prediction for speed, color and size of the vehicles with TensorFlow Object Counting API.
Stars: ✭ 582 (-4.75%)
Mutual labels:  data-science
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+768.25%)
Mutual labels:  data-science
Pdpipe
Easy pipelines for pandas DataFrames.
Stars: ✭ 590 (-3.44%)
Mutual labels:  data-science
Classiccomputerscienceproblemsinpython
Source Code for the Book Classic Computer Science Problems in Python
Stars: ✭ 558 (-8.67%)
Mutual labels:  book
Smile
Statistical Machine Intelligence & Learning Engine
Stars: ✭ 5,412 (+785.76%)
Mutual labels:  data-science
Getting Started With Genomics Tools And Resources
Unix, R and python tools for genomics and data science
Stars: ✭ 587 (-3.93%)
Mutual labels:  data-science
Baikal
A graph-based functional API for building complex scikit-learn pipelines.
Stars: ✭ 573 (-6.22%)
Mutual labels:  data-science
Koodo Reader
A modern ebook manager and reader with sync and backup capacities for Windows, macOS, Linux and Web
Stars: ✭ 2,938 (+380.85%)
Mutual labels:  book

Another Book on Data Science - learn R and Python in parallel

This is a short book about data science and quantitative analysis, using R and Python together. The current version of the book has seven chapters.

Chapters

1. Introduction to R/Python Programming

R and Python are two of the most popular programming languages used in data science. In this chapter, we go through the basics of R and Python programming in parallel, with examples. Specifically, the topics of this chapter include variables, types, functions, control flow, data structures, and object-oriented programming. The chapter is designed for both beginner and intermediate readers.

2. More on R/Python Programming

Following the first chapter, this chapter covers debugging, vectorization, parallelism, working with C++ in R/Python, and functional programming. These topics are chosen to help the audience become familiar with some intermediate and advanced aspects of R/Python programming, and mastering them will noticeably improve coding skills. As in the first chapter, I try to emphasize the differences between R and Python with coding examples.
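As one small illustration of the functional programming topic listed here, the following is a minimal Python sketch using only the standard library. It is not taken from the book; the function name `sum_of_even_squares` and the sample data are made up for illustration.

```python
from functools import reduce


def sum_of_even_squares(numbers):
    """Sum the squares of the even numbers using filter, map, and reduce."""
    evens = filter(lambda n: n % 2 == 0, numbers)      # keep even values only
    squares = map(lambda n: n * n, evens)              # square each kept value
    return reduce(lambda acc, s: acc + s, squares, 0)  # fold into a single sum


if __name__ == "__main__":
    print(sum_of_even_squares([1, 2, 3, 4, 5, 6]))  # 4 + 16 + 36 = 56
```

The same pipeline could be written as a generator expression inside `sum`; the explicit `filter`/`map`/`reduce` form is used here only because it mirrors the functional style more directly.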

3. data.table and pandas

In the first two chapters, we focus on general-purpose programming techniques. In this chapter, we introduce the very basics of data science, i.e., data manipulation. For readers with little experience in data science, we start with a brief introduction to SQL. The major part of this chapter focuses on two widely used data frame packages, i.e., data.table in R and pandas in Python. Side-by-side examples using the two packages not only enable the audience to learn the basic usage of these tools but can also serve as a quick reference manual.
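To give a flavor of the SQL starting point mentioned in this chapter summary, here is a minimal sketch that runs a grouped aggregation against an in-memory SQLite database from Python. It is not the book's own example; the table name `sales`, its columns, and the helper `total_sales_by_region` are hypothetical.

```python
import sqlite3


def total_sales_by_region(rows):
    """Load (region, amount) rows into an in-memory table and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    data = [("east", 10.0), ("west", 5.0), ("east", 2.5)]
    print(total_sales_by_region(data))  # [('east', 12.5), ('west', 5.0)]
```

The `GROUP BY` query corresponds to the kind of grouped summary that data.table and pandas express with their own grouping syntax, which is why SQL is a reasonable entry point before either package.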

4. Random Variables & Distributions

In this chapter, we focus on statistics, which is the foundation of data science. To better follow this chapter, I recommend any introductory-level statistics course as a prerequisite. The topics of this chapter include sampling methods for random variables, distribution fitting, joint distribution/copula simulation, confidence interval calculation, and hypothesis testing.
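Two of the listed topics, sampling and confidence intervals, can be sketched together with the standard library alone. The example below assumes inverse transform sampling for an exponential distribution and a normal-approximation interval for the mean; both choices, and the function names, are illustrative assumptions rather than the book's exact approach.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def sample_exponential(rate, n, rng):
    """Draw n exponential samples by inverse transform sampling.

    If U ~ Uniform(0, 1), then -ln(1 - U) / rate follows Exp(rate).
    """
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]


def mean_confidence_interval(samples, level=0.95):
    """Normal-approximation confidence interval for the sample mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    center = mean(samples)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return center - half_width, center + half_width


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is reproducible
    draws = sample_exponential(rate=2.0, n=10_000, rng=rng)
    low, high = mean_confidence_interval(draws)
    print(f"95% CI for the mean: ({low:.3f}, {high:.3f}); the true mean is 0.5")
```

The normal approximation is reasonable here only because the sample is large; for small samples a t-based interval would be the more usual choice.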

5. Linear Regression

This is a short but important chapter. In this chapter, we build linear regression models from scratch. Many textbooks introduce the theory behind linear regression but offer little help with the implementation. We will see how linear regression is implemented as a toy example in both R and Python with the help of linear algebra. I will also show how the basic linear regression model can be extended to L2 penalized linear regression, i.e., ridge regression.
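A minimal sketch of the linear-algebra view, assuming the closed-form normal equations and NumPy (the book's own implementation may differ): ordinary least squares solves X'X β = X'y, and ridge regression adds a penalty term to the diagonal. The function name `fit_ridge` and the toy data are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, penalty=0.0):
    """Solve (X'X + penalty * I) beta = X'y.

    penalty = 0 gives ordinary least squares. For simplicity, the intercept
    column (if present) is penalized like any other column.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    gram = X.T @ X + penalty * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ y)


if __name__ == "__main__":
    # Toy data generated exactly from y = 1 + 2 * x (no noise).
    x = np.array([0.0, 1.0, 2.0, 3.0])
    X = np.column_stack([np.ones_like(x), x])  # intercept column plus feature
    y = 1.0 + 2.0 * x
    print(fit_ridge(X, y))               # close to [1, 2]
    print(fit_ridge(X, y, penalty=1.0))  # coefficients shrunk by the penalty
```

Using `np.linalg.solve` rather than forming an explicit matrix inverse is a common numerical choice; in practice the intercept is often left unpenalized, which this sketch does not do.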

6. Optimization in Practice

Most machine learning models rely on optimization algorithms. In this chapter, we give a brief introduction to optimization. Specifically, we will talk about convexity, gradient descent, general-purpose optimization tools in R and Python, linear programming, and metaheuristic algorithms. Based on these techniques, we will see coding examples for maximum likelihood estimation, linear regression, logistic regression, portfolio construction, and the traveling salesman problem.
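Gradient descent, one of the techniques named in this summary, can be sketched in a few lines. The example below is an illustrative assumption, not the book's code: it uses a fixed learning rate and a one-dimensional convex function, and the name `gradient_descent` and its parameters are hypothetical.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=200):
    """Repeatedly step against the gradient from a starting point."""
    x = start
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


if __name__ == "__main__":
    # f(x) = (x - 3)^2 is convex with its minimum at x = 3; f'(x) = 2 * (x - 3).
    minimizer = gradient_descent(lambda x: 2.0 * (x - 3.0), start=0.0)
    print(minimizer)  # very close to 3
```

With this function, each step shrinks the distance to the minimum by a constant factor of 0.8, so convergence is guaranteed; for non-convex functions or larger learning rates, no such guarantee holds.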

7. Machine Learning – A gentle introduction

Machine learning is a huge topic. In this chapter, I try to give a very short and gentle introduction to machine learning. It starts with a brief introduction to supervised learning, unsupervised learning, and reinforcement learning. For supervised learning, we will see gradient boosting regression with a pure Python implementation from scratch, from which the audience can learn how to translate mathematical models into object-oriented programs. For unsupervised learning, the finite Gaussian mixture model and PCA are discussed. For reinforcement learning, we will use a simple game as an example to show the usage of deep Q-networks. The goal of this chapter is two-fold. First, I would like to give the audience an impression of what machine learning looks like. Second, reading the code snippets in this chapter could help the audience review the topics in previous chapters.

Read online

The book is now available at https://www.anotherbookondatascience.com/

A short link is also available: www.randpython.com

FAQ

Q: How to report errors?

A: I would appreciate reports of any errors found in the book. Please either email me at [email protected] or open a GitHub issue.

Q: More chapters to come?

A: Maybe in the future.

Q: Why is dplyr not covered?

A: I want to keep the book concise while covering a diverse range of data science topics. Since data.table is already introduced for data manipulation in the book, dplyr is not covered. dplyr is a great tool in R, and there are many good resources for learning it.
