
dlegor / Crash_Course_Pandas

Licence: other
Notebook of the Crash Course of Pandas 2018

Crash Course Pandas

This repository contains the notebooks from the intensive course that I taught over six weeks in 2018. The aim was to study "idiomatic Pandas", but first we reviewed Python and NumPy, and then focused on understanding and preparing data (following the CRISP-DM methodology).

Later, I will share notes on other topics that I consider relevant but that, unfortunately, we could not review.

Notebook and description:

  • Basic Programming in Python and Basic Topics of NumPy: We review Python and its collections (dicts, lists, tuples, sets). For NumPy, we go from an introduction (vectors and matrices) to specific operations (broadcasting, matrix operations, vectorization, etc.).
  • Pandas and the Jupyter Notebook environment: In the first part, we examined Pandas and its objects (DataFrames and Series, and their functionality). We also looked at Jupyter features such as the magic commands and how the notebook operates.
  • DataFrames and Series / Relation between SQL and Pandas 1: We reviewed many ways to create DataFrames and Series, and how to operate on them and between them. With examples, we examined how basic SQL queries can be expressed in Pandas.
  • Pipeline / Relation between SQL and Pandas 2 / GroupBy and Pivot Tables: We encouraged the use of pipelines in Pandas, step by step with examples. Then we explored the GroupBy operation on DataFrames with examples. Finally, we finished reviewing how to write basic SQL queries in Pandas.
  • Methods in Pandas / Merges and Joins / Structure of the Data Analysis Project: In earlier lessons we had already used some DataFrame methods. In this notebook we explored the most useful DataFrame methods with the intention of working with pipelines. I only briefly mentioned merge, join, concat and append, but I showed many examples of how to work with pipelines and visualizations.
  • Two Data Analysis mini-projects: In this last notebook we did two data analysis mini-projects, using the knowledge from the previous lessons. We also reviewed the topics covered in the course and commented on the omitted ones. For the first mini-project, we fitted some models to see the problems that can arise in a Machine Learning project.
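
To give a flavour of the pipeline style and the SQL-to-Pandas correspondence described in the notebooks above, here is a minimal sketch. It is not taken from the course notebooks: the `orders` data and the `add_size_label` helper are hypothetical, and the named-aggregation syntax assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration.
orders = pd.DataFrame({
    "customer": ["ana", "ben", "ana", "cruz", "ben", "ana"],
    "region": ["north", "south", "north", "south", "south", "north"],
    "amount": [120.0, 80.0, 45.0, 200.0, 15.0, 60.0],
})


def add_size_label(df, threshold):
    """Hypothetical helper: tag each order as 'large' or 'small'."""
    is_large = df["amount"] >= threshold
    return df.assign(size=is_large.map({True: "large", False: "small"}))


# Rough Pandas counterpart of the SQL query:
#   SELECT customer, SUM(amount) AS total, COUNT(*) AS n_orders
#   FROM orders WHERE amount > 20
#   GROUP BY customer ORDER BY total DESC
summary = (
    orders
    .query("amount > 20")                     # WHERE
    .groupby("customer", as_index=False)      # GROUP BY
    .agg(total=("amount", "sum"),             # SUM(amount) AS total
         n_orders=("amount", "count"))        # COUNT(*) AS n_orders
    .sort_values("total", ascending=False)    # ORDER BY total DESC
    .reset_index(drop=True)
)

# A custom step joins the chain through .pipe(), followed by a pivot table
# with one row per region and one column per size label.
pivot = (
    orders
    .pipe(add_size_label, threshold=100)
    .pivot_table(index="region", columns="size",
                 values="amount", aggfunc="sum", fill_value=0)
)

print(summary)
print(pivot)
```

Writing each transformation as one step of a chain keeps intermediate variables out of the namespace and makes the order of operations read top to bottom, much like the clauses of the SQL query.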

Next Topics:

  • Categories and strings in Pandas.
  • Tidy Dataframes.
  • Time Series in Pandas.
  • Brief introduction to Dask.

Note:

  • Unfortunately, this course was taught in Spanish, so the comments in the notebooks are in Spanish. If you know Python, though, I think you won't have problems following them.
  • You can run these notebooks in Colaboratory, the platform on which the course was taught.

If you have comments or suggestions, you can write to me.
