
dlab-berkeley / Python-Data-Wrangling

Licence: other
D-Lab's 3-hour introduction to data wrangling in Python. Learn how to import and manipulate dataframes using pandas in Python.

Projects that are alternatives of or similar to Python-Data-Wrangling

monthly-returns-heatmap
Python Monthly Returns Heatmap (DEPRECATED! Use QuantStats instead)
Stars: ✭ 23 (-43.9%)
Mutual labels:  pandas
OFFLINE-ERP
A desktop application which helps students to choose Disciplinary and Open Electives wisely.
Stars: ✭ 16 (-60.98%)
Mutual labels:  pandas
sklearn-predict
Machine learning on data: predict trends and plot them
Stars: ✭ 16 (-60.98%)
Mutual labels:  pandas
neworder
A dynamic microsimulation framework for python
Stars: ✭ 15 (-63.41%)
Mutual labels:  pandas
grizzly
A Python-to-SQL transpiler as replacement for Python Pandas
Stars: ✭ 27 (-34.15%)
Mutual labels:  pandas
Machine-Learning
This repository contains notebooks that will help you understand basic ML algorithms, as well as basic numpy exercises. 💥 🌈 🌈
Stars: ✭ 15 (-63.41%)
Mutual labels:  pandas
partridge
A fast, forgiving GTFS reader built on pandas DataFrames
Stars: ✭ 115 (+180.49%)
Mutual labels:  pandas
framequery
SQL on dataframes - pandas and dask
Stars: ✭ 63 (+53.66%)
Mutual labels:  pandas
skippa
SciKIt-learn Pipeline in PAndas
Stars: ✭ 33 (-19.51%)
Mutual labels:  pandas
open-data-anonimizer
Python Data Anonymization & Masking Library For Data Science Tasks
Stars: ✭ 36 (-12.2%)
Mutual labels:  pandas
stream2segment
A Python project to download, process and visualize medium-to-massive amount of seismic waveforms and metadata
Stars: ✭ 18 (-56.1%)
Mutual labels:  pandas
pandas-cheat-sheet-ja
Unofficial translation of the official pandas cheat sheet
Stars: ✭ 74 (+80.49%)
Mutual labels:  pandas
Arch-Data-Science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
Stars: ✭ 92 (+124.39%)
Mutual labels:  pandas
GitHub-Stalker
track your GitHub statistics with Pandas
Stars: ✭ 31 (-24.39%)
Mutual labels:  pandas
pandas-stubs
Pandas type stubs. Helps you type-check your code.
Stars: ✭ 84 (+104.88%)
Mutual labels:  pandas
pyfinmod
Financial modeling with Python and Pandas
Stars: ✭ 39 (-4.88%)
Mutual labels:  pandas
DataSciPy
Data Science with Python
Stars: ✭ 15 (-63.41%)
Mutual labels:  pandas
datahub
DataHub - Synthetic data library
Stars: ✭ 66 (+60.98%)
Mutual labels:  pandas
AlphaVantageAPI
An Opinionated AlphaVantage API Wrapper in Python 3.9. Compatible with Pandas TA (pip install pandas_ta). Get your FREE API Key at https://www.alphavantage.co/support/
Stars: ✭ 77 (+87.8%)
Mutual labels:  pandas
Dominando-Pandas
This repository is intended for learning the Pandas library.
Stars: ✭ 22 (-46.34%)
Mutual labels:  pandas

D-Lab Introduction to Pandas workshop

This repository contains materials for the introductory pandas workshop at the UC Berkeley D-Lab.

1. Software for the workshop

The best learning experience happens when you can edit and run code, so please have the Python Anaconda Distribution 3.7, pandas, matplotlib, and Jupyter installed before the start of the workshop. Alternatively, if you cannot install Anaconda, you can still access the workshop materials through this datahub link. Note that this will only work if you have a berkeley.edu email address.

To use Anaconda, follow the steps below to set up your environment:

  1. Click here to download the Python Anaconda 3.7 Distribution (3.6 is also okay if you already have it installed). Scroll down to the "Anaconda Installers" section and click the "Graphical Installer" option that corresponds to your operating system.

  2. If you are using Terminal (Mac) or Git Bash (PC), you can install the necessary packages with pip by typing:

$ pip install pandas matplotlib jupyter

Windows users only: if you want a Bash shell like the one available in the Mac "Terminal" application, click here to download Git Bash, a Unix-style command-line environment for Windows.

Alternatively, you can install these packages by adding a cell to the top of your Jupyter Notebook and typing:

!pip install pandas matplotlib jupyter
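To confirm that the installation worked, a quick check like the one below can be run in a notebook cell or as a script. This is only a sketch: the helper name `missing_packages` and the `REQUIRED` list are illustrative, and checking for the `notebook` module (rather than some other Jupyter component) is an assumption about how Jupyter is installed on your machine.

```python
import importlib.util
import sys

# Packages the workshop asks for. Using "notebook" as the module name
# for Jupyter is an assumption; adjust it if your setup differs.
REQUIRED = ["pandas", "matplotlib", "notebook"]


def missing_packages(names):
    """Return the package names that cannot be found by the current Python."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Python version:", sys.version.split()[0])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All workshop packages were found.")
```

If anything is reported as missing, rerun the pip command above for those packages in the same environment that launches Jupyter.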

2. Files for the workshop

Once the software is installed, download the workshop files, which are contained in this repository. Get them by doing the following:

  1. Click the green "Clone or Download" button
  2. Click "Download Zip"
  3. Extract this .zip file someplace familiar, such as your Desktop.

Or, if you are a Git user, you can simply clone this repository:

$ git clone git@github.com:dlab-berkeley/introduction-to-pandas.git

3. Open a Jupyter Notebook

  1. Open the "Anaconda Navigator" application and click "Launch" under Jupyter Notebook

or

Navigate to the repository using Terminal or Git Bash and type

$ cd introduction-to-pandas

then

$ jupyter notebook

or

$ python3 -m notebook

This will open a blank notebook for you to use as a scratch space if you desire. Open the file "introduction-to-pandas.ipynb" to access the tutorial.

4. Outline

For this workshop, we'll go through an example using European unemployment data. We'll load, view, and modify the data as well as calculate some descriptive statistics. The idea is to get a sense of what it would be like to use pandas as part of your workflow.

We plan to cover:

  • pandas data structures
  • loading data
  • subsetting and filtering
  • calculating summary statistics
  • dealing with missing values
  • merging data sets
  • creating new variables
  • basic plotting
  • exporting data
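To give a feel for these steps before the workshop, here is a minimal sketch that walks through most of them on a tiny table. The data, column names, and country list are invented for illustration and are not the workshop's dataset; plotting is left out so the example runs without a display.

```python
import io

import pandas as pd

# Hypothetical toy data (made-up values), standing in for a CSV file on disk.
csv_source = """country,year,unemployment_rate
AT,2010,4.8
AT,2011,4.6
BE,2010,8.3
BE,2011,
DE,2010,7.0
DE,2011,5.8
"""

# Loading data: read_csv accepts a file path or any file-like object.
rates = pd.read_csv(io.StringIO(csv_source))

# Subsetting and filtering: keep only the rows for 2011.
rates_2011 = rates[rates["year"] == 2011]

# Dealing with missing values: count them, then drop incomplete rows.
n_missing = rates["unemployment_rate"].isna().sum()
complete = rates.dropna(subset=["unemployment_rate"])

# Calculating summary statistics: mean rate per country.
mean_by_country = complete.groupby("country")["unemployment_rate"].mean()

# Merging data sets: attach full country names from a second table.
names = pd.DataFrame({
    "country": ["AT", "BE", "DE"],
    "country_name": ["Austria", "Belgium", "Germany"],
})
merged = complete.merge(names, on="country", how="left")

# Creating new variables: flag rows above the overall mean rate.
overall_mean = merged["unemployment_rate"].mean()
merged["above_average"] = merged["unemployment_rate"] > overall_mean

# Exporting data: with no path given, to_csv returns the CSV text.
csv_text = merged.to_csv(index=False)
print(csv_text)
```

In the workshop itself, the same calls would be applied to the real data files in this repository, with `read_csv` and `to_csv` given file paths instead of in-memory text.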

5. Resources

Getting started with pandas

10 minutes to pandas

Visualization with pandas

6. Launch binder

If you have trouble installing the software or otherwise cannot get the Jupyter Notebook to open, click this "launch binder" badge to start the session in Binder.
