ajcr / 100 Pandas Puzzles

Licence: MIT
100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

100 pandas puzzles

Puzzles notebook

Solutions notebook

Inspired by 100 Numpy exercises, here are 100* short puzzles for testing your knowledge of pandas' power.

Since pandas is a large library with many different specialist features and functions, these exercises focus mainly on the fundamentals of manipulating data (indexing, grouping, aggregating, cleaning), making use of the core DataFrame and Series objects. Many of the exercises here are straightforward in that the solutions require no more than a few lines of code (in pandas or NumPy - don't go using pure Python!). Choosing the right methods and following best practices is the underlying goal.
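As a rough illustration of what "a few lines of pandas" means for the four fundamentals named above, here is a minimal sketch. The DataFrame, its column names and its values are invented for this example and are not taken from the puzzles notebook.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "animal": ["cat", "dog", "cat", "snake", "dog"],
        "age": [2.5, 3.0, None, 4.5, 7.0],
        "visits": [1, 3, 2, 1, 2],
    }
)

# Cleaning: replace the missing age with the mean of the known ages.
df["age"] = df["age"].fillna(df["age"].mean())

# Indexing: a boolean mask picks the rows with more than one visit,
# and the column list keeps only the columns of interest.
frequent = df.loc[df["visits"] > 1, ["animal", "age"]]

# Grouping and aggregating: mean age per animal, with no Python loop.
mean_age = df.groupby("animal")["age"].mean()

print(frequent)
print(mean_age)
```

Each step is a single vectorised call on the whole column or frame, which is the style the puzzles are meant to encourage over explicit Python loops.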

The exercises are loosely divided into sections. Each section has a difficulty rating; these ratings are subjective, of course, but should be seen as a rough guide as to how elaborate the required solution needs to be.

Good luck solving the puzzles!

* the list of puzzles is not yet complete! Pull requests or suggestions for additional exercises, corrections and improvements are welcomed.

Overview of puzzles

| Section Name | Description | Difficulty |
|---|---|---|
| Importing pandas | Getting started and checking your pandas setup | Easy |
| DataFrame basics | A few of the fundamental routines for selecting, sorting, adding and aggregating data in DataFrames | Easy |
| DataFrames: beyond the basics | Slightly trickier: you may need to combine two or more methods to get the right answer | Medium |
| DataFrames: harder problems | These might require a bit of thinking outside the box... | Hard |
| Series and DatetimeIndex | Exercises for creating and manipulating Series with datetime data | Easy/Medium |
| Cleaning Data | Making a DataFrame easier to work with | Easy/Medium |
| Using MultiIndexes | Go beyond flat DataFrames with additional index levels | Medium |
| Minesweeper | Generate the numbers for safe squares in a Minesweeper grid | Hard |
| Plotting | Explore pandas' part of plotting functionality to see trends in data | Medium |
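
To give a flavour of one row in the overview, the sketch below shows the kind of operation the "Series and DatetimeIndex" section is about. It is not one of the puzzles; the start date, the series length and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example: ten consecutive days starting on an arbitrary date.
dates = pd.date_range("2015-01-01", periods=10, freq="D")
s = pd.Series(np.arange(10), index=dates)

# Select the values that fall on a Wednesday (dayofweek == 2, Monday == 0).
wednesdays = s[s.index.dayofweek == 2]

# Sum the values per calendar week (weeks end on Sunday by default).
weekly_total = s.resample("W").sum()

print(wednesdays)
print(weekly_total)
```

Both operations rely on the index being a DatetimeIndex, which is why the section pairs Series with datetime data.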

Setting up

To tackle the puzzles on your own computer, you'll need a Python 3 environment with the dependencies (namely pandas) installed.

One way to do this is as follows. I'm using a bash shell; the procedure on macOS should be essentially the same. I'm not sure about Windows.

  1. Check you have Python 3 installed by printing the version of Python:
         python -V
  2. Clone the puzzle repository using Git:
         git clone https://github.com/ajcr/100-pandas-puzzles.git
  3. Install the dependencies listed in the cloned repository (caution: if you don't want to modify any Python modules in your active environment, consider using a virtual environment instead):
         python -m pip install -r 100-pandas-puzzles/requirements.txt
  4. Launch a Jupyter notebook server:
         jupyter notebook --notebook-dir=100-pandas-puzzles

You should be able to see the notebooks and launch them in your web browser.
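
If the notebooks do not start as expected, a quick check of the environment can help. The snippet below is a minimal sketch, not part of the repository: it only confirms that the interpreter is Python 3 and that pandas can be imported, and the variable names are made up for this example.

```python
import sys

import pandas as pd

# True when the running interpreter is Python 3 or later.
python_ok = sys.version_info >= (3,)

# The installed pandas version, as reported by the package itself.
pandas_version = pd.__version__

print(f"Python {sys.version.split()[0]}, pandas {pandas_version}")
```

If the import fails, the dependencies were probably installed into a different environment from the one running the code.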

Contributors

This repository has benefitted from numerous contributors, with those who have sent puzzles and fixes listed in CONTRIBUTORS.

Thanks to everyone who has raised an issue too.

Other links

If you feel like reading up on pandas before starting, the official documentation is useful and very extensive. Good places to get a broader overview of pandas are:

There are many other excellent resources and books that are easily searchable and purchasable.
