
nick-ulle / 2015 Python

(Winter 2015) Python for Data Mining mini-course.

Course Videos

Python for Data Mining

Python is a programming language designed to have clear, concise, and expressive code. An extremely popular general-purpose language, Python has been used for tasks as diverse as web development, teaching, and systems administration. This mini-course provides an introduction to Python for data mining.

Messy data has an inconsistent or inconvenient format, and may have missing values. Noisy data has measurement error. Data mining extracts meaningful information from messy, noisy data. This is a start-to-finish process that includes gathering, cleaning, visualizing, modeling, and reporting.

Programming and research best practices are a secondary focus of the mini-course, because Python is a philosophy as well as a language. Core concepts include: writing organized, well-documented code; being a self-sufficient learner; using version control for code management and collaboration; ensuring reproducibility of results; producing concise, informative analyses and visualizations.

We will meet for four weeks during the Winter 2015 quarter at the University of California, Davis.

Target Audience

The mini-course is open to undergraduate and graduate students from all departments. We recommend that students have prior programming experience and a basic understanding of statistical methods, so they can follow along with the examples. For instance, completion of STA 108 and STA 141 is sufficient (but not required).

Topics

Core Python

The mini-course will kick off with a quick introduction to the syntax of Python, including operators, data types, control flow statements, function definition, and string manipulation. Slower, in-depth coverage will be given to uniquely Pythonic features such as built-in data structures, list comprehensions, iterators, and docstrings.
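As a minimal sketch of the Pythonic features named above, the following example combines a docstring, comprehensions, and a generator (one kind of iterator). The function names and sample words are hypothetical, chosen only for illustration.

```python
def word_lengths(words):
    """Return a dict mapping each word to its length.

    A docstring like this one is stored on the function and shown by help().
    """
    # Dict comprehension: builds the whole dict in a single expression.
    return {word: len(word) for word in words}


def countdown(start):
    """Yield start, start - 1, ..., 1, one value at a time."""
    # A generator function produces an iterator that computes values lazily.
    while start > 0:
        yield start
        start -= 1


# Hypothetical sample data for this example.
words = ["data", "mining", "python"]

# List comprehension with a filter: upper-case the words longer than 4 letters.
long_words = [w.upper() for w in words if len(w) > 4]

lengths = word_lengths(words)
counts = list(countdown(3))

print(long_words)
print(lengths)
print(counts)
print(word_lengths.__doc__.splitlines()[0])
```

Calling `help(word_lengths)` in an interactive session displays the same docstring, which is why well-documented code is easy to explore.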

Authoring packages and other advanced topics may also be discussed.

Scientific Computing

Support for stable, high-performance vector operations is provided by the NumPy package. NumPy will be introduced early and used often, because it's the foundation for most other scientific computing packages. We will also cover SciPy, which extends NumPy with functions for linear algebra, optimization, and elementary statistics.
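To give a rough idea of what vector operations look like, here is a small sketch that assumes NumPy is installed. The measurements and the linear system are made up for illustration, and the example uses NumPy's own `np.linalg.solve` rather than SciPy to keep it short.

```python
import numpy as np

# Hypothetical measurements, invented for this example.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])

# Vectorized arithmetic: the division applies to every element, with no loop.
heights_m = heights_cm / 100.0

# Broadcasting: the scalar mean is subtracted from each element.
centered = heights_cm - heights_cm.mean()

# Solve the linear system A x = b.
A = np.array([[2.0, 0.0],
              [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)

print(heights_m)
print(centered)
print(x)
```

Operating on whole arrays at once is usually both shorter and faster than an explicit Python loop, because the work happens in compiled code.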

Specialized packages will be discussed during the final week.

Data Manipulation

The pandas package provides tabular data structures and convenience functions for manipulating them, including a two-dimensional data frame similar to the one found in R. We will cover pandas extensively, because it makes it easy to:

  • Read and write many formats (CSV, JSON, HDF, database)
  • Filter and restructure data
  • Handle missing values gracefully
  • Perform group-by operations (apply functions)
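The tasks listed above might look something like the following sketch, which assumes pandas is installed. The CSV content and column names are hypothetical, and the data is read from an in-memory string so the example needs no external file.

```python
import io

import pandas as pd

# Hypothetical CSV data, invented for this example.
# The empty rainfall field on the second row is a missing value.
csv_text = """city,year,rainfall
Davis,2013,10.0
Davis,2014,
Sacramento,2013,12.0
Sacramento,2014,16.0
"""

# Read: pandas parses the empty field as NaN.
df = pd.read_csv(io.StringIO(csv_text))

# Filter: keep only the rows from 2014.
recent = df[df["year"] == 2014]

# Missing values: count them, then drop the incomplete rows.
n_missing = int(df["rainfall"].isna().sum())
complete = df.dropna(subset=["rainfall"])

# Group-by: mean rainfall per city (NaN values are skipped by default).
mean_by_city = df.groupby("city")["rainfall"].mean()

print(recent)
print(n_missing)
print(mean_by_city)
```

The same `read_csv` call accepts a file path in place of the `StringIO` object.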

Data Visualization

Many visualization packages are available for Python, but the mini-course will focus on Seaborn, a user-friendly layer built on top of the venerable matplotlib package.

Other packages such as ggplot (a Python port of R's ggplot2), Vincent, Bokeh, and mpld3 may also be covered.

Programming Environment

Python 3 has syntax changes and new features that break compatibility with Python 2. All of the major scientific computing packages have added support for Python 3 over the last few years, so it will be our focus. We recommend the Anaconda Python 3 distribution, which bundles most packages we'll use into one download. Any other packages needed can be installed using pip or conda.
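A few of the better-known incompatibilities are sketched below. This is an illustrative selection rather than a complete list, and the variable names are hypothetical.

```python
# In Python 3, print is a function.
# The Python 2 statement form, print "hello", is a syntax error here.
print("hello")

# The / operator always performs true division; use // for floor division.
# In Python 2, 7 / 2 with two integers gave 3.
true_div = 7 / 2
floor_div = 7 // 2

# Strings are Unicode text; bytes are a separate type.
text = "caf\u00e9"
raw = text.encode("utf-8")
roundtrip = raw.decode("utf-8")

# range() is lazy and produces values on demand instead of building a list.
first_three = list(range(3))

print(true_div, floor_div, len(text), len(raw), first_three)
```

Code written for Python 2 often needs changes in exactly these places before it runs under Python 3.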

Python code is supported by a vast array of editors.

  • Spyder IDE, included in Anaconda, is a Python equivalent of RStudio, designed with scientific computing in mind.
  • PyCharm IDE and Sublime Text provide good user interfaces.
  • Terminal-based text editors, such as Vim and Emacs, are a great choice for ambitious students, and they can be used with any language. Clark and Nick both use Vim.

References

No books are required, but we recommend Wes McKinney's book:

  • McKinney, W. (2012). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media.

Python and most of the packages we'll use have excellent documentation, which can be found at the following links.

Due to Python's popularity, a large number of general references are available. While these don't focus specifically on data analysis, they're helpful for learning the language and its idioms. Some of our favorites are listed below, many of which are free.

* We suggest videos featuring Guido van Rossum, Raymond Hettinger, Travis Oliphant, Fernando Pérez, David Beazley, and Alex Martelli.
