mwermelinger / Learn To Code For Data Analysis

Jupyter notebooks and datasets for this course

Learn to code for data analysis

This repository provides the Jupyter notebooks and datasets for the Open University course Learn to Code for Data Analysis. The course used to run twice a year on FutureLearn, with discussion forums and facilitator support. It is now available at any time on OpenLearn, without forums and without support.

The course was written by Michel Wermelinger (parts 1 and 3), Rob Griffiths (part 2) and Tony Hirst (part 4).

This repository does not contain the course text. It contains:

  • one test notebook to check that your software installation includes Python 3 and the necessary data analysis and visualisation libraries;
  • one exercise notebook, one project notebook, and the necessary datasets, for each of the 4 parts of the course;
  • additional software for interactive pivot tables in part 4.
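As a rough illustration of what an installation check can look like, here is a minimal sketch. It is not the course's test notebook: the function name `check_installation` and the list of libraries in `REQUIRED` are assumptions made for this example, and the test notebook in this repository remains the authoritative check.

```python
import sys
from importlib.util import find_spec

# Assumed list of libraries, chosen for illustration only;
# the course's own test notebook defines what is really needed.
REQUIRED = ["pandas", "matplotlib"]


def check_installation(required=REQUIRED):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # The course requires Python 3.
    if sys.version_info.major < 3:
        problems.append("Python 3 is required")
    # find_spec returns None when a top-level module cannot be found.
    for name in required:
        if find_spec(name) is None:
            problems.append("missing library: " + name)
    return problems


problems = check_installation()
if problems:
    print("\n".join(problems))
else:
    print("Installation looks fine")
```

Running this in a notebook cell or from the command line prints either a list of problems or a confirmation message.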

You can run Jupyter notebooks locally on your computer using software like Anaconda, or you can run them in the cloud using services like CoCalc and Microsoft Azure. Running notebooks in the cloud will be noticeably slower, but you won't have to install any software and you can run them from wherever you have Internet access.

Instructions (Microsoft Azure)

Azure Notebooks

You can copy these notebooks directly to Microsoft Azure, without downloading them first to your computer, and run them there. If you have used Azure Notebooks before, click the 'launch' badge above; otherwise, follow these instructions.

Instructions (Anaconda and CoCalc)

To get the notebooks and data files for the course, click on the green 'Code' button above. You will see a pop-up window with a button 'Download ZIP'. Click on it. The notebooks and data files will be downloaded as a compressed archive: a file with extension .zip. The archive will be in the folder where your web browser usually puts downloaded files. You will need to double-click on the downloaded file to uncompress it, although your browser may have already done that automatically for you.

You should now have a sub-folder named Learn-to-code-for-data-analysis-master or similar. You can rename the folder to whatever name you prefer. If you will be working on the course using your desktop, e.g. with Anaconda, you need to move that folder to anywhere within your home folder, so that Jupyter can find your notebooks and open them.

If you will be working on the course using a web-based service, e.g. CoCalc, you need to upload the folder to that service.

The instructions for installing Anaconda or using CoCalc are given at the start of the course.

The course

Learn to Code for Data Analysis is a hands-on introduction to computer programming and data analysis. It teaches how to access open data and clean, analyse and visualise it. It adopts a reproducible research approach: the data analysis is written up and publicly shared with the code used in the analysis.

The course teaches how to write computer programs, one line of code at a time, to download, clean, analyse and visualise open data (using line charts, bar charts and scatterplots). The course also teaches how to write up and share data analyses in a reproducible way.

Each part of the course is organised around a specific analysis project using open data from the World Health Organisation, the Weather Underground, the World Bank and the United Nations.

All coding and data analysis is done with tools used by professional scientists: the Python programming language, the pandas data analysis library and the Jupyter Notebooks programming environment.

The course does not assume prior experience in programming, data analysis, or statistics, but it requires basic numeracy and digital skills, like understanding percentages and working with files and folders.

Learning outcomes

  • Understanding basic programming and data analysis concepts
  • Awareness of open data sources as a public resource
  • Using a programming environment to develop programs
  • Writing simple programs to analyse large bodies of data and produce useful results

Syllabus

  • Python: variables, assignments, expressions, basic data types, if-statement, functions
  • Programming: using Jupyter Notebooks, writing readable and documented code, testing code
  • Data analysis: using pandas to read CSV and Excel files, to clean, filter, partition, aggregate and summarise data, and to produce simple charts
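To give a flavour of how these syllabus topics fit together, the following sketch reads a small CSV table with pandas, cleans, filters and aggregates it, and uses a function and an if-statement along the way. The data, the column names and the function `summarise_cases` are invented for this example and do not come from the course datasets or notebooks.

```python
import io

import pandas as pd

# Hypothetical data, made up for illustration; not one of the course datasets.
CSV_TEXT = """country,year,cases
A,2012,100
A,2013,
B,2012,40
B,2013,50
"""


def summarise_cases(csv_source, first_year=2012):
    """Return the total number of cases per country from first_year onwards."""
    table = pd.read_csv(csv_source)               # read the CSV data
    table = table.dropna(subset=["cases"])        # clean: drop rows with no value
    recent = table[table["year"] >= first_year]   # filter by year
    return recent.groupby("country")["cases"].sum()  # aggregate per country


totals = summarise_cases(io.StringIO(CSV_TEXT))

# A simple if-statement comparing two aggregated values.
if totals["A"] > totals["B"]:
    largest = "A"
else:
    largest = "B"

print(totals)
print("Country with most cases:", largest)
```

In a notebook, `totals.plot(kind="bar")` would then draw a simple bar chart of the totals, assuming matplotlib is installed. To read a file instead of the in-memory text, pass its path to `summarise_cases`.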

Pedagogy

The course follows Merrill's First Principles of Instruction. More details on how it does so are here.
