Data Institute 2018
For students of https://projects.propublica.org/graphics/ida-propublica-data-institute
Here are all of the materials we used to teach the 2018 Data Institute: slides, exercises, links, and homework. This is not an online course and doesn’t have all the context or instruction to be a standalone class.
Want to use our slides? Our teaching materials fall under the same Creative Commons license we use across our site. Get more details here.
Curriculum
Table of Contents
Click to jump directly to:
Week 1:
- Day 1: Intro to Data Journalism, Spreadsheets, Best Practices
- Day 2: Evaluating data, Open Refine, Analyzing One Variable
- Day 3: Intro to Mapping, Common Calculations in News, Analyzing Two Variables, Statistics
- Day 4: Intro to Code, How Websites Work, HTML, CSS
- Day 5: Intro to Design, Type, Layout & Color, Making a Webpage with Github
Week 2:
- Day 6: Javascript, JQuery
- Day 7: Visualizing Data, Charts and Maps
- Day 8: Web Scraping, Fundamentals of Programming
- Day 9: Even More Web Scraping
- Day 10: Final Presentations
Welcome Reception & Install Party
ACCOUNTS
- Github.com (Make sure to confirm your e-mail address)
- Google.com
- Datawrapper
SOFTWARE
- Google Chrome
- Slack (Mac, Windows)
- Sublime Text (Mac, Windows)
- Github Desktop App
- Tabula (Mac, Windows)
- Open Refine (Mac, Windows)
- If you're on a Mac, and you get the error that Google/Open Refine is damaged, follow these instructions.
Macs
- Open your Terminal app (comes with all Macs) and paste these exact commands into the window, one at a time, and press enter:
xcode-select --install
python -V
- Your Terminal should say something like "Python 2.7.13". Your last two digits might be different; that's okay. If you get Python 3 instead, which looks like "Python 3.X.XX", let Sisi know.
sudo easy_install pip
- This will ask you to put in your computer password. Go ahead, with one quick warning: the cursor won't move as you type, but trust that your computer is reading what you're typing in.
pip install --user BeautifulSoup
pip install --user Requests
Windows
- Download Cygwin
- When you get to this step, ask for Sisi
Day 1
Monday, Oct 1
Intro to Data Journalism
Intro to Spreadsheets
Exercises
- Organizing information in rows & columns using Senate data
- Learning how to sort with Trump expenditures
Finding & Loading Data
In-Class Demos
- Using Socrata to look at 3-1-1 calls from NYC
- Tabula
- Types of data (numeric, text, date)
- Quirks of Excel (reformatting dates, dropping leading zeros)
- Text files and types (csv, tab, fixed width, pipe)
- Text delimiter (probably quotes, but maybe not)
- Open your text file in a reader and examine it
Best Practices
In-Class Demos
- Create a text document
- Save a clean copy of your data
- Keep track of your work
- Describe your steps
- Copy/paste functions
- Screen grabs of dialogue boxes
Advanced Spreadsheets: Pivot Tables
Advanced Spreadsheets: String Functions
Exercises
- Practice Data
- Reformat
- Split
- Transpose
Homework
- None! Enjoy the city!
Day 2
Tuesday, Oct 2
Evaluating Data
Exercises
Data Integrity
OpenRefine
Analyzing Data: One Variable
Advanced Spreadsheets: Combining Two Sheets with VLOOKUP
Homework
- None! Enjoy the city!
Day 3
Wednesday, Oct 3
Mapping
Exercises*
Analyzing Data, continued
- Histograms revisited
- Data Analysis Grab-Bag (slides)
- percent change
- per capita
- choosing your denominator wisely
- correlation
- Two Variables (slides)
- scatterplots
- fitting a line
- Statistical Tests with M&Ms
- Resources
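The grab-bag calculations above can be sketched in a few lines of Python. This is only an illustration, not part of the class materials: the function names (percent_change, per_capita, correlation) and all of the numbers are made up for the example, and correlation here means the standard Pearson coefficient.

```python
def percent_change(old, new):
    """Percent change from old to new: (new - old) / old * 100."""
    return (new - old) * 100 / float(old)


def per_capita(count, population, per=100000):
    """Rate per `per` people, so places of different sizes can be compared."""
    return count * per / float(population)


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical numbers, for illustration only.
print(percent_change(200, 250))             # change from 200 to 250
print(per_capita(50, 200000))               # 50 incidents among 200,000 people
print(correlation([1, 2, 3], [2, 4, 6]))    # perfectly linear relationship
```

The per_capita function takes the denominator as an explicit argument on purpose: choosing between, say, total population and number of households changes the rate, so the choice should be visible in the code.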
Homework
- None! Enjoy the city!
Day 4
Thursday, Oct 4
Intro to Code
In-Class Demos
- What coding languages have you heard of?
- Using the web inspector
How Websites Work
In-Class Demos
- How the Internet passes websites around
- What HTML, CSS and Javascript contribute to a webpage
Exercises
- Drawing a Website
HTML
In-Class Demos
- How to create your first HTML file
- Shortcut to the basic HTML template
- How to use:
<h1>
<h2>
<h3>
<p>
<img>
<a>
<ul>
<!-- Comments -->
Exercises
- Copy and paste this code and follow the instructions inside to format the page.
- Can you fix this broken code?
Basic CSS
In-Class Demos
- How to create your first CSS file
- Shortcut to linking to your CSS file
- How CSS styles work
Exercises
- Using your practice HTML file from before, add CSS styles to it so that you change the:
- color
- font-family
- font-size
- On your own, look up how to do the following in CSS, and add it to your HTML file as well:
- underline text
- bold text
- italicize text
- Going back to the Supreme Court article you formatted earlier, do the following using CSS:
- Make the main headline dark red.
- Use the font family "Georgia" for the main headline and the subheadline.
- Center the text of the main headline and the subheadline.
- Give the paragraphs a line height of 19 pixels.
- Remove the underline from the links.
- Make the "Related articles" label all uppercase.
- Bonus: Make an underline appear when you hover over a link.
CSS Classes
In-Class Demos
- How to write your own CSS Class
- How CSS deals with conflict
Homework
- Save this HTML onto your computer. Link to a new CSS file that you create. Write CSS to make the end result look like this image. You may only write CSS. You cannot edit the HTML file.
- Using HTML and CSS, lay out a one-page web portfolio for yourself. Don't worry too much about the final design, though you are free to get started designing. Just make sure to get all of your information on the page and formatted using HTML.
Day 5
Friday, Oct 5
Intro to Design
Lecture
- What's Design Anyways?
- Design Principles: The Only 4 Rules You Gotta Know
Exercises
- Align This!
- Resume Redesign
Type, Layout & Color
Lecture
- Letter: The Many Faces of Type
- Text: How to Deal with Words
- The Grid: Putting the Pieces Together
- Colors & How to Pick 'Em
Exercises
- Name that Font!
- Type Crimes
Let's make a webpage!
Exercises
- Making a website
- Getting your portfolio on the internet!
- Using the GitHub Desktop app
Homework
- Using the principles we discussed today, redesign your résumé. Email the before and after version to [email protected].
- Using everything you've learned about design, type & layout, keep working on your portfolio HTML page. Then, since you've learned how to make a working webpage on the internet, upload your page to Github.
📚 Weekend! 🎉
Day 6
Monday, Oct 8
Javascript and JQuery
Review
- Making a Website with Cards ♠️
- Reminder to send Lena your resumes!
- Going over homework from last Thursday
In-Class Demos
- How to setup Javascript
- How to add jQuery, set it up, and use it (A good place to review what we're learning)
- Ida Demo: Starting with this HTML and CSS, let's build the JS together.
Exercises
- Save this code onto your computer as separate HTML and CSS files. Create a new JS file and link to it in your HTML.
- Let's talk through, logically, what needs to happen.
- Can you figure out how to build a before and after graphic using Javascript?
Homework
- Using your own photos, make your own before and after interactive. Then, publish your interactive on Github and add it to your portfolio.
- Create an account for Datawrapper before tomorrow.
Day 7
Tuesday, Oct 9
Visualizing Data
Lecture
- Lines
- Bars
- Scatterplots, Treemaps & More!
Let's Make Some Charts & Maps!
In-Class Demos
- From data to charts in Google Sheets & Datawrapper
Exercises
Day 8
Wednesday, Oct 10
Web Scraping + Fundamentals of Programming
In-Class Demos
- Introduction to Web Scraping
- Thinking through how to scrape this website
- Download and unzip this folder into your "Code" folder on your computer
- Fundamentals of Programming
- Want to review later? Everything we're covering is laid out here.
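As a rough picture of what a scraper does, here is a minimal sketch that pulls every link out of a page. It is not the in-class demo: the class installs BeautifulSoup and Requests, while this sketch assumes only Python's built-in HTML parser and a made-up snippet of HTML, so it runs without downloading anything. The names SAMPLE_HTML, LinkCollector and scrape_links are hypothetical.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2

# Made-up page content. A real scraper would download the page first.
SAMPLE_HTML = """
<html><body>
  <h1>Example headlines</h1>
  <ul>
    <li><a href="/story-one">Story one</a></li>
    <li><a href="/story-two">Story two</a></li>
  </ul>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []
        self._current_href = None
        self._current_text = []

    def handle_starttag(self, tag, attrs):
        # Start recording when a link opens.
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_data(self, data):
        # Keep any text that appears while we are inside a link.
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        # When the link closes, save what we recorded.
        if tag == "a" and self._current_href is not None:
            text = "".join(self._current_text).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def scrape_links(html):
    """Return a list of (href, text) pairs found in the given HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


for href, text in scrape_links(SAMPLE_HTML):
    print(href + " -> " + text)
```

The pattern is the same whichever library you use: get the HTML, find the tags you care about, and pull out their attributes and text.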
Exercises
We'll do the first two together, and you'll do the rest on your own.
- Write a function, named copycat, that simply prints out whatever input it's given.
- Write a function, named addition, that when given any three numbers, will print out the total sum of all three numbers.
- Write a function, named conversion, that when given the Fahrenheit temperature, will print out what it is in Celsius. The formula you can use is: C = (F - 32) * 5/9
- Write a function, named find_the_max, that given any three numbers, will print out the biggest number. Python has the native ability to do this, using the function max(). Do not use it. Instead, write this from scratch.
- For an extra challenge: Given the following data, write a function, named total_students, that calculates how many total students are enrolled in Hogwarts.
pupils_by_year = [["first years", 40], ["second years", 40], ["third years", 38], ["fourth years", 35], ["fifth years", 30], ["sixth years", 29], ["seventh years", 23]]
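For reference, the first two exercises (the ones done together in class) might look something like the sketch below. This is one possible answer, not the official solution; the parameter names are arbitrary. The remaining exercises are left for you.

```python
def copycat(value):
    # Print out whatever input the function is given.
    print(value)


def addition(a, b, c):
    # Add the three numbers together and print the total.
    print(a + b + c)


copycat("hello")
addition(1, 2, 3)
```

One thing to watch for in the conversion exercise: in Python 2, dividing one whole number by another drops the decimal part, so writing 5.0/9 instead of 5/9 keeps the result accurate.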
Homework
- Keep working on your portfolios and presentations.
- See if you can write yourself any other functions. It'll help you gear up for tomorrow.
Day 9
Thursday, Oct 11
Web Scraping, Continued
In-Class Demos
- Even more web scraping!
Homework
- Prepare your presentation for tomorrow! Send a URL of the project you want to show to [email protected] by tomorrow at 9:30am. Here are some questions to think about:
- General: What did you learn? What can you do now that you could not do 2 weeks ago? What were the biggest challenges/setbacks/frustrations you faced? The biggest surprises/successes/most awesome things you accomplished?
- Project specific: Tell us what you’re presenting: your portfolio, a dataset you analyzed, a data visualization you created. What are you proud of? What are the next steps you want to take? What are your ultimate goals for the project?