Hands-On Big Data
Materials for the "Hands-On Big Data" workshop, first presented at the IASSIST 2015 Annual Conference (http://iassist2015.pop.umn.edu/), Minneapolis, Minnesota, June 2, 2015.
A screencast version is available at youtube.com/librarianwomack
- HandsOnBigData.pdf: Presentation slides
- common_crawl: Check and download output from the Common Crawl web corpus
- spark: Short sample Spark program run on EC2
- Tessera: Instructions for running Tessera on Amazon EMR
- wdi.hive: Analyze an extract of the World Development Indicators with Hive
- wdi_extract.csv: Small extract of data from the World Development Indicators, for demonstration purposes
- wordcount.pig: Short Pig Latin program that counts words in Shakespeare's works
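
For readers who want to see what a word count computes before running wordcount.pig on a cluster, here is a minimal local sketch in Python. It is not a translation of the Pig script: the function name `word_count`, the sample text, and the tokenization rule (lowercase, then runs of letters and apostrophes) are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_count(text):
    """Count how often each word occurs in text, ignoring case.

    Assumption: a "word" is a run of letters and apostrophes.
    The actual Pig script may tokenize differently.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical sample input; the workshop uses Shakespeare's works.
sample = "To be, or not to be, that is the question"
counts = word_count(sample)

# Print the most frequent words with their counts.
for word, n in counts.most_common(3):
    print(word, n)
```

The same split, group, and count pattern is what Pig distributes across a cluster, so a script like this can be handy for checking results on a small sample of the text.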