dwyl / Stars
⭐️ ⭐️ ⭐️ ⭐️ ⭐️ stars ⭐️ ⭐️ ⭐️ ⭐️ ⭐️
"No more counting dollars, we'll be counting stars"
~ OneRepublic - https://youtu.be/hT_nvWreIhg?t=15s
This mini-project helps us track ⭐️ for projects on GitHub and answer interesting questions about the data.
Why?
A big part of achieving our goals in DWYL requires tracking certain "Metrics" so that we can see trends and derive actionable insights from our data.
Discover Interesting Projects & Useful Content on GitHub
GitHub ⭐️ are one of the main (quantitative) measures we have for discovering interesting Open Source projects on GitHub.
Counting ⭐️ helps us know if the learning materials we are producing are useful to other people.[1]
Encouraging people to ⭐️ our projects is important for "exposure",
and is something you can help us with if you aren't already...
The more people ⭐️ dwyl repos the more it will help
their friends/followers to discover our useful projects/content.
Discover Interesting People
The other benefit of tracking ⭐️ on our projects is that it allows us to understand who is interested in our work, which allows us to discover new & interesting people.
Ask Interesting Questions
Finally, we think that the GitHub API for ⭐️ is not great because, for example, it does not allow us to answer interesting questions such as:
find all people who are members of an org who have starred xyz project
or
who in the org has the most/least stars
or
which project in the org increased/decreased its stars most this week
So we decided to solve this mini-challenge with some code.
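To make those questions a bit more concrete, here is a minimal sketch of how they could be answered once the star data is held locally. It is not the code used in this project. The record shape (`{ repo, user }`), the function names and the sample data are all assumptions made up for illustration.

```javascript
// Hypothetical data shape: one record per (repo, user) star.
const stars = [
  { repo: 'learn-tdd', user: 'alice' },
  { repo: 'learn-tdd', user: 'bob' },
  { repo: 'hq', user: 'alice' },
];

// Which members of the org have starred a given repo?
function membersWhoStarred(stars, members, repo) {
  const memberSet = new Set(members);
  return stars
    .filter((s) => s.repo === repo && memberSet.has(s.user))
    .map((s) => s.user);
}

// How many stars does each repo have?
function countByRepo(stars) {
  const counts = {};
  for (const s of stars) {
    counts[s.repo] = (counts[s.repo] || 0) + 1;
  }
  return counts;
}

// Which repo gained the most stars between two snapshots of counts?
function biggestIncrease(previousCounts, currentCounts) {
  let best = null;
  for (const repo of Object.keys(currentCounts)) {
    const delta = currentCounts[repo] - (previousCounts[repo] || 0);
    if (best === null || delta > best.delta) {
      best = { repo, delta };
    }
  }
  return best;
}

console.log(membersWhoStarred(stars, ['alice', 'carol'], 'learn-tdd')); // [ 'alice' ]
console.log(biggestIncrease({ 'learn-tdd': 1, hq: 1 }, countByRepo(stars)));
// { repo: 'learn-tdd', delta: 1 }
```

The "this week" question needs two snapshots of the counts (for example last week's and today's), which is why the data has to be stored rather than fetched fresh each time.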
What?
“When you have mastered numbers, you will in fact no longer be reading numbers, any more than you read words when reading books. You will be reading meanings.” ~ W.E.B. Du Bois
GitHub lets its "users" ⭐️ projects (repositories) in order to "favourite" or "bookmark" them.
Both the person starring the project (that interests them) and the rest of the community can see the stars, which then act as a signal of "interesting" or even "quality".
For example Natalia has the following projects starred: https://github.com/NataliaLKB?tab=stars
Some people use their stars "sparingly", which is ok because they may only want to "bookmark" a handful of things on GitHub. However, other "power users" ⭐️ many things ... e.g.:
https://github.com/feross?tab=stars&q=summer
(Immediate) "Research Question"
The immediate question we are going to answer with this project is:
how many distinct people have found dwyl code/tutorials useful
The answer is:
See "how" section below for exactly how this number is derived.
How?
How would you go about tackling this challenge...?
Scripts?
We wrote a few scripts to fetch the data from GitHub:
You will need node.js installed on your computer. If you don't already have it, go to: https://nodejs.org/en/download/
Run the following commands:
npm install # install dependencies
npm run crawl # crawls all pages on dwyl's github for stargazers
npm run combine # combines all stargazers into
npm run unique # tallies how many unique people have starred a dwyl repo
npm run learners # just the people who have starred a learn-x repo
or run a single command:
npm run all
The output will be 4 files:
- stargazers.csv - the list of all repos and people who have starred them
- unique.csv - unique people that have starred any dwyl repo
- unique_learners.csv - unique people who have starred a learn-x repo
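As a rough illustration of what the "unique" and "learners" tallies involve, here is a small sketch. It assumes each line of stargazers.csv has the form `repo,username` and that a learn-x repo is one whose name starts with `learn-`. Both are assumptions, the real file layout may differ, and the function names are made up for this example.

```javascript
// Parse CSV text into { repo, user } rows.
// ASSUMPTION: each line looks like "repo,username" with no header row.
function parseRows(csvText) {
  return csvText
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [repo, user] = line.split(',');
      return { repo, user };
    });
}

// Distinct usernames across all rows (a Set removes duplicates).
function uniqueUsers(rows) {
  return [...new Set(rows.map((row) => row.user))];
}

// Distinct usernames who starred a repo whose name starts with "learn-".
function uniqueLearners(rows) {
  return uniqueUsers(rows.filter((row) => row.repo.startsWith('learn-')));
}

const sample = 'learn-tdd,alice\nlearn-tdd,bob\nhq,alice\nhq,carol\n';
const rows = parseRows(sample);
console.log(uniqueUsers(rows).length);    // 3
console.log(uniqueLearners(rows).length); // 2
```

Someone who has starred several repos appears on several lines of the combined list but is only counted once, which is what makes the answer a count of distinct people.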
Run it Locally
npm install && npm run local
You should see something like this:
Want to Sort the Profile Images by Color?
Sorting the avatars by the color of the avatar requires a little "magic". We first need to download all the profile images so that our script can "analyse" them.
Step 1: Get Profiles for All People!
Run this script (and go for a walk/coffee):
npm run people
Note: this will take about 50 minutes to run because we don't want to "DDOS" GitHub with 6k requests at once (and get our IP address blocked!!)
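The numbers above work out to roughly one request every half second (50 minutes is 3000 seconds, spread over about 6k requests). A minimal sketch of that kind of throttling is shown below. It is not the project's actual script: `fetchProfile` is a hypothetical function you would supply yourself, and the 500 ms default delay is an assumption based on that arithmetic.

```javascript
// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fetch profiles one at a time, pausing between requests.
// ASSUMPTION: fetchProfile(username) is supplied by the caller and
// returns a Promise for that person's profile.
async function fetchAllSlowly(usernames, fetchProfile, delayMs = 500) {
  const profiles = [];
  for (const username of usernames) {
    profiles.push(await fetchProfile(username)); // never more than one request in flight
    await sleep(delayMs);                        // wait before sending the next one
  }
  return profiles;
}

// Example usage with a fake fetcher (no network access needed):
// fetchAllSlowly(['alice', 'bob'], async (name) => ({ login: name }), 10)
//   .then((profiles) => console.log(profiles.length));
```

Running the requests in sequence with a pause is slower than firing them all at once, but it keeps the load on GitHub low and predictable.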
Step 2: Download Profile Images
Run this script and go for a quick bathroom break:
npm run people
Note: this will take about 20 minutes to run, again because we don't want to flood the GitHub CDN with 6k requests at once.
Step 3: Update faces.js
Find the line that looks like this in faces.js:
// var img_base = '/data/img/'; // get avatar from localhost
var img_base = 'https://avatars2.githubusercontent.com/u/';
Comment out the GitHub URL and un-comment the relative one.
Do the same thing again for the lines:
// var src = img_base + uid + '.jpg'; // get avatar from localhost
var src = img_base + uid + '?v=3&s=200'; // GET images from GitHub
Now when you run npm run local, wait 60 seconds for the page to load all the images.
Once they are loaded, they will be sorted into a rainbow!
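For the curious, one common way to get a "rainbow" ordering is to reduce each image to a single representative color and sort by hue. The sketch below shows that idea only. It is not necessarily how faces.js does it, and the `{ uid, rgb }` shape (an average color per avatar) is an assumption for illustration.

```javascript
// Convert an [r, g, b] color (each 0-255) to a hue angle in degrees (0-360).
function rgbToHue([r, g, b]) {
  const max = Math.max(r, g, b);
  const min = Math.min(r, g, b);
  const delta = max - min;
  if (delta === 0) return 0; // grey has no hue, so fall back to 0
  let hue;
  if (max === r) hue = ((g - b) / delta) % 6;
  else if (max === g) hue = (b - r) / delta + 2;
  else hue = (r - g) / delta + 4;
  hue *= 60;
  return hue < 0 ? hue + 360 : hue;
}

// Return a new array of avatars ordered by hue: red, yellow, green, blue, ...
// ASSUMPTION: each avatar is { uid, rgb } where rgb is its average color.
function sortByHue(avatars) {
  return [...avatars].sort((a, b) => rgbToHue(a.rgb) - rgbToHue(b.rgb));
}

const avatars = [
  { uid: 1, rgb: [0, 0, 255] }, // blue
  { uid: 2, rgb: [255, 0, 0] }, // red
  { uid: 3, rgb: [0, 255, 0] }, // green
];
console.log(sortByHue(avatars).map((a) => a.uid)); // [ 2, 3, 1 ]
```

Computing the average color requires reading the pixels of every image, which is why the images need to be available locally first.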
Further reading
- "One Metric that Matters": http://leananalyticsbook.com/one-metric-that-matters/ (discuss at: https://github.com/dwyl/hq/issues/149)
- Actionable Insights: http://online-metrics.com/actionable-insights/
P.S.: we prefer counting the other type of stars, but for now this is a great start. 😉
[1] Note: while dwyl's "mission" is not simply to produce good learning materials, we think that having good tutorials is essential for our mission! If other people find our tutorials useful and contribute to improving them, then everyone benefits, not just the members of the dwyl team building the dwyl "product". #WinWin