erikbern / Git Of Theseus

Licence: apache-2.0
Analyze how a Git repo grows over time

Some scripts to analyze Git repos. They produce cool-looking graphs like this one (generated by running it on Git itself):

[Figure: example graph for the Git repository]

Installing

Run pip install git-of-theseus

Running

First, run git-of-theseus-analyze <path to repo> (see git-of-theseus-analyze --help for the configuration options). This analyzes the repository and might take quite some time.

After that, you can generate plots! Some examples:

  1. Run git-of-theseus-stack-plot cohorts.json to create a stack plot showing the total amount of code broken down into cohorts (the year the code was added)
  2. Run git-of-theseus-line-plot authors.json --normalize to show a plot of the % of code contributed by the top 20 authors
  3. Run git-of-theseus-survival-plot survival.json to plot how long lines of code survive

You can pass --help to any of these commands to see its options.

If you want to plot multiple repositories, you have to run git-of-theseus-analyze separately for each project and store the data in separate directories using the --outdir flag. Then you can run git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json> (optionally with the --exp-fit flag to fit an exponential decay).
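The multi-repository workflow above can be scripted. The following is a minimal sketch, not part of the project: the helper names build_commands and run_all are hypothetical, and it only uses the commands and flags mentioned above (--outdir and --exp-fit). It assumes the analyze step writes survival.json into the chosen output directory.

```python
import subprocess


def build_commands(repos, exp_fit=False):
    """Build the command lines for analyzing several repos and plotting them together.

    `repos` maps an output directory name to the path of a repository.
    Returns a list of argv lists: one analyze command per repo, then one plot command.
    """
    commands = []
    survival_files = []
    for outdir, repo_path in repos.items():
        # One analysis run per project, each writing into its own directory.
        commands.append(["git-of-theseus-analyze", repo_path, "--outdir", outdir])
        # Assumption: the analysis leaves survival.json inside that directory.
        survival_files.append(f"{outdir}/survival.json")
    plot = ["git-of-theseus-survival-plot", *survival_files]
    if exp_fit:
        plot.append("--exp-fit")
    commands.append(plot)
    return commands


def run_all(commands):
    """Run the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(build_commands({"foo": "/path/to/foo", "bar": "/path/to/bar"}, exp_fit=True)) would analyze both repositories and then draw one survival plot for the pair. The paths here are placeholders.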

Help

If you see AttributeError: Unknown property labels, upgrade matplotlib: pip install matplotlib --upgrade

Some pics

Survival of a line of code in a set of interesting repos:

[Figure: survival curves for several repositories]

This curve is produced by the git-of-theseus-survival-plot script and shows the percentage of lines in a commit that are still present after x years. The curve aggregates over all commits, regardless of when they were made. So for x=0 it includes all commits, whereas for x>0 only commits that are at least x years old can be counted (for the others we would have to look into the future). The survival curves are estimated using Kaplan-Meier.
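To make the Kaplan-Meier idea concrete, here is a minimal pure-Python sketch of the estimator. It is an illustration under stated assumptions, not the code the script uses: the function name kaplan_meier and the input format (one pair per line of code, giving its observed age in years and whether it was removed) are hypothetical. Lines that still exist at the end of the history are treated as censored: they count toward the "at risk" population up to their age but never count as removals.

```python
from collections import Counter


def kaplan_meier(observations):
    """Estimate a survival curve from (duration, removed) pairs.

    duration: how long the line was observed (e.g. in years).
    removed:  True if the line was deleted at that age, False if it was
              still present when observation ended (censored).
    Returns a list of (time, survival_probability) at each removal time.
    """
    observations = list(observations)
    removals = Counter(t for t, removed in observations if removed)
    leaving = Counter(t for t, _ in observations)  # removed or censored at t

    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = removals.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk lines that survive past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everything observed up to exactly t drops out of the risk set.
        at_risk -= leaving[t]
    return curve
```

Censoring is what handles the "look into the future" problem: a line from a recent commit contributes information only for the ages at which it could actually be observed.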

You can also add an exponential fit:

[Figure: survival curves with exponential fit]
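An exponential fit models survival as S(t) = exp(-k t), which also gives a half-life of ln(2)/k for a line of code. The sketch below shows one simple way to estimate k, using least squares on log S(t) with the line forced through the origin. This is an assumed method chosen for illustration and may differ from what the --exp-fit flag does internally. The function names are hypothetical.

```python
import math


def fit_exponential_decay(times, survival):
    """Fit S(t) = exp(-k * t) and return the decay rate k.

    Uses least squares on log(S) with no intercept, so that S(0) = 1.
    Points with t <= 0 or S <= 0 carry no usable information and are skipped.
    """
    num = 0.0
    den = 0.0
    for t, s in zip(times, survival):
        if t > 0 and s > 0:
            num += t * math.log(s)
            den += t * t
    if den == 0:
        raise ValueError("need at least one point with t > 0 and survival > 0")
    return -num / den


def half_life(k):
    """Time after which half of the lines are expected to be gone."""
    return math.log(2) / k if k > 0 else math.inf
```

For example, a curve that halves every year, such as survival values 1, 0.5, 0.25, 0.125 at years 0 through 3, yields k close to ln(2) and a half-life close to one year.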

Linux – stack plot:

[Figure: stack plot for Linux]

This curve is produced by the git-of-theseus-stack-plot script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
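The cohort breakdown can be pictured as a simple grouping step. The following sketch assumes you already know, for every line present in a snapshot of the repo, the date on which it was added (for instance from blame information). The function name cohort_counts and the ISO date-string input are hypothetical choices for illustration, not the format the scripts use.

```python
from collections import Counter
from datetime import datetime


def cohort_counts(line_dates):
    """Count lines per cohort, where a cohort is the year a line was added.

    line_dates: iterable of ISO date strings such as "2015-03-01",
                one per line of code present in the snapshot.
    Returns a dict mapping year -> number of lines, ordered by year.
    """
    years = Counter(datetime.fromisoformat(d).year for d in line_dates)
    return dict(sorted(years.items()))
```

Repeating this for snapshots taken at successive points in time gives one layer per year, which is what a stack plot draws on top of each other.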

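Node – stack plot:

[Figure: stack plot for Node]

Rails – stack plot:

[Figure: stack plot for Rails]

Tensorflow – stack plot:

[Figure: stack plot for Tensorflow]

Rust – stack plot:

[Figure: stack plot for Rust]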

Plotting other stuff

git-of-theseus-analyze will write exts.json, cohorts.json and authors.json. You can run git-of-theseus-stack-plot authors.json to plot author statistics as well, or git-of-theseus-stack-plot exts.json to plot file extension statistics. For author statistics, you might want to create a .mailmap file to deduplicate authors. For instance, here are the author statistics for Kubernetes:

[Figure: author statistics for Kubernetes]

You can also normalize it to 100%. Here are the author statistics for Git:

[Figure: normalized author statistics for Git]
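Normalizing to 100% means that, at every point in time, each author's line count is divided by the total across all authors. Here is a minimal sketch of that calculation. The function name and the dict-of-lists input are hypothetical and do not reflect the layout of authors.json.

```python
def normalize_to_percent(series):
    """Convert absolute counts to percentage shares at each time point.

    series: dict mapping a label (e.g. an author) to a list of counts,
            one per time point; all lists must have the same length.
    Returns a dict with the same labels and lists of percentages.
    """
    labels = list(series)
    n_points = len(series[labels[0]]) if labels else 0
    result = {label: [] for label in labels}
    for i in range(n_points):
        total = sum(series[label][i] for label in labels)
        for label in labels:
            # An empty snapshot has no meaningful shares; report 0 for everyone.
            share = 100.0 * series[label][i] / total if total else 0.0
            result[label].append(share)
    return result
```

With counts of 30 and 10 lines for two authors at one point in time, the shares come out as 75% and 25%, so the stacked layers always sum to 100.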

Other stuff

Markovtsev Vadim implemented a very similar analysis that claims to be anywhere from 20% to 6x faster than Git of Theseus. It's named Hercules, and there's a great blog post about all the complexity that goes into analyzing Git history.