stared / Tagoverflow

An interactive map of Stack Exchange tags for all sites.

TagOverflow

An interactive map of tags from Stack Exchange sites. Click here for the live version! As of now it looks more or less like:

[Image: Screenshot Dev]

History

It is a continuation of my older project, Tag Graph Map of Stack Exchange, which met with a warm reception from the Stack Exchange community (see e.g. here and here; I even got t-shirts from the SE team!).

Main ingredients

What's there?

Each question on a Stack Exchange site has one to five tags describing its content. Unlike on Twitter, these tags are well curated (to the point that you can earn a taxonomist badge).

Nodes represent the most popular tags. The area of each node is proportional to the number of questions carrying that tag.

Edges represent relations between tags. Their width is related to the number of questions with both tags (e.g. with both python and list), while their shade shows how much more often the two tags co-occur than one would expect by chance. The default coloring comes from community detection, i.e. automated splitting of a graph into densely connected subgraphs.

You can click on a tag to get additional data, like users who have asked or answered a lot of questions, along with the best questions with this tag. (Who knows, maybe you are one of the top guys and gals?)

Moreover, especially for Stack Overflow, which is a big place, you can draw conditional graphs. That is, you consider only questions with a given tag (e.g. javascript). For example, it will count only those occurrences of array that come together with javascript. The conditioning tag itself does NOT appear in the graph, for the same reason that the site name does not appear as a tag.
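
The conditional counting described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `questions` array (each question given as an array of tags) and the function name `conditionalCounts` are assumptions made up for the example.

```javascript
// Hypothetical input: each question is represented by its array of tags.
const questions = [
  ["javascript", "array", "sorting"],
  ["javascript", "array"],
  ["python", "array", "list"],
  ["javascript", "jquery"],
];

// Count tags and tag pairs, looking only at questions that carry `given`.
// The conditioning tag itself is dropped, since it would appear everywhere.
function conditionalCounts(questions, given) {
  const tagCount = new Map();
  const pairCount = new Map();
  let total = 0;
  for (const tags of questions) {
    if (!tags.includes(given)) continue;
    total += 1;
    // Sorted copy, so that each pair always gets the same "a|b" key.
    const rest = tags.filter((t) => t !== given).sort();
    for (let i = 0; i < rest.length; i++) {
      tagCount.set(rest[i], (tagCount.get(rest[i]) || 0) + 1);
      for (let j = i + 1; j < rest.length; j++) {
        const key = rest[i] + "|" + rest[j];
        pairCount.set(key, (pairCount.get(key) || 0) + 1);
      }
    }
  }
  return { total, tagCount, pairCount };
}

const result = conditionalCounts(questions, "javascript");
console.log(result.total, result.tagCount.get("array"));
```

Here only the three javascript questions are counted, so array is counted twice even though it appears in three questions overall.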

Methods and tricks

The co-occurrence weight (used for edge shade and strength) is calculated from the observed-to-expected ratio. It goes as follows:

oe_ratio = (all_questions_count * tag_count_AB) / (tag_count_A * tag_count_B)

It is exactly 1 when two tags co-occur just as often as expected by chance, that is, when they are independent. If it is greater than 1, the tags "like" each other and I draw an edge. (I ignore the pair when oe_ratio is less than one, i.e. when the tags avoid each other.) Believe me, this measure is much better than making correlations of some vectors (I tried).
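
As a small worked example of the ratio, here is a sketch in JavaScript. The function name `oeRatio`, the `drawEdge` helper and all the numbers are made up for illustration; only the formula itself comes from the text above.

```javascript
// Observed-to-expected ratio for a pair of tags A and B.
// The arguments mirror the quantities in the formula above.
function oeRatio(allQuestionsCount, tagCountA, tagCountB, tagCountAB) {
  if (tagCountA === 0 || tagCountB === 0) return 0; // no data, so no edge
  return (allQuestionsCount * tagCountAB) / (tagCountA * tagCountB);
}

// Made-up numbers: 1000 questions, 100 with tag A, 50 with tag B.
// If A and B were independent we would expect 100 * 50 / 1000 = 5
// questions carrying both tags.
const independent = oeRatio(1000, 100, 50, 5); // as often as by chance
const attracted = oeRatio(1000, 100, 50, 20);  // more often than by chance
const avoiding = oeRatio(1000, 100, 50, 1);    // less often than by chance

// An edge is drawn only when the tags co-occur more often than expected.
const drawEdge = (ratio) => ratio > 1;

console.log(independent, attracted, avoiding);
console.log(drawEdge(independent), drawEdge(attracted), drawEdge(avoiding));
```

With these numbers only the second pair gets an edge: its tags appear together four times more often than independence would predict.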

The limit of 100 questions comes from the API limit. However, for dynamic graphs it is also a sane limit. For most sites 32 tags should be enough, except for a few sites that are bigger.

In any case, it does a lot of queries and (from time to time) Stack Exchange may block you. Don't worry, it lasts only for a few minutes.

Positions of the nodes come from the D3.js force layout. That is, nodes connected via an edge attract each other. The strength of such attraction depends on the strength of the edge. Plus, all nodes repel each other at a short distance to prevent overlaps.
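
To make the idea concrete, here is a toy version of one layout step in plain JavaScript. It is not the D3.js implementation and not the project's code; `forceStep`, `attraction` and `minDist` are hypothetical names, and the update rule is deliberately simplified.

```javascript
// One step of a toy force layout (a simplification, not D3.js itself).
// Nodes joined by an edge are pulled together in proportion to edge weight;
// every pair of nodes closer than `minDist` is pushed apart.
function forceStep(nodes, edges, { attraction = 0.1, minDist = 1 } = {}) {
  const delta = nodes.map(() => ({ x: 0, y: 0 }));
  for (const { source, target, weight } of edges) {
    const dx = nodes[target].x - nodes[source].x;
    const dy = nodes[target].y - nodes[source].y;
    delta[source].x += attraction * weight * dx;
    delta[source].y += attraction * weight * dy;
    delta[target].x -= attraction * weight * dx;
    delta[target].y -= attraction * weight * dy;
  }
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const dx = nodes[j].x - nodes[i].x;
      const dy = nodes[j].y - nodes[i].y;
      const dist = Math.hypot(dx, dy) || 1e-9; // avoid division by zero
      if (dist >= minDist) continue; // repulsion acts only at short range
      const push = (minDist - dist) / (2 * dist);
      delta[i].x -= push * dx;
      delta[i].y -= push * dy;
      delta[j].x += push * dx;
      delta[j].y += push * dy;
    }
  }
  return nodes.map((n, i) => ({ x: n.x + delta[i].x, y: n.y + delta[i].y }));
}

// Nodes 0 and 1 share an edge; node 2 sits too close to node 0.
const before = [{ x: 0, y: 0 }, { x: 10, y: 0 }, { x: 0, y: 0.5 }];
const after = forceStep(before, [{ source: 0, target: 1, weight: 1 }]);
console.log(after);
```

After one step the connected pair has moved closer together, while the two overlapping nodes have moved apart. A real layout repeats such steps many times with a decreasing step size.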

For community detection I wrote a greedy hierarchical modularity maximization (as in arXiv:cond-mat/0408187). (AFAIK there is no other JavaScript implementation of community detection; if there is a need, I would be happy to implement something more serious like Louvain or Infomap. If you want it to happen, a few encouraging e-mails will work. :) EDIT: there is a good implementation of Louvain in JS.)
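
For readers unfamiliar with the method, here is a naive sketch of greedy modularity maximization in plain JavaScript. It is not the project's implementation and it skips the efficient data structures of the cited paper; the function name `greedyCommunities` and the input format (node count plus a list of `[u, v, weight]` edges) are assumptions for this example.

```javascript
// Naive greedy modularity maximization: start with every node in its own
// community and keep merging the pair that increases modularity the most,
// until no merge gives a positive gain.
// edges: array of [u, v, weight] with u !== v; nodes are numbered 0..n-1.
function greedyCommunities(n, edges) {
  const total = edges.reduce((s, [, , w]) => s + w, 0);
  // e[i][j]: fraction of edge ends that link community i to community j
  const e = Array.from({ length: n }, () => new Array(n).fill(0));
  for (const [u, v, w] of edges) {
    e[u][v] += w / (2 * total);
    e[v][u] += w / (2 * total);
  }
  // a[i]: fraction of edge ends attached to community i
  const a = e.map((row) => row.reduce((s, x) => s + x, 0));
  const members = Array.from({ length: n }, (_, i) => [i]);
  const alive = new Set(members.map((_, i) => i));

  while (true) {
    let best = null;
    for (const i of alive) {
      for (const j of alive) {
        if (i >= j || e[i][j] === 0) continue; // only connected pairs
        const dQ = 2 * (e[i][j] - a[i] * a[j]); // modularity gain of merging
        if (dQ > 0 && (best === null || dQ > best.dQ)) best = { i, j, dQ };
      }
    }
    if (best === null) break;
    const { i, j } = best;
    // Merge community j into community i.
    for (const k of alive) {
      if (k === i || k === j) continue;
      e[i][k] += e[j][k];
      e[k][i] += e[k][j];
    }
    e[i][i] += e[j][j] + 2 * e[i][j];
    a[i] += a[j];
    members[i].push(...members[j]);
    alive.delete(j);
  }
  return [...alive].map((i) => members[i].slice().sort((x, y) => x - y));
}

// Two triangles joined by a single edge (2-3).
const triangles = [
  [0, 1, 1], [0, 2, 1], [1, 2, 1],
  [3, 4, 1], [3, 5, 1], [4, 5, 1],
  [2, 3, 1],
];
console.log(greedyCommunities(6, triangles));
```

On this small graph the procedure stops with the two triangles as separate communities, because merging them across the single bridging edge would lower modularity.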

There are some tricks. For example, to calculate tag statistics (e.g. average number of answers per tag) it is infeasible to probe all questions, and there is no REST API to get these numbers directly. So, it takes the 100 newest questions with a given tag that are at least a month old (so their scores stabilize a bit).
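
A request of this kind could look roughly like the sketch below. It is an assumption about how such a query might be built, not the project's code: the function names, the 30-day cutoff and the API version in the URL are illustrative, while `tagged`, `todate`, `sort`, `order`, `pagesize` and `site` are parameters of the public Stack Exchange `/questions` route.

```javascript
// Build a Stack Exchange API URL for the newest questions with `tag`
// that are at least `minAgeDays` old. Names and defaults are illustrative.
function newestSettledQuestionsUrl(tag, site, now = new Date(), minAgeDays = 30) {
  // The API expects dates as Unix timestamps in seconds.
  const toDate = Math.floor(now.getTime() / 1000) - minAgeDays * 24 * 60 * 60;
  const url = new URL("https://api.stackexchange.com/2.3/questions");
  url.search = new URLSearchParams({
    site,
    tagged: tag,
    sort: "creation",
    order: "desc",
    todate: String(toDate),
    pagesize: "100",
  }).toString();
  return url.toString();
}

// Average of a numeric field (e.g. "answer_count" or "score") over the
// `items` array of an API response.
function averageOf(items, field) {
  if (items.length === 0) return NaN;
  return items.reduce((sum, q) => sum + q[field], 0) / items.length;
}

console.log(newestSettledQuestionsUrl("python", "stackoverflow"));
console.log(averageOf([{ answer_count: 1 }, { answer_count: 3 }], "answer_count"));
```

The averages computed this way are estimates from a sample of 100 questions, not exact statistics for the whole tag.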

The best askers and answerers, unfortunately, do not work properly for conditional tags (as the respective API queries can be done only for a single tag).

As tag statistics (like average score) have long tails but can also be zero or negative, neither a linear nor a log scale fits. So, Marta built an asinh scale! In short, for small values it works as a linear scale and for large values as a logarithmic one, and it is antisymmetric.
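
The behaviour of such a scale is easy to check numerically. The snippet below is only a sketch of the idea, using the built-in `Math.asinh`; the function name `asinhScale` and the `linearWidth` parameter are hypothetical and need not match the scale used in the project.

```javascript
// asinh-based scale: roughly linear near zero, logarithmic for large |x|,
// and antisymmetric, so zero and negative values are handled as well.
// `linearWidth` (hypothetical) sets where the behaviour changes.
function asinhScale(x, linearWidth = 1) {
  return Math.asinh(x / linearWidth);
}

console.log(asinhScale(0));                     // zero maps to zero
console.log(asinhScale(0.01));                  // close to 0.01 (linear regime)
console.log(asinhScale(1000), Math.log(2000));  // close to log(2x) (log regime)
console.log(asinhScale(-5), -asinhScale(5));    // antisymmetric
```

For large x, asinh(x) approaches log(2x), which is why long tails get compressed just as on a log scale.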

On code quality

Before looking at the code: beware, when you gaze long into the code, the code also gazes into you!

(Some excuses: I started it a long time ago, changed it in various directions, and used it both to learn JS and to teach JS, so it has most of the bad practices it could get. I should rewrite it completely in AngularJS + D3.js; but instead, I decided to show the result, hoping that you will forgive me the dirty code.)

Nonetheless, if something does not work, raise an Issue or (even better!) propose a Pull Request.

Citing

Feel free to use it for anything. Just please refer to it as:

And for any academic papers, please cite:

  • Piotr Migdał, Symmetries and self-similarity of many-body wavefunctions, PhD Thesis (ICFO), arXiv:1412.6796

(If you are wondering about the relation of my PhD thesis to this project: well, one of its main topics is community detection. While introducing basic methods, I use TagOverflow as an example.)
