rafaelkallis / ticket-tagger

License: AGPL-3.0 (see LICENSE.md and COPYING.md)
Machine learning driven issue classification bot.

Programming Languages

JavaScript
Nunjucks

Ticket Tagger

Machine learning driven issue classification bot. Add to your repository now!

Installation

Visit our GitHub App page and install the app on your repository.

License

Ticket Tagger is licensed under the GNU Affero General Public License. Every file should include a license header; if one is missing, the following applies:

Ticket Tagger automatically predicts and labels issue types.
Copyright (C) 2018-2021  Rafael Kallis

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>. 

Carefully read the full license agreement.

"... [The AGPL-3.0 license] requires the operator of a network server to provide the source code of the modified version running there to the users of that server."

Derivative Work

References

Development

notice:

  • Node.js ^12.x is required to compile/install dependencies
  • wget is required for fetching datasets
  • we recommend at least 8 GB of RAM if you want to train or benchmark the model

get started:

git clone https://github.com/rafaelkallis/ticket-tagger ticket-tagger
cd ticket-tagger

# install appropriate nodejs version
npx nave use 12

# compile/install dependencies
npm install

# fetch dataset
npm run dataset

# run benchmark
npm run benchmark

# run linter
npm run lint

# run tests
npm test

# run server
NODE_ENV="development" npm start

confounding factors:

Impact of Label Distribution

# balanced distribution
npm run dataset:balanced
npm run benchmark

# unbalanced distribution
npm run dataset:unbalanced
npm run benchmark
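
To check which distribution a downloaded dataset actually has, the labels can be counted directly. The sketch below assumes the dataset is in fastText format, one example per line, each line starting with __label__<name> (the format produced by the BigQuery query in this README). The helper names labelDistribution and isBalanced are hypothetical and not part of Ticket Tagger.

```javascript
// Count how often each fastText label occurs in a dataset.
// Assumption: one example per line, each starting with "__label__<name> ".
function labelDistribution(datasetText) {
  const counts = new Map();
  for (const line of datasetText.split("\n")) {
    const match = /^__label__(\S+)/.exec(line);
    if (!match) continue; // skip blank or malformed lines
    counts.set(match[1], (counts.get(match[1]) || 0) + 1);
  }
  return counts;
}

// A dataset counts as balanced here when every label has the same number of examples.
function isBalanced(counts) {
  return new Set(counts.values()).size <= 1;
}

// Tiny made-up sample, for illustration only.
const sampleDataset = [
  "__label__bug app crashes on start",
  "__label__bug null pointer in parser",
  "__label__enhancement add dark mode",
  "__label__question how do I configure the port?",
].join("\n");

const sampleCounts = labelDistribution(sampleDataset);
console.log(Object.fromEntries(sampleCounts));
console.log(isBalanced(sampleCounts)); // false
```

For a real dataset, read the file contents with fs.readFileSync and pass the text to labelDistribution. The file location depends on where the dataset scripts store it, which is not stated here.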

Impact of Function Words

npm run dataset:balanced
npm run benchmark

Impact of Language Consistency in Issue Tickets

# baseline
npm run dataset:english:baseline
npm run benchmark

# english
npm run dataset:english
npm run benchmark

Presence of Code Snippets in Issue Tickets

# baseline
npm run dataset:nosnip:baseline
npm run benchmark

# no snippets
npm run dataset:nosnip
npm run benchmark
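
How the nosnip dataset was produced is not shown in this README. As a rough illustration of what removing code snippets from an issue body could look like, the sketch below drops Markdown fenced blocks and inline code spans. The function name stripCodeSnippets and the regular expressions are assumptions made for this example, not the project's actual preprocessing.

```javascript
// Three backticks, built programmatically so this example stays easy to embed.
const FENCE = "`".repeat(3);

// Hypothetical helper: remove Markdown code from an issue body.
function stripCodeSnippets(body) {
  const fencedBlock = new RegExp(FENCE + "[\\s\\S]*?" + FENCE, "g");
  return body
    .replace(fencedBlock, " ") // fenced code blocks, handled first
    .replace(/`[^`\n]*`/g, " ") // inline code spans
    .replace(/\s+/g, " ") // collapse leftover whitespace
    .trim();
}

// Made-up issue body, for illustration only.
const exampleBody =
  "Calling `parse()` throws:\n" + FENCE + "js\nparse(null);\n" + FENCE + "\nPlease fix.";

console.log(stripCodeSnippets(exampleBody));
```

Unclosed fences and indented code blocks are not handled; a Markdown parser would be the more robust choice for real data.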

generate dataset:

Datasets can be downloaded using either npm run dataset:balanced or npm run dataset:unbalanced. The datasets were generated from GitHub Archive data, which can be accessed through Google BigQuery.

Paste the query below into your BigQuery console and adjust it if needed (e.g., resample issues to create a balanced dataset).

-- unbalanced dataset

SELECT
  CONCAT('__label__', label, ' ', title, ' ', REGEXP_REPLACE(body, '(\r|\n|\r\n)',' '))
FROM (
  SELECT
    LOWER(JSON_EXTRACT_SCALAR(payload, '$.issue.labels[0].name')) AS label,
    JSON_EXTRACT_SCALAR(payload, '$.issue.title') AS title,
    JSON_EXTRACT_SCALAR(payload, '$.issue.body') AS body
  FROM
    `githubarchive.day.201802*`
  WHERE
    _TABLE_SUFFIX BETWEEN '01' AND '10'
    AND type = 'IssuesEvent'
    AND JSON_EXTRACT_SCALAR(payload, '$.action') = 'closed' )
WHERE 
  (label = 'bug' OR label = 'enhancement' OR label = 'question')
  AND body != 'null';
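
Each row returned by the query is one training example in fastText format: the label prefixed with __label__, followed by the title and the body with line breaks replaced by spaces. The sketch below mirrors that transformation for a single issue in JavaScript. The function name toFastTextLine and the shape of the issue object are assumptions made for this example.

```javascript
// Labels kept by the query's outer WHERE clause.
const ALLOWED_LABELS = new Set(["bug", "enhancement", "question"]);

// Hypothetical helper: build one fastText training line from an issue.
// Returns null for issues the query would filter out.
function toFastTextLine(issue) {
  const label = String(issue.label || "").toLowerCase();
  if (!ALLOWED_LABELS.has(label) || issue.body == null) return null;
  // Like the query's REGEXP_REPLACE, every \r and every \n becomes one space.
  const body = issue.body.replace(/\r|\n/g, " ");
  return "__label__" + label + " " + issue.title + " " + body;
}

// Made-up issue, for illustration only.
console.log(
  toFastTextLine({
    label: "Bug",
    title: "Crash on startup",
    body: "Steps to reproduce:\n1. open the app",
  })
);
```

Here label stands for the first label of the issue, matching $.issue.labels[0].name in the query.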

run serverless app:

You need a .env file to run the GitHub App locally. The file should look like this:

GITHUB_CERT="<private key>"
GITHUB_SECRET=123456
GITHUB_APP_ID=123
PORT=3000

Note: When running the app in production, the environment variables should be provided by the host.
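
A missing variable is easier to diagnose before the server starts than after. The sketch below parses .env-style text and reports which of the four variables listed above are absent. It is a hand-rolled illustration: how Ticket Tagger actually loads its configuration is not shown here, and treating all four keys as required is an assumption. The names parseEnv and missingKeys are hypothetical.

```javascript
// Keys taken from the example .env file above (assumed to all be required).
const REQUIRED_KEYS = ["GITHUB_CERT", "GITHUB_SECRET", "GITHUB_APP_ID", "PORT"];

// Parse KEY=value lines; blank lines and "#" comments are ignored,
// and one pair of surrounding double quotes is stripped from values.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
      value = value.slice(1, -1);
    }
    env[key] = value;
  }
  return env;
}

// Return the required keys that are missing or empty.
function missingKeys(env) {
  return REQUIRED_KEYS.filter((key) => !env[key]);
}

const exampleEnv = parseEnv('GITHUB_SECRET=123456\nGITHUB_APP_ID=123\nPORT=3000');
console.log(missingKeys(exampleEnv)); // [ 'GITHUB_CERT' ]
```

In production the same check can be run against process.env instead of a parsed file.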
