Licence: MIT license
Programming Language Detector - When you input code, Shaman detects its language



Shaman - Programming Language Detector

When you input code, Shaman detects its language.

Languages supported: ASP, Bash, C, C#, CSS, HTML, Java, JavaScript, Objective-C, PHP, Python, Ruby, SQL, Swift, and XML.

Shaman combines a Bayes classifier with predefined regular-expression patterns. A pre-trained model is included in the library; its size is 214 KB.
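To illustrate the general idea of pairing a Bayes classifier with regex patterns, here is a minimal sketch. It is not Shaman's actual implementation: the class `TinyBayes`, the `features` helper, and the three patterns are hypothetical, and the scoring is a simplified naive Bayes that only looks at features present in the input.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-written patterns; Shaman's real pattern set is not shown here.
PATTERNS = {
    "c_include": re.compile(r"#include\s*<\w+\.h>"),
    "py_def": re.compile(r"^\s*def \w+\(.*\):", re.MULTILINE),
    "semicolon_eol": re.compile(r";\s*$", re.MULTILINE),
}
TOKEN = re.compile(r"[A-Za-z_]\w*")


def features(code):
    """Return the set of features present: identifiers plus matched pattern names."""
    found = set(TOKEN.findall(code))
    found.update(f"<{name}>" for name, rx in PATTERNS.items() if rx.search(code))
    return found


class TinyBayes:
    """Simplified naive Bayes over feature presence (illustrative only)."""

    def fit(self, samples):
        # samples: iterable of (language, code) pairs
        self.docs = Counter()
        self.counts = defaultdict(Counter)
        for lang, code in samples:
            self.docs[lang] += 1
            self.counts[lang].update(features(code))
        return self

    def detect(self, code):
        total = sum(self.docs.values())
        feats = features(code)
        scores = {}
        for lang, n in self.docs.items():
            score = math.log(n / total)  # log prior
            for f in feats:
                # Laplace-smoothed probability that a sample of `lang` contains f
                score += math.log((self.counts[lang][f] + 1) / (n + 2))
            scores[lang] = score
        # Highest log-score first
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The regex matches are treated as ordinary features next to the identifiers, so a strong signal such as an `#include` line contributes to the score in the same way a keyword does.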

The included model reaches 78% accuracy on the test set and 83% on the training set. See the Accuracy section for details.

Getting Started

How to install

$ pip install shamanld

How to use

from shamanld import Shaman

code = """
#include <stdio.h>
int main() {
    printf("Hello world");
}
"""

r = Shaman.default().detect(code)

print(r)
# [('c', 42.60959840702781), ('objective-c', 8.535893087527496), ('java', 7.237626324587697), ...]
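If only the single most likely language is needed, the result list can be reduced with a small helper. This sketch assumes, based on the sample output above, that `detect` returns `(language, score)` tuples sorted by descending score. `best_guess` and `min_margin` are hypothetical names, and the scale of the scores is not documented here, so any margin value is something to tune rather than a recommendation.

```python
def best_guess(results, min_margin=0.0):
    """Return the top language, or None if there is no clear winner.

    `results` is assumed to be a list of (language, score) tuples sorted by
    descending score. `min_margin` is a hypothetical threshold on the gap
    between the first and second scores.
    """
    if not results:
        return None
    top_lang, top_score = results[0]
    if len(results) > 1 and top_score - results[1][1] < min_margin:
        return None  # the lead over the runner-up is too small
    return top_lang
```

With the sample output above, `best_guess(r)` would pick the first entry of the list.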

Test and train with your custom dataset

Shaman can be trained on your own dataset. The only preparation needed is a dataset in CSV format, where each row is a "language,code" pair.
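Because code samples contain commas, quotes, and newlines, it is safer to build the CSV with a CSV library than by string concatenation. The sketch below uses Python's standard `csv` module. `write_dataset` and `read_dataset` are hypothetical helpers, and whether Shaman's tools expect a header row is an assumption, so it is left as a switch.

```python
import csv


def write_dataset(path, samples, header=False):
    """Write (language, code) pairs as CSV rows.

    The csv module quotes fields containing commas, quotes, or newlines.
    `header` is an assumption: check whether the consuming tool expects one.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header:
            writer.writerow(["language", "code"])
        writer.writerows(samples)


def read_dataset(path):
    """Read the rows back as a list of (language, code) tuples."""
    with open(path, newline="", encoding="utf-8") as f:
        return [(row[0], row[1]) for row in csv.reader(f)]
```

Reading the file back with `read_dataset` is a quick way to confirm that multi-line samples survive the round trip.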

Test with custom dataset

$ shaman-tester path/to/test_set.csv

Training a new model with custom dataset

$ shaman-trainer path/to/training_set.csv --model-path path/to/your_model.json.gz

Testing custom model

$ shaman-tester path/to/test_set.csv --model-path path/to/your_model.json.gz

Using custom model on the code

from shamanld import Shaman

detector = Shaman('path/to/your_model.json.gz')
detector.detect('/* some code */')

Accuracy

The included model was trained on 120K code samples and tested on 42K code samples. Only samples whose length is greater than 100 were used in both training and testing. Because the samples were collected without verification, some of them may be mislabeled.

Language       Accuracy   Correct / Total
Total          78.40%     36428 / 46464
c              70.41%     11479 / 16304
java           90.24%     8094 / 8969
python         92.85%     5230 / 5633
javascript     63.08%     2782 / 4410
sql            80.92%     2519 / 3113
html           83.99%     2156 / 2567
c#             84.08%     1753 / 2085
xml            80.18%     635 / 792
bash           83.58%     560 / 670
swift          83.25%     522 / 627
php            73.09%     315 / 431
css            68.12%     203 / 298
objective-c    32.88%     121 / 368
asp            36.75%     43 / 117
ruby           20.00%     16 / 80
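The Total row is the pooled accuracy (all correct predictions divided by all samples), so languages with many samples such as C dominate it. The following sketch recomputes it from the per-language counts above and also computes the unweighted mean across languages for comparison. The names `COUNTS`, `pooled_accuracy`, and `mean_language_accuracy` are illustrative only.

```python
# (correct, total) per language, copied from the table above
COUNTS = {
    "c": (11479, 16304), "java": (8094, 8969), "python": (5230, 5633),
    "javascript": (2782, 4410), "sql": (2519, 3113), "html": (2156, 2567),
    "c#": (1753, 2085), "xml": (635, 792), "bash": (560, 670),
    "swift": (522, 627), "php": (315, 431), "css": (203, 298),
    "objective-c": (121, 368), "asp": (43, 117), "ruby": (16, 80),
}


def pooled_accuracy(counts):
    """Percentage of all samples classified correctly (count-weighted)."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return 100.0 * correct / total


def mean_language_accuracy(counts):
    """Unweighted mean of the per-language percentages."""
    return sum(100.0 * c / t for c, t in counts.values()) / len(counts)


print(f"pooled: {pooled_accuracy(COUNTS):.2f}%")
print(f"per-language mean: {mean_language_accuracy(COUNTS):.2f}%")
```

The per-language mean comes out lower than the pooled figure (roughly 70%), because the weakest languages (Ruby, Objective-C, ASP) have few samples and barely affect the pooled total.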

JavaScript version

A JavaScript implementation of inference is available at Prev/shamanjs.
