graykode / Toeicbert

License: MIT
TOEIC (Test of English for International Communication) problem solving using the pytorch-pretrained-BERT model.



TOEIC-BERT

76% correct rate on TOEIC with ONLY a pre-trained BERT model!!

This project tackles TOEIC (Test of English for International Communication) problem solving using the pytorch-pretrained-BERT model. I used Hugging Face's pytorch-pretrained-BERT because it makes pre-training and fine-tuning easy. Only the fill-in-the-blank questions are solved, not the whole test. There are two types of blank questions:

  1. Selecting Correct Grammar Type.
Q) The music teacher had me _ scales several times.
  1. play (Answer)
  2. to play
  3. played
  4. playing
  2. Selecting Correct Vocabulary Type.
Q) The wet weather _ her from going playing tennis.
  1. interrupted
  2. obstructed
  3. impeded
  4. discouraged (Answer)
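Both question types reduce to the same masked-language-modeling task: the blank `_` is replaced with BERT's [MASK] token and each candidate is scored at that position. A minimal sketch of the masking step (the helper name `to_masked` is hypothetical, not part of the package):

```python
def to_masked(question: str) -> str:
    """Replace the blank marker with BERT's [MASK] token."""
    return question.replace("_", "[MASK]")

print(to_masked("The music teacher had me _ scales several times."))
# The music teacher had me [MASK] scales several times.
```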

BERT Testing

  1. input
{
    "1" : {
        "question" : "Business experts predict that the upward trend is _ to continue until the end of next year.",
        "answer" : "likely",
        "1" : "potential",
        "2" : "likely",
        "3" : "safety",
        "4" : "seemed"
    }
}
  2. output
=============================
Question : Business experts predict that the upward trend is _ to continue until the end of next year.

Real Answer : likely

1) potential 2) likely 3) safety 4) seemed

BERT's Answer => [likely]
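The output above can be reproduced from an input entry with a small formatting helper (a hypothetical sketch, not the package's actual printing code):

```python
def show(entry: dict) -> str:
    """Format one question entry the way the tool prints it."""
    lines = ["=" * 29, f"Question : {entry['question']}", ""]
    if "answer" in entry:
        lines += [f"Real Answer : {entry['answer']}", ""]
    lines.append(" ".join(f"{i}) {entry[str(i)]}" for i in range(1, 5)))
    return "\n".join(lines)

entry = {
    "question": "Business experts predict that the upward trend is _ to continue "
                "until the end of next year.",
    "answer": "likely",
    "1": "potential", "2": "likely", "3": "safety", "4": "seemed",
}
print(show(entry))
```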

Why BERT?

A pre-trained BERT model already contains contextual information, so it can tell which candidate makes a sentence more contextually and grammatically natural, even when the difference is subtle. I was inspired by a grammar checker described in a blog post.

Can We Use BERT as a Language Model to Assign a Score to a Sentence?

BERT uses a bidirectional encoder to encapsulate a sentence from left to right and from right to left. Thus, it learns two representations of each word (one from left to right and one from right to left) and then concatenates them for many downstream tasks.

Evaluation

I evaluated with only the pre-trained BERT model (no fine-tuning) to check for grammatical or lexical errors. In the scoring expression, X is the question sentence, n indexes the answer candidates {a, b, c, d}, C_n is the set of tokens of candidate n (for example, C for "warranty" is ['warrant', '##y']), and V is the total vocabulary.

Candidates that tokenize into more than one token are a problem; I solved it by averaging the prediction values of each candidate's tokens, e.g. "is being formed" tokenizes to ['is', 'being', 'formed'].

Then, we take the argmax over L_n(T_n).
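The scoring expression (shown as an image in the original README) can be reconstructed from this description roughly as:

```latex
% average the model's [MASK]-position scores over each candidate's tokens,
% then pick the best-scoring candidate
L_n = \frac{1}{|C_n|} \sum_{t \in C_n} P(t \mid X), \qquad
\hat{n} = \operatorname*{argmax}_{n \in \{a, b, c, d\}} L_n
```

This is a reconstruction consistent with the code below, not a copy of the original image.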

# model : BertForMaskedLM from pytorch-pretrained-BERT
# question_tensors / segment_tensors : token ids and segment ids, with `_` replaced by [MASK]
predictions = model(question_tensors, segment_tensors)

# predictions : [batch_size, sequence_length, vocab_size]
# average the [MASK]-position scores over the candidate's word-piece ids
predictions_candidates = predictions[0, masked_index, candidate_ids].mean()
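To see the indexing in isolation, here is a runnable sketch with random scores standing in for the model output (the candidate word-piece ids are made up; in a real run they come from the tokenizer):

```python
import torch

torch.manual_seed(0)
vocab_size = 30522                            # size of the BERT uncased vocabulary
predictions = torch.randn(1, 12, vocab_size)  # dummy [batch, seq_len, vocab] scores
masked_index = 5                              # position of [MASK] in the token sequence

def candidate_score(candidate_ids):
    # same operation as above: average the [MASK]-position scores over the candidate's tokens
    return predictions[0, masked_index, candidate_ids].mean().item()

candidates = {"play": [2377], "to play": [2000, 2377]}  # hypothetical word-piece ids
best = max(candidates, key=lambda w: candidate_score(candidates[w]))
print(best)
```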

Result of Evaluation.

Fantastic results with only the pre-trained BERT model

  • bert-base-uncased: 12-layer, 768-hidden, 12-heads, 110M parameters
  • bert-large-uncased: 24-layer, 1024-hidden, 16-heads, 340M parameters
  • bert-base-cased: 12-layer, 768-hidden, 12-heads, 110M parameters
  • bert-large-cased: 24-layer, 1024-hidden, 16-heads, 340M parameters

Total 7067 questions; evaluation is deterministic with model.eval() (dropout disabled)

             bert-base-uncased  bert-base-cased  bert-large-uncased  bert-large-cased
Correct Num  5192               5398             5321                5148
Percent      73.46%             76.38%           75.29%              72.84%
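As a sanity check, the percentages can be reproduced from the correct counts (they are truncated, not rounded, to two decimal places):

```python
import math

total = 7067
correct = {
    "bert-base-uncased": 5192,
    "bert-base-cased": 5398,
    "bert-large-uncased": 5321,
    "bert-large-cased": 5148,
}
for name, n in correct.items():
    pct = math.floor(n / total * 10000) / 100  # truncate to two decimals
    print(f"{name}: {pct}%")                   # matches the table above
```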

Quick Start with Python pip Package.

Start with pip

$ pip install toeicbert

Run & Option

$ python -m toeicbert --model bert-base-uncased --file test.json
  • -m, --model : bert model name in huggingface's pytorch-pretrained-BERT: bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased.

  • -f, --file : json file to evaluate, in the format of test.json below.

    The keys question, 1, 2, 3, 4 are required; answer is optional.

    _ in question will be replaced with [MASK]

{
    "1" : {
        "question" : "The music teacher had me _ scales several times.",
        "answer" : "play",
        "1" : "play",
        "2" : "to play",
        "3" : "played",
        "4" : "playing"
    },
    "2" : {
        "question" : "The music teacher had me _ scales several times.",
        "1" : "play",
        "2" : "to play",
        "3" : "played",
        "4" : "playing"
    }
}
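Such a file can also be generated programmatically; for example (the file name test.json matches the command above):

```python
import json

questions = {
    "1": {
        "question": "The music teacher had me _ scales several times.",
        "answer": "play",
        "1": "play", "2": "to play", "3": "played", "4": "playing",
    },
    "2": {  # "answer" omitted: the key is optional
        "question": "The music teacher had me _ scales several times.",
        "1": "play", "2": "to play", "3": "played", "4": "playing",
    },
}
with open("test.json", "w") as f:
    json.dump(questions, f, indent=4)
```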

Author

Thanks to Hwan Suk Gang (Kyung Hee Univ.) for collecting the dataset (7,114 questions)
