malteos / Pytorch Bert Document Classification

Licence: mit

Enriching BERT with Knowledge Graph Embedding for Document Classification (PyTorch)

PyTorch BERT Document Classification

Implementation and pre-trained models of the paper Enriching BERT with Knowledge Graph Embedding for Document Classification (PDF). A submission to the GermEval 2019 shared task on hierarchical text classification. If you encounter any problems, feel free to contact us or submit a GitHub issue.

Content

CLI script to run all experiments
WikiData author embeddings (view on Tensorboard Projector)
Data preparation
Requirements
Trained model weights as release files

Model architecture

Installation

Requirements:

Python 3.6
CUDA GPU
Jupyter Notebook

Install dependencies:

pip install -r requirements.txt

Prepare data

GermEval data

Download from shared-task website: here
Run all steps in Jupyter Notebook: germeval-data.ipynb

Author Embeddings

python wikidata_for_authors.py run ~/datasets/wikidata/index_enwiki-20190420.db \
    ~/datasets/wikidata/index_dewiki-20190420.db \
    ~/datasets/wikidata/torchbiggraph/wikidata_translation_v1.tsv.gz \
    ~/notebooks/bert-text-classification/authors.pickle \
    ~/notebooks/bert-text-classification/author2embedding.pickle

# OPTIONAL: Projector format
python wikidata_for_authors.py convert_for_projector \
    ~/notebooks/bert-text-classification/author2embedding.pickle \
    extras/author2embedding.projector.tsv \
    extras/author2embedding.projector_meta.tsv
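For orientation, the TensorBoard Projector reads a vectors file with one tab-separated vector per row and a metadata file with one label per row. The following is a minimal sketch of such a conversion, not the code in wikidata_for_authors.py. It assumes the pickle holds a dict that maps an author name to a sequence of floats; that layout and the function names `load_embeddings` and `write_projector_files` are hypothetical.

```python
import csv
import pickle


def load_embeddings(pickle_path):
    """Load the embeddings; assumed (not confirmed) to be {author: vector}."""
    with open(pickle_path, "rb") as f:
        return pickle.load(f)


def write_projector_files(embeddings, vectors_path, meta_path):
    """Write one tab-separated vector per row and the matching label per row."""
    with open(vectors_path, "w", newline="", encoding="utf-8") as vec_f, \
         open(meta_path, "w", newline="", encoding="utf-8") as meta_f:
        writer = csv.writer(vec_f, delimiter="\t", lineterminator="\n")
        for author, vector in embeddings.items():
            writer.writerow([f"{float(x):.6f}" for x in vector])
            # Single-column metadata needs no header row in the Projector.
            meta_f.write(f"{author}\n")
    return len(embeddings)
```

Row i of the metadata file labels row i of the vectors file, so both files must be written in the same iteration order.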

Reproduce paper results

Download pre-trained models: GitHub releases

Available experiment settings

Detailed settings for each experiment can be found in cli.py.

task-a__bert-german_full
task-a__bert-german_manual_no-embedding
task-a__bert-german_no-manual_embedding
task-a__bert-german_text-only
task-a__author-only
task-a__bert-multilingual_text-only

task-b__bert-german_full
task-b__bert-german_manual_no-embedding
task-b__bert-german_no-manual_embedding
task-b__bert-german_text-only
task-b__author-only
task-b__bert-multilingual_text-only
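The names above follow a `<task>__<setting>` pattern. A small sketch of how they could be generated and split when scripting over all experiments is shown here; the helper `parse_experiment_name` and the `Experiment` tuple are hypothetical and are not part of cli.py.

```python
from typing import NamedTuple


class Experiment(NamedTuple):
    task: str     # "task-a" or "task-b"
    setting: str  # e.g. "bert-german_full"


# The six settings listed above, shared by both tasks.
SETTINGS = [
    "bert-german_full",
    "bert-german_manual_no-embedding",
    "bert-german_no-manual_embedding",
    "bert-german_text-only",
    "author-only",
    "bert-multilingual_text-only",
]

ALL_EXPERIMENTS = [f"{task}__{s}" for task in ("task-a", "task-b") for s in SETTINGS]


def parse_experiment_name(name: str) -> Experiment:
    """Split an experiment name at the first double underscore."""
    task, sep, setting = name.partition("__")
    if not sep or not task or not setting:
        raise ValueError(f"Not a '<task>__<setting>' name: {name!r}")
    return Experiment(task, setting)
```

Each entry of `ALL_EXPERIMENTS` can be passed as `<name>` to the cli.py commands.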

Environment variables

TRAIN_DF_PATH: Path to the training data, a pickled pandas DataFrame
GPU_ID: Run experiments on this GPU (used for CUDA_VISIBLE_DEVICES)
OUTPUT_DIR: Directory to store experiment output
EXTRAS_DIR: Directory where author embeddings and gender data are located
BERT_MODELS_DIR: Directory where pre-trained BERT models are located
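Before starting a long run, it can help to check that all of these variables are set. The following is a minimal sketch under the assumption that every variable listed above is required; the function name `read_settings` is hypothetical and does not exist in the repository.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "TRAIN_DF_PATH",
    "GPU_ID",
    "OUTPUT_DIR",
    "EXTRAS_DIR",
    "BERT_MODELS_DIR",
)


def read_settings(env=None):
    """Return the required variables as a dict, or fail listing the missing ones."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `read_settings()` without arguments reads from the current process environment; passing a dict makes the check easy to test.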

Validation set

python cli.py run_on_val <name> $GPU_ID $EXTRAS_DIR $TRAIN_DF_PATH $VAL_DF_PATH $OUTPUT_DIR --epochs 5

Test set

python cli.py run_on_test <name> $GPU_ID $EXTRAS_DIR $FULL_DF_PATH $TEST_DF_PATH $OUTPUT_DIR --epochs 5
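To run several experiments in sequence, the two commands above can be assembled programmatically. This sketch only mirrors the positional argument order shown in the commands; `build_command` is a hypothetical helper, and it assumes cli.py sits in the current working directory.

```python
import sys


def build_command(mode, name, gpu_id, extras_dir, train_df_path,
                  eval_df_path, output_dir, epochs=5):
    """Build the argument list for `cli.py run_on_val` or `cli.py run_on_test`."""
    if mode not in ("run_on_val", "run_on_test"):
        raise ValueError(f"Unknown mode: {mode!r}")
    return [
        sys.executable, "cli.py", mode, name,
        str(gpu_id), extras_dir,
        train_df_path,  # TRAIN_DF_PATH for validation, FULL_DF_PATH for test
        eval_df_path,   # VAL_DF_PATH for validation, TEST_DF_PATH for test
        output_dir,
        "--epochs", str(epochs),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with paths that contain spaces.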

Evaluation

The scores from the result table can be reproduced with the evaluation.ipynb notebook.

How to cite

If you are using our code, please cite our paper:

@inproceedings{Ostendorff2019,
    address = {Erlangen, Germany},
    author = {Ostendorff, Malte and Bourgonje, Peter and Berger, Maria and Moreno-Schneider, Julian and Rehm, Georg},
    booktitle = {Proceedings of the GermEval 2019 Workshop},
    title = {{Enriching BERT with Knowledge Graph Embedding for Document Classification}},
    year = {2019}
}

License

MIT
