
yixiu00001 / Lstm Crf Medical

A model for medical entity recognition, including dictionaries and corpus annotation, built with Python.

Projects that are alternatives of or similar to Lstm Crf Medical

Triplet Attention
Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]
Stars: ✭ 222 (-0.89%)
Mutual labels:  jupyter-notebook
Sohu competition
First-place solution to Sohu's 2018 content recognition competition
Stars: ✭ 224 (+0%)
Mutual labels:  jupyter-notebook
Rethinking Numpyro
Statistical Rethinking (2nd ed.) with NumPyro
Stars: ✭ 225 (+0.45%)
Mutual labels:  jupyter-notebook
Ai Platform Samples
Official Repo for Google Cloud AI Platform
Stars: ✭ 222 (-0.89%)
Mutual labels:  jupyter-notebook
Skylift
Wi-Fi Geolocation Spoofing with the ESP8266
Stars: ✭ 223 (-0.45%)
Mutual labels:  jupyter-notebook
Lstms for predictive maintenance
LSTMS for Predictive Maintenance
Stars: ✭ 224 (+0%)
Mutual labels:  jupyter-notebook
Deep Vector Quantization
VQVAEs, GumbelSoftmaxes and friends
Stars: ✭ 222 (-0.89%)
Mutual labels:  jupyter-notebook
Zoom Learn Zoom
computational zoom from raw sensor data
Stars: ✭ 224 (+0%)
Mutual labels:  jupyter-notebook
Dragonn
A toolkit to learn how to model and interpret regulatory sequence data using deep learning.
Stars: ✭ 222 (-0.89%)
Mutual labels:  jupyter-notebook
Tutorial
Tutorial covering Open Source tools for Source Separation.
Stars: ✭ 223 (-0.45%)
Mutual labels:  jupyter-notebook
Lfortran
Official mirror of https://gitlab.com/lfortran/lfortran. Please submit pull requests (PR) there. Any PR sent here will be closed automatically.
Stars: ✭ 220 (-1.79%)
Mutual labels:  jupyter-notebook
Machine Learning Notebooks
Machine Learning notebooks for refreshing concepts.
Stars: ✭ 222 (-0.89%)
Mutual labels:  jupyter-notebook
Sdc Vehicle Detection
Udacity Project - Vehicle Detection
Stars: ✭ 224 (+0%)
Mutual labels:  jupyter-notebook
Covid 19
Data science applied to the novel coronavirus pandemic.
Stars: ✭ 223 (-0.45%)
Mutual labels:  jupyter-notebook
Gan steerability
On the "steerability" of generative adversarial networks
Stars: ✭ 225 (+0.45%)
Mutual labels:  jupyter-notebook
Navigan
Navigating the GAN Parameter Space for Semantic Image Editing
Stars: ✭ 221 (-1.34%)
Mutual labels:  jupyter-notebook
Notebook
my note
Stars: ✭ 221 (-1.34%)
Mutual labels:  jupyter-notebook
Attention network with keras
An example attention network with simple dataset.
Stars: ✭ 225 (+0.45%)
Mutual labels:  jupyter-notebook
Deeplearning cv notes
📓 Deep learning and CV notes.
Stars: ✭ 223 (-0.45%)
Mutual labels:  jupyter-notebook
Machinelearningwithpython
Starter files for Pluralsight course: Understanding Machine Learning with Python
Stars: ✭ 224 (+0%)
Mutual labels:  jupyter-notebook

LSTM-CRF-medical

A model for medical entity recognition, including dictionaries and corpus annotation, built with Python.

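Dataset annotation

The dataset can be annotated with the help of the dictionaries: maximum matching locates each entity's position in the text, and the entity is then labeled with its type.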
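
As a rough illustration of dictionary-based annotation, the sketch below runs a forward maximum matching pass and converts the resulting spans into per-character tags. It is not taken from the repository: the function names, the `DISEASE`/`SYMPTOM`/`BODY` labels, the BIO tag scheme, and the exclusive end index are all assumptions made for the example.

```python
def forward_max_match(text, lexicon):
    """Greedy longest-match scan. `lexicon` maps term -> entity type (hypothetical layout)."""
    max_len = max((len(term) for term in lexicon), default=0)
    spans = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first so a long term wins over its own prefixes.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                match = (i, i + size, lexicon[candidate])  # end index is exclusive
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched entity
        else:
            i += 1
    return spans


def to_bio(text, spans):
    """Turn (start, end, label) spans into one BIO tag per character (assumed scheme)."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = "B-" + label
        for k in range(start + 1, end):
            tags[k] = "I-" + label
    return tags


# Toy lexicon built from dictionary samples; the label names are illustrative.
lexicon = {
    "结石性胆囊炎": "DISEASE",
    "胆囊炎": "DISEASE",
    "鼻塞": "SYMPTOM",
    "鼻子": "BODY",
}
text = "结石性胆囊炎伴鼻塞"
spans = forward_max_match(text, lexicon)
tags = to_bio(text, spans)
for char, tag in zip(text, tags):
    print(char, tag)
```

Because the scan always tries the longest candidate first, 结石性胆囊炎 is tagged as one entity rather than as the shorter 胆囊炎 nested inside it.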

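Dictionary construction

The dictionaries built so far cover diseases, symptoms, and body parts. The disease dictionary combines disease names and disease aliases crawled from the internet with ICD10 disease names, 39,615 entries after deduplication. The symptom dictionary holds symptom descriptions crawled from the internet, 7,457 entries after deduplication. The body-part dictionary holds body-part descriptions crawled from the internet, 1,929 entries after deduplication. Examples:

Disease names: 1型糖尿病性急性牙周脓肿 (type 1 diabetic acute periodontal abscess), 妊娠合并系统性红斑狼疮 (pregnancy complicated by systemic lupus erythematosus), 结石性胆囊炎 (calculous cholecystitis), 药物性股骨坏死 (drug-induced femoral necrosis), 晚期梅毒性脉络膜炎 (late syphilitic choroiditis), 腹型过敏性紫癜 (abdominal allergic purpura)

Symptoms: 胀痛 (distending pain), 耳后长包 (lump behind the ear), 睡觉流口水 (drooling during sleep), 鼻塞 (nasal congestion), 粉红色泡沫样痰 (pink frothy sputum), 孕妇气喘 (wheezing in pregnant women), 痔疮便血 (bloody stool from hemorrhoids), 头昏眼花 (dizziness and blurred vision)

Body parts: 鼻唇沟 (nasolabial fold), 鼻处 (nose area), 鼻子 (nose), 鼻子尖 (tip of the nose), 鼻孔 (nostril), 鼻尖 (nasal tip), 鼻窦软骨 (sinus cartilage), 鼻翼 (nasal ala), 鼻黏膜 (nasal mucosa)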

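Entity retrieval

5,000 disease descriptions were selected from ICD10, and entities in them were found by maximum matching against the existing dictionaries.

Taking diseases as an example, each input disease description is first normalized: spaces and line breaks are removed, along with meaningless words at the start and end of the sentence.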
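
A minimal sketch of such a normalization step might look like the following. The source does not say which leading and trailing words count as meaningless, so `LEADING_FILLERS` and `TRAILING_FILLERS` below are placeholder lists, and the function name `normalize` is likewise an assumption.

```python
import re

# Hypothetical filler words; replace with whatever the corpus actually needs.
LEADING_FILLERS = ("患者", "主诉")
TRAILING_FILLERS = ("等", "。")


def normalize(description):
    """Drop whitespace, then peel filler words off both ends of the sentence."""
    # \s on str patterns covers spaces, tabs, line breaks and full-width spaces.
    text = re.sub(r"\s+", "", description)
    changed = True
    while changed:  # repeat until no filler is left at either end
        changed = False
        for word in LEADING_FILLERS:
            if text.startswith(word):
                text = text[len(word):]
                changed = True
        for word in TRAILING_FILLERS:
            if text.endswith(word):
                text = text[:-len(word)]
                changed = True
    return text


cleaned = normalize(" 患者 结石性胆囊炎\n等。")
print(cleaned)
```

Normalizing before matching keeps the recorded indices stable, since they then refer to the cleaned sentence rather than to text with stray whitespace.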

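For each normalized sentence, every term in the dictionaries is matched in full against the text, and each hit is recorded as the matched term, its start index, its end index, and its entity type.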
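
The matching step could be sketched as below, under a few assumptions not stated in the source: dictionaries are passed as a mapping from entity type to terms, the end index is exclusive (Python slice style), and overlapping hits are resolved by keeping the longest one. The names `find_all_matches` and `keep_longest` are made up for the example.

```python
def find_all_matches(sentence, lexicons):
    """Record every occurrence of every term as (term, start, end, entity_type)."""
    matches = []
    for entity_type, terms in lexicons.items():
        for term in terms:
            if not term:
                continue  # an empty term would match everywhere
            start = sentence.find(term)
            while start != -1:
                matches.append((term, start, start + len(term), entity_type))
                start = sentence.find(term, start + 1)
    # Order by position, with longer matches first at the same position.
    return sorted(matches, key=lambda m: (m[1], -(m[2] - m[1])))


def keep_longest(matches):
    """Resolve overlaps by keeping longer matches first (one way to get maximum matching)."""
    kept = []
    for m in sorted(matches, key=lambda m: (-(m[2] - m[1]), m[1])):
        # Keep m only if it does not overlap anything already kept.
        if all(m[2] <= k[1] or m[1] >= k[2] for k in kept):
            kept.append(m)
    return sorted(kept, key=lambda m: m[1])


# Toy dictionaries; the entity type names are illustrative.
lexicons = {
    "DISEASE": ["红斑狼疮", "系统性红斑狼疮"],
    "SYMPTOM": ["鼻塞"],
}
sentence = "系统性红斑狼疮伴鼻塞"
all_matches = find_all_matches(sentence, lexicons)
entities = keep_longest(all_matches)
for entity in entities:
    print(entity)
```

Scanning the sentence once per term is simple but slow for dictionaries with tens of thousands of entries; a trie or an Aho-Corasick automaton would be a common substitute if speed matters.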
