dsindex / Syntaxnet

reference code for syntaxnet

syntaxnet

description

test code for syntaxnet
- training and testing a model using the UD corpus.
- training and testing a Korean parser model using the Sejong corpus.
- exporting a trained model and serving it (limited to a designated, older version of syntaxnet).
- training and testing a model using dragnn.
- comparison to bist-parser.

history

2017. 3. 27

test for dragnn
version

python : 2.7
bazel  : 0.4.3
protobuf : 3.2.0
syntaxnet : 40a5739ae26baf6bfa352d2dec85f5ca190254f8

2017. 3. 10

modify for recent version of syntaxnet(tf 1.0), OS X(bash script), universal treebank v2.0
version

python : 2.7
bazel  : 0.4.3
protobuf : 3.0.0b2, 3.2.0
syntaxnet : bc70271a51fe2e051b5d06edc6b9fd94880761d5

2016. 8. 16
- add 'char-map' to 'context.pbtxt' for training
- add '--resource_dir' for testing
  - if you installed an old version of syntaxnet (e.g. a4b7bb9a5dd2c021edcd3d68d326255c734d0ef0), you should specify the path to each file in 'context.pbtxt'
- version
```
syntaxnet : a5d45f2ed20effaabc213a2eb9def291354af1ec
```

how to test

# after installing syntaxnet.
# gpu supporting : https://github.com/tensorflow/models/issues/248#issuecomment-288991859
$ pwd
/path/to/models/syntaxnet
$ git clone https://github.com/dsindex/syntaxnet.git work
$ cd work
$ echo "hello syntaxnet" | ./demo.sh
# training parser only with parsed corpus
$ ./parser_trainer_test.sh

universal dependency corpus

UD official website
- tutorial
- CoNLL-U format
UDPipe
- udpipe(git)
prepare data

$ cd work
$ mkdir corpus
$ cd corpus
# downloading ud-treebanks-v2.0.tgz
$ tar -zxvf ud-treebanks-v2.0.tgz  
$ ls universal-dependencies-2.0
UD_Ancient_Greek  UD_Basque  UD_Czech ....
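
The treebank files extracted above are in CoNLL-U format. The following is a minimal sketch of a reader, assuming ten tab-separated columns per token, `#` comment lines, and blank lines between sentences. The function name `read_conllu` and the sample rows are illustrative only and are not part of this repository. It is written for Python 3.

```python
from typing import Dict, Iterator, List

# Column names as defined by the CoNLL-U format.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dicts."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError("expected 10 columns, got %d" % len(cols))
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        yield sentence


# Hypothetical sample; any iterable of lines (such as an open file) works.
sample = [
    "# text = Bob brought pizza\n",
    "1\tBob\t_\tPROPN\tNNP\t_\t2\tnsubj\t_\t_\n",
    "2\tbrought\t_\tVERB\tVBD\t_\t0\troot\t_\t_\n",
    "3\tpizza\t_\tNOUN\tNN\t_\t2\tobj\t_\t_\n",
    "\n",
]
sentences = list(read_conllu(sample))
print(len(sentences), [tok["form"] for tok in sentences[0]])
```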

training tagger and parser with another corpus

# for example, training UD_English.
# detailed instructions can be found at https://github.com/tensorflow/models/tree/master/syntaxnet
$ ./train.sh -v -v
...
#preprocessing with tagger
INFO:tensorflow:Seconds elapsed in evaluation: 9.77, eval metric: 99.71%
INFO:tensorflow:Seconds elapsed in evaluation: 1.26, eval metric: 92.04%
INFO:tensorflow:Seconds elapsed in evaluation: 1.26, eval metric: 92.07%
...
#pretrain parser
INFO:tensorflow:Seconds elapsed in evaluation: 4.97, eval metric: 82.20%
...
#evaluate pretrained parser
INFO:tensorflow:Seconds elapsed in evaluation: 44.30, eval metric: 92.36%
INFO:tensorflow:Seconds elapsed in evaluation: 5.42, eval metric: 82.67%
INFO:tensorflow:Seconds elapsed in evaluation: 5.59, eval metric: 82.36%
...
#train parser
INFO:tensorflow:Seconds elapsed in evaluation: 57.69, eval metric: 83.95%
...
#evaluate parser
INFO:tensorflow:Seconds elapsed in evaluation: 283.77, eval metric: 96.54%
INFO:tensorflow:Seconds elapsed in evaluation: 34.49, eval metric: 84.09%
INFO:tensorflow:Seconds elapsed in evaluation: 34.97, eval metric: 83.49%
...
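
To compare runs, the `eval metric` values can be pulled out of a saved training log. This is a small sketch that assumes the log lines look exactly like the ones above. The helper name `parse_eval_lines` is hypothetical, and the sketch does not try to guess which dataset each line belongs to.

```python
import re

# Matches lines such as:
# INFO:tensorflow:Seconds elapsed in evaluation: 9.77, eval metric: 99.71%
EVAL_RE = re.compile(
    r"Seconds elapsed in evaluation: ([0-9.]+), eval metric: ([0-9.]+)%")


def parse_eval_lines(log_text):
    """Return a list of (seconds, metric_percent) tuples in log order."""
    return [(float(sec), float(metric))
            for sec, metric in EVAL_RE.findall(log_text)]


# The three "evaluate parser" lines from the log above.
log = """\
INFO:tensorflow:Seconds elapsed in evaluation: 283.77, eval metric: 96.54%
INFO:tensorflow:Seconds elapsed in evaluation: 34.49, eval metric: 84.09%
INFO:tensorflow:Seconds elapsed in evaluation: 34.97, eval metric: 83.49%
"""
results = parse_eval_lines(log)
print(results)
```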

training parser only

# if you have another pos-tagger and want to build only the parser from the parsed corpus :
$ ./train_p.sh -v -v
...
#pretrain parser
...
#evaluate pretrained parser
INFO:tensorflow:Seconds elapsed in evaluation: 44.15, eval metric: 92.21%
INFO:tensorflow:Seconds elapsed in evaluation: 5.56, eval metric: 87.84%
INFO:tensorflow:Seconds elapsed in evaluation: 5.43, eval metric: 86.56%
...
#train parser
...
#evaluate parser
INFO:tensorflow:Seconds elapsed in evaluation: 279.04, eval metric: 94.60%
INFO:tensorflow:Seconds elapsed in evaluation: 33.19, eval metric: 88.60%
INFO:tensorflow:Seconds elapsed in evaluation: 32.57, eval metric: 87.77%
...

test new model

$ echo "this is my own tagger and parser" | ./test.sh
...
Input: this is my own tagger and parser
Parse:
tagger NN ROOT
 +-- this DT nsubj
 +-- is VBZ cop
 +-- my PRP$ nmod:poss
 +-- own JJ amod
 +-- and CC cc
 +-- parser NN conj

# original model
$ echo "this is my own tagger and parser" | ./demo.sh
Input: this is my own tagger and parser
Parse:
tagger NN ROOT
 +-- this DT nsubj
 +-- is VBZ cop
 +-- my PRP$ poss
 +-- own JJ amod
 +-- and CC cc
 +-- parser ADD conj 

$ echo "Bob brought the pizza to Alice ." | ./test.sh
Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- Alice NNP nmod
 |   +-- to IN case
 +-- . . punct

# original model
$ echo "Bob brought the pizza to Alice ." | ./demo.sh
Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct

training parser from Sejong treebank corpus

# the corpus is accessible through the path on this image : https://raw.githubusercontent.com/dsindex/blog/master/images/url_sejong.png
# copy sejong_treebank.txt.v1 to `sejong` directory.
$ ./sejong/split.sh
$ ./sejong/c2d.sh
$ ./train_sejong.sh
#pretrain parser
...
INFO:tensorflow:Seconds elapsed in evaluation: 14.18, eval metric: 93.43%
...
#evaluate pretrained parser
INFO:tensorflow:Seconds elapsed in evaluation: 116.08, eval metric: 95.11%
INFO:tensorflow:Seconds elapsed in evaluation: 14.60, eval metric: 93.76%
INFO:tensorflow:Seconds elapsed in evaluation: 14.45, eval metric: 93.78%
...
#evaluate pretrained parser by eoj-based
accuracy(UAS) = 0.903289
accuracy(UAS) = 0.876198
accuracy(UAS) = 0.876888
...
#train parser
INFO:tensorflow:Seconds elapsed in evaluation: 137.36, eval metric: 94.12%
...
#evaluate parser
INFO:tensorflow:Seconds elapsed in evaluation: 1806.21, eval metric: 96.37%
INFO:tensorflow:Seconds elapsed in evaluation: 224.40, eval metric: 94.19%
INFO:tensorflow:Seconds elapsed in evaluation: 223.75, eval metric: 94.25%
...

#evaluate parser by eoj-based
accuracy(UAS) = 0.928845
accuracy(UAS) = 0.886139
accuracy(UAS) = 0.887824
...
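
The `accuracy(UAS)` figures above are unlabeled attachment scores: the fraction of units whose predicted head matches the gold head. The sketch below shows only that ratio on plain lists of head indices. The "eoj-based" evaluation presumably groups morphemes into eojeols (space-delimited units) before scoring; that grouping is done by this repository's scripts and is not reproduced here. The name `uas` and the example heads are illustrative.

```python
def uas(gold_heads, pred_heads):
    """Fraction of positions where the predicted head equals the gold head."""
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_heads:
        return 0.0
    correct = sum(1 for g, p in zip(gold_heads, pred_heads) if g == p)
    return correct / float(len(gold_heads))


# Hypothetical four-unit sentence: one head out of four is wrong.
gold = [2, 0, 4, 2]
pred = [2, 0, 2, 2]
score = uas(gold, pred)
print("accuracy(UAS) = %f" % score)
```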

test korean parser model

$ cat sejong/tagged_input.sample
1	프랑스	프랑스	NNP	NNP	_	0	_	_	_
2	의	의	JKG	JKG	_	0	_	_	_
3	세계	세계	NNG	NNG	_	0	_	_	_
4	적	적	XSN	XSN	_	0	_	_	_
5	이	이	VCP	VCP	_	0	_	_	_
6	ᆫ	ᆫ	ETM	ETM	_	0	_	_	_
7	의상	의상	NNG	NNG	_	0	_	_	_
8	디자이너	디자이너	NNG	NNG	_	0	_	_	_
9	엠마누엘	엠마누엘	NNP	NNP	_	0	_	_	_
10	웅가로	웅가로	NNP	NNP	_	0	_	_	_
11	가	가	JKS	JKS	_	0	_	_	_
12	실내	실내	NNG	NNG	_	0	_	_	_
13	장식	장식	NNG	NNG	_	0	_	_	_
14	용	용	XSN	XSN	_	0	_	_	_
15	직물	직물	NNG	NNG	_	0	_	_	_
16	디자이너	디자이너	NNG	NNG	_	0	_	_	_
17	로	로	JKB	JKB	_	0	_	_	_
18	나서	나서	VV	VV	_	0	_	_	_
19	었	었	EP	EP	_	0	_	_	_
20	다	다	EF	EF	_	0	_	_	_
21	.	.	SF	SF	_	0	_	_	_

$ cat sejong/tagged_input.sample | ./test_sejong.sh -v -v
Input: 프랑스 의 세계 적 이 ᆫ 의상 디자이너 엠마누엘 웅가로 가 실내 장식 용 직물 디자이너 로 나서 었 다 .
Parse:
. SF ROOT
 +-- 다 EF MOD
     +-- 었 EP MOD
         +-- 나서 VV MOD
             +-- 가 JKS NP_SBJ
             |   +-- 웅가로 NNP MOD
             |       +-- 디자이너 NNG NP
             |       |   +-- 의 JKG NP_MOD
             |       |   |   +-- 프랑스 NNP MOD
             |       |   +-- ᆫ ETM VNP_MOD
             |       |   |   +-- 이 VCP MOD
             |       |   |       +-- 적 XSN MOD
             |       |   |           +-- 세계 NNG MOD
             |       |   +-- 의상 NNG NP
             |       +-- 엠마누엘 NNP NP
             +-- 로 JKB NP_AJT
                 +-- 디자이너 NNG MOD
                     +-- 직물 NNG NP
                         +-- 실내 NNG NP
                         +-- 용 XSN NP
                             +-- 장식 NNG MOD

apply korean POS tagger(Komoran via konlpy)

# after installing konlpy ( http://konlpy.org/ko/v0.4.3/ )
$ python sejong/tagger.py
나는 학교에 간다.
1	나	나	NP	NP	_	0	_	_	_
2	는	는	JX	JX	_	0	_	_	_
3	학교	학교	NNG	NNG	_	0	_	_	_
4	에	에	JKB	JKB	_	0	_	_	_
5	가	가	VV	VV	_	0	_	_	_
6	ㄴ다	ㄴ다	EF	EF	_	0	_	_	_
7	.	.	SF	SF	_	0	_	_	_

$ echo "나는 학교에 간다." | python sejong/tagger.py | ./test_sejong.sh
Input: 나 는 학교 에 가 ㄴ다 .
Parse:
. SF ROOT
 +-- ㄴ다 EF MOD
     +-- 가 VV MOD
         +-- 는 JX NP_SBJ
         |   +-- 나 NP MOD
         +-- 에 JKB NP_AJT
             +-- 학교 NNG MOD
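
The tagger output above uses ten tab-separated columns with the morpheme repeated as lemma, the tag repeated in both tag columns, and the head left as 0. A minimal sketch of that formatting step follows, assuming the input is a list of (morpheme, tag) pairs such as the one konlpy's `Komoran().pos(text)` returns. How `sejong/tagger.py` actually does this is not shown in this document, so `to_conll_rows` is a hypothetical helper.

```python
def to_conll_rows(tagged):
    """Format (morpheme, tag) pairs as CoNLL rows with an empty parse."""
    rows = []
    for index, (morph, tag) in enumerate(tagged, start=1):
        # id, form, lemma, tag, tag, feats, head, label, and two unused columns
        cols = [str(index), morph, morph, tag, tag, "_", "0", "_", "_", "_"]
        rows.append("\t".join(cols))
    return rows


# Pairs copied from the sample output above; with konlpy installed they
# could come from: Komoran().pos(sentence)
tagged = [("나", "NP"), ("는", "JX"), ("학교", "NNG"), ("에", "JKB"),
          ("가", "VV"), ("ㄴ다", "EF"), (".", "SF")]
rows = to_conll_rows(tagged)
print("\n".join(rows))
```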

tensorflow serving and syntaxnet

$ bazel-bin/tensorflow_serving/example/parsey_client --server=localhost:9000
나는 학교에 간다
Input :  나는 학교에 간다
Parsing :
{"result": [{"text": "나 는 학교 에 가 ㄴ다", "token": [{"category": "NP", "head": 1, "end": 2, "label": "MOD", "start": 0, "tag": "NP", "word": "나"}, {"category": "JX", "head": 4, "end": 6, "label": "NP_SBJ", "start": 4, "tag": "JX", "word": "는"}, {"category": "NNG", "head": 3, "end": 13, "label": "MOD", "start": 8, "tag": "NNG", "word": "학교"}, {"category": "JKB", "head": 4, "end": 17, "label": "NP_AJT", "start": 15, "tag": "JKB", "word": "에"}, {"category": "VV", "head": 5, "end": 21, "label": "MOD", "start": 19, "tag": "VV", "word": "가"}, {"category": "EC", "end": 28, "label": "ROOT", "start": 23, "tag": "EC", "word": "ㄴ다"}], "docid": "-:0"}]}
...
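
The client prints the parse as JSON. The sketch below reads such a response and pairs each token with its head word. It assumes, based only on the sample above, that `head` is a 0-based index into the token list and that the root token has no `head` key. The sample is abridged to the `word`, `tag`, `label` and `head` fields, and `head_words` is a hypothetical helper.

```python
import json

# Abridged copy of the sample response above.
response = json.loads("""
{"result": [{"text": "나 는 학교 에 가 ㄴ다", "docid": "-:0", "token": [
  {"word": "나", "tag": "NP", "label": "MOD", "head": 1},
  {"word": "는", "tag": "JX", "label": "NP_SBJ", "head": 4},
  {"word": "학교", "tag": "NNG", "label": "MOD", "head": 3},
  {"word": "에", "tag": "JKB", "label": "NP_AJT", "head": 4},
  {"word": "가", "tag": "VV", "label": "MOD", "head": 5},
  {"word": "ㄴ다", "tag": "EC", "label": "ROOT"}
]}]}
""")


def head_words(parsed):
    """Return (word, label, head_word) triples; head_word is None for the root."""
    triples = []
    for sentence in parsed["result"]:
        tokens = sentence["token"]
        for tok in tokens:
            head = tok.get("head")  # missing on the root token in the sample
            head_word = tokens[head]["word"] if head is not None else None
            triples.append((tok["word"], tok["label"], head_word))
    return triples


triples = head_words(response)
for word, label, head_word in triples:
    print(word, label, head_word)
```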

parsey's cousins

a collection of pretrained syntactic models
how to test

# download models from http://download.tensorflow.org/models/parsey_universal/<language>.zip

# for `English`
$ echo "Bob brought the pizza to Alice." | ./parse.sh

# tokenizing
Bob brought the pizza to Alice .

# morphological analysis
1	Bob	_	_	_	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
2	brought	_	_	_	Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD	0	_	_	_
3	the	_	_	_	Definite=Def|PronType=Art|fPOS=DET++DT	0	_	_	_
4	pizza	_	_	_	Number=Sing|fPOS=NOUN++NN	0	_	_	_
5	to	_	_	_	fPOS=ADP++IN	0	_	_	_
6	Alice	_	_	_	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
7	.	_	_	_	fPOS=PUNCT++.	0	_	_	_

# tagging
1	Bob	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
2	brought	_	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD	0	_	_	_
3	the	_	DET	DT	Definite=Def|PronType=Art|fPOS=DET++DT	0	_	_	_
4	pizza	_	NOUN	NN	Number=Sing|fPOS=NOUN++NN	0	_	_	_
5	to	_	ADP	IN	fPOS=ADP++IN	0	_	_	_
6	Alice	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	0	_	_	_
7	.	_	PUNCT	.	fPOS=PUNCT++.	0	_	_	_

# parsing
1	Bob	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	2	nsubj	_	_
2	brought	_	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD	0	ROOT	_	_
3	the	_	DET	DT	Definite=Def|PronType=Art|fPOS=DET++DT	4	det	_	_
4	pizza	_	NOUN	NN	Number=Sing|fPOS=NOUN++NN	2	dobj	_	_
5	to	_	ADP	IN	fPOS=ADP++IN	6	case	_	_
6	Alice	_	PROPN	NNP	Number=Sing|fPOS=PROPN++NNP	2	nmod	_	_
7	.	_	PUNCT	.	fPOS=PUNCT++.	2	punct	_	_

# conll2tree 
Input: Bob brought the pizza to Alice .
Parse:
brought VERB++VBD ROOT
 +-- Bob PROPN++NNP nsubj
 +-- pizza NOUN++NN dobj
 |   +-- the DET++DT det
 +-- Alice PROPN++NNP nmod
 |   +-- to ADP++IN case
 +-- . PUNCT++. punct
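
The last step turns head indices into an indented tree. This is a rough sketch of that idea, not the conll2tree implementation shipped with syntaxnet: it groups tokens by head and walks the tree depth-first, drawing ` |  ` under a node that has later siblings. The function name `render_tree` and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict


def render_tree(tokens):
    """tokens: list of (id, form, tag, head, label); head 0 marks the root."""
    by_id = {}
    children = defaultdict(list)
    for tok in tokens:
        by_id[tok[0]] = tok
        children[tok[3]].append(tok[0])

    lines = []

    def visit(tok_id, prefix, is_root, is_last):
        _, form, tag, _, label = by_id[tok_id]
        text = "%s %s %s" % (form, tag, label)
        if is_root:
            lines.append(text)
            child_prefix = ""
        else:
            lines.append(prefix + " +-- " + text)
            # Keep a vertical bar only while later siblings remain.
            child_prefix = prefix + ("    " if is_last else " |  ")
        kids = children[tok_id]
        for i, kid in enumerate(kids):
            visit(kid, child_prefix, False, i == len(kids) - 1)

    for root_id in children[0]:
        visit(root_id, "", True, True)
    return "\n".join(lines)


# Ids, heads and labels taken from the "# parsing" rows above.
tokens = [
    (1, "Bob", "PROPN++NNP", 2, "nsubj"),
    (2, "brought", "VERB++VBD", 0, "ROOT"),
    (3, "the", "DET++DT", 4, "det"),
    (4, "pizza", "NOUN++NN", 2, "dobj"),
    (5, "to", "ADP++IN", 6, "case"),
    (6, "Alice", "PROPN++NNP", 2, "nmod"),
    (7, ".", "PUNCT++.", 2, "punct"),
]
tree = render_tree(tokens)
print(tree)
```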

downloaded model vs trained model

1. downloaded model
Language	No. tokens	POS	fPOS	Morph	UAS	LAS
-------------------------------------------------------
English	25096	90.48%	89.71%	91.30%	84.79%	80.38%

2. trained model
INFO:tensorflow:Total processed documents: 2077
INFO:tensorflow:num correct tokens: 18634
INFO:tensorflow:total tokens: 22395
INFO:tensorflow:Seconds elapsed in evaluation: 19.85, eval metric: 83.21%

3. where does the difference (84.79% - 83.21% = 1.58 points) come from?
as mentioned in https://research.googleblog.com/2016/08/meet-parseys-cousins-syntax-for-40.html,
they found good hyperparameters by using MapReduce.
for example,
the hyperparameters for the POS tagger :
  - POS_PARAMS=128-0.08-3600-0.9-0
  - decay_steps=3600
  - hidden_layer_sizes=128
  - learning_rate=0.08
  - momentum=0.9
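
The `POS_PARAMS` string packs the values listed above into one dash-separated token. The sketch below splits it back into named fields. The order of the first four fields follows the list above; treating the fifth field as a random seed and `x` as the separator between multiple hidden layer sizes are assumptions, and `parse_params` is a hypothetical helper.

```python
def parse_params(params):
    """Split a string like '128-0.08-3600-0.9-0' into named hyperparameters."""
    parts = params.split("-")
    if len(parts) != 5:
        raise ValueError("expected 5 dash-separated fields, got %d" % len(parts))
    hidden, learning_rate, decay_steps, momentum, seed = parts
    return {
        # Assumption: several layers would be written like '512x512'.
        "hidden_layer_sizes": [int(size) for size in hidden.split("x")],
        "learning_rate": float(learning_rate),
        "decay_steps": int(decay_steps),
        "momentum": float(momentum),
        # Assumption: the last field is the random seed.
        "seed": int(seed),
    }


pos_params = parse_params("128-0.08-3600-0.9-0")
print(pos_params)
```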

dragnn

how to compile examples

$ cd ../
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //examples/dragnn:tutorial_1

training tagger and parser with CoNLL corpus

# compile
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //work/dragnn_examples:write_master_spec
$ bazel build -c opt //work/dragnn_examples:train_dragnn
$ bazel build -c opt //work/dragnn_examples:inference_dragnn
# training
$ cd work
$ ./train_dragnn.sh -v -v
...
INFO:tensorflow:training step: 25300, actual: 25300
INFO:tensorflow:training step: 25400, actual: 25400
INFO:tensorflow:finished step: 25400, actual: 25400
INFO:tensorflow:Annotating datset: 2002 examples
INFO:tensorflow:Done. Produced 2002 annotations
INFO:tensorflow:Total num documents: 2002
INFO:tensorflow:Total num tokens: 25148
INFO:tensorflow:POS: 85.63%
INFO:tensorflow:UAS: 79.67%
INFO:tensorflow:LAS: 74.36%
...
# test
$ echo "i love this one" | ./test_dragnn.sh
Input: i love this one
Parse:
love VBP root
 +-- i PRP nsubj
 +-- one CD obj
     +-- this DT det

training parser with Sejong corpus

# compile
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //work/dragnn_examples:write_master_spec
$ bazel build -c opt //work/dragnn_examples:train_dragnn
$ bazel build -c opt //work/dragnn_examples:inference_dragnn_sejong
# training
$ cd work
# to prepare corpus, please refer to `training parser from Sejong treebank corpus` section.
$ ./train_dragnn_sejong.sh -v -v
...
INFO:tensorflow:training step: 33100, actual: 33100
INFO:tensorflow:training step: 33200, actual: 33200
INFO:tensorflow:finished step: 33200, actual: 33200
INFO:tensorflow:Annotating datset: 4114 examples
INFO:tensorflow:Done. Produced 4114 annotations
INFO:tensorflow:Total num documents: 4114
INFO:tensorflow:Total num tokens: 97002
INFO:tensorflow:POS: 93.95%
INFO:tensorflow:UAS: 91.38%
INFO:tensorflow:LAS: 87.76%
...
# test
# after installing konlpy ( http://konlpy.org/ko/v0.4.3/ )
$ echo "제주로 가는 비행기가 심한 비바람에 회항했다." | ./test_dragnn_sejong.sh
INFO:tensorflow:Read 1 documents
Input: 제주 로 가 는 비행기 가 심하 ㄴ 비바람 에 회항 하 았 다 .
Parse:
. SF VP
 +-- 다 EF MOD
     +-- 았 EP MOD
         +-- 하 XSA MOD
             +-- 회항 SN MOD
                 +-- 가 JKS NP_SBJ
                 |   +-- 비행기 NNG MOD
                 |       +-- 는 ETM VP_MOD
                 |           +-- 가 VV MOD
                 |               +-- 로 JKB NP_AJT
                 |                   +-- 제주 MAG MOD
                 +-- 에 JKB NP_AJT
                     +-- 비바람 NNG MOD
                         +-- ㄴ SN MOD
                             +-- 심하 VV NP
# the POS tags predicted by dragnn seem to be somewhat incorrect,
# so i replaced them with the results from the Komoran tagger.
# you can modify 'inference_dragnn_sejong.py' to use the tags from dragnn instead.
Input: 제주 로 가 는 비행기 가 심하 ㄴ 비바람 에 회항 하 았 다 .
Parse:
. SF VP
 +-- 다 EF MOD
     +-- 았 EP MOD
         +-- 하 XSV MOD
             +-- 회항 NNG MOD
                 +-- 가 JKS NP_SBJ
                 |   +-- 비행기 NNG MOD
                 |       +-- 는 ETM VP_MOD
                 |           +-- 가 VV MOD
                 |               +-- 로 JKB NP_AJT
                 |                   +-- 제주 NNG MOD
                 +-- 에 JKB NP_AJT
                     +-- 비바람 NNG MOD
                         +-- ㄴ ETM MOD
                             +-- 심하 VA NP
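
Swapping the predicted tags for the tagger's tags, as described in the comments above, amounts to overwriting one field per token while keeping the parse. A minimal sketch under that reading follows; it is not the code in `inference_dragnn_sejong.py`, and `override_tags` and the dict layout are hypothetical.

```python
def override_tags(tokens, tagger_tags):
    """Return copies of tokens whose 'tag' is replaced by the external tag."""
    if len(tokens) != len(tagger_tags):
        raise ValueError("token and tag sequences differ in length")
    return [dict(tok, tag=new_tag)
            for tok, new_tag in zip(tokens, tagger_tags)]


# Two tokens from the example above: dragnn tags first, Komoran tags second.
predicted = [
    {"form": "회항", "tag": "SN", "label": "MOD"},
    {"form": "하", "tag": "XSA", "label": "MOD"},
]
komoran_tags = ["NNG", "XSV"]
fixed = override_tags(predicted, komoran_tags)
print([(tok["form"], tok["tag"], tok["label"]) for tok in fixed])
```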

web api using tornado

how to run

# compile
$ pwd
/path/to/models/syntaxnet
$ bazel build -c opt //work/dragnn_examples:dragnn_dm
# start tornado web api
$ cd work/dragnn_examples/www
# start single process
$ ./start.sh -v -v 0 0
# although tornado supports multi-processing, a tensorflow session is not fork-safe,
# so do not use the multi-processing option.
# if you want to link to the model trained on the Sejong corpus, just edit env.sh
# : enable_konlpy='True'

# http://hostip:8897 
# http://hostip:8897/dragnn?q=i love it
# http://hostip:8897/dragnn?q=나는 학교에 가서 공부했다.
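
The example queries contain spaces and Korean text, so a client should percent-encode the `q` parameter. A small Python 3 sketch of building such a URL is shown here; `hostip` is the placeholder from above, the port and path are taken from the URLs above, and `dragnn_url` is a hypothetical helper. The response format is not assumed.

```python
from urllib.parse import quote, urlencode


def dragnn_url(host, query, port=8897):
    """Build the /dragnn query URL with a percent-encoded 'q' parameter."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    encoded = urlencode({"q": query}, quote_via=quote)
    return "http://%s:%d/dragnn?%s" % (host, port, encoded)


english_url = dragnn_url("hostip", "i love it")
korean_url = dragnn_url("hostip", "나는 학교에 가서 공부했다.")
print(english_url)
print(korean_url)
```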

api output format(sample)

brat annotation tool

comparison to BIST parser
