All Projects → cgrand → parsnip

cgrand / parsnip

Licence: other
parsley is dead, long live parsnip!

Programming Languages

clojure
4091 projects

Parsnip

A parsing library which is the spiritual successor of parsley and technical descendant of seqexp.

Design goals

  • General: accepts any CFG.
  • Deterministic: always select the same tree out of the parse forest.
  • Total: for unacceptable inputs, yields a parse for one of the closest acceptable input (distance being character deleted from the original input).
  • Scannerless
  • Incremental

Implementation details

Parsnip uses a VM with 6 opcodes (PRED, PEEK, JUMP, FORK, CALL and RET).

The VM is multithreaded (these are not real threads) and all threads are run in lockstep.

The state of a thread consists only of its PC (program counter) and its stack. At any time there can't be more than one thread with the same PC and same stack. Each threads also has an error count and a priority; they are both used when deduplicating threads with identical PC and stack.

The threads are stored in a structure resembling a lazy graph-structured stack.

PRED pred

pred is a predicate whose type is determined by the VM (the naive VM expects functions).

The predicate is applied to the current element.

If the predicate succeeds, the thread continues to the next instruction and to the next element of the input sequence. When the predicate fails the thread stays at the same PC (thus waits for the next character), increments its error count.

PEEK pred

pred is a predicate whose type is determined by the VM (the naive VM expectes functions).

The predicate is applied to the current element.

If the predicate succeeds, the thread continues to the next instruction but, unlike PRED doesn't advance to the new input. When the predicate fails the thread stays at the same PC (thus waits for the next character), increments its error count.

JUMP address

Performs a jump, address is an absolute address.

FORK address

Forks the current thread in two threads, one thread will continue to the next instruction while the other will performs a relative jump to the specified address.

Priority is given to the continuing thread.

It should be noted that the only effect of this priority is parse-tree selection: selecting one parse tree out of the parse forest.

CALL address

Pushes the return address on the stack and jump to address.

RET tag

Pops an address from the stack and jumps to it. (tag is an arbitrary value)

License

Copyright © 2014 Christophe Grand

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].