
mudasobwa / markright

Licence: MIT license
A customizable markdown parser in Elixir: pure pattern matching.

Projects that are alternatives of or similar to markright

parse-md
Parse Markdown file's metadata from its content
Stars: ✭ 15 (+7.14%)
Mutual labels:  parsing, markdown-parser
Esprima
ECMAScript parsing infrastructure for multipurpose analysis
Stars: ✭ 6,391 (+45550%)
Mutual labels:  parsing, ast
inmemantlr
ANTLR as a library for JVM based languages
Stars: ✭ 87 (+521.43%)
Mutual labels:  parsing, ast
kolasu
Kotlin Language Support – AST Library
Stars: ✭ 45 (+221.43%)
Mutual labels:  parsing, ast
Yacep
yet another csharp expression parser
Stars: ✭ 107 (+664.29%)
Mutual labels:  parsing, ast
node-typescript-parser
Parser for typescript (and javascript) files, that compiles those files and generates a human understandable AST.
Stars: ✭ 121 (+764.29%)
Mutual labels:  parsing, ast
Meriyah
A 100% compliant, self-hosted javascript parser - https://meriyah.github.io/meriyah
Stars: ✭ 690 (+4828.57%)
Mutual labels:  parsing, ast
Estree
The ESTree Spec
Stars: ✭ 3,867 (+27521.43%)
Mutual labels:  parsing, ast
Graphql Go Tools
Tools to write high performance GraphQL applications using Go/Golang.
Stars: ✭ 96 (+585.71%)
Mutual labels:  parsing, ast
Libdparse
Library for lexing and parsing D source code
Stars: ✭ 91 (+550%)
Mutual labels:  parsing, ast
kataw
A 100% spec compliant ES2022 JavaScript toolchain
Stars: ✭ 303 (+2064.29%)
Mutual labels:  parsing, ast
Escaya
A blazing fast 100% spec compliant, incremental javascript parser written in Typescript
Stars: ✭ 217 (+1450%)
Mutual labels:  parsing, ast
cppcombinator
parser combinator and AST generator in c++17
Stars: ✭ 20 (+42.86%)
Mutual labels:  parsing, ast
hxjsonast
Parse JSON into position-aware AST with Haxe!
Stars: ✭ 28 (+100%)
Mutual labels:  parsing, ast
tree-hugger
A light-weight, extendable, high level, universal code parser built on top of tree-sitter
Stars: ✭ 96 (+585.71%)
Mutual labels:  parsing, ast
Uaiso
A multi-language parsing infrastructure with a unified AST
Stars: ✭ 86 (+514.29%)
Mutual labels:  parsing, ast
Down
Blazing fast Markdown / CommonMark rendering in Swift, built upon cmark.
Stars: ✭ 1,895 (+13435.71%)
Mutual labels:  parsing, ast
codeparser
Parse Wolfram Language source code as abstract syntax trees (ASTs) or concrete syntax trees (CSTs)
Stars: ✭ 84 (+500%)
Mutual labels:  parsing, ast
OpenSIEM-Logstash-Parsing
SIEM Logstash parsing for more than hundred technologies
Stars: ✭ 140 (+900%)
Mutual labels:  parsing
dpar
Neural network transition-based dependency parser (in Rust)
Stars: ✭ 41 (+192.86%)
Mutual labels:  parsing

Markright

The extended, streaming, configurable markdown-like syntax parser that produces an AST.

Out of the box it supports the full set of markdown, plus some extensions. The user of this library can easily extend the functionality with her own markup definition and a bit of Elixir code to handle the parsing.

Starting with version 0.5.0, it supports many different markright syntaxes simultaneously, including the ability to create syntaxes on the fly.

Not a single call to Regex is used. All parsing is done solely by pattern matching on the input binary.
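
To give a feel for what parsing by pattern matching on a binary looks like, here is a minimal sketch. It is not Markright's actual implementation: the module `GripSketch` and its functions are hypothetical, and it only recognizes a single `*…*` marker, turning it into a `{:strong, %{}, text}` node.

```elixir
defmodule GripSketch do
  @moduledoc "Hypothetical illustration only; not part of Markright."

  # Returns a list of plain strings and {:strong, %{}, text} nodes.
  def parse(input) when is_binary(input), do: text(input, "", [])

  # Plain-text mode: an asterisk opens a strong span.
  defp text(<<"*", rest::binary>>, buf, acc), do: strong(rest, "", flush(buf, acc))
  defp text(<<byte, rest::binary>>, buf, acc), do: text(rest, <<buf::binary, byte>>, acc)
  defp text(<<>>, buf, acc), do: Enum.reverse(flush(buf, acc))

  # Strong mode: collect bytes until the closing asterisk.
  defp strong(<<"*", rest::binary>>, buf, acc), do: text(rest, "", [{:strong, %{}, buf} | acc])
  defp strong(<<byte, rest::binary>>, buf, acc), do: strong(rest, <<buf::binary, byte>>, acc)
  # Unclosed span: keep the asterisk and the collected text as plain text.
  defp strong(<<>>, buf, acc), do: Enum.reverse(flush("*" <> buf, acc))

  # Push the buffered text onto the accumulator unless it is empty.
  defp flush("", acc), do: acc
  defp flush(buf, acc), do: [buf | acc]
end

GripSketch.parse("Hello, *world*!")
# => ["Hello, ", {:strong, %{}, "world"}, "!"]
```

Each function clause matches on the head of the remaining binary, so the "grammar" lives entirely in the clause heads and no regular expression is needed.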

The AST produced is understandable by XmlBuilder.

There are many callbacks available to transform the resulting AST. See below.

Is it of any good?

It is an incredible piece of handsome lovely software. Sure, it is.

Installation

If available in Hex, the package can be installed by adding markright to your list of dependencies in mix.exs:

def deps do
  [{:markright, "~> 0.5"}]
end

Basic usage

    @input ~s"""
    If [available in Hex](https://hex.pm/docs/publish), the package can be installed
    by adding `markright` to your list of dependencies in `mix.exs`:

    ```elixir
    def deps do
      [{:markright, "~> 0.5"}]
    end
    ```

    ## Basic Usage
    Blah...
    """

    assert(
      Markright.to_ast(@input) ==
        {:article, %{},
           [{:p, %{}, [
              "If ", {:a, %{href: "https://hex.pm/docs/publish"}, "available in Hex"},
              ", the package can be installed\nby adding ", {:code, %{}, "markright"},
              " to your list of dependencies in ", {:code, %{}, "mix.exs"}, ":"]},
            {:pre, %{},
              [{:code, %{lang: "elixir"},
              "def deps do\n  [{:markright, \"~> 0.5\"}]\nend"}]},
            {:h2, %{}, "Basic Usage"},
            {:p, %{}, "Blah...\n"}]}
    )

HTML generation

iex> "Hello, *[address]Aleksei*!"
...> |> Markright.to_ast
...> |> XmlBuilder.generate
"<article>
\t<p>
\t\tHello,
\t\t<strong class=\"address\">Aleksei</strong>
\t\t!
\t</p>
</article>"
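
The nodes handed to XmlBuilder above are plain `{tag, attributes, content}` tuples, where the content is a string, a list of children, or `nil` for an empty element. The sketch below walks such a tree by hand to make that shape explicit. It is a simplified assumption-laden illustration, not XmlBuilder's or Markright's code: the `AstSketch` module is hypothetical, and it does no escaping or indentation.

```elixir
defmodule AstSketch do
  @moduledoc "Hypothetical illustration only; use XmlBuilder for real output."

  # A bare string is emitted as-is (no HTML escaping in this sketch).
  def to_html(text) when is_binary(text), do: text
  # A list of children is rendered in order and concatenated.
  def to_html(nodes) when is_list(nodes), do: Enum.map_join(nodes, &to_html/1)
  # nil content means an empty element.
  def to_html({tag, attrs, nil}), do: "<#{tag}#{render_attrs(attrs)}/>"
  # Anything else is an element wrapping its rendered content.
  def to_html({tag, attrs, content}),
    do: "<#{tag}#{render_attrs(attrs)}>#{to_html(content)}</#{tag}>"

  # Attributes are sorted by key to make the output deterministic.
  defp render_attrs(attrs) do
    attrs
    |> Enum.sort()
    |> Enum.map_join(fn {key, value} -> ~s( #{key}="#{value}") end)
  end
end

AstSketch.to_html({:p, %{}, ["Hello, ", {:strong, %{class: "address"}, "Aleksei"}, "!"]})
# => "<p>Hello, <strong class=\"address\">Aleksei</strong>!</p>"
```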

Power tools

Callbacks

One can pass a callback to to_ast/3. It is called on every AST node found (do not expect the calls to come in the natural order, though), and it can also change the AST on the fly: just return a %Markright.Continuation struct from the callback (see markright_test.exs for inspiration):

fun = fn
  %Markright.Continuation{ast: {:p, %{}, text}} = cont ->
    IO.puts "Currently dealing with `:p` node"
    %Markright.Continuation{cont | ast: {:div, %{}, text}}
  cont -> cont
end
assert Markright.to_ast(@input, fun) == @output

Example: make fancy links inside blockquotes with callbacks

When the last element of a blockquote is a link, make it show the favicon and give the blockquote itself a cite attribute. This particular transform is in fact already done by the Markright.Finalizers.Blockquote finalizer; if it were not, this is how it could be implemented:

bq_patch = fn
  {:blockquote, bq_attrs, list} when is_list(list) ->
    case :lists.reverse(list) do
      [{:a, %{href: href} = attrs, text} | t] ->
        img = with [capture] <- Regex.run(~r|\Ahttps?://[^/]+|, href) do
          {:img,
              %{alt: "favicon",
                src: capture <> "/favicon.png",
                style: "height:16px;margin-bottom:-2px;"},
              nil}
        end
        patched = :lists.reverse([{:br, %{}, nil}, "— ", img, " ", {:a, attrs, text}])
        {:blockquote, Map.put(bq_attrs, :cite, href), :lists.reverse(patched ++ t)}
      _ -> {:blockquote, bq_attrs, list}
    end
  other -> other
end
fun = fn %Markright.Continuation{ast: ast} = cont ->
  %Markright.Continuation{cont | ast: bq_patch.(ast)}
end

Custom classes

All the “grip” elements (like *strong* or ~strike~) have an option to specify a class:

iex> Markright.to_ast "Hello, *[address]Aleksei*!"
{:article, %{},
  [{:p, %{}, ["Hello, ", {:strong, %{class: "address"}, "Aleksei"}, "!"]}]}

This is particularly helpful when writing rich blog posts on top of, say, Bootstrap CSS (or any other CSS framework that provides flashes and the like).

Custom syntax

Adding a new syntax is as easy as putting a new value into the config:

config :markright, syntax: [
  grip: [
    sup: "^^"
  ]
]

Voilà—you have this grip on hand:

iex> Markright.to_ast "Hello, ^^Aleksei^^!"
{:article, %{},
  [{:p, %{}, ["Hello, ", {:sup, %{}, "Aleksei"}, "!"]}]}

Ninja handling: collectors

Collectors act as accumulators that gather data during the parsing stage. A good example is the Markright.Collectors.OgpTwitter collector, which builds the Twitter/OGP card to embed into the head section of the resulting HTML.

  test "builds the twitter/ogp card" do
    Code.eval_string """
    defmodule Sample do
      use Markright.Collector, collectors: Markright.Collectors.OgpTwitter

      def on_ast(%Markright.Continuation{ast: {tag, _, _}} = cont), do: tag
    end
    """
    {ast, acc} = Markright.to_ast(@input, Sample)
    assert {ast, acc} == {@output, @accumulated}
    assert XmlBuilder.generate(acc[Markright.Collectors.OgpTwitter]) == @html
  after
    purge Sample
  end

Custom syntax on the fly

Starting with version 0.5.0, markright accepts custom syntax to be passed to Markright.to_ast/3:

@input ~S"""
Hello world.

> my blockquote

Right _after_.
Normal *para* again.
"""

Empty syntax will produce a set of paragraphs, ignoring anything else:

@empty_syntax []
@output_empty_syntax {:article, %{}, [
  {:p, %{}, "Hello world."},
  {:p, %{}, "> my blockquote"},
  {:p, %{}, "Right _after_.\nNormal *para* again.\n"}]}

test "works with empty syntax" do
  assert Markright.to_ast(@input, nil, syntax: @empty_syntax) == @output_empty_syntax
end

The simple syntax below accepts emphasized and bold text decorators only:

@simple_syntax [grip: [em: "_", strong: "*"]]
@output_simple_syntax {:article, %{}, [
  {:p, %{}, "Hello world."},
  {:p, %{}, "> my blockquote"},
  {:p, %{}, ["Right ", {:em, %{}, "after"}, ".\nNormal ", {:strong, %{}, "para"}, " again.\n"]}]}

test "works with simple user-defined syntax" do
  assert Markright.to_ast(@input, nil, syntax: @simple_syntax) == @output_simple_syntax
end

Syntax reference

block is a block element, roughly an analogue of HTML <div>:

Example:

block: [h: "#", blockquote: ">"]

Markright:

# Hello, world!

Result:

{:h1, %{}, "Hello, world!"}

flush is an empty element, roughly an analogue of HTML clear: all:

Example:

flush: [hr: "\n---", br: "  \n"]

Markright:

Hello, world!
---
Hello, world!

Result:

["Hello, world!", {:hr, %{}, nil}, "\nHello, world!\n"]

lead is an item element, usually chained and having a surrounding (see below):

Example:

lead: [li: {"-", [parser: Markright.Parsers.Li]}]

Markright:

- Hello, world!
- Hello, world!

Result:

{:ul, %{}, [{:li, %{}, "Hello, world!"}, {:li, %{}, "Hello, world!"}]}

surrounding is the element that wraps leads (see above):

Example:

surrounding: [li: :ul]

Markright: none

Result: see li above
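
One way to picture how lead and surrounding fit together is as a grouping step: runs of consecutive lead nodes are collected and wrapped in the configured surrounding tag. The sketch below shows that idea on already-parsed nodes. It is an assumption about the general technique, not Markright's internal code, and the `SurroundSketch` module is hypothetical.

```elixir
defmodule SurroundSketch do
  @moduledoc "Hypothetical illustration only; not part of Markright."

  # Wraps each run of consecutive lead nodes in its surrounding tag,
  # e.g. with `li: :ul` adjacent :li nodes end up inside a single :ul.
  def wrap(nodes, surrounding) do
    nodes
    |> Enum.chunk_by(&wrapper_for(&1, surrounding))
    |> Enum.flat_map(fn [first | _] = chunk ->
      case wrapper_for(first, surrounding) do
        # Not a lead: leave the nodes where they are.
        nil -> chunk
        # A run of leads: nest them under the surrounding element.
        wrapper -> [{wrapper, %{}, chunk}]
      end
    end)
  end

  # Looks up the surrounding tag for a node, or nil if it has none.
  defp wrapper_for({tag, _attrs, _content}, surrounding), do: Keyword.get(surrounding, tag)
  defp wrapper_for(_other, _surrounding), do: nil
end

SurroundSketch.wrap(
  [{:li, %{}, "Hello, world!"}, {:li, %{}, "Hello, world!"}, {:p, %{}, "Bye."}],
  li: :ul
)
# => [{:ul, %{}, [{:li, %{}, "Hello, world!"}, {:li, %{}, "Hello, world!"}]},
#     {:p, %{}, "Bye."}]
```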

magnet is a leading marker:

Example:

magnet: [tag: "#", youtube: "✇"]

Markright:

Hello, #world! Check this video: ✇http://youtu.be/AAAAAA

Result:

["Hello, ",
  {:a, %{class: "tag", href: "/tags/world!"}, "world!"},
  " Check this video: ",
  {:iframe,
     %{allowfullscreen: nil, frameborder: 0, height: 315,
       src: "http://www.youtube.com/embed/AAAAAA", width: 560},
   "http://www.youtube.com/embed/AAAAAA"}]

grip is an inline surrounding element, usually used for inline formatting:

Example:

grip: [em: "_", strong: "*"]

Markright:

Hello, *world*!

Result:

["Hello, ", {:strong, %{}, "world"}, "!"]

custom is a custom formatter, fully relying on the implementing module:

Example:

custom: [img: "!["]

Markright:

Hi, ![my title](http://oh.me/image)!

Result:

["Hi, ",
  {:figure, %{},
   [{:img, %{alt: "my title", src: "http://oh.me/image"}, nil},
    {:figcaption, %{}, "my title"}]},
  "!"]

Development

Extensions to the syntax are not supposed to be merged into trunk. I am thinking about creating a plugin-like infrastructure for them. Suggestions are very welcome.

Documentation

The docs can be found on HexDocs at https://hexdocs.pm/markright.
