JohannesKaufmann / Html To Markdown

License: MIT
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.

html-to-markdown


Image: a gopher standing on top of a machine that converts a box of HTML into blocks of Markdown.

Convert HTML into Markdown with Go. The library uses an HTML parser and avoids regular expressions as much as possible. That should prevent some odd edge cases and makes it usable when the input is entirely unknown.
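To illustrate why a parser is preferred over regexp, here is a minimal sketch that uses only the Go standard library. The naiveConvert function and its pattern are hypothetical and are not part of html-to-markdown; they only show how a simple pattern breaks as soon as a tag carries an attribute.

```go
package main

import (
	"fmt"
	"regexp"
)

// naiveStrong is a deliberately simplistic, hypothetical pattern.
// It only matches a <strong> tag that has no attributes.
var naiveStrong = regexp.MustCompile(`<strong>(.*?)</strong>`)

// naiveConvert rewrites matching <strong> elements to **bold** text.
func naiveConvert(html string) string {
	return naiveStrong.ReplaceAllString(html, "**$1**")
}

func main() {
	// The plain tag is converted.
	fmt.Println(naiveConvert(`<strong>Important</strong>`))
	// The tag with an attribute is not matched and is left as raw HTML.
	fmt.Println(naiveConvert(`<strong class="x">Important</strong>`))
}
```

An HTML parser sees both inputs as the same element, so a rule keyed on the tag name handles them alike.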

Installation

go get github.com/JohannesKaufmann/html-to-markdown

Usage

package main

import (
  "fmt"
  "log"

  md "github.com/JohannesKaufmann/html-to-markdown"
)

func main() {
  // Empty domain, CommonMark rules enabled, default options.
  converter := md.NewConverter("", true, nil)

  html := `<strong>Important</strong>`

  markdown, err := converter.ConvertString(html)
  if err != nil {
    log.Fatal(err)
  }
  fmt.Println("md ->", markdown)
}

If you are already using goquery, you can pass a selection to Convert.

markdown, err := converter.Convert(selec)

Using it on the command line

If you want to use html-to-markdown on the command line without writing any Go code, check out html2md, a CLI wrapper for html-to-markdown that has all of the following options and plugins built in.

Options

The third parameter to md.NewConverter is *md.Options.

For example, you can change the delimiter around bold text from "**" to "__" by setting StrongDelimiter.

opt := &md.Options{
  StrongDelimiter: "__", // default: **
  // ...
}
converter := md.NewConverter("", true, opt)

For the full list of options, see the GoDoc. For a working sample, see the example.

Adding Rules

converter.AddRules(
  md.Rule{
    Filter: []string{"del", "s", "strike"},
    Replacement: func(content string, selec *goquery.Selection, opt *md.Options) *string {
      // You need to return a pointer to a string (md.String is just a helper function).
      // If you return nil the next function for that html element
      // will be picked. For example you could only convert an element
      // if it has a certain class name and fallback if not.
      content = strings.TrimSpace(content)
      return md.String("~" + content + "~")
    },
  },
  // more rules
)

For more information, have a look at the add_rules example.

Using Plugins

If you want plugins (GitHub Flavored Markdown features such as strikethrough, tables, ...), you can pass them to Use.

import "github.com/JohannesKaufmann/html-to-markdown/plugin"

// Use the `GitHubFlavored` plugin from the `plugin` package.
converter.Use(plugin.GitHubFlavored())

Or, if you only want the Strikethrough plugin, you can use it on its own. You can change the delimiter around crossed-out text by setting the first argument to a different value (for example "~~" instead of "~").

converter.Use(plugin.Strikethrough(""))

For more information, have a look at the github_flavored example.

Writing Plugins

Have a look at the plugin folder for a reference implementation. The most basic one is Strikethrough.

Other Methods

Godoc

func (c *Converter) Keep(tags ...string) *Converter

Determines which elements are to be kept and rendered as HTML.

func (c *Converter) Remove(tags ...string) *Converter

Determines which elements are to be removed altogether i.e. converted to an empty string.

Issues

If you find HTML snippets (or even full websites) that don't produce the expected results, please open an issue!
