benbalter / Word To Markdown

License: MIT
A Ruby gem to liberate content from Microsoft Word documents

Projects that are alternatives of or similar to Word To Markdown

Gotenberg
A Docker-powered stateless API for PDF files.
Stars: ✭ 3,272 (+169.08%)
Mutual labels:  markdown, word, libreoffice
Breakdance
It's time for your markup to get down! HTML to markdown converter. Breakdance is a highly pluggable, flexible and easy to use.
Stars: ✭ 418 (-65.62%)
Mutual labels:  markdown, converter
Etherpad Lite
Etherpad: A modern really-real-time collaborative document editor.
Stars: ✭ 11,937 (+881.66%)
Mutual labels:  libreoffice, word
Pandoc
Universal markup converter
Stars: ✭ 24,250 (+1894.24%)
Mutual labels:  markdown, converter
Evernote2md
Convert Evernote .enex files to Markdown
Stars: ✭ 193 (-84.13%)
Mutual labels:  markdown, converter
Pdf
Simple http microservice that converts Word documents to PDF
Stars: ✭ 107 (-91.2%)
Mutual labels:  libreoffice, word
Zettlr
A Markdown Editor for the 21st century.
Stars: ✭ 6,099 (+401.56%)
Mutual labels:  libreoffice, markdown
Html To Markdown
⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules.
Stars: ✭ 155 (-87.25%)
Mutual labels:  markdown, converter
Online Markdown
An online Markdown converter designed for WeChat public account formatting.
Stars: ✭ 812 (-33.22%)
Mutual labels:  markdown, converter
Gotenberg Go Client
Go client for the Gotenberg API
Stars: ✭ 35 (-97.12%)
Mutual labels:  markdown, word
Ods2md
Convert LibreOffice Calc Spreadsheets (*.ods) into Markdown tables.
Stars: ✭ 35 (-97.12%)
Mutual labels:  libreoffice, markdown
Markdown Pdf
📄 Markdown to PDF converter
Stars: ✭ 2,365 (+94.49%)
Mutual labels:  markdown, converter
Yarle
Yarle - The ultimate converter of Evernote notes to Markdown
Stars: ✭ 170 (-86.02%)
Mutual labels:  markdown, converter
Showdown
A bidirectional Markdown to HTML to Markdown converter written in Javascript
Stars: ✭ 12,137 (+898.11%)
Mutual labels:  markdown, converter
Gulp Markdown Pdf
Markdown to PDF
Stars: ✭ 56 (-95.39%)
Mutual labels:  markdown, converter
Mdpdf
Markdown to PDF command line app with support for stylesheets
Stars: ✭ 512 (-57.89%)
Mutual labels:  markdown, converter
Pandiff
Prose diffs for any document format supported by Pandoc
Stars: ✭ 110 (-90.95%)
Mutual labels:  markdown, word
Europa
Pure JavaScript library for converting HTML into valid Markdown
Stars: ✭ 143 (-88.24%)
Mutual labels:  markdown, converter
Ox Hugo
A carefully crafted Org exporter back-end for Hugo
Stars: ✭ 591 (-51.4%)
Mutual labels:  markdown, converter
Mybox
Easy tools of document, image, file, network, location, color, and media.
Stars: ✭ 45 (-96.3%)
Mutual labels:  markdown, converter

Word to Markdown converter

A Ruby gem to liberate content from the jail that is Word documents

The problem

Our default content publishing workflow is terribly broken. We've all been trained to make paper, yet today, content authored once is more commonly consumed in multiple formats, and rarely, if ever, does it embody physical form. Put another way, our go-to content authoring workflow remains relatively unchanged since it was conceived in the early 80s.

I'm asked regularly by government employees — knowledge workers who fire up a desktop word processor as the first step to any project — for an automated pipeline to convert Microsoft Word documents to Markdown, the lingua franca of the internet, but as my recent foray into building just such a converter proves, it's not that simple.

Markdown isn't just an alternative format. Markdown forces you to write for the web.

Just want to convert a Microsoft Word (or Google) document to Markdown?

You can use the hosted service at word2md.com (or check out its source, word-to-markdown-server).

Install

You'll need to install LibreOffice. Then:

gem install word-to-markdown

Usage

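# Load the gem (assumes the require name matches the gem name)
require 'word-to-markdown'

# Wrap a Word document; conversion relies on LibreOffice being installed
file = WordToMarkdown.new("/path/to/document.docx")
# => <WordToMarkdown path="/path/to/document.docx">

# The converted Markdown as a string
file.to_s
# => "# Test\n\n This is a test"

# The intermediate HTML, parsed with Nokogiri
file.document.tree
# => <Nokogiri Document>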
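
As a rough illustration of how the API above might be used in a script, the sketch below converts a document and saves the result next to it with a .md extension. The helpers markdown_path_for and convert_to_file are hypothetical names introduced here, not part of the gem, and the default converter assumes the gem and LibreOffice are installed.

```ruby
# Hypothetical helper: swap the document's extension for .md.
def markdown_path_for(docx_path)
  File.join(File.dirname(docx_path), "#{File.basename(docx_path, '.*')}.md")
end

# Hypothetical helper: convert one document and write the Markdown beside it.
# The converter is injectable so the file handling can be tried without
# LibreOffice; by default it calls the gem as in the usage example.
def convert_to_file(docx_path, converter: nil)
  converter ||= lambda do |path|
    require 'word-to-markdown' # assumes the gem and LibreOffice are installed
    WordToMarkdown.new(path).to_s
  end
  out_path = markdown_path_for(docx_path)
  File.write(out_path, converter.call(docx_path))
  out_path
end
```

Calling convert_to_file("/path/to/document.docx") would then write /path/to/document.md and return that path.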

Command line usage

Once you've installed the gem, it's just:

$ w2m path/to/document.docx

This writes the resulting Markdown to stdout.
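
Because w2m prints to stdout, it can also be driven from another Ruby program. The following is a minimal sketch using the standard library's Open3; the wrapper name w2m_markdown and its executable keyword are hypothetical, and it assumes w2m exits with a non-zero status when conversion fails.

```ruby
require 'open3'

# Hypothetical wrapper: run the w2m executable and return what it prints.
# Assumes a non-zero exit status signals a failed conversion.
def w2m_markdown(docx_path, executable: 'w2m')
  stdout, stderr, status = Open3.capture3(executable, docx_path)
  raise "#{executable} failed: #{stderr.strip}" unless status.success?
  stdout
end
```

Passing the command and its argument separately to Open3.capture3 avoids shell interpolation, so paths containing spaces need no quoting.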

Supports

  • Paragraphs
  • Numbered lists
  • Unnumbered lists
  • Nested lists
  • Italic
  • Bold
  • Explicit headings (e.g., selected as "Heading 1" or "Heading 2")
  • Implicit headings (e.g., text with a larger font size relative to paragraph text)
  • Images
  • Tables
  • Hyperlinks
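
To make the idea of implicit headings concrete, here is a small sketch of one possible heuristic: treat the most common font size as body text and map each larger size to a heading level, largest first. This is an illustration only and not necessarily how the gem detects headings; heading_levels and to_markdown_line are hypothetical names.

```ruby
# Hypothetical illustration, not the gem's implementation.
# The most frequent font size is taken as body text; each distinct larger
# size becomes a heading level, largest first, capped at level 6.
def heading_levels(font_sizes)
  return {} if font_sizes.empty?

  counts = font_sizes.group_by { |size| size }
  body_size = counts.max_by { |_size, group| group.length }.first
  larger = font_sizes.uniq.select { |size| size > body_size }.sort.reverse
  larger.each_with_index.map { |size, index| [size, [index + 1, 6].min] }.to_h
end

# Prefix the text with "#" marks if its font size maps to a heading level.
def to_markdown_line(text, font_size, levels)
  level = levels[font_size]
  level ? "#{'#' * level} #{text}" : text
end

levels = heading_levels([24, 12, 12, 18, 12, 12])
puts to_markdown_line("Title", 24, levels)    # prints "# Title"
puts to_markdown_line("Section", 18, levels)  # prints "## Section"
puts to_markdown_line("Body text", 12, levels) # prints "Body text"
```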

Requirements and configuration

Word-to-markdown requires soffice, a command-line interface to LibreOffice that works on Linux, Mac, and Windows. To install soffice, see the LibreOffice documentation.
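
Before running a conversion, it can help to confirm that soffice is reachable. The sketch below searches the PATH for an executable; the helper name which is hypothetical and not part of the gem, and a nil result does not prove LibreOffice is missing, since some installs place soffice outside the PATH.

```ruby
# Hypothetical helper: return the full path of an executable on PATH, or nil.
def which(command, path: ENV.fetch('PATH', ''))
  # On Windows, PATHEXT lists executable extensions; elsewhere it is unset,
  # so only the bare command name is tried.
  extensions = ENV.fetch('PATHEXT', '').split(';') << ''
  path.split(File::PATH_SEPARATOR).each do |dir|
    extensions.each do |ext|
      candidate = File.join(dir, "#{command}#{ext}")
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

soffice = which('soffice')
if soffice
  puts "Found soffice at #{soffice}"
else
  warn 'soffice not found on PATH; LibreOffice may be missing or installed elsewhere'
end
```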

Testing

script/cibuild

Docker

First, create the Gemfile.lock by installing the dependencies:

bundle install

Then build the image and run the executable locally:

docker-compose build
docker-compose run --rm app bundle exec w2m --help
docker-compose run --rm app bundle exec w2m test/fixtures/em.docx

Hosted service

Word-to-markdown-server contains a lightweight server for converting Word documents as a service. A live version runs at word2md.com.
