ColonelBuendia / rgpipe

Licence: GPL-2.0 License
lesspipe for ripgrep for common new filetypes using few dependencies

Programming Languages

shell
77523 projects

Projects that are alternatives of or similar to rgpipe

Office365FiddlerExtension
This Fiddler Extension is an Office 365 centric parser to efficiently troubleshoot Office 365 client application connectivity and functionality.
Stars: ✭ 23 (+9.52%)
Mutual labels:  excel, word, office, powerpoint
Docs2Pdf
Bulk convert word/powerpoint/excel file to pdf.
Stars: ✭ 27 (+28.57%)
Mutual labels:  excel, word, office, powerpoint
craXcel-cli
Command line application to unlock Microsoft Office password protected files.
Stars: ✭ 44 (+109.52%)
Mutual labels:  excel, word, office, powerpoint
Office Ribbonx Editor
An overhauled fork of the original Custom UI Editor for Microsoft Office, built with WPF
Stars: ✭ 205 (+876.19%)
Mutual labels:  excel, word, office, powerpoint
OfficeExtractor
Extracts embedded OLE objects from Word, Excel, PowerPoint, Open Office and RTF files without needing the original programs
Stars: ✭ 67 (+219.05%)
Mutual labels:  excel, word, office, powerpoint
Gotenberg Go Client
Go client for the Gotenberg API
Stars: ✭ 35 (+66.67%)
Mutual labels:  excel, word, powerpoint
Vbasync
Cross-platform tool to synchronize macros from an Office VBA-enabled file with a version-controlled folder
Stars: ✭ 98 (+366.67%)
Mutual labels:  excel, word, powerpoint
Desktopeditors
An office suite that combines text, spreadsheet and presentation editors allowing to create, view and edit local documents
Stars: ✭ 1,008 (+4700%)
Mutual labels:  excel, word, office
Npoi
A .NET library for reading and writing Microsoft Office binary and OOXML file formats.
Stars: ✭ 1,751 (+8238.1%)
Mutual labels:  excel, word, office
Gotenberg Php Client
PHP client for the Gotenberg API
Stars: ✭ 80 (+280.95%)
Mutual labels:  excel, word, powerpoint
Docxtemplater
Generate docx pptx and xlsx (Microsoft Word, Powerpoint, Excel documents) from templates, from Node.js, the Browser and the command line / Demo: https://www.docxtemplater.com/demo
Stars: ✭ 1,990 (+9376.19%)
Mutual labels:  excel, word, powerpoint
Rage
Rage allows you to execute any file in a Microsoft Office document.
Stars: ✭ 68 (+223.81%)
Mutual labels:  excel, word, powerpoint
Unioffice
Pure go library for creating and processing Office Word (.docx), Excel (.xlsx) and Powerpoint (.pptx) documents
Stars: ✭ 3,111 (+14714.29%)
Mutual labels:  excel, word, powerpoint
Mschart
📊 mschart: office charts from R
Stars: ✭ 94 (+347.62%)
Mutual labels:  word, office, powerpoint
Documentserver
ONLYOFFICE Document Server is an online office suite comprising viewers and editors for texts, spreadsheets and presentations, fully compatible with Office Open XML formats: .docx, .xlsx, .pptx and enabling collaborative editing in real time.
Stars: ✭ 2,335 (+11019.05%)
Mutual labels:  excel, word, office
Kkfileviewofficeedit
Online file preview and online editing of Office (Word, Excel, PPT) files
Stars: ✭ 234 (+1014.29%)
Mutual labels:  excel, word, office
Androiddocumentviewer
Android document viewer: Word, Excel, PPT, PDF, using mupdf and tbs
Stars: ✭ 235 (+1019.05%)
Mutual labels:  excel, word, office
flutter filereader
A Flutter plugin for viewing local files (PDF, Word, Excel, etc.); not an online preview
Stars: ✭ 101 (+380.95%)
Mutual labels:  excel, word
lazyExcel
A simple program like MS Excel that can run on multiple platforms...
Stars: ✭ 37 (+76.19%)
Mutual labels:  excel, office
filefy
A javascript library to produce downloadable files such as in CSV, PDF, XLSX, DOCX formats
Stars: ✭ 39 (+85.71%)
Mutual labels:  excel, word

rgpipe is a single bash/sh script and an alias to use with ripgrep to search through a myriad of file types that are otherwise not grep friendly. Use it with ripgrep's --pre option, which lets ripgrep selectively preprocess files before searching them.

TL;DR

The most basic usage is to point rgpipe at some file, and it will attempt to print the contents of said file to stdout.

rgpipe MyFancyExcelFile.xlsx

The more involved usage is as a filter in front of ripgrep to systematically attempt to grep through the contents of assorted non-text files much as you would text files. The basic incantation looks like:

rg --pre-glob '*.{xlsx,pptx,docx,pdf}' --pre rgpipe "$YourSearchTermHere"

Overview

I wrote up an extended gist about how to use it here

That gist is only useful because of the kind note by BurntSushi in this Hacker News comment explaining how rg --pre-glob works.

This helps grep through:

  • New MS Office files (DOCX, PPTX, XLSX, variants thereof)
    • Uses unzip and sed
  • Old MS Office files (DOC, PPT, XLS, variants thereof) and the new Excel binary format (XLSB)
    • Uses strings
  • LibreOffice files (ODS, ODT, ODP)
    • Uses unzip and sed
  • PDF
    • Uses pdftotext from poppler
  • Web/structured formats (HTML, XHTML ...)
    • Uses w3m; lynx and friends also work. Not 100% necessary.
  • Web formats disguised as books (chm, epub)
    • unzip and w3m for EPUB
    • 7zip and w3m for chm
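
The mapping from file type to tool in the list above can be sketched as a dispatch on the file extension. This is only an illustration of the idea, not the actual rgpipe script: the function name extractor_for, the exact extension list, and the cat fallback for everything else are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print which extraction approach from the list
# above would apply to a given file name. It does not extract anything.
extractor_for() {
    # Lower-case the extension so DOCX and docx are treated alike.
    ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        docx|pptx|xlsx|odt|ods|odp) echo "unzip+sed" ;;
        doc|ppt|xls|xlsb)           echo "strings" ;;
        pdf)                        echo "pdftotext" ;;
        html|htm|xhtml)             echo "w3m" ;;
        epub)                       echo "unzip+w3m" ;;
        chm)                        echo "7zip+w3m" ;;
        *)                          echo "cat" ;;  # assumed fallback: plain text
    esac
}
```

For example, extractor_for Budget.XLSX prints unzip+sed, and a file with an unlisted extension falls through to cat.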

Specifically

Ubuntu wants: sudo apt install poppler-utils p7zip w3m unzip

termux wants: pkg install poppler p7zip w3m
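
To see which helpers are still missing before searching, something like the following might do. The function name missing_tools is made up for this sketch, and the tool names are passed in by the caller because the binary names can differ between systems (the 7zip binary in particular may be 7z, 7za or 7zr).

```shell
#!/bin/sh
# Hypothetical helper: print every named tool that is not found in PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

A call such as missing_tools unzip sed strings pdftotext w3m 7z prints one line per tool that could not be found, and nothing if all of them are installed.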

Usage notes

Vanilla ripgrep usage

Assuming rgpipe is in your PATH; use /path/to/rgpipe if it's not.

rg --pre rgpipe YourSearchTermHere
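
For context, ripgrep runs the --pre command once per file, passing the file path as the only argument and the file contents on stdin, and then searches whatever the command writes to stdout. A toy preprocessor following that contract could look like the sketch below. toy_pre is a hypothetical stand-in, not rgpipe, and it handles .gz only because that keeps the example short.

```shell
#!/bin/sh
# toy_pre: hypothetical stand-in for a --pre command (not rgpipe itself).
# $1 is the path ripgrep would pass; the file contents arrive on stdin.
toy_pre() {
    case "$1" in
        *.gz) gzip -dc ;;  # decompress stdin so the text becomes searchable
        *)    cat ;;       # pass everything else through unchanged
    esac
}
```

rgpipe fills the same slot, with one branch per supported file type.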

Better ripgrep usage

The above runs rgpipe on every file, even when it's not needed, which is slow. ripgrep can apply it selectively with --pre-glob:

rg --pre-glob '*.{xlsx,pptx,docx,pdf}' --pre rgpipe YourSearchTermHere

A more thorough pre glob:

rg --pre-glob '*.{pdf,xl[tas][bxm],xl[wsrta],do[ct],do[ct][xm],p[po]t[xm],p[op]t,html,htm,xhtm,xhtml,epub,chm,od[stp]}' --pre rgpipe YourSearchTermHere
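
That glob is dense, so here is a rough way to check which names it is meant to cover. ripgrep does its own glob matching; the sketch below only approximates the same alternatives with a shell case pattern, and the function name needs_pre is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a file name matches one of the
# alternatives in the thorough pre-glob, rewritten as shell patterns.
needs_pre() {
    case "$1" in
        *.pdf|*.xl[tas][bxm]|*.xl[wsrta]|*.do[ct]|*.do[ct][xm]) return 0 ;;
        *.p[po]t[xm]|*.p[op]t|*.html|*.htm|*.xhtm|*.xhtml)      return 0 ;;
        *.epub|*.chm|*.od[stp])                                 return 0 ;;
        *) return 1 ;;  # anything else would be searched as-is
    esac
}
```

For instance, needs_pre report.xlsb and needs_pre slides.pptx succeed, while needs_pre notes.txt fails.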

An alias because that is a lot of typing

alias rgg="rg -i -z --max-columns-preview --max-columns 500 --hidden --no-ignore --pre-glob \
'*.{pdf,xl[tas][bxm],xl[wsrta],do[ct],do[ct][xm],p[po]t[xm],p[op]t,html,htm,xhtm,xhtml,epub,chm,od[stp]}' --pre rgpipe"

Poor man's full text search

Step 1: use rgpipe to make text sidecar files

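find-rgpipe-type() {
    # Writes report.txt next to report.pdf; the quoting keeps paths with spaces intact.
    find "$(pwd)" -type f -iname "*.$1" -exec sh -c 'for f; do rgpipe "$f" > "${f%.*}.txt"; done' _ {} +
}

# or get fancy with xargs to convert files in parallel (-P0 is a GNU xargs option)

find-rgpipe-type-xargs() {
    # Pass each path as $1 instead of splicing {} into the script, so odd filenames stay safe.
    # Writes report.pdf.txt next to report.pdf.
    find "$(pwd)" -type f -iname "*.$1" -print0 | xargs -0 -P0 -n 1 sh -c 'rgpipe "$1" > "$1.txt"' _
}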
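
Note that the two functions name their sidecar files differently: the first replaces the extension, the second appends .txt to the full name. The helpers below are hypothetical and only compute the resulting path, without running rgpipe, to make the difference visible.

```shell
#!/bin/sh
# Hypothetical helpers: print the sidecar path each naming scheme yields.
sidecar_replace_ext() {
    # Strip the last extension, then add .txt (report.pdf -> report.txt).
    printf '%s\n' "${1%.*}.txt"
}
sidecar_append_ext() {
    # Keep the full name and add .txt (report.pdf -> report.pdf.txt).
    printf '%s\n' "$1.txt"
}
```

With the replace scheme, report.pdf and report.docx in the same directory would both write to report.txt, so the append scheme may be the safer choice when several formats share a base name.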

Make text sidecars for all files with PDF extension under current directory using the function defined above.

find-rgpipe-type pdf

Step 2: Use ripgrep to search those files

rg YourSearchTermHere

Super useful

1 - this Hacker News comment

2 - The preprocessing script that is the template into which I added some more file types

3 - Midnight Commander has great scripts on this subject

4 - lesspipe of course

5 - rga is a Rust-based tool doing a similar thing

The name

rgpipe because the idea is similar to lesspipe.