andrewbihl / Bsed

Licence: MIT
Simple SQL-like syntax on top of Perl text processing.

bsed

Simple SQL-like syntax on top of Perl text processing. Designed to replace simple uses of sed/grep/AWK/Perl.

Bsed is a stream editor. In contrast to interactive text editors, stream editors process text in one go, applying a command to an entire input stream or open file.

Some example commands:

  • bsed contacts.csv delete lines containing '[email protected]'
  • bsed giant_malformatted.json replace "\'" with '\"' | bsed replace 'True' with 'true' | bsed replace 'False' with 'false'
  • bsed file.txt append 'Yahoo' with '!'
  • bsed file.py on lines 25 to 75 replace 'San Francisco' with 'San Diego'
  • bsed file.py append lines containing 'deprecated_package' with ' # TODO: Update module'

Quick Start

  1. Install bsed
    • pip3 install --upgrade bsed
  2. Register bsed for autocompletion
    • echo 'eval "$(register-python-argcomplete bsed)"' >> ~/.bash_profile

Open a new shell (or run source ~/.bash_profile). Run bsed commands for some common usages, and run bsed help for info on flag options.

Command types

Currently, there are two types of syntax:

  1. Word or entire-line operations*
    • select/delete/append/prepend/wrap {some pattern or line filter}
  2. Line filters
    • lines containing/starting with/ending with {some pattern}
    • lines {m} to {n}

As a result, three types of commands can be created:

  • Word operations
    • append "google" with "\.com"
  • Line operations
    • delete lines containing "UNC"
  • Line filter + word operations
    • on lines containing "source-url" prepend "www\..*\.com" with "https://"

* select is only for line operations currently.
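
One way to picture how the two syntax types combine is as a line predicate plus a per-line transformation. The Python sketch below only illustrates the semantics described above; it is not how bsed is implemented, and the names lines_containing, prepend, and apply_on_lines are hypothetical.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical helpers modelling the command types above; not part of bsed.

def lines_containing(pattern: str) -> Callable[[str], bool]:
    """Line filter: true for lines with at least one regex match."""
    regex = re.compile(pattern)
    return lambda line: regex.search(line) is not None

def prepend(pattern: str, text: str) -> Callable[[str], str]:
    """Word operation: insert `text` before every match of `pattern`."""
    regex = re.compile(pattern)
    return lambda line: regex.sub(lambda m: text + m.group(0), line)

def apply_on_lines(
    lines: Iterable[str],
    line_filter: Callable[[str], bool],
    operation: Callable[[str], str],
) -> List[str]:
    """Apply the word operation only to lines accepted by the filter."""
    return [operation(line) if line_filter(line) else line for line in lines]

sample = [
    "source-url: www.example.com",
    "mirror: www.example.com",
]

# Models: on lines containing "source-url" prepend "www\..*\.com" with "https://"
result = apply_on_lines(
    sample,
    lines_containing("source-url"),
    prepend(r"www\..*\.com", "https://"),
)

for line in result:
    print(line)
```

Only the first sample line changes, because the line filter rejects the second one before the word operation is applied.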

Motivation

TLDR: Most batch text processing tools are too complex for the very occasional user. bsed has simple, English syntax with bash autocomplete so that you don't have to go searching Stack Overflow each time you need sed/AWK/Perl.

Many common text transformations are fit for tools such as grep, sed, and AWK. These utilities allow for fast modification of text in one operation (as opposed to interactive text editors). Being command line tools, they also allow for piping of outputs into subsequent commands. Finally, they are common default software on many systems, making them easy to rely on and easy to find support for.

Some usage examples include:

  • Getting lines from a file containing a word
  • Find-and-replace
  • Deleting, replacing, or clearing lines containing a regex pattern match
  • Placing text at the beginning or end of certain lines
  • Getting a range of line numbers

Problems with grep/sed/AWK

  1. People don't know which tool to use
  2. Inconsistent levels of regex support
  3. Inconsistent levels of efficiency

Enter Perl

Perl solves these issues (in theory) by providing a one-stop shop for all of these uses. Perl one-liners cover the grep, sed, and AWK use cases, and their syntax is designed to mimic that of sed. Furthermore, Perl includes advanced regex support and is in many cases more efficient than its counterparts.

Perl one-liners can be executed at the command line like the other text utilities. Perl is also commonly installed by default on popular operating systems. In conclusion, Perl is functionally the best general choice for stream editing.

Why not use Perl?

In practice, few people know sed well enough to fire off commands from memory. For the casual or infrequent user, the usual path to success is to search Stack Overflow for a quick sed command they can parse and tweak for their purposes.

Even fewer people know Perl, as the syntax proves to be even more daunting and difficult to remember than sed.

For example, a user may wish to perform a find-and-replace, changing "Jack" to "Jill".

AWK: awk '{gsub(/Jack/,"Jill"); print}' file.txt

Sed: sed -i 's/Jack/Jill/g' file.txt

Perl: perl -p -i -e 's/Jack/Jill/g' file.txt

None of these is particularly intuitive, and the details of the syntax are complex even for the simplest of commands. To the beginning user, none of the following is obvious:

  • What is the difference between {} and () in the AWK command?
  • What is -i in sed? What are the s or the g for?
  • Why single quotes as opposed to double quotes? Are these interchangeable?
  • What are those flags in Perl?

As a point of contrast, consider the structure of SQL:

SELECT email FROM User WHERE country='Argentina';

You don't need to know SQL to be able to understand the purpose of the command. Because of its intuitive syntax, a day's usage of SQL is sufficient to recall the basics for years.

For the average user

The most common use case is a one-off command to transform a single file. Because of this, learning Perl (or sed, for that matter) is often not worth the upfront time investment.

Use bsed for basic tasks

To solve this, bsed implements many common command types in an understandable English syntax designed to be as usable as SQL. Some examples of uses:

bsed file.txt select lines 0 to 50

  • Print first 50 lines, indexed from zero.

bsed file.py clear lines starting with '\s*#'

  • Replace comment lines in a Python file with blank lines

bsed file.csv delete lines containing 'Andrew Johnson'

  • Remove any records with this person's name in the CSV

bsed performance_review.txt wrap 'Employee of the Month' with '\"'

  • Puts the phrase "Employee of the Month" in quotes

bsed data.csv on lines 0 to 2000 select lines containing 'San Diego'

  • Finds records on the first 2000 lines referencing the city. Good for quick exploration of very large files.

bsed customer_info.txt replace 'Jim Johnson' with 'John Johnson' | bsed replace '[email protected]' with '[email protected]'

  • Fix a mistaken first name. Notice commands are chained together with |.
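
To make the intent of a few of these commands concrete, the sketch below models delete, clear, and wrap with Python's re module. It illustrates the behaviour described in the bullets above and is not bsed's implementation; the function names are hypothetical, and details such as how bsed treats newlines are not modelled.

```python
import re
from typing import List

# Hypothetical models of three bsed operations; not part of bsed itself.

def delete_lines_containing(lines: List[str], pattern: str) -> List[str]:
    """Drop every line that contains a match for `pattern`."""
    return [line for line in lines if not re.search(pattern, line)]

def clear_lines_starting_with(lines: List[str], pattern: str) -> List[str]:
    """Replace lines that start with `pattern` by empty lines."""
    # re.match only matches at the beginning of the string.
    return ["" if re.match(pattern, line) else line for line in lines]

def wrap(lines: List[str], pattern: str, text: str) -> List[str]:
    """Put `text` on both sides of every match of `pattern`."""
    return [
        re.sub(pattern, lambda m: text + m.group(0) + text, line)
        for line in lines
    ]

records = ["Andrew Johnson,San Diego", "Jim Johnson,Austin"]
kept = delete_lines_containing(records, "Andrew Johnson")

source = ["# setup", "x = 1", "    # indented comment", "print(x)  # trailing"]
cleared = clear_lines_starting_with(source, r"\s*#")

review = ["Employee of the Month: Jill"]
quoted = wrap(review, "Employee of the Month", '"')

print(kept)
print(cleared)
print(quoted)
```

In this model a trailing comment after code is left alone by clear, because the pattern has to match at the start of the line.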

Use the -t flag to learn or debug

Any command can be run with the -t flag, in which case its translation is printed instead of executed.

This is useful for debugging regexes, building up more complex queries, or just learning some Perl through examples. Without having to remember Perl from scratch, you can get a quick command structure and then modify it or build on it.
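
To give a feel for what a translation step could look like, here is a minimal Python sketch that turns the pieces of a replace command into a Perl one-liner of the same shape as the find-and-replace example earlier in this README. The function translate_replace is hypothetical and the output format is an assumption; what bsed -t actually prints may differ.

```python
import shlex

def translate_replace(pattern: str, replacement: str, filename: str) -> str:
    """Build a Perl one-liner for a global find-and-replace.

    Hypothetical helper, not bsed's real translator. It assumes that
    neither argument contains an unescaped "/" delimiter.
    """
    expression = f"s/{pattern}/{replacement}/g"
    # shlex.quote adds shell quoting only when the argument needs it.
    parts = ["perl", "-p", "-i", "-e", shlex.quote(expression), shlex.quote(filename)]
    return " ".join(parts)

simple = translate_replace("Jack", "Jill", "file.txt")
spaced = translate_replace("San Francisco", "San Diego", "file.py")

print(simple)
print(spaced)
```

Printing the command rather than running it mirrors the idea behind -t: the output can be inspected, edited, and then pasted into a shell.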
