
schulke-214 / Ter

License: MIT
Text Expression Runner – Readable and easy to use text expressions

Programming Languages

rust
11053 projects

Projects that are alternatives of or similar to Ter

Aho Corasick
A fast implementation of Aho-Corasick in Rust.
Stars: ✭ 424 (+532.84%)
Mutual labels:  text-processing
Chr
🔤 Lightweight R package for manipulating [string] characters
Stars: ✭ 18 (-73.13%)
Mutual labels:  text-processing
Lingua Franca
Mycroft's multilingual text parsing and formatting library
Stars: ✭ 51 (-23.88%)
Mutual labels:  text-processing
Ekphrasis
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (english Wikipedia, twitter - 330mil english tweets).
Stars: ✭ 433 (+546.27%)
Mutual labels:  text-processing
Whatlanggo
Natural language detection library for Go
Stars: ✭ 479 (+614.93%)
Mutual labels:  text-processing
Concise Ipython Notebooks For Deep Learning
Ipython Notebooks for solving problems like classification, segmentation, generation using latest Deep learning algorithms on different publicly available text and image data-sets.
Stars: ✭ 23 (-65.67%)
Mutual labels:  text-processing
Artificial Adversary
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stars: ✭ 348 (+419.4%)
Mutual labels:  text-processing
Javascript Text Expander
Expands texts as you type, naturally
Stars: ✭ 58 (-13.43%)
Mutual labels:  text-processing
Gohn
Hatena Notation (はてな記法) Parser written in Go
Stars: ✭ 17 (-74.63%)
Mutual labels:  text-processing
Pyparsing
Python library for creating PEG parsers
Stars: ✭ 1,052 (+1470.15%)
Mutual labels:  text-processing
Open Korean Text
Open Korean Text Processor - An Open-source Korean Text Processor
Stars: ✭ 438 (+553.73%)
Mutual labels:  text-processing
Python Nameparser
A simple Python module for parsing human names into their individual components
Stars: ✭ 462 (+589.55%)
Mutual labels:  text-processing
Fxt
A large scale feature extraction tool for text-based machine learning
Stars: ✭ 25 (-62.69%)
Mutual labels:  text-processing
Pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
Stars: ✭ 426 (+535.82%)
Mutual labels:  text-processing
Pipeit
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Stars: ✭ 57 (-14.93%)
Mutual labels:  text-processing
Bsed
Simple SQL-like syntax on top of Perl text processing.
Stars: ✭ 414 (+517.91%)
Mutual labels:  text-processing
Text Mining
Text Mining in Python
Stars: ✭ 18 (-73.13%)
Mutual labels:  text-processing
Applied Text Mining In Python
Repo for Applied Text Mining in Python (coursera) by University of Michigan
Stars: ✭ 59 (-11.94%)
Mutual labels:  text-processing
Go Search Replace
🚀 Search & replace URLs in WordPress SQL files.
Stars: ✭ 57 (-14.93%)
Mutual labels:  text-processing
Qp Trie Rs
An idiomatic and fast QP-trie implementation in pure Rust.
Stars: ✭ 47 (-29.85%)
Mutual labels:  text-processing

ter - Text Expression Runner

ter is a CLI for running text expressions and performing basic text operations such as filtering, ignoring and replacing on the command line. There are many great tools that do this job, but most of them have one thing in common: they are hard to memorize if you don't use them regularly. ter tries to solve this by providing a super simple CLI and expression language that is easy to memorize and well documented.

Quickstart

$ ter filter 'equals "foobar"' -m word                  # matches all occurrences of `foobar` in the text
$ ter filter 'length 20'                                # matches all lines with 20 chars
$ ter ignore 'numeric or special'                       # ignores all lines which contain only numbers and special chars
$ ter replace 'numeric and length 5' 12345 -m word      # replaces all 5-digit numbers with `12345`

Common tasks where ter is more readable than grep

  • Find all words containing a string
    ter:  ter filter 'contains "substr"' -m word
    grep: grep -oh "\w*substr\w*"
  • Find all lines in a file with a specific length
    ter:  ter filter 'length 10'
    grep: grep -x '.\{10\}'
  • Ignore all lines containing a string
    ter:  ter ignore 'contains "hide me"'
    grep: grep -v "hide me"
  • Replace all words following a specific pattern
    ter:  ter replace 'numeric and length 5' 12345 -m word
    grep: grep itself can't replace; you need to use sed for that (which gets even more complicated).
  • Replace all email addresses in a file with your email
    ter:  ter replace 'contains "@" and contains ".com"' [email protected] -m word
    grep: Same as above.
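
For comparison, the sed counterpart hinted at in the replace rows might look roughly like this. This is only an illustrative sketch using GNU sed; the file name input.txt is a placeholder and not part of ter:

$ sed -E 's/\b[0-9]{5}\b/12345/g' input.txt    # replace every standalone 5-digit number with 12345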

When to use other tools

As noted above, ter is not a direct competitor to grep, awk and the like. If you find yourself reaching the limits of the text expression language, you probably want to use one of those more advanced tools.

Installing

At the moment ter can only be installed via cargo:

$ cargo install ter

Documentation

There are the following global options:

  • -m / --mode, sets the operation mode, can be either line or word, defaults to line

And there are the following global flags:

  • -f / --first, print only the first match if available
  • -l / --last, print only the last match if available
  • --skip n, skip the first n matches
  • --limit n, show at most n matches
Each subcommand follows the same pattern:

ter filter [FLAGS] [OPTIONS] <EXPRESSION> [FILE]
ter ignore [FLAGS] [OPTIONS] <EXPRESSION> [FILE]
ter replace [FLAGS] [OPTIONS] <EXPRESSION> <REPLACEMENT> [FILE]
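
The global options and flags compose with any subcommand; a small sketch, where the file name server.log is a placeholder and not something ter ships with:

$ ter filter --skip 2 --limit 5 'contains "error"' server.log    # skip the first two matching lines, then print at most five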

If no file is provided, ter tries to read from stdin.

Examples

$ docker ps | ter filter 'alphanumeric and length 12' -m word # prints all docker container ids
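
A couple more pipelines in the same spirit; these sketches rely only on the attributes documented below:

$ ls | ter filter 'ends ".rs"'                # keeps only entries ending in .rs
$ cat /etc/hosts | ter ignore 'starts "#"'    # drops comment lines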

The Text Expression Language

This is a super simple format for writing readable, easy-to-memorize text processing expressions. There are many great and far more advanced languages and tools for processing text on the command line, but most of them share one problem: they are hard to read and hard to memorize if you don't use them often.

The Text Expression Language provides only nine attributes to query by. Each attribute describes a property of the string that is tested against it.

Attribute          Resolves to true if the tested string
starts <str>       starts with the given string
ends <str>         ends with the given string
contains <str>     contains a substring equal to the given string
equals <str>       exactly equals the given string
length <int>       has the given length
numeric            contains only numeric chars
alpha              contains only alphabetic chars
alphanumeric       contains only alphanumeric chars
special            contains only special chars
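
To get a feel for a single attribute, it can help to pipe one value in at a time; a quick sketch assuming ter is installed:

$ echo "12345" | ter filter 'numeric'     # prints 12345, the line contains only digits
$ echo "hello" | ter filter 'length 4'    # prints nothing, the line is 5 chars long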

Currently there are only two binary logical operators: and and or.

Operator    Boolean algebra
and         Conjunction
or          Disjunction

Attributes can be chained together with logical operators.

Examples

starts "FOO" and ends "BAR"
contains "@" and contains ".com"
length 5 and length 10
numeric and length 8
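
Any of these expressions drops straight into a subcommand; for instance (contacts.txt is just a placeholder file):

$ ter filter -m word 'contains "@" and contains ".com"' contacts.txt    # prints every word that looks like a .com email address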

Limitations

This syntax might not cover all use cases, and it's not meant to. If you find yourself reaching the limits of this language, you might want to use more advanced tools (such as awk, grep, or sed).


The code for the language itself lives in a separate repository.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].