jbelmont / unix-programming-and-regular-expressions-workshop

Licence: MIT license
A workshop on Unix Programming Principles using tools such as grep, sed, awk, shell programming and regular expressions

Unix Programming and regular expressions workshop

Sections:

Unix History

  • Shell Scripting was developed in the context of the UNIX Operating System from Bell Labs

  • Early UNIX systems packed incredible power into very small machines

    • 64 KB "virtual" address space for the code and for data
    • This was often less than the physical memory on the early PDP-11s
  • Source Code made it easy to experiment and change the system

  • Unix was shaped at AT&T Bell Labs by the likes of Ken Thompson, Dennis Ritchie, and others

Quote from Dennis Ritchie on the vision for Unix:

What we wanted to preserve was not just a good environment in which to do programming, but a system around which a fellowship could form. We knew from experience that the essence of communal computing, as supplied by remote-access, time-shared machines, is not just to type programs into a terminal instead of a keypunch, but to encourage close communication.

  • Unix Developers were the users of the system and they developed tools to solve their own problems
  • Unix Developers were given freedom to experiment and rewrite Unix as needed
  • Unix was designed in a quest for elegance

Unix Software Philosophy

Software Tools Book and Software Tools in Pascal

  • Programs should be like specialized tools in a carpenter's toolbox

    • Avoid creating one program to rule them all
    • Don't create programs that are like a Swiss Army knife, meaning they do too much
      • One simple example is sorting. You can do one of two things:
        • Write a bunch of programs to do various tasks, each of which has an option to sort its output
        • Choose a common representation for your system (e.g. streams of ASCII text), create a mechanism for composing pieces of the system (Unix pipes), and only write the sorting functionality once
      • A less simple example would be the LLVM compiler (and compilers in general), which uses an intermediate representation (IR) that is understood and operated on by all pieces of the system as the compiler does all the passes required to generate its target (e.g. lexing, parsing, optimization, code generation, etc.)
      • In general, choosing a common representation for a system turns the problem of interfacing the various pieces from an m*n problem into an m+n problem (where m is the number of different outputs in the first stage and n is the number of inputs received in the second stage)
    • Tools can be combined using pipelines and the shell to get your work done
      • One famous example is Doug McIlroy's short shell pipeline for counting word frequencies, compared to Donald Knuth's much longer program for the same task
    • This philosophy was popularized by the Kernighan & Plauger books
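
The pipeline idea above can be sketched as follows. This is a reconstruction in the spirit of McIlroy's word-frequency pipeline, not a quotation of it. The function name top_words, its argument, and the sample sentence are assumptions made for illustration.

```shell
# top_words N: print the N most frequent words read from standard input.
# The function name and its argument are illustrative assumptions.
top_words() {
    tr -cs 'A-Za-z' '\n' |   # put each word on its own line
    tr 'A-Z' 'a-z' |         # fold everything to lower case
    sort |                   # bring identical words together
    uniq -c |                # count each distinct word
    sort -rn |               # most frequent first
    sed "${1}q"              # stop after N lines
}

# Made-up sample input: "the" appears three times, "and" twice.
printf 'the cat and the dog and the bird\n' | top_words 2
```

Each stage does one small job and knows nothing about the others. Sorting is written once, in sort, and reused twice in the same pipeline.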

    Do One Thing Well

    Programs are easier to:

    1. Write and correct

    2. Document

    3. Understand and use

    The cat command originally only concatenated files

    The cp command copies files

    The mv command moves and renames files

Process Lines of Text

Using Text as the main data format has advantages:

  • Text is easy to process with existing and new tools

  • Text can be edited with any text editor

  • Text is portable across networks and machine architectures

For example, to list some popular baby names and sort them:

cat data/top-10-baby-names-2016.txt | awk '{print $2 }' | sort

Use Regular Expressions

  • Regular Expressions provide powerful text matching and substitution

Two flavors of regular expressions are standardized by POSIX:

  1. Basic Regular Expressions (BREs)
    • grep, sed, ...
  2. Extended Regular Expressions (EREs)
    • egrep, awk, ...
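
To make the difference between the two flavors concrete, here is a minimal sketch that runs the same pattern through grep (BRE) and grep -E (ERE). The sample lines and the helper names sample, match_bre, and match_ere are made up for illustration.

```shell
# Sample lines used for both searches (made up for illustration).
sample() { printf 'a+b\naab\nab\n'; }

# BRE (plain grep): "+" is an ordinary character,
# so only the line containing the literal text a+b matches.
match_bre() { sample | grep 'a+b'; }

# ERE (grep -E): "+" means "one or more of the preceding item",
# so the lines aab and ab match, and the literal a+b line does not.
match_ere() { sample | grep -E 'a+b'; }

match_bre
match_ere
```

The same characters mean different things depending on the flavor, so it helps to know which flavor a tool expects before writing a pattern for it.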

Default to Standard Input/Output

Use Standard Input/Output (I/O) when there are no files named on the command line:

  • Helps simplify writing programs
  • Helps you hook programs together with pipelines
  • Helps encourage programs to do one thing well
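
A minimal sketch of this convention, assuming a hypothetical filter named upcase: it processes the files named as arguments, and falls back to standard input when none are given.

```shell
# upcase [FILE...]: hypothetical filter that upper-cases its input.
upcase() {
    # With no arguments, "$@" expands to nothing and cat reads standard input;
    # otherwise cat reads the named files in order.
    cat -- "$@" | tr '[:lower:]' '[:upper:]'
}

# No file given, so the filter reads what the pipeline hands it.
printf 'hello\n' | upcase
```

Because the fallback is handled by cat, the filter itself needs no special case for either mode, and it drops into the middle of a pipeline without changes.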

Don't Be Chatty

  • Status messages that are mixed with standard output confuse programs downstream
  • If you ask for it, you get it. Don't prompt with 'Are you sure?'
  • Do know what you are doing before running a command like this:
    • rm -rf /
    • This will delete everything starting from the root directory
  • We have version control systems such as Git, so use them
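
One common way to keep status messages out of the data stream is to write them to standard error. The sketch below assumes a hypothetical function named count_lines; the message text is made up.

```shell
# count_lines: hypothetical filter that reports how many lines it read.
count_lines() {
    echo "count_lines: reading standard input" >&2   # status goes to stderr
    wc -l | tr -d ' '                                 # result goes to stdout
}

# The downstream awk sees only the number, never the status message.
printf 'a\nb\nc\n' | count_lines | awk '{ print $1 * 2 }'
```

If the status line were written to standard output instead, awk would receive it as data and the arithmetic downstream would be wrong.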

Make sure to use the input format for output

  • If your text is structured, then after processing:
    • Write standard output in the same format as standard input
    • Doing this lets you build specialized tools that work together
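
As a sketch of this idea, assume colon-separated records of the form id:name:shell (a made-up layout). The hypothetical filter below changes one field and writes the record back out with the same separator, so another field-oriented tool can consume its output.

```shell
# upcase_name: hypothetical filter for colon-separated records (id:name:shell).
upcase_name() {
    # Setting OFS to match FS makes awk rebuild each record with ":" again.
    awk 'BEGIN { FS = OFS = ":" } { $2 = toupper($2); print }'
}

# Because the output is still colon-separated, cut can pick out a field.
printf '1:alice:sh\n2:bob:zsh\n' | upcase_name | cut -d: -f2
```

Without the OFS assignment, awk would rejoin the modified record with spaces, and the cut stage would no longer find its fields.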

Write specialized tools if they do not exist

  • At times a tool does not exist. That is when you need to write the tool. Ask yourself:
    • Can the tool be useful to other people?
    • Can the tool be generalized?

If the answer to either of these questions is yes:

  • Write a general-purpose tool

  • Scripting languages can often be used to write a software tool:

    • Awk
    • Perl
    • Python
    • Ruby
    • Shell
  • You can also use other languages, for example Go (Golang), as we will see

Software Tools Summary

  • Using the software tools approach helps provide a framework and a mindset for programming and scripting
  • You can combine software tools to solve software problems
    • This strategy in turn gives you flexibility and helps promote innovation
  • Knowing your tools and thinking in terms of the Software Tools philosophy will improve your scripting

Self Contained Shell Scripts

Executable Definition

In computing, executable code or an executable file or executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instructions,"[1] as opposed to a data file that must be parsed by a program to be meaningful.

Typically, a high-level language is used that compiles to executable machine-code files

  • Executable scripts typically start with a shebang line such as #! /bin/bash or the like
    • An optional argument can be provided after the interpreter path
    • Some Unix systems have small limits on the length of the interpreter path name

Shell Scripts can be simple executable text files that contain shell commands.

  • Keep in mind that this only works if the shell script is written in the same language as the interactive shell
    • For example, do not expect a zsh shell script to run in a bash environment
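
A minimal sketch of a self-contained script follows. The file name hello.sh and the scratch directory are assumptions for illustration, and the shebang uses /bin/sh instead of /bin/bash only to keep the sketch portable.

```shell
# Build a tiny script in a scratch directory (hello.sh is a made-up name).
workdir=$(mktemp -d)
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/sh
# The shebang above tells the kernel which interpreter runs this file.
echo "hello from a self-contained script"
EOF

chmod +x "$workdir/hello.sh"   # mark the text file as executable
"$workdir/hello.sh"            # run it directly, without typing "sh hello.sh"
```

Because the interpreter is named in the file itself, the script runs the same way regardless of which interactive shell launches it.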