A list of topics and anchors that the blog and other docs link to.
#YSH
YSH —
A legacy-free dialect of shell with:
the var keyword
echo $[join(mylist)]
try and append
You run it with bin/ysh.
Important: Before March 2023, this shell language was called Oil. We will clean up the many references to the old name over time.
For a taste of the syntax, see The Simplest Explanation of Oil and A Tour of YSH.
It shares the same runtime as OSH, so it's a smooth upgrade from both bash and OSH. Compatibility is selectively broken with Shell Options.
#OSH
OSH —
A compatible shell language based on the common use of shell (including
POSIX, bash, and others). The design criteria for the language are:
You run it with bin/osh.
In addition, it has four features that justify a new shell: reliable error handling, safe processing of user-supplied data, lack of "quoting hell", and better error messages and tools. These features are opt-in, as OSH is compatible by default.
#oil-language
Oil Language —
The old name for the shell language influenced by Python, JavaScript, and Ruby.
In March 2023, Oil was renamed to YSH. We will no longer refer to the "Oil language", but there are still many links that point to this entry.
#osh-language
OSH Language —
A synonym for OSH, or bin/osh.
#headless-shell
Headless Shell —
A mechanism to move the interactive shell into another process, outside of
the Oils core. The Oils project is focused on a language for automation and
glue, as opposed to a user interface.
Also see blog posts tagged #headless.
#FANOS
FANOS: File descriptors And Netstrings Over Sockets —
A protocol we invented for shells and GUIs to communicate. Key idea: the GUI passes file descriptors pointing to a terminal to the shell via a Unix domain socket. The shell's child processes will inherit those descriptors, which allows ls --color to work as usual. That is, when ls calls isatty(), it will work correctly and return true.
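Here is a minimal sketch of the key idea in Python (3.9+, Unix only). It is not the actual FANOS implementation: it only shows a netstring-framed message travelling over a Unix domain socket together with a file descriptor. A pipe stands in for the terminal, and the names encode_netstring, decode_netstring, and demo are made up for illustration.

```python
import os
import socket


def encode_netstring(payload: bytes) -> bytes:
    """Frame a payload as <length>:<payload>, (the netstring format)."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(data: bytes) -> bytes:
    """Decode a single netstring; raise ValueError if it is malformed."""
    length, sep, rest = data.partition(b":")
    if not sep:
        raise ValueError("missing ':' in netstring")
    n = int(length)
    if len(rest) != n + 1 or rest[n:] != b",":
        raise ValueError("bad netstring length or terminator")
    return rest[:n]


def demo():
    # Hypothetical stand-ins: 'gui' and 'shell' are the two ends of a Unix
    # domain socket, and a pipe plays the role of the terminal.
    gui, shell = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    received = []
    try:
        # GUI side: send a command plus the descriptor the shell should use.
        socket.send_fds(gui, [encode_netstring(b"echo hi")], [write_end])

        # Shell side: receive the message and the passed descriptor.
        data, received, _flags, _addr = socket.recv_fds(shell, 1024, 1)
        command = decode_netstring(data)

        # Anything written to the received descriptor reaches the GUI's end.
        os.write(received[0], b"hi\n")
        return command, os.read(read_end, 1024)
    finally:
        for fd in [read_end, write_end, *received]:
            os.close(fd)
        gui.close()
        shell.close()
```

In a real setup the descriptor would point at a terminal rather than a pipe, which is what lets isatty() return true in child processes.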
#eggex
Egg Expression —
The regular expression syntax for YSH, which has pattern composition and
seamless integration with egrep, awk, and other Unix tools.
It resembles Perl-style regex syntax, but literals are quoted and you can use
whitespace to make patterns more readable.
#hay
Hay - Hay Ain't YAML —
A YSH feature that lets you declare data with the same syntax as code, in a Lisp-like fashion. Code and data can be interleaved, which is useful for config files and internal DSLs.
#mycpp
mycpp —
A tool that translates a subset of statically-typed Python to C++. It
translates a large part of the Oils interpreter, but it's not a
general-purpose translator.
It depends on MyPy, and you can think of it as a hybrid between the recent mypyc compiler and the old Shed Skin compiler.
#opy
OPy —
A Python bytecode compiler based on pgen2 and
compiler2. This small piece of code allows us to adapt Python to
the needs of the Oils project. See Building Oil with the OPy Bytecode
Compiler.
As of December 2019, we expect OPy to be replaced by mycpp, which generates faster code.
#boil
Boil —
(obsolete)
The working name for the part of Oils that subsumes GNU Make. No code for this
exists yet.
#oil-native
oil-native —
The build of Oils translated to C++ with mycpp. The resulting shell
is 100% native code: i.e. there's no bytecode. When it's done, it will be the
only Oils build, and we'll just call it "Oils".
#OVM
OVM —
(obsolete)
A slice of the CPython interpreter, used as the Oils VM while it's
being prototyped. It will be replaced with C++ code "metaprogrammed" with
Python.
#OVM2
OVM2 —
(obsolete)
A nascent VM to replace our use of the CPython VM.
#OHeap2
OHeap2 —
A data format for OVM2 that is like a SmallTalk image or v8 snapshot.
Inspired by the first version of oheap.
#readline
readline —
A line-editing library derived from bash. It has emacs and vi modes.
#pylibc
pylibc —
An extension module to expose libc functions to Python.
Python implements its own glob() and fnmatch(), which differ from the
ones in libc. We may also need libc's locale-aware string functions.
#wwz
wwz —
A FastCGI program that serves the contents of a zip file. It makes it easy and
fast to deploy thousands of small files to a web server, and back them up. We
use it for test results, benchmarks, and continuous build logs. This Hacker
News comment provides some
color. It's a simple Unix-y solution.
#aboriginal-linux
Aboriginal Linux —
Shell scripts that implement the minimal Linux system that can rebuild
itself (discontinued as of April 2017.)
#abuild
abuild —
A 2500-line shell script that builds Alpine Linux packages.
#alpine-linux
Alpine Linux —
A minimal Linux distribution based on musl
libc and busybox.
#bash-completion
bash-completion —
A companion project to bash that provides interactive completion for
the common Unix commands. Most Linux distros use it, including Debian and
Ubuntu. It consists of tens of thousands of lines of bash code.
#ble.sh
Bash Line Editor —
ble.sh gives you a fish-like interactive experience in bash, with
syntax highlighting, completion, and vim-style editing. It's written in pure
bash, and is likely the biggest and most sophisticated shell program in the
world!
A long-term goal for Oils is to allow users to customize their shell this way, rather than hard-coding the UI in C++ or Python.
#bwk
bwk —
Some software archaeology I did on Kernighan's Awk, to research how Awk
relates to the shell. (One interesting thing: neither implements
first-class compound data structures, and thus both lack garbage collection.)
#autotools
GNU autotools —
A meta-build system that generates configure shell scripts and Makefiles
from m4 macros.
#busybox
BusyBox —
A reimplementation of standard Unix command line utilities, commonly used on
embedded Linux systems.
#debian
debian —
One of the oldest and most popular Linux distributions. It uses the apt
package manager, which wraps dpkg. Ubuntu is based on Debian.
#debootstrap
debootstrap —
Debian uses this large shell program to construct its base image
from binary packages.
#nix
Nix —
A purely-functional package manager and Linux distribution. As with nearly
all distributions, bash plays a fundamental role in building binary
packages.
#pypy
PyPy —
A Python interpreter written in Python (including a restricted subset called RPython).
It has novel JIT technology and a focus on speed.
#tinypy
tinypy —
An interpreter for a subset of Python written in just ~2K lines of C and ~2K
lines of Python (using a very dense style). I used some tinypy code for my
pratt-parsing-demo, and it inspired the plan for Oils to have a Python
interpreter.
#toybox
Toybox —
A reimplementation of standard Unix command line utilities, by the former
maintainer of busybox.
#ninja
Ninja —
A "low-level" build system focused on incremental build speed. High level
languages like CMake generate Ninja build files.
#tmux
tmux —
A Unix terminal multiplexer which provides a better interactive interface than
shell job control. GNU Screen is
another popular option.
#smoosh
Smoosh - The Symbolic, Mechanized, Observable, Operational Shell —
A formalization of the POSIX shell standard. Source
code (in
Lem and OCaml) is available.
#chroot
chroot —
A system call that gives a process a view of its own "virtual" file system.
Linux container technology like Docker or
LXC can be thought of as a "chroot on
steroids".
#libc
The C Standard Library —
The shell communicates with the kernel through the C standard library. Popular
implementations include GNU libc and
musl libc.
#tokenize
Python tokenize module —
A reimplementation of Parser/tokenizer.c in pure Python. Part of the Python
standard library.
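As a quick illustration of what the module produces, this small sketch runs it over one line of Python source. The helper name token_names is made up for this example.

```python
import io
import tokenize


def token_names(source: str):
    """Return (token type name, token text) pairs for a piece of Python source."""
    readline = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


if __name__ == "__main__":
    for name, text in token_names("x = 1\n"):
        print(name, repr(text))
```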
#pgen2
pgen2 —
A reimplementation of Parser/pgen.c in Python, done for lib2to3.
#compiler2
compiler2 —
compiler2 is my name for the deprecated Python 2.7 compiler module. It does
the same thing as Python/compile.c, but in Python.
#byterun
byterun —
A Python bytecode interpreter loop written in Python, described in the AOSA
Book. It does the same thing as ceval.c in CPython.
#dplyr
dplyr —
A "modern" data frame library for R. Part of the
Tidyverse. I use it to analyze Oils code and dependencies.
#tidyverse
TidyVerse —
Hadley Wickham created this set of R packages. They reinvent R's data
structures and standard library through metaprogramming!
#yajl
Yet Another JSON Library —
(obsolete)
Oils uses this C library to parse and print JSON. Because Oils has Python's
data structures, we use a fork of the py-yajl Python
binding to wrap yajl's nice streaming
API.
#pexpect
pexpect —
A Python library to automate terminal applications like shells, ssh,
passwd, etc. We use it to test the interactive shell.
#coreutils
coreutils —
The GNU implementation of ls, cp, mv, etc. It also has versions of
test, time, and kill, which are typically shadowed by
similar-but-different shell builtins.
#grep
grep —
A tool to search files for patterns. Prefer using egrep (grep -E) to
grep, because repetition looks like [0-9]+ rather than [0-9]\+. The
former is more consistent with all other regular expression dialects, including
Eggex.
#find
find —
A classic Unix tool that walks a directory tree, filters its entries, and
performs actions. GNU findutils implements it.
Many users don't realize that find is an expression language like
expr or test. It looks nothing like
Awk, but they both apply predicates and actions to a stream.
#xargs
xargs —
A tool that builds and executes command lines from stdin. A very useful
GNU extension is xargs -P, which starts processes in parallel.
#expr
expr —
An external tool that implements mathematical expressions for shell. It has
been mostly subsumed by the POSIX $((1+2)) construct, and the
[[ $mystr =~ $myregex ]] construct. (GNU autotools still
generates code that uses it.)
#strace
strace —
A tool that prints the system calls that another process makes. For example,
strace echo hi will show the write() syscall, among others. The -e flag
contains a small expression language to filter what's printed.
#antlr
ANTLR —
A tool to generate top-down parsers (LL(k), LL(*)). I ported the POSIX
shell grammar to ANTLR to machine check it, but it's not used to generate code.
#yacc
yacc —
A tool to generate bottom-up parsers. Bash uses yacc, which is a
mistake discussed in this AOSA Book chapter on Bash.
#semantic-action
Semantic Action —
The "right hand side" of a rule in a parser specification is a semantic
action. It's typically a block of code in the host language, e.g. C or OCaml.
Yacc and re2c both use the model of semantic actions.
ANTLR and Python's pgen.c and pgen2 prefer to materialize
a parse tree. This means that there's an extra step to construct an
AST.
#re2c
re2c —
A tool that compiles regular expressions first to a DFA, and then to
efficient C code consisting of mostly switch and goto statements. I
use it to express multiple lexers in the Oils project.
The best part of it is that it's a library and not a framework.
#zephyr-asdl
Zephyr ASDL —
Oils uses this domain-specific language to declare algebraic data types
in Python and C++. We use it to represent both the syntax of shell programs
and the interpreter's runtime data structures. See What is Zephyr
ASDL? and posts tagged
ASDL.
This article describes its use in Python. This SourceForge project contains the code.
#clang
Clang —
A modular front end for C and C++ that supports IDEs and other tools (as
well as the code-generating compiler). Oils has some similarities because we
have multiple use cases for the parser: execution, interactive completion, a
tool to convert the osh language to the oil language, and more.
#protobuf
Protocol Buffers —
A schema language, serialization format, and set of APIs created and
open-sourced by Google.
#spec-test
sh_spec.py —
A test framework written for osh that runs shell snippets against many
shells. See Spec Tests and How I Use Tests (2017).
#wild-test
Wild Tests —
A test framework that tortures the OSH parser with real-world shell scripts.
#gold-test
Gold Tests —
A type of test that compares the output of OSH and bash (or another existing
shell). The assertions are implicit so you don't have to write them.
Themes: Correctness, security, performance.
#asan
AddressSanitizer —
A compiler tool for detecting memory errors at runtime. That is, it's a kind
of dynamic analysis. It solves roughly the same problem as Valgrind, but
it's faster. Also known as ASAN.
#afl
American Fuzzy Lop —
A fuzzer that uses compiler technology to efficiently explore code paths. In
the last few years, it's been used to surface hundreds of bugs in ubiquitous
and already well-tested pieces of open-source software. Its Wikipedia
page is also
helpful.
#perf
Linux perf —
User-space tools and kernel APIs for Linux performance analysis. Uses
CPU-specific features for accurate measurements.
#flame-graph
Flame Graph —
A relatively new technique for visualizing profiler output. It shows how much
execution time can be attributed to a particular call stack. Note that a
set of function call stacks forms a tree: a function may call multiple
functions.
This explains why flame graphs can also be used like treemaps, i.e. to visualize space used in a file system hierarchy.
#bloaty
Bloaty McBloatyFace —
A code size profiler for compiled binaries. I used it to measure progress in
stripping down the CPython interpreter.
#mypy
mypy —
A type checker for Python. You can gradually add types to Python 2 or 3 code,
and MyPy will check them for consistency before execution. There are some
limitations to the code it understands, but many Python idioms are supported.
#pyannotate
PyAnnotate —
A tool that records the types of Python variables at runtime, and then
generates approximate static type annotations.
#uftrace
uftrace —
A unique and useful tool for user-space function tracing. You tell your C
compiler to instrument a binary, run it under uftrace record, and query the
results. I used it to speed up the Oils parser. I use shell so I can use and
automate tools like uftrace. Shell helps you write better native code.
#OCI
Open Container Initiative —
A standard for containers based on Docker. Docker is being "refactored away"
into something less monolithic and more Unix-y.
#docker
Docker —
A monolithic toolkit for containers. It has a build tool based on a shell-like
DSL, registry push/pull, and a container runtime.
#podman
Podman —
A container runtime that's part of Red Hat's rewrite / refactoring of the
Docker ecosystem. They are making Docker more modular and Unix-y, e.g. by
eliminating the superfluous daemon.
#posix-shell-spec
POSIX Shell
Spec: POSIX specification for the shell (sh). It seems
that ksh was the dominant shell at the time of standardization, so bash
implemented POSIX + a lot of ksh.
#posix-grammar
POSIX Shell
Grammar: Subsection of the spec which has a BNF-style grammar.
#google-style-guide
Google Shell Style
Guide -- Unofficial shell style guide at Google, which
points out some deficiencies in the shell language. (Not all shell scripts at
Google attempt to conform to this style.)
#aosa-book-bash
Chapter on Bash in the Architecture of Open Source Applications —
An excellent article by bash maintainer Chet Ramey on bash's internal
structure.
Trivia about the Unix shell language, including the common ksh/bash extensions.
#here-doc
Here Document —
A construct in shell for writing lines of text to be fed to stdin of a
process. Perl, Ruby, and PHP borrowed here docs from shell.
#shell-builtin
Shell Builtin —
A shell builtin is just like an external command, e.g. /bin/ls, except it's
linked into the sh binary. It takes an argv array, returns an exit code,
and uses stdin, stdout, and stderr.
#dynamic-scope
Dynamic Scope —
A method of resolving variable names. In the case of Unix shell, it means
that you look up the stack for variable references, rather than looking only in
the current stack frame. Early Lisps used these semantics, but later Lisps
switched to lexical scope.
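To make the lookup rule concrete, here is a toy model in Python. It is only a sketch of the idea, not how any particular shell implements it, and the names DynamicScope, caller, and callee are invented for this example.

```python
class DynamicScope:
    """A stack of frames; lookups walk from the newest frame to the oldest."""

    def __init__(self):
        self.stack = [{}]  # the bottom frame holds globals

    def push(self):
        self.stack.append({})

    def pop(self):
        self.stack.pop()

    def set_local(self, name, value):
        self.stack[-1][name] = value

    def lookup(self, name):
        # Dynamic scope: search up the call stack, not just the current frame.
        for frame in reversed(self.stack):
            if name in frame:
                return frame[name]
        raise KeyError(name)


def callee(scope):
    # callee never defines x itself; it sees whatever its callers set.
    return scope.lookup("x")


def caller(scope):
    scope.push()  # frame for caller
    try:
        scope.set_local("x", "set by caller")
        scope.push()  # frame for callee
        try:
            return callee(scope)
        finally:
            scope.pop()
    finally:
        scope.pop()
```

With a global x defined, callee sees the global when called directly, but sees the caller's local x when called through caller. Under lexical scope it would see the global in both cases.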
#job-control
Job Control —
A feature of the interactive POSIX shell that's deeply intertwined with the Unix kernel.
It lets you hit Ctrl-Z to suspend vim and get a shell. It lets you cancel all
the processes in a pipeline with Ctrl-C.
#proc
YSH Procs —
In YSH, shell-like functions are declared with the proc keyword. Think of
them as "procedures" or "processes". They use stdin and stdout, and return an
exit code.
#thompson-shell
Thompson Shell —
The first Unix shell, written by Ken Thompson. It had pipelines and redirects,
but it's not a programming language. It's an interactive tool that is notably
separate from the Unix kernel.
See the paper in Unix Shell: History and Trivia.
#bourne-shell
Bourne Shell —
A seminal upgrade to the Thompson shell, written by Stephen Bourne. It turned
shell into a programming language with loops, conditionals, and functions. It
allows you to redirect and pipe the I/O of these compound structures.
All modern Unix shells are descendants of the Bourne shell. That is, it "won" over other efforts like Bill Joy's C shell.
Stephen Bourne: Early Days of Unix and design of sh (2015, YouTube) is a nice historical overview of the project.
#bash
GNU Bash —
The most popular implementation of Unix shell. It was the first program to run
on the Linux kernel, circa 1991. OSH is largely
compatible with it. Also see the Wikipedia page for
bash.
#dash
Debian Almquist Shell —
A fork of the Almquist Shell that Debian and Ubuntu use for shell scripts, but
not the default login shell. If you look at the busybox ash source code, it
is apparent that they are similar. The things I notice most about it are that
kebab-case function names aren't allowed, and it has a bug related to
readonly and tilde expansion.
#fish
fish —
Probably the most popular non-POSIX shell. It has a rich interactive
experience.
#mksh
MirBSD Korn Shell —
A fork of pdksh (Public Domain Korn Shell). This is the default
shell on Android. Testing this shell against others has taught me that many
"bash-isms" are actually "ksh-isms". bash implemented many ksh extensions
for compatibility.
#zsh
zsh —
zsh is probably the second most popular interactive shell, after bash. It's
not POSIX-compliant by default, although it has options to make it
POSIX-compliant. Apparently, it doesn't split words by default.
#ksh
Korn Shell —
ksh was an extension of the Bourne shell, developed at Bell Labs.
pdksh and bash cloned many of its features.
#pdksh
Public Domain Korn Shell —
A defunct clone of AT&T's Korn shell that survives in at least two forks: the
OpenBSD shell and mksh.
#metaprogramming
Metaprogramming —
A very general term for code that operates on code. Textual code
generation, C macros, C++ templates, Python reflection, non-standard evaluation
in R, and Lisp macros are all examples of metaprogramming.
In dynamic languages, the metaprogramming language is typically the language itself, while statically-typed languages require a different metaprogramming language. See Type Checking vs. Metaprogramming; ML vs. Lisp.
#metalanguage
Metalanguage —
In programming, a metalanguage is the language used to describe or
implement another language. DSLs are often used as metalanguages.
For example,
re
module. It's an abstract program but we cobbled together some
concrete tools to express it.
#language-composition
Language Composition —
When parsing almost any language, it's useful to think of it as a composition
of sublanguages. Shell is an extreme case of this, but it's true for
Python, JavaScript, HTML, etc.
#DSL
Domain-Specific Language —
The Unix shell is glue for DSLs like sed, awk, find,
expr, regexes, globs, and more. Oils is implemented with DSLs like
re2c and Zephyr ASDL.
#dependency-inversion
Dependency Inversion —
A style of programming that makes programs more modular. Most of the
program is initialized in main()
and "wired together".
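A minimal sketch of this style in Python, with invented names (Greeter, main): the component receives its output stream as a parameter instead of reaching for a global, and main() is the only place where the pieces are connected.

```python
import io
import sys


class Greeter:
    """Depends on 'something with a write() method', not on sys.stdout itself."""

    def __init__(self, out):
        self.out = out

    def greet(self, name):
        self.out.write("hello %s\n" % name)


def main(out):
    # All wiring happens here; the rest of the program only sees parameters.
    greeter = Greeter(out)
    greeter.greet("world")


if __name__ == "__main__":
    main(sys.stdout)
```

One practical payoff is testing: passing an io.StringIO() to main() captures the output without patching any globals.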
#string-hygiene
String Hygiene -- A property of programs that means that code isn't
confused with data. This is critical for security in distributed systems.
Shell injection, SQL injection, and HTML injection (XSS) are examples of
security problems arising from the lack of string hygiene. Solutions to the
problem include proper language-specific escaping and avoiding string
concatenation.
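As a small illustration in Python, compare three ways of building an ls command from an untrusted file name. The function names are made up for this sketch. shlex.quote() does the shell-specific escaping, and the argv form avoids building a shell string at all.

```python
import shlex


def unsafe_command(filename: str) -> str:
    # Concatenation: data can be reinterpreted as shell code.
    return "ls " + filename


def quoted_command(filename: str) -> str:
    # Language-specific escaping: the file name stays a single shell word.
    return "ls " + shlex.quote(filename)


def argv_command(filename: str) -> list:
    # No shell string at all: pass an argv array, e.g. to subprocess.run().
    return ["ls", "--", filename]


if __name__ == "__main__":
    evil = "x; rm -rf /"
    print(unsafe_command(evil))
    print(quoted_command(evil))
    print(argv_command(evil))
```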
#whipupitude
Whipupitude —
The aptitude for whipping things up, coined by Perl creator Larry Wall. Shell and Perl both have this property!
#data-language
Data Language —
TODO: Add link. A language for denoting data, like TSV, HTML, or Clojure's EDN.
Data languages can be tied to a specific language, or "polyglot". In the latter case, it's also an "interchange format", like JSON.
#sed
sed —
A text stream editor using a batch execution model.
#awk
Awk —
A classic Unix programming language for text processing.
#extended-glob
Extended Glob —
An unusual syntax in ksh and bash that gives globs the
power of regular expressions.
*.@(sh|py) is like matching *.py or *.sh. The @(foo|bar) construct allows
alternation.
#ERE
POSIX Extended Regular Expressions —
The flavor of regex that bash supports. grep supports it with -E, or as
egrep. GNU sed supports it with --regexp-extended.
#make
Make —
A classic Unix build tool that is also a Turing-complete programming language.
#shell
Shell —
An interactive program to control the Unix operating system, as well as a
programming language. Oils aims to treat shell as a serious programming language.
#M4
M4 —
GNU Autotools is written in the text preprocessor language M4.
It's similar to the C preprocessor, except that it's Turing-complete. It was
designed to support a dialect of Fortran.
#algol-like
ALGOL Family of Languages —
C-like imperative languages with functions, loops, conditionals, etc.
#tcl
Tcl —
An embedded scripting language that's influenced some alternative shells. It
has Lisp-like properties.
#lua
Lua —
Lua is an embedded scripting language, which means that the interpreter is
a library. It has no global variables, and requires explicit capabilities
for I/O. While the Lua language has some deficiencies, this aspect of Lua will
influence Oils.
#r-language
R language —
A language for statistical computing, including data manipulation, modelling,
and visualization.
#ML
ML —
ML stands for "meta-language": a language for manipulating languages.
The ML family of languages includes OCaml and Haskell, and its distinguishing
feature is the data model of algebraic data types. The domain-specific
language ASDL uses this data model.
#cpython
CPython —
The standard implementation of the Python programming language, written in C.
#python
Python —
The popular language that I wrote OSH in.
#ocaml
OCaml —
A popular modern implementation of ML. If I hadn't prototyped
OSH in Python, OCaml would have been a good choice. The compiler and runtime
are well-engineered and well-documented. They may influence
OPy.
#cfg
Context-Free Grammar -- A formalism for expressing the syntax of
programming languages. Shell can only be partially specified using a CFG; the
POSIX grammar is incomplete.
#DFA
DFA —
A deterministic finite automaton is a mathematical notion of a state machine.
A regular expression can be translated to a DFA via an NFA. You feed
the string to the DFA and see if you end up in an "accept" state. That happens
if and only if the string matches the regular expression.
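For example, here is a hand-written DFA in Python for the regular expression [0-9]+. The state names and the helper char_class are invented for this sketch; a tool like re2c would generate the equivalent table or code for you.

```python
def char_class(ch: str) -> str:
    """Collapse input characters into the classes the DFA cares about."""
    return "digit" if ch in "0123456789" else "other"


# (state, character class) -> next state. Missing entries lead to "dead".
TRANSITIONS = {
    ("start", "digit"): "accept",
    ("accept", "digit"): "accept",
}


def dfa_match(s: str) -> bool:
    """Return True if the whole string matches [0-9]+."""
    state = "start"
    for ch in s:  # each character is examined exactly once
        state = TRANSITIONS.get((state, char_class(ch)), "dead")
    return state == "accept"
```

Note that there's no backtracking and no memory besides the current state.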
#NFA
NFA —
Every regular expression can be translated to an equivalent nondeterministic
finite automaton. You can think of it as a state machine which magically
"knows" which transition to take at each step. It's unintuitive to many
programmers; a DFA is closer to our notion of computation.
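One way to take the magic out is to track every state the NFA could be in at once. The sketch below does that in Python for the regular expression (a|b)*ab. The state numbers and the name nfa_match are made up for this example.

```python
# (state, character) -> set of possible next states.
# State 0 loops on a and b, but on 'a' it may also "guess" that the final
# "ab" has started and move to state 1. State 2 is the accept state.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}


def nfa_match(s: str) -> bool:
    """Return True if the whole string matches (a|b)*ab."""
    states = {0}
    for ch in s:
        # Follow every possible transition from every current state.
        states = set().union(*(NFA.get((st, ch), set()) for st in states))
    return 2 in states
```

Each distinct set of NFA states corresponds to a single DFA state, which is the idea behind translating a regular expression to a DFA via an NFA.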
#regular-language
Regular Language —
The class of formal languages that "regexes" are based on. Perl-style regexes
have many non-regular constructs, making them harder to
recognize than regular languages.
Every regular language corresponds to a finite automaton that recognizes it. Roughly speaking, a DFA has no memory and looks at each byte of input exactly once.
Eggex encourages the use of regular languages, but it also has clear syntax for Perl-style backtracking constructs.
#peg
Parsing Expression Grammar -- An alternative formalism to context-free
grammars, which may be better-suited to expressing shell syntax.
#lexical-state
Lexical State —
A simple parsing technique for dealing with language composition, i.e.
"sublanguages" or "dialects". Renamed to lexer modes (because
the lexer has other unrelated state).
#lexer-modes
Lexer Modes —
A simple parsing technique for dealing with language composition, i.e.
"sublanguages" or "dialects". Formerly lexical state. See
posts on #lexing.
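A toy illustration in Python, not the Oils lexer: there is one regex per mode, and a double quote switches between an "outer" mode and a "dq" mode. The mode names, token names, and the lex function are all invented for this sketch.

```python
import re

# One regex per mode. The same character can produce different tokens
# depending on the mode: a space is "space" outside quotes but "lit" inside.
MODES = {
    "outer": re.compile(r'(?P<space>\s+)|(?P<dquote>")|(?P<word>[^\s"]+)'),
    "dq": re.compile(r'(?P<dquote>")|(?P<var>\$[A-Za-z_]\w*)|(?P<lit>[^"$]+|\$)'),
}


def lex(src: str):
    """Return a list of (mode, token kind, token text) triples."""
    mode, pos, tokens = "outer", 0, []
    while pos < len(src):
        m = MODES[mode].match(src, pos)
        if m is None:
            raise ValueError("no token at position %d" % pos)
        kind = m.lastgroup
        tokens.append((mode, kind, m.group()))
        if kind == "dquote":
            # The parser would normally request the switch; here the lexer
            # toggles on its own to keep the sketch short.
            mode = "dq" if mode == "outer" else "outer"
        pos = m.end()
    return tokens
```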
#precedence-climbing
Precedence Climbing -- A simple algorithm for top-down
parsing of expressions. It's a special case of top-down operator precedence
parsing.
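Here is a compact sketch of the algorithm in Python for integers, parentheses, and five binary operators. The operator table, the tuple-based tree, and the names split_tokens and parse are choices made for this example only.

```python
import re

# operator -> (precedence, associativity)
BINARY_OPS = {
    "+": (1, "left"), "-": (1, "left"),
    "*": (2, "left"), "/": (2, "left"),
    "^": (3, "right"),
}


def split_tokens(text: str):
    return re.findall(r"\d+|[-+*/^()]", text)


def parse(tokens):
    """Parse a token list into nested (op, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_atom():
        tok = advance()
        if tok == "(":
            node = parse_expr(1)
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if not tok.isdigit():
            raise SyntaxError("unexpected token %r" % tok)
        return int(tok)

    def parse_expr(min_prec):
        left = parse_atom()
        while True:
            op = peek()
            if op not in BINARY_OPS or BINARY_OPS[op][0] < min_prec:
                return left
            prec, assoc = BINARY_OPS[op]
            advance()
            # Left-associative operators must not re-consume their own level.
            right = parse_expr(prec + 1 if assoc == "left" else prec)
            left = (op, left, right)

    tree = parse_expr(1)
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return tree
```

The whole algorithm is the loop in parse_expr: keep consuming operators while their precedence is at least min_prec, and recurse with a higher minimum for the right operand.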
#tdop-parsing
Top-Down Operator Precedence Parsing -- Also called Pratt
parsing, this is a general algorithm for parsing expressions with multiple
levels of precedence.
#recursive-descent
Recursive Descent Parsing -- The most widely-used parsing
technique. Recursive descent parsers are written by hand, often following a
grammar. Each recursive procedure in the parser corresponds to a "production"
in a context-free grammar.
They are flexible, e.g. in accommodating ad hoc parsing rules and good error messages.
Recursive descent parsing is "top-down" parsing.
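As an illustration, here is a small hand-written parser in Python for a made-up grammar of nested integer lists. Each method corresponds to one production, and the error messages are written by hand. The grammar and all names are assumptions for this sketch.

```python
import re

# Grammar (hypothetical, for illustration):
#   value := NUMBER | list
#   list  := '[' [ value (',' value)* ] ']'


class Parser:
    def __init__(self, text: str):
        self.tokens = re.findall(r"\d+|[\[\],]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError("expected %r, got %r" % (tok, self.peek()))
        self.pos += 1

    def parse_value(self):
        tok = self.peek()
        if tok == "[":
            return self.parse_list()
        if tok is not None and tok.isdigit():
            self.pos += 1
            return int(tok)
        raise SyntaxError("expected a number or '[', got %r" % tok)

    def parse_list(self):
        self.expect("[")
        items = []
        if self.peek() != "]":
            items.append(self.parse_value())
            while self.peek() == ",":
                self.expect(",")
                items.append(self.parse_value())
        self.expect("]")
        return items


def parse(text: str):
    p = Parser(text)
    value = p.parse_value()
    if p.peek() is not None:
        raise SyntaxError("unexpected trailing token %r" % p.peek())
    return value
```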
#top-down-parsing
Top-Down Parsing -- Parsing algorithms can be categorized
as either top-down or bottom-up. ANTLR uses top-down algorithms,
while yacc uses bottom-up algorithms. Pratt parsing
is a top-down algorithm and recursive descent is a
top-down technique. See LL and LR Parsing Demystified.
#AST
Abstract Syntax Tree —
In contrast to an AST, a parse tree is derived only from the rules of the
grammar for a language. You don't need to annotate your parser with nontrivial
"semantic actions". The exact definition is debatable, but in my usage, an AST
has some simplifications or annotations over a parse tree, depending on what
you need to do with it: source-to-source translation, interpretation, code
generation, etc.
#LST
Lossless Syntax Tree —
A syntax tree with enough detail to reproduce the original source code.
#adt
Algebraic Data Types —
A data model of sum and product types. This model is particularly convenient
for representing the structure of programming languages.
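A rough sketch of the model in plain Python, using dataclasses for the product types and a Union for the sum type. This is not what Zephyr ASDL generates; the names Num, Add, Mul, Expr, and evaluate are invented for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int  # product type with one field


@dataclass(frozen=True)
class Add:
    left: "Expr"  # product type with two fields
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# Sum type: an Expr is exactly one of these variants.
Expr = Union[Num, Add, Mul]


def evaluate(e) -> int:
    """Interpret the tree by dispatching on the variant."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    if isinstance(e, Mul):
        return evaluate(e.left) * evaluate(e.right)
    raise TypeError("not an Expr: %r" % (e,))
```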
#data-frame
Data Frame —
A table data structure with dynamically typed columns. The R
language is built around data frames, and the Pandas
library borrowed this idea. It's similar to an SQL table, except that it
generally lives in memory, rather than on a remote server's disk.
#perlis-thompson
Perlis-Thompson Principle —
A software architecture concept distilled from statements by Alan Perlis and
Ken Thompson. Short definition: Software with fewer concepts composes,
scales, and evolves more easily. This is a tradeoff, not a hard rule.
#narrow-waist
Narrow Waist —
The narrow waist (of an hourglass) is a software concept that solves an
interoperability problem, avoiding an O(M × N) explosion. All of these
are narrow waists:
#m-by-n-explosion
O(M × N) code explosion —
A system may need bespoke code to fill in every cell of a grid, like M
algorithms and N data structures, or M languages and N operating systems.
This problem can often be mitigated by better software architecture, e.g.
with protocols, interchange formats, or intermediate representations.
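A toy example of the mitigation in Python: M readers and N writers meet at one intermediate representation (a list of rows), so M + N functions cover M × N conversions. The formats and names are chosen for illustration only, and the CSV reader is deliberately naive (no quoting).

```python
import json


def read_tsv(text):
    return [line.split("\t") for line in text.splitlines()]


def read_csv(text):
    # Naive: ignores CSV quoting rules, which a real reader must handle.
    return [line.split(",") for line in text.splitlines()]


def write_json(rows):
    return json.dumps(rows)


def write_tsv(rows):
    return "\n".join("\t".join(row) for row in rows)


# Every reader produces the intermediate representation: a list of rows.
READERS = {"tsv": read_tsv, "csv": read_csv}
# Every writer consumes it.
WRITERS = {"json": write_json, "tsv": write_tsv}


def convert(text, src, dst):
    """Any of the M sources to any of the N destinations, via list-of-rows."""
    return WRITERS[dst](READERS[src](text))
```

Adding a new input format means writing one reader, not one converter per output format.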
#API
Application Programming Interface (API) —
A software interface specified in a programming language, often with static
linking. Contrast with ABI: Application Binary Interface.
#ABI
Application Binary Interface (ABI) —
The "runtime reality" of a software interface, often derived from an
API. The Actually Portable Executable
project takes this idea to an extreme, building on the x86-64 Linux ABI. It
essentially ignores the APIs and "puns" multiple ABIs.
#IPC
Inter-Process Communication —
A type of software composition that involves messages exchanged between
processes. It differs from composition via APIs in that the programs on each
side of the "wire" aren't compiled and deployed together, aren't synchronized
in the same "thread", and may be written in different programming languages.
IPC is similar to networking, but the links are reliable rather than unreliable. RPC abstractions can be built on top of IPC or networking.
#CGI
Common Gateway Interface —
A Unix-y protocol for creating dynamic web content. It was more popular in the
90's, but is still used today. The more complex FastCGI protocol can fix
performance problems.
#utf8
UTF-8 —
The best and most popular Unicode encoding. It's backward-compatible with
ASCII, so less code has to be rewritten to support Unicode. See blog posts
tagged #utf8.
#JSON
JSON —
A versionless interchange format for hierarchical data. It was derived from the
syntax of JavaScript.
#j8-notation
J8 Notation —
A collection of data languages based on JSON. It specifies concrete representations for strings/bytes, records (JSON8), and tables (TSV8).
#j8-string
J8 String —
An extension of JSON strings with \yff for binary data and \u{123456}
to move past surrogate pairs. Example: u'mu = \u{3bc}'.
#JSON8
JSON8 —
An extension of JSON with J8 strings. Any language that has a JSON library should also have a JSON8 library.
#TSV
Tab-Separated Values —
A text format for tables, where cells are separated by tabs, and each row is a line. There's no standard way to denote a literal tab or newline in a cell.
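A minimal reader and writer in Python show how little there is to the format, and where the gap is. The function names are made up, and rejecting cells that contain a tab or newline is just one possible policy.

```python
def parse_tsv(text: str):
    """Split text into rows (lines) and cells (tab-separated fields)."""
    return [line.split("\t") for line in text.split("\n") if line]


def format_tsv(rows) -> str:
    """Join cells with tabs and rows with newlines."""
    for row in rows:
        for cell in row:
            if "\t" in cell or "\n" in cell:
                # Plain TSV has no standard escape for these characters.
                raise ValueError("cell can't be represented: %r" % cell)
    return "".join("\t".join(row) + "\n" for row in rows)
```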
#TSV8
TSV8 —
An extension of TSV with J8 strings. Any language that has a JSON and JSON8 library should also have
a TSV8 library.
#YAML
YAML —
A human-editable configuration file syntax that's a superset of JSON. It's
quirky, but widely used in the cloud. It confuses
values like the string "NO" and the boolean false.
#QSN
Quoted String Notation (QSN) —
A data format for strings which looks like 'foo \x00 bar\n'. It's an
adaptation of Rust's string literal syntax with two main use cases:
QSN will be deprecated in favor of J8 Strings.
#QTT
Quoted, Typed Tables —
An obsolete name, now TSV8.
#QTSV
QTSV —
An obsolete name, now TSV8.
#APUE
Advanced Programming in the Unix Environment —
A classic book on talking to the Unix kernel with C code. A shell uses a very old subset of the Unix interface, so it works the same way on Linux, OS X, and BSD Unixes.
#dsl-book
Domain Specific Languages by Martin Fowler —
A book of patterns for implementing DSLs. Discusses lexical
state.
#zulip
Zulip Chat —
Zulip is a hybrid of e-mail and chat that Oils users and developers can use.
Log in to oilshell.zulipchat.com with Github
or Google. I sometimes summarize Zulip threads in blog posts tagged
#zulip-links.