# ed

A standalone, musl-libc-based Linux port of FreeBSD `ed` for the Project Tick BSD/Linux Distribution.

## Build

```sh
gmake -f GNUmakefile
gmake -f GNUmakefile CC=musl-gcc
```

## Test

```sh
gmake -f GNUmakefile test
gmake -f GNUmakefile test CC=musl-gcc
```

## Notes

- The port strategy is a direct Linux-native cleanup of the FreeBSD source, not a BSD ABI shim.
- The original multi-file editor core and FreeBSD regression corpus are preserved, but the Linux build surface is standalone: `GNUmakefile`, shell test entrypoint, and musl-safe libc usage.
- Scratch-buffer storage uses `mkstemp(3)` in `TMPDIR` or `/tmp`, then `fdopen(3)` for the editor's temp file stream.
- Shell escapes and filter I/O (`!`, `r !cmd`, `w !cmd`) use `system(3)`, `popen(3)`, and `pclose(3)`.
- Terminal resize handling uses Linux `ioctl(TIOCGWINSZ)` on stdin.
- Regex handling stays on POSIX `regcomp(3)` / `regexec(3)` from the active libc.
- FreeBSD `strlcpy(3)` usage was replaced with a local implementation so the port does not depend on glibc extensions or `libbsd`.
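
As an illustration of the last note, a local `strlcpy(3)` replacement can be very small. The sketch below is not the port's actual code: the name `ed_strlcpy` is hypothetical and was chosen only to avoid clashing with a libc that already declares `strlcpy`. It assumes the usual BSD contract: always NUL-terminate when the destination size is nonzero, and return the length of the source so the caller can detect truncation.

```c
#include <stddef.h>
#include <string.h>

/*
 * ed_strlcpy is a hypothetical name used for illustration only.
 * Copies at most dstsize - 1 bytes from src to dst, NUL-terminates the
 * result whenever dstsize > 0, and returns strlen(src).
 */
size_t
ed_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize > 0) {
		/* Leave room for the terminating NUL. */
		size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return srclen;
}
```

A return value greater than or equal to `dstsize` means the copy was truncated.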

## Linux Semantics

- Supported: core editing commands, shell escapes, global commands, undo, binary buffer handling, and the upstream FreeBSD regression scripts.
- Supported: restricted `red` mode path checks from the original source.
- Unsupported: historic crypt mode. Startup `-x` exits with an explicit Linux error, and the interactive `x` command reports `crypt mode is not supported on Linux`.
- Known inherited quirk: the implicit newline print command still accepts the historic `,1` address form from this `ed` lineage. The Linux test harness allows only that single upstream deviation so other parser regressions still fail hard.

## What is ed?

ed is an 8-bit-clean, POSIX-compliant line editor.  It should work with
any regular expression package that conforms to the POSIX interface
standard, such as GNU regex(3).

If reliable signals are supported (e.g., POSIX sigaction(2)), it should
compile with little trouble.  Otherwise, the macros SPL1() and SPL0()
should be redefined to disable interrupts.
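
The idea behind a pair like SPL1() and SPL0() is to postpone signal handling while the editor's data structures are being modified. The following is a minimal sketch of that pattern on top of sigaction(2), not the definitions from the ed sources. All names in it (`spl_depth`, `pending_int`, `handled_count`, `enter_critical`, `leave_critical`, `install_sigint_handler`) are hypothetical, and it tracks only SIGINT.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t spl_depth;     /* > 0 inside a critical section */
static volatile sig_atomic_t pending_int;   /* SIGINT arrived while deferred */
static volatile sig_atomic_t handled_count; /* SIGINTs that were acted on */

static void
on_sigint(int signo)
{
	(void)signo;
	if (spl_depth > 0)
		pending_int = 1;        /* defer until the section ends */
	else
		handled_count++;        /* safe to act right away */
}

/* Install the handler with sigaction(2); returns 0 on success, -1 on error. */
int
install_sigint_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = on_sigint;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGINT, &sa, NULL);
}

/* Plays the role of SPL1(): start deferring interrupts. */
void
enter_critical(void)
{
	spl_depth++;
}

/* Plays the role of SPL0(): stop deferring and act on a pending interrupt. */
void
leave_critical(void)
{
	if (--spl_depth == 0 && pending_int) {
		pending_int = 0;
		handled_count++;
	}
}
```

Code that must not be interrupted would sit between `enter_critical()` and `leave_critical()`. A SIGINT delivered in between is recorded and acted on only when the outermost section ends.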

The following compiler directives are recognized:

- `NO_REALLOC_NULL`: if realloc(3) does not accept a NULL pointer
- `BACKWARDS`: for backwards compatibility
- `NEED_INSQUE`: if insque(3) is missing
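
To show what a directive such as NO_REALLOC_NULL typically guards, here is a small sketch. It is an assumption about the general technique, not the macro used in the ed sources, and `grow_buffer` is a hypothetical helper name.

```c
#include <stdlib.h>

/*
 * grow_buffer is a hypothetical helper, not ed's own code.  When
 * NO_REALLOC_NULL is defined, a NULL pointer is routed to malloc(3)
 * instead of being handed to realloc(3).
 */
void *
grow_buffer(void *buf, size_t newsize)
{
#ifdef NO_REALLOC_NULL
	if (buf == NULL)
		return malloc(newsize);
#endif
	return realloc(buf, newsize);
}
```

On a conforming libc the two paths behave the same, so the directive only matters on older systems.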

The file `POSIX' describes extensions to and deviations from the POSIX
standard.

The ./test directory contains regression tests for ed. The README
file in that directory explains how to run these.

For a description of the ed algorithm, see Kernighan and Plauger's book
"Software Tools in Pascal," Addison-Wesley, 1981.

## ed POSIX message


This version of ed(1) is not strictly POSIX compliant, as described in
the POSIX 1003.2 document.  The following is a summary of the omissions,
extensions and possible deviations from POSIX 1003.2.

OMISSIONS
---------
1) For backwards compatibility, the POSIX rule that says a range of
   addresses cannot be used where only a single address is expected has
   been relaxed.

2) To support the BSD `s' command (see extension [1] below),
   substitution patterns cannot be delimited by numbers or the characters
   `r', `g' and `p'.  In contrast, POSIX specifies that any character except
   space or newline can be used as a delimiter.
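
The delimiter rule in omission 2 can be expressed as a small predicate. This is a sketch of the rule as stated above, not a function taken from the ed sources. `is_usable_delimiter` is a hypothetical name.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * is_usable_delimiter is a hypothetical helper for illustration.
 * Space and newline are excluded by POSIX.  Digits and the letters
 * r, g and p are excluded as well, because the BSD `s' command uses
 * them as suffixes when repeating a substitution.
 */
bool
is_usable_delimiter(int c)
{
	if (c == ' ' || c == '\n')
		return false;
	if (isdigit((unsigned char)c))
		return false;
	return c != 'r' && c != 'g' && c != 'p';
}
```

Under this rule `s/old/new/` and `s,old,new,` are acceptable, while a form such as `sgoldgnewg` is not.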

EXTENSIONS
----------
1) BSD commands have been implemented wherever they do not conflict with
   the POSIX standard.  The BSD-isms included are:
	i) `s' (i.e., s[n][rgp]*) to repeat a previous substitution,
	ii) `W' for appending text to an existing file,
	iii) `wq' for exiting after a write,
	iv) `z' for scrolling through the buffer, and
	v) BSD line addressing syntax (i.e., `^' and `%').

2) The POSIX interactive global commands `G' and `V' are extended to 
   support multiple commands, including `a', `i' and `c'.  The command
   format is the same as for the global commands `g' and `v', i.e., one
   command per line with each line, except for the last, ending in a
   backslash (\).

3) An extension to the POSIX file commands `E', `e', `r', `W' and `w' is
   that <file> arguments are processed for backslash escapes, i.e.,  any
   character preceded by a backslash is interpreted literally.  If the
   first unescaped character of a <file> argument is a bang (!), then the
   rest of the line is interpreted as a shell command, and no escape
   processing is performed by ed.

4) For SunOS ed(1) compatibility, ed runs in restricted mode if invoked
   as red.  This restricts editing to files in the local directory and
   prohibits shell commands.
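
The <file> argument handling described in extension 3 can be sketched as follows. This is an illustration of the stated rule, not the routine used by ed. `parse_file_arg` and its parameters are hypothetical, and it assumes the argument has already been separated from the command letter and leading whitespace.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * parse_file_arg is a hypothetical helper for illustration.
 * If arg starts with an unescaped '!', the rest is a shell command: it is
 * copied verbatim and *is_shell is set.  Otherwise each backslash is
 * dropped and the character after it is kept literally (a trailing lone
 * backslash is kept as-is).  Returns the output length, or (size_t)-1 if
 * out is too small.
 */
size_t
parse_file_arg(const char *arg, char *out, size_t outsize, bool *is_shell)
{
	size_t n = 0;

	*is_shell = (arg[0] == '!');
	if (*is_shell)
		arg++;                  /* skip the bang itself */

	while (*arg != '\0') {
		if (!*is_shell && *arg == '\\' && arg[1] != '\0')
			arg++;          /* drop the backslash, keep next char */
		if (n + 1 >= outsize)
			return (size_t)-1;
		out[n++] = *arg++;
	}
	if (outsize == 0)
		return (size_t)-1;
	out[n] = '\0';
	return n;
}
```

With this sketch, `\!notes` yields the file name `!notes`, while `!ls \*.c` yields the shell command `ls \*.c` with its backslash untouched.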

DEVIATIONS
----------
1) Though ed is not a stream editor, it can be used to edit binary files.
   To assist in binary editing, when a file containing at least one ASCII
   NUL character is written, a newline is not appended if it did not
   already contain one upon reading.  In particular, reading /dev/null
   prior to writing prevents appending a newline to a binary file.

   For example, to create a file with ed containing a single NUL character:
      $ ed file
      a
      ^@
      .
      r /dev/null
      wq

   Similarly, to remove a newline from the end of binary `file':
      $ ed file
      r /dev/null
      wq

2) Since the behavior of `u' (undo) within a `g' (global) command list is
   not specified by POSIX, it follows the behavior of the SunOS ed:
   undo forces a global command list to be executed only once, rather than
   for each line matching a global pattern.  In addition, each instance of
   `u' within a global command undoes all previous commands (including
   undos) in the command list.  This seems the best way, since the
   alternatives are either too complicated to implement or too confusing
   to use.

   The global/undo combination is useful for masking errors that
   would otherwise cause a script to fail.  For instance, an ed script
   to remove any occurrences of either `censor1' or `censor2' might be
   written as:
	ed - file <<EOF
	1g/.*/u\
	,s/censor1//g\
	,s/censor2//g
	...

3) The `m' (move) command within a `g' command list also follows the SunOS
   ed implementation: any moved lines are removed from the global command's
   `active' list.

4) If ed is invoked with a name argument prefixed by a bang (!), then the
   remainder of the argument is interpreted as a shell command.  To invoke
   ed on a file whose name starts with bang, prefix the name with a
   backslash.