# Corebinutils — Overview
## What Is Corebinutils?
Corebinutils is Project Tick's collection of core command-line utilities, ported
from FreeBSD and adapted for Linux with musl libc. It provides the foundational
user-space programs that every Unix system needs — file manipulation, process
control, text processing, and system information tools — built from
battle-tested FreeBSD sources rather than GNU coreutils.
The project targets a clean, auditable, BSD-licensed alternative to the GNU
toolchain. Every utility compiles against musl libc by default, producing
statically-linkable binaries with minimal dependencies.
## Heritage and Licensing
All utilities derive from FreeBSD's `/usr/src/bin/` tree, carrying BSD
3-Clause or BSD 2-Clause licenses from the original Berkeley and FreeBSD
contributors. Project Tick's modifications (Copyright 2026) maintain the same
licensing terms. No GPL-licensed code is present in the tree.
The copyright headers trace a direct lineage:
```
Copyright (c) 1989, 1993, 1994
The Regents of the University of California. All rights reserved.
Copyright (c) 2026
Project Tick. All rights reserved.
```
Key contributors acknowledged across the codebase include Keith Muller and
Lance Visser (dd), Andrew Moore (ed), Michael Fischbein (ls), and Ken Smith
(mv).
## Design Philosophy
### Linux-Native, Not Compatibility Layers
Unlike many BSD-to-Linux ports that ship a compatibility shim library,
corebinutils rewrites platform-specific code using native Linux APIs:
- **`/proc/self/mountinfo`** replaces BSD `getmntinfo(3)` in `df`
- **`statx(2)`** replaces BSD `stat(2)` for birth time in `ls`
- **`sched_getaffinity(2)`** replaces BSD `cpuset_getaffinity(2)` in `nproc`
- **`sethostname(2)` from `<unistd.h>`** replaces BSD kernel calls in `hostname`
- **`prctl(PR_SET_CHILD_SUBREAPER)`** replaces BSD `procctl` in `timeout`
- **`fdopendir(3)` + `readdir(3)`** replace BSD FTS functions in `rm`
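
To illustrate the first item, the sketch below parses one line of `/proc/self/mountinfo` using the field layout documented in proc(5). The `mount_entry` struct and `parse_mountinfo_line()` are hypothetical names chosen for this example, not identifiers from `df`'s source, and the actual port may parse the file differently.

```c
#include <stdio.h>
#include <string.h>

/* One parsed mountinfo record (hypothetical struct, not df's own). */
struct mount_entry {
    char mount_point[256];
    char fs_type[64];
    char source[256];
};

/*
 * Parse a single /proc/self/mountinfo line. Layout per proc(5):
 *   ID parent major:minor root mount-point options [optional...] - fstype source super-options
 * Returns 0 on success, -1 if the line does not match that layout.
 */
static int
parse_mountinfo_line(const char *line, struct mount_entry *out)
{
    const char *sep;

    /* Field 5 is the mount point; fields 1-4 are parsed and discarded. */
    if (sscanf(line, "%*d %*d %*u:%*u %*s %255s", out->mount_point) != 1)
        return -1;

    /* The variable-length optional fields end at a lone "-" separator. */
    sep = strstr(line, " - ");
    if (sep == NULL)
        return -1;
    if (sscanf(sep + 3, "%63s %255s", out->fs_type, out->source) != 2)
        return -1;
    return 0;
}
```

The kernel escapes spaces in mount points as `\040`, so whitespace splitting is safe here. A complete parser would also decode those escapes before display, which this sketch omits.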
### musl-First Toolchain
The build system preferentially selects musl-based compilers. The configure
script tries, in order:
1. `musl-clang`
2. `clang --target=<arch>-linux-musl`
3. `clang --target=<arch>-unknown-linux-musl`
4. `musl-gcc`
5. `clang` (generic)
6. `cc`
7. `gcc`
If a glibc toolchain is detected, configure refuses to proceed unless
`--allow-glibc` is explicitly passed.
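
This overview does not show how configure detects glibc. One common approach, assumed here purely for illustration, is to compile a small probe that tests the `__GLIBC__` macro, which glibc's headers define and musl's headers do not. `built_against_glibc()` is a hypothetical helper name.

```c
#include <stdio.h>   /* any libc header pulls in the libc's feature macros */

/*
 * Returns 1 when this translation unit was compiled against glibc,
 * 0 otherwise. musl defines no identifying macro of its own, so the
 * absence of __GLIBC__ is only a hint, not proof, that musl is in use.
 * (Assumption: the real configure script may use a different probe.)
 */
static int
built_against_glibc(void)
{
#ifdef __GLIBC__
    return 1;
#else
    return 0;
#endif
}
```

A configure script could compile and run such a probe with each candidate compiler and stop with an error when it reports glibc and `--allow-glibc` was not given.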
### No External Dependencies
Core utilities have zero runtime dependencies beyond libc. Optional features
(readline in `csh`, crypto in `ed`) depend on system libraries that configure
probes for; when a library is absent, the feature is disabled and the build
still succeeds.
## Complete Utility List
### File Operations
| Utility | Description | Complexity | Source Files |
|-----------|--------------------------------------|------------|-------------|
| `cat` | Concatenate and display files | Simple | 1 `.c` |
| `cp` | Copy files and directory trees | Medium | 3+ `.c` |
| `dd` | Block-level data copying/conversion | Complex | 8+ `.c` |
| `ln` | Create hard and symbolic links | Medium | 1 `.c` |
| `mv` | Move/rename files and directories | Medium | 1 `.c` |
| `rm` | Remove files and directories | Medium | 1 `.c` |
| `rmdir` | Remove empty directories | Simple | 1 `.c` |
### Directory Operations
| Utility | Description | Complexity | Source Files |
|-------------|------------------------------------|------------|-------------|
| `ls` | List directory contents | Complex | 5+ `.c` |
| `mkdir` | Create directories | Medium | 2 `.c` |
| `pwd` | Print working directory | Simple | 1 `.c` |
| `realpath` | Canonicalize file paths | Simple | 1 `.c` |
### Permission and Attribute Management
| Utility | Description | Complexity | Source Files |
|-------------|------------------------------------|------------|-------------|
| `chmod` | Change file permissions | Medium | 2 `.c` |
| `chflags` | Change file flags (BSD compat) | Medium | 4 `.c` |
| `getfacl` | Display file ACLs | Medium | 1 `.c` |
| `setfacl` | Set file ACLs | Medium | 1 `.c` |
### Process Management
| Utility | Description | Complexity | Source Files |
|-------------|------------------------------------|------------|-------------|
| `kill` | Send signals to processes | Medium | 1 `.c` |
| `ps` | List running processes | Complex | 6+ `.c` |
| `pkill` | Signal processes by name/attribute | Medium | 1+ `.c` |
| `pwait` | Wait for process termination | Simple | 1 `.c` |
| `timeout` | Run command with time limit | Medium | 1 `.c` |
### Text Processing
| Utility | Description | Complexity | Source Files |
|-----------|--------------------------------------|------------|-------------|
| `echo` | Write arguments to stdout | Simple | 1 `.c` |
| `ed` | Line-oriented text editor | Complex | 10+ `.c` |
| `expr` | Evaluate expressions | Medium | 1 `.c` |
| `test` | Conditional expression evaluation | Medium | 1 `.c` |
### Date and Time
| Utility | Description | Complexity | Source Files |
|-----------|--------------------------------------|------------|-------------|
| `date` | Display/set system date and time | Medium | 2 `.c` |
| `sleep` | Pause for specified duration | Simple | 1 `.c` |
### System Information
| Utility | Description | Complexity | Source Files |
|------------------|---------------------------------|------------|-------------|
| `df` | Report filesystem space usage | Complex | 1 `.c` |
| `hostname` | Get/set system hostname | Simple | 1 `.c` |
| `domainname` | Get/set NIS domain name | Simple | 1 `.c` |
| `nproc` | Count available processors | Simple | 1 `.c` |
| `freebsd-version`| Show FreeBSD version (compat) | Simple | Shell script|
| `uuidgen` | Generate UUIDs | Simple | 1 `.c` |
### Shells
| Utility | Description | Complexity | Source Files |
|---------|--------------------------------------|------------|-------------|
| `sh` | POSIX-compatible shell | Very High | 60+ `.c` |
| `csh` | C-shell (tcsh port) | Very High | 30+ `.c` |
### Archive and Mail
| Utility | Description | Complexity | Source Files |
|---------|--------------------------------------|------------|-------------|
| `pax` | POSIX archive utility (tar/cpio) | Complex | 30+ `.c` |
| `rmail` | Remote mail handler | Simple | 1 `.c` |
### Miscellaneous
| Utility | Description | Complexity | Source Files |
|-------------|------------------------------------|------------|-------------|
| `sync` | Flush filesystem buffers | Simple | 1 `.c` |
| `stty` | Set terminal characteristics | Medium | 2+ `.c` |
| `cpuset` | CPU affinity management | Medium | 1 `.c` |
## Shared Components
The `contrib/` directory provides libraries shared across utilities:
### `contrib/libc-vis/`
BSD `vis(3)` and `unvis(3)` functions for encoding and decoding special
characters. Used by `ls` for safe filename display and by `pax` for
header encoding.
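
To give a feel for what this kind of encoding does, here is a simplified sketch that rewrites non-printable bytes as octal escapes. It is not the `vis(3)` implementation or its API: `escape_octal()` is a hypothetical function, and unlike real `vis(3)` it supports a single style and escapes backslashes as `\134`.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing non-printable bytes and backslashes with
 * \ooo octal escapes. Returns 0 on success, -1 if dst is too small.
 * (Simplified illustration only; see vis(3) for the real interface.)
 */
static int
escape_octal(const char *src, char *dst, size_t dstlen)
{
    size_t used = 0;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (isprint(c) && c != '\\') {
            if (used + 1 >= dstlen)     /* need room for c and the NUL */
                return -1;
            dst[used++] = (char)c;
        } else {
            char esc[5];

            if (used + 4 >= dstlen)     /* need room for \ooo and the NUL */
                return -1;
            snprintf(esc, sizeof(esc), "\\%03o", (unsigned)c);
            memcpy(dst + used, esc, 4);
            used += 4;
        }
    }
    if (used >= dstlen)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

With this sketch, a filename containing a tab such as `a<TAB>b` would be displayed as `a\011b`, which keeps terminal control characters out of `ls` output.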
### `contrib/libedit/`
BSD `editline(3)` library providing command-line editing with history and
completion support. Used by `csh` and `sh` for interactive input.
### `contrib/printf/`
Shared `printf` format string processing used by multiple utilities that
need custom format string expansion beyond standard `printf(3)`.
## Project Structure
```
corebinutils/
├── configure           # Top-level configure script (POSIX sh)
├── README.md           # Build instructions
├── .gitattributes      # Git configuration
├── .gitignore          # Build artifact exclusions
├── contrib/            # Shared libraries
│   ├── libc-vis/       # vis(3)/unvis(3)
│   ├── libedit/        # editline(3)
│   └── printf/         # Shared printf helpers
├── cat/                # Each utility in its own directory
│   ├── cat.c           # Main source
│   ├── GNUmakefile     # Per-utility build rules
│   ├── cat.1           # Manual page
│   └── README.md       # Port-specific notes
├── chmod/
│   ├── chmod.c
│   ├── mode.c          # Shared mode parsing library
│   ├── mode.h
│   └── GNUmakefile
├── ...                 # (33 utility directories total)
└── sh/                 # Full POSIX shell (60+ source files)
```
## Utility Complexity Classification
### Tier 1 — Simple (1 source file, <500 lines)
`cat`, `echo`, `hostname`, `domainname`, `nproc`, `pwd`, `realpath`, `rmdir`,
`sleep`, `sync`, `uuidgen`, `pwait`
These utilities typically have a `main()` function that parses options with
`getopt(3)`, performs a single system call, and exits. Error handling follows
the `err(3)`/`warn(3)` pattern.
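
The shape described above can be sketched as follows. This is a minimal illustration modeled loosely on `hostname`, not the project's source: the function is named `hostname_main()` (a hypothetical name) so it can be called from a test, and the `-s` flag is included only to show option handling.

```c
#include <err.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical entry point; a real utility would name this main(). */
static int
hostname_main(int argc, char *argv[])
{
    char name[256];
    int ch, sflag = 0;

    /* Step 1: parse options with getopt(3). */
    while ((ch = getopt(argc, argv, "s")) != -1) {
        switch (ch) {
        case 's':
            sflag = 1;
            break;
        default:
            fprintf(stderr, "usage: hostname [-s]\n");
            return 2;               /* usage error */
        }
    }

    /* Step 2: one system call, with err(3) on failure. */
    if (gethostname(name, sizeof(name)) != 0)
        err(1, "gethostname");
    name[sizeof(name) - 1] = '\0';  /* not guaranteed on truncation */

    /* Step 3: print the result and exit. */
    if (sflag)
        name[strcspn(name, ".")] = '\0';    /* keep the first label only */
    puts(name);
    return 0;
}
```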
### Tier 2 — Medium (1-3 source files, 500-2000 lines)
`chmod` (with `mode.c`), `cp` (with `utils.c`, `fts.c`), `date` (with
`vary.c`), `kill`, `ln`, `mkdir` (with `mode.c`), `mv`, `rm`, `test`,
`timeout`, `expr`, `df`
These utilities involve more complex option parsing, recursive directory
traversal, or multi-step algorithms. They share code through header files
and sometimes reuse `mode.c`/`mode.h`.
### Tier 3 — Complex (5+ source files, 2000+ lines)
`dd` (8 source files), `ed` (10 source files), `ls` (5 source files),
`ps` (6 source files), `pax` (30+ source files)
These are substantial programs with their own internal architecture:
- `dd`: argument parser, conversion engine, signal handling, I/O position logic
- `ed`: command parser, buffer manager, regex engine, undo system
- `ls`: stat engine, sort/compare, print/format, ANSI color
- `ps`: /proc parser, format string engine, process filter, output formatter
### Tier 4 — Shells (30-60+ source files)
`sh` is a full POSIX-compatible shell and `csh` is a C shell (tcsh port).
Both include lexers, parsers, job control, signal handling, built-in
commands, and editline integration.
## Key Differences from GNU Coreutils
| Feature | Corebinutils (BSD) | GNU Coreutils |
|------------------------|-----------------------------|----------------------------|
| License | BSD-3-Clause / BSD-2-Clause | GPL-3.0 |
| Default libc | musl | glibc |
| `echo` behavior | No `-e` flag (BSD compat) | `-e` for escape sequences |
| `test` parser | Recursive descent | Varies by implementation |
| `ls` birth time | `statx(2)` syscall | `statx(2)` or fallback |
| `dd` progress | SIGINFO + `status=progress` | `status=progress` |
| `sleep` units | `s`, `m`, `h`, `d` suffixes | `s`, `m`, `h`, `d` (GNU ext)|
| Build system | `./configure` + `GNUmakefile`| Autotools (autoconf/automake)|
| Error functions | `err(3)`/`warn(3)` from libc| `error()` from gnulib |
| FTS implementation | In-tree custom `fts.c` | gnulib FTS or `nftw(3)` |
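
As a small illustration of the `sleep` unit suffixes listed in the table, the sketch below converts an argument such as `1.5h` to seconds. `parse_duration()` is a hypothetical helper written for this overview; the actual `sleep` source may accept a different grammar.

```c
#include <stdlib.h>

/*
 * Convert "<number>[s|m|h|d]" to seconds.
 * Returns -1.0 on malformed input (hypothetical convention).
 */
static double
parse_duration(const char *arg)
{
    char *end;
    double value = strtod(arg, &end);

    if (end == arg || value < 0)
        return -1.0;                /* no digits, or negative */

    switch (*end) {
    case '\0':                      /* bare number means seconds */
    case 's':
        break;
    case 'm':
        value *= 60;
        break;
    case 'h':
        value *= 3600;
        break;
    case 'd':
        value *= 86400;
        break;
    default:
        return -1.0;                /* unknown unit */
    }

    /* Reject trailing characters after the unit, e.g. "5mx". */
    if (*end != '\0' && end[1] != '\0')
        return -1.0;
    return value;
}
```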
## Signal Handling Conventions
Most utilities follow a consistent signal handling pattern:
- **SIGINFO / SIGUSR1**: Progress reporting. `dd`, `chmod`, `sleep`, and
others install a handler that sets a `volatile sig_atomic_t` flag, which
the main loop checks to print status information.
- **SIGINT**: Graceful termination. Utilities performing recursive operations
check for pending signals between iterations.
- **SIGHUP**: In `ed`, triggers an emergency save of the edit buffer to a
temporary file.
Signal handlers are installed via `sigaction(2)` rather than the legacy
`signal(2)` function, ensuring reliable semantics across platforms.
## Error Handling Patterns
All utilities exit with standardized codes:
| Exit Code | Meaning |
|-----------|------------------------------------------|
| 0 | Success |
| 1 | General failure |
| 2 | Usage error (invalid arguments) |
| 124 | Command timed out (`timeout` only) |
| 125 | `timeout` internal error |
| 126 | Command found but not executable |
| 127 | Command not found |
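
For the last two rows, a utility that runs another command has to turn a failed `exec` into one of these codes. The sketch below assumes the conventional mapping, in which `ENOENT` means "not found" and any other failure means "found but not executable". The constant names and `exec_failure_status()` are hypothetical, not taken from `timeout`'s source.

```c
#include <errno.h>

/* Exit statuses from the table above (names are illustrative). */
enum {
    STATUS_TIMED_OUT = 124,
    STATUS_INTERNAL  = 125,
    STATUS_NOT_EXEC  = 126,
    STATUS_NOT_FOUND = 127
};

/*
 * Map the errno left by a failed execvp(3) to an exit status.
 * Assumption: ENOENT is the only "not found" case.
 */
static int
exec_failure_status(int exec_errno)
{
    return exec_errno == ENOENT ? STATUS_NOT_FOUND : STATUS_NOT_EXEC;
}
```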
Error messages follow the BSD pattern:
```c
error_errno("open %s", path); // "mv: open /foo: Permission denied"
error_msg("invalid mode: %s", arg); // "chmod: invalid mode: xyz"
```
Many utilities provide custom `error_errno()` / `error_msg()` wrappers that
prepend the program name, format the message, and optionally append
`strerror(errno)`.
## Memory Management
Corebinutils utilities follow BSD memory conventions:
- **Dynamic allocation**: `malloc(3)` with explicit `NULL` checks, typically
wrapped in `xmalloc()` that calls `err(1, "malloc")` on failure.
- **No fixed-size buffers** for user-controlled data (paths, format strings).
- **Adaptive buffer sizing**: `cat` and `cp` scale I/O buffers based on
available physical memory via `sysconf(_SC_PHYS_PAGES)`.
- **Explicit cleanup**: `free()` is called in long-running loops to avoid
accumulation, though single-pass utilities may rely on process exit.
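
A minimal sketch of the `xmalloc()` wrapper mentioned in the first item, assuming the `err(1, "malloc")` behavior described there. The zero-size check is an addition made for this example, since `malloc(0)` may legitimately return `NULL`.

```c
#include <err.h>
#include <stdlib.h>

/*
 * Allocate size bytes or terminate the program with a diagnostic.
 * Callers never need to check the result for NULL (except when size is 0).
 */
static void *
xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size != 0)
        err(1, "malloc");   /* prints "<prog>: malloc: <strerror>" and exits 1 */
    return p;
}
```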
### Buffer Strategy Example (from `cat.c` and `cp/utils.c`)
```c
#define PHYSPAGES_THRESHOLD (32*1024)
#define BUFSIZE_MAX         (2*1024*1024)
#define BUFSIZE_SMALL       (128*1024)

if (sysconf(_SC_PHYS_PAGES) > PHYSPAGES_THRESHOLD)
    bufsize = MIN(BUFSIZE_MAX, MAXPHYS * 8);
else
    bufsize = BUFSIZE_SMALL;
```
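
The excerpt above relies on `MAXPHYS` and `MIN`, which come from BSD headers. A self-contained version might look like the sketch below. The 128 KiB value for `MAXPHYS` is a stand-in chosen for illustration, not the value used by the port, and `choose_bufsize()` is a hypothetical function name.

```c
#include <stddef.h>
#include <unistd.h>

#define PHYSPAGES_THRESHOLD (32 * 1024)
#define BUFSIZE_MAX         (2 * 1024 * 1024)
#define BUFSIZE_SMALL       (128 * 1024)

/* ASSUMPTION: stand-in value; the port may define MAXPHYS differently. */
#ifndef MAXPHYS
#define MAXPHYS (128 * 1024)
#endif

#ifndef MIN
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

/* Pick an I/O buffer size based on the amount of physical memory. */
static size_t
choose_bufsize(void)
{
    /* sysconf() returns -1 on error, which selects the small buffer. */
    if (sysconf(_SC_PHYS_PAGES) > PHYSPAGES_THRESHOLD)
        return MIN(BUFSIZE_MAX, MAXPHYS * 8);
    return BUFSIZE_SMALL;
}
```

With these stand-in values, a machine with more than 32Ki physical pages gets a 1 MiB buffer (`MAXPHYS * 8`, below the 2 MiB cap) and a smaller machine gets 128 KiB.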
## Testing
Each utility directory may contain its own test suite, invoked through:
```sh
make -f GNUmakefile test
```
Or for a specific utility:
```sh
make -f GNUmakefile check-cat
make -f GNUmakefile check-ls
```
Tests that require root privileges or specific kernel features print `SKIP`
and continue without failing the overall test run.
## Building Quick Reference
```sh
cd corebinutils/
./configure # Detect toolchain, generate build files
make -f GNUmakefile -j$(nproc) all # Build all utilities
make -f GNUmakefile test # Run test suites
make -f GNUmakefile stage # Copy binaries to out/bin/
make -f GNUmakefile install # Install to $PREFIX/bin
```
See [building.md](building.md) for detailed configure options and build
customization.
## Further Reading
- [architecture.md](architecture.md) — Build system internals, code organization
- [building.md](building.md) — Configure options, dependencies, cross-compilation
- Individual utility documentation: [cat.md](cat.md), [ls.md](ls.md),
[dd.md](dd.md), [ps.md](ps.md), etc.
- [code-style.md](code-style.md) — C coding conventions
- [error-handling.md](error-handling.md) — Error patterns and exit codes