1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
|
# cp — Copy Files and Directories
## Overview
`cp` copies files and directory trees. It supports recursive copying, archive
mode (preserving metadata), symlink handling policies, sparse file detection,
and interactive/forced overwrite modes.
**Source**: `cp/cp.c`, `cp/utils.c`, `cp/extern.h`, `cp/fts.c`, `cp/fts.h`
**Origin**: BSD 4.4, University of California, Berkeley
**License**: BSD-3-Clause
## Synopsis
```
cp [-HLPRafilnpsvx] [--sort] [-N mode] source_file target_file
cp [-HLPRafilnpsvx] [--sort] [-N mode] source_file ... target_directory
```
## Options
| Flag | Description |
|------|-------------|
| `-R` / `-r` | Recursive: copy directories and their contents |
| `-a` | Archive mode: equivalent to `-R -P -p` |
| `-f` | Force: remove existing target before copying |
| `-i` | Interactive: prompt before overwriting |
| `-l` | Create hard links instead of copying |
| `-n` | No-clobber: do not overwrite existing files |
| `-p` | Preserve: maintain mode, ownership, timestamps |
| `-s` | Create symbolic links instead of copying |
| `-v` | Verbose: print each file as it is copied |
| `-x` | One-filesystem: do not cross mount points |
| `-H` | Follow symlinks on command line (with `-R`) |
| `-L` | Follow all symbolic links (with `-R`) |
| `-P` | Do not follow symbolic links (default with `-R`) |
| `--sort` | Sort entries numerically during recursive copy |
| `-N mode` | Apply negated permissions to regular files |
## Source Analysis
### cp.c — Main Logic
#### Key Data Structures
```c
typedef struct {
char *p_end; /* Pointer to NULL at end of path */
char *target_end; /* Pointer to end of target base */
char p_path[PATH_MAX]; /* Current target path buffer */
int p_fd; /* Directory file descriptor */
} PATH_T;
struct options {
bool recursive;
bool force;
bool interactive;
bool no_clobber;
bool preserve;
bool hard_link;
bool symbolic_link;
bool verbose;
bool one_filesystem;
/* ... */
};
```
#### Key Functions
| Function | Purpose |
|----------|---------|
| `main()` | Parse options, stat destination, determine copy mode |
| `copy()` | Main recursive copy driver using FTS traversal |
| `ftscmp()` | qsort comparator for `--sort` numeric ordering |
| `local_strlcpy()` | Portability wrapper for `strlcpy` |
| `local_asprintf()` | Portability wrapper for `asprintf` |
#### Copy Mode Detection
`cp` determines the operation mode from the arguments:
```c
/* Three cases:
* 1. cp file1 file2 → file-to-file copy
* 2. cp file1 file2 dir/ → files into directory
* 3. cp -R dir1 dir2 → directory to new directory
*/
if (stat(target, &sb) == 0 && S_ISDIR(sb.st_mode))
type = DIR_TO_DIR;
else if (argc == 2)
type = FILE_TO_FILE;
else
usage(); /* Multiple sources require directory target */
```
### utils.c — Copy Engine
#### Adaptive Buffer Sizing
Like `cat`, `cp` adapts its I/O buffer to available memory:
```c
#define PHYSPAGES_THRESHOLD (32*1024)
#define BUFSIZE_MAX (2*1024*1024)
#define BUFSIZE_SMALL (MAXPHYS) /* 128 KB */
static ssize_t
copy_fallback(int from_fd, int to_fd)
{
if (buf == NULL) {
if (sysconf(_SC_PHYS_PAGES) > PHYSPAGES_THRESHOLD)
bufsize = MIN(BUFSIZE_MAX, MAXPHYS * 8);
else
bufsize = BUFSIZE_SMALL;
buf = malloc(bufsize);
}
/* read/write loop */
}
```
#### Key Functions in utils.c
| Function | Purpose |
|----------|---------|
| `copy_fallback()` | Buffer-based file copy with adaptive sizing |
| `copy_file()` | Copy regular file, potentially using `copy_file_range(2)` |
| `copy_link()` | Copy symbolic link (read target, create new symlink) |
| `copy_fifo()` | Copy FIFO via `mkfifo(2)` |
| `copy_special()` | Copy device nodes via `mknod(2)` |
| `setfile()` | Set timestamps, ownership, permissions on target |
| `preserve_fd_acls()` | Copy POSIX ACLs between file descriptors |
### FTS Traversal
`cp -R` uses an in-tree FTS (File Traversal Stream) implementation:
```c
FTS *ftsp;
FTSENT *curr;
int fts_options = FTS_NOCHDIR | FTS_PHYSICAL;
if (Lflag)
fts_options &= ~FTS_PHYSICAL;
fts_options |= FTS_LOGICAL;
ftsp = fts_open(argv, fts_options, NULL);
while ((curr = fts_read(ftsp)) != NULL) {
switch (curr->fts_info) {
case FTS_D: /* Directory pre-visit */
mkdir(target_path, curr->fts_statp->st_mode);
break;
case FTS_F: /* Regular file */
copy_file(curr->fts_path, target_path);
break;
case FTS_SL: /* Symbolic link */
copy_link(curr->fts_path, target_path);
break;
case FTS_DP: /* Directory post-visit */
setfile(curr->fts_statp, target_path);
break;
}
}
```
### Symlink Handling Modes
| Mode | Flag | Behavior |
|------|------|----------|
| Physical | `-P` (default) | Copy symlinks as symlinks |
| Command-line follow | `-H` | Follow symlinks named on command line |
| Logical | `-L` | Follow all symlinks, copy targets |
### Archive Mode
The `-a` flag combines three flags for complete archival:
```sh
cp -a source/ dest/
# Equivalent to:
cp -R -P -p source/ dest/
```
- `-R` — Recursive copy
- `-P` — Don't follow symlinks (preserve them as-is)
- `-p` — Preserve timestamps, ownership, and permissions
### Metadata Preservation (`-p`)
When `-p` is specified, `cp` preserves:
| Metadata | System Call |
|----------|-------------|
| Access time | `utimensat(2)` |
| Modification time | `utimensat(2)` |
| File mode | `fchmod(2)` / `chmod(2)` |
| Owner/group | `fchown(2)` / `lchown(2)` |
| ACLs | `acl_get_fd()` / `acl_set_fd()` (if available) |
### Cycle Detection
During recursive copy, `cp` tracks visited directories by `(dev, ino)`
pairs to detect filesystem cycles created by symlinks or bind mounts:
```c
/* If we've already visited this inode on this device, skip it */
if (cycle_check(curr->fts_statp->st_dev, curr->fts_statp->st_ino)) {
warnx("%s: directory causes a cycle", curr->fts_path);
fts_set(ftsp, curr, FTS_SKIP);
continue;
}
```
## System Calls Used
| Syscall | Purpose |
|---------|---------|
| `open(2)` | Open source and target files |
| `read(2)` / `write(2)` | Buffer-based data copy |
| `copy_file_range(2)` | Zero-copy in-kernel transfer |
| `mkdir(2)` | Create target directories |
| `mkfifo(2)` | Create FIFO copies |
| `mknod(2)` | Create device node copies |
| `symlink(2)` | Create symbolic links |
| `link(2)` | Create hard links (with `-l`) |
| `readlink(2)` | Read symlink target |
| `fchmod(2)` | Set permissions on target |
| `fchown(2)` | Set ownership on target |
| `utimensat(2)` | Set timestamps on target |
| `fstat(2)` | Check file type and metadata |
## Examples
```sh
# Simple file copy
cp source.txt dest.txt
# Recursive directory copy
cp -R src/ dest/
# Archive mode (preserve everything)
cp -a project/ backup/project/
# Interactive overwrite
cp -i newfile.txt existing.txt
# Create hard links instead of copies
cp -l large_file.dat link_to_large_file.dat
# Don't cross filesystem boundaries
cp -Rx /home/user/ /backup/home/user/
# Verbose recursive copy
cp -Rv config/ /etc/myapp/
```
## Exit Codes
| Code | Meaning |
|------|---------|
| 0 | All files copied successfully |
| 1 | Error copying one or more files |
## Differences from GNU cp
- No `--reflink` option for CoW copies
- No `--sparse=auto/always/never` (sparse handling is automatic)
- `--sort` flag for sorted recursive output (not in GNU)
- `-N` flag for negated permissions (not in GNU)
- Uses in-tree FTS instead of gnulib
- No SELinux context preservation (use `--preserve=context` in GNU)
|