# cgit — URL Routing and Request Dispatch
## Overview
cgit supports two URL schemes: virtual-root (path-based) and query-string.
Incoming requests are parsed into a `cgit_query` structure and dispatched to
one of 21 command handlers via a table of function pointers.
Source files: `cgit.c` (querystring parsing, routing), `parsing.c`
(`cgit_parse_url`), `cmd.c` (command table).
## URL Schemes
### Virtual Root (Path-Based)
When `virtual-root` is configured, URLs use clean paths:
```
/cgit/ → repository list
/cgit/repo.git/ → summary
/cgit/repo.git/log/ → log (default branch)
/cgit/repo.git/log/main/path → log for path on branch main
/cgit/repo.git/tree/v1.0/src/ → tree view at tag v1.0
/cgit/repo.git/commit/?id=abc → commit view
```
The path after the virtual root is passed in `PATH_INFO` and parsed by
`cgit_parse_url()`.
### Query-String (CGI)
Without virtual root, all parameters are passed in the query string:
```
/cgit.cgi?url=repo.git/log/main/path&ofs=50
```
## Query Structure
All parsed parameters are stored in `ctx.qry`:
```c
struct cgit_query {
	char *raw;        /* raw URL / PATH_INFO */
	char *repo;       /* repository URL */
	char *page;       /* page/command name */
	char *search;     /* search string */
	char *grep;       /* grep pattern */
	char *head;       /* branch reference */
	char *sha1;       /* object SHA-1 */
	char *sha2;       /* second SHA-1 (for diffs) */
	char *path;       /* file/dir path within repo */
	char *name;       /* snapshot name / ref name */
	char *url;        /* combined URL path */
	char *mimetype;   /* requested MIME type */
	char *etag;       /* ETag from client */
	int nohead;       /* suppress header */
	int ofs;          /* pagination offset */
	int has_symref;   /* path contains a symbolic ref */
	int has_sha1;     /* explicit SHA was given */
	int has_dot;      /* path contains '..' */
	int ignored;      /* request should be ignored */
	char *sort;       /* sort field */
	int showmsg;      /* show full commit message */
	int ssdiff;       /* side-by-side diff */
	int show_all;     /* show all items */
	int context;      /* diff context lines */
	int follow;       /* follow renames */
	int log_hierarchical_threading;
};
```
## URL Parsing: `cgit_parse_url()`
In `parsing.c`, the URL is decomposed into repo, page, and path:
```c
/* Simplified sketch of the logic; error handling is abridged. */
void cgit_parse_url(const char *url)
{
	char *p;
	struct cgit_repo *repo;

	/* Step 1: try progressively longer prefixes as repo URLs. */
	/* For each '/' in the URL, check whether the prefix matches a repo. */
	for (p = strchr(url, '/'); p; p = strchr(p + 1, '/')) {
		*p = '\0';                            /* temporarily terminate the prefix */
		repo = cgit_get_repoinfo(url);
		if (repo)
			ctx.qry.repo = xstrdup(url);  /* copy the prefix while terminated */
		*p = '/';
		if (repo) {
			ctx.repo = repo;
			url = p + 1;                  /* remaining part */
			break;
		}
	}
	/* If no '/' is found, the whole URL is tried as a repo name (omitted here). */

	/* Step 2: parse the remaining path as page/ref/path. */
	/* e.g., "log/main/src/file.c" → page="log", path="main/src/file.c" */
	p = strchr(url, '/');
	if (p) {
		ctx.qry.page = xstrndup(url, p - url);
		ctx.qry.path = trim_end(p + 1, '/');
	} else if (*url) {
		ctx.qry.page = xstrdup(url);
	}
}
```
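The two steps can be tried outside cgit with a small stand-alone sketch. Everything here is an assumption made for illustration: `known_repos`, `struct parsed_url`, `is_known_repo()` and `parse_url()` are hypothetical names, and the fixed-size buffers replace cgit's heap-allocated `ctx.qry` fields.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for cgit's configured repository list. */
static const char *known_repos[] = { "repo.git", "group/tool.git" };

/* Hypothetical result type; cgit stores these values in ctx.qry instead. */
struct parsed_url {
    char repo[128];
    char page[64];
    char path[256];
};

static int is_known_repo(const char *s, size_t len)
{
    size_t n = sizeof(known_repos) / sizeof(known_repos[0]);

    for (size_t i = 0; i < n; i++)
        if (strlen(known_repos[i]) == len && !strncmp(known_repos[i], s, len))
            return 1;
    return 0;
}

/* Returns 1 if a repository prefix was found, 0 otherwise. */
static int parse_url(const char *url, struct parsed_url *out)
{
    const char *rest = NULL;
    const char *slash;
    size_t len = strlen(url);

    memset(out, 0, sizeof(*out));

    /* Step 1: test every prefix that ends at a '/' or at the end of the URL. */
    for (size_t i = 1; i <= len; i++) {
        if (url[i] != '/' && url[i] != '\0')
            continue;
        if (is_known_repo(url, i)) {
            snprintf(out->repo, sizeof(out->repo), "%.*s", (int)i, url);
            rest = url[i] ? url + i + 1 : url + i;
            break;
        }
    }
    if (!rest)
        return 0;

    /* Step 2: the first remaining component is the page, the rest the path. */
    slash = strchr(rest, '/');
    if (slash) {
        size_t plen;

        snprintf(out->page, sizeof(out->page), "%.*s", (int)(slash - rest), rest);
        snprintf(out->path, sizeof(out->path), "%s", slash + 1);
        /* Drop trailing slashes, mirroring the trim_end() call above. */
        plen = strlen(out->path);
        while (plen > 0 && out->path[plen - 1] == '/')
            out->path[--plen] = '\0';
    } else {
        snprintf(out->page, sizeof(out->page), "%s", rest);
    }
    return 1;
}
```

Calling `parse_url("repo.git/log/main/src/file.c", &u)` fills `u.repo` with `repo.git`, `u.page` with `log`, and `u.path` with `main/src/file.c`. Unlike the excerpt, the sketch never writes into the input string, which keeps the `const` qualifier honest at the cost of one extra length comparison per candidate.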
## Query String Parsing: `querystring_cb()`
HTTP query parameters and POST form data are decoded by `querystring_cb()`
in `cgit.c`. The function maps URL parameter names to `ctx.qry` fields:
```c
static void querystring_cb(const char *name, const char *value)
{
	if (!strcmp(name, "url")) ctx.qry.url = xstrdup(value);
	else if (!strcmp(name, "p")) ctx.qry.page = xstrdup(value);
	else if (!strcmp(name, "q")) ctx.qry.search = xstrdup(value);
	else if (!strcmp(name, "h")) ctx.qry.head = xstrdup(value);
	else if (!strcmp(name, "id")) ctx.qry.sha1 = xstrdup(value);
	else if (!strcmp(name, "id2")) ctx.qry.sha2 = xstrdup(value);
	else if (!strcmp(name, "ofs")) ctx.qry.ofs = atoi(value);
	else if (!strcmp(name, "path")) ctx.qry.path = xstrdup(value);
	else if (!strcmp(name, "name")) ctx.qry.name = xstrdup(value);
	else if (!strcmp(name, "mimetype")) ctx.qry.mimetype = xstrdup(value);
	else if (!strcmp(name, "s")) ctx.qry.sort = xstrdup(value);
	else if (!strcmp(name, "showmsg")) ctx.qry.showmsg = atoi(value);
	else if (!strcmp(name, "ss")) ctx.qry.ssdiff = atoi(value);
	else if (!strcmp(name, "all")) ctx.qry.show_all = atoi(value);
	else if (!strcmp(name, "context")) ctx.qry.context = atoi(value);
	else if (!strcmp(name, "follow")) ctx.qry.follow = atoi(value);
	else if (!strcmp(name, "dt")) ctx.qry.dt = atoi(value);
	else if (!strcmp(name, "grep")) ctx.qry.grep = xstrdup(value);
	else if (!strcmp(name, "etag")) ctx.qry.etag = xstrdup(value);
}
```
### URL Parameter Reference
| Parameter | Query Field | Type | Description |
|-----------|------------|------|-------------|
| `url` | `qry.url` | string | Full URL path (repo/page/path) |
| `p` | `qry.page` | string | Page/command name |
| `q` | `qry.search` | string | Search string |
| `h` | `qry.head` | string | Branch/ref name |
| `id` | `qry.sha1` | string | Object SHA-1 |
| `id2` | `qry.sha2` | string | Second SHA-1 (diffs) |
| `ofs` | `qry.ofs` | int | Pagination offset |
| `path` | `qry.path` | string | File path in repo |
| `name` | `qry.name` | string | Reference/snapshot name |
| `mimetype` | `qry.mimetype` | string | MIME type override |
| `s` | `qry.sort` | string | Sort field |
| `showmsg` | `qry.showmsg` | int | Show full commit message |
| `ss` | `qry.ssdiff` | int | Side-by-side diff toggle |
| `all` | `qry.show_all` | int | Show all entries |
| `context` | `qry.context` | int | Diff context lines |
| `follow` | `qry.follow` | int | Follow renames in log |
| `dt` | `qry.dt` | int | Diff type |
| `grep` | `qry.grep` | string | Grep pattern for log search |
| `etag` | `qry.etag` | string | ETag for conditional requests |
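
The callback pattern behind this mapping can be sketched in isolation. The code below is a minimal illustration under stated assumptions, not cgit's parser: `struct mini_query`, `mini_querystring_cb()` and `parse_querystring()` are hypothetical names, only three parameters are handled, and percent-decoding is deliberately left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical three-field subset of the query structure. */
struct mini_query {
    char page[64];
    char head[64];
    int ofs;
};

typedef void (*param_cb)(struct mini_query *q, const char *name,
                         const char *value);

/* Same if/else shape as querystring_cb(), reduced to three parameters. */
static void mini_querystring_cb(struct mini_query *q, const char *name,
                                const char *value)
{
    if (!strcmp(name, "p"))
        snprintf(q->page, sizeof(q->page), "%s", value);
    else if (!strcmp(name, "h"))
        snprintf(q->head, sizeof(q->head), "%s", value);
    else if (!strcmp(name, "ofs"))
        q->ofs = atoi(value);
    /* Unknown parameters are silently ignored. */
}

/* Splits "a=1&b=2" and calls cb once per pair. No percent-decoding. */
static void parse_querystring(const char *qs, param_cb cb,
                              struct mini_query *q)
{
    char buf[512];
    char *pair, *next;

    snprintf(buf, sizeof(buf), "%s", qs);  /* private, writable copy */
    for (pair = buf; pair; pair = next) {
        char *eq;

        next = strchr(pair, '&');
        if (next)
            *next++ = '\0';
        if (!*pair)
            continue;  /* skip empty segments such as "a=1&&b=2" */
        eq = strchr(pair, '=');
        if (eq)
            *eq++ = '\0';
        cb(q, pair, eq ? eq : "");
    }
}
```

Parsing `p=log&h=main&ofs=50` into a zero-initialized `struct mini_query` leaves `page` set to `log`, `head` set to `main`, and `ofs` set to 50. Keeping the name-to-field mapping in a callback means the splitter stays generic and can be reused for POST form data, which matches how the section describes `querystring_cb()` being used.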
## Command Dispatch Table
The command table in `cmd.c` maps page names to handler functions:
```c
#define def_cmd(name, want_hierarchical, want_repo, want_layout, want_vpath, is_clone) \
	{#name, cmd_##name, want_hierarchical, want_repo, want_layout, want_vpath, is_clone}

static struct cgit_cmd cmds[] = {
	def_cmd(atom,     1, 1, 0, 0, 0),
	def_cmd(about,    0, 1, 1, 0, 0),
	def_cmd(blame,    1, 1, 1, 1, 0),
	def_cmd(blob,     1, 1, 0, 0, 0),
	def_cmd(commit,   1, 1, 1, 1, 0),
	def_cmd(diff,     1, 1, 1, 1, 0),
	def_cmd(head,     1, 1, 0, 0, 1),
	def_cmd(info,     1, 1, 0, 0, 1),
	def_cmd(log,      1, 1, 1, 1, 0),
	def_cmd(ls_cache, 0, 0, 0, 0, 0),
	def_cmd(objects,  1, 1, 0, 0, 1),
	def_cmd(patch,    1, 1, 1, 1, 0),
	def_cmd(plain,    1, 1, 0, 1, 0),
	def_cmd(rawdiff,  1, 1, 0, 1, 0),
	def_cmd(refs,     1, 1, 1, 0, 0),
	def_cmd(repolist, 0, 0, 1, 0, 0),
	def_cmd(snapshot, 1, 1, 0, 0, 0),
	def_cmd(stats,    1, 1, 1, 1, 0),
	def_cmd(summary,  1, 1, 1, 0, 0),
	def_cmd(tag,      1, 1, 1, 0, 0),
	def_cmd(tree,     1, 1, 1, 1, 0),
};
```
### Command Flags
| Flag | Meaning |
|------|---------|
| `want_hierarchical` | Parse hierarchical path from URL |
| `want_repo` | Requires a repository context |
| `want_layout` | Render within HTML page layout |
| `want_vpath` | Accept a virtual path (file path in repo) |
| `is_clone` | HTTP clone protocol endpoint |
### Lookup: `cgit_get_cmd()`
```c
struct cgit_cmd *cgit_get_cmd(const char *name)
{
	for (int i = 0; i < ARRAY_SIZE(cmds); i++)
		if (!strcmp(cmds[i].name, name))
			return &cmds[i];
	return NULL;
}
```
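The function performs a linear search. With only 21 entries, that costs at most 21 `strcmp()` calls per request, which is negligible next to the Git operations a page performs.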
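
The table-plus-lookup pattern compiles on its own once the cgit types are replaced with stand-ins. In this sketch `struct mini_ctx`, `struct mini_cmd`, the three handlers, and the two-flag `def_cmd` variant are all hypothetical simplifications; only the stringizing (`#name`) and token-pasting (`cmd_##name`) technique is taken from the table above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical request context; records which handler ran. */
struct mini_ctx {
    const char *last_page;
};

typedef void (*cmd_fn)(struct mini_ctx *ctx);

/* Hypothetical, reduced command descriptor with two of the five flags. */
struct mini_cmd {
    const char *name;
    cmd_fn fn;
    unsigned want_repo : 1;
    unsigned want_layout : 1;
};

static void cmd_log(struct mini_ctx *ctx)      { ctx->last_page = "log"; }
static void cmd_repolist(struct mini_ctx *ctx) { ctx->last_page = "repolist"; }
static void cmd_summary(struct mini_ctx *ctx)  { ctx->last_page = "summary"; }

/* #name yields the string "log"; cmd_##name yields the symbol cmd_log. */
#define def_cmd(name, want_repo, want_layout) \
    { #name, cmd_##name, want_repo, want_layout }

static const struct mini_cmd cmds[] = {
    def_cmd(log,      1, 1),
    def_cmd(repolist, 0, 1),
    def_cmd(summary,  1, 1),
};

/* Linear scan over the table; returns NULL for unknown page names. */
static const struct mini_cmd *get_cmd(const char *name)
{
    for (size_t i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++)
        if (!strcmp(cmds[i].name, name))
            return &cmds[i];
    return NULL;
}
```

A caller would look up the descriptor with `get_cmd("log")`, check its flags, and then invoke `cmd->fn(&ctx)`. Because the macro derives both the name string and the handler symbol from one token, a page name and its handler cannot drift apart.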
## Request Processing Flow
In `process_request()` (`cgit.c`):
```
1. Parse PATH_INFO via cgit_parse_url()
2. Parse QUERY_STRING via http_parse_querystring(querystring_cb)
3. Parse POST body (for authentication forms)
4. Resolve repository: cgit_get_repoinfo(ctx.qry.repo)
5. Determine command: cgit_get_cmd(ctx.qry.page)
6. If no page specified:
   - With repo → default to "summary"
   - Without repo → default to "repolist"
7. Check command flags:
   - want_repo but no repo → "Repository not found" error
   - is_clone and HTTP clone disabled → 404
8. Handle authentication if auth-filter is configured
9. Execute: cmd->fn(&ctx)
```
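
Steps 6 and 7 are the only branching decisions in the flow, so they are easy to isolate. The sketch below is an assumption-laden illustration: `effective_page()`, `check_flags()` and `enum dispatch_result` are hypothetical names, and the real `process_request()` works on `ctx` rather than on plain integer arguments.

```c
#include <stddef.h>

/* Hypothetical outcome of the flag checks in step 7. */
enum dispatch_result {
    DISPATCH_OK,
    DISPATCH_NO_REPO,        /* "Repository not found" error page */
    DISPATCH_CLONE_DISABLED  /* answered with a 404 */
};

/* Step 6: pick a page name when the request did not specify one. */
static const char *effective_page(const char *page, int has_repo)
{
    if (page && *page)
        return page;
    return has_repo ? "summary" : "repolist";
}

/* Step 7: validate the command flags before the handler runs. */
static enum dispatch_result check_flags(int want_repo, int is_clone,
                                        int has_repo, int http_clone_enabled)
{
    if (want_repo && !has_repo)
        return DISPATCH_NO_REPO;
    if (is_clone && !http_clone_enabled)
        return DISPATCH_CLONE_DISABLED;
    return DISPATCH_OK;
}
```

For example, a request for a clone endpoint on an existing repository yields `DISPATCH_CLONE_DISABLED` when HTTP clone is switched off, and a request with no page and no repository falls back to `repolist`.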
### Hierarchical Path Resolution
When `want_hierarchical=1`, cgit splits `ctx.qry.path` into a reference
(branch/tag/SHA) and a file path. It tries progressively longer prefixes
of the path as git references until one resolves:
```
path = "main/src/lib/file.c"
  try: "main" → found branch "main"
  qry.head = "main"
  qry.path = "src/lib/file.c"
```
If no prefix resolves, the entire path is treated as a file path within the
default branch.
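
The prefix search matters most for ref names that themselves contain a slash. The following sketch assumes a fixed, hypothetical ref list (`known_refs`) in place of a real Git ref lookup, and `ref_exists()` and `split_ref_path()` are illustrative names rather than cgit functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical ref list; cgit would ask Git whether a ref resolves. */
static const char *known_refs[] = { "main", "v1.0", "feature/login" };

static int ref_exists(const char *s, size_t len)
{
    size_t n = sizeof(known_refs) / sizeof(known_refs[0]);

    for (size_t i = 0; i < n; i++)
        if (strlen(known_refs[i]) == len && !strncmp(known_refs[i], s, len))
            return 1;
    return 0;
}

/*
 * Tries progressively longer '/'-delimited prefixes of `in` as a ref.
 * Returns 1 and fills head/path on a match. Otherwise returns 0, leaves
 * head empty, and copies the whole input into path.
 */
static int split_ref_path(const char *in, char *head, size_t hsz,
                          char *path, size_t psz)
{
    size_t len = strlen(in);

    for (size_t i = 1; i <= len; i++) {
        if (in[i] != '/' && in[i] != '\0')
            continue;
        if (ref_exists(in, i)) {
            snprintf(head, hsz, "%.*s", (int)i, in);
            snprintf(path, psz, "%s", in[i] ? in + i + 1 : "");
            return 1;
        }
    }
    head[0] = '\0';
    snprintf(path, psz, "%s", in);
    return 0;
}
```

With this list, `feature/login/README` first tests `feature`, which is not a ref, then `feature/login`, which is, leaving `README` as the file path. An input such as `docs/index.md` matches no prefix and is returned unchanged as the path.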
## Clone Protocol Endpoints
Three commands serve the Git HTTP clone protocol:
| Endpoint | Path | Function |
|----------|------|----------|
| `info` | `repo/info/refs` | `cgit_clone_info()` — advertise refs |
| `objects` | `repo/objects/*` | `cgit_clone_objects()` — serve packfiles |
| `head` | `repo/HEAD` | `cgit_clone_head()` — serve HEAD ref |
These are only active when `enable-http-clone=1` (default).
## URL Generation
`ui-shared.c` provides URL construction helpers:
```c
const char *cgit_repourl(const char *reponame);
const char *cgit_fileurl(const char *reponame, const char *pagename,
const char *filename, const char *query);
const char *cgit_pageurl(const char *reponame, const char *pagename,
const char *query);
const char *cgit_currurl(void);
```
When `virtual-root` is set, these produce clean paths. Otherwise, they
produce query-string URLs.
### Example URL Generation
```c
/* With virtual-root=/cgit/ */
cgit_repourl("myrepo");
/* → "/cgit/myrepo/" */
cgit_fileurl("myrepo", "tree", "src/main.c", "h=dev");
/* → "/cgit/myrepo/tree/src/main.c?h=dev" */
cgit_pageurl("myrepo", "log", "ofs=50");
/* → "/cgit/myrepo/log/?ofs=50" */
```
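
The switch between the two schemes can be sketched with a single formatting function. This is an illustration under assumptions, not the code in `ui-shared.c`: `build_pageurl()` is a hypothetical helper, it writes into a caller-supplied buffer instead of returning an allocated string, and the query-string layout is modeled on the `?url=` example shown earlier.

```c
#include <stdio.h>
#include <string.h>

/*
 * Builds a page URL into buf. A non-NULL virtual_root selects the
 * path-based scheme; NULL selects the query-string scheme, which needs
 * script_name. Returns the snprintf() result (length that was needed).
 */
static int build_pageurl(char *buf, size_t sz, const char *virtual_root,
                         const char *script_name, const char *repo,
                         const char *page, const char *query)
{
    if (virtual_root) {
        size_t n = strlen(virtual_root);
        /* Avoid a doubled slash when the root already ends in '/'. */
        const char *sep = (n > 0 && virtual_root[n - 1] == '/') ? "" : "/";

        return snprintf(buf, sz, "%s%s%s/%s/%s%s", virtual_root, sep, repo,
                        page, query ? "?" : "", query ? query : "");
    }
    /* Query-string scheme: extra parameters are joined with '&'. */
    return snprintf(buf, sz, "%s?url=%s/%s/%s%s", script_name, repo, page,
                    query ? "&" : "", query ? query : "");
}
```

With `virtual_root` set to `/cgit/`, the arguments `myrepo`, `log`, and `ofs=50` produce `/cgit/myrepo/log/?ofs=50`. With `virtual_root` set to `NULL` and a script name of `/cgit.cgi`, the same arguments produce `/cgit.cgi?url=myrepo/log/&ofs=50`.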
## Content-Type and HTTP Headers
The response content type is set by the command handler before generating
output. Common types:
| Page | Content-Type |
|------|-------------|
| HTML pages | `text/html` |
| atom | `text/xml` |
| blob | auto-detected from content |
| plain | MIME type from extension or `application/octet-stream` |
| snapshot | `application/x-gzip`, etc. |
| patch | `text/plain` |
| clone endpoints | `text/plain`, `application/x-git-packed-objects` |
Headers are emitted by `cgit_print_http_headers()` in `ui-shared.c` before
any page content.
## Error Handling
If a requested repository or page is not found, cgit renders an error page
within the standard layout. HTTP status codes:
| Condition | Status |
|-----------|--------|
| Normal page | 200 OK |
| Auth redirect | 302 Found |
| Not modified | 304 Not Modified |
| Bad request | 400 Bad Request |
| Auth required | 401 Unauthorized |
| Repo not found | 404 Not Found |
| Page not found | 404 Not Found |
The status code is set in `ctx.page.status` and emitted by the HTTP header
function.