# cgit — Diff Engine
## Overview
cgit's diff engine renders differences between commits, trees, and blobs.
It supports three diff modes: unified, side-by-side, and stat-only. The
engine builds on the internal diff machinery of the Git library that cgit
links against and adds HTML rendering on top.
Source files: `ui-diff.c`, `ui-diff.h`, `ui-ssdiff.c`, `ui-ssdiff.h`,
`shared.c` (diff helpers).
## Diff Types
```c
#define DIFF_UNIFIED 0 /* traditional unified diff */
#define DIFF_SSDIFF 1 /* side-by-side diff */
#define DIFF_STATONLY 2 /* only show diffstat */
```
The diff type is selected by the `ss` query parameter or the
`side-by-side-diffs` configuration directive.
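The selection rule can be sketched as a small helper. This is a minimal illustration rather than cgit's own code: `select_diff_type()` is a hypothetical name, and the precedence (an explicit `ss` value overrides the configured default) is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

#define DIFF_UNIFIED  0
#define DIFF_SSDIFF   1
#define DIFF_STATONLY 2

/*
 * Hypothetical helper: pick the diff type for one request.
 * query_ss is the raw value of the `ss` query parameter, or NULL if absent.
 * cfg_ssdiff is the value of the `side-by-side-diffs` directive.
 * Assumption: an explicit query parameter overrides the configured default.
 */
static int select_diff_type(const char *query_ss, int cfg_ssdiff)
{
	if (query_ss != NULL)
		return atoi(query_ss) ? DIFF_SSDIFF : DIFF_UNIFIED;
	return cfg_ssdiff ? DIFF_SSDIFF : DIFF_UNIFIED;
}
```

With this rule, a request carrying `ss=0` would get a unified diff even when the site default is side-by-side.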
## Diffstat
### File Info Structure
```c
struct fileinfo {
	char status;               /* 'A'dd, 'D'elete, 'M'odify, 'R'ename, etc. */
	unsigned long old_size;
	unsigned long new_size;
	int binary;
	struct object_id old_oid;  /* old blob SHA */
	struct object_id new_oid;  /* new blob SHA */
	unsigned short old_mode;
	unsigned short new_mode;
	char *old_path;
	char *new_path;
	int added;                 /* lines added */
	int removed;               /* lines removed */
};
```
### Collecting File Changes: `inspect_filepair()`
For each changed file in a commit, `inspect_filepair()` records the change
information:
```c
static void inspect_filepair(struct diff_filepair *pair)
{
	/* populate a fileinfo entry (`info`) from the diff_filepair */
	files++;
	switch (pair->status) {
	case DIFF_STATUS_ADDED:
		info->status = 'A';
		break;
	case DIFF_STATUS_DELETED:
		info->status = 'D';
		break;
	case DIFF_STATUS_MODIFIED:
		info->status = 'M';
		break;
	case DIFF_STATUS_RENAMED:
		info->status = 'R';
		/* old_path and new_path differ */
		break;
	case DIFF_STATUS_COPIED:
		info->status = 'C';
		break;
	/* ... */
	}
}
```
### Rendering Diffstat: `cgit_print_diffstat()`
```c
void cgit_print_diffstat(const struct object_id *old,
                         const struct object_id *new,
                         const char *prefix)
```
Renders an HTML table showing changed files with bar graphs:
```text
status | path       | changed lines | bar graph
M      | src/main.c | 42            | (bar)
...
5 files changed, 120 insertions, 45 deletions
```
The bar graph width is calculated proportionally to the maximum changed
lines across all files.
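The proportional scaling can be sketched as follows. `bar_width()` and its `max_width` parameter are hypothetical names chosen for illustration; the source does not state the unit or the rounding cgit uses, so integer truncation is an assumption here.

```c
/*
 * Hypothetical helper: scale one file's changed-line count to a bar width.
 * max_changed is the largest changed-line count among all files in the
 * diffstat; max_width is the width given to that largest bar.
 * Assumption: the result is truncated toward zero.
 */
static int bar_width(int changed, int max_changed, int max_width)
{
	if (changed <= 0 || max_changed <= 0 || max_width <= 0)
		return 0;
	if (changed >= max_changed)
		return max_width;
	/* widen before multiplying so large counts cannot overflow int */
	return (int)((long long)changed * max_width / max_changed);
}
```

The file with the most changes always gets the full width, and every other bar is a fraction of it.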
## Unified Diff
### `cgit_print_diff()`
The main diff rendering function:
```c
void cgit_print_diff(const char *new_rev, const char *old_rev,
                     const char *prefix, int show_ctrls, int raw)
```
Parameters:
- `new_rev` — New commit SHA
- `old_rev` — Old commit SHA (optional; defaults to parent)
- `prefix` — Path prefix filter (show only diffs under this path)
- `show_ctrls` — Show diff controls (diff type toggle buttons)
- `raw` — Output raw diff without HTML wrapping
### Diff Controls
When `show_ctrls=1`, diff mode toggle buttons are rendered:
```html
Diff options
```
### Filepair Callback: `filepair_cb()`
For each changed file, `filepair_cb()` renders the diff:
```c
static void filepair_cb(struct diff_filepair *pair)
{
	/* emit file header (surrounding markup omitted) */
	htmlf("%s\n", pair->one->path);
	/* set up diff options; GNU `?:` falls back to 3 context lines */
	xdiff_opts.ctxlen = ctx.qry.context ?: 3;
	/* run the diff and emit line-by-line output */
	/* each line gets a CSS class: .add, .del, or .ctx */
}
```
### Hunk Headers
```c
void cgit_print_diff_hunk_header(int oldofs, int oldcnt,
                                 int newofs, int newcnt,
                                 const char *func)
```
Renders hunk headers as:
```html
@@ -oldofs,oldcnt +newofs,newcnt @@ func
```
### Line Rendering
Each diff line is rendered with a status prefix and CSS class:
| Line Type | CSS Class | Prefix |
|-----------|----------|--------|
| Added | `.add` | `+` |
| Removed | `.del` | `-` |
| Context | `.ctx` | ` ` |
| Hunk header | `.hunk` | `@@` |
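The table above maps directly onto a lookup by line prefix. The sketch below uses a hypothetical `diff_line_class()` helper and assumes that file headers (`---` and `+++` lines) are handled elsewhere, before per-line classification.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the CSS class name (without the leading dot)
 * for one line of unified diff output, based on its prefix.
 * Assumption: "---"/"+++" file headers never reach this function.
 */
static const char *diff_line_class(const char *line)
{
	if (line == NULL || line[0] == '\0')
		return "ctx";
	if (line[0] == '@' && line[1] == '@')
		return "hunk";
	if (line[0] == '+')
		return "add";
	if (line[0] == '-')
		return "del";
	return "ctx";
}
```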
## Side-by-Side Diff (`ui-ssdiff.c`)
The side-by-side diff view renders old and new versions in adjacent columns.
### LCS Algorithm
`ui-ssdiff.c` implements a Longest Common Subsequence (LCS) algorithm that
compares an old line with its new counterpart character by character:
```c
/* LCS computation for line alignment */
static int *lcs(char *a, int an, char *b, int bn)
{
	int *prev, *curr;

	/* dynamic programming: keep only two rows of the LCS table */
	prev = calloc(bn + 1, sizeof(int));
	curr = calloc(bn + 1, sizeof(int));
	if (!prev || !curr) {
		free(prev);
		free(curr);
		return NULL;
	}
	for (int i = 1; i <= an; i++) {
		for (int j = 1; j <= bn; j++) {
			if (a[i-1] == b[j-1])
				curr[j] = prev[j-1] + 1;
			else
				curr[j] = MAX(prev[j], curr[j-1]);
		}
		SWAP(prev, curr);
	}
	free(curr);	/* scratch row; only the final row is returned */
	return prev;	/* caller frees; prev[bn] is the LCS length */
}
```
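For experimenting with the recurrence outside cgit, here is a standalone sketch that depends only on the C standard library. `lcs_length()` is a hypothetical name; it uses the same two-row dynamic programming scheme but returns just the LCS length and frees both rows itself.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the length of the longest common subsequence of a and b,
 * or -1 if memory allocation fails. Only two rows of the DP table
 * are kept in memory at any time.
 */
static int lcs_length(const char *a, const char *b)
{
	size_t an = strlen(a), bn = strlen(b);
	int *prev = calloc(bn + 1, sizeof(int));
	int *curr = calloc(bn + 1, sizeof(int));
	int result = -1;

	if (prev && curr) {
		for (size_t i = 1; i <= an; i++) {
			for (size_t j = 1; j <= bn; j++) {
				if (a[i - 1] == b[j - 1])
					curr[j] = prev[j - 1] + 1;
				else
					curr[j] = prev[j] > curr[j - 1]
						? prev[j] : curr[j - 1];
			}
			/* the row just filled becomes the previous row */
			int *tmp = prev;
			prev = curr;
			curr = tmp;
		}
		result = prev[bn];
	}
	free(prev);
	free(curr);
	return result;
}
```

Index 0 of each row is never written, so it stays zero and serves as the empty-prefix base case after every swap.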
### Deferred Lines
Side-by-side rendering uses a deferred output model:
```c
struct deferred_lines {
	int line_no;
	char *line;
	struct deferred_lines *next;
};
```
Lines are collected and paired before output. For modified lines, the LCS
algorithm identifies character-level changes and highlights them with
inline `<span>` elements within each line.
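The collect-then-pair model can be sketched with two singly linked lists, one for removed lines and one for added lines. The helper names (`defer_line()`, `flush_deferred()`, `free_deferred()`) and the in-order pairing rule are assumptions made for illustration, not a description of cgit's exact behaviour.

```c
#include <stdlib.h>
#include <string.h>

struct deferred_lines {
	int line_no;
	char *line;
	struct deferred_lines *next;
};

/* Hypothetical row callback: either argument is NULL when one side is empty. */
typedef void (*row_fn)(const struct deferred_lines *old_line,
                       const struct deferred_lines *new_line);

/* Append a copy of line to the list; returns 0 on success, -1 on failure. */
static int defer_line(struct deferred_lines **head,
                      struct deferred_lines **tail,
                      int line_no, const char *line)
{
	struct deferred_lines *item = malloc(sizeof(*item));

	if (!item)
		return -1;
	item->line = malloc(strlen(line) + 1);
	if (!item->line) {
		free(item);
		return -1;
	}
	strcpy(item->line, line);
	item->line_no = line_no;
	item->next = NULL;
	if (*tail)
		(*tail)->next = item;
	else
		*head = item;
	*tail = item;
	return 0;
}

/*
 * Walk both lists in step, calling fn once per output row.
 * Assumption: the n-th removed line is paired with the n-th added line.
 * Returns the number of rows produced.
 */
static int flush_deferred(const struct deferred_lines *old_lines,
                          const struct deferred_lines *new_lines,
                          row_fn fn)
{
	int rows = 0;

	while (old_lines || new_lines) {
		if (fn)
			fn(old_lines, new_lines);
		rows++;
		if (old_lines)
			old_lines = old_lines->next;
		if (new_lines)
			new_lines = new_lines->next;
	}
	return rows;
}

/* Release a whole list, including the copied line text. */
static void free_deferred(struct deferred_lines *head)
{
	while (head) {
		struct deferred_lines *next = head->next;

		free(head->line);
		free(head);
		head = next;
	}
}
```

When one list is longer than the other, the extra rows get a NULL on the shorter side, which a renderer could turn into an empty cell.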
### Tab Expansion
```c
static char *replace_tabs(char *line)
```
Tabs are expanded to spaces for proper column alignment in side-by-side
view. The tab width is 8 characters.
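A minimal sketch of tab expansion with a width of 8 follows. `expand_tabs()` is a hypothetical stand-in for `replace_tabs()`; it assumes tabs advance to the next multiple of 8 columns and that every byte occupies one column, so multi-byte UTF-8 characters are not accounted for.

```c
#include <stdlib.h>
#include <string.h>

#define TAB_WIDTH 8

/*
 * Return a newly allocated copy of line with each tab replaced by
 * spaces up to the next tab stop, or NULL if allocation fails.
 * The caller frees the result.
 */
static char *expand_tabs(const char *line)
{
	size_t len = strlen(line), tabs = 0, col = 0;
	char *out, *p;

	for (size_t i = 0; i < len; i++)
		if (line[i] == '\t')
			tabs++;

	/* worst case: every tab grows from 1 byte to TAB_WIDTH bytes */
	out = malloc(len + tabs * (TAB_WIDTH - 1) + 1);
	if (!out)
		return NULL;

	p = out;
	for (size_t i = 0; i < len; i++) {
		if (line[i] == '\t') {
			size_t pad = TAB_WIDTH - (col % TAB_WIDTH);

			memset(p, ' ', pad);
			p += pad;
			col += pad;
		} else {
			*p++ = line[i];
			col++;
		}
	}
	*p = '\0';
	return out;
}
```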
### Rendering
Side-by-side output uses a two-column `<table>`:
```text
old line no. | old content      | new line no. | new content
42           | old line content | 42           | new line content
```
Changed characters within a line are highlighted with inline spans.
## Low-Level Diff Helpers (`shared.c`)
### Tree Diff
```c
void cgit_diff_tree(const struct object_id *old_oid,
                    const struct object_id *new_oid,
                    filepair_fn fn, const char *prefix,
                    int renamelimit)
```
Computes the diff between two tree objects (typically from two commits) and
calls `fn` for each changed file pair. `renamelimit` controls the rename
detection threshold.
### Commit Diff
```c
void cgit_diff_commit(struct commit *commit, filepair_fn fn,
                      const char *prefix)
```
Diffs a commit against its first parent. For root commits (no parent),
diffs against an empty tree.
### File Diff
```c
void cgit_diff_files(const struct object_id *old_oid,
                     const struct object_id *new_oid,
                     unsigned long *old_size,
                     unsigned long *new_size,
                     int *binary, int context,
                     int ignorews, linediff_fn fn)
```
Performs a line-level diff between two blobs. The `linediff_fn` callback is
invoked for each output line (add/remove/context).
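To illustrate the callback model, the sketch below counts added and removed lines. The `linediff_fn` signature shown is an assumption (one call per line with a pointer and a length), and `emit_lines()` is a hypothetical driver standing in for the diff engine.

```c
#include <stddef.h>
#include <string.h>

/* Assumed callback shape: invoked once per diff output line. */
typedef void (*linediff_fn)(char *line, int len);

static int lines_added;
static int lines_removed;

/* Example callback: tally lines by their '+' or '-' prefix. */
static void count_diff_lines(char *line, int len)
{
	if (line == NULL || len <= 0)
		return;
	if (line[0] == '+')
		lines_added++;
	else if (line[0] == '-')
		lines_removed++;
}

/* Hypothetical driver: feed pre-computed diff lines to a callback. */
static void emit_lines(char *lines[], int count, linediff_fn fn)
{
	for (int i = 0; i < count; i++)
		fn(lines[i], (int)strlen(lines[i]));
}
```

A callback like this could fill the `added` and `removed` fields of `struct fileinfo`, while a rendering callback would emit HTML for the same lines.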
## Diff in Context: Commit View
`ui-commit.c` uses the diff engine to show changes in commit view:
```c
void cgit_print_commit(const char *rev, const char *prefix)
{
	/* ... commit metadata ... */
	cgit_print_diff(ctx.qry.sha1, info->parent_sha1, prefix, 0, 0);
}
```
## Diff in Context: Log View
`ui-log.c` can optionally show per-commit diffstats:
```c
if (ctx.cfg.enable_log_filecount) {
	cgit_diff_commit(commit, inspect_filepair, NULL);
	/* display changed files count, added/removed */
}
```
## Binary Detection
A file is marked as binary during diffing if its content contains NUL bytes
or its size exceeds the configured `max-blob-size`. Binary files are shown as:
```
Binary files differ
```
No line-level diff is performed for binary content.
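The two conditions can be sketched as a single predicate. `looks_binary()` is a hypothetical helper; it assumes the size limit is given in bytes and that a limit of 0 (the default listed for `max-blob-size`) disables the size check.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if a blob should be treated as binary.
 * Assumptions: max_size is in bytes, and max_size == 0 means "no limit".
 */
static int looks_binary(const char *buf, size_t size, size_t max_size)
{
	if (max_size > 0 && size > max_size)
		return 1;
	/* any NUL byte in the content marks the blob as binary */
	return size > 0 && memchr(buf, '\0', size) != NULL;
}
```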
## Diff Configuration
| Directive | Default | Effect |
|-----------|---------|--------|
| `side-by-side-diffs` | 0 | Default diff type (1 selects side-by-side) |
| `renamelimit` | -1 | Rename detection limit |
| `max-blob-size` | 0 | Max blob size for display |
| `enable-log-filecount` | 0 | Show file counts in log |
| `enable-log-linecount` | 0 | Show line counts in log |
## Raw Diff Output
The `rawdiff` command outputs a plain-text unified diff without HTML
wrapping, suitable for piping or downloading:
```c
static void cmd_rawdiff(struct cgit_context *ctx)
{
	ctx->page.mimetype = "text/plain";
	cgit_print_diff(ctx->qry.sha1, ctx->qry.sha2,
	                ctx->qry.path, 0, 1 /* raw */);
}
```