Diffstat (limited to 'docs/handbook/cgit/overview.md')
| -rw-r--r-- | docs/handbook/cgit/overview.md | 262 |
1 files changed, 262 insertions, 0 deletions
diff --git a/docs/handbook/cgit/overview.md b/docs/handbook/cgit/overview.md
new file mode 100644
index 0000000000..bb09d33e8b
--- /dev/null
+++ b/docs/handbook/cgit/overview.md
@@ -0,0 +1,262 @@
+# cgit — Overview
+
+## What Is cgit?
+
+cgit is a fast, lightweight web frontend for Git repositories, implemented as a
+CGI application written in C. It links directly against libgit (the C library
+that forms the core of the `git` command-line tool), giving it native access to
+repository objects without spawning external processes for every request. This
+design makes cgit one of the fastest Git web interfaces available.
+
+The Project Tick fork carries version `0.0.5-1-Project-Tick` (defined in the
+top-level `Makefile` as `CGIT_VERSION`). It builds against Git 2.46.0 and
+extends the upstream cgit with features such as subtree display, SPDX license
+detection, badge support, Code of Conduct / CLA pages, root links, and an
+enhanced summary page with repository metadata.
+
+## Key Design Goals
+
+| Goal | How cgit achieves it |
+|------|---------------------|
+| **Speed** | Direct libgit linkage; file-based response cache; `sendfile()` on Linux |
+| **Security** | `GIT_CONFIG_NOSYSTEM=1` set at load time; HTML entity escaping in every output function; directory-traversal guards; auth-filter framework |
+| **Simplicity** | Single CGI binary; flat config file (`cgitrc`); no database requirement |
+| **Extensibility** | Pluggable filter system (exec / Lua) for about, commit, source, email, owner, and auth content |
+
+## Source File Map
+
+The entire cgit source tree lives in `cgit/`. Every `.c` file has a matching
+`.h` (with a few exceptions such as `shared.c` and `parsing.c` which declare
+their interfaces in `cgit.h`).
+
+### Core files
+
+| File | Purpose |
+|------|---------|
+| `cgit.h` | Master header — includes libgit headers; defines all major types (`cgit_repo`, `cgit_config`, `cgit_query`, `cgit_context`, etc.) and function prototypes |
+| `cgit.c` | Entry point — `prepare_context()`, `config_cb()`, `querystring_cb()`, `process_request()`, `main()` |
+| `shared.c` | Global variables (`cgit_repolist`, `ctx`); repo management (`cgit_add_repo`, `cgit_get_repoinfo`); diff helpers; parsing helpers |
+| `parsing.c` | Commit/tag parsing (`cgit_parse_commit`, `cgit_parse_tag`, `cgit_parse_url`) |
+| `cmd.c` | Command dispatch table — maps URL page names to handler functions |
+| `cmd.h` | `struct cgit_cmd` definition; `cgit_get_cmd()` prototype |
+| `configfile.c` | Generic `name=value` config parser (`parse_configfile`) |
+| `configfile.h` | `configfile_value_fn` typedef; `parse_configfile` prototype |
+
+### Infrastructure files
+
+| File | Purpose |
+|------|---------|
+| `cache.c` / `cache.h` | File-based response cache — FNV-1 hashing, slot open/lock/fill/unlock cycle |
+| `filter.c` | Filter framework — exec filters (fork/exec), Lua filters (`luaL_newstate`) |
+| `html.c` / `html.h` | HTML output primitives — entity escaping, URL encoding, form helpers |
+| `scan-tree.c` / `scan-tree.h` | Filesystem repository scanning — `scan_tree()`, `scan_projects()` |
+
+### UI modules (`ui-*.c` / `ui-*.h`)
+
+| Module | Page | Handler function |
+|--------|------|-----------------|
+| `ui-repolist` | `repolist` | `cgit_print_repolist()` |
+| `ui-summary` | `summary` | `cgit_print_summary()` |
+| `ui-log` | `log` | `cgit_print_log()` |
+| `ui-commit` | `commit` | `cgit_print_commit()` |
+| `ui-diff` | `diff` | `cgit_print_diff()` |
+| `ui-tree` | `tree` | `cgit_print_tree()` |
+| `ui-blob` | `blob` | `cgit_print_blob()` |
+| `ui-refs` | `refs` | `cgit_print_refs()` |
+| `ui-tag` | `tag` | `cgit_print_tag()` |
+| `ui-snapshot` | `snapshot` | `cgit_print_snapshot()` |
+| `ui-plain` | `plain` | `cgit_print_plain()` |
+| `ui-blame` | `blame` | `cgit_print_blame()` |
+| `ui-patch` | `patch` | `cgit_print_patch()` |
+| `ui-atom` | `atom` | `cgit_print_atom()` |
+| `ui-clone` | `HEAD` / `info` / `objects` | `cgit_clone_head()`, `cgit_clone_info()`, `cgit_clone_objects()` |
+| `ui-stats` | `stats` | `cgit_show_stats()` |
+| `ui-ssdiff` | (helper) | Side-by-side diff rendering via LCS algorithm |
+| `ui-shared` | (helper) | HTTP headers, HTML page skeleton, link generation |
+
+### Static assets
+
+| File | Description |
+|------|-------------|
+| `cgit.css` | Default stylesheet |
+| `cgit.js` | Client-side JavaScript (e.g. tree filtering) |
+| `cgit.png` | Default logo |
+| `favicon.ico` | Default favicon |
+| `robots.txt` | Default robots file |
+
+## Core Data Structures
+
+All major types are defined in `cgit.h`. The single global
+`struct cgit_context ctx` (declared in `shared.c`) holds the entire request
+state:
+
+```c
+struct cgit_context {
+    struct cgit_environment env;  /* CGI environment variables */
+    struct cgit_query qry;        /* Parsed query/URL parameters */
+    struct cgit_config cfg;       /* Global configuration */
+    struct cgit_repo *repo;       /* Currently selected repository (or NULL) */
+    struct cgit_page page;        /* HTTP response metadata */
+};
+```
+
+### `struct cgit_repo`
+
+Represents a single Git repository. Key fields:
+
+```c
+struct cgit_repo {
+    char *url;                /* URL-visible name (e.g. "myproject") */
+    char *name;               /* Display name */
+    char *basename;           /* Last path component */
+    char *path;               /* Filesystem path to .git directory */
+    char *desc;               /* Description string */
+    char *owner;              /* Repository owner */
+    char *defbranch;          /* Default branch (NULL → guess from HEAD) */
+    char *section;            /* Section for grouped display */
+    char *clone_url;          /* Clone URL override */
+    char *homepage;           /* Project homepage URL */
+    struct string_list readme;   /* README file references */
+    struct string_list badges;   /* Badge image URLs */
+    int snapshots;            /* Bitmask of enabled snapshot formats */
+    int enable_blame;         /* Whether blame view is enabled */
+    int enable_commit_graph;  /* Whether commit graph is shown in log */
+    int enable_subtree;       /* Whether subtree detection is enabled */
+    int max_stats;            /* Stats period index (0=disabled) */
+    int hide;                 /* 1 = hidden from listing */
+    int ignore;               /* 1 = completely ignored */
+    struct cgit_filter *about_filter;   /* Per-repo about filter */
+    struct cgit_filter *source_filter;  /* Per-repo source highlighting */
+    struct cgit_filter *email_filter;   /* Per-repo email filter */
+    struct cgit_filter *commit_filter;  /* Per-repo commit message filter */
+    struct cgit_filter *owner_filter;   /* Per-repo owner filter */
+    /* ... */
+};
+```
+
+### `struct cgit_query`
+
+Holds all parsed URL/query-string parameters:
+
+```c
+struct cgit_query {
+    int has_symref, has_oid, has_difftype;
+    char *raw;           /* Raw query string */
+    char *repo;          /* Repository URL */
+    char *page;          /* Page name (log, commit, diff, ...) */
+    char *search;        /* Search query (q=) */
+    char *grep;          /* Search type (qt=) */
+    char *head;          /* Branch/ref (h=) */
+    char *oid, *oid2;    /* Object IDs (id=, id2=) */
+    char *path;          /* Path within repository */
+    char *name;          /* Snapshot filename */
+    int ofs;             /* Pagination offset */
+    int showmsg;         /* Show full commit messages in log */
+    diff_type difftype;  /* DIFF_UNIFIED / DIFF_SSDIFF / DIFF_STATONLY */
+    int context;         /* Diff context lines */
+    int ignorews;        /* Ignore whitespace in diffs */
+    int follow;          /* Follow renames in log */
+    char *vpath;         /* Virtual path (set by cmd dispatch) */
+    /* ... */
+};
+```
+
+## Request Lifecycle
+
+1. **Environment setup** — The `constructor_environment()` function runs before
+   `main()` (via `__attribute__((constructor))`). It sets
+   `GIT_CONFIG_NOSYSTEM=1` and `GIT_ATTR_NOSYSTEM=1`, then unsets `HOME` and
+   `XDG_CONFIG_HOME` to prevent Git from reading user/system configurations.
+
+2. **Context initialization** — `prepare_context()` zeroes out `ctx` and sets
+   all configuration defaults (cache sizes, TTLs, feature flags, etc.). CGI
+   environment variables are read from `getenv()`.
+
+3. **Configuration parsing** — `parse_configfile()` reads the cgitrc file
+   (default `/etc/cgitrc`, overridable via `$CGIT_CONFIG`) and calls
+   `config_cb()` for each `name=value` pair. Repository definitions begin with
+   `repo.url=` and subsequent `repo.*` directives configure that repository.
+
+4. **Query parsing** — If running in CGI mode (no `$NO_HTTP`),
+   `http_parse_querystring()` breaks the query string into name/value pairs and
+   passes them to `querystring_cb()`. The `url=` parameter is further parsed by
+   `cgit_parse_url()` which splits it into repo, page, and path components.
+
+5. **Authentication** — `authenticate_cookie()` checks whether an `auth-filter`
+   is configured. If so, it invokes the filter with function
+   `"authenticate-cookie"` and sets `ctx.env.authenticated` from the filter's
+   exit code. POST requests to `/?p=login` route through
+   `authenticate_post()` instead.
+
+6. **Cache lookup** — If caching is enabled (`cache-size > 0`), a cache key is
+   constructed from the URL and passed to `cache_process()`. On a cache hit the
+   stored response is sent directly via `sendfile()`. On a miss, stdout is
+   redirected to a lock file and the request proceeds through normal processing.
+
+7. **Command dispatch** — `cgit_get_cmd()` looks up `ctx.qry.page` in the
+   static `cmds[]` table (defined in `cmd.c`). If the command requires a
+   repository (`want_repo == 1`), the repository is initialized via
+   `prepare_repo_env()` and `prepare_repo_cmd()`.
+
+8. **Page rendering** — The matched command's handler function is called. Each
+   handler uses `cgit_print_http_headers()`, `cgit_print_docstart()`,
+   `cgit_print_pageheader()`, and `cgit_print_docend()` (from `ui-shared.c`)
+   to frame their output inside a proper HTML document.
+
+9. **Cleanup** — `cgit_cleanup_filters()` reaps all filter resources (closing
+   Lua states, freeing argv arrays).
+
+## Version String
+
+The version is compiled into the binary via:
+
+```makefile
+CGIT_VERSION = 0.0.5-1-Project-Tick
+```
+
+and exposed as the global:
+
+```c
+const char *cgit_version = CGIT_VERSION;
+```
+
+This string appears in the HTML footer (rendered by `ui-shared.c`) and in patch
+output trailers.
+
+## Relationship to Git
+
+cgit is built *inside* the Git source tree. The `Makefile` downloads
+Git 2.46.0, extracts it as a `git/` subdirectory, then calls `make -C git -f
+../cgit.mk` which includes Git's own `Makefile` to inherit all build variables,
+object files, and linker flags. The resulting `cgit` binary is a statically
+linked combination of cgit's own object files and libgit.
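
To illustrate the command dispatch step (step 7 of the request lifecycle) that the handbook page describes, here is a minimal sketch in C of a table-driven lookup. It is not taken from `cmd.c`: `struct demo_cmd`, `demo_cmds[]`, `demo_get_cmd()`, `demo_dispatch()`, the stub handlers, and the integer return codes are all hypothetical names chosen for illustration. Only the general shape (a static table keyed by page name, with a `want_repo` flag) comes from the text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type; the stubs return an int so the result is observable. */
typedef int (*demo_handler_fn)(void);

/* Hypothetical stand-in for the `struct cgit_cmd` mentioned in the handbook. */
struct demo_cmd {
    const char *name;     /* page name as it appears in the URL */
    demo_handler_fn fn;   /* handler invoked for that page */
    int want_repo;        /* 1 if a repository must be selected first */
};

/* Stub handlers (assumptions): each returns a distinct marker value. */
static int demo_repolist(void) { return 1; }
static int demo_log(void)      { return 2; }
static int demo_commit(void)   { return 3; }

/* Static dispatch table, analogous in spirit to the `cmds[]` table. */
static const struct demo_cmd demo_cmds[] = {
    { "repolist", demo_repolist, 0 },
    { "log",      demo_log,      1 },
    { "commit",   demo_commit,   1 },
};

/* Linear lookup by page name; returns NULL for unknown or missing names. */
static const struct demo_cmd *demo_get_cmd(const char *page)
{
    size_t i;

    if (page == NULL)
        return NULL;
    for (i = 0; i < sizeof(demo_cmds) / sizeof(demo_cmds[0]); i++)
        if (strcmp(page, demo_cmds[i].name) == 0)
            return &demo_cmds[i];
    return NULL;
}

/*
 * Dispatch a request: returns the handler's result, -1 for an unknown page,
 * or -2 when the page needs a repository and none is selected.
 */
static int demo_dispatch(const char *page, int have_repo)
{
    const struct demo_cmd *cmd = demo_get_cmd(page);

    if (cmd == NULL)
        return -1;
    if (cmd->want_repo && !have_repo)
        return -2;
    return cmd->fn();
}
```

With this sketch, `demo_dispatch("repolist", 0)` runs the repository-list stub, while `demo_dispatch("commit", 0)` is rejected with `-2` because no repository is selected. A flat table keeps adding a page to a one-line change, which is presumably why this pattern suits a small CGI binary.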
+
+## Time Constants
+
+`cgit.h` defines convenience macros used for relative date display:
+
+```c
+#define TM_MIN 60
+#define TM_HOUR (TM_MIN * 60)
+#define TM_DAY (TM_HOUR * 24)
+#define TM_WEEK (TM_DAY * 7)
+#define TM_YEAR (TM_DAY * 365)
+#define TM_MONTH (TM_YEAR / 12.0)
+```
+
+These are used by `cgit_print_age()` in `ui-shared.c` to render "2 hours ago"
+style timestamps.
+
+## Default Encoding
+
+```c
+#define PAGE_ENCODING "UTF-8"
+```
+
+All commit messages are re-encoded to UTF-8 before display (see
+`cgit_parse_commit()` in `parsing.c`).
+
+## License
+
+cgit is licensed under the GNU General Public License v2. The `COPYING` file
+in the cgit directory contains the full text.
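
The "Time Constants" section of the added page says the `TM_*` macros drive "2 hours ago" style timestamps. The sketch below shows one way such macros could be turned into a relative-age string. It is not the body of `cgit_print_age()`: the helper name `demo_format_age()`, the "switch unit once the value reaches 2" thresholds, the truncating division, and the exact wording of the output are assumptions made for illustration. Only the macro definitions are copied from the page.

```c
#include <stddef.h>
#include <stdio.h>

/* Macros as quoted in the handbook page. */
#define TM_MIN    60
#define TM_HOUR  (TM_MIN * 60)
#define TM_DAY   (TM_HOUR * 24)
#define TM_WEEK  (TM_DAY * 7)
#define TM_YEAR  (TM_DAY * 365)
#define TM_MONTH (TM_YEAR / 12.0)   /* note: this one is a double */

/*
 * Hypothetical helper: write a relative age for `secs` seconds into `buf`.
 * Each unit is used once its count reaches 2 (an assumption), so the
 * plural unit names never appear with a count of 1.
 * Returns the value of snprintf(), i.e. the length of the full string.
 */
static int demo_format_age(long secs, char *buf, size_t len)
{
    if (secs < 0)
        secs = 0;   /* clamp timestamps that lie in the future */

    if (secs < TM_HOUR * 2)
        return snprintf(buf, len, "%ld min. ago", secs / TM_MIN);
    if (secs < TM_DAY * 2)
        return snprintf(buf, len, "%ld hours ago", secs / TM_HOUR);
    if (secs < TM_WEEK * 2)
        return snprintf(buf, len, "%ld days ago", secs / TM_DAY);
    if (secs < TM_MONTH * 2)
        return snprintf(buf, len, "%ld weeks ago", secs / TM_WEEK);
    if (secs < TM_YEAR * 2)
        return snprintf(buf, len, "%ld months ago", (long)(secs / TM_MONTH));
    return snprintf(buf, len, "%ld years ago", secs / TM_YEAR);
}
```

For example, calling `demo_format_age(2 * TM_HOUR, buf, sizeof(buf))` with a 32-byte buffer fills it with `2 hours ago`. Because `TM_MONTH` is a floating-point value, the month branch casts the quotient back to `long` before printing.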
