From 32f5f761bc8e960293b4f4feaf973dd0da26d0f8 Mon Sep 17 00:00:00 2001 From: Mehmet Samet Duman Date: Sun, 5 Apr 2026 17:37:54 +0300 Subject: NOISSUE Project Tick Handbook is Released! Assisted-by: Claude:Opus-4.6-High Signed-off-by: Mehmet Samet Duman --- docs/handbook/Project-Tick/architecture.md | 579 ++++++ docs/handbook/Project-Tick/build-systems.md | 711 ++++++++ docs/handbook/Project-Tick/ci-cd-pipeline.md | 599 +++++++ docs/handbook/Project-Tick/coding-standards.md | 558 ++++++ docs/handbook/Project-Tick/contributing.md | 545 ++++++ docs/handbook/Project-Tick/faq.md | 683 ++++++++ docs/handbook/Project-Tick/getting-started.md | 637 +++++++ docs/handbook/Project-Tick/glossary.md | 556 ++++++ docs/handbook/Project-Tick/licensing.md | 371 ++++ docs/handbook/Project-Tick/overview.md | 335 ++++ docs/handbook/Project-Tick/release-process.md | 374 ++++ docs/handbook/Project-Tick/repository-structure.md | 625 +++++++ docs/handbook/Project-Tick/security-policy.md | 282 +++ docs/handbook/Project-Tick/trademark-policy.md | 283 +++ docs/handbook/archived/overview.md | 275 +++ docs/handbook/archived/projt-launcher.md | 444 +++++ docs/handbook/archived/projt-modpack.md | 245 +++ docs/handbook/archived/ptlibzippy.md | 501 ++++++ docs/handbook/cgit/api-reference.md | 468 +++++ docs/handbook/cgit/architecture.md | 422 +++++ docs/handbook/cgit/authentication.md | 288 +++ docs/handbook/cgit/building.md | 272 +++ docs/handbook/cgit/caching-system.md | 287 +++ docs/handbook/cgit/code-style.md | 356 ++++ docs/handbook/cgit/configuration.md | 351 ++++ docs/handbook/cgit/css-theming.md | 522 ++++++ docs/handbook/cgit/deployment.md | 369 ++++ docs/handbook/cgit/diff-engine.md | 352 ++++ docs/handbook/cgit/filter-system.md | 358 ++++ docs/handbook/cgit/html-rendering.md | 380 ++++ docs/handbook/cgit/lua-integration.md | 428 +++++ docs/handbook/cgit/overview.md | 262 +++ docs/handbook/cgit/repository-discovery.md | 355 ++++ docs/handbook/cgit/snapshot-system.md | 246 +++ 
docs/handbook/cgit/testing.md | 335 ++++ docs/handbook/cgit/ui-modules.md | 544 ++++++ docs/handbook/cgit/url-routing.md | 331 ++++ docs/handbook/ci/branch-strategy.md | 388 +++++ docs/handbook/ci/codeowners.md | 370 ++++ docs/handbook/ci/commit-linting.md | 418 +++++ docs/handbook/ci/formatting.md | 298 ++++ docs/handbook/ci/nix-infrastructure.md | 611 +++++++ docs/handbook/ci/overview.md | 494 ++++++ docs/handbook/ci/pr-validation.md | 378 ++++ docs/handbook/ci/rate-limiting.md | 321 ++++ docs/handbook/cmark/architecture.md | 283 +++ docs/handbook/cmark/ast-node-system.md | 383 ++++ docs/handbook/cmark/block-parsing.md | 310 ++++ docs/handbook/cmark/building.md | 268 +++ docs/handbook/cmark/cli-usage.md | 249 +++ docs/handbook/cmark/code-style.md | 293 ++++ docs/handbook/cmark/commonmark-renderer.md | 344 ++++ docs/handbook/cmark/html-renderer.md | 258 +++ docs/handbook/cmark/inline-parsing.md | 317 ++++ docs/handbook/cmark/iterator-system.md | 267 +++ docs/handbook/cmark/latex-renderer.md | 320 ++++ docs/handbook/cmark/man-renderer.md | 272 +++ docs/handbook/cmark/memory-management.md | 351 ++++ docs/handbook/cmark/overview.md | 256 +++ docs/handbook/cmark/public-api.md | 637 +++++++ docs/handbook/cmark/reference-system.md | 307 ++++ docs/handbook/cmark/render-framework.md | 294 ++++ docs/handbook/cmark/scanner-system.md | 223 +++ docs/handbook/cmark/testing.md | 281 +++ docs/handbook/cmark/utf8-handling.md | 340 ++++ docs/handbook/cmark/xml-renderer.md | 291 ++++ docs/handbook/corebinutils/architecture.md | 665 +++++++ docs/handbook/corebinutils/building.md | 429 +++++ docs/handbook/corebinutils/cat.md | 211 +++ docs/handbook/corebinutils/chmod.md | 296 ++++ docs/handbook/corebinutils/code-style.md | 351 ++++ docs/handbook/corebinutils/cp.md | 270 +++ docs/handbook/corebinutils/date.md | 352 ++++ docs/handbook/corebinutils/dd.md | 407 +++++ docs/handbook/corebinutils/df.md | 264 +++ docs/handbook/corebinutils/echo.md | 158 ++ docs/handbook/corebinutils/ed.md | 
306 ++++ docs/handbook/corebinutils/error-handling.md | 315 ++++ docs/handbook/corebinutils/expr.md | 194 +++ docs/handbook/corebinutils/hostname.md | 154 ++ docs/handbook/corebinutils/kill.md | 237 +++ docs/handbook/corebinutils/ln.md | 190 ++ docs/handbook/corebinutils/ls.md | 314 ++++ docs/handbook/corebinutils/mkdir.md | 194 +++ docs/handbook/corebinutils/mv.md | 285 +++ docs/handbook/corebinutils/overview.md | 362 ++++ docs/handbook/corebinutils/ps.md | 298 ++++ docs/handbook/corebinutils/pwd.md | 152 ++ docs/handbook/corebinutils/realpath.md | 119 ++ docs/handbook/corebinutils/rm.md | 293 ++++ docs/handbook/corebinutils/sleep.md | 218 +++ docs/handbook/corebinutils/test.md | 248 +++ docs/handbook/corebinutils/timeout.md | 297 ++++ docs/handbook/forgewrapper/architecture.md | 1202 +++++++++++++ docs/handbook/forgewrapper/building.md | 1843 ++++++++++++++++++++ docs/handbook/forgewrapper/overview.md | 270 +++ docs/handbook/genqrcode/architecture.md | 948 ++++++++++ docs/handbook/genqrcode/building.md | 570 ++++++ docs/handbook/genqrcode/cli-usage.md | 382 ++++ docs/handbook/genqrcode/code-style.md | 351 ++++ docs/handbook/genqrcode/encoding-modes.md | 591 +++++++ docs/handbook/genqrcode/error-correction.md | 455 +++++ docs/handbook/genqrcode/masking-algorithms.md | 578 ++++++ docs/handbook/genqrcode/micro-qr.md | 456 +++++ docs/handbook/genqrcode/overview.md | 502 ++++++ docs/handbook/genqrcode/public-api.md | 912 ++++++++++ docs/handbook/genqrcode/reed-solomon.md | 347 ++++ docs/handbook/genqrcode/testing.md | 398 +++++ docs/handbook/hooks/logging-system.md | 492 ++++++ docs/handbook/hooks/mirror-configuration.md | 627 +++++++ docs/handbook/hooks/notification-system.md | 538 ++++++ docs/handbook/hooks/overview.md | 712 ++++++++ docs/handbook/hooks/post-receive-hook.md | 778 +++++++++ docs/handbook/images4docker/architecture.md | 504 ++++++ docs/handbook/images4docker/base-images.md | 825 +++++++++ docs/handbook/images4docker/ci-cd-integration.md | 396 +++++ 
docs/handbook/images4docker/creating-new-images.md | 338 ++++ docs/handbook/images4docker/overview.md | 304 ++++ docs/handbook/images4docker/qt6-verification.md | 283 +++ docs/handbook/images4docker/troubleshooting.md | 395 +++++ docs/handbook/json4cpp/architecture.md | 613 +++++++ docs/handbook/json4cpp/basic-usage.md | 601 +++++++ docs/handbook/json4cpp/binary-formats.md | 411 +++++ docs/handbook/json4cpp/building.md | 430 +++++ docs/handbook/json4cpp/code-style.md | 209 +++ docs/handbook/json4cpp/custom-types.md | 465 +++++ docs/handbook/json4cpp/element-access.md | 581 ++++++ docs/handbook/json4cpp/exception-handling.md | 368 ++++ docs/handbook/json4cpp/iteration.md | 339 ++++ docs/handbook/json4cpp/json-patch.md | 341 ++++ docs/handbook/json4cpp/json-pointer.md | 361 ++++ docs/handbook/json4cpp/overview.md | 330 ++++ docs/handbook/json4cpp/parsing-internals.md | 493 ++++++ docs/handbook/json4cpp/performance.md | 275 +++ docs/handbook/json4cpp/sax-interface.md | 337 ++++ docs/handbook/json4cpp/serialization.md | 528 ++++++ docs/handbook/json4cpp/testing.md | 190 ++ docs/handbook/json4cpp/value-types.md | 474 +++++ docs/handbook/libnbtplusplus/architecture.md | 607 +++++++ docs/handbook/libnbtplusplus/building.md | 401 +++++ docs/handbook/libnbtplusplus/code-style.md | 299 ++++ docs/handbook/libnbtplusplus/compound-tags.md | 602 +++++++ docs/handbook/libnbtplusplus/endian-handling.md | 359 ++++ docs/handbook/libnbtplusplus/io-system.md | 672 +++++++ docs/handbook/libnbtplusplus/list-tags.md | 682 ++++++++ docs/handbook/libnbtplusplus/overview.md | 422 +++++ docs/handbook/libnbtplusplus/tag-system.md | 643 +++++++ docs/handbook/libnbtplusplus/testing.md | 291 ++++ docs/handbook/libnbtplusplus/visitor-pattern.md | 333 ++++ docs/handbook/libnbtplusplus/zlib-integration.md | 514 ++++++ docs/handbook/meshmc/account-management.md | 470 +++++ docs/handbook/meshmc/application-lifecycle.md | 373 ++++ docs/handbook/meshmc/architecture.md | 724 ++++++++ 
docs/handbook/meshmc/building.md | 554 ++++++ docs/handbook/meshmc/code-style.md | 315 ++++ docs/handbook/meshmc/component-system.md | 540 ++++++ docs/handbook/meshmc/contributing.md | 130 ++ docs/handbook/meshmc/dependencies.md | 241 +++ docs/handbook/meshmc/instance-management.md | 483 +++++ docs/handbook/meshmc/java-detection.md | 411 +++++ docs/handbook/meshmc/launch-system.md | 569 ++++++ docs/handbook/meshmc/mod-system.md | 410 +++++ docs/handbook/meshmc/network-layer.md | 551 ++++++ docs/handbook/meshmc/overview.md | 269 +++ docs/handbook/meshmc/platform-support.md | 353 ++++ docs/handbook/meshmc/release-notes.md | 222 +++ docs/handbook/meshmc/settings-system.md | 402 +++++ docs/handbook/meshmc/theme-system.md | 417 +++++ docs/handbook/meshmc/ui-system.md | 511 ++++++ docs/handbook/meta/architecture.md | 624 +++++++ docs/handbook/meta/data-models.md | 582 +++++++ docs/handbook/meta/deployment.md | 285 +++ docs/handbook/meta/fabric-metadata.md | 323 ++++ docs/handbook/meta/forge-metadata.md | 492 ++++++ docs/handbook/meta/java-runtime-metadata.md | 546 ++++++ docs/handbook/meta/mojang-metadata.md | 480 +++++ docs/handbook/meta/neoforge-metadata.md | 334 ++++ docs/handbook/meta/overview.md | 386 ++++ docs/handbook/meta/quilt-metadata.md | 267 +++ docs/handbook/meta/setup.md | 480 +++++ docs/handbook/meta/update-pipeline.md | 330 ++++ docs/handbook/mnv/architecture.md | 549 ++++++ docs/handbook/mnv/building.md | 636 +++++++ docs/handbook/mnv/code-style.md | 408 +++++ docs/handbook/mnv/contributing.md | 293 ++++ docs/handbook/mnv/gui-extension.md | 410 +++++ docs/handbook/mnv/overview.md | 381 ++++ docs/handbook/mnv/platform-support.md | 306 ++++ docs/handbook/mnv/scripting.md | 541 ++++++ docs/handbook/neozip/api-reference.md | 459 +++++ docs/handbook/neozip/architecture.md | 1075 ++++++++++++ docs/handbook/neozip/arm-optimizations.md | 403 +++++ docs/handbook/neozip/building.md | 491 ++++++ docs/handbook/neozip/checksum-algorithms.md | 461 +++++ 
docs/handbook/neozip/code-style.md | 259 +++ docs/handbook/neozip/deflate-algorithms.md | 797 +++++++++ docs/handbook/neozip/gzip-support.md | 413 +++++ docs/handbook/neozip/hardware-acceleration.md | 447 +++++ docs/handbook/neozip/huffman-coding.md | 643 +++++++ docs/handbook/neozip/inflate-engine.md | 665 +++++++ docs/handbook/neozip/overview.md | 509 ++++++ docs/handbook/neozip/performance-tuning.md | 361 ++++ docs/handbook/neozip/testing.md | 317 ++++ docs/handbook/neozip/x86-optimizations.md | 439 +++++ docs/handbook/ofborg/amqp-infrastructure.md | 631 +++++++ docs/handbook/ofborg/architecture.md | 814 +++++++++ docs/handbook/ofborg/build-executor.md | 657 +++++++ docs/handbook/ofborg/building.md | 530 ++++++ docs/handbook/ofborg/code-style.md | 332 ++++ docs/handbook/ofborg/configuration.md | 472 +++++ docs/handbook/ofborg/contributing.md | 326 ++++ docs/handbook/ofborg/data-flow.md | 346 ++++ docs/handbook/ofborg/deployment.md | 413 +++++ docs/handbook/ofborg/evaluation-system.md | 602 +++++++ docs/handbook/ofborg/github-integration.md | 603 +++++++ docs/handbook/ofborg/message-system.md | 731 ++++++++ docs/handbook/ofborg/overview.md | 571 ++++++ docs/handbook/ofborg/webhook-receiver.md | 470 +++++ docs/handbook/tomlplusplus/architecture.md | 920 ++++++++++ docs/handbook/tomlplusplus/arrays.md | 625 +++++++ docs/handbook/tomlplusplus/basic-usage.md | 705 ++++++++ docs/handbook/tomlplusplus/building.md | 474 +++++ docs/handbook/tomlplusplus/code-style.md | 277 +++ docs/handbook/tomlplusplus/formatting.md | 546 ++++++ docs/handbook/tomlplusplus/node-system.md | 625 +++++++ docs/handbook/tomlplusplus/overview.md | 474 +++++ docs/handbook/tomlplusplus/parsing.md | 494 ++++++ docs/handbook/tomlplusplus/path-system.md | 412 +++++ docs/handbook/tomlplusplus/tables.md | 551 ++++++ docs/handbook/tomlplusplus/testing.md | 226 +++ docs/handbook/tomlplusplus/unicode-handling.md | 335 ++++ docs/handbook/tomlplusplus/values.md | 547 ++++++ 232 files changed, 101144 
insertions(+) create mode 100644 docs/handbook/Project-Tick/architecture.md create mode 100644 docs/handbook/Project-Tick/build-systems.md create mode 100644 docs/handbook/Project-Tick/ci-cd-pipeline.md create mode 100644 docs/handbook/Project-Tick/coding-standards.md create mode 100644 docs/handbook/Project-Tick/contributing.md create mode 100644 docs/handbook/Project-Tick/faq.md create mode 100644 docs/handbook/Project-Tick/getting-started.md create mode 100644 docs/handbook/Project-Tick/glossary.md create mode 100644 docs/handbook/Project-Tick/licensing.md create mode 100644 docs/handbook/Project-Tick/overview.md create mode 100644 docs/handbook/Project-Tick/release-process.md create mode 100644 docs/handbook/Project-Tick/repository-structure.md create mode 100644 docs/handbook/Project-Tick/security-policy.md create mode 100644 docs/handbook/Project-Tick/trademark-policy.md create mode 100644 docs/handbook/archived/overview.md create mode 100644 docs/handbook/archived/projt-launcher.md create mode 100644 docs/handbook/archived/projt-modpack.md create mode 100644 docs/handbook/archived/ptlibzippy.md create mode 100644 docs/handbook/cgit/api-reference.md create mode 100644 docs/handbook/cgit/architecture.md create mode 100644 docs/handbook/cgit/authentication.md create mode 100644 docs/handbook/cgit/building.md create mode 100644 docs/handbook/cgit/caching-system.md create mode 100644 docs/handbook/cgit/code-style.md create mode 100644 docs/handbook/cgit/configuration.md create mode 100644 docs/handbook/cgit/css-theming.md create mode 100644 docs/handbook/cgit/deployment.md create mode 100644 docs/handbook/cgit/diff-engine.md create mode 100644 docs/handbook/cgit/filter-system.md create mode 100644 docs/handbook/cgit/html-rendering.md create mode 100644 docs/handbook/cgit/lua-integration.md create mode 100644 docs/handbook/cgit/overview.md create mode 100644 docs/handbook/cgit/repository-discovery.md create mode 100644 docs/handbook/cgit/snapshot-system.md create 
mode 100644 docs/handbook/cgit/testing.md create mode 100644 docs/handbook/cgit/ui-modules.md create mode 100644 docs/handbook/cgit/url-routing.md create mode 100644 docs/handbook/ci/branch-strategy.md create mode 100644 docs/handbook/ci/codeowners.md create mode 100644 docs/handbook/ci/commit-linting.md create mode 100644 docs/handbook/ci/formatting.md create mode 100644 docs/handbook/ci/nix-infrastructure.md create mode 100644 docs/handbook/ci/overview.md create mode 100644 docs/handbook/ci/pr-validation.md create mode 100644 docs/handbook/ci/rate-limiting.md create mode 100644 docs/handbook/cmark/architecture.md create mode 100644 docs/handbook/cmark/ast-node-system.md create mode 100644 docs/handbook/cmark/block-parsing.md create mode 100644 docs/handbook/cmark/building.md create mode 100644 docs/handbook/cmark/cli-usage.md create mode 100644 docs/handbook/cmark/code-style.md create mode 100644 docs/handbook/cmark/commonmark-renderer.md create mode 100644 docs/handbook/cmark/html-renderer.md create mode 100644 docs/handbook/cmark/inline-parsing.md create mode 100644 docs/handbook/cmark/iterator-system.md create mode 100644 docs/handbook/cmark/latex-renderer.md create mode 100644 docs/handbook/cmark/man-renderer.md create mode 100644 docs/handbook/cmark/memory-management.md create mode 100644 docs/handbook/cmark/overview.md create mode 100644 docs/handbook/cmark/public-api.md create mode 100644 docs/handbook/cmark/reference-system.md create mode 100644 docs/handbook/cmark/render-framework.md create mode 100644 docs/handbook/cmark/scanner-system.md create mode 100644 docs/handbook/cmark/testing.md create mode 100644 docs/handbook/cmark/utf8-handling.md create mode 100644 docs/handbook/cmark/xml-renderer.md create mode 100644 docs/handbook/corebinutils/architecture.md create mode 100644 docs/handbook/corebinutils/building.md create mode 100644 docs/handbook/corebinutils/cat.md create mode 100644 docs/handbook/corebinutils/chmod.md create mode 100644 
docs/handbook/corebinutils/code-style.md create mode 100644 docs/handbook/corebinutils/cp.md create mode 100644 docs/handbook/corebinutils/date.md create mode 100644 docs/handbook/corebinutils/dd.md create mode 100644 docs/handbook/corebinutils/df.md create mode 100644 docs/handbook/corebinutils/echo.md create mode 100644 docs/handbook/corebinutils/ed.md create mode 100644 docs/handbook/corebinutils/error-handling.md create mode 100644 docs/handbook/corebinutils/expr.md create mode 100644 docs/handbook/corebinutils/hostname.md create mode 100644 docs/handbook/corebinutils/kill.md create mode 100644 docs/handbook/corebinutils/ln.md create mode 100644 docs/handbook/corebinutils/ls.md create mode 100644 docs/handbook/corebinutils/mkdir.md create mode 100644 docs/handbook/corebinutils/mv.md create mode 100644 docs/handbook/corebinutils/overview.md create mode 100644 docs/handbook/corebinutils/ps.md create mode 100644 docs/handbook/corebinutils/pwd.md create mode 100644 docs/handbook/corebinutils/realpath.md create mode 100644 docs/handbook/corebinutils/rm.md create mode 100644 docs/handbook/corebinutils/sleep.md create mode 100644 docs/handbook/corebinutils/test.md create mode 100644 docs/handbook/corebinutils/timeout.md create mode 100644 docs/handbook/forgewrapper/architecture.md create mode 100644 docs/handbook/forgewrapper/building.md create mode 100644 docs/handbook/forgewrapper/overview.md create mode 100644 docs/handbook/genqrcode/architecture.md create mode 100644 docs/handbook/genqrcode/building.md create mode 100644 docs/handbook/genqrcode/cli-usage.md create mode 100644 docs/handbook/genqrcode/code-style.md create mode 100644 docs/handbook/genqrcode/encoding-modes.md create mode 100644 docs/handbook/genqrcode/error-correction.md create mode 100644 docs/handbook/genqrcode/masking-algorithms.md create mode 100644 docs/handbook/genqrcode/micro-qr.md create mode 100644 docs/handbook/genqrcode/overview.md create mode 100644 docs/handbook/genqrcode/public-api.md 
create mode 100644 docs/handbook/genqrcode/reed-solomon.md create mode 100644 docs/handbook/genqrcode/testing.md create mode 100644 docs/handbook/hooks/logging-system.md create mode 100644 docs/handbook/hooks/mirror-configuration.md create mode 100644 docs/handbook/hooks/notification-system.md create mode 100644 docs/handbook/hooks/overview.md create mode 100644 docs/handbook/hooks/post-receive-hook.md create mode 100644 docs/handbook/images4docker/architecture.md create mode 100644 docs/handbook/images4docker/base-images.md create mode 100644 docs/handbook/images4docker/ci-cd-integration.md create mode 100644 docs/handbook/images4docker/creating-new-images.md create mode 100644 docs/handbook/images4docker/overview.md create mode 100644 docs/handbook/images4docker/qt6-verification.md create mode 100644 docs/handbook/images4docker/troubleshooting.md create mode 100644 docs/handbook/json4cpp/architecture.md create mode 100644 docs/handbook/json4cpp/basic-usage.md create mode 100644 docs/handbook/json4cpp/binary-formats.md create mode 100644 docs/handbook/json4cpp/building.md create mode 100644 docs/handbook/json4cpp/code-style.md create mode 100644 docs/handbook/json4cpp/custom-types.md create mode 100644 docs/handbook/json4cpp/element-access.md create mode 100644 docs/handbook/json4cpp/exception-handling.md create mode 100644 docs/handbook/json4cpp/iteration.md create mode 100644 docs/handbook/json4cpp/json-patch.md create mode 100644 docs/handbook/json4cpp/json-pointer.md create mode 100644 docs/handbook/json4cpp/overview.md create mode 100644 docs/handbook/json4cpp/parsing-internals.md create mode 100644 docs/handbook/json4cpp/performance.md create mode 100644 docs/handbook/json4cpp/sax-interface.md create mode 100644 docs/handbook/json4cpp/serialization.md create mode 100644 docs/handbook/json4cpp/testing.md create mode 100644 docs/handbook/json4cpp/value-types.md create mode 100644 docs/handbook/libnbtplusplus/architecture.md create mode 100644 
docs/handbook/libnbtplusplus/building.md create mode 100644 docs/handbook/libnbtplusplus/code-style.md create mode 100644 docs/handbook/libnbtplusplus/compound-tags.md create mode 100644 docs/handbook/libnbtplusplus/endian-handling.md create mode 100644 docs/handbook/libnbtplusplus/io-system.md create mode 100644 docs/handbook/libnbtplusplus/list-tags.md create mode 100644 docs/handbook/libnbtplusplus/overview.md create mode 100644 docs/handbook/libnbtplusplus/tag-system.md create mode 100644 docs/handbook/libnbtplusplus/testing.md create mode 100644 docs/handbook/libnbtplusplus/visitor-pattern.md create mode 100644 docs/handbook/libnbtplusplus/zlib-integration.md create mode 100644 docs/handbook/meshmc/account-management.md create mode 100644 docs/handbook/meshmc/application-lifecycle.md create mode 100644 docs/handbook/meshmc/architecture.md create mode 100644 docs/handbook/meshmc/building.md create mode 100644 docs/handbook/meshmc/code-style.md create mode 100644 docs/handbook/meshmc/component-system.md create mode 100644 docs/handbook/meshmc/contributing.md create mode 100644 docs/handbook/meshmc/dependencies.md create mode 100644 docs/handbook/meshmc/instance-management.md create mode 100644 docs/handbook/meshmc/java-detection.md create mode 100644 docs/handbook/meshmc/launch-system.md create mode 100644 docs/handbook/meshmc/mod-system.md create mode 100644 docs/handbook/meshmc/network-layer.md create mode 100644 docs/handbook/meshmc/overview.md create mode 100644 docs/handbook/meshmc/platform-support.md create mode 100644 docs/handbook/meshmc/release-notes.md create mode 100644 docs/handbook/meshmc/settings-system.md create mode 100644 docs/handbook/meshmc/theme-system.md create mode 100644 docs/handbook/meshmc/ui-system.md create mode 100644 docs/handbook/meta/architecture.md create mode 100644 docs/handbook/meta/data-models.md create mode 100644 docs/handbook/meta/deployment.md create mode 100644 docs/handbook/meta/fabric-metadata.md create mode 100644 
docs/handbook/meta/forge-metadata.md create mode 100644 docs/handbook/meta/java-runtime-metadata.md create mode 100644 docs/handbook/meta/mojang-metadata.md create mode 100644 docs/handbook/meta/neoforge-metadata.md create mode 100644 docs/handbook/meta/overview.md create mode 100644 docs/handbook/meta/quilt-metadata.md create mode 100644 docs/handbook/meta/setup.md create mode 100644 docs/handbook/meta/update-pipeline.md create mode 100644 docs/handbook/mnv/architecture.md create mode 100644 docs/handbook/mnv/building.md create mode 100644 docs/handbook/mnv/code-style.md create mode 100644 docs/handbook/mnv/contributing.md create mode 100644 docs/handbook/mnv/gui-extension.md create mode 100644 docs/handbook/mnv/overview.md create mode 100644 docs/handbook/mnv/platform-support.md create mode 100644 docs/handbook/mnv/scripting.md create mode 100644 docs/handbook/neozip/api-reference.md create mode 100644 docs/handbook/neozip/architecture.md create mode 100644 docs/handbook/neozip/arm-optimizations.md create mode 100644 docs/handbook/neozip/building.md create mode 100644 docs/handbook/neozip/checksum-algorithms.md create mode 100644 docs/handbook/neozip/code-style.md create mode 100644 docs/handbook/neozip/deflate-algorithms.md create mode 100644 docs/handbook/neozip/gzip-support.md create mode 100644 docs/handbook/neozip/hardware-acceleration.md create mode 100644 docs/handbook/neozip/huffman-coding.md create mode 100644 docs/handbook/neozip/inflate-engine.md create mode 100644 docs/handbook/neozip/overview.md create mode 100644 docs/handbook/neozip/performance-tuning.md create mode 100644 docs/handbook/neozip/testing.md create mode 100644 docs/handbook/neozip/x86-optimizations.md create mode 100644 docs/handbook/ofborg/amqp-infrastructure.md create mode 100644 docs/handbook/ofborg/architecture.md create mode 100644 docs/handbook/ofborg/build-executor.md create mode 100644 docs/handbook/ofborg/building.md create mode 100644 docs/handbook/ofborg/code-style.md create 
mode 100644 docs/handbook/ofborg/configuration.md create mode 100644 docs/handbook/ofborg/contributing.md create mode 100644 docs/handbook/ofborg/data-flow.md create mode 100644 docs/handbook/ofborg/deployment.md create mode 100644 docs/handbook/ofborg/evaluation-system.md create mode 100644 docs/handbook/ofborg/github-integration.md create mode 100644 docs/handbook/ofborg/message-system.md create mode 100644 docs/handbook/ofborg/overview.md create mode 100644 docs/handbook/ofborg/webhook-receiver.md create mode 100644 docs/handbook/tomlplusplus/architecture.md create mode 100644 docs/handbook/tomlplusplus/arrays.md create mode 100644 docs/handbook/tomlplusplus/basic-usage.md create mode 100644 docs/handbook/tomlplusplus/building.md create mode 100644 docs/handbook/tomlplusplus/code-style.md create mode 100644 docs/handbook/tomlplusplus/formatting.md create mode 100644 docs/handbook/tomlplusplus/node-system.md create mode 100644 docs/handbook/tomlplusplus/overview.md create mode 100644 docs/handbook/tomlplusplus/parsing.md create mode 100644 docs/handbook/tomlplusplus/path-system.md create mode 100644 docs/handbook/tomlplusplus/tables.md create mode 100644 docs/handbook/tomlplusplus/testing.md create mode 100644 docs/handbook/tomlplusplus/unicode-handling.md create mode 100644 docs/handbook/tomlplusplus/values.md diff --git a/docs/handbook/Project-Tick/architecture.md b/docs/handbook/Project-Tick/architecture.md new file mode 100644 index 0000000000..9cb7d90eb2 --- /dev/null +++ b/docs/handbook/Project-Tick/architecture.md @@ -0,0 +1,579 @@ +# Project Tick — Mono-Repo Architecture + +## Architectural Philosophy + +Project Tick is structured as a unified monorepo where each top-level directory +represents an independent component. This architecture provides: + +- **Atomic cross-project changes** — A single commit can update a library and + every consumer simultaneously, eliminating version skew. 
+- **Unified CI** — One orchestrator workflow (`ci.yml`) detects which + sub-projects are affected by a change and dispatches builds accordingly. +- **Shared tooling** — Nix flakes, lefthook hooks, REUSE compliance, and + code formatting apply uniformly across the entire tree. +- **Independent buildability** — Despite living in one repository, each + sub-project maintains its own build system and can be built in isolation. + +--- + +## Repository Layout + +``` +Project-Tick/ +├── .github/ # GitHub Actions, issue templates, CODEOWNERS +│ ├── workflows/ # 50+ CI workflow files +│ ├── ISSUE_TEMPLATE/ # Bug report, suggestion, RFC templates +│ ├── CODEOWNERS # Ownership mapping for review routing +│ ├── dco.yml # DCO bot configuration +│ └── pull_request_template.md +│ +├── LICENSES/ # 20 SPDX-compliant license texts +├── REUSE.toml # Path-to-license mapping +├── CONTRIBUTING.md # Contribution guidelines, CLA, DCO +├── SECURITY.md # Vulnerability reporting policy +├── TRADEMARK.md # Trademark and brand usage policy +├── CODE_OF_CONDUCT.md # Code of Conduct v2 +├── README.md # Root README +│ +├── flake.nix # Top-level Nix flake (dev shells, LLVM 22) +├── flake.lock # Pinned Nix inputs +├── bootstrap.sh # Linux/macOS dependency bootstrap +├── bootstrap.cmd # Windows dependency bootstrap +├── lefthook.yml # Git hooks (REUSE lint, checkpatch) +│ +├── meshmc/ # MeshMC launcher (C++23, Qt 6, CMake) +├── mnv/ # MNV text editor (C, Autotools/CMake) +├── cgit/ # cgit Git web interface (C, Make) +│ +├── neozip/ # Compression library (C, CMake) +├── json4cpp/ # JSON library (C++, CMake/Meson) +├── tomlplusplus/ # TOML library (C++17, Meson/CMake) +├── libnbtplusplus/ # NBT library (C++, CMake) +├── cmark/ # Markdown library (C, CMake) +├── genqrcode/ # QR code library (C, CMake/Autotools) +├── forgewrapper/ # Forge bootstrap (Java, Gradle) +│ +├── corebinutils/ # BSD utility ports (C, Make) +│ +├── meta/ # Metadata generator (Python, Poetry) +├── ofborg/ # tickborg CI bot (Rust, 
Cargo) +├── images4docker/ # Docker build environments (Dockerfile) +├── ci/ # CI tooling (Nix, JavaScript) +├── hooks/ # Git hook scripts +│ +├── archived/ # Deprecated sub-projects +│ ├── projt-launcher/ +│ ├── projt-modpack/ +│ ├── projt-minicraft-modpack/ +│ └── ptlibzippy/ +│ +└── docs/ # Documentation + └── handbook/ # Developer handbook by component +``` + +--- + +## Dependency Graph + +### Compile-Time Dependencies + +MeshMC is the primary integration point. It consumes most of the library +sub-projects either directly or indirectly: + +``` +meshmc +├─── json4cpp # JSON configuration parsing +│ └── (header-only, no transitive deps) +│ +├─── tomlplusplus # TOML instance/mod configuration +│ └── (header-only, no transitive deps) +│ +├─── libnbtplusplus # Minecraft world/data NBT parsing +│ └── zlib # Compressed NBT support (optional) +│ +├─── neozip # General compression (zlib-compatible API) +│ └── (CPU intrinsics, no library deps) +│ +├─── cmark # Markdown changelog/news rendering +│ └── (no deps) +│ +├─── genqrcode # QR code generation for account linking +│ └── libpng # PNG output (optional, for CLI tool) +│ +├─── forgewrapper # Runtime: Forge mod loader bootstrap +│ └── (Java SPI, no compile-time deps from meshmc) +│ +├─── Qt 6 # External: GUI framework +│ ├── Core, Widgets, Concurrent +│ ├── Network, NetworkAuth +│ ├── Test, Xml +│ └── QuaZip (Qt 6) # ZIP archive handling +│ +├─── libarchive # External: Archive extraction +└─── ECM # External: Extra CMake Modules +``` + +### Runtime Dependencies + +``` +meshmc (running) +├─── forgewrapper.jar # Loaded at Minecraft launch for Forge ≥1.13 +├─── meta/ JSON manifests # Fetched over HTTP for version discovery +│ ├── Mojang versions +│ ├── Forge / NeoForge versions +│ ├── Fabric / Quilt versions +│ └── Java runtime versions +├─── JDK 17+ # For running Minecraft +└─── System zlib / neozip # Linked at build time +``` + +### CI Dependencies + +``` +ci.yml (orchestrator) +├─── ci/github-script/ # JavaScript: 
commit lint, PR prep, reviews +│ ├── lint-commits.js # Conventional Commits validation +│ ├── prepare.js # PR validation +│ ├── reviews.js # Review state management +│ └── withRateLimit.js # GitHub API rate limiting +│ +├─── ci/default.nix # Nix: treefmt, codeowners-validator +│ ├── treefmt-nix # Multi-language formatting +│ │ ├── actionlint # GitHub Actions YAML lint +│ │ ├── biome # JavaScript/TypeScript formatting +│ │ ├── nixfmt # Nix formatting +│ │ ├── yamlfmt # YAML formatting +│ │ └── zizmor # GitHub Actions security scanning +│ └── codeowners-validator # CODEOWNERS file validation +│ +├─── ci/pinned.json # Pinned Nixpkgs revision +│ +├─── images4docker/ # Docker build environments (40 distros) +│ +└─── ofborg/tickborg/ # Distributed CI bot + ├── RabbitMQ (AMQP) # Message queue + └── GitHub API # Check runs, PR comments +``` + +### Full Dependency Matrix + +| Consumer | json4cpp | toml++ | libnbt++ | neozip | cmark | genqrcode | forgewrapper | meta | Qt 6 | +|----------|----------|--------|----------|--------|-------|-----------|--------------|------|------| +| meshmc | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ (runtime) | ✓ (runtime, HTTP) | ✓ | +| meta | — | — | — | — | — | — | — | — | — | +| tickborg | — | — | — | — | — | — | — | — | — | +| corebinutils | — | — | — | — | — | — | — | — | — | +| mnv | — | — | — | — | — | — | — | — | — | +| cgit | — | — | — | — | — | — | — | — | — | + +The library sub-projects (json4cpp, tomlplusplus, libnbtplusplus, neozip, +cmark, genqrcode) are consumed exclusively by MeshMC within the monorepo. +External consumers can also use them independently. 
+ +--- + +## Build System Architecture + +Each sub-project uses the build system best suited to its upstream lineage: + +``` + ┌────────────────────────┐ + │ Nix Flake (top-level) │ + │ Development Shells │ + └──────────┬─────────────┘ + │ + ┌────────────────────┼────────────────────┐ + │ │ │ + ┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐ + │ CMake │ │ Other │ │ Package │ + │ Projects │ │ Systems │ │ Managers │ + └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ + │ │ │ + ┌───────┼────────┐ ┌──────┼──────┐ ┌─────┼─────┐ + │ │ │ │ │ │ │ │ │ +meshmc neozip cmark toml++ cgit corebinutils meta tickborg +json4 genqr libnbt (Meson)(Make) (Make) (Poetry)(Cargo) +(CMake)(CMake) (CMake) mnv forgewrapper + (Auto) (Gradle) +``` + +### CMake Projects (Ninja Multi-Config) + +MeshMC and its library dependencies use CMake with the Ninja Multi-Config +generator. MeshMC ships `CMakePresets.json` with platform-specific presets: + +| Preset | Platform | Toolchain | +|--------|----------|-----------| +| `linux` | Linux | System compiler | +| `macos` | macOS | vcpkg | +| `macos_universal` | macOS Universal | x86_64 + arm64 | +| `windows_mingw` | Windows | MinGW | +| `windows_msvc` | Windows | MSVC + vcpkg | + +All presets share a hidden `base` preset that enforces: +- Ninja Multi-Config generator +- Build directory: `build/` +- Install directory: `install/` +- LTO enabled by default + +### CMake Compiler Requirements + +| Compiler | Minimum Version | Standard | +|----------|----------------|----------| +| GCC | 13 | C++23 | +| Clang | 17 | C++23 | +| MSVC | 19.36 | C++23 | + +CMake minimum version: **3.28** + +### Meson Project (tomlplusplus) + +toml++ uses Meson as its primary build system with CMake as an alternative. +The Meson build supports both header-only and compiled modes. + +### Make Projects (cgit, corebinutils) + +cgit uses a traditional `Makefile` that first builds a bundled version of Git, +then builds cgit itself. The Makefile supports `NO_LUA=1` and +`LUA_PKGCONFIG=luaXX` options. 
+ +corebinutils uses a `./configure && make` workflow with `config.mk` for +toolchain configuration. It selects musl-gcc by default and falls back to +system gcc/clang. + +### Autotools Projects (mnv, genqrcode, neozip) + +MNV supports both CMake and traditional Autotools (`./configure && make`). +GenQRCode uses Autotools (`autogen.sh` → `./configure` → `make`). +NeoZip supports both CMake and a `./configure` script. + +### Gradle Project (forgewrapper) + +ForgeWrapper uses Gradle for building. The project includes a `gradlew` +wrapper script and uses JPMS (Java Platform Module System) via the +`jigsaw/` directory. + +### Cargo Workspace (tickborg) + +The `ofborg/` directory contains a Cargo workspace with two crates: +- `tickborg` — The main CI bot +- `tickborg-simple-build` — Simplified build runner + +The workspace uses Rust 2021 edition with `resolver = "2"`. + +### Poetry Project (meta) + +The `meta/` component uses Poetry for Python dependency management. Key +dependencies include `requests`, `cachecontrol`, `pydantic`, and `filelock`. +It provides CLI entry points for generating and updating version metadata +for each supported mod loader. + +--- + +## CI/CD Architecture + +### Orchestrator Pattern + +Project Tick uses a single monolithic CI orchestrator (`ci.yml`) that gates +all other workflows. The orchestrator: + +1. **Classifies the event** — Push, PR, merge queue, tag, scheduled, or manual +2. **Detects changed files** — Maps file paths to affected sub-projects +3. **Determines run level** — `minimal`, `standard`, or `full` +4. 
**Dispatches per-project builds** — Only builds what changed + +``` +Event (push/PR/merge_queue/tag) + │ + ▼ +┌──────────┐ +│ Gate │ ── classify event, detect changes, set run level +└────┬─────┘ + │ + ├──► Lint & Checks (commit messages, formatting, CODEOWNERS) + │ + ├──► meshmc-build.yml (if meshmc/ changed) + ├──► neozip-ci.yml (if neozip/ changed) + ├──► cmark-ci.yml (if cmark/ changed) + ├──► json4cpp-ci.yml (if json4cpp/ changed) + ├──► tomlplusplus-ci.yml (if tomlplusplus/ changed) + ├──► libnbtplusplus-ci.yml (if libnbtplusplus/ changed) + ├──► genqrcode-ci.yml (if genqrcode/ changed) + ├──► forgewrapper-build.yml (if forgewrapper/ changed) + ├──► cgit-ci.yml (if cgit/ changed) + ├──► corebinutils-ci.yml (if corebinutils/ changed) + ├──► mnv-ci.yml (if mnv/ changed) + │ + └──► Release workflows (if tag push) + ├── meshmc-release.yml + ├── meshmc-publish.yml + └── neozip-release.yml +``` + +### Workflow Inventory + +The `.github/workflows/` directory contains 50+ workflow files: + +**Core CI:** +- `ci.yml` — Monolithic orchestrator +- `ci-lint.yml` — Commit message and formatting checks +- `ci-schedule.yml` — Scheduled jobs + +**Per-Project CI:** +- `meshmc-build.yml`, `meshmc-codeql.yml`, `meshmc-container.yml`, `meshmc-nix.yml` +- `neozip-ci.yml`, `neozip-cmake.yml`, `neozip-configure.yml`, `neozip-analyze.yml`, `neozip-codeql.yml`, `neozip-fuzz.yml`, `neozip-lint.yml` +- `json4cpp-ci.yml`, `json4cpp-fuzz.yml`, `json4cpp-amalgam.yml`, `json4cpp-flawfinder.yml`, `json4cpp-semgrep.yml` +- `cmark-ci.yml`, `cmark-fuzz.yml` +- `tomlplusplus-ci.yml`, `tomlplusplus-fuzz.yml` +- `mnv-ci.yml`, `mnv-codeql.yml`, `mnv-coverity.yml` +- `cgit-ci.yml`, `corebinutils-ci.yml` +- `forgewrapper-build.yml`, `libnbtplusplus-ci.yml`, `genqrcode-ci.yml` + +**Release & Publishing:** +- `meshmc-release.yml`, `meshmc-publish.yml` +- `neozip-release.yml` +- `images4docker-build.yml` +- `tomlplusplus-gh-pages.yml`, `json4cpp-publish-docs.yml` + +**Repository Maintenance:** +- 
`repo-dependency-review.yml`, `repo-labeler.yml`, `repo-scorecards.yml`, `repo-stale.yml` +- `meshmc-backport.yml`, `meshmc-blocked-prs.yml`, `meshmc-merge-blocking-pr.yml` +- `meshmc-flake-update.yml` + +### Concurrency Control + +The CI orchestrator uses a concurrency key that varies by event type: + +| Event | Concurrency Group | +|-------|-------------------| +| Merge queue | `ci-` | +| Pull request | `ci-pr-` | +| Push | `ci-` | + +In-progress runs are cancelled for pushes and PRs but **not** for merge queue +entries, ensuring merge queue integrity. + +--- + +## Branch Strategy + +Branch classification is defined in `ci/supportedBranches.js`: + +| Branch Pattern | Type | Priority | Description | +|----------------|------|----------|-------------| +| `master` | development / primary | 0 (highest) | Main development branch | +| `release-*` | development / primary | 1 | Release branches (e.g., `release-7.0`) | +| `staging-*` | development / secondary | 2 | Pre-release staging | +| `staging-next-*` | development / secondary | 3 | Next staging cycle | +| `feature-*` | wip | — | Feature development | +| `fix-*` | wip | — | Bug fixes | +| `backport-*` | wip | — | Cherry-picks to release branches | +| `revert-*` | wip | — | Reverted changes | +| `wip-*` | wip | — | Work in progress | +| `dependabot-*` | wip | — | Automated dependency updates | + +Version tags follow: `..` (e.g., `7.0.0`). 
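The branch table can be read as an ordered rule list. What follows is a minimal Python sketch of that lookup, not a port of `ci/supportedBranches.js`: `BRANCH_RULES` and `classify_branch` are hypothetical names, and the glob patterns are copied from the table. One assumption is worth stating: `staging-next-*` is tested before `staging-*`, because the broader pattern would otherwise match both.

```python
from fnmatch import fnmatchcase

# (glob pattern, type, priority), copied from the branch table.
# Order matters: the more specific `staging-next-*` must come before `staging-*`.
# A priority of None stands for the "—" cells (wip branches have no priority).
BRANCH_RULES = [
    ("master", "development / primary", 0),
    ("release-*", "development / primary", 1),
    ("staging-next-*", "development / secondary", 3),
    ("staging-*", "development / secondary", 2),
    ("feature-*", "wip", None),
    ("fix-*", "wip", None),
    ("backport-*", "wip", None),
    ("revert-*", "wip", None),
    ("wip-*", "wip", None),
    ("dependabot-*", "wip", None),
]


def classify_branch(name):
    """Return (type, priority) for the first matching rule, or (None, None)."""
    for pattern, branch_type, priority in BRANCH_RULES:
        if fnmatchcase(name, pattern):
            return branch_type, priority
    return None, None


print(classify_branch("release-7.0"))
print(classify_branch("staging-next-7.1"))
```

A branch name that matches no pattern yields `(None, None)` in this sketch; how the real script treats unclassified branches is not covered here.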
+ +--- + +## Shared Infrastructure + +### Nix Flake (Top-Level) + +The root `flake.nix` provides a development shell for the entire monorepo: + +- **Toolchain:** LLVM 22 (Clang, clang-tidy) +- **clang-tidy-diff:** Wrapped Python script for incremental analysis +- **Submodule initialization:** Automatic via `shellHook` +- **Systems:** All `lib.systems.flakeExposed` (x86_64, aarch64 on Linux/macOS + and other exotic platforms) + +### CI Nix Configuration + +The `ci/default.nix` provides: + +- **treefmt** — Multi-language formatter with: + - `actionlint` — GitHub Actions YAML validation + - `biome` — JavaScript formatting (single quotes, no semicolons) + - `nixfmt` — Nix formatting (RFC style) + - `yamlfmt` — YAML formatting (retain line breaks) + - `zizmor` — GitHub Actions security scanning + - `keep-sorted` — Sort blocks marked with `keep-sorted` comments +- **codeowners-validator** — Validates the CODEOWNERS file + +### Lefthook Git Hooks + +Pre-commit hooks configured in `lefthook.yml`: + +1. **reuse-lint** — Validates REUSE compliance. If missing licenses are + detected, downloads them and stages the fix automatically. +2. **checkpatch** — Runs `scripts/checkpatch.pl` on staged C/C++ and CMake + diffs. Skipped during merge and rebase operations. + +Pre-push hooks: +1. **reuse-lint** — Final REUSE compliance check before push. 
+ +### Bootstrap Scripts + +`bootstrap.sh` (Linux/macOS) and `bootstrap.cmd` (Windows) handle first-time +setup: + +- Detect the host distribution (Debian, Ubuntu, Fedora, RHEL, openSUSE, Arch, + macOS) +- Install required dependencies via the native package manager +- Initialize and update Git submodules +- Install and configure lefthook + +The bootstrap scripts check for: +- Build tools: npm, Go, lefthook, reuse +- Libraries: Qt6Core, quazip1-qt6, zlib, ECM (via pkg-config) + +--- + +## Security Architecture + +### Supply Chain + +- All Nix inputs are pinned with content hashes (`flake.lock`, `ci/pinned.json`) +- GitHub Actions use pinned action versions with SHA references +- `step-security/harden-runner` is used in CI workflows +- `repo-dependency-review.yml` scans dependency changes +- `repo-scorecards.yml` tracks OpenSSF Scorecard compliance + +### Code Quality + +- CodeQL analysis for meshmc, mnv, and neozip +- Fuzz testing for neozip, json4cpp, cmark, and tomlplusplus +- Semgrep and Flawfinder static analysis for json4cpp +- Coverity scanning for mnv +- clang-tidy checks enabled via `MeshMC_ENABLE_CLANG_TIDY` CMake option + +### Compiler Hardening (MeshMC) + +MeshMC's CMakeLists.txt enables: +- `-fstack-protector-strong --param=ssp-buffer-size=4` — Stack smashing protection +- `-O3 -D_FORTIFY_SOURCE=2` — Buffer overflow detection +- `-Wall -pedantic` — Comprehensive warnings +- ASLR and PIE via `CMAKE_POSITION_INDEPENDENT_CODE ON` + +--- + +## Data Flow + +### MeshMC Launch Sequence + +``` +User clicks "Launch" in MeshMC + │ + ▼ +MeshMC reads instance configuration + │ (tomlplusplus for TOML, json4cpp for JSON) + │ + ▼ +MeshMC fetches version metadata + │ (HTTP → meta/ JSON manifests) + │ + ▼ +MeshMC downloads/verifies game assets + │ (neozip for decompression, libarchive for extraction) + │ + ▼ +MeshMC prepares launch environment + │ (libnbtplusplus for world data if needed) + │ + ▼ +[If Forge ≥1.13] ForgeWrapper bootstraps Forge + │ (Java SPI, installer 
extraction) + │ + ▼ +Minecraft process spawned with JDK 17+ +``` + +### CI Build Flow + +``` +Developer pushes commit + │ + ▼ +ci.yml Gate job runs + │ ─ classifies event type + │ ─ detects changed files + │ ─ maps to affected sub-projects + │ + ▼ +ci-lint.yml runs in parallel + │ ─ Conventional Commits validation + │ ─ treefmt formatting check + │ ─ CODEOWNERS validation + │ + ▼ +Per-project CI dispatched + │ ─ CMake configure + build + test + │ ─ Multi-platform matrix + │ ─ CodeQL / fuzz / static analysis + │ + ▼ +Results posted as GitHub check runs + │ + ▼ +[If tag push] Release workflow triggered + ─ Build release binaries + ─ Create GitHub release + ─ Publish artifacts +``` + +### Metadata Generation Flow + +``` +meta/ update scripts run (cron or manual) + │ + ├─► updateMojang → fetches Mojang version manifest + ├─► updateForge → fetches Forge version list + ├─► updateNeoForge → fetches NeoForge version list + ├─► updateFabric → fetches Fabric loader versions + ├─► updateQuilt → fetches Quilt loader versions + ├─► updateLiteloader → fetches LiteLoader versions + └─► updateJava → fetches Java runtime versions + │ + ▼ +generate scripts produce JSON manifests + │ + ▼ +Manifests deployed (git push or static hosting) + │ + ▼ +MeshMC reads manifests at startup +``` + +--- + +## Module Boundaries + +### Interface Contracts + +Each library sub-project provides well-defined interfaces: + +| Library | Include Path | Namespace | API Style | +|---------|-------------|-----------|-----------| +| json4cpp | `` | `nlohmann` | Header-only, template-based | +| tomlplusplus | `` | `toml` | Header-only, C++17 | +| libnbtplusplus | `` | `nbt` | Compiled library, C++11 | +| neozip | `` or `` | C API | Drop-in zlib replacement | +| cmark | `` | C API | Compiled library | +| genqrcode | `` | C API | Compiled library | +| forgewrapper | Java SPI | `io.github.zekerzhayard.forgewrapper` | JAR, service provider | + +### Versioning Independence + +Each sub-project maintains its own 
version number: + +| Project | Versioning | Current | +|---------|-----------|---------| +| meshmc | `MAJOR.MINOR.HOTFIX` (CMake) | 7.0.0 | +| meta | `MAJOR.MINOR.PATCH-REV` (pyproject.toml) | 0.0.5-1 | +| forgewrapper | Gradle `version` property | (see gradle.properties) | +| neozip | CMake project version | (follows zlib-ng) | +| Other libraries | Follow upstream versioning | — | + +The monorepo does not impose a single version across sub-projects. Each +component releases independently based on its own cadence. diff --git a/docs/handbook/Project-Tick/build-systems.md b/docs/handbook/Project-Tick/build-systems.md new file mode 100644 index 0000000000..d47fa9ee63 --- /dev/null +++ b/docs/handbook/Project-Tick/build-systems.md @@ -0,0 +1,711 @@ +# Project Tick — Build Systems + +## Overview + +Project Tick uses seven distinct build systems across its sub-projects, each +chosen to match the upstream heritage and language ecosystem of the component. +This document provides a comprehensive reference for each build system, common +patterns, and cross-cutting concerns. + +--- + +## Build System Matrix + +| Build System | Sub-Projects | Language | Configuration | +|-------------|-------------|----------|---------------| +| **CMake** | meshmc, neozip, cmark, genqrcode, json4cpp, libnbtplusplus, mnv | C/C++ | `CMakeLists.txt`, `CMakePresets.json` | +| **Meson** | tomlplusplus | C++ | `meson.build`, `meson_options.txt` | +| **Make (GNU Make)** | cgit, corebinutils | C | `Makefile`, `GNUmakefile` | +| **Autotools** | mnv, genqrcode, neozip | C | `configure.ac`, `Makefile.am`, `configure` | +| **Gradle** | forgewrapper | Java | `build.gradle`, `settings.gradle` | +| **Cargo** | tickborg | Rust | `Cargo.toml`, `Cargo.lock` | +| **Poetry** | meta | Python | `pyproject.toml`, `poetry.lock` | +| **Nix** | CI, dev shells, deployments | Multi | `flake.nix`, `default.nix` | + +--- + +## CMake + +CMake is the dominant build system in Project Tick, used by seven sub-projects. 
+ +### Minimum Versions + +| Component | CMake Minimum | C++ Standard | C Standard | +|-----------|--------------|-------------|-----------| +| meshmc | 3.28 | C++23 | C23 (C11 on MSVC) | +| neozip | 3.14 | — | C11 | +| cmark | 3.5 | — | C99 | +| genqrcode | 3.5 | — | C99 | +| json4cpp | 3.1 | C++11 | — | +| libnbtplusplus | 3.15 | C++11 | — | +| mnv | 3.10 | — | C11 | + +### MeshMC CMake Configuration + +MeshMC has the most sophisticated CMake setup in the monorepo, including: + +#### CMake Presets (`meshmc/CMakePresets.json`) + +All presets inherit from a hidden `base` preset: + +```json +{ + "name": "base", + "hidden": true, + "generator": "Ninja Multi-Config", + "binaryDir": "build", + "installDir": "install", + "cacheVariables": { + "ENABLE_LTO": "ON" + } +} +``` + +Platform presets: + +| Preset | OS | Toolchain | vcpkg | +|--------|-----|-----------|-------| +| `linux` | Linux | System | No | +| `macos` | macOS | System | Yes (`$VCPKG_ROOT`) | +| `macos_universal` | macOS | Universal (x86_64+arm64) | Yes | +| `windows_mingw` | Windows | MinGW | No | +| `windows_msvc` | Windows | MSVC | Yes (`$VCPKG_ROOT`) | + +Usage: + +```bash +# Configure +cmake --preset linux + +# Build +cmake --build --preset linux + +# Test (uses CTest) +cd build && ctest --output-on-failure + +# Install +cmake --install build --config Release --prefix install +``` + +#### CMake Options + +| Option | Default | Description | +|--------|---------|-------------| +| `ENABLE_LTO` | OFF | Link Time Optimization (the `base` preset overrides this to `ON`) | +| `MeshMC_DISABLE_JAVA_DOWNLOADER` | OFF | Disable Java auto-download | +| `MeshMC_ENABLE_CLANG_TIDY` | OFF | Run clang-tidy during build | + +#### External Dependencies (find_package) + +```cmake +find_package(Qt6 REQUIRED COMPONENTS + Core Widgets Concurrent Network NetworkAuth Test Xml +) +find_package(ECM NO_MODULE REQUIRED) +find_package(LibArchive REQUIRED) +``` + +Additional Qt queries via `QMakeQuery`: +- `QT_INSTALL_PLUGINS` → Plugin directory +- `QT_INSTALL_LIBS` → Library 
directory +- `QT_INSTALL_LIBEXECS` → Libexec directory + +#### Compiler Configuration + +```cmake +set(CMAKE_CXX_STANDARD 23) +set(CMAKE_CXX_STANDARD_REQUIRED true) +set(CMAKE_C_STANDARD 23) # C11 on MSVC +set(CMAKE_AUTOMOC ON) # Qt meta-object compiler +set(CMAKE_POSITION_INDEPENDENT_CODE ON) +set(CMAKE_EXPORT_COMPILE_COMMANDS ON) +``` + +Compiler flags (GCC/Clang): +``` +-Wall -pedantic -Wno-deprecated-declarations +-fstack-protector-strong --param=ssp-buffer-size=4 +-O3 -D_FORTIFY_SOURCE=2 +-DQT_NO_DEPRECATED_WARNINGS=Y +``` + +MSVC flags: +``` +/W4 /DQT_NO_DEPRECATED_WARNINGS=Y +``` + +macOS additionally: +``` +-stdlib=libc++ +``` + +#### LTO (Link Time Optimization) + +When `ENABLE_LTO` is ON, MeshMC uses `CheckIPOSupported`: + +```cmake +include(CheckIPOSupported) +check_ipo_supported(RESULT ipo_supported OUTPUT ipo_error) +if(ipo_supported) + set(CMAKE_INTERPROCEDURAL_OPTIMIZATION_RELEASE TRUE) + set(CMAKE_INTERPROCEDURAL_OPTIMIZATION_MINSIZEREL TRUE) + set(CMAKE_INTERPROCEDURAL_OPTIMIZATION_RELWITHDEBINFO TRUE) +endif() +``` + +LTO is **not** enabled for Debug builds. + +#### Versioning + +```cmake +set(MeshMC_VERSION_MAJOR 7) +set(MeshMC_VERSION_MINOR 0) +set(MeshMC_VERSION_HOTFIX 0) +set(MeshMC_RELEASE_VERSION_NAME "7.0.0") +``` + +#### Build Targets + +The meshmc CMake tree produces: +- Main executable (`meshmc`) +- Libraries in `libraries/` subdirectory +- Java JARs in `${PROJECT_BINARY_DIR}/jars` +- Tests (via ECMAddTests when `BUILD_TESTING` is ON) + +### NeoZip CMake Configuration + +NeoZip supports both CMake and traditional `./configure`: + +```bash +# CMake +mkdir build && cd build +cmake .. 
-G Ninja \ + -DZLIB_COMPAT=ON \ + -DWITH_GTEST=ON +ninja +ctest + +# configure script (custom, not GNU Autoconf) +./configure +make -j$(nproc) +make test +``` + +Key CMake variables: +- `ZLIB_COMPAT` — Build with zlib-compatible API +- `WITH_GTEST` — Build with Google Test +- `WITH_BENCHMARKS` — Build benchmarks +- Architecture-specific SIMD flags are auto-detected + +### cmark CMake Configuration + +```bash +mkdir build && cd build +cmake .. -G Ninja \ + -DCMARK_TESTS=ON \ + -DCMARK_SHARED=ON +ninja +ctest +``` + +### json4cpp CMake Configuration + +json4cpp supports CMake, Meson (via `meson.build`), and Bazel: + +```bash +mkdir build && cd build +cmake .. -G Ninja \ + -DJSON_BuildTests=ON +ninja +ctest +``` + +The library is header-only; the CMake build is primarily for tests. + +### libnbt++ CMake Configuration + +```bash +mkdir build && cd build +cmake .. \ + -DNBT_BUILD_SHARED=OFF \ + -DNBT_USE_ZLIB=ON \ + -DNBT_BUILD_TESTS=ON +make -j$(nproc) +ctest +``` + +### genqrcode CMake Configuration + +```bash +mkdir build && cd build +cmake .. -G Ninja +ninja +``` + +Also supports Autotools: +```bash +./autogen.sh +./configure +make -j$(nproc) +``` + +--- + +## Meson + +### tomlplusplus + +toml++ uses Meson as its primary build system: + +```bash +meson setup build +ninja -C build +ninja -C build test +``` + +Meson options (from `meson_options.txt`): +- Build mode (header-only vs compiled) +- Test configuration +- Example programs + +Also supports CMake as an alternative: + +```bash +mkdir build && cd build +cmake .. 
-G Ninja +ninja +ctest +``` + +--- + +## GNU Make + +### cgit + +cgit uses a traditional `Makefile` that first builds Git as a dependency: + +```bash +# Initialize Git submodule +git submodule init +git submodule update + +# Build (builds Git first, then cgit) +make + +# Install (default: /var/www/htdocs/cgit) +sudo make install +``` + +Build options: +- `NO_LUA=1` — Build without Lua scripting support +- `LUA_PKGCONFIG=lua5.1` — Specify Lua implementation +- Custom paths via `cgit.conf` + +### corebinutils + +CoreBinUtils uses a `./configure` script that generates toolchain overrides, +then builds with GNU Make: + +```bash +./configure +make -f GNUmakefile -j$(nproc) all +make -f GNUmakefile test +``` + +The `configure` script: +- Selects musl-gcc or musl-capable clang by preference +- Falls back to system gcc/clang +- Generates `config.mk` with `CC`, `AR`, `RANLIB`, `CPPFLAGS`, `CFLAGS`, + `LDFLAGS` + +Each subdirectory (e.g., `cat/`, `ls/`, `cp/`) has its own `GNUmakefile` +that the top-level `GNUmakefile` orchestrates. + +--- + +## Autotools + +### mnv + +MNV supports both CMake and traditional Autotools: + +```bash +# Autotools (traditional) +./configure --with-features=huge --enable-gui=auto +make -j$(nproc) +sudo make install + +# CMake (alternative) +mkdir build && cd build +cmake .. 
-G Ninja +ninja +``` + +The Autotools build system supports extensive feature flags: +- `--with-features={tiny,small,normal,big,huge}` +- `--enable-gui={auto,no,gtk2,gtk3,motif,...}` +- `--enable-python3interp` +- `--enable-luainterp` +- And many more + +### genqrcode + +GenQRCode uses Autotools: + +```bash +./autogen.sh # Generate configure from configure.ac +./configure # Configure +make -j$(nproc) # Build +make check # Run tests +sudo make install # Install +``` + +### neozip + +NeoZip's `./configure` is a custom script (not GNU Autoconf): + +```bash +./configure +make -j$(nproc) +make test +sudo make install +``` + +--- + +## Gradle + +### forgewrapper + +ForgeWrapper uses Gradle for Java builds: + +```bash +# Build +./gradlew build + +# Test +./gradlew test + +# Clean +./gradlew clean + +# Generate JAR +./gradlew jar +``` + +Project structure: +``` +forgewrapper/ +├── build.gradle # Build configuration +├── gradle.properties # Version and settings +├── settings.gradle # Project name and modules +├── gradlew # Unix wrapper script +├── gradlew.bat # Windows wrapper script +├── gradle/ # Gradle wrapper JAR +├── jigsaw/ # JPMS module configuration +└── src/ + └── main/java/ # Source code +``` + +The Gradle wrapper (`gradlew`) pins the Gradle version so no system-wide +Gradle installation is needed. 
+ +--- + +## Cargo + +### tickborg + +The `ofborg/` directory contains a Cargo workspace: + +```toml +[workspace] +members = [ + "tickborg", + "tickborg-simple-build" +] +resolver = "2" + +[profile.release] +debug = true +``` + +#### Building + +```bash +cd ofborg + +# Build all workspace crates +cargo build + +# Build in release mode +cargo build --release + +# Run tests +cargo test + +# Run lints +cargo clippy + +# Format +cargo fmt + +# Build specific crate +cargo build -p tickborg +``` + +#### Workspace Structure + +``` +ofborg/ +├── Cargo.toml # Workspace root +├── Cargo.lock # Locked dependencies +├── tickborg/ # Main CI bot crate +│ ├── Cargo.toml +│ └── src/ +└── tickborg-simple-build/ # Simplified build crate + ├── Cargo.toml + └── src/ +``` + +The workspace uses `resolver = "2"` (Rust 2021 edition resolver) and enables +debug symbols in release builds for profiling. + +--- + +## Poetry + +### meta + +The `meta/` component uses Poetry for Python dependency management: + +```bash +cd meta + +# Install dependencies +poetry install + +# Run in Poetry environment +poetry run generateMojang + +# Or activate shell +poetry shell +generateMojang +``` + +#### pyproject.toml + +```toml +[tool.poetry] +name = "meta" +version = "0.0.5-1" +description = "ProjT Launcher meta generator" +license = "MS-PL" + +[tool.poetry.dependencies] +python = ">=3.10,<4.0" +cachecontrol = "^0.14.0" +requests = "^2.31.0" +filelock = "^3.20.3" +packaging = "^25.0" +pydantic = "^1.10.13" + +[build-system] +requires = ["poetry-core>=1.0.0"] +build-backend = "poetry.core.masonry.api" +``` + +#### CLI Entry Points + +Poetry scripts provide named commands: + +| Command | Function | +|---------|----------| +| `generateFabric` | `meta.run.generate_fabric:main` | +| `generateForge` | `meta.run.generate_forge:main` | +| `generateLiteloader` | `meta.run.generate_liteloader:main` | +| `generateMojang` | `meta.run.generate_mojang:main` | +| `generateNeoForge` | `meta.run.generate_neoforge:main` | +| 
`generateQuilt` | `meta.run.generate_quilt:main` | +| `generateJava` | `meta.run.generate_java:main` | +| `updateFabric` | `meta.run.update_fabric:main` | +| `updateForge` | `meta.run.update_forge:main` | +| `updateLiteloader` | `meta.run.update_liteloader:main` | +| `updateMojang` | `meta.run.update_mojang:main` | +| `updateNeoForge` | `meta.run.update_neoforge:main` | +| `updateQuilt` | `meta.run.update_quilt:main` | +| `updateJava` | `meta.run.update_java:main` | +| `index` | `meta.run.index:main` | + +--- + +## Nix + +Nix is used across the monorepo for reproducible development environments, +CI tooling, and deployment. + +### Top-Level Flake (`flake.nix`) + +```nix +{ + description = "Project Tick is a project dedicated to providing developers + with ease of use and users with long-lasting software."; + + inputs = { + nixpkgs.url = "https://channels.nixos.org/nixos-unstable/nixexprs.tar.xz"; + }; +} +``` + +Provides: +- `devShells.default` — LLVM 22 toolchain with clang-tidy-diff +- `formatter` — nixfmt-rfc-style +- Systems: all `lib.systems.flakeExposed` + +The dev shell automatically runs `git submodule update --init --force` on +entry. + +### CI Nix (`ci/default.nix`) + +The CI Nix expression provides: + +1. **treefmt** — Multi-language formatter: + - `actionlint` — GitHub Actions YAML lint + - `biome` — JavaScript (single quotes, no semicolons) + - `keep-sorted` — Sort annotated blocks + - `nixfmt` — Nix formatting (RFC style) + - `yamlfmt` — YAML (retain line breaks) + - `zizmor` — GitHub Actions security scanning + +2. **codeowners-validator** — Built from source with patches: + - `owners-file-name.patch` + - `permissions.patch` + +3. 
**Pinned Nixpkgs** — `ci/pinned.json` locks the Nixpkgs revision: + ```bash + # Update pinned revision + ./ci/update-pinned.sh + ``` + +### Meta Nix (`meta/flake.nix`) + +The meta component provides a NixOS module for deployment: + +```nix +services.blockgame-meta = { + enable = true; + settings = { + DEPLOY_TO_GIT = "true"; + GIT_AUTHOR_NAME = "..."; + GIT_AUTHOR_EMAIL = "..."; + }; +}; +``` + +### MeshMC Nix (`meshmc/flake.nix`) + +MeshMC provides its own Nix flake for building: + +```bash +cd meshmc +nix build +``` + +### Per-Project Nix Files + +Several sub-projects include `default.nix`, `shell.nix`, or `flake.nix` for +Nix-based development: + +| Project | Nix Files | +|---------|-----------| +| meshmc | `flake.nix`, `default.nix`, `shell.nix` | +| meta | `flake.nix` | +| ofborg | `flake.nix`, `default.nix`, `shell.nix`, `service.nix` | +| ci | `default.nix` | +| ci/github-script | `shell.nix` | +| cmark | `shell.nix` | + +--- + +## Cross-Cutting Build Concerns + +### Compile Commands Database + +MeshMC generates `compile_commands.json` via: + +```cmake +set(CMAKE_EXPORT_COMPILE_COMMANDS ON) +``` + +This file is used by clang-tidy, clangd, and other tools for accurate +code analysis. 
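A compilation database is a JSON array with one object per translation unit, each carrying `directory`, `file`, and either `command` or `arguments`. The short Python sketch below shows how a tool might read it. The function name `list_translation_units` and the sample paths are illustrative assumptions and are not taken from the MeshMC tree.

```python
import json


def list_translation_units(database_text):
    """Return the sorted, de-duplicated source files named in a compilation database."""
    entries = json.loads(database_text)
    return sorted({entry["file"] for entry in entries})


# Hypothetical two-entry database; real entries are generated by CMake.
SAMPLE = json.dumps([
    {
        "directory": "/work/build",
        "file": "/work/src/util.cpp",
        "arguments": ["c++", "-std=c++23", "-c", "/work/src/util.cpp"],
    },
    {
        "directory": "/work/build",
        "file": "/work/src/main.cpp",
        "command": "c++ -std=c++23 -c /work/src/main.cpp",
    },
])

for source_file in list_translation_units(SAMPLE):
    print(source_file)
```

Given the `base` preset's `binaryDir` of `build`, the generated file would presumably sit under `build/`; that location is an inference from the preset, not something confirmed here.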
+ +### Testing Frameworks + +| Project | Test Framework | Runner | +|---------|---------------|--------| +| meshmc | Qt Test + CTest | `ctest` | +| neozip | Google Test + CTest | `ctest` | +| json4cpp | Catch2 + CTest | `ctest` | +| tomlplusplus | Catch2 | `ninja test` | +| libnbtplusplus | CTest | `ctest` | +| cmark | Custom + CTest | `ctest` | +| forgewrapper | JUnit + Gradle | `./gradlew test` | +| tickborg | Rust built-in | `cargo test` | +| corebinutils | Custom shell tests | `make test` | +| mnv | Custom test framework | `make test` | + +### Parallel Build Support + +All build systems support parallel builds: + +```bash +# CMake/Ninja +cmake --build build -j$(nproc) + +# Make +make -j$(nproc) + +# Cargo +cargo build -j $(nproc) + +# Gradle +./gradlew build --parallel +``` + +### Out-of-Source Build Enforcement + +MeshMC enforces out-of-source builds: + +```cmake +string(COMPARE EQUAL "${CMAKE_SOURCE_DIR}" "${CMAKE_BUILD_DIR}" IS_IN_SOURCE_BUILD) +if(IS_IN_SOURCE_BUILD) + message(FATAL_ERROR "You are building MeshMC in-source.") +endif() +``` + +### WSL Build Rejection + +MeshMC explicitly rejects builds on WSL (Windows Subsystem for Linux): + +```cmake +if(CMAKE_HOST_SYSTEM_VERSION MATCHES ".*[Mm]icrosoft.*" OR + CMAKE_HOST_SYSTEM_VERSION MATCHES ".*WSL.*") + message(FATAL_ERROR "Building MeshMC is not supported in Linux-on-Windows distributions.") +endif() +``` + +--- + +## Build Quick Reference + +| Action | meshmc | neozip | cgit | toml++ | tickborg | meta | forgewrapper | +|--------|--------|--------|------|--------|----------|------|-------------| +| Configure | `cmake --preset linux` | `cmake -B build` | — | `meson setup build` | — | — | — | +| Build | `cmake --build --preset linux` | `ninja -C build` | `make` | `ninja -C build` | `cargo build` | `poetry install` | `./gradlew build` | +| Test | `ctest` | `ctest` | — | `ninja -C build test` | `cargo test` | — | `./gradlew test` | +| Install | `cmake --install build` | `ninja -C build install` | `make 
install` | `ninja -C build install` | `cargo install` | — | — | +| Clean | `rm -rf build` | `rm -rf build` | `make clean` | `rm -rf build` | `cargo clean` | — | `./gradlew clean` | +| Format | `clang-format -i` | — | — | — | `cargo fmt` | — | — | +| Lint | `clang-tidy` | — | — | — | `cargo clippy` | — | — | diff --git a/docs/handbook/Project-Tick/ci-cd-pipeline.md b/docs/handbook/Project-Tick/ci-cd-pipeline.md new file mode 100644 index 0000000000..78d36e97cb --- /dev/null +++ b/docs/handbook/Project-Tick/ci-cd-pipeline.md @@ -0,0 +1,599 @@ +# Project Tick — CI/CD Pipeline + +## Overview + +Project Tick uses a multi-layered CI/CD pipeline that orchestrates builds, +tests, security scans, and releases across all sub-projects in the monorepo. +The pipeline combines GitHub Actions, Nix-based tooling, and the custom +tickborg distributed CI system. + +--- + +## Architecture + +### Three-Layer CI Strategy + +``` +Layer 1: GitHub Actions (ci.yml orchestrator) + ├── Event classification and change detection + ├── Per-project workflow dispatch + └── Release and publishing workflows + +Layer 2: CI Tooling (ci/ directory) + ├── treefmt (multi-language formatting) + ├── codeowners-validator + ├── commit linting (Conventional Commits) + └── Pinned Nix environment + +Layer 3: tickborg (ofborg/ distributed CI) + ├── RabbitMQ-based job distribution + ├── Multi-platform build execution + └── GitHub check run reporting +``` + +--- + +## GitHub Actions — The Orchestrator + +### ci.yml — Monolithic Gate + +The primary CI workflow (`ci.yml`) is the single entry point for all CI +activity. Every push, pull request, merge queue entry, tag push, and manual +dispatch flows through this workflow. 
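Change detection is the part of the gate that keeps unrelated sub-projects from rebuilding. The Python sketch below illustrates the idea under one assumption: each sub-project lives in a top-level directory of the same name. `PROJECT_DIRS`, `detect_changes`, and the `force_all` parameter (loosely modelled on the workflow's `force-all` input) are hypothetical names and do not reproduce the logic in `ci.yml`.

```python
from pathlib import PurePosixPath

# Top-level directories assumed to correspond one-to-one with sub-projects.
PROJECT_DIRS = [
    "meshmc", "neozip", "cmark", "json4cpp", "tomlplusplus", "libnbtplusplus",
    "genqrcode", "forgewrapper", "cgit", "corebinutils", "mnv",
]


def detect_changes(changed_files, force_all=False):
    """Map repository-relative paths to per-project `<name>_changed` flags."""
    flags = {f"{name}_changed": force_all for name in PROJECT_DIRS}
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Only files inside a project directory count, not a top-level file
        # that merely shares the project's name.
        if len(parts) > 1 and parts[0] in PROJECT_DIRS:
            flags[f"{parts[0]}_changed"] = True
    return flags


flags = detect_changes(["meshmc/CMakeLists.txt", "docs/handbook/ci/overview.md"])
print(sorted(name for name, changed in flags.items() if changed))
```

Paths outside every project directory, such as the documentation file above, set no flag in this sketch; a real gate may treat shared paths differently.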
+ +#### Trigger Events + +```yaml +on: + push: + branches: ["**"] + tags: ["*"] + pull_request: + types: [opened, synchronize, reopened, ready_for_review] + pull_request_target: + types: [closed, labeled] + merge_group: + types: [checks_requested] + workflow_dispatch: + inputs: + force-all: + description: "Force run all project CI pipelines" + type: boolean + default: false + build-type: + description: "Build configuration for meshmc/forgewrapper" + type: choice + options: [Debug, Release] + default: Debug +``` + +#### Permissions + +The orchestrator runs with minimal permissions: + +```yaml +permissions: + contents: read +``` + +#### Concurrency Control + +```yaml +concurrency: + group: >- + ci-${{ + github.event_name == 'merge_group' && github.event.merge_group.head_ref || + github.event_name == 'pull_request' && format('pr-{0}', github.event.pull_request.number) || + github.ref + }} + cancel-in-progress: ${{ github.event_name != 'merge_group' }} +``` + +| Event | Concurrency Group | Cancel In-Progress | +|-------|-------------------|--------------------| +| Merge queue | `ci-` | No | +| Pull request | `ci-pr-` | Yes | +| Push | `ci-` | Yes | + +Merge queue runs are never cancelled to maintain queue integrity. + +#### Stage 0: Gate & Triage + +The `gate` job is the first job that runs. It: + +1. **Classifies the event:** push, PR, merge queue, tag, backport, dependabot, + scheduled, etc. +2. **Detects changed files:** Maps file paths to sub-project flags +3. **Sets run level:** `minimal`, `standard`, or `full` +4. **Exports output variables** for downstream jobs: + - Event classification flags (`is_push`, `is_pr`, `is_merge_queue`, etc.) + - Per-project change flags (`meshmc_changed`, `neozip_changed`, etc.) + - Run level for downstream decisions + +Draft PRs are automatically skipped: +```yaml +if: >- + !(github.event_name == 'pull_request' && github.event.pull_request.draft) +``` + +### ci-lint.yml — Lint & Checks + +Called from `ci.yml` before builds start. 
Runs commit message validation and +formatting checks. + +#### Commit Message Linting + +Uses `ci/github-script/lint-commits.js` via `actions/github-script`: + +```yaml +- name: Lint commit messages + uses: actions/github-script@v7 +``` + +The linter validates Conventional Commits format: +``` +type(scope): description +``` + +Valid types: `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, +`build`, `ci`, `chore`, `revert`. + +#### Security Hardening + +All CI jobs use `step-security/harden-runner`: + +```yaml +- name: Harden runner + uses: step-security/harden-runner@v2 + with: + egress-policy: audit +``` + +--- + +## Workflow Inventory + +### Per-Project CI Workflows + +Each sub-project has dedicated CI workflows that build, test, and analyze +the component: + +#### MeshMC (8 workflows) + +| Workflow | Purpose | +|----------|---------| +| `meshmc-build.yml` | Multi-platform build matrix | +| `meshmc-codeql.yml` | CodeQL security analysis | +| `meshmc-container.yml` | Container (Docker/Podman) build | +| `meshmc-nix.yml` | Nix build verification | +| `meshmc-backport.yml` | Automated backport PR creation | +| `meshmc-blocked-prs.yml` | Track and manage blocked PRs | +| `meshmc-merge-blocking-pr.yml` | Merge queue blocking logic | +| `meshmc-flake-update.yml` | Automated Nix flake update | + +#### NeoZip (13 workflows) + +| Workflow | Purpose | +|----------|---------| +| `neozip-ci.yml` | Primary CI | +| `neozip-cmake.yml` | CMake build matrix | +| `neozip-configure.yml` | Autotools (configure) build | +| `neozip-analyze.yml` | Static analysis | +| `neozip-codeql.yml` | CodeQL security scanning | +| `neozip-fuzz.yml` | Fuzz testing | +| `neozip-lint.yml` | Code linting | +| `neozip-libpng.yml` | libpng integration test | +| `neozip-link.yml` | Link validation | +| `neozip-osb.yml` | OpenSSF Scorecard | +| `neozip-pigz.yml` | pigz compatibility test | +| `neozip-pkgcheck.yml` | Package check | +| `neozip-release.yml` | Release workflow | + +#### json4cpp (7 
workflows) + +| Workflow | Purpose | +|----------|---------| +| `json4cpp-ci.yml` | Primary CI | +| `json4cpp-fuzz.yml` | Fuzz testing | +| `json4cpp-amalgam.yml` | Amalgamation (single-header) build | +| `json4cpp-amalgam-comment.yml` | Amalgamation PR comment | +| `json4cpp-flawfinder.yml` | Flawfinder static analysis | +| `json4cpp-semgrep.yml` | Semgrep security scanning | +| `json4cpp-publish-docs.yml` | Documentation publishing | + +#### Other Sub-Projects + +| Workflow | Sub-Project | Purpose | +|----------|------------|---------| +| `cmark-ci.yml` | cmark | Build and test | +| `cmark-fuzz.yml` | cmark | Fuzz testing | +| `tomlplusplus-ci.yml` | tomlplusplus | Build and test | +| `tomlplusplus-fuzz.yml` | tomlplusplus | Fuzz testing | +| `tomlplusplus-gh-pages.yml` | tomlplusplus | Documentation deployment | +| `mnv-ci.yml` | mnv | Build and test | +| `mnv-codeql.yml` | mnv | CodeQL analysis | +| `mnv-coverity.yml` | mnv | Coverity scan | +| `mnv-link-check.yml` | mnv | Documentation link check | +| `cgit-ci.yml` | cgit | Build and test | +| `corebinutils-ci.yml` | corebinutils | Build and test | +| `forgewrapper-build.yml` | forgewrapper | Gradle build | +| `libnbtplusplus-ci.yml` | libnbtplusplus | Build and test | +| `genqrcode-ci.yml` | genqrcode | Build and test | +| `images4docker-build.yml` | images4docker | Docker image build | + +### Release & Publishing Workflows + +| Workflow | Purpose | +|----------|---------| +| `meshmc-release.yml` | Create MeshMC releases | +| `meshmc-publish.yml` | Publish MeshMC artifacts | +| `neozip-release.yml` | Create NeoZip releases | + +### Repository Maintenance Workflows + +| Workflow | Purpose | +|----------|---------| +| `repo-dependency-review.yml` | Scan dependency changes for vulnerabilities | +| `repo-labeler.yml` | Auto-label PRs by changed paths | +| `repo-scorecards.yml` | OpenSSF Scorecard compliance tracking | +| `repo-stale.yml` | Mark and close stale issues/PRs | + +--- + +## Change Detection + +The CI 
orchestrator maps changed file paths to sub-project flags: + +| Path Pattern | Flag | Sub-Project | +|-------------|------|-------------| +| `meshmc/**` | `meshmc_changed` | MeshMC | +| `neozip/**` | `neozip_changed` | NeoZip | +| `json4cpp/**` | `json4cpp_changed` | json4cpp | +| `tomlplusplus/**` | `tomlplusplus_changed` | tomlplusplus | +| `libnbtplusplus/**` | `libnbt_changed` | libnbt++ | +| `cmark/**` | `cmark_changed` | cmark | +| `genqrcode/**` | `genqrcode_changed` | genqrcode | +| `forgewrapper/**` | `forgewrapper_changed` | ForgeWrapper | +| `cgit/**` | `cgit_changed` | cgit | +| `corebinutils/**` | `corebinutils_changed` | CoreBinUtils | +| `mnv/**` | `mnv_changed` | MNV | +| `ofborg/**` | `ofborg_changed` | tickborg | +| `meta/**` | `meta_changed` | Meta | +| `images4docker/**` | `docker_changed` | Images4Docker | +| `ci/**` | `ci_changed` | CI tooling | +| `archived/**` | `archived_changed` | Archived | + +### Force-All Mode + +All projects are built when: +- `force-all` is set to `true` in a manual dispatch +- The event is a merge queue entry (`is_merge_queue`) + +--- + +## tickborg — Distributed CI + +tickborg is a RabbitMQ-based distributed CI system adapted from NixOS's +ofborg. It runs alongside GitHub Actions to provide: + +### Capabilities + +1. **Automatic change detection** — Detects changed sub-projects in PRs based + on file paths and commit scopes +2. **Native build system execution** — Builds each project using its own build + system (CMake, Meson, Make, Cargo, Gradle, Autotools) +3. **Multi-platform support** — Builds on 7 platform/architecture combinations +4. 
**GitHub integration** — Posts results as check runs and PR comments + +### Platform Matrix + +| Platform | Runner | Architecture | +|----------|--------|-------------| +| `x86_64-linux` | `ubuntu-latest` | x86_64 | +| `aarch64-linux` | `ubuntu-24.04-arm` | ARM64 | +| `x86_64-darwin` | `macos-15` | x86_64 | +| `aarch64-darwin` | `macos-15` | Apple Silicon | +| `x86_64-windows` | `windows-2025` | x86_64 | +| `aarch64-windows` | `windows-2025` | ARM64 | +| `x86_64-freebsd` | `ubuntu-latest` (VM) | x86_64 | + +### Bot Commands + +tickborg responds to `@tickbot` commands in PR comments: + +``` +@tickbot build meshmc neozip cmark # Build specific projects +@tickbot test meshmc # Run tests for a project +@tickbot eval # Full evaluation (detect + label) +``` + +### WIP Suppression + +PRs with titles starting with `WIP:` or containing `[WIP]` suppress +automatic builds. + +### Commit-Based Triggers + +tickborg reads Conventional Commits scopes to determine builds: + +| Commit Message | Triggered Build | +|---------------|-----------------| +| `feat(meshmc): add chunk loading` | meshmc | +| `fix(neozip): handle empty archives` | neozip | +| `cmark: fix buffer overflow` | cmark | +| `chore(ci): update workflow` | (no build) | + +--- + +## CI Tooling (ci/ Directory) + +### Directory Structure + +``` +ci/ +├── OWNERS # Code ownership +├── README.md # CI documentation +├── default.nix # Nix CI entry point +├── pinned.json # Pinned Nixpkgs revision +├── update-pinned.sh # Update pinned.json +├── supportedBranches.js # Branch classification +├── codeowners-validator/ +│ ├── default.nix # Build codeowners-validator +│ ├── owners-file-name.patch # Patch for file naming +│ └── permissions.patch # Patch for permissions +└── github-script/ + ├── run # CLI entry (local testing) + ├── lint-commits.js # Conventional Commits linter + ├── prepare.js # PR preparation/validation + ├── reviews.js # Review state management + ├── get-pr-commit-details.js # Extract PR commit info + ├── 
withRateLimit.js # GitHub API rate limiting + ├── package.json # npm dependencies + └── shell.nix # Nix development shell +``` + +### treefmt Configuration + +The CI `default.nix` configures treefmt with these formatters: + +| Formatter | Language/Format | Settings | +|-----------|----------------|----------| +| `actionlint` | GitHub Actions YAML | Default | +| `biome` | JavaScript/TypeScript | Single quotes, no semicolons, editorconfig | +| `keep-sorted` | Any (annotated blocks) | Default | +| `nixfmt` | Nix | RFC style | +| `yamlfmt` | YAML | Retain line breaks | +| `zizmor` | GitHub Actions security | Default | + +Files matching `*.min.js` are excluded from biome formatting. + +### Pinned Nixpkgs + +`ci/pinned.json` contains content-addressed references to: +- `nixpkgs` — The Nixpkgs revision used for CI tools +- `treefmt-nix` — The treefmt-nix revision + +Updated via: +```bash +./ci/update-pinned.sh +``` + +### Local CI Testing + +CI scripts can be tested locally: + +```bash +cd ci/github-script +nix-shell # or: nix develop +gh auth login +./run lint-commits +./run prepare +``` + +--- + +## Docker Images (images4docker) + +### Purpose + +The `images4docker/` directory provides 40 Dockerfiles for building MeshMC +across different Linux distributions and versions. Each image includes the +Qt 6 toolchain and all MeshMC build dependencies. + +### Image Registry + +Images are published to: +``` +ghcr.io/project-tick-infra/images/: +``` + +### Build Schedule + +The `images4docker-build.yml` workflow runs: +- On push to `main` (when Dockerfiles, workflow, or README change) +- On a daily schedule at **03:17 UTC** + +Currently 35 targets are actively built (Qt6-compatible set). 
+ +### Supported Package Managers + +| Package Manager | Distributions | +|----------------|---------------| +| `apt` | Debian, Ubuntu | +| `dnf` | Fedora, RHEL, CentOS | +| `apk` | Alpine | +| `zypper` | openSUSE, SLES | +| `yum` | Older CentOS/RHEL | +| `pacman` | Arch Linux | +| `xbps` | Void Linux | +| `nix` | NixOS | +| `emerge` | Gentoo | + +### Qt 6 Requirement + +Qt 6 is **mandatory** for all images. If Qt 6 packages are unavailable on a +given distribution, the Docker build fails intentionally — there is no Qt 5 +fallback. This ensures all CI builds use a consistent Qt version. + +--- + +## Security Scanning + +### CodeQL + +CodeQL analysis runs for security-critical components: + +| Component | Schedule | Languages | +|-----------|----------|-----------| +| meshmc | Per-PR, scheduled | C++ | +| mnv | Per-PR, scheduled | C | +| neozip | Per-PR, scheduled | C | + +### Fuzz Testing + +Continuous fuzz testing for parser and compression libraries: + +| Component | Infrastructure | Workflow | +|-----------|---------------|----------| +| neozip | OSS-Fuzz + custom | `neozip-fuzz.yml` | +| json4cpp | OSS-Fuzz + custom | `json4cpp-fuzz.yml` | +| cmark | Custom fuzzers | `cmark-fuzz.yml` | +| tomlplusplus | Custom fuzzers | `tomlplusplus-fuzz.yml` | + +### Static Analysis + +| Tool | Component | Workflow | +|------|-----------|----------| +| Semgrep | json4cpp | `json4cpp-semgrep.yml` | +| Flawfinder | json4cpp | `json4cpp-flawfinder.yml` | +| Coverity | mnv | `mnv-coverity.yml` | +| clang-tidy | meshmc | Via `MeshMC_ENABLE_CLANG_TIDY` | + +### Dependency Review + +`repo-dependency-review.yml` scans dependency changes in PRs for known +vulnerabilities using GitHub's dependency review action. + +### OpenSSF Scorecard + +`repo-scorecards.yml` tracks the project's OpenSSF Scorecard score, measuring +security practices across dimensions like branch protection, dependency +updates, fuzzing, and signed releases. + +--- + +## Release Pipeline + +### MeshMC Releases + +1. 
A release tag is pushed (e.g., `7.0.0`) +2. `ci.yml` detects `is_release_tag` and dispatches release workflows +3. `meshmc-release.yml`: + - Builds release binaries for all platforms + - Creates GitHub release with changelog + - Uploads platform-specific artifacts +4. `meshmc-publish.yml`: + - Publishes artifacts to distribution channels + +### NeoZip Releases + +Similar tag-triggered flow via `neozip-release.yml`. + +### Documentation Deployment + +- `tomlplusplus-gh-pages.yml` — Deploys toml++ documentation to GitHub Pages +- `json4cpp-publish-docs.yml` — Publishes json4cpp API documentation + +--- + +## Branch Classification + +The `ci/supportedBranches.js` module classifies branches for CI decisions: + +```javascript +const typeConfig = { + master: ['development', 'primary'], + release: ['development', 'primary'], + staging: ['development', 'secondary'], + 'staging-next': ['development', 'secondary'], + feature: ['wip'], + fix: ['wip'], + backport: ['wip'], + revert: ['wip'], + wip: ['wip'], + dependabot: ['wip'], +} +``` + +Branch ordering (for base branch detection): +```javascript +const orderConfig = { + master: 0, // Highest priority + release: 1, + staging: 2, + 'staging-next': 3, +} +``` + +The `classify()` function parses branch names to extract: +- `prefix` — Branch type prefix +- `version` — Optional version number (e.g., `7.0`) +- `stable` — Whether the branch has a version (release branch) +- `type` — Classification from `typeConfig` +- `order` — Priority for base branch detection + +--- + +## DCO Enforcement + +The `.github/dco.yml` configuration: + +```yaml +allowRemediationCommits: + individual: false +``` + +This means: +- Every commit must have a `Signed-off-by` tag +- Remediation commits (adding sign-off after the fact) are **not** allowed +- Contributors must either sign off each commit individually or use + `git rebase --signoff` to retroactively sign all commits + +--- + +## Environment Variables + +### Shared CI Environment + +```yaml 
+env: + CI: true + FORCE_ALL: ${{ github.event.inputs.force-all == 'true' || github.event_name == 'merge_group' }} +``` + +### Per-Workflow Variables + +Individual workflows may set additional variables specific to their build +systems (CMake flags, Cargo features, Gradle properties, etc.). + +--- + +## Monitoring and Diagnostics + +### CI Health Indicators + +| Indicator | Source | +|-----------|--------| +| Build status badge | `meshmc-build.yml` badge in README | +| OpenSSF Scorecard | `repo-scorecards.yml` | +| Code coverage | Per-project coverage workflows | +| Dependency freshness | Dependabot/Renovate alerts | +| Stale issue count | `repo-stale.yml` | + +### Debugging Failed Builds + +1. Check the GitHub Actions run log +2. Identify which job failed (gate, lint, or per-project build) +3. For commit lint failures: fix commit message format +4. For build failures: reproduce locally using the same build system +5. For formatting failures: run treefmt locally via `nix develop -f ci/` diff --git a/docs/handbook/Project-Tick/coding-standards.md b/docs/handbook/Project-Tick/coding-standards.md new file mode 100644 index 0000000000..581e9b5cca --- /dev/null +++ b/docs/handbook/Project-Tick/coding-standards.md @@ -0,0 +1,558 @@ +# Project Tick — Coding Standards + +## Overview + +Project Tick spans multiple programming languages across its sub-projects. +This document defines the coding standards for each language used in the +monorepo. While each sub-project follows conventions appropriate to its +upstream lineage, these standards provide the organizational baseline. 
+ +--- + +## C Style (neozip, cmark, genqrcode, cgit, corebinutils, mnv) + +### General Principles + +- Follow the existing style of the upstream codebase you are modifying +- Use C11 as the minimum standard (C23 preferred for new code in meshmc) +- Keep functions short and focused +- Prefer stack allocation over heap allocation where possible + +### Formatting + +- **Indentation:** 4 spaces for neozip, cmark, genqrcode; tabs for cgit, mnv +- **Line length:** 80 characters preferred, 120 maximum +- **Braces:** K&R style (opening brace on same line for functions in some + components, next line in others — follow the file you are editing) +- **Spaces:** Space after keywords (`if`, `for`, `while`, `switch`), no space + before parentheses in function calls + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Functions | `snake_case` | `compress_block()` | +| Variables | `snake_case` | `block_size` | +| Constants | `UPPER_SNAKE_CASE` | `MAX_BUFFER_SIZE` | +| Macros | `UPPER_SNAKE_CASE` | `ZLIB_VERSION` | +| Types (struct/enum) | `snake_case` or `PascalCase` (follow upstream) | `z_stream`, `inflate_state` | +| Struct members | `snake_case` | `total_out` | + +### Memory Management + +- Always check `malloc`/`calloc`/`realloc` return values +- Free resources in reverse order of allocation +- Use `sizeof(*ptr)` instead of `sizeof(type)` for allocation +- Avoid variable-length arrays (VLAs) — use heap allocation instead + +### Error Handling + +- Return error codes (negative values for errors, zero for success) +- Document error codes in function headers +- Avoid `goto` except for error-cleanup patterns + +### neozip-Specific + +NeoZip follows zlib-ng conventions: +- Maintain zlib API compatibility +- Use SIMD intrinsics via architecture-specific files in `arch/` +- Guard intrinsics with appropriate CPU feature checks +- Use `zng_` prefixed names for native API functions + +### corebinutils-Specific + +CoreBinUtils follows FreeBSD kernel 
style: +- Tabs for indentation +- BSD `err()` / `warn()` family for error reporting +- `POSIX.1-2008` compliance where feasible +- musl-compatible system calls + +### cgit-Specific + +cgit follows its own established style: +- Tabs for indentation +- Functions prefixed with module name (e.g., `ui_log_`, `cache_`) +- HTML output via `html()`, `htmlf()`, `html_attr()` functions + +### mnv-Specific + +MNV follows Vim coding conventions: +- Tabs for indentation +- VimScript naming conventions for script functions +- `FEAT_*` macros for feature gating +- Descriptive function names with module prefixes + +--- + +## C++ Style (meshmc, json4cpp, tomlplusplus, libnbtplusplus) + +### General Principles + +- Use modern C++ idioms (RAII, smart pointers, range-based for, etc.) +- Prefer `std::string_view` over `const char*` for read-only strings +- Prefer `std::optional` over sentinel values +- Avoid raw `new`/`delete` — use smart pointers or containers +- Use `auto` judiciously — prefer explicit types for public APIs + +### C++ Standard by Component + +| Component | Standard | Reason | +|-----------|----------|--------| +| meshmc | C++23 | Active development, modern compiler requirement | +| json4cpp | C++11 (minimum), C++17 (full features) | Wide compatibility | +| tomlplusplus | C++17 | Modern features, wide support | +| libnbtplusplus | C++11 | Compatibility with older compilers | + +### Formatting (meshmc) + +MeshMC uses `clang-format` for automated formatting. The `.clang-format` +file at `meshmc/.clang-format` defines the canonical style. 
Key settings: + +- **Indentation:** 4 spaces +- **Line length:** 120 characters +- **Braces:** Allman style (braces on their own lines) for functions; + attached for control flow +- **Pointer alignment:** Left-aligned (`int* ptr`, not `int *ptr`) +- **Include ordering:** Sorted, grouped by category + +Always run clang-format before committing: + +```bash +clang-format -i path/to/file.cpp +``` + +### Naming (meshmc) + +| Element | Convention | Example | +|---------|-----------|---------| +| Classes | `PascalCase` | `InstanceList` | +| Methods | `camelCase` | `loadInstance()` | +| Member variables | `m_camelCase` | `m_instanceList` | +| Static members | `s_camelCase` | `s_instance` | +| Local variables | `camelCase` | `blockSize` | +| Constants | `UPPER_SNAKE_CASE` or `k_PascalCase` | `MAX_RETRIES` | +| Namespaces | `PascalCase` or `lowercase` | `Application`, `net` | +| Template params | `PascalCase` | `typename ValueType` | +| Enum values | `PascalCase` | `LoadState::Ready` | +| Files | `PascalCase` | `InstanceList.cpp`, `InstanceList.h` | + +### Headers + +- Use `#pragma once` instead of include guards +- Include what you use (IWYU principle) +- Forward-declare when possible to reduce compile times +- Order includes: own header first, project headers, third-party, standard + +```cpp +#include "InstanceList.h" // Own header + +#include "Application.h" // Project headers +#include "FileSystem.h" + +#include // Third-party +#include + +#include // Standard library +#include +#include +``` + +### Qt Conventions (meshmc) + +- Use Qt container types only when interfacing with Qt APIs +- Prefer `std::` containers for internal logic +- Use `Q_OBJECT` macro for all QObject subclasses +- Use signal/slot connections with lambda syntax +- Use `QStringLiteral()` for string literals in Qt contexts +- Follow Qt naming: signals as verbs (`clicked()`), slots as verb-phrases + (`handleClicked()`) + +### Error Handling (meshmc) + +- Use exceptions for truly exceptional conditions +- 
Use `std::optional` for expected absence of values +- Use result types for operations that can fail in expected ways +- Log errors with Qt's logging categories (`QLoggingCategory`) + +### Smart Pointer Usage + +```cpp +// Owned heap objects +std::unique_ptr<Instance> instance = std::make_unique<Instance>(); + +// Shared ownership +std::shared_ptr settings = std::make_shared(); + +// Non-owning observation (prefer raw pointer or reference) +Instance* observer = instance.get(); + +// Qt parent-child ownership +auto* widget = new QWidget(parent); // Qt manages lifetime +``` + +### json4cpp-Specific + +json4cpp follows nlohmann/json conventions: +- Header-only library +- Heavy template metaprogramming +- ADL-based serialization (`to_json`/`from_json`) +- Namespace: `nlohmann` + +### tomlplusplus-Specific + +toml++ follows its own established conventions: +- Header-only by default +- Namespace: `toml` +- Works without RTTI or exceptions +- C++17 with optional C++20 features + +### libnbtplusplus-Specific + +libnbt++ uses C++11: +- Tag types as class hierarchy (`nbt::tag_compound`, `nbt::tag_list`, etc.)
+- Stream-based I/O +- Namespace: `nbt` + +--- + +## Rust Style (tickborg) + +### General Principles + +- Follow the [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/) +- Use `rustfmt` for formatting (default configuration) +- Use `clippy` for linting +- Handle errors with `Result` — avoid `unwrap()` in library code + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Crates | `snake_case` | `tickborg` | +| Modules | `snake_case` | `build_runner` | +| Types | `PascalCase` | `BuildResult` | +| Functions | `snake_case` | `run_build()` | +| Constants | `UPPER_SNAKE_CASE` | `MAX_RETRIES` | +| Traits | `PascalCase` | `Buildable` | +| Enum variants | `PascalCase` | `Status::Success` | + +### Cargo Workspace + +The tickborg Cargo workspace uses `resolver = "2"` with two crates: + +```toml +[workspace] +members = [ + "tickborg", + "tickborg-simple-build" +] +resolver = "2" + +[profile.release] +debug = true # Debug info in release builds +``` + +### Error Handling + +```rust +// Use anyhow for application errors +use anyhow::{Context, Result}; + +fn process_pr(pr: &PullRequest) -> Result { + let changes = detect_changes(pr) + .context("failed to detect changed projects")?; + // ... 
+} + +// Use thiserror for library errors +use thiserror::Error; + +#[derive(Debug, Error)] +enum BuildError { + #[error("build failed for {project}: {reason}")] + BuildFailed { project: String, reason: String }, + #[error("project not found: {0}")] + ProjectNotFound(String), +} +``` + +--- + +## Java Style (forgewrapper) + +### General Principles + +- Follow standard Java conventions (Oracle / Google style) +- Target JDK 17 as the minimum +- Use JPMS (Java Platform Module System) where applicable + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Packages | `lowercase.dotted` | `io.github.zekerzhayard.forgewrapper` | +| Classes | `PascalCase` | `ForgeWrapper` | +| Interfaces | `PascalCase` (often prefixed with `I`) | `IFileDetector` | +| Methods | `camelCase` | `detectFiles()` | +| Constants | `UPPER_SNAKE_CASE` | `FORGE_VERSION` | +| Local variables | `camelCase` | `installerPath` | + +### Gradle Build + +ForgeWrapper uses Gradle with the wrapper script (`gradlew`/`gradlew.bat`): + +```bash +./gradlew build # Build +./gradlew test # Test +./gradlew clean # Clean +``` + +### Service Provider Interface + +ForgeWrapper uses Java SPI for extension: + +```java +// Service interface +public interface IFileDetector { + // Custom file detection logic +} + +// Registration via META-INF/services/ +// META-INF/services/io.github.zekerzhayard.forgewrapper.installer.detector.IFileDetector +``` + +--- + +## Python Style (meta) + +### General Principles + +- Follow [PEP 8](https://peps.python.org/pep-0008/) +- Target Python 3.10+ (as specified in `pyproject.toml`) +- Use type hints for function signatures +- Use Poetry for dependency management + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Modules | `snake_case` | `generate_forge.py` | +| Functions | `snake_case` | `update_mojang()` | +| Classes | `PascalCase` | `VersionList` | +| Constants | `UPPER_SNAKE_CASE` | `DEPLOY_TO_GIT` | +| Variables | 
`snake_case` | `version_data` | + +### Dependencies + +From `pyproject.toml`: + +```toml +[tool.poetry.dependencies] +python = ">=3.10,<4.0" +cachecontrol = "^0.14.0" +requests = "^2.31.0" +filelock = "^3.20.3" +packaging = "^25.0" +pydantic = "^1.10.13" +``` + +### CLI Entry Points + +Meta provides Poetry scripts for each operation: + +```bash +poetry run generateFabric +poetry run generateForge +poetry run generateNeoForge +poetry run generateQuilt +poetry run generateMojang +poetry run generateJava +poetry run updateFabric +poetry run updateForge +# ... etc. +``` + +--- + +## Shell Script Style (bootstrap.sh, hooks, CI scripts) + +### General Principles + +- Use Bash for complex scripts, POSIX sh for simple ones +- Start with `#!/usr/bin/env bash` +- Enable strict mode: `set -euo pipefail` +- Use `shellcheck` for linting + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Functions | `snake_case` | `detect_distro()` | +| Local variables | `snake_case` | `distro_id` | +| Environment variables | `UPPER_SNAKE_CASE` | `DISTRO_ID` | +| Constants | `UPPER_SNAKE_CASE` | `RED`, `GREEN`, `NC` | + +### Error Handling + +Use colored output functions as defined in `bootstrap.sh`: + +```bash +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +CYAN='\033[0;36m' +NC='\033[0m' + +info() { printf "${CYAN}[INFO]${NC} %s\n" "$*"; } +ok() { printf "${GREEN}[ OK ]${NC} %s\n" "$*"; } +warn() { printf "${YELLOW}[WARN]${NC} %s\n" "$*"; } +err() { printf "${RED}[ERR]${NC} %s\n" "$*" >&2; } +``` + +### Best Practices + +- Quote all variable expansions: `"$var"` not `$var` +- Use `[[ ]]` for conditional tests (Bash) +- Use `$()` for command substitution (not backticks) +- Use `local` for function-scoped variables +- Check command existence with `command -v` + +--- + +## JavaScript / Node.js Style (CI scripts) + +### General Principles + +The CI scripts in `ci/github-script/` follow the formatting enforced by +`biome` (configured in 
`ci/default.nix`): + +- **Quotes:** Single quotes +- **Semicolons:** None (ASI) +- **Indentation:** 2 spaces + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Functions | `camelCase` | `lintCommits()` | +| Variables | `camelCase` | `prNumber` | +| Constants | `UPPER_SNAKE_CASE` or `camelCase` | `MAX_RETRIES` | +| Files | `kebab-case` | `lint-commits.js` | +| Modules | `camelCase` exports | `module.exports = { classify }` | + +--- + +## Nix Style (flake.nix, ci/default.nix, deployment modules) + +### General Principles + +- Format with `nixfmt` (RFC style, as configured in CI) +- Use `let ... in` for local bindings +- Prefer attribute sets over positional arguments +- Pin all inputs with content hashes + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Attributes | `camelCase` | `devShells`, `nixpkgsFor` | +| Packages | `kebab-case` | `clang-tidy-diff` | +| Variables | `camelCase` | `forAllSystems` | +| Functions | `camelCase` | `mkShell` | + +--- + +## CMake Style (meshmc, libraries) + +### General Principles + +- Minimum CMake version: 3.28 (meshmc), 3.15 (libnbtplusplus) +- Use `target_*` commands instead of directory-level `include_directories()` +- Export compile commands: `CMAKE_EXPORT_COMPILE_COMMANDS ON` +- Use `find_package()` for external dependencies + +### Naming + +| Element | Convention | Example | +|---------|-----------|---------| +| Variables | `PascalCase` or `UPPER_CASE` | `MeshMC_VERSION_MAJOR` | +| Options | `UPPER_SNAKE_CASE`, optionally with a project prefix | `ENABLE_LTO`, `NBT_BUILD_SHARED`, `MeshMC_ENABLE_CLANG_TIDY` | +| Targets | `PascalCase` | `MeshMC`, `nbt++` | +| Functions | `snake_case` | `query_qmake()` | + +### CMake Options (meshmc) + +| Option | Default | Description | +|--------|---------|-------------| +| `ENABLE_LTO` | OFF | Enable Link Time Optimization | +| `MeshMC_DISABLE_JAVA_DOWNLOADER` | OFF | Disable built-in Java downloader | +| `MeshMC_ENABLE_CLANG_TIDY` | OFF | Enable clang-tidy during compilation | + +---
+ +## Dockerfile Style (images4docker) + +### General Principles + +- One base image per Dockerfile +- Use multi-stage builds where appropriate +- Minimize layer count +- Validate Qt 6 availability during build (fail fast) +- Pin base image tags + +### Naming + +- Dockerfiles: `.Dockerfile` +- Image tags: `ghcr.io/project-tick-infra/images/:` + +--- + +## Cross-Language Standards + +### Git Commit Messages + +All languages follow the same Conventional Commits format (see +[contributing.md](contributing.md)): + +``` +type(scope): short description +``` + +### SPDX Headers + +All source files should include SPDX headers appropriate to their comment +syntax: + +```c +// SPDX-License-Identifier: GPL-3.0-or-later +// SPDX-FileCopyrightText: 2026 Project Tick +``` + +```python +# SPDX-License-Identifier: MS-PL +# SPDX-FileCopyrightText: 2026 Project Tick +``` + +```rust +// SPDX-License-Identifier: MIT +// SPDX-FileCopyrightText: NixOS Contributors & Project Tick +``` + +### Static Analysis + +| Language | Tool | Integration | +|----------|------|-------------| +| C/C++ | clang-tidy | `MeshMC_ENABLE_CLANG_TIDY`, CI | +| C/C++ | clang-format | Pre-commit hook, CI | +| C/C++ | checkpatch.pl | Pre-commit hook | +| C/C++ | CodeQL | CI workflows | +| C/C++ | Coverity | CI (mnv only) | +| C/C++ | Flawfinder | CI (json4cpp) | +| C/C++ | Semgrep | CI (json4cpp) | +| Rust | clippy | `cargo clippy` | +| Rust | rustfmt | `cargo fmt` | +| Python | (standard linters) | Poetry dev deps | +| JavaScript | biome | treefmt CI | +| Nix | nixfmt | treefmt CI | +| YAML | yamlfmt | treefmt CI | +| GitHub Actions | actionlint, zizmor | treefmt CI | diff --git a/docs/handbook/Project-Tick/contributing.md b/docs/handbook/Project-Tick/contributing.md new file mode 100644 index 0000000000..fa967d7116 --- /dev/null +++ b/docs/handbook/Project-Tick/contributing.md @@ -0,0 +1,545 @@ +# Project Tick — Contributing Guide + +## Overview + +Project Tick welcomes contributions from the community. 
This guide covers the +full contribution lifecycle — from setting up your environment to getting your +changes merged. It applies to all sub-projects within the monorepo. + +All contributions are subject to the Project Tick Contributor License Agreement +(CLA), the Developer Certificate of Origin (DCO), and the project's Code of +Conduct. + +--- + +## Table of Contents + +1. [Quick Start](#quick-start) +2. [AI Policy](#ai-policy) +3. [Contributor License Agreement](#contributor-license-agreement) +4. [Developer Certificate of Origin](#developer-certificate-of-origin) +5. [Commit Message Format](#commit-message-format) +6. [Branch Naming](#branch-naming) +7. [PR Workflow](#pr-workflow) +8. [Code Review Process](#code-review-process) +9. [Issue Templates](#issue-templates) +10. [PR Requirements Checklist](#pr-requirements-checklist) +11. [What Not to Do](#what-not-to-do) +12. [Documentation](#documentation) +13. [Contact](#contact) + +--- + +## Quick Start + +```bash +# 1. Fork and clone +git clone --recursive https://github.com/YOUR_USERNAME/Project-Tick.git +cd Project-Tick + +# 2. Create a branch +git checkout -b feat/my-change + +# 3. Make changes, format, and lint +clang-format -i changed_files.cpp +reuse lint + +# 4. Commit with sign-off +git commit -s -a -m "feat(meshmc): add new feature" + +# 5. Push and open a PR +git push origin feat/my-change +``` + +--- + +## AI Policy + +Project Tick has strict rules regarding generative AI usage. This policy is +adapted from matplotlib's contributing guide and the Linux Kernel policy guide. + +### Rules + +1. **Do not post raw AI output** as comments on GitHub or the project's Discord + server. Such comments are typically formulaic and low-quality. + +2. **If you use AI tools** to develop code or documentation, you must: + - Fully understand the proposed changes + - Be able to explain why they are the correct approach + - Add personal value based on your own competency + +3. 
**AI-generated low-value contributions will be rejected.** Taking input, + feeding it to an AI, and posting the result without adding value is not + acceptable. + +### Signed-off-by and AI + +**AI agents MUST NOT add `Signed-off-by` tags.** Only humans can legally +certify the Developer Certificate of Origin. The human submitter is responsible +for: + +- Reviewing all AI-generated code +- Ensuring compliance with licensing requirements +- Adding their own `Signed-off-by` tag +- Taking full responsibility for the contribution + +### AI Attribution + +When AI tools contribute to development, include an `Assisted-by` tag in the +commit message: + +``` +Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2] +``` + +Where: +- `AGENT_NAME` — Name of the AI tool or framework +- `MODEL_VERSION` — Specific model version used +- `[TOOL1] [TOOL2]` — Optional specialized analysis tools (e.g., coccinelle, + sparse, smatch, clang-tidy) + +Basic development tools (git, gcc, make, editors) should **not** be listed. + +Example: + +``` +feat(meshmc): optimize chunk loading algorithm + +Improved chunk loading performance by 40% using spatial hashing. + +Signed-off-by: Jane Developer +Assisted-by: Claude:claude-3-opus coccinelle sparse +``` + +--- + +## Contributor License Agreement + +By submitting a contribution, you agree to the **Project Tick Contributor +License Agreement (CLA)**. 
+ +The CLA ensures that: + +- You have the legal right to submit the contribution +- The contribution does not knowingly infringe third-party rights +- Project Tick may distribute the contribution under the applicable license(s) +- Long-term governance and license consistency can be maintained + +The CLA applies to **all intentional contributions**, including: +- Source code +- Documentation +- Tests +- Data +- Media assets +- Configuration files + +The full CLA text is available at: + + +**If you do not agree to the CLA, do not submit contributions.** + +--- + +## Developer Certificate of Origin + +Every commit in Project Tick must include a DCO sign-off line. The sign-off +certifies that you wrote the code or have the right to submit it under the +project's licenses. + +### How to Sign Off + +Add the `-s` flag to `git commit`: + +```bash +git commit -s -a +``` + +This appends the following line to your commit message: + +``` +Signed-off-by: Your Name +``` + +### Retroactive Sign-Off + +If you forgot to sign off, you can retroactively sign all commits in your +branch by rebasing onto the branch you started from: + +```bash +git rebase --signoff master +git push --force-with-lease +``` + +### DCO Bot Enforcement + +A DCO bot automatically checks every PR. PRs missing sign-off will be +labeled and blocked from merging. The bot configuration +(`.github/dco.yml`) does not allow remediation commits — every commit +in the PR must have a sign-off. + +```yaml +# .github/dco.yml +allowRemediationCommits: + individual: false +``` + +### Important Distinction + +**Signing** (GPG/SSH signatures) and **signing-off** (DCO `Signed-off-by`) are +two different things. The DCO sign-off is the minimum requirement. GPG signing +is recommended but not required. + +--- + +## Commit Message Format + +Project Tick uses [Conventional Commits](https://www.conventionalcommits.org/) +format. The CI system validates commit messages via +`ci/github-script/lint-commits.js`. 
+ +### Format + +``` +type(scope): short description + +Optional longer explanation of what changed and why. + +Signed-off-by: Your Name +``` + +### Types + +| Type | Description | +|------|-------------| +| `feat` | New feature | +| `fix` | Bug fix | +| `docs` | Documentation only | +| `style` | Formatting, whitespace (no code change) | +| `refactor` | Code restructuring (no feature/fix) | +| `perf` | Performance improvement | +| `test` | Adding or updating tests | +| `build` | Build system or dependency changes | +| `ci` | CI configuration changes | +| `chore` | Maintenance tasks | +| `revert` | Reverting a previous commit | + +### Scopes + +The scope identifies which sub-project is affected: + +| Scope | Sub-Project | +|-------|-------------| +| `meshmc` | MeshMC launcher | +| `mnv` | MNV text editor | +| `cgit` | cgit web interface | +| `neozip` | NeoZip compression library | +| `json4cpp` | Json4C++ JSON library | +| `tomlplusplus` | toml++ TOML library | +| `libnbt` | libnbt++ NBT library | +| `cmark` | cmark Markdown library | +| `genqrcode` | GenQRCode QR library | +| `forgewrapper` | ForgeWrapper Java shim | +| `corebinutils` | CoreBinUtils BSD ports | +| `meta` | Metadata generator | +| `tickborg` | tickborg CI bot | +| `ci` | CI infrastructure | +| `docker` | images4docker | +| `docs` | Documentation | + +For changes spanning the component's sub-structure, add a nested scope: + +``` +projtlauncher(fix): fix crash on startup with invalid config +``` + +### Examples + +``` +feat(meshmc): add chunk loading optimization +fix(neozip): handle empty archives in inflate +docs(cmark): fix API reference typo +ci(json4cpp): update build matrix for ARM64 +build(tomlplusplus): bump meson minimum to 0.60 +refactor(corebinutils): simplify ls output formatting +test(libnbt): add round-trip test for compressed NBT +chore(meta): update poetry.lock +``` + +### tickborg Integration + +The tickborg CI bot reads commit scopes to determine which sub-projects +to build: + +| 
Commit Message | Auto-Build | +|---------------|------------| +| `feat(meshmc): add chunk loading` | meshmc | +| `cmark: fix buffer overflow` | cmark | +| `fix(neozip): handle empty archives` | neozip | +| `chore(ci): update workflow` | (CI changes only) | + +--- + +## Branch Naming + +Use the following branch name prefixes: + +| Prefix | Purpose | Example | +|--------|---------|---------| +| `feature-*` or `feat/*` | New features | `feature-chunk-loading` | +| `fix-*` or `fix/*` | Bug fixes | `fix-crash-on-startup` | +| `backport-*` | Cherry-picks to release | `backport-7.0-fix-123` | +| `revert-*` | Reverted changes | `revert-pr-456` | +| `wip-*` or `wip/*` | Work in progress | `wip-new-ui` | + +Development branches managed by maintainers: + +| Branch | Purpose | +|--------|---------| +| `master` | Main development branch | +| `release-X.Y` | Release stabilization (e.g., `release-7.0`) | +| `staging-*` | Pre-release staging | +| `staging-next-*` | Next staging cycle | + +### WIP Convention + +If a PR title begins with `WIP:` or contains `[WIP]`, the tickborg bot +will **not** automatically build its affected projects. This lets you +push incomplete work for early review without triggering full CI. + +--- + +## PR Workflow + +### Step-by-Step + +1. **Fork** the repository on GitHub + +2. **Clone** your fork with submodules: + ```bash + git clone --recursive https://github.com/YOUR_USERNAME/Project-Tick.git + ``` + +3. **Set up upstream remote:** + ```bash + git remote add upstream https://github.com/Project-Tick/Project-Tick.git + ``` + +4. **Create a feature branch:** + ```bash + git fetch upstream + git checkout -b feature/my-change upstream/master + ``` + +5. **Develop your changes:** + - Write code + - Add tests for new functionality + - Update documentation if needed + - Run clang-format on changed C/C++ files + - Check REUSE compliance + +6. **Commit with sign-off:** + ```bash + git add -A + git commit -s -m "feat(scope): description" + ``` + +7. 
**Push to your fork:** + ```bash + git push origin feature/my-change + ``` + +8. **Open a PR** against `master` on the upstream repository + +9. **Fill in the PR template** — The template reminds you to: + - Sign off commits + - Sign the CLA + - Provide a clear description + +### Keeping Your Branch Updated + +```bash +git fetch upstream +git rebase upstream/master +git push --force-with-lease origin feature/my-change +``` + +--- + +## Code Review Process + +### Automated Checks + +Every PR goes through automated CI: + +1. **Gate job** — Event classification and change detection +2. **Commit lint** — Validates Conventional Commits format +3. **Formatting check** — treefmt validates code style +4. **CODEOWNERS validation** — Ensures proper ownership rules +5. **Per-project CI** — Builds and tests affected sub-projects +6. **CodeQL analysis** — Security scanning (for meshmc, mnv, neozip) +7. **DCO check** — Verifies all commits are signed off + +### Maintainer Review + +After automated checks pass: + +1. A maintainer reviews the code for: + - Correctness + - Design and architecture fit + - Test coverage + - Documentation completeness + - License compliance + +2. The maintainer may request changes by: + - Leaving inline comments + - Requesting specific modifications + - Asking clarifying questions + +3. Address all feedback by pushing additional commits or amending existing + ones. Sign off every commit. + +4. Once approved, the maintainer merges the PR. + +### Review Routing + +The `CODEOWNERS` file routes reviews automatically. 
All paths are currently +owned by `@YongDo-Hyun`, covering: + +- `.github/` — Actions, templates, workflows +- `archived/` — All archived sub-projects +- `cgit/` — Including contrib, filters, tests +- `cmark/` — Including all subdirectories +- `corebinutils/` — All utility directories +- Every other sub-project directory + +--- + +## Issue Templates + +Project Tick provides structured issue templates in `.github/ISSUE_TEMPLATE/`: + +### Bug Report (`bug_report.yml`) + +Fields: +- **Operating System** — Windows, macOS, Linux, Other (multi-select) +- **Version of MeshMC** — Text field for version number +- Steps to reproduce +- Expected vs actual behavior +- Logs/crash reports + +Before filing a bug, check: +- The [FAQ](https://github.com/Project-Tick/MeshMC/wiki/FAQ) +- That the bug is not caused by Minecraft or mods +- That the issue hasn't been reported before + +### Suggestion (`suggestion.yml`) + +For feature requests and improvements. + +### RFC (`rfc.yml`) + +For larger architectural proposals that need discussion before implementation. + +### Configuration (`config.yml`) + +Controls which templates appear and provides links to external resources (e.g., +Discord for general help questions). 
+ +--- + +## PR Requirements Checklist + +Before submitting a PR, verify: + +- [ ] Code compiles without warnings +- [ ] clang-format applied to changed C/C++ files +- [ ] All existing tests pass +- [ ] New tests added for new functionality +- [ ] All commits signed off (`git commit -s`) +- [ ] Commit messages follow Conventional Commits format +- [ ] Documentation updated if needed +- [ ] REUSE compliance verified (`reuse lint`) +- [ ] Clear PR description explaining what and why +- [ ] Related issues referenced +- [ ] One logical change per PR + +### What Must Be Separate PRs + +The following must **never** be combined in a single PR: + +- **Refactors** — Code restructuring without behavior change +- **Features** — New functionality +- **Third-party updates** — Library/dependency version bumps + +Third-party library updates require standalone PRs with documented rationale +explaining why the update is needed. + +--- + +## What Not to Do + +1. **Don't mix refactors with features.** Each PR should contain one logical + change. + +2. **Don't skip sign-off.** The DCO bot will block your PR. + +3. **Don't post raw AI output.** All contributions must reflect genuine + understanding and personal competence. + +4. **Don't submit without testing.** Run the test suite for affected + sub-projects. + +5. **Don't ignore CI failures.** Fix them before requesting review. + +6. **Don't force-push to shared branches.** Only force-push to your own + feature branches. + +7. **Don't submit changes without REUSE compliance.** Every new file needs + SPDX headers. 
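Two of the checklist items above (sign-off on every commit and Conventional Commits formatting) are easy to check locally before pushing. The following is a minimal sketch of such a check, not the project's actual linter (`ci/github-script/lint-commits.js`); the function name `check_commit_message`, the regular expressions, and the decision to treat the scope as optional are assumptions made for illustration.

```python
import re

# Types taken from the "Types" table in this guide.
ALLOWED_TYPES = {
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert",
}

# "type(scope): description"; the scope is treated as optional here (an assumption).
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[a-z0-9_+-]+)\))?: (?P<desc>\S.*)$"
)

# A DCO trailer on a line of its own.
SIGNOFF_RE = re.compile(r"^Signed-off-by: \S.*$", re.MULTILINE)


def check_commit_message(message: str) -> list[str]:
    """Return a list of problems; an empty list means the message passed."""
    problems = []
    lines = message.strip().splitlines()
    header = lines[0] if lines else ""

    match = HEADER_RE.match(header)
    if match is None:
        problems.append("header is not 'type(scope): description'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type: {match.group('type')}")

    if not SIGNOFF_RE.search(message):
        problems.append("missing Signed-off-by trailer")
    return problems
```

One way to use it is to feed it the output of `git log -1 --format=%B` for each commit in the branch. The CI linter and the DCO bot remain the authoritative checks.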
+ +--- + +## Documentation + +### Where Documentation Lives + +| Location | Content | +|----------|---------| +| `docs/handbook/` | Developer handbook organized by sub-project | +| `docs/contributing/` | Contribution-specific guides | +| `docs/` | General documentation | +| `meshmc/doc/` | MeshMC-specific docs | +| `meshmc/BUILD.md` | MeshMC build instructions | +| `ofborg/doc/` | tickborg documentation | +| Sub-project `README.md` files | Per-component overviews | + +### Documentation Standards + +- Use Markdown for all documentation +- Follow the existing heading structure +- Include code examples where appropriate +- Cross-reference related documents +- Add SPDX license headers to new documentation files: + ``` + + ``` + +--- + +## Contact + +| Channel | Address | +|---------|---------| +| GitHub Issues | [Project-Tick/Project-Tick/issues](https://github.com/Project-Tick/Project-Tick/issues) | +| Email | [projecttick@projecttick.org](mailto:projecttick@projecttick.org) | + +--- + +## License + +By contributing to Project Tick, you agree to license your work under the +project's applicable licenses. See the `LICENSES/` directory for details. + +The specific license for each sub-project is tracked in `REUSE.toml`. Ensure +your contributions comply with the license of the sub-project you are +modifying. diff --git a/docs/handbook/Project-Tick/faq.md b/docs/handbook/Project-Tick/faq.md new file mode 100644 index 0000000000..0ecc50110c --- /dev/null +++ b/docs/handbook/Project-Tick/faq.md @@ -0,0 +1,683 @@ +# Project Tick — FAQ & Troubleshooting + +## General Questions + +### What is Project Tick? + +Project Tick is a unified monorepo containing MeshMC (a custom Minecraft +launcher), supporting libraries, infrastructure tooling, and developer +utilities. The project encompasses 15+ sub-projects spanning C, C++23, Rust, +Java, Python, JavaScript, Nix, Shell, and Dockerfile. + +### Why a monorepo? 
+ +A single repository provides: + +- **Atomic changes** — Cross-cutting modifications (e.g., updating neozip and + MeshMC together) land in a single commit. +- **Unified CI** — One orchestrator workflow dispatches per-component CI based + on changed files. +- **Shared tooling** — Formatting, linting, license compliance, and Git hooks + apply uniformly. +- **Simplified dependency management** — Internal libraries (json4cpp, + tomlplusplus, libnbtplusplus, neozip) are consumed as source, not as + external packages. + +### Why C++23? + +MeshMC uses C++23 for: + +- `std::expected` for error handling without exceptions in performance paths +- `std::format` / `std::print` for type-safe formatting +- `std::ranges` improvements for cleaner data transformations +- Deducing `this` for CRTP replacement +- `if consteval` for compile-time branching + +Minimum compiler support: Clang 18+, GCC 14+, MSVC 17.10+. + +### Why fork zlib-ng instead of using it directly? + +neozip is a maintained fork of zlib-ng with Project Tick-specific +modifications: + +- Build system integration with MeshMC's CMake configuration +- Custom SIMD dispatch tuned for the project's use patterns +- Consistent licensing and REUSE annotations +- Patches carried forward as the upstream project evolves + +### Why fork Vim? + +MNV extends Vim with modern development features while maintaining backward +compatibility. The fork is maintained in-tree to allow tight integration with +the Project Tick development workflow. + +### Why fork nlohmann/json? + +json4cpp is a fork of nlohmann/json maintained for: + +- Build system compatibility with MeshMC +- Consistent REUSE/SPDX license annotations +- Controlled update cadence synchronized with launcher releases + +### What platforms does MeshMC support? 
+ +| Platform | Architecture | Status | +|-----------------|-------------|---------------| +| Linux | x86_64 | Full support | +| Linux | aarch64 | Full support | +| macOS | x86_64 | Full support | +| macOS | aarch64 | Full support | +| Windows | x86_64 | Full support | +| Windows | aarch64 | Full support | +| WSL | — | Not supported | + +### How do I contact the project? + +- **Security issues**: yongdohyun@projecttick.org (see `SECURITY.md`) +- **General inquiries**: Open a GitHub issue or discussion +- **Trademark questions**: yongdohyun@projecttick.org + +--- + +## Build Problems + +### CMake: "Could not find a configuration file for package Qt6" + +Qt 6 is not installed or not in the CMake search path. + +**Solution (Linux — Package Manager):** + +```bash +# Debian/Ubuntu +sudo apt install qt6-base-dev qt6-5compat-dev + +# Fedora +sudo dnf install qt6-qtbase-devel qt6-qt5compat-devel + +# Arch +sudo pacman -S qt6-base qt6-5compat +``` + +**Solution (Nix):** + +```bash +# From the meshmc/ directory: +nix develop +# Or if direnv is set up: +cd meshmc/ # .envrc activates automatically +``` + +**Solution (macOS):** + +```bash +brew install qt@6 +export CMAKE_PREFIX_PATH="$(brew --prefix qt@6)" +``` + +### CMake: "CMake 3.28 or higher is required" + +MeshMC requires CMake 3.28+ for C++23 module support. + +**Solution:** + +```bash +# Nix (provides latest CMake) +nix develop + +# pip (if system package is too old) +pip install --user cmake + +# Snap +sudo snap install cmake --classic +``` + +### CMake: "Could NOT find ECM" + +Extra CMake Modules (ECM) from the KDE project is required. + +```bash +# Debian/Ubuntu +sudo apt install extra-cmake-modules + +# Fedora +sudo dnf install extra-cmake-modules + +# Arch +sudo pacman -S extra-cmake-modules + +# Nix +# Already included in flake.nix devShell +``` + +### "In-source builds are not allowed" + +MeshMC's CMake configuration prohibits building in the source directory. 
+ +**Solution:** + +```bash +cd meshmc/ +cmake -B build -S . +cmake --build build +``` + +Never run `cmake .` directly in the source tree. Always pass a separate build +directory with `-B`, as in the commands above. + +### "compiler does not support C++23" + +Your compiler is too old. MeshMC requires: +- Clang 18+ +- GCC 14+ +- MSVC 17.10+ (Visual Studio 2022 17.10) + +**Solution:** + +```bash +# Check your compiler version +clang++ --version +g++ --version + +# Use Nix for guaranteed Clang 22 +nix develop +``` + +### neozip: configure fails on macOS + +macOS may not have all required build tools. + +```bash +# Install Xcode command line tools +xcode-select --install + +# Use Homebrew for missing dependencies +brew install autoconf automake libtool +``` + +### forgewrapper: Gradle build fails + +Ensure you use the Gradle wrapper, not a system-installed Gradle: + +```bash +cd forgewrapper/ +./gradlew build # Unix +gradlew.bat build # Windows +``` + +The Gradle wrapper (`gradlew`) downloads the correct Gradle version +automatically. Do not use `gradle build` with a system installation. + +### genqrcode: autogen.sh fails + +Install Autotools prerequisites: + +```bash +# Debian/Ubuntu +sudo apt install autoconf automake libtool pkg-config + +# Then bootstrap: +cd genqrcode/ +./autogen.sh +./configure +make +``` + +### corebinutils: GNUmakefile errors + +corebinutils requires BSD make extensions and may not build with all GNU Make +versions. Run the configure script first: + +```bash +cd corebinutils/ +./configure +make -f GNUmakefile +``` + +### cgit: missing Git submodule + +cgit requires a bundled Git source tree as a submodule. + +```bash +git submodule update --init --recursive cgit/git/ +cd cgit/ +make +``` + +If the `cgit/git/` directory is empty, the submodule was not initialized. 
+ +### MeshMC: vcpkg dependencies fail (Windows) + +```powershell +# Ensure vcpkg is bootstrapped +cd meshmc +.\bootstrap.cmd + +# Or manually: +git clone https://github.com/microsoft/vcpkg.git +.\vcpkg\bootstrap-vcpkg.bat +.\vcpkg\vcpkg install --triplet x64-windows +``` + +--- + +## CI Problems + +### CI: "REUSE lint failed" + +Every file in the repository must have a license annotation. Check which +files are non-compliant: + +```bash +reuse lint +``` + +Fix by adding the file to `REUSE.toml`: + +```toml +[[annotations]] +path = ["path/to/new/file"] +SPDX-FileCopyrightText = "YYYY Your Name" +SPDX-License-Identifier = "MIT" +``` + +Or add an SPDX header to the file itself: + +```c +// SPDX-FileCopyrightText: 2025 Your Name +// SPDX-License-Identifier: GPL-3.0-or-later +``` + +### CI: "DCO check failed" + +Your commit is missing the `Signed-off-by` line. + +**Fix the last commit:** + +```bash +git commit --amend -s +``` + +**Fix older commits (interactive rebase):** + +```bash +git rebase -i HEAD~N +# Mark commits as "edit", then for each: +git commit --amend -s +git rebase --continue +``` + +**Prevent future failures:** + +```bash +# Always use -s flag: +git commit -s -m "feat(meshmc): add feature" +``` + +### CI: "Conventional Commits lint failed" + +Commit messages must follow the Conventional Commits format: + +``` +type(scope): description + +[body] + +[footer] +Signed-off-by: Name +``` + +Valid types: `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, +`build`, `ci`, `chore`, `revert`. + +**Bad:** +``` +Fixed the bug +Update README +``` + +**Good:** +``` +fix(meshmc): resolve crash on instance deletion +docs: update getting-started guide +``` + +### CI: "checkpatch failed" + +The lefthook pre-commit hook runs checkpatch on C/C++ changes. 
Common +issues: + +- Trailing whitespace +- Mixed tabs and spaces +- Missing newline at end of file +- Lines exceeding column limit + +### CI: "treefmt check failed" + +treefmt checks that all files are formatted correctly. Run locally: + +```bash +# Via Nix +nix run .#treefmt + +# Or format individual files +clang-format -i file.cpp # C/C++ +black file.py # Python +rustfmt file.rs # Rust +nixfmt file.nix # Nix +shfmt -w file.sh # Shell +``` + +### CI: workflow not triggered + +Check whether your changes match the workflow's path filters. The monolithic +`ci.yml` uses change detection to only run relevant sub-project CI: + +```yaml +# File changes are analyzed in: +# .github/actions/change-analysis/ +``` + +If you modify only documentation, MeshMC build workflows will not trigger. +This is intentional. + +--- + +## Git & Repository Questions + +### How do I clone the repository? + +```bash +git clone --recurse-submodules https://github.com/Project-Tick/Project-Tick.git +cd Project-Tick +``` + +The `--recurse-submodules` flag is critical — cgit depends on a bundled Git +source tree. + +### How do I update submodules? + +```bash +git submodule update --init --recursive +``` + +### How do I set up the development environment? + +**Option A — Nix (recommended):** + +```bash +nix develop +``` + +**Option B — bootstrap script:** + +```bash +# Linux +./bootstrap.sh + +# Windows +bootstrap.cmd +``` + +**Option C — Manual installation:** + +See the [Getting Started](getting-started.md) guide for per-platform +instructions. + +### How do I run the bootstrap script? + +```bash +chmod +x bootstrap.sh +./bootstrap.sh +``` + +The script detects your distribution (Debian/Ubuntu, Fedora/RHEL, SUSE, +Arch, macOS) and verifies that required dependencies are installed. It does +**not** install packages automatically — it reports what is missing. + +### How do I contribute? + +1. Fork the repository +2. Create a feature branch: `git checkout -b feature-my-change` +3. 
Make changes following coding standards +4. Commit with sign-off: `git commit -s` +5. Push and open a pull request +6. Sign the CLA (PT-CLA-2.0) if this is your first contribution + +See [Contributing](contributing.md) for full details. + +### What is the AI contribution policy? + +AI-assisted contributions are accepted under these rules: + +- AI-generated code must be reviewed and understood by the contributor +- Commit messages must include an `Assisted-by:` trailer +- The human contributor is legally responsible for the code +- AI-generated test data and documentation are explicitly welcome +- Fully autonomous AI commits without human review are not accepted + +### How do I use lefthook? + +lefthook is configured in `lefthook.yml` and runs Git hooks automatically: + +```bash +# Install lefthook +go install github.com/evilmartians/lefthook@latest + +# Or via Nix +nix profile install nixpkgs#lefthook + +# Install hooks +lefthook install +``` + +After installation, REUSE lint and checkpatch run automatically on +`git commit`. + +--- + +## Library Questions + +### How do I use json4cpp in my CMake project? + +```cmake +# As a subdirectory (recommended in monorepo) +add_subdirectory(json4cpp) +target_link_libraries(my_target PRIVATE nlohmann_json::nlohmann_json) +``` + +```cpp +#include <nlohmann/json.hpp> +#include <string> +using json = nlohmann::json; + +json j = json::parse(R"({"key": "value"})"); +std::string val = j["key"]; +``` + +### How do I use tomlplusplus? + +```cpp +#include <toml++/toml.hpp> + +auto config = toml::parse_file("config.toml"); +auto value = config["section"]["key"].value<std::string>(); // std::optional; assumes the key holds a string +``` + +### How do I use libnbtplusplus? 
+ +```cpp +#include <fstream> +#include <string> +#include <nbt_tags.h> +#include <io/stream_reader.h> + +// Read NBT from file (a real level.dat is gzip-compressed and must be decompressed first) +std::ifstream file("level.dat", std::ios::binary); +auto [root_name, root] = nbt::io::read_compound(file); // root tag name and the compound itself + +// Access data +auto& level = root->at("Data").as<nbt::tag_compound>(); +std::string name = level.at("LevelName").as<nbt::tag_string>().get(); +``` + +### How do I use neozip? 
+ +neozip is API-compatible with zlib: + +```c +#include <zlib.h> + +// Compress +z_stream strm = {0}; +deflateInit(&strm, Z_DEFAULT_COMPRESSION); +// ... standard zlib API usage +deflateEnd(&strm); +``` + +### How does forgewrapper work? + +ForgeWrapper uses Java SPI to provide a file detector for Forge's installer: + +```java +// META-INF/services/io.github.zekerzhayard.forgewrapper.installer.detector.IFileDetector +// Points to the SPI implementation class +``` + +MeshMC uses forgewrapper as a runtime dependency when launching Forge-based +Minecraft instances. + +--- + +## Nix Questions + +### Nix: "error: experimental feature 'flakes' is disabled" + +Enable flakes in your Nix configuration: + +```bash +# ~/.config/nix/nix.conf +experimental-features = nix-command flakes +``` + +Or pass the flag: + +```bash +nix --experimental-features 'nix-command flakes' develop +``` + +### Nix: "error: getting status of /nix/store/..." + +The Nix store may be corrupted. Try: + +```bash +nix-store --verify --check-contents --repair +``` + +### Nix: flake.lock is outdated + +```bash +nix flake update +git add flake.lock +git commit -s -m "build(nix): update flake.lock" +``` + +### Nix: how do I update CI's pinned nixpkgs? + +```bash +cd ci/ +./update-pinned.sh +git add pinned.json +git commit -s -m "ci: update pinned nixpkgs" +``` + +--- + +## tickborg Questions + +### What is tickborg? + +tickborg is Project Tick's CI bot, forked from ofborg. It listens for GitHub +events via AMQP (RabbitMQ) and performs automated builds and tests. + +### How do I use tickborg commands? + +In a pull request comment: + +``` +@tickbot build meshmc # Build MeshMC +@tickbot test meshmc # Build and test MeshMC +@tickbot eval meshmc # Evaluate MeshMC Nix expression +``` + +### How do I deploy tickborg locally? 
+ +```bash +cd ofborg/ +cp example.config.json config.json +# Edit config.json with your AMQP credentials +docker-compose up -d +``` + +See `ofborg/DEPLOY.md` for full deployment instructions. 
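The comment commands in the tickborg answers above all have the same shape: the `@tickbot` mention, an action, and a project name. As a rough illustration of that grammar only, here is a small sketch of how such comments could be picked apart. It is not tickborg's code (the bot itself is written in Rust); `parse_commands`, `KNOWN_ACTIONS`, and support for several project names on one line are assumptions for the example.

```python
import re

# Actions shown in this FAQ; the real bot may accept others.
KNOWN_ACTIONS = {"build", "test", "eval"}

# A command must start its line; anything after '#' is treated as a comment.
COMMAND_RE = re.compile(
    r"^@tickbot[ \t]+(?P<action>\w+)[ \t]+(?P<projects>[^#\n]+)",
    re.MULTILINE,
)


def parse_commands(comment: str) -> list[tuple[str, list[str]]]:
    """Extract (action, [projects]) pairs from a PR comment body."""
    commands = []
    for match in COMMAND_RE.finditer(comment):
        action = match.group("action")
        if action not in KNOWN_ACTIONS:
            continue  # ignore actions this sketch does not know about
        projects = match.group("projects").split()
        if projects:
            commands.append((action, projects))
    return commands
```

For a comment containing `@tickbot build meshmc`, the sketch yields one `("build", ["meshmc"])` pair; lines that do not start with the mention are ignored.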
+ +--- + +## Platform-Specific Questions + +### Can I build on WSL? + +No. Project Tick explicitly does not support WSL for development. Use: + +- **Native Linux** for Linux builds +- **Native Windows with MSVC** for Windows builds +- **macOS** for macOS builds + +The `bootstrap.sh` script will exit with an error if it detects a WSL +environment. + +### macOS: "xcrun: error: invalid active developer path" + +Install Xcode command line tools: + +```bash +xcode-select --install +``` + +### Windows: "bootstrap.cmd fails with Scoop not found" + +Install Scoop first: + +```powershell +Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser +Invoke-RestMethod -Uri https://get.scoop.sh | Invoke-Expression +``` + +Then re-run `bootstrap.cmd`. + +--- + +## Licensing Questions + +### Which license applies to my contribution? + +Contributions inherit the license of the component you are modifying: +- MeshMC → GPL-3.0-or-later +- neozip → Zlib +- json4cpp → MIT +- tomlplusplus → MIT +- libnbtplusplus → LGPL-3.0-or-later +- cgit → GPL-2.0-only +- forgewrapper → MIT +- meta → MS-PL +- corebinutils → BSD (varies by utility) + +### How do I check license compliance? + +```bash +reuse lint +``` + +This command checks that every file has proper SPDX annotations, either via +file headers or `REUSE.toml` entries. + +### What is PT-CLA-2.0? + +The Project Tick Contributor License Agreement, version 2.0. It grants +the project a perpetual, irrevocable license to use your contribution. You +retain copyright ownership. The CLA must be signed once before your first +PR can be merged. 
diff --git a/docs/handbook/Project-Tick/getting-started.md b/docs/handbook/Project-Tick/getting-started.md new file mode 100644 index 0000000000..10768e7862 --- /dev/null +++ b/docs/handbook/Project-Tick/getting-started.md @@ -0,0 +1,637 @@ +# Project Tick — Getting Started + +## Prerequisites + +Before working with Project Tick, ensure you have the following base tools +installed on your system: + +| Tool | Minimum Version | Purpose | +|------|----------------|---------| +| Git | 2.30+ | Source control, submodule management | +| CMake | 3.28+ | Build system for C/C++ projects | +| Ninja | 1.10+ | Fast build backend for CMake | +| C++ compiler | GCC 13+ / Clang 17+ / MSVC 19.36+ | C++23 compilation | +| C compiler | GCC 13+ / Clang 17+ / MSVC 19.36+ | C11/C23 compilation | +| pkg-config | any | Library discovery | +| Python | 3.10+ | meta/ component | +| Rust | stable | tickborg CI bot | +| JDK | 17+ | ForgeWrapper, Minecraft runtime | +| Node.js | 22+ | CI scripts | + +### Optional but Recommended + +| Tool | Purpose | +|------|---------| +| Nix | Reproducible builds, development shells | +| Go | Installing lefthook | +| lefthook | Git hooks manager | +| reuse | REUSE license compliance checking | +| clang-format | Code formatting | +| clang-tidy | Static analysis | +| npm | CI script dependencies | +| Docker/Podman | Container-based builds | +| scdoc | Man page generation | + +--- + +## Cloning the Repository + +Project Tick uses Git submodules. Always clone recursively: + +```bash +git clone --recursive https://github.com/Project-Tick/Project-Tick.git +cd Project-Tick +``` + +If you already cloned without `--recursive`: + +```bash +git submodule update --init --recursive +``` + +The repository is large. 
If you only need a specific sub-project, you can +do a sparse checkout: + +```bash +git clone --filter=blob:none --sparse https://github.com/Project-Tick/Project-Tick.git +cd Project-Tick +git sparse-checkout set meshmc json4cpp tomlplusplus libnbtplusplus neozip +git submodule update --init --recursive +``` + +--- + +## Bootstrap (Recommended First Step) + +The fastest way to get a working development environment is to use the +bootstrap script. It detects your platform, installs missing dependencies, +initializes submodules, and sets up lefthook. + +### Linux / macOS + +```bash +./bootstrap.sh +``` + +The script supports the following distributions: + +| Distribution | Package Manager | Detection | +|-------------|-----------------|-----------| +| Debian | apt | `/etc/os-release` ID | +| Ubuntu, Linux Mint, Pop!_OS | apt | `/etc/os-release` ID | +| Fedora | dnf | `/etc/os-release` ID | +| RHEL, CentOS, Rocky, Alma | dnf/yum | `/etc/os-release` ID | +| openSUSE, SLES | zypper | `/etc/os-release` ID | +| Arch, Manjaro, EndeavourOS | pacman | `/etc/os-release` ID | +| macOS | Homebrew | `uname -s` = Darwin | + +The bootstrap script checks for: + +- **Build tools:** npm, Go, lefthook, reuse +- **Libraries:** Qt6Core, quazip1-qt6, zlib, ECM (via pkg-config) + +If any dependencies are missing, it installs them using the appropriate +package manager with `sudo`. + +### Windows + +```cmd +bootstrap.cmd +``` + +Uses [Scoop](https://scoop.sh) for CLI tools and +[vcpkg](https://github.com/microsoft/vcpkg) for C/C++ libraries. + +### Nix + +If you have Nix installed with flakes support: + +```bash +nix develop +``` + +This drops you into a development shell with LLVM 22, clang-tidy, and all +necessary tooling. The shell hook automatically initializes submodules. + +--- + +## Building MeshMC (Primary Application) + +MeshMC is the main application in the Project Tick ecosystem. Here's how to +build it from source. 
+ +### Step 1: Install Dependencies + +#### Debian / Ubuntu + +```bash +sudo apt-get install \ + cmake ninja-build extra-cmake-modules pkg-config \ + qt6-base-dev libquazip1-qt6-dev zlib1g-dev \ + libcmark-dev libarchive-dev libqrencode-dev libtomlplusplus-dev \ + scdoc +``` + +#### Fedora + +```bash +sudo dnf install \ + cmake ninja-build extra-cmake-modules pkgconf \ + qt6-qtbase-devel quazip-qt6-devel zlib-devel \ + cmark-devel libarchive-devel qrencode-devel tomlplusplus-devel \ + scdoc +``` + +#### Arch Linux + +```bash +sudo pacman -S --needed \ + cmake ninja extra-cmake-modules pkgconf \ + qt6-base quazip-qt6 zlib \ + cmark libarchive qrencode tomlplusplus \ + scdoc +``` + +#### openSUSE + +```bash +sudo zypper install \ + cmake ninja extra-cmake-modules pkg-config \ + qt6-base-devel quazip-qt6-devel zlib-devel \ + cmark-devel libarchive-devel qrencode-devel tomlplusplus-devel \ + scdoc +``` + +#### macOS (Homebrew) + +```bash +brew install \ + cmake ninja extra-cmake-modules \ + qt@6 quazip zlib \ + cmark libarchive qrencode tomlplusplus \ + scdoc +``` + +### Step 2: Configure with CMake Presets + +MeshMC ships `CMakePresets.json` with platform-specific presets: + +```bash +cd meshmc + +# Linux +cmake --preset linux + +# macOS +cmake --preset macos + +# macOS Universal Binary (x86_64 + arm64) +cmake --preset macos_universal + +# Windows (MinGW) +cmake --preset windows_mingw + +# Windows (MSVC) +cmake --preset windows_msvc +``` + +All presets use Ninja Multi-Config, output to `build/`, and install to +`install/`. + +### Step 3: Build + +```bash +# Using preset (matches the configure preset name) +cmake --build --preset linux + +# Or manually with Ninja +cmake --build build --config Release +``` + +### Step 4: Install (Optional) + +```bash +cmake --install build --config Release --prefix install +``` + +The built binary appears at `install/bin/meshmc`. 
 + +### Step 5: Run Tests + +```bash +cd build +ctest --output-on-failure -C Release +``` + +### Building with Nix + +```bash +cd meshmc +nix build +``` + +Or enter a development shell: + +```bash +nix develop +cmake --preset linux +cmake --build --preset linux +``` + +### Building with Container (Podman/Docker) + +MeshMC provides a `Containerfile`: + +```bash +cd meshmc +podman build -t meshmc . +``` + +--- + +## Building Other Sub-Projects + +### NeoZip (Compression Library) + +```bash +cd neozip + +# CMake build +mkdir build && cd build +cmake .. -G Ninja +ninja +ctest --output-on-failure + +# Or Autotools +./configure +make -j$(nproc) +make test +``` + +### cmark (Markdown Library) + +```bash +cd cmark +mkdir build && cd build +cmake .. -G Ninja +ninja +ctest --output-on-failure +``` + +### json4cpp (JSON Library) + +```bash +cd json4cpp +mkdir build && cd build +cmake .. -G Ninja +ninja +ctest --output-on-failure +``` + +json4cpp is header-only. For most uses, just include +the library header and point your +include path at `json4cpp/include/` or `json4cpp/single_include/`. + +### tomlplusplus (TOML Library) + +```bash +cd tomlplusplus + +# Meson (primary) +meson setup build +ninja -C build +ninja -C build test + +# Or CMake +mkdir build && cd build +cmake .. -G Ninja +ninja +ctest --output-on-failure +``` + +toml++ is header-only. Include the library header or use the single +header `toml.hpp`. + +### libnbt++ (NBT Library) + +```bash +cd libnbtplusplus +mkdir build && cd build +cmake .. +make -j$(nproc) +ctest --output-on-failure +``` + +CMake options: +- `NBT_BUILD_SHARED=OFF` — Build static library (default) +- `NBT_USE_ZLIB=ON` — Enable zlib support for compressed NBT (default) +- `NBT_BUILD_TESTS=ON` — Build tests (default) + +### GenQRCode (QR Code Library) + +```bash +cd genqrcode + +# Autotools +./autogen.sh +./configure +make -j$(nproc) +make check + +# Or CMake +mkdir build && cd build +cmake .. 
-G Ninja +ninja +ctest --output-on-failure +``` + +### ForgeWrapper (Java) + +```bash +cd forgewrapper +./gradlew build +``` + +The JAR is produced in `build/libs/`. + +### CoreBinUtils (BSD Utilities) + +```bash +cd corebinutils +./configure +make -f GNUmakefile -j$(nproc) all + +# Run tests +make -f GNUmakefile test +``` + +Uses musl-first toolchain selection by default. + +### MNV (Text Editor) + +```bash +cd mnv + +# CMake +mkdir build && cd build +cmake .. -G Ninja +ninja + +# Or Autotools +./configure +make -j$(nproc) +``` + +### cgit (Git Web Interface) + +```bash +cd cgit + +# Initialize Git submodule (cgit bundles its own git) +git submodule init +git submodule update + +make +sudo make install +``` + +Installs to `/var/www/htdocs/cgit` by default. Provide a `cgit.conf` +file to customize. + +### Meta (Metadata Generator) + +```bash +cd meta + +# Install dependencies with Poetry +pip install poetry +poetry install + +# Update Mojang versions +poetry run updateMojang + +# Generate all metadata +poetry run generateMojang +poetry run generateForge +poetry run generateNeoForge +poetry run generateFabric +poetry run generateQuilt +poetry run generateJava +``` + +### tickborg (CI Bot) + +```bash +cd ofborg/tickborg +cargo build +cargo test +cargo check +``` + +--- + +## Setting Up the Development Environment + +### Git Hooks with Lefthook + +After cloning, install lefthook to enable pre-commit and pre-push hooks: + +```bash +# Install lefthook (if not already installed) +go install github.com/evilmartians/lefthook@latest + +# Or via npm +npm i -g lefthook + +# Install hooks in the repository +lefthook install +``` + +The hooks perform: + +1. **Pre-commit:** + - REUSE license compliance check (auto-downloads missing licenses) + - checkpatch.pl on staged C/C++/CMake changes + +2. 
**Pre-push:** + - Final REUSE compliance check + +### REUSE Compliance + +Ensure every file has proper SPDX headers: + +```bash +# Check compliance +reuse lint + +# Download missing license texts +reuse download --all +``` + +### Code Formatting + +MeshMC uses clang-format for C/C++ formatting: + +```bash +# Format a file +clang-format -i path/to/file.cpp + +# Check formatting (CI style) +clang-format --dry-run --Werror path/to/file.cpp +``` + +The CI system uses `treefmt` with biome (JavaScript), nixfmt (Nix), and +yamlfmt (YAML) for other file types. + +### IDE Setup + +#### VS Code + +Recommended extensions: +- C/C++ (ms-vscode.cpptools) +- CMake Tools (ms-vscode.cmake-tools) +- clangd (llvm-vs-code-extensions.vscode-clangd) + +MeshMC generates `compile_commands.json` via +`CMAKE_EXPORT_COMPILE_COMMANDS ON` for full IDE support. + +#### CLion + +Open the `meshmc/CMakeLists.txt` directly. CLion natively supports CMake +presets — select the appropriate platform preset. + +#### Vim/MNV + +Use the `compile_commands.json` with a language server like `clangd` or +`ccls`. + +--- + +## First Contribution Workflow + +1. **Fork** the repository on GitHub +2. **Clone** your fork: + ```bash + git clone --recursive https://github.com/YOUR_USERNAME/Project-Tick.git + ``` +3. **Create a branch:** + ```bash + git checkout -b feature/my-change + ``` +4. **Make your changes** +5. **Format and lint:** + ```bash + clang-format -i changed_files.cpp + reuse lint + ``` +6. **Commit with sign-off and conventional format:** + ```bash + git commit -s -a -m "feat(meshmc): add new feature description" + ``` +7. **Push and create a PR:** + ```bash + git push origin feature/my-change + ``` +8. Open a pull request against the `master` branch + +See [contributing.md](contributing.md) for detailed contribution guidelines. 
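Step 6 combines two requirements: a conventional `type(scope): description` subject and a `Signed-off-by:` trailer. A local pre-check along those lines might look like the sketch below. `check_commit_msg` is a hypothetical helper, and its regular expression is only an approximation; the rules that CI actually enforces are authoritative.

```bash
#!/usr/bin/env bash
# Hypothetical local pre-check for a commit message passed as one string.
# The regex approximates "type(scope): description"; CI rules may differ.
check_commit_msg() {
    local msg="$1"
    local subject="${msg%%$'\n'*}"    # keep only the first line
    local re='^[a-z]+(\([A-Za-z0-9_.+-]+\))?!?: .+'

    if ! [[ "$subject" =~ $re ]]; then
        echo "subject is not in 'type(scope): description' form" >&2
        return 1
    fi
    # `git commit -s` appends a trailer of this shape.
    if ! grep -q '^Signed-off-by: .* <.*>$' <<<"$msg"; then
        echo "missing Signed-off-by trailer (use git commit -s)" >&2
        return 1
    fi
}
```

For example, `check_commit_msg "$(git log -1 --format=%B)"` checks the most recent commit before pushing.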
+ +--- + +## Troubleshooting + +### CMake can't find Qt 6 + +Ensure Qt 6 is installed and discoverable: + +```bash +# Check if Qt6Core is available +pkg-config --modversion Qt6Core + +# If using a custom Qt installation, set CMAKE_PREFIX_PATH +cmake --preset linux -DCMAKE_PREFIX_PATH=/path/to/qt6 +``` + +### Submodules are empty + +```bash +git submodule update --init --recursive --force +``` + +### Build fails on WSL + +MeshMC explicitly blocks WSL builds in its CMakeLists.txt: + +``` +Building MeshMC is not supported in Linux-on-Windows distributions. +``` + +Build natively on Windows using the `windows_msvc` or `windows_mingw` preset +instead. + +### In-source build error + +MeshMC enforces out-of-source builds. If you see this error: + +``` +You are building MeshMC in-source. Please separate the build tree from the source tree. +``` + +Create a separate build directory: + +```bash +cd meshmc +cmake --preset linux # Uses build/ automatically +``` + +### Missing ECM (Extra CMake Modules) + +Install the ECM package for your distribution: + +```bash +# Debian/Ubuntu +sudo apt-get install extra-cmake-modules + +# Fedora +sudo dnf install extra-cmake-modules + +# Arch +sudo pacman -S extra-cmake-modules +``` + +### Nix build fails + +Ensure you have flakes enabled: + +```bash +# Check Nix version +nix --version + +# Enable flakes (if not already) +echo "experimental-features = nix-command flakes" >> ~/.config/nix/nix.conf +``` + +### Poetry not found + +```bash +pip install poetry +# Or +pipx install poetry +``` + +### Rust/Cargo not found + +```bash +curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh +source ~/.cargo/env +``` + +### cgit build fails (missing Git submodule) + +```bash +cd cgit +git submodule init +git submodule update +# Or download manually: +make get-git +``` diff --git a/docs/handbook/Project-Tick/glossary.md b/docs/handbook/Project-Tick/glossary.md new file mode 100644 index 0000000000..cca34ef3c9 --- /dev/null +++ 
b/docs/handbook/Project-Tick/glossary.md @@ -0,0 +1,556 @@ +# Project Tick — Glossary + +## A + +### Adler-32 +A checksum algorithm designed by Mark Adler, used in the zlib data format. +neozip provides SIMD-accelerated implementations of Adler-32 via architecture- +specific intrinsics (SSE4.2, AVX2, NEON, VMX). + +### AMQP (Advanced Message Queuing Protocol) +A wire-level protocol for message-oriented middleware. tickborg uses AMQP +(via RabbitMQ) to receive build requests from GitHub webhooks and dispatch +them to worker nodes. + +### AppImage +A portable Linux application format that bundles all dependencies into a +single executable file. MeshMC distributes Linux releases as AppImages. + +### Autotools +A build system suite (autoconf, automake, libtool) used by several +sub-projects including genqrcode. The `./autogen.sh` script bootstraps +the build, producing `configure` and `Makefile.in`. + +### AUR (Arch User Repository) +A community-driven repository for Arch Linux packages. MeshMC publishes a +PKGBUILD to the AUR for Arch users. + +### AVX2 / AVX-512 +Advanced Vector Extensions — x86 SIMD instruction sets providing 256-bit and +512-bit vector operations. neozip uses AVX2 for accelerated CRC-32, Adler-32, +and deflate hash chain insertion. + +--- + +## B + +### Bazel +An open-source build system from Google. json4cpp provides `BUILD.bazel` and +`MODULE.bazel` files for Bazel-based builds as an alternative to CMake. + +### BSD License +A permissive open-source license family. Project Tick uses BSD-1-Clause, +BSD-2-Clause, BSD-3-Clause, and BSD-4-Clause across various components, +primarily in corebinutils (FreeBSD-derived code). + +### BSL-1.0 (Boost Software License) +A permissive license used by some utility code in the monorepo. + +--- + +## C + +### Cargo +The Rust package manager and build system. tickborg uses a Cargo workspace +with two crates: `tickborg` and `tickborg-simple-build`. 
+ +### CC-BY-SA-4.0 +Creative Commons Attribution-ShareAlike 4.0 International. Used for +documentation content within the Project Tick monorepo. + +### CC0-1.0 +Creative Commons Zero. A public domain dedication used for trivial +configuration files and metadata. + +### CGI (Common Gateway Interface) +A protocol for web servers to execute programs and return their output. cgit +is a CGI application that generates HTML from Git repositories. + +### cgit +A fast, lightweight web interface for Git repositories, written in C. The +Project Tick fork includes UI customizations and is linked against a bundled +Git source tree. + +### CLA (Contributor License Agreement) +A legal agreement between a contributor and the project. Project Tick uses +PT-CLA-2.0, which must be signed before contributions are accepted. + +### Clang +The C/C++ compiler from the LLVM project. MeshMC requires Clang 18+ or +equivalent. The Nix development shell provides LLVM 22 (Clang 22). + +### clang-format +An automatic C/C++ code formatter. MeshMC's `.clang-format` defines the +project's formatting rules, enforced by CI. + +### clang-tidy +A C/C++ static analysis tool. MeshMC's `.clang-tidy` configures enabled +checks. CI runs clang-tidy as part of the lint stage. + +### CMake +A cross-platform build system generator. The primary build system for MeshMC, +neozip, json4cpp, libnbtplusplus, genqrcode, cmark, and MNV. Project Tick +requires CMake 3.28+ for MeshMC. + +### CMake Presets +A JSON-based configuration file (`CMakePresets.json`) that defines named sets +of CMake configure, build, and test options. MeshMC uses presets for each +target platform. + +### CODEOWNERS +A GitHub feature that maps file paths to responsible reviewers. Project Tick's +`.github/CODEOWNERS` routes all reviews to `@YongDo-Hyun`. + +### CodeQL +GitHub's semantic code analysis engine for finding security vulnerabilities. +Configured in `.github/codeql/` for C, C++, and Java scanning. 
+ +### cmark +A standard-compliant CommonMark Markdown parser written in C. Licensed under +BSD-2-Clause. Provides both a library and a CLI tool. + +### CommonMark +A strongly-defined specification for Markdown syntax. cmark is the reference +C implementation of the CommonMark spec. + +### Conventional Commits +A commit message convention: `type(scope): description`. The CI lint stage +(`ci/github-script/lint-commits.js`) enforces this convention. + +### corebinutils +A collection of core Unix utilities ported from FreeBSD. Provides minimal +implementations of commands like `cat`, `ls`, `cp`, `mv`, `rm`, `mkdir`, +`chmod`, `echo`, `kill`, `ps`, and 30+ others. + +### Coverity +A commercial static analysis tool. Some sub-projects include Coverity scan +integration in their CI workflows. + +### CRC-32 +Cyclic Redundancy Check with a 32-bit output. Used in gzip, PNG, and other +formats. neozip provides SIMD-accelerated CRC-32 using PCLMULQDQ (x86), +PMULL (ARM), and hardware instructions on s390x. + +### Crowdin +A localization management platform. MeshMC uses Crowdin for translation +management (`launcher/translations/`). + +### CurseForge +A Minecraft mod hosting platform. MeshMC integrates with CurseForge for mod +discovery and installation via `launcher/modplatform/`. + +--- + +## D + +### DCO (Developer Certificate of Origin) +A lightweight alternative to a CLA. Contributors certify their right to +submit code by adding `Signed-off-by:` to commit messages (`git commit -s`). +Enforced by `.github/dco.yml`. + +### Deflate +A lossless compression algorithm combining LZ77 and Huffman coding. neozip +provides multiple deflate strategies: fast, medium, slow (best), quick, +huffman-only, RLE, and stored. + +### DFLTCC (Deflate Conversion Call) +A hardware instruction on IBM z15+ mainframes (s390x) that performs deflate +compression/decompression in hardware. neozip supports DFLTCC via +`arch/s390/`. 
 + +### direnv +A shell extension that loads environment variables from `.envrc` files. +Project Tick uses direnv with Nix (`use flake`) for automatic development +environment activation. + +### Docker / Containerfile +Container image build specifications. images4docker provides 40 Dockerfiles +for CI build environments. MeshMC includes a `Containerfile` for container +builds. + +--- + +## E + +### ECM (Extra CMake Modules) +A set of additional CMake modules provided by the KDE project. Required as a +build dependency for MeshMC. + +--- + +## F + +### Fabric +A lightweight Minecraft mod loader. MeshMC supports Fabric modding via +`launcher/modplatform/`. The meta generator produces Fabric version metadata +(`generate_fabric.py`). + +### Flake (Nix) +A Nix feature that provides reproducible, hermetic project definitions. The +root `flake.nix` defines the development shell with LLVM 22, Qt 6, and all +build dependencies. + +### Flatpak / Flathub +A Linux application sandboxing and distribution system. MeshMC is published +on Flathub after release. + +### Forge (Minecraft Forge) +A Minecraft mod loader for modifying the game. forgewrapper provides a Java +shim using SPI (Service Provider Interface) for Forge's boot process. + +### ForgeWrapper +A Java library that uses JPMS (Java Platform Module System) and SPI to +integrate with Minecraft Forge's installer/detector mechanism. Located at +`forgewrapper/`. + +### FreeBSD +A Unix-like operating system. corebinutils contains utilities ported from +FreeBSD's base system. + +### Fuzz Testing +A testing technique that provides random/malformed inputs to find crashes and +vulnerabilities. neozip, cmark, meta, and tomlplusplus include fuzz testing +targets. + +--- + +## G + +### Garnix +A CI platform for Nix projects. The meta sub-project uses Garnix +(`meta/garnix.yaml`). + +### genqrcode +A C library for generating QR codes. 
Supports all QR code versions (1–40), +multiple error correction levels (L/M/Q/H), and various encoding modes +(numeric, alphanumeric, byte, kanji, ECI). + +### GHCR (GitHub Container Registry) +GitHub's container image registry. images4docker images are pushed to GHCR. + +### GPL (GNU General Public License) +A copyleft license family. Project Tick uses GPL-2.0-only (cgit), +GPL-3.0-only (archived projects), and GPL-3.0-or-later (MeshMC, +images4docker). + +### Gradle +A build automation tool for JVM projects. forgewrapper uses Gradle with the +Gradle Wrapper (`gradlew`). + +--- + +## H + +### Huffman Coding +A lossless data compression algorithm using variable-length codes. neozip's +`trees.c` implements Huffman tree construction for the deflate algorithm. + +--- + +## I + +### images4docker +A collection of 40 Dockerfiles providing CI build environments for every +supported Linux distribution. Qt 6 is a mandatory dependency in all images. +Images are rebuilt daily at 03:17 UTC. + +--- + +## J + +### JPMS (Java Platform Module System) +Introduced in Java 9 (Project Jigsaw), JPMS provides a module system for +Java. forgewrapper uses JPMS configuration (`jigsaw/`) for proper module +encapsulation. + +### json4cpp +A fork of nlohmann/json — a header-only JSON library for C++. Licensed under +MIT. Provides SAX and DOM parsing, serialization, JSON Pointer, JSON Patch, +JSON Merge Patch, and CBOR/MessagePack/UBJSON/BSON support. + +--- + +## L + +### lefthook +A fast, cross-platform Git hooks manager. Configured in `lefthook.yml` to +run REUSE lint and checkpatch on pre-commit. + +### LGPL (GNU Lesser General Public License) +A copyleft license that permits linking from proprietary code. +libnbtplusplus uses LGPL-3.0-or-later; genqrcode uses LGPL-2.1-or-later. + +### libnbtplusplus +A C++ library for reading and writing Minecraft's NBT (Named Binary Tag) +format. Used by MeshMC to parse and modify Minecraft world data. 
+ +### LLVM +A compiler infrastructure providing Clang, LLD, and other tools. Project +Tick's Nix development shell provides LLVM 22. + +### Lua +A lightweight scripting language. cgit uses Lua for content filtering +(`filter.c`). + +### LZ77 +A lossless compression algorithm that replaces repeated occurrences with +references (length, distance pairs). The foundation of the deflate algorithm +implemented in neozip. + +--- + +## M + +### Make (GNU Make) +A build automation tool. Used by cgit, corebinutils, and cmark. cgit uses +a plain Makefile, while corebinutils uses GNUmakefile. + +### MeshMC +The primary application in the Project Tick ecosystem. A custom Minecraft +launcher written in C++23 with Qt 6, supporting multiple mod loaders, +instance management, and cross-platform deployment. + +### Meson +A build system focused on speed and simplicity. tomlplusplus uses Meson as +its primary build system. + +### MIT License +A permissive open-source license. Used by json4cpp, tomlplusplus, +forgewrapper, tickborg, and archived/projt-minicraft-modpack. + +### MNV +A fork of the Vim text editor with modern enhancements. Written in C, built +with CMake or Autotools. + +### Modrinth +A Minecraft mod hosting platform. MeshMC integrates with Modrinth for mod +discovery and installation. + +### Mojang +The developer of Minecraft. meta generates Mojang version metadata +(`generate_mojang.py`). + +### MS-PL (Microsoft Public License) +An open-source license used by the meta sub-project. + +### MSVC (Microsoft Visual C++) +Microsoft's C/C++ compiler. MeshMC requires MSVC 17.10+ (Visual Studio 2022) +for Windows builds. + +### musl +A lightweight C standard library implementation for Linux. Some neozip CI +builds test against musl for static linking compatibility. + +--- + +## N + +### NBT (Named Binary Tag) +A binary format used by Minecraft for storing structured data (worlds, +entities, items). 
libnbtplusplus provides C++ types for all NBT tag types: +`tag_byte`, `tag_short`, `tag_int`, `tag_long`, `tag_float`, `tag_double`, +`tag_string`, `tag_byte_array`, `tag_list`, `tag_compound`, +`tag_int_array`, `tag_long_array`. + +### NEON +ARM's SIMD instruction set for 128-bit vector operations. neozip uses NEON +for accelerated CRC-32, Adler-32, and slide hash on AArch64. + +### NeoForge +A community fork of Minecraft Forge. MeshMC supports NeoForge modding. The +meta generator produces NeoForge version metadata (`generate_neoforge.py`). + +### neozip +Project Tick's fork of zlib-ng, a high-performance zlib replacement with +SIMD acceleration across x86, ARM, Power, s390x, RISC-V, and LoongArch +architectures. Licensed under the Zlib license. + +### Nix +A purely functional package manager. Project Tick uses Nix flakes for +reproducible development environments, CI tooling, and package builds. + +### nixpkgs +The Nix package collection. CI pins a specific nixpkgs revision in +`ci/pinned.json` for reproducible builds. + +### NSIS (Nullsoft Scriptable Install System) +A Windows installer creation tool. MNV uses NSIS (`mnv/nsis/`) for Windows +distribution. + +--- + +## O + +### ofborg +The upstream project from which tickborg is forked. A CI system for the +Nixpkgs package repository that processes GitHub events via AMQP. + +--- + +## P + +### PCLMULQDQ +An x86 instruction for carry-less multiplication used to accelerate CRC-32 +computation. neozip uses PCLMULQDQ via `arch/x86/`. + +### PKGBUILD +An Arch Linux package build script. MeshMC maintains a PKGBUILD for AUR +distribution. + +### PMULL +An ARM instruction for polynomial multiplication, used for CRC-32 +acceleration on AArch64. neozip's ARM CRC implementation uses PMULL. + +### Poetry +A Python dependency management and packaging tool. The meta sub-project uses +Poetry (`meta/pyproject.toml`, `meta/poetry.lock`). + +### PR (Pull Request) +A GitHub mechanism for proposing code changes. 
All changes to protected +branches must go through PRs with passing CI and review approval. + +--- + +## Q + +### QR Code (Quick Response Code) +A two-dimensional barcode format. genqrcode generates QR codes supporting +versions 1–40, four error correction levels (L/M/Q/H), and multiple encoding +modes. + +### Qt 6 +A cross-platform application framework. MeshMC uses Qt 6 for its GUI +(widgets, dialogs, themes). Qt 6 is a mandatory dependency across all +images4docker build environments. + +### Quilt +A Minecraft mod loader forked from Fabric. MeshMC supports Quilt modding. The +meta generator produces Quilt version metadata (`generate_quilt.py`). + +--- + +## R + +### RabbitMQ +An AMQP message broker. tickborg connects to RabbitMQ to receive build +requests dispatched from GitHub webhooks. + +### Reed-Solomon +An error-correcting code used in QR codes. genqrcode implements Reed-Solomon +error correction in `rsecc.c`. + +### Renovate +An automated dependency update bot. Configured in `meta/renovate.json`. + +### REUSE +A specification from the FSFE (Free Software Foundation Europe) for +expressing license and copyright information. Project Tick's `REUSE.toml` +maps every file path to its SPDX license identifier. + +### RISC-V +An open-source instruction set architecture. neozip includes RISC-V SIMD +optimizations via the RVV (Vector) and ZBC (Carry-less Multiply) extensions +in `arch/riscv/`. + +--- + +## S + +### Scoop +A Windows package manager. `bootstrap.cmd` uses Scoop to install +dependencies on Windows. + +### Semgrep +A pattern-based static analysis tool for security scanning. Some CI workflows +include Semgrep scans. + +### SemVer (Semantic Versioning) +A versioning scheme: `MAJOR.MINOR.PATCH`. MAJOR for breaking changes, MINOR +for backwards-compatible features, PATCH for bug fixes. + +### SIMD (Single Instruction, Multiple Data) +A parallel processing technique. 
neozip heavily uses SIMD for performance- +critical operations: SSE2/SSE4.2/AVX2/AVX-512 (x86), NEON (ARM), VMX/VSX +(Power), DFLTCC (s390x), RVV (RISC-V), LSX/LASX (LoongArch). + +### SPI (Service Provider Interface) +A Java API pattern for extensibility. forgewrapper uses SPI via +`IFileDetector.java` to integrate with Forge's installer mechanism. + +### SPDX (Software Package Data Exchange) +A standard for communicating software license information. All Project Tick +licenses use SPDX identifiers. The `LICENSES/` directory contains full SPDX- +named license text files. + +### SSE (Streaming SIMD Extensions) +x86 SIMD instruction sets (SSE2, SSE4.2). neozip uses SSE for baseline +SIMD acceleration on x86 platforms. + +--- + +## T + +### tickborg +Project Tick's CI bot, forked from ofborg. A Rust application that listens on +AMQP for build requests and executes them. Bot commands: `@tickbot build`, +`@tickbot test`, `@tickbot eval`. + +### TOML (Tom's Obvious Minimal Language) +A configuration file format. tomlplusplus is a C++17 header-only TOML parser +and serializer supporting TOML v1.0.0. + +### tomlplusplus +A header-only C++17 TOML library. Licensed under MIT. Provides parsing, +serialization, and manipulation of TOML documents. Built with Meson or CMake. + +### treefmt +A universal code formatter dispatcher. Configured in `ci/default.nix` to run +all language-specific formatters in a single pass. + +--- + +## U + +### Unlicense +A public domain dedication license. Used for some trivial files in the +monorepo. + +--- + +## V + +### vcpkg +Microsoft's C/C++ package manager. MeshMC uses vcpkg for Windows dependency +management (`meshmc/vcpkg.json`, `meshmc/vcpkg-configuration.json`). + +### Vim +A highly configurable text editor. MNV is a fork of Vim with additional +features. Licensed under the Vim license + GPL-3.0. + +### VMX / VSX +IBM Power architecture SIMD instruction sets (Vector Multimedia Extension / +Vector Scalar Extension). 
neozip uses VMX/VSX for Power8/9 acceleration. + +--- + +## W + +### WSL (Windows Subsystem for Linux) +A compatibility layer for running Linux on Windows. Project Tick does **not** +support building under WSL; native Windows builds (MSVC or MinGW) are required. + +--- + +## Z + +### zlib +The original compression library implementing the deflate algorithm. neozip +is a high-performance fork of zlib-ng, which itself is a modernized fork of +zlib. + +### zlib-ng +A modernized fork of zlib with SIMD optimizations. neozip is Project Tick's +fork of zlib-ng with additional modifications. + +### Zlib License +A permissive open-source license. Used by neozip and archived/ptlibzippy. diff --git a/docs/handbook/Project-Tick/licensing.md b/docs/handbook/Project-Tick/licensing.md new file mode 100644 index 0000000000..d80281a42e --- /dev/null +++ b/docs/handbook/Project-Tick/licensing.md @@ -0,0 +1,371 @@ +# Project Tick — Licensing + +## Overview + +Project Tick is a multi-licensed ecosystem. Because the monorepo contains +components with diverse origins — from BSD utility ports to GPL-licensed +applications to MIT/Zlib libraries — each sub-project carries the license +appropriate to its upstream lineage and Project Tick's own contributions. + +The project uses the [REUSE](https://reuse.software/) specification (version +3.0) for license compliance. Every file in the repository is annotated with +SPDX license identifiers and copyright statements, either inline in file +headers or via the `REUSE.toml` configuration file.
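To see which of the two mechanisms covers a given file, one quick check is to look for an inline identifier near the top of the file. The sketch below assumes the header sits within the first 20 lines. `spdx_id` is a hypothetical helper, not a `reuse` subcommand, and a non-zero result only means there is no inline header; the file may still be covered by `REUSE.toml`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first inline SPDX license identifier found
# in the first 20 lines of a file; return non-zero when none is present.
spdx_id() {
    head -n 20 "$1" \
        | sed -n 's/.*SPDX-License-Identifier:[[:space:]]*//p' \
        | head -n 1 \
        | grep .    # fails (and prints nothing) when no identifier was found
}
```

`reuse lint` remains the authoritative check; this only shows what an individual file declares inline.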
+ +--- + +## License Inventory + +The `LICENSES/` directory contains 20 distinct SPDX-compliant license texts: + +| SPDX Identifier | License Name | Category | +|-----------------|-------------|----------| +| `Apache-2.0` | Apache License 2.0 | Permissive | +| `BSD-1-Clause` | BSD 1-Clause License | Permissive | +| `BSD-2-Clause` | BSD 2-Clause "Simplified" License | Permissive | +| `BSD-3-Clause` | BSD 3-Clause "New" License | Permissive | +| `BSD-4-Clause` | BSD 4-Clause "Original" License | Permissive | +| `BSL-1.0` | Boost Software License 1.0 | Permissive | +| `CC-BY-SA-4.0` | Creative Commons Attribution-ShareAlike 4.0 | Creative Commons | +| `CC0-1.0` | Creative Commons Zero 1.0 Universal | Public Domain Dedication | +| `GPL-2.0-only` | GNU General Public License v2.0 only | Copyleft | +| `GPL-3.0-only` | GNU General Public License v3.0 only | Copyleft | +| `GPL-3.0-or-later` | GNU General Public License v3.0 or later | Copyleft | +| `LGPL-2.0-or-later` | GNU Lesser General Public License v2.0 or later | Weak Copyleft | +| `LGPL-2.1-or-later` | GNU Lesser General Public License v2.1 or later | Weak Copyleft | +| `LGPL-3.0-or-later` | GNU Lesser General Public License v3.0 or later | Weak Copyleft | +| `LicenseRef-Qt-Commercial` | Qt Commercial License (reference) | Proprietary | +| `MIT` | MIT License | Permissive | +| `MS-PL` | Microsoft Public License | Permissive | +| `Unlicense` | The Unlicense | Public Domain Dedication | +| `Vim` | Vim License | Permissive (custom) | +| `Zlib` | zlib License | Permissive | + +--- + +## Per-Component License Map + +### Applications + +| Component | Directory | License | Copyright | +|-----------|-----------|---------|-----------| +| **MeshMC** | `meshmc/` | GPL-3.0-or-later | 2026 Project Tick | +| MeshMC (historical code) | `meshmc/` | Apache-2.0 (incorporated work) | 2012–2022 MultiMC Contributors | +| **MNV** | `mnv/` | Vim AND GPL-3.0-or-later | Bram Moolenaar & Vim Contributors & Project Tick | +| **cgit** | 
`cgit/` | GPL-2.0-only | cgit Contributors & Project Tick | + +### Libraries + +| Component | Directory | License | Copyright | +|-----------|-----------|---------|-----------| +| **NeoZip** | `neozip/` | Zlib | Zlib Contributors & Zlib-ng Contributors & Project Tick | +| **Json4C++** | `json4cpp/` | MIT | Json4C++ Contributors & Project Tick | +| **toml++** | `tomlplusplus/` | MIT | Toml++ Contributors & Project Tick | +| **libnbt++** | `libnbtplusplus/` | LGPL-3.0-or-later | libnbtplusplus Contributors & ljfa-ag & Project Tick | +| **cmark** | `cmark/` | BSD-2-Clause AND MIT AND CC-BY-SA-4.0 | CMark Contributors & Project Tick | +| **GenQRCode** | `genqrcode/` | LGPL-2.1-or-later | GenQRCode Contributors & Project Tick | +| **ForgeWrapper** | `forgewrapper/` | MIT | ForgeWrapper Contributors & Project Tick | + +### System Utilities + +| Component | Directory | License | Copyright | +|-----------|-----------|---------|-----------| +| **CoreBinUtils** | `corebinutils/` | BSD-1-Clause AND BSD-2-Clause AND BSD-3-Clause AND BSD-4-Clause AND MIT | FreeBSD Contributors & Project Tick | + +### Infrastructure + +| Component | Directory | License | Copyright | +|-----------|-----------|---------|-----------| +| **Meta** | `meta/` | MS-PL | MultiMC Contributors & PolyMC Contributors & PrismLauncher Contributors & Project Tick | +| **tickborg** | `ofborg/` | MIT | NixOS Contributors & Project Tick | +| **Images4Docker** | `images4docker/` | GPL-3.0-or-later | Project Tick | + +### Archived + +| Component | Directory | License | Copyright | +|-----------|-----------|---------|-----------| +| **ProjT Launcher** | `archived/projt-launcher/` | GPL-3.0-only | MultiMC Contributors & Prism Launcher Contributors & PolyMC Contributors & Project Tick | +| **ProjT Modpack** | `archived/projt-modpack/` | GPL-3.0-only | Project Tick | +| **ProjT Minicraft Modpack** | `archived/projt-minicraft-modpack/` | MIT | Project Tick | +| **ptlibzippy** | `archived/ptlibzippy/` | Zlib | Zlib 
Contributors & Project Tick | + +--- + +## REUSE.toml Analysis + +The `REUSE.toml` file (version 1) uses `[[annotations]]` blocks to map file +paths to their SPDX license identifiers and copyright statements. This is the +primary mechanism for bulk license annotation. + +### Infrastructure and Configuration Files + +```toml +[[annotations]] +path = [ + ".gitignore", ".gitattributes", ".gitmodules", ".github/**", + ".envrc", ".markdownlint.yaml", ".markdownlintignore", + "Containerfile", "default.nix", "flake.lock", "flake.nix", + "shell.nix", "vcpkg-configuration.json", "vcpkg.json", + ".clang-format", ".clang-tidy", "CODEOWNERS", "hooks/**", "ci/**" +] +SPDX-License-Identifier = "CC0-1.0" +SPDX-FileCopyrightText = "NONE" +``` + +Configuration files, CI scripts, Git metadata, and build system configuration +are placed in the public domain under CC0-1.0 with no copyright claim. + +### Documentation + +```toml +[[annotations]] +path = ["**/*.md", "doc/**"] +SPDX-License-Identifier = "CC0-1.0" +SPDX-FileCopyrightText = "2026 Project Tick" +``` + +All Markdown files and documentation are CC0-1.0, allowing unrestricted reuse. + +### MeshMC-Specific Files + +```toml +# Launcher packaging +path = ["launcher/package/**"] +SPDX-License-Identifier = "GPL-3.0-or-later" + +# Qt UI files +path = ["**/*.ui"] +SPDX-License-Identifier = "GPL-3.0-or-later" + +# CMake presets +path = ["CMakePresets.json"] +SPDX-License-Identifier = "GPL-3.0-or-later" + +# Nix build files +path = ["nix/**"] +SPDX-License-Identifier = "GPL-3.0-or-later" + +# Branding and resources +path = ["branding/**", "launcher/resources/**"] +SPDX-License-Identifier = "CC0-1.0" +``` + +### CMake Build Files + +```toml +path = ["cmake/**", "**/CMakeLists.txt"] +SPDX-License-Identifier = "BSD-3-Clause" +SPDX-FileCopyrightText = "Various authors" +``` + +CMake modules and build definitions use BSD-3-Clause, reflecting their diverse +authorship. 
+ +### Test Data + +```toml +path = ["**/testdata/**"] +SPDX-License-Identifier = "CC0-1.0" +SPDX-FileCopyrightText = "NONE" +``` + +Test data has no copyright claims and is in the public domain. + +--- + +## License Compatibility + +### Core Dependency Chain + +MeshMC (GPL-3.0-or-later) links against libraries with the following licenses: + +| Library | License | GPL-3.0 Compatible? | +|---------|---------|---------------------| +| json4cpp | MIT | Yes — permissive | +| tomlplusplus | MIT | Yes — permissive | +| libnbtplusplus | LGPL-3.0-or-later | Yes — LGPL is GPL-compatible | +| neozip | Zlib | Yes — permissive | +| cmark | BSD-2-Clause/MIT | Yes — permissive | +| genqrcode | LGPL-2.1-or-later | Yes — LGPL is GPL-compatible | +| Qt 6 | LGPL-3.0 / GPL-3.0 / Commercial | Yes — LGPL/GPL-compatible | +| QuaZip | LGPL-2.1 | Yes — LGPL is GPL-compatible | +| libarchive | BSD-2-Clause | Yes — permissive | +| ECM | BSD-3-Clause | Yes — permissive | + +All library dependencies are GPL-3.0 compatible. The GPL-3.0-or-later license +of MeshMC governs the combined work. + +### ForgeWrapper (Runtime) + +ForgeWrapper (MIT) is loaded at runtime as a separate Java process, not linked +at compile time. The MIT license is compatible with GPL-3.0 for distribution +purposes, and runtime invocation does not create a derivative work concern. + +### Meta (MS-PL) + +The MS-PL (Microsoft Public License) used by `meta/` is a permissive license +that allows use, modification, and redistribution. It is generally considered +compatible with GPL for independent components. Since `meta/` is a standalone +Python project that generates JSON data consumed by MeshMC over HTTP, there is +no linking relationship. + +### CoreBinUtils (Multi-BSD) + +CoreBinUtils uses a combination of BSD-1-Clause, BSD-2-Clause, BSD-3-Clause, +BSD-4-Clause, and MIT — reflecting the diverse origins of FreeBSD utilities. +The BSD-4-Clause (advertising clause) applies only to the specific files that +carry it. 
All BSD variants are permissive and do not impose copyleft +obligations. + +### MNV (Vim License + GPL-3.0-or-later) + +MNV combines two licenses (SPDX `Vim AND GPL-3.0-or-later`) rather than +offering a choice between them: the Vim license, a permissive custom license, +and GPL-3.0-or-later. The Vim license is a charityware license that allows +free use, modification, and redistribution. The GPL-3.0-or-later applies to +Project Tick's modifications. + +### cgit (GPL-2.0-only) + +cgit uses GPL-2.0-only (not "or later"), which means it cannot be +relicensed under GPL-3.0. It remains an independent component with no +linking relationship to GPL-3.0 components. + +--- + +## SPDX Headers + +### Inline Headers + +Source files should include SPDX headers at the top: + +```c +// SPDX-FileCopyrightText: 2026 Project Tick +// SPDX-License-Identifier: GPL-3.0-or-later +``` + +```python +# SPDX-FileCopyrightText: 2026 Project Tick +# SPDX-License-Identifier: MS-PL +``` + +```cmake +# SPDX-FileCopyrightText: 2026 Project Tick +# SPDX-License-Identifier: BSD-3-Clause +``` + +### REUSE.toml Bulk Annotations + +For files where inline headers are impractical (binary files, generated files, +configuration files), use `REUSE.toml` annotations with glob patterns: + +```toml +[[annotations]] +path = ["pattern/**"] +SPDX-License-Identifier = "LICENSE-ID" +SPDX-FileCopyrightText = "Copyright holder" +``` + +### Checking Compliance + +```bash +# Install reuse tool +pip install reuse + +# Check entire repository +reuse lint + +# Download missing license texts +reuse download --all +``` + +The pre-commit hook via lefthook automatically runs `reuse lint` and +downloads missing licenses if needed. + +--- + +## Adding New Files + +When adding new files to the repository: + +1.
**Determine the appropriate license** based on the sub-project: + - Files in `meshmc/` → GPL-3.0-or-later + - Files in `neozip/` → Zlib + - Files in `json4cpp/` → MIT + - Files in `meta/` → MS-PL + - Documentation → CC0-1.0 + - Configuration/build files → CC0-1.0 or BSD-3-Clause + - Test data → CC0-1.0 + +2. **Add SPDX headers** to the file (if it supports comments) + +3. **Or add a REUSE.toml annotation** for files without comment support + +4. **Run `reuse lint`** to verify compliance + +### Adding New Sub-Projects + +If adding an entirely new sub-project: + +1. Add a `[[annotations]]` block to `REUSE.toml` for the new directory +2. Place the appropriate license text in `LICENSES/` if not already present +3. Ensure all files have proper SPDX identifiers +4. Document the license in the sub-project's README + +--- + +## Third-Party License Obligations + +### Attribution Requirements + +Several licenses in the ecosystem require attribution in distributed binaries: + +| License | Attribution Requirement | +|---------|----------------------| +| Apache-2.0 | NOTICE file, license text | +| BSD-2-Clause | License text in documentation | +| BSD-3-Clause | License text in documentation | +| BSD-4-Clause | License text + advertising clause | +| MIT | License text | +| LGPL-2.1/3.0 | License text, source availability | +| GPL-2.0/3.0 | Full source code availability | +| MS-PL | License text | +| CC-BY-SA-4.0 | Attribution, ShareAlike | + +### Copyleft Obligations + +| License | Source Obligation | Dynamic Linking | +|---------|------------------|-----------------| +| GPL-2.0-only | Full source for the program | Derivative work | +| GPL-3.0-only/or-later | Full source for the program | Derivative work | +| LGPL-2.1-or-later | Source for LGPL portions, object files for relinking | Permitted without GPL | +| LGPL-3.0-or-later | Source for LGPL portions, installation info | Permitted without GPL | + +--- + +## Trademark vs. 
License + +It is crucial to understand that **open source licenses do not grant trademark +rights**. As stated in `TRADEMARK.md`: + +> Open source licenses govern the use, modification, and redistribution of +> source code only. They do **not** grant rights to use the Project Tick name, +> logo, or branding. + +See [trademark-policy.md](trademark-policy.md) for the full trademark policy. + +--- + +## CLA and License Grants + +The Project Tick Contributor License Agreement (CLA) ensures that all +contributions can be distributed under the project's existing licenses. By +signing the CLA, contributors: + +1. Confirm they have the legal right to make the contribution +2. Grant Project Tick a perpetual license to distribute the contribution +3. Agree not to knowingly infringe third-party rights + +This allows Project Tick to maintain license consistency across the ecosystem +without requiring future relicensing negotiations. + +CLA text: diff --git a/docs/handbook/Project-Tick/overview.md b/docs/handbook/Project-Tick/overview.md new file mode 100644 index 0000000000..6d36f7f914 --- /dev/null +++ b/docs/handbook/Project-Tick/overview.md @@ -0,0 +1,335 @@ +# Project Tick — Organization Overview + +## Introduction + +Project Tick is a modular software ecosystem organized as a unified monorepo. It +encompasses applications, libraries, system utilities, infrastructure tooling, +and metadata generators — all managed under a single repository at +`github.com/Project-Tick/Project-Tick`. The project is dedicated to providing +developers with ease of use and users with long-lasting software. + +The monorepo approach ensures tight integration between components while +preserving the independence of each sub-project. Every directory at the +repository root represents an autonomous module, library, tool, or application +that can be built and used standalone or as part of the larger system. + +Project Tick focuses on three guiding principles: + +1. 
**Reproducible builds** — Nix flakes and pinned dependencies ensure every + build produces identical output regardless of the host environment. +2. **Minimal dependencies** — Each component pulls only what it strictly needs. +3. **Full control over the software stack** — From compression libraries to text + editors, Project Tick maintains its own forks and adaptations to guarantee + long-term stability and security. + +--- + +## Mission + +Project Tick exists to build, package, and run software across multiple +platforms with complete transparency and reproducibility. The project provides: + +- A custom Minecraft launcher (MeshMC) with deep mod-loader integration +- System-level UNIX utilities ported from FreeBSD +- A text editor fork (MNV) with modern enhancements +- Foundational C/C++ libraries for compression, serialization, and parsing +- Infrastructure for CI/CD, container images, and metadata generation +- Git web interfaces and documentation tooling + +Every component feeds up into the broader mission: an ecosystem where every +dependency is accounted for, every license is tracked, and every build is +reproducible. + +--- + +## Sub-Projects + +Project Tick contains the following top-level components, organized by category. + +### Applications + +| Directory | Name | Description | Language | License | +|-----------|------|-------------|----------|---------| +| `meshmc/` | **MeshMC** | Custom Minecraft launcher focused on predictability, long-term stability, and simplicity. Supports Forge, NeoForge, Fabric, and Quilt mod loaders. Built with Qt 6 and C++23. Current version: 7.0.0. | C++ | GPL-3.0-or-later | +| `mnv/` | **MNV** | Greatly improved fork of the Vi/Vim text editor. Features multi-level undo, syntax highlighting, command-line history, spell checking, filename completion, block operations, and a script language. Provides a POSIX-compatible vi implementation in its minimal build. 
| C | Vim AND GPL-3.0-or-later | +| `cgit/` | **cgit** | Fast CGI-based web interface for Git repositories. Uses a built-in cache to decrease server I/O pressure. Supports Lua scripting for custom filters. | C | GPL-2.0-only | + +### Libraries + +| Directory | Name | Description | Language | License | +|-----------|------|-------------|----------|---------| +| `neozip/` | **NeoZip** | Next-generation zlib/zlib-ng fork for data compression. Features SIMD-accelerated implementations (SSE2, AVX2, AVX-512, NEON, etc.) for Adler32, CRC32, inflate, and deflate. Supports CPU intrinsics on x86-64, ARM, Power, RISC-V, LoongArch, and s390x. | C | Zlib | +| `json4cpp/` | **Json4C++** | Header-only JSON library for modern C++ (nlohmann/json fork). Supports JSON Pointer, JSON Patch, JSON Merge Patch, BSON, CBOR, MessagePack, UBJSON, and BJData binary formats. Single-header and multi-header modes. | C++ | MIT | +| `tomlplusplus/` | **toml++** | Header-only TOML parser and serializer for C++17. Passes all tests in the official toml-test suite. Supports serialization to JSON and YAML, proper UTF-8 handling, and works with or without exceptions and RTTI. | C++ | MIT | +| `libnbtplusplus/` | **libnbt++** | C++ library for Minecraft's Named Binary Tag (NBT) file format. Reads and writes compressed and uncompressed NBT files. Version 3 is a ground-up rewrite for usability. | C++ | LGPL-3.0-or-later | +| `cmark/` | **cmark** | CommonMark reference implementation in C. Provides a C API for parsing and rendering Markdown documents. Includes fuzz testing infrastructure. | C | BSD-2-Clause AND MIT AND CC-BY-SA-4.0 | +| `genqrcode/` | **GenQRCode** | QR Code encoding library (libqrencode fork). Supports QR Code model 2 per JIS X0510:2004 / ISO/IEC 18004. Handles numeric, alphanumeric, kanji (Shift-JIS), and 8-bit data. Also supports Micro QR Code (experimental). 
| C | LGPL-2.1-or-later | +| `forgewrapper/` | **ForgeWrapper** | Java library enabling launchers to start Minecraft 1.13+ with Forge. Provides a service-provider interface (`IFileDetector`) for custom file detection rules. Built with Gradle. | Java | MIT | + +### System Utilities + +| Directory | Name | Description | Language | License | +|-----------|------|-------------|----------|---------| +| `corebinutils/` | **CoreBinUtils** | Collection of BSD/FreeBSD core utilities ported to Linux. Includes `cat`, `chmod`, `cp`, `date`, `dd`, `df`, `echo`, `ed`, `expr`, `hostname`, `kill`, `ln`, `ls`, `mkdir`, `mv`, `nproc`, `ps`, `pwd`, `realpath`, `rm`, `rmdir`, `sh`, `sleep`, `stty`, `sync`, `test`, `timeout`, and more. Uses musl-first toolchain selection. | C | BSD-1-Clause AND BSD-2-Clause AND BSD-3-Clause AND BSD-4-Clause AND MIT | + +### Infrastructure & Tooling + +| Directory | Name | Description | Language | License | +|-----------|------|-------------|----------|---------| +| `meta/` | **Meta** | Metadata generator for the MeshMC launcher. Generates JSON manifests and JARs for Mojang, Forge, NeoForge, Fabric, Quilt, LiteLoader, and Java runtime versions. Written in Python, uses Poetry for dependency management. Deployable as a NixOS service. | Python | MS-PL | +| `ofborg/` | **tickborg** | Distributed RabbitMQ-based CI system adapted from NixOS ofborg. Automatically detects changed projects in PRs, builds affected sub-projects using their native build systems, posts results as GitHub check runs, and supports multi-platform builds (Linux, macOS, Windows, FreeBSD). | Rust | MIT | +| `images4docker/` | **Images4Docker** | Collection of 40 Dockerfiles for building MeshMC across different Linux distributions. Each image includes the Qt 6 toolchain and all MeshMC build dependencies. Supports apt, dnf, apk, zypper, yum, pacman, xbps, nix, and emerge package managers. Rebuilt daily at 03:17 UTC. 
| Dockerfile | GPL-3.0-or-later | +| `ci/` | **CI Infrastructure** | CI support files including Nix-based tooling (treefmt, codeowners-validator), GitHub Actions JavaScript helpers (commit linting, PR preparation, review management), branch classification, and pinned Nixpkgs for reproducible formatting. | Nix, JavaScript | CC0-1.0 | +| `hooks/` | **Git Hooks** | Lefthook-managed Git hooks for pre-commit REUSE license checking and code style validation via checkpatch. | Shell | CC0-1.0 | + +### Archived Projects + +| Directory | Name | Description | License | +|-----------|------|-------------|---------| +| `archived/projt-launcher/` | **ProjT Launcher** | Original Minecraft launcher (predecessor to MeshMC). Based on MultiMC/PrismLauncher/PolyMC. | GPL-3.0-only | +| `archived/projt-modpack/` | **ProjT Modpack** | Minecraft modpack distribution tooling. | GPL-3.0-only | +| `archived/projt-minicraft-modpack/` | **ProjT Minicraft Modpack** | Minicraft modpack distribution. | MIT | +| `archived/ptlibzippy/` | **ptlibzippy** | ZIP library (predecessor to NeoZip integration). 
| Zlib | + +### Documentation & Configuration + +| Directory / File | Description | +|------------------|-------------| +| `docs/` | Project documentation including the developer handbook | +| `LICENSES/` | SPDX-compliant license texts (20 distinct licenses) | +| `REUSE.toml` | REUSE 3.0 compliance annotations mapping paths to licenses | +| `flake.nix` | Top-level Nix flake providing development shells with LLVM 22 | +| `flake.lock` | Locked Nix inputs for reproducibility | +| `bootstrap.sh` | Cross-distro bootstrap script for dependency installation | +| `bootstrap.cmd` | Windows bootstrap script using Scoop and vcpkg | +| `lefthook.yml` | Git hooks configuration for pre-commit checks | +| `.github/` | GitHub Actions workflows, issue templates, PR template, CODEOWNERS, DCO enforcement | + +--- + +## Technology Stack + +### Programming Languages + +| Language | Where Used | +|----------|------------| +| C | neozip, cmark, genqrcode, cgit, corebinutils, mnv | +| C++ (C++23) | meshmc, json4cpp (C++11/17), tomlplusplus (C++17), libnbtplusplus (C++11) | +| Rust | tickborg (ofborg) | +| Java | forgewrapper | +| Python | meta | +| JavaScript / Node.js | CI scripts (github-script) | +| Nix | CI infrastructure, development shells, NixOS deployment modules | +| Shell (Bash/POSIX) | bootstrap scripts, hooks, build orchestration | +| Dockerfile | images4docker | +| CMake | meshmc, neozip, cmark, genqrcode, json4cpp, libnbtplusplus, mnv | + +### Build Systems + +| Build System | Projects | +|--------------|----------| +| CMake | meshmc, neozip, cmark, genqrcode, json4cpp, libnbtplusplus, mnv | +| Meson | tomlplusplus | +| Make (GNU Make) | cgit, corebinutils | +| Autotools | mnv (configure), genqrcode (configure.ac), neozip (configure) | +| Gradle | forgewrapper | +| Cargo | tickborg | +| Poetry | meta | +| Nix | CI, development shells, deployment | + +### Frameworks & Key Dependencies + +| Dependency | Used By | Purpose | +|------------|---------|---------| +| Qt 6 (Core, 
Widgets, Concurrent, Network, NetworkAuth, Test, Xml) | meshmc | GUI framework | +| QuaZip (Qt 6) | meshmc | ZIP archive support | +| zlib / NeoZip | meshmc, neozip | Data compression | +| libarchive | meshmc | Archive extraction | +| Extra CMake Modules (ECM) | meshmc | KDE CMake utilities | +| RabbitMQ (AMQP) | tickborg | Message queue for distributed CI | +| Poetry | meta | Python dependency management | +| Crowdin | meshmc | Translation management | + +--- + +## How Sub-Projects Relate + +The Project Tick ecosystem has clear dependency chains: + +``` +meshmc (application) +├── json4cpp (JSON parsing) +├── tomlplusplus (TOML configuration parsing) +├── libnbtplusplus (Minecraft NBT format) +├── neozip (compression) +├── cmark (Markdown rendering) +├── genqrcode (QR code generation) +├── forgewrapper (Forge mod loader integration) +└── meta (version metadata feeds) + +tickborg (CI system) +├── Detects changes across all sub-projects +├── Builds using native build systems +└── Posts results as GitHub check runs + +images4docker (container images) +├── Provides build environments for meshmc CI +└── Covers 40 Linux distributions with Qt 6 + +corebinutils (standalone) +└── Independent FreeBSD utility ports + +mnv (standalone) +└── Independent Vim fork + +cgit (standalone) +└── Independent Git web interface +``` + +MeshMC is the primary consumer of the library sub-projects. The `meta/` +component generates the version metadata that MeshMC uses to discover and +download Minecraft versions, mod loaders, and Java runtimes. The `forgewrapper/` +component is a Java shim that MeshMC invokes at runtime to bootstrap Forge. + +The `tickborg` system orchestrates CI across the entire monorepo, detecting +which sub-projects are affected by a given change and building only those +projects using their respective build systems. + +--- + +## Repository Governance + +Project Tick is maintained by its core contributors under the oversight of +Mehmet Samet Duman (trademark holder). 
The project uses: + +- **CODEOWNERS** for ownership-based review routing +- **DCO (Developer Certificate of Origin)** sign-off on every commit +- **CLA (Contributor License Agreement)** for all contributions +- **Conventional Commits** for structured commit messages +- **REUSE 3.0** for license compliance + +The Code of Conduct (Version 2, 15 February 2026) establishes behavioral and +ethical standards focused on technical integrity, licensing compliance, +infrastructure security, and good-faith collaboration. + +--- + +## Official Communication Channels + +| Channel | URL / Address | +|---------|---------------| +| GitHub Issues | `github.com/Project-Tick/Project-Tick/issues` | +| Email | `projecttick@projecttick.org` | +| Trademark inquiries | `yongdohyun@projecttick.org` | +| CLA text | `projecttick.org/licenses/PT-CLA-2.0.txt` | +| Crowdin (translations) | `crowdin.com/project/projtlauncher` | + +--- + +## Version History + +Project Tick evolved from several predecessor projects: + +1. **MultiMC** (2012–2022) — The original custom Minecraft launcher. Apache-2.0 + licensed. MeshMC's launcher code incorporates work from this project. + +2. **PolyMC / PrismLauncher** — Community forks of MultiMC. The `meta/` + component traces its lineage through these projects (MS-PL license). + +3. **ProjT Launcher** — Project Tick's first launcher, now archived in + `archived/projt-launcher/`. GPL-3.0-only. + +4. **MeshMC** — The current-generation launcher, a ground-up evolution with + C++23, Qt 6, and the full library stack. 
+ +The infrastructure components have diverse origins: + +- **tickborg** is adapted from NixOS's [ofborg](https://github.com/NixOS/ofborg) +- **neozip** is based on [zlib-ng](https://github.com/zlib-ng/zlib-ng) +- **json4cpp** is based on [nlohmann/json](https://github.com/nlohmann/json) +- **tomlplusplus** is based on [marzer/tomlplusplus](https://github.com/marzer/tomlplusplus) +- **cgit** is the Project Tick fork of the cgit Git web interface +- **cmark** is the Project Tick fork of the CommonMark reference implementation +- **mnv** is the Project Tick fork of Vim +- **corebinutils** contains FreeBSD utility ports + +--- + +## Platform Support + +### MeshMC (Primary Application) + +| Platform | Architecture | Status | +|----------|-------------|--------| +| Linux | x86_64 | Fully supported | +| Linux | aarch64 | Fully supported | +| macOS | x86_64 | Fully supported | +| macOS | aarch64 (Apple Silicon) | Fully supported | +| macOS | Universal Binary | Supported via `macos_universal` preset | +| Windows | x86_64 (MSVC) | Fully supported | +| Windows | x86_64 (MinGW) | Fully supported | +| Windows | aarch64 | Supported | +| FreeBSD | x86_64 | Supported (VM-based CI) | + +### MNV + +MNV runs on MS-Windows (7, 8, 10, 11), macOS, Haiku, VMS, and almost all +flavors of UNIX. + +### CoreBinUtils + +Targets Linux with musl-first toolchain selection. Also builds against glibc +when musl is unavailable. 
+ +### tickborg CI Platform Matrix + +| Platform | Runner | +|----------|--------| +| `x86_64-linux` | `ubuntu-latest` | +| `aarch64-linux` | `ubuntu-24.04-arm` | +| `x86_64-darwin` | `macos-15` | +| `aarch64-darwin` | `macos-15` | +| `x86_64-windows` | `windows-2025` | +| `aarch64-windows` | `windows-2025` | +| `x86_64-freebsd` | `ubuntu-latest` (VM) | + +--- + +## Quick Links + +| Resource | Path | +|----------|------| +| Root README | `README.md` | +| Contributing Guide | `CONTRIBUTING.md` | +| Security Policy | `SECURITY.md` | +| Code of Conduct | `CODE_OF_CONDUCT.md` | +| Trademark Policy | `TRADEMARK.md` | +| License Directory | `LICENSES/` | +| REUSE Configuration | `REUSE.toml` | +| MeshMC Build Guide | `meshmc/BUILD.md` | +| CI Infrastructure | `ci/README.md` | +| tickborg Documentation | `ofborg/README.md` | +| Developer Handbook | `docs/handbook/` | + +--- + +## Naming Conventions + +- **Project Tick** — The umbrella organization and monorepo name +- **MeshMC** — The Minecraft launcher application +- **MNV** — The text editor (Vi/Vim fork) +- **tickborg** — The distributed CI bot (the Rust workspace in `ofborg/tickborg/`) +- **ofborg** — The upstream project tickborg is derived from; also the directory name (`ofborg/`) +- **NeoZip** — The compression library (zlib-ng fork) +- **Json4C++** — The JSON library (nlohmann/json fork) +- **toml++** — The TOML library +- **libnbt++** — The NBT library +- **GenQRCode** — The QR code library (libqrencode fork) +- **ForgeWrapper** — The Forge bootstrapper +- **CoreBinUtils** — The BSD utility ports +- **Meta** — The launcher metadata generator +- **Images4Docker** — The Docker image collection +- **cgit** — The Git web interface + +The trademark "Project Tick" and associated branding are owned by Mehmet Samet +Duman. See `TRADEMARK.md` for usage policies. 
diff --git a/docs/handbook/Project-Tick/release-process.md b/docs/handbook/Project-Tick/release-process.md new file mode 100644 index 0000000000..3c292eaaae --- /dev/null +++ b/docs/handbook/Project-Tick/release-process.md @@ -0,0 +1,374 @@ +# Project Tick — Release Process + +## Overview + +Project Tick uses a per-component release methodology. Each sub-project +maintains its own version numbering, release cadence, and distribution +channels. Releases are triggered by Git tags and automated through GitHub +Actions workflows. + +--- + +## Versioning Schemes + +### Semantic Versioning (SemVer) + +Most sub-projects follow [Semantic Versioning 2.0.0](https://semver.org/): + +``` +MAJOR.MINOR.PATCH[-PRERELEASE][+BUILD] +``` + +| Component | Current Version | Source of Truth | +|-----------------|-----------------|-----------------------------------------| +| MeshMC | 7.0.0 | `meshmc/CMakeLists.txt` (project()) | +| meta | 0.0.5-1 | `meta/pyproject.toml` ([tool.poetry]) | +| neozip | — | `neozip/CMakeLists.txt` | +| json4cpp | — | `json4cpp/CMakeLists.txt` | +| tomlplusplus | — | `tomlplusplus/meson.build` | +| libnbtplusplus | — | `libnbtplusplus/CMakeLists.txt` | +| forgewrapper | — | `forgewrapper/gradle.properties` | +| cmark | — | `cmark/CMakeLists.txt` | +| genqrcode | — | `genqrcode/configure.ac` | +| tickborg | — | `ofborg/Cargo.toml` | + +### MeshMC Version Details + +MeshMC's version is defined in its root `CMakeLists.txt`: + +```cmake +project(MeshMC + VERSION 7.0.0 + DESCRIPTION "MeshMC — Custom Minecraft Launcher" + HOMEPAGE_URL "https://meshmc.org" + LANGUAGES CXX C +) +``` + +The version is decomposed into CMAKE variables and compiled into the binary +via `buildconfig/`: +- `MeshMC_VERSION_MAJOR` — 7 +- `MeshMC_VERSION_MINOR` — 0 +- `MeshMC_VERSION_PATCH` — 0 + +### meta Version Details + +The meta package uses Poetry versioning with a Debian-style suffix: + +```toml +[tool.poetry] +name = "meta" +version = "0.0.5-1" +``` + +--- + +## Branch Strategy + 
+### Branch Types + +| Branch Pattern | Purpose | Protected | CI Level | +|------------------|-------------------------------|-----------|-------------| +| `master` | Main development branch | Yes | Full | +| `release-*` | Release preparation branches | Yes | Full | +| `staging-*` | Integration testing branches | No | Partial | +| `feature-*` | Feature development | No | PR-only | +| `fix-*` | Bug fix branches | No | PR-only | + +### Branch Classification Logic + +Branch classification is implemented in `ci/supportedBranches.js`: + +```javascript +// Simplified representation of the classify() function: +function classify(branch) { + if (branch === 'master') + return { level: 'full', protected: true }; + if (branch.startsWith('release-')) + return { level: 'full', protected: true }; + if (branch.startsWith('staging-')) + return { level: 'partial', protected: false }; + return { level: 'pr-only', protected: false }; +} +``` + +Protected branches cannot receive direct pushes; all changes must go through +pull requests with passing CI. + +--- + +## Release Workflow + +### Phase 1 — Feature Freeze + +1. A release branch is created from `master`: + ```bash + git checkout -b release-7.1.0 master + ``` + +2. Only bug fixes and documentation updates are merged into the release branch. + +3. CI runs full validation on the release branch (same as `master`). + +### Phase 2 — Version Bump + +1. Update the version in the component's source of truth: + - **MeshMC**: Edit `meshmc/CMakeLists.txt` `project(VERSION ...)` + - **meta**: Edit `meta/pyproject.toml` `version = "..."` + - **neozip**: Edit `neozip/CMakeLists.txt` + - **Other CMake projects**: Edit the relevant `CMakeLists.txt` + - **Meson projects**: Edit `meson.build` + - **Gradle projects**: Edit `gradle.properties` + - **Cargo projects**: Edit `Cargo.toml` + +2. Update changelogs: + - MeshMC maintains `meshmc/changelog.md` + - Other components maintain changelogs in their directories + +3. 
Commit the version bump: + ```bash + git add -A + git commit -s -m "release: bump MeshMC to 7.1.0" + ``` + +### Phase 3 — Tagging + +Create an annotated Git tag: + +```bash +git tag -a v7.1.0 -m "MeshMC 7.1.0" +git push origin v7.1.0 +``` + +Tag naming conventions: +- **MeshMC**: `v<MAJOR>.<MINOR>.<PATCH>` (e.g., `v7.1.0`) +- **neozip**: `neozip-v<VERSION>` (e.g., `neozip-v2.2.3`) +- **json4cpp**: `json4cpp-v<VERSION>` +- **Other**: `<component>-v<VERSION>` + +### Phase 4 — Automated Build & Publish + +Pushing a tag triggers the corresponding release workflow: + +| Tag Pattern | Workflow | Artifacts | +|----------------------|-------------------------------|---------------------------------------| +| `v*` | `meshmc-release.yml` | Linux/macOS/Windows binaries | +| `v*` | `meshmc-publish.yml` | Flathub, AUR, packaging repos | +| `neozip-v*` | `neozip-release.yml` | Source archive, shared libraries | +| `json4cpp-v*` | (manual) | Updated single-header | +| `images4docker-v*` | `images4docker-build.yml` | Docker images to GHCR | + +--- + +## MeshMC Release Details + +### Build Matrix + +MeshMC releases build for all supported platforms: + +| Platform | Compiler | Qt Version | Output Format | +|-------------------|----------------|------------|----------------------| +| Linux (x86_64) | Clang 18+ | 6.x | AppImage, tar.gz | +| Linux (aarch64) | Clang 18+ | 6.x | AppImage, tar.gz | +| macOS (x86_64) | Apple Clang 16+| 6.x | .dmg, .app | +| macOS (aarch64) | Apple Clang 16+| 6.x | .dmg, .app | +| Windows (x86_64) | MSVC 17.10+ | 6.x | .msi, portable .zip | +| Windows (aarch64) | MSVC 17.10+ | 6.x | .msi, portable .zip | + +### Release Workflow Steps + +``` +meshmc-release.yml: + 1. Checkout code at tag + 2. Set up dependencies (via .github/actions/meshmc/setup-dependencies/) + 3. Configure with CMake presets (Release mode) + 4. Build + 5. Run tests (ctest) + 6. Package (via .github/actions/meshmc/package/) + 7. Create GitHub Release with artifacts + 8.
Upload checksums (SHA-256) +``` + +### Post-Release Publishing + +``` +meshmc-publish.yml: + 1. Download release artifacts + 2. Update Flathub manifest + 3. Update AUR PKGBUILD + 4. Update packaging repository + 5. Notify announcement channels +``` + +--- + +## neozip Release Details + +neozip releases produce: +- Source tarball (`neozip-<version>.tar.gz`) +- Pre-built shared libraries for major platforms +- CMake package configuration files + +The build matrix covers multiple architectures to validate SIMD +optimizations: +- x86_64 (SSE2, SSE4.2, AVX2, AVX-512) +- aarch64 (NEON, ARMv8 CRC) +- s390x (DFLTCC hardware deflate) +- ppc64le (VMX, VSX) +- riscv64 (RVV, ZBC) — when available + +--- + +## Docker Image Releases + +### images4docker Rebuild Cycle + +Docker images are rebuilt on a scheduled basis: + +```yaml +# .github/workflows/images4docker-build.yml +on: + schedule: + - cron: '17 3 * * *' # Daily at 03:17 UTC + push: + branches: [master] + paths: ['images4docker/**'] +``` + +Each push to `master` that modifies `images4docker/` also triggers a rebuild. +All 40 Dockerfiles are built and pushed to GitHub Container Registry (GHCR).
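
To sanity-check what the `17 3 * * *` schedule above means in practice, the short Python sketch below computes the next daily trigger time in UTC. The helper name `next_daily_run` is hypothetical and only models a fixed `minute hour * * *` entry. It does not account for any delay the CI platform may add before a scheduled job actually starts.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now, hour=3, minute=17):
    """Return the next UTC time matching a daily `minute hour * * *` cron entry."""
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_daily_run(datetime.now(timezone.utc)).isoformat())
```

A change pushed to `images4docker/` shortly after 03:17 UTC is still rebuilt promptly through the `push` trigger, so the schedule mainly matters for picking up upstream base-image updates.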
+ +--- + +## meta Release Details + +The meta package uses Poetry for releases: + +```bash +# Build distribution +poetry build + +# Publish to PyPI (if applicable) +poetry publish +``` + +meta also defines CLI scripts in `pyproject.toml`: + +```toml +[tool.poetry.scripts] +generate_fabric = "meta.run.generate_fabric:main" +generate_forge = "meta.run.generate_forge:main" +generate_mojang = "meta.run.generate_mojang:main" +generate_neoforge = "meta.run.generate_neoforge:main" +generate_quilt = "meta.run.generate_quilt:main" +generate_java = "meta.run.generate_java:main" +update_mojang = "meta.run.update_mojang:main" +``` + +--- + +## tickborg (ofborg) Release Details + +tickborg is deployed as a Docker container: + +```bash +# Build +docker build -t tickborg:latest ofborg/ + +# Deploy with docker-compose +cd ofborg && docker-compose up -d +``` + +The Rust workspace produces two binaries: +- `tickborg` — Main CI bot with AMQP integration +- `tickborg-simple-build` — Simplified builder for local testing + +Deployment is managed via `ofborg/DEPLOY.md` and `ofborg/service.nix` for +NixOS. + +--- + +## Hotfix Process + +For critical security fixes or regressions: + +1. **Branch from the release tag**: + ```bash + git checkout -b hotfix-7.0.1 v7.0.0 + ``` + +2. **Apply the minimal fix** — only the patch, no feature additions. + +3. **Bump the PATCH version**. + +4. **Tag and release**: + ```bash + git tag -a v7.0.1 -m "MeshMC 7.0.1 — Security hotfix" + git push origin v7.0.1 + ``` + +5. **Cherry-pick to master**: + ```bash + git checkout master + git cherry-pick <hotfix-commit-sha> + ``` + +--- + +## Pre-release / Development Builds + +Development builds are produced on every push to `master`: + +- MeshMC: `meshmc-ci.yml` produces nightly artifacts +- neozip: `neozip-ci.yml` attaches build artifacts +- Other components: CI produces artifacts accessible from workflow runs + +Pre-release builds are not tagged and are identified by commit SHA or +workflow run number.
+ +--- + +## Release Checklist + +### Before Tagging + +- [ ] All CI checks pass on the release branch +- [ ] Version number updated in source of truth +- [ ] Changelog updated with all changes since last release +- [ ] Security advisories addressed +- [ ] License compliance verified (`reuse lint` passes) +- [ ] Documentation updated for new features +- [ ] Breaking changes documented with migration guide + +### After Tagging + +- [ ] GitHub Release created with artifacts +- [ ] SHA-256 checksums published +- [ ] Package repositories updated (Flathub, AUR, etc.) +- [ ] Downstream dependencies notified +- [ ] Announcement published + +### Post-Release + +- [ ] Release branch merged back to `master` (if applicable) +- [ ] Confirm all distribution channels have the new version +- [ ] Monitor issue tracker for regression reports +- [ ] Begin next development cycle with version bump on `master` + +--- + +## Dependency Updates + +### Automated Updates + +- **Renovate**: Configured in `meta/renovate.json` for automated dependency + PRs in the meta sub-project +- **Nix flake**: `nix flake update` refreshes all Nix inputs +- **CI pinning**: `ci/update-pinned.sh` updates the pinned Nixpkgs revision + +### Manual Updates + +- **Git submodules**: `git submodule update --remote` for cgit's bundled Git +- **vcpkg**: Update `meshmc/vcpkg.json` and `meshmc/vcpkg-configuration.json` +- **Poetry lock**: `cd meta && poetry update` +- **Cargo lock**: `cd ofborg && cargo update` diff --git a/docs/handbook/Project-Tick/repository-structure.md b/docs/handbook/Project-Tick/repository-structure.md new file mode 100644 index 0000000000..44b64f0aa8 --- /dev/null +++ b/docs/handbook/Project-Tick/repository-structure.md @@ -0,0 +1,625 @@ +# Project Tick — Repository Structure + +## Overview + +The Project Tick monorepo contains all source code, libraries, applications, +infrastructure tooling, documentation, and archived projects under a single +Git repository. 
This document provides a complete map of every top-level +directory and significant file. + +--- + +## Root Directory + +``` +Project-Tick/ +│ +├── .envrc # direnv configuration (loads Nix shell) +├── .gitattributes # Git attribute rules (LFS, diff, merge) +├── .gitignore # Root-level ignore patterns +├── .gitmodules # Git submodule definitions +├── bootstrap.cmd # Windows bootstrap script (Scoop + vcpkg) +├── bootstrap.sh # Linux/macOS bootstrap script +├── CODE_OF_CONDUCT.md # Code of Conduct v2 (15 Feb 2026) +├── CONTRIBUTING.md # Contribution guidelines, CLA, DCO, AI policy +├── flake.lock # Nix flake lock file (pinned inputs) +├── flake.nix # Top-level Nix flake (LLVM 22 dev shell) +├── lefthook.yml # Git hooks config (REUSE lint, checkpatch) +├── README.md # Root README with project overview +├── REUSE.toml # REUSE 3.0 license annotations +├── SECURITY.md # Security vulnerability reporting +├── TRADEMARK.md # Trademark and brand policy +├── tree.txt # Static directory tree snapshot +│ +├── .github/ # GitHub configuration +├── archived/ # Deprecated sub-projects +├── cgit/ # cgit Git web interface +├── ci/ # CI infrastructure and tooling +├── cmark/ # cmark Markdown parser +├── corebinutils/ # BSD/FreeBSD core utilities +├── docs/ # Documentation +├── forgewrapper/ # ForgeWrapper Java shim +├── genqrcode/ # QR code encoding library +├── hooks/ # Git hook scripts +├── images4docker/ # Docker build environments +├── json4cpp/ # JSON library (nlohmann/json fork) +├── libnbtplusplus/ # NBT library +├── LICENSES/ # SPDX license texts +├── meshmc/ # MeshMC Minecraft launcher +├── meta/ # Metadata generator +├── mnv/ # MNV text editor (Vim fork) +├── neozip/ # Compression library (zlib-ng fork) +├── ofborg/ # tickborg CI bot +└── tomlplusplus/ # TOML library +``` + +--- + +## .github/ — GitHub Configuration + +``` +.github/ +├── CODEOWNERS # Review routing (all paths → @YongDo-Hyun) +├── dco.yml # DCO bot config (no remediation commits) +├── pull_request_template.md 
# PR template (sign-off & CLA reminder) +│ +├── ISSUE_TEMPLATE/ +│ ├── bug_report.yml # Structured bug report form +│ ├── config.yml # Issue template configuration +│ ├── rfc.yml # RFC (Request for Comments) template +│ └── suggestion.yml # Feature suggestion template +│ +├── actions/ # Reusable composite actions +│ ├── change-analysis/ # File change detection logic +│ ├── meshmc/ +│ │ ├── package/ # MeshMC packaging action +│ │ └── setup-dependencies/ # MeshMC dependency setup action +│ └── mnv/ +│ └── test_artefacts/ # MNV test artifacts action +│ +├── codeql/ # CodeQL analysis configuration +│ +└── workflows/ # 50+ GitHub Actions workflow files + ├── ci.yml # Monolithic CI orchestrator + ├── ci-lint.yml # Commit/format linting + ├── ci-schedule.yml # Scheduled CI jobs + ├── meshmc-*.yml # MeshMC workflows (8 files) + ├── neozip-*.yml # NeoZip workflows (12 files) + ├── json4cpp-*.yml # json4cpp workflows (7 files) + ├── cmark-*.yml # cmark workflows (2 files) + ├── tomlplusplus-*.yml # toml++ workflows (3 files) + ├── mnv-*.yml # MNV workflows (4 files) + ├── cgit-ci.yml # cgit CI + ├── corebinutils-ci.yml # CoreBinUtils CI + ├── forgewrapper-build.yml # ForgeWrapper CI + ├── libnbtplusplus-ci.yml # libnbt++ CI + ├── genqrcode-ci.yml # GenQRCode CI + ├── images4docker-build.yml # Docker image builds + └── repo-*.yml # Repository maintenance (4 files) +``` + +--- + +## meshmc/ — MeshMC Launcher + +The largest sub-project. A Qt 6 / C++23 custom Minecraft launcher. 
+ +``` +meshmc/ +├── .clang-format # C++ formatting rules +├── .clang-tidy # Static analysis configuration +├── .envrc # direnv configuration +├── .gitattributes # File attribute rules +├── .gitignore # Ignore patterns +├── .markdownlint.yaml # Markdown lint config +├── .markdownlintignore # Markdown lint ignore +├── BUILD.md # Comprehensive build guide +├── CMakeLists.txt # Root CMake configuration +├── CMakePresets.json # Platform-specific CMake presets +├── CONTRIBUTING.md # MeshMC-specific contribution guide +├── COPYING.md # License information +├── Containerfile # Container build file +├── README.md # MeshMC overview +├── REUSE.toml # License annotations +├── changelog.md # Version changelog +├── default.nix # Nix build expression +├── flake.lock # Nix flake lock +├── flake.nix # Nix flake +├── shell.nix # Nix development shell +├── vcpkg-configuration.json # vcpkg configuration +├── vcpkg.json # vcpkg dependencies +│ +├── branding/ # Icons, logos, splash screens +├── build/ # Build output directory +├── buildconfig/ # Compile-time configuration generation +├── cmake/ # Custom CMake modules +├── doc/ # MeshMC-specific documentation +├── install/ # Install output directory +├── LICENSES/ # MeshMC-specific license copies +├── nix/ # Nix packaging +├── scripts/ # Build and maintenance scripts +├── updater/ # Auto-update mechanism +│ +├── launcher/ # Main application source +│ ├── main.cpp # Entry point +│ ├── Application.cpp/.h # Application singleton +│ ├── CMakeLists.txt # Launcher CMake +│ ├── icons/ # Icon management +│ ├── java/ # Java runtime management +│ ├── launch/ # Game launch logic +│ ├── meta/ # Metadata handling +│ ├── minecraft/ # Minecraft-specific logic +│ ├── modplatform/ # Mod platform integration +│ ├── mojang/ # Mojang API integration +│ ├── net/ # Networking +│ ├── news/ # News feed +│ ├── notifications/ # User notifications +│ ├── resources/ # Qt resources +│ ├── screenshots/ # Screenshot management +│ ├── settings/ # Settings system +│ 
├── tasks/ # Async task framework +│ ├── testdata/ # Test fixtures +│ ├── tools/ # Tool integrations +│ ├── translations/ # i18n (Crowdin) +│ ├── ui/ # Qt UI (widgets, dialogs, themes) +│ └── updater/ # In-app updater +│ +└── libraries/ # Bundled library integrations +``` + +Key source files in `launcher/`: +- `Application.cpp` — Application lifecycle management +- `BaseInstance.cpp` — Minecraft instance abstraction +- `InstanceList.cpp` — Instance collection management +- `LaunchController.cpp` — Game launch orchestration +- `FileSystem.cpp` — Cross-platform file operations +- `Json.cpp` — JSON utilities (wrapping json4cpp) +- `GZip.cpp` — Compression utilities (wrapping zlib/neozip) + +--- + +## mnv/ — MNV Text Editor + +Vim fork with modern enhancements. + +``` +mnv/ +├── CMakeLists.txt # CMake build (alternative) +├── CMakePresets.json # CMake presets +├── configure # Autotools configure script +├── CONTRIBUTING.md # Contribution guide +├── COPYING.md # License +├── LICENSE # Vim license text +├── Makefile # Root Makefile +├── README.md # Overview +├── SECURITY.md # Security policy +│ +├── ci/ # MNV-specific CI scripts +├── cmake/ # CMake modules +├── lang/ # Language support files +├── nsis/ # Windows installer (NSIS) +├── pixmaps/ # Icons and graphics +├── runtime/ # Runtime files (docs, syntax, plugins, etc.) 
+├── src/ # C source code +├── tools/ # Development tools +└── build/ # Build output +``` + +--- + +## cgit/ — Git Web Interface + +``` +cgit/ +├── Makefile # Build system +├── cgit.c # Main CGI entry point +├── cgit.h # Core data structures +├── cgit.css # Default stylesheet +├── cgit.js # Client-side JavaScript +├── cgit.mk # Build configuration +├── cgitrc.5.txt # Man page source +├── COPYING # GPL-2.0 license +├── README # Build instructions +├── robots.txt # Default robots.txt +│ +├── cache.c/.h # Response caching +├── cmd.c/.h # Command dispatching +├── configfile.c/.h # Configuration parsing +├── filter.c # Content filtering (Lua) +├── html.c/.h # HTML output generation +├── parsing.c # Git object parsing +├── scan-tree.c/.h # Repository scanning +├── shared.c # Shared utilities +│ +├── ui-*.c/.h # UI modules: +│ ├── ui-atom # Atom feed +│ ├── ui-blame # File blame view +│ ├── ui-blob # File content view +│ ├── ui-clone # Clone URL display +│ ├── ui-commit # Commit view +│ ├── ui-diff # Diff view +│ ├── ui-log # Commit log +│ ├── ui-patch # Patch view +│ ├── ui-plain # Plain text view +│ ├── ui-refs # Reference listing +│ ├── ui-repolist # Repository listing +│ ├── ui-shared # Shared UI utilities +│ ├── ui-snapshot # Tarball/zip snapshots +│ ├── ui-ssdiff # Side-by-side diff +│ ├── ui-stats # Statistics +│ ├── ui-summary # Repository summary +│ ├── ui-tag # Tag view +│ └── ui-tree # Tree view +│ +├── contrib/ # Third-party contributions +├── filters/ # Content filter scripts +├── git/ # Bundled Git source (submodule) +└── tests/ # Test suite +``` + +--- + +## neozip/ — Compression Library + +zlib-ng fork with SIMD acceleration. 
+ +``` +neozip/ +├── CMakeLists.txt # CMake build +├── configure # Autotools-style configure +├── Makefile.in # Make template +├── FAQ.zlib # zlib FAQ +├── INDEX.md # File index +├── LICENSE.md # Zlib license +├── PORTING.md # Porting guide +├── README.md # Overview +│ +├── adler32.c # Adler-32 checksum +├── compress.c # compression wrapper +├── crc32.c # CRC-32 checksum +├── deflate.c # Deflate compression +├── deflate_fast.c # Fast deflate strategy +├── deflate_huff.c # Huffman-only strategy +├── deflate_medium.c # Medium deflate strategy +├── deflate_quick.c # Quick deflate strategy +├── deflate_rle.c # RLE deflate strategy +├── deflate_slow.c # Slow (best) deflate strategy +├── deflate_stored.c # Stored (no compression) +├── inflate.c # Inflate decompression +├── infback.c # Inflate back-stream +├── trees.c # Huffman tree construction +├── uncompr.c # Uncompress wrapper +├── gzlib.c # gzip file utilities +├── gzread.c # gzip read +├── gzwrite.c # gzip write +│ +├── arch/ # Architecture-specific SIMD code +│ ├── x86/ # SSE2, SSE4, AVX2, AVX512, PCLMULQDQ +│ ├── arm/ # NEON, ARMv8 CRC, PMULL +│ ├── power/ # VMX, VSX, Power8/9 +│ ├── s390/ # DFLTCC (hardware deflate) +│ ├── riscv/ # RVV, ZBC +│ └── loongarch/ # LSX, LASX +│ +├── cmake/ # CMake modules +├── doc/ # Documentation +├── test/ # Test suite +├── tools/ # Development tools +└── win32/ # Windows-specific files +``` + +--- + +## Libraries (Other) + +### json4cpp/ + +``` +json4cpp/ +├── CMakeLists.txt # CMake build +├── meson.build # Meson build (alternative) +├── BUILD.bazel # Bazel build (alternative) +├── MODULE.bazel # Bazel module +├── Package.swift # Swift Package Manager +├── Makefile # Convenience Makefile +├── LICENSE.MIT # MIT license +├── README.md # Comprehensive usage guide +│ +├── include/nlohmann/ # Public headers +├── single_include/ # Amalgamated single header +├── src/ # Implementation (for split build) +├── docs/ # MkDocs documentation +├── tests/ # Catch2 test suite +├── cmake/ # CMake 
modules +└── tools/ # Development tools +``` + +### tomlplusplus/ + +``` +tomlplusplus/ +├── meson.build # Primary build (Meson) +├── meson_options.txt # Meson options +├── CMakeLists.txt # CMake build (alternative) +├── LICENSE # MIT license +├── README.md # Overview and usage +├── toml.hpp # Single header include +│ +├── include/toml++/ # Multi-header includes +├── src/ # Implementation files +├── docs/ # Documentation +├── examples/ # Usage examples +├── tests/ # Test suite +├── toml-test/ # Official TOML test suite +├── fuzzing/ # Fuzz testing +└── tools/ # Development tools +``` + +### libnbtplusplus/ + +``` +libnbtplusplus/ +├── CMakeLists.txt # CMake build +├── COPYING # LGPL-3.0 full text +├── COPYING.LESSER # LGPL-3.0 lesser clause +├── README.md # Build guide +│ +├── include/ # Public headers (nbt::tag_* types) +├── src/ # Implementation +└── test/ # Test suite +``` + +### genqrcode/ + +``` +genqrcode/ +├── CMakeLists.txt # CMake build +├── configure.ac # Autotools configuration +├── autogen.sh # Autotools bootstrap +├── Makefile.am # Autotools Makefile +├── COPYING # LGPL-2.1 license +├── README.md # Overview +│ +├── qrencode.c/.h # Main encoding API +├── qrinput.c/.h # Input processing +├── qrspec.c/.h # QR specification tables +├── bitstream.c/.h # Bit stream utilities +├── mask.c/.h # Masking patterns +├── rsecc.c/.h # Reed-Solomon error correction +├── split.c/.h # Data splitting +├── qrenc.c # CLI tool +│ +├── cmake/ # CMake modules +├── tests/ # Test suite +└── use/ # Usage examples +``` + +### forgewrapper/ + +``` +forgewrapper/ +├── build.gradle # Gradle build script +├── settings.gradle # Gradle settings +├── gradle.properties # Build properties +├── gradlew # Unix Gradle wrapper +├── gradlew.bat # Windows Gradle wrapper +├── LICENSE # MIT license +├── README.md # Usage guide +│ +├── gradle/ # Gradle wrapper JAR +├── jigsaw/ # JPMS module configuration +└── src/ + └── main/java/ # Java source + └── io/github/zekerzhayard/forgewrapper/ + └── 
installer/ + └── detector/ + └── IFileDetector.java # SPI interface +``` + +--- + +## Infrastructure + +### meta/ — Metadata Generator + +``` +meta/ +├── pyproject.toml # Poetry project configuration +├── poetry.lock # Locked Python dependencies +├── requirements.txt # pip requirements (alternative) +├── README.md # Deployment guide +├── COPYING / LICENSE # MS-PL license +├── config.sh / config.example.sh # Shell configuration +├── init.sh # Initialization script +├── update.sh # Update script +├── flake.nix / flake.lock # Nix flake +├── garnix.yaml # Garnix CI configuration +├── renovate.json # Renovate dependency updates +│ +├── meta/ # Python package +│ └── run/ # CLI entry points +│ ├── generate_fabric.py +│ ├── generate_forge.py +│ ├── generate_mojang.py +│ ├── generate_neoforge.py +│ ├── generate_quilt.py +│ ├── generate_java.py +│ └── ... +│ +├── cache/ / caches/ # Cached upstream data +├── launcher/ # Launcher configuration +├── public/ # Generated output +├── upstream/ # Upstream source data +├── fuzz/ # Fuzz testing +├── nix/ # Nix packaging +└── venv/ # Python virtual environment +``` + +### ofborg/ — tickborg CI Bot + +``` +ofborg/ +├── Cargo.toml # Workspace root +├── Cargo.lock # Locked Rust dependencies +├── Dockerfile # Container build +├── docker-compose.yml # Multi-container deployment +├── DEPLOY.md # Deployment guide +├── README.md # Overview and bot commands +├── LICENSE # MIT license +├── default.nix # Nix build +├── flake.nix / flake.lock # Nix flake +├── shell.nix # Development shell +├── service.nix # NixOS service module +├── config.production.json # Production config +├── config.public.json # Public config +├── example.config.json # Example config +│ +├── tickborg/ # Main CI bot (Rust crate) +│ ├── Cargo.toml +│ └── src/ +│ +├── tickborg-simple-build/ # Simplified builder (Rust crate) +│ ├── Cargo.toml +│ └── src/ +│ +├── ofborg/ # Upstream ofborg (reference) +├── ofborg-simple-build/ # Upstream simple-build +├── ofborg-viewer/ # Build 
status viewer +│ +├── deploy/ # Deployment scripts +├── doc/ # Documentation +└── target/ # Cargo build output +``` + +### images4docker/ + +``` +images4docker/ +├── README.md # Overview and workflow docs +├── LICENSE # GPL-3.0 license +│ +├── dockerfiles/ # 40 Dockerfile-per-distro files +│ ├── debian-12.Dockerfile +│ ├── ubuntu-24.04.Dockerfile +│ ├── fedora-41.Dockerfile +│ ├── alpine-3.20.Dockerfile +│ └── ... (36 more) +│ +└── LICENSES/ # License copies +``` + +### ci/ — CI Infrastructure + +``` +ci/ +├── OWNERS # CI code ownership +├── README.md # CI documentation +├── default.nix # Nix CI entry (treefmt, validator) +├── pinned.json # Pinned Nixpkgs revision + hash +├── supportedBranches.js # Branch classification logic +├── update-pinned.sh # Update pinned.json +│ +├── codeowners-validator/ # CODEOWNERS validation tool +│ ├── default.nix +│ ├── owners-file-name.patch +│ └── permissions.patch +│ +└── github-script/ # GitHub Actions JS helpers + ├── run # CLI entry point + ├── lint-commits.js # Conventional Commits linter + ├── prepare.js # PR preparation + ├── reviews.js # Review state management + ├── get-pr-commit-details.js + ├── withRateLimit.js # API rate limiting + ├── package.json # npm dependencies + └── shell.nix # Nix dev shell +``` + +--- + +## LICENSES/ — License Texts + +``` +LICENSES/ +├── Apache-2.0.txt +├── BSD-1-Clause.txt +├── BSD-2-Clause.txt +├── BSD-3-Clause.txt +├── BSD-4-Clause.txt +├── BSL-1.0.txt +├── CC-BY-SA-4.0.txt +├── CC0-1.0.txt +├── GPL-2.0-only.txt +├── GPL-3.0-only.txt +├── GPL-3.0-or-later.txt +├── LGPL-2.0-or-later.txt +├── LGPL-2.1-or-later.txt +├── LGPL-3.0-or-later.txt +├── LicenseRef-Qt-Commercial.txt +├── MIT.txt +├── MS-PL.txt +├── Unlicense.txt +├── Vim.txt +└── Zlib.txt +``` + +20 SPDX-compliant license texts covering all sub-projects and their +dependencies. 
+ +--- + +## corebinutils/ — BSD Utilities + +``` +corebinutils/ +├── config.mk # Toolchain configuration +├── configure # Toolchain detection script +├── GNUmakefile # Top-level orchestrator +├── README.md # Build instructions +│ +├── build/ # Shared build infrastructure +├── contrib/ # Contributed utilities +│ +├── cat/ ├── chmod/ ├── cp/ +├── chflags/ ├── cpuset/ ├── csh/ +├── date/ ├── dd/ ├── df/ +├── domainname/ ├── echo/ ├── ed/ +├── expr/ ├── freebsd-version/ ├── getfacl/ +├── hostname/ ├── kill/ ├── ln/ +├── ls/ ├── mkdir/ ├── mv/ +├── nproc/ ├── pax/ ├── pkill/ +├── ps/ ├── pwait/ ├── pwd/ +├── realpath/ ├── rm/ ├── rmail/ +├── rmdir/ ├── setfacl/ ├── sh/ +├── sleep/ ├── stty/ ├── sync/ +├── test/ ├── timeout/ └── uuidgen/ +``` + +Each utility subdirectory contains its own `GNUmakefile` and source files. + +--- + +## docs/ — Documentation + +``` +docs/ +└── handbook/ # Developer handbook + ├── Project-Tick/ # Organization-level docs (this directory) + └── [per-project]/ # Per-sub-project documentation +``` + +--- + +## archived/ — Deprecated Projects + +``` +archived/ +├── projt-launcher/ # Original launcher (GPL-3.0-only) +├── projt-modpack/ # Modpack tooling (GPL-3.0-only) +├── projt-minicraft-modpack/ # Minicraft modpack (MIT) +└── ptlibzippy/ # ZIP library (Zlib) +``` + +These projects are kept for historical reference but are no longer actively +maintained. MeshMC supersedes projt-launcher, and neozip supersedes +ptlibzippy. diff --git a/docs/handbook/Project-Tick/security-policy.md b/docs/handbook/Project-Tick/security-policy.md new file mode 100644 index 0000000000..f81028978d --- /dev/null +++ b/docs/handbook/Project-Tick/security-policy.md @@ -0,0 +1,282 @@ +# Project Tick — Security Policy + +## Overview + +Project Tick takes the security of its software ecosystem seriously. This +document describes how to report security vulnerabilities, the disclosure +process, and the security practices applied across the monorepo. 
+ +Given that Project Tick includes components ranging from compression libraries +(NeoZip) to a full application (MeshMC) to CI infrastructure (tickborg), a +vulnerability in any sub-project could have cascading effects. The project +maintains a unified security posture across all components. + +--- + +## Reporting Vulnerabilities + +### How to Report + +If you discover a security vulnerability in any Project Tick component, report +it via email: + +**[projecttick@projecttick.org](mailto:projecttick@projecttick.org)** + +### Do NOT + +- Open a public GitHub issue for security vulnerabilities +- Post vulnerability details on Discord or social media +- Publish exploit code before the issue is resolved + +### What to Include + +When submitting a security report, include as much of the following as +possible: + +| Field | Description | +|-------|-------------| +| **Affected component** | Which sub-project (e.g., meshmc, neozip, libnbtplusplus) | +| **Affected versions** | Version numbers or commit hashes | +| **Steps to reproduce** | Detailed reproduction steps | +| **Expected behavior** | What should happen | +| **Actual behavior** | What actually happens (crash, data leak, etc.) | +| **Impact assessment** | Your assessment of severity and exploitability | +| **Logs or crash reports** | Stack traces, core dumps, error messages | +| **Proof of concept** | Minimal reproducer (if available) | +| **Suggested fix** | If you have one | + +### Example Report + +``` +Subject: [SECURITY] Buffer overflow in NeoZip inflate + +Affected component: neozip +Affected versions: All versions based on zlib-ng 2.x + +Steps to reproduce: +1. Create a specially crafted gzip stream with [details] +2. Call inflate() with the crafted input +3. Observe buffer overflow at inflate.c:42 + +Impact: Remote code execution via crafted compressed data.
+Severity: Critical (CVSS 9.8) + +PoC: Attached file crash_input.gz + +Suggested fix: Add bounds check at inflate.c:42 before +memcpy call. +``` + +--- + +## Disclosure Process + +### Timeline + +Project Tick follows a responsible disclosure process: + +1. **Acknowledgment** — You will receive an acknowledgment within **48 hours** + of your report. + +2. **Triage** — The security team assesses severity and impact within + **7 days**. + +3. **Fix development** — A fix is developed privately. Timeline depends on + severity: + - **Critical (CVSS 9.0+):** Fix within **7 days** + - **High (CVSS 7.0–8.9):** Fix within **14 days** + - **Medium (CVSS 4.0–6.9):** Fix within **30 days** + - **Low (CVSS 0.1–3.9):** Fix within **90 days** + +4. **Coordinated disclosure** — The fix is released, and the vulnerability is + disclosed publicly. Credit is given to the reporter (unless anonymity is + requested). + +5. **Advisory publication** — A security advisory is published on GitHub with + the CVE ID (if assigned).
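
The severity tiers in step 3 amount to a small lookup from CVSS base score to fix window. The sketch below illustrates that mapping and is not project tooling: the `fix_window_days` name is hypothetical, and rejecting scores outside 0.1 to 10.0 is an assumption added for illustration.

```shell
# Hypothetical helper: map a CVSS base score to the fix window in days,
# using the tiers from the timeline. Prints nothing and returns 1 for
# input that is not a number between 0.1 and 10.0.
fix_window_days() {
    awk -v score="$1" 'BEGIN {
        if (score !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
        s = score + 0
        if (s > 10.0 || s < 0.1) exit 1
        if (s >= 9.0)      print 7    # Critical
        else if (s >= 7.0) print 14   # High
        else if (s >= 4.0) print 30   # Medium
        else               print 90   # Low
    }'
}
```

For the example report above, `fix_window_days 9.8` falls in the Critical tier and prints `7`.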
+ +### Embargo + +During the fix development period: + +- Details of the vulnerability are kept confidential +- Only the core maintainers and the reporter have access +- Pre-disclosure to downstream distributors may occur for critical issues +- The reporter is asked not to disclose until the fix is released + +--- + +## Supported Components + +### Security-Critical Components + +The following components handle untrusted input and are considered +security-critical: + +| Component | Risk Area | Threat Model | +|-----------|-----------|--------------| +| **neozip** | Compression/decompression | Crafted compressed streams (e.g., zip bombs, buffer overflows) | +| **libnbtplusplus** | Binary data parsing | Malicious NBT files from untrusted sources | +| **json4cpp** | JSON parsing | Crafted JSON input (e.g., deeply nested objects, huge numbers) | +| **tomlplusplus** | TOML parsing | Crafted TOML configuration files | +| **cmark** | Markdown parsing | Crafted Markdown (e.g., pathological regex, huge nesting) | +| **genqrcode** | QR code encoding | Crafted encoding input | +| **meshmc** | Application | Network input (OAuth, HTTP APIs), file parsing, mod loading | +| **forgewrapper** | Java runtime | Classpath manipulation, installer extraction | +| **cgit** | Web interface | HTTP request handling, repository traversal | +| **mnv** | Text editor | Modeline parsing, file format handling | +| **corebinutils** | System utilities | Command-line input, file operations | +| **tickborg** | CI bot | AMQP messages, GitHub API responses | +| **meta** | Metadata generation | Upstream API responses (Mojang, Forge, etc.) 
| + +### Fuzz Testing Coverage + +Several sub-projects maintain active fuzz testing: + +| Component | Fuzz Infrastructure | CI Workflow | +|-----------|-------------------|-------------| +| neozip | OSS-Fuzz, custom fuzzers | `neozip-fuzz.yml` | +| json4cpp | OSS-Fuzz, custom fuzzers | `json4cpp-fuzz.yml` | +| cmark | Custom fuzzers in `fuzz/` | `cmark-fuzz.yml` | +| tomlplusplus | Custom fuzzers in `fuzzing/` | `tomlplusplus-fuzz.yml` | + +### Static Analysis Coverage + +| Component | Tool | CI Workflow | +|-----------|------|-------------| +| meshmc | CodeQL | `meshmc-codeql.yml` | +| mnv | CodeQL, Coverity | `mnv-codeql.yml`, `mnv-coverity.yml` | +| neozip | CodeQL | `neozip-codeql.yml` | +| json4cpp | Semgrep, Flawfinder | `json4cpp-semgrep.yml`, `json4cpp-flawfinder.yml` | + +--- + +## Security Practices + +### Compiler Hardening + +MeshMC's build system enables several hardening flags: + +```cmake +# Stack protection +-fstack-protector-strong --param=ssp-buffer-size=4 + +# Buffer overflow detection +-O3 -D_FORTIFY_SOURCE=2 + +# Comprehensive warnings +-Wall -pedantic + +# Position-independent code (ASLR support) +CMAKE_POSITION_INDEPENDENT_CODE ON +``` + +### Supply Chain Security + +1. **Pinned Dependencies** + - Nix inputs are content-addressed and locked in `flake.lock` + - CI Nixpkgs revision is pinned in `ci/pinned.json` with SHA256 hashes + - GitHub Actions use SHA-pinned action references + +2. **Runner Hardening** + - CI workflows use `step-security/harden-runner` with egress auditing + - `repo-scorecards.yml` tracks OpenSSF Scorecard compliance + - `repo-dependency-review.yml` scans dependency changes for known + vulnerabilities + +3. **Code Signing** + - Release artifacts are signed + - Git commits can be GPG/SSH signed (recommended but not required) + +4. **CODEOWNERS Enforcement** + - The `codeowners-validator` tool (built from source in `ci/`) validates + the `CODEOWNERS` file to ensure all paths have designated reviewers + +5. 
**GitHub Actions Security** + - `zizmor` scans workflows for security issues + - `actionlint` validates workflow syntax + - Minimal permissions (`contents: read` by default) + +### Network Security (MeshMC) + +MeshMC handles network operations for: +- OAuth2 authentication (Microsoft account login via Qt6 NetworkAuth) +- HTTP APIs (Mojang, Forge, Fabric, Quilt, Modrinth, CurseForge) +- File downloads (game assets, mods, Java runtimes) + +Security measures: +- TLS/HTTPS enforced for all network connections +- Certificate validation via Qt's SSL stack +- Download integrity verification (SHA-1, SHA-256 checksums) +- No execution of downloaded code without user consent + +### Infrastructure Security + +The Code of Conduct (Section 4.2) explicitly prohibits: + +- Intentional submission of malicious code +- Supply-chain compromise attempts +- Infrastructure abuse, including CI/CD exploitation or service disruption +- License violations or intentional misattribution + +Violations are treated as serious misconduct and may result in immediate +and permanent bans. + +--- + +## Vulnerability History + +Security advisories are published on the GitHub repository's Security tab: + +``` +https://github.com/Project-Tick/Project-Tick/security/advisories +``` + +--- + +## Third-Party Component Security + +Since Project Tick includes forks of upstream projects (zlib-ng, nlohmann/json, +toml++, libqrencode, Vim, cgit, ofborg), security vulnerabilities in upstream +projects may affect Project Tick. 
+ +### Monitoring + +- Upstream security advisories are monitored +- Dependabot alerts are enabled for Cargo, npm, and pip dependencies +- The `repo-dependency-review.yml` workflow checks for known vulnerabilities + in dependency changes + +### Patching Policy + +- **Critical upstream vulnerabilities** — Patches are applied within 48 hours + and backported to all supported release branches +- **High upstream vulnerabilities** — Patches applied within 7 days +- **Other upstream vulnerabilities** — Incorporated in the next regular sync + +### Upstream Tracking + +| Component | Upstream | Tracking | +|-----------|----------|----------| +| neozip | zlib-ng/zlib-ng | GitHub releases, OSS-Fuzz | +| json4cpp | nlohmann/json | GitHub releases, OSS-Fuzz | +| tomlplusplus | marzer/tomlplusplus | GitHub releases | +| cmark | commonmark/cmark | GitHub releases | +| genqrcode | fukuchi/libqrencode | GitHub releases | +| mnv | vim/vim | GitHub security advisories | +| cgit | zx2c4/cgit | Mailing list | +| ofborg/tickborg | NixOS/ofborg | GitHub releases | + +--- + +## Contact + +For security-related inquiries: + +| Channel | Address | +|---------|---------| +| Security reports | [projecttick@projecttick.org](mailto:projecttick@projecttick.org) | +| General inquiries | [projecttick@projecttick.org](mailto:projecttick@projecttick.org) | +| Trademark | [yongdohyun@projecttick.org](mailto:yongdohyun@projecttick.org) | + +**Do not use GitHub issues for security reports.** diff --git a/docs/handbook/Project-Tick/trademark-policy.md b/docs/handbook/Project-Tick/trademark-policy.md new file mode 100644 index 0000000000..62e56eb7b5 --- /dev/null +++ b/docs/handbook/Project-Tick/trademark-policy.md @@ -0,0 +1,283 @@ +# Project Tick — Trademark Policy + +## Overview + +This document summarizes the Project Tick trademark and brand policy as defined +in `TRADEMARK.md` at the repository root. The trademarks are separate from the +open source licenses that govern the source code. 
+ +--- + +## Trademark Ownership + +The following marks are owned by **Mehmet Samet Duman**: + +- **Project Tick™** — The project name +- **Project Tick logo** — The project visual identity +- All related branding elements + +Collectively, these are referred to as the "Marks." + +All rights in the Marks are reserved. + +--- + +## Relationship to Open Source Licenses + +This is the most important distinction to understand: + +**Open source licenses do NOT grant trademark rights.** + +Each repository under the Project Tick namespace is licensed under its +respective open source license (MIT, BSD, GPL, MS-PL, etc.). These licenses +govern use, modification, and redistribution of **source code only**. + +Open source licenses specifically **do not** grant: + +- Rights to use the Project Tick name +- Rights to use the Project Tick logo +- Rights to use Project Tick branding or trade dress +- Rights to imply affiliation, endorsement, sponsorship, or official status + +Trademark rights are legally separate from copyright licenses. + +--- + +## Permitted Uses + +The following uses are generally permitted **without** prior written permission: + +### 1. Factual References + +You may make factual references to Project Tick: + +> "This software is compatible with Project Tick." + +### 2. Unmodified Official Releases + +You may accurately describe unmodified official releases: + +> "This package contains Project Tick MeshMC version 7.0.0." + +### 3. Non-Commercial Commentary + +Non-commercial commentary, research, educational, and journalistic references +are permitted: + +> "In our analysis of open-source Minecraft launchers, Project Tick's MeshMC +> demonstrated strong performance." 
+ +### Conditions for Permitted Use + +Even permitted uses must not: + +- Create confusion regarding the origin of software +- Suggest sponsorship, approval, or endorsement by Project Tick +- Present modified versions as official releases + +--- + +## Modified and Redistributed Versions + +Open source licenses permit modification and redistribution of source code. +However, trademark restrictions apply to how modified versions are named and +presented. + +### Requirements for Forks and Derivatives + +| Requirement | Details | +|------------|---------| +| Must not use Project Tick name/logo as if official | Forks must use distinct branding | +| Must clearly indicate modification | Derivative works must state they are modified | +| Must not use "Official," "Certified," etc. | Unless explicitly authorized | + +### Examples + +**Permissible:** + +> "Based on Project Tick" +> +> "MyLauncher — derived from Project Tick MeshMC" + +**Impermissible (without authorization):** + +> "Official Project Tick Build" +> +> "Project Tick Certified Edition" +> +> "Project Tick Pro" + +--- + +## Commercial Use + +The Marks may **not** be used in the following commercial contexts without +prior written permission: + +| Context | Example | +|---------|---------| +| Product name | "Project Tick Hosting Service" | +| Company name | "Project Tick Solutions LLC" | +| SaaS service name | "Project Tick Cloud" | +| Domain name | `projecttick-hosting.com` | +| Paid advertising | Google Ads using "Project Tick" | +| Promotional materials | Brochures featuring the logo | + +### SaaS and Hosted Services + +Operating a commercial service using Project Tick source code **does not** +grant the right to represent that service as an official Project Tick service. + +Only services directly operated by Mehmet Samet Duman under the Project Tick +identity may use the Marks in a commercial context. 
+ +--- + +## Official Releases + +An "Official Project Tick Release" must meet **all** of the following criteria: + +1. Built and distributed by the Project Tick maintainers +2. Published through official communication channels +3. Identified by official release tags or signatures + +Modified builds, even if fully compliant with the applicable open source +license, **must not** be presented as official releases. + +--- + +## Logo Usage + +The Project Tick logo is protected by both copyright and trademark law. + +### Prohibited Modifications + +The logo may **not** be: + +| Action | Status | +|--------|--------| +| Modified | Prohibited | +| Recolored | Prohibited | +| Combined with other marks | Prohibited | +| Used for commercial services | Prohibited | +| Embedded in derivative branding | Prohibited | +| Used as a favicon for unofficial sites | Prohibited | + +Written authorization is required for any logo use beyond factual reference. + +### Creative Commons and Trademark + +If the logo is licensed under a Creative Commons license (e.g., CC BY-NC-ND), +that license applies within its stated scope but **does not waive trademark +protections**. The CC license governs copyright only; trademark restrictions +remain in full force. + +--- + +## Domain Names and Corporate Identifiers + +The Marks may **not** be used in: + +| Context | Prohibited Without Permission | +|---------|------------------------------| +| Domain names | `projecttick.io`, `meshmc-official.com` | +| Social media handles | `@projecttick`, `@meshmc_official` | +| Corporate names | "Project Tick Inc." | +| Registered business identifiers | EIN/tax registration using the name | + +--- + +## Prohibited Uses + +The following uses are **strictly prohibited** regardless of context: + +1. **Implying endorsement or affiliation** with Project Tick when none exists +2. **Misrepresenting unofficial builds** as official releases +3. **Using the Marks in a misleading or deceptive manner** +4. 
**Using the Marks in ways that damage reputation or goodwill** +5. **Registering confusingly similar names** (trademarks, domains, handles) + +--- + +## Enforcement + +### Reservation of Rights + +All rights not expressly granted in the TRADEMARK.md policy are reserved. +Failure to enforce any provision does **not** constitute a waiver of rights. + +Project Tick reserves the right to update the trademark policy at any time. + +### What Happens If You Violate the Policy + +1. You may receive a cease-and-desist notice +2. You may be asked to rename your project/service/domain +3. Legal action may be pursued for willful infringement +4. Pull request and issue access may be revoked + +--- + +## Practical Guidance for Common Scenarios + +### Scenario: Creating a Fork + +You may fork the source code under the applicable open source license, but: + +- Choose a **new name** for your fork (not containing "Project Tick" or "MeshMC") +- Create **new branding** (logo, icons, splash screens) +- Clearly state: "Based on Project Tick" or "Derived from MeshMC" +- Do not use "official," "certified," "authorized," or similar terms + +### Scenario: Writing About Project Tick + +You may write about Project Tick in articles, blog posts, academic papers, and +reviews. You may use the name "Project Tick" in factual context. You may +include screenshots. Do not imply endorsement. + +### Scenario: Packaging for a Linux Distribution + +Distribution packagers may use the Project Tick name for unmodified source +packages built from official release tarballs. If patches are applied that +materially change behavior, the package description should note that it +contains modifications. + +### Scenario: Hosting a Mirror + +You may host a source code mirror. You should not use the Marks in the mirror's +domain name without permission. The mirror description should clearly indicate +it is an unofficial mirror. 
+ +### Scenario: Creating a Plugin or Mod + +You may create plugins, mods, or extensions for MeshMC. You may refer to +MeshMC compatibility. You must not name your project in a way that suggests +it is an official Project Tick product. + +--- + +## Contact + +For trademark permission requests or questions: + +**[yongdohyun@projecttick.org](mailto:yongdohyun@projecttick.org)** + +For general project inquiries: + +**[projecttick@projecttick.org](mailto:projecttick@projecttick.org)** + +--- + +## Summary Table + +| Use Case | Allowed? | Condition | +|----------|----------|-----------| +| Factual reference | Yes | Must be accurate | +| Describing unmodified official releases | Yes | Must be unmodified | +| Non-commercial research/education | Yes | No endorsement implied | +| Fork with Project Tick branding | No | Must rebrand | +| Fork with "Based on" attribution | Yes | Clear distinction | +| Commercial product name | No | Requires written permission | +| Domain name with "projecttick" | No | Requires written permission | +| Logo in derivative branding | No | Requires written permission | +| Blog post mentioning Project Tick | Yes | No endorsement implied | +| Linux distro package | Yes | If from official source | diff --git a/docs/handbook/archived/overview.md b/docs/handbook/archived/overview.md new file mode 100644 index 0000000000..c6d066c8d3 --- /dev/null +++ b/docs/handbook/archived/overview.md @@ -0,0 +1,275 @@ +# Archived Projects — Overview + +## Purpose + +The `archived/` directory contains legacy Project Tick projects that are no longer +actively developed. These projects remain in the monorepo for historical reference, +documentation completeness, and potential future reuse of components. + +Archived projects are not built, tested, or deployed by the current CI pipeline. +They are preserved as-is at the time of archival. 
+ +--- + +## Archived Projects + +| Directory | Project Name | Type | License | Status | +|---------------------------|---------------------|------------------------|----------|-------------| +| `archived/projt-launcher/` | ProjT Launcher | Minecraft Launcher (C++/Qt) | GPL-3.0 | Archived | +| `archived/projt-modpack/` | ProjT Modpack | Minecraft Modpack | GPL-3.0 | Archived | +| `archived/projt-minicraft-modpack/` | MiniCraft Modpack | Minecraft Modpack Collection | MIT | Archived | +| `archived/ptlibzippy/` | PTlibzippy | Compression Library (C)| zlib License | Archived | + +--- + +## Why Projects Are Archived + +Projects are moved to `archived/` when they meet one or more of these criteria: + +1. **Superseded by a newer project** — The functionality has been replaced by a different + component in the monorepo (e.g., ProjT Launcher was the standalone launcher before + MeshMC took over as the primary launcher) +2. **No longer maintained** — The project has reached end-of-life and no further + development is planned +3. **Completed scope** — The project achieved its intended purpose and doesn't need + ongoing changes (e.g., modpack archives) +4. **Consolidation** — Standalone repositories were merged into the monorepo as + subtrees, and the project's active development has ended + +--- + +## Project Summaries + +### ProjT Launcher (`archived/projt-launcher/`) + +ProjT Launcher was a structurally disciplined Minecraft launcher fork of Prism Launcher. +It was engineered for long-term maintainability, architectural clarity, and controlled +ecosystem evolution. 
+ +**Key characteristics**: +- Written in C++23 with Qt 6 +- CMake build system with presets for Linux, macOS, Windows (MSVC and MinGW) +- Layered architecture: UI (Qt Widgets) → Core/Domain → Tasks → Networking +- Detached fork libraries: zlib, bzip2, quazip, cmark, tomlplusplus, libqrencode, libnbtplusplus +- Nix-based CI and reproducible builds +- Containerized build support (Dockerfile/Containerfile) +- Comprehensive documentation in `docs/` and `docs/handbook/` + +**Notable features at time of archival**: +- Launcher Hub (web-based dashboard using CEF on Linux, native on Windows/macOS) +- Modrinth collection import +- Fabric/Quilt/NeoForge mod loader support +- Java runtime auto-detection and management +- Multi-platform packaging: RPM, DEB, AppImage, Flatpak, macOS App Bundle, Windows MSI + +**Last known version**: 0.0.5-1 (draft) + +**License heritage**: GPL-3.0, with upstream license blocks from Prism Launcher +(GPL-3.0), PolyMC (GPL-3.0), and MultiMC (Apache-2.0). + +For full documentation, see [projt-launcher.md](projt-launcher.md). + +--- + +### ProjT Modpack (`archived/projt-modpack/`) + +ProjT Modpack was a Minecraft modpack curated by Project Tick. The project contained +modpack configuration files and promotional assets. + +**Key characteristics**: +- Licensed under GPL-3.0 +- Contains promotional images (ProjT1.png, ProjT2.png, ProjT3.png) +- Affiliate banner assets (affiliate-banner-bg.webp, affiliate-banner-fg.webp) +- Minimal README — the modpack itself was distributed through launcher platforms + +**Status**: Archived with no active maintenance. The modpack distribution was +handled through the ProjT Launcher and mod platform integrations (Modrinth, +CurseForge). + +For full documentation, see [projt-modpack.md](projt-modpack.md). + +--- + +### MiniCraft Modpack (`archived/projt-minicraft-modpack/`) + +The MiniCraft Modpack is a historical archive of Minecraft modpack releases +organized into multiple "seasons" (S1 through S4). 
This is a collection of +pre-built modpack ZIP files rather than a source code project. + +**Key characteristics**: +- Licensed under MIT +- Organized by season: + - **MiniCraft S1**: Versions from 12.1.5 through 13.0.0, including beta/alpha/pre-release builds + - **MiniCraft S2**: Versions with mixed naming (A00051c74C, L3.0, R10056a75A, N1.0, N2.0) + - **MiniCraft S3**: Versions from 1.0 through 1.2.0.3, plus a DEV-1.2 build + - **MiniCraft S4**: Alpha versions (0.0.1–0.0.3), Beta versions (0.1–0.2.1), and + releases including a "LASTMAJORRELEASE-2.0.0" +- Contains compiled ZIP archives, not source code + +**Archive purpose**: Preserves the complete release history of the MiniCraft +modpack series for historical reference. + +--- + +### PTlibzippy (`archived/ptlibzippy/`) + +PTlibzippy is a Project Tick fork of the zlib compression library, version 0.0.5.1. +It's a general-purpose lossless data compression library implementing the DEFLATE +algorithm (RFC 1950, 1951, 1952). + +**Key characteristics**: +- Written in C +- CMake and Autotools (configure/Makefile) build systems +- Bazel build support (BUILD.bazel, MODULE.bazel) +- Extensive cross-platform support (Unix, Windows, Amiga, OS/400, QNX, VMS) +- Thread-safe implementation +- Custom PNG shim layer (`ptlibzippy_pngshim.c`) for libpng integration +- Prefix support for symbol namespacing (`PTLIBZIPPY_PREFIX`) +- Language bindings: Ada, C#/.NET, Delphi, Python, Perl, Java, Tcl + +**License**: zlib license (permissive, compatible with GPL) + +**Why forked**: The fork was maintained to resolve symbol conflicts when bundling +zlib alongside libpng in the ProjT Launcher. The custom `ptlibzippy_pngshim.c` +and symbol prefixing prevented linker conflicts in the launcher's build. + +For full documentation, see [ptlibzippy.md](ptlibzippy.md). 
+ +--- + +## Directory Structure + +``` +archived/ +├── projt-launcher/ # ProjT Launcher (C++/Qt Minecraft Launcher) +│ ├── CMakeLists.txt # Root CMake build file +│ ├── CMakePresets.json # Build presets (linux, macos, windows_msvc, windows_mingw) +│ ├── Containerfile # Docker/Podman build container +│ ├── CHANGELOG.md # Release changelog +│ ├── COPYING.md # License (GPL-3.0 + upstream notices) +│ ├── MAINTAINERS # Maintainer contact info +│ ├── README # Project overview and build instructions +│ ├── default.nix # Nix build via flake-compat +│ ├── bootstrap/ # Platform bootstrapping (macOS) +│ ├── buildconfig/ # Build configuration templates +│ ├── ci/ # CI infrastructure (own copy, pre-monorepo) +│ ├── cmake/ # CMake modules and vcpkg integration +│ ├── docs/ # Developer and user documentation +│ │ ├── architecture/ # Architecture overview +│ │ ├── contributing/ # Contributing guides +│ │ └── handbook/ # User/developer handbook +│ └── ... +├── projt-modpack/ # ProjT Modpack +│ ├── COPYING.md # License (GPL-3.0) +│ ├── LICENSE # GPL-3.0 full text +│ ├── README.md # Minimal README +│ └── *.png, *.webp # Promotional assets +├── projt-minicraft-modpack/ # MiniCraft Modpack Archive +│ ├── LICENSE # MIT License +│ ├── README.md # Minimal README +│ └── MiniCraft/ # Season-organized modpack ZIPs +│ ├── MiniCraft S1/ # Season 1 releases +│ ├── MiniCraft S2/ # Season 2 releases +│ ├── MiniCraft S3/ # Season 3 releases +│ └── MiniCraft S4/ # Season 4 releases +└── ptlibzippy/ # PTlibzippy (zlib fork) + ├── CMakeLists.txt # CMake build system + ├── BUILD.bazel # Bazel build + ├── MODULE.bazel # Bazel module definition + ├── Makefile.in # Autotools Makefile template + ├── configure # Autotools configure script + ├── COPYING.md # zlib license + ├── README # Library overview + ├── README-cmake.md # CMake build instructions + ├── FAQ # Frequently asked questions + ├── INDEX # File listing + ├── ptlibzippy.h # Public API header + ├── ptzippyconf.h # Configuration header + ├── 
adler32.c # Adler-32 checksum + ├── compress.c # Compression API + ├── crc32.c # CRC-32 checksum + ├── deflate.c # DEFLATE compression + ├── inflate.c # DEFLATE decompression + ├── ptlibzippy_pngshim.c # PNG integration shim + ├── ptzippyutil.c # Internal utilities + └── contrib/ # Third-party contributions + ├── ada/ # Ada bindings + ├── blast/ # PKWare DCL decompressor + ├── crc32vx/ # Vectorized CRC-32 (s390x) + ├── delphi/ # Delphi bindings + ├── dotzlib/ # .NET bindings + ├── gcc_gvmat64/ # x86-64 assembly optimizations + └── ... +``` + +--- + +## Ownership + +All archived projects are owned by `@YongDo-Hyun` as defined in `ci/OWNERS`: + +``` +/archived/projt-launcher/ @YongDo-Hyun +/archived/projt-minicraft-modpack/ @YongDo-Hyun +/archived/projt-modpack/ @YongDo-Hyun +/archived/ptlibzippy/ @YongDo-Hyun +``` + +--- + +## Historical Context + +### Timeline + +The archived projects represent different phases of Project Tick's development: + +1. **Early phase** (2024–2025): MiniCraft Modpack was created as a community modpack + project with seasonal releases +2. **ProjT Modpack** (2025): A curated modpack distributed through the ProjT Launcher +3. **ProjT Launcher** (2025–2026): The main Minecraft launcher, forked from Prism Launcher, + representing the most significant engineering investment in the archive +4. **PTlibzippy** (2025–2026): A zlib fork created to solve symbol conflicts in the + launcher's build system + +### Relationship to Current Projects + +| Archived Project | Successor/Replacement | +|---------------------|-----------------------------------| +| ProjT Launcher | MeshMC (`meshmc/`) | +| ProjT Modpack | No direct successor | +| MiniCraft Modpack | No direct successor | +| PTlibzippy | System zlib (no longer bundled) | + +--- + +## Policy + +### Modifying Archived Code + +Archived projects should generally not be modified. 
Exceptions: + +- **License compliance**: Updating license headers or COPYING files +- **Security fixes**: Critical vulnerabilities in code that might be referenced externally +- **Documentation**: Fixing links, adding archival notes + +### Removing Archived Projects + +Archived projects should not be removed from the monorepo. They serve as: +- Historical reference for design decisions +- License compliance (preserving upstream attribution) +- Knowledge base for understanding the evolution of current projects + +### Referencing Archived Code + +When referencing data or patterns from archived projects in new code: +- Copy the relevant code rather than importing from `archived/` +- Document the source with a comment +- Ensure license compatibility + +--- + +## Related Documentation + +- [ProjT Launcher](projt-launcher.md) — Detailed launcher documentation +- [ProjT Modpack](projt-modpack.md) — Modpack project details +- [PTlibzippy](ptlibzippy.md) — Compression library documentation diff --git a/docs/handbook/archived/projt-launcher.md b/docs/handbook/archived/projt-launcher.md new file mode 100644 index 0000000000..d0a3413d1c --- /dev/null +++ b/docs/handbook/archived/projt-launcher.md @@ -0,0 +1,444 @@ +# ProjT Launcher + +## Overview + +ProjT Launcher was a structurally disciplined Minecraft launcher engineered for long-term +maintainability, architectural clarity, and controlled ecosystem evolution. It was a fork +of Prism Launcher (itself forked from PolyMC, which forked from MultiMC) that diverged +intentionally to prevent maintenance decay, dependency drift, and architectural erosion. + +**Status**: Archived — superseded by MeshMC (`meshmc/`). 
+ +--- + +## Project Identity + +| Property | Value | +|-------------------|--------------------------------------------------------| +| **Name** | ProjT Launcher | +| **Location** | `archived/projt-launcher/` | +| **Language** | C++23 / Qt 6 | +| **Build System** | CMake 3.25+ | +| **License** | GPL-3.0-only | +| **Copyright** | 2026 Project Tick | +| **Upstream** | Prism Launcher → PolyMC → MultiMC | +| **Last Version** | 0.0.5-1 (draft) | +| **Website** | https://projecttick.org/p/projt-launcher/ | +| **Releases** | https://gitlab.com/Project-Tick/core/ProjT-Launcher/-/releases | + +--- + +## Why ProjT Launcher Existed + +The README states four key motivations: + +1. **Long-term maintainability** — Explicit architectural constraints and review rules + prevent uncontrolled technical debt +2. **Controlled third-party integration** — External dependencies are maintained as + detached forks with documented patch and update policies +3. **Deterministic CI and builds** — Exact dependency versions and constrained build + inputs enable reproducible builds across environments +4. 
**Structural clarity** — Enforced MVVM boundaries and clearly separated modules + simplify review, refactoring, and long-term contribution + +--- + +## Architecture + +### Layered Model + +The launcher followed a strict layered architecture documented in +`docs/architecture/OVERVIEW.md`: + +``` +┌─────────────────────────────────────────────────────────┐ +│ Layer 1: UI + ViewModels (launcher/ui/, viewmodels/) │ +│ Qt Widgets screens, dialogs, widgets │ +├─────────────────────────────────────────────────────────┤ +│ Layer 2: Core/Domain (launcher/, minecraft/, java/) │ +│ Models, settings, instance management, launch logic │ +├─────────────────────────────────────────────────────────┤ +│ Layer 3: Task System (launcher/tasks/) │ +│ Long-running async work: downloads, extraction │ +├─────────────────────────────────────────────────────────┤ +│ Layer 4: Networking (launcher/net/) │ +│ HTTP requests, API adapters │ +├─────────────────────────────────────────────────────────┤ +│ Layer 5: Mod Platform Integrations (modplatform/) │ +│ Modrinth, CurseForge, ATLauncher, Technic, FTB │ +└─────────────────────────────────────────────────────────┘ +``` + +### Module Boundaries + +| Rule | Description | +|------|-------------| +| UI must not perform I/O | No file or network operations in the UI layer | +| Core/Tasks must not depend on Qt Widgets | Keeps the domain logic testable | +| ViewModels must be widget-free | Only expose data and actions | +| Use Task for anything > few milliseconds | Background jobs with progress reporting | +| Dependencies flow downward | `ui` → `core` → `data` (storage/net) | + +### Directory Layout + +``` +ProjT-Launcher/ +├── launcher/ # Main application +│ ├── ui/ # Qt Widgets +│ │ ├── pages/ # Main screens +│ │ ├── widgets/ # Reusable components +│ │ ├── dialogs/ # Modal windows +│ │ └── setupwizard/ # First-run wizard +│ ├── minecraft/ # Game logic +│ │ ├── auth/ # Account authentication (Microsoft) +│ │ ├── launch/ # Game process management +│ │ 
├── mod/ # Mod loading and management +│ │ └── versions/ # Version parsing and resolution +│ ├── net/ # Networking layer +│ ├── tasks/ # Background job system +│ ├── java/ # Java runtime discovery and management +│ ├── modplatform/ # Mod platform APIs +│ ├── resources/ # Images, themes, assets +│ ├── icons/ # Application icons +│ └── translations/ # Internationalization files (.ts) +├── tests/ # Unit tests +├── cmake/ # CMake build modules +├── docs/ # Documentation +├── website/ # Eleventy-based project website +├── bot/ # Automation (Cloudflare Workers) +└── meta/ # Metadata generator (Python) +``` + +--- + +## Build System + +### CMake Configuration + +The root `CMakeLists.txt` began with: + +```cmake +cmake_minimum_required(VERSION 3.25) +project(Launcher) + +set(CMAKE_CXX_STANDARD 23) +set(CMAKE_CXX_STANDARD_REQUIRED true) +set(CMAKE_C_STANDARD_REQUIRED true) +``` + +### Build Presets + +```bash +cmake --preset [macos OR linux OR windows_msvc OR windows_mingw] +cmake --build --preset [macos OR linux OR windows_msvc OR windows_mingw] --config [Debug OR Release] +``` + +### Requirements + +| Tool | Version | +|----------|----------| +| CMake | 3.25+ | +| Qt | 6.10.x | +| Compiler | C++20/23 | + +### Compiler Flags (MSVC) + +```cmake +# Security and optimization flags: +"$<$:/GS>" # Buffer security checks +"$<$:/Gw;/Gy;/guard:cf>" # Size optimization + control flow guard +"$<$:/LTCG;/MANIFEST:NO;/STACK:8388608>" # LTO, 8MB stack +``` + +The 8MB stack size was required because ATL's pack list needed 3-4 MiB as of the +time of development. 
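
As a quick sanity check on the linker flag above, `/STACK:8388608` is 8 MiB expressed in bytes, which is the unit the MSVC `/STACK` option takes. The sketch below does that conversion in shell; `mib_to_bytes` is a hypothetical helper name used only for illustration and is not part of the launcher's build scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the launcher's build scripts):
# convert a size in MiB to bytes, the unit MSVC's /STACK option expects.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# 8 MiB stack reserve, as passed to the linker via /STACK
mib_to_bytes 8    # prints 8388608
```

Measured against the 3-4 MiB that ATL's pack list was said to need, an 8 MiB reserve leaves at least half of the stack as headroom.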
+ +### Output Directory Macros + +The build system used custom macros for managing output directories: + +```cmake +macro(projt_push_output_dirs name) + set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${Launcher_OUTPUT_ROOT}/${name}/$") + set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${Launcher_OUTPUT_ROOT}/${name}/$") + set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${Launcher_OUTPUT_ROOT}/${name}/$") +endmacro() + +macro(projt_pop_output_dirs) + set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${_PROJT_PREV_RUNTIME}") + ... +endmacro() +``` + +Similar push/pop macros existed for: +- `projt_push_install_libdir` / `projt_pop_install_libdir` +- `projt_push_install_includedir` / `projt_pop_install_includedir` +- `projt_push_install_libexecdir` / `projt_pop_install_libexecdir` +- `projt_push_autogen_disabled` / `projt_pop_autogen_disabled` + +These allowed different build components to use isolated output directories without +polluting the global CMake state. + +### Linux Installation Paths + +```cmake +set(Launcher_BUNDLED_LIBDIR "${CMAKE_INSTALL_LIBDIR}/projtlauncher") +set(Launcher_BUNDLED_INCLUDEDIR "include/projtlauncher") +set(Launcher_BUNDLED_LIBEXECDIR "libexec/projtlauncher") +``` + +### Qt Deprecation Policy + +```cmake +add_compile_definitions(QT_WARN_DEPRECATED_UP_TO=0x060400) +add_compile_definitions(QT_DISABLE_DEPRECATED_UP_TO=0x060400) +``` + +This configured Qt to warn about APIs deprecated up to and including Qt 6.4.0 and to +hard-disable them at compile time.
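
The `0x060400` threshold follows Qt's usual version encoding, in which major, minor, and patch each occupy one byte (the same layout that `QT_VERSION_CHECK` produces). A minimal sketch of that packing in shell, with `qt_version_hex` as a hypothetical helper name that does not appear in the launcher's sources:

```bash
#!/usr/bin/env bash
# Hypothetical helper (illustration only): pack a Qt version into the
# 0xMMmmpp form used by QT_WARN_DEPRECATED_UP_TO / QT_DISABLE_DEPRECATED_UP_TO.
qt_version_hex() {
    local major=$1 minor=$2 patch=$3
    printf '0x%06x\n' $(( (major << 16) | (minor << 8) | patch ))
}

qt_version_hex 6 4 0    # prints 0x060400
```

Under the same encoding, raising the threshold to Qt 6.10.0 would correspond to `0x060a00`.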
+ +--- + +## Nix Build + +The `default.nix` used `flake-compat` to provide a traditional Nix interface: + +```nix +(import (fetchTarball { + url = "https://github.com/edolstra/flake-compat/archive/ff81ac966bb2cae68946d5ed5fc4994f96d0ffec.tar.gz"; + sha256 = "sha256-NeCCThCEP3eCl2l/+27kNNK7QrwZB1IJCrXfrbv5oqU="; +}) { src = ./.; }).defaultNix +``` + +Quick build: + +```bash +nix build .#projtlauncher +``` + +--- + +## Container Build + +The `Containerfile` defined a Debian-based build environment: + +```dockerfile +ARG DEBIAN_VERSION=stable-slim +FROM docker.io/library/debian:${DEBIAN_VERSION} + +ARG QT_VERSION=6.10.2 + +# Compilers: clang, lld, llvm, temurin-17-jdk +# Build system: cmake, ninja-build, extra-cmake-modules, pkg-config +# Dependencies: cmark, gamemode-dev, libarchive-dev, libcmark-dev, +# libgl1-mesa-dev, libqrencode-dev, libtomlplusplus-dev, +# scdoc, zlib1g-dev +# Tooling: clang-format, clang-tidy, git + +ENV CMAKE_LINKER_TYPE=lld +``` + +Qt was installed via `aqtinstall`: + +```dockerfile +RUN pip3 install --break-system-packages aqtinstall +RUN aqt install-qt ... +``` + +--- + +## Detached Fork Libraries + +The launcher maintained its own forks of several upstream libraries: + +| Library | Directory | Purpose | +|---------------|----------------|------------------------| +| PTlibzippy | `ptlibzippy/` | Compression (zlib fork)| +| bzip2 | `bzip2/` | Compression | +| quazip | `quazip/` | ZIP handling | +| cmark | `cmark/` | Markdown parsing | +| tomlplusplus | `tomlplusplus/`| TOML parsing | +| libqrencode | `libqrencode/` | QR code generation | +| libnbtplusplus| `libnbtplusplus/` | NBT format (Minecraft)| +| gamemode | `gamemode/` | Linux GameMode support | + +These were maintained with documented patch and update policies to prevent +dependency drift while staying reasonably current with upstream. 
+ +### Vendored Libraries + +| Library | Directory | Purpose | +|-----------|----------------|--------------------| +| LocalPeer | `LocalPeer/` | Single instance | +| murmur2 | `murmur2/` | Hash functions | +| qdcss | `qdcss/` | Dark CSS | +| rainbow | `rainbow/` | Terminal colors | +| systeminfo| `systeminfo/` | System information | + +--- + +## Features at Time of Archival + +### Changelog (v0.0.5-1 Draft) + +**Highlights from the last release cycle:** + +- Improved Fabric/Quilt component version resolution with better Minecraft-version alignment +- Added Launcher Hub support (web-based dashboard) +- Strengthened version comparison logic, especially for release-candidate handling +- Added Modrinth collection import for existing instances +- Switched Linux Launcher Hub backend from QtWebEngine to CEF +- Added native cockpit dashboard for Launcher Hub + +**Platform support:** + +| Platform | Backend | Packaging | +|-----------|----------------------|------------------------------| +| Linux | CEF-based Hub | DEB, RPM, AppImage, Flatpak | +| macOS | Native WebView | App Bundle | +| Windows | Native WebView | MSI, Portable | + +--- + +## CI Infrastructure (Pre-Monorepo) + +The launcher had its own CI infrastructure in `ci/`, which was the predecessor +to the current monorepo CI system. 
It included: + +- `ci/default.nix` — Nix CI entry point +- `ci/pinned.json` — Pinned dependencies +- `ci/supportedBranches.js` — Branch classification +- `ci/github-script/` — GitHub Actions helpers +- `ci/eval/` — Nix evaluation infrastructure + - `attrpaths.nix` — Attribute path enumeration + - `chunk.nix` — Evaluation chunking + - `diff.nix` — Evaluation diffing + - `outpaths.nix` — Output path computation + - `compare/` — Statistics comparison +- `ci/nixpkgs-vet.nix` / `ci/nixpkgs-vet.sh` — Nixpkgs vetting +- `ci/parse.nix` — CI configuration parsing +- `ci/supportedSystems.json` — Supported target systems +- `ci/supportedVersions.nix` — Supported version matrix + +Some of these patterns were carried forward into the monorepo CI system. + +--- + +## Documentation Structure + +The launcher had extensive documentation: + +``` +docs/ +├── APPLE_SILICON_RATIONALE.md +├── BUILD_SYSTEM.md +├── FUZZING.md +├── README.md +├── architecture/ +│ └── OVERVIEW.md +├── contributing/ +│ ├── ARCHITECTURE.md +│ ├── CODE_STYLE.md +│ ├── GETTING_STARTED.md +│ ├── LAUNCHER_TEST_MATRIX.md +│ ├── PROJECT_STRUCTURE.md +│ ├── README.md +│ ├── TESTING.md +│ └── WORKFLOW.md +└── handbook/ + ├── README.md + ├── bot.md, bzip2.md, cmark.md, ... + ├── help-pages/ + │ ├── apis.md, custom-commands.md, ... + │ └── environment-variables.md + └── wiki/ + ├── development/ + │ ├── instructions/ + │ │ ├── linux.md, macos.md, windows.md + │ └── translating.md + ├── getting-started/ + │ ├── installing-projtlauncher.md + │ ├── installing-java.md + │ ├── create-instance.md + │ └── download-modpacks.md + └── help-pages/ + └── ... (mirrors of handbook help-pages) +``` + +--- + +## Maintainership + +``` +[Mehmet Samet Duman] +GitHub: @YongDo-Hyun +Email: yongdohyun@mail.projecttick.org +Paths: ** +``` + +The project was maintained by a single maintainer with full ownership of all paths. 
+ +--- + +## License + +The launcher carried a multi-layer license history: + +``` +ProjT Launcher - Minecraft Launcher +Copyright (C) 2026 Project Tick +License: GPL-3.0-only + +Incorporates work from: +├── Prism Launcher (Copyright 2022-2025 Prism Launcher Contributors, GPL-3.0) +│ └── Incorporates: +│ └── MultiMC (Copyright 2013-2021 MultiMC Contributors, Apache-2.0) +└── PolyMC (Copyright 2021-2022 PolyMC Contributors, GPL-3.0) +``` + +The logo carried a separate license: +- Original: Prism Launcher Logo © Prism Launcher Contributors (CC BY-SA 4.0) +- Modified: ProjT Launcher Logo © 2026 Project Tick (CC BY-SA 4.0) + +--- + +## Why It Was Archived + +ProjT Launcher was archived when MeshMC (`meshmc/`) became the primary launcher +in the Project Tick monorepo. MeshMC continued the development trajectory with: +- Updated architecture decisions +- The same mod platform integrations +- The same CMake/Qt/Nix build infrastructure +- The detached fork library approach + +The launcher code remains in `archived/` as a reference for: +- Design patterns (layered architecture, task system) +- Build system techniques (CMake push/pop macros, vcpkg integration) +- CI patterns (GitHub script infrastructure) +- License compliance (preserving upstream attribution chains) + +--- + +## Building (for Reference) + +To build the archived launcher for historical purposes: + +```bash +cd archived/projt-launcher/ +git submodule update --init --recursive + +# Linux: +cmake --preset linux +cmake --build --preset linux --config Release + +# macOS: +cmake --preset macos +cmake --build --preset macos --config Release + +# Windows (MSVC): +cmake --preset windows_msvc +cmake --build --preset windows_msvc --config Release +``` + +Note: Build success is not guaranteed since the archived code is not maintained +and dependencies may have changed.
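
The per-platform commands above differ only in the preset name, so the choice can be scripted. The following is a rough sketch rather than anything shipped with the launcher: `pick_preset` is a hypothetical helper, and mapping MSYS/MinGW shells to `windows_mingw` is an assumption (MSVC builds would pass `windows_msvc` explicitly). It only prints the commands it would run.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the archived repository): map a
# `uname -s` string to one of the preset names used in this document.
pick_preset() {
    case "$1" in
        Linux)        echo linux ;;
        Darwin)       echo macos ;;
        MINGW*|MSYS*) echo windows_mingw ;;   # assumption: MinGW toolchain
        *)            echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# Dry run: show the configure and build commands for the current host.
if preset=$(pick_preset "$(uname -s)"); then
    echo "cmake --preset $preset"
    echo "cmake --build --preset $preset --config Release"
fi
```

Printing instead of executing keeps the sketch safe to run outside a checkout; as noted above, an actual build of the archived code may not succeed.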
diff --git a/docs/handbook/archived/projt-modpack.md b/docs/handbook/archived/projt-modpack.md new file mode 100644 index 0000000000..702400a77c --- /dev/null +++ b/docs/handbook/archived/projt-modpack.md @@ -0,0 +1,245 @@ +# ProjT Modpack + +## Overview + +ProjT Modpack was a curated Minecraft modpack created and distributed by Project Tick. +The project served as the official modpack offering alongside the ProjT Launcher, providing +a pre-configured set of mods for the Project Tick community. + +**Status**: Archived — no longer maintained or distributed. + +--- + +## Project Identity + +| Property | Value | +|-------------------|-----------------------------------------------------| +| **Name** | ProjT Modpack | +| **Location** | `archived/projt-modpack/` | +| **Type** | Minecraft Modpack | +| **License** | GPL-3.0-or-later | +| **Copyright** | 2025–2026 Project Tick | + +--- + +## Repository Contents + +The modpack repository contained modpack configuration files and promotional assets: + +``` +archived/projt-modpack/ +├── .DS_Store # macOS filesystem metadata (artifact) +├── .gitattributes # Git line ending and diff configuration +├── COPYING.md # GPL-3.0 license summary with copyright notice +├── LICENSE # Full GPL-3.0 license text +├── README.md # Minimal project README +├── ProjT1.png # Promotional image 1 +├── ProjT2.png # Promotional image 2 +├── ProjT3.png # Promotional image 3 +├── affiliate-banner-bg.webp # Affiliate banner background +├── affiliate-banner-fg.webp # Affiliate banner foreground +└── bisect-icon.webp # Bisect hosting icon +``` + +--- + +## License + +The modpack was licensed under GPL-3.0-or-later: + +``` +ProjT Modpack - Minecraft Modpack by Project Tick +Copyright (C) 2025-2026 Project Tick + +This program is free software: you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation, either version 3 of the License, or +(at your option) any later version. 
+ +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. +``` + +Note: This is GPL-3.0-**or-later** (with the "or any later version" clause), unlike +the ProjT Launcher which was GPL-3.0-**only**. + +--- + +## Promotional Assets + +The repository included several promotional images used for marketing and distribution: + +### Modpack Screenshots + +| File | Description | +|-----------|------------------------------------------| +| `ProjT1.png` | Promotional screenshot/image 1 | +| `ProjT2.png` | Promotional screenshot/image 2 | +| `ProjT3.png` | Promotional screenshot/image 3 | + +These were likely used on mod platform listing pages (Modrinth, CurseForge) and +the Project Tick website. + +### Affiliate Assets + +| File | Format | Description | +|---------------------------|--------|--------------------------------------| +| `affiliate-banner-bg.webp` | WebP | Affiliate banner background image | +| `affiliate-banner-fg.webp` | WebP | Affiliate banner foreground overlay | +| `bisect-icon.webp` | WebP | Bisect Hosting affiliate icon | + +The affiliate assets suggest the modpack had hosting partnership integrations, +specifically with [Bisect Hosting](https://www.bisecthosting.com/), a popular +Minecraft server hosting provider. + +--- + +## Distribution + +### Primary Distribution Channels + +The modpack was distributed through: + +1. **ProjT Launcher** — Native integration via the launcher's modpack download dialog +2. **Mod platforms** — Listed on Modrinth and/or CurseForge for wider discoverability +3. **Project Tick website** — https://projecttick.org/ (no longer active for this modpack) + +### Installation Flow + +Users could install the modpack through the ProjT Launcher: + +1. Open ProjT Launcher +2. Navigate to the modpack browser +3. 
Search for "ProjT Modpack" or browse curated packs +4. Click Install — the launcher handles mod downloading and configuration +5. Launch the game with the modpack pre-configured + +--- + +## Relationship to ProjT Launcher + +The ProjT Modpack was tightly coupled with the ProjT Launcher: + +- The launcher's modpack platform integrations (Modrinth, CurseForge, ATLauncher, + Technic, FTB) enabled direct modpack installation +- The modpack was the launcher's "showcase" offering — a reference configuration + demonstrating what the launcher could manage +- Promotional assets were shared between the modpack and launcher marketing + +When the ProjT Launcher was archived, the modpack lost its primary distribution +channel and was archived alongside it. + +--- + +## Relationship to MiniCraft Modpack + +The ProjT Modpack was a separate project from the MiniCraft Modpack +(`archived/projt-minicraft-modpack/`): + +| Aspect | ProjT Modpack | MiniCraft Modpack | +|----------------|----------------------------------|--------------------------------------| +| **License** | GPL-3.0-or-later | MIT | +| **Content** | Curated mod configuration | Pre-built modpack ZIPs | +| **Format** | Platform-distributed configs | Self-contained ZIP archives | +| **Versioning** | Standard semver | Season-based (S1–S4) | +| **Distribution** | Mod platforms + launcher | Direct download | +| **Period** | 2025–2026 | 2024–2026 | + +--- + +## Why It Was Archived + +The ProjT Modpack was archived because: + +1. **Distribution channel archived** — The ProjT Launcher, which was the primary + distribution mechanism, was itself archived +2. **Community consolidation** — Project Tick's focus shifted to other projects + (MeshMC, corebinutils, cgit, etc.) +3.
**No standalone value** — The modpack configuration files without a corresponding + launcher integration had limited utility + +--- + +## Historical Significance + +The ProjT Modpack was significant in Project Tick's history because: + +- **Community engagement** — It was one of the first user-facing products, giving + the community something to interact with directly +- **Platform integration testing** — It served as a test bed for the launcher's + modpack download and installation workflows +- **Branding** — The promotional assets established Project Tick's visual identity + in the Minecraft modding community +- **Ecosystem validation** — It validated the end-to-end flow from mod curation + → platform listing → launcher installation → gameplay + +--- + +## File Details + +### .gitattributes + +The repository included Git attributes for handling binary files and line endings: + +``` +# Binary files should not be diffed +*.png binary +*.webp binary +``` + +### README.md + +The README was minimal: + +```markdown +# ProjT Modpack +``` + +This suggests the modpack's detailed description was maintained on the mod platform +listing pages rather than in the repository. 
+ +--- + +## Ownership + +Maintained by `@YongDo-Hyun` as defined in `ci/OWNERS`: + +``` +/archived/projt-modpack/ @YongDo-Hyun +``` + +--- + +## Assets Inventory + +### Image Assets + +| Asset | Format | Size Category | Purpose | +|---------------------------|--------|---------------|------------------| +| `ProjT1.png` | PNG | Full-size | Promotional | +| `ProjT2.png` | PNG | Full-size | Promotional | +| `ProjT3.png` | PNG | Full-size | Promotional | +| `affiliate-banner-bg.webp`| WebP | Banner-size | Affiliate | +| `affiliate-banner-fg.webp`| WebP | Banner-size | Affiliate | +| `bisect-icon.webp` | WebP | Icon-size | Affiliate | + +The use of WebP for affiliate/banner assets and PNG for screenshots reflects +the different quality requirements: +- PNG for screenshots — lossless quality for game imagery +- WebP for banners — smaller file size for web distribution + +--- + +## Mod Content + +The repository does not contain the mod files themselves (`.jar` files) — these +were downloaded dynamically through the launcher's mod platform integrations. +The modpack definition (which mods, versions, and configurations to include) +was stored in the platform-specific manifest format (e.g., Modrinth's +`modrinth.index.json` or CurseForge's `manifest.json`), which is not present +in the archived copy. + +This is typical for modpack distribution: the repository contains metadata and +marketing assets, while the actual mod binaries are served by the platform CDNs. diff --git a/docs/handbook/archived/ptlibzippy.md b/docs/handbook/archived/ptlibzippy.md new file mode 100644 index 0000000000..21ad2d6ce1 --- /dev/null +++ b/docs/handbook/archived/ptlibzippy.md @@ -0,0 +1,501 @@ +# PTlibzippy + +## Overview + +PTlibzippy is a Project Tick fork of the [zlib](https://zlib.net/) data compression library. +It is a general-purpose lossless data compression library implementing the DEFLATE algorithm +as specified in RFCs 1950 (zlib format), 1951 (deflate format), and 1952 (gzip format). 
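The zlib format (RFC 1950) named above ends each stream with an Adler-32 checksum of the uncompressed data, and the fork ships an `adler32.c` for this purpose. As a rough illustration of that checksum only, here is a plain, unoptimized sketch. The name `sketch_adler32` is hypothetical and is not part of the PTlibzippy API; the real implementation is more heavily optimized.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not a PTlibzippy function.
 * Computes the Adler-32 checksum as defined in RFC 1950. */
#define SKETCH_ADLER_MOD 65521u /* largest prime below 2^16 */

uint32_t sketch_adler32(const unsigned char *buf, size_t len)
{
    uint32_t a = 1; /* running sum of the input bytes, seeded with 1 */
    uint32_t b = 0; /* running sum of every intermediate value of a */

    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % SKETCH_ADLER_MOD;
        b = (b + a) % SKETCH_ADLER_MOD;
    }
    /* b occupies the high 16 bits, a the low 16 bits */
    return (b << 16) | a;
}
```

For the nine ASCII bytes of `Wikipedia` this returns `0x11E60398`, and for empty input it returns `1`.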
+ +PTlibzippy was version 0.0.5.1 and was maintained as a detached fork to solve symbol +conflicts when bundling zlib alongside libpng in the ProjT Launcher's build system. + +**Status**: Archived — system zlib is now used instead of a bundled fork. + +--- + +## Project Identity + +| Property | Value | +|-------------------|----------------------------------------------------------| +| **Name** | PTlibzippy | +| **Location** | `archived/ptlibzippy/` | +| **Language** | C | +| **Version** | 0.0.5.1 | +| **License** | zlib license (permissive) | +| **Copyright** | 1995–2026 Jean-loup Gailly and Mark Adler; 2026 Project Tick | +| **Homepage** | https://projecttick.org/p/zlib | +| **FAQ** | https://projecttick.org/p/zlib/zlib_faq.html | +| **Contact** | community@community.projecttick.org | + +--- + +## Why a zlib Fork? + +The fork was created to solve a specific technical problem in the ProjT Launcher: + +### The Symbol Conflict Problem + +When the launcher bundled both zlib and libpng as static libraries, the linker +encountered duplicate symbol definitions. Both zlib and libpng's internal zlib +usage exported identical function names (e.g., `deflate`, `inflate`, `compress`), +causing link-time errors or runtime symbol resolution ambiguities. + +### The Solution + +PTlibzippy addressed this through: + +1. **Symbol prefixing** — The `PTLIBZIPPY_PREFIX` CMake option enables renaming all + public symbols with a custom prefix, preventing collisions +2. **PNG shim layer** — A custom `ptlibzippy_pngshim.c` file provided a compatibility + layer between the renamed zlib symbols and libpng's expectations +3. **Custom header names** — Headers were renamed (`ptlibzippy.h`, `ptzippyconf.h`, + `ptzippyguts.h`, `ptzippyutil.h`) to avoid include-path conflicts + +As noted in the ProjT Launcher changelog: +> "zlib symbol handling was refined to use libpng-targeted shim overrides instead +> of global prefixing." 
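The prefixing and shim ideas above can be sketched in a single C file. This is a minimal illustration that assumes a macro-based renaming scheme similar to upstream zlib's `Z_PREFIX`. The function `demo_checksum` and the `pt_` prefix are hypothetical, and the code does not reproduce the contents of `ptlibzippy_pngshim.c`.

```c
#include <stddef.h>

/* Part 1: what a prefixing header might do (hypothetical names).
 * After this macro, the definition below is emitted under the symbol
 * pt_demo_checksum, so it cannot collide with another library that
 * also exports demo_checksum. */
#define demo_checksum pt_demo_checksum

unsigned long demo_checksum(const unsigned char *buf, size_t len)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Part 2: a shim translation unit would not apply the renaming macro,
 * so it can define the original name and forward to the prefixed one.
 * A consumer that only knows the unprefixed name still links. */
#undef demo_checksum

unsigned long demo_checksum(const unsigned char *buf, size_t len)
{
    return pt_demo_checksum(buf, len);
}
```

In a real build the shim would presumably live in its own object file and be linked only into the consumer that needs the unprefixed names. That would match the changelog's description of libpng-targeted overrides rather than global prefixing.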
+ +--- + +## License + +The zlib license is permissive and compatible with GPL: + +``` +Copyright notice: + + (C) 1995-2026 Jean-loup Gailly and Mark Adler + (C) 2026 Project Tick + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. 
+``` + +--- + +## Build System + +PTlibzippy supported three build systems: + +### CMake (Primary) + +```cmake +cmake_minimum_required(VERSION 3.12...3.31) + +project( + PTlibzippy + LANGUAGES C + VERSION 0.0.5.1 + HOMEPAGE_URL "https://projecttick.org/p/zlib" + DESCRIPTION "PTlibzippy - a general-purpose lossless data-compression library") +``` + +#### CMake Options + +| Option | Default | Description | +|---------------------------|---------|----------------------------------------------| +| `PTLIBZIPPY_BUILD_TESTING`| `ON` | Enable building test programs | +| `PTLIBZIPPY_BUILD_SHARED` | `ON` | Build shared library (`libptlibzippy.so`) | +| `PTLIBZIPPY_BUILD_STATIC` | `ON` | Build static library (`libptlibzippystatic.a`)| +| `PTLIBZIPPY_INSTALL` | `ON` | Enable `make install` target | +| `PTLIBZIPPY_PREFIX` | `OFF` | Enable symbol prefixing for all public APIs | + +#### Feature Detection + +The CMake build detected platform capabilities: + +```cmake +check_type_size(off64_t OFF64_T) # Large file support +check_function_exists(fseeko HAVE_FSEEKO) # POSIX fseeko +check_include_file(stdarg.h HAVE_STDARG_H) +check_include_file(unistd.h HAVE_UNISTD_H) +``` + +And generated `ptzippyconf.h` with the results: + +```cmake +configure_file(${PTlibzippy_BINARY_DIR}/ptzippyconf.h.cmakein + ${PTlibzippy_BINARY_DIR}/ptzippyconf.h) +``` + +#### Visibility Attributes + +On non-Windows platforms, the build checked for GCC visibility attributes: + +```c +void f(void) __attribute__ ((visibility("hidden"))); +``` + +This enabled hiding internal symbols from the shared library's public API. + +#### Library Targets + +**Shared library:** + +```cmake +add_library(ptlibzippy SHARED ${PTLIBZIPPY_SRCS} ...) +add_library(PTlibzippy::PTlibzippy ALIAS ptlibzippy) + +target_compile_definitions(ptlibzippy + PRIVATE PTLIBZIPPY_BUILD PTLIBZIPPY_INTERNAL= ...) +``` + +**Static library:** + +```cmake +add_library(ptlibzippystatic STATIC ${PTLIBZIPPY_SRCS} ...) 
+``` + +On Windows, the static library gets a `"s"` suffix to distinguish from the import library. + +#### pkg-config + +A `ptlibzippy.pc` file was generated for pkg-config integration: + +```cmake +configure_file(${PTlibzippy_SOURCE_DIR}/ptlibzippy.pc.cmakein + ${PTLIBZIPPY_PC} @ONLY) +``` + +### Autotools + +Traditional Unix build: + +```bash +./configure +make +make test +make install +``` + +### Bazel + +``` +BUILD.bazel # Build rules +MODULE.bazel # Module definition +``` + +--- + +## Source Files + +### Public Headers + +| Header | Purpose | +|----------------|----------------------------------------------------| +| `ptlibzippy.h` | Public API (compress, decompress, gzip, etc.) | +| `ptzippyconf.h`| Configuration header (generated at build time) | + +### Private Headers + +| Header | Purpose | +|----------------|----------------------------------------------------| +| `crc32.h` | CRC-32 lookup tables | +| `deflate.h` | DEFLATE compression state machine | +| `ptzippyguts.h` | Internal definitions (gzip state) | +| `inffast.h` | Fast inflate inner loop | +| `inffixed.h` | Fixed Huffman code tables | +| `inflate.h` | Inflate state machine | +| `inftrees.h` | Huffman tree building | +| `trees.h` | Dynamic Huffman tree encoding | +| `ptzippyutil.h` | System-level utilities | + +### Source Files + +| Source | Purpose | +|-------------------------|-----------------------------------------------| +| `adler32.c` | Adler-32 checksum computation | +| `compress.c` | Compression convenience API | +| `crc32.c` | CRC-32 checksum computation | +| `deflate.c` | DEFLATE compression algorithm | +| `gzclose.c` | gzip file close | +| `gzlib.c` | gzip file utility functions | +| `gzread.c` | gzip file reading | +| `gzwrite.c` | gzip file writing | +| `inflate.c` | DEFLATE decompression algorithm | +| `infback.c` | Inflate using a callback interface | +| `inftrees.c` | Generate Huffman trees for inflate | +| `inffast.c` | Fast inner loop for inflate | +| `ptlibzippy_pngshim.c` | 
PNG integration shim (Project Tick addition) | +| `trees.c` | Output compressed data using Huffman coding | +| `uncompr.c` | Decompression convenience API | +| `ptzippyutil.c` | Operating system interface utilities | + +### Project Tick Additions + +The following files were added or renamed by Project Tick (not present in upstream zlib): + +| File | Change Type | Purpose | +|-------------------------|-------------|--------------------------------------| +| `ptlibzippy_pngshim.c` | Added | Shim for libpng symbol compatibility | +| `ptzippyguts.h` | Renamed | From `gzguts.h` | +| `ptzippyutil.c` | Renamed | From `zutil.c` | +| `ptzippyutil.h` | Renamed | From `zutil.h` | +| `ptzippyconf.h` | Renamed | From `zconf.h` | +| `ptlibzippy.h` | Renamed | From `zlib.h` | +| `ptlibzippy.pc.cmakein` | Renamed | From `zlib.pc.cmakein` | +| `COPYING.md` | Modified | Added Project Tick copyright | + +--- + +## Symbol Prefixing + +The `PTLIBZIPPY_PREFIX` option enables symbol prefixing for all public API functions. +When enabled, all zlib functions are prefixed to avoid collisions: + +| Original Symbol | Prefixed Symbol (example) | +|----------------|---------------------------| +| `deflate` | `pt_deflate` | +| `inflate` | `pt_inflate` | +| `compress` | `pt_compress` | +| `uncompress` | `pt_uncompress` | +| `crc32` | `pt_crc32` | +| `adler32` | `pt_adler32` | + +The prefix is configured through `ptzippyconf.h`: + +```cmake +set(PT_PREFIX ${PTLIBZIPPY_PREFIX}) +file(APPEND ${PTCONF_OUT_FILE} "#cmakedefine PT_PREFIX 1\n") +``` + +--- + +## PNG Shim Layer + +The `ptlibzippy_pngshim.c` file was the key Project Tick addition. It provided a +compatibility layer that allowed libpng to use PTlibzippy's renamed symbols +transparently. + +Without the shim, libpng would look for standard zlib function names (`deflate`, +`inflate`, etc.) and fail to link against PTlibzippy's prefixed versions. + +The shim worked by: +1. Including PTlibzippy's headers (with prefixed symbols) +2. 
Providing wrapper functions with the original zlib names +3. Each wrapper forwarded to the corresponding PTlibzippy function + +This approach was described in the changelog as: +> "zlib symbol handling was refined to use libpng-targeted shim overrides instead +> of global prefixing" + +--- + +## Cross-Platform Support + +PTlibzippy inherited zlib's extensive platform support: + +| Platform | Build System | Notes | +|------------------|-------------------------------|-------------------------------| +| Linux | CMake, Autotools, Makefile | Primary development platform | +| macOS | CMake, Autotools | | +| Windows | CMake, NMake, MSVC | DLL and static library | +| Windows (MinGW) | CMake, Makefile | | +| Cygwin | CMake, Autotools | DLL naming handled | +| Amiga | Makefile.pup, Makefile.sas | SAS/C compiler | +| OS/400 | Custom makefiles | IBM i (formerly AS/400) | +| QNX | Custom makefiles | QNX Neutrino | +| VMS | make_vms.com | OpenVMS command procedure | + +### Platform-Specific Notes from README + +- **64-bit Irix**: `deflate.c` must be compiled without any optimization; with `-O`, one libpng test fails +- **Digital Unix 4.0D**: Requires `cc -std1` for correct `gzprintf` behavior +- **HP-UX 9.05**: Some versions of `/bin/cc` are incompatible +- **PalmOS**: Supported via external port (https://palmzlib.sourceforge.net/) + +--- + +## Third-Party Contributions + +The `contrib/` directory contained community-contributed extensions: + +| Directory | Description | +|----------------|------------------------------------------------| +| `contrib/ada/` | Ada programming language bindings | +| `contrib/blast/` | PKWare Data Compression Library decompressor | +| `contrib/crc32vx/` | Vectorized CRC-32 for IBM z/Architecture | +| `contrib/delphi/` | Borland Delphi bindings | +| `contrib/dotzlib/` | .NET (C#) bindings | +| `contrib/gcc_gvmat64/` | x86-64 assembly optimizations | + +Ada bindings included full package specifications: + +``` +contrib/ada/ptlib.ads # Package spec +contrib/ada/ptlib.adb # Package
body +contrib/ada/ptlib-thin.ads # Thin binding spec +contrib/ada/ptlib-thin.adb # Thin binding body +contrib/ada/ptlib-streams.ads # Stream interface spec +contrib/ada/ptlib-streams.adb # Stream interface body +``` + +--- + +## FAQ Highlights + +From the project FAQ: + +**Q: Is PTlibzippy Y2K-compliant?** +A: Yes. PTlibzippy doesn't handle dates. + +**Q: Can zlib handle .zip archives?** +A: Not by itself. See `contrib/minizip`. + +**Q: Can zlib handle .Z files?** +A: No. Use `uncompress` or `gunzip` subprocess. + +**Q: How can I make a Unix shared library?** +A: Default build produces shared + static libraries: +```bash +make distclean +./configure +make +``` + +--- + +## Language Bindings + +PTlibzippy (and its zlib base) was accessible from many languages: + +| Language | Interface | +|----------|---------------------------------------------------| +| C | Native API via `ptlibzippy.h` | +| C++ | Direct C API usage | +| Ada | `contrib/ada/` bindings | +| C# (.NET)| `contrib/dotzlib/` bindings | +| Delphi | `contrib/delphi/` bindings | +| Java | `java.util.zip` package (JDK built-in) | +| Perl | IO::Compress module | +| Python | `zlib` module (Python standard library) | +| Tcl | Built-in zlib support | + +--- + +## Integration with ProjT Launcher + +In the ProjT Launcher's CMake build, PTlibzippy was used via: + +```cmake +# From cmake/usePTlibzippy.cmake (referenced in the launcher's cmake/ directory) +``` + +The launcher's `CMakeLists.txt` imported PTlibzippy alongside other compression +libraries (bzip2, quazip) to handle: + +- Mod archive extraction (`.zip`, `.jar` files) +- Instance backup/restore +- Asset pack handling +- Modpack import/export (Modrinth `.mrpack`, CurseForge `.zip` formats) + +--- + +## Why It Was Archived + +PTlibzippy was archived when: + +1. **Symbol conflict resolution matured** — The launcher's build system evolved to + handle zlib/libpng coexistence without a custom fork +2. 
**System zlib preferred** — Using the system's zlib package reduced maintenance + burden and ensured security patches were applied promptly +3. **Launcher archived** — When ProjT Launcher was archived, its dependency libraries + (including PTlibzippy) were archived alongside it +4. **MeshMC approach** — The successor launcher (MeshMC) uses system libraries or + vendored sources with different conflict resolution strategies + +--- + +## Building (for Reference) + +### CMake + +```bash +cd archived/ptlibzippy/ +mkdir build && cd build +cmake .. +make +make test +``` + +### With Symbol Prefixing + +```bash +cmake .. -DPTLIBZIPPY_PREFIX=pt_ +make +``` + +### Autotools + +```bash +cd archived/ptlibzippy/ +./configure +make +make test +make install +``` + +### Static-Only Build + +```bash +cmake .. -DPTLIBZIPPY_BUILD_SHARED=OFF +make +``` + +Note: Build success is not guaranteed since the archived code is not maintained. + +--- + +## File Index + +From the project `INDEX` file: + +``` +CMakeLists.txt cmake build file +ChangeLog history of changes +FAQ Frequently Asked Questions about zlib +INDEX file listing +Makefile dummy Makefile that tells you to ./configure +Makefile.in template for Unix Makefile +README project overview +configure configure script for Unix +make_vms.com makefile for VMS +test/example.c zlib usage examples for build testing +test/minigzip.c minimal gzip-like functionality for build testing +test/infcover.c inf*.c code coverage for build coverage testing +treebuild.xml XML description of source file dependencies +ptzippyconf.h zlib configuration header (template) +ptlibzippy.h zlib public API header + +amiga/ makefiles for Amiga SAS C +doc/ documentation for formats and algorithms +msdos/ makefiles for MSDOS +old/ legacy makefiles and documentation +os400/ makefiles for OS/400 +qnx/ makefiles for QNX +watcom/ makefiles for OpenWatcom +win32/ makefiles for Windows +``` + +--- + +## Ownership + +Maintained by `@YongDo-Hyun` as defined in `ci/OWNERS`: + 
+``` +/archived/ptlibzippy/ @YongDo-Hyun +``` diff --git a/docs/handbook/cgit/api-reference.md b/docs/handbook/cgit/api-reference.md new file mode 100644 index 0000000000..0c38564e74 --- /dev/null +++ b/docs/handbook/cgit/api-reference.md @@ -0,0 +1,468 @@ +# cgit — API Reference + +## Overview + +This document catalogs all public function prototypes, types, and global +variables exported by cgit's header files. Functions are grouped by header +file and module. + +## `cgit.h` — Core Types and Functions + +### Core Structures + +```c +struct cgit_environment { + const char *cgit_config; /* CGIT_CONFIG env var */ + const char *http_host; /* HTTP_HOST */ + const char *https; /* HTTPS */ + const char *no_http; /* NO_HTTP */ + const char *http_cookie; /* HTTP_COOKIE */ + const char *request_method; /* REQUEST_METHOD */ + const char *query_string; /* QUERY_STRING */ + const char *http_referer; /* HTTP_REFERER */ + const char *path_info; /* PATH_INFO */ + const char *script_name; /* SCRIPT_NAME */ + const char *server_name; /* SERVER_NAME */ + const char *server_port; /* SERVER_PORT */ + const char *http_accept; /* HTTP_ACCEPT */ + int authenticated; /* authentication result */ +}; + +struct cgit_query { + char *raw; + char *repo; + char *page; + char *search; + char *grep; + char *head; + char *sha1; + char *sha2; + char *path; + char *name; + char *url; + char *mimetype; + char *etag; + int nohead; + int ofs; + int has_symref; + int has_sha1; + int has_dot; + int ignored; + char *sort; + int showmsg; + int ssdiff; + int show_all; + int context; + int follow; + int dt; + int log_hierarchical_threading; +}; + +struct cgit_page { + const char *mimetype; + const char *charset; + const char *filename; + const char *etag; + const char *title; + int status; + time_t modified; + time_t expires; + size_t size; +}; + +struct cgit_config { + char *root_title; + char *root_desc; + char *root_readme; + char *root_coc; + char *root_cla; + char *root_homepage; + char 
*root_homepage_title; + struct string_list root_links; + char *css; + struct string_list css_list; + char *js; + struct string_list js_list; + char *logo; + char *logo_link; + char *favicon; + char *header; + char *footer; + char *head_include; + char *module_link; + char *virtual_root; + char *script_name; + char *section; + char *cache_root; + char *robots; + char *clone_prefix; + char *clone_url; + char *readme; + char *agefile; + char *project_list; + char *strict_export; + char *mimetype_file; + /* ... filter pointers, integer flags, limits ... */ + int cache_size; + int cache_root_ttl; + int cache_repo_ttl; + int cache_dynamic_ttl; + int cache_static_ttl; + int cache_about_ttl; + int cache_snapshot_ttl; + int cache_scanrc_ttl; + int max_repo_count; + int max_commit_count; + int max_message_length; + int max_repodesc_length; + int max_blob_size; + int max_stats; + int max_atom_items; + int max_subtree_commits; + int summary_branches; + int summary_tags; + int summary_log; + int snapshots; + int enable_http_clone; + int enable_index_links; + int enable_index_owner; + int enable_blame; + int enable_commit_graph; + int enable_log_filecount; + int enable_log_linecount; + int enable_remote_branches; + int enable_subject_links; + int enable_html_serving; + int enable_subtree; + int enable_tree_linenumbers; + int enable_git_config; + int enable_filter_overrides; + int enable_follow_links; + int embedded; + int noheader; + int noplainemail; + int local_time; + int case_sensitive_sort; + int section_sort; + int section_from_path; + int side_by_side_diffs; + int remove_suffix; + int scan_hidden_path; + int branch_sort; + int commit_sort; + int renamelimit; +}; + +struct cgit_repo { + char *url; + char *name; + char *basename; + char *path; + char *desc; + char *owner; + char *homepage; + char *defbranch; + char *section; + char *clone_url; + char *logo; + char *logo_link; + char *readme; + char *module_link; + char *extra_head_content; + char *snapshot_prefix; + struct 
string_list badges; + struct cgit_filter *about_filter; + struct cgit_filter *commit_filter; + struct cgit_filter *source_filter; + struct cgit_filter *email_filter; + struct cgit_filter *owner_filter; + int snapshots; + int enable_blame; + int enable_commit_graph; + int enable_log_filecount; + int enable_log_linecount; + int enable_remote_branches; + int enable_subject_links; + int enable_html_serving; + int enable_subtree; + int max_stats; + int max_subtree_commits; + int branch_sort; + int commit_sort; + int hide; + int ignore; +}; + +struct cgit_context { + struct cgit_environment env; + struct cgit_query qry; + struct cgit_config cfg; + struct cgit_page page; + struct cgit_repo *repo; +}; +``` + +### Global Variables + +```c +extern struct cgit_context ctx; +extern struct cgit_repolist cgit_repolist; +extern const char *cgit_version; +``` + +### Repository Management + +```c +extern struct cgit_repo *cgit_add_repo(const char *url); +extern struct cgit_repo *cgit_get_repoinfo(const char *url); +``` + +### Parsing Functions + +```c +extern void cgit_parse_url(const char *url); +extern struct commitinfo *cgit_parse_commit(struct commit *commit); +extern struct taginfo *cgit_parse_tag(struct tag *tag); +extern void cgit_free_commitinfo(struct commitinfo *info); +extern void cgit_free_taginfo(struct taginfo *info); +``` + +### Diff Functions + +```c +typedef void (*filepair_fn)(struct diff_filepair *pair); +typedef void (*linediff_fn)(char *line, int len); + +extern void cgit_diff_tree(const struct object_id *old_oid, + const struct object_id *new_oid, + filepair_fn fn, const char *prefix, + int renamelimit); +extern void cgit_diff_commit(struct commit *commit, filepair_fn fn, + const char *prefix); +extern void cgit_diff_files(const struct object_id *old_oid, + const struct object_id *new_oid, + unsigned long *old_size, + unsigned long *new_size, + int *binary, int context, + int ignorews, linediff_fn fn); +``` + +### Snapshot Functions + +```c +extern int 
cgit_parse_snapshots_mask(const char *str); + +extern const struct cgit_snapshot_format cgit_snapshot_formats[]; +``` + +### Filter Functions + +```c +extern struct cgit_filter *cgit_new_filter(const char *cmd, filter_type type); +extern int cgit_open_filter(struct cgit_filter *filter, ...); +extern int cgit_close_filter(struct cgit_filter *filter); +``` + +### Utility Functions + +```c +extern const char *cgit_repobasename(const char *reponame); +extern char *cgit_default_repo_desc; +extern int cgit_ref_path_exists(const char *path, const char *ref, int file_only); +``` + +## `html.h` — HTML Output Functions + +```c +extern const char *fmt(const char *format, ...); +extern char *fmtalloc(const char *format, ...); + +extern void html_raw(const char *data, size_t size); +extern void html(const char *txt); +extern void htmlf(const char *format, ...); +extern void html_txt(const char *txt); +extern void html_ntxt(const char *txt, int len); +extern void html_attr(const char *txt); +extern void html_url_path(const char *txt); +extern void html_url_arg(const char *txt); +extern void html_hidden(const char *name, const char *value); +extern void html_option(const char *value, const char *text, + const char *selected_value); +extern void html_link_open(const char *url, const char *title, + const char *class); +extern void html_link_close(void); +extern void html_include(const char *filename); +extern void html_checkbox(const char *name, int value); +extern void html_txt_input(const char *name, const char *value, int size); +``` + +## `ui-shared.h` — Page Layout and Links + +### HTTP and Layout + +```c +extern void cgit_print_http_headers(void); +extern void cgit_print_docstart(void); +extern void cgit_print_docend(void); +extern void cgit_print_pageheader(void); +extern void cgit_print_layout_start(void); +extern void cgit_print_layout_end(void); +extern void cgit_print_error(const char *msg); +extern void cgit_print_error_page(int code, const char *msg, + const char *fmt, 
...); +``` + +### URL Generation + +```c +extern const char *cgit_repourl(const char *reponame); +extern const char *cgit_fileurl(const char *reponame, const char *pagename, + const char *filename, const char *query); +extern const char *cgit_pageurl(const char *reponame, const char *pagename, + const char *query); +extern const char *cgit_currurl(void); +extern const char *cgit_rooturl(void); +``` + +### Link Functions + +```c +extern void cgit_summary_link(const char *name, const char *title, + const char *class, const char *head); +extern void cgit_tag_link(const char *name, const char *title, + const char *class, const char *tag); +extern void cgit_tree_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +extern void cgit_log_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path, + int ofs, const char *grep, const char *pattern, + int showmsg, int follow); +extern void cgit_commit_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +extern void cgit_patch_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +extern void cgit_refs_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +extern void cgit_diff_link(const char *name, const char *title, + const char *class, const char *head, + const char *new_rev, const char *old_rev, + const char *path, int toggle); +extern void cgit_stats_link(const char *name, const char *title, + const char *class, const char *head, + const char *path); +extern void cgit_plain_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +extern void cgit_blame_link(const char *name, const char *title, + const char *class, const char *head, + const char 
*rev, const char *path); +extern void cgit_object_link(struct object *obj); +extern void cgit_submodule_link(const char *name, const char *path, + const char *commit); +extern void cgit_print_snapshot_links(const char *repo, const char *head, + const char *hex, int snapshots); +extern void cgit_print_branches(int max); +extern void cgit_print_tags(int max); +``` + +### Diff Helpers + +```c +extern void cgit_print_diff_hunk_header(int oldofs, int oldcnt, + int newofs, int newcnt, + const char *func); +extern void cgit_print_diff_line_prefix(int type); +``` + +## `cmd.h` — Command Dispatch + +```c +struct cgit_cmd { + const char *name; + void (*fn)(struct cgit_context *ctx); + unsigned int want_hierarchical:1; + unsigned int want_repo:1; + unsigned int want_layout:1; + unsigned int want_vpath:1; + unsigned int is_clone:1; +}; + +extern struct cgit_cmd *cgit_get_cmd(const char *name); +``` + +## `cache.h` — Cache System + +```c +typedef void (*cache_fill_fn)(void *cbdata); + +extern int cache_process(int size, const char *path, const char *key, + int ttl, cache_fill_fn fn, void *cbdata); +extern int cache_ls(const char *path); +extern unsigned long hash_str(const char *str); +``` + +## `configfile.h` — Configuration File Parser + +```c +typedef void (*configfile_value_fn)(const char *name, const char *value); + +extern int parse_configfile(const char *filename, configfile_value_fn fn); +``` + +## `scan-tree.h` — Repository Scanner + +```c +typedef void (*repo_config_fn)(struct cgit_repo *repo, + const char *name, const char *value); + +extern void scan_projects(const char *path, const char *projectsfile, + repo_config_fn fn); +extern void scan_tree(const char *path, repo_config_fn fn); +``` + +## `filter.c` — Filter Types + +```c +#define ABOUT_FILTER 0 +#define COMMIT_FILTER 1 +#define SOURCE_FILTER 2 +#define EMAIL_FILTER 3 +#define AUTH_FILTER 4 +#define OWNER_FILTER 5 + +typedef int filter_type; +``` + +## UI Module Entry Points + +Each `ui-*.c` module exposes one 
or more public functions: + +| Module | Function | Description | +|--------|----------|-------------| +| `ui-atom.c` | `cgit_print_atom(char *tip, char *path, int max)` | Generate Atom feed | +| `ui-blame.c` | `cgit_print_blame(void)` | Render blame view | +| `ui-blob.c` | `cgit_print_blob(const char *hex, char *path, const char *head, int file_only)` | Display blob content | +| `ui-clone.c` | `cgit_clone_info(void)` | HTTP clone: `info/refs` | +| `ui-clone.c` | `cgit_clone_objects(void)` | HTTP clone: pack objects | +| `ui-clone.c` | `cgit_clone_head(void)` | HTTP clone: `HEAD` ref | +| `ui-commit.c` | `cgit_print_commit(const char *rev, const char *prefix)` | Display commit | +| `ui-diff.c` | `cgit_print_diff(const char *new_rev, const char *old_rev, const char *prefix, int show_ctrls, int raw)` | Render diff | +| `ui-diff.c` | `cgit_print_diffstat(const struct object_id *old, const struct object_id *new, const char *prefix)` | Render diffstat | +| `ui-log.c` | `cgit_print_log(const char *tip, int ofs, int cnt, char *grep, char *pattern, char *path, int pager, int commit_graph, int commit_sort)` | Display log | +| `ui-patch.c` | `cgit_print_patch(const char *new_rev, const char *old_rev, const char *prefix)` | Generate patch | +| `ui-plain.c` | `cgit_print_plain(void)` | Serve raw file content | +| `ui-refs.c` | `cgit_print_refs(void)` | Display branches and tags | +| `ui-repolist.c` | `cgit_print_repolist(void)` | Repository index page | +| `ui-snapshot.c` | `cgit_print_snapshot(const char *head, const char *hex, const char *prefix, const char *filename, int snapshots)` | Generate archive | +| `ui-stats.c` | `cgit_print_stats(void)` | Display statistics | +| `ui-summary.c` | `cgit_print_summary(void)` | Repository summary page | +| `ui-ssdiff.c` | `cgit_ssdiff_header_begin(void)` | Start ssdiff output | +| `ui-ssdiff.c` | `cgit_ssdiff_header_end(void)` | End ssdiff header | +| `ui-ssdiff.c` | `cgit_ssdiff_footer(void)` | End ssdiff output | +| `ui-tag.c` | 
`cgit_print_tag(const char *revname)` | Display tag | +| `ui-tree.c` | `cgit_print_tree(const char *rev, char *path)` | Display tree | diff --git a/docs/handbook/cgit/architecture.md b/docs/handbook/cgit/architecture.md new file mode 100644 index 0000000000..e35633a505 --- /dev/null +++ b/docs/handbook/cgit/architecture.md @@ -0,0 +1,422 @@ +# cgit — Architecture + +## High-Level Component Map + +``` +┌──────────────────────────────────────────────────────────────┐ +│ cgit.c │ +│ constructor_environment() [__attribute__((constructor))] │ +│ prepare_context() → config_cb() → querystring_cb() │ +│ authenticate_cookie() → process_request() → main() │ +├──────────────────────────────────────────────────────────────┤ +│ Command Dispatcher │ +│ cmd.c │ +│ cgit_get_cmd() → static cmds[] table (23 entries) │ +│ struct cgit_cmd { name, fn, want_repo, want_vpath, is_clone }│ +├──────────┬───────────┬───────────┬───────────────────────────┤ +│ UI Layer │ Caching │ Filters │ HTTP Clone │ +│ ui-*.c │ cache.c │ filter.c │ ui-clone.c │ +│ (17 mods)│ cache.h │ │ │ +├──────────┴───────────┴───────────┴───────────────────────────┤ +│ Core Utilities │ +│ shared.c — global vars, repo mgmt, diff wrappers │ +│ parsing.c — cgit_parse_commit(), cgit_parse_tag(), │ +│ cgit_parse_url() │ +│ html.c — entity escaping, URL encoding, form helpers │ +│ configfile.c — line-oriented name=value parser │ +│ scan-tree.c — filesystem repository discovery │ +├──────────────────────────────────────────────────────────────┤ +│ Vendored git library │ +│ git/ — full Git 2.46.0 source; linked via cgit.mk │ +│ Provides: object store, diff engine (xdiff), refs, revwalk, │ +│ archive, notes, commit graph, blame, packfile │ +└──────────────────────────────────────────────────────────────┘ +``` + +## Global State + +cgit uses a single global variable to carry all request state: + +```c +/* shared.c */ +struct cgit_repolist cgit_repolist; /* Array of all known repositories */ +struct cgit_context ctx; /* Current 
request context */ +``` + +### `struct cgit_context` + +```c +struct cgit_context { + struct cgit_environment env; /* CGI env vars (HTTP_HOST, QUERY_STRING, etc.) */ + struct cgit_query qry; /* Parsed URL/query parameters */ + struct cgit_config cfg; /* All global config directives */ + struct cgit_repo *repo; /* Pointer into cgit_repolist.repos[] or NULL */ + struct cgit_page page; /* HTTP response metadata (mimetype, status, etag) */ +}; +``` + +### `struct cgit_environment` + +Populated by `prepare_context()` via `getenv()`: + +```c +struct cgit_environment { + const char *cgit_config; /* $CGIT_CONFIG (default: /etc/cgitrc) */ + const char *http_host; /* $HTTP_HOST */ + const char *https; /* $HTTPS ("on" if TLS) */ + const char *no_http; /* $NO_HTTP (non-NULL → CLI mode) */ + const char *path_info; /* $PATH_INFO */ + const char *query_string; /* $QUERY_STRING */ + const char *request_method; /* $REQUEST_METHOD */ + const char *script_name; /* $SCRIPT_NAME */ + const char *server_name; /* $SERVER_NAME */ + const char *server_port; /* $SERVER_PORT */ + const char *http_cookie; /* $HTTP_COOKIE */ + const char *http_referer; /* $HTTP_REFERER */ + unsigned int content_length; /* $CONTENT_LENGTH */ + int authenticated; /* Set by auth filter (0 or 1) */ +}; +``` + +### `struct cgit_page` + +Controls HTTP response headers: + +```c +struct cgit_page { + time_t modified; /* Last-Modified header */ + time_t expires; /* Expires header */ + size_t size; /* Content-Length (0 = omit) */ + const char *mimetype; /* Content-Type (default: "text/html") */ + const char *charset; /* charset param (default: "UTF-8") */ + const char *filename; /* Content-Disposition filename */ + const char *etag; /* ETag header value */ + const char *title; /* HTML <title> */ + int status; /* HTTP status code (0 = 200) */ + const char *statusmsg; /* HTTP status message */ +}; +``` + +## Request Lifecycle — Detailed + +### Phase 1: Pre-main Initialization + +```c +__attribute__((constructor)) +static void 
constructor_environment() +{ + setenv("GIT_CONFIG_NOSYSTEM", "1", 1); + setenv("GIT_ATTR_NOSYSTEM", "1", 1); + unsetenv("HOME"); + unsetenv("XDG_CONFIG_HOME"); +} +``` + +This runs before `main()` on every invocation. It prevents Git from loading +`/etc/gitconfig`, `~/.gitconfig`, or any `$XDG_CONFIG_HOME/git/config`, ensuring +complete isolation from the host system's Git configuration. + +### Phase 2: Context Preparation + +`prepare_context()` zero-initializes `ctx` and sets every configuration field +to its default value. Key defaults: + +| Field | Default | +|-------|---------| +| `cfg.cache_size` | `0` (disabled) | +| `cfg.cache_root` | `CGIT_CACHE_ROOT` (`/var/cache/cgit`) | +| `cfg.cache_repo_ttl` | `5` minutes | +| `cfg.cache_root_ttl` | `5` minutes | +| `cfg.cache_static_ttl` | `-1` (never expires) | +| `cfg.max_repo_count` | `50` | +| `cfg.max_commit_count` | `50` | +| `cfg.max_msg_len` | `80` | +| `cfg.max_repodesc_len` | `80` | +| `cfg.enable_http_clone` | `1` | +| `cfg.enable_index_owner` | `1` | +| `cfg.enable_tree_linenumbers` | `1` | +| `cfg.summary_branches` | `10` | +| `cfg.summary_log` | `10` | +| `cfg.summary_tags` | `10` | +| `cfg.difftype` | `DIFF_UNIFIED` | +| `cfg.robots` | `"index, nofollow"` | +| `cfg.root_title` | `"Git repository browser"` | + +The function also reads all CGI environment variables and sets +`page.mimetype = "text/html"`, `page.charset = PAGE_ENCODING` (`"UTF-8"`). + +### Phase 3: Configuration Parsing + +```c +parse_configfile(ctx.env.cgit_config, config_cb); +``` + +`parse_configfile()` (in `configfile.c`) opens the file, reads lines of the +form `name=value`, skips comments (`#` and `;`), and calls the callback for each +directive. It supports recursive `include=` directives up to 8 levels deep. + +`config_cb()` (in `cgit.c`) is a ~200-line chain of `if/else if` blocks that +maps directive names to `ctx.cfg.*` fields. 
When `repo.url=` is encountered, +`cgit_add_repo()` allocates a new repository entry; subsequent `repo.*` +directives configure that entry via `repo_config()`. + +Special directive: `scan-path=` triggers immediate filesystem scanning via +`scan_tree()` or `scan_projects()`, or via a cached repolist file if +`cache-size > 0`. + +### Phase 4: Query String Parsing + +```c +http_parse_querystring(ctx.qry.raw, querystring_cb); +``` + +`querystring_cb()` maps short parameter names to `ctx.qry.*` fields: + +| Parameter | Field | Purpose | +|-----------|-------|---------| +| `r` | `qry.repo` | Repository URL | +| `p` | `qry.page` | Page name | +| `url` | `qry.url` | Combined repo/page/path | +| `h` | `qry.head` | Branch/ref | +| `id` | `qry.oid` | Object ID | +| `id2` | `qry.oid2` | Second object ID (for diffs) | +| `ofs` | `qry.ofs` | Pagination offset | +| `q` | `qry.search` | Search query | +| `qt` | `qry.grep` | Search type | +| `path` | `qry.path` | File path | +| `name` | `qry.name` | Snapshot filename | +| `dt` | `qry.difftype` | Diff type (0/1/2) | +| `context` | `qry.context` | Diff context lines | +| `ignorews` | `qry.ignorews` | Ignore whitespace | +| `follow` | `qry.follow` | Follow renames | +| `showmsg` | `qry.showmsg` | Show full messages | +| `s` | `qry.sort` | Sort order | +| `period` | `qry.period` | Stats period | + +The `url=` parameter receives special processing via `cgit_parse_url()` (in +`parsing.c`), which iteratively splits the URL at `/` characters, looking for +the longest prefix that matches a known repository URL. + +### Phase 5: Authentication + +`authenticate_cookie()` checks three cases: + +1. **No auth filter** → set `ctx.env.authenticated = 1` and return. +2. **POST to login page** → call `authenticate_post()`, which reads up to + `MAX_AUTHENTICATION_POST_BYTES` (4096) from stdin, pipes it to the auth + filter with function `"authenticate-post"`, and exits. +3. 
**Normal request** → invoke auth filter with function + `"authenticate-cookie"`. The filter's exit code becomes + `ctx.env.authenticated`. + +The auth filter receives 12 arguments: + +``` +function, cookie, method, query_string, http_referer, +path_info, http_host, https, repo, page, fullurl, loginurl +``` + +### Phase 6: Cache Envelope + +If `ctx.cfg.cache_size > 0`, the request is wrapped in `cache_process()`: + +```c +cache_process(ctx.cfg.cache_size, ctx.cfg.cache_root, + cache_key, ttl, fill_fn); +``` + +This constructs a filename from the FNV-1 hash of the cache key, attempts to +open an existing slot, verifies the key matches, checks expiry, and either +serves cached content or locks and fills a new slot. See the Caching System +document for full details. + +### Phase 7: Command Dispatch + +```c +cmd = cgit_get_cmd(); +``` + +`cgit_get_cmd()` (in `cmd.c`) performs a linear scan of the static `cmds[]` +table: + +```c +static struct cgit_cmd cmds[] = { + def_cmd(HEAD, 1, 0, 1), + def_cmd(atom, 1, 0, 0), + def_cmd(about, 0, 0, 0), + def_cmd(blame, 1, 1, 0), + def_cmd(blob, 1, 0, 0), + def_cmd(cla, 0, 0, 0), + def_cmd(commit, 1, 1, 0), + def_cmd(coc, 0, 0, 0), + def_cmd(diff, 1, 1, 0), + def_cmd(info, 1, 0, 1), + def_cmd(log, 1, 1, 0), + def_cmd(ls_cache, 0, 0, 0), + def_cmd(objects, 1, 0, 1), + def_cmd(patch, 1, 1, 0), + def_cmd(plain, 1, 0, 0), + def_cmd(rawdiff, 1, 1, 0), + def_cmd(refs, 1, 0, 0), + def_cmd(repolist, 0, 0, 0), + def_cmd(snapshot, 1, 0, 0), + def_cmd(stats, 1, 1, 0), + def_cmd(summary, 1, 0, 0), + def_cmd(tag, 1, 0, 0), + def_cmd(tree, 1, 1, 0), +}; +``` + +The `def_cmd` macro expands to `{#name, name##_fn, want_repo, want_vpath, is_clone}`. + +Default page if none specified: +- With a repository → `"summary"` +- Without a repository → `"repolist"` + +### Phase 8: Repository Preparation + +If `cmd->want_repo` is set: + +1. 
`prepare_repo_env()` calls `setenv("GIT_DIR", ctx.repo->path, 1)`, + `setup_git_directory_gently()`, and `load_display_notes()`. +2. `prepare_repo_cmd()` resolves the default branch (via `guess_defbranch()` + which checks `HEAD` → `refs/heads/*`), resolves the requested head to an OID, + sorts submodules, chooses the README file, and sets the page title. + +### Phase 9: Page Rendering + +The handler function (`cmd->fn()`) is called. Most handlers follow this +pattern: + +```c +cgit_print_layout_start(); /* HTTP headers + HTML doctype + header + tabs */ +/* ... page-specific content ... */ +cgit_print_layout_end(); /* footer + closing tags */ +``` + +`cgit_print_layout_start()` calls: +- `cgit_print_http_headers()` — Content-Type, Last-Modified, Expires, ETag +- `cgit_print_docstart()` — `<!DOCTYPE html>`, `<html>`, CSS/JS includes +- `cgit_print_pageheader()` — header table, navigation tabs, breadcrumbs + +## Module Dependency Graph + +``` +cgit.c ──→ cmd.c ──→ ui-*.c (all modules) + │ │ + │ └──→ cache.c + │ + ├──→ configfile.c + ├──→ scan-tree.c ──→ configfile.c + ├──→ ui-shared.c ──→ html.c + ├──→ ui-stats.c + ├──→ ui-blob.c + ├──→ ui-summary.c + └──→ filter.c + +ui-commit.c ──→ ui-diff.c ──→ ui-ssdiff.c +ui-summary.c ──→ ui-log.c, ui-refs.c, ui-blob.c, ui-plain.c +ui-log.c ──→ ui-shared.c +All ui-*.c ──→ html.c, ui-shared.c +``` + +## The `struct cgit_cmd` Pattern + +Each command in `cmd.c` is defined as a static function that wraps the +corresponding UI module: + +```c +static void log_fn(void) +{ + cgit_print_log(ctx.qry.oid, ctx.qry.ofs, ctx.cfg.max_commit_count, + ctx.qry.grep, ctx.qry.search, ctx.qry.path, 1, + ctx.repo->enable_commit_graph, + ctx.repo->commit_sort); +} +``` + +The thin wrapper pattern means all context is accessed via the global `ctx` +struct, and the wrapper simply extracts the relevant fields and passes them to +the module function. 
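
The table-plus-thin-wrapper pattern described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not cgit's own code: `demo_cmd`, `DEMO_CMD`, `demo_cmds`, `demo_get_cmd`, and `last_called` are hypothetical names, the handlers only record that they ran, and the lookup takes the page name and a "have repository" flag as parameters instead of reading the global `ctx`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct cgit_cmd: a name, a handler, two flags. */
struct demo_cmd {
	const char *name;
	void (*fn)(void);
	unsigned int want_repo:1;
	unsigned int want_vpath:1;
};

/* Records which handler ran; stands in for real page output. */
static const char *last_called;

static void summary_fn(void) { last_called = "summary"; }
static void log_fn(void) { last_called = "log"; }
static void repolist_fn(void) { last_called = "repolist"; }

/* Stringize the name and paste "_fn" onto it, in the spirit of def_cmd. */
#define DEMO_CMD(name, want_repo, want_vpath) \
	{ #name, name##_fn, want_repo, want_vpath }

static struct demo_cmd demo_cmds[] = {
	DEMO_CMD(log, 1, 1),
	DEMO_CMD(repolist, 0, 0),
	DEMO_CMD(summary, 1, 0),
};

/* Linear scan of the table; a NULL name falls back to a default page. */
static struct demo_cmd *demo_get_cmd(const char *name, int have_repo)
{
	size_t i;

	if (!name)
		name = have_repo ? "summary" : "repolist";
	for (i = 0; i < sizeof(demo_cmds) / sizeof(demo_cmds[0]); i++)
		if (!strcmp(name, demo_cmds[i].name))
			return &demo_cmds[i];
	return NULL;
}
```

A caller would look up the entry, check flags such as `want_repo` before preparing the repository, and then invoke `entry->fn()`. An unknown page name yields `NULL`, which the caller has to turn into an error page.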
+ +## Repository List Management + +The `cgit_repolist` global is a dynamically-growing array: + +```c +struct cgit_repolist { + int length; /* Allocated capacity */ + int count; /* Number of repos */ + struct cgit_repo *repos; /* Array */ +}; +``` + +`cgit_add_repo()` doubles the array capacity when needed (starting from 8). +Each new repo inherits defaults from `ctx.cfg.*` (snapshots, feature flags, +filters, etc.). + +`cgit_get_repoinfo()` performs a linear scan (O(n)) to find a repo by URL. +Ignored repos (`repo->ignore == 1`) are skipped. + +## Build System + +The build works in two stages: + +1. **Git build** — `make` in the top-level `cgit/` directory delegates to + `make -C git -f ../cgit.mk` which includes Git's own `Makefile`. + +2. **cgit link** — `cgit.mk` lists all cgit object files (`CGIT_OBJ_NAMES`), + compiles them with `CGIT_CFLAGS` (which embeds `CGIT_CONFIG`, + `CGIT_SCRIPT_NAME`, `CGIT_CACHE_ROOT` as string literals), and links them + against Git's `libgit.a`. + +Lua support is auto-detected via `pkg-config` (checking `luajit`, `lua`, +`lua5.2`, `lua5.1` in order). Define `NO_LUA=1` to build without Lua. +Linux systems get `HAVE_LINUX_SENDFILE` which enables the `sendfile()` syscall +in the cache layer. + +## Thread Safety + +cgit runs as a **single-process CGI** — one process per HTTP request. There is +no multi-threading. All global state (`ctx`, `cgit_repolist`, the static +`diffbuf` in `shared.c`, the static format buffers in `html.c`) is safe because +each process is fully isolated. + +The `fmt()` function in `html.c` uses a ring buffer of 8 static buffers +(`static char buf[8][1024]`) to allow up to 8 nested `fmt()` calls in a single +expression. The `bufidx` rotates via `bufidx = (bufidx + 1) & 7`. 
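
The rotating-buffer idea behind `fmt()` can be illustrated with a small stand-alone sketch. It assumes the layout quoted above (eight static buffers of 1024 bytes, index advanced with a power-of-two mask); `demo_fmt`, `DEMO_SLOTS`, and `DEMO_SLOT_LEN` are hypothetical names, and returning `NULL` on truncation is a choice made for this sketch only, not a statement about how cgit reacts.

```c
#include <stdarg.h>
#include <stdio.h>

#define DEMO_SLOTS 8		/* must stay a power of two for the mask below */
#define DEMO_SLOT_LEN 1024

/*
 * Hypothetical fmt()-style helper: format into the next of eight static
 * buffers and return a pointer to that buffer.
 */
static const char *demo_fmt(const char *format, ...)
{
	static char buf[DEMO_SLOTS][DEMO_SLOT_LEN];
	static int bufidx;
	va_list args;
	int len;

	/* Advance to the next slot, wrapping from 7 back to 0. */
	bufidx = (bufidx + 1) & (DEMO_SLOTS - 1);
	va_start(args, format);
	len = vsnprintf(buf[bufidx], sizeof(buf[bufidx]), format, args);
	va_end(args);
	if (len < 0 || len >= (int)sizeof(buf[bufidx]))
		return NULL;	/* encoding error or truncated output */
	return buf[bufidx];
}
```

A returned pointer stays valid for the next seven calls; the eighth call after it reuses the same slot and overwrites the text. That is why results that must outlive a single expression should be copied by the caller.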
+ +## Error Handling + +The codebase uses three assertion-style helpers from `shared.c`: + +```c +int chk_zero(int result, char *msg); /* die if result != 0 */ +int chk_positive(int result, char *msg); /* die if result <= 0 */ +int chk_non_negative(int result, char *msg); /* die if result < 0 */ +``` + +For user-facing errors, `cgit_print_error_page()` sets HTTP status, prints +headers, renders the page skeleton, and displays the error message. + +## Type System + +cgit uses two enums defined in `cgit.h`: + +```c +typedef enum { + DIFF_UNIFIED, DIFF_SSDIFF, DIFF_STATONLY +} diff_type; + +typedef enum { + ABOUT, COMMIT, SOURCE, EMAIL, AUTH, OWNER +} filter_type; +``` + +And three function pointer typedefs: + +```c +typedef void (*configfn)(const char *name, const char *value); +typedef void (*filepair_fn)(struct diff_filepair *pair); +typedef void (*linediff_fn)(char *line, int len); +``` diff --git a/docs/handbook/cgit/authentication.md b/docs/handbook/cgit/authentication.md new file mode 100644 index 0000000000..a4fe000a87 --- /dev/null +++ b/docs/handbook/cgit/authentication.md @@ -0,0 +1,288 @@ +# cgit — Authentication + +## Overview + +cgit supports cookie-based authentication through the `auth-filter` +mechanism. The authentication system intercepts requests before page +rendering and delegates all credential validation to an external filter +(exec or Lua script). + +Source files: `cgit.c` (authentication hooks), `filter.c` (filter execution). + +## Architecture + +Authentication is entirely filter-driven. cgit itself stores no credentials, +sessions, or user databases. The auth filter is responsible for: + +1. Rendering login forms +2. Validating credentials +3. Setting/reading session cookies +4. Determining authorization per-repository + +## Configuration + +```ini +auth-filter=lua:/path/to/auth.lua +# or +auth-filter=exec:/path/to/auth.sh +``` + +The auth filter type is `AUTH_FILTER` (constant `4`) and receives 12 +arguments. 
+ +## Authentication Flow + +### Request Processing in `cgit.c` + +Authentication is checked in `process_request()` after URL parsing and +command lookup, before the command handler runs: + +```c +/* In process_request() */ +if (ctx.cfg.auth_filter) { + /* Step 1: Check current authentication state */ + authenticate_cookie(); + + /* Step 2: Handle POST login attempts */ + if (ctx.env.request_method && + !strcmp(ctx.env.request_method, "POST")) + authenticate_post(); +} + +/* Step 3: Run the command handler (with or without an auth filter) */ +cmd->fn(&ctx); +``` + +### `authenticate_cookie()` + +Opens the auth filter to check the current session cookie: + +```c +static void authenticate_cookie(void) +{ + /* Open auth filter with current request context */ + cgit_open_filter(ctx.cfg.auth_filter, + ctx.env.http_cookie, /* current cookies */ + ctx.env.request_method, /* GET/POST */ + ctx.env.query_string, /* full query */ + ctx.env.http_referer, /* referer header */ + ctx.env.path_info, /* request path */ + ctx.env.http_host, /* hostname */ + ctx.env.https ? "1" : "0", /* HTTPS flag */ + ctx.qry.repo, /* repository name */ + ctx.qry.page, /* page/command */ + ctx.env.http_accept, /* accept header */ + "cookie" /* authentication phase */ + ); + /* The filter's exit code becomes the authentication state */ + ctx.env.authenticated = cgit_close_filter(ctx.cfg.auth_filter); +} +``` + +### `authenticate_post()` + +Handles login form submissions: + +```c +static void authenticate_post(void) +{ + /* Read POST body for credentials */ + /* Open auth filter with phase="post" */ + cgit_open_filter(ctx.cfg.auth_filter, + /* ... same 11 args ... 
*/ + "post" /* authentication phase */ + ); + /* Filter processes credentials, may set cookies */ + cgit_close_filter(ctx.cfg.auth_filter); +} +``` + +### Authorization Check + +After authentication, the auth filter is called again before rendering each +page to determine if the authenticated user has access to the requested +repository and page: + +```c +static int open_auth_filter(const char *repo, const char *page) +{ + cgit_open_filter(ctx.cfg.auth_filter, + /* ... request context ... */ + "authorize" /* authorization phase */ + ); + int authorized = cgit_close_filter(ctx.cfg.auth_filter); + return authorized == 0; /* 0 = authorized */ +} +``` + +## Auth Filter Arguments + +The auth filter receives 12 arguments in total: + +| # | Argument | Description | +|---|----------|-------------| +| 1 | `filter_cmd` | The filter command itself | +| 2 | `http_cookie` | Raw `HTTP_COOKIE` header value | +| 3 | `request_method` | HTTP method (`GET`, `POST`) | +| 4 | `query_string` | Raw query string | +| 5 | `http_referer` | HTTP Referer header | +| 6 | `path_info` | PATH_INFO from CGI | +| 7 | `http_host` | Hostname | +| 8 | `https` | `"1"` if HTTPS, `"0"` if HTTP | +| 9 | `repo` | Repository URL | +| 10 | `page` | Page/command name | +| 11 | `http_accept` | HTTP Accept header | +| 12 | `phase` | `"cookie"`, `"post"`, or `"authorize"` | + +## Filter Phases + +### `cookie` Phase + +Called on every request. The filter should: +1. Read the session cookie from argument 2 +2. Validate the session +3. Return exit code 0 if authenticated, non-zero otherwise + +### `post` Phase + +Called when the request method is POST. The filter should: +1. Read POST body from stdin +2. Validate credentials +3. If valid, output a `Set-Cookie` header +4. Output a redirect response (302) + +### `authorize` Phase + +Called after authentication to check per-repository access. The filter +should: +1. Check if the authenticated user can access the requested repo/page +2. 
Return exit code 0 if authorized +3. Return non-zero to deny access (cgit will show an error page) + +## Filter Return Codes + +| Exit Code | Meaning | +|-----------|---------| +| 0 | Success (authenticated/authorized) | +| Non-zero | Failure (unauthenticated/unauthorized) | + +## Environment Variables + +The auth filter also has access to standard CGI environment variables: + +```c +struct cgit_environment { + const char *cgit_config; /* $CGIT_CONFIG */ + const char *http_host; /* $HTTP_HOST */ + const char *https; /* $HTTPS */ + const char *no_http; /* $NO_HTTP */ + const char *http_cookie; /* $HTTP_COOKIE */ + const char *request_method; /* $REQUEST_METHOD */ + const char *query_string; /* $QUERY_STRING */ + const char *http_referer; /* $HTTP_REFERER */ + const char *path_info; /* $PATH_INFO */ + const char *script_name; /* $SCRIPT_NAME */ + const char *server_name; /* $SERVER_NAME */ + const char *server_port; /* $SERVER_PORT */ + const char *http_accept; /* $HTTP_ACCEPT */ + int authenticated; /* set by auth filter */ +}; +``` + +## Shipped Auth Filter + +cgit ships a Lua-based hierarchical authentication filter: + +### `filters/simple-hierarchical-auth.lua` + +This filter implements path-based access control using a simple user +database and repository permission map. + +Features: +- Cookie-based session management +- Per-repository access control +- Hierarchical path matching +- Password hashing + +Usage: + +```ini +auth-filter=lua:/usr/lib/cgit/filters/simple-hierarchical-auth.lua +``` + +## Cache Interaction + +Authentication affects cache keys. The cache key includes the +authentication state and cookie: + +```c +static const char *cache_key(void) +{ + return fmt("%s?%s?%s?%s?%s", + ctx.qry.raw, + ctx.env.http_host, + ctx.env.https ? "1" : "0", + ctx.env.authenticated ? "1" : "0", + ctx.env.http_cookie ? 
ctx.env.http_cookie : ""); +} +``` + +This ensures that: +- Authenticated and unauthenticated users get separate cache entries +- Different authenticated users (different cookies) get separate entries +- The cache never leaks restricted content to unauthorized users + +## Security Considerations + +1. **HTTPS**: Always use HTTPS when authentication is enabled to protect + cookies and credentials in transit +2. **Cookie flags**: Auth filter scripts should set `Secure`, `HttpOnly`, + and `SameSite` cookie flags +3. **Session expiry**: Implement session timeouts in the auth filter +4. **Password storage**: Never store passwords in plain text; use bcrypt or + similar hashing +5. **CSRF protection**: The auth filter should implement CSRF tokens for + POST login forms +6. **Cache poisoning**: The cache key includes auth state, but ensure the + auth filter is deterministic for the same cookie + +## Disabling Authentication + +By default, no auth filter is configured and all repositories are publicly +accessible. To restrict access, set up the auth filter and optionally +combine with `strict-export` for file-based visibility control. 
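
Returning to the cache key shown under Cache Interaction: the sketch below builds a key of the same shape into a caller-supplied buffer, which makes it easy to see why two users never share an entry. `demo_request` and `demo_cache_key` are hypothetical names introduced for illustration, the host name and cookie values in the usage note are made up, and the use of `snprintf` with an explicit length check is an assumption of this sketch.

```c
#include <stdio.h>

/* Hypothetical request fields that feed the key; not cgit's own struct. */
struct demo_request {
	const char *raw_query;
	const char *http_host;
	const char *https;		/* NULL when the request is plain HTTP */
	int authenticated;
	const char *http_cookie;	/* NULL when no cookie was sent */
};

/*
 * Join the fields with '?' separators. Returns 0 on success, or -1 if
 * the key does not fit into the buffer.
 */
static int demo_cache_key(const struct demo_request *rq, char *out,
			  size_t outlen)
{
	int n = snprintf(out, outlen, "%s?%s?%s?%s?%s",
			 rq->raw_query ? rq->raw_query : "",
			 rq->http_host ? rq->http_host : "",
			 rq->https ? "1" : "0",
			 rq->authenticated ? "1" : "0",
			 rq->http_cookie ? rq->http_cookie : "");

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For the same query and host, an anonymous request and a request carrying `session=abc` produce keys that differ in both the authentication flag and the cookie field, so they map to different cache entries.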
+ +## Example: Custom Auth Filter (Shell) + +```bash +#!/bin/bash +# Simple auth filter skeleton +PHASE="${12}" + +case "$PHASE" in + cookie) + COOKIE="$2" + if validate_session "$COOKIE"; then + exit 0 # authenticated + fi + exit 1 # not authenticated + ;; + post) + read -r POST_BODY + # Parse username/password from POST_BODY + # Validate credentials + # Set cookie header + echo "Status: 302 Found" + echo "Set-Cookie: session=TOKEN; HttpOnly; Secure" + echo "Location: $6" + echo + exit 0 + ;; + authorize) + REPO="$9" + # Check if current user can access $REPO + exit 0 # authorized + ;; +esac +``` diff --git a/docs/handbook/cgit/building.md b/docs/handbook/cgit/building.md new file mode 100644 index 0000000000..00f9e1244f --- /dev/null +++ b/docs/handbook/cgit/building.md @@ -0,0 +1,272 @@ +# cgit — Building + +## Prerequisites + +| Dependency | Required | Purpose | +|-----------|----------|---------| +| GCC or Clang | Yes | C compiler (C99) | +| GNU Make | Yes | Build system | +| OpenSSL (libcrypto) | Yes | SHA-1 hash implementation (`SHA1_HEADER = <openssl/sha.h>`) | +| zlib | Yes | Git object compression | +| libcurl | No | Not used — `NO_CURL=1` is passed by cgit.mk | +| Lua or LuaJIT | No | Lua filter support; auto-detected via pkg-config | +| asciidoc / a2x | No | Man page / HTML / PDF documentation generation | +| Python | No | Git's test harness (for `make test`) | + +## Build System Overview + +cgit uses a two-stage build that embeds itself within Git's build infrastructure: + +``` +cgit/Makefile + └── make -C git -f ../cgit.mk ../cgit + └── git/Makefile (included by cgit.mk) + └── Compile cgit objects + link against libgit.a +``` + +### Stage 1: Top-Level Makefile + +The top-level `Makefile` lives in `cgit/` and defines all user-configurable +variables: + +```makefile +CGIT_VERSION = 0.0.5-1-Project-Tick +CGIT_SCRIPT_NAME = cgit.cgi +CGIT_SCRIPT_PATH = /var/www/htdocs/cgit +CGIT_DATA_PATH = $(CGIT_SCRIPT_PATH) +CGIT_CONFIG = /etc/cgitrc +CACHE_ROOT = 
/var/cache/cgit +prefix = /usr/local +libdir = $(prefix)/lib +filterdir = $(libdir)/cgit/filters +docdir = $(prefix)/share/doc/cgit +mandir = $(prefix)/share/man +SHA1_HEADER = <openssl/sha.h> +GIT_VER = 2.46.0 +GIT_URL = https://www.kernel.org/pub/software/scm/git/git-$(GIT_VER).tar.xz +``` + +The main `cgit` target delegates to: + +```makefile +cgit: + $(QUIET_SUBDIR0)git $(QUIET_SUBDIR1) -f ../cgit.mk ../cgit NO_CURL=1 +``` + +This enters the `git/` subdirectory and runs `cgit.mk` from there, prefixing +all cgit source paths with `../`. + +### Stage 2: cgit.mk + +`cgit.mk` is run inside the `git/` directory so it can `include Makefile` to +inherit Git's build variables (`CC`, `CFLAGS`, linker flags, OS detection via +`config.mak.uname`, etc.). + +Key sections: + +#### Version tracking + +```makefile +$(CGIT_PREFIX)VERSION: force-version + @cd $(CGIT_PREFIX) && '$(SHELL_PATH_SQ)' ./gen-version.sh "$(CGIT_VERSION)" +``` + +The `gen-version.sh` script writes a `VERSION` file that is included by the +build. Only `cgit.o` references `CGIT_VERSION`, so only that object is rebuilt +when the version changes. + +#### CGIT_CFLAGS + +```makefile +CGIT_CFLAGS += -DCGIT_CONFIG='"$(CGIT_CONFIG)"' +CGIT_CFLAGS += -DCGIT_SCRIPT_NAME='"$(CGIT_SCRIPT_NAME)"' +CGIT_CFLAGS += -DCGIT_CACHE_ROOT='"$(CACHE_ROOT)"' +``` + +These compile-time constants are used in `cgit.c` as default values in +`prepare_context()`. + +#### Lua detection + +```makefile +LUA_PKGCONFIG := $(shell for pc in luajit lua lua5.2 lua5.1; do \ + $(PKG_CONFIG) --exists $$pc 2>/dev/null && echo $$pc && break; \ +done) +``` + +If Lua is found, its `--cflags` and `--libs` are appended to `CGIT_CFLAGS` and +`CGIT_LIBS`. If not found, `NO_LUA=YesPlease` is set and `-DNO_LUA` is added. 
+ +#### Linux sendfile + +```makefile +ifeq ($(uname_S),Linux) + HAVE_LINUX_SENDFILE = YesPlease +endif + +ifdef HAVE_LINUX_SENDFILE + CGIT_CFLAGS += -DHAVE_LINUX_SENDFILE +endif +``` + +This enables the `sendfile()` syscall in `cache.c` for zero-copy writes from +cache files to stdout. + +#### Object files + +All cgit source files are listed explicitly: + +```makefile +CGIT_OBJ_NAMES += cgit.o cache.o cmd.o configfile.o filter.o html.o +CGIT_OBJ_NAMES += parsing.o scan-tree.o shared.o +CGIT_OBJ_NAMES += ui-atom.o ui-blame.o ui-blob.o ui-clone.o ui-commit.o +CGIT_OBJ_NAMES += ui-diff.o ui-log.o ui-patch.o ui-plain.o ui-refs.o +CGIT_OBJ_NAMES += ui-repolist.o ui-shared.o ui-snapshot.o ui-ssdiff.o +CGIT_OBJ_NAMES += ui-stats.o ui-summary.o ui-tag.o ui-tree.o +``` + +The prefixed paths (`CGIT_OBJS := $(addprefix $(CGIT_PREFIX),$(CGIT_OBJ_NAMES))`) +point back to the `cgit/` directory from inside `git/`. + +## Quick Build + +```bash +cd cgit + +# Download the vendored Git source (required on first build) +make get-git + +# Build cgit binary +make -j$(nproc) +``` + +The output is a single binary named `cgit` in the `cgit/` directory. 
+ +## Build Variables Reference + +| Variable | Default | Description | +|----------|---------|-------------| +| `CGIT_VERSION` | `0.0.5-1-Project-Tick` | Compiled-in version string | +| `CGIT_SCRIPT_NAME` | `cgit.cgi` | Name of the installed CGI binary | +| `CGIT_SCRIPT_PATH` | `/var/www/htdocs/cgit` | CGI binary install directory | +| `CGIT_DATA_PATH` | `$(CGIT_SCRIPT_PATH)` | Static assets (CSS, JS, images) directory | +| `CGIT_CONFIG` | `/etc/cgitrc` | Default config file path (compiled in) | +| `CACHE_ROOT` | `/var/cache/cgit` | Default cache directory (compiled in) | +| `prefix` | `/usr/local` | Install prefix | +| `libdir` | `$(prefix)/lib` | Library directory | +| `filterdir` | `$(libdir)/cgit/filters` | Filter scripts install directory | +| `docdir` | `$(prefix)/share/doc/cgit` | Documentation directory | +| `mandir` | `$(prefix)/share/man` | Man page directory | +| `SHA1_HEADER` | `<openssl/sha.h>` | SHA-1 implementation header | +| `GIT_VER` | `2.46.0` | Git version to download and vendor | +| `GIT_URL` | `https://...git-$(GIT_VER).tar.xz` | Git source download URL | +| `NO_LUA` | (unset) | Set to any value to disable Lua support | +| `LUA_PKGCONFIG` | (auto-detected) | Explicit pkg-config name for Lua | +| `NO_C99_FORMAT` | (unset) | Define if your printf lacks `%zu`, `%lld` etc. | +| `HAVE_LINUX_SENDFILE` | (auto on Linux) | Enable `sendfile()` in cache | +| `V` | (unset) | Set to `1` for verbose build output | + +Overrides can be placed in a `cgit.conf` file (included by both `Makefile` and +`cgit.mk` via `-include cgit.conf`). 
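
Variables marked "compiled in" reach the C code as preprocessor string literals. The following sketch shows one plausible way such a constant can act as a default that the `$CGIT_CONFIG` environment variable overrides. `demo_config_path` is a hypothetical helper, and the `#ifndef` fallback exists only so the snippet compiles without the `-DCGIT_CONFIG=...` flag; neither is taken from cgit's source.

```c
#include <stdlib.h>

/* Fallback used when the build does not pass -DCGIT_CONFIG='"..."'. */
#ifndef CGIT_CONFIG
#define CGIT_CONFIG "/etc/cgitrc"
#endif

/*
 * Hypothetical helper: prefer $CGIT_CONFIG from the environment,
 * otherwise use the path compiled into the binary.
 */
static const char *demo_config_path(void)
{
	const char *from_env = getenv("CGIT_CONFIG");

	return from_env ? from_env : CGIT_CONFIG;
}
```

Changing `CGIT_CONFIG` in `cgit.conf` therefore changes the default baked into the binary, while the environment variable can still redirect a single invocation without a rebuild.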
+ +## Installation + +```bash +make install # Install binary and static assets +make install-doc # Install man pages, HTML docs, PDF docs +make install-man # Man pages only +make install-html # HTML docs only +make install-pdf # PDF docs only +``` + +### Installed files + +| Path | Mode | Source | +|------|------|--------| +| `$(CGIT_SCRIPT_PATH)/$(CGIT_SCRIPT_NAME)` | 0755 | `cgit` binary | +| `$(CGIT_DATA_PATH)/cgit.css` | 0644 | Default stylesheet | +| `$(CGIT_DATA_PATH)/cgit.js` | 0644 | Client-side JavaScript | +| `$(CGIT_DATA_PATH)/cgit.png` | 0644 | Default logo | +| `$(CGIT_DATA_PATH)/favicon.ico` | 0644 | Default favicon | +| `$(CGIT_DATA_PATH)/robots.txt` | 0644 | Robots exclusion file | +| `$(filterdir)/*` | (varies) | Filter scripts from `filters/` | +| `$(mandir)/man5/cgitrc.5` | 0644 | Man page (if `install-man`) | + +## Make Targets + +| Target | Description | +|--------|-------------| +| `all` | Build the cgit binary (default) | +| `cgit` | Explicit build target | +| `test` | Build everything (`all` target on git) then run `tests/` | +| `install` | Install binary, CSS, JS, images, filters | +| `install-doc` | Install man pages + HTML + PDF | +| `install-man` | Man pages only | +| `install-html` | HTML docs only | +| `install-pdf` | PDF docs only | +| `clean` | Remove cgit objects, VERSION, CGIT-CFLAGS, tags | +| `cleanall` | `clean` + `make -C git clean` | +| `clean-doc` | Remove generated doc files | +| `get-git` | Download and extract Git source into `git/` | +| `tags` | Generate ctags for all `*.[ch]` files | +| `sparse` | Run `sparse` static analysis via cgit.mk | +| `uninstall` | Remove installed binary and assets | +| `uninstall-doc` | Remove installed documentation | + +## Documentation Generation + +Man pages are generated from `cgitrc.5.txt` using `asciidoc`/`a2x`: + +```makefile +MAN5_TXT = $(wildcard *.5.txt) +DOC_MAN5 = $(patsubst %.txt,%,$(MAN5_TXT)) +DOC_HTML = $(patsubst %.txt,%.html,$(MAN_TXT)) +DOC_PDF = $(patsubst 
%.txt,%.pdf,$(MAN_TXT)) + +%.5 : %.5.txt + a2x -f manpage $< + +$(DOC_HTML): %.html : %.txt + $(TXT_TO_HTML) -o $@+ $< && mv $@+ $@ + +$(DOC_PDF): %.pdf : %.txt + a2x -f pdf cgitrc.5.txt +``` + +## Cross-Compilation + +For cross-compiling (e.g. targeting MinGW on Linux): + +```bash +make CC=x86_64-w64-mingw32-gcc +``` + +The `toolchain-mingw32.cmake` file in the repository is for CMake-based +projects; cgit itself uses Make exclusively. + +## Customizing the Build + +Create a `cgit.conf` file alongside the Makefile: + +```makefile +# cgit.conf — local build overrides +CGIT_VERSION = 1.0.0-custom +CGIT_CONFIG = /usr/local/etc/cgitrc +CACHE_ROOT = /tmp/cgit-cache +NO_LUA = 1 +``` + +This file is `-include`d by both `Makefile` and `cgit.mk`, so it applies to +all build stages. + +## Troubleshooting + +| Problem | Solution | +|---------|----------| +| `make: *** No rule to make target 'git/Makefile'` | Run `make get-git` first | +| `lua.h: No such file or directory` | Install Lua dev package or set `NO_LUA=1` | +| `openssl/sha.h: No such file or directory` | Install `libssl-dev` / `openssl-devel` | +| `sendfile: undefined reference` | Set `HAVE_LINUX_SENDFILE=` (empty) on non-Linux | +| Build fails with `redefinition of 'struct cache_slot'` | Git's `cache.h` conflict — cgit uses `CGIT_CACHE_H` guard | +| `dlsym: symbol not found: write` | Lua filter's `write()` interposition requires `-ldl` (auto on Linux) | +| Version shows as `unknown` | Run `./gen-version.sh "$(CGIT_VERSION)"` or check `VERSION` file | diff --git a/docs/handbook/cgit/caching-system.md b/docs/handbook/cgit/caching-system.md new file mode 100644 index 0000000000..5d3b723ed5 --- /dev/null +++ b/docs/handbook/cgit/caching-system.md @@ -0,0 +1,287 @@ +# cgit — Caching System + +## Overview + +cgit implements a file-based output cache that stores the fully rendered +HTML/binary response for each unique request. The cache avoids regenerating +pages for repeated identical requests. 
When caching is disabled +(`cache-size=0`, the default), all output is written directly to `stdout`. + +Source files: `cache.c`, `cache.h`. + +## Cache Slot Structure + +Each cached response is represented by a `cache_slot`: + +```c +struct cache_slot { + const char *key; /* request identifier (URL-based) */ + int keylen; /* strlen(key) */ + int ttl; /* time-to-live in minutes */ + cache_fill_fn fn; /* callback to regenerate content */ + int cache_fd; /* fd for the cache file */ + int lock_fd; /* fd for the .lock file */ + const char *cache_name;/* path: cache_root/hash(key) */ + const char *lock_name; /* path: cache_name + ".lock" */ + int match; /* 1 if cache file matches key */ + struct stat cache_st; /* stat of the cache file */ + int bufsize; /* size of the header buffer */ + char buf[1024 + 4 * 20]; /* header: key + timestamps */ +}; +``` + +The `cache_fill_fn` typedef: + +```c +typedef void (*cache_fill_fn)(void *cbdata); +``` + +This callback is invoked to produce the page content when the cache needs +filling. The callback writes directly to `stdout`, which is redirected to the +lock file while cache filling is in progress. + +## Hash Function + +Cache file names are derived from the request key using the FNV-1 hash: + +```c +unsigned long hash_str(const char *str) +{ + unsigned long h = 0x811c9dc5; + unsigned char *s = (unsigned char *)str; + while (*s) { + h *= 0x01000193; + h ^= (unsigned long)*s++; + } + return h; +} +``` + +The resulting hash is formatted as `%lx` and joined with the configured +`cache-root` directory to produce the cache file path. The lock file is +the same path with `.lock` appended. + +## Slot Lifecycle + +A cache request goes through these phases, managed by `process_slot()`: + +### 1. Open (`open_slot`) + +Opens the cache file and reads the header. The header contains the original +key followed by creation and expiry timestamps. If the stored key matches the +current request key, `slot->match` is set to 1. 
+ +```c +static int open_slot(struct cache_slot *slot) +{ + slot->cache_fd = open(slot->cache_name, O_RDONLY); + if (slot->cache_fd == -1) + return errno; + if (fstat(slot->cache_fd, &slot->cache_st)) + return errno; + /* read header into slot->buf */ + return 0; +} +``` + +### 2. Check Match + +If the file exists and the key matches, the code checks whether the entry +has expired based on the TTL: + +```c +static int is_expired(struct cache_slot *slot) +{ + if (slot->ttl < 0) + return 0; /* negative TTL = never expires */ + return slot->cache_st.st_mtime + slot->ttl * 60 < time(NULL); +} +``` + +A TTL of `-1` means the entry never expires (used for `cache-static-ttl`). + +### 3. Lock (`lock_slot`) + +Creates the `.lock` file with `O_WRONLY | O_CREAT | O_EXCL` and writes the +header containing the key and timestamps. If locking fails (another process +holds the lock), the stale cached content is served instead. + +```c +static int lock_slot(struct cache_slot *slot) +{ + slot->lock_fd = open(slot->lock_name, + O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR); + if (slot->lock_fd == -1) + return errno; + /* write header: key + creation timestamp */ + return 0; +} +``` + +### 4. Fill (`fill_slot`) + +Redirects `stdout` to the lock file using `dup2()`, invokes the +`cache_fill_fn` callback to generate the page content, then restores `stdout`: + +```c +static int fill_slot(struct cache_slot *slot) +{ + /* save original stdout */ + /* dup2(slot->lock_fd, STDOUT_FILENO) */ + slot->fn(slot->cbdata); + /* restore original stdout */ + return 0; +} +``` + +### 5. Close and Rename + +After filling, the lock file is atomically renamed to the cache file: + +```c +if (rename(slot->lock_name, slot->cache_name)) + return errno; +``` + +This ensures readers never see a partially-written file. + +### 6. Print (`print_slot`) + +The cache file content (minus the header) is sent to `stdout`. 
On Linux, +`sendfile()` is used for zero-copy output: + +```c +static int print_slot(struct cache_slot *slot) +{ +#ifdef HAVE_LINUX_SENDFILE + off_t start = slot->keylen + 1; /* skip header */ + sendfile(STDOUT_FILENO, slot->cache_fd, &start, + slot->cache_st.st_size - start); +#else + /* fallback: read()/write() loop */ +#endif +} +``` + +## Process Slot State Machine + +`process_slot()` implements a state machine combining all phases: + +``` +START → open_slot() + ├── success + key match + not expired → print_slot() → DONE + ├── success + key match + expired → lock_slot() + │ ├── lock acquired → fill_slot() → close_slot() → open_slot() → print_slot() + │ └── lock failed → print_slot() (serve stale) + ├── success + key mismatch → lock_slot() + │ ├── lock acquired → fill_slot() → close_slot() → open_slot() → print_slot() + │ └── lock failed → fill_slot() (direct to stdout) + └── open failed → lock_slot() + ├── lock acquired → fill_slot() → close_slot() → open_slot() → print_slot() + └── lock failed → fill_slot() (direct to stdout, no cache) +``` + +## Public API + +```c +/* Process a request through the cache */ +extern int cache_process(int size, const char *path, const char *key, + int ttl, cache_fill_fn fn, void *cbdata); + +/* List all cache entries (for debugging/administration) */ +extern int cache_ls(const char *path); + +/* Hash a string using FNV-1 */ +extern unsigned long hash_str(const char *str); +``` + +### `cache_process()` + +Parameters: +- `size` — Maximum number of cache entries (from `cache-size`). If `0`, + caching is bypassed and `fn` is called directly. +- `path` — Cache root directory. +- `key` — Request identifier (derived from full URL + query string). +- `ttl` — Time-to-live in minutes. +- `fn` — Callback function that generates the page content. +- `cbdata` — Opaque data passed to the callback. + +### `cache_ls()` + +Scans the cache root directory and prints information about each cache entry +to `stdout`. 
Used for administrative inspection. + +## TTL Configuration Mapping + +Different page types have different TTLs: + +| Page Type | Config Directive | Default | Applied When | +|-----------|-----------------|---------|--------------| +| Repository list | `cache-root-ttl` | 5 min | `cmd->want_repo == 0` | +| Repo pages | `cache-repo-ttl` | 5 min | `cmd->want_repo == 1` and dynamic | +| Dynamic pages | `cache-dynamic-ttl` | 5 min | `cmd->want_vpath == 1` | +| Static content | `cache-static-ttl` | -1 (never) | SHA-referenced content | +| About pages | `cache-about-ttl` | 15 min | About/readme view | +| Snapshots | `cache-snapshot-ttl` | 5 min | Snapshot downloads | +| Scan results | `cache-scanrc-ttl` | 15 min | scan-path results | + +Static content uses a TTL of `-1` because SHA-addressed content is +immutable — a given commit/tree/blob hash always refers to the same data. + +## Cache Key Generation + +The cache key is built from the complete query context in `cgit.c`: + +```c +static const char *cache_key(void) +{ + return fmt("%s?%s?%s?%s?%s", + ctx.qry.raw, ctx.env.http_host, + ctx.env.https ? "1" : "0", + ctx.env.authenticated ? "1" : "0", + ctx.env.http_cookie ? ctx.env.http_cookie : ""); +} +``` + +The key captures: raw query string, hostname, HTTPS state, authentication +state, and cookies. This ensures that authenticated users get different +cache entries than unauthenticated users. + +## Concurrency + +The cache supports concurrent access from multiple CGI processes: + +1. **Atomic writes**: Content is written to a `.lock` file first, then + atomically renamed to the cache file. Readers never see partial content. +2. **Non-blocking locks**: If a lock is already held, the process either + serves stale cached content (if available) or generates content directly + to stdout without caching. +3. **No deadlocks**: Lock files are `O_EXCL`, not `flock()`. If a process + crashes while holding a lock, the stale `.lock` file remains. 
It is + typically cleaned up by the next successful writer. + +## Cache Directory Management + +The cache root directory (`cache-root`, default `/var/cache/cgit`) must be +writable by the web server user. Cache files are created with mode `0600` +(`S_IRUSR | S_IWUSR`). + +There is no built-in cache eviction. Old cache files persist until a new +request with the same hash replaces them. Administrators should set up +periodic cleanup (e.g., a cron job) to purge expired files: + +```bash +find /var/cache/cgit -type f -mmin +60 -delete +``` + +## Disabling the Cache + +Set `cache-size=0` (the default). When `size` is 0, `cache_process()` calls +the fill function directly, writing to stdout with no file I/O overhead: + +```c +if (!size) { + fn(cbdata); + return 0; +} +``` diff --git a/docs/handbook/cgit/code-style.md b/docs/handbook/cgit/code-style.md new file mode 100644 index 0000000000..d4059391dc --- /dev/null +++ b/docs/handbook/cgit/code-style.md @@ -0,0 +1,356 @@ +# cgit — Code Style and Conventions + +## Overview + +cgit follows C99 conventions with a style influenced by the Linux kernel and +Git project coding standards. This document describes the patterns, naming +conventions, and idioms used throughout the codebase. + +## Language Standard + +cgit is written in C99, compiled with: + +```makefile +CGIT_CFLAGS += -std=c99 +``` + +No C11 or GNU extensions are required, though some platform-specific features +(like `sendfile()` on Linux) are conditionally compiled. + +## Formatting + +### Indentation + +- Tabs for indentation (1 tab = 8 spaces display width, consistent with + Linux kernel/Git style) +- No spaces for indentation alignment + +### Braces + +K&R style (opening brace on same line): + +```c +if (condition) { + /* body */ +} else { + /* body */ +} + +static void function_name(int arg) +{ + /* function body */ +} +``` + +Functions place the opening brace on its own line. Control structures +(`if`, `for`, `while`, `switch`) keep it on the same line. 
+ +### Line Length + +No strict limit, but lines generally stay under 80 characters. Long function +calls are broken across lines. + +## Naming Conventions + +### Functions + +Public API functions use the `cgit_` prefix: + +```c +void cgit_print_commit(const char *rev, const char *prefix); +void cgit_print_diff(const char *new_rev, const char *old_rev, ...); +struct cgit_repo *cgit_add_repo(const char *url); +struct cgit_repo *cgit_get_repoinfo(const char *url); +int cgit_parse_snapshots_mask(const char *str); +``` + +Static (file-local) functions use descriptive names without prefix: + +```c +static void config_cb(const char *name, const char *value); +static void querystring_cb(const char *name, const char *value); +static void process_request(void); +static int open_slot(struct cache_slot *slot); +``` + +### Types + +Struct types use `cgit_` prefix with snake_case: + +```c +struct cgit_context; +struct cgit_repo; +struct cgit_config; +struct cgit_query; +struct cgit_page; +struct cgit_environment; +struct cgit_cmd; +struct cgit_filter; +struct cgit_snapshot_format; +``` + +### Macros and Constants + +Uppercase with underscores: + +```c +#define ABOUT_FILTER 0 +#define COMMIT_FILTER 1 +#define SOURCE_FILTER 2 +#define EMAIL_FILTER 3 +#define AUTH_FILTER 4 +#define DIFF_UNIFIED 0 +#define DIFF_SSDIFF 1 +#define DIFF_STATONLY 2 +#define FMT_BUFS 8 +#define FMT_SIZE 8192 +``` + +### Variables + +Global variables use descriptive names: + +```c +struct cgit_context ctx; +struct cgit_repolist cgit_repolist; +const char *cgit_version; +``` + +## File Organization + +### Header Files + +Each module has a corresponding header file with include guards: + +```c +#ifndef UI_DIFF_H +#define UI_DIFF_H + +extern void cgit_print_diff(const char *new_rev, const char *old_rev, + const char *prefix, int show_ctrls, int raw); +extern void cgit_print_diffstat(const struct object_id *old, + const struct object_id *new, + const char *prefix); + +#endif /* UI_DIFF_H */ +``` + +### 
Source Files + +Typical source file structure: + +1. License header (if present) +2. Include directives +3. Static (file-local) variables +4. Static helper functions +5. Public API functions + +### Module Pattern + +UI modules follow a consistent pattern with `ui-*.c` / `ui-*.h` pairs: + +```c +/* ui-example.c */ +#include "cgit.h" +#include "ui-example.h" +#include "html.h" +#include "ui-shared.h" + +static void helper_function(void) +{ + /* ... */ +} + +void cgit_print_example(void) +{ + /* main entry point */ +} +``` + +## Common Patterns + +### Global Context + +cgit uses a single global `struct cgit_context ctx` variable that holds all +request state. Functions access it directly rather than passing it as a +parameter: + +```c +/* Access global context directly */ +if (ctx.repo && ctx.repo->enable_blame) + cgit_print_blame(); + +/* Not: cgit_print_blame(&ctx) */ +``` + +### Callback Functions + +Configuration and query parsing use callback function pointers: + +```c +typedef void (*configfile_value_fn)(const char *name, const char *value); +typedef void (*filepair_fn)(struct diff_filepair *pair); +typedef void (*linediff_fn)(char *line, int len); +typedef void (*cache_fill_fn)(void *cbdata); +``` + +### String Formatting + +The `fmt()` ring buffer is used for temporary string construction: + +```c +const char *url = fmt("%s/%s/", ctx.cfg.virtual_root, repo->url); +html_attr(url); +``` + +Never store `fmt()` results long-term — use `fmtalloc()` or `xstrdup()`. + +### NULL Checks + +Functions generally check for NULL pointers at the start: + +```c +void cgit_print_blob(const char *hex, const char *path, + const char *head, int file_only) +{ + if (!hex && !path) { + cgit_print_error_page(400, "Bad request", + "Need either hex or path"); + return; + } + /* ... 
*/ +} +``` + +### Memory Management + +cgit uses Git's `xmalloc` / `xstrdup` / `xrealloc` wrappers that die on +allocation failure: + +```c +char *name = xstrdup(value); +repo = xrealloc(repo, new_size); +``` + +No explicit `free()` calls in most paths: the CGI process exits after each +request, and the OS reclaims all memory. + +### Boolean as Int + +Boolean values are represented as `int` (0 or 1), following the pre-C99 +convention that predates `_Bool`: + +```c +int enable_blame; +int enable_commit_graph; +int binary; +int match; +``` + +### Typedef Avoidance + +Structs are generally not typedef'd; they use the `struct` keyword +explicitly: + +```c +struct cgit_repo *repo; +struct cache_slot slot; +``` + +Exception: function pointer typedefs are used for callbacks: + +```c +typedef void (*configfile_value_fn)(const char *name, const char *value); +``` + +## Error Handling + +### `die()` for Fatal Errors + +Unrecoverable errors use Git's `die()`: + +```c +if (!ctx.repo) + die("no repository"); +``` + +### Error Pages for User Errors + +User-facing errors use the error page function: + +```c +cgit_print_error_page(404, "Not Found", + "No repository found for '%s'", + ctx.qry.repo); +``` + +### Return Codes + +Functions that can fail return int (0 = success, non-zero = error): + +```c +static int open_slot(struct cache_slot *slot) +{ + slot->cache_fd = open(slot->cache_name, O_RDONLY); + if (slot->cache_fd == -1) + return errno; + return 0; +} +``` + +## Preprocessor Usage + +Conditional compilation for platform features: + +```c +#ifdef HAVE_LINUX_SENDFILE + sendfile(STDOUT_FILENO, slot->cache_fd, &off, size); +#else + /* read/write fallback */ +#endif + +#ifdef HAVE_LUA + /* Lua filter support */ +#endif +``` + +## Git Library Integration + +cgit includes Git as a library.
It uses Git's internal APIs directly: + +```c +#include "git/cache.h" +#include "git/object.h" +#include "git/commit.h" +#include "git/diff.h" +#include "git/revision.h" +#include "git/archive.h" +``` + +Functions from Git's library are called without wrapper layers: + +```c +struct commit *commit = lookup_commit_reference(&oid); +struct tree *tree = parse_tree_indirect(&oid); +init_revisions(&rev, NULL); +``` + +## Documentation + +- Code comments are used sparingly, mainly for non-obvious logic +- No Doxygen or similar documentation generators are used +- Function documentation is in the header files as prototypes with + descriptive parameter names +- The `cgitrc.5.txt` file provides user-facing documentation in + man page format + +## Commit Messages + +Commit messages follow the standard Git format: + +``` +subject: brief description (50 chars or less) + +Extended description wrapping at 72 characters. Explain what and why, +not how. +``` diff --git a/docs/handbook/cgit/configuration.md b/docs/handbook/cgit/configuration.md new file mode 100644 index 0000000000..afc29fce07 --- /dev/null +++ b/docs/handbook/cgit/configuration.md @@ -0,0 +1,351 @@ +# cgit — Configuration Reference + +## Configuration File + +Default location: `/etc/cgitrc` (compiled in as `CGIT_CONFIG`). Override at +runtime by setting the `$CGIT_CONFIG` environment variable. + +## File Format + +The configuration file uses a simple `name=value` format, parsed by +`parse_configfile()` in `configfile.c`. Key rules: + +- Lines starting with `#` or `;` are comments +- Leading whitespace on lines is skipped +- No quoting mechanism — the value is everything after the `=` to end of line +- Empty lines are ignored +- Nesting depth for `include=` directives is limited to 8 levels + +```c +int parse_configfile(const char *filename, configfile_value_fn fn) +{ + static int nesting; + /* ... */ + if (nesting > 8) + return -1; + /* ... 
*/ + while (read_config_line(f, &name, &value)) + fn(name.buf, value.buf); + /* ... */ +} +``` + +## Global Directives + +All global directives are processed by `config_cb()` in `cgit.c`. When a +directive is encountered, the value is stored in the corresponding +`ctx.cfg.*` field. + +### Site Identity + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `root-title` | `"Git repository browser"` | string | HTML page title for the index page | +| `root-desc` | `"a fast webinterface for the git dscm"` | string | Subtitle text on the index page | +| `root-readme` | (none) | path | Path to a file rendered on the site about page | +| `root-coc` | (none) | path | Path to Code of Conduct file | +| `root-cla` | (none) | path | Path to Contributor License Agreement file | +| `root-homepage` | (none) | URL | External homepage URL | +| `root-homepage-title` | (none) | string | Title text for the homepage link | +| `root-link` | (none) | string | `label\|url` pairs for navigation links (can repeat) | +| `logo` | `"/cgit.png"` | URL | Path to the site logo image | +| `logo-link` | (none) | URL | URL the logo links to | +| `favicon` | `"/favicon.ico"` | URL | Path to the favicon | +| `css` | (none) | URL | Stylesheet URL (can repeat for multiple stylesheets) | +| `js` | (none) | URL | JavaScript URL (can repeat) | +| `header` | (none) | path | File included at the top of every page | +| `footer` | (none) | path | File included at the bottom of every page | +| `head-include` | (none) | path | File included in the HTML `<head>` | +| `robots` | `"index, nofollow"` | string | Content for `<meta name="robots">` | + +### URL Configuration + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `virtual-root` | (none) | path | Base URL path when using URL rewriting (always ends with `/`) | +| `script-name` | `CGIT_SCRIPT_NAME` | path | CGI script name (from `$SCRIPT_NAME` env var) | +| `clone-prefix` | 
(none) | string | Prefix for clone URLs when auto-generating | +| `clone-url` | (none) | string | Clone URL template (`$CGIT_REPO_URL` expanded) | + +When `virtual-root` is set, URLs use path-based routing: +`/cgit/repo/log/path`. Without it, query-string routing is used: +`?url=repo/log/path`. + +### Feature Flags + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `enable-http-clone` | `1` | int | Allow HTTP clone operations (HEAD, info/refs, objects/) | +| `enable-index-links` | `0` | int | Show log/tree/commit links on the repo index page | +| `enable-index-owner` | `1` | int | Show the Owner column on the repo index page | +| `enable-blame` | `0` | int | Enable blame view for all repos | +| `enable-commit-graph` | `0` | int | Show ASCII commit graph in log view | +| `enable-log-filecount` | `0` | int | Show changed-file count in log view | +| `enable-log-linecount` | `0` | int | Show added/removed line counts in log | +| `enable-remote-branches` | `0` | int | Display remote tracking branches | +| `enable-subject-links` | `0` | int | Show parent commit subjects instead of hashes | +| `enable-html-serving` | `0` | int | Serve HTML files as-is from plain view | +| `enable-subtree` | `0` | int | Detect and display git-subtree directories | +| `enable-tree-linenumbers` | `1` | int | Show line numbers in file/blob view | +| `enable-git-config` | `0` | int | Read `gitweb.*` and `cgit.*` from repo's git config | +| `enable-filter-overrides` | `0` | int | Allow repos to override global filters | +| `enable-follow-links` | `0` | int | Show "follow" links in log view for renames | +| `embedded` | `0` | int | Omit HTML boilerplate for embedding in another page | +| `noheader` | `0` | int | Suppress the page header | +| `noplainemail` | `0` | int | Hide email addresses in output | +| `local-time` | `0` | int | Display times in local timezone instead of UTC | + +### Limits + +| Directive | Default | Type | Description | 
+|-----------|---------|------|-------------| +| `max-repo-count` | `50` | int | Repos per page on the index (≤0 → unlimited) | +| `max-commit-count` | `50` | int | Commits per page in log view | +| `max-message-length` | `80` | int | Truncate commit subject at this length | +| `max-repodesc-length` | `80` | int | Truncate repo description at this length | +| `max-blob-size` | `0` | int (KB) | Max blob size to display (0 = unlimited) | +| `max-stats` | `0` | int | Stats period (0=disabled, 1=week, 2=month, 3=quarter, 4=year) | +| `max-atom-items` | `10` | int | Number of entries in Atom feeds | +| `max-subtree-commits` | `2000` | int | Max commits to scan for subtree trailers | +| `renamelimit` | `-1` | int | Diff rename detection limit (-1 = Git default) | + +### Caching + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `cache-size` | `0` | int | Number of cache entries (0 = disabled) | +| `cache-root` | `CGIT_CACHE_ROOT` | path | Directory for cache files | +| `cache-root-ttl` | `5` | int (min) | TTL for repo-list pages | +| `cache-repo-ttl` | `5` | int (min) | TTL for repo-specific pages | +| `cache-dynamic-ttl` | `5` | int (min) | TTL for dynamic content | +| `cache-static-ttl` | `-1` | int (min) | TTL for static content (-1 = forever) | +| `cache-about-ttl` | `15` | int (min) | TTL for about/readme pages | +| `cache-snapshot-ttl` | `5` | int (min) | TTL for snapshot pages | +| `cache-scanrc-ttl` | `15` | int (min) | TTL for cached scan-path results | + +### Sorting + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `case-sensitive-sort` | `1` | int | Case-sensitive repo name sorting | +| `section-sort` | `1` | int | Sort sections alphabetically | +| `section-from-path` | `0` | int | Derive section name from path depth (>0 = from start, <0 = from end) | +| `repository-sort` | `"name"` | string | Default sort field for repo list | +| `branch-sort` | `0` | int | Branch 
sort: 0=name, 1=age | +| `commit-sort` | `0` | int | Commit sort: 0=default, 1=date, 2=topo | + +### Snapshots + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `snapshots` | (none) | string | Space-separated list of enabled formats: `.tar` `.tar.gz` `.tar.bz2` `.tar.lz` `.tar.xz` `.tar.zst` `.zip`. Also accepts `all`. | + +### Filters + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `about-filter` | (none) | filter | Filter for rendering README/about content | +| `source-filter` | (none) | filter | Filter for syntax highlighting source code | +| `commit-filter` | (none) | filter | Filter for commit messages | +| `email-filter` | (none) | filter | Filter for email display (2 args: email, page) | +| `owner-filter` | (none) | filter | Filter for owner display | +| `auth-filter` | (none) | filter | Authentication filter (12 args) | + +Filter values use the format `type:command`: +- `exec:/path/to/script` — external process filter +- `lua:/path/to/script.lua` — Lua script filter +- Plain path without prefix defaults to `exec` + +### Display + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `summary-branches` | `10` | int | Branches shown on summary page | +| `summary-tags` | `10` | int | Tags shown on summary page | +| `summary-log` | `10` | int | Log entries shown on summary page | +| `side-by-side-diffs` | `0` | int | Default to side-by-side diff view | +| `remove-suffix` | `0` | int | Remove `.git` suffix from repo URLs | +| `scan-hidden-path` | `0` | int | Include hidden dirs when scanning | + +### Miscellaneous + +| Directive | Default | Type | Description | +|-----------|---------|------|-------------| +| `agefile` | `"info/web/last-modified"` | path | File in repo checked for modification time | +| `mimetype-file` | (none) | path | Apache-style mime.types file | +| `mimetype.<ext>` | (none) | string | MIME type for a file 
extension | +| `module-link` | (none) | URL | URL template for submodule links | +| `strict-export` | (none) | path | Only export repos containing this file | +| `project-list` | (none) | path | File listing project directories for `scan-path` | +| `scan-path` | (none) | path | Directory to scan for git repositories | +| `readme` | (none) | string | Default README file spec (can repeat) | +| `include` | (none) | path | Include another config file | + +## Repository Directives + +Repository configuration begins with `repo.url=` which creates a new +repository entry via `cgit_add_repo()`. Subsequent `repo.*` directives +modify the most recently created repository via `repo_config()` in `cgit.c`. + +| Directive | Description | +|-----------|-------------| +| `repo.url` | Repository URL path (triggers new repo creation) | +| `repo.path` | Filesystem path to the git repository | +| `repo.name` | Display name | +| `repo.basename` | Override for basename derivation | +| `repo.desc` | Repository description | +| `repo.owner` | Repository owner name | +| `repo.homepage` | Project homepage URL | +| `repo.defbranch` | Default branch name | +| `repo.section` | Section heading for grouped display | +| `repo.clone-url` | Clone URL (overrides global) | +| `repo.readme` | README file spec (`[ref:]path`, can repeat) | +| `repo.logo` | Per-repo logo URL | +| `repo.logo-link` | Per-repo logo link URL | +| `repo.extra-head-content` | Extra HTML for `<head>` | +| `repo.snapshots` | Snapshot format mask (space-separated suffixes) | +| `repo.snapshot-prefix` | Prefix for snapshot filenames | +| `repo.enable-blame` | Override global enable-blame | +| `repo.enable-commit-graph` | Override global enable-commit-graph | +| `repo.enable-log-filecount` | Override global enable-log-filecount | +| `repo.enable-log-linecount` | Override global enable-log-linecount | +| `repo.enable-remote-branches` | Override global enable-remote-branches | +| `repo.enable-subject-links` | Override global 
enable-subject-links | +| `repo.enable-html-serving` | Override global enable-html-serving | +| `repo.enable-subtree` | Override global enable-subtree | +| `repo.max-stats` | Override global max-stats | +| `repo.max-subtree-commits` | Override global max-subtree-commits | +| `repo.branch-sort` | `"age"` or `"name"` | +| `repo.commit-sort` | `"date"` or `"topo"` | +| `repo.module-link` | Submodule URL template | +| `repo.module-link.<submodule>` | Per-submodule URL | +| `repo.badge` | Badge entry: `url\|imgurl` or just `imgurl` (can repeat) | +| `repo.hide` | `1` = hide from listing (still accessible by URL) | +| `repo.ignore` | `1` = completely ignore this repository | + +### Filter overrides (require `enable-filter-overrides=1`) + +| Directive | Description | +|-----------|-------------| +| `repo.about-filter` | Per-repo about filter | +| `repo.commit-filter` | Per-repo commit filter | +| `repo.source-filter` | Per-repo source filter | +| `repo.email-filter` | Per-repo email filter | +| `repo.owner-filter` | Per-repo owner filter | + +## Repository Defaults + +When a new repository is created by `cgit_add_repo()`, it inherits all global +defaults from `ctx.cfg`: + +```c +ret->section = ctx.cfg.section; +ret->snapshots = ctx.cfg.snapshots; +ret->enable_blame = ctx.cfg.enable_blame; +ret->enable_commit_graph = ctx.cfg.enable_commit_graph; +ret->enable_log_filecount = ctx.cfg.enable_log_filecount; +ret->enable_log_linecount = ctx.cfg.enable_log_linecount; +ret->enable_remote_branches = ctx.cfg.enable_remote_branches; +ret->enable_subject_links = ctx.cfg.enable_subject_links; +ret->enable_html_serving = ctx.cfg.enable_html_serving; +ret->enable_subtree = ctx.cfg.enable_subtree; +ret->max_stats = ctx.cfg.max_stats; +ret->max_subtree_commits = ctx.cfg.max_subtree_commits; +ret->branch_sort = ctx.cfg.branch_sort; +ret->commit_sort = ctx.cfg.commit_sort; +ret->module_link = ctx.cfg.module_link; +ret->readme = ctx.cfg.readme; +ret->about_filter = ctx.cfg.about_filter; 
+ret->commit_filter = ctx.cfg.commit_filter; +ret->source_filter = ctx.cfg.source_filter; +ret->email_filter = ctx.cfg.email_filter; +ret->owner_filter = ctx.cfg.owner_filter; +ret->clone_url = ctx.cfg.clone_url; +``` + +This means global directives should appear *before* `repo.url=` entries, since +they set the defaults for subsequently defined repositories. + +## Git Config Integration + +When `enable-git-config=1`, the `scan-tree` scanner reads each repository's +`.git/config` and maps gitweb-compatible directives: + +```c +if (!strcmp(key, "gitweb.owner")) + config_fn(repo, "owner", value); +else if (!strcmp(key, "gitweb.description")) + config_fn(repo, "desc", value); +else if (!strcmp(key, "gitweb.category")) + config_fn(repo, "section", value); +else if (!strcmp(key, "gitweb.homepage")) + config_fn(repo, "homepage", value); +else if (skip_prefix(key, "cgit.", &name)) + config_fn(repo, name, value); +``` + +Any `cgit.*` key in the git config is passed directly to the repo config +handler, allowing per-repo settings without modifying the global cgitrc. + +## README File Spec Format + +README directives support four forms: + +| Format | Meaning | +|--------|---------| +| `path` | File on disk, relative to repo path | +| `/absolute/path` | File on disk, absolute | +| `ref:path` | File tracked in the git repository at the given ref | +| `:path` | File tracked in the default branch or query head | + +Multiple `readme` directives can be specified. cgit tries each in order and +uses the first one found (checked via `cgit_ref_path_exists()` for tracked +files, or `access(R_OK)` for disk files). + +## Macro Expansion + +The `expand_macros()` function (in `shared.c`) performs environment variable +substitution in certain directive values (`cache-root`, `scan-path`, +`project-list`, `include`). A `$VARNAME` or `${VARNAME}` in the value is +replaced with the corresponding environment variable.
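
The substitution described above can be sketched as follows. This is a minimal illustration, not cgit's actual `expand_macros()`: the helper name `expand_env`, the fixed 64-byte name buffer, and the choice to expand unset variables to an empty string are all assumptions made for the example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not cgit's expand_macros()): copy `in` to `out`,
 * replacing $NAME and ${NAME} with getenv("NAME"). Unset variables expand
 * to the empty string (an assumption). A '$' that does not start a valid
 * macro is copied literally. Returns 0 on success, -1 if `out` is too small.
 */
static int expand_env(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	if (outsz == 0)
		return -1;

	while (*in) {
		const char *p = in;
		const char *val;
		char name[64];
		size_t n = 0, len;
		int braced;

		if (*p == '$') {
			p++;
			braced = (*p == '{');
			if (braced)
				p++;
			/* collect [A-Za-z0-9_]+ as the variable name */
			while ((isalnum((unsigned char)*p) || *p == '_') &&
			       n < sizeof(name) - 1)
				name[n++] = *p++;
			name[n] = '\0';

			if (n > 0 && (!braced || *p == '}')) {
				if (braced)
					p++;	/* consume the closing '}' */
				val = getenv(name);
				if (!val)
					val = "";
				len = strlen(val);
				if (o + len >= outsz)
					return -1;
				memcpy(out + o, val, len);
				o += len;
				in = p;
				continue;
			}
			/* fall through: not a macro, copy '$' as-is */
		}

		if (o + 1 >= outsz)
			return -1;
		out[o++] = *in++;
	}
	out[o] = '\0';
	return 0;
}
```

With `CACHE_DIR=/var/cache` in the environment, an input of `$CACHE_DIR/cgit` would come out as `/var/cache/cgit`. Writing into a caller-supplied buffer keeps the sketch free of allocation; a real implementation may well use a growable string buffer instead.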
+ +## Example Configuration + +```ini +# Site settings +root-title=Project Tick Git +root-desc=Source code for Project Tick +logo=/cgit/cgit.png +css=/cgit/cgit.css +virtual-root=/cgit/ + +# Features +enable-commit-graph=1 +enable-blame=1 +enable-http-clone=1 +enable-index-links=1 +snapshots=tar.gz tar.xz zip +max-stats=quarter + +# Caching +cache-size=1000 +cache-root=/var/cache/cgit + +# Filters +source-filter=exec:/usr/lib/cgit/filters/syntax-highlighting.py +about-filter=exec:/usr/lib/cgit/filters/about-formatting.sh + +# Scanning (options that affect the scan must precede scan-path) +section-from-path=1 +scan-path=/srv/git/ + +# Or manual repo definitions: +repo.url=myproject +repo.path=/srv/git/myproject.git +repo.desc=My awesome project +repo.owner=Alice +repo.readme=master:README.md +repo.clone-url=https://git.example.com/myproject.git +repo.snapshots=tar.gz zip +repo.badge=https://ci.example.com/badge.svg|https://ci.example.com/ +``` diff --git a/docs/handbook/cgit/css-theming.md b/docs/handbook/cgit/css-theming.md new file mode 100644 index 0000000000..0a7b404595 --- /dev/null +++ b/docs/handbook/cgit/css-theming.md @@ -0,0 +1,522 @@ +# cgit — CSS Theming + +## Overview + +cgit ships with a comprehensive CSS stylesheet (`cgit.css`) that controls +the visual appearance of all pages. The stylesheet is designed with a light +color scheme and semantic CSS classes that map directly to cgit's HTML +structure. + +Source file: `cgit.css` (~450 lines). + +## Loading Stylesheets + +CSS files are specified via the `css` configuration directive: + +```ini +css=/cgit/cgit.css +``` + +Multiple stylesheets can be loaded by repeating the directive: + +```ini +css=/cgit/cgit.css +css=/cgit/custom.css +``` + +Stylesheets are included in document order in the `<head>` section via +`cgit_print_docstart()` in `ui-shared.c`.
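
The ordering guarantee can be pictured with a small sketch. This is not the code in `ui-shared.c`: `format_css_links` is a hypothetical helper, and the exact attributes cgit emits on each `<link>` element are an assumption. It only shows that each configured URL yields one tag, in the order the `css` directives appeared.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append one <link> element per stylesheet URL to
 * `out`, preserving the order of `urls`. Returns 0 on success, -1 if the
 * buffer is too small. Real code should escape the URL for use in an
 * attribute (cgit has html_attr() for that); escaping is omitted here.
 */
static int format_css_links(const char *const *urls, size_t count,
			    char *out, size_t outsz)
{
	size_t used = 0;
	size_t i;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	for (i = 0; i < count; i++) {
		int n = snprintf(out + used, outsz - used,
				 "<link rel='stylesheet' type='text/css' href='%s'/>\n",
				 urls[i]);
		/* snprintf reports the length it wanted; detect truncation */
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Because later stylesheets follow earlier ones in the `<head>`, rules in `custom.css` override equally specific rules in `cgit.css`, which is why a site theme is usually listed last.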
+ +## Page Structure + +The HTML layout uses this basic structure: + +```html +<body> + <div id='cgit'> + <table id='header'>...</table> <!-- site header with logo --> + <table id='navigation'>...</table> <!-- tab navigation --> + <div id='content'> <!-- page content --> + <!-- page-specific content --> + </div> + <div class='footer'>...</div> <!-- footer --> + </div> +</body> +``` + +## Base Styles + +### Body and Layout + +```css +body { + font-family: sans-serif; + font-size: 11px; + color: #000; + background: white; + padding: 4px; +} + +div#cgit { + padding: 0; + margin: 0; + font-family: monospace; + font-size: 12px; +} +``` + +### Header + +```css +table#header { + width: 100%; + margin-bottom: 1em; +} + +table#header td.logo { + /* logo cell */ +} + +table#header td.main { + font-size: 250%; + font-weight: bold; + vertical-align: bottom; + padding-left: 10px; +} + +table#header td.sub { + color: #999; + font-size: 75%; + vertical-align: top; + padding-left: 10px; +} +``` + +### Navigation Tabs + +```css +table#navigation { + width: 100%; +} + +table#navigation a { + padding: 2px 6px; + color: #000; + text-decoration: none; +} + +table#navigation a:hover { + color: #00f; +} +``` + +## Content Areas + +### Repository List + +```css +table.list { + border-collapse: collapse; + border: solid 1px #aaa; + width: 100%; +} + +table.list th { + text-align: left; + font-weight: bold; + background: #ddd; + border-bottom: solid 1px #aaa; + padding: 2px 4px; +} + +table.list td { + padding: 2px 4px; + border: none; +} + +table.list tr:hover { + background: #eee; +} + +table.list td a { + color: #00f; + text-decoration: none; +} + +table.list td a:hover { + text-decoration: underline; +} +``` + +### Sections + +```css +div.section-header { + background: #eee; + border: solid 1px #ddd; + padding: 2px 4px; + font-weight: bold; + margin-top: 1em; +} +``` + +## Diff Styles + +### Diffstat + +```css +table.diffstat { + border-collapse: collapse; + border: solid 1px #aaa; +} + 
+table.diffstat td { + padding: 1px 4px; + border: none; +} + +table.diffstat td.mode { + font-weight: bold; + /* status indicator: A/M/D/R */ +} + +table.diffstat td.graph { + width: 500px; +} + +table.diffstat td.graph span.add { + background: #5f5; + /* green bar for additions */ +} + +table.diffstat td.graph span.rem { + background: #f55; + /* red bar for deletions */ +} + +table.diffstat .total { + font-weight: bold; + text-align: center; +} +``` + +### Unified Diff + +```css +table.diff { + width: 100%; +} + +table.diff td div.head { + font-weight: bold; + margin-top: 1em; + color: #000; +} + +table.diff td div.hunk { + color: #009; + /* hunk header @@ ... @@ */ +} + +table.diff td div.add { + color: green; + background: #dfd; +} + +table.diff td div.del { + color: red; + background: #fdd; +} +``` + +### Side-by-Side Diff + +```css +table.ssdiff { + width: 100%; +} + +table.ssdiff td { + font-family: monospace; + font-size: 12px; + padding: 1px 4px; + vertical-align: top; +} + +table.ssdiff td.lineno { + text-align: right; + width: 3em; + background: #eee; + color: #999; +} + +table.ssdiff td.add { + background: #dfd; +} + +table.ssdiff td.del { + background: #fdd; +} + +table.ssdiff td.changed { + background: #ffc; +} + +table.ssdiff span.add { + background: #afa; + font-weight: bold; +} + +table.ssdiff span.del { + background: #faa; + font-weight: bold; +} +``` + +## Blob/Tree View + +```css +table.blob { + border-collapse: collapse; + width: 100%; +} + +table.blob td { + font-family: monospace; + font-size: 12px; + padding: 0 4px; + vertical-align: top; +} + +table.blob td.linenumbers { + text-align: right; + color: #999; + background: #eee; + width: 3em; + border-right: solid 1px #ddd; +} + +table.blob td.lines { + white-space: pre; +} +``` + +### Tree Listing + +```css +table.list td.ls-mode { + font-family: monospace; + width: 10em; +} + +table.list td.ls-size { + text-align: right; + width: 5em; +} +``` + +## Commit View + +```css +table.commit-info { 
+ border-collapse: collapse; + border: solid 1px #aaa; + margin-bottom: 1em; +} + +table.commit-info th { + text-align: left; + font-weight: bold; + padding: 2px 6px; + vertical-align: top; +} + +table.commit-info td { + padding: 2px 6px; +} + +div.commit-subject { + font-weight: bold; + font-size: 125%; + margin: 1em 0 0.5em; +} + +div.commit-msg { + white-space: pre; + font-family: monospace; +} + +div.notes-header { + font-weight: bold; + padding-top: 1em; +} + +div.notes { + white-space: pre; + font-family: monospace; + border-left: solid 3px #dd5; + padding: 0.5em; + background: #ffe; +} +``` + +## Log View + +```css +div.commit-graph { + font-family: monospace; + white-space: pre; + color: #000; +} + +/* Column colors for commit graph */ +.column1 { color: #a00; } +.column2 { color: #0a0; } +.column3 { color: #00a; } +.column4 { color: #aa0; } +.column5 { color: #0aa; } +.column6 { color: #a0a; } +``` + +## Stats View + +```css +table.stats { + border-collapse: collapse; + border: solid 1px #aaa; +} + +table.stats th { + text-align: left; + padding: 2px 6px; + background: #ddd; +} + +table.stats td { + padding: 2px 6px; +} + +div.stats-graph { + /* bar chart container */ +} +``` + +## Form Elements + +```css +div.cgit-panel { + float: right; + margin: 0 0 0.5em 0.5em; + padding: 4px; + border: solid 1px #aaa; + background: #eee; +} + +div.cgit-panel b { + display: block; + margin-bottom: 2px; +} + +div.cgit-panel select, +div.cgit-panel input { + font-size: 11px; +} +``` + +## Customization Strategies + +### Method 1: Override Stylesheet + +Create a custom CSS file that overrides specific rules: + +```css +/* /cgit/custom.css */ +body { + background: #1a1a2e; + color: #e0e0e0; +} + +div#cgit { + background: #16213e; +} + +table.list th { + background: #0f3460; + color: #e0e0e0; +} +``` + +```ini +css=/cgit/cgit.css +css=/cgit/custom.css +``` + +### Method 2: Replace Stylesheet + +Replace the default stylesheet entirely: + +```ini +css=/cgit/mytheme.css +``` + 
+### Method 3: head-include + +Inject inline styles via the `head-include` directive: + +```ini +head-include=/etc/cgit/extra-head.html +``` + +```html +<!-- /etc/cgit/extra-head.html --> +<style> + body { background: #f0f0f0; } +</style> +``` + +## CSS Classes Reference + +### Layout Classes + +| Class/ID | Element | Description | +|----------|---------|-------------| +| `#cgit` | div | Main container | +| `#header` | table | Site header | +| `#navigation` | table | Tab navigation | +| `#content` | div | Page content area | +| `.footer` | div | Page footer | + +### Content Classes + +| Class | Element | Description | +|-------|---------|-------------| +| `.list` | table | Data listing (repos, files, refs) | +| `.blob` | table | File content display | +| `.diff` | table | Unified diff | +| `.ssdiff` | table | Side-by-side diff | +| `.diffstat` | table | Diff statistics | +| `.commit-info` | table | Commit metadata | +| `.stats` | table | Statistics data | +| `.cgit-panel` | div | Control panel | + +### Diff Classes + +| Class | Element | Description | +|-------|---------|-------------| +| `.add` | div/span | Added lines/chars | +| `.del` | div/span | Deleted lines/chars | +| `.hunk` | div | Hunk header | +| `.ctx` | div | Context lines | +| `.head` | div | File header | +| `.changed` | td | Modified line (ssdiff) | +| `.lineno` | td | Line number column | + +### Status Classes + +| Class | Description | +|-------|-------------| +| `.upd` | Modified file | +| `.add` | Added file | +| `.del` | Deleted file | +| `.mode` | File mode indicator | +| `.graph` | Graph bar container | diff --git a/docs/handbook/cgit/deployment.md b/docs/handbook/cgit/deployment.md new file mode 100644 index 0000000000..8c991726af --- /dev/null +++ b/docs/handbook/cgit/deployment.md @@ -0,0 +1,369 @@ +# cgit — Deployment Guide + +## Overview + +cgit runs as a CGI application under a web server. This guide covers +compilation, installation, web server configuration, and production tuning. 
+ +## Prerequisites + +Build dependencies: +- GCC or Clang (C99 compiler) +- GNU Make +- OpenSSL or compatible TLS library (for libgit HTTPS) +- zlib (for git object decompression) +- Optional: Lua or LuaJIT (for Lua filters) +- Optional: pkg-config (for Lua detection) + +Runtime dependencies: +- A CGI-capable web server (Apache, Nginx+fcgiwrap, lighttpd) +- Git repositories on the filesystem + +## Building + +```bash +# Clone/download the source +cd cgit/ + +# Build with defaults +make + +# Or with custom settings +make prefix=/usr CGIT_SCRIPT_PATH=/var/www/cgi-bin \ + CGIT_CONFIG=/etc/cgitrc CACHE_ROOT=/var/cache/cgit + +# Install +make install +``` + +### Build Variables + +| Variable | Default | Description | +|----------|---------|-------------| +| `prefix` | `/usr/local` | Installation prefix | +| `CGIT_SCRIPT_PATH` | `$(prefix)/lib/cgit` | CGI binary directory | +| `CGIT_DATA_PATH` | `$(prefix)/share/cgit` | Static files (CSS, images) | +| `CGIT_CONFIG` | `/etc/cgitrc` | Default config file path | +| `CACHE_ROOT` | `/var/cache/cgit` | Default cache directory | +| `CGIT_SCRIPT_NAME` | `"/"` | Default CGI script name | +| `NO_LUA` | (unset) | Set to 1 to disable Lua | + +### Installed Files + +``` +$(CGIT_SCRIPT_PATH)/cgit.cgi # CGI binary +$(CGIT_DATA_PATH)/cgit.css # Stylesheet +$(CGIT_DATA_PATH)/cgit.js # JavaScript +$(CGIT_DATA_PATH)/cgit.png # Logo image +$(CGIT_DATA_PATH)/robots.txt # Robots exclusion file +``` + +## Apache Configuration + +### CGI Module + +```apache +# Enable CGI +LoadModule cgi_module modules/mod_cgi.so + +# Basic CGI setup +ScriptAlias /cgit/ /usr/lib/cgit/cgit.cgi/ +Alias /cgit-data/ /usr/share/cgit/ + +<Directory "/usr/lib/cgit/"> + AllowOverride None + Options +ExecCGI + Require all granted +</Directory> + +<Directory "/usr/share/cgit/"> + AllowOverride None + Require all granted +</Directory> +``` + +### URL Rewriting (Clean URLs) + +```apache +# Enable clean URLs via mod_rewrite +RewriteEngine On +RewriteRule ^/cgit/(.*)$ 
/usr/lib/cgit/cgit.cgi/$1 [PT] +``` + +With corresponding cgitrc: + +```ini +virtual-root=/cgit/ +css=/cgit-data/cgit.css +logo=/cgit-data/cgit.png +``` + +## Nginx Configuration + +Nginx does not support CGI natively. Use `fcgiwrap` or `spawn-fcgi`: + +### With fcgiwrap + +```bash +# Install fcgiwrap +# Start it (systemd, OpenRC, or manual) +fcgiwrap -s unix:/run/fcgiwrap.sock & +``` + +```nginx +server { + listen 80; + server_name git.example.com; + + root /usr/share/cgit; + + # Serve static files directly + location /cgit-data/ { + alias /usr/share/cgit/; + } + + # Pass CGI requests to fcgiwrap + location /cgit { + include fastcgi_params; + fastcgi_param SCRIPT_FILENAME /usr/lib/cgit/cgit.cgi; + fastcgi_param PATH_INFO $uri; + fastcgi_param QUERY_STRING $args; + fastcgi_param HTTP_HOST $server_name; + fastcgi_pass unix:/run/fcgiwrap.sock; + } +} +``` + +### With spawn-fcgi + +```bash +spawn-fcgi -s /run/cgit.sock -n -- /usr/bin/fcgiwrap +``` + +## lighttpd Configuration + +```lighttpd +server.modules += ("mod_cgi", "mod_alias", "mod_rewrite") + +alias.url = ( + "/cgit-data/" => "/usr/share/cgit/", + "/cgit/" => "/usr/lib/cgit/cgit.cgi" +) + +cgi.assign = ( + "cgit.cgi" => "" +) + +url.rewrite-once = ( + "^/cgit/(.*)$" => "/cgit/cgit.cgi/$1" +) +``` + +## Configuration File + +Create `/etc/cgitrc`: + +```ini +# Site identity +root-title=My Git Server +root-desc=Git repository browser +css=/cgit-data/cgit.css +logo=/cgit-data/cgit.png +favicon=/cgit-data/favicon.ico + +# URL routing +virtual-root=/cgit/ + +# Features +enable-commit-graph=1 +enable-blame=1 +enable-http-clone=1 +enable-index-links=1 +snapshots=tar.gz tar.xz zip +max-stats=quarter + +# Caching (recommended for production) +cache-size=1000 +cache-root=/var/cache/cgit +cache-root-ttl=5 +cache-repo-ttl=5 +cache-static-ttl=-1 + +# Repository discovery +scan-path=/srv/git/ +section-from-path=1 +enable-git-config=1 + +# Filters +source-filter=exec:/usr/lib/cgit/filters/syntax-highlighting.py 
+about-filter=exec:/usr/lib/cgit/filters/about-formatting.sh +``` + +## Cache Directory Setup + +```bash +# Create cache directory +mkdir -p /var/cache/cgit + +# Set ownership to web server user +chown www-data:www-data /var/cache/cgit +chmod 700 /var/cache/cgit + +# Optional: periodic cleanup cron job +echo "*/30 * * * * find /var/cache/cgit -type f -mmin +60 -delete" | \ + crontab -u www-data - +``` + +## Repository Permissions + +The web server user needs read access to all git repositories: + +```bash +# Option 1: Add web server user to git group +usermod -aG git www-data + +# Option 2: Set directory permissions +chmod -R g+rX /srv/git/ + +# Option 3: Use ACLs +setfacl -R -m u:www-data:rX /srv/git/ +setfacl -R -d -m u:www-data:rX /srv/git/ +``` + +## HTTPS Setup + +For production, serve cgit over HTTPS: + +```nginx +server { + listen 443 ssl; + server_name git.example.com; + + ssl_certificate /etc/ssl/certs/git.example.com.pem; + ssl_certificate_key /etc/ssl/private/git.example.com.key; + + # ... cgit configuration ... 
+} + +server { + listen 80; + server_name git.example.com; + return 301 https://$server_name$request_uri; +} +``` + +## Performance Tuning + +### Enable Caching + +The response cache is essential for performance: + +```ini +cache-size=1000 # number of cache entries +cache-root-ttl=5 # repo list: 5 minutes +cache-repo-ttl=5 # repo pages: 5 minutes +cache-static-ttl=-1 # static content: forever +cache-about-ttl=15 # about pages: 15 minutes +``` + +### Limit Resource Usage + +```ini +max-repo-count=100 # repos per page +max-commit-count=50 # commits per page +max-blob-size=512 # max blob display (KB) +max-message-length=120 # truncate long subjects +max-repodesc-length=80 # truncate descriptions +``` + +### Use Lua Filters + +Lua filters avoid fork/exec overhead: + +```ini +source-filter=lua:/usr/share/cgit/filters/syntax-highlight.lua +email-filter=lua:/usr/share/cgit/filters/email-libravatar.lua +``` + +### Optimize Git Access + +```bash +# Run periodic git gc on repositories +for repo in /srv/git/*.git; do + git -C "$repo" gc --auto +done + +# Ensure pack files are optimized +for repo in /srv/git/*.git; do + git -C "$repo" repack -a -d +done +``` + +## Monitoring + +### Check Cache Status + +```bash +# Count cache entries +ls /var/cache/cgit/ | wc -l + +# Check cache hit rate (if access logs are enabled) +grep "cgit.cgi" /var/log/nginx/access.log | tail -100 +``` + +### Health Check + +```bash +# Verify cgit is responding +curl -s -o /dev/null -w "%{http_code}" http://localhost/cgit/ +``` + +## Docker Deployment + +```dockerfile +FROM alpine:latest + +RUN apk add --no-cache \ + git make gcc musl-dev openssl-dev zlib-dev lua5.3-dev \ + fcgiwrap nginx + +COPY cgit/ /build/cgit/ +WORKDIR /build/cgit +RUN make && make install + +COPY cgitrc /etc/cgitrc +COPY nginx.conf /etc/nginx/conf.d/cgit.conf + +EXPOSE 80 +CMD ["sh", "-c", "fcgiwrap -s unix:/run/fcgiwrap.sock & nginx -g 'daemon off;'"] +``` + +## systemd Service + +```ini +# 
/etc/systemd/system/fcgiwrap-cgit.service +[Unit] +Description=fcgiwrap for cgit +After=network.target + +[Service] +ExecStart=/usr/bin/fcgiwrap -s unix:/run/fcgiwrap.sock +User=www-data +Group=www-data + +[Install] +WantedBy=multi-user.target +``` + +## Troubleshooting + +| Symptom | Cause | Solution | +|---------|-------|----------| +| 500 Internal Server Error | CGI binary not executable | `chmod +x cgit.cgi` | +| Blank page | Missing CSS path | Check `css=` directive | +| No repositories shown | Wrong `scan-path` | Verify path and permissions | +| Cache errors | Permission denied | Fix cache dir ownership | +| Lua filter fails | Lua not compiled in | Rebuild without `NO_LUA` | +| Clone fails | `enable-http-clone=0` | Set to `1` | +| Missing styles | Static file alias wrong | Check web server alias config | +| Timeout on large repos | No caching | Enable `cache-size` | diff --git a/docs/handbook/cgit/diff-engine.md b/docs/handbook/cgit/diff-engine.md new file mode 100644 index 0000000000..c82092842c --- /dev/null +++ b/docs/handbook/cgit/diff-engine.md @@ -0,0 +1,352 @@ +# cgit — Diff Engine + +## Overview + +cgit's diff engine renders differences between commits, trees, and blobs. +It supports three diff modes: unified, side-by-side, and stat-only. The +engine leverages libgit's internal diff machinery and adds HTML rendering on +top. + +Source files: `ui-diff.c`, `ui-diff.h`, `ui-ssdiff.c`, `ui-ssdiff.h`, +`shared.c` (diff helpers). + +## Diff Types + +```c +#define DIFF_UNIFIED 0 /* traditional unified diff */ +#define DIFF_SSDIFF 1 /* side-by-side diff */ +#define DIFF_STATONLY 2 /* only show diffstat */ +``` + +The diff type is selected by the `dt` query parameter (the legacy `ss` +parameter is also accepted) or the `side-by-side-diffs` configuration +directive. + +## Diffstat + +### File Info Structure + +```c +struct fileinfo { + char status; /* 'A'dd, 'D'elete, 'M'odify, 'R'ename, etc.
*/ + unsigned long old_size; + unsigned long new_size; + int binary; + struct object_id old_oid; /* old blob SHA */ + struct object_id new_oid; /* new blob SHA */ + unsigned short old_mode; + unsigned short new_mode; + char *old_path; + char *new_path; + int added; /* lines added */ + int removed; /* lines removed */ +}; +``` + +### Collecting File Changes: `inspect_filepair()` + +For each changed file in a commit, `inspect_filepair()` records the change +information: + +```c +static void inspect_filepair(struct diff_filepair *pair) +{ + /* populate a fileinfo entry from the diff_filepair */ + files++; + switch (pair->status) { + case DIFF_STATUS_ADDED: + info->status = 'A'; + break; + case DIFF_STATUS_DELETED: + info->status = 'D'; + break; + case DIFF_STATUS_MODIFIED: + info->status = 'M'; + break; + case DIFF_STATUS_RENAMED: + info->status = 'R'; + /* old_path and new_path differ */ + break; + case DIFF_STATUS_COPIED: + info->status = 'C'; + break; + /* ... */ + } +} +``` + +### Rendering Diffstat: `cgit_print_diffstat()` + +```c +void cgit_print_diffstat(const struct object_id *old, + const struct object_id *new, + const char *prefix) +``` + +Renders an HTML table showing changed files with bar graphs: + +```html +<table summary='diffstat' class='diffstat'> + <tr> + <td class='mode'>M</td> + <td class='upd'><a href='...'>src/main.c</a></td> + <td class='right'>42</td> + <td class='graph'> + <span class='add' style='width: 70%'></span> + <span class='rem' style='width: 30%'></span> + </td> + </tr> + ... + <tr class='total'> + <td colspan='3'>5 files changed, 120 insertions, 45 deletions</td> + </tr> +</table> +``` + +The bar graph width is calculated proportionally to the maximum changed +lines across all files. 
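The proportional bar widths can be illustrated with a small sketch. It assumes the widths are percentages of the largest per-file change count, which matches the `style='width: ...%'` attributes in the sample markup above. The names `change_count`, `bar_widths`, `max_changes_sketch()`, and `diffstat_bar_sketch()` are hypothetical and do not appear in `ui-diff.c`.

```c
#include <assert.h>

/* Per-file line counts, mirroring the added/removed fields of fileinfo. */
struct change_count {
	int added;
	int removed;
};

/* Percent widths for the two <span> bars of one diffstat row. */
struct bar_widths {
	int add_pct;
	int rem_pct;
};

/* Largest added+removed total across all files (0 for an empty list). */
static int max_changes_sketch(const struct change_count *files, int n)
{
	int i, max = 0;

	for (i = 0; i < n; i++) {
		int total = files[i].added + files[i].removed;
		if (total > max)
			max = total;
	}
	return max;
}

/*
 * Hypothetical helper: scale one file's counts against the maximum so
 * the busiest file fills the graph cell and the others shrink with it.
 */
static struct bar_widths diffstat_bar_sketch(int added, int removed,
					     int max_changes)
{
	struct bar_widths w = { 0, 0 };

	assert(added >= 0 && removed >= 0);
	if (max_changes <= 0)
		return w; /* nothing changed anywhere: draw empty bars */
	w.add_pct = (int)((long long)added * 100 / max_changes);
	w.rem_pct = (int)((long long)removed * 100 / max_changes);
	return w;
}
```

Scaling against the maximum rather than against each file's own total keeps the rows comparable: a file with 20 changed lines gets a visibly shorter bar than one with 100.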
+ +## Unified Diff + +### `cgit_print_diff()` + +The main diff rendering function: + +```c +void cgit_print_diff(const char *new_rev, const char *old_rev, + const char *prefix, int show_ctrls, int raw) +``` + +Parameters: +- `new_rev` — New commit SHA +- `old_rev` — Old commit SHA (optional; defaults to parent) +- `prefix` — Path prefix filter (show only diffs under this path) +- `show_ctrls` — Show diff controls (diff type toggle buttons) +- `raw` — Output raw diff without HTML wrapping + +### Diff Controls + +When `show_ctrls=1`, diff mode toggle buttons are rendered: + +```html +<div class='cgit-panel'> + <b>Diff options</b> + <form method='get' action='...'> + <select name='dt'> + <option value='0'>unified</option> + <option value='1'>ssdiff</option> + <option value='2'>stat only</option> + </select> + <input type='submit' value='Go'/> + </form> +</div> +``` + +### Filepair Callback: `filepair_cb()` + +For each changed file, `filepair_cb()` renders the diff: + +```c +static void filepair_cb(struct diff_filepair *pair) +{ + /* emit file header */ + htmlf("<div class='head'>%s</div>", pair->one->path); + /* set up diff options */ + xdiff_opts.ctxlen = ctx.qry.context ?: 3; + /* run the diff and emit line-by-line output */ + /* each line gets a CSS class: .add, .del, or .ctx */ +} +``` + +### Hunk Headers + +```c +void cgit_print_diff_hunk_header(int oldofs, int oldcnt, + int newofs, int newcnt, + const char *func) +``` + +Renders hunk headers as: + +```html +<div class='hunk'>@@ -oldofs,oldcnt +newofs,newcnt @@ func</div> +``` + +### Line Rendering + +Each diff line is rendered with a status prefix and CSS class: + +| Line Type | CSS Class | Prefix | +|-----------|----------|--------| +| Added | `.add` | `+` | +| Removed | `.del` | `-` | +| Context | `.ctx` | ` ` | +| Hunk header | `.hunk` | `@@` | + +## Side-by-Side Diff (`ui-ssdiff.c`) + +The side-by-side diff view renders old and new versions in adjacent columns. 
+ +### LCS Algorithm + +`ui-ssdiff.c` implements a Longest Common Subsequence (LCS) algorithm to +align lines between old and new versions: + +```c +/* LCS computation for line alignment */ +static int *lcs(char *a, int an, char *b, int bn) +{ + int *prev, *curr; + /* dynamic programming: build LCS table */ + prev = calloc(bn + 1, sizeof(int)); + curr = calloc(bn + 1, sizeof(int)); + for (int i = 1; i <= an; i++) { + for (int j = 1; j <= bn; j++) { + if (a[i-1] == b[j-1]) + curr[j] = prev[j-1] + 1; + else + curr[j] = MAX(prev[j], curr[j-1]); + } + SWAP(prev, curr); + } + return prev; +} +``` + +### Deferred Lines + +Side-by-side rendering uses a deferred output model: + +```c +struct deferred_lines { + int line_no; + char *line; + struct deferred_lines *next; +}; +``` + +Lines are collected and paired before output. For modified lines, the LCS +algorithm identifies character-level changes and highlights them with +`<span class='add'>` or `<span class='del'>` within each line. + +### Tab Expansion + +```c +static char *replace_tabs(char *line) +``` + +Tabs are expanded to spaces for proper column alignment in side-by-side +view. The tab width is 8 characters. + +### Rendering + +Side-by-side output uses a two-column `<table>`: + +```html +<table class='ssdiff'> + <tr> + <td class='lineno'><a>42</a></td> + <td class='del'>old line content</td> + <td class='lineno'><a>42</a></td> + <td class='add'>new line content</td> + </tr> +</table> +``` + +Changed characters within a line are highlighted with inline spans. + +## Low-Level Diff Helpers (`shared.c`) + +### Tree Diff + +```c +void cgit_diff_tree(const struct object_id *old_oid, + const struct object_id *new_oid, + filepair_fn fn, const char *prefix, + int renamelimit) +``` + +Computes the diff between two tree objects (typically from two commits). +Calls `fn` for each changed file pair. `renamelimit` controls rename +detection threshold. 
+ +### Commit Diff + +```c +void cgit_diff_commit(struct commit *commit, filepair_fn fn, + const char *prefix) +``` + +Diffs a commit against its first parent. For root commits (no parent), +diffs against an empty tree. + +### File Diff + +```c +void cgit_diff_files(const struct object_id *old_oid, + const struct object_id *new_oid, + unsigned long *old_size, + unsigned long *new_size, + int *binary, int context, + int ignorews, linediff_fn fn) +``` + +Performs a line-level diff between two blobs. The `linediff_fn` callback is +invoked for each output line (add/remove/context). + +## Diff in Context: Commit View + +`ui-commit.c` uses the diff engine to show changes in commit view: + +```c +void cgit_print_commit(const char *rev, const char *prefix) +{ + /* ... commit metadata ... */ + cgit_print_diff(ctx.qry.sha1, info->parent_sha1, prefix, 0, 0); +} +``` + +## Diff in Context: Log View + +`ui-log.c` can optionally show per-commit diffstats: + +```c +if (ctx.cfg.enable_log_filecount) { + cgit_diff_commit(commit, inspect_filepair, NULL); + /* display changed files count, added/removed */ +} +``` + +## Binary Detection + +Files are marked as binary when diffing if the content contains null bytes +or exceeds the configured max-blob-size. Binary files are shown as: + +``` +Binary files differ +``` + +No line-level diff is performed for binary content. 
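The two conditions above can be expressed as a short check. This is a sketch under stated assumptions, not the detection code cgit or libgit actually runs: `looks_binary_sketch()` is a hypothetical name, the size limit is assumed to be given in kilobytes, and a limit of `0` is assumed to mean "no limit".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a blob should be treated as binary
 * under the two rules described above, 0 otherwise.
 *
 * Assumptions: max_blob_kb is in kilobytes and 0 disables the size rule.
 */
static int looks_binary_sketch(const char *buf, size_t size,
			       size_t max_blob_kb)
{
	assert(buf != NULL || size == 0);

	/* rule 1: blob larger than the configured limit */
	if (max_blob_kb > 0 && size > max_blob_kb * 1024)
		return 1;

	/* rule 2: any NUL byte in the content */
	if (size > 0 && memchr(buf, '\0', size) != NULL)
		return 1;

	return 0;
}
```

A caller would run a check like this before the line-level diff and emit the `Binary files differ` message instead of invoking the line callback.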
+ +## Diff Configuration + +| Directive | Default | Effect | +|-----------|---------|--------| +| `side-by-side-diffs` | 0 | Default diff type | +| `renamelimit` | -1 | Rename detection limit | +| `max-blob-size` | 0 | Max blob size for display | +| `enable-log-filecount` | 0 | Show file counts in log | +| `enable-log-linecount` | 0 | Show line counts in log | + +## Raw Diff Output + +The `rawdiff` command outputs a plain-text unified diff without HTML +wrapping, suitable for piping or downloading: + +```c +static void cmd_rawdiff(struct cgit_context *ctx) +{ + ctx->page.mimetype = "text/plain"; + cgit_print_diff(ctx->qry.sha1, ctx->qry.sha2, + ctx->qry.path, 0, 1 /* raw */); +} +``` diff --git a/docs/handbook/cgit/filter-system.md b/docs/handbook/cgit/filter-system.md new file mode 100644 index 0000000000..be6f94e4b7 --- /dev/null +++ b/docs/handbook/cgit/filter-system.md @@ -0,0 +1,358 @@ +# cgit — Filter System + +## Overview + +cgit provides a pluggable content filtering pipeline that transforms text +before it is rendered in HTML output. Filters are used for tasks such as +syntax highlighting, README rendering, email obfuscation, and authentication. + +Source file: `filter.c`. 
+ +## Filter Types + +Six filter types are defined, each identified by a constant and linked to an +entry in the `filter_specs[]` table: + +```c +#define ABOUT_FILTER 0 /* README/about page rendering */ +#define COMMIT_FILTER 1 /* commit message formatting */ +#define SOURCE_FILTER 2 /* source code syntax highlighting */ +#define EMAIL_FILTER 3 /* email address display */ +#define AUTH_FILTER 4 /* authentication/authorization */ +#define OWNER_FILTER 5 /* owner field display */ +``` + +### Filter Specs Table + +```c +static struct { + char *prefix; + int args; +} filter_specs[] = { + [ABOUT_FILTER] = { "about", 1 }, + [COMMIT_FILTER] = { "commit", 0 }, + [SOURCE_FILTER] = { "source", 1 }, + [EMAIL_FILTER] = { "email", 2 }, /* email, page */ + [AUTH_FILTER] = { "auth", 12 }, + [OWNER_FILTER] = { "owner", 0 }, +}; +``` + +The `args` field specifies the number of *extra* arguments the filter +receives (beyond the filter command itself). + +## Filter Structure + +```c +struct cgit_filter { + char *cmd; /* command or script path */ + int type; /* filter type constant */ + int (*open)(struct cgit_filter *, ...); /* start filter */ + int (*close)(struct cgit_filter *); /* finish filter */ + void (*fprintf)(struct cgit_filter *, FILE *, const char *fmt, ...); + void (*cleanup)(struct cgit_filter *); /* free resources */ + int argument_count; /* from filter_specs */ +}; +``` + +Two implementations exist: + +| Implementation | Struct | Description | +|---------------|--------|-------------| +| Exec filter | `struct cgit_exec_filter` | Fork/exec an external process | +| Lua filter | `struct cgit_lua_filter` | Execute a Lua script in-process | + +## Exec Filters + +Exec filters fork a child process and redirect `stdout` through a pipe. All +data written to `stdout` while the filter is open passes through the child +process, which can transform it before output. 
+ +### Structure + +```c +struct cgit_exec_filter { + struct cgit_filter base; + char *cmd; + char **argv; + int old_stdout; /* saved fd for restoring stdout */ + int pipe_fh[2]; /* pipe: [read, write] */ + pid_t pid; /* child process id */ +}; +``` + +### Open Phase + +```c +static int open_exec_filter(struct cgit_filter *base, ...) +{ + struct cgit_exec_filter *f = (struct cgit_exec_filter *)base; + /* create pipe */ + pipe(f->pipe_fh); + /* save stdout */ + f->old_stdout = dup(STDOUT_FILENO); + /* fork */ + f->pid = fork(); + if (f->pid == 0) { + /* child: redirect stdin from pipe read end */ + dup2(f->pipe_fh[0], STDIN_FILENO); + close(f->pipe_fh[0]); + close(f->pipe_fh[1]); + /* exec the filter command with extra args from va_list */ + execvp(f->cmd, f->argv); + /* on failure: */ + exit(1); + } + /* parent: redirect stdout to pipe write end */ + dup2(f->pipe_fh[1], STDOUT_FILENO); + close(f->pipe_fh[0]); + close(f->pipe_fh[1]); + return 0; +} +``` + +### Close Phase + +```c +static int close_exec_filter(struct cgit_filter *base) +{ + struct cgit_exec_filter *f = (struct cgit_exec_filter *)base; + int status; + fflush(stdout); + /* restore original stdout */ + dup2(f->old_stdout, STDOUT_FILENO); + close(f->old_stdout); + /* wait for child */ + waitpid(f->pid, &status, 0); + /* return child exit status */ + if (WIFEXITED(status)) + return WEXITSTATUS(status); + return -1; +} +``` + +### Argument Passing + +Extra arguments (from `filter_specs[].args`) are passed via `va_list` in the +open function and become `argv` entries for the child process: + +| Filter Type | argv[0] | argv[1] | argv[2] | ... 
| +|-------------|---------|---------|---------|-----| +| ABOUT | cmd | filename | — | — | +| SOURCE | cmd | filename | — | — | +| COMMIT | cmd | — | — | — | +| OWNER | cmd | — | — | — | +| EMAIL | cmd | email | page | — | +| AUTH | cmd | (12 args: method, mimetype, http_host, https, authenticated, username, http_cookie, request_method, query_string, referer, path, http_accept) | + +## Lua Filters + +When cgit is compiled with Lua support, filters can be Lua scripts executed +in-process without fork/exec overhead. + +### Structure + +```c +struct cgit_lua_filter { + struct cgit_filter base; + char *script_file; + lua_State *lua_state; +}; +``` + +### Lua API + +The Lua script must define a `filter_open()` and `filter_close()` function. +Data is passed to the Lua script through a custom `write()` function +registered in the Lua environment. + +```lua +-- Example source filter +function filter_open(filename) + -- Called when the filter opens + -- filename is the file being processed +end + +function write(str) + -- Called with chunks of content to filter + -- Write transformed output + html(str) +end + +function filter_close() + -- Called when filtering is complete + return 0 -- return exit code +end +``` + +### Lua C Bindings + +cgit registers several C functions into the Lua environment: + +```c +lua_pushcfunction(lua_state, lua_html); /* html() */ +lua_pushcfunction(lua_state, lua_html_txt); /* html_txt() */ +lua_pushcfunction(lua_state, lua_html_attr); /* html_attr() */ +lua_pushcfunction(lua_state, lua_html_url_path); /* html_url_path() */ +lua_pushcfunction(lua_state, lua_html_url_arg); /* html_url_arg() */ +lua_pushcfunction(lua_state, lua_html_include); /* include() */ +``` + +These correspond to the C functions in `html.c` and allow the Lua script to +produce properly escaped HTML output. + +### Lua Filter Open + +```c +static int open_lua_filter(struct cgit_filter *base, ...) 
+{ + struct cgit_lua_filter *f = (struct cgit_lua_filter *)base; + /* Load and execute the Lua script if not already loaded */ + if (!f->lua_state) { + f->lua_state = luaL_newstate(); + luaL_openlibs(f->lua_state); + /* register C bindings */ + /* load script file */ + } + /* redirect write() calls to the Lua state */ + /* call filter_open() in the Lua script, passing extra args */ + return 0; +} +``` + +### Lua Filter Close + +```c +static int close_lua_filter(struct cgit_filter *base) +{ + struct cgit_lua_filter *f = (struct cgit_lua_filter *)base; + /* call filter_close() in the Lua script */ + /* return the script's exit code */ + return lua_tointeger(f->lua_state, -1); +} +``` + +## Filter Construction + +`cgit_new_filter()` creates a new filter instance: + +```c +struct cgit_filter *cgit_new_filter(const char *cmd, filter_type type) +{ + if (!cmd || !*cmd) + return NULL; + + if (!prefixcmp(cmd, "lua:")) { + /* create Lua filter */ + return new_lua_filter(cmd + 4, type); + } + if (!prefixcmp(cmd, "exec:")) { + /* create exec filter, stripping prefix */ + return new_exec_filter(cmd + 5, type); + } + /* default: treat as exec filter */ + return new_exec_filter(cmd, type); +} +``` + +Prefix rules: +- `lua:/path/to/script.lua` → Lua filter +- `exec:/path/to/script` → exec filter +- `/path/to/script` (no prefix) → exec filter (backward compatibility) + +## Filter Usage Points + +### About Filter (`ABOUT_FILTER`) + +Applied when rendering README and about pages. Called from `ui-summary.c` +and the about view: + +```c +cgit_open_filter(ctx.repo->about_filter, filename); +/* write README content */ +cgit_close_filter(ctx.repo->about_filter); +``` + +Common use: converting Markdown to HTML. + +### Source Filter (`SOURCE_FILTER`) + +Applied when displaying file contents in blob/tree views. 
Called from +`ui-tree.c`: + +```c +cgit_open_filter(ctx.repo->source_filter, filename); +/* write file content */ +cgit_close_filter(ctx.repo->source_filter); +``` + +Common use: syntax highlighting. + +### Commit Filter (`COMMIT_FILTER`) + +Applied to commit messages in log and commit views. Called from `ui-log.c` +and `ui-commit.c`: + +```c +cgit_open_filter(ctx.repo->commit_filter); +html_txt(info->msg); +cgit_close_filter(ctx.repo->commit_filter); +``` + +Common use: linkifying issue references. + +### Email Filter (`EMAIL_FILTER`) + +Applied to author/committer email addresses. Receives the email address and +current page name as arguments: + +```c +cgit_open_filter(ctx.repo->email_filter, email, page); +html_txt(email); +cgit_close_filter(ctx.repo->email_filter); +``` + +Common use: gravatar integration, email obfuscation. + +### Auth Filter (`AUTH_FILTER`) + +Used for cookie-based authentication. Receives 12 arguments covering the +full HTTP request context. See `authentication.md` for details. + +### Owner Filter (`OWNER_FILTER`) + +Applied when displaying the repository owner. 
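
Before the list of shipped scripts, the prefix rules from the Filter Construction section can be made concrete with a small, self-contained sketch. It is not cgit's own code: `sketch_filter_kind` and `sketch_classify()` are hypothetical names used only for illustration, and the sketch covers only the classification step, not filter creation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch; these are not cgit types. */
enum sketch_filter_kind { SKETCH_NONE, SKETCH_EXEC, SKETCH_LUA };

/*
 * Classify a filter spec by its prefix and report where the script path
 * begins. A NULL or empty spec yields SKETCH_NONE and a NULL path.
 */
enum sketch_filter_kind sketch_classify(const char *cmd, const char **path)
{
	assert(path != NULL);
	*path = NULL;

	if (!cmd || !*cmd)
		return SKETCH_NONE;

	if (strncmp(cmd, "lua:", 4) == 0) {
		*path = cmd + 4;        /* skip "lua:" */
		return SKETCH_LUA;
	}
	if (strncmp(cmd, "exec:", 5) == 0) {
		*path = cmd + 5;        /* skip "exec:" */
		return SKETCH_EXEC;
	}

	/* No prefix: treated as an exec filter for backward compatibility. */
	*path = cmd;
	return SKETCH_EXEC;
}
```

With this sketch, a value such as `lua:/path/to/script.lua` classifies as a Lua filter with the path `/path/to/script.lua`, while a bare `/path/to/script` classifies as an exec filter.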
+ +## Shipped Filter Scripts + +cgit ships with filter scripts in the `filters/` directory: + +| Script | Type | Description | +|--------|------|-------------| +| `syntax-highlighting.py` | SOURCE | Python-based syntax highlighter using Pygments | +| `syntax-highlighting.sh` | SOURCE | Shell-based highlighter (highlight command) | +| `about-formatting.sh` | ABOUT | Renders markdown via `markdown` or `rst2html` | +| `html-converters/md2html` | ABOUT | Standalone markdown-to-HTML converter | +| `html-converters/rst2html` | ABOUT | reStructuredText-to-HTML converter | +| `html-converters/txt2html` | ABOUT | Plain text to HTML converter | +| `email-gravatar.py` | EMAIL | Adds gravatar avatars | +| `email-libravatar.lua` | EMAIL | Lua-based libravatar integration | +| `simple-hierarchical-auth.lua` | AUTH | Lua path-based authentication | + +## Error Handling + +If an exec filter's child process exits with a non-zero status, `close()` +returns that status code. The calling code can check this to fall back to +unfiltered output. + +If a Lua filter throws an error, the error message is logged via +`die("lua error")` and the filter is aborted. + +## Performance Considerations + +- **Exec filters** have per-invocation fork/exec overhead. For high-traffic + sites, consider Lua filters or enabling the response cache. +- **Lua filters** run in-process with no fork overhead but require Lua support + to be compiled in. +- Filters are not called when serving cached responses — the cached output + already includes the filtered content. diff --git a/docs/handbook/cgit/html-rendering.md b/docs/handbook/cgit/html-rendering.md new file mode 100644 index 0000000000..dab14d66b2 --- /dev/null +++ b/docs/handbook/cgit/html-rendering.md @@ -0,0 +1,380 @@ +# cgit — HTML Rendering Engine + +## Overview + +cgit generates all HTML output through a set of low-level rendering functions +defined in `html.c` and `html.h`. These functions handle entity escaping, +URL encoding, and formatted output. 
Higher-level page structure is built by +`ui-shared.c`. + +Source files: `html.c`, `html.h`, `ui-shared.c`, `ui-shared.h`. + +## Output Model + +All output functions write directly to `stdout` via `write(2)`. There is no +internal buffering beyond the standard I/O buffer. This design works because +cgit runs as a CGI process: each request is a separate process with its own +stdout connected to the web server. + +## Core Output Functions + +### Raw Output + +```c +void html_raw(const char *data, size_t size); +``` + +Writes raw bytes to stdout without any escaping. Used for binary content +and pre-escaped strings. + +### Escaped Text Output + +```c +void html(const char *txt); +``` + +Writes a NUL-terminated string without escaping; it is the string form of +`html_raw()` and is meant for trusted markup only. + +```c +void html_txt(const char *txt); +``` + +Writes a string with HTML entity escaping: +- `<` → `&lt;` +- `>` → `&gt;` +- `&` → `&amp;` + +Used for text content that appears between HTML tags. + +```c +void html_ntxt(const char *txt, int len); +``` + +Length-limited version of `html_txt()`. Writes at most `len` characters, +appending `...` if truncated. + +### Attribute Escaping + +```c +void html_attr(const char *txt); +``` + +Escapes text for use in HTML attribute values. In addition to the characters +handled by `html_txt()`, it escapes: +- `"` → `&quot;` +- `'` → `&#x27;` + +## URL Encoding + +### URL Escape Table + +`html.c` defines a 256-entry escape table for URL encoding: + +```c +static const char *url_escape_table[256] = { + "%00", "%01", "%02", ..., + [' '] = "+", + ['!'] = NULL, /* pass through */ + ['"'] = "%22", + ['#'] = "%23", + ['%'] = "%25", + ['&'] = "%26", + ['+'] = "%2B", + ['?'] = "%3F", + /* letters, digits, '-', '_', '.', '~' pass through (NULL) */ + ... +}; +``` + +Characters with a `NULL` entry pass through unmodified. All others are +replaced with their percent-encoded representations. + +### URL Path Encoding + +```c +void html_url_path(const char *txt); +``` + +Encodes a URL path component.
Uses `url_escape_table` but preserves `/` +characters (they are structural in paths). + +### URL Argument Encoding + +```c +void html_url_arg(const char *txt); +``` + +Encodes a URL query parameter value. Uses `url_escape_table` including +encoding `/` characters. + +## Formatted Output + +### `fmt()` — Ring Buffer Formatter + +```c +const char *fmt(const char *format, ...); +``` + +A `printf`-style formatter that returns a pointer to an internal static +buffer. Uses a ring of 8 buffers (each 8 KB) to allow multiple `fmt()` +calls in a single expression: + +```c +#define FMT_BUFS 8 +#define FMT_SIZE 8192 + +static char bufs[FMT_BUFS][FMT_SIZE]; +static int bufidx; + +const char *fmt(const char *format, ...) +{ + bufidx = (bufidx + 1) % FMT_BUFS; + va_list args; + va_start(args, format); + vsnprintf(bufs[bufidx], FMT_SIZE, format, args); + va_end(args); + return bufs[bufidx]; +} +``` + +This is used extensively throughout cgit for constructing strings without +explicit memory management. The ring buffer avoids use-after-free for up to +8 nested calls. + +### `fmtalloc()` — Heap Formatter + +```c +char *fmtalloc(const char *format, ...); +``` + +Like `fmt()` but allocates a new heap buffer with `xstrfmt()`. Used when +the result must outlive the ring buffer cycle. + +### `htmlf()` — Formatted HTML + +```c +void htmlf(const char *format, ...); +``` + +`printf`-style output directly to stdout. Does NOT perform HTML escaping — +the caller must ensure the format string and arguments are safe. + +## Form Helpers + +### Hidden Fields + +```c +void html_hidden(const char *name, const char *value); +``` + +Generates a hidden form field: + +```html +<input type='hidden' name='name' value='value' /> +``` + +Values are attribute-escaped. 
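
As a rough illustration of what "attribute-escaped" means for a hidden field, the following self-contained sketch builds the markup into a caller-supplied buffer instead of writing to stdout. The names `sketch_append()`, `sketch_attr_escape()`, and `sketch_hidden()` are hypothetical and are not part of `html.c`; the exact entity chosen for the single quote (`&#x27;`) is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append s to the NUL-terminated string in dst; -1 if it does not fit. */
static int sketch_append(char *dst, size_t cap, const char *s)
{
	size_t used = strlen(dst);
	size_t n = strlen(s);

	if (used + n + 1 > cap)
		return -1;
	memcpy(dst + used, s, n + 1);   /* copies the terminating NUL too */
	return 0;
}

/*
 * Append src to dst, replacing the five characters that are unsafe inside
 * a quoted attribute value. Hypothetical helper, not cgit's html_attr().
 */
int sketch_attr_escape(char *dst, size_t cap, const char *src)
{
	assert(dst != NULL && src != NULL);

	for (; *src; src++) {
		char one[2] = { *src, '\0' };
		const char *rep = one;      /* default: copy the byte as-is */

		switch (*src) {
		case '<':  rep = "&lt;";   break;
		case '>':  rep = "&gt;";   break;
		case '&':  rep = "&amp;";  break;
		case '"':  rep = "&quot;"; break;
		case '\'': rep = "&#x27;"; break;
		}
		if (sketch_append(dst, cap, rep) < 0)
			return -1;
	}
	return 0;
}

/* Build "<input type='hidden' name='...' value='...' />" into out. */
int sketch_hidden(char *out, size_t cap, const char *name, const char *value)
{
	if (cap == 0)
		return -1;
	out[0] = '\0';

	if (sketch_append(out, cap, "<input type='hidden' name='") < 0 ||
	    sketch_attr_escape(out, cap, name) < 0 ||
	    sketch_append(out, cap, "' value='") < 0 ||
	    sketch_attr_escape(out, cap, value) < 0 ||
	    sketch_append(out, cap, "' />") < 0)
		return -1;
	return 0;
}
```

The point of the design is that a value containing a quote cannot terminate the attribute early: every quote and angle bracket in `name` and `value` is replaced before it reaches the output.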
+ +### Option Elements + +```c +void html_option(const char *value, const char *text, const char *selected_value); +``` + +Generates an `<option>` element, marking it as selected if `value` matches +`selected_value`: + +```html +<option value='value' selected='selected'>text</option> +``` + +### Checkbox Input + +```c +void html_checkbox(const char *name, int value); +``` + +Generates a checkbox input. + +### Text Input + +```c +void html_txt_input(const char *name, const char *value, int size); +``` + +Generates a text input field. + +## Link Generation + +```c +void html_link_open(const char *url, const char *title, const char *class); +void html_link_close(void); +``` + +Generate `<a>` tags with optional title and class attributes. URL is +path-escaped. + +## File Inclusion + +```c +void html_include(const char *filename); +``` + +Reads a file from disk and writes its contents to stdout without escaping. +Used for header/footer file inclusion configured via the `header` and +`footer` directives. + +## Page Structure (`ui-shared.c`) + +### HTTP Headers + +```c +void cgit_print_http_headers(void); +``` + +Emits HTTP response headers based on `ctx.page`: + +``` +Status: 200 OK +Content-Type: text/html; charset=utf-8 +Last-Modified: Thu, 01 Jan 2024 00:00:00 GMT +Expires: Thu, 01 Jan 2024 01:00:00 GMT +ETag: "abc123" +``` + +Fields are only emitted when the corresponding `ctx.page` fields are set. + +### HTML Document Head + +```c +void cgit_print_docstart(void); +``` + +Emits the HTML5 doctype, `<html>`, and `<head>` section: + +```html +<!DOCTYPE html> +<html lang='en'> +<head> + <title>repo - page + + + + + +``` + +### Page Header + +```c +void cgit_print_pageheader(void); +``` + +Renders the page header with logo, navigation tabs, and search form. +Navigation tabs are context-sensitive — repository pages show +summary/refs/log/tree/commit/diff/stats/etc. 
+ +### Page Footer + +```c +void cgit_print_docend(void); +``` + +Closes the HTML document with footer content and closing tags. + +### Full Page Layout + +```c +void cgit_print_layout_start(void); +void cgit_print_layout_end(void); +``` + +These wrap the page content, calling `cgit_print_http_headers()`, +`cgit_print_docstart()`, `cgit_print_pageheader()`, etc. Commands with +`want_layout=1` have their output wrapped in this skeleton. + +## Repository Navigation + +```c +void cgit_print_repoheader(void); +``` + +For each page within a repository, renders: +- Repository name and description +- Navigation tabs: summary, refs, log, tree, commit, diff, stats +- Clone URLs +- Badges + +## Link Functions + +`ui-shared.c` provides numerous helper functions for generating +context-aware links: + +```c +void cgit_summary_link(const char *name, const char *title, + const char *class, const char *head); +void cgit_tag_link(const char *name, const char *title, + const char *class, const char *tag); +void cgit_tree_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +void cgit_log_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path, + int ofs, const char *grep, const char *pattern, + int showmsg, int follow); +void cgit_commit_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +void cgit_patch_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +void cgit_refs_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +void cgit_diff_link(const char *name, const char *title, + const char *class, const char *head, + const char *new_rev, const char *old_rev, + const char *path, int toggle_hierarchical_threading); +void cgit_stats_link(const char *name, const 
char *title, + const char *class, const char *head, + const char *path); +void cgit_plain_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +void cgit_blame_link(const char *name, const char *title, + const char *class, const char *head, + const char *rev, const char *path); +void cgit_object_link(struct object *obj); +void cgit_submodule_link(const char *name, const char *path, + const char *commit); +``` + +Each function builds a complete `<a>` tag with the appropriate URL, including +all required query parameters for the target page. + +## Diff Output Helpers + +```c +void cgit_print_diff_hunk_header(int oldofs, int oldcnt, + int newofs, int newcnt, const char *func); +void cgit_print_diff_line_prefix(int type); +``` + +These render diff hunks with proper CSS classes for syntax coloring (`.add`, +`.del`, `.hunk`). + +## Error Pages + +```c +void cgit_print_error(const char *msg); +void cgit_print_error_page(int code, const char *msg, const char *fmt, ...); +``` + +`cgit_print_error_page()` sets the HTTP status code and wraps the error +message in a full page layout. + +## Encoding + +All text output assumes UTF-8. The `Content-Type` header is always +`charset=utf-8`. There is no character set conversion. diff --git a/docs/handbook/cgit/lua-integration.md b/docs/handbook/cgit/lua-integration.md new file mode 100644 index 0000000000..26d605862e --- /dev/null +++ b/docs/handbook/cgit/lua-integration.md @@ -0,0 +1,428 @@ +# cgit — Lua Integration + +## Overview + +cgit supports Lua as an in-process scripting language for content filters. +Lua filters avoid the fork/exec overhead of shell-based filters and have +direct access to cgit's HTML output functions. Lua support is optional and +auto-detected at compile time. + +Source files: `filter.c` (Lua filter implementation), `cgit.mk` (Lua detection).
+ +## Compile-Time Detection + +Lua support is detected by `cgit.mk` using `pkg-config`: + +```makefile +ifndef NO_LUA +LUAPKGS := luajit lua lua5.2 lua5.1 +LUAPKG := $(shell for p in $(LUAPKGS); do \ + $(PKG_CONFIG) --exists $$p 2>/dev/null && echo $$p && break; done) +ifneq ($(LUAPKG),) + CGIT_CFLAGS += -DHAVE_LUA $(shell $(PKG_CONFIG) --cflags $(LUAPKG)) + CGIT_LIBS += $(shell $(PKG_CONFIG) --libs $(LUAPKG)) +endif +endif +``` + +Detection order: `luajit` → `lua` → `lua5.2` → `lua5.1`. + +To disable Lua even when available: + +```bash +make NO_LUA=1 +``` + +The `HAVE_LUA` preprocessor define gates all Lua-related code: + +```c +#ifdef HAVE_LUA +/* Lua filter implementation */ +#else +/* stub: cgit_new_filter() returns NULL for lua: prefix */ +#endif +``` + +## Lua Filter Structure + +```c +struct cgit_lua_filter { + struct cgit_filter base; /* common filter fields */ + char *script_file; /* path to Lua script */ + lua_State *lua_state; /* Lua interpreter state */ +}; +``` + +The `lua_State` is lazily initialized on first use and reused for subsequent +invocations of the same filter. 
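
The lazy-initialization behaviour can be sketched without any Lua headers. In the sketch below, `struct sketch_state` merely stands in for `lua_State`, and `sketch_open()` / `sketch_cleanup()` are hypothetical names; a real build would create the state with `luaL_newstate()` and load the script at the marked point.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for lua_State; purely illustrative. */
struct sketch_state {
	int loaded;
};

struct sketch_lua_filter {
	const char *script_file;        /* path to the script */
	struct sketch_state *state;     /* NULL until the first open */
	int init_count;                 /* how many times a state was created */
};

/* Create the state on first use and reuse it afterwards. 0 on success. */
int sketch_open(struct sketch_lua_filter *f)
{
	assert(f != NULL);

	if (!f->state) {
		f->state = calloc(1, sizeof(*f->state));
		if (!f->state)
			return -1;
		f->state->loaded = 1;   /* stands in for loading the script */
		f->init_count++;
	}
	return 0;
}

/* Release the state once the filter is no longer needed. */
void sketch_cleanup(struct sketch_lua_filter *f)
{
	assert(f != NULL);
	free(f->state);
	f->state = NULL;
}
```

Opening the same filter twice leaves `init_count` at 1, which is the property the paragraph above describes: the interpreter setup cost is paid once per process, not once per invocation.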
+ +## C API Exposed to Lua + +cgit registers these C functions in the Lua environment: + +### `html(str)` + +Writes raw HTML to stdout (no escaping): + +```c +static int lua_html(lua_State *L) +{ + const char *str = luaL_checkstring(L, 1); + html(str); + return 0; +} +``` + +### `html_txt(str)` + +Writes HTML-escaped text: + +```c +static int lua_html_txt(lua_State *L) +{ + const char *str = luaL_checkstring(L, 1); + html_txt(str); + return 0; +} +``` + +### `html_attr(str)` + +Writes attribute-escaped text: + +```c +static int lua_html_attr(lua_State *L) +{ + const char *str = luaL_checkstring(L, 1); + html_attr(str); + return 0; +} +``` + +### `html_url_path(str)` + +Writes a URL-encoded path: + +```c +static int lua_html_url_path(lua_State *L) +{ + const char *str = luaL_checkstring(L, 1); + html_url_path(str); + return 0; +} +``` + +### `html_url_arg(str)` + +Writes a URL-encoded query argument: + +```c +static int lua_html_url_arg(lua_State *L) +{ + const char *str = luaL_checkstring(L, 1); + html_url_arg(str); + return 0; +} +``` + +### `html_include(filename)` + +Includes a file's contents in the output: + +```c +static int lua_html_include(lua_State *L) +{ + const char *filename = luaL_checkstring(L, 1); + html_include(filename); + return 0; +} +``` + +## Lua Filter Lifecycle + +### Initialization + +On first `open()`, the Lua state is created and the script is loaded: + +```c +static int open_lua_filter(struct cgit_filter *base, ...) 
+{ + struct cgit_lua_filter *f = (struct cgit_lua_filter *)base; + + if (!f->lua_state) { + /* Create new Lua state */ + f->lua_state = luaL_newstate(); + luaL_openlibs(f->lua_state); + + /* Register C functions */ + lua_pushcfunction(f->lua_state, lua_html); + lua_setglobal(f->lua_state, "html"); + lua_pushcfunction(f->lua_state, lua_html_txt); + lua_setglobal(f->lua_state, "html_txt"); + lua_pushcfunction(f->lua_state, lua_html_attr); + lua_setglobal(f->lua_state, "html_attr"); + lua_pushcfunction(f->lua_state, lua_html_url_path); + lua_setglobal(f->lua_state, "html_url_path"); + lua_pushcfunction(f->lua_state, lua_html_url_arg); + lua_setglobal(f->lua_state, "html_url_arg"); + lua_pushcfunction(f->lua_state, lua_html_include); + lua_setglobal(f->lua_state, "include"); + + /* Load and execute the script file */ + if (luaL_dofile(f->lua_state, f->script_file)) + die("lua error: %s", + lua_tostring(f->lua_state, -1)); + } + + /* Redirect stdout writes to lua write() function */ + + /* Call filter_open() with filter-specific arguments */ + lua_getglobal(f->lua_state, "filter_open"); + /* push arguments from va_list */ + lua_call(f->lua_state, nargs, 0); + + return 0; +} +``` + +### Data Flow + +While the filter is open, data written to stdout is intercepted via a custom +`write()` function: + +```c +/* The fprintf callback for Lua filters */ +static void lua_fprintf(struct cgit_filter *base, FILE *f, + const char *fmt, ...) 
+{ + struct cgit_lua_filter *lf = (struct cgit_lua_filter *)base; + /* format the string */ + /* call the Lua write() function with the formatted text */ + lua_getglobal(lf->lua_state, "write"); + lua_pushstring(lf->lua_state, buf); + lua_call(lf->lua_state, 1, 0); +} +``` + +### Close + +```c +static int close_lua_filter(struct cgit_filter *base) +{ + struct cgit_lua_filter *f = (struct cgit_lua_filter *)base; + + /* Call filter_close() */ + lua_getglobal(f->lua_state, "filter_close"); + lua_call(f->lua_state, 0, 1); + + /* Get return code */ + int rc = lua_tointeger(f->lua_state, -1); + lua_pop(f->lua_state, 1); + + return rc; +} +``` + +### Cleanup + +```c +static void cleanup_lua_filter(struct cgit_filter *base) +{ + struct cgit_lua_filter *f = (struct cgit_lua_filter *)base; + if (f->lua_state) + lua_close(f->lua_state); +} +``` + +## Lua Script Interface + +### Required Functions + +A Lua filter script must define these functions: + +```lua +function filter_open(...) + -- Called when the filter opens + -- Arguments are filter-type specific +end + +function write(str) + -- Called with content chunks to process + -- Transform and output using html() functions +end + +function filter_close() + -- Called when filtering is complete + return 0 -- return exit code +end +``` + +### Available Global Functions + +| Function | Description | +|----------|-------------| +| `html(str)` | Output raw HTML | +| `html_txt(str)` | Output HTML-escaped text | +| `html_attr(str)` | Output attribute-escaped text | +| `html_url_path(str)` | Output URL-path-encoded text | +| `html_url_arg(str)` | Output URL-argument-encoded text | +| `include(filename)` | Include file contents in output | + +All standard Lua libraries are available (`string`, `table`, `math`, `io`, +`os`, etc.). 
+ +## Example Filters + +### Source Highlighting Filter + +```lua +-- syntax-highlighting.lua +local filename = "" +local buffer = {} + +function filter_open(fn) + filename = fn + buffer = {} +end + +function write(str) + table.insert(buffer, str) +end + +function filter_close() + local content = table.concat(buffer) + local ext = filename:match("%.(%w+)$") or "" + + -- Simple keyword highlighting + local keywords = { + ["function"] = true, ["local"] = true, + ["if"] = true, ["then"] = true, + ["end"] = true, ["return"] = true, + ["for"] = true, ["while"] = true, + ["do"] = true, ["else"] = true, + } + + html("<pre>") + for line in content:gmatch("([^\n]*)\n?") do + html_txt(line) + html("\n") + end + html("</pre>") + + return 0 +end +``` + +### Email Obfuscation Filter + +```lua +-- email-obfuscate.lua +function filter_open(email, page) + -- email = the email address + -- page = current page name +end + +function write(str) + -- Replace @ with [at] for display + local obfuscated = str:gsub("@", " [at] ") + html_txt(obfuscated) +end + +function filter_close() + return 0 +end +``` + +### About/README Filter + +```lua +-- about-markdown.lua +local buffer = {} + +function filter_open(filename) + buffer = {} +end + +function write(str) + table.insert(buffer, str) +end + +function filter_close() + local content = table.concat(buffer) + -- Process markdown (using a Lua markdown library) + -- or shell out to a converter. io.popen() is one-directional, so the + -- content goes through a temporary file and the pipe is opened for reading. + local tmp = os.tmpname() + local f = assert(io.open(tmp, "w")) + f:write(content) + f:close() + local handle = assert(io.popen("cmark " .. tmp, "r")) + local result = handle:read("*a") + handle:close() + os.remove(tmp) + html(result) + return 0 +end +``` + +### Auth Filter (Lua) + +```lua +-- auth.lua +-- The auth filter receives 12 arguments +function filter_open(cookie, method, query, referer, path, + host, https, repo, page, accept, phase) + if phase == "cookie" then + -- Validate session cookie + if valid_session(cookie) then + return 0 -- authenticated + end + return 1 -- not authenticated + elseif phase == "post" then + -- Handle login form submission + elseif phase == "authorize" then + -- Check repository access + end +end + +function write(str) + html(str) +end + +function filter_close() + return 0 +end +``` + +## Performance + +Lua filters offer significant performance advantages over exec filters: + +| Aspect | Exec Filter | Lua Filter | +|--------|-------------|------------| +| Startup | fork() + exec() per request | One-time Lua state creation | +| Process | New process per invocation | In-process | +| Memory | Separate address space | Shared memory | +| Latency | ~1-5ms fork overhead | ~0.01ms function call | +| Libraries | Any language | Lua libraries only | + +## Limitations + +- Lua scripts run in the same process as cgit, so a crash in the script +
crashes cgit +- Standard Lua I/O functions (`print`, `io.write`) bypass cgit's output + pipeline — use `html()` and friends instead +- The Lua state persists between invocations within the same CGI process, + but CGI processes are typically short-lived +- Error handling is via `die()` — a Lua error terminates the CGI process + +## Configuration + +```ini +# Use Lua filter for source highlighting +source-filter=lua:/usr/share/cgit/filters/syntax-highlight.lua + +# Use Lua filter for about pages +about-filter=lua:/usr/share/cgit/filters/about-markdown.lua + +# Use Lua filter for authentication +auth-filter=lua:/usr/share/cgit/filters/simple-hierarchical-auth.lua + +# Use Lua filter for email display +email-filter=lua:/usr/share/cgit/filters/email-libravatar.lua +``` diff --git a/docs/handbook/cgit/overview.md b/docs/handbook/cgit/overview.md new file mode 100644 index 0000000000..bb09d33e8b --- /dev/null +++ b/docs/handbook/cgit/overview.md @@ -0,0 +1,262 @@ +# cgit — Overview + +## What Is cgit? + +cgit is a fast, lightweight web frontend for Git repositories, implemented as a +CGI application written in C. It links directly against libgit (the C library +that forms the core of the `git` command-line tool), giving it native access to +repository objects without spawning external processes for every request. This +design makes cgit one of the fastest Git web interfaces available. + +The Project Tick fork carries version `0.0.5-1-Project-Tick` (defined in the +top-level `Makefile` as `CGIT_VERSION`). It builds against Git 2.46.0 and +extends the upstream cgit with features such as subtree display, SPDX license +detection, badge support, Code of Conduct / CLA pages, root links, and an +enhanced summary page with repository metadata. 
+ +## Key Design Goals + +| Goal | How cgit achieves it | +|------|---------------------| +| **Speed** | Direct libgit linkage; file-based response cache; `sendfile()` on Linux | +| **Security** | `GIT_CONFIG_NOSYSTEM=1` set at load time; HTML entity escaping in every output function; directory-traversal guards; auth-filter framework | +| **Simplicity** | Single CGI binary; flat config file (`cgitrc`); no database requirement | +| **Extensibility** | Pluggable filter system (exec / Lua) for about, commit, source, email, owner, and auth content | + +## Source File Map + +The entire cgit source tree lives in `cgit/`. Every `.c` file has a matching +`.h` (with a few exceptions such as `shared.c` and `parsing.c` which declare +their interfaces in `cgit.h`). + +### Core files + +| File | Purpose | +|------|---------| +| `cgit.h` | Master header — includes libgit headers; defines all major types (`cgit_repo`, `cgit_config`, `cgit_query`, `cgit_context`, etc.) and function prototypes | +| `cgit.c` | Entry point — `prepare_context()`, `config_cb()`, `querystring_cb()`, `process_request()`, `main()` | +| `shared.c` | Global variables (`cgit_repolist`, `ctx`); repo management (`cgit_add_repo`, `cgit_get_repoinfo`); diff helpers; parsing helpers | +| `parsing.c` | Commit/tag parsing (`cgit_parse_commit`, `cgit_parse_tag`, `cgit_parse_url`) | +| `cmd.c` | Command dispatch table — maps URL page names to handler functions | +| `cmd.h` | `struct cgit_cmd` definition; `cgit_get_cmd()` prototype | +| `configfile.c` | Generic `name=value` config parser (`parse_configfile`) | +| `configfile.h` | `configfile_value_fn` typedef; `parse_configfile` prototype | + +### Infrastructure files + +| File | Purpose | +|------|---------| +| `cache.c` / `cache.h` | File-based response cache — FNV-1 hashing, slot open/lock/fill/unlock cycle | +| `filter.c` | Filter framework — exec filters (fork/exec), Lua filters (`luaL_newstate`) | +| `html.c` / `html.h` | HTML output primitives — entity 
escaping, URL encoding, form helpers | +| `scan-tree.c` / `scan-tree.h` | Filesystem repository scanning — `scan_tree()`, `scan_projects()` | + +### UI modules (`ui-*.c` / `ui-*.h`) + +| Module | Page | Handler function | +|--------|------|-----------------| +| `ui-repolist` | `repolist` | `cgit_print_repolist()` | +| `ui-summary` | `summary` | `cgit_print_summary()` | +| `ui-log` | `log` | `cgit_print_log()` | +| `ui-commit` | `commit` | `cgit_print_commit()` | +| `ui-diff` | `diff` | `cgit_print_diff()` | +| `ui-tree` | `tree` | `cgit_print_tree()` | +| `ui-blob` | `blob` | `cgit_print_blob()` | +| `ui-refs` | `refs` | `cgit_print_refs()` | +| `ui-tag` | `tag` | `cgit_print_tag()` | +| `ui-snapshot` | `snapshot` | `cgit_print_snapshot()` | +| `ui-plain` | `plain` | `cgit_print_plain()` | +| `ui-blame` | `blame` | `cgit_print_blame()` | +| `ui-patch` | `patch` | `cgit_print_patch()` | +| `ui-atom` | `atom` | `cgit_print_atom()` | +| `ui-clone` | `HEAD` / `info` / `objects` | `cgit_clone_head()`, `cgit_clone_info()`, `cgit_clone_objects()` | +| `ui-stats` | `stats` | `cgit_show_stats()` | +| `ui-ssdiff` | (helper) | Side-by-side diff rendering via LCS algorithm | +| `ui-shared` | (helper) | HTTP headers, HTML page skeleton, link generation | + +### Static assets + +| File | Description | +|------|-------------| +| `cgit.css` | Default stylesheet | +| `cgit.js` | Client-side JavaScript (e.g. tree filtering) | +| `cgit.png` | Default logo | +| `favicon.ico` | Default favicon | +| `robots.txt` | Default robots file | + +## Core Data Structures + +All major types are defined in `cgit.h`. 
The single global +`struct cgit_context ctx` (declared in `shared.c`) holds the entire request +state: + +```c +struct cgit_context { + struct cgit_environment env; /* CGI environment variables */ + struct cgit_query qry; /* Parsed query/URL parameters */ + struct cgit_config cfg; /* Global configuration */ + struct cgit_repo *repo; /* Currently selected repository (or NULL) */ + struct cgit_page page; /* HTTP response metadata */ +}; +``` + +### `struct cgit_repo` + +Represents a single Git repository. Key fields: + +```c +struct cgit_repo { + char *url; /* URL-visible name (e.g. "myproject") */ + char *name; /* Display name */ + char *basename; /* Last path component */ + char *path; /* Filesystem path to .git directory */ + char *desc; /* Description string */ + char *owner; /* Repository owner */ + char *defbranch; /* Default branch (NULL → guess from HEAD) */ + char *section; /* Section for grouped display */ + char *clone_url; /* Clone URL override */ + char *homepage; /* Project homepage URL */ + struct string_list readme; /* README file references */ + struct string_list badges; /* Badge image URLs */ + int snapshots; /* Bitmask of enabled snapshot formats */ + int enable_blame; /* Whether blame view is enabled */ + int enable_commit_graph;/* Whether commit graph is shown in log */ + int enable_subtree; /* Whether subtree detection is enabled */ + int max_stats; /* Stats period index (0=disabled) */ + int hide; /* 1 = hidden from listing */ + int ignore; /* 1 = completely ignored */ + struct cgit_filter *about_filter; /* Per-repo about filter */ + struct cgit_filter *source_filter; /* Per-repo source highlighting */ + struct cgit_filter *email_filter; /* Per-repo email filter */ + struct cgit_filter *commit_filter; /* Per-repo commit message filter */ + struct cgit_filter *owner_filter; /* Per-repo owner filter */ + /* ... 
*/ +}; +``` + +### `struct cgit_query` + +Holds all parsed URL/query-string parameters: + +```c +struct cgit_query { + int has_symref, has_oid, has_difftype; + char *raw; /* Raw query string */ + char *repo; /* Repository URL */ + char *page; /* Page name (log, commit, diff, ...) */ + char *search; /* Search query (q=) */ + char *grep; /* Search type (qt=) */ + char *head; /* Branch/ref (h=) */ + char *oid, *oid2; /* Object IDs (id=, id2=) */ + char *path; /* Path within repository */ + char *name; /* Snapshot filename */ + int ofs; /* Pagination offset */ + int showmsg; /* Show full commit messages in log */ + diff_type difftype; /* DIFF_UNIFIED / DIFF_SSDIFF / DIFF_STATONLY */ + int context; /* Diff context lines */ + int ignorews; /* Ignore whitespace in diffs */ + int follow; /* Follow renames in log */ + char *vpath; /* Virtual path (set by cmd dispatch) */ + /* ... */ +}; +``` + +## Request Lifecycle + +1. **Environment setup** — The `constructor_environment()` function runs before + `main()` (via `__attribute__((constructor))`). It sets + `GIT_CONFIG_NOSYSTEM=1` and `GIT_ATTR_NOSYSTEM=1`, then unsets `HOME` and + `XDG_CONFIG_HOME` to prevent Git from reading user/system configurations. + +2. **Context initialization** — `prepare_context()` zeroes out `ctx` and sets + all configuration defaults (cache sizes, TTLs, feature flags, etc.). CGI + environment variables are read from `getenv()`. + +3. **Configuration parsing** — `parse_configfile()` reads the cgitrc file + (default `/etc/cgitrc`, overridable via `$CGIT_CONFIG`) and calls + `config_cb()` for each `name=value` pair. Repository definitions begin with + `repo.url=` and subsequent `repo.*` directives configure that repository. + +4. **Query parsing** — If running in CGI mode (no `$NO_HTTP`), + `http_parse_querystring()` breaks the query string into name/value pairs and + passes them to `querystring_cb()`. 
The `url=` parameter is further parsed by + `cgit_parse_url()` which splits it into repo, page, and path components. + +5. **Authentication** — `authenticate_cookie()` checks whether an `auth-filter` + is configured. If so, it invokes the filter with function + `"authenticate-cookie"` and sets `ctx.env.authenticated` from the filter's + exit code. POST requests to `/?p=login` route through + `authenticate_post()` instead. + +6. **Cache lookup** — If caching is enabled (`cache-size > 0`), a cache key is + constructed from the URL and passed to `cache_process()`. On a cache hit the + stored response is sent directly via `sendfile()`. On a miss, stdout is + redirected to a lock file and the request proceeds through normal processing. + +7. **Command dispatch** — `cgit_get_cmd()` looks up `ctx.qry.page` in the + static `cmds[]` table (defined in `cmd.c`). If the command requires a + repository (`want_repo == 1`), the repository is initialized via + `prepare_repo_env()` and `prepare_repo_cmd()`. + +8. **Page rendering** — The matched command's handler function is called. Each + handler uses `cgit_print_http_headers()`, `cgit_print_docstart()`, + `cgit_print_pageheader()`, and `cgit_print_docend()` (from `ui-shared.c`) + to frame their output inside a proper HTML document. + +9. **Cleanup** — `cgit_cleanup_filters()` reaps all filter resources (closing + Lua states, freeing argv arrays). + +## Version String + +The version is compiled into the binary via: + +```makefile +CGIT_VERSION = 0.0.5-1-Project-Tick +``` + +and exposed as the global: + +```c +const char *cgit_version = CGIT_VERSION; +``` + +This string appears in the HTML footer (rendered by `ui-shared.c`) and in patch +output trailers. + +## Relationship to Git + +cgit is built *inside* the Git source tree. 
The `Makefile` downloads +Git 2.46.0, extracts it as a `git/` subdirectory, then calls `make -C git -f +../cgit.mk` which includes Git's own `Makefile` to inherit all build variables, +object files, and linker flags. The resulting `cgit` binary is a statically +linked combination of cgit's own object files and libgit. + +## Time Constants + +`cgit.h` defines convenience macros used for relative date display: + +```c +#define TM_MIN 60 +#define TM_HOUR (TM_MIN * 60) +#define TM_DAY (TM_HOUR * 24) +#define TM_WEEK (TM_DAY * 7) +#define TM_YEAR (TM_DAY * 365) +#define TM_MONTH (TM_YEAR / 12.0) +``` + +These are used by `cgit_print_age()` in `ui-shared.c` to render "2 hours ago" +style timestamps. + +## Default Encoding + +```c +#define PAGE_ENCODING "UTF-8" +``` + +All commit messages are re-encoded to UTF-8 before display (see +`cgit_parse_commit()` in `parsing.c`). + +## License + +cgit is licensed under the GNU General Public License v2. The `COPYING` file +in the cgit directory contains the full text. diff --git a/docs/handbook/cgit/repository-discovery.md b/docs/handbook/cgit/repository-discovery.md new file mode 100644 index 0000000000..9b961e74cf --- /dev/null +++ b/docs/handbook/cgit/repository-discovery.md @@ -0,0 +1,355 @@ +# cgit — Repository Discovery + +## Overview + +cgit discovers repositories through two mechanisms: explicit `repo.url=` +entries in the configuration file, and automatic filesystem scanning via +`scan-path`. The scan-tree subsystem recursively searches directories for +git repositories and auto-configures them. + +Source files: `scan-tree.c`, `scan-tree.h`, `shared.c` (repository list management). 
+ +## Manual Repository Configuration + +Repositories can be explicitly defined in the cgitrc file: + +```ini +repo.url=myproject +repo.path=/srv/git/myproject.git +repo.desc=My project description +repo.owner=Alice +``` + +Each `repo.url=` triggers `cgit_add_repo()` in `shared.c`, which creates a +new `cgit_repo` entry in the global repository list. + +### `cgit_add_repo()` + +```c +struct cgit_repo *cgit_add_repo(const char *url) +{ + struct cgit_repo *ret; + /* grow the repo array if needed */ + if (cgit_repolist.count >= cgit_repolist.length) { + /* realloc with doubled capacity */ + } + ret = &cgit_repolist.repos[cgit_repolist.count++]; + /* initialize with defaults from ctx.cfg */ + ret->url = xstrdup(url); + ret->name = ret->url; + ret->path = NULL; + ret->desc = cgit_default_repo_desc; + ret->owner = NULL; + ret->section = ctx.cfg.section; + ret->snapshots = ctx.cfg.snapshots; + /* ... inherit all global defaults ... */ + return ret; +} +``` + +## Repository Lookup + +```c +struct cgit_repo *cgit_get_repoinfo(const char *url) +{ + int i; + for (i = 0; i < cgit_repolist.count; i++) { + if (!strcmp(cgit_repolist.repos[i].url, url)) + return &cgit_repolist.repos[i]; + } + return NULL; +} +``` + +This is a linear scan — adequate for typical installations with dozens to +hundreds of repositories. + +## Filesystem Scanning: `scan-path` + +The `scan-path` configuration directive triggers automatic repository +discovery. When encountered in the config file, `scan_tree()` or +`scan_projects()` is called immediately. + +### `scan_tree()` + +```c +void scan_tree(const char *path, repo_config_fn fn) +``` + +Recursively scans `path` for git repositories: + +```c +static void scan_path(const char *base, const char *path, repo_config_fn fn) +{ + DIR *dir; + struct dirent *ent; + + dir = opendir(path); + if (!dir) return; + + while ((ent = readdir(dir)) != NULL) { + /* skip "." and ".." 
*/ + /* skip hidden directories unless scan-hidden-path=1 */ + + if (is_git_dir(fullpath)) { + /* found a bare repository */ + add_repo(base, fullpath, fn); + } else if (is_git_dir(fullpath + "/.git")) { + /* found a non-bare repository */ + add_repo(base, fullpath + "/.git", fn); + } else { + /* recurse into subdirectory */ + scan_path(base, fullpath, fn); + } + } + closedir(dir); +} +``` + +### Git Directory Detection: `is_git_dir()` + +```c +static int is_git_dir(const char *path) +{ + struct stat st; + struct strbuf pathbuf = STRBUF_INIT; + + /* check for path/HEAD */ + strbuf_addf(&pathbuf, "%s/HEAD", path); + if (stat(pathbuf.buf, &st)) { + strbuf_release(&pathbuf); + return 0; + } + + /* check for path/objects */ + strbuf_reset(&pathbuf); + strbuf_addf(&pathbuf, "%s/objects", path); + if (stat(pathbuf.buf, &st) || !S_ISDIR(st.st_mode)) { + strbuf_release(&pathbuf); + return 0; + } + + /* check for path/refs */ + strbuf_reset(&pathbuf); + strbuf_addf(&pathbuf, "%s/refs", path); + if (stat(pathbuf.buf, &st) || !S_ISDIR(st.st_mode)) { + strbuf_release(&pathbuf); + return 0; + } + + strbuf_release(&pathbuf); + return 1; +} +``` + +A directory is considered a git repository if it contains a `HEAD` entry and +the `objects/` and `refs/` subdirectories.
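
The check described above (a `HEAD` entry plus `objects/` and `refs/` directories) can also be sketched without libgit's `strbuf`. This is only an illustration using POSIX `stat()` and a fixed-size buffer; the name `looks_like_git_dir()` and the 4096-byte limit are assumptions made for the example, not part of cgit.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical standalone variant of the detection logic (not cgit code).
 * HEAD only has to exist; objects/ and refs/ must be directories.
 * Returns 1 on a match, 0 otherwise (including over-long paths).
 */
int looks_like_git_dir(const char *path)
{
	char buf[4096];
	struct stat st;
	int n;

	n = snprintf(buf, sizeof(buf), "%s/HEAD", path);
	if (n < 0 || (size_t)n >= sizeof(buf) || stat(buf, &st) != 0)
		return 0;

	n = snprintf(buf, sizeof(buf), "%s/objects", path);
	if (n < 0 || (size_t)n >= sizeof(buf) ||
	    stat(buf, &st) != 0 || !S_ISDIR(st.st_mode))
		return 0;

	n = snprintf(buf, sizeof(buf), "%s/refs", path);
	if (n < 0 || (size_t)n >= sizeof(buf) ||
	    stat(buf, &st) != 0 || !S_ISDIR(st.st_mode))
		return 0;

	return 1;
}
```

Calling `looks_like_git_dir("/srv/git/myproject.git")` would return 1 for a bare repository at that (example) path and 0 for an ordinary directory.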
+ +### Repository Registration: `add_repo()` + +When a git directory is found, `add_repo()` creates a repository entry: + +```c +static void add_repo(const char *base, const char *path, repo_config_fn fn) +{ + /* derive URL from path relative to base */ + /* strip .git suffix if remove-suffix is set */ + struct cgit_repo *repo = cgit_add_repo(url); + repo->path = xstrdup(path); + + /* read gitweb config from the repo */ + if (ctx.cfg.enable_git_config) { + char *gitconfig = fmt("%s/config", path); + parse_configfile(gitconfig, gitconfig_config); + } + + /* read owner from filesystem */ + if (!repo->owner) { + /* stat the repo dir and lookup uid owner */ + struct stat st; + if (!stat(path, &st)) { + struct passwd *pw = getpwuid(st.st_uid); + if (pw) + repo->owner = xstrdup(pw->pw_name); + } + } + + /* read description from description file */ + if (!repo->desc) { + char *descfile = fmt("%s/description", path); + /* read first line */ + } +} +``` + +### Git Config Integration: `gitconfig_config()` + +When `enable-git-config=1`, each discovered repository's `.git/config` is +parsed for metadata: + +```c +static int gitconfig_config(const char *key, const char *value) +{ + if (!strcmp(key, "gitweb.owner")) + repo_config(repo, "owner", value); + else if (!strcmp(key, "gitweb.description")) + repo_config(repo, "desc", value); + else if (!strcmp(key, "gitweb.category")) + repo_config(repo, "section", value); + else if (!strcmp(key, "gitweb.homepage")) + repo_config(repo, "homepage", value); + else if (skip_prefix(key, "cgit.", &name)) + repo_config(repo, name, value); + return 0; +} +``` + +This is compatible with gitweb's configuration keys and also supports +cgit-specific `cgit.*` keys. + +## Project List Scanning: `scan_projects()` + +```c +void scan_projects(const char *path, const char *projectsfile, + repo_config_fn fn) +``` + +Instead of recursively scanning a directory, reads a text file listing +project paths (one per line). 
Each path is appended to the base path and +checked with `is_git_dir()`. + +This is useful for large installations where full recursive scanning is too +slow. + +```ini +project-list=/etc/cgit/projects.list +scan-path=/srv/git +``` + +The `projects.list` file contains relative paths: + +``` +myproject.git +team/frontend.git +team/backend.git +``` + +## Section Derivation + +When `section-from-path` is set, repository sections are automatically +derived from the directory structure: + +| Value | Behavior | +|-------|----------| +| `0` | No auto-sectioning | +| `1` | First path component becomes section | +| `2` | First two components become section | +| `-1` | Last component becomes section | + +Example with `section-from-path=1` and `scan-path=/srv/git`: + +``` +/srv/git/team/project.git → section="team" +/srv/git/personal/test.git → section="personal" +``` + +## Age File + +The modification time of a repository is determined by: + +1. The `agefile` (default: `info/web/last-modified`) — if this file exists + in the repository, its contents (a date string) or modification time is + used +2. Otherwise, the mtime of the loose `refs/` directory +3. 
As a fallback, the repository directory's own mtime + +```c +static time_t read_agefile(const char *path) +{ + FILE *f; + static char buf[64]; + + f = fopen(path, "r"); + if (!f) + return -1; + if (fgets(buf, sizeof(buf), f)) { + fclose(f); + return parse_date(buf, NULL); + } + fclose(f); + /* fallback to file mtime */ + struct stat st; + if (!stat(path, &st)) + return st.st_mtime; + return 0; +} +``` + +## Repository List Management + +The global repository list is a dynamically-sized array: + +```c +struct cgit_repolist { + int count; + int length; /* allocated capacity */ + struct cgit_repo *repos; +}; + +struct cgit_repolist cgit_repolist; +``` + +### Sorting + +The repository list can be sorted by different criteria: + +```c +static int cmp_name(const void *a, const void *b); /* by name */ +static int cmp_section(const void *a, const void *b); /* by section */ +static int cmp_idle(const void *a, const void *b); /* by age */ +``` + +Sorting is controlled by the `repository-sort` directive and the `s` query +parameter. + +## Repository Visibility + +Two directives control repository visibility: + +| Directive | Effect | +|-----------|--------| +| `repo.hide=1` | Repository is hidden from the index but accessible by URL | +| `repo.ignore=1` | Repository is completely ignored | + +Additionally, `strict-export` restricts export to repositories containing a +specific file (e.g., `git-daemon-export-ok`): + +```ini +strict-export=git-daemon-export-ok +``` + +## Scan Path Caching + +Scanning large directory trees can be slow. The `cache-scanrc-ttl` directive +controls how long scan results are cached: + +```ini +cache-scanrc-ttl=15 # cache scan results for 15 minutes +``` + +When caching is enabled, the scan is performed only when the cached result +expires. 
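
As a rough sketch of the expiry rule described above, the following helper decides whether a cached scan result needs to be regenerated. It assumes the cached result is a regular file whose modification time records when the scan ran. The name `scan_cache_expired()` is hypothetical, and treating a negative TTL as "never expires" is an assumption of this example rather than something stated in this document.

```c
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper (not part of cgit): returns 1 when the cached
 * scan result at `path` is missing or older than `ttl_minutes`,
 * and 0 when it is still fresh.
 */
int scan_cache_expired(const char *path, int ttl_minutes, time_t now)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return 1;       /* no cache file yet: a scan is required */
	if (ttl_minutes < 0)
		return 0;       /* assumption: negative TTL means no expiry */
	return (now - st.st_mtime) > (time_t)ttl_minutes * 60;
}
```

With `cache-scanrc-ttl=15`, a caller would pass `15` and `time(NULL)`, and rescan only when the function returns 1.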
+ +## Configuration Reference + +| Directive | Default | Description | +|-----------|---------|-------------| +| `scan-path` | (none) | Directory to scan for repos | +| `project-list` | (none) | File listing project paths | +| `enable-git-config` | 0 | Read repo metadata from git config | +| `scan-hidden-path` | 0 | Include hidden directories in scan | +| `remove-suffix` | 0 | Strip `.git` suffix from URLs | +| `section-from-path` | 0 | Auto-derive section from path | +| `strict-export` | (none) | Required file for repo visibility | +| `agefile` | `info/web/last-modified` | File checked for repo age | +| `cache-scanrc-ttl` | 15 | TTL for cached scan results (minutes) | diff --git a/docs/handbook/cgit/snapshot-system.md b/docs/handbook/cgit/snapshot-system.md new file mode 100644 index 0000000000..bb39047f48 --- /dev/null +++ b/docs/handbook/cgit/snapshot-system.md @@ -0,0 +1,246 @@ +# cgit — Snapshot System + +## Overview + +cgit can generate downloadable source archives (snapshots) from any git +reference. Supported formats include tar, compressed tar variants, and zip. +The snapshot system validates requests against a configured format mask and +delegates archive generation to the git archive API. + +Source file: `ui-snapshot.c`, `ui-snapshot.h`. 
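
The format mask mentioned above can be illustrated with a small, self-contained sketch. It is not cgit's implementation: the names `parse_snapshots_mask()` and `snapshot_enabled()` are hypothetical, and the bit values simply mirror the format table given in this document. The sketch accepts suffixes with or without the leading dot, which is an assumption based on the `snapshots=tar.gz zip` style used in the configuration examples.

```c
#include <string.h>

/* Illustrative copy of the suffix/bit pairs from the format table. */
struct snapshot_format {
	const char *suffix;
	int bit;
};

static const struct snapshot_format formats[] = {
	{ ".zip",     0x01 },
	{ ".tar.gz",  0x02 },
	{ ".tar.bz2", 0x04 },
	{ ".tar",     0x08 },
	{ ".tar.xz",  0x10 },
	{ ".tar.zst", 0x20 },
	{ ".tar.lz",  0x40 },
	{ NULL, 0 }
};

/* Hypothetical parser: turns a whitespace-separated list into a bitmask. */
int parse_snapshots_mask(const char *str)
{
	int mask = 0;

	while (*str) {
		const struct snapshot_format *f;
		size_t len;

		str += strspn(str, " \t");      /* skip separators */
		len = strcspn(str, " \t");      /* length of the next word */
		if (len == 0)
			break;
		if (len == 3 && !strncmp(str, "all", 3)) {
			for (f = formats; f->suffix; f++)
				mask |= f->bit;         /* "all" sets every bit */
		} else {
			for (f = formats; f->suffix; f++) {
				/* compare with or without the leading dot */
				const char *name = f->suffix;
				if (str[0] != '.')
					name++;
				if (strlen(name) == len && !strncmp(str, name, len)) {
					mask |= f->bit;
					break;
				}
			}
		}
		str += len;
	}
	return mask;
}

/* A request is allowed only when the format's bit is set in the mask. */
int snapshot_enabled(int mask, int bit)
{
	return (mask & bit) != 0;
}
```

For example, `parse_snapshots_mask("tar.gz zip")` combines bits `0x02` and `0x01`, so a later request for a `.tar.xz` archive (bit `0x10`) would be rejected by `snapshot_enabled()`.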
+ +## Snapshot Format Table + +All supported formats are defined in `cgit_snapshot_formats[]`: + +```c +const struct cgit_snapshot_format cgit_snapshot_formats[] = { + { ".zip", "application/x-zip", write_zip_archive, 0x01 }, + { ".tar.gz", "application/x-gzip", write_tar_gzip_archive, 0x02 }, + { ".tar.bz2", "application/x-bzip2", write_tar_bzip2_archive, 0x04 }, + { ".tar", "application/x-tar", write_tar_archive, 0x08 }, + { ".tar.xz", "application/x-xz", write_tar_xz_archive, 0x10 }, + { ".tar.zst", "application/x-zstd", write_tar_zstd_archive, 0x20 }, + { ".tar.lz", "application/x-lzip", write_tar_lzip_archive, 0x40 }, + { NULL } +}; +``` + +### Format Structure + +```c +struct cgit_snapshot_format { + const char *suffix; /* file extension */ + const char *mimetype; /* HTTP Content-Type */ + write_archive_fn_t fn; /* archive writer function */ + int bit; /* bitmask flag */ +}; +``` + +### Format Bitmask + +Each format has a power-of-two bit value. The `snapshots` configuration +directive sets a bitmask by OR-ing the bits of enabled formats: + +| Suffix | Bit (hex) | Decimal | +|--------|-----------|---------| +| `.zip` | 0x01 | 1 | +| `.tar.gz` | 0x02 | 2 | +| `.tar.bz2` | 0x04 | 4 | +| `.tar` | 0x08 | 8 | +| `.tar.xz` | 0x10 | 16 | +| `.tar.zst` | 0x20 | 32 | +| `.tar.lz` | 0x40 | 64 | +| all | 0x7F | 127 | + +### Parsing Snapshot Configuration + +`cgit_parse_snapshots_mask()` in `shared.c` converts the configuration +string to a bitmask: + +```c +int cgit_parse_snapshots_mask(const char *str) +{ + int mask = 0; + /* for each word in str */ + /* compare against cgit_snapshot_formats[].suffix */ + /* if match, mask |= format->bit */ + /* "all" enables all formats */ + return mask; +} +``` + +## Snapshot Request Processing + +### Entry Point: `cgit_print_snapshot()` + +```c +void cgit_print_snapshot(const char *head, const char *hex, + const char *prefix, const char *filename, + int snapshots) +``` + +Parameters: +- `head` — Branch/tag reference +- `hex` — Commit SHA +- `prefix` —
Archive prefix (directory name within archive) +- `filename` — Requested filename (e.g., `myrepo-v1.0.tar.gz`) +- `snapshots` — Enabled format bitmask + +### Reference Resolution: `get_ref_from_filename()` + +Decomposes the requested filename into a reference and format: + +```c +static const struct cgit_snapshot_format *get_ref_from_filename( + const char *filename, char **ref) +{ + /* for each format suffix */ + /* if filename ends with suffix */ + /* extract the part before the suffix as the ref */ + /* return the matching format */ + /* strip repo prefix if present */ +} +``` + +Example decomposition: +- `myrepo-v1.0.tar.gz` → ref=`v1.0`, format=`.tar.gz` +- `myrepo-main.zip` → ref=`main`, format=`.zip` +- `myrepo-abc1234.tar.xz` → ref=`abc1234`, format=`.tar.xz` + +The prefix `myrepo-` is the `snapshot-prefix` (defaults to the repo basename). + +### Validation + +Before generating an archive, the function validates: + +1. **Format enabled**: The format's bit must be set in the snapshot mask +2. **Reference exists**: The ref must resolve to a valid git object +3. **Object type**: Must be a commit, tag, or tree + +### Archive Generation: `write_archive_type()` + +```c +static int write_archive_type(const char *format, const char *hex, + const char *prefix) +{ + struct archiver_args args; + memset(&args, 0, sizeof(args)); + args.base = prefix; /* directory prefix in archive */ + /* resolve hex to tree object */ + /* call write_archive() from libgit */ +} +``` + +The actual archive creation is delegated to Git's `write_archive()` API, +which handles tar and zip generation natively. + +### Compression Pipeline + +For compressed formats, the archive data is piped through compression: + +```c +static int write_tar_gzip_archive(/* ... */) +{ + /* pipe tar output through gzip compression */ +} + +static int write_tar_bzip2_archive(/* ... */) +{ + /* pipe tar output through bzip2 compression */ +} + +static int write_tar_xz_archive(/* ... 
*/) +{ + /* pipe tar output through xz compression */ +} + +static int write_tar_zstd_archive(/* ... */) +{ + /* pipe tar output through zstd compression */ +} + +static int write_tar_lzip_archive(/* ... */) +{ + /* pipe tar output through lzip compression */ +} +``` + +## HTTP Response + +Snapshot responses include: + +``` +Content-Type: application/x-gzip +Content-Disposition: inline; filename="myrepo-v1.0.tar.gz" +``` + +The `Content-Disposition` header triggers a file download in browsers with +the correct filename. + +## Snapshot Links + +Snapshot links on repository pages are generated by `ui-shared.c`: + +```c +void cgit_print_snapshot_links(const char *repo, const char *head, + const char *hex, int snapshots) +{ + for (f = cgit_snapshot_formats; f->suffix; f++) { + if (!(snapshots & f->bit)) + continue; + /* generate link: repo/snapshot/prefix-ref.suffix */ + } +} +``` + +These links appear on the summary page and optionally in the log view. + +## Snapshot Prefix + +The archive prefix (directory name inside the archive) is determined by: + +1. `repo.snapshot-prefix` if set +2. Otherwise, the repository basename + +For a request like `myrepo-v1.0.tar.gz`, the archive contains files under +`myrepo-v1.0/`. + +## Signature Detection + +cgit can detect and display signature files alongside snapshots. When a +file matching `.asc` or `.sig` exists in the +repository, a signature link is shown next to the snapshot download. 
+ +## Configuration + +| Directive | Default | Description | +|-----------|---------|-------------| +| `snapshots` | (none) | Space-separated list of enabled suffixes | +| `repo.snapshots` | (inherited) | Per-repo override | +| `repo.snapshot-prefix` | (basename) | Per-repo archive prefix | +| `cache-snapshot-ttl` | 5 min | Cache TTL for snapshot pages | + +### Enabling Snapshots + +```ini +# Global: enable tar.gz and zip for all repos +snapshots=tar.gz zip + +# Per-repo: enable all formats +repo.url=myrepo +repo.snapshots=all + +# Per-repo: disable snapshots +repo.url=internal-tools +repo.snapshots= +``` + +## Security Considerations + +- Snapshots are generated on-the-fly from git objects, so they always reflect + the repository's current state +- Large repositories can produce large archives — consider enabling caching + and setting appropriate `max-blob-size` limits +- Snapshot requests for non-existent refs return a 404 error page +- The snapshot filename is sanitized to prevent path traversal diff --git a/docs/handbook/cgit/testing.md b/docs/handbook/cgit/testing.md new file mode 100644 index 0000000000..ee7b5979f9 --- /dev/null +++ b/docs/handbook/cgit/testing.md @@ -0,0 +1,335 @@ +# cgit — Testing + +## Overview + +cgit has a shell-based test suite in the `tests/` directory. Tests use +Git's own test framework (`test-lib.sh`) and exercise cgit by invoking the +CGI binary with simulated HTTP requests. + +Source location: `cgit/tests/`. + +## Test Framework + +The test harness is built on Git's `test-lib.sh`, sourced from the vendored +Git tree at `git/t/test-lib.sh`. 
This provides: + +- TAP-compatible output +- Test assertions (`test_expect_success`, `test_expect_failure`) +- Temporary directory management (`trash` directories) +- Color-coded pass/fail reporting + +### `setup.sh` + +All test scripts source `tests/setup.sh`, which provides: + +```bash +# Core test helpers +prepare_tests() # Create repos and config file +run_test() # Execute a single test case +cgit_query() # Invoke cgit with a query string +cgit_url() # Invoke cgit with a virtual URL +strip_headers() # Remove HTTP headers from CGI output +``` + +### Invoking cgit + +Tests invoke cgit as a CGI binary by setting environment variables: + +```bash +cgit_query() +{ + CGIT_CONFIG="$PWD/cgitrc" QUERY_STRING="$1" cgit +} + +cgit_url() +{ + CGIT_CONFIG="$PWD/cgitrc" QUERY_STRING="url=$1" cgit +} +``` + +The `cgit` binary is on PATH (prepended by setup.sh). The response includes +HTTP headers followed by HTML content. `strip_headers()` removes the +headers for content-only assertions. + +## Test Repository Setup + +`setup_repos()` creates test repositories: + +```bash +setup_repos() +{ + rm -rf cache + mkdir -p cache + mkrepo repos/foo 5 # 5 commits + mkrepo repos/bar 50 commit-graph # 50 commits with commit-graph + mkrepo repos/foo+bar 10 testplus # 10 commits + special chars + mkrepo "repos/with space" 2 # repo with spaces in name + mkrepo repos/filter 5 testplus # for filter tests +} +``` + +### `mkrepo()` + +```bash +mkrepo() { + name=$1 + count=$2 + test_create_repo "$name" + ( + cd "$name" + n=1 + while test $n -le $count; do + echo $n >file-$n + git add file-$n + git commit -m "commit $n" + n=$(expr $n + 1) + done + case "$3" in + testplus) + echo "hello" >a+b + git add a+b + git commit -m "add a+b" + git branch "1+2" + ;; + commit-graph) + git commit-graph write + ;; + esac + ) +} +``` + +### Test Configuration + +A `cgitrc` file is generated in the test directory with: + +```ini +virtual-root=/ +cache-root=$PWD/cache +cache-size=1021 +snapshots=tar.gz tar.bz 
tar.lz tar.xz tar.zst zip +enable-log-filecount=1 +enable-log-linecount=1 +summary-log=5 +summary-branches=5 +summary-tags=5 +clone-url=git://example.org/$CGIT_REPO_URL.git +enable-filter-overrides=1 +root-coc=$PWD/site-coc.txt +root-cla=$PWD/site-cla.txt +root-homepage=https://projecttick.org +root-homepage-title=Project Tick +root-link=GitHub|https://github.com/example +root-link=GitLab|https://gitlab.com/example +root-link=Codeberg|https://codeberg.org/example + +repo.url=foo +repo.path=$PWD/repos/foo/.git + +repo.url=bar +repo.path=$PWD/repos/bar/.git +repo.desc=the bar repo + +repo.url=foo+bar +repo.path=$PWD/repos/foo+bar/.git +repo.desc=the foo+bar repo +# ... +``` + +## Test Scripts + +### Test File Naming + +Tests follow the convention `tNNNN-description.sh`: + +| Test | Description | +|------|-------------| +| `t0001-validate-git-versions.sh` | Verify Git version compatibility | +| `t0010-validate-html.sh` | Validate HTML output | +| `t0020-validate-cache.sh` | Test cache system | +| `t0101-index.sh` | Repository index page | +| `t0102-summary.sh` | Repository summary page | +| `t0103-log.sh` | Log view | +| `t0104-tree.sh` | Tree view | +| `t0105-commit.sh` | Commit view | +| `t0106-diff.sh` | Diff view | +| `t0107-snapshot.sh` | Snapshot downloads | +| `t0108-patch.sh` | Patch view | +| `t0109-gitconfig.sh` | Git config integration | +| `t0110-rawdiff.sh` | Raw diff output | +| `t0111-filter.sh` | Filter system | +| `t0112-coc.sh` | Code of Conduct page | +| `t0113-cla.sh` | CLA page | +| `t0114-root-homepage.sh` | Root homepage links | + +### Number Ranges + +| Range | Category | +|-------|----------| +| `t0001-t0099` | Infrastructure/validation tests | +| `t0100-t0199` | Feature tests | + +## Running Tests + +### All Tests + +```bash +cd cgit/tests +make +``` + +The Makefile discovers all `t*.sh` files and runs them: + +```makefile +T = $(wildcard t[0-9][0-9][0-9][0-9]-*.sh) + +all: $(T) + +$(T): + @'$(SHELL_PATH_SQ)' $@ $(CGIT_TEST_OPTS) +``` + +### 
Individual Tests + +```bash +# Run a single test +./t0101-index.sh + +# With verbose output +./t0101-index.sh -v + +# With Valgrind +./t0101-index.sh --valgrind +``` + +### Test Options + +Options are passed via `CGIT_TEST_OPTS` or command-line arguments: + +| Option | Description | +|--------|-------------| +| `-v`, `--verbose` | Show test details | +| `--valgrind` | Run cgit under Valgrind | +| `--debug` | Show shell trace | + +### Valgrind Support + +`setup.sh` intercepts the `--valgrind` flag and configures Valgrind +instrumentation via a wrapper script in `tests/valgrind/`: + +```bash +if test -n "$cgit_valgrind"; then + GIT_VALGRIND="$TEST_DIRECTORY/valgrind" + CGIT_VALGRIND=$(cd ../valgrind && pwd) + PATH="$CGIT_VALGRIND/bin:$PATH" +fi +``` + +## Test Patterns + +### HTML Content Assertion + +```bash +run_test 'repo index contains foo' ' + cgit_url "/" | strip_headers | grep -q "foo" +' +``` + +### HTTP Header Assertion + +```bash +run_test 'content type is text/html' ' + cgit_url "/" | head -1 | grep -q "Content-Type: text/html" +' +``` + +### Snapshot Download + +```bash +run_test 'snapshot is valid tar.gz' ' + cgit_url "/foo/snapshot/foo-master.tar.gz" | strip_headers | \ + gunzip | tar tf - >/dev/null +' +``` + +### Negative Assertion + +```bash +run_test 'no 404 on valid repo' ' + ! cgit_url "/foo" | grep -q "404" +' +``` + +### Lua Filter Conditional + +```bash +if [ $CGIT_HAS_LUA -eq 1 ]; then + run_test 'lua filter works' ' + cgit_url "/filter-lua/about/" | strip_headers | grep -q "filtered" + ' +fi +``` + +## Test Filter Scripts + +The `tests/filters/` directory contains simple filter scripts for testing: + +### `dump.sh` + +A passthrough filter that copies stdin to stdout, used to verify filter +invocation: + +```bash +#!/bin/sh +cat +``` + +### `dump.lua` + +Lua equivalent of the dump filter: + +```lua +function filter_open(...) 
+end + +function write(str) + html(str) +end + +function filter_close() + return 0 +end +``` + +## Cleanup + +```bash +cd cgit/tests +make clean +``` + +Removes the `trash` directories created by tests. + +## Writing New Tests + +1. Create a new file `tNNNN-description.sh` +2. Source `setup.sh` and call `prepare_tests`: + +```bash +#!/bin/sh +. ./setup.sh +prepare_tests "my new feature" + +run_test 'description of test case' ' + cgit_url "/foo/my-page/" | strip_headers | grep -q "expected" +' +``` + +3. Make it executable: `chmod +x tNNNN-description.sh` +4. Run: `./tNNNN-description.sh -v` + +## CI Integration + +Tests are run as part of the CI pipeline. The `ci/` directory contains +Nix-based CI configuration that builds cgit and runs the test suite in a +reproducible environment. diff --git a/docs/handbook/cgit/ui-modules.md b/docs/handbook/cgit/ui-modules.md new file mode 100644 index 0000000000..b03a437a35 --- /dev/null +++ b/docs/handbook/cgit/ui-modules.md @@ -0,0 +1,544 @@ +# cgit — UI Modules + +## Overview + +cgit's user interface is implemented as a collection of `ui-*.c` modules, +each responsible for rendering a specific page type. All modules share +common infrastructure from `ui-shared.c` and `html.c`. 
+ +## Module Map + +| Module | Page | Entry Function | +|--------|------|---------------| +| `ui-repolist.c` | Repository index | `cgit_print_repolist()` | +| `ui-summary.c` | Repository summary | `cgit_print_summary()` | +| `ui-log.c` | Commit log | `cgit_print_log()` | +| `ui-tree.c` | File/directory tree | `cgit_print_tree()` | +| `ui-blob.c` | File content | `cgit_print_blob()` | +| `ui-commit.c` | Commit details | `cgit_print_commit()` | +| `ui-diff.c` | Diff view | `cgit_print_diff()` | +| `ui-ssdiff.c` | Side-by-side diff | `cgit_ssdiff_*()` | +| `ui-patch.c` | Patch output | `cgit_print_patch()` | +| `ui-refs.c` | Branch/tag listing | `cgit_print_refs()` | +| `ui-tag.c` | Tag details | `cgit_print_tag()` | +| `ui-stats.c` | Statistics | `cgit_print_stats()` | +| `ui-atom.c` | Atom feed | `cgit_print_atom()` | +| `ui-plain.c` | Raw file serving | `cgit_print_plain()` | +| `ui-blame.c` | Blame view | `cgit_print_blame()` | +| `ui-clone.c` | HTTP clone | `cgit_clone_info/objects/head()` | +| `ui-snapshot.c` | Archive download | `cgit_print_snapshot()` | +| `ui-shared.c` | Common layout | (shared functions) | + +## `ui-repolist.c` — Repository Index + +Renders the main page listing all configured repositories. + +### Functions + +```c +void cgit_print_repolist(void) +``` + +### Features + +- Sortable columns: Name, Description, Owner, Idle (age) +- Section grouping (based on `repo.section` or `section-from-path`) +- Pagination with configurable `max-repo-count` +- Age calculation via `read_agefile()` or ref modification time +- Optional filter by search query + +### Sorting + +```c +static int cmp_name(const void *a, const void *b); +static int cmp_section(const void *a, const void *b); +static int cmp_idle(const void *a, const void *b); +``` + +Sort field is selected by the `s` query parameter or `repository-sort` +directive. 
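
The comparator-based sorting above follows the usual `qsort()` pattern. The sketch below shows that pattern on a simplified stand-in structure. `struct repo_entry`, `sort_entries()`, and the key strings `"name"` and `"idle"` are assumptions made for illustration and do not reproduce cgit's actual `struct cgit_repo` or its dispatch code.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Simplified stand-in for a repository entry; only sortable fields are kept. */
struct repo_entry {
	const char *name;
	const char *section;
	time_t mtime;           /* last modification time */
};

/* Order by name, ascending. */
int cmp_entry_name(const void *a, const void *b)
{
	const struct repo_entry *ra = a, *rb = b;
	return strcmp(ra->name, rb->name);
}

/* Order by idle time: most recently modified entries come first. */
int cmp_entry_idle(const void *a, const void *b)
{
	const struct repo_entry *ra = a, *rb = b;
	if (ra->mtime == rb->mtime)
		return 0;
	return ra->mtime > rb->mtime ? -1 : 1;
}

/* Hypothetical selector: picks a comparator from a sort key string. */
void sort_entries(struct repo_entry *v, size_t n, const char *key)
{
	if (key && !strcmp(key, "idle"))
		qsort(v, n, sizeof(*v), cmp_entry_idle);
	else
		qsort(v, n, sizeof(*v), cmp_entry_name);
}
```

Keeping one comparator per column means a new sort order only needs one more function and one more branch in the selector.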
+ +### Age File Resolution + +```c +static time_t read_agefile(const char *path) +{ + /* Try reading date from agefile content */ + /* Fall back to file mtime */ + /* Fall back to refs/ dir mtime */ +} +``` + +### Pagination + +```c +static void print_pager(int items, int pagelen, char *search, char *sort) +{ + /* Render page navigation links */ + /* [prev] 1 2 3 4 5 [next] */ +} +``` + +## `ui-summary.c` — Repository Summary + +Renders the overview page for a single repository. + +### Functions + +```c +void cgit_print_summary(void) +``` + +### Content + +- Repository metadata table (description, owner, homepage, clone URLs) +- SPDX license detection from `LICENSES/` directory +- CODEOWNERS and MAINTAINERS file detection +- Badges display +- Branch listing (limited by `summary-branches`) +- Tag listing (limited by `summary-tags`) +- Recent commits (limited by `summary-log`) +- Snapshot download links +- README rendering (via about-filter) + +### License Detection + +```c +/* Scan for SPDX license identifiers */ +/* Check LICENSES/ directory for .txt files */ +/* Extract license names from filenames */ +``` + +### README Priority + +README files are tried in order of `repo.readme` entries: + +1. `ref:README.md` — tracked file in a specific ref +2. `:README.md` — tracked file in HEAD +3. `/path/to/README.md` — file on disk + +## `ui-log.c` — Commit Log + +Renders a paginated list of commits. 
+ +### Functions + +```c +void cgit_print_log(const char *tip, int ofs, int cnt, + char *grep, char *pattern, char *path, + int pager, int commit_graph, int commit_sort) +``` + +### Features + +- Commit graph visualization (ASCII art) +- File change count per commit (when `enable-log-filecount=1`) +- Line count per commit (when `enable-log-linecount=1`) +- Grep/search within commit messages +- Path filtering (show commits affecting a specific path) +- Follow renames (when `enable-follow-links=1`) +- Pagination with next/prev links + +### Commit Graph Colors + +```c +static const char *column_colors_html[] = { + "", + "", + "", + "", + "", + "", +}; +``` + +### Decorations + +```c +static void show_commit_decorations(struct commit *commit) +{ + /* Display branch/tag labels next to commits */ + /* Uses git's decoration API */ +} +``` + +## `ui-tree.c` — Tree View + +Renders directory listings and file contents. + +### Functions + +```c +void cgit_print_tree(const char *rev, char *path) +``` + +### Directory Listing + +For each entry in a tree object: + +```c +/* For each tree entry */ +switch (entry->mode) { + case S_IFDIR: /* directory → link to subtree */ + case S_IFREG: /* regular file → link to blob */ + case S_IFLNK: /* symlink → show target */ + case S_IFGITLINK: /* submodule → link to submodule */ +} +``` + +### File Display + +```c +static void print_text_buffer(const char *name, char *buf, + unsigned long size) +{ + /* Show file content with line numbers */ + /* Apply source filter if configured */ +} + +static void print_binary_buffer(char *buf, unsigned long size) +{ + /* Show "Binary file (N bytes)" message */ +} +``` + +### Walk Tree Context + +```c +struct walk_tree_context { + char *curr_rev; + char *match_path; + int state; /* 0=searching, 1=found, 2=printed */ +}; +``` + +The tree walker recursively descends into subdirectories to find the +requested path. + +## `ui-blob.c` — Blob View + +Displays individual file content or serves raw file data. 
+ +### Functions + +```c +void cgit_print_blob(const char *hex, char *path, + const char *head, int file_only) +int cgit_ref_path_exists(const char *path, const char *ref, int file_only) +char *cgit_ref_read_file(const char *path, const char *ref, + unsigned long *size) +``` + +### MIME Detection + +When serving raw content, MIME types are detected from: +1. The `mimetype.<ext>` configuration directives +2. The `mimetype-file` (Apache-style mime.types) +3. Default: `application/octet-stream` + +## `ui-commit.c` — Commit View + +Displays full commit details. + +### Functions + +```c +void cgit_print_commit(const char *rev, const char *prefix) +``` + +### Content + +- Author and committer info (name, email, date) +- Commit subject and full message +- Parent commit links +- Git notes +- Commit decorations (branches, tags) +- Diffstat +- Full diff (unified or side-by-side) + +### Notes Display + +```c +/* Check for git notes */ +struct strbuf notes = STRBUF_INIT; +format_display_notes(&commit->object.oid, &notes, ...); +if (notes.len) { + html("
<div class='notes-header'>Notes</div>"); + html("<div class='notes'>"); + html_txt(notes.buf); + html("</div>
"); +} +``` + +## `ui-diff.c` — Diff View + +Renders diffs between commits or trees. + +### Functions + +```c +void cgit_print_diff(const char *new_rev, const char *old_rev, + const char *prefix, int show_ctrls, int raw) +void cgit_print_diffstat(const struct object_id *old, + const struct object_id *new, + const char *prefix) +``` + +See [diff-engine.md](diff-engine.md) for detailed documentation. + +## `ui-ssdiff.c` — Side-by-Side Diff + +Renders two-column diff view with character-level highlighting. + +### Functions + +```c +void cgit_ssdiff_header_begin(void) +void cgit_ssdiff_header_end(void) +void cgit_ssdiff_footer(void) +``` + +See [diff-engine.md](diff-engine.md) for LCS algorithm details. + +## `ui-patch.c` — Patch Output + +Generates a downloadable patch file. + +### Functions + +```c +void cgit_print_patch(const char *new_rev, const char *old_rev, + const char *prefix) +``` + +Output is `text/plain` content suitable for `git apply`. Uses Git's +`rev_info` and `log_tree_commit` to generate the patch. + +## `ui-refs.c` — References View + +Displays branches and tags with sorting. + +### Functions + +```c +void cgit_print_refs(void) +void cgit_print_branches(int max) +void cgit_print_tags(int max) +``` + +### Branch Display + +Each branch row shows: +- Branch name (link to log) +- Idle time +- Author of last commit + +### Tag Display + +Each tag row shows: +- Tag name (link to tag) +- Idle time +- Author/tagger +- Download links (if snapshots enabled) + +### Sorting + +```c +static int cmp_branch_age(const void *a, const void *b); +static int cmp_tag_age(const void *a, const void *b); +static int cmp_branch_name(const void *a, const void *b); +static int cmp_tag_name(const void *a, const void *b); +``` + +Sort order is controlled by `branch-sort` (0=name, 1=age). + +## `ui-tag.c` — Tag View + +Displays details of a specific tag. 
+ +### Functions + +```c +void cgit_print_tag(const char *revname) +``` + +### Content + +For annotated tags: +- Tagger name and date +- Tag message +- Tagged object link + +For lightweight tags: +- Redirects to the tagged object (commit, tree, or blob) + +## `ui-stats.c` — Statistics View + +Displays contributor statistics by period. + +### Functions + +```c +void cgit_print_stats(void) +``` + +### Periods + +```c +struct cgit_period { + const char *name; /* "week", "month", "quarter", "year" */ + int max_periods; + int count; + /* accessor functions for period boundaries */ +}; +``` + +### Data Collection + +```c +static void collect_stats(struct cgit_period *period) +{ + /* Walk commit log */ + /* Group commits by author and period */ + /* Count additions/deletions per period */ +} +``` + +### Output + +- Bar chart showing commits per period +- Author ranking table +- Sortable by commit count + +## `ui-atom.c` — Atom Feed + +Generates an Atom XML feed. + +### Functions + +```c +void cgit_print_atom(char *tip, char *path, int max) +``` + +### Output + +```xml +<?xml version='1.0' encoding='utf-8'?> +<feed xmlns='http://www.w3.org/2005/Atom'> + <title>repo - log</title> + <updated>2024-01-01T00:00:00Z</updated> + <entry> + <title>commit subject</title> + <updated>2024-01-01T00:00:00Z</updated> + <author><name>Alice</name><email>alice@example.com</email></author> + <id>urn:sha1:abc123</id> + <link rel='alternate' type='text/html' href='...'/> + <content type='text'>commit message</content> + </entry> +</feed> +``` + +Limited by `max-atom-items` (default 10). + +## `ui-plain.c` — Raw File Serving + +Serves file content with proper MIME types. + +### Functions + +```c +void cgit_print_plain(void) +``` + +### Features + +- MIME type detection by file extension +- Directory listing (HTML) when path is a tree +- Binary file serving with correct Content-Type +- Security: HTML serving gated by `enable-html-serving` + +### Security + +When `enable-html-serving=0` (default), HTML files are served as +`text/plain` to prevent XSS. + +## `ui-blame.c` — Blame View + +Displays line-by-line blame information. 
+ +### Functions + +```c +void cgit_print_blame(void) +``` + +### Implementation + +Uses Git's `blame_scoreboard` API: + +```c +/* Set up blame scoreboard */ +/* Walk file history */ +/* For each line, emit: commit hash, author, line content */ +``` + +### Output + +Each line shows: +- Abbreviated commit hash (linked to commit view) +- Line number +- File content + +Requires `enable-blame=1`. + +## `ui-clone.c` — HTTP Clone Endpoints + +Serves the dumb HTTP clone protocol. + +### Functions + +```c +void cgit_clone_info(void) /* GET info/refs */ +void cgit_clone_objects(void) /* GET objects/* */ +void cgit_clone_head(void) /* GET HEAD */ +``` + +### `cgit_clone_info()` + +Enumerates all refs and their SHA-1 hashes: + +```c +static void print_ref_info(const char *refname, + const struct object_id *oid, ...) +{ + /* Output: sha1\trefname\n */ +} +``` + +### `cgit_clone_objects()` + +Serves loose objects and pack files from the object store. + +### `cgit_clone_head()` + +Returns the symbolic HEAD reference. + +Requires `enable-http-clone=1` (default). + +## `ui-snapshot.c` — Archive Downloads + +See [snapshot-system.md](snapshot-system.md) for detailed documentation. + +## `ui-shared.c` — Common Infrastructure + +Provides shared layout and link generation used by all modules. + +See [html-rendering.md](html-rendering.md) for detailed documentation. + +### Key Functions + +- Page skeleton: `cgit_print_docstart()`, `cgit_print_pageheader()`, + `cgit_print_docend()` +- Links: `cgit_commit_link()`, `cgit_tree_link()`, `cgit_log_link()`, etc. 
+- URLs: `cgit_repourl()`, `cgit_fileurl()`, `cgit_pageurl()` +- Errors: `cgit_print_error_page()` diff --git a/docs/handbook/cgit/url-routing.md b/docs/handbook/cgit/url-routing.md new file mode 100644 index 0000000000..0adb3b7fc5 --- /dev/null +++ b/docs/handbook/cgit/url-routing.md @@ -0,0 +1,331 @@ +# cgit — URL Routing and Request Dispatch + +## Overview + +cgit supports two URL schemes: virtual-root (path-based) and query-string. +Incoming requests are parsed into a `cgit_query` structure and dispatched to +one of the 21 command handlers in the command table via function pointers. + +Source files: `cgit.c` (querystring parsing, routing), `parsing.c` +(`cgit_parse_url`), `cmd.c` (command table). + +## URL Schemes + +### Virtual Root (Path-Based) + +When `virtual-root` is configured, URLs use clean paths: + +``` +/cgit/ → repository list +/cgit/repo.git/ → summary +/cgit/repo.git/log/ → log (default branch) +/cgit/repo.git/log/main/path → log for path on branch main +/cgit/repo.git/tree/v1.0/src/ → tree view at tag v1.0 +/cgit/repo.git/commit/?id=abc → commit view +``` + +The path after the virtual root is passed in `PATH_INFO` and parsed by +`cgit_parse_url()`. 
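
As a rough model of how the example paths above break down, the following JavaScript sketch splits a virtual-root path into repository, page, and remaining path. It is not cgit code (cgit is written in C): `splitVirtualPath` and `knownRepos` are hypothetical names, and the repository lookup is reduced to a set membership test.

```javascript
// Hypothetical model of virtual-root path decomposition; not cgit code.
// knownRepos stands in for cgit's configured repository list.
function splitVirtualPath(pathInfo, knownRepos) {
  const parts = pathInfo.split('/').filter((s) => s.length > 0)
  // Try progressively longer prefixes as the repository URL.
  for (let i = 1; i <= parts.length; i++) {
    const repo = parts.slice(0, i).join('/')
    if (knownRepos.has(repo)) {
      const rest = parts.slice(i)
      return {
        repo,
        page: rest[0] ?? null, // null means "fall back to the default page"
        path: rest.length > 1 ? rest.slice(1).join('/') : null,
      }
    }
  }
  // No prefix matched a repository.
  return { repo: null, page: null, path: null }
}

const exampleRepos = new Set(['repo.git', 'group/tool.git'])
console.log(splitVirtualPath('repo.git/log/main/path', exampleRepos))
console.log(splitVirtualPath('group/tool.git/tree/v1.0/src/', exampleRepos))
console.log(splitVirtualPath('', exampleRepos))
```

The input here is the part of the URL after the virtual root, so `/cgit/repo.git/log/main/path` with a virtual root of `/cgit/` corresponds to `repo.git/log/main/path`.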
+ +### Query-String (CGI) + +Without virtual root, all parameters are passed in the query string: + +``` +/cgit.cgi?url=repo.git/log/main/path&ofs=50 +``` + +## Query Structure + +All parsed parameters are stored in `ctx.qry`: + +```c +struct cgit_query { + char *raw; /* raw URL / PATH_INFO */ + char *repo; /* repository URL */ + char *page; /* page/command name */ + char *search; /* search string */ + char *grep; /* grep pattern */ + char *head; /* branch reference */ + char *sha1; /* object SHA-1 */ + char *sha2; /* second SHA-1 (for diffs) */ + char *path; /* file/dir path within repo */ + char *name; /* snapshot name / ref name */ + char *url; /* combined URL path */ + char *mimetype; /* requested MIME type */ + char *etag; /* ETag from client */ + int nohead; /* suppress header */ + int ofs; /* pagination offset */ + int has_symref; /* path contains a symbolic ref */ + int has_sha1; /* explicit SHA was given */ + int has_dot; /* path contains '..' */ + int ignored; /* request should be ignored */ + char *sort; /* sort field */ + int showmsg; /* show full commit message */ + int ssdiff; /* side-by-side diff */ + int show_all; /* show all items */ + int context; /* diff context lines */ + int follow; /* follow renames */ + int log_hierarchical_threading; +}; +``` + +## URL Parsing: `cgit_parse_url()` + +In `parsing.c`, the URL is decomposed into repo, page, and path: + +```c +void cgit_parse_url(const char *url) +{ + /* Step 1: try progressively longer prefixes as repo URLs */ + /* For each '/' in the URL, check if the prefix matches a repo */ + + for (p = strchr(url, '/'); p; p = strchr(p + 1, '/')) { + *p = '\0'; + repo = cgit_get_repoinfo(url); + *p = '/'; + if (repo) { + ctx.qry.repo = xstrdup(url_prefix); + ctx.repo = repo; + url = p + 1; /* remaining part */ + break; + } + } + /* if no '/' found, try the whole URL as a repo name */ + + /* Step 2: parse the remaining path as page/ref/path */ + /* e.g., "log/main/src/file.c" → page="log", 
path="main/src/file.c" */ + p = strchr(url, '/'); + if (p) { + ctx.qry.page = xstrndup(url, p - url); + ctx.qry.path = trim_end(p + 1, '/'); + } else if (*url) { + ctx.qry.page = xstrdup(url); + } +} +``` + +## Query String Parsing: `querystring_cb()` + +HTTP query parameters and POST form data are decoded by `querystring_cb()` +in `cgit.c`. The function maps URL parameter names to `ctx.qry` fields: + +```c +static void querystring_cb(const char *name, const char *value) +{ + if (!strcmp(name, "url")) ctx.qry.url = xstrdup(value); + else if (!strcmp(name, "p")) ctx.qry.page = xstrdup(value); + else if (!strcmp(name, "q")) ctx.qry.search = xstrdup(value); + else if (!strcmp(name, "h")) ctx.qry.head = xstrdup(value); + else if (!strcmp(name, "id")) ctx.qry.sha1 = xstrdup(value); + else if (!strcmp(name, "id2")) ctx.qry.sha2 = xstrdup(value); + else if (!strcmp(name, "ofs")) ctx.qry.ofs = atoi(value); + else if (!strcmp(name, "path")) ctx.qry.path = xstrdup(value); + else if (!strcmp(name, "name")) ctx.qry.name = xstrdup(value); + else if (!strcmp(name, "mimetype")) ctx.qry.mimetype = xstrdup(value); + else if (!strcmp(name, "s")) ctx.qry.sort = xstrdup(value); + else if (!strcmp(name, "showmsg")) ctx.qry.showmsg = atoi(value); + else if (!strcmp(name, "ss")) ctx.qry.ssdiff = atoi(value); + else if (!strcmp(name, "all")) ctx.qry.show_all = atoi(value); + else if (!strcmp(name, "context")) ctx.qry.context = atoi(value); + else if (!strcmp(name, "follow")) ctx.qry.follow = atoi(value); + else if (!strcmp(name, "dt")) ctx.qry.dt = atoi(value); + else if (!strcmp(name, "grep")) ctx.qry.grep = xstrdup(value); + else if (!strcmp(name, "etag")) ctx.qry.etag = xstrdup(value); +} +``` + +### URL Parameter Reference + +| Parameter | Query Field | Type | Description | +|-----------|------------|------|-------------| +| `url` | `qry.url` | string | Full URL path (repo/page/path) | +| `p` | `qry.page` | string | Page/command name | +| `q` | `qry.search` | string | Search string | 
+| `h` | `qry.head` | string | Branch/ref name | +| `id` | `qry.sha1` | string | Object SHA-1 | +| `id2` | `qry.sha2` | string | Second SHA-1 (diffs) | +| `ofs` | `qry.ofs` | int | Pagination offset | +| `path` | `qry.path` | string | File path in repo | +| `name` | `qry.name` | string | Reference/snapshot name | +| `mimetype` | `qry.mimetype` | string | MIME type override | +| `s` | `qry.sort` | string | Sort field | +| `showmsg` | `qry.showmsg` | int | Show full commit message | +| `ss` | `qry.ssdiff` | int | Side-by-side diff toggle | +| `all` | `qry.show_all` | int | Show all entries | +| `context` | `qry.context` | int | Diff context lines | +| `follow` | `qry.follow` | int | Follow renames in log | +| `dt` | `qry.dt` | int | Diff type | +| `grep` | `qry.grep` | string | Grep pattern for log search | +| `etag` | `qry.etag` | string | ETag for conditional requests | + +## Command Dispatch Table + +The command table in `cmd.c` maps page names to handler functions: + +```c +#define def_cmd(name, want_hierarchical, want_repo, want_layout, want_vpath, is_clone) \ + {#name, cmd_##name, want_hierarchical, want_repo, want_layout, want_vpath, is_clone} + +static struct cgit_cmd cmds[] = { + def_cmd(atom, 1, 1, 0, 0, 0), + def_cmd(about, 0, 1, 1, 0, 0), + def_cmd(blame, 1, 1, 1, 1, 0), + def_cmd(blob, 1, 1, 0, 0, 0), + def_cmd(commit, 1, 1, 1, 1, 0), + def_cmd(diff, 1, 1, 1, 1, 0), + def_cmd(head, 1, 1, 0, 0, 1), + def_cmd(info, 1, 1, 0, 0, 1), + def_cmd(log, 1, 1, 1, 1, 0), + def_cmd(ls_cache,0, 0, 0, 0, 0), + def_cmd(objects, 1, 1, 0, 0, 1), + def_cmd(patch, 1, 1, 1, 1, 0), + def_cmd(plain, 1, 1, 0, 1, 0), + def_cmd(rawdiff, 1, 1, 0, 1, 0), + def_cmd(refs, 1, 1, 1, 0, 0), + def_cmd(repolist,0, 0, 1, 0, 0), + def_cmd(snapshot, 1, 1, 0, 0, 0), + def_cmd(stats, 1, 1, 1, 1, 0), + def_cmd(summary, 1, 1, 1, 0, 0), + def_cmd(tag, 1, 1, 1, 0, 0), + def_cmd(tree, 1, 1, 1, 1, 0), +}; +``` + +### Command Flags + +| Flag | Meaning | +|------|---------| +| `want_hierarchical` | 
Parse hierarchical path from URL | +| `want_repo` | Requires a repository context | +| `want_layout` | Render within HTML page layout | +| `want_vpath` | Accept a virtual path (file path in repo) | +| `is_clone` | HTTP clone protocol endpoint | + +### Lookup: `cgit_get_cmd()` + +```c +struct cgit_cmd *cgit_get_cmd(const char *name) +{ + for (int i = 0; i < ARRAY_SIZE(cmds); i++) + if (!strcmp(cmds[i].name, name)) + return &cmds[i]; + return NULL; +} +``` + +The function performs a linear search. With 21 entries, this is fast enough. + +## Request Processing Flow + +In `process_request()` (`cgit.c`): + +``` +1. Parse PATH_INFO via cgit_parse_url() +2. Parse QUERY_STRING via http_parse_querystring(querystring_cb) +3. Parse POST body (for authentication forms) +4. Resolve repository: cgit_get_repoinfo(ctx.qry.repo) +5. Determine command: cgit_get_cmd(ctx.qry.page) +6. If no page specified: + - With repo → default to "summary" + - Without repo → default to "repolist" +7. Check command flags: + - want_repo but no repo → "Repository not found" error + - is_clone and HTTP clone disabled → 404 +8. Handle authentication if auth-filter is configured +9. Execute: cmd->fn(&ctx) +``` + +### Hierarchical Path Resolution + +When `want_hierarchical=1`, cgit splits `ctx.qry.path` into a reference +(branch/tag/SHA) and a file path. It tries progressively longer prefixes +of the path as git references until one resolves: + +``` +path = "main/src/lib/file.c" +try: "main" → found branch "main" + qry.head = "main" + qry.path = "src/lib/file.c" +``` + +If no prefix resolves, the entire path is treated as a file path within the +default branch. 
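
The prefix-trial idea described above can be sketched as follows. This is a JavaScript model under stated assumptions, not cgit's C implementation: `splitRefAndPath` is a hypothetical name, and the set `refs` stands in for a real lookup against the repository's branches and tags.

```javascript
// Hypothetical model of hierarchical path resolution; not cgit code.
// refs is a Set of known ref names (branches or tags).
function splitRefAndPath(path, refs) {
  const parts = path.split('/')
  // Try "a", then "a/b", then "a/b/c", ... until a prefix names a ref.
  for (let i = 1; i <= parts.length; i++) {
    const candidate = parts.slice(0, i).join('/')
    if (refs.has(candidate)) {
      return { head: candidate, path: parts.slice(i).join('/') }
    }
  }
  // No prefix resolved: treat everything as a file path on the default branch.
  return { head: null, path }
}

const exampleRefs = new Set(['main', 'release/1.0'])
console.log(splitRefAndPath('main/src/lib/file.c', exampleRefs))
console.log(splitRefAndPath('release/1.0/README', exampleRefs))
console.log(splitRefAndPath('docs/index.md', exampleRefs))
```

Because shorter prefixes are tried first, a ref name that contains slashes (such as `release/1.0` here) is only found once the loop reaches the full name.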
+ +## Clone Protocol Endpoints + +Three commands serve the Git HTTP clone protocol: + +| Endpoint | Path | Function | +|----------|------|----------| +| `info` | `repo/info/refs` | `cgit_clone_info()` — advertise refs | +| `objects` | `repo/objects/*` | `cgit_clone_objects()` — serve packfiles | +| `head` | `repo/HEAD` | `cgit_clone_head()` — serve HEAD ref | + +These are only active when `enable-http-clone=1` (default). + +## URL Generation + +`ui-shared.c` provides URL construction helpers: + +```c +const char *cgit_repourl(const char *reponame); +const char *cgit_fileurl(const char *reponame, const char *pagename, + const char *filename, const char *query); +const char *cgit_pageurl(const char *reponame, const char *pagename, + const char *query); +const char *cgit_currurl(void); +``` + +When `virtual-root` is set, these produce clean paths. Otherwise, they +produce query-string URLs. + +### Example URL generation: + +```c +/* With virtual-root=/cgit/ */ +cgit_repourl("myrepo") + → "/cgit/myrepo/" + +cgit_fileurl("myrepo", "tree", "src/main.c", "h=dev") + → "/cgit/myrepo/tree/src/main.c?h=dev" + +cgit_pageurl("myrepo", "log", "ofs=50") + → "/cgit/myrepo/log/?ofs=50" +``` + +## Content-Type and HTTP Headers + +The response content type is set by the command handler before generating +output. Common types: + +| Page | Content-Type | +|------|-------------| +| HTML pages | `text/html` | +| atom | `text/xml` | +| blob | auto-detected from content | +| plain | MIME type from extension or `application/octet-stream` | +| snapshot | `application/x-gzip`, etc. | +| patch | `text/plain` | +| clone endpoints | `text/plain`, `application/x-git-packed-objects` | + +Headers are emitted by `cgit_print_http_headers()` in `ui-shared.c` before +any page content. + +## Error Handling + +If a requested repository or page is not found, cgit renders an error page +within the standard layout. 
HTTP status codes: + +| Condition | Status | +|-----------|--------| +| Normal page | 200 OK | +| Auth redirect | 302 Found | +| Not modified | 304 Not Modified | +| Bad request | 400 Bad Request | +| Auth required | 401 Unauthorized | +| Repo not found | 404 Not Found | +| Page not found | 404 Not Found | + +The status code is set in `ctx.page.status` and emitted by the HTTP header +function. diff --git a/docs/handbook/ci/branch-strategy.md b/docs/handbook/ci/branch-strategy.md new file mode 100644 index 0000000000..89535c9f54 --- /dev/null +++ b/docs/handbook/ci/branch-strategy.md @@ -0,0 +1,388 @@ +# Branch Strategy + +## Overview + +The Project Tick monorepo uses a structured branch naming convention that enables +CI scripts to automatically classify branches, determine valid base branches for PRs, +and decide which checks to run. The classification logic lives in +`ci/supportedBranches.js`. + +--- + +## Branch Naming Convention + +### Format + +``` +prefix[-version[-suffix]] +``` + +Where: +- `prefix` — The branch type (e.g., `master`, `release`, `feature`) +- `version` — Optional semantic version (e.g., `1.0`, `2.5.1`) +- `suffix` — Optional additional descriptor (e.g., `pre`, `hotfix`) + +### Parsing Regex + +```javascript +/(?<prefix>[a-zA-Z-]+?)(-(?<version>\d+\.\d+(?:\.\d+)?)(?:-(?<suffix>.*))?)?$/ +``` + +This regex extracts three named groups: + +| Group | Description | Example: `release-2.5.1-hotfix` | +|-----------|----------------------------------|---------------------------------| +| `prefix` | Branch type identifier | `release` | +| `version` | Semantic version number | `2.5.1` | +| `suffix` | Additional descriptor | `hotfix` | + +### Parse Examples + +```javascript +split('master') +// { prefix: 'master', version: undefined, suffix: undefined } + +split('release-1.0') +// { prefix: 'release', version: '1.0', suffix: undefined } + +split('release-2.5.1') +// { prefix: 'release', version: '2.5.1', suffix: undefined } + +split('staging-1.0') +// { prefix: 'staging', 
version: '1.0', suffix: undefined } + +split('staging-next-1.0') +// { prefix: 'staging-next', version: '1.0', suffix: undefined } + +split('feature-new-ui') +// { prefix: 'feature', version: undefined, suffix: undefined } +// Note: "new-ui" doesn't match version pattern, so prefix consumes it + +split('fix-crash-on-start') +// { prefix: 'fix', version: undefined, suffix: undefined } + +split('backport-123-to-release-1.0') +// { prefix: 'backport', version: undefined, suffix: undefined } +// Note: "123-to-release-1.0" doesn't start with a version, so no match + +split('dependabot-npm') +// { prefix: 'dependabot', version: undefined, suffix: undefined } +``` + +--- + +## Branch Classification + +### Type Configuration + +```javascript +const typeConfig = { + master: ['development', 'primary'], + release: ['development', 'primary'], + staging: ['development', 'secondary'], + 'staging-next': ['development', 'secondary'], + feature: ['wip'], + fix: ['wip'], + backport: ['wip'], + revert: ['wip'], + wip: ['wip'], + dependabot: ['wip'], +} +``` + +### Branch Types + +| Prefix | Type Tags | Description | +|----------------|------------------------------|-------------------------------------| +| `master` | `development`, `primary` | Main development branch | +| `release` | `development`, `primary` | Release branches (e.g., `release-1.0`) | +| `staging` | `development`, `secondary` | Pre-release staging | +| `staging-next` | `development`, `secondary` | Next staging cycle | +| `feature` | `wip` | Feature development branches | +| `fix` | `wip` | Bug fix branches | +| `backport` | `wip` | Backport branches | +| `revert` | `wip` | Revert branches | +| `wip` | `wip` | Work-in-progress branches | +| `dependabot` | `wip` | Automated dependency updates | + +Any branch with an unrecognized prefix defaults to type `['wip']`. 
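
For experimenting with the parsing step, a minimal runnable sketch of `split()` might look like the following. It assumes the regex's three groups are named `prefix`, `version`, and `suffix`, as the group table in "Parsing Regex" describes; the `BRANCH_RE` constant and the no-match fallback are illustrative additions and may not mirror the real `ci/supportedBranches.js`.

```javascript
// Sketch of split(); group names are taken from the "Parsing Regex" table.
// BRANCH_RE is an illustrative name, not necessarily the one used in the repo.
const BRANCH_RE =
  /(?<prefix>[a-zA-Z-]+?)(-(?<version>\d+\.\d+(?:\.\d+)?)(?:-(?<suffix>.*))?)?$/

function split(branch) {
  const match = branch.match(BRANCH_RE)
  // Assumption: a name the regex cannot match yields all-undefined fields.
  if (!match) return { prefix: undefined, version: undefined, suffix: undefined }
  // Copy the named groups; groups that did not participate stay undefined.
  return { ...match.groups }
}

for (const name of ['master', 'release-1.0', 'release-2.5.1-hotfix', 'staging-next-1.0']) {
  console.log(name, split(name))
}
```

The lazy `+?` on the prefix lets it stop as soon as a `-version` tail can complete the match, which is why `staging-next-1.0` keeps `staging-next` together as the prefix.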
+ +### Type Tag Meanings + +| Tag | Purpose | +|--------------|-------------------------------------------------------------| +| `development` | A long-lived branch that receives PRs | +| `primary` | The main target for new work (master or release branches) | +| `secondary` | A staging area — receives from primary, not from WIP directly | +| `wip` | A short-lived branch created for a specific task | + +--- + +## Order Configuration + +Branch ordering determines which branch is preferred when multiple branches are +equally good candidates as PR base branches: + +```javascript +const orderConfig = { + master: 0, + release: 1, + staging: 2, + 'staging-next': 3, +} +``` + +| Branch Prefix | Order | Preference | +|----------------|-------|------------------------------------------| +| `master` | 0 | Highest — default target for new work | +| `release` | 1 | Second — for release-specific changes | +| `staging` | 2 | Third — staging area | +| `staging-next` | 3 | Fourth — next staging cycle | +| All others | `Infinity` | Lowest — not considered as base branches | + +If two branches have the same number of commits ahead of a PR head, the one with +the lower order is preferred. This means `master` is preferred over `release-1.0` +when both are equally close. + +--- + +## Classification Function + +```javascript +function classify(branch) { + const { prefix, version } = split(branch) + return { + branch, + order: orderConfig[prefix] ?? Infinity, + stable: version != null, + type: typeConfig[prefix] ?? ['wip'], + version: version ?? 
'dev', + } +} +``` + +### Output Fields + +| Field | Type | Description | +|----------|----------|------------------------------------------------------| +| `branch` | String | The original branch name | +| `order` | Number | Sort priority (lower = preferred as base) | +| `stable` | Boolean | `true` if the branch has a version suffix | +| `type` | Array | Type tags from `typeConfig` | +| `version` | String | Extracted version number, or `'dev'` if none | + +### Classification Examples + +```javascript +classify('master') +// { branch: 'master', order: 0, stable: false, type: ['development', 'primary'], version: 'dev' } + +classify('release-1.0') +// { branch: 'release-1.0', order: 1, stable: true, type: ['development', 'primary'], version: '1.0' } + +classify('release-2.5.1') +// { branch: 'release-2.5.1', order: 1, stable: true, type: ['development', 'primary'], version: '2.5.1' } + +classify('staging-1.0') +// { branch: 'staging-1.0', order: 2, stable: true, type: ['development', 'secondary'], version: '1.0' } + +classify('staging-next-1.0') +// { branch: 'staging-next-1.0', order: 3, stable: true, type: ['development', 'secondary'], version: '1.0' } + +classify('feature-new-ui') +// { branch: 'feature-new-ui', order: Infinity, stable: false, type: ['wip'], version: 'dev' } + +classify('fix-crash-on-start') +// { branch: 'fix-crash-on-start', order: Infinity, stable: false, type: ['wip'], version: 'dev' } + +classify('dependabot-npm') +// { branch: 'dependabot-npm', order: Infinity, stable: false, type: ['wip'], version: 'dev' } + +classify('wip-experiment') +// { branch: 'wip-experiment', order: Infinity, stable: false, type: ['wip'], version: 'dev' } + +classify('random-unknown-branch') +// { branch: 'random-unknown-branch', order: Infinity, stable: false, type: ['wip'], version: 'dev' } +``` + +--- + +## Branch Flow Model + +### Development Flow + +``` +┌─────────────────────────────────────────────┐ +│ master │ +│ (primary development, receives all work) │ 
+└──────────┬──────────────────────┬───────────┘ + │ fork │ fork + ▼ ▼ +┌──────────────────┐ ┌──────────────────────┐ +│ staging-X.Y │ │ release-X.Y │ +│ (secondary, │ │ (primary, │ +│ pre-release) │ │ stable release) │ +└──────────────────┘ └──────────────────────┘ +``` + +### WIP Branch Flow + +``` + master (or release-X.Y) + │ + ┌────┴────┐ + │ fork │ + ▼ │ + feature-* │ + fix-* │ + backport-* │ + wip-* │ + │ │ + └──── PR ─┘ + (merged back) +``` + +### Typical Branch Lifecycle + +1. **Create** — Developer creates `feature-my-change` from `master` +2. **Develop** — Commits follow Conventional Commits format +3. **PR** — Pull request targets `master` (or the appropriate release branch) +4. **CI Validation** — `prepare.js` classifies branches, `lint-commits.js` checks messages +5. **Review** — Code review by owners defined in `ci/OWNERS` +6. **Merge** — PR is merged into the target branch +7. **Cleanup** — The WIP branch is deleted + +--- + +## How CI Uses Branch Classification + +### Commit Linting Exemptions + +PRs between development branches skip commit linting: + +```javascript +if ( + baseBranchType.includes('development') && + headBranchType.includes('development') && + pr.base.repo.id === pr.head.repo?.id +) { + core.info('This PR is from one development branch to another. Skipping checks.') + return +} +``` + +Exempted transitions: +- `staging` → `master` +- `staging-next` → `staging` +- `release-X.Y` → `master` + +### Base Branch Suggestion + +For WIP branches, `prepare.js` finds the optimal base: + +1. Start with `master` as a candidate +2. Compare commit distances to all `release-*` branches (sorted newest first) +3. The branch with fewest commits ahead is the best candidate +4. On ties, lower `order` wins (master > release > staging) + +### Release Branch Targeting Warning + +When a non-backport/fix/revert branch targets a release branch: + +``` +Warning: This PR targets release branch `release-1.0`. +New features should typically target `master`. 
+``` + +--- + +## Version Extraction + +The `stable` flag and `version` field enable version-aware CI decisions: + +| Branch | `stable` | `version` | Interpretation | +|-------------------|----------|-----------|--------------------------------| +| `master` | `false` | `'dev'` | Development, no specific version | +| `release-1.0` | `true` | `'1.0'` | Release 1.0 | +| `release-2.5.1` | `true` | `'2.5.1'` | Release 2.5.1 | +| `staging-1.0` | `true` | `'1.0'` | Staging for release 1.0 | +| `feature-foo` | `false` | `'dev'` | WIP, no version association | + +Release branches are sorted by version (descending) when computing base branch +suggestions, so `release-2.0` is checked before `release-1.0`. + +--- + +## Module Exports + +The `supportedBranches.js` module exports two functions: + +```javascript +module.exports = { classify, split } +``` + +| Function | Purpose | +|-----------|----------------------------------------------------------| +| `classify` | Full classification: type tags, order, stability, version| +| `split` | Parse branch name into prefix, version, suffix | + +These are imported by: +- `ci/github-script/lint-commits.js` — For commit linting exemptions +- `ci/github-script/prepare.js` — For branch targeting validation + +--- + +## Self-Testing + +When `supportedBranches.js` is run directly (not imported as a module), it executes +built-in tests: + +```bash +cd ci/ +node supportedBranches.js +``` + +Output: + +``` +split(branch) +master { prefix: 'master', version: undefined, suffix: undefined } +release-1.0 { prefix: 'release', version: '1.0', suffix: undefined } +release-2.5.1 { prefix: 'release', version: '2.5.1', suffix: undefined } +staging-1.0 { prefix: 'staging', version: '1.0', suffix: undefined } +staging-next-1.0 { prefix: 'staging-next', version: '1.0', suffix: undefined } +feature-new-ui { prefix: 'feature', version: undefined, suffix: undefined } +fix-crash-on-start { prefix: 'fix', version: undefined, suffix: undefined } +... 
+ +classify(branch) +master { branch: 'master', order: 0, stable: false, type: ['development', 'primary'], version: 'dev' } +release-1.0 { branch: 'release-1.0', order: 1, stable: true, type: ['development', 'primary'], version: '1.0' } +... +``` + +--- + +## Adding New Branch Types + +To add a new branch type: + +1. Add the prefix and type tags to `typeConfig`: + +```javascript +const typeConfig = { + // ... existing entries ... + 'hotfix': ['wip'], // or ['development', 'primary'] if it's a long-lived branch +} +``` + +2. If it should be a base branch candidate, add it to `orderConfig`: + +```javascript +const orderConfig = { + // ... existing entries ... + hotfix: 4, // lower number = higher preference +} +``` + +3. Update the self-tests at the bottom of the file. diff --git a/docs/handbook/ci/codeowners.md b/docs/handbook/ci/codeowners.md new file mode 100644 index 0000000000..0054a168f1 --- /dev/null +++ b/docs/handbook/ci/codeowners.md @@ -0,0 +1,370 @@ +# CODEOWNERS + +## Overview + +Project Tick uses a code ownership system based on the `ci/OWNERS` file. This file +follows the same syntax as GitHub's native `CODEOWNERS` file but is stored in a +custom location and validated by a patched version of the +[codeowners-validator](https://github.com/mszostok/codeowners-validator) tool. + +The OWNERS file serves two purposes: +1. **Automated review routing** — PR authors know who to request reviews from +2. **Structural validation** — CI checks that referenced paths and users exist + +--- + +## File Location and Format + +### Location + +``` +ci/OWNERS +``` + +Unlike GitHub's native CODEOWNERS (which must be in `.github/CODEOWNERS`, +`CODEOWNERS`, or `docs/CODEOWNERS`), Project Tick stores ownership data in +`ci/OWNERS` to keep CI infrastructure colocated. 
+ +### Syntax + +The file uses CODEOWNERS syntax: + +``` +# Comments start with # +# Pattern followed by one or more @owner references +/path/pattern/ @owner1 @owner2 +``` + +### Header + +``` +# This file describes who owns what in the Project Tick CI infrastructure. +# Users/teams will get review requests for PRs that change their files. +# +# This file uses the same syntax as the natively supported CODEOWNERS file, +# see https://help.github.com/articles/about-codeowners/ for documentation. +# +# Validated by ci/codeowners-validator. +``` + +--- + +## Ownership Map + +The OWNERS file maps every major directory and subdirectory in the monorepo to +code owners. Below is the complete ownership mapping: + +### GitHub Infrastructure + +``` +/.github/actions/change-analysis/ @YongDo-Hyun +/.github/actions/meshmc/package/ @YongDo-Hyun +/.github/actions/meshmc/setup-dependencies/ @YongDo-Hyun +/.github/actions/mnv/test_artefacts/ @YongDo-Hyun +/.github/codeql/ @YongDo-Hyun +/.github/ISSUE_TEMPLATE/ @YongDo-Hyun +/.github/workflows/ @YongDo-Hyun +``` + +### Archived Projects + +``` +/archived/projt-launcher/ @YongDo-Hyun +/archived/projt-minicraft-modpack/ @YongDo-Hyun +/archived/projt-modpack/ @YongDo-Hyun +/archived/ptlibzippy/ @YongDo-Hyun +``` + +### Core Projects + +``` +/cgit/* @YongDo-Hyun +/cgit/contrib/* @YongDo-Hyun +/cgit/contrib/hooks/ @YongDo-Hyun +/cgit/filters/ @YongDo-Hyun +/cgit/tests/ @YongDo-Hyun + +/cmark/* @YongDo-Hyun +/cmark/api_test/ @YongDo-Hyun +/cmark/bench/ @YongDo-Hyun +/cmark/cmake/ @YongDo-Hyun +/cmark/data/ @YongDo-Hyun +/cmark/fuzz/ @YongDo-Hyun +/cmark/man/ @YongDo-Hyun +/cmark/src/ @YongDo-Hyun +/cmark/test/ @YongDo-Hyun +/cmark/tools/ @YongDo-Hyun +/cmark/wrappers/ @YongDo-Hyun +``` + +### Corebinutils (every utility individually owned) + +``` +/corebinutils/* @YongDo-Hyun +/corebinutils/cat/ @YongDo-Hyun +/corebinutils/chflags/ @YongDo-Hyun +/corebinutils/chmod/ @YongDo-Hyun +/corebinutils/contrib/* @YongDo-Hyun 
+/corebinutils/contrib/libc-vis/ @YongDo-Hyun +/corebinutils/contrib/libedit/ @YongDo-Hyun +/corebinutils/contrib/printf/ @YongDo-Hyun +/corebinutils/cp/ @YongDo-Hyun +... +/corebinutils/uuidgen/ @YongDo-Hyun +``` + +### Other Projects + +``` +/forgewrapper/* @YongDo-Hyun +/forgewrapper/gradle/ @YongDo-Hyun +/forgewrapper/jigsaw/ @YongDo-Hyun +/forgewrapper/src/ @YongDo-Hyun + +/genqrcode/* @YongDo-Hyun +/genqrcode/cmake/ @YongDo-Hyun +/genqrcode/tests/ @YongDo-Hyun +/genqrcode/use/ @YongDo-Hyun + +/hooks/ @YongDo-Hyun +/images4docker/ @YongDo-Hyun + +/json4cpp/* @YongDo-Hyun +/json4cpp/.reuse/ @YongDo-Hyun +/json4cpp/cmake/ @YongDo-Hyun +/json4cpp/docs/ @YongDo-Hyun +/json4cpp/include/* @YongDo-Hyun +... + +/libnbtplusplus/* @YongDo-Hyun +/libnbtplusplus/include/* @YongDo-Hyun +... + +/LICENSES/ @YongDo-Hyun + +/meshmc/* @YongDo-Hyun +/meshmc/branding/ @YongDo-Hyun +/meshmc/buildconfig/ @YongDo-Hyun +/meshmc/cmake/* @YongDo-Hyun +/meshmc/launcher/* @YongDo-Hyun +... +``` + +--- + +## Pattern Syntax + +### Glob Rules + +| Pattern | Matches | +|---------------|------------------------------------------------------| +| `/path/` | All files directly under `path/` | +| `/path/*` | All files directly under `path/` (explicit) | +| `/path/**` | All files recursively under `path/` | +| `*.js` | All `.js` files everywhere | +| `/path/*.md` | All `.md` files directly under `path/` | + +### Ownership Resolution + +When multiple patterns match a file, the **last matching rule** wins (just like +Git's `.gitignore` and GitHub's native CODEOWNERS): + +``` +/meshmc/* @teamA # Matches all direct files +/meshmc/launcher/* @teamB # More specific — wins for launcher files +``` + +A PR modifying `meshmc/launcher/main.cpp` would require review from `@teamB`. + +### Explicit Directory Listing + +The OWNERS file explicitly lists individual subdirectories rather than using `**` +recursive globs. This is intentional: + +1. **Precision** — Each directory has explicit ownership +2. 
**Validation** — The codeowners-validator checks that each listed path exists +3. **Documentation** — The file serves as a directory map of the monorepo + +--- + +## Validation + +### codeowners-validator + +The CI runs a patched version of `codeowners-validator` against the OWNERS file. +The tool is built from source with Project Tick–specific patches. + +#### What It Validates + +| Check | Description | +|-------------------------|------------------------------------------------| +| **Path existence** | All paths in OWNERS exist in the repository | +| **User/team existence** | All `@` references are valid GitHub users/teams| +| **Syntax** | Pattern syntax is valid CODEOWNERS format | +| **No orphaned patterns** | Patterns match at least one file | + +#### Custom Patches + +Two patches are applied to the upstream validator: + +**1. Custom OWNERS file path** (`owners-file-name.patch`) + +```go +func openCodeownersFile(dir string) (io.Reader, error) { + if file, ok := os.LookupEnv("OWNERS_FILE"); ok { + return fs.Open(file) + } + // ... default CODEOWNERS paths +} +``` + +Set `OWNERS_FILE=ci/OWNERS` to validate the custom location. + +**2. Removed write-access requirement** (`permissions.patch`) + +GitHub's native CODEOWNERS requires that listed users have write access to the +repository. Project Tick's OWNERS file is used for review routing, not branch +protection, so this check is removed: + +```go +// Before: required push permission +if t.Permissions["push"] { return nil } +return newValidateError("Team cannot review PRs...") + +// After: any team membership is sufficient +return nil +``` + +Also removes the `github.ScopeReadOrg` requirement from required OAuth scopes, +allowing the validator to work with tokens generated for GitHub Apps. 
+ +### Running Validation Locally + +```bash +cd ci/ +nix-shell # enters the CI dev shell with codeowners-validator available + +# Set the custom OWNERS file path: +export OWNERS_FILE=ci/OWNERS + +# Run validation: +codeowners-validator +``` + +Or build and run directly: + +```bash +nix-build ci/ -A codeownersValidator +OWNERS_FILE=ci/OWNERS ./result/bin/codeowners-validator +``` + +--- + +## MAINTAINERS File Relationship + +In addition to `ci/OWNERS`, individual projects may have a `MAINTAINERS` file +(e.g., `archived/projt-launcher/MAINTAINERS`): + +``` +# MAINTAINERS +# +# Fields: +# - Name: Display name +# - GitHub: GitHub handle (with @) +# - Email: Primary contact email +# - Paths: Comma-separated glob patterns (repo-relative) + +[Mehmet Samet Duman] +GitHub: @YongDo-Hyun +Email: yongdohyun@mail.projecttick.org +Paths: ** +``` + +The `MAINTAINERS` file provides additional metadata (email, display name) that +`OWNERS` doesn't support. The two files serve complementary purposes: + +| File | Purpose | Format | +|--------------|--------------------------------------|-------------------| +| `ci/OWNERS` | Automated review routing via CI | CODEOWNERS syntax | +| `MAINTAINERS`| Human-readable contact information | INI-style blocks | + +--- + +## Review Requirements + +### How Reviews Are Triggered + +When a PR modifies files matching an OWNERS pattern: + +1. The workflow identifies which owners are responsible for the changed paths +2. Review requests are sent to the matching owners +3. 
At least one approving review from a code owner is typically required before merge + +### Bot-Managed Reviews + +The CI bot (`github-actions[bot]`) manages automated reviews via `ci/github-script/reviews.js`: +- Reviews are tagged with a `reviewKey` comment for identification +- When issues are resolved, bot reviews are automatically dismissed or minimized +- The `CHANGES_REQUESTED` state blocks merge until the review is dismissed + +--- + +## Adding Ownership Entries + +### For a New Project Directory + +1. Add ownership patterns to `ci/OWNERS`: + +``` +/newproject/* @owner-handle +/newproject/src/ @owner-handle +/newproject/tests/ @owner-handle +``` + +2. List every subdirectory explicitly (not just the top-level with `**`) + +3. Run the validator locally: + +```bash +cd ci/ +nix-shell +OWNERS_FILE=ci/OWNERS codeowners-validator +``` + +4. Commit with a CI scope: + +``` +ci(repo): add ownership for newproject +``` + +### For a New Team or User + +Simply reference the new `@handle` in the ownership patterns: + +``` +/some/path/ @existing-owner @new-owner +``` + +The validator will check that `@new-owner` exists in the GitHub organization. + +--- + +## Limitations + +### No Recursive Globs in Current File + +The current OWNERS file uses explicit directory listings rather than `/**` recursive +globs. This means: +- New subdirectories must be manually added to OWNERS +- Deeply nested directories need their own entries +- The file can grow large for projects with many subdirectories + +### Single Organization Scope + +All `@` references must be members of the repository's GitHub organization, +or GitHub users with access to the repository. + +### No Per-File Patterns + +The file doesn't currently use file-level patterns (e.g., `*.nix @nix-team`). +Ownership is assigned at the directory level. 
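The "last matching rule wins" behaviour described under Ownership Resolution can be sketched in a few lines of JavaScript. This is a minimal illustration, not the project's actual resolver: `patternToRegExp`, `parseOwners`, and `ownersFor` are hypothetical helpers, and the glob meanings are taken from the Glob Rules table in this document as an assumption. GitHub's native CODEOWNERS matcher treats a trailing-slash directory pattern as covering nested files as well, so this sketch is not a drop-in replacement for it.

```javascript
// Hypothetical helper: turn one OWNERS pattern into a RegExp, following the
// glob meanings listed in the "Glob Rules" table (an assumption of this sketch).
function patternToRegExp(pattern) {
  const anchored = pattern.startsWith('/')
  let body = anchored ? pattern.slice(1) : pattern
  if (body.endsWith('/')) body += '*' // "/path/" is treated like "/path/*"
  const source = body
    .split('**')
    .map((part) =>
      part
        .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters
        .replace(/\*/g, '[^/]*'), // "*" stays within one path segment
    )
    .join('.*') // "**" may cross directory boundaries
  return new RegExp(`^${anchored ? '' : '(?:.*/)?'}${source}$`)
}

// Parse OWNERS text into ordered rules, ignoring comments and blank lines.
function parseOwners(text) {
  return text
    .split('\n')
    .map((line) => line.replace(/#.*$/, '').trim())
    .filter(Boolean)
    .map((line) => {
      const [pattern, ...owners] = line.split(/\s+/)
      return { pattern, owners, regex: patternToRegExp(pattern) }
    })
}

// Walk the rules top to bottom; a later match replaces an earlier one.
function ownersFor(rules, filePath) {
  let owners = []
  for (const rule of rules) {
    if (rule.regex.test(filePath)) owners = rule.owners
  }
  return owners
}

const rules = parseOwners(`
/meshmc/**          @teamA
/meshmc/launcher/*  @teamB  # listed later, so it overrides @teamA
`)

console.log(ownersFor(rules, 'meshmc/launcher/main.cpp')) // [ '@teamB' ]
console.log(ownersFor(rules, 'meshmc/branding/icon.png')) // [ '@teamA' ]
```

Because resolution depends only on order, moving a broad pattern below a narrow one silently changes who is asked to review, which is one reason to keep related entries grouped together.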
diff --git a/docs/handbook/ci/commit-linting.md b/docs/handbook/ci/commit-linting.md new file mode 100644 index 0000000000..9b8e9cc97d --- /dev/null +++ b/docs/handbook/ci/commit-linting.md @@ -0,0 +1,418 @@ +# Commit Linting + +## Overview + +Project Tick enforces the [Conventional Commits](https://www.conventionalcommits.org/) +specification for all commit messages. The commit linter (`ci/github-script/lint-commits.js`) +runs automatically on every pull request to validate that every commit follows the required +format. + +This ensures: +- Consistent, machine-readable commit history +- Automated changelog generation potential +- Clear communication of change intent (feature, fix, refactor, etc.) +- Monorepo-aware scoping that maps commits to project directories + +--- + +## Commit Message Format + +### Structure + +``` +type(scope): subject +``` + +### Examples + +``` +feat(mnv): add new keybinding support +fix(meshmc): resolve crash on startup +ci(neozip): update build matrix +docs(cmark): fix API reference +refactor(corebinutils): simplify ls output logic +chore(deps): bump tomlplusplus to v4.0.0 +revert(forgewrapper): undo jigsaw module changes +``` + +### Rules + +| Rule | Requirement | +|-------------------------------|----------------------------------------------------------| +| **Type** | Must be one of the supported types (see below) | +| **Scope** | Optional, but should match a known project directory | +| **Subject** | Must follow the type/scope with `: ` (colon + space) | +| **Trailing period** | Subject must NOT end with a period | +| **Subject case** | Subject should start with a lowercase letter (warning) | +| **No fixup/squash commits** | `fixup!`, `squash!`, `amend!` prefixes are rejected | +| **Breaking changes** | Use `!` after type/scope: `feat(mnv)!: remove API` | + +--- + +## Supported Types + +The following Conventional Commit types are recognized: + +```javascript +const CONVENTIONAL_TYPES = [ + 'build', + 'chore', + 'ci', + 'docs', + 
'feat', + 'fix', + 'perf', + 'refactor', + 'revert', + 'style', + 'test', +] +``` + +| Type | Use When | +|-----------|-------------------------------------------------------------| +| `build` | Changes to the build system or external dependencies | +| `chore` | Routine tasks, no production code change | +| `ci` | CI configuration files and scripts | +| `docs` | Documentation only changes | +| `feat` | A new feature | +| `fix` | A bug fix | +| `perf` | A performance improvement | +| `refactor`| Code change that neither fixes a bug nor adds a feature | +| `revert` | Reverts a previous commit | +| `style` | Formatting, semicolons, whitespace (no code change) | +| `test` | Adding or correcting tests | + +--- + +## Known Scopes + +Scopes correspond to directories in the Project Tick monorepo: + +```javascript +const KNOWN_SCOPES = [ + 'archived', + 'cgit', + 'ci', + 'cmark', + 'corebinutils', + 'forgewrapper', + 'genqrcode', + 'hooks', + 'images4docker', + 'json4cpp', + 'libnbtplusplus', + 'meshmc', + 'meta', + 'mnv', + 'neozip', + 'tomlplusplus', + 'repo', + 'deps', +] +``` + +### Special Scopes + +| Scope | Meaning | +|----------|----------------------------------------------------| +| `repo` | Changes affecting the repository as a whole | +| `deps` | Dependency updates not scoped to a single project | + +### Unknown Scope Handling + +Using an unknown scope generates a **warning** (not an error): + +``` +Commit abc123456789: scope "myproject" is not a known project. +Known scopes: archived, cgit, ci, cmark, ... +``` + +This allows new scopes to be introduced before updating the linter. + +--- + +## Validation Logic + +### Regex Pattern + +The commit message is validated against this regex: + +```javascript +const conventionalRegex = new RegExp( + `^(${CONVENTIONAL_TYPES.join('|')})(\\(([^)]+)\\))?(!)?: .+$`, +) +``` + +Expanded, this matches: + +``` +^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test) # type +(\(([^)]+)\))? # optional (scope) +(!)? 
# optional breaking change marker +: .+$ # colon, space, and subject +``` + +### Validation Order + +For each commit in the PR: + +1. **Check for fixup/squash/amend** — If the message starts with `amend!`, `fixup!`, or + `squash!`, the commit fails immediately. These commits should be rebased before merging: + + ```javascript + const fixups = ['amend!', 'fixup!', 'squash!'] + if (fixups.some((s) => msg.startsWith(s))) { + core.error( + `${logPrefix}: starts with "${fixups.find((s) => msg.startsWith(s))}". ` + + 'Did you forget to run `git rebase -i --autosquash`?', + ) + failures.add(commit.sha) + continue + } + ``` + +2. **Check Conventional Commits format** — If the regex doesn't match, the commit fails: + + ```javascript + if (!conventionalRegex.test(msg)) { + core.error( + `${logPrefix}: "${msg}" does not follow Conventional Commits format. ` + + 'Expected: type(scope): subject (e.g. "feat(mnv): add keybinding")', + ) + failures.add(commit.sha) + continue + } + ``` + +3. **Check trailing period** — Subjects ending with `.` fail: + + ```javascript + if (msg.endsWith('.')) { + core.error(`${logPrefix}: subject should not end with a period.`) + failures.add(commit.sha) + } + ``` + +4. **Warn on unknown scope** — Non-standard scopes produce a warning: + + ```javascript + if (scope && !KNOWN_SCOPES.includes(scope)) { + core.warning( + `${logPrefix}: scope "${scope}" is not a known project. ` + + `Known scopes: ${KNOWN_SCOPES.join(', ')}`, + ) + warnings.add(commit.sha) + } + ``` + +5. 
**Warn on uppercase subject** — If the first character after `: ` is uppercase, warn: + + ```javascript + const subjectStart = msg.indexOf(': ') + 2 + if (subjectStart < msg.length) { + const firstChar = msg[subjectStart] + if (firstChar === firstChar.toUpperCase() && firstChar !== firstChar.toLowerCase()) { + core.warning(`${logPrefix}: subject should start with lowercase letter.`) + warnings.add(commit.sha) + } + } + ``` + +--- + +## Branch-Based Exemptions + +The linter skips validation for PRs between development branches: + +```javascript +const baseBranchType = classify(pr.base.ref.replace(/^refs\/heads\//, '')).type +const headBranchType = classify(pr.head.ref.replace(/^refs\/heads\//, '')).type + +if ( + baseBranchType.includes('development') && + headBranchType.includes('development') && + pr.base.repo.id === pr.head.repo?.id +) { + core.info('This PR is from one development branch to another. Skipping checks.') + return +} +``` + +This exempts: +- `staging` → `master` merges +- `staging-next` → `staging` merges +- `release-X.Y` → `master` merges + +These are infrastructure merges where commits were already validated in their original PRs. + +The `classify()` function from `supportedBranches.js` determines branch types: + +| Branch Prefix | Type | Exempt as PR source? | +|----------------|-------------------------|---------------------| +| `master` | `development`, `primary` | Yes | +| `release-*` | `development`, `primary` | Yes | +| `staging-*` | `development`, `secondary` | Yes | +| `staging-next-*`| `development`, `secondary` | Yes | +| `feature-*` | `wip` | No | +| `fix-*` | `wip` | No | +| `backport-*` | `wip` | No | + +--- + +## Commit Detail Extraction + +The linter uses `get-pr-commit-details.js` to extract commit information. 
Notably, +this uses **git directly** rather than the GitHub API: + +```javascript +async function getCommitDetailsForPR({ core, pr, repoPath }) { + await runGit({ + args: ['fetch', `--depth=1`, 'origin', pr.base.sha], + repoPath, core, + }) + await runGit({ + args: ['fetch', `--depth=${pr.commits + 1}`, 'origin', pr.head.sha], + repoPath, core, + }) + + const shas = ( + await runGit({ + args: [ + 'rev-list', + `--max-count=${pr.commits}`, + `${pr.base.sha}..${pr.head.sha}`, + ], + repoPath, core, + }) + ).stdout.split('\n').map((s) => s.trim()).filter(Boolean) +``` + +### Why Not Use the GitHub API? + +The GitHub REST API's "list commits on a PR" endpoint has a hard limit of **250 commits**. +For large PRs or release-branch merges, this is insufficient. Using git directly: +- Has no commit count limit +- Also returns changed file paths per commit (used for scope validation) +- Is faster for bulk operations + +For each commit, the script extracts: + +| Field | Source | Purpose | +|----------------------|-----------------------------|---------------------------------| +| `sha` | `git rev-list` | Commit identifier | +| `subject` | `git log --format=%s` | First line of commit message | +| `changedPaths` | `git log --name-only` | Files changed in that commit | +| `changedPathSegments` | Path splitting | Directory segments for scope matching | + +--- + +## Error Output + +### Failures (block merge) + +``` +Error: Commit abc123456789: "Add new feature" does not follow Conventional Commits format. +Expected: type(scope): subject (e.g. "feat(mnv): add keybinding") + +Error: Commit def456789012: starts with "fixup!". +Did you forget to run `git rebase -i --autosquash`? + +Error: Commit ghi789012345: subject should not end with a period. + +Error: Please review the Conventional Commits guidelines at + and the project CONTRIBUTING.md. + +Error: 3 commit(s) do not follow commit conventions. 
+``` + +### Warnings (informational) + +``` +Warning: Commit jkl012345678: scope "myproject" is not a known project. +Known scopes: archived, cgit, ci, cmark, ... + +Warning: Commit mno345678901: subject should start with lowercase letter. + +Warning: 2 commit(s) have minor issues (see warnings above). +``` + +--- + +## Local Testing + +Test the commit linter locally using the CLI runner: + +```bash +cd ci/github-script +nix-shell # enter Nix dev shell +gh auth login # authenticate with GitHub +./run lint-commits YongDo-Hyun Project-Tick 123 # lint PR #123 +``` + +The `./run` CLI uses the `commander` package and authenticates via `gh auth token`: + +```javascript +program + .command('lint-commits') + .description('Lint commit messages for Conventional Commits compliance.') + .argument('', 'Repository owner (e.g. YongDo-Hyun)') + .argument('', 'Repository name (e.g. Project-Tick)') + .argument('', 'Pull Request number') + .action(async (owner, repo, pr) => { + const lint = (await import('./lint-commits.js')).default + await run(lint, owner, repo, pr) + }) +``` + +--- + +## Best Practices + +### Writing Good Commit Messages + +1. **Use the correct type** — `feat` for features, `fix` for bugs, `docs` for documentation +2. **Include a scope** — Helps identify which project is affected: `feat(meshmc): ...` +3. **Use imperative mood** — "add feature" not "added feature" or "adds feature" +4. **Keep subject under 72 characters** — For readability in `git log` +5. **Start with lowercase** — `add feature` not `Add feature` +6. **No trailing period** — `fix(cgit): resolve parse error` not `fix(cgit): resolve parse error.` + +### Handling Fixup Commits During Development + +During development, you can use `git commit --fixup=` freely. 
Before opening +the PR (or before requesting review), squash them: + +```bash +git rebase -i --autosquash origin/master +``` + +### Multiple Scopes + +If a commit touches multiple projects, either: +- Use `repo` as the scope: `refactor(repo): update shared build config` +- Use the primary affected project as the scope +- Split the commit into separate per-project commits + +--- + +## Adding New Types or Scopes + +### New Scope + +Add the scope to the `KNOWN_SCOPES` array in `ci/github-script/lint-commits.js`: + +```javascript +const KNOWN_SCOPES = [ + 'archived', + 'cgit', + // ... + 'newproject', // ← add here (keep sorted) + // ... +] +``` + +### New Type + +Adding new types requires updating `CONVENTIONAL_TYPES` — but this should be done +rarely, as the standard Conventional Commits types cover most use cases. diff --git a/docs/handbook/ci/formatting.md b/docs/handbook/ci/formatting.md new file mode 100644 index 0000000000..9d2ddb35a4 --- /dev/null +++ b/docs/handbook/ci/formatting.md @@ -0,0 +1,298 @@ +# Code Formatting + +## Overview + +Project Tick uses [treefmt](https://github.com/numtide/treefmt) orchestrated through +[treefmt-nix](https://github.com/numtide/treefmt-nix) to enforce consistent code formatting +across the entire monorepo. The formatting configuration lives in `ci/default.nix` and +covers JavaScript, Nix, YAML, GitHub Actions workflows, and sorted-list enforcement. 
+ +--- + +## Configured Formatters + +### Summary Table + +| Formatter | Language/Files | Key Settings | +|-------------|-------------------------------|-------------------------------------------| +| `actionlint` | GitHub Actions YAML | Default (syntax + best practices) | +| `biome` | JavaScript / TypeScript | Single quotes, optional semicolons | +| `keep-sorted`| Any (marked sections) | Default | +| `nixfmt` | Nix expressions | nixfmt-rfc-style | +| `yamlfmt` | YAML files | Retain line breaks | +| `zizmor` | GitHub Actions YAML | Security scanning | + +--- + +### actionlint + +**Purpose**: Validates GitHub Actions workflow files for syntax errors, type mismatches, +and best practices. + +**Scope**: `.github/workflows/*.yml` + +**Configuration**: Default — no custom settings. + +```nix +programs.actionlint.enable = true; +``` + +**What it catches**: +- Invalid workflow syntax +- Missing or incorrect `runs-on` values +- Type mismatches in expressions +- Unknown action references + +--- + +### biome + +**Purpose**: Formats JavaScript and TypeScript source files with consistent style. + +**Scope**: All `.js` and `.ts` files except `*.min.js` + +**Configuration**: + +```nix +programs.biome = { + enable = true; + validate.enable = false; + settings.formatter = { + useEditorconfig = true; + }; + settings.javascript.formatter = { + quoteStyle = "single"; + semicolons = "asNeeded"; + }; + settings.json.formatter.enabled = false; +}; +settings.formatter.biome.excludes = [ + "*.min.js" +]; +``` + +**Style rules**: + +| Setting | Value | Effect | +|---------------------|----------------|-------------------------------------------| +| `useEditorconfig` | `true` | Respects `.editorconfig` (indent, etc.) 
| +| `quoteStyle` | `"single"` | Uses `'string'` instead of `"string"` | +| `semicolons` | `"asNeeded"` | Only inserts `;` where ASI requires it | +| `validate.enable` | `false` | No lint-level validation, only formatting | +| `json.formatter` | `disabled` | JSON files are not formatted by biome | + +**Exclusions**: `*.min.js` — Minified JavaScript files are never reformatted. + +--- + +### keep-sorted + +**Purpose**: Enforces alphabetical ordering in marked sections of any file type. + +**Scope**: Files containing `keep-sorted` markers. + +```nix +programs.keep-sorted.enable = true; +``` + +**Usage**: Add markers around sections that should stay sorted: + +``` +# keep-sorted start +apple +banana +cherry +# keep-sorted end +``` + +--- + +### nixfmt + +**Purpose**: Formats Nix expressions according to the RFC-style convention. + +**Scope**: All `.nix` files. + +```nix +programs.nixfmt = { + enable = true; + package = pkgs.nixfmt; +}; +``` + +The `pkgs.nixfmt` package from the pinned Nixpkgs provides the formatter. This +is `nixfmt-rfc-style`, the official Nix formatting standard. + +--- + +### yamlfmt + +**Purpose**: Formats YAML files with consistent indentation and structure. + +**Scope**: All `.yml` and `.yaml` files. + +```nix +programs.yamlfmt = { + enable = true; + settings.formatter = { + retain_line_breaks = true; + }; +}; +``` + +**Key setting**: `retain_line_breaks = true` — Preserves intentional blank lines between +YAML sections, preventing the formatter from collapsing the file into a dense block. + +--- + +### zizmor + +**Purpose**: Security scanner for GitHub Actions workflows. Detects injection +vulnerabilities, insecure defaults, and untrusted input handling. 
+ +**Scope**: `.github/workflows/*.yml` + +```nix +programs.zizmor.enable = true; +``` + +**What it detects**: +- Script injection via `${{ github.event.* }}` in `run:` steps +- Insecure use of `pull_request_target` +- Unquoted expressions that could be exploited +- Dangerous permission configurations + +--- + +## treefmt Global Settings + +```nix +projectRootFile = ".git/config"; +settings.verbose = 1; +settings.on-unmatched = "debug"; +``` + +| Setting | Value | Purpose | +|--------------------|---------------|----------------------------------------------| +| `projectRootFile` | `.git/config` | Identifies repository root for treefmt | +| `settings.verbose` | `1` | Logs which files each formatter processes | +| `settings.on-unmatched` | `"debug"` | Files with no matching formatter are logged at debug level | + +--- + +## Running Formatters + +### In CI + +The formatting check runs as a Nix derivation: + +```bash +nix-build ci/ -A fmt.check +``` + +This: +1. Copies the full source tree (excluding `.git`) into the Nix store +2. Runs all configured formatters +3. Fails with a diff if any file would be reformatted + +### Locally (Nix Shell) + +```bash +cd ci/ +nix-shell # enter CI dev shell +treefmt # format all files +treefmt --check # check without modifying (dry run) +``` + +### Locally (Nix Build) + +```bash +# Just check (no modification): +nix-build ci/ -A fmt.check + +# Get the formatter binary: +nix-build ci/ -A fmt.pkg +./result/bin/treefmt +``` + +--- + +## Source Tree Construction + +The treefmt check operates on a clean copy of the source tree: + +```nix +fs = pkgs.lib.fileset; +src = fs.toSource { + root = ../.; + fileset = fs.difference ../. 
(fs.maybeMissing ../.git); +}; +``` + +This: +- Takes the entire repository directory (`../.` from `ci/`) +- Excludes the `.git` directory (which is large and irrelevant for formatting) +- `fs.maybeMissing` handles the case where `.git` doesn't exist (e.g., in tarballs) + +The resulting source is passed to `fmt.check`: + +```nix +check = treefmtEval.config.build.check src; +``` + +--- + +## Formatter Outputs + +The formatting system exposes three Nix attributes: + +```nix +{ + shell = treefmtEval.config.build.devShell; # Interactive shell + pkg = treefmtEval.config.build.wrapper; # treefmt binary + check = treefmtEval.config.build.check src; # CI check derivation +} +``` + +| Attribute | Use Case | +|------------|--------------------------------------------------------| +| `fmt.shell` | `nix develop .#fmt.shell` — interactive formatting | +| `fmt.pkg` | The treefmt wrapper with all formatters bundled | +| `fmt.check` | `nix build .#fmt.check` — CI formatting check | + +--- + +## Troubleshooting + +### "File would be reformatted" + +If CI fails with formatting issues: + +```bash +# Enter the CI shell to get the exact same formatter versions: +cd ci/ +nix-shell + +# Format all files: +treefmt + +# Stage and commit the changes: +git add -u +git commit -m "style(repo): apply treefmt formatting" +``` + +### Editor Integration + +For real-time formatting in VS Code: + +1. Use the biome extension for JavaScript/TypeScript +2. Configure single quotes and optional semicolons to match CI settings +3. Use nixfmt for Nix files; nixpkgs-fmt is a separate formatter whose output differs from what CI enforces + +### Formatter Conflicts + +Each file type has exactly one formatter assigned by treefmt. If a file matches +multiple formatters, treefmt reports a conflict.
The current configuration avoids +this by: +- Disabling biome's JSON formatter +- Having non-overlapping file type coverage diff --git a/docs/handbook/ci/nix-infrastructure.md b/docs/handbook/ci/nix-infrastructure.md new file mode 100644 index 0000000000..27481ed46a --- /dev/null +++ b/docs/handbook/ci/nix-infrastructure.md @@ -0,0 +1,611 @@ +# Nix Infrastructure + +## Overview + +The CI system for the Project Tick monorepo is built on Nix, using pinned dependency +sources to guarantee reproducible builds and formatting checks. The primary entry point +is `ci/default.nix`, which bootstraps the complete CI toolchain from `ci/pinned.json`. + +This document covers the Nix expressions in detail: how they work, what they produce, +and how they integrate with the broader Project Tick build infrastructure. + +--- + +## ci/default.nix — The CI Entry Point + +The `default.nix` file is the sole entry point for all Nix-based CI operations. It: + +1. Reads pinned source revisions from `pinned.json` +2. Fetches the exact Nixpkgs tarball +3. Configures the treefmt multi-formatter +4. Builds the codeowners-validator +5. Exposes a development shell with all CI tools + +### Top-level Structure + +```nix +let + pinned = (builtins.fromJSON (builtins.readFile ./pinned.json)).pins; +in +{ + system ? builtins.currentSystem, + nixpkgs ? null, +}: +let + nixpkgs' = + if nixpkgs == null then + fetchTarball { + inherit (pinned.nixpkgs) url; + sha256 = pinned.nixpkgs.hash; + } + else + nixpkgs; + + pkgs = import nixpkgs' { + inherit system; + config = { }; + overlays = [ ]; + }; +``` + +### Function Parameters + +| Parameter | Default | Purpose | +|-----------|------------------------------|-------------------------------------------------| +| `system` | `builtins.currentSystem` | Target system (e.g., `x86_64-linux`) | +| `nixpkgs` | `null` (uses pinned) | Override Nixpkgs source for development/testing | + +When `nixpkgs` is `null` (the default), the pinned revision is fetched. 
When provided +explicitly, the override is used instead — useful for testing against newer Nixpkgs. + +### Importing Nixpkgs + +The Nixpkgs tarball is imported with empty config and no overlays: + +```nix +pkgs = import nixpkgs' { + inherit system; + config = { }; + overlays = [ ]; +}; +``` + +This ensures a "pure" package set with no user-specific customizations that could +break CI reproducibility. + +--- + +## Pinned Dependencies (pinned.json) + +### Format + +The `pinned.json` file uses the [npins](https://github.com/andir/npins) v5 format. It +stores Git-based pins with full provenance information: + +```json +{ + "pins": { + "nixpkgs": { + "type": "Git", + "repository": { + "type": "GitHub", + "owner": "NixOS", + "repo": "nixpkgs" + }, + "branch": "nixpkgs-unstable", + "submodules": false, + "revision": "bde09022887110deb780067364a0818e89258968", + "url": "https://github.com/NixOS/nixpkgs/archive/bde09022887110deb780067364a0818e89258968.tar.gz", + "hash": "13mi187zpa4rw680qbwp7pmykjia8cra3nwvjqmsjba3qhlzif5l" + }, + "treefmt-nix": { + "type": "Git", + "repository": { + "type": "GitHub", + "owner": "numtide", + "repo": "treefmt-nix" + }, + "branch": "main", + "submodules": false, + "revision": "e96d59dff5c0d7fddb9d113ba108f03c3ef99eca", + "url": "https://github.com/numtide/treefmt-nix/archive/e96d59dff5c0d7fddb9d113ba108f03c3ef99eca.tar.gz", + "hash": "02gqyxila3ghw8gifq3mns639x86jcq079kvfvjm42mibx7z5fzb" + } + }, + "version": 5 +} +``` + +### Pin Fields + +| Field | Description | +|--------------|------------------------------------------------------------| +| `type` | Source type (`Git`) | +| `repository` | Source location (`GitHub` with owner + repo) | +| `branch` | Upstream branch being tracked | +| `submodules` | Whether to fetch Git submodules (`false`) | +| `revision` | Full commit SHA of the pinned revision | +| `url` | Direct tarball download URL for the pinned revision | +| `hash` | SHA-256 hash in Nix base32 encoding (not SRI format), passed to `fetchTarball` as `sha256` for integrity verification | + +### Why Two Pins?
+ +| Pin | Tracked Branch | Purpose | +|---------------|----------------------|--------------------------------------------| +| `nixpkgs` | `nixpkgs-unstable` | Base package set: compilers, tools, libraries | +| `treefmt-nix` | `main` | Code formatter orchestrator and its modules | + +The `nixpkgs-unstable` branch is used rather than a release branch to get recent +tool versions while still being reasonably stable. + +--- + +## Updating Pinned Dependencies + +### update-pinned.sh + +The update script is minimal: + +```bash +#!/usr/bin/env nix-shell +#!nix-shell -i bash -p npins + +set -euo pipefail + +cd "$(dirname "${BASH_SOURCE[0]}")" + +npins --lock-file pinned.json update +``` + +This: + +1. Enters a `nix-shell` with `npins` available +2. Changes to the `ci/` directory (where `pinned.json` lives) +3. Runs `npins update` to fetch the latest commit from each tracked branch +4. Updates `pinned.json` with new revisions and hashes + +### When to Update + +- **Regularly**: To pick up security patches and tool updates +- **When a formatter change is needed**: New treefmt-nix releases may add formatters +- **When CI breaks on upstream**: Pin to a known-good revision + +### Manual Update Procedure + +```bash +# From the repository root: +cd ci/ +./update-pinned.sh + +# Review the diff: +git diff pinned.json + +# Test locally: +nix-build -A fmt.check + +# Commit: +git add pinned.json +git commit -m "ci: update pinned nixpkgs and treefmt-nix" +``` + +--- + +## treefmt Integration + +### What is treefmt? + +[treefmt](https://github.com/numtide/treefmt) is a multi-language formatter orchestrator. +It runs multiple formatters in parallel and ensures every file type has exactly one formatter. +The `treefmt-nix` module provides a Nix-native way to configure it. 
+ +### Configuration in default.nix + +```nix +fmt = + let + treefmtNixSrc = fetchTarball { + inherit (pinned.treefmt-nix) url; + sha256 = pinned.treefmt-nix.hash; + }; + treefmtEval = (import treefmtNixSrc).evalModule pkgs { + projectRootFile = ".git/config"; + + settings.verbose = 1; + settings.on-unmatched = "debug"; + + programs.actionlint.enable = true; + + programs.biome = { + enable = true; + validate.enable = false; + settings.formatter = { + useEditorconfig = true; + }; + settings.javascript.formatter = { + quoteStyle = "single"; + semicolons = "asNeeded"; + }; + settings.json.formatter.enabled = false; + }; + settings.formatter.biome.excludes = [ + "*.min.js" + ]; + + programs.keep-sorted.enable = true; + + programs.nixfmt = { + enable = true; + package = pkgs.nixfmt; + }; + + programs.yamlfmt = { + enable = true; + settings.formatter = { + retain_line_breaks = true; + }; + }; + + programs.zizmor.enable = true; + }; +``` + +### treefmt Settings + +| Setting | Value | Purpose | +|----------------------------|---------------|---------------------------------------------| +| `projectRootFile` | `.git/config` | Marker file to detect the repository root | +| `settings.verbose` | `1` | Show which formatter processes each file | +| `settings.on-unmatched` | `"debug"` | Log unmatched files at debug level | + +### Configured Formatters + +#### actionlint +- **Purpose**: Lint GitHub Actions workflow YAML files +- **Scope**: `.github/workflows/*.yml` +- **Configuration**: Default settings + +#### biome +- **Purpose**: Format JavaScript and TypeScript files +- **Configuration**: + - `useEditorconfig = true` — Respects `.editorconfig` settings + - `quoteStyle = "single"` — Uses single quotes + - `semicolons = "asNeeded"` — Only adds semicolons where required by ASI + - `validate.enable = false` — No lint-level validation, only formatting + - `json.formatter.enabled = false` — Does not format JSON files +- **Exclusions**: `*.min.js` — Minified JavaScript files are 
skipped + +#### keep-sorted +- **Purpose**: Enforces sorted order in marked sections (e.g., dependency lists) +- **Configuration**: Default settings + +#### nixfmt +- **Purpose**: Format Nix expressions +- **Package**: Uses `pkgs.nixfmt` from the pinned Nixpkgs +- **Configuration**: Default nixfmt-rfc-style formatting + +#### yamlfmt +- **Purpose**: Format YAML files +- **Configuration**: + - `retain_line_breaks = true` — Preserves intentional blank lines + +#### zizmor +- **Purpose**: Security scanning for GitHub Actions workflows +- **Configuration**: Default settings +- **Detects**: Injection vulnerabilities, insecure defaults, untrusted inputs + +### Formatter Source Tree + +The treefmt evaluation creates a source tree from the repository, excluding `.git`: + +```nix +fs = pkgs.lib.fileset; +src = fs.toSource { + root = ../.; + fileset = fs.difference ../. (fs.maybeMissing ../.git); +}; +``` + +This ensures the formatting check operates on the full repository contents while +avoiding Git internals. 
+ +### Outputs + +The `fmt` attribute set exposes three derivations: + +```nix +{ + shell = treefmtEval.config.build.devShell; # nix develop .#fmt.shell + pkg = treefmtEval.config.build.wrapper; # treefmt binary + check = treefmtEval.config.build.check src; # nix build .#fmt.check +} +``` + +| Output | Type | Purpose | +|------------|-------------|--------------------------------------------------| +| `fmt.shell` | Dev shell | Interactive shell with treefmt available | +| `fmt.pkg` | Binary | The treefmt wrapper with all formatters configured| +| `fmt.check` | Check | A Nix derivation that fails if any file needs reformatting | + +--- + +## codeowners-validator Derivation + +### Purpose + +The codeowners-validator checks that the `ci/OWNERS` file is structurally valid: +- All referenced paths exist in the repository +- All referenced GitHub users/teams exist in the organization +- Glob patterns are syntactically correct + +### Build Definition + +```nix +{ + buildGoModule, + fetchFromGitHub, + fetchpatch, +}: +buildGoModule { + name = "codeowners-validator"; + src = fetchFromGitHub { + owner = "mszostok"; + repo = "codeowners-validator"; + rev = "f3651e3810802a37bd965e6a9a7210728179d076"; + hash = "sha256-5aSmmRTsOuPcVLWfDF6EBz+6+/Qpbj66udAmi1CLmWQ="; + }; + patches = [ + (fetchpatch { + name = "user-write-access-check"; + url = "https://github.com/mszostok/codeowners-validator/compare/f3651e3...840eeb8.patch"; + hash = "sha256-t3Dtt8SP9nbO3gBrM0nRE7+G6N/ZIaczDyVHYAG/6mU="; + }) + ./permissions.patch + ./owners-file-name.patch + ]; + postPatch = "rm -r docs/investigation"; + vendorHash = "sha256-R+pW3xcfpkTRqfS2ETVOwG8PZr0iH5ewroiF7u8hcYI="; +} +``` + +### Patches Applied + +#### 1. user-write-access-check (upstream PR #222) +Fetched from the upstream repository. Modifies the write-access validation logic. + +#### 2. 
permissions.patch +Undoes part of the upstream PR's write-access requirement: + +```diff + var reqScopes = map[github.Scope]struct{}{ +- github.ScopeReadOrg: {}, + } +``` + +And removes the push permission checks for teams and users: + +```diff + for _, t := range v.repoTeams { + if strings.EqualFold(t.GetSlug(), team) { +- if t.Permissions["push"] { +- return nil +- } +- return newValidateError(...) ++ return nil + } + } +``` + +This is necessary because Project Tick's OWNERS file is used for code review routing, +not for GitHub's native branch protection rules. Contributors listed in OWNERS don't +need write access to the repository. + +#### 3. owners-file-name.patch +Adds support for a custom CODEOWNERS file path via the `OWNERS_FILE` environment variable: + +```diff + func openCodeownersFile(dir string) (io.Reader, error) { ++ if file, ok := os.LookupEnv("OWNERS_FILE"); ok { ++ return fs.Open(file) ++ } ++ + var detectedFiles []string +``` + +This allows the validator to check `ci/OWNERS` instead of the default `.github/CODEOWNERS` +or `CODEOWNERS` paths. + +--- + +## CI Dev Shell + +The top-level `shell` attribute combines all CI tools: + +```nix +shell = pkgs.mkShell { + packages = [ + fmt.pkg + codeownersValidator + ]; +}; +``` + +This provides: +- `treefmt` — The configured multi-formatter +- `codeowners-validator` — The patched OWNERS validator + +Enter the shell: + +```bash +cd ci/ +nix-shell # or: nix develop +treefmt # format all files +codeowners-validator # validate OWNERS +``` + +--- + +## github-script Nix Shell + +The `ci/github-script/shell.nix` provides a separate dev shell for JavaScript CI scripts: + +```nix +{ + system ? builtins.currentSystem, + pkgs ? 
(import ../../ci { inherit system; }).pkgs, +}: + +pkgs.callPackage ( + { + gh, + importNpmLock, + mkShell, + nodejs, + }: + mkShell { + packages = [ + gh + importNpmLock.hooks.linkNodeModulesHook + nodejs + ]; + + npmDeps = importNpmLock.buildNodeModules { + npmRoot = ./.; + inherit nodejs; + }; + } +) { } +``` + +### Key Features + +1. **Shared Nixpkgs**: Imports the pinned `pkgs` from `../../ci` (the parent `default.nix`) +2. **Node.js**: Full Node.js runtime for running CI scripts +3. **GitHub CLI**: `gh` for authentication (`gh auth token` is used by the `run` CLI) +4. **npm Lockfile Integration**: `importNpmLock` builds `node_modules` from `package-lock.json` + in the Nix store, then `linkNodeModulesHook` symlinks it into the working directory + +--- + +## Relationship to Root flake.nix + +The root `flake.nix` defines the overall development environment: + +```nix +{ + description = "Project Tick is a project dedicated to providing developers + with ease of use and users with long-lasting software."; + + inputs = { + nixpkgs.url = "https://channels.nixos.org/nixos-unstable/nixexprs.tar.xz"; + }; + + outputs = { self, nixpkgs }: + let + systems = lib.systems.flakeExposed; + forAllSystems = lib.genAttrs systems; + nixpkgsFor = forAllSystems (system: nixpkgs.legacyPackages.${system}); + in + { + devShells = forAllSystems (system: ...); + formatter = forAllSystems (system: nixpkgsFor.${system}.nixfmt-rfc-style); + }; +} +``` + +The flake's `inputs.nixpkgs` uses `nixos-unstable` via Nix channels, while the CI +`pinned.json` uses a specific commit from `nixpkgs-unstable`. These are related but +independently pinned — the flake updates when `flake.lock` is refreshed, while CI +pins update only when `update-pinned.sh` is explicitly run. 
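Because the CI pin is an exact commit rather than a moving channel, it can be checked mechanically. The following is a minimal sketch, assuming the npins entry shape used by `ci/pinned.json` (`repository`, `revision`, `url`). The helper name `pinIsConsistent` is hypothetical and is not something the repository ships.

```javascript
// Hypothetical helper (an assumption, not part of the repository):
// confirm that a pin entry's archive URL embeds the revision it claims to pin.
function pinIsConsistent(pin) {
  const { owner, repo } = pin.repository
  const expected = `https://github.com/${owner}/${repo}/archive/${pin.revision}.tar.gz`
  return pin.url === expected
}

// Entry copied from the nixpkgs pin in ci/pinned.json (hash omitted for brevity).
const nixpkgsPin = {
  repository: { type: 'GitHub', owner: 'NixOS', repo: 'nixpkgs' },
  branch: 'nixpkgs-unstable',
  revision: 'bde09022887110deb780067364a0818e89258968',
  url: 'https://github.com/NixOS/nixpkgs/archive/bde09022887110deb780067364a0818e89258968.tar.gz',
}

console.log(pinIsConsistent(nixpkgsPin))
```

A check along these lines only compares strings. It does not replace the hash verification that Nix performs when fetching the tarball.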
+ +### When Each Is Used + +| Context | Nix Source | +|-------------------|-----------------------------------------------| +| `nix develop` | Root `flake.nix` → `flake.lock` → nixpkgs | +| CI formatting check| `ci/default.nix` → `ci/pinned.json` → nixpkgs| +| CI script dev shell| `ci/github-script/shell.nix` → `ci/default.nix` | + +--- + +## Evaluation and Build Commands + +### Building the Format Check + +```bash +# From repository root: +nix-build ci/ -A fmt.check + +# Or with flakes: +nix build .#fmt.check +``` + +This produces a derivation that: +1. Copies the entire source tree (minus `.git`) into the Nix store +2. Runs all configured formatters +3. Fails with a diff if any file would be reformatted + +### Entering the CI Shell + +```bash +# Nix classic: +nix-shell ci/ + +# Nix flakes: +nix develop ci/ +``` + +### Building codeowners-validator + +```bash +nix-build ci/ -A codeownersValidator +./result/bin/codeowners-validator +``` + +--- + +## Troubleshooting + +### "hash mismatch" on pinned.json update + +If `update-pinned.sh` produces a hash mismatch, the upstream source has changed +at the same branch tip. Re-run the update: + +```bash +cd ci/ +./update-pinned.sh +``` + +### Formatter version mismatch + +If local formatting produces different results than CI: + +1. Ensure you're using the same Nixpkgs pin: `nix-shell ci/` +2. Run `treefmt` from within the CI shell +3. If the issue persists, update pins: `./update-pinned.sh` + +### codeowners-validator fails to build + +The Go module build requires network access for vendored dependencies. 
Ensure: +- The `vendorHash` in `codeowners-validator/default.nix` matches the actual Go module checksum +- If upstream dependencies change, update `vendorHash` + +--- + +## Security Considerations + +- **Hash verification**: All fetched tarballs are verified against their SRI hashes +- **No overlays**: Nixpkgs is imported with empty overlays to prevent supply-chain attacks +- **Pinned revisions**: Exact commit SHAs prevent upstream branch tampering +- **zizmor**: GitHub Actions workflows are scanned for injection vulnerabilities +- **actionlint**: Workflow syntax is validated to catch misconfigurations + +--- + +## Summary + +The Nix infrastructure provides: + +1. **Reproducibility** — Identical tools and versions across all CI runs and developer machines +2. **Composability** — Each component (treefmt, codeowners-validator) is independently buildable +3. **Security** — Hash-verified dependencies, security scanning, no arbitrary overlays +4. **Developer experience** — `nix-shell` provides a ready-to-use environment with zero manual setup diff --git a/docs/handbook/ci/overview.md b/docs/handbook/ci/overview.md new file mode 100644 index 0000000000..19d42cfe2a --- /dev/null +++ b/docs/handbook/ci/overview.md @@ -0,0 +1,494 @@ +# CI Infrastructure — Overview + +## Purpose + +The `ci/` directory contains the Continuous Integration infrastructure for the Project Tick monorepo. +It provides reproducible builds, automated code quality checks, commit message validation, +pull request lifecycle management, and code ownership enforcement — all orchestrated through +Nix expressions and JavaScript-based GitHub Actions scripts. + +The CI system is designed around three core principles: + +1. **Reproducibility** — Pinned Nix dependencies ensure identical builds across environments +2. **Conventional Commits** — Enforced commit message format for automated changelog generation +3. 
**Ownership-driven review** — CODEOWNERS-style file ownership with automated review routing + +--- + +## Directory Structure + +``` +ci/ +├── OWNERS # Code ownership file (CODEOWNERS format) +├── README.md # CI README with local testing instructions +├── default.nix # Nix CI entry point — treefmt, codeowners-validator, shell +├── pinned.json # Pinned Nixpkgs + treefmt-nix revisions (npins format) +├── update-pinned.sh # Script to update pinned.json via npins +├── supportedBranches.js # Branch classification logic for CI decisions +├── codeowners-validator/ # Builds codeowners-validator from source (Go) +│ ├── default.nix # Nix derivation for codeowners-validator +│ ├── owners-file-name.patch # Patch: custom OWNERS file path via OWNERS_FILE env var +│ └── permissions.patch # Patch: remove write-access check (not needed for non-native CODEOWNERS) +└── github-script/ # JavaScript CI scripts for GitHub Actions + ├── run # CLI entry point for local testing (commander-based) + ├── lint-commits.js # Commit message linter (Conventional Commits) + ├── prepare.js # PR preparation: mergeability, branch targeting, touched files + ├── reviews.js # Review lifecycle: post, dismiss, minimize bot reviews + ├── get-pr-commit-details.js # Extract commit SHAs, subjects, changed paths via git + ├── withRateLimit.js # GitHub API rate limiting with Bottleneck + ├── package.json # Node.js dependencies (@actions/core, @actions/github, bottleneck, commander) + ├── package-lock.json # Lockfile for reproducible npm installs + ├── shell.nix # Nix dev shell for github-script (Node.js + gh CLI) + ├── README.md # Local testing documentation + ├── .editorconfig # Editor configuration + ├── .gitignore # Git ignore rules + └── .npmrc # npm configuration +``` + +--- + +## How CI Works End-to-End + +### 1. Triggering + +CI runs are triggered by GitHub Actions workflows (defined in `.github/workflows/`) when +pull requests are opened, updated, or merged against supported branches. 
The `supportedBranches.js` +module classifies branches to determine which checks to run. + +### 2. Environment Setup + +The CI environment is bootstrapped via `ci/default.nix`, which: + +- Reads pinned dependency revisions from `ci/pinned.json` +- Fetches the exact Nixpkgs tarball at the pinned commit +- Imports `treefmt-nix` for code formatting +- Builds the `codeowners-validator` tool with Project Tick–specific patches +- Exposes a development shell with all CI tools available + +```nix +# ci/default.nix — entry point +let + pinned = (builtins.fromJSON (builtins.readFile ./pinned.json)).pins; +in +{ + system ? builtins.currentSystem, + nixpkgs ? null, +}: +let + nixpkgs' = + if nixpkgs == null then + fetchTarball { + inherit (pinned.nixpkgs) url; + sha256 = pinned.nixpkgs.hash; + } + else + nixpkgs; + + pkgs = import nixpkgs' { + inherit system; + config = { }; + overlays = [ ]; + }; +``` + +### 3. Code Formatting (treefmt) + +The `default.nix` configures `treefmt-nix` with multiple formatters: + +| Formatter | Purpose | Configuration | +|-------------|--------------------------------------|----------------------------------------------| +| `actionlint` | GitHub Actions workflow linting | Enabled, no custom config | +| `biome` | JavaScript/TypeScript formatting | Single quotes, no semicolons, no JSON format | +| `keep-sorted`| Sorted list enforcement | Enabled, no custom config | +| `nixfmt` | Nix expression formatting | Uses `pkgs.nixfmt` | +| `yamlfmt` | YAML formatting | Retains line breaks | +| `zizmor` | GitHub Actions security scanning | Enabled, no custom config | + +Biome is configured with specific style rules: + +```nix +programs.biome = { + enable = true; + validate.enable = false; + settings.formatter = { + useEditorconfig = true; + }; + settings.javascript.formatter = { + quoteStyle = "single"; + semicolons = "asNeeded"; + }; + settings.json.formatter.enabled = false; +}; +settings.formatter.biome.excludes = [ + "*.min.js" +]; +``` + +### 4. 
Commit Linting + +When a PR is opened or updated, `ci/github-script/lint-commits.js` validates every commit +message against the Conventional Commits specification. It checks: + +- Format: `type(scope): subject` +- No `fixup!`, `squash!`, or `amend!` prefixes (must be rebased before merge) +- No trailing period on subject line +- Lowercase first letter in subject +- Known scopes matching monorepo project directories + +The supported types are: + +```javascript +const CONVENTIONAL_TYPES = [ + 'build', 'chore', 'ci', 'docs', 'feat', 'fix', + 'perf', 'refactor', 'revert', 'style', 'test', +] +``` + +And the known scopes correspond to monorepo directories: + +```javascript +const KNOWN_SCOPES = [ + 'archived', 'cgit', 'ci', 'cmark', 'corebinutils', + 'forgewrapper', 'genqrcode', 'hooks', 'images4docker', + 'json4cpp', 'libnbtplusplus', 'meshmc', 'meta', 'mnv', + 'neozip', 'tomlplusplus', 'repo', 'deps', +] +``` + +### 5. PR Preparation and Validation + +The `ci/github-script/prepare.js` script handles PR lifecycle: + +1. **Mergeability check** — Polls GitHub's mergeability computation with exponential backoff + (5s, 10s, 20s, 40s, 80s retries) +2. **Branch classification** — Classifies base and head branches using `supportedBranches.js` +3. **Base branch suggestion** — For WIP branches, computes the optimal base branch by comparing + merge-base commit distances across `master` and all release branches +4. **Merge conflict detection** — If the PR has conflicts, uses the head SHA directly; otherwise + uses the merge commit SHA +5. **Touched file detection** — Identifies which CI-relevant paths were modified: + - `ci` — any file under `ci/` + - `pinned` — `ci/pinned.json` specifically + - `github` — any file under `.github/` + +### 6. 
Review Lifecycle Management + +The `ci/github-script/reviews.js` module manages bot reviews: + +- **`postReview()`** — Posts or updates a review with a tracking comment tag + (``) +- **`dismissReviews()`** — Dismisses, minimizes (marks as outdated), or resolves bot reviews + when the underlying issue is fixed +- Reviews are tagged with a `reviewKey` to allow multiple independent review concerns + on the same PR + +### 7. Rate Limiting + +All GitHub API calls go through `ci/github-script/withRateLimit.js`, which uses the +Bottleneck library for request throttling: + +- Read requests: controlled by a reservoir updated from the GitHub rate limit API +- Write requests (`POST`, `PUT`, `PATCH`, `DELETE`): minimum 1 second between calls +- The reservoir keeps 1000 spare requests for other concurrent jobs +- Reservoir is refreshed every 60 seconds +- Requests to `github.com` (not the API), `/rate_limit`, and `/search/` endpoints bypass throttling + +### 8. Code Ownership Validation + +The `ci/codeowners-validator/` builds a patched version of the +[codeowners-validator](https://github.com/mszostok/codeowners-validator) tool: + +- Fetched from GitHub at a specific pinned commit +- Two patches applied: + - `owners-file-name.patch` — Adds support for custom CODEOWNERS file path via `OWNERS_FILE` env var + - `permissions.patch` — Removes the write-access permission check (not needed since Project Tick + uses an `OWNERS` file rather than GitHub's native `CODEOWNERS`) + +This validates the `ci/OWNERS` file against the actual repository structure and GitHub +organization membership. + +--- + +## Component Interaction Flow + +``` +┌─────────────────────────────────────────┐ +│ GitHub Actions Workflow │ +│ (.github/workflows/*.yml) │ +└──────────────┬──────────────────────────┘ + │ triggers + ▼ +┌──────────────────────────────────────────┐ +│ ci/default.nix │ +│ ┌─────────┐ ┌──────────────────────┐ │ +│ │pinned. 
│ │ treefmt-nix │ │ +│ │json │──│ (formatting checks) │ │ +│ └─────────┘ └──────────────────────┘ │ +│ ┌──────────────────────┐ │ +│ │ codeowners-validator │ │ +│ │ (OWNERS validation) │ │ +│ └──────────────────────┘ │ +└──────────────┬───────────────────────────┘ + │ also triggers + ▼ +┌──────────────────────────────────────────┐ +│ ci/github-script/ │ +│ ┌────────────────┐ ┌───────────────┐ │ +│ │ prepare.js │ │ lint-commits │ │ +│ │ (PR validation) │ │ (commit msg) │ │ +│ └───────┬────────┘ └──────┬────────┘ │ +│ │ │ │ +│ ┌───────▼────────┐ ┌──────▼────────┐ │ +│ │ reviews.js │ │ supported │ │ +│ │ (bot reviews) │ │ Branches.js │ │ +│ └───────┬────────┘ └───────────────┘ │ +│ │ │ +│ ┌───────▼────────┐ │ +│ │ withRateLimit │ │ +│ │ (API throttle) │ │ +│ └────────────────┘ │ +└──────────────────────────────────────────┘ +``` + +--- + +## Key Design Decisions + +### Why Nix for CI? + +Nix ensures that every CI run uses the exact same versions of tools, compilers, and +libraries. The `pinned.json` file locks specific commits of Nixpkgs and treefmt-nix, +eliminating "works on my machine" problems. + +### Why a custom OWNERS file? + +GitHub's native CODEOWNERS has limitations: +- Must be in `.github/CODEOWNERS`, `CODEOWNERS`, or `docs/CODEOWNERS` +- Requires repository write access for all listed owners +- Cannot be extended with custom validation + +Project Tick uses `ci/OWNERS` with the same glob pattern syntax but adds: +- Custom file path support (via the `OWNERS_FILE` environment variable) +- No write-access requirement (via the permissions patch) +- Integration with the codeowners-validator for structural validation + +### Why Bottleneck for rate limiting? + +GitHub Actions can run many jobs in parallel, and each job makes API calls. Without +throttling, a large CI run could exhaust the GitHub API rate limit (5000 requests/hour +for authenticated requests). 
Bottleneck provides: +- Concurrency control (1 concurrent request by default) +- Reservoir-based rate limiting (dynamically updated from the API) +- Separate throttling for mutative requests (1 second minimum between writes) + +### Why local testing support? + +The `ci/github-script/run` CLI allows developers to test CI scripts locally before +pushing. This accelerates development and reduces CI feedback loops: + +```bash +cd ci/github-script +nix-shell # sets up Node.js + dependencies +gh auth login # authenticate with GitHub +./run lint-commits YongDo-Hyun Project-Tick 123 +./run prepare YongDo-Hyun Project-Tick 123 +``` + +--- + +## Pinned Dependencies + +The CI system pins two external Nix sources: + +| Dependency | Source | Branch | Purpose | +|-------------|----------------------------------------------|--------------------|--------------------------------| +| `nixpkgs` | `github:NixOS/nixpkgs` | `nixpkgs-unstable` | Base package set for CI tools | +| `treefmt-nix`| `github:numtide/treefmt-nix` | `main` | Multi-formatter orchestrator | + +Pins are stored in `ci/pinned.json` in npins v5 format: + +```json +{ + "pins": { + "nixpkgs": { + "type": "Git", + "repository": { + "type": "GitHub", + "owner": "NixOS", + "repo": "nixpkgs" + }, + "branch": "nixpkgs-unstable", + "revision": "bde09022887110deb780067364a0818e89258968", + "url": "https://github.com/NixOS/nixpkgs/archive/bde09022887110deb780067364a0818e89258968.tar.gz", + "hash": "13mi187zpa4rw680qbwp7pmykjia8cra3nwvjqmsjba3qhlzif5l" + }, + "treefmt-nix": { + "type": "Git", + "repository": { + "type": "GitHub", + "owner": "numtide", + "repo": "treefmt-nix" + }, + "branch": "main", + "revision": "e96d59dff5c0d7fddb9d113ba108f03c3ef99eca", + "url": "https://github.com/numtide/treefmt-nix/archive/e96d59dff5c0d7fddb9d113ba108f03c3ef99eca.tar.gz", + "hash": "02gqyxila3ghw8gifq3mns639x86jcq079kvfvjm42mibx7z5fzb" + } + }, + "version": 5 +} +``` + +To update pins: + +```bash +cd ci/ +./update-pinned.sh +``` + +This 
runs `npins --lock-file pinned.json update` to fetch the latest revisions. + +--- + +## Node.js Dependencies (github-script) + +The `ci/github-script/package.json` declares: + +```json +{ + "private": true, + "dependencies": { + "@actions/core": "1.11.1", + "@actions/github": "6.0.1", + "bottleneck": "2.19.5", + "commander": "14.0.3" + } +} +``` + +| Package | Version | Purpose | +|-------------------|----------|-----------------------------------------------| +| `@actions/core` | `1.11.1` | GitHub Actions core utilities (logging, outputs) | +| `@actions/github` | `6.0.1` | GitHub API client (Octokit wrapper) | +| `bottleneck` | `2.19.5` | Rate limiting / request throttling | +| `commander` | `14.0.3` | CLI argument parsing for local `./run` tool | + +These versions are kept in sync with the +[actions/github-script](https://github.com/actions/github-script) action. + +--- + +## Nix Dev Shell + +The `ci/github-script/shell.nix` provides a development environment for working on +the CI scripts locally: + +```nix +{ + system ? builtins.currentSystem, + pkgs ? 
(import ../../ci { inherit system; }).pkgs, +}: + +pkgs.callPackage ( + { + gh, + importNpmLock, + mkShell, + nodejs, + }: + mkShell { + packages = [ + gh + importNpmLock.hooks.linkNodeModulesHook + nodejs + ]; + + npmDeps = importNpmLock.buildNodeModules { + npmRoot = ./.; + inherit nodejs; + }; + } +) { } +``` + +This gives you: +- `nodejs` — Node.js runtime +- `gh` — GitHub CLI for authentication +- `importNpmLock.hooks.linkNodeModulesHook` — Automatically links `node_modules` from the Nix store + +--- + +## Outputs Exposed by default.nix + +The `ci/default.nix` exposes the following attributes: + +| Attribute | Type | Description | +|----------------------|-----------|--------------------------------------------------| +| `pkgs` | Nixpkgs | The pinned Nixpkgs package set | +| `fmt.shell` | Derivation| Dev shell with treefmt formatter available | +| `fmt.pkg` | Derivation| The treefmt wrapper binary | +| `fmt.check` | Derivation| A check derivation that fails if formatting drifts| +| `codeownersValidator`| Derivation| Patched codeowners-validator binary | +| `shell` | Derivation| Combined CI dev shell (fmt + codeowners-validator)| + +```nix +rec { + inherit pkgs fmt; + codeownersValidator = pkgs.callPackage ./codeowners-validator { }; + + shell = pkgs.mkShell { + packages = [ + fmt.pkg + codeownersValidator + ]; + }; +} +``` + +--- + +## Integration with Root Flake + +The root `flake.nix` provides: + +- Dev shells for all supported systems (`aarch64-linux`, `x86_64-linux`, etc.) +- A formatter (`nixfmt-rfc-style`) +- The CI `default.nix` is imported indirectly via the flake for Nix-based CI runs + +```nix +{ + description = "Project Tick is a project dedicated to providing developers + with ease of use and users with long-lasting software."; + + inputs = { + nixpkgs.url = "https://channels.nixos.org/nixos-unstable/nixexprs.tar.xz"; + }; + ... 
+} +``` + +--- + +## Summary of CI Checks + +| Check | Tool / Script | Scope | +|--------------------------|---------------------------|------------------------------------| +| Code formatting | treefmt (biome, nixfmt, yamlfmt, actionlint, zizmor) | All source files | +| Commit message format | `lint-commits.js` | All commits in a PR | +| PR mergeability | `prepare.js` | Every PR | +| Base branch targeting | `prepare.js` + `supportedBranches.js` | WIP → development PRs | +| Code ownership validity | `codeowners-validator` | `ci/OWNERS` file | +| GitHub Actions security | `zizmor` (via treefmt) | `.github/workflows/*.yml` | +| Sorted list enforcement | `keep-sorted` (via treefmt)| Files with keep-sorted markers | + +--- + +## Related Documentation + +- [Nix Infrastructure](nix-infrastructure.md) — Deep dive into the Nix expressions +- [Commit Linting](commit-linting.md) — Commit message conventions and validation rules +- [PR Validation](pr-validation.md) — Pull request checks and lifecycle management +- [Branch Strategy](branch-strategy.md) — Branch naming, classification, and release branches +- [CODEOWNERS](codeowners.md) — Ownership file format and validation +- [Formatting](formatting.md) — Code formatting configuration and tools +- [Rate Limiting](rate-limiting.md) — GitHub API rate limiting strategy diff --git a/docs/handbook/ci/pr-validation.md b/docs/handbook/ci/pr-validation.md new file mode 100644 index 0000000000..f7933d3e75 --- /dev/null +++ b/docs/handbook/ci/pr-validation.md @@ -0,0 +1,378 @@ +# PR Validation + +## Overview + +The `ci/github-script/prepare.js` script runs on every pull request to validate +mergeability, classify branches, suggest optimal base branches, detect merge conflicts, +and identify which CI-relevant paths were touched. It also manages bot review comments +to guide contributors toward correct PR targeting. + +--- + +## What prepare.js Does + +1. **Checks PR state** — Ensures the PR is still open +2. 
**Waits for mergeability** — Polls GitHub until mergeability is computed +3. **Classifies branches** — Categorizes base and head branches using `supportedBranches.js` +4. **Validates branch targeting** — Warns if a feature branch targets a release branch +5. **Suggests better base branches** — For WIP branches, finds the optimal base by comparing + commit distances +6. **Computes merge SHAs** — Determines the merge commit SHA and target comparison SHA +7. **Detects touched CI paths** — Identifies changes to `ci/`, `ci/pinned.json`, `.github/` + +--- + +## Mergeability Check + +GitHub computes merge status asynchronously. The script polls with exponential backoff: + +```javascript +for (const retryInterval of [5, 10, 20, 40, 80]) { + core.info('Checking whether the pull request can be merged...') + const prInfo = ( + await github.rest.pulls.get({ + ...context.repo, + pull_number, + }) + ).data + + if (prInfo.state !== 'open') throw new Error('PR is not open anymore.') + + if (prInfo.mergeable == null) { + core.info( + `GitHub is still computing mergeability, waiting ${retryInterval}s...`, + ) + await new Promise((resolve) => setTimeout(resolve, retryInterval * 1000)) + continue + } + // ... process PR +} +throw new Error( + 'Timed out waiting for GitHub to compute mergeability. Check https://www.githubstatus.com.', +) +``` + +### Retry Schedule + +| Attempt | Wait Time | Cumulative Wait | +|---------|-----------|-----------------| +| 1 | 5 seconds | 5 seconds | +| 2 | 10 seconds| 15 seconds | +| 3 | 20 seconds| 35 seconds | +| 4 | 40 seconds| 75 seconds | +| 5 | 80 seconds| 155 seconds | + +If mergeability is still not computed after ~2.5 minutes, the script throws an error +with a link to [githubstatus.com](https://www.githubstatus.com) for checking GitHub's +system status. 
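The retry schedule above follows directly from the interval list that the polling loop iterates over. As a small sketch, the table can be rebuilt from those intervals. The helper name `backoffSchedule` is an assumption for illustration and does not exist in `prepare.js`.

```javascript
// Hypothetical helper (not part of prepare.js): compute the per-attempt wait
// and the running total for a list of retry intervals given in seconds.
function backoffSchedule(intervals) {
  let cumulative = 0
  return intervals.map((wait, index) => {
    cumulative += wait
    return { attempt: index + 1, wait, cumulative }
  })
}

// Same intervals as the polling loop shown earlier in this section.
const schedule = backoffSchedule([5, 10, 20, 40, 80])
for (const { attempt, wait, cumulative } of schedule) {
  console.log(`attempt ${attempt}: wait ${wait}s, cumulative ${cumulative}s`)
}
```

The final row has a cumulative wait of 155 seconds, which is where the "~2.5 minutes" figure comes from.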
+ +--- + +## Branch Classification + +Both the base and head branches are classified using `supportedBranches.js`: + +```javascript +const baseClassification = classify(base.ref) +core.setOutput('base', baseClassification) + +const headClassification = + base.repo.full_name === head.repo.full_name + ? classify(head.ref) + : { type: ['wip'] } +core.setOutput('head', headClassification) +``` + +### Fork Handling + +For cross-fork PRs (where the head repo differs from the base repo), the head branch +is always classified as `{ type: ['wip'] }` regardless of its name. This prevents +fork branches from being treated as development branches. + +### Classification Output + +Each classification produces: + +```javascript +{ + branch: 'release-1.0', + order: 1, + stable: true, + type: ['development', 'primary'], + version: '1.0', +} +``` + +| Field | Description | +|-----------|------------------------------------------------------| +| `branch` | The full branch name | +| `order` | Ranking for base-branch preference (lower = better) | +| `stable` | Whether the branch has a version suffix | +| `type` | Array of type tags | +| `version` | Extracted version number, or `'dev'` | + +--- + +## Release Branch Targeting Warning + +If a WIP branch (feature, fix, etc.) targets a stable release branch, the script +checks whether it's a backport: + +```javascript +if ( + baseClassification.stable && + baseClassification.type.includes('primary') +) { + const headPrefix = head.ref.split('-')[0] + if (!['backport', 'fix', 'revert'].includes(headPrefix)) { + core.warning( + `This PR targets release branch \`${base.ref}\`. ` + + 'New features should typically target \`master\`.', + ) + } +} +``` + +| Head Branch Prefix | Allowed to target release? 
| Reason | +|-------------------|---------------------------|---------------------| +| `backport-*` | Yes | Explicit backport | +| `fix-*` | Yes | Bug fix for release | +| `revert-*` | Yes | Reverting a change | +| `feature-*` | Warning issued | Should target master| +| `wip-*` | Warning issued | Should target master| + +--- + +## Base Branch Suggestion + +For WIP branches, the script computes the optimal base branch by analyzing commit +distances from the head to all candidate base branches: + +### Algorithm + +1. **List all branches** — Fetch all branches in the repository via pagination +2. **Filter candidates** — Keep `master` and all stable primary branches (release-*) +3. **Compute merge bases** — For each candidate, find the merge-base commit with the + PR head and count commits between them + +```javascript +async function mergeBase({ branch, order, version }) { + const { data } = await github.rest.repos.compareCommitsWithBasehead({ + ...context.repo, + basehead: `${branch}...${head.sha}`, + per_page: 1, + page: 2, + }) + return { + branch, + order, + version, + commits: data.total_commits, + sha: data.merge_base_commit.sha, + } +} +``` + +4. **Select the best** — The branch with the fewest commits ahead wins. If there's a tie, + the branch with the lowest `order` wins (i.e., `master` over `release-*`). + +```javascript +let candidates = [await mergeBase(classify('master'))] +for (const release of releases) { + const nextCandidate = await mergeBase(release) + if (candidates[0].commits === nextCandidate.commits) + candidates.push(nextCandidate) + if (candidates[0].commits > nextCandidate.commits) + candidates = [nextCandidate] + if (candidates[0].commits < 10000) break +} + +const best = candidates.sort((a, b) => a.order - b.order).at(0) +``` + +5. 
**Post review if mismatch** — If the suggested base differs from the current base, + a bot review is posted: + +```javascript +if (best.branch !== base.ref) { + const current = await mergeBase(classify(base.ref)) + const body = [ + `This PR targets \`${current.branch}\`, but based on the commit history ` + + `\`${best.branch}\` appears to be a better fit ` + + `(${current.commits - best.commits} fewer commits ahead).`, + '', + `If this is intentional, you can ignore this message. Otherwise:`, + `- [Change the base branch](...) to \`${best.branch}\`.`, + ].join('\n') + + await postReview({ github, context, core, dry, body, reviewKey }) +} +``` + +6. **Dismiss reviews if correct** — If the base branch matches the suggestion, any + previous bot reviews are dismissed. + +### Early Termination + +The algorithm stops evaluating release branches once the candidate count drops below +10,000 commits. This prevents unnecessary API calls for branches that are clearly +not good candidates. + +--- + +## Merge SHA Computation + +The script computes two key SHAs for downstream CI jobs: + +### Mergeable PR + +```javascript +if (prInfo.mergeable) { + core.info('The PR can be merged.') + mergedSha = prInfo.merge_commit_sha + targetSha = ( + await github.rest.repos.getCommit({ + ...context.repo, + ref: prInfo.merge_commit_sha, + }) + ).data.parents[0].sha +} +``` + +- `mergedSha` — GitHub's trial merge commit SHA +- `targetSha` — The first parent of the merge commit (base branch tip) + +### Conflicting PR + +```javascript +else { + core.warning('The PR has a merge conflict.') + mergedSha = head.sha + targetSha = ( + await github.rest.repos.compareCommitsWithBasehead({ + ...context.repo, + basehead: `${base.sha}...${head.sha}`, + }) + ).data.merge_base_commit.sha +} +``` + +- `mergedSha` — Falls back to the head SHA (no merge commit exists) +- `targetSha` — The merge-base between base and head + +--- + +## Touched Path Detection + +The script identifies which CI-relevant paths were 
modified in the PR: + +```javascript +const files = ( + await github.paginate(github.rest.pulls.listFiles, { + ...context.repo, + pull_number, + per_page: 100, + }) +).map((file) => file.filename) + +const touched = [] +if (files.some((f) => f.startsWith('ci/'))) touched.push('ci') +if (files.includes('ci/pinned.json')) touched.push('pinned') +if (files.some((f) => f.startsWith('.github/'))) touched.push('github') +core.setOutput('touched', touched) +``` + +| Touched Tag | Condition | Use Case | +|------------|------------------------------------------|---------------------------------| +| `ci` | Any file under `ci/` was changed | Re-run CI infrastructure checks | +| `pinned` | `ci/pinned.json` specifically changed | Validate pin integrity | +| `github` | Any file under `.github/` was changed | Re-run workflow lint checks | + +--- + +## Outputs + +The script sets the following outputs for downstream workflow jobs: + +| Output | Type | Description | +|-------------|--------|---------------------------------------------------| +| `base` | Object | Base branch classification (branch, type, version) | +| `head` | Object | Head branch classification | +| `mergedSha` | String | Merge commit SHA (or head SHA if conflicting) | +| `targetSha` | String | Base comparison SHA | +| `touched` | Array | Which CI-relevant paths were modified | + +--- + +## Review Lifecycle + +The `prepare.js` script integrates with `reviews.js` for bot review management: + +### Posting a Review + +When the script detects a branch targeting issue, it posts a `REQUEST_CHANGES` review: + +```javascript +await postReview({ github, context, core, dry, body, reviewKey: 'prepare' }) +``` + +The review body includes: +- A description of the issue +- A comparison of commit distances +- A link to GitHub's "change base branch" documentation + +### Dismissing Reviews + +When the issue is resolved (correct base branch), previous reviews are dismissed: + +```javascript +await dismissReviews({ github, context, 
core, dry, reviewKey: 'prepare' }) +``` + +The `reviewKey` (`'prepare'`) ensures only reviews posted by this script are affected. + +--- + +## Dry Run Mode + +When the `--no-dry` flag is NOT passed (default in local testing), all mutative +operations (posting/dismissing reviews) are skipped: + +```javascript +module.exports = async ({ github, context, core, dry }) => { + // ... + if (!dry) { + await github.rest.pulls.createReview({ ... }) + } +} +``` + +This allows safe local testing without modifying real PRs. + +--- + +## Local Testing + +```bash +cd ci/github-script +nix-shell +gh auth login + +# Dry run (default — no changes to the PR): +./run prepare YongDo-Hyun Project-Tick 123 + +# Live run (actually posts/dismisses reviews): +./run prepare YongDo-Hyun Project-Tick 123 --no-dry +``` + +--- + +## Error Conditions + +| Condition | Behavior | +|-------------------------------------|----------------------------------------------| +| PR is closed | Throws: `"PR is not open anymore."` | +| Mergeability timeout | Throws: `"Timed out waiting for GitHub..."` | +| API rate limit exceeded | Handled by `withRateLimit.js` | +| Merge conflict | Warning issued; head SHA used as mergedSha | +| Wrong base branch | REQUEST_CHANGES review posted | diff --git a/docs/handbook/ci/rate-limiting.md b/docs/handbook/ci/rate-limiting.md new file mode 100644 index 0000000000..4b349ee2b4 --- /dev/null +++ b/docs/handbook/ci/rate-limiting.md @@ -0,0 +1,321 @@ +# Rate Limiting + +## Overview + +The CI system interacts heavily with the GitHub REST API for PR validation, commit +analysis, review management, and branch comparison. To prevent exhausting the +GitHub API rate limit (5,000 requests/hour for authenticated tokens), all API calls +are routed through `ci/github-script/withRateLimit.js`, which uses the +[Bottleneck](https://github.com/SGrondin/bottleneck) library for request throttling. 
+ +--- + +## Architecture + +### Request Flow + +``` +┌──────────────────────────┐ +│ CI Script │ +│ (lint-commits.js, │ +│ prepare.js, etc.) │ +└────────────┬─────────────┘ + │ github.rest.* + ▼ +┌──────────────────────────┐ +│ withRateLimit wrapper │ +│ ┌──────────────────┐ │ +│ │ allLimits │ │ ← Bottleneck (maxConcurrent: 1, reservoir: dynamic) +│ │ (all requests) │ │ +│ └──────────────────┘ │ +│ ┌──────────────────┐ │ +│ │ writeLimits │ │ ← Bottleneck (minTime: 1000ms) chained to allLimits +│ │ (POST/PUT/PATCH/ │ │ +│ │ DELETE only) │ │ +│ └──────────────────┘ │ +└────────────┬─────────────┘ + │ + ▼ +┌──────────────────────────┐ +│ GitHub REST API │ +│ api.github.com │ +└──────────────────────────┘ +``` + +--- + +## Implementation + +### Module Signature + +```javascript +module.exports = async ({ github, core, maxConcurrent = 1 }, callback) => { +``` + +| Parameter | Type | Default | Description | +|----------------|----------|---------|--------------------------------------| +| `github` | Object | — | Octokit instance from `@actions/github` | +| `core` | Object | — | `@actions/core` for logging | +| `maxConcurrent` | Number | `1` | Maximum concurrent API requests | +| `callback` | Function| — | The script logic to execute | + +### Bottleneck Configuration + +Two Bottleneck limiters are configured: + +#### allLimits — Controls all requests + +```javascript +const allLimits = new Bottleneck({ + maxConcurrent, + reservoir: 0, // Updated dynamically +}) +``` + +- `maxConcurrent: 1` — Only one API request at a time (prevents burst usage) +- `reservoir: 0` — Starts empty; updated by `updateReservoir()` before first use + +#### writeLimits — Additional throttle for mutative requests + +```javascript +const writeLimits = new Bottleneck({ minTime: 1000 }).chain(allLimits) +``` + +- `minTime: 1000` — At least 1 second between write requests +- `.chain(allLimits)` — Write requests also go through the global limiter + +--- + +## Request Classification + +The Octokit 
`request` hook intercepts every API call and routes it through +the appropriate limiter: + +```javascript +github.hook.wrap('request', async (request, options) => { + // Bypass: different host (e.g., github.com for raw downloads) + if (options.url.startsWith('https://github.com')) return request(options) + + // Bypass: rate limit endpoint (doesn't count against quota) + if (options.url === '/rate_limit') return request(options) + + // Bypass: search endpoints (separate rate limit pool) + if (options.url.startsWith('/search/')) return request(options) + + stats.requests++ + + if (['POST', 'PUT', 'PATCH', 'DELETE'].includes(options.method)) + return writeLimits.schedule(request.bind(null, options)) + else + return allLimits.schedule(request.bind(null, options)) +}) +``` + +### Bypass Rules + +| URL Pattern | Reason | +|-------------------------------|---------------------------------------------| +| `https://github.com/*` | Raw file downloads, not API calls | +| `/rate_limit` | Meta-endpoint, doesn't count against quota | +| `/search/*` | Separate rate limit pool (30 requests/min) | + +### Request Routing + +| HTTP Method | Limiter | Throttle Rule | +|----------------------|------------------|----------------------------------| +| `GET` | `allLimits` | Concurrency-limited + reservoir | +| `POST` | `writeLimits` | 1 second minimum + concurrency | +| `PUT` | `writeLimits` | 1 second minimum + concurrency | +| `PATCH` | `writeLimits` | 1 second minimum + concurrency | +| `DELETE` | `writeLimits` | 1 second minimum + concurrency | + +--- + +## Reservoir Management + +### Dynamic Reservoir Updates + +The reservoir tracks how many API requests the script is allowed to make: + +```javascript +async function updateReservoir() { + let response + try { + response = await github.rest.rateLimit.get() + } catch (err) { + core.error(`Failed updating reservoir:\n${err}`) + return + } + const reservoir = Math.max(0, response.data.resources.core.remaining - 1000) + 
core.info(`Updating reservoir to: ${reservoir}`) + allLimits.updateSettings({ reservoir }) +} +``` + +### Reserve Buffer + +The script always keeps **1,000 spare requests** for other concurrent jobs: + +```javascript +const reservoir = Math.max(0, response.data.resources.core.remaining - 1000) +``` + +If the rate limit shows 3,500 remaining requests, the reservoir is set to 2,500. +If remaining is below 1,000, the reservoir is set to 0 (all requests will queue). + +### Why 1,000? + +Other GitHub Actions jobs running in parallel (status checks, deployment workflows, +external integrations) typically use fewer than 100 requests each. A 1,000-request +buffer provides ample headroom: + +- Normal job: ~50–100 API calls +- 10 concurrent jobs: ~500–1,000 API calls +- Buffer: 1,000 requests — covers typical parallel workload + +### Update Schedule + +```javascript +await updateReservoir() // Initial update before any work +const reservoirUpdater = setInterval(updateReservoir, 60 * 1000) // Every 60 seconds +``` + +The reservoir is refreshed every minute to account for: +- Other jobs consuming requests in parallel +- Rate limit window resets (GitHub resets the limit every hour) + +### Cleanup + +```javascript +try { + await callback(stats) +} finally { + clearInterval(reservoirUpdater) + core.notice( + `Processed ${stats.prs} PRs, ${stats.issues} Issues, ` + + `made ${stats.requests + stats.artifacts} API requests ` + + `and downloaded ${stats.artifacts} artifacts.`, + ) +} +``` + +The interval is cleared in a `finally` block to prevent resource leaks even if +the callback throws an error. 
+ +--- + +## Statistics Tracking + +The wrapper tracks four metrics: + +```javascript +const stats = { + issues: 0, + prs: 0, + requests: 0, + artifacts: 0, +} +``` + +| Metric | Incremented By | Purpose | +|-------------|---------------------------------------|----------------------------------| +| `requests` | Every throttled API call | Total API usage | +| `prs` | Callback logic (PR processing) | PRs analyzed | +| `issues` | Callback logic (issue processing) | Issues analyzed | +| `artifacts` | Callback logic (artifact downloads) | Artifacts downloaded | + +At the end of execution, a summary is logged: + +``` +Notice: Processed 1 PRs, 0 Issues, made 15 API requests and downloaded 0 artifacts. +``` + +--- + +## Error Handling + +### Rate Limit API Failure + +If the rate limit endpoint itself fails (network error, GitHub outage): + +```javascript +try { + response = await github.rest.rateLimit.get() +} catch (err) { + core.error(`Failed updating reservoir:\n${err}`) + return // Keep retrying on next interval +} +``` + +The error is logged but does not crash the script. The reservoir retains its +previous value, and the next 60-second interval will try again. + +### Exhausted Reservoir + +When the reservoir reaches 0: +- All new requests queue in Bottleneck +- Requests wait until the next `updateReservoir()` call adds capacity +- If GitHub's rate limit has not reset, requests continue to queue +- The script may time out if the rate limit window hasn't reset + +--- + +## GitHub API Rate Limits Reference + +| Resource | Limit | Reset Period | +|-------------|--------------------------|--------------| +| Core REST API| 5,000 requests/hour | Rolling hour | +| Search API | 30 requests/minute | Rolling minute| +| GraphQL API | 5,000 points/hour | Rolling hour | + +The `withRateLimit.js` module only manages the **Core REST API** limit. Search +requests bypass the throttle because they have a separate, lower limit that is +rarely a concern for CI scripts. 
+ +--- + +## Usage in CI Scripts + +### Wrapping a Script + +```javascript +const withRateLimit = require('./withRateLimit.js') + +module.exports = async ({ github, core }) => { + await withRateLimit({ github, core }, async (stats) => { + // All github.rest.* calls here are automatically throttled + + const pr = await github.rest.pulls.get({ + owner: 'YongDo-Hyun', + repo: 'Project-Tick', + pull_number: 123, + }) + stats.prs++ + + // ... more API calls + }) +} +``` + +### Adjusting Concurrency + +For scripts that can safely parallelize reads: + +```javascript +await withRateLimit({ github, core, maxConcurrent: 5 }, async (stats) => { + // Up to 5 concurrent GET requests + // Write requests still have 1-second minimum spacing +}) +``` + +--- + +## Best Practices + +1. **Minimize API calls** — Use pagination efficiently, avoid redundant requests +2. **Prefer git over API** — For commit data, `get-pr-commit-details.js` uses git directly + to bypass the 250-commit API limit and reduce API usage +3. **Use the `stats` object** — Track what the script does for observability +4. **Don't bypass the wrapper** — All API calls should go through the throttled Octokit instance +5. **Handle network errors** — The wrapper handles rate limit API failures, but callback + scripts should handle their own API errors gracefully diff --git a/docs/handbook/cmark/architecture.md b/docs/handbook/cmark/architecture.md new file mode 100644 index 0000000000..e35bd2e578 --- /dev/null +++ b/docs/handbook/cmark/architecture.md @@ -0,0 +1,283 @@ +# cmark — Architecture + +## High-Level Design + +cmark implements a two-phase parsing pipeline that converts CommonMark Markdown into an Abstract Syntax Tree (AST), which can then be rendered into multiple output formats. The design separates concerns cleanly: block-level structure is identified first, then inline content is parsed within the appropriate blocks. 
+ +``` +Input Text (UTF-8) + │ + ▼ +┌──────────────────┐ +│ S_parser_feed │ Split input into lines (blocks.c) +│ │ Handle UTF-8 BOM, CR/LF normalization +└────────┬───────────┘ + │ + ▼ +┌──────────────────┐ +│ S_process_line │ Line-by-line block structure analysis (blocks.c) +│ │ Open/close containers, detect leaf blocks +└────────┬───────────┘ + │ + ▼ +┌──────────────────┐ +│ finalize_document│ Close all open blocks (blocks.c) +│ │ Resolve reference link definitions +└────────┬───────────┘ + │ + ▼ +┌──────────────────┐ +│ process_inlines │ Parse inline content in paragraphs/headings (blocks.c → inlines.c) +│ │ Delimiter stack algorithm for emphasis +│ │ Bracket stack for links/images +└────────┬───────────┘ + │ + ▼ +┌──────────────────┐ +│ AST (cmark_node tree) │ +└────────┬───────────┘ + │ + ▼ +┌──────────────────┐ +│ Renderer │ Iterator-driven traversal +│ (html/xml/ │ Enter/Exit events per node +│ latex/man/cm) │ +└──────────────────┘ + │ + ▼ + Output String +``` + +## Module Dependency Graph + +The internal header dependencies reveal the layered architecture: + +``` +cmark.h (public API — types, enums, function declarations) + ├── cmark_export.h (generated — DLL export macros) + └── cmark_version.h (generated — version constants) + +node.h (internal — struct cmark_node) + ├── cmark.h + └── buffer.h + +parser.h (internal — struct cmark_parser) + ├── references.h + ├── node.h + └── buffer.h + +iterator.h (internal — struct cmark_iter) + └── cmark.h + +render.h (internal — struct cmark_renderer) + └── buffer.h + +buffer.h (internal — cmark_strbuf) + └── cmark.h + +chunk.h (internal — cmark_chunk) + ├── cmark.h + ├── buffer.h + └── cmark_ctype.h + +references.h (internal — cmark_reference_map) + └── chunk.h + +inlines.h (internal — inline parsing API) + ├── chunk.h + └── references.h + +scanners.h (internal — scanner function declarations) + ├── cmark.h + └── chunk.h + +houdini.h (internal — HTML/URL escaping) + └── buffer.h + +cmark_ctype.h (internal — 
locale-independent char classification) + (no cmark dependencies) + +utf8.h (internal — UTF-8 processing) + └── buffer.h +``` + +## Phase 1: Block Structure (blocks.c) + +Block parsing operates on a state machine maintained in the `cmark_parser` struct (defined in `parser.h`): + +```c +struct cmark_parser { + struct cmark_mem *mem; // Memory allocator + struct cmark_reference_map *refmap; // Link reference definitions + struct cmark_node *root; // Document root node + struct cmark_node *current; // Deepest open block + int line_number; // Current line being processed + bufsize_t offset; // Byte position in current line + bufsize_t column; // Virtual column (tabs expanded) + bufsize_t first_nonspace; // Position of first non-whitespace + bufsize_t first_nonspace_column; // Column of first non-whitespace + bufsize_t thematic_break_kill_pos; // Optimization for thematic break scanning + int indent; // Indentation level (first_nonspace_column - column) + bool blank; // Whether current line is blank + bool partially_consumed_tab; // Tab only partially used for indentation + cmark_strbuf curline; // Current line being processed + bufsize_t last_line_length; // Length of previous line (for end_column) + cmark_strbuf linebuf; // Buffer for accumulating partial lines across feeds + cmark_strbuf content; // Accumulated content for the current open block + int options; // Option flags + bool last_buffer_ended_with_cr; // For CR/LF handling across buffer boundaries + unsigned int total_size; // Total bytes fed (for reference expansion limiting) +}; +``` + +### Line Processing Flow + +For each line, `S_process_line()` does the following: + +1. **Increment line number**, store current line in `parser->curline`. +2. **Check open blocks** (`check_open_blocks()`): Walk through the tree from root to the deepest open node. 
For each open container node, try to match the expected line prefix: + - Block quote: expect `>` (optionally preceded by up to 3 spaces) + - List item: expect indentation matching `marker_offset + padding` + - Code block (fenced): check for closing fence or skip fence offset spaces + - Code block (indented): expect 4+ spaces of indentation + - HTML block: check type-specific continuation rules +3. **Try new container starts**: If not all open blocks matched, check if the current line starts a new container (block quote, list item). +4. **Try new leaf blocks**: If the line doesn't continue an existing block or start a new container, check for: + - ATX heading (lines starting with 1-6 `#` characters) + - Setext heading (underlines of `=` or `-` following a paragraph) + - Thematic break (3+ `*`, `-`, or `_` on a line by themselves) + - Fenced code block (3+ backticks or tildes) + - HTML block (7 different start patterns) + - Indented code block (4+ spaces of indentation) +5. **Add line content**: For blocks that accept lines (paragraph, heading, code block), append the line content to `parser->content`. +6. **Handle lazy continuation**: Paragraphs support lazy continuation where a non-blank line can continue a paragraph even without matching container prefixes. + +### Finalization + +When a block is closed (either explicitly or because a new block replaces it), `finalize()` is called: + +- **Paragraphs**: Reference link definitions at the start are extracted and stored in `parser->refmap`. If only references remain, the paragraph node is deleted. +- **Code blocks (fenced)**: The first line becomes the info string; remaining content becomes the code body. +- **Code blocks (indented)**: Trailing blank lines are removed. +- **Lists**: Tight/loose status is determined by checking for blank lines between items and their children. 
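The leaf-block checks listed in step 4 of the line processing flow can be illustrated with a small, self-contained sketch. It is not taken from `blocks.c`: `toy_atx_heading_level()` and `toy_is_thematic_break()` are hypothetical helpers that approximate the CommonMark rules for ATX headings and thematic breaks on a single line, and they assume the leading indentation consists of spaces only (tabs are ignored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: returns the ATX heading level (1-6) or 0 if the
// line does not open an ATX heading.
static int toy_atx_heading_level(const char *line) {
  size_t i = 0;
  while (i < 3 && line[i] == ' ') i++;  // up to 3 spaces of indentation
  int level = 0;
  while (line[i] == '#') {
    level++;
    i++;
  }
  if (level < 1 || level > 6) return 0;
  // The run of '#' must be followed by whitespace or the end of the line.
  char after = line[i];
  if (after == '\0' || after == '\n' || after == '\r' || after == ' ' ||
      after == '\t') {
    return level;
  }
  return 0;
}

// Hypothetical helper: true if the line is a thematic break, i.e. three or
// more of the same marker ('*', '-', '_') with only spaces/tabs between.
static bool toy_is_thematic_break(const char *line) {
  size_t i = 0;
  while (i < 3 && line[i] == ' ') i++;
  char marker = line[i];
  if (marker != '*' && marker != '-' && marker != '_') return false;
  int count = 0;
  for (; line[i] != '\0' && line[i] != '\n' && line[i] != '\r'; i++) {
    if (line[i] == marker) {
      count++;
    } else if (line[i] != ' ' && line[i] != '\t') {
      return false;  // any other character disqualifies the line
    }
  }
  return count >= 3;
}
```

A line indented by four or more spaces is rejected by both helpers, which matches the flow above where such a line is a candidate for an indented code block instead.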
+ +## Phase 2: Inline Parsing (inlines.c) + +After all block structure is finalized, `process_inlines()` walks the AST with an iterator and calls `cmark_parse_inlines()` for every node whose type `contains_inlines()` — specifically, `CMARK_NODE_PARAGRAPH` and `CMARK_NODE_HEADING`. + +The inline parser uses a `subject` struct that tracks: + +```c +typedef struct { + cmark_mem *mem; + cmark_chunk input; // The text to parse + unsigned flags; // Skip flags for HTML constructs + int line; // Source line number + bufsize_t pos; // Current position in input + int block_offset; // Column offset of containing block + int column_offset; // Adjustment for multi-line inlines + cmark_reference_map *refmap; // Reference definitions + delimiter *last_delim; // Top of delimiter stack + bracket *last_bracket; // Top of bracket stack + bufsize_t backticks[MAXBACKTICKS + 1]; // Cache of backtick positions + bool scanned_for_backticks; // Whether full backtick scan done + bool no_link_openers; // Optimization flag +} subject; +``` + +### Delimiter Stack Algorithm + +Emphasis (`*`, `_`) and smart quotes (`'`, `"`) use a delimiter stack. When a run of delimiter characters is found: + +1. `scan_delims()` determines whether the run can open and/or close emphasis, based on Unicode-aware flanking rules (checking whether surrounding characters are spaces or punctuation using `cmark_utf8proc_is_space()` and `cmark_utf8proc_is_punctuation_or_symbol()`). +2. The delimiter is pushed onto the stack as a `delimiter` struct. +3. When a closing delimiter is found, the stack is scanned backwards for a matching opener, and `S_insert_emph()` creates `CMARK_NODE_EMPH` or `CMARK_NODE_STRONG` nodes. + +### Bracket Stack Algorithm + +Links and images use a separate bracket stack: + +1. `[` pushes a bracket entry; `![` pushes one marked as `image = true`. +2. When `]` is encountered, the bracket stack is searched for a matching opener. +3. 
If found, the parser looks for `(url "title")` or `[ref]` after the `]`. +4. For reference-style links, `cmark_reference_lookup()` is called against the parser's `refmap`. + +## Phase 3: AST Rendering + +All renderers traverse the AST using the iterator system. There are two rendering architectures: + +### Direct Renderers (no framework) +- **HTML** (`html.c`): Uses `cmark_strbuf` directly. The `S_render_node()` function handles enter/exit events in a large switch statement. HTML escaping is done via `houdini_escape_html()`. +- **XML** (`xml.c`): Similar direct approach with XML-specific escaping and indentation tracking. + +### Framework Renderers (via render.c) +- **LaTeX** (`latex.c`), **man** (`man.c`), **CommonMark** (`commonmark.c`): These use the `cmark_render()` generic framework, which provides: + - Line wrapping at a configurable width + - Prefix management for indented output (block quotes, list items) + - Breakpoint tracking for intelligent line breaking + - Escape dispatch via function pointers (`outc`) + +The framework signature: + +```c +char *cmark_render(cmark_node *root, int options, int width, + void (*outc)(cmark_renderer *, cmark_escaping, int32_t, unsigned char), + int (*render_node)(cmark_renderer *, cmark_node *, + cmark_event_type, int)); +``` + +Each format-specific renderer supplies its own `outc` (character-level escaping) and `render_node` (node-level output) callback functions. + +## Key Design Decisions + +### Owning vs. Non-Owning Strings + +cmark uses two string types: + +- **`cmark_strbuf`** (buffer.h): Owning, growable byte buffer. Used for accumulating output and parser state. Memory is managed via the `cmark_mem` allocator. +- **`cmark_chunk`** (chunk.h): Non-owning slice (pointer + length). Used for referencing substrings of the input during parsing without copying. 
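The difference between the two string types can be sketched as follows. The `toy_chunk` and `toy_strbuf` types and their helpers are hypothetical stand-ins, not the real `cmark_chunk` / `cmark_strbuf` API; the sketch uses plain `realloc()` instead of a `cmark_mem` allocator and aborts on allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Non-owning slice: points into memory owned by someone else.
typedef struct {
  const unsigned char *data;
  size_t len;
} toy_chunk;

// Owning, growable buffer: responsible for freeing ptr.
typedef struct {
  unsigned char *ptr;
  size_t size;   // bytes in use (excluding the trailing NUL)
  size_t asize;  // bytes allocated
} toy_strbuf;

// Reference a substring of the input without copying it.
static toy_chunk toy_chunk_slice(const char *input, size_t pos, size_t len) {
  toy_chunk c = {(const unsigned char *)input + pos, len};
  return c;
}

// Append bytes to the buffer, growing it geometrically when needed.
static void toy_strbuf_put(toy_strbuf *buf, const unsigned char *data,
                           size_t len) {
  size_t needed = buf->size + len + 1;
  if (needed > buf->asize) {
    size_t new_asize = buf->asize ? buf->asize : 16;
    while (new_asize < needed) new_asize *= 2;
    unsigned char *p = realloc(buf->ptr, new_asize);
    if (p == NULL) abort();  // assumption: abort on allocation failure
    buf->ptr = p;
    buf->asize = new_asize;
  }
  memcpy(buf->ptr + buf->size, data, len);
  buf->size += len;
  buf->ptr[buf->size] = '\0';
}

static void toy_strbuf_free(toy_strbuf *buf) {
  free(buf->ptr);
  buf->ptr = NULL;
  buf->size = 0;
  buf->asize = 0;
}
```

In this sketch a chunk stays valid only as long as the input it points into, while a buffer must eventually be released with `toy_strbuf_free()`.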
+ +### Node Memory Layout + +Every `cmark_node` uses a discriminated union (`node->as`) to store type-specific data without a separate allocation for the payload struct itself (string members such as `url`, `title`, or `info` are still allocated separately): + +```c +union { + cmark_list list; // list marker, start, tight, delimiter + cmark_code code; // info string, fence char/length/offset + cmark_heading heading; // level, setext flag, internal_offset + cmark_link link; // url, title + cmark_custom custom; // on_enter, on_exit + int html_block_type; // HTML block type (1-7) +} as; +``` + +### Open Block Tracking + +During block parsing, open blocks are tracked via the `CMARK_NODE__OPEN` flag in `node->flags`. The parser maintains a `current` pointer to the deepest open block. When new blocks are created, they're added as children of the appropriate open container. When blocks are finalized (closed), control returns to the parent. + +### Reference Expansion Limiting + +To prevent superlinear growth from adversarial reference definitions, `parser->total_size` tracks total bytes fed. After finalization, `parser->refmap->max_ref_size` is set to `MAX(total_size, 100000)`, and each reference lookup deducts the reference's size from the available budget. + +## Error Handling + +cmark follows a defensive programming model: +- NULL checks on all public API entry points (return 0 or NULL for invalid arguments) +- `assert()` for internal invariants (active in debug builds, which also define `-DCMARK_DEBUG_NODES`; standard `assert()` is compiled out wherever `NDEBUG` is defined) +- Abort-on-allocation-failure in the default memory allocator +- No exceptions (pure C99) +- Invalid UTF-8 sequences are replaced with U+FFFD (when `CMARK_OPT_VALIDATE_UTF8` is set) + +## Thread Safety + +cmark is **not** thread-safe for concurrent access to the same parser or node tree. However, separate parser instances and separate node trees can be used in parallel from different threads, as there is no global mutable state (the `DEFAULT_MEM_ALLOCATOR` is read-only after initialization). 
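The budget described under "Reference Expansion Limiting" can be modelled with a few lines of arithmetic. This is a sketch under assumptions, not the code in `references.c`: `toy_refmap`, `toy_budget_for()`, and `toy_charge_lookup()` are hypothetical names, consumption is tracked as a running total rather than a decreasing counter, and refusing the lookup once the budget is exhausted is an assumed behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
  size_t max_ref_size;  // total expansion budget for the document
  size_t ref_size;      // budget consumed by lookups so far
} toy_refmap;

// Budget is the larger of the input size and a fixed floor of 100000 bytes.
static size_t toy_budget_for(size_t total_size) {
  return total_size > 100000 ? total_size : 100000;
}

// Charge one lookup of a reference of the given size against the budget.
// Returns false (assumed: lookup refused) when the budget would be exceeded.
static bool toy_charge_lookup(toy_refmap *map, size_t size) {
  if (size > map->max_ref_size - map->ref_size) return false;
  map->ref_size += size;
  return true;
}
```

Because the total charged can never exceed `max_ref_size`, the output contributed by reference expansion stays proportional to the input size (or to the fixed floor for small inputs).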
+ +## Cross-References + +- [block-parsing.md](block-parsing.md) — Detailed block-level parsing logic +- [inline-parsing.md](inline-parsing.md) — Delimiter and bracket stack algorithms +- [ast-node-system.md](ast-node-system.md) — Node struct internals +- [render-framework.md](render-framework.md) — Generic render engine +- [memory-management.md](memory-management.md) — Allocator and buffer details +- [iterator-system.md](iterator-system.md) — AST traversal mechanics diff --git a/docs/handbook/cmark/ast-node-system.md b/docs/handbook/cmark/ast-node-system.md new file mode 100644 index 0000000000..3d25415eda --- /dev/null +++ b/docs/handbook/cmark/ast-node-system.md @@ -0,0 +1,383 @@ +# cmark — AST Node System + +## Overview + +The AST (Abstract Syntax Tree) node system is defined across `node.h` (internal struct definitions) and `node.c` (node creation, destruction, accessor functions, and tree manipulation). Every element in a parsed CommonMark document is represented as a `cmark_node`. Nodes form a tree via parent/child/sibling pointers, with type-specific data stored in a discriminated union. 
+ +## The `cmark_node` Struct + +Defined in `node.h`, this is the central data structure of the entire library: + +```c +struct cmark_node { + cmark_mem *mem; // Memory allocator used for this node + + struct cmark_node *next; // Next sibling + struct cmark_node *prev; // Previous sibling + struct cmark_node *parent; // Parent node + struct cmark_node *first_child; // First child + struct cmark_node *last_child; // Last child + + void *user_data; // Arbitrary user-attached data + + unsigned char *data; // String content (for text, code, HTML) + bufsize_t len; // Length of data + + int start_line; // Source position: starting line (1-based) + int start_column; // Source position: starting column (1-based) + int end_line; // Source position: ending line + int end_column; // Source position: ending column + uint16_t type; // Node type (cmark_node_type enum value) + uint16_t flags; // Internal flags (open, last-line-blank, etc.) + + union { + cmark_list list; // List-specific data + cmark_code code; // Code block-specific data + cmark_heading heading; // Heading-specific data + cmark_link link; // Link/image-specific data + cmark_custom custom; // Custom block/inline data + int html_block_type; // HTML block type (1-7) + } as; +}; +``` + +The union `as` means each node only occupies memory for one type-specific payload, keeping the struct compact. The largest union member determines the node's size. 
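To make the discriminated-union idea concrete, here is a reduced sketch. The `toy_*` types are hypothetical and much smaller than the real structs; the point is only that the `type` tag decides which union member is valid, and that the union is as large as its largest member rather than the sum of all members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef enum { TOY_HEADING, TOY_LIST, TOY_LINK } toy_type;

typedef struct { int start; bool tight; } toy_list;
typedef struct { int8_t level; bool setext; } toy_heading;
typedef struct { const char *url; const char *title; } toy_link;

// Only one member is meaningful at a time; `type` says which one.
typedef union {
  toy_list list;
  toy_heading heading;
  toy_link link;
} toy_payload;

typedef struct {
  uint16_t type;   // a toy_type value
  toy_payload as;  // type-specific data
} toy_node;

// Write a short description of the node, reading only the union member
// that matches the type tag. Returns snprintf's result, or -1 if unknown.
static int toy_describe(const toy_node *n, char *out, size_t cap) {
  switch (n->type) {
  case TOY_HEADING:
    return snprintf(out, cap, "heading level %d", n->as.heading.level);
  case TOY_LIST:
    return snprintf(out, cap, "%s list starting at %d",
                    n->as.list.tight ? "tight" : "loose", n->as.list.start);
  case TOY_LINK:
    return snprintf(out, cap, "link to %s", n->as.link.url);
  default:
    return -1;
  }
}
```

Reading a member that does not match the tag (for example `as.link.url` on a heading node) would interpret unrelated bytes, which is why accessors are expected to check the node type first.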
+ +## Type-Specific Structs + +### `cmark_list` — List Properties + +```c +typedef struct { + int marker_offset; // Indentation of list marker from left margin + int padding; // Total indentation (marker + content offset) + int start; // Starting number for ordered lists (0 for bullet) + unsigned char list_type; // CMARK_BULLET_LIST or CMARK_ORDERED_LIST + unsigned char delimiter; // CMARK_PERIOD_DELIM, CMARK_PAREN_DELIM, or CMARK_NO_DELIM + unsigned char bullet_char;// '*', '-', or '+' for bullet lists + bool tight; // Whether the list is tight (no blank lines between items) +} cmark_list; +``` + +`marker_offset` and `padding` are used during block parsing to track indentation levels for list continuation. The `tight` flag is determined during block finalization by checking whether blank lines appear between list items or their children. + +### `cmark_code` — Code Block Properties + +```c +typedef struct { + unsigned char *info; // Info string (language hint, e.g., "python") + uint8_t fence_length; // Length of opening fence (3+ backticks or tildes) + uint8_t fence_offset; // Indentation of fence from left margin + unsigned char fence_char; // '`' or '~' + int8_t fenced; // Whether this is a fenced code block (vs. indented) +} cmark_code; +``` + +For indented code blocks, `fenced` is 0, and `info`, `fence_length`, `fence_char`, and `fence_offset` are unused. For fenced code blocks, `info` is extracted from the first line of the opening fence and stored as a separately allocated string. + +### `cmark_heading` — Heading Properties + +```c +typedef struct { + int internal_offset; // Internal offset within the heading content + int8_t level; // Heading level (1-6) + bool setext; // Whether this is a setext-style heading (underlined) +} cmark_heading; +``` + +ATX headings (`# Heading`) have `setext = false`. Setext headings (underlined with `=` or `-`) have `setext = true`. The `level` field is shared and defaults to 1 when a heading node is created. 
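As a rough illustration of how `marker_offset` and `padding` drive list continuation, the sketch below checks whether a line is indented far enough to belong to an open list item. `toy_list_item` and `toy_line_continues_item()` are hypothetical, and the sketch deliberately leaves out blank-line handling and lazy continuation, which the real parser also has to consider.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
  int marker_offset;  // columns before the list marker
  int padding;        // marker width plus spaces up to the content
} toy_list_item;

// A non-blank line continues the item when its indentation reaches the
// column where the item's content starts.
static bool toy_line_continues_item(const toy_list_item *item, int indent) {
  return indent >= item->marker_offset + item->padding;
}
```

For an item written as `- text`, `marker_offset` is 0 and `padding` is 2, so continuation lines need at least two columns of indentation; for `  10. text` the content column is 2 + 4 = 6.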
+ +### `cmark_link` — Link and Image Properties + +```c +typedef struct { + unsigned char *url; // Destination URL (separately allocated) + unsigned char *title; // Optional title text (separately allocated) +} cmark_link; +``` + +Both `url` and `title` are separately allocated strings that must be freed when the node is destroyed. This struct is used for both `CMARK_NODE_LINK` and `CMARK_NODE_IMAGE`. + +### `cmark_custom` — Custom Block/Inline Properties + +```c +typedef struct { + unsigned char *on_enter; // Literal text rendered when entering the node + unsigned char *on_exit; // Literal text rendered when leaving the node +} cmark_custom; +``` + +Custom nodes allow embedding arbitrary content in the AST for extensions. Both strings are separately allocated. + +## Internal Flags + +The `flags` field uses bit flags defined in the `cmark_node__internal_flags` enum: + +```c +enum cmark_node__internal_flags { + CMARK_NODE__OPEN = (1 << 0), // Block is still open (accepting content) + CMARK_NODE__LAST_LINE_BLANK = (1 << 1), // Last line of this block was blank + CMARK_NODE__LAST_LINE_CHECKED = (1 << 2), // blank-line status has been computed + CMARK_NODE__LIST_LAST_LINE_BLANK = (1 << 3), // (unused/reserved) +}; +``` + +- **`CMARK_NODE__OPEN`**: Set when a block is created during parsing. Cleared by `finalize()` when the block is closed. The parser's `current` pointer always points to a node with this flag set. +- **`CMARK_NODE__LAST_LINE_BLANK`**: Set/cleared by `S_set_last_line_blank()` in `blocks.c` to track whether the most recent line added to this block was blank. Used for determining list tightness. +- **`CMARK_NODE__LAST_LINE_CHECKED`**: Prevents redundant traversal when checking `S_ends_with_blank_line()`, which recursively descends into list items. 
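The flag handling follows the usual bit-mask pattern, sketched here with hypothetical `TOY_*` constants and helpers that mirror the layout of the enum above (they are not the functions used in `blocks.c`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_flags {
  TOY_OPEN = (1 << 0),
  TOY_LAST_LINE_BLANK = (1 << 1),
  TOY_LAST_LINE_CHECKED = (1 << 2),
};

// Set or clear a single flag bit without disturbing the others.
static void toy_set_flag(uint16_t *flags, uint16_t flag, bool on) {
  if (on) {
    *flags |= flag;
  } else {
    *flags &= (uint16_t)~flag;
  }
}

// Test whether a flag bit is currently set.
static bool toy_has_flag(uint16_t flags, uint16_t flag) {
  return (flags & flag) != 0;
}
```

Closing a block then amounts to `toy_set_flag(&flags, TOY_OPEN, false)`, which leaves the blank-line bits untouched.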
+ +## Node Creation + +### `cmark_node_new_with_mem()` + +The primary creation function (in `node.c`): + +```c +cmark_node *cmark_node_new_with_mem(cmark_node_type type, cmark_mem *mem) { + cmark_node *node = (cmark_node *)mem->calloc(1, sizeof(*node)); + node->mem = mem; + node->type = (uint16_t)type; + + switch (node->type) { + case CMARK_NODE_HEADING: + node->as.heading.level = 1; + break; + case CMARK_NODE_LIST: { + cmark_list *list = &node->as.list; + list->list_type = CMARK_BULLET_LIST; + list->start = 0; + list->tight = false; + break; + } + default: + break; + } + + return node; +} +``` + +The `calloc()` zeroes all fields, so pointers start as NULL and numeric fields as 0. Only heading and list nodes need explicit default initialization. + +### `make_block()` — Parser-Internal Creation + +During block parsing, `make_block()` in `blocks.c` creates nodes with source position and the `CMARK_NODE__OPEN` flag: + +```c +static cmark_node *make_block(cmark_mem *mem, cmark_node_type tag, + int start_line, int start_column) { + cmark_node *e; + e = (cmark_node *)mem->calloc(1, sizeof(*e)); + e->mem = mem; + e->type = (uint16_t)tag; + e->flags = CMARK_NODE__OPEN; + e->start_line = start_line; + e->start_column = start_column; + e->end_line = start_line; + return e; +} +``` + +### Inline Node Creation + +The inline parser in `inlines.c` uses two factory functions: + +```c +// Create an inline with string content (text, code, HTML) +static inline cmark_node *make_literal(subject *subj, cmark_node_type t, + int start_column, int end_column) { + cmark_node *e = (cmark_node *)subj->mem->calloc(1, sizeof(*e)); + e->mem = subj->mem; + e->type = (uint16_t)t; + e->start_line = e->end_line = subj->line; + e->start_column = start_column + 1 + subj->column_offset + subj->block_offset; + e->end_column = end_column + 1 + subj->column_offset + subj->block_offset; + return e; +} + +// Create an inline with no value (emphasis, strong, etc.) 
+static inline cmark_node *make_simple(cmark_mem *mem, cmark_node_type t) { + cmark_node *e = (cmark_node *)mem->calloc(1, sizeof(*e)); + e->mem = mem; + e->type = t; + return e; +} +``` + +## Node Destruction + +### `S_free_nodes()` — Iterative Subtree Freeing + +The `S_free_nodes()` function in `node.c` avoids recursion by splicing children into a flat linked list: + +```c +static void S_free_nodes(cmark_node *e) { + cmark_mem *mem = e->mem; + cmark_node *next; + while (e != NULL) { + switch (e->type) { + case CMARK_NODE_CODE_BLOCK: + mem->free(e->data); + mem->free(e->as.code.info); + break; + case CMARK_NODE_TEXT: + case CMARK_NODE_HTML_INLINE: + case CMARK_NODE_CODE: + case CMARK_NODE_HTML_BLOCK: + mem->free(e->data); + break; + case CMARK_NODE_LINK: + case CMARK_NODE_IMAGE: + mem->free(e->as.link.url); + mem->free(e->as.link.title); + break; + case CMARK_NODE_CUSTOM_BLOCK: + case CMARK_NODE_CUSTOM_INLINE: + mem->free(e->as.custom.on_enter); + mem->free(e->as.custom.on_exit); + break; + default: + break; + } + if (e->last_child) { + // Splice children into list for flat iteration + e->last_child->next = e->next; + e->next = e->first_child; + } + next = e->next; + mem->free(e); + e = next; + } +} +``` + +This splicing technique converts the tree into a flat list, allowing O(n) iterative freeing without a recursion stack. For each node with children, the children are prepended to the remaining list by connecting `last_child->next` to `e->next` and `e->next` to `first_child`. 
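The splicing step is easier to follow on a stripped-down node type. The sketch below is a hypothetical model (`toy_node`, `toy_append()`, `toy_free_subtree()`), using plain `calloc()`/`free()` and no type-specific payloads; it returns the number of nodes freed so the traversal can be checked. It assumes the root passed in has already been unlinked, so its `next` pointer is NULL.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_node {
  struct toy_node *next;
  struct toy_node *first_child;
  struct toy_node *last_child;
} toy_node;

static toy_node *toy_new(void) {
  toy_node *n = calloc(1, sizeof(*n));
  if (n == NULL) abort();  // assumption: abort on allocation failure
  return n;
}

// Append child as the last child of parent (sibling links only).
static void toy_append(toy_node *parent, toy_node *child) {
  if (parent->last_child) {
    parent->last_child->next = child;
  } else {
    parent->first_child = child;
  }
  parent->last_child = child;
}

// Free a whole subtree iteratively; returns how many nodes were freed.
static int toy_free_subtree(toy_node *e) {
  int freed = 0;
  while (e != NULL) {
    if (e->last_child) {
      // Splice the children in front of the remaining work list.
      e->last_child->next = e->next;
      e->next = e->first_child;
    }
    toy_node *next = e->next;
    free(e);
    freed++;
    e = next;
  }
  return freed;
}
```

Each node is visited exactly once, and the only extra state is the `next` pointer already stored in every node, so deeply nested input cannot exhaust the call stack during cleanup.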
+ +## Containership Rules + +The `S_can_contain()` function in `node.c` enforces which node types can contain which children: + +```c +static bool S_can_contain(cmark_node *node, cmark_node *child) { + // Ancestor loop detection + if (child->first_child != NULL) { + cmark_node *cur = node->parent; + while (cur != NULL) { + if (cur == child) return false; + cur = cur->parent; + } + } + + // Documents cannot be children + if (child->type == CMARK_NODE_DOCUMENT) return false; + + switch (node->type) { + case CMARK_NODE_DOCUMENT: + case CMARK_NODE_BLOCK_QUOTE: + case CMARK_NODE_ITEM: + return cmark_node_is_block(child) && child->type != CMARK_NODE_ITEM; + + case CMARK_NODE_LIST: + return child->type == CMARK_NODE_ITEM; + + case CMARK_NODE_CUSTOM_BLOCK: + return true; // Custom blocks can contain anything + + case CMARK_NODE_PARAGRAPH: + case CMARK_NODE_HEADING: + case CMARK_NODE_EMPH: + case CMARK_NODE_STRONG: + case CMARK_NODE_LINK: + case CMARK_NODE_IMAGE: + case CMARK_NODE_CUSTOM_INLINE: + return cmark_node_is_inline(child); + + default: + break; + } + return false; +} +``` + +Key rules: +- **Document, block quote, list item**: Can contain any block except items +- **List**: Can only contain items +- **Custom block**: Can contain anything (no restrictions) +- **Paragraph, heading, emphasis, strong, link, image, custom inline**: Can only contain inline nodes +- **Leaf blocks** (thematic break, code block, HTML block): Cannot contain anything + +## Tree Manipulation + +### Unlinking + +The internal `S_node_unlink()` function detaches a node from its parent and siblings: + +```c +static void S_node_unlink(cmark_node *node) { + if (node->prev) { + node->prev->next = node->next; + } + if (node->next) { + node->next->prev = node->prev; + } + // Update parent's first_child / last_child pointers + if (node->parent) { + if (node->parent->first_child == node) + node->parent->first_child = node->next; + if (node->parent->last_child == node) + node->parent->last_child = 
node->prev; + } + node->next = NULL; + node->prev = NULL; + node->parent = NULL; +} +``` + +### String Setting Helper + +The `cmark_set_cstr()` function manages string assignment with proper memory handling: + +```c +static bufsize_t cmark_set_cstr(cmark_mem *mem, unsigned char **dst, + const char *src) { + unsigned char *old = *dst; + bufsize_t len; + if (src && src[0]) { + len = (bufsize_t)strlen(src); + *dst = (unsigned char *)mem->realloc(NULL, len + 1); + memcpy(*dst, src, len + 1); + } else { + len = 0; + *dst = NULL; + } + if (old) { + mem->free(old); + } + return len; +} +``` + +This function allocates a new copy of the source string, assigns it, then frees the old value — ensuring no memory leaks even when overwriting existing data. + +## Node Data Storage Pattern + +Nodes store their content in one of three ways depending on type: + +1. **Direct storage** (`data` + `len`): Used by `CMARK_NODE_TEXT`, `CMARK_NODE_CODE`, `CMARK_NODE_CODE_BLOCK`, `CMARK_NODE_HTML_BLOCK`, and `CMARK_NODE_HTML_INLINE`. The `data` field points to a separately allocated buffer containing the text content. + +2. **Union storage** (`as.*`): Used by lists, code blocks (for the info string), headings, links/images, and custom nodes. These store structured data rather than raw text. + +3. **Hybrid**: `CMARK_NODE_CODE_BLOCK` uses both — `data` for the code content and `as.code.info` for the info string. + +## The `cmark_node_check()` Function + +For debug builds, `cmark_node_check()` validates the structural integrity of the tree. It checks that parent/child/sibling pointers are consistent and that the tree forms a valid structure. It returns the number of errors found and prints details to the provided `FILE*`. 
+ +## Cross-References + +- [node.h](../../../cmark/src/node.h) — Struct definitions +- [node.c](../../../cmark/src/node.c) — Implementation +- [iterator-system.md](iterator-system.md) — How nodes are traversed +- [block-parsing.md](block-parsing.md) — How block nodes are created during parsing +- [inline-parsing.md](inline-parsing.md) — How inline nodes are created +- [memory-management.md](memory-management.md) — Allocator integration diff --git a/docs/handbook/cmark/block-parsing.md b/docs/handbook/cmark/block-parsing.md new file mode 100644 index 0000000000..2c9efecd50 --- /dev/null +++ b/docs/handbook/cmark/block-parsing.md @@ -0,0 +1,310 @@ +# cmark — Block Parsing + +## Overview + +Block parsing is Phase 1 of cmark's two-phase parsing pipeline. Implemented in `blocks.c`, it processes the input line-by-line, identifying block-level document structure: paragraphs, headings, code blocks, block quotes, lists, thematic breaks, and HTML blocks. The result is a tree of `cmark_node` block nodes with accumulated text content. Inline parsing occurs in Phase 2. + +The algorithm follows the CommonMark specification's description at `http://spec.commonmark.org/0.24/#phase-1-block-structure`. + +## Key Constants + +```c +#define CODE_INDENT 4 // Spaces required for indented code block +#define TAB_STOP 4 // Tab stop width for column calculation +``` + +## Parser State + +The parser state is maintained in the `cmark_parser` struct (from `parser.h`). 
During line processing, these fields track the current position: + +- `offset` — byte position in the current line +- `column` — virtual column number (tabs expanded to `TAB_STOP` boundaries) +- `first_nonspace` — byte position of first non-whitespace character +- `first_nonspace_column` — column of first non-whitespace character +- `indent` — the difference `first_nonspace_column - column`, representing effective indentation +- `blank` — whether the line is blank (only whitespace before line end) +- `partially_consumed_tab` — set when a tab is only partially used for indentation + +## Input Feeding: `S_parser_feed()` + +The entry point for input is `S_parser_feed()`, which splits raw input into lines: + +```c +static void S_parser_feed(cmark_parser *parser, const unsigned char *buffer, + size_t len, bool eof); +``` + +### Line Splitting Logic + +The function scans for line-ending characters (`\n`, `\r`) and processes complete lines via `S_process_line()`. Partial lines are accumulated in `parser->linebuf`. + +Key handling: +1. **UTF-8 BOM**: Skipped if found at the start of the first line (3-byte sequence `0xEF 0xBB 0xBF`). +2. **CR/LF across buffer boundaries**: If the previous buffer ended with `\r` and the next starts with `\n`, the `\n` is skipped. +3. **NULL bytes**: Replaced with the UTF-8 replacement character (U+FFFD, `0xEF 0xBF 0xBD`). +4. **Total size tracking**: `parser->total_size` accumulates bytes fed, capped at `UINT_MAX`, used later for reference expansion limiting. + +### Line Termination + +Each line is terminated at `\n`, `\r`, or `\r\n`. The line content passed to `S_process_line()` does NOT include the line-ending characters themselves. + +## Line Processing: `S_process_line()` + +The main per-line processing function. For each line, it: + +1. Stores the line in `parser->curline` +2. Creates a `cmark_chunk` wrapper for the line data +3. Increments `parser->line_number` +4. Calls `check_open_blocks()` to match existing containers +5. 
Attempts to open new containers and leaf blocks +6. Adds line content to the appropriate block + +### Step 1: Check Open Blocks + +```c +static cmark_node *check_open_blocks(cmark_parser *parser, cmark_chunk *input, + bool *all_matched); +``` + +Starting from the document root, this walks through the tree of open blocks (following `last_child` pointers). For each open container, it tries to match the expected line prefix. + +The matching rules for each container type: + +#### Block Quote +```c +static bool parse_block_quote_prefix(cmark_parser *parser, cmark_chunk *input); +``` +Expects `>` preceded by up to 3 spaces of indentation. After matching the `>`, optionally consumes one space or tab after it. + +#### List Item +```c +static bool parse_node_item_prefix(cmark_parser *parser, cmark_chunk *input, + cmark_node *container); +``` +Expects indentation of at least `marker_offset + padding` columns. If the line is blank and the item already has at least one child, the item also continues: a blank line does not close a non-empty list item. (This is separate from lazy paragraph continuation, described under "Lazy Continuation Lines".) + +#### Fenced Code Block +```c +static bool parse_code_block_prefix(cmark_parser *parser, cmark_chunk *input, + cmark_node *container, bool *should_continue); +``` +For fenced code blocks: checks if the line is a closing fence (same fence char, length >= opening fence length, preceded by up to 3 spaces). If it is, the block is finalized. Otherwise, skips up to `fence_offset` spaces and continues. + +For indented code blocks: requires 4+ spaces of indentation, or a blank line. + +#### HTML Block +```c +static bool parse_html_block_prefix(cmark_parser *parser, cmark_node *container); +``` +HTML block types 1-5 accept blank lines (continue until end condition is met). Types 6-7 do NOT accept blank lines. 
+ +### Step 2: New Container Starts + +If not all open blocks were matched (`!all_matched`), the parser checks if the unmatched portion of the line starts a new container: + +- **Block quote**: Line starts with `>` (preceded by up to 3 spaces) +- **List item**: Line starts with a list marker (bullet character or ordered number + delimiter) + +### Step 3: New Leaf Blocks + +The parser checks for new leaf block starts using scanner functions: + +- **ATX heading**: `scan_atx_heading_start()` — lines starting with 1-6 `#` characters +- **Fenced code block**: `scan_open_code_fence()` — 3+ backticks or tildes +- **HTML block**: `scan_html_block_start()` and `scan_html_block_start_7()` — 7 different HTML start patterns +- **Setext heading**: `scan_setext_heading_line()` — underlines of `=` or `-` (only when following a paragraph) +- **Thematic break**: `S_scan_thematic_break()` — 3+ `*`, `-`, or `_` characters + +### Step 4: Content Accumulation + +For blocks that accept lines (`accepts_lines()` returns true for paragraphs, headings, and code blocks), the line content is appended to `parser->content` via `add_line()`: + +```c +static void add_line(cmark_chunk *ch, cmark_parser *parser) { + int chars_to_tab; + int i; + if (parser->partially_consumed_tab) { + parser->offset += 1; // skip over tab + chars_to_tab = TAB_STOP - (parser->column % TAB_STOP); + for (i = 0; i < chars_to_tab; i++) { + cmark_strbuf_putc(&parser->content, ' '); + } + } + cmark_strbuf_put(&parser->content, ch->data + parser->offset, + ch->len - parser->offset); +} +``` + +When a tab is only partially consumed (e.g., the tab represents 4 columns but only 1 was needed for indentation), the remaining columns are emitted as spaces. + +## Adding Child Blocks + +```c +static cmark_node *add_child(cmark_parser *parser, cmark_node *parent, + cmark_node_type block_type, int start_column); +``` + +When a new block is detected, `add_child()` creates it: + +1. 
If the parent can't contain the new block type (checked via `can_contain()`), the parent is finalized and the function moves up the tree until it finds a suitable ancestor. +2. A new node is created with `make_block()` (which sets `CMARK_NODE__OPEN`). +3. The node is linked as the last child of the parent. + +### Container Acceptance Rules + +```c +static inline bool can_contain(cmark_node_type parent_type, + cmark_node_type child_type) { + return (parent_type == CMARK_NODE_DOCUMENT || + parent_type == CMARK_NODE_BLOCK_QUOTE || + parent_type == CMARK_NODE_ITEM || + (parent_type == CMARK_NODE_LIST && child_type == CMARK_NODE_ITEM)); +} +``` + +Only documents, block quotes, list items, and lists (for items only) can contain other blocks. + +## List Item Parsing + +```c +static bufsize_t parse_list_marker(cmark_mem *mem, cmark_chunk *input, + bufsize_t pos, bool interrupts_paragraph, + cmark_list **dataptr); +``` + +This function detects list markers: + +**Bullet markers**: `*`, `-`, or `+` followed by whitespace. + +**Ordered markers**: Up to 9 digits followed by `.` or `)` and whitespace. The 9-digit limit prevents integer overflow (max value ~999,999,999 fits in a 32-bit int). + +**Paragraph interruption rules**: When `interrupts_paragraph` is true (the marker would interrupt a preceding paragraph): +- Bullet markers require non-blank content after them +- Ordered markers must start at 1 + +### List Matching + +```c +static int lists_match(cmark_list *list_data, cmark_list *item_data) { + return (list_data->list_type == item_data->list_type && + list_data->delimiter == item_data->delimiter && + list_data->bullet_char == item_data->bullet_char); +} +``` + +Two list items belong to the same list only if they share the same list type, delimiter style, and bullet character. This means `- item` and `* item` create separate lists. 
+ +## Offset Advancement + +```c +static void S_advance_offset(cmark_parser *parser, cmark_chunk *input, + bufsize_t count, bool columns); +``` + +This function advances `parser->offset` and `parser->column`. The `columns` parameter determines whether `count` measures bytes or virtual columns. Tab expansion is handled here: +- When counting columns and a tab appears, `chars_to_tab = TAB_STOP - (column % TAB_STOP)` determines how many columns the tab represents +- If only part of the tab is consumed (advancing fewer columns than the tab provides), `parser->partially_consumed_tab` is set + +## Finding First Non-Space + +```c +static void S_find_first_nonspace(cmark_parser *parser, cmark_chunk *input); +``` + +Scans from `parser->offset` forward, setting: +- `parser->first_nonspace` — byte position +- `parser->first_nonspace_column` — column of first non-whitespace +- `parser->indent` — `first_nonspace_column - column` +- `parser->blank` — whether the line is blank + +This function is idempotent — it won't re-scan if `first_nonspace > offset`. + +## Thematic Break Detection + +```c +static int S_scan_thematic_break(cmark_parser *parser, cmark_chunk *input, + bufsize_t offset); +``` + +Checks for 3 or more `*`, `_`, or `-` characters (optionally separated by spaces/tabs) on a line by themselves. Uses `parser->thematic_break_kill_pos` as an optimization to avoid re-scanning positions that already failed. + +## ATX Heading Trailing Hash Removal + +```c +static void chop_trailing_hashtags(cmark_chunk *ch); +``` + +After an ATX heading line is identified, trailing `#` characters are removed from the content if they're preceded by a space. This implements the CommonMark rule that `## Heading ##` renders as "Heading" without trailing `#` marks. 
+ +## Block Finalization + +```c +static cmark_node *finalize(cmark_parser *parser, cmark_node *b); +``` + +When a block is closed (no longer accepting content), `finalize()` processes its accumulated content: + +### Paragraph Finalization +Reference link definitions at the start are extracted: +```c +static bool resolve_reference_link_definitions(cmark_parser *parser); +``` +This repeatedly calls `cmark_parse_reference_inline()` from `inlines.c` to parse reference definitions like `[label]: url "title"`. If the paragraph becomes empty after extracting all references, the paragraph node is deleted. + +### Code Block Finalization +- **Fenced**: The first line becomes the info string (after HTML unescaping and trimming). Remaining content becomes the code body. +- **Indented**: Trailing blank lines are removed, and a final newline is appended. + +### Heading and HTML Block Finalization +Content is simply detached from the parser's content buffer and stored in `data`. + +### List Finalization +Determines tight/loose status by checking: +1. Non-final, non-empty list items ending with a blank line → loose +2. Children of list items that end with blank lines (checked recursively via `S_ends_with_blank_line()`) → loose +3. Otherwise → tight + +## Document Finalization + +```c +static cmark_node *finalize_document(cmark_parser *parser); +``` + +Called by `cmark_parser_finish()`: + +1. All open blocks are finalized by walking from `parser->current` up to `parser->root`. +2. The root document is finalized. +3. Reference expansion limit is set: `refmap->max_ref_size = MAX(parser->total_size, 100000)`. +4. `process_inlines()` is called, which uses an iterator to find all nodes that contain inlines (paragraphs and headings) and calls `cmark_parse_inlines()` on each. +5. After inline parsing, the content buffer of each processed node is freed. 
+ +## Inline Content Detection + +```c +static inline bool contains_inlines(cmark_node_type block_type) { + return (block_type == CMARK_NODE_PARAGRAPH || + block_type == CMARK_NODE_HEADING); +} +``` + +Only paragraphs and headings have their string content parsed for inline elements. Code blocks, HTML blocks, and other leaf nodes preserve their content as-is. + +## Lazy Continuation Lines + +The CommonMark spec defines "lazy continuation lines" — lines that continue a paragraph without matching all container prefixes. For example: + +```markdown +> This is a block quote +with a lazy continuation line +``` + +The second line doesn't start with `>` but still belongs to the paragraph inside the block quote. The parser handles this by checking whether the line could be added to an existing open paragraph rather than closing and starting a new one. + +## Cross-References + +- [parser.h](../../../cmark/src/parser.h) — Parser struct definition +- [blocks.c](../../../cmark/src/blocks.c) — Full implementation +- [inline-parsing.md](inline-parsing.md) — Phase 2 parsing +- [scanner-system.md](scanner-system.md) — Scanner functions used for block detection +- [reference-system.md](reference-system.md) — How reference definitions are extracted +- [ast-node-system.md](ast-node-system.md) — Node creation and tree structure diff --git a/docs/handbook/cmark/building.md b/docs/handbook/cmark/building.md new file mode 100644 index 0000000000..56272af2be --- /dev/null +++ b/docs/handbook/cmark/building.md @@ -0,0 +1,268 @@ +# cmark — Building + +## Build System Overview + +cmark uses CMake (minimum version 3.14) as its build system. The top-level `CMakeLists.txt` defines the project as C/CXX with version 0.31.2. It configures C99 standard without extensions, sets up export header generation, CTest integration, and subdirectory targets for the library, CLI tool, tests, man pages, and fuzz harness. 
+ +## Prerequisites + +- A C99-compliant compiler (GCC, Clang, MSVC) +- CMake 3.14 or later +- POSIX environment (for man page generation; skipped on Windows) +- Optional: re2c (only needed if modifying `scanners.re`) +- Optional: Python 3 (for running spec tests) + +## Basic Build Steps + +```bash +# Out-of-source build (required — in-source builds are explicitly blocked) +mkdir build && cd build +cmake .. +make +``` + +The CMakeLists.txt enforces out-of-source builds with: + +```cmake +if("${CMAKE_SOURCE_DIR}" STREQUAL "${CMAKE_BINARY_DIR}") + message(FATAL_ERROR "Do not build in-source.\nPlease remove CMakeCache.txt and the CMakeFiles/ directory.\nThen: mkdir build ; cd build ; cmake .. ; make") +endif() +``` + +## CMake Configuration Options + +### Library Type + +```cmake +option(BUILD_SHARED_LIBS "Build the CMark library as shared" OFF) +``` + +By default, cmark builds as a **static library**. Set `-DBUILD_SHARED_LIBS=ON` for a shared library. When building as static, the compile definition `CMARK_STATIC_DEFINE` is automatically set. + +**Legacy options** (deprecated but still functional for backwards compatibility): +- `CMARK_SHARED` — replaced by `BUILD_SHARED_LIBS` +- `CMARK_STATIC` — replaced by `BUILD_SHARED_LIBS` (inverted logic) + +Both emit `AUTHOR_WARNING` messages advising migration to the standard CMake variable. + +### Fuzzing Support + +```cmake +option(CMARK_LIB_FUZZER "Build libFuzzer fuzzing harness" OFF) +``` + +When enabled, targets matching `fuzz` get `-fsanitize=fuzzer`, while all other targets get `-fsanitize=fuzzer-no-link`. + +### Build Types + +The project supports these build types via `CMAKE_BUILD_TYPE`: + +| Type | Description | +|------|-------------| +| `Release` | Default. 
Optimized build | +| `Debug` | Adds `-DCMARK_DEBUG_NODES` for node integrity checking via `assert()` | +| `Profile` | Adds `-pg` for profiling with gprof | +| `Asan` | Address sanitizer (loads `FindAsan` module) | +| `Ubsan` | Adds `-fsanitize=undefined` for undefined behavior sanitizer | + +Debug builds automatically add node structure checking: + +```cmake +add_compile_options($<$<CONFIG:Debug>:-DCMARK_DEBUG_NODES>) +``` + +## Compiler Flags + +The `cmark_add_compile_options()` function applies compiler warnings per-target (not globally), so cmark can be used as a subdirectory in projects with other languages: + +**GCC/Clang:** +``` +-Wall -Wextra -pedantic -Wstrict-prototypes (C only) +``` + +**MSVC:** +``` +-D_CRT_SECURE_NO_WARNINGS +``` + +Visibility is set globally to hidden, with explicit export via the generated `cmark_export.h`: + +```cmake +set(CMAKE_C_VISIBILITY_PRESET hidden) +set(CMAKE_VISIBILITY_INLINES_HIDDEN 1) +``` + +## Library Target: `cmark` + +Defined in `src/CMakeLists.txt`, the `cmark` library target includes these source files: + +```cmake +add_library(cmark + blocks.c buffer.c cmark.c cmark_ctype.c + commonmark.c houdini_href_e.c houdini_html_e.c houdini_html_u.c + html.c inlines.c iterator.c latex.c + man.c node.c references.c render.c + scanners.c scanners.re utf8.c xml.c) +``` + +Target properties: +```cmake +set_target_properties(cmark PROPERTIES + OUTPUT_NAME "cmark" + PDB_NAME libcmark # Avoid PDB name clash with executable + POSITION_INDEPENDENT_CODE YES + SOVERSION ${PROJECT_VERSION} # Includes minor + patch in soname + VERSION ${PROJECT_VERSION}) +``` + +The library exposes headers via its interface include directories: +```cmake +target_include_directories(cmark INTERFACE + $ + $ + $) +``` + +The export header is generated automatically: +```cmake +generate_export_header(cmark BASE_NAME ${PROJECT_NAME}) +``` + +This produces `cmark_export.h` containing `CMARK_EXPORT` macros that resolve to `__declspec(dllexport/dllimport)` on Windows or 
`__attribute__((visibility("default")))` on Unix. + +## Executable Target: `cmark_exe` + +```cmake +add_executable(cmark_exe main.c) +set_target_properties(cmark_exe PROPERTIES + OUTPUT_NAME "cmark" + INSTALL_RPATH "${Base_rpath}") +target_link_libraries(cmark_exe PRIVATE cmark) +``` + +The executable has the same output name as the library (`cmark`), but the PDB names differ to avoid conflicts on Windows. + +## Generated Files + +Two files are generated at configure time: + +### `cmark_version.h` + +Generated from `cmark_version.h.in`: +```cmake +configure_file(cmark_version.h.in ${CMAKE_CURRENT_BINARY_DIR}/cmark_version.h) +``` + +Contains `CMARK_VERSION` (integer) and `CMARK_VERSION_STRING` (string) macros. + +### `libcmark.pc` + +Generated from `libcmark.pc.in` for pkg-config integration: +```cmake +configure_file(libcmark.pc.in ${CMAKE_CURRENT_BINARY_DIR}/libcmark.pc @ONLY) +``` + +## Test Infrastructure + +Tests are enabled via CMake's standard `BUILD_TESTING` option (defaults to ON): + +```cmake +if(BUILD_TESTING) + add_subdirectory(api_test) + add_subdirectory(test testdir) +endif() +``` + +### API Tests (`api_test/`) + +C-level API tests that exercise the public API functions directly — node creation, manipulation, parsing, rendering. + +### Spec Tests (`test/`) + +CommonMark specification conformance tests. These parse expected input/output pairs from the CommonMark spec and verify cmark produces the correct output. + +## RPATH Configuration + +For shared library builds, the install RPATH is set to the library directory: + +```cmake +if(BUILD_SHARED_LIBS) + set(p "${CMAKE_INSTALL_FULL_LIBDIR}") + list(FIND CMAKE_PLATFORM_IMPLICIT_LINK_DIRECTORIES "${p}" i) + if("${i}" STREQUAL "-1") + set(Base_rpath "${p}") + endif() +endif() +set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE) +``` + +This ensures the executable can find the shared library at runtime without requiring `LD_LIBRARY_PATH`. 
+ +## Man Page Generation + +Man pages are built on non-Windows platforms: + +```cmake +if(NOT CMAKE_SYSTEM_NAME STREQUAL Windows) + add_subdirectory(man) +endif() +``` + +## Building for Fuzzing + +To build the libFuzzer harness: + +```bash +mkdir build-fuzz && cd build-fuzz +cmake -DCMARK_LIB_FUZZER=ON -DCMAKE_C_COMPILER=clang .. +make +``` + +The fuzz targets are in the `fuzz/` subdirectory. + +## Platform-Specific Notes + +### OpenBSD + +The CLI tool uses `pledge(2)` on OpenBSD 6.0+ for sandboxing: +```c +#if defined(__OpenBSD__) +# include <sys/param.h> +# if OpenBSD >= 201605 +# define USE_PLEDGE +# include <unistd.h> +# endif +#endif +``` + +The pledge sequence is: +1. Before parsing: `pledge("stdio rpath", NULL)` — allows reading files +2. After parsing, before rendering: `pledge("stdio", NULL)` — drops file read access + +### Windows + +On Windows (non-Cygwin), binary mode is set for stdin/stdout to prevent CR/LF translation: +```c +#if defined(_WIN32) && !defined(__CYGWIN__) + _setmode(_fileno(stdin), _O_BINARY); + _setmode(_fileno(stdout), _O_BINARY); +#endif +``` + +## Scanner Regeneration + +The `scanners.c` file is generated from `scanners.re` using re2c. To regenerate: + +```bash +re2c --case-insensitive -b -i --no-generation-date -8 \ + -o scanners.c scanners.re +``` + +The generated file is checked into the repository, so re2c is not required for normal builds. 
+ +## Cross-References + +- [cli-usage.md](cli-usage.md) — Command-line tool details and options +- [testing.md](testing.md) — Test framework details +- [code-style.md](code-style.md) — Coding conventions +- [scanner-system.md](scanner-system.md) — Scanner generation details diff --git a/docs/handbook/cmark/cli-usage.md b/docs/handbook/cmark/cli-usage.md new file mode 100644 index 0000000000..d77c3b8fa9 --- /dev/null +++ b/docs/handbook/cmark/cli-usage.md @@ -0,0 +1,249 @@ +# cmark — CLI Usage + +## Overview + +The `cmark` command-line tool (`main.c`) reads CommonMark input from files or stdin and renders it to one of five output formats. It serves as both a reference implementation and a practical conversion tool. + +## Entry Point + +```c +int main(int argc, char *argv[]); +``` + +## Output Formats + +```c +typedef enum { + FORMAT_NONE, + FORMAT_HTML, + FORMAT_XML, + FORMAT_MAN, + FORMAT_COMMONMARK, + FORMAT_LATEX, +} writer_format; +``` + +Default: `FORMAT_HTML`. + +## Command-Line Options + +| Option | Long Form | Description | +|--------|-----------|-------------| +| `-t FORMAT` | `--to FORMAT` | Output format: `html`, `xml`, `man`, `commonmark`, `latex` | +| | `--width N` | Wrapping width (0 = no wrapping; default 0). 
Only affects `commonmark`, `man`, `latex` | +| | `--sourcepos` | Include source position information | +| | `--hardbreaks` | Render soft breaks as hard breaks | +| | `--nobreaks` | Render soft breaks as spaces | +| | `--unsafe` | Allow raw HTML and dangerous URLs | +| | `--smart` | Enable smart punctuation (curly quotes, em/en dashes, ellipses) | +| | `--validate-utf8` | Validate and clean UTF-8 input | +| `-h` | `--help` | Print usage information | +| | `--version` | Print version string | + +## Option Parsing + +```c +for (i = 1; i < argc; i++) { + if (strcmp(argv[i], "--version") == 0) { + printf("cmark %s", cmark_version_string()); + printf(" - CommonMark converter\n(C) 2014-2016 John MacFarlane\n"); + exit(0); + } else if (strcmp(argv[i], "--sourcepos") == 0) { + options |= CMARK_OPT_SOURCEPOS; + } else if (strcmp(argv[i], "--hardbreaks") == 0) { + options |= CMARK_OPT_HARDBREAKS; + } else if (strcmp(argv[i], "--nobreaks") == 0) { + options |= CMARK_OPT_NOBREAKS; + } else if (strcmp(argv[i], "--smart") == 0) { + options |= CMARK_OPT_SMART; + } else if (strcmp(argv[i], "--unsafe") == 0) { + options |= CMARK_OPT_UNSAFE; + } else if (strcmp(argv[i], "--validate-utf8") == 0) { + options |= CMARK_OPT_VALIDATE_UTF8; + } else if ((strcmp(argv[i], "--to") == 0 || strcmp(argv[i], "-t") == 0) && + i + 1 < argc) { + i++; + if (strcmp(argv[i], "man") == 0) writer = FORMAT_MAN; + else if (strcmp(argv[i], "html") == 0) writer = FORMAT_HTML; + else if (strcmp(argv[i], "xml") == 0) writer = FORMAT_XML; + else if (strcmp(argv[i], "commonmark") == 0) writer = FORMAT_COMMONMARK; + else if (strcmp(argv[i], "latex") == 0) writer = FORMAT_LATEX; + else { + fprintf(stderr, "Unknown format %s\n", argv[i]); + exit(1); + } + } else if (strcmp(argv[i], "--width") == 0 && i + 1 < argc) { + i++; + width = atoi(argv[i]); + } else if (strcmp(argv[i], "-h") == 0 || strcmp(argv[i], "--help") == 0) { + print_usage(); + exit(0); + } else if (*argv[i] == '-') { + print_usage(); + exit(1); + } 
else { + // Treat as filename + files[numfps++] = i; + } +} +``` + +## Input Handling + +### File Input + +```c +for (i = 0; i < numfps; i++) { + fp = fopen(argv[files[i]], "rb"); + if (fp == NULL) { + fprintf(stderr, "Error opening file %s: %s\n", argv[files[i]], strerror(errno)); + exit(1); + } + // Read in chunks and feed to parser + while ((bytes = fread(buffer, 1, sizeof(buffer), fp)) > 0) { + cmark_parser_feed(parser, buffer, bytes); + if (bytes < sizeof(buffer)) break; + } + fclose(fp); +} +``` + +Files are opened in binary mode (`"rb"`) and read in chunks of `BUFFER_SIZE` (4096 bytes). Each chunk is fed to the streaming parser via `cmark_parser_feed()`. + +### Stdin Input + +```c +if (numfps == 0) { + // Read from stdin + while ((bytes = fread(buffer, 1, sizeof(buffer), stdin)) > 0) { + cmark_parser_feed(parser, buffer, bytes); + if (bytes < sizeof(buffer)) break; + } +} +``` + +When no files are specified, input is read from stdin. + +### Windows Binary Mode + +```c +#if defined(_WIN32) && !defined(__CYGWIN__) +_setmode(_fileno(stdin), _O_BINARY); +_setmode(_fileno(stdout), _O_BINARY); +#endif +``` + +On Windows, stdin and stdout are set to binary mode to prevent CR/LF translation. 
+ +## Rendering + +```c +document = cmark_parser_finish(parser); +cmark_parser_free(parser); + +// Render based on format; print_document() returns void and writes to stdout +print_document(document, writer, width, options); +``` + +### `print_document()` + +```c +static void print_document(cmark_node *document, writer_format writer, + int width, int options) { + char *result; + switch (writer) { + case FORMAT_HTML: + result = cmark_render_html(document, options); + break; + case FORMAT_XML: + result = cmark_render_xml(document, options); + break; + case FORMAT_MAN: + result = cmark_render_man(document, options, width); + break; + case FORMAT_COMMONMARK: + result = cmark_render_commonmark(document, options, width); + break; + case FORMAT_LATEX: + result = cmark_render_latex(document, options, width); + break; + default: + fprintf(stderr, "Unknown format %d\n", writer); + exit(1); + } + printf("%s", result); + document->mem->free(result); +} +``` + +The rendered result is written to stdout and then freed. + +### Cleanup + +```c +cmark_node_free(document); +``` + +The AST is freed after rendering. + +## OpenBSD Security + +```c +#ifdef __OpenBSD__ + if (pledge("stdio rpath", NULL) != 0) { + perror("pledge"); + return 1; + } +#endif +``` + +On OpenBSD, the program restricts itself to `stdio` and `rpath` (read-only file access) via `pledge()`. This prevents the cmark binary from performing any operations beyond reading files and writing to stdout/stderr. 
+ +## Usage Examples + +```bash +# Convert Markdown to HTML +cmark input.md + +# Convert with smart punctuation +cmark --smart input.md + +# Convert to man page with 72-column wrapping +cmark -t man --width 72 input.md + +# Convert to LaTeX +cmark -t latex input.md + +# Round-trip through CommonMark +cmark -t commonmark input.md + +# Include source positions in output +cmark --sourcepos input.md + +# Allow raw HTML passthrough +cmark --unsafe input.md + +# Read from stdin +echo "# Hello" | cmark + +# Validate UTF-8 input +cmark --validate-utf8 input.md + +# Print version +cmark --version +``` + +## Exit Codes + +- `0` — Success +- `1` — Error (unknown option, file open failure, unknown format) + +## Cross-References + +- [main.c](../../../cmark/src/main.c) — Full implementation +- [public-api.md](public-api.md) — The C API functions called by main +- [html-renderer.md](html-renderer.md) — `cmark_render_html()` +- [xml-renderer.md](xml-renderer.md) — `cmark_render_xml()` +- [latex-renderer.md](latex-renderer.md) — `cmark_render_latex()` +- [man-renderer.md](man-renderer.md) — `cmark_render_man()` +- [commonmark-renderer.md](commonmark-renderer.md) — `cmark_render_commonmark()` diff --git a/docs/handbook/cmark/code-style.md b/docs/handbook/cmark/code-style.md new file mode 100644 index 0000000000..0ac2af2def --- /dev/null +++ b/docs/handbook/cmark/code-style.md @@ -0,0 +1,293 @@ +# cmark — Code Style and Conventions + +## Overview + +This document describes the coding conventions and patterns used throughout the cmark codebase. Understanding these conventions makes the source code easier to navigate. 
+ +## Naming Conventions + +### Public API Functions + +All public functions use the `cmark_` prefix: +```c +cmark_node *cmark_node_new(cmark_node_type type); +cmark_parser *cmark_parser_new(int options); +char *cmark_render_html(cmark_node *root, int options); +``` + +### Internal (Static) Functions + +File-local static functions use the `S_` prefix: +```c +static void S_render_node(cmark_node *node, cmark_event_type ev_type, + struct render_state *state, int options); +static cmark_node *S_node_new(cmark_node_type type, cmark_mem *mem); +static void S_free_nodes(cmark_node *e); +static bool S_is_leaf(cmark_node *node); +static int S_get_enumlevel(cmark_node *node); +``` + +This convention makes it immediately clear whether a function has file-local scope. + +### Internal (Non-Static) Functions + +Functions that are internal to the library but shared across translation units use: +- `cmark_` prefix (same as public) — declared in private headers (e.g., `parser.h`, `node.h`) +- No `S_` prefix + +Examples: +```c +// In node.h (private header): +void cmark_node_set_type(cmark_node *node, cmark_node_type type); +cmark_node *make_block(cmark_mem *mem, cmark_node_type type, + int start_line, int start_column); +``` + +### Struct Members + +No prefix convention — struct members use plain names: +```c +struct cmark_node { + cmark_mem *mem; + cmark_node *next; + cmark_node *prev; + cmark_node *parent; + cmark_node *first_child; + cmark_node *last_child; + // ... +}; +``` + +### Type Names + +Typedefs use the `cmark_` prefix: +```c +typedef struct cmark_node cmark_node; +typedef struct cmark_parser cmark_parser; +typedef struct cmark_iter cmark_iter; +typedef int32_t bufsize_t; // Exception: no cmark_ prefix +``` + +### Enum Values + +Enum constants use the `CMARK_` prefix with UPPER_CASE: +```c +typedef enum { + CMARK_NODE_NONE, + CMARK_NODE_DOCUMENT, + CMARK_NODE_BLOCK_QUOTE, + // ... 
+} cmark_node_type; +``` + +### Preprocessor Macros + +Macros use UPPER_CASE, sometimes with `CMARK_` prefix: +```c +#define CMARK_OPT_SOURCEPOS (1 << 1) +#define CMARK_BUF_INIT(mem) { mem, cmark_strbuf__initbuf, 0, 0 } +#define MAX_LINK_LABEL_LENGTH 999 +#define CODE_INDENT 4 +``` + +## Error Handling Patterns + +### Allocation Failure + +The default allocator (`xcalloc`, `xrealloc`) aborts on failure: +```c +static void *xcalloc(size_t nmemb, size_t size) { + void *ptr = calloc(nmemb, size); + if (!ptr) abort(); + return ptr; +} +``` + +With the default allocator, functions that allocate never return NULL: they either succeed or terminate the process. This eliminates NULL-check boilerplate throughout the codebase. + +### Invalid Input + +Functions that receive invalid arguments typically: +1. Return 0/false/NULL for queries +2. Do nothing for mutations +3. Never crash + +Example from `node.c`: +```c +int cmark_node_set_heading_level(cmark_node *node, int level) { + if (node == NULL || node->type != CMARK_NODE_HEADING) return 0; + if (level < 1 || level > 6) return 0; + node->as.heading.level = level; + return 1; +} +``` + +### Return Conventions + +- **1/0 for success/failure**: Setter functions return 1 on success, 0 on failure +- **NULL for not found**: Lookup functions return NULL when the item doesn't exist +- **Assertion for invariants**: Internal invariants use `assert()`: + ```c + assert(googled_node->type == CMARK_NODE_DOCUMENT); + ``` + +## Header Guard Style + +```c +#ifndef CMARK_NODE_H +#define CMARK_NODE_H +// ... +#endif +``` + +Guards use `CMARK_` prefix + uppercase filename + `_H`.
+ +## Include Patterns + +### Public Headers +```c +#include "cmark.h" // Always first — provides all public types +``` + +### Private Headers +```c +#include "node.h" // Internal node definitions +#include "parser.h" // Parser internals +#include "buffer.h" // cmark_strbuf +#include "chunk.h" // cmark_chunk +#include "references.h" // Reference map +#include "utf8.h" // UTF-8 utilities +#include "scanners.h" // re2c-generated scanners +``` + +### System Headers +```c +#include +#include +#include +#include +``` + +## Inline Functions + +The `CMARK_INLINE` macro abstracts compiler-specific inline syntax: +```c +#ifdef _MSC_VER +#define CMARK_INLINE __forceinline +#else +#define CMARK_INLINE __inline__ +#endif +``` + +Used for small, hot-path functions in headers: +```c +static CMARK_INLINE void cmark_chunk_free(cmark_mem *mem, cmark_chunk *c) { ... } +static CMARK_INLINE cmark_chunk cmark_chunk_dup(...) { ... } +``` + +## Memory Ownership Patterns + +### Owning vs Non-Owning + +The `cmark_chunk` type makes ownership explicit: +- `alloc > 0` → the chunk owns the memory and must free it +- `alloc == 0` → the chunk borrows memory from elsewhere + +### Transfer of Ownership + +`cmark_strbuf_detach()` transfers ownership from a strbuf to the caller: +```c +unsigned char *data = cmark_strbuf_detach(&buf); +// Caller now owns 'data' and must free it +``` + +### Consistent Cleanup + +Free functions null out pointers after freeing: +```c +static CMARK_INLINE void cmark_chunk_free(cmark_mem *mem, cmark_chunk *c) { + if (c->alloc) + mem->free((void *)c->data); + c->data = NULL; // NULL after free + c->alloc = 0; + c->len = 0; +} +``` + +## Iterative vs Recursive Patterns + +The codebase avoids recursion for tree operations to prevent stack overflow on deeply nested input: + +### Iterative Tree Destruction +`S_free_nodes()` uses sibling-list splicing instead of recursion: +```c +// Splice children into sibling chain +if (e->first_child) { + cmark_node *last = e->last_child; + 
last->next = e->next; + e->next = e->first_child; +} +``` + +### Iterator-Based Traversal +All rendering uses `cmark_iter` instead of recursive `render_children()`: +```c +while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) { + cur = cmark_iter_get_node(iter); + S_render_node(cur, ev_type, &state, options); +} +``` + +## Type Size Definitions + +```c +typedef int32_t bufsize_t; +``` + +Buffer sizes use `int32_t` (not `size_t`) to: +1. Allow negative values for error signaling +2. Keep node structs compact (32-bit vs 64-bit on LP64) +3. Limit maximum allocation to 2GB (adequate for text processing) + +## Bitmask Patterns + +Option flags use single-bit constants: +```c +#define CMARK_OPT_SOURCEPOS (1 << 1) +#define CMARK_OPT_HARDBREAKS (1 << 2) +#define CMARK_OPT_UNSAFE (1 << 17) +#define CMARK_OPT_NOBREAKS (1 << 4) +#define CMARK_OPT_VALIDATE_UTF8 (1 << 9) +#define CMARK_OPT_SMART (1 << 10) +``` + +Tested with bitwise AND: +```c +if (options & CMARK_OPT_SOURCEPOS) { ... } +``` + +Combined with bitwise OR: +```c +int options = CMARK_OPT_SOURCEPOS | CMARK_OPT_SMART; +``` + +## Leaf Mask Pattern + +`S_is_leaf()` in `iterator.c` uses a bitmask for O(1) node-type classification: +```c +static const int S_leaf_mask = + (1 << CMARK_NODE_HTML_BLOCK) | (1 << CMARK_NODE_THEMATIC_BREAK) | + (1 << CMARK_NODE_CODE_BLOCK) | (1 << CMARK_NODE_TEXT) | ...; + +static bool S_is_leaf(cmark_node *node) { + return ((1 << node->type) & S_leaf_mask) != 0; +} +``` + +This is more efficient than a switch statement for a simple boolean classification. 
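
The leaf-mask idea can be tried in isolation. The following is a minimal sketch, assuming a made-up enum (`demo_node_type`) rather than cmark's real `cmark_node_type` values; every `demo_` name is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node types for illustration; not cmark's actual enum. */
typedef enum {
  DEMO_NODE_DOCUMENT,
  DEMO_NODE_PARAGRAPH,
  DEMO_NODE_CODE_BLOCK,
  DEMO_NODE_THEMATIC_BREAK,
  DEMO_NODE_TEXT
} demo_node_type;

/* One bit per leaf type: the bit position is the enum value. */
static const unsigned demo_leaf_mask =
    (1u << DEMO_NODE_CODE_BLOCK) | (1u << DEMO_NODE_THEMATIC_BREAK) |
    (1u << DEMO_NODE_TEXT);

/* A single shift and AND answers "is this type a leaf?". */
static bool demo_is_leaf(demo_node_type type) {
  assert((unsigned)type < 32u); /* the mask only has 32 bits */
  return ((1u << type) & demo_leaf_mask) != 0;
}
```

The approach only works while every enum value stays below the bit width of the mask, which is why the sketch asserts that bound.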
+ +## Cross-References + +- [architecture.md](architecture.md) — Design decisions +- [memory-management.md](memory-management.md) — Allocator patterns +- [public-api.md](public-api.md) — Public API naming diff --git a/docs/handbook/cmark/commonmark-renderer.md b/docs/handbook/cmark/commonmark-renderer.md new file mode 100644 index 0000000000..01ffb3a987 --- /dev/null +++ b/docs/handbook/cmark/commonmark-renderer.md @@ -0,0 +1,344 @@ +# cmark — CommonMark Renderer + +## Overview + +The CommonMark renderer (`commonmark.c`) converts a `cmark_node` AST back into CommonMark-formatted Markdown text. This is significantly more complex than the other renderers because it must reproduce syntactically valid Markdown that, when re-parsed, produces an equivalent AST. It uses the generic render framework from `render.c`. + +## Entry Point + +```c +char *cmark_render_commonmark(cmark_node *root, int options, int width); +``` + +- `root` — AST root node +- `options` — Option flags +- `width` — Target line width for wrapping; 0 disables wrapping + +## Character Escaping (`outc`) + +The CommonMark escaping is the most complex of all renderers. Three escaping modes exist: + +### NORMAL Mode + +Characters that could be interpreted as Markdown syntax must be backslash-escaped. Characters that trigger escaping: + +```c +case '*': +case '#': +case '(': +case ')': +case '[': +case ']': +case '<': +case '>': +case '!': +case '\\': + // Backslash-escaped: \*, \#, \(, etc. 
+``` + +Additionally: +- `.` and `)`: only escaped at line start (after a digit), to prevent triggering ordered list syntax +- `-`, `+`, `=`, `_`: only escaped at line start, to prevent thematic breaks, bullet lists, or setext headings +- `~`: only escaped at line start +- `&`: escaped to prevent entity references +- `'`, `"`: escaped for smart punctuation + +For whitespace handling: +- NBSP (`\xA0`) → `\xa0` (the literal non-breaking space character) +- Tab → space (tabs cannot be reliably round-tripped) + +### URL Mode + +Only `(`, `)`, and whitespace `\x20` are escaped with backslashes. URLs in parenthesized `()` format need minimal escaping. + +### TITLE Mode + +For link titles, only the title delimiter character is escaped. The renderer currently always uses `"` as the title delimiter, so `"` is backslash-escaped within titles. + +## Backtick Sequence Analysis + +Two helper functions determine how to format inline code spans: + +### `longest_backtick_sequence()` + +```c +static int longest_backtick_sequence(const char *code) { + int longest = 0; + int current = 0; + size_t i = 0; + size_t code_len = strlen(code); + while (i <= code_len) { + if (code[i] == '`') { + current++; + } else { + if (current > longest) + longest = current; + current = 0; + } + i++; + } + return longest; +} +``` + +Finds the maximum run of consecutive backticks within a code string. + +### `shortest_unused_backtick_sequence()` + +```c +static int shortest_unused_backtick_sequence(const char *code) { + uint32_t used = 1; // Bit 0 is a sentinel; bits 1-31 mark run lengths 1-31 + int current = 0; + // ... scan for runs, set bits in 'used' + int i = 0; + while (i < 32 && (used & 1)) { + used >>= 1; + i++; + } + return i; // Index of the first clear bit +} +``` + +Determines the shortest backtick sequence (1-32) that does NOT appear in the code content. This ensures the code delimiter won't conflict with backticks inside the code.
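
For readers who want to experiment with the bitmask search, here is a self-contained sketch. It assumes that bit `k` of the mask records a run of exactly `k` backticks and that bit 0 is pre-set as a sentinel; the function name `demo_shortest_unused_backticks` is hypothetical and the body is not copied from `commonmark.c`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the shortest backtick-run length (1..32) that does not occur in
 * `code`. Bit k of `used` records that a run of exactly k backticks was seen;
 * bit 0 is pre-set so the search for a free length starts at 1. */
static int demo_shortest_unused_backticks(const char *code) {
  assert(code != NULL);
  uint32_t used = 1;
  int current = 0;
  size_t len = strlen(code);

  /* Scanning up to and including the terminating NUL flushes a trailing run. */
  for (size_t i = 0; i <= len; i++) {
    if (code[i] == '`') {
      current++;
    } else {
      if (current > 0 && current < 32)
        used |= (uint32_t)1 << current;
      current = 0;
    }
  }

  /* The index of the lowest clear bit is the answer. */
  int n = 0;
  while (n < 32 && (used & 1u)) {
    used >>= 1;
    n++;
  }
  return n;
}
```

Runs of 32 or more backticks are not recorded in this sketch, so 32 is the largest value it can return.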
+ +Uses a clever bit-manipulation approach: a 32-bit integer `used` tracks which backtick sequence lengths appear. After scanning, the position of the first unset bit gives the shortest unused length. + +## Autolink Detection + +```c +static bool is_autolink(cmark_node *node) { + const char *title; + const char *url; + // ... + if (node->first_child->type != CMARK_NODE_TEXT) return false; + url = (char *)node->as.link.url; + title = (char *)node->as.link.title; + if (title && title[0]) return false; // Autolinks have no title + if (url && + (strncmp(url, "http://", 7) == 0 || strncmp(url, "https://", 8) == 0 || + strncmp(url, "mailto:", 7) == 0) && + strcmp(url, (char *)node->first_child->data) == 0) + return true; + return false; +} +``` + +A link is an autolink if: +1. It has exactly one child, a text node +2. No title +3. URL starts with `http://`, `https://`, or `mailto:` +4. The text exactly matches the URL + +## Node Rendering (`S_render_node`) + +### Block Nodes + +#### Document +No output. + +#### Block Quote +``` +ENTER: Sets prefix to "> " for first line and "> " for continuations +EXIT: Restores prefix, adds blank line +``` + +The prefix mechanism is central to CommonMark rendering. When entering a block quote: +```c +cmark_strbuf_puts(renderer->prefix, "> "); +``` + +All content within the block quote is prefixed with `"> "` on each line. + +#### List +``` +ENTER: Records tight/loose status, records bullet character +EXIT: Restores prefix, adds blank line +``` + +The renderer stores whether the list is tight to control inter-item blank lines. + +#### Item +``` +ENTER: Computes marker and indentation prefix +EXIT: Restores prefix +``` + +**Bullet items:** Use `-`, `*`, or `+` (from `cmark_node_get_list_delim`). 
The prefix is set to appropriate indentation: + +```c +// For a bullet item: +"- " on the first line +" " on continuation lines (indentation matches marker width) +``` + +**Ordered items:** Number is computed by counting previous siblings: +```c +list_number = cmark_node_get_list_start(node->parent); +tmp = node; +while (tmp->prev) { + tmp = tmp->prev; + list_number++; +} +``` + +Format: `"N. "` or `"N) "` depending on delimiter type. Continuation indent matches the marker width. + +For tight lists, items don't emit blank lines between them. + +#### Heading +**ATX headings** (levels 1-6): +``` +### Content\n +``` + +The number of `#` characters matches the heading level. A newline follows the heading content. + +**Setext headings** (levels 1-2 when `width > 0`): +Not used — the renderer always uses ATX headings. + +#### Code Block +The renderer determines whether to use fenced or indented code: + +**Fenced code blocks:** +``` +```[info] +content +``` +``` + +The fence character is `` ` ``. The fence length is max(3, longest_backtick_in_content + 1). + +If the code has an info string, fenced blocks are always used (indented blocks cannot carry info strings). + +**Indented code blocks:** +If there's no info string and `width == 0`, the renderer uses 4-space indentation by setting the prefix to `" "`. + +#### HTML Block +Content is output LITERALLY (no escaping): +```c +cmark_render_ascii(renderer, (char *)node->data); +``` + +This preserves raw HTML exactly. + +#### Thematic Break +``` +---\n +``` + +Uses `---` (three hyphens). + +#### Paragraph +``` +ENTER: (nothing for tight, blank line for normal) +EXIT: \n (newline after content) +``` + +In tight lists, paragraphs don't add blank lines before/after. + +### Inline Nodes + +#### Text +Output with NORMAL escaping (all Markdown-significant characters escaped). 
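
As a rough illustration of NORMAL-mode escaping for text nodes, the sketch below backslash-escapes only the characters this document lists as always escaped (`*`, `#`, `(`, `)`, `[`, `]`, `<`, `>`, `!`, `\`). It deliberately ignores the position-dependent cases such as `-` or `.` at line start. The `demo_` functions and the fixed-size output buffer are assumptions made for the example, not part of cmark.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Characters that the NORMAL-mode description escapes unconditionally. */
static int demo_needs_escape(char c) {
  return c != '\0' && strchr("*#()[]<>!\\", c) != NULL;
}

/* Copies `src` into `dest`, putting a backslash before each such character. */
static void demo_escape_normal(char *dest, size_t dest_size, const char *src) {
  size_t n = 0;
  assert(dest_size > 0);
  for (; *src; src++) {
    assert(n + 2 < dest_size); /* room for '\\', the char, and the NUL */
    if (demo_needs_escape(*src))
      dest[n++] = '\\';
    dest[n++] = *src;
  }
  dest[n] = '\0';
}
```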
+ +#### Soft Break +Depends on options: +- `CMARK_OPT_HARDBREAKS`: `\\\n` (backslash line break) +- `CMARK_OPT_NOBREAKS`: space +- Default: newline + +#### Line Break +``` +\\\n +``` + +Backslash followed by newline. + +#### Code (inline) +The renderer selects delimiters using `shortest_unused_backtick_sequence()`: + +```c +int numticks = shortest_unused_backtick_sequence(code); +// output numticks backticks +// if code starts or ends with backtick, add space padding +// output literal code +// output numticks backticks +``` + +If the code content starts or ends with a backtick, spaces are added inside the delimiters to prevent ambiguity: +``` +`` `code` `` +``` + +#### Emphasis +``` +ENTER: * or _ (delimiter character) +EXIT: * or _ (matching delimiter) +``` + +The delimiter selection depends on what characters appear in the content. If the content contains `*`, `_` is preferred (and vice versa). The `emph_delim` variable tracks the chosen delimiter. + +#### Strong +``` +ENTER: ** or __ +EXIT: ** or __ +``` + +Same delimiter selection logic as emphasis. + +#### Link +**Autolinks:** +``` + +``` + +**Normal links:** +``` +ENTER: [ +EXIT: ](URL "TITLE") or ](URL) if no title +``` + +The URL is output with URL escaping, the title with TITLE escaping. + +#### Image +``` +ENTER: ![ +EXIT: ](URL "TITLE") or ](URL) if no title +``` + +Same as links but with `!` prefix. + +#### HTML Inline +Output literally (no escaping). + +## Prefix Management + +The CommonMark renderer makes extensive use of the prefix system from `render.c`. Each line of output is prefixed with accumulated prefix strings from container nodes. For example, a list item inside a block quote: + +``` +> - Item text +> continuation +``` + +The prefix stack would be: +1. `"> "` from the block quote +2. `" "` (continuation indent) from the list item + +The `cmark_renderer` struct maintains `prefix` and `begin_content` fields to handle this. 
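
The push/restore behaviour of the prefix can be sketched without the real `cmark_renderer`. The example below assumes a hypothetical fixed-size buffer (`demo_renderer`) in place of the `cmark_strbuf` that `render.c` uses; all `demo_` names are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_PREFIX_MAX 64

/* Hypothetical, fixed-size stand-in for the renderer's prefix buffer. */
typedef struct {
  char prefix[DEMO_PREFIX_MAX];
} demo_renderer;

/* Entering a container appends its prefix; the old length is returned so the
 * caller can restore it on exit. */
static size_t demo_push_prefix(demo_renderer *r, const char *s) {
  size_t old_len = strlen(r->prefix);
  assert(old_len + strlen(s) < DEMO_PREFIX_MAX);
  strcat(r->prefix, s);
  return old_len;
}

/* Leaving a container truncates the prefix back to its earlier length. */
static void demo_pop_prefix(demo_renderer *r, size_t old_len) {
  r->prefix[old_len] = '\0';
}

/* Writes one output line, preceded by the accumulated prefix, into `out`. */
static void demo_emit_line(const demo_renderer *r, const char *text,
                           char *out, size_t out_size) {
  snprintf(out, out_size, "%s%s\n", r->prefix, text);
}
```

Pushing `"> "` and then a two-space continuation indent reproduces the nested case described above: every emitted line starts with both prefixes until the matching pop.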
+ +## Round-Trip Fidelity + +The CommonMark renderer aims for round-trip fidelity: parsing the output should produce an AST equivalent to the input. This is not always perfectly achievable: + +1. **Whitespace normalization**: Some whitespace differences (e.g., number of blank lines) are lost. +2. **Reference links**: Inline link syntax is always used; reference-style links are not preserved. +3. **ATX vs setext**: Always uses ATX headings. +4. **Indented vs fenced**: Logic selects one based on info string presence and width setting. +5. **Emphasis delimiter**: May differ from the original (`*` vs `_`). + +## Cross-References + +- [commonmark.c](../../cmark/src/commonmark.c) — Full implementation +- [render-framework.md](render-framework.md) — Generic render framework +- [public-api.md](public-api.md) — `cmark_render_commonmark()` API docs +- [scanner-system.md](scanner-system.md) — Scanners used for autolink detection diff --git a/docs/handbook/cmark/html-renderer.md b/docs/handbook/cmark/html-renderer.md new file mode 100644 index 0000000000..98406c300c --- /dev/null +++ b/docs/handbook/cmark/html-renderer.md @@ -0,0 +1,258 @@ +# cmark — HTML Renderer + +## Overview + +The HTML renderer (`html.c`) converts a `cmark_node` AST into an HTML string. Unlike the LaTeX, man, and CommonMark renderers, it does NOT use the generic render framework from `render.c`. Instead, it writes directly to a `cmark_strbuf` buffer, giving it full control over output formatting. + +## Entry Point + +```c +char *cmark_render_html(cmark_node *root, int options); +``` + +Creates an iterator over the AST, processes each node via `S_render_node()`, and returns the resulting HTML string. The caller is responsible for freeing the returned buffer. 
+ +### Implementation + +```c +char *cmark_render_html(cmark_node *root, int options) { + char *result; + cmark_strbuf html = CMARK_BUF_INIT(root->mem); + cmark_event_type ev_type; + cmark_node *cur; + struct render_state state = {&html, NULL}; + cmark_iter *iter = cmark_iter_new(root); + + while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) { + cur = cmark_iter_get_node(iter); + S_render_node(cur, ev_type, &state, options); + } + result = (char *)cmark_strbuf_detach(&html); + cmark_iter_free(iter); + return result; +} +``` + +## Render State + +```c +struct render_state { + cmark_strbuf *html; // Output buffer + cmark_node *plain; // Non-NULL when rendering image alt text (plain text mode) +}; +``` + +The `plain` field is used for image alt text rendering. When entering an image node, `state->plain` is set to the image node. While `plain` is non-NULL, only text content is emitted (HTML tags are suppressed) — this ensures the `alt` attribute contains only plain text, not nested HTML. When the iterator exits the image node (`state->plain == node`), plain mode is cleared. + +## HTML Escaping + +```c +static void escape_html(cmark_strbuf *dest, const unsigned char *source, + bufsize_t length) { + houdini_escape_html(dest, source, length, 0); +} +``` + +Characters `<`, `>`, `&`, `"` are converted to their HTML entity equivalents. The `0` argument means "not secure mode" (no additional escaping). 
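
To make the effect of the escaping concrete, here is a small stand-alone sketch that maps the four characters named above to entities. It is not houdini's implementation: the `demo_` helpers and the fixed-size destination buffer are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends `s` to `dest`; the assert guards the fixed-size demo buffer. */
static void demo_append(char *dest, size_t dest_size, const char *s) {
  assert(strlen(dest) + strlen(s) < dest_size);
  strcat(dest, s);
}

/* Escapes <, >, & and " in `src` into the NUL-terminated buffer `dest`. */
static void demo_escape_html(char *dest, size_t dest_size, const char *src) {
  assert(dest_size > 0);
  dest[0] = '\0';
  for (; *src; src++) {
    switch (*src) {
    case '<': demo_append(dest, dest_size, "&lt;"); break;
    case '>': demo_append(dest, dest_size, "&gt;"); break;
    case '&': demo_append(dest, dest_size, "&amp;"); break;
    case '"': demo_append(dest, dest_size, "&quot;"); break;
    default: {
      /* Any other character is copied through unchanged. */
      char one[2] = {*src, '\0'};
      demo_append(dest, dest_size, one);
    }
    }
  }
}
```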
+ +## Source Position Attributes + +```c +static void S_render_sourcepos(cmark_node *node, cmark_strbuf *html, int options) { + char buffer[BUFFER_SIZE]; + if (CMARK_OPT_SOURCEPOS & options) { + snprintf(buffer, BUFFER_SIZE, " data-sourcepos=\"%d:%d-%d:%d\"", + cmark_node_get_start_line(node), cmark_node_get_start_column(node), + cmark_node_get_end_line(node), cmark_node_get_end_column(node)); + cmark_strbuf_puts(html, buffer); + } +} +``` + +When `CMARK_OPT_SOURCEPOS` is set, all block-level elements receive a `data-sourcepos` attribute with format `"startline:startcol-endline:endcol"`. + +## Newline Helper + +```c +static inline void cr(cmark_strbuf *html) { + if (html->size && html->ptr[html->size - 1] != '\n') + cmark_strbuf_putc(html, '\n'); +} +``` + +Ensures the output ends with a newline without adding redundant ones. + +## Node Rendering Logic + +The `S_render_node()` function handles each node type in a large switch statement. The `entering` boolean indicates whether this is a `CMARK_EVENT_ENTER` or `CMARK_EVENT_EXIT` event.
+ +### Block Nodes + +#### Document +No output: the document node is purely structural. + +#### Block Quote +``` +ENTER: \n<blockquote>\n +EXIT: \n</blockquote>\n +``` + +#### List +``` +ENTER (bullet): \n<ul>\n +ENTER (ordered): \n<ol>\n (or <ol start="N"> if start > 1) +EXIT: </ul>\n or </ol>\n +``` + +#### Item +``` +ENTER: \n<li> +EXIT: </li>\n +``` + +#### Heading +``` +ENTER: \n<hN> (where N = heading level) +EXIT: </hN>\n +``` + +The heading level is injected into character arrays: +```c +char start_heading[] = "<h0"; +char end_heading[] = "</h0"; +start_heading[2] = (char)('0' + node->as.heading.level); +```
+ +#### Code Block +Always a leaf node (single event). Output: +```html +<pre><code>ESCAPED CONTENT</code></pre>\n +``` + +If the code block has an info string, a `class` attribute is added: +```html +<pre><code class="language-INFO">ESCAPED CONTENT</code></pre>\n +``` + +The `"language-"` prefix is only added if the info string doesn't already start with `"language-"`. + +#### HTML Block +When `CMARK_OPT_UNSAFE` is set, raw HTML is output verbatim. Otherwise, it's replaced with: +```html +<!-- raw HTML omitted --> +``` + +#### Thematic Break +```html +<hr />\n +``` + +#### Paragraph +The paragraph respects tight list context. The renderer checks if the paragraph's grandparent is a list with `tight = true`: + +```c +parent = cmark_node_parent(node); +grandparent = cmark_node_parent(parent); +if (grandparent != NULL && grandparent->type == CMARK_NODE_LIST) { + tight = grandparent->as.list.tight; +} else { + tight = false; +} +``` + +In tight lists, the `<p>` tags are suppressed, so content flows directly without wrapping. + +#### Custom Block +On enter, outputs the `on_enter` text literally. On exit, outputs `on_exit`.
+ +### Inline Nodes + +#### Text +```c +escape_html(html, node->data, node->len); +``` + +All text content is HTML-escaped. + +#### Line Break +```html +<br />\n +``` + +#### Soft Break +Behavior depends on options: +- `CMARK_OPT_HARDBREAKS`: `<br />\n` +- `CMARK_OPT_NOBREAKS`: single space +- Default: `\n` + +#### Code (inline) +```html +<code>ESCAPED CONTENT</code> +``` + +#### HTML Inline +Same as HTML block: verbatim with `CMARK_OPT_UNSAFE`, otherwise `<!-- raw HTML omitted -->`. + +#### Emphasis +``` +ENTER: <em> +EXIT: </em> +``` + +#### Strong +``` +ENTER: <strong> +EXIT: </strong> +``` + +#### Link +``` +ENTER: <a href="ESCAPED_URL" title="ESCAPED_TITLE"> +EXIT: </a> +``` + +URL safety: If `CMARK_OPT_UNSAFE` is NOT set, the URL is checked against `_scan_dangerous_url()`. Dangerous URLs (`javascript:`, `vbscript:`, `file:`, certain `data:` schemes) produce an empty `href`. + +URL escaping uses `houdini_escape_href()` which percent-encodes special characters. Title escaping uses `escape_html()`.
+ +#### Image +``` +ENTER: <img src="ESCAPED_URL" alt=" + (enters plain text mode: state->plain = node) +EXIT: " title="ESCAPED_TITLE" /> +``` + +During plain text mode (between enter and exit), only text content, code content, and HTML inline content are output (HTML-escaped), and breaks are rendered as spaces. + +#### Custom Inline +On enter, outputs `on_enter` literally. On exit, outputs `on_exit`.
+ +## URL Safety + +Links and images check URL safety unless `CMARK_OPT_UNSAFE` is set: + +```c +if (node->as.link.url && ((options & CMARK_OPT_UNSAFE) || + !(_scan_dangerous_url(node->as.link.url)))) { + houdini_escape_href(html, node->as.link.url, + (bufsize_t)strlen((char *)node->as.link.url)); +} +``` + +The `_scan_dangerous_url()` scanner (from `scanners.c`) matches schemes: `javascript:`, `vbscript:`, `file:`, and `data:` (except for safe image MIME types: `image/png`, `image/gif`, `image/jpeg`, `image/webp`). + +## Differences from Framework Renderers + +The HTML renderer differs from the render-framework-based renderers in several ways: + +1. **No line wrapping**: HTML output has no configurable width or word-wrap logic. +2. **No prefix management**: Block quotes and lists don't use prefix strings for indentation; they use HTML tags. +3. **Direct buffer writes**: All output goes directly to a `cmark_strbuf`, with no escaping dispatch function. +4. **No `width` parameter**: `cmark_render_html()` takes only `root` and `options`.
+ +## Cross-References + +- [html.c](../../cmark/src/html.c) — Full implementation +- [render-framework.md](render-framework.md) — The alternative render architecture used by other renderers +- [iterator-system.md](iterator-system.md) — How the AST is traversed +- [scanner-system.md](scanner-system.md) — `_scan_dangerous_url()` for URL safety +- [public-api.md](public-api.md) — `cmark_render_html()` API documentation diff --git a/docs/handbook/cmark/inline-parsing.md b/docs/handbook/cmark/inline-parsing.md new file mode 100644 index 0000000000..4485017305 --- /dev/null +++ b/docs/handbook/cmark/inline-parsing.md @@ -0,0 +1,317 @@ +# cmark — Inline Parsing + +## Overview + +Inline parsing is Phase 2 of cmark's pipeline. Implemented in `inlines.c`, it processes the text content of paragraph and heading nodes, recognizing emphasis (`*`, `_`), code spans (`` ` ``), links (`[text](url)`), images (`![alt](url)`), autolinks (``), raw HTML inline, hard line breaks, soft line breaks, and smart punctuation. + +The entry point is `cmark_parse_inlines()`, called from `process_inlines()` in `blocks.c` after all block structure has been finalized. 
+ +## The `subject` Struct + +All inline parsing state is tracked in the `subject` struct: + +```c +typedef struct { + cmark_mem *mem; // Memory allocator + cmark_chunk input; // The text being parsed + unsigned flags; // Skip flags for HTML constructs + int line; // Source line number + bufsize_t pos; // Current byte position in input + int block_offset; // Column offset of the containing block + int column_offset; // Adjustment for multi-line source position tracking + cmark_reference_map *refmap; // Link reference definitions + delimiter *last_delim; // Top of delimiter stack (linked list, newest first) + bracket *last_bracket; // Top of bracket stack (linked list, newest first) + bufsize_t backticks[MAXBACKTICKS + 1]; // Cached positions of backtick sequences + bool scanned_for_backticks; // Whether the full input has been scanned for backticks + bool no_link_openers; // Optimization: set when no link openers remain +} subject; +``` + +`MAXBACKTICKS` is defined as 1000. The `backticks` array caches the positions of backtick sequences of each length, enabling O(1) lookup once the input has been fully scanned. + +### Skip Flags + +The `flags` field uses bit flags to track which HTML constructs have been confirmed absent: + +```c +#define FLAG_SKIP_HTML_CDATA (1u << 0) +#define FLAG_SKIP_HTML_DECLARATION (1u << 1) +#define FLAG_SKIP_HTML_PI (1u << 2) +#define FLAG_SKIP_HTML_COMMENT (1u << 3) +``` + +Once a scan for a particular HTML construct fails, the flag is set to avoid rescanning. + +## The Delimiter Stack + +Emphasis and smart punctuation use a delimiter stack. 
Each entry is: + +```c +typedef struct delimiter { + struct delimiter *previous; // Link to older delimiter + struct delimiter *next; // Link to newer delimiter (towards top) + cmark_node *inl_text; // The text node created for this delimiter run + bufsize_t position; // Position in the input + bufsize_t length; // Number of delimiter characters remaining + unsigned char delim_char; // '*', '_', '\'', or '"' + bool can_open; // Whether this run can open emphasis + bool can_close; // Whether this run can close emphasis +} delimiter; +``` + +The stack is a doubly-linked list with `last_delim` pointing to the newest entry. + +## The Bracket Stack + +Links and images use a separate bracket stack: + +```c +typedef struct bracket { + struct bracket *previous; // Link to older bracket + cmark_node *inl_text; // The text node for '[' or '![' + bufsize_t position; // Position in the input + bool image; // Whether this is an image opener '![' + bool active; // Can still match (set to false when deactivated) + bool bracket_after; // Whether a '[' appeared after this bracket +} bracket; +``` + +Brackets are deactivated (set `active = false`) when: +- A matching `]` fails to produce a valid link (the opener is deactivated to prevent infinite loops) +- An inner link is formed (outer brackets are deactivated per spec) + +## Emphasis Flanking Rules: `scan_delims()` + +```c +static int scan_delims(subject *subj, unsigned char c, bool *can_open, + bool *can_close); +``` + +This function determines whether a run of `*`, `_`, `'`, or `"` characters can open and/or close emphasis, following the CommonMark spec's Unicode-aware flanking rules: + +1. The function looks at the character **before** the run and the character **after** the run. +2. It uses `cmark_utf8proc_iterate()` to decode the surrounding characters as full Unicode code points. +3. It classifies them using `cmark_utf8proc_is_space()` and `cmark_utf8proc_is_punctuation_or_symbol()`. 
+ +The flanking rules: +- **Left-flanking**: numdelims > 0, character after is not a space, AND (character after is not punctuation OR character before is a space or punctuation) +- **Right-flanking**: numdelims > 0, character before is not a space, AND (character before is not punctuation OR character after is a space or punctuation) + +For `*`: `can_open = left_flanking`, `can_close = right_flanking` + +For `_`: +```c +*can_open = left_flanking && + (!right_flanking || cmark_utf8proc_is_punctuation_or_symbol(before_char)); +*can_close = right_flanking && + (!left_flanking || cmark_utf8proc_is_punctuation_or_symbol(after_char)); +``` + +For `'` and `"` (smart punctuation): +```c +*can_open = left_flanking && + (!right_flanking || before_char == '(' || before_char == '[') && + before_char != ']' && before_char != ')'; +*can_close = right_flanking; +``` + +The function advances `subj->pos` past the delimiter run and returns the number of delimiter characters consumed. For quotes, only 1 delimiter is consumed regardless of how many appear. + +## Emphasis Resolution: `S_insert_emph()` + +```c +static delimiter *S_insert_emph(subject *subj, delimiter *opener, + delimiter *closer); +``` + +When a closing delimiter is found that matches an opener on the stack, this function creates emphasis nodes: + +1. If the opener and closer have combined length >= 2 AND both have individual length >= 2, create a `CMARK_NODE_STRONG` node (consuming 2 characters from each). +2. Otherwise, create a `CMARK_NODE_EMPH` node (consuming 1 character from each). +3. All inline nodes between the opener and closer are moved to become children of the new emphasis node. +4. Any delimiters between the opener and closer are removed from the stack. +5. If the opener is exhausted (`length == 0`), it's removed from the stack. +6. If the closer is exhausted, it's removed too; otherwise, processing continues. 
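
The strong-versus-regular decision in steps 1 and 2 comes down to a small piece of arithmetic. The sketch below assumes that only the two run lengths matter for that choice; `demo_emph_match` and `demo_match_emph` are hypothetical names, and the node creation and stack bookkeeping from steps 3 to 6 are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of one opener/closer match. */
typedef struct {
  bool strong;     /* true: strong emphasis, false: regular emphasis */
  int opener_left; /* delimiter characters left in the opener */
  int closer_left; /* delimiter characters left in the closer */
} demo_emph_match;

static demo_emph_match demo_match_emph(int opener_len, int closer_len) {
  assert(opener_len >= 1 && closer_len >= 1);
  /* Strong emphasis needs two characters on each side; otherwise use one. */
  int use = (opener_len >= 2 && closer_len >= 2) ? 2 : 1;
  demo_emph_match m = {use == 2, opener_len - use, closer_len - use};
  return m;
}
```

A remaining length of 0 corresponds to the "exhausted" case in steps 5 and 6, where the delimiter would be removed from the stack.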
+ +## Code Span Parsing: `handle_backticks()` + +```c +static cmark_node *handle_backticks(subject *subj, int options); +``` + +When a backtick is encountered: + +1. `take_while(subj, isbacktick)` consumes the opening backtick run and records its length. +2. `scan_to_closing_backticks()` searches forward for a matching backtick run of the same length. + +The scanning function uses the `subj->backticks[]` array to cache positions of backtick sequences. If `subj->scanned_for_backticks` is true and the cached position for the needed length is behind the current position, it immediately returns 0 (no match). + +If no closing backticks are found, the opening run is emitted as literal text. If found, the content between is extracted, normalized via `S_normalize_code()`: + +```c +static void S_normalize_code(cmark_strbuf *s) { + // 1. Convert \r\n and \r to spaces + // 2. Convert \n to spaces + // 3. If content begins and ends with a space and contains non-space chars, + // strip one leading and one trailing space +} +``` + +## Link Parsing + +When `]` is encountered after an opener on the bracket stack: + +### Inline Links: `[text](url "title")` + +The parser looks for `(` immediately after `]`, then: +1. Skips optional whitespace +2. Tries to parse a link destination (URL) +3. Skips optional whitespace +4. Optionally parses a link title (in single quotes, double quotes, or parentheses) +5. Expects `)` + +### Reference Links: `[text][ref]` or `[text][]` or `[text]` + +If the inline link syntax doesn't match, the parser tries: +1. `[text][ref]` — explicit reference +2. `[text][]` — collapsed reference (label = text) +3. `[text]` — shortcut reference (label = text) + +Reference lookup uses `cmark_reference_lookup()` against the parser's `refmap`. + +### URL Cleaning + +```c +unsigned char *cmark_clean_url(cmark_mem *mem, cmark_chunk *url); +``` + +Trims the URL, unescapes HTML entities, and handles angle-bracket-delimited URLs. 
+ +### Autolinks + +```c +static inline cmark_node *make_autolink(subject *subj, int start_column, + int end_column, cmark_chunk url, + int is_email); +``` + +Autolinks (`<https://example.com>` or `<user@example.com>`) are detected via the `scan_autolink_uri()` and `scan_autolink_email()` scanner functions. Email autolinks have `mailto:` prepended to the URL automatically: + +```c +static unsigned char *cmark_clean_autolink(cmark_mem *mem, cmark_chunk *url, + int is_email) { + cmark_strbuf buf = CMARK_BUF_INIT(mem); + cmark_chunk_trim(url); + if (is_email) + cmark_strbuf_puts(&buf, "mailto:"); + houdini_unescape_html_f(&buf, url->data, url->len); + return cmark_strbuf_detach(&buf); +} +``` + +## Smart Punctuation + +When `CMARK_OPT_SMART` is enabled, the inline parser transforms: + +```c +static const char *EMDASH = "\xE2\x80\x94"; // — +static const char *ENDASH = "\xE2\x80\x93"; // – +static const char *ELLIPSES = "\xE2\x80\xA6"; // … +static const char *LEFTDOUBLEQUOTE = "\xE2\x80\x9C"; // “ +static const char *RIGHTDOUBLEQUOTE = "\xE2\x80\x9D"; // ” +static const char *LEFTSINGLEQUOTE = "\xE2\x80\x98"; // ‘ +static const char *RIGHTSINGLEQUOTE = "\xE2\x80\x99"; // ’ +``` + +- `---` becomes em dash (—) +- `--` becomes en dash (–) +- `...` becomes ellipsis (…) +- `'` and `"` are converted to curly quotes using the delimiter stack (open/close logic) + +## Hard and Soft Line Breaks + +- **Hard line break**: Two or more spaces before a line ending, or a backslash before a line ending. Creates a `CMARK_NODE_LINEBREAK` node. +- **Soft line break**: A line ending not preceded by two or more spaces or a backslash. Creates a `CMARK_NODE_SOFTBREAK` node. + +## Special Character Dispatch + +```c +static bufsize_t subject_find_special_char(subject *subj, int options); +``` + +This function scans forward from `subj->pos` looking for the next special character that needs inline processing.
Special characters include: +- Line endings (`\r`, `\n`) +- Backtick (`` ` ``) +- Backslash (`\`) +- Ampersand (`&`) +- Less-than (`<`) +- Open bracket (`[`) +- Close bracket (`]`) +- Exclamation mark (`!`) +- Emphasis characters (`*`, `_`) + +Any text between special characters is collected as a `CMARK_NODE_TEXT` node. + +## Source Position Tracking + +```c +static void adjust_subj_node_newlines(subject *subj, cmark_node *node, + int matchlen, int extra, int options); +``` + +When `CMARK_OPT_SOURCEPOS` is enabled, this function adjusts source positions for multi-line inline constructs. It counts newlines in the just-matched span and updates: +- `subj->line` — incremented by the number of newlines +- `node->end_line` — adjusted for multi-line spans +- `node->end_column` — set to characters after the last newline +- `subj->column_offset` — adjusted for correct subsequent position calculations + +## Inline Node Factory Functions + +The inline parser uses efficient factory functions: + +```c +// Macros for simple nodes +#define make_linebreak(mem) make_simple(mem, CMARK_NODE_LINEBREAK) +#define make_softbreak(mem) make_simple(mem, CMARK_NODE_SOFTBREAK) +#define make_emph(mem) make_simple(mem, CMARK_NODE_EMPH) +#define make_strong(mem) make_simple(mem, CMARK_NODE_STRONG) +``` + +```c +// Fast child appending (bypasses S_can_contain validation) +static void append_child(cmark_node *node, cmark_node *child) { + cmark_node *old_last_child = node->last_child; + child->next = NULL; + child->prev = old_last_child; + child->parent = node; + node->last_child = child; + if (old_last_child) { + old_last_child->next = child; + } else { + node->first_child = child; + } +} +``` + +This `append_child()` is a simplified version of the public `cmark_node_append_child()`, skipping containership validation since the inline parser always produces valid structures. 
+ +## The Main Parse Loop + +```c +void cmark_parse_inlines(cmark_mem *mem, cmark_node *parent, + cmark_reference_map *refmap, int options); +``` + +This function initializes a `subject` from the parent node's `data` field, then repeatedly calls `parse_inline()` until the input is exhausted. Each call to `parse_inline()` finds the next special character, emits any preceding text as a `CMARK_NODE_TEXT`, and dispatches to the appropriate handler. + +After all characters are processed, the delimiter stack is processed to resolve any remaining emphasis, and then cleaned up. + +## Cross-References + +- [inlines.c](../../cmark/src/inlines.c) — Full implementation +- [inlines.h](../../cmark/src/inlines.h) — Internal API declarations +- [block-parsing.md](block-parsing.md) — Phase 1 that produces the input for inline parsing +- [reference-system.md](reference-system.md) — How link references are stored and looked up +- [scanner-system.md](scanner-system.md) — Scanner functions for HTML tags, autolinks, etc. +- [utf8-handling.md](utf8-handling.md) — Unicode character classification for flanking rules diff --git a/docs/handbook/cmark/iterator-system.md b/docs/handbook/cmark/iterator-system.md new file mode 100644 index 0000000000..3cdcfda66e --- /dev/null +++ b/docs/handbook/cmark/iterator-system.md @@ -0,0 +1,267 @@ +# cmark — Iterator System + +## Overview + +The iterator system (`iterator.c`, `iterator.h`) provides depth-first traversal of the AST using an event-based model. Each container node is visited twice: once on `CMARK_EVENT_ENTER` (before children) and once on `CMARK_EVENT_EXIT` (after children). Leaf nodes generate only a `CMARK_EVENT_ENTER` event, after which traversal moves straight on to the next sibling or to the parent's exit. + +All renderers (HTML, XML, LaTeX, man, CommonMark) use the iterator as their traversal mechanism.
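To make the enter/exit model concrete, here is a self-contained sketch of an event-based depth-first walk over a toy tree. It does not use cmark itself: `toy_node`, `toy_iter`, `toy_iter_next()`, `toy_trace()` and `toy_demo()` are hypothetical stand-ins, and leaf status is a plain flag rather than a node-type check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the iterator's event, node and state types. */
typedef enum { TOY_NONE, TOY_DONE, TOY_ENTER, TOY_EXIT } toy_event;

typedef struct toy_node {
  const char *name;
  bool leaf;
  struct toy_node *parent, *next, *first_child;
} toy_node;

typedef struct {
  toy_node *root, *cur;
  toy_event ev;
} toy_iter;

/* Advance to the next event, following the enter/exit model. */
static toy_event toy_iter_next(toy_iter *it) {
  toy_node *cur = it->cur;
  if (it->ev == TOY_DONE) return TOY_DONE;
  if (it->ev == TOY_NONE) {
    it->ev = TOY_ENTER; /* first call: enter the root */
  } else if (it->ev == TOY_ENTER && !cur->leaf) {
    if (cur->first_child) it->cur = cur->first_child; /* descend */
    else it->ev = TOY_EXIT; /* empty container: exit at once */
  } else if (cur == it->root) {
    it->ev = TOY_DONE; /* leaving the root ends the walk */
    it->cur = NULL;
  } else if (cur->next) {
    it->ev = TOY_ENTER; /* move to the next sibling */
    it->cur = cur->next;
  } else {
    it->ev = TOY_EXIT; /* no siblings left: exit the parent */
    it->cur = cur->parent;
  }
  return it->ev;
}

/* Write the event sequence as "+Name" (enter) and "-Name" (exit). */
static size_t toy_trace(toy_node *root, char *out, size_t cap) {
  assert(root != NULL && cap > 0);
  toy_iter it = {root, root, TOY_NONE};
  size_t used = 0;
  toy_event ev;
  out[0] = '\0';
  while ((ev = toy_iter_next(&it)) != TOY_DONE) {
    int n = snprintf(out + used, cap - used, "%s%c%s", used ? " " : "",
                     ev == TOY_ENTER ? '+' : '-', it.cur->name);
    if (n < 0 || (size_t)n >= cap - used) break; /* output buffer full */
    used += (size_t)n;
  }
  return used;
}

/* Build Document > Paragraph > [Text, Emph > Text] and trace it. */
static size_t toy_demo(char *out, size_t cap) {
  toy_node doc = {"Document", false, NULL, NULL, NULL};
  toy_node para = {"Paragraph", false, &doc, NULL, NULL};
  toy_node hello = {"Text", true, &para, NULL, NULL};
  toy_node emph = {"Emph", false, &para, NULL, NULL};
  toy_node world = {"Text", true, &emph, NULL, NULL};
  doc.first_child = &para;
  para.first_child = &hello;
  hello.next = &emph;
  emph.first_child = &world;
  return toy_trace(&doc, out, cap);
}
```

In this sketch the two `Text` leaves appear only with a `+` prefix, while `Document`, `Paragraph` and `Emph` appear once with `+` and once with `-`.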
+ +## Public API + +```c +cmark_iter *cmark_iter_new(cmark_node *root); +void cmark_iter_free(cmark_iter *iter); +cmark_event_type cmark_iter_next(cmark_iter *iter); +cmark_node *cmark_iter_get_node(cmark_iter *iter); +cmark_event_type cmark_iter_get_event_type(cmark_iter *iter); +cmark_node *cmark_iter_get_root(cmark_iter *iter); +void cmark_iter_reset(cmark_iter *iter, cmark_node *current, cmark_event_type event_type); +``` + +## Iterator State + +```c +struct cmark_iter { + cmark_mem *mem; + cmark_node *root; + cmark_node *cur; + cmark_event_type ev_type; +}; +``` + +The iterator stores: +- `root` — The subtree root (traversal boundary) +- `cur` — Current node +- `ev_type` — Current event (`CMARK_EVENT_ENTER`, `CMARK_EVENT_EXIT`, `CMARK_EVENT_DONE`, or `CMARK_EVENT_NONE`) + +## Event Types + +```c +typedef enum { + CMARK_EVENT_NONE, // Initial state + CMARK_EVENT_DONE, // Traversal complete (exited root) + CMARK_EVENT_ENTER, // Entering a node (pre-children) + CMARK_EVENT_EXIT, // Exiting a node (post-children) +} cmark_event_type; +``` + +## Leaf Node Detection + +```c +static const int S_leaf_mask = + (1 << CMARK_NODE_HTML_BLOCK) | (1 << CMARK_NODE_THEMATIC_BREAK) | + (1 << CMARK_NODE_CODE_BLOCK) | (1 << CMARK_NODE_TEXT) | + (1 << CMARK_NODE_SOFTBREAK) | (1 << CMARK_NODE_LINEBREAK) | + (1 << CMARK_NODE_CODE) | (1 << CMARK_NODE_HTML_INLINE); + +static bool S_is_leaf(cmark_node *node) { + return ((1 << node->type) & S_leaf_mask) != 0; +} +``` + +Leaf nodes are determined by a bitmask — not by checking whether `first_child` is NULL. This means an emphasis node with no children is still treated as a container (it receives separate enter and exit events). 
+ +The leaf node types are: +- **Block leaves**: `HTML_BLOCK`, `THEMATIC_BREAK`, `CODE_BLOCK` +- **Inline leaves**: `TEXT`, `SOFTBREAK`, `LINEBREAK`, `CODE`, `HTML_INLINE` + +## Traversal Algorithm + +`cmark_iter_next()` implements the state machine: + +```c +cmark_event_type cmark_iter_next(cmark_iter *iter) { + cmark_event_type ev_type = iter->ev_type; + cmark_node *cur = iter->cur; + + if (ev_type == CMARK_EVENT_DONE) { + return CMARK_EVENT_DONE; + } + + // For initial state, start with ENTER on root + if (ev_type == CMARK_EVENT_NONE) { + iter->ev_type = CMARK_EVENT_ENTER; + return iter->ev_type; + } + + if (ev_type == CMARK_EVENT_ENTER && !S_is_leaf(cur)) { + // Container node being entered — descend to first child if it exists + if (cur->first_child) { + iter->ev_type = CMARK_EVENT_ENTER; + iter->cur = cur->first_child; + } else { + // Empty container — immediately exit + iter->ev_type = CMARK_EVENT_EXIT; + } + } else if (cur == iter->root) { + // Exiting root (or leaf at root) — done + iter->ev_type = CMARK_EVENT_DONE; + iter->cur = NULL; + } else if (cur->next) { + // Move to next sibling + iter->ev_type = CMARK_EVENT_ENTER; + iter->cur = cur->next; + } else if (cur->parent) { + // No more siblings — exit parent + iter->ev_type = CMARK_EVENT_EXIT; + iter->cur = cur->parent; + } else { + // Orphan node — done + assert(false); + iter->ev_type = CMARK_EVENT_DONE; + iter->cur = NULL; + } + + return iter->ev_type; +} +``` + +### State Transition Summary + +| Current State | Condition | Next State | +|--------------|-----------|------------| +| `NONE` | (initial) | `ENTER(root)` | +| `ENTER(container)` | has children | `ENTER(first_child)` | +| `ENTER(container)` | no children | `EXIT(container)` | +| `ENTER(leaf)` or `EXIT(node)` | node == root | `DONE` | +| `ENTER(leaf)` or `EXIT(node)` | has next sibling | `ENTER(next)` | +| `ENTER(leaf)` or `EXIT(node)` | has parent | `EXIT(parent)` | +| `DONE` | (terminal) | `DONE` | + +### Traversal Order Example + +For a 
document with a paragraph containing "Hello *world*": + +``` +Document +└── Paragraph + ├── Text("Hello ") + ├── Emph + │ └── Text("world") + └── (end) +``` + +Event sequence: +1. `ENTER(Document)` +2. `ENTER(Paragraph)` +3. `ENTER(Text "Hello ")` — leaf, immediate transition +4. `ENTER(Emph)` +5. `ENTER(Text "world")` — leaf, immediate transition +6. `EXIT(Emph)` +7. `EXIT(Paragraph)` +8. `EXIT(Document)` +9. `DONE` + +## Iterator Reset + +```c +void cmark_iter_reset(cmark_iter *iter, cmark_node *current, + cmark_event_type event_type) { + iter->cur = current; + iter->ev_type = event_type; +} +``` + +Allows repositioning the iterator to any node and event type. This is used by renderers to skip subtrees — e.g., when the HTML renderer processes an image node, it may skip children after extracting alt text. + +## Text Node Consolidation + +```c +void cmark_consolidate_text_nodes(cmark_node *root) { + if (root == NULL) return; + cmark_iter *iter = cmark_iter_new(root); + cmark_strbuf buf = CMARK_BUF_INIT(iter->mem); + cmark_event_type ev_type; + cmark_node *cur, *tmp, *next; + + while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) { + cur = cmark_iter_get_node(iter); + if (ev_type == CMARK_EVENT_ENTER && cur->type == CMARK_NODE_TEXT && + cur->next && cur->next->type == CMARK_NODE_TEXT) { + // Merge consecutive text nodes + cmark_strbuf_clear(&buf); + cmark_strbuf_put(&buf, cur->data, cur->len); + tmp = cur->next; + while (tmp && tmp->type == CMARK_NODE_TEXT) { + cmark_iter_reset(iter, tmp, CMARK_EVENT_ENTER); + cmark_strbuf_put(&buf, tmp->data, tmp->len); + cur->end_column = tmp->end_column; + next = tmp->next; + cmark_node_free(tmp); + tmp = next; + } + // Replace cur's data with merged content + cmark_chunk_free(iter->mem, &cur->as.literal); + cmark_strbuf_trim(&buf); + // ... set cur->data and cur->len + } + } + cmark_strbuf_free(&buf); + cmark_iter_free(iter); +} +``` + +This function merges adjacent text nodes into a single text node. 
Adjacent text nodes can arise from inline parsing (e.g., when backslash escapes split text). The function: + +1. Finds consecutive text node runs +2. Concatenates their content into a buffer +3. Updates the first node's content and end position +4. Frees the subsequent nodes +5. Uses `cmark_iter_reset()` to skip freed nodes + +## Usage Patterns + +### Standard Rendering Loop + +```c +cmark_iter *iter = cmark_iter_new(root); +while ((ev_type = cmark_iter_next(iter)) != CMARK_EVENT_DONE) { + cur = cmark_iter_get_node(iter); + S_render_node(cur, ev_type, &state, options); +} +cmark_iter_free(iter); +``` + +### Skipping Children + +To skip rendering of a node's children (e.g., for image alt text in HTML): +```c +if (ev_type == CMARK_EVENT_ENTER) { + cmark_iter_reset(iter, node, CMARK_EVENT_EXIT); +} +``` + +This jumps directly to the exit event, bypassing all children. + +### Safe Node Removal + +The iterator handles node removal between calls. Since `cmark_iter_next()` always follows `next` and `parent` pointers from the current position, removing the current node is safe as long as: +1. The node's `next` and `parent` pointers remain valid +2. The iterator is reset to skip the removed node's children + +## Thread Safety + +Iterators are NOT thread-safe: a single `cmark_iter` must not be shared between threads without external synchronization, because every call to `cmark_iter_next()` mutates its state. The iterator itself only reads the AST, so separate iterators (one per thread) can walk the same AST concurrently as long as nothing modifies the tree while they run.
+ +## Memory + +The iterator allocates a `cmark_iter` struct using the root node's memory allocator: +```c +cmark_iter *cmark_iter_new(cmark_node *root) { + cmark_mem *mem = root->mem; + cmark_iter *iter = (cmark_iter *)mem->calloc(1, sizeof(cmark_iter)); + iter->mem = mem; + iter->root = root; + iter->cur = root; + iter->ev_type = CMARK_EVENT_NONE; + return iter; +} +``` + +## Cross-References + +- [iterator.c](../../cmark/src/iterator.c) — Iterator implementation +- [iterator.h](../../cmark/src/iterator.h) — Iterator struct definition +- [ast-node-system.md](ast-node-system.md) — The nodes being traversed +- [html-renderer.md](html-renderer.md) — Example of iterator-driven rendering +- [render-framework.md](render-framework.md) — Framework that wraps iterator use diff --git a/docs/handbook/cmark/latex-renderer.md b/docs/handbook/cmark/latex-renderer.md new file mode 100644 index 0000000000..d7a492d580 --- /dev/null +++ b/docs/handbook/cmark/latex-renderer.md @@ -0,0 +1,320 @@ +# cmark — LaTeX Renderer + +## Overview + +The LaTeX renderer (`latex.c`) converts a `cmark_node` AST into LaTeX source, suitable for compilation with `pdflatex`, `xelatex`, or `lualatex`. It uses the generic render framework from `render.c`, operating through a per-character output callback (`outc`) and a per-node render callback (`S_render_node`). + +## Entry Point + +```c +char *cmark_render_latex(cmark_node *root, int options, int width); +``` + +- `root` — AST root node +- `options` — Option flags (`CMARK_OPT_SOURCEPOS`, `CMARK_OPT_HARDBREAKS`, `CMARK_OPT_NOBREAKS`, `CMARK_OPT_UNSAFE`) +- `width` — Target line width for hard-wrapping; 0 disables wrapping + +## Character Escaping (`outc`) + +The `outc` function handles per-character output decisions. 
It is the most complex part of the LaTeX renderer, with different behavior for three escaping modes: + +```c +static void outc(cmark_renderer *renderer, cmark_escaping escape, + int32_t c, unsigned char nextc); +``` + +### LITERAL Mode +Pass-through: all characters are output unchanged. + +### NORMAL Mode +Extensive special-character handling: + +| Character | LaTeX Output | Purpose | +|-----------|-------------|---------| +| `$` | `\$` | Math mode delimiter | +| `%` | `\%` | Comment character | +| `&` | `\&` | Table column separator | +| `_` | `\_` | Subscript operator | +| `#` | `\#` | Parameter reference | +| `^` | `\^{}` | Superscript operator | +| `{` | `\{` | Group open | +| `}` | `\}` | Group close | +| `~` | `\textasciitilde{}` | Non-breaking space | +| `[` | `{[}` | Optional argument bracket | +| `]` | `{]}` | Optional argument bracket | +| `\` | `\textbackslash{}` | Escape character | +| `\|` | `\textbar{}` | Pipe | +| `'` | `\textquotesingle{}` | Straight single quote | +| `"` | `\textquotedbl{}` | Straight double quote | +| `` ` `` | `\textasciigrave{}` | Backtick | +| U+00A0 (NBSP) | `~` | LaTeX non-breaking space | +| U+2014 (—) | `---` | Em dash | +| U+2013 (–) | `--` | En dash | +| U+2018 (‘) | `` ` `` | Left single quote | +| U+2019 (’) | `'` | Right single quote | +| U+201C (“) | ` `` ` | Left double quote | +| U+201D (”) | `''` | Right double quote | + +### URL Mode +Only these characters are escaped: +- `$` → `\$` +- `%` → `\%` +- `&` → `\&` +- `_` → `\_` +- `#` → `\#` +- `{` → `\{` +- `}` → `\}` + +All other characters pass through unchanged. + +## Link Type Classification + +The renderer classifies links into five categories: + +```c +typedef enum { + NO_LINK, + URL_AUTOLINK, + EMAIL_AUTOLINK, + NORMAL_LINK, + INTERNAL_LINK, +} link_type; +``` + +### `get_link_type()` + +```c +static link_type get_link_type(cmark_node *node) { + // 1. "mailto:" links where text matches url + // 2.
"http[s]:" links where text matches url (with or without protocol) + // 3. Links starting with '#' → INTERNAL_LINK + // 4. Everything else → NORMAL_LINK +} +``` + +Detection logic: +1. **URL_AUTOLINK**: The `url` starts with `http://` or `https://`, the link has exactly one text child, and that child's content matches the URL (or matches the URL minus the protocol prefix). +2. **EMAIL_AUTOLINK**: The `url` starts with `mailto:`, the link has exactly one text child, and that child's content matches the URL after `mailto:`. +3. **INTERNAL_LINK**: The `url` starts with `#`. +4. **NORMAL_LINK**: Everything else. + +## Enumeration Level + +For nested ordered lists, the renderer selects the appropriate LaTeX counter style: + +```c +static int S_get_enumlevel(cmark_node *node) { + int enumlevel = 0; + cmark_node *tmp = node; + while (tmp) { + if (tmp->type == CMARK_NODE_LIST && + cmark_node_get_list_type(tmp) == CMARK_ORDERED_LIST) { + enumlevel++; + } + tmp = tmp->parent; + } + return enumlevel; +} +``` + +This walks up the tree, counting ordered list ancestors. LaTeX ordered lists cycle through: `enumi` (arabic), `enumii` (alpha), `enumiii` (roman), `enumiv` (Alpha). + +## Node Rendering (`S_render_node`) + +### Block Nodes + +#### Document +No output. + +#### Block Quote +``` +ENTER: \begin{quote}\n +EXIT: \end{quote}\n +``` + +#### List +``` +ENTER (bullet): \begin{itemize}\n +ENTER (ordered): \begin{enumerate}\n + \def\labelenumI{COUNTER}\n (if start != 1) + \setcounter{enumI}{START-1}\n +EXIT: \end{itemize}\n or \end{enumerate}\n +``` + +The counter is formatted based on enumeration level: +- Level 1: `\arabic{enumi}.` +- Level 2: `\alph{enumii}.` (surrounded by `(`) +- Level 3: `\roman{enumiii}.` +- Level 4: `\Alph{enumiv}.` + +Period delimiters use `.`, parenthesis delimiters use `)`. 
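As a rough illustration of the counter formatting listed above, the sketch below builds the label text for a given enumeration level and delimiter. `toy_counter_label()` is a hypothetical helper, not a function from `latex.c`; it covers only levels 1 to 4 and ignores the parenthesis wrapping mentioned for level 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Format the counter part of a list label, e.g. "\arabic{enumi}.".
 * enumlevel is the number of ordered-list ancestors (1..4 here);
 * paren_delim selects ')' instead of '.'. Returns snprintf's result. */
static int toy_counter_label(int enumlevel, bool paren_delim, char *out,
                             size_t cap) {
  static const char *const styles[] = {"arabic", "alph", "roman", "Alph"};
  static const char *const counters[] = {"enumi", "enumii", "enumiii",
                                         "enumiv"};
  assert(enumlevel >= 1 && enumlevel <= 4);
  return snprintf(out, cap, "\\%s{%s}%c", styles[enumlevel - 1],
                  counters[enumlevel - 1], paren_delim ? ')' : '.');
}
```

A table lookup keeps the level-to-style mapping in one place, which makes it easy to compare against the list of levels above.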
+ +#### Item +``` +ENTER: \item{} (empty braces prevent ligatures with following content) +EXIT: \n +``` + +#### Heading +``` +ENTER: \section{ or \subsection{ or \subsubsection{ or \paragraph{ or \subparagraph{ +EXIT: }\n +``` + +Mapping: level 1 → `\section`, level 2 → `\subsection`, level 3 → `\subsubsection`, level 4 → `\paragraph`, level 5 → `\subparagraph`. + +#### Code Block +```latex +\begin{verbatim} +LITERAL CONTENT +\end{verbatim} +``` + +The content is output in `LITERAL` escape mode (no character escaping). Info strings are ignored. + +#### HTML Block +``` +ENTER: % raw HTML omitted\n (as a LaTeX comment) +``` + +Raw HTML is always omitted in LaTeX output, regardless of `CMARK_OPT_UNSAFE`. + +#### Thematic Break +``` +\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}\n +``` + +#### Paragraph +Same tight-list check as the HTML renderer: +```c +parent = cmark_node_parent(node); +grandparent = cmark_node_parent(parent); +tight = (grandparent && grandparent->type == CMARK_NODE_LIST) ? + grandparent->as.list.tight : false; +``` +- Normal: newline before and after +- Tight: no leading/trailing blank lines + +### Inline Nodes + +#### Text +Output with NORMAL escaping. + +#### Soft Break +Depends on options: +- `CMARK_OPT_HARDBREAKS`: `\\\\\n` +- `CMARK_OPT_NOBREAKS`: space +- Default: newline + +#### Line Break +``` +\\\\\n +``` + +#### Code (inline) +``` +\texttt{ESCAPED CONTENT} +``` + +Special handling: Code content is output character-by-character with inline-code escaping. Special characters (`\`, `{`, `}`, `$`, `%`, `&`, `_`, `#`, `^`, `~`) are escaped. 
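The escaping of these ten characters can be sketched as a small lookup. This is an assumption-laden illustration: it supposes that inline code uses the same replacements as the NORMAL-mode table earlier in this document, it handles ASCII only, and `toy_latex_replacement()` and `toy_latex_escape()` are hypothetical names rather than functions from `latex.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the LaTeX replacement for c, or NULL if c passes through.
 * Replacements are taken from the NORMAL-mode table (assumed to apply). */
static const char *toy_latex_replacement(char c) {
  switch (c) {
  case '$': return "\\$";
  case '%': return "\\%";
  case '&': return "\\&";
  case '_': return "\\_";
  case '#': return "\\#";
  case '{': return "\\{";
  case '}': return "\\}";
  case '^': return "\\^{}";
  case '~': return "\\textasciitilde{}";
  case '\\': return "\\textbackslash{}";
  default: return NULL;
  }
}

/* Escape src into out (capacity cap, always NUL-terminated).
 * Stops early if the next piece would not fit. Returns bytes written. */
static size_t toy_latex_escape(const char *src, char *out, size_t cap) {
  size_t used = 0;
  assert(cap > 0);
  for (; *src; src++) {
    const char *rep = toy_latex_replacement(*src);
    size_t n = rep ? strlen(rep) : 1;
    if (used + n >= cap) break; /* keep room for the terminator */
    if (rep) memcpy(out + used, rep, n);
    else out[used] = *src;
    used += n;
  }
  out[used] = '\0';
  return used;
}
```

Wrapping the result in `\texttt{` and `}` would then give the inline-code form shown above.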
+ +#### Emphasis +``` +ENTER: \emph{ +EXIT: } +``` + +#### Strong +``` +ENTER: \textbf{ +EXIT: } +``` + +#### Link +Rendering depends on link type: + +**NORMAL_LINK:** +``` +ENTER: \href{URL}{ +EXIT: } +``` + +**URL_AUTOLINK:** +``` +ENTER: \url{URL} +(children are skipped — no EXIT rendering needed) +``` + +**EMAIL_AUTOLINK:** +``` +ENTER: \href{URL}{\nolinkurl{ +EXIT: }} +``` + +**INTERNAL_LINK:** +``` +ENTER: (nothing — rendered as plain text) +EXIT: (~\ref{LABEL}) +``` + +Where `LABEL` is the URL with the leading `#` stripped. + +**NO_LINK:** +No output. + +#### Image +``` +ENTER: \protect\includegraphics{URL} +``` + +Image children (alt text) are skipped. If `CMARK_OPT_UNSAFE` is not set and the URL matches `_scan_dangerous_url()`, the URL is omitted. + +#### HTML Inline +``` +% raw HTML omitted +``` + +Always omitted, regardless of `CMARK_OPT_UNSAFE`. + +## Source Position Comments + +When `CMARK_OPT_SOURCEPOS` is set, the renderer adds LaTeX comments before block elements: + +```c +snprintf(buffer, BUFFER_SIZE, "%% %d:%d-%d:%d\n", + cmark_node_get_start_line(node), cmark_node_get_start_column(node), + cmark_node_get_end_line(node), cmark_node_get_end_column(node)); +``` + +## Example Output + +Markdown input: +```markdown +# Hello World + +A paragraph with *emphasis* and **bold**. + +- Item 1 +- Item 2 +``` + +LaTeX output: +```latex +\section{Hello World} + +A paragraph with \emph{emphasis} and \textbf{bold}. 
+ +\begin{itemize} +\item{}Item 1 + +\item{}Item 2 + +\end{itemize} +``` + +## Cross-References + +- [latex.c](../../cmark/src/latex.c) — Full implementation +- [render-framework.md](render-framework.md) — Generic render framework (`cmark_render()`, `cmark_renderer`) +- [public-api.md](public-api.md) — `cmark_render_latex()` API docs +- [html-renderer.md](html-renderer.md) — Contrast with direct buffer renderer diff --git a/docs/handbook/cmark/man-renderer.md b/docs/handbook/cmark/man-renderer.md new file mode 100644 index 0000000000..cae1c6dbf3 --- /dev/null +++ b/docs/handbook/cmark/man-renderer.md @@ -0,0 +1,272 @@ +# cmark — Man Page Renderer + +## Overview + +The man page renderer (`man.c`) converts a `cmark_node` AST into roff/troff format suitable for the Unix `man` page system. It uses the generic render framework from `render.c`. + +## Entry Point + +```c +char *cmark_render_man(cmark_node *root, int options, int width); +``` + +- `root` — AST root node +- `options` — Option flags (`CMARK_OPT_HARDBREAKS`, `CMARK_OPT_NOBREAKS`, `CMARK_OPT_SOURCEPOS`, `CMARK_OPT_UNSAFE`) +- `width` — Target line width for wrapping; 0 disables wrapping + +## Character Escaping (`S_outc`) + +The man page escaping is simpler than LaTeX. The `S_outc` function handles: + +```c +static void S_outc(cmark_renderer *renderer, cmark_escaping escape, + int32_t c, unsigned char nextc) { + if (escape == LITERAL) { + cmark_render_code_point(renderer, c); + return; + } + switch (c) { + case 46: // '.' 
— if at line start + cmark_render_ascii(renderer, "\\&."); + break; + case 39: // '\'' — if at line start + cmark_render_ascii(renderer, "\\&'"); + break; + case 45: // '-' + cmark_render_ascii(renderer, "\\-"); + break; + case 92: // '\\' + cmark_render_ascii(renderer, "\\e"); + break; + case 8216: // ‘ (left single quote) + cmark_render_ascii(renderer, "\\[oq]"); + break; + case 8217: // ’ (right single quote) + cmark_render_ascii(renderer, "\\[cq]"); + break; + case 8220: // “ (left double quote) + cmark_render_ascii(renderer, "\\[lq]"); + break; + case 8221: // ” (right double quote) + cmark_render_ascii(renderer, "\\[rq]"); + break; + case 8212: // — (em dash) + cmark_render_ascii(renderer, "\\[em]"); + break; + case 8211: // – (en dash) + cmark_render_ascii(renderer, "\\[en]"); + break; + default: + cmark_render_code_point(renderer, c); + } +} +``` + +### Line-Start Protection + +The `.` and `'` characters are only escaped when they appear at the beginning of a line, since roff interprets them as macro/command prefixes. The check: + +```c +case 46: +case 39: + if (renderer->begin_line) { + cmark_render_ascii(renderer, "\\&."); // or "\\&'" + } else { + cmark_render_code_point(renderer, c); // mid-line: output unchanged + } + break; +``` + +The `\&` escape (written `\\&` in the C string literal) is a zero-width, non-printing character that prevents roff from treating the following character as a command prefix. + +## Block Number Tracking + +The renderer tracks nesting with a `block_number` variable so that it can generate matching `.RS`/`.RE` (indent start/end) pairs. + +This variable is incremented when entering list items and block quotes, and decremented on exit. It controls the indentation level of nested content. + +## Node Rendering (`S_render_node`) + +### Block Nodes + +#### Document +No output on enter or exit. + +#### Block Quote +``` +ENTER: .RS\n +EXIT: .RE\n +``` + +`.RS` pushes relative indentation, `.RE` pops it. 
+ +#### List +On exit, adds a blank output line (`cr()`) to separate from following content. + +#### Item +``` +ENTER (bullet): .IP \(bu 2\n +ENTER (ordered): .IP "N."
4\n (where N = list start + sibling count) +EXIT: (cr if not last item) +``` + +The ordered item number is calculated by counting previous siblings: +```c +int list_number = cmark_node_get_list_start(node->parent); +tmp = node; +while (tmp->prev) { + tmp = tmp->prev; + list_number++; +} +``` + +`.IP` sets an indented paragraph with a tag (bullet or number) and indentation width. + +#### Heading +``` +ENTER (level 1): .SH\n (section heading) +ENTER (level 2): .SS\n (subsection heading) +ENTER (other): .PP\n\fB (paragraph, start bold) +EXIT (other): \fR\n (end bold) +``` + +Level 1 and 2 headings use dedicated roff macros. Level 3+ are rendered as bold paragraphs. + +#### Code Block +``` +.IP\n.nf\n\\f[C]\n +LITERAL CONTENT +\\f[]\n.fi\n +``` + +- `.nf` — no-fill (preformatted) +- `\\f[C]` — switch to constant-width font +- `\\f[]` — restore previous font +- `.fi` — return to fill mode + +#### HTML Block +``` +(nothing) +``` +Raw HTML blocks are silently omitted in man output. + +#### Thematic Break +There is no native roff thematic break. The renderer outputs nothing for this node type. + +#### Paragraph +Same tight-list check as other renderers: +```c +tight = (grandparent && grandparent->type == CMARK_NODE_LIST) ? + grandparent->as.list.tight : false; +``` +- Normal: `.PP\n` before content +- Tight: no `.PP` prefix + +### Inline Nodes + +#### Text +Output with NORMAL escaping. + +#### Soft Break +Depends on options: +- `CMARK_OPT_HARDBREAKS`: `.PD 0\n.P\n.PD\n` +- `CMARK_OPT_NOBREAKS`: space +- Default: newline + +The hardbreak sequence `.PD 0\n.P\n.PD\n` is a man page idiom that: +1. Sets paragraph distance to 0 (`.PD 0`) +2. Starts a new paragraph (`.P`) +3. Restores default paragraph distance (`.PD`) + +#### Line Break +Same as hardbreak: +``` +.PD 0\n.P\n.PD\n +``` + +#### Code (inline) +``` +\f[C]ESCAPED CONTENT\f[] +``` + +Font switch to `C` (constant-width), then restore. 
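To show what "ESCAPED CONTENT" means for plain ASCII text, here is a minimal sketch of the roff escaping rules described earlier in this document. `toy_man_escape()` is a hypothetical helper, not the renderer's `S_outc()`: it works on bytes rather than code points, omits the Unicode quote and dash cases, and takes the line-start state as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Escape ASCII text for roff output. begin_line says whether the first
 * character of src lands at the start of an output line. The result is
 * always NUL-terminated; returns the number of bytes written. */
static size_t toy_man_escape(const char *src, bool begin_line, char *out,
                             size_t cap) {
  size_t used = 0;
  assert(cap > 0);
  for (; *src; src++) {
    char one[2] = {*src, '\0'};
    const char *rep = one; /* default: copy the character unchanged */
    if (*src == '-') rep = "\\-";
    else if (*src == '\\') rep = "\\e";
    else if (*src == '.' && begin_line) rep = "\\&.";
    else if (*src == '\'' && begin_line) rep = "\\&'";
    size_t n = strlen(rep);
    if (used + n >= cap) break; /* keep room for the terminator */
    memcpy(out + used, rep, n);
    used += n;
    begin_line = (*src == '\n'); /* a newline starts a fresh line */
  }
  out[used] = '\0';
  return used;
}
```

Placing the escaped text between `\f[C]` and `\f[]` gives the inline-code form shown above.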
+ +#### Emphasis +``` +ENTER: \f[I] (italic font) +EXIT: \f[] (restore font) +``` + +#### Strong +``` +ENTER: \f[B] (bold font) +EXIT: \f[] (restore font) +``` + +#### Link +Links render their text content normally. On exit: +``` +(ESCAPED_URL) +``` + +If the link URL is the same as the text content (autolink), the URL suffix is suppressed. + +#### Image +``` +ENTER: [IMAGE: +EXIT: ] +``` + +Images have no roff equivalent, so they're rendered as bracketed alt text. + +#### HTML Inline +Silently omitted. + +## Source Position + +When `CMARK_OPT_SOURCEPOS` is set, man output includes roff comments: +``` +.\" sourcepos: LINE:COL-LINE:COL +``` + +(The `.\"` prefix is the roff comment syntax.) + +## Example Output + +Markdown input: +```markdown +# My Tool + +A description with *emphasis*. + +## Options + +- `--flag` — Does something +- `--other` — Does another thing +``` + +Man output: +```roff +.SH +My Tool +.PP +A description with \f[I]emphasis\f[]. +.SS +Options +.IP \(bu 2 +\f[C]\-\-flag\f[] \[em] Does something +.IP \(bu 2 +\f[C]\-\-other\f[] \[em] Does another thing +``` + +## Limitations + +1. **No heading levels > 2**: Levels 3+ are rendered as bold paragraphs, losing semantic heading structure. +2. **No images**: Only alt text is shown in brackets. +3. **No raw HTML**: Silently dropped. +4. **No thematic breaks**: No visual separator is output. +5. **No tables**: Not part of core CommonMark, but if extensions add them, the man renderer has no support. 
+ +## Cross-References + +- [man.c](../../cmark/src/man.c) — Full implementation +- [render-framework.md](render-framework.md) — Generic render framework +- [public-api.md](public-api.md) — `cmark_render_man()` API docs +- [latex-renderer.md](latex-renderer.md) — Another framework-based renderer diff --git a/docs/handbook/cmark/memory-management.md b/docs/handbook/cmark/memory-management.md new file mode 100644 index 0000000000..dbc0046cb9 --- /dev/null +++ b/docs/handbook/cmark/memory-management.md @@ -0,0 +1,351 @@ +# cmark — Memory Management + +## Overview + +cmark's memory management is built around three concepts: +1. **Pluggable allocator** (`cmark_mem`) — a function-pointer table for calloc/realloc/free +2. **Owning buffer** (`cmark_strbuf`) — a growable byte buffer that owns its memory +3. **Non-owning slice** (`cmark_chunk`) — a view into either a `cmark_strbuf` or external memory + +## Pluggable Allocator + +### `cmark_mem` Structure + +```c +typedef struct cmark_mem { + void *(*calloc)(size_t, size_t); + void *(*realloc)(void *, size_t); + void (*free)(void *); +} cmark_mem; +``` + +All allocation throughout cmark respects this interface. Every node, buffer, parser, and iterator receives a `cmark_mem *` and uses it for all allocations. + +### Default Allocator + +```c +static void *xcalloc(size_t nmemb, size_t size) { + void *ptr = calloc(nmemb, size); + if (!ptr) abort(); + return ptr; +} + +static void *xrealloc(void *ptr, size_t size) { + void *new_ptr = realloc(ptr, size); + if (!new_ptr) abort(); + return new_ptr; +} + +cmark_mem DEFAULT_MEM_ALLOCATOR = {xcalloc, xrealloc, free}; +``` + +The default allocator wraps standard `calloc`/`realloc`/`free`, adding `abort()` on allocation failure. This means cmark never returns NULL from allocations — it terminates on out-of-memory. 
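A custom allocator only has to fill the same three slots. The sketch below is a hypothetical counting allocator that tracks live blocks while keeping the abort-on-failure behavior; `toy_mem` mirrors the shape of `cmark_mem` so that the example compiles without cmark's headers, and every `toy_` name is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Same three-slot shape as the cmark_mem structure shown above;
 * redeclared here only so the sketch is self-contained. */
typedef struct toy_mem {
  void *(*calloc)(size_t, size_t);
  void *(*realloc)(void *, size_t);
  void (*free)(void *);
} toy_mem;

/* Number of blocks handed out and not yet freed. */
static size_t toy_live_blocks = 0;

static void *toy_counting_calloc(size_t nmemb, size_t size) {
  void *ptr = calloc(nmemb, size);
  if (!ptr) abort(); /* same policy as the default allocator */
  toy_live_blocks++;
  return ptr;
}

static void *toy_counting_realloc(void *ptr, size_t size) {
  assert(size > 0); /* this sketch does not model realloc(p, 0) */
  void *new_ptr = realloc(ptr, size);
  if (!new_ptr) abort();
  if (!ptr) toy_live_blocks++; /* realloc(NULL, n) creates a new block */
  return new_ptr;
}

static void toy_counting_free(void *ptr) {
  if (ptr) toy_live_blocks--;
  free(ptr);
}

static toy_mem TOY_COUNTING_ALLOCATOR = {toy_counting_calloc,
                                         toy_counting_realloc,
                                         toy_counting_free};
```

If `toy_live_blocks` is zero once every parser, node and buffer has been released, nothing allocated through the table has leaked.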
+ +### Getting the Default Allocator + +```c +cmark_mem *cmark_get_default_mem_allocator(void) { + return &DEFAULT_MEM_ALLOCATOR; +} +``` + +### Custom Allocator Usage + +Users can provide custom allocators (arena allocators, debug allocators, etc.) via: + +```c +cmark_parser *cmark_parser_new_with_mem(int options, cmark_mem *mem); +cmark_node *cmark_node_new_with_mem(cmark_node_type type, cmark_mem *mem); +``` + +The allocator propagates: nodes created by the parser inherit the parser's allocator. Iterators use the root node's allocator. + +## Growable Buffer (`cmark_strbuf`) + +### Structure + +```c +struct cmark_strbuf { + cmark_mem *mem; + unsigned char *ptr; + bufsize_t asize; // allocated size + bufsize_t size; // used size (excluding NUL terminator) +}; +``` + +### Initialization + +```c +#define CMARK_BUF_INIT(mem) { mem, cmark_strbuf__initbuf, 0, 0 } +``` + +`cmark_strbuf__initbuf` is a static empty buffer that avoids allocating for empty strings: +```c +unsigned char cmark_strbuf__initbuf[1] = {0}; +``` + +This means: uninitialized/empty buffers point to a shared static empty string rather than NULL. This eliminates NULL checks throughout the code. + +### Growth Strategy + +```c +void cmark_strbuf_grow(cmark_strbuf *buf, bufsize_t target_size) { + // Minimum allocation of 8 bytes + bufsize_t new_size = 8; + // Double until >= target (or use 2x current if growing existing) + if (buf->asize) { + new_size = buf->asize; + } + while (new_size < target_size) { + new_size *= 2; + } + // Allocate + if (buf->ptr == cmark_strbuf__initbuf) { + buf->ptr = (unsigned char *)buf->mem->calloc(new_size, 1); + } else { + buf->ptr = (unsigned char *)buf->mem->realloc(buf->ptr, new_size); + } + buf->asize = new_size; +} +``` + +The growth strategy doubles the capacity each time, ensuring amortized O(1) appends. Minimum capacity is 8 bytes. + +When the buffer transitions from the shared static init to a real allocation, `calloc` is used (zero-initialized). 
Subsequent growths use `realloc`. + +### Key Operations + +```c +// Appending +void cmark_strbuf_put(cmark_strbuf *buf, const unsigned char *data, bufsize_t len); +void cmark_strbuf_puts(cmark_strbuf *buf, const char *string); +void cmark_strbuf_putc(cmark_strbuf *buf, int c); + +// Printf-style +void cmark_strbuf_printf(cmark_strbuf *buf, const char *fmt, ...); +void cmark_strbuf_vprintf(cmark_strbuf *buf, const char *fmt, va_list ap); + +// Manipulation +void cmark_strbuf_clear(cmark_strbuf *buf); // Reset size to 0, keep allocation +void cmark_strbuf_set(cmark_strbuf *buf, const unsigned char *data, bufsize_t len); +void cmark_strbuf_sets(cmark_strbuf *buf, const char *string); +void cmark_strbuf_copy_cstr(char *data, bufsize_t datasize, const cmark_strbuf *buf); +void cmark_strbuf_swap(cmark_strbuf *a, cmark_strbuf *b); + +// Whitespace +void cmark_strbuf_trim(cmark_strbuf *buf); // Trim leading and trailing whitespace +void cmark_strbuf_normalize_whitespace(cmark_strbuf *buf); // Collapse runs to single space +void cmark_strbuf_unescape(cmark_strbuf *buf); // Process backslash escapes + +// Lifecycle +unsigned char *cmark_strbuf_detach(cmark_strbuf *buf); // Return ptr, reset buf to init +void cmark_strbuf_free(cmark_strbuf *buf); // Free memory, reset to init +``` + +### `cmark_strbuf_detach()` + +```c +unsigned char *cmark_strbuf_detach(cmark_strbuf *buf) { + unsigned char *data = buf->ptr; + if (buf->asize == 0) { + // Never allocated — return a new empty string + data = (unsigned char *)buf->mem->calloc(1, 1); + } + // Reset buffer to initial state + buf->ptr = cmark_strbuf__initbuf; + buf->asize = 0; + buf->size = 0; + return data; +} +``` + +`cmark_strbuf_detach()` transfers ownership of the buffer's memory to the caller and resets the buffer to the empty init state. The caller must later release the returned pointer through the same allocator's `free` function, which is plain `free()` only when the default allocator is in use.
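The shared static buffer, the doubling growth and the detach step can be tried out together in a small stand-alone model. The sketch below is not cmark's implementation: `toy_buf` and its functions are hypothetical, they call the C library allocator directly instead of going through a `cmark_mem` table, and sizes are `size_t` rather than `bufsize_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Shared empty buffer: freshly initialized buffers point here, never NULL. */
static unsigned char toy_initbuf[1] = {0};

typedef struct {
  unsigned char *ptr;
  size_t asize; /* allocated bytes; 0 while pointing at toy_initbuf */
  size_t size;  /* used bytes, excluding the NUL terminator */
} toy_buf;

#define TOY_BUF_INIT {toy_initbuf, 0, 0}

/* Make sure at least target bytes are allocated, doubling from 8. */
static void toy_buf_grow(toy_buf *buf, size_t target) {
  size_t new_size = buf->asize ? buf->asize : 8;
  unsigned char *p;
  if (target <= buf->asize) return;
  while (new_size < target) new_size *= 2;
  if (buf->ptr == toy_initbuf)
    p = (unsigned char *)calloc(new_size, 1); /* first real allocation */
  else
    p = (unsigned char *)realloc(buf->ptr, new_size);
  if (!p) abort();
  buf->ptr = p;
  buf->asize = new_size;
}

/* Append len bytes and keep the content NUL-terminated. */
static void toy_buf_put(toy_buf *buf, const char *data, size_t len) {
  assert(data != NULL || len == 0);
  toy_buf_grow(buf, buf->size + len + 1); /* +1 for the terminator */
  memcpy(buf->ptr + buf->size, data, len);
  buf->size += len;
  buf->ptr[buf->size] = '\0';
}

/* Hand the memory to the caller and reset buf to the shared empty state. */
static unsigned char *toy_buf_detach(toy_buf *buf) {
  unsigned char *data = buf->ptr;
  if (buf->asize == 0) {
    data = (unsigned char *)calloc(1, 1); /* never allocated: new "" */
    if (!data) abort();
  }
  buf->ptr = toy_initbuf;
  buf->asize = 0;
  buf->size = 0;
  return data;
}
```

In this model, appending 5 bytes to a fresh buffer allocates 8 bytes, and growing the content to 11 bytes plus terminator doubles the allocation to 16. The detached pointer is released with `free()` here because the sketch allocates with the C library directly.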
+ +### Whitespace Normalization + +```c +void cmark_strbuf_normalize_whitespace(cmark_strbuf *s) { + bool last_char_was_space = false; + bufsize_t r, w; + for (r = 0, w = 0; r < s->size; r++) { + if (cmark_isspace(s->ptr[r])) { + if (!last_char_was_space) { + s->ptr[w++] = ' '; + last_char_was_space = true; + } + } else { + s->ptr[w++] = s->ptr[r]; + last_char_was_space = false; + } + } + cmark_strbuf_truncate(s, w); +} +``` + +Collapses consecutive whitespace into a single space. Uses an in-place read/write cursor technique. + +### Backslash Unescape + +```c +void cmark_strbuf_unescape(cmark_strbuf *buf) { + bufsize_t r, w; + for (r = 0, w = 0; r < buf->size; r++) { + if (buf->ptr[r] == '\\' && cmark_ispunct(buf->ptr[r + 1])) + r++; + buf->ptr[w++] = buf->ptr[r]; + } + cmark_strbuf_truncate(buf, w); +} +``` + +Removes backslash escapes before punctuation characters, in-place. + +## Non-Owning Slice (`cmark_chunk`) + +### Structure + +```c +typedef struct { + const unsigned char *data; + bufsize_t len; + bufsize_t alloc; // 0 if non-owning, > 0 if owning +} cmark_chunk; +``` + +A `cmark_chunk` is either: +- **Non-owning** (`alloc == 0`): Points into someone else's memory (e.g., the parser's input buffer) +- **Owning** (`alloc > 0`): Owns its `data` pointer and must free it + +### Key Operations + +```c +// Constructors: buf_detach yields an owning chunk; literal and dup yield non-owning views +static CMARK_INLINE cmark_chunk cmark_chunk_buf_detach(cmark_strbuf *buf); +static CMARK_INLINE cmark_chunk cmark_chunk_literal(const char *data); +static CMARK_INLINE cmark_chunk cmark_chunk_dup(const cmark_chunk *ch, bufsize_t pos, bufsize_t len); + +// Free (only if owning) +static CMARK_INLINE void cmark_chunk_free(cmark_mem *mem, cmark_chunk *c) { + if (c->alloc) + mem->free((void *)c->data); + c->data = NULL; + c->alloc = 0; + c->len = 0; +} +``` + +### Ownership Transfer + +`cmark_chunk_buf_detach()` takes ownership of a `cmark_strbuf`'s memory: + +```c +static CMARK_INLINE cmark_chunk cmark_chunk_buf_detach(cmark_strbuf
*buf) { + cmark_chunk c; + c.len = buf->size; + c.data = cmark_strbuf_detach(buf); + c.alloc = 1; // Now owns the data + return c; +} +``` + +### Non-Owning References + +`cmark_chunk_dup()` creates a non-owning view into existing memory: + +```c +static CMARK_INLINE cmark_chunk cmark_chunk_dup(const cmark_chunk *ch, + bufsize_t pos, bufsize_t len) { + cmark_chunk c = {ch->data + pos, len, 0}; // alloc = 0: non-owning + return c; +} +``` + +This is used extensively during parsing to avoid copying strings. For example, text node content during inline parsing initially points into the parser's line buffer. Only when the node outlives the parse does the data need to be copied. + +## Node Memory Management + +### Node Allocation + +```c +static cmark_node *S_node_new(cmark_node_type type, cmark_mem *mem) { + cmark_node *node = (cmark_node *)mem->calloc(1, sizeof(*node)); + cmark_strbuf_init(mem, &node->content, 0); + node->type = (uint16_t)type; + node->mem = mem; + return node; +} +``` + +Nodes are zero-initialized via `calloc`. The `mem` pointer is stored on the node for later freeing. + +### Node Deallocation + +```c +static void S_free_nodes(cmark_node *e) { + cmark_node *next; + while (e != NULL) { + // Free type-specific data + switch (e->type) { + case CMARK_NODE_CODE_BLOCK: + cmark_chunk_free(e->mem, &e->as.code.info); + cmark_chunk_free(e->mem, &e->as.literal); + break; + case CMARK_NODE_LINK: + case CMARK_NODE_IMAGE: + e->mem->free(e->as.link.url); + e->mem->free(e->as.link.title); + break; + // ... other types + } + // Splice children into the free list + if (e->first_child) { + cmark_node *last = e->last_child; + last->next = e->next; + e->next = e->first_child; + } + // Advance and free + next = e->next; + e->mem->free(e); + e = next; + } +} +``` + +This is an iterative (non-recursive) destructor that avoids stack overflow on deeply nested ASTs. 
The key technique is **sibling-list splicing**: a node's children are spliced into the sibling chain immediately after the node itself, ahead of its remaining siblings, so the tree walk becomes a single linear list traversal. + +### What Gets Freed Per Node Type + +| Node Type | Freed Data | +|-----------|-----------| +| `CODE_BLOCK` | `as.code.info` chunk, `as.literal` chunk | +| `TEXT`, `HTML_BLOCK`, `HTML_INLINE`, `CODE` | `as.literal` chunk | +| `LINK`, `IMAGE` | `as.link.url`, `as.link.title` | +| `CUSTOM_BLOCK`, `CUSTOM_INLINE` | `as.custom.on_enter`, `as.custom.on_exit` | +| `HEADING` | `as.heading.setext_content` (if chunk) | +| All nodes | `content` strbuf | + +## Parser Memory + +The parser allocates: +- A `cmark_parser` struct +- A `cmark_strbuf` for the current line (`linebuf`) +- A `cmark_strbuf` for collected content (`content`) +- A `cmark_reference_map` for link references +- Individual `cmark_node` objects for the AST + +When `cmark_parser_free()` is called, only the parser's own resources are freed; the AST is NOT freed (the user owns it). To free the AST, call `cmark_node_free()` on the root. + +## Memory Safety Patterns + +1. **No NULL returns**: The default allocator aborts on failure. User allocators should do the same or handle errors externally. +2. **Init buffers**: `cmark_strbuf__initbuf` prevents NULL pointer dereferences on empty buffers. +3. **Owning vs non-owning**: The `cmark_chunk.alloc` field prevents double-frees and ensures non-owning references are not freed. +4. **Iterative destruction**: `S_free_nodes()` avoids stack overflow on deep trees.
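The iterative destruction pattern can be illustrated with a stripped-down tree. The sketch below is an assumption-laden stand-in rather than cmark's implementation: `demo_node`, `demo_append`, `demo_free_nodes`, and `demo_deep_chain` are hypothetical names, the node carries only the three link fields the splice needs, and plain `free()` replaces the per-node allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal node; cmark_node carries many more fields. */
typedef struct demo_node {
  struct demo_node *next;
  struct demo_node *first_child;
  struct demo_node *last_child;
} demo_node;

static demo_node *demo_new(void) {
  demo_node *n = calloc(1, sizeof(*n));
  if (n == NULL) abort();
  return n;
}

/* Append `child` as the last child of `parent`. */
static void demo_append(demo_node *parent, demo_node *child) {
  assert(parent != NULL && child != NULL);
  if (parent->last_child) {
    parent->last_child->next = child;
  } else {
    parent->first_child = child;
  }
  parent->last_child = child;
}

/* Free `e`, its following siblings, and all descendants without recursion.
 * Returns the number of nodes released. */
static size_t demo_free_nodes(demo_node *e) {
  size_t freed = 0;
  while (e != NULL) {
    if (e->first_child) {
      /* Splice the child list in right after e, ahead of e's siblings. */
      e->last_child->next = e->next;
      e->next = e->first_child;
    }
    demo_node *next = e->next;
    free(e);
    freed++;
    e = next;
  }
  return freed;
}

/* Build a chain of `depth` nodes (depth >= 1), each the only child
 * of the one before it: the worst case for a recursive destructor. */
static demo_node *demo_deep_chain(size_t depth) {
  demo_node *root = demo_new();
  demo_node *cur = root;
  for (size_t i = 1; i < depth; i++) {
    demo_node *child = demo_new();
    demo_append(cur, child);
    cur = child;
  }
  return root;
}
```

Because the loop only ever follows `next`, its stack usage stays constant no matter how deep the tree is; a chain built with `demo_deep_chain(100000)` is released by the same flat loop as a three-node tree.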
+ +## Cross-References + +- [buffer.c](../../cmark/src/buffer.c), [buffer.h](../../cmark/src/buffer.h) — `cmark_strbuf` implementation +- [chunk.h](../../cmark/src/chunk.h) — `cmark_chunk` definition +- [cmark.c](../../cmark/src/cmark.c) — Default allocator, `cmark_get_default_mem_allocator()` +- [node.c](../../cmark/src/node.c) — Node allocation and deallocation +- [ast-node-system.md](ast-node-system.md) — Node structure and lifecycle diff --git a/docs/handbook/cmark/overview.md b/docs/handbook/cmark/overview.md new file mode 100644 index 0000000000..4fc95bdad7 --- /dev/null +++ b/docs/handbook/cmark/overview.md @@ -0,0 +1,256 @@ +# cmark — Overview + +## What Is cmark? + +cmark is a C library and command-line tool for parsing and rendering CommonMark (standardized Markdown). Written in C99, it implements a two-phase parsing architecture — block structure recognition followed by inline content parsing — producing an Abstract Syntax Tree (AST) that can be traversed, manipulated, and rendered into multiple output formats. + +**Language:** C (C99) +**Build System:** CMake (minimum version 3.14) +**Project Version:** 0.31.2 +**License:** BSD-2-Clause +**Authors:** John MacFarlane, Vicent Marti, Kārlis Gaņģis, Nick Wellnhofer + +## Core Architecture Summary + +cmark's processing pipeline follows this sequence: + +1. **Input** — UTF-8 text is fed to the parser, either all at once or incrementally via a streaming API. +2. **Block Parsing** (`blocks.c`) — The input is scanned line-by-line to identify block-level structures (paragraphs, headings, code blocks, lists, block quotes, thematic breaks, HTML blocks). +3. **Inline Parsing** (`inlines.c`) — Within paragraph and heading blocks, inline elements are parsed (emphasis, links, images, code spans, HTML inline, line breaks). +4. **AST Construction** — A tree of `cmark_node` structures is built, with each node representing a document element. +5. 
**Rendering** — The AST is traversed using an iterator and rendered to one of five output formats: HTML, XML, LaTeX, man (groff), or CommonMark. + +## Source File Map + +The `cmark/src/` directory contains the following source files, organized by responsibility: + +### Public API +| File | Purpose | +|------|---------| +| `cmark.h` | Public API header — all exported types, enums, and function declarations | +| `cmark.c` | Core glue — `cmark_markdown_to_html()`, default memory allocator, version info | +| `main.c` | CLI entry point — argument parsing, file I/O, format dispatch | + +### AST Node System +| File | Purpose | +|------|---------| +| `node.h` | Internal node struct definition, type-specific unions (`cmark_list`, `cmark_code`, `cmark_heading`, `cmark_link`, `cmark_custom`), internal flags | +| `node.c` | Node creation/destruction, accessor functions, tree manipulation (insert, append, unlink, replace) | + +### Parsing +| File | Purpose | +|------|---------| +| `parser.h` | Internal `cmark_parser` struct definition (parser state: line number, offset, column, indent, reference map) | +| `blocks.c` | Block-level parsing — line-by-line analysis, open/close block logic, list item detection, finalization | +| `inlines.c` | Inline-level parsing — emphasis/strong via delimiter stack, backtick code spans, links/images via bracket stack, autolinks, HTML inline | +| `inlines.h` | Internal API: `cmark_parse_inlines()`, `cmark_parse_reference_inline()`, `cmark_clean_url()`, `cmark_clean_title()` | + +### Traversal +| File | Purpose | +|------|---------| +| `iterator.h` | Internal `cmark_iter` struct with `cmark_iter_state` (current + next event/node pairs) | +| `iterator.c` | Iterator implementation — `cmark_iter_new()`, `cmark_iter_next()`, `cmark_iter_reset()`, `cmark_consolidate_text_nodes()` | + +### Renderers +| File | Purpose | +|------|---------| +| `render.h` | `cmark_renderer` struct, `cmark_escaping` enum (`LITERAL`, `NORMAL`, `TITLE`, `URL`) | +| `render.c` | 
Generic render framework — line wrapping, prefix management, `cmark_render()` dispatch loop | +| `html.c` | HTML renderer — `cmark_render_html()`, direct strbuf-based output, no render framework | +| `xml.c` | XML renderer — `cmark_render_xml()`, direct strbuf-based output with CommonMark DTD | +| `latex.c` | LaTeX renderer — `cmark_render_latex()`, uses render framework | +| `man.c` | groff man renderer — `cmark_render_man()`, uses render framework | +| `commonmark.c` | CommonMark renderer — `cmark_render_commonmark()`, uses render framework | + +### Text Processing and Utilities +| File | Purpose | +|------|---------| +| `buffer.h` / `buffer.c` | `cmark_strbuf` — growable byte buffer with amortized O(1) append | +| `chunk.h` | `cmark_chunk` — lightweight non-owning string slice (pointer + length) | +| `utf8.h` / `utf8.c` | UTF-8 iteration, validation, encoding, case folding, Unicode property queries | +| `references.h` / `references.c` | Link reference definition storage and lookup (sorted array with binary search) | +| `scanners.h` / `scanners.c` | re2c-generated scanner functions for recognizing Markdown syntax patterns | +| `scanners.re` | re2c source for scanner generation | +| `cmark_ctype.h` / `cmark_ctype.c` | Locale-independent `cmark_isspace()`, `cmark_ispunct()`, `cmark_isdigit()`, `cmark_isalpha()` | +| `houdini.h` | HTML/URL escaping and unescaping API | +| `houdini_html_e.c` | HTML entity escaping | +| `houdini_html_u.c` | HTML entity unescaping | +| `houdini_href_e.c` | URL/href percent-encoding | +| `entities.inc` | HTML entity name-to-codepoint lookup table | +| `case_fold.inc` | Unicode case folding table for reference normalization | + +## The Simple Interface + +The simplest way to use cmark is a single function call defined in `cmark.c`: + +```c +char *cmark_markdown_to_html(const char *text, size_t len, int options); +``` + +Internally, this calls `cmark_parse_document()` to build the AST, then `cmark_render_html()` to produce the output, and 
finally frees the document node. The caller is responsible for freeing the returned string. + +The implementation in `cmark.c`: + +```c +char *cmark_markdown_to_html(const char *text, size_t len, int options) { + cmark_node *doc; + char *result; + + doc = cmark_parse_document(text, len, options); + result = cmark_render_html(doc, options); + cmark_node_free(doc); + + return result; +} +``` + +## The Streaming Interface + +For large documents or streaming input, cmark provides an incremental parsing API: + +```c +cmark_parser *parser = cmark_parser_new(CMARK_OPT_DEFAULT); + +// Feed chunks of data as they arrive +while ((bytes = fread(buffer, 1, sizeof(buffer), fp)) > 0) { + cmark_parser_feed(parser, buffer, bytes); +} + +// Finalize and get the AST +cmark_node *document = cmark_parser_finish(parser); +cmark_parser_free(parser); + +// Render to any format +char *html = cmark_render_html(document, CMARK_OPT_DEFAULT); +char *xml = cmark_render_xml(document, CMARK_OPT_DEFAULT); +char *man = cmark_render_man(document, CMARK_OPT_DEFAULT, 72); +char *tex = cmark_render_latex(document, CMARK_OPT_DEFAULT, 80); +char *cm = cmark_render_commonmark(document, CMARK_OPT_DEFAULT, 0); + +// Cleanup +cmark_node_free(document); +``` + +The parser accumulates input in an internal line buffer (`parser->linebuf`) and processes complete lines as they become available. The `S_parser_feed()` function in `blocks.c` scans for line-ending characters (`\n`, `\r`) and dispatches each complete line to `S_process_line()`. + +## Node Type Taxonomy + +cmark defines 21 node types in the `cmark_node_type` enum: + +### Block Nodes (container and leaf) +| Enum Value | Type String | Container? | Accepts Lines? | Contains Inlines? 
| +|-----------|-------------|------------|---------------|-------------------| +| `CMARK_NODE_DOCUMENT` | "document" | Yes | No | No | +| `CMARK_NODE_BLOCK_QUOTE` | "block_quote" | Yes | No | No | +| `CMARK_NODE_LIST` | "list" | Yes (items only) | No | No | +| `CMARK_NODE_ITEM` | "item" | Yes | No | No | +| `CMARK_NODE_CODE_BLOCK` | "code_block" | No (leaf) | Yes | No | +| `CMARK_NODE_HTML_BLOCK` | "html_block" | No (leaf) | No | No | +| `CMARK_NODE_CUSTOM_BLOCK` | "custom_block" | Yes | No | No | +| `CMARK_NODE_PARAGRAPH` | "paragraph" | No | Yes | Yes | +| `CMARK_NODE_HEADING` | "heading" | No | Yes | Yes | +| `CMARK_NODE_THEMATIC_BREAK` | "thematic_break" | No (leaf) | No | No | + +### Inline Nodes +| Enum Value | Type String | Leaf? | +|-----------|-------------|-------| +| `CMARK_NODE_TEXT` | "text" | Yes | +| `CMARK_NODE_SOFTBREAK` | "softbreak" | Yes | +| `CMARK_NODE_LINEBREAK` | "linebreak" | Yes | +| `CMARK_NODE_CODE` | "code" | Yes | +| `CMARK_NODE_HTML_INLINE` | "html_inline" | Yes | +| `CMARK_NODE_CUSTOM_INLINE` | "custom_inline" | No | +| `CMARK_NODE_EMPH` | "emph" | No | +| `CMARK_NODE_STRONG` | "strong" | No | +| `CMARK_NODE_LINK` | "link" | No | +| `CMARK_NODE_IMAGE` | "image" | No | + +Range sentinels are also defined for classification: +- `CMARK_NODE_FIRST_BLOCK = CMARK_NODE_DOCUMENT` +- `CMARK_NODE_LAST_BLOCK = CMARK_NODE_THEMATIC_BREAK` +- `CMARK_NODE_FIRST_INLINE = CMARK_NODE_TEXT` +- `CMARK_NODE_LAST_INLINE = CMARK_NODE_IMAGE` + +## Option Flags + +Options are passed as a bitmask integer to parsing and rendering functions: + +```c +#define CMARK_OPT_DEFAULT 0 +#define CMARK_OPT_SOURCEPOS (1 << 1) // Add data-sourcepos attributes +#define CMARK_OPT_HARDBREAKS (1 << 2) // Render softbreaks as hard breaks +#define CMARK_OPT_SAFE (1 << 3) // Legacy (now default behavior) +#define CMARK_OPT_NOBREAKS (1 << 4) // Render softbreaks as spaces +#define CMARK_OPT_NORMALIZE (1 << 8) // Legacy (no effect) +#define CMARK_OPT_VALIDATE_UTF8 (1 << 9) // 
Validate UTF-8 input +#define CMARK_OPT_SMART (1 << 10) // Smart quotes and dashes +#define CMARK_OPT_UNSAFE (1 << 17) // Allow raw HTML and dangerous URLs +``` + +## Memory Management Model + +cmark uses a pluggable memory allocator defined by the `cmark_mem` struct: + +```c +typedef struct cmark_mem { + void *(*calloc)(size_t, size_t); + void *(*realloc)(void *, size_t); + void (*free)(void *); +} cmark_mem; +``` + +The default allocator in `cmark.c` wraps standard `calloc`/`realloc`/`free` with abort-on-NULL safety checks (`xcalloc`, `xrealloc`). Every node stores a pointer to the allocator it was created with (`node->mem`), ensuring consistent allocation/deallocation throughout the tree. + +## Version Information + +Runtime version queries: + +```c +int cmark_version(void); // Returns CMARK_VERSION as integer (0xMMmmpp) +const char *cmark_version_string(void); // Returns CMARK_VERSION_STRING +``` + +The version is encoded as a 24-bit integer where bits 16–23 are major, 8–15 are minor, and 0–7 are patch. For example, `0x001F02` represents version 0.31.2. + +## Backwards Compatibility Aliases + +For code written against older cmark API versions, these aliases are provided: + +```c +#define CMARK_NODE_HEADER CMARK_NODE_HEADING +#define CMARK_NODE_HRULE CMARK_NODE_THEMATIC_BREAK +#define CMARK_NODE_HTML CMARK_NODE_HTML_BLOCK +#define CMARK_NODE_INLINE_HTML CMARK_NODE_HTML_INLINE +``` + +Short-name aliases (without the `CMARK_` prefix) are also available unless `CMARK_NO_SHORT_NAMES` is defined: + +```c +#define NODE_DOCUMENT CMARK_NODE_DOCUMENT +#define NODE_PARAGRAPH CMARK_NODE_PARAGRAPH +#define BULLET_LIST CMARK_BULLET_LIST +// ... 
and many more +``` + +## Cross-References + +- [architecture.md](architecture.md) — Detailed two-phase parsing pipeline, module dependency graph +- [public-api.md](public-api.md) — Complete public API reference with all function signatures +- [ast-node-system.md](ast-node-system.md) — Internal `cmark_node` struct, type-specific unions, tree operations +- [block-parsing.md](block-parsing.md) — `blocks.c` line-by-line analysis, open block tracking, finalization +- [inline-parsing.md](inline-parsing.md) — `inlines.c` delimiter algorithm, bracket stack, backtick scanning +- [iterator-system.md](iterator-system.md) — AST traversal with enter/exit events +- [html-renderer.md](html-renderer.md) — HTML output with escaping and source position +- [xml-renderer.md](xml-renderer.md) — XML output with CommonMark DTD +- [latex-renderer.md](latex-renderer.md) — LaTeX output via render framework +- [man-renderer.md](man-renderer.md) — groff man page output +- [commonmark-renderer.md](commonmark-renderer.md) — Round-trip CommonMark output +- [render-framework.md](render-framework.md) — Shared `cmark_render()` engine for text-based renderers +- [memory-management.md](memory-management.md) — Allocator model, buffer growth, node freeing +- [utf8-handling.md](utf8-handling.md) — UTF-8 validation, iteration, case folding +- [reference-system.md](reference-system.md) — Link reference definitions storage and resolution +- [scanner-system.md](scanner-system.md) — re2c-generated pattern matching +- [building.md](building.md) — CMake build configuration and options +- [cli-usage.md](cli-usage.md) — Command-line tool usage +- [testing.md](testing.md) — Test infrastructure (spec tests, API tests, fuzzing) +- [code-style.md](code-style.md) — Coding conventions and naming patterns diff --git a/docs/handbook/cmark/public-api.md b/docs/handbook/cmark/public-api.md new file mode 100644 index 0000000000..7168282e23 --- /dev/null +++ b/docs/handbook/cmark/public-api.md @@ -0,0 +1,637 @@ +# cmark — 
Public API Reference + +## Header: `cmark.h` + +All public API functions, types, and constants are declared in `cmark.h`. Functions marked with `CMARK_EXPORT` are exported from the shared library. The header is usable from C++ via `extern "C"` guards. + +--- + +## Type Definitions + +### Node Types + +```c +typedef enum { + /* Error status */ + CMARK_NODE_NONE, + + /* Block nodes */ + CMARK_NODE_DOCUMENT, + CMARK_NODE_BLOCK_QUOTE, + CMARK_NODE_LIST, + CMARK_NODE_ITEM, + CMARK_NODE_CODE_BLOCK, + CMARK_NODE_HTML_BLOCK, + CMARK_NODE_CUSTOM_BLOCK, + CMARK_NODE_PARAGRAPH, + CMARK_NODE_HEADING, + CMARK_NODE_THEMATIC_BREAK, + + /* Range sentinels */ + CMARK_NODE_FIRST_BLOCK = CMARK_NODE_DOCUMENT, + CMARK_NODE_LAST_BLOCK = CMARK_NODE_THEMATIC_BREAK, + + /* Inline nodes */ + CMARK_NODE_TEXT, + CMARK_NODE_SOFTBREAK, + CMARK_NODE_LINEBREAK, + CMARK_NODE_CODE, + CMARK_NODE_HTML_INLINE, + CMARK_NODE_CUSTOM_INLINE, + CMARK_NODE_EMPH, + CMARK_NODE_STRONG, + CMARK_NODE_LINK, + CMARK_NODE_IMAGE, + + CMARK_NODE_FIRST_INLINE = CMARK_NODE_TEXT, + CMARK_NODE_LAST_INLINE = CMARK_NODE_IMAGE +} cmark_node_type; +``` + +### List Types + +```c +typedef enum { + CMARK_NO_LIST, + CMARK_BULLET_LIST, + CMARK_ORDERED_LIST +} cmark_list_type; +``` + +### Delimiter Types + +```c +typedef enum { + CMARK_NO_DELIM, + CMARK_PERIOD_DELIM, + CMARK_PAREN_DELIM +} cmark_delim_type; +``` + +### Event Types (for iterator) + +```c +typedef enum { + CMARK_EVENT_NONE, + CMARK_EVENT_DONE, + CMARK_EVENT_ENTER, + CMARK_EVENT_EXIT +} cmark_event_type; +``` + +### Opaque Types + +```c +typedef struct cmark_node cmark_node; +typedef struct cmark_parser cmark_parser; +typedef struct cmark_iter cmark_iter; +``` + +### Memory Allocator + +```c +typedef struct cmark_mem { + void *(*calloc)(size_t, size_t); + void *(*realloc)(void *, size_t); + void (*free)(void *); +} cmark_mem; +``` + +--- + +## Simple Interface + +### `cmark_markdown_to_html` + +```c +CMARK_EXPORT +char *cmark_markdown_to_html(const char *text, size_t 
len, int options); +``` + +Converts CommonMark text to HTML in a single call. The input `text` must be UTF-8 encoded. The returned string is null-terminated and allocated via the default allocator; the caller must free it with `free()`. + +**Implementation** (in `cmark.c`): Calls `cmark_parse_document()`, then `cmark_render_html()`, then `cmark_node_free()`. + +--- + +## Node Classification + +### `cmark_node_is_block` + +```c +CMARK_EXPORT bool cmark_node_is_block(cmark_node *node); +``` + +Returns `true` if `node->type` is between `CMARK_NODE_FIRST_BLOCK` and `CMARK_NODE_LAST_BLOCK` inclusive. Returns `false` for NULL. + +### `cmark_node_is_inline` + +```c +CMARK_EXPORT bool cmark_node_is_inline(cmark_node *node); +``` + +Returns `true` if `node->type` is between `CMARK_NODE_FIRST_INLINE` and `CMARK_NODE_LAST_INLINE` inclusive. Returns `false` for NULL. + +### `cmark_node_is_leaf` + +```c +CMARK_EXPORT bool cmark_node_is_leaf(cmark_node *node); +``` + +Returns `true` for node types that cannot have children: +- `CMARK_NODE_THEMATIC_BREAK` +- `CMARK_NODE_CODE_BLOCK` +- `CMARK_NODE_TEXT` +- `CMARK_NODE_SOFTBREAK` +- `CMARK_NODE_LINEBREAK` +- `CMARK_NODE_CODE` +- `CMARK_NODE_HTML_INLINE` + +Note: `CMARK_NODE_HTML_BLOCK` is **not** classified as a leaf by `cmark_node_is_leaf()`, though the iterator treats it as one (see `S_leaf_mask` in `iterator.c`). + +--- + +## Node Creation and Destruction + +### `cmark_node_new` + +```c +CMARK_EXPORT cmark_node *cmark_node_new(cmark_node_type type); +``` + +Creates a new node of the given type using the default memory allocator. For `CMARK_NODE_HEADING`, the level defaults to 1. For `CMARK_NODE_LIST`, the list type defaults to `CMARK_BULLET_LIST` with `start = 0` and `tight = false`. + +### `cmark_node_new_with_mem` + +```c +CMARK_EXPORT cmark_node *cmark_node_new_with_mem(cmark_node_type type, cmark_mem *mem); +``` + +Same as `cmark_node_new` but uses the specified memory allocator. 
All nodes in a single tree must use the same allocator. + +### `cmark_node_free` + +```c +CMARK_EXPORT void cmark_node_free(cmark_node *node); +``` + +Frees the node and all its descendants. The node is first unlinked from its siblings/parent. The internal `S_free_nodes()` function iterates the subtree (splicing children into a flat list for iterative freeing) and releases type-specific memory: +- `CMARK_NODE_CODE_BLOCK`: frees `data` and `as.code.info` +- `CMARK_NODE_TEXT`, `CMARK_NODE_HTML_INLINE`, `CMARK_NODE_CODE`, `CMARK_NODE_HTML_BLOCK`: frees `data` +- `CMARK_NODE_LINK`, `CMARK_NODE_IMAGE`: frees `as.link.url` and `as.link.title` +- `CMARK_NODE_CUSTOM_BLOCK`, `CMARK_NODE_CUSTOM_INLINE`: frees `as.custom.on_enter` and `as.custom.on_exit` + +--- + +## Tree Traversal + +### `cmark_node_next` + +```c +CMARK_EXPORT cmark_node *cmark_node_next(cmark_node *node); +``` + +Returns the next sibling, or NULL. + +### `cmark_node_previous` + +```c +CMARK_EXPORT cmark_node *cmark_node_previous(cmark_node *node); +``` + +Returns the previous sibling, or NULL. + +### `cmark_node_parent` + +```c +CMARK_EXPORT cmark_node *cmark_node_parent(cmark_node *node); +``` + +Returns the parent node, or NULL. + +### `cmark_node_first_child` + +```c +CMARK_EXPORT cmark_node *cmark_node_first_child(cmark_node *node); +``` + +Returns the first child, or NULL. + +### `cmark_node_last_child` + +```c +CMARK_EXPORT cmark_node *cmark_node_last_child(cmark_node *node); +``` + +Returns the last child, or NULL. + +--- + +## Iterator API + +### `cmark_iter_new` + +```c +CMARK_EXPORT cmark_iter *cmark_iter_new(cmark_node *root); +``` + +Creates a new iterator starting at `root`. Returns NULL if `root` is NULL. The iterator begins in a pre-first state (`CMARK_EVENT_NONE`); the first call to `cmark_iter_next()` returns `CMARK_EVENT_ENTER` for the root. + +### `cmark_iter_free` + +```c +CMARK_EXPORT void cmark_iter_free(cmark_iter *iter); +``` + +Frees the iterator. Does not free any nodes. 
+ +### `cmark_iter_next` + +```c +CMARK_EXPORT cmark_event_type cmark_iter_next(cmark_iter *iter); +``` + +Advances to the next node and returns the event type: +- `CMARK_EVENT_ENTER` — entering a node (for non-leaf nodes, children follow) +- `CMARK_EVENT_EXIT` — leaving a node (all children have been visited) +- `CMARK_EVENT_DONE` — iteration complete (returned to root) + +Leaf nodes only generate `ENTER` events, never `EXIT`. + +### `cmark_iter_get_node` + +```c +CMARK_EXPORT cmark_node *cmark_iter_get_node(cmark_iter *iter); +``` + +Returns the current node. + +### `cmark_iter_get_event_type` + +```c +CMARK_EXPORT cmark_event_type cmark_iter_get_event_type(cmark_iter *iter); +``` + +Returns the current event type. + +### `cmark_iter_get_root` + +```c +CMARK_EXPORT cmark_node *cmark_iter_get_root(cmark_iter *iter); +``` + +Returns the root node of the iteration. + +### `cmark_iter_reset` + +```c +CMARK_EXPORT void cmark_iter_reset(cmark_iter *iter, cmark_node *current, + cmark_event_type event_type); +``` + +Resets the iterator position. The node must be a descendant of the root (or the root itself). + +--- + +## Node Accessors + +### User Data + +```c +CMARK_EXPORT void *cmark_node_get_user_data(cmark_node *node); +CMARK_EXPORT int cmark_node_set_user_data(cmark_node *node, void *user_data); +``` + +Get/set arbitrary user data pointer. Returns 0 on failure, 1 on success. cmark does not manage the lifecycle of user data. + +### Type Information + +```c +CMARK_EXPORT cmark_node_type cmark_node_get_type(cmark_node *node); +CMARK_EXPORT const char *cmark_node_get_type_string(cmark_node *node); +``` + +`cmark_node_get_type_string()` returns strings like `"document"`, `"paragraph"`, `"heading"`, `"text"`, `"emph"`, `"strong"`, `"link"`, `"image"`, etc. Returns `""` for unrecognized types. 
+ +### String Content + +```c +CMARK_EXPORT const char *cmark_node_get_literal(cmark_node *node); +CMARK_EXPORT int cmark_node_set_literal(cmark_node *node, const char *content); +``` + +Works for `CMARK_NODE_HTML_BLOCK`, `CMARK_NODE_TEXT`, `CMARK_NODE_HTML_INLINE`, `CMARK_NODE_CODE`, and `CMARK_NODE_CODE_BLOCK`. Returns NULL / 0 for other types. + +### Heading Level + +```c +CMARK_EXPORT int cmark_node_get_heading_level(cmark_node *node); +CMARK_EXPORT int cmark_node_set_heading_level(cmark_node *node, int level); +``` + +Only works for `CMARK_NODE_HEADING`. Level must be 1–6. Returns 0 on error. + +### List Properties + +```c +CMARK_EXPORT cmark_list_type cmark_node_get_list_type(cmark_node *node); +CMARK_EXPORT int cmark_node_set_list_type(cmark_node *node, cmark_list_type type); +CMARK_EXPORT cmark_delim_type cmark_node_get_list_delim(cmark_node *node); +CMARK_EXPORT int cmark_node_set_list_delim(cmark_node *node, cmark_delim_type delim); +CMARK_EXPORT int cmark_node_get_list_start(cmark_node *node); +CMARK_EXPORT int cmark_node_set_list_start(cmark_node *node, int start); +CMARK_EXPORT int cmark_node_get_list_tight(cmark_node *node); +CMARK_EXPORT int cmark_node_set_list_tight(cmark_node *node, int tight); +``` + +All list accessors only work for `CMARK_NODE_LIST`. `set_list_start` rejects negative values. `set_list_tight` interprets `tight == 1` as true. + +### Code Block Info + +```c +CMARK_EXPORT const char *cmark_node_get_fence_info(cmark_node *node); +CMARK_EXPORT int cmark_node_set_fence_info(cmark_node *node, const char *info); +``` + +The info string from a fenced code block (e.g., `"python"` from ` ```python `). Only works for `CMARK_NODE_CODE_BLOCK`. 
+ +### Link/Image Properties + +```c +CMARK_EXPORT const char *cmark_node_get_url(cmark_node *node); +CMARK_EXPORT int cmark_node_set_url(cmark_node *node, const char *url); +CMARK_EXPORT const char *cmark_node_get_title(cmark_node *node); +CMARK_EXPORT int cmark_node_set_title(cmark_node *node, const char *title); +``` + +Only work for `CMARK_NODE_LINK` and `CMARK_NODE_IMAGE`. Return NULL / 0 for other types. + +### Custom Block/Inline + +```c +CMARK_EXPORT const char *cmark_node_get_on_enter(cmark_node *node); +CMARK_EXPORT int cmark_node_set_on_enter(cmark_node *node, const char *on_enter); +CMARK_EXPORT const char *cmark_node_get_on_exit(cmark_node *node); +CMARK_EXPORT int cmark_node_set_on_exit(cmark_node *node, const char *on_exit); +``` + +Only work for `CMARK_NODE_CUSTOM_BLOCK` and `CMARK_NODE_CUSTOM_INLINE`. + +### Source Position + +```c +CMARK_EXPORT int cmark_node_get_start_line(cmark_node *node); +CMARK_EXPORT int cmark_node_get_start_column(cmark_node *node); +CMARK_EXPORT int cmark_node_get_end_line(cmark_node *node); +CMARK_EXPORT int cmark_node_get_end_column(cmark_node *node); +``` + +Line and column numbers are 1-based. These are populated during parsing if `CMARK_OPT_SOURCEPOS` is set. + +--- + +## Tree Manipulation + +### `cmark_node_unlink` + +```c +CMARK_EXPORT void cmark_node_unlink(cmark_node *node); +``` + +Removes `node` from the tree (detaching from parent and siblings) without freeing its memory. + +### `cmark_node_insert_before` + +```c +CMARK_EXPORT int cmark_node_insert_before(cmark_node *node, cmark_node *sibling); +``` + +Inserts `sibling` before `node`. Validates that the parent can contain the sibling (via `S_can_contain()`). Returns 1 on success, 0 on failure. + +### `cmark_node_insert_after` + +```c +CMARK_EXPORT int cmark_node_insert_after(cmark_node *node, cmark_node *sibling); +``` + +Inserts `sibling` after `node`. Returns 1 on success, 0 on failure. 
+ +### `cmark_node_replace` + +```c +CMARK_EXPORT int cmark_node_replace(cmark_node *oldnode, cmark_node *newnode); +``` + +Replaces `oldnode` with `newnode` in the tree. The old node is unlinked but not freed. + +### `cmark_node_prepend_child` + +```c +CMARK_EXPORT int cmark_node_prepend_child(cmark_node *node, cmark_node *child); +``` + +Adds `child` as the first child of `node`. Validates containership. + +### `cmark_node_append_child` + +```c +CMARK_EXPORT int cmark_node_append_child(cmark_node *node, cmark_node *child); +``` + +Adds `child` as the last child of `node`. Validates containership. + +### `cmark_consolidate_text_nodes` + +```c +CMARK_EXPORT void cmark_consolidate_text_nodes(cmark_node *root); +``` + +Merges adjacent `CMARK_NODE_TEXT` children into single text nodes throughout the subtree. Uses an iterator to find consecutive text nodes and concatenates their data via `cmark_strbuf`. + +--- + +## Parsing Functions + +### `cmark_parser_new` + +```c +CMARK_EXPORT cmark_parser *cmark_parser_new(int options); +``` + +Creates a parser with the default memory allocator and a new document root. + +### `cmark_parser_new_with_mem` + +```c +CMARK_EXPORT cmark_parser *cmark_parser_new_with_mem(int options, cmark_mem *mem); +``` + +Creates a parser with the specified allocator. + +### `cmark_parser_new_with_mem_into_root` + +```c +CMARK_EXPORT cmark_parser *cmark_parser_new_with_mem_into_root( + int options, cmark_mem *mem, cmark_node *root); +``` + +Creates a parser that appends parsed content to an existing root node. Useful for assembling a single document from multiple parsed fragments. + +### `cmark_parser_free` + +```c +CMARK_EXPORT void cmark_parser_free(cmark_parser *parser); +``` + +Frees the parser and its internal buffers. Does NOT free the parsed document tree. + +### `cmark_parser_feed` + +```c +CMARK_EXPORT void cmark_parser_feed(cmark_parser *parser, const char *buffer, size_t len); +``` + +Feeds a chunk of input data to the parser. 
Can be called multiple times for streaming input. + +### `cmark_parser_finish` + +```c +CMARK_EXPORT cmark_node *cmark_parser_finish(cmark_parser *parser); +``` + +Finalizes parsing and returns the document root. Must be called after all input has been fed. Triggers `finalize_document()` which closes all open blocks and runs inline parsing. + +### `cmark_parse_document` + +```c +CMARK_EXPORT cmark_node *cmark_parse_document(const char *buffer, size_t len, int options); +``` + +Convenience function equivalent to: create parser → feed entire buffer → finish → free parser. Returns the document root. + +### `cmark_parse_file` + +```c +CMARK_EXPORT cmark_node *cmark_parse_file(FILE *f, int options); +``` + +Reads from a `FILE*` in 4096-byte chunks and parses incrementally. + +--- + +## Rendering Functions + +### `cmark_render_html` + +```c +CMARK_EXPORT char *cmark_render_html(cmark_node *root, int options); +``` + +Renders to HTML. The caller must free the returned string. + +### `cmark_render_xml` + +```c +CMARK_EXPORT char *cmark_render_xml(cmark_node *root, int options); +``` + +Renders to XML with CommonMark DTD. The output begins with an XML declaration followed by `<!DOCTYPE document SYSTEM "CommonMark.dtd">`. + +### `cmark_render_man` + +```c +CMARK_EXPORT char *cmark_render_man(cmark_node *root, int options, int width); +``` + +Renders to groff man page format. `width` controls line wrapping (0 = no wrap). + +### `cmark_render_commonmark` + +```c +CMARK_EXPORT char *cmark_render_commonmark(cmark_node *root, int options, int width); +``` + +Renders back to CommonMark format. `width` controls line wrapping. + +### `cmark_render_latex` + +```c +CMARK_EXPORT char *cmark_render_latex(cmark_node *root, int options, int width); +``` + +Renders to LaTeX. `width` controls line wrapping.
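Returning briefly to the parsing side: the chunked reading that `cmark_parse_file` is described as doing can be sketched as a plain `fread` loop that hands each chunk to a feed callback. This is an illustration only. `toy_feed_file`, `toy_feed_fn`, and `toy_counter` are hypothetical names, and the callback stands in for `cmark_parser_feed`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback type standing in for a parser's feed function. */
typedef void (*toy_feed_fn)(void *ctx, const char *buffer, size_t len);

/* Example callback state: counts how often and how much it was fed. */
typedef struct {
  size_t calls;
  size_t bytes;
} toy_counter;

static void toy_count_feed(void *ctx, const char *buffer, size_t len) {
  toy_counter *counter = (toy_counter *)ctx;
  (void)buffer; /* a real parser would consume the bytes here */
  counter->calls++;
  counter->bytes += len;
}

/* Read f in 4096-byte chunks and feed each chunk as it arrives.
 * Returns the total number of bytes fed. */
static size_t toy_feed_file(FILE *f, toy_feed_fn feed, void *ctx) {
  char buffer[4096];
  size_t total = 0;
  size_t n;
  while ((n = fread(buffer, 1, sizeof buffer, f)) > 0) {
    feed(ctx, buffer, n);
    total += n;
  }
  return total;
}
```

Because each chunk is fed as soon as it is read, memory use stays bounded by the chunk size plus whatever the parser retains, regardless of file size.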
+ +--- + +## Option Constants + +### Rendering Options + +```c +#define CMARK_OPT_DEFAULT 0 // No special options +#define CMARK_OPT_SOURCEPOS (1 << 1) // data-sourcepos attributes (HTML), sourcepos attributes (XML) +#define CMARK_OPT_HARDBREAKS (1 << 2) // Render softbreaks as hard line breaks (<br /> in HTML) +#define CMARK_OPT_SAFE (1 << 3) // Legacy — safe mode is now default +#define CMARK_OPT_UNSAFE (1 << 17) // Render raw HTML and dangerous URLs +#define CMARK_OPT_NOBREAKS (1 << 4) // Render softbreaks as spaces +``` + +### Parsing Options + +```c +#define CMARK_OPT_NORMALIZE (1 << 8) // Legacy — no effect +#define CMARK_OPT_VALIDATE_UTF8 (1 << 9) // Replace invalid UTF-8 with U+FFFD +#define CMARK_OPT_SMART (1 << 10) // Smart quotes and dashes +``` + +--- + +## Memory Allocator + +### `cmark_get_default_mem_allocator` + +```c +CMARK_EXPORT cmark_mem *cmark_get_default_mem_allocator(void); +``` + +Returns a pointer to the default allocator (`DEFAULT_MEM_ALLOCATOR` in `cmark.c`) which wraps `calloc`, `realloc`, and `free` with abort-on-failure guards. + +--- + +## Version API + +### `cmark_version` + +```c +CMARK_EXPORT int cmark_version(void); +``` + +Returns the version as a packed integer: `(major << 16) | (minor << 8) | patch`. + +### `cmark_version_string` + +```c +CMARK_EXPORT const char *cmark_version_string(void); +``` + +Returns the version as a human-readable string (e.g., `"0.31.2"`). + +--- + +## Node Integrity Checking + +```c +CMARK_EXPORT int cmark_node_check(cmark_node *node, FILE *out); +``` + +Validates the structural integrity of the node tree, printing errors to `out`. Returns the number of errors found. Available in all builds but primarily useful in debug builds.
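The packed version integer and the option bit flags are both plain bit arithmetic. As a small sketch, the helpers below decode a value laid out as `(major << 16) | (minor << 8) | patch` and test whether a flag is present in an options word. The `TOY_*` macros and `toy_*` functions are hypothetical; the flag values are copied from the option tables above so the example does not need `cmark.h`.

```c
/* Same layout as the packed version integer described above. */
#define TOY_VERSION(major, minor, patch) \
  (((major) << 16) | ((minor) << 8) | (patch))

/* Flag values copied from the option tables; names are illustrative. */
#define TOY_OPT_SOURCEPOS (1 << 1)
#define TOY_OPT_SMART (1 << 10)
#define TOY_OPT_UNSAFE (1 << 17)

static int toy_version_major(int v) { return v >> 16; }
static int toy_version_minor(int v) { return (v >> 8) & 0xFF; }
static int toy_version_patch(int v) { return v & 0xFF; }

/* Nonzero if every bit of flag is set in options. */
static int toy_has_option(int options, int flag) {
  return (options & flag) == flag;
}
```

A practical use is a feature gate: comparing the packed integer against `TOY_VERSION(0, 31, 0)` with `>=` orders versions correctly as long as minor and patch each stay below 256.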
+ +--- + +## Cross-References + +- [ast-node-system.md](ast-node-system.md) — Internal struct definitions behind these opaque types +- [iterator-system.md](iterator-system.md) — Detailed iterator mechanics +- [memory-management.md](memory-management.md) — Allocator details and buffer management +- [block-parsing.md](block-parsing.md) — How `cmark_parser_feed` and `cmark_parser_finish` work internally +- [html-renderer.md](html-renderer.md) — How `cmark_render_html` generates output diff --git a/docs/handbook/cmark/reference-system.md b/docs/handbook/cmark/reference-system.md new file mode 100644 index 0000000000..0e63b5c796 --- /dev/null +++ b/docs/handbook/cmark/reference-system.md @@ -0,0 +1,307 @@ +# cmark — Reference System + +## Overview + +The reference system (`references.c`, `references.h`) manages link reference definitions — the `[label]: URL "title"` constructs in CommonMark. During block parsing, reference definitions are extracted and stored. During inline parsing, reference links (`[text][label]` and `[text]`) look up these stored definitions. 
+ +## Data Structures + +### Reference Entry + +```c +typedef struct cmark_reference { + struct cmark_reference *next; // Unused — leftover from old linked-list design + unsigned char *url; + unsigned char *title; + unsigned char *label; + unsigned int age; // Insertion order (for stable sorting) + unsigned int size; // Length of the label string +} cmark_reference; +``` + +Each reference stores: +- `label` — The normalized reference label (case-folded, whitespace-collapsed) +- `url` — The destination URL +- `title` — Optional title string (may be NULL) +- `age` — Monotonically increasing counter for insertion order +- `size` — Byte length of the label + +### Reference Map + +```c +struct cmark_reference_map { + cmark_mem *mem; + cmark_reference **refs; // Array of reference pointers (sorted on first lookup) + unsigned int size; // Number of entries + unsigned int ref_size; // Cumulative size of all labels + URLs + titles + unsigned int max_ref_size; // Maximum allowed ref_size (anti-DoS limit) + cmark_reference *last; // Most recently added reference + unsigned int asize; // Allocated capacity of refs array +}; +``` + +The map uses a **sorted array with binary search** for lookup, not a hash table. This gives O(log n) lookup. Insertion is an amortized O(1) append, because the array is sorted once on the first lookup rather than kept in order as entries arrive. + +### Anti-DoS Limiting + +The `ref_size` and `max_ref_size` fields prevent pathological inputs from causing excessive memory usage: + +```c +unsigned int max_ref_size; // Set to 100 * input length at parser init +unsigned int ref_size; // Sum of all label + url + title lengths +``` + +When `ref_size` exceeds `max_ref_size`, new reference additions are silently rejected. This prevents quadratic memory blowup from inputs with many reference definitions.
+ +## Label Normalization + +```c +static unsigned char *normalize_reference(cmark_mem *mem, + cmark_chunk *ref) { + cmark_strbuf normalized = CMARK_BUF_INIT(mem); + + if (ref == NULL) return NULL; + + if (ref->len == 0) return NULL; + + cmark_utf8proc_case_fold(&normalized, ref->data, ref->len); + cmark_strbuf_trim(&normalized); + cmark_strbuf_normalize_whitespace(&normalized); + + return cmark_strbuf_detach(&normalized); +} +``` + +The normalization process (per CommonMark spec): +1. **Case fold** — Uses Unicode case folding (not simple lowercasing), via `cmark_utf8proc_case_fold()` +2. **Trim** — Remove leading and trailing whitespace +3. **Collapse whitespace** — Replace runs of whitespace with a single space + +This means `[Foo Bar]`, `[FOO BAR]`, and `[foo bar]` all normalize to the same label. + +## Reference Creation + +```c +static void cmark_reference_create(cmark_reference_map *map, + cmark_chunk *label, + cmark_chunk *url, + cmark_chunk *title) { + cmark_reference *ref; + unsigned char *reflabel = normalize_reference(map->mem, label); + + if (reflabel == NULL) return; + + // Anti-DoS: check cumulative size limit + if (map->ref_size > map->max_ref_size) { + map->mem->free(reflabel); + return; + } + + ref = (cmark_reference *)map->mem->calloc(1, sizeof(*ref)); + ref->label = reflabel; + ref->url = cmark_clean_url(map->mem, url); + ref->title = cmark_clean_title(map->mem, title); + ref->age = map->size; + ref->size = (unsigned int)strlen((char *)reflabel); + + // Track cumulative size + map->ref_size += ref->size; + if (ref->url) map->ref_size += (unsigned int)strlen((char *)ref->url); + if (ref->title) map->ref_size += (unsigned int)strlen((char *)ref->title); + + // Add to array + if (map->size >= map->asize) { + // Grow array (double capacity) + map->asize = map->asize ? 
2 * map->asize : 8; + map->refs = (cmark_reference **)map->mem->realloc( + map->refs, map->asize * sizeof(cmark_reference *)); + } + map->refs[map->size] = ref; + map->size++; + map->last = ref; +} +``` + +References are appended to the array in insertion order. The array is NOT kept sorted during insertion — it's sorted once at lookup time (lazily). + +## Reference Lookup + +```c +// Key comparator for bsearch(): the key is a normalized label string, +// each element is a pointer to a cmark_reference. +static int refsearch(const void *label, const void *elem) { + const cmark_reference *ref = *(const cmark_reference *const *)elem; + return strcmp((const char *)label, (const char *)ref->label); +} + +cmark_reference *cmark_reference_lookup(cmark_reference_map *map, + cmark_chunk *label) { + if (label->len < 1 || label->len > MAX_LINK_LABEL_LENGTH) return NULL; + if (map == NULL || map->size == 0) return NULL; + + unsigned char *norm = normalize_reference(map->mem, label); + if (norm == NULL) return NULL; + + // Sort on first lookup + if (!map->sorted) { + qsort(map->refs, map->size, sizeof(cmark_reference *), refcmp); + // Remove duplicates (keep first occurrence) + // ... + map->sorted = true; + } + + // Binary search by label only (refcmp also compares age, so it cannot take a bare label as key) + cmark_reference **found = (cmark_reference **)bsearch( + norm, map->refs, map->size, sizeof(cmark_reference *), refsearch); + + map->mem->free(norm); + return found ? *found : NULL; +} +``` + +### Lazy Sorting + +The reference map is NOT sorted during insertion. On the first call to `cmark_reference_lookup()`, the array is sorted using `qsort()` with a comparison function: + +```c +static int refcmp(const void *a, const void *b) { + const cmark_reference *refa = *(const cmark_reference **)a; + const cmark_reference *refb = *(const cmark_reference **)b; + int cmp = strcmp((char *)refa->label, (char *)refb->label); + if (cmp != 0) return cmp; + // Tie-break by age (earlier wins) + if (refa->age < refb->age) return -1; + if (refa->age > refb->age) return 1; + return 0; +} +``` + +When labels collide (same normalized label), the first definition wins (lowest `age`).
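The append, sort-on-first-lookup, then binary-search flow can be tried in isolation. The sketch below is a simplified stand-in, assuming a fixed-capacity array of structs instead of a growable array of pointers; `toy_ref`, `toy_map`, and the `toy_*` functions are hypothetical. It uses one comparator for sorting (label, then age) and a separate one for searching by label.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
  const char *label; /* assumed to be normalized already */
  const char *url;
  unsigned int age;  /* insertion order */
} toy_ref;

typedef struct {
  toy_ref refs[16];
  unsigned int size;
  unsigned int next_age;
  bool sorted;
} toy_map;

/* Sort order: by label, then by age so the earliest definition comes first. */
static int toy_refcmp(const void *a, const void *b) {
  const toy_ref *ra = a, *rb = b;
  int cmp = strcmp(ra->label, rb->label);
  if (cmp != 0) return cmp;
  return (ra->age > rb->age) - (ra->age < rb->age);
}

/* Search order: compare a bare label key against an entry. */
static int toy_labelcmp(const void *key, const void *elem) {
  return strcmp((const char *)key, ((const toy_ref *)elem)->label);
}

/* Append without sorting. Returns 1 on success, 0 if the toy map is full. */
static int toy_map_add(toy_map *m, const char *label, const char *url) {
  if (m->size >= 16) return 0;
  m->refs[m->size++] = (toy_ref){label, url, m->next_age++};
  m->sorted = false;
  return 1;
}

static const char *toy_map_lookup(toy_map *m, const char *label) {
  if (m->size == 0) return NULL;
  if (!m->sorted) {
    qsort(m->refs, m->size, sizeof m->refs[0], toy_refcmp);
    /* Drop later duplicates: after sorting, the first of each run wins. */
    unsigned int write = 0;
    for (unsigned int read = 0; read < m->size; read++) {
      if (write > 0 &&
          strcmp(m->refs[write - 1].label, m->refs[read].label) == 0) {
        continue;
      }
      m->refs[write++] = m->refs[read];
    }
    m->size = write;
    m->sorted = true;
  }
  const toy_ref *found =
      bsearch(label, m->refs, m->size, sizeof m->refs[0], toy_labelcmp);
  return found ? found->url : NULL;
}
```

Keeping the search comparator separate matters: a sort comparator that also compares `age` would reject a key whose age differs from the stored entry, even when the labels match.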
+ +After sorting, duplicates are removed — entries with the same label as the preceding entry are freed: +```c +unsigned int write = 0; +for (unsigned int read = 0; read < map->size; read++) { + if (write > 0 && + strcmp((char *)map->refs[write-1]->label, + (char *)map->refs[read]->label) == 0) { + // Duplicate — free it + cmark_reference_free(map->mem, map->refs[read]); + } else { + map->refs[write++] = map->refs[read]; + } +} +map->size = write; +``` + +### Binary Search + +After sorting and deduplication, lookups use standard `bsearch()`, giving O(log n) lookup time. + +## URL and Title Cleaning + +When creating references, URLs and titles are cleaned: + +### `cmark_clean_url()` +```c +unsigned char *cmark_clean_url(cmark_mem *mem, cmark_chunk *url); +``` +- Removes surrounding `<` and `>` if present (angle-bracket URLs) +- Unescapes backslash escapes +- Decodes entity references +- Percent-encodes non-URL-safe characters via `houdini_escape_href()` + +### `cmark_clean_title()` +```c +unsigned char *cmark_clean_title(cmark_mem *mem, cmark_chunk *title); +``` +- Strips the first and last character (the delimiter: `"`, `'`, or `(`) +- Unescapes backslash escapes +- Decodes entity references + +## Integration with Parser + +### Extraction during Block Parsing + +Reference definitions are extracted when paragraphs are finalized: + +```c +// In blocks.c, during paragraph finalization: +while (cmark_parse_reference_inline(parser->mem, &node_content, + parser->refmap)) { + // Keep parsing references from the start of the paragraph +} +``` + +### `cmark_parse_reference_inline()` + +```c +int cmark_parse_reference_inline(cmark_mem *mem, cmark_strbuf *input, + cmark_reference_map *refmap) { + // Parse: [label]: destination "title" + // Returns 1 if a reference was found and consumed, 0 otherwise + subject subj; + // ... 
initialize subject on the input buffer + // Parse label + cmark_chunk lab = cmark_chunk_literal(""); + cmark_chunk url = cmark_chunk_literal(""); + cmark_chunk title = cmark_chunk_literal(""); + + if (!link_label(&subj, &lab) || lab.len == 0) return 0; + if (peek_char(&subj) != ':') return 0; + advance(&subj); + spnl(&subj); // skip spaces and up to one newline + if (!manual_scan_link_url(&subj, &url)) return 0; + // ... parse optional title + // ... validate: rest of line must be blank + cmark_reference_create(refmap, &lab, &url, &title); + // Remove consumed bytes from input + return 1; +} +``` + +The parser repeatedly calls this function on paragraph content. Each successful parse removes the reference definition from the paragraph. If the entire paragraph consists of reference definitions, the paragraph node is removed from the AST. + +### Lookup during Inline Parsing + +In `inlines.c`, when a potential reference link is found: + +```c +cmark_reference *ref = cmark_reference_lookup(subj->refmap, &raw_label); +if (ref) { + // Create link node with ref->url and ref->title +} +``` + +## Label Length Limit + +```c +#define MAX_LINK_LABEL_LENGTH 999 +``` + +Reference labels longer than 999 characters are rejected, per the CommonMark spec. + +## Map Lifecycle + +```c +cmark_reference_map *cmark_reference_map_new(cmark_mem *mem); +void cmark_reference_map_free(cmark_reference_map *map); +``` + +The map is created during parser initialization and freed when the parser is freed. The AST's reference links have already been resolved and store their own copies of URL and title — the reference map is not needed after parsing. 
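The length limit and the label normalization steps described earlier (case fold, trim, collapse whitespace) combine naturally into one guard function. The following is an ASCII-only sketch under stated assumptions: it lowercases with `tolower` rather than performing Unicode case folding, it measures the limit in bytes, and `toy_normalize_label` and `TOY_MAX_LABEL_LENGTH` are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_LABEL_LENGTH 999 /* mirrors the limit quoted above */

/* Write a normalized copy of `in` to `out`.
 * Returns 1 on success, 0 if the label is empty, too long, blank,
 * or does not fit in `out`. ASCII-only: real labels need Unicode folding. */
static int toy_normalize_label(const char *in, char *out, size_t out_cap) {
  size_t len = strlen(in);
  size_t n = 0;
  int pending_space = 0;

  if (len < 1 || len > TOY_MAX_LABEL_LENGTH || out_cap < len + 1) return 0;

  for (size_t i = 0; i < len; i++) {
    unsigned char c = (unsigned char)in[i];
    if (isspace(c)) {
      /* Remember one separator, but never emit a leading one. */
      pending_space = (n > 0);
      continue;
    }
    if (pending_space) {
      out[n++] = ' ';
      pending_space = 0;
    }
    out[n++] = (char)tolower(c);
  }
  out[n] = '\0'; /* a trailing pending space is simply dropped */
  return n > 0;
}
```

With this, `"  Foo \t BAR "` and `"FOO BAR"` both produce the same key, which is the property the lookup relies on.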
+ +### Cleanup + +```c +void cmark_reference_map_free(cmark_reference_map *map) { + if (map == NULL) return; + for (unsigned int i = 0; i < map->size; i++) { + cmark_reference_free(map->mem, map->refs[i]); + } + map->mem->free(map->refs); + map->mem->free(map); +} +``` + +Each reference and its strings (label, url, title) are freed, then the array and map struct are freed. + +## Cross-References + +- [references.c](../../cmark/src/references.c) — Implementation +- [references.h](../../cmark/src/references.h) — Data structures +- [block-parsing.md](block-parsing.md) — Reference extraction during paragraph finalization +- [inline-parsing.md](inline-parsing.md) — Reference lookup during link resolution +- [utf8-handling.md](utf8-handling.md) — Case folding used in label normalization diff --git a/docs/handbook/cmark/render-framework.md b/docs/handbook/cmark/render-framework.md new file mode 100644 index 0000000000..065b9c878f --- /dev/null +++ b/docs/handbook/cmark/render-framework.md @@ -0,0 +1,294 @@ +# cmark — Render Framework + +## Overview + +The render framework (`render.c`, `render.h`) provides a generic rendering infrastructure used by three of the five renderers: LaTeX, man, and CommonMark. It handles line wrapping, prefix management, and character-level output dispatch. The HTML and XML renderers bypass this framework and write directly to buffers. 
+ +## The `cmark_renderer` Structure + +```c +struct cmark_renderer { + cmark_mem *mem; + cmark_strbuf *buffer; // Output buffer + cmark_strbuf *prefix; // Current line prefix (e.g., "> " for blockquotes) + int column; // Current column position (for wrapping) + int width; // Target width (0 = no wrapping) + int need_cr; // Pending newlines count + bufsize_t last_breakable; // Position of last breakable point in buffer + bool begin_line; // True if at the start of a line + bool begin_content; // True if no content has been output on current line (after prefix) + bool no_linebreaks; // Suppress newlines (for rendering within attributes) + bool in_tight_list_item; // Currently inside a tight list item + void (*outc)(cmark_renderer *, cmark_escaping, int32_t, unsigned char); + // Per-character output callback + int32_t (*render_node)(cmark_renderer *, cmark_node *, cmark_event_type, int); + // Per-node render callback +}; +``` + +### Key Fields + +- **`column`** — Tracks horizontal position for word-wrap decisions. +- **`width`** — If > 0, enables automatic line wrapping at word boundaries. +- **`prefix`** — Accumulated prefix string. For nested block quotes and list items, prefixes stack (e.g., `"> - "` for a list item inside a block quote). +- **`last_breakable`** — Buffer position of the last whitespace where a line break could be inserted. Used for retroactive line wrapping. +- **`begin_line`** — True immediately after a newline. Used by renderers to decide whether to escape line-start characters. +- **`begin_content`** — True until the first non-prefix content on a line. Distinguished from `begin_line` because the prefix itself isn't "content". +- **`no_linebreaks`** — When true, newlines are converted to spaces. Used when rendering content inside constructs that can't contain literal newlines. 
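The stacking behavior of `prefix` is easiest to see with a save-and-truncate helper pair. This is a minimal sketch using a fixed-size buffer instead of `cmark_strbuf`; `toy_prefix`, `toy_prefix_push`, and `toy_prefix_restore` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  char data[64];
  size_t len;
} toy_prefix;

/* Append s to the prefix and return the previous length, which the
 * caller keeps so it can undo the push later. */
static size_t toy_prefix_push(toy_prefix *p, const char *s) {
  size_t saved = p->len;
  size_t n = strlen(s);
  if (p->len + n < sizeof p->data) {
    memcpy(p->data + p->len, s, n + 1); /* copies the terminator too */
    p->len += n;
  }
  return saved;
}

/* Truncate the prefix back to a previously saved length. */
static void toy_prefix_restore(toy_prefix *p, size_t saved) {
  if (saved <= p->len) {
    p->len = saved;
    p->data[saved] = '\0';
  }
}
```

Entering a block quote and then a list item pushes `"> "` and `"- "` in turn, giving `"> - "`. Restoring the saved lengths in reverse order unwinds the nesting without any explicit stack.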
+ +## Entry Point + +```c +char *cmark_render(cmark_mem *mem, cmark_node *root, int options, int width, + void (*outc)(cmark_renderer *, cmark_escaping, int32_t, unsigned char), + int32_t (*render_node)(cmark_renderer *, cmark_node *, + cmark_event_type, int)) { + cmark_renderer renderer = { + mem, + &buf, // buffer + &pref, // prefix + 0, // column + width, // width + 0, // need_cr + 0, // last_breakable + true, // begin_line + true, // begin_content + false, // no_linebreaks + false, // in_tight_list_item + outc, // outc + render_node // render_node + }; + // ... iterate AST, call render_node for each event + return (char *)cmark_strbuf_detach(&buf); +} +``` + +The framework creates a `cmark_renderer`, iterates over the AST using `cmark_iter`, and calls the provided `render_node` function for each event. The `outc` callback handles per-character output with escaping decisions. + +## Escaping Modes + +```c +typedef enum { + LITERAL, // No escaping — output characters as-is + NORMAL, // Full escaping for prose text + TITLE, // Escaping for link titles + URL, // Escaping for URLs +} cmark_escaping; +``` + +Each renderer's `outc` function switches on this enum to determine how to handle special characters. + +## Output Functions + +### `cmark_render_code_point()` + +```c +void cmark_render_code_point(cmark_renderer *renderer, int32_t c) { + cmark_utf8proc_encode_char(c, renderer->buffer); + renderer->column += 1; +} +``` + +Low-level: encodes a single Unicode codepoint as UTF-8 into the buffer and advances the column counter. + +### `cmark_render_ascii()` + +```c +void cmark_render_ascii(cmark_renderer *renderer, const char *s) { + int len = (int)strlen(s); + cmark_strbuf_puts(renderer->buffer, s); + renderer->column += len; +} +``` + +Outputs an ASCII string and advances the column counter. Used for fixed escape sequences like `\&`, `\textbf{`, etc. 
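To show how an `outc`-style callback and the escaping mode fit together, here is a reduced sketch with two modes and a made-up escape set (`&` and `_` get a backslash in normal mode). It is not any particular renderer's rule set: `toy_renderer`, `toy_escaping`, and the `toy_*` functions are hypothetical, and characters are handled as single bytes rather than code points.

```c
#include <stddef.h>
#include <string.h>

typedef enum { TOY_LITERAL, TOY_NORMAL } toy_escaping;

typedef struct {
  char buf[128];
  size_t size;
  int column;
} toy_renderer;

/* Append a short ASCII string and advance the column by its length. */
static void toy_render_ascii(toy_renderer *r, const char *s) {
  size_t n = strlen(s);
  if (r->size + n >= sizeof r->buf) return; /* toy buffer full: drop */
  memcpy(r->buf + r->size, s, n + 1);
  r->size += n;
  r->column += (int)n;
}

/* Per-character callback: decide whether c needs an escape sequence. */
static void toy_outc(toy_renderer *r, toy_escaping escape, char c) {
  char one[2] = {c, '\0'};
  if (escape == TOY_NORMAL && c == '&') toy_render_ascii(r, "\\&");
  else if (escape == TOY_NORMAL && c == '_') toy_render_ascii(r, "\\_");
  else toy_render_ascii(r, one);
}

/* Dispatcher: walk the input and delegate every character. */
static void toy_out(toy_renderer *r, const char *s, toy_escaping escape) {
  for (; *s != '\0'; s++) toy_outc(r, escape, *s);
}
```

The column advances by the length of what was written, not by the number of input characters, so an escaped `&` counts as two columns. That is the reason the fixed-string helper updates the column itself.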
+ +### `S_out()` — Main Output Dispatcher + +```c +static CMARK_INLINE void S_out(cmark_renderer *renderer, const char *source, + bool wrap, cmark_escaping escape) { + int length = (int)strlen(source); + unsigned char nextc; + int32_t c; + int i = 0; + int len; + cmark_chunk remainder = cmark_chunk_literal(""); + int k = renderer->buffer->size - 1; + + wrap = wrap && !renderer->no_linebreaks; + + if (renderer->need_cr) { + // Output pending newlines + while (renderer->need_cr > 0) { + S_cr(renderer); + renderer->need_cr--; + } + } + + while (i < length) { + if (renderer->begin_line) { + // Output prefix at start of each line + cmark_strbuf_puts(renderer->buffer, (char *)renderer->prefix->ptr); + renderer->column = renderer->prefix->size; + renderer->begin_line = false; + renderer->begin_content = true; + } + + len = cmark_utf8proc_charlen((uint8_t *)source + i, length - i); + if (len == -1) { // Invalid UTF-8 + // ... handle error + } + + cmark_utf8proc_iterate((uint8_t *)source + i, len, &c); + + if (c == 10) { + // Newline + cmark_strbuf_putc(renderer->buffer, '\n'); + renderer->column = 0; + renderer->begin_line = true; + renderer->begin_content = true; + renderer->last_breakable = 0; + } else if (wrap) { + if (c == 32 && renderer->column > renderer->width / 2) { + // Space past half-width — mark as potential break point + renderer->last_breakable = renderer->buffer->size; + cmark_render_code_point(renderer, c); + } else if (renderer->column > renderer->width && + renderer->last_breakable > 0) { + // Past target width with a break point — retroactively break + // Replace the space at last_breakable with newline + prefix + // ... + } else { + renderer->outc(renderer, escape, c, nextc); + } + } else { + renderer->outc(renderer, escape, c, nextc); + } + + if (c != 10) { + renderer->begin_content = false; + } + i += len; + } +} +``` + +This is the core output function. It: +1. Handles deferred newlines (`need_cr`) +2. 
Outputs line prefixes at the start of each line +3. Tracks column position +4. Implements word wrapping via retroactive line breaks +5. Delegates character-level escaping to `renderer->outc()` + +### Line Wrapping Algorithm + +The wrapping algorithm uses a **retroactive break** strategy: + +1. As text flows through `S_out()`, spaces past the half-width mark are recorded as potential break points (`last_breakable`). +2. When the column exceeds `width`, the buffer is split at `last_breakable`: + - Everything after the break point is saved in `remainder` + - A newline and the current prefix are inserted at the break point + - The remainder is reappended + +This avoids forward-looking: the renderer doesn't need to know the length of upcoming content to decide where to break. + +```c +// Retroactive line break: +remainder = cmark_chunk_dup(&renderer->buffer->..., last_breakable, ...); +cmark_strbuf_truncate(renderer->buffer, last_breakable); +cmark_strbuf_putc(renderer->buffer, '\n'); +cmark_strbuf_puts(renderer->buffer, (char *)renderer->prefix->ptr); +cmark_strbuf_put(renderer->buffer, remainder.data, remainder.len); +renderer->column = renderer->prefix->size + cmark_chunk_len(&remainder); +renderer->last_breakable = 0; +renderer->begin_line = false; +renderer->begin_content = false; +``` + +## Convenience Functions + +### `CR()` + +```c +#define CR() renderer->need_cr = 1 +``` + +Requests a newline before the next content output. Multiple `CR()` calls don't stack — only one newline is inserted. + +### `BLANKLINE()` + +```c +#define BLANKLINE() renderer->need_cr = 2 +``` + +Requests a blank line (two newlines) before the next content output. + +### `OUT()` + +```c +#define OUT(s, wrap, escaping) (S_out(renderer, s, wrap, escaping)) +``` + +### `LIT()` + +```c +#define LIT(s) (S_out(renderer, s, false, LITERAL)) +``` + +Output literal text (no escaping, no wrapping). 
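The retroactive break strategy from the Line Wrapping Algorithm section can be sketched in a much reduced form. The version below assumes no line prefix and byte-wide characters, records every space as a break candidate (the half-width condition is omitted), and turns the recorded space into a newline in place instead of copying a remainder. `toy_wrapper`, `toy_wrap_init`, and `toy_wrap_put` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  char buf[256];
  size_t size;
  int column;            /* characters on the current line */
  int width;             /* target width */
  size_t last_breakable; /* index of the last space; 0 means none */
} toy_wrapper;

static void toy_wrap_init(toy_wrapper *w, int width) {
  memset(w, 0, sizeof *w);
  w->width = width;
}

static void toy_wrap_put(toy_wrapper *w, const char *s) {
  for (; *s != '\0' && w->size + 1 < sizeof w->buf; s++) {
    if (*s == '\n') {
      w->buf[w->size++] = '\n';
      w->column = 0;
      w->last_breakable = 0;
      continue;
    }
    if (*s == ' ' && w->size > 0) {
      w->last_breakable = w->size; /* remember where we could break */
    }
    w->buf[w->size++] = *s;
    w->column++;
    if (w->column > w->width && w->last_breakable > 0) {
      /* Too wide: go back and break at the remembered space. */
      w->buf[w->last_breakable] = '\n';
      w->column = (int)(w->size - w->last_breakable - 1);
      w->last_breakable = 0;
    }
  }
  w->buf[w->size] = '\0';
}
```

With a width of 10, the input `aaaa bbbb cccc dddd` is broken only once the third word overflows, at the space before it. The decision is made after the fact, so the writer never needs to look ahead at the length of the next word.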
+ +### `NOBREAKS()` + +```c +#define NOBREAKS(s) \ + do { renderer->no_linebreaks = true; OUT(s, false, NORMAL); renderer->no_linebreaks = false; } while(0) +``` + +Output text with normal escaping but with newlines suppressed (converted to spaces). + +## Prefix Management + +Prefixes are used for block-level indentation. The renderer maintains a `cmark_strbuf` prefix that is output at the start of each line. + +### Usage Pattern + +```c +// In commonmark.c, entering a block quote: +cmark_strbuf_puts(renderer->prefix, "> "); +// ... render children ... +// On exit: +cmark_strbuf_truncate(renderer->prefix, original_prefix_len); +``` + +Renderers save the prefix length before modifying it and restore it on exit. This creates a stack-like behavior for nested containers. + +## Framework vs Direct Rendering + +| Feature | Framework (render.c) | Direct (html.c, xml.c) | +|---------|---------------------|----------------------| +| Line wrapping | Yes (`width` parameter) | No | +| Prefix management | Yes (automatic) | No (uses HTML tags) | +| Per-char escaping | Via `outc` callback | Via `escape_html()` helper | +| Column tracking | Yes | No | +| Break points | Retroactive insertion | N/A | +| `cmark_escaping` enum | Yes | No | + +## Which Renderers Use the Framework + +| Renderer | Uses Framework | Why/Why Not | +|----------|---------------|-------------| +| LaTeX (`latex.c`) | Yes | Needs wrapping for structured text | +| man (`man.c`) | Yes | Needs wrapping for terminal display | +| CommonMark (`commonmark.c`) | Yes | Needs wrapping and prefix management | +| HTML (`html.c`) | No | HTML handles layout via browser | +| XML (`xml.c`) | No | XML output is structural, not visual | + +## Cross-References + +- [render.c](../../cmark/src/render.c) — Framework implementation +- [render.h](../../cmark/src/render.h) — `cmark_renderer` struct and `cmark_escaping` enum +- [latex-renderer.md](latex-renderer.md) — LaTeX `outc` and `S_render_node` +- [man-renderer.md](man-renderer.md) 
— Man `S_outc` and `S_render_node` +- [commonmark-renderer.md](commonmark-renderer.md) — CommonMark `outc` and `S_render_node` +- [html-renderer.md](html-renderer.md) — Direct renderer (no framework) diff --git a/docs/handbook/cmark/scanner-system.md b/docs/handbook/cmark/scanner-system.md new file mode 100644 index 0000000000..79adf03798 --- /dev/null +++ b/docs/handbook/cmark/scanner-system.md @@ -0,0 +1,223 @@ +# cmark — Scanner System + +## Overview + +The scanner system (`scanners.h`, `scanners.re`, `scanners.c`) provides fast pattern-matching functions used throughout cmark's block and inline parsers. The scanners are generated from re2c specifications and compiled into optimized C switch-statement automata. They perform context-free matching only (no backtracking, no captures beyond match length). + +## Architecture + +### Source Files + +- `scanners.re` — re2c source with pattern specifications +- `scanners.c` — Generated C code (committed to the repository, regenerated manually) +- `scanners.h` — Public declarations (macro wrappers and function prototypes) + +### Generation + +Scanners are regenerated from re2c source via: +```bash +re2c --case-insensitive -b -i --no-generation-date --8bit -o scanners.c scanners.re +``` + +Flags: +- `--case-insensitive` — Case-insensitive matching +- `-b` — Use bit vectors for character classes +- `-i` — Omit `#line` directives from the generated code (`--no-debug-info`) +- `--no-generation-date` — Reproducible output +- `--8bit` — 8-bit character width + +The generated code consists of state machines implemented as nested `switch`/`if` blocks with direct character comparisons. There are no regular expression structs, no DFA tables — the patterns are compiled directly into C control flow.
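To make "compiled directly into C control flow" concrete, here is a hand-written approximation of a scanner for ATX heading openers: one to six `#` characters followed by spaces or tabs, or by a newline. It is not the re2c output. The `toy_` name and the `int` parameter types are assumptions for illustration, and the function follows the convention of returning the match length or 0.

```c
/* Hand-written stand-in for a generated scanner.
 * Returns the number of bytes matched starting at offset, or 0. */
static int toy_scan_atx_heading_start(const unsigned char *s, int len,
                                      int offset) {
  int p = offset;
  int hashes = 0;

  if (offset < 0 || offset > len) return 0;

  /* State 1: consume the run of '#' characters. */
  while (p < len && s[p] == '#') {
    p++;
    hashes++;
  }
  if (hashes < 1 || hashes > 6 || p >= len) return 0;

  /* State 2: require a newline, or one or more spaces/tabs. */
  if (s[p] == '\n') return p + 1 - offset;
  if (s[p] == ' ' || s[p] == '\t') {
    while (p < len && (s[p] == ' ' || s[p] == '\t')) p++;
    return p - offset;
  }
  return 0;
}
```

Each branch corresponds to a state transition, so matching costs one pass over the input with no allocation, which is the property the generated scanners are built for.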
+ +## Scanner Interface + +### The `_scan_at` Wrapper + +```c +#define _scan_at(scanner, s, p) scanner(s->input.data, s->input.len, p) +``` + +All scanner functions share the signature: +```c +bufsize_t scan_PATTERN(const unsigned char *s, bufsize_t len, bufsize_t offset); +``` + +Parameters: +- `s` — Input byte string +- `len` — Total length of `s` +- `offset` — Starting position within `s` + +Return value: +- Length of the match (in bytes) if successful +- `0` if no match at the given position + +### Common Pattern + +```c +// In blocks.c: +matched = _scan_at(&scan_thematic_break, &input, first_nonspace); + +// In inlines.c: +matched = _scan_at(&scan_autolink_uri, subj, subj->pos); +``` + +## Scanner Functions + +### Block Structure Scanners + +| Scanner | Purpose | Used In | +|---------|---------|---------| +| `scan_thematic_break` | Matches `***`, `---`, `___` (with optional spaces) | `blocks.c` | +| `scan_atx_heading_start` | Matches `#{1,6}` followed by space or EOL | `blocks.c` | +| `scan_setext_heading_line` | Matches `=+` or `-+` at line start | `blocks.c` | +| `scan_open_code_fence` | Matches `` ``` `` or `~~~` (3+ fence chars) | `blocks.c` | +| `scan_close_code_fence` | Matches closing fence (≥ opening length) | `blocks.c` | +| `scan_html_block_start` | Matches HTML block type 1-6 openers | `blocks.c` | +| `scan_html_block_start_7` | Matches HTML block type 7 openers | `blocks.c` | +| `scan_html_block_end_1` | Matches `</script>`, `</pre>`, `</style>` | `blocks.c` | +| `scan_html_block_end_2` | Matches `-->` | `blocks.c` | +| `scan_html_block_end_3` | Matches `?>` | `blocks.c` | +| `scan_html_block_end_4` | Matches `>` | `blocks.c` | +| `scan_html_block_end_5` | Matches `]]>` | `blocks.c` | +| `scan_link_title` | Matches `"..."`, `'...'`, or `(...)` titles | `inlines.c` | + +### Inline Scanners + +| Scanner | Purpose | Used In | +|---------|---------|---------| +| `scan_autolink_uri` | Matches URI autolinks (angle-bracketed `scheme:path`) | `inlines.c` | +| `scan_autolink_email` | Matches email autolinks (angle-bracketed address) | `inlines.c` | +| `scan_html_tag` | Matches inline HTML tags (open, close, comment, PI, CDATA, declaration) | `inlines.c` | +| `scan_entity` | Matches HTML entities (`&amp;`, `&#123;`, `&#x7B;`) | `inlines.c` | +| `scan_dangerous_url` | Matches `javascript:`, `vbscript:`, `file:`, `data:` URLs | `html.c` | +| `scan_spacechars` | Matches runs of spaces and tabs | `inlines.c` | + +### Link/Reference Scanners + +| Scanner | Purpose | Used In | +|---------|---------|---------| +| `scan_link_url` | Matches link destinations (parenthesized or bare) | `inlines.c` | +| `scan_link_title` | Matches quoted link titles | `inlines.c` | + +## Scanner Patterns (from `scanners.re`) + +### Thematic Break +``` +thematic_break = (('*' [ \t]*){3,} | ('-' [ \t]*){3,} | ('_' [ \t]*){3,}) [ \t]* [\n] +``` +Three or more `*`, `-`, or `_` characters, optionally separated by spaces/tabs. + +### ATX Heading +``` +atx_heading_start = '#{1,6}' ([ \t]+ | [\n]) +``` +1-6 `#` characters followed by space/tab or newline. + +### Code Fence +``` +open_code_fence = '`{3,}' [^`\n]* [\n] | '~{3,}' [^\n]* [\n] +``` +Three or more backticks (with no backtick in the info string) or three or more tildes. + +### HTML Block Start (Types 1-7) + +The CommonMark spec defines 7 types of HTML blocks, each matched by different scanners: + +1. `