diff options
| author | Mehmet Samet Duman <yongdohyun@projecttick.org> | 2026-04-02 19:56:09 +0300 |
|---|---|---|
| committer | Mehmet Samet Duman <yongdohyun@projecttick.org> | 2026-04-02 19:56:09 +0300 |
| commit | 7fb132859fda54aa96bc9dd46d302b343eeb5a02 (patch) | |
| tree | b43ae77d7451fb470a260c03349a1caf2846c5e5 /neozip/arch/s390/README.md | |
| parent | b1e34e861b5d732afe828d58aad2c638135061fd (diff) | |
| parent | c2712b8a345191f6ed79558c089777df94590087 (diff) | |
| download | Project-Tick-7fb132859fda54aa96bc9dd46d302b343eeb5a02.tar.gz Project-Tick-7fb132859fda54aa96bc9dd46d302b343eeb5a02.zip | |
Add 'neozip/' from commit 'c2712b8a345191f6ed79558c089777df94590087'
git-subtree-dir: neozip
git-subtree-mainline: b1e34e861b5d732afe828d58aad2c638135061fd
git-subtree-split: c2712b8a345191f6ed79558c089777df94590087
Diffstat (limited to 'neozip/arch/s390/README.md')
| -rw-r--r-- | neozip/arch/s390/README.md | 265 |
1 files changed, 265 insertions, 0 deletions
diff --git a/neozip/arch/s390/README.md b/neozip/arch/s390/README.md new file mode 100644 index 0000000000..c56ffd7654 --- /dev/null +++ b/neozip/arch/s390/README.md @@ -0,0 +1,265 @@ +# Introduction + +This directory contains SystemZ deflate hardware acceleration support. +It can be enabled using the following build commands: + + $ ./configure --with-dfltcc-deflate --with-dfltcc-inflate + $ make + +or + + $ cmake -DWITH_DFLTCC_DEFLATE=1 -DWITH_DFLTCC_INFLATE=1 . + $ make + +When built like this, zlib-ng would compress using hardware on level 1, +and using software on all other levels. Decompression will always happen +in hardware. In order to enable hardware compression for levels 1-6 +(i.e. to make it used by default) one could add +`-DDFLTCC_LEVEL_MASK=0x7e` to CFLAGS when building zlib-ng. + +SystemZ deflate hardware acceleration is available on [IBM z15]( +https://www.ibm.com/products/z15) and newer machines under the name [ +"Integrated Accelerator for zEnterprise Data Compression"]( +https://www.ibm.com/support/z-content-solutions/compression/). The +programming interface to it is a machine instruction called DEFLATE +CONVERSION CALL (DFLTCC). It is documented in Chapter 26 of [Principles +of Operation](https://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf). Both +the code and the rest of this document refer to this feature simply as +"DFLTCC". + +# Performance + +Performance figures are published [here]( +https://github.com/iii-i/zlib-ng/wiki/Performance-with-dfltcc-patch-applied-and-dfltcc-support-built-on-dfltcc-enabled-machine +). The compression speed-up can be as high as 110x and the decompression +speed-up can be as high as 15x. + +# Limitations + +Two DFLTCC compression calls with identical inputs are not guaranteed to +produce identical outputs. Therefore care should be taken when using +hardware compression when reproducible results are desired. In +particular, zlib-ng-specific `zng_deflateSetParams` call allows setting +`Z_DEFLATE_REPRODUCIBLE` parameter, which disables DFLTCC support for a +particular stream. + +DFLTCC does not support every single zlib-ng feature, in particular: + +* `inflate(Z_BLOCK)` and `inflate(Z_TREES)` +* `inflateMark()` +* `inflatePrime()` +* `inflateSyncPoint()` + +When used, these functions will either switch to software, or, in case +this is not possible, gracefully fail. + +# Code structure + +All SystemZ-specific code lives in `arch/s390` directory and is +integrated with the rest of zlib-ng using hook macros. + +## Hook macros + +DFLTCC takes as arguments a parameter block, an input buffer, an output +buffer, and a window. Parameter blocks are stored alongside zlib states; +buffers are forwarded from the caller; and window - which must be +4k-aligned and is always 64k large, is managed using the `PAD_WINDOW()`, +`WINDOW_PAD_SIZE`, `HINT_ALIGNED_WINDOW` and `DEFLATE_ADJUST_WINDOW_SIZE()` +and `INFLATE_ADJUST_WINDOW_SIZE()` hooks. + +Software and hardware window formats do not match, therefore, +`deflateSetDictionary()`, `deflateGetDictionary()`, `inflateSetDictionary()` +and `inflateGetDictionary()` need special handling, which is triggered using +`DEFLATE_SET_DICTIONARY_HOOK()`, `DEFLATE_GET_DICTIONARY_HOOK()`, +`INFLATE_SET_DICTIONARY_HOOK()` and `INFLATE_GET_DICTIONARY_HOOK()` macros. + +`deflateResetKeep()` and `inflateResetKeep()` update the DFLTCC +parameter block using `DEFLATE_RESET_KEEP_HOOK()` and +`INFLATE_RESET_KEEP_HOOK()` macros. + +`INFLATE_PRIME_HOOK()`, `INFLATE_MARK_HOOK()` and +`INFLATE_SYNC_POINT_HOOK()` macros make the respective unsupported +calls gracefully fail. + +`DEFLATE_PARAMS_HOOK()` implements switching between hardware and +software compression mid-stream using `deflateParams()`. Switching +normally entails flushing the current block, which might not be possible +in low memory situations. `deflateParams()` uses `DEFLATE_DONE()` hook +in order to detect and gracefully handle such situations. + +The algorithm implemented in hardware has different compression ratio +than the one implemented in software. `DEFLATE_BOUND_ADJUST_COMPLEN()` +and `DEFLATE_NEED_CONSERVATIVE_BOUND()` macros make `deflateBound()` +return the correct results for the hardware implementation. + +Actual compression and decompression are handled by `DEFLATE_HOOK()` and +`INFLATE_TYPEDO_HOOK()` macros. Since inflation with DFLTCC manages the +window on its own, calling `updatewindow()` is suppressed using +`INFLATE_NEED_UPDATEWINDOW()` macro. + +In addition to compression, DFLTCC computes CRC-32 and Adler-32 +checksums, therefore, whenever it's used, software checksumming is +suppressed using `DEFLATE_NEED_CHECKSUM()` and `INFLATE_NEED_CHECKSUM()` +macros. + +While software always produces reproducible compression results, this +is not the case for DFLTCC. Therefore, zlib-ng users are given the +ability to specify whether or not reproducible compression results +are required. While it is always possible to specify this setting +before the compression begins, it is not always possible to do so in +the middle of a deflate stream - the exact conditions for that are +determined by `DEFLATE_CAN_SET_REPRODUCIBLE()` macro. + +## SystemZ-specific code + +When zlib-ng is built with DFLTCC, the hooks described above are +converted to calls to functions, which are implemented in +`arch/s390/dfltcc_*` files. The functions can be grouped in three broad +categories: + +* Base DFLTCC support, e.g. wrapping the machine instruction - `dfltcc()`. +* Translating between software and hardware data formats, e.g. + `dfltcc_deflate_set_dictionary()`. +* Translating between software and hardware state machines, e.g. + `dfltcc_deflate()` and `dfltcc_inflate()`. + +The functions from the first two categories are fairly simple, however, +various quirks in both software and hardware state machines make the +functions from the third category quite complicated. + +### `dfltcc_deflate()` function + +This function is called by `deflate()` and has the following +responsibilities: + +* Checking whether DFLTCC can be used with the current stream. If this + is not the case, then it returns `0`, making `deflate()` use some + other function in order to compress in software. Otherwise it returns + `1`. +* Block management and Huffman table generation. DFLTCC ends blocks only + when explicitly instructed to do so by the software. Furthermore, + whether to use fixed or dynamic Huffman tables must also be determined + by the software. Since looking at data in order to gather statistics + would negate performance benefits, the following approach is used: the + first `DFLTCC_FIRST_FHT_BLOCK_SIZE` bytes are placed into a fixed + block, and every next `DFLTCC_BLOCK_SIZE` bytes are placed into + dynamic blocks. +* Writing EOBS. Block Closing Control bit in the parameter block + instructs DFLTCC to write EOBS, however, certain conditions need to be + met: input data length must be non-zero or Continuation Flag must be + set. To put this in simpler terms, DFLTCC will silently refuse to + write EOBS if this is the only thing that it is asked to do. Since the + code has to be able to emit EOBS in software anyway, in order to avoid + tricky corner cases Block Closing Control is never used. Whether to + write EOBS is instead controlled by `soft_bcc` variable. +* Triggering block post-processing. Depending on flush mode, `deflate()` + must perform various additional actions when a block or a stream ends. + `dfltcc_deflate()` informs `deflate()` about this using + `block_state *result` parameter. +* Converting software state fields into hardware parameter block fields, + and vice versa. For example, `wrap` and Check Value Type or `bi_valid` + and Sub-Byte Boundary. Certain fields cannot be translated and must + persist untouched in the parameter block between calls, for example, + Continuation Flag or Continuation State Buffer. +* Handling flush modes and low-memory situations. These aspects are + quite intertwined and pervasive. The general idea here is that the + code must not do anything in software - whether explicitly by e.g. + calling `send_eobs()`, or implicitly - by returning to `deflate()` + with certain return and `*result` values, when Continuation Flag is + set. +* Ending streams. When a new block is started and flush mode is + `Z_FINISH`, Block Header Final parameter block bit is used to mark + this block as final. However, sometimes an empty final block is + needed, and, unfortunately, just like with EOBS, DFLTCC will silently + refuse to do this. The general idea of DFLTCC implementation is to + rely as much as possible on the existing code. Here in order to do + this, the code pretends that it does not support DFLTCC, which makes + `deflate()` call a software compression function, which writes an + empty final block. Whether this is required is controlled by + `need_empty_block` variable. +* Error handling. This is simply converting + Operation-Ending-Supplemental Code to string. Errors can only happen + due to things like memory corruption, and therefore they don't affect + the `deflate()` return code. + +### `dfltcc_inflate()` function + +This function is called by `inflate()` from the `TYPEDO` state (that is, +when all the metadata is parsed and the stream is positioned at the type +bits of deflate block header) and it's responsible for the following: + +* Falling back to software when flush mode is `Z_BLOCK` or `Z_TREES`. + Unfortunately, there is no way to ask DFLTCC to stop decompressing on + block or tree boundary. +* `inflate()` decompression loop management. This is controlled using + the return value, which can be either `DFLTCC_INFLATE_BREAK` or + `DFLTCC_INFLATE_CONTINUE`. +* Converting software state fields into hardware parameter block fields, + and vice versa. For example, `whave` and History Length or `wnext` and + History Offset. +* Ending streams. This instructs `inflate()` to return `Z_STREAM_END` + and is controlled by `last` state field. +* Error handling. Like deflate, error handling comprises + Operation-Ending-Supplemental Code to string conversion. Unlike + deflate, errors may happen due to bad inputs, therefore they are + propagated to `inflate()` by setting `mode` field to `MEM` or `BAD`. + +# Testing + +Given complexity of DFLTCC machine instruction, it is not clear whether +QEMU TCG will ever support it. At the time of writing, one has to have +access to an IBM z15+ VM or LPAR in order to test DFLTCC support. Since +DFLTCC is a non-privileged instruction, neither special VM/LPAR +configuration nor root are required. + +zlib-ng CI uses an IBM-provided z15 self-hosted builder for the DFLTCC +testing. There is no official IBM Z GitHub Actions runner, so we build +one inspired by `anup-kodlekere/gaplib`. +Future updates to actions-runner might need an updated patch. The .net +version number patch has been separated into a separate file to avoid a +need for constantly changing the patch. + +## Configuring the builder. + +### Install prerequisites. +``` +sudo dnf install podman +``` + +### Create a config file, needs github personal access token. +Access token needs permissions; Repo Admin RW, Org Self-hosted runners RW. +For details, consult +https://docs.github.com/en/rest/actions/self-hosted-runners?apiVersion=2022-11-28#create-a-registration-token-for-a-repository + +#### Create file /etc/actions-runner: +``` +REPO=<owner>/<name> +PAT_TOKEN=<github_pat_***> +``` + +#### Set permissions on /etc/actions-runner: +``` +chmod 600 /etc/actions-runner +``` + +### Add actions-runner service. +``` +sudo cp self-hosted-builder/actions-runner.service /etc/systemd/system/ +sudo systemctl daemon-reload +``` + +### Autostart actions-runner. +``` +$ sudo systemctl enable --now actions-runner +``` + +### Add auto-rebuild cronjob +``` +sudo cp self-hosted-builder/actions-runner-rebuild.sh /etc/cron.weekly/ +chmod +x /etc/cron.weekly/actions-runner-rebuild.sh +``` + +## Building / Rebuilding the container +``` +sudo /etc/cron.weekly/actions-runner-rebuild.sh +``` |
