summaryrefslogtreecommitdiff
path: root/neozip/arch/s390/README.md
diff options
context:
space:
mode:
authorMehmet Samet Duman <yongdohyun@projecttick.org>2026-04-02 19:56:09 +0300
committerMehmet Samet Duman <yongdohyun@projecttick.org>2026-04-02 19:56:09 +0300
commit7fb132859fda54aa96bc9dd46d302b343eeb5a02 (patch)
treeb43ae77d7451fb470a260c03349a1caf2846c5e5 /neozip/arch/s390/README.md
parentb1e34e861b5d732afe828d58aad2c638135061fd (diff)
parentc2712b8a345191f6ed79558c089777df94590087 (diff)
downloadProject-Tick-7fb132859fda54aa96bc9dd46d302b343eeb5a02.tar.gz
Project-Tick-7fb132859fda54aa96bc9dd46d302b343eeb5a02.zip
Add 'neozip/' from commit 'c2712b8a345191f6ed79558c089777df94590087'
git-subtree-dir: neozip git-subtree-mainline: b1e34e861b5d732afe828d58aad2c638135061fd git-subtree-split: c2712b8a345191f6ed79558c089777df94590087
Diffstat (limited to 'neozip/arch/s390/README.md')
-rw-r--r--neozip/arch/s390/README.md265
1 files changed, 265 insertions, 0 deletions
diff --git a/neozip/arch/s390/README.md b/neozip/arch/s390/README.md
new file mode 100644
index 0000000000..c56ffd7654
--- /dev/null
+++ b/neozip/arch/s390/README.md
@@ -0,0 +1,265 @@
+# Introduction
+
+This directory contains SystemZ deflate hardware acceleration support.
+It can be enabled using the following build commands:
+
+ $ ./configure --with-dfltcc-deflate --with-dfltcc-inflate
+ $ make
+
+or
+
+ $ cmake -DWITH_DFLTCC_DEFLATE=1 -DWITH_DFLTCC_INFLATE=1 .
+ $ make
+
+When built like this, zlib-ng would compress using hardware on level 1,
+and using software on all other levels. Decompression will always happen
+in hardware. In order to enable hardware compression for levels 1-6
+(i.e. to make it used by default) one could add
+`-DDFLTCC_LEVEL_MASK=0x7e` to CFLAGS when building zlib-ng.
+
+SystemZ deflate hardware acceleration is available on [IBM z15](
+https://www.ibm.com/products/z15) and newer machines under the name [
+"Integrated Accelerator for zEnterprise Data Compression"](
+https://www.ibm.com/support/z-content-solutions/compression/). The
+programming interface to it is a machine instruction called DEFLATE
+CONVERSION CALL (DFLTCC). It is documented in Chapter 26 of [Principles
+of Operation](https://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf). Both
+the code and the rest of this document refer to this feature simply as
+"DFLTCC".
+
+# Performance
+
+Performance figures are published [here](
+https://github.com/iii-i/zlib-ng/wiki/Performance-with-dfltcc-patch-applied-and-dfltcc-support-built-on-dfltcc-enabled-machine
+). The compression speed-up can be as high as 110x and the decompression
+speed-up can be as high as 15x.
+
+# Limitations
+
+Two DFLTCC compression calls with identical inputs are not guaranteed to
+produce identical outputs. Therefore care should be taken when using
+hardware compression when reproducible results are desired. In
+particular, zlib-ng-specific `zng_deflateSetParams` call allows setting
+`Z_DEFLATE_REPRODUCIBLE` parameter, which disables DFLTCC support for a
+particular stream.
+
+DFLTCC does not support every single zlib-ng feature, in particular:
+
+* `inflate(Z_BLOCK)` and `inflate(Z_TREES)`
+* `inflateMark()`
+* `inflatePrime()`
+* `inflateSyncPoint()`
+
+When used, these functions will either switch to software, or, in case
+this is not possible, gracefully fail.
+
+# Code structure
+
+All SystemZ-specific code lives in `arch/s390` directory and is
+integrated with the rest of zlib-ng using hook macros.
+
+## Hook macros
+
+DFLTCC takes as arguments a parameter block, an input buffer, an output
+buffer, and a window. Parameter blocks are stored alongside zlib states;
+buffers are forwarded from the caller; and window - which must be
+4k-aligned and is always 64k large, is managed using the `PAD_WINDOW()`,
+`WINDOW_PAD_SIZE`, `HINT_ALIGNED_WINDOW` and `DEFLATE_ADJUST_WINDOW_SIZE()`
+and `INFLATE_ADJUST_WINDOW_SIZE()` hooks.
+
+Software and hardware window formats do not match, therefore,
+`deflateSetDictionary()`, `deflateGetDictionary()`, `inflateSetDictionary()`
+and `inflateGetDictionary()` need special handling, which is triggered using
+`DEFLATE_SET_DICTIONARY_HOOK()`, `DEFLATE_GET_DICTIONARY_HOOK()`,
+`INFLATE_SET_DICTIONARY_HOOK()` and `INFLATE_GET_DICTIONARY_HOOK()` macros.
+
+`deflateResetKeep()` and `inflateResetKeep()` update the DFLTCC
+parameter block using `DEFLATE_RESET_KEEP_HOOK()` and
+`INFLATE_RESET_KEEP_HOOK()` macros.
+
+`INFLATE_PRIME_HOOK()`, `INFLATE_MARK_HOOK()` and
+`INFLATE_SYNC_POINT_HOOK()` macros make the respective unsupported
+calls gracefully fail.
+
+`DEFLATE_PARAMS_HOOK()` implements switching between hardware and
+software compression mid-stream using `deflateParams()`. Switching
+normally entails flushing the current block, which might not be possible
+in low memory situations. `deflateParams()` uses `DEFLATE_DONE()` hook
+in order to detect and gracefully handle such situations.
+
+The algorithm implemented in hardware has different compression ratio
+than the one implemented in software. `DEFLATE_BOUND_ADJUST_COMPLEN()`
+and `DEFLATE_NEED_CONSERVATIVE_BOUND()` macros make `deflateBound()`
+return the correct results for the hardware implementation.
+
+Actual compression and decompression are handled by `DEFLATE_HOOK()` and
+`INFLATE_TYPEDO_HOOK()` macros. Since inflation with DFLTCC manages the
+window on its own, calling `updatewindow()` is suppressed using
+`INFLATE_NEED_UPDATEWINDOW()` macro.
+
+In addition to compression, DFLTCC computes CRC-32 and Adler-32
+checksums, therefore, whenever it's used, software checksumming is
+suppressed using `DEFLATE_NEED_CHECKSUM()` and `INFLATE_NEED_CHECKSUM()`
+macros.
+
+While software always produces reproducible compression results, this
+is not the case for DFLTCC. Therefore, zlib-ng users are given the
+ability to specify whether or not reproducible compression results
+are required. While it is always possible to specify this setting
+before the compression begins, it is not always possible to do so in
+the middle of a deflate stream - the exact conditions for that are
+determined by `DEFLATE_CAN_SET_REPRODUCIBLE()` macro.
+
+## SystemZ-specific code
+
+When zlib-ng is built with DFLTCC, the hooks described above are
+converted to calls to functions, which are implemented in
+`arch/s390/dfltcc_*` files. The functions can be grouped in three broad
+categories:
+
+* Base DFLTCC support, e.g. wrapping the machine instruction - `dfltcc()`.
+* Translating between software and hardware data formats, e.g.
+ `dfltcc_deflate_set_dictionary()`.
+* Translating between software and hardware state machines, e.g.
+ `dfltcc_deflate()` and `dfltcc_inflate()`.
+
+The functions from the first two categories are fairly simple, however,
+various quirks in both software and hardware state machines make the
+functions from the third category quite complicated.
+
+### `dfltcc_deflate()` function
+
+This function is called by `deflate()` and has the following
+responsibilities:
+
+* Checking whether DFLTCC can be used with the current stream. If this
+ is not the case, then it returns `0`, making `deflate()` use some
+ other function in order to compress in software. Otherwise it returns
+ `1`.
+* Block management and Huffman table generation. DFLTCC ends blocks only
+ when explicitly instructed to do so by the software. Furthermore,
+ whether to use fixed or dynamic Huffman tables must also be determined
+ by the software. Since looking at data in order to gather statistics
+ would negate performance benefits, the following approach is used: the
+ first `DFLTCC_FIRST_FHT_BLOCK_SIZE` bytes are placed into a fixed
+ block, and every next `DFLTCC_BLOCK_SIZE` bytes are placed into
+ dynamic blocks.
+* Writing EOBS. Block Closing Control bit in the parameter block
+ instructs DFLTCC to write EOBS, however, certain conditions need to be
+ met: input data length must be non-zero or Continuation Flag must be
+ set. To put this in simpler terms, DFLTCC will silently refuse to
+ write EOBS if this is the only thing that it is asked to do. Since the
+ code has to be able to emit EOBS in software anyway, in order to avoid
+ tricky corner cases Block Closing Control is never used. Whether to
+ write EOBS is instead controlled by `soft_bcc` variable.
+* Triggering block post-processing. Depending on flush mode, `deflate()`
+ must perform various additional actions when a block or a stream ends.
+ `dfltcc_deflate()` informs `deflate()` about this using
+ `block_state *result` parameter.
+* Converting software state fields into hardware parameter block fields,
+ and vice versa. For example, `wrap` and Check Value Type or `bi_valid`
+ and Sub-Byte Boundary. Certain fields cannot be translated and must
+ persist untouched in the parameter block between calls, for example,
+ Continuation Flag or Continuation State Buffer.
+* Handling flush modes and low-memory situations. These aspects are
+ quite intertwined and pervasive. The general idea here is that the
+ code must not do anything in software - whether explicitly by e.g.
+ calling `send_eobs()`, or implicitly - by returning to `deflate()`
+ with certain return and `*result` values, when Continuation Flag is
+ set.
+* Ending streams. When a new block is started and flush mode is
+ `Z_FINISH`, Block Header Final parameter block bit is used to mark
+ this block as final. However, sometimes an empty final block is
+ needed, and, unfortunately, just like with EOBS, DFLTCC will silently
+ refuse to do this. The general idea of DFLTCC implementation is to
+ rely as much as possible on the existing code. Here in order to do
+ this, the code pretends that it does not support DFLTCC, which makes
+ `deflate()` call a software compression function, which writes an
+ empty final block. Whether this is required is controlled by
+ `need_empty_block` variable.
+* Error handling. This is simply converting
+ Operation-Ending-Supplemental Code to string. Errors can only happen
+ due to things like memory corruption, and therefore they don't affect
+ the `deflate()` return code.
+
+### `dfltcc_inflate()` function
+
+This function is called by `inflate()` from the `TYPEDO` state (that is,
+when all the metadata is parsed and the stream is positioned at the type
+bits of deflate block header) and it's responsible for the following:
+
+* Falling back to software when flush mode is `Z_BLOCK` or `Z_TREES`.
+ Unfortunately, there is no way to ask DFLTCC to stop decompressing on
+ block or tree boundary.
+* `inflate()` decompression loop management. This is controlled using
+ the return value, which can be either `DFLTCC_INFLATE_BREAK` or
+ `DFLTCC_INFLATE_CONTINUE`.
+* Converting software state fields into hardware parameter block fields,
+ and vice versa. For example, `whave` and History Length or `wnext` and
+ History Offset.
+* Ending streams. This instructs `inflate()` to return `Z_STREAM_END`
+ and is controlled by `last` state field.
+* Error handling. Like deflate, error handling comprises
+ Operation-Ending-Supplemental Code to string conversion. Unlike
+ deflate, errors may happen due to bad inputs, therefore they are
+ propagated to `inflate()` by setting `mode` field to `MEM` or `BAD`.
+
+# Testing
+
+Given complexity of DFLTCC machine instruction, it is not clear whether
+QEMU TCG will ever support it. At the time of writing, one has to have
+access to an IBM z15+ VM or LPAR in order to test DFLTCC support. Since
+DFLTCC is a non-privileged instruction, neither special VM/LPAR
+configuration nor root are required.
+
+zlib-ng CI uses an IBM-provided z15 self-hosted builder for the DFLTCC
+testing. There is no official IBM Z GitHub Actions runner, so we build
+one inspired by `anup-kodlekere/gaplib`.
+Future updates to actions-runner might need an updated patch. The .net
+version number patch has been separated into a separate file to avoid a
+need for constantly changing the patch.
+
+## Configuring the builder.
+
+### Install prerequisites.
+```
+sudo dnf install podman
+```
+
+### Create a config file, needs github personal access token.
+Access token needs permissions; Repo Admin RW, Org Self-hosted runners RW.
+For details, consult
+https://docs.github.com/en/rest/actions/self-hosted-runners?apiVersion=2022-11-28#create-a-registration-token-for-a-repository
+
+#### Create file /etc/actions-runner:
+```
+REPO=<owner>/<name>
+PAT_TOKEN=<github_pat_***>
+```
+
+#### Set permissions on /etc/actions-runner:
+```
+chmod 600 /etc/actions-runner
+```
+
+### Add actions-runner service.
+```
+sudo cp self-hosted-builder/actions-runner.service /etc/systemd/system/
+sudo systemctl daemon-reload
+```
+
+### Autostart actions-runner.
+```
+$ sudo systemctl enable --now actions-runner
+```
+
+### Add auto-rebuild cronjob
+```
+sudo cp self-hosted-builder/actions-runner-rebuild.sh /etc/cron.weekly/
+chmod +x /etc/cron.weekly/actions-runner-rebuild.sh
+```
+
+## Building / Rebuilding the container
+```
+sudo /etc/cron.weekly/actions-runner-rebuild.sh
+```