Diffstat (limited to 'docs/handbook/libnbtplusplus')
| -rw-r--r-- | docs/handbook/libnbtplusplus/architecture.md | 607 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/building.md | 401 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/code-style.md | 299 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/compound-tags.md | 602 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/endian-handling.md | 359 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/io-system.md | 672 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/list-tags.md | 682 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/overview.md | 422 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/tag-system.md | 643 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/testing.md | 291 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/visitor-pattern.md | 333 |
| -rw-r--r-- | docs/handbook/libnbtplusplus/zlib-integration.md | 514 |
12 files changed, 5825 insertions, 0 deletions
diff --git a/docs/handbook/libnbtplusplus/architecture.md b/docs/handbook/libnbtplusplus/architecture.md new file mode 100644 index 0000000000..cd8da9722b --- /dev/null +++ b/docs/handbook/libnbtplusplus/architecture.md @@ -0,0 +1,607 @@ +# libnbt++ Architecture + +## High-Level Design + +libnbt++ follows a classic object-oriented design with a polymorphic tag hierarchy, augmented by C++ template metaprogramming for code reuse. The architecture has five major layers: + +1. **Tag Hierarchy** — Polymorphic class tree rooted at `tag`, with concrete types for each NBT tag +2. **Value Layer** — Type-erased wrappers (`value`, `value_initializer`) for runtime tag manipulation +3. **I/O Layer** — Stream-based readers/writers handling binary serialization and endianness +4. **Compression Layer** — zlib stream adapters for transparent gzip/deflate support +5. **Text Layer** — Formatters for human-readable tag output (JSON-like) + +--- + +## Class Hierarchy + +``` +tag (abstract base, tag.h) +└── detail::crtp_tag<Sub> (CRTP intermediate, crtp_tag.h) + ├── tag_primitive<int8_t> → typedef tag_byte + ├── tag_primitive<int16_t> → typedef tag_short + ├── tag_primitive<int32_t> → typedef tag_int + ├── tag_primitive<int64_t> → typedef tag_long + ├── tag_primitive<float> → typedef tag_float + ├── tag_primitive<double> → typedef tag_double + ├── tag_string + ├── tag_array<int8_t> → typedef tag_byte_array + ├── tag_array<int32_t> → typedef tag_int_array + ├── tag_array<int64_t> → typedef tag_long_array + ├── tag_list + └── tag_compound +``` + +All concrete tag classes are declared `final` — they cannot be further subclassed. The hierarchy uses exactly two levels of inheritance: `tag` → `crtp_tag<Sub>` → concrete class. + +--- + +## The CRTP Pattern + +The Curiously Recurring Template Pattern (CRTP) is central to libnbt++'s design. 
The intermediate class `detail::crtp_tag<Sub>` (defined in `include/crtp_tag.h`) automatically implements all the `tag` virtual methods that can be expressed generically: + +```cpp +namespace nbt { +namespace detail { + +template <class Sub> class crtp_tag : public tag +{ +public: + virtual ~crtp_tag() noexcept = 0; // Pure virtual to keep it abstract + + tag_type get_type() const noexcept override final { + return Sub::type; // Each Sub has a static constexpr tag_type type + }; + + std::unique_ptr<tag> clone() const& override final { + return make_unique<Sub>(sub_this()); // Copy-constructs Sub + } + + std::unique_ptr<tag> move_clone() && override final { + return make_unique<Sub>(std::move(sub_this())); // Move-constructs Sub + } + + tag& assign(tag&& rhs) override final { + return sub_this() = dynamic_cast<Sub&&>(rhs); + // Throws std::bad_cast if rhs is not the same Sub type + } + + void accept(nbt_visitor& visitor) override final { + visitor.visit(sub_this()); // Double dispatch + } + + void accept(const_nbt_visitor& visitor) const override final { + visitor.visit(sub_this()); + } + +private: + bool equals(const tag& rhs) const override final { + return sub_this() == static_cast<const Sub&>(rhs); + } + + Sub& sub_this() { return static_cast<Sub&>(*this); } + const Sub& sub_this() const { return static_cast<const Sub&>(*this); } +}; + +template <class Sub> crtp_tag<Sub>::~crtp_tag() noexcept {} + +} // namespace detail +} // namespace nbt +``` + +### What the CRTP Provides + +Each concrete tag class inherits from `crtp_tag<Self>` and automatically gets: + +| Method | Behavior | +|------------------|---------------------------------------------------------| +| `get_type()` | Returns `Sub::type` (the static `tag_type` constant) | +| `clone()` | Copy-constructs a new `Sub` via `make_unique<Sub>` | +| `move_clone()` | Move-constructs a new `Sub` | +| `assign(tag&&)` | Dynamic casts to `Sub&&` and uses `Sub::operator=` | +| `accept()` | Calls 
`visitor.visit(sub_this())` — double dispatch | +| `equals()` | Uses `Sub::operator==` | + +The concrete class only needs to provide: + +1. A `static constexpr tag_type type` member +2. Copy and move constructors/assignment operators +3. `operator==` and `operator!=` +4. `read_payload(io::stream_reader&)` and `write_payload(io::stream_writer&) const` + +--- + +## The tag Base Class + +The `tag` base class (defined in `include/tag.h`) establishes the interface for all NBT tags: + +```cpp +class NBT_EXPORT tag +{ +public: + virtual ~tag() noexcept {} + + virtual tag_type get_type() const noexcept = 0; + + virtual std::unique_ptr<tag> clone() const& = 0; + virtual std::unique_ptr<tag> move_clone() && = 0; + std::unique_ptr<tag> clone() &&; // Delegates to move_clone + + template <class T> T& as(); + template <class T> const T& as() const; + + virtual tag& assign(tag&& rhs) = 0; + + virtual void accept(nbt_visitor& visitor) = 0; + virtual void accept(const_nbt_visitor& visitor) const = 0; + + virtual void read_payload(io::stream_reader& reader) = 0; + virtual void write_payload(io::stream_writer& writer) const = 0; + + static std::unique_ptr<tag> create(tag_type type); + static std::unique_ptr<tag> create(tag_type type, int8_t val); + static std::unique_ptr<tag> create(tag_type type, int16_t val); + static std::unique_ptr<tag> create(tag_type type, int32_t val); + static std::unique_ptr<tag> create(tag_type type, int64_t val); + static std::unique_ptr<tag> create(tag_type type, float val); + static std::unique_ptr<tag> create(tag_type type, double val); + + friend NBT_EXPORT bool operator==(const tag& lhs, const tag& rhs); + friend NBT_EXPORT bool operator!=(const tag& lhs, const tag& rhs); + +private: + virtual bool equals(const tag& rhs) const = 0; +}; +``` + +### Key Design Choices + +1. **`clone()` is ref-qualified**: `const&` for copy-cloning, `&&` for move-cloning. The rvalue `clone()` delegates to `move_clone()`. + +2. 
**`as<T>()` uses `dynamic_cast`**: Provides safe downcasting with `std::bad_cast` on failure. + +3. **`operator==` uses RTTI**: The free `operator==` first checks `typeid(lhs) == typeid(rhs)`, then delegates to the virtual `equals()` method. + +4. **Factory methods**: `tag::create()` constructs tags by `tag_type` at runtime, supporting both default construction and numeric initialization. + +--- + +## Ownership Model + +libnbt++ uses a strict ownership model based on `std::unique_ptr<tag>`: + +### Where Ownership Lives + +- **`value`** — Owns a single tag via `std::unique_ptr<tag> tag_` +- **`tag_compound`** — Owns values in a `std::map<std::string, value>` +- **`tag_list`** — Owns values in a `std::vector<value>` + +### Ownership Rules + +1. **Single owner**: Every tag has exactly one owner. No shared ownership. +2. **Deep copying**: `clone()` performs a full deep copy of the entire tag tree. +3. **Move semantics**: Tags can be efficiently moved between owners without copying. +4. **No raw pointers for ownership**: The library never uses raw `new`/`delete` for tag management. + +### The value Class + +The `value` class (`include/value.h`) is the primary type-erasure mechanism. 
It wraps `std::unique_ptr<tag>` and provides: + +```cpp +class NBT_EXPORT value +{ +public: + value() noexcept {} // Empty/null value + explicit value(std::unique_ptr<tag>&& t) noexcept; // Takes ownership + explicit value(tag&& t); // Clones the tag + + // Move only (no implicit copy) + value(value&&) noexcept = default; + value& operator=(value&&) noexcept = default; + + // Explicit copy + explicit value(const value& rhs); + value& operator=(const value& rhs); + + // Type conversion + operator tag&(); + operator const tag&() const; + tag& get(); + const tag& get() const; + template <class T> T& as(); + template <class T> const T& as() const; + + // Numeric assignments (existing tag gets updated, or new one created) + value& operator=(int8_t val); + value& operator=(int16_t val); + value& operator=(int32_t val); + value& operator=(int64_t val); + value& operator=(float val); + value& operator=(double val); + + // String assignment + value& operator=(const std::string& str); + value& operator=(std::string&& str); + + // Numeric conversions (widening only) + explicit operator int8_t() const; + explicit operator int16_t() const; + explicit operator int32_t() const; + explicit operator int64_t() const; + explicit operator float() const; + explicit operator double() const; + explicit operator const std::string&() const; + + // Compound access delegation + value& at(const std::string& key); + value& operator[](const std::string& key); + value& operator[](const char* key); + + // List access delegation + value& at(size_t i); + value& operator[](size_t i); + + // Null check + explicit operator bool() const { return tag_ != nullptr; } + + // Direct pointer access + std::unique_ptr<tag>& get_ptr(); + void set_ptr(std::unique_ptr<tag>&& t); + tag_type get_type() const; + +private: + std::unique_ptr<tag> tag_; +}; +``` + +### The value_initializer Class + +`value_initializer` (`include/value_initializer.h`) extends `value` with **implicit** constructors. 
It is used as a parameter type in functions like `tag_compound::put()` and `tag_list::push_back()`: + +```cpp +class NBT_EXPORT value_initializer : public value +{ +public: + value_initializer(std::unique_ptr<tag>&& t) noexcept; + value_initializer(std::nullptr_t) noexcept; + value_initializer(value&& val) noexcept; + value_initializer(tag&& t); + + value_initializer(int8_t val); // Creates tag_byte + value_initializer(int16_t val); // Creates tag_short + value_initializer(int32_t val); // Creates tag_int + value_initializer(int64_t val); // Creates tag_long + value_initializer(float val); // Creates tag_float + value_initializer(double val); // Creates tag_double + value_initializer(const std::string& str); // Creates tag_string + value_initializer(std::string&& str); // Creates tag_string + value_initializer(const char* str); // Creates tag_string +}; +``` + +This is why you can write `compound.put("key", 42)` — the `42` (an `int`) implicitly converts to `value_initializer(int32_t(42))`, which constructs a `tag_int(42)` inside a `value`. + +### Why value vs value_initializer? + +The separation exists because implicit conversions on `value` itself would cause ambiguity problems. For example, if `value` had an implicit constructor from `tag&&`, then expressions involving compound assignment could be ambiguous. By limiting implicit conversions to `value_initializer` (used only as function parameters), the library avoids these issues. + +--- + +## Template Design + +### tag_primitive<T> + +Six NBT types share the same structure: a single numeric value. 
The `tag_primitive<T>` template (`include/tag_primitive.h`) handles all of them: + +```cpp +template <class T> +class tag_primitive final : public detail::crtp_tag<tag_primitive<T>> +{ +public: + typedef T value_type; + static constexpr tag_type type = detail::get_primitive_type<T>::value; + + constexpr tag_primitive(T val = 0) noexcept : value(val) {} + + operator T&(); + constexpr operator T() const; + constexpr T get() const { return value; } + + tag_primitive& operator=(T val); + void set(T val); + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + +private: + T value; +}; +``` + +The type mapping uses `detail::get_primitive_type<T>` (`include/primitive_detail.h`): + +```cpp +template <> struct get_primitive_type<int8_t> : std::integral_constant<tag_type, tag_type::Byte> {}; +template <> struct get_primitive_type<int16_t> : std::integral_constant<tag_type, tag_type::Short> {}; +template <> struct get_primitive_type<int32_t> : std::integral_constant<tag_type, tag_type::Int> {}; +template <> struct get_primitive_type<int64_t> : std::integral_constant<tag_type, tag_type::Long> {}; +template <> struct get_primitive_type<float> : std::integral_constant<tag_type, tag_type::Float> {}; +template <> struct get_primitive_type<double> : std::integral_constant<tag_type, tag_type::Double> {}; +``` + +**Explicit instantiation**: Template instantiations are declared `extern template class NBT_EXPORT tag_primitive<...>` in the header and explicitly instantiated in `src/tag.cpp`. This prevents duplicate template instantiations across translation units. + +### tag_array<T> + +Three NBT array types share the same vector-based structure. 
The `tag_array<T>` template (`include/tag_array.h`) handles all of them: + +```cpp +template <class T> +class tag_array final : public detail::crtp_tag<tag_array<T>> +{ +public: + typedef typename std::vector<T>::iterator iterator; + typedef typename std::vector<T>::const_iterator const_iterator; + typedef T value_type; + static constexpr tag_type type = detail::get_array_type<T>::value; + + tag_array() {} + tag_array(std::initializer_list<T> init); + tag_array(std::vector<T>&& vec) noexcept; + + std::vector<T>& get(); + T& at(size_t i); + T& operator[](size_t i); + void push_back(T val); + void pop_back(); + size_t size() const; + void clear(); + + iterator begin(); iterator end(); + const_iterator begin() const; const_iterator end() const; + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + +private: + std::vector<T> data; +}; +``` + +The type mapping uses `detail::get_array_type<T>`: + +```cpp +template <> struct get_array_type<int8_t> : std::integral_constant<tag_type, tag_type::Byte_Array> {}; +template <> struct get_array_type<int32_t> : std::integral_constant<tag_type, tag_type::Int_Array> {}; +template <> struct get_array_type<int64_t> : std::integral_constant<tag_type, tag_type::Long_Array> {}; +``` + +**Specialized I/O**: `read_payload` and `write_payload` have explicit template specializations for `int8_t` (byte arrays can be read/written as raw byte blocks) and `int64_t` (long arrays read element-by-element with `read_num`). The generic template handles `int32_t` arrays. + +--- + +## File Roles Breakdown + +### Core Headers + +| File | Role | +|------|------| +| `include/tag.h` | Defines the `tag` abstract base class, the `tag_type` enum (End through Long_Array, plus Null), `is_valid_type()`, the `create()` factory methods, `operator==`/`!=`, and `operator<<`. Also forward-declares `nbt_visitor`, `const_nbt_visitor`, `io::stream_reader`, and `io::stream_writer`. 
| +| `include/tagfwd.h` | Forward declarations only. Declares `tag`, `tag_primitive<T>` with all six typedefs, `tag_string`, `tag_array<T>` with all three typedefs, `tag_list`, and `tag_compound`. Used by headers that need type names without full definitions. | +| `include/nbt_tags.h` | Convenience umbrella header. Simply includes `tag_primitive.h`, `tag_string.h`, `tag_array.h`, `tag_list.h`, and `tag_compound.h`. | +| `include/crtp_tag.h` | Defines `detail::crtp_tag<Sub>`, the CRTP intermediate class. Includes `tag.h`, `nbt_visitor.h`, and `make_unique.h`. | +| `include/primitive_detail.h` | Defines `detail::get_primitive_type<T>`, mapping C++ types to `tag_type` values. Uses `std::integral_constant` for compile-time constants. | +| `include/make_unique.h` | Provides `nbt::make_unique<T>(args...)`, a C++11 polyfill for `std::make_unique` (which was only added in C++14). | + +### Tag Implementation Headers + +| File | Role | +|------|------| +| `include/tag_primitive.h` | Full definition of `tag_primitive<T>` including inline `read_payload`/`write_payload`. The six typedefs (`tag_byte` through `tag_double`) are declared here, along with `extern template` declarations that suppress implicit instantiation in each translation unit (the explicit instantiations live in `src/tag.cpp`). | +| `include/tag_string.h` | Full definition of `tag_string`. Wraps `std::string` with constructors from `const std::string&`, `std::string&&`, and `const char*`. Provides implicit conversion operators to `std::string&` and `const std::string&`. | +| `include/tag_array.h` | Full definition of `tag_array<T>` with specialized `read_payload`/`write_payload` for `int8_t`, `int64_t`, and the generic case. The three typedefs (`tag_byte_array`, `tag_int_array`, `tag_long_array`) are at the bottom. | +| `include/tag_list.h` | Full definition of `tag_list`. Stores `std::vector<value>` with a tracked `el_type_` (element type). Provides `of<T>()` static factory, `push_back(value_initializer&&)`, `emplace_back<T, Args...>()`, `set()`, iterators, and I/O methods. 
| +| `include/tag_compound.h` | Full definition of `tag_compound`. Stores `std::map<std::string, value>`. Provides `at()`, `operator[]`, `put()`, `insert()`, `emplace<T>()`, `erase()`, `has_key()`, iterators, and I/O methods. | + +### Value Layer + +| File | Role | +|------|------| +| `include/value.h` | Type-erased `value` class wrapping `std::unique_ptr<tag>`. Provides numeric/string assignment operators, conversion operators (with widening semantics), compound/list access delegation via `operator[]`. | +| `include/value_initializer.h` | `value_initializer` subclass of `value` with implicit constructors from primitive types, strings, tags, and `nullptr`. Used as function parameter type. | + +### I/O Headers + +| File | Role | +|------|------| +| `include/endian_str.h` | The `endian` namespace. Declares `read_little`/`read_big`/`write_little`/`write_big` overloads for all integer and floating-point types. Template functions `read()`/`write()` dispatch based on an `endian::endian` enum. | +| `include/io/stream_reader.h` | `io::stream_reader` class and `io::input_error` exception. Free functions `read_compound()` and `read_tag()`. The reader tracks nesting depth (max 1024) to prevent stack overflow attacks. | +| `include/io/stream_writer.h` | `io::stream_writer` class. Free function `write_tag()`. Defines `max_string_len` (UINT16_MAX) and `max_array_len` (INT32_MAX) constants. | + +### Compression Headers + +| File | Role | +|------|------| +| `include/io/zlib_streambuf.h` | Base class `zlib::zlib_streambuf` extending `std::streambuf`. Contains input/output buffers (`std::vector<char>`) and a `z_stream` struct. Also defines `zlib::zlib_error` exception. | +| `include/io/izlibstream.h` | `zlib::inflate_streambuf` and `zlib::izlibstream`. Decompresses data read from a wrapped `std::istream`. Auto-detects gzip vs zlib format via `window_bits = 32 + 15`. | +| `include/io/ozlibstream.h` | `zlib::deflate_streambuf` and `zlib::ozlibstream`. 
Compresses data written to a wrapped `std::ostream`. Supports configurable compression level and gzip vs zlib output. | + +### Text Headers + +| File | Role | +|------|------| +| `include/text/json_formatter.h` | `text::json_formatter` class with a single `print()` method. | + +### Visitor + +| File | Role | +|------|------| +| `include/nbt_visitor.h` | `nbt_visitor` and `const_nbt_visitor` abstract base classes with 12 `visit()` overloads each (one per concrete tag type). All overloads have default empty implementations, allowing visitors to override only the types they care about. | + +--- + +## Source File Roles + +### Core Sources + +| File | Role | +|------|------| +| `src/tag.cpp` | Contains the explicit template instantiation definitions for all six `tag_primitive<T>` specializations. Implements `tag::create()` factory methods (both default and numeric), `operator==`/`!=` (using `typeid` comparison), `operator<<` (delegating to `json_formatter`), and the `tag_type` output operator. Also contains a `static_assert` ensuring IEEE 754 floating point. | +| `src/tag_compound.cpp` | Implements `tag_compound`'s constructor from initializer list, `at()`, `put()`, `insert()`, `erase()`, `has_key()`, `read_payload()` (reads until `tag_type::End`), and `write_payload()` (writes each entry then `tag_type::End`). | +| `src/tag_list.cpp` | Implements `tag_list`'s 12 initializer list constructors (one per tag type), the `value` initializer list constructor, `at()`, `set()`, `push_back()`, `reset()`, `read_payload()`, `write_payload()`, and `operator==`/`!=`. | +| `src/tag_string.cpp` | Implements `tag_string::read_payload()` (delegates to `reader.read_string()`) and `write_payload()` (delegates to `writer.write_string()`). 
| +| `src/value.cpp` | Implements `value`'s copy constructor, copy assignment, `set()`, all numeric assignment operators (using `assign_numeric_impl` helper), all numeric conversion operators (widening conversions via switch/case), string operations, and compound/list delegation methods. | +| `src/value_initializer.cpp` | Implements `value_initializer`'s constructors for each primitive type and string variants. Each constructs the appropriate tag and passes it to the `value` base. | + +### I/O Sources + +| File | Role | +|------|------| +| `src/endian_str.cpp` | Implements all `read_little`/`read_big`/`write_little`/`write_big` overloads. Uses byte-by-byte construction for portability (no reliance on host endianness). Float/double conversion uses `memcpy`-based type punning. Includes `static_assert` checks for `CHAR_BIT == 8`, `sizeof(float) == 4`, `sizeof(double) == 8`. | +| `src/io/stream_reader.cpp` | Implements `stream_reader::read_compound()`, `read_tag()`, `read_payload()`, `read_type()`, and `read_string()`. The `read_payload()` method tracks nesting depth with a max of `MAX_DEPTH = 1024` to prevent stack overflow from malicious input. Free functions `read_compound()` and `read_tag()` are thin wrappers. | +| `src/io/stream_writer.cpp` | Implements `stream_writer::write_tag()` (writes type + name + payload) and `write_string()` (writes 2-byte length prefix + UTF-8 data, throws `std::length_error` if string exceeds 65535 bytes). Free function `write_tag()` is a thin wrapper. | + +### Compression Sources + +| File | Role | +|------|------| +| `src/io/izlibstream.cpp` | Implements `inflate_streambuf`: constructor calls `inflateInit2()`, destructor calls `inflateEnd()`. The `underflow()` method reads compressed data from the wrapped stream, calls `inflate()`, and handles `Z_STREAM_END` by seeking back the wrapped stream to account for over-read data. 
| +| `src/io/ozlibstream.cpp` | Implements `deflate_streambuf`: constructor calls `deflateInit2()`, destructor calls `close()` then `deflateEnd()`. The `deflate_chunk()` method compresses buffered data and writes to the output stream. `close()` flushes with `Z_FINISH`. `ozlibstream::close()` handles exceptions gracefully by setting `badbit` instead of re-throwing. | + +### Text Sources + +| File | Role | +|------|------| +| `src/text/json_formatter.cpp` | Implements `json_formatter::print()` using a private `json_fmt_visitor` class (extends `const_nbt_visitor`). The visitor handles indentation, JSON-like output for compounds (`{}`), lists (`[]`), quoted strings, numeric suffixes (`b`, `s`, `l`, `f`, `d`), and special float values (Infinity, NaN). | + +--- + +## Data Flow: Reading NBT + +``` +Input Stream + │ + ▼ +[izlibstream] ← optional decompression + │ + ▼ +stream_reader + ├── read_type() → reads 1 byte, validates tag type + ├── read_string() → reads 2-byte length + UTF-8 name + └── read_payload(type) → tag::create(type), then tag->read_payload(*this) + │ + ├── tag_primitive<T>::read_payload() → reader.read_num(value) + ├── tag_string::read_payload() → reader.read_string() + ├── tag_array<T>::read_payload() → reader.read_num(length), then read elements + ├── tag_list::read_payload() → read element type, length, then element payloads + └── tag_compound::read_payload() → loop: read_type(), read_string(), read_payload() until End +``` + +## Data Flow: Writing NBT + +``` +tag_compound (root) + │ + ▼ +stream_writer::write_tag(key, tag) + ├── write_type(tag.get_type()) → 1 byte + ├── write_string(key) → 2-byte length + UTF-8 + └── write_payload(tag) → tag.write_payload(*this) + │ + ├── tag_primitive<T>::write_payload() → writer.write_num(value) + ├── tag_string::write_payload() → writer.write_string(value) + ├── tag_array<T>::write_payload() → write length, then elements + ├── tag_list::write_payload() → write type + length + element payloads + └── 
tag_compound::write_payload() → for each entry: write_tag(key, value); write_type(End) + │ + ▼ +[ozlibstream] ← optional compression + │ + ▼ +Output Stream +``` + +--- + +## Depth Protection + +The `stream_reader` maintains a `depth` counter (private `int depth = 0`) that increments on each recursive `read_payload()` call and decrements on return. If `depth` exceeds `MAX_DEPTH` (1024), an `input_error` is thrown. This prevents stack overflow from deeply nested structures in malicious NBT files. + +```cpp +std::unique_ptr<tag> stream_reader::read_payload(tag_type type) +{ + if (++depth > MAX_DEPTH) + throw input_error("Too deeply nested"); + std::unique_ptr<tag> t = tag::create(type); + t->read_payload(*this); + --depth; + return t; +} +``` + +--- + +## Export Macros + +The library uses `generate_export_header()` from CMake to create `nbt_export.h` at build time. The `NBT_EXPORT` macro is applied to all public classes and functions. When building shared libraries (`NBT_BUILD_SHARED=ON`), symbols default to hidden (`CXX_VISIBILITY_PRESET hidden`) and only `NBT_EXPORT`-marked symbols are exported. For static builds, `NBT_EXPORT` expands to nothing. 
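As a rough illustration of the export mechanism described above, the following sketch shows how a public class is annotated with `NBT_EXPORT`. The fallback `#define` stands in for the generated `nbt_export.h` so the snippet compiles on its own, and `example_api` is a hypothetical class, not part of libnbt++.

```cpp
#include <cassert>
#include <string>

// Stand-in for the CMake-generated nbt_export.h (assumption for this sketch):
// in a static build the macro expands to nothing, as described above.
#ifndef NBT_EXPORT
#define NBT_EXPORT
#endif

// Hypothetical public class. In a shared build with hidden default
// visibility, only symbols annotated like this would be exported.
class NBT_EXPORT example_api
{
public:
    std::string name() const { return "exported"; }
};
```

With the macro empty, the annotation has no effect on a static build, which is why the same headers can serve both library types.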
+ +--- + +## Numeric Widening Rules in value + +The `value` class implements a strict widening hierarchy for numeric conversions: + +**Assignment (write) direction** — A value can be assigned a narrower type: + +``` +value holding tag_short can accept int8_t (narrower OK) +value holding tag_short rejects int32_t (wider → std::bad_cast) +``` + +**Conversion (read) direction** — A value can be read as a wider type: + +``` +value holding tag_byte can be read as int8_t, int16_t, int32_t, int64_t, float, double +value holding tag_short can be read as int16_t, int32_t, int64_t, float, double +value holding tag_int can be read as int32_t, int64_t, float, double +value holding tag_long can be read as int64_t, float, double +value holding tag_float can be read as float, double +value holding tag_double can be read as double only +``` + +The implementation in `src/value.cpp` uses an `assign_numeric_impl` helper template with a switch-case dispatching on the existing tag type, comparing ordinal positions in the `tag_type` enum. + +--- + +## Dependency Graph + +``` +nbt_tags.h ──┬── tag_primitive.h → crtp_tag.h → tag.h, nbt_visitor.h, make_unique.h + │ → primitive_detail.h + │ → io/stream_reader.h → endian_str.h, tag.h, tag_compound.h + │ → io/stream_writer.h → endian_str.h, tag.h + ├── tag_string.h ──→ crtp_tag.h + ├── tag_array.h ───→ crtp_tag.h, io/stream_reader.h, io/stream_writer.h + ├── tag_list.h ────→ crtp_tag.h, tagfwd.h, value_initializer.h → value.h → tag.h + └── tag_compound.h → crtp_tag.h, value_initializer.h + +io/izlibstream.h → io/zlib_streambuf.h, <zlib.h> +io/ozlibstream.h → io/zlib_streambuf.h, <zlib.h> +text/json_formatter.h → tagfwd.h +``` + +--- + +## Thread Safety Implications + +The architecture has no global mutable state. 
The `json_formatter` used by `operator<<` is a `static const` local in `tag.cpp`: + +```cpp +std::ostream& operator<<(std::ostream& os, const tag& t) +{ + static const text::json_formatter formatter; + formatter.print(os, t); + return os; +} +``` + +Since `formatter` is const and `print()` is const, multiple threads can safely use `operator<<` concurrently. Tag objects themselves are not thread-safe for concurrent mutation. diff --git a/docs/handbook/libnbtplusplus/building.md b/docs/handbook/libnbtplusplus/building.md new file mode 100644 index 0000000000..61bd5a57a3 --- /dev/null +++ b/docs/handbook/libnbtplusplus/building.md @@ -0,0 +1,401 @@ +# Building libnbt++ + +## Build System + +libnbt++ uses **CMake** (minimum version 3.15) as its build system. The root `CMakeLists.txt` defines the project, its options, source files, dependencies, and installation rules. + +--- + +## Prerequisites + +### Required + +- **C++11 compatible compiler**: GCC 4.8+, Clang 3.3+, or MSVC 2015+ +- **CMake**: Version 3.15 or later + +### Optional + +- **zlib**: Required for compressed NBT support (gzip/deflate). Enabled by default. +- **CxxTest**: Required for building and running unit tests. Must be discoverable by CMake's `find_package(CxxTest)`. +- **objcopy**: Required for test data embedding on Linux (binary test files are converted to object files via `objcopy`). 
+ +--- + +## CMake Options + +The following options are available when configuring the project: + +| Option | Default | Description | +|-------------------|---------|----------------------------------------------------------| +| `NBT_BUILD_SHARED` | `OFF` | Build as a shared (dynamic) library instead of static | +| `NBT_USE_ZLIB` | `ON` | Enable zlib compression support | +| `NBT_BUILD_TESTS` | `ON` | Build the unit test executables | +| `NBT_NAME` | `nbt++` | Override the output library name | +| `NBT_DEST_DIR` | (unset) | If set, enables the install target with the specified destination | + +### Option Details + +#### NBT_BUILD_SHARED + +When `NBT_BUILD_SHARED=OFF` (default), a static library (`libnbt++.a` or `nbt++.lib`) is produced. + +When `NBT_BUILD_SHARED=ON`, a shared library is produced. In this case, CMake is configured to: +- Set `CXX_VISIBILITY_PRESET` to `hidden` +- Set `VISIBILITY_INLINES_HIDDEN` to `1` +- Use the `NBT_EXPORT` macro (generated by `generate_export_header()`) to control symbol visibility + +This means only classes and functions explicitly marked with `NBT_EXPORT` are exported from the shared library. + +#### NBT_USE_ZLIB + +When enabled (default), the build: +1. Calls `find_package(ZLIB REQUIRED)` to locate the system zlib +2. Adds the zlib source files to the library: `src/io/izlibstream.cpp` and `src/io/ozlibstream.cpp` +3. Defines the preprocessor macro `NBT_HAVE_ZLIB` +4. Links the library against `ZLIB::ZLIB` + +The zlib headers (`include/io/izlibstream.h`, `include/io/ozlibstream.h`, `include/io/zlib_streambuf.h`) include `<zlib.h>` directly. If zlib is not available, these headers cannot be included. + +#### NBT_NAME + +Allows overriding the library target name. By default, the target is called `nbt++`, producing `libnbt++.a`. Setting `NBT_NAME=mynbt` would produce `libmynbt.a`: + +```bash +cmake -DNBT_NAME=mynbt .. 
+``` + +--- + +## Source Files + +### Core Library Sources + +The `NBT_SOURCES` variable lists all non-zlib source files: + +```cmake +set(NBT_SOURCES + src/endian_str.cpp + src/tag.cpp + src/tag_compound.cpp + src/tag_list.cpp + src/tag_string.cpp + src/value.cpp + src/value_initializer.cpp + src/io/stream_reader.cpp + src/io/stream_writer.cpp + src/text/json_formatter.cpp) +``` + +### Zlib Sources (Conditional) + +Only added when `NBT_USE_ZLIB=ON`: + +```cmake +set(NBT_SOURCES_Z + src/io/izlibstream.cpp + src/io/ozlibstream.cpp) +``` + +--- + +## Building Step by Step + +### 1. Clone and Navigate + +```bash +git clone https://github.com/Project-Tick/Project-Tick.git +cd Project-Tick/libnbtplusplus/ +``` + +### 2. Create Build Directory + +```bash +mkdir build +cd build +``` + +### 3. Configure + +#### Default (static library, with zlib, with tests): + +```bash +cmake .. +``` + +#### Static library, no zlib, no tests: + +```bash +cmake -DNBT_USE_ZLIB=OFF -DNBT_BUILD_TESTS=OFF .. +``` + +#### Shared library: + +```bash +cmake -DNBT_BUILD_SHARED=ON .. +``` + +#### Custom library name: + +```bash +cmake -DNBT_NAME=nbtpp .. +``` + +#### Specify a different compiler: + +```bash +cmake -DCMAKE_CXX_COMPILER=clang++ .. +``` + +#### With install destination: + +```bash +cmake -DNBT_DEST_DIR=/usr/local/lib .. +``` + +### 4. Build + +```bash +cmake --build . +``` + +Or with make directly: + +```bash +make -j$(nproc) +``` + +### 5. Run Tests (if enabled) + +```bash +ctest --output-on-failure +``` + +### 6. Install (optional) + +Only works if `NBT_DEST_DIR` was set: + +```bash +cmake --install . 
+``` + +--- + +## Integration into Other Projects + +### As a CMake Subdirectory + +The most common integration method is adding libnbt++ as a subdirectory in your project: + +```cmake +# In your project's CMakeLists.txt + +# Optional: disable tests for the dependency +set(NBT_BUILD_TESTS OFF CACHE BOOL "" FORCE) + +add_subdirectory(libnbtplusplus) + +add_executable(myapp main.cpp) +target_link_libraries(myapp nbt++) +``` + +The `target_include_directories` in libnbt++'s CMakeLists already uses `PUBLIC`, so include paths propagate automatically: + +```cmake +target_include_directories(${NBT_NAME} PUBLIC include ${CMAKE_CURRENT_BINARY_DIR}) +``` + +The `${CMAKE_CURRENT_BINARY_DIR}` is included because `generate_export_header()` creates `nbt_export.h` in the build directory. + +### Include Paths + +After linking against the `nbt++` target, your code can include: + +```cpp +#include <nbt_tags.h> // All tag types +#include <io/stream_reader.h> // Reading +#include <io/stream_writer.h> // Writing +#include <io/izlibstream.h> // Decompression (if NBT_USE_ZLIB) +#include <io/ozlibstream.h> // Compression (if NBT_USE_ZLIB) +``` + +### Manually (without CMake) + +If not using CMake, you need to: + +1. Add `libnbtplusplus/include/` to your include path +2. Compile all `.cpp` files in `src/` (and `src/io/`, `src/text/`) +3. If using zlib: add `-DNBT_HAVE_ZLIB`, link against `-lz` +4. Create your own `nbt_export.h` or define `NBT_EXPORT` as empty: + +```cpp +// nbt_export.h — manual version for static builds +#ifndef NBT_EXPORT_H +#define NBT_EXPORT_H +#define NBT_EXPORT +#endif +``` + +5. Set C++ standard to C++11 or later: `-std=c++11` + +--- + +## The nbt_export.h Header + +This header is **auto-generated** by CMake's `generate_export_header()` command at configure time. 
It is placed in `${CMAKE_CURRENT_BINARY_DIR}` and defines: + +- `NBT_EXPORT` — marks symbols for export from shared libraries +- `NBT_NO_EXPORT` — marks symbols as hidden + +For static builds, `NBT_EXPORT` typically expands to nothing. For shared builds, it maps to compiler-specific visibility attributes: + +```cpp +// Example generated content (GCC/Clang) +#define NBT_EXPORT __attribute__((visibility("default"))) +#define NBT_NO_EXPORT __attribute__((visibility("hidden"))) +``` + +The binary directory is added to include paths so all source files can `#include "nbt_export.h"`. + +--- + +## C++ Standard + +The library requests C++11 via: + +```cmake +set_property(TARGET ${NBT_NAME} PROPERTY CXX_STANDARD 11) +``` + +This does not set `CXX_STANDARD_REQUIRED`, so C++11 is a request rather than a hard requirement: CMake may fall back to an earlier standard if the compiler cannot provide C++11. The code is compatible with C++11 through C++20+. + +--- + +## Compile-Time Assertions + +The library includes several `static_assert` checks to ensure platform compatibility: + +In `src/tag.cpp`: +```cpp +static_assert( + std::numeric_limits<float>::is_iec559 && + std::numeric_limits<double>::is_iec559, + "The floating point values for NBT must conform to IEC 559/IEEE 754"); +``` + +In `src/endian_str.cpp`: +```cpp +static_assert(CHAR_BIT == 8, "Assuming that a byte has 8 bits"); +static_assert(sizeof(float) == 4, "Assuming that a float is 4 byte long"); +static_assert(sizeof(double) == 8, "Assuming that a double is 8 byte long"); +``` + +These ensure that bytes are 8 bits wide and that the platform's floating-point types have the sizes and IEEE 754 representation the NBT format requires. 
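
To make concrete what the assertions above guard, here is a minimal sketch that repeats the same kind of checks and then inspects the bit pattern of a `float`. The helper `float_bits` is hypothetical and not part of libnbt++; it only illustrates why a fixed 4-byte IEEE 754 layout matters when floats are serialized byte by byte.

```cpp
#include <climits>
#include <cstdint>
#include <cstring>
#include <limits>

// Same style of guard as in the library: refuse to compile on exotic platforms.
static_assert(CHAR_BIT == 8, "a byte must have 8 bits");
static_assert(sizeof(float) == 4, "float must be 4 bytes long");
static_assert(std::numeric_limits<float>::is_iec559,
              "float must use the IEEE 754 representation");

// Hypothetical helper (not a libnbt++ function): returns the raw bits of a float.
inline std::uint32_t float_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits); // memcpy avoids strict-aliasing issues
    return bits;
}
```

On a platform that passes the assertions, `float_bits(1.0f)` yields `0x3F800000`, the IEEE 754 single-precision encoding of 1.0.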
+ +--- + +## Platform-Specific Notes + +### Linux + +Tests are only supported on `x86_64` and `i686` architectures due to the use of `objcopy` for binary test data embedding: + +```cmake +if(CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64 OR CMAKE_SYSTEM_PROCESSOR STREQUAL amd64) + set(OBJCOPY_TARGET "elf64-x86-64") + set(OBJCOPY_ARCH "x86_64") +elseif(CMAKE_SYSTEM_PROCESSOR STREQUAL i686) + set(OBJCOPY_TARGET "elf32-i386") + set(OBJCOPY_ARCH "i386") +else() + message(AUTHOR_WARNING "This is not a platform that would support testing nbt++") + return() +endif() +``` + +### macOS / Windows + +The core library compiles on any platform with a C++11 compiler and optionally zlib. However, the test suite uses Linux-specific `objcopy` commands and may not build on non-Linux platforms without modifications. + +--- + +## Shared Library Visibility + +When building as a shared library (`NBT_BUILD_SHARED=ON`), the CMake configuration applies strict visibility rules: + +```cmake +if(${BUILD_SHARED_LIBS}) + set_target_properties(${NBT_NAME} PROPERTIES + CXX_VISIBILITY_PRESET hidden + VISIBILITY_INLINES_HIDDEN 1) +endif() +``` + +This means: +- All symbols are hidden by default +- Inline functions are also hidden +- Only symbols marked `NBT_EXPORT` are exported + +This reduces binary size and prevents symbol collision when multiple libraries are loaded. 
+ +--- + +## Typical Build Output + +After a successful build with all options enabled, you will have: + +``` +build/ +├── libnbt++.a # The static library (or libnbt++.so for shared) +├── nbt_export.h # Generated export header +└── test/ + ├── nbttest # Core tag tests + ├── endian_str_test # Endianness tests + ├── read_test # Reading tests + ├── write_test # Writing tests + ├── zlibstream_test # Compression tests (if NBT_USE_ZLIB) + ├── format_test # JSON formatter test + └── test_value # Value assignment tests +``` + +--- + +## Troubleshooting + +### "Could not find ZLIB" + +Install the zlib development package: + +```bash +# Debian/Ubuntu +sudo apt install zlib1g-dev + +# Fedora +sudo dnf install zlib-devel + +# macOS +brew install zlib +``` + +Or disable zlib: `cmake -DNBT_USE_ZLIB=OFF ..` + +### "Could not find CxxTest" + +Install CxxTest: + +```bash +# Debian/Ubuntu +sudo apt install cxxtest + +# macOS +brew install cxxtest +``` + +Or disable tests: `cmake -DNBT_BUILD_TESTS=OFF ..` + +### "nbt_export.h not found" + +This file is generated at configure time. Make sure you've run `cmake ..` (the configure step) before building. If building manually without CMake, create a minimal `nbt_export.h` as described in the manual integration section above. + +### Linking Errors with Shared Builds + +If you see undefined symbol errors when linking against the shared library, ensure your code includes the correct headers and that `nbt_export.h` was generated during the shared build configuration. Verify `NBT_EXPORT` expands to the visibility attribute. diff --git a/docs/handbook/libnbtplusplus/code-style.md b/docs/handbook/libnbtplusplus/code-style.md new file mode 100644 index 0000000000..a779f44a0f --- /dev/null +++ b/docs/handbook/libnbtplusplus/code-style.md @@ -0,0 +1,299 @@ +# Code Style & Conventions + +## Overview + +This document describes the coding conventions and patterns observed throughout the libnbt++ codebase. 
These are not arbitrary style choices — they reflect deliberate design decisions for a C++11 library focused on correctness, interoperability, and clean ownership semantics. + +--- + +## Namespaces + +### Primary Namespaces + +| Namespace | Purpose | Location | +|-----------|---------|----------| +| `nbt` | All public API types: tags, value, visitors | `include/` | +| `nbt::detail` | Internal implementation details | `include/crtp_tag.h`, `include/tag_primitive.h` | +| `nbt::io` | Binary serialization (stream_reader, stream_writer) | `include/io/` | +| `nbt::text` | Text formatting (json_formatter) | `include/text/` | +| `endian` | Byte-order conversion functions | `include/endian_str.h` | +| `zlib` | Compression stream wrappers | `include/io/izlibstream.h`, `include/io/ozlibstream.h` | + +### Namespace Usage + +- All user-facing code is in the `nbt` namespace +- Internal helpers like `crtp_tag` and `make_unique` are in `nbt::detail` (not `nbt`) +- The `endian` and `zlib` namespaces are top-level, **not** nested under `nbt` +- No `using namespace` directives appear in headers + +--- + +## Export Macro + +```cpp +#ifdef NBT_SHARED + #ifdef NBT_BUILD + #define NBT_EXPORT __declspec(dllexport) // When building the shared lib + #else + #define NBT_EXPORT __declspec(dllimport) // When consuming the shared lib + #endif +#else + #define NBT_EXPORT // Static build: empty +#endif +``` + +`NBT_EXPORT` is applied to all public classes and free functions: + +```cpp +class NBT_EXPORT tag { ... }; +class NBT_EXPORT tag_list final : public detail::crtp_tag<tag_list> { ... }; +NBT_EXPORT bool operator==(const tag_compound& lhs, const tag_compound& rhs); +``` + +Classes in `nbt::detail` (like `crtp_tag`) are **not** exported. + +--- + +## Class Design Patterns + +### final Classes + +All concrete tag classes are `final`: + +```cpp +class tag_list final : public detail::crtp_tag<tag_list> { ... }; +class tag_compound final : public detail::crtp_tag<tag_compound> { ... 
}; +class tag_string final : public detail::crtp_tag<tag_string> { ... }; +``` + +This prevents further inheritance and enables compiler devirtualization optimizations. + +### CRTP Intermediate + +The Curiously Recurring Template Pattern eliminates boilerplate: + +```cpp +template <class Sub> +class crtp_tag : public tag +{ + tag_type get_type() const noexcept override { return Sub::type; } + std::unique_ptr<tag> clone() const& override { return make_unique<Sub>(/*copy*/); } + std::unique_ptr<tag> move_clone() && override { return make_unique<Sub>(std::move(/*this*/)); } + void accept(nbt_visitor& visitor) const override { visitor.visit(const_cast<Sub&>(...)); } + // ... +}; +``` + +Each concrete class inherits from `crtp_tag<Self>` and gets all 6 virtual method implementations for free. + +### Static Type Constants + +Each tag class exposes its type as a `static constexpr`: + +```cpp +class tag_int final : public detail::crtp_tag<tag_int> +{ +public: + static constexpr tag_type type = tag_type::Int; + // ... +}; +``` + +Used for compile-time type checks and template metaprogramming. + +--- + +## Ownership Conventions + +### Unique Pointer Everywhere + +All tag ownership uses `std::unique_ptr<tag>`: + +```cpp +std::unique_ptr<tag> tag::create(tag_type type); +std::unique_ptr<tag> tag::clone() const&; +std::unique_ptr<tag> stream_reader::read_payload(tag_type type); +``` + +### Custom make_unique + +Since `std::make_unique` is C++14, the library provides its own in `nbt::detail`: + +```cpp +namespace detail { + template <class T, class... Args> + std::unique_ptr<T> make_unique(Args&&... 
args) + { + return std::unique_ptr<T>(new T(std::forward<Args>(args)...)); + } +} +``` + +Used throughout source files instead of raw `new`: + +```cpp +tags.emplace_back(make_unique<T>(std::forward<Args>(args)...)); +``` + +### value as Type-Erased Wrapper + +The `value` class wraps `std::unique_ptr<tag>` to provide implicit conversions and operator overloading that `unique_ptr` cannot support: + +```cpp +class value +{ + std::unique_ptr<tag> tag_; +public: + value& operator=(int32_t val); // Assigns to contained tag + operator int32_t() const; // Reads from contained tag + value& operator[](const std::string& key); // Delegates to tag_compound + // ... +}; +``` + +--- + +## C++11 Features Used + +| Feature | Usage | +|---------|-------| +| `std::unique_ptr` | All tag ownership | +| Move semantics | `value(value&&)`, `tag_list::push_back(value_initializer&&)` | +| `override` | All virtual method overrides | +| `final` | All concrete tag classes | +| `constexpr` | Static type constants, writer limits | +| `noexcept` | `get_type()`, visitor destructors | +| `std::initializer_list` | Compound and list construction | +| Range-based for | Internal iteration in compounds, lists | +| `auto` | Type deduction in local variables | +| `static_assert` | Endian implementation checks | +| `enum class` | `tag_type`, `endian::endian` | +| Variadic templates | `emplace_back<T>(Args&&...)`, `make_unique` | + +The library does **not** use C++14 or later features, maintaining broad compiler compatibility. 
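
As a rough illustration of how several features from the table combine, the following C++11-only sketch pairs a CRTP intermediate with a `static constexpr` type constant, `final`, `override`, `noexcept`, `enum class`, and a hand-written `make_unique`. Everything lives in a hypothetical `crtp_sketch` namespace; the class names and the reduced set of virtual functions are assumptions for illustration and do not reproduce the real libnbt++ declarations.

```cpp
#include <memory>
#include <utility>

namespace crtp_sketch {

enum class tag_type { Int, String };

// C++11 stand-in for std::make_unique, which only arrived in C++14.
template <class T, class... Args>
std::unique_ptr<T> make_unique(Args&&... args)
{
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

// Abstract base: owners hold tags through std::unique_ptr<tag>.
class tag
{
public:
    virtual ~tag() noexcept {}
    virtual tag_type get_type() const noexcept = 0;
    virtual std::unique_ptr<tag> clone() const = 0;
};

// CRTP intermediate: writes the virtual functions once for every Sub.
template <class Sub>
class crtp_tag : public tag
{
public:
    tag_type get_type() const noexcept override { return Sub::type; }

    std::unique_ptr<tag> clone() const override
    {
        // Copy-construct the concrete type, then hand it back as a base pointer.
        return make_unique<Sub>(static_cast<const Sub&>(*this));
    }
};

// Concrete class: only supplies its type constant and its payload.
class int_tag final : public crtp_tag<int_tag>
{
public:
    static constexpr tag_type type = tag_type::Int;
    explicit int_tag(int v) : value(v) {}
    int value;
};

} // namespace crtp_sketch
```

A caller that only holds a `std::unique_ptr<crtp_sketch::tag>` can still ask for the type and obtain a deep copy, without `int_tag` spelling out either function itself.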
+ +--- + +## Include Structure + +### Public Headers (for Library Users) + +``` +include/ + tag.h // tag base, tag_type enum, create(), as<T>() + crtp_tag.h // CRTP intermediate (detail) + nbt_tags.h // Master include — includes all tag headers + tag_primitive.h // tag_byte, tag_short, tag_int, tag_long, tag_float, tag_double + tag_string.h // tag_string + tag_array.h // tag_byte_array, tag_int_array, tag_long_array + tag_list.h // tag_list + tag_compound.h // tag_compound + value.h // value wrapper + value_initializer.h // value_initializer for implicit construction + nbt_visitor.h // nbt_visitor, const_nbt_visitor + endian_str.h // endian read/write functions + io/ + stream_reader.h // stream_reader, read_compound(), read_tag() + stream_writer.h // stream_writer, write_tag() + izlibstream.h // izlibstream (decompression) + ozlibstream.h // ozlibstream (compression) + zlib_streambuf.h // Base streambuf for zlib + text/ + json_formatter.h // json_formatter +``` + +### Master Include + +`nbt_tags.h` includes all tag types for convenience: + +```cpp +// nbt_tags.h +#include "tag.h" +#include "tag_primitive.h" +#include "tag_string.h" +#include "tag_array.h" +#include "tag_list.h" +#include "tag_compound.h" +#include "value.h" +#include "value_initializer.h" +``` + +Users can include individual headers for faster compilation or `nbt_tags.h` for convenience. 
+ +--- + +## Error Handling Style + +### Exceptions for Programmer Errors + +- `std::invalid_argument`: Type mismatches in lists, null values +- `std::out_of_range`: Invalid indices in lists, missing keys in compounds +- `std::logic_error`: Inconsistent internal state + +### Exceptions for I/O Errors + +- `io::input_error` (extends `std::runtime_error`): All parse/read failures +- `std::runtime_error`: Write stream failures +- `std::length_error`: Exceeding NBT format limits +- `zlib::zlib_error` (extends `std::runtime_error`): Compression/decompression failures + +### Stream State + +Reader/writer methods check `is` / `os` state after I/O operations and throw on failure. Write methods set failbit before throwing to maintain consistent stream state. + +--- + +## Naming Conventions + +| Element | Convention | Examples | +|---------|-----------|----------| +| Classes | `snake_case` | `tag_compound`, `stream_reader`, `tag_list` | +| Methods | `snake_case` | `get_type()`, `read_payload()`, `el_type()` | +| Member variables | `snake_case` with trailing `_` | `el_type_`, `tag_`, `is`, `os` | +| Template parameters | `PascalCase` | `Sub`, `T`, `Args` | +| Enum values | `PascalCase` | `tag_type::Byte_Array`, `tag_type::Long_Array` | +| Namespaces | `snake_case` | `nbt`, `nbt::io`, `nbt::detail` | +| Macros | `UPPER_SNAKE_CASE` | `NBT_EXPORT`, `NBT_HAVE_ZLIB`, `NBT_SHARED` | +| Constants | `snake_case` | `max_string_len`, `max_array_len`, `MAX_DEPTH` | + +--- + +## Template Patterns + +### Explicit Instantiation + +Template classes like `tag_primitive<T>` and `tag_array<T>` use extern template declarations in headers and explicit instantiation in source files: + +```cpp +// In tag_primitive.h +extern template class tag_primitive<int8_t>; +extern template class tag_primitive<int16_t>; +extern template class tag_primitive<int32_t>; +extern template class tag_primitive<int64_t>; +extern template class tag_primitive<float>; +extern template class tag_primitive<double>; + +// In 
tag.cpp +template class tag_primitive<int8_t>; +template class tag_primitive<int16_t>; +// ... +``` + +This prevents duplicate template instantiation across translation units, reducing compile time and binary size. + +### Type Aliases + +```cpp +typedef tag_primitive<int8_t> tag_byte; +typedef tag_primitive<int16_t> tag_short; +typedef tag_primitive<int32_t> tag_int; +typedef tag_primitive<int64_t> tag_long; +typedef tag_primitive<float> tag_float; +typedef tag_primitive<double> tag_double; + +typedef tag_array<int8_t> tag_byte_array; +typedef tag_array<int32_t> tag_int_array; +typedef tag_array<int64_t> tag_long_array; +``` + +Uses `typedef` rather than `using` — a C++11 compatibility choice, though both are equivalent. diff --git a/docs/handbook/libnbtplusplus/compound-tags.md b/docs/handbook/libnbtplusplus/compound-tags.md new file mode 100644 index 0000000000..fbd5ce7764 --- /dev/null +++ b/docs/handbook/libnbtplusplus/compound-tags.md @@ -0,0 +1,602 @@ +# Compound Tags + +## Overview + +`tag_compound` is the most important tag type in NBT. It represents an unordered collection of **named tags** of arbitrary types — the NBT equivalent of a JSON object or a C++ `std::map`. Every NBT file has a root compound tag, and compounds can be nested arbitrarily deep. + +Defined in `include/tag_compound.h`, implemented in `src/tag_compound.cpp`. 
+ +--- + +## Class Definition + +```cpp +class NBT_EXPORT tag_compound final : public detail::crtp_tag<tag_compound> +{ + typedef std::map<std::string, value> map_t_; + +public: + typedef map_t_::iterator iterator; + typedef map_t_::const_iterator const_iterator; + + static constexpr tag_type type = tag_type::Compound; + + tag_compound() {} + tag_compound( + std::initializer_list<std::pair<std::string, value_initializer>> init); + + value& at(const std::string& key); + const value& at(const std::string& key) const; + + value& operator[](const std::string& key); + + std::pair<iterator, bool> put(const std::string& key, + value_initializer&& val); + std::pair<iterator, bool> insert(const std::string& key, + value_initializer&& val); + + template <class T, class... Args> + std::pair<iterator, bool> emplace(const std::string& key, Args&&... args); + + bool erase(const std::string& key); + bool has_key(const std::string& key) const; + bool has_key(const std::string& key, tag_type type) const; + + size_t size() const; + void clear(); + + iterator begin(); + iterator end(); + const_iterator begin() const; + const_iterator end() const; + const_iterator cbegin() const; + const_iterator cend() const; + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + + friend bool operator==(const tag_compound& lhs, const tag_compound& rhs); + friend bool operator!=(const tag_compound& lhs, const tag_compound& rhs); + +private: + map_t_ tags; +}; +``` + +--- + +## Internal Storage + +`tag_compound` uses a `std::map<std::string, value>` as its internal container. 
This means: + +- **Keys are sorted**: Iteration order is lexicographic by key name +- **Unique keys**: Each key appears at most once +- **Logarithmic access**: Lookup, insertion, and deletion are $O(\log n)$ +- **Stable iterators**: Inserting/erasing does not invalidate other iterators + +Each value in the map is a `value` object (wrapping `std::unique_ptr<tag>`), which owns its tag. + +--- + +## Construction + +### Default Constructor + +```cpp +tag_compound comp; // Empty compound +``` + +### Initializer List Constructor + +The most powerful constructor takes a brace-enclosed list of key-value pairs: + +```cpp +tag_compound comp{ + {"name", "Steve"}, // const char* → tag_string via value_initializer + {"health", int16_t(20)}, // int16_t → tag_short + {"xp", int32_t(1500)}, // int32_t → tag_int + {"velocity", 0.0}, // double → tag_double + {"inventory", tag_list::of<tag_compound>({ + {{"id", "minecraft:sword"}, {"count", int8_t(1)}}, + {{"id", "minecraft:shield"}, {"count", int8_t(1)}} + })}, + {"scores", tag_int_array{100, 200, 300}}, + {"nested", tag_compound{{"inner_key", "inner_value"}}} +}; +``` + +The initializer list type is `std::initializer_list<std::pair<std::string, value_initializer>>`. The `value_initializer` class accepts implicit conversions from all supported types (see the architecture documentation). + +#### Implementation + +```cpp +tag_compound::tag_compound( + std::initializer_list<std::pair<std::string, value_initializer>> init) +{ + for (const auto& pair : init) + tags.emplace(std::move(pair.first), std::move(pair.second)); +} +``` + +Each pair's key and value_initializer are moved into the map. Since `value_initializer` inherits from `value`, it converts seamlessly. + +--- + +## Element Access + +### operator[] — Unchecked Access + +```cpp +value& operator[](const std::string& key); +``` + +Returns a reference to the value associated with `key`. 
If the key does not exist, **a new uninitialized value is created** under that key (matching `std::map::operator[]` behavior). + +```cpp +tag_compound comp{{"name", "Steve"}}; + +// Access existing key +value& name = comp["name"]; +std::string n = static_cast<std::string>(name); // "Steve" + +// Create new entry (value is uninitialized/null) +value& newval = comp["new_key"]; +// newval is a null value: (bool)newval == false +newval = int32_t(42); // Now it holds a tag_int(42) +``` + +**Warning**: Since `operator[]` creates entries, do not use it to test for key existence. Use `has_key()` instead. + +### at() — Bounds-Checked Access + +```cpp +value& at(const std::string& key); +const value& at(const std::string& key) const; +``` + +Returns a reference to the value associated with `key`. Throws `std::out_of_range` if the key does not exist. + +```cpp +tag_compound comp{{"health", int16_t(20)}}; + +value& health = comp.at("health"); // OK +const value& h = comp.at("health"); // OK (const) +comp.at("missing_key"); // throws std::out_of_range +``` + +Implementation: +```cpp +value& tag_compound::at(const std::string& key) +{ + return tags.at(key); +} +``` + +--- + +## Insertion and Modification + +### put() — Insert or Assign + +```cpp +std::pair<iterator, bool> put(const std::string& key, value_initializer&& val); +``` + +If `key` already exists, **replaces** the existing value. If `key` does not exist, **inserts** a new entry. Returns a pair of (iterator to the entry, bool indicating whether the key was new). 
+ +```cpp +// Note: structured bindings require C++17; in C++11, use .first/.second on the returned pair +tag_compound comp; + +auto [it1, inserted1] = comp.put("key", int32_t(42)); +// inserted1 == true, comp["key"] == tag_int(42) + +auto [it2, inserted2] = comp.put("key", int32_t(99)); +// inserted2 == false (key existed), comp["key"] == tag_int(99) +``` + +Implementation: +```cpp +std::pair<tag_compound::iterator, bool> +tag_compound::put(const std::string& key, value_initializer&& val) +{ + auto it = tags.find(key); + if (it != tags.end()) { + it->second = std::move(val); + return {it, false}; + } else { + return tags.emplace(key, std::move(val)); + } +} +``` + +### insert() — Insert Only + +```cpp +std::pair<iterator, bool> insert(const std::string& key, value_initializer&& val); +``` + +Inserts a new entry only if the key does **not** already exist. If the key exists, the value is not modified. Returns (iterator, bool) where bool indicates whether insertion occurred. + +```cpp +// Note: structured bindings require C++17 +auto [it, inserted] = comp.insert("key", int32_t(42)); +// If "key" exists: inserted == false, value unchanged +// If "key" missing: inserted == true, "key" → tag_int(42) +``` + +Implementation: +```cpp +std::pair<tag_compound::iterator, bool> +tag_compound::insert(const std::string& key, value_initializer&& val) +{ + return tags.emplace(key, std::move(val)); +} +``` + +This delegates to `std::map::emplace`, which does not overwrite existing entries. + +### emplace() — Construct and Insert/Assign + +```cpp +template <class T, class... Args> +std::pair<iterator, bool> emplace(const std::string& key, Args&&... args); +``` + +Constructs a tag of type `T` in-place and assigns or inserts it. Unlike `std::map::emplace`, this **will overwrite** existing values (it delegates to `put()`). 
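
The difference between the overwrite and insert-only behaviours described so far can be modeled on a plain `std::map<std::string, int>`, without libnbt++. The free functions `put` and `insert_only` below are hypothetical stand-ins that mirror the described semantics in C++11; they are a sketch, not the library's member functions.

```cpp
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, int> map_t;

// Insert-or-assign: overwrites an existing value.
// The bool is true only when the key was newly inserted.
inline std::pair<map_t::iterator, bool>
put(map_t& m, const std::string& key, int val)
{
    map_t::iterator it = m.find(key);
    if (it != m.end()) {
        it->second = val;                 // key exists: replace the value
        return std::make_pair(it, false);
    }
    return m.emplace(key, val);           // key missing: insert
}

// Insert-only: std::map::emplace leaves an existing entry untouched.
inline std::pair<map_t::iterator, bool>
insert_only(map_t& m, const std::string& key, int val)
{
    return m.emplace(key, val);
}
```

In this model, `tag_compound::emplace<T>()` corresponds to constructing the value first and then calling `put`, which is why it overwrites where `std::map::emplace` would not.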
+ +```cpp +// Construct a tag_int(42) in place +comp.emplace<tag_int>("key", 42); + +// Construct a tag_compound with initializer +comp.emplace<tag_compound>("nested", std::initializer_list< + std::pair<std::string, value_initializer>>{ + {"inner", int32_t(1)} + }); +``` + +Implementation: +```cpp +template <class T, class... Args> +std::pair<tag_compound::iterator, bool> +tag_compound::emplace(const std::string& key, Args&&... args) +{ + return put(key, value(make_unique<T>(std::forward<Args>(args)...))); +} +``` + +### Direct Assignment via operator[] + +Since `operator[]` returns a `value&`, you can assign directly: + +```cpp +comp["health"] = int16_t(20); // Creates/updates tag_short +comp["name"] = "Steve"; // Creates/updates tag_string +comp["data"] = tag_compound{ // Assigns a whole compound + {"nested", "value"} +}; +``` + +Assignment behavior depends on whether the value is initialized: +- **Uninitialized value**: Creates a new tag of the appropriate type +- **Initialized value**: Updates the existing tag if types match, throws `std::bad_cast` if types differ (for numeric/string assignments on `value`) + +--- + +## Deletion + +### erase() + +```cpp +bool erase(const std::string& key); +``` + +Removes the entry with the given key. Returns `true` if an entry was removed, `false` if the key did not exist. + +```cpp +tag_compound comp{{"a", 1}, {"b", 2}, {"c", 3}}; + +comp.erase("b"); // returns true, comp now has "a" and "c" +comp.erase("z"); // returns false, no effect +``` + +Implementation: +```cpp +bool tag_compound::erase(const std::string& key) +{ + return tags.erase(key) != 0; +} +``` + +--- + +## Key Queries + +### has_key() — Check Existence + +```cpp +bool has_key(const std::string& key) const; +``` + +Returns `true` if the key exists in the compound. 
+ +```cpp +if (comp.has_key("name")) { + // Key exists +} +``` + +### has_key() — Check Existence and Type + +```cpp +bool has_key(const std::string& key, tag_type type) const; +``` + +Returns `true` if the key exists **and** the value has the specified type. + +```cpp +if (comp.has_key("health", tag_type::Short)) { + int16_t health = static_cast<int16_t>(comp.at("health")); +} +``` + +Implementation: +```cpp +bool tag_compound::has_key(const std::string& key, tag_type type) const +{ + auto it = tags.find(key); + return it != tags.end() && it->second.get_type() == type; +} +``` + +--- + +## Size and Clearing + +```cpp +size_t size() const { return tags.size(); } // Number of entries +void clear() { tags.clear(); } // Remove all entries +``` + +--- + +## Iteration + +`tag_compound` provides full bidirectional iterator support over its entries. Each entry is a `std::pair<const std::string, value>`. + +```cpp +iterator begin(); +iterator end(); +const_iterator begin() const; +const_iterator end() const; +const_iterator cbegin() const; +const_iterator cend() const; +``` + +### Iteration Examples + +```cpp +tag_compound comp{{"a", 1}, {"b", 2}, {"c", 3}}; + +// Range-based for loop +for (const auto& [key, val] : comp) { + std::cout << key << " = " << val.get() << "\n"; +} +// Output (sorted by key): +// a = 1 +// b = 2 +// c = 3 + +// Iterator-based loop +for (auto it = comp.begin(); it != comp.end(); ++it) { + std::cout << it->first << ": type=" << it->second.get_type() << "\n"; +} + +// Const iteration +for (auto it = comp.cbegin(); it != comp.cend(); ++it) { + // Read-only access +} +``` + +**Note**: Iteration order is **lexicographic by key name** because the internal `std::map` sorts its keys. This is not necessarily the same order as the original NBT file — NBT compounds are unordered in the specification. 
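
The ordering note above can be checked with a plain `std::map`, independent of libnbt++. This sketch assumes the default comparator, `std::less<std::string>`, which compares character by character, so uppercase ASCII letters sort before lowercase ones. The helper name `keys_in_iteration_order` is made up for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Collects the keys in the order a range-based for loop visits them.
inline std::vector<std::string>
keys_in_iteration_order(const std::map<std::string, int>& m)
{
    std::vector<std::string> keys;
    for (const auto& entry : m)
        keys.push_back(entry.first);
    return keys;
}
```

Inserting `"zombie"`, `"apple"`, and `"Creeper"` in that order produces the iteration order `Creeper`, `apple`, `zombie`, so code that must preserve the original order of entries in a file cannot rely on a sorted map for it.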
+ +--- + +## Named Tag Insertion Patterns + +### Pattern 1: Initializer List (Preferred for Construction) + +```cpp +tag_compound comp{ + {"key1", int32_t(1)}, + {"key2", "hello"}, + {"key3", tag_list{1, 2, 3}} +}; +``` + +### Pattern 2: operator[] for Dynamic Updates + +```cpp +comp["new_key"] = int32_t(42); +comp["string_key"] = std::string("value"); +``` + +### Pattern 3: put() for Insert-or-Update + +```cpp +comp.put("key", int32_t(42)); +comp.put("key", int32_t(99)); // Overwrites +``` + +### Pattern 4: insert() for Insert-if-Missing + +```cpp +comp.insert("default", int32_t(42)); // Only inserts if "default" doesn't exist +``` + +### Pattern 5: emplace() for In-Place Construction + +```cpp +comp.emplace<tag_int>("key", 42); +comp.emplace<tag_string>("name", "Steve"); +``` + +### Pattern 6: Moving Tags In + +```cpp +tag_compound inner{{"nested_key", "nested_value"}}; +comp.put("section", std::move(inner)); // Moves the compound +``` + +--- + +## Binary Format + +### Reading (Deserialization) + +A compound tag's payload is a sequence of named tags terminated by a `tag_type::End` byte: + +``` +[type byte] [name length] [name bytes] [tag payload] +[type byte] [name length] [name bytes] [tag payload] +... +[0x00] ← End tag type +``` + +Implementation: +```cpp +void tag_compound::read_payload(io::stream_reader& reader) +{ + clear(); + tag_type tt; + while ((tt = reader.read_type(true)) != tag_type::End) { + std::string key; + try { + key = reader.read_string(); + } catch (io::input_error& ex) { + std::ostringstream str; + str << "Error reading key of tag_" << tt; + throw io::input_error(str.str()); + } + auto tptr = reader.read_payload(tt); + tags.emplace(std::move(key), value(std::move(tptr))); + } +} +``` + +The reader loops until it encounters `tag_type::End` (0x00). For each entry: +1. Read the tag type byte +2. Read the name string (2-byte length + UTF-8) +3. Read the tag payload via `reader.read_payload()` +4. 
Emplace the key-value pair into the map + +### Writing (Serialization) + +```cpp +void tag_compound::write_payload(io::stream_writer& writer) const +{ + for (const auto& pair : tags) + writer.write_tag(pair.first, pair.second); + writer.write_type(tag_type::End); +} +``` + +The writer iterates over all entries (in map order), writing each as a named tag, then writes a single `End` byte. + +--- + +## Equality Comparison + +Two compounds are equal if and only if their internal `std::map` objects are equal: + +```cpp +friend bool operator==(const tag_compound& lhs, const tag_compound& rhs) +{ + return lhs.tags == rhs.tags; +} +``` + +This performs a deep comparison: same keys, in the same order, with equal values (which recursively compares the owned tags). + +--- + +## Nested Access + +The `value` class delegates `operator[]` and `at()` to `tag_compound` when the held tag is a compound. This enables chained access: + +```cpp +tag_compound root{ + {"player", tag_compound{ + {"name", "Steve"}, + {"stats", tag_compound{ + {"health", int16_t(20)}, + {"hunger", int16_t(18)} + }} + }} +}; + +// Chained access +std::string name = static_cast<std::string>(root["player"]["name"]); +int16_t health = static_cast<int16_t>(root["player"]["stats"]["health"]); + +// Bounds-checked +root.at("player").at("stats").at("missing"); // throws std::out_of_range +``` + +The delegation works because `value::operator[](const std::string& key)` performs: +```cpp +value& value::operator[](const std::string& key) +{ + return dynamic_cast<tag_compound&>(*tag_)[key]; +} +``` + +If the held tag is not a `tag_compound`, `dynamic_cast` throws `std::bad_cast`. 
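
The `std::bad_cast` behaviour follows from the language rule that a failed `dynamic_cast` to a reference type throws, whereas a failed cast to a pointer type returns a null pointer. A minimal sketch with a hypothetical two-class hierarchy (not the libnbt++ types):

```cpp
#include <typeinfo>

namespace cast_sketch {

struct tag { virtual ~tag() {} };           // polymorphic base
struct compound_tag : tag { int entries = 0; };
struct int_tag : tag { int value = 0; };

// Reference cast: throws std::bad_cast when t is not a compound_tag.
inline int entry_count(tag& t)
{
    return dynamic_cast<compound_tag&>(t).entries;
}

// Pointer cast: yields nullptr instead of throwing.
inline bool is_compound(tag& t)
{
    return dynamic_cast<compound_tag*>(&t) != nullptr;
}

} // namespace cast_sketch
```

Calling `entry_count` on an `int_tag` throws `std::bad_cast`, which mirrors what chained `operator[]` access does when an intermediate value is not a compound.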
+ +--- + +## Common Usage Patterns + +### Checking and Accessing + +```cpp +if (comp.has_key("version", tag_type::Int)) { + int32_t version = static_cast<int32_t>(comp.at("version")); +} +``` + +### Safe Nested Access + +```cpp +try { + auto& player = comp.at("player").as<tag_compound>(); + if (player.has_key("health")) { + int16_t health = static_cast<int16_t>(player.at("health")); + } +} catch (const std::out_of_range& e) { + // Key doesn't exist +} catch (const std::bad_cast& e) { + // Type mismatch +} +``` + +### Building from Dynamic Data + +```cpp +tag_compound comp; +for (const auto& item : items) { + comp.put(item.name, tag_compound{ + {"id", item.id}, + {"count", int8_t(item.count)}, + {"damage", int16_t(item.damage)} + }); +} +``` + +### Merging Compounds + +```cpp +// Copy all entries from source to dest (overwriting existing keys) +for (const auto& [key, val] : source) { + dest.put(key, value(val)); // Explicit copy via value(const value&) +} +``` diff --git a/docs/handbook/libnbtplusplus/endian-handling.md b/docs/handbook/libnbtplusplus/endian-handling.md new file mode 100644 index 0000000000..7699de0bf0 --- /dev/null +++ b/docs/handbook/libnbtplusplus/endian-handling.md @@ -0,0 +1,359 @@ +# Endian Handling + +## Overview + +The `endian` namespace provides byte-order conversion for reading and writing multi-byte numeric values in big-endian or little-endian format. This is the lowest layer of the I/O system, called by `stream_reader::read_num()` and `stream_writer::write_num()`. + +Defined in `include/endian_str.h`, implemented in `src/endian_str.cpp`. + +--- + +## Endianness Enum + +```cpp +namespace endian { + +enum class endian +{ + big, + little +}; + +} +``` + +- `endian::big` — Most significant byte first. Default for Java Edition NBT (per the Minecraft specification). +- `endian::little` — Least significant byte first. Used by Bedrock Edition NBT. 
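
To illustrate the two byte orders, the sketch below encodes a 32-bit value into four bytes for either setting. The names `byte_order` and `encode_u32` are hypothetical and chosen to avoid confusion with the real `endian` namespace; this shows the layout only and is not how libnbt++ implements it.

```cpp
#include <array>
#include <cstdint>

enum class byte_order { big, little };

// Splits x into 4 bytes, most significant first (big) or last (little).
inline std::array<std::uint8_t, 4> encode_u32(std::uint32_t x, byte_order e)
{
    std::array<std::uint8_t, 4> out;
    for (int i = 0; i < 4; ++i) {
        // Big-endian: byte 0 holds bits 31..24. Little-endian: byte 0 holds bits 7..0.
        const int shift = (e == byte_order::big) ? 8 * (3 - i) : 8 * i;
        out[i] = static_cast<std::uint8_t>((x >> shift) & 0xFFu);
    }
    return out;
}
```

For `0x0A0B0C0D`, the big-endian result is `0A 0B 0C 0D` and the little-endian result is `0D 0C 0B 0A`.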
+ +--- + +## Public API + +### Read Functions + +```cpp +namespace endian { + +template <class T> +void read(std::istream& is, T& x, endian e); + +void read_little(std::istream& is, uint8_t& x); +void read_little(std::istream& is, int8_t& x); +void read_little(std::istream& is, uint16_t& x); +void read_little(std::istream& is, int16_t& x); +void read_little(std::istream& is, uint32_t& x); +void read_little(std::istream& is, int32_t& x); +void read_little(std::istream& is, uint64_t& x); +void read_little(std::istream& is, int64_t& x); +void read_little(std::istream& is, float& x); +void read_little(std::istream& is, double& x); + +void read_big(std::istream& is, uint8_t& x); +void read_big(std::istream& is, int8_t& x); +void read_big(std::istream& is, uint16_t& x); +void read_big(std::istream& is, int16_t& x); +void read_big(std::istream& is, uint32_t& x); +void read_big(std::istream& is, int32_t& x); +void read_big(std::istream& is, uint64_t& x); +void read_big(std::istream& is, int64_t& x); +void read_big(std::istream& is, float& x); +void read_big(std::istream& is, double& x); + +} +``` + +### Write Functions + +```cpp +namespace endian { + +template <class T> +void write(std::ostream& os, T x, endian e); + +void write_little(std::ostream& os, uint8_t x); +void write_little(std::ostream& os, int8_t x); +void write_little(std::ostream& os, uint16_t x); +void write_little(std::ostream& os, int16_t x); +void write_little(std::ostream& os, uint32_t x); +void write_little(std::ostream& os, int32_t x); +void write_little(std::ostream& os, uint64_t x); +void write_little(std::ostream& os, int64_t x); +void write_little(std::ostream& os, float x); +void write_little(std::ostream& os, double x); + +void write_big(std::ostream& os, uint8_t x); +void write_big(std::ostream& os, int8_t x); +void write_big(std::ostream& os, uint16_t x); +void write_big(std::ostream& os, int16_t x); +void write_big(std::ostream& os, uint32_t x); +void write_big(std::ostream& os, int32_t x); 
+void write_big(std::ostream& os, uint64_t x); +void write_big(std::ostream& os, int64_t x); +void write_big(std::ostream& os, float x); +void write_big(std::ostream& os, double x); + +} +``` + +--- + +## Template Dispatch + +The `read()` and `write()` templates dispatch to the correct endian-specific function: + +```cpp +template <class T> +void read(std::istream& is, T& x, endian e) +{ + switch (e) { + case endian::big: read_big(is, x); break; + case endian::little: read_little(is, x); break; + } +} + +template <class T> +void write(std::ostream& os, T x, endian e) +{ + switch (e) { + case endian::big: write_big(os, x); break; + case endian::little: write_little(os, x); break; + } +} +``` + +This is called by `stream_reader` and `stream_writer`: + +```cpp +// In stream_reader +template <class T> void read_num(T& x) +{ + endian::read(is, x, endian); +} + +// In stream_writer +template <class T> void write_num(T x) +{ + endian::write(os, x, endian); +} +``` + +--- + +## Implementation Details + +### Static Assertions + +The implementation begins with compile-time checks: + +```cpp +static_assert(CHAR_BIT == 8, "Assumes 8-bit bytes"); +static_assert(sizeof(float) == 4, "Assumes 32-bit float"); +static_assert(sizeof(double) == 8, "Assumes 64-bit double"); +``` + +### Single-Byte Types + +For `int8_t` and `uint8_t`, endianness is irrelevant — the byte is read/written directly: + +```cpp +void read_little(std::istream& is, uint8_t& x) +{ + x = is.get(); +} + +void write_little(std::ostream& os, uint8_t x) +{ + os.put(x); +} + +// Same for read_big/write_big +``` + +### Multi-Byte Integer Types + +Bytes are read/written individually and assembled in the correct order. 
+ +**Big-endian read (most significant byte first):** +```cpp +void read_big(std::istream& is, uint16_t& x) +{ + uint8_t bytes[2]; + is.read(reinterpret_cast<char*>(bytes), 2); + x = static_cast<uint16_t>(bytes[0]) << 8 + | static_cast<uint16_t>(bytes[1]); +} + +void read_big(std::istream& is, uint32_t& x) +{ + uint8_t bytes[4]; + is.read(reinterpret_cast<char*>(bytes), 4); + x = static_cast<uint32_t>(bytes[0]) << 24 + | static_cast<uint32_t>(bytes[1]) << 16 + | static_cast<uint32_t>(bytes[2]) << 8 + | static_cast<uint32_t>(bytes[3]); +} + +void read_big(std::istream& is, uint64_t& x) +{ + uint8_t bytes[8]; + is.read(reinterpret_cast<char*>(bytes), 8); + x = static_cast<uint64_t>(bytes[0]) << 56 + | static_cast<uint64_t>(bytes[1]) << 48 + | static_cast<uint64_t>(bytes[2]) << 40 + | static_cast<uint64_t>(bytes[3]) << 32 + | static_cast<uint64_t>(bytes[4]) << 24 + | static_cast<uint64_t>(bytes[5]) << 16 + | static_cast<uint64_t>(bytes[6]) << 8 + | static_cast<uint64_t>(bytes[7]); +} +``` + +**Little-endian read (least significant byte first):** +```cpp +void read_little(std::istream& is, uint16_t& x) +{ + uint8_t bytes[2]; + is.read(reinterpret_cast<char*>(bytes), 2); + x = static_cast<uint16_t>(bytes[1]) << 8 + | static_cast<uint16_t>(bytes[0]); +} +``` + +**Big-endian write:** +```cpp +void write_big(std::ostream& os, uint16_t x) +{ + os.put(static_cast<char>(x >> 8)); + os.put(static_cast<char>(x)); +} + +void write_big(std::ostream& os, uint32_t x) +{ + os.put(static_cast<char>(x >> 24)); + os.put(static_cast<char>(x >> 16)); + os.put(static_cast<char>(x >> 8)); + os.put(static_cast<char>(x)); +} +``` + +**Little-endian write:** +```cpp +void write_little(std::ostream& os, uint16_t x) +{ + os.put(static_cast<char>(x)); + os.put(static_cast<char>(x >> 8)); +} +``` + +### Signed Types + +Signed integers delegate to unsigned via `reinterpret_cast`: + +```cpp +void read_big(std::istream& is, int16_t& x) +{ + read_big(is, reinterpret_cast<uint16_t&>(x)); +} + +void 
write_big(std::ostream& os, int16_t x) +{ + write_big(os, static_cast<uint16_t>(x)); +} +``` + +This works because the bit patterns are identical; only the interpretation differs. + +### Floating-Point Types + +Floats and doubles use `memcpy` to convert between floating-point and integer representations, avoiding undefined behavior from type-punning casts: + +```cpp +void read_big(std::istream& is, float& x) +{ + uint32_t tmp; + read_big(is, tmp); + std::memcpy(&x, &tmp, sizeof(x)); +} + +void write_big(std::ostream& os, float x) +{ + uint32_t tmp; + std::memcpy(&tmp, &x, sizeof(tmp)); + write_big(os, tmp); +} + +void read_big(std::istream& is, double& x) +{ + uint64_t tmp; + read_big(is, tmp); + std::memcpy(&x, &tmp, sizeof(x)); +} + +void write_big(std::ostream& os, double x) +{ + uint64_t tmp; + std::memcpy(&tmp, &x, sizeof(tmp)); + write_big(os, tmp); +} +``` + +The `memcpy` approach: +- Is well-defined in C++11 for trivially copyable types (unlike reading a `float` through a `reinterpret_cast` to an integer reference, which is UB) +- Assumes IEEE 754 representation; the `static_assert` checks on `sizeof(float)` and `sizeof(double)` verify only the sizes, not the format +- Is typically optimized by the compiler to a no-op or register move + +--- + +## Byte Layout Reference + +### Big-Endian (Java Edition Default) + +``` +Value: 0x12345678 (int32_t) +Memory: [0x12] [0x34] [0x56] [0x78] + MSB LSB + +Value: 3.14f (float, IEEE 754: 0x4048F5C3) +Memory: [0x40] [0x48] [0xF5] [0xC3] +``` + +### Little-Endian (Bedrock Edition) + +``` +Value: 0x12345678 (int32_t) +Memory: [0x78] [0x56] [0x34] [0x12] + LSB MSB + +Value: 3.14f (float, IEEE 754: 0x4048F5C3) +Memory: [0xC3] [0xF5] [0x48] [0x40] +``` + +--- + +## Supported Types + +| C++ Type | Size | NBT Use | +|----------|------|---------| +| `int8_t` / `uint8_t` | 1 byte | tag_byte, type bytes | +| `int16_t` / `uint16_t` | 2 bytes | tag_short, string lengths | +| `int32_t` / `uint32_t` | 4 bytes | tag_int, array/list lengths | +| `int64_t` / `uint64_t` | 8 bytes | tag_long | +| `float` | 4 bytes | tag_float | +| `double` | 8 bytes |
tag_double | + +--- + +## Design Rationale + +### Why Not Use System Endianness Detection? + +The implementation always performs explicit byte-by-byte construction rather than detecting the host endianness and potentially passing through. This approach: + +1. **Portable**: Works correctly on any architecture (big-endian, little-endian, or mixed) +2. **Simple**: No preprocessor conditionals or platform detection +3. **Correct**: No alignment issues since bytes are assembled individually +4. **Predictable**: Same code path on all platforms + +### Why memcpy for Floats? + +C++ standards do not guarantee that `reinterpret_cast<uint32_t&>(float_val)` produces defined behavior (strict aliasing violation). `memcpy` is the standard-sanctioned way to perform type punning between unrelated types, and modern compilers optimize it to equivalent machine code. diff --git a/docs/handbook/libnbtplusplus/io-system.md b/docs/handbook/libnbtplusplus/io-system.md new file mode 100644 index 0000000000..9f0d543a51 --- /dev/null +++ b/docs/handbook/libnbtplusplus/io-system.md @@ -0,0 +1,672 @@ +# I/O System + +## Overview + +The `nbt::io` namespace provides the binary serialization layer for reading and writing NBT data. The two central classes are `stream_reader` and `stream_writer`, both operating on standard C++ streams (`std::istream` / `std::ostream`). 
+ +Defined in: +- `include/io/stream_reader.h` / `src/io/stream_reader.cpp` +- `include/io/stream_writer.h` / `src/io/stream_writer.cpp` + +--- + +## stream_reader + +### Class Definition + +```cpp +class NBT_EXPORT stream_reader +{ +public: + explicit stream_reader(std::istream& is, + endian::endian e = endian::endian::big); + + std::istream& get_istr() const { return is; } + endian::endian get_endian() const { return endian; } + + // Read named + typed tags + std::pair<std::string, std::unique_ptr<tag>> read_tag(); + + // Read payload only (for tags whose type is already known) + std::unique_ptr<tag> read_payload(tag_type type); + + // Read a type byte + tag_type read_type(bool allow_end); + + // Read a length-prefixed UTF-8 string + std::string read_string(); + + // Read a numeric value in the configured endianness + template <class T> void read_num(T& x); + + static const unsigned int MAX_DEPTH = 1024; + +private: + std::istream& is; + endian::endian endian; + unsigned int depth_ = 0; +}; +``` + +### Constructor + +```cpp +stream_reader(std::istream& is, endian::endian e = endian::endian::big); +``` + +- `is`: The input stream to read from +- `e`: Byte order — `endian::big` (default, Java edition NBT) or `endian::little` (Bedrock edition) + +### read_tag() — Read a Complete Named Tag + +```cpp +std::pair<std::string, std::unique_ptr<tag>> read_tag(); +``` + +Reads a complete tag from the stream: +1. Reads the type byte +2. If type is `End`, returns `{"", nullptr}` (end-of-compound sentinel) +3. Reads the name string +4. Reads the payload via `read_payload()` + +Returns a pair of `{name, tag_ptr}`. 
+ +Implementation: +```cpp +std::pair<std::string, std::unique_ptr<tag>> +stream_reader::read_tag() +{ + tag_type type = read_type(true); + if (type == tag_type::End) + return {"", nullptr}; + + std::string name = read_string(); + auto tag = read_payload(type); + return {std::move(name), std::move(tag)}; +} +``` + +### read_payload() — Read a Tag Payload + +```cpp +std::unique_ptr<tag> read_payload(tag_type type); +``` + +Creates a tag of the specified type, then calls its `read_payload()` virtual method. Tracks recursive nesting depth, throwing `io::input_error` if `MAX_DEPTH` (1024) is exceeded. + +Implementation: +```cpp +std::unique_ptr<tag> stream_reader::read_payload(tag_type type) +{ + if (++depth_ > MAX_DEPTH) + throw input_error("Maximum nesting depth exceeded"); + + auto ret = tag::create(type); + ret->read_payload(*this); + + --depth_; + return ret; +} +``` + +The `tag::create()` factory instantiates the correct concrete class: +```cpp +std::unique_ptr<tag> tag::create(tag_type type) +{ + switch (type) { + case tag_type::Byte: return make_unique<tag_byte>(); + case tag_type::Short: return make_unique<tag_short>(); + case tag_type::Int: return make_unique<tag_int>(); + case tag_type::Long: return make_unique<tag_long>(); + case tag_type::Float: return make_unique<tag_float>(); + case tag_type::Double: return make_unique<tag_double>(); + case tag_type::Byte_Array: return make_unique<tag_byte_array>(); + case tag_type::String: return make_unique<tag_string>(); + case tag_type::List: return make_unique<tag_list>(); + case tag_type::Compound: return make_unique<tag_compound>(); + case tag_type::Int_Array: return make_unique<tag_int_array>(); + case tag_type::Long_Array: return make_unique<tag_long_array>(); + default: + throw std::invalid_argument("Invalid tag type: " + + std::to_string(static_cast<int>(type))); + } +} +``` + +### read_type() — Read and Validate Type Byte + +```cpp +tag_type read_type(bool allow_end); +``` + +Reads a single byte, casts to 
`tag_type`, and validates: +```cpp +tag_type stream_reader::read_type(bool allow_end) +{ + int type = is.get(); + if (!is) + throw input_error("Error reading tag type"); + if (!is_valid_type(type, allow_end)) + throw input_error("Invalid tag type: " + + std::to_string(type)); + return static_cast<tag_type>(type); +} +``` + +The `allow_end` parameter controls whether `tag_type::End` (0) is accepted — it's valid when reading list element types or compound children, but not at the top level of a standalone tag. + +### read_string() — Read Length-Prefixed String + +```cpp +std::string read_string(); +``` + +Reads a 2-byte unsigned length, then that many bytes of UTF-8 data: +```cpp +std::string stream_reader::read_string() +{ + uint16_t len; + read_num(len); + if (!is) + throw input_error("Error reading string length"); + std::string str(len, '\0'); + is.read(&str[0], len); + if (!is) + throw input_error("Error reading string"); + return str; +} +``` + +Maximum string length: 65535 bytes (uint16_t max). + +### read_num() — Read Numeric Value + +```cpp +template <class T> void read_num(T& x) +{ + endian::read(is, x, endian); +} +``` + +Delegates to the `endian` namespace for endianness-appropriate reading. 
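
As a rough illustration of how `read_string()` combines a length read with a raw byte read, here is a standalone sketch. `read_prefixed_string` is a hypothetical function; it assumes big-endian order and throws `std::runtime_error` where the library would throw `io::input_error`:

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical free function that follows the documented steps:
// a 2-byte unsigned length (big-endian here), then that many raw bytes.
std::string read_prefixed_string(std::istream& is)
{
    unsigned char len_bytes[2];
    is.read(reinterpret_cast<char*>(len_bytes), 2);
    if (!is)
        throw std::runtime_error("Error reading string length");

    const std::uint16_t len =
        static_cast<std::uint16_t>((len_bytes[0] << 8) | len_bytes[1]);

    std::string str(len, '\0');
    is.read(&str[0], len);
    if (!is)  // a truncated stream sets failbit and lands here
        throw std::runtime_error("Error reading string");
    return str;
}
```

Feeding it a stream that declares five bytes but carries only two exercises the second error path, which is the truncated-data case listed under error handling.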
+ +--- + +## stream_writer + +### Class Definition + +```cpp +class NBT_EXPORT stream_writer +{ +public: + explicit stream_writer(std::ostream& os, + endian::endian e = endian::endian::big); + + std::ostream& get_ostr() const { return os; } + endian::endian get_endian() const { return endian; } + + void write_type(tag_type type); + void write_string(const std::string& str); + void write_payload(const tag& t); + template <class T> void write_num(T x); + + static constexpr size_t max_string_len = UINT16_MAX; + static constexpr int32_t max_array_len = INT32_MAX; + +private: + std::ostream& os; + endian::endian endian; +}; +``` + +### Constructor + +```cpp +stream_writer(std::ostream& os, endian::endian e = endian::endian::big); +``` + +- `os`: The output stream to write to +- `e`: Byte order — `endian::big` (default) or `endian::little` + +### write_tag() — Free Function + +```cpp +void write_tag(const std::string& name, const tag& t, + std::ostream& os, + endian::endian e = endian::endian::big); +``` + +This is a **free function** (not a member). It writes a complete named tag: +1. Writes the type byte +2. Writes the name string +3. 
Writes the payload + +```cpp +void write_tag(const std::string& name, const tag& t, + std::ostream& os, endian::endian e) +{ + stream_writer writer(os, e); + writer.write_type(t.get_type()); + writer.write_string(name); + t.write_payload(writer); +} +``` + +### write_type() — Write Type Byte + +```cpp +void stream_writer::write_type(tag_type type) +{ + os.put(static_cast<char>(type)); + if (!os) + throw std::runtime_error("Error writing tag type"); +} +``` + +### write_string() — Write Length-Prefixed String + +```cpp +void stream_writer::write_string(const std::string& str) +{ + if (str.size() > max_string_len) { + os.setstate(std::ios::failbit); + throw std::length_error("String is too long for NBT"); + } + write_num(static_cast<uint16_t>(str.size())); + os.write(str.data(), str.size()); + if (!os) + throw std::runtime_error("Error writing string"); +} +``` + +Strings longer than 65535 bytes trigger a `std::length_error`. + +### write_payload() — Write Tag Payload + +```cpp +void stream_writer::write_payload(const tag& t) +{ + t.write_payload(*this); +} +``` + +Delegates to the tag's virtual `write_payload()` method. 
+ +### write_num() — Write Numeric Value + +```cpp +template <class T> void write_num(T x) +{ + endian::write(os, x, endian); +} +``` + +--- + +## Free Functions + +### Reading + +```cpp +// In nbt::io namespace + +std::pair<std::string, std::unique_ptr<tag>> +read_compound(std::istream& is, + endian::endian e = endian::endian::big); + +std::pair<std::string, std::unique_ptr<tag>> +read_tag(std::istream& is, + endian::endian e = endian::endian::big); +``` + +**`read_compound()`** reads and validates that the top-level tag is a compound: + +```cpp +std::pair<std::string, std::unique_ptr<tag>> +read_compound(std::istream& is, endian::endian e) +{ + stream_reader reader(is, e); + auto result = reader.read_tag(); + if (!result.second || result.second->get_type() != tag_type::Compound) + throw input_error("Top-level tag is not a compound"); + return result; +} +``` + +**`read_tag()`** reads any tag without type restriction: + +```cpp +std::pair<std::string, std::unique_ptr<tag>> +read_tag(std::istream& is, endian::endian e) +{ + stream_reader reader(is, e); + return reader.read_tag(); +} +``` + +### Writing + +```cpp +void write_tag(const std::string& name, const tag& t, + std::ostream& os, + endian::endian e = endian::endian::big); +``` + +Writes a complete named tag (type + name + payload). See above. 
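
The three parts can be seen in the bytes themselves. The sketch below hand-encodes a named Int tag in big-endian order. `encode_named_int` is a hypothetical helper, and the type ID 3 for Int is taken from the NBT format rather than from this document:

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: type byte, then length-prefixed name, then payload.
std::string encode_named_int(const std::string& name, std::int32_t v)
{
    if (name.size() > 0xFFFF)
        throw std::length_error("String is too long for NBT");

    std::string out;
    out.push_back('\x03');                                 // type byte (Int)
    out.push_back(static_cast<char>(name.size() >> 8));    // name length, high byte
    out.push_back(static_cast<char>(name.size() & 0xFF));  // name length, low byte
    out += name;                                           // name bytes

    const std::uint32_t u = static_cast<std::uint32_t>(v);
    for (int shift = 24; shift >= 0; shift -= 8)           // payload, MSB first
        out.push_back(static_cast<char>((u >> shift) & 0xFF));
    return out;
}
```

For the name `"hp"` and the value 20 this yields nine bytes: one for the type, two for the length, two for the name, and four for the payload.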
+ +--- + +## Error Handling + +### input_error + +```cpp +class input_error : public std::runtime_error +{ +public: + using std::runtime_error::runtime_error; +}; +``` + +Thrown by `stream_reader` for all parse errors: +- Invalid tag type bytes +- Stream read failures +- Negative array/list lengths +- Maximum nesting depth exceeded +- Corrupt or truncated data + +### Stream State Errors + +Write errors set stream failbit and throw: +- `std::runtime_error` for general write failures +- `std::length_error` for strings exceeding `max_string_len` (65535 bytes) +- `std::length_error` for arrays/lists exceeding `max_array_len` (INT32_MAX elements) +- `std::logic_error` for list type inconsistencies during write + +--- + +## Payload Format Per Tag Type + +Each concrete tag class implements its own `read_payload()` and `write_payload()`: + +### Primitives (tag_byte, tag_short, tag_int, tag_long, tag_float, tag_double) + +```cpp +// In tag_primitive.h (inline) +void read_payload(io::stream_reader& reader) override +{ + reader.read_num(val); +} + +void write_payload(io::stream_writer& writer) const override +{ + writer.write_num(val); +} +``` + +Simply reads/writes the raw value in the configured endianness. + +| Type | Payload Size | +|------|-------------| +| tag_byte | 1 byte | +| tag_short | 2 bytes | +| tag_int | 4 bytes | +| tag_long | 8 bytes | +| tag_float | 4 bytes | +| tag_double | 8 bytes | + +### tag_string + +Payload: 2-byte length + UTF-8 data. + +```cpp +void tag_string::read_payload(io::stream_reader& reader) +{ + val = reader.read_string(); +} + +void tag_string::write_payload(io::stream_writer& writer) const +{ + writer.write_string(val); +} +``` + +### tag_array<T> + +Payload: 4-byte signed length + elements. 
+ +Specialized for different element types: + +**tag_byte_array** (int8_t) — raw block read/write: +```cpp +// Specialization for int8_t (byte array) +void tag_array<int8_t>::read_payload(io::stream_reader& reader) +{ + int32_t length; + reader.read_num(length); + if (length < 0) + reader.get_istr().setstate(std::ios::failbit); + if (!reader.get_istr()) + throw io::input_error("Error reading length of tag_byte_array"); + data.resize(length); + reader.get_istr().read(reinterpret_cast<char*>(data.data()), length); + if (!reader.get_istr()) + throw io::input_error("Error reading tag_byte_array"); +} +``` + +**tag_long_array** (int64_t) — element-by-element: +```cpp +// Specialization for int64_t (long array) +void tag_array<int64_t>::read_payload(io::stream_reader& reader) +{ + int32_t length; + reader.read_num(length); + if (length < 0) + reader.get_istr().setstate(std::ios::failbit); + if (!reader.get_istr()) + throw io::input_error("Error reading length of tag_long_array"); + data.clear(); + data.reserve(length); + for (int32_t i = 0; i < length; ++i) { + int64_t val; + reader.read_num(val); + data.push_back(val); + } + if (!reader.get_istr()) + throw io::input_error("Error reading tag_long_array"); +} +``` + +**Generic T** (int32_t for tag_int_array): +```cpp +template <class T> +void tag_array<T>::read_payload(io::stream_reader& reader) +{ + int32_t length; + reader.read_num(length); + if (length < 0) + reader.get_istr().setstate(std::ios::failbit); + if (!reader.get_istr()) + throw io::input_error("Error reading length of tag_array"); + data.clear(); + data.reserve(length); + for (int32_t i = 0; i < length; ++i) { + T val; + reader.read_num(val); + data.push_back(val); + } + if (!reader.get_istr()) + throw io::input_error("Error reading tag_array"); +} +``` + +### tag_compound + +Payload: sequence of complete named tags, terminated by `tag_type::End` (single 0 byte): + +```cpp +void tag_compound::read_payload(io::stream_reader& reader) +{ + clear(); + 
std::pair<std::string, std::unique_ptr<tag>> entry; + while ((entry = reader.read_tag()).second) + tags.emplace(std::move(entry.first), std::move(entry.second)); + if (!reader.get_istr()) + throw io::input_error("Error reading tag_compound"); +} + +void tag_compound::write_payload(io::stream_writer& writer) const +{ + for (const auto& pair : tags) { + writer.write_type(pair.second.get_type()); + writer.write_string(pair.first); + pair.second.get().write_payload(writer); + } + writer.write_type(tag_type::End); +} +``` + +### tag_list + +Payload: 1-byte element type + 4-byte signed length + element payloads (without type bytes): + +(See the [list-tags.md](list-tags.md) document for the full implementation.) + +--- + +## Depth Tracking + +`stream_reader` tracks recursive depth to prevent stack overflow from maliciously crafted NBT data with deeply nested compounds or lists: + +```cpp +static const unsigned int MAX_DEPTH = 1024; +``` + +Each call to `read_payload()` increments `depth_`, and decrements on return. If `depth_` exceeds 1024, an `io::input_error` is thrown. + +This is critical for security — without depth limits, a crafted file with thousands of nested compounds could cause a stack overflow. + +--- + +## Endianness + +Both `stream_reader` and `stream_writer` take an `endian::endian` parameter: + +| Value | Use Case | +|-------|----------| +| `endian::big` | Java Edition NBT (default, per Minecraft specification) | +| `endian::little` | Bedrock Edition NBT | + +The endianness affects all numeric reads/writes (lengths, primitive values, etc.) but not single bytes (type, byte values). 
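
A small sketch can show the difference between multi-byte and single-byte fields. It encodes a byte-array payload in either order; `encode_byte_array` is a hypothetical helper, not a libnbt++ function:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: 4-byte length in the chosen order, then raw bytes.
std::string encode_byte_array(const std::vector<std::int8_t>& data,
                              bool big_endian)
{
    const std::uint32_t len = static_cast<std::uint32_t>(data.size());
    std::string out;
    for (int i = 0; i < 4; ++i) {
        // Only the length prefix is sensitive to byte order.
        const int shift = big_endian ? 8 * (3 - i) : 8 * i;
        out.push_back(static_cast<char>((len >> shift) & 0xFF));
    }
    for (std::int8_t b : data)
        out.push_back(static_cast<char>(b));  // single bytes are written as-is
    return out;
}
```

Switching the order reverses the four length bytes while the element bytes stay in place.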
+ +--- + +## Usage Examples + +### Reading a File + +```cpp +#include <nbt_tags.h> +#include <io/stream_reader.h> +#include <fstream> + +std::ifstream file("level.dat", std::ios::binary); +auto result = nbt::io::read_compound(file); + +std::string name = result.first; // Root tag name +tag_compound& root = result.second->as<tag_compound>(); + +int32_t version = static_cast<int32_t>(root.at("version")); +``` + +### Reading with zlib Decompression + +```cpp +#include <io/izlibstream.h> + +std::ifstream file("level.dat", std::ios::binary); +zlib::izlibstream zs(file); +auto result = nbt::io::read_compound(zs); +``` + +### Writing a File + +```cpp +#include <io/stream_writer.h> +#include <fstream> + +tag_compound root{ + {"Data", tag_compound{ + {"version", int32_t(19133)}, + {"LevelName", std::string("My World")} + }} +}; + +std::ofstream file("level.dat", std::ios::binary); +nbt::io::write_tag("", root, file); +``` + +### Writing with zlib Compression + +```cpp +#include <io/ozlibstream.h> + +std::ofstream file("level.dat", std::ios::binary); +zlib::ozlibstream zs(file); +nbt::io::write_tag("", root, zs); +zs.close(); +``` + +### Little-Endian (Bedrock) + +```cpp +auto result = nbt::io::read_compound(file, endian::endian::little); +nbt::io::write_tag("", root, file, endian::endian::little); +``` + +### Roundtrip Test + +```cpp +// Write +std::stringstream ss; +nbt::io::write_tag("test", original_root, ss); + +// Read back +ss.seekg(0); +auto [name, tag] = nbt::io::read_tag(ss); +assert(name == "test"); +assert(*tag == original_root); +``` + +--- + +## Wire Format Summary + +``` +Named Tag: + [type: 1 byte] [name_length: 2 bytes] [name: N bytes] [payload: variable] + +Compound Payload: + [child_tag_1] [child_tag_2] ... [End: 0x00] + +List Payload: + [element_type: 1 byte] [length: 4 bytes] [payload_1] [payload_2] ... + +String Payload: + [length: 2 bytes] [data: N bytes, UTF-8] + +Array Payload (Byte/Int/Long): + [length: 4 bytes] [element_1] [element_2] ... 
+ +Primitive Payloads: + Byte: 1 byte + Short: 2 bytes + Int: 4 bytes + Long: 8 bytes + Float: 4 bytes (IEEE 754) + Double: 8 bytes (IEEE 754) +``` + +All multi-byte values use the configured endianness (big-endian by default). diff --git a/docs/handbook/libnbtplusplus/list-tags.md b/docs/handbook/libnbtplusplus/list-tags.md new file mode 100644 index 0000000000..f3ca7dabb4 --- /dev/null +++ b/docs/handbook/libnbtplusplus/list-tags.md @@ -0,0 +1,682 @@ +# List Tags + +## Overview + +`tag_list` represents an ordered collection of unnamed tags that all share the same type. It is the NBT equivalent of a typed array — all elements must be the same `tag_type`, and elements are accessed by index rather than by name. + +Defined in `include/tag_list.h`, implemented in `src/tag_list.cpp`. + +--- + +## Class Definition + +```cpp +class NBT_EXPORT tag_list final : public detail::crtp_tag<tag_list> +{ +public: + typedef std::vector<value>::iterator iterator; + typedef std::vector<value>::const_iterator const_iterator; + static constexpr tag_type type = tag_type::List; + + template <class T> static tag_list of(std::initializer_list<T> init); + + tag_list() : tag_list(tag_type::Null) {} + explicit tag_list(tag_type content_type) : el_type_(content_type) {} + + // Initializer list constructors for each supported type + tag_list(std::initializer_list<int8_t> init); + tag_list(std::initializer_list<int16_t> init); + tag_list(std::initializer_list<int32_t> init); + tag_list(std::initializer_list<int64_t> init); + tag_list(std::initializer_list<float> init); + tag_list(std::initializer_list<double> init); + tag_list(std::initializer_list<std::string> init); + tag_list(std::initializer_list<tag_byte_array> init); + tag_list(std::initializer_list<tag_list> init); + tag_list(std::initializer_list<tag_compound> init); + tag_list(std::initializer_list<tag_int_array> init); + tag_list(std::initializer_list<tag_long_array> init); + tag_list(std::initializer_list<value> init); + + value& 
at(size_t i); + const value& at(size_t i) const; + value& operator[](size_t i); + const value& operator[](size_t i) const; + + void set(size_t i, value&& val); + void push_back(value_initializer&& val); + template <class T, class... Args> void emplace_back(Args&&... args); + void pop_back(); + + tag_type el_type() const; + size_t size() const; + void clear(); + void reset(tag_type type = tag_type::Null); + + iterator begin(); iterator end(); + const_iterator begin() const; const_iterator end() const; + const_iterator cbegin() const; const_iterator cend() const; + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + + friend NBT_EXPORT bool operator==(const tag_list& lhs, const tag_list& rhs); + friend NBT_EXPORT bool operator!=(const tag_list& lhs, const tag_list& rhs); + +private: + std::vector<value> tags; + tag_type el_type_; + + template <class T, class Arg> void init(std::initializer_list<Arg> il); +}; +``` + +--- + +## Internal Storage + +`tag_list` stores its elements in a `std::vector<value>` and tracks the content type in `el_type_` (a `tag_type` value). + +### Element Type Tracking + +The `el_type_` field records what type of tags the list contains: + +- **Determined**: Set to a specific `tag_type` (e.g., `tag_type::Int`) when elements are present or the type has been set explicitly +- **Undetermined**: Set to `tag_type::Null` when the list is empty and no type has been specified + +The element type is automatically determined when the first element is added to an undetermined list. + +--- + +## Construction + +### Default Constructor + +```cpp +tag_list() // Empty list with undetermined type (tag_type::Null) +``` + +### Typed Empty Constructor + +```cpp +tag_list(tag_type::Int) // Empty list, but typed as Int +``` + +This is useful when you need an empty list that will later hold elements of a specific type. 
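
The tracking rule can be modeled in isolation. The sketch below is a hypothetical miniature (`typed_list`, `kind`), not the real `tag_list`: it stores only type codes, and its `push_back` follows the behavior this handbook describes for `tag_list::push_back`:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical type codes standing in for tag_type.
enum class kind { null_, int_, string_ };

class typed_list {
public:
    typed_list() = default;                        // undetermined type
    explicit typed_list(kind k) : el_kind_(k) {}   // typed but empty

    void push_back(kind k)
    {
        if (k == kind::null_)
            throw std::invalid_argument("The value must not be null");
        if (el_kind_ == kind::null_)
            el_kind_ = k;                          // first element fixes the type
        else if (el_kind_ != k)
            throw std::invalid_argument("type mismatch");
        items_.push_back(k);
    }

    void clear() { items_.clear(); }               // keeps the element type
    kind el_kind() const { return el_kind_; }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<kind> items_;
    kind el_kind_ = kind::null_;
};
```

A list built with the typed constructor rejects a mismatching first element, whereas a default-constructed list accepts any non-null first element and adopts its type.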
+ +### Initializer List Constructors + +`tag_list` provides 12 initializer list constructors, one for each concrete element type: + +```cpp +tag_list bytes{int8_t(1), int8_t(2), int8_t(3)}; // List of tag_byte +tag_list shorts{int16_t(100), int16_t(200)}; // List of tag_short +tag_list ints{1, 2, 3, 4, 5}; // List of tag_int +tag_list longs{int64_t(1), int64_t(2)}; // List of tag_long +tag_list floats{1.0f, 2.0f, 3.0f}; // List of tag_float +tag_list doubles{1.0, 2.0, 3.0}; // List of tag_double +tag_list strings{"hello", "world"}; // List of tag_string (fails: not std::string) + +tag_list byte_arrays{ + tag_byte_array{1, 2, 3}, + tag_byte_array{4, 5, 6} +}; // List of tag_byte_array + +tag_list nested_lists{ + tag_list{1, 2, 3}, + tag_list{4, 5, 6} +}; // List of tag_list + +tag_list compounds{ + tag_compound{{"name", "a"}}, + tag_compound{{"name", "b"}} +}; // List of tag_compound +``` + +Each constructor delegates to the private `init<T, Arg>()` template: + +```cpp +template <class T, class Arg> +void tag_list::init(std::initializer_list<Arg> init) +{ + el_type_ = T::type; + tags.reserve(init.size()); + for (const Arg& arg : init) + tags.emplace_back(nbt::make_unique<T>(arg)); +} +``` + +### Value Initializer List Constructor + +```cpp +tag_list(std::initializer_list<value> init); +``` + +Constructs a list from `value` objects. All values must be the same type, or an exception is thrown. 
+ +Implementation: +```cpp +tag_list::tag_list(std::initializer_list<value> init) +{ + if (init.size() == 0) + el_type_ = tag_type::Null; + else { + el_type_ = init.begin()->get_type(); + for (const value& val : init) { + if (!val || val.get_type() != el_type_) + throw std::invalid_argument( + "The values are not all the same type"); + } + tags.assign(init.begin(), init.end()); + } +} +``` + +### Static of<T>() Factory + +```cpp +template <class T> static tag_list of(std::initializer_list<T> init); +``` + +Creates a list with elements of type `T`, where each element is constructed from the corresponding value in the initializer list. Most commonly used for creating lists of compounds: + +```cpp +auto list = tag_list::of<tag_compound>({ + {{"name", "Item 1"}, {"count", int32_t(64)}}, + {{"name", "Item 2"}, {"count", int32_t(32)}} +}); + +auto shorts = tag_list::of<tag_short>({100, 200, 300}); +auto bytes = tag_list::of<tag_byte>({1, 2, 3, 4, 5}); +``` + +Implementation: +```cpp +template <class T> tag_list tag_list::of(std::initializer_list<T> il) +{ + tag_list result; + result.init<T, T>(il); + return result; +} +``` + +--- + +## Type Enforcement + +`tag_list` enforces type homogeneity at runtime. Every operation that modifies the list checks that the new element matches the list's content type. + +### How Type Enforcement Works + +1. **Empty lists** have `el_type_ == tag_type::Null` (undetermined) +2. When the **first element** is added, `el_type_` is set to that element's type +3. Subsequent additions must have the **same type** or `std::invalid_argument` is thrown +4. 
`clear()` preserves the content type; `reset()` clears and optionally changes it + +### Example + +```cpp +tag_list list; // el_type_ == tag_type::Null + +list.push_back(int32_t(42)); // el_type_ becomes tag_type::Int +list.push_back(int32_t(99)); // OK: same type + +list.push_back(int16_t(5)); // throws std::invalid_argument +// "The tag type does not match the list's content type" + +list.push_back(std::string("hello")); // throws std::invalid_argument +``` + +--- + +## Element Access + +### operator[] — Unchecked Access + +```cpp +value& operator[](size_t i) { return tags[i]; } +const value& operator[](size_t i) const { return tags[i]; } +``` + +No bounds checking. Behavior is undefined if `i >= size()`. + +### at() — Bounds-Checked Access + +```cpp +value& at(size_t i); +const value& at(size_t i) const; +``` + +Throws `std::out_of_range` if `i >= size()`. + +```cpp +tag_list list{1, 2, 3}; + +value& first = list[0]; // tag_int(1) +value& second = list.at(1); // tag_int(2), bounds-checked + +list.at(10); // throws std::out_of_range +``` + +### Accessing the Contained Tag + +Since each element is a `value`, you can access the underlying tag: + +```cpp +tag_list list{1, 2, 3}; + +// Via value's conversion operators +int32_t val = static_cast<int32_t>(list[0]); + +// Via as<T>() +tag_int& tag = list[0].as<tag_int>(); +int32_t raw = tag.get(); + +// Via tag reference +const tag& t = list[0].get(); +``` + +--- + +## Modification + +### push_back() + +```cpp +void push_back(value_initializer&& val); +``` + +Appends a tag to the end of the list. If the list's type is undetermined, sets it. If the type mismatches, throws `std::invalid_argument`. Null values are rejected. 
+ +```cpp +tag_list list; +list.push_back(int32_t(1)); // list is now type Int +list.push_back(int32_t(2)); // OK +list.push_back(int16_t(3)); // throws: Short != Int +``` + +Implementation: +```cpp +void tag_list::push_back(value_initializer&& val) +{ + if (!val) + throw std::invalid_argument("The value must not be null"); + if (el_type_ == tag_type::Null) + el_type_ = val.get_type(); + else if (el_type_ != val.get_type()) + throw std::invalid_argument( + "The tag type does not match the list's content type"); + tags.push_back(std::move(val)); +} +``` + +### emplace_back() + +```cpp +template <class T, class... Args> void emplace_back(Args&&... args); +``` + +Constructs a tag of type `T` in-place at the end of the list. Type checking is performed against `T::type`. + +```cpp +tag_list list; +list.emplace_back<tag_int>(42); +list.emplace_back<tag_int>(99); +list.emplace_back<tag_short>(5); // throws: Short != Int +``` + +Implementation: +```cpp +template <class T, class... Args> +void tag_list::emplace_back(Args&&... args) +{ + if (el_type_ == tag_type::Null) + el_type_ = T::type; + else if (el_type_ != T::type) + throw std::invalid_argument( + "The tag type does not match the list's content type"); + tags.emplace_back(make_unique<T>(std::forward<Args>(args)...)); +} +``` + +### set() + +```cpp +void set(size_t i, value&& val); +``` + +Replaces the element at index `i`. Type checking is enforced — the new value must match `el_type_`. Throws `std::out_of_range` if the index is invalid. + +```cpp +tag_list list{1, 2, 3}; +list.set(1, value(tag_int(99))); // list is now {1, 99, 3} +``` + +Implementation: +```cpp +void tag_list::set(size_t i, value&& val) +{ + if (val.get_type() != el_type_) + throw std::invalid_argument( + "The tag type does not match the list's content type"); + tags.at(i) = std::move(val); +} +``` + +### pop_back() + +```cpp +void pop_back() { tags.pop_back(); } +``` + +Removes the last element. 
Does **not** change `el_type_`, even if the list becomes empty. + +### clear() + +```cpp +void clear() { tags.clear(); } +``` + +Removes all elements. **Preserves** the content type. + +### reset() + +```cpp +void reset(tag_type type = tag_type::Null); +``` + +Clears all elements **and** sets the content type. Defaults to `tag_type::Null` (undetermined). + +```cpp +tag_list list{1, 2, 3}; // type: Int +list.reset(); // empty, type: Null (undetermined) +list.reset(tag_type::String); // empty, type: String +``` + +--- + +## Content Type Query + +```cpp +tag_type el_type() const { return el_type_; } +``` + +Returns the content type of the list: +- A specific `tag_type` if determined +- `tag_type::Null` if undetermined + +```cpp +tag_list list; +list.el_type(); // tag_type::Null + +tag_list ints{1, 2, 3}; +ints.el_type(); // tag_type::Int + +tag_list typed(tag_type::String); +typed.el_type(); // tag_type::String +``` + +--- + +## Iteration + +`tag_list` provides full random-access iterator support over `value` elements: + +```cpp +iterator begin(); iterator end(); +const_iterator begin() const; const_iterator end() const; +const_iterator cbegin() const; const_iterator cend() const; +``` + +### Iteration Examples + +```cpp +tag_list list{10, 20, 30, 40, 50}; + +// Range-based for +for (const auto& val : list) { + int32_t num = static_cast<int32_t>(val); + std::cout << num << " "; +} +// Output: 10 20 30 40 50 + +// Index-based +for (size_t i = 0; i < list.size(); ++i) { + std::cout << static_cast<int32_t>(list[i]) << " "; +} + +// Iterator-based +for (auto it = list.begin(); it != list.end(); ++it) { + tag& t = it->get(); + // Process tag... +} +``` + +--- + +## Nested Access + +The `value` class delegates index-based access to `tag_list` when the held tag is a list. 
This enables chained access from compounds: + +```cpp +tag_compound root{ + {"items", tag_list::of<tag_compound>({ + {{"id", "sword"}, {"damage", int16_t(50)}}, + {{"id", "shield"}, {"damage", int16_t(100)}} + })} +}; + +// Access list element from compound +value& firstItem = root["items"][0]; +std::string id = static_cast<std::string>(firstItem["id"]); // "sword" + +// Bounds-checked +root["items"].at(99); // throws std::out_of_range +``` + +The delegation in `value`: +```cpp +value& value::operator[](size_t i) +{ + return dynamic_cast<tag_list&>(*tag_)[i]; +} + +value& value::at(size_t i) +{ + return dynamic_cast<tag_list&>(*tag_).at(i); +} +``` + +--- + +## Binary Format + +### Reading (Deserialization) + +A list tag's payload is: + +``` +[element type byte] [length (4 bytes, signed)] [element payloads...] +``` + +Implementation: +```cpp +void tag_list::read_payload(io::stream_reader& reader) +{ + tag_type lt = reader.read_type(true); + + int32_t length; + reader.read_num(length); + if (length < 0) + reader.get_istr().setstate(std::ios::failbit); + if (!reader.get_istr()) + throw io::input_error("Error reading length of tag_list"); + + if (lt != tag_type::End) { + reset(lt); + tags.reserve(length); + for (int32_t i = 0; i < length; ++i) + tags.emplace_back(reader.read_payload(lt)); + } else { + // tag_end type: leave type undetermined + reset(tag_type::Null); + } +} +``` + +Key behaviors: +- Element type `End` (0) means an empty list with undetermined type — the length is ignored +- Negative length sets failbit and throws `io::input_error` +- Each element is read as a payload-only tag (no type byte or name) + +### Writing (Serialization) + +```cpp +void tag_list::write_payload(io::stream_writer& writer) const +{ + if (size() > io::stream_writer::max_array_len) { + writer.get_ostr().setstate(std::ios::failbit); + throw std::length_error("List is too large for NBT"); + } + writer.write_type(el_type_ != tag_type::Null ? 
el_type_ : tag_type::End); + writer.write_num(static_cast<int32_t>(size())); + for (const auto& val : tags) { + if (val.get_type() != el_type_) { + writer.get_ostr().setstate(std::ios::failbit); + throw std::logic_error( + "The tags in the list do not all match the content type"); + } + writer.write_payload(val); + } +} +``` + +Key behaviors: +- Undetermined type (`Null`) is written as `End` (0) +- An additional consistency check verifies all elements match `el_type_` during write +- Lists exceeding `INT32_MAX` elements throw `std::length_error` + +--- + +## Equality Comparison + +Two lists are equal if they have the same element type **and** the same elements: + +```cpp +bool operator==(const tag_list& lhs, const tag_list& rhs) +{ + return lhs.el_type_ == rhs.el_type_ && lhs.tags == rhs.tags; +} +``` + +This means: +- An empty list of `tag_type::Int` is **not** equal to an empty list of `tag_type::String` +- An empty list with undetermined type **is** equal to another undetermined empty list + +--- + +## Common Usage Patterns + +### Creating a List of Compounds (Inventory Example) + +```cpp +tag_list inventory = tag_list::of<tag_compound>({ + {{"Slot", int8_t(0)}, {"id", "minecraft:diamond_sword"}, {"Count", int8_t(1)}}, + {{"Slot", int8_t(1)}, {"id", "minecraft:torch"}, {"Count", int8_t(64)}}, + {{"Slot", int8_t(2)}, {"id", "minecraft:apple"}, {"Count", int8_t(16)}} +}); +``` + +### Building a List Dynamically + +```cpp +tag_list positions; +for (const auto& pos : player_positions) { + positions.push_back(tag_compound{ + {"x", pos.x}, + {"y", pos.y}, + {"z", pos.z} + }); +} +``` + +### Processing a List of Compounds + +```cpp +tag_list& items = root->at("Items").as<tag_list>(); +for (size_t i = 0; i < items.size(); ++i) { + auto& item = items[i].as<tag_compound>(); + std::string id = static_cast<std::string>(item.at("id")); + int8_t count = static_cast<int8_t>(item.at("Count")); + std::cout << id << " x" << (int)count << "\n"; +} +``` + +### Nested Lists + 
+```cpp +tag_list outer = tag_list::of<tag_list>({ + tag_list{1, 2, 3}, // Inner list of Int + tag_list{4, 5, 6} // Inner list of Int +}); + +// Access: outer[0] → value wrapping tag_list{1, 2, 3} +// outer[0].as<tag_list>()[1] → tag_int(2) +``` + +### Converting Between List and Vector + +```cpp +// List to vector +tag_list list{1, 2, 3, 4, 5}; +std::vector<int32_t> vec; +for (const auto& val : list) { + vec.push_back(static_cast<int32_t>(val)); +} + +// Vector to list +tag_list result; +for (int32_t v : vec) { + result.push_back(v); +} +``` + +--- + +## Edge Cases + +### Empty Lists + +```cpp +tag_list empty1; // el_type_ == Null +tag_list empty2(tag_type::Int); // el_type_ == Int, size == 0 +tag_list empty3(tag_type::Null); // Same as default constructor + +// Read from NBT: a list with type End and length 0 +// → el_type_ = Null (undetermined) +``` + +### Clearing vs. Resetting + +```cpp +tag_list list{1, 2, 3}; // el_type_ = Int + +list.clear(); // size = 0, el_type_ = Int (preserved!) +list.push_back(int32_t(4)); // OK: type still Int + +list.reset(); // size = 0, el_type_ = Null +list.push_back("hello"); // OK: type becomes String +``` + +### Type Mismatch Prevention + +```cpp +tag_list list{1, 2, 3}; + +// These all throw std::invalid_argument: +list.push_back(int16_t(4)); // Short != Int +list.push_back("hello"); // String != Int +list.push_back(tag_compound{{"a", 1}}); // Compound != Int +list.set(0, value(tag_short(5))); // Short != Int +list.emplace_back<tag_short>(5); // Short != Int +``` diff --git a/docs/handbook/libnbtplusplus/overview.md b/docs/handbook/libnbtplusplus/overview.md new file mode 100644 index 0000000000..b2144a2bdc --- /dev/null +++ b/docs/handbook/libnbtplusplus/overview.md @@ -0,0 +1,422 @@ +# libnbt++ Overview + +## What is libnbt++? + +libnbt++ is a free C++ library for reading, writing, and manipulating Minecraft's **Named Binary Tag (NBT)** file format. 
It provides a modern C++11 interface for working with NBT data, supporting both compressed and uncompressed files, big-endian (Java Edition) and little-endian (Bedrock/Pocket Edition) byte orders, and full tag hierarchy manipulation. + +The library lives under the `nbt` namespace and provides strongly-typed tag classes that mirror the NBT specification exactly. It was originally created by ljfa-ag and is licensed under the GNU Lesser General Public License v3.0 (LGPL-3.0-or-later). + +libnbt++3 is a complete rewrite of the older libnbt++2, designed to eliminate boilerplate code and provide a more natural C++ syntax for NBT operations. + +--- + +## The NBT Format + +NBT (Named Binary Tag) is a binary serialization format invented by Markus "Notch" Persson for Minecraft. It is used throughout the game to store: + +- World save data (level.dat) +- Chunk data (region files) +- Player inventories and statistics +- Structure files +- Server configuration + +### Binary Structure + +An NBT file consists of a single named root tag, which is always a **Compound** tag. The binary layout is: + +``` +[tag type byte] [name length (2 bytes, big-endian)] [name (UTF-8)] [payload] +``` + +Each tag type has a specific format for its payload, and compound/list tags recursively contain other tags. + +### Compression + +NBT files in Minecraft are typically compressed with either **gzip** (most common for `.dat` files) or **zlib/deflate** (used in chunk data within region files). libnbt++ supports both through its optional zlib integration. 
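The header layout described under Binary Structure can be sketched without the library. This is a minimal illustration, assuming big-endian (Java Edition) data; `tag_header` and `read_tag_header` are hypothetical names and are not part of libnbt++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper type for illustration; not a libnbt++ class.
struct tag_header {
    std::int8_t type;  // tag type byte (10 = Compound for a root tag)
    std::string name;  // tag name, raw bytes as stored in the file
};

// Reads [type byte][2-byte big-endian name length][name bytes] from a stream.
inline tag_header read_tag_header(std::istream& is)
{
    char type_byte;
    unsigned char len_bytes[2];
    if (!is.get(type_byte) ||
        !is.read(reinterpret_cast<char*>(len_bytes), 2))
        throw std::runtime_error("unexpected end of input in tag header");

    // Assemble the length byte by byte so the result does not depend on
    // the byte order of the host machine.
    const std::size_t len = (std::size_t(len_bytes[0]) << 8) | len_bytes[1];

    std::string name(len, '\0');
    if (len > 0 && !is.read(&name[0], static_cast<std::streamsize>(len)))
        throw std::runtime_error("unexpected end of input in tag name");

    return tag_header{static_cast<std::int8_t>(type_byte), name};
}
```

Feeding it the eight bytes `0A 00 05 'L' 'e' 'v' 'e' 'l'` yields type 10 (Compound) and the name `Level`; the payload would follow immediately after. In practice `nbt::io::read_compound()` performs this step for you.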
+ +--- + +## Tag Types + +The NBT format defines 13 tag types, represented in libnbt++ by the `tag_type` enum class defined in `include/tag.h`: + +```cpp +enum class tag_type : int8_t { + End = 0, // Marks the end of a compound tag + Byte = 1, // Signed 8-bit integer + Short = 2, // Signed 16-bit integer + Int = 3, // Signed 32-bit integer + Long = 4, // Signed 64-bit integer + Float = 5, // 32-bit IEEE 754 floating point + Double = 6, // 64-bit IEEE 754 floating point + Byte_Array = 7, // Array of signed bytes + String = 8, // UTF-8 string (max 65535 bytes) + List = 9, // Ordered list of unnamed tags (same type) + Compound = 10, // Collection of named tags (any type) + Int_Array = 11, // Array of signed 32-bit integers + Long_Array = 12, // Array of signed 64-bit integers + Null = -1 // Internal: denotes empty value objects +}; +``` + +The `Null` type (value -1) is an internal sentinel used by libnbt++ to represent uninitialized `value` objects; it does not appear in the NBT specification. + +The `End` type (value 0) is only valid within compound tags to mark their end; it is never used as a standalone tag. + +### Tag Type Validation + +The function `is_valid_type()` checks whether an integer value is a valid tag type: + +```cpp +bool is_valid_type(int type, bool allow_end = false); +``` + +It returns `true` when `type` falls between 1 and 12 (inclusive), or between 0 and 12 if `allow_end` is `true`. + +--- + +## C++ Tag Classes + +Each NBT tag type maps to a concrete C++ class in the `nbt` namespace. 
The classes are organized using templates for related types: + +| NBT Type | ID | C++ Class | Underlying Type | Header | +|-------------|----|--------------------|------------------------|--------------------| +| Byte | 1 | `tag_byte` | `tag_primitive<int8_t>` | `tag_primitive.h` | +| Short | 2 | `tag_short` | `tag_primitive<int16_t>`| `tag_primitive.h` | +| Int | 3 | `tag_int` | `tag_primitive<int32_t>`| `tag_primitive.h` | +| Long | 4 | `tag_long` | `tag_primitive<int64_t>`| `tag_primitive.h` | +| Float | 5 | `tag_float` | `tag_primitive<float>` | `tag_primitive.h` | +| Double | 6 | `tag_double` | `tag_primitive<double>` | `tag_primitive.h` | +| Byte_Array | 7 | `tag_byte_array` | `tag_array<int8_t>` | `tag_array.h` | +| String | 8 | `tag_string` | `tag_string` | `tag_string.h` | +| List | 9 | `tag_list` | `tag_list` | `tag_list.h` | +| Compound | 10 | `tag_compound` | `tag_compound` | `tag_compound.h` | +| Int_Array | 11 | `tag_int_array` | `tag_array<int32_t>` | `tag_array.h` | +| Long_Array | 12 | `tag_long_array` | `tag_array<int64_t>` | `tag_array.h` | + +The typedef names (`tag_byte`, `tag_short`, etc.) are the intended public API. The underlying template classes (`tag_primitive<T>`, `tag_array<T>`) should not be used directly. 
+ +--- + +## Library Features + +### Modern C++11 Design + +- **Move semantics**: Tags support move construction and move assignment for efficient transfers +- **Smart pointers**: `std::unique_ptr<tag>` is used throughout for ownership management +- **Initializer lists**: Compounds and lists can be constructed with brace-enclosed initializer lists +- **Type-safe conversions**: The `value` class provides explicit conversions with `std::bad_cast` on type mismatch +- **Templates**: `tag_primitive<T>` and `tag_array<T>` use templates to avoid code duplication + +### Convenient Syntax + +Creating complex NBT structures is straightforward: + +```cpp +#include <nbt_tags.h> + +nbt::tag_compound root{ + {"playerName", "Steve"}, + {"health", int16_t(20)}, + {"position", nbt::tag_list{1.0, 64.5, -3.2}}, + {"inventory", nbt::tag_list::of<nbt::tag_compound>({ + {{"id", "minecraft:diamond_sword"}, {"count", int8_t(1)}}, + {{"id", "minecraft:apple"}, {"count", int8_t(64)}} + })}, + {"scores", nbt::tag_int_array{100, 250, 380}} +}; +``` + +### The value Class + +The `value` class (`include/value.h`) acts as a type-erased wrapper around `std::unique_ptr<tag>`. 
It enables: + +- Implicit numeric conversions (widening only): `int8_t` → `int16_t` → `int32_t` → `int64_t` → `float` → `double` +- Direct string assignment +- Subscript access: `compound["key"]` for compounds, `list[index]` for lists +- Chained access: `root["nested"]["deep"]["value"]` + +### I/O System + +Reading and writing NBT data uses the stream-based API: + +```cpp +#include <nbt_tags.h> +#include <io/stream_reader.h> +#include <io/stream_writer.h> +#include <fstream> + +// Reading +std::ifstream file("level.dat", std::ios::binary); +auto [name, compound] = nbt::io::read_compound(file); + +// Writing +std::ofstream out("output.nbt", std::ios::binary); +nbt::io::write_tag("Level", *compound, out); +``` + +The I/O system supports both big-endian (Java Edition default) and little-endian (Bedrock Edition) byte orders via the `endian::endian` enum: + +```cpp +// Reading Bedrock Edition data +auto pair = nbt::io::read_compound(file, endian::little); +``` + +### Zlib Compression Support + +When built with `NBT_USE_ZLIB=ON` (the default), the library provides stream wrappers for transparent compression/decompression: + +```cpp +#include <io/izlibstream.h> +#include <io/ozlibstream.h> + +// Reading gzip-compressed NBT +std::ifstream gzfile("level.dat", std::ios::binary); +zlib::izlibstream decompressed(gzfile); +auto pair = nbt::io::read_compound(decompressed); + +// Writing gzip-compressed NBT +std::ofstream outfile("output.dat", std::ios::binary); +zlib::ozlibstream compressed(outfile, Z_DEFAULT_COMPRESSION, true /* gzip */); +nbt::io::write_tag("Level", root, compressed); +compressed.close(); +``` + +### Visitor Pattern + +The library implements the Visitor pattern through `nbt_visitor` and `const_nbt_visitor` base classes, with 12 overloads (one per concrete tag type). The JSON formatter (`text::json_formatter`) is an example of a visitor that pretty-prints tag trees for debugging. 
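The double-dispatch mechanics behind a visitor can be sketched in isolation. The types below are stand-ins invented for this example (two node kinds instead of twelve tag classes), and the `visit`/`accept` names are assumptions for the sketch rather than a statement of the library's exact interface.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct int_node;
struct text_node;

// Stand-in visitor base: one overload per concrete node type,
// each defaulting to a no-op so visitors override only what they need.
struct const_visitor {
    virtual ~const_visitor() {}
    virtual void visit(const int_node&) {}
    virtual void visit(const text_node&) {}
};

struct node {
    virtual ~node() {}
    virtual void accept(const_visitor& v) const = 0;
};

struct int_node final : node {
    int value;
    explicit int_node(int v) : value(v) {}
    // *this has static type int_node here, so the int_node overload is chosen.
    void accept(const_visitor& v) const override { v.visit(*this); }
};

struct text_node final : node {
    std::string value;
    explicit text_node(std::string v) : value(std::move(v)) {}
    void accept(const_visitor& v) const override { v.visit(*this); }
};

// Example visitor: renders a node as text, similar in spirit to a formatter.
struct printer final : const_visitor {
    std::string out;
    void visit(const int_node& n) override { out = std::to_string(n.value); }
    void visit(const text_node& n) override { out = "\"" + n.value + "\""; }
};
```

Calling `accept()` through a `node` reference selects the matching `visit` overload at runtime, which is what lets a formatter walk a tree without `dynamic_cast` chains.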
+ +### Polymorphic Operations + +All tag classes support: + +- **`clone()`**: Deep-copies the tag, returning `std::unique_ptr<tag>` +- **`move_clone()`**: Moves the tag into a new `unique_ptr` +- **`assign(tag&&)`**: Move-assigns from another tag of the same type +- **`get_type()`**: Returns the `tag_type` enum value +- **`operator==` / `operator!=`**: Deep equality comparison +- **`operator<<`**: JSON-like formatted output via `text::json_formatter` + +### Factory Construction + +Tags can be constructed dynamically by type: + +```cpp +auto t1 = nbt::tag::create(nbt::tag_type::Int); // Default-constructed tag_int(0) +auto t2 = nbt::tag::create(nbt::tag_type::Float, 3.14f); // Numeric tag_float(3.14) +``` + +--- + +## Namespace Organization + +| Namespace | Contents | +|----------------|-------------------------------------------------------------| +| `nbt` | All tag classes, `value`, `value_initializer`, helpers | +| `nbt::detail` | CRTP base class, primitive/array type traits (internal) | +| `nbt::io` | `stream_reader`, `stream_writer`, free functions | +| `nbt::text` | `json_formatter` for pretty-printing | +| `endian` | Endianness-aware binary read/write functions | +| `zlib` | zlib stream wrappers (`izlibstream`, `ozlibstream`) | + +--- + +## File Organization + +### Public Headers (`include/`) + +| File | Purpose | +|--------------------------|-------------------------------------------------------------| +| `tag.h` | `tag` base class, `tag_type` enum, `is_valid_type()` | +| `tagfwd.h` | Forward declarations for all tag classes | +| `nbt_tags.h` | Convenience header that includes all tag headers | +| `tag_primitive.h` | `tag_primitive<T>` template and `tag_byte`..`tag_double` typedefs | +| `tag_string.h` | `tag_string` class | +| `tag_array.h` | `tag_array<T>` template and `tag_byte_array`..`tag_long_array` | +| `tag_list.h` | `tag_list` class | +| `tag_compound.h` | `tag_compound` class | +| `value.h` | `value` type-erased tag wrapper | +| 
`value_initializer.h` | `value_initializer` — implicit conversions for function params | +| `crtp_tag.h` | CRTP base template implementing polymorphic dispatch | +| `primitive_detail.h` | Type traits mapping C++ types to `tag_type` values | +| `nbt_visitor.h` | `nbt_visitor` and `const_nbt_visitor` base classes | +| `endian_str.h` | Endianness-aware binary I/O functions | +| `make_unique.h` | `nbt::make_unique<T>()` helper (C++11 polyfill) | +| `io/stream_reader.h` | `stream_reader` class and `read_compound()`/`read_tag()` | +| `io/stream_writer.h` | `stream_writer` class and `write_tag()` | +| `io/izlibstream.h` | `izlibstream` for decompression (requires zlib) | +| `io/ozlibstream.h` | `ozlibstream` for compression (requires zlib) | +| `io/zlib_streambuf.h` | `zlib_streambuf` base class, `zlib_error` exception | +| `text/json_formatter.h` | `json_formatter` for pretty-printing tags | + +### Source Files (`src/`) + +| File | Purpose | +|---------------------------|------------------------------------------| +| `tag.cpp` | `tag` methods, `tag_primitive` explicit instantiations, operators | +| `tag_compound.cpp` | `tag_compound` methods, binary I/O | +| `tag_list.cpp` | `tag_list` methods, initializer lists, binary I/O | +| `tag_string.cpp` | `tag_string` read/write payload | +| `value.cpp` | `value` assignment operators, conversions | +| `value_initializer.cpp` | `value_initializer` constructors | +| `endian_str.cpp` | Big/little endian read/write implementations | +| `io/stream_reader.cpp` | `stream_reader` methods, format parsing | +| `io/stream_writer.cpp` | `stream_writer` methods, format output | +| `io/izlibstream.cpp` | `inflate_streambuf` implementation | +| `io/ozlibstream.cpp` | `deflate_streambuf` implementation | +| `text/json_formatter.cpp` | `json_formatter` visitor implementation | + +--- + +## Quick Start Examples + +### Reading an NBT File + +```cpp +#include <nbt_tags.h> +#include <io/stream_reader.h> +#include <fstream> +#include <iostream> + 
+int main() { + std::ifstream file("level.dat", std::ios::binary); + if (!file) return 1; + + auto [name, root] = nbt::io::read_compound(file); + std::cout << "Root tag: " << name << "\n"; + std::cout << *root << std::endl; // JSON-formatted output + + return 0; +} +``` + +### Reading a Compressed File + +```cpp +#include <nbt_tags.h> +#include <io/stream_reader.h> +#include <io/izlibstream.h> +#include <fstream> + +int main() { + std::ifstream file("level.dat", std::ios::binary); + zlib::izlibstream decompressed(file); // Auto-detects gzip/zlib + auto [name, root] = nbt::io::read_compound(decompressed); + return 0; +} +``` + +### Creating and Writing NBT Data + +```cpp +#include <nbt_tags.h> +#include <io/stream_writer.h> +#include <fstream> + +int main() { + nbt::tag_compound data{ + {"name", "World1"}, + {"seed", int64_t(123456789)}, + {"spawnX", int32_t(0)}, + {"spawnY", int32_t(64)}, + {"spawnZ", int32_t(0)}, + {"gameType", int32_t(0)}, + {"raining", int8_t(0)}, + {"version", nbt::tag_compound{ + {"id", int32_t(19133)}, + {"name", "1.20.4"}, + {"snapshot", int8_t(0)} + }} + }; + + std::ofstream out("output.nbt", std::ios::binary); + nbt::io::write_tag("", data, out); + return 0; +} +``` + +### Modifying Existing Data + +```cpp +auto [name, root] = nbt::io::read_compound(file); + +// Modify values using operator[] +(*root)["playerName"] = std::string("Alex"); +(*root)["health"] = int16_t(20); + +// Add a new nested compound +root->put("newSection", nbt::tag_compound{ + {"key1", int32_t(42)}, + {"key2", "hello"} +}); + +// Remove a tag +root->erase("oldSection"); + +// Check if a key exists +if (root->has_key("inventory", nbt::tag_type::List)) { + auto& inv = root->at("inventory").as<nbt::tag_list>(); + inv.push_back(nbt::tag_compound{{"id", "minecraft:stone"}, {"count", int8_t(1)}}); +} +``` + +### Iterating Over Tags + +```cpp +// Iterating a compound +for (const auto& [key, val] : *root) { + std::cout << key << ": type=" << val.get_type() << "\n"; +} + +// 
Iterating a list +auto& list = root->at("items").as<nbt::tag_list>(); +for (size_t i = 0; i < list.size(); ++i) { + std::cout << "Item " << i << ": " << list[i].get() << "\n"; +} +``` + +--- + +## Error Handling + +libnbt++ uses exceptions for error reporting: + +| Exception | Thrown When | +|------------------------|----------------------------------------------------------| +| `nbt::io::input_error` | Read failure: invalid tag type, unexpected EOF, corruption | +| `std::bad_cast` | Type mismatch in `value` conversions or `tag::assign()` | +| `std::out_of_range` | Invalid key in `tag_compound::at()` or index in `tag_list::at()` | +| `std::invalid_argument`| Invalid tag type to `tag::create()`, type mismatch in list operations | +| `std::length_error` | String > 65535 bytes, array > INT32_MAX elements | +| `zlib::zlib_error` | zlib decompression/compression failure | +| `std::bad_alloc` | zlib memory allocation failure | + +Stream state flags (`failbit`, `badbit`) are also set on the underlying `std::istream`/`std::ostream` when errors occur. + +--- + +## Thread Safety + +libnbt++ provides no thread safety guarantees beyond those of the C++ standard library. Tag objects should not be accessed concurrently from multiple threads without external synchronization. Reading from separate `stream_reader` instances using independent streams is safe. + +--- + +## Platform Requirements + +- C++11 compatible compiler (GCC 4.8+, Clang 3.3+, MSVC 2015+) +- CMake 3.15 or later +- zlib (optional, for compressed NBT support) +- IEEE 754 floating point (enforced via `static_assert`) +- 8-bit bytes (enforced via `static_assert` on `CHAR_BIT`) + +The library uses `memcpy`-based type punning (not `reinterpret_cast`) for float/double endian conversions, ensuring defined behavior across compilers. + +--- + +## License + +libnbt++ is licensed under the **GNU Lesser General Public License v3.0 or later** (LGPL-3.0-or-later). 
This means: + +- You can link against libnbt++ from proprietary software +- Modifications to libnbt++ itself must be released under LGPL +- The full license text is in `COPYING` and `COPYING.LESSER` diff --git a/docs/handbook/libnbtplusplus/tag-system.md b/docs/handbook/libnbtplusplus/tag-system.md new file mode 100644 index 0000000000..a467ddaf78 --- /dev/null +++ b/docs/handbook/libnbtplusplus/tag-system.md @@ -0,0 +1,643 @@ +# Tag System + +## Overview + +The tag system is the core of libnbt++. It provides a polymorphic class hierarchy where every NBT tag type maps to a concrete C++ class. All classes share the `tag` abstract base class and use the CRTP pattern via `detail::crtp_tag<Sub>` to implement common operations without repetitive boilerplate. + +--- + +## The tag_type Enum + +Defined in `include/tag.h`, `tag_type` is a strongly-typed enum representing every NBT tag type: + +```cpp +enum class tag_type : int8_t { + End = 0, + Byte = 1, + Short = 2, + Int = 3, + Long = 4, + Float = 5, + Double = 6, + Byte_Array = 7, + String = 8, + List = 9, + Compound = 10, + Int_Array = 11, + Long_Array = 12, + Null = -1 ///< Used to denote empty value objects +}; +``` + +### Type Validation + +```cpp +bool is_valid_type(int type, bool allow_end = false); +``` + +Returns `true` when `type` is between 1 and 12 (inclusive), or between 0 and 12 if `allow_end` is `true`. The `End` type (0) and `Null` type (-1) are not valid for standalone tags. + +### Type Output Operator + +```cpp +std::ostream& operator<<(std::ostream& os, tag_type tt); +``` + +Outputs human-readable names: `"byte"`, `"short"`, `"int"`, `"long"`, `"float"`, `"double"`, `"byte_array"`, `"string"`, `"list"`, `"compound"`, `"int_array"`, `"long_array"`, `"end"`, `"null"`, or `"invalid"`. + +--- + +## The tag Base Class + +Defined in `include/tag.h`, `tag` is the abstract base class for all NBT tags. 
It declares the interface that all concrete tag classes must implement: + +### Pure Virtual Methods + +```cpp +virtual tag_type get_type() const noexcept = 0; // Returns the tag type +virtual std::unique_ptr<tag> clone() const& = 0; // Deep-copies the tag +virtual std::unique_ptr<tag> move_clone() && = 0; // Move-constructs a copy +virtual tag& assign(tag&& rhs) = 0; // Move-assigns same-type tag +virtual void accept(nbt_visitor& visitor) = 0; // Visitor pattern (mutable) +virtual void accept(const_nbt_visitor& visitor) const = 0; // Visitor pattern (const) +virtual void read_payload(io::stream_reader& reader) = 0; // Deserialize from stream +virtual void write_payload(io::stream_writer& writer) const = 0; // Serialize to stream +``` + +### Non-Virtual Methods + +```cpp +std::unique_ptr<tag> clone() &&; // Rvalue overload: delegates to move_clone() +``` + +### Template Methods + +```cpp +template <class T> T& as(); +template <class T> const T& as() const; +``` + +Downcasts `*this` to `T&` using `dynamic_cast`. Requires `T` to be a subclass of `tag` (enforced by `static_assert`). Throws `std::bad_cast` if the tag is not of type `T`. + +```cpp +// Usage: +tag& t = /* some tag */; +tag_string& s = t.as<tag_string>(); // OK if t is a tag_string +int32_t val = t.as<tag_int>().get(); // OK if t is a tag_int +``` + +### Factory Methods + +These static methods construct tags at runtime by `tag_type`: + +```cpp +static std::unique_ptr<tag> create(tag_type type); // Default-construct +``` + +Creates a new tag with default values: +- Numeric types: value = 0 +- String: empty string +- Arrays: empty vector +- List: empty list (undetermined type) +- Compound: empty compound + +Throws `std::invalid_argument` for `tag_type::End`, `tag_type::Null`, or invalid values. 
+ +```cpp +static std::unique_ptr<tag> create(tag_type type, int8_t val); +static std::unique_ptr<tag> create(tag_type type, int16_t val); +static std::unique_ptr<tag> create(tag_type type, int32_t val); +static std::unique_ptr<tag> create(tag_type type, int64_t val); +static std::unique_ptr<tag> create(tag_type type, float val); +static std::unique_ptr<tag> create(tag_type type, double val); +``` + +Creates a numeric tag with the specified value. The value is cast to the type selected by `type`. Throws `std::invalid_argument` if `type` is not a numeric type (Byte through Double). + +```cpp +auto a = tag::create(tag_type::Int); // tag_int(0) +auto b = tag::create(tag_type::Float, 3.14f); // tag_float(3.14) +auto c = tag::create(tag_type::Byte, 42); // tag_byte(42), value cast to int8_t +``` + +### Equality Operators + +```cpp +friend bool operator==(const tag& lhs, const tag& rhs); +friend bool operator!=(const tag& lhs, const tag& rhs); +``` + +The `operator==` implementation first checks `typeid(lhs) != typeid(rhs)`; if the RTTI types differ, the tags are unequal. If the types match, it delegates to the private virtual `equals()` method, which each concrete class implements via the CRTP. + +### Output Operator + +```cpp +std::ostream& operator<<(std::ostream& os, const tag& t); +``` + +Uses `text::json_formatter` to produce JSON-like output. The formatter instance is created as a `static const` object in `src/tag.cpp`. 
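The equality scheme described under Equality Operators (a `typeid` check followed by a private virtual `equals()` supplied by a CRTP layer) can be illustrated with stand-in classes. Everything below is a sketch: `node`, `crtp_node`, `int_node`, and `float_node` are hypothetical names and are not the libnbt++ types.

```cpp
#include <cassert>
#include <typeinfo>

// Stand-in base class for illustration only.
class node {
public:
    virtual ~node() {}

    friend bool operator==(const node& lhs, const node& rhs)
    {
        // Objects of different dynamic types are never equal.
        if (typeid(lhs) != typeid(rhs))
            return false;
        return lhs.equals(rhs);
    }
    friend bool operator!=(const node& lhs, const node& rhs)
    {
        return !(lhs == rhs);
    }

private:
    virtual bool equals(const node& rhs) const = 0;
};

// CRTP layer: implements equals() once for every concrete type.
template <class Sub>
class crtp_node : public node {
private:
    bool equals(const node& rhs) const override
    {
        // Safe downcast: operator== already verified the dynamic types match.
        return static_cast<const Sub&>(*this) == static_cast<const Sub&>(rhs);
    }
};

struct int_node final : crtp_node<int_node> {
    int v;
    explicit int_node(int x) : v(x) {}
    friend bool operator==(const int_node& a, const int_node& b)
    {
        return a.v == b.v;
    }
};

struct float_node final : crtp_node<float_node> {
    float v;
    explicit float_node(float x) : v(x) {}
    friend bool operator==(const float_node& a, const float_node& b)
    {
        return a.v == b.v;
    }
};
```

Comparing through base references first rejects mismatched types, then falls through to the type-specific `operator==`; this is why a float-like and a double-like node holding the same number would still compare unequal.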
+ +--- + +## Concrete Tag Classes + +### tag_byte / tag_short / tag_int / tag_long / tag_float / tag_double + +These are all instantiations of `tag_primitive<T>`, defined in `include/tag_primitive.h`: + +```cpp +template <class T> +class tag_primitive final : public detail::crtp_tag<tag_primitive<T>> +{ +public: + typedef T value_type; + static constexpr tag_type type = detail::get_primitive_type<T>::value; + + constexpr tag_primitive(T val = 0) noexcept : value(val) {} + + // Implicit conversion to/from T + operator T&(); + constexpr operator T() const; + constexpr T get() const { return value; } + + tag_primitive& operator=(T val) { value = val; return *this; } + void set(T val) { value = val; } + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + +private: + T value; +}; +``` + +#### Type Mapping + +| Typedef | Template | `type` constant | C++ Type | NBT Size | +|--------------|--------------------------|----------------------|------------|----------| +| `tag_byte` | `tag_primitive<int8_t>` | `tag_type::Byte` | `int8_t` | 1 byte | +| `tag_short` | `tag_primitive<int16_t>` | `tag_type::Short` | `int16_t` | 2 bytes | +| `tag_int` | `tag_primitive<int32_t>` | `tag_type::Int` | `int32_t` | 4 bytes | +| `tag_long` | `tag_primitive<int64_t>` | `tag_type::Long` | `int64_t` | 8 bytes | +| `tag_float` | `tag_primitive<float>` | `tag_type::Float` | `float` | 4 bytes | +| `tag_double` | `tag_primitive<double>` | `tag_type::Double` | `double` | 8 bytes | + +#### Implicit Conversions + +`tag_primitive<T>` provides implicit conversion operators. 
This allows natural C++ usage: + +```cpp +tag_int myInt(42); +int val = myInt; // Implicit conversion: constexpr operator T() const +int& ref = myInt; // Mutable reference: operator T&() +ref = 100; // Modifies the tag's value +myInt = 200; // Uses operator=(T val) +``` + +#### Binary I/O + +Reading and writing are implemented inline in the header: + +```cpp +template <class T> +void tag_primitive<T>::read_payload(io::stream_reader& reader) +{ + reader.read_num(value); + if (!reader.get_istr()) { + std::ostringstream str; + str << "Error reading tag_" << type; + throw io::input_error(str.str()); + } +} + +template <class T> +void tag_primitive<T>::write_payload(io::stream_writer& writer) const +{ + writer.write_num(value); +} +``` + +#### Equality + +```cpp +template <class T> +bool operator==(const tag_primitive<T>& lhs, const tag_primitive<T>& rhs) +{ + return lhs.get() == rhs.get(); +} +``` + +Note: `tag_float(2.5)` and `tag_double(2.5)` are **not** equal — they are different types. + +#### Explicit Instantiation + +In `include/tag_primitive.h`: +```cpp +extern template class NBT_EXPORT tag_primitive<int8_t>; +extern template class NBT_EXPORT tag_primitive<int16_t>; +extern template class NBT_EXPORT tag_primitive<int32_t>; +extern template class NBT_EXPORT tag_primitive<int64_t>; +extern template class NBT_EXPORT tag_primitive<float>; +extern template class NBT_EXPORT tag_primitive<double>; +``` + +In `src/tag.cpp`: +```cpp +template class tag_primitive<int8_t>; +template class tag_primitive<int16_t>; +template class tag_primitive<int32_t>; +template class tag_primitive<int64_t>; +template class tag_primitive<float>; +template class tag_primitive<double>; +``` + +This ensures template code is compiled once in `tag.cpp` rather than in every translation unit. 
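The declaration/definition split used for explicit instantiation can be tried in a single file. This is a minimal sketch, assuming a hypothetical `box<T>` template in place of `tag_primitive<T>` and omitting the `NBT_EXPORT` macro.

```cpp
#include <cassert>

// Hypothetical template standing in for tag_primitive<T>.
template <class T>
class box {
public:
    explicit box(T v) : value_(v) {}
    T get() const;

private:
    T value_;
};

template <class T>
T box<T>::get() const { return value_; }

// "Header side": tells each translation unit not to instantiate box<int>
// implicitly, because an instantiation exists elsewhere.
extern template class box<int>;

// "Source side": the single explicit instantiation definition.
// It must come after the extern declaration when both are in one file.
template class box<int>;
```

In a real project the `extern template` line would live in the header and the `template class` line in exactly one `.cpp` file, so the member functions are compiled once and linked everywhere else.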
+ +--- + +### tag_string + +Defined in `include/tag_string.h`: + +```cpp +class NBT_EXPORT tag_string final : public detail::crtp_tag<tag_string> +{ +public: + static constexpr tag_type type = tag_type::String; + + tag_string() {} + tag_string(const std::string& str) : value(str) {} + tag_string(std::string&& str) noexcept : value(std::move(str)) {} + tag_string(const char* str) : value(str) {} + + // Implicit conversion to/from std::string + operator std::string&(); + operator const std::string&() const; + const std::string& get() const { return value; } + + tag_string& operator=(const std::string& str); + tag_string& operator=(std::string&& str); + tag_string& operator=(const char* str); + void set(const std::string& str); + void set(std::string&& str); + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + +private: + std::string value; +}; +``` + +#### NBT String Format + +NBT strings are encoded as: +1. A 2-byte unsigned big-endian length prefix (max 65535) +2. UTF-8 encoded characters + +The `write_payload` method throws `std::length_error` if the string exceeds 65535 bytes. 
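The string layout above (2-byte length, then the bytes) can be sketched as a small encoder. This is an illustration only: `encode_string_payload` is a hypothetical helper that writes into a byte vector, whereas the library writes through `io::stream_writer`. It assumes big-endian output and mirrors the documented `std::length_error` on oversized input.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of libnbt++: encodes a string payload as a
// 2-byte unsigned big-endian length followed by the raw bytes.
std::vector<std::uint8_t> encode_string_payload(const std::string& s)
{
    if (s.size() > 65535)
        throw std::length_error("string too long for a 2-byte length prefix");

    std::vector<std::uint8_t> out;
    out.reserve(2 + s.size());
    const std::uint16_t len = static_cast<std::uint16_t>(s.size());
    out.push_back(static_cast<std::uint8_t>(len >> 8));    // high byte first
    out.push_back(static_cast<std::uint8_t>(len & 0xFF));  // then low byte
    out.insert(out.end(), s.begin(), s.end());             // payload bytes
    return out;
}
```

The length counts bytes, not characters, so a string with multi-byte UTF-8 sequences reaches the limit sooner than its character count suggests.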
+ +#### Usage + +```cpp +tag_string name("Steve"); +std::string& ref = name; // Implicit conversion +ref = "Alex"; // Modifies in place +name = "Notch"; // operator=(const char*) +name.set("jeb_"); // Explicit setter +``` + +--- + +### tag_byte_array / tag_int_array / tag_long_array + +These are all instantiations of `tag_array<T>`, defined in `include/tag_array.h`: + +```cpp +template <class T> +class tag_array final : public detail::crtp_tag<tag_array<T>> +{ +public: + typedef typename std::vector<T>::iterator iterator; + typedef typename std::vector<T>::const_iterator const_iterator; + typedef T value_type; + static constexpr tag_type type = detail::get_array_type<T>::value; + + tag_array() {} + tag_array(std::initializer_list<T> init) : data(init) {} + tag_array(std::vector<T>&& vec) noexcept : data(std::move(vec)) {} + + std::vector<T>& get(); + const std::vector<T>& get() const; + + T& at(size_t i); // Bounds-checked + T& operator[](size_t i); // Unchecked + + void push_back(T val); + void pop_back(); + size_t size() const; + void clear(); + + iterator begin(); iterator end(); + const_iterator begin() const; const_iterator end() const; + const_iterator cbegin() const; const_iterator cend() const; + + void read_payload(io::stream_reader& reader) override; + void write_payload(io::stream_writer& writer) const override; + +private: + std::vector<T> data; +}; +``` + +#### Type Mapping + +| Typedef | Template | `type` constant | +|-------------------|-----------------------|------------------------| +| `tag_byte_array` | `tag_array<int8_t>` | `tag_type::Byte_Array` | +| `tag_int_array` | `tag_array<int32_t>` | `tag_type::Int_Array` | +| `tag_long_array` | `tag_array<int64_t>` | `tag_type::Long_Array` | + +#### Specialized Binary I/O + +The `read_payload` and `write_payload` methods have three implementations: + +**Byte arrays** (`tag_array<int8_t>`): Read/written as raw byte blocks: +```cpp +template <> +void tag_array<int8_t>::read_payload(io::stream_reader& 
reader) +{ + int32_t length; + reader.read_num(length); + if (length < 0) + reader.get_istr().setstate(std::ios::failbit); + if (!reader.get_istr()) + throw io::input_error("Error reading length of tag_byte_array"); + + data.resize(length); + reader.get_istr().read(reinterpret_cast<char*>(data.data()), length); + if (!reader.get_istr()) + throw io::input_error("Error reading contents of tag_byte_array"); +} +``` + +**Long arrays** (`tag_array<int64_t>`): Read element-by-element with `read_num`: +```cpp +template <> +void tag_array<int64_t>::read_payload(io::stream_reader& reader) +{ + int32_t length; + reader.read_num(length); + if (length < 0) reader.get_istr().setstate(std::ios::failbit); + if (!reader.get_istr()) + throw io::input_error("Error reading length of tag_long_array"); + + data.clear(); + data.reserve(length); + for (int32_t i = 0; i < length; ++i) { + int64_t val; + reader.read_num(val); + data.push_back(val); + } + if (!reader.get_istr()) + throw io::input_error("Error reading contents of tag_long_array"); +} +``` + +**Int arrays and generic** (`tag_array<T>`): Uses the generic template with `read_num`: +```cpp +template <typename T> +void tag_array<T>::read_payload(io::stream_reader& reader) +{ + int32_t length; + reader.read_num(length); + // ... similar element-by-element reading ... +} +``` + +#### NBT Array Format + +Arrays are encoded as: +1. A 4-byte signed big-endian length (number of elements) +2. The elements in sequence + +Negative lengths set the failbit and throw `io::input_error`. Arrays exceeding `INT32_MAX` elements throw `std::length_error` on write. 
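The array layout and the negative-length check can be sketched as a standalone decoder. This is a minimal sketch under stated assumptions: `read_be32` and `decode_int_array` are hypothetical names, the input is an in-memory byte vector rather than an `io::stream_reader`, and `std::runtime_error` stands in for `io::input_error`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reads one big-endian 32-bit value starting at bytes[pos].
static std::int32_t read_be32(const std::vector<std::uint8_t>& bytes,
                              std::size_t pos)
{
    if (bytes.size() < 4 || pos > bytes.size() - 4)
        throw std::runtime_error("unexpected end of data");
    const std::uint32_t u = (std::uint32_t(bytes[pos]) << 24)
                          | (std::uint32_t(bytes[pos + 1]) << 16)
                          | (std::uint32_t(bytes[pos + 2]) << 8)
                          |  std::uint32_t(bytes[pos + 3]);
    // Assumes two's-complement conversion, as on all common platforms.
    return static_cast<std::int32_t>(u);
}

// Hypothetical decoder for an int-array payload: 4-byte signed length,
// then that many 4-byte elements.
std::vector<std::int32_t> decode_int_array(const std::vector<std::uint8_t>& bytes)
{
    const std::int32_t length = read_be32(bytes, 0);
    if (length < 0)
        throw std::runtime_error("negative array length");

    std::vector<std::int32_t> out;
    // No reserve(length): a corrupt length should not trigger a huge allocation.
    for (std::int32_t i = 0; i < length; ++i)
        out.push_back(read_be32(bytes, 4 + 4 * static_cast<std::size_t>(i)));
    return out;
}
```

Skipping the up-front `reserve` is a design choice of this sketch only: a corrupt length field then fails on the first missing element instead of requesting a very large buffer.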
+ +#### Usage + +```cpp +// Initialize with values +tag_byte_array ba{0, 1, 2, 3, 4}; +tag_int_array ia{100, 200, 300}; +tag_long_array la{1000000000LL, 2000000000LL}; + +// Access elements +int8_t first = ba[0]; +int32_t safe = ia.at(1); // Bounds-checked + +// Modify +ba.push_back(5); +ia.pop_back(); +la.clear(); + +// Iterate +for (int32_t val : ia) { + std::cout << val << " "; +} + +// Access underlying vector +std::vector<int8_t>& raw = ba.get(); +raw.insert(raw.begin(), -1); +``` + +--- + +### tag_list + +See [list-tags.md](list-tags.md) for full details. Briefly: `tag_list` stores a `std::vector<value>` with a tracked element type (`el_type_`). All elements must have the same type. + +### tag_compound + +See [compound-tags.md](compound-tags.md) for full details. Briefly: `tag_compound` stores a `std::map<std::string, value>` providing named tag access with ordered iteration. + +--- + +## Clone, Equals, and Assign + +These operations are all provided by the CRTP layer (`detail::crtp_tag<Sub>`) and work uniformly across all tag types: + +### clone() + +```cpp +std::unique_ptr<tag> clone() const&; // Copy-clones +std::unique_ptr<tag> move_clone() &&; // Move-clones +std::unique_ptr<tag> clone() &&; // Delegates to move_clone() +``` + +The CRTP implementation: +```cpp +std::unique_ptr<tag> clone() const& override final { + return make_unique<Sub>(sub_this()); // Copy constructor of Sub +} +std::unique_ptr<tag> move_clone() && override final { + return make_unique<Sub>(std::move(sub_this())); // Move constructor of Sub +} +``` + +**Example:** +```cpp +tag_compound comp{{"key", 42}}; + +// Deep copy +auto copy = comp.clone(); +// copy is a tag_compound with {"key": tag_int(42)} + +// Move (original is in moved-from state) +auto moved = std::move(comp).clone(); +``` + +### equals() + +The private virtual `equals()` method (implemented by crtp_tag) delegates to the concrete class's `operator==`: + +```cpp +bool equals(const tag& rhs) const override final { + return 
sub_this() == static_cast<const Sub&>(rhs); +} +``` + +The public `operator==` first checks RTTI types: +```cpp +bool operator==(const tag& lhs, const tag& rhs) +{ + if (typeid(lhs) != typeid(rhs)) + return false; + return lhs.equals(rhs); +} +``` + +**Examples:** +```cpp +tag_int a(42), b(42), c(99); +a == b; // true (same type, same value) +a == c; // false (same type, different value) + +tag_float f(42.0f); +a == f; // false (different types, even though numeric value matches) +``` + +### assign() + +The CRTP implementation: +```cpp +tag& assign(tag&& rhs) override final { + return sub_this() = dynamic_cast<Sub&&>(rhs); +} +``` + +This move-assigns the content of `rhs` into `*this`. Throws `std::bad_cast` if `rhs` is not the same concrete type as `*this`. + +**Example:** +```cpp +tag_string s("hello"); +s.assign(tag_string("world")); // OK: s now contains "world" + +tag_int i(42); +s.assign(std::move(i)); // throws std::bad_cast (int != string) +``` + +--- + +## Primitive Type Traits + +The `detail::get_primitive_type<T>` meta-struct (`include/primitive_detail.h`) uses template specialization to map C++ types to `tag_type` values at compile time: + +```cpp +namespace detail { + template <class T> struct get_primitive_type { + static_assert(sizeof(T) != sizeof(T), + "Invalid type paramter for tag_primitive"); + }; + + template <> struct get_primitive_type<int8_t> + : public std::integral_constant<tag_type, tag_type::Byte> {}; + template <> struct get_primitive_type<int16_t> + : public std::integral_constant<tag_type, tag_type::Short> {}; + template <> struct get_primitive_type<int32_t> + : public std::integral_constant<tag_type, tag_type::Int> {}; + template <> struct get_primitive_type<int64_t> + : public std::integral_constant<tag_type, tag_type::Long> {}; + template <> struct get_primitive_type<float> + : public std::integral_constant<tag_type, tag_type::Float> {}; + template <> struct get_primitive_type<double> + : public std::integral_constant<tag_type, 
tag_type::Double> {}; +} +``` + +The unspecialized template uses a `static_assert` that always fails (via `sizeof(T) != sizeof(T)`, which is always `false`). This ensures that attempting to create a `tag_primitive<SomeOtherType>` produces a clear compile error. + +Similarly, `detail::get_array_type<T>` maps array element types: + +```cpp +template <> struct get_array_type<int8_t> + : public std::integral_constant<tag_type, tag_type::Byte_Array> {}; +template <> struct get_array_type<int32_t> + : public std::integral_constant<tag_type, tag_type::Int_Array> {}; +template <> struct get_array_type<int64_t> + : public std::integral_constant<tag_type, tag_type::Long_Array> {}; +``` + +--- + +## make_unique Helper + +Defined in `include/make_unique.h`, this provides a C++11 polyfill for `std::make_unique` (which was introduced in C++14): + +```cpp +namespace nbt { + template <class T, class... Args> + std::unique_ptr<T> make_unique(Args&&... args) + { + return std::unique_ptr<T>(new T(std::forward<Args>(args)...)); + } +} +``` + +It is used throughout the library for creating tag instances: +```cpp +auto t = nbt::make_unique<tag_int>(42); +auto s = nbt::make_unique<tag_string>("hello"); +``` + +--- + +## Tag Construction Summary + +| Tag Type | Default Constructor | Value Constructor | Initializer List | +|----------------|---------------------|--------------------------------------------------|-------------------------------| +| `tag_byte` | `tag_byte()`→0 | `tag_byte(int8_t(42))` | N/A | +| `tag_short` | `tag_short()`→0 | `tag_short(int16_t(1000))` | N/A | +| `tag_int` | `tag_int()`→0 | `tag_int(42)` | N/A | +| `tag_long` | `tag_long()`→0 | `tag_long(int64_t(123456789))` | N/A | +| `tag_float` | `tag_float()`→0 | `tag_float(3.14f)` | N/A | +| `tag_double` | `tag_double()`→0 | `tag_double(2.71828)` | N/A | +| `tag_string` | `tag_string()`→"" | `tag_string("text")`, `tag_string(std::string)` | N/A | +| `tag_byte_array`| `tag_byte_array()` | 
`tag_byte_array(std::vector<int8_t>&&)` | `tag_byte_array{1,2,3}` | +| `tag_int_array` | `tag_int_array()` | `tag_int_array(std::vector<int32_t>&&)` | `tag_int_array{1,2,3}` | +| `tag_long_array`| `tag_long_array()` | `tag_long_array(std::vector<int64_t>&&)` | `tag_long_array{1,2,3}` | +| `tag_list` | `tag_list()`→Null | `tag_list(tag_type)` (empty with type) | `tag_list{1,2,3}` (various) | +| `tag_compound` | `tag_compound()` | N/A | `tag_compound{{"k",v},...}` | + +--- + +## Error Handling in Tags + +| Operation | Exception | Condition | +|-------------------|---------------------------|-------------------------------------| +| `tag::create()` | `std::invalid_argument` | Invalid type (End, Null, or >12) | +| `tag::as<T>()` | `std::bad_cast` | Tag is not of type T | +| `tag::assign()` | `std::bad_cast` | Source tag has different type | +| Primitive I/O | `io::input_error` | Stream read failure | +| String write | `std::length_error` | String exceeds 65535 bytes | +| Array read | `io::input_error` | Negative length or read failure | +| Array write | `std::length_error` | Array exceeds INT32_MAX elements | diff --git a/docs/handbook/libnbtplusplus/testing.md b/docs/handbook/libnbtplusplus/testing.md new file mode 100644 index 0000000000..807f0fbaa8 --- /dev/null +++ b/docs/handbook/libnbtplusplus/testing.md @@ -0,0 +1,291 @@ +# Testing + +## Overview + +libnbt++ uses the **CxxTest** testing framework. Tests are defined as C++ header files with test classes inheriting from `CxxTest::TestSuite`. The test suite covers all tag types, I/O operations, endian conversion, zlib streams, and value semantics. + +Build configuration is in `test/CMakeLists.txt`. 
+ +--- + +## Build Configuration + +```cmake +# test/CMakeLists.txt +if(NOT (UNIX AND (CMAKE_SYSTEM_PROCESSOR MATCHES "x86_64|i[3-6]86"))) + message(WARNING "Tests are only supported on Linux x86/x86_64") + return() +endif() + +find_package(CxxTest REQUIRED) +``` + +Tests are **Linux x86/x86_64 only** due to the use of `objcopy` for embedding binary test data. + +### CMake Options + +Tests are controlled by the `NBT_BUILD_TESTS` option: + +```cmake +option(NBT_BUILD_TESTS "Build libnbt++ tests" ON) + +if(NBT_BUILD_TESTS) + enable_testing() + add_subdirectory(test) +endif() +``` + +### Binary Test Data Embedding + +Test data files (e.g., `bigtest.nbt`, `bigtest_uncompr`, `littletest.nbt`) are converted to object files via `objcopy` and linked directly into test executables: + +```cmake +set(BINARY_DIR "${CMAKE_CURRENT_SOURCE_DIR}/") + +function(add_nbt_test name) + # ... + foreach(binfile ${ARGN}) + set(obj "${CMAKE_CURRENT_BINARY_DIR}/${binfile}.o") + add_custom_command( + OUTPUT ${obj} + COMMAND objcopy -I binary -O elf64-x86-64 + -B i386:x86-64 ${binfile} ${obj} + WORKING_DIRECTORY ${BINARY_DIR} + DEPENDS ${BINARY_DIR}/${binfile} + ) + target_sources(${name} PRIVATE ${obj}) + endforeach() +endfunction() +``` + +The embedded data is accessed via extern symbols declared in `test/data.h`: + +```cpp +// test/data.h +#define DECLARE_BINARY(name) \ + extern "C" const char _binary_##name##_start[]; \ + extern "C" const char _binary_##name##_end[]; + +DECLARE_BINARY(bigtest_nbt) +DECLARE_BINARY(bigtest_uncompr) +DECLARE_BINARY(littletest_nbt) +DECLARE_BINARY(littletest_uncompr) +DECLARE_BINARY(errortest_eof_nbt) +DECLARE_BINARY(errortest_negative_length_nbt) +DECLARE_BINARY(errortest_excessive_depth_nbt) +``` + +--- + +## Test Targets + +| Target | Source | Tests | +|--------|--------|-------| +| `nbttest` | `test/nbttest.h` | Core tag types, value, compound, list, visitor | +| `endian_str_test` | `test/endian_str_test.h` | Endian read/write roundtrips | +| `read_test` | 
`test/read_test.h` | stream_reader, big/little endian, errors | +| `write_test` | `test/write_test.h` | stream_writer, payload writing, roundtrips | +| `zlibstream_test` | `test/zlibstream_test.h` | izlibstream, ozlibstream, compression roundtrip | +| `format_test` | `test/format_test.cpp` | JSON text formatting | +| `test_value` | `test/test_value.h` | Value numeric assignment and conversion | + +Test registration in CMake: + +```cmake +add_nbt_test(nbttest nbttest.h) +add_nbt_test(endian_str_test endian_str_test.h) +add_nbt_test(read_test read_test.h + bigtest.nbt bigtest_uncompr littletest.nbt littletest_uncompr + errortest_eof.nbt errortest_negative_length.nbt + errortest_excessive_depth.nbt) +add_nbt_test(write_test write_test.h + bigtest_uncompr littletest_uncompr) +if(NBT_HAVE_ZLIB) + add_nbt_test(zlibstream_test zlibstream_test.h + bigtest.nbt bigtest_uncompr) +endif() +add_nbt_test(test_value test_value.h) +``` + +`zlibstream_test` is only built when `NBT_HAVE_ZLIB` is defined. 
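The `_binary_<name>_start` / `_binary_<name>_end` pairs declared in `test/data.h` delimit a raw byte range. One plausible way to hand such a range to a reader is to copy it into a string-backed stream. `make_binary_stream` below is a hypothetical helper written for illustration, not something taken from the test suite.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: copies the byte range [start, end) into a
// string-backed input stream opened in binary mode. Embedded NUL bytes
// are preserved because the range constructor of std::string is used.
std::istringstream make_binary_stream(const char* start, const char* end)
{
    return std::istringstream(std::string(start, end),
                              std::ios::in | std::ios::binary);
}
```

A test could then call it as `make_binary_stream(_binary_bigtest_uncompr_start, _binary_bigtest_uncompr_end)`, assuming the corresponding object file is linked in.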
+ +--- + +## Test Details + +### nbttest.h — Core Functionality + +Tests the fundamental tag and value operations: + +```cpp +class NbtTest : public CxxTest::TestSuite +{ +public: + void test_tag_primitive(); // Constructors, get/set, implicit conversion + void test_tag_string(); // String constructors, conversion operators + void test_tag_compound(); // Insertion, lookup, iteration, has_key() + void test_tag_list(); // Type enforcement, push_back, of<T>() + void test_tag_array(); // Vector access, constructors + void test_value(); // Type erasure, numeric assignment + void test_visitor(); // Double dispatch, visitor invocation + void test_equality(); // operator==, operator!= for all types + void test_clone(); // clone() and move_clone() correctness + void test_as(); // tag::as<T>() casting, bad_cast +}; +``` + +### read_test.h — Deserialization + +Tests `stream_reader` against known binary data: + +```cpp +class ReadTest : public CxxTest::TestSuite +{ +public: + void test_read_bigtest(); // Verifies full bigtest.nbt structure + void test_read_littletest(); // Little-endian variant + void test_read_bigtest_uncompr(); // Uncompressed big-endian + void test_read_littletest_uncompr(); // Uncompressed little-endian + void test_read_eof(); // Truncated data → input_error + void test_read_negative_length(); // Negative array length → input_error + void test_read_excessive_depth(); // >1024 nesting → input_error +}; +``` + +The "bigtest" validates a complex nested compound with all tag types — the standard NBT test file from the Minecraft community. Fields verified include: + +- `"byteTest"`: `tag_byte` with value 127 +- `"shortTest"`: `tag_short` with value 32767 +- `"intTest"`: `tag_int` with value 2147483647 +- `"longTest"`: `tag_long` with value 9223372036854775807 +- `"floatTest"`: `tag_float` with value ~0.498... +- `"doubleTest"`: `tag_double` with value ~0.493... +- `"stringTest"`: UTF-8 string "HELLO WORLD THIS IS A TEST STRING ÅÄÖ!" 
+- `"byteArrayTest"`: 1000-element byte array +- `"listTest (long)"`: List of 5 longs +- `"listTest (compound)"`: List of compounds +- Nested compound within compound + +### write_test.h — Serialization + +Tests `stream_writer` and write-then-read roundtrips: + +```cpp +class WriteTest : public CxxTest::TestSuite +{ +public: + void test_write_bigtest(); // Write and compare against reference + void test_write_littletest(); // Little-endian write + void test_write_payload(); // Individual payload writing + void test_roundtrip(); // Write → read → compare equality +}; +``` + +### endian_str_test.h — Byte Order + +Tests all numeric types through read/write roundtrips in both endiannesses: + +```cpp +class EndianStrTest : public CxxTest::TestSuite +{ +public: + void test_read_big(); + void test_read_little(); + void test_write_big(); + void test_write_little(); + void test_roundtrip_big(); + void test_roundtrip_little(); + void test_float_big(); + void test_float_little(); + void test_double_big(); + void test_double_little(); +}; +``` + +### zlibstream_test.h — Compression + +Tests zlib stream wrappers: + +```cpp +class ZlibstreamTest : public CxxTest::TestSuite +{ +public: + void test_inflate_gzip(); // Decompress gzip data + void test_inflate_zlib(); // Decompress zlib data + void test_inflate_corrupt(); // Corrupt data → zlib_error + void test_inflate_eof(); // Truncated data → EOF + void test_inflate_trailing(); // Data after compressed stream + void test_deflate_roundtrip(); // Compress → decompress → compare + void test_deflate_gzip(); // Gzip format output +}; +``` + +### test_value.h — Value Semantics + +Tests the `value` class's numeric assignment and conversion: + +```cpp +class TestValue : public CxxTest::TestSuite +{ +public: + void test_assign_byte(); + void test_assign_short(); + void test_assign_int(); + void test_assign_long(); + void test_assign_float(); + void test_assign_double(); + void test_assign_widening(); // int8 → int16 → int32 → int64 + void 
test_assign_narrowing(); // Narrowing disallowed + void test_assign_string(); +}; +``` + +### format_test.cpp — Text Output + +A standalone test (not CxxTest) that constructs a compound, serializes it, reads it back, and prints JSON: + +```cpp +// Constructs a compound with nested types +// Writes to stringstream, reads back +// Outputs via operator<< (json_formatter) +``` + +--- + +## Running Tests + +```bash +mkdir build && cd build +cmake .. -DNBT_BUILD_TESTS=ON +cmake --build . +ctest +``` + +Or run individual tests: + +```bash +./test/nbttest +./test/read_test +./test/write_test +./test/endian_str_test +./test/zlibstream_test +./test/test_value +``` + +--- + +## Test Data Files + +Located in `test/`: + +| File | Description | +|------|-------------| +| `bigtest.nbt` | Standard NBT test file, gzip-compressed, big-endian | +| `bigtest_uncompr` | Same as bigtest but uncompressed | +| `littletest.nbt` | Little-endian NBT test file, compressed | +| `littletest_uncompr` | Little-endian, uncompressed | +| `errortest_eof.nbt` | Truncated NBT for error testing | +| `errortest_negative_length.nbt` | NBT with negative array length | +| `errortest_excessive_depth.nbt` | NBT with >1024 nesting levels | + +These files are embedded into test binaries via `objcopy` and accessed through the `DECLARE_BINARY` macros in `data.h`. diff --git a/docs/handbook/libnbtplusplus/visitor-pattern.md b/docs/handbook/libnbtplusplus/visitor-pattern.md new file mode 100644 index 0000000000..dbf8124959 --- /dev/null +++ b/docs/handbook/libnbtplusplus/visitor-pattern.md @@ -0,0 +1,333 @@ +# Visitor Pattern + +## Overview + +libnbt++ implements the classic double-dispatch visitor pattern for traversing and processing tag hierarchies without modifying the tag classes themselves. Two visitor base classes are provided: `nbt_visitor` for mutable access and `const_nbt_visitor` for read-only traversal. + +Defined in `include/nbt_visitor.h`. 
+ +--- + +## Visitor Base Classes + +### nbt_visitor — Mutable Visitor + +```cpp +class nbt_visitor +{ +public: + virtual ~nbt_visitor() noexcept; + + virtual void visit(tag_byte&) {} + virtual void visit(tag_short&) {} + virtual void visit(tag_int&) {} + virtual void visit(tag_long&) {} + virtual void visit(tag_float&) {} + virtual void visit(tag_double&) {} + virtual void visit(tag_byte_array&) {} + virtual void visit(tag_string&) {} + virtual void visit(tag_list&) {} + virtual void visit(tag_compound&) {} + virtual void visit(tag_int_array&) {} + virtual void visit(tag_long_array&) {} +}; +``` + +### const_nbt_visitor — Immutable Visitor + +```cpp +class const_nbt_visitor +{ +public: + virtual ~const_nbt_visitor() noexcept; + + virtual void visit(const tag_byte&) {} + virtual void visit(const tag_short&) {} + virtual void visit(const tag_int&) {} + virtual void visit(const tag_long&) {} + virtual void visit(const tag_float&) {} + virtual void visit(const tag_double&) {} + virtual void visit(const tag_byte_array&) {} + virtual void visit(const tag_string&) {} + virtual void visit(const tag_list&) {} + virtual void visit(const tag_compound&) {} + virtual void visit(const tag_int_array&) {} + virtual void visit(const tag_long_array&) {} +}; +``` + +Both provide 12 `visit()` overloads — one per concrete tag type. All default to empty (no-op), so subclasses only override the types they care about. 
+ +--- + +## Double Dispatch via accept() + +The `tag` base class declares the `accept()` method: + +```cpp +class tag +{ +public: + virtual void accept(nbt_visitor& visitor) const = 0; + virtual void accept(const_nbt_visitor& visitor) const = 0; +}; +``` + +The CRTP intermediate `crtp_tag<Sub>` implements both `accept()` methods: + +```cpp +template <class Sub> +class crtp_tag : public tag +{ +public: + void accept(nbt_visitor& visitor) const override + { + visitor.visit(const_cast<Sub&>(static_cast<const Sub&>(*this))); + } + + void accept(const_nbt_visitor& visitor) const override + { + visitor.visit(static_cast<const Sub&>(*this)); + } +}; +``` + +For `nbt_visitor` (mutable), `const_cast` removes the `const` from `accept()` so the visitor receives a mutable reference. + +For `const_nbt_visitor`, the const reference is passed through directly. + +--- + +## How It Works + +1. Client code creates a visitor subclass, overriding `visit()` for the types it handles +2. Client calls `tag.accept(visitor)` on any tag +3. The CRTP-generated `accept()` calls `visitor.visit(static_cast<Sub&>(*this))` +4. The correct `visit()` overload is called based on the **concrete** tag type + +``` +Client → tag.accept(visitor) + → crtp_tag<tag_int>::accept() + → visitor.visit(static_cast<tag_int&>(*this)) + → YourVisitor::visit(tag_int&) // Your override +``` + +This resolves the combination of (runtime tag type) × (visitor implementation) without `dynamic_cast` or switch statements. 
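The dispatch chain above can be reproduced in miniature. The following sketch uses hypothetical `node`, `crtp_node`, `leaf_int`, and `leaf_text` types in place of the library's tag classes, and only the const visitor; it shows how one CRTP-generated `accept()` selects the right `visit()` overload and how an overload that is not overridden falls back to the no-op default.

```cpp
#include <string>

struct leaf_int;
struct leaf_text;

// Visitor base with no-op defaults, one overload per concrete node type.
struct const_visitor
{
    virtual ~const_visitor() noexcept {}
    virtual void visit(const leaf_int&) {}
    virtual void visit(const leaf_text&) {}
};

// Abstract base, playing the role of `tag`.
struct node
{
    virtual ~node() noexcept {}
    virtual void accept(const_visitor& v) const = 0;
};

// CRTP layer: implements accept() once for every concrete type.
template <class Sub>
struct crtp_node : node
{
    void accept(const_visitor& v) const override
    {
        // The static type of the argument picks the visit() overload.
        v.visit(static_cast<const Sub&>(*this));
    }
};

struct leaf_int final : crtp_node<leaf_int> { int value = 0; };
struct leaf_text final : crtp_node<leaf_text> { std::string value; };

// Overrides only the leaf_int overload; leaf_text keeps the no-op default.
struct kind_reporter : const_visitor
{
    std::string last;
    void visit(const leaf_int&) override { last = "int"; }
};
```

Calling `accept()` through a `const node&` that refers to a `leaf_int` sets `last` to `"int"`, while a `leaf_text` leaves it untouched.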
+ +--- + +## Built-in Visitor: json_fmt_visitor + +The library includes one concrete visitor in `src/text/json_formatter.cpp`: + +```cpp +class json_fmt_visitor : public const_nbt_visitor +{ +public: + json_fmt_visitor(std::ostream& os, unsigned int indent); + + void visit(const tag_byte& t) override; + void visit(const tag_short& t) override; + void visit(const tag_int& t) override; + void visit(const tag_long& t) override; + void visit(const tag_float& t) override; + void visit(const tag_double& t) override; + void visit(const tag_byte_array& t) override; + void visit(const tag_string& t) override; + void visit(const tag_list& t) override; + void visit(const tag_compound& t) override; + void visit(const tag_int_array& t) override; + void visit(const tag_long_array& t) override; + +private: + std::ostream& os; + unsigned int indent; + void write_indent(); +}; +``` + +This visitor renders any tag as a JSON-like text format. Used by `tag::operator<<` for debug output: + +```cpp +std::ostream& operator<<(std::ostream& os, const tag& t) +{ + static text::json_formatter formatter; + formatter.print(os, t); + return os; +} +``` + +### Formatting Rules + +| Type | Output Format | Example | +|------|--------------|---------| +| `tag_byte` | `<value>b` | `42b` | +| `tag_short` | `<value>s` | `100s` | +| `tag_int` | `<value>` | `12345` | +| `tag_long` | `<value>l` | `9876543210l` | +| `tag_float` | `<value>f` | `3.14f` | +| `tag_double` | `<value>d` | `2.718d` | +| `tag_string` | `"<value>"` | `"hello"` | +| `tag_byte_array` | `[B; ...]` | `[B; 1b, 2b, 3b]` | +| `tag_int_array` | `[I; ...]` | `[I; 1, 2, 3]` | +| `tag_long_array` | `[L; ...]` | `[L; 1l, 2l, 3l]` | +| `tag_list` | `[...]` | `[1, 2, 3]` | +| `tag_compound` | `{...}` | `{"key": 42}` | + +Special float/double handling: +- `+Infinity`, `-Infinity`, `NaN` are written as-is (not JSON-compliant but accurate) +- Uses the `std::defaultfloat` format + +--- + +## Writing Custom Visitors + +### Example: Tag Counter + 
+Count the total number of tags and tags of each type: + +```cpp +class tag_counter : public const_nbt_visitor +{ +public: + int total = 0; + std::map<tag_type, int> counts; + + void visit(const tag_byte&) override { ++total; ++counts[tag_type::Byte]; } + void visit(const tag_short&) override { ++total; ++counts[tag_type::Short]; } + void visit(const tag_int&) override { ++total; ++counts[tag_type::Int]; } + void visit(const tag_long&) override { ++total; ++counts[tag_type::Long]; } + void visit(const tag_float&) override { ++total; ++counts[tag_type::Float]; } + void visit(const tag_double&) override { ++total; ++counts[tag_type::Double]; } + void visit(const tag_string&) override { ++total; ++counts[tag_type::String]; } + void visit(const tag_byte_array&) override { ++total; ++counts[tag_type::Byte_Array]; } + void visit(const tag_int_array&) override { ++total; ++counts[tag_type::Int_Array]; } + void visit(const tag_long_array&) override { ++total; ++counts[tag_type::Long_Array]; } + + void visit(const tag_list& t) override { + ++total; + ++counts[tag_type::List]; + for (const auto& val : t) + val.get().accept(*this); // Recurse into children + } + + void visit(const tag_compound& t) override { + ++total; + ++counts[tag_type::Compound]; + for (const auto& [name, val] : t) + val.get().accept(*this); // Recurse into children + } +}; + +// Usage +tag_counter counter; +root.accept(counter); +std::cout << "Total tags: " << counter.total << "\n"; +``` + +### Example: Tag Modifier (Mutable) + +Double all integer values in a tree: + +```cpp +class int_doubler : public nbt_visitor +{ +public: + void visit(tag_int& t) override { + t.set(t.get() * 2); + } + void visit(tag_list& t) override { + for (auto& val : t) + val.get().accept(*this); + } + void visit(tag_compound& t) override { + for (auto& [name, val] : t) + val.get().accept(*this); + } +}; + +int_doubler doubler; +root.accept(doubler); +``` + +### Example: Selective Visitor + +Only handle specific types — unhandled 
types use the default no-op: + +```cpp +class string_collector : public const_nbt_visitor +{ +public: + std::vector<std::string> strings; + + void visit(const tag_string& t) override { + strings.push_back(t.get()); + } + void visit(const tag_list& t) override { + for (const auto& val : t) + val.get().accept(*this); + } + void visit(const tag_compound& t) override { + for (const auto& [name, val] : t) + val.get().accept(*this); + } +}; +``` + +--- + +## Recursive Traversal + +The visitor pattern does **not** automatically recurse into compounds and lists. To walk an entire tag tree, your visitor must explicitly recurse in its `visit(tag_compound&)` and `visit(tag_list&)` overloads: + +```cpp +void visit(const tag_compound& t) override { + for (const auto& [name, val] : t) + val.get().accept(*this); +} + +void visit(const tag_list& t) override { + for (const auto& val : t) + val.get().accept(*this); +} +``` + +This is by design — it gives visitors control over traversal depth, ordering, and filtering. + +--- + +## Visitor vs. Dynamic Cast + +Two approaches to type-specific processing: + +### Visitor Approach + +```cpp +class my_visitor : public const_nbt_visitor { + void visit(const tag_int& t) override { /* handle int */ } + void visit(const tag_string& t) override { /* handle string */ } + // ... 
+}; +my_visitor v; +tag.accept(v); +``` + +### Dynamic Cast Approach + +```cpp +if (auto* int_tag = dynamic_cast<const tag_int*>(&tag)) { + // handle int +} else if (auto* str_tag = dynamic_cast<const tag_string*>(&tag)) { + // handle string +} +``` + +The visitor pattern is preferable when: +- Processing many or all tag types +- Building reusable tree-walking logic +- The compiler should warn about unhandled types (though default no-ops mask this) + +`dynamic_cast` / `tag::as<T>()` is simpler when: +- You know the type at the call site +- You only need to handle one or two types +- You're accessing a specific child of a compound diff --git a/docs/handbook/libnbtplusplus/zlib-integration.md b/docs/handbook/libnbtplusplus/zlib-integration.md new file mode 100644 index 0000000000..592d8510da --- /dev/null +++ b/docs/handbook/libnbtplusplus/zlib-integration.md @@ -0,0 +1,514 @@ +# Zlib Integration + +## Overview + +libnbt++ provides optional zlib support for reading and writing gzip/zlib-compressed NBT data, which is the standard format for Minecraft world files (`level.dat`, region files, etc.). + +The zlib integration is in the `zlib` namespace and operates through standard C++ stream wrappers. It is conditionally compiled via the `NBT_USE_ZLIB` CMake option (default: `ON`). 
+ +Defined in: +- `include/io/zlib_streambuf.h` — Base streambuf with z_stream management +- `include/io/izlibstream.h` / `src/io/izlibstream.cpp` — Decompression stream +- `include/io/ozlibstream.h` / `src/io/ozlibstream.cpp` — Compression stream + +--- + +## Build Configuration + +```cmake +option(NBT_USE_ZLIB "Build with zlib stream support" ON) +``` + +When enabled, CMake finds zlib and links against it: +```cmake +if(NBT_USE_ZLIB) + find_package(ZLIB REQUIRED) + target_link_libraries(${NBT_NAME} PRIVATE ZLIB::ZLIB) + target_compile_definitions(${NBT_NAME} PUBLIC NBT_HAVE_ZLIB) +endif() +``` + +The `NBT_HAVE_ZLIB` preprocessor macro is defined publicly, allowing downstream code to conditionally use zlib features: + +```cpp +#ifdef NBT_HAVE_ZLIB +#include <io/izlibstream.h> +#include <io/ozlibstream.h> +#endif +``` + +--- + +## zlib_streambuf — Base Class + +```cpp +class zlib_streambuf : public std::streambuf +{ +protected: + z_stream zstr; + std::vector<char> in; + std::vector<char> out; + + static const size_t bufsize = 32768; // 32 KB +}; +``` + +This abstract base provides the shared z_stream state and I/O buffers used by both inflate and deflate streambufs. + +- `zstr`: The zlib `z_stream` struct controlling compression/decompression state +- `in`: Input buffer (32 KB) +- `out`: Output buffer (32 KB) +- `bufsize`: Buffer size constant (32768 bytes) + +--- + +## zlib_error — Exception Class + +```cpp +class zlib_error : public std::runtime_error +{ +public: + zlib_error(const z_stream& zstr, int status); + int status() const { return status_; } + +private: + int status_; +}; +``` + +Thrown on zlib API failures. Wraps the error message from `zstr.msg` (if available) along with the numeric error code. 
+ +--- + +## Decompression: izlibstream + +### inflate_streambuf + +```cpp +class inflate_streambuf : public zlib_streambuf +{ +public: + explicit inflate_streambuf(std::istream& input, + int window_bits = 32 + MAX_WBITS); + ~inflate_streambuf(); + + int_type underflow() override; + +private: + std::istream& is; + bool stream_end = false; +}; +``` + +**Constructor:** +```cpp +inflate_streambuf::inflate_streambuf(std::istream& input, int window_bits) + : is(input) +{ + in.resize(bufsize); + out.resize(bufsize); + + zstr.zalloc = Z_NULL; + zstr.zfree = Z_NULL; + zstr.opaque = Z_NULL; + zstr.avail_in = 0; + zstr.next_in = Z_NULL; + + int status = inflateInit2(&zstr, window_bits); + if (status != Z_OK) + throw zlib_error(zstr, status); +} +``` + +The default `window_bits = 32 + MAX_WBITS` (typically `32 + 15 = 47`) enables **automatic header detection**: zlib inspects the stream header and accepts either zlib or gzip format. Raw deflate has no header, so it is not auto-detected and requires a negative `window_bits` instead. This matters because Minecraft NBT files may use either gzip or zlib compression. 
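
To make the auto-detection above more concrete, the following is a minimal sketch of the kind of header check that distinguishes the two formats. It is not zlib's actual implementation; `stream_format` and `sniff_format` are hypothetical names introduced here for illustration, and the byte rules are taken from the gzip (RFC 1952) and zlib (RFC 1950) header definitions.

```cpp
#include <cstddef>
#include <cstdint>

// Container formats that an auto-detecting inflate can recognize.
enum class stream_format { gzip, zlib, unknown };

// Classify a compressed stream from its first two bytes.
// Hypothetical helper for illustration; not part of libnbt++ or zlib.
inline stream_format sniff_format(const std::uint8_t* data, std::size_t size)
{
    if (size < 2)
        return stream_format::unknown;

    // gzip (RFC 1952): fixed magic bytes 0x1f 0x8b.
    if (data[0] == 0x1f && data[1] == 0x8b)
        return stream_format::gzip;

    // zlib (RFC 1950): the low nibble of CMF is 8 (deflate), the window-size
    // nibble is at most 7, and the 16-bit header is a multiple of 31.
    const unsigned cmf = data[0];
    const unsigned flg = data[1];
    if ((cmf & 0x0f) == 8 && (cmf >> 4) <= 7 && ((cmf << 8) | flg) % 31 == 0)
        return stream_format::zlib;

    // Raw deflate has no header, so it cannot be identified this way.
    return stream_format::unknown;
}
```

A check like this can be useful for logging or diagnostics before handing the stream to `izlibstream`; the stream itself does not need it, since zlib performs the detection internally.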
+ +**underflow() — Buffered decompression:** + +```cpp +inflate_streambuf::int_type inflate_streambuf::underflow() +{ + if (stream_end) + return traits_type::eof(); + + zstr.next_out = reinterpret_cast<Bytef*>(out.data()); + zstr.avail_out = out.size(); + + while (zstr.avail_out == out.size()) { + if (zstr.avail_in == 0) { + is.read(in.data(), in.size()); + auto bytes_read = is.gcount(); + if (bytes_read == 0) { + setg(nullptr, nullptr, nullptr); + return traits_type::eof(); + } + zstr.next_in = reinterpret_cast<Bytef*>(in.data()); + zstr.avail_in = bytes_read; + } + + int status = inflate(&zstr, Z_NO_FLUSH); + if (status == Z_STREAM_END) { + // Seek back unused input so the underlying stream + // position is correct for any subsequent reads + if (zstr.avail_in > 0) + is.seekg(-static_cast<int>(zstr.avail_in), + std::ios::cur); + stream_end = true; + break; + } + if (status != Z_OK) + throw zlib_error(zstr, status); + } + + auto decompressed = out.size() - zstr.avail_out; + if (decompressed == 0) + return traits_type::eof(); + + setg(out.data(), out.data(), out.data() + decompressed); + return traits_type::to_int_type(out[0]); +} +``` + +Key behaviors: +- Reads compressed data in 32 KB chunks from the underlying stream +- Decompresses into the output buffer +- On `Z_STREAM_END`, seeks the underlying stream back by the number of unconsumed bytes, so subsequent reads on the same stream work correctly (important for concatenated data) +- Throws `zlib_error` on decompression errors + +### izlibstream + +```cpp +class izlibstream : public std::istream +{ +public: + explicit izlibstream(std::istream& input, + int window_bits = 32 + MAX_WBITS); + +private: + inflate_streambuf buf; +}; +``` + +A simple `std::istream` wrapper around `inflate_streambuf`: + +```cpp +izlibstream::izlibstream(std::istream& input, int window_bits) + : std::istream(&buf), buf(input, window_bits) +{} +``` + +Usage: +```cpp +std::ifstream file("level.dat", std::ios::binary); +zlib::izlibstream 
zs(file); +auto result = nbt::io::read_compound(zs); +``` + +--- + +## Compression: ozlibstream + +### deflate_streambuf + +```cpp +class deflate_streambuf : public zlib_streambuf +{ +public: + explicit deflate_streambuf(std::ostream& output, + int level = Z_DEFAULT_COMPRESSION, + int window_bits = MAX_WBITS); + ~deflate_streambuf(); + + int_type overflow(int_type ch) override; + int sync() override; + void close(); + +private: + std::ostream& os; + bool closed_ = false; + + void deflate_chunk(int flush); +}; +``` + +**Constructor:** +```cpp +deflate_streambuf::deflate_streambuf(std::ostream& output, + int level, int window_bits) + : os(output) +{ + in.resize(bufsize); + out.resize(bufsize); + + zstr.zalloc = Z_NULL; + zstr.zfree = Z_NULL; + zstr.opaque = Z_NULL; + + int status = deflateInit2(&zstr, level, Z_DEFLATED, + window_bits, 8, Z_DEFAULT_STRATEGY); + if (status != Z_OK) + throw zlib_error(zstr, status); + + setp(in.data(), in.data() + in.size()); +} +``` + +Parameters: +- `level`: Compression level (0–9, or `Z_DEFAULT_COMPRESSION` = -1) +- `window_bits`: Format control + - `MAX_WBITS` (15): zlib format (zlib header and Adler-32 trailer) + - `MAX_WBITS + 16` (31): gzip format + +**overflow() — Buffer full, deflate and flush:** +```cpp +deflate_streambuf::int_type deflate_streambuf::overflow(int_type ch) +{ + deflate_chunk(Z_NO_FLUSH); + if (ch != traits_type::eof()) { + *pptr() = traits_type::to_char_type(ch); + pbump(1); + } + return ch; +} +``` + +**sync() — Flush current buffer:** +```cpp +int deflate_streambuf::sync() +{ + deflate_chunk(Z_SYNC_FLUSH); + return 0; +} +``` + +**deflate_chunk() — Core compression loop:** +```cpp +void deflate_streambuf::deflate_chunk(int flush) +{ + zstr.next_in = reinterpret_cast<Bytef*>(pbase()); + zstr.avail_in = pptr() - pbase(); + + do { + zstr.next_out = reinterpret_cast<Bytef*>(out.data()); + zstr.avail_out = out.size(); + + int status = deflate(&zstr, flush); + if (status != Z_OK && status != Z_STREAM_END + && status != Z_BUF_ERROR) + throw 
zlib_error(zstr, status); + + auto compressed = out.size() - zstr.avail_out; + if (compressed > 0) + os.write(out.data(), compressed); + } while (zstr.avail_out == 0); + + setp(in.data(), in.data() + in.size()); +} +``` + +**close() — Finalize compression:** +```cpp +void deflate_streambuf::close() +{ + if (closed_) return; + closed_ = true; + deflate_chunk(Z_FINISH); +} +``` + +`close()` must be called so that the final compressed block is written with `Z_FINISH`. Repeated calls are harmless because of the `closed_` guard. + +### ozlibstream + +```cpp +class ozlibstream : public std::ostream +{ +public: + explicit ozlibstream(std::ostream& output, + int level = Z_DEFAULT_COMPRESSION, + bool gzip = true); + + void close(); + +private: + deflate_streambuf buf; +}; +``` + +**Constructor:** +```cpp +ozlibstream::ozlibstream(std::ostream& output, int level, bool gzip) + : std::ostream(&buf), + buf(output, level, MAX_WBITS + (gzip ? 16 : 0)) +{} +``` + +The `gzip` parameter (default: `true`) controls the output format: +- `true`: gzip format (`window_bits = MAX_WBITS + 16 = 31`) +- `false`: zlib format (`window_bits = MAX_WBITS = 15`) + +**close() — Exception-safe stream finalization:** +```cpp +void ozlibstream::close() +{ + try { + buf.close(); + } catch (...) { + setstate(std::ios::badbit); + if (exceptions() & std::ios::badbit) + throw; + } +} +``` + +`close()` catches exceptions from `buf.close()` and converts them to a badbit state. If the stream has badbit exceptions enabled, it re-throws. + +--- + +## Format Detection + +### Automatic Detection (Reading) + +The default inflate `window_bits = 32 + MAX_WBITS` enables automatic format detection: + +| Bits | Format | +|------|--------| +| `-MAX_WBITS` (-15) | Raw deflate (no header or trailer) | +| `MAX_WBITS` (15) | Zlib only | +| `MAX_WBITS + 16` (31) | Gzip only | +| `MAX_WBITS + 32` (47) | Auto-detect gzip or zlib | + +The library defaults to auto-detect (47), so it handles both gzip and zlib input transparently. 
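
The `window_bits` encoding is easy to get wrong when written as bare numbers, so a small named mapping can help. The sketch below follows the values described in zlib's `inflateInit2` documentation; `inflate_mode`, `inflate_window_bits`, and the local `max_wbits` constant are hypothetical names for illustration, and `max_wbits` is assumed to equal zlib's `MAX_WBITS` (15 in stock builds) so the example compiles without `<zlib.h>`.

```cpp
// Assumed to mirror zlib's MAX_WBITS; defined locally so that this sketch
// does not depend on <zlib.h>.
constexpr int max_wbits = 15;

// Input formats that inflateInit2 can be asked to accept.
enum class inflate_mode { raw_deflate, zlib_only, gzip_only, auto_detect };

// Hypothetical helper: map a mode to the window_bits argument.
constexpr int inflate_window_bits(inflate_mode mode)
{
    return mode == inflate_mode::raw_deflate ? -max_wbits      // no header
         : mode == inflate_mode::zlib_only   ? max_wbits       // zlib header
         : mode == inflate_mode::gzip_only   ? max_wbits + 16  // gzip header
         :                                     max_wbits + 32; // zlib or gzip
}

static_assert(inflate_window_bits(inflate_mode::auto_detect) == 47,
              "auto-detect is MAX_WBITS + 32");
```

Assuming the `izlibstream` constructor shown earlier, a caller that wants to reject anything but gzip could pass `inflate_window_bits(inflate_mode::gzip_only)` as the second argument.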
+ +### Explicit Format (Writing) + +When writing, there is no auto-detection; the format is chosen through the `gzip` constructor argument, which defaults to gzip: + +```cpp +// Gzip format (default for Minecraft) +zlib::ozlibstream gzip_out(file); // gzip=true +zlib::ozlibstream gzip_explicit(file, Z_DEFAULT_COMPRESSION, true); // explicit + +// Zlib format +zlib::ozlibstream zlib_out(file, Z_DEFAULT_COMPRESSION, false); +``` + +--- + +## Usage Examples + +### Reading a Gzip-Compressed NBT File + +```cpp +std::ifstream file("level.dat", std::ios::binary); +zlib::izlibstream zs(file); +auto [name, root] = nbt::io::read_compound(zs); +``` + +### Writing a Gzip-Compressed NBT File + +```cpp +std::ofstream file("level.dat", std::ios::binary); +zlib::ozlibstream zs(file); // gzip by default +nbt::io::write_tag("", root, zs); +zs.close(); // MUST call close() to finalize +``` + +### Roundtrip Compression + +```cpp +// Compress +std::stringstream ss; +{ + zlib::ozlibstream zs(ss); + nbt::io::write_tag("test", compound, zs); + zs.close(); +} + +// Decompress +ss.seekg(0); +{ + zlib::izlibstream zs(ss); + auto [name, tag] = nbt::io::read_compound(zs); + // tag now contains the original compound +} +``` + +### Controlling Compression Level + +```cpp +// No compression (fastest) +zlib::ozlibstream fast(file, Z_NO_COMPRESSION); + +// Best compression (slowest) +zlib::ozlibstream best(file, Z_BEST_COMPRESSION); + +// Default compression (balanced) +zlib::ozlibstream default_level(file, Z_DEFAULT_COMPRESSION); + +// Specific level (0-9) +zlib::ozlibstream level6(file, 6); +``` + +--- + +## Error Handling + +### zlib_error + +All zlib API failures throw `zlib::zlib_error`: + +```cpp +try { + zlib::izlibstream zs(file); + auto result = nbt::io::read_compound(zs); +} catch (const zlib::zlib_error& e) { + std::cerr << "Zlib error: " << e.what() + << " (status: " << e.status() << ")\n"; +} +``` + +Common error codes: +| Status | Meaning | +|--------|---------| +| `Z_DATA_ERROR` | Corrupted compressed data | +| `Z_MEM_ERROR` | Insufficient memory | +| `Z_BUF_ERROR` | Buffer/progress 
error | +| `Z_STREAM_ERROR` | Invalid parameters | + +### Stream State + +After a decompression error, the `izlibstream` may be left in a bad state. When `ozlibstream::close()` catches an exception from the deflate buffer, it sets `std::ios::badbit` on the stream. + +--- + +## Resource Management + +### Destructors + +Both `inflate_streambuf` and `deflate_streambuf` call the corresponding zlib cleanup in their destructors: + +```cpp +inflate_streambuf::~inflate_streambuf() +{ + inflateEnd(&zstr); +} + +deflate_streambuf::~deflate_streambuf() +{ + deflateEnd(&zstr); +} +``` + +These release all memory allocated by zlib. The destructors cannot throw: `inflateEnd` and `deflateEnd` are C functions that report failure through their return value. + +### Important: Call close() Before Destruction + +For `ozlibstream`, you **must** call `close()` before the stream is destroyed to ensure the final compressed block is flushed. The destructor calls `deflateEnd()` but does **not** call `close()`; failing to close will produce a truncated compressed file. + +```cpp +{ + std::ofstream file("output.dat", std::ios::binary); + zlib::ozlibstream zs(file); + nbt::io::write_tag("", root, zs); + zs.close(); // Required! +} // file and zs destroyed safely +```
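
Because forgetting `close()` silently truncates the output, one possible safeguard is a small scope guard that closes the stream on every exit path, including early returns and exceptions. The following is a minimal sketch under that assumption; `close_guard` is a hypothetical helper and not part of libnbt++. It works with any type that has a `close()` member.

```cpp
// Calls close() on the wrapped stream when the guard leaves scope.
// Hypothetical helper for illustration; not part of libnbt++.
template <class Stream>
class close_guard
{
public:
    explicit close_guard(Stream& s) noexcept : stream_(s) {}

    // Non-copyable: exactly one guard is responsible for one stream.
    close_guard(const close_guard&) = delete;
    close_guard& operator=(const close_guard&) = delete;

    ~close_guard()
    {
        // Destructors must not throw, so any failure is swallowed here.
        try {
            stream_.close();
        } catch (...) {
        }
    }

private:
    Stream& stream_;
};
```

The trade-off is that errors raised during the guarded `close()` are lost. Since `deflate_streambuf::close()` returns early once the stream is closed, calling `zs.close()` explicitly on the success path and keeping the guard only as a fallback lets the caller still observe finalization errors.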
