From 12b8eb0529a73d55f2913406ba41ccf0486a7b03 Mon Sep 17 00:00:00 2001 From: Ida Iyes Date: Thu, 15 Aug 2024 18:33:12 +0300 Subject: [PATCH] update docs on dataformat --- doc/src/SUMMARY.md | 8 +- doc/src/dataformat/file.md | 45 ++++++ .../frames.md} | 98 +----------- doc/src/dataformat/intro.md | 56 +++++++ doc/src/dataformat/is.md | 95 ++++++++++++ .../msgs.md} | 144 +----------------- doc/src/tech.md | 1 - 7 files changed, 210 insertions(+), 237 deletions(-) create mode 100644 doc/src/dataformat/file.md rename doc/src/{tech/dataformat-spectator.md => dataformat/frames.md} (52%) create mode 100644 doc/src/dataformat/intro.md create mode 100644 doc/src/dataformat/is.md rename doc/src/{tech/dataformat-player.md => dataformat/msgs.md} (71%) delete mode 100644 doc/src/tech.md diff --git a/doc/src/SUMMARY.md b/doc/src/SUMMARY.md index b72862a..0698571 100644 --- a/doc/src/SUMMARY.md +++ b/doc/src/SUMMARY.md @@ -26,6 +26,8 @@ - [Session Configuration](./server/sessions.md) - [Security](./server/security.md) - [Runtime Management](./server/management.md) -- [Technical Documentation](./tech.md) - - [Player Stream Format](./tech/dataformat-player.md) - - [Spectator/Replay Stream Format](./tech/dataformat-spectator.md) +- [MineWars DataFormat](./dataformat/intro.md) + - [File Structure](./dataformat/file.md) + - [Initialization Sequence (IS)](./dataformat/is.md) + - [Game Updates and Framing](./dataformat/frames.md) + - [Game Update Messages](./dataformat/msgs.md) diff --git a/doc/src/dataformat/file.md b/doc/src/dataformat/file.md new file mode 100644 index 0000000..32403b0 --- /dev/null +++ b/doc/src/dataformat/file.md @@ -0,0 +1,45 @@ +# File Structure + +A MineWars File contains the following, in order: + - [File Header](#file-header) + - [Initialization Sequence](./is.md) + - [Frames of Game Updates](./frames.md) + +## File Header + +The file header has the following structure: + - `[u64; 3]`: checksums + - `u32`: length of compressed frame data in bytes + - `u32`: 
length of uncompressed frame data in bytes + +If compressed length == uncompressed length, the frame data is stored uncompressed. + +If compressed length < uncompressed length, all the frames are compressed as a single big LZ4 block. + +### Checksums + +The file begins with 3 SeaHash checksums. + +The first checksum covers: + - the remainder of the file header, incl. the following 2 checksums + - the header part of the [Initialization Sequence](./is.md) + +The second checksum covers: + - the data of the Initialization Sequence (everything after the header) + +The third checksum covers: + - all the frame data + +## Initialization Sequence + +After the File Header follows the [Initialization Sequence](./is.md). + +## Frames + +After the IS follow [frames of game updates](./frames.md). + +Note: neither the length of the IS nor the start offset of the frame data +is encoded in the file header. The IS Header must be parsed to compute them. + +It is thus impossible to read the frames from a MineWars file without +decoding the IS first. diff --git a/doc/src/tech/dataformat-spectator.md b/doc/src/dataformat/frames.md similarity index 52% rename from doc/src/tech/dataformat-spectator.md rename to doc/src/dataformat/frames.md index 804bc3f..1962356 100644 --- a/doc/src/tech/dataformat-spectator.md +++ b/doc/src/dataformat/frames.md @@ -1,61 +1,4 @@ -# Spectator/Replay Stream Format - -The Spectator Protocol is essentially a container format that multiplexes -multiple [player protocol](./dataformat-player.md) streams (one for each player in -the game, representing their view of the world) together, along with a global -"spectator view" stream (also in the same format) providing a global view -of the game world. - -This is used to give spectator clients all the data they need to simultaneously -follow all participants in the game. This is also the file format used for -replay files. 
- -## Stream Structure - -The contents of the stream/file appear in this order: - - - File Header (file only) - - Initialization Sequence - - [... frames ...] - -## File Header - -In the case of a replay file, a header is prepended. - -The file header has the following structure: - - `[u64; 3]`: checksums - - `u32`: length of compressed frame data in bytes - - `u32`: length of uncompressed frame data in bytes - -If compressed length == uncompressed length, the frames data is stored uncompressed. - -If compressed length < uncompressed length, all the frames are compressed as a single big LZ4 block. - -## Checksums - -Checksums are only used in the case of replay files. Network streams do -not have checksums. In that case, the transport protocol is assumed to be -responsible for data integrity. - -The file begins with 3 SeaHash checksums. - -The first checksum covers: - - the remainder of the file header, incl. the following 2 checksums - - the header part of the Initialization Sequence - -The second checksum covers: - - the data of the Initialization Sequence (everything after the header) - -The third checksum covers: - - all the frames data - -## Initialization Sequence - -This is the same as described in the [player protocol documentation](./dataformat-player.md). - -However, starting item positions should be encoded inside the map data. - -## Frames +# Game Updates and Framing A Frame is a collection of game updates that happen together at the same time. It encodes the point of view of every player in the game who is involved + a @@ -64,13 +7,13 @@ the frame. Note: it is not a requirement that *all* game update messages from the same timestamp are encoded together. They may be fragmented into multiple frames. -Subsequent frames would just have the timestamp field set to zero. +Subsequent frames would just have their time offset set to zero. Such fragmentation is necessary if the frame payload exceeds 256 bytes in length. 
There are three kinds of frame encodings: Homogenous, Heterogenous, Keepalive. -### Homogenous Frames +## Homogenous Frames Homogenous frames are frames where every participant gets the same data. The data is only encoded once and assumed to apply to all participating streams. @@ -94,7 +37,7 @@ Initialization Sequence. `u8` if `max_plid <= 7`, `u16` if `max_plid >= 8`. The data payload is the [player protocol update messages](./dataformat-player.md#gameplay-messages). All of the players listed in the participation mask must receive the entire identical data payload. -### Heterogenous Frames +## Heterogenous Frames Heterogenous frames are freams where each participant gets different data. The data for each participating stream is included in the frame. @@ -128,7 +71,7 @@ messages](./dataformat-player.md#gameplay-messages) for that view. The total length of the data payload is the sum of the lengths of each view's data, as given in the Heterogenous Frame Header described above. -### Keepalive Frames +## Keepalive Frames Keepalive frames are to be used if the time delta since the last frame is too long to be represented in a single frame header. It is an empty frame with no @@ -141,34 +84,3 @@ Keepalive frames have the following structure: - `u16`: `-111111111111111` Note: there is no participation mask, no data length field, no data payload - -## The Global Spectator View - -The global spectator view behaves somewhat differently from the player views. - - - No fog of war must be displayed - - Digits are to be calculated by the client, from known mine locations - -To accommodate this, there are some special provisions in the spectator -stream format, that differ from the player stream. - -The initialization sequence encodes mine positions inside the map data. 
- -The global spectator view is controlled using the same update message format -as player views, but some message types are used differently: - - "Digit Update" and "Capture + Digits" must have the tile owner inferred - from the participation mask. The mask must encode only one PlayerID - (other than bit 0 for the spectator stream). - -## Compression Dictionary - -A special dictionary is prepared to help improve compression of the update -frames. It is to be generated from the data in the initialization sequence. - -It is constructed by concatenating the following data: - - - Every mountain coordinate on the map, in sorted order. - - Every land coordinate on the map, in sorted order. - -This effectively pre-seeds the compression algorithm with data sequences -likely to occur early-game. diff --git a/doc/src/dataformat/intro.md b/doc/src/dataformat/intro.md new file mode 100644 index 0000000..9c1d65f --- /dev/null +++ b/doc/src/dataformat/intro.md @@ -0,0 +1,56 @@ +# MineWars Data Format + +The MineWars Data Format is the file format and encoding used to store MineWars +game data. Specifically, this is the format of `*.minewars` files that can +store maps and replays. It is implemented in the `mw_dataformat` Rust crate. + +The Data Format is capable of storing: + - Map Data + - Other parameters and metadata of the game session + - A stream of gameplay updates for all players in the game, multiplexed together, + with timing information, to allow for watching a replay of a game. + +It is not to be confused with the MineWars Player Protocol, which is what is used +over-the-wire for communication between the Game Client App and Host Server for +networked multiplayer gameplay. + +The Player Protocol does internally use the Data Format for some purposes, such as: + - Transmitting the map data and configuration metadata to start a game session (Initialization Sequence). + - Encoding of most gameplay updates/events during gameplay (Game Update Messages). 
+ - Multiplexing the PoVs of all the players in the game for sending to spectators (Framing). + +However, the Player Protocol also does a lot more. The full Player Protocol +is proprietary and not publicly documented. + +The Player Protocol and the Data Format are versioned separately (and separately from +the MineWars client and server software), but both of their versions are important for +compatibility. + +Reusing the encoding of map data and gameplay updates between all of these use cases +(live gameplay, spectation, replay files) makes it easier to implement all of this +functionality in MineWars. That is the design goal of the Data Format. + +## General Properties of the Data Format + +This is a custom purpose-built compact binary format. + +All multi-byte values are encoded as **big endian** and unaligned. + +All **coordinates** are encoded as `(row: u8, col: u8)` (note (Y,X) order). +In places where a sequence of multiple coordinates is listed, it is recommended +to encode them in sorted order. This helps compression. + +Some places use a special encoding for **time durations**: + +|Bits |Meaning | +|----------|-------------------------| +|`0xxxxxxx`| `x` milliseconds | +|`10xxxxxx`| (`x` + 13) centiseconds | +|`11xxxxxx`| (`x` + 8) deciseconds | + +**PlayerId**: a value between 1-15 inclusive. + +**PlayerSubId**: a value between 0-14 inclusive. + +You will also need to bring an LZ4 implementation supporting **raw blocks**. +The `lz4_flex` Rust crate is perfect. :) diff --git a/doc/src/dataformat/is.md b/doc/src/dataformat/is.md new file mode 100644 index 0000000..2c08714 --- /dev/null +++ b/doc/src/dataformat/is.md @@ -0,0 +1,95 @@ +# Initialization Sequence (IS) + +The IS is what sets the general configuration and metadata of the game and +encodes the initial state of the map that the game will be played on. 
+ +## Header + +It begins with a header: + - (`u8`,`u8`,`u8`,`u8`): Data Format Version + - `u8`: flags + - `u8`: map size (radius) + - `u8`: `max_plid` (bits 0-3), `max_sub_plid` (bits 4-7) + - `u8`: number of cities/regions + - `u32`: length of compressed map data in bytes + - `u16`: length of the Rules data + - `u16`: length of the Cits names data + +The `flags` field is encoded as follows: + +|Bits |Meaning | +|----------|----------------------------| +|`----0---`| Game uses a hexagonal grid | +|`----1---`| Game uses a square grid | +|`xxx--xxx`|(reserved bits) | + +## Map Data + +Then follows the map data. + +If compressed length < uncompressed length, the data is LZ4 compressed. + +If compressed length == uncompressed length, the data is raw/uncompressed. + +The compressed length is stored in the header. The uncompressed length must +be computed from the map radius. + +First, the map data is encoded as one byte per tile: + +|Bits |Meaning | +|----------|----------------------------| +|`----xxxx`| Tile Kind | +|`xxxx----`| Item Kind | + +Tile Kind: same encoding as the "Tile Kind Update" message below. +Item Kind: same encoding as the "Reveal Item" message below. + +The Item Kind is useful for spectators and replay files, so that they don't +need to start with a long sequence of "Reveal Item" messages at tick 0 for +all the initial items on the map. Other use cases (such as "map files") +may just always set it to zero. + +The tiles are encoded in concentric-ring order, starting from the center of +the map. The map data ends when all rings up until the map radius specified in +the header have been encoded. + +Each ring starts from the lowest (Y,X) coordinate and follows the +X direction first: + +Square example: +``` +654 +7.3 +012 +``` + +Hex example: +``` + 4 3 +5 . 2 + 0 1 +``` + +(`0` is the starting position, assuming +X points right and +Y points up) + +After the map data, regions are encoded the same way: one byte per tile, in +concentric ring order. 
The byte is the city/region ID for that tile. + +If the number of cities/regions is 0, this part of the map data is skipped. + +## City Info + +First, locations for each city on the map: + - `(u8, u8)`: (y, x) location + +Then, names for each city on the map: + - `u8`: length in bytes + - …: phonemes + +The name uses a special Phoneme encoding (undocumented, see source code), +which can be rendered/localized based on client language. + +## Game Parameters / Rules + +Then follow the parameters used for the game rules, in this game. + +// TODO diff --git a/doc/src/tech/dataformat-player.md b/doc/src/dataformat/msgs.md similarity index 71% rename from doc/src/tech/dataformat-player.md rename to doc/src/dataformat/msgs.md index 27b75f8..4696c9c 100644 --- a/doc/src/tech/dataformat-player.md +++ b/doc/src/dataformat/msgs.md @@ -1,136 +1,7 @@ -# Player Stream Format +# Game Update Messages -This page describes all the encodings used for data sent from the server to -the client. - -This message format is also used inside of the [spectator/replay -message](./dataformat-spectator.md) format (which encapsulates multiple streams of -this player message format). - ---- - -## Prerequisites for Implementation - -This is a custom purpose-built compact binary format. - -All multi-byte values are encoded as **big endian** and unaligned. - -All **coordinates** are encoded as `(row: u8, col: u8)` (note (Y,X) order). -In places where a sequence of multiple coordinates is listed, it is recommended -to encode them in sorted order. This helps compression. - -All **time durations** are encoded as: - -|Bits |Meaning | -|----------|-------------------------| -|`0xxxxxxx`| `x` milliseconds | -|`10xxxxxx`| (`x` + 13) centiseconds | -|`11xxxxxx`| (`x` + 8) deciseconds | - -**PlayerId**: a value between 1-15 inclusive. - -You will also need to bring a LZ4 implementation supporting **raw blocks** -and dictionary data. The `lz4_flex` Rust crate is perfect. 
:) - -## Initialization Sequence - -When a connected player is successfully authenticated and ready to begin -the game, it will receive an **initialization sequence**, which includes -metadata about the game session, and the map data. - -### Header - -It begins with a header: - - (`u8`,`u8`,`u8`,`u8`): protocol version - - `u8`: flags - - `u8`: map size (radius) - - `u8`: `max_plid` (bits 0-3), `max_sub_plid` (bits 4-7) - - `u8`: number of cities/regions - - `u32`: length of compressed map data in bytes - - `u16`: length of the Rules data - - `u16`: length of the Cits names data - -The `flags` field is encoded as follows: - -|Bits |Meaning | -|----------|----------------------------| -|`----0---`| Game uses a hexagonal grid | -|`----1---`| Game uses a square grid | -|`xxx--xxx`|(reserved bits) | - -#### Map Data - -Then follows the map data. - -If compressed length < uncompressed length, the data is LZ4 compressed. - -If compressed length == uncompressed length, the data is raw/uncompressed. - -First, the map data is encoded as one byte per tile: - -|Bits |Meaning | -|----------|----------------------------| -|`----xxxx`| Tile Kind | -|`xxxx----`| Item Kind | - -Tile Kind: same encoding as the "Tile Kind Update" message below. -Item Kind: same encoding as the "Reveal Item" message below. - -The Item Kind is only used for spectator streams and replay files, so that they -don't need to start with a long sequence of "Reveal Item" messages at tick 0 -for all the initial items on the map. In player streams, this field should be 0. - -If any starting Structures must be encoded (say for a custom game mode / scenario), -initialize them using regular gameplay messages at tick 0. - -The tiles are encoded in concentric-ring order, starting from the center of -the map. The map data ends when all rings up until the map radius specified in -the header have been encoded. 
- -Each ring starts from the lowest (Y,X) coordinate and follows the +X direction first: - -Square example: -``` -654 -7.3 -012 -``` - -Hex example: -``` - 4 3 -5 . 2 - 0 1 -``` - -(`0` is the starting position, assuming +X points right and +Y points up) - -After the map data, regions are encoded the same way: one byte per tile, in -concentric ring order. The byte is the city/region ID for that tile. - -### City Info - -First, locations for each city on the map: - - `(u8, u8)`: (y, x) location - -Then, names for each city on the map: - - `u8`: length in bytes - - …: phonemes - -The name uses a special Phoneme encoding (undocumented, see source code), -which can be rendered/localized based on client language. - -### Game Parameters / Rules - -Then follow the parameters used for the game rules, in this game. - -// TODO - -## Gameplay Messages - -Updates for the player are encoded as a raw uncompressed block of data -consisting of any number of **messages** concatenated together. Each message -is a variable-length byte sequence. +Updates for the players are encoded as any number of **messages** concatenated +together. Each message is a variable-length byte sequence. Each message is at least one byte long. 
The type of the message is determined by magic bits in that first byte (similar to opcodes in CPU instruction set @@ -213,7 +84,7 @@ The next byte specifies the message kind (what happened): |Bits |Meaning |Granularity|Assembly |Class | |----------|----------------|-----------|----------------------------|-------------| -|`00000000`| Joined |PlayerSubId|`JOIN name` |Notification | +|`00000000`| Joined |PlayerSubId|`JOIN` |Notification | |`00000001`| Ping/RTT Info |PlayerSubId|`RTT millis` |Unreliable | |`00000010`| Timeout |Either |`TIMEOUT millis` |Notification | |`00000011`| TimeoutDone |Either |`RESUME` |Notification | @@ -225,14 +96,7 @@ The next byte specifies the message kind (what happened): |`00001001`| Surrendered |PlayerId |`SURRENDER` |Notification | |`00001010`| Disconnected |PlayerSubId|`LEAVE` |Notification | |`00001011`| Kicked |PlayerSubId|`KICK` |Notification | -|`00001100`| Vote No |PlayerSubId|`VOTE id N` |Background | -|`00001101`| Vote Yes |PlayerSubId|`VOTE id Y` |Background | -|`00001110`| Vote Failed |PlayerSubId|`VOTEFAIL id` |Background | -|`00001111`| Vote Success |PlayerSubId|`VOTEPASS id` |Background | -|`00010000`| Chat (All) |PlayerSubId|`CHATALL string` |Background | -|`00010001`| Chat (Friendly)|PlayerSubId|`CHAT string` |Background | |`00010010`| MatchTimeRemain|Either |`TIMELIMIT secs` |Notification | -|`00010011`| Initiate Vote |PlayerSubId|`VOTENEW id string` |Background | |`10001000`| Capturing City |Either |`CITCAPTING citid millis` |PvP | |`10001001`| Capture City |Either |`CITCAPTURE citid` |PvP | |`10001010`| Contested City |Either |`CITCONTEST citid` |PvP | diff --git a/doc/src/tech.md b/doc/src/tech.md deleted file mode 100644 index 33fa5e5..0000000 --- a/doc/src/tech.md +++ /dev/null @@ -1 +0,0 @@ -# Technical Documentation
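
The time duration table in `doc/src/dataformat/intro.md` can be read as a one-byte decoder. The following is a minimal Rust sketch under that reading, not code from the `mw_dataformat` crate: the function name `decode_duration_ms` is hypothetical, and it assumes the leading bits shown in the table are the most significant bits of the byte.

```rust
/// Decode one time-duration byte into milliseconds.
///
/// Hypothetical helper written from the table in intro.md:
///   0xxxxxxx -> x milliseconds
///   10xxxxxx -> (x + 13) centiseconds
///   11xxxxxx -> (x + 8) deciseconds
/// Assumption: the leftmost bit in the table is the most significant bit.
pub fn decode_duration_ms(byte: u8) -> u32 {
    // Low six bits, used by the centisecond and decisecond forms.
    let x6 = u32::from(byte & 0x3F);
    match byte >> 6 {
        // 10xxxxxx: (x + 13) centiseconds, i.e. 130 ms up to 760 ms.
        0b10 => (x6 + 13) * 10,
        // 11xxxxxx: (x + 8) deciseconds, i.e. 800 ms up to 7100 ms.
        0b11 => (x6 + 8) * 100,
        // 0xxxxxxx: the low seven bits are plain milliseconds (0 to 127).
        _ => u32::from(byte & 0x7F),
    }
}
```

Under this reading the three ranges do not overlap: the millisecond form ends at 127 ms, the centisecond form starts at 130 ms, and the decisecond form takes over at 800 ms, which is presumably why the offsets 13 and 8 were chosen.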