From 1482add39abf264ea93c5c0bb5fa5fa2929fcc68 Mon Sep 17 00:00:00 2001
From: Koz Ross
Date: Thu, 6 Jun 2024 11:02:57 +1200
Subject: [PATCH] Fix bitwise binary op descriptions, specify padding more explicitly

---
 CIP-0122/CIP-0122.md | 136 ++++++++++++++++++++++---------------------
 1 file changed, 71 insertions(+), 65 deletions(-)

diff --git a/CIP-0122/CIP-0122.md b/CIP-0122/CIP-0122.md
index dea0b435e6..dc7849d1b7 100644
--- a/CIP-0122/CIP-0122.md
+++ b/CIP-0122/CIP-0122.md
@@ -304,11 +304,38 @@ _maximum_ of the two arguments (which we call _padding semantics_). As these can
 both be useful depending on context, we allow both, controlled by a
 `BuiltinBool` flag, on all the operations listed above.
 
+In cases where we have arguments of different lengths, in order to produce a
+result of the appropriate length, one of the arguments needs to be either
+padded or truncated. Let `short` and `long` refer to the `BuiltinByteString`
+argument of shorter length, and of longer length, respectively. The following
+table describes what happens to the arguments before the operation:
+
+| Semantics | `short` | `long` |
+|-----------|---------|--------|
+| Padding | Pad at high _byte_ indexes | Unchanged |
+| Truncation | Unchanged | Truncate high _byte_ indexes |
+
+We pad with different bytes depending on operation: for `bitwiseLogicalAnd`, we
+pad with `0xFF`, while for `bitwiseLogicalOr` and `bitwiseLogicalXor` we pad
+with `0x00` instead. We refer to arguments so changed as
+_semantics-modified_ arguments.
+
 For example, consider the `BuiltinByteString`s `x = [0x00, 0xF0, 0xFF]` and `y =
-[0xFF, 0xF0]`. Under padding semantics, the result of `bitwiseLogicalAnd`,
-`bitwiseLogicalOr` or `bitwiseLogicalXor` using these two as arguments would
-have a length of 3; under truncation semantics, the result would have a length
-of 2 instead.
+[0xFF, 0xF0]`. The following table describes what the semantics-modified
+versions of these arguments would become for each operation and each semantics:
+
+| Operation | Semantics | `x` | `y` |
+|-----------|-----------|-----|-----|
+| `bitwiseLogicalAnd` | Padding | `[0x00, 0xF0, 0xFF]` | `[0xFF, 0xF0, 0xFF]` |
+| `bitwiseLogicalAnd` | Truncation | `[0x00, 0xF0]` | `[0xFF, 0xF0]` |
+| `bitwiseLogicalOr` | Padding | `[0x00, 0xF0, 0xFF]` | `[0xFF, 0xF0, 0x00]` |
+| `bitwiseLogicalOr` | Truncation | `[0x00, 0xF0]` | `[0xFF, 0xF0]` |
+| `bitwiseLogicalXor` | Padding | `[0x00, 0xF0, 0xFF]` | `[0xFF, 0xF0, 0x00]` |
+| `bitwiseLogicalXor` | Truncation | `[0x00, 0xF0]` | `[0xFF, 0xF0]` |
+
+Based on the above, we observe that under padding semantics, the result of any
+of the listed operations would have a byte length of 3, while under truncation
+semantics, the result would have a byte length of 2 instead.
 
 #### `bitwiseLogicalAnd`
 
@@ -320,26 +347,19 @@ of 2 instead.
 2. The first input `BuiltinByteString`. This is the _first data argument_.
 3. The second input `BuiltinByteString`. This is the _second data argument_.
 
-Let $b_1, b_2$ refer to the first data argument and the second data
-argument respectively, and let $n_1, n_2$ be their respective lengths in bytes.
-Let the result of `bitwiseLogicalAnd`, given $b_1, b_2$ and some padding
-semantics argument, be $b_r$, of length $n_r$ in bytes. We use $b_1[i]$ to refer
-to the value at index $i$ of $b_1$ (and analogously for
-$b_2, b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme)
-for the exact specification of this.
+Let $b_1, b_2$ refer to the semantics-modified first data argument and
+semantics-modified second data argument respectively, and let $n$ be either of
+their lengths in bytes; see the
+[section on padding versus truncation semantics](#padding-versus-truncation-semantics)
+for the exact specification of this. Let the result of `bitwiseLogicalAnd`, given
+$b_1, b_2$ and some padding semantics argument, be $b_r$, also of length $n$
+in bytes. We use $b_1\\{i\\}$ to refer to the byte at index $i$ in $b_1$ (and
+analogously for $b_2$, $b_r$); see the [section on the bit indexing
+scheme](#bit-indexing-scheme) for the exact specification of this.
 
-If the padding semantics argument is `True`, then we have $n_r = \max \\{ n_1,
-n_2 \\}$; otherwise, $n_r = \min \\{ n_1, n_2 \\}$. For all $i \in 0, 1, \ldots 8
-\cdot n_r - 1$, we have
-
-$$
-b_r[i] = \begin{cases}
- b_0[i] & \text{if } n_1 < n_0 \text{ and } i \geq 8 \cdot \min \\{ n_1, n_2 \\} \\
- b_1[i] & \text{if } n_0 < n_1 \text { and } i \geq 8 \cdot \min \\{ n_1, n_2 \\} \\
- 1 & \text{if } b_0[i] = b_1[i] = 1 \\
- 0 & \text{otherwise} \\
- \end{cases}
-$$
+For all $i \in 0, 1, \ldots, n - 1$, we have
+$b_r\\{i\\} = b_1\\{i\\} \text{ } \\& \text{ } b_2\\{i\\}$, where $\\&$ refers to a
+[bitwise AND][bitwise-and].
 
 Some examples of the intended behaviour of `bitwiseLogicalAnd` follow. For
 brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.
@@ -378,29 +398,19 @@ bitwiseLogicalAnd False [0x4F, 0x00] [0xF4] => [0x44, 0x00]
 2. The first input `BuiltinByteString`. This is the _first data argument_.
 3. The second input `BuiltinByteString`. This is the _second data argument_.
 
-Let $b_1, b_2$ refer to the first data argument and the second data
-argument respectively, and let $n_1, n_2$ be their respective lengths in bytes.
-Let the result of `bitwiseLogicalOr`, given $b_1, b_2$ and some padding
-semantics argument, be $b_r$, of length $n_r$ in bytes. We use $b_1[i]$ to refer
-to the value at index $i$ of $b_1$ (and analogously for
-$b_2, b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme)
-for the exact specification of this.
-
-If the padding semantics argument is `True`, then we have $n_r = \max \{ n_1,
-n_2 \}$; otherwise, $n_r = \min \{ n_1, n_2 \}$. For all $i \in 0, 1, \ldots 8
-\cdot n_r - 1$, we have
-
-$$
-b_r[i] = \begin{cases}
- b_0[i] & \text{if } n_1 < n_0 \text{ and } i \geq 8 \cdot \min \\{ n_1, n_2 \\} \\
- b_1[i] & \text{if } n_0 < n_1 \text { and } i \geq 8 \cdot \min \\{ n_1, n_2 \\} \\
- 0 & \text{if } b_0[i] = b_1[i] = 0 \\
- 1 & \text{otherwise} \\
- \end{cases}
-$$
+Let $b_1, b_2$ refer to the semantics-modified first data argument and
+semantics-modified second data argument respectively, and let $n$ be either of
+their lengths in bytes; see the
+[section on padding versus truncation semantics](#padding-versus-truncation-semantics)
+for the exact specification of this. Let the result of `bitwiseLogicalOr`, given
+$b_1, b_2$ and some padding semantics argument, be $b_r$, also of length $n$
+in bytes. We use $b_1\\{i\\}$ to refer to the byte at index $i$ in $b_1$ (and
+analogously for $b_2$, $b_r$); see the [section on the bit indexing
+scheme](#bit-indexing-scheme) for the exact specification of this.
 
-Some examples of the intended behaviour of `bitwiseLogicalOr` follow. For
-brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.
+For all $i \in 0, 1, \ldots, n - 1$, we have
+$b_r\\{i\\} = b_1\\{i\\} \text{ } \| \text{ } b_2\\{i\\}$, where $\|$ refers to
+a [bitwise OR][bitwise-or].
 
 ```
 -- truncation semantics
@@ -436,26 +446,19 @@ bitwiseLogicalOr False [0x4F, 0x00] [0xF4] => [0xFF, 0x00]
 2. The first input `BuiltinByteString`. This is the _first data argument_.
 3. The second input `BuiltinByteString`. This is the _second data argument_.
 
-Let $b_1, b_2$ refer to the first data argument and the second data
-argument respectively, and let $n_1, n_2$ be their respective lengths in bytes.
-Let the result of `bitwiseLogicalXor`, given $b_1, b_2$ and some padding
-semantics argument, be $b_r$, of length $n_r$ in bytes. We use $b_1[i]$ to
-refer to the value at index $i$ of $b_1$ (and analogously for
-$b_2, b_r$); see the [section on the bit indexing scheme](#bit-indexing-scheme)
-for the exact specification of this.
-
-If the padding semantics argument is `True`, then we have $n_r = \max \{ n_1,
-n_2 \}$; otherwise, $n_r = \min \{ n_1, n_2 \}$. For all $i \in 0, 1, \ldots 8
-\cdot n_r - 1$, we have
+Let $b_1, b_2$ refer to the semantics-modified first data argument and
+semantics-modified second data argument respectively, and let $n$ be either of
+their lengths in bytes; see the
+[section on padding versus truncation semantics](#padding-versus-truncation-semantics)
+for the exact specification of this. Let the result of `bitwiseLogicalXor`, given
+$b_1, b_2$ and some padding semantics argument, be $b_r$, also of length $n$
+in bytes. We use $b_1\\{i\\}$ to refer to the byte at index $i$ in $b_1$ (and
+analogously for $b_2$, $b_r$); see the [section on the bit indexing
+scheme](#bit-indexing-scheme) for the exact specification of this.
 
-$$
-b_r[i] = \begin{cases}
- b_0[i] & \text{if } n_1 < n_0 \text{ and } i \geq 8 \cdot \min \\{ n_1, n_2 \\} \\
- b_1[i] & \text{if } n_0 < n_1 \text { and } i \geq 8 \cdot \min \\{ n_1, n_2 \\} \\
- 0 & \text{if } b_0[i] = b_1[i] \\
- 1 & \text{otherwise} \\
- \end{cases}
-$$
+For all $i \in 0, 1, \ldots, n - 1$, we have
+$b_r\\{i\\} = b_1\\{i\\} \text{ } \wedge \text{ } b_2\\{i\\}$, where $\wedge$ refers to
+a [bitwise XOR][bitwise-xor].
 
 Some examples of the intended behaviour of `bitwiseLogicalXor` follow. For
 brevity, we write `BuiltinByteString` literals as lists of hexadecimal values.
@@ -1485,4 +1488,7 @@ This CIP is licensed under [Apache-2.0](http://www.apache.org/licenses/LICENSE-2
 [blake2b]: https://en.wikipedia.org/wiki/BLAKE_(hash_function)
 [argon2]: https://en.wikipedia.org/wiki/Argon2
 [xor-crypto]: https://en.wikipedia.org/wiki/Exclusive_or#Bitwise_operation
-[cip-121-big-endian]: https://github.com/mlabs-haskell/CIPs/blob/koz/to-from-bytestring/CIP-0121/README.md#representation
+[cip-121-big-endian]: https://github.com/mlabs-haskell/CIPs/blob/koz/to-from-bytestring/CIP-0121/README.md#representation
+[bitwise-and]: https://en.wikipedia.org/wiki/Bitwise_operation#AND
+[bitwise-or]: https://en.wikipedia.org/wiki/Bitwise_operation#OR
+[bitwise-xor]: https://en.wikipedia.org/wiki/Bitwise_operation#XOR
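
The padding and truncation rules this patch describes can be sketched in Python as below. This is only an illustration under stated assumptions, not the reference implementation: the names `semantics_modified`, `bitwise_logical_and`, `bitwise_logical_or` and `bitwise_logical_xor` are hypothetical, `bytes` stands in for `BuiltinByteString`, and "high byte indexes" is taken to mean the end of the byte sequence, as in the `x`/`y` table in the patch.

```python
from operator import and_, or_, xor


def semantics_modified(padding, pad_byte, x, y):
    """Return both arguments at a common length (hypothetical helper).

    Padding: the shorter argument is extended at high byte indexes with
    pad_byte. Truncation: the longer argument loses its high byte indexes.
    """
    if padding:
        n = max(len(x), len(y))
        fill = bytes([pad_byte])
        return x.ljust(n, fill), y.ljust(n, fill)
    n = min(len(x), len(y))
    return x[:n], y[:n]


def _bytewise(op, pad_byte, padding, x, y):
    # Apply op to each pair of bytes of the semantics-modified arguments.
    a, b = semantics_modified(padding, pad_byte, x, y)
    return bytes(op(p, q) for p, q in zip(a, b))


def bitwise_logical_and(padding, x, y):
    # AND pads with 0xFF so padded positions keep the longer argument's byte.
    return _bytewise(and_, 0xFF, padding, x, y)


def bitwise_logical_or(padding, x, y):
    # OR pads with 0x00 for the same reason.
    return _bytewise(or_, 0x00, padding, x, y)


def bitwise_logical_xor(padding, x, y):
    # XOR also pads with 0x00.
    return _bytewise(xor, 0x00, padding, x, y)


x = bytes([0x00, 0xF0, 0xFF])
y = bytes([0xFF, 0xF0])
print(bitwise_logical_and(True, x, y).hex())   # padding semantics
print(bitwise_logical_and(False, x, y).hex())  # truncation semantics
```

Each pad byte is the identity of its operation, which is why, under padding semantics, the bytes of the longer argument beyond the shorter one's length pass through unchanged.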