[repository schema] String Termination and Padding Characters #216

Open
mkudukin opened this issue Jul 29, 2024 · 0 comments

This proposal adds the ability to define a null-termination mode and padding characters for alphanumeric datatypes, as discussed in #197. The only change from that discussion is that padding is specified as a code point rather than as a character, which allows the use of control characters that XML cannot represent directly.

Issue

Most binary encodings use fixed-length fields for alphanumeric data, i.e. string values. If the actual value does not fill the field, it is either padded with a particular character at the beginning or the end (padded string) or terminated with a NUL character (null-terminated string). A null-terminated string can in fact be regarded as a special case of padding with NUL control characters.

Protocols differ on whether the null terminator counts toward the string length. In some cases (e.g. SBE, NYSE Pillar), a value that fills the entire field may omit the null terminator. In other cases (e.g. HKEX OCG), the null terminator must always be present in the field's value, leaving one byte less for the actual string.

Fixed-length string datatypes therefore need three settings: a padding side, a padding character, and a flag indicating whether the null terminator is mandatory.

Proposal

We recommend adding the following optional attributes to the mappedDatatype element:

  • The paddingSide attribute, which can be set to left or right.
  • The paddingCodePoint attribute, which can be set to the code point of a character (for example, 32 for a space or 0 for NUL, as in the examples below).
  • The nullTerminated attribute, which can be set to true or false. The following rules apply when the attribute is set to true:
    • The field's value must be terminated by a NUL character, even if the value is empty.
    • The maximum length of a string that the field can accommodate is one character less than the field's declared length.
    • If padding attributes are also set, the value on the wire is terminated by a NUL character, and the remaining space is then filled with padding characters.
    • If paddingSide="left", the null terminator must be present on the left side of the string.
    • If no padding attributes are specified, the string is padded with NUL characters on the right, after the terminator.
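
As a rough illustration of how the rules above combine, the sketch below declares a datatype that is both null-terminated and space-padded on the right. The datatype name zcharArraySpacePadded, the 8-byte field length and the sample values are assumptions made up for this illustration; they are not part of the proposal. Only right-side padding is shown.

```xml
<!-- Hypothetical datatype: name, field length and sample values are illustrative only. -->
<datatype name="zcharArraySpacePadded" kind="array">
	<mappedDatatype standard="ISO11404" base="array" element="character" nullTerminated="true" paddingSide="right" paddingCodePoint="32"/>
	<annotation>
		<documentation>Fixed length null-terminated string; space after the terminator is filled with spaces.</documentation>
	</annotation>
</datatype>
<!-- Expected wire bytes (hex) under the rules above, assuming an 8-byte field:
     "ABC"     => 41 42 43 00 20 20 20 20   (value, NUL terminator, then space padding)
     ""        => 00 20 20 20 20 20 20 20   (terminator is present even for an empty value)
     "ABCDEFG" => 41 42 43 44 45 46 47 00   (longest value that fits: declared length minus one)
-->
```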

Example

<datatype name="charArray" kind="array">
	<mappedDatatype standard="ISO11404" base="array" element="character" paddingSide="right" paddingCodePoint="32"/>
	<annotation>
		<documentation>Fixed length string padded on the right with spaces.</documentation>
	</annotation>
</datatype>
<datatype name="zcharArray" kind="array">
	<mappedDatatype standard="ISO11404" base="array" element="character" paddingSide="right" paddingCodePoint="0"/>
	<annotation>
		<documentation>Fixed length string padded on the right with NUL (ASCII 0) characters.</documentation>
	</annotation>
</datatype>
<datatype name="zcharArrayStrict" kind="array">
	<mappedDatatype standard="ISO11404" base="array" element="character" nullTerminated="true"/>
	<annotation>
		<documentation>Fixed length null-terminated string. NUL (ASCII 0) character is mandatory.</documentation>
	</annotation>
</datatype>
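
To make the difference between zcharArray and zcharArrayStrict concrete, the sketch below repeats their mappedDatatype declarations with comments showing the bytes one would expect on the wire. The 4-byte field length and the sample values are assumptions chosen for illustration; the byte layouts follow from the rules in the proposal and have not been checked against any particular protocol.

```xml
<!-- Assumed for illustration: a 4-byte field and the sample values shown below. -->

<!-- zcharArray: NUL acts only as padding, so a value may fill the whole field.
     "AB"   => 41 42 00 00
     "ABCD" => 41 42 43 44   (no terminator on the wire) -->
<datatype name="zcharArray" kind="array">
	<mappedDatatype standard="ISO11404" base="array" element="character" paddingSide="right" paddingCodePoint="0"/>
</datatype>

<!-- zcharArrayStrict: the NUL terminator is mandatory, so at most 3 characters fit.
     "AB"   => 41 42 00 00   (terminator, then NUL padding on the right)
     "ABC"  => 41 42 43 00
     "ABCD" does not fit, because no byte is left for the terminator -->
<datatype name="zcharArrayStrict" kind="array">
	<mappedDatatype standard="ISO11404" base="array" element="character" nullTerminated="true"/>
</datatype>
```

For values shorter than the field, the two datatypes produce identical bytes; they differ only when the value would otherwise fill the field completely.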
@kleihan kleihan added ENCODING Support for binary and other encoding protocols and removed enhancement labels Aug 20, 2024
@kleihan kleihan moved this to Backlog in Orchestra v1.1 RC2 Sep 12, 2024