Encodings: String
UTF8_STRING_NO_LENGTH
The encoding consists of the UTF-8 encoding of the input string, with no length prefix.
Options
Option | Type | Description |
---|---|---|
size | uint | The string UTF-8 byte-length |
Conditions
Condition | Description |
---|---|
len(value) == size | The input string must have the declared UTF-8 byte-length |
Examples
Given the input value “foo bar” with a corresponding size of 7, the encoding results in:
+------+------+------+------+------+------+------+
| 0x66 | 0x6f | 0x6f | 0x20 | 0x62 | 0x61 | 0x72 |
+------+------+------+------+------+------+------+
  f      o      o             b      a      r
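The rule above can be sketched in a few lines of Python. The function name `encode_utf8_string_no_length` is hypothetical and used only for illustration. Because no length is written, a decoder has to know `size` from elsewhere (for example, from the schema).

```python
def encode_utf8_string_no_length(value: str, size: int) -> bytes:
    """Encode a string as raw UTF-8 bytes, with no length prefix."""
    data = value.encode("utf-8")
    # Condition from the table above: the UTF-8 byte-length must equal size.
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


encoded = encode_utf8_string_no_length("foo bar", 7)
print(" ".join(f"0x{byte:02x}" for byte in encoded))
# 0x66 0x6f 0x6f 0x20 0x62 0x61 0x72
```

Note that `size` counts bytes, not characters: a string such as "é" has one character but a UTF-8 byte-length of 2.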
FLOOR_VARINT_PREFIX_UTF8_STRING_SHARED
The encoding consists of the byte-length of the string minus `minimum` plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer followed by the UTF-8 encoding of the input value.
Optionally, if the input string has already been encoded to the buffer using UTF-8, the encoding may consist of the byte constant `0x00` followed by the byte-length of the string minus `minimum` plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer, followed by the current offset minus the offset to the start of the UTF-8 string value in the buffer encoded as a Base-128 64-bit Little Endian variable-length unsigned integer.
Options
Option | Type | Description |
---|---|---|
minimum | uint | The inclusive minimum string UTF-8 byte-length |
Conditions
Condition | Description |
---|---|
len(value) >= minimum | The input string byte-length is equal to or greater than the minimum |
Examples
Given the input string `foo` with a minimum of 3 where the string has not been previously encoded, the encoding results in:
+------+------+------+------+
| 0x01 | 0x66 | 0x6f | 0x6f |
+------+------+------+------+
         f      o      o
Given the encoding of `foo` with a minimum of 0 followed by the encoding of `foo` with a minimum of 3, the encoding may result in:
  0      1      2      3      4      5      6
  ^      ^      ^      ^      ^      ^      ^
+------+------+------+------+------+------+------+
| 0x04 | 0x66 | 0x6f | 0x6f | 0x00 | 0x01 | 0x05 |
+------+------+------+------+------+------+------+
         f      o      o                    6 - 1
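A minimal sketch of this encoding, under a few assumptions: the varint is laid out like unsigned LEB128 (7 payload bits per byte, least significant group first, high bit set while more bytes follow), the "current offset" is the position where the pointer varint is written (matching the `6 - 1` in the diagram), and previously encoded strings are tracked in a plain dictionary. The names `encode_varint`, `encode_floor_shared`, and `seen` are hypothetical.

```python
def encode_varint(value: int) -> bytes:
    """Unsigned Base-128 little-endian varint (7 payload bits per byte)."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_floor_shared(buffer: bytearray, seen: dict, value: str, minimum: int) -> None:
    """Append value to buffer, reusing an earlier copy of the bytes if one exists."""
    data = value.encode("utf-8")
    if len(data) < minimum:
        raise ValueError("byte-length is below the minimum")
    prefix = encode_varint(len(data) - minimum + 1)  # always >= 1, never 0x00
    if data in seen:
        buffer.append(0x00)  # marker: a shared string follows
        buffer += prefix
        # Current offset minus the offset where the earlier string bytes start.
        buffer += encode_varint(len(buffer) - seen[data])
    else:
        buffer += prefix
        seen[data] = len(buffer)  # remember where the UTF-8 bytes begin
        buffer += data


buffer, seen = bytearray(), {}
encode_floor_shared(buffer, seen, "foo", 0)
encode_floor_shared(buffer, seen, "foo", 3)
print(buffer.hex())  # 04666f6f000105
```

Because the length prefix is always at least 1, a leading `0x00` byte can only mean "shared", which is what lets a decoder tell the two forms apart.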
ROOF_VARINT_PREFIX_UTF8_STRING_SHARED
The encoding consists of `maximum` minus the byte-length of the string plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer followed by the UTF-8 encoding of the input value.
Optionally, if the input string has already been encoded to the buffer using UTF-8, the encoding may consist of the byte constant `0x00` followed by `maximum` minus the byte-length of the string plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer, followed by the current offset minus the offset to the start of the UTF-8 string value in the buffer encoded as a Base-128 64-bit Little Endian variable-length unsigned integer.
Options
Option | Type | Description |
---|---|---|
maximum | uint | The inclusive maximum string UTF-8 byte-length |
Conditions
Condition | Description |
---|---|
len(value) <= maximum | The input string byte-length is equal to or less than the maximum |
Examples
Given the input string `foo` with a maximum of 4 where the string has not been previously encoded, the encoding results in:
+------+------+------+------+
| 0x02 | 0x66 | 0x6f | 0x6f |
+------+------+------+------+
         f      o      o
Given the encoding of `foo` with a maximum of 3 followed by the encoding of `foo` with a maximum of 5, the encoding may result in:
  0      1      2      3      4      5      6
  ^      ^      ^      ^      ^      ^      ^
+------+------+------+------+------+------+------+
| 0x01 | 0x66 | 0x6f | 0x6f | 0x00 | 0x03 | 0x05 |
+------+------+------+------+------+------+------+
         f      o      o                    6 - 1
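To illustrate the reading side, the following is a rough decoder sketch for this encoding. It assumes the varint layout matches unsigned LEB128 and that the shared offset is measured from the position of the pointer varint, as the `6 - 1` label in the diagram suggests. The names `decode_varint` and `decode_roof_shared` are hypothetical.

```python
from typing import Tuple


def decode_varint(buffer: bytes, position: int) -> Tuple[int, int]:
    """Read an unsigned Base-128 little-endian varint; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buffer[position]
        position += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, position
        shift += 7


def decode_roof_shared(buffer: bytes, position: int, maximum: int) -> Tuple[str, int]:
    """Decode one string starting at position; return (string, next position)."""
    prefix, after_prefix = decode_varint(buffer, position)
    if prefix != 0:
        # Non-shared form: prefix is maximum - length + 1, then the UTF-8 bytes.
        length = maximum - prefix + 1
        end = after_prefix + length
        return buffer[after_prefix:end].decode("utf-8"), end
    # Shared form: 0x00, then the length prefix, then a backwards offset.
    prefix, pointer_position = decode_varint(buffer, after_prefix)
    length = maximum - prefix + 1
    distance, next_position = decode_varint(buffer, pointer_position)
    start = pointer_position - distance
    return buffer[start:start + length].decode("utf-8"), next_position


data = bytes.fromhex("01666f6f000305")
first, position = decode_roof_shared(data, 0, maximum=3)
second, position = decode_roof_shared(data, position, maximum=5)
print(first, second, position)  # foo foo 7
```

The decoder needs the same `maximum` that the encoder used for each value, since the length is only recoverable as `maximum - prefix + 1`.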
BOUNDED_8BIT_PREFIX_UTF8_STRING_SHARED
The encoding consists of the byte-length of the string minus `minimum` plus 1 as an 8-bit fixed-length unsigned integer followed by the UTF-8 encoding of the input value.
Optionally, if the input string has already been encoded to the buffer using UTF-8, the encoding may consist of the byte constant `0x00` followed by the byte-length of the string minus `minimum` plus 1 as an 8-bit fixed-length unsigned integer, followed by the current offset minus the offset to the start of the UTF-8 string value in the buffer encoded as a Base-128 64-bit Little Endian variable-length unsigned integer.
The byte-length of the string is encoded even if `maximum` equals `minimum` in order to disambiguate between shared and non-shared fixed strings.
Options
Option | Type | Description |
---|---|---|
minimum | uint | The inclusive minimum string UTF-8 byte-length |
maximum | uint | The inclusive maximum string UTF-8 byte-length |
Conditions
Condition | Description |
---|---|
len(value) >= minimum | The input string byte-length is equal to or greater than the minimum |
len(value) <= maximum | The input string byte-length is equal to or less than the maximum |
maximum - minimum < 2 ** 8 - 1 | The range plus 1 must be representable in 8 bits, as the prefix stores the byte-length minus the minimum plus 1 |
Examples
Given the input string `foo` with a minimum of 3 and a maximum of 5 where the string has not been previously encoded, the encoding results in:
+------+------+------+------+
| 0x01 | 0x66 | 0x6f | 0x6f |
+------+------+------+------+
         f      o      o
Given the encoding of `foo` with a minimum of 0 and a maximum of 6 followed by the encoding of `foo` with a minimum of 3 and a maximum of 100, the encoding may result in:
  0      1      2      3      4      5      6
  ^      ^      ^      ^      ^      ^      ^
+------+------+------+------+------+------+------+
| 0x04 | 0x66 | 0x6f | 0x6f | 0x00 | 0x01 | 0x05 |
+------+------+------+------+------+------+------+
         f      o      o                    6 - 1
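The conditions and the 8-bit prefix can be sketched as follows. This is an illustration of the non-shared form only; the shared form (the `0x00` marker followed by the prefix and a backwards offset) is deliberately left out. The function name `encode_bounded_8bit_prefix` is hypothetical.

```python
def encode_bounded_8bit_prefix(value: str, minimum: int, maximum: int) -> bytes:
    """Encode the non-shared form: one prefix byte followed by the UTF-8 bytes."""
    data = value.encode("utf-8")
    # Condition: maximum - minimum < 2 ** 8 - 1, so the prefix fits in one byte.
    if maximum - minimum >= 2 ** 8 - 1:
        raise ValueError("maximum - minimum must be less than 255")
    if not minimum <= len(data) <= maximum:
        raise ValueError("byte-length is outside [minimum, maximum]")
    # The prefix lies in 1..255, so it can never collide with the 0x00 marker.
    prefix = len(data) - minimum + 1
    return bytes([prefix]) + data


print(encode_bounded_8bit_prefix("foo", 3, 5).hex())  # 01666f6f
print(encode_bounded_8bit_prefix("foo", 3, 3).hex())  # 01666f6f
```

The second call shows the fixed-length case: even when `maximum` equals `minimum`, the prefix byte `0x01` is still written, which keeps the first byte distinguishable from the `0x00` shared marker.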
RFC3339_DATE_INTEGER_TRIPLET
The encoding consists of an implementation of RFC3339 date expressions as the sequence of 3 integers: the year as a 16-bit fixed-length Little Endian unsigned integer, the month as an 8-bit fixed-length unsigned integer, and the day as an 8-bit fixed-length unsigned integer.
Options
None
Conditions
Condition | Description |
---|---|
len(value) == 10 | The input string consists of 10 characters |
value[0:4] >= 0 | The year is greater than or equal to 0 |
value[0:4] <= 9999 | The year is less than or equal to 9999 as stated by RFC3339 |
value[4] == '-' | The year and the month are divided by a hyphen |
value[5:7] >= 1 | The month is greater than or equal to 1 |
value[5:7] <= 12 | The month is less than or equal to 12 |
value[7] == '-' | The month and the day are divided by a hyphen |
value[8:10] >= 1 | The day is greater than or equal to 1 |
value[8:10] <= 31 | The day is less than or equal to 31 |
Examples
Given the input string `2014-10-01`, the encoding results in:
+------+------+------+------+
| 0xde | 0x07 | 0x0a | 0x01 |
+------+------+------+------+
  year   ...    month  day
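As a sketch of how the conditions and the byte layout fit together, the date can be validated and packed with Python's standard `struct` module. The function name `encode_rfc3339_date` is hypothetical. Like the conditions above, it only range-checks the fields and does not reject impossible dates such as `2014-02-31`.

```python
import struct


def encode_rfc3339_date(value: str) -> bytes:
    """Pack YYYY-MM-DD as a 16-bit little-endian year, 8-bit month, 8-bit day."""
    if len(value) != 10 or value[4] != "-" or value[7] != "-":
        raise ValueError("expected a date of the form YYYY-MM-DD")
    fields = (value[0:4], value[5:7], value[8:10])
    if not all(field.isascii() and field.isdigit() for field in fields):
        raise ValueError("year, month and day must be decimal digits")
    year, month, day = (int(field) for field in fields)
    # Four decimal digits already bound the year to 0..9999.
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if not 1 <= day <= 31:
        raise ValueError("day must be between 1 and 31")
    # "<HBB": little-endian unsigned 16-bit, then two unsigned 8-bit integers.
    return struct.pack("<HBB", year, month, day)


print(encode_rfc3339_date("2014-10-01").hex())  # de070a01
```

The year 2014 is `0x07de`, so its little-endian form puts `0xde` first and `0x07` second, matching the diagram.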