JSON BinPack


Encodings: String


UTF8_STRING_NO_LENGTH

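The encoding consists of the UTF-8 encoding of the input string. No length prefix is stored, so the decoder relies on the size option to know how many bytes to read.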

Options

Option  Type  Description
size    uint  The string UTF-8 byte-length

Conditions

Condition           Description
len(value) == size  The input string must have the declared UTF-8 byte-length

Examples

Given the input value “foo bar” with a corresponding size of 7, the encoding results in:

+------+------+------+------+------+------+------+
| 0x66 | 0x6f | 0x6f | 0x20 | 0x62 | 0x61 | 0x72 |
+------+------+------+------+------+------+------+
  f      o      o             b      a      r
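The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names encode_utf8_string_no_length and decode_utf8_string_no_length are hypothetical and are not part of JSON BinPack.

```python
def encode_utf8_string_no_length(value: str, size: int) -> bytes:
    """Hypothetical helper: emit the raw UTF-8 bytes after checking the declared size."""
    data = value.encode("utf-8")
    # Condition: len(value) == size, measured in UTF-8 bytes
    if len(data) != size:
        raise ValueError(f"expected {size} bytes, got {len(data)}")
    return data


def decode_utf8_string_no_length(buffer: bytes, offset: int, size: int) -> str:
    """Hypothetical helper: no length is stored, so size must come from the options."""
    return buffer[offset:offset + size].decode("utf-8")


encoded = encode_utf8_string_no_length("foo bar", 7)
print(encoded.hex(" "))  # 66 6f 6f 20 62 61 72
```

Because the size is measured in bytes rather than characters, a string containing non-ASCII characters needs a size larger than its character count.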

FLOOR_VARINT_PREFIX_UTF8_STRING_SHARED

The encoding consists of the byte-length of the string minus minimum plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer followed by the UTF-8 encoding of the input value.

Optionally, if the input string has already been encoded to the buffer using UTF-8, the encoding may consist of the byte constant 0x00 followed by the byte-length of the string minus minimum plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer, followed by the current offset minus the offset to the start of the UTF-8 string value in the buffer encoded as a Base-128 64-bit Little Endian variable-length unsigned integer.

Options

Option   Type  Description
minimum  uint  The inclusive minimum string UTF-8 byte-length

Conditions

Condition              Description
len(value) >= minimum  The input string byte-length is equal to or greater than the minimum

Examples

Given the input string foo with a minimum of 3 where the string has not been previously encoded, the encoding results in:

+------+------+------+------+
| 0x01 | 0x66 | 0x6f | 0x6f |
+------+------+------+------+
         f      o      o

Given the encoding of foo with a minimum of 0 followed by the encoding of foo with a minimum of 3, the encoding may result in:

0      1      2      3      4      5      6
^      ^      ^      ^      ^      ^      ^
+------+------+------+------+------+------+------+
| 0x04 | 0x66 | 0x6f | 0x6f | 0x00 | 0x01 | 0x05 |
+------+------+------+------+------+------+------+
         f      o      o                    6 - 1
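The two examples above can be reproduced with the following Python sketch. It rests on assumptions: the names varint and encode_floor_shared and the seen dictionary are hypothetical, the variable-length integer is assumed to use the common scheme of 7 payload bits per byte with the high bit as a continuation flag, and the "current offset" is taken to be the position where the pointer is written, which is what the 6 - 1 annotation suggests. Sharing is optional in this encoding; the sketch simply shares whenever it can.

```python
from typing import Dict


def varint(value: int) -> bytes:
    """Base-128 little-endian unsigned varint (assumed: 7 bits per byte, MSB = more)."""
    if value < 0:
        raise ValueError("varint is unsigned")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_floor_shared(buffer: bytearray, seen: Dict[bytes, int],
                        value: str, minimum: int) -> None:
    """Hypothetical encoder: append value to buffer, sharing earlier occurrences."""
    data = value.encode("utf-8")
    if len(data) < minimum:
        raise ValueError("string is shorter than minimum")
    prefix = varint(len(data) - minimum + 1)  # always >= 1, so 0x00 is free as a marker
    if data in seen:
        buffer.append(0x00)                         # shared-string marker
        buffer += prefix
        buffer += varint(len(buffer) - seen[data])  # current offset - string start
    else:
        buffer += prefix
        seen[data] = len(buffer)                    # remember where the bytes start
        buffer += data


buffer, seen = bytearray(), {}
encode_floor_shared(buffer, seen, "foo", 0)
encode_floor_shared(buffer, seen, "foo", 3)
print(buffer.hex(" "))  # 04 66 6f 6f 00 01 05
```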

ROOF_VARINT_PREFIX_UTF8_STRING_SHARED

The encoding consists of maximum minus the byte-length of the string plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer followed by the UTF-8 encoding of the input value.

Optionally, if the input string has already been encoded to the buffer using UTF-8, the encoding may consist of the byte constant 0x00 followed by maximum minus the byte-length of the string plus 1 as a Base-128 64-bit Little Endian variable-length unsigned integer, followed by the current offset minus the offset to the start of the UTF-8 string value in the buffer encoded as a Base-128 64-bit Little Endian variable-length unsigned integer.

Options

Option   Type  Description
maximum  uint  The inclusive maximum string UTF-8 byte-length

Conditions

Condition              Description
len(value) <= maximum  The input string byte-length is equal to or less than the maximum

Examples

Given the input string foo with a maximum of 4 where the string has not been previously encoded, the encoding results in:

+------+------+------+------+
| 0x02 | 0x66 | 0x6f | 0x6f |
+------+------+------+------+
         f      o      o

Given the encoding of foo with a maximum of 3 followed by the encoding of foo with a maximum of 5, the encoding may result in:

0      1      2      3      4      5      6
^      ^      ^      ^      ^      ^      ^
+------+------+------+------+------+------+------+
| 0x01 | 0x66 | 0x6f | 0x6f | 0x00 | 0x03 | 0x05 |
+------+------+------+------+------+------+------+
         f      o      o                    6 - 1
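To show the reading side, here is a hedged Python sketch of a decoder for this encoding. The names read_varint and decode_roof_shared are hypothetical. The sketch assumes the usual 7-bits-per-byte varint layout and that the pointer is measured from the offset at which the pointer itself starts, matching the 6 - 1 annotation above. A leading 0x00 can only be the shared marker, because a valid prefix is always at least 1.

```python
from typing import Tuple


def read_varint(buffer: bytes, offset: int) -> Tuple[int, int]:
    """Return (value, next_offset) for a Base-128 little-endian unsigned varint."""
    result = 0
    shift = 0
    while True:
        byte = buffer[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte
            return result, offset
        shift += 7


def decode_roof_shared(buffer: bytes, offset: int, maximum: int) -> Tuple[str, int]:
    """Hypothetical decoder: return (string, next_offset)."""
    prefix, cursor = read_varint(buffer, offset)
    shared = prefix == 0
    if shared:
        # 0x00 marker seen: the real length prefix follows
        prefix, cursor = read_varint(buffer, cursor)
    length = maximum - prefix + 1  # invert "maximum - length + 1"
    if shared:
        pointer_offset = cursor
        distance, cursor = read_varint(buffer, cursor)
        start = pointer_offset - distance  # jump back to the earlier bytes
        return buffer[start:start + length].decode("utf-8"), cursor
    return buffer[cursor:cursor + length].decode("utf-8"), cursor + length


data = bytes([0x01, 0x66, 0x6F, 0x6F, 0x00, 0x03, 0x05])
first, position = decode_roof_shared(data, 0, maximum=3)
second, position = decode_roof_shared(data, position, maximum=5)
print(first, second, position)  # foo foo 7
```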

BOUNDED_8BIT_PREFIX_UTF8_STRING_SHARED

The encoding consists of the byte-length of the string minus minimum plus 1 as an 8-bit fixed-length unsigned integer followed by the UTF-8 encoding of the input value.

Optionally, if the input string has already been encoded to the buffer using UTF-8, the encoding may consist of the byte constant 0x00 followed by the byte-length of the string minus minimum plus 1 as an 8-bit fixed-length unsigned integer, followed by the current offset minus the offset to the start of the UTF-8 string value in the buffer encoded as a Base-128 64-bit Little Endian variable-length unsigned integer.

The byte-length of the string is encoded even if maximum equals minimum in order to disambiguate between shared and non-shared fixed strings.

Options

Option   Type  Description
minimum  uint  The inclusive minimum string UTF-8 byte-length
maximum  uint  The inclusive maximum string UTF-8 byte-length

Conditions

Condition                       Description
len(value) >= minimum           The input string byte-length is equal to or greater than the minimum
len(value) <= maximum           The input string byte-length is equal to or less than the maximum
maximum - minimum < 2 ** 8 - 1  The prefix (byte-length minus minimum plus 1) must be representable in 8 bits

Examples

Given the input string foo with a minimum of 3 and a maximum of 5 where the string has not been previously encoded, the encoding results in:

+------+------+------+------+
| 0x01 | 0x66 | 0x6f | 0x6f |
+------+------+------+------+
         f      o      o

Given the encoding of foo with a minimum of 0 and a maximum of 6 followed by the encoding of foo with a minimum of 3 and a maximum of 100, the encoding may result in:

0      1      2      3      4      5      6
^      ^      ^      ^      ^      ^      ^
+------+------+------+------+------+------+------+
| 0x04 | 0x66 | 0x6f | 0x6f | 0x00 | 0x01 | 0x05 |
+------+------+------+------+------+------+------+
         f      o      o                    6 - 1
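The following Python sketch reproduces the shared example above. It is an illustration only: encode_bounded_8bit_shared, varint, and the seen dictionary are hypothetical names, and the pointer is assumed to be a 7-bits-per-byte varint measured from the offset at which it is written. The length prefix is a single byte, while the back-pointer remains variable-length.

```python
from typing import Dict


def varint(value: int) -> bytes:
    """Base-128 little-endian unsigned varint (assumed: 7 bits per byte, MSB = more)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_bounded_8bit_shared(buffer: bytearray, seen: Dict[bytes, int],
                               value: str, minimum: int, maximum: int) -> None:
    """Hypothetical encoder for a string whose byte-length lies in [minimum, maximum]."""
    data = value.encode("utf-8")
    if not minimum <= len(data) <= maximum:
        raise ValueError("byte-length outside [minimum, maximum]")
    if maximum - minimum >= 2 ** 8 - 1:
        raise ValueError("range too large for an 8-bit prefix")
    prefix = len(data) - minimum + 1  # in 1..255, so it fits in one byte
    if data in seen:
        buffer.append(0x00)           # shared-string marker
        buffer.append(prefix)
        buffer += varint(len(buffer) - seen[data])
    else:
        buffer.append(prefix)
        seen[data] = len(buffer)      # offset where the UTF-8 bytes start
        buffer += data


buffer, seen = bytearray(), {}
encode_bounded_8bit_shared(buffer, seen, "foo", 0, 6)
encode_bounded_8bit_shared(buffer, seen, "foo", 3, 100)
print(buffer.hex(" "))  # 04 66 6f 6f 00 01 05
```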

RFC3339_DATE_INTEGER_TRIPLET

The encoding represents an RFC3339 date expression as a sequence of three integers: the year as a 16-bit fixed-length Little Endian unsigned integer, the month as an 8-bit fixed-length unsigned integer, and the day as an 8-bit fixed-length unsigned integer.

Options

None

Conditions

Condition           Description
len(value) == 10    The input string consists of 10 characters
value[0:4] >= 0     The year is greater than or equal to 0
value[0:4] <= 9999  The year is less than or equal to 9999 as stated by RFC3339
value[4] == '-'     The year and the month are divided by a hyphen
value[5:7] >= 1     The month is greater than or equal to 1
value[5:7] <= 12    The month is less than or equal to 12
value[7] == '-'     The month and the day are divided by a hyphen
value[8:10] >= 1    The day is greater than or equal to 1
value[8:10] <= 31   The day is less than or equal to 31

Examples

Given the input string 2014-10-01, the encoding results in:

+------+------+------+------+
| 0xde | 0x07 | 0x0a | 0x01 |
+------+------+------+------+
  year   ...    month  day
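As a rough illustration, the triplet can be packed with Python's standard struct module. The function names encode_rfc3339_date and decode_rfc3339_date are hypothetical. The sketch checks only the conditions listed above, so it does not reject impossible dates such as 2014-02-31.

```python
import struct


def encode_rfc3339_date(value: str) -> bytes:
    """Hypothetical helper: pack YYYY-MM-DD into 4 bytes (uint16 LE, uint8, uint8)."""
    if len(value) != 10 or value[4] != "-" or value[7] != "-":
        raise ValueError("expected a 10-character YYYY-MM-DD string")
    digits = value[0:4] + value[5:7] + value[8:10]
    if not digits.isdigit():
        raise ValueError("year, month, and day must be numeric")
    year, month, day = int(value[0:4]), int(value[5:7]), int(value[8:10])
    if not (0 <= year <= 9999 and 1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("date component out of range")
    # "<HBB": little-endian 16-bit unsigned, then two 8-bit unsigned integers
    return struct.pack("<HBB", year, month, day)


def decode_rfc3339_date(buffer: bytes, offset: int = 0) -> str:
    """Hypothetical helper: rebuild the zero-padded date string."""
    year, month, day = struct.unpack_from("<HBB", buffer, offset)
    return f"{year:04d}-{month:02d}-{day:02d}"


encoded = encode_rfc3339_date("2014-10-01")
print(encoded.hex(" "))              # de 07 0a 01
print(decode_rfc3339_date(encoded))  # 2014-10-01
```

The year 2014 is 0x07de, which is why the little-endian layout places 0xde before 0x07.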