Fixed Encoding

Each MapD datatype takes up a certain amount of space in memory and on disk. The default sizes of datatypes are listed in the following table.

Datatype            Size (bytes)
TEXT ENCODING DICT  4
TEXT ENCODING NONE  Variable (size of the string + 6 bytes)
TIMESTAMP           8
TIME                8
DATE                8
FLOAT               4
DOUBLE              8
INTEGER             4
SMALLINT            2
BIGINT              8
BOOLEAN             1
DECIMAL/NUMERIC     8

For certain datatypes, you can use a more compact representation of these values. The options for these datatypes are listed in the following table.

Encoding                      Size (bytes)  Notes
TIMESTAMP ENCODING FIXED(32)  4             Range: 1901-12-13 20:45:53 - 2038-01-19 03:14:07
TIME ENCODING FIXED(32)       4             Range: 00:00:00 - 23:59:59
DATE ENCODING FIXED(32)       4             Range: 1901-12-13 - 2038-01-19
TEXT ENCODING DICT(16)        2             Max cardinality 64K
TEXT ENCODING DICT(8)         1             Max cardinality 255
INTEGER ENCODING FIXED(16)    2             Same as SMALLINT
INTEGER ENCODING FIXED(8)     1             Max range -127 to 127
SMALLINT ENCODING FIXED(8)    1             Max range -127 to 127
BIGINT ENCODING FIXED(32)     4             Same as INTEGER
BIGINT ENCODING FIXED(16)     2             Same as SMALLINT
BIGINT ENCODING FIXED(8)      1             Max range -127 to 127

To use these fixed-length fields effectively, the range or cardinality of the data must fit within the constraints listed in the table.
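As a rough illustration of the range constraint, the following Python sketch picks the narrowest fixed width that holds a column's minimum and maximum values. The helper name `smallest_fixed_width` is hypothetical and not part of MapD. The 8-bit range comes from the table; the 16-bit and 32-bit ranges are an assumption that extends the same pattern, which is conservative if the true ranges turn out to be one value wider.

```python
# Inclusive value ranges per fixed width, in bits. The 8-bit range is taken
# from the table above; the 16- and 32-bit ranges are an assumption that
# extends the same pattern (lowest two's-complement value excluded).
FIXED_RANGES = {
    8: (-127, 127),
    16: (-32767, 32767),
    32: (-2147483647, 2147483647),
}


def smallest_fixed_width(min_value, max_value):
    """Return the narrowest width in bits (8, 16, or 32) that holds
    [min_value, max_value], or None if no listed fixed width fits."""
    for bits in sorted(FIXED_RANGES):
        low, high = FIXED_RANGES[bits]
        if low <= min_value and max_value <= high:
            return bits
    return None


print(smallest_fixed_width(0, 100))    # fits FIXED(8)
print(smallest_fixed_width(0, 40000))  # needs FIXED(32)
```

Under these assumptions, a BIGINT column whose values run from 0 to 40000 would be a candidate for BIGINT ENCODING FIXED(32), while a result of None means the column should keep its default size.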

These encodings are most useful on low-cardinality TEXT fields, where the savings can be large, and on TIMESTAMP fields whose values all fall between 1901-12-13 20:45:53 and 2038-01-19 03:14:07.
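To get a feel for the scale of the savings, the sketch below checks that sample timestamps fall inside the 32-bit range and then totals the bytes saved for one TIMESTAMP column and one dictionary-encoded TEXT column. The function names, the row count, and the sample data are hypothetical, and treating the range bounds as UTC is an assumption; the byte sizes come from the tables in this section.

```python
from datetime import datetime, timezone

# Bounds quoted in the text above, interpreted as UTC (an assumption).
TS_MIN = datetime(1901, 12, 13, 20, 45, 53, tzinfo=timezone.utc)
TS_MAX = datetime(2038, 1, 19, 3, 14, 7, tzinfo=timezone.utc)


def fits_fixed32(timestamps):
    """True if every timezone-aware timestamp lies inside the FIXED(32) range."""
    return all(TS_MIN <= ts <= TS_MAX for ts in timestamps)


def bytes_saved(rows, default_bytes, encoded_bytes):
    """Bytes saved for one column when each value shrinks as given."""
    return rows * (default_bytes - encoded_bytes)


rows = 100_000_000  # hypothetical row count
sample = [datetime(2017, 6, 1, tzinfo=timezone.utc)]  # hypothetical data

if fits_fixed32(sample):
    ts_saving = bytes_saved(rows, 8, 4)  # TIMESTAMP: 8 bytes down to 4
else:
    ts_saving = 0
text_saving = bytes_saved(rows, 4, 1)    # dictionary TEXT: 4 bytes down to 1
total_saving = ts_saving + text_saving
print(f"{total_saving / 1e6:.0f} MB saved")
```

The arithmetic is per column and per row, so the same calculation can be repeated for each candidate column in a schema.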

The table lists every option, but several of the integer encodings overlap. For example, INTEGER ENCODING FIXED(16) takes the same space as SMALLINT.

If a text-encoded field exceeds its defined cardinality, MapD substitutes a NULL value and records the change as a log entry.
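One way to avoid this substitution is to count distinct values before choosing a dictionary width. The sketch below is a minimal pre-load check, not MapD's own logic: the helper name `smallest_dict_width` is hypothetical, "64K" is read as 65,535 by analogy with the 255 limit, and NULLs are assumed not to count toward cardinality.

```python
DICT8_MAX = 255     # from the table above
DICT16_MAX = 65535  # assumption: "64K" read as 65,535, by analogy with 255


def smallest_dict_width(values):
    """Return 8, 16, or 32: the narrowest dictionary width in bits whose
    cardinality limit covers the distinct non-null values. 32 stands for
    the default 4-byte dictionary encoding."""
    distinct = len({v for v in values if v is not None})
    if distinct <= DICT8_MAX:
        return 8
    if distinct <= DICT16_MAX:
        return 16
    return 32


languages = ["en", "fr", "en", "de", None, "fr"]  # hypothetical column data
print(smallest_dict_width(languages))
```

Because a sample may understate the cardinality of the full dataset, it is safer to run such a check on all of the data, or to leave headroom when the count is close to a limit.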

Once your schema is well understood, you can achieve significant savings through careful application of these fixed encoding types.