Datatypes
This topic describes standard datatypes and space-saving variations for values stored in OmniSci.
Datatypes
Each OmniSci datatype uses space in memory and on disk. For certain datatypes, you can use fixed encoding for a more compact representation of these values. Datatypes, variations, and sizes are described in the following table.
Datatype | Size (bytes) | Notes |
---|---|---|
BIGINT |
8 | Minimum value: -9,223,372,036,854,775,807 ; maximum value: 9,223,372,036,854,775,807 . |
BIGINT ENCODING FIXED(8) |
1 | Minimum value: -127 ; maximum value: 127 |
BIGINT ENCODING FIXED(16) |
2 | Same as SMALLINT . |
BIGINT ENCODING FIXED(32) |
4 | Same as INTEGER . |
BOOLEAN |
1 | TRUE: 'true' ,
'1' , 't' . FALSE: 'false' , '0' , 'f' . Text values are not case-sensitive. |
DATE 1 |
4 | Range in years: +/-5,883,517 around epoch. Maximum date January 1,
5885487 (approximately). Minimum value: -2,147,483,648 ; maximum value: 2,147,483,647 .Supported formats when using COPY FROM : mm/dd/yyyy , dd-mmm-yy , yyyy-mm-dd , dd/mmm/yyyy .
|
DATE ENCODING FIXED(16) |
2 | Range in days: -32,768 - 32,767 Range in years: +/-90 around epoch, April 14, 1880 -
September 9, 2059. Minumum value: -2,831,155,200 ; maximum value: 2,831,068,800 . Supported formats when using COPY FROM : mm/dd/yyyy , dd-mmm-yy , yyyy-mm-dd , dd/mmm/yyyy . |
DATE ENCODING FIXED(32) |
4 | Default encoding; same as When you create a column using |
DECIMAL/NUMERIC |
2, 4, or 8 | Size depends on precision:
|
DOUBLE |
8 | Variable precision. Minimum value: -1.79 x e^308 ; maximum value: 1.79 x e^308 |
EPOCH
| 8 | Seconds ranging from -30610224000
(1/1/1000 00:00:00 ) through 185542587100800
(1/1/5885487 23:59:59 ). |
FLOAT |
4 | Variable precision. Minimum value: -3.4 x e^38 ; maximum value: 3.4 x e^38 . |
INTEGER |
4 | Minimum value: -2,147,483,647 ; maximum value: 2,147,483,647 . |
INTEGER ENCODING FIXED(8) |
1 | Minumum value: -127 ; maximum value: 127 . |
INTEGER ENCODING FIXED(16) |
2 | Same as SMALLINT . |
LINESTRING |
Variable2 | Geospatial datatype. A sequence of 2 or more points and the lines that connect them. For example: LINESTRING(0 0,1 1,1 2) |
MULTIPOLYGON |
Variable2 | Geospatial datatype. A set of one or more polygons. For example:MULTIPOLYGON(((0 0,4 0,4 4,0 4,0 0),(1 1,2 1,2 2,1 2,1 1)), ((-1 -1,-1 -2,-2 -2,-2 -1,-1 -1))) |
POINT |
Variable2 | Geospatial datatype. A point described by two coordinates. When the coordinates are longitude and latitude, OmniSci stores longitude first, and then latitude. For example: POINT(0 0) |
POLYGON |
Variable2 | Geospatial datatype. A set of one or more rings (closed line strings), with the first representing the shape (external ring) and the rest representing holes in that shape (internal rings). For example: POLYGON((0 0,4 0,4 4,0 4,0 0),(1 1, 2 1, 2 2, 1 2,1 1)) |
SMALLINT |
2 | Minimum value: -32,767 ; maximum value: 32,767 . |
SMALLINT ENCODING FIXED(8) |
1 | Minumum value: -127 ; maximum value: 127 . |
TEXT ENCODING DICT |
4 | Max cardinality 2 billion distinct string values. Maximum string length is 32,767. |
TEXT ENCODING DICT(8) |
1 | Max cardinality 255 distinct string values. |
TEXT ENCODING DICT(16) |
2 | Max cardinality 64 K distinct string values. |
TEXT ENCODING NONE |
Variable | Size of the string + 6 bytes. Maximum string length is 32,767. |
TIME |
8 | Minimum value: 00:00:00 ; maximum value: 23:59:59 . |
TIME ENCODING FIXED(32) |
4 | Minimum value: 00:00:00 ; maximum value: 23:59:59 . |
TIMESTAMP(0) | 8 | Linux timestamp from -30610224000 (1/1/1000 00:00:00 ) through 29379542399 (12/31/2900 23:59:59 ). Can also be inserted and stored in human-readable format: YYYY-MM-DD HH:MM:SS or YYYY-MM-DDTHH:MM:SS (the T is dropped when the field is populated). |
TIMESTAMP(3) (milliseconds) | 8 | Linux timestamp from -30610224000000 (1/1/1000 00:00:00.000 ) through 29379542399999 (12/31/2900 23:59:59.999 ). Can also be inserted and stored in human-readable format: YYYY-MM-DD HH:MM:SS.fff or YYYY-MM-DDTHH:MM:SS.fff (the T is dropped when the field is populated). |
TIMESTAMP(6) (microseconds) | 8 | Linux timestamp from -30610224000000000 (1/1/1000 00:00:00.000000 ) through 29379542399999999 (12/31/2900 23:59:59.999999 ). Can also be inserted and stored in human-readable format: YYYY-MM-DD HH:MM:SS.ffffff or YYYY-MM-DDTHH:MM:SS.ffffff (the T is dropped when the field is populated). |
TIMESTAMP(9) (nanoseconds) | 8 | Linux timestamp from -9223372036854775807 (09/21/1677 00:12:43.145224193 ) through 9223372036854775807 (11/04/2262 23:47:16.854775807 ). Can also be inserted and stored in human-readable format: YYYY-MM-DD HH:MM:SS.fffffffff or YYYY-MM-DDTHH:MM:SS.fffffffff (the T is dropped when the field is populated). |
TIMESTAMP ENCODING FIXED(32) |
4 | Range: 1901-12-13 20:45:53 - 2038-01-19 03:14:07 . Can also be inserted and stored in human-readable format:
YYYY-MM-DD HH:MM:SS or
YYYY-MM-DDTHH:MM:SS (the T is dropped when the field is populated).
|
TINYINT |
1 | Minimum value: -127 ; maximum value: 127 . |
1 In OmniSci release 4.4.0 and higher, you can use existing 8-byte
DATE
columns, but you can create only 4-byte DATE
columns (default) and 2-byte DATE
columns
(see DATE ENCODING FIXED(16)
).
2 See Storage and Compression below for information about geospatial datatype sizes.
Note |
|
Geospatial Datatypes
OmniSci supports the LINESTRING
, MULTIPOLYGON
, POINT
, and POLYGON
geospatial datatypes.
In the following example:
- p0, p1, ls0, and poly0 are simple (planar) geometries.
- p4 is point geometry with Web Mercator longitude/latitude coordinates.
- p2, p3, ls1, ls2, poly1, and mpoly0 are geometries using WGS84 SRID=4326 longitude/latitude coordinates.
CREATE TABLE geo ( name TEXT ENCODING DICT(32), p0 POINT, p1 GEOMETRY(POINT), p2 GEOMETRY(POINT, 4326), p3 GEOMETRY(POINT, 4326) ENCODING NONE, p4 GEOMETRY(POINT, 900913), ls0 LINESTRING, ls1 GEOMETRY(LINESTRING, 4326) ENCODING COMPRESSED(32), ls2 GEOMETRY(LINESTRING, 4326) ENCODING NONE, poly0 POLYGON, poly1 GEOMETRY(POLYGON, 4326) ENCODING COMPRESSED(32), mpoly0 GEOMETRY(MULTIPOLYGON, 4326) );
Storage
Geometry storage requirements are largely dependent on coordinate data. Coordinates are normally stored as 8-byte doubles, two coordinates per point, for all points that form a geometry. Each POINT geometry in the p1 column, for example, requires 16 bytes.
Compression
WGS84 (SRID 4326) coordinates are compressed to 32 bits by default. This sacrifices some precision but reduces storage requirements by half.
For example, columns p2, ls1, poly1, and mpoly0 in the table defined above are compressed. Each geometry in the p2 column requires 8 bytes, compared to 16 bytes for p0.
You can explicitly disable compression. WGS84 columns p3, ls2 are not compressed and continue using doubles. Simple (planar) columns p0, p1, ls0, poly1 and non-4326 column p4 are not compressed.
For more information about geospatial datatypes and functions, see Geospatial Capabilities.
Defining Arrays
Define datatype arrays by appending square brackets, as shown in the arrayexamples
DDL sample.
CREATE TABLE arrayexamples ( tiny_int_array TINYINT[], int_array INTEGER[], big_int_array BIGINT[], text_array TEXT[] ENCODING DICT(32), --OmniSci supports only DICT(32) TEXT arrays. float_array FLOAT[], double_array DOUBLE[], decimal_array DECIMAL(18,6)[], boolean_array BOOLEAN[], date_array DATE[], time_array TIME[], timestamp_array TIMESTAMP[])
You can also define fixed-length arrays. For example:
CREATE TABLE arrayexamples ( float_array3 FLOAT[3], date_array4 DATE[4]
Fixed-length arrays require less storage space than variable-length arrays.