Functions and Operators
- Basic Mathematical Operators
- Comparison Operators
- Mathematical Functions
- Trigonometric Functions
- Geometric Functions
- String Functions
- Pattern-matching Functions
- Date/Time Functions
- Aggregate Functions
- Statistical Functions
- Window Functions
Basic Mathematical Operators
Mathematical Operator Precedence
- Parenthesization
- Multiplication and division
- Addition and subtraction
Comparison Operators
Mathematical Functions
Trigonometric Functions
Geometric Functions
String Functions
Function | Description |
---|---|
CHAR_LENGTH(str) |
Returns the number of characters in a string |
LENGTH(str) |
Returns the length of a string in bytes |
Note |
String functions only work with unencoded fields (ENCODING set to
none ).
|
Pattern-matching Functions
Usage Notes
The following wildcard characters are supported by LIKE
and ILIKE
:
%
matches any number of characters, including zero characters._
matches exactly one character.
Date/Time Functions
Usage Notes
Supported date_part types:
DATE_TRUNC [YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, MILLENNIUM, CENTURY, DECADE, WEEK,
QUARTERDAY]
EXTRACT [YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, DOW, ISODOW, DOY, EPOCH, QUARTERDAY,
WEEK]
DATEDIFF [YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, MILLENNIUM, CENTURY, DECADE,
QUARTERDAY]
Supported interval types:
DATEADD [DECADE, YEAR, QUARTER, MONTH, WEEK, WEEKDAY, DAY, HOUR, MINUTE, SECOND ] TIMESTAMPADD [YEAR, QUARTER, MONTH, WEEKDAY, DAY, HOUR, MINUTE, SECOND ] DATEPART [YEAR, QUARTER, MONTH, DAYOFYEAR, QUARTERDAY, WEEKDAY, DAY, HOUR, MINUTE, SECOND ]
Accepted Date, Time, and Timestamp Formats
Usage Notes
- For two-digit years, years 69-99 are assumed to be previous century (for example, 1969), and 0-68 are assumed to be current century (for example, 2016).
- For four-digit years, negative years (BC) are not supported.
- Hours are expressed in 24-hour format.
- When time components are separated by colons, you can write them as one or two digits.
- Months are case insensitive. You can spell them out or abbreviate to three characters.
- For timestamps, decimal seconds are ignored. Time zone offsets are written as +/-HHMM.
- For timestamps, a numeric string is converted to +/- seconds since January 1, 1970. Supported timestamps range from -30610224000 (January 1, 1000) through 29379456000 (December 31, 2900).
- On output, dates are formatted as YYYY-MM-DD. Times are formatted as HH:MM:SS.
- Linux EPOCH values range from -30610224000 (1/1/1000) through 185542587100800 (1/1/5885487). Complete range in years: +/-5,883,517 around epoch.
Aggregate Functions
Usage Notes
COUNT(DISTINCT x)
, especially when used in conjunction with GROUP BY, can require a very large amount of memory to keep track of all distinct values in large tables with large cardinalities. To avoid this large overhead, use APPROX_COUNT_DISTINCT.APPROX_COUNT_DISTINCT(x, e)
gives an approximate count of the value x, based on an expected error rate defined in e. The error rate is an integer value from 1 to 100. The lower the value of e, the higher the precision, and the higher the memory cost. Select a value for e based on the level of precision required. On large tables with large cardinalities, consider usingAPPROX_COUNT_DISTINCT
when possible to preserve memory.Set the default error rate using the -hll-precision-bits configuration paramenter.
-
Currently, OmniSci does not support grouping by non-dictionary-encoded strings.
However, with the
SAMPLE
aggregate function, you can select non-dictionary-encoded strings that are presumed to be unique in a group. For example:SELECT user_name, SAMPLE(user_decription) FROM tweets GROUP BY user_name;
If the aggregated column (user_description in the example above) is not unique within a group,SAMPLE
selects a value that might be nondeterministic because of the parallel nature of OmniSci query execution.
Statistical Functions
Both double-precision (standard) and single-precision floating point functions are provided. Single-precision functions run faster on GPUs but might cause overflow errors.
Window Functions
Window functions allow you to work with a subset of rows related to the currently selected row.
Window functions must always contain an OVER clause. The OVER clause splits up the rows of the query for processing by the window function.
The PARTITION BY list divides the rows into groups that share the same values of the PARTITION BY expression(s). For each row, the window function is computed using all rows in the same partition as the current row.
Rows that have the same value in the ORDER BY clause are considered peers. The ranking functions give the same answer for any two peer rows.
Usage Notes
- OmniSciDB supports the aggregate functions
AVG
,MIN
,MAX
,SUM
, andCOUNT
in window functions. - OmniSciDB does not support empty partitions. For example, the following query
triggers an exception because the OVER clause requires a PARTITION BY list:
SELECT dest, ntile(4) OVER (ORDER BY total_count DESC) AS quartile FROM my_test_data.
- Window functions only work on single fragment datasets. If you want to run window functions over base data in your table, you must ensure there is only one fragment (by increasing the fragment size to be greater than the number of rows expected in the table before import). If you are running the window function on top of an intermediate result (for example, a GROUP BY), the intermediate result is contained in a single fragment, even if the underlying table contains multiple fragments. This happens automatically if a GROUP BY clause is part of the window function query.
- Window functions are not supported in distributed mode.
Example
SELECT * FROM ( SELECT ntile(5) OVER (PARTITION BY airtime ORDER BY distance DESC) AS ranking, distance, airtime, origin_name, dest_name FROM flights_2008_10k) AS myQuery where distance < 140 ORDER BY ranking DESC ;
This query might return more than 10 rows if there are multiple flights with the same airtime.