Functions and Operators
- Basic Mathematical Operators
- Comparison Operators
- Mathematical Functions
- Trigonometric Functions
- Geometric Functions
- String Functions
- Pattern-matching Functions
- Date/Time Functions
- Aggregate Functions
- Statistical Functions
Basic Mathematical Operators
Operator | Description |
---|---|
+numeric |
Returns numeric |
–numeric |
Returns negative value of numeric |
numeric1 + numeric2 |
Sum of numeric1 and numeric2 |
numeric1 – numeric2 |
Difference of numeric1 and numeric2 |
numeric1 * numeric2 |
Product of numeric1 and numeric2 |
numeric1 / numeric2 |
Quotient (numeric1 divided by numeric2) |
Mathematical Operator Precedence
- Parenthesization
- Multiplication and division
- Addition and subtraction
Comparison Operators
Operator | Description |
---|---|
= |
Equals |
<> |
Not equals |
> |
Greater than |
>= |
Greater than or equal to |
< |
Less than |
<= |
Less than or equal to |
BETWEEN x AND y |
Is a value within a range |
NOT BETWEEN x AND y |
Is a value not within a range |
IS NULL |
Is a value null |
IS NOT NULL |
Is a value not null |
NULLIF(x, y) |
Compare expressions x and
y. If different, return
x. If they are the same,
return null . For example,
if a dataset uses ‘NA’ for
null values, you can use
this statement to return
null using SELECT
NULLIF(field_name,'NA') . |
Mathematical Functions
Function | Description |
---|---|
ABS(x) |
Returns the absolute value of x |
CEIL(x) |
Returns the smallest integer not less than the argument |
DEGREES(x) |
Converts radians to degrees |
EXP(x) |
Returns the value of e to the power of x |
FLOOR(x) |
Returns the largest integer not greater than the argument |
LN(x) |
Returns the natural logarithm of x |
LOG(x) |
Returns the natural logarithm of x |
LOG10(x) |
Returns the base-10 logarithm of the specified float expression x |
MOD(x, y) |
Returns the remainder of int x divided by int y |
PI() |
Returns the value of pi |
POWER(x,
y) |
Returns the value of x raised to the power of y |
RADIANS(x) |
Converts degrees to radians |
ROUND(x ) |
Rounds x to the nearest integer value, but does not change the data type. For example, the double value 4.1 rounds to the double value 4. |
ROUND_TO_DIGIT
(x, y) |
Rounds x to y decimal places |
SIGN(x) |
Returns the sign of x as -1, 0, 1 if x is negative, zero, or positive |
SQRT(x) |
Returns the square root of x. |
TRUNCATE(x,
y) |
Truncates x to y decimal places |
Trigonometric Functions
Function | Description |
---|---|
ACOS(x) |
Returns the arc cosine of x |
ASIN(x) |
Returns the arc sine of x |
ATAN(x) |
Returns the arc tangent of x |
ATAN2(x, y) |
Returns the arc tangent of x and y |
COS(x) |
Returns the cosine of x |
COT(x) |
Returns the cotangent of x |
SIN(x) |
Returns the sine of x |
TAN(x) |
Returns the tangent of x |
Geometric Functions
Function | Description |
---|---|
DISTANCE_IN_METERS(fromLon, fromLat,
toLon, toLat) |
Calculates distance in meters between two WGS-84 positions. |
CONV_4326_900913_X(x) |
Converts WGS-84 latitude to WGS-84 Web Mercator x coordinate. |
CONV_4326_900913_Y(y) |
Converts WGS-84 longitude to WGS-84 Web Mercator y coordinate. |
String Functions
Function | Description |
---|---|
CHAR_LENGTH(str) |
Returns the number of characters in a string |
LENGTH(str) |
Returns the length of a string in bytes |
Note |
String functions only work with unencoded fields (ENCODING set to
none ).
|
Pattern-matching Functions
Name | Example | Description |
---|---|---|
str LIKE pattern |
'ab' LIKE 'ab' |
Returns true if the string matches the pattern |
str NOT LIKE pattern |
'ab' NOT LIKE 'cd' |
Returns true if the string does not match the pattern |
str ILIKE pattern |
'AB' ILIKE 'ab' |
Case-insensitive LIKE |
str REGEXP POSIX pattern |
'^[a-z]+r$' |
Lowercase string ending with r |
REGEXP_LIKE ( str , POSIX
pattern ) |
'^[hc]at' |
cat or hat |
Usage Notes
The following wildcard characters are supported by LIKE
and ILIKE
:
%
matches any number of characters, including zero characters._
matches exactly one character.
Date/Time Functions
Function | Description |
---|---|
DATE_TRUNC(date_part,
timestamp) |
Truncates the timestamp to the specified date_part. Note that date_part is not enclosed in single quotes. |
Example
SELECT DATE_TRUNC(MINUTE, arr_timestamp) Arrival \ FROM flights_2008_10k LIMIT 10; |
|
EXTRACT(date_part FROM
timestamp) |
Returns the specified date_part from timestamp. Note that date_part is not enclosed in single quotes. |
Example
SELECT EXTRACT(HOUR FROM arr_timestamp) Arrival_Hour \ FROM flights_2008_10k LIMIT 10; |
|
INTERVAL 'count' date_part |
Adds or Subtracts count date_part units from a timestamp. Note that 'count' is enclosed in single quotes and date_part is not enclosed in single quotes. |
Example
SELECT arr_timestamp + INTERVAL '10' YEAR \ FROM flights_2008_10k LIMIT 10; |
|
NOW() |
Returns the current timestamp. |
Example
SELECT NOW() FROM flights_2008_10k LIMIT 1; |
|
TIMESTAMPADD(date_part, count, timestamp |
date) |
Adds an interval of count date_part to timestamp or date and returns signed date_part units in the provided timestamp or date form. Note that neither count nor date_part are enclosed in single quotes. |
Example
SELECT TIMESTAMPADD(DAY, 14, arr_timestamp) Fortnight \ FROM flights_2008_10k LIMIT 10; |
|
TIMESTAMPDIFF(date_part,
timestamp1, timestamp2) |
Subtracts timestamp1 from timestamp2 and returns the result in signed date_part units. Note that date_part is not enclosed in single quotes. |
Example
SELECT TIMESTAMPDIFF(MINUTE, arr_timestamp, dep_timestamp) Flight_Time \ FROM flights_2008_10k LIMIT 10; |
|
DATEDIFF('date_part',
date, date) |
Returns the difference between two
dates, calculated to the
lowest level of the 'date_part'
you specify. For example, if you set
the 'date_part' as DAY, only
the year, month, and day are used to
calculate the result. Other fields,
such as hour and minute, are
ignored. Note that 'date_part' must be enclosed in single quotes. |
Example
SELECT DATEDIFF('YEAR', plane_issue_date, now()) Years_In_Service \ FROM flights_2008_10k LIMIT 10; |
|
DATEADD('date_part',
interval, date | timestamp) |
Returns a date after a specified time/date interval has been added. Note that 'date_part' must be enclosed in single quotes. |
Example
SELECT DATEADD('MINUTE', 6000, dep_timestamp) Arrival_Estimate \ FROM flights_2008_10k LIMIT 10; |
|
DATEPART('interval', date | timestamp) |
Returns a specified part of a given date or timestamp as an integer value. Note that 'interval' must be enclosed in single quotes. |
Example
SELECT DATEPART('YEAR', plane_issue_date) Year_Issued \ FROM flights_2008_10k LIMIT 10; |
Usage Notes
Supported date_part types:
DATE_TRUNC [YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND,
MICROSECOND, NANOSECOND, MILLENNIUM, CENTURY, DECADE, WEEK,
QUARTERDAY]
EXTRACT [YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND,
MICROSECOND, NANOSECOND, DOW, ISODOW, DOY, EPOCH, QUARTERDAY,
WEEK]
DATEDIFF [YEAR, QUARTER, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND,
MICROSECOND, NANOSECOND, MILLENNIUM, CENTURY, DECADE, WEEK,
QUARTERDAY]
Supported interval types:
DATEADD [DECADE, YEAR, QUARTER, MONTH, WEEK, WEEKDAY, DAY, HOUR, MINUTE, SECOND, MILLISECOND, MICROSECOND, NANOSECOND] TIMESTAMPADD [YEAR, QUARTER, MONTH, WEEKDAY, DAY, HOUR, MINUTE, SECOND, MILLISECOND, MICROSECOND, NANOSECOND] DATEPART [YEAR, QUARTER, MONTH, DAYOFYEAR, QUARTERDAY, WEEKDAY, DAY, HOUR, MINUTE, SECOND, MILLISECOND, MICROSECOND, NANOSECOND]
Accepted Date, Time, and Timestamp Formats
Datatype | Formats | Examples |
---|---|---|
DATE | YYYY-MM-DD | 2013-10-31 |
DATE | MM/DD/YYYY | 10/31/2013 |
DATE | DD-MON-YY | 31-Oct-13 |
DATE | DD/Mon/YYYY | 31/Oct/2013 |
EPOCH | 1383262225 | |
TIME | HH:MM | 23:49 |
TIME | HHMMSS | 234901 |
TIME | HH:MM:SS | 23:49:01 |
TIMESTAMP | DATE TIME | 31-Oct-13 23:49:01 |
TIMESTAMP | DATETTIME | 31-Oct-13T23:49:01 |
TIMESTAMP | DATE:TIME | 11/31/2013:234901 |
TIMESTAMP | DATE TIME ZONE | 31-Oct-13 11:30:25 -0800 |
TIMESTAMP | DATE HH.MM.SS PM | 31-Oct-13 11.30.25pm |
TIMESTAMP | DATE HH:MM:SS PM | 31-Oct-13 11:30:25pm |
TIMESTAMP | 1383262225 |
Usage Notes
- For two-digit years, years 69-99 are assumed to be previous century (for example, 1969), and 0-68 are assumed to be current century (for example, 2016).
- For four-digit years, negative years (BC) are not supported.
- Hours are expressed in 24-hour format.
- When time components are separated by colons, you can write them as one or two digits.
- Months are case insensitive. You can spell them out or abbreviate to three characters.
- For timestamps, decimal seconds are ignored. Time zone offsets are written as +/-HHMM.
- For timestamps, a numeric string is converted to +/- seconds since January 1, 1970. Supported timestamps range from -30610224000 (January 1, 1000) through 29379456000 (December 31, 2900).
- On output, dates are formatted as YYYY-MM-DD. Times are formatted as HH:MM:SS.
- Linux EPOCH values range from -30610224000 (1/1/1000) through 185542587100800 (1/1/5885487). Complete range in years: +/-5,883,517 around epoch.
Aggregate Functions
Function | Description |
---|---|
AVG(x) |
Returns the average value of x |
COUNT() |
Returns the count of the number of rows returned |
COUNT(DISTINCT x) |
Returns the count of distinct values of x |
APPROX_COUNT_DISTINCT(x, e) |
Returns the approximate count of distinct values of x with defined expected error rate e |
MAX(x) |
Returns the maximum value of x |
MIN(x) |
Returns the minimum value of x |
SUM(x) |
Returns the sum of the values of x |
SAMPLE(x) |
Returns one sample value from aggregated column x.
For example, the following query returns population grouped by
city, along with one value from the state column for each group:
SELECT city, SAMPLE(state), sum (population) FROM census_table GROUP BY city
Note: This was previously |
Usage Notes
COUNT(DISTINCT x)
, especially when used in conjunction with GROUP BY, can require a very large amount of memory to keep track of all distinct values in large tables with large cardinalities. To avoid this large overhead, use APPROX_COUNT_DISTINCT.APPROX_COUNT_DISTINCT(x, e)
gives an approximate count of the value x, based on an expected error rate defined in e. The error rate is an integer value from 1 to 100. The lower the value of e, the higher the precision, and the higher the memory cost. Select a value for e based on the level of precision required. On large tables with large cardinalities, consider usingAPPROX_COUNT_DISTINCT
when possible to preserve memory.Set the default error rate using the -hll-precision-bits configuration paramenter.
-
Currently, OmniSci does not support grouping by non-dictionary-encoded strings.
However, with the
SAMPLE
aggregate function, you can select non-dictionary-encoded strings that are presumed to be unique in a group. For example:SELECT user_name, SAMPLE(user_decription) FROM tweets GROUP BY user_name;
If the aggregated column (user_description in the example above) is not unique within a group,SAMPLE
selects a value that might be nondeterministic because of the parallel nature of OmniSci query execution.
Statistical Functions
Both double-precision (standard) and single-precision floating point functions are provided. Single-precision functions run faster on GPUs but might cause overflow errors.
Double-precision FP Function | Single-precision FP Function | Description |
---|---|---|
CORRELATION(x, y) |
CORRELATION_FLOAT(x, y) |
Alias of CORR. Returns the coefficient of correlation of a set of number pairs. |
CORR(x, y) |
CORR_FLOAT(x, y) |
Returns the coefficient of correlation of a set of number pairs. |
COVAR_POP(x, y) |
COVAR_POP_FLOAT(x, y) |
Returns the population covariance of a set of number pairs. |
COVAR_SAMP(x, y) |
COVAR_SAMP_FLOAT(x, y) |
Returns the sample covariance of a set of number pairs. |
STDDEV(x) |
STDDEV_FLOAT(x) |
Alias of STDDEV_SAMP. Returns sample standard deviation of the value. |
STDDEV_POP(x) |
STDDEV_POP_FLOAT(x) |
Returns the population standard the standard deviation of the value. |
STDDEV_SAMP(x) |
STDDEV_SAMP_FLOAT(x) |
Returns the sample standard deviation of the value. |
VARIANCE(x) |
VARIANCE_FLOAT(x) |
Alias of VAR_SAMP. Returns the sample variance of the value. |
VAR_POP(x) |
VAR_POP_FLOAT(x) |
Returns the population variance sample variance of the value. |
VAR_SAMP(x) |
VAR_SAMP_FLOAT(x) |
Returns the sample variance of the value. |