Data
Use Vega data to specify the visualization data sources by providing an array of one or more data definitions.
A data definition must be an object identified by a unique name, which can be referenced in other areas of the specification.
Data can be statically defined inline ("values":), can reference columns from a database table using a SQL statement ("sql":), or can be loaded from an existing data set ("source":).
JSON format:
"data": [
  {
    "name": <dataID>,
    "format": <datasourceFormat>,
    "values": <valueSet> | "sql": <dataSource> | "source": <dataSource>,
    "transform": [ ... elided ... ]
  },
  { ... }
]
The data specification has the following properties:
Property | Data Type | Required | Description |
---|---|---|---|
name | string | X | User-assigned database table name. |
format | string/object | | How the data are parsed. polys and lines are the only supported format mark types and are for rendering purposes only. Use the single-string "short form" for polygon and simple linestring renders. Use the JSON object "long form" to provide more information for rendering more complex line types. |
Data Source | string | | Data source: source, sql, or values. See the Data Source section. |
transform | string | | An array of transforms to perform on the input data. The output of the transform pipeline then becomes the value of this data set. Currently, can only be used with source data set types. |
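The rules above (each definition is an object with a unique name, and its data comes from values, sql, or source) can be checked before a specification is sent for rendering. The following is a minimal sketch, not part of Vega or any renderer API. The function name checkDataDefinitions is hypothetical, and the "exactly one data source per definition" rule is an assumption read from the `|` alternatives in the JSON format.

```javascript
// Hypothetical helper, for illustration only.
// Assumption: a definition supplies exactly one of "values", "sql", "source".
function checkDataDefinitions(data) {
  const errors = [];
  const seen = new Set();
  const sourceKeys = ["values", "sql", "source"];

  data.forEach((def, i) => {
    if (typeof def !== "object" || def === null || Array.isArray(def)) {
      errors.push(`data[${i}] must be an object`);
      return;
    }
    // The name must be present and unique across the data array.
    if (typeof def.name !== "string" || def.name === "") {
      errors.push(`data[${i}] is missing a name`);
    } else if (seen.has(def.name)) {
      errors.push(`duplicate data name "${def.name}"`);
    } else {
      seen.add(def.name);
    }
    // Count how many data source keys are present.
    const present = sourceKeys.filter((key) => key in def);
    if (present.length !== 1) {
      errors.push(`data[${i}] must define exactly one of ${sourceKeys.join(", ")}`);
    }
  });
  return errors;
}
```

An empty array returned from the helper means no problems were found by these particular checks; it does not guarantee that the renderer accepts the specification.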
Examples
Load discrete x- and y-column values using the values database table type:
vegaSpec = {
  width: 384,
  height: 564,
  data: [
    {
      name: "coordinates",
      values: [ {"x":0, "y":3}, {"x":1, "y":5} ]
    }
  ],
  scales: [ ... elided ... ],
  marks: [ ... elided ... ]
};
Use the sql database table type to load latitude and longitude coordinates from the tweets_data database table:
vegaSpec = {
  width: 384,
  height: 564,
  data: [
    {
      name: "tweets",
      sql: "SELECT lon as x, lat as y FROM tweets_data WHERE (lon >= -32 AND lon < 66) AND (lat >= -45 AND lat < 68)"
    }
  ],
  scales: [ ... elided ... ],
  marks: [ ... elided ... ]
};
Use the source type to use the data set defined in the sql data section and perform aggregation transforms:
vegaSpec = {
  width: 384,
  height: 564,
  data: [
    {
      name: "tweets",
      sql: "SELECT lon as x, lat as y FROM tweets_data WHERE (lon >= -32 AND lon < 66) AND (lat >= -45 AND lat < 68)"
    },
    {
      name: "tweets_stats",
      source: "tweets",
      transform: [
        {
          type: "aggregate",
          fields: ["x", "x"],
          ops: ["min", "max"],
          as: ["minx", "maxx"]
        }
      ]
    }
  ],
  scales: [ ... elided ... ],
  marks: [ ... elided ... ]
};
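To make the aggregate transform above concrete, the following sketch emulates it on a few in-memory rows. It is an illustration only, not how the renderer computes aggregates: the function applyAggregate and the sample rows are hypothetical, and only the min and max operations shown in the example are handled. It assumes fields, ops, and as are read position by position.

```javascript
// Illustrative emulation of an aggregate transform of the shape used above.
// Assumption: fields[i], ops[i], and as[i] belong together.
function applyAggregate(rows, transform) {
  const reducers = {
    min: (values) => Math.min(...values),
    max: (values) => Math.max(...values),
  };
  const result = {};
  transform.fields.forEach((field, i) => {
    const op = transform.ops[i];
    if (!(op in reducers)) {
      throw new Error(`unsupported op in this sketch: ${op}`);
    }
    // Reduce the named column and store it under the output name.
    result[transform.as[i]] = reducers[op](rows.map((row) => row[field]));
  });
  return result;
}

// Made-up sample rows standing in for the "tweets" query result.
const rows = [{ x: -10, y: 3 }, { x: 25, y: 5 }, { x: 4, y: -1 }];
const stats = applyAggregate(rows, {
  type: "aggregate",
  fields: ["x", "x"],
  ops: ["min", "max"],
  as: ["minx", "maxx"],
});
console.log(stats); // { minx: -10, maxx: 25 }
```

In this sketch the tweets_stats data set would hold a single record with the fields minx and maxx.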
Data Properties
name
The name property uniquely identifies a data set, and is used for reference by other Vega properties, such as the Marks property.
format
The format property indicates that data preprocessing is needed before rendering the query result. If this property is not specified, data is assumed to be in row-oriented JSON format.
This property is required for Polys and Lines mark types. The property has one of two forms:
- The "short form", where format is a single string, which must be either polys or lines. This form is used for all polygon rendering, and for fast "in-situ" rendering of LINESTRING data.
- The "long form", where format is an object containing other properties, as follows:
Format Property | Description |
---|---|
type | Marks property type: lines or polys. |
coords | Applies to lines. Specifies the x and y columns that hold the line vertices. This permits extracting the columns needed for line rendering and placing them in a rendering buffer. Separate x- and y-array columns are also supported. |
layout | (optional) Applies to lines. Specifies how vertices are packed in the vertices column. All arrays must have the same layout: interleaved or sequential. |
For lines, each row in the query corresponds to a single line.
This lines format example of interleaved data renders ten lines, all of the same length.
"data": [
  {
    "name": "table",
    "sql": "select lineArrayTest.rowid as rowid, vertices, color from lineArrayTest order by color desc limit 10;",
    "format": {
      "type": "lines",
      "coords": {
        "x": ["vertices"],
        "y": [
          {"from": "vertices" }
        ]
      },
      "layout": "interleaved"
    }
  }
]
In this lines format example of sequential data, x stores only points corresponding to the x coordinate and y stores only points corresponding to the y coordinate. Make sure that each column contains only a single coordinate if you use multiple columns in sequential layout.
"data": [
  {
    "name": "table",
    "sql": "select lineArrayTestSeq.rowid as rowid, x, y, color from lineArrayTestSeq order by color desc limit 10;",
    "format": {
      "type": "lines",
      "coords": {
        "x": ["x"],
        "y": ["y"]
      },
      "layout": "sequential"
    }
  }
],
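The difference between the two examples can be pictured as two ways of turning array columns into points. The sketch below is an illustration under stated assumptions, not the renderer's implementation: both function names are hypothetical, and it assumes that interleaved means a single array of alternating x and y values. The separate-column case follows the sequential example above, where x holds only x values and y holds only y values.

```javascript
// Sketch only. Assumption: "interleaved" packs one line as [x0, y0, x1, y1, ...].
function pointsFromInterleaved(vertices) {
  if (vertices.length % 2 !== 0) {
    throw new Error("interleaved array needs an even number of elements");
  }
  const points = [];
  for (let i = 0; i < vertices.length; i += 2) {
    points.push({ x: vertices[i], y: vertices[i + 1] });
  }
  return points;
}

// Separate columns: xs holds only x values and ys holds only y values.
function pointsFromSeparateColumns(xs, ys) {
  if (xs.length !== ys.length) {
    throw new Error("x and y arrays must have the same length");
  }
  return xs.map((x, i) => ({ x, y: ys[i] }));
}
```

Under these assumptions, pointsFromInterleaved([0, 3, 1, 5]) and pointsFromSeparateColumns([0, 1], [3, 5]) describe the same two-point line.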
The following example shows a fast "in-situ" LINESTRING format:
"data": [
  {
    "name": "table",
    "format": "lines",
    "sql": "SELECT rowid, linestring_column, ... FROM ..."
  }
]
The following example shows a polys format:
"data": [
  {
    "name": "polys",
    "format": "polys",
    "sql": "SELECT ... elided ..."
  }
]
Data Source
The data source key-value pair specifies the location of the data and defines how the data is loaded:
Key | Value | Description |
---|---|---|
source | String | Data is loaded from an existing data set. |
sql | SQL statement | Data is loaded using a SQL statement. You can use extension functions to convert a distance in meters from a coordinate or point to a pixel size, and to determine whether a coordinate or point is located within a view defined by latitude and longitude. For more information, see OmniSci SQL Extensions. |
values | JSON data | Data is loaded from static, key-value pair data definitions. |
transform
Transforms process a data stream to calculate new aggregated statistic fields and derive new data streams from them. Currently, transforms are specified only as part of a source data definition. Transforms are defined as an array of specific transform types that are executed in sequential order. Each element of the array must be an object and must contain a type property.
Currently, two transform types are supported: aggregate and formula.
Property | Description |
---|---|
aggregate | Performs aggregation operations on input data columns to calculate new aggregated statistic fields and derive new data streams from them. The following properties are required: fields, ops, and as, as used in the tweets_stats example. |
formula | Evaluates a user-defined expression. The following properties are required: Note: Currently, expressions can only be performed against outputs (as values) from prior aggregate transforms. |
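The constraints in this section can be expressed as a small pre-flight check. The sketch below is hypothetical and not part of any renderer API: checkTransforms is an invented name, and the rule that a formula must come after an aggregate in the same array is one reading of the note above.

```javascript
// Hypothetical checker for the transform rules described in this section.
const SUPPORTED_TRANSFORMS = ["aggregate", "formula"];

function checkTransforms(dataDef) {
  const errors = [];
  if (!("transform" in dataDef)) return errors;

  // Transforms are currently specified only on source data definitions.
  if (!("source" in dataDef)) {
    errors.push("transform is only supported on source data definitions");
  }
  if (!Array.isArray(dataDef.transform)) {
    errors.push("transform must be an array");
    return errors;
  }

  let sawAggregate = false;
  dataDef.transform.forEach((t, i) => {
    if (typeof t !== "object" || t === null || Array.isArray(t)) {
      errors.push(`transform[${i}] must be an object`);
    } else if (!SUPPORTED_TRANSFORMS.includes(t.type)) {
      errors.push(`transform[${i}] has unsupported type: ${String(t.type)}`);
    } else if (t.type === "aggregate") {
      sawAggregate = true;
    } else if (!sawAggregate) {
      // Assumption: a formula needs an earlier aggregate in the same array.
      errors.push(`transform[${i}] (formula) has no prior aggregate transform`);
    }
  });
  return errors;
}
```

Because transforms run in sequential order, the check walks the array once from first to last and remembers whether an aggregate has already been seen.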
See Tutorial: Using the source Data Type with Transforms for an example of using the source and transform properties.