Nested(name1 Type1, Name2 Type2, ...)
A nested data structure is like a table inside a cell. The parameters of a nested data structure – the column names and types – are specified the same way as in a CREATE TABLE query. Each table row can correspond to any number of rows in a nested data structure.
Column names containing dots, columns sharing a common dot-prefix, and columns with the Array type can each be interpreted as part of a flattened Nested structure when flatten_nested = 1 (the default). This can cause unexpected array-length validation on inserts and renaming restrictions.
Avoid using dots in column names if possible.
Use underscores (_) or another separator instead of dots in column names unless you intentionally need Nested semantics.
Example:
The CREATE TABLE DDL statement above declares the Goals nested data structure, which contains data about conversions, or goals reached.
Each row in the 'visits' table corresponds to zero or more conversions.
When the flatten_nested setting is set to 0 (flatten_nested=1 by default) arbitrary levels of nesting are supported.
In most cases, when working with a nested data structure, its columns are specified with column names separated by a dot. These columns make up an array of matching types. All column arrays of a single nested data structure have the same length.
For example:
It is easiest to think of a nested data structure as a set of multiple column arrays of equal length.
Filtering Nested columns in WHERE
Because each column of a Nested structure is stored as an Array, referencing it in a WHERE clause gives you the whole array for every row, not an individual element. You cannot compare a nested column directly to a scalar value so you must use array functions instead.
For example, this query does not silently return no rows — it raises an exception, because Goals.ID has type Array(UInt32) and equals(Array(UInt32), UInt32) is not a valid comparison:
Use has to check whether an array contains a specific value:
Use arrayExists when the condition is more complex:
You can filter by array length with length or exclude empty arrays with notEmpty:
To filter on individual elements of a nested structure rather than on whole rows, use ARRAY JOIN to unfold the arrays first.
After ARRAY JOIN, each element becomes a separate row, so the WHERE clause applies to scalar values.
For more information, see ARRAY JOIN clause. Example:
You can't perform SELECT for an entire nested data structure. You can only explicitly list individual columns that are part of it.
Inserting data
For an INSERT query, you should pass all the component column arrays of a nested data structure separately (as if they were individual column arrays). During insertion, the system checks that they have the same length.
Each nested sub-column is listed in the column list using dot notation (Goals.ID, Goals.Serial, ...), and the corresponding values are arrays:
All nested sub-column arrays within a single row must have the same length. Mismatched lengths cause an error:
For a DESCRIBE query, the columns in a nested data structure are listed separately in the same way.
ALTER limitations
ALTER queries on nested data structures have the following limitations:
Adding sub-columns works normally. You can add a new sub-column to an existing Nested structure:
Dropping sub-columns works for individual sub-columns:
Modifying the type of a sub-column works and triggers a mutation (data rewrite):
Renaming has restrictions. You can rename a sub-column within the same nested structure:
However, you cannot:
- Rename the entire nested structure itself (e.g.,
GoalstoConversions). - Move a sub-column to a different nested structure (e.g.,
Goals.IDtoOtherNested.ID). - Move a sub-column out of or into a nested structure (e.g.,
Goals.IDtoGoalIDor vice versa).