Unnesting Arrays
Any non-trivial data model has relationships and databases and application platforms may or may not support the notion of storing the relationships inline, that is, denormalizing the relationships into the main record. To give a real example, a user may have a set of roles in the application and this user-role relationship can be either stored in a separate table or directly within the user table. However, since the user-role relationship is 1-to-many, storing this directly within the user data model requires storing the relationship as a multi-valued column and arrays are perfect for that.
The challenge that does come with storing the 1-to-many relationsips as arrays is that it then makes it difficult to do certain SQL queries that are easy to do had the relationship been stored explicitly. That is why some databases offer a special operator to deal with such array fields.
Unnest
Unnest
is an operator that takes an array and transforms it into a set of rows.
For example, the array [1,2,3]
becomes three rows containing 1,2 and 3.
Unnest
can support transforming more than array field. When two arrays are
simultaneously being transformed into rows, how would the rows look like? In a
set theoritic approach, one would imagine the two arrays as two sets (multi-sets if duplicates exist)
and the resulting rows as the cartesian product of the two sets. However, the
unnest
operator works differently. It takes ith element from each array and forms a
row out of them. If one of the arrays is smaller than the other, it can be thought of
as padded with null
values at the end.
Unnest
can also support non-array fields along with array fields. The non-array value
is simply repeated with each generated row.
Array data model
Below is an example user data model that stores roles and skills of the user as arrays.
Unnesting the array
Following example unnests multiple array fields along with non-array fields
showing the maximum flexibility of using the unnest
operator.
Unnest and transform
With the ability to unnest
, it is possible to answer aggregation questions
like "which roles are most common?"
SQL construct
unnest
is a SQL construct offered by a few databases like PostgreSQL.
Snowflake doesn't have the unnest
construct but it does have flatten
which can be used to mimic unnest
operator.
UX
The main advantage of array fields for end users is the user experience, when it is done right. The best UX for array fields is how some systems allow adding multiple tags to a record. It is essentially a multi-valued editable field that allows either ad-hoc or controlled set of tags to be entered.
Note that this UX strictly doesn't require an array data model. The application layer can choose to still provide this type of UI while the underlying data model is stored as a separate relationship.
Arrays in Excel
While Excel has enhanced its support for arrays and array formulas in the last couple of years, it doesn't offer the above mentioned UX to store arrays at the individual cell level and manipulate them. Instead, the arrays are expanded into multiple cells. This may or may not always be desirable.
Conclusion
Low-code application platforms mask a lot of complexity from both end users and
developers by offering easy to use contructs. Arrays and the unnest
operator along
with the above mentioned tags like UX provide productivity enhancing constructs to create
and use low-code applications.