Skip to main content

Unnesting Arrays

· 4 min read
Siva Dirisala
Creator of SQL Frames

Any non-trivial data model has relationships and databases and application platforms may or may not support the notion of storing the relationships inline, that is, denormalizing the relationships into the main record. To give a real example, a user may have a set of roles in the application and this user-role relationship can be either stored in a separate table or directly within the user table. However, since the user-role relationship is 1-to-many, storing this directly within the user data model requires storing the relationship as a multi-valued column and arrays are perfect for that.

The challenge that does come with storing the 1-to-many relationsips as arrays is that it then makes it difficult to do certain SQL queries that are easy to do had the relationship been stored explicitly. That is why some databases offer a special operator to deal with such array fields.

Unnest

Unnest is an operator that takes an array and transforms it into a set of rows. For example, the array [1,2,3] becomes three rows containing 1,2 and 3.

Unnest can support transforming more than array field. When two arrays are simultaneously being transformed into rows, how would the rows look like? In a set theoritic approach, one would imagine the two arrays as two sets (multi-sets if duplicates exist) and the resulting rows as the cartesian product of the two sets. However, the unnest operator works differently. It takes ith element from each array and forms a row out of them. If one of the arrays is smaller than the other, it can be thought of as padded with null values at the end.

Unnest can also support non-array fields along with array fields. The non-array value is simply repeated with each generated row.

Array data model

Below is an example user data model that stores roles and skills of the user as arrays.

Loading...

Unnesting the array

Following example unnests multiple array fields along with non-array fields showing the maximum flexibility of using the unnest operator.

Loading...

Unnest and transform

With the ability to unnest, it is possible to answer aggregation questions like "which roles are most common?"

Loading...

SQL construct

unnest is a SQL construct offered by a few databases like PostgreSQL. Snowflake doesn't have the unnest construct but it does have flatten which can be used to mimic unnest operator.

UX

The main advantage of array fields for end users is the user experience, when it is done right. The best UX for array fields is how some systems allow adding multiple tags to a record. It is essentially a multi-valued editable field that allows either ad-hoc or controlled set of tags to be entered.

Note that this UX strictly doesn't require an array data model. The application layer can choose to still provide this type of UI while the underlying data model is stored as a separate relationship.

Arrays in Excel

While Excel has enhanced its support for arrays and array formulas in the last couple of years, it doesn't offer the above mentioned UX to store arrays at the individual cell level and manipulate them. Instead, the arrays are expanded into multiple cells. This may or may not always be desirable.

Conclusion

Low-code application platforms mask a lot of complexity from both end users and developers by offering easy to use contructs. Arrays and the unnest operator along with the above mentioned tags like UX provide productivity enhancing constructs to create and use low-code applications.