Dimension links help build out DJ’s dimensional metadata graph, a key component of the DJ DAG.
There are two types of dimension links: join links and alias/reference links.
You can configure a join link between a dimension node and any source, transform, or dimension node. Configuring this
join link makes all dimension attributes on the dimension node accessible to the original node.
Let’s look at an example to understand what “accessible” means in this context.
Here is a simple dimension join link configured between the events source node and the user dimension node:
This join link was configured by joining the events node’s user_id column to the user dimension’s id column. This
tells DJ how to perform the join, should we ever need the user dimension for any of the events node’s
downstream metrics.
👉 In most cases, the join link’s join_on clause will just be equality comparisons between the primary key and foreign
key columns of the original node and the dimension node. However, more complex join clauses can be configured if
desired, and the join type can be specified as RIGHT, LEFT, or INNER.
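To make the pieces of a join link concrete, here is a minimal Python sketch of the information such a link carries and the join it describes. The JoinLink class, its field names, and the LEFT default are assumptions made for illustration; they are not DJ's actual schema or client API.

```python
from dataclasses import dataclass

# Hypothetical model of a join link (illustrative only, not DJ's schema).
@dataclass(frozen=True)
class JoinLink:
    node: str                 # the original node, e.g. "events"
    dimension: str            # the dimension node, e.g. "user"
    join_on: str              # SQL join condition between the two nodes
    join_type: str = "LEFT"   # assumed default; one of LEFT, RIGHT, INNER

    def __post_init__(self):
        # Reject anything other than the three join types named in the text.
        if self.join_type.upper() not in {"LEFT", "RIGHT", "INNER"}:
            raise ValueError(f"unsupported join type: {self.join_type}")

    def to_sql(self) -> str:
        # Render the FROM/JOIN fragment this link describes.
        return (
            f"FROM {self.node} "
            f"{self.join_type.upper()} JOIN {self.dimension} "
            f"ON {self.join_on}"
        )

link = JoinLink("events", "user", "events.user_id = user.id")
print(link.to_sql())
# FROM events LEFT JOIN user ON events.user_id = user.id
```

Because join_on is free-form here, a more complex condition (for example, one that adds a date-range comparison) could be stored the same way as the simple key equality.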
Let’s also assume that this metric total_event_duration was created using the events source node:
Once the dimension link is in place, all dimension attributes on the user dimension node (id, username, name, etc.)
will be available to the total_event_duration metric to optionally group or filter by. When someone asks DJ for
the total_event_duration metric grouped by the user’s registration_country, DJ will use the configured join link
to build a join between the user dimension and the events source node.
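The query-building step described above can be sketched roughly as follows. This is not DJ's query builder: the build_metric_query helper is hypothetical, and the metric expression SUM(events.duration) is an assumed definition, since the text does not spell out how total_event_duration is computed.

```python
def build_metric_query(metric_expr, node, join_link, group_by):
    """Assemble a grouped metric query, joining the dimension only when needed."""
    dimension, join_on = join_link
    # A join is required only if a requested column lives on the dimension node.
    needs_join = any(col.startswith(dimension + ".") for col in group_by)
    lines = [f"SELECT {', '.join(group_by + [metric_expr])}", f"FROM {node}"]
    if needs_join:
        lines.append(f"LEFT JOIN {dimension} ON {join_on}")
    if group_by:
        lines.append(f"GROUP BY {', '.join(group_by)}")
    return "\n".join(lines)

sql = build_metric_query(
    # Assumed metric definition, for illustration only.
    metric_expr="SUM(events.duration) AS total_event_duration",
    node="events",
    join_link=("user", "events.user_id = user.id"),
    group_by=["user.registration_country"],
)
print(sql)
# SELECT user.registration_country, SUM(events.duration) AS total_event_duration
# FROM events
# LEFT JOIN user ON events.user_id = user.id
# GROUP BY user.registration_country
```

When no dimension attribute is requested, the sketch skips the join entirely and queries the events node on its own.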
Some dimensions themselves may include join links to other dimensions, adding to the available join paths and
allowing DJ to discover more dimensions that can be used by downstream metrics. Let’s walk through an example of such
a case.
Extending from the example above, let’s add on a country dimension node:
Now events and its downstream metric total_event_duration will have additional dimensions available:
country.id, country.name, and country.population, in addition to the dimensions from the user dimension
node.
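One way to picture this discovery of dimensions across multiple hops is a breadth-first walk over the link graph. The sketch below is an assumption about how such a traversal could look, not DJ's implementation; the LINKS and ATTRIBUTES tables are hand-written stand-ins for the example nodes.

```python
from collections import deque

# Illustrative link graph: node -> dimension nodes it links to directly.
LINKS = {
    "events": ["user"],
    "user": ["country"],
    "country": [],
}

# Illustrative attributes per dimension node, taken from the example.
ATTRIBUTES = {
    "user": ["id", "username", "name"],
    "country": ["id", "name", "population"],
}

def available_dimensions(start):
    """Breadth-first walk over dimension links, collecting reachable attributes."""
    seen, queue, found = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for dim in LINKS.get(node, []):
            if dim in seen:
                continue  # guard against cycles and repeated visits
            seen.add(dim)
            queue.append(dim)
            found.extend(f"{dim}.{attr}" for attr in ATTRIBUTES.get(dim, []))
    return found

dims = available_dimensions("events")
print(dims)
# ['user.id', 'user.username', 'user.name',
#  'country.id', 'country.name', 'country.population']
```

The country attributes are reachable from events only because user links to country; removing that link from LINKS leaves just the user attributes.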
Note the role attribute on the dimension link from above.
The dimension link’s role attribute is used to distinguish between dimensions that play different roles
for a given node. In this case, the country dimension represents a user’s registration_country, but it can
also play a different role, like representing the country an event was recorded in.
Here, the country dimension is linked to both user (as the role registration_country) and events (as
the role event_country). After this link is in place, requesting the country.name dimension for the events node will
be ambiguous without choosing a role.
DJ will distinguish between the two dimension roles with the following syntax:
country.name[registration_country]
country.name[event_country]
Or more generally, <dimension>[<role>]. The [role] part can be safely omitted if there is only a single role
defined for that dimension.
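As a rough illustration of the <dimension>[<role>] syntax, the following sketch parses a reference and applies the rule that the role may be omitted only when it is unambiguous. The resolve function and the roles mapping are hypothetical, written for this example rather than taken from DJ.

```python
import re

# Matches "country.name" or "country.name[event_country]".
_PATTERN = re.compile(r"^(?P<attr>[\w.]+?)(?:\[(?P<role>\w+)\])?$")

def resolve(reference, roles_by_dimension):
    """Split a dimension reference into (attribute, role), checking ambiguity."""
    match = _PATTERN.match(reference)
    if match is None:
        raise ValueError(f"malformed dimension reference: {reference!r}")
    attribute, role = match["attr"], match["role"]
    dimension = attribute.rsplit(".", 1)[0]
    roles = roles_by_dimension.get(dimension, [])
    if role is None:
        if len(roles) > 1:
            raise ValueError(f"{attribute} is ambiguous; choose one of {roles}")
        # Zero or one role defined: the [role] part can be left out.
        role = roles[0] if roles else None
    elif role not in roles:
        raise ValueError(f"unknown role {role!r} for {dimension}")
    return attribute, role

# Roles available from the events node in the example above.
roles = {"country": ["registration_country", "event_country"], "user": []}

print(resolve("country.name[event_country]", roles))
# ('country.name', 'event_country')
print(resolve("user.username", roles))
# ('user.username', None)
```

Calling resolve("country.name", roles) raises a ValueError in this sketch, mirroring the ambiguity described above.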
You can configure a dimension alias/reference between a particular column on a table/view-like node
(source, transform, dimension) and a column on a dimension node. An example of the alias/reference link:
In this case, configuring a reference between events.country_name and country.name will indicate that the
semantic meaning behind the events.country_name column refers to the default.country dimension’s name field.
If someone requests events with the dimension default.country.name (or the metric total_event_duration with the
same dimension), DJ will know not to perform a join to the country dimension, since the name attribute is
already available directly on the original node.
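The decision to skip the join can be sketched as a simple lookup. The REFERENCE_LINKS table and resolve_dimension helper below are assumptions for illustration and do not reflect how DJ stores or resolves reference links internally.

```python
# Hypothetical reference links: dimension attribute -> column already on the node.
REFERENCE_LINKS = {
    "events": {"default.country.name": "events.country_name"},
}

def resolve_dimension(node, dimension_attribute):
    """Return (column_to_select, needs_join) for a requested dimension attribute."""
    local_column = REFERENCE_LINKS.get(node, {}).get(dimension_attribute)
    if local_column is not None:
        # The attribute is already present on the node, so no join is needed.
        return local_column, False
    # Otherwise fall back to joining the dimension node.
    return dimension_attribute, True

print(resolve_dimension("events", "default.country.name"))
# ('events.country_name', False)
print(resolve_dimension("events", "default.country.population"))
# ('default.country.population', True)
```

Only the name attribute is covered by the reference link here, so a request for any other country attribute would still require a join.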