Dimension Links

Dimension links help build out DJ’s dimensional metadata graph, a key component of the DJ DAG. There are two types of dimension links: join links and alias/reference links.

You can configure a join link between a dimension node and any source, transform, or dimension node. Once the join link is configured, all dimension attributes on the dimension node become accessible from the original node.

Let’s look at an example to understand what “accessible” means in this context.

Here is a simple dimension join link configured between the events source node and the user dimension node:

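[Diagram: join link between the events source node and the user dimension node]
events: user_id (long), country_id (int), event_secs (long), event_ts (long)
user: id (int, PK), username (str), name (str), registration_country (str)
Dimension Join Link (events linked to user): join_on: events.user_id = user.id; join_type: LEFT; role: event_user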

This join link was configured by joining the events node’s user_id column to the user dimension’s id column. It tells DJ how to perform the join whenever an attribute of the user dimension is needed for any of the events node’s downstream metrics.

Let’s also assume that a metric, total_event_duration, was created from the events source node:

[Diagram: the total_event_duration metric queries from the events source node]
events: user_id (long), country_id (int), event_secs (long), event_ts (long)
total_event_duration: query: sum[event_secs]

Once the dimension link is in place, all dimension attributes on the user dimension node (id, username, name, etc.) are available to the total_event_duration metric to optionally group or filter by. When someone asks DJ for the total_event_duration metric grouped by the user’s registration_country, DJ uses the configured join link to build a join between the user dimension and the events source node.
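To make the join concrete, here is a minimal sketch of how a join link's attributes could translate into SQL for that request. JoinLink and build_query are hypothetical names used only for illustration; they are not part of DJ, and the SQL that DJ actually generates may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinLink:
    """Hypothetical stand-in for a dimension join link (not a DJ class)."""

    node: str       # node the link is configured on
    dimension: str  # dimension node being linked
    join_on: str    # join condition
    join_type: str  # e.g. LEFT
    role: str       # role the dimension plays for the node


def build_query(metric_name: str, metric_expr: str, group_by: str, link: JoinLink) -> str:
    """Assemble illustrative SQL for a metric grouped by a linked dimension attribute."""
    return (
        f"SELECT {group_by}, {metric_expr} AS {metric_name}\n"
        f"FROM {link.node}\n"
        f"{link.join_type} JOIN {link.dimension} ON {link.join_on}\n"
        f"GROUP BY {group_by}"
    )


# Values taken from the join link in the example above.
event_user = JoinLink(
    node="events",
    dimension="user",
    join_on="events.user_id = user.id",
    join_type="LEFT",
    role="event_user",
)

sql = build_query(
    metric_name="total_event_duration",
    metric_expr="SUM(events.event_secs)",  # assumed SQL form of the metric query
    group_by="user.registration_country",
    link=event_user,
)
print(sql)
```

The point of the sketch is that the requester only names a metric and a dimension attribute; the join condition and join type come from the link's configuration.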

Dimension join links can be configured in DJ using any of the following:

curl:

curl -X 'POST' \
  'http://localhost:8000/nodes/default.events/columns/user_id/?dimension=default.user&dimension_column=id' \
  -H 'accept: application/json'

Python:

dimension = dj.dimension("default.events")
dimension.link_dimension(
    column="user_id",
    dimension="default.user",
    dimension_column="id",
)

JavaScript:

dj.dimensions.link("default.events", "user_id", "default.user", "id").then(data => console.log(data))

Some dimensions themselves may include join links to other dimensions, adding to the available join paths and allowing DJ to discover more dimensions that can be used by downstream metrics. Let’s walk through an example of such a case.

Extending from the example above, let’s add on a country dimension node:

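[Diagram: the country dimension node added alongside events and user]
events: user_id (long), country_id (int), event_secs (long), event_ts (long)
user: id (int, PK), username (str), name (str), registration_country (str)
country: id (int, PK), name (str), population (long)
Dimension Join Link (events linked to user): join_on: events.user_id = user.id; join_type: LEFT; role: event_user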

The country dimension can be linked to the user dimension node using the user node’s registration_country column:

[Diagram: user linked to country through a second join link]
events: user_id (long), country_id (int), event_secs (long), event_ts (long)
user: id (int, PK), username (str), name (str), registration_country (str)
country: id (int, PK), name (str), population (long)
Dimension Join Link (events linked to user): join_on: events.user_id = user.id; join_type: LEFT; role: event_user
Dimension Join Link 2 (user linked to country): join_on: user.registration_country = country.id; join_type: LEFT; role: registration_country

Now events and its downstream metric total_event_duration have additional dimensions available (country.id, country.name, and country.population), in addition to the dimensions from the user dimension node.
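One way to picture this discovery is as a graph traversal over the configured join links. The sketch below is an illustration under simplifying assumptions, not DJ's implementation: the two dictionaries and the available_dimensions function are hypothetical, and join types and roles are ignored.

```python
from collections import deque

# Attributes exposed by each dimension node in the example.
DIMENSION_ATTRIBUTES = {
    "user": ["id", "username", "name", "registration_country"],
    "country": ["id", "name", "population"],
}

# Join links from the example: node -> dimension nodes it links to.
JOIN_LINKS = {
    "events": ["user"],
    "user": ["country"],
}


def available_dimensions(node: str) -> list[str]:
    """Breadth-first walk over join links, collecting reachable dimension attributes."""
    found: list[str] = []
    visited = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for dim in JOIN_LINKS.get(current, []):
            if dim in visited:
                continue  # each dimension node is expanded only once
            visited.add(dim)
            queue.append(dim)
            found.extend(f"{dim}.{attr}" for attr in DIMENSION_ATTRIBUTES[dim])
    return found


dims = available_dimensions("events")
print(dims)
```

Starting from events, the walk first reaches user through the events-to-user link and then country through the user-to-country link, which is why the country attributes become available without any direct link from events.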

Dimension Roles

The dimension link’s role attribute is used to distinguish between dimensions that play different roles for a given node. In this case, the country dimension represents a user’s registration_country, but it can also play a different role, like representing the country an event was recorded in.

Let’s look at an example:

[Diagram: country linked to both user and events under different roles]
events: user_id (long), country_id (int), event_secs (long), event_ts (long)
user: id (int, PK), username (str), name (str), registration_country (str)
country: id (int, PK), name (str), population (long)
Dimension Join Link (events linked to user): join_on: events.user_id = user.id; join_type: LEFT; role: event_user
Dimension Join Link 2 (user linked to country): join_on: user.registration_country = country.id; join_type: LEFT; role: registration_country
Dimension Join Link 3 (events linked to country, new): join_on: events.country_id = country.id; join_type: LEFT; role: event_country

In this case, the country dimension is linked to both user (with the role registration_country) and events (with the role event_country). Once both links are in place, requesting the country.name dimension for the events node is ambiguous unless a role is specified.

DJ will distinguish between the two dimension roles with the following syntax:

  • country.name[registration_country]
  • country.name[event_country]

Or more generally, <dimension>[<role>]. The [role] part can be safely omitted if there is only a single role defined for that dimension.
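The role syntax can be illustrated with a small parser. This is a sketch only: the regular expression, the ROLES_FOR_EVENTS table, and the resolve function are hypothetical and are not how DJ parses dimension references internally.

```python
import re

# Matches "<dimension>" optionally followed by "[<role>]".
_DIMENSION_REF = re.compile(r"^(?P<dimension>[\w.]+)(?:\[(?P<role>\w+)\])?$")

# Roles under which each dimension is reachable from events, per the example.
ROLES_FOR_EVENTS = {
    "country": ["registration_country", "event_country"],
    "user": ["event_user"],
}


def resolve(ref: str) -> tuple[str, str]:
    """Return (dimension attribute, role); raise if the role is missing and ambiguous."""
    match = _DIMENSION_REF.match(ref)
    if match is None:
        raise ValueError(f"Invalid dimension reference: {ref!r}")
    attribute, role = match["dimension"], match["role"]
    dimension_node = attribute.rsplit(".", 1)[0]
    roles = ROLES_FOR_EVENTS.get(dimension_node)
    if not roles:
        raise ValueError(f"Unknown dimension: {dimension_node!r}")
    if role is None:
        # The role may be omitted only when exactly one role is defined.
        if len(roles) != 1:
            raise ValueError(f"{attribute} is ambiguous; choose one of {roles}")
        role = roles[0]
    elif role not in roles:
        raise ValueError(f"Unknown role {role!r} for {dimension_node!r}")
    return attribute, role


print(resolve("country.name[event_country]"))
print(resolve("user.name"))
```

In this sketch, resolve("country.name") raises a ValueError because country has two roles, while resolve("user.name") succeeds because user has only one.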

You can configure a dimension alias/reference between a particular column on a table/view-like node (source, transform, dimension) and a column on a dimension node. Here is an example of an alias/reference link:

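[Diagram: alias/reference link between events.country_name and country.name]
events: user_id (long), country_name (string), event_secs (long), event_ts (long)
country: id (int, PK), name (str), population (long)
Dimension Alias: column: events.country_name; dimension_column: country.name; role: event_country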

In this case, configuring a reference between events.country_name and country.name indicates that the events.country_name column carries the same semantic meaning as the default.country dimension’s name field.

If someone requests events with the dimension default.country.name (or the metric total_event_duration with the same dimension), DJ will know not to perform a join to the country dimension, since the name attribute is already available directly on the original node.
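The decision described above can be sketched as a simple lookup. REFERENCE_LINKS and plan_dimension are hypothetical names for illustration; the sketch assumes that any attribute without a reference link is reachable through a join link, which DJ would need to verify.

```python
# Alias/reference links from the example:
# node -> {dimension attribute: column already present on the node}.
REFERENCE_LINKS = {
    "events": {"default.country.name": "events.country_name"},
}


def plan_dimension(node: str, dimension_attribute: str) -> dict:
    """Decide whether a requested dimension attribute needs a join."""
    local_column = REFERENCE_LINKS.get(node, {}).get(dimension_attribute)
    if local_column is not None:
        # The attribute is already available on the node; select it directly.
        return {"select": local_column, "join_required": False}
    # Otherwise fall back to a join link (assumed here to exist).
    return {"select": dimension_attribute, "join_required": True}


print(plan_dimension("events", "default.country.name"))
print(plan_dimension("events", "default.country.population"))
```

Only default.country.name is served from the events node itself; a request for default.country.population still needs the join, since no column on events references it.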