Materialization

Cube Nodes

When we attach a materialization config to a cube node (instructions here), we are requesting DJ to prepare for the materialization of the cube’s underlying data into an OLAP database (such as Druid). This enables low-latency metric queries across all defined dimensions in the cube.

However, many such databases are only configured to work with simple aggregations, so DJ will break down each metric expression into its constituent simple aggregation measures prior to materialization. These measures are ingested into the OLAP database as separate columns and they’re combined back together into the original metrics when users request metric data.

A few examples include:

SELECT
  AVG(price)
    AS avg_repair_price
FROM repair_order_details

price_count

count

count(price)

price_sum

sum

sum(price)
SELECT
  SUM(price)
    AS total_repair_price
FROM repair_order_details

price_sum

sum

sum(price)
SELECT
  CAST(
    SUM(
      IF(discount > 0.0,
          1, 0)
    ) AS DOUBLE
  ) / COUNT(*)
    AS discounted_orders_rate
FROM repair_order_details

discount_sum

sum

sum(
  if(
    discount > 0.0,
    1, 0
  )
)

count_star

count

count(*)
SELECT
  SUM(price1) +
    SUM(price2)
  AS total_cost
FROM costs

price1_sum

sum

sum(price1)

price2_sum

sum

sum(price2)