<iframe src="//www.googletagmanager.com/ns.html?id=GTM-TT4L49" height="0" width="0" style="display:none;visibility:hidden">
Jethro - We Make Real-Time Business Intelligence Work on Hadoop
logo_jethro_header_teal

Blog

Create Join Index

By Boaz Raufman on November 01, 2017

Share

Creates a new join index for a given dimension column and a fact table.  The dimension join key must be a PRIMARY KEY for the dimension table.

CREATE JOIN INDEX {join-index-name}
 ON {fact-table-name}({dim-table-name}.{dim-column-name})
 FROM {join-clause}
 
{join-clause} :
 {fact-table_name} JOIN {dim-table-name} on {fact-join-key-column}
= {dim-join-key-column}

DESCRIPTION
Creates a new join index for a given dimension column and a fact table.  The dimension join key must be a PRIMARY KEY for the dimension table.

Join index is an index on a one table, based on the values of another table column and on a specific join criteria. Typically it is an index on a large fact table based on the values of a dimension attribute. Join index accelerates queries by eliminating both the fetch of the join key from the fact table and the join implementation (hash join or IN – merging indexes).

Join indexes are relevant we have relatively large dimension (few K values or more) and the attribute (the column in the dimension) is low cardinality so that each value in the attribute represent large number of join key values.

Typically a join index is define if the average ratio between unique dimension attribute and the related join keys value is 1000 or more, but if fact table is large (more than few billion) it is recommended to define join index for attribute with smaller number of related join keys per value.

EXAMPLE

CREATE JOIN INDEX store_sales_by_item_color_idx
 ON store_sales(item.item_color)
 FROM store_sales JOIN item ON ss_item_sk = i_item_sk;