Forum

Linking multiple fa...
 
Notifications
Clear all

Linking multiple fact tables via a dimension table

3 Posts
2 Users
0 Reactions
120 Views
(@laurenh)
Posts: 2
New Member
Topic starter
 

Hi Mynda (and everyone),

First of all, thank you for these fantastic courses – my 2 colleagues and I have completed the Power Query and Power Pivot courses, which have really changed the way we work in Excel.

I am hoping for a bit of advice on structuring our data in Power Pivot. I know that we should link data via dimension tables rather than linking fact tables directly to each other, but also that dimension tables should be made up of the variables/factors used for columns/rows in a pivot table. Unfortunately this doesn’t fit the data we are working with. Our data is made up of 7 fact tables from a relational database, which are all brought into Power Pivot via Power Query. The data is about Police use of force events; the different tables contain data about the overall event/location/time, person details, staff details, tactic details, injuries, weapons etc. Each row of data has a report identification number and person identification number, which are the values for the basis of the relationships between the tables (there can be multiple people involved within the same event, hence the two different ID numbers).

My question is whether I should set up a dimension table with the identification numbers to link all the data? It is easy enough to do this, but I am unsure how this would be different from (or better than) using one of the fact tables as a primary linking hub or linking the fact tables directly? The dimension table wouldn’t be static, and would really be just a reduced version of one of the fact tables containing all the report and person IDs, set up via a duplicate query and updating as more events occur. What would be the benefit of setting up the dimension table like this or the problems with not doing it this way? It would not make sense to use the identification numbers as the factor/variable for analysis in a pivot table and in some ways this data almost seems like the opposite of that – giving the highest level of granularity rather than a grouping variable. We will have dimension tables for other factors (e.g. location, tactic type) but just unsure about the best way to proceed for the overall linking between tables.

One final question, if fact tables are related to each other via another table (rather than directly) should it be possible to use the RELATED function to draw the data through from one fact table to the other, or is that function limited to the immediate relationship?

Any advice would be greatly appreciated.

Thanks!

Lauren

 
Posted : 09/12/2021 5:32 pm
(@mynda)
Posts: 4762
Member Admin
 

Hi Lauren,

It's great to hear you and your team are making use of your new skills!

One of the main reasons we don't use fact tables as proxy dimension tables is because there is a risk that the fact table you choose to be the dimension table will not have every ID number required, and it may have duplicates of said ID. If these are never a concern for your data, then theoretically you could use a fact table as a dimension table. You need to remember to always use the row and column labels from the table you choose as your dimension table and not the other fact tables....it can get messy and confusing.

The other benefit of having dimension tables is to store the attributes of an a dimension once in a table, rather than repeating them row after row in a fact table. e.g. if an ID number appears multiple times in a fact table, then you can save space and improve model efficiency by storing the attributes in a dimension table rather than the fact table. These attributes might be things like names, addresses etc. i.e. the things you refer to as person details and staff details. I would expect you have the same staff appear multiple times in your fact tables. Moving these 'details' to the dimension table so they're only listed once for each Staff ID, is more efficient than storing them multiple times in a fact table for each row that staff ID appears. The bottom line is a lot less data in your model.

You can only use RELATED where there is a direct relationship between two tables. That said, I would only use RELATED where you need to perform a calculation at a row level of detail, otherwise it's just duplicating data. 

I hope that clarifies things, but happy to help if you have more questions.

Mynda

 
Posted : 09/12/2021 9:54 pm
(@laurenh)
Posts: 2
New Member
Topic starter
 

Thanks Mynda - that is very helpfulSmile 

 
Posted : 10/12/2021 4:46 pm
Share: