Forum

Notifications
Clear all

Lesson 2.01

6 Posts
2 Users
0 Reactions
169 Views
(@dieneba)
Posts: 50
Trusted Member
Topic starter
 

Hi Mynda

In lesson 2.01 of Power BI I am a bit lost. Why don't we have country in the file OlympicsData. Are we assuming that these are 3 different files (country, OlympicsData, Teamcodes). I am lost. In a real case in an organisation would that mean I will make a connection between 3 separate tables?
Kindly assist
Regards

Dieneba

 
Posted : 07/06/2023 2:57 pm
(@mynda)
Posts: 4761
Member Admin
 

Hi Dieneba,

Ideally your data should be modelled in a star scehma design ( https://learn.microsoft.com/en-us/power-bi/guidance/star-schema).

This means creating dimension tables for attributes that are not required on every row of your fact table. The Country column in a fact table would have many duplicate records that could be more efficiently referenced in a dimension table. The measure/visual can then look up the city field in the dim table and find the country if/when required.

So yes, it is recommended that you would create a separate dimension table for the country if you don't already have your data in that structure.

Mynda

 
Posted : 07/06/2023 8:31 pm
(@dieneba)
Posts: 50
Trusted Member
Topic starter
 

Thanks Mynda

I will read the article and continue. I suspect things will be clearer later in the training.
Regards
Dieneba

 
Posted : 08/06/2023 5:40 am
(@dieneba)
Posts: 50
Trusted Member
Topic starter
 

Hi Mynda,

I continued through the course and I need to clarify my understanding.  I will use the attached file based on the exercise you gave us pbi_1.03_website_traffic_odb.

1- a fact table is the initial database we use.  In the table attached the fact table would be the raw data

2- The fact keys could be for example each of the columns from the fact table from Visitor Type to Time on site (foreign keys).  This will depend on what we would define as our KPI.

3- The dimension keys are primary keys and could be (country and city) or (visitor and time on site).  The choice of dimensions will depend on the analysis we want to make. At least one dimension will be included in the fact table. In my example country is in the fact table but not the cities.

Referring to the second dimension ("data sum visitor-time"), do  I need to sum the returning and new data to have aggregate data?  

4- Is it relevant to use visitor type and time on site as both criteria are in the fact table? or is it more relevant to have visitor type linked to another variable, not in the fact table?  For example Bounces and Cities?

I hope to have it right

Regards

Dieneba

 
Posted : 28/06/2023 6:41 am
(@mynda)
Posts: 4761
Member Admin
 

Hi Dieneba,

1. Correct. The fact table contains the transactional data.

2. Time on site is a transaction type of data, not a dimension. Dimensions are the fields you're likely to group/summarise the data by. e.g. by country, visitor type, source, date.

3. This data isn't really suited to splitting out into fact and dimension tables because there aren't any columns in the fact table that are related. e.g. If the fact table contained columns for city and country, then you could create a dimension table containing the city and country. Then delete the country column from the fact table and leave city. You'd then use the city columns in the fact and dimension tables to create the relationships. 

4. You're trying to find dimensions in this data that aren't there, and I think this is confusing you. Better to use the session 2 Olympic data or session 4 data as this actually has facts and dimensions, which I think will help your understanding.

I hope that points you in the right direction.

Mynda

 
Posted : 28/06/2023 8:20 pm
(@dieneba)
Posts: 50
Trusted Member
Topic starter
 

Thanks Mynda

 
Posted : 01/07/2023 3:22 am
Share: