Forum

One Stop SUPER Shop...
 
Notifications
Clear all

One Stop SUPER Shop or Separate tables

9 Posts
3 Users
0 Reactions
105 Views
(@mareepiksters-com)
Posts: 19
Eminent Member
Topic starter
 

Hello,

I have completed the PowerQuery Course and am up to section 6 of the PowerPivot course...

Currently, I'm finding I am really confused with what data I should bring in from my data source.

Originally my contracted Database administrator had built me tables to join and connect all of the data I might need (before I had done the course). For example an Order table that I needed to link to the customer table to get the name, link to the product table to get the products etc... I could end up with needing to complete 5 - 10 joins to get all of the information I needed... through linking the id numbers - which for a simple Excel girl was daunting and took time

However, now I have requested that he create a table as a one stop super shop... With all of these smaller tables joined already that I can filter in PowerQuery and report on. I see that I would have 3-4 super tables inventory, orders, payments etc. that I could call on filter and delete columns that aren't needed.

HOWEVER - the Database administrator (who has very limited excel, powerpivot or power query knowledge), has advised that I will slow everything down my pulling it all in at once and the functionality of powerpivot is its ability to join the keys and reduce the number of records I'm bringing in... which is true.

I guess what I'm asking is - functionality wise am I better to append and join separate tables from my database OR get a big table (one stop super shop) made that has mostly everything I could possibly require and extract what I need for reporting?

If smaller tables where should I join? The data keys mean nothing to me - should I connect to the table in PowerQuery OR PowerPivot?

Sorry if my language is not correct or my question is dumb... As I said I'm a non technical, self taught, excel girl - trying to get the data right so I can play with it.

Thanks

Maree

 
Posted : 19/05/2020 8:33 pm
(@mynda)
Posts: 4761
Member Admin
 

Hi Maree,

In session 6.08 of the Power Pivot course I explain the ideal structure for your data in Power Pivot. Your DB administrator is right, you're better to have a star schema layout (see 6.08), than one giant table.

Your DB admin might need to create bigger dimension tables, so you can have a star schema structure rather than a snowflake schema, but you should have a fact table and then multiple dimension tables.

The joins, called 'relationships' in Power Pivot, are created between the fact table and the dimension tables based on the fields they have in column, as shown in the course.

I hope that answers your questions, but I'm happy to help if you require further clarification on anything.

Mynda

 
Posted : 19/05/2020 8:48 pm
(@mareepiksters-com)
Posts: 19
Eminent Member
Topic starter
 

Thank you Mynda, I will go and watch section 6 of the PowerPivot course.

 
Posted : 19/05/2020 9:49 pm
(@mareepiksters-com)
Posts: 19
Eminent Member
Topic starter
 

Data-Model.JPGHi Mynda,

I am still finding myself a little confused about my data model.

The course is saying Star is better than Snowflake... However, my data administrator says why? Is it for performance only? What does the snowflake do to the performance?

In my attached data model example should I be "appending" (or is it merging?) the customer GroupView and AreaView to the CustomerView AND "appending" the Inventory Group View" with the Inventory View. So that my data becomes a "star"?

Secondly when I am slightly confused between what I should be doing in "Power Pivot" versus what I do in "Power Query". Particularly around the dimension tables such as "date range". Is there a way you can add the date range once OR do I need to apply it to each query (in the example to the DateDimensions and the SP_BI_InventoryTransactionView).

I guess I am trying to understand how to connect all the tables so that only the data that is required is transferred to the data model. Is there something that I am missing here? For example, I don't need all of the customers in from the customer view in the model only the ones that have a transaction on the transaction view, don't I? 

Sorry about all of the questions - I can see how much this will benefit us and our reporting. I'm just having trouble getting my head around how all of the pieces fit together. I am up to section 11 of the PowerPivot course (and I haven't started Power BI yet... I'm sure that is a whole other beast).

Thank you and kind regards

Maree

 
Posted : 02/06/2020 11:21 pm
(@catalinb)
Posts: 1937
Member Admin
 

Hi Maree,

"In my attached data model example should I be "appending" (or is it merging?) the customer GroupView and AreaView to the CustomerView AND "appending" the Inventory Group View" with the Inventory View. So that my data becomes a "star"?"

In CustomerView table, in power pivot data model, you should add calculated columns, to bring data from GroupView and AreaView.

The formula is simple:

=RELATED(SP_BI_CustomerGroupView[CustomerGroupDescription])

Now, you have CustomerGroupDescription in CustomerView table.

In pivot tables, use the field CustomerGroupDescription from CustomerView table, not from CustomerGroupView table.

For dates, you should have a single date table, related to Transactions (The date table should be generated based on Min and Max dates from Transactions)

I don't see a DateRange dimension in your model, not sure what you mean.
You intend to implement a date range grouping? What date ranges you need to add?

 
Posted : 03/06/2020 12:54 am
(@mareepiksters-com)
Posts: 19
Eminent Member
Topic starter
 

Sorry, Catalin, I'm not a very technical person... so apologies are for my possibly silly questions.

So you are saying bring the data in on the separate tables and then in the "CustomerView" add the related columns from Customer Group and Area with the "RELATED" dax formula. Rather than joining the data in Power Query before bringing the data into the model?

There is no date range dimension - I am asking how I apply a filter within "Power Query" to the two tables before I bring the data into the Model. (This maybe should have been asked on the Power Query forum.).

Thank you

Maree

 
Posted : 03/06/2020 1:05 am
(@catalinb)
Posts: 1937
Member Admin
 

No worries Maree, we're here to help.

Yes, instead of merging tables in power query, add related formula in power pivot to bring data from those tables. It is faster than merging tables in power query.

If they are small tables, you can do a merge in power query, the performance drop will not be significant. But if they are large tables, use PP related columns, the model efficiency will increase.

In power query, just filter the date column for the date range you need to keep. If DateDimension table is built using the method I presented above, no need to filter, it will adjust based on transactions.

 
Posted : 03/06/2020 3:40 am
(@mareepiksters-com)
Posts: 19
Eminent Member
Topic starter
 

Thank you Catalin,

Can you define small? my area table is 221 rows (customer group is only 19) but the table I'm joining to "Customer Views" is 17807 rows long.

I don't understand what you mean by generation by min and max. I brought my date table in from my database at the same time as my transactions... from the SQL tables connected to our ERP.... is there a way to connect the filters I apply to my transaction table to the date table OR have I done this wrong? Is there a particular section I should watch again to understand?

Thank you

Maree

 
Posted : 03/06/2020 7:09 pm
(@catalinb)
Posts: 1937
Member Admin
 

17000 rows should not be a problem, should load fast enough.

Looks like we have different perspectives, that's the reason for misunderstandings. I thought you are creating a date table in power query, there is no clue in your description about where the dates come from.

Instead of importing a date table from database, you can build one in power query, based on the imported transactions table, that is already filtered to the date range you need.

Here is a sample query to start a date table based on another table:

let

Min = Number.From(List.Min(CombinedData[invoice date])),
Max = Number.From(List.Max(CombinedData[invoice date])),
Source = Table.FromList({Min..Max}, Splitter.SplitByNothing(), null, null, ExtraValues.Error)

in

Source

Paste this code in a blank query. (use advanced editor to replace existing code) . You will have to replace the red areas with your transactions table date column. Close the advanced editor, you should see a table with date values, format the column as date, then add all the necessary details (add new columns based on date column): year, month, quarter, day, week.

 
Posted : 03/06/2020 11:01 pm
Share: