Active Member
April 29, 2024
Hi,
I have data from a Building Energy Modeling software that comes pre-formatted, and I'm trying to get it into a better format for analysis and presentation. Results are always provided in a similar format...hourly annual results (8760 rows) for each variable I'm selecting to export. The problem is these aren't the same from project to project, i.e. some buildings are going to have Cooling while some are Heating Only, or a building might have Electric Heat and/or Natural Gas Heat and these are going to change the top row of my data. So when I try to change the source data I guess errors saying it can't find the same set of columns.
Variable names/column headers have something a best like:
"Interior Lighting (kBtu/h)" which I'm breaking into "Interior Lighting" for end-use, and "kBtu/h" for unit of measurement.
To at worst something like:
"Space Heating - Electricity: Meter 1 (kBtu/h)" which I'm breaking into "Space Heating" for end-use, "Electricity" for energy source, "Meter 1" for sub-meter name, and "kBtu/h" for unit of measurement.
I'm then filling in any null values in energy source or sub-meter name with Electricity and Meter 1 since those are generally the default if not presented.
Attached is a file with Queries that get the data into a form that I think works for what I want, but if I try taking the results from another file, or even a sub-set of results from the same file, I just get errors about missing columns.
Any help with this or suggestions to make this more repeatable across similar pre-formatted data would be great.
Moderators
January 31, 2022
Looked at the file and if I understand correctly, your Raw Data file will have different column headers (split over 3 rows, one of which is empty) depending on the project the data relates to.
Create a sheet with all possible header rows. Connect that that one just like you connect to the Raw Data and create a table with one header row from it. Then, after you have created the single header row in the raw data query, append it to the 'headers only' table.
Now you write a query that does all imaginable transformations on all imaginable columns. That should do the trick. If can upload a file that contains raw data with other headers and other applied steps, I could give it a try.
Active Member
April 29, 2024
I added a "Column Headers" sheet with 1,496 possible heading combinations. This should cover all the defaults, if I'm adding custom types that's a problem for another day.
There is still the issue that the file name given to the simulation run is listed as a secondary header and will be different from every project.
Moderators
January 31, 2022
Oh my, that's not what I expected. I changed my mind and split the process into three steps.
1) Obtain the APS File name from the raw data. It seems to be always on the 3rd row of Column3. Though, I wonder why you need this as you don't seem to use it in the PT.
2) Extract the relevant data elements from the column headers in row 1 (End Use, Energy Source, Sub-meter, Uom) . The trick is to unpivot the data as it comes with the PQ assigned column names.
3) Then, back to the raw data. Remove the first three rows, fill down the dates and unpivot. Add a column with the APS file name, and merge this table with the ColumnHeaders query from step 2. Expand and do some clean-up.
The PT I created (Jan- Feb only) matched the outcome in your full year PT. See if this concept works for you.
Needed to remove your PT from the Data Model as the file size became too big for uploading it here.
Active Member
April 29, 2024
Thanks, this seems to work on a few test trials of new info! I want to keep the APS File name cause for most building simulation scenarios you're going to be comparing multiple runs...a proposed design vs design alternatives, or proposed design vs code minimum baseline building. Since the APS File name is only being pulled from a single cell reference, I won't be able to pull multiple APS results at once into a single PT cause the next APS name would appear in the column after the end of the first APS's data. But this gives me something to think on and work with for the time being and at least I can now visualize a single simulation run's data.
Thanks very much!
Moderators
January 31, 2022
OK, but then you can remove the query and steps where I pick-up the APS name and add a column with that name on all rows in the DataTable query. In the ColumnHeader query, keep the first three rows, remove blank rows (i.e. row 2), transpose, merge both columns (with an 'odd' character), transpose back. Once unpivoted add one more step to extract the APS name.
Finally, in the DataTable query, after merging with the header information, expand the APS name as well.
See attached.
1 Guest(s)