Active Member
August 5, 2022
Hi,
My .txt file is made up of 3 records of 3 journal articles, each with a different title, and a different subset of authors. Each record has a title of the paper, its authors and a journal name. The data organised in the below manner. How would I go about importing this type of file into Power Query, and organise the data into exactly 4 columns: Record Number, Title, Authors and Journal?
Many thanks for your help!
SR - Start of Record
TI - Title01
AU - Author01
AU - Author02
AU - Author03
AU - Author04
JO - Journal01
ER - End of Record
SR - Start of Record
TI - Title02
AU - Author04
AU - Author02
AU - Author03
JO - Journal02
ER - End of Record
SR - Start of Record
TI - Title03
AU - Author01
AU - Author04
JO - Journal01
ER - End of Record
Trusted Members
February 13, 2021
Moderators
January 31, 2022
Active Member
August 5, 2022
Thanks ever so much, Riny. That's certainly done the trick.
I thought perhaps there was a native Power Query solution dealing with records that are structured in the way of my example. Something along the lines of telling Power Query what the structure of each record is, and PQ then assigning the data according to the user-defined structure. The third step would then be to extract certain fields from each record.
Does that make sense?
Moderators
January 31, 2022
Not sure I understand what you mean, but your txt file contains just text. PQ doesn't know these are records and you need to define the logic of it yourself. Thus, the telling part is what you do in the applied steps. Creating a logic that can be repeated over and over again to construct a table with records.
New Member
April 26, 2023
Please test the below steps to convert it to the desired record
let
Source = Csv.Document(File.Contents("C:\Users
11472235\OneDrive - Queensland University of Technology\Desktop\Author.txt"),null,{0, 3, 5},ExtraValues.Ignore,1252),
Added_Index = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
Indexing = Table.TransformColumns(Added_Index, {"Index", (mn)=> List.Count(List.Select(Table.SelectRows(Added_Index,(ot)=> ot[Index]<mn+1)[Column1],each _="SR "))}),
Grouped_Rows = Table.Group(Indexing, {"Index", "Column1"}, {{"Group", each _[Column3]}}),
Custom1 = Table.TransformColumns(Grouped_Rows,{"Group",each if List.Count(_)=1 then List.First(_) else _}),
#"Grouped Rows" = Table.Group(Custom1, {"Index"}, {{"Grouping", each Table.ToRecords(Table.PromoteHeaders(Table.Transpose(Table.RemoveColumns(_,"Index"))))}}),
#"Expanded Grouping" = Table.ExpandListColumn(#"Grouped Rows", "Grouping"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Grouping",{"Index"})
in
#"Removed Columns"
1 Guest(s)