Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chaining M folder functions #7

Closed
MortalCatalyst opened this issue Apr 6, 2016 · 10 comments
Closed

Chaining M folder functions #7

MortalCatalyst opened this issue Apr 6, 2016 · 10 comments

Comments

@MortalCatalyst
Copy link

Hi

I want to be able to chain 4 functions together that will be used to import xml files from a folder. It is the same file each query on a table in the file.

Which function in the repository is best for chaining the execution of M functions.

This is my thread where i was referred here as s last possible chance to get it working https://social.technet.microsoft.com/Forums/en-US/1d31402b-b05a-4091-8ce9-930481c34f93/combine-4-xml-queries-on-same-source-to-be-a-global-folder-query?forum=powerquery

@KiaraGrouwstra
Copy link
Owner

I'm not actually sure you'd need any of my functions here.

To parameterize your queries so you could use them on multiple files, make them into functions. i.e. from:
let Source = Xml.Tables(File.Contents("C:\Users\Sayth\OneDrive\Racing\20160305RAND0.xml")),
to:
(FileName) => let Source = Xml.Tables(File.Contents(FileName)),

This yields a reusable function that you can then reuse with:
functionName("C:\Users\Sayth\OneDrive\Racing\20160305RAND0.xml")

Ehren in that thread was correct; for the purpose of getting the info into tables for the Data Model, one query would be expected to output one table.
This isn't to say one query cannot do all of this processing -- it can. Once you have reusable functions for each of them, you could have one function do all four:

(FileName) => [FirstResult=functionName(FileName), ...]

The problem is though, since PowerPivot will want separate tables, if you're doing this you'd again need separate queries to separate these results back. Which begs the question, why even bother in the first place.

Which brings me back to your aversion to multiple queries, as Ehren did. You hadn't responded to her anymore -- I presume the refreshing isn't an issue?

Feel free to clarify further.

@MortalCatalyst
Copy link
Author

Refreshing is not an issue.
I was of the understanding that Ehren was unable to assist and had suggested use of these libraries as a chance to resolve.

Thank you for taking the time to respond, very much appreciated.

I am open to suggestion and not averse to any methodology as relatively new to powerquery although not to Excel.

In my assumption and reading your post above I think you have solved the problem largely. So if I have a folder of 50 xml files and I process the queries as

(FileName) => [FirstResult=functionName(FileName), ...]

I would end up with 4 tables all filled with the respective data from the 50 files and as long as the relationship in my data model are correct then it will work?

@KiaraGrouwstra
Copy link
Owner

[FirstResult=functionName(FileName), ...] is a record in M. So that's a function returning a record of 4 results (such as tables) for a single filename.
However, PowerPivot takes tables for input (one per query), not records.

One approach: have that function, iterate over it for all files in a new query, so as to obtain a List of Records (or, a Table), then have four more queries pick out those respective parts out (merging results per file into one and the same table) for separate consumption by PowerPivot.

Another approach: for each of your four result types, make a query iterating over your files and extracting one piece of info, optionally splitting out the extraction logic into a resuable function.

Ultimately, approach #1 may not justify the complication it adds, which involves (if you must):

  • splitting your list of records or table into it's four parts (Table.Column() for the table; using List.Transform() to map over the list and get the right part from the record using Record.Field() or just my_rec[my_field])
  • merging your list of result tables for all files for one question into one using Table.Combine().

If you're interested, I'd recommend using the language reference and formula reference for looking things up, but feel free to follow up with further questions.

@MortalCatalyst
Copy link
Author

Thank you for the references. Not sure which route i will take, however sure to learn either way.

@MortalCatalyst
Copy link
Author

Can I ask for help, Ehren and i cannot solve the question https://social.technet.microsoft.com/Forums/en-US/1d31402b-b05a-4091-8ce9-930481c34f93/combine-4-xml-queries-on-same-source-to-be-a-global-folder-query?forum=powerquery

I chained the functions as you discussed above however I am unable to run the query on a folder.

let ExcelFile = (FilePath, FileName) => let Source = Folder.Files(FilePath), Query1 = (FileName) => [FirstResult=Meeting(FileName), SecondResult=Race(FileName), ThirdResult=Reference(FileName), FourthResult=Horses(FileName)] in ExcelFile

Or

let xmlImport = (FilePath, FileName) => let Source = Folder.Files(FilePath), Query1 = (FileName) => [FirstResult=Meeting(FileName), SecondResult=Race(FileName), ThirdResult=Reference(FileName), FourthResult=Horses(FileName)], content = Binary.Combine(Source[Query1]), xmlImport = Xml.Tables(content) in xmlImport

Seems as though I thought what I was doing was a good simple starting project hasn't worked that way.

To clarify as there are multiple tables in my XML they cannot be resolved in one query so I create a query for each one. Now I wanted to chain them together so that they would run at once on a folder of XML files and return the results to a sheet for each query.

@KiaraGrouwstra
Copy link
Owner

As mentioned above Ehren is correct the data model wants functions to output one table.
This seems largely about the basics of M syntax; you may do well to check that language reference to get more of a grasp of what's happening. Specifically:

  • let ... in ... is an M construct that allows you to store some expression results into variables before returning a final result. Your queries will not parse because the statements are currently unbalanced -- there are two lets yet only one in.
  • In the first example you're defining two functions, ExcelFile and Query1. You're returning the first function as a query result (which is okay); Source and Query1 remain unused. In this case I think I think you're making a mistake by declaring FileName as a parameter in the outer function, since it is currently only used within your inner function, which already declares it itself. If the intention of this function is iterating over the files in Source so as to invoke the inner function on them, you may want to look at the Table.Column() and List.Transform() functions mentioned above.
  • The second approach makes similar mistakes in Source[Query1]. After you get the query to parse, you can zoom in on the sub-results Source and Query1; you'll see the former is a table, the latter is a function. Whatever the intention was there, that definitely doesn't seem right; the foo[bar] notation is most commonly used to obtain the value at key bar from a record foo.

@MortalCatalyst
Copy link
Author

I have been to the reference online, can i ask you straight though i thought this task simple in scope is it beyond M and PowerQuery current capability?

Seems if i did this i would be the first based on internet searches, the msin similar example being xsv which is much easier.

@MortalCatalyst
Copy link
Author

I have no desire for 4 just seems most logical way to handle related data tables

@KiaraGrouwstra
Copy link
Owner

It's not beyond it; use Table.Column() and List.Transform(). But you're taking the hard approach; the easy one is just to iterate over your folder from separate queries, not combining the functions. If you'd be comfortable with the language that approach is fine (I've used it), but until then, it's just adding unnecessary complexity.

@MortalCatalyst
Copy link
Author

Thanks

KiaraGrouwstra pushed a commit that referenced this issue Jul 25, 2017
Pull from Ivan's additions
IvanBond added a commit that referenced this issue Sep 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants