Over a year ago I wrote a blog post in this specific topic. I highly recommend that you read it (here) before you read this one.
The reason why I’m writing kind of a follow up on this it’s because I recently had a really heated conversation with a very close friend of mine who I can only call as an expert on Power Query.
It was a pretty heated conversation because this person was on the “dark side”, as he didn’t know of what the Data Source Settings button actually did and was using dark [M]agic to make things work.
This was pretty crazy to me! I use this button / feature almost on a daily basis and I can’t stop recommending it to people who use Power BI / Power Query, so why would this person choose any other method over this?
Let me show you WHY this is such a big deal and why you need to start using the gear icons more frequently.
The situation with changing Folder Paths for the Combine Files Experience
I recently wrote a whole 4 part series on how you can combine files and in several of those blog posts I explained how you can use the Combine Files experience to append / consolidate / combine the data within your files into just one big table.
[custom_posts template=”one_half_gallery” effect=”none” hide=”date,author,comments_link,summary,categories,tags” entry_ids=”17711,17777″]
Now, let’s say that you created a connection to a folder located in:
“D:\Sales”
but now you want to change the folder path to be:
“D:\Sales\Americas”
so you can only combine the data of the files within the Americas subfolder.
How do you do that with Power BI / Power Query? In Power BI / Power Query, this is categorized as changing a Data Source.
What we’re trying to change is a Data Source (a folder path in this case) which has an authentication method attached to it. For a local folder path, this authentication is completely implicit so you won’t get the “Authentication” window, but if you were connecting to a SharePoint / OneDrive Folder then you’d get that authentication window.
Let’s look at the options that we have in order to change this data source.
Option 1: Using the Source Step gear icon (The Dark Side)
When you create a query to a new data source, you’ll notice that the first step will be called “Source” and it’ll have a gear icon right next to it.
When you click that gear icon, you’ll get the same prompt window that you got when you initially tried to connect to your data source which in our case it’ll be the “Folder” window to browse to a new folder path:
What’s the problem with this approach? That’s a good question! it seems like a fine approach, until you realize that you’ll need to do this for every single query that has that same data source which is just tedious, time consuming and sometimes we forget where all of our data sources are being defined (at a step level).
For example, with the Combine Files experience, the folder path gets referenced in 2 places:
- The original query:
- The sample file query:
The situation here is that in some cases the consequent steps of some of those queries also rely on the names of the “Data Source” so it can get complicated when you want to change your data source as all of this will need to be done manually.
There are some workarounds to make this work with a step level configuration, but they fall on what I call a MacGyver approach.
Option 2: Using the Data Source Settings (The Right Side)
This is the option that I usually recommend and here’s the logic behind it.
I’m a pretty visual type of guy, and that’s the main reason why you see so many diagrams and illustrations created by myself on this blog, so let’s take a visual approach here.
If you go to Power BI / Power Query, get to the “View” menu and select the “Query Dependencies” you’ll get a window like the following:
- Go to the View tab
- Click on the Query Dependencies button
- Check the visual representation of all of your queries
As you can see, you get a visual representation of all of your queries and there’s only 1 data source which is denoted with a folder icon (D:\Sales).
See, this is like a master record for that specific data source so you don’t need to go through each query that references that data source, but instead you go to a different button. What button? you might ask
Here’s the full process that you need to take in order to change your data source:
- Go to Home tab
- Select the Data Source Settings button
- A new window for the Data Source settings should appear
- Select your data source from the list. In my case it was the “D:\Sales” folder
- Click the “Change Source…” button at the very bottom left side of the window
- A new window will appear where you can change your data source for ALL of your queries
and that’s it! even if you had dozens of queries that reference that same data source, you wouldn’t have to go through each and every one of them to make the changes with this option
Conclusions
While both approaches are very much valid, when it comes to changing a data source I highly recommend that you go through the Option #2 with the Data Source Settings button.
The other gear icons, at the step level, do have their own place and that’s what I mainly talk in my previous blog post about the Step configuration. Be sure to check out that blog post here.
Wouldn’t a referencing a single parameter that holds the connection string/file path etc and simply changing that as required be even fewer steps?
If your connection was against a database then you’d need 2 parameters (a server and the database name) not to mention that they would not be in the same window, so it wouldn’t provide the best experience.
There’s also the scenario of singletons (connectors that only require authentication) and the best way to manage a data source is by going through this data source setting.
Adding more parameters to something that is already a parameters just creates unnecessary overhead and complexity (even visual complexity at the query dependencies view).
Don’t forget that the parameters experience is quite different from Excel, Power BI and the rest of the implementations, while this “data source settings” is pretty standard among all implementations of Power Query.
I bought your recipes and they seem to suggest creating a file list query and then have the dependent queries reference the file list query. Do you no longer recommend the file list approach and instead favor creating the links directly and then using the Data Source Setting button when you need to change source?
Both are valid, but it depends on your scenario. Usually I go for just the data source button, but maybe you want to have a parameter for the data source in the Power BI Service that you can repoint to other folder or file path for example, there the parameter would be valuable to you.
The quick and easy way will always be the data source setting. What do you think?