Power Query
Power Query is a data transformation and data preparation engine. Power Query comes with a graphical
interface for getting data from sources and a Power Query Editor for applying transformations. Because the
engine is available in many products and services, the destination where the data will be stored depends on
where Power Query was used. Using Power Query, you can perform the extract, transform, and load (ETL)
processing of data.
Diagram with symbolized data sources on the right, passing through Power Query for transformation, and then
going to various destinations, such as Azure Data Lake Storage, Dataverse, Microsoft Excel, or Power BI.
Existing challenge: Finding and connecting to data is too difficult.
How Power Query helps: Power Query enables connectivity to a wide range of data sources, including data of all sizes and shapes.

Existing challenge: Experiences for data connectivity are too fragmented.
How Power Query helps: Power Query provides consistency of experience, and parity of query capabilities, over all data sources.

Existing challenge: Data often needs to be reshaped before consumption.
How Power Query helps: Power Query offers a highly interactive and intuitive experience for rapidly and iteratively building queries over any data source, of any size.

Existing challenge: Any shaping is one-off and not repeatable.
How Power Query helps: When using Power Query to access and transform data, you define a repeatable process (query) that can be easily refreshed in the future to get up-to-date data. In the event that you need to modify the process or query to account for underlying data or schema changes, you can use the same interactive and intuitive experience you used when you initially defined the query.

Existing challenge: Volume (data sizes), velocity (rate of change), and variety (breadth of data sources and data shapes).
How Power Query helps: Power Query offers the ability to work against a subset of the entire dataset to define the required data transformations, allowing you to easily filter down and transform your data to a manageable size. Power Query queries can be refreshed manually or by taking advantage of scheduled refresh capabilities in specific products (such as Power BI) or even programmatically (by using the Excel object model). Because Power Query provides connectivity to hundreds of data sources and over 350 different types of data transformations for each of these sources, you can work with data from any source and in any shape.
NOTE
Although two Power Query experiences exist, they both provide almost the same user experience in every scenario.
Transformations
The transformation engine in Power Query includes many prebuilt transformation functions that can be used
through the graphical interface of the Power Query Editor. These transformations can be as simple as removing
a column or filtering rows, or as common as using the first row as a table header. There are also advanced
transformation options such as merge, append, group by, pivot, and unpivot.
All these transformations are made possible by choosing the transformation option in the menu, and then
applying the options required for that transformation. The following illustration shows a few of the
transformations available in Power Query Editor.
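As a rough sketch, a few of these transformations expressed in the M language might look like the following. The file path and column names are illustrative only, not part of the original example:

let
    Source = Csv.Document(File.Contents("C:\Data\sales.csv"), [Delimiter = ","]),
    // Use the first row as the table header
    #"Promoted headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Remove a column
    #"Removed columns" = Table.RemoveColumns(#"Promoted headers", {"Notes"}),
    // Set a column type so the filter below compares numbers rather than text
    #"Changed type" = Table.TransformColumnTypes(#"Removed columns", {{"Quantity", Int64.Type}}),
    // Filter rows
    #"Filtered rows" = Table.SelectRows(#"Changed type", each [Quantity] > 0)
in
    #"Filtered rows"

Advanced transformations such as merge, append, group by, pivot, and unpivot follow the same pattern of one function call per applied step.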
Dataflows
Power Query can be used in many products, such as Power BI and Excel. However, using Power Query within a
product limits its usage to only that specific product. Dataflows are a product-agnostic service version of the
Power Query experience that runs in the cloud. Using dataflows, you can get data and transform data in the
same way, but instead of sending the output to Power BI or Excel, you can store the output in other storage
options such as Dataverse or Azure Data Lake Storage. This way, you can use the output of dataflows in other
products and services.
More information: What are dataflows?
The Power Query experience is available in several Microsoft products; a given product may include the Power Query M engine, Power Query Desktop, Power Query Online, and dataflows in different combinations.
Power Query Desktop: The Power Query experience found in desktop applications.
Power Query Online: The Power Query experience found in web browser applications.
See also
Data sources in Power Query
Getting data
Power Query quickstart
Shape and combine data using Power Query
What are dataflows
Getting data overview
Power Query can connect to many different data sources so you can work with the data you need. This article
walks you through the steps for bringing in data to Power Query either in Power Query Desktop or Power
Query Online.
Connecting to a data source with Power Query follows a standard set of stages before landing the data at a
destination. This article describes each of these stages.
IMPORTANT
In some cases, a connector might have all stages of the get data experience, and in other cases a connector might have
just a few of them. For more information about the experience of a specific connector, go to the documentation
available for that connector, which you can find from the Connectors in Power Query article.
1. Connection settings
Most connectors initially require at least one parameter to initialize a connection to the data source. For
example, the SQL Server connector requires at least the host name to establish a connection to the SQL Server
database.
In comparison, when trying to connect to an Excel file, Power Query requires that you use the file path to find
the file you want to connect to.
The connector parameters are commonly used to establish a connection to a data source, and they—in
conjunction with the connector used—define what's called a data source path.
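For example, in M the connector function and its required parameters together make up the data source path. The following sketch is illustrative; the server and database names are hypothetical:

let
    // Kind: SQL Server. Data source path: the server name plus the database name.
    Source = Sql.Database("sqlserver.contoso.com", "SalesDb")
in
    Source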
NOTE
Some connectors don't require you to enter any parameters at all. These are called singleton connectors and will only
have one data source path available per environment. Some examples are Adobe Analytics, MailChimp, and Google
Analytics.
2. Authentication
Every single connection that's made in Power Query has to be authenticated. The authentication methods vary
from connector to connector, and some connectors might offer multiple methods of authentication.
The currently available methods of authentication for Power Query are:
Anonymous : Commonly used when connecting to a data source that doesn't require user authentication,
such as a webpage or a file available over public HTTP.
Basic : A username and password sent in base64 encoding are accepted for authentication.
API Key : A single API key is accepted for authentication.
Organizational account or Microsoft account : This method is also known as OAuth 2.0 .
Windows : Can be implicit or explicit.
Database : This is only available in some database connectors.
For example, the available authentication methods for the SQL Server database connector are Windows,
Database, and Microsoft account.
3. Data preview
The goal of the data preview stage is to provide you with a user-friendly way to preview and select your data.
Depending on the connector that you're using, you can preview data by using either:
Navigator window
Table preview dialog box
Navigator window (navigation table)
The Navigator window consists of two main sections:
The object selection pane is displayed on the left side of the window. The user can interact with and select
these objects.
NOTE
For Power Query in Excel, select the Select multiple items option from the upper-left corner of the navigation
window to select more than one object at a time in the object selection pane.
NOTE
The list of objects in Power Query Desktop is limited to 10,000 items. This limit does not exist in Power Query
Online. For a workaround in Power Query Desktop, see Object limitation workaround.
The data preview pane on the right side of the window shows a preview of the data from the object you
selected.
Object limitation workaround
There's a fixed limit of 10,000 objects in the Navigator in Power Query Desktop. This limit doesn't apply in
Power Query Online. Eventually, the Power Query Online UI will replace the desktop UI.
In the interim, you can use the following workaround:
1. Right-click on the root node of the Navigator , and then select Transform Data .
2. Power Query Editor then opens with the full navigation table in the table preview area. This view doesn't
have a limit on the number of objects, and you can use filters or any other Power Query transforms to
explore the list and find the rows you want (for example, based on the Name column).
3. Upon finding the item you want, you can get at the contents by selecting the data link (such as the Table
link in the following image).
1. Connection settings
In the connection settings section, you define the information needed to establish a connection to your data
source. Depending on your connector, that could be the name of the server, the name of a database, a folder
path, a file path, or other information required by the connector to establish a connection to your data source.
Some connectors also enable specific subsections or advanced options to give you more control and options
when connecting to your data source.
Connection credentials
The first time that you use Power Query to connect to a specific data source, you're required to create a new
connection associated with that data source. A connection is the full definition of the gateway, credentials,
privacy levels, and other connector-specific fields that make up the connection credentials required to establish a
connection to your data source.
NOTE
Some connectors offer specific fields inside the connection credentials section to enable or define any sort of security
related to the connection that needs to be established. For example, the SQL Server connector offers the Use Encrypted
Connection field.
The primary pieces of information required by all connectors to define a connection are:
Connection name: This is the name that you can define to uniquely identify your connections. Note that
you can't duplicate the name of a connection in your environment.
Data gateway: If your data source requires a data gateway, select the gateway using the dropdown list from
this field.
Authentication kind & credentials: Depending on the connector, you're presented with multiple
authentication kind options that are available to establish a connection, as well as fields where you enter your
credentials. For this example, the Windows authentication kind has been selected and you can see the
Username and Password fields that need to be filled in to establish a connection.
Privacy level: You can define the privacy level for your data source to be either None , Private ,
Organizational , or Public .
NOTE
To learn more about what data gateways are and how to register a new gateway for your environment or tenant, go to
Using on-premises data gateway.
IMPORTANT
Some Power Query integrations don't currently enable a defined connection or a privacy level. But, all Power Query Online
experiences do provide a way to define the data gateway, authentication kind, and the credentials needed to establish a
connection with your data source.
Once you've defined a connection in Power Query Online, you can reuse it later without reentering all this
information. The Connection field offers a dropdown menu where you can select your already defined
connections. After you've selected one, you don't need to enter any other details before selecting Next .
After you select a connection from this menu, you can also make changes to the credentials, privacy level, and
other connector-specific fields for your data source in your project. Select Edit connection , and then change
any of the provided fields.
2. Data preview
The goal of the data preview stage is to provide you with a user-friendly way to preview and select your data.
Depending on the connector that you're using, you can preview data by using either:
Navigator window
Table preview dialog box
Navigator window (navigation table) in Power Query Online
The Navigator window consists of two main sections:
The object selection pane is displayed on the left side of the window. The user can interact with and select
these objects.
The data preview pane on the right side of the window shows a preview of the data from the object you
selected.
Table preview dialog box in Power Query Online
The table preview dialog box consists of only one section for the data preview. An example of a connector that
provides this experience and window is the Folder connector.
3. Query editor
For Power Query Online, you're required to load the data into the Power Query editor where you can further
transform and enrich the query if you choose to do so.
Additional information
To better understand how to get data using the different product integrations of Power Query, go to Where to
get data.
Where to get data
Getting data from available data sources is usually the first encounter you have with Power Query. This article
provides basic steps for getting data from each of the Microsoft products that include Power Query.
NOTE
Each of these Power Query get data experiences contain different feature sets. More information: Where can you use
Power Query?
NOTE
Not all Excel versions support all of the same Power Query connectors. For a complete list of the Power Query connectors
supported by all versions of Excel for Windows and Excel for Mac, go to Power Query data sources in Excel versions.
You can also choose to get data directly from an Excel worksheet without using the Get data option.
3. Select the connector from the list of data sources.
To import data to an existing table in Power Apps:
1. On the left side of Power Apps, select Dataverse > Tables .
2. In the Tables pane, either:
Select a table in the Tables pane that you want to import data to, and then select Data > Get
data .
Open the table to its individual pane, and then select Data > Get data .
In either case, you can also choose to get data from an Excel worksheet without using the Get data
option.
3. Select the connector from the list of data sources.
To get data in Power Apps when creating a dataflow:
1. On the left side of Power Apps, select Dataverse > Dataflows .
2. If a dataflow already exists:
a. Double-click on the dataflow.
b. From the Power Query editor, select Get data .
c. Select the connector from the list of data sources.
3. If no dataflow exists and you want to create a new dataflow:
a. Select New dataflow .
b. In the New dataflow dialog box, enter a name for your new dataflow.
c. Select Create .
d. Select the connector from the list of data sources.
5. Select Next .
6. Select the connector from the list of data sources.
When you attempt to connect to a data source using a new connector for the first time, you might be asked to
select the authentication method to use when accessing the data. After you've selected the authentication
method, you won't be asked to select an authentication method again for that connector when you use the same
connection parameters. However, if you need to change the authentication method later, you can do so.
If you're using a connector from an online app, such as the Power BI service or Power Apps, you'll see an
authentication method dialog box for the OData Feed connector that looks something like the following image.
As you can see, a different selection of authentication methods is presented from an online app. Also, some
connectors might ask you to enter the name of an on-premises data gateway to be able to connect to your data.
The level you select for the authentication method you chose for this connector determines what part of a URL
will have the authentication method applied to it. If you select the top-level web address, the authentication
method you select for this connector will be used for that URL address or any subaddress within that address.
However, you might not want to set the top-level address to a specific authentication method because different
subaddresses can require different authentication methods. One example might be if you were accessing two
separate folders of a single SharePoint site and wanted to use different Microsoft accounts to access each one.
After you've set the authentication method for a connector's specific address, you won't need to select the
authentication method for that connector using that URL address or any subaddress again. For example, let's say
you select the https://contoso.com/ address as the level you want the Web connector URL settings to apply to.
Whenever you use a Web connector to access any webpage that begins with this address, you won't be required
to select the authentication method again.
2. In the Data source settings dialog box, select Global permissions , choose the website where you
want to change the permission setting, and then select Edit Permissions .
3. In the Edit Permissions dialog box, under Credentials , select Edit .
4. Change the credentials to the type required by the website, select Save , and then select OK .
You can also delete the credentials for a particular website in step 3 by selecting Clear Permissions for a
selected website, or by selecting Clear All Permissions for all of the listed websites.
To edit the authentication method in online services, such as for dataflows in the Power BI service
and Microsoft Power Platform
1. Select the connector, and then select Edit connection .
Connecting with Azure Active Directory using the Web and OData
connectors
When connecting to data sources and services that require authentication through OAuth or Azure Active
Directory-based authentication, in certain cases where the service is configured correctly, you can use the built-
in Web or OData connectors to authenticate and connect to data without requiring a service-specific or custom
connector.
This section outlines connection symptoms when the service isn't configured properly. It also provides
information on how Power Query interacts with the service when it's properly configured.
Symptoms when the service isn't configured properly
If you run into the error We were unable to connect because this credential type isn't supported for
this resource. Please choose another credential type , it means that your service doesn't support
the authentication type.
One example of this is the Northwind OData service.
1. Enter the Northwind endpoint in the "Get Data" experience using the OData connector.
2. Select OK to enter the authentication experience. Normally, because Northwind isn’t an authenticated
service, you would just use Anonymous . To demonstrate lack of support for Azure Active Directory,
choose Organizational account , and then select Sign in .
3. You'll encounter the error, indicating that OAuth or Azure Active Directory authentication isn't supported
in the service.
Supported workflow
One example of a supported service working properly with OAuth is CRM, for example,
https://*.crm.dynamics.com/api/data/v8.2 .
1. Enter the URL in the "Get Data" experience using the OData connector.
2. Select Organizational Account , and then select Sign-in to proceed to connect using OAuth.
3. The request succeeds and the OAuth flow continues to allow you to authenticate successfully.
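A minimal M sketch of this kind of connection, assuming a hypothetical tenant URL and that you choose Organizational account when prompted for credentials:

let
    // Selecting Organizational account triggers the OAuth flow described next
    Source = OData.Feed("https://contoso.crm.dynamics.com/api/data/v8.2")
in
    Source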
When you select Sign-in in Step 2 above, Power Query sends a request to the provided URL endpoint with an
Authorization header with an empty bearer token.
The service is then expected to respond with a 401 response with a WWW-Authenticate header indicating the
Azure AD authorization URI to use. This response should include the tenant to sign in to, or /common/ if the
resource isn't associated with a specific tenant.
HTTP/1.1 401 Unauthorized
Cache-Control: private
Content-Type: text/html
Server:
WWW-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/3df2eaf6-33d0-4a10-8ce8-7e596000ebe7/oauth2/authorize
Date: Wed, 15 Aug 2018 15:02:04 GMT
Content-Length: 49
Power Query can then initiate the OAuth flow against the authorization_uri . Power Query requests an Azure
AD Resource or Audience value equal to the domain of the URL being requested. This value would be the value
you use for your Azure Application ID URL value in your API/service registration. For example, if accessing
https://api.myservice.com/path/to/data/api , Power Query would expect your Application ID URL value to be
equal to https://api.myservice.com .
The following Azure Active Directory client IDs are used by Power Query. You might need to explicitly allow
these client IDs to access your service and API, depending on your overall Azure Active Directory settings. Go to
step 8 of Add a scope for more details.
a672d62c-fc7b-4e81-a576-e60dc46e951d (Power Query for Excel): Public client, used in Power BI Desktop and Gateway.
7ab7862c-4c57-491e-8a45-d52a7e023983 (Power Apps and Power Automate): Confidential client, used in Power Apps and Power Automate.
If you need more control over the OAuth flow (for example, if your service must respond with a 302 rather than
a 401 ), or if your application’s Application ID URL or Azure AD Resource value don't match the URL of your
service, then you’d need to use a custom connector. For more information about using our built-in Azure AD
flow, go to Azure Active Directory authentication.
Upload a file (Preview)
You can upload files to your Power Query project when using Power Query Online.
The following connectors currently support the upload a file feature.
Excel
JSON
PDF
Text / CSV
XML
NOTE
Only files with the following extensions are supported for upload: .csv, .json, .pdf, .prn, .tsv, .txt, .xl, .xls, .xlsb, .xlsm, .xlsw,
.xlsx, .xml.
When you select Upload file , a simple dialog opens that lets you either drag a file or browse your local file
system to upload a file.
After you've selected your file, a progress bar shows you how the upload process is going. Once the upload
process is finished, you'll be able to see a green check mark underneath your file name, with the message
Upload successful and the file size right next to it.
NOTE
The files that are uploaded through this feature are stored in your personal Microsoft OneDrive for Business account.
Before you select the next button, you need to change the authentication kind from Anonymous to
Organizational account and go through the authentication process. Start this process by selecting Sign in .
After going through the authentication process, a You are currently signed in message underneath the
Authentication Kind selection lets you know that you've successfully signed in. After you've signed in, select
Next . The file is then stored in your personal Microsoft OneDrive for Business account, and a new query is
created from the file that you've uploaded.
Power Query offers a series of ways to gain access to files that are hosted on either SharePoint or OneDrive for
Business.
Browse files
NOTE
Currently, you can only browse for OneDrive for Business files of the authenticated user inside of Power Query Online for
PowerApps.
Power Query provides a Browse OneDrive button next to the File path or URL text box when you create a
dataflow in PowerApps using any of these connectors:
Excel
JSON
PDF
XML
TXT/CSV
When you select this button, you'll be prompted to go through the authentication process. After completing this
process, a new window appears with all the files inside the OneDrive for Business of the authenticated user.
You can select the file of your choice, and then select the Open button. After selecting Open , you'll be taken
back to the initial connection settings page where you'll see that the File path or URL text box now holds the
exact URL to the file you've selected from OneDrive for Business.
You can select the Next button at the bottom-right corner of the window to continue the process and get your
data.
NOTE
Your browser interface might not look exactly like the following image. There are many ways to select Open in
Excel for files in your OneDrive for Business browser interface. You can use any option that allows you to open
the file in Excel.
2. In Excel, select File > Info , and then select the Copy path button.
To use the link you just copied in Power Query, take the following steps:
1. Select Get Data > Web .
2. In the From Web dialog box, select the Basic option and paste the link in URL .
3. Remove the ?web=1 string at the end of the link so that Power Query can properly navigate to your file,
and then select OK .
4. If Power Query prompts you for credentials, choose either Windows (for on-premises SharePoint sites)
or Organizational Account (for Microsoft 365 or OneDrive for Business sites). Then select Connect .
Caution
When working with files hosted on OneDrive for Home, the file that you want to connect to needs to be
publicly available. When setting the authentication method for this connection, select the Anonymous
option.
When the Navigator dialog box appears, you can select from the list of tables, sheets, and ranges found in the
Excel workbook. From there, you can use the OneDrive for Business file just like any other Excel file. You can
create reports and use it in datasets like you would with any other data source.
NOTE
To use a OneDrive for Business file as a data source in the Power BI service, with Service Refresh enabled for that file,
make sure you select OAuth2 as the Authentication method when configuring your refresh settings. Otherwise, you
may encounter an error (such as, Failed to update data source credentials) when you attempt to connect or to refresh.
Selecting OAuth2 as the authentication method remedies that credentials error.
The table has a column named Content that contains your file in a binary format. The values in the Content
column have a different color than the rest of the values in the other columns of the table, which indicates that
they're selectable.
By selecting a Binary value in the Content column, Power Query automatically adds a series of steps to your
query to navigate to the file and interpret its contents where possible.
For example, from the table shown in the previous image, you can select the second row where the Name field
has a value of 02-February.csv . Power Query will automatically create a series of steps to navigate and
interpret the contents of the file as a CSV file.
NOTE
You can interact with the table by applying filters, sorting, and other transforms before navigating to the file of your
choice. Once you've finished these transforms, select the Binary value you want to view.
You don't need the full URL, but only the first few parts. The URL you need to use in Power Query will have the
following format:
https://<unique_tenant_name>.sharepoint.com/personal/<user_identifier>
For example:
https://contoso-my.sharepoint.com/personal/user123_contoso_com
SharePoint.Contents function
While the SharePoint folder connector offers you an experience where you can see all the files available in your
SharePoint or OneDrive for Business site at once, you can also opt for a different experience. In this experience,
you can navigate through your SharePoint or OneDrive for Business folders and reach the folder or file(s) that
you're interested in.
This experience is provided through the SharePoint.Contents function. Take the following steps to use this
function:
1. Create a Blank Query.
2. Change the code in the formula bar to be SharePoint.Contents("url") where url is the same format
used for the SharePoint folder connector. For example:
SharePoint.Contents("https://contoso.sharepoint.com/marketing/data")
NOTE
By default, this function tries to use SharePoint API Version 14 to connect. If you aren't certain of the API version
being used by your SharePoint site, you might want to try using the following example code:
SharePoint.Contents("https://contoso.sharepoint.com/marketing/data", [ApiVersion="Auto"]) .
3. Power Query will request that you add an authentication method for your connection. Use the same
authentication method that you'd use for the SharePoint files connector.
4. Navigate through the different documents to the specific folder or file(s) that you're interested in.
For example, imagine a SharePoint site with a Shared Documents folder. You can select the Table value in
the Content column for that folder and navigate directly to that folder.
Inside this Shared Documents folder there's a folder where the company stores all the sales reports. This
folder is named Sales Reports. You can select the Table value on the Content column for that row.
With all the files inside the Sales Reports folder, you could select the Combine files button (see Combine
files overview) to combine the data from all the files in this folder to a single table. Or you could navigate
directly to a single file of your choice by selecting the Binary value from the Content column.
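A minimal M sketch of this navigation, assuming the same hypothetical site URL and the Shared Documents and Sales Reports folders described above:

let
    Source = SharePoint.Contents("https://contoso.sharepoint.com/marketing/data", [ApiVersion = "Auto"]),
    // Navigate into the Shared Documents folder
    #"Shared Documents" = Source{[Name = "Shared Documents"]}[Content],
    // Navigate into the Sales Reports folder inside it
    #"Sales Reports" = #"Shared Documents"{[Name = "Sales Reports"]}[Content]
in
    #"Sales Reports"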
NOTE
The experience provided by the SharePoint.Contents function is optimal for SharePoint and OneDrive for Business
environments with a large number of files.
Lack of Support for Microsoft Graph in Power Query
Connecting to Microsoft Graph REST APIs from Power Query isn't recommended or supported. Instead, we
recommend users explore alternative solutions for retrieving analytics data based on Graph, such as Microsoft
Graph data connect.
You might find you can make certain REST calls to Microsoft Graph API endpoints work through the
Web.Contents or OData.Feed functions, but these approaches aren't reliable as long-term solutions.
This article outlines the issues associated with Microsoft Graph connectivity from Power Query and explains
why it isn't recommended.
Authentication
The built-in Organizational Account authentication flow for Power Query’s Web.Contents and OData.Feed
functions isn't compatible with most Graph endpoints. Specifically, Power Query’s Azure Active Directory (Azure
AD) client requests the user_impersonation scope, which isn't compatible with Graph’s security model. Graph
uses a rich set of permissions that aren't available through our generic Web and OData connectors.
Implementing your own Azure AD credential retrieval flows directly from your query, or using hardcoded or
embedded credentials, also isn't recommended for security reasons.
Performance
The Microsoft Graph API is designed to support many application scenarios, but is suboptimal for the large-
scale data retrieval required for most analytics scenarios. If you try to retrieve large amounts of data from Graph
APIs, you might encounter performance issues. Details around scenario applicability can be found in the Graph
documentation.
While Power BI Desktop offers out-of-box connectivity to over 150 data sources, there might be cases where you
want to connect to a data source for which no out-of-box connector is available.
When creating a new dataflow project in Power Query Online, you can select the on-premises data gateway
used for your specific data sources during the get data experience. This article showcases how you can modify
or assign a gateway to an existing dataflow project.
NOTE
Before being able to change a gateway, make sure that you have the needed gateways already registered under your
tenant and with access for the authors of the dataflow project. You can learn more about data gateways from Using an
on-premises data gateway in Power Platform dataflows.
4. After selecting the correct gateway for the project, in this case Gateway B, select OK to go back to the
Power Query editor.
NOTE
The M engine identifies a data source using a combination of its kind and path.
The kind defines what connector or data source function is being used, such as SQL Server, folder, Excel workbook, or
others.
The path value is derived from the required parameters of your data source function and, for this example, that would be
the folder path.
The best way to validate the data source path is to go into the query where your data source function is being
used and check the parameters being used for it. For this example, there's only one query that connects to a
folder and this query has the Source step with the data source path defined in it. You can double-click the
Source step to get the dialog that indicates the parameters used for your data source function. Make sure that
the folder path, or the correct parameters for your data source function, is the correct one in relation to the
gateway being used.
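For example, a query built with the Folder connector might have a Source step like the one in this sketch, where the folder path (hypothetical here) is the data source path that must match what the gateway can reach:

let
    // The folder path below is the data source path for the Folder connector
    Source = Folder.Files("C:\Data\SalesReports")
in
    Source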
Modify authentication
To modify the credentials used against your data source, select Get data in the Power Query editor ribbon to
launch the Choose data source dialog box, then define a new or existing connection to your data source. For
the purpose of this example, the connector that's used is the Folder connector.
Once in Connection settings , create a new connection or select or modify a different connection for your data
source.
After defining the connection details, select Next at the bottom-right corner and validate that your query is
loading in the Power Query editor.
NOTE
This process is the same as if you were to connect again to your data source. But by doing the process again, you're
effectively re-defining what authentication method and credentials to use against your data source.
The Power Query user interface
With Power Query, you can connect to many different data sources and transform the data into the shape you
want.
In this article, you'll learn how to create queries with Power Query by discovering:
How the "Get Data" experience works in Power Query.
How to use and take advantage of the Power Query user interface.
How to perform common transformations like grouping and merging data.
If you're new to Power Query, you can sign up for a free trial of Power BI before you begin. You can use Power BI
dataflows to try out the Power Query Online experiences described in this article.
You can also download Power BI Desktop for free.
Examples in this article connect to and use the Northwind OData feed.
https://services.odata.org/V4/Northwind/Northwind.svc/
To start, locate the OData feed connector from the "Get Data" experience. You can select the Other category
from the top, or search for OData in the search bar in the top-right corner.
Once you select this connector, the screen displays the connection settings and credentials.
For URL , enter the URL to the Northwind OData feed shown in the previous section.
For On-premises data gateway , leave as none.
For Authentication kind , leave as anonymous.
Select the Next button.
The Navigator now opens, where you select the tables you want to connect to from the data source. Select the
Customers table to load a preview of the data, and then select Transform data .
The dialog then loads the data from the Customers table into the Power Query editor.
The above experience of connecting to your data, specifying the authentication method, and selecting the
specific object or table to connect to is called the get data experience and is documented with further detail in
the Getting data article.
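As a rough sketch, the get data experience above produces M similar to the following; the exact step names generated by the editor may differ:

let
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/"),
    // Navigation step created when you select the Customers table in the Navigator
    Customers = Source{[Name = "Customers", Signature = "table"]}[Data]
in
    Customers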
NOTE
To learn more about the OData feed connector, go to OData feed.
1. Ribbon : the ribbon navigation experience, which provides multiple tabs to add transforms, select options for
your query, and access different ribbon buttons to complete various tasks.
2. Queries pane : a view of all your available queries.
3. Current view : your main working view, which by default displays a preview of the data for your query. You
can also enable the diagram view along with the data preview view. You can also switch between the schema
view and the data preview view while maintaining the diagram view.
4. Query settings : a view of the currently selected query with relevant information, such as query name,
query steps, and various indicators.
5. Status bar : a bar displaying relevant information about your query, such as execution time, total
columns and rows, and processing status. This bar also contains buttons to change your current view.
NOTE
The schema and diagram view are currently only available in Power Query Online.
The Power Query interface is responsive and tries to adjust to your screen resolution to show you the best
experience. In scenarios where you'd like to use a compact version of the ribbon, there's also a collapse button at
the bottom-right corner of the ribbon to help you switch to the compact ribbon.
You can switch back to the standard ribbon view by selecting the expand icon at the bottom-right
corner of the ribbon.
Expand and collapse panes
You'll notice that throughout the Power Query user interface there are icons that help you collapse or expand
certain views or sections. For example, there's an icon on the top right-hand corner of the Queries pane that
collapses the queries pane when selected, and expands the pane when selected again.
The right side of the status bar also contains icons for the diagram, data, and schema views. You can use these
icons to change between views. You can also use these icons to enable or disable the view of your choice.
NOTE
To learn more about schema view, go to Using Schema view.
For example, in schema view, select the check mark next to the Orders and CustomerDemographics
columns, and from the ribbon select the Remove columns action. This selection applies a transformation to
remove these columns from your data.
What is diagram view
You can now switch back to the data preview view and enable diagram view to use a more visual perspective of
your data and query.
The diagram view helps you visualize how your query is structured and how it might interact with other queries
in your project. Each step in your query has a distinct icon to help you recognize the transform that was used.
There are also lines that connect steps to illustrate dependencies. Since both data preview view and diagram
view are enabled, the diagram view displays on top of the data preview.
NOTE
To learn more about diagram view, go to Diagram view.
The Group by dialog then appears. You can set the Group by operation to group by the country and count the
number of customer rows per country.
1. Keep the Basic radio button selected.
2. Select Country to group by.
3. Select Customers and Count rows as the column name and operation respectively.
Select OK to perform the operation. Your data preview refreshes to show the total number of customers by
country.
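In M, this Group by operation corresponds roughly to a Table.Group step, as in the following sketch (the navigation step is repeated here only to keep the example self-contained):

let
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/"),
    Customers = Source{[Name = "Customers", Signature = "table"]}[Data],
    // Group by Country and count the customer rows in each group
    #"Grouped rows" = Table.Group(Customers, {"Country"}, {{"Customers", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped rows"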
An alternative way to launch the Group by dialog would be to use the Group by button in the ribbon or by
right-clicking the Country column.
For convenience, transforms in Power Query can often be accessed from multiple places, so users can opt to use
the experience they prefer.
Select Create to add the new query to the Power Query editor. The queries pane should now display both the
Customers and the Suppliers query.
Open the Group by dialog again, this time by selecting the Group by button on the ribbon under the
Transform tab.
In the Group by dialog, set the Group by operation to group by the country and count the number of supplier
rows per country.
1. Keep the Basic radio button selected.
2. Select Country to group by.
3. Select Suppliers and Count rows as the column name and operation respectively.
NOTE
To learn more about the Group by transform, go to Grouping or summarizing rows.
Referencing queries
Now that you have a query for customers and a query for suppliers, your next goal is to combine these queries
into one. There are many ways to accomplish this, including using the Merge option in the Customers table,
duplicating a query, or referencing a query. For this example, you'll create a reference by right-clicking the
Customers table and selecting Reference , which effectively creates a new query that references the
Customers query.
After creating this new query, change its name to Country Analysis , and disable the load of the
Customers and Suppliers queries by clearing their Enable load options.
Merging queries
A merge queries operation joins two existing tables together based on matching values from one or multiple
columns. In this example, the goal is to join both the Customers and Suppliers tables into one table only for
the countries that have both Customers and Suppliers .
Inside the Country Analysis query, select the Merge queries option from the Home tab in the ribbon.
A new dialog for the Merge operation appears. You can then select the query to merge with your current query.
Select the Suppliers query and select the Country field from both queries. Finally, select the Inner join kind, as
you only want the countries where you have Customers and Suppliers for this analysis.
After selecting the OK button, a new column is added to your Country Analysis query that contains the data
from the Suppliers query. Select the icon next to the Suppliers field, which displays a menu where you can
select which fields you want to expand. Select only the Suppliers field, and then select the OK button.
The result of this expand operation is a table with only 12 rows. Rename the Suppliers.Suppliers field to just
Suppliers by double-clicking the field name and entering the new name.
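A rough M sketch of this merge and expand, assuming Customers and Suppliers refer to the grouped queries created earlier, each with a Country column and a count column:

let
    // Reference the grouped Customers query (the starting point of Country Analysis)
    Source = Customers,
    // Inner join with the grouped Suppliers query on the Country column
    #"Merge queries" = Table.NestedJoin(Source, {"Country"}, Suppliers, {"Country"}, "Suppliers", JoinKind.Inner),
    // Expand only the Suppliers count from the nested table
    #"Expanded Suppliers" = Table.ExpandTableColumn(#"Merge queries", "Suppliers", {"Suppliers"}, {"Suppliers.Suppliers"})
in
    #"Expanded Suppliers"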
NOTE
To learn more about the Merge queries feature, go to Merge queries overview.
Applied steps
Every transformation that is applied to your query is saved as a step in the Applied steps section of the query
settings pane. If you ever need to check how your query is transformed from step to step, you can select a step
and preview how your query resolves at that specific point.
You can also right-click a step and select the Properties option to change the name of the step or add a
description for it. For example, right-click the Merge queries step from the Country Analysis query
and change the name of the step to Merge with Suppliers and the description to Getting data from
the Suppliers query for Suppliers by Country .
This change adds a new icon next to your step that you can hover over to read its description.
NOTE
To learn more about Applied steps , go to Using the Applied Steps list.
Before moving on to the next section, disable the Diagram view to only use the Data preview .
This change creates a new column called Integer-division that you can rename to Ratio . This change is the
final step of your query, and provides the customer-to-supplier ratio for the countries where the data has
customers and suppliers.
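The step that produces this column isn't shown here, but a minimal, self-contained sketch of an integer-division custom column looks like the following; the sample row and step names are made up for illustration:

let
    // Stand-in for the merged table produced by the earlier steps
    #"Merge with Suppliers" = Table.FromRecords({[Country = "UK", Customers = 7, Suppliers = 2]}),
    // Integer-divide customers by suppliers for each country; rename the column to Ratio afterward
    #"Inserted integer-division" = Table.AddColumn(#"Merge with Suppliers", "Integer-division", each Number.IntegerDivide([Customers], [Suppliers]), Int64.Type)
in
    #"Inserted integer-division"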
Data profiling
Another Power Query feature that can help you better understand your data is Data Profiling . By enabling the
data profiling features, you'll get feedback about the data inside your query fields, such as value distribution,
column quality, and more.
We recommend that you use this feature throughout the development of your queries, but you can always
enable and disable the feature at your convenience. The following image shows all the data profiling tools
enabled for your Country Analysis query.
NOTE
To learn more about Data profiling , go to Using the data profiling tools.
NOTE
Currently, Azure Analysis Services doesn't contain any inline Power Query help links. However, you can get help for Power
Query M functions. More information is contained in the next section.
1. With the Power Query editor open, select the insert step button.
2. In the formula bar, enter the name of a function you want to check.
a. If you are using Power Query Desktop, enter an equal sign, a space, and the name of a function.
b. If you are using Power Query Online, enter the name of a function.
3. Select the properties of the function.
a. If you are using Power Query Desktop, in the Query Settings pane, under Properties , select All
properties .
b. If you are using Power Query Online, in the Query Settings pane, select Properties .
These steps will open the inline help information for your selected function, and let you enter individual
properties used by the function.
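For example, in Power Query Desktop you might enter the following in the formula bar to bring up inline help for a function (Table.FirstN is just an example; in Power Query Online you'd omit the equal sign):

= Table.FirstN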
Summary
In this article, you created a series of queries with Power Query that provides a customer-to-supplier ratio
analysis at the country level for the Northwind corporation.
You learned about the components of the Power Query user interface, and how to create new queries inside the query
editor, reference queries, merge queries, work with the applied steps section, add new columns, and use
the data profiling tools to better understand your data.
Power Query is a powerful tool used to connect to many different data sources and transform the data into the
shape you want. The scenarios outlined in this article are examples to show you how you can use Power Query
to transform raw data into important actionable business insights.
Using the Applied Steps list
Any transformations to your data will show in the Applied Steps list. For instance, if you change the first
column name, it will display in the Applied Steps list as Renamed Columns .
Selecting any step will show you the results of that particular step, so you can see exactly how your data
changes as you add steps to the query.
The Query Settings menu will open to the right with the Applied Steps list.
Rename step
To rename a step, right-click the step and select Rename .
Enter the name you want, and then either press Enter or click away from the step.
Delete step
To delete a step, right-click the step and select Delete .
Alternatively, select the x next to the step.
To insert a new intermediate step, right-click on a step and select Insert step after . Then select Insert on the
new window.
To set a transformation for the new step, select the new step in the list and make the change to the data. It will
automatically link the transformation to the selected step.
Move step
To move a step up one position in the list, right-click the step and select Move up .
To move a step down one position in the list, right-click the step and select Move down .
Alternatively, or to move more than a single position, drag and drop the step to the desired location.
Extract the previous steps into query
You can also separate a series of transformations into a different query. This allows the query to be referenced
for other sources, which can be helpful if you're trying to apply the same transformation to multiple datasets. To
extract all the previous steps into a new query, right-click the first step you do not want to include in the query
and select Extract Previous .
Name the new query and select OK . To access the new query, navigate to the Queries pane on the left side of
the screen.
The global search box offers you the ability to search for:
Queries found in your project.
Actions available in your version of Power Query that are commonly found in the ribbon.
Get data connectors that can also be found through the Get Data dialog box.
The global search box is located at the top center of the Power Query editor. The search box follows the same
design principles that you find in Microsoft Search in Office, but contextualized to Power Query.
Search results
To make use of the global search box, select the search box or press Alt + Q. Before you enter anything, you'll be
presented with some default options to choose from.
When you start entering something to search for, the results will be updated in real time, displaying queries,
actions, and get data connectors that match the text that you've entered.
For scenarios where you'd like to see all available options for a given search query, you can also select the See
more results for option. This option is positioned as the last result of the search box query when there are
multiple matches to your query.
Overview of query evaluation and query folding in Power Query
This article provides a basic overview of how M queries are processed and turned into data source requests.
TIP
You can think of the M script as a recipe that describes how to prepare your data.
The most common way to create an M script is by using the Power Query editor. For example, when you connect
to a data source, such as a SQL Server database, you'll notice on the right-hand side of your screen that there's a
section called applied steps. This section displays all the steps or transforms used in your query. In this sense,
the Power Query editor serves as an interface to help you create the appropriate M script for the transforms that
you're after, and ensures that the code you use is valid.
NOTE
The M script is used in the Power Query editor to:
Display the query as a series of steps and allow the creation or modification of new steps.
Display a diagram view.
The previous image emphasizes the applied steps section, which contains the following steps:
Source : Makes the connection to the data source. In this case, it's a connection to a SQL Server database.
Navigation : Navigates to a specific table in the database.
Removed other columns : Selects which columns from the table to keep.
Sor ted rows : Sorts the table using one or more columns.
Kept top rows : Filters the table to only keep a certain number of rows from the top of the table.
This set of step names is a friendly way to view the M script that Power Query has created for you. There are
several ways to view the full M script. In Power Query, you can select Advanced Editor in the View tab. You can
also select Advanced Editor from the Query group in the Home tab. In some versions of Power Query, you
can also change the view of the formula bar to show the query script by going into the View tab and, from the
Layout group, selecting Script view > Query script .
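As a hedged sketch, the M script behind steps like these might look as follows; the server, database, table, and column names are illustrative:

let
    // Source: connect to the data source, in this case a SQL Server database
    Source = Sql.Database("sqlserver.contoso.com", "SalesDb"),
    // Navigation: go to a specific table in the database
    Navigation = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // Removed other columns: keep only the columns you need
    #"Removed other columns" = Table.SelectColumns(Navigation, {"SaleKey", "Quantity"}),
    // Sorted rows: sort the table using one or more columns
    #"Sorted rows" = Table.Sort(#"Removed other columns", {{"SaleKey", Order.Descending}}),
    // Kept top rows: keep a certain number of rows from the top of the table
    #"Kept top rows" = Table.FirstN(#"Sorted rows", 10)
in
    #"Kept top rows"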
Most of the names found in the Applied steps pane are also used as is in the M script. Steps of a query
are named using something called identifiers in the M language. Sometimes extra characters are wrapped
around step names in M, but these characters aren’t shown in the applied steps. An example is
#"Kept top rows" , which is categorized as a quoted identifier because of these extra characters. A quoted
identifier can be used to allow any sequence of zero or more Unicode characters to be used as an identifier,
including keywords, whitespace, comments, operators, and punctuators. To learn more about identifiers in the M
language, go to lexical structure.
Any changes that you make to your query through the Power Query editor will automatically update the M
script for your query. For example, using the previous image as the starting point, if you change the Kept top
rows step name to be Top 20 rows , this change will automatically be updated in the script view.
While we recommend that you use the Power Query editor to create all or most of the M script for you, you can
manually add or modify pieces of your M script. To learn more about the M language, go to the official docs site
for the M language.
NOTE
M script, also referred to as M code, is a term used for any code that uses the M language. In the context of this article,
M script also refers to the code found inside a Power Query query and accessible through the advanced editor window or
through the script view in the formula bar.
NOTE
While this example showcases a query with a SQL Database as a data source, the concept applies to queries with or
without a data source.
When Power Query reads your M script, it runs the script through an optimization process to more efficiently
evaluate your query. In this process, it determines which steps (transforms) from your query can be offloaded to
your data source. It also determines which other steps need to be evaluated using the Power Query engine. This
optimization process is called query folding, where Power Query tries to push as much of the execution as
possible to the data source.
IMPORTANT
All rules from the Power Query M formula language (also known as the M language) are followed. Most notably, lazy
evaluation plays an important role during the optimization process. In this process, Power Query understands what
specific transforms from your query need to be evaluated. Power Query also understands what other transforms don't
need to be evaluated because they're not needed in the output of your query.
Furthermore, when multiple sources are involved, the data privacy level of each data source is taken into consideration
when evaluating the query. More information: Behind the scenes of the Data Privacy Firewall
The following diagram demonstrates the steps that take place in this optimization process.
1. The M script, found inside the advanced editor, is submitted to the Power Query engine. Other important
information is also supplied, such as credentials and data source privacy levels.
2. The Query folding mechanism submits metadata requests to the data source to determine the capabilities of
the data source, table schemas, relationships between different entities at the data source, and more.
3. Based on the metadata received, the query folding mechanism determines what information to extract from
the data source and what set of transformations need to happen inside the Power Query engine. It sends the
instructions to two other components that take care of retrieving the data from the data source and
transforming the incoming data in the Power Query engine if necessary.
4. Once the instructions have been received by the internal components of Power Query, Power Query sends a
request to the data source using a data source query.
5. The data source receives the request from Power Query and transfers the data to the Power Query engine.
6. Once the data is inside Power Query, the transformation engine inside Power Query (also known as mashup
engine) does the transformations that couldn't be folded back or offloaded to the data source.
7. The results derived from the previous point are loaded to a destination.
NOTE
Depending on the transformations and data source used in the M script, Power Query determines if it will stream or
buffer the incoming data.
IMPORTANT
All data source functions, commonly shown as the Source step of a query, query the data at the data source in its
native language. The query folding mechanism is applied to all transforms added to your query after the data source
function, so that they can be translated and combined into a single data source query, or so that as many transforms as
possible can be offloaded to the data source.
Depending on how the query is structured, there could be three possible outcomes to the query folding
mechanism:
Full query folding : When all of your query transformations get pushed back to the data source and
minimal processing occurs at the Power Query engine.
Partial query folding : When only some of the transformations in your query, not all, can be pushed back to
the data source. In this case, only a subset of your transformations is done at your data source and the rest of
your query transformations occur in the Power Query engine.
No query folding : When the query contains transformations that can't be translated to the native query
language of your data source, either because the transformations aren't supported or the connector doesn't
support query folding. In this case, Power Query gets the raw data from your data source and uses the
Power Query engine to achieve the output you want by processing the required transforms at the Power
Query engine level.
NOTE
The query folding mechanism is primarily available in connectors for structured data sources such as, but not limited to,
Microsoft SQL Server and OData Feed. During the optimization phase, the engine might sometimes reorder steps in the
query.
Leveraging a data source that has more processing resources and has query folding capabilities can expedite your query
loading times as the processing occurs at the data source and not at the Power Query engine.
Next steps
For detailed examples of the three possible outcomes of the query folding mechanism, go to Query folding
examples.
Query folding examples
This article provides some example scenarios for each of the three possible outcomes for query folding. It also
includes some suggestions on how to get the most out of the query folding mechanism, and the effect that it
can have in your queries.
The scenario
Imagine a scenario where, using the Wide World Importers database for Azure Synapse Analytics SQL database,
you're tasked with creating a query in Power Query that connects to the fact_Sale table and retrieves the last
10 sales with only the following fields:
Sale Key
Customer Key
Invoice Date Key
Description
Quantity
NOTE
For demonstration purposes, this article uses the database outlined in the tutorial on loading the Wide World Importers
database into Azure Synapse Analytics. The main difference in this article is that the fact_Sale table only holds data for
the year 2000, with a total of 3,644,356 rows.
While the results might not exactly match the results that you get by following the tutorial from the Azure Synapse
Analytics documentation, the goal of this article is to showcase the core concepts and the impact that query folding can
have on your queries.
This article showcases three ways to achieve the same output with different levels of query folding:
No query folding
Partial query folding
Full query folding
No query folding example
After connecting to your database and navigating to the fact_Sale table, you select the Keep bottom rows
transform found inside the Reduce rows group of the Home tab.
After selecting this transform, a new dialog appears. In this new dialog, you can enter the number of rows that
you'd like to keep. For this case, enter the value 10, and then select OK .
TIP
For this case, performing this operation yields the result of the last ten sales. In most scenarios, we recommend that you
provide a more explicit logic that defines which rows are considered last by applying a sort operation on the table.
Next, select the Choose columns transform found inside the Manage columns group of the Home tab. You
can then select the columns you want to keep from your table and remove the rest.
Lastly, inside the Choose columns dialog, select the Sale Key , Customer Key , Invoice Date Key , Description ,
and Quantity columns, and then select OK .
The following code sample is the full M script for the query you created:
let
    Source = Sql.Database(ServerName, DatabaseName),
    Navigation = Source{[Schema = "wwi", Item = "fact_Sale"]}[Data],
    #"Kept bottom rows" = Table.LastN(Navigation, 10),
    #"Choose columns" = Table.SelectColumns(#"Kept bottom rows", {"Sale Key", "Customer Key", "Invoice Date Key", "Description", "Quantity"})
in
    #"Choose columns"
You can right-click the last step of your query, the one named Choose columns, and select the option that
reads View Query plan. The goal of the query plan is to provide you with a detailed view of how your query is
run. To learn more about this feature, go to Query plan.
Each box in the previous image is called a node. A node represents one operation in the breakdown needed to fulfill this query.
Nodes that represent data sources, such as SQL Server in the example above, and the Value.NativeQuery node
indicate which part of the query is offloaded to the data source. The rest of the nodes, in this case Table.LastN
and Table.SelectColumns highlighted in the rectangle in the previous image, are evaluated by the Power Query
engine. These two nodes represent the two transforms that you added, Kept bottom rows and Choose
columns. The remaining nodes represent operations that happen at your data source level.
To see the exact request that is sent to your data source, select View details in the Value.NativeQuery node.
This data source request is in the native language of your data source. For this case, that language is SQL and
this statement represents a request for all the rows and fields from the fact_Sale table.
Consulting this data source request can help you better understand the story that the query plan tries to convey:
Sql.Database : This node represents the data source access. Connects to the database and sends metadata
requests to understand its capabilities.
Value.NativeQuery : Represents the request that was generated by Power Query to fulfill the query. Power
Query submits the data requests in a native SQL statement to the data source. In this case, that represents all
records and fields (columns) from the fact_Sale table. For this scenario, this case is undesirable, as the table
contains millions of rows and the interest is only in the last 10.
Table.LastN : Once Power Query receives all records from the fact_Sale table, it uses the Power Query
engine to filter the table and keep only the last 10 rows.
Table.SelectColumns : Power Query will use the output of the Table.LastN node and apply a new transform
called Table.SelectColumns , which selects the specific columns that you want to keep from a table.
For its evaluation, this query had to download all rows and fields from the fact_Sale table. This query took an
average of 6 minutes and 1 second to be processed in a standard instance of Power BI dataflows (which
accounts for the evaluation and loading of data to dataflows).
Partial query folding example
Inside the Choose columns dialog, select the Sale Key, Customer Key, Invoice Date Key, Description, and
Quantity columns, and then select OK.
You now create logic that will sort the table to have the last sales at the bottom of the table. Select the Sale Key
column, which is the primary key and incremental sequence or index of the table. Sort the table using only this
field in ascending order from the context menu for the column.
Next, select the table contextual menu and choose the Keep bottom rows transform.
In Keep bottom rows , enter the value 10, and then select OK .
The following code sample is the full M script for the query you created:
let
    Source = Sql.Database(ServerName, DatabaseName),
    Navigation = Source{[Schema = "wwi", Item = "fact_Sale"]}[Data],
    #"Choose columns" = Table.SelectColumns(Navigation, {"Sale Key", "Customer Key", "Invoice Date Key", "Description", "Quantity"}),
    #"Sorted rows" = Table.Sort(#"Choose columns", {{"Sale Key", Order.Ascending}}),
    #"Kept bottom rows" = Table.LastN(#"Sorted rows", 10)
in
    #"Kept bottom rows"
Each box in the previous image is called a node. A node represents every process that needs to happen (from
left to right) in order for your query to be evaluated. Some of these nodes can be evaluated at your data source
while others, like the node for Table.LastN , represented by the Kept bottom rows step, are evaluated using
the Power Query engine.
To see the exact request that is sent to your data source, select View details in the Value.NativeQuery node.
This request is in the native language of your data source. For this case, that language is SQL and this statement
represents a request for all the rows, with only the requested fields from the fact_Sale table ordered by the
Sale Key field.
Consulting this data source request can help you better understand the story that the full query plan tries to
convey. The order of the nodes is a sequential process that starts by requesting the data from your data source:
Sql.Database : Connects to the database and sends metadata requests to understand its capabilities.
Value.NativeQuery : Represents the request that was generated by Power Query to fulfill the query. Power
Query submits the data requests in a native SQL statement to the data source. In this case, that represents
all records, with only the requested fields, from the fact_Sale table in the database, sorted in ascending
order by the Sale Key field.
Table.LastN : Once Power Query receives all records from the fact_Sale table, it uses the Power Query
engine to filter the table and keep only the last 10 rows.
For its evaluation, this query had to download all rows and only the required fields from the fact_Sale table. It
took an average of 3 minutes and 4 seconds to be processed in a standard instance of Power BI dataflows
(which accounts for the evaluation and loading of data to dataflows).
Full query folding example
You now create logic that sorts the table to have the last sales at the top of the table. Select the Sale Key
column, which is the primary key and incremental sequence or index of the table. Sort the table using only this
field in descending order from the context menu for the column.
Next, select the table contextual menu and choose the Keep top rows transform.
In Keep top rows , enter the value 10, and then select OK .
The following code sample is the full M script for the query you created:
let
    Source = Sql.Database(ServerName, DatabaseName),
    Navigation = Source{[Schema = "wwi", Item = "fact_Sale"]}[Data],
    #"Choose columns" = Table.SelectColumns(Navigation, {"Sale Key", "Customer Key", "Invoice Date Key", "Description", "Quantity"}),
    #"Sorted rows" = Table.Sort(#"Choose columns", {{"Sale Key", Order.Descending}}),
    #"Kept top rows" = Table.FirstN(#"Sorted rows", 10)
in
    #"Kept top rows"
You can right-click the last step of your query, the one named Kept top rows, and select the option that reads
Query plan.
This request is in the native language of your data source. In this case, that language is SQL, and the statement
represents a request for only the top 10 rows, with only the requested fields, from the fact_Sale table after sorting.
Consulting this data source query can help you better understand the story that the full query plan tries to
convey:
Sql.Database : Connects to the database and sends metadata requests to understand its capabilities.
Value.NativeQuery : Represents the request that was generated by Power Query to fulfill the query. Power
Query submits the data requests in a native SQL statement to the data source. For this case, that represents a
request for only the top 10 records of the fact_Sale table, with only the required fields after being sorted in
descending order using the Sale Key field.
NOTE
While there's no clause that can be used to SELECT the bottom rows of a table in the T-SQL language, there's a TOP clause
that retrieves the top rows of a table.
For its evaluation, this query only downloads 10 rows, with only the fields that you requested from the
fact_Sale table. This query took an average of 31 seconds to be processed in a standard instance of Power BI
dataflows (which accounts for the evaluation and loading of data to dataflows).
Performance comparison
To better understand the effect that query folding has on these queries, you can refresh your queries, record the
time it takes to fully refresh each query, and compare them. For simplicity, this article provides the average
refresh timings captured using the Power BI dataflows refresh mechanism while connecting to a dedicated Azure
Synapse Analytics environment with DW2000c as the service level.
The refresh time for each query was as follows:
Example | Label | Time in seconds
No query folding | None | 361
Partial query folding | Partial | 184
Full query folding | Full | 31
It's often the case that a query that fully folds back to the data source outperforms similar queries that don't
completely fold back to the data source. There can be many reasons for this, ranging from the complexity of the
transforms that your query performs, to the query optimizations implemented at your data source (such as
indexes and dedicated computing), and network resources. Still, query folding aims to minimize the effect that
two specific processes have on your queries in Power Query:
Data in transit
Transforms executed by the Power Query engine
The following sections explain the effect that these two processes have on the previously mentioned queries.
Data in transit
When a query gets executed, it tries to fetch the data from the data source as one of its first steps. What data is
fetched from the data source is defined by the query folding mechanism. This mechanism identifies the steps
from the query that can be offloaded to the data source.
The following table lists the number of rows requested from the fact_Sale table of the database. The table also
includes a brief description of the SQL statement sent to request such data from the data source.
Example | Label | Rows requested | Description
No query folding | None | 3,644,356 | Request for all fields and all records from the fact_Sale table
Partial query folding | Partial | 3,644,356 | Request for all records, but only the required fields, from the fact_Sale table after it was sorted by the Sale Key field
Full query folding | Full | 10 | Request for only 10 records, with only the required fields, from the fact_Sale table after it was sorted in descending order by the Sale Key field
When requesting data from a data source, the data source needs to compute the results for the request and then
send the data to the requestor. While the computing resources have already been mentioned, moving the data
from the data source to Power Query over the network, and then having Power Query effectively receive the
data and prepare it for the transforms that happen locally, can also take some time depending on the size of
the data.
For the showcased examples, Power Query had to request over 3.6 million rows from the data source for the no
query folding and partial query folding examples. For the full query folding example, it only requested 10 rows.
For the fields requested, the no query folding example requested all the available fields from the table. Both the
partial query folding and the full query folding examples only submitted a request for exactly the fields that they
needed.
Caution
We recommend that you implement incremental refresh solutions that leverage query folding for queries or
entities with large amounts of data. Different product integrations of Power Query implement timeouts to
terminate long running queries. Some data sources also implement timeouts on long running sessions, trying to
execute expensive queries against their servers. More information: Using incremental refresh with dataflows and
Incremental refresh for datasets
Transforms executed by the Power Query engine
This article showcased how you can use the Query plan to better understand how your query might be
evaluated. Inside the query plan, you can see the exact nodes of the transform operations that will be performed
by the Power Query engine.
The following table showcases the nodes from the query plans of the previous queries that would have been
evaluated by the Power Query engine.
Example | Label | Power Query engine transform nodes
No query folding | None | Table.LastN, Table.SelectColumns
Partial query folding | Partial | Table.LastN
Full query folding | Full | (none)
For the examples showcased in this article, the full query folding example doesn't require any transforms to
happen inside the Power Query engine as the required output table comes directly from the data source. In
contrast, the other two queries required some computation to happen at the Power Query engine. Because of
the amount of data that needs to be processed by these two queries, the process for these examples takes more
time than the full query folding example.
Transforms can be grouped into the following categories:
Full scan: Operators that need to gather all the rows before the data can move on to the next operator in the
chain. For example, to sort data, Power Query needs to gather all the data. Other examples of full scan
operators are Table.Group, Table.NestedJoin, and Table.Pivot.
TIP
While not every transform is the same from a performance standpoint, in most cases, having fewer transforms is usually
better.
Data profiling tools
The data profiling tools provide new and intuitive ways to clean, transform, and understand data in Power Query
Editor. They include:
Column quality
Column distribution
Column profile
To enable the data profiling tools, go to the View tab on the ribbon. Enable the options you want in the Data
preview group, as shown in the following image.
After you enable the options, you'll see something like the following image in Power Query Editor.
NOTE
By default, Power Query will perform this data profiling over the first 1,000 rows of your data. To have it operate over the
entire dataset, check the lower-left corner of your editor window to change how column profiling is performed.
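If you want similar statistics computed inside a query itself (rather than in the editor preview), the Table.Profile function is one option. The following minimal sketch uses inline sample data rather than any dataset from this article:

let
    // Inline sample data so the sketch is self-contained
    Source = #table(
        type table [Product = text, Units = number],
        {{"Shirt", 10}, {"Shirt", null}, {"Hat", 4}}
    ),
    // Returns one row per column with statistics such as Min, Max, Count, and NullCount
    Profile = Table.Profile(Source)
in
    Profile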
Column quality
The column quality feature labels values in rows in five categories:
Valid, shown in green.
Error, shown in red.
Empty, shown in dark grey.
Unknown, shown in dashed green. Indicates that when there are errors in a column, the quality of the
remaining data is unknown.
Unexpected error, shown in dashed red.
These indicators are displayed directly underneath the name of the column as part of a small bar chart, as
shown in the following image.
The number of records in each column quality category is also displayed as a percentage.
By hovering over any of the columns, you are presented with the numerical distribution of the quality of values
throughout the column. Additionally, selecting the ellipsis button (...) opens some quick action buttons for
operations on the values.
Column distribution
This feature provides a set of visuals underneath the names of the columns that showcase the frequency and
distribution of the values in each of the columns. The data in these visualizations is sorted in descending order
from the value with the highest frequency.
By hovering over the distribution data in any of the columns, you get information about the overall data in the
column (with distinct count and unique values). You can also select the ellipsis button and choose from a menu
of available operations.
Column profile
This feature provides a more in-depth look at the data in a column. Apart from the column distribution chart, it
contains a column statistics chart. This information is displayed underneath the data preview section, as shown
in the following image.
Filter by value
You can interact with the value distribution chart on the right side and select any of the bars by hovering over
the parts of the chart.
Copy data
In the upper-right corner of both the column statistics and value distribution sections, you can select the ellipsis
button (...) to display a Copy shortcut menu. Select it to copy the data displayed in either section to the
clipboard.
Group by value
When you select the ellipsis button (...) in the upper-right corner of the value distribution chart, in addition to
Copy you can select Group by . This feature groups the values in your chart by a set of available options.
The image below shows a column of product names that have been grouped by text length. After the values
have been grouped in the chart, you can interact with individual values in the chart as described in Filter by
value.
Using the Queries pane
In Power Query, you'll be creating many different queries. Whether it be from getting data from many tables or
from duplicating the original query, the number of queries will increase.
You'll be using the Queries pane to navigate through the queries.
NOTE
Some actions in the Power Query Online editor may be different than actions in the Power Query Desktop editor. These
differences will be noted in this article.
To be more comprehensive, we'll be touching on all of the context menu actions that are relevant for either.
Rename a query
To directly change the name of the query, double-click the name of the query. This action allows you to
immediately change the name.
Other options to rename the query are:
Go to the context menu and select Rename .
Go to Query Settings and enter a different name in the Name input field.
Delete a query
To delete a query, open the context menu for the query and select Delete . An additional pop-up appears
confirming the deletion. To complete the deletion, select the Delete button.
Duplicating a query
Duplicating a query will create a copy of the query you're selecting.
To duplicate your query, open the context menu for the query and select Duplicate . A new duplicate query will
pop up on the side of the query pane.
Referencing a query
Referencing a query will create a new query. The new query uses the steps of a previous query without having
to duplicate the query. Additionally, any changes on the original query will transfer down to the referenced
query.
To reference your query, open the context menu for the query and select Reference . A new referenced query will
pop up on the side of the query pane.
NOTE
To learn more about how to copy and paste queries in Power Query, see Sharing a query.
For the sake of being more comprehensive, we'll once again describe all of the context menu actions that are
relevant for either.
New query
You can import data into the Power Query editor as an option from the context menu.
This option functions the same as the Get Data feature.
NOTE
To learn about how to get data into Power Query, see Getting data
Merge queries
When you select the Merge queries option from the context menu, the Merge queries input screen opens.
This option functions the same as the Merge queries feature located on the ribbon and in other areas of the
editor.
NOTE
To learn more about how to use the Merge queries feature, see Merge queries overview.
New parameter
When you select the New parameter option from the context menu, the New parameter input screen opens.
This option functions the same as the New parameter feature located on the ribbon.
NOTE
To learn more about Parameters in Power Query, see Using parameters.
New group
You can make folders and move the queries into and out of the folders for organizational purposes. These
folders are called groups.
To move the query into a group, open the context menu on the specific query.
In the menu, select Move to group .
Then, select the group you want to put the query in.
The move will look like the following image. Using the same steps as above, you can also move the query out of
the group by selecting Queries (root) or another group.
In desktop versions of Power Query, you can also drag and drop the queries into the folders.
Diagram view
Diagram view offers a visual way to prepare data in the Power Query editor. With this interface, you can easily
create queries and visualize the data preparation process. Diagram view simplifies the experience of getting
started with data wrangling. It speeds up the data preparation process and helps you quickly understand the
dataflow, both the "big picture view" of how queries are related and the "detailed view" of the specific data
preparation steps in a query.
This article provides an overview of the capabilities provided by diagram view.
This feature is enabled by selecting Diagram view in the View tab on the ribbon. With diagram view enabled,
the steps pane and queries pane will be collapsed.
NOTE
Currently, diagram view is only available in Power Query Online.
When you search for and select a transform from the shortcut menu, the step is added to the query, as shown in
the following image.
NOTE
To learn more about how to author queries in the Query editor using the Power Query editor ribbon or data preview, go
to Power Query Quickstart.
You can perform more query level actions such as duplicate, reference, and so on, by selecting the query level
context menu (the three vertical dots). You can also right-click in the query and get to the same context menu.
Expand or collapse query
To expand or collapse a query, right-click in the query and select Expand/Collapse from the query's context
menu. You can also double-click in the query to expand or collapse a query.
Delete query
To delete a query, right-click in a query and select Delete from the context menu. There will be an additional
pop-up to confirm the deletion.
Rename query
To rename a query, right-click in a query and select Rename from the context menu.
Enable load
By default, Enable load is set to true, which ensures that the results provided by the query are available for
downstream use such as report building. If you need to disable load for a given query, right-click in the query and
select Enable load . Queries where Enable load is set to false are displayed with a grey outline.
Duplicate
To create a copy of a given query, right-click in the query and select Duplicate . A new duplicate query will
appear in the diagram view.
Reference
Referencing a query will create a new query. The new query will use the steps of the previous query without
having to duplicate the query. Additionally, any changes on the original query will transfer down to the
referenced query. To reference a query, right-click in the query and select Reference .
Move to group
You can make folders and move the queries into these folders for organizational purposes. These folders are
called groups. To move a given query to a Query group, right-click in a query and select Move to group . You
can choose to move the queries to an existing group or create a new query group.
You can view the query groups above the query box in the diagram view.
Create function
When you need to apply the same set of transformations in different queries or values, creating custom Power
Query functions can be valuable. To learn more about custom functions, go to Using custom functions. To
convert a query into a reusable function, right-click in a given query and select Create function .
Convert to parameter
A parameter provides the flexibility to dynamically change the output of your queries depending on its value,
and promotes reusability. To convert a query that holds a non-structured value, such as a date, text, or number,
into a parameter, right-click in the query and select Convert to Parameter .
NOTE
To learn more about parameters, go to Power Query parameters.
Advanced editor
With the advanced editor, you can see the code that Power Query editor is creating with each step. To view the
code for a given query, right-click in the query and select Advanced editor .
NOTE
To learn more about the code used in the advanced editor, go to Power Query M language specification.
Selecting Properties from the query's context menu opens a dialog box where you can edit the name of the query
or add or modify the query description.
Queries with a query description have an affordance (an i icon). You can view the query description by hovering
near the query name.
Append queries/Append queries as new
To append or perform a UNION of queries, right-click in a query and select Append queries . This action will
display the Append dialog box where you can add more tables to the current query. Append queries as new
will also display the Append dialog box, but will allow you to append multiple tables into a new query.
NOTE
To learn more about how to append queries in Power Query, go to Append queries.
You can also perform step level actions by hovering over the step and selecting the ellipsis (three vertical dots).
Edit settings
To edit the step level settings, right-click the step and choose Edit settings . Alternatively, you can double-click a
step that has step settings and go directly to the settings dialog box. In the settings dialog box, you can view or
change the step level settings. For example, the following image shows the settings dialog box for the Split
column step.
Rename step
To rename a step, right-click the step and select Rename . This action opens the Step properties dialog. Enter
the name you want, and then select OK .
Delete step
To delete a step, right-click the step and select Delete . To delete a series of steps until the end, right-click the step
and select Delete until end .
Move before/Move after
To move a step one position before, right-click a step and select Move before . To move a step one position after,
right-click a step and select Move after .
Extract previous
To extract all previous steps into a new query, right-click the first step that you do not want to include in the
query and then select Extract previous .
You can also get to the step level context menu by hovering over the step and selecting the ellipsis (three vertical
dots).
This action opens a dialog box where you can add a step description. The step description will come in handy
when you come back to the same query after a few days, or when you share your queries or dataflows with
other users.
By hovering over each step, you can view a call out that shows the step label, step name, and step descriptions
(that were added).
By selecting each step, you can see the corresponding data preview for that step.
You can also expand or collapse a query by selecting the query level actions from the query’s context menu.
To expand all or collapse all queries, select the Expand all/Collapse all button next to the layout options in the
diagram view pane.
You can also right-click any empty space in the diagram view pane and see a context menu to expand all or
collapse all queries.
In the collapsed mode, you can quickly look at the steps in the query by hovering over the number of steps in
the query. You can select these steps to navigate to that specific step within the query.
Layout Options
There are five layout options available in the diagram view: zoom out/zoom in, mini-map, full screen, fit to view,
and reset.
Zoom out/zoom in
With this option, you can adjust the zoom level and zoom out or zoom in to view all the queries in the diagram
view.
Mini-map
With this option, you can turn the diagram view mini-map on or off. More information: Show mini-map
Full screen
With this option, you can view all the queries and their relationships through the Full screen mode. The diagram
view pane expands to full screen and the data preview pane, queries pane, and steps pane remain collapsed.
Fit to view
With this option, you can adjust the zoom level so that all the queries and their relationships can be fully viewed
in the diagram view.
Reset
With this option, you can reset the zoom level back to 100% and also reset the pane to the top-left corner.
Similarly, you can select the right dongle to view direct and indirect dependent queries.
You can also hover on the link icon below a step to view a callout that shows the query relationships.
You can change the diagram view settings to show step names that match the applied steps in the Query
settings pane.
Compact view
When you have queries with multiple steps, it can be challenging to scroll horizontally to view all your steps
within the viewport.
To address this, diagram view offers Compact view , which compresses the steps from top to bottom instead of
left to right. This view can be especially useful when you have queries with multiple steps, so that you can see as
many queries as possible within the viewport.
To enable this view, navigate to diagram view settings and select Compact view inside the View tab in the
ribbon.
Show mini-map
Once the number of queries begins to overflow the diagram view, you can use the scroll bars at the bottom and
right side of the diagram view to scroll through the queries. Another way to scroll is to use the diagram
view mini-map control. The mini-map control lets you keep track of the overall dataflow "map" and quickly
navigate while looking at a specific area of the map in the main diagram view area.
To open the mini-map, either select Show mini-map from the diagram view menu or select the mini-map
button in the layout options.
Right-click and hold the rectangle on the mini-map, then move the rectangle to move around in the diagram
view.
Show animations
When the Show animations menu item is selected, the transitions of the sizes and positions of the queries are
animated. These transitions are easiest to see when collapsing or expanding the queries or when changing the
dependencies of existing queries. When the option is cleared, the transitions are immediate. Animations are turned
on by default.
You can also expand or collapse related queries from the query level context menu.
Multi-select queries
You can select multiple queries within the diagram view by holding down the Ctrl key and selecting queries. Once
you multi-select, right-clicking shows a context menu that allows you to perform operations such as merge,
append, move to group, and expand/collapse.
Inline rename
You can double-click the query name to rename the query.
Double-clicking the step name allows you to rename the step, provided the diagram view setting is showing
step names.
When step labels are displayed in diagram view, double-clicking the step label shows the dialog box to rename
the step name and provide a description.
Accessibility
Diagram view supports accessibility features such as keyboard navigation, high-contrast mode, and screen
reader support. The following table describes the keyboard shortcuts that are available within diagram view. To
learn more about keyboard shortcuts available within Power Query Online, see keyboard shortcuts in Power
Query.
Action | Keyboard shortcut
Move focus from query level to step level | Alt+Down arrow key
Schema view
Schema view is designed to optimize your flow when working on schema level operations by putting your
query's column information front and center. Schema view provides contextual interactions to shape your data
structure, and lower latency operations because it only requires the column metadata to be computed, not the
complete data results.
This article walks you through schema view and the capabilities it offers.
NOTE
The Schema view feature is available only for Power Query Online.
Overview
When working on data sets with many columns, simple tasks can become incredibly cumbersome because even
finding the right column by horizontally scrolling and parsing through all the data is inefficient. Schema view
displays your column information in a list that's easy to parse and interact with, making it easier than ever to
work on your schema.
In addition to an optimized column management experience, another key benefit of schema view is that
transforms tend to yield results faster. These results are faster because this view only requires the column
information to be computed instead of a preview of the data. So even long running queries with a few columns
benefit from using schema view.
You can turn on schema view by selecting Schema view in the View tab. When you're ready to work on your
data again, you can select Data view to go back.
Reordering columns
One common task when working on your schema is reordering columns. In schema view, this can easily be
done by dragging columns in the list and dropping them in the right location until you achieve the desired column
order.
Applying transforms
For more advanced changes to your schema, you can find the most used column-level transforms right at your
fingertips directly in the list and in the Schema tools tab. Plus, you can also use transforms available in other
tabs on the ribbon.
Share a query
You can use Power Query to extract and transform data from external data sources. These extraction and
transformation steps are represented as queries. Queries created with Power Query are expressed using the M
language and executed through the M engine.
You can easily share and reuse your queries across projects, and also across Power Query product integrations.
This article covers the general mechanisms to share a query in Power Query.
Copy / Paste
In the queries pane, right-click the query you want to copy. From the dropdown menu, select the Copy option.
The query and its definition will be added to your clipboard.
NOTE
The copy feature is currently not available in Power Query Online instances.
To paste the query from your clipboard, go to the queries pane and right-click on any empty space in it. From
the menu, select Paste .
When pasting this query on an instance that already has the same query name, the pasted query will have a
suffix added with the format (#) , where the pound sign is replaced with a number to distinguish the pasted
queries.
You can also paste queries between multiple instances and product integrations. For example, you can copy the
query from Power BI Desktop, as shown in the previous images, and paste it in Power Query for Excel as shown
in the following image.
WARNING
Copying and pasting queries between product integrations doesn't guarantee that all functions and functionality found in
the pasted query will work on the destination. Some functionality might only be available in the origin product
integration.
NOTE
To create a blank query, go to the Get Data window and select Blank query from the options.
Using custom functions
If you find yourself in a situation where you need to apply the same set of transformations to different queries
or values, creating a Power Query custom function that can be reused as many times as you need could be
beneficial. A Power Query custom function is a mapping from a set of input values to a single output value, and
is created from native M functions and operators.
While you can manually create your own Power Query custom function using code as shown in Understanding
Power Query M functions, the Power Query user interface offers you features to speed up, simplify, and enhance
the process of creating and managing a custom function.
This article focuses on this experience, provided only through the Power Query user interface, and how to get
the most out of it.
IMPORTANT
This article outlines how to create a custom function with Power Query using common transforms accessible in the Power
Query user interface. It focuses on the core concepts to create custom functions, and links to additional articles in Power
Query documentation for more information on specific transforms that are referenced in this article.
You can follow along with this example by downloading the sample files used in this article from the following
download link. For simplicity, this article will be using the Folder connector. To learn more about the Folder
connector, go to Folder. The goal of this example is to create a custom function that can be applied to all the files
in that folder before combining all of the data from all files into a single table.
Start by using the Folder connector experience to navigate to the folder where your files are located and select
Transform Data or Edit . This will take you to the Power Query experience. Right-click on the Binary value of
your choice from the Content field and select the Add as New Query option. For this example, you'll see that
the selection was made for the first file from the list, which happens to be the file April 2019.csv.
This option will effectively create a new query with a navigation step directly to that file as a Binary, and the
name of this new query will be the file path of the selected file. Rename this query to be Sample File .
Create a new parameter with the name File Parameter . Use the Sample File query as the Current Value , as
shown in the following image.
NOTE
We recommend that you read the article on Parameters to better understand how to create and manage parameters in
Power Query.
Custom functions can be created using any parameters type. There's no requirement for any custom function to have a
binary as a parameter.
The binary parameter type is only displayed inside the Parameters dialog Type dropdown menu when you have a query
that evaluates to a binary.
It's possible to create a custom function without a parameter. This is commonly seen in scenarios where an input can be
inferred from the environment where the function is being invoked. For example, a function that takes the environment's
current date and time, and creates a specific text string from those values.
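As an illustration of that last point, a parameterless custom function might look like the following minimal M sketch (the function name and format string are chosen here for illustration only):

let
    // A custom function with no parameters that builds a text string
    // from the environment's current date and time
    GetTimestamp = () => DateTime.ToText(DateTime.LocalNow(), "yyyy-MM-dd HH:mm:ss"),
    // Invoking the function returns something like "2022-05-25 09:30:00"
    Example = GetTimestamp()
in
    Example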
Right-click File Parameter from the Queries pane. Select the Reference option.
Rename the newly created query from File Parameter (2) to Transform Sample file .
Right-click this new Transform Sample file query and select the Create Function option.
This operation will effectively create a new function that will be linked with the Transform Sample file query.
Any changes that you make to the Transform Sample file query will be automatically replicated to your
custom function. During the creation of this new function, use Transform file as the Function name .
After creating the function, you'll notice that a new group will be created for you with the name of your function.
This new group will contain:
All parameters that were referenced in your Transform Sample file query.
Your Transform Sample file query, commonly known as the sample query.
Your newly created function, in this case Transform file .
Applying transformations to a sample query
With your new function created, select the query with the name Transform Sample file . This query is now
linked with the Transform file function, so any changes made to this query will be reflected in the function.
This is what is known as the concept of a sample query linked to a function.
The first transformation that needs to happen to this query is one that will interpret the binary. You can right-
click the binary from the preview pane and select the CSV option to interpret the binary as a CSV file.
The format of all the CSV files in the folder is the same. They all have a header that spans the first top four rows.
The column headers are located in row five and the data starts from row six downwards, as shown in the next
image.
The next set of transformation steps that need to be applied to the Transform Sample file are:
1. Remove the top four rows —This action will get rid of the rows that are considered part of the header
section of the file.
NOTE
To learn more about how to remove rows or filter a table by row position, see Filter by row position.
2. Promote headers —The headers for your final table are now in the first row of the table. You can
promote them as shown in the next image.
By default, Power Query automatically adds a new Changed Type step after promoting your column headers,
which automatically detects the data types for each column. Your Transform Sample file query will
look like the next image.
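As a rough sketch of what the M behind the Transform Sample file query might resemble at this point (the delimiter, encoding, column names, and data types are assumptions based on the files described in this article, not a verbatim copy of the generated code):

let
    // File Parameter holds the binary contents of the sample CSV file
    Source = Csv.Document(#"File Parameter", [Delimiter = ",", Encoding = 65001]),
    // Remove the four rows that make up the header section of the file
    #"Removed top rows" = Table.Skip(Source, 4),
    // Use the first remaining row as the column headers
    #"Promoted headers" = Table.PromoteHeaders(#"Removed top rows", [PromoteAllScalars = true]),
    // Step added automatically by Power Query to detect data types (column names are assumptions)
    #"Changed column type" = Table.TransformColumnTypes(
        #"Promoted headers",
        {{"Date", type date}, {"Country", type text}, {"Units", Int64.Type}}
    )
in
    #"Changed column type"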
NOTE
To learn more about how to promote and demote headers, see Promote or demote column headers.
Caution
Your Transform file function relies on the steps performed in the Transform Sample file query. However, if
you try to manually modify the code for the Transform file function, you'll be greeted with a warning that
reads
The definition of the function 'Transform file' is updated whenever query 'Transform Sample file' is updated.
However, updates will stop if you directly modify function 'Transform file'.
Invoke a custom function as a new column
With the custom function now created and all the transformation steps incorporated, you can go back to the
original query where you have the list of files from the folder. Inside the Add Column tab in the ribbon, select
Invoke Custom Function from the General group. Inside the Invoke Custom Function window, enter
Output Table as the New column name . Select the name of your function, Transform file , from the
Function query dropdown. After selecting the function from the dropdown menu, the parameter for the
function will be displayed and you can select which column from the table to use as the argument for this
function. Select the Content column as the value / argument to be passed for the File Parameter .
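Behind the scenes, this dialog generates a Table.AddColumn step along the lines of the following hedged sketch (the folder path and step names are placeholders, and Transform file is the custom function created earlier in this article):

let
    // Placeholder folder path; in the article, this query already lists the files from the folder
    Source = Folder.Files("C:\SampleFiles"),
    // Invoke the Transform file custom function once per row, passing the Content column as its argument
    #"Invoked custom function" = Table.AddColumn(Source, "Output Table", each #"Transform file"([Content]))
in
    #"Invoked custom function"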
After you select OK , a new column with the name Output Table will be created. This column has Table values
in its cells, as shown in the next image. For simplicity, remove all columns from this table except Name and
Output Table .
NOTE
To learn more about how to choose or remove columns from a table, see Choose or remove columns.
Your function was applied to every single row from the table using the values from the Content column as the
argument for your function. Now that the data has been transformed into the shape that you're looking for, you
can expand the Output Table column, as shown in the image below, without using any prefix for the expanded
columns.
You can verify that you have data from all files in the folder by checking the values in the Name or Date
column. For this case, you can check the values from the Date column, as each file only contains data for a
single month from a given year. If you see more than one, it means that you've successfully combined data from
multiple files into a single table.
NOTE
What you've read so far is fundamentally the same process that happens during the Combine files experience, but done
manually.
We recommend that you also read the article on Combine files overview and Combine CSV files to further understand
how the combine files experience works in Power Query and the role that custom functions play.
NOTE
To learn more about how to filter columns by values, see Filter values.
Applying this new step to your query will automatically update the Transform file function, which will now
require two parameters based on the two parameters that your Transform Sample file uses.
But the CSV files query has a warning sign next to it. Now that your function has been updated, it requires two
parameters. So the step where you invoke the function results in error values, since only one of the arguments
was passed to the Transform file function during the Invoked Custom Function step.
To fix the errors, double-click Invoked Custom Function in the Applied Steps to open the Invoke Custom
Function window. In the Market parameter, manually enter the value Panama .
You can now check your query to validate that only rows where Country is equal to Panama show up in the
final result set of the CSV Files query.
Create a custom function from a reusable piece of logic
If you have multiple queries or values that require the same set of transformations, you could create a custom
function that acts as a reusable piece of logic. Later, this custom function can be invoked against the queries or
values of your choice. This custom function could save you time and help you in managing your set of
transformations in a central location, which you can modify at any moment.
For example, imagine a query that has several codes as a text string and you want to create a function that will
decode those values, as in the following sample table:
Code
PTY-CM1090-LAX
LAX-CM701-PTY
PTY-CM4441-MIA
MIA-UA1257-LAX
LAX-XY2842-MIA
You start by having a parameter that has a value that serves as an example. For this case, it will be the value
PTY-CM1090-LAX.
From that parameter, you create a new query where you apply the transformations that you need. For this case,
you want to split the code PTY-CM1090-LAX into multiple components:
Origin = PTY
Destination = LAX
Airline = CM
FlightID = 1090
NOTE
To learn more about the Power Query M formula language, see Power Query M formula language
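Assuming the parameter that holds the sample value is named Code Parameter (a name chosen here for illustration), the query built on top of it might split the components like this:

let
    // Code Parameter is the example parameter (value PTY-CM1090-LAX)
    Source = #"Code Parameter",
    // Split "PTY-CM1090-LAX" into {"PTY", "CM1090", "LAX"}
    Parts = Text.Split(Source, "-"),
    OriginCode = Parts{0},
    DestinationCode = Parts{2},
    // Separate the airline letters from the numeric flight ID in the middle segment
    AirlineCode = Text.Select(Parts{1}, {"A".."Z"}),
    FlightNumber = Text.Select(Parts{1}, {"0".."9"}),
    Result = [Origin = OriginCode, Destination = DestinationCode, Airline = AirlineCode, FlightID = FlightNumber]
in
    Result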
You can then transform that query into a function by right-clicking the query and selecting Create
Function . Finally, you can invoke your custom function in any of your queries or values, as shown in the next
image.
After a few more transformations, you can see that you've reached your desired output and leveraged the logic
for such a transformation from a custom function.
Promote or demote column headers
When creating a new query from unstructured data sources such as text files, Power Query analyzes the
contents of the file. If Power Query identifies a different pattern for the first row, it will try to promote the first
row of data to be the column headings for your table. However, Power Query might not identify the pattern
correctly 100 percent of the time, so this article explains how you can manually promote or demote column
headers from rows.
Table with the columns (Column1, Column2, Column3, and Column4) all set to the Text data type, with four rows
containing a header at the top, a column header in row 5, and seven data rows at the bottom.
Before you can promote the headers, you need to remove the first four rows of the table. To make that happen,
select the table menu in the upper-left corner of the preview window, and then select Remove top rows .
In the Remove top rows window, enter 4 in the Number of rows box.
NOTE
To learn more about Remove top rows and other table operations, go to Filter by row position.
The result of that operation will leave the headers as the first row of your table.
Locations of the promote headers operation
From here, you have a number of places where you can select the promote headers operation:
On the Home tab, in the Transform group.
After you do the promote headers operation, your table will look like the following image.
Table with Date, Country, Total Units, and Total Revenue column headers, and seven rows of data. The Date
column header has a Date data type, the Country column header has a Text data type, the Total Units column
header has a Whole number data type, and the Total Revenue column header has a Decimal number data type.
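In M, these operations correspond to the Table.PromoteHeaders and Table.DemoteHeaders functions. The following minimal sketch uses a small inline table rather than this article's sample data:

let
    // Inline stand-in table whose first row holds the intended column headers
    Source = #table(
        {"Column1", "Column2"},
        {{"Date", "Country"}, {"1/1/2020", "Panama"}}
    ),
    // Promote the first row to become the column headers
    #"Promoted headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Demote the headers back into the first row of data
    #"Demoted headers" = Table.DemoteHeaders(#"Promoted headers")
in
    #"Demoted headers"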
NOTE
Table column names must be unique. If the row you want to promote to a header row contains multiple instances of the
same text string, Power Query will disambiguate the column headings by adding a numeric suffix preceded by a dot to
every text string that isn't unique.
As a last step, select each column and type a new name for it. The end result will resemble the following image.
Final table after renaming column headers to Date, Country, Total Units, and Total Revenue, with Renamed
columns emphasized in the Query settings pane and the M code shown in the formula bar.
See also
Filter by row position
Filter a table by row position
Power Query has multiple options to filter a table based on the positions of its rows, either by keeping or
removing those rows. This article covers all the available methods.
Keep rows
The keep rows set of functions will select a set of rows from the table and remove any other rows that don't
meet the criteria.
There are two places where you can find the Keep rows buttons:
On the Home tab, in the Reduce Rows group.
This report always contains seven rows of data, and below the data it has a section for comments with an
unknown number of rows. In this example, you only want to keep the first seven rows of data. To do that, select
Keep top rows from the table menu. In the Keep top rows dialog box, enter 7 in the Number of rows box.
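In M, this dialog generates a Table.FirstN step, roughly as in the following sketch (the inline table is a scaled-down stand-in for the report, not the article's data):

let
    // Ten numbered rows stand in for the report data plus its comments section
    Source = #table({"Column1"}, List.Transform({1..10}, each {Text.From(_)})),
    // Keep top rows with a value of 7 keeps only the first seven rows
    #"Kept top rows" = Table.FirstN(Source, 7)
in
    #"Kept top rows"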
The result of that change will give you the output table you're looking for. After you set the data types for your
columns, your table will look like the following image.
Keep bottom rows
Imagine the following table that comes out of a system with a fixed layout.
Initial sample table with Column1, Column2, and Column3 as the column headers, all set to the Text data type,
and the bottom seven rows containing data, and above that a column headers row and an unknown number of
comments.
This report always contains seven rows of data at the end of the report page. Above the data, the report has a
section for comments with an unknown number of rows. In this example, you only want to keep those last seven
rows of data and the header row.
To do that, select Keep bottom rows from the table menu. In the Keep bottom rows dialog box, enter 8 in
the Number of rows box.
The result of that operation will give you eight rows, but now your header row is part of the table.
You need to promote the column headers from the first row of your table. To do this, select Use first row as
headers from the table menu. After you define data types for your columns, you'll create a table that looks like
the following image.
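A hedged sketch of the same flow in M, using a small stand-in table instead of the report (so the row counts are smaller than in the article):

let
    // Stand-in table: a comment row, then a header row, then two data rows
    Source = #table(
        {"Column1", "Column2"},
        {{"Report comments", null}, {"Units", "Country"}, {"10", "Panama"}, {"4", "USA"}}
    ),
    // Keep the bottom three rows: the header row plus the two data rows
    #"Kept bottom rows" = Table.LastN(Source, 3),
    // Promote that header row so Units and Country become the column headers
    #"Promoted headers" = Table.PromoteHeaders(#"Kept bottom rows", [PromoteAllScalars = true])
in
    #"Promoted headers"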
Final sample table for Keep bottom rows after promoting the first row to column headers and retaining seven
rows of data, and then setting the Units to the Number data type.
More information: Promote or demote column headers
Keep a range of rows
Imagine the following table that comes out of a system with a fixed layout.
Initial sample table with the columns (Column1, Column2, and Column3) all set to the Text data type, and
containing the column headers and seven rows of data in the middle of the table.
This report always contains five rows for the header, one row of column headers below the header, seven rows
of data below the column headers, and then an unknown number of rows for its comments section. In this
example, you want to get the eight rows after the header section of the report, and only those eight rows.
To do that, select Keep range of rows from the table menu. In the Keep range of rows dialog box, enter 6 in
the First row box and 8 in the Number of rows box.
Similar to the previous example for keeping bottom rows, the result of this operation gives you eight rows with
your column headers as part of the table. Any rows above the First row that you defined (row 6) are removed.
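In M, this dialog maps to Table.Range, which uses a zero-based offset, as in the following sketch with stand-in data:

let
    // Fifteen numbered rows stand in for the header section, column headers, data, and comments
    Source = #table({"Column1"}, List.Transform({1..15}, each {Text.From(_)})),
    // First row = 6 and Number of rows = 8 translate to an offset of 5 and a count of 8
    #"Kept range of rows" = Table.Range(Source, 5, 8)
in
    #"Kept range of rows"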
You can perform the same operation as described in Keep bottom rows to promote the column headers from
the first row of your table. After you set data types for your columns, your table will look like the following
image.
Final sample table for Keep range of rows after promoting first row to column headers, setting the Units column
to the Number data type, and keeping seven rows of data.
Remove rows
This set of functions will select a set of rows from the table, remove them, and keep the rest of the rows in the
table.
There are two places where you can find the Remove rows buttons:
On the Home tab, in the Reduce Rows group.
Initial sample table for Remove top rows with the columns (Column1, Column2, and Column3) all set to the Text
data type, a header at the top and a column header row and seven data rows at the bottom.
This report always contains a fixed header from row 1 to row 5 of the table. In this example, you want to remove
these first five rows and keep the rest of the data.
To do that, select Remove top rows from the table menu. In the Remove top rows dialog box, enter 5 in the
Number of rows box.
In the same way as the previous examples for "Keep bottom rows" and "Keep a range of rows," the result of this
operation gives you eight rows with your column headers as part of the table.
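In M, removing top rows is commonly expressed with Table.Skip (Table.RemoveFirstN is an equivalent option), as in this sketch with stand-in data:

let
    // Twelve numbered rows stand in for the fixed header plus the rest of the report
    Source = #table({"Column1"}, List.Transform({1..12}, each {Text.From(_)})),
    // Remove the first five rows (the fixed report header)
    #"Removed top rows" = Table.Skip(Source, 5)
in
    #"Removed top rows"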
You can perform the same operation as described in previous examples to promote the column headers from
the first row of your table. After you set data types for your columns, your table will look like the following
image.
Final sample table for Remove top rows after promoting first row to column headers and setting the Units
column to the Number data type, and retaining seven rows of data.
Remove bottom rows
Imagine the following table that comes out of a system with a fixed layout.
Initial sample table for Remove bottom rows, with the header columns all set to the Text data type, seven rows of
data, then a footer of fixed length at the bottom.
This report always contains a fixed section or footer that occupies the last five rows of the table. In this example,
you want to remove those last five rows and keep the rest of the data.
To do that, select Remove bottom rows from the table menu. In the Remove bottom rows dialog box, enter 5 in
the Number of rows box.
The result of that change will give you the output table that you're looking for. After you set data types for your
columns, your table will look like the following image.
Filter by values in a column
In Power Query, you can include or exclude rows according to a specific value in a column. You can choose from
three methods to filter the values in your column:
Sort and filter menu
Cell shortcut menu
Type-specific filter
After you apply a filter to a column, a small filter icon appears in the column heading, as shown in the following
illustration.
NOTE
In this article, we'll focus on aspects related to filtering data. To learn more about the sort options and how to sort
columns in Power Query, go to Sort columns.
Remove empty
The Remove empty command applies two filter rules to your column. The first rule gets rid of any null values.
The second rule gets rid of any blank values. For example, imagine a table with just one text column with five
rows, where you have one null value and one blank cell.
NOTE
A null value is a specific value in the Power Query language that represents no value.
You then select Remove empty from the sort and filter menu, as shown in the following image.
You can also select this option from the Home tab in the Reduce Rows group in the Remove Rows drop-
down options, as shown in the next image.
The result of the Remove empty operation gives you the same table without the empty values.
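As a sketch of what such a filter looks like in M (using made-up sample data), Remove empty translates to a Table.SelectRows step that excludes both null and blank values:

let
    // One text column with five rows, including one null value and one blank cell
    Source = #table({"Name"}, {{"Alpha"}, {null}, {"Beta"}, {""}, {"Gamma"}}),
    // Keep only rows where the column is neither null nor an empty string
    #"Removed empty" = Table.SelectRows(Source, each [Name] <> null and [Name] <> "")
in
    #"Removed empty"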
Clear filter
When a filter is applied to a column, the Clear filter command appears on the sort and filter menu.
Auto filter
The list in the sort and filter menu is called the auto filter list, which shows the unique values in your column.
You can manually select or deselect which values to include in the list. Any selected values will be taken into
consideration by the filter; any values that aren't selected will be ignored.
This auto filter section also has a search bar to help you find any values from your list.
NOTE
When you load the auto filter list, only the top 1,000 distinct values in the column are loaded. If there are more than
1,000 distinct values in the column that you're filtering, a message appears indicating that the list of values in
the filter list might be incomplete, and the Load more link appears. Select the Load more link to load another 1,000
distinct values.
If exactly 1,000 distinct values are found again, the list is displayed with a message stating that the list might still be
incomplete.
If fewer than 1,000 distinct values are found, the full list of values is shown.
Type-specific filters
Depending on the data type of your column, you'll see different commands in the sort and filter menu. The
following images show examples for date, text, and numeric columns.
Filter rows
When selecting any of the type-specific filters, you'll use the Filter rows dialog box to specify filter rules for the
column. This dialog box is shown in the following image.
The Filter rows dialog box has two modes: Basic and Advanced .
Basic
With basic mode, you can implement up to two filter rules based on type-specific filters. In the preceding image,
notice that the name of the selected column is displayed after the label Keep rows where , to let you know
which column these filter rules are being implemented on.
For example, imagine that in the following table, you want to filter the Account Code by all values that start
with either PA or PTY .
To do that, you can go to the Filter rows dialog box for the Account Code column and specify the set of filter
rules you want.
In this example, first select the Basic button. Then under Keep rows where "Account Code", select begins
with, and then enter PA. Then select the or button. Under the or button, select begins with, and then enter
PTY. Then select OK.
The result of that operation will give you the set of rows that you're looking for.
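As a rough sketch, the same filter can be written in M with Table.SelectRows and Text.StartsWith (assuming the previous step is named Source):
// Keep rows where Account Code begins with PA or PTY
= Table.SelectRows(Source, each Text.StartsWith([Account Code], "PA") or Text.StartsWith([Account Code], "PTY"))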
Advanced
With advanced mode, you can implement as many type-specific filters as necessary from all the columns in the
table.
For example, imagine that instead of applying the previous filter in basic mode, you wanted to implement a filter
to Account Code to show all values that end with 4 . Also, you want to show values over $100 in the Sales
column.
In this example, first select the Advanced button. In the first row, select Account Code under Column name,
ends with under Operator, and 4 under Value. In the second row, select and, and then select Sales
under Column name, is greater than under Operator, and 100 under Value. Then select OK.
The result of that operation will give you just one row that meets both criteria.
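The equivalent M for this advanced filter looks roughly like the following (again assuming the previous step is named Source):
// Keep rows where Account Code ends with 4 and Sales is greater than 100
= Table.SelectRows(Source, each Text.EndsWith([Account Code], "4") and [Sales] > 100)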
NOTE
You can add as many clauses as you'd like by selecting Add clause . All clauses act at the same level, so you might want
to consider creating multiple filter steps if you need to implement filters that rely on other filters.
Choose or remove columns
Choose columns and Remove columns are operations that help you define what columns your table needs
to keep and which ones it needs to remove. This article will showcase how to use the Choose columns and
Remove columns commands by using the following sample table for both operations.
The goal is to create a table that looks like the following image.
Choose columns
On the Home tab, in the Manage columns group, select Choose columns .
The Choose columns dialog box appears, containing all the available columns in your table. You can select all
the fields that you want to keep and remove specific fields by clearing their associated check box. For this
example, you want to remove the GUID and Report created by columns, so you clear the check boxes for
those fields.
After selecting OK , you'll create a table that only contains the Date , Product , SalesPerson , and Units columns.
Remove columns
When you select Remove columns from the Home tab, you have two options:
Remove columns : Removes the selected columns.
Remove other columns : Removes all columns from the table except the selected ones.
After selecting Remove other columns , you'll create a table that only contains the Date , Product ,
SalesPerson , and Units columns.
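Both commands map to simple M functions. The following sketch, which assumes the previous step is named Source, shows one possible form of each:
// Choose columns (or Remove other columns): keep only the listed columns
= Table.SelectColumns(Source, {"Date", "Product", "SalesPerson", "Units"})
// Remove columns: drop the listed columns and keep everything else
= Table.RemoveColumns(Source, {"GUID", "Report created by"})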
Grouping or summarizing rows
In Power Query, you can group values in various rows into a single value by grouping the rows according to the
values in one or more columns. You can choose from two types of grouping operations:
Column groupings.
Row groupings.
For this tutorial, you'll be using the following sample table.
Table with columns showing Year (2020), Country (USA, Panama, or Canada), Product (Shirt or Shorts), Sales
channel (Online or Reseller), and Units (various values from 55 to 7500)
Operations available
With the Group by feature, the available operations can be categorized in two ways:
Row level operation
Column level operation
The following table describes each of these operations.
Count distinct rows (row operation): Calculates the number of distinct rows from a given group.
After that operation is complete, notice how the Products column has [Table] values inside each cell. Each
[Table] value contains all the rows that were grouped by the Country and Sales Channel columns from your
original table. You can select the white space inside the cell to see a preview of the contents of the table at the
bottom of the dialog box.
NOTE
The details preview pane might not show all the rows that were used for the group-by operation. You can select the
[Table] value to see all rows pertaining to the corresponding group-by operation.
Next, you need to extract the row that has the highest value in the Units column of the tables inside the new
Products column, and call that new column Top performer product .
Extract the top performer product information
With the new Products column with [Table] values, you create a new custom column by going to the Add
Column tab on the ribbon and selecting Custom column from the General group.
Name your new column Top performer product. Enter the formula Table.Max([Products], "Units") under
Custom column formula .
The result of that formula creates a new column with [Record] values. These record values are essentially a table
with just one row. These records contain the row with the maximum value for the Units column of each [Table]
value in the Products column.
With this new Top performer product column that contains [Record] values, you can select the expand icon,
select the Product and Units fields, and then select OK .
After removing your Products column and setting the data type for both newly expanded columns, your result
will resemble the following image.
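Put together, the grouping and the custom column roughly correspond to the following M steps (a sketch that assumes the previous step is named Source and that the Products column was created with the All rows operation):
// Group by Country and Sales Channel, keeping all rows of each group as a nested table
GroupedRows = Table.Group(Source, {"Country", "Sales Channel"}, {{"Products", each _, type table}}),
// For each nested table, return the record (row) with the highest value in Units
TopPerformer = Table.AddColumn(GroupedRows, "Top performer product", each Table.Max([Products], "Units"))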
Fuzzy grouping
NOTE
The following feature is only available in Power Query Online.
To demonstrate how to do "fuzzy grouping," consider the sample table shown in the following image.
The goal of fuzzy grouping is to do a group-by operation that uses an approximate match algorithm for text
strings. Power Query uses the Jaccard similarity algorithm to measure the similarity between pairs of instances.
Then it applies agglomerative hierarchical clustering to group instances together. The following image shows the
output that you expect, where the table will be grouped by the Person column.
To do the fuzzy grouping, you perform the same steps previously described in this article. The only difference is
that this time, in the Group by dialog box, you select the Use fuzzy grouping check box.
For each group of rows, Power Query will pick the most frequent instance as the "canonical" instance. If multiple
instances occur with the same frequency, Power Query will pick the first one. After you select OK in the Group
by dialog box, you'll get the result that you were expecting.
However, you have more control over the fuzzy grouping operation by expanding Fuzzy group options .
The following options are available for fuzzy grouping:
Similarity threshold (optional) : This option indicates how similar two values must be to be grouped
together. The minimum setting of 0 will cause all values to be grouped together. The maximum setting of 1
will only allow values that match exactly to be grouped together. The default is 0.8.
Ignore case : When comparing text strings, case will be ignored. This option is enabled by default.
Group by combining text parts : The algorithm will try to combine text parts (such as combining Micro
and soft into Microsoft ) to group values.
Show similarity scores : Show similarity scores between the input values and the computed representative
values after fuzzy grouping. Requires the addition of an operation such as All rows to showcase this
information on a row-by-row level.
Transformation table (optional) : You can select a transformation table that will map values (such as
mapping MSFT to Microsoft ) to group them together.
For this example, a transformation table will be used to demonstrate how values can be mapped. The
transformation table has two columns:
From : The text string to look for in your table.
To : The text string to use to replace the text string in the From column.
The following image shows the transformation table used in this example.
IMPORTANT
It's important that the transformation table has the same columns and column names as shown above (they have to be
"From" and "To"); otherwise, Power Query won't recognize it as a transformation table.
Return to the Group by dialog box, expand Fuzzy group options , change the operation from Count rows to
All rows , enable the Show similarity scores option, and then select the Transformation table drop-down
menu.
After selecting your transformation table, select OK . The result of that operation gives you the following
information.
In this example, the Ignore case option was enabled, so the values in the From column of the Transformation
table are used to look for the text string without considering the case of the string. This transformation
operation occurs first, and then the fuzzy grouping operation is performed.
The similarity score is also shown in the table value next to the Person column, which reflects exactly how the
values were grouped and their respective similarity scores. You have the option to expand this column if needed
or use the values from the new Frequency columns for other sorts of transformations.
NOTE
When grouping by multiple columns, the transformation table performs the replace operation in all columns if replacing
the value increases the similarity score.
See also
Add a custom column
Remove duplicates
Unpivot columns
In Power Query, you can transform columns into attribute-value pairs, where columns become rows.
Diagram showing a table on the left with a blank column and rows, and the Attributes values A1, A2, and A3 as
column headers. The A1 column contains the values V1, V4, and V7, the A2 column contains the values V2, V5,
and V8, and the A3 column contains the values V3, V6, and V9. With the columns unpivoted, a table on the right
of the diagram contains a blank column and rows, an Attributes column with nine rows with A1, A2, and A3
repeated three times, and a Values column with values V1 through V9.
For example, given a table like the following, where country rows and date columns create a matrix of values, it's
difficult to analyze the data in a scalable way.
Table containing a Country column set in the Text data type, and 6/1/2020, 7/1/2020, and 8/1/2020 columns set
as the Whole number data type. The Country column contains USA in row 1, Canada in row 2, and Panama in
row 3.
Instead, you can transform the table into a table with unpivoted columns, as shown in the following image. In
the transformed table, it's easier to use the date as an attribute to filter on.
Table containing a Country column set as the Text data type, an Attribute column set as the Text data type, and a
Value column set as the Whole number data type. The Country column contains USA in the first three rows,
Canada in the next three rows, and Panama in the last three rows. The Attribute column contains 6/1/2020 in the
first, fourth, and seventh rows, 7/1/2020 in the second, fifth, and eighth rows, and 8/1/2020 in the third, sixth,
and ninth rows.
The key in this transformation is that you have a set of dates in the table that should all be part of a single
column. The respective value for each date and country should be in a different column, effectively creating an
attribute-value pair.
Power Query will always create the attribute-value pair by using two columns:
Attribute : The name of the column headings that were unpivoted.
Value : The values that were underneath each of the unpivoted column headings.
There are multiple places in the user interface where you can find Unpivot columns . You can right-click the
columns that you want to unpivot, or you can select the command from the Transform tab in the ribbon.
There are three ways that you can unpivot columns from a table:
Unpivot columns
Unpivot other columns
Unpivot only selected columns
Unpivot columns
For the scenario described above, you first need to select the columns you want to unpivot. You can hold down Ctrl
as you select as many columns as you need. For this scenario, you want to select all the columns except the one
named Country . After selecting the columns, right-click any of the selected columns, and then select Unpivot
columns .
The result of that operation will yield the result shown in the following image.
Table containing a Country column set as the Text data type, an Attribute column set as the Text data type, and a
Value column set as the Whole number data type. The Country column contains USA in the first three rows,
Canada in the next three rows, and Panama in the last three rows. The Attribute column contains 6/1/2020 in the
first, fourth, and seventh rows, 7/1/2020 in the second, fifth, and eighth rows, and 8/1/2020 in the third, sixth,
and ninth rows. In addition, the Unpivot columns entry is emphasized in the Query settings pane and the M
language code is shown in the formula bar.
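In M, both Unpivot columns and Unpivot other columns typically translate to Table.UnpivotOtherColumns, which lists only the columns to keep in place. That's what makes newly added columns unpivot automatically on refresh, as described in the next section. A minimal sketch, assuming the previous step is named Source:
// Unpivot every column except Country
= Table.UnpivotOtherColumns(Source, {"Country"}, "Attribute", "Value")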
Special considerations
After creating your query from the steps above, imagine that your initial table gets updated to look like the
following screenshot.
Table with the same original Country, 6/1/2020, 7/1/2020, and 8/1/2020 columns, with the addition of a
9/1/2020 column. The Country column still contains the USA, Canada, and Panama values, but also has UK
added to the fourth row and Mexico added to the fifth row.
Notice that you've added a new column for the date 9/1/2020 (September 1, 2020), and two new rows for the
countries UK and Mexico.
If you refresh your query, you'll notice that the operation will be done on the updated column, but won't affect
the column that wasn't originally selected (Country, in this example). This means that any new column that's
added to the source table will be unpivoted as well.
The following image shows what your query will look like after the refresh with the new updated source table.
Table with Country, Attribute, and Value columns. The first four rows of the Country column contains USA, the
second four rows contains Canada, the third four rows contains Panama, the fourth four rows contains UK, and
the fifth four rows contains Mexico. The Attribute column contains 6/1/2020, 7/1/2020, 8/1/2020, and 9/1/2020
in the first four rows, which are repeated for each country.
NOTE
This transformation is crucial for queries that have an unknown number of columns. The operation will unpivot all
columns from your table except the ones that you've selected. This is an ideal solution if the data source of your scenario
got new date columns in a refresh, because those will get picked up and unpivoted.
Unpivot other columns
Special considerations
Similar to the Unpivot columns operation, if your query is refreshed and more data is picked up from the data
source, all the columns will be unpivoted except the ones that were previously selected.
To illustrate this, say that you have a new table like the one in the following image.
Table with Country, 6/1/2020, 7/1/2020, 8/1/2020, and 9/1/2020 columns, with all columns set to the Text data
type. The Country column contains, from top to bottom, USA, Canada, Panama, UK, and Mexico.
You can select the Country column, and then select Unpivot other columns , which will yield the following
result.
Table with Country, Attribute, and Value columns. The Country and Attribute columns are set to the Text data
type. The Value column is set to the Whole value data type. The first four rows of the Country column contain
USA, the second four rows contains Canada, the third four rows contains Panama, the fourth four rows contains
UK, and the fifth four rows contains Mexico. The Attribute column contains 6/1/2020, 7/1/2020, 8/1/2020, and
9/1/2020 in the first four rows, which are repeated for each country.
Notice how this operation will yield the same output as the previous examples.
Table containing a Country column set as the Text data type, an Attribute column set as the Text data type, and a
Value column set as the Whole number data type. The Country column contains USA in the first three rows,
Canada in the next three rows, and Panama in the last three rows. The Attribute column contains 6/1/2020 in the
first, fourth, and seventh rows, 7/1/2020 in the second, fifth, and eighth rows, and 8/1/2020 in the third, sixth,
and ninth rows.
Unpivot only selected columns
Special considerations
After doing a refresh, if your source table changes to have a new 9/1/2020 column and new rows for UK and
Mexico, the output of the query will be different from the previous examples. Say that your source table, after a
refresh, changes to the table in the following image.
The output of our query will look like the following image.
It looks like this because the unpivot operation was applied only on the 6/1/2020 , 7/1/2020 , and 8/1/2020
columns, so the column with the header 9/1/2020 remains unchanged.
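By contrast, Unpivot only selected columns typically translates to Table.Unpivot, which lists the exact columns to unpivot, so columns added later (such as 9/1/2020) are left untouched. A rough sketch, assuming the previous step is named Source:
// Unpivot only the three selected date columns
= Table.Unpivot(Source, {"6/1/2020", "7/1/2020", "8/1/2020"}, "Attribute", "Value")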
Pivot columns
In Power Query, you can create a table that contains an aggregate value for each unique value in a column.
Power Query groups each unique value, does an aggregate calculation for each value, and pivots the column
into a new table.
Diagram showing a table on the left with a blank column and rows. An Attributes column contains nine rows
with A1, A2, and A3 repeated three times. A Values column contains, from top to bottom, values V1 through V9.
With the columns pivoted, a table on the right contains a blank column and rows, the Attributes values A1, A2,
and A3 as column headers, with the A1 column containing the values V1, V4, and V7, the A2 column containing
the values V2, V5, and V8, and the A3 column containing the values V3, V6, and V9.
Imagine a table like the one in the following image.
Table containing a Country column set as the Text data type, a Date column set as the Date data type, and a Value
column set as the Whole number data type. The Country column contains USA in the first three rows, Canada in
the next three rows, and Panama in the last three rows. The Date column contains 6/1/2020 in the first, fourth,
and seventh rows, 7/1/2020 in the second, fifth, and eighth rows, and 8/1/2020 in the third, sixth, and ninth
rows.
This table contains values by country and date in a simple table. In this example, you want to transform this
table into the one where the date column is pivoted, as shown in the following image.
Table containing a Country column set in the Text data type, and 6/1/2020, 7/1/2020, and 8/1/2020 columns set
as the Whole number data type. The Country column contains Canada in row 1, Panama in row 2, and USA in
row 3.
NOTE
During the pivot columns operation, Power Query will sort the table based on the values found in the first column, at
the left side of the table, in ascending order.
To pivot a column
1. Select the column that you want to pivot.
2. On the Transform tab in the Any column group, select Pivot column .
3. In the Pivot column dialog box, in the Value column list, select Value .
By default, Power Query will try to do a sum as the aggregation, but you can select the Advanced option
to see other available aggregations.
For columns that can't be aggregated, such as text columns, you can pivot without aggregating the values. In the
Pivot column dialog box, select the Product column as the value column. Select the Advanced option
button in the Pivot column dialog box, and then select Don't aggregate .
The result of this operation will yield the result shown in the following image.
Table containing Country, First Place, Second Place, and Third Place columns, with the Country column
containing Canada in row 1, Panama in row 2, and USA in row 3.
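As a rough M sketch of both variations, shown as alternative steps and using the Date and Value columns from the earlier example (assuming the previous step is named Source):
// Column names must be text, so convert Date before pivoting
ChangedType = Table.TransformColumnTypes(Source, {{"Date", type text}}),
// Pivot with an aggregation (the sum of Value for each Country and Date)
PivotedSum = Table.Pivot(ChangedType, List.Distinct(ChangedType[Date]), "Date", "Value", List.Sum),
// With Don't aggregate, the aggregation function is omitted; duplicate Country and Date pairs then produce errors
PivotedNoAggregate = Table.Pivot(ChangedType, List.Distinct(ChangedType[Date]), "Date", "Value")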
Errors when using the Don't aggregate option
The way the Don't aggregate option works is that it grabs a single value for the pivot operation to be placed as
the value for the intersection of the column and row pair. For example, let's say you have a table like the one in
the following image.
Table with Country, Date, and Value columns. The Country column contains USA in the first three rows, Canada
in the next three rows, and Panama in the last three rows. The Date column contains a date of 6/1/2020 in all
rows. The Value column contains various whole numbers between 20 and 785.
You want to pivot that table by using the Date column, and you want to use the values from the Value column.
Because this pivot would make your table have just the Country values on rows and the dates as columns,
you'd get an error for every single cell value because there are multiple rows for every combination of Country
and Date . The outcome of the Pivot column operation will yield the results shown in the following image.
Power Query Editor pane showing a table with Country and 6/1/2020 columns. The Country column contains
Canada in the first row, Panama in the second row, and USA in the third row. All of the rows under the 6/1/2020
column contain Errors. Under the table is another pane that shows the expression error with the "There were too
many elements in the enumeration to complete the operation" message.
Notice the error message "Expression.Error: There were too many elements in the enumeration to complete the
operation." This error occurs because the Don't aggregate operation only expects a single value for the
country and date combination.
Transpose a table
The transpose table operation in Power Query rotates your table 90 degrees, turning your rows into columns
and your columns into rows.
Imagine a table like the one in the following image, with three rows and four columns.
Table with four columns named Column1 through Column4, with all columns set to the Text data type. Column1
contains Events in row 1, Participants in row 2, and Funds in row 3. Column2 contains Event 1 in row 1, 150 in
row 2, and 4000 in row 3. Column3 contains Event 2 in row 1, 450 in row 2, and 10000 in row 3. Column4
contains Event 3 in row 1, 1250 in row 2, and 15000 in row 3.
The goal of this example is to transpose that table so you end up with four rows and three columns.
Table with three columns named Events with a Text data type, Participants with a Whole number data type, and
Funds with a whole number data type. The Events column contains, from top to bottom, Event 1, Event 2, and
Event 3. The Participants column contains, from top to bottom, 150, 450, and 1250. The Funds column contains,
from top to bottom, 4000, 10000, and 15000.
On the Transform tab in the ribbon, select Transpose .
The result of that operation will look like the following image.
Table with three columns named Column1, Column2, and Column 3, with all columns set to the Any data type.
Column1 contains, from top to bottom, Events, Event 1, Event 2, and Event 3. Column2 contains, from top to
bottom, Participants, 150, 450, and 1250. Column 3 contains, from top to bottom, Funds, 4000, 10000, and
15000.
NOTE
Only the contents of the table will be transposed during the transpose operation; the column headers of the initial table
will be lost. The new columns will have the name Column followed by a sequential number.
The headers you need in this example are in the first row of the table. To promote the first row to headers, select
the table icon in the upper-left corner of the data preview, and then select Use first row as headers .
The result of that operation will give you the output that you're looking for.
Final table with three columns named Events with a Text data type, Participants with a Whole number data type,
and Funds with a whole number data type. The Events column contains, from top to bottom, Event 1, Event 2,
and Event 3. The Participants column contains, from top to bottom, 150, 450, and 1250. The Funds column
contains, from top to bottom, 4000, 10000, and 15000.
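The two steps roughly correspond to the following M (a sketch that assumes the previous step is named Source):
TransposedTable = Table.Transpose(Source),
// Promote the first row to become the column headers
PromotedHeaders = Table.PromoteHeaders(TransposedTable)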
NOTE
To learn more about the promote headers operation, also known as Use first row as headers , go to Promote or
demote column headers.
Reverse rows
With Power Query, it's possible to reverse the order of rows in a table.
Imagine a table with two columns, ID and Country , as shown in the following image.
Initial table with ID and Country columns. The ID rows contain, from top to bottom, values of 1 through 7. The
Country rows contain, from top to bottom, USA, Canada, Mexico, China, Spain, Panama, and Columbia.
On the Transform tab, select Reverse rows .
Output table with the rows reversed. The ID rows now contain, from top to bottom, values of 7 down to 1. The
Country rows contain, from top to bottom, Columbia, Panama, Spain, China, Mexico, Canada, and USA.
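This command corresponds to a single M function; a minimal sketch, assuming the previous step is named Source:
// Reverse the order of the rows
= Table.Reverse(Source)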
Data types in Power Query
Data types in Power Query are used to classify values to have a more structured dataset. Data types are defined
at the field level—values inside a field are set to conform to the data type of the field.
The data type of a column is displayed on the left side of the column heading with an icon that symbolizes the
data type.
NOTE
Power Query provides a set of contextual transformations and options based on the data type of the column. For
example, when you select a column with a data type of Date, you get transformations and options that apply to that
specific data type. These transformations and options occur throughout the Power Query interface, such as on the
Transform and Add column tabs and the smart filter options.
The most common data types used in Power Query are listed in the following table. Although beyond the scope
of this article, you can find the complete list of data types in the Power Query M formula language Types article.
You can define or change the data type of a column in several places, including:
On the Transform tab, in the Any column group, on the Data type drop-down menu.
By selecting the icon on the left side of the column heading.
For example, imagine a table with a Date column whose values are stored as text in the day/month/year format.
When you try setting the data type of the Date column to be Date , you get error values.
These errors occur because the locale being used is trying to interpret the date in the English (United States)
format, which is month/day/year. Because there's no month 22 in the calendar, it causes an error.
Instead of trying to just select the Date data type, you can right-click the column heading, select Change type ,
and then select Using locale .
In the Change column type with locale dialog box, you select the data type that you want to set, but you also
select which locale to use, which in this case needs to be English (United Kingdom) .
Using this locale, Power Query will be able to interpret values correctly and convert those values to the right
data type.
By using these columns, you can verify that your date value has been converted correctly.
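Changing the type with a locale passes a culture code to Table.TransformColumnTypes. A minimal sketch, assuming the previous step is named Source and the column is named Date:
// Interpret the text values as dates using the English (United Kingdom) locale
= Table.TransformColumnTypes(Source, {{"Date", type date}}, "en-GB")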
The most common data types in Power Query are:
Decimal number
Currency
Whole number
Percentage
Date/Time
Date
Time
Date/Time/Timezone
Duration
Text
True/False
Step-level error
A step-level error prevents the query from loading and displays the error components in a yellow pane.
Error reason : The first section before the colon. In the example above, the error reason is
Expression.Error .
Error message : The section directly after the reason. In the example above, the error message is The
column 'Column' of the table wasn't found .
Error detail : The section directly after the Details: string. In the example above, the error detail is Column .
Common step-level errors
In all cases, we recommend that you take a close look at the error reason, error message, and error detail to
understand what's causing the error. You can select the Go to error button, if available, to view the first step
where the error occurred.
Possible solutions : You can change the file path of the text file to a path that both users have access to. As user
B, you can change the file path to be a local copy of the same text file. If the Edit settings button is available in
the error pane, you can select it and change the file path.
The column of the table wasn't found
This error is commonly triggered when a step makes a direct reference to a column name that doesn't exist in
the query.
Example : You have a query from a text file where one of the column names was Column . In your query, you
have a step that renames that column to Date . But there was a change in the original text file, and it no longer
has a column heading with the name Column because it was manually changed to Date . Power Query is
unable to find a column heading named Column , so it can't rename any columns. It displays the error shown in
the following image.
Possible solutions : There are multiple solutions for this case, but they all depend on what you'd like to do. For
this example, because the correct Date column header already comes from your text file, you can just remove
the step that renames the column. This will allow your query to run without this error.
Other common step-level errors
When combining or merging data between multiple data sources, you might get a Formula.Firewall error
such as the one shown in the following image.
This error can be caused by a number of reasons, such as the data privacy levels between data sources or the
way that these data sources are being combined or merged. For more information about how to diagnose this
issue, go to Data privacy firewall.
Cell-level error
A cell-level error won't prevent the query from loading, but displays error values as Error in the cell. Selecting
the white space in the cell displays the error pane underneath the data preview.
NOTE
The data profiling tools can help you more easily identify cell-level errors with the column quality feature. More
information: Data profiling tools
Remove errors
To remove rows with errors in Power Query, first select the column that contains errors. On the Home tab, in the
Reduce rows group, select Remove rows . From the drop-down menu, select Remove errors .
The result of that operation will give you the table that you're looking for.
Replace errors
If instead of removing rows with errors, you want to replace the errors with a fixed value, you can do so as well.
To replace rows that have errors, first select the column that contains errors. On the Transform tab, in the Any
column group, select Replace values . From the drop-down menu, select Replace errors .
In the Replace errors dialog box, enter the value 10 because you want to replace all errors with the value 10.
The result of that operation will give you the table that you're looking for.
Keep errors
Power Query can serve as a good auditing tool to identify any rows with errors even if you don't fix the errors.
This is where Keep errors can be helpful. To keep rows that have errors, first select the column that contains
errors. On the Home tab, in the Reduce rows group, select Keep rows . From the drop-down menu, select
Keep errors .
The result of that operation will give you the table that you're looking for.
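The three commands map to three M functions. The following sketch shows them as alternative steps, assuming the previous step is named Source and using a hypothetical column named Column1:
// Remove rows that have an error in Column1
RemovedErrors = Table.RemoveRowsWithErrors(Source, {"Column1"}),
// Replace errors in Column1 with the value 10
ReplacedErrors = Table.ReplaceErrorValues(Source, {{"Column1", 10}}),
// Keep only the rows that have an error in Column1
KeptErrors = Table.SelectRowsWithErrors(Source, {"Column1"})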
Possible solutions : After identifying the row with the error, you can either modify the data source to reflect the
correct value rather than NA , or you can apply a Replace error operation to provide a value for any NA values
that cause an error.
Operation errors
When trying to apply an operation that isn't supported, such as multiplying a text value by a numeric value, an
error occurs.
Example : You want to create a custom column for your query by creating a text string that contains the phrase
"Total Sales: " concatenated with the value from the Sales column. An error occurs because the concatenation
operation only supports text columns and not numeric ones.
Possible solutions : Before creating this custom column, change the data type of the Sales column to be text.
Working with duplicate values
You can work with duplicate sets of values through transformations that can remove duplicates from your data
or filter your data to show duplicates only, so you can focus on them.
WARNING
Power Query is case-sensitive. When working with duplicate values, Power Query considers the case of the text, which
might lead to undesired results. As a workaround, users can apply an uppercase or lowercase transform prior to removing
duplicates.
For this article, the examples use the following table with id , Category , and Total columns.
Remove duplicates
One of the operations that you can perform is to remove duplicate values from your table.
1. Select the columns that contain duplicate values.
2. Go to the Home tab.
3. In the Reduce rows group, select Remove rows .
4. From the drop-down menu, select Remove duplicates .
WARNING
There's no guarantee that the first instance in a set of duplicates will be chosen when duplicates are removed. To learn
more about how to preserve sorting, go to Preserve sort.
You have four rows that are duplicates. Your goal is to remove those duplicate rows so there are only unique
rows in your table. Select all columns from your table, and then select Remove duplicates .
The result of that operation will give you the table that you're looking for.
NOTE
This operation can also be performed with a subset of columns.
You want to remove those duplicates and only keep unique values. To remove duplicates from the Category
column, select it, and then select Remove duplicates .
The result of that operation will give you the table that you're looking for.
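In M, both variations use Table.Distinct; a sketch assuming the previous step is named Source:
// Remove duplicates across all columns
DistinctRows = Table.Distinct(Source),
// Remove duplicates based only on the Category column
DistinctCategories = Table.Distinct(Source, {"Category"})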
Keep duplicates
Another operation you can perform with duplicates is to keep only the duplicates found in your table.
1. Select the columns that contain duplicate values.
2. Go to the Home tab.
3. In the Reduce rows group, select Keep rows .
4. From the drop-down menu, select Keep duplicates .
You have four rows that are duplicates. Your goal in this example is to keep only the rows that are duplicated in
your table. Select all the columns in your table, and then select Keep duplicates .
The result of that operation will give you the table that you're looking for.
Keep duplicates from a single column
In this example, you want to identify and keep the duplicates by using only the id column from your table.
In this example, you have multiple duplicates and you want to keep only those duplicates from your table. To
keep duplicates from the id column, select the id column, and then select Keep duplicates .
The result of that operation will give you the table that you're looking for.
See also
Data profiling tools
Fill values in a column
You can use fill up and fill down to replace null values with the last non-empty value in a column. For example,
imagine the following table where you'd like to fill down in the Date column and fill up in the Comments
column.
Fill down
The fill down operation takes a column and traverses through the values in it to fill any null values in the next
rows until it finds a new value. This process continues on a row-by-row basis until there are no more values in
that column.
In the following example, you want to fill down on the Date column. To do that, you can right-click to select the
Date column, and then select Fill > Down .
The result of that operation will look like the following image.
Fill up
In the same way as the fill down operation, fill up works on a column. But by contrast, fill up finds the last value
of the column and fills any null values in the previous rows until it finds a new value. Then the same process
occurs for that value. This process continues until there are no more values in that column.
In the following example, you want to fill the Comments column from the bottom up. You'll notice that your
Comments column doesn't have null values. Instead it has what appears to be empty cells. Before you can do
the fill up operation, you need to transform those empty cells into null values: select the column, go to the
Transform tab, and then select Replace values .
In the Replace values dialog box, leave Value to find blank. For Replace with , enter null .
The result of that operation will look like the following image.
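The full sequence roughly corresponds to the following M steps (a sketch that assumes the previous step is named Source and the columns are named Date and Comments):
// Turn the empty text cells into nulls so they can be filled
ReplacedBlanks = Table.ReplaceValue(Source, "", null, Replacer.ReplaceValue, {"Comments"}),
// Fill the Date column downward and the Comments column upward
FilledDown = Table.FillDown(ReplacedBlanks, {"Date"}),
FilledUp = Table.FillUp(FilledDown, {"Comments"})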
Cleaning up your table
1. Filter the Units column to show only rows that aren't equal to null .
2. Rename the Comments column as Sales Person .
3. Remove the Sales Person: values from the Sales Person column so you only get the names of the
salespeople.
Now you should have exactly the table you were looking for.
See also
Replace values
Sort columns
You can sort a table in Power Query by one column or multiple columns. For example, take the following table
with the columns named Competition , Competitor , and Position .
Table with Competition, Competitor, and Position columns. The Competition column contains 1 - Opening in
rows 1 and 6, 2 - Main in rows 3 and 5, and 3-Final in rows 2 and 4. The Position row contains a value of either 1
or 2 for each of the Competition values.
For this example, the goal is to sort this table by the Competition and Position fields in ascending order.
Table with Competition, Competitor, and Position columns. The Competition column contains 1 - Opening in
rows 1 and 2, 2 - Main in rows 3 and 4, and 3-Final in rows 5 and 6. The Position row contains, from top to
bottom, a value of 1, 2, 1, 2, 1, and 2.
You can sort the table from the column heading drop-down menu. Next to the name of the column there's a drop-down menu
indicator. When you select the icon, you'll see the option to sort the column.
In this example, first you need to sort the Competition column. You'll perform the operation by using the
buttons in the Sort group on the Home tab. This action creates a new step in the Applied steps section named
Sorted rows .
A visual indicator, displayed as an arrow pointing up, gets added to the Competition drop-down menu icon to
show that the column is being sorted in ascending order.
Now you'll sort the Position field in ascending order as well, but this time you'll use the Position column
heading drop-down menu.
Notice that this action doesn't create a new Sorted rows step, but modifies it to perform both sort operations
in one step. When you sort multiple columns, the order that the columns are sorted in is based on the order the
columns were selected in. A visual indicator, displayed as a number to the left of the drop-down menu indicator,
shows the place each column occupies in the sort order.
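The combined sort corresponds to a single Table.Sort step; a minimal sketch, assuming the previous step is named Source:
// Sort by Competition, then by Position, both ascending
= Table.Sort(Source, {{"Competition", Order.Ascending}, {"Position", Order.Ascending}})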
Rename columns
In Power Query, you can rename columns to format the dataset in a clear and concise way.
As an example, let's start with a dataset that has two columns.
Column 1          Column 2
Panama            Panama
Canada            Toronto
The column headers are Column 1 and Column 2 , but you want to change those names to more friendly
names for your columns.
Column 1 becomes Country
Column 2 becomes City
The end result that you want in Power Query looks like the following table.
Country           City
Panama            Panama
Canada            Toronto
You can rename a column in either of these ways:
Right-click the column of your choice : A contextual menu is displayed and you can select the
Rename option to rename the selected column.
Rename option in the Transform tab : In the Transform tab, under the Any column group, select the
Rename option.
NOTE
To learn more about how to promote headers from your first row, go to Promote or demote column headers.
Expanding a column with a field name that also exists in the current table : This can happen, for
example, when you perform a merge operation and the column with the merged table has field names
that also exist in the table. When you try to expand the fields from that column, Power Query
automatically tries to disambiguate to prevent Column Name Conflict errors.
Move columns
Move option
The following example shows the different ways of moving columns. This example focuses on moving the
Contact Name column.
You move the column by using the Move option. This option is located in the Any column group under the
Transform tab. In the Move option, the available choices are:
Before
After
To beginning
To end
You can also find this option when you right-click a column.
If you want to move one column to the left, then select Before .
The new location of the column is now one column to the left of its original location.
If you want to move one column to the right, then select After .
The new location of the column is now one column to the right of its original location.
If you want to move the column to the leftmost position of the dataset, then select To beginning .
The new location of the column is now on the far left side of the table.
If you want to move the column to the rightmost position of the dataset, then select To end .
The new location of the column is now on the far right side of the table.
Go to column feature
If you want to find a specific column, then go to the View tab in the ribbon and select Go to column .
From there, you can specifically select the column you would like to view, which is especially useful if there are
many columns.
Replace values and errors
With Power Query, you can replace one value with another value wherever that value is found in a column. The
Replace values command can be found:
On the cell shortcut menu. Right-click the cell to replace the selected value in the column with another
value.
The value of -1 in the Sales Goal column is an error in the source and needs to be replaced with the standard
sales goal defined by the business for these instances, which is 250,000. To do that, right-click the -1 value, and
then select Replace values . This action will bring up the Replace values dialog box with Value to find set to
-1 . Now all you need to do is enter 250000 in the Replace with box.
The outcome of that operation will give you the result that you're looking for.
The result of that operation gives you the table in the following image.
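The replacement roughly corresponds to the following M (a sketch that assumes the previous step is named Source):
// Replace -1 with 250000 in the Sales Goal column
= Table.ReplaceValue(Source, -1, 250000, Replacer.ReplaceValue, {"Sales Goal"})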
Parse text as JSON or XML
In Power Query, you can parse the contents of a column with text strings by identifying the contents as either a
JSON or XML text string.
You can perform this parse operation by selecting the Parse button found inside the following places in the
Power Query Editor:
Transform tab —This button will transform the existing column by parsing its contents.
Add column tab —This button will add a new column to the table parsing the contents of the selected
column.
For this article, you'll be using the following sample table, which contains the columns that you need to
parse:
SalesPerson —Contains unparsed JSON text strings with information about the FirstName and
LastName of the sales person, as in the following example.
{
"id" : 249319,
"FirstName": "Lesa",
"LastName": "Byrd"
}
Country —Contains unparsed XML text strings with information about the Country and the Division
that the account has been assigned to, as in the following example.
<root>
<id>1</id>
<Country>USA</Country>
<Division>BI-3316</Division>
</root>
The sample table looks as follows.
The goal is to parse the columns mentioned above and expand the contents of those columns to get this output.
As JSON
Select the SalesPerson column. Then select JSON from the Parse dropdown menu inside the Transform tab.
These steps will transform the SalesPerson column from having text strings to having Record values, as
shown in the next image. You can select anywhere in the whitespace inside the cell of the Record value to get a
detailed preview of the record contents at the bottom of the screen.
Select the expand icon next to the SalesPerson column header. From the expand columns menu, select only the
FirstName and LastName fields, as shown in the following image.
The result of that operation will give you the following table.
As XML
Select the Country column. Then select the XML button from the Parse dropdown menu inside the Transform
tab. These steps will transform the Country column from having text strings to having Table values as shown
in the next image. You can select anywhere in the whitespace inside the cell of the Table value to get a detailed
preview of the contents of the table at the bottom of the screen.
Select the expand icon next to the Country column header. From the expand columns menu, select only the
Country and Division fields, as shown in the following image.
You can define all the new columns as text columns. The result of that operation will give you the output table
that you're looking for.
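Both parse operations use Table.TransformColumns; a rough sketch, assuming the previous step is named Source:
// Parse the JSON strings in SalesPerson and the XML strings in Country
ParsedJson = Table.TransformColumns(Source, {{"SalesPerson", Json.Document}}),
ParsedXml = Table.TransformColumns(ParsedJson, {{"Country", Xml.Tables}})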
Add a column from examples
When you add columns from examples, you can quickly and easily create new columns that meet your needs.
This is useful for the following situations:
You know the data you want in your new column, but you're not sure which transformation, or collection of
transformations, will get you there.
You already know which transformations you need, but you're not sure what to select in the UI to make them
happen.
You know all about the transformations you need by using a custom column expression in the M language,
but one or more of those transformations aren't available in the UI.
The Column from examples command is located on the Add column tab, in the General group.
The preview pane displays a new, editable column where you can enter your examples. For the first example, the
value from the selected column is 19500. So in your new column, enter the text 15000 to 20000 , which is the
bin where that value falls.
When Power Query finds a matching transformation, it fills the transformation results into the remaining rows
using light-colored text. You can also see the M formula text for the transformation above the table preview.
After you select OK , you'll see your new column as part of your query. You'll also see a new step added to your
query.
Your last step is to remove the First Name , Last Name , and Monthly Income columns. Your final table now
contains the Range and Full Name columns with all the data you produced in the previous steps.
Tips and considerations
When providing examples, Power Query offers a helpful list of available fields, values, and suggested
transformations for the selected columns. You can view this list by selecting any cell of the new column.
It's important to note that the Column from examples experience works only on the top 100 rows of your
data preview. You can apply steps before the Column from examples step to create your own data sample.
After the Column from examples column has been created, you can delete those prior steps; the newly
created column won't be affected.
NOTE
All Text transformations take into account the potential need to trim, clean, or apply a case transformation to the column
value.
Date transformations
Day
Day of Week
Day of Week Name
Day of Year
Month
Month Name
Quarter of Year
Week of Month
Week of Year
Year
Age
Start of Year
End of Year
Start of Month
End of Month
Start of Quarter
Days in Month
End of Quarter
Start of Week
End of Week
Day of Month
Start of Day
End of Day
Time transformations
Hour
Minute
Second
To Local Time
NOTE
All Date and Time transformations take into account the potential need to convert the column value to Date, Time, or
DateTime.
Number transformations
Absolute Value
Arccosine
Arcsine
Arctangent
Convert to Number
Cosine
Cube
Divide
Exponent
Factorial
Integer Divide
Is Even
Is Odd
Ln
Base-10 Logarithm
Modulo
Multiply
Round Down
Round Up
Sign
Sine
Square Root
Square
Subtract
Sum
Tangent
Bucketing/Ranges
Add an index column
The Index column command adds a new column to the table with explicit position values, and is usually
created to support other transformation patterns.
By default, the starting index will start from the value 0 and have an increment of 1 per row.
You can also configure the behavior of this step by selecting the Custom option and configuring two
parameters:
Starting index : Specifies the initial index value.
Increment : Specifies how much to increment each index value.
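Both the default and the custom behavior use Table.AddIndexColumn. The following sketch shows the two as alternative steps, assuming the previous step is named Source (the custom starting index and increment are arbitrary values for illustration):
// Default: start at 0 and increment by 1
DefaultIndex = Table.AddIndexColumn(Source, "Index", 0, 1),
// Custom: start at 10 and increment by 5
CustomIndex = Table.AddIndexColumn(Source, "Index", 10, 5)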
For the example in this article, you start with the following table that has only one column, but notice the data
pattern in the column.
Let's say that your goal is to transform that table into the one shown in the following image, with the columns
Date , Account , and Sale .
In the Modulo dialog box, enter the number from which to find the remainder for each value in the column. In
this case, your pattern repeats itself every three rows, so you'll enter 3 .
The result of that operation will give you a new column named Modulo .
Remove the Index column, because you no longer need it. Your table now looks like the following image.
Add a custom column
If you need more flexibility for adding new columns than the ones provided out of the box in Power Query, you
can create your own custom column using the Power Query M formula language.
Imagine that you have a table with the following set of columns.
Using the Units , Unit Price , and Discount columns, you'd like to create two new columns:
Total Sale before Discount : Calculated by multiplying the Units column times the Unit Price column.
Total Sale after Discount : Calculated by multiplying the Total Sale before Discount column by the net
percentage value (one minus the discount value).
The goal is to create a table with new columns that contain the total sales before the discount and the total sales
after the discount.
The Custom column dialog box appears. This dialog box is where you define the formula to create your
column.
The Custom column dialog box contains:
The initial name of your custom column in the New column name box. You can rename this column.
A dropdown menu where you can select the data type for your new column.
An Available columns list on the right underneath the Data type selection.
A Custom column formula box where you can enter a Power Query M formula.
To add a new custom column, select a column from the Available columns list. Then, select the Insert
column button below the list to add it to the custom column formula. You can also add a column by selecting it
in the list. Alternatively, you can write your own formula by using the Power Query M formula language in
Custom column formula .
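For the two columns described earlier, the formulas would look roughly like the following (a sketch that assumes the previous step is named Source):
// Total sale before the discount is applied
AddedBeforeDiscount = Table.AddColumn(Source, "Total Sale before Discount", each [Units] * [Unit Price], type number),
// Total sale after applying the discount percentage
AddedAfterDiscount = Table.AddColumn(AddedBeforeDiscount, "Total Sale after Discount", each [Total Sale before Discount] * (1 - [Discount]), type number)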
NOTE
If a syntax error occurs when you create your custom column, you'll see a yellow warning icon, along with an error
message and reason.
NOTE
If you're using Power Query Desktop, you'll notice that the Data type field isn't available in Custom column . This
means that you'll need to define a data type for any custom columns after creating the columns. More information: Data
types in Power Query
NOTE
Depending on the formula you've used for your custom column, Power Query changes the settings behavior of your step
for a more simplified and native experience. For this example, the Added custom step changed its behavior from a
standard custom column step to a Multiplication experience because the formula from that step only multiplies the values
from two columns.
Next steps
You can create a custom column in other ways, such as creating a column based on examples you provide to
Power Query Editor. More information: Add a column from an example
For Power Query M reference information, go to Power Query M function reference.
Add a conditional column
With Power Query, you can create new columns whose values will be based on one or more conditions applied
to other columns in your table.
The Conditional column command is located on the Add column tab, in the General group.
In this table, you have a field that gives you the CustomerGroup . You also have different prices applicable to
that customer in the Tier 1 Price , Tier 2 Price , and Tier 3 Price fields. In this example, your goal is to create a
new column with the name Final Price based on the value found in the CustomerGroup field. If the value in
the CustomerGroup field is equal to 1, you'll want to use the value from the Tier 1 Price field; otherwise,
you'll use the value from the Tier 3 Price .
To add this conditional column, select Conditional column . In the Add conditional column dialog box, you
can define three sections numbered in the following image.
1. New column name : You can define the name of your new column. In this example, you'll use the name
Final Price .
2. Conditional clauses : Here you define your conditional clauses. You can add more clauses by selecting Add
clause . Each conditional clause will be tested in the order shown in the dialog box, from top to bottom. Each
clause has four parts:
Column name : In the drop-down list, select the column to use for the conditional test. For this
example, select CustomerGroup .
Operator : Select the type of test or operator for the conditional test. In this example, the value from
the CustomerGroup column has to be equal to 1, so select equals .
Value : You can enter a value or select a column to be used for the conditional test. For this example,
enter 1 .
Output : If the test is positive, the value entered here or the column selected will be the output. For this
example, if the CustomerGroup value is equal to 1, your Output value should be the value from the
Tier 1 Price column.
3. Final Else clause : If none of the clauses above yield a positive test, the output of this operation will be the
one defined here, as a manually entered value or a value from a column. In this case, the output will be the
value from the Tier 3 Price column.
The result of that operation will give you a new Final Price column.
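Behind the scenes, a conditional column is a Table.AddColumn step built around an if expression; a minimal sketch, assuming the previous step is named Source:
// Use Tier 1 Price for customer group 1, otherwise fall back to Tier 3 Price
= Table.AddColumn(Source, "Final Price", each if [CustomerGroup] = 1 then [Tier 1 Price] else [Tier 3 Price])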
NOTE
New conditional columns won't have a data type defined. You can add a new step to define a data type for this newly
created column by following the steps described in Data types in Power Query.
The result of that operation will give you the result that you're looking for.
Cluster values
The Cluster values operation automatically creates groups with similar values by using a fuzzy matching algorithm, and
then maps each column's value to the best-matched group. This transform is very useful when you're working with data
that has many different variations of the same value and you need to combine values into consistent groups.
Consider a sample table with an id column that contains a set of IDs and a Person column containing a set of
variously spelled and capitalized versions of the names Miguel, Mike, William, and Bill.
In this example, the outcome you're looking for is a table with a new column that shows the right groups of
values from the Person column and not all the different variations of the same words.
NOTE
The Cluster values feature is available only for Power Query Online.
The result of that operation yields the result shown in the next image.
NOTE
For each cluster of values, Power Query picks the most frequent instance from the selected column as the "canonical"
instance. If multiple instances occur with the same frequency, Power Query picks the first one.
IMPORTANT
It's important that the transformation table has the same columns and column names as shown in the previous image
(they have to be named "From" and "To"), otherwise Power Query won't recognize this table as a transformation table,
and no transformation will take place.
Using the previously created query, double-click the Clustered values step, then in the Cluster values dialog
box, expand Fuzzy cluster options . Under Fuzzy cluster options, enable the Show similarity scores option.
For Transformation table (optional) , select the query that has the transform table.
After selecting your transformation table and enabling the Show similarity scores option, select OK . The
result of that operation will give you a table that contains the same id and Person columns as the original table,
but also includes two new columns on the right called Cluster and Person_Cluster_Similarity . The Cluster
column contains the properly spelled and capitalized versions of the names Miguel for versions of Miguel and
Mike, and William for versions of Bill, Billy, and William. The Person_Cluster_Similarity column contains the
similarity scores for each of the names.
Append queries
The append operation creates a single table by adding the contents of one or more tables to another, and
aggregates the column headers from the tables to create the schema for the new table.
NOTE
When tables that don't have the same column headers are appended, all column headers from all tables are appended to
the resulting table. If one of the appended tables doesn't have a column header from other tables, the resulting table
shows null values in the respective column, as shown in the previous image in columns C and D.
You can find the Append queries command on the Home tab in the Combine group. On the drop-down
menu, you'll see two options:
Append queries displays the Append dialog box to add additional tables to the current query.
Append queries as new displays the Append dialog box to create a new query by appending multiple
tables.
The append operation requires at least two tables. The Append dialog box has two modes:
Two tables : Combine two table queries together. This mode is the default mode.
Three or more tables : Allow an arbitrary number of table queries to be combined.
NOTE
The tables will be appended in the order in which they're selected, starting with the Primary table for the Two tables
mode and from the primary table in the Tables to append list for the Three or more tables mode.
To append these tables, first select the Online Sales table. On the Home tab, select Append queries , which
creates a new step in the Online Sales query. The Online Sales table will be the primary table. The table to
append to the primary table will be Store Sales .
Power Query performs the append operation based on the names of the column headers found on both tables,
and not based on their relative position in the headers sections of their respective tables. The final table will have
all columns from all tables appended.
In the event that one table doesn't have columns found in another table, null values will appear in the
corresponding column, as shown in the Referer column of the final query.
Append three or more tables
In this example, you want to append not only the Online Sales and Store Sales tables, but also a new table
named Wholesale Sales .
The new approach for this example is to select Append queries as new , and then in the Append dialog box,
select the Three or more tables option button. In the Available table(s) list, select each table you want to
append, and then select Add . After all the tables you want appear in the Tables to append list, select OK .
After selecting OK , a new query will be created with all your tables appended.
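Behind the scenes, both append options rely on the Table.Combine function. The following is a minimal M sketch; the query names mirror this example, but the columns and rows are invented.

    let
        // Hypothetical stand-ins for the Online Sales, Store Sales, and Wholesale Sales queries
        OnlineSales = #table(type table [Channel = text, Units = Int64.Type], {{"Online", 10}}),
        StoreSales = #table(type table [Channel = text, Units = Int64.Type, Referer = text],
            {{"Store", 5, "Walk-in"}}),
        WholesaleSales = #table(type table [Channel = text, Units = Int64.Type], {{"Wholesale", 100}}),
        // Append by column name: columns missing from a table are filled with null in the result
        Appended = Table.Combine({OnlineSales, StoreSales, WholesaleSales})
    in
        Appended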
Combine files overview
5/25/2022 • 4 minutes to read • Edit Online
With Power Query, you can combine multiple files that have the same schema into a single logical table.
This feature is useful when you want to combine all the files you have in the same folder. For example, if you
have a folder that contains monthly files with all the purchase orders for your company, you can combine these
files to consolidate the orders into a single view.
Files can come from a variety of sources, such as (but not limited to):
Local folders
SharePoint sites
Azure Blob storage
Azure Data Lake Storage (Gen1 and Gen2)
When working with these sources, you'll notice that they share the same table schema, commonly referred to as
the file system view . The following screenshot shows an example of the file system view.
In the file system view, the Content column contains the binary representation of each file.
NOTE
You can filter the list of files in the file system view by using any of the available fields. It's good practice to filter this view
to show only the files you need to combine, for example by filtering fields such as Extension or Folder Path . More
information: Folder
Selecting any of the [Binary] values in the Content column automatically creates a series of navigation steps to
that specific file. Power Query will try to interpret the binary by using one of the available connectors, such as
Text/CSV, Excel, JSON, or XML.
Combining files takes place in the following stages:
Table preview
Combine files dialog box
Combined files output
Table preview
When you connect to a data source by using any of the previously mentioned connectors, a table preview opens.
If you're certain that you want to combine all the files in the folder, select Combine in the lower-right corner of
the screen.
Alternatively, you can select Transform data to access the Power Query Editor and create a subset of the list of
files (for example, by using filters on the folder path column to only include files from a specific subfolder). Then
combine files by selecting the column that contains the binaries in the Content column and then selecting
either:
The Combine files command in the Combine group on the Home tab.
The Combine files icon in the column header of the column that contains [Binary] values.
NOTE
You can modify the steps inside the example query to change the function applied to each binary in your query. The
example query is linked to the function, so any changes made to the example query will be reflected in the function query.
If any of the changes affect column names or column data types, be sure to check the last step of your output query.
Adding a Change column type step can introduce a step-level error that prevents you from visualizing your table.
More information: Dealing with errors
See also
Combine CSV files
Combine CSV files
5/25/2022 • 5 minutes to read • Edit Online
In Power Query, you can combine multiple files from a given data source. This article describes how the
experience works when the files that you want to combine are CSV files. More information: Combine files
overview
TIP
You can follow along with this example by downloading the sample files used in this article from this download link. You
can place those files in the data source of your choice, such as a local folder, SharePoint folder, Azure Blob storage, Azure
Data Lake Storage, or other data source that provides the file system view.
For simplicity, the example in this article uses the Folder connector. More information: Folder
The number of rows varies from file to file, but all files have a header section in the first four rows. They have
column headers in the fifth row, and the data for the table begins in the sixth row and continues through all
subsequent rows.
The goal is to combine all 12 files into a single table. This combined table contains the header row at the top of
the table, and includes the source name, date, country, units, and revenue data for the entire year in separate
columns after the header row.
Table preview
When connecting to the folder that hosts the files that you want to combine—in this example, the name of that
folder is CSV Files —you're shown the table preview dialog box, which displays your folder path in the upper-
left corner. The data preview shows the file system view.
NOTE
In a different situation, you might select Transform data to further filter and transform your data before combining the
files. Selecting Combine is only recommended when you're certain that the folder contains only the files that you want to
combine.
For this example, leave all the default settings (Example file set to First file , and the default values for File
origin , Delimiter , and Data type detection ).
Now select Transform data in the lower-right corner to go to the output query.
Output query
After selecting Transform data in the Combine files dialog box, you'll be taken back to the Power Query
Editor in the query that you initially created from the connection to the local folder. The output query now
contains the source file name in the left-most column, along with the data from each of the source files in the
remaining columns.
However, the data isn't in the correct shape. You need to remove the top four rows from each file before
combining them. To make this change in each file before you combine them, select the Transform Sample file
query in the Queries pane on the left side of your screen.
Modify the Transform Sample file query
In this Transform Sample file query, the values in the Date column, which use the year-month-day (YYYY-MM-DD)
format, indicate that the data is for the month of April. April 2019.csv is the first file that's displayed in the
table preview.
You now need to apply a new set of transformations to clean the data. Each transformation will be automatically
converted to a function inside the Helper queries group that will be applied to every file in the folder before
combining the data from each file.
The transformations that need to be added to the Transform Sample file query are:
1. Remove top rows : To perform this operation, select the table icon menu in the upper-left corner of the
table, and then select Remove top rows .
In the Remove top rows dialog box, enter 4 , and then select OK .
After selecting OK , your table will no longer have the top four rows.
2. Use first row as headers : Select the table icon again, and then select Use first row as headers .
The result of that operation will promote the first row of the table to the new column headers.
After this operation is completed, Power Query by default will try to automatically detect the data types of the
columns and add a new Changed column type step.
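For reference, the following M sketch reproduces these two steps, plus a data type step, against a single invented CSV file. The file contents are made up, and typing Units as a whole number is an assumption.

    let
        // Invented CSV content standing in for one file in the folder:
        // four header-section rows, then column headers, then data
        SampleFile = Text.ToBinary(
            "Report#(lf)Month: April#(lf)Region: All#(lf)Generated: 2019-05-01#(lf)"
            & "Date,Country,Units,Revenue#(lf)2019-04-01,USA,10,100.00"),
        Imported = Csv.Document(SampleFile, [Delimiter = ","]),
        // Remove top rows: drop the four header-section rows
        RemovedTopRows = Table.Skip(Imported, 4),
        // Use first row as headers
        PromotedHeaders = Table.PromoteHeaders(RemovedTopRows, [PromoteAllScalars = true]),
        // Assign data types (Units as a whole number is an assumption)
        ChangedType = Table.TransformColumnTypes(PromotedHeaders,
            {{"Date", type date}, {"Country", type text}, {"Units", Int64.Type}, {"Revenue", Currency.Type}})
    in
        ChangedType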
Revising the output query
When you go back to the CSV Files query, you'll notice that the last step is giving you an error that reads "The
column 'Column1' of the table wasn't found." The reason behind this error is that the previous state of the query
was doing an operation against a column named Column1 . But because of the changes made to the
Transform Sample file query, this column no longer exists. More information: Dealing with errors in Power
Query
You can remove this last step of the query from the Applied steps pane by selecting the X delete icon on the
left side of the name of the step. After deleting this step, your query will show the correct results.
However, notice that none of the columns derived from the files (Date, Country, Units, Revenue) have a specific
data type assigned to them. Assign the correct data type to each column by using the following table.
COLUMN NAME    DATA TYPE
Date           Date
Country        Text
Revenue        Currency
After defining the data types for each column, you'll be ready to load the table.
NOTE
To learn how to define or change column data types, see Data types.
Verification
To validate that all files have been combined, you can select the filter icon on the Source.Name column
heading, which will display all the names of the files that have been combined. If you get the warning "List may
be incomplete," select Load more at the bottom of the menu to display more available values in the column.
After you select Load more , all available file names will be displayed.
Merge queries overview
5/25/2022 • 4 minutes to read • Edit Online
A merge queries operation joins two existing tables together based on matching values from one or multiple
columns. You can choose to use different types of joins, depending on the output you want.
Merging queries
You can find the Merge queries command on the Home tab, in the Combine group. From the drop-down
menu, you'll see two options:
Merge queries : Displays the Merge dialog box, with the selected query as the left table of the merge
operation.
Merge queries as new : Displays the Merge dialog box without any preselected tables for the merge
operation.
NOTE
Although this example shows the same column header for both tables, this isn't a requirement for the merge operation.
Column headers don't need to match between tables. However, it's important to note that the columns must be of the
same data type, otherwise the merge operation might not yield correct results.
You can also select multiple columns to perform the join by selecting Ctrl as you select the columns. When you
do so, the order in which the columns were selected is displayed in small numbers next to the column headings,
starting with 1.
For this example, you have the Sales and Countries tables. Each of the tables has CountryID and StateID
columns, which you need to pair for the join between both columns.
First select the CountryID column in the Sales table, select Ctrl , and then select the StateID column. (This will
show the small numbers in the column headings.) Next, perform the same selections in the Countries table.
The following image shows the result of selecting those columns.
Merge dialog box with the Left table for merge set to Sales, with the CountryID and StateID columns selected,
and the Right table for merge set to Countries, with the CountryID and StateID columns selected. The Join kind is
set to Left outer.
Expand or aggregate the new merged table column
After selecting OK in the Merge dialog box, the base table of your query will have all the columns from your left
table. Also, a new column will be added with the same name as your right table. This column holds the values
corresponding to the right table on a row-by-row basis.
From here, you can choose to expand or aggregate the fields from this new table column, which will be the
fields from your right table.
Table showing the merged Countries column on the right, with all rows containing a Table. The expand icon on
the right of the Countries column header has been selected, and the expand menu is open. The expand menu
has the Select all, CountryID, StateID, Country, and State selections selected. The Use original column name as
prefix is also selected.
NOTE
Currently, the Power Query Online experience only provides the expand operation in its interface. The option to aggregate
will be added later this year.
Join kinds
A join kind specifies how a merge operation will be performed. The following table describes the available join
kinds in Power Query.
JOIN KIND      DESCRIPTION
Left outer     Keeps all rows from the left table, and brings in any matching rows from the right table
Right outer    Keeps all rows from the right table, and brings in any matching rows from the left table
Full outer     Brings in all rows from both the left and right tables
Inner          Brings in only matching rows from both the left and right tables
Left anti      Brings in only rows from the left table that don't have any matching rows from the right table
Right anti     Brings in only rows from the right table that don't have any matching rows from the left table
Fuzzy matching
You use fuzzy merge to apply fuzzy matching algorithms when comparing columns, to try to find matches
across the tables you're merging. You can enable this feature by selecting the Use fuzzy matching to perform
the merge check box in the Merge dialog box. Expand Fuzzy matching options to view all available
configurations.
NOTE
Fuzzy matching is only supported for merge operations over text columns.
Left outer join
5/25/2022 • 2 minutes to read • Edit Online
One of the join kinds available in the Merge dialog box in Power Query is a left outer join, which keeps all the
rows from the left table and brings in any matching rows from the right table. More information: Merge
operations overview
Figure shows a table on the left with Date, CountryID, and Units columns. The emphasized CountryID column
contains values of 1 in rows 1 and 2, 3 in row 3, and 4 in row 4. A table on the right contains ID and Country
columns. The emphasized ID column contains values of 1 in row 1 (denoting USA), 2 in row 2 (denoting
Canada), and 3 in row 3 (denoting Panama). A table below the first two tables contains Date, CountryID, Units,
and Country columns. The table has four rows, with the top two rows containing the data for CountryID 1, one
row for CountryID 3, and one row for Country ID 4. Since the right table didn't contain an ID of 4, the value of
the fourth row in the Country column contains null.
This article uses sample data to show how to do a merge operation with the left outer join. The sample source
tables for this example are:
Sales : This table includes the fields Date , CountryID , and Units . CountryID is a whole number value
that represents the unique identifier from the Countries table.
Countries : This table is a reference table with the fields id and Country . The id field represents the
unique identifier for each record.
Countries table with id set to 1 in row 1, 2 in row 2, and 3 in row 3, and Country set to USA in row
1, Canada in row 2, and Panama in row 3.
In this example, you'll merge both tables, with the Sales table as the left table and the Countries table as the
right one. The join will be made between the following columns.
FIELD FROM THE SALES TABLE    FIELD FROM THE COUNTRIES TABLE
CountryID                     id
The goal is to create a table like the following, where the name of the country appears as a new Country
column in the Sales table as long as the CountryID exists in the Countries table. If there are no matches
between the left and right tables, a null value is the result of the merge for that row. In the following image, this
is shown to be the case for CountryID 4, which was brought in from the Sales table.
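A minimal M sketch of this left outer join follows. The CountryID and Country values match the tables described above, while the dates and unit counts are invented.

    let
        // Sales (left) table: dates and units are invented for illustration
        Sales = #table(type table [Date = date, CountryID = Int64.Type, Units = Int64.Type],
            {{#date(2020, 1, 1), 1, 40}, {#date(2020, 1, 2), 1, 25},
             {#date(2020, 1, 3), 3, 30}, {#date(2020, 1, 4), 4, 35}}),
        // Countries (right) reference table
        Countries = #table(type table [id = Int64.Type, Country = text],
            {{1, "USA"}, {2, "Canada"}, {3, "Panama"}}),
        // Left outer join on CountryID = id; unmatched CountryID 4 gets a null Country
        Merged = Table.NestedJoin(Sales, {"CountryID"}, Countries, {"id"}, "Countries", JoinKind.LeftOuter),
        // Expand only the Country field, without the original column name prefix
        Expanded = Table.ExpandTableColumn(Merged, "Countries", {"Country"}, {"Country"})
    in
        Expanded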
Right outer join
One of the join kinds available in the Merge dialog box in Power Query is a right outer join, which keeps all the
rows from the right table and brings in any matching rows from the left table. More information: Merge
operations overview
Figure shows a table on the left with Date, CountryID, and Units columns. The emphasized CountryID column
contains values of 1 in rows 1 and 2, 3 in row 3, and 4 in row 4. A table on the right contains ID and Country
columns, with only one row. The emphasized ID column contains a value of 3 in row 1 (denoting Panama). A
table below the first two tables contains Date, CountryID, Units, and Country columns. The table has one row,
with the CountryID of 3 and the Country of Panama.
This article uses sample data to show how to do a merge operation with the right outer join. The sample source
tables for this example are:
Sales : This table includes the fields Date , CountryID , and Units . The CountryID is a whole number
value that represents the unique identifier from the Countries table.
Countries : This table is a reference table with the fields id and Country . The id field represents the
unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the Countries table as the
right one. The join will be made between the following columns.
FIELD FROM THE SALES TABLE    FIELD FROM THE COUNTRIES TABLE
CountryID                     id
The goal is to create a table like the following, where the name of the country appears as a new Country
column in the Sales table. Because of how the right outer join works, all rows from the right table will be
brought in, but only matching rows from the left table will be kept.
From the newly created Countries column, expand the Country field. Don't select the Use original column
name as prefix check box.
After performing this operation, you'll create a table that looks like the following image.
Full outer join
5/25/2022 • 3 minutes to read • Edit Online
One of the join kinds available in the Merge dialog box in Power Query is a full outer join, which brings in all
the rows from both the left and right tables. More information: Merge operations overview
Figure shows a table on the left with Date, CountryID, and Units columns. The emphasized CountryID column
contains values of 1 in rows 1 and 2, 3 in row 3, and 2 in row 4. A table on the right contains ID and Country
columns. The emphasized ID column contains values of 1 in row 1 (denoting USA), 2 in row 2 (denoting
Canada), 3 in row 3 (denoting Panama), and 4 (denoting Spain) in row 4. A table below the first two tables
contains Date, CountryID, Units, and Country columns. All rows have been rearranged in numerical order
according to the CountryID value. The country associated with the CountryID number is shown in the Country
column. Because the country ID for Spain wasn't contained in the left table, a new row is added, and the date,
country ID, and units values for this row are set to null.
This article uses sample data to show how to do a merge operation with the full outer join. The sample source
tables for this example are:
Sales : This table includes the fields Date , CountryID , and Units . CountryID is a whole number value
that represents the unique identifier from the Countries table.
Countries : This is a reference table with the fields id and Country . The id field represents the unique
identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the Countries table as the
right one. The join will be made between the following columns.
FIELD FROM THE SALES TABLE    FIELD FROM THE COUNTRIES TABLE
CountryID                     id
The goal is to create a table like the following, where the name of the country appears as a new Country
column in the Sales table. Because of how the full outer join works, all rows from both the left and right tables
will be brought in, regardless of whether they only appear in one of the tables.
Full outer join final table with Date, a CountryID, and Units derived from the Sales table, and a Country column
derived from the Countries table. A fifth row was added to contain data from Spain, but that row contains null in
the Date, CountryID, and Units columns since those values did not exist for Spain in the Sales table.
To perform a full outer join
1. Select the Sales query, and then select Merge queries .
2. In the Merge dialog box, under Right table for merge , select Countries .
3. In the Sales table, select the CountryID column.
4. In the Countries table, select the id column.
5. In the Join kind section, select Full outer .
6. Select OK .
TIP
Take a closer look at the message at the bottom of the dialog box that reads "The selection matches 4 of 4 rows from the
first table, and 3 of 4 rows from the second table." This message is crucial for understanding the result that you get from
this operation.
In the Countries table, you have the Country Spain with id of 4, but there are no records for CountryID 4 in
the Sales table. That's why only three of four rows from the right table found a match. All rows from the right
table that didn't have matching rows from the left table will be grouped and shown in a new row in the output
table with no values for the fields from the left table.
From the newly created Countries column after the merge operation, expand the Country field. Don't select
the Use original column name as prefix check box.
After performing this operation, you'll create a table that looks like the following image.
Full outer join final table containing Date, a CountryID, and Units derived from the Sales table, and a Country
column derived from the Countries table. A fifth row was added to contain data from Spain, but that row
contains null in the Date, CountryID, and Units columns since those values didn't exist for Spain in the Sales
table.
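At the M level, the only difference from the earlier left outer sketch is the JoinKind argument. A minimal sketch with invented dates and units:

    let
        Sales = #table(type table [Date = date, CountryID = Int64.Type, Units = Int64.Type],
            {{#date(2020, 1, 1), 1, 40}, {#date(2020, 1, 2), 1, 25},
             {#date(2020, 1, 3), 3, 30}, {#date(2020, 1, 4), 2, 35}}),
        Countries = #table(type table [id = Int64.Type, Country = text],
            {{1, "USA"}, {2, "Canada"}, {3, "Panama"}, {4, "Spain"}}),
        // Full outer join: all rows from both tables; Spain appears with null Date, CountryID, and Units
        Merged = Table.NestedJoin(Sales, {"CountryID"}, Countries, {"id"}, "Countries", JoinKind.FullOuter),
        Expanded = Table.ExpandTableColumn(Merged, "Countries", {"Country"}, {"Country"})
    in
        Expanded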
Inner join
5/25/2022 • 2 minutes to read • Edit Online
One of the join kinds available in the Merge dialog box in Power Query is an inner join, which brings in only
matching rows from both the left and right tables. More information: Merge operations overview
Figure shows a table on the left with Date, CountryID, and Units columns. The emphasized CountryID column
contains values of 1 in rows 1 and 2, 3 in row 3, and 2 in row 4. A table on the right contains ID and Country
columns. The emphasized ID column contains values of 3 in row 1 (denoting Panama) and 4 in row 2 (denoting
Spain). A table below the first two tables contains Date, CountryID, Units, and Country columns, but only one
row of data for Panama.
This article uses sample data to show how to do a merge operation with the inner join. The sample source tables
for this example are:
Sales : This table includes the fields Date , CountryID , and Units . CountryID is a whole number value
that represents the unique identifier from the Countries table.
Countries : This is a reference table with the fields id and Country . The id field represents the unique
identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the Countries table as the
right one. The join will be made between the following columns.
FIELD FROM THE SALES TABLE    FIELD FROM THE COUNTRIES TABLE
CountryID                     id
The goal is to create a table like the following, where the name of the country appears as a new Country
column in the Sales table. Because of how the inner join works, only matching rows from both the left and right
tables will be brought in.
In the Sales table, you have a CountryID of 1 and 2, but neither of these values is found in the Countries
table. That's why the match only found one of four rows in the left (first) table.
In the Countries table, you have the Country Spain with the id 4, but there are no records for a CountryID of
4 in the Sales table. That's why only one of two rows from the right (second) table found a match.
From the newly created Countries column, expand the Country field. Don't select the Use original column
name as prefix check box.
After performing this operation, you'll create a table that looks like the following image.
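A minimal M sketch of the inner join, with invented dates and units:

    let
        Sales = #table(type table [Date = date, CountryID = Int64.Type, Units = Int64.Type],
            {{#date(2020, 1, 1), 1, 40}, {#date(2020, 1, 2), 1, 25},
             {#date(2020, 1, 3), 3, 30}, {#date(2020, 1, 4), 2, 35}}),
        Countries = #table(type table [id = Int64.Type, Country = text], {{3, "Panama"}, {4, "Spain"}}),
        // Inner join: only the row with CountryID 3 (Panama) survives
        Merged = Table.NestedJoin(Sales, {"CountryID"}, Countries, {"id"}, "Countries", JoinKind.Inner),
        Expanded = Table.ExpandTableColumn(Merged, "Countries", {"Country"}, {"Country"})
    in
        Expanded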
Left anti join
5/25/2022 • 3 minutes to read • Edit Online
One of the join kinds available in the Merge dialog box in Power Query is a left anti join, which brings in only
rows from the left table that don't have any matching rows from the right table. More information: Merge
operations overview
Figure shows a table on the left with Date, CountryID, and Units columns. The emphasized CountryID column
contains values of 1 in rows 1 and 2, 3 in row 3, and 2 in row 4. A table on the right contains ID and Country
columns. The emphasized ID column contains values of 3 in row 1 (denoting Panama) and 4 in row 2 (denoting
Spain). A table below the first two tables contains Date, CountryID, Units, and Country columns. The table has
three rows, with two rows containing the data for CountryID 1, and one row for CountryID 2. Since none of the
remaining CountryIDs match any of the countries in the right table, the rows in the Country column in the
merged table all contain null.
This article uses sample data to show how to do a merge operation with the left anti join. The sample source
tables for this example are:
Sales : This table includes the fields Date , CountryID , and Units . CountryID is a whole number value
that represents the unique identifier from the Countries table.
Countries : This table is a reference table with the fields id and Country . The id field represents the
unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the Countries table as the
right one. The join will be made between the following columns.
FIELD FROM THE SALES TABLE    FIELD FROM THE COUNTRIES TABLE
CountryID                     id
The goal is to create a table like the following, where only the rows from the left table that don't match any from
the right table are kept.
Left anti join final table with Date, CountryID, Units, and Country column headers, and three rows of data of
which the values for the Country column are all null.
To do a left anti join
1. Select the Sales query, and then select Merge queries .
2. In the Merge dialog box, under Right table for merge , select Countries .
3. In the Sales table, select the CountryID column.
4. In the Countries table, select the id column.
5. In the Join kind section, select Left anti .
6. Select OK .
TIP
Take a closer look at the message at the bottom of the dialog box that reads "The selection excludes 1 of 4 rows from the
first table." This message is crucial to understanding the result that you get from this operation.
In the Sales table, you have a CountryID of 1 and 2, but neither of them is found in the Countries table.
That's why the match only found one of four rows in the left (first) table.
In the Countries table, you have the Country Spain with an id of 4, but there are no records for CountryID 4
in the Sales table. That's why only one of two rows from the right (second) table found a match.
From the newly created Countries column, expand the Country field. Don't select the Use original column
name as prefix check box.
After doing this operation, you'll create a table that looks like the following image. The newly expanded
Country field doesn't have any values. That's because the left anti join doesn't bring any values from the right
table—it only keeps rows from the left table.
Final table with Date, CountryID, Units, and Country column headers, and three rows of data of which the values
for the Country column are all null.
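A minimal M sketch of the left anti join, with invented dates and units:

    let
        Sales = #table(type table [Date = date, CountryID = Int64.Type, Units = Int64.Type],
            {{#date(2020, 1, 1), 1, 40}, {#date(2020, 1, 2), 1, 25},
             {#date(2020, 1, 3), 3, 30}, {#date(2020, 1, 4), 2, 35}}),
        Countries = #table(type table [id = Int64.Type, Country = text], {{3, "Panama"}, {4, "Spain"}}),
        // Left anti join: keeps only Sales rows whose CountryID has no match in Countries (1, 1, and 2)
        Merged = Table.NestedJoin(Sales, {"CountryID"}, Countries, {"id"}, "Countries", JoinKind.LeftAnti),
        Expanded = Table.ExpandTableColumn(Merged, "Countries", {"Country"}, {"Country"})
    in
        Expanded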
Right anti join
5/25/2022 • 2 minutes to read • Edit Online
One of the join kinds available in the Merge dialog box in Power Query is a right anti join, which brings in only
rows from the right table that don't have any matching rows from the left table. More information: Merge
operations overview
Figure shows a table on the left with Date, CountryID, and Units columns. The emphasized CountryID column
contains values of 1 in rows 1 and 2, 3 in row 3, and 2 in row 4. A table on the right contains ID and Country
columns. The emphasized ID column contains values of 3 in row 1 (denoting Panama) and 4 in row 2 (denoting
Spain). A table below the first two tables contains Date, CountryID, Units, and Country columns. The table has
one row, with the Date, CountryID and Units set to null, and the Country set to Spain.
This article uses sample data to show how to do a merge operation with the right anti join. The sample source
tables for this example are:
Sales : This table includes the fields Date , CountryID , and Units . CountryID is a whole number value
that represents the unique identifier from the Countries table.
Countries : This is a reference table with the fields id and Country . The id field represents the unique
identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the Countries table as the
right one. The join will be made between the following columns.
FIELD FROM THE SALES TABLE    FIELD FROM THE COUNTRIES TABLE
CountryID                     id
The goal is to create a table like the following, where only the rows from the right table that don't match any
from the left table are kept. As a common use case, you can find all the rows that are available in the right table
but aren't found in the left table.
Right anti join final table with the Date, CountryID, Units, and Country header columns, containing one row with
null in all columns except Country, which contains Spain.
To do a right anti join
1. Select the Sales query, and then select Merge queries .
2. In the Merge dialog box, under Right table for merge , select Countries .
3. In the Sales table, select the CountryID column.
4. In the Countries table, select the id column.
5. In the Join kind section, select Right anti .
6. Select OK .
TIP
Take a closer look at the message at the bottom of the dialog box that reads "The selection excludes 1 of 2 rows from the
second table." This message is crucial to understanding the result that you get from this operation.
In the Countries table, you have the Country Spain with an id of 4, but there are no records for CountryID 4
in the Sales table. That's why only one of two rows from the right (second) table found a match. Because of how
the right anti join works, you'll never see any rows from the left (first) table in the output of this operation.
From the newly created Countries column, expand the Country field. Don't select the Use original column
name as prefix check box.
After performing this operation, you'll create a table that looks like the following image. The newly expanded
Country field doesn't have any values. That's because the right anti join doesn't bring any values from the left
table—it only keeps rows from the right table.
Final table with the Date, CountryID, Units, and Country header columns, containing one row with null in all
columns except Country, which contains Spain.
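A minimal M sketch of the right anti join, with invented dates and units:

    let
        Sales = #table(type table [Date = date, CountryID = Int64.Type, Units = Int64.Type],
            {{#date(2020, 1, 1), 1, 40}, {#date(2020, 1, 2), 1, 25},
             {#date(2020, 1, 3), 3, 30}, {#date(2020, 1, 4), 2, 35}}),
        Countries = #table(type table [id = Int64.Type, Country = text], {{3, "Panama"}, {4, "Spain"}}),
        // Right anti join: keeps only Countries rows with no matching CountryID in Sales (Spain)
        Merged = Table.NestedJoin(Sales, {"CountryID"}, Countries, {"id"}, "Countries", JoinKind.RightAnti),
        Expanded = Table.ExpandTableColumn(Merged, "Countries", {"Country"}, {"Country"})
    in
        Expanded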
Fuzzy merge
5/25/2022 • 5 minutes to read • Edit Online
Fuzzy merge is a smart data preparation feature you can use to apply fuzzy matching algorithms when
comparing columns, to try to find matches across the tables that are being merged.
You can enable fuzzy matching at the bottom of the Merge dialog box by selecting the Use fuzzy matching to
perform the merge option button. More information: Merge operations overview
NOTE
Fuzzy matching is only supported on merge operations over text columns. Power Query uses the Jaccard similarity
algorithm to measure the similarity between pairs of instances.
Sample scenario
A common use case for fuzzy matching is with freeform text fields, such as in a survey. For this article, the
sample table was taken directly from an online survey sent to a group with only one question: What is your
favorite fruit?
The results of that survey are shown in the following image.
Sample survey output table containing the column distribution graph showing nine distinct answers with all
answers unique, and the answers to the survey with all the typos, plural or singular, and case problems.
The nine records reflect the survey submissions. The problem with the survey submissions is that some have
typos, some are plural, some are singular, some are uppercase, and some are lowercase.
To help standardize these values, in this example you have a Fruits reference table.
Fruits reference table containing column distribution graph showing four distinct fruits with all fruits unique,
and the list of fruits: apple, pineapple, watermelon, and banana.
NOTE
For simplicity, this Fruits reference table only includes the name of the fruits that will be needed for this scenario. Your
reference table can have as many rows as you need.
The goal is to create a table like the following, where you've standardized all these values so you can do more
analysis.
Sample survey output table with the Question column containing the column distribution graph showing nine
distinct answers with all answers unique, and the answers to the survey with all the typos, plural or singular, and
case problems, and also contains the Fruit column containing the column distribution graph showing four
distinct answers with one unique answer and lists all of the fruits properly spelled, singular, and proper case.
Fuzzy merge
To do the fuzzy merge, you start by doing a merge. In this case, you'll use a left outer join, where the left table is
the one from the survey and the right table is the Fruits reference table. At the bottom of the dialog box, select
the Use fuzzy matching to perform the merge check box.
After you select OK , you can see a new column in your table because of this merge operation. If you expand it,
you'll notice that there's one row that doesn't have any values in it. That's exactly what the dialog box message in
the previous image stated when it said "The selection matches 8 of 9 rows from the first table."
Fruit column added to the Survey table, with all rows in the Question column expanded, except for row 9, which
could not expand and the Fruit column contains null.
To help fix mismatches like this, you can supply a transformation table that explicitly maps values the fuzzy
algorithm can't match on its own, for example:
FROM    TO
apls    Apple
You can go back to the Merge dialog box, and in Fuzzy matching options under Number of matches , enter
1 . Enable the Show similarity scores option, and then, under Transformation table , select Transform
Table from the drop-down menu.
After you select OK , you can go to the merge step. When you expand the column with table values, you'll notice
that besides the Fruit field you'll also see the Similarity score field . Select both and expand them without
adding a prefix.
After expanding these two fields, they'll be added to your table. Note the values you get for the similarity scores
of each value. These scores can help you with further transformations if needed to determine if you should
lower or raise your similarity threshold.
For this example, the Similarity score serves only as additional information and isn't needed in the output of
this query, so you can remove it. Note how the example started with nine distinct values, but after the fuzzy
merge, there are only four distinct values.
Fuzzy merge survey output table with the Question column containing the column distribution graph showing
nine distinct answers with all answers unique, and the answers to the survey with all the typos, plural or
singular, and case problems. Also contains the Fruit column with the column distribution graph showing four
distinct answers with one unique answer and lists all of the fruits properly spelled, singular, and proper case.
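For reference, a fuzzy merge like this one can also be expressed with the Table.FuzzyNestedJoin function. The following is a minimal sketch; the survey rows, the threshold value, and the transformation table contents are invented for illustration.

    let
        // Hypothetical survey answers with typos and casing problems
        Survey = #table(type table [Question = text],
            {{"apple"}, {"Apple"}, {"appel"}, {"Pineapple"}, {"pinapple"},
             {"watermelon"}, {"Watermelon"}, {"bananas"}, {"apls"}}),
        Fruits = #table(type table [Fruit = text], {{"apple"}, {"pineapple"}, {"watermelon"}, {"banana"}}),
        // Optional transformation table for values the algorithm can't match on its own
        TransformTable = #table(type table [From = text, To = text], {{"apls", "apple"}}),
        // Left outer fuzzy join between the survey answers and the reference table
        Merged = Table.FuzzyNestedJoin(Survey, {"Question"}, Fruits, {"Fruit"}, "Fruits", JoinKind.LeftOuter,
            [Threshold = 0.8, TransformationTable = TransformTable]),
        Expanded = Table.ExpandTableColumn(Merged, "Fruits", {"Fruit"}, {"Fruit"})
    in
        Expanded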
Cross join
5/25/2022 • 2 minutes to read • Edit Online
A cross join is a type of join that returns the Cartesian product of rows from the tables in the join. In other
words, it combines each row from the first table with each row from the second table.
This article demonstrates, with a practical example, how to do a cross join in Power Query.
The sample source tables for this example are:
Product : A table with all the generic products that you sell.
Colors : A table with all the product variations, as colors, that you can have in your inventory.
The goal is to perform a cross-join operation with these two tables to create a list of all unique products that you
can have in your inventory, as shown in the following table. This operation is necessary because the Product
table only contains the generic product name, and doesn't give the level of detail you need to see what product
variations (such as color) there are.
In the Custom column dialog box, enter whatever name you like in the New column name box, and enter
Colors in the Custom column formula box.
IMPORTANT
If your query name has spaces in it, such as Product Colors , the text that you need to enter in the Custom column
formula section has to follow the syntax #"Query name" . For Product Colors , you need to enter #"Product Colors" .
You can check the name of your queries in the Query settings pane on the right side of your screen or in the Queries
pane on the left side.
After you select OK in the Custom column dialog box, a new column is added to the table. In the new column
heading, select Expand to expand the contents of this newly created column, and then select OK .
After you select OK , you'll reach your goal of creating a table with all possible combinations of Product and
Colors .
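A minimal M sketch of this cross-join pattern follows; the product and color values, and the AllColors column name, are invented.

    let
        // Hypothetical source tables
        Product = #table(type table [Product = text], {{"T-shirt"}, {"Mug"}}),
        Colors = #table(type table [Color = text], {{"Red"}, {"Blue"}, {"Green"}}),
        // Add a custom column whose value is the whole Colors table for every Product row
        AddedColors = Table.AddColumn(Product, "AllColors", each Colors),
        // Expanding the table column yields the Cartesian product of Product and Colors
        Expanded = Table.ExpandTableColumn(AddedColors, "AllColors", {"Color"}, {"Color"})
    in
        Expanded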
Split columns by delimiter
5/25/2022 • 2 minutes to read • Edit Online
In Power Query, you can split a column through different methods. In this case, the column(s) selected can be
split by a delimiter.
You can find this option on the Transform tab, under the Split column dropdown menu inside the Text column group.
The result of that operation will give you a table with the two columns that you're expecting.
NOTE
Power Query will split the column into as many columns as needed. The name of the new columns will contain the same
name as the original column. A suffix that includes a dot and a number that represents the split sections of the original
column will be appended to the name of the new columns.
The Accounts column has values in pairs separated by a comma. These pairs are separated by a semicolon. The
goal of this example is to split this column into new rows by using the semicolon as the delimiter.
To do that split, select the Accounts column. Select the option to split the column by a delimiter. In Split
Column by Delimiter , apply the following configuration:
Select or enter delimiter : Semicolon
Split at : Each occurrence of the delimiter
Split into : Rows
The result of that operation will give you a table with the same number of columns, but many more rows
because the values inside the cells are now in their own cells.
Final Split
Your table still requires one last split column operation. You need to split the Accounts column by the first
comma that it finds. This split will create a column for the account name and another one for the account
number.
To do that split, select the Accounts column and then select Split Column > By Delimiter . Inside the Split
column window, apply the following configuration:
Select or enter delimiter : Comma
Split at : Each occurrence of the delimiter
The result of that operation will give you a table with the three columns that you're expecting. You then rename
the columns as follows:
PREVIOUS NAME    NEW NAME
Accounts.1       Account Name
Accounts.2       Account Number
Your final table looks like the one in the following image.
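A minimal M sketch of both split operations follows; the Customer column and the account values are invented.

    let
        // Hypothetical source: comma-separated name/number pairs, pairs separated by semicolons
        Source = #table(type table [Customer = text, Accounts = text],
            {{"Contoso", "Checking,1001;Savings,2002"}, {"Fabrikam", "Checking,3003"}}),
        // Split into rows at each semicolon
        SplitToRows = Table.ExpandListColumn(
            Table.TransformColumns(Source, {{"Accounts", Splitter.SplitTextByDelimiter(";")}}),
            "Accounts"),
        // Split into columns at the comma, naming the new columns directly
        SplitToColumns = Table.SplitColumn(SplitToRows, "Accounts",
            Splitter.SplitTextByDelimiter(","), {"Account Name", "Account Number"})
    in
        SplitToColumns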
Split columns by number of characters
5/25/2022 • 2 minutes to read • Edit Online
In Power Query, you can split a column through different methods. In this case, the column(s) selected can be
split by the number of characters.
You can find this option on the Transform tab, under the Split Column dropdown menu inside the Text Column group.
NOTE
Power Query will split the column into only two columns. The name of the new columns will contain the same name as
the original column. A suffix containing a dot and a number that represents the split section of the column will be
appended to the names of the new columns.
Now continue to do the same operation over the new Column1.2 column, but with the following configuration:
Number of characters : 8
Split : Once, as far left as possible
The result of that operation will yield a table with three columns. Notice the new names of the two columns on
the far right. Column1.2.1 and Column1.2.2 were automatically created by the split column operation.
You can now change the name of the columns and also define the data types of each column as follows:
ORIGINAL COLUMN NAME    NEW COLUMN NAME    DATA TYPE
Your final table will look like the one in the following image.
The Account column can hold multiple values in the same cell. Each value has the same length in characters,
with a total of six characters. In this example, you want to split these values so you can have each account value
in its own row.
To do that, select the Account column and then select the option to split the column by the number of
characters. In Split column by Number of Characters , apply the following configuration:
Number of characters : 6
Split : Repeatedly
Split into : Rows
The result of that operation will give you a table with the same number of columns, but many more rows
because the fragments inside the original cell values in the Account column are now split into multiple rows.
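A minimal M sketch of this split-into-rows operation follows, using invented six-character account codes.

    let
        // Hypothetical source: each cell holds one or more six-character account codes run together
        Source = #table(type table [Account = text], {{"AB0001CD0002"}, {"EF0003"}}),
        // Split repeatedly every six characters and expand the resulting list into rows
        SplitToRows = Table.ExpandListColumn(
            Table.TransformColumns(Source, {{"Account", Splitter.SplitTextByRepeatedLengths(6)}}),
            "Account")
    in
        SplitToRows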
Split columns by positions
5/25/2022 • 2 minutes to read • Edit Online
In Power Query, you can split a column through different methods. In this case, the column(s) selected can be
split by positions.
You can find this option on the Transform tab, under the Split Column dropdown menu inside the Text Column group.
The result of that operation will give you a table with three columns.
NOTE
Power Query will split the column into only two columns. The name of the new columns will contain the same name as
the original column. A suffix created by a dot and a number that represents the split section of the column will be
appended to the name of the new columns.
You can now change the name of the columns, and also define the data types of each column as follows:
ORIGINAL COLUMN NAME    NEW COLUMN NAME    DATA TYPE
Your final table will look like the one in the following image.
NOTE
This operation first creates a column from position 0 to position 6. There will be another column if there are
values with a length of 8 or more characters in the current data preview contents.
The result of that operation will give you a table with the same number of columns, but many more rows
because the values inside the cells are now in their own cells.
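A minimal M sketch of a split by positions follows, using the positions 0 and 6 mentioned in the note above and invented column values.

    let
        // Hypothetical source values; the first six characters are a code, the rest is a description
        Source = #table(type table [Column1 = text], {{"ACC001Payroll"}, {"ACC002Travel"}}),
        // Split at the listed positions: one column from position 0, another starting at position 6
        SplitByPositions = Table.SplitColumn(Source, "Column1",
            Splitter.SplitTextByPositions({0, 6}), {"Column1.1", "Column1.2"})
    in
        SplitByPositions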
Split columns by lowercase to uppercase
5/25/2022 • 2 minutes to read • Edit Online
In Power Query, you can split a column through different methods. If your data contains CamelCased text or a
similar pattern, the selected column(s) can easily be split at every transition from a lowercase letter to an
uppercase letter.
You can find this option on the Transform tab, under the Split Column dropdown menu inside the Text Column group.
Split columns by uppercase to lowercase
In Power Query, you can split a column through different methods. In this case, the selected column(s) can be
split at every transition from an uppercase letter to a lowercase letter.
You can find this option on the Transform tab, under the Split Column dropdown menu inside the Text Column group.
Split columns by digit to non-digit
In Power Query, you can split a column through different methods. In this case, the selected column(s) can be
split at every transition from a digit to a non-digit character.
You can find this option on the Transform tab, under the Split Column dropdown menu inside the Text Column group.
Split columns by non-digit to digit
In Power Query, you can split a column through different methods. In this case, the selected column(s) can be
split at every transition from a non-digit character to a digit.
You can find this option on the Transform tab, under the Split Column dropdown menu inside the Text Column group.
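All of these transition-based splits map to the Splitter.SplitTextByCharacterTransition function in M. The following is a minimal sketch for the lowercase-to-uppercase case, with invented values; the comments indicate how the other transitions differ.

    let
        // Hypothetical CamelCased values
        Source = #table(type table [Column1 = text], {{"PowerQuery"}, {"SplitColumn"}}),
        // Split at every transition from a lowercase letter to an uppercase letter
        SplitWords = Table.SplitColumn(Source, "Column1",
            Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"}), {"Column1.1", "Column1.2"})
        // For the other variants, swap the transition arguments, for example
        // ({"A".."Z"}, {"a".."z"}) for uppercase to lowercase, or
        // ({"0".."9"}, (c) => not List.Contains({"0".."9"}, c)) for digit to non-digit
    in
        SplitWords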
Dataflows are a self-service, cloud-based data preparation technology. Dataflows enable customers to ingest,
transform, and load data into Microsoft Dataverse environments, Power BI workspaces, or your organization's
Azure Data Lake Storage account. Dataflows are authored by using Power Query, a unified data connectivity and
preparation experience already featured in many Microsoft products, including Excel and Power BI. Customers
can trigger dataflows to run either on demand or automatically on a schedule; data is always kept up to date.
The previous image shows an overall view of how a dataflow is defined. A dataflow gets data from different data
sources (more than 80 data sources are supported already). Then, based on the transformations configured with
the Power Query authoring experience, the dataflow transforms the data by using the dataflow engine. Finally,
the data is loaded to the output destination, which can be a Microsoft Power Platform environment, a Power BI
workspace, or the organization's Azure Data Lake Storage account.
Dataflows run in the cloud
Dataflows are cloud-based. When a dataflow is authored and saved, its definition is stored in the cloud. A
dataflow also runs in the cloud. However, if a data source is on-premises, an on-premises data gateway can be
used to extract the data to the cloud. When a dataflow run is triggered, the data transformation and computation
happens in the cloud, and the destination is always in the cloud.
Dataflows use a powerful transformation engine
Power Query is the data transformation engine used in the dataflow. This engine is capable enough to support
many advanced transformations. It also uses a straightforward, yet powerful, graphical user interface called
Power Query Editor. You can use dataflows with this editor to develop your data integration solutions faster and
more easily.
Benefits of dataflows
The following list highlights some of the benefits of using dataflows:
A dataflow decouples the data transformation layer from the modeling and visualization layer in a Power
BI solution.
The data transformation code can reside in a central location, a dataflow, rather than be spread out
among multiple artifacts.
A dataflow creator only needs Power Query skills. In an environment with multiple creators, the dataflow
creator can be part of a team that together builds the entire BI solution or operational application.
A dataflow is product-agnostic. It's not a component of Power BI only; you can get its data in other tools
and services.
Dataflows take advantage of Power Query, a powerful, graphical, self-service data transformation
experience.
Dataflows run entirely in the cloud. No additional infrastructure is required.
You have multiple options for starting to work with dataflows, using licenses for Power Apps, Power BI,
and Customer Insights.
Although dataflows are capable of advanced transformations, they're designed for self-service scenarios
and require no IT or developer background.
Next steps
The following articles provide further study materials for dataflows.
Create and use dataflows in Microsoft Power Platform
Creating and using dataflows in Power BI
Understanding the differences between dataflow
types
5/25/2022 • 5 minutes to read • Edit Online
Dataflows are used to extract, transform, and load data to a storage destination where it can be leveraged for
different scenarios. Because not all storage destinations share the same characteristics, some dataflow features
and behaviors differ depending on the storage destination the dataflow loads data into. Before you create a
dataflow, it's important to understand how the data is going to be used, and choose the storage destination
according to the requirements of your solution.
Selecting a storage destination of a dataflow determines the dataflow's type. A dataflow that loads data into
Dataverse tables is categorized as a standard dataflow . Dataflows that load data to analytical entities are
categorized as analytical dataflows .
Dataflows created in Power BI are always analytical dataflows. Dataflows created in Power Apps can either be
standard or analytical, depending on your selection when creating the dataflow.
Standard dataflows
A standard dataflow loads data to Dataverse tables. Standard dataflows can only be created in Power Apps. One
benefit of creating this type of dataflow is that any application that depends on data in Dataverse can work with
the data created by standard dataflows. Typical applications that leverage Dataverse tables are Power Apps,
Power Automate, AI Builder and Power Virtual Agents.
Analytical dataflows
An analytical dataflow loads data to storage types optimized for analytics—Azure Data Lake Storage. Microsoft
Power Platform environments and Power BI workspaces provide customers with a managed analytical storage
location that's bundled with those product licenses. In addition, customers can link their organization’s Azure
Data Lake storage account as a destination for dataflows.
Analytical dataflows are capable of additional analytical features, such as integration with Power BI’s AI
features or the use of computed entities, which will be discussed later.
You can create analytical dataflows in Power BI. By default, they'll load data to Power BI’s managed storage. But
you can also configure Power BI to store the data in the organization’s Azure Data Lake Storage.
You can also create analytical dataflows in the Power Apps and Dynamics 365 customer insights portals. When
you're creating a dataflow in the Power Apps portal, you can choose between Dataverse-managed analytical
storage and your organization’s Azure Data Lake Storage account.
AI Integration
Sometimes, depending on the requirement, you might need to apply some AI and machine learning functions
on the data through the dataflow. These functionalities are available in Power BI dataflows and require a
Premium workspace.
The following articles discuss how to use AI functions in a dataflow:
Azure Machine Learning integration in Power BI
Cognitive Services in Power BI
Automated Machine Learning in Power BI
Note that the features listed above are Power BI specific and are not available when creating a dataflow in the
Power Apps or Dynamics 365 customer insights portals.
Computed entities
One of the reasons to use a computed entity is the ability to process large amounts of data. If an entity in a
dataflow uses the output of another entity in the same dataflow, the entity that builds on that output is a
computed entity.
The computed entity helps with the performance of the data transformations. Instead of re-doing the
transformations needed in the first entity multiple times, the transformation will be done only once in the
computed entity. Then the result will be used multiple times in other entities.
To learn more about computed entities, see Using computed entities on Power BI Premium.
Computed entities are available only in an analytical dataflow.
OPERATION                            STANDARD                  ANALYTICAL
AI functions                         No                        Yes
Can be used in other applications    Yes, through Dataverse    Power BI dataflows: Only in Power BI.
                                                               Power Platform dataflows or Power BI
                                                               external dataflows: Yes, through Azure
                                                               Data Lake Storage
Using dataflows with Microsoft Power Platform makes data preparation easier, and lets you reuse your data
preparation work in subsequent reports, apps, and models.
In the world of ever-expanding data, data preparation can be difficult and expensive, consuming as much as 60
to 80 percent of the time and cost for a typical analytics project. Such projects can require wrangling fragmented
and incomplete data, complex system integration, data with structural inconsistency, and a high skillset barrier.
To make data preparation easier and to help you get more value out of your data, Power Query and Power
Platform dataflows were created.
With dataflows, Microsoft brings the self-service data preparation capabilities of Power Query into the Power BI
and Power Apps online services, and expands existing capabilities in the following ways:
Self-service data prep for big data with dataflows : Dataflows can be used to easily ingest, cleanse,
transform, integrate, enrich, and schematize data from a large and ever-growing array of transactional
and observational sources, encompassing all data preparation logic. Previously, extract, transform, load
(ETL) logic could only be included within datasets in Power BI, copied over and over between datasets,
and bound to dataset management settings.
With dataflows, ETL logic is elevated to a first-class artifact within Microsoft Power Platform services, and
includes dedicated authoring and management experiences. Business analysts, BI professionals, and data
scientists can use dataflows to handle the most complex data preparation challenges and build on each
other's work, thanks to a revolutionary model-driven calculation engine, which takes care of all the
transformation and dependency logic—cutting time, cost, and expertise to a fraction of what's
traditionally been required for those tasks. You can create dataflows by using the well-known, self-service
data preparation experience of Power Query. Dataflows are created and easily managed in app
workspaces or environments, in Power BI or Power Apps, respectively, enjoying all the capabilities these
services have to offer, such as permission management and scheduled refreshes.
Load data to Dataverse or Azure Data Lake Storage : Depending on your use case, you can store
data prepared by Power Platform dataflows in the Dataverse or your organization's Azure Data Lake
Storage account:
Dataverse lets you securely store and manage data that's used by business applications. Data
within Dataverse is stored in a set of tables. A table is a set of rows (formerly referred to as
records) and columns (formerly referred to as fields/attributes). Each column in the table is
designed to store a certain type of data, for example, name, age, salary, and so on. Dataverse
includes a base set of standard tables that cover typical scenarios, but you can also create custom
tables specific to your organization and populate them with data by using dataflows. App makers
can then use Power Apps and Power Automate to build rich applications that use this data.
Azure Data Lake Storage lets you collaborate with people in your organization using Power BI,
Azure Data, and AI services, or using custom-built Line of Business Applications that read data
from the lake. Dataflows that load data to an Azure Data Lake Storage account store data in
Common Data Model folders. Common Data Model folders contain schematized data and
metadata in a standardized format, to facilitate data exchange and to enable full interoperability
across services that produce or consume data stored in an organization’s Azure Data Lake Storage
account as the shared storage layer.
Advanced Analytics and AI with Azure : Power Platform dataflows store data in Dataverse or Azure
Data Lake Storage—which means that data ingested through dataflows is now available to data
engineers and data scientists to leverage the full power of Azure Data Services, such as Azure Machine
Learning, Azure Databricks, and Azure Synapse Analytics for advanced analytics and AI. This enables
business analysts, data engineers, and data scientists to collaborate on the same data within their
organization.
Support for Common Data Model : Common Data Model is a set of standardized data schemas and
a metadata system that allows consistency of data and its meaning across applications and business
processes. Dataflows support Common Data Model by offering easy mapping from any data in any
shape into the standard Common Data Model entities, such as Account and Contact. Dataflows also land
the data, both standard and custom entities, in schematized Common Data Model form. Business analysts
can take advantage of the standard schema and its semantic consistency, or customize their entities
based on their unique needs. Common Data Model continues to evolve as part of the Open Data
Initiative.
DATAFLOW CAPABILITY                        POWER APPS                                       POWER BI
Dataflows Data Connector in Power BI       For dataflows with Azure Data Lake Storage       Yes
Desktop                                    as the destination
Dataflow linked entities                   For dataflows with Azure Data Lake Storage       Yes
                                           as the destination
Computed Entities (in-storage              For dataflows with Azure Data Lake Storage       Power BI Premium only
transformations using M)                   as the destination
Dataflow incremental refresh               For dataflows with Azure Data Lake Storage       Power BI Premium only
                                           as the destination; requires Power Apps Plan 2
Next steps
The following articles go into more detail about common usage scenarios for dataflows.
Using incremental refresh with dataflows
Creating computed entities in dataflows
Connect to data sources for dataflows
Link entities between dataflows
For more information about Common Data Model and the Common Data Model folder standard, read the
following articles:
Common Data Model - overview
Common Data Model folders
Common Data Model folder model file definition
Create and use dataflows in Microsoft Teams
(Preview)
5/25/2022 • 5 minutes to read • Edit Online
NOTE
We are rolling out dataflows for Microsoft Teams gradually. This feature might not be available in your region yet.
Microsoft Dataverse for Teams delivers a built-in, low-code data platform for Microsoft Teams. It provides
relational data storage, rich data types, enterprise-grade governance, and one-click solution deployment.
Dataverse for Teams enables everyone to easily build and deploy apps.
Previously, the way to get data into Dataverse for Teams was to manually add data directly into a table.
This process is prone to errors and isn't scalable. Now, with self-service data prep, you can find, clean,
shape, and import your data into Dataverse for Teams.
With your organizational data already sitting in a different location, you can use Power Query dataflows to
directly access your data through the connectors and load the data into Dataverse for Teams. When your
organizational data is updated, you can refresh your dataflows with just one click, and the data in Dataverse
for Teams is updated too. You can also use the Power Query data transformations to easily validate and clean
your data and enforce data quality for your apps.
Dataflows were introduced to help organizations retrieve data from disparate sources and prepare it for
consumption. You can easily create dataflows using the familiar, self-service Power Query experience to ingest,
transform, integrate, and enrich data. When creating a dataflow, you'll connect to data, transform the data, and
load data into Dataverse for Teams tables. Once the dataflow is created, it begins the process of importing data
into the Dataverse table. Then you can start building apps to leverage that data.
5. Enter a URL in File path or URL, or use the Browse OneDrive button to navigate through
your OneDrive folders. Select the file you want, and then select the Next button. For more information
about using the OneDrive connection or getting data, see SharePoint and OneDrive for Business files
import or Getting data from other sources.
6. In Navigator, select the tables that are present in your Excel file. If your Excel file has multiple sheets and
tables, select only the tables you're interested in. When you're done, select Transform data.
7. Clean and transform your data using Power Query. You can use the out-of-the-box transformations to
delete missing values, delete unnecessary columns, or filter your data (a minimal sketch of such steps
follows this list). With Power Query, you can apply more than 300 different transformations on your data.
To learn more about Power Query transformations, see Use Power Query to transform data. After you're
finished preparing your data, select Next.
8. In Map tables, select Load to new table to create a new table in Dataverse for Teams. You can also
choose to load your data into an existing table. In the Map tables screen, you can also specify a Unique
primary name column and an Alternate key column (optional). In this example, leave these
selections with the default values. To learn more about mapping your data and the different settings, see
Field mapping considerations for standard dataflows.
9. Select Create to finish your dataflow. Once you’ve created your dataflow, data begins loading into
Dataverse for Teams. This process can take some time and you can use the management page to check
the status. When a dataflow completes a run, its data is available to use.
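For illustration, a dataflow query that applies this kind of cleanup might resemble the following Power Query M
sketch. The inline table and its column names are hypothetical stand-ins for the Excel table you selected in
Navigator; in a real dataflow, the source steps are generated by the connector.

```powerquery-m
let
    // Inline sample standing in for the table selected in Navigator (step 6);
    // in a real dataflow this step is generated by the Excel/OneDrive connector.
    Source = #table(
        {"Name", "Email", "Country", "InternalId"},
        {
            {"Alice", "[email protected]", "USA", 1},
            {"Bob", null, "Canada", 2}
        }
    ),
    // Delete a column you don't need.
    RemovedColumns = Table.RemoveColumns(Source, {"InternalId"}),
    // Delete rows with missing values.
    RemovedMissing = Table.SelectRows(RemovedColumns, each [Email] <> null),
    // Filter to the rows you want to load into Dataverse for Teams.
    FilteredRows = Table.SelectRows(RemovedMissing, each [Country] = "USA")
in
    FilteredRows
```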
NOTE
Dataflows in Teams don't support on-premises data sources, such as on-premises file locations.
The following table lists the major feature differences between dataflows for Dataverse in Teams and dataflows
for Dataverse.
1 Although there's no limitation on the amount of data you can load into Dataverse for Teams, for better
performance when loading larger amounts of data, we recommend a Dataverse environment.
Consume data from dataflows
5/25/2022 • 3 minutes to read
The ways you can consume data from Microsoft dataflows depend on several factors, such as the storage and the
type of dataflow. In this article, you'll learn how to choose the right dataflow for your needs.
Type of dataflow
There are multiple types of dataflows available for you to create. You can choose between a Power BI dataflow,
standard dataflow, or an analytical dataflow. To learn more about the differences and how to select the right type
based on your needs, go to Understanding the differences between dataflow types.
Storage type
A dataflow can write its output to multiple destination types. In short, use the Dataflows connector
unless your destination is a Dataverse table; in that case, use the Dataverse/CDS connector.
Azure Data Lake Storage
Azure Data Lake Storage is available in Power BI dataflows and Power Apps analytical dataflows. By default,
you use a Microsoft-managed data lake. However, you can also connect a self-hosted data lake to the
dataflow environment. The following articles describe how to connect the data lake to your environment:
Connect Data Lake Gen 2 storage to a Power BI Workspace
Connect Data Lake Gen 2 storage to a Power Apps Environment
When you've connected your data lake, you should still use the Dataflows connector. If this connector doesn't
meet your needs, you could consider using the Azure Data Lake connector instead.
Dataverse
A standard dataflow writes the output data to a Dataverse table. Dataverse lets you securely store and manage
data that's used by business applications. After you load data in the Dataverse table, you can consume the data
using the Dataverse connector.
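As a rough sketch, the M functions behind these two connectors look like the following two queries. The
function names and the environment address are assumptions for illustration; the connector's own Navigator
generates the exact query for your workspace or environment.

```powerquery-m
// Query 1 (analytical dataflows): the Power Platform dataflows connector entry point.
// It returns a navigation table of environments, dataflows, and entities to drill into.
let
    Source = PowerPlatform.Dataflows(null)
in
    Source
```

```powerquery-m
// Query 2 (standard dataflows): read the Dataverse table that the dataflow loaded.
// The environment address below is a placeholder.
let
    Source = CommonDataService.Database("yourenvironment.crm.dynamics.com")
in
    Source
```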
Next steps
The following articles provide more detail about related topics.
Creating and using dataflows in Power BI
Link entities between dataflows in Power BI
Connect to data created by Power BI dataflows in Power BI Desktop (Beta)
Create and use dataflows in Power Platform
Link entities between dataflows (Power Platform)
Working with duplicate values in dataflows
Overview of solution-aware dataflows
5/25/2022 • 3 minutes to read
When you include your dataflows in a solution, their definitions become portable, making it easier to move
them from one environment to another, saving time required to author the dataflow.
A typical use case is for an independent software vendor (ISV) to develop, in a sandbox environment, a solution
containing a dataflow that extracts and transforms data from a data source into Dataverse tables. The ISV would
then move that dataflow and destination tables to a test environment to test with their test data source to
validate that the solution works well and is ready for production. After testing completes, the ISV would provide
the dataflow and tables to clients, who import them into their production environment to operate on the clients'
data. This process is much easier when you add both the dataflows and the tables they load data to into
solutions, and then move the solutions and their contents between environments.
Dataflows added to a solution are known as solution-aware dataflows. You can add multiple dataflows to a
single solution.
NOTE
Only dataflows created in Power Platform environments can be solution-aware.
The data loaded by dataflows to their destination isn't portable as part of solutions, only the dataflow definitions are.
To recreate the data after a dataflow was deployed as part of a solution, you need to refresh the dataflow.
To save your work, be sure to publish all customizations. Now, the solution is ready for you to export
from the source environment and import to the destination environment.
2. In the Dataflow list, locate and double-click the dataflow that was added as part of the solution you’ve
imported.
3. You'll be asked to enter credentials required for the dataflow.
Once the credentials for the connection have been updated, all queries that use that connection
automatically load.
4. If your dataflow loads data in Dataverse tables, select Next to review the mapping configuration.
5. The mapping configuration is also saved as part of the solution. Because you also added the destination
table to the solution, there's no need to recreate the table in this environment, and you can publish the
dataflow.
That's it. Your dataflow now refreshes and loads data to the destination table.
Known limitations
Dataflows can't be created from within solutions. To add a dataflow to a solution, follow the steps outlined in
this article.
Dataflows can't be edited directly from within solutions. Instead, the dataflow must be edited in the dataflows
experience.
Dataflows can't use connection references for any connector.
Environment variables can't be used by dataflows.
Dataflows don't support adding required components, such as custom tables they load data to. Instead, the
custom table should be manually added to the solution.
Dataflows can't be deployed by application users (service principals).
Using incremental refresh with dataflows
5/25/2022 • 9 minutes to read
With dataflows, you can bring large amounts of data into Power BI or the storage your organization provides. In
some cases, however, it's not practical to update a full copy of the source data in each refresh. A good alternative
is incremental refresh, which provides the following benefits for dataflows:
Refresh occurs faster: Only data that's changed needs to be refreshed. For example, refresh only the last
five days of a 10-year dataflow.
Refresh is more reliable: For example, it's not necessary to maintain long-running connections to volatile
source systems.
Resource consumption is reduced: Less data to refresh reduces overall consumption of memory and
other resources.
Incremental refresh is available in dataflows created in Power BI and dataflows created in Power Apps. This
article shows screens from Power BI, but these instructions apply to dataflows created in Power BI or in Power
Apps.
NOTE
When the schema for a table in an analytical dataflow changes, a full refresh takes place to ensure that all the resulting
data matches the new schema. As a result, any data stored incrementally is refreshed and in some cases, if the source
system doesn't retain historic data, is lost.
Using incremental refresh in dataflows created in Power BI requires that the dataflow reside in a workspace in
Premium capacity. Incremental refresh in Power Apps requires Power Apps Plan 2.
In either Power BI or Power Apps, using incremental refresh requires that source data ingested into the dataflow
have a DateTime field on which incremental refresh can filter.
When you select the icon, the Incremental refresh settings window appears. Turn on incremental refresh.
The following list explains the settings in the Incremental refresh settings window.
Incremental refresh on/off toggle: Turns the incremental refresh policy on or off for the entity.
Filter field drop-down: Selects the query field on which the entity should be filtered for increments.
This field only contains DateTime fields. You can't use incremental refresh if your entity doesn't contain a
DateTime field.
Store/refresh rows from the past: The example in the previous image illustrates these next few
settings.
In this example, we define a refresh policy to store five years of data in total and incrementally refresh 10
days of data. Assuming that the entity is refreshed daily, the following actions are carried out for each
refresh operation:
Add a new day of data.
Refresh 10 days, up to the current date.
Remove calendar years that are older than five years before the current date. For example, if the
current date is January 1, 2019, the year 2013 is removed.
The first dataflow refresh might take a while to import all five years, but subsequent refreshes are likely
to be completed much more quickly.
Detect data changes: An incremental refresh of 10 days is much more efficient than a full refresh of
five years, but you might be able to do even better. When you select the Detect data changes check
box, you can select a date/time column to identify and refresh only the days where the data has changed.
This assumes such a column, typically maintained for auditing purposes, exists in the source system. The
maximum value of this column is evaluated for each of the periods in the incremental range. If that data
hasn't changed since the last refresh, there's no need to refresh the period. In the example, this might
further reduce the days incrementally refreshed from 10 to perhaps 2.
TIP
The current design requires that the column used to detect data changes be persisted and cached into memory.
You might want to consider one of the following techniques to reduce cardinality and memory consumption:
Persist only the maximum value of this column at the time of refresh, perhaps by using a Power Query function
(see the sketch after this tip).
Reduce the precision to a level that's acceptable given your refresh-frequency requirements.
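A minimal Power Query M sketch of both techniques, using an inline sample table and a hypothetical
LastUpdated audit column:

```powerquery-m
let
    // Hypothetical source table with a LastUpdated audit column.
    Source = #table(
        {"OrderId", "LastUpdated"},
        {
            {1, #datetime(2022, 5, 1, 13, 45, 10)},
            {2, #datetime(2022, 5, 2, 9, 5, 0)}
        }
    ),
    // Option 1: persist only the maximum value of the audit column.
    MaxLastUpdated = List.Max(Source[LastUpdated]),
    // Option 2: reduce the precision (here, to whole days) before the column is cached.
    ReducedPrecision = Table.TransformColumns(Source, {{"LastUpdated", DateTime.Date, type date}})
in
    MaxLastUpdated
```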
Only refresh complete periods: Imagine that your refresh is scheduled to run at 4:00 AM every day. If
data appears in the source system during those first four hours of that day, you might not want to
account for it. Some business metrics, such as barrels per day in the oil and gas industry, aren't practical
or sensible to account for based on partial days.
Another example where only refreshing complete periods is appropriate is refreshing data from a
financial system. Imagine a financial system where data for the previous month is approved on the 12th
calendar day of the month. You can set the incremental range to one month and schedule the refresh to
run on the 12th day of the month. With this option selected, the system will refresh January data (the
most recent complete monthly period) on February 12.
NOTE
Dataflow incremental refresh determines dates according to the following logic: if a refresh is scheduled, incremental
refresh for dataflows uses the time zone defined in the refresh policy. If no schedule for refreshing exists, incremental
refresh uses the time from the computer running the refresh.
After incremental refresh is configured, the dataflow automatically alters your query to include filtering by date.
If the dataflow was created in Power BI, you can also edit the automatically generated query by using the
advanced editor in Power Query to fine-tune or customize your refresh. Read more about incremental refresh
and how it works in the following sections.
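For illustration only, the filter the service injects behaves roughly like the following M sketch. The
RangeStart/RangeEnd names, the OrderDate column, and the hard-coded boundaries are assumptions; the query
the service actually generates for your dataflow may look different.

```powerquery-m
let
    // RangeStart and RangeEnd stand for the date/time boundaries the service supplies
    // for each refresh; they're hard-coded here only for illustration.
    RangeStart = #datetime(2022, 5, 15, 0, 0, 0),
    RangeEnd   = #datetime(2022, 5, 25, 0, 0, 0),
    Source = #table(
        {"OrderId", "OrderDate"},
        {{1, #datetime(2022, 5, 16, 8, 0, 0)}, {2, #datetime(2022, 4, 1, 8, 0, 0)}}
    ),
    // Keep only the rows that fall inside the incremental range.
    IncrementalRows = Table.SelectRows(Source, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    IncrementalRows
```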
Connect to data sources for dataflows
With Microsoft Power BI and Power Platform dataflows, you can connect to many different data sources to
create new dataflows, or add new entities to an existing dataflow.
This article describes how to create dataflows by using these data sources. For an overview of how to create and
use dataflows, go to Creating a dataflow for Power BI service and Create and use dataflows in Power Apps.
A connection window for the selected data connection is displayed. If credentials are required, you're prompted
to provide them. The following image shows a server and database being entered to connect to a SQL Server
database.
After the server URL or resource connection information is provided, enter the credentials to use for access to
the data. You may also need to enter the name of an on-premises data gateway. Then select Next.
Power Query Online initiates and establishes the connection to the data source. It then presents the available
tables from that data source in the Navigator window.
You can select tables and data to load by selecting the check box next to each in the left pane. To transform the
data you've chosen, select Transform data from the bottom of the Navigator window. A Power Query Online
dialog box appears, where you can edit queries and perform any other transformations you want on the selected
data.
Connecting to additional data sources
There are additional data connectors that aren't shown in the Power BI dataflows user interface, but are
supported with a few additional steps.
You can take the following steps to create a connection to a connector that isn't displayed in the user interface:
1. Open Power BI Desktop, and then select Get data.
2. Open Power Query Editor in Power BI Desktop, right-click the relevant query, and then select Advanced
Editor, as shown in the following image. From there, you can copy the M script that appears in the
Advanced Editor window.
3. Open the Power BI dataflow, and then select Get data for a blank query.
4. Paste the copied query into the blank query for the dataflow.
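For example, a copied script for a web API source might look like the following sketch. The endpoint and the
assumption that it returns a JSON array of records are hypothetical; the script you paste is whatever the
Advanced Editor showed for your own query.

```powerquery-m
let
    // Hypothetical endpoint; this is the kind of script you'd copy from the
    // Advanced Editor in Power BI Desktop and paste into the blank query.
    Source = Json.Document(Web.Contents("https://example.com/api/orders")),
    // Assumes the API returns a JSON array of records.
    ToTable = Table.FromRecords(Source)
in
    ToTable
```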
Next steps
This article showed which data sources you can connect to for dataflows. The following articles go into more
detail about common usage scenarios for dataflows:
Self-service data prep in Power BI
Using incremental refresh with dataflows
Creating computed entities in dataflows
Link entities between dataflows
For information about individual Power Query connectors, go to the connector reference list of Power Query
connectors, and select the connector you want to learn more about.
Additional information about dataflows and related information can be found in the following articles:
Create and use dataflows in Power BI
Using dataflows with on-premises data sources
Developer resources for Power BI dataflows
Dataflows and Azure Data Lake integration (Preview)
For more information about Power Query and scheduled refresh, you can read these articles:
Query overview in Power BI Desktop
Configuring scheduled refresh
For more information about Common Data Model, you can read its overview article:
Common Data Model - overview
What licenses do you need to use dataflows?
5/25/2022 • 4 minutes to read
Dataflows can be created in different portals, such as Power BI and Power Apps, and can be of either
analytical or standard type. In addition, some dataflow features are only available as Premium features.
Considering the wide range of products that can use dataflows, and feature availability in each product or
dataflow type, it's important to know what licensing options you need to use dataflows.
Premium features
Some dataflow features are limited to Premium licenses. If you want to use the enhanced compute engine
to speed up the performance of your dataflow queries over computed entities, or to have the DirectQuery
connection option to the dataflow, you need Power BI P1 or A3 or higher capacities.
AI capabilities in Power BI, linked entities, and computed entities are all Premium features that aren't available
with a Power BI Pro account.
Features
The following table contains a list of features and the license needed for them to be available.
Feature | Power BI | Power Apps
Store data in Azure Data Lake Storage (analytical dataflow) | Power BI Pro, Power BI Premium | Yes, using analytical dataflows
Store data in customer-provided Azure Data Lake Storage (analytical dataflow; bring your own Azure Data Lake Storage) | Power BI Pro, Power BI Premium | Per app plan, Per user plan
Dataflow incremental refresh | Power BI Premium | Yes, using analytical dataflows with the Per user plan
Next step
If you want to read more details about the concepts discussed in this article, follow any of the links below.
Pricing
Power BI pricing
Power Apps pricing
Azure Data Lake Storage Gen 2 pricing
Features
Computed entities
Linked entities
AI capabilities in Power BI dataflows
Standard vs. analytical dataflows
The enhanced compute engine
How to migrate queries from Power Query in the
desktop (Power BI and Excel) to dataflows
5/25/2022 • 3 minutes to read
If you already have queries in Power Query, either in Power BI Desktop or in Excel, you might want to migrate
the queries into dataflows. The migration process is simple and straightforward. In this article, you'll learn the
steps to do so.
To learn how to create a dataflow in Microsoft Power Platform, go to Create and use dataflows in Power
Platform. To learn how to create a dataflow in Power BI, go to Creating and using dataflows in Power BI.
In Excel, on the Data tab, select Get Data > Launch Power Query Editor.
2. Copy the queries:
If you've organized your queries into folders (called groups in Power Query):
a. In the Queries pane, hold down Ctrl as you select the folders you want to migrate to the dataflow.
b. Press Ctrl+C.
The gateway isn't needed for data sources residing in the cloud, such as an Azure SQL database.
5. Configure the connection to the data source by selecting Configure connection and entering
credentials or anything else you need to connect to the data source at this stage.
If a scenario like this happens, you have two options. You can set up the gateway for that data source, or you can
update the query in the Power Query Editor for the dataflow by using a set of steps that are supported without
the need for the gateway.
Refresh the dataflow entities
After migrating your queries to the dataflow, you must refresh the dataflow to get data loaded into these
entities. You can refresh a dataflow manually or configure an automatic refresh based on the schedule of your
choice.
Install an on-premises data gateway to transfer data quickly and securely between a Power Platform dataflow
and a data source that isn't in the cloud, such as an on-premises SQL Server database or an on-premises
SharePoint site. You can view all gateways for which you have administrative permissions and manage
permissions and connections for those gateways.
Prerequisites
Power BI service
A Power BI service account. Don't have one? Sign up for 60 days free.
Administrative permissions on a gateway. These permissions are provided by default for gateways you
install. Administrators can grant other people permissions for gateways.
Power Apps
A Power Apps account. Don't have one? Sign up for 30 days free.
Administrative permissions on a gateway. These permissions are provided by default for gateways you
install. Administrators can grant other people permissions for gateways.
A license that supports accessing on-premises data using an on-premises gateway. More information:
"Connect to your data" row of the "Explore Power Apps plans" table in the Power Apps pricing page.
Install a gateway
You can install an on-premises data gateway directly from the online service.
NOTE
It's a good general practice to make sure you're using a supported version of the on-premises data gateway. We
release a new update of the on-premises data gateway every month. Currently, Microsoft actively supports only the
last six releases of the on-premises data gateway.
Starting in April 2022, the minimum required gateway version is February 2021. Dataflows that refresh using an
earlier version of the gateway might stop refreshing.
3. Provide the connection details for the enterprise gateway that will be used to access the on-premises
data. You must select the gateway itself, and provide credentials for the selected gateway. Only gateways
for which you're an administrator appear in the list.
You can change the enterprise gateway used for a given dataflow and change the gateway assigned to all of
your queries using the dataflow authoring tool.
NOTE
The dataflow will try to find or create the required data sources using the new gateway. If it can't do so, you won't be able
to change the gateway until all needed dataflows are available from the selected gateway.
View and manage gateway permissions
Power BI service gateway permissions
1. Select the setup button in the upper right corner of Power BI service, choose Manage gateways , and
then select the gateway you want.
2. To add a user to the gateway, select the Administrators table and enter the email address of the user
you would like to add as an administrator. Creating or modifying data sources in dataflows requires
Admin permissions to the gateway. Admins have full control of the gateway, including adding users,
setting permissions, creating connections to all available data sources, and deleting the gateway.
The following conditions apply when adding a user to the gateway:
1. If we detect that an existing data source is available for the selected gateway, the Username and
Password fields will be pre-populated.
a. If you select Next at this point, you're considered to be using that existing data source, and so you
only need to have permissions to that data source.
b. If you edit any of the credential fields and select Next , then you're considered to be editing that
existing data source, at which point you need to be an admin of the gateway.
2. If we don't detect that an existing data source is available for the selected gateway, the Username and
Password fields will be blank, and if you edit the credential fields and select Next , then you're
considered to be creating a new data source on the gateway, at which point you need to be an admin of
the gateway.
If you only have data source user permission on the gateway, then 1.b and 2 can't be achieved and the dataflow
can't be created.
Power Apps gateway permissions
1. In the left navigation pane of powerapps.com, select Gateways and then select the gateway you want.
2. To add a user to a gateway, select Users , specify a user or group, and then specify a permission level.
Creating new data sources with a gateway in dataflows requires Admin permission on the gateway.
Admins have full control of the gateway, including adding users, setting permissions, creating
connections to all available data sources, and deleting the gateway.
View and manage gateway connections
Power BI service gateway connections
1. Select the setup button in the upper right corner of Power BI service, choose Manage gateways , and
then select the gateway you want.
2. Perform the action that you want:
To view details and edit the settings, select Gateway Cluster Settings.
To add users as administrators of the gateway, select Administrators.
To add a data source to the gateway, select Add Data Source, enter a data source name and choose
the data source type under Data Source Settings, and then enter the email address of the person
who will use the data source.
To delete a gateway, select the ellipsis to the right of the gateway name and then select Remove.
Power Apps gateway connections
1. In the left navigation bar of powerapps.com, select Gateways , and then choose the gateway you want.
2. Perform the action that you want:
To view details, edit the settings, or delete a gateway, select Connections , and then select a
connection.
To share a connection, select Share and then add or remove users.
NOTE
You can only share some types of connections, such as a SQL Server connection. For more information,
see Share canvas-app resources in Power Apps.
For more information about how to manage a connection, see Manage canvas-app connections in Power
Apps.
Limitations
There are a few known limitations when using enterprise gateways and dataflows.
Dataflow refresh might fail if an out-of-date data gateway is used. Starting April 2022, the minimum
required data gateway version is February 2021.
Each dataflow can use only one gateway. As such, all queries should be configured using the same
gateway.
Changing the gateway impacts the entire dataflow.
If several gateways are needed, the best practice is to build several dataflows (one for each gateway).
Then use the compute or table reference capabilities to unify the data.
Dataflows are only supported using enterprise gateways. Personal gateways won't be available for
selection in the drop-down lists and settings screens.
Creating new data sources with a gateway in dataflows is only supported for people with Admin
permissions.
Users with Can Use or Can Use + Share permissions can use existing connections when creating
dataflows.
The following connectors are supported:
DB2
File System
Apache Impala
Informix
MySQL
Oracle Database
PostgreSQL
SAP ERP
SharePoint
SQL Server
Teradata
Desktop flows
HTTP with Azure AD
Troubleshooting
When you attempt to use an on-premises data source to publish a dataflow, you might come across the
following MashupException error:
This error usually occurs because you're attempting to connect to an Azure Data Lake Storage endpoint through
a proxy, but you haven't properly configured the proxy settings for the on-premises data gateway. To learn more
about how to configure these proxy settings, go to Configure proxy settings for the on-premises data gateway.
For more information about troubleshooting issues with gateways, or configuring the gateway service for your
network, go to the On-premises data gateway documentation.
If you're experiencing issues with the gateway version you're using, try updating to the latest version as your
issue might have been resolved in the latest version. For more information about updating your gateway, go to
Update an on-premises data gateway.
Next steps
Create and use dataflows in Power Apps
Add data to a table in Microsoft Dataverse by using Power Query
Connect Azure Data Lake Storage Gen2 for dataflow storage
Creating computed entities in dataflows
5/25/2022 • 3 minutes to read
You can perform in-storage computations when using dataflows with a Power BI Premium subscription. This lets
you do calculations on your existing dataflows, and return results that enable you to focus on report creation
and analytics.
To perform in-storage computations, you first must create the dataflow and bring data into that Power BI
dataflow storage. After you have a dataflow that contains data, you can create computed entities, which are
entities that do in-storage computations.
There are two ways you can connect dataflow data to Power BI:
Using self-service authoring of a dataflow
Using an external dataflow
The following sections describe how to create computed entities on your dataflow data.
Any transformation you do on this newly created entity will be run on the data that already resides in Power BI
dataflow storage. That means that the query won't run against the external data source from which the data was
imported (for example, the SQL database from which the data was pulled).
Example use cases
What kind of transformations can be done with computed entities? Any transformation that you usually specify
by using the transformation user interface in Power BI, or the M editor, is supported when performing in-
storage computation.
Consider the following example. You have an Account entity that contains the raw data for all the customers
from your Dynamics 365 subscription. You also have ServiceCalls raw data from the service center, with data
from the support calls that were performed from the different accounts on each day of the year.
Imagine you want to enrich the Account entity with data from ServiceCalls.
First you would need to aggregate the data from the ServiceCalls to calculate the number of support calls that
were done for each account in the last year.
Next, you merge the Account entity with the ServiceCallsAggregated entity to calculate the enriched Account
table.
Then you can see the results, shown as EnrichedAccount in the following image.
And that's it—the transformation is done on the data in the dataflow that resides in your Power BI Premium
subscription, not on the source data.
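A minimal M sketch of the two steps described above, using inline sample tables and hypothetical column names
in place of the real Account and ServiceCalls entities:

```powerquery-m
let
    // Hypothetical stand-ins for the Account and ServiceCalls entities.
    Account = #table(
        {"AccountId", "AccountName"},
        {{1, "Contoso"}, {2, "Fabrikam"}}
    ),
    ServiceCalls = #table(
        {"AccountId", "CallDate"},
        {{1, #date(2022, 1, 5)}, {1, #date(2022, 2, 9)}, {2, #date(2022, 3, 1)}}
    ),
    // Step 1: aggregate the service calls per account.
    ServiceCallsAggregated = Table.Group(
        ServiceCalls,
        {"AccountId"},
        {{"SupportCallCount", each Table.RowCount(_), Int64.Type}}
    ),
    // Step 2: merge the aggregate into Account to produce the enriched table.
    Merged = Table.NestedJoin(
        Account, {"AccountId"},
        ServiceCallsAggregated, {"AccountId"},
        "Calls", JoinKind.LeftOuter
    ),
    EnrichedAccount = Table.ExpandTableColumn(Merged, "Calls", {"SupportCallCount"})
in
    EnrichedAccount
```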
See also
Computed entity scenarios and use cases
This article described computed entities and dataflows. Here are some more articles that might be useful:
Self-service data prep in Power BI
Using incremental refresh with dataflows
Connect to data sources for dataflows
Link entities between dataflows
The following links provide additional information about dataflows in Power BI and other resources:
Create and use dataflows in Power BI
Using dataflows with on-premises data sources
Developer resources for Power BI dataflows
Configure workspace dataflow settings (Preview)
Add a CDM folder to Power BI as a dataflow (Preview)
Connect Azure Data Lake Storage Gen2 for dataflow storage (Preview)
For more information about Power Query and scheduled refresh, you can read these articles:
Query overview in Power BI Desktop
Configuring scheduled refresh
For more information about Common Data Model, you can read its overview article:
Common Data Model
Link entities between dataflows
5/25/2022 • 5 minutes to read
With dataflows in Microsoft Power Platform, you can have a single organizational data storage source where
business analysts can prep and manage their data once, and then reuse it between different analytics apps in the
organization.
When you link entities between dataflows, you can reuse entities that have already been ingested, cleansed, and
transformed by dataflows that are owned by others, without the need to maintain that data. The linked entities
simply point to the entities in other dataflows, and do not copy or duplicate the data.
Linked entities are read-only, so if you want to create transformations for a linked entity, you must create a new
computed entity with a reference to the linked entity.
NOTE
Entities differ based on whether they're standard entities or computed entities. Standard entities (often simply referred to
as entities) query an external data source, such as a SQL database. Computed entities require Premium capacity on Power
BI and run their transformations on data that's already in Power BI storage.
If your dataflow isn't located in a Premium capacity workspace, you can still reference a single query—or combine two or
more queries—as long as the transformations aren't defined as in-storage transformations. Such references are
considered standard entities. To do this, turn off the Enable load option for the referenced queries to prevent the data
from being materialized and ingested into storage. From there, you can reference those Enable load = false queries,
and set Enable load to On only for the resulting queries that you want to materialize.
A Navigator window opens, and you can choose a set of entities you can connect to. The window displays
entities for which you have permissions across all workspaces and environments in your organization.
After you select your linked entities, they appear in the list of entities for your dataflow in the authoring tool,
with a special icon identifying them as linked entities.
You can also view the source dataflow from the dataflow settings of your linked entity.
NOTE
The entire refresh process is committed at once. Because of this, if the data refresh for the destination dataflow
fails, the data refresh for the source dataflow fails as well.
Next steps
The following articles might be useful as you create or work with dataflows:
Self-service data prep in Power BI
Using incremental refresh with dataflows
Creating computed entities in dataflows
Connect to data sources for dataflows
The articles below provide more information about dataflows and Power BI:
Create and use dataflows in Power BI
Using computed entities on Power BI Premium
Using dataflows with on-premises data sources
Developer resources for Power BI dataflows
For more information about Power Query and scheduled refresh, you can read these articles:
Query overview in Power BI Desktop
Configuring scheduled refresh
For more information about Common Data Model, you can read its overview article:
Common Data Model - overview
Connect Azure Data Lake Storage Gen2 for
dataflow storage
5/25/2022 • 6 minutes to read
You can configure dataflows to store their data in your organization’s Azure Data Lake Storage Gen2 account.
This article describes the general steps necessary to do so, and provides guidance and best practices along the
way.
IMPORTANT
The dataflows with analytical tables feature uses the Azure Synapse Link for Dataverse service, which may offer
varying levels of compliance, privacy, security, and data location commitments. For more information about
Azure Synapse Link for Dataverse, go to What is Azure Synapse Link for Dataverse?
There are some advantages to configuring dataflows to store their definitions and datafiles in your data lake,
such as:
Azure Data Lake Storage Gen2 provides an enormously scalable storage facility for data.
Dataflow data and definition files can be used by your IT department's developers to take advantage of Azure
data and artificial intelligence (AI) services, as demonstrated in the GitHub samples from Azure data services.
It enables developers in your organization to integrate dataflow data into internal applications and line-of-
business solutions, using developer resources for dataflows and Azure.
Requirements
To use Azure Data Lake Storage Gen2 for dataflows, you need the following:
A Power Apps environment. Any Power Apps plan will allow you to create dataflows with Azure Data Lake
Storage Gen2 as a destination. You'll need to be authorized in the environment as a maker.
An Azure subscription. You need an Azure subscription to use Azure Data Lake Storage Gen2.
A resource group. Use a resource group you already have, or create a new one.
An Azure storage account. The storage account must have the Data Lake Storage Gen2 feature enabled.
TIP
If you don't have an Azure subscription, create a free trial account before you begin.
Prepare your Azure Data Lake Storage Gen2 for Power Platform
dataflows
Before you configure your environment with an Azure Data Lake Storage Gen2 account, you must create and
configure a storage account. Here are the requirements for Power Platform dataflows:
1. The storage account must be created in the same Azure Active Directory tenant as your Power Apps tenant.
2. We recommend that the storage account is created in the same region as the Power Apps environment you
plan to use it in. To determine where your Power Apps environment is, contact your environment admin.
3. The storage account must have the hierarchical namespace feature enabled.
4. You must be granted an Owner role on the storage account.
The following sections walk through the steps necessary to configure your Azure Data Lake Storage Gen2
account.
3. In the list that appears, select Dataflows and then on the command bar select New dataflow .
4. Select the analytical tables you want. These tables indicate what data you want to store in your
organization's Azure Data Lake Storage Gen2 account.
IMPORTANT
You shouldn't change files created by dataflows in your organization's lake or add files to a dataflow's CDM folder.
Changing files might damage dataflows or alter their behavior and is not supported. Power Platform Dataflows only
grants read access to files it creates in the lake. If you authorize other people or services to the filesystem used by Power
Platform Dataflows, only grant them read access to files or folders in that filesystem.
Privacy notice
By enabling the creation of dataflows with Analytical tables in your organization, via the Azure Synapse Link for
Dataverse service, details about the Azure Data Lake storage account, such as the name of the storage account,
will be sent to and stored in the Azure Synapse Link for Dataverse service, which is currently located outside the
Power Apps compliance boundary and may employ lesser or different privacy and security measures than those
typically in Power Apps. Note that you may remove the data lake association at any time to discontinue use of
this functionality and your Azure Data Lake storage account details will be removed from the Azure Synapse
Link for Dataverse service. Further information about Azure Synapse Link for Dataverse is available in this
article.
Next steps
This article provided guidance about how to connect an Azure Data Lake Storage Gen2 account for dataflow
storage.
For more information about dataflows, the Common Data Model, and Azure Data Lake Storage Gen2, go to
these articles:
Self-service data prep with dataflows
Creating and using dataflows in Power Apps
Add data to a table in Microsoft Dataverse
For more information about Azure storage, go to this article:
Azure Storage security guide
For more information about the Common Data Model, go to these articles:
Common Data Model - overview
Common Data Model folders
CDM model file definition
You can ask questions in the Power Apps Community.
What is the storage structure for analytical
dataflows?
5/25/2022 • 3 minutes to read
Analytical dataflows store both data and metadata in Azure Data Lake Storage. Dataflows leverage a standard
structure to store and describe data created in the lake, which is called Common Data Model folders. In this
article, you'll learn more about the storage standard that dataflows use behind the scenes.
You can use this JSON file to migrate (or import) your dataflow into another workspace or environment.
To learn exactly what the model.json metadata file includes, go to The metadata file (model.json) for Common
Data Model.
Data files
In addition to the metadata file, the dataflow folder includes other subfolders. A dataflow stores the data for
each entity in a subfolder with the entity's name. Data for an entity might be split into multiple data partitions,
stored in CSV format.
Next steps
Use the Common Data Model to optimize Azure Data Lake Storage Gen2
The metadata file (model.json) for the Common Data Model
Add a CDM folder to Power BI as a dataflow (Preview)
Connect Azure Data Lake Storage Gen2 for dataflow storage
Dataflows and Azure Data Lake Integration (Preview)
Configure workspace dataflow settings (Preview)
Dataflow storage options
5/25/2022 • 2 minutes to read
Standard dataflows always load data into Dataverse tables in an environment. Analytical dataflows always load
data into Azure Data Lake Storage accounts. For both dataflow types, there's no need to provision or manage the
storage. By default, dataflow storage is provided and managed by the product the dataflow is created in.
Analytical dataflows allow an additional storage option: your organization's Azure Data Lake Storage account.
This option enables access to the data created by a dataflow directly through Azure Data Lake Storage interfaces.
Providing your own storage account for analytical dataflows enables other Azure or line-of-business
applications to leverage the data by connecting to the lake directly.
Linking a Power Platform environment to your organization's Azure Data Lake Storage
To configure dataflows created in Power Apps to store data in your organization's Azure Data Lake Storage,
follow the steps in Connect Azure Data Lake Storage Gen2 for dataflow storage in Power Apps.
Known limitations
After a dataflow is created, its storage location can't be changed.
Linked and computed entities features are only available when both dataflows are in the same storage
account.
There are benefits to using computed entities in a dataflow. This article describes use cases for computed
entities and describes how they work behind the scenes.
A computed entity provides a single place for the transformation logic, and it speeds up the transformation
because the transformation only needs to be done once instead of multiple times. The load on the data source is
also reduced.
Image showing how to create a computed entity from the Orders entity: right-click the Orders entity in the
Queries pane and select the Reference option from the drop-down menu. This creates the computed entity,
which is renamed here to Orders aggregated.
The computed entity can have further transformations. For example, you can use Group By to aggregate the
data at the customer level.
This means that the Orders Aggregated entity will be getting data from the Orders entity, and not from the data
source again. Because some of the transformations that need to be done have already been done in the Orders
entity, performance is better and data transformation is faster.
Image emphasizing the Power Platform dataflows connector in the Power Query choose data source window,
with a description that states that one dataflow entity can be built on top of the data from another dataflow
entity, which is already persisted in storage.
The concept of the computed entity is to have a table persisted in storage, and other tables sourced from it, so
that you can reduce the read time from the data source and share some of the common transformations. This
can be achieved by getting data from other dataflows through the dataflow connector or referencing another
query in the same dataflow.
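Referencing another query in the same dataflow looks like the following sketch. It assumes an entity named
Orders, with a hypothetical CustomerId column, already exists in the dataflow; the computed entity's source is
simply that reference, so the Group By runs against data that's already persisted in storage rather than against
the original data source.

```powerquery-m
let
    // The source of a computed entity is just a reference to another entity
    // (here, "Orders") that's already persisted in dataflow storage.
    Source = Orders,
    // Further transformations run against the stored data, not the original data source.
    OrdersAggregated = Table.Group(
        Source,
        {"CustomerId"},
        {{"OrderCount", each Table.RowCount(_), Int64.Type}}
    )
in
    OrdersAggregated
```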
If the dataflow you're developing is getting bigger and more complex, here are some things you can do to
improve on your original design.
Image showing data being extracted from a data source to staging dataflows, where the entities are either stored
in Dataverse or Azure Data Lake Storage, then the data is moved to transformation dataflows where the data is
transformed and converted to the data warehouse structure, and then the data is moved to the dataset.
This article discusses a collection of best practices for reusing dataflows effectively and efficiently. Read this
article to avoid design pitfalls and potential performance issues as you develop dataflows for reuse.
Image with data being extracted from a data source to staging dataflows, where the entities are either stored in
Dataverse or Azure Data Lake storage, then the data is moved to transformation dataflows where the data is
transformed and converted to the data warehouse structure, and then the data is loaded to a Power BI dataset.
Designing a dimensional model is one of the most common tasks you can do with a dataflow. This article
highlights some of the best practices for creating a dimensional model using a dataflow.
Staging dataflows
One of the key points in any data integration system is to reduce the number of reads from the source
operational system. In the traditional data integration architecture, this reduction is done by creating a new
database called a staging database. The purpose of the staging database is to load data as-is from the data
source on a regular schedule.
The rest of the data integration then uses the staging database as the source for further transformation,
converting it to the dimensional model structure.
We recommend that you follow the same approach using dataflows. Create a set of dataflows that are
responsible for just loading data as-is from the source system (and only for the tables you need). The result is
then stored in the storage structure of the dataflow (either Azure Data Lake Storage or Dataverse). This change
ensures that the read operation from the source system is minimal.
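As a rough sketch, a staging dataflow query can be as small as the source connection and a table pick, with no
transformations. The server, database, and table names below are placeholders.

```powerquery-m
let
    // Hypothetical Azure SQL source; server, database, and table names are placeholders.
    Source = Sql.Database("yourserver.database.windows.net", "SalesDb"),
    // Pick up only the table you need, as-is, so the read from the
    // operational system stays minimal.
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data]
in
    Orders
```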
Next, you can create other dataflows that source their data from staging dataflows. The benefits of this approach
include:
Reducing the number of read operations from the source system, and reducing the load on the source
system as a result.
Reducing the load on data gateways if an on-premises data source is used.
Having an intermediate copy of the data for reconciliation purposes, in case the source system data changes.
Making the transformation dataflows source-independent.
Image emphasizing staging dataflows and staging storage, and showing the data being accessed from the data
source by the staging dataflow, and entities being stored in either Dataverse or Azure Data Lake Storage. The
entities are then shown being transformed along with other dataflows, which are then sent out as queries.
Transformation dataflows
When you've separated your transformation dataflows from the staging dataflows, the transformation will be
independent from the source. This separation helps if you're migrating the source system to a new system. All
you need to do in that case is to change the staging dataflows. The transformation dataflows are likely to work
without any problem, because they're sourced only from the staging dataflows.
This separation also helps in case the source system connection is slow. The transformation dataflow won't need
to wait for a long time to get records coming through a slow connection from the source system. The staging
dataflow has already done that part, and the data will be ready for the transformation layer.
Layered Architecture
A layered architecture is an architecture in which you perform actions in separate layers. The staging and
transformation dataflows can be two layers of a multi-layered dataflow architecture. Keeping actions in separate
layers ensures that only minimal maintenance is required. When you want to change something, you just need to
change it in the layer in which it's located. The other layers should all continue to work fine.
The following image shows a multi-layered architecture for dataflows in which their entities are then used in
Power BI datasets.
In the previous image, the computed entity gets the data directly from the source. However, in the architecture of
staging and transformation dataflows, it's likely that the computed entities are sourced from the staging
dataflows.
One of the best practices for dataflow implementations is separating the responsibilities of dataflows into two
layers: data ingestion and data transformation. This pattern is specifically helpful when dealing with multiple
queries of slower data sources in one dataflow, or multiple dataflows querying the same data sources. Instead of
getting data from a slow data source again and again for each query, the data ingestion process can be done
once, and the transformation can be done on top of that process. This article explains the process.
Using analytical dataflows for data ingestion minimizes the get data process from the source and focuses on
loading data to Azure Data Lake Storage. Once in storage, other dataflows can be created that leverage the
ingestion dataflow's output. The dataflow engine can read the data and do the transformations directly from the
data lake, without contacting the original data source or gateway.
Slow data source
The same process is valid when a data source is slow. Some of the software as a service (SaaS) data sources
perform slowly because of the limitations of their API calls.
Depending on the storage for the output of the Microsoft Power Platform dataflows, you can use that output in
other Azure services.
In the standard dataflow, you can easily map fields from the dataflow query into Dataverse tables. However, if
the Dataverse table has lookup or relationship fields, additional consideration is required to make sure this
process works.
In database design practice, you create a table for Region in scenarios like the one described
above. This Region table would have a Region ID, Name, and other information about the region. The other two
tables (Customers and Stores) have links to this table using a field (which can be Region ID if we have the ID
in both tables, or Name if it's unique enough to determine a region). This means having a relationship from the
Stores and Customers tables to the Region table.
In Dataverse, there are a number of ways to create a relationship. One way is to create a table, and then create a
field in one table that's a relationship (or lookup) to another table, as described in the next section.
In the preceding image, the Region field is a lookup field to another table named Region Lookup. To learn more
about different types of relationships, go to Create a relationship between tables.
After setting the key field, you can see the field in the mapping of the dataflow.
Known limitations
Mapping to polymorphic lookup fields is currently not supported.
Mapping to a multi-level lookup field, a lookup that points to another table's lookup field, is currently not
supported.
Field mapping considerations for standard
dataflows
5/25/2022 • 3 minutes to read
When you create dataflows that write their output to Dataverse, you can follow some guidelines and best
practices to get the best outcome. This article covers some of those best practices.
The primary name field that you see in the field mapping is a label field; this field doesn't need to be unique.
The field that's used in the entity for checking duplication is the field that you set in the Alternate Key
field.
Having a primary key in the entity ensures that even if you have duplicate data rows with the same value in the
field that's mapped to the primary key, the duplicate entries won't be loaded into the entity, and the entity will
always have high-quality data. Having an entity with high-quality data is essential for building
reporting solutions based on the entity.
The primary name field
The primary name field is a display field used in Dataverse. This field is used in default views to show the
content of the entity in other applications. This field isn't the primary key field, and shouldn't be considered as
that. This field can have duplicates, because it's a display field. The best practice, however, is to use a
concatenated field to map to the primary name field, so the name is fully explanatory.
The alternate key field is what's used as the primary key.
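A minimal sketch of building such a concatenated display value in M, with hypothetical column names:

```powerquery-m
let
    // Hypothetical source columns; adjust to your own schema.
    Source = #table(
        {"FirstName", "LastName", "CustomerCode"},
        {{"Alice", "Smith", "C-1001"}, {"Bob", "Jones", "C-1002"}}
    ),
    // Build a descriptive value to map to the primary name column.
    WithDisplayName = Table.AddColumn(
        Source,
        "DisplayName",
        each [FirstName] & " " & [LastName] & " (" & [CustomerCode] & ")",
        type text
    )
in
    WithDisplayName
```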
If someone in the team has created a dataflow and wants to share it with other team members, how does it
work? What are the roles and permission level options available? This article takes you through the roles and
permission levels related to standard dataflows.
Roles
There are multiple roles used to configure the security level for standard dataflows. The following table
describes each role, along with the level of permission associated with that role.
Role | Permission | Description
Basic User | Write to non-custom entities | Has all the rights to work with non-custom entities
System Customizer | Create custom entities | Custom entities this user creates will be visible to this user only
Members of the environment | Get data from dataflows | Every member in the environment can get data from the dataflows in that environment
4. Select the user from the list of users in the environment, and then select Manage roles.
6. Select OK.
Sync your Excel data source with Dataverse using a
dataflow
5/25/2022 • 3 minutes to read
One of the common scenarios that happens when you integrate data into Dataverse is keeping it synchronized
with the source. Using the standard dataflow, you can load data into Dataverse. This article explains how you can
keep the data synchronized with the source system.
Having a key column is important for the table in Dataverse. The key column is the row identifier; this column
contains unique values in each row. Having a key column helps in avoiding duplicate rows, and it also helps in
synchronizing the data with the source system. If a row is removed from the source system, having a key
column is helpful to find it and remove it from Dataverse as well.
The setting is simple: you just need to set the alternate key. However, if you have multiple files or tables, there's
one other step to consider.
If you have multiple files
If you have just one Excel file (or sheet or table), then the steps in the previous procedure are enough to set the
alternate key. However, if you have multiple files (or sheets or tables) with the same structure (but with different
data), then you need to append them together.
If you're getting data from multiple Excel files, then the Combine Files option of Power Query will
automatically append all the data together, and your output will look like the following image.
As shown in the preceding image, besides the append result, Power Query also brings in the Source.Name
column, which contains the file name. The Index value in each file might be unique, but it's not unique across
multiple files. However, the combination of the Index column and the Source.Name column is a unique
combination. Choose a composite alternate key for this scenario.
In this procedure, you'll create a table in Dataverse and fill that table with data from an OData feed by using
Power Query. You can use the same techniques to integrate data from these online and on-premises sources,
among others:
SQL Server
Salesforce
IBM DB2
Access
Excel
Web APIs
OData feeds
Text files
You can also filter, transform, and combine data before you load it into a new or existing table.
If you don't have a license for Power Apps, you can sign up for free.
Prerequisites
Before you start to follow this article:
Switch to an environment in which you can create tables.
You must have a Power Apps per user plan or Power Apps per app plan.
5. Under Connection settings, type or paste this URL, and then select Next:
https://services.odata.org/V4/Northwind/Northwind.svc/
6. In the list of tables, select the Customers check box, and then select Next.
7. (Optional) Modify the schema to suit your needs by choosing which columns to include, transforming the
table in one or more ways, adding an index or conditional column, or making other changes.
8. In the lower-right corner, select Next.
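For reference, the query these steps produce is roughly the following M sketch. The navigation step is what the
Navigator typically generates when you select the Customers table; the exact step written for your query may
differ slightly.

```powerquery-m
let
    // The Northwind OData feed entered in step 5.
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/"),
    // Navigation to the Customers table selected in step 6.
    Customers = Source{[Name = "Customers", Signature = "table"]}[Data]
in
    Customers
```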
You can give the new table a different name or display name, but leave the default values to follow this
tutorial exactly.
2. In the Unique primary name column list, select ContactName, and then select Next.
You can specify a different primary-name column, map a different column in the source table to each
column in the table that you're creating, or both. You can also specify whether Text columns in your query
output should be created as either Multiline Text or Single-Line Text in the Dataverse. To follow this
tutorial exactly, leave the default column mapping.
3. Select Refresh manually for Power Query - Refresh Settings, and then select Publish .
4. Under Dataverse (near the left edge), select Tables to show the list of tables in your database.
The Customers table that you created from an OData feed appears as a custom table.
WARNING
If you use Power Query to add data to an existing table, all data in that table will be overwritten.
If you select Load to existing table , you can specify a table into which you add data from the Customers
table. You could, for example, add the data to the Account table that ships with Dataverse. Under Column
mapping , you can further specify that data in the ContactName column from the Customers table should be
added to the Name column in the Account table.
Microsoft Power Platform dataflows and Azure Data Factory dataflows are often considered to be doing the
same thing: extracting data from source systems, transforming the data, and loading the transformed data into a
destination. However, there are differences in these two types of dataflows, and you can have a solution
implemented that works with a combination of these technologies. This article describes this relationship in
more detail.
FEATURE                          POWER PLATFORM DATAFLOWS                      DATA FACTORY WRANGLING DATAFLOWS
Destinations                     Dataverse or Azure Data Lake Storage          Many destinations (see the list here)
Power Query transformation       All Power Query functions are supported       A limited set of functions are supported (see the list here)
Sources                          Many sources are supported                    Only a few sources (see the list here)
When your dataflow refresh completes, you or others who manage or depend on the dataflow might want to
receive a notification to alert you of the dataflow refresh status. This way, you know your data is up to date and
you can start getting new insights. Another common scenario addressed by this tutorial is notification after a
dataflow fails. A notification allows you to start investigating the problem and alert people that depend on the
data being successfully refreshed.
To set up a Power Automate notification that will be sent when a dataflow fails:
1. Navigate to Power Automate.
2. Select Create > Automated cloud flow .
3. Enter a flow name, and then search for the "When a dataflow refresh completes" connector. Select this
connector from the list, and then select Create .
4. Customize the connector. Enter the following information on your dataflow:
Group Type : Select Environment when connecting to Power Apps and Workspace when connecting
to Power BI.
Group : Select the Power Apps environment or the Power BI workspace your dataflow is in.
Dataflow : Select your dataflow by name.
5. Select New step to add an action to your flow.
6. Search for the Condition connector, and then select it.
7. Customize the Condition connector. Enter the following information:
a. In the first cell, add Refresh Status from the dataflow connector.
b. Leave the second cell as is equal to .
c. In the third cell, enter False .
When your dataflow refresh completes or has been taking longer than expected, you might want your support
team to investigate. With this tutorial, you can automatically open a support ticket, create a message in a queue
or Service Bus, or add an item to Azure DevOps to notify your support team.
In this tutorial, we make use of Azure Service Bus. For instructions on how to set up an Azure Service Bus and
create a queue, go to Use Azure portal to create a Service Bus namespace and a queue.
To automatically create a queue in Azure Service Bus:
1. Navigate to Power Automate.
2. Select Create > Automated cloud flow .
3. Enter a flow name, and then search for the "When a dataflow refresh completes" connector. Select this
connector from the list, and then select Create .
4. Customize the connector. Enter the following information on your dataflow:
Group Type : Select Environment when connecting to Power Apps and Workspace when connecting
to Power BI.
Group : Select the Power Apps environment or the Power BI workspace your dataflow is in.
Dataflow : Select your dataflow by name.
5. Select New step to add an action to your flow.
6. Search for the Condition connector, and then select it.
7. Customize the Condition connector. Enter the following information:
a. In the first cell, add Refresh Status from the dataflow connector.
b. Leave the second cell as is equal to .
c. In the third cell, enter False .
8. In the If Yes section, select Add an action .
9. Search for the "Send message" connector from Service Bus, and then select it.
10. Enter a Connection name for this message. In Connection string , enter the connection string that was
generated when you created the Service Bus namespace. Then select Create .
11. Add dataflow information to the content of your message by selecting the field next to Content , and then
select the dynamic content you want to use from Dynamic content .
Trigger dataflows and Power BI datasets sequentially
5/25/2022 • 2 minutes to read
There are two common scenarios for how you can use this connector to trigger multiple dataflows and Power BI
datasets sequentially.
Trigger the refresh of a standard dataflow after the successful completion of an analytical dataflow
refresh.
If a single dataflow does every action, then it's hard to reuse its entities in other dataflows or for other
purposes. The best dataflows to reuse are dataflows doing only a few actions, specializing in one specific
task. If you have a set of dataflows as staging dataflows, and their only action is to extract data "as is"
from the source system, these dataflows can be reused in multiple other dataflows. More information:
Best practices for reusing dataflows across environments and workspaces
Trigger the refresh of a Power BI dataset when a dataflow refresh completes successfully.
If you want to ensure that your dashboard is up to date after a dataflow refreshes your data, you can use
the connector to trigger the refresh of a Power BI dataset after your dataflow refreshes successfully.
This tutorial covers the first scenario.
To trigger dataflows sequentially:
1. Navigate to Power Automate.
2. Select Create > Automated cloud flow .
3. Enter a flow name, and then search for the "When a dataflow refresh completes" connector. Select this
connector from the list, and then select Create .
4. Customize the connector. Enter the following information on your dataflow:
Group Type : Select Environment when connecting to Power Apps and Workspace when connecting
to Power BI.
Group : Select the Power Apps environment or the Power BI workspace your dataflow is in.
Dataflow : Select your dataflow by name.
5. Select New step to add an action to your flow.
6. Search for the Condition connector, and then select it.
7. Customize the Condition connector. Enter the following information:
a. In the first cell, add Refresh Status from the dataflow connector.
b. Leave the second cell as is equal to .
c. In the third cell, enter Success .
8. In the If Yes section, select Add an action .
9. Search for the "Refresh a dataflow" connector, and then select it.
10. Customize the connector:
Group Type : Select Environment when connecting to Power Apps and Workspace when connecting
to Power BI.
Group : Select the Power Apps environment or the Power BI workspace your dataflow is in.
Dataflow : Select your dataflow by name.
Load data in a Dataverse table and build a
dataflows monitoring report with Power BI
5/25/2022 • 2 minutes to read
This tutorial demonstrates how to load data in a Dataverse table to create a dataflows monitoring report in
Power BI.
You can use this dashboard to monitor your dataflows' refresh duration and failure count. With this dashboard,
you can track any issues with your dataflows' performance and share the data with others.
First, you'll create a new Dataverse table that stores all the metadata from the dataflow run. For every refresh of
a dataflow, a record is added to this table. You can also store metadata for multiple dataflow runs in the same
table. After the table is created, you'll connect the Power BI file to the Dataverse table.
Prerequisites
Power BI Desktop.
A Dataverse environment with permissions to create new custom tables.
A Premium Power Automate License.
A Power BI dataflow or Power Platform dataflow.
Download the .pbit file
First, download the Dataverse .pbit file.
This tutorial demonstrates how to use an Excel file and the dataflows connector in Power Automate to create a
dataflows monitoring report in Power BI.
First, you'll download the Excel file and save it in OneDrive for Business or SharePoint. Next, you'll create a Power
Automate connector that loads metadata from your dataflow to the Excel file in OneDrive for Business or
SharePoint. Lastly, you'll connect a Power BI file to the Excel file to visualize the metadata and start monitoring
the dataflows.
You can use this dashboard to monitor your dataflows' refresh duration and failure count. With this dashboard,
you can track any issues with your dataflows' performance and share the data with others.
Prerequisites
Microsoft Excel
Power BI Desktop.
A Premium Power Automate License
OneDrive for Business.
A Power BI dataflow or Power Platform dataflow.
Create a dataflow
If you don't already have one, create a dataflow. You can create a dataflow in either Power BI dataflows or Power
Apps dataflows.
This tutorial demonstrates how to load data in a Power BI streaming dataset to create a dataflows monitoring
report in Power BI.
First, you'll create a new streaming dataset in Power BI. This dataset collects all the metadata from the dataflow
run, and for every refresh of a dataflow, a record is added to this dataset. You can also store metadata for
multiple dataflow runs in the same dataset. Lastly, you can build a Power BI report on the data to visualize the metadata and start
monitoring the dataflows.
You can use this dashboard to monitor your dataflows' refresh duration and failure count. With this dashboard,
you can track any issues with your dataflows' performance and share the data with others.
Prerequisites
A Power BI Pro License.
A Premium Power Automate License
A Power BI dataflow or Power Platform dataflow.
Create a new streaming dataset in Power BI
1. Navigate to Power BI.
2. Open a workspace.
3. From the workspace, select New > Streaming dataset .
4. From New streaming dataset , select the API tile, and then select Next .
5. In the new pane, turn Historic data analysis on.
6. Enter the following values, and then select Create .
Dataset Name : "Dataflow Monitoring".
Value : "Dataflow Name", Data type : Text.
Value : "Dataflow ID", Data type : Text.
Value : "Refresh Status", Data type : Text.
Value : "Refresh Type", Data type : Text.
Value : "Start Time", Data type : Date and Time.
Value : "End Time", Data type : Date and Time.
Create a dataflow
If you do not already have one, create a dataflow. You can create a dataflow in either Power BI dataflows or
Power Apps dataflows.
In the scenario where you want to automatically retry a dataflow when the refresh fails, the Power Automate
Connector is probably the way to go. In this tutorial, we'll guide you step by step in setting up your Power
Automate flow.
To automatically retry a dataflow on failure:
1. Navigate to Power Automate.
2. Select Create > Automated cloud flow .
3. Enter a flow name, and then search for the When a dataflow refresh completes connector. Select this
connector from the list, and then select Create .
4. Customize the connector. Enter the following information on your dataflow:
a. Group Type : Select Environment if you're connecting to Power Apps and Workspace if you're
connecting to Power BI.
b. Group : Select the Power Apps environment or the Power BI workspace your dataflow is in.
c. Dataflow : Select your dataflow by name.
5. Select New step to add an action to your flow.
6. Search for the Condition connector, and then select it.
7. Customize the Condition connector. Enter the following information:
a. In the first cell, add Refresh Status from the dataflow connector.
b. Leave the second cell as is equal to .
c. In the third cell, enter Failed .
When working with any kind of dataflow other than Power BI dataflows, you can monitor dataflow
refreshes using Power BI. This article includes step-by-step instructions on how to set up your own dashboard to
share with everyone on your team. This dashboard provides insights into the success rate of refreshes, duration,
and much more.
To use these tables, we suggest that you use Power BI to get data through the Dataverse connector.
Known issues
In some cases when you try to connect to the Dataverse tables manually through Power BI, the tables might
appear to be empty. To solve this issue, just refresh the preview and you should be good to go.
Troubleshooting dataflow issues: Creating dataflows
5/25/2022 • 2 minutes to read
This article explains some of the most common errors and issues you might get when you want to create a
dataflow, and how to fix them.
Reason:
Creating dataflows in My workspace isn't supported.
Resolution:
Create your dataflows in organizational workspaces. To learn how to create an organizational workspace, go to
Create the new workspaces in Power BI.
You might have created a dataflow but then had difficulty getting data from it (either by using Power Query in
Power BI Desktop or from other dataflows). This article explains some of the most common problems with
getting data from a dataflow.
After a dataflow is refreshed, the data in entities will be visible in the Navigator window of other tools and
services.
More information: Refreshing a dataflow in Power BI and Set the refresh frequency in Power Apps
When you create a dataflow, sometimes you get an error connecting to the data source. This error can be caused
by the gateway, credentials, or other reasons. This article explains the most common connection errors and
problems, and their resolution.
Reason:
When your entity in the dataflow gets data from an on-premises data source, a gateway is needed for the
connection, but the gateway hasn't been selected.
Resolution:
Select Select gateway . If the gateway hasn't been set up yet, see Install an on-premises data gateway.
Reason:
Disabled modules are related to functions that require an on-premises data gateway connection to work. Even if
the function is getting data from a webpage, because of some security compliance requirements, it needs to go
through a gateway connection.
Resolution:
First, install and set up an on-premises gateway. Then add a web data source for the web URL you're connecting
to.
After adding the web data source, you can select the gateway in the dataflow from Options > Project options .
You might be asked to set up credentials. When you've set up the gateway and your credentials successfully, the
modules will no longer be disabled.
Keyboard shortcuts in Power Query
5/25/2022 • 2 minutes to read
Keyboard shortcuts provide a quick way to navigate and allow users to work more efficiently. For users with
mobility or vision disabilities, keyboard shortcuts can be easier than using the touchscreen, and are an essential
alternative to using the mouse. The table in this article lists all the shortcuts available in Power Query Online.
When using the Query Editor in Power Query Online, you can navigate to the Keyboard shortcuts button in
the Help tab to view the list of keyboard shortcuts.
Query Editor
ACTION                                             KEYBOARD SHORTCUT
Go to column                                       Ctrl+G
Refresh                                            Alt+F5
Search                                             Alt+Q

Data Preview

ACTION                                             KEYBOARD SHORTCUT
When the focus is on the column header
When the focus is on the cell

Diagram View

ACTION                                             KEYBOARD SHORTCUT
Move focus from query level to step level          Alt+Down arrow key

Queries pane

ACTION                                             KEYBOARD SHORTCUT
Select multiple consecutive queries                Ctrl+Up arrow key and Ctrl+Down arrow key
Best practices when working with Power Query
5/25/2022 • 11 minutes to read
This article contains some tips and tricks to make the most out of your data wrangling experience in Power
Query.
Filter early
It's always recommended to filter your data in the early stages of your query or as early as possible. Some
connectors will take advantage of your filters through query folding, as described in Power Query query folding.
It's also a best practice to filter out any data that isn't relevant for your case. This will let you better focus on your
task at hand by only showing data that’s relevant in the data preview section.
You can use the auto filter menu that displays a distinct list of the values found in your column to select the
values that you want to keep or filter out. You can also use the search bar to help you find the values in your
column.
You can also take advantage of the type-specific filters such as In the previous for a date, datetime, or even
datetimezone column.
These type-specific filters can help you create a dynamic filter that will always retrieve data that's in the previous
x number of seconds, minutes, hours, days, weeks, months, quarters, or years as showcased in the following
image.
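As a rough sketch of what such a dynamic filter can look like in M, the following assumes a query named Sales with a date column named OrderDate; the use of Date.IsInPreviousNDays is illustrative of how this kind of filter is typically expressed.

let
    // Assumes a query named Sales with a date column named OrderDate.
    Source = Sales,
    // Keep only the rows whose OrderDate falls in the previous 7 days,
    // relative to whenever the query is refreshed.
    LastSevenDays = Table.SelectRows(Source, each Date.IsInPreviousNDays([OrderDate], 7))
in
    LastSevenDays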
NOTE
To learn more about filtering your data based on values from a column, see Filter by values.
A similar situation occurs for the type-specific filters, since they're specific to certain data types. If your column
doesn't have the correct data type defined, these type-specific filters won't be available.
It's crucial that you always work with the correct data types for your columns. When working with structured
data sources such as databases, the data type information will be brought from the table schema found in the
database. But for unstructured data sources such as TXT and CSV files, it's important that you set the correct
data types for the columns coming from that data source. By default, Power Query offers an automatic data type
detection for unstructured data sources. You can read more about this feature and how it can help you in Data
types.
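For example, a minimal sketch of explicitly setting data types on a CSV source might look like the following; the file path and column names are assumptions for illustration.

let
    // Read the CSV file and promote the first row to headers.
    Source = Csv.Document(File.Contents("C:\data\sales.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Explicitly set the data types instead of relying only on automatic detection.
    Typed = Table.TransformColumnTypes(
        Promoted,
        {{"OrderDate", type date}, {"Amount", type number}, {"Customer", type text}})
in
    Typed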
NOTE
To learn more about the importance of data types and how to work with them, see Data types.
NOTE
To learn more about the data profiling tools, see Data profiling tools.
NOTE
To learn more about all the available features and components found inside the applied steps pane, see Using the Applied
steps list.
You could split this query into two at the Merge with Prices table step. That way it's easier to understand the
steps that were applied to the sales query before the merge. To do this operation, you right-click the Merge
with Prices table step and select the Extract Previous option.
You'll then be prompted with a dialog to give your new query a name. This will effectively split your query into
two queries. One query will have all the steps before the merge. The other query will have an initial step that
will reference your new query and the rest of the steps that you had in your original query from the Merge
with Prices table step downward.
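As a rough sketch, the second query produced by Extract Previous might look like this; the extracted query name, the Prices query, the column names, and the join details are assumptions for illustration.

let
    // The initial step references the query that Extract Previous created.
    Source = #"Sales before merge",
    // The remaining steps of the original query continue from here,
    // starting with the merge against the Prices table.
    #"Merge with Prices table" = Table.NestedJoin(Source, {"ProductID"}, Prices, {"ProductID"}, "Prices table", JoinKind.LeftOuter)
in
    #"Merge with Prices table"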
You could also leverage the use of query referencing as you see fit. But it's a good idea to keep your queries at a
level that doesn't seem daunting at first glance with so many steps.
NOTE
To learn more about query referencing, see Understanding the queries pane.
Create groups
A great way to keep your work organized is by leveraging the use of groups in the queries pane.
The sole purpose of groups is to help you keep your work organized by serving as folders for your queries. You
can create groups within groups should you ever need to. Moving queries across groups is as easy as drag and
drop.
Try to give your groups a meaningful name that makes sense to you and your case.
NOTE
To learn more about all the available features and components found inside the queries pane, see Understanding the
queries pane.
Future-proofing queries
Making sure that you create a query that won't have any issues during a future refresh is a top priority. There
are several features in Power Query to make your query resilient to changes and able to refresh even when
some components of your data source change.
It's a best practice to define the scope of your query as to what it should do and what it should account for in
terms of structure, layout, column names, data types, and any other component that you consider relevant to the
scope.
Some examples of transformations that can help you make your query resilient to changes are listed next; a combined sketch in M follows this list:
If your query has a dynamic number of rows with data, but a fixed number of rows that serve as the
footer that should be removed, you can use the Remove bottom rows feature.
NOTE
To learn more about filtering your data by row position, see Filter a table by row position.
If your query has a dynamic number of columns, but you only need to select specific columns from your
dataset, you can use the Choose columns feature.
NOTE
To learn more about choosing or removing columns, see Choose or remove columns.
If your query has a dynamic number of columns and you need to unpivot only a subset of your columns,
you can use the unpivot only selected columns feature.
NOTE
To learn more about the options to unpivot your columns, see Unpivot columns.
If your query has a step that changes the data type of a column, but some cells yield errors as the values
don't conform to the desired data type, you could remove the rows that yielded error values.
NOTE
To learn more about dealing with errors, see Dealing with errors.
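Putting a few of these transformations together, a minimal sketch in M might look like the following; the query name RawData, the column names, and the two-row footer are assumptions for illustration.

let
    Source = RawData,
    // Remove a fixed two-row footer, no matter how many data rows arrive.
    NoFooter = Table.RemoveLastN(Source, 2),
    // Keep only the columns you need, even if the source adds or reorders columns.
    Selected = Table.SelectColumns(NoFooter, {"OrderDate", "Amount"}, MissingField.UseNull),
    // Set the desired data type, then drop any rows whose values didn't convert.
    Typed = Table.TransformColumnTypes(Selected, {{"Amount", type number}}),
    NoErrors = Table.RemoveRowsWithErrors(Typed, {"Amount"})
in
    NoErrors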
Use parameters
Creating queries that are dynamic and flexible is a best practice. Parameters in Power Query help you make your
queries more dynamic and flexible. A parameter serves as a way to easily store and manage a value that can be
reused in many different ways. But it's more commonly used in two scenarios:
Step argument —You can use a parameter as the argument of multiple transformations driven from the
user interface.
Custom Function argument —You can create a new function from a query, and reference parameters
as the arguments of your custom function.
The main benefits of creating and using parameters are:
Centralized view of all your parameters through the Manage Parameters window.
Reusability of the parameter in multiple steps or queries.
Makes the creation of custom functions straightforward and easy.
You can even use parameters in some of the arguments of the data connectors. For example, you could create a
parameter for your server name when connecting to your SQL Server database. Then you could use that
parameter inside the SQL Server database dialog.
If you change your server location, all you need to do is update the parameter for your server name and your
queries will be updated.
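For instance, a minimal sketch of a query that uses such a parameter might look like this; the parameter name ServerName and the database and table names are assumptions for illustration.

let
    // ServerName is a text parameter created through Manage Parameters.
    Source = Sql.Database(ServerName, "AdventureWorks"),
    Employee = Source{[Schema = "HumanResources", Item = "Employee"]}[Data]
in
    Employee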
NOTE
To learn more about creating and using parameters, see Using parameters.
You start by having a parameter that has a value that serves as an example.
From that parameter, you create a new query where you apply the transformations that you need. For this case,
you want to split the code PTY-CM1090-LAX into multiple components:
Origin = PTY
Destination = LAX
Airline = CM
FlightID = 1090
You can then transform that query into a function by doing a right-click on the query and selecting Create
Function . Finally, you can invoke your custom function into any of your queries or values, as shown in the
following image.
After a few more transformations, you can see that you've reached your desired output and leveraged the logic
for such a transformation from a custom function.
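As a rough sketch, the resulting custom function might look something like the following; the parameter name and the exact splitting logic are assumptions for illustration.

(FlightCode as text) as record =>
let
    // Split "PTY-CM1090-LAX" into its three dash-separated parts.
    Parts = Text.Split(FlightCode, "-"),
    Origin = Parts{0},
    Destination = Parts{2},
    // The middle part combines the airline code and the flight number.
    Airline = Text.Start(Parts{1}, 2),
    FlightID = Text.Range(Parts{1}, 2)
in
    [Origin = Origin, Destination = Destination, Airline = Airline, FlightID = FlightID]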
NOTE
To learn more about how to create and use custom functions in Power Query, see Custom Functions.
Power Query feedback
5/25/2022 • 2 minutes to read
This article describes how to get support or submit feedback for Power Query.
For Power Query connectors , go to Feedback and support for Power Query connectors.
For Power Query documentation , you can submit feedback through the Submit and view feedback for
this page link at the bottom of each article.
Submitting feedback
To submit feedback about Power Query, provide the feedback to the "ideas" forum for the product you're using
Power Query in. For example, for Power BI, visit the Power BI ideas forum. If you have one, you can also provide
feedback directly to your Microsoft account contact.
Power Query query folding
5/25/2022 • 4 minutes to read
This article targets data modelers developing models in Power Pivot or Power BI Desktop. It describes what
Power Query query folding is, and why it's important in your data model designs. This article also describes the
data sources and transformations that can achieve query folding, and how to determine that your Power Query
queries can be folded—whether fully or partially.
Query folding is the ability for a Power Query query to generate a single query statement to retrieve and
transform source data. The Power Query mashup engine strives to achieve query folding whenever possible for
reasons of efficiency.
Query folding is an important topic for data modeling for several reasons:
Import model tables: Data refresh will take place efficiently for Import model tables (Power Pivot or Power
BI Desktop), in terms of resource utilization and refresh duration.
DirectQuery and Dual storage mode tables: Each DirectQuery and Dual storage mode table (Power BI
only) must be based on a Power Query query that can be folded.
Incremental refresh: Incremental data refresh (Power BI only) will be efficient, in terms of resource
utilization and refresh duration. In fact, the Power BI Incremental Refresh configuration window will display
a warning if it determines that query folding for the table can't be achieved. If it can't be achieved,
the goal of incremental refresh is defeated. The mashup engine would then be required to retrieve all source
rows, and then apply filters to determine incremental changes.
Query folding may occur for an entire Power Query query, or for a subset of its steps. When query folding
cannot be achieved—either partially or fully—the Power Query mashup engine must compensate by processing
data transformations itself. This process can involve retrieving source query results, which for large datasets is
very resource intensive and slow.
We recommend that you strive to achieve efficiency in your model designs by ensuring query folding occurs
whenever possible.
Date.Year([OrderDate])
Date.ToText([OrderDate], "yyyy")
To view the folded query, select the View Native Query option. You're then presented with the native
query that Power Query will use to source data.
If the View Native Query option isn't enabled (greyed out), this is evidence that not all query steps can be
folded. However, it could mean that a subset of steps can still be folded. Working backwards from the last step,
you can check each step to see if the View Native Query option is enabled. If so, you've learned where, in
the sequence of steps, query folding could no longer be achieved.
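For example, the following minimal sketch shows a query where folding typically stops partway through; the server, database, table, and column names are assumptions for illustration.

let
    Source = Sql.Database("MyServer", "AdventureWorks"),
    Sales = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data],
    // Filters and column selections like this one typically fold to the source.
    Filtered = Table.SelectRows(Sales, each [OrderDate] >= #datetime(2022, 1, 1, 0, 0, 0)),
    // A step such as adding an index column commonly prevents further folding,
    // so View Native Query would typically be disabled from this step onward.
    Indexed = Table.AddIndexColumn(Filtered, "Index", 1, 1)
in
    Indexed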
Next steps
For more information about Query Folding and related articles, check out the following resources:
Best practice guidance for query folding
Use composite models in Power BI Desktop
Incremental refresh in Power BI Premium
Using Table.View to Implement Query Folding
How fuzzy matching works in Power Query
5/25/2022 • 3 minutes to read
Power Query features such as fuzzy merge, cluster values, and fuzzy grouping use the same mechanisms to
work as fuzzy matching.
This article goes over many scenarios that demonstrate how to take advantage of the options that fuzzy
matching has, with the goal of making 'fuzzy' clear.
Because the word Apples in the second string is only a small part of the whole text string, that comparison
yields a lower similarity score.
For example, the following dataset consists of responses from a survey that had only one question—"What is
your favorite fruit?"
FRUIT
Blueberries
Strawberries
Strawberries = <3
Apples
'sples
4ppl3s
Bananas
Banas
The survey provided one single textbox to input the value and had no validation.
Now you're tasked with clustering the values. To do that task, load the previous table of fruits into Power Query,
select the column, and then select the Cluster values option in the Add column tab in the ribbon.
The Cluster values dialog box appears, where you can specify the name of the new column. Name this new
column Cluster and select OK .
By default, Power Query uses a similarity threshold of 0.8 (or 80%) and the result of the previous operation
yields the following table with a new Cluster column.
While the clustering has been done, it's not giving you the expected results for all the rows. Row number two (2)
still has the value Blue berries are simply the best , but it should be clustered to Blueberries , and something
similar happens to the text strings Strawberries = <3 , fav fruit is bananas , and
My favorite fruit, by far, is Apples. I simply love them! .
To determine what's causing this clustering, double-click Clustered values in the Applied steps panel to bring
back the Cluster values dialog box. Inside this dialog box, expand Fuzzy cluster options . Enable the Show
similarity scores option, and then select OK .
Enabling the Show similarity scores option creates a new column in your table. This column shows you the
exact similarity score between the defined cluster and the original value.
Upon closer inspection, Power Query couldn't find any other values in the similarity threshold for the text strings
Blue berries are simply the best , Strawberries = <3 , fav fruit is bananas , and
My favorite fruit, by far, is Apples. I simply love them! .
Go back to the Cluster values dialog box one more time by double-clicking Clustered values in the Applied
steps panel. Change the Similarity threshold from 0.8 to 0.6 , and then select OK .
This change gets you closer to the result that you're looking for, except for the text string
My favorite fruit, by far, is Apples. I simply love them! . When you changed the Similarity threshold
value from 0.8 to 0.6 , Power Query was now able to use the values with a similarity score that starts from 0.6
all the way up to 1.
NOTE
Power Query always uses the value closest to the threshold to define the clusters. The threshold defines the lower limit of
the similarity score that's acceptable to assign the value to a cluster.
You can try again by changing the Similarity score from 0.6 to a lower number until you get the results that
you're looking for. In this case, change the Similarity score to 0.5 . This change yields the exact result that
you're expecting with the text string My favorite fruit, by far, is Apples. I simply love them! now assigned
to the cluster Apples .
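Equivalently, the clustering step can be expressed in M along the lines of the following sketch; the query name and, in particular, the Threshold and SimilarityColumnName option names are assumptions for illustration.

let
    // Assumes a query named Fruits with a single Fruit column.
    Source = Fruits,
    Clustered = Table.AddFuzzyClusterColumn(
        Source,
        "Fruit",
        "Cluster",
        // 0.5 mirrors the similarity threshold chosen above; option names are assumed.
        [Threshold = 0.5, SimilarityColumnName = "Similarity score"])
in
    Clustered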
NOTE
Currently, only the Cluster values feature in Power Query Online provides a new column with the similarity score.
Behind the scenes of the Data Privacy Firewall
5/25/2022 • 13 minutes to read
NOTE
Privacy levels are currently unavailable in Power Platform dataflows. The product team is working towards re-enabling this
functionality in the coming weeks.
If you’ve used Power Query for any length of time, you’ve likely experienced it. There you are, querying away,
when you suddenly get an error that no amount of online searching, query tweaking, or keyboard bashing can
remedy. An error like:
Formula.Firewall: Query 'Query1' (step 'Source') references other queries or steps, so it may not directly
access a data source. Please rebuild this data combination.
Or maybe:
Formula.Firewall: Query 'Query1' (step 'Source') is accessing data sources that have privacy levels which
cannot be used together. Please rebuild this data combination.
These Formula.Firewall errors are the result of Power Query’s Data Privacy Firewall (aka the Firewall), which at
times may seem like it exists solely to frustrate data analysts the world over. Believe it or not, however, the
Firewall serves an important purpose. In this article, we’ll delve under the hood to better understand how it
works. Armed with greater understanding, you'll hopefully be able to better diagnose and fix Firewall errors in
the future.
What is it?
The purpose of the Data Privacy Firewall is simple: it exists to prevent Power Query from unintentionally leaking
data between sources.
Why is this needed? I mean, you could certainly author some M that would pass a SQL value to an OData feed.
But this would be intentional data leakage. The mashup author would (or at least should) know they were doing
this. Why then the need for protection against unintentional data leakage?
The answer? Folding.
Folding?
Folding is a term that refers to converting expressions in M (such as filters, renames, joins, and so on) into
operations against a raw data source (such as SQL, OData, and so on). A huge part of Power Query’s power
comes from the fact that PQ can convert the operations a user performs via its user interface into complex SQL
or other backend data source languages, without the user having to know said languages. Users get the
performance benefit of native data source operations, with the ease of use of a UI where all data sources can be
transformed using a common set of commands.
As part of folding, PQ sometimes may determine that the most efficient way to execute a given mashup is to
take data from one source and pass it to another. For example, if you’re joining a small CSV file to a huge SQL
table, you probably don’t want PQ to read the CSV file, read the entire SQL table, and then join them together on
your local computer. You probably want PQ to inline the CSV data into a SQL statement and ask the SQL
database to perform the join.
This is how unintentional data leakage can happen.
Imagine if you were joining SQL data that included employee Social Security Numbers with the results of an
external OData feed, and you suddenly discovered that the Social Security Numbers from SQL were being sent
to the OData service. Bad news, right?
This is the kind of scenario the Firewall is intended to prevent.
These queries will end up divided into two partitions: one for the Employees query, and one for the
EmployeesReference query (which will reference the Employees partition). When evaluated with the Firewall on,
these queries will be rewritten like so:
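As a minimal sketch of the rewrite (the Employees source and step names shown are assumptions for illustration; only the Value.Firewall call is confirmed by the description that follows):

shared Employees = let
    Source = Sql.Database("MyServer", "AdventureWorks"),
    Employee = Source{[Schema = "HumanResources", Item = "Employee"]}[Data]
in
    Employee;

shared EmployeesReference = let
    // The plain reference to Employees is replaced by a call to Value.Firewall.
    Source = Value.Firewall("Section1/Employees")
in
    Source;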
Notice that the simple reference to the Employees query has been replaced by a call to Value.Firewall , which is
provided the full name of the Employees query.
When EmployeesReference is evaluated, the call to Value.Firewall("Section1/Employees") is intercepted by the
Firewall, which now has a chance to control whether (and how) the requested data flows into the
EmployeesReference partition. It can do any number of things: deny the request, buffer the requested data
(which prevents any further folding to its original data source from occurring), and so on.
This is how the Firewall maintains control over the data flowing between partitions.
Partitions that directly access data sources
Let’s say you define a query Query1 with one step (note that this single-step query will correspond to one
Firewall partition), and that this single step accesses two data sources: a SQL database table and a CSV file. How
does the Firewall deal with this, since there’s no partition reference, and thus no call to Value.Firewall for it to
intercept? Let’s review to the rule stated earlier:
A partition may either access compatible data sources, or reference other partitions, but not both.
In order for your single-partition-but-two-data-sources query to be allowed to run, its two data sources must be
“compatible”. In other words, it needs to be okay for data to be shared between them. In terms of the Power
Query UI, this means the privacy levels of the SQL and CSV data sources need to both be Public, or both be
Organizational. If they are both marked Private, or one is marked Public and one is marked Organizational, or
they are marked using some other combination of privacy levels, then it's not safe for them to both be evaluated
in the same partition. Doing so would mean unsafe data leakage could occur (due to folding), and the Firewall
would have no way to prevent it.
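For concreteness, here's a minimal sketch of a single-step query that touches both a SQL table and a CSV file in the same partition; the server, database, file path, and column names are assumptions for illustration.

let
    // One step (a single let binding) that reaches two data sources at once.
    Source = Table.NestedJoin(
        Sql.Database("MyServer", "AdventureWorks"){[Schema = "Sales", Item = "Customer"]}[Data],
        {"CustomerID"},
        Table.PromoteHeaders(Csv.Document(File.Contents("C:\data\orders.csv"))),
        {"CustomerID"},
        "Orders",
        JoinKind.LeftOuter)
in
    Source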
What happens if you try to access incompatible data sources in the same partition?
Formula.Firewall: Query 'Query1' (step 'Source') is accessing data sources that have privacy levels which
cannot be used together. Please rebuild this data combination.
Hopefully you now better understand one of the error messages listed at the beginning of this article.
Note that this compatibility requirement only applies within a given partition. If a partition is referencing other
partitions, the data sources from the referenced partitions don't have to be compatible with one another. This is
because the Firewall can buffer the data, which will prevent any further folding against the original data source.
The data will be loaded into memory and treated as if it came from nowhere.
Why not do both?
Let’s say you define a query with one step (which will again correspond to one partition) that accesses two other
queries (that is, two other partitions). What if you wanted, in the same step, to also directly access a SQL
database? Why can’t a partition reference other partitions and directly access compatible data sources?
As you saw earlier, when one partition references another partition, the Firewall acts as the gatekeeper for all the
data flowing into the partition. To do so, it must be able to control what data is allowed in. If there are data
sources being accessed within the partition, as well as data flowing in from other partitions, it loses its ability to
be the gatekeeper, since the data flowing in could be leaked to one of the internally accessed data sources
without it knowing about it. Thus the Firewall prevents a partition that accesses other partitions from being
allowed to directly access any data sources.
So what happens if a partition tries to reference other partitions and also directly access data sources?
Formula.Firewall: Query 'Query1' (step 'Source') references other queries or steps, so it may not directly
access a data source. Please rebuild this data combination.
Now you hopefully better understand the other error message listed at the beginning of this article.
Partitions in-depth
As you can probably guess from the above information, how queries are partitioned ends up being incredibly
important. If you have some steps that are referencing other queries, and other steps that access data sources,
you now hopefully recognize that drawing the partition boundaries in certain places will cause Firewall errors,
while drawing them in other places will allow your query to run just fine.
So how exactly do queries get partitioned?
This section is probably the most important for understanding why you’re seeing Firewall errors, as well as
understanding how to resolve them (where possible).
Here’s a high-level summary of the partitioning logic.
Initial Partitioning
Creates a partition for each step in each query
Static Phase
This phase doesn’t depend on evaluation results. Instead, it relies on how the queries are structured.
Parameter Trimming
Trims parameter-esque partitions, that is, any one that:
Doesn’t reference any other partitions
Doesn’t contain any function invocations
Isn’t cyclic (that is, it doesn’t refer to itself)
Note that “removing” a partition effectively includes it in whatever other partitions reference it.
Trimming parameter partitions allows parameter references used within data source function
calls (for example, Web.Contents(myUrl) ) to work, instead of throwing “partition can’t reference
data sources and other steps” errors.
Grouping (Static)
Partitions are merged, while maintaining separation between:
Partitions in different queries
Partitions that reference other partitions vs. those that don’t
Dynamic Phase
This phase depends on evaluation results, including information about data sources accessed by
various partitions.
Trimming
Trims partitions that meet all the following requirements:
Doesn’t access any data sources
Doesn’t reference any partitions that access data sources
Isn’t cyclic
Grouping (Dynamic)
Now that unnecessary partitions have been trimmed, try to create Source partitions that are as
large as possible.
Merge all partitions with their input partitions if each of its inputs:
Is part of the same query
Doesn’t reference any other partitions
Is only referenced by the current partition
Isn’t the result (that is, final step) of a query
Isn’t cyclic
// Excerpt: the end of the Contacts query
    ...
in
    #"Changed Type";

// Excerpt: the first three steps and the end of the Employees query
    Source = Sql.Databases(DbServer),
    AdventureWorks = Source{[Name="AdventureWorks"]}[Data],
    HumanResources_Employee = AdventureWorks{[Schema="HumanResources",Item="Employee"]}[Data],
    ...
in
    #"Expanded Contacts";
Here’s a higher-level view, showing the dependencies.
Let’s partition
Let’s zoom in a bit and include steps in the picture, and start walking through the partitioning logic. Here’s a
diagram of the three queries, showing the initial firewall partitions in green. Notice that each step starts in its
own partition.
Next, we trim parameter partitions. Thus, DbServer gets implicitly included in the Source partition.
Now we perform the static grouping. This maintains separation between partitions in separate queries (note for
instance that the last two steps of Employees don’t get grouped with the steps of Contacts), as well as between
partitions that reference other partitions (such as the last two steps of Employees) and those that don’t (such as
the first three steps of Employees).
Now we enter the dynamic phase. In this phase, the above static partitions are evaluated. Partitions that don’t
access any data sources are trimmed. Partitions are then grouped to create source partitions that are as large as
possible. However, in this sample scenario, all the remaining partitions access data sources, and there isn’t any
further grouping that can be done. The partitions in our sample thus won’t change during this phase.
Let’s pretend
For the sake of illustration, though, let’s look at what would happen if the Contacts query, instead of coming
from a text file, were hard-coded in M (perhaps via the Enter Data dialog).
In this case, the Contacts query would not access any data sources. Thus, it would get trimmed during the first
part of the dynamic phase.
With the Contacts partition removed, the last two steps of Employees would no longer reference any partitions
except the one containing the first three steps of Employees. Thus, the two partitions would be grouped.
The resulting partition would look like this.
Example: Passing data from one data source to another
Okay, enough abstract explanation. Let's look at a common scenario where you're likely to encounter a Firewall
error and the steps to resolve it.
Imagine you want to look up a company name from the Northwind OData service, and then use the company
name to perform a Bing search.
First, you create a Company query to retrieve the company name.
let
Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/", null,
[Implementation="2.0"]),
Customers_table = Source{[Name="Customers",Signature="table"]}[Data],
CHOPS = Customers_table{[CustomerID="CHOPS"]}[CompanyName]
in
CHOPS
Next, you create a Search query that references Company and passes it to Bing.
let
Source = Text.FromBinary(Web.Contents("https://www.bing.com/search?q=" & Company))
in
Source
At this point you run into trouble. Evaluating Search produces a Firewall error.
Formula.Firewall: Query 'Search' (step 'Source') references other queries or steps, so it may not directly
access a data source. Please rebuild this data combination.
This is because the Source step of Search is referencing a data source (bing.com) and also referencing another
query/partition (Company ). It is violating the rule mentioned above ("a partition may either access compatible
data sources, or reference other partitions, but not both").
What to do? One option is to disable the Firewall altogether (via the Privacy option labeled Ignore the Privacy
Levels and potentially improve performance ). But what if you want to leave the Firewall enabled?
To resolve the error without disabling the Firewall, you can combine Company and Search into a single query,
like this:
let
Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/", null,
[Implementation="2.0"]),
Customers_table = Source{[Name="Customers",Signature="table"]}[Data],
CHOPS = Customers_table{[CustomerID="CHOPS"]}[CompanyName],
Search = Text.FromBinary(Web.Contents("https://www.bing.com/search?q=" & CHOPS))
in
Search
Everything is now happening inside a single partition. Assuming that the privacy levels for the two data sources
are compatible, the Firewall should now be happy, and you'll no longer get an error.
That’s a wrap
While there's much more that could be said on this topic, this introductory article is already long enough.
Hopefully it’s given you a better understanding of the Firewall, and will help you to understand and fix Firewall
errors when you encounter them in the future.
Query Diagnostics
5/25/2022 • 11 minutes to read
With Query Diagnostics, you can achieve a better understanding of what Power Query is doing at authoring and
at refresh time in Power BI Desktop. While we'll be expanding on this feature in the future, including adding the
ability to use it during full refreshes, at this time you can use it to understand what sort of queries you're
emitting, what slowdowns you might run into during authoring refresh, and what kind of background events are
happening.
To use Query Diagnostics, go to the Tools tab in the Power Query Editor ribbon.
By default, Query Diagnostics might require administrative rights to run (depending on IT policy). If you find
yourself unable to run Query Diagnostics, open the Power BI options page, and in the Diagnostics tab, select
Enable in Query Editor (does not require running as admin) . This selection prevents you from tracing
diagnostics during a full refresh into Power BI (rather than in the Power Query editor), but still allows you to use
it when previewing, authoring, and so on.
Whenever you start diagnostics, Power Query begins tracing any evaluations that you cause. The evaluation that
most users think of is when you press refresh, or when you retrieve data for the first time, but there are many
actions that can cause evaluations, depending on the connector. For example, with the SQL connector, when you
retrieve a list of values to filter, that would kick off an evaluation as well—but it doesn't associate with a user
query, and that's represented in the diagnostics. Other system-generated queries might include the navigator or
the get data experience.
When you press Diagnose Step , Power Query runs a special evaluation of just the step you're looking at. It
then shows you the diagnostics for that step, without showing you the diagnostics for other steps in the query.
This can make it much easier to get a narrow view into a problem.
If you're recording all traces from Start Diagnostics , it's important that you press Stop Diagnostics .
Stopping the diagnostics allows the engine to collect the recorded traces and parse them into the proper output.
Without this step, you'll lose your traces.
Types of Diagnostics
We currently provide three types of diagnostics, one of which has two levels of detail.
The first of these diagnostics are the primary diagnostics, which have a detailed view and a summarized view.
The summarized view aims to give you immediate insight into where time is being spent in your query.
The detailed view is much deeper, line by line, and is, in general, only needed for serious diagnosing by power
users.
For this view, some capabilities, like the Data Source Query column, are currently available only on certain
connectors. We'll be working to extend the breadth of this coverage in the future.
Data privacy partitions provide you with a better understanding of the logical partitions used for data privacy.
NOTE
Power Query might perform evaluations that you may not have directly triggered. Some of these evaluations are
performed in order to retrieve metadata so we can best optimize our queries or to provide a better user experience (such
as retrieving the list of distinct values within a column that are displayed in the Filter Rows experience). Others might be
related to how a connector handles parallel evaluations. At the same time, if you see in your query diagnostics repeated
queries that you don't believe make sense, feel free to reach out through normal support channels—your feedback is how
we improve our product.
Diagnostics Schema
Id
When analyzing the results of a recording, it's important to filter the recording session by Id, so that columns
such as Exclusive Duration % make sense.
Id is a composite identifier. It's formed of two numbers—one before the dot, and one after. The first number is
the same for all evaluations that resulted from a single user action. In other words, if you press refresh twice,
there will be two different numbers leading the dot, one for each user activity taken. This numbering is
sequential for a given diagnostics recording.
The second number represents an evaluation by the engine. This number is sequential for the lifetime of the
process where the evaluation is queued. If you run multiple diagnostics recording sessions, you'll see this
number continue to grow across the different sessions.
To summarize, if you start recording, press evaluation once, and stop recording, you'll have some number of Ids
in your diagnostics. But since you only took one action, they'll all be 1.1, 1.2, 1.3, and so on.
The combination of the activityId and the evaluationId, separated by the dot, provides a unique identifier for an
evaluation of a single recording session.
Query
The name of the Query in the left-hand pane of the Power Query editor.
Step
The name of the Step in the right-hand pane of the Power Query editor. Things like filter dropdowns generally
associate with the step you're filtering on, even if you're not refreshing the step.
Category
The category of the operation.
Data Source Kind
This tells you what sort of data source you're accessing, such as SQL or Oracle.
Operation
The actual operation being performed. This operation can include evaluator work, opening connections, sending
queries to the data source, and many more.
Start Time
The time that the operation started.
End Time
The time that the operation ended.
Exclusive Duration (%)
The Exclusive Duration column of an event is the amount of time the event was active. This contrasts with the
"duration" value that results from subtracting the values in an event's Start Time column and End Time column.
This "duration" value represents the total time that elapsed between when an event began and when it ended,
which may include times the event was in a suspended or inactive state and another event was consuming
resources.
Exclusive duration % adds up to approximately 100% within a given evaluation, as represented by the Id column.
For example, if you filter on rows with Id 1.x, the Exclusive Duration percentages would sum to approximately
100%. This isn't the case if you sum the Exclusive Duration % values of all rows in a given diagnostic table.
Exclusive Duration
The absolute time, rather than %, of exclusive duration. The total duration (that is, exclusive duration + time
when the event was inactive) of an evaluation can be calculated in one of two ways:
Find the operation called "Evaluation". The difference between End Time–Start Time results in the total
duration of an event.
Subtract the minimum start time of all operations in an event from the maximum end time. Note that in
cases when the information collected for an event doesn't account for the total duration, an operation
called "Trace Gaps" is generated to account for this time gap.
Resource
The resource you're accessing for data. The exact format of this resource depends on the data source.
Data Source Query
Power Query does something called Folding, which is the act of running as many parts of the query against the
back-end data source as possible. In Direct Query mode (over Power Query), where enabled, only transforms
that fold will run. In import mode, transforms that can't fold will instead be run locally.
The Data Source Query column allows you to see the query or HTTP request/response sent against the back-end
data source. As you author your Query in the editor, many Data Source Queries will be emitted. Some of these
are the actual final Data Source Query to render the preview, but others may be for data profiling, filter
dropdowns, information on joins, retrieving metadata for schemas, and any number of other small queries.
In general, you shouldn't be concerned by the number of Data Source Queries emitted unless there are specific
reasons to be concerned. Instead, you should focus on making sure the proper content is being retrieved. This
column might also help determine if the Power Query evaluation was fully folded.
Additional Info
There's a lot of information retrieved by our connectors. Much of it is ragged and doesn't fit well into a standard
column hierarchy. This information is put in a record in the additional info column. Information logged from
custom connectors also appears here.
Row Count
The number of rows returned by a Data Source Query. Not enabled on all connectors.
Content Length
Content length returned by HTTP Requests, as commonly defined. This isn't enabled in all connectors, and it
won't be accurate for connectors that retrieve requests in chunks.
Is User Query
A Boolean value that indicates if it's a query authored by the user and present in the left-hand pane, or if it was
generated by some other user action. Other user actions can include things such as filter selection or using the
navigator in the get data experience.
Path
Path represents the relative route of the operation when viewed as part of an interval tree for all operations
within a single evaluation. At the top (root) of the tree, there's a single operation called Evaluation with path "0".
The start time of this evaluation corresponds to the start of this evaluation as a whole. The end time of this
evaluation shows when the whole evaluation finished. This top-level operation has an exclusive duration of 0, as
its only purpose is to serve as the root of the tree.
Further operations branch from the root. For example, an operation might have "0/1/5" as a path. This path
would be understood as:
0: tree root
1: current operation's parent
5: index of current operation
Operation "0/1/5" might have a child node, in which case, the path has the form "0/1/5/8", with 8 representing
the index of the child.
Group ID
Combining two (or more) operations won't occur if it leads to detail loss. The grouping is designed to
approximate "commands" executed during the evaluation. In the detailed view, multiple operations share a
Group Id, corresponding to the groups that are aggregated in the Summary view.
As with most columns, the group ID is only relevant within a specific evaluation, as filtered by the Id column.
Additional Reading
How to record diagnostics in various use cases
More about reading and visualizing your recorded traces
How to understand what query operations are folding using Query Diagnostics
Recording Query Diagnostics in Power BI
5/25/2022 • 6 minutes to read
When authoring in Power Query, the basic workflow is that you connect to a data source, apply some
transformations, potentially refresh your data in the Power Query editor, and then load it to the Power BI model.
Once it's in the Power BI model, you may refresh it from time to time in Power BI Desktop (if you're using
Desktop to view analytics), aside from any refreshes you do in the service.
While you may get a similar result at the end of an authoring workflow, refreshing in the editor, or refreshing in
Power BI proper, very different evaluations are run by the software for the different user experiences provided.
It's important to know what to expect when doing query diagnostics in these different workflows so you aren't
surprised by the very different diagnostic data.
To start Query Diagnostics, go to the 'Tools' tab in the Power Query Editor ribbon. You're presented here with a
few different options.
There are two primary options here, 'Diagnose Step' and 'Start Diagnostics' (paired with 'Stop Diagnostics'). The
former will give you information on a query up to a selected step, and is most useful for understanding what
operations are being performed locally or remotely in a query. The latter gives you more insight into a variety of
other cases, discussed below.
Connector Specifics
It's important to mention that there is no way to cover all the different permutations of what you'll see in Query
Diagnostics. There are lots of things that can change exactly what you see in results:
Connector
Transforms applied
System that you're running on
Network configuration
Advanced configuration choices
ODBC configuration
For the broadest coverage, this documentation will focus on Query Diagnostics of the Northwind Customers
table, both on SQL and OData. The OData notes use the public endpoint found at the OData.org website, while
you'll need to provide a SQL server for yourself. Many data sources will differ significantly from these, and will
have connector specific documentation added over time.
Once you connect and choose authentication, select the 'Customers' table from the OData service.
This will present you with the Customers table in the Power Query interface. Let's say that we want to know how
many Sales Representatives there are in different countries. First, right click on 'Sales Representative' under the
'Contact Title' column, mouse over 'Text Filters', and select 'Equals'.
Now, select 'Group By' from the Ribbon and do a grouping by 'Country', with your aggregate being a 'Count'.
This should present you with the same data you see below.
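For reference, the applied steps described above correspond to an M script roughly like the following sketch. The exact navigation step and the step names that the UI generates may differ.
let
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/"),
    Customers = Source{[Name = "Customers"]}[Data],
    // Filter 'Contact Title' to 'Sales Representative'
    #"Filtered rows" = Table.SelectRows(Customers, each [ContactTitle] = "Sales Representative"),
    // Group by 'Country' with a Count aggregate
    #"Grouped rows" = Table.Group(#"Filtered rows", {"Country"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
    #"Grouped rows"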
Finally, navigate back to the 'Tools' tab of the Ribbon and click 'Stop Diagnostics'. This will stop the tracing and
build your diagnostics file for you, and the summary and detailed tables will appear on the left-hand side.
If you trace an entire authoring session, you will generally expect to see something like a source query
evaluation, then evaluations related to the relevant navigator, then at least one query emitted for each step you
apply (with potentially more depending on the exact UX actions taken). In some connectors, parallel evaluations
will happen for performance reasons that will yield very similar sets of data.
Refresh Preview
When you have finished transforming your data, you have a sequence of steps in a query. When you press
'Refresh Preview' or 'Refresh All' in the Power Query editor, you won't see just one step in your query
diagnostics. The reason for this is that refreshing in the Power Query Editor explicitly refreshes the query ending
with the last step applied, and then steps back through the applied steps and refreshes for the query up to that
point, back to the source.
This means that if you have five steps in your query, including Source and Navigator, you will expect to see five
different evaluations in your diagnostics. The first one, chronologically, will often (but not always) take the
longest. This is due to two different reasons:
It may potentially cache input data that the queries run after it (representing earlier steps in the User Query)
can access faster locally.
It may have transforms applied to it that significantly truncate how much data has to be returned.
Note that 'Refresh All' refreshes all queries, so you'll need to filter to the ones you care about, as you might expect.
Full Refresh
Query Diagnostics can be used to diagnose the so-called 'final query' that is emitted during the Refresh in
Power BI, rather than just the Power Query editor experience. To do this, you first need to load the data to the
model once. If you plan to do this, be aware that pressing 'Close and Apply' closes the editor window (interrupting tracing), so you either need to do it on the second refresh, or select the dropdown icon under 'Close and Apply' and select 'Apply' instead.
Either way, make sure to press 'Start Diagnostics' in the Diagnostics section of the 'Tools' tab in the editor. Once you've done this, refresh your model, or even just the table you care about.
Once it's done loading the data to the model, press 'Stop Diagnostics'.
You can expect to see some combination of metadata and data queries. Metadata calls grab what information they can about the data source. Data retrieval is about accessing the data source, emitting the final built-up Data
Source Query with folded down operations, and then performing whatever evaluations are missing on top,
locally.
It's important to note that just because you see a resource (database, web endpoint, etc.) or a data source query
in your diagnostics, it doesn't mean that it's necessarily performing network activity. Power Query may retrieve
this information from its cache. In future updates, we will indicate whether or not information is being retrieved
from the cache for easier diagnosis.
Diagnose Step
'Diagnose Step' is more useful for getting an insight into what evaluations are happening up to a single step,
which can help you identify, up to that step, what performance is like as well as what parts of your query are
being performed locally or remotely.
If you used 'Diagnose Step' on the query we built above, you'll find that it only returns 10 or so rows, and if we
look at the last row with a Data Source Query we can get a pretty good idea of what our final emitted query to
the data source will be. In this case, we can see that Sales Representative was filtered remotely, but the grouping
(by process of elimination) happened locally.
If you start and stop diagnostics and refresh the same query instead, you get 40 rows because, as mentioned above, Power Query is getting information on every step, not just the final step. This makes it harder when you're just trying to get insight into one particular part of your query.
Additional Reading
An introduction to the feature
More about reading and visualizing your recorded traces
How to understand what query operations are folding using Query Diagnostics
Visualizing and Interpreting Query Diagnostics in
Power BI
5/25/2022 • 4 minutes to read
Introduction
Once you've recorded the diagnostics you want to use, the next step is being able to understand what they say.
It's helpful to have a good understanding of what exactly each column in the query diagnostics schema means,
which we're not going to repeat in this short tutorial. There's a full write up of that here.
In general, when building visualizations, it's better to use the full detailed table, because regardless of how many rows it has, what you're probably looking at is some kind of depiction of how the time spent in different resources adds up, or what the emitted native query was.
As mentioned in our article on recording the diagnostics, I'm working with the OData and SQL traces for the same table (or nearly so): the Customers table from Northwind. In particular, I'm going to focus on a common ask from our customers, and one of the easier sets of traces to interpret: full refresh of the data model.
If we perform all the same operations and build similar visualizations, but with the SQL traces instead of the
ODATA ones, we can see how the two data sources compare!
If we select the Data Source table, as with the OData diagnostics, we can see that the first evaluation (2.3 in this
image) emits metadata queries, with the second evaluation actually retrieving the data we care about. Because
we're retrieving small amounts of data in this case, the data pulled back takes a small amount of time (less than
a tenth of a second for the entire second evaluation to happen, with less than a twentieth of a second for data
retrieval itself), but that won't be true in all cases.
As above, we can select the 'Data Source' category on the legend to see the emitted queries.
Digging into the data
Looking at paths
When you're looking at this, if it seems like time spent is strange—for example, on the OData query you might
see that there's a Data Source Query with the following value:
Request:
https://services.odata.org/V4/Northwind/Northwind.svc/Customers?
$filter=ContactTitle%20eq%20%27Sales%20Representative%27&$select=CustomerID%2CCountry HTTP/1.1
Content-Type:
application/json;odata.metadata=minimal;q=1.0,application/json;odata=minimalmetadata;q=0.9,application/atomsvc+xml;q=0.8,application/atom+xml;q=0.8,application/xml;q=0.7,text/plain;q=0.7
<Content placeholder>
Response:
Content-Type:
application/json;odata.metadata=minimal;q=1.0,application/json;odata=minimalmetadata;q=0.9,application/atomsvc+xml;q=0.8,application/atom+xml;q=0.8,application/xml;q=0.7,text/plain;q=0.7
Content-Length: 435
<Content placeholder>
This Data Source Query is associated with an operation that only takes up, say, 1% of the Exclusive Duration.
Meanwhile, there's a similar one:
Request:
GET https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales
Representative'&$select=CustomerID%2CCountry HTTP/1.1
Response:
https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales
Representative'&$select=CustomerID%2CCountry
HTTP/1.1 200 OK
This Data Source Query is associated with an operation that takes up nearly 75% of the Exclusive Duration. If
you turn on the Path, you discover the latter is actually a child of the former. This means that the first query
basically added a small amount of time on its own, with the actual data retrieval being tracked by the 'inner'
query.
These are extreme values, but they're within the bounds of what might be seen.
Understanding folding with Query Diagnostics
5/25/2022 • 2 minutes to read
One of the most common reasons to use Query Diagnostics is to have a better understanding of what
operations were 'pushed down' by Power Query to be performed by the back-end data source, which is also
known as 'folding'. If we want to see what folded, we can look at what is the 'most specific' query, or queries,
that get sent to the back-end data source. We can look at this for both ODATA and SQL.
The operation that was described in the article on Recording Diagnostics does essentially four things:
Connects to the data source
Grabs the customer table
Filters the Contact Title column to 'Sales Representative'
Groups by 'Country'
Since the ODATA connector doesn't currently support folding COUNT() to the endpoint, and since this endpoint
is somewhat limited in its operations as well, we don't expect that final step to fold. On the other hand, filtering
is relatively trivial. This is exactly what we see if we look at the most specific query emitted above:
Request:
GET https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales
Representative'&$select=CustomerID%2CCountry HTTP/1.1
Response:
https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales
Representative'&$select=CustomerID%2CCountry
HTTP/1.1 200 OK
We can see we're filtering the table for ContactTitle equaling 'Sales Representative', and we're only returning two columns--CustomerID and Country. Country, of course, is needed for the grouping operation, which, since it isn't being performed by the OData endpoint, must be performed locally. We can conclude what folds and
doesn't fold here.
Similarly, if we look at the specific and final query emitted in the SQL diagnostics, we see something slightly
different:
select [rows].[Country] as [Country],
    count(1) as [Count]
from
(
select [_].[Country]
from [dbo].[Customers] as [_]
where [_].[ContactTitle] = 'Sales Representative' and [_].[ContactTitle] is not null
) as [rows]
group by [Country]
Here, we can see that Power Query creates a subselection where ContactTitle is filtered to 'Sales Representative',
then groups by Country on this subselection. All of our operations folded.
Using Query Diagnostics, we can examine what kind of operations folded--in the future, we hope to make this
capability easier to use.
Why does my query run multiple times?
5/25/2022 • 5 minutes to read
When refreshing in Power Query, there's a lot done behind the scenes to attempt to give you a smooth user
experience, and to execute your queries efficiently and securely. However, in some cases you might notice that
multiple data source requests are being triggered by Power Query when data is refreshed. Sometimes these
requests are normal, but other times they can be prevented.
If you follow the steps described in the following sections, you'll have only a single M evaluation that happens when you refresh the Power Query editor preview. If the duplicate requests still occur at this point, then they're somehow inherent in the way the query is authored. If not, and if you re-enable the settings one-by-one, you can then observe at what point the duplicate requests start occurring.
The following sections describe these steps in more detail.
Set up Power Query editor
You don't need to reconnect or recreate your query, just open the query you want to test in the Power Query
editor. You can duplicate the query in the editor if you don't want to mess with the existing query.
Disable the data privacy firewall
The next step is to disable the data privacy firewall. This step assumes you aren't concerned about data leakage
between sources, so disabling the data privacy firewall can be done using the Always ignore Privacy Level
settings described in Set Fast Combine option in Excel or using the Ignore the Privacy levels and
potentially improve performance setting described in Power BI Desktop privacy levels in Power BI Desktop.
Be sure to undo this step before resuming normal testing.
Disable background analysis
The next step is to disable background analysis. Background analysis is controlled by the Allow data preview
to download in the background setting described in Disable Power Query background refresh for Power BI.
You can also disable this option in Excel.
Buffer your table
Optionally, you can also use Table.Buffer to force all the data to be read, which imitates what happens during a
load. To use Table.Buffer in the Power Query editor:
1. In the Power Query editor formula bar, select the fx button to add a new step.
2. In the formula bar, surround the name of the previous step with Table.Buffer(<previous step name goes
here>). For example, if the previous step was named Source , the formula bar will display = Source . Edit
the step in the formula bar to say = Table.Buffer(Source) .
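For example, assuming a hypothetical query whose only step is a Source step that reads a CSV file, the buffered script might look like the following sketch:
let
    // Hypothetical source file path; any existing query works the same way
    Source = Csv.Document(File.Contents("C:\Data\Sales.csv"), [Delimiter = ",", Encoding = 65001]),
    // New step added through the fx button, buffering the previous step
    #"Buffered table" = Table.Buffer(Source)
in
    #"Buffered table"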
Query folding indicators
NOTE
Before reading this article, we recommend that you read Overview of query evaluation and query folding in Power Query to better understand how folding works in Power Query.
Query folding indicators help you understand the steps that fold or don't fold.
With query folding indicators, it becomes obvious when you make a change that breaks folding. This feature
helps you to more easily resolve issues quickly, avoid performance issues in the first place, and have better
insight into your queries. In most cases you run into, steps will fold or won't fold. But there are many cases
where the outcome isn't as obvious, and these cases are discussed in Step diagnostics indicators (Dynamic,
Opaque, and Unknown).
NOTE
The query folding indicators feature is available only for Power Query Online.
let
Source = Sql.Database("ServerName", "AdventureWorks"),
Navigation = Source{[Schema = "Production", Item = "Product"]}[Data]
in
Navigation
If you examine how this code shows up in query folding indicators, you'll note that the first step is inconclusive.
But the second step does fold, which means that the query up to that point does fold.
In this example, the initial steps can't be confirmed to fold (they're inconclusive), but the final step generated when you initially load data does fold. How the first steps (Source, and sometimes other Navigation steps) are handled depends on the connector. With SQL, for example, they're handled as a catalog table value, which doesn't fold. However, as soon as you select data for that connector, it will fold.
Conversely, this can also mean that your query folds up to a point and then stops folding. Unlike in the case
where you have a folding indicator for the step that shows that everything folds, when you have a not-folding
indicator it doesn't mean that everything doesn't fold. Instead, it means that "not everything" folds. Generally,
everything up to the last folding indicator will fold, with more operations happening after.
Modifying the example from above, you can give a transform that never folds—Capitalize Each Word.
let
Source = Sql.Database("ServerName", "AdventureWorks"),
Navigation = Source{[Schema = "Production", Item = "Product"]}[Data],
#"Capitalized each word" = Table.TransformColumns(Navigation, {{"Name", each Text.Proper(_), type text}})
in
#"Capitalized each word"
In the query folding indicators, you have the same indicators as above, except the final step doesn't fold.
Everything up to this final step will be performed on the data source, while the final step will be performed
locally.
Example analysis
For an example analysis, start by connecting to the Production.Product table in Adventure Works (SQL). The
initial load, similar to the example above, looks like the following image.
Adding more steps that fold will extend that green line on the right side. This extension occurs because this step
also folds.
Adding a step that doesn't fold displays a different indicator. For example, Capitalize each word never folds.
The indicator changes, showing that as of this step, it's stopped folding. As mentioned earlier, the previous steps
will still fold.
Adding more steps downstream that depend on Capitalize each word will continue to not fold.
However, if you remove the column you applied the capitalization to, the optimized query plan can fold everything once more, and you'll get a result like the following image. Something like this is uncommon, but it illustrates that it's not just the order of steps, but the actual transformations applied, that determine what folds.
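As a sketch of that last scenario, and assuming the same example query as above, removing the Name column after the capitalization step means the non-folding transform is never needed, so the optimized query plan can fold the whole query again:
let
    Source = Sql.Database("ServerName", "AdventureWorks"),
    Navigation = Source{[Schema = "Production", Item = "Product"]}[Data],
    #"Capitalized each word" = Table.TransformColumns(Navigation, {{"Name", each Text.Proper(_), type text}}),
    // Removing the transformed column lets the optimizer drop the non-folding step entirely
    #"Removed columns" = Table.RemoveColumns(#"Capitalized each word", {"Name"})
in
    #"Removed columns"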
Query plan for Power Query (Preview)
5/25/2022 • 7 minutes to read
Query plan for Power Query is a feature that provides a better view of your query's evaluation. It's useful to help
determine why a particular query might not fold at a particular step.
Through a practical example, this article will demonstrate the main use case and potential benefits of using the
query plan feature to review your query steps. The examples used in this article have been created using the
AdventureWorksLT sample database for Azure SQL Server, which you can download from AdventureWorks
sample databases.
NOTE
The query plan feature for Power Query is only available in Power Query Online.
This article is divided into a series of recommended steps for interpreting the query plan. These steps
are:
1. Review the query folding indicators.
2. Select the query step to review its query plan.
3. Implement changes to your query.
Use the following steps to create the query in your own Power Query Online environment.
1. From Power Query - Choose data source, select Blank query.
2. Replace the blank query's script with the following query.
let
Source = Sql.Database("servername", "database"),
Navigation = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data],
#"Removed other columns" = Table.SelectColumns(Navigation, {"SalesOrderID", "OrderDate",
"SalesOrderNumber", "PurchaseOrderNumber", "AccountNumber", "CustomerID", "TotalDue"}),
#"Filtered rows" = Table.SelectRows(#"Removed other columns", each [TotalDue] > 1000),
#"Kept bottom rows" = Table.LastN(#"Filtered rows", 5)
in
#"Kept bottom rows"
3. Change servername and database with the correct names for your own environment.
4. (Optional) If you're trying to connect to a server and database for an on-premises environment, be sure
to configure a gateway for that environment.
5. Select Next .
6. In the Power Query Editor, select Configure connection and provide the credentials to your data
source.
NOTE
For more information about connecting to a SQL Server, go to SQL Server database.
After following these steps, your query will look like the one in the following image.
This query connects to the SalesOrderHeader table, and selects a few columns from the last five orders with a
TotalDue value above 1000.
NOTE
This article uses a simplified example to showcase this feature, but the concepts described in this article apply to all
queries. We recommend that you have a good knowledge of query folding before reading the query plan. To learn more
about query folding, go to Query folding basics.
Your first step in this process is to review your query and pay close attention to the query folding indicators. The
goal is to review the steps that are marked as not folded. Then you can see if making changes to the overall
query could make those transformations fold completely.
For this example, the only step that can't be folded is Kept bottom rows , which is easy to identify through the
not folded step indicator. This step is also the last step of the query.
The goal now is to review this step and understand what's being folded back to the data source and what can't
be folded.
Power Query tries to optimize your query by taking advantage of lazy evaluation and query folding, as
mentioned in Query folding basics. This query plan represents the optimized translation of your M query into
the native query that's sent to the data source. It also includes any transforms that are performed by the Power
Query Engine. The order in which the nodes appear follows the order of your query starting from the last step
or output of your query, which is represented on the far left of the diagram and in this case is the Table.LastN
node that represents the Kept bottom rows step.
At the bottom of the dialog, there's a bar with icons that help you zoom in or out of the query plan view, and
other buttons to help you manage the view. For the previous image, the Fit to view option from this bar was
used to better appreciate the nodes.
NOTE
The query plan represents the optimized plan. When the engine is evaluating a query, it tries to fold all operators into a
data source. In some cases, it might even do some internal reordering of the steps to maximize folding. With this in mind,
the nodes/operators left in this optimized query plan typically contain the "folded" data source query and any operators
that couldn't be folded and are evaluated locally.
The query shown here might not be exactly the same query sent to the data source, but it's a good
approximation. For this case, it tells you exactly what columns will be queried from the SalesOrderHeader table
and then how it will filter that table using the TotalDue field to only get rows where the value for that field is
larger than 1000. The node next to it, Table.LastN, is calculated locally by the Power Query engine, as it can't be
folded.
NOTE
The operators might not exactly match the functions used in the query's script.
Review non-folded nodes and consider actions to make your transform fold
You've now determined which nodes couldn't be folded and will be evaluated locally. This case only has the
Table.LastN node, but in other scenarios it could have many more.
The goal is to apply changes to your query so that the step can be folded. Some of the changes you might
implement could range from rearranging your steps to applying an alternative logic to your query that's more
explicit to the data source. This doesn't mean that all queries and all operations are foldable by applying some
changes. But it's a good practice to determine through trial and error if your query could be folded back.
Since the data source is a SQL Server database, if the goal is to retrieve the last five orders from the table, then a good alternative would be to take advantage of the TOP and ORDER BY clauses in SQL. Because there's no BOTTOM clause in SQL, the Table.LastN transform in Power Query can't be translated into SQL. You could remove the Table.LastN step and replace it with:
A sort descending step by the SalesOrderID column in the table, since this column determines which order goes first and which has been entered last.
A keep top rows step that selects the top five rows. Because the table has been sorted, this transform accomplishes the same result as Kept bottom rows (Table.LastN).
This alternative is equivalent to the original query. While it seems good in theory, you need to make the changes to see if this alternative will make the node fully fold back to the data source.
To apply the keep top rows step, select the table icon in the top-left corner of the data preview and select the option that reads Keep top rows. In the dialog, enter five as the argument and select OK.
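In script form, the revised query might look like the following sketch, with Table.LastN replaced by a descending sort and Table.FirstN (the step names the UI generates may differ):
let
    Source = Sql.Database("servername", "database"),
    Navigation = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data],
    #"Removed other columns" = Table.SelectColumns(Navigation, {"SalesOrderID", "OrderDate", "SalesOrderNumber", "PurchaseOrderNumber", "AccountNumber", "CustomerID", "TotalDue"}),
    #"Filtered rows" = Table.SelectRows(#"Removed other columns", each [TotalDue] > 1000),
    // Sort descending by SalesOrderID, then keep the top five rows
    #"Sorted rows" = Table.Sort(#"Filtered rows", {{"SalesOrderID", Order.Descending}}),
    #"Kept top rows" = Table.FirstN(#"Sorted rows", 5)
in
    #"Kept top rows"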
After implementing the changes, check the query folding indicators again and see if it's giving you a folded
indicator.
Now it's time to review the query plan of the last step, which is now Keep top rows. This time there are only folded nodes. Select View details under Value.NativeQuery to verify which query is being sent to the database.
While this article is suggesting what alternative to apply, the main goal is for you to learn how to use the query
plan to investigate query folding. This article also provides visibility of what's being sent to your data source and
what transforms will be done locally.
You can adjust your code to see the impact that it has in your query. By using the query folding indicators, you'll
also have a better idea of which steps are preventing your query from folding.
Using parameters
5/25/2022 • 7 minutes to read
A parameter serves as a way to easily store and manage a value that can be reused.
Parameters give you the flexibility to dynamically change the output of your queries depending on their value,
and can be used for:
Changing the argument values for particular transforms and data source functions.
Inputs in custom functions.
You can easily manage your parameters inside the Manage Parameters window. To get to the Manage
Parameters window, select the Manage Parameters option inside Manage Parameters in the Home tab.
Creating a parameter
Power Query provides two easy ways to create parameters:
From an existing query : Right-click a query whose value is a simple non-structured constant, such as a date, text, or number, and then select Convert to Parameter .
You can also convert a parameter to a query by right-clicking the parameter and then selecting Convert To Query .
Using the Manage Parameters window : Select the New Parameter option from the dropdown menu of Manage Parameters in the Home tab. Or launch the Manage Parameters window and select New on the top to create a parameter. Fill in this form, and then select OK to create a new parameter.
After creating the parameter, you can always go back to the Manage Parameters window to modify any of
your parameters at any moment.
Parameter properties
A parameter stores a value that can be used for transformations in Power Query. Apart from the name of the
parameter and the value that it stores, it also has other properties that provide metadata to it. The properties of
a parameter are:
Name : Provide a name for this parameter that lets you easily recognize and differentiate it from other
parameters you might create.
Description : The description is displayed next to the parameter name when parameter information is
displayed, helping users who are specifying the parameter value to understand its purpose and its
semantics.
Required : The checkbox indicates whether a value for this parameter must always be provided.
Type : Specifies the data type of the parameter. We recommended that you always set up the data type of
your parameter. To learn more about the importance of data types, go to Data types.
Suggested Values : Provides the user with suggestions to select a value for the Current Value from the
available options:
Any value : The current value can be any manually entered value.
List of values : Provides you with a simple table-like experience so you can define a list of
suggested values that you can later select from for the Current Value . When this option is
selected, a new option called Default Value will be made available. From here, you can select
what should be the default value for this parameter, which is the default value shown to the user
when referencing the parameter. This value isn't the same as the Current Value , which is the
value that's stored inside the parameter and can be passed as an argument in transformations.
Using the List of values provides a drop-down menu that's displayed in the Default Value and
Current Value fields, where you can pick one of the values from the suggested list of values.
NOTE
You can still manually type any value that you want to pass to the parameter. The list of suggested values
only serves as simple suggestions.
Query : Uses a list query (a query whose output is a list) to provide the list of suggested values
that you can later select for the Current Value .
Current Value : The value that's stored in this parameter.
For example, the following Orders table contains the OrderID , Units , and Margin fields.
In this example, create a new parameter with the name Minimum Margin with a Decimal Number type and a
Current Value of 0.2.
Go to the Orders query, and in the Margin field select the Greater Than filter option.
In the Filter Rows window, there's a button with a data type for the field selected. Select the Parameter option
from the dropdown menu for this button. From the field selection right next to the data type button, select the
parameter that you want to pass to this argument. In this case, it's the Minimum Margin parameter.
After you select OK , your table is filtered using the Current Value for your parameter.
If you modify the Current Value of your Minimum Margin parameter to be 0.3, your orders query gets
updated immediately and shows you only the rows where the Margin is above 30%.
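Behind the scenes, the filter step that this generates references the parameter by name, along the lines of the following sketch (the step name is illustrative):
#"Filtered rows" = Table.SelectRows(Orders, each [Margin] > #"Minimum Margin")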
TIP
Many transformations in Power Query let you select your parameter from a dropdown. We recommend that you always
look for it and take advantage of what parameters can offer you.
You can name this new function however you want. For demonstration purposes, the name of this new function
is MyFunction . After you select OK , a new group is created in the Queries pane using the name of your new
function. In this group, you'll find the parameters being used for the function, the query that was used to create
the function, and the function itself.
To test this new function, enter a value, such as 0.4, in the field underneath the Minimum Margin label. Then
select the Invoke button. This creates a new query with the name Invoked Function , effectively passing the
value 0.4 to be used as the argument for the function and giving you only the rows where the margin is above
40%.
To learn more about how to create custom functions, go to Creating a custom function.
TIP
If you want to have more control over what values are used in your list parameter, you can always create a list with
constant values and convert your list query to a parameter as showcased previously in this article.
With the new Interesting Orders list parameter in place, head back to the Orders query. Select the auto-filter menu of the OrderID field. Select Number filters > In .
After selecting this option, a new Filter rows dialog box appears. From here, you can select the list parameter
from a drop-down menu.
NOTE
List parameters can work with either the In or Not in options. In lets you filter only by the values from your list. Not in
does exactly the opposite, and tries to filter your column to get all values that are not equal to the values stored in your
parameter.
After selecting OK , you'll be taken back to your query. There, your query has been filtered using the list
parameter that you've created, with the result that only the rows where the OrderID was equal to either 125 ,
777 , or 999 were kept.
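Behind the scenes, an In filter that uses a list parameter generates a step along the lines of the following sketch (the step name is illustrative):
#"Filtered rows" = Table.SelectRows(Orders, each List.Contains(#"Interesting Orders", [OrderID]))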
Error handling
5/25/2022 • 4 minutes to read
Similar to how Excel and the DAX language have an IFERROR function, Power Query has its own syntax to test
and catch errors.
As mentioned in the article on dealing with errors in Power Query, errors can appear either at the step or cell
level. This article will focus on how you can catch and manage errors based on our own specific logic.
NOTE
To demonstrate this concept, this article will use an Excel Workbook as its data source. The concepts showcased here
apply to all values in Power Query and not only the ones coming from an Excel Workbook.
This table from an Excel Workbook has Excel errors such as #NULL! , #REF! , and #DIV/0! in the Standard
Rate column. When you import this table into the Power Query Editor, the following image shows how it will
look.
Notice how the errors from the Excel workbook are shown with the [Error] value in each of the cells.
In this case, the goal is to create a new Final Rate column that will use the values from the Standard Rate
column. If there are any errors, then it will use the value from the corresponding Special Rate column.
Add custom column with try and otherwise syntax
To create a new custom column, go to the Add column menu and select Custom column . In the Custom
column window, enter the formula try [Standard Rate] otherwise [Special Rate] . Name this new column
Final Rate .
The formula above will try to evaluate the Standard Rate column and will output its value if no errors are
found. If errors are found in the Standard Rate column, then the output will be the value defined after the
otherwise statement, which in this case is the Special Rate column.
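The custom column that this creates corresponds to a step along the lines of the following sketch, where #"Previous step" is a placeholder for whatever step precedes it in your own query:
// #"Previous step" stands in for the step name generated in your own query
#"Added custom" = Table.AddColumn(#"Previous step", "Final Rate", each try [Standard Rate] otherwise [Special Rate])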
After adding the correct data types to all of the columns in the table, the following image shows how the final
table looks.
NOTE
The sole purpose of excluding the #REF! error is for demonstration purposes. With the concepts showcased in this
article, you can target any error reasons, messages, or details of your choice.
When you select any of the whitespace next to the error value, you get the details pane at the bottom of the
screen. The details pane contains both the error reason, DataFormat.Error , and the error message,
Invalid cell value '#REF!' :
You can only select one cell at a time, so you can effectively only see the error components of one error value at
a time. This is where you'll create a new custom column and use the try expression.
Add custom column with try syntax
To create a new custom column, go to the Add column menu and select Custom column . In the Custom
column window, enter the formula try [Standard Rate] . Name this new column All Errors .
The try expression converts values and errors into a record value that indicates whether the try expression
handled an error or not, as well as the proper value or the error record.
You can expand this newly created column with record values and look at the available fields to be expanded by
selecting the icon next to the column header.
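As an illustration, the record produced by try has a shape like the following sketches (the values shown are illustrative):
// When no error occurred:
[HasError = false, Value = 4.5]
// When an error occurred:
[HasError = true, Error = [Reason = "DataFormat.Error", Message = "Invalid cell value '#REF!'", Detail = null]]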
More resources
Understanding and working with errors in Power Query
Add a Custom column in Power Query
Add a Conditional column in Power Query
Import data from a database using native database
query
5/25/2022 • 4 minutes to read
Power Query gives you the flexibility to import data from a wide variety of databases that it supports. It can run
native database queries, which can save you the time it takes to build queries using the Power Query interface.
This feature is especially useful for using complex queries that already exist—and that you might not want to or
know how to rebuild using the Power Query interface.
NOTE
One intent of native database queries is to be non-side effecting. However, Power Query does not guarantee that the
query will not affect the database. If you run a native database query written by another user, you will be prompted to
ensure that you're aware of the queries that will be evaluated with your credentials. For more information, see Native
database query security.
Power Query enables you to specify your native database query in a text box under Advanced options when
connecting to a database. In the example below, you'll import data from a SQL Server database using a native
database query entered in the SQL statement text box. The procedure is similar in all other databases with
native database query that Power Query supports.
1. Connect to a SQL Server database using Power Query. Select the SQL Server database option in the connector selection.
2. In the SQL Server database popup window:
a. Specify the Server and Database where you want to import data from using native database query.
b. Under Advanced options , select the SQL statement field and paste or enter your native
database query, then select OK .
3. If this is the first time you're connecting to this server, you'll see a prompt to select the authentication
mode to connect to the database. Select an appropriate authentication mode, and continue.
NOTE
If you don't have access to the data source (both Server and Database), you'll see a prompt to request access to
the server and database (if access-request information is specified in Power BI for the data source).
4. If the connection is established, the result data is returned in the Power Query Editor.
Shape the data as you prefer, then select Apply & Close to save the changes and import the data.
CONNECTOR - TYPE OF NATIVE DATABASE QUERY
DataWorld.Dataset - dwSQL
By default, if you run a native database query outside of the connector dialogs, you'll be prompted each time
you run a different query text to ensure that the query text that will be executed is approved by you.
NOTE
Native database queries that you insert in your get data operation won't ask you whether you want to run the query or
not. They'll just run.
You can turn off the native database query security messages if the native database query is run in either Power
BI Desktop or Excel. To turn off the security messages:
1. If you're using Power BI Desktop, under the File tab, select Options and settings > Options .
If you're using Excel, under the Data tab, select Get Data > Query Options .
2. Under Global settings, select Security .
3. Clear Require user approval for new native database queries .
4. Select OK .
You can also revoke the approval of any native database queries that you've previously approved for a given
data source in either Power BI Desktop or Excel. To revoke the approval:
1. If you're using Power BI Desktop, under the File tab, select Options and settings > Data source
settings .
If you're using Excel, under the Data tab, select Get Data > Data Source Settings .
2. In the Data source settings dialog box, select Global permissions . Then select the data source
containing the native database queries whose approval you want to revoke.
3. Select Edit permissions .
4. In the Edit permissions dialog box, under Native Database Queries , select Revoke Approvals .
Query folding on native queries
5/25/2022 • 4 minutes to read
In Power Query, you're able to define a native query and run it against your data source. The Import data from a
database using native database query article explains how to do this process with multiple data sources. But, by
using the process described in that article, your query won't take advantage of any query folding from
subsequent query steps.
This article showcases an alternative method to create native queries against your data source using the
Value.NativeQuery function and keep the query folding mechanism active for subsequent steps of your query.
NOTE
We recommend that you read the documentation on query folding and the query folding indicators to better understand
the concepts used throughout this article.
When connecting to the data source, it's important that you connect to the node or level where you want to
execute your native query. For the example in this article, that node will be the database level inside the server.
After defining the connection settings and supplying the credentials for your connection, you'll be taken to the
navigation dialog for your data source. In that dialog, you'll see all the available objects that you can connect to.
From this list, you need to select the object where the native query is run (also known as the target). For this
example, that object is the database level.
In the navigator window in Power Query, right-click the database node and select the Transform Data option. Selecting this option creates a new query of the overall view of your database, which is the target where you need to run your native query.
Once your query lands in the Power Query editor, only the Source step should show in the Applied steps pane.
This step contains a table with all the available objects in your database, similar to how they were displayed in
the Navigator window.
Use Value.NativeQuery function
The goal of this process is to execute the following SQL code, and to apply more transformations with Power
Query that can be folded back to the source.
SELECT DepartmentID, Name FROM HumanResources.Department WHERE GroupName = 'Research and Development'
The first step was to define the correct target, which in this case is the database where the SQL code will be run.
Once a step has the correct target, you can select that step—in this case, Source in Applied Steps —and then
select the fx button in the formula bar to add a custom step. In this example, replace the Source formula with
the following formula:
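Based on the SQL statement above and the description that follows, the replacement formula would look something like this sketch, calling Value.NativeQuery against the Source step with the EnableFolding option enabled:
Value.NativeQuery(Source, "SELECT DepartmentID, Name FROM HumanResources.Department WHERE GroupName = 'Research and Development'", null, [EnableFolding = true])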
The most important component of this formula is the use of the optional record for the fourth parameter of the function, which has the EnableFolding record field set to true.
NOTE
You can read more about the Value.NativeQuery function from the official documentation article.
After you have entered the formula, a warning is shown that requires you to enable native queries to run for your specific step. Select Continue for this step to be evaluated.
This SQL statement yields a table with only three rows and two columns.
Test query folding
To test the query folding of your query, you can try to apply a filter to any of your columns and see if the query
folding indicator in the applied steps section shows the step as folded. For this case, you can filter the
DepartmentID column to have values that are not equal to two.
After adding this filter, you can check that the query folding indicators still show the query folding happening at
this new step.
To further validate what query is being sent to the data source, you can right-click the Filtered rows step and
select the option that reads View query plan to check the query plan for that step.
In the query plan view, you can see a node named Value.NativeQuery on the left side of the screen that has hyperlink text that reads View details. You can select this hyperlink to view the exact query that is being sent to the SQL Server database.
The native query is wrapped in another SELECT statement to create a subquery of the original. Power
Query will do its best to create the most optimal query given the transforms used and the native query
provided.
TIP
For scenarios where you get errors because query folding wasn't possible, it is recommended that you try validating your
steps as a subquery of your original native query to check if there might be any syntax or context conflicts.
Create Power Platform dataflows from queries in Microsoft Excel (Preview)
5/25/2022 • 2 minutes to read
NOTE
The preview feature for creating Power Query templates from queries is only available to Office Insiders. For more information on the Office Insider program, see Office Insider.
Overview
Working with large datasets or long-running queries can be cumbersome when you have to manually trigger a data refresh in Excel, because refreshing takes resources from your computer, and you have to wait until the computation is done to get the latest data. Moving these data operations into a Power Platform
dataflow is an effective way to free up your computer's resources and to have the latest data easily available for
you to consume in Excel.
It only takes two quick steps to do this:
1. Exporting queries in Excel to a Power Query template
2. Creating a Power Platform dataflow from the Power Query template
When you export your queries, the template requires basic information such as a name and a description before it can be saved locally on your computer.
Creating a Power Platform dataflow from the Power Query template
1. Sign in to Power Apps.
2. In the left navigation pane, select Data > Dataflows .
3. From the toolbar, select New dataflow > Import template .
4. Select the Power Query template you created earlier. The dataflow name will prepopulate with the
template name provided. Once you're done with the dataflow creation screen, select Next to see your
queries from Excel in the query editor.
5. From this point, go through the normal dataflow creation and configuration process so you can further transform your data, set refresh schedules on the dataflow, and perform any other dataflow operation possible.
For more information on how to configure and create Power Platform dataflows, see Create and use
dataflows.
See also
Create and use dataflows in Power Apps
Optimize Power Query when expanding table
columns
5/25/2022 • 3 minutes to read
The simplicity and ease of use that allows Power BI users to quickly gather data and generate interesting and
powerful reports to make intelligent business decisions also allows users to easily generate poorly performing
queries. This often occurs when there are two tables that are related in the way a foreign key relates SQL tables
or SharePoint lists. (For the record, this issue isn't specific to SQL or SharePoint, and occurs in many backend
data extraction scenarios, especially where schema is fluid and customizable.) There's also nothing inherently
wrong with storing data in separate tables that share a common key—in fact this is a fundamental tenet of
database design and normalization. But it does imply a better way to expand the relationship.
Consider the following example of a SharePoint customer list.
When you expand the record, you see the fields joined from the secondary table.
When expanding related rows from one table to another, the default behavior of Power BI is to generate a call to
Table.ExpandTableColumn . You can see this in the generated formula field. Unfortunately, this method generates
an individual call to the second table for every row in the first table.
This increases the number of HTTP calls by one for each row in the primary list. This may not seem like a lot in
the above example of five or six rows, but in production systems where SharePoint lists reach hundreds of
thousands of rows, this can cause a significant experience degradation.
When queries reach this bottleneck, the best mitigation is to avoid the call-per-row behavior by using a classic
table join. This ensures that there will be only one call to retrieve the second table, and the rest of the expansion
can occur in memory using the common key between the two tables. The performance difference can be
massive in some cases.
First, start with the original table, noting the column you want to expand, and ensuring you have the ID of the
item so that you can match it. Typically the foreign key is named similar to the display name of the column with
Id appended. In this example, it's LocationId .
Second, load the secondary table, making sure to include the Id , which is the foreign key. Right-click on the
Queries panel to create a new query.
Finally, join the two tables using the respective column names that match. You can typically find this field by first
expanding the column, then looking for the matching columns in the preview.
In this example, you can see that LocationId in the primary list matches Id in the secondary list. The UI renames
this to Location.Id to make the column name unique. Now let's use this information to merge the tables.
By right-clicking on the query panel and selecting New Query > Combine > Merge Queries as New , you
see a friendly UI to help you combine these two queries.
Select each table from the drop-down to see a preview of the query.
Once you've selected both tables, select the column that joins the tables logically (in this example, it's
LocationId from the primary table and Id from the secondary table). The dialog will instruct you how many of
the rows match using that foreign key. You'll likely want to use the default join kind (left outer) for this kind of
data.
Select OK and you'll see a new query, which is the result of the join. Expanding the record now doesn't imply
additional calls to the backend.
Refreshing this data will result in only two calls to SharePoint—one for the primary list, and one for the
secondary list. The join will be performed in memory, significantly reducing the number of calls to SharePoint.
This approach can be used for any two tables in Power Query that have a matching foreign key.
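In script form, the merge-based approach looks roughly like the following sketch. The query names and the expanded field names are illustrative assumptions; the LocationId and Id join columns match the example above:
let
    // Hypothetical primary and secondary queries loaded elsewhere in the workbook
    Primary = Customers,
    Secondary = Locations,
    // One call per table; the join itself happens in memory on the common key
    Merged = Table.NestedJoin(Primary, {"LocationId"}, Secondary, {"Id"}, "Location", JoinKind.LeftOuter),
    // Expanding the merged column no longer triggers a call per row
    Expanded = Table.ExpandTableColumn(Merged, "Location", {"Title", "City"})
in
    Expanded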
NOTE
SharePoint user lists and taxonomy are also accessible as tables, and can be joined in exactly the way described above,
provided the user has adequate privileges to access these lists.
Enabling Microsoft Edge (Chromium) for OAuth
authentication in Power BI Desktop
5/25/2022 • 2 minutes to read
If you're using OAuth authentication to connect to your data, the OAuth dialog in Power Query uses the
Microsoft Internet Explorer 11 embedded control browser. However, certain web services, such as QuickBooks Online, Salesforce Reports, and Salesforce Objects, no longer support Internet Explorer 11.
As of October 2021, Power BI Desktop uses Microsoft Edge WebView2 by default for OAuth authentication for all connectors. However, you can change the default behavior using environment variables.
To disable the use of WebView2 for specific connectors, set PQ_ExtendEdgeChromiumOAuthDenyList with the
name(s) of the connector(s) you want to disable. Multiple connectors are separated by semicolons.
To enable WebView2 for specific connectors, set PQ_ExtendEdgeChromiumOAuthAllowList with the name(s)
of the connector(s) you want to enable. Multiple connectors are separated by semicolons.
The following table contains a list of all the connectors currently available for Power Query. For those connectors
that have a reference page in this document, a link is provided under the connector icon and name.
A checkmark indicates the connector is currently supported in the listed service; an X indicates that the
connector is not currently supported in the listed service.
The connectors are listed in alphabetical order in separate sections for each letter of the alphabet. Use the In this article list on the right side of this article to go to any of the alphabetized sections.
NOTE
The Excel column in the following table indicates all connectors that are available on at least one version of Excel. However,
not all Excel versions support all of these indicated Power Query connectors. For a complete list of the Power Query
connectors supported by all versions of Excel, go to Power Query data sources in Excel versions.
The services covered by the table are Excel, Power BI (Datasets), Power BI (Dataflows), Power Apps (Dataflows), Customer Insights (Dataflows), and Analysis Services; each connector is listed below with its publisher.
A
Access Database, By Microsoft
Active Directory, By Microsoft
Acterys (Beta), By Acterys
Actian (Beta), By Actian
Adobe Analytics, By Microsoft
Amazon Athena, By Amazon
Amazon OpenSearch Project (Beta), By Amazon
Amazon Redshift, By Microsoft
Anaplan, By Anaplan
appFigures (Beta), By Microsoft
Asana, By Asana
Assemble Views, By Autodesk
AtScale cubes (Beta), By Microsoft
Autodesk Construction Cloud (Beta), By Autodesk
Automation Anywhere, By Automation Anywhere
Automy Data Analytics (Beta), By ACEROYALTY
Azure Analysis Services database, By Microsoft
Azure Blob Storage, By Microsoft
Azure CosmosDB, By Microsoft
Azure Cost Management, By Microsoft
Azure Databricks, By Databricks
Azure Data Explorer (Beta), By Microsoft
Azure Data Lake Storage Gen1, By Microsoft
Azure Data Lake Storage Gen2, By Microsoft
Azure DevOps (Beta), By Microsoft
Azure DevOps Server (Beta), By Microsoft
Azure HDInsight (HDFS), By Microsoft
Azure HDInsight Spark, By Microsoft
Azure Synapse Analytics (SQL DW), By Microsoft
Azure Synapse Analytics workspace (Beta), By Microsoft
Azure SQL database, By Microsoft
Azure Table Storage, By Microsoft
Azure Time Series Insights (Beta), By Microsoft
B
BI Connector, By Guidanz
BI360, By Solver Global
BitSight Security Ratings (Beta), By BitSight
Bloomberg Data and Analytics, By Bloomberg
BQE Core (Beta), By BQE
C
Cherwell (Beta), By Cherwell
Cognite Data Fusion (Beta), By Cognite
Common Data Service (legacy), By Microsoft
D
Data.World - Get Dataset (Beta), By Microsoft
Data Virtuality (Beta), By Data Virtuality
Dataflows, By Microsoft
Dataverse, By Microsoft
Delta Sharing (Beta), By Databricks
Denodo, By Denodo
Digital Construction Works Insights (Beta), By Digital Construction Works
Dremio, By Dremio
Dynamics 365 (online), By Microsoft
Dynamics 365 Business Central, By Microsoft
Dynamics 365 Business Central (on-premises), By Microsoft
Dynamics 365 Customer Insights (Beta), By Microsoft
Dynamics NAV, By Microsoft
E
eWay-CRM, By eWay-CRM
Emigo Data Source, By Sagra
Entersoft Business Suite (Beta), By Entersoft
EQuIS (Beta), By EarthSoft
Essbase, By Microsoft
Exasol, By Exasol
Excel, By Microsoft
F
FactSet Analytics (Beta), By FactSet
FactSet RMS (Beta), By FactSet
FHIR, By Microsoft
Folder, By Microsoft
Funnel (Beta), By Funnel
G
Github (Beta), By Microsoft
Google Analytics, By Microsoft
Google BigQuery, By Microsoft
Google Sheets (Beta), By Microsoft
H
Hadoop File (HDFS), By Microsoft
HDInsight Interactive Query, By Microsoft
Hexagon PPM Smart API, By Hexagon PPM
Hive LLAP, By Microsoft
I
IBM DB2 database, By Microsoft
IBM Informix database (Beta), By Microsoft
IBM Netezza, By Microsoft
Impala, By Microsoft
Indexima (Beta), By Indexima
Industrial App Store, By Intelligent Plant
Information Grid (Beta), By Luminis
InterSystems IRIS (Beta), By Intersystems
Intune Data Warehouse (Beta), By Microsoft
J
Jamf Pro (Beta), By Jamf
Jethro (Beta), By JethroData
JSON, By Microsoft
K
Kognitwin (Beta), By Kongsberg
Kyligence, By Kyligence
L
Linkar PICK Style/MultiValue Databases (Beta), By Kosday Solutions
LinkedIn Sales Navigator (Beta), By Microsoft
M
Marketo (Beta), By Microsoft
MarkLogic, By MarkLogic
MariaDB, By MariaDB
Microsoft Azure Consumption Insights (Beta) (Deprecated), By Microsoft
Microsoft Exchange, By Microsoft
Microsoft Exchange Online, By Microsoft
Microsoft Graph Security (Deprecated), By Microsoft
MicroStrategy for Power BI, By MicroStrategy
Mixpanel (Beta), By Microsoft
MySQL database, By Microsoft
O
OData Feed, By Microsoft
ODBC, By Microsoft
OLE DB, By Microsoft
OpenSearch Project (Beta), By OpenSearch
Oracle database, By Microsoft
P
Parquet, By Microsoft
Palantir Foundry, By Palantir
Paxata, By Paxata
PDF, By Microsoft
Planview Enterprise One - CTM (Beta), By Planview
Planview Enterprise One - PRM (Beta), By Planview
PostgreSQL database, By Microsoft
Power BI datasets, By Microsoft
Product Insights (Beta), By Microsoft
Projectplace for Power BI (Beta), By Planview
Python Script, By Microsoft
Q
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
QubolePres
to Beta
By Qubole
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Quickbooks
Online
(Beta)
By Microsoft
Quick Base
By Quick Base
R
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
R Script
By Microsoft
Roamler
(Beta)
By Roamler
S
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Salesforce
Objects
By Microsoft
Salesforce
Repor ts
By Microsoft
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
SAP
Business
Warehouse
Application
Ser ver
By Microsoft
SAP
Business
Warehouse
Message
Ser ver
By Microsoft
SAP HANA
database
By Microsoft
SIS-CC
SDMX
By SIS-CC
SharePoint
folder
By Microsoft
SharePoint
list
By Microsoft
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
SharePoint
Online
list
By Microsoft
Shor tcuts
Business
Insights
(Beta)
By Shortcuts
SiteImprove
By
SiteImprove
Smar tsheet
By Microsoft
Snowflake
By Microsoft
SoftOneBI
(Beta)
By SoftOne
Solver
By BI360
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Spark
By Microsoft
SparkPost
(Beta)
By Microsoft
Spigit (Beta)
By Spigit
Starburst
Enterprise
(Beta)
By Starburst
Data
SumTotal
(Beta)
By SumTotal
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Sur veyMon
key (Beta)
By
SurveyMonke
y
SweetIQ
(Beta)
By Microsoft
Sybase
Database
By Microsoft
T
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
TeamDesk
(Beta)
By ForeSoft
Tenforce
(Smar t)List
By Tenforce
Teradata
database
By Microsoft
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Text/CSV
By Microsoft
TIBCO(R)
Data
Vir tualizatio
n
By TIBCO
Twilio
(Deprecated
) (Beta)
By Microsoft
U
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Usercube
(Beta)
By Usercube
V
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Vena (Beta)
By Vena
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Ver tica
By Microsoft
Vessel
Insight
By Kongsberg
W
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Web
By Microsoft
Webtrends
Analytics
(Beta)
By Microsoft
Witivio
(Beta)
By Witivio
Workforce
Dimensions
(Beta)
(Deprecated
)
By Kronos
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Workplace
Analytics
(Beta)
By Microsoft
X
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
XML
By Microsoft
Z
C USTO M ER
P O W ER B I P O W ER B I P O W ER A P P S IN SIGH T S A N A LY SIS
C O N N EC TO R EXC EL ( DATA SET S) ( DATA F LO W S) ( DATA F LO W S) ( DATA F LO W S) SERVIC ES
Zendesk
(Beta)
By Microsoft
Zoho
Creator
(Beta)
By Zoho
Zucchetti
HR
Infinity
(Beta)
By Zucchetti
Next steps
Power BI data sources (datasets)
Connect to data sources for Power BI dataflows
Available data sources (Dynamics 365 Customer Insights)
Data sources supported in Azure Analysis Services
Access database
5/25/2022 • 2 minutes to read
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
If you're connecting to an Access database from Power Query Online, the system that contains the on-premises
data gateway must have the 64-bit version of the Access Database Engine 2010 OLEDB provider installed.
If you're loading an Access database to Power BI Desktop, the versions of the Access Database Engine 2010
OLEDB provider and Power BI Desktop on that machine must match (that is, either 32-bit or 64-bit). For more
information, go to Import Access database to Power BI Desktop.
Capabilities Supported
Import
NOTE
You must select an on-premises data gateway for this connector, whether the Access database is on your local
network or on a web site.
5. Select the type of credentials for the connection to the Access database in Authentication kind .
6. Enter your credentials.
7. Select Next to continue.
8. In Navigator , select the data you require, and then select Transform data to continue transforming the
data in Power Query Editor.
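As a rough sketch of the M query these steps generate (the file path and table name below are placeholders, not values from this article):

let
    // Placeholder path to a local Access database file
    Source = Access.Database(File.Contents("C:\Samples\Northwind.accdb")),
    // Placeholder table chosen in the Navigator
    Orders = Source{[Schema = "", Item = "Orders"]}[Data]
in
    Orders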
Troubleshooting
Connect to local file from Power Query Online
When you attempt to connect to a local Access database using Power Query Online, you must select an on-
premises data gateway, even if your Access database is online.
On-premises data gateway error
A 64-bit version of the Access Database Engine 2010 OLEDB provider must be installed on your on-premises
data gateway machine to be able to load Access database files. If you already have a 64-bit version of Microsoft
Office installed on the same machine as the gateway, the Access Database Engine 2010 OLEDB provider is
already installed. If not, you can download the driver from the following location:
https://www.microsoft.com/download/details.aspx?id=13255
Import Access database to Power BI Desktop
In some cases, you may get a "The 'Microsoft.ACE.OLEDB.12.0' provider is not registered" error when attempting to import an Access database file to Power BI Desktop. This error may be caused by using
mismatched bit versions of Power BI Desktop and the Access Database Engine 2010 OLEDB provider. For more
information about how you can fix this mismatch, see Troubleshoot importing Access and Excel .xls files in
Power BI Desktop.
Adobe Analytics
5/25/2022 • 3 minutes to read
Summary
Prerequisites
Before you can sign in to Adobe Analytics, you must have an Adobe Analytics account (username/password).
Capabilities Supported
Import
4. In the Adobe Analytics window that appears, provide your credentials to sign in to your Adobe Analytics
account. You can either supply a username (which is usually an email address), or select Continue with
Google or Continue with Facebook .
If you entered an email address, select Continue .
5. Enter your Adobe Analytics password and select Continue .
6. Once you've successfully signed in, select Connect .
Once the connection is established, you can preview and select multiple dimensions and measures within the
Navigator dialog box to create a single tabular output.
You can also provide any optional input parameters required for the selected items. For more information about
these parameters, see Optional input parameters.
You can Load the selected table, which brings the entire table into Power BI Desktop, or you can select
Transform Data to edit the query, which opens Power Query Editor. You can then filter and refine the set of
data you want to use, and then load that refined set of data into Power BI Desktop.
Top—filter the data based on the top items for the dimension. You can enter a value in the Top text box, or
select the ellipsis next to the text box to select some default values. By default, all items are selected.
Dimension—filter the data based on the selected dimension. By default, all dimensions are selected.
Custom Adobe dimension filters are not currently supported in the Power Query user interface, but can
be defined by hand as M parameters in the query. For more information, see Using Query Parameters in
Power BI Desktop.
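As a generic, hypothetical sketch of that approach (the parameter name and value are placeholders, and where the parameter is referenced depends on the query the connector generates), a text parameter in Power BI Desktop is itself a small M query that other queries can reference by name:

// A query parameter as defined through Manage Parameters in Power BI Desktop
"s300000001_example" meta [IsParameterQuery = true, Type = "Text", IsParameterQueryRequired = true]

// Reference the parameter by its query name (for example, SegmentFilter) wherever the
// generated Adobe Analytics query expects the filter value.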
Next steps
You may also find the following Adobe Analytics information useful:
Adobe Analytics 1.4 APIs
Adobe Analytics Reporting API
Metrics
Elements
Segments
GetReportSuites
Adobe Analytics support
Amazon Athena
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by Amazon, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the Amazon website and use the support channels there.
Summary
Prerequisites
An Amazon Web Services (AWS) account
Permissions to use Athena
Customers must install the Amazon Athena ODBC driver before using the connector
Capabilities supported
Import
DirectQuery
6. Select OK .
7. At the prompt to configure data source authentication, select either Use Data Source Configuration or
AAD Authentication . Enter any required sign-in information. Then select Connect .
Your data catalog, databases, and tables appear in the Navigator dialog box.
8. In the Display Options pane, select the check box for the dataset that you want to use.
9. If you want to transform the dataset before you import it, go to the bottom of the dialog box and select
Transform Data . This selection opens the Power Query Editor so that you can filter and refine the set of
data you want to use.
10. Otherwise, select Load. After the load is complete, you can create visualizations like the one in the following image. If you selected DirectQuery, Power BI issues a query to Athena for the visualization that you requested.
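Because the connector sits on top of the Amazon Athena ODBC driver, a generic ODBC query is another way to reach Athena from M if you ever need to script the connection by hand. This is a hedged sketch using Power Query's generic ODBC connector, not the dedicated Athena connector, and the DSN name is a placeholder:

let
    // "Amazon Athena" is a placeholder DSN configured for the Athena ODBC driver
    Source = Odbc.DataSource("dsn=Amazon Athena", [HierarchicalNavigation = true])
in
    Source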
Amazon OpenSearch Service (Beta)
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by Amazon, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the OpenSearch website and use the support channels there.
Summary
Prerequisites
Microsoft Power BI Desktop
OpenSearch
OpenSearch SQL ODBC driver
Capabilities supported
Import
DirectQuery
Troubleshooting
If you get an error indicating the driver wasn't installed, install the OpenSearch SQL ODBC Driver.
If you get a connection error:
1. Check if the host and port values are correct.
2. Check if the authentication credentials are correct.
3. Check if the server is running.
Amazon Redshift
5/25/2022 • 4 minutes to read
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
An Amazon Web Services (AWS) account
Capabilities supported
Import
DirectQuery (Power BI Desktop only)
Advanced options
Provider name
Batch size
SQL statement
Azure AD Single Sign-On (SSO ) for Amazon Redshift with an on-premises data gateway
Before you can enable Azure AD SSO for Amazon Redshift, you must first enable Azure AD SSO for all data
sources that support Azure AD SSO with an on-premises data gateway:
1. In Power BI service, select Admin portal from the settings list.
2. Under Tenant settings , enable Azure AD Single-Sign On (SSO) for Gateway .
Once you've enabled Azure AD SSO for all data sources, then enable Azure AD SSO for Amazon Redshift:
1. Enable the Redshift SSO option.
2. Select Manage gateways from the settings list.
NOTE
The following connector article is provided by Anaplan, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the Anaplan website and use the support channels there.
Summary
Capabilities supported
The connector runs through Anaplan public data integration APIs and allows you to load all Anaplan models
(aside from archived ones) and saved export actions into Power BI.
Troubleshooting
If you get a connector-related error message, first try refreshing.
Credential error in the Navigator
Do one of the following:
Clear cache within Power BI (File , Options , Clear cache) and restart the connector, or
Select Cancel and select Refresh (top right).
If you still receive a credential error after you clear cache, also clear your recent sources.
1. Select Recent sources
3. Establish the connection to the export again, and your data refreshes.
Credential error in the Power Query editor
If you encounter a credential error in the Power Query editor, select Close & Apply or Refresh Preview to
refresh the data.
NOTE
The following connector article is provided by Autodesk, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the Autodesk website and use the support channels there.
Summary
Release State: GA
Prerequisites
To use the Assemble Views connector, you must have an Autodesk account with a username and password, and
be a member of at least one project in Assemble.
You'll also need at least one view associated with the Assemble project.
Capabilities supported
Import
a. Uncheck Use original column name as prefix and select OK for each view data query you've
selected.
b. Select Close & Apply to load the datasets.
6. (Optional) If you have chosen to load images, you'll need to update the Data category for the image field.
a. Expand the [Your Project] View Thumbnails table, and then select the Image field. This selection
opens the Column tools tab.
b. Open the Data category drop-down and select Image URL. You can now drag and drop the Image
field into your report visuals.
Known issues and limitations
Views with greater than 100,000 rows may not load depending on the number of fields included in the
view. To avoid this limitation, we suggest breaking large views into multiple smaller views and appending
the queries in your report, or creating relationships in your data model.
The view images feature currently only supports thumbnail sized images because of a row size
limitation in Power BI.
Autodesk Construction Cloud (Beta)
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by Autodesk, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the Autodesk website and use the support channels there.
Summary
Prerequisites
To use the Autodesk Construction Cloud connector, you must have an Autodesk account with a username and
password and have access to the Executive Overview in a BIM360 or an ACC Account. You also need to run a
Data Connector extraction manually or have extractions scheduled to run in order to use this connector. The connector pulls from the most recently run extraction.
Capabilities Supported
Import
Supports US and EU Autodesk Construction Cloud servers
5. In the Autodesk window that appears, provide your credentials to sign in to your Autodesk account.
3. If prompted, follow steps 4 through 6 in the previous procedure to sign in and connect.
NOTE
The following connector article is provided by ACEROYALTY, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the ACEROYALTY website and use the support channels there.
Summary
Prerequisites
Before you can sign in to Automy Data Analytics, you must have an Automy Report Token.
Capabilities Supported
Import
5. In the Navigator dialog box, select the Automy tables you want. You can then either load or transform
the data.
If you’re selecting functions, be sure to select Transform Data so that you can add parameters to the
functions you’ve selected. More information: Using parameters
2. Select the data source, and then select Clear permissions . Establish the connection to the navigation
again.
Azure Data Lake Storage Gen1
5/25/2022 • 2 minutes to read
NOTE
On Feb 29, 2024 Azure Data Lake Storage Gen1 will be retired. For more information, go to the official announcement. If
you use Azure Data Lake Storage Gen1, make sure to migrate to Azure Data Lake Storage Gen2 prior to that date. To
learn how, go to Migrate Azure Data Lake Storage from Gen1 to Gen2.
Unless you already have an Azure Data Lake Storage Gen1 account, you can't create new ones.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial.
An Azure Data Lake Storage Gen1 account. Follow the instructions at Get started with Azure Data Lake
Storage Gen1 using the Azure portal. This article assumes that you've already created a Data Lake
Storage Gen1 account, called myadlsg1 , and uploaded a sample data file (Drivers.txt ) to it. This sample
file is available for download from Azure Data Lake Git Repository.
Capabilities supported
Import
Advanced options
Page size in bytes
2. In the Azure Data Lake Store dialog box, provide the URL to your Data Lake Storage Gen1 account.
Optionally, enter a value in Page Size in Bytes. Then select OK .
3. If this is the first time you're connecting to this database, select Sign in to sign into the Azure Data Lake
Storage Gen1 account. You'll be redirected to your organization's sign-in page. Follow the prompts to sign
in to the account.
4. After you've successfully signed in, select Connect .
5. The Navigator dialog box shows the file that you uploaded to your Azure Data Lake Storage Gen1
account. Verify the information and then select either Transform Data to transform the data in Power
Query or Load to load the data in Power BI Desktop.
Page Size in Bytes: Used to break up large files into smaller pieces. The default page size is 4 MB.
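Sketched in M, the same connection might look like the following; the account name is the sample account from this article, while the adl:// URL form and the PageSize option name (shown with the 4 MB default) are assumptions:

let
    Source = DataLake.Contents("adl://myadlsg1.azuredatalakestore.net", [PageSize = 4194304])
in
    Source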
See also
Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen1 documentation
Azure Data Lake Storage Gen2
5/25/2022 • 4 minutes to read
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial.
A storage account that has a hierarchical namespace. Follow the instructions at Create a storage account
to create one. This article assumes that you've created a storage account named myadlsg2 .
Ensure you're granted one of the following roles for the storage account: Blob Data Reader , Blob Data
Contributor , or Blob Data Owner .
A sample data file named Drivers.txt located in your storage account. You can download this sample
from Azure Data Lake Git Repository, and then upload that file to your storage account.
Capabilities supported
Import
File System View
CDM Folder View
2. In the Azure Data Lake Storage Gen2 dialog box, provide the URL to your Azure Data Lake Storage
Gen2 account, container, or subfolder using the container endpoint format. URLs for Data Lake Storage
Gen2 have the following pattern:
https://<accountname>.dfs.core.windows.net/<container>/<subfolder>
You can also select whether you want to use the file system view or the Common Data Model folder view.
Select OK to continue.
3. If this is the first time you're using this URL address, you'll be asked to select the authentication method.
If you select the Organizational account method, select Sign in to sign into your storage account. You'll
be redirected to your organization's sign-in page. Follow the prompts to sign into the account. After
you've successfully signed in, select Connect .
If you select the Account key method, enter your account key and then select Connect .
4. The Navigator dialog box shows all files under the URL you provided. Verify the information and then
select either Transform Data to transform the data in Power Query or Load to load the data.
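Expressed in M, the equivalent connection can be sketched with AzureStorage.DataLake and the container endpoint pattern from step 2 (myadlsg2 is the sample account from this article; the container name is a placeholder):

let
    Source = AzureStorage.DataLake("https://myadlsg2.dfs.core.windows.net/mycontainer")
in
    Source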
Connect to Azure Data Lake Storage Gen2 from Power Query Online
1. Select the Azure Data Lake Storage Gen2 option in the Get Data selection, and then select Connect .
More information: Where to get data
2. In Connect to data source , enter the URL to your Azure Data Lake Storage Gen2 account. Refer to
Limitations to determine the URL to use.
3. Select whether you want to use the file system view or the Common Data Model folder view.
4. If needed, select the on-premises data gateway in Data gateway .
5. Select Sign in to sign into the Azure Data Lake Storage Gen2 account. You'll be redirected to your
organization's sign-in page. Follow the prompts to sign in to the account.
6. After you've successfully signed in, select Next .
7. The Choose data page shows all files under the URL you provided. Verify the information and then
select Transform Data to transform the data in Power Query.
Limitations
Subfolder or file not supported in Power Query Online
Currently, in Power Query Online, the Azure Data Lake Storage Gen2 connector only supports paths with
container, and not subfolder or file. For example, https://<accountname>.dfs.core.windows.net/<container> will
work, while https://<accountname>.dfs.core.windows.net/<container>/<filename> or
https://<accountname>.dfs.core.windows.net/<container>/<subfolder> will fail.
Refresh authentication
Microsoft doesn't support dataflow or dataset refresh using OAuth2 authentication when the Azure Data Lake
Storage Gen2 (ADLS) account is in a different tenant. This limitation only applies to ADLS when the
authentication method is OAuth2, that is, when you attempt to connect to a cross-tenant ADLS using an Azure
AD account. In this case, we recommend that you use a different authentication method that isn't OAuth2/AAD,
such as the Key authentication method.
Proxy and firewall requirements
When you create a dataflow using a gateway, you might need to change some of your proxy settings or firewall
ports to successfully connect to your Azure data lake. If a dataflow fails with a gateway-bound refresh, it might
be due to a firewall or proxy issue on the gateway to the Azure storage endpoints.
If you're using a proxy with your gateway, you might need to configure the
Microsoft.Mashup.Container.NetFX45.exe.config file in the on-premises data gateway. More information:
Configure proxy settings for the on-premises data gateway.
To enable connectivity from your network to the Azure data lake, you might need to allow-list specific IP addresses on the gateway machine. For example, if your network has any firewall rules in place that might block these attempts, you'll need to unblock the outbound network connections for your Azure data lake. To allow-list the required outbound addresses, use the AzureDataLake service tag. More information: Virtual network service tags
Dataflows also support the "Bring Your Own" data lake option, which means you create your own data lake,
manage your permissions, and you explicitly connect it to your dataflow. In this case, when you're connecting to
your development or production environment using an Organizational account, you must enable one of the
following roles for the storage account: Blob Data Reader, Blob Data Contributor, or Blob Data Owner.
See also
Analyze data in Azure Data Lake Storage Gen2 by using Power BI
Introduction to Azure Data Lake Storage Gen2
Analyze data in Azure Data Lake Storage Gen2 by
using Power BI
5/25/2022 • 2 minutes to read
In this article, you'll learn how to use Power BI Desktop to analyze and visualize data that's stored in a storage
account that has a hierarchical namespace (Azure Data Lake Storage Gen2).
Prerequisites
Before you begin this tutorial, you must have the following prerequisites:
An Azure subscription. Go to Get Azure free trial.
A storage account that has a hierarchical namespace. Follow the instructions at Create a storage account to
create one. This article assumes that you've created a storage account named contosoadlscdm .
Ensure you are granted one of the following roles for the storage account: Blob Data Reader , Blob Data
Contributor , or Blob Data Owner .
A sample data file named Drivers.txt located in your storage account. You can download this sample from
Azure Data Lake Git Repository, and then upload that file to your storage account.
Power BI Desktop . You can download this application from the Microsoft Download Center.
4. After the data has been successfully loaded into Power BI, the following fields are displayed in the Fields
panel.
However, to visualize and analyze the data, you might prefer the data to be available using the following
fields.
In the next steps, you'll update the query to convert the imported data to the desired format.
5. From the Home tab on the ribbon, select Transform Data . The Power Query editor then opens,
displaying the contents of the file.
6. In the Power Query editor, under the Content column, select Binary. The file will automatically be detected as CSV and will contain the output shown below. Your data is now available in a format that you can use to create visualizations. (A rough M sketch of the query these steps generate appears after this list.)
7. From the Home tab on the ribbon, select Close & Apply .
8. Once the query is updated, the Fields tab displays the new fields available for visualization.
9. Now you can create a pie chart to represent the drivers in each city for a given country. To do so, make
the following selections.
From the Visualizations tab, select the symbol for a pie chart.
In this example, the columns you're going to use are Column 4 (name of the city) and Column 7 (name of
the country). Drag these columns from the Fields tab to the Visualizations tab as shown below.
The pie chart should now resemble the one shown below.
10. If you select a specific country from the page level filters, the number of drivers in each city of the
selected country will be displayed. For example, under the Visualizations tab, under Page level filters ,
select Brazil .
11. The pie chart is automatically updated to display the drivers in the cities of Brazil.
12. From the File menu, select Save to save the visualization as a Power BI Desktop file.
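As mentioned in step 6, the steps above correspond roughly to the following M sketch. The container name is a placeholder, the row selection by Name is illustrative of the navigation step, and the delimiter and encoding are assumptions about the sample file:

let
    Source = AzureStorage.DataLake("https://contosoadlscdm.dfs.core.windows.net/mycontainer"),
    // Pick the uploaded sample file and read its binary content
    DriversBinary = Source{[Name = "Drivers.txt"]}[Content],
    // Parse the binary as delimited text, as the editor does when you select Binary
    Drivers = Csv.Document(DriversBinary, [Delimiter = ",", Encoding = 65001])
in
    Drivers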
See also
Azure Data Lake Storage Gen2
Azure SQL database
5/25/2022 • 3 minutes to read
Summary
IT EM DESC RIP T IO N
Authentication types supported Windows (Power BI Desktop, Excel, Power Query Online with
gateway)
Database (Power BI Desktop, Excel)
Microsoft Account (all)
Basic (Power Query Online)
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
By default, Power BI installs an OLE DB driver for Azure SQL database. However, for optimal performance, we
recommend that the customer installs the SQL Server Native Client before using the Azure SQL database
connector. SQL Server Native Client 11.0 and SQL Server Native Client 10.0 are both supported in the latest
version.
Capabilities supported
Import
DirectQuery (Power BI only)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
SQL Server failover support
Connect to Azure SQL database from Power Query Desktop
To connect to an Azure SQL database from Power Query Desktop, take the following steps:
1. Select the Azure SQL database option in the connector selection.
2. In SQL Ser ver database , provide the name of the server and database (optional).
For more information about authentication methods, go to Authentication with a data source.
NOTE
If the connection is not encrypted, you'll be prompted with the following message.
Select OK to connect to the database by using an unencrypted connection, or follow the instructions in
Enable encrypted connections to the Database Engine to set up encrypted connections to Azure SQL
database.
7. In Navigator , select the database information you want, then either select Load to load the data or
Transform Data to continue transforming the data in Power Query Editor.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.
Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Enable SQL Server Failover support: If checked, when a node in the Azure SQL failover group isn't available, Power Query moves from that node to another when failover occurs. If cleared, no failover occurs.
Once you've selected the advanced options you require, select OK in Power Query Desktop or Next in Power
Query Online to connect to your Azure SQL database.
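In M, these advanced options correspond to fields of the optional record passed to Sql.Database. A minimal sketch, with placeholder server and database names; mapping Enable SQL Server Failover support to MultiSubnetFailover is an assumption:

let
    Source = Sql.Database(
        "myserver.database.windows.net",               // placeholder Azure SQL server
        "AdventureWorks",                              // placeholder database
        [
            CommandTimeout = #duration(0, 0, 20, 0),   // Command timeout in minutes
            CreateNavigationProperties = true,         // Include relationship columns
            HierarchicalNavigation = true,             // Navigate using full hierarchy
            MultiSubnetFailover = true                 // Enable SQL Server Failover support
        ]
    )
in
    Source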
Troubleshooting
Always Encrypted columns
Power Query doesn't support 'Always Encrypted' columns.
Azure Synapse Analytics (SQL DW)
5/25/2022 • 3 minutes to read
Summary
Authentication types supported: Windows (Power BI Desktop, Excel, online service with gateway), Database (Power BI Desktop, Excel), Microsoft Account (all), Basic (online service)
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
By default, Power BI installs an OLE DB driver for Azure Synapse Analytics (SQL DW). However, for optimal
performance, we recommend that the customer installs the SQL Server Native Client before using the Azure
Synapse Analytics (SQL DW) connector. SQL Server Native Client 11.0 and SQL Server Native Client 10.0 are
both supported in the latest version.
Capabilities Supported
Import
DirectQuery (Power BI only)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
SQL Server failover support
Connect to Azure Synapse Analytics (SQL DW) from Power Query
Desktop
To make the connection from Power Query Desktop:
1. Select the Azure Synapse Analytics (SQL DW) option in the connector selection.
2. In the SQL Ser ver database dialog that appears, provide the name of the server and database
(optional). In this example, TestAzureSQLServer is the server name and AdventureWorks2012 is the
database.
For more information about authentication methods, go to Authentication with a data source.
NOTE
If the connection is not encrypted, you'll be prompted with the following dialog.
Select OK to connect to the database by using an unencrypted connection, or follow the instructions in
Enable encrypted connections to the Database Engine to set up encrypted connections to Azure Synapse
Analytics (SQL DW).
6. In Navigator , select the database information you want, then either select Load to load the data or
Transform Data to continue transforming the data in Power Query Editor.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.
Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Enable SQL Server Failover support: If checked, when a node in the Azure SQL failover group isn't available, Power Query moves from that node to another when failover occurs. If cleared, no failover occurs.
Once you've selected the advanced options you require, select OK in Power Query Desktop or Next in Power Query Online to connect to your Azure Synapse Analytics (SQL DW) database.
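The Native SQL statement advanced option corresponds to the Query field of the same options record. A hedged sketch that reuses the sample server and database names from this article; the SQL text itself is a placeholder:

let
    Source = Sql.Database(
        "TestAzureSQLServer",                               // sample server name from this article
        "AdventureWorks2012",                               // sample database name from this article
        [Query = "SELECT TOP 100 * FROM dbo.DimCustomer"]   // placeholder native SQL statement
    )
in
    Source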
Troubleshooting
Always Encrypted columns
Power Query doesn't support 'Always Encrypted' columns.
Azure Synapse Analytics workspace (Beta)
5/25/2022 • 2 minutes to read
Summary
NOTE
This Azure Synapse Analytics workspace connector doesn't replace the Azure Synapse Analytics (SQL DW) connector. This
connector makes exploring data in Synapse workspaces more accessible. Some capabilities aren't present in this connector,
including native query and DirectQuery support.
NOTE
This connector supports access to all data in your Synapse workspace, including Synapse Serverless, Synapse on-demand,
and Spark tables.
Prerequisites
Before you can sign in to Synapse workspaces, you must have access to Azure Synapse Analytics Workspace.
Capabilities Supported
Import
3. In the Sign in with Microsoft window that appears, provide your credentials to sign in to your Synapse
account. Then select Next .
4. Once you've successfully signed in, select Connect .
Once the connection is established, you’ll see a list of the workspaces you have access to. Drill through the
workspaces, databases, and tables.
You can Load the selected table, which brings the entire table into Power BI Desktop, or you can select
Transform Data to edit the query, which opens the Power Query editor. You can then filter and refine the set of
data you want to use, and then load that refined set of data into Power BI Desktop.
Troubleshooting
I don't see my Synapse workspace in the connector
The Synapse connector is using Azure role-based access control (RBAC) to find the Synapse workspaces you
have access to.
If your access is only defined in Synapse RBAC, you might not see the workspace.
Make sure your access is defined by Azure RBAC to ensure all Synapse workspaces are displayed.
BitSight Security Ratings
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by BitSight, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the BitSight website and use the support channels there.
Summary
Prerequisites
A user must have a BitSight Security Ratings product in order to access the BitSight data in Power BI. For more
information on BitSight Security Ratings, go to https://www.bitsight.com/security-ratings.
Users must also have the March 2021 release of Power BI Desktop or later.
Capabilities Supported
Import
NOTE
The following connector article is provided by Bloomberg, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the Bloomberg website and use the support channels there.
Summary
Prerequisites
Your organization must subscribe to Bloomberg PORT Enterprise and you must be a Bloomberg Anywhere user
and have a Bloomberg biometric authentication device (B-Unit).
Capabilities Supported
Import
Once the connection is established, you will see data available for preview in Navigator .
You can Load the selected table, or you can select Transform Data to edit the query, which opens Power Query
Editor. You can then filter and refine the set of data you want to use, and then load that refined set of data into
Power BI Desktop.
BQE Core (Beta)
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by BQE, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the BQE website and use the support channels there.
Summary
Prerequisites
To use the BQE Core Power BI connector, you must have a BQE Core account with username and password.
Capabilities supported
Import
4. In the sign in screen, enter your Core email and password. Select Login .
5. You'll then be prompted to select your Core company file.
a. Select the Core company file you want to use.
b. (Optional) If you select Remember my consent , the next time you connect to this Core company file
you won't need to grant permission again.
c. Select Grant Permission .
6. Select Connect , and then select a module. For reference, review the API Reference under the Core API
Documentation.
7. From the Navigator, select the tables to load, and then select Transform Data to transform the data in
Power Query.
Common Data Service (Legacy)
5/25/2022 • 5 minutes to read
NOTE
The Common Data Service (Legacy) connector has been superseded by the Power Query Dataverse connector. In most
cases, we recommend that you use the Dataverse connector instead of the Common Data Service (Legacy) connector.
However, there may be limited cases where it's necessary to choose the Common Data Service (Legacy) connector. These
cases are described in When to use the Common Data Service (Legacy) connector.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
You must have a Common Data Service (Legacy) environment with maker permissions to access the portal, and
read permissions to access data within tables.
Capabilities supported
Server URL
Advanced
Reorder columns
Add display column
When the table is loaded in the Navigator dialog box, by default the columns in the table are reordered in
alphabetical order by the column names. If you don't want the columns reordered, in the advanced
settings enter false in Reorder columns .
Also when the table is loaded, by default if the table contains any picklist fields, a new column with the
name of the picklist field with _display appended at the end of the name is added to the table. If you
don't want the picklist field display column added, in the advanced settings enter false in Add display
column .
When you've finished filling in the information, select OK .
4. If this attempt is the first time you're connecting to this site, select Sign in and input your credentials.
Then select Connect .
5. In Navigator , select the data you require, then either load or transform the data.
3. If necessary, enter an on-premises data gateway if you're going to be using on-premises data. For
example, if you're going to combine data from Dataverse and an on-premises SQL Server database.
4. Sign in to your organizational account.
5. When you've successfully signed in, select Next .
6. In the navigation page, select the data you require, and then select Transform Data .
Dataverse
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
You must have a Dataverse environment.
You must have read permissions to access data within tables.
To use the Dataverse connector, the TDS endpoint setting must be enabled in your environment. More
information: Manage feature settings
To use the Dataverse connector, one of TCP ports 1433 or 5558 needs to be open to connect. Port 1433 is used
automatically. However, if port 1433 is blocked, you can use port 5558 instead. To enable port 5558, you must
append that port number to the Dataverse environment URL, such as
yourenvironmentid.crm.dynamics.com,5558. More information: SQL Server connection issue due to closed
ports
NOTE
If you are using Power BI Desktop and need to use port 5558, you must create a source with the Dataverse environment
URL, such as yourenvironmentid.crm.dynamics.com,5558, in Power Query M.
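A minimal sketch of such a source in Power Query M, reusing the placeholder environment URL from the note with port 5558 appended:

let
    Source = CommonDataService.Database("yourenvironmentid.crm.dynamics.com,5558")
in
    Source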
Capabilities supported
Server URL
Import
DirectQuery (Power BI only)
Advanced
Reorder columns
Add display column
3. If this attempt is the first time you're connecting to this site, select Sign in and input your credentials.
Then select Connect .
4. In Navigator , select the data you require, then either load or transform the data.
5. Select either the Import or DirectQuery data connectivity mode. Then select OK.
Connect to Dataverse from Power Query Online
To connect to Dataverse from Power Query Online:
1. From the Data sources page, select Dataverse .
2. Leave the server URL address blank. Leaving the address blank will list all of the available environments
you have permission to use in the Power Query Navigator window.
NOTE
If you need to use port 5558 to access your data, you'll need to load a specific environment with port 5558
appended at the end in the server URL address. In this case, go to Finding your Dataverse environment URL for
instructions on obtaining the correct server URL address.
3. If necessary, enter an on-premises data gateway if you're going to be using on-premises data. For
example, if you're going to combine data from Dataverse and an on-premises SQL Server database.
4. Sign in to your organizational account.
5. When you've successfully signed in, select Next .
6. In the navigation page, select the data you require, and then select Transform Data .
Once a database source has been defined, you can specify a native query using the Value.NativeQuery function.
let
    Source = CommonDataService.Database("[DATABASE]"),
    myQuery = Value.NativeQuery(Source, "[QUERY]", null, [EnableFolding=true])
in
    myQuery
Note that misspelling a column name may result in an error message about query folding instead of one about a missing column.
Accessing large datasets
Power BI datasets contained in Dataverse can be very large. If you're using the Power Query Dataverse
connector, any specific query that accesses the dataset must return less than 80 MB of data. So you might need
to query the data multiple times to access all of the data in the dataset. Using multiple queries can take a
considerable amount of time to return all the data.
If you're using the Common Data Service (Legacy) connector, you can use a single query to access all of the data
in the dataset. This connector works differently and returns the result in “pages” of 5,000 records. Although the
Common Data Service (Legacy) connector is more efficient in returning large amounts of data, it can still take a
significant amount of time to return the result.
Instead of using these connectors to access large datasets, we recommend that you use Azure Synapse Link to
access large datasets. Using Azure Synapse Link is even more efficient than either the Power Query Dataverse or
Common Data Service (Legacy) connectors, and it is specifically designed around data integration scenarios.
Performance issues related to relationship columns
Similar to the SQL Server connector, there's an option available to disable navigation properties (relationship
columns) in the Dataverse connector to improve performance. This option is not yet available in the user
interface, but can be set using the CreateNavigationProperties=false parameter in the Dataverse connector
function.
Source = CommonDataService.Database("{crminstance}.crm.dynamics.com",[CreateNavigationProperties=false]),
Dataflows
5/25/2022 • 2 minutes to read
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
You must have an existing Dataflow with maker permissions to access the portal, and read permissions to access
data from the dataflow.
Capabilities supported
Import
DirectQuery (Power BI Desktop only)
NOTE
DirectQuery requires Power BI premium. More information: Premium features of dataflows
4. In Navigator , select the Dataflow you require, then either load or transform the data.
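Behind the Navigator, the connector returns a navigation table that you drill into by workspace, dataflow, and entity. A heavily hedged sketch of the starting point; treat the function name and the navigation key as assumptions:

let
    Source = PowerPlatform.Dataflows(null),
    // Drill into Workspaces, then your workspace, dataflow, and entity in the Navigator;
    // each drill-down adds another navigation step to this query.
    Workspaces = Source{[Id = "Workspaces"]}[Data]
in
    Workspaces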
Get data from Dataflows in Power Query Online
To get data from Dataflows in Power Query Online:
1. From the Data sources page, select Dataflows .
NOTE
The following connector article is provided by Databricks, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the Databricks website and use the support channels there.
Summary
Prerequisites
If you use Power BI Desktop you need to install the November release of Power BI Desktop or later. Download
the latest version.
The data provider sends an activation URL from which you can download a credentials file that grants you
access to the shared data.
After downloading the credentials file, open it with a text editor to retrieve the endpoint URL and the token.
For detailed information about Delta Sharing, visit Access data shared with you using Delta Sharing.
Capabilities supported
Import
NOTE
The following connector article is provided by Denodo, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the Denodo website and use the support channels there.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
To use this connector, you must have installed the Denodo Platform, and configured and started its service. If you're connecting through ODBC, you must have correctly configured the connection in the ODBC Data Source Administrator.
Capabilities supported
Import
DirectQuery
In this case:
Kerberos authentication is enabled in the Virtual DataPort server.
The Denodo Virtual DataPort database that the DSN connects to must be configured with
the option ODBC/ADO.net authentication type set to Kerberos.
The client, Power BI Desktop, has to belong to the Windows domain because the ODBC
driver requests the Kerberos ticket to the ticket cache.
In the advanced options of the DSN configuration, consider the type of authentication you
want to use.
Basic : This authentication type allows you to connect Power BI Desktop to your Virtual DataPort
data using your Virtual DataPort server credentials.
Summary
Authentication types supported: Digital Construction Works JSON Web Token (JWT)
NOTE
The following connector article is provided by Digital Construction Works (DCW), the owner of this connector and a
member of the Microsoft Power Query Connector Certification Program. If you have questions regarding the content of
this article or have changes you would like to see made to this article, visit the DCW website and use the support
channels there.
Prerequisites
Use of this connector requires a Digital Construction Works Integrations Platform subscription. To learn more,
go to https://www.digitalconstructionworks.com/solutions/the-dcw-integrations-platform. Visit
https://www.digitalconstructionworks.com for company information.
Users of the Digital Construction Works (DCW) Integrations Platform can request a JSON Web Token (JWT) from
their project administrator in order to access data using the DCW Insights connector. Users can then follow the
documentation for the OData API to connect to the datasets they want to use in Power BI.
Capabilities supported
Import
3. Select OK .
4. If this is the first time you're connecting to this endpoint, you'll be asked to enter in the JWT used to
authorize you for this project. Then select Connect .
For more information about authentication methods, go to Authentication with a data source.
NOTE
If the connection isn't specified to use https , you'll be prompted to update your URL.
5. In Navigator , select the database information you want, then either select Load to load the data or
Transform Data to continue transforming the data in Power Query editor.
Troubleshooting
Always Encrypted columns
Power Query doesn't support "Always Encrypted" columns.
OData.Feed
We use the following default settings when using OData.Feed:
Implementation = "2.0", MoreColumns = true, ODataVersion = 4
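Spelled out as an explicit call, those defaults look like the following sketch; the service URL is a placeholder for your Insights OData endpoint:

let
    Source = OData.Feed(
        "https://example-dcw-insights.invalid/odata",   // placeholder endpoint URL
        null,
        [Implementation = "2.0", MoreColumns = true, ODataVersion = 4]
    )
in
    Source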
EQuIS (Beta)
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by EarthSoft, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the EarthSoft website and use the support channels there.
Summary
Prerequisites
To use the EQuIS connector, you must have a valid user account in an EQuIS Enterprise site (version 7.0.0.19300
or later) that includes a REST API license. Your user account must be a member of the REST API role. To verify
user account configuration, go to the Roles tab in your user profile and verify that you are a member of the
REST API role.
Capabilities supported
Import
Additional Information
For best functionality and performance, EarthSoft recommends that you use the EQuIS connector with the
latest build of EQuIS Enterprise.
When using reports in a facility group, non-administrator users must have permission to all facilities
contained in the facility group.
Only "grid" reports will be available in the Navigator .
All datasets consumed by the EQuIS connector will use camelCase for column names.
The current version of the EQuIS connector will retrieve a dataset in a single request and is limited to
1,048,576 rows in a single dataset (this limitation might be removed in a future version of the connector).
Essbase
5/25/2022 • 16 minutes to read
Summary
Prerequisites
None
Capabilities Supported
Import
Direct Query
Advanced options
Command timeout in minutes
Server
Application
MDX statement
5. In Navigator , select the data you require. Then, either select Transform data to transform the data in
Power Query Editor, or Load to load the data in Power BI.
Connect using advanced options
Power Query provides a set of advanced options that you can add to your query if needed. The following table
lists all of the advanced options you can set in Power Query.
Command timeout in minutes: Lets you set the maximum time a command is allowed to run before Power BI abandons the call. If the command timeout is reached, Power BI may retry two more times before completely abandoning the call. This setting is helpful for querying large amounts of data. The default value of the command timeout is 140 seconds.
Server: The name of the server where the optional MDX statement is to run. This value is case sensitive.
Be aware that this presentation is a stylistic decision and that there are no differences in the data. The levels in the Power Query navigator correspond to the hierarchical levels.
In the example above, Level 1 would contain “R_ReportingUnits”, “Adjustment Entity Input” and “No_Entity”.
Level 2 would contain “R_Americas”, “R_EMEA”, “R_AsiaPacific”, “1_ReportingUnits_Adjustment”,
“CALA_HFM_Input”, “CALA_Total”, and so on.
The reason is that the navigator in Power Query is limited to displaying 10,000 members, and there can be millions or billions of members underneath a hierarchy. Even when there's no member display limit (such as with Power Query Online), navigating and selecting every individual member in a tree format with so many possible values quickly becomes tedious and difficult to use.
So, the grouping of the hierarchical levels makes it easier to select what to import, and the subsequent report
generation can use filters to target only the members the end user wants.
Known limitations
The Essbase connector doesn't support measure hierarchies. All measures are displayed at the same level. You
can still select all the measures that you need. The search field can be used to narrow down the displayed
measures if there are large numbers of measures.
Performance considerations
Interacting with Power BI in DirectQuery mode is very dynamic. When selecting a checkbox to include a
measure or dimension level in the visualization, Power BI Desktop generates a query and sends it to the Oracle
Essbase server to get the results. Power BI is optimized to cache any repeated queries to improve performance.
But if any new query is generated, it's sent to the Oracle Essbase server to produce a new result. Depending on
the number of selected measures, dimension levels, and the filters applied, the query might get sent more
quickly than the Oracle Essbase server can respond. To improve performance and increase responsiveness,
consider the following three methods to optimize your interaction with the Oracle Essbase server.
Query reductions options
There are three options to reduce the number of queries sent. In Power BI Desktop, select the File tab, then
select Options and settings > Options , and then select Quer y reductions under the Current File section.
Selecting the Disabling cross highlighting/filtering by default option under Reduce number of queries
sent by disables cross highlighting/filtering by default. When disabled, member lists in the filter don't get
updated when filtering members in other levels of the same dimension. Selecting the Slicer selections option
under Show an Apply button and only send queries once for section displays the Apply button when a
slicer selection is changed. Selecting the Filter selections option under Show an Apply button and only
send queries once for section displays the Apply button when a filter selection is changed.
NOTE
These options apply only to the current file you are working on. Current File option settings are saved with the file and
restored when opening the same file.
2. If you have members you want to filter on in the initial dimension, select the column properties button
to display the list of available dimension members at this level. Select only the dimension members
you need at this level and then select OK to apply the filter.
3. The resulting data is now updated with the applied filter. Applied Steps now contains a new step
(Filtered Rows ) for the filter you set. You can select the settings button for the step to modify the
filter at a later time.
4. Now you'll add a new dimension level. In this case, you're going to add the next level down for the same
dimension you initially chose. Select Add Items on the ribbon to bring up the Navigator dialog box.
5. Navigate to the same dimension, but this time select the next level below the first level. Then select OK to
add the dimension level to the result.
6. The result grid now has the data from the new dimension level. Notice that because you've applied a filter
at the top level, only the related members in the second level are returned.
7. You can now apply a filter to the second-level dimension as you did for the first level.
8. In this way, each subsequent step ensures only the members and data you need are retrieved from the
server.
9. Now let's add a new dimension level by repeating the previous steps. Select Add Items on the ribbon
bar again.
10. Navigate to the dimension level you want, select it, and then select OK to add the dimension level to the
result.
4. When you have filters for two or more levels of the same dimension, you'll notice that selecting members
from a higher level in the dimension changes the members available in the lower levels of that
dimension.
This cross highlighting/filtering behavior can be disabled by checking the Disabling cross
highlighting/filtering by default option, as described in Query reductions options.
5. When you've finished choosing the members you want in the dimension level filter, it's a good time to
add that dimension level to your visualization. Check the matching dimension level in the Fields pane
and it's then added to your current visualization.
For more information about adding filters, go to Add a filter to a report in Power BI.
Troubleshooting
This section outlines common issues that you might come across, and includes troubleshooting steps to address
the issues.
Connection issues
Symptom 1
Power BI Desktop returns the error message "Unable to connect to the remote server".
Resolution
1. Ensure the Essbase Analytic Provider Services (APS) server is configured correctly for the Provider
Servers and Standalone Servers in the Essbase Administration Service (EAS) console. More information:
Configuring Essbase Clusters
2. Ensure that the URL is correct.
Check to ensure the hostname and/or IP address is correct.
Check to ensure the provided port is correct.
Check to ensure the http (not https) protocol is specified.
Check to ensure the case is correct for the /aps/XMLA path in the URL.
3. If there's a firewall between Power BI Desktop and the provided hostname, check to ensure the provided
hostname and port can pass outbound through your firewall.
Validation
If you connect again, the error no longer appears, and the cube and member list are shown in the navigation pane. You can also select members and display them in the preview in Import mode.
Symptom 2
Power BI Desktop returns the error message "We couldn't authenticate with the credentials provided. Please try
again."
Resolution
Ensure the provided username and password are correct. Reenter their values carefully. The password is case-
sensitive.
Validation
After correcting the username and password, you should be able to display the members and the value in the
preview or be able to load the data.
Symptom 3
Power BI Desktop returns the error message "Data at the root level is invalid. Line 1, position 1."
Resolution
Ensure the Essbase Analytic Provider Services (APS) server is configured correctly for the Provider Servers and
Standalone Servers in the Essbase Administration Service (EAS) console. More information: Configuring Essbase
Clusters.
Validation
Trying to connect again won't show the error and the Cube and member list is displayed in the navigation pane.
You can also select and display in the preview in Import mode.
Symptom 4
Once successfully connected to the Oracle Essbase Analytic Provider Services (APS) server, there are servers
listed below the URL node in the data source navigator. However, when you expand a server node, no
applications are listed below that server node.
Resolution
We recommend configuring the Oracle Hyperion server to define the provider and standalone servers through
the Essbase Administration Service (EAS) console. Refer to section Addendum: Registering Provider and
Standalone Servers in Essbase Administration Service (EAS) Console.
Validation
Trying to connect again won't show the error and you can see the Cube and member list in the navigation pane.
You can also select and display in the preview in Import mode.
Time out or large data issue
Symptom 1
Power Query returns the error message "The operation has timed out"
Resolution
1. Ensure the network is stable and there's a reliable network path to the Essbase Analytic Provider Services
(APS) server provided in the data source URL.
2. If there's a possibility that the query to the service could return a large amount of data, specify a long (or
longer) command timeout interval. If possible, add filters to your query to reduce the amount of data
returned. For example, select only specific members of each dimension you want returned.
Validation
Retry loading the data. If the problem persists, increase the timeout interval or filter the data further. If the problem still persists, try the resolution for Symptom 3.
Symptom 2
The query returns the error message "Internal error: Query is allocating too large memory ( > 4GB) and cannot
be executed. Query allocation exceeds allocation limits."
Resolution
The query you're trying to execute is producing results greater than the Oracle Essbase server can handle.
Supply or increase the filters on the query to reduce the amount of data the server will return. For example,
select specific members for each level of each dimension or set numeric limits on the value of measures.
Validation
Retry loading the data. If the problem persists, increase the timeout interval or filter the data further. If the problem still persists, try the resolution for Symptom 3.
Symptom 3
Essbase Analytic Provider Services (APS) or Essbase server indicates a large number of connections with long
running sessions.
Resolution
When the connectivity mode is DirectQuery, it's easy to select measures or dimension levels to add to the
selected visualization. However, each new selection creates a new query and a new session to the Essbase
Analytic Provider Services (APS)/Essbase server. There are a few ways to ensure a reduced number of queries or
to reduce the size of each query result. Review Performance Considerations to reduce the number of times the
server is queried and to also reduce the size of query results.
Validation
Retry to load the data.
Key not matching when running MDX
Symptom
An MDX statement returns the error message "The key didn't match any rows in the table".
Resolution
It's likely that the value or the case of the Server and Application fields doesn't match. Select the Edit button and correct the value and case of the Server and Application fields.
Validation
Retry to load the data.
Unable to get cube issue - MDX
Symptom
An MDX statement returns the error message "Unable to get the cube name from the statement. Check the
format used for specifying the cube name".
Resolution
Ensure the database name in the MDX statement's FROM clause is fully qualified with the application and
database name, for example, [Sample.Basic]. Select the Edit button and correct the fully qualified database name
in the MDX statement's FROM clause.
Validation
Retry to load the data.
Essbase Error (1260060) issue - MDX
Symptom
An MDX statement returns the error message "Essbase Error (1260060): The cube name XXXX does not match
with current application/database"
Resolution
Ensure the application name and the fully qualified database name in the FROM clause match. Select the Edit
button and correct either the application name or the fully qualified database name in the MDX statement's
FROM clause.
Validation
Retry to load the data.
Essbase Error (1200549): Repeated dimension [Measures] in MDX query
Symptom
Loading a dimension returns the error message "Essbase Error (1200549): Repeated dimension [Measures] in
MDX query".
Resolution
1. Sign in to the Essbase server, open the Essbase Administration Services Console and sign in with an
admin user (or whoever has permissions over the problematic database).
2. Navigate to the Essbase server > application > database with the problematic "Measures" dimension.
3. Unlock the outline of the database and edit it.
4. Determine which dimension should be the "Accounts" dimension type. Right-click it and select Edit member properties….
5. Select the Dimension Type field and set it to Accounts . Select OK .
6. Verify and Save the outline.
Validation
Retry to load the dimension.
Excel
5/25/2022 • 7 minutes to read
Summary
ITEM | DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
To connect to a legacy workbook (such as .xls or .xlsb), the Access Database Engine OLEDB (or ACE) provider is
required. To install this provider, go to the download page and install the relevant (32 bit or 64 bit) version. If you
don't have it installed, you'll see the following error when connecting to legacy workbooks:
The 'Microsoft.ACE.OLEDB.12.0' provider is not registered on the local machine. The 32-bit (or 64-bit)
version of the Access Database Engine OLEDB provider may be required to read this type of file. To download
the client software, visit the following site: https://go.microsoft.com/fwlink/?LinkID=285987.
ACE can't be installed in cloud service environments. So if you're seeing this error in a cloud host (such as Power
Query Online), you'll need to use a gateway that has ACE installed to connect to the legacy Excel files.
Capabilities Supported
Import
If the Excel workbook is online, use the Web connector to connect to the workbook.
3. In Navigator , select the workbook information you want, then either select Load to load the data or
Transform Data to continue transforming the data in Power Query Editor.
Troubleshooting
Numeric precision (or "Why did my numbers change?")
When importing Excel data, you may notice that certain number values seem to change slightly when imported
into Power Query. For example, if you select a cell containing 0.049 in Excel, this number is displayed in the
formula bar as 0.049. But if you import the same cell into Power Query and select it, the preview details display
it as 0.049000000000000002 (even though in the preview table it's formatted as 0.049). What's going on here?
The answer is a bit complicated, and has to do with how Excel stores numbers using something called binary
floating-point notation. The bottom line is that there are certain numbers that Excel can't represent with 100%
precision. If you crack open the .xlsx file and look at the actual value being stored, you'll see that in the .xlsx file,
0.049 is actually stored as 0.049000000000000002. This is the value Power Query reads from the .xlsx, and thus
the value that appears when you select the cell in Power Query. (For more information on numeric precision in
Power Query, go to the "Decimal number" and "Fixed decimal number" sections of Data types in Power Query.)
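As a small illustration of this behavior (not a required step), the two literals compare as equal in Power Query because they map to the same binary floating-point value:
let
    // Both literals parse to the same IEEE 754 double, which is why Power Query shows
    // 0.049000000000000002 for a cell that Excel displays as 0.049
    SameDouble = (0.049 = 0.049000000000000002)
in
    SameDouble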
Connecting to an online Excel workbook
If you want to connect to an Excel document hosted in SharePoint, you can do so via the Web connector in
Power BI Desktop, Excel, and Dataflows, and also with the Excel connector in Dataflows. To get the link to the file:
1. Open the document in Excel Desktop.
2. Open the File menu, select the Info tab, and then select Copy Path .
3. Copy the address into the File Path or URL field, and remove the ?web=1 from the end of the address.
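Once you have the file path from the steps above, the connection can also be expressed directly in M. This is a minimal sketch assuming a hypothetical SharePoint URL; the Web connector is expressed as Web.Contents feeding Excel.Workbook:
let
    // Hypothetical SharePoint path copied via File > Info > Copy Path, with "?web=1" removed
    WorkbookUrl = "https://contoso.sharepoint.com/sites/Team/Shared%20Documents/Sales.xlsx",
    // Download the workbook binary and open it; the third argument delays type detection
    Source = Excel.Workbook(Web.Contents(WorkbookUrl), null, true)
in
    Source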
Legacy ACE connector
Power Query reads legacy workbooks (such as .xls or .xlsb) using the Access Database Engine (or ACE) OLEDB
provider. Because of this, you may come across unexpected behaviors when importing legacy workbooks that
don't occur when importing OpenXML workbooks (such as .xlsx). Here are some common examples.
Unexpected value formatting
Because of ACE, values from a legacy Excel workbook might be imported with less precision or fidelity than you
expect. For example, imagine your Excel file contains the number 1024.231, which you've formatted for display
as "1,024.23". When imported into Power Query, this value is represented as the text value "1,024.23" instead of
as the underlying full-fidelity number (1024.231). This is because, in this case, ACE doesn't surface the
underlying value to Power Query, but only the value as it's displayed in Excel.
Unexpected null values
When ACE loads a sheet, it looks at the first eight rows to determine the data types of the columns. If the first
eight rows aren't representative of the later rows, ACE may apply an incorrect type to that column and return
nulls for any value that doesn't match the type. For example, if a column contains numbers in the first eight rows
(such as 1000, 1001, and so on) but has non-numerical data in later rows (such as "100Y" and "100Z"), ACE
concludes that the column contains numbers, and any non-numeric values are returned as null.
Inconsistent value formatting
In some cases, ACE returns completely different results across refreshes. Using the example described in the
formatting section, you might suddenly see the value 1024.231 instead of "1,024.23". This difference can be
caused by having the legacy workbook open in Excel while importing it into Power Query. To resolve this
problem, close the workbook.
Missing or incomplete Excel data
Sometimes Power Query fails to extract all the data from an Excel Worksheet. This failure is often caused by the
Worksheet having incorrect dimensions (for example, having dimensions of A1:C200 when the actual data
occupies more than three columns or 200 rows).
How to diagnose incorrect dimensions
To view the dimensions of a Worksheet:
1. Rename the xlsx file with a .zip extension.
2. Open the file in File Explorer.
3. Navigate into xl\worksheets.
4. Copy the xml file for the problematic sheet (for example, Sheet1.xml) out of the zip file to another location.
5. Inspect the first few lines of the file. If the file is small enough, open it in a text editor. If the file is too large to
be opened in a text editor, run the following command from a Command Prompt: more Sheet1.xml .
6. Look for a <dimension .../> tag (for example, <dimension ref="A1:C200" /> ).
If your file has a dimension attribute that points to a single cell (such as <dimension ref="A1" /> ), Power Query
uses this attribute to find the starting row and column of the data on the sheet.
However, if your file has a dimension attribute that points to multiple cells (such as
<dimension ref="A1:AJ45000"/> ), Power Query uses this range to find the starting row and column as well as
the ending row and column . If this range doesn't contain all the data on the sheet, some of the data won't be
loaded.
How to fix incorrect dimensions
You can fix issues caused by incorrect dimensions by doing one of the following actions:
Open and resave the document in Excel. This action will overwrite the incorrect dimensions stored in the
file with the correct value.
Ensure the tool that generated the Excel file is fixed to output the dimensions correctly.
Update your M query to ignore the incorrect dimensions. As of the December 2020 release of Power
Query, Excel.Workbook now supports an InferSheetDimensions option. When true, this option will cause
the function to ignore the dimensions stored in the Workbook and instead determine them by inspecting
the data.
Here's an example of how to provide this option:
Excel.Workbook(File.Contents("C:\MyExcelFile.xlsx"), [DelayTypes = true, InferSheetDimensions = true])
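For context, here's a minimal sketch of a full query that uses this option; the file path and sheet name are placeholders:
let
    // Hypothetical file path and sheet name; InferSheetDimensions = true makes the connector ignore
    // the dimension metadata stored in the workbook and determine the range by inspecting the data
    Source = Excel.Workbook(File.Contents("C:\MyExcelFile.xlsx"), [DelayTypes = true, InferSheetDimensions = true]),
    Sheet1 = Source{[Item = "Sheet1", Kind = "Sheet"]}[Data]
in
    Sheet1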
FactSet RMS (Beta)
Summary
ITEM | DESCRIPTION
NOTE
The following connector article is provided by FactSet, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the FactSet website and use the support channels there.
Prerequisites
To start using the FactSet RMS connector, the following prerequisite steps need to be completed.
Download Power BI
Ensure that you're using the latest version of Power BI, as the latest major update to the FactSet Power BI
data connector will only be available there. Any subsequent major or minor version updates will only
be available by upgrading Power BI.
Subscription and authentication
To access FactSet’s IRN, the appropriate subscription is required. Refer to the FactSet Client
Assistance page for more details.
With the subscription in place, the next step is to generate the API key from the Developer Portal.
Follow the steps outlined in FactSet API keys Authentication v1 documentation.
Capabilities supported
Import
4. In the authentication page, you'll be prompted to enter the Username - Serial and the API key. Go to the
FactSet Developer Portal for more instructions on setting up an API Key.
5. The connector opens the Power Query navigator with a list of all provided functions. Note that not all functions might be available, depending on your subscriptions. Your account team can assist
with requirements for access to additional products.
6. Use the Get* queries to look up parameters for your Notes and create new queries. A form will populate
in the query window with parameter fields to narrow your universe and return the relevant data set of
interest based on IRN Subject, Author, Date Range, Recommendations and/or Sentiments. Note that the
functions contain Get* queries that are common for IRN Notes, Custom Symbols, and Meetings APIs.
The following table describes the Get functions in the connector.
FUNCTION NAME | FUNCTION DESCRIPTION
FHIR
Fast Healthcare Interoperability Resources (FHIR®) is a new standard for healthcare data interoperability.
Healthcare data is represented as resources such as Patient , Observation , Encounter , and so on, and a REST
API is used for querying healthcare data served by a FHIR server. The Power Query connector for FHIR can be
used to import and shape data from a FHIR server.
If you don't have a FHIR server, you can provision the Azure API for FHIR.
Summary
ITEM | DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities Supported
Import
Prerequisites
You must have a FHIR Data Reader role on the FHIR server to read data from the server. More information:
Assign roles for the FHIR service
You can optionally enter an initial query for the FHIR server, if you know exactly what data you're looking
for.
Select OK to proceed.
4. Decide on your authentication scheme.
The connector supports "Anonymous" for FHIR servers with no access controls (for example, public test
servers (like http://test.fhir.org/r4) or Azure Active Directory authentication. You must have a FHIR Data
Reader role on the FHIR server to read data from the server. Go to FHIR connector authentication for
details.
5. Select the resources you're interested in.
8. Create dashboards with data, for example, make a plot of the patient locations based on postal code.
Connect to a FHIR server from Power Query Online
To make a connection to a FHIR server, take the following steps:
1. In Power Query - Choose data source, select the Other category, and then select FHIR.
2. In the FHIR dialog, enter the URL for your FHIR server.
You can optionally enter an initial query for the FHIR server, if you know exactly what data you're looking
for.
3. If necessary, include the name of your on-premises data gateway.
4. Select the Organizational account authentication kind, and select Sign in . Enter your credentials when
asked. You must have a FHIR Data Reader role on the FHIR server to read data from the server.
5. Select Next to proceed.
6. Select the resources you're interested in.
NOTE
In some cases, query folding can't be obtained purely through data shaping with the graphical user interface
(GUI), as shown in the previous image. To learn more about query folding when using the FHIR connector, see
FHIR query folding.
Next Steps
In this article, you've learned how to use the Power Query connector for FHIR to access FHIR data. Next explore
the authentication features of the Power Query connector for FHIR.
FHIR connector authentication
FHIR® and the FHIR Flame icon are the registered trademarks of HL7 and are used with the permission of
HL7. Use of the FHIR trademark does not constitute endorsement of this product by HL7.
FHIR connector authentication
5/25/2022 • 2 minutes to read
This article explains authenticated access to FHIR servers using the Power Query connector for FHIR. The
connector supports anonymous access to publicly accessible FHIR servers and authenticated access to FHIR
servers using Azure Active Directory authentication. The Azure API for FHIR is secured with Azure Active
Directory.
NOTE
If you are connecting to a FHIR server from an online service, such as Power BI service, you can only use an organizational
account.
Anonymous access
There are many publicly accessible FHIR servers. To enable testing with these public servers, the Power Query
connector for FHIR supports the "Anonymous" authentication scheme. For example to access the public
https://vonk.fire.ly server:
1. Enter the URL of the public Vonk server.
After that, follow the steps to query and shape your data.
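The same anonymous connection can also be expressed directly in M. This is a minimal sketch using the public Vonk server mentioned above (the resources available on the test server may vary):
let
    // Anonymous access to the public test server; no credentials are supplied
    Source = Fhir.Contents("https://vonk.fire.ly", null),
    Patients = Source{[Name = "Patient"]}[Data]
in
    Patients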
Next steps
In this article, you've learned how to use the Power Query connector for FHIR authentication features. Next,
explore query folding.
FHIR Power Query folding
FHIR query folding
5/25/2022 • 4 minutes to read
Power Query folding is the mechanism used by a Power Query connector to turn data transformations into
queries that are sent to the data source. This allows Power Query to off-load as much of the data selection as
possible to the data source rather than retrieving large amounts of unneeded data only to discard it in the client.
The Power Query connector for FHIR includes query folding capabilities, but due to the nature of FHIR search,
special attention must be given to the Power Query expressions to ensure that query folding is performed when
possible. This article explains the basics of FHIR Power Query folding and provides guidelines and examples. As an example, consider the following query, which retrieves Patient resources for patients born before 1980:
let
Source = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null),
Patient1 = Source{[Name="Patient"]}[Data],
#"Filtered Rows" = Table.SelectRows(Patient1, each [birthDate] < #date(1980, 1, 1))
in
#"Filtered Rows"
Instead of retrieving all Patient resources from the FHIR server and filtering them in the client (Power BI), it's
more efficient to send a query with a search parameter to the FHIR server:
GET https://myfhirserver.azurehealthcareapis.com/Patient?birthdate=lt1980-01-01
With such a query, the client would only receive the patients of interest and would not need to discard data in
the client.
In the example of a birth date, the query folding is straightforward, but in general it is challenging in FHIR
because the search parameter names don't always correspond to the data field names and frequently multiple
data fields will contribute to a single search parameter.
For example, consider the Observation resource and its category field. The Observation.category field is a CodeableConcept in FHIR, which has a coding field, whose entries in turn have system and code fields (among other fields). Suppose you're interested in vital signs only: you would be interested in Observations where Observation.category.coding.code = "vital-signs", but the FHIR search would look something like https://myfhirserver.azurehealthcareapis.com/Observation?category=vital-signs.
To be able to achieve query folding in the more complicated cases, the Power Query connector for FHIR matches
Power Query expressions with a list of expression patterns and translates them into appropriate search
parameters. The expression patterns are generated from the FHIR specification.
This matching with expression patterns works best when any selection expressions (filtering) are done as early as possible in the data transformation steps, before any other shaping of the data.
NOTE
To give the Power Query engine the best chance of performing query folding, you should do all data selection expressions
before any shaping of the data.
If the data is reshaped first (for example, by expanding the category codings into columns) and the filter is applied afterwards, the Power Query engine no longer recognizes the filter as a selection pattern that maps to the category search parameter. However, the query can be restructured so that the selection is applied directly to the nested fields, which restores the fold.
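A minimal sketch of such a restructured query, using the same placeholder server URL as the other examples in this article:
let
    Source = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null),
    Observations = Source{[Name = "Observation"]}[Data],
    // Fold: "category=vital-signs"
    FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([category], each Table.MatchesAnyRows([coding], each [code] = "vital-signs")))
in
    FilteredObservations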
The search query /Observation?category=vital-signs will be sent to the FHIR server, which will reduce the
amount of data that the client will receive from the server.
While both versions of the query return the same data set, the latter will, in general, result in better query performance. It's important to note that the second, more efficient version of the query can't be obtained purely through data shaping with the graphical user interface (GUI). It's necessary to write the query in the Advanced Editor.
The initial data exploration can be done with the GUI query editor, but it's recommended that the query be
refactored with query folding in mind. Specifically, selective queries (filtering) should be performed as early as
possible.
Summary
Query folding provides more efficient Power Query expressions. A properly crafted Power Query will enable
query folding and thus off-load much of the data filtering burden to the data source.
Next steps
In this article, you've learned how to use query folding in the Power Query connector for FHIR. Next, explore the
list of FHIR Power Query folding patterns.
FHIR Power Query folding patterns
FHIR query folding patterns
5/25/2022 • 9 minutes to read
This article describes Power Query patterns that allow effective query folding in FHIR. It assumes that you are familiar with using the Power Query connector for FHIR and understand the basic motivation and principles for Power Query folding in FHIR.
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "birthdate=lt1980-01-01"
FilteredPatients = Table.SelectRows(Patients, each [birthDate] < #date(1980, 1, 1))
in
FilteredPatients
Filtering Patients by birth date range using and , only the 1970s:
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "birthdate=ge1970-01-01&birthdate=lt1980-01-01"
FilteredPatients = Table.SelectRows(Patients, each [birthDate] < #date(1980, 1, 1) and [birthDate] >=
#date(1970, 1, 1))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "birthdate=ge1980-01-01,lt1970-01-01"
FilteredPatients = Table.SelectRows(Patients, each [birthDate] >= #date(1980, 1, 1) or [birthDate] <
#date(1970, 1, 1))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "active=true"
FilteredPatients = Table.SelectRows(Patients, each [active])
in
FilteredPatients
Alternative search for patients where active not true (could include missing):
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "active:not=true"
FilteredPatients = Table.SelectRows(Patients, each [active] <> true)
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "gender=male"
FilteredPatients = Table.SelectRows(Patients, each [gender] = "male")
in
FilteredPatients
Filtering to keep only patients that are not male (includes other):
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "gender:not=male"
FilteredPatients = Table.SelectRows(Patients, each [gender] <> "male")
in
FilteredPatients
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "status=final"
FilteredObservations = Table.SelectRows(Observations, each [status] = "final")
in
FilteredObservations
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "_lastUpdated=2010-12-31T11:56:02.000+00:00"
FilteredPatients = Table.SelectRows(Patients, each [meta][lastUpdated] = #datetimezone(2010, 12, 31, 11,
56, 2, 0, 0))
in
FilteredPatients
let
Encounters = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Encounter" ]}
[Data],
// Fold: "class=s|c"
FilteredEncounters = Table.SelectRows(Encounters, each [class][system] = "s" and [class][code] = "c")
in
FilteredEncounters
let
Encounters = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Encounter" ]}
[Data],
// Fold: "class=c"
FilteredEncounters = Table.SelectRows(Encounters, each [class][code] = "c")
in
FilteredEncounters
// Fold: "class=s|"
FilteredEncounters = Table.SelectRows(Encounters, each [class][system] = "s")
in
FilteredEncounters
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "subject=Patient/1234"
FilteredObservations = Table.SelectRows(Observations, each [subject][reference] = "Patient/1234")
in
FilteredObservations
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "subject=1234,Patient/1234,https://myfhirservice/Patient/1234"
FilteredObservations = Table.SelectRows(Observations, each [subject][reference] = "1234" or [subject]
[reference] = "Patient/1234" or [subject][reference] = "https://myfhirservice/Patient/1234")
in
FilteredObservations
let
ChargeItems = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "ChargeItem"
]}[Data],
// Fold: "quantity=1"
FilteredChargeItems = Table.SelectRows(ChargeItems, each [quantity][value] = 1)
in
FilteredChargeItems
let
ChargeItems = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "ChargeItem"
]}[Data],
// Fold: "quantity=gt1.001"
FilteredChargeItems = Table.SelectRows(ChargeItems, each [quantity][value] > 1.001)
in
FilteredChargeItems
// Fold: "quantity=lt1.001|s|c"
FilteredChargeItems = Table.SelectRows(ChargeItems, each [quantity][value] < 1.001 and [quantity]
[system] = "s" and [quantity][code] = "c")
in
FilteredChargeItems
let
Consents = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Consent" ]}[Data],
// Fold: "period=sa2010-01-01T00:00:00.000+00:00"
FilteredConsents = Table.SelectRows(Consents, each [provision][period][start] > #datetimezone(2010, 1, 1, 0, 0, 0, 0, 0))
in
FilteredConsents
let
Consents = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Consent" ]}[Data],
// Fold: "period=eb2010-01-01T00:00:00.000+00:00"
FilteredConsents = Table.SelectRows(Consents, each [provision][period][end] < #datetimezone(2010, 1, 1, 0, 0, 0, 0, 0))
in
FilteredConsents
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "code:text=t"
FilteredObservations = Table.SelectRows(Observations, each [code][text] = "t")
in
FilteredObservations
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "code:text=t"
FilteredObservations = Table.SelectRows(Observations, each Text.StartsWith([code][text], "t"))
in
FilteredObservations
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "_profile=http://myprofile"
FilteredPatients = Table.SelectRows(Patients, each List.MatchesAny([meta][profile], each _ =
"http://myprofile"))
in
FilteredPatients
let
AllergyIntolerances = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AllergyIntolerance" ]}[Data],
// Fold: "category=food"
FilteredAllergyIntolerances = Table.SelectRows(AllergyIntolerances, each List.MatchesAny([category],
each _ = "food"))
in
FilteredAllergyIntolerances
let
AllergyIntolerances = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AllergyIntolerance" ]}[Data],
// Fold: "category:missing=true"
FilteredAllergyIntolerances = Table.SelectRows(AllergyIntolerances, each List.MatchesAll([category],
each _ = null))
in
FilteredAllergyIntolerances
let
AllergyIntolerances = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AllergyIntolerance" ]}[Data],
// Fold: "category:missing=true"
FilteredAllergyIntolerances = Table.SelectRows(AllergyIntolerances, each [category] = null)
in
FilteredAllergyIntolerances
// Fold: "family:exact=Johnson"
FilteredPatients = Table.SelectRows(Patients, each Table.MatchesAnyRows([name], each [family] =
"Johnson"))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "family=John"
FilteredPatients = Table.SelectRows(Patients, each Table.MatchesAnyRows([name], each
Text.StartsWith([family], "John")))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "family=John,Paul"
FilteredPatients = Table.SelectRows(Patients, each Table.MatchesAnyRows([name], each
Text.StartsWith([family], "John") or Text.StartsWith([family], "Paul")))
in
FilteredPatients
Filtering Patients on family name starts with John and given starts with Paul :
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "family=John&given=Paul"
FilteredPatients = Table.SelectRows(
Patients,
each
Table.MatchesAnyRows([name], each Text.StartsWith([family], "John")) and
Table.MatchesAnyRows([name], each List.MatchesAny([given], each Text.StartsWith(_, "Paul"))))
in
FilteredPatients
let
Goals = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Goal" ]}[Data],
// Fold: "target-date=gt2020-03-01"
FilteredGoals = Table.SelectRows(Goals, each Table.MatchesAnyRows([target], each [due][date] >
#date(2020,3,1)))
in
FilteredGoals
Filtering Patient on identifier:
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "identifier=s|v"
FilteredPatients = Table.SelectRows(Patients, each Table.MatchesAnyRows([identifier], each [system] =
"s" and _[value] = "v"))
in
FilteredPatients
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "code=s|c"
FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([code][coding], each
[system] = "s" and [code] = "c"))
in
FilteredObservations
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "code:text=t&code=s|c"
FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([code][coding], each
[system] = "s" and [code] = "c") and [code][text] = "t")
in
FilteredObservations
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Patient" ]}
[Data],
// Fold: "family=John&given=Paul"
FilteredPatients =
Table.SelectRows(
Patients,
each
Table.MatchesAnyRows([name], each Text.StartsWith([family], "John")) and
Table.MatchesAnyRows([name], each List.MatchesAny([given], each Text.StartsWith(_,
"Paul"))))
in
FilteredPatients
// Fold: "category=vital-signs"
FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([category], each
Table.MatchesAnyRows([coding], each [code] = "vital-signs")))
in
FilteredObservations
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "category=s|c"
FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([category], each
Table.MatchesAnyRows([coding], each [system] = "s" and [code] = "c")))
in
FilteredObservations
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "category=s1|c1,s2|c2"
FilteredObservations =
Table.SelectRows(
Observations,
each
Table.MatchesAnyRows(
[category],
each
Table.MatchesAnyRows(
[coding],
each
([system] = "s1" and [code] = "c1") or
([system] = "s2" and [code] = "c2"))))
in
FilteredObservations
let
AuditEvents = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "AuditEvent"
]}[Data],
// Fold: "policy=http://mypolicy"
FilteredAuditEvents = Table.SelectRows(AuditEvents, each Table.MatchesAnyRows([agent], each
List.MatchesAny([policy], each _ = "http://mypolicy")))
in
FilteredAuditEvents
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "code-value-quantity=http://loinc.org|8302-2$gt150"
FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([code][coding], each
[system] = "http://loinc.org" and [code] = "8302-2") and [value][Quantity][value] > 150)
in
FilteredObservations
Filtering on Observation component code and value quantity, systolic blood pressure greater than 140:
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "component-code-value-quantity=http://loinc.org|8480-6$gt140"
FilteredObservations = Table.SelectRows(Observations, each Table.MatchesAnyRows([component], each
Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and [code] = "8480-6") and [value]
[Quantity][value] > 140))
in
FilteredObservations
Filtering on multiple component code value quantities (AND), diastolic blood pressure greater than 90 and
systolic blood pressure greater than 140:
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "component-code-value-quantity=http://loinc.org|8462-4$gt90&component-code-value-
quantity=http://loinc.org|8480-6$gt140"
FilteredObservations =
Table.SelectRows(
Observations,
each
Table.MatchesAnyRows(
[component],
each
Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and [code] =
"8462-4") and [value][Quantity][value] > 90) and
Table.MatchesAnyRows([component], each Table.MatchesAnyRows([code][coding], each
[system] = "http://loinc.org" and [code] = "8480-6") and [value][Quantity][value] > 140))
in
FilteredObservations
Filtering on multiple component code value quantities (OR), diastolic blood pressure greater than 90 or systolic
blood pressure greater than 140:
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "component-code-value-quantity=http://loinc.org|8462-4$gt90,http://loinc.org|8480-6$gt140"
FilteredObservations =
Table.SelectRows(
Observations,
each
Table.MatchesAnyRows(
[component],
each
(Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and [code]
= "8462-4") and [value][Quantity][value] > 90) or
Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and [code]
= "8480-6") and [value][Quantity][value] > 140 ))
in
FilteredObservations
let
Observations = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name = "Observation"
]}[Data],
// Fold: "combo-code-value-quantity=http://loinc.org|8302-2$gt150"
FilteredObservations =
Table.SelectRows(
Observations,
each
(Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and [code] = "8302-
2") and [value][Quantity][value] > 150) or
(Table.MatchesAnyRows([component], each Table.MatchesAnyRows([code][coding], each [system] =
"http://loinc.org" and [code] = "8302-2") and [value][Quantity][value] > 150)))
in
FilteredObservations
Summary
Query folding turns Power Query filtering expressions into FHIR search parameters. The Power Query connector
for FHIR recognizes certain patterns and attempts to identify matching search parameters. Recognizing those
patterns will help you write more efficient Power Query expressions.
Next steps
In this article, we reviewed some classes of filtering expressions that will fold to FHIR search parameters. Next
read about establishing relationships between FHIR resources.
FHIR Power Query relationships
FHIR Relationships
5/25/2022 • 2 minutes to read
This article describes how to establish relationships between tables that have been imported using the Power
Query connector for FHIR.
Introduction
FHIR resources are related to each other, for example, an Observation that references a subject ( Patient ):
{
  "resourceType": "Observation",
  "id": "1234",
  "subject": {
    "reference": "Patient/456"
  }
}
Some of the resource reference fields in FHIR can refer to multiple different types of resources (for example, Practitioner or Organization). To make references easier to resolve, the Power Query connector for FHIR adds a synthetic field called <referenceId> to all imported resources, which contains a concatenation of the resource type and the resource ID.
To establish a relationship between two tables, you can connect a specific reference field on a resource to the
corresponding <referenceId> field on the resource you would like it linked to. In simple cases, Power BI will
even detect this for you automatically.
5. Establish the relationship. In this simple example, Power BI will likely have detected the relationship
automatically:
If not, you can add it manually:
Next steps
In this article, you've learned how to establish relationships between tables imported with the Power Query
connector for FHIR. Next, explore query folding with the Power Query connector for FHIR.
FHIR Power Query folding
Folder
5/25/2022 • 2 minutes to read
Summary
ITEM | DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities supported
Folder path
Combine
Combine and load
Combine and transform
3. Select Combine & Transform Data to combine the data in the files of the selected folder and load the
data in the Power Query Editor for editing. Select Combine & Load to load the data from all of the files
in the folder directly into your app. Or select Transform Data to load the folder data as-is in the Power
Query Editor.
NOTE
The Combine & Transform Data and Combine & Load buttons are the easiest ways to combine data found in the
files of the folder you specify. You could also use the Load button (in Power BI Desktop only) or the Transform Data
buttons to combine the files as well, but that requires more manual steps.
3. Enter the name of an on-premises data gateway that you'll use to access the folder.
4. Select the authentication kind to connect to the folder. If you select the Windows authentication kind,
enter your credentials.
5. Select Next .
6. In the Navigator dialog box, select Combine to combine the data in the files of the selected folder and
load the data into the Power Query Editor for editing. Or select Transform data to load the folder data
as-is in the Power Query Editor.
Troubleshooting
Combining files
When you combine files using the folder connector, all the files in the folder and its subfolders are processed the
same way, and the results are then combined. The way the files are processed is determined by the example file
you select. For example, if you select an Excel file and choose a table called "Table1", then all the files will be
treated as Excel files that contain a table called "Table1".
To ensure that combining the files works properly, make sure that all the files in the folder and its subfolders
have the same file format and structure. If you need to exclude some of the files, first select Transform data
instead of Combine and filter the table of files in the Power Query Editor before combining.
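For example, the filtering step described above can be done in M before combining. This is a minimal sketch, assuming a hypothetical local folder path and that only .csv files should be combined:
let
    // Hypothetical folder path; Folder.Files lists every file in the folder and its subfolders
    Source = Folder.Files("C:\Reports"),
    // Keep only .csv files before invoking Combine on the Content column
    CsvFilesOnly = Table.SelectRows(Source, each Text.Lower([Extension]) = ".csv")
in
    CsvFilesOnly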
For more information about combining files, go to Combine files in Power Query.
Funnel (Beta)
5/25/2022 • 2 minutes to read
NOTE
The following connector article is provided by Funnel, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the Funnel website and use the support channels there.
Summary
ITEM | DESCRIPTION
Prerequisites
To use the Funnel connector, you need a Funnel subscription. Funnel helps you collect data from all your
marketing platforms, transform it, and send it to the destinations you want, like Power BI (https://funnel.io/).
In the Funnel App, go to your account, navigate to the Microsoft Power BI page in the left navigation (if you can't
see it, contact us). Follow the instructions on the page. You need to create a "View" that contains the fields you
want to expose in Power BI.
Capabilities Supported
Import
3. Sign in with your Google user connected to Funnel or use your Funnel credentials.
4. Once you've successfully signed in, select Connect to continue.
5. In the Navigator dialog box, choose one or more views from your accounts to import your data.
For each view, you can enter number of rolling months of data you want.
NOTE
The default number of months is 12. If today is 22.03.2022, then you'll get data for the period 01.04.2021
through 22.03.2022.
You can then either select Load to load the data or select Transform Data to transform the data.
Google Analytics
Summary
ITEM | DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
NOTE
Effective July 2021, Google will discontinue support for sign-ins to Google accounts from embedded browser frameworks.
Due to this change, you will need to update your Power BI Desktop version to June 2021 to support signing in to Google.
NOTE
This connector uses V4 of the Google Analytics API.
Prerequisites
Before you can sign in to Google Analytics, you must have a Google Analytics account (username/password).
Capabilities Supported
Import
4. In the Sign in with Google window that appears, provide your credentials to sign in to your Google
Analytics account. You can either supply an email address or phone number. Then select Next .
5. Enter your Google Analytics password and select Next .
6. When asked if you want Power BI Desktop to access your Google account, select Allow .
7. Once you've successfully signed in, select Connect .
Once the connection is established, you’ll see a list of the accounts you have access to. Drill through the account,
properties, and views to see a selection of values, categorized in display folders.
You can Load the selected table, which brings the entire table into Power BI Desktop, or you can select
Transform Data to edit the query, which opens Power Query Editor. You can then filter and refine the set of
data you want to use, and then load that refined set of data into Power BI Desktop.
Connect to Google Analytics data from Power Query Online
To connect to Google Analytics data:
1. Select Google Analytics from the Power Query - Choose data source page.
2. From the connection page, enter a connection name and choose an on-premises data gateway if
necessary.
NOTE
Currently, the Google Analytics sign-in dialog boxes indicate that you are signing in to Power Query Desktop. This
wording will be changed in the future.
5. Enter your Google Analytics password and select Next .
6. When asked if you want Power BI Desktop to access your Google account, select Allow .
7. Once you've successfully signed in, select Next .
Once the connection is established, you’ll see a list of the accounts you have access to. Drill through the
account, properties, and views to see a selection of values, categorized in display folders.
8. Select Transform data to edit the query in Power Query Editor. You can then filter and refine the set of
data you want to use, and then load that refined set of data into Power Apps.
Troubleshooting
Validating Unexpected Data
When date ranges are very large, Google Analytics will return only a subset of values. You can use the process
described in this section to understand what dates are being retrieved, and manually edit them. If you need
more data, you can append multiple queries with different date ranges. If you're not sure you're getting back the
data you expect to see, you can also use Data Profiling to get a quick look at what's being returned.
To make sure that the data you're seeing is the same as you would get from Google Analytics, you can execute
the query yourself in Google's interactive tool. To understand what data Power Query is retrieving, you can use
Query Diagnostics to understand what query parameters are being sent to Google Analytics.
If you follow the instructions for Query Diagnostics and run Diagnose Step on any Added Items , you can see
the generated results in the Diagnostics Data Source Query column. We recommend running this with as few
additional operations as possible on top of your initial connection to Google Analytics, to make sure you're not
losing data in a Power Query transform rather than what's being retrieved from Google Analytics.
Depending on your query, the row containing the emitted API call to Google Analytics may not be in the same
place. But for a simple Google Analytics only query, you'll generally see it as the last row that has content in that
column.
In the Data Source Query column, you'll find a record with the following pattern:
Request:
GET https://www.googleapis.com/analytics/v3/data/ga?ids=ga:<GA Id>&metrics=ga:users&dimensions=ga:source&start-date=2009-03-12&end-date=2020-08-11&start-index=1&max-results=1000&quotaUser=<User>%40gmail.com HTTP/1.1
<Content placeholder>
Response:
HTTP/1.1 200 OK
Content-Length: -1
<Content placeholder>
From this record, you can see you have your Analytics view (profile) ID, your list of metrics (in this case, just
ga:users ), your list of dimensions (in this case, just referral source), the start-date and end-date, the start-index,
max-results (set to 1000 for the editor by default), and the quotaUser.
You can copy these values into the Google Analytics Query Explorer to validate that the same data you're seeing
returned by your query is also being returned by the API.
If your error is around a date range, you can easily fix it. Go into the Advanced Editor. You'll have an M query
that looks something like this (at a minimum—there may be other transforms on top of it).
let
Source = GoogleAnalytics.Accounts(),
#"<ID>" = Source{[Id="<ID>"]}[Data],
#"UA-<ID>-1" = #"<ID>"{[Id="UA-<ID>-1"]}[Data],
#"<View ID>" = #"UA-<ID>-1"{[Id="<View ID>"]}[Data],
#"Added Items" = Cube.Transform(#"<View ID>",
{
{Cube.AddAndExpandDimensionColumn, "ga:source", {"ga:source"}, {"Source"}},
{Cube.AddMeasureColumn, "Users", "ga:users"}
})
in
#"Added Items"
You can do one of two things. If you have a Date column, you can filter on the Date. This is the easier option. If
you don't care about breaking it up by date, you can Group afterwards.
If you don't have a Date column, you can manually manipulate the query in the Advanced Editor to add one and
filter on it. For example:
let
Source = GoogleAnalytics.Accounts(),
#"<ID>" = Source{[Id="<ID>"]}[Data],
#"UA-<ID>-1" = #"<ID>"{[Id="UA-<ID>-1"]}[Data],
#"<View ID>" = #"UA-<ID>-1"{[Id="<View ID>"]}[Data],
#"Added Items" = Cube.Transform(#"<View ID>",
{
{Cube.AddAndExpandDimensionColumn, "ga:date", {"ga:date"}, {"Date"}},
{Cube.AddAndExpandDimensionColumn, "ga:source", {"ga:source"}, {"Source"}},
{Cube.AddMeasureColumn, "Organic Searches", "ga:organicSearches"}
}),
#"Filtered Rows" = Table.SelectRows(#"Added Items", each [Date] >= #date(2019, 9, 1) and [Date] <=
#date(2019, 9, 30))
in
#"Filtered Rows"
Next steps
Google Analytics Dimensions & Metrics Explorer
Google Analytics Core Reporting API
Google BigQuery
5/25/2022 • 6 minutes to read
Summary
ITEM | DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
NOTE
Effective July 2021, Google will discontinue support for sign-ins to Google accounts from embedded browser frameworks.
Due to this change, you will need to update your Power BI Desktop version to June 2021 to support signing in to Google.
Prerequisites
You'll need a Google account or a Google service account to sign in to Google BigQuery.
Capabilities supported
Import
DirectQuery (Power BI only)
3. The Google BigQuery connector supports connecting through an organizational account or a service
account sign-in. In this example, you'll use the organizational account to sign in. Select Sign In to
continue.
You can also sign in using a Google service account. In this case, select Service Account Login and enter your service account email and your service account JSON key file contents. Then select Connect.
4. A Sign in with Google dialog appears. Select your Google account and approve connecting to Power BI
Desktop.
5. Once signed in, select Connect to continue.
6. Once you successfully connect, a Navigator window appears and displays the data available on the
server. Select your data in the navigator. Then select either Transform Data to transform the data in
Power Query or Load to load the data in Power BI Desktop.
Connect to Google BigQuery data from Power Query Online
To connect to Google BigQuery from Power Query Online, take the following steps:
1. In the Get Data experience, select the Database category, and then select Google BigQuery.
2. In the Google BigQuery Database dialog, you may need to either create a new connection or select an existing connection. If you're using on-premises data, select an on-premises data gateway. Then select Sign in.
3. A Sign in with Google dialog appears. Select your Google account and approve connecting.
NOTE
Although the sign in dialog box says you'll continue to Power BI Desktop once you've signed in, you'll be sent to
your online app instead.
4. If you want to use any advanced options, select Advanced options. More information: Connect using advanced options
5. Once signed in, select Next to continue.
6. Once you successfully connect, a Navigator window appears and displays the data available on the
server. Select your data in the navigator. Then select Next to transform the data in Power Query.
The following advanced options are available for the Google BigQuery connection:

Billing Project ID: A project against which Power Query will run queries. Permissions and billing are tied to this project.

Use Storage Api: A flag that enables using the Storage API of Google BigQuery. This option is true by default. It can be set to false to not use the Storage API and use REST APIs instead.

Connection timeout duration: The standard connection setting (in seconds) that controls how long Power Query waits for a connection to complete. You can change this value if your connection doesn't complete before 15 seconds (the default value).

Command timeout duration: How long Power Query waits for a query to complete and return results. The default depends on the driver default. You can enter another value in minutes to keep the connection open longer.

Project ID: The project that you want to run native queries on. This option is only available in Power Query Desktop.
Once you've selected the advanced options you require, select OK in Power Query Desktop or Next in Power
Query Online to connect to your Google BigQuery data.
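These advanced options can also be expressed directly in M. The following is a hedged sketch only: the option record field names are assumed from the advanced option names above and may differ between connector versions, and the billing project name is a placeholder.

let
    // Connect with an explicit billing project and fall back to the REST APIs
    Source = GoogleBigQuery.Database([
        BillingProject = "my-billing-project",  // hypothetical project ID
        UseStorageApi = false                   // disable the BigQuery Storage API
    ])
in
    Source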
Users should select Transform Data and then use the JSON parsing capabilities in the Power Query Editor to
extract the data.
1. Under the Transform ribbon tab, in the Text Column category, select Parse, and then select JSON.
2. Extract the JSON record fields using the Expand Column option.
Setting up a Google service account
For more information on setting up or using Google service accounts, go to Creating and managing service
account keys in the Google docs.
Authenticating through a Google service account
When authenticating through a Google service account in Power BI Desktop, there's a specific credential format
that's required by the connector.
Service Account Email: must be in email format
Service Account JSON key file contents: once this JSON key is downloaded, all new lines must be removed
from the file so that the contents are in one line. Once the JSON file is in that format, the contents can be
pasted into this field.
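As one illustrative way to produce that single-line value, you can load the downloaded key file in Power Query and join its lines; the file path below is a hypothetical placeholder, and any text editor that can remove line breaks works equally well.

let
    // Read the downloaded JSON key file and remove all line breaks
    KeyFile = File.Contents("C:\keys\service-account-key.json"),
    SingleLine = Text.Combine(Lines.FromBinary(KeyFile), "")
in
    SingleLine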
When authenticating through a Google service account in Power BI service or Power Query Online, users need
to use "Basic" authentication. The Username field maps to the Service Account Email field above, and the Password field maps to the Service Account JSON key file contents field above. The format requirements for each credential are the same across Power BI Desktop, Power BI service, and Power Query Online.
Unable to authenticate with Google BigQuery Storage API
The Google BigQuery connector uses Google BigQuery Storage API by default. This feature is controlled by the
advanced option called UseStorageApi. You might encounter issues with this feature if you use granular
permissions. In this scenario, you might see the following error message or fail to get any data from your query:
ERROR [HY000] [Microsoft][BigQuery] (131) Unable to authenticate with Google BigQuery Storage API. Check your
account permissions
You can resolve this issue by adjusting the user permissions for the BigQuery Storage API. The following Storage API permissions are required to access data with the BigQuery Storage API:
bigquery.readsessions.create : Creates a new read session via the BigQuery Storage API.
bigquery.readsessions.getData : Reads data from a read session via the BigQuery Storage API.
bigquery.readsessions.update : Updates a read session via the BigQuery Storage API.
These permissions are typically provided in the BigQuery.User role. More information: Google BigQuery Predefined roles and permissions
If the above steps don't resolve the problem, you can disable the BigQuery Storage API.
Google Sheets (Beta)
Summary
Prerequisites
Before you can use the Google Sheets connector, you must have a Google account and have access to the
Google Sheet you're trying to connect to.
Capabilities Supported
Import
4. A Sign in with Google dialog appears in an external browser window. Select your Google account and
approve connecting to Power BI Desktop.
5. Once signed in, select Connect to continue.
6. Once you successfully connect, a Navigator window appears and displays the data available on the
server. Select your data in the navigator. Then select either Transform Data to transform the data in
Power Query or Load to load the data in Power BI Desktop.
Summary
Prerequisites
An Apache Hive LLAP username and password.
Capabilities Supported
Import
Direct Query
Thrift Transport Protocol
HTTP
Standard
Troubleshooting
You might come across the following "SSL_connect" error after entering the authentication information for the connector and selecting Connect.
3. In Edit Permissions, under Encryption, clear the Encrypt connections check box.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
By default, the IBM Db2 database connector uses the Microsoft driver to connect to your data. If you choose to
use the IBM driver in the advanced options in Power Query Desktop, you must first install the IBM Db2 driver for
.NET on the machine used to connect to the data. The name of this driver changes from time to time, so be sure
to install the IBM Db2 driver that works with .NET. For instructions on how to download, install, and configure
the IBM Db2 driver for .NET, go to Download initial Version 11.5 clients and drivers. More information: Driver
limitations, Ensure the IBM Db2 driver is installed
Capabilities Supported
Import
DirectQuery (Power BI Desktop only)
Advanced options
Driver (IBM or Microsoft)
Command timeout in minutes
Package collection
SQL statement
Include relationship columns
Navigate using full hierarchy
Connect to an IBM Db2 database from Power Query Desktop
To make the connection, take the following steps:
1. Select the IBM Db2 database option from Get Data .
2. Specify the IBM Db2 server to connect to in Server. If a port is required, specify it by using the format ServerName:Port, where Port is the port number. Also, enter the IBM Db2 database you want to access in Database. In this example, the server name and port are TestIBMDb2server.contoso.com:4000 and the IBM Db2 database being accessed is NORTHWD2.
3. If you're connecting from Power BI Desktop, select either the Import or DirectQuery data connectivity
mode. The rest of these example steps use the Import data connectivity mode. To learn more about
DirectQuery, go to Use DirectQuery in Power BI Desktop.
NOTE
By default, the IBM Db2 database dialog box uses the Microsoft driver during sign in. If you want to use the IBM
driver, open Advanced options and select IBM. More information: Connect using advanced options
If you select DirectQuery as your data connectivity mode, the SQL statement in the advanced options will be
disabled. DirectQuery currently does not support query push down on top of a native database query for the
IBM Db2 connector.
4. Select OK .
5. If this is the first time you're connecting to this IBM Db2 database, select the authentication type you want
to use, enter your credentials, and then select Connect . For more information about authentication, go to
Authentication with a data source.
By default, Power Query attempts to connect to the IBM Db2 database using an encrypted connection. If
Power Query can't connect using an encrypted connection, an "unable to connect" dialog box will appear.
To connect using an unencrypted connection, select OK .
6. In Navigator , select the data you require, then either select Load to load the data or Transform Data to
transform the data.
NOTE
You must select an on-premises data gateway for this connector, whether the IBM Db2 database is on your local
network or online.
4. If this is the first time you're connecting to this IBM Db2 database, select the type of credentials for the
connection in Authentication kind . Choose Basic if you plan to use an account that's created in the IBM
Db2 database instead of Windows authentication.
5. Enter your credentials.
6. Select Use Encrypted Connection if you want to use an encrypted connection, or clear the option if
you want to use an unencrypted connection.
The following table lists all of the advanced options you can set in Power Query.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer.

Package collection: Specifies where to look for packages. Packages are control structures used by Db2 when processing an SQL statement, and will be automatically created if necessary. By default, this option uses the value NULLID. Only available when using the Microsoft driver. More information: DB2 packages: Concepts, examples, and common problems

Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.

Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query Desktop or Next in Power
Query Online to connect to your IBM Db2 database.
Troubleshooting
Ensure the IBM Db2 driver is installed
If you choose to use the IBM Db2 driver for Power Query Desktop, you first have to download, install, and
configure the driver on your machine. To ensure the IBM Db2 driver has been installed:
1. Open Windows PowerShell on your machine.
2. Enter the following command:
[System.Data.Common.DbProviderFactories]::GetFactoryClasses() | ogv
3. In the dialog box that opens, you should see the following name in the InvariantName column:
IBM.Data.DB2
If this name is in the InvariantName column, the IBM Db2 driver has been installed and configured correctly.
SQLCODE -805 and SQLCODE -551 error codes
When attempting to connect to an IBM Db2 database, you may sometimes come across the common error
SQLCODE -805, which indicates the package isn't found in the NULLID or other collection (specified in the Power
Query Package connection configuration). You may also encounter the common error SQLCODE -551, which
indicates you can't create packages because you lack package binding authority.
Typically, SQLCODE -805 is followed by SQLCODE -551, but you'll see only the second exception. In reality, the
problem is the same. You lack the authority to bind the package to either NULLID or the specified collection.
Typically, most IBM Db2 administrators don't provide bind package authority to end users—especially in an IBM
z/OS (mainframe) or IBM i (AS/400) environment. Db2 on Linux, Unix, or Windows is different in that user accounts have bind privileges by default, which creates the MSCS001 (Cursor Stability) package in the user's own collection (name = user login name).
If you don't have bind package privileges, you'll need to ask your Db2 administrator for package binding
authority. With this package binding authority, connect to the database and fetch data, which will auto-create the
package. Afterwards, the administrator can revoke the package binding authority. Also, afterwards, the
administrator can "bind copy" the package to other collections—to increase concurrency, to better match your
internal standards for where packages are bound, and so on.
When connecting to IBM Db2 for z/OS, the Db2 administrator can do the following steps.
1. Grant authority to bind a new package to the user with one of the following commands:
GRANT BINDADD ON SYSTEM TO <authorization_name>
GRANT PACKADM ON <collection_name> TO <authorization_name>
2. Using Power Query, connect to the IBM Db2 database and retrieve a list of schemas, tables, and views.
The Power Query IBM Db2 database connector will auto-create the package NULLID.MSCS001, and then
grant execute on the package to public.
3. Revoke authority to bind a new package to the user with one of the following commands:
REVOKE BINDADD FROM <authorization_name>
REVOKE PACKADM ON <collection_name> FROM <authorization_name>
When connecting to IBM Db2 for Linux, Unix, or Windows, the Db2 administrator can do the following steps.
1. GRANT BINDADD ON DATABASE TO USER <authorization_name>.
2. Using Power Query, connect to the IBM Db2 database and retrieve a list of schemas, tables, and views.
The Power Query IBM Db2 connector will auto-create the package NULLID.MSCS001, and then grant
execute on the package to public.
3. REVOKE BINDADD ON DATABASE FROM USER <authorization_name>.
4. GRANT EXECUTE ON PACKAGE <collection.package> TO USER <authorization_name>.
When connecting to IBM Db2 for i, the Db2 administrator can do the following steps.
1. WRKOBJ QSYS/CRTSQLPKG. Type "2" to change the object authority.
2. Change authority from *EXCLUDE to PUBLIC or <authorization_name>.
3. Afterwards, change authority back to *EXCLUDE.
SQLCODE -360 error code
When attempting to connect to the IBM Db2 database, you may come across the following error:
Microsoft Db2 Client: The host resource could not be found. Check that the Initial Catalog value matches the
host resource name. SQLSTATE=HY000 SQLCODE=-360
This error message indicates that you didn’t put the right value in for the name of the database.
SQLCODE -1336 error code
The specified host could not be found.
Double check the name, and confirm that the host is reachable. For example, use ping in a command prompt to
attempt to reach the server and ensure the IP address is correct, or use telnet to communicate with the server.
SQLCODE -1037 error code
Host is reachable, but is not responding on the specified port.
The port is specified at the end of the server name, separated by a colon. If omitted, the default value of 50000 is
used.
To find the port Db2 is using for Linux, Unix, and Windows, run this command:
db2 get dbm cfg | findstr SVCENAME
Look in the output for an entry for SVCENAME (and SSL_SVCENAME for TLS encrypted connections). If this
value is a number, that’s the port. Otherwise cross reference the value with the system's "services" table. You can
usually find this at /etc/services, or at c:\windows\system32\drivers\etc\services for Windows.
The following screenshot shows the output of this command in Linux/Unix.
The following screenshot shows the output of this command in Windows.
2. One of the entries will have a Remote Location of *LOCAL . This entry is the one to use.
Determine port number
The Microsoft driver connects to the database using the Distributed Relational Database Architecture (DRDA)
protocol. The default port for DRDA is port 446. Try this value first.
To find for certain what port the DRDA service is running on:
1. Run the IBM i command WRKSRVTBLE .
2. Scroll down until you find the entries for DRDA.
3. To confirm that the DRDA service is up and listening on that port, run NETSTAT .
4. Choose either option 3 (for IPv4) or option 6 (for IPv6).
5. Press F14 to see the port numbers instead of names, and scroll until you see the port in question. It
should have an entry with a state of “Listen”.
More information
HIS - Microsoft OLE DB Provider for DB2
JSON
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities supported
Import
If you're trying to load a JSON Lines file, the following sample M code converts all JSON Lines input to a single
flattened table automatically:
let
// Read the file into a list of lines
Source = Table.FromColumns({Lines.FromBinary(File.Contents("C:\json-lines-example.json"), null, null)}),
// Transform each line using Json.Document
#"Transformed Column" = Table.TransformColumns(Source, {"Column1", Json.Document})
in
#"Transformed Column"
You'll then need to use an Expand operation to combine the lines together.
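For example, continuing the query above, the expand step might look like the following sketch; the field names name and value are hypothetical and should be replaced with the fields that actually appear in your JSON Lines file.

let
    Source = Table.FromColumns({Lines.FromBinary(File.Contents("C:\json-lines-example.json"), null, null)}),
    #"Transformed Column" = Table.TransformColumns(Source, {"Column1", Json.Document}),
    // Expand the parsed records into regular columns
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Transformed Column", "Column1", {"name", "value"})
in
    #"Expanded Column1"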
Mailchimp (Deprecated)
Summary
Products -
Deprecation
This connector is deprecated, and support for it will end soon. We recommend that you transition existing connections off this connector, and don't use this connector for new connections.
Microsoft Azure Consumption Insights (Beta)
(Deprecated)
Summary
Products -
Deprecation
NOTE
This connector is deprecated because of end of support for the Microsoft Azure Consumption Insights service. We
recommend that users transition off existing connections using this connector, and don't use this connector for new
connections.
Transition instructions
Users are instructed to use the certified Microsoft Azure Cost Management connector as a replacement. The
table and field names are similar and should offer the same functionality.
Timeline
The Microsoft Azure Consumption Insights service will stop working in December 2021. Users should transition
off the Microsoft Azure Consumption Insights connector to the Microsoft Azure Cost Management connector by
December 2021.
Microsoft Graph Security (Deprecated)
Summary
Products -
Deprecation
NOTE
This connector is deprecated. We recommend that you transition off existing connections using this connector, and don't
use this connector for new connections.
MySQL database
Summary
Authentication Types Supported: Windows (Power BI Desktop, Excel, online service with gateway), Database (Power BI Desktop, Excel), Basic (online service with gateway)
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
Users need to install the Oracle MySQL Connector/NET package prior to using this connector in Power BI
Desktop. This component must also be installed on the machine running the on-premises data gateway in order
to use this connector in Power Query Online (dataflows) or Power BI Service.
Capabilities Supported
Import
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
3. Select the Database authentication type and input your MySQL credentials in the User name and
Password boxes.
NOTE
If the connection is not encrypted, you'll be prompted with the following dialog.
Select OK to connect to the database by using an unencrypted connection, or follow the instructions to set up encrypted connections to your MySQL database.
6. In Navigator , select the data you require, then either load or transform the data.
Connect to MySQL database from Power Query Online
To make the connection, take the following steps:
1. Select the MySQL database option in the connector selection.
2. In the MySQL database dialog, provide the name of the server and database.
The following table lists all of the advanced options you can set in Power Query Desktop.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.

Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.

Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query Desktop to connect to your
MySQL database.
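The same options can be supplied in M. The following is a hedged sketch: the server and database names are placeholders, and the option record field names are assumed to correspond to the advanced options above and may vary by version.

let
    Source = MySQL.Database(
        "TestMySQLServer:3306",  // hypothetical server (port optional)
        "northwind",             // hypothetical database
        [
            CommandTimeout = #duration(0, 0, 30, 0),   // command timeout of 30 minutes
            CreateNavigationProperties = false         // omit relationship columns
        ]
    )
in
    Source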
OData Feed
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities supported
Basic
Advanced
URL parts
Open type columns
Select related tables
NOTE
Microsoft Graph is not supported. More information: Lack of Support for Microsoft Graph in Power Query
If the URL address you enter is invalid, a warning icon will appear next to the URL textbox.
3. If this is the first time you're connecting using the OData Feed, select the authentication type, input your
credentials (if necessary), and select the level to apply the authentication settings to. Then select
Connect .
4. From the Navigator dialog, you can select a table, then either transform the data in the Power Query
Editor by selecting Transform Data , or load the data by selecting Load .
If you have multiple tables that have a direct relationship to one or more of the already selected tables,
you can select the Select Related Tables button. When you do, all tables that have a direct relationship
to one or more of the already selected tables will be imported as well.
Load data from an OData Feed in Power Query Online
To load data from an OData Feed in Power Query Online:
1. Select the OData or OData Feed option in the connector selection.
2. In the OData dialog that appears, enter a URL in the text box.
3. If this is the first time you're connecting using the OData Feed, select the authentication kind and enter
your credentials (if necessary). Then select Next .
4. From the Navigator dialog, you can select a table, then transform the data in the Power Query Editor by
selecting Transform Data .
If you have multiple tables that have a direct relationship to one or more of the already selected tables,
you can select the Select Related Tables button. When you do, all tables that have a direct relationship
to one or more of the already selected tables will be imported as well.
Contact the service owner. They'll either need to change the authentication configuration or build a custom
connector.
Maximum URL length
If you're using the OData feed connector to connect to a SharePoint list, SharePoint online list, or Project Online,
the maximum URL length for these connections is approximately 2100 characters. Exceeding the character limit
results in a 401 error. This maximum URL length is built into the SharePoint front end and can't be changed.
To get around this limitation, start with the root OData endpoint and then navigate and filter inside Power
Query. Power Query filters this URL locally when the URL is too long for SharePoint to handle. For example, start
with:
OData.Feed("https://contoso.sharepoint.com/teams/sales/_api/ProjectData")
instead of
OData.Feed("https://contoso.sharepoint.com/teams/sales/_api/ProjectData/Projects?
select=_x0031_MetricName...etc...")
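As a sketch of that approach, the navigation and filtering can be done as ordinary Power Query steps; the entity name Projects and the column ProjectStartDate are hypothetical, and the navigation step your navigator generates may look slightly different.

let
    // Start from the root OData endpoint
    Source = OData.Feed("https://contoso.sharepoint.com/teams/sales/_api/ProjectData"),
    // Navigate to the entity set from the feed's navigation table
    Projects = Source{[Name = "Projects"]}[Data],
    // Filter and shape in Power Query instead of building a long URL
    #"Filtered Rows" = Table.SelectRows(Projects, each [ProjectStartDate] >= #datetime(2022, 1, 1, 0, 0, 0))
in
    #"Filtered Rows"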
ODBC
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
Before you get started, make sure you've properly configured the connection in the Windows ODBC Data
Source Administrator. The exact process here will depend on the driver.
Capabilities Supported
Import
Advanced options
Connection string (non-credential properties)
SQL statement
Supported row reduction clauses
Connection string (non-credential properties): Provides an optional connection string that can be used instead of the Data source name (DSN) selection in Power BI Desktop. If Data source name (DSN) is set to (None), you can enter a connection string here instead. For example, the following connection strings are valid: dsn=<myDSN> or driver=<myDriver>;port=<myPortNumber>;server=<myServer>;database=<myDatabase>;. The { } characters can be used to escape special characters. Keys for connection strings will vary between different ODBC drivers. Consult your ODBC driver provider for more information about valid connection strings.

Supported row reduction clauses: Enables folding support for Table.FirstN. Select Detect to find supported row reduction clauses, or select from one of the drop-down options (TOP, LIMIT and OFFSET, LIMIT, or ANSI SQL-compatible). This option is not applicable when using a native SQL statement. Only available in Power Query Desktop.
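As a hedged illustration, the same settings can be passed to Odbc.DataSource in M; the driver, server, and database names below are placeholders, and the option field names may vary by driver and version. If you need a native SQL statement instead of browsing tables, Odbc.Query can be used with the same style of connection string.

let
    Source = Odbc.DataSource(
        "driver={My ODBC Driver};server=myserver;database=mydatabase;",  // hypothetical connection string
        [
            HierarchicalNavigation = true,            // group objects by schema in the navigator
            CommandTimeout = #duration(0, 0, 10, 0)   // wait up to 10 minutes for a query
        ]
    )
in
    Source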
OpenSearch Project (Beta)
NOTE
The following connector article is provided by OpenSearch, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the OpenSearch website and use the support channels there.
Summary
Prerequisites
Microsoft Power BI Desktop
OpenSearch
OpenSearch SQL ODBC driver
Capabilities supported
Import
DirectQuery
Troubleshooting
If you get an error indicating the driver wasn't installed, install the OpenSearch SQL ODBC Driver.
If you get a connection error:
1. Check if the host and port values are correct.
2. Check if the authentication credentials are correct.
3. Check if the server is running.
Oracle database
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
Supported Oracle versions:
Oracle Server 9 and later
Oracle Data Access Client (ODAC) software 11.2 and later
Before you can connect to an Oracle database using Power Query, you need to install the Oracle client software
v8.1.7 or greater on your computer. To install the 32-bit Oracle client software, go to 32-bit Oracle Data Access
Components (ODAC) with Oracle Developer Tools for Visual Studio (12.1.0.2.4). To install the 64-bit Oracle client,
go to 64-bit ODAC 12c Release 4 (12.1.0.2.4) Xcopy for Windows x64.
NOTE
Choose a version of Oracle Data Access Client (ODAC) that's compatible with your Oracle Server. For instance, ODAC 12.x
doesn't always support Oracle Server version 9. Choose the Windows installer of the Oracle Client. During the setup of
the Oracle client, make sure you enable Configure ODP.NET and/or Oracle Providers for ASP.NET at machine-wide level by
selecting the corresponding checkbox during the setup wizard. Some versions of the Oracle client wizard select the checkbox by default, while others don't. Make sure that checkbox is selected so that Power Query can connect to your Oracle
database.
To connect to an Oracle database with the on-premises data gateway, the correct Oracle client software must be
installed on the computer running the gateway. The Oracle client software you use depends on the Oracle server
version, but will always match the 64-bit gateway. For more information, go to Manage your data source -
Oracle.
Capabilities Supported
Import
DirectQuery
Advanced options
Command timeout in minutes
SQL statement
Include relationship columns
Navigate using full hierarchy
NOTE
If you are using a local database, or autonomous database connections, you may need to place the server name
in quotation marks to avoid connection errors.
3. If you're connecting from Power BI Desktop, select either the Impor t or DirectQuer y data connectivity
mode. The rest of these example steps use the Import data connectivity mode. To learn more about
DirectQuery, go to Use DirectQuery in Power BI Desktop.
4. If this is the first time you're connecting to this Oracle database, select the authentication type you want to
use, and then enter your credentials. For more information about authentication, go to Authentication
with a data source.
5. In Navigator , select the data you require, then either select Load to load the data or Transform Data to
transform the data.
NOTE
You must select an on-premises data gateway for this connector, whether the Oracle database is on your local
network or on a web site.
4. If this is the first time you're connecting to this Oracle database, select the type of credentials for the
connection in Authentication kind . Choose Basic if you plan to use an account that's created within
Oracle instead of Windows authentication.
5. Enter your credentials.
6. Select Next to continue.
7. In Navigator , select the data you require, then select Transform data to transform the data in Power
Query Editor.
To connect to an Oracle Autonomous Database, you need the following accounts and apps:
An Oracle.com account (Get an Oracle.com Account)
An Oracle Cloud account (About Oracle Cloud Accounts)
An Oracle Autonomous Database (Oracle Autonomous Database)
Power BI Desktop (Get Power BI Desktop)
Power BI service account (Licensing the Power BI service for users in your organization)
On-premises data gateway (Download and install a standard gateway)
Download your client credentials
The first step in setting up a connection to the Oracle Autonomous database is to download your client
credentials.
To download your client credentials:
1. In your Oracle Autonomous database details page, select DB Connection .
10. In Specify Installation Location , select the location for the Oracle base folder. Select Browse to
browse to the folder you want to use, then select Select . Then select Next .
11. In Available Product Components , only select Oracle Data Provider for .NET . Then select Next .
12. The installer then performs some prerequisite checks to ensure your system meets the minimum
installation and configuration requirements. Once this check is finished, select Next .
13. The installer then presents a summary of the actions it's going to take. To continue, select Install .
14. Once the installer has finished installing all of your driver components, select Close .
Configure the unmanaged ODP.NET
1. In the command prompt, go to <install-folder>\odp.net\bin\4. In this example, the location is
c:\oracle\driver\odp.net\bin\4 .
5. If this is the first time you're signing in to this server from Power BI Desktop, you'll be asked to enter your
credentials. Select Database , then enter the user name and password for the Oracle database. The
credentials you enter here are the user name and password for the specific Oracle Autonomous Database
you want to connect to. In this example, the database's initial administrator user name and password are
used. Then select Connect .
At this point, the Navigator appears and displays the connection data.
You might also come across one of several errors because the configuration hasn't been properly set up. These
errors are discussed in Troubleshooting.
One error that might occur in this initial test takes place in Navigator , where the database appears to be
connected, but contains no data. Instead, an Oracle: ORA-28759: failure to open file error appears in place of the
data.
If this error occurs, be sure that the wallet folder path you supplied in sqlnet.ora is the full and correct path to
the wallet folder.
Configure the gateway
1. In Power BI service, select the gear icon in the upper right-hand side, then select Manage gateways .
2. In Add Data Source , select Add data sources to use the gateway .
3. In Data Source Name , enter the name you want to use as the data source setting.
4. In Data Source Type , select Oracle .
5. In Ser ver , enter the name of the Oracle Autonomous Database server.
6. In Authentication Method , select Basic .
7. Enter the user name and password for the Oracle Autonomous Database. In this example, the default
database administrator user name and password are used.
8. Select Add .
If everything has been installed and configured correctly, a Connection Successful message appears. You can
now connect to the Oracle Autonomous Database using the steps described in Connect to an Oracle database
from Power Query Online.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.

Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, these columns won't appear.

Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query Desktop to connect to your
Oracle database.
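A hedged sketch of the same options in M follows; the server name is a placeholder, and the option field names are assumed to map to the options above and may differ by connector version.

let
    Source = Oracle.Database(
        "TestOracleServer",  // hypothetical server or TNS alias
        [
            HierarchicalNavigation = true,        // navigate using full hierarchy
            CreateNavigationProperties = false,   // omit relationship columns
            CommandTimeout = #duration(0, 0, 30, 0)
        ]
    )
in
    Source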
Troubleshooting
You might come across any of several errors from Oracle when the naming syntax is either incorrect or not
configured properly:
ORA-12154: TNS: could not resolve the connect identifier specified.
ORA-12514: TNS: listener does not currently know of service requested in connect descriptor.
ORA-12541: TNS: no listener.
ORA-12170: TNS: connect timeout occurred.
ORA-12504: TNS: listener was not given the SERVICE_NAME in CONNECT_DATA.
These errors might occur if the Oracle client either isn't installed or isn't configured properly. If it's installed,
verify the tnsnames.ora file is properly configured and you're using the proper net_service_name. You also need
to make sure the net_service_name is the same between the machine that uses Power BI Desktop and the
machine that runs the gateway. More information: Prerequisites
You might also come across a compatibility issue between the Oracle server version and the Oracle Data Access
Client version. Typically, you want these versions to match, as some combinations are incompatible. For instance,
ODAC 12.x doesn't support Oracle Server version 9.
If you downloaded Power BI Desktop from the Microsoft Store, you might be unable to connect to Oracle
databases because of an Oracle driver issue. If you come across this issue, the error message returned is: Object
reference not set. To address the issue, do one of these steps:
Download Power BI Desktop from the Download Center instead of Microsoft Store.
If you want to use the version from Microsoft Store: on your local computer, copy oraons.dll from
12.X.X\client_X to 12.X.X\client_X\bin, where X represents version and directory numbers.
If the Object reference not set error message occurs in Power BI when you connect to an Oracle database using
the on-premises data gateway, follow the instructions in Manage your data source - Oracle.
If you're using Power BI Report Server, consult the guidance in the Oracle Connection Type article.
Next steps
Optimize Power Query when expanding table columns
PDF
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
NOTE
PDF is not supported in Power BI Premium.
Prerequisites
None.
Capabilities Supported
Import
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
As of the December 2019 release, NpgSQL 4.0.10 ships with Power BI Desktop and no additional installation is required. A GAC installation overrides the version provided with Power BI Desktop, which is otherwise the default.
Refreshing is supported both through the cloud in the Power BI service and on premises through the gateway. If PostgreSQL is hosted in a cloud other than Microsoft Azure (such as AWS or GCP), you must use an on-premises data gateway for scheduled refresh to work in the Power BI service. In the Power BI service, NpgSQL 4.0.10 will be used, while on-premises refresh will use the local installation of NpgSQL, if available, and otherwise NpgSQL 4.0.10.
For Power BI Desktop versions released before December 2019, you must install the NpgSQL provider on your
local machine. To install the NpgSQL provider, go to the releases page and download the relevant release. The
provider architecture (32-bit or 64-bit) needs to match the architecture of the product where you intend to use
the connector. When installing, make sure that you select NpgSQL GAC Installation to ensure NpgSQL itself is
added to your machine.
We recommend NpgSQL 4.0.10. NpgSQL 4.1 and up won't work due to .NET version
incompatibilities.
Capabilities Supported
Import
DirectQuery (Power BI only)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
For more information about using authentication methods, go to Authentication with a data source.
NOTE
If the connection is not encrypted, you'll be prompted with the following message.
Select OK to connect to the database by using an unencrypted connection, or follow the instructions in
Enable encrypted connections to the Database Engine to set up encrypted connections to PostgreSQL
database.
5. In Navigator , select the database information you want, then either select Load to load the data or
Transform Data to continue transforming the data in Power Query Editor.
Connect to a PostgreSQL database from Power Query Online
To make the connection, take the following steps:
1. Select the PostgreSQL database option in the connector selection.
2. In the PostgreSQL database dialog that appears, provide the name of the server and database.
3. Select the name of the on-premises data gateway you want to use.
4. Select the Basic authentication kind and input your PostgreSQL credentials in the Username and Password boxes.
5. If your connection isn't encrypted, clear Use Encrypted Connection.
6. Select Next to connect to the database.
7. In Navigator , select the data you require, then select Transform data to transform the data in Power
Query Editor.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.

Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.

Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query Desktop to connect to your
PostgreSQL database.
Troubleshooting
Your native query may throw the following error:
We cannot fold on top of this native query. Please modify the native query or remove the 'EnableFolding'
option.
A basic troubleshooting step is to check whether the query in Value.NativeQuery() throws the same error with a limit 1 clause around it:
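For example, a minimal sketch of that check (the server, database, and table names are hypothetical):

let
    Source = PostgreSQL.Database("myserver", "mydatabase"),
    // Wrap the original statement so only one row is requested; if this still fails
    // with the folding error, the statement itself can't be folded
    Test = Value.NativeQuery(Source, "select * from (select * from public.mytable) AS t limit 1", null, [EnableFolding = true])
in
    Test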
The Power BI QuickBooks Online connector enables connecting to your QuickBooks Online account and viewing,
analyzing, and reporting on your company QuickBooks data in Power BI.
Summary
WARNING
QuickBooks Online has deprecated support for Internet Explorer 11, which Power Query Desktop uses for authentication
to online services. To be able to sign in to QuickBooks Online from Power BI Desktop, go to Enabling Microsoft Edge
(Chromium) for OAuth Authentication in Power BI Desktop.
Prerequisites
To use the QuickBooks Online connector, you must have a QuickBooks Online account username and password.
The QuickBooks Online connector uses the QuickBooks ODBC driver. The QuickBooks ODBC driver is shipped
with Power BI Desktop and no additional installation is required.
Capabilities Supported
Import
4. In the following dialog, enter your QuickBooks credentials. You may also be required to provide a two-factor authentication (2FA) code.
5. In the following dialog, select a company and then select Next .
6. Once you've successfully signed in, select Connect .
7. In the Navigator dialog box, select the QuickBooks tables you want to load. You can then either load or
transform the data.
Known issues
Beginning on August 1, 2020, Intuit will no longer support Microsoft Internet Explorer 11 (IE 11) for QuickBooks
Online. When you use OAuth2 for authorizing QuickBooks Online, after August 1, 2020, only the following
browsers will be supported:
Microsoft Edge
Mozilla Firefox
Google Chrome
Safari 11 or newer (Mac only)
For more information, see Alert: Support for IE11 deprecating on July 31, 2020 for Authorization screens.
For information about current Microsoft Edge support in Power BI Desktop, go to Enabling Microsoft Edge
(Chromium) for OAuth Authentication in Power BI Desktop.
Salesforce Objects
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
WARNING
By default, Salesforce does not support Internet Explorer 11, which is used as part of the authentication experience to
online services in Power Query Desktop. Please opt in for extended support for accessing Lightning Experience Using
Microsoft Internet Explorer 11. You may also want to review Salesforce documentation on configuring Internet Explorer.
At this time, users will be unable to authenticate, but stored credentials should continue to work until their existing
authentication tokens expire. To resolve this, go to Enabling Microsoft Edge (Chromium) for OAuth Authentication in
Power BI Desktop.
Prerequisites
To use the Salesforce Objects connector, you must have a Salesforce account username and password.
Also, Salesforce API access should be enabled. To verify access settings, go to your personal Salesforce page,
open your profile settings, and search for and make sure the API Enabled checkbox is selected. Note that
Salesforce trial accounts don't have API access.
Capabilities Supported
Production
Custom
Custom domains
CNAME record redirects
Relationship columns
Connect to Salesforce Objects from Power Query Desktop
To connect to Salesforce Objects data:
1. Select Salesforce Objects from the product-specific data connector list, and then select Connect .
2. In Salesforce Objects , choose the Production URL if you use the Salesforce production URL (
https://www.salesforce.com ) to sign in.
You can also select Custom and enter a custom URL to sign in. This custom URL might be a custom
domain you've created within Salesforce, such as https://contoso.salesforce.com . You can also use the
custom URL selection if you're using your own CNAME record that redirects to Salesforce.
Also, you can select Include relationship columns . This selection alters the query by including
columns that might have foreign-key relationships to other tables. If this box is unchecked, you won’t see
those columns.
Once you've selected the URL, select OK to continue.
3. Select Sign in to sign in to your Salesforce account.
NOTE
Currently, you may need to select the Custom URL, enter https://www.salesforce.com in the text box, and
then select Production to connect to your data.
You can also select Custom and enter a custom URL to sign in. This custom URL might be a custom
domain you've created within Salesforce, such as https://contoso.salesforce.com . You can also use the
custom URL selection if you're using your own CNAME record that redirects to Salesforce.
Also, you can select Include relationship columns. This selection alters the query by including columns
that might have foreign-key relationships to other tables. If this box is unchecked, you won’t see those
columns.
3. If this is the first time you've made this connection, select an on-premises data gateway, if needed.
4. Select Sign in to sign in to your Salesforce account. Once you've successfully signed in, select Next .
5. In the Navigator dialog box, select the Salesforce Objects you want to load. Then select Transform Data
to transform the data.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
WARNING
By default, Salesforce does not support Internet Explorer 11, which is used as part of the authentication experience to
online services in Power Query Desktop. Please opt in for extended support for accessing Lightning Experience Using
Microsoft Internet Explorer 11. You may also want to review Salesforce documentation on configuring Internet Explorer.
At this time, users will be unable to authenticate, but stored credentials should continue to work until their existing
authentication tokens expire. To resolve this, go to Enabling Microsoft Edge (Chromium) for OAuth Authentication in
Power BI Desktop.
Prerequisites
To use the Salesforce Reports connector, you must have a Salesforce account username and password.
Also, Salesforce API access should be enabled. To verify access settings, go to your personal Salesforce page,
open your profile settings, and search for and make sure the API Enabled checkbox is selected. Note that
Salesforce trial accounts don't have API access.
Capabilities Supported
Production
Custom
Custom domains
CNAME record redirects
Connect to Salesforce Reports from Power Query Desktop
To connect to Salesforce Reports data:
1. Select Salesforce Reports from the product-specific data connector list, and then select Connect.
2. In Salesforce Reports, choose the Production URL if you use the Salesforce production URL (
https://www.salesforce.com ) to sign in.
You can also select Custom and enter a custom URL to sign in. This custom URL might be a custom
domain you've created within Salesforce, such as https://contoso.salesforce.com . You can also use the
custom URL selection if you're using your own CNAME record that redirects to Salesforce.
Once you've selected the URL, select OK to continue.
3. Select Sign in to sign in to your Salesforce account.
NOTE
Currently, you may need to select the Custom URL, enter https://www.salesforce.com in the text box, and
then select Production to connect to your data.
You can also select Custom and enter a custom URL to sign in. This custom URL might be a custom
domain you've created within Salesforce, such as https://contoso.salesforce.com . You can also use the
custom URL selection if you're using your own CNAME record that redirects to Salesforce.
Also, you can select Include relationship columns. This selection alters the query by including columns
that might have foreign-key relationships to other tables. If this box is unchecked, you won’t see those
columns.
3. If this is the first time you've made this connection, select an on-premises data gateway, if needed.
4. Select Sign in to sign in to your Salesforce account. Once you've successfully signed in, select Next .
5. In the Navigator dialog box, select the Salesforce Reports you want to load. Then select Transform
Data to transform the data.
NOTE
The SAP Business Warehouse (BW) Application Server connector is now certified for SAP BW/4HANA as of June 2020.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
You'll need an SAP account to sign in to the website and download the drivers. If you're unsure, contact the SAP
administrator in your organization. The drivers need to be installed on the gateway machine.
You can use either version 1.0 of the SAP Business Warehouse (BW) Application Server connector or the
Implementation 2.0 SAP connector in Power Query Desktop. The following sections describe the installation of
each version, in turn. You can choose one or the other connector when connecting to an SAP BW Application
Server from Power BI Desktop.
BW 7.3, BW 7.5, and BW/4HANA 2.0 are supported.
NOTE
We suggest you use the Implementation 2.0 SAP connector whenever possible because it provides significant
performance, functionality, and reliability improvements over 1.0.
NOTE
Power Query Online uses the version 2.0 SAP BW Application Server connector by default. However, version 1.0 of the
SAP BW Application Server connector still works at the M engine level if you really need to use it.
NOTE
If you want to use version 1 of the SAP BW Application Server connector, you must use the SAP NetWeaver library. For
more information about installing version 1, see Prerequisites for version 1.0. We recommend using the Implementation
2.0 SAP BW Application Server connector whenever possible.
Capabilities Supported
Import
Direct Query
Implementation
2.0 (Requires SAP .NET Connector 3.0)
1.0 (Requires NetWeaver RFC)
Advanced
Language code
Execution mode
Batch size
MDX statement
Enable characteristic structures
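For reference, a connection with several of these advanced options set might look like the following sketch in M; the server, system number, and client ID are the sample values used in this article, and the option field names follow the Implementation 2.0 connector and may vary by version.

let
    Source = SapBusinessWarehouse.Cubes(
        "SAPBWTestServer",  // server name
        "00",               // system number
        "837",              // client ID
        [
            Implementation = "2.0",
            LanguageCode = "EN",
            ExecutionMode = SapBusinessWarehouseExecutionMode.BasXmlGzip,
            BatchSize = 50000
        ]
    )
in
    Source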
5. From the Navigator dialog box, you can either transform the data in the Power Query Editor by selecting
Transform Data , or load the data by selecting Load .
Connect to an SAP BW Application Server from Power Query Online
To connect to an SAP BW Application Server from Power Query Online:
1. From the Data sources page, select SAP BW Application Server.
2. Enter the server name, system number, and client ID of the SAP BW Application Server you want to
connect to. This example uses SAPBWTestServer as the server name, a system number of 00 , and a client
ID of 837 .
3. Select the on-premises data gateway you want to use to connect to the data.
4. Set Authentication Kind to Basic . Enter your user name and password.
5. You can also select from a set of advanced options to fine-tune your query.
6. Select Next to connect.
7. From the Navigator dialog box, select the items you want to use. When you select one or more items
from the server, the Navigator dialog box creates a preview of the output table. For more information
about navigating the SAP BW Application Server query objects in Power Query, go to Navigate the query
objects.
8. From the Navigator dialog box, you can transform the data in the Power Query Editor by selecting
Transform Data .
See also
Navigate the query objects
SAP Business Warehouse fundamentals
Use advanced options
SAP Business Warehouse connector troubleshooting
SAP Business Warehouse Message Server
NOTE
The SAP Business Warehouse (BW) Message Server connector is now certified for SAP BW/4HANA as of June 2020.
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
You'll need an SAP account to sign in to the website and download the drivers. If you're unsure, contact the SAP
administrator in your organization.
You can use either version 1.0 of the SAP Business Warehouse (BW) Message Server connector or the
Implementation 2.0 SAP connector in Power Query Desktop. The following sections describe the installation of
each version, in turn. You can choose one or the other connector when connecting to an SAP BW Message
Server from Power BI Desktop. We suggest you use the Implementation 2.0 SAP connector whenever possible.
NOTE
We suggest you use the Implementation 2.0 SAP connector whenever possible because it provides significant
performance, functionality, and reliability improvements over 1.0.
NOTE
Power Query Online uses the version 2.0 SAP BW Message Server connector by default. However, version 1.0 of the SAP
BW Message Server connector still works at the M engine level if you really need to use it.
NOTE
If you want to use version 1.0 of the SAP BW Message Server connector, you must use the SAP NetWeaver library. For
more information about installing version 1.0, see Prerequisites for version 1.0. We recommend using the Implementation
2.0 SAP BW Message Server connector whenever possible.
Capabilities Supported
Import
Direct Query
Implementation
2.0 (Requires SAP .NET Connector 3.0)
1.0 (Requires NetWeaver RFC)
Advanced
Language code
Execution mode
Batch size
MDX statement
Enable characteristic structures
See also
Navigate the query objects
SAP Business Warehouse fundamentals
Use advanced options
SAP Business Warehouse connector troubleshooting
SAP BW fundamentals
This article describes basic terminology used when describing interactions between the SAP BW server and
Power Query. It also includes information about tools that you may find useful when using the Power Query
SAP BW connector.
Integration Architecture
From a technical point of view, the integration between applications and SAP BW is based on the so-called
Online Analytical Processing (OLAP) Business Application Programming Interfaces (BAPI).
The OLAP BAPIs are delivered with SAP BW and provide third parties and developers with standardized
interfaces that enable them to access the data and metadata of SAP BW with their own front-end tools.
Applications of all types can be connected with an SAP BW server using these methods.
The OLAP BAPIs are implemented in SAP BW as RFC-enabled function modules and are invoked by applications
over SAP’s RFC protocol. This requires the NetWeaver RFC Library or SAP .NET Connector to be installed on the
application's machine.
The OLAP BAPIs provide methods for browsing metadata and master data, and also for passing MDX statements
for execution to the MDX Processor.
The OLAP Processor is responsible for retrieving, processing, and formatting the data from the SAP BW source
objects, which are further described in SAP BW data source and Data objects in SAP BW.
SAP Business Explorer and other SAP tools use a more direct interface to the SAP BW OLAP Processor called
Business Intelligence Consumer Services, commonly known as BICS. BICS isn't available for third-party tools.
After you connect to your SAP BW instance, the Navigator dialog box will show a list of available catalogs in
the selected server.
You'll see one catalog folder with the name $INFOCUBE. This folder contains all InfoProviders in the SAP BW
system.
The other catalog folders represent InfoProviders in SAP BW for which at least one query exists.
The Navigator dialog box displays a hierarchical tree of data objects from the connected SAP BW system. The
following list describes the types of objects:
Key figure
Characteristic
Characteristic level
Property (Attribute)
Hierarchy
NOTE
The navigator shows InfoCubes and BEx queries. For BEx queries, you may need to go into Business Explorer, open the
desired query and check Allow External Access to this Query: By OLE DB for OLAP for the query to be available
in the navigator.
NOTE
In Power BI Desktop, objects below an InfoCube or BEx Query node, such as the key figures, characteristics, and
properties are only shown in Import connectivity mode, not in DirectQuery mode. In DirectQuery mode, all the available
objects are mapped to a Power BI model and will be available for use in any visual.
In the navigator, you can select from different display options to view the available query objects in SAP BW:
Only selected items : This option limits the objects shown in the list to just the selected items. By
default, all query objects are displayed. This option is useful for a review of the objects that you included
in your query. Another approach to viewing selected items is to select the column names in the preview
area.
Enable data previews : This value is the default. This option allows you to control whether a preview of
the data should be displayed on the right-hand side in the Navigator dialog box. Disabling data previews
reduces the amount of server interaction and response time. In Power BI Desktop, data preview is only
available in Import connectivity mode.
Technical names : SAP BW supports the notion of technical names for query objects, as opposed to the
descriptive names that are shown by default. Technical names uniquely identify an object within SAP BW.
With the option selected, the technical names will appear next to the descriptive name of the object.
Characteristic hierarchies
A characteristic will always have at least one characteristic level (Level 01), even when no hierarchy is defined on
the characteristic. The Characteristic Level 01 object contains all members for the characteristic as a flat list of
values.
Characteristics in SAP BW can have more than one hierarchy defined. For those characteristics, you can only
select one hierarchy or the Level 01 object.
For characteristics with hierarchies, the properties selected for that characteristic will be included for each
selected level of the hierarchy.
Measure properties
When you pick a measure, you have an option to select the units/currency, formatted value, and format string.
For example, it can be useful to get the formatted value for COGS so that the same formatting standard is
followed across all reports.
NOTE
Measure properties are not available in Power BI Desktop in DirectQuery mode.
Flattening of multi-dimensional data
Based on the selected objects and properties in the navigator, Power Query constructs an MDX statement that is
sent for execution to SAP BW. The MDX statement returns a flattened data set that can be further manipulated
using the Power Query Editor.
Power Query uses a newer interface that is available in SAP BW version 7.01 or higher. The interface reduces
memory consumption and the result set is not restricted by the number of cells.
The flattened data set is aggregated in SAP BW at the level of the selected characteristics and properties.
Even with these improvements, the resulting dataset can become very large and time-consuming to process.
Performance recommendation
Only include the characteristics and properties that you ultimately need. Aim for higher levels of aggregation,
that is, do you need Material-level details in your report, or is MaterialGroup-level enough? What hierarchy
levels are required in Power BI? Try to create smaller datasets, with higher levels of aggregation, or multiple
smaller datasets, that can be joined together later.
Query parameters
Queries in SAP BW can have dynamic filters defined that allow you to restrict the data set that's returned by the
query. In the BEx Query Designer, this type of dynamic filter can be defined with what's called a Characteristic
Restriction and assigning a Variable to that restriction. Variables on a query can be required or optional, and
they're available to the user in the navigator.
When you select an SAP BW query with characteristic restrictions in the Power Query navigator, you'll see the
variables displayed as parameters above the data preview area.
Using the Show selector, you can display all parameters that are defined on the query, or just the required ones.
The query in this example has several optional parameters, including one for Material Group .
You can select one or more material groups to only return purchasing information for the selected values, that
is, casings, motherboards, and processors. You can also type the values directly into the values field. For
variables with multiple entries, comma-separated values are expected, in this example it would look like
[0D_MTLGROUP].[201], [0D_MTLGROUP].[202], [0D_MTLGROUP].[208] .
The value # means unassigned; in this example, it matches any data record without an assigned material group value.
Performance recommendation
Filters based on parameter values get processed in the SAP BW data source, not in Power BI. This type of
processing can have performance advantages for larger datasets when loading or refreshing SAP BW data into
Power BI. The time it takes to load data from SAP BW into Power BI increases with the size of the dataset, for
example, the number of columns and rows in the flattened result set. To reduce the number of columns, only
select the key figures, characteristics, and properties in the navigator that you eventually want to see.
Similarly, to reduce the number of rows, use the available parameters on the query to narrow the dataset, or to
split up a larger dataset into multiple, smaller datasets that can be joined together in the Power BI Desktop data
model.
In many cases, it may also be possible to work with the author of the BEx Query in SAP BW to clone and modify
an existing query and optimize it for performance by adding additional characteristic restrictions or removing
unnecessary characteristics.
With Power Query Editor, you can apply additional data transformations and filtering steps before you bring the
dataset from SAP BW into the Power BI Desktop or Microsoft Power Platform data model.
In Power Query Editor, the Applied Steps for the query are shown in the Query Settings pane on the right. To
modify or review a step, select the gear icon next to a step.
For example, if you select the gear icon next to Added Items , you can review the selected data objects in SAP
BW, or modify the specified query parameters. This way it's possible to filter a dataset using a characteristic that
isn't included in the result set.
You can apply additional filters on the dataset by selecting the drop-down menu for one of the columns.
Another easy way to set a filter is to right-click on one of the values in the table, then select Member Filters or
Text Filters .
For example, you could filter the dataset to only include records for Calendar Year/Month FEB 2003, or apply a
text filter to only include records where Calendar Year/Month contains 2003.
Not every filter will get folded into the query against SAP BW. You can determine if a filter is folded into the
query by examining the icon in the top-left corner of the data table, directly above the number 1 of the first data
record.
If the icon is a cube, then the filter is applied in the query against the SAP BW system.
If the icon is a table, then the filter isn't part of the query and only applied to the table.
Behind the UI of Power Query Editor, code is generated based on the M formula language for data mashup
queries.
You can view the generated M code with the Advanced Editor option in the View tab.
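The code that's generated depends on the objects you selected in the navigator. The following is a simplified, hypothetical sketch of what the generated M for an SAP BW query might look like; the server, system number, client, and object names are placeholders, and the Cube.Transform steps are only indicative of the pattern, not of any specific query.

let
    // Placeholder connection details for an SAP BW Application Server, using Implementation 2.0
    Source = SapBusinessWarehouse.Cubes("sapbwserver", "00", "100", [Implementation = "2.0"]),
    // Navigate to the $INFOCUBE catalog and a hypothetical InfoProvider
    Infocubes = Source{[Name = "$INFOCUBE"]}[Data],
    PurchasingCube = Infocubes{[Id = "0D_PU_C01"]}[Data],
    // The navigator adds the selected characteristics and key figures as columns
    // of the flattened result set through a generated Cube.Transform call
    Result = Cube.Transform(PurchasingCube,
        {
            {Cube.AddAndExpandDimensionColumn, "[0D_MATERIAL]", {"[0D_MATERIAL].[LEVEL01]"}, {"Material Level 01"}},
            {Cube.AddMeasureColumn, "Order Quantity", "[Measures].[0D_QTY]"}
        })
in
    Result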
To see a description for each function or to test it, right-click on the existing SAP BW query in the Queries pane
and select Create Function . In the formula bar at the top, enter:
= <function name>
where <function name> is the name of the function you want to see described. The following example shows
the description of the Cube.Transform function.
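For example, to display the description of the Cube.Transform function, you would enter the following in the formula bar of the newly created function query:

= Cube.Transform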
See also
Power Query M formula language reference
Implementation details
Implementation details
5/25/2022 • 11 minutes to read • Edit Online
This article describes conversion information and specific features available in Implementation 2 of the Power
Query SAP Business Warehouse connector.
Implementation 2.0 supports the following execution modes:
SapBusinessWarehouseExecutionMode.BasXml
SapBusinessWarehouseExecutionMode.BasXmlGzip
SapBusinessWarehouseExecutionMode.DataStream
If an existing query uses version 1 of the connector, add the Implementation 2.0 option and remove the
ScaleMeasures option, if present, as shown in the sketch that follows. If the query doesn't already include an
options record, just add it.
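The following is a minimal sketch of such an updated query; the server, system number, and client ID are placeholders rather than values from this article.

let
    // Hypothetical connection details: replace with your own server, system number, and client ID.
    // The older ScaleMeasures option has been removed and the Implementation option added.
    Source = SapBusinessWarehouse.Cubes("sapbwserver", "00", "100", [Implementation = "2.0"])
in
    Source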
Every effort has been made to make Implementation 2.0 of the SAP BW connector compatible with version 1.
However, there may be some differences because of the different SAP BW MDX execution modes being used. To
resolve any discrepancies, try switching between execution modes.
You'll need to add the key in to access the typed date. For example, if there's a dimension attribute called
[0CALDAY], you'll need to add the key [20CALDAY] to get the typed value.
In the example above, this means that:
Calendar day.Calendar day Level 01 [0CALDAY] will be text (a caption). (Added by default when the
dimension is added.)
Calendar day.Calendar day Level 01.Key [20CALDAY] will be a date (must be manually selected).
To manually add the key in Import mode, just expand Proper ties and select the key.
The key column will be of type date, and can be used for filtering. Filtering on this column will fold to the server.
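For example, a date filter written against the key column is expected to fold to SAP BW. The following sketch uses the column name from the example above, but the previous step name is a placeholder for whatever your own query generates.

// Filter on the typed date key; this condition should fold to the SAP BW server
#"Filtered Rows" = Table.SelectRows(#"Previous Step",
    each [Calendar day.Calendar day Level 01.Key] >= #date(2003, 2, 1))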
Local calculations Local calculations defined in a BEx query will change the
numbers as displayed through tools like BEx Analyzer.
However, they aren't reflected in the numbers returned from
SAP, through the public MDX interface.
Currency formatting Any currency formatting (for example, $2,300 or 4000 AUD)
isn't reflected in Power Query.
Units of measure Units of measure (for example, 230 KG) aren't reflected in
Power Query.
Key versus text (short, medium, long) For an SAP BW characteristic like CostCenter, the navigator
will show a single item Cost Center Level 01. Selecting this
item will include the default text for Cost Center in the field
list. Also, the Key value, Short Name, Medium Name, and
Long Name values are available for selection in the
Properties node for the characteristic (if maintained in SAP
BW).
Multiple hierarchies of a characteristic In SAP, a characteristic can have multiple hierarchies. Then in
tools like BEx Analyzer, when a characteristic is included in a
query, the user can select the hierarchy to use.
Treatment of ragged hierarchies SAP BW supports ragged hierarchies, where levels can be
missed, for example:
Continent
Americas
Canada
USA
Not Assigned
Australia
Continent
Americas
Canada
USA
Not Assigned
(Blank)
Australia
Scaling factor/reverse sign In SAP, a key figure can have a scaling factor (for example,
1000) defined as a formatting option, meaning that all
displays will be scaled by that factor.
Hierarchies where levels appear/disappear dynamically Initially when connecting to SAP BW, the information on the
levels of a hierarchy will be retrieved, resulting in a set of
fields in the field list. This is cached, and if the set of levels
changes, then the set of fields doesn't change until Refresh is
invoked.
Default filter A BEx query can include Default Filters, which will be applied
automatically by SAP BEx Analyzer. These aren't exposed, and
so the equivalent usage in Power Query won't apply the
same filters by default.
FEATURE DESCRIPTION
Hidden Key figures A BEx query can control visibility of Key Figures, and those
that are hidden won't appear in SAP BEx Analyzer. This isn't
reflected through the public API, and so such hidden key
figures will still appear in the field list. However, they can
then be hidden within Power Query.
Sort Order The sort order (by Text, or by Key) for a characteristic can be
defined in SAP. This sort order isn't reflected in Power Query.
For example, months might appear as "April", "Aug", and so
on.
End user language setting The locale used to connect to SAP BW is set as part of the
connection details, and doesn't reflect the locale of the final
report consumer.
Customer Exit Variables Customer Exit variables aren't exposed by the public API, and
are therefore not supported by Power Query.
Performance Considerations
The following table provides a summary list of suggestions to improve performance for data load and refresh
from SAP BW.
Limit characteristics and properties (attribute) selection The time it takes to load data from SAP BW into Power
Query increases with the size of the dataset, that is, the
number of columns and rows in the flattened result set. To
reduce the number of columns, only select the characteristics
and properties in the navigator that you eventually want to
see in your report or dashboard.
Limit number of key figures Selecting many key figures from a BEx query/BW model can
have a significant performance impact during query
execution because of the time being spent on loading
metadata for units. Only include the key figures that you
need in Power Query.
Split up very large queries into multiple, smaller queries For very large queries against InfoCubes or BEx queries, it
may be beneficial to split up the query. For example, one
query might be getting the key figures, while another query
(or several other queries) is getting the characteristics data.
You can join the individual query results in Power Query.
Avoid Virtual Providers (MultiProviders or InfoSets) VirtualProviders are similar to structures without persistent
storage. They are useful in many scenarios, but can show
slower query performance because they represent an
additional layer on top of actual data.
Avoid use of navigation attributes in BEx query A query with a navigation attribute has to run an additional
join, compared with a query with the same object as a
characteristic in order to arrive at the values.
Use RSRT to monitor and troubleshoot slow running queries Your SAP Admin can use the Query Monitor in SAP BW
(transaction RSRT) to analyze performance issues with SAP
BW queries. Review SAP note 1591837 for more information.
Avoid Restricted Key Figures and Calculated Key Figures Both are computed during query execution and can slow
down query performance.
Consider using incremental refresh to improve performance Power BI refreshes the complete dataset with each refresh. If
you're working with a large volume of data, refreshing the full
dataset on each refresh may not be optimal. In this scenario,
you can use incremental refresh, so you're refreshing only a
subset of data. For more details, go to Incremental refresh in
Power BI.
See also
SAP Business Warehouse Application Server
SAP Business Warehouse Message Server
Import vs. DirectQuery for SAP BW
Import vs. DirectQuery for SAP BW
NOTE
This article discusses the differences between Import and DirectQuery modes in Power BI Desktop. For a description of
using Import mode in Power Query Desktop or Power Query Online, go to the following sections:
SAP BW Application Server connector:
Connect to an SAP BW Application Server from Power Query Desktop
Connect to an SAP BW Application Server from Power Query Online
SAP BW Message Server connector:
Connect to an SAP BW Message Server from Power Query Desktop
Connect to an SAP BW Message Server from Power Query Online
With Power Query, you can connect to a wide variety of data sources, including online services, databases,
different file formats, and others. If you are using Power BI Desktop, you can connect to these data sources in
two different ways: either import the data into Power BI, or connect directly to data in the source repository,
which is known as DirectQuery. When you connect to an SAP BW system, you can also choose between these
two connectivity modes. For a complete list of data sources that support DirectQuery, refer to Power BI data
sources.
The main differences between the two connectivity modes are outlined here, as well as guidelines and
limitations, as they relate to SAP BW connections. For additional information about DirectQuery mode, go to
Using DirectQuery in Power BI.
Import Connections
When you connect to a data source with Power BI Desktop, the navigator will allow you to select a set of tables
(for relational sources) or a set of source objects (for multidimensional sources).
For SAP BW connections, you can select the objects you want to include in your query from the tree displayed.
You can select an InfoProvider or BEx query for an InfoProvider, expand its key figures and dimensions, and
select specific key figures, characteristics, attributes (properties), or hierarchies to be included in your query.
The selection defines a query that will return a flattened data set consisting of columns and rows. The selected
characteristic levels, properties, and key figures will be represented in the data set as columns. The key figures
are aggregated according to the selected characteristics and their levels. A preview of the data is displayed in the
navigator. You can edit these queries in Power Query prior to loading the data, for example to apply filters, or
aggregate the data, or join different tables.
When the data defined by the queries is loaded, it will be imported into the Power BI in-memory cache.
As you start creating your visuals in Power BI Desktop, the imported data in the cache will be queried. The
querying of cached data is very fast and changes to the visuals will be reflected immediately.
However, take care when building visuals that further aggregate the data when dealing with non-additive
measures. For example, if the query imported each Sales Office and the Growth % for each one, a visual that
sums the Growth % values across all Sales Offices would perform that aggregation locally, over the cached data.
The result wouldn't be the same as requesting the overall Growth % from SAP BW, and is probably not what's
intended. To avoid such accidental aggregations, it's useful to set the
Default Summarization for such columns to Do not summarize .
If the data in the underlying source changes, it won't be reflected in your visuals. It will be necessary to do a
Refresh , which will reimport the data from the underlying source into the Power BI cache.
When you publish a report (.pbix file) to the Power BI service, a dataset is created and uploaded to the Power BI
server. The imported data in the cache is included with that dataset. While you work with a report in the Power
BI service, the uploaded data is queried, providing a fast response time and interactivity. You can set up a
scheduled refresh of the dataset, or re-import the data manually. For on-premises SAP BW data sources, it's
necessary to configure an on-premises data gateway. Information about installing and configuring the on-
premises data gateway can be found in the following documentation:
On-premises data gateway documentation
Manage gateway data source in Power BI
Data source management in Power Platform
DirectQuery Connections
The navigation experience is slightly different when connecting to an SAP BW source in DirectQuery mode. The
navigator will still display a list of available InfoProviders and BEx queries in SAP BW, however no Power BI
query is defined in the process. You'll select the source object itself, that is, the InfoProvider or BEx query, and see
the field list with the characteristics and key figures once you connect.
For SAP BW queries with variables, you can enter or select values as parameters of the query. Select the Apply
button to include the specified parameters in the query.
Instead of a data preview, the metadata of the selected InfoCube or BEx Query is displayed. Once you select the
Load button in Navigator , no data will be imported.
You can make changes to the values for the SAP BW query variables with the Edit Queries option on the Power
BI Desktop ribbon.
As you start creating your visuals in Power BI Desktop, the underlying data source in SAP BW is queried to
retrieve the required data. The time it takes to update a visual depends on the performance of the underlying
SAP BW system.
Any changes in the underlying data won't be immediately reflected in your visuals. It will still be necessary to do
a Refresh , which will rerun the queries for each visual against the underlying data source.
When you publish a report to the Power BI service, it will again result in the creation of a dataset in the Power BI
service, just as for an import connection. However, no data is included with that dataset.
While you work with a report in the Power BI service, the underlying data source is queried again to retrieve the
necessary data. For DirectQuery connections to your SAP BW and SAP HANA systems, you must have an on-
premises data gateway installed and the data source registered with the gateway.
For SAP BW queries with variables, end users can edit parameters of the query.
NOTE
For the end user to edit parameters, the dataset needs to be published to a premium workspace, in DirectQuery mode,
and single sign-on (SSO) needs to be enabled.
General Recommendations
You should import data to Power BI whenever possible. Importing data takes advantage of the high-
performance query engine of Power BI and provides a highly interactive and fully featured experience over your
data.
However, DirectQuery provides the following advantages when connecting to SAP BW:
Provides the ability to access SAP BW data using SSO, to ensure that security defined in the underlying
SAP BW source is always applied. When accessing SAP BW using SSO, the user’s data access permissions
in SAP will apply, which may produce different results for different users. Data that a user isn't authorized
to view will be trimmed by SAP BW.
Ensures that the latest data can easily be seen, even if it's changing frequently in the underlying SAP BW
source.
Ensures that complex measures can easily be handled, where the source SAP BW is always queried for
the aggregate data, with no risk of unintended and misleading aggregates over imported caches of the
data.
Avoids caches of data being extracted and published, which might violate data sovereignty or security
policies that apply.
Using DirectQuery is generally only feasible when the underlying data source can provide interactive queries for
the typical aggregate query within seconds and is able to handle the query load that will be generated.
Additionally, the list of limitations that accompany use of DirectQuery should be considered, to ensure your
goals can still be met.
If you're working with either very large datasets or encounter slow SAP BW query response time in DirectQuery
mode, Power BI provides options in the report to send fewer queries, which makes it easier to interact with the
report. To access these options in Power BI Desktop, go to File > Options and settings > Options , and select
Query reduction .
You can disable cross-highlighting throughout your entire report, which reduces the number of queries sent to
SAP BW. You can also add an Apply button to slicers and filter selections. You can make as many slicer and filter
selections as you want, but no queries will be sent to SAP BW until you select the Apply button. Your selections
will then be used to filter all your data.
These changes will apply to your report while you interact with it in Power BI Desktop, as well as when your
users consume the report in the Power BI service.
In the Power BI service, the query cache for DirectQuery connections is updated on a periodic basis by querying
the data source. By default, this update happens every hour, but it can be configured to a different interval in
dataset settings. For more information, go to Data refresh in Power BI.
Also, many of the general best practices described in Using DirectQuery in Power BI apply equally when using
DirectQuery over SAP BW. Additional details specific to SAP BW are described in Connect to SAP Business
Warehouse by using DirectQuery in Power BI.
See also
Windows authentication and single sign-on
Windows authentication and single sign-on
NOTE
The following information about Windows authentication and single sign-on applies only to Power Query Desktop. For
more information about using Windows authentication and single sign-on in Power Query Desktop, go to Overview of
single sign-on (SSO) for gateways in Power BI.
For Windows-based authentication and single sign-on functionality, your SAP BW server must be configured for
sign in using Secure Network Communications (SNC). SNC is a mechanism provided by the SAP system that
enables application-level security on data exchanged between a client, such as Power BI Desktop, and the SAP
BW server. SNC works with different external security products and offers features that the SAP system doesn't
directly provide, including single sign-on.
In addition to your SAP BW server being configured for SNC sign in, your SAP user account needs to be
configured with an SNC name (transaction SU01 in your SAP system).
For more detailed information, go to Secure Network Communication, and the chapter Single Sign-On
Configuration in this document.
Secure Login is a software solution by SAP that allows customers to benefit from the advantages of SNC
without having to set up a public-key infrastructure (PKI). Secure Login allows users to authenticate with
Windows Active Directory credentials.
Secure Login requires the installation of the Secure Login Client on your Power BI Desktop machine. The
installation package is named SAPSetupSCL.EXE and can be obtained from the SAP Service Marketplace
(requires SAP customer credentials).
For further information, go to Secure Login.
1. In the SAP Business Warehouse server dialog box, select the Windows tab.
2. Select to either use your current Windows credentials or specify alternate Windows credentials.
3. Enter the SNC Partner Name . This name is the configured SNC name in the SAP BW server’s security
token. You can retrieve the SNC name with transaction RZ11 (Profile Parameter Maintenance) in SAPGUI
and parameter name snc/identity/as .
For X.509 certificate security tokens, the format is:
p:<X.509 Distinguished Name>
Example (values are case-sensitive): p:CN=BW0, OU=BI, O=MyOrg, C=US
For Kerberos security tokens, the format is:
p:CN=<service_User_Principal_Name>
Example (values are case-sensitive): p:CN=SAPSer [email protected]
4. Select the SNC Library that your SAP BW environment has been configured for.
The SNC_LIB or SNC_LIB_64 option will check the corresponding environment variable on your
machine and use the DLL that's specified there.
The NTLM and KERBEROS options will expect the corresponding DLL to be in a folder that's been
specified in the PATH variable on your local machine. The libraries for 32-bit systems are
GSSNTLM.DLL (for NTLM) and GSSKRB5.DLL (for Kerberos). The libraries for 64-bit systems are
GX64NTLM.DLL (for NTLM) and GX64KRB5.DLL (for Kerberos).
The Custom option allows for the use of a custom developed library.
Validate the settings with your SAP Administrator.
5. Select Connect .
See also
Use advanced options
Use advanced options
When you create a connection to an SAP Business Warehouse server, you can optionally specify a language
code, execution mode, batch size, and an MDX Statement. Also, you can select whether you want to enable
characteristic structures.
NOTE
Although the images in this article illustrate the advanced options in the SAP Business Warehouse Application Server
connector, they work the same way in the SAP Business Warehouse Message Server connector.
Language code
You can optionally specify a language code when establishing a connection to the SAP BW server.
The expected value is a two-letter language code as defined in the SAP system. In Power Query Desktop, select
the Help icon (question mark) next to the Language Code field for a list of valid values.
After you set the language code, Power Query displays the descriptive names of the data objects in SAP BW in
the specified language, including the field names for the selected objects.
NOTE
Not all listed languages might be configured in your SAP BW system, and object descriptions might not be translated in
all languages.
If no language code is specified, the default locale from the Options dialog will be used and mapped to a valid
SAP language code. To view or override the current locale in Power BI Desktop, open the File > Options and
settings > Options dialog box and select Current File > Regional settings . To view or override the current
locale in Power Query Online, open the Home > Options > Project options dialog box. If you do override the
locale, your setting gets persisted in your M query and would be honored if you copy-paste your query from
Power Query Desktop to Power Query Online.
Execution mode
NOTE
Execution mode can't be changed in version 1.0 of the SAP BW connector.
The Execution mode option specifies the MDX interface used to execute queries on the server. The following
options are valid:
BasXml : Specifies the bXML flattening mode option for MDX execution in SAP Business Warehouse.
BasXmlGzip : Specifies the Gzip compressed bXML flattening mode option for MDX execution in SAP
Business Warehouse. This option is recommended for low-latency or high-volume queries, and is the
default value for the execution mode option.
DataStream : Specifies the DataStream flattening mode option for MDX execution in SAP Business
Warehouse.
Batch size
NOTE
Batch size can't be changed in version 1.0 of the SAP BW connector.
Specifies the maximum number of rows to retrieve at a time when executing an MDX statement. A small number
translates into more calls to the server when retrieving a large dataset. A large number of rows may improve
performance, but could cause memory issues on the SAP BW server. The default value is 50000 rows.
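These advanced options end up as fields in the options record passed to the connector. The following sketch assumes the option names used by the generated M code (Implementation, LanguageCode, ExecutionMode, and BatchSize) and placeholder connection details; verify the exact names against the code the Advanced Editor shows for your own query.

let
    Source = SapBusinessWarehouse.Cubes("sapbwserver", "00", "100",
        [
            Implementation = "2.0",
            // Two-letter SAP language code, for example EN or DE
            LanguageCode = "EN",
            // Gzip-compressed bXML flattening mode, the default for Implementation 2.0
            ExecutionMode = SapBusinessWarehouseExecutionMode.BasXmlGzip,
            // Maximum number of rows retrieved per call
            BatchSize = 50000
        ])
in
    Source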
MDX Statement
NOTE
The MDX statement option is not available in Power Query Online.
Instead of using the navigator to browse through and select from available data objects in SAP BW, a user who's
familiar with the MDX query language can specify an MDX statement for direct execution in SAP BW. However,
be aware that no further query folding will be applied when using a custom MDX statement.
The statement for the example used here would look as shown in the following sample, based on the technical
names of the objects and properties in SAP BW.
SELECT {[0EFUZM0P10X72MBPOYVBYIMLB].[0EFUZM0P10X72MBPOYVBYISWV]} ON COLUMNS ,
NON EMPTY CROSSJOIN(CROSSJOIN([0D_MATERIAL].[LEVEL01].MEMBERS,[0D_PUR_ORG].[LEVEL01].MEMBERS) ,
[0D_VENDOR].[LEVEL01].MEMBERS)
DIMENSION PROPERTIES
[0D_MATERIAL].[20D_MATERIAL],
[0D_MATERIAL].[50D_MATERIAL],
[0D_PUR_ORG].[20D_PUR_ORG],
[0D_PUR_ORG].[50D_PUR_ORG],
[0D_VENDOR].[20D_VENDOR],
[0D_VENDOR].[50D_VENDOR] ON ROWS FROM [0D_PU_C01/0D_PU_C01_Q0013]
The SAP BW connector will display a preview of the data that is returned by the MDX statement. You can then
either select Load to load the data (Power Query Desktop only), or select Transform Data to further
manipulate the data set in the Power Query Editor.
To validate and troubleshoot an MDX statement, SAP BW provides the MDXTEST transaction for SAP GUI for
Windows users. Further, the MDXTEST transaction can be a useful tool for analyzing server errors or
performance concerns as a result of processing that occurs within the SAP BW system.
For more detailed information on this transaction, go to MDX Test Environment.
MDXTEST can also be used to construct an MDX statement. The transaction screen includes panels on the left
that assist the user in browsing to a query object in SAP BW and generating an MDX statement.
The transaction offers different execution modes/interfaces for the MDX statement. Select Flattening (basXML) to
mimic how Power Query would execute the query in SAP BW. This interface in SAP BW creates the row set
dynamically using the selections of the MDX statement. The resulting dynamic table that's returned to Power
Query Desktop has a very compact form that reduces memory consumption.
The transaction will display the result set of the MDX statement and useful runtime metrics.
Enable characteristic structures
If selected, the connector produces only the available measures.
See also
Navigate the query objects
Transform and filter SAP BW dataset
SAP Business Warehouse connector troubleshooting
SAP Business Warehouse connector troubleshooting
This article provides troubleshooting situations (and possible solutions) for working with the SAP Business
Warehouse (BW) connector.
When an error occurs, it may be helpful to collect a trace of the query that was sent to the
SAP BW server and its response. The following procedure shows how to set up advanced traces for issues that
occur using the SAP BW connector.
1. Close Power BI Desktop if it’s running.
2. Create a new environment variable:
a. From the Windows Control Panel, select System > Advanced System Settings .
You could also open a Command Prompt and enter sysdm.cpl .
b. In System Proper ties , select the Advanced tab, and then select Environment Variables .
c. In Environment Variables , under System Variables , select New .
d. In New System Variable , under Variable name , enter PBI_EnableSapBwTracing and under
Variable value , enter true .
e. Select OK .
When this advanced tracing is activated, an additional folder called SapBw will be created in the Traces
folder. See the rest of this procedure for the location of the Traces folder.
3. Open Power BI Desktop.
4. Clear the cache before capturing.
a. In Power BI Desktop, select the File tab.
b. Select Options and settings > Options .
c. Under Global settings, choose Data Load .
d. Select Clear Cache .
5. While you're still in Options and settings , enable tracing.
a. Under Global settings, choose Diagnostics .
b. Select Enable tracing .
6. While you're still in Options and settings > Global > Diagnostics , select Open crash dump/traces
folder . Ensure the folder is clear before capturing new traces.
7. Reproduce the issue.
8. Once done, close Power BI Desktop so the logs are flushed to disk.
9. You can view the newly captured traces under the SapBw folder (the Traces folder that contains the
SapBw folder is shown by selecting Open crash dump/traces folder on the Diagnostics page in
Power BI Desktop).
10. Make sure you deactivate this advanced tracing once you’re done, by either removing the environment
variable or setting PBI_EnableSapBwTracing to false.
In the logs, you may see a message similar to the following:
Message: [Expression.Error] The key didn't match any rows in the table.
StackTrace:
at Microsoft.Mashup.Engine1.Runtime.TableValue.get_Item(Value key)
at Microsoft.Mashup.Engine1.Library.Cube.CubeParametersModule.Cube.ApplyParameterFunctionValue.GetParameterValue(CubeValue cubeValue, Value parameter)
at Microsoft.Mashup.Engine1.Library.Cube.CubeParametersModule.Cube.ApplyParameterFunctionValue.TypedInvoke(TableValue cube, Value parameter, Value arguments)
Detail: [Key = [Id = \"[!V000004]\"], Table = #table({...}, {...})]
NOTE
This environment variable is unsupported, so should only be used as outlined here.
NOTE
The following information only applies when using Implementation 1.0 of the SAP BW connector or Implementation 2.0 of
the SAP BW connector with Flattening mode (when ExecutionMode=67).
User accounts in SAP BW have default settings for how decimal or date/time values are formatted when
displayed to the user in the SAP GUI.
The default settings are maintained in the SAP system in the User Profile for an account, and the user can view
or change these settings in the SAP GUI with the menu path System > User Profile > Own Data .
Power BI Desktop queries the SAP system for the decimal notation of the connected user and uses that notation
to format decimal values in the data from SAP BW.
SAP BW returns decimal data with either a , (comma) or a . (dot) as the decimal separator. To specify which
of those SAP BW should use for the decimal separator, the driver used by Power BI Desktop makes a call to
BAPI_USER_GET_DETAIL . This call returns a structure called DEFAULTS , which has a field called DCPFM that stores
Decimal Format Notation. The field takes one of the following values:
' ' (space) = Decimal point is comma: N.NNN,NN
'X' = Decimal point is period: N,NNN.NN
'Y' = Decimal point is N NNN NNN,NN
Customers who have reported this issue found that the call to BAPI_USER_GET_DETAIL is failing for a particular
user, which results in incorrectly formatted data and an error message.
To solve this error, users must ask their SAP admin to grant the SAP BW user being used in Power BI the right to
execute BAPI_USER_GET_DETAIL . It’s also worth verifying that the user has the required DCPFM value, as described
earlier in this troubleshooting solution.
Connectivity for SAP BEx queries
You can perform BEx queries in Power BI Desktop by enabling the Allow External Access to this Query: By OLE
DB for OLAP property for the query in the BEx Query Designer.
MDX interface limitation
A limitation of the MDX interface is that long variables lose their technical name and are replaced by V00000#.
No data preview in Navigator window
In some cases, the Navigator dialog box doesn't display a data preview and instead provides an object
reference not set to an instance of an object error message.
SAP users need access to specific BAPI function modules to get metadata and retrieve data from SAP BW's
InfoProviders. These modules include:
BAPI_MDPROVIDER_GET_CATALOGS
BAPI_MDPROVIDER_GET_CUBES
BAPI_MDPROVIDER_GET_DIMENSIONS
BAPI_MDPROVIDER_GET_HIERARCHYS
BAPI_MDPROVIDER_GET_LEVELS
BAPI_MDPROVIDER_GET_MEASURES
BAPI_MDPROVIDER_GET_MEMBERS
BAPI_MDPROVIDER_GET_VARIABLES
BAPI_IOBJ_GETDETAIL
To solve this issue, verify that the user has access to the various MDPROVIDER modules and
BAPI_IOBJ_GETDETAIL . To further troubleshoot this or similar issues, you can enable tracing. Select File >
Options and settings > Options . In Options , select Diagnostics , then select Enable tracing . Attempt to
retrieve data from SAP BW while tracing is active, and examine the trace file for more detail.
Memory Exceptions
In some cases, you might encounter one of the following memory errors:
Message: No more memory available to add rows to an internal table.
Message: [DataSource.Error] SAP Business Warehouse: The memory request for [number] bytes could not be
complied with.
Message: The memory request for [number] bytes could not be complied with.
These memory exceptions are from the SAP BW server and are due to the server running out of available
memory to process the query. This might happen when the query returns a large set of results or when the
query is too complex for the server to handle, for example, when a query has many crossjoins.
To resolve this error, the recommendation is to simplify the query or divide it into smaller queries. If possible,
push more aggregation to the server. Alternatively, contact your SAP Basis team to increase the resources
available in the server.
Loading text strings longer than 60 characters in Power BI Desktop fails
In some cases you may find that text strings are being truncated to 60 characters in Power BI Desktop.
First, follow the instructions in 2777473 - MDX: FAQ for Power BI accessing BW or BW/4HANA and see if that
resolves your issue.
Because the Power Query SAP Business Warehouse connector uses the MDX interface provided by SAP for 3rd
party access, you'll need to contact SAP for possible solutions as they own the layer between the MDX interface
and the SAP BW server. Ask how "long text is XL" can be specified for your specific scenario.
SAP HANA database
Summary
ITEM DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
You'll need an SAP account to sign in to the website and download the drivers. If you're unsure, contact the SAP
administrator in your organization.
To use SAP HANA in Power BI Desktop or Excel, you must have the SAP HANA ODBC driver installed on the local
client computer for the SAP HANA data connection to work properly. You can download the SAP HANA Client
tools from SAP Development Tools, which contains the necessary ODBC driver. Or you can get it from the SAP
Software Download Center. In the Software portal, search for the SAP HANA CLIENT for Windows computers.
Since the SAP Software Download Center changes its structure frequently, more specific guidance for navigating
that site isn't available. For instructions about installing the SAP HANA ODBC driver, see Installing SAP HANA
ODBC Driver on Windows 64 Bits.
To use SAP HANA in Excel, you must have either the 32-bit or 64-bit SAP HANA ODBC driver (depending on
whether you're using the 32-bit or 64-bit version of Excel) installed on the local client computer.
This feature is only available in Excel for Windows if you have Office 2019 or a Microsoft 365 subscription. If
you're a Microsoft 365 subscriber, make sure you have the latest version of Office.
HANA 1.0 SPS 12 rev 122.09, 2.0 SPS 3 rev 30, and BW/4HANA 2.0 are supported.
Capabilities Supported
Import
Direct Query
Advanced
SQL Statement
By default, the port number is set to support a single container database. If your SAP HANA database can
contain more than one multitenant database container, select Multi-container system database
(30013) . If you want to connect to a tenant database or a database with a non-default instance number,
select Custom from the Port drop-down menu.
If you're connecting to an SAP HANA database from Power BI Desktop, you're also given the option of
selecting either Import or DirectQuery . The example in this article uses Import , which is the default
(and the only mode for Excel). For more information about connecting to the database using DirectQuery
in Power BI Desktop, see Connect to SAP HANA data sources by using DirectQuery in Power BI.
If you select Advanced options , you can also enter an SQL statement, as illustrated in the sketch after
these steps. For more information on using this SQL statement, see Import data from a database using
native database query.
Once you've entered all of your options, select OK .
3. If you are accessing a database for the first time, you'll be asked to enter your credentials for
authentication. In this example, the SAP HANA server requires database user credentials, so select
Database and enter your user name and password. If necessary, enter your server certificate
information.
Also, you may need to validate the server certificate. For more information about using validate server
certificate selections, see Using SAP HANA encryption. In Power BI Desktop and Excel, the validate server
certificate selection is enabled by default. If you've already set up these selections in ODBC Data Source
Administrator, clear the Validate server certificate check box. To learn more about using ODBC Data
Source Administrator to set up these selections, see Configure SSL for ODBC client access to SAP HANA.
For more information about authentication, see Authentication with a data source.
Once you've filled in all required information, select Connect .
4. From the Navigator dialog box, you can either transform the data in the Power Query editor by selecting
Transform Data , or load the data by selecting Load .
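If you enter an SQL statement under Advanced options, the connector passes it to SAP HANA as a native query. The following sketch assumes a Query option in the SapHana.Database call, which matches the pattern of generated M for native queries, and uses placeholder server, schema, and view names; verify the option name against the code generated in your own environment.

let
    // Placeholder host and port (30015 is a common single-container default for instance 00)
    Source = SapHana.Database("sapserver:30015",
        [Query = "SELECT ""PRODUCT"", SUM(""SALES"") AS ""TOTAL_SALES"" FROM ""MY_SCHEMA"".""SALES_VIEW"" GROUP BY ""PRODUCT"""])
in
    Source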
NOTE
You must use an on-premises data gateway with this connector, whether your data is local or online.
4. Choose the authentication kind you want to use to access your data. You'll also need to enter a username
and password.
NOTE
Currently, Power Query Online does not support Windows authentication. Windows authentication support is
planned to become available in a few months.
Next steps
Enable encryption for SAP HANA
The following articles contain more information that you may find useful when connecting to an SAP HANA
database.
Manage your data source - SAP HANA
Use Kerberos for single sign-on (SSO) to SAP HANA
Enable encryption for SAP HANA
We recommend that you encrypt connections to an SAP HANA server from Power Query Desktop and Power
Query Online. You can enable HANA encryption using SAP's proprietary CommonCryptoLib (formerly known as
sapcrypto) library. SAP recommends using CommonCryptoLib.
NOTE
SAP no longer supports OpenSSL, and as a result, Microsoft has also discontinued its support. Use
CommonCryptoLib instead.
This article provides an overview of enabling encryption using CommonCryptoLib, and references some specific
areas of the SAP documentation. We update content and links periodically, but for comprehensive instructions
and support, always refer to the official SAP documentation. Use CommonCryptoLib to set up encryption
instead of OpenSSL; for steps to do so, go to How to Configure TLS/SSL in SAP HANA 2.0. For steps on how to
migrate from OpenSSL to CommonCryptoLib, go to SAP Note 2093286 (s-user required).
NOTE
The setup steps for encryption detailed in this article overlap with the setup and configuration steps for SAML SSO. Use
CommonCryptoLib as your HANA server's encryption provider, and make sure that your choice of CommonCryptoLib is
consistent across SAML and encryption configurations.
There are four phases to enabling encryption for SAP HANA. We cover these phases next. More information:
Securing the Communication between SAP HANA Studio and SAP HANA Server through SSL
Use CommonCryptoLib
Ensure your HANA server is configured to use CommonCryptoLib as its cryptographic provider.
You must first convert cert.pem into a .crt file before you can import the certificate into the Trusted Root
Certification Authorities folder.
Before you can validate a server certificate in the Power BI service online, you must have a data source already
set up for the on-premises data gateway. If you don't already have a data source set up to test the connection,
you'll have to create one. To set up the data source on the gateway:
1. From the Power BI service, select the setup icon.
2. From the drop-down list, select Manage gateways .
3. Select the ellipsis (...) next to the name of the gateway you want to use with this connector.
4. From the drop-down list, select Add data source .
5. In Data Source Settings , enter the data source name you want to call this new source in the Data
Source Name text box.
6. In Data Source Type , select SAP HANA .
7. Enter the server name in Server , and select the authentication method.
8. Continue following the instructions in the next procedure.
Test the connection in Power BI Desktop or the Power BI service.
1. In Power BI Desktop or in the Data Source Settings page of the Power BI service, ensure that Validate
server certificate is enabled before attempting to establish a connection to your SAP HANA server. For
SSL crypto provider , select commoncrypto. Leave the SSL key store and SSL trust store fields blank.
Power BI Desktop
Power BI service
2. Verify that you can successfully establish an encrypted connection to the server with the Validate
server certificate option enabled, by loading data in Power BI Desktop or refreshing a published report
in Power BI service.
You'll note that only the SSL crypto provider information is required. However, your implementation might
require that you also use the key store and trust store. For more information about these stores and how to
create them, go to Client-Side TLS/SSL Connection Properties (ODBC).
Additional information
Server-Side TLS/SSL Configuration Properties for External Communication (JDBC/ODBC)
Next steps
Configure SSL for ODBC client access to SAP HANA
Configure SSL for ODBC client access to SAP
HANA
If you're connecting to an SAP HANA database from Power Query Online, you may need to set up various
property values to connect. These properties could be the SSL crypto provider, an SSL key store, and an SSL
trust store. You may also require that the connection be encrypted. In this case, you can use the ODBC Data
Source Administrator application supplied with Windows to set up these properties.
In Power BI Desktop and Excel, you can set up these properties when you first sign in using the Power Query
SAP HANA database connector. The Validate server certificate selection in the authentication dialog box is
enabled by default. You can then enter values in the SSL crypto provider , SSL key store , and SSL trust
store properties in this dialog box. However, all of the validate server certificate selections in the authentication
dialog box in Power BI Desktop and Excel are optional. They're optional in case you want to use ODBC Data
Source Administrator to set them up at the driver level.
NOTE
You must have the proper SAP HANA ODBC driver (32-bit or 64-bit) installed before you can set these properties in
ODBC Data Source Administrator.
If you're going to use ODBC Data Source Administrator to set up the SSL crypto provider, SSL key store, and SSL
trust store in Power BI or Excel, clear the Validate server certificate check box when presented with the
authentication dialog box.
To use ODBC Data Source Administrator to set up the validate server certificate selections:
1. From the Windows Start menu, select Windows Administrative Tools > ODBC Data Sources . If
you're using a 32-bit version of Power BI Desktop or Excel, open ODBC Data Sources (32-bit), otherwise
open ODBC Data Sources (64-bit).
2. In the User DSN tab, select Add .
3. In the Create New Data Source dialog box, select the HDBODBC driver, and then select Finish .
4. In the ODBC Configuration for SAP HANA dialog box, enter a Data source name . Then enter your
server and database information, and select Validate the TLS/SSL certificate .
5. Select the Advanced button.
6. In the Advanced ODBC Connection Property Setup dialog box, select the Add button.
7. In the Add/Modify Connection Property dialog box, enter sslCryptoProvider in the Property text
box.
8. In the Value text box, enter the name of the crypto provider you'll be using: either sapcrypto ,
commoncrypto , openssl , or mscrypto .
9. Select OK .
10. You can also add the optional sslKeyStore and sslTrustStore properties and values if necessary. If the
connection must be encrypted, add ENCRYPT as the property and TRUE as the value.
11. In the Advanced ODBC Connection Property Setup dialog box, select OK .
12. To test the connection you’ve set up, select Test connection in the ODBC Configuration for SAP
HANA dialog box.
13. When the test connection has completed successfully, select OK .
For more information about the SAP HANA connection properties, see Server-Side TLS/SSL Configuration
Properties for External Communication (JDBC/ODBC).
NOTE
If you select Validate server certificate in the SAP HANA authentication dialog box in Power BI Desktop or Excel, any
values you enter in SSL crypto provider , SSL key store , and SSL trust store in the authentication dialog box will
override any selections you've set up using ODBC Data Source Administrator.
Next steps
SAP HANA database connector troubleshooting
Troubleshooting
The following section describes some issues that may occur while using the Power Query SAP HANA connector,
along with some possible solutions.
If you’re on a 64-bit machine, but Excel or Power BI Desktop is 32-bit, you can check
for the driver in the WOW6432Node registry key instead:
HKEY_LOCAL_MACHINE\Software\WOW6432Node\ODBC\ODBCINST.INI\ODBC Drivers
Note that the driver needs to match the bit version of your Excel or Power BI Desktop. If you’re using:
32-bit Excel/Power BI Desktop, you'll need the 32-bit ODBC driver (HDBODBC32).
64-bit Excel/Power BI Desktop, you'll need the 64-bit ODBC driver (HDBODBC).
The driver is usually installed by running hdbsetup.exe.
Finally, the driver should also show up as "ODBC DataSources 32-bit" or "ODBC DataSources 64-bit".
Unfortunately, this is an SAP issue so you'll need to wait for a fix from SAP.
SharePoint folder
Summary
ITEM DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
NOTE
AAD/OAuth for SharePoint on-premises isn’t supported using the on-premises data gateway.
Capabilities supported
Folder path
Combine
Combine and load
Combine and transform
Select OK to continue.
3. If this is the first time you've visited this site address, select the appropriate authentication method. Enter
your credentials and choose which level to apply these settings to. Then select Connect .
For more information about authentication methods and level settings, go to Authentication with a data
source.
4. When you select the SharePoint folder you want to use, the file information about all of the files in that
SharePoint folder is displayed. In addition, file information about any files in any subfolders is also
displayed.
5. Select Combine & Transform Data to combine the data in the files of the selected SharePoint folder
and load the data into the Power Query Editor for editing. Or select Combine & Load to load the data
from all of the files in the SharePoint folder directly into your app.
NOTE
The Combine & Transform Data and Combine & Load buttons are the easiest ways to combine data found in the
files of the SharePoint folder you specify. You could also use the Load button or the Transform Data buttons to
combine the files as well, but that requires more manual steps.
7. Select Combine to combine the data in the files of the selected SharePoint folder and load the data into
the Power Query Editor for editing.
NOTE
The Combine button is the easiest way to combine data found in the files of the SharePoint folder you specify.
You could also use the Transform Data buttons to combine the files as well, but that requires more manual
steps.
Troubleshooting
Combining files
All of the files in the SharePoint folder you select will be included in the data to be combined. If you have data
files located in a subfolder of the SharePoint folder you select, all of these files will also be included. To ensure
that combining the file data works properly, make sure that all of the files in the folder and the subfolders have
the same schema.
In some cases, you might have multiple folders on your SharePoint site containing different types of data. In this
case, you'll need to delete the unnecessary files. To delete these files:
1. In the list of files from the SharePoint folder you chose, select Transform Data .
2. In the Power Query editor, scroll down to find the files you want to keep.
3. In the example shown in the screenshot above, the required files are the last rows in the table. Select
Remove Rows , enter the value of the last row before the files to keep (in this case 903), and select OK .
4. Once you've removed all the unnecessary files, select Combine Files from the Home ribbon to combine
the data from all of the remaining files.
For more information about combining files, go to Combine files in Power Query.
Filename special characters
If a filename contains certain special characters, it may lead to authentication errors because of the filename
being truncated in the URL. If you're getting unusual authentication errors, make sure all of the filenames you're
using don't contain any of the following special characters.
# % $
If these characters are present in the filename, the file owner must rename the file so that it does NOT contain
any of these characters.
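If renaming isn't immediately possible, you can at least identify the affected files before combining. The following sketch assumes Source is the file list returned by the SharePoint folder connector; the step is illustrative only.

// Flag files whose names contain the problematic characters listed above
ProblemFiles = Table.SelectRows(Source,
    each Text.Contains([Name], "#") or Text.Contains([Name], "%") or Text.Contains([Name], "$"))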
SharePoint list
Summary
ITEM DESCRIPTION
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
NOTE
AAD/OAuth for SharePoint on-premises isn’t supported using the on-premises data gateway.
Capabilities supported
Site URL
If the URL address you enter is invalid, a warning icon will appear next to the Site URL textbox.
Select OK to continue.
3. If this is the first time you've visited this site address, select the appropriate authentication method. Enter
your credentials and choose which level to apply these settings to. Then select Connect .
For more information about authentication methods and level settings, go to Authentication with a data
source.
4. From the Navigator , you can select a location, then either transform the data in the Power Query editor
by selecting Transform Data , or load the data by selecting Load .
Troubleshooting
Use root SharePoint address
Make sure you supply the root address of the SharePoint site, without any subfolders or documents. For
example, use a link similar to the following: https://contoso.sharepoint.com/teams/ObjectModel/
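In M terms, the root address is what gets passed to the connector function. The following sketch uses the example address above; the ApiVersion option reflects what generated code typically includes and should be verified in your own query.

let
    // Root site URL only, without subfolders, pages, or documents
    Source = SharePoint.Tables("https://contoso.sharepoint.com/teams/ObjectModel/", [ApiVersion = 15])
in
    Source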
Inconsistent behavior around boolean data
When using the SharePoint list connector, Boolean values are represented inconsistently as TRUE/FALSE or 1/0
in Power BI Desktop and Power BI service environments. This may result in wrong data, incorrect filters, and
empty visuals.
This issue only happens when the Data Type is not explicitly set for a column in the Query View of Power BI
Desktop. You can tell that the data type isn't set when you see the "ABC 123" icon on the column and the "Any"
data type in the ribbon.
The user can force the interpretation to be consistent by explicitly setting the data type for the column through
the Power Query Editor. For example, the following image shows the column with an explicit Boolean type.
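For example, a step similar to the following explicitly sets the column to a true/false type. This is only a sketch; the Completed column name and the Source step are placeholders for your own query.

Table.TransformColumnTypes(Source, {{"Completed", type logical}})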
Next steps
Optimize Power Query when expanding table columns
SharePoint Online list
5/25/2022 • 4 minutes to read • Edit Online
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities supported
Site URL
If the URL address you enter is invalid, a warning icon will appear next to the Site URL textbox.
You can also select either the 1.0 implementation of this connector or the beta 2.0 implementation. More
information: Connect to SharePoint Online list v2.0 (Beta)
Select OK to continue.
3. If this is the first time you've visited this site address, select the appropriate authentication method. Enter
your credentials and choose which level to apply these settings to. Then select Connect .
For more information about authentication methods and level settings, go to Authentication with a data
source.
4. From the Navigator , you can select a location, then either transform the data in the Power Query editor
by selecting Transform Data , or load the data by selecting Load .
The first operation changes the type to datetimezone , and the second operation converts it to the computer's
local time.
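The two operations mentioned above aren't shown in this excerpt. A sketch of what they might look like, assuming a Modified column and a previous step named Source:

let
    // First operation: change the column type to datetimezone
    ChangedType = Table.TransformColumnTypes(Source, {{"Modified", type datetimezone}}),
    // Second operation: convert the value to the computer's local time
    LocalTime = Table.TransformColumns(ChangedType, {{"Modified", DateTimeZone.ToLocal, type datetimezone}})
in
    LocalTime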
SharePoint join limit
This issue is limited to the SharePoint Online list v2.0 connector
The SharePoint Online list v2.0 connector uses a different API than the v1.0 connector and, as such, is subject to
a maximum of 12 join operations per query, as documented in the SharePoint Online documentation under List
view lookup threshold . This issue will manifest as SharePoint queries failing when more than 12 columns are
accessed simultaneously from a SharePoint list. However, you can work around this situation by creating a default view with fewer than 12 lookup columns.
Using OData to access a SharePoint Online list
If you use an OData feed to access a SharePoint Online list, there's an approximately 2100 character limitation to
the URL you use to connect. More information: Maximum URL length
SIS-CC SDMX
5/25/2022 • 2 minutes to read • Edit Online
NOTE
The following connector article is provided by the Statistical Information System Collaboration Community (SIS-CC), the
owner of this connector and a member of the Microsoft Power Query Connector Certification Program. If you have
questions regarding the content of this article or have changes you would like to see made to this article, visit the SIS-CC
website and use the support channels there.
Summary
Prerequisites
Before you get started, make sure you've properly configured the URL from the service provider’s API. The exact
process here will depend on the service provider.
Capabilities supported
Import of SDMX-CSV 2.1 format. Other formats aren't supported.
Connection instructions
To connect to SDMX Web Service data:
1. Select Get Data from the Home ribbon in Power BI Desktop. Select All from the categories on the left,
and then select SIS-CC SDMX . Then select Connect .
2. Fill in the parameters:
a. In the Data query URL box, enter an SDMX REST data query URL (the web service must support the SDMX-CSV format).
b. In Display format , select one of the options:
Show codes and labels; example: FREQ: Frequency
Show codes; example: FREQ
Show labels; example: Frequency
Optionally, enter a language preference in Label language preference using an IETF BCP 47
tag
3. If this is the first time you're connecting to the REST web service entered in the previous step's Data query URL, this authentication step is displayed. Because the connection is Anonymous, select Connect .
4. Select Load to import the data into Power BI, or Transform Data to edit the query in Power Query
Editor where you can refine the query before loading into Power BI.
Next steps
If you want to submit a feature request or contribute to the open-source project, then go to the Gitlab project
site.
Snowflake
5/25/2022 • 3 minutes to read • Edit Online
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities Supported
Import
DirectQuery (Power BI only)
Advanced options
Specify a text value to use as Role name
Relationship columns
Connection timeout in seconds
Command timeout in seconds
Database
Native SQL statement
3. Optionally, enter values in any advanced options that you want to use to modify the connection query,
such as a text value to use as a Role name or a command timeout. More information: Connect using
advanced options
4. Select OK .
5. To sign in to your Snowflake computing warehouse, enter your username and password, and then select
Connect .
NOTE
Once you enter your username and password for a particular Snowflake server, Power BI Desktop uses those
same credentials in subsequent connection attempts. You can modify those credentials by going to File >
Options and settings > Data source settings . More information: Change the authentication method
If you want to use the Microsoft account option, the Snowflake Azure Active Directory (Azure AD)
integration must be configured on the Snowflake side. More information: Power BI SSO to Snowflake -
Getting Started
6. In Navigator , select one or multiple elements to import and use in Power BI Desktop. Then select either
Load to load the table in Power BI Desktop, or Transform Data to open the Power Query Editor where
you can filter and refine the set of data you want to use, and then load that refined set of data into Power
BI Desktop.
7. Select Import to import data directly into Power BI, or select DirectQuery , then select OK . More
information: Use DirectQuery in Power BI Desktop
NOTE
Azure Active Directory (Azure AD) Single Sign-On (SSO) only supports DirectQuery.
Connect to a Snowflake database from Power Query Online
To make the connection, take the following steps:
1. Select the Snowflake option in the connector selection.
2. In the Snowflake dialog that appears, enter the name of the server and warehouse.
3. Enter any values in the advanced options you want to use. If there are any advanced options not
represented in the UI, you can edit them in the Advanced Editor in Power Query later.
4. Enter your connection credentials, including selecting or creating a new connection, which gateway you
would like to use, and a username and password. Only the Basic authentication kind is supported in
Power Query Online.
5. Select Next to connect to the database.
6. In Navigator , select the data you require, then select Transform data to transform the data in Power
Query Editor.
Role name: Specifies the role that the report uses via the driver. This role must be available to the user; otherwise, no role is set.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.
Connection timeout in seconds: Specifies how long to wait for a response when interacting with the Snowflake service before returning an error. Default is 0 (no timeout).
Command timeout in seconds: Specifies how long to wait for a query to complete before returning an error. Default is 0 (no timeout).
Once you've selected the advanced options you require, select OK in Power Query Desktop or Next in Power
Query Online to connect to your Snowflake database.
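As a rough illustration, these advanced options map to fields on the options record passed to Snowflake.Databases. The following is only a sketch; the server, warehouse, and role values are illustrative, and you should verify the exact option field names against your version of the connector.

let
    Source = Snowflake.Databases(
        "contoso.snowflakecomputing.com",       // server (illustrative)
        "COMPUTE_WH",                           // warehouse (illustrative)
        [
            Role = "ANALYST",                       // text value to use as the role name
            CreateNavigationProperties = false,     // clears Include relationship columns
            CommandTimeout = #duration(0, 0, 10, 0) // command timeout of 10 minutes
        ]
    )
in
    Source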
Additional information
Connect to Snowflake in Power BI Service
SoftOne BI (Beta)
5/25/2022 • 2 minutes to read • Edit Online
NOTE
The following connector article is provided by SoftOne, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the SoftOne website and use the support channels there.
Summary
Prerequisites
You'll need to have the Soft1 ERP/CRM or Atlantis ERP product installed with a licensed SoftOne BI connector
module. A web account must be configured in the application with access to the SoftOne BI Connector service.
This account information and your installation serial number will be validated during authentication by the
SoftOne BI connector.
The SoftOne BI connector is supported from Soft1 Series 5 version 500.521.11424 or later and Atlantis ERP
version 3.3.2697.1 or later.
Capabilities supported
Import
Connection instructions
SoftOne provides many templates as Power BI template files (.pbit) that you can use or customize, such as Sales & Collections and Finance, to give you a head start on your BI project.
To connect in Power BI Desktop using a new report, follow the steps below. If you're connecting from a report
created using one of the SoftOne BI templates, see Using a provided template later in this article.
Connect to your Soft1 or Atlantis data store from scratch
To load data from your installation with Power Query Desktop:
1. Select Get Data > More... > Online Ser vices in Power BI Desktop and search for SoftOne BI . Select
Connect .
2. Select Sign in . An authentication form will display.
NOTE
If you enter incorrect credentials, you'll receive a message stating that your sign in failed due to invalid
credentials.
If the SoftOne BI Connector is not activated, or the Web Account that you're using is not configured with the
service, you'll receive a message stating that access is denied because the selected module is not activated.
3. After signing in with SoftOne Web Services, you can connect to your data store.
Selecting Connect will take you to the navigation table and display the available tables from the data
store from which you may select the data required.
4. In the navigator, you should now see the tables in your data store. Fetching the tables can take some time.
You must have uploaded the data from your Soft1 or Atlantis installation (per the product
documentation) to see any tables. If you haven't uploaded your data, you won't see any tables displayed
in the Navigation Table.
In this case, you'll need to go back to your application and upload your data.
Using a provided template
1. Open the selected template. Power BI Desktop will attempt to load the data from the data store and will prompt for credentials.
2. Select Sign in and enter your credentials (Serial number, username, and password).
IMPORTANT
If you're working with more than one Soft1/Atlantis installation, then when switching between data stores, you must clear
the SoftOne BI credentials saved by Power BI Desktop.
SQL Server
5/25/2022 • 3 minutes to read • Edit Online
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
By default, Power BI installs an OLE DB driver for SQL Server. However, for optimal performance, we recommend that you install the SQL Server Native Client before using the SQL Server connector. SQL Server Native Client 11.0 and SQL Server Native Client 10.0 are both supported in the latest version.
Capabilities Supported
Import
DirectQuery (Power BI Desktop)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
SQL Server failover support
3. Select either the Import or DirectQuery data connectivity mode (Power BI Desktop only).
4. Select OK .
5. If this is the first time you're connecting to this database, select the authentication type, input your
credentials, and select the level to apply the authentication settings to. Then select Connect .
NOTE
If the connection is not encrypted, you'll be prompted with the following dialog.
Select OK to connect to the database by using an unencrypted connection, or follow these instructions to set up encrypted connections to SQL Server.
6. In Navigator , select the database information you want, then either select Load to load the data or
Transform Data to continue transforming the data in Power Query Editor.
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.
Navigate using full hierarchy: If checked, the Navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, Navigator displays only the tables whose columns and rows contain data.
Enable SQL Server Failover support: If checked, when a node in the SQL Server failover group isn't available, Power Query moves from that node to another when failover occurs. If cleared, no failover will occur.
Once you've selected the advanced options you require, select OK in Power Query Desktop or Next in Power
Query Online to connect to your SQL Server database.
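In M, these options correspond to fields on the record passed to Sql.Database. A sketch, using an illustrative server, database, and query:

let
    Source = Sql.Database(
        "myserver.database.windows.net",    // server name (illustrative)
        "AdventureWorks",                   // database name (illustrative)
        [
            Query = "SELECT TOP (100) * FROM Sales.SalesOrderHeader",  // native SQL statement
            CommandTimeout = #duration(0, 0, 30, 0),                   // command timeout of 30 minutes
            CreateNavigationProperties = false                         // clears Include relationship columns
        ]
    )
in
    Source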
Troubleshooting
Always Encrypted columns
Power Query doesn't support 'Always Encrypted' columns.
Next steps
Optimize Power Query when expanding table columns
Stripe (Deprecated)
5/25/2022 • 2 minutes to read • Edit Online
Summary
Deprecation
This connector is deprecated, and support for it will end soon. We recommend that you transition existing connections off this connector, and don't use this connector for new connections.
SumTotal (Beta)
5/25/2022 • 2 minutes to read • Edit Online
NOTE
The following connector article is provided by SumTotal, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the SumTotal website and use the support channels there.
Summary
Prerequisites
You must have a SumTotal hosted environment with standard permissions to access the portal, and read
permissions to access data in tables.
Capabilities supported
Import
NOTE
You'll be prompted with a script error; this is expected and loads the JS/CSS scripts that the login form uses. Select
Yes .
3. When the table is loaded in Navigator , you'll be presented with the list of OData API entities that are
currently supported by the connector. You can select to load one or multiple entities.
4. When you've finished selecting entities, select Load to load the data directly in Power BI Desktop, or select
Transform Data to transform the data.
NOTE
If this is the first time you're connecting to this site, select Sign in and input your credentials. Then select Connect .
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities supported
Import
Text/CSV delimiters
Power Query will treat CSVs as structured files with a comma as a delimiter—a special case of a text file. If you
choose a text file, Power Query will automatically attempt to determine if it has delimiter separated values, and
what that delimiter is. If it can infer a delimiter, it will automatically treat it as a structured data source.
Unstructured Text
If your text file doesn't have structure, you'll get a single column with a new row per line encoded in the source
text. As a sample for unstructured text, you can consider a notepad file with the following contents:
Hello world.
This is sample data.
When you load it, you're presented with a navigation screen that loads each of these lines into their own row.
The only thing you can configure in this dialog is the File Origin dropdown, which lets you select which character set was used to generate the file. Currently, the character set isn't inferred; UTF-8 is detected only when the file starts with a UTF-8 BOM.
CSV
You can find a sample CSV file here.
In addition to file origin, CSV also supports specifying the delimiter and how data type detection will be handled.
Delimiters available include colon, comma, equals sign, semicolon, space, tab, a custom delimiter (which can be
any string), and a fixed width (splitting up text by some standard number of characters).
The final dropdown lets you select how data type detection is handled. Detection can be based on the first 200 rows or on the entire data set, or you can skip automatic data type detection and let all columns default to 'Text'. Be aware that detecting types over the entire data set can make the initial load of the data in the editor slower.
Since inference can be incorrect, it's worth double checking settings before loading.
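These dialog options correspond to arguments of the Csv.Document function. The following sketch reads a UTF-8, comma-delimited file and then promotes headers; the file path and the typed column name are illustrative.

let
    Source = Csv.Document(
        File.Contents("C:\Samples\Products.csv"),                          // illustrative file path
        [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]   // 65001 = UTF-8
    ),
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    Typed = Table.TransformColumnTypes(PromotedHeaders, {{"Price", type number}})  // illustrative typing step
in
    Typed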
Structured Text
When Power Query can detect structure to your text file, it will treat the text file as a delimiter separated value
file, and give you the same options available when opening a CSV—which is essentially just a file with an
extension indicating the delimiter type.
For example, if you save the following example as a text file, it will be read as having a tab delimiter rather than
unstructured text.
The Line breaks dropdown will allow you to select if you want to apply line breaks that are inside quotes or
not.
For example, if you edit the 'structured' sample provided above, you can add a line break.
If Line breaks is set to Ignore quoted line breaks , it will load as if there was no line break (with an extra
space).
If Line breaks is set to Apply all line breaks , it will load an extra row, with the content after the line breaks
being the only content in that row (exact output may depend on structure of the file contents).
The Open file as dropdown will let you edit what you want to load the file as—important for troubleshooting.
For structured files that aren't technically CSVs (such as a tab separated value file saved as a text file), you should
still have Open file as set to CSV. This setting also determines which dropdowns are available in the rest of the
dialog.
Text/CSV by Example
Text/CSV By Example in Power Query is a generally available feature in Power BI Desktop and Power Query
Online. When you use the Text/CSV connector, you'll see an option to Extract Table Using Examples on the
bottom-left corner of the navigator.
When you select that button, you’ll be taken into the Extract Table Using Examples page. On this page, you
specify sample output values for the data you’d like to extract from your Text/CSV file. After you enter the first
cell of the column, other cells in the column are filled out. For the data to be extracted correctly, you may need to
enter more than one cell in the column. If some cells in the column are incorrect, you can fix the first incorrect
cell and the data will be extracted again. Check the data in the first few cells to ensure that the data has been
extracted successfully.
NOTE
We recommend that you enter the examples in column order. Once the column has successfully been filled out, create a
new column and begin entering examples in the new column.
Once you’re done constructing that table, you can either select to load or transform the data. Notice how the
resulting queries contain a detailed breakdown of all the steps that were inferred for the data extraction. These
steps are just regular query steps that you can customize as needed.
Troubleshooting
Loading Files from the Web
If you're requesting text/csv files from the web and also promoting headers, and you’re retrieving enough files
that you need to be concerned with potential throttling, you should consider wrapping your Web.Contents call
with Binary.Buffer() . In this case, buffering the file before promoting headers will cause the file to only be
requested once.
Working with large CSV files
If you're dealing with large CSV files in the Power Query Online editor, you might receive an Internal Error. We
suggest you work with a smaller sized CSV file first, apply the steps in the editor, and once you're done, change
the path to the bigger CSV file. This method lets you work more efficiently and reduces your chances of
encountering a timeout in the online editor. We don't expect you to encounter this error during refresh time, as
we allow for a longer timeout duration.
Unstructured text being interpreted as structured
In rare cases, a document that has a similar number of commas in each paragraph might be interpreted as a CSV. If this happens, edit the Source step in the Query Editor, and select Text instead of CSV in the Open File As dropdown.
Error: Connection closed by host
When loading Text/CSV files from a web source and also promoting headers, you might sometimes encounter
the following errors: “An existing connection was forcibly closed by the remote host” or
“Received an unexpected EOF or 0 bytes from the transport stream.” These errors might be caused by the host
employing protective measures and closing a connection which might be temporarily paused, for example,
when waiting on another data source connection for a join or append operation. To work around these errors,
try adding a Binary.Buffer (recommended) or Table.Buffer call, which will download the file, load it into memory,
and immediately close the connection. This should prevent any pause during download and keep the host from
forcibly closing the connection before the content is retrieved.
The following example illustrates this workaround. This buffering needs to be done before the resulting table is
passed to Table.PromoteHeaders .
Original:
Csv.Document(Web.Contents("https://.../MyFile.csv"))
With Binary.Buffer :
Csv.Document(Binary.Buffer(Web.Contents("https://.../MyFile.csv")))
With Table.Buffer :
Table.Buffer(Csv.Document(Web.Contents("https://.../MyFile.csv")))
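Putting the pieces together, a sketch of a complete query that buffers the download before promoting headers (reusing the placeholder URL from the examples above):

let
    Source = Csv.Document(Binary.Buffer(Web.Contents("https://.../MyFile.csv"))),
    PromotedHeaders = Table.PromoteHeaders(Source)
in
    PromotedHeaders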
TIBCO(R) Data Virtualization
5/25/2022 • 3 minutes to read • Edit Online
NOTE
The following connector article is provided by TIBCO, the owner of this connector and a member of the Microsoft Power
Query Connector Certification Program. If you have questions regarding the content of this article or have changes you
would like to see made to this article, visit the TIBCO website and use the support channels there.
Summary
Prerequisites
To access the TIBCO eDelivery site, you must have purchased TIBCO software. There's no TIBCO license required
for the TIBCO(R) Data Virtualization (TDV) software—a TIBCO customer only needs to have a valid contract in
place. If you don't have access, then you'll need to contact the TIBCO admin in your organization.
The Power BI Connector for TIBCO(R) Data Virtualization must first be downloaded from
https://edelivery.tibco.com and installed on the machine running Power BI Desktop. The eDelivery site
downloads a ZIP file (for example, TIB_tdv_drivers_<VERSION>_all.zip, where <VERSION> is the TDV version)
that contains an installer program that installs all TDV client drivers, including the Power BI Connector.
Once the connector is installed, configure a data source name (DSN) to specify the connection properties
needed to connect to the TIBCO(R) Data Virtualization server.
NOTE
The DSN architecture (32-bit or 64-bit) needs to match the architecture of the product where you intend to use the
connector.
NOTE
Power BI Connector for TIBCO(R) Data Virtualization is the driver used by the TIBCO(R) Data Virtualization connector to
connect Power BI Desktop to TDV.
Capabilities Supported
Import
DirectQuery (Power BI Desktop only)
Advanced Connection Properties
Advanced
Native SQL statement
Once you've selected the advanced options you require, select OK in Power Query Desktop to connect to your
TIBCO(R) Data Virtualization Server.
Kerberos-based single sign-on (SSO) for TIBCO(R) Data Virtualization
The TIBCO(R) Data Virtualization connector now supports Kerberos-based single sign-on (SSO).
To use this feature:
1. Sign in to your Power BI account, and navigate to the Gateway management page.
2. Add a new data source under the gateway cluster you want to use.
3. Select the connector in the Data Source Type list.
4. Expand the Advanced Settings section.
5. Select the option to Use SSO via Kerberos for DirectQuery queries or Use SSO via Kerberos for DirectQuery and Import queries .
More information: Configure Kerberos-based SSO from Power BI service to on-premises data sources
Twilio (Deprecated) (Beta)
5/25/2022 • 2 minutes to read • Edit Online
Summary
Deprecation
NOTE
This connector is deprecated. We recommend that you transition off existing connections using this connector, and don't
use this connector for new connections.
Usercube
5/25/2022 • 2 minutes to read • Edit Online
NOTE
The following connector article is provided by Usercube, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the Usercube website and use the support channels there.
Summary
Prerequisites
You must have a Usercube instance with the PowerBI option.
Capabilities supported
Import
4. Enter the client credentials. The Client Id must be built from the Identifier of an OpenIdClient element, which is defined in the configuration of your Usercube instance. To this identifier, concatenate the @ character and the domain name of the Usercube instance. For example, an OpenIdClient identifier of PowerBI on the contoso.usercube.com instance (both names are illustrative) would give a Client Id of PowerBI@contoso.usercube.com.
5. In Navigator , select the data you require. Then, either select Transform data to transform the data in
the Power Query Editor, or choose Load to load the data in Power BI.
Vessel Insight
5/25/2022 • 4 minutes to read • Edit Online
NOTE
The following connector article is provided by Kongsberg, the owner of this connector and a member of the Microsoft
Power Query Connector Certification Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the Kongsberg website and use the support channels there.
Summary
Prerequisites
Before you can sign in to Vessel Insight, you must have an organization account (username/password)
connected to a tenant.
Capabilities Supported
Import
4. In the window that appears, provide your Vessel Insight tenant URL in the format [companyname].kognif.ai . Then select Validate .
5. In the window that appears, provide your credentials to sign in to your Vessel Insight account.
Once the connection is established, you can preview and select data within the Navigator dialog box to create a
single tabular output.
Recommended content
You might also find the following Vessel Insight information useful:
About Vessel Insight Power BI connector
About Vessel Insight
Vessel Insight API
Web
5/25/2022 • 9 minutes to read • Edit Online
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
Internet Explorer 10
Capabilities supported
Basic
Advanced
URL parts
Command timeout
HTTP request header parameters
NOTE
When uploading the report to the Power BI service, only the anonymous , Windows and basic authentication
methods are available.
The level you select for the authentication method determines what part of a URL will have the
authentication method applied to it. If you select the top-level web address, the authentication method
you select here will be used for that URL address or any subaddress within that address. However, you
might not want to set the top URL address to a specific authentication method because different
subaddresses could require different authentication methods. For example, if you were accessing two
separate folders of a single SharePoint site and wanted to use different Microsoft Accounts to access each
one.
Once you've set the authentication method for a specific web site address, you won't need to select the
authentication method for that URL address or any subaddress again. For example, if you select the
https://en.wikipedia.org/ address in this dialog, any web page that begins with this address won't require
that you select the authentication method again.
NOTE
If you need to change the authentication method later, go to Changing the authentication method.
4. From the Navigator dialog, you can select a table, then either transform the data in the Power Query
editor by selecting Transform Data , or load the data by selecting Load .
The right side of the Navigator dialog displays the contents of the table you select to transform or load.
If you're uncertain which table contains the data you're interested in, you can select the Web View tab.
The web view lets you see the entire contents of the web page, and highlights each of the tables that have
been detected on that site. You can select the check box above the highlighted table to obtain the data
from that table.
On the lower left side of the Navigator dialog, you can also select the Add table using examples
button. This selection presents an interactive window where you can preview the content of the web page
and enter sample values of the data you want to extract. For more information on using this feature, go to
Get webpage data by providing examples.
In most cases, you'll want to select the Web page connector. For security reasons, you'll need to use an
on-premises data gateway with this connector. The Web Page connector requires a gateway because
HTML pages are retrieved using a browser control, which involves potential security concerns. This
concern isn't an issue with Web API connector, as it doesn't use a browser control.
In some cases, you might want to use a URL that points at either an API or a file stored on the web. In
those scenarios, the Web API connector (or file-specific connectors) would allow you to move forward
without using an on-premises data gateway.
Also note that if your URL points to a file, you should use the specific file connector instead of the Web
page connector.
2. Enter a URL address in the text box. For this example, enter
https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States .
4. Select the authentication method you'll use to connect to the web page.
The available authentication methods for this connector are:
Anonymous : Select this authentication method if the web page doesn't require any credentials.
Windows : Select this authentication method if the web page requires your Windows credentials.
Basic : Select this authentication method if the web page requires a basic user name and password.
Organizational account : Select this authentication method if the web page requires
organizational account credentials.
Once you've chosen the authentication method, select Next .
5. From the Navigator dialog, you can select a table, then transform the data in the Power Query Editor by
selecting Transform Data .
Use the URL parts section of the dialog to assemble the URL you want to use to get data. The first part of the URL in the URL parts section most likely would consist of the scheme, authority, and path of the URI (for example, http://contoso.com/products/ ). The second text box could include any queries or fragments that you would use to filter the information provided to the web site. If you need to add more than one part, select Add part to add another URL fragment text box. As you enter each part of the URL, the complete URL that will be used when you select OK is displayed in the URL preview box.
Depending on how long the POST request takes to process data, you may need to prolong the time the request
continues to stay connected to the web site. The default timeout for both POST and GET is 100 seconds. If this
timeout is too short, you can use the optional Command timeout in minutes to extend the number of
minutes you stay connected.
You can also add specific request headers to the POST you send to the web site using the optional HTTP
request header parameters drop-down box. The following table describes the request headers you can
select.
Referer: Specifies a URI reference for the resource from which the target URI was obtained.
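As a rough sketch, these dialog options correspond to fields of the options record passed to Web.Contents. The URL reuses the example above; the header value and request body are illustrative.

let
    Response = Web.Contents(
        "http://contoso.com/products/",
        [
            Timeout = #duration(0, 0, 10, 0),                    // extend the default 100-second timeout to 10 minutes
            Headers = [ #"Referer" = "http://contoso.com/" ],     // optional request header (illustrative value)
            Content = Text.ToBinary("{ ""filter"": ""all"" }")   // supplying Content makes the request a POST
        ]
    ),
    Parsed = Json.Document(Response)
in
    Parsed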
As you can see, the Web connector returns the web contents from the URL you supplied, and then
automatically wraps the web contents in the appropriate document type specified by the URL (
Json.Document in this example).
Getting data from a web page lets users easily extract data from web pages. Often, however, data on web pages isn't in tidy tables that are easy to extract. Getting data from such pages can be challenging, even if the data is structured and consistent.
There's a solution. With the Get Data from Web by example feature, you can essentially show Power Query data
you want to extract by providing one or more examples within the connector dialog. Power Query gathers other
data on the page that match your examples. With this solution you can extract all sorts of data from Web pages,
including data found in tables and other non-table data.
NOTE
Prices listed in the images are for example purposes only.
Add table using examples presents an interactive window where you can preview the content of the Web
page. Enter sample values of the data you want to extract.
In this example, you'll extract the Name and Price for each of the games on the page. You can do that by
specifying a couple of examples from the page for each column. As you enter examples, Power Query extracts
data that fits the pattern of example entries using smart data extraction algorithms.
NOTE
Value suggestions only include values less than or equal to 128 characters in length.
Once you're happy with the data extracted from the Web page, select OK to go to Power Query Editor. You can
then apply more transformations or shape the data, such as combining this data with other data sources.
See also
Add a column from examples
Shape and combine data
Getting data
Troubleshooting the Power Query Web connector
Troubleshooting the Web connector
5/25/2022 • 5 minutes to read • Edit Online
4. Select OK .
5. Restart Power BI Desktop.
IMPORTANT
Be aware that unchecking Enable cer tificate revocation check will make web connections less secure.
To set this scenario in Group Policy, use the "DisableCertificateRevocationCheck" key under the registry path
"Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft Power BI Desktop". Setting
"DisableCertificateRevocationCheck" to 0 will always enable the check (stopping Fiddler and similar software
from working) and setting "DisableCertificateRevocationCheck" to 1 will always disable the check (enabling
Fiddler and similar software).
The legacy Power Query Web connector automatically creates a Web.Page query that supports authentication.
The only limitation occurs if you select Windows authentication in the authentication method dialog box. In this
case, the Use my current credentials selection works correctly, but Use alternate credentials won't
authenticate.
The new version of the Web connector (currently available in Power BI Desktop) automatically creates a
Web.BrowserContents query. Such queries currently only support anonymous authentication. In other words, the
new Web connector can't be used to connect to a source that requires non-anonymous authentication. This
limitation applies to the Web.BrowserContents function, regardless of the host environment.
Currently, Power BI Desktop automatically uses the Web.BrowserContents function. The Web.Page function is still
used automatically by Excel and Power Query Online. Power Query Online does support Web.BrowserContents
using an on-premises data gateway, but you currently would have to enter such a formula manually. When Web
By Example becomes available in Power Query Online in mid-October 2020, this feature will use
Web.BrowserContents .
The Web.Page function requires that you have Internet Explorer 10 installed on your computer. When refreshing
a Web.Page query via an on-premises data gateway, the computer containing the gateway must have Internet
Explorer 10 installed. If you use only the Web.BrowserContents function, you don't need to have Internet Explorer
10 installed on your computer or the computer containing the on-premises data gateway.
In cases where you need to use Web.Page instead of Web.BrowserContents because of authentication issues, you
can still manually use Web.Page .
In Power BI Desktop, you can use the older Web.Page function by clearing the New web table inference
preview feature:
1. Under the File tab, select Options and settings > Options .
2. In the Global section, select Preview features .
3. Clear the New web table inference preview feature, and then select OK .
4. Restart Power BI Desktop.
NOTE
Currently, you can't turn off the use of Web.BrowserContents in Power BI Desktop optimized for Power BI Report
Server.
You can also get a copy of a Web.Page query from Excel. To copy the code from Excel:
1. Select From Web from the Data tab.
2. Enter the address in the From Web dialog box, and then select OK .
3. In Navigator , choose the data you want to load, and then select Transform Data .
4. In the Home tab of Power Query, select Advanced Editor .
5. In the Advanced Editor , copy the M formula.
6. In the app that uses Web.BrowserContents , select the Blank Query connector.
7. If you're copying to Power BI Desktop:
a. In the Home tab, select Advanced Editor .
b. Paste the copied Web.Page query in the editor, and then select Done .
8. If you're copying to Power Query Online:
a. In the Blank Query , paste the copied Web.Page query.
b. Select an on-premises data gateway to use.
c. Select Next .
You can also manually enter the following code into a blank query. Ensure that you enter the address of the web
page you want to load.
let
Source = Web.Page(Web.Contents("<your address here>")),
Navigation = Source{0}[Data]
in
Navigation
Please contact the service owner. They will either need to change the authentication configuration or build a
custom connector.
Summary
Deprecation
NOTE
This connector is deprecated because of end of support for the connector. We recommend that users transition off
existing connections using this connector, and don't use this connector for new connections.
XML
5/25/2022 • 2 minutes to read • Edit Online
Summary
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Capabilities supported
Import
Troubleshooting
Data Structure
Because many XML documents have ragged or nested data, you may have to do extra data shaping to get the data into a form that's convenient for analytics. This holds true whether you use the UI-accessible Xml.Tables function or the Xml.Document function. Depending on your needs, you may find you have to do more or less data shaping.
Text versus nodes
If your document contains a mixture of text and non-text sibling nodes, you may encounter issues.
For example if you have a node like this:
<abc>
Hello <i>world</i>
</abc>
Xml.Tables will return the "world" portion but ignore "Hello". Only the element(s) are returned, not the text.
However, Xml.Document will return "Hello <i>world</i>". The entire inner node is turned to text, and structure
isn't preserved.
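A small sketch you can paste into a blank query to compare the two behaviors for yourself (both functions accept the XML as a text value):

let
    sample = "<abc>Hello <i>world</i></abc>",
    viaTables = Xml.Tables(sample),       // element-oriented view described above
    viaDocument = Xml.Document(sample)    // document view described above
in
    [Tables = viaTables, Document = viaDocument]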
Zendesk (Beta)
5/25/2022 • 2 minutes to read • Edit Online
Summary
Prerequisites
Before you can sign in to Zendesk, you must have a Zendesk account (username/password).
Capabilities Supported
Import
6. Select Sign in .
7. Once you've successfully signed in, select Connect .
8. In Navigator , select the information you want, then either select Load to load the data or Transform
Data to continue transforming the data in Power Query Editor.
Summary
Power Query Online is integrated into a variety of Microsoft products. Since these products target different
scenarios, they may set different limits for Power Query Online usage.
Limits are enforced at the beginning of query evaluations. Once an evaluation is underway, only timeout limits
are imposed.
Limit Types
Hourly Evaluation Count: The maximum number of evaluation requests a user can issue during any 60-minute period.
Daily Evaluation Time: The net time a user can spend evaluating queries during any 24-hour period.
Concurrent Evaluations: The maximum number of evaluations a user can have running at any given time.
Authoring Limits
Authoring limits are the same across all products. During authoring, query evaluations return previews that may
be subsets of the data. Data is not persisted.
Hourly Evaluation Count: 1000
Daily Evaluation Time: Currently unrestricted
Per Query Timeout: 10 minutes
Refresh Limits
During refresh (either scheduled or on-demand), query evaluations return complete results. Data is typically
persisted in storage.
Dataflows in PowerApps.com (Trial): Hourly Evaluation Count 500, Daily Evaluation Time 2 hours, Concurrent Evaluations 5
Dataflows in PowerApps.com (Production): Hourly Evaluation Count 1000, Daily Evaluation Time 8 hours, Concurrent Evaluations 10
Preserving sort
You might assume that if you sort your data, any downstream operations will preserve the sort order.
For example, if you sort a sales table so that each store's largest sale is shown first, you might expect that doing
a "Remove duplicates" operation will return only the top sale for each store. And this operation might, in fact,
appear to work. However, this behavior isn't guaranteed.
Because of the way Power Query optimizes certain operations, including skipping them or offloading them to
data sources (which can have their own unique ordering behavior), sort order isn't guaranteed to be preserved
through aggregations (such as Table.Group ), merges (such as Table.NestedJoin ), or duplicate removal (such as
Table.Distinct ).
There are a number of ways to work around this. Here are two suggestions:
Perform a sort after applying the downstream operation. For example, when grouping rows, sort the nested
table in each group before applying further steps. Here's some sample M code that demonstrates this
approach:
Table.Group(Sales_SalesPerson, {"TerritoryID"}, {{"SortedRows", each Table.Sort(_, {"SalesYTD", Order.Descending})}})
Buffer the data (using Table.Buffer ) before applying the downstream operation. In some cases, this
operation will cause the downstream operation to preserve the buffered sort order.
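A sketch of the second suggestion, using illustrative table and column names; note that buffering often, but not always, causes the downstream operation to preserve the order:

let
    Sorted = Table.Sort(Sales, {{"SaleAmount", Order.Descending}}),   // largest sale first
    Buffered = Table.Buffer(Sorted),                                  // fix the row order in memory
    TopSalePerStore = Table.Distinct(Buffered, {"Store"})             // keep the first row seen for each store
in
    TopSalePerStore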
Certificate revocation
An upcoming version of Power BI Desktop will cause SSL connections failure from Desktop when any
certificates in the SSL chain are missing certificate revocation status. This is a change from the current state,
where revocation only caused connection failure in the case where the certificate was explicitly revoked. Other
certificate issues might include invalid signatures, and certificate expiration.
As there are configurations in which revocation status may be stripped, such as with corporate proxy servers,
we'll be providing another option to ignore certificates that don't have revocation information. This option will
allow situations where revocation information is stripped in certain cases, but you don't want to lower security
entirely, to continue working.
It isn't recommended, but users will continue to be able to turn off revocation checks entirely.
Background
Due to the way that queries are stored in Power Query Online, there are cases where manually entered M script
(generally comments) is lost. The 'Review Script Changes' pane provides a diff experience highlighting the
changes, which allows users to understand what changes are being made. Users can then accept the changes or
rearrange their script to fix it.
There are three notable cases that may cause this experience.
Script for ribbon transforms
Ribbon transforms always generate the same M script, which may be different than the way they are manually
entered. This should always be equivalent script. Contact support if this is not the case.
Comments
Comments always have to be inside the let .. in expression and above a step; such a comment is shown in the user interface as a 'Step property'. All other comments are lost. Comments that are written on the same line as one step, but above another step (for example, after the comma that trails every step), will be moved down.
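A sketch that illustrates where a comment survives (the table and step names are placeholders):

let
    // This comment sits directly above a step, so it's kept as that step's 'Step property'
    Source = Table.FromRecords({[Product = "A", Sales = 10]}),
    Filtered = Table.SelectRows(Source, each [Sales] > 0) // a trailing comment like this one may be moved or lost
in
    Filtered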
Removing script errors
In certain cases, your script will be updated if it results in a syntax error by escaping your script (for example,
when using the formula bar).
Experience
When you commit a query, Power Query Online will evaluate it to see if the 'stored' version of the script differs
at all from what you have submitted. If it does, it will present you with a 'Review Script Changes' dialog box that
will allow you to accept or cancel.
If you accept, the changes will be made to your query.
If you cancel, you might rewrite your query to make sure that you move your comments properly, or
rearrange however else you want.
Power Query connector feedback
5/25/2022 • 2 minutes to read • Edit Online
This article describes how to submit feedback for Power Query connectors. It's important to distinguish between
Microsoft-owned connectors and non-Microsoft-owned connectors, as the support and feedback channels are
different.
To confirm whether a connector is Microsoft-owned, visit the connector reference. Only connectors marked as
"By Microsoft" are Microsoft-owned connectors.
Microsoft-owned connectors
This section outlines instructions to receive support or submit feedback on Microsoft-owned connectors.
Support and troubleshooting
If you're finding an issue with a Power Query connector, use the dedicated support channels for the product
you're using Power Query connectors in. For example, for Power BI, visit the Power BI support page.
If you're seeking help with using Microsoft-owned Power Query connectors, visit one of the following resources.
Community forums for the product you're using Power Query in. For example, for Power BI, this forum would
be the Power BI Community and for PowerPlatform dataflows, the forum would be Power Apps Community.
Power Query website resources.
Submitting feedback
To submit feedback about a Microsoft-owned connector, provide the feedback to the "ideas" forum for the
product you're using Power Query connectors in. For example, for Power BI, visit the Power BI ideas forum. If you
have one, you can also provide feedback directly to your Microsoft account contact.
Non-Microsoft-owned connectors
This section outlines instructions to receive support or submit feedback on non-Microsoft-owned connectors.
Support and troubleshooting
For non-Microsoft-owned connectors, support and troubleshooting questions should go to the connector owner
through their support channels. For example, for a Contoso-owned connector, you should submit a request
through the Contoso support channels.
You can also engage the Power Query community resources indicated above for Microsoft-owned connectors, in
case a member of the community can assist.
Submitting feedback
As non-Microsoft-owned connectors are managed and updated by the respective connector owner, feedback
should be sent directly to the connector owner. For example, to submit feedback about a Contoso-owned
connector, you should directly submit feedback to Contoso.
Capture web requests with Fiddler
5/25/2022 • 2 minutes to read • Edit Online
When diagnosing issues that might occur when Power Query communicates with your data, you might be asked
to supply a Fiddler trace. The information provided by Fiddler can be of significant use when troubleshooting
connectivity issues.
NOTE
This article assumes that you are already familiar with how Fiddler works in general.
See also
Query diagnostics
Power Query feedback
Getting started with Fiddler Classic
Installing the Power Query SDK
5/25/2022 • 2 minutes to read • Edit Online
Quickstart
NOTE
The steps to enable extensions changed in the June 2017 version of Power BI Desktop.
1. Install the Power Query SDK from the Visual Studio Marketplace.
2. Create a new data connector project.
3. Define your connector logic.
4. Build the project to produce an extension file.
5. Copy the extension file into [Documents]/Power BI Desktop/Custom Connectors.
6. Check the option (Not Recommended) Allow any extension to load without validation or warning
in Power BI Desktop (under File | Options and settings | Options | Security | Data Extensions).
7. Restart Power BI Desktop.
NOTE
In an upcoming change the default extension will be changed from .mez to .pqx.
To get you up to speed with Power Query, this page lists some of the most common questions.
What software do I need to get started with the Power Query SDK?
You need to install the Power Query SDK in addition to Visual Studio. To be able to test your connectors, you
should also have Power BI installed.
What can you do with a custom connector?
Custom connectors allow you to create new data sources or customize and extend an existing source. Common
use cases include:
Creating a business analyst-friendly view for a REST API.
Providing branding for a source that Power Query supports with an existing connector (such as an OData
service or ODBC driver).
Implementing OAuth v2 authentication flow for a SaaS offering.
Exposing a limited or filtered view over your data source to improve usability.
Enabling DirectQuery for a data source using an ODBC driver.
Custom connectors are only available in Power BI Desktop and Power BI Service through the use of an on-
premises data gateway. More information: TripPin 9 - Test Connection
Creating your first connector: Hello World
5/25/2022 • 2 minutes to read • Edit Online
HelloWorld = [
Authentication = [
Implicit = []
],
Label = Extension.LoadString("DataSourceLabel")
];
HelloWorld.Publish = [
Beta = true,
ButtonText = { Extension.LoadString("FormulaTitle"), Extension.LoadString("FormulaHelp") },
SourceImage = HelloWorld.Icons,
SourceTypeImage = HelloWorld.Icons
];
HelloWorld.Icons = [
Icon16 = { Extension.Contents("HelloWorld16.png"), Extension.Contents("HelloWorld20.png"), Extension.Contents("HelloWorld24.png"), Extension.Contents("HelloWorld32.png") },
Icon32 = { Extension.Contents("HelloWorld32.png"), Extension.Contents("HelloWorld40.png"), Extension.Contents("HelloWorld48.png"), Extension.Contents("HelloWorld64.png") }
];
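The records above describe how the connector is published; the section declaration and the shared function they refer to aren't shown in this excerpt. The following is a sketch based on the Hello World sample in the Data Connectors repo (check the sample for the exact definition):

section HelloWorld;

[DataSource.Kind="HelloWorld", Publish="HelloWorld.Publish"]
shared HelloWorld.Contents = (optional message as text) =>
    let
        _message = if (message <> null) then message else "Hello world",
        a = _message
    in
        a;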
Once you've built the file and copied it to the correct directory, following the instructions in the Installing the PowerQuery SDK tutorial, open Power BI Desktop. You can search for "hello" to find your connector in the Get Data dialog.
dialog.
This step will bring up an authentication dialog. Since there are no authentication options and the function takes no parameters, there are no further steps in these dialogs.
Press Connect and the dialog will tell you that it's a "Preview connector", since Beta is set to true in the query.
Since there's no authentication, the authentication screen will present a tab for Anonymous authentication with
no fields. Press Connect again to finish.
Finally, the query editor will come up showing what you expect—a function that returns the text "Hello world".
For the fully implemented sample, see the Hello World Sample in the Data Connectors sample repo.
TripPin Tutorial
5/25/2022 • 2 minutes to read • Edit Online
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
This tutorial uses a public OData service (TripPin) as a reference source. Although this lesson requires the use of
the M engine's OData functions, subsequent lessons will use Web.Contents, making it applicable to (most) REST
APIs.
Prerequisites
The following applications will be used throughout this tutorial:
Power BI Desktop, May 2017 release or later
Power Query SDK for Visual Studio
Fiddler—Optional, but recommended for viewing and debugging requests to your REST service
It's strongly suggested that you review:
Installing the PowerQuery SDK
Start developing custom connectors
Creating your first connector: Hello World
Handling Data Access
Handling Authentication
NOTE
You can also start trace logging of your work at any time by enabling diagnostics, which is described later on in this
tutorial. More information: Enabling diagnostics
Parts
PA RT L ESSO N DETA IL S
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Create a new Data Connector project using the Visual Studio SDK
Author a base function to pull data from a source
Test your connector in Visual Studio
Register your connector in Power BI Desktop
Open the TripPin.pq file and paste in the following connector definition.
section TripPin;
[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as Uri.Type) as any);
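The definition references TripPinImpl and the TripPin and TripPin.Publish records, which aren't shown in this excerpt. The following is a sketch of those pieces, based on the TripPin sample (the label and button text strings may differ from the published sample):

TripPinImpl = (url as text) =>
    let
        source = OData.Feed(url)
    in
        source;

// Data Source Kind description
TripPin = [
    Authentication = [
        Anonymous = []
    ],
    Label = "TripPin Part 1 - OData"
];

// Data Source UI publishing description
TripPin.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = { "TripPin OData", "TripPin OData" }
];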
TripPin.Feed("https://services.odata.org/v4/TripPinService/")
Select the Anonymous credential type, and then select Set Credential .
Select OK to close the dialog, and then select the Start button once again. You see a query execution status
dialog, and finally a Query Result table showing the data returned from your query.
You can try out a few different OData URLs in the test file to see how different results are returned. For example:
https://services.odata.org/v4/TripPinService/Me
https://services.odata.org/v4/TripPinService/GetPersonWithMostFriends()
https://services.odata.org/v4/TripPinService/People
The TripPin.query.pq file can contain single statements, let statements, or full section documents.
let
Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/"),
People = Source{[Name="People"]}[Data],
SelectColumns = Table.SelectColumns(People, {"UserName", "FirstName", "LastName"})
in
SelectColumns
Open Fiddler to capture HTTP traffic, and run the query. You should see a few different requests to services.odata.org, generated by the mashup container process. You can see that accessing the root URL of the
service results in a 302 status and a redirect to the longer version of the URL. Following redirects is another
behavior you get “for free” from the base library functions.
One thing to note if you look at the URLs is that you can see the query folding that happened with the
SelectColumns statement.
https://services.odata.org/v4/TripPinService/People?$select=UserName%2CFirstName%2CLastName
If you add more transformations to your query, you can see how they impact the generated URL.
This behavior is important to note. Even though you did not implement explicit folding logic, your connector
inherits these capabilities from the OData.Feed function. M statements are compose-able—filter contexts will
flow from one function to another, whenever possible. This is similar in concept to the way data source functions
used within your connector inherit their authentication context and credentials. In later lessons, you'll replace the
use of OData.Feed, which has native folding capabilities, with Web.Contents, which does not. To get the same
level of capabilities, you'll need to use the Table.View interface and implement your own explicit folding logic.
Select the function name, and select Connect . A third-party message appears—select Continue to continue.
The function invocation dialog now appears. Enter the root URL of the service (
https://services.odata.org/v4/TripPinService/ ), and select OK .
Since this is the first time you are accessing this data source, you'll receive a prompt for credentials. Check that
the shortest URL is selected, and then select Connect .
Notice that instead of getting a simple table of data, the navigator appears. This is because the OData.Feed
function returns a table with special metadata on top of it that the Power Query experience knows to display as
a navigation table. This walkthrough will cover how you can create and customize your own navigation table in
a future lesson.
Select the Me table, and then select Transform Data . Notice that the columns already have types assigned
(well, most of them). This is another feature of the underlying OData.Feed function. If you watch the requests in
Fiddler, you'll see that you've fetched the service's $metadata document. The engine's OData implementation
does this automatically to determine the service's schema, data types, and relationships.
Conclusion
This lesson walked you through the creation of a simple connector based on the OData.Feed library function. As
you saw, very little logic is needed to enable a fully functional connector over the OData base function. Other
extensibility-enabled functions, such as Odbc.DataSource, provide similar capabilities.
In the next lesson, you'll replace the use of OData.Feed with a less capable function—Web.Contents. Each lesson
will implement more connector features, including paging, metadata/schema detection, and query folding to the
OData query syntax, until your custom connector supports the same range of capabilities as OData.Feed.
Next steps
TripPin Part 2 - Data Connector for a REST Service
TripPin Part 2 - Data Connector for a REST Service
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Create a base function that calls out to a REST API using Web.Contents
Learn how to set request headers and process a JSON response
Use Power BI Desktop to wrangle the response into a user friendly format
This lesson converts the OData based connector for the TripPin service (created in the previous lesson) to a
connector that resembles something you'd create for any RESTful API. OData is a RESTful API, but one with a
fixed set of conventions. The advantage of OData is that it provides a schema, data retrieval protocol, and
standard query language. Taking away the use of OData.Feed will require us to build these capabilities into the
connector ourselves.
TripPin.Feed("https://services.odata.org/v4/TripPinService/Me")
Open Fiddler and then select the Start button in Visual Studio.
In Fiddler, you'll see three requests to the server.
When the query finishes evaluating, the M Query Output window should show the Record value for the Me
singleton.
If you compare the fields in the output window with the fields returned in the raw JSON response, you'll notice a
mismatch. The query result has additional fields ( Friends , Trips , GetFriendsTrips ) that don't appear
anywhere in the JSON response. The OData.Feed function automatically appended these fields to the record
based on the schema returned by $metadata. This is a good example of how a connector might augment and/or
reformat the response from the service to provide a better user experience.
DefaultRequestHeaders = [
#"Accept" = "application/json;odata.metadata=minimal", // column name and values only
#"OData-MaxVersion" = "4.0" // we only support v4
];
You'll change your implementation of your TripPin.Feed function so that rather than using OData.Feed , it uses
Web.Contents to make a web request, and parses the result as a JSON document.
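A minimal sketch of what that implementation might look like, using the DefaultRequestHeaders defined above (the internal TripPinImpl name follows the pattern from the earlier lessons, so treat it as an assumption):

[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as Uri.Type) as any);

TripPinImpl = (url as text) =>
    let
        // Issue the request with the OData headers defined above
        source = Web.Contents(url, [ Headers = DefaultRequestHeaders ]),
        // Parse the response body as JSON
        json = Json.Document(source)
    in
        json;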
You can now test this out in Visual Studio using the query file. The result of the /Me record now resembles the
raw JSON that you saw in the Fiddler request.
If you watch Fiddler when running the new function, you'll also notice that the evaluation now makes a single
web request, rather than three. Congratulations—you've achieved a 300% performance increase! Of course,
you've now lost all the type and schema information, but there's no need to focus on that part just yet.
Update your query to access some of the TripPin Entities/Tables, such as:
https://services.odata.org/v4/TripPinService/Airlines
https://services.odata.org/v4/TripPinService/Airports
https://services.odata.org/v4/TripPinService/Me/Trips
You'll notice that the paths that used to return nicely formatted tables now return a top level "value" field with an
embedded [List]. You'll need to do some transformations on the result to make it usable for Power BI scenarios.
let
Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/Airlines"),
value = Source[value],
toTable = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
expand = Table.ExpandRecordColumn(toTable, "Column1", {"AirlineCode", "Name"}, {"AirlineCode", "Name"})
in
expand
let
Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/Airports"),
value = Source[value],
#"Converted to Table" = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Name", "IcaoCode",
"IataCode", "Location"}, {"Name", "IcaoCode", "IataCode", "Location"}),
#"Expanded Location" = Table.ExpandRecordColumn(#"Expanded Column1", "Location", {"Address", "Loc",
"City"}, {"Address", "Loc", "City"}),
#"Expanded City" = Table.ExpandRecordColumn(#"Expanded Location", "City", {"Name", "CountryRegion",
"Region"}, {"Name.1", "CountryRegion", "Region"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded City",{{"Name.1", "City"}}),
#"Expanded Loc" = Table.ExpandRecordColumn(#"Renamed Columns", "Loc", {"coordinates"}, {"coordinates"}),
#"Added Custom" = Table.AddColumn(#"Expanded Loc", "Latitude", each [coordinates]{1}),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Longitude", each [coordinates]{0}),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"coordinates"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Name", type text}, {"IcaoCode", type
text}, {"IataCode", type text}, {"Address", type text}, {"City", type text}, {"CountryRegion", type text},
{"Region", type text}, {"Latitude", type number}, {"Longitude", type number}})
in
#"Changed Type"
You can repeat this process for additional paths under the service. Once you're ready, move onto the next step of
creating a (mock) navigation table.
let
source = #table({"Name", "Data"}, {
{ "Airlines", Airlines },
{ "Airports", Airports }
})
in
source
If you have not set your Privacy Levels setting to "Always ignore Privacy level settings" (also known as "Fast
Combine") you'll see a privacy prompt.
Privacy prompts appear when you're combining data from multiple sources and have not yet specified a privacy
level for the source(s). Select the Continue button and set the privacy level of the top source to Public .
Select Save and your table will appear. While this isn't a navigation table yet, it provides the basic functionality
you need to turn it into one in a subsequent lesson.
Data combination checks do not occur when accessing multiple data sources from within an extension. Since all
data source calls made from within the extension inherit the same authorization context, it is assumed they are
"safe" to combine. Your extension will always be treated as a single data source when it comes to data
combination rules. Users would still receive the regular privacy prompts when combining your source with
other M sources.
If you run Fiddler and click the Refresh Preview button in the Query Editor, you'll notice separate web requests
for each item in your navigation table. This indicates that an eager evaluation is occurring, which isn't ideal when
building navigation tables with a lot of elements. Subsequent lessons will show how to build a proper
navigation table that supports lazy evaluation.
Conclusion
This lesson showed you how to build a simple connector for a REST service. In this case, you turned an existing
OData extension into a standard REST extension (using Web.Contents), but the same concepts apply if you were
creating a new extension from scratch.
In the next lesson, you'll take the queries created in this lesson using Power BI Desktop and turn them into a true
navigation table within the extension.
Next steps
TripPin Part 3 - Navigation Tables
TripPin Part 3 - Navigation Tables
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Create a navigation table for a fixed set of queries
Test the navigation table in Power BI Desktop
This lesson adds a navigation table to the TripPin connector created in the previous lesson. When your connector
used the OData.Feed function (Part 1), you received the navigation table “for free”, as derived from the OData
service’s $metadata document. When you moved to the Web.Contents function (Part 2), you lost the built-in
navigation table. In this lesson, you'll take a set of fixed queries you created in Power BI Desktop and add the
appropriate metadata for Power Query to pop up the Navigator dialog for your data source function.
See the Navigation Table documentation for more information about using navigation tables.
Next you'll import the mock navigation table query you wrote that creates a fixed table linking to these data set
queries. Call it TripPinNavTable :
Finally you'll declare a new shared function, TripPin.Contents , that will be used as your main data source
function. You'll also remove the Publish value from TripPin.Feed so that it no longer shows up in the Get
Data dialog.
[DataSource.Kind="TripPin"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as Uri.Type) as any);
[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Contents = Value.ReplaceType(TripPinNavTable, type function (url as Uri.Type) as any);
NOTE
Your extension can mark multiple functions as shared , with or without associating them with a DataSource.Kind .
However, when you associate a function with a specific DataSource.Kind , each function must have the same set of
required parameters, with the same name and type. This is because the data source function parameters are combined to
make a 'key' used for looking up cached credentials.
You can test your TripPin.Contents function using your TripPin.query.pq file. Running the following test query
will give you a credential prompt, and a simple table output.
TripPin.Contents("https://services.odata.org/v4/TripPinService/")
Table.ToNavigationTable = (
table as table,
keyColumns as list,
nameColumn as text,
dataColumn as text,
itemKindColumn as text,
itemNameColumn as text,
isLeafColumn as text
) as table =>
let
tableType = Value.Type(table),
newTableType = Type.AddTableKey(tableType, keyColumns, true) meta
[
NavigationTable.NameColumn = nameColumn,
NavigationTable.DataColumn = dataColumn,
NavigationTable.ItemKindColumn = itemKindColumn,
Preview.DelayColumn = itemNameColumn,
NavigationTable.IsLeafColumn = isLeafColumn
],
navigationTable = Value.ReplaceType(table, newTableType)
in
navigationTable;
After copying this into your extension file, you'll update your TripPinNavTable function to add the navigation
table fields.
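The updated function might look something like the following sketch, assuming the entity queries you imported earlier were named GetAirlinesTable and GetAirportsTable (adjust the names to match your own queries):

TripPinNavTable = (url as text) as table =>
    let
        // Fixed table containing the fields Table.ToNavigationTable expects
        source = #table({"Name", "Data", "ItemKind", "ItemName", "IsLeaf"}, {
            { "Airlines", GetAirlinesTable(url), "Table", "Table", true },
            { "Airports", GetAirportsTable(url), "Table", "Table", true }
        }),
        // Attach the navigation table metadata to the table type
        navTable = Table.ToNavigationTable(source, {"Name"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
    in
        navTable;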
Running your test query again will give you a similar result as last time—with a few more columns added.
NOTE
You will not see the Navigator window appear in Visual Studio. The M Query Output window always displays the
underlying table.
If you copy your extension over to your Power BI Desktop custom connector and invoke the new function from
the Get Data dialog, you'll see your navigator appear.
If you right click on the root of the navigation tree and select Edit , you'll see the same table as you did within
Visual Studio.
Conclusion
In this tutorial, you added a Navigation Table to your extension. Navigation Tables are a key feature that makes
connectors easier to use. In this example your navigation table only has a single level, but the Power Query UI
supports displaying navigation tables that have multiple dimensions (even when they are ragged).
Next steps
TripPin Part 4 - Data Source Paths
TripPin Part 4 - Data Source Paths
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Simplify the connection logic for your connector
Improve the navigation table experience
This lesson simplifies the connector built in the previous lesson by removing its required function parameters,
and improving the user experience by moving to a dynamically generated navigation table.
For an in-depth explanation of how credentials are identified, see the Data Source Paths section of Handling
Authentication.
[DataSource.Kind="TripPin"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as Uri.Type) as any);
[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Contents = Value.ReplaceType(TripPinNavTable, type function (url as Uri.Type) as any);
The first time you run a query that uses one of the functions, you'll receive a credential prompt with dropdowns
that let you select a path and an authentication type.
If you run the same query again, with the same parameters, the M engine is able to locate the cached
credentials, and no credential prompt is shown. If you modify the url argument to your function so that the
base path no longer matches, a new credential prompt is displayed for the new path.
You can see any cached credentials on the Credentials table in the M Query Output window.
Depending on the type of change, modifying the parameters of your function will likely result in a credential
error.
Since the TripPin service has a fixed URL endpoint, you don't need to prompt the user for any values. You'll
remove the url parameter from your function, and define a BaseUrl variable in your connector.
BaseUrl = "https://services.odata.org/v4/TripPinService/";
[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Contents = () => TripPinNavTable(BaseUrl) as table;
You'll keep the TripPin.Feed function, but no longer make it shared, no longer associate it with a Data Source
Kind, and simplify its declaration. From this point on, you'll only use it internally within this section document.
If you update the TripPin.Contents() call in your TripPin.query.pq file and run it in Visual Studio, you'll see a
new credential prompt. Note that there is now a single Data Source Path value—TripPin.
Improving the Navigation Table
In the first tutorial you used the built-in OData functions to connect to the TripPin service. This gave you a really
nice looking navigation table, based on the TripPin service document, with no additional code on your side. The
OData.Feed function automatically did the hard work for you. Since you're "roughing it" by using Web.Contents
rather than OData.Feed, you'll need to recreate this navigation table yourself.
RootEntities = {
"Airlines",
"Airports",
"People"
};
You then update your TripPinNavTable function to build the table a column at a time. The [Data] column for
each entity is retrieved by calling TripPin.Feed with the full URL to the entity.
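A sketch of the dynamic version, built from the RootEntities list above (the column names match what Table.ToNavigationTable expects):

TripPinNavTable = (url as text) as table =>
    let
        // One row per entity name
        entitiesAsTable = Table.FromList(RootEntities, Splitter.SplitByNothing()),
        rename = Table.RenameColumns(entitiesAsTable, {{"Column1", "Name"}}),
        // Add Data as a calculated column, calling TripPin.Feed with the full entity URL
        withData = Table.AddColumn(rename, "Data", each TripPin.Feed(Uri.Combine(url, [Name])), type table),
        // Fixed values for the remaining navigation table fields
        withItemKind = Table.AddColumn(withData, "ItemKind", each "Table", type text),
        withItemName = Table.AddColumn(withItemKind, "ItemName", each "Table", type text),
        withIsLeaf = Table.AddColumn(withItemName, "IsLeaf", each true, type logical),
        navTable = Table.ToNavigationTable(withIsLeaf, {"Name"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
    in
        navTable;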
When dynamically building URL paths, make sure you're clear where your forward slashes (/) are! Note that
Uri.Combine uses the following rules when combining paths:
When the relativeUri parameter starts with a /, it will replace the entire path of the baseUri parameter
If the relativeUri parameter does not start with a / and baseUri ends with a /, the path is appended
If the relativeUri parameter does not start with a / and baseUri does not end with a /, the last segment of
the path is replaced
The examples below illustrate this behavior.
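A quick sketch with hypothetical URLs (the results follow from the rules above):

let
    // relativeUri starts with "/" - the entire path of baseUri is replaced
    replaced = Uri.Combine("https://example.com/odata/service", "/People"),    // https://example.com/People
    // baseUri ends with "/" - the relative path is appended
    appended = Uri.Combine("https://example.com/odata/service/", "People"),    // https://example.com/odata/service/People
    // neither condition holds - the last segment of baseUri is replaced
    lastReplaced = Uri.Combine("https://example.com/odata/service", "People")  // https://example.com/odata/People
in
    [Replaced = replaced, Appended = appended, LastSegmentReplaced = lastReplaced]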
NOTE
A disadvantage of using a generic approach to process your entities is that you lose the nice formatting and type
information for your entities. A later section in this tutorial shows how to enforce schema on REST API calls.
Conclusion
In this tutorial, you cleaned up and simplified your connector by fixing your Data Source Path value, and moving
to a more flexible format for your navigation table. After completing these steps (or using the sample code in
this directory), the TripPin.Contents function returns a navigation table in Power BI Desktop.
Next steps
TripPin Part 5 - Paging
TripPin Part 5 - Paging
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Add paging support to the connector
Many REST APIs will return data in "pages", requiring clients to make multiple requests to stitch the results
together. Although there are some common conventions for pagination (such as RFC 5988), it generally varies
from API to API. Thankfully, TripPin is an OData service, and the OData standard defines a way of doing
pagination using odata.nextLink values returned in the body of the response.
To keep the previous iterations of the connector simple, the TripPin.Feed function was not page aware. It simply parsed
whatever JSON was returned from the request and formatted it as a table. Those familiar with the OData
protocol might have noticed that a number of incorrect assumptions were made about the format of the response
(such as assuming there is a value field containing an array of records).
In this lesson you'll improve your response handling logic by making it page aware. Future tutorials will make
the page handling logic more robust and able to handle multiple response formats (including errors from the
service).
NOTE
You do not need to implement your own paging logic with connectors based on OData.Feed, as it handles it all for you
automatically.
Paging Checklist
When implementing paging support, you'll need to know the following things about your API:
How do you request the next page of data?
Does the paging mechanism involve calculating values, or do you extract the URL for the next page from the
response?
How do you know when to stop paging?
Are there parameters related to paging that you should be aware of? (such as "page size")
The answer to these questions will impact the way you implement your paging logic. While there is some
amount of code reuse across paging implementations (such as the use of Table.GenerateByPage), most
connectors will end up requiring custom logic.
NOTE
This lesson contains paging logic for an OData service, which follows a specific format. Check the documentation for your
API to determine the changes you'll need to make in your connector to support its paging format.
{
"odata.context": "...",
"odata.count": 37,
"value": [
{ },
{ },
{ }
],
"odata.nextLink": "...?$skiptoken=342r89"
}
Some OData services allow clients to supply a max page size preference, but it is up to the service whether or
not to honor it. Power Query should be able to handle responses of any size, so you don't need to worry about
specifying a page size preference—you can support whatever the service throws at you.
More information about Server-Driven Paging can be found in the OData specification.
Testing TripPin
Before fixing your paging implementation, confirm the current behavior of the extension from the previous
tutorial. The following test query will retrieve the People table and add an index column to show your current
row count.
let
source = TripPin.Contents(),
data = source{[Name="People"]}[Data],
withRowCount = Table.AddIndexColumn(data, "Index")
in
withRowCount
Turn on Fiddler, and run the query in Visual Studio. You'll notice that the query returns a table with 8 rows (index
0 to 7).
If you look at the body of the response from Fiddler, you'll see that it does in fact contain an @odata.nextLink
field, indicating that there are more pages of data available.
{
"@odata.context": "https://services.odata.org/V4/TripPinService/$metadata#People",
"@odata.nextLink": "https://services.odata.org/v4/TripPinService/People?%24skiptoken=8",
"value": [
{ },
{ },
{ }
]
}
NOTE
As stated earlier in this tutorial, paging logic will vary between data sources. The implementation here tries to break up
the logic into functions that should be reusable for sources that use next links returned in the response.
Table.GenerateByPage
The Table.GenerateByPage function can be used to efficiently combine multiple 'pages' of data into a single
table. It does this by repeatedly calling the function passed in as the getNextPage parameter, until it receives a
null . The function parameter must take a single argument, and return a nullable table .
Each call to getNextPage receives the output from the previous call.
// The getNextPage function takes a single argument and is expected to return a nullable table
Table.GenerateByPage = (getNextPage as function) as table =>
let
listOfPages = List.Generate(
() => getNextPage(null), // get the first page of data
(lastPage) => lastPage <> null, // stop when the function returns null
(lastPage) => getNextPage(lastPage) // pass the previous page to the next function call
),
// concatenate the pages together
tableOfPages = Table.FromList(listOfPages, Splitter.SplitByNothing(), {"Column1"}),
firstRow = tableOfPages{0}?
in
// if we didn't get back any pages of data, return an empty table
// otherwise set the table type based on the columns of the first page
if (firstRow = null) then
Table.FromRows({})
else
Value.ReplaceType(
Table.ExpandTableColumn(tableOfPages, "Column1", Table.ColumnNames(firstRow[Column1])),
Value.Type(firstRow[Column1])
);
Implementing GetAllPagesByNextLink
The body of your GetAllPagesByNextLink function implements the getNextPage function argument for
Table.GenerateByPage . It will call the GetPage function, and retrieve the URL for the next page of data from the
NextLink field of the meta record from the previous call.
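A sketch of that implementation, relying on Value.Metadata to read the NextLink set by GetPage:

GetAllPagesByNextLink = (url as text) as table =>
    Table.GenerateByPage((previous) =>
        let
            // if previous is null, then this is the first page of data
            nextLink = if (previous = null) then url else Value.Metadata(previous)[NextLink]?,
            // if NextLink was set to null by the previous call, there are no more pages
            page = if (nextLink <> null) then GetPage(nextLink) else null
        in
            page
    );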
Implementing GetPage
Your GetPage function will use Web.Contents to retrieve a single page of data from the TripPin service, and
convert the response into a table. It passes the response from Web.Contents to the GetNextLink function to
extract the URL of the next page, and sets it on the meta record of the returned table (page of data).
This implementation is a slightly modified version of the TripPin.Feed call from the previous tutorials.
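A sketch of GetPage, reusing the DefaultRequestHeaders and GetNextLink helpers described in this lesson:

GetPage = (url as text) as table =>
    let
        response = Web.Contents(url, [ Headers = DefaultRequestHeaders ]),
        body = Json.Document(response),
        // extract the URL of the next page (null if this is the last page)
        nextLink = GetNextLink(body),
        data = Table.FromRecords(body[value])
    in
        // attach the next link to the table's meta record
        data meta [NextLink = nextLink];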
Implementing GetNextLink
Your GetNextLink function simply checks the body of the response for an @odata.nextLink field, and returns its
value.
// In this implementation, 'response' will be the parsed body of the response after the call to Json.Document.
// Look for the '@odata.nextLink' field and simply return null if it doesn't exist.
GetNextLink = (response) as nullable text => Record.FieldOrDefault(response, "@odata.nextLink");
If you re-run the same test query from earlier in the tutorial, you should now see the page reader in action. You
should also see that you have 20 rows in the response rather than 8.
If you look at the requests in Fiddler, you should now see separate requests for each page of data.
NOTE
You'll notice duplicate requests for the first page of data from the service, which is not ideal. The extra request is a result of
the M engine's schema checking behavior. Ignore this issue for now and resolve it in the next tutorial, where you'll apply
an explicit schema.
Conclusion
This lesson showed you how to implement pagination support for a Rest API. While the logic will likely vary
between APIs, the pattern established here should be reusable with minor modifications.
In the next lesson, you'll look at how to apply an explicit schema to your data, going beyond the simple text
and number data types you get from Json.Document .
Next steps
TripPin Part 6 - Schema
TripPin Part 6 - Schema
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Define a fixed schema for a REST API
Dynamically set data types for columns
Enforce a table structure to avoid transformation errors due to missing columns
Hide columns from the result set
One of the big advantages of an OData service over a standard REST API is its $metadata definition. The
$metadata document describes the data found on this service, including the schema for all of its Entities (Tables)
and Fields (Columns). The OData.Feed function uses this schema definition to automatically set data type
information—so instead of getting all text and number fields (like you would from Json.Document ), end users
will get dates, whole numbers, times, and so on, providing a better overall user experience.
Many REST APIs don't have a way to programmatically determine their schema. In these cases, you'll need to
include schema definitions within your connector. In this lesson you'll define a simple, hardcoded schema for
each of your tables, and enforce the schema on the data you read from the service.
NOTE
The approach described here should work for many REST services. Future lessons will build upon this approach by
recursively enforcing schemas on structured columns (record, list, table), and provide sample implementations that can
programmatically generate a schema table from CSDL or JSON Schema documents.
Overall, enforcing a schema on the data returned by your connector has multiple benefits, such as:
Setting the correct data types
Removing columns that don't need to be shown to end users (such as internal IDs or state information)
Ensuring that each page of data has the same shape by adding any columns that might be missing from a
response (a common way for REST APIs to indicate a field should be null)
let
source = TripPin.Contents(),
data = source{[Name="Airlines"]}[Data]
in
data
The "@odata.*" columns are part of OData protocol, and not something you'd want or need to show to the end
users of your connector. AirlineCode and Name are the two columns you'll want to keep. If you look at the
schema of the table (using the handy Table.Schema function), you can see that all of the columns in the table
have a data type of Any.Type .
let
source = TripPin.Contents(),
data = source{[Name="Airlines"]}[Data]
in
Table.Schema(data)
Table.Schema returns a lot of metadata about the columns in a table, including names, positions, type
information, and many advanced properties, such as Precision, Scale, and MaxLength. Future lessons will
provide design patterns for setting these advanced properties, but for now you need only concern yourself with
the ascribed type ( TypeName ), primitive type ( Kind ), and whether the column value might be null ( IsNullable ).
COLUMN    DETAILS
Name      The name of the column. This must match the name in the results returned by the service.
Type      The M data type you're going to set. This can be a primitive type (text, number, datetime, and so on), or an ascribed type (Int64.Type, Currency.Type, and so on).
The hardcoded schema table for the Airlines table will set its AirlineCode and Name columns to text , and
looks like this:
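A sketch of that schema table, using the Name/Type format described above:

Airlines = #table({"Name", "Type"}, {
    {"AirlineCode", type text},
    {"Name", type text}
});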
The Airports table has four fields you'll want to keep (including one of type record ):
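A corresponding sketch for Airports, with the field names taken from the earlier Airports query:

Airports = #table({"Name", "Type"}, {
    {"IcaoCode", type text},
    {"Name", type text},
    {"IataCode", type text},
    {"Location", type record}
});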
Finally, the People table has seven fields, including lists ( Emails , AddressInfo ), a nullable column ( Gender ),
and a column with an ascribed type ( Concurrency ).
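And a sketch for People, matching the seven fields just described:

People = #table({"Name", "Type"}, {
    {"UserName", type text},
    {"FirstName", type text},
    {"LastName", type text},
    {"Emails", type list},
    {"AddressInfo", type list},
    {"Gender", type nullable text},
    {"Concurrency", Int64.Type}
});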
NOTE
The last step to set the table type will remove the need for the Power Query UI to infer type information when viewing
the results in the query editor. This removes the double request issue you saw at the end of the previous tutorial.
The following helper code can be copy and pasted into your extension:
EnforceSchema.Strict = 1; // Add any missing columns, remove extra columns, set table type
EnforceSchema.IgnoreExtraColumns = 2; // Add missing columns, do not remove extra columns
EnforceSchema.IgnoreMissingColumns = 3; // Do not add or remove columns
SchemaTransformTable = (table as table, schema as table, optional enforceSchema as number) as table =>
let
// Default to EnforceSchema.Strict
_enforceSchema = if (enforceSchema <> null) then enforceSchema else EnforceSchema.Strict,
You'll also update all of the calls to these functions to make sure that you pass the schema through correctly.
Enforcing the schema
The actual schema enforcement will be done in your GetPage function.
GetPage = (url as text, optional schema as table) as table =>
let
response = Web.Contents(url, [ Headers = DefaultRequestHeaders ]),
body = Json.Document(response),
nextLink = GetNextLink(body),
data = Table.FromRecords(body[value]),
// enforce the schema
withSchema = if (schema <> null) then SchemaTransformTable(data, schema) else data
in
withSchema meta [NextLink = nextLink];
NOTE
This GetPage implementation uses Table.FromRecords to convert the list of records in the JSON
response to a table. A major downside to using Table.FromRecords is that it assumes all records in the list
have the same set of fields. This works for the TripPin service, since the OData records are guaranteed to
contain the same fields, but this might not be the case for all REST APIs. A more robust implementation
would use a combination of Table.FromList and Table.ExpandRecordColumn. Later tutorials will change the
implementation to get the column list from the schema table, ensuring that no columns are lost or missing
during the JSON to M translation.
You'll then update your TripPinNavTable function to call GetEntity , rather than making all of the calls inline.
The main advantage to this is that it will let you continue modifying your entity building code, without having to
touch your nav table logic.
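A sketch of GetEntity and the updated navigation table column, assuming a GetSchemaForEntity helper that looks up an entity's schema in your schema table:

GetEntity = (url as text, entity as text) as table =>
    let
        fullUrl = Uri.Combine(url, entity),
        // look up the hardcoded schema for this entity
        schemaTable = GetSchemaForEntity(entity),
        result = GetAllPagesByNextLink(fullUrl, schemaTable)
    in
        result;

// In TripPinNavTable, the [Data] column now calls GetEntity:
// withData = Table.AddColumn(rename, "Data", each GetEntity(url, [Name]), type table),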
You now see that your Airlines table only has the two columns you defined in its schema:
let
source = TripPin.Contents(),
data = source{[Name="People"]}[Data]
in
Table.Schema(data)
You'll see that the ascribed type you used ( Int64.Type ) was also set correctly.
An important thing to note is that this implementation of SchemaTransformTable doesn't modify the types of
list and record columns, but the Emails and AddressInfo columns are still typed as list . This is because
Json.Document will correctly map JSON arrays to M lists, and JSON objects to M records. If you were to expand
the list or record column in Power Query, you'd see that all of the expanded columns will be of type any. Future
tutorials will improve the implementation to recursively set type information for nested complex types.
Conclusion
This tutorial provided a sample implementation for enforcing a schema on JSON data returned from a REST
service. While this sample uses a simple hardcoded schema table format, the approach could be expanded upon
by dynamically building a schema table definition from another source, such as a JSON schema file, or metadata
service/endpoint exposed by the data source.
In addition to modifying column types (and values), your code is also setting the correct type information on the
table itself. Setting this type information benefits performance when running inside of Power Query, as the user
experience always attempts to infer type information to display the right UI cues to the end user, and the
inference calls can end up triggering additional calls to the underlying data APIs.
If you view the People table using the TripPin connector from the previous lesson, you'll see that all of the
columns have a 'type any' icon (even the columns that contain lists):
Running the same query with the TripPin connector from this lesson, you'll now see that the type information is
displayed correctly.
Next steps
TripPin Part 7 - Advanced Schema with M Types
TripPin Part 7 - Advanced Schema with M Types
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Enforce a table schema using M Types
Set types for nested records and lists
Refactor code for reuse and unit testing
In the previous lesson you defined your table schemas using a simple "Schema Table" system. This schema table
approach works for many REST APIs/Data Connectors, but services that return complex or deeply nested data
sets might benefit from the approach in this tutorial, which leverages the M type system.
This lesson will guide you through the following steps:
1. Adding unit tests
2. Defining custom M types
3. Enforcing a schema using types
4. Refactoring common code into separate files
shared TripPin.UnitTest =
[
// Put any common variables here if you only want them to be evaluated once
RootTable = TripPin.Contents(),
Airlines = RootTable{[Name="Airlines"]}[Data],
Airports = RootTable{[Name="Airports"]}[Data],
People = RootTable{[Name="People"]}[Data],
// facts is the list of Fact(<description>, <expected value>, <actual value>) test cases, for example:
facts = {
    Fact("Airlines has the expected column count", 2, Table.ColumnCount(Airlines))
},
report = Facts.Summarize(facts)
][report];
Clicking run on the project will evaluate all of the Facts, and give you a report output that looks like this:
Using some principles from test-driven development, you'll now add a test that currently fails, but will soon be
reimplemented and fixed (by the end of this tutorial). Specifically, you'll add a test that checks one of the nested
records (Emails) you get back in the People entity.
If you run the code again, you should now see that you have a failing test.
Now you just need to implement the functionality to make this work.
A type value is a value that classifies other values. A value that is classified by a type is said to conform
to that type. The M type system consists of the following kinds of types:
Primitive types, which classify primitive values ( binary , date , datetime , datetimezone , duration ,
list , logical , null , number , record , text , time , type ) and also include a number of abstract
types ( function , table , any , and none )
Record types, which classify record values based on field names and value types
List types, which classify lists using a single item base type
Function types, which classify function values based on the types of their parameters and return values
Table types, which classify table values based on column names, column types, and keys
Nullable types, which classifies the value null in addition to all the values classified by a base type
Type types, which classify values that are types
Using the raw JSON output you get (and/or looking up the definitions in the service's $metadata), you can
define the following record types to represent OData complex types:
LocationType = type [
Address = text,
City = CityType,
Loc = LocType
];
CityType = type [
CountryRegion = text,
Name = text,
Region = text
];
LocType = type [
#"type" = text,
coordinates = {number},
crs = CrsType
];
CrsType = type [
#"type" = text,
properties = record
];
Note how the LocationType references the CityType and LocType to represent its structured columns.
For the top level entities (that you want represented as Tables), you define table types:
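Sketches of those table types, built on the record types above (the field lists are inferred from the entities you've been working with, so treat them as an approximation):

AirlinesType = type table [
    AirlineCode = text,
    Name = text
];

AirportsType = type table [
    Name = text,
    IcaoCode = text,
    IataCode = text,
    Location = LocationType
];

PeopleType = type table [
    UserName = text,
    FirstName = text,
    LastName = text,
    Emails = {text},
    AddressInfo = {LocationType},
    Gender = nullable text,
    Concurrency = Int64.Type
];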
You then update your SchemaTable variable (which you use as a "lookup table" for entity to type mappings) to
use these new type definitions:
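For example, the lookup table might now map entity names to type values rather than to schema tables:

SchemaTable = #table({"Entity", "Type"}, {
    {"Airlines", AirlinesType},
    {"Airports", AirportsType},
    {"People", PeopleType}
});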
The full code listing for the Table.ChangeType function can be found in the Table.ChangeType.pqm file.
NOTE
For flexibility, the function can be used on tables, as well as lists of records (which is how tables would be represented in a
JSON document).
You then need to update the connector code to change the schema parameter from a table to a type , and add
a call to Table.ChangeType in GetEntity .
GetPage is updated to use the list of fields from the schema (to know the names of what to expand when you
get the results), but leaves the actual schema enforcement to GetEntity .
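A sketch of the updated GetEntity, with Table.ChangeType applied after paging (assuming GetSchemaForEntity now returns a type value):

GetEntity = (url as text, entity as text) as table =>
    let
        fullUrl = Uri.Combine(url, entity),
        schema = GetSchemaForEntity(entity),
        result = GetAllPagesByNextLink(fullUrl, schema),
        // enforce the M type (including nested record/list types) on the combined result
        appliedSchema = Table.ChangeType(result, schema)
    in
        appliedSchema;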
Running your unit tests again shows that they are now all passing.
Refactoring Common Code into Separate Files
NOTE
The M engine will have improved support for referencing external modules/common code in the future, but this approach
should carry you through until then.
At this point, your extension almost has as much "common" code as TripPin connector code. In the future these
common functions will either be part of the built-in standard function library, or you'll be able to reference them
from another extension. For now, you refactor your code in the following way:
1. Move the reusable functions to separate files (.pqm).
2. Set the Build Action property on the file to Compile to make sure it gets included in your extension file
during the build.
3. Define a function to load the code using Expression.Evaluate.
4. Load each of the common functions you want to use.
The code to do this is included in the snippet below:
Table.ChangeType = Extension.LoadFunction("Table.ChangeType.pqm");
Table.GenerateByPage = Extension.LoadFunction("Table.GenerateByPage.pqm");
Table.ToNavigationTable = Extension.LoadFunction("Table.ToNavigationTable.pqm");
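These calls assume an Extension.LoadFunction helper along the lines of the following sketch, which reads a packaged .pqm file and evaluates it as M code:

Extension.LoadFunction = (name as text) =>
    let
        // read the file that was included in the extension via the Compile build action
        binary = Extension.Contents(name),
        asText = Text.FromBinary(binary)
    in
        // evaluate the file contents, giving it access to the standard library
        Expression.Evaluate(asText, #shared);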
Conclusion
This tutorial made a number of improvements to the way you enforce a schema on the data you get from a
REST API. The connector is currently hard coding its schema information, which has a performance benefit at
runtime, but is unable to adapt to changes in the service's metadata over time. Future tutorials will move to a
purely dynamic approach that will infer the schema from the service's $metadata document.
In addition to the schema changes, this tutorial added Unit Tests for your code, and refactored the common
helper functions into separate files to improve overall readability.
Next steps
TripPin Part 8 - Adding Diagnostics
TripPin Part 8 - Adding Diagnostics
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Learn about the Diagnostics.Trace function
Use the Diagnostics helper functions to add trace information to help debug your connector
Enabling diagnostics
Power Query users can enable trace logging by selecting the checkbox under Options | Diagnostics .
Once enabled, any subsequent queries will cause the M engine to emit trace information to log files located in a
fixed user directory.
When running M queries from within the Power Query SDK, tracing is enabled at the project level. On the
project properties page, there are three settings related to tracing:
Clear Log —when this is set to true , the log will be reset/cleared when you run your queries. We
recommend you keep this set to true .
Show Engine Traces —this setting controls the output of built-in traces from the M engine. These traces are
generally only useful to members of the Power Query team, so you'll typically want to keep this set to false .
Show User Traces —this setting controls trace information output by your connector. You'll want to set this
to true .
Once enabled, you'll start seeing log entries in the M Query Output window, under the Log tab.
Diagnostics.Trace
The Diagnostics.Trace function is used to write messages into the M engine's trace log.
Diagnostics.Trace = (traceLevel as number, message as text, value as any, optional delayed as nullable logical) as any => ...
IMPORTANT
M is a functional language with lazy evaluation. When using Diagnostics.Trace , keep in mind that the function will
only be called if the expression it's a part of is actually evaluated. Examples of this can be found later in this tutorial.
The traceLevel parameter can be one of the following values (in descending order):
TraceLevel.Critical
TraceLevel.Error
TraceLevel.Warning
TraceLevel.Information
TraceLevel.Verbose
When tracing is enabled, the user can select the maximum level of messages they would like to see. All trace
messages of this level and under will be output to the log. For example, if the user selects the "Warning" level,
trace messages of TraceLevel.Warning , TraceLevel.Error , and TraceLevel.Critical would appear in the logs.
The message parameter is the actual text that will be output to the trace file. Note that the text will not contain
the value parameter unless you explicitly include it in the text.
The value parameter is what the function will return. When the delayed parameter is set to true , value will
be a zero parameter function that returns the actual value you're evaluating. When delayed is set to false ,
value will be the actual value. An example of how this works can be found below.
You can force an error during evaluation (for test purposes!) by passing an invalid entity name to the GetEntity
function. Here you change the withData line in the TripPinNavTable function, replacing [Name] with
"DoesNotExist" .
Enable tracing for your project, and run your test queries. On the Errors tab you should see the text of the error
you raised:
Also, on the Log tab, you should see the same message. Note that if you use different values for the message
and value parameters, these would be different.
Also note that the Action field of the log message contains the name (Data Source Kind) of your extension (in
this case, Engine/Extension/TripPin ). This makes it easier to find the messages related to your extension when
there are multiple queries involved and/or system (mashup engine) tracing is enabled.
Delayed evaluation
As an example of how the delayed parameter works, you'll make some modifications and run the queries again.
First, set the delayed value to false , but leave the value parameter as-is:
When you run the query, you'll receive an error that "We cannot convert a value of type Function to type Type",
and not the actual error you raised. This is because the call is now returning a function value, rather than the
value itself.
Next, remove the function from the value parameter:
When you run the query, you'll receive the correct error, but if you check the Log tab, there will be no messages.
This is because the error ends up being raised/evaluated during the call to Diagnostics.Trace , so the message
is never actually output.
Now that you understand the impact of the delayed parameter, be sure to reset your connector back to a
working state before proceeding.
// Diagnostics module contains multiple functions. We can take the ones we need.
Diagnostics = Extension.LoadFunction("Diagnostics.pqm");
Diagnostics.LogValue = Diagnostics[LogValue];
Diagnostics.LogFailure = Diagnostics[LogFailure];
Diagnostics.LogValue
The Diagnostics.LogValue function is a lot like Diagnostics.Trace , and can be used to output the value of what
you're evaluating.
The prefix parameter is prepended to the log message. You'd use this to figure out which call output the
message. The value parameter is what the function will return, and will also be written to the trace as a text
representation of the M value. For example, if value is equal to a table with columns A and B, the log will
contain the equivalent #table representation:
#table({"A", "B"}, {{"row1 A", "row1 B"}, {"row2 A", row2 B"}})
NOTE
Serializing M values to text can be an expensive operation. Be aware of the potential size of the values you are outputting
to the trace.
NOTE
Most Power Query environments will truncate trace messages to a maximum length.
As an example, you'll update the TripPin.Feed function to trace the url and schema arguments passed into
the function.
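A sketch of the traced version, assuming your GetAllPagesByNextLink now takes the url and schema (the message prefixes here match the log entries shown below):

TripPin.Feed = (url as text, optional schema as type) as table =>
    let
        // trace the arguments; the Diagnostics.LogValue results must be used for the traces to be evaluated
        _url = Diagnostics.LogValue("Accessing url", url),
        _schema = Diagnostics.LogValue("Schema type", schema),
        result = GetAllPagesByNextLink(_url, _schema)
    in
        result;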
Note that you have to use the new _url and _schema values in the call to GetAllPagesByNextLink . If you used
the original function parameters, the Diagnostics.LogValue calls would never actually be evaluated, resulting in
no messages written to the trace. Functional programming is fun!
When you run your queries, you should now see new messages in the log, prefixed with "Accessing url" and "Schema type".
Note that you see the serialized version of the schema parameter type , rather than what you'd get when you
do a simple Text.FromValue on a type value (which results in "type").
Diagnostics.LogFailure
The Diagnostics.LogFailure function can be used to wrap function calls, and will only write to the trace if the
function call fails (that is, returns an error ).
Internally, Diagnostics.LogFailure adds a try operator to the function call. If the call fails, the text value is
written to the trace before returning the original error . If the function call succeeds, the result is returned
without writing anything to the trace. Since M errors don't contain a full stack trace (that is, you typically only
see the message of the error), this can be useful when you want to pinpoint where the error was actually raised.
As a (poor) example, modify the withData line of the TripPinNavTable function to force an error once again:
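For example, wrapping the GetEntity call (with the invalid entity name from earlier):

withData = Table.AddColumn(rename, "Data", each Diagnostics.LogFailure("Error in GetEntity", () => GetEntity(url, "DoesNotExist")), type table),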
In the trace, you can find the resulting error message containing your text , and the original error information.
Be sure to reset your function to a working state before proceeding with the next tutorial.
Conclusion
This brief (but important!) lesson showed you how to make use of the diagnostic helper functions to log to the
Power Query trace files. When used properly, these functions are extremely useful in debugging issues within
your connector.
NOTE
As a connector developer, it is your responsibility to ensure that you do not log sensitive or personally identifiable
information (PII) as part of your diagnostic logging. You must also be careful to not output too much trace information, as
it can have a negative performance impact.
Next steps
TripPin Part 9 - TestConnection
TripPin Part 9 - TestConnection
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you'll:
Add a TestConnection handler
Configure the on-premises data gateway (personal mode)
Test scheduled refresh through the Power BI service
Custom connector support was added to the April 2018 release of the personal on-premises data gateway. This
new (preview) functionality allows for Scheduled Refresh of reports that make use of your custom connector.
This tutorial will cover the process of enabling your connector for refresh, and provide a quick walkthrough of
the steps to configure the gateway. Specifically you'll:
1. Add a TestConnection handler to your connector
2. Install the On-Premises Data Gateway in Personal mode
3. Enable Custom Connector support in the Gateway
4. Publish a workbook that uses your connector to PowerBI.com
5. Configure scheduled refresh to test your connector
See Handling Gateway Support for more information on the TestConnection handler.
Background
There are three prerequisites for configuring a data source for scheduled refresh using PowerBI.com:
The data source is suppor ted: This means that the target gateway environment is aware of all of the
functions contained in the query you want to refresh.
Credentials are provided: To present the right credential entry dialog, Power BI needs to know the supported
authentication mechanism for a given data source.
The credentials are valid: After the user provides credentials, they're validated by calling the data source's
TestConnection handler.
The first two items are handled by registering your connector with the gateway. When the user attempts to
configure scheduled refresh in PowerBI.com, the query information is sent to your personal gateway to
determine if any data sources that aren't recognized by the Power BI service (that is, custom ones that you
created) are available there. The third item is handled by invoking the TestConnection handler defined for your
data source.
NOTE
Future versions of the Power Query SDK will provide a way to validate the TestConnection handler from Visual Studio.
Currently, the only mechanism that uses TestConnection is the on-premises data gateway.
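As a sketch, adding a TestConnection handler to the TripPin Data Source Kind record (which uses Anonymous authentication and a fixed URL) might look like this; the Label text is an assumption:

TripPin = [
    // TestConnection is required to enable the connector through the gateway.
    // It returns the name of the function (plus any required arguments) that
    // should be invoked to validate the credentials.
    TestConnection = (dataSourcePath) => { "TripPin.Contents" },
    Authentication = [
        Anonymous = []
    ],
    Label = "TripPin"
];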
Select the Edit credentials link to bring up the authentication dialog, and then select sign-in.
NOTE
If you receive an error similar to the one below ("Failed to update data source credentials"), you most likely have an issue
with your TestConnection handler.
After a successful call to TestConnection, the credentials will be accepted. You can now schedule refresh, or select
the dataset ellipsis and then select Refresh Now . You can select the Refresh history link to view the status of
the refresh (which generally takes a few minutes to get kicked off).
Conclusion
Congratulations! You now have a production ready custom connector that supports automated refresh through
the Power BI service.
Next steps
TripPin Part 10 - Query Folding
TripPin Part 10—Basic Query Folding
This multi-part tutorial covers the creation of a new data source extension for Power Query. The tutorial is
meant to be done sequentially—each lesson builds on the connector created in previous lessons, incrementally
adding new capabilities to your connector.
In this lesson, you will:
Learn the basics of query folding
Learn about the Table.View function
Replicate OData query folding handlers for:
$top
$skip
$count
$select
$orderby
One of the powerful features of the M language is its ability to push transformation work to underlying data
source(s). This capability is referred to as Query Folding (other tools/technologies also refer to similar functionality
as Predicate Pushdown, or Query Delegation). When creating a custom connector that uses an M function with
built-in query folding capabilities, such as OData.Feed or Odbc.DataSource , your connector will automatically
inherit this capability for free.
This tutorial will replicate the built-in query folding behavior for OData by implementing function handlers for
the Table.View function. This part of the tutorial will implement some of the easier handlers (that
is, ones that don't require expression parsing and state tracking).
To understand more about the query capabilities that an OData service might offer, see OData v4 URL
Conventions.
NOTE
As stated above, the OData.Feed function will automatically provide query folding capabilities. Since the TripPin series is
treating the OData service as a regular REST API, using Web.Contents rather than OData.Feed , you'll need to
implement the query folding handlers yourself. For real world usage, we recommend that you use OData.Feed whenever
possible.
See the Table.View documentation for more information about query folding in M.
Using Table.View
The Table.View function allows a custom connector to override default transformation handlers for your data
source. An implementation of Table.View will provide a function for one or more of the supported handlers. If a
handler is unimplemented, or returns an error during evaluation, the M engine will fall back to its default
handler.
When a custom connector uses a function that doesn't support implicit query folding, such as Web.Contents ,
default transformation handlers will always be performed locally. If the REST API you are connecting to supports
query parameters as part of the query, Table.View will allow you to add optimizations that allow
transformation work to be pushed to the service.
The Table.View function has the following signature:
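For reference, the signature looks like this:

Table.View(table as nullable table, handlers as record) as table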
Your implementation will wrap your main data source function. There are two required handlers for Table.View :
GetType —returns the expected table type of the query result
GetRows —returns the actual table result of your data source function
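A minimal pass-through sketch that implements only these two handlers might look like the following (matching the TripPin.SuperSimpleView function referenced later in this lesson):

TripPin.SuperSimpleView = (url as text, entity as text) as table =>
    Table.View(null, [
        // the expected table type of the result
        GetType = () => Value.Type(GetRows()),
        // the actual rows, delegated to the existing GetEntity function
        GetRows = () => GetEntity(url, entity)
    ]);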
If you re-run the unit tests, you'll see that the behavior of your function hasn't changed. In this case your
Table.View implementation is simply passing through the call to GetEntity . Since you haven't implemented
any transformation handlers (yet), the original url parameter remains untouched.
//
// Helper functions
//
// Retrieves the cached schema. If this is the first call
// to CalculateSchema, the table type is calculated based on
// the entity name that was passed into the function.
CalculateSchema = (state) as type =>
if (state[Schema]? = null) then
GetSchemaForEntity(entity)
else
state[Schema],
If you look at the call to Table.View, you'll see an additional wrapper function around the handlers record—
Diagnostics.WrapHandlers. This helper function is found in the Diagnostics module (that was introduced in a
previous tutorial), and provides you with a useful way to automatically trace any errors raised by individual
handlers.
The GetType and GetRows functions have been updated to make use of two new helper functions—
CalculateSchema and CalculateUrl. Right now the implementations of those functions are fairly straightforward
—you'll notice they contain parts of what was previously done by the GetEntity function.
Finally, you'll notice that you're defining an internal function ( View ) that accepts a state parameter. As you
implement more handlers, they will recursively call the internal View function, updating and passing along
state as they go.
Update the TripPinNavTable function once again, replacing the call to TripPin.SuperSimpleView with a call to the
new TripPin.View function, and re-run the unit tests. You won't see any new functionality yet, but you now have
a solid baseline for testing.
NOTE
The Error on Folding Failure setting is an "all or nothing" approach. If you want to test queries that aren't designed to
fold as part of your unit tests, you'll need to add some conditional logic to enable/disable tests accordingly.
The remaining sections of this tutorial will each add a new Table.View handler. You'll be taking a Test Driven
Development (TDD) approach, where you first add failing unit tests, and then implement the M code to resolve
them.
Each handler section below will describe the functionality provided by the handler, the OData equivalent query
syntax, the unit tests, and the implementation. Using the scaffolding code described above, each handler
implementation requires two changes:
Adding the handler to Table.View that will update the state record.
Modifying CalculateUrl to retrieve the values from the state and add to the url and/or query string
parameters.
Handling Table.FirstN with OnTake
The OnTake handler receives a count parameter, which is the maximum number of rows to take. In OData
terms, you can translate this to the $top query parameter.
You'll use the following unit tests:
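The tests look something like this (the expected rows are assumptions based on the TripPin sample data, which the existing $skip tests below also rely on):

// OnTake
Fact("Fold $top 1 on Airlines",
    #table( type table [AirlineCode = text, Name = text] , {{"AA", "American Airlines"}} ),
    Table.FirstN(Airlines, 1)
),
Fact("Fold $top 0 on Airlines",
    #table( type table [AirlineCode = text, Name = text] , {} ),
    Table.FirstN(Airlines, 0)
),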
These tests both use Table.FirstN to filter the result set to the first X rows. If you run these tests
with Error on Folding Failure set to False (the default), the tests should succeed, but if you run Fiddler (or
check the trace logs), you'll see that the request you send doesn't contain any OData query parameters.
If you set Error on Folding Failure to True , they will fail with the "Please try a simpler expression." error. To
fix this, you'll define your first Table.View handler for OnTake .
The OnTake handler looks like this:
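Inside the View function's handlers record, a sketch of OnTake might look like:

OnTake = (count as number) =>
    let
        // record the requested row count; CalculateUrl translates it to $top
        newState = state & [ Top = count ]
    in
        @View(newState),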
The CalculateUrl function is updated to extract the Top value from the state record, and set the right
parameter in the query string.
qsWithTop = if (state[Top]? <> null) then [ #"$top" = Number.ToText(state[Top]) ] else [],
encodedQueryString = Uri.BuildQueryString(qsWithTop),
finalUrl = urlWithEntity & "?" & encodedQueryString
in
    finalUrl
Rerunning the unit tests, you can see that the URL you are accessing now contains the $top parameter. (Note
that due to URL encoding, $top appears as %24top , but the OData service is smart enough to convert it
automatically).
// OnSkip
Fact("Fold $skip 14 on Airlines",
#table( type table [AirlineCode = text, Name = text] , {{"EK", "Emirates"}} ),
Table.Skip(Airlines, 14)
),
Fact("Fold $skip 0 and $top 1",
#table( type table [AirlineCode = text, Name = text] , {{"AA", "American Airlines"}} ),
Table.FirstN(Table.Skip(Airlines, 0), 1)
),
Implementation:
qsWithSkip =
if (state[Skip]? <> null) then
qsWithTop & [ #"$skip" = Number.ToText(state[Skip]) ]
else
qsWithTop,
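The matching OnSkip handler (a sketch, following the same pattern as OnTake) simply records the value in the state record:

OnSkip = (count as number) => @View(state & [ Skip = count ]),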
// OnSelectColumns
Fact("Fold $select single column",
#table( type table [AirlineCode = text] , {{"AA"}} ),
Table.FirstN(Table.SelectColumns(Airlines, {"AirlineCode"}), 1)
),
Fact("Fold $select multiple column",
#table( type table [UserName = text, FirstName = text, LastName = text],{{"russellwhyte", "Russell",
"Whyte"}}),
Table.FirstN(Table.SelectColumns(People, {"UserName", "FirstName", "LastName"}), 1)
),
Fact("Fold $select with ignore column",
#table( type table [AirlineCode = text] , {{"AA"}} ),
Table.FirstN(Table.SelectColumns(Airlines, {"AirlineCode", "DoesNotExist"}, MissingField.Ignore), 1)
),
The first two tests select different numbers of columns with Table.SelectColumns , and include a Table.FirstN
call to simplify the test case.
NOTE
If the test were to simply return the column names (using Table.ColumnNames ) and not any data, the request to the
OData service would never actually be sent. This is because the call to GetType returns the schema, which contains all of
the information the M engine needs to calculate the result.
The third test uses the MissingField.Ignore option, which tells the M engine to ignore any selected columns that
don't exist in the result set. The OnSelectColumns handler does not need to worry about this option—the M
engine will handle it automatically (that is, missing columns won't be included in the columns list).
NOTE
The other option for Table.SelectColumns , MissingField.UseNull , requires a connector to implement the
OnAddColumn handler. This will be done in a subsequent lesson.
CalculateUrl is updated to retrieve the list of columns from the state, and combine them (with a separator) for
the $select parameter.
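A sketch of the handler and the corresponding CalculateUrl change (assuming the state field is named SelectColumns, and building on the qsWithSkip value shown earlier):

OnSelectColumns = (columns as list) =>
    @View(state & [ SelectColumns = columns ]),

qsWithSelect =
    if (state[SelectColumns]? <> null) then
        qsWithSkip & [ #"$select" = Text.Combine(state[SelectColumns], ",") ]
    else
        qsWithSkip,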
Handling Table.Sort with OnSort
The OnSort handler receives a list of records describing the column(s) to sort on and the sort direction for each.
In OData terms, this maps to the $orderby query parameter. You'll use the following unit tests:
// OnSort
Fact("Fold $orderby single column",
#table( type table [AirlineCode = text, Name = text], {{"TK", "Turkish Airlines"}}),
Table.FirstN(Table.Sort(Airlines, {{"AirlineCode", Order.Descending}}), 1)
),
Fact("Fold $orderby multiple column",
#table( type table [UserName = text], {{"javieralfred"}}),
Table.SelectColumns(Table.FirstN(Table.Sort(People, {{"LastName", Order.Ascending}, {"UserName",
Order.Descending}}), 1), {"UserName"})
)
Implementation:
// OnSort - receives a list of records containing two fields:
// [Name] - the name of the column to sort on
// [Order] - equal to Order.Ascending or Order.Descending
// If there are multiple records, the sort order must be maintained.
//
// OData allows you to sort on columns that do not appear in the result
// set, so we do not have to validate that the sorted columns are in our
// existing schema.
OnSort = (order as list) =>
let
// This will convert the list of records to a list of text,
// where each entry is "<columnName> <asc|desc>"
sorting = List.Transform(order, (o) =>
let
column = o[Name],
order = o[Order],
orderText = if (order = Order.Ascending) then "asc" else "desc"
in
column & " " & orderText
),
orderBy = Text.Combine(sorting, ", ")
in
@View(state & [ OrderBy = orderBy ]),
Updates to CalculateUrl :
qsWithOrderBy =
if (state[OrderBy]? <> null) then
qsWithSelect & [ #"$orderby" = state[OrderBy] ]
else
qsWithSelect,
Handling Table.RowCount with GetRowCount
Unlike the other handlers, which fold transformations into the query, the GetRowCount handler returns a single
scalar value: the number of rows expected in the result set. In an OData query, there are two ways to retrieve this count:
The $count query parameter, which returns the count as a separate field in the result set.
The /$count path segment, which will return only the total count, as a scalar value.
The downside of the query parameter approach is that you still need to send the entire query to the OData
service. Since the count comes back inline as part of the result set, you'll have to process the first page of data
from the result set. While this is still more efficient than reading the entire result set and counting the rows, it's
probably still more work than you want to do.
The advantage of the path segment approach is that you'll only receive a single scalar value in the result. This
makes the entire operation a lot more efficient. However, as described in the OData specification, the /$count
path segment will return an error if you include other query parameters, such as $top or $skip , which limits
its usefulness.
In this tutorial, you'll implement the GetRowCount handler using the path segment approach. To avoid the errors
you'd get if other query parameters are included, you'll check for other state values, and return an
"unimplemented error" ( ... ) if you find any. Returning any error from a Table.View handler tells the M engine
that the operation cannot be folded, and that it should fall back to the default handler instead (which in this case
would be counting the total number of rows).
First, add a simple unit test:
// GetRowCount
Fact("Fold $count", 15, Table.RowCount(Airlines)),
Since the /$count path segment returns a single value (in plain/text format) rather than a JSON result set, you'll
also have to add a new internal function ( TripPin.Scalar ) for making the request and handling the result.
The implementation will then use this function (if no other query parameters are found in the state ):
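A sketch of what these pieces might look like (the Accept header is illustrative, and the check assumes the base state carries only Url and Entity fields, as suggested by the urlWithEntity value in CalculateUrl):

// Returns the text response for the given URL (used here for the /$count result)
TripPin.Scalar = (url as text) as text =>
    let
        headers = [ #"Accept" = "text/plain" ],
        response = Web.Contents(url, [ Headers = headers ])
    in
        Text.FromBinary(response);

GetRowCount = () as number =>
    if (Record.FieldCount(Record.RemoveFields(state, {"Url", "Entity"}, MissingField.Ignore)) > 0) then
        // other query state is present - return an unimplemented error so the
        // M engine falls back to its default (local) row counting behavior
        ...
    else
        let
            newState = state & [ RowCountOnly = true ],
            finalUrl = CalculateUrl(newState),
            value = TripPin.Scalar(finalUrl),
            converted = Number.FromText(value)
        in
            converted,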
The CalculateUrl function is updated to append /$count to the URL if the RowCountOnly field is set in the
state .
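For example, a sketch:

urlWithRowCount =
    if (state[RowCountOnly]? = true) then
        urlWithEntity & "/$count"
    else
        urlWithEntity,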
Then add a test that uses both Table.RowCount and Table.FirstN to force the error.
An important note here is that this test will now return an error if Error on Folding Failure is set to false,
because the Table.RowCount operation will fall back to the local (default) handler. Running the tests with Error
on Folding Failure set to true will cause Table.RowCount to fail, allowing the test to succeed.
Conclusion
Implementing Table.View for your connector adds a significant amount of complexity to your code. Because the M
engine can already process all transformations locally, adding Table.View handlers does not enable new scenarios for
your users, but it does result in more efficient processing (and potentially, happier users). One of the main
advantages of the Table.View handlers being optional is that it allows you to incrementally add new
functionality without impacting backwards compatibility for your connector.
For most connectors, an important (and basic) handler to implement is OnTake (which translates to $top in
OData), as it limits the number of rows returned. The Power Query experience will always perform an OnTake of
1000 rows when displaying previews in the navigator and query editor, so your users might see significant
performance improvements when working with larger data sets.
GitHub Connector Sample
5/25/2022 • 7 minutes to read • Edit Online
The GitHub M extension shows how to add support for an OAuth 2.0 protocol authentication flow. You can learn
more about the specifics of GitHub's authentication flow on the GitHub Developer site.
Before you get started creating an M extension, you need to register a new app on GitHub, and replace the
client_id and client_secret files with the appropriate values for your app.
Note about compatibility issues in Visual Studio: The Power Query SDK uses an Internet Explorer based
control to pop up OAuth dialogs. GitHub has deprecated its support for the version of IE used by this control,
which will prevent you from completing the permission grant for your app when run from within Visual Studio. An
alternative is to load the extension with Power BI Desktop and complete the first OAuth flow there. After your
application has been granted access to your account, subsequent logins will work fine from Visual Studio.
NOTE
To allow Power BI to obtain and use the access_token, you must specify the redirect url as
https://oauth.powerbi.com/views/oauthredirect.html.
When you specify this URL and GitHub successfully authenticates and grants permissions, GitHub will redirect to
Power BI's oauthredirect endpoint so that Power BI can retrieve the access_token and refresh_token.
NOTE
A registered OAuth application is assigned a unique Client ID and Client Secret. The Client Secret should not be shared.
You get the Client ID and Client Secret from the GitHub application page. Update the files in your Data Connector project
with the Client ID ( client_id file) and Client Secret ( client_secret file).
//
// Data Source definition
//
GithubSample = [
Authentication = [
OAuth = [
StartLogin = StartLogin,
FinishLogin = FinishLogin
]
],
Label = Extension.LoadString("DataSourceLabel")
];
Step 2 - Provide details so the M engine can start the OAuth flow
The GitHub OAuth flow starts when you direct users to the https://github.com/login/oauth/authorize page. For
the user to log in, you need to specify a number of query parameters:
If this is the first time the user is logging in with your app (identified by its client_id value), they'll see a page
that asks them to grant access to your app. Subsequent login attempts will simply ask for their credentials.
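A sketch of a StartLogin implementation that builds this authorization URL (the scope value and window dimensions are illustrative; client_id and redirect_uri are assumed to be defined elsewhere in the extension):

StartLogin = (resourceUrl, state, display) =>
    let
        authorizeUrl = "https://github.com/login/oauth/authorize?" & Uri.BuildQueryString([
            client_id = client_id,        // your app's client ID
            scope = "user, repo",         // requested GitHub scopes (illustrative)
            state = state,                // anti-forgery value provided by the M engine
            redirect_uri = redirect_uri   // must match the registered redirect URL
        ])
    in
        [
            LoginUri = authorizeUrl,
            CallbackUri = redirect_uri,
            WindowHeight = 1024,
            WindowWidth = 720,
            Context = null
        ];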
Step 3 - Convert the code received from GitHub into an access_token
If the user completes the authentication flow, GitHub redirects back to the Power BI redirect URL with a
temporary code in a code parameter, as well as the state you provided in the previous step in a state
parameter. Your FinishLogin function will extract the code from the callbackUri parameter, and then exchange
it for an access token (using the TokenMethod function).
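A sketch of how FinishLogin might do this:

FinishLogin = (context, callbackUri, state) =>
    let
        // the callback URI contains the temporary code in its query string
        parts = Uri.Parts(callbackUri)[Query]
    in
        TokenMethod(parts[code]);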
To get a GitHub access token, you pass the temporary code from the GitHub Authorize Response. In the
TokenMethod function, you formulate a POST request to GitHub's access_token endpoint (
https://github.com/login/oauth/access_token ). The following parameters are required for the GitHub endpoint:
Here are the details of the parameters used for the Web.Contents call:
options – A record to control the behavior of this function. Not used in this case.
This code snippet describes how to implement a TokenMethod function to exchange an auth code for an access
token.
TokenMethod = (code) =>
let
Response = Web.Contents("https://Github.com/login/oauth/access_token", [
Content = Text.ToBinary(Uri.BuildQueryString([
client_id = client_id,
client_secret = client_secret,
code = code,
redirect_uri = redirect_uri])),
Headers=[#"Content-type" = "application/x-www-form-urlencoded",#"Accept" =
"application/json"]]),
Parts = Json.Document(Response)
in
Parts;
The JSON response from the service will contain an access_token field. The TokenMethod function converts the
JSON response into an M record using Json.Document, and returns it to the engine.
Sample response:
{
"access_token":"e72e16c7e42f292c6912e7710c838347ae178b4a",
"scope":"user,repo",
"token_type":"bearer"
}
[DataSource.Kind="GithubSample", Publish="GithubSample.UI"]
shared GithubSample.Contents = Value.ReplaceType(Github.Contents, type function (url as Uri.Type) as any);
[DataSource.Kind="GithubSample"]
shared GithubSample.PagedTable = Value.ReplaceType(Github.PagedTable, type function (url as Uri.Type) as
nullable table);
The GithubSample.Contents function is also published to the UI (allowing it to appear in the Get Data dialog).
The Value.ReplaceType function is used to set the function parameter to the Uri.Type ascribed type.
By associating these functions with the GithubSample data source kind, they'll automatically use the credentials
that the user provided. Any M library functions that have been enabled for extensibility (such as Web.Contents)
will automatically inherit these credentials as well.
For more details on how credential and authentication works, see Handling Authentication.
Sample URL
This connector is able to retrieve formatted data from any of the GitHub v3 REST API endpoints. For example,
the query to pull all commits to the Data Connectors repo would look like this:
GithubSample.Contents("https://api.github.com/repos/microsoft/dataconnectors/commits")
List of Samples
5/25/2022 • 2 minutes to read • Edit Online
We maintain a list of samples on the DataConnectors repo on GitHub. Each of the links below links to a folder in
the sample repository. Generally these folders include a readme, one or more .pq / .query.pq files, a project file
for Visual Studio, and in some cases icons. To open these files in Visual Studio, make sure you've set up the SDK
properly, and run the .mproj file from the cloned or downloaded folder.
Functionality
SAMPLE | DESCRIPTION | LINK
Hello World | This simple sample shows the basic structure of a connector. | GitHub Link
Hello World with Docs | Similar to the Hello World sample, this sample shows how to add documentation to a shared function. | GitHub Link
Unit Testing | This sample shows how you can add simple unit testing to your <extension>.query.pq file. | GitHub Link
OAuth
SAMPLE | DESCRIPTION | LINK
ODBC
SAMPLE | DESCRIPTION | LINK
Hive LLAP | This connector sample uses the Hive ODBC driver, and is based on the connector template. | GitHub Link
Direct Query for SQL | This sample creates an ODBC-based custom connector that enables Direct Query for SQL Server. | GitHub Link
TripPin
SAMPLE | DESCRIPTION | LINK
Additional connector functionality
This article provides information about different types of additional connector functionality that connector
developers might want to invest in. For each type, this article outlines availability and instructions to enable the
functionality.
Authentication
While implementing authentication is covered in the authentication article, there are other methods that
connector owners might be interested in offering.
Windows authentication
Windows authentication is supported. To enable Windows-based authentication in your connector, add the
following line in the Authentication section of your connector.
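A sketch of what that entry might look like (the SupportsAlternateCredentials field is only needed if you want to offer the alternative-credentials option described below):

Windows = [ SupportsAlternateCredentials = true ]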
This change will expose Windows authentication as an option in the Power BI Desktop authentication experience.
The SupportsAlternateCredentials flag will expose the option to "Connect using alternative credentials".
After this flag is enabled, you can specify explicit Windows account credentials (username and password). You
can use this feature to test impersonation by providing your own account credentials.
Single sign-on authentication
This section outlines options available for implementing single sign-on (SSO) functionality into your certified
connector. Currently, there is no support for "plug and play" extensibility for SSO. Enabling SSO would require
changes and collaboration both on the Microsoft and data source or connector sides, so reach out to your
Microsoft contact prior to starting work.
Azure Active Directory SSO
Azure Active Directory (Azure AD)-based SSO is supported in cloud scenarios. The data source must accept
Azure AD access tokens, as the Power BI Azure AD user token will be exchanged with a data source token from
Azure AD. If you have a certified connector, reach out to your Microsoft contact to learn more.
Kerberos SSO
Kerberos-based single sign-on is supported in gateway scenarios. The data source must support Windows
authentication. Generally, these scenarios involve Direct Query-based reports, and a connector based on an
ODBC driver. The primary requirements for the driver are that it can determine Kerberos configuration settings
from the current thread context, and that it supports thread-based user impersonation. The gateway must be
configured to support Kerberos Constrained Delegation (KCD). An example can be found in the Impala sample
connector.
Power BI will send the current user information to the gateway. The gateway will use Kerberos Constrained
Delegation to invoke the query process as the impersonated user.
After making the above changes, the connector owner can test the following scenarios to validate functionality.
In Power BI Desktop: Windows impersonation (current user)
In Power BI Desktop: Windows impersonation using alternate credentials
In the gateway: Windows impersonation using alternate credentials, by pre-configuring the data source with
Windows account credentials in the Gateway Power BI Admin portal.
Connector developers can also use this procedure to test their implementation of Kerberos-based SSO.
1. Set up an on-premises data gateway with single sign-on enabled using instructions in the Power BI
Kerberos SSO documentation article.
2. Validate the setup by testing with SQL Server and Windows accounts. Set up the SQL Server Kerberos
configuration manager. If you can use Kerberos SSO with SQL Server then your Power BI data gateway is
properly set up to enable Kerberos SSO for other data sources as well.
3. Create an application (for example, a command-line tool) that connects to your server through your
ODBC driver. Ensure that your application can use Windows authentication for the connection.
4. Modify your test application so that it can take a username (UPN) as an argument and use the
WindowsIdentity constructor with it. Once complete, with the privileges granted to the gateway account
set up in Step 1, you should be able to obtain the user's AccessToken property and impersonate this
token.
5. Once you've made the changes to your application, ensure that you can use impersonation to load and
connect to your service through the ODBC driver. Ensure that data can be retrieved. If you want to use
native C or C++ code instead, you'll need to use LsaLogonUser to retrieve a token with just the username
and use the KERB_S4U_LOGON option.
After this functionality is validated, Microsoft will make a change to thread the UPN from the Power BI Service
down through the gateway. Once at the gateway, it will essentially act the same way as your test application to
retrieve data.
Reach out to your Microsoft contact prior to starting work to learn more on how to request this change.
SAML SSO
SAML-based SSO is often not supported by end data sources and isn't a recommended approach. If your
scenario requires the use of SAML-based SSO, reach out to your Microsoft contact or visit our documentation to
learn more.
Native query support
If a native database query is evaluated without the user having granted the Native Query permission, the evaluation
fails with an error similar to the following:
The evaluation requires a permission that has not been provided. Data source kind: 'Extension'. Data source
path: 'test'. Permission kind: 'Native Query'
section Extension;
Extension = [
// MakeResourcePath overrides the default Data Source Path creation logic that serializes
// all required parameters as a JSON encoded value. This is required to keep the data source
// path the same between the Extension.DataSource and Extension.Query functions. Alternatively,
// you can provide a function documentation type and use DataSource.Path = false for the query
// parameter to exclude it from the data source path calculation.
Type="Custom",
MakeResourcePath = (server) => server,
ParseResourcePath = (resource) => { resource },
// Use NativeQuery to enable a Native Database Query prompt in the Power Query user experience.
NativeQuery = (optional query) => query,
Authentication=[Anonymous=null]
];
When evaluated, if the parameter names of the data source function can be mapped to the parameter names of
the NativeQuery function on the data source definition, and the NativeQuery function returns text, then the call
site generates a native query prompt. In this case, Extension.Query("server", "select 1") generates a challenge
for the native query text select 1 , while Extension.DataSource("server") won't generate a native query
challenge.
Allowing users to use Direct Query over a custom SQL statement
Scenario : An end user can use Direct Query over native database queries.
Status: This feature is not currently supported in our extensibility SDK. The product team is investigating this
scenario and expects that it may eventually be possible for connectors with ODBC drivers and end
data sources supporting ANSI SQL92 "pass through" mode.
Workarounds : None.
Handling Authentication
5/25/2022 • 12 minutes to read • Edit Online
Authentication Kinds
An extension can support one or more kinds of Authentication. Each authentication kind is a different type of
credential. The authentication UI displayed to end users in Power Query is driven by the type of credential(s) that
an extension supports.
The list of supported authentication types is defined as part of an extension's Data Source Kind definition. Each
Authentication value is a record with specific fields. The following table lists the expected fields for each kind. All
fields are required unless marked otherwise.
The sample below shows the Authentication record for a connector that supports OAuth, Key, Windows, Basic
(Username and Password), and anonymous credentials.
Example:
Authentication = [
OAuth = [
StartLogin = StartLogin,
FinishLogin = FinishLogin,
Refresh = Refresh,
Logout = Logout
],
Key = [],
UsernamePassword = [],
Windows = [],
Implicit = []
]
The credential record returned by Extension.CurrentCredential() includes a Key field when the Key authentication
kind is used:
Key – The API key value. Note that the key value is also available in the Password field. By default, the mashup
engine will insert it in an Authorization header as if this value were a basic auth password (with no username). If
this is not the behavior you want, you must specify the ManualCredentials = true option in the options record.
(Used by: Key)
The following code sample accesses the current credential for an API key and uses it to populate a custom
header ( x-APIKey ).
Example:
let
    apiKey = Extension.CurrentCredential()[Key],
    headers = [
        #"x-APIKey" = apiKey,
        Accept = "application/vnd.api+json",
        #"Content-Type" = "application/json"
    ],
    request = Web.Contents(_url, [ Headers = headers, ManualCredentials = true ])
in
    request
There are two sets of OAuth function signatures; the original signature that contains a minimal number of
parameters, and an advanced signature that accepts additional parameters. Most OAuth flows can be
implemented using the original signatures. You can also mix and match signature types in your implementation.
The function calls are matched based on the number of parameters (and their types). The parameter names are
not taken into consideration.
See the Github sample for more details.
Original OAuth Signatures
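The signatures themselves aren't shown here; a sketch based on the handlers used in the GitHub sample above (parameter names are illustrative, and Refresh and Logout are optional):

StartLogin = (resourceUrl, state, display) => ...;

FinishLogin = (context, callbackUri, state) => ...;

Refresh = (resourceUrl, refreshToken) => ...;

Logout = (token) => ...;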
NOTE
If your data source requires scopes other than user_impersonation , or is incompatible with the use of
user_impersonation , then you should use the OAuth authentication kind.
NOTE
If you implement your own OAuth flow for Azure AD, users who have enabled Conditional Access for their tenant might
encounter issues when refreshing using the Power BI service. This won't impact gateway-based refresh, but would impact
a certified connector that supports refresh from the Power BI service. Users might run into a problem stemming from the
connector using a public client application when configuring web-based credentials through the Power BI service. The
access token generated by this flow will ultimately be used on a different computer (that is, the Power BI service in an
Azure data center, not on the company's network) than the one used to originally authenticate (that is, the computer of
the user who configures the data source credentials on the company's network). The built-in Aad type works around this
problem by using a different Azure AD client when configuring credentials in the Power BI service. This option won't be
available to connectors that use the OAuth authentication kind.
Most connectors will need to provide values for the AuthorizationUri and Resource fields. Both fields can be
text values, or a single argument function that returns a text value .
AuthorizationUri = "https://login.microsoftonline.com/common/oauth2/authorize"
Connectors that use a Uri based identifier do not need to provide a Resource value. By default, the value will be
equal to the root path of the connector's Uri parameter. If the data source's Azure AD resource is different than
the domain value (for example, it uses a GUID), then a Resource value needs to be provided.
Aad authentication kind samples
In this case, the data source supports global cloud Azure AD using the common tenant (no Azure B2B support).
Authentication = [
Aad = [
AuthorizationUri = "https://login.microsoftonline.com/common/oauth2/authorize",
Resource = "77256ee0-fe79-11ea-adc1-0242ac120002" // Azure AD resource value for your service - Guid
or URL
]
]
In this case, the data source supports tenant discovery based on OpenID Connect (OIDC) or similar protocol.
This allows the connector to determine the correct Azure AD endpoint to use based on one or more parameters
in the data source path. This dynamic discovery approach allows the connector to support Azure B2B.
// Implement this function to retrieve or calculate the service URL based on the data source path parameters
GetServiceRootFromDataSourcePath = (dataSourcePath) as text => ...;
Authentication = [
Aad = [
AuthorizationUri = (dataSourcePath) =>
GetAuthorizationUrlFromWwwAuthenticate(
GetServiceRootFromDataSourcePath(dataSourcePath)
),
Resource = "https://myAadResourceValue.com", // Azure AD resource value for your service - Guid or
URL
]
]
You can see an example of how credentials are stored in the Data source settings dialog in Power BI Desktop.
In this dialog, the Kind is represented by an icon, and the Path value is displayed as text.
NOTE
If you change your data source function's required parameters during development, previously stored credentials will no
longer work (because the path values no longer match). You should delete any stored credentials any time you change
your data source function parameters. If incompatible credentials are found, you may receive an error at runtime.
The function has a single required parameter ( message ) of type text , and will be used to calculate the data
source path. The optional parameter ( count ) would be ignored. The path would be displayed
Credential prompt:
Data source settings UI:
When a Label value is defined, the data source path value wouldn't be shown:
NOTE
We currently recommend you do not include a Label for your data source if your function has required parameters, as
users won't be able to distinguish between the different credentials they've entered. We are hoping to improve this in the
future (that is, allowing data connectors to display their own custom data source paths).
As Uri.Type is an ascribed type rather than a primitive type in the M language, you'll need to use the
Value.ReplaceType function to indicate that your text parameter should be treated as a Uri.
[DataSource.Kind="HelloWorld", Publish="HelloWorld.Publish"]
shared HelloWorld.Contents = (optional message as text) =>
let
message = if (message <> null) then message else "Hello world"
in
message;
HelloWorld = [
Authentication = [
Implicit = []
],
Label = Extension.LoadString("DataSourceLabel")
];
Properties
The following table lists the fields for your Data Source definition record.
FIELD | TYPE | DETAILS
Publish to UI
Similar to the Data Source Kind definition record, the Publish record provides the Power Query
UI the information it needs to expose this extension in the Get Data dialog.
Example:
HelloWorld.Publish = [
Beta = true,
ButtonText = { Extension.LoadString("FormulaTitle"), Extension.LoadString("FormulaHelp") },
SourceImage = HelloWorld.Icons,
SourceTypeImage = HelloWorld.Icons
];
HelloWorld.Icons = [
Icon16 = { Extension.Contents("HelloWorld16.png"), Extension.Contents("HelloWorld20.png"),
Extension.Contents("HelloWorld24.png"), Extension.Contents("HelloWorld32.png") },
Icon32 = { Extension.Contents("HelloWorld32.png"), Extension.Contents("HelloWorld40.png"),
Extension.Contents("HelloWorld48.png"), Extension.Contents("HelloWorld64.png") }
];
Properties
The following table lists the fields for your Publish record.
Using M's built-in Odbc.DataSource function is the recommended way to create custom connectors for data
sources that have an existing ODBC driver and/or support a SQL query syntax. Wrapping the Odbc.DataSource
function allows your connector to inherit default query folding behavior based on the capabilities reported by
your driver. This will enable the M engine to generate SQL statements based on filters and other transformations
defined by the user within the Power Query experience, without having to provide this logic within the
connector itself.
ODBC extensions can optionally enable DirectQuery mode, allowing Power BI to dynamically generate queries at
runtime without pre-caching the user's data model.
NOTE
Enabling DirectQuery support raises the difficulty and complexity level of your connector. When DirectQuery is enabled,
Power BI prevents the M engine from compensating for operations that can't be fully pushed to the underlying data
source.
This article assumes familiarity with the creation of a basic custom connector.
Refer to the SqlODBC sample for most of the code examples in the following sections. Other samples can be
found in the ODBC samples directory.
Next steps
Parameters for Odbc.DataSource
Parameters for Odbc.DataSource
5/25/2022 • 15 minutes to read • Edit Online
The Odbc.DataSource function takes two parameters—a connectionString for your driver, and an options
record that lets you override various driver behaviors. Through the options record you can override capabilities
and other information reported by the driver, control the navigator behavior, and affect the SQL queries
generated by the M engine.
The supported options record fields fall into two categories—those that are public and always available, and
those that are only available in an extensibility context.
The following table describes the public fields in the options record.
FIELD | DETAILS
HierarchicalNavigation | A logical value that sets whether to view the tables grouped by their schema names. When set to false, tables are displayed in a flat list under each database. Default: false
The following table describes the options record fields that are only available through extensibility. Fields that
aren't simple literal values are described in later sections.
Overriding AstVisitor
The AstVisitor field is set through the Odbc.DataSource options record. It's used to modify SQL statements
generated for specific query scenarios.
NOTE
Drivers that support LIMIT and OFFSET clauses (rather than TOP) will want to provide a LimitClause override for
AstVisitor .
Constant
Providing an override for this value has been deprecated and may be removed from future implementations.
LimitClause
This field is a function that receives two Int64.Type arguments ( skip , take ), and returns a record with two
text fields ( Text , Location ).
The skip parameter is the number of rows to skip (that is, the argument to OFFSET). If an offset isn't specified,
the skip value will be null. If your driver supports LIMIT, but doesn't support OFFSET, the LimitClause function
should return an unimplemented error (...) when skip is greater than 0.
The take parameter is the number of rows to take (that is, the argument to LIMIT).
The Text field of the result contains the SQL text to add to the generated query.
The Location field specifies where to insert the clause. The following table describes supported values.
VALUE | DESCRIPTION | EXAMPLE
The following code snippet provides a LimitClause implementation for a driver that expects a LIMIT clause, with
an optional OFFSET, in the following format: [OFFSET <offset> ROWS] LIMIT <row_count>
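A sketch of such an implementation (assuming "AfterQuerySpecification" is the appropriate Location value for appending the clause to the end of the generated query):

LimitClause = (skip, take) =>
    let
        offset = if (skip > 0) then Text.Format("OFFSET #{0} ROWS", {skip}) else "",
        limit = if (take <> null) then Text.Format("LIMIT #{0}", {take}) else ""
    in
        [
            Text = Text.Format("#{0} #{1}", {offset, limit}),
            Location = "AfterQuerySpecification"
        ]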
The following code snippet provides a LimitClause implementation for a driver that supports LIMIT, but not
OFFSET. Format: LIMIT <row_count> .
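And a sketch for the LIMIT-only case, returning the unimplemented error described above when an offset is requested:

LimitClause = (skip, take) =>
    if (skip > 0) then
        ...   // unimplemented - this driver does not support OFFSET
    else
        [
            Text = Text.Format("LIMIT #{0}", {take}),
            Location = "AfterQuerySpecification"
        ]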
Overriding SqlCapabilities
FIELD | DETAILS
SupportsTop | A logical value that indicates the driver supports the TOP clause to limit the number of returned rows. Default: false
StringLiteralEscapeCharacters | A list of text values that specify the character(s) to use when escaping string literals and LIKE expressions. Example: {""}. Default: null
Overriding SQLColumns
SQLColumns is a function handler that receives the results of an ODBC call to SQLColumns. The source
parameter contains a table with the data type information. This override is typically used to fix up data type
mismatches between calls to SQLGetTypeInfo and SQLColumns .
For details of the format of the source table parameter, go to SQLColumns Function.
Overriding SQLGetFunctions
This field is used to override SQLFunctions values returned by an ODBC driver. It contains a record whose field
names are equal to the FunctionId constants defined for the ODBC SQLGetFunctions function. Numeric
constants for each of these fields can be found in the ODBC specification.
The following code snippet provides an example explicitly telling the M engine to use CAST rather than
CONVERT.
SQLGetFunctions = [
SQL_CONVERT_FUNCTIONS = 0x2 /* SQL_FN_CVT_CAST */
]
Overriding SQLGetInfo
This field is used to override SQLGetInfo values returned by an ODBC driver. It contains a record whose field
names are equal to the InfoType constants defined for the ODBC SQLGetInfo function. Numeric constants for
each of these fields can be found in the ODBC specification. The full list of InfoTypes that are checked can be
found in the mashup engine trace files.
The following table contains commonly overridden SQLGetInfo properties:
FIELD | DETAILS
SQL_AGGREGATE_FUNCTIONS | Bitmask specifying support for aggregation functions: SQL_AF_ALL, SQL_AF_AVG, SQL_AF_COUNT, SQL_AF_DISTINCT, SQL_AF_MAX, SQL_AF_MIN, SQL_AF_SUM
The following helper function can be used to create bitmask values from a list of integer values:
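The helper itself isn't included in this extract; a minimal sketch that ORs a list of flag values together:

// Combines a list of integer flag values into a single bitmask by OR-ing them together.
Flags = (flags as list) as number =>
    List.Accumulate(flags, 0, (bitmask, flag) => Number.BitwiseOr(bitmask, flag));

For example, Flags({1, 2, 16}) returns 19.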
Overriding SQLGetTypeInfo
SQLGetTypeInfo can be specified in two ways:
A fixed table value that contains the same type information as an ODBC call to SQLGetTypeInfo .
A function that accepts a table argument, and returns a table. The argument contains the original results of
the ODBC call to SQLGetTypeInfo . Your function implementation can modify or add to this table.
The first approach is used to completely override the values returned by the ODBC driver. The second approach
is used if you want to add to or modify these values.
For details of the format of the types table parameter and expected return value, go to SQLGetTypeInfo function
reference.
SQLGetTypeInfo using a static table
The following code snippet provides a static implementation for SQLGetTypeInfo .
SQLGetTypeInfo = #table(
{ "TYPE_NAME", "DATA_TYPE", "COLUMN_SIZE", "LITERAL_PREF", "LITERAL_SUFFIX", "CREATE_PARAS",
"NULLABLE", "CASE_SENSITIVE", "SEARCHABLE", "UNSIGNED_ATTRIBUTE", "FIXED_PREC_SCALE", "AUTO_UNIQUE_VALUE",
"LOCAL_TYPE_NAME", "MINIMUM_SCALE", "MAXIMUM_SCALE", "SQL_DATA_TYPE", "SQL_DATETIME_SUB", "NUM_PREC_RADIX",
"INTERNAL_PRECISION", "USER_DATA_TYPE" }, {
Next steps
Test and troubleshoot an ODBC-based connector
Test and troubleshoot an ODBC-based connector
5/25/2022 • 2 minutes to read • Edit Online
While you're building your ODBC-based connector, it's a good idea to occasionally test and troubleshoot the
connector. This section describes how to set up and use some test and troubleshooting tools.
NOTE
The DAX CONCATENATE function isn't currently supported by Power Query/ODBC extensions. Extension authors should
ensure string concatenation works through the query editor by adding calculated columns (
[stringCol1] & [stringCol2] ). When the capability to fold the CONCATENATE operation is added in the future, it
should work seamlessly with existing extensions.
Handling Resource Path
5/25/2022 • 2 minutes to read • Edit Online
The M engine identifies a data source using a combination of its Kind and Path. When a data source is
encountered during a query evaluation, the M engine will try to find matching credentials. If no credentials are
found, the engine returns a special error that results in a credential prompt in Power Query.
The Kind value comes from Data Source Kind definition.
The Path value is derived from the required parameters of your data source function(s). Optional parameters
aren't factored into the data source path identifier. As a result, all data source functions associated with a data
source kind must have the same parameters. There's special handling for functions that have a single parameter
of type Uri.Type . See below for further details.
You can see an example of how credentials are stored in the Data source settings dialog in Power BI Desktop.
In this dialog, the Kind is represented by an icon, and the Path value is displayed as text.
[Note] If you change your data source function's required parameters during development, previously
stored credentials will no longer work (because the path values no longer match). You should delete any
stored credentials any time you change your data source function parameters. If incompatible credentials
are found, you may receive an error at runtime.
The function has a single required parameter ( message ) of type text , and will be used to calculate the data
source path. The optional parameter ( count ) will be ignored. The path would be displayed as follows:
Credential prompt:
When a Label value is defined, the data source path value won't be shown:
[Note] We currently recommend that you do not include a Label for your data source if your function has
required parameters, as users won't be able to distinguish between the different credentials they've entered.
We are hoping to improve this in the future (that is, allowing data connectors to display their own custom
data source paths).
As Uri.Type is an ascribed type rather than a primitive type in the M language, you'll need to use the
Value.ReplaceType function to indicate that your text parameter should be treated as a Uri.
Handling Paging
REST APIs typically have some mechanism to transmit large volumes of records broken up into pages of results.
Power Query has the flexibility to support many different paging mechanisms. However, since each paging
mechanism is different, some amount of modification of the paging examples is likely to be necessary to fit your
situation.
Typical Patterns
The heavy lifting of compiling all page results into a single table is performed by the Table.GenerateByPage()
helper function, which can generally be used with no modification. The code snippets presented in the
Table.GenerateByPage() helper function section describe how to implement some common paging patterns.
Regardless of pattern, you'll need to understand:
1. How do you request the next page of data?
2. Does the paging mechanism involve calculating values, or do you extract the URL for the next page from the
response?
3. How do you know when to stop paging?
4. Are there parameters related to paging (such as "page size") that you should be aware of?
Handling Transformations
5/25/2022 • 3 minutes to read • Edit Online
For situations where the data source response isn't presented in a format that Power BI can consume directly,
Power Query can be used to perform a series of transformations.
Static Transformations
In most cases, the data is presented in a consistent way by the data source: column names, data types, and
hierarchical structure are consistent for a given endpoint. In this situation it's appropriate to always apply the
same set of transformations to get the data in a format acceptable to Power BI.
An example of static transformation can be found in the TripPin Part 2 - Data Connector for a REST Service
tutorial when the data source is treated as a standard REST service:
let
Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/Airlines"),
value = Source[value],
toTable = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
expand = Table.ExpandRecordColumn(toTable, "Column1", {"AirlineCode", "Name"}, {"AirlineCode", "Name"})
in
expand
It's important to note that a sequence of static transformations of this specificity is only applicable to a single
endpoint. In the example above, this sequence of transformations will only work if "AirlineCode" and "Name"
exist in the REST endpoint response, since they are hard-coded into the M code. Thus, this sequence of
transformations may not work if you try to hit the /Event endpoint.
This high level of specificity may be necessary for pushing data to a navigation table, but for more general data
access functions it's recommended that you only perform transformations that are appropriate for all endpoints.
NOTE
Be sure to test transformations under a variety of data circumstances. If the user doesn't have any data at the
/airlines endpoint, do your transformations result in an empty table with the correct schema? Or is an error
encountered during evaluation? See TripPin Part 7: Advanced Schema with M Types for a discussion on unit testing.
Dynamic Transformations
More complex logic is sometimes needed to convert API responses into stable and consistent forms appropriate
for Power BI data models.
Inconsistent API Responses
Basic M control flow (if statements, HTTP status codes, try...catch blocks, and so on) is typically sufficient to
handle situations where there are a handful of ways in which the API responds.
Determining Schema On-The -Fly
Some APIs are designed such that multiple pieces of information must be combined to get the correct tabular
format. Consider Smartsheet's /sheets endpoint response, which contains an array of column names and an
array of data rows. The Smartsheet Connector is able to parse this response in the following way:
raw = Web.Contents(...),
columns = raw[columns],
columnTitles = List.Transform(columns, each [title]),
columnTitlesWithRowNumber = List.InsertRange(columnTitles, 0, {"RowNumber"}),
1. First deal with column header information. You can pull the title record of each column into a List,
prepending with a RowNumber column that you know will always be represented as this first column.
2. Next you can define a function that allows you to parse a row into a List of cell value s. You can again
prepend rowNumber information.
3. Apply your RowAsList() function to each of the row s returned in the API response.
4. Convert the List to a table, specifying the column headers.
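A sketch of how steps 2 through 4 might continue the snippet above (the cells, value, and rowNumber field names are assumptions about the shape of the Smartsheet response):

RowAsList = (row) =>
    let
        listOfCells = row[cells],
        // use the cell's value when present, otherwise null
        cellValuesList = List.Transform(listOfCells, each if Record.HasFields(_, "value") then [value] else null),
        // prepend the row number so it lines up with the RowNumber column title
        rowNumberFirst = List.InsertRange(cellValuesList, 0, {row[rowNumber]})
    in
        rowNumberFirst,

listOfRows = List.Transform(raw[rows], each RowAsList(_)),
result = Table.FromRows(listOfRows, columnTitlesWithRowNumber)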
Handling Schema
5/25/2022 • 7 minutes to read • Edit Online
Depending on your data source, information about data types and column names may or may not be provided
explicitly. OData REST APIs typically handle this using the $metadata definition, and the Power Query
OData.Feed method automatically handles parsing this information and applying it to the data returned from an
OData source.
Many REST APIs don't have a way to programmatically determine their schema. In these cases you'll need to
include a schema definition in your connector.
Consider the following code that returns a simple table from the TripPin OData sample service:
let
url = "https://services.odata.org/TripPinWebApiService/Airlines",
source = Json.Document(Web.Contents(url))[value],
asTable = Table.FromRecords(source)
in
asTable
NOTE
TripPin is an OData source, so realistically it would make more sense to simply use the OData.Feed function's automatic
schema handling. In this example you'll be treating the source as a typical REST API and using Web.Contents to
demonstrate the technique of hardcoding a schema by hand.
You can use the handy Table.Schema function to check the data type of the columns:
let
url = "https://services.odata.org/TripPinWebApiService/Airlines",
source = Json.Document(Web.Contents(url))[value],
asTable = Table.FromRecords(source)
in
Table.Schema(asTable)
Both AirlineCode and Name are of any type. Table.Schema returns a lot of metadata about the columns in a
table, including names, positions, type information, and many advanced properties such as Precision, Scale, and
MaxLength. For now you should only concern yourself with the ascribed type ( TypeName ), primitive type ( Kind ),
and whether the column value might be null ( IsNullable ).
Defining a Simple Schema Table
Your schema table will be composed of two columns:
C O L UM N DETA IL S
Name The name of the column. This must match the name in the
results returned by the service.
Type The M data type you're going to set. This can be a primitive
type (text, number, datetime, and so on), or an ascribed type
(Int64.Type, Currency.Type, and so on).
The hardcoded schema table for the Airlines table will set its AirlineCode and Name columns to text and
looks like this:
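Based on that description, the schema table would look like this:

Airlines = #table({"Name", "Type"}, {
    {"AirlineCode", type text},
    {"Name", type text}
})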
As you look to some of the other endpoints, consider the following schema tables:
The Airports table has four fields you'll want to keep (including one of type record ):
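A sketch of what that might look like (the exact field names here are assumptions; the record-typed column corresponds to the Location complex type defined later in this article):

Airports = #table({"Name", "Type"}, {
    {"Name", type text},
    {"IcaoCode", type text},
    {"IataCode", type text},
    {"Location", type record}
})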
The People table has seven fields, including list s ( Emails , AddressInfo ), a nullable column ( Gender ), and a
column with an ascribed type ( Concurrency ):
People = #table({"Name", "Type"}, {
{"UserName", type text},
{"FirstName", type text},
{"LastName", type text},
{"Emails", type list},
{"AddressInfo", type list},
{"Gender", type nullable text},
{"Concurrency", Int64.Type}
})
You can put all of these tables into a single master schema table SchemaTable :
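For example (the column names in this lookup table are assumptions; the important part is mapping each entity name to its schema table):

SchemaTable = #table({"Entity", "SchemaTable"}, {
    {"Airlines", Airlines},
    {"Airports", Airports},
    {"People", People}
})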
NOTE
The last step to set the table type will remove the need for the Power Query UI to infer type information when viewing
the results in the query editor, which can sometimes result in a double-call to the API.
Sophisticated Approach
The hardcoded implementation discussed above does a good job of making sure that schemas remain
consistent for simple JSON responses, but it's limited to parsing the first level of the response. Deeply nested
data sets would benefit from the following approach, which takes advantage of M Types.
Here is a quick refresh about types in the M language from the Language Specification:
A type value is a value that classifies other values. A value that is classified by a type is said to conform
to that type. The M type system consists of the following kinds of types:
Primitive types, which classify primitive values ( binary , date , datetime , datetimezone , duration ,
list , logical , null , number , record , text , time , type ) and also include a number of abstract
types ( function , table , any , and none ).
Record types, which classify record values based on field names and value types.
List types, which classify lists using a single item base type.
Function types, which classify function values based on the types of their parameters and return values.
Table types, which classify table values based on column names, column types, and keys.
Nullable types, which classify the value null in addition to all the values classified by a base type.
Type types, which classify values that are types.
Using the raw JSON output you get (and/or by looking up the definitions in the service's $metadata), you can
define the following record types to represent OData complex types:
LocationType = type [
Address = text,
City = CityType,
Loc = LocType
];
CityType = type [
CountryRegion = text,
Name = text,
Region = text
];
LocType = type [
#"type" = text,
coordinates = {number},
crs = CrsType
];
CrsType = type [
#"type" = text,
properties = record
];
Notice how LocationType references the CityType and LocType to represent its structured columns.
For the top-level entities that you'll want represented as Tables, you can define table types:
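A sketch for the Airlines entity (Airports and People would follow the same pattern, referencing the record types defined above where needed):

AirlinesType = type table [
    AirlineCode = text,
    Name = text
];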
You can then update your SchemaTable variable (which you can use as a lookup table for entity-to-type
mappings) to use these new type definitions:
SchemaTable = #table({"Entity", "Type"}, {
{"Airlines", AirlinesType},
{"Airports", AirportsType},
{"People", PeopleType}
});
You can rely on a common function ( Table.ChangeType ) to enforce a schema on your data, much like you used
SchemaTransformTable in the earlier exercise. Unlike SchemaTransformTable , Table.ChangeType takes an actual M
table type as an argument, and will apply your schema recursively for all nested types. Its signature is:
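The signature looks something like this (the full implementation is provided with the extended sample mentioned below):

Table.ChangeType = (table, tableType as type) as nullable table => ...;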
NOTE
For flexibility, the function can be used on tables as well as lists of records (which is how tables are represented in a JSON
document).
You'll then need to update the connector code to change the schema parameter from a table to a type , and
add a call to Table.ChangeType . Again, the details for doing so are very implementation-specific and thus not
worth going into in detail here. This extended TripPin connector example demonstrates an end-to-end solution
implementing this more sophisticated approach to handling schema.
Status Code Handling with Web.Contents
5/25/2022 • 2 minutes to read • Edit Online
The Web.Contents function has some built in functionality for dealing with certain HTTP status codes. The
default behavior can be overridden in your extension using the ManualStatusHandling field in the options record.
Automatic retry
Web.Contents will automatically retry requests that fail with one of the following status codes:
CODE | STATUS
Requests will be retried up to 3 times before failing. The engine uses an exponential back-off algorithm to
determine how long to wait until the next retry, unless the response contains a Retry-after header. When the
header is found, the engine will wait the specified number of seconds before the next retry. The minimum
supported wait time is 0.5 seconds, and the maximum value is 120 seconds.
NOTE
The Retry-after value must be in the delta-seconds format. The HTTP-date format is currently not supported.
Authentication exceptions
The following status codes will result in a credentials exception, causing an authentication prompt asking the
user to provide credentials (or re-login in the case of an expired OAuth token).
CODE | STATUS
401 | Unauthorized
403 | Forbidden
NOTE
Extensions are able to use the ManualStatusHandling option with status codes 401 and 403, which is not something
that can be done in Web.Contents calls made outside of an extension context (that is, directly from Power Query).
Redirection
The following status codes will result in an automatic redirect to the URI specified in the Location header. A
missing Location header will result in an error.
CODE | STATUS
302 | Found
NOTE
Only status code 307 will keep a POST request method. All other redirect status codes will result in a switch to GET .
Wait-Retry Pattern
5/25/2022 • 2 minutes to read • Edit Online
In some situations, a data source's behavior doesn't match that expected by Power Query's default HTTP code
handling. The examples below show how to work around this situation.
In this scenario you'll be working with a REST API that occasionally returns a 500 status code, indicating an
internal server error. In these instances, you could wait a few seconds and retry, potentially a few times before
you give up.
ManualStatusHandling
If Web.Contents gets a 500 status code response, it throws a DataSource.Error by default. You can override this
behavior by providing a list of codes as an optional argument to Web.Contents :
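For example, a minimal sketch:

response = Web.Contents(url, [ ManualStatusHandling = {500} ])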
By specifying the status codes in this way, Power Query will continue to process the web response as normal.
However, normal response processing is often not appropriate in these cases. You'll need to understand that an
abnormal response code has been received and perform special logic to handle it. To determine the response
code that was returned from the web service, you can access it from the meta Record that accompanies the
response:
responseCode = Value.Metadata(response)[Response.Status]
Based on whether responseCode is 200 or 500, you can either process the result as normal, or follow your wait-
retry logic that you'll flesh out in the next section.
IsRetry
Power Query has a local cache that stores the results of previous calls to Web.Contents. When polling the same
URL for a new response, or when retrying after an error status, you'll need to ensure that the query ignores any
cached results. You can do this by including the IsRetry option in the call to the Web.Contents function. In this
sample, we'll set IsRetry to true after the first iteration of the Value.WaitFor loop.
Value.WaitFor
Value.WaitFor() is a standard helper function that can usually be used with no modification. It works by
building a List of retry attempts.
producer Argument
This contains the task to be (possibly) retried. It's represented as a function so that the iteration number can be
used in the producer logic. The expected behavior is that producer will return null if a retry is determined to
be necessary. If anything other than null is returned by producer , that value is in turn returned by
Value.WaitFor .
delay Argument
This contains the logic to execute between retries. It's represented as a function so that the iteration number can
be used in the delay logic. The expected behavior is that delay returns a Duration.
count Argument (optional)
A maximum number of retries can be set by providing a number to the count argument.
let
waitForResult = Value.WaitFor(
(iteration) =>
let
result = Web.Contents(url, [ManualStatusHandling = {500}, IsRetry = iteration > 0]),
status = Value.Metadata(result)[Response.Status],
actualResult = if status = 500 then null else result
in
actualResult,
(iteration) => #duration(0, 0, 0, Number.Power(2, iteration)),
5)
in
if waitForResult = null then
error "Value.WaitFor() Failed after multiple retry attempts"
else
waitForResult
Handling Unit Testing
5/25/2022 • 2 minutes to read • Edit Online
For both simple and complex connectors, adding unit tests is a best practice and highly recommended.
Unit testing is accomplished in the context of Visual Studio's Power Query SDK. Each test is defined as a Fact
that has a name, an expected value, and an actual value. In most cases, the "actual value" will be an M expression
that tests part of your expression.
Consider a very simple extension that exports three functions:
section Unittesting;
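// A sketch of three such functions (the names and return values here are
// illustrative, not taken from the original sample):
shared UnitTesting.ReturnsABC = () => "ABC";
shared UnitTesting.Returns123 = () => "123";
shared UnitTesting.ReturnTableWithFiveRows = () => Table.Repeat(#table({"a"}, {{1}}), 5);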
This unit test code is made up of a number of Facts, and a bunch of common code for the unit test framework (
ValueToText , Fact , Facts , Facts.Summarize ). The following code provides an example set of Facts (see
UnitTesting.query.pq for the common code):
section UnitTestingTests;
shared MyExtension.UnitTest =
[
// Put any common variables here if you only want them to be evaluated once
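    // A sketch of how the Facts might be defined and summarized (the test names,
    // expected values, and function references below are illustrative):
    facts = {
        Fact("Check that this function returns 'ABC'",  // name of the test
            "ABC",                                      // expected value
            UnitTesting.ReturnsABC()                    // expression to evaluate
        ),
        Fact("Check that this function returns '123'",
            "123",
            UnitTesting.Returns123()
        ),
        Fact("Result should contain 5 rows",
            5,
            Table.RowCount(UnitTesting.ReturnTableWithFiveRows())
        )
    },

    report = Facts.Summarize(facts)
][report];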
Running the sample in Visual Studio will evaluate all of the Facts and give you a visual summary of the pass
rates:
Implementing unit testing early in the connector development process enables you to follow the principles of
test-driven development. Imagine that you need to write a function called Uri.GetHost that returns only the
host data from a URI. You might start by writing a test case to verify that the function appropriately performs the
expected function:
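A sketch of such a test, assuming the scheme-and-default-port behavior documented for the Uri.GetHost helper later in this document:

Fact("Returns the host from a URL",
    "https://bing.com:443",
    Uri.GetHost("https://bing.com/subpath/query?param=1&param2=hello")
),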
Additional tests can be written to ensure that the function appropriately handles edge cases.
An early version of the function might pass some but not all tests:
The final version of the function should pass all unit tests. This also makes it easy to ensure that future updates
to the function do not accidentally remove any of its basic functionality.
Helper Functions
5/25/2022 • 10 minutes to read • Edit Online
This topic contains a number of helper functions commonly used in M extensions. These functions may
eventually be moved to the official M library, but for now can be copied into your extension file code. You
shouldn't mark any of these functions as shared within your extension code.
Navigation Tables
Table.ToNavigationTable
This function adds the table type metadata needed for your extension to return a table value that Power Query
can recognize as a Navigation Tree. See Navigation Tables for more information.
Table.ToNavigationTable = (
table as table,
keyColumns as list,
nameColumn as text,
dataColumn as text,
itemKindColumn as text,
itemNameColumn as text,
isLeafColumn as text
) as table =>
let
tableType = Value.Type(table),
newTableType = Type.AddTableKey(tableType, keyColumns, true) meta
[
NavigationTable.NameColumn = nameColumn,
NavigationTable.DataColumn = dataColumn,
NavigationTable.ItemKindColumn = itemKindColumn,
Preview.DelayColumn = itemNameColumn,
NavigationTable.IsLeafColumn = isLeafColumn
],
navigationTable = Value.ReplaceType(table, newTableType)
in
navigationTable;
PARAMETER            DETAILS
keyColumns           List of column names that act as the primary key for your navigation table.
nameColumn           The name of the column that should be used as the display name in the navigator.
dataColumn           The name of the column that contains the Table or Function to display.
itemKindColumn       The name of the column used to determine the type of icon to display for the item.
itemNameColumn       The name of the column used to determine the preview behavior for the item.
isLeafColumn         The name of the column used to determine whether the item is a leaf node, or whether it can be expanded to contain another navigation table.
Example usage:
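A minimal sketch of how the function might be called (the extension name, items, and column values are illustrative):

shared MyExtension.Contents = () =>
    let
        objects = #table(
            {"Name", "Key", "Data", "ItemKind", "ItemName", "IsLeaf"},
            {
                {"Item1", "item1", #table({"Column1"}, {{"Item1"}}), "Table", "Table", true},
                {"Item2", "item2", #table({"Column1"}, {{"Item2"}}), "Table", "Table", true}
            }),
        navTable = Table.ToNavigationTable(objects, {"Key"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
    in
        navTable;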
URI Manipulation
Uri.FromParts
This function constructs a full URL based on individual fields in the record. It acts as the reverse of Uri.Parts.
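A minimal sketch of the helper, assuming the input record uses the same field names that Uri.Parts produces (Scheme, Host, Port, Path, Query, Fragment):

Uri.FromParts = (parts as record) as text =>
    let
        isDefaultPort =
            (parts[Scheme] = "https" and parts[Port] = 443) or
            (parts[Scheme] = "http" and parts[Port] = 80),
        port = if isDefaultPort then "" else ":" & Text.From(parts[Port]),
        query = if Record.FieldCount(parts[Query]) > 0 then "?" & Uri.BuildQueryString(parts[Query]) else "",
        fragment = if Text.Length(parts[Fragment]) > 0 then "#" & parts[Fragment] else ""
    in
        parts[Scheme] & "://" & parts[Host] & port & parts[Path] & query & fragment;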
Uri.GetHost
This function returns the scheme, host, and default port (for HTTP/HTTPS) for a given URL. For example,
https://bing.com/subpath/query?param=1&param2=hello would become https://bing.com:443.
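A sketch that matches this behavior (building the result from the parts returned by Uri.Parts) might look like the following:

Uri.GetHost = (url as text) as text =>
    let
        parts = Uri.Parts(url)
    in
        parts[Scheme] & "://" & parts[Host] & ":" & Text.From(parts[Port]);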
ValidateUrlScheme = (url as text) as text => if (Uri.Parts(url)[Scheme] <> "https") then error "Url scheme must be HTTPS" else url;
To apply it, just wrap your url parameter in your data access function.
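For example (MyConnector.Feed is a placeholder, and the response is assumed to be JSON):

MyConnector.Feed = (url as text) =>
    let
        // fail fast if the caller passes a non-HTTPS URL
        validatedUrl = ValidateUrlScheme(url),
        source = Web.Contents(validatedUrl)
    in
        Json.Document(source);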
Retrieving Data
Value.WaitFor
This function is useful when making an asynchronous HTTP request and you need to poll the server until the
request is complete.
Value.WaitFor = (producer as function, interval as function, optional count as number) as any =>
let
list = List.Generate(
() => {0, null},
(state) => state{0} <> null and (count = null or state{0} < count),
(state) => if state{1} <> null then {null, state{1}} else {1 + state{0}, Function.InvokeAfter(() => producer(state{0}), interval(state{0}))},
(state) => state{1})
in
List.Last(list);
Table.GenerateByPage
This function is used when an API returns data in an incremental/paged format, which is common for many
REST APIs. The getNextPage argument is a function that takes in a single parameter, which will be the result of
the previous call to getNextPage , and should return a nullable table .
getNextPage is called repeatedly until it returns null . The function will collate all pages into a single table.
When the result of the first call to getNextPage is null, an empty table is returned.
// The getNextPage function takes a single argument and is expected to return a nullable table
Table.GenerateByPage = (getNextPage as function) as table =>
let
listOfPages = List.Generate(
() => getNextPage(null), // get the first page of data
(lastPage) => lastPage <> null, // stop when the function returns null
(lastPage) => getNextPage(lastPage) // pass the previous page to the next function call
),
// concatenate the pages together
tableOfPages = Table.FromList(listOfPages, Splitter.SplitByNothing(), {"Column1"}),
firstRow = tableOfPages{0}?
in
// if we didn't get back any pages of data, return an empty table
// otherwise set the table type based on the columns of the first page
if (firstRow = null) then
Table.FromRows({})
else
Value.ReplaceType(
Table.ExpandTableColumn(tableOfPages, "Column1", Table.ColumnNames(firstRow[Column1])),
Value.Type(firstRow[Column1])
);
Additional notes:
The getNextPage function will need to retrieve the next page URL (or page number, or whatever other values
are used to implement the paging logic). This is generally done by adding meta values to the page before
returning it.
The columns and table type of the combined table (that is, all pages together) are derived from the first page
of data. The getNextPage function should normalize each page of data.
The first call to getNextPage receives a null parameter.
getNextPage must return null when there are no pages left.
An example of using this function can be found in the GitHub sample and the TripPin paging sample.
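Putting these notes together, a paging function might look roughly like the following sketch. Here GetPage is a hypothetical function that downloads one page of data and records the URL of the next page as a NextLink meta value on the table it returns:

GetAllPagesByNextLink = (url as text) as table =>
    Table.GenerateByPage((previous) =>
        let
            // on the first call previous is null, so start with the original URL;
            // afterwards, read the NextLink meta value written by GetPage
            nextLink = if previous = null then url else Value.Metadata(previous)[NextLink]?,
            page = if nextLink <> null then GetPage(nextLink) else null
        in
            page);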
SchemaTransformTable
EnforceSchema.Strict = 1; // Add any missing columns, remove extra columns, set table type
EnforceSchema.IgnoreExtraColumns = 2; // Add missing columns, do not remove extra columns
EnforceSchema.IgnoreMissingColumns = 3; // Do not add or remove columns
SchemaTransformTable = (table as table, schema as table, optional enforceSchema as number) as table =>
let
// Default to EnforceSchema.Strict
_enforceSchema = if (enforceSchema <> null) then enforceSchema else EnforceSchema.Strict,
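        // The rest of the helper is an illustrative sketch only. The full version of
        // this helper also enforces column types and ascribes a table type built from
        // the schema; this sketch only aligns the column list.
        schemaNames = schema[Name],
        foundNames = Table.ColumnNames(table),
        missingNames = List.RemoveItems(schemaNames, foundNames),
        extraNames = List.RemoveItems(foundNames, schemaNames),

        // add missing columns (as nulls) unless the caller asked to ignore them
        withMissing =
            if (_enforceSchema = EnforceSchema.IgnoreMissingColumns) then table
            else List.Accumulate(missingNames, table, (t, name) => Table.AddColumn(t, name, each null)),

        // pick the final column list for the requested enforcement mode
        finalNames =
            if (_enforceSchema = EnforceSchema.Strict) then schemaNames
            else if (_enforceSchema = EnforceSchema.IgnoreMissingColumns) then foundNames
            else schemaNames & extraNames,

        result = Table.SelectColumns(withMissing, finalNames, MissingField.Ignore)
    in
        result;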
Table.ChangeType
let
// table should be an actual Table.Type, or a List.Type of Records
Table.ChangeType = (table, tableType as type) as nullable table =>
// we only operate on table types
if (not Type.Is(tableType, type table)) then error "type argument should be a table type" else
// if we have a null value, just return it
if (table = null) then table else
let
columnsForType = Type.RecordFields(Type.TableRow(tableType)),
columnsAsTable = Record.ToTable(columnsForType),
schema = Table.ExpandRecordColumn(columnsAsTable, "Value", {"Type"}, {"Type"}),
previousMeta = Value.Metadata(tableType),
// If given a generic record type (no predefined fields), the original record is returned
Record.ChangeType = (record as record, recordType as type) =>
let
// record field format is [ fieldName = [ Type = type, Optional = logical], ... ]
fields = try Type.RecordFields(recordType) otherwise error "Record.ChangeType: failed to get record fields. Is this a record type?",
fieldNames = Record.FieldNames(fields),
fieldTable = Record.ToTable(fields),
optionalFields = Table.SelectRows(fieldTable, each [Value][Optional])[Name],
requiredFields = List.Difference(fieldNames, optionalFields),
// make sure all required fields exist
withRequired = Record.SelectFields(record, requiredFields, MissingField.UseNull),
// append optional fields
withOptional = withRequired & Record.SelectFields(record, optionalFields, MissingField.Ignore),
// set types
transforms = GetTransformsForType(recordType),
withTypes = Record.TransformFields(withOptional, transforms, MissingField.Ignore),
// order the same as the record type
reorder = Record.ReorderFields(withTypes, fieldNames, MissingField.Ignore)
in
if (List.IsEmpty(fieldNames)) then record else reorder,
Handling Errors
Errors in Power Query generally halt query evaluation and display a message to the user. You can raise an error
with the error expression, using either a simple text message or an Error.Record:
let
Source = "foo",
Output = error "error message"
in
Output
let
Source = "foo",
Output = error Error.Record("error reason", "error message", "error detail")
in
Output
try "foo"
If an error is found, the following record is returned from the try expression:
try "foo"+1
The Error record contains Reason , Message , and Detail fields.
Depending on the error, the Detail field may contain additional information.
The otherwise clause can be used with a try expression to perform some action if an error occurs:
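For example, the following expression evaluates to 0, because converting "abc" to a number raises an error (a minimal illustration):

try Number.FromText("abc") otherwise 0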
Handling Documentation
Power Query will automatically generate an invocation UI for you based on the arguments for your function. By
default, this UI will contain the name of your function, and an input for each of your parameters.
Similarly, evaluating the name of your function, without specifying parameters, will display information about it.
You might notice that built-in functions typically provide a better user experience, with descriptions, tooltips, and
even sample values. You can take advantage of this same mechanism by defining specific meta values on your
function type. This topic describes the meta fields that are used by Power Query, and how you can make use of
them in your extensions.
Function Types
You can provide documentation for your function by defining custom type values. The process looks like this:
1. Define a type for each parameter.
2. Define a type for your function.
3. Add various Documentation.* fields to your types metadata record.
4. Call Value.ReplaceType to ascribe the type to your shared function.
You can find more information about types and metadata values in the M Language Specification.
Using this approach allows you to supply descriptions and display names for your function, as well as individual
parameters. You can also supply sample values for parameters, as well as defining a preset list of values (turning
the default text box control into a drop down).
The Power Query experience retrieves documentation from meta values on the type of your function, using a
combination of calls to Value.Type, Type.FunctionParameters, and Value.Metadata.
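For instance, you can inspect these values yourself with a query along these lines (MyFunction is a placeholder for any shared function value):

let
    funcType = Value.Type(MyFunction),
    // Documentation.* fields defined on the function type itself
    functionDocs = Value.Metadata(funcType),
    // record mapping each parameter name to its (annotated) type
    parameterTypes = Type.FunctionParameters(funcType)
in
    [FunctionDocs = functionDocs, ParameterTypes = parameterTypes]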
Function Documentation
The following table lists the Documentation fields that can be set in the metadata for your function. All fields are
optional.
Parameter Documentation
The following table lists the Documentation fields that can be set in the metadata for your function parameters.
All fields are optional.
Basic Example
The following code snippet (and resulting dialogs) is from the HelloWorldWithDocs sample.
[DataSource.Kind="HelloWorldWithDocs", Publish="HelloWorldWithDocs.Publish"]
shared HelloWorldWithDocs.Contents = Value.ReplaceType(HelloWorldImpl, HelloWorldType);
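A trimmed-down sketch of the HelloWorldType and HelloWorldImpl values referenced above (the captions, descriptions, and sample values are illustrative):

HelloWorldType = type function (
    message as (type text meta [
        Documentation.FieldCaption = "Message",
        Documentation.FieldDescription = "Text to display",
        Documentation.SampleValues = {"Hello world", "Hola mundo"}
    ]))
    as table meta [
        Documentation.Name = "Hello - Name",
        Documentation.LongDescription = "A longer description shown in the dialog",
        Documentation.Examples = {[
            Description = "Returns a table containing the message",
            Code = "HelloWorldWithDocs.Contents(""Hello world"")",
            Result = "#table({""Column1""}, {{""Hello world""}})"
        ]}
    ];

HelloWorldImpl = (message as text) as table =>
    Table.FromList({message}, Splitter.SplitByNothing());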
Function info
Multi-Line Example
[DataSource.Kind="HelloWorld", Publish="HelloWorld.Publish"]
shared HelloWorld.Contents =
let
HelloWorldType = type function (
message1 as (type text meta [
Documentation.FieldCaption = "Message 1",
Documentation.FieldDescription = "Text to display for message 1",
Documentation.SampleValues = {"Hello world"},
Formatting.IsMultiLine = true,
Formatting.IsCode = true
]),
message2 as (type text meta [
Documentation.FieldCaption = "Message 2",
Documentation.FieldDescription = "Text to display for message 2",
Documentation.SampleValues = {"Hola mundo"},
Formatting.IsMultiLine = true,
Formatting.IsCode = false
])) as text,
HelloWorldFunction = (message1 as text, message2 as text) as text => message1 & message2
in
Value.ReplaceType(HelloWorldFunction, HelloWorldType);
This code (with its associated publish information and so on) results in the following dialog in Power BI. New
lines are represented in text with '#(lf)', or 'line feed'.
Handling Navigation
Navigation Tables (or nav tables) are a core part of providing a user-friendly experience for your connector. The
Power Query experience displays them to the user after they've entered any required parameters for your data
source function, and have authenticated with the data source.
Behind the scenes, a nav table is just a regular M Table value with specific metadata fields defined on its Type.
When your data source function returns a table with these fields defined, Power Query will display the navigator
dialog. You can actually see the underlying data as a Table value by right-clicking on the root node and selecting
Edit .
Table.ToNavigationTable
You can use the Table.ToNavigationTable function to add the table type metadata needed to create a nav table.
NOTE
You currently need to copy and paste this function into your M extension. In the future it will likely be moved into the M
standard library.
PARAMETER            DETAILS
keyColumns           List of column names that act as the primary key for your navigation table.
nameColumn           The name of the column that should be used as the display name in the navigator.
dataColumn           The name of the column that contains the Table or Function to display.
FIELD                              PARAMETER
NavigationTable.NameColumn nameColumn
NavigationTable.DataColumn dataColumn
NavigationTable.ItemKindColumn itemKindColumn
NavigationTable.IsLeafColumn isLeafColumn
Preview.DelayColumn itemNameColumn
This code will result in the following Navigator display in Power BI Desktop:
This code would result in the following Navigator display in Power BI Desktop:
Test Connection
Custom Connector support is available in both Personal and Standard modes of the on-premises data
gateway. Both gateway modes support Import. Direct Query is only supported in Standard mode. OAuth
for custom connectors via gateways is currently supported only for gateway admins but not other data
source users.
The method for implementing TestConnection functionality is likely to change while the Power BI Custom
Data Connector functionality is in preview.
To support scheduled refresh through the on-premises data gateway, your connector must implement a
TestConnection handler. The function is called when the user is configuring credentials for your source, and is used
to ensure they are valid. The TestConnection handler is set in the Data Source Kind record, and has the following
signature:
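A sketch of the shape (the function name shown is a placeholder):

TestConnection = (dataSourcePath) as list => { "MyConnector.Contents", dataSourcePath }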
Where dataSourcePath is the Data Source Path value for your function, and the return value is a list composed
of:
The name of the function to call (this function must be marked as #shared , and is usually your primary data
source function).
One or more arguments to pass to your function.
If the invocation of the function results in an error, TestConnection is considered to have failed, and the
credential won't be persisted.
NOTE
As stated above, the function name provided by TestConnection must be a shared member.
TripPin = [
TestConnection = (dataSourcePath) => { "TripPin.Contents" },
Authentication = [
Anonymous = []
],
Label = "TripPin"
];
Example: Connector with a URL parameter
If your data source function has a single required parameter of the type Uri.Type , its dataSourcePath will be
equal to the URL provided by the user. The snippet below shows the TestConnection implementation from the
GitHub sample.
GithubSample = [
TestConnection = (dataSourcePath) => {"GithubSample.Contents", dataSourcePath},
Authentication = [
OAuth = [
StartLogin = StartLogin,
FinishLogin = FinishLogin,
Label = Extension.LoadString("AuthenticationLabel")
]
]
];
DirectSQL = [
TestConnection = (dataSourcePath) =>
let
json = Json.Document(dataSourcePath),
server = json[server],
database = json[database]
in
{ "DirectSQL.Database", server, database },
Authentication = [
Windows = [],
UsernamePassword = []
],
Label = "Direct Query for SQL"
];
Handling Power Query Connector Signing
In Power BI, the loading of custom connectors is limited by your choice of security setting. As a general rule,
when the security for loading custom connectors is set to 'Recommended', the custom connectors won't load at
all, and you have to lower it to make them load.
The exception to this is trusted, 'signed connectors'. Signed connectors are a special format of custom connector,
a .pqx instead of .mez file, which has been signed with a certificate. The signer can provide the user or the user's
IT department with a thumbprint of the signature, which can be put into the registry to securely indicate trusting
a given connector.
The following steps enable you to use a certificate (with an explanation on how to generate one if you don't
have one available) and sign a custom connector with the 'MakePQX' tool.
NOTE
If you need help creating a self-signed certificate to test these instructions, go to the Microsoft documentation on New-
SelfSignedCertificate in PowerShell.
NOTE
If you need help exporting your certificate as a pfx, go to Export-PfxCertificate.
1. Download MakePQX.
2. Extract the MakePQX folder in the included zip to the target you want.
3. To run it, call MakePQX in the command line. It requires the other libraries in the folder, so you can't copy
just the one executable. Running without any parameters will return the help information.
Usage: MakePQX [options] [command]
Options:
OPTION                             DESCRIPTION
Commands:
pack                               Packs a mez file into a pqx file.
sign                               Signs a pqx file with a certificate.
verify                             Verify the signature status on a pqx file. The return value will be non-zero if the signature is invalid.
There are three commands in MakePQX. Use MakePQX [command] --help for more information about a
command.
Pack
The Pack command takes a mez file and packs it into a pqx file, which can be signed. The pqx format will also
support additional capabilities in the future.
Usage: MakePQX pack [options]
Options:
OPTION                             DESCRIPTION
-t | --target Output file name. Defaults to the same name as the input
file.
Example
C:\Users\cpope\Downloads\MakePQX>MakePQX.exe pack -mz "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorld.mez" -t "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorldSigned.pqx"
Sign
The Sign command signs your pqx file with a certificate, giving it a thumbprint that can be checked for trust by
Power BI clients with the higher security setting. This command takes a pqx file and returns the same pqx file,
signed.
Usage: MakePQX sign [arguments] [options]
Arguments:
Options:
OPTION                             DESCRIPTION
Example
C:\Users\cpope\Downloads\MakePQX>MakePQX sign "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorldSigned.pqx" --certificate ContosoTestCertificate.pfx --password password
Verify
The Verify command verifies that your module has been properly signed and shows the certificate status.
Usage: MakePQX verify [arguments] [options]
Arguments:
Options:
OPTION                             DESCRIPTION
Example
C:\Users\cpope\Downloads\MakePQX>MakePQX verify "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorldSigned.pqx"
{
"SignatureStatus": "Success",
"CertificateStatus": [
{
"Issuer": "CN=Colin Popell",
"Thumbprint": "16AF59E4BE5384CD860E230ED4AED474C2A3BC69",
"Subject": "CN=Colin Popell",
"NotBefore": "2019-02-14T22:47:42-08:00",
"NotAfter": "2020-02-14T23:07:42-08:00",
"Valid": false,
"Parent": null,
"Status": "UntrustedRoot"
}
]
}
Handling Proxy Support
This article describes how you can enable proxy support in your Power Query custom connector using the
Power Query SDK.
Example usage
Example 1
To use Web.DefaultProxy in the connector code, a boolean type variable can be used to opt in or out of using this
functionality. In this example, Web.DefaultProxy is invoked in the connector code if the optional boolean
parameter UseWebDefaultProxy is set to true (defaults to false).
UseWebDefaultProxyOption = options[UseWebDefaultProxy]?,
ProxyUriRecord = if UseWebDefaultProxyOption = true then Web.DefaultProxy(Host) else null,
ProxyOptions = if ProxyUriRecord <> null and Record.FieldCount(ProxyUriRecord) > 0 then
    [
        Proxy = ProxyUriRecord[ProxyUri]
    ]
else [],
...
When UseWebDefaultProxy is set to true and ProxyUriRecord has been fetched, you can build a record (named
ProxyOptions here) that sets the driver's proxy configuration parameter (Proxy in this example; the exact
parameter name can vary by driver) to the ProxyUri field returned by Web.DefaultProxy. This record can then be
appended to the base ConnectionString so that the proxy details are passed to the driver.
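A sketch of that last step, assuming ConnectionString is the record of connection string parts built elsewhere in the connector:

// record merge: when ProxyOptions is empty, ConnectionString is unchanged
ConnectionStringWithProxy = ConnectionString & ProxyOptions,
...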
Example 2
If there are multiple configuration parameters used by the driver for setting the proxy details (like host and port
details being handled separately), Uri.Parts can be used.
UseWebDefaultProxyOption = options[UseWebDefaultProxy]?,
ProxyRecord = if UseWebDefaultProxyOption = true then Web.DefaultProxy(Host) else null,
UriRecord = if ProxyRecord <> null then Uri.Parts(ProxyRecord[ProxyUri]) else null,
ProxyOptions = if UriRecord <> null then
    [
        ProxyHost = UriRecord[Scheme] & "://" & UriRecord[Host],
        ProxyPort = UriRecord[Port]
    ]
else [],
...
Power Query Connector Certification
NOTE
This article describes the requirements and process to submit a Power Query custom connector for certification. Read the
entire article closely before starting the certification process.
Introduction
Certifying a Power Query custom connector makes the connector available publicly, out-of-box, within Power BI
Desktop. Certified connectors are supported in PowerBI.com and all versions of Power BI Premium, except
dataflows. Certification is governed by Microsoft's Connector Certification Program, where Microsoft works with
partner developers to extend the data connectivity capabilities of Power BI.
Certified connectors are:
Maintained by the partner developer
Supported by the partner developer
Certified by Microsoft
Distributed by Microsoft
We work with partners to try to make sure that they have support in maintenance, but customer issues with the
connector itself will be directed to the partner developer.
Certified connectors are bundled out-of-box in Power BI Desktop. Custom connectors need to be loaded in
Power BI Desktop, as described in Loading your extension in Power BI Desktop. Both can be refreshed through
Power BI Desktop, or through the Power BI service with an on-premises data gateway, by implementing a
TestConnection.
Certified connectors with a TestConnection implementation also support end-to-end refresh through the cloud
(Power BI Service) without the need of an on-premises data gateway. The Power BI service environment
essentially hosts a “cloud gateway” that runs similarly to the on-premises gateway. After certification, we will
deploy your connector to this environment so that it's available to all Power BI customers. There are additional
requirements for connectors that need to use additional components, such as an ODBC-based driver. Be sure to
reach out to your Microsoft contact if your connector requires the use of additional components.
NOTE
Template apps do not support connectors that require a gateway.
Power Query connector submission
Introduction
This article provides instructions for how to submit your Power Query custom connector for certification. Don't
submit your connector for certification unless you've been directed to by your Microsoft contact.
Prerequisites
After you've been approved for certification, ensure that your connector meets the certification requirements
and follows all feature, style, and security guidelines. Prepare the artifacts for submission.
Once you've finished designing your Power Query custom connector, you'll need to submit an article that
provides instructions on how to use your connector for publication on docs.microsoft.com. This article discusses
the layout of such an article and how to format the text of your article.
Article layout
This section describes the general layout of the Power Query connector articles. Your custom connector article
should follow this general layout.
Support note
Right after the title of the article, insert the following note.
NOTE
The following connector article is provided by <company name>, the owner of this connector and a member of the
Microsoft Power Query Connector Certification Program. If you have questions regarding the content of this article or
have changes you would like to see made to this article, visit the <company name> website and use the support
channels there.
NOTE
Some capabilities may be present in one product but not others due to deployment schedules and host-specific
capabilities.
Prerequisites
If your custom connector requires that other applications be installed on the system running your connector or
requires that a set-up procedure be done before using your custom connector, you must include a Prerequisites
section that describes these installation and set-up procedures. This section will also include any information
about setting up various versions of your connector (if applicable).
Capabilities supported
This section should contain a list of the capabilities supported by your custom connector. These capabilities are
usually a bulleted list that indicates if the connector supports Import and DirectQuery modes, and also any
advanced options that are available in the initial dialog box that appears after the user selects your connector in
Get data .
Connection instructions
This section contains the procedures required to connect to data. If your custom connector is only used in Power
Query Desktop, only one procedure is required. However, if your custom connector is used on both Power
Query Desktop and Power Query Online, you must supply a separate procedure in separate sections for each
instance. That is, if your custom connector is only used by Power Query Desktop, you'll have one procedure
starting with a second order heading and a single step-by-step procedure. If your custom connector is used by
both Power Query Desktop and Power Query Online, you'll have two procedures. Each procedure starts with a
second order heading, and contains a separate step-by-step procedure under each heading. For examples of
each of these types of procedures, go to Example connector articles.
The procedure is made up of a numbered list that includes each step required to fill in the information needed to
provide a normal connection (not requiring advanced options) to the data.
Connect using advanced options (optional)
If your custom connector contains advanced options that can be used to connect to the data, this information
should be covered in a separate section of the documentation. Each of the advanced options should be
documented, and the purpose of each advanced option explained in this section.
Troubleshooting (optional)
If you know of any common errors that may occur with your custom connector, you can add a troubleshooting
section to describe ways to either fix the error, or work around the error. This section can also include
information on any known limitations of your connector or the retrieval of data. You can also include any known
issues with using your connector to connect to data.
Additional instructions (optional)
Any other instructions or information about your connector that hasn't been covered by the previous sections
can go in this section.
See the Microsoft Docs contributor guide on how you can contribute to our repo.
The article should be formatted and submitted as a Markdown file. It should use the Microsoft style for
describing procedures and the UI layout.
The following articles include instructions on formatting your document in Markdown, and the Microsoft style
that you should follow when authoring your article:
Docs Markdown reference
Microsoft Writing Style Guide