What is Power BI Dataflows?
Power BI Dataflows are collections of tables you can create and manage in workspaces in the Power BI Service. These dataflows allow users to extract, clean, transform, and combine data from various sources, creating reusable data preparation logic.
Dataflows are a cloud-based data preparation and transformation feature within the Power BI ecosystem and offer capabilities for data profiling, data lineage, and scheduled data refresh. Dataflows are designed to enhance data integrity and consistency, making it easier to build reports and dashboards on well-structured data.
Table of contents
- Power BI Dataflows simplify and enhance data preparation in Power BI. They allow for reusing data preparation logic across reports.
- Dataflows can be organized into workspaces for better management. Consistent data across reports ensures accurate analysis and reporting.
- There are multiple ways you can create dataflows, such as defining new tables, linked tables, computed tables, common data folders, or using export/import.
- You can use dataflows by
- Creating a linked table from the Power BI dataflow and allowing another dataflow owner to access the data.
- Creating a dataset from the dataflow to enable a user to use the data.
- Creating a connection from the external tools capable of reading data from the CDM format.
How to create Power BI Dataflows?
Power BI Dataflows can be created only by users in a premium workspace, i.e., either users with Power BI Pro license or Premium per user (PPU) licenses. There are multiple ways you can create Power BI Dataflows. These include:
- Option 1: Using define new tables
- Here, you define a new table and connect to a new data source.
- By selecting a data source, you provide connection settings such as Server, Database, etc.
- Post connection establishment, you can choose the data for your table.
- Once the data is selected for the table, you can transform the dataset using the dataflow editor as per your requirements.
- Option 2: Using linked tables.
- This option works only with users having Power BI Premium licenses.
- Here, you refer to an existing table, which is defined in a different dataflow with read-only mode. Using Linked tables can be a good option if:
- You want to create a table only once and then refer it across multiple dataflows.
- You want to store the data and work like a cache store if you want to stop multiple refreshes to a data source.
- You want to merge two tables.
- Option 3: Using a computed table
- Using a computed table can be a good idea if you want to refer to a linked table and perform write-only operations on top of these tables.
- You can convert a linked table into a computed table either by creating a new query from merging two tables or creating a reference/duplication of the table.
- Option 4: Using a CDM (Common Data Model) folder
- When you create a dataflow from the CDM folder, you can refer to a table where another applicable has written data in the CDM. You need to provide the full path to the CDM format file, which is stored in ADLS Gen 2.
- Option 5: Using import/export
- This option allows you to create a dataflow by importing dataflow from a file.
- It is helpful when you want to save it offline or move a dataflow from one workspace to another.
Now, let’s look at how to use the Power BI Dataflows.
Excel VBA – All in One Courses Bundle (35+ Hours of Video Tutorials)
If you want to learn Excel and VBA professionally, then Excel VBA All in One Courses Bundle (35+ hours) is the perfect solution. Whether you’re a beginner or an experienced user, this bundle covers it all – from Basic Excel to Advanced Excel, Macros, Power Query, and VBA.
How To use Power BI Dataflows?
You can use Power BI dataflows in 3 different ways. They include:
- Create a linked table from the Power BI dataflow and allow another dataflow owner to access the data.
- Create a dataset from the dataflow to enable a user to use the data for creating reports.
- Create a connection from the external tools capable of reading data from the CDM format.
To use the Power BI dataflows, follow the steps highlighted below
Step 1: Navigate to the Power BI Desktop and select the Dataflows option under the Get data dropdown in the Home tab.
Step 2: Select the data flow and the tables you want to connect to.
Where DirectQuery connection is available, you can choose whether you want to connect to the tables using DirectQuery or Import option. For the DirectQuery option, the connection will be quite fast. However, you can’t apply transformations. For the import option, you need to apply data refresh regardless of the data flow.
Step 3: Create relationships, measures, and visuals based on the Dataflow data.
Step 4: Build your report or dashboard leveraging the prepped data from the Dataflow
In this section, we will look at some of the examples of using Power BI Dataflows.
Example #1 – Using One Power Query Table in Multiple Power BI Reports
Here’s how you can use one Power Query table in multiple Power BI reports.
- Create a Dataflow:
- In Power BI Desktop, go to the “Power Query Editor” by selecting “Edit Query.”
- Create or import your data and transform it as needed.
- Create a dataflow by selecting “Home” > “Dataflows” > “New Dataflow.”
- Add Dataflow to Reports:
- In each Power BI report where you want to use this dataflow, go to “Home” > “Get Data” > “Power Platform” > “Power BI Dataflows.”
- Select the dataflow you created in step 1.
- Create Visualizations:
- In your report, create visualizations using the tables from the dataflow as if they were part of your dataset.
- Publish Reports:
- Publish the reports to the Power BI Service. Now, you can use the same data flow in multiple reports, ensuring consistency in data across those reports.
Example #2 – Connecting Different Data Sources with Different Schedules of Refresh
Here is how you can connect different data sources with different schedules of refresh.
- Create Dataflows:
- Create separate dataflows for each data source.
- Set Refresh Schedules:
- In the Power BI Service, go to the “Settings” of each data flow. Configure the refresh schedule for each data flow according to its update frequency.
- Use Dataflows in Reports:
- In your Power BI reports, connect to the relevant dataflows as described in Example #1.
Now, your reports will pull data from different data sources, and each data will refresh on its schedule.
Example #3 – Connecting Power BI Dataflows to Workspaces
You can connect Power BI Dataflows to Workspaces as per below:
- Create Dataflows in Workspaces:
- In the Power BI Service, create or move your dataflows to the workspace you want to use.
- Access Dataflows in Reports:
- When creating reports in that workspace, you can access the dataflows directly, making it easier to use the shared data.
- Share Workspaces:
- Share the workspace with team members, ensuring they have access to both the dataflows and the reports using that data.
Some of the key use cases of using Power BI Dataflows include:
|Migrating data from existing legacy systems
|With dataflow, you can migrate the existing on-premises data into dataverse and then link it to tools such as Power Apps, Power Automate, AI builder, etc.
|Building a warehouse
|Dataflow can be used as an alternative to ETL tools for building a warehouse. You can use dataflows to design and build star schema for data warehouse and store fact and dimension tables in data lake storage. Then, use Power BI to generate reports by connecting the dataset from the dataflows.
|Building a dimensional model
|Using dataflow, you can build a dimensional model that can be further used in Power BI reports.
|Prepare data centrally and use it across different reports
|With dataflow, you can prepare the data at a central location, and then multiple Power BI solutions can then reference this dataset for preparing reports
Some of the key benefits of using Power BI Dataflows:
- You can create transformation logic that is reusable and easy to share across multiple datasets and reports in Power BI.
- You can use industry-standard definitions to create a golden data source from the raw dataset, and this can work with other services and products in the Microsoft Power Platform.
- You can also strengthen security around underlying data sources, wherein you can expose the dataset to report creators in dataflows.
- For working with large data volumes and performing ETL operations at scale, dataflows with Power BI Premium provide you with higher flexibility, and it can also scale optimally.
Important Things to Note
- Dataflows can be used with only Power BI Pro or Premium licenses. There are a number of limitations, such as Power BI Dataflow doesn’t allow you to edit the parameters unless you edit the entire dataflow, though you can use the parameters.
- Other limitations include that if you are refreshing more than ten dataflows across the workspace, you will need a Power BI Premium subscription.
- You can’t use any global variables in a URL argument in Power BI Dataflows.
- When you delete a data source from the dataflow data source page, the deleted data sources are not removed and appear as a lineage for a dataflow.
Note that the Power BI Dataflow legacy implementation has been replaced with newer, enhanced Power BI Query dataflows. However, legacy is still supported.
It is recommended that you use enhanced Dataflows for better performance and capabilities.
Frequently Asked Questions (FAQs)
• Power BI Dataflows are primarily focused on data preparation and transformation within the Power BI ecosystem. They are designed for report developers to create, share, and manage data preparation steps for Power BI reports and dashboards.
• The Azure Data Factory is a cloud-based data integration service provided by Microsoft Azure. It can connect to various data sources and destinations, orchestrate complex data workflows, and schedule data integration tasks.
To create a date table, you can follow these steps:
• In Power BI Desktop, create a new Dataflow or open an existing one.
• Add a new query to your Dataflow for the data table by clicking “Get Data” and selecting a date source.
• Use the Power Query editor to transform this source data into a date table.
• Once you’ve prepared the date table query, save and publish the Dataflow to the Power BI service.
• In the Power BI Desktop report where you want to use this date table, connect to the Dataflow that contains the date table by clicking “Get Data” and selecting the Dataflow as a data source.
• You can now use the date table from the Dataflow in your report and build relationships with other tables as needed.
This has been a guide to Power BI Dataflows. We learn how to create and use dataflows in power BI, its key use cases, benefits, and points to remember. You can learn more from the following articles –