Business Intelligence (BI) tools gather data from multiple sources and convert it into insightful reports and dashboards. As data grows, SQL Server and Oracle databases become vital to efficiently store and manage it. Analyzing stored data with BI tools is essential for businesses to make informed decisions and stay competitive.
Power BI & modes of connectivity
Power BI provides two connectivity modes: Import and Direct Query. With Import, data is downloaded to the Power BI model, which can be refreshed, and stored in the Common Data Model. While Direct Query generates optimized SQL queries for specific visualizations without downloading the entire data. It connects to the data source engine to produce results on-the-fly.
Modes: Import VS. Direct Query
- Storage – Import mode enables the user to store data in the model, whereas Direct Query mode does not involve storing data at all.
- Performance – The Import mode performs quickly when dealing with less data volume, but its performance degrades as the volume of data increases. On the other hand, the performance of Direct Query mode is fast if appropriate indexes are created on the database, but it might underperform if indexes are not in place.
- Volume – The Import mode can store a maximum of 1 GB of data in the model, whereas the Direct Query mode has no such restriction.
- Compatibility – Import mode is compatible with all types of data sources, whereas Direct Query mode is only supported for Database Server types of data sources.
- RLS (Row Level Security) – Import mode supports RLS, and Direct Query mode also supports RLS. However, it is crucial to exercise caution since Service Accounts may be used to connect to the database, effectively impersonating the identity of the Power BI user.
- DAX & Transformations – Import mode supports all Power Query transformations and DAX, while Direct Query mode supports only those transformations and DAX for which Power BI can generate an equivalent SQL query.
- Availability – In Import mode, if the data refresh fails due to the unavailability of the data source, the visuals are prepared using the last data that persisted in the model. In Direct Query mode, if the data refresh fails due to the unavailability of the data source, the entire report goes blank since the data is not stored in the model, and there is no option to revert to the previous state.
- Refresh Schedule – In Import mode, the user has the option to schedule automatic data refresh based on subscription at regular intervals. However, in Direct Query mode, since the data is not stored in the model, the concept of scheduling refresh is not applicable. Instead, the data is fetched from the server when the user opens the report.
Reasons for Direct Query Outperforming Import
Direct Query outperforms Import in most scenarios because database engines are optimized for handling data calculations and have indexing capabilities, which DAX and Power Query lack. When visuals are prepared in Power BI, operations such as merging, grouping, aggregating, and filtering are performed.
DAX and M Script, Power BI’s in-memory calculation engines, are efficient but are not optimized for performing calculations in an efficient manner as they lack indexing. Database engines reuse query results of the last few queries by keeping track of changing data, which is complex for both DAX and Power Query.
As data size grows, the performance difference between Import and Direct Query becomes more noticeable, making Direct Query a better choice in most scenarios.
When does Import mode outperform Direct Query?
In certain scenarios, import mode might perform better due to factors such as a lack of proper indexing in the database, running the database server with low resources, or having many concurrent users querying the database. When multiple users query the database, Power BI might fire multiple SQL queries for multiple visuals, leading to possible locking on the table and a long wait time that can freeze Power BI visuals.
Additionally, a long connection time to a database can occur when a connection request is made to an on-premises server over VPN with high latency. If the volume of data is not large, it is recommended to use Import mode instead of Direct Query. With Import mode, one can always enjoy the good service of Azure resources in the backend of the Power BI Service.
Does Direct Query provide real-time data?
There is a difference between “Live” and “Real-time” connections. Direct Query falls under the category of a “Live” connection, where a new connection to the database server is established every time the report is opened to retrieve the most recent data. Once the screen is opened, the visual becomes static and will only display updated data upon manual refresh or reopening of the report.