Introduction to BI Engine
BigQuery BI Engine is a fast, in-memory analysis service that accelerates many SQL queries in BigQuery by intelligently caching the data you use most frequently. BI Engine can accelerate SQL queries from any source, including those written by data visualization tools, and can manage cached tables for ongoing optimization. This lets you improve query performance without manual tuning or data tiering. You can cluster and partition tables to further optimize BI Engine performance for large tables.
For example, if your dashboard only displays the last quarter's data, then you could partition your tables by time so only the latest partitions are loaded into memory. You can also combine the benefits of materialized views and BI Engine. This works particularly well when the materialized views are used to join and flatten data to optimize their structure for BI Engine.
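The partitioning example above can be sketched as follows. This is a minimal illustration, not a prescribed schema: the table name `my_project.sales.orders`, its columns, and the 90-day window standing in for "last quarter" are all assumptions. The helper only builds SQL text; running it against BigQuery is left to whichever client you already use.

```python
from datetime import date, timedelta

# Hypothetical table name, used only for illustration.
TABLE = "my_project.sales.orders"

# DDL for a table partitioned by day and clustered by a common filter column.
CREATE_TABLE_DDL = f"""
CREATE TABLE `{TABLE}` (
  order_id STRING,
  customer_id STRING,
  amount NUMERIC,
  order_ts TIMESTAMP
)
PARTITION BY DATE(order_ts)
CLUSTER BY customer_id
"""


def dashboard_query(today: date, days: int = 90) -> str:
    """Build a dashboard query that filters on the partitioning column.

    Because the filter is on DATE(order_ts), only partitions from the
    last `days` days need to be read (and therefore cached).
    """
    start = today - timedelta(days=days)
    return (
        f"SELECT customer_id, SUM(amount) AS revenue\n"
        f"FROM `{TABLE}`\n"
        f"WHERE DATE(order_ts) >= DATE '{start.isoformat()}'\n"
        f"GROUP BY customer_id"
    )


if __name__ == "__main__":
    print(CREATE_TABLE_DDL)
    print(dashboard_query(date.today()))
```

Filtering on the partitioning column is what keeps older partitions out of the working set; a query without that filter would touch every partition.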
BI Engine provides the following advantages:
- BigQuery API compatibility: BI Engine directly integrates with the BigQuery API. Any BI solution or custom application that works with the BigQuery API through standard mechanisms such as REST or JDBC and ODBC drivers can use BI Engine without modification.
- Vectorized runtime: Using vectorized processing in an execution engine makes more efficient use of modern CPU architecture, by operating on batches of data at a time. BI Engine also uses advanced data encodings, specifically dictionary run-length encoding, to further compress the data that's stored in the in-memory layer.
- Seamless integration: BI Engine works with BigQuery features and metadata, including authorized views, column-level security, and data masking.
- Reservation allocations: BI Engine reservations separately manage memory allocation for each project and region. BI Engine only caches the queried, required parts of columns and partitions. You can specify which tables use BI Engine acceleration with preferred tables.
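To give a feel for the dictionary run-length encoding mentioned under vectorized runtime, here is a toy sketch. It is not BI Engine's actual implementation, whose internals this page does not describe; it only illustrates the general idea of replacing repeated values with small integer codes and then collapsing consecutive repeats into (code, count) runs.

```python
def dict_rle_encode(values):
    """Encode a column as (symbols, runs).

    symbols: distinct values in first-seen order (the dictionary).
    runs:    list of (code, run_length) pairs, where code indexes symbols.
    """
    codes = {}   # value -> integer code
    runs = []    # mutable [code, count] pairs while building
    for value in values:
        code = codes.setdefault(value, len(codes))
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([code, 1])    # start a new run
    symbols = list(codes)             # dicts keep insertion order
    return symbols, [(code, count) for code, count in runs]


def dict_rle_decode(symbols, runs):
    """Rebuild the original column from the dictionary and the runs."""
    out = []
    for code, count in runs:
        out.extend([symbols[code]] * count)
    return out


if __name__ == "__main__":
    column = ["US", "US", "US", "DE", "DE", "US"]
    symbols, runs = dict_rle_encode(column)
    print(symbols, runs)
    assert dict_rle_decode(symbols, runs) == column
```

The encoding pays off most on low-cardinality columns with long runs of equal values, which is one reason clustered tables tend to compress well.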
BI Engine use cases
BI Engine can significantly accelerate many SQL queries, including those used for BI dashboards. Acceleration is most effective if you identify the tables that are essential to your queries, and then mark them as preferred tables. To use BI Engine, you create a reservation in a region and specify its size. You can let BigQuery determine which tables to cache based on the project's usage patterns or you can specify tables to prevent other traffic from interfering with their acceleration.
BI Engine is useful in the following use cases:
- You use BI tools to analyze your data: BI Engine accelerates BigQuery queries whether they run in the BigQuery console, in a BI tool such as Looker Studio or Tableau, or through a client library, the API, or an ODBC or JDBC connector. This can significantly improve the performance of dashboards connected to BigQuery through a built-in connection (API) or connectors.
- You have frequently queried tables: BI Engine lets you designate preferred tables to accelerate. This is helpful if you have a subset of tables that are queried more frequently or are used for high-visibility dashboards.
BI Engine might not fit your needs in the following cases:
- You use wildcards in your queries: Queries referencing wildcard tables are not supported by BI Engine and don't benefit from acceleration.
- You require BigQuery features unsupported by BI Engine: BI Engine supports most SQL functions and operators, but it doesn't support external tables, row-level security, or non-SQL user-defined functions.
Considerations for BI Engine
Consider the following when deciding how to configure BI Engine:
Ensure acceleration for specific queries
To ensure that a set of queries is accelerated, create a separate project with a dedicated BI Engine reservation. Estimate the compute capacity that those queries require so that the reservation is large enough, then designate the tables they use as preferred tables for BI Engine.
Minimize your joins
BI Engine works best for pre-joined or pre-aggregated data, and for queries with a small number of joins. This is particularly true when one side of the join is large and the others are much smaller, such as when you query a large fact table joined with smaller dimension tables. You can combine BI Engine with materialized views, which perform joins to produce a single large, flat table. In this way, the same joins aren't performed for each query. For optimal query performance, stale materialized views (materialized views that are allowed to return slightly out-of-date results) are recommended.
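As a rough sketch of the pre-joining approach, the helper below builds DDL for a materialized view that flattens a fact table and one dimension table. All table and column names are hypothetical. The `max_staleness` option is included on the assumption that this is how you want to permit stale results; the four-hour default and the hourly refresh interval are arbitrary illustrative values.

```python
def flattened_view_ddl(staleness_hours: int = 4) -> str:
    """Build DDL for a materialized view that pre-joins orders and customers.

    Table and column names are placeholders for illustration only.
    """
    return f"""
CREATE MATERIALIZED VIEW `my_project.sales.orders_flat`
OPTIONS (
  enable_refresh = true,
  refresh_interval_minutes = 60,
  max_staleness = INTERVAL "{staleness_hours}:0:0" HOUR TO SECOND
)
AS
SELECT
  o.order_id,
  o.order_ts,
  o.amount,
  c.customer_name,
  c.country
FROM `my_project.sales.orders` AS o
INNER JOIN `my_project.sales.customers` AS c
  ON o.customer_id = c.customer_id
"""


if __name__ == "__main__":
    print(flattened_view_ddl())
```

Dashboards can then query the flat view directly, so the join cost is paid when the view refreshes rather than on every dashboard query.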
Understand the impact of BI Engine
To understand your use of BI Engine, see Monitor BI Engine with Cloud Monitoring, or query the INFORMATION_SCHEMA.BI_CAPACITIES and INFORMATION_SCHEMA.BI_CAPACITY_CHANGES views. Be sure to disable the Use cached results option in BigQuery to get the most accurate comparison. For more information, see Use cached query results.
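A minimal sketch of querying the two views follows. The project ID and region are placeholders, and the column names selected from BI_CAPACITY_CHANGES (`change_timestamp`, `size`, `user_email`, `preferred_tables`) reflect my understanding of that view's schema; check them against the view reference before relying on them. The helper only builds SQL text.

```python
def bi_capacity_queries(project: str, region: str) -> dict:
    """Return SQL for the current BI Engine capacity and its change history.

    `project` and `region` are caller-supplied; nothing is validated here.
    """
    prefix = f"`{project}`.`region-{region}`.INFORMATION_SCHEMA"
    return {
        # Current reservation size and preferred tables.
        "current": f"SELECT * FROM {prefix}.BI_CAPACITIES",
        # Who changed the reservation, when, and to what (newest first).
        "history": (
            f"SELECT change_timestamp, size, user_email, preferred_tables\n"
            f"FROM {prefix}.BI_CAPACITY_CHANGES\n"
            f"ORDER BY change_timestamp DESC"
        ),
    }


if __name__ == "__main__":
    for name, sql in bi_capacity_queries("my-project", "us").items():
        print(f"-- {name}\n{sql}\n")
```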
Preferred tables
BI Engine preferred tables let you limit BI Engine acceleration to a specified set of tables. Queries to all other tables use regular BigQuery slots. For example, with preferred tables you can accelerate only the tables and dashboards that you identify as important to your business.
If there is not enough RAM in the project to hold all of the preferred tables, BI Engine offloads partitions and columns that haven't been accessed recently. This process frees memory for new queries that need acceleration.
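The offloading behavior described above resembles a least-recently-used cache. The toy model below illustrates that idea only; the actual eviction policy and granularity are not specified on this page, and the class name, the key shape (table, partition, column), and the byte sizes are all invented for the example.

```python
from collections import OrderedDict


class ToyColumnCache:
    """Toy least-recently-used cache keyed by (table, partition, column)."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> size, least recent first

    def access(self, key, size_bytes: int) -> list:
        """Record an access; load the entry if absent.

        Returns the keys that were offloaded to make room.
        """
        evicted = []
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return evicted
        if size_bytes > self.capacity_bytes:
            return evicted                  # cannot fit at all; not cached
        while self.used_bytes + size_bytes > self.capacity_bytes:
            old_key, old_size = self._entries.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_key)
        self._entries[key] = size_bytes
        self.used_bytes += size_bytes
        return evicted


if __name__ == "__main__":
    cache = ToyColumnCache(capacity_bytes=100)
    cache.access(("orders", "2024-06", "amount"), 60)
    cache.access(("orders", "2024-07", "amount"), 40)
    cache.access(("orders", "2024-06", "amount"), 60)   # touch June again
    print(cache.access(("orders", "2024-08", "amount"), 40))
```

In this model, touching the June partition before loading August means July is the entry that gets offloaded, because it was accessed least recently.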
Preferred tables limitations
BI Engine preferred tables have the following limitations:
- You cannot add views to the preferred tables reservation list. BI Engine preferred tables only support tables.
- Queries to materialized views are only accelerated if both the materialized views and their base tables are in the preferred tables list.
- Specifying partitions or columns for acceleration is not supported.
- JSON type columns are unsupported and are not accelerated by BI Engine.
- Queries that access multiple tables are only accelerated if all tables are preferred tables. For example, all tables in a query with a JOIN must be in the preferred tables list to be accelerated. If even one table is not in the preferred list, then the query cannot use BI Engine.
- Public datasets are not supported in the Google Cloud console. To add a public table as a preferred table, use the API or the DDL.
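The sketch below shows one way to set preferred tables through DDL and to reason about the "all tables must be preferred" rule. The project, region, reservation size, and table names are placeholders. The second helper is only a local check that mirrors the rule stated above; it does not call BigQuery and does not account for the other limitations in the list.

```python
def alter_bi_capacity_ddl(project, region, size_gb, preferred_tables):
    """Build an ALTER BI_CAPACITY statement that sets size and preferred tables."""
    tables = ", ".join(f"'{name}'" for name in preferred_tables)
    return (
        f"ALTER BI_CAPACITY `{project}.region-{region}.default`\n"
        f"SET OPTIONS (size_gb = {size_gb}, preferred_tables = [{tables}])"
    )


def all_tables_preferred(query_tables, preferred_tables) -> bool:
    """True only if every table the query touches is in the preferred list."""
    return set(query_tables) <= set(preferred_tables)


if __name__ == "__main__":
    preferred = [
        "my-project.sales.orders",
        "my-project.sales.customers",
    ]
    print(alter_bi_capacity_ddl("my-project", "us", 50, preferred))

    # A join that includes a non-preferred table would not be accelerated.
    print(all_tables_preferred(
        ["my-project.sales.orders", "my-project.sales.returns"], preferred
    ))
```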
Quotas and limits
See BigQuery quotas and limits for quotas and limits that apply to BI Engine.
Pricing
For information on BI Engine pricing, see the BigQuery Pricing page.
What's next
- To learn how to create your BI Engine reservation, see Reserve BI Engine capacity.
- For information about designating preferred tables, see BI Engine preferred tables.
- To understand your utilization of BI Engine, see Monitor BI Engine with Cloud Monitoring.
- To learn which functions BI Engine optimizes, see BI Engine optimized functions.
- Learn how to use BI Engine with the following: