Azure SQL Database tips
This wiki page describes get-sqldb-tips.sql, a T-SQL script that produces tips to improve database design, health, and performance in Azure SQL Database. Tips are based on well-established best practices for the MSSQL database engine, with particular focus on Azure SQL Database specifics.
The script works on all Azure SQL Database service tiers, including Basic, Standard, Premium, General Purpose, Business Critical, and Hyperscale. Both single databases and databases in elastic pools are supported.
The script checks over 40 conditions to produce tips. New tips are added based on our customer engagements and community feedback.
After reading this wiki page, see FAQ for answers to common questions. For changes in each release, see Change log.
Download the latest release of the script. Unzip, and open the get-sqldb-tips.sql file in a query window in SQL Server Management Studio (SSMS), Azure Data Studio (ADS), Query editor in Azure portal, or any MSSQL client tool of your choice. Connect the query window to your target database and execute the script. If your database uses compatibility level 100, use the get-sqldb-tips-compat-level-100-only.sql file.
The script usually executes in about a minute or less. It may take longer on resource-constrained service objectives such as Basic, S0-S2, or when CPU utilization is high.
The result set has the following columns:
- tip_id: Unique numeric identifier for a tip, included in tip descriptions on this page.
- description: Short description of a tip.
- confidence_percent: Our degree of confidence that following the tip is worthwhile and will result in a net benefit. Tips with lower confidence may require additional analysis and testing in the context of a specific application and workload. In the right circumstances, even tips with relatively low confidence can produce substantial benefits. This confidence value applies to the tip definition and does not change from one execution of the script to the next.
- additional_info_url: Pasting this URL into a browser window opens a detailed tip description on this wiki page. Descriptions include recommended actions, sample commands and queries, and an explanation of the data in the details column. There are also links to additional resources (documentation, blogs, etc.) that provide background information on the topic.
- details: Additional data to help you act on a tip. For example, this can be the list of top queries, or the list of tables/indexes that are data compression candidates. In SSMS and ADS, clicking on this column displays the data in a new window for easier reading. These details can change from one execution of the script to the next as database state changes.
To return tips in a single JSON document instead of the default tabular format, set the @JSONOutput variable to 1. This is useful if you want to send script results to others in a portable format, or if you are using Query editor in Azure portal and want to see the entire details column.
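For example, locate the existing @JSONOutput declaration near the top of the script and change its value (the declaration below is illustrative; edit the declaration already present in the script rather than adding a new one):
-- 1: return a single JSON document; 0 (default): return tabular results
DECLARE @JSONOutput bit = 1;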
- For each tip in the result, make sure to read the description on this wiki page using the link in additional_info_url, especially if the data in the details column is not immediately clear.
- Execute the script while a representative workload is running. This could mean running the script on a production database, or on a representative copy of a production database. Without a representative workload running, or with a smaller (e.g. non-production) database, relevant tips may be missed, or less accurate tips may be generated. If workload type and intensity change over time, consider executing the script at different points in time to get all relevant tips.
- You can use the script as a troubleshooting tool, executing it when a performance or stability problem is occurring. This increases the chances of producing specific tips that help you fix the problem.
- Some tips are based on system metadata accumulated since database engine startup. In those cases, the details column includes engine startup time. If the time passed since engine startup is short, consider rerunning the script when more time has passed, so that the tip is based on more complete and representative system metadata.
- The script can be executed on either the primary or a readable secondary replica of a database, including read scale-out replicas and geo-replicas. Some tips are produced only on the primary or only on a secondary replica. When the database has any readable secondary replicas, we recommend executing the script on the primary and on all secondary replicas.
In this example, the script returns seven tips. For most tips, there is additional information in the details column.
The security principal executing the script must be either:
- A server administrator using SQL authentication or Azure Active Directory (AAD) authentication;
- A server login with an associated database user in the target database. The server login must be a member of the ##MS_ServerStateReader## server role. Additionally, to evaluate certain tips, the VIEW DEFINITION permission on the database is required.
Here is a code sample that grants the necessary and sufficient access to execute the script and evaluate all tips to a login named tips:
-- In the master database on an Azure SQL Database logical server
CREATE LOGIN tips WITH PASSWORD = 'strong-password-here';
ALTER SERVER ROLE [##MS_ServerStateReader##] ADD MEMBER tips;
-- In the user database where tips are to be produced, on the same logical server
CREATE USER tips FOR LOGIN tips;
GRANT VIEW DEFINITION TO tips;
This code grants permissions equivalent to a union of VIEW SERVER STATE and VIEW DEFINITION, which is required to query the system metadata views used in the script. The ##MS_ServerStateReader## server role does not grant any additional permissions; specifically, it grants no permissions to modify database schema or to read data.
The above permissions are sufficient to execute the script on any service tier. This is the recommended way to run the script because it ensures that all potential tips are evaluated.
On single databases in the Standard (S2 and higher), General Purpose, Premium, Business Critical, and Hyperscale service tiers, for nearly all tips it is sufficient to hold only the VIEW DATABASE STATE and VIEW DEFINITION permissions on the database. These can be granted as follows:
-- In the master database on an Azure SQL Database logical server
CREATE LOGIN tips WITH PASSWORD = 'strong-password-here';
-- In the user database where tips are to be produced, on the same logical server
CREATE USER tips FOR LOGIN tips;
GRANT VIEW DATABASE STATE, VIEW DEFINITION TO tips;
Alternatively, using database-scoped authentication, i.e. without creating a server login in the master database:
-- In the user database where tips are to be produced
-- Using SQL authentication
CREATE USER tips WITH PASSWORD = 'strong-password-here';
GRANT VIEW DATABASE STATE, VIEW DEFINITION TO tips;
-- Or, using Azure Active Directory (AAD) authentication
CREATE USER [[email protected]] FROM EXTERNAL PROVIDER;
GRANT VIEW DATABASE STATE, VIEW DEFINITION TO [[email protected]];
Note that for Basic, S0, and S1 single databases, and for any database in an elastic pool, VIEW DATABASE STATE and VIEW DEFINITION are sufficient to produce only a few tips; the higher permissions described in the previous section are recommended.
If tips were skipped because of insufficient permissions, they will be listed in the second result set that describes skipped tips.
By default, all tips are evaluated. If you prefer to skip some tips, you can easily edit tip metadata in the script to not run the corresponding queries.
To do that, search the script for the Define all tips comment to find the tip definition code. In the VALUES clause, change the first column value from 1 to 0 for the tips you want to exclude. In the following example, tip 1310 (Last partitions...) will be evaluated, while tip 1320 (Top queries...) will be skipped:
(1, 1310, 'Last partitions are not empty', ...),
(0, 1320, 'Top queries should be investigated and tuned', ...),
If any tips were skipped because of user-specified exclusions, they will be listed in the second result set that describes skipped tips.
Tips are produced by analyzing database engine metadata in system views. For many tips, heuristics (based on configurable thresholds) are used, thus the output is produced with a finite degree of confidence, as indicated in the confidence_percent column. For each tip, our goal was to make it useful for a significant portion of our customers, not necessarily for every customer, which would be impossible given the broad variety of designs and workloads across millions of databases in Azure SQL Database.
The script is lightweight and does not query any user data. To avoid impacting workloads, the script checks current CPU utilization, and aborts before executing any other queries if CPU utilization is very high. The script also aborts any query it is running if the query is blocked for more than a few seconds waiting on a lock, to avoid contributing to blocking chains. If any tips were skipped because lock timeout was exceeded, they will be listed in the second result set that describes skipped tips.
The database where you run the script must use compatibility level 110 or later, which is the case for the vast majority of databases in Azure SQL Database. For databases using compatibility level 100, we provide a different script, get-sqldb-tips-compat-level-100-only.sql. When using that script, data in the details column may not be sorted in any particular order.
To determine the compatibility level of your database, execute the following query in the context of the database (not in the master database):
SELECT name, compatibility_level
FROM sys.databases
WHERE database_id = DB_ID();
We welcome and encourage contributions to this project. You can open issues to report bugs and suggest improvements or new tips, or send pull requests to contribute code. We request that you keep the following guidelines in mind:
- A new tip or a change to an existing tip must be implementable in T-SQL using the current surface area of Azure SQL Database;
- A new tip or a change to an existing tip must be useful to a significant portion of Azure SQL Database customers;
- The implementation must be lightweight, i.e. it should not cause high resource utilization or contention that may impact running workloads, and should not cause the script to run for an excessively long time.
The last two guidelines are clearly subjective; we will use community input and our best judgement when considering contributions.
This section includes detailed descriptions for tips that may be produced by the script. Descriptions below are not in any particular order, and should be viewed as collateral to script output, not as a set of general recommendations.
Maximum degree of parallelism (MAXDOP) for primary and secondary replicas is not in the recommended range of 1 to 8. Depending on workload, this may cause unnecessary resource utilization, and in extreme cases, errors and timeouts.
-- Set MAXDOP to 8 for both primary and secondary replicas
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 8;
ALTER DATABASE SCOPED CONFIGURATION FOR SECONDARY SET MAXDOP = PRIMARY;
Changing default MAXDOP in Azure SQL Database
ALTER DATABASE SCOPED CONFIGURATION
Maximum degree of parallelism (MAXDOP) for the primary replica is not in the recommended range of 1 to 8. Depending on workload, this may cause unnecessary resource utilization, and in extreme cases, errors and timeouts.
-- Set MAXDOP to 8 for the primary replica
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 8;
Changing default MAXDOP in Azure SQL Database
ALTER DATABASE SCOPED CONFIGURATION
Maximum degree of parallelism (MAXDOP) for secondary replicas is not in the recommended range of 1 to 8. Depending on workload, this may cause unnecessary resource utilization, and in extreme cases, errors and timeouts.
-- Set MAXDOP to 8 for the secondary replicas
ALTER DATABASE SCOPED CONFIGURATION FOR SECONDARY SET MAXDOP = 8;
Changing default MAXDOP in Azure SQL Database
ALTER DATABASE SCOPED CONFIGURATION
Database compatibility level is not current. Recently added database engine features and improvements may not be available when using older compatibility levels. Consider using the latest supported compatibility level, but note that changing the compatibility level may require functional and performance testing before and after the change, known as A/B testing.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150; -- check the first link for the latest compatibility level in Azure SQL Database (150 as of this writing)
Auto-create statistics is disabled. This may cause poor query performance due to suboptimal query plans. Enable auto-create statistics.
ALTER DATABASE CURRENT SET AUTO_CREATE_STATISTICS ON;
Auto-update statistics is disabled. This may cause poor query performance due to suboptimal query plans. Enable auto-update statistics.
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS ON;
Read Committed Snapshot Isolation (RCSI) is disabled. This may cause unnecessary lock blocking, such as read queries blocking write queries. RCSI is the preferred alternative to the NOLOCK hint, which should not be used in scenarios that require correct query results.
The sample query below enables RCSI for a database. Note that this command requires exclusive database access, and will terminate all connections to the database after waiting for 5 seconds to give open transactions an opportunity to complete.
-- Execute in the master database, and change to reference the actual database name
ALTER DATABASE [database-name] SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK AFTER 5 SECONDS;
Snapshot isolation in SQL Server
SET TRANSACTION ISOLATION LEVEL
The Dirty Secrets of NOLOCK, a free course by Kendra Little
Query Store is disabled. This complicates performance tuning and troubleshooting, and could disable certain Intelligent Query Processing features that can improve query performance and reduce resource consumption. Enable Query Store.
ALTER DATABASE CURRENT SET QUERY_STORE = ON;
This tip may only appear on the primary replica.
Monitoring performance by using the Query Store
Best practices with Query Store
Query Store blog articles by Erin Stellato
Query Store operation mode is READ_ONLY. This complicates performance tuning and troubleshooting, and impacts certain Intelligent Query Processing features that can improve query performance and reduce resource consumption.
The details column provides the reason for the read-only state. Change Query Store operation mode to READ_WRITE.
ALTER DATABASE CURRENT SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
This tip may only appear on the primary replica.
Monitoring performance by using the Query Store
Best practices with Query Store
Query Store blog articles by Erin Stellato
Query Store capture mode is NONE. This complicates performance tuning and troubleshooting, and impacts certain Intelligent Query Processing features that can improve query performance and reduce resource consumption.
Change Query Store query capture mode to AUTO or ALL depending on requirements, or use a custom query capture policy that captures all queries relevant to your workload.
ALTER DATABASE CURRENT SET QUERY_STORE (QUERY_CAPTURE_MODE = AUTO);
ALTER DATABASE CURRENT SET QUERY_STORE (QUERY_CAPTURE_MODE = ALL);
This tip may only appear on the primary replica.
Monitoring performance by using the Query Store
Best practices with Query Store
Query Store blog articles by Erin Stellato
AUTO_SHRINK is enabled. An occasional shrinking of a database may be needed in response to a one-time data deletion event and/or to reduce excessive allocated but unused space in data files. However, shrinking should not be executed continuously by keeping auto-shrink enabled at all times, because it causes persistent and high resource utilization that will negatively impact workload performance.
ALTER DATABASE CURRENT SET AUTO_SHRINK OFF;
Considerations for AUTO_SHRINK
Recommendations and guidelines for setting the AUTO_SHRINK database option in SQL Server
Btree indexes have uniqueidentifier (GUID) leading columns. For larger tables, the details column contains a list of btree indexes with uniqueidentifier leading columns. Such indexes are subject to low page density and high fragmentation as data is modified. This leads to increased memory and disk space usage, and negatively impacts performance.
Avoid this pattern in physical database design, particularly for clustered btree indexes. Rebuild indexes periodically when page density is less than 70%, either manually or using a database maintenance solution.
-- Determine average page density for an index.
-- Replace [object_id] and [index_id] placeholders with actual values from the details column, or with NULL/default to see all objects and indexes
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
OBJECT_NAME(ips.object_id) AS object_name,
i.name AS index_name,
i.type_desc AS index_type,
ips.avg_page_space_used_in_percent,
ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), [object_id], [index_id], default, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
ON ips.object_id = i.object_id
AND
ips.index_id = i.index_id
ORDER BY page_count DESC;
-- Consider rebuilding an index if its average page density is below 70%
-- set FILLFACTOR to 80 if significant amounts of new data are being added, or if the values in the uniqueidentifier index leading column change frequently
ALTER INDEX [index_name] ON [schema_name].[table_name] REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAXDOP = 8, FILLFACTOR = 100);
SQL Server Index Architecture and Design Guide
Database maintenance solution by Ola Hallengren
FORCE_LAST_GOOD_PLAN auto-tuning option is not enabled. Query plan regressions will not be fixed automatically via plan forcing. To avoid performance issues due to unexpected query plan regressions, enable this auto-tuning option.
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);
This tip may only appear on the primary replica.
Automatic tuning in Azure SQL Database
Enable automatic tuning in the Azure portal
Used data size within the database is close to maximum configured database size (MAXSIZE). To allow continued data growth, increase maximum database size proactively, or scale up to a service tier or service objective that supports higher maximum database size, or implement data compression, or delete unneeded data.
If used data size is approaching 4 TB, plan on reducing used data size by compressing or deleting data, or switch to the Hyperscale service tier.
-- Determine used space, allocated space, and database MAXSIZE
SELECT SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8 / 1024.) AS space_used_mb,
SUM(CAST(size AS bigint) * 8 / 1024.) AS space_allocated_mb,
CAST(DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes') AS bigint) / 1024. / 1024 AS max_size_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
-- Increase MAXSIZE to 4 TB
ALTER DATABASE [database_name] MODIFY (MAXSIZE = 4096 GB); -- adjust database name and MAXSIZE value according to requirements
Manage file space for databases in Azure SQL Database
What happens when database storage limit is reached
Storage size allocated for data files is close to maximum configured database size (MAXSIZE). If used size is not close to MAXSIZE and significant data growth is not expected, consider shrinking data files to reduce allocated size. This is particularly relevant for databases in elastic pools when the total allocated storage size for the pool is close to pool storage limit.
Note that data file shrink is a long-running and resource-intensive process that should not be executed on a regular basis. Shrink will also increase index fragmentation, which may impact workloads using large read-aheads in index scans.
Prior to shrinking, if page density is low (below 70%), rebuild indexes.
-- Determine used space, allocated space, and database MAXSIZE
SELECT SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8 / 1024.) AS space_used_mb,
SUM(CAST(size AS bigint) * 8 / 1024.) AS space_allocated_mb,
CAST(DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes') AS bigint) / 1024. / 1024 AS max_size_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
-- Determine average page density for each index
-- Replace [object_id] and [index_id] placeholders with actual values from the details column, or with NULL/default to see all objects and indexes
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
OBJECT_NAME(ips.object_id) AS object_name,
i.name AS index_name,
i.type_desc AS index_type,
ips.avg_page_space_used_in_percent,
ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), [object_id], [index_id], default, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
ON ips.object_id = i.object_id
AND
ips.index_id = i.index_id
ORDER BY page_count DESC;
-- Consider rebuilding an index if its average page density is below 70%
ALTER INDEX [index_name] ON [schema_name].[table_name] REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAXDOP = 8);
-- First, attempt to truncate to release any space at the end of the file
DBCC SHRINKFILE (1, TRUNCATEONLY); -- change file_id if needed
-- If allocated space is still too large, shrink the file. This takes longer.
DBCC SHRINKFILE (1, 1024); -- change file_id if needed, change target size from 1024 (1 GB) if needed
Manage file space for databases in Azure SQL Database
Storage size allocated for data files is much larger than used data size. If significant data growth is not expected, consider shrinking data files to reduce allocated size. This is particularly relevant for databases in elastic pools when the total allocated storage size for the pool is close to pool storage limit.
Note that data file shrink is a long-running and resource-intensive process that should not be used on a regular basis. Shrink will also increase index fragmentation, which may impact workloads using large read-aheads in index scans.
Prior to shrinking, if page density is low (below 70%), rebuild indexes.
-- Determine used space, allocated space, and database MAXSIZE
SELECT SUM(CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8 / 1024.) AS space_used_mb,
SUM(CAST(size AS bigint) * 8 / 1024.) AS space_allocated_mb,
CAST(DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes') AS bigint) / 1024. / 1024 AS max_size_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
-- Determine average page density for each index
-- Replace [object_id] and [index_id] placeholders with actual values from the details column, or with NULL/default to see all objects and indexes
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS schema_name,
OBJECT_NAME(ips.object_id) AS object_name,
i.name AS index_name,
i.type_desc AS index_type,
ips.avg_page_space_used_in_percent,
ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), [object_id], [index_id], default, 'SAMPLED') AS ips
INNER JOIN sys.indexes AS i
ON ips.object_id = i.object_id
AND
ips.index_id = i.index_id
ORDER BY page_count DESC;
-- Consider rebuilding an index if its average page density is below 70%
ALTER INDEX [index_name] ON [schema_name].[table_name] REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAXDOP = 8);
-- First, attempt to truncate to release any space at the end of the file
DBCC SHRINKFILE (1, TRUNCATEONLY); -- change file_id if needed
-- If allocated space is still too large, shrink the file. This takes longer.
DBCC SHRINKFILE (1, 1024); -- change file_id if needed, change target size from 1024 (1 GB) if needed
Manage file space for databases in Azure SQL Database
Significant CPU throttling has recently occurred, as noted in the details column. If workload performance has been inadequate during that time, tune the query workload to consume less CPU, scale up to a service objective with more CPU capacity, or both. To find queries that have consumed the most CPU in a time interval, review top queries ranked high by cpu time.
CPU throttling is common on lower, resource-constrained service objectives such as Basic, S0, and S1.
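Recent CPU utilization relative to service objective limits can be reviewed with a query along these lines (a minimal sketch, not part of the tips script; sys.dm_db_resource_stats keeps roughly one hour of history at 15-second granularity):
-- Recent CPU utilization, as a percentage of the service objective limit
SELECT end_time,
       avg_cpu_percent,
       avg_instance_cpu_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;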
What happens when CPU resource limits are reached
Out of memory errors have recently occurred, as noted in the details column. Tune the query workload to consume less memory, scale up to a service objective with more memory, or both. To find queries that have consumed the most memory in a time interval, review top queries ranked high by maximum query memory used.
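Recent out-of-memory events can also be examined directly (a sketch; assumes the sys.dm_os_out_of_memory_events DMV documented for Azure SQL Database):
-- Recent out-of-memory events, most recent first
SELECT *
FROM sys.dm_os_out_of_memory_events
ORDER BY event_time DESC;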
What happens when memory resource limits are reached
Memory grant waits and/or timeouts have recently occurred. The data in the details column notes the number of recent requests for memory grants that had to wait (commonly manifested as RESOURCE_SEMAPHORE waits), and the number of such requests that timed out while waiting. To find queries that have consumed the most memory in a time interval, review top queries ranked high by maximum query memory used.
Tune the query workload to use smaller memory grants, reduce the number of queries running concurrently, or scale up to a service objective with more memory.
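Current memory grants, including requests still waiting for a grant, can be inspected as follows (a sketch, not part of the tips script):
-- Current memory grant requests; rows with NULL grant_time are waiting for a grant
SELECT session_id,
       request_time,
       grant_time,
       requested_memory_kb,
       granted_memory_kb,
       wait_time_ms
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;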
What happens when memory resource limits are reached
Nonclustered indexes with low reads found. The details column contains a list of non-unique nonclustered indexes where the number of index read operations is much less than the number of index write (update) operations.
As data changes, indexes must be updated, which requires CPU and IO resources and slows down DML statements. The resource overhead of updating indexes that have few reads may outweigh their benefit, if any. This overhead is particularly noticeable when there are many indexes with low reads in the database, a phenomenon known as an "over-indexed" database.
If the data in details covers a sufficiently long period during which all critical workloads have run, drop these indexes. Then, monitor performance to validate that no significant negative impact has occurred. If a performance regression is unacceptable, recreate the dropped indexes.
When running query workloads on readable secondary replicas, use the sample query below to collect index read and write statistics from all replicas and add them up for each index. An index that has low reads on the primary replica may have high reads on readable secondary replicas, and may still be required.
-- Index read (seeks, scans, lookups) and write (updates) statistics for each nonclustered index not used for constraint enforcement
SELECT QUOTENAME(OBJECT_SCHEMA_NAME(o.object_id)) AS schema_name,
QUOTENAME(o.name) AS object_name,
QUOTENAME(i.name) AS index_name,
ius.user_seeks,
ius.user_scans,
ius.user_lookups,
ius.user_updates
FROM sys.dm_db_index_usage_stats AS ius
INNER JOIN sys.indexes AS i
ON ius.object_id = i.object_id
AND
ius.index_id = i.index_id
INNER JOIN sys.objects AS o
ON i.object_id = o.object_id
AND
ius.object_id = o.object_id
WHERE ius.database_id = DB_ID()
AND
i.type_desc = 'NONCLUSTERED'
AND
i.is_primary_key = 0
AND
i.is_unique_constraint = 0
AND
i.is_unique = 0
AND
o.is_ms_shipped = 0;
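If analysis confirms an index is not needed, dropping it could look like this (hypothetical names; script out the index definition first so it can be recreated if necessary):
-- Drop a low-read nonclustered index identified in the details column
DROP INDEX [index_name] ON [schema_name].[table_name];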
SQL Server Index Architecture and Design Guide
ROW or PAGE data compression opportunities found. The details column contains a list of partition ranges for each object and index where data compression may be beneficial, based on recent workload sampling and heuristics. The current and suggested new data compression types are shown. The list is sorted with the largest objects first. To improve tip accuracy, obtain this result while a representative workload is running, or shortly thereafter.
In the following example of data in the details column, the suggested data compression is ROW for the first 5 partitions, and PAGE for the last 5 partitions of the clustered index on the usertable table. The total size of all partitions in each range, and a breakdown of that size by storage type, are included. Row overflow and LOB storage are not compressible.
schema: [dbo], object: [usertable], index: [pk_usertable], index type: CLUSTERED, object size (MB): 11,535.46, partition range: 1-5, partition range total size (MB): 5,319.10 (in-row: 5,319.10, row overflow: 0.00, LOB: 0.00), present compression type: ROW, suggested compression type: ROW
schema: [dbo], object: [usertable], index: [pk_usertable], index type: CLUSTERED, object size (MB): 11,535.46, partition range: 6-10, partition range total size (MB): 6,216.36 (in-row: 6,216.36, row overflow: 0.00, LOB: 0.00), present compression type: ROW, suggested compression type: PAGE
ROW compression adds minimal CPU utilization while providing substantial storage and memory savings. For most workloads, ROW compression should be enabled by default for new tables and indexes. PAGE compression adds higher CPU utilization, but also provides higher storage and memory savings.
-- Create an index with ROW compression
CREATE INDEX ix_table1_1 ON dbo.table1 (column1, column2) WITH (ONLINE=ON, RESUMABLE=ON, DATA_COMPRESSION=ROW);
-- Rebuild an index with PAGE compression
ALTER INDEX ix_table1_2 ON dbo.table1 REBUILD WITH (ONLINE=ON, RESUMABLE=ON, DATA_COMPRESSION=PAGE);
Additional analysis and testing based on information in provided links may be required to select data compression types optimally. Any data compression requires additional CPU processing, thus testing is particularly important when low query latency must be maintained in transactional workloads.
This tip may only appear on the primary replica.
Data Compression: Strategy, Capacity Planning and Best Practices
There are recent occurrences of transaction log IO (aka log rate) approaching the limit of the service objective, as noted in the details column. To improve performance of bulk data modifications, including data loading, consider tuning the workload to reduce log IO, or scale up to a service objective with a higher log IO limit.
Additional solutions and workarounds include:
- Using data compression (ROW, PAGE, columnstore) to reduce effective log IO;
- Loading transient data to be further transformed/aggregated into the tempdb database (e.g. using global temporary tables), or into non-durable memory-optimized tables. If using global temporary tables (e.g. ##tablename), set the GLOBAL_TEMPORARY_TABLE_AUTO_DROP database-scoped configuration to OFF to retain global temporary tables until the database engine restarts, as shown in the sample below.
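For reference, the configuration change mentioned in the last item can be made as follows:
-- When OFF, global temporary tables persist until explicitly dropped or until engine restart,
-- instead of being dropped when no longer in use by any session
ALTER DATABASE SCOPED CONFIGURATION SET GLOBAL_TEMPORARY_TABLE_AUTO_DROP = OFF;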
This tip may only appear on the primary replica.
Transaction log rate governance
A significant part of the plan cache is occupied by single-use plans, which have never been reused by query executions and therefore use memory unnecessarily.
To avoid caching single-use plans for a database and reduce memory consumption, enable the OPTIMIZE_FOR_AD_HOC_WORKLOADS database-scoped configuration for each database where single-use plans occupy a large amount of memory, as noted in the details column. Also review application code to ensure that all parameter data type declarations match table column data types exactly, including the length, precision, and scale of the column, and are not derived from each value assigned to the parameter.
ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZE_FOR_AD_HOC_WORKLOADS = ON;
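To illustrate the parameter typing guidance above, here is a hedged sketch with hypothetical table and column names: declaring a parameter with an explicit type that exactly matches the column definition avoids caching a separate plan per distinct parameter length.
-- The @name parameter is declared as nvarchar(50), matching the hypothetical
-- customer_name column definition, rather than deriving the length from each value
EXEC sys.sp_executesql
    N'SELECT customer_id, customer_name FROM dbo.customers WHERE customer_name = @name;',
    N'@name nvarchar(50)',
    @name = N'Contoso';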
ALTER DATABASE SCOPED CONFIGURATION
Queries could benefit from indexes that do not currently exist. The details column includes data to help create new indexes, based on metadata collected since database engine startup.
For more accurate indexing recommendations based on longer-term historical data, and including CREATE INDEX and DROP INDEX statements, see performance recommendations for this database in Azure portal, or use the Get-AzSqlDatabaseRecommendedAction PowerShell cmdlet.
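The underlying metadata can also be explored directly via the missing index DMVs (a sketch; these views are reset on engine restart):
-- Missing index suggestions accumulated since engine startup,
-- ordered by a rough estimate of potential improvement
SELECT mid.statement AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_user_impact,
       migs.avg_total_user_cost * migs.user_seeks * (migs.avg_user_impact / 100.0) AS improvement_estimate
FROM sys.dm_db_missing_index_details AS mid
INNER JOIN sys.dm_db_missing_index_groups AS mig
ON mid.index_handle = mig.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS migs
ON mig.index_group_handle = migs.group_handle
ORDER BY improvement_estimate DESC;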
SQL Server Index Architecture and Design Guide
Redo queue on a secondary readable replica is large. Queries running on this replica may experience significant data latency relative to the primary replica. A large redo queue may also cause longer failovers if the replica is used as the failover target.
Common causes of a large redo queue are CPU and/or IO resource contention on the secondary replica, or blocking of the redo thread due to long-running queries on the secondary replica and concurrent DDL operations on the primary replica.
Tune or reduce the workload on the secondary replica to use fewer resources, and/or increase the service objective. Avoid DDL operations such as schema changes, index rebuilds, or statistics updates on the primary replica while long-running queries are executing on the secondary replica.
This tip is produced only on readable secondary replicas, including non-Hyperscale read scale-out replicas and geo-replicas.
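Redo queue size and redo rate can be checked on the secondary replica (a sketch; assumes the sys.dm_database_replica_states DMV referenced in the monitoring documentation below):
-- On a readable secondary replica: redo queue size (KB) and redo rate (KB/sec)
SELECT database_id,
       redo_queue_size,
       redo_rate
FROM sys.dm_database_replica_states;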
Monitoring and troubleshooting read-only replicas
Data IOPS are close to the workload group limit. If workload performance for this database is insufficient and PAGEIOLATCH* waits are a top wait type, either tune the workload to generate less data IO, or scale to a service objective with a higher data IOPS limit.
A workload group defines resource governance limits for a single database, or for a database in an elastic pool.
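Recent data IO utilization relative to the governance limit can be checked as follows (a sketch using sys.dm_db_resource_stats, not part of the tips script):
-- Recent data IO utilization, as a percentage of the service objective limit
SELECT end_time,
       avg_data_io_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;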
Resource limits for single databases using the vCore purchasing model
Resource limits for single databases using the DTU purchasing model
Compare the DTU-based service tiers
Workload group IO governance impact is significant. If workload performance for this database is insufficient and PAGEIOLATCH* waits are a top wait type, either tune the workload to generate less data IO, or scale to a service objective with a higher data IOPS limit.
A workload group defines resource governance limits for a single database, or for a database in an elastic pool.
Resource limits for single databases using the vCore purchasing model
Resource limits for single databases using the DTU purchasing model
Compare the DTU-based service tiers
Data IOPS are close to the resource pool limit. If workload performance for one or more databases in this elastic pool is insufficient and PAGEIOLATCH* waits are a top wait type, either tune the workload to generate less data IO, scale the elastic pool to a service objective with a higher data IOPS limit, or rebalance databases across multiple elastic pools to reduce cumulative data IO in this elastic pool.
A resource pool defines resource governance limits for an elastic pool.
Resource limits for elastic pools using the vCore purchasing model
Resource limits for elastic pools using the DTU purchasing model
Compare the DTU-based service tiers
Resource pool IO governance impact is significant. If workload performance for one or more databases in this elastic pool is insufficient and PAGEIOLATCH* waits are a top wait type, either tune the workload to generate less data IO, scale the elastic pool to a service objective with a higher data IOPS limit, or rebalance databases across multiple elastic pools to reduce cumulative data IO in this elastic pool.
A resource pool defines resource governance limits for an elastic pool.
Resource limits for elastic pools using the vCore purchasing model
Resource limits for elastic pools using the DTU purchasing model
Compare the DTU-based service tiers
Persistent Version Store (PVS) size is large. PVS size, used and allocated data size, maximum database size, and other relevant information are available in the details column.
PVS is a part of the user database, and is used to store row versions when Accelerated Database Recovery (ADR), Read Committed Snapshot Isolation (RCSI), or Snapshot Isolation (SI) are enabled for the database. Large PVS size is usually caused by long-running active transactions. It may also occur due to a large number of aborted transactions that haven't yet been cleaned up by the PVS cleaner background thread. In extreme cases, a large PVS may cause the database to run out of space until PVS space is cleaned up.
Review application patterns to avoid long-running transactions, and make sure to explicitly roll back open transactions after query timeouts and other errors.
If any transactions have been unintentionally left open for a long time and caused PVS to become large, consider killing them to allow PVS cleanup to occur.
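Current PVS size can also be checked directly (a sketch; assumes the sys.dm_tran_persistent_version_store_stats DMV documented for ADR version store statistics):
-- Current PVS size for this database, in MB
SELECT persistent_version_store_size_kb / 1024. AS pvs_size_mb
FROM sys.dm_tran_persistent_version_store_stats
WHERE database_id = DB_ID();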
-- Transactions sorted by duration
SELECT st.session_id,
dt.database_transaction_begin_time,
DATEDIFF(second, dt.database_transaction_begin_time, CURRENT_TIMESTAMP) AS transaction_duration_seconds,
dt.database_transaction_log_bytes_used,
dt.database_transaction_log_bytes_reserved,
st.is_user_transaction,
st.open_transaction_count,
ib.event_type,
ib.parameters,
ib.event_info
FROM sys.dm_tran_database_transactions AS dt
INNER JOIN sys.dm_tran_session_transactions AS st
ON dt.transaction_id = st.transaction_id
OUTER APPLY sys.dm_exec_input_buffer(st.session_id, default) AS ib
WHERE dt.database_id = DB_ID()
ORDER BY transaction_duration_seconds DESC;
-- Kill the session with a transaction open for a long time
KILL 4242;
Manage file space for databases in Azure SQL Database
Paused resumable index maintenance operations found. Paused resumable index operations are aborted automatically after a period of time specified by the PAUSED_RESUMABLE_INDEX_ABORT_DURATION_MINUTES database-scoped configuration.
Either resume these index operations, or, if you are not planning to resume them, abort them explicitly to avoid unnecessary performance impact on the DML operations changing these indexes between now and the time when the auto-abort occurs, as shown in the samples below.
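Paused resumable operations and their progress can be listed from the catalog view referenced below (a sketch):
-- Resumable index operations and their current state
SELECT OBJECT_SCHEMA_NAME(object_id) AS schema_name,
       OBJECT_NAME(object_id) AS object_name,
       name AS index_name,
       state_desc,
       percent_complete,
       last_pause_time
FROM sys.index_resumable_operations;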
ALTER INDEX [index_name] ON [schema_name].[object_name] RESUME; -- change schema, object, and index names
ALTER INDEX [index_name] ON [schema_name].[object_name] ABORT; -- change schema, object, and index names
This tip may only appear on the primary replica.
sys.index_resumable_operations
A clustered columnstore index (CCI) can be created on one or more tables to improve query and data load performance and drastically reduce storage space and memory consumption by the table.
CCI indexes are most useful in analytical/reporting workloads where queries use large scans. CCI indexes are not recommended in OLTP workloads, or when queries predominantly use index seeks, lookups, and small range scans. They are also not recommended when rows in the table are frequently updated, or when less frequent but large updates occur.
The details column provides a list of CCI candidate tables. Index usage statistics for the recent query workload on each table are included to help determine if a CCI will benefit queries referencing the table.
To improve tip accuracy, obtain this result while a representative workload is running, or shortly thereafter.
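Creating a CCI on a candidate table could look like this (hypothetical object names; validate query performance before and after the change):
-- Create a clustered columnstore index on a heap
CREATE CLUSTERED COLUMNSTORE INDEX cci_table1 ON dbo.table1;
-- Or convert an existing clustered rowstore index, keeping its name
CREATE CLUSTERED COLUMNSTORE INDEX [existing_clustered_index_name] ON dbo.table1 WITH (DROP_EXISTING = ON);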
This tip may only appear on the primary replica.
Columnstore indexes - Design guidance
Columnstore indexes - Query performance
Columnstore indexes - Data loading guidance
Some geo-replication links may be unhealthy. The details column describes geo-replication link health, including data propagation lag (reported on the primary geo-replica only) and replica state.
If a geo-replication link becomes unhealthy during high write load on the primary geo-replica, the service objective of the secondary geo-replica may need to be increased to sustain the data propagation rate and avoid geo-replica unavailability.
For a recently created geo-replica, the SEEDING replication state is expected. In all other cases, the replication state for a healthy geo-replica should be CATCH_UP.
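Link health can be examined on the primary geo-replica with a query like this (a sketch using the DMV linked below):
-- Geo-replication link state and propagation lag, as seen from the primary geo-replica
SELECT partner_server,
       partner_database,
       replication_state_desc,
       replication_lag_sec,
       last_replication
FROM sys.dm_geo_replication_link_status;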
sys.dm_geo_replication_link_status
Last partitions are not empty for some partitioned tables. If the very last partition is not empty and is split to create a new partition for new data, the split operation will be slow and resource-intensive.
If new partitions are periodically added to accommodate new data, make sure to keep a buffer of several empty partitions at the end of the partition list. While a single empty partition is sufficient, multiple empty partitions are preferred: they reduce the risk of data getting into the very last partition if automation fails to split the last empty partition on time.
The details column provides the list of the last several partitions for each table where at least some of them are not empty. If the very last partition (i.e. the partition number equal to the total count of partitions) is empty, act before any data is loaded into it: split it to create at least one empty partition at the end of the partition list. Otherwise, plan and prepare for a time-consuming process of splitting a non-empty partition, to avoid all new data landing in the last partition.
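Row counts in the last partitions of a table can be checked with a query along these lines (a sketch with a hypothetical table name):
-- Row counts per partition, highest partition numbers first
SELECT p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.table1')
AND
p.index_id IN (0, 1) -- heap or clustered index
ORDER BY p.partition_number DESC;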
ALTER PARTITION FUNCTION [function_name] SPLIT RANGE ([new boundary value]);
Partitioned Tables and Indexes
Blog articles by Kendra Little:
How to Fix Lopsided Partitioned Tables
Sliding Window Table Partitioning: What to Decide Before You Automate
To help in performance investigations, top queries for a time interval are summarized in the details column. This tip is informational in nature, and is produced when any queries have executed in the time interval. On idle databases, top queries may include the queries executed by the get-sqldb-tips.sql script itself. On readable secondary replicas, this tip reports on queries that executed on the primary.
For each query, identified by its query_hash value, the details column includes the associated query_id and plan_id values, the number of query executions by execution type (regular, aborted, exception), and weighted wait categories.
Additionally, each query is ranked by its consumption of the following resources:
- cpu time
- duration
- execution count
- logical IO reads
- physical IO reads
- maximum query memory used
- log bytes used
- tempdb space used
- parallelism
Lower rank numbers mean higher rank, e.g. the highest CPU consuming query has cpu time rank 1.
In the following example, the query with hash 0x5F9043DBAAEBB1FF is the top consumer of tempdb space (tempdb used rank: 1), has two associated query_id values (66 and one more), and three associated plan_id values (36 and two more). It successfully executed 31615 times in the time interval. Most waits (~77.4%) were on CPU.
query hash: 0x5F9043DBAAEBB1FF, query_id: 66 (+1), plan_id: 36 (+2), executions: (regular: 31615, aborted: 0, exception: 0), CPU time rank: 4, duration rank: 32, executions rank: 50, logical IO reads rank: 6, physical IO reads rank: 5, max used memory rank: 79, log bytes used rank: 48, tempdb used rank: 1, parallelism rank: 50, weighted wait categories: CPU (0.774) | Latch (0.137) | Memory (0.057) | Unknown (0.028) | Preemptive (0.002) | Buffer IO (0.002)
Depending on the nature of the performance issue, focus on the queries ranked high for the relevant resource type. For example, if troubleshooting high tempdb space consumption, look at the tempdb used rank.
To change the number of top ranked queries included for each resource type, change the @QueryStoreTopQueryCount variable.
By default, the time interval used is the last hour. You can change the @QueryStoreIntervalMinutes variable to look at shorter or longer intervals. To look at a specific time interval in the past, set the @QueryStoreCustomTimeStart and @QueryStoreCustomTimeEnd variables to the start and end time of the interval. Note that this tip may not appear if the specified interval is shorter than the configured Query Store resource statistics interval length.
For further investigations, use Query Performance Insight in Azure portal, Query Store reports in SSMS, or query the sys.query_store_* views directly.
The query below provides resource consumption details for a given query hash value, as well as the associated query_id and plan_id values and SQL text.
-- Return query/plan runtime statistics in the specified interval for a given query_hash value from the details column for further investigation
DECLARE @QueryHash binary(8) = 0x113AAAF159577E8C; -- replace query hash with the actual value from details
DECLARE @QueryStoreIntervalMinutes int = 60;
DECLARE @QueryStoreCustomTimeStart datetimeoffset -- = '2021-01-01 00:01 +00:00';
DECLARE @QueryStoreCustomTimeEnd datetimeoffset -- = '2021-12-31 23:59 +00:00';
SELECT q.query_id,
q.context_settings_id,
qt.query_sql_text,
OBJECT_SCHEMA_NAME(q.object_id) AS schema_name,
OBJECT_NAME(q.object_id) AS object_name,
p.plan_id,
TRY_CAST(p.query_plan AS xml) AS query_plan,
SUM(IIF(rs.execution_type_desc = 'Regular', rs.count_executions, 0)) AS count_regular_executions,
SUM(IIF(rs.execution_type_desc = 'Aborted', rs.count_executions, 0)) AS count_aborted_executions,
SUM(IIF(rs.execution_type_desc = 'Exception', rs.count_executions, 0)) AS count_exception_executions,
SUM(rs.avg_cpu_time * rs.count_executions) / SUM(rs.count_executions) AS avg_cpu_time,
SUM(rs.avg_duration * rs.count_executions) / SUM(rs.count_executions) AS avg_duration,
SUM(rs.avg_logical_io_reads * rs.count_executions) / SUM(rs.count_executions) AS avg_logical_io_reads,
SUM(rs.avg_physical_io_reads * rs.count_executions) / SUM(rs.count_executions) AS avg_physical_io_reads,
SUM(rs.avg_query_max_used_memory * rs.count_executions) / SUM(rs.count_executions) AS avg_query_max_used_memory,
SUM(rs.avg_log_bytes_used * rs.count_executions) / SUM(rs.count_executions) AS avg_log_bytes_used,
SUM(rs.avg_tempdb_space_used * rs.count_executions) / SUM(rs.count_executions) AS avg_tempdb_space_used,
SUM(rs.avg_dop * rs.count_executions) / SUM(rs.count_executions) AS avg_dop
FROM sys.query_store_query AS q
INNER JOIN sys.query_store_query_text AS qt
ON q.query_text_id = qt.query_text_id
INNER JOIN sys.query_store_plan AS p
ON q.query_id = p.query_id
INNER JOIN sys.query_store_runtime_stats AS rs
ON p.plan_id = rs.plan_id
INNER JOIN sys.query_store_runtime_stats_interval AS rsi
ON rs.runtime_stats_interval_id = rsi.runtime_stats_interval_id
WHERE q.query_hash = @QueryHash
AND
rsi.start_time >= IIF(
(@QueryStoreCustomTimeStart IS NULL OR @QueryStoreCustomTimeEnd IS NULL) AND @QueryStoreIntervalMinutes IS NOT NULL,
DATEADD(minute, -@QueryStoreIntervalMinutes, SYSDATETIMEOFFSET()),
@QueryStoreCustomTimeStart
)
AND
rsi.start_time <= IIF(
(@QueryStoreCustomTimeStart IS NULL OR @QueryStoreCustomTimeEnd IS NULL) AND @QueryStoreIntervalMinutes IS NOT NULL,
SYSDATETIMEOFFSET(),
@QueryStoreCustomTimeEnd
)
GROUP BY q.query_hash,
q.query_id,
q.context_settings_id,
q.object_id,
qt.query_sql_text,
p.plan_id,
p.query_plan
ORDER BY query_id, plan_id
OPTION (RECOMPILE);
Query Performance Insight for Azure SQL Database
Monitoring performance by using the Query Store
Best practices with Query Store
Query Store blog articles by Erin Stellato
Storage space allocated for tempdb data files is close to maximum tempdb data size. The details column provides tempdb allocated, used, and maximum data sizes. If additional tempdb space is needed, consider scaling to a service objective that provides a larger tempdb.
In the Premium and Business Critical service tiers, tempdb shares local SSD space with user databases. Large tempdb allocated space can prevent user database growth and cause out-of-space errors.
As an emergency measure, you can reset tempdb allocated space to its initial small value by failing over the database or elastic pool using PowerShell or REST API. Use Invoke-AzSqlDatabaseFailover or the database failover API for single databases, and Invoke-AzSqlElasticPoolFailover or the elastic pool failover API for elastic pools. A failover will abort any running queries, terminate connections, and cause a short period of unavailability for the database, or for all databases in an elastic pool.
Resource limits for single databases using the vCore purchasing model
Resource limits for elastic pools using the vCore purchasing model
tempdb database in the DTU purchasing model
Queries to monitor tempdb usage
Storage space used in tempdb data files is close to maximum tempdb data size. The details column provides tempdb allocated, used, and maximum data sizes. Continuing growth of used space may cause tempdb out-of-space errors. If additional tempdb space is needed, consider scaling to a service objective that provides a larger tempdb.
As an emergency measure, you can reduce tempdb used size to its initial small value by failing over the database or elastic pool using PowerShell or REST API. Use Invoke-AzSqlDatabaseFailover or the database failover API for single databases, and Invoke-AzSqlElasticPoolFailover or the elastic pool failover API for elastic pools. A failover will abort any running queries, terminate connections, and cause a short period of unavailability for the database, or for all databases in an elastic pool.
Resource limits for single databases using the vCore purchasing model
Resource limits for elastic pools using the vCore purchasing model
tempdb database in the DTU purchasing model
Queries to monitor tempdb usage
Storage space allocated for the tempdb log file is close to maximum tempdb log size. The details column provides tempdb allocated, used, and maximum transaction log sizes. This may be caused by a long-running transaction modifying data in a temporary table, or using tempdb for internal storage.
As an emergency measure, you can reset tempdb allocated log size to its initial small value by failing over the database or elastic pool using PowerShell or REST API. Use Invoke-AzSqlDatabaseFailover or the database failover API for single databases, and Invoke-AzSqlElasticPoolFailover or the elastic pool failover API for elastic pools. A failover will abort any running queries, terminate connections, and cause a short period of unavailability for the database, or for all databases in an elastic pool. Alternatively, you can kill the sessions with long-running transactions in tempdb.
-- Transactions in tempdb sorted by duration
SELECT st.session_id,
dt.database_transaction_begin_time,
DATEDIFF(second, dt.database_transaction_begin_time, CURRENT_TIMESTAMP) AS transaction_duration_seconds,
dt.database_transaction_log_bytes_used,
dt.database_transaction_log_bytes_reserved,
st.is_user_transaction,
st.open_transaction_count,
ib.event_type,
ib.parameters,
ib.event_info
FROM sys.dm_tran_database_transactions AS dt
INNER JOIN sys.dm_tran_session_transactions AS st
ON dt.transaction_id = st.transaction_id
OUTER APPLY sys.dm_exec_input_buffer(st.session_id, default) AS ib
WHERE dt.database_id = 2
ORDER BY transaction_duration_seconds DESC;
-- Kill the session with a transaction open for a long time
KILL 4242;
Queries to monitor tempdb usage
There are recent occurrences of worker utilization approaching the workload group limit of the service objective, as noted in the details column. A workload group defines resource governance limits for a single database, or for a database in an elastic pool.
If the limit is reached, queries and new connection attempts will fail. Common reasons for high worker utilization include:
- excessive parallelism
- large blocking chains
- the number of concurrently executing queries or concurrent connection attempts exceeding the capacity of the service objective
Tune query workload and reduce/stagger connection attempts, and/or scale up to a service objective with a higher worker limit. To find queries that contribute the most to worker utilization in a time interval, review top queries ranked high by parallelism and/or execution count.
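Recent worker utilization relative to the workload group limit can be checked as follows (a sketch using sys.dm_db_resource_stats, not part of the tips script):
-- Recent worker utilization, as a percentage of the workload group limit
SELECT end_time,
       max_worker_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;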
Resource limits for single databases using the vCore purchasing model
Resource limits for single databases using the DTU purchasing model
What happens when worker resource limit is reached
There are recent occurrences of worker utilization approaching the resource pool limit of the service objective, as noted in the details column. A resource pool defines resource governance limits for an elastic pool.
If the limit is reached, queries and new connection attempts will fail for some or all databases in the elastic pool. Common reasons for high worker utilization include:
- excessive parallelism
- large blocking chains
- the number of concurrently executing queries or concurrent connection attempts exceeding the capacity of the service objective
Tune query workloads and reduce/stagger connection attempts, and/or scale up to a service objective with a higher worker limit. To find queries that contribute the most to worker utilization in an elastic pool in a time interval, review top queries for all databases in the pool starting with the most utilized, ranked high by parallelism and/or execution count.
Resource limits for elastic pools using the vCore purchasing model
Resource limits for elastic pools using the DTU purchasing model
What happens when worker resource limit is reached
Notable network connectivity events have recently occurred in this database, or in other databases in the same elastic pool. This includes network errors, slow login attempts, and abnormally closed connections. Retry logic implemented by an application or an MSSQL network driver can prevent these events from impacting applications.
This tip is informational in nature. The presence of network connectivity events does not necessarily indicate a problem; some events, while reported by this tip, may be expected in Azure SQL Database. For example, idle sessions are killed after a period of inactivity by design.
The details column provides a query to view detailed connectivity event data obtained in the latest execution of the script.
To reduce the likelihood of workload-impacting network events, configure the logical server to use the Redirect connection policy.
High CPU utilization by the database engine instance has been sustained for a significant period of time, as noted in the details column. If the database is in an elastic pool, this refers to cumulative CPU utilization by all databases in the same pool.
If workload performance has been inadequate during that time, tune the query workload to consume less CPU, scale up to a service objective with more CPU capacity, or both. To find queries that have consumed the most CPU in a time interval, review top queries for the database, or for the most utilized databases in an elastic pool, ranked high by cpu time.
What happens when CPU resource limits are reached
Identify query performance issues
Application and database tuning guidance
Detectable database performance patterns
Some statistics may be out of date. This may cause poor query performance due to suboptimal query plans. The list of potentially out-of-date statistics on user objects and some system objects is included in the details column.
Update statistics by executing sp_updatestats, UPDATE STATISTICS, or a database maintenance solution, and consider establishing a periodic maintenance procedure to keep statistics up to date.
EXEC sys.sp_updatestats;
Database maintenance solution by Ola Hallengren
Many tables do not have any indexes. This tip appears when a substantial percentage of all tables in the database (excluding very small tables) do not have any indexes. This often causes poor query performance due to suboptimal data retrieval. The list of such tables is included in the details column.
You can enable the CREATE INDEX auto-tuning option for the database to have the system create missing indexes automatically, improving query performance for recent workloads that used these tables, as shown in the sample below.
If you prefer to create indexes manually, review the Index Architecture and Design Guide to learn how to create efficient indexes. If the "Queries could benefit from indexes that do not currently exist" tip also appears, you can derive new indexes from the data in the details column of that tip.
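For reference, the CREATE INDEX auto-tuning option can also be enabled with T-SQL:
-- Enable automatic index creation for the current database
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (CREATE_INDEX = ON);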
Enable automatic tuning in the Azure portal to monitor queries and improve workload performance