2023 SQL Server Performance Tunning Note

2023-06-20

筆記 2023 處理文學書房 SQL Server 資料庫效能問題的過程與心得。

Why Query Slow From Different Expert

Performance Tuning Expert

Lack of Proper Indexing: Insufficient or inappropriate indexing can result in queries performing full table scans or index scans, leading to decreased performance.
Incorrect Indexing Strategy: Creating too many or unnecessary indexes can increase the cost of write operations and degrade write performance.
Locking Contention: Concurrent transactions can result in locking contention, particularly in long-running transactions or high-concurrency environments, causing performance bottlenecks and blocking.
Disk I/O Bottleneck: Slow disks or excessive disk I/O operations can lead to performance issues. This could be due to disk failures, improper configurations, incorrect disk partitioning, or incorrect I/O operations.
Poor Query Writing: Poorly written queries can contribute to performance degradation. This includes unnecessary joins, subqueries, incorrect predicates, and inappropriate query plans.
Excessive Query Parameterization: Over-parameterization when the same query is executed multiple times with different parameters can lead to performance degradation. This may be caused by cache invalidation and frequent recompilations.
Improper Memory Configuration: Incorrect memory configuration can result in low buffer cache hit ratios, increasing disk I/O operations. Additionally, incorrect max memory settings can lead to memory pressure and performance issues.
Inadequate Hardware Resources: Insufficient server hardware resources, such as CPU, memory, and disk, can cause performance bottlenecks. This may require hardware upgrades or reconfiguring server setups.
Inaccurate Statistics: Outdated or inaccurate statistics can cause SQL Server to make incorrect query plan decisions, impacting performance.
Incorrect Parallelism Settings: Improperly configured parallel query settings can result in excessive parallelism, consuming significant CPU and memory resources. This may require adjusting the max degree of parallelism setting appropriately.

Developer

Poorly Written Queries: Inefficient queries with complex joins, subqueries, or excessive data retrieval can lead to slow performance.
Lack of Indexes: Insufficient or missing indexes on frequently queried columns can result in table scans or high I/O operations, impacting performance.
Incorrect Data Types: Inappropriate data types can lead to implicit conversions, reducing query performance.
Improperly Designed Database Schema: Inadequate normalization, lack of proper relationships, or excessive denormalization can cause inefficient queries and slow performance.
Lack of Parameterization: Failure to use parameterized queries allows for SQL injection vulnerabilities and can lead to unnecessary query compilation and decreased performance.
Insufficient Connection Pooling: Inefficient connection management, such as opening and closing connections for each query, can result in performance degradation due to increased overhead.
Missing or Inadequate Caching: Not implementing caching mechanisms or utilizing caching techniques effectively can result in redundant database queries and slower performance.
Improper Transaction Management: Long-running or poorly managed transactions can cause blocking and lock contention, leading to performance issues.
Inadequate Error Handling: Insufficient error handling and exception management can result in repeated query execution or unnecessary rollbacks, impacting performance.
Lack of Query Plan Optimization: Failure to analyze and optimize query execution plans can lead to suboptimal plans, inefficient index usage, and poor performance.

Infrastructure

Insufficient Memory: Inadequate memory allocation for SQL Server can lead to excessive disk I/O operations, as data pages are frequently read from and written to disk.
Slow Disk Subsystem: Slow or misconfigured disk subsystem, including disks with low RPM or improper RAID configurations, can result in high disk latency and poor I/O performance.
CPU Bottleneck: Inadequate CPU resources can lead to high CPU utilization, causing queries to wait in a runnable state and impacting overall performance.
Network Latency: Slow or congested network connections between the application server and SQL Server can result in delayed query execution and decreased performance.
Inadequate Server Configuration: Improper configuration of SQL Server settings, such as max degree of parallelism, maximum server memory, or priority boost, can negatively impact performance.
Lack of Maintenance: Failure to regularly update statistics, perform database maintenance tasks, and optimize indexes can lead to degraded performance over time.
Inadequate Backup and Recovery Strategy: Improper backup and recovery mechanisms can cause increased disk I/O, impacting performance during backup and restore operations.
Inefficient Virtualization: Poorly configured or overcommitted virtual environments can result in resource contention, affecting SQL Server performance.
Firewall and Security Settings: Restrictive firewall rules or misconfigured security settings can cause network delays and authentication issues, impacting query performance.
Inadequate Monitoring and Capacity Planning: Insufficient monitoring of server performance metrics and lack of capacity planning can lead to unexpected resource limitations and performance degradation.

DBA

Poor Indexing Strategy: Inadequate or missing indexes can lead to excessive disk I/O, increased CPU usage, and slower query performance.
Outdated Statistics: Out-of-date statistics can result in inaccurate cardinality estimates, leading to suboptimal query plans and decreased performance.
Fragmented Indexes: Fragmented indexes can cause increased disk I/O operations and decreased query performance. Regular index maintenance is essential.
Inefficient Query Execution Plans: Suboptimal query plans, such as scans instead of seeks or improper join algorithms, can lead to performance degradation. Reviewing and optimizing execution plans is crucial.
Blocking and Locking Contention: Long-running transactions, insufficient isolation levels, or concurrent access to the same resources can cause blocking and locking contention, impacting performance.
Insufficient Memory Allocation: Inadequate memory allocation to SQL Server can result in excessive disk I/O, as data pages are frequently read from and written to disk.
Tempdb Bottlenecks: Tempdb contention due to high usage, improper configuration, or lack of multiple data files can degrade performance for queries using temporary objects.
Inadequate Server Configuration: Incorrect configuration settings, such as max worker threads, parallelism, or memory settings, can impact SQL Server performance.
Disk Configuration Issues: Inappropriate disk configurations, such as using RAID 5 for write-intensive workloads or placing log files and data files on the same physical disks, can lead to performance problems.
Resource Contentions: Insufficient CPU, memory, or disk resources can result in resource contention, causing slower response times and degraded performance.

Methodology

蒐集使用數據、查詢語法、執行計畫
分析是否組態設定問題
分析索引維護問題
分析是否缺少索引或者索引設計不佳
分析是否查詢語法設計不佳
分析是否硬體規格不足
處理組態設定與維護排程 (C)
處理遺漏的索引或調整索引設計 (I)
處理程式設計與查詢與改 (Q)
處理硬體設備強化 (H)

Analyzing Operator Properties

Estimated and Actual Rows: Compare the estimated number of rows with the actual number of rows processed by an operator. Significant discrepancies (差異) between the estimates and actuals may indicate issues such as outdated statistics or incorrect cardinality estimates.

Cardinality estimates in SQL Server refer to the estimated number of rows that will be returned or processed by a specific operation or operator in the query execution plan. These estimates are crucial for SQL Server to determine the most efficient execution plan and allocate appropriate resources for query processing.

I/O and CPU Statistics: Look at the I/O and CPU statistics for each operator. High I/O or CPU usage could suggest areas for performance improvement, such as optimizing indexes or rewriting queries.

Predicates: Predicates refer to the conditions used to filter or join data. Evaluate the predicates associated with each operator to ensure they match your intended query logic. If there are unnecessary or incorrect predicates, it may affect performance.

When analyzing predicates in an execution plan, pay attention to the following values and metrics:

Selectivity: Evaluate the selectivity of predicates, which refers to the percentage of rows that satisfy a specific condition. Higher selectivity can lead to more efficient query plans.
Predicate Pushdown: Determine if predicates are pushed down to the underlying table scans or seeks. Pushing predicates closer to the data source can reduce the amount of data processed, improving performance. (使 Table 或 Index 儘量早的使用 Predicate 縮限資料量，而非在執行計畫的最後才進行 Predicate)
CPU and I/O Statistics: Assess the CPU and I/O statistics associated with operators using predicates. High CPU or I/O usage can indicate performance bottlenecks that may require optimization.

Access Methods: Depending on the operator, there may be specific access methods used, such as index scans or seeks. Check whether the access methods align with your expectations and consider potential improvements, such as adding or modifying indexes.

Parallelism: Some operators may have parallel execution properties, indicated by parallel arrows connecting them. Evaluate the use of parallelism and assess its impact on query performance.

Webber Script

SELECT 
    qs.creation_time,
    qs.execution_count,
    qs.total_worker_time,
    qs.total_elapsed_time,
    qs.total_logical_reads,
    SUBSTRING(st.text, (qs.statement_start_offset/2) + 1, 
                ((CASE qs.statement_end_offset
                    WHEN -1 THEN DATALENGTH(st.text)
                    ELSE qs.statement_end_offset
                END - qs.statement_start_offset)/2) + 1) AS statement_text
FROM 
    sys.dm_exec_query_stats AS qs
CROSS APPLY 
    sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE
	qs.execution_count > 1000
ORDER BY 
    (qs.total_worker_time / qs.execution_count) DESC;

數值	說明
creation_time	查詢的創建時間，可以用來判斷查詢執行的時間範圍
execution_count	查詢執行的次數，如果某個查詢的執行次數非常高，可能是效能問題的源頭
total_worker_time	查詢執行所花費的總工作時間，可以用來評估查詢的效能消耗
total_elapsed_time	查詢執行所花費的總經過時間，包括等待和執行時間
total_logical_reads	查詢執行的總邏輯讀取次數，表示從緩衝區中讀取的頁數
statement_text	查詢的原始 SQL 文本，可以根據需要進一步分析查詢的內容

根據這些統計資訊，可以識別哪些查詢消耗了大量的資源，並進一步分析和優化這些查詢，以改善 SQL Server 的效能。需要注意的是 sys.dm_exec_query_stats 只會保留一定數量的執行統計資訊，所以在進行分析之前，確保收集到足夠的統計資料。

輔助產生遺漏索引的 Script：

SELECT mig.index_group_handle,
       mid.index_handle,
       CONVERT(DECIMAL(28, 1), migs.avg_total_user_cost * migs.avg_user_impact *
                               (
                               migs.user_seeks + migs.user_scans )) AS
       improvement_measure,
       'CREATE INDEX [IX_'
       + Quotename(mid.[statement]) + '_'
       + Quotename(mid.equality_columns) + CASE WHEN mid.inequality_columns IS
       NOT NULL
       THEN '_' + Quotename(mid.inequality_columns) ELSE '' END + ']'
       + ' ON ' + mid.[statement] + ' ('
       + Isnull(mid.equality_columns, '') + CASE WHEN mid.equality_columns IS
       NOT NULL
       AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END
       + Isnull(mid.inequality_columns, '') + ')'
       + Isnull(' INCLUDE (' + mid.included_columns + ')', '')      AS
       create_index_statement,
       migs.*
FROM   sys.dm_db_missing_index_groups mig
       INNER JOIN sys.dm_db_missing_index_group_stats migs
               ON migs.group_handle = mig.index_group_handle
       INNER JOIN sys.dm_db_missing_index_details mid
               ON mig.index_handle = mid.index_handle
ORDER  BY improvement_measure DESC;

Racy Script

Microsoft TSS

TSS Link | aka.ms

透過以下 Script 找出耗費 I/O 最多的 SQL Handle 及 Plan Handle，進而求出 SQL Script 及執行計畫，從執行計畫尋找是否有使用造成效能不佳的 Scan (Table or Clustered Index Scan)

SELECT TOP 10 ( total_logical_reads / execution_count )  AS avg_logical_reads,
              ( total_logical_writes / execution_count ) AS avg_logical_writes,
              ( total_physical_reads / execution_count ) AS avg_phys_reads,
              execution_count,
              --statement_start_offset as stmt_start_offset,
              --sql_handle,
              --plan_handle,
              q.text,
              queryplan.query_plan
FROM   sys.dm_exec_query_stats stat
       CROSS apply sys.Dm_exec_sql_text(stat.sql_handle) AS q
       CROSS apply sys.Dm_exec_query_plan(stat.plan_handle) AS queryplan
ORDER  BY ( total_logical_reads + total_logical_writes ) DESC

數值	說明
avg_logical_reads, avg_logical_writes & avg_phys_reads	數值越高，表示該查詢需要讀取或寫入的資料越多，可能需要進行效能優化。
execution_count	數值越高，表示該查詢被執行的次數越多，可能是一個常用的查詢或存在重複執行的問題

搭配 sys.dm_db_missing_index_details, sys.dm_db_missing_index_groups, sys.dm_db_missing_index_columns 等 DMV，判斷是否存在遺漏索引的問題：

SELECT mig.*,
       statement AS table_name,
       column_id,
       column_name,
       column_usage
FROM   sys.dm_db_missing_index_details AS mid
       CROSS apply sys.Dm_db_missing_index_columns (mid.index_handle)
       INNER JOIN sys.dm_db_missing_index_groups AS mig
               ON mig.index_handle = mid.index_handle
ORDER  BY mig.index_group_handle,
          mig.index_handle,
          column_id;

數值	說明
mig.index_group_handle & mig.index_handle	數值代表索引群組和索引的識別碼，可以用來識別具有相同索引需求的查詢
column_id	欄位的 ID，可以用來識別該欄位在表中的位置
column_name	代表欄位的名稱，指出可能需要建立索引的欄位
column_usage	描述欄位的使用情況，例如「EQUALITY」表示該欄位用於等值比較，「INCLUDE」表示該欄位需要包含在索引中

相當於使用以下 Script 判斷索引遺漏對於查詢的影響：

SELECT TOP 10 statement AS [database.scheme.table],
              column_id,
              column_name,
              column_usage,
              migs.user_seeks,
              migs.user_scans,
              migs.last_user_seek,
              migs.avg_total_user_cost,
              migs.avg_user_impact
FROM   sys.dm_db_missing_index_details AS mid
       CROSS apply sys.Dm_db_missing_index_columns (mid.index_handle)
       INNER JOIN sys.dm_db_missing_index_groups AS mig
               ON mig.index_handle = mid.index_handle
       INNER JOIN sys.dm_db_missing_index_group_stats AS migs
               ON mig.index_group_handle = migs.group_handle
ORDER  BY migs.avg_user_impact DESC

數值	說明
database.scheme.table	遺漏索引所在的資料庫、架構和表名稱，提供了需要建立索引的目標物件
column_id	欄位的 ID，可以用來識別該欄位在表中的位置
column_name	欄位的名稱，指出可能需要建立索引的欄位
column_usage	描述欄位的使用情況，例如「EQUALITY」表示該欄位用於等值比較，「INCLUDE」表示該欄位需要包含在索引中
migs.user_seeks	使用者查詢的次數，顯示該索引對使用者查詢的使用頻率
migs.user_scans	使用者掃描的次數，顯示該索引對使用者掃描操作的使用頻率
migs.last_user_seek	最後一次使用者查詢的時間，可用於了解索引最近的使用情況
migs.avg_total_user_cost	平均總成本，顯示在使用者查詢中該索引的平均成本
migs.avg_user_impact	平均使用者影響程度，顯示該索引對使用者查詢效能的平均影響程度，數值越大表示影響越大

使用 sys.dm_db_missing_index_details Script，可以看出沒有使用正確索引查詢所需要的成本：

SELECT TOP 10 *
FROM   sys.dm_db_missing_index_group_stats
ORDER  BY
  avg_total_user_cost * avg_user_impact * ( user_seeks + user_scans ) DESC;

使用 Management Studio 進行查詢將找到高成本的 Disk I/O 的 Query，搭配實際的執行計畫，以找出遺漏索引的提示，並且補上遺漏的索引。

參考資訊

Using SQL Server DMVs to Identify Missing Indexes

Troubleshooting Performance Problems in SQL Server 2005 (I/O Bottlenecks)