RAC Interconnect Health Check (interconnect_health.sql)
What This Script Does
Section titled “What This Script Does”This script monitors RAC interconnect health by calculating the average time to receive consistent read (CR) blocks across the cluster interconnect. It:
- Analyzes global cache CR block transfer times from AWR data
- Calculates average latency in milliseconds
- Provides color-coded status (GREEN/YELLOW/RED) based on thresholds
- Shows data for the last 24 hours by default
The Script
Section titled “The Script”ttitle left 'Health Check - Interconnect' skip 2---- GLOBAL CACHE CR PERFORMANCE-- This shows the average latency of a consistent block request.-- AVG CR BLOCK RECEIVE TIME should typically be about 15 milliseconds depending-- on your system configuration and volume, is the average latency of a-- consistent-read request round-trip from the requesting instance to the holding-- instance and back to the requesting instance. If your CPU has limited idle time-- and your system typically processes long-running queries, then the latency may-- be higher. However, it is possible to have an average latency of less than one-- millisecond with User-mode IPC. Latency can be influenced by a high value for-- the DB_MULTI_BLOCK_READ_COUNT parameter. This is because a requesting process-- can issue more than one request for a block depending on the setting of this-- parameter. Correspondingly, the requesting process may wait longer. Also check-- interconnect badwidth, OS tcp settings, and OS udp settings if-- AVG CR BLOCK RECEIVE TIME is high.--column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9--SELECT to_char(a.end_interval_time, 'MM/DD HH24:MI') end_interval_time, CR_BLOCKS_RECEIVED.instance_number, CR_BLOCKS_RECEIVED.VALUE blocks_received, CR_BLOCKS_RECEIVE_TM.VALUE blocks_tm, ((CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE) * 10) "AVG CR BLOCK RECEIVE TIME (ms)", (CASE WHEN ( (CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE) * 10) < 15 THEN 'GREEN' WHEN ( (CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE) * 10) < 22 THEN 'YELLOW' ELSE 'RED' END) STATUS FROM (SELECT snap_id, instance_number, VALUE FROM dba_hist_sysstat WHERE stat_name = ('gc cr blocks received')) CR_BLOCKS_RECEIVED, (SELECT snap_id, instance_number, VALUE FROM dba_hist_sysstat WHERE stat_name = ('gc cr block receive time')) CR_BLOCKS_RECEIVE_TM, dba_hist_snapshot a WHERE CR_BLOCKS_RECEIVED.snap_id = CR_BLOCKS_RECEIVE_TM.snap_id AND CR_BLOCKS_RECEIVED.instance_number = CR_BLOCKS_RECEIVE_TM.instance_number AND CR_BLOCKS_RECEIVED.VALUE > 0 AND a.snap_id = CR_BLOCKS_RECEIVED.snap_id AND a.instance_number = CR_BLOCKS_RECEIVED.instance_number and a.end_interval_time > sysdate - 1 order by to_char(a.end_interval_time, 'MM/DD HH24:MI') desc , 2;
-- Basic usage (last 24 hours)@interconnect_health.sql
-- No parameters required
Required Privileges
Section titled “Required Privileges”SELECT ON DBA_HIST_SYSSTATSELECT ON DBA_HIST_SNAPSHOT
Sample Output
Section titled “Sample Output”Health Check - Interconnect
END_INTERVAL_TIME INSTANCE_NUMBER BLOCKS_RECEIVED BLOCKS_TM AVG CR BLOCK RECEIVE TIME (ms) STATUS----------------- --------------- --------------- ---------- ------------------------------ ------12/06 14:00 1 2456789 24568 10.0 GREEN12/06 14:00 2 2345678 28148 12.0 GREEN12/06 13:00 1 2567890 41086 16.0 YELLOW12/06 13:00 2 2456789 46678 19.0 YELLOW12/06 12:00 1 2345678 58642 25.0 RED12/06 12:00 2 2234567 55864 25.0 RED
Key Output Columns
Section titled “Key Output Columns”- END_INTERVAL_TIME - AWR snapshot time
- INSTANCE_NUMBER - RAC instance number
- BLOCKS_RECEIVED - Number of CR blocks received
- BLOCKS_TM - Total time for CR block receives (centiseconds)
- AVG CR BLOCK RECEIVE TIME (ms) - Average latency per block
- STATUS - Performance indicator:
- GREEN: < 15ms (good performance)
- YELLOW: 15-22ms (marginal performance)
- RED: > 22ms (poor performance)
Performance Thresholds
Section titled “Performance Thresholds”Typical Expected Values:
- < 1ms: Possible with User-mode IPC
- ~15ms: Typical for well-configured systems
- > 22ms: Indicates potential issues
Factors Affecting Latency:
- CPU saturation (limited idle time)
- Long-running queries
- High DB_MULTI_BLOCK_READ_COUNT settings
- Interconnect bandwidth limitations
- Network configuration issues
Common Use Cases
Section titled “Common Use Cases”Regular Health Monitoring
-- Run during peak hours to check interconnect performance@interconnect_health.sql-- Look for RED status entries
Performance Troubleshooting
-- When users report cluster-wide slowness@interconnect_health.sql-- Compare latencies across instances
Troubleshooting High Latencies
Section titled “Troubleshooting High Latencies”If seeing RED status:
- Check interconnect bandwidth utilization
- Review OS TCP/UDP settings
- Verify network MTU settings
- Check for CPU saturation
- Review DB_MULTI_BLOCK_READ_COUNT parameter
- Look for network packet loss/retransmissions
Additional Diagnostics:
-- Check current interconnect statisticsSELECT * FROM gv$cluster_interconnects;
-- Review wait eventsSELECT event, total_waits, time_waitedFROM gv$system_eventWHERE event LIKE 'gc%'ORDER BY time_waited DESC;
Related Scripts
Section titled “Related Scripts”- RAC Instance Status - Check RAC instance status
- CPU Usage Statistics - Monitor CPU utilization
- Active Session Analysis - View active sessions