RAC Interconnect Health Check (interconnect_health.sql)

What This Script Does

This script monitors RAC interconnect health by calculating the average time to receive consistent read (CR) blocks across the cluster interconnect. It:

Analyzes global cache CR block transfer times from AWR data
Calculates average latency in milliseconds
Provides color-coded status (GREEN/YELLOW/RED) based on thresholds
Shows data for the last 24 hours by default

The Script

ttitle left  'Health Check - Interconnect' skip 2
--
-- GLOBAL CACHE CR PERFORMANCE-- This shows the average latency of a consistent block request.
-- AVG CR BLOCK RECEIVE TIME should typically be about 15 milliseconds depending
-- on your system configuration and volume, is the average latency of a
-- consistent-read request round-trip from the requesting instance to the holding
-- instance and back to the requesting instance. If your CPU has limited idle time
-- and your system typically processes long-running queries, then the latency may
-- be higher. However, it is possible to have an average latency of less than one
-- millisecond with User-mode IPC. Latency can be influenced by a high value for
-- the DB_MULTI_BLOCK_READ_COUNT parameter. This is because a requesting process
-- can issue more than one request for a block depending on the setting of this
-- parameter. Correspondingly, the requesting process may wait longer.  Also check
-- interconnect badwidth, OS tcp settings, and OS udp settings if
-- AVG CR BLOCK RECEIVE TIME is high.
--
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
--
SELECT  to_char(a.end_interval_time, 'MM/DD HH24:MI') end_interval_time,
       CR_BLOCKS_RECEIVED.instance_number,
       CR_BLOCKS_RECEIVED.VALUE blocks_received,
       CR_BLOCKS_RECEIVE_TM.VALUE blocks_tm,
       ((CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE) * 10)
          "AVG CR BLOCK RECEIVE TIME (ms)",
       (CASE
           WHEN ( (CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE)
                 * 10) < 15
           THEN
              'GREEN'
           WHEN ( (CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE)
                 * 10) < 22
           THEN
              'YELLOW'
          ELSE
             'RED'
       END)
         STATUS
  FROM (SELECT snap_id, instance_number, VALUE
          FROM dba_hist_sysstat
         WHERE stat_name = ('gc cr blocks received')) CR_BLOCKS_RECEIVED,
       (SELECT snap_id, instance_number, VALUE
          FROM dba_hist_sysstat
         WHERE stat_name = ('gc cr block receive time')) CR_BLOCKS_RECEIVE_TM,
         dba_hist_snapshot a
 WHERE CR_BLOCKS_RECEIVED.snap_id = CR_BLOCKS_RECEIVE_TM.snap_id
       AND CR_BLOCKS_RECEIVED.instance_number =  CR_BLOCKS_RECEIVE_TM.instance_number
       AND CR_BLOCKS_RECEIVED.VALUE > 0
       AND a.snap_id = CR_BLOCKS_RECEIVED.snap_id
       AND a.instance_number = CR_BLOCKS_RECEIVED.instance_number
       and a.end_interval_time > sysdate - 1
 order by to_char(a.end_interval_time, 'MM/DD HH24:MI') desc , 2
;

Usage

-- Basic usage (last 24 hours)
@interconnect_health.sql

-- No parameters required

Required Privileges

SELECT ON DBA_HIST_SYSSTAT
SELECT ON DBA_HIST_SNAPSHOT

Sample Output

Health Check - Interconnect

END_INTERVAL_TIME INSTANCE_NUMBER BLOCKS_RECEIVED  BLOCKS_TM AVG CR BLOCK RECEIVE TIME (ms) STATUS
----------------- --------------- --------------- ---------- ------------------------------ ------
12/06 14:00                     1        2456789      24568                           10.0 GREEN
12/06 14:00                     2        2345678      28148                           12.0 GREEN
12/06 13:00                     1        2567890      41086                           16.0 YELLOW
12/06 13:00                     2        2456789      46678                           19.0 YELLOW
12/06 12:00                     1        2345678      58642                           25.0 RED
12/06 12:00                     2        2234567      55864                           25.0 RED

Key Output Columns

END_INTERVAL_TIME - AWR snapshot time
INSTANCE_NUMBER - RAC instance number
BLOCKS_RECEIVED - Number of CR blocks received
BLOCKS_TM - Total time for CR block receives (centiseconds)
AVG CR BLOCK RECEIVE TIME (ms) - Average latency per block
STATUS - Performance indicator:
- GREEN: < 15ms (good performance)
- YELLOW: 15-22ms (marginal performance)
- RED: > 22ms (poor performance)

Performance Thresholds

Typical Expected Values:

< 1ms: Possible with User-mode IPC
~15ms: Typical for well-configured systems
> 22ms: Indicates potential issues

Factors Affecting Latency:

CPU saturation (limited idle time)
Long-running queries
High DB_MULTI_BLOCK_READ_COUNT settings
Interconnect bandwidth limitations
Network configuration issues

Common Use Cases

Regular Health Monitoring

-- Run during peak hours to check interconnect performance
@interconnect_health.sql
-- Look for RED status entries

Performance Troubleshooting

-- When users report cluster-wide slowness
@interconnect_health.sql
-- Compare latencies across instances

Troubleshooting High Latencies

If seeing RED status:

Check interconnect bandwidth utilization
Review OS TCP/UDP settings
Verify network MTU settings
Check for CPU saturation
Review DB_MULTI_BLOCK_READ_COUNT parameter
Look for network packet loss/retransmissions

Additional Diagnostics:

-- Check current interconnect statistics
SELECT * FROM gv$cluster_interconnects;

-- Review wait events
SELECT event, total_waits, time_waited
FROM gv$system_event
WHERE event LIKE 'gc%'
ORDER BY time_waited DESC;

RAC Instance Status - Check RAC instance status
CPU Usage Statistics - Monitor CPU utilization
Active Session Analysis - View active sessions