Skip to content

RAC Interconnect Health Check (interconnect_health.sql)

This script monitors RAC interconnect health by calculating the average time to receive consistent read (CR) blocks across the cluster interconnect. It:

  • Analyzes global cache CR block transfer times from AWR data
  • Calculates average latency in milliseconds
  • Provides color-coded status (GREEN/YELLOW/RED) based on thresholds
  • Shows data for the last 24 hours by default
ttitle left 'Health Check - Interconnect' skip 2
--
-- GLOBAL CACHE CR PERFORMANCE-- This shows the average latency of a consistent block request.
-- AVG CR BLOCK RECEIVE TIME should typically be about 15 milliseconds depending
-- on your system configuration and volume, is the average latency of a
-- consistent-read request round-trip from the requesting instance to the holding
-- instance and back to the requesting instance. If your CPU has limited idle time
-- and your system typically processes long-running queries, then the latency may
-- be higher. However, it is possible to have an average latency of less than one
-- millisecond with User-mode IPC. Latency can be influenced by a high value for
-- the DB_MULTI_BLOCK_READ_COUNT parameter. This is because a requesting process
-- can issue more than one request for a block depending on the setting of this
-- parameter. Correspondingly, the requesting process may wait longer. Also check
-- interconnect badwidth, OS tcp settings, and OS udp settings if
-- AVG CR BLOCK RECEIVE TIME is high.
--
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
--
SELECT to_char(a.end_interval_time, 'MM/DD HH24:MI') end_interval_time,
CR_BLOCKS_RECEIVED.instance_number,
CR_BLOCKS_RECEIVED.VALUE blocks_received,
CR_BLOCKS_RECEIVE_TM.VALUE blocks_tm,
((CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE) * 10)
"AVG CR BLOCK RECEIVE TIME (ms)",
(CASE
WHEN ( (CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE)
* 10) < 15
THEN
'GREEN'
WHEN ( (CR_BLOCKS_RECEIVE_TM.VALUE / CR_BLOCKS_RECEIVED.VALUE)
* 10) < 22
THEN
'YELLOW'
ELSE
'RED'
END)
STATUS
FROM (SELECT snap_id, instance_number, VALUE
FROM dba_hist_sysstat
WHERE stat_name = ('gc cr blocks received')) CR_BLOCKS_RECEIVED,
(SELECT snap_id, instance_number, VALUE
FROM dba_hist_sysstat
WHERE stat_name = ('gc cr block receive time')) CR_BLOCKS_RECEIVE_TM,
dba_hist_snapshot a
WHERE CR_BLOCKS_RECEIVED.snap_id = CR_BLOCKS_RECEIVE_TM.snap_id
AND CR_BLOCKS_RECEIVED.instance_number = CR_BLOCKS_RECEIVE_TM.instance_number
AND CR_BLOCKS_RECEIVED.VALUE > 0
AND a.snap_id = CR_BLOCKS_RECEIVED.snap_id
AND a.instance_number = CR_BLOCKS_RECEIVED.instance_number
and a.end_interval_time > sysdate - 1
order by to_char(a.end_interval_time, 'MM/DD HH24:MI') desc , 2
;
-- Basic usage (last 24 hours)
@interconnect_health.sql
-- No parameters required
SELECT ON DBA_HIST_SYSSTAT
SELECT ON DBA_HIST_SNAPSHOT
Health Check - Interconnect
END_INTERVAL_TIME INSTANCE_NUMBER BLOCKS_RECEIVED BLOCKS_TM AVG CR BLOCK RECEIVE TIME (ms) STATUS
----------------- --------------- --------------- ---------- ------------------------------ ------
12/06 14:00 1 2456789 24568 10.0 GREEN
12/06 14:00 2 2345678 28148 12.0 GREEN
12/06 13:00 1 2567890 41086 16.0 YELLOW
12/06 13:00 2 2456789 46678 19.0 YELLOW
12/06 12:00 1 2345678 58642 25.0 RED
12/06 12:00 2 2234567 55864 25.0 RED
  • END_INTERVAL_TIME - AWR snapshot time
  • INSTANCE_NUMBER - RAC instance number
  • BLOCKS_RECEIVED - Number of CR blocks received
  • BLOCKS_TM - Total time for CR block receives (centiseconds)
  • AVG CR BLOCK RECEIVE TIME (ms) - Average latency per block
  • STATUS - Performance indicator:
    • GREEN: < 15ms (good performance)
    • YELLOW: 15-22ms (marginal performance)
    • RED: > 22ms (poor performance)

Typical Expected Values:

  • < 1ms: Possible with User-mode IPC
  • ~15ms: Typical for well-configured systems
  • > 22ms: Indicates potential issues

Factors Affecting Latency:

  • CPU saturation (limited idle time)
  • Long-running queries
  • High DB_MULTI_BLOCK_READ_COUNT settings
  • Interconnect bandwidth limitations
  • Network configuration issues

Regular Health Monitoring

-- Run during peak hours to check interconnect performance
@interconnect_health.sql
-- Look for RED status entries

Performance Troubleshooting

-- When users report cluster-wide slowness
@interconnect_health.sql
-- Compare latencies across instances

If seeing RED status:

  1. Check interconnect bandwidth utilization
  2. Review OS TCP/UDP settings
  3. Verify network MTU settings
  4. Check for CPU saturation
  5. Review DB_MULTI_BLOCK_READ_COUNT parameter
  6. Look for network packet loss/retransmissions

Additional Diagnostics:

-- Check current interconnect statistics
SELECT * FROM gv$cluster_interconnects;
-- Review wait events
SELECT event, total_waits, time_waited
FROM gv$system_event
WHERE event LIKE 'gc%'
ORDER BY time_waited DESC;