log file parallel write - Tune Oracle Redo Log I/O

Wait Event Class: System I/O

Parameters: files (number of log files being written), blocks (number of redo blocks written), requests (number of I/O requests issued)

log file parallel write is the wait recorded by the Log Writer (LGWR) background process itself as it writes redo entries from the log buffer to the online redo log files on disk. This is a System I/O class event — meaning it is a background process wait — as opposed to log file sync, which is the foreground session wait.

The Relationship Between log file sync and log file parallel write


These two events are tightly coupled:

  • A committing session waits on log file sync
  • While the session waits, LGWR is performing log file parallel write
  • When log file parallel write completes, LGWR signals all waiting sessions, ending their log file sync waits

The “parallel” in the name refers to LGWR writing to all members of a log group simultaneously (in parallel I/O operations). For a log group with 2 members, LGWR issues 2 parallel writes and waits for both to complete before signaling sessions.

Although log file parallel write is a background wait, it appears in AWR background wait event reports. Its average latency is the most direct measure of redo log I/O performance. A well-tuned system has:

  • log file parallel write average ≈ log file sync average (LGWR scheduled promptly)
  • Both values below 2–5 ms

If log file sync is significantly higher than log file parallel write, the excess is LGWR CPU scheduling latency — often seen in overcommitted virtual machines.
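
One way to estimate that scheduling gap is to subtract the two averages directly. This is a sketch using instance-lifetime averages from V$SYSTEM_EVENT (AVERAGE_WAIT is in centiseconds, hence the ×10); for interval-based numbers, use AWR deltas instead:

```sql
-- Estimate LGWR scheduling latency:
--   (log file sync avg) - (log file parallel write avg), in milliseconds
SELECT ROUND(
         ( MAX(CASE WHEN event = 'log file sync'           THEN average_wait END)
         - MAX(CASE WHEN event = 'log file parallel write' THEN average_wait END)
         ) * 10, 2) AS scheduling_gap_ms
FROM v$system_event
WHERE event IN ('log file sync', 'log file parallel write');
```

A gap of more than a millisecond or two suggests LGWR is waiting for CPU, not for storage.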


Average Wait Time | Assessment
----------------- | ----------
< 1 ms | Excellent — NVMe or write-cached SAN
1–3 ms | Good — well-configured storage
3–10 ms | Investigate — storage contention or suboptimal placement
10–20 ms | Poor — redo log storage is a bottleneck
> 20 ms | Critical — redo I/O severely degraded
Investigate this event when:

  • log file parallel write average latency has increased over time without hardware changes
  • A specific redo log group member is on slower storage than others
  • Log switch rate is very high (many small redo logs)
  • The database is running on a virtualized host with shared I/O resources
  • log file sync is elevated and log file parallel write confirms storage is the cause

-- log file parallel write in background wait events
SELECT
    event,
    total_waits,
    total_timeouts,
    ROUND(time_waited / 100, 2) AS total_secs,
    ROUND(average_wait * 10, 2) AS avg_wait_ms,
    ROUND(max_wait / 100, 2)   AS max_wait_secs
FROM v$system_event
WHERE event IN ('log file parallel write', 'log file sync',
                'log file switch completion', 'log file switch (checkpoint incomplete)')
ORDER BY time_waited DESC;

-- Compare LGWR write time vs session commit wait time
SELECT
    se.event,
    se.total_waits,
    ROUND(se.average_wait * 10, 2) AS avg_ms
FROM v$system_event se
WHERE se.event IN ('log file sync', 'log file parallel write')
ORDER BY se.event;

2. Redo Log Configuration — Size, Members, and Current Status

-- Full redo log group status with file locations
SELECT
    l.group#,
    lf.member,
    ROUND(l.bytes / 1024 / 1024, 0) AS size_mb,
    l.members AS member_count,
    l.status,
    l.archived,
    l.sequence#,
    TO_CHAR(l.first_time, 'YYYY-MM-DD HH24:MI:SS') AS first_switch_time
FROM v$log l
JOIN v$logfile lf ON l.group# = lf.group#
ORDER BY l.group#, lf.member;

3. Log Switch History — Frequency Analysis

-- Log switch frequency by hour (7-day history)
-- Note: an analytic function cannot appear inside an aggregate,
-- so the inter-switch gap is computed in an inline view first
SELECT
    TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
    COUNT(*) AS log_switches,
    ROUND(60.0 / COUNT(*), 1) AS avg_mins_between_switches,
    MIN(gap_mins) AS min_mins_between_switches
FROM (
    SELECT first_time,
           ROUND((LEAD(first_time) OVER (ORDER BY sequence#) - first_time) * 1440, 1) AS gap_mins
    FROM v$log_history
    WHERE first_time > SYSDATE - 7
)
GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER BY 1 DESC
FETCH FIRST 168 ROWS ONLY;
-- Identify the shortest-lived logs (fastest switches) in the last day
SELECT
    sequence#,
    TO_CHAR(first_time, 'YYYY-MM-DD HH24:MI:SS') AS switch_time,
    ROUND((LEAD(first_time) OVER (ORDER BY sequence#) - first_time) * 1440, 2) AS mins_for_this_log
FROM v$log_history
WHERE first_time > SYSDATE - 1
ORDER BY mins_for_this_log ASC
FETCH FIRST 20 ROWS ONLY;

4. Redo Generation Rate — Size Redo Logs Appropriately

-- Average redo generated per minute from AWR (requires Diagnostics Pack)
-- Note: begin/end_interval_time are TIMESTAMPs; CAST to DATE so the
-- subtraction yields a number of days that can be multiplied by 1440
SELECT
    TO_CHAR(s.begin_interval_time, 'YYYY-MM-DD HH24:MI') AS snap_time,
    ROUND((e_stat.value - b_stat.value) / 1024 / 1024, 2) AS redo_mb_in_interval,
    ROUND((CAST(s.end_interval_time AS DATE) - CAST(s.begin_interval_time AS DATE)) * 1440, 1) AS interval_mins,
    ROUND(
        (e_stat.value - b_stat.value) / 1024 / 1024 /
        NULLIF((CAST(s.end_interval_time AS DATE) - CAST(s.begin_interval_time AS DATE)) * 1440, 0), 2
    ) AS redo_mb_per_minute
FROM dba_hist_sysstat b_stat
JOIN dba_hist_sysstat e_stat ON e_stat.snap_id = b_stat.snap_id + 1
    AND e_stat.stat_name = b_stat.stat_name
    AND e_stat.dbid = b_stat.dbid
    AND e_stat.instance_number = b_stat.instance_number
JOIN dba_hist_snapshot s ON s.snap_id = e_stat.snap_id
    AND s.dbid = e_stat.dbid
    AND s.instance_number = e_stat.instance_number
WHERE b_stat.stat_name = 'redo size'
    AND s.begin_interval_time > SYSDATE - 3
ORDER BY s.begin_interval_time DESC
FETCH FIRST 72 ROWS ONLY;
-- File-level I/O performance for redo log files
-- (the service-time columns are SMALL/LARGE_WRITE_SERVICETIME, in ms)
SELECT
    filetype_name,
    small_read_reqs,
    small_write_reqs,
    ROUND(small_write_servicetime / NULLIF(small_write_reqs, 0), 4) AS avg_write_ms,
    large_write_reqs,
    ROUND(large_write_servicetime / NULLIF(large_write_reqs, 0), 4) AS avg_large_write_ms
FROM v$iostat_file
WHERE filetype_name = 'Log File';

1. Redo Log Members on Slow Storage

This is the most common and most impactful cause. Every write to a redo log group must be confirmed on all members before LGWR signals completion. If any member resides on slow storage (a spinning HDD, a heavily loaded SAN LUN, an NFS mount), every write pays that storage's latency, regardless of how fast the other members are.

Critical placement rules:

  • Never place redo log members on the same spindles as data files (head contention)
  • Never place redo logs on NFS mounts (network latency, and no guarantee the OS write cache is flushed on commit)
  • Avoid volume-level compression or encryption on redo log storage
  • In ASM, use a separate, dedicated disk group for redo logs
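
A rough audit of the first rule is to compare storage paths. The sketch below is a heuristic only: it assumes local file-system paths and compares top-level mount directories, so adapt the pattern for ASM disk groups or nested mount points:

```sql
-- Heuristic: flag redo members whose top-level mount directory
-- also hosts data files (e.g., both under /u01)
SELECT lf.member AS redo_member,
       df.name   AS datafile_on_same_mount
FROM v$logfile lf
JOIN v$datafile df
  ON REGEXP_SUBSTR(lf.member, '^/[^/]+') = REGEXP_SUBSTR(df.name, '^/[^/]+')
ORDER BY lf.member;
```

No rows returned means no member shares a top-level mount with a data file, at least by this naive path comparison.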

2. Log Groups with Unequal Member Performance


If a group has two members — one on fast SSD and one on slow HDD — LGWR always waits for the slow member. The fast member provides zero benefit for latency, only for redundancy.

Detection: Monitor per-file I/O stats with V$IOSTAT_FILE or V$FILESTAT to identify members with different response times.

3. Too Many Members Per Group

While multiplexing redo logs is essential for redundancy, three or more members per group on separate slow storage paths multiplies write I/O: each additional member is one more write that must complete before LGWR can signal sessions. Two members on fast storage is the right balance for most environments.
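
A quick check for over-multiplexed groups (V$LOG exposes the member count directly):

```sql
-- Groups carrying more than two members: each extra member is an extra write
SELECT group#, members
FROM v$log
WHERE members > 2;
```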

4. Log Files Too Small — Excessive Log Switch Rate


Small redo logs (e.g., 50 MB default in older databases) generate rapid log switches. During each switch, LGWR must wait for the next group to be available (not ACTIVE/CURRENT). If the next group is still being archived or a checkpoint is incomplete, LGWR stalls. This shows up as log file switch (checkpoint incomplete) or log file switch (archiving needed) waits alongside elevated log file parallel write.

Target: Redo log switch frequency of 15–30 minutes at peak load. Log size = redo MB/minute × 20 minutes.

5. Asynchronous I/O Not Enabled

Without OS-level async I/O, LGWR writes to each member sequentially with synchronous calls. With async I/O, LGWR issues all member writes in parallel and waits for completion, dramatically improving throughput for multi-member groups. Async I/O is unavailable on some file systems (e.g., older ext3, certain NFS mounts).

6. Virtualization and Shared-Storage Contention

In virtualized environments (VMware, KVM, cloud VMs), the hypervisor may throttle I/O throughput or add latency when host storage is overcommitted. LGWR, a latency-sensitive sequential writer, is particularly exposed: I/O latency spikes from host-level contention translate directly into longer log file parallel write waits.
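
Host-level contention typically shows up as a long latency tail rather than a uniformly shifted average, so the wait-time distribution is more revealing than the mean. V$EVENT_HISTOGRAM exposes that distribution:

```sql
-- Wait-time distribution for LGWR writes; a healthy profile clusters
-- in the lowest buckets (< 2 ms), while hypervisor throttling shows
-- as a long tail of counts in the 32 ms+ buckets
SELECT wait_time_milli, wait_count
FROM v$event_histogram
WHERE event = 'log file parallel write'
ORDER BY wait_time_milli;
```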


-- Calculate target log size (redo_mb_per_min * target_minutes_between_switches)
-- Example: 50 MB/min * 20 min = 1000 MB = 1 GB per log file
-- Add new properly-sized groups on fast storage
ALTER DATABASE ADD LOGFILE GROUP 4 (
'/fast_storage/redo/redo04a.log',
'/mirror_storage/redo/redo04b.log'
) SIZE 1G;
ALTER DATABASE ADD LOGFILE GROUP 5 (
'/fast_storage/redo/redo05a.log',
'/mirror_storage/redo/redo05b.log'
) SIZE 1G;
ALTER DATABASE ADD LOGFILE GROUP 6 (
'/fast_storage/redo/redo06a.log',
'/mirror_storage/redo/redo06b.log'
) SIZE 1G;
-- Rotate through log groups to make old groups INACTIVE
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM CHECKPOINT;
-- Verify old groups are now INACTIVE before dropping
SELECT group#, status, archived FROM v$log;
-- Drop old undersized groups
ALTER DATABASE DROP LOGFILE GROUP 1;
ALTER DATABASE DROP LOGFILE GROUP 2;
ALTER DATABASE DROP LOGFILE GROUP 3;
-- Can't move a CURRENT log — must drop and re-add members
-- First, add new members on fast storage to all groups:
ALTER DATABASE ADD LOGFILE MEMBER '/fast/redo/redo01b.log' TO GROUP 1;
ALTER DATABASE ADD LOGFILE MEMBER '/fast/redo/redo02b.log' TO GROUP 2;
ALTER DATABASE ADD LOGFILE MEMBER '/fast/redo/redo03b.log' TO GROUP 3;
-- Force log switches to populate new members
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM CHECKPOINT;
-- Wait for groups to be INACTIVE, then drop slow members
ALTER DATABASE DROP LOGFILE MEMBER '/slow_nfs/redo/redo01a.log';
ALTER DATABASE DROP LOGFILE MEMBER '/slow_nfs/redo/redo02a.log';
ALTER DATABASE DROP LOGFILE MEMBER '/slow_nfs/redo/redo03a.log';

LGWR Worker Processes (For High-Write Systems)
-- In 12c+, LGWR automatically spawns log writer worker processes (LG00, LG01, ...)
-- for very high redo generation rates; no parameter change is normally needed.
-- Check whether log writer workers are active:
SELECT pname FROM v$process WHERE pname LIKE 'LG%';
-- The single-vs-multiple writer behavior is governed by hidden parameters
-- (e.g., _use_single_log_writer), which do not appear in V$PARAMETER unless set.
-- Change hidden parameters only if directed by Oracle Support.
-- Confirm async I/O is enabled
SELECT name, value FROM v$parameter
WHERE name IN ('disk_asynch_io', 'filesystemio_options');
-- Enable async I/O if disabled (requires restart)
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE=SPFILE;
-- For file-system redo logs (not ASM), set filesystem async I/O:
ALTER SYSTEM SET filesystemio_options = SETALL SCOPE=SPFILE;
-- Options: NONE, DIRECTIO, ASYNCH, SETALL

1. Standardize on Correct Log Sizing During Database Creation


Establish redo log sizing based on expected peak redo generation from the start. Increasing log sizes later requires dropping and recreating log groups, which is an outage risk during the transition.

-- Recommended log group count and size formula:
-- Minimum 3 log groups
-- Each log sized for 20-30 minutes of peak redo generation
-- Example sizing check (CAST to DATE so timestamp subtraction yields days):
SELECT
    ROUND(MAX(redo_mb_per_min) * 25, 0) AS recommended_log_size_mb
FROM (
    SELECT
        (e_stat.value - b_stat.value) / 1024 / 1024 /
        NULLIF((CAST(s.end_interval_time AS DATE) - CAST(s.begin_interval_time AS DATE)) * 1440, 0) AS redo_mb_per_min
    FROM dba_hist_sysstat b_stat
    JOIN dba_hist_sysstat e_stat ON e_stat.snap_id = b_stat.snap_id + 1
        AND e_stat.stat_name = b_stat.stat_name
        AND e_stat.dbid = b_stat.dbid
        AND e_stat.instance_number = b_stat.instance_number
    JOIN dba_hist_snapshot s ON s.snap_id = e_stat.snap_id
        AND s.dbid = e_stat.dbid
        AND s.instance_number = e_stat.instance_number
    WHERE b_stat.stat_name = 'redo size'
        AND s.begin_interval_time > SYSDATE - 14
);

Include redo log performance in weekly DBA checks:

  • Review average log file parallel write from AWR
  • Check log switch frequency from V$LOG_HISTORY
  • Confirm all log members are VALID and on appropriate storage
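
The member-health check can be scripted. In V$LOGFILE a blank (NULL) STATUS means the member is healthy, so any non-NULL status deserves attention:

```sql
-- Members flagged INVALID, STALE, or DELETED (NULL status = healthy)
SELECT group#, status, member
FROM v$logfile
WHERE status IS NOT NULL;
```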

In Data Guard configurations, redo must also be transmitted to standby databases. In Maximum Protection mode (or Maximum Availability with SYNC transport), LGWR waits for the standby to acknowledge redo before commits complete, so standby network or I/O latency feeds directly back into primary commit times. High log file sync on the primary alongside a modest log file parallel write suggests redo transport, not local storage, is the bottleneck. Review V$DATAGUARD_STATS and consider Maximum Performance (ASYNC) mode if the protection requirements allow it.

-- Check Data Guard redo transport performance
SELECT
    dest_id,
    dest_name,
    status,
    target,
    archiver,
    schedule,
    transmit_mode,
    async_blocks
FROM v$archive_dest
WHERE status = 'VALID' AND target = 'STANDBY';

  • log file sync — The foreground session wait for LGWR completion; the user-facing manifestation of this wait
  • db file sequential read — Compare latency profiles to understand relative storage performance
  • direct path read — Direct I/O patterns; useful baseline for storage capability