Pradipta DBA: June 2011

ASM exists to manage file storage for the RDBMS

ASM does NOT perform I/O on behalf of the RDBMS
I/O is performed by the RDBMS processes themselves, just as with other storage types
Thus, ASM is not an intermediary for I/O (which would make it a bottleneck)
I/O can occur synchronously or asynchronously depending on the value of the DISK_ASYNCH_IO parameter
Disks are RAW devices to ASM
Files that can be stored in ASM: typical database data files, control files, redo logs, archive logs, flashback logs, spfiles, RMAN backups and incremental (block change) tracking bitmaps, and Data Pump dump sets.

In 11gR2, ASM has been extended to allow storing any kind of file using Oracle ACFS capability (it appears as another filesystem to clients). Note that database files are not supported within ACFS

ASM Basics

The basic unit of storage allocated on disk is called an "allocation unit" (AU); it is usually 1MB (4MB is recommended for Exadata)
Very simply, ASM is organized around storing files
Files are divided into pieces called "extents"
Extent sizes are typically equal to 1 AU, except in 11g, where ASM uses variable extent sizes that can be 1, 8, or 64 AUs
File extent locations are maintained by ASM using file extent maps.
ASM maintains file metadata in headers on the disks rather than in a data dictionary
The file extent maps are cached in the RDBMS shared pool; these are consulted when an RDBMS process does I/O
ASM is very crash resilient since it uses instance / crash recovery similar to a normal RDBMS (similar to using undo and redo logging)
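
To make the relationship between files, extents, and AUs concrete, here is a minimal Python sketch. It is a toy model, not ASM code. The 1, 8, and 64 AU multipliers come from the notes above; the point at which ASM switches to a larger extent size is not stated here, so the `threshold` of 20000 extents is an assumption, and both function names are made up for illustration.

```python
MB = 1024 * 1024

def extent_count(file_bytes, au_bytes=1 * MB):
    """Number of extents needed when every extent is exactly 1 AU."""
    # Integer ceiling division: a partly filled last extent still counts.
    return (file_bytes + au_bytes - 1) // au_bytes

def extent_size_in_aus(extent_index, threshold=20000, multipliers=(1, 8, 64)):
    """Size, in AUs, of the extent at a given position in a file.

    ASSUMPTION: the size steps up every `threshold` extents; the notes
    only say that 11g extents can be 1, 8, or 64 AUs.
    """
    tier = min(extent_index // threshold, len(multipliers) - 1)
    return multipliers[tier]

if __name__ == "__main__":
    print(extent_count(10 * MB))        # extents for a 10MB file with 1MB AUs
    print(extent_size_in_aus(0))        # early extents stay at 1 AU
    print(extent_size_in_aus(25000))    # later extents grow (under the assumed threshold)
```

Larger extents for large files keep the extent map small, which matters because the map is cached in the RDBMS shared pool.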

Storage is organized into "diskgroups" (DGs)

A DG has a name like "DATA" in ASM, which is visible to the RDBMS as a file name beginning with "+DATA"; when tablespaces are created, they refer to a DG for storage such as "+DATA/.../..."
Beneath a diskgroup are one or more failure groups (FGs)
FGs are defined over a set of "disks"
"Disks" can be based on raw physical volumes, a disk partition, a LUN presented by a disk array, or even an LVM volume or NAS device
Each FG should contain disks that share a common failure component (so that one failure cannot take out disks in two FGs), otherwise ASM redundancy will not be effective

High availability

ASM can perform mirroring to recover from device failures
You have a choice of EXTERNAL, NORMAL, or HIGH redundancy mirroring
EXTERNAL means ASM does no mirroring and leaves it to the underlying physical disk array
NORMAL means ASM will create one additional copy of an extent for redundancy
HIGH means ASM will create two additional copies of an extent for redundancy
Mirroring is implemented via "failure groups" and extent partnering; ASM can tolerate the complete loss of all disks in a failure group when NORMAL or HIGH redundancy is implemented
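
The arithmetic behind the three redundancy levels can be sketched in a few lines of Python. This is a back-of-the-envelope illustration only: the function names are hypothetical, and the capacity figure ignores any free space you would want to keep in reserve for re-mirroring after a failure.

```python
# Total copies of each extent, per the redundancy descriptions above.
COPIES = {"EXTERNAL": 1, "NORMAL": 2, "HIGH": 3}

def usable_capacity_gb(raw_gb, redundancy):
    """Rough usable space: raw space divided by the number of copies kept."""
    return raw_gb / COPIES[redundancy.upper()]

def failure_groups_tolerated(redundancy):
    """How many whole failure groups can be lost without losing data."""
    return COPIES[redundancy.upper()] - 1

if __name__ == "__main__":
    for level in ("EXTERNAL", "NORMAL", "HIGH"):
        print(level, usable_capacity_gb(1200, level), failure_groups_tolerated(level))
```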

FG mirroring implementation

Mirroring is not implemented like RAID 1 arrays (where a disk is partnered with another disk)
Mirroring occurs at the file extent level and these extents are distributed among several disks known as "partners"
Partner disks will reside in one or more separate failure groups (otherwise mirror copies would be vulnerable)
ASM automatically chooses partners and limits the number of them to fewer than 10 (varies by RDBMS version) in order to contain the overall impact of multiple disk failures
If a disk fails, then ASM updates its extent mapping such that reads will now occur on the surviving partners
This is one example when ASM and the RDBMS communicate with each other
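
The placement rule and the read failover described above can be modelled with a small Python sketch. It is deliberately simplified and is not how ASM picks partners internally: it just takes the first disks it finds in other failure groups. The `Disk` class and both function names are invented for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Disk:
    name: str
    failure_group: str

def place_extent(primary, disks, copies):
    """Choose disks for one extent: the primary plus mirrors in other FGs."""
    chosen = [primary]
    used_groups = {primary.failure_group}
    for disk in disks:
        if len(chosen) == copies:
            break
        # A mirror copy must never share a failure group with another copy.
        if disk.failure_group not in used_groups:
            chosen.append(disk)
            used_groups.add(disk.failure_group)
    if len(chosen) < copies:
        raise ValueError("not enough failure groups for the requested redundancy")
    return chosen

def readable_copy(placement, failed_disks):
    """Return the first surviving copy, preferring the primary."""
    for disk in placement:
        if disk.name not in failed_disks:
            return disk
    raise RuntimeError("all copies of this extent are lost")

if __name__ == "__main__":
    disks = [Disk("d1", "fg1"), Disk("d2", "fg1"), Disk("d3", "fg2"), Disk("d4", "fg2")]
    placement = place_extent(disks[0], disks, copies=2)   # NORMAL redundancy
    print(readable_copy(placement, failed_disks=set()))
    print(readable_copy(placement, failed_disks={"d1", "d2"}))  # all of fg1 is gone
```

Because the copies always sit in different failure groups, losing every disk in one group still leaves a readable copy.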

Rebalancing

"Rebalancing" is the process of moving file extents onto or off of disks for the purpose of evenly distributing the I/O load of the diskgroup
It occurs asynchronously in the background and can be monitored
In a clustered environment, rebalancing for a disk group is done within a single ASM instance only and cannot be distributed across multiple cluster nodes to speed it up
ASM will automatically rebalance data on disks when disks are added or removed
The speed and effort placed on rebalancing can be controlled via a POWER LIMIT setting
POWER LIMIT controls the number of background processes involved in the rebalancing effort and ranges from 0 to 11; level 0 means no rebalancing will occur
I/O performance is impacted during rebalancing, but the amount of impact depends on which disks are being rebalanced and how heavily they are used by the I/O workload. The default power limit was chosen so as not to impact application performance
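
As a rough illustration of what "evenly distributing" means when a disk is added, the Python sketch below moves extents one at a time from the fullest disk to the new one until the counts are level. It is a toy model under stated assumptions: real rebalancing also has to respect failure groups and partnering, and the function name is hypothetical.

```python
def rebalance_onto_new_disk(extents_per_disk, new_disk):
    """Level extent counts after adding an empty disk.

    Returns (new_counts, extents_moved). Only extent counts are modelled,
    not which extents move or where their mirror copies live.
    """
    counts = dict(extents_per_disk)
    counts[new_disk] = 0
    moved = 0
    while True:
        fullest = max(counts, key=counts.get)
        # Stop once the new disk is within one extent of the fullest disk.
        if counts[fullest] - counts[new_disk] <= 1:
            break
        counts[fullest] -= 1
        counts[new_disk] += 1
        moved += 1
    return counts, moved

if __name__ == "__main__":
    counts, moved = rebalance_onto_new_disk({"d1": 300, "d2": 300, "d3": 300}, "d4")
    print(counts, moved)
```

In this model only the extents destined for the new disk are moved; the rest stay where they are.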

Performance

ASM will maximize the available bandwidth of disks by striping file extents across all disks in a DG
Two stripe sizes are available: coarse, which has a stripe size of 1 AU, and fine, which has a stripe size of 128K
Fine striping still uses normally-sized file extents, but the striping occurs in small pieces across these extents in a round-robin fashion
ASM does not read from alternating mirror copies since disks contain primary and mirror extents and I/O is already balanced
By default the RDBMS reads from the primary extent; from 11.1 this can be changed via the ASM_PREFERRED_READ_FAILURE_GROUPS parameter for cases where reading extents from a failure group local to the node results in lower latency. Note: this is a special case applicable to "stretch clusters" and not to general usage of ASM
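
The difference between coarse and fine striping is easiest to see as an address calculation. The Python sketch below maps a logical file offset to an extent number and an offset inside that extent. It assumes 1MB AUs and 1-AU extents; the notes above do not say how many extents a fine stripe rotates across, so the `width` of 8 is an assumption, as are the function names.

```python
KB = 1024
MB = 1024 * KB

def coarse_locate(offset, au=1 * MB):
    """Coarse striping: each 1-AU stripe is a whole extent."""
    return offset // au, offset % au

def fine_locate(offset, au=1 * MB, stripe=128 * KB, width=8):
    """Fine striping: 128K pieces go round-robin across `width` extents.

    ASSUMPTION: `width` (extents per stripe set) is 8.
    Returns (extent_index, offset_within_extent).
    """
    set_bytes = width * au                 # bytes held by one set of extents
    set_no, within = divmod(offset, set_bytes)
    stripe_no, in_stripe = divmod(within, stripe)
    extent_in_set = stripe_no % width      # round-robin across the extents
    row = stripe_no // width               # how many pieces deep in each extent
    return set_no * width + extent_in_set, row * stripe + in_stripe

if __name__ == "__main__":
    for off in (0, 128 * KB, 256 * KB, 1 * MB):
        print(off, coarse_locate(off), fine_locate(off))
```

With coarse striping the first megabyte lands in a single extent, while with fine striping it is spread in 128K pieces over several extents, which favours small, latency-sensitive writes.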

Miscellaneous

ASM can work for RAC and non-RAC databases
One ASM instance on a node will service any number of instances on that node
If using ASM for RAC, ASM must also be clustered to allow instances to update each other when file mapping changes occur
In 11.2 onwards, ASM is installed in a grid home along with the clusterware as opposed to an RDBMS home in prior versions.

Wednesday, June 8, 2011

ASM foot notes