Trent Foundation School - East Midlands Deanery

By Joan Bell,2014-09-30 06:56

    The Fun of Oracle RAC/ASM and Devices Names

    Or

    Why can’t AIX rename a device?

    By: Allan E Cano, Itrus Technologies

    Email: acano@itrus.com

    Date: November 02, 2009

Contents

    Abstract
    Disclaimer
    NIC Naming Solutions
        IBM Supported Solution
            For Node3
            For Node4
        Unsupported Method
            For Node3
            For Node4
    Disk Naming Solutions
        IBM Supported Solution
        Oracle Documented Solution
        Unsupported Solution
        Possible alog solution
    Request for more…
    Conclusions
    Rename an Ethernet Adapter (rename_ent)
    Rename an Hdisk (rename_hdisk)
    Quick Script to Compare Disk Mappings

Abstract

    I was recently tasked to help a client integrate two new AIX nodes into an existing Oracle RAC environment and found some annoying device name requirements for Oracle.

    First, the network adapter used for a given role (public or private) has to have the same name on every node.

    Second, unless the DBAs and SAs created aliased names (via mknod) for the disks used by ASM, all the disks must map to the same hdisk name in AIX on every node.

    For details on these requirements I referred to the Oracle Real Application Clusters Installation and Configuration Guide.

    The easy way to get network adapter and disk names to match between systems would be to run something like

    # chdev -l old_name -n new_name

    Except that this command option does NOT exist. A quick Google search shows that the request to be able to rename a disk dates back to at least 1996.

    Below, I discuss the solutions IBM will support for synchronizing adapter and disk names, based on my experience with AIX support, followed by the work-around I used to solve the problem.

Disclaimer

    The scripts presented in this document were written for a specific task and environment. There is no guarantee that these scripts will work correctly or, for that matter, that they will not do damage. It is the responsibility of the system administrator to review each script and ensure that it will work correctly in his/her environment.

    The author accepts no responsibility for these scripts in any environment in which he has not directly placed them.

NIC Naming Solutions

    Instead of looking at the subnets of the public and private networks to determine which adapter to use (à la HACMP), Oracle requires you to use, for instance, ent0 for the public address on all nodes in the cluster and ent1 for the private address on all nodes in the cluster.

    This is not usually a problem with new builds; however, in the environment detailed below, getting the network interfaces to match up requires either renaming the network interfaces or very carefully adding the desired interfaces in the correct order, along with a few unused VIO NICs as place holders.

    The existing production nodes had the following configuration:

    Node Name  Network  Used Ethernet Adapters          Unused
    Node1      Private  ent6 Etherchannel of ent0/ent1  ent3 VIO Adapter
               Public   ent2 VIO Adapter                ent4, ent5 2 port GigE
    Node2      Private  ent6 Etherchannel of ent0/ent1  ent3 VIO Adapter
               Public   ent2 VIO Adapter                ent4, ent5 2 port GigE

    The new nodes started with the following adapters available for the public/private networks.

    Node Name  Network  Used Ethernet Adapters          Unused
    Node3      Private  ent0/ent1
               Public   ent2 VIO Adapter
    Node4      Private  ent0/ent2                       ent1, ent3
               Public   ent4 VIO Adapter
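
    Before changing anything, it can help to confirm which networks actually disagree between nodes. The following is only a sketch: the function name check_nic_names and the "node network adapter" input format are made up for illustration, and the input has to be collected by hand from each node. It reports every network whose adapter name is not the same on all listed nodes.

```shell
# check_nic_names (hypothetical helper): read "node network adapter" triples
# on stdin and report each network whose adapter name differs between nodes.
# Returns non-zero if any mismatch is found.
check_nic_names() {
    awk '
        {
            net = $2; ent = $3
            if (!(net in first)) { first[net] = ent; order[++n] = net }
            else if (first[net] != ent) bad[net] = 1
            seen[net] = seen[net] " " $1 "=" ent   # remember who uses what
        }
        END {
            status = 0
            for (i = 1; i <= n; i++) {
                net = order[i]
                if (net in bad) { print "MISMATCH " net ":" seen[net]; status = 1 }
                else print "OK " net ": " first[net]
            }
            exit status
        }'
}

# Example (input typed in by hand from the tables above):
#   printf 'Node1 private ent6\nNode3 private ent3\n' | check_nic_names
```

    The adapter names are compared as plain strings, so the sketch says nothing about whether the underlying adapters are equivalent; it only flags names that Oracle would see as different.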

IBM Supported Solution

    For Node3

    1. Add 3 VIO NICs to the LPAR

    2. Run cfgmgr

    3. Then create the Etherchannel for ent0 and ent1 producing ent6.

    For Node4

    Assuming you’re using LPARs, you have to:

    1. Un-assign one 2-port adapter (via DLPAR if you can; otherwise through the profile, then re-activate)

    2. Remove all network adapters

    # for x in 0 1 2 3 4
    > do
    > rmdev -dl en$x
    > rmdev -dl et$x
    > rmdev -dl ent$x
    > done

    3. Run cfgmgr

    4. Re-assign the 2 port adapter

    5. Run cfgmgr

    6. Add a new VIO NIC

    7. Run cfgmgr

    8. Build the Etherchannel from what now should be ent0 and ent3 producing ent6.

    Without LPARs you have to physically unplug adapters to get everything named correctly, and hopefully find a mkdev command that creates a bogus network adapter in the Defined state as a place holder. If you need to add a place holder, you may find the necessary parameters for mkdev in the cfgmgr log:

    # alog -o -t cfg > /tmp/afile.txt

    We’ll cover alog in more detail later.

Unsupported Method

    The quick fix would be to rename adapters to match the existing configuration on the other nodes. This requires updating the ODM. Not wanting to re-invent the wheel, I of course searched Google and found one useful answer at:

    http://www.phase2.net/2007/06/08/how-to-rename-network-adapters-in-aix/

    From here, and using an old rename_hdisk I had written, I built the ODM script rename_ent. This script does some BASIC error checking and provides BASIC safeguards. It also REQUIRES a reboot for the adapters to work.

    Pretty much any work in the ODM will get you a “That’s unsupported. Thank you for calling,” from IBM Support. From me it gets the disclaimer at the start of this document.

    So, here is how to use the script to resolve the network naming in our example:

    For Node3

    1. Create the Etherchannel for ent0 and ent1 producing ent3.

    2. Rename ent3 to ent6

    # rename_ent 3 6

    For Node4

    1. Remove ent2, ent3

    # for x in 2 3
    > do
    > rmdev -dl en$x
    > rmdev -dl et$x
    > rmdev -dl ent$x
    > done

    2. Rename ent4 to ent2

    # rename_ent 4 2

    3. Run cfgmgr bringing back the 2-port adapter as ent3 and ent4

    4. Build the Etherchannel from what now should be ent0 and ent3 producing ent6.

Disk Naming Solutions

    The problem here is that the customer used the hdisk names directly when configuring the ASM disks into RAC. Then they added additional disks (SAN LUNs) to the existing environment. The result is that there is NO guarantee that a new node will find the disks in the same order … unless you mask one LUN at a time to the new system in the desired order.

IBM Supported Solution

    Mask one LUN at a time to the new system in the desired order.

    The following is directly from the PMR I worked with IBM on this problem:

    IBM

    Regarding the question on calling mkdev manually to configure a device with a specific name - the mkdev command does not support this feature for fibre channel attached devices. The only (unsupported) option would be to modify the ODM, which I believe you already indicated was not an option. The only potential option to get the names to match up would be to use the cfgmgr "-l" option to scan a particular adapter and name those devices first - the ODM would then remember those names on subsequent reboots. This assumes that each host sees exactly the same number of devices from each adapter.

    For example, assuming fcs0/fcs1 are configured, to have the devices on fcs1 have the lowest numbered hdisks, run:

    1) rmdev -Rdl fscsi0

    2) rmdev -Rdl fscsi1

    3) cfgmgr -l fcs1

    4) cfgmgr -l fcs0

    The only other thing I'd mention is that while manual modifications to the ODM are not supported, there's no indication that simply changing the name in the correct tables wouldn't work. Nor are we aware of any known issues when the default device numbering is not used. While we can't advise/assist performing the action, from a supportability standpoint in the future we'd simply require rmdev'ing the adapters and child devices and re-running cfgmgr to bring back the default naming if we suspected that as a potential cause of any issue.

    Wish I had a different answer, let me know if you have any questions.

    Itrus

    Not to beat a dead horse...

    Though 'mkdev -l <name>' is not supported for fiber attached devices, there should still be a supported mkdev command to simply add the SAN disks in the desired order.

    Essentially, I need to know:

    1. Is the only supported method for configuring fiber attached devices to run 'cfgmgr' or 'cfgmgr -l fscsiX'?

    2. Or is there any supported AIX command/method for adding fiber attached devices in a particular order other than zoning/masking devices one at a time to the system?

    From what I'm reading the answers are 1 - Yes, 2 - No.

    IBM

    mkdev from the command line (even with the correct -w parameters) will still not allow the device to be made 'available' using this method.

    On the two questions, "yes" on #1, and "no" on #2 are correct.

    Wish I had a different answer.
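
    The cfgmgr "-l" sequence IBM describes in the PMR can be written down as a small dry-run helper. This is a sketch only: plan_scan_order is a made-up name, it just prints the commands rather than running them, and it assumes each fcsN adapter has a matching fscsiN child device, as in IBM's example.

```shell
# plan_scan_order (hypothetical helper): print, but do not run, the commands
# that remove the fibre channel child devices and then rescan the adapters
# in the order given. Arguments are adapter numbers, e.g. "1 0" scans fcs1
# first so that its disks receive the lowest hdisk numbers.
plan_scan_order() {
    for n in "$@"; do
        echo "rmdev -Rdl fscsi$n"
    done
    for n in "$@"; do
        echo "cfgmgr -l fcs$n"
    done
}

# Example: plan_scan_order 1 0
```

    Printing instead of executing leaves the SA free to review the list first. As IBM notes, the approach only lines the names up if every host sees the same number of devices from each adapter.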

Fortunately, to get our disks renamed in a supported manner we only had to

    1. Unmask one new LUN

    2. Remove all the LUNs

    3. Run cfgmgr

    4. Re-mask the new LUN

    5. Run cfgmgr

    This shifted all the hdisk numbers down by one to match the existing production nodes.

    But what if 100 LUNs have been added over 5 years and we expect to double the size in the next 3 years? Raise your hand if you want to map and mask 100 to 300 LUNs, potentially one at a time. OK, that is not likely to happen on a single server, but doing it across a standard dev, test, QA, and prod environment, each with 25 LUNs, is still not something I’d look forward to.

Oracle Documented Solution

    Had the DBAs used the Oracle recommended mknod command to alias these drives, there would not be a problem. The DBAs would simply have used the aliased names when configuring the RAC/ASM environment. The system admin would just need to write a script to map each alias name to the real drive by LUN id (you can’t use PVIDs with ASM). A good start for this script would be the compare disk procedure provided in this document.

    This is from the Oracle Real Application Clusters Installation and Configuration Guide (page 2-36):

    If the device name associated with the PVID for a disk that you want to use is different on any node, you must create a new device file for the disk on each of the nodes using a common unused name.

    For the new device files, choose an alternative device file name that identifies the purpose of the disk device. The previous table suggests alternative device file names for each file. For database files, replace dbname in the alternative device file name with the name that you chose for the database in step 1.

    To create a new common device file for a disk device on all nodes, follow these steps on each node:

    a. Enter the following command to determine the device major and minor numbers that identify the disk device, where n is the disk number for the disk device on this node:

    # ls -alF /dev/*hdiskn

    The output from this command is similar to the following:

    brw------- 1 root system 24,8192 Dec 05 2001 /dev/hdiskn

    crw------- 1 root system 24,8192 Dec 05 2001 /dev/rhdiskn

    In this example, the device file /dev/rhdiskn represents the character raw device, 24 is the device major number, and 8192 is the device minor number.

    b. Enter a command similar to the following to create the new device file, specifying the new device file name and the device major and minor numbers that you identified in the previous step:

    # mknod /dev/ora_ocr_raw_100m c 24 8192

    c. Enter commands similar to the following to change the owner, group, and permissions on the character raw device file for the disk:

    OCR:

    # chown root:oinstall /dev/ora_ocr_raw_100m

    # chmod 640 /dev/ora_ocr_raw_100m

    CRS voting disk or database files:

    # chown oracle:dba /dev/ora_vote_raw_20m

    # chmod 660 /dev/ora_vote_raw_20m

    d. Enter a command similar to the following to verify that you have created the new device file successfully:

    # ls -alF /dev | grep "24,8192"

    The output should be similar to the following:

    brw------- 1 root system 24,8192 Dec 05 2001 /dev/hdiskn

    crw-r----- 1 root oinstall 24,8192 Dec 05 2001 /dev/ora_ocr_raw_
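
    Steps a and b of the Oracle procedure are easy to get wrong by hand when there are many disks, so here is a rough sketch of how they could be combined. The helper name mknod_from_ls is made up; it reads one "ls -l" style line for a character device and prints (without running) the matching mknod command. It assumes the major and minor numbers appear either as "24,8192" or as "24, 8192" in the listing.

```shell
# mknod_from_ls NAME (hypothetical helper): read an "ls -l" line for a
# character device on stdin and print the mknod command that would create
# /dev/NAME with the same major and minor numbers.
mknod_from_ls() {
    awk -v name="$1" '
        $1 ~ /^c/ {                                   # character devices only
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+,[0-9]+$/) {         # form "24,8192"
                    split($i, mm, ","); major = mm[1]; minor = mm[2]
                } else if ($i ~ /^[0-9]+,$/) {        # form "24," "8192"
                    major = substr($i, 1, length($i) - 1); minor = $(i + 1)
                } else continue
                print "mknod /dev/" name " c " major " " minor
                exit
            }
        }'
}

# Example: ls -l /dev/rhdisk5 | mknod_from_ls ora_ocr_raw_100m
```

    Block device lines are ignored on purpose, since the Oracle steps create the alias for the character raw device. The chown and chmod commands from step c still have to be run afterwards.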

Unsupported Solution

    If, as in this case, the original DBA and SA didn’t use aliased names, and the method couldn’t be put in place retroactively because of the down time it would require, you’ll find a useful script called rename_hdisk in this document. This script and fix are essentially the same as what we saw with the network adapters: tweak the ODM. The gotcha here is that sometimes the disks come back defined after a reboot and you have to run ‘mkdev -l hdiskX’ to make them available. Since this solution was unsupported, I couldn’t get IBM to investigate further why the disks on one system came up defined while on the second system they came up available.

    Generally, to use this script the SA will just run a simple swap algorithm. For instance:

    LUN ID         Current hdisk name on new server  Required hdisk name
    5000000000000  hdisk1                            hdisk3
    5000000000001  hdisk2                            hdisk1
    5000000000002  hdisk3                            hdisk2

    # rename_hdisk hdisk1 hdisk100

    # rename_hdisk hdisk2 hdisk1

    # rename_hdisk hdisk3 hdisk2

    # rename_hdisk hdisk100 hdisk3
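
    For more than a handful of disks, working out the swap order by hand gets error prone. The sketch below generates the rename_hdisk calls from a list of "current required" pairs. The function name plan_hdisk_renames is made up, the commands are only printed, and it assumes that every disk involved is listed and that the temporary name (hdisk100 by default) is not in use.

```shell
# plan_hdisk_renames [TEMP] (hypothetical helper): read "current required"
# hdisk name pairs on stdin and print rename_hdisk commands in a safe order.
# A disk is renamed only once its target name is free; when only cycles
# remain, one disk is parked on the TEMP name to break the cycle.
plan_hdisk_renames() {
    awk -v tmp="${1:-hdisk100}" '
        $1 != $2 { want[$1] = $2; order[++n] = $1 }
        END {
            left = n
            while (left > 0) {
                progress = 0
                for (i = 1; i <= n; i++) {
                    src = order[i]
                    # skip finished entries and targets still held by a pending disk
                    if (src == "" || (want[src] in want)) continue
                    print "rename_hdisk " src " " want[src]
                    delete want[src]; order[i] = ""; left--; progress = 1
                }
                if (!progress) {
                    for (i = 1; i <= n; i++) if (order[i] != "") break
                    src = order[i]
                    print "rename_hdisk " src " " tmp
                    want[tmp] = want[src]; delete want[src]; order[i] = tmp
                }
            }
        }'
}

# Example: printf 'hdisk1 hdisk3\nhdisk2 hdisk1\nhdisk3 hdisk2\n' | plan_hdisk_renames
```

    Fed the three pairs from the table above, the sketch prints the same four rename_hdisk commands shown there. It does not check for disks outside the list that may already hold a required name, so the output should be reviewed before anything is run.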
