RS/6000 SP Software Maintenance
Hajo Kitzhöfer, Bärbel Altmann, Janakiraman Balasayee, Atul Sharma, Jorge Vergara
http://www.redbooks.ibm.com
SG24-5160-00
June 1999
Take Note!
Before using this information and the product it supports, be sure to read the general information in
Appendix D, “Special Notices” on page 259.
This edition applies to PSSP Version 2, Release 4 and Version 3, Release 1 of IBM Parallel System
Support Programs for use with AIX 4.3.2.
When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the
information in any way it believes appropriate without incurring any obligation to you.
Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .ix
Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .xi
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
The Team That Wrote This Redbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Comments Welcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
8.4 Kerberos Daemons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
8.5 setup_authent, setup_server and the Kerberos Database . . . . . . . . 166
8.6 The krb-srvtab File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
8.7 Tickets and Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
8.7.1 Ticket-Granting-Ticket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
8.7.2 Contents and Purpose of the Ticket-Granting-Ticket (tgt) . . . . . 171
8.7.3 Asking for a Remote Service . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.7.4 The Ticket Cache File. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
8.7.5 Authentication and Authorization . . . . . . . . . . . . . . . . . . . . . . . 177
8.7.6 Never-expiring ticket. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
8.7.7 Hardmon Access Control List - hmacls File. . . . . . . . . . . . . . . . 179
8.7.8 Kerberos Database Access Control Lists . . . . . . . . . . . . . . . . . 181
8.8 Kerberos Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
8.8.1 List Kerberos Principal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
8.8.2 Make Kerberos Principal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
8.8.3 Change Kerberos Principal . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
8.8.4 Remove Kerberos Principal . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
8.9 How to Reinitialize Kerberos without Rebooting the Nodes?. . . . . . . 187
8.9.1 Removing or Reinitializing the Kerberos Subsystem . . . . . . . . . 187
8.10 How You Can Recreate the krb-srvtab File for a Node . . . . . . . . . . 189
8.10.1 The create_krb_files Wrapper. . . . . . . . . . . . . . . . . . . . . . . . . 189
8.10.2 The ext_srvtab Wrapper . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
8.11 Merging Two (or more) SP Systems to One Kerberos Realm . . . . . 191
8.11.1 Deleting Authentication Server Setup . . . . . . . . . . . . . . . . . . . 192
8.11.2 Provide Name Resolution and Time Synchronization . . . . . . . 193
8.11.3 Adjust the /etc/krb.realms File . . . . . . . . . . . . . . . . . . . . . . . . 193
8.11.4 Extend the /.klogin File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
8.11.5 Adding Principals for Remote Nodes . . . . . . . . . . . . . . . . . . . 195
8.11.6 Creating the srvtab Files for the Remote Nodes . . . . . . . . . . . 197
8.11.7 Distributing the Kerberos Files to the Remote Nodes . . . . . . . 199
8.12 Setting up a Secondary Authentication Server . . . . . . . . . . . . . . . . 199
8.12.1 Install Kerberos-Related Filesets on a Secondary Server . . . . 201
8.12.2 /etc/krb.conf File Modification and Distribution . . . . . . . . . . . . 201
8.12.3 Run setup_authent on the Secondary Authentication Server . 202
8.13 Working with the WCOLL Variable . . . . . . . . . . . . . . . . . . . . . . . . . 204
C.3 On the Node that will become Secondary Authentication Server . . . . . . 257
C.4 On the Control Workstation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
C.5 On the Future Secondary Authentication Server . . . . . . . . . . . . . . . . . . 258
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
This redbook is intended for system administrators and system operators who
need to manage an SP system. It will help you install, tailor and configure an
RS/6000 SP system. It will also illustrate solutions for updating or migrating
your SP to a higher level of AIX or PSSP software.
This redbook is also a good starting point for anyone wanting to get more
background information about Network Installation Management (NIM), the
switch commands, and the use of Kerberos.
Thanks to the following people for their invaluable contributions to this project:
Rich Ferri
IBM PPS Lab Poughkeepsie
Linda Mellor
IBM PPS Lab Poughkeepsie
Chapter 1. Software Maintenance
This granularity allows customers to install exactly what they need to create
their required environment, allowing a smaller minimum installation size. The
packaging terms are explained in the following sections.
1.1.2 Package
A package is a group of filesets with common function collected into a single
installable image. The image is in backup file format (BFF).
Revisions to filesets are tracked using the version, release, modification, and
fix (VRMF) levels. Each time a fileset update is applied, the fix level is
adjusted. Each time a maintenance level is applied, the modification level is
adjusted, and the fix level is reset to zero.
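As a quick illustration (bos.rte.install is used here only as an example of an installed
fileset), you can display the VRMF level of a fileset with the lslpp command; the Level
column shows the full Version.Release.Modification.Fix value, such as 4.3.2.0:
# lslpp -l bos.rte.install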
1.1.3.1 VRMF
Revisions to filesets are tracked using a four-part
Version.Release.Modification.Fix level. Along with each part of the VRMF
versioning scheme come some implied rules for its use.
Version
Release
A new release of AIX or an LPP implies pervasive changes in the product or
packaging. Most filesets are likely to be changed. Binary compatibility is
usually preserved, although not guaranteed. Each release of an AIX version
is supported independently, with its own service stream.
Modification
Maintenance levels are likely to contain support for new hardware, and may
also contain minor functional enhancements. Binary compatibility with
previous maintenance levels of the same release is guaranteed. Pervasive
changes are discouraged, since they would likely have a profound impact on
the size, stability, and risk associated with future individual fix packages.
Fix
Figure 1 shows the relation between LPPs, optional program products and
PTFs.
1.1.5 Bundle
Bundles are collections of installable operating-system software components
and Licensed Program Product (LPP) components that are grouped together
and can be installed with one selection. AIX V4.1 supports both
system-defined and user-defined bundles.
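As a hedged example (the system bundle directory shown is the usual location on AIX 4;
verify it on your system), you can list the system-defined bundles and install one through
the SMIT bundle installation path:
# ls /usr/sys/inst.data/sys_bundles
# smitty install_bundle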
Table 1 demonstrates the differences between LPP, package, and fileset.
Table 1. Example of Packaging Term Usage
Customers that purchase Support Line can also request updates on media,
either through the World Wide Web or through the AIX Support Center.
ML 4210, also called AIX 4.2.1
Maintenance levels for AIX 4 are released approximately twice a year while
the release is active. Maintenance levels are likely to contain support for new
hardware, and may also contain minor functional enhancements.
1.3.1 CD-ROM
Due to increased reliability and speed, CD-ROM is by far the most popular
medium. The typical AIX order ships the following CD-ROMs.
1.3.1.3 Update CD
The Update CD is a recent addition to the AIX CD-ROM set. This CD includes
fixes for critical problems, optionally installable preventive maintenance, and
may also contain new device support. Documentation for the update CD is
included in both ASCII and HTML formats on the CD, with instructions for
browsing the documentation on the CD jacket. The documentation includes
descriptions of critical fixes, preventive maintenance, and device support, as
well as installation instructions. This CD is updated approximately once per
quarter.
1.3.2 Tape
The Stacked Product Option (SPO) tape not only contains AIX, but can also
contain Licensed Program Products (LPPs) that the customer ordered at the
same time as AIX.
In addition to AIX and LPPs, the SPO tape may also contain critical fixes.
These fixes will be automatically installed along with the filesets they update.
The optional preventive maintenance contained on the Update CD cannot be
included on the tape media, since it too would automatically be installed, and
would therefore no longer be optional.
Chapter 2. The Installation and Customization Process
One of the more complex tasks within an SP system is the installation of the
nodes. In contrast to standalone workstations, which are usually equipped
with tape and/or CD devices, the SP nodes lack these devices, so the
ordinary installation methods via tape or CD do not work. The solution to this
dilemma is installation over a network. The administrative Ethernet, the
connection between the CWS and all the nodes, is used for this installation
method. That is one of the reasons why this Ethernet is mandatory.
Since version 3.2, AIX has had a network installation utility with very limited
capabilities. The AIX 4 network installation management (NIM) was a totally
new approach. NIM provides the ability to install machines with software from
a centrally managed repository in the network.
Looking at the NIM installation methods, we notice that the PSSP software
uses only a subset of the NIM capabilities (for example, only the installation
of standalone machines is used).
All of the NIM commands and options are hidden by PSSP scripts and
commands. Because of this, and as long as the installation is successful,
much of this complexity remains transparent to you.
Not all of the functionality offered by NIM is used in the SP; only the minimum
needed to install a standalone system is utilized. Table 3 on page 10 shows
the required information used during installation of an SP node.
Table 3. NIM Objects used by PSSP
The CWS is configured as a NIM master. The master and clients make up a
NIM environment. The master provides resources to the clients. According to
AIX Version 4.3, Network Installation Management Guide and Reference,
SC23-4113, each NIM environment can have only one NIM master. As long
as the CWS is the only boot/install server, this rule is not violated. As soon as
you use nodes as boot/install servers, as described in 2.1.1, “SP Installation
Hierarchy”, more than one NIM master is defined within the SP.
Figure 2. NIM Installation Methods
Figure 3. SP Installation Hierarchy
If your physical network structure supports such a hierarchy, you can initiate
an installation process in each of these trees. In this case, more nodes can be
installed simultaneously than with one flat network structure.
A NIM master makes use of the Network File System (NFS) utility to share
resources with clients. As such, all resources required by clients must be
local file systems on the master.
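When resources are allocated and an installation is initiated, the NIM master exports the
corresponding resource directories via NFS. As a quick, hedged check (assuming the CWS
hostname sp4en0 used elsewhere in this book), you can display what the master currently
exports:
# exportfs
# showmount -e sp4en0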
The NIM resources served by the SP Control Workstation (the NIM master) include
lppsource, spot, mksysb, bosinst_data, and scripts.
The steps for creating a NIM master in an AIX environment to perform the
various options are:
1. Create one of the standalone systems with AIX and NIM filesets installed
as a NIM master, giving a unique network name for the primary network
interface.
2. Define NIM standalone and diskless clients.
3. Create the basic installation resources, like lpp_source and spot. If a
customized mksysb image needs to be installed, then create a mksysb
resource. Create a bosinst_data resource if you would like to have a
customized BOS install program.
4. Allocate the required resources for the clients based on the operations to
be performed.
5. Invoke the operations you want to perform on the client machines.
When you configure the NIM master, you specify a unique identifier to name
the object that NIM creates to represent the network. This is the master's
primary interface that connects to the clients. Once this object has been
created, the name you specify identifies the network in all subsequent NIM
operations.
On the SP CWS, the network object spnet_en0 is created by default for the
Ethernet network during installation. The SMIT fastpath to initialize the NIM
master is smitty nimconfig.
[Entry Fields]
* Network Name [spnet_en0]
* Primary Network Install Interface [en0] +
You will have to define additional NIM networks if clients reside on other local
area networks or subnets. In this case it is required to add a default or static
NIM route between networks you specify. NIM routes are added to network
definitions so NIM can determine the connectivity between NIM machines.
When defining a default or static NIM route, you must also specify the
gateways used by machines on the specified network.
The network types supported by NIM are tok, ent, fddi, generic and ATM. In
the SP environment, the Ethernet (ent) network is always used as the primary
network for NIM operations.
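To verify the master and its network definition from the command line, you can query the
NIM database directly (a minimal sketch, using the default object names mentioned above):
# lsnim -l master
# lsnim -l spnet_en0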
The NIM master manages the installation of the rest of the machines in the
NIM environment. The master is the only machine that can remotely run NIM
commands on the clients. All the other machines in the NIM environment are
clients to the master.
Standalone Clients
The standalone NIM clients are clients with the capability of booting and
running from local resources. The standalone clients mount all the file
systems from local disks and have a local boot image. They are not
dependent upon network servers for operation. They are managed in NIM
networks primarily to install and update software.
Diskless and dataless clients are machines that are not capable of booting
and running without the assistance of servers on a network. The diskless
clients have no hard disks and the dataless clients have hard disks, but they
will not be able to hold all the data that is required for operation.
The fastpath to define a NIM standalone or diskless client on the NIM master
is smitty nim_mkmac. Use the following to add the NIM standalone client
sp4n10 using SMIT,
# smitty nim
Select the option Perform NIM Administration Tasks.
Select the option Manage Machines.
Select the option Define a Machine.
Host Name of Machine : sp4n10
Type of Network Attached to the Primary Network : ent
[Entry Fields]
* NIM Machine Name [sp4n10]
* Machine Type [standalone] +
* Hardware Platform Type [rs6k] +
Kernel to use for Network Boot [up] +
Primary Network Install Interface
* Cable Type bnc +
* NIM Network spnet_en0
* Host Name sp4n10
Network Adapter Hardware Address [10005AFA159D]
Network Adapter Logical Device Name [ent]
IPL ROM Emulation Device [] +/
CPU Id []
The NIM resources that are normally used for standalone clients in SP are:
• lppsource
• Shared Product Object Tree (SPOT)
• mksysb
• scripts
• bosinst_data
Let us look at what these resources are and how to create them using SMIT.
Define a Resource
[Entry Fields]
* Resource Name [lppsource_aix432]
* Resource Type lpp_source
* Server of Resource [master] +
* Location of Resource [/spdata/sys1/install/a> /
Source of Install Images [] +/
Names of Option Packages []
Comments []
SPOT
A Shared Product Object Tree (SPOT) provides a /usr file system for
diskless and dataless clients, as well as the network boot support for all
clients. This is the fundamental resource in a NIM environment. It is
required to install or initialize all machine configuration types.
Recommendation
To avoid running out of space in the root file system, we strongly
recommend creating a separate /tftpboot file system. The size of this
file system should be at least 60 MB.
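A minimal sketch of creating such a file system on the CWS follows (assuming rootvg has
enough free space; 60 MB corresponds to 122880 512-byte blocks):
# crfs -v jfs -g rootvg -m /tftpboot -a size=122880
# mount /tftpboot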
Define a Resource
[Entry Fields]
* Resource Name [spot_aix432]
* Resource Type spot
* Server of Resource [master] +
* Source of Install Images [lppsource_aix432] +
* Location of Resource [/spdata/sys1/install/a> /
EXPAND file systems if space needed? yes +
Comments []
installp Flags
COMMIT software updates? no +
SAVE replaced files? yes +
AUTOMATICALLY install requisite software? yes +
OVERWRITE same or newer versions? no +
VERIFY install and check file sizes? no +
control_flow:
BOSINST_DEBUG = yes
CONSOLE = /dev/tty0
INSTALL_METHOD = overwrite
PROMPT = no
EXISTING_SYSTEM_OVERWRITE = yes
INSTALL_X_IF_ADAPTER = no
RUN_STARTUP = no
RM_INST_ROOTS = no
ERROR_EXIT =
CUSTOMIZATION_FILE =
TCB = no
INSTALL_TYPE = full
BUNDLES =
target_disk_data:
LOCATION =
SIZE_MB =
HDISKNAME = hdisk0
locale:
BOSINST_LANG = en_US
CULTURAL_CONVENTION = en_US
MESSAGES = en_US
KEYBOARD = en_US
After editing this file, we can create the no-prompt resource using SMIT as
follows:
# smitty nim
Select the option Perform NIM Administration Tasks.
Select the option Manage Resources.
Select the option Define a Resource.
Resource Type : bosinst_data
[Entry Fields]
* Resource Name [noprompt]
* Resource Type bosinst_data
* Server of Resource [master] +
* Location of Resource [/spdata/sys1/install/p> /
Comments []
script
This is a customization program executed after installation. It is simply a
shell script that you create with any file name and define as a script
resource; it is executed after the installation of the mksysb image or the
SPOT image. This script can be used to customize your environment on
the client system, for example by increasing the paging space and file
system sizes, configuring the NIS client, and so on. The steps for
configuring this resource are the same as for creating the bosinst_data
resource.
Note: You do not have a full AIX environment during the NIM
customization process. Also, default routes and additional adapters are
not configured at this stage of the installation.
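The following is a minimal sketch of such a customization script (the paging space name
hd6 and the size values are assumptions; adjust them for your environment). It is defined
as a script resource in the same way as the bosinst_data resource shown above.
#!/bin/ksh
# Sample NIM customization script, run on the client after the BOS installation
# Enlarge the default paging space by four logical partitions
chps -s 4 hd6
# Grow /var and /tmp (sizes are in 512-byte blocks)
chfs -a size=+131072 /var
chfs -a size=+131072 /tmp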
Operation to Perform
[TOP]
diag = enable a machine to boot a diagnostic image
cust = perform software customization
bos_inst = perform a BOS installation
maint = perform software maintenance
reset = reset an object's NIM state
fix_query = perform queries on installed fixes
check = check the status of a NIM object
reboot = reboot specified machines
maint_boot = enable a machine to boot in maintenance mode
showlog = display a log in the NIM environment
[MORE...3]
F1=Help F2=Refresh F3=Cancel
F8=Image F10=Exit Enter=Do
/=Find n=Find Next
bos_inst
This operation is used to install the AIX Base Operating System on
standalone clients. The standalone clients are the boot/install servers and
the nodes.
diag
This is used to boot the clients into diagnostics mode to perform hardware
maintenance.
cust
This is used to install software filesets and updates on standalone clients
and spot resources.
fix_query
This is used to display whether specified fixes are installed on a client
machine or spot resources.
maint
This is used to deinstall software filesets and commit and reject updates
on standalone clients and spot resources.
maint_boot
This is used to prepare resources for a client to be network-booted into
maintenance mode to perform software maintenance.
reset
This is used to change the state of a NIM client or resource, so NIM
operations can be performed with it. A reset may be required on a
machine or resource if an operation was stopped before it completed
successfully.
check
This is used to verify the usability of a machine or resource in the NIM
environment. The check operation can be performed on NIM clients, or a
group of NIM clients, a spot resource, or an lpp_source resource.
showlog
This is used to list software installed on a NIM client or spot resource.
reboot
This is used to reboot a NIM client machine. The target of a reboot
operation can be any standalone NIM client or groups of clients.
[Entry Fields]
Target Name sp4n10
Source for BOS Runtime Files mksysb +
installp Flags [-agX]
Fileset Names []
Remain NIM client after install? yes +
Initiate Boot Operation on Client? yes +
Set Boot List if Boot not Initiated on Client? no +
Force Unattended Installation Enablement? yes +
Manage Networks
Manage Machines
Manage Resources
Manage Groups
Backup/Restore the NIM Database
Configure NIM Environment Options
Rebuild the niminfo File on the Master
Unconfigure NIM
Note: In many cases, NIM prevents operations on a target object when the
object is not in a ready state. In such situations you can either try to reset the
object to a ready state or set the force option for an operation to "on" when
the action is initiated. This will ignore the state of the target object and the
operation can be performed on it. In NIM, the force option is available for
several operations.
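For example (a sketch, reusing the client name sp4n10 defined earlier), a client left in a
non-ready state can usually be cleared from the command line as follows:
# nim -Fo reset sp4n10
# nim -o deallocate -a subclass=all sp4n10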
Manage Networks
This displays a menu of operations that enable you to manage NIM networks.
Operations include creating a network, changing or showing the
characteristics of a network, removing a network, and managing routing
information for a network.
Note: You must have root user authority to perform any of the network
operations.
Manage Resources
Manage Groups
These wrappers are called by the setup_server command for configuring the
NIM environment in the CWS and Boot/Install servers (BIS). These scripts
can also be called from the command line if you know the sequence in which
they have to be executed. These scripts use the information from the SDR to
initiate the appropriate NIM commands with the required options. In this
section we discuss how NIM is configured in an SP using these wrappers.
The NIM wrappers are part of the ssp.basic filesets. They are:
• mknimmast
• mknimint
• mknimclient
• mkconfig
• mkinstall
• mknimres
• delnimmast
• delnimclient
• allnimres
• unallnimres
Here we give a basic overview of each of these wrappers. For more
information on the flow of logic within each script, refer to Section 2.2.2 of
RS/6000 SP: PSSP 2.2 Survival Guide, SG24-4928.
2.2.1 mknimmast
This wrapper initializes the NIM master. This is the first step in configuring a
NIM master, which in this case is the Control Workstation. In order to
configure the NIM master, the bos.sysmgt.nim.master and
bos.sysmgt.nim.spot filesets need to be installed. This wrapper is executed
only on the CWS and BIS as part of the setup_server script, using information
from the SDR to configure the NIM master.
Command syntax:
# mknimmast -l <node_number_list>
Example:
# mknimmast -l 0
2.2.2 mknimint
This wrapper creates network objects on the NIM master to serve the NIM
clients. If more than one NIM master is configured, mknimint also creates
network objects to reach the CWS. In cases where more than one NIM
master is defined within the SP (that is, nodes act as Boot/Install Servers),
the CWS remains the resource center for the other NIM masters. Therefore it
is important to configure network routes for reaching the CWS. This is the
reason why the BIS, while executing mknimint, searches for the network
interfaces of the CWS using the netstat command. To make sure that the BIS
can reach every network interface of the CWS, the route definitions must be
in place.
Command syntax:
# mknimint -l <node_number_list>
Example:
# mknimint -l 0
This command configures the NIM network objects for all the interfaces in the
CWS.
2.2.3 mknimclient
This wrapper takes input from the SDR to create the clients on the CWS and
on the BIS nodes that are configured as NIM masters. The node’s reliable
hostname configured in the SDR is used as the node’s NIM machine object
name.
Command syntax:
# mknimclient -l <node_number_list>
Example:
# mknimclient -l 5
This command creates the NIM client definitions for the node 5 on the NIM
master.
2.2.4 mkconfig
This wrapper creates the /tftpboot/<reliable_hostname>.config_info file. The
input values for this script are retrieved from the SDR for all the nodes. This
command has no options; whenever it is executed, it creates the config files
for all nodes whose bootp_response value is not equal to disk. This file is
used during the network installation of the nodes.
Command syntax:
# mkconfig
#!/bin/ksh
export control_workstation="9.12.0.4 192.168.4.140"
export cw_hostaddr="192.168.4.140"
export cw_hostname="sp4en0"
export server_addr="192.168.4.140"
export server_hostname="sp4en0"
export rel_addr="192.168.4.9"
export rel_hostname="sp4n09"
export initial_hostname="sp4n09"
export auth_ifs="sp4en0/192.168.4.140"
export authent_server="ssp"
export netinst_boot_disk="hdisk0"
export netinst_bosobj="bos.obj.ssp.432"
export remove_image="false"
export sysman="true"
export code_version="PSSP-3.1"
export proctype="UP"
export LPPsource_name="aix432"
export cwsk4=""
export LPPsource_hostname="sp4en0"
export LPPsource_addr="192.168.4.140"
export platform="rs6k"
export ssp_jm="yes"
2.2.5 mkinstall
This wrapper creates the /tftpboot/<reliable_hostname>.install_info files for
every node in the SDR whose bootp_response is not equal to disk. This file is
used during the network installation of the nodes.
Command syntax:
# mkinstall
# cat /tftpboot/sp4n09.config_info
2.2.6 mknimres
This wrapper creates all the NIM resources for installation, diagnostics,
migration and customization. The resources created will be used by the
allnimres wrapper for allocation to the clients depending on the
bootp_response field.
Command syntax:
# mknimres -l <node_number_list>
Example:
# mknimres -l 1
The resources that are created by this wrapper are shown in the following
screen:
psspscript:
This is a script file that is copied from the
/usr/lpp/ssp/install/bin/pssp_script to the /spdata/sys1/install/pssp
directory when the mknimres command is executed. This script is called by
NIM after installation of a node and before NIM reboots the node.
It is run under a single user environment with the RAM file system in
place. It installs the required LPPs and does the post-installation setup.
This script takes the input from the /tftpboot/<node>.config_info and
/tftpboot/<node>.install_info files.
prompt:
This resource is allocated when you want the node to prompt for the input
from the console. This is used to perform a maintenance or diagnostic
mode of operation.
noprompt:
This resource is allocated when installing or migrating the node using full
overwrite install and you do not want the installation to prompt for any
input on the console.
lppsource_aix432:
This resource contains the minimum BOS filesets and PTFs, in installp
format, required for installing the nodes. Whenever you copy any new
filesets or PTFs to this directory, you must rebuild the table of contents
(.toc) file and also update the spot resource to reflect these PTFs (see
the example after this list).
mksysb_1:
This resource contains the image to be installed on the nodes. Along with
the SP system, you get a spimg tape that contains the minimum BOS
filesets and PTFs in image file format. If you are not using this minimal
image and want to use the image you have created, then make sure that
all the minimum PTFs are also installed. This image is copied to the
directory /spdata/sys1/install/images as part of the installation steps.
spot_aix432:
This resource is defined as a directory structure that contains the run time
files common to all machines. This resource is created under the directory
/spdata/sys1/install/<aix version>/spot.
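As noted for the lppsource resource above, a typical sequence after copying new PTFs
into the lppsource directory is to rebuild the .toc file and then update the SPOT. The
following is a sketch using the resource names from this chapter; verify the options
against your NIM documentation:
# inutoc /spdata/sys1/install/aix432/lppsource
# nim -o cust -a lpp_source=lppsource_aix432 -a fixes=update_all spot_aix432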
2.2.7 delnimmast
This wrapper is used to delete the NIM master definitions and all the NIM
objects on the CWS or boot/install server. The NIM filesets will also be
de-installed by this script.
Command syntax:
# delnimmast -l <node_number_list>
Example:
# delnimmast -l 0
This command can be used to unconfigure the NIM master definitions in the
CWS.
2.2.8 delnimclient
This wrapper deletes NIM client definitions from the NIM master.
Command syntax:
# delnimclient -l <node_number_list>
Example:
# delnimclient -l 7,8
This command deletes the NIM client definitions for the nodes 7 and 8 from
the NIM master.
2.2.9 allnimres
This wrapper is used to allocate the NIM resources to the clients, depending
on the value of the bootp_response attribute defined in the SDR. If the
bootp_response value is set to install, migration, maintenance, or diagnostic,
the appropriate resources will be allocated. In the case of a disk or customize
value, all the resources will be deallocated.
Command syntax:
# allnimres -l <node_number_list>
Example:
# allnimres -l 5
The command to check the resources that are allocated for performing an
install operation on clients is as follows:
The resources that are allocated for performing a diag operation on clients
are as follows:
The resources that are allocated for migrate operation on clients are as
follows:
2.2.10 unallnimres
This wrapper is used to deallocate all the NIM resources for the clients. The
command syntax is:
# unallnimres -l <node_number_list>
Example:
# unallnimres -l 5
When you run setup_server, it looks at all the nodes and customizes them
according to their bootp_response values each time you execute this
command. This is time-consuming and often unnecessary. Instead of running
setup_server, you can execute the following wrappers to set up a single node
for installation. The steps to set up node sp4n06 for installation are as follows:
# spbootins -s no -r install -l 6
This sets the node 6 for the install operation and adds an entry in
/etc/bootptab for node 6.
# create_krb_files
# mkconfig
# mkinstall
# allnimres -l 6
This command allocates all the resources that are required for the install
operation for the node sp4n06.
Now, if you initiate a network boot of node 6, the node will reboot and start
the installation.
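The network boot itself is normally started with the nodecond command (a sketch,
assuming node 6 is in frame 1, slot 6; the node can also be network-booted from the
SP hardware Perspectives interface):
# nodecond 1 6 &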
WSM includes the components that were available for doing system
administration using SMIT. It can be launched from a Java-enabled browser
and also from the application icon in the CDE application manager. To launch
WSM from a browser on a PC or some other client, you must have the
Internet server daemon httpd running.
Figure 5 on page 42 shows the main screen when you open the WSM from a
browser.
When you double click on the NIM icon, it takes you to the NIM Installation
Management menu, as shown in Figure 6 on page 43. This is the main screen
for configuring and doing NIM administration.
There are three types of objects shown on this screen: the Task Guide, the
Container, and the NIM machine objects (standalone and master). The
screen also shows the state of each object and for some objects it also gives
additional information. For example, you can see that node sp4n01 is
enabled for BOS installation, and that node sp4n06 is enabled for diagnostic
boot.
These objects are dialogs designed to assist the user in performing tasks in
an ordered series of steps. The first two lines in Figure 6 show two objects of
type TaskGuide. They are:
1. Configure NIM
This is used for adding new standalone or diskless and dataless clients to the
NIM environment. Here it is possible to specify multiple systems at the same
time to define in the NIM database.
Container Objects:
The container objects include other container objects and simple objects
representing elements of system resources to be configured and managed. In
Figure 6 on page 43, you can see there are two types of container objects.
They are:
1. Resources
This container object contains the Add New Resource of type Taskguide and
the resource objects that are configured in the NIM environment. Figure 7 on
page 45 shows the NIM Resources screen.
The Add New Resource object is for creating new resources in the NIM
environment. When you double click on the other resource objects, it gives all
the information and status of that resource.
For example, when you double click on the spot_aix432 object, you will see
the screen shown in Figure 8 on page 46. This screen shows the general
information of this object, such as resource name, location of the resource,
server of the resource and resource type. You also see there are other menu
options, like boot image information and the state information for this object.
2. Networks
This network container object contains the Add New Network TaskGuide and
the network object spnet_en0 that is configured in this NIM environment.
Figure 10 on page 48 shows the NIM Networks screen.
The Add New Network TaskGuide object is for creating new networks. Double
clicking the spnet_en0 object, you can see the general NIM routes and status
information menus for this object.
The node installation process is complex, so we detail the process from start
to finish. To help you find out what is actually happening during node
installation, we also look at the debug options that are available.
lppsource = 500 MB
mksysb_images = 300 MB
pssp_lpp_images = 350 MB
SPOT = 200 MB
                   PSSP                             AIX
          2.1  2.2  2.3  2.4  3.1   4.1.4  4.1.5  4.2  4.2.1  4.3  4.3.1  4.3.2
PSSP 2.1   -    -    -    -    -      Y      Y     n     n     n     n      n
     2.2   Y    -    -    -    -      Y      Y     Y     Y     n     n      n
     2.3   Y    Y    -    -    -      n      n     n     Y     Y     Y      Y
     2.4   Y    Y    Y    -    -      n      n     n     Y     n     Y      Y
     3.1   Y    Y    Y    Y    -      n      n     n     n     n     n      Y
All AIX filesets must be installed in lppsource using the directory structure
shown in the following screen. The corresponding SPOT is also shown here.
A SPOT resource contains basically all the files that are normally found in the
/usr filesystem. When you install new filesets in lppsource, you should then
update the corresponding SPOT; this is covered in 5.2, “Applying AIX PTFs in
the CWS” on page 106.
/spdata/sys1/install/aix415/LPPsource
/spdata/sys1/install/aix415/spot
/spdata/sys1/install/aix421/LPPsource
/spdata/sys1/install/aix421/spot
/spdata/sys1/install/aix432/LPPsource
/spdata/sys1/install/aix432/spot
/spdata/sys1/install/pssplpp/PSSP-2.2
/spdata/sys1/install/pssplpp/PSSP-2.3
/spdata/sys1/install/pssplpp/PSSP-2.4
/spdata/sys1/install/pssplpp/PSSP-3.1
The perfagent.server fileset is a prerequisite for PSSP. This fileset is part of
the Performance Aide for AIX (PAIDE) feature of the Performance Toolbox for
AIX (PTX), a separate product. The perfagent.tools fileset is part of AIX
4.3.2.
For an installation of AIX 4.3.2 and PSSP 3.1 the following filesets need to be
copied to /spdata/sys1/install/aix432/lppsource:
• xlC.rte 3.6.4.0
• perfagent.tools 2.2.32.x
Note
If you are adding dependent nodes, you must also install the ssp.spmgr
fileset.
Install the PSSP software from the SMIT panels, or use the installp command
directly. To install PSSP using SMIT, enter the following commands:
# cd /spdata/sys1/install/pssplpp/PSSP-3.1
# smitty installp
[Entry Fields]
* INPUT device / directory for software .
* SOFTWARE to install [ssp > +
PREVIEW only? (install operation will NOT occur) no +
COMMIT software updates? yes +
SAVE replaced files? no +
AUTOMATICALLY install requisite software? yes +
EXTEND file systems if space needed? yes +
OVERWRITE same or newer versions? no +
VERIFY install and check file sizes? no +
Include corresponding LANGUAGE filesets? yes +
DETAILED output? no +
Process multiple volumes? yes
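Alternatively, the same installation can be done directly from the command line with
installp. This is a sketch; the fileset list shown is only an example, so install the filesets
your configuration actually requires:
# cd /spdata/sys1/install/pssplpp/PSSP-3.1
# installp -aXgd . ssp.basic ssp.clients ssp.sysman ssp.perlpkg ssp.authent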
Once PSSP has been installed on the CWS, we can perform the rest of the
steps required to configure our CWS as an authentication server, NIM master
and boot/install server. These steps need to be performed before we can
install the SP nodes.
2.9 setup_authent
To proceed with the rest of the installation, we need to create a primary
authentication server, which is normally the CWS. The minimum configuration
required at this stage is to initialize the Kerberos database and to define and
register a Kerberos administrative user. The administrative UNIX user is root;
a Kerberos principal (which should be named root.admin) must be created
before we can perform subsequent steps in the installation. Enter the
following command to configure the Kerberos database and to set up the
root.admin principal.
# setup_authent
Files
/.k                                    master key cache file
/.klogin                               lists the names of principals authorized
                                       to invoke remote commands
/etc/krb.conf                          contains the local realm and a list of
                                       authentication servers for the realm
/etc/krb.realms                        maps host names to an
                                       authentication realm
/etc/krb-srvtab                        server key file; contains service names
                                       and private keys
/tmp/tkt0                              the ticket cache file; contains the
                                       ticket-granting ticket
/var/kerberos/database/principal.dir   part of the Kerberos database
/var/kerberos/database/principal.pag   part of the Kerberos database
Daemons
The setup_authent script starts the kerberos and kadmind daemons. These
daemons are added to the /etc/inittab file and are started from here
thereafter. The following screen shows an entry for each of the kerberos
daemons added to the /etc/inittab file.
kerb:2:once:/usr/bin/startsrc -s kerberos
kadm:2:once:/usr/bin/startsrc -s kadmind
The kpropd daemon exists only on a secondary authentication server and is
responsible for updating the secondary authentication server’s database from
the primary authentication server.
# /usr/kerberos/bin/klist
Ticket file: /tmp/tkt0
Principal: root.admin@SP4EN0
If your ticket has expired, you have to run kinit root.admin from the command
line and enter your password. Once you have a ticket-granting ticket, enter the
following command.
# install_cw
# cat /spdata/sy1/spmon/hmacl
1 root.admin vsm
1 hardmon.sp4en0 vsm
2. It creates the /etc/SDR_dest_info file. This file contains the IP address and
name of the SDR server, which is normally the CWS. The following screen
shows the output of our SDR_dest_info file.
# cat /etc/SDR_dest_info
default:192.168.4.140
primary:192.168.4.140
nameofdefault:sp4en0
nameofprimary:sp4en0
hardmon 8435/tcp
sdr 5712/tcp
heartbeat 4893/udp
The main tasks performed by the setup_server command are setting up PSSP
services such as NTP, AMD and File Collections. This command ensures that
the required Kerberos files exist and then creates a list of known rcmd
principals. It creates the CWS as a NIM master and gets information from the
SDR to create the NIM environment including network objects, NIM clients,
SPOT and tftp boot images for the different platforms, and stores them in the
After creating the NIM environment, the NIM resources are allocated to NIM
clients inside the SP. When setup_server is run for the first time on a CWS, all
of these steps need to be performed. This can take up to one hour or more
depending on the CWS configuration and hardware environment. Creating
the SPOT takes up the majority of this time.
NIM logs the actions performed during SPOT creation in the /tmp directory.
There you should look for a file named spot.out.<pid>, where pid is the
process ID of the SPOT creation process. This process is probably no longer
running, so if you find several spot.out.xxx files, take the newest one. View
the contents of this file to see if you can find any hints as to why the build of
the SPOT was unsuccessful.
# SDR_test
The SDR_test creates an SDR class and adds attributes and values, then
deletes the SDR class to ensure the SDR is working correctly.
# spmon_itest
After these tests are complete, log files are created in /var/adm/SPlogs. These
log files are called SDR_test.log and spmon_itest.log.
Viewing the debug output can be extremely useful, because you will be able
to see the commands that failed. The problem may be a misconfiguration of
the boot network adapter, incorrect NIM configuration information, or errors in
resource definitions. Examining the debug output, you can reduce the scope
of investigation, isolate the cause of the problem, and fix it.
Perform the following steps to produce debug output from a network boot
image:
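A sketch of the usual NIM sequence follows; verify it against your NIM documentation.
1. Rebuild the SPOT's network boot images with debugging enabled:
# nim -Fo check -a debug=yes SPOTName
2. Display the resulting enter_dbg addresses:
# lsnim -a enter_dbg SPOTName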
where SPOTName is the name of your SPOT (for example spot_aix432). The
displayed output will be similar to the one shown in the following figure:
spot_aix432:
enter_dbg = "chrp.mp.ent 0x001f3d1c"
enter_dbg = "rs6k.mp.ent 0x001f3d1c"
enter_dbg = "rs6k.up.ent 0x001bd844"
Write down the enter_dbg address for the type of node you are going to boot.
For example, if your client is an rs6k-multiprocessor machine, you would write
down the address "1f3d1c".
The boot process has to be initiated using the spmon command. To boot node
1 enter:
# spmon -power on node1
During the boot process, a menu will appear and the system will prompt you
to select the network boot device. Make sure to select the SP Ethernet
adapter. This is normally automated by the nodecond command during node
installation; now it has to be done manually.
After getting the prompt again, type g for go and press <Enter> to start the
boot process.
The system will begin the installation and show you the commands and their
output during the installation process. If the system hangs, you can see which
command caused the hang and what the problem is.
If the boot image is left in debug mode, every time a client is booted from
these boot images, the machine will stop and wait for a command at the
debugger ">" prompt. If you attempt to use these debug-enabled boot images
and there is not a tty attached to the client, the machine will appear to be
hanging for no reason.
Enter the following command to verify that the node boot response is set to
install. Check that the version levels for AIX and PSSP software are correct.
Check the hdisk number for the boot disk.
# splstdata -b -l 9
[sp4en0:/tmp]# splstdata -b -l 9
List Node Boot/Install Information
sp4n09-new-srvtab
sp4n09.config_info
sp4n09.install_info
Each file name starts with the reliable hostname. The srvtab file is the
kerberos service key file; for more information, see 8.6, “The krb-srvtab File”
on page 168.
# cat /etc/bootptab
sp4n09:bf=/tftpboot/sp4n09:ip=192.168.4.9:ht=ethernet:ha=10005AFA158A:sa=
192.168.4.140:sm=255.255.255.0:
# lsnim -l sp4n09
class = machines
type = standalone
platform = rs6k
netboot_kernel = up
if1 = spnet_en0 sp4n09 10005AFA158A ent
cable_type1 = bnc
Cstate = BOS installation has been enabled
prev_state = ready for a NIM operation
Mstate = currently running
boot = boot
bosinst_data = noprompt
lpp_source = lppsource_aix432
mksysb = mksysb_1
NIM_script = NIM_script
script = psspscript
spot = spot_aix432
cpuid = 000201335700
control = master
This chapter provides guidelines that you can use to verify that your SP
system is correctly installed and customized, and that all subsystems
integrated in the SP are running as needed for your specific environment.
Some scripts are also provided through the SMIT panels to verify your
system. You can type smitty sp_verify to go to the verification panel and
execute the scripts.
Following the verification procedures in the order in which they are presented
in this chapter is a good way to verify your whole system. However, you can
also use them when only verifying some of your subsystems.
Never go into production with only the software included on the original
media, because many problems are found after the software has been
installed and used extensively by customers.
There are also some specific PTFs required for AIX that should be installed on
the CWS and the nodes so they will function properly. These PTFs are
mentioned in the README FIRST files, so be sure to read that
documentation to verify that all the required PTFs are installed on your
system. You can verify the level of your SP software using the following command:
# lslpp -L | grep -E "ssp|rsct"
There are some LPPs that are absolutely necessary for your CWS; the
following list specifies the minimum PSSP LPPs that you should have
installed:
rsct.basic.hacmp
rsct.basic.rte
rsct.basic.sp
rsct.clients.hacmp
rsct.clients.rte
rsct.clients.sp
ssp.authent (if CWS is Kerberos authentication server)
ssp.basic
ssp.clients
ssp.css (if switch is installed)
ssp.ha_topsvcs.compat
ssp.perlpkg
ssp.sysctl
ssp.sysman
ssp.top (if switch is installed)
ssp.ucode
If you are using Virtual Shared Disks (VSDs), you should see:
vsd.cmi
vsd.hsd
vsd.rvsd.hc
vsd.rvsd.rvsdd
vsd.rvsd.scripts
vsd.sysctl
vsd.vsdd
This list is valid for PSSP 3.1. It may vary for other PSSP versions.
Other products that are included with the PSSP media, but are not absolutely
necessary unless you have that specific requirement, are:
ssp.docs
ssp.gui
ssp.hacws
ssp.pman
ssp.ptpegui
ssp.public
ssp.resctr.rte
ssp.st
ssp.tecad
ssp.top.gui
ssp.vsdgui
After you verify that all the required LPPs are installed, check for the required
PTFs. Use the instfix command to determine whether a specific APAR is
installed. For example, to display whether APAR IX71835 is installed in your
system, type:
# instfix -ik IX71835
Or:
There was no data for IX71832 in the fix database.
After you have received this response, you should verify that the software is
consistent throughout your system; that is, that your nodes and CWS have
the same levels of software.
For example, to query LPP information for ssp.basic on all nodes in the
system, enter:
# lppdiff -Ga ssp.basic
You should receive output similar to the following:
-------------------------------------------------------------------------------
Name Path Level PTF State Type Num
-------------------------------------------------------------------------------
LPP: ssp.basic /etc/objrepos 3.1.0.0 COMMITTED I 10
From: sp4n01 sp4n05 sp4n06 sp4n07 sp4n08 sp4n09 sp4n10 sp4n11 sp4n13 sp4n15
-------------------------------------------------------------------------------
LPP: ssp.basic /etc/objrepos 3.1.0.4 APPLIED F 10
From: sp4n01 sp4n05 sp4n06 sp4n07 sp4n08 sp4n09 sp4n10 sp4n11 sp4n13 sp4n15
-------------------------------------------------------------------------------
LPP: ssp.basic /usr/lib/objrepos 3.1.0.0 COMMITTED I 10
From: sp4n01 sp4n05 sp4n06 sp4n07 sp4n08 sp4n09 sp4n10 sp4n11 sp4n13 sp4n15
-------------------------------------------------------------------------------
LPP: ssp.basic /usr/lib/objrepos 3.1.0.4 APPLIED F 10
From: sp4n01 sp4n05 sp4n06 sp4n07 sp4n08 sp4n09 sp4n10 sp4n11 sp4n13 sp4n15
If you find any inconsistency in the software, install the required LPPs or
PTFs.
# date
Fri Apr 16 17:19:27 EDT 1999
Do the same with the nodes. You can use the dsh command for this check:
# dsh -a date
If you see a difference of more than five minutes or the timezone is incorrect,
you will have to change it because there are daemons and subsystems that
use time stamps to accomplish their work. For example, kerberos uses a
timestamp when decoding tickets for authentication, and cannot handle
differences of more than five minutes in the clocks of the authenticator and
the clients.
If you find such differences in your system, refer to the IBM Parallel System
Support Programs for AIX: Administration Guide, GC23-3897 to understand
the procedure to change hostnames and IP addresses.
You should receive responses from all nodes that are powered on, giving you
their system date, as in the following example:
# dsh -a date
sp4n01: Fri Apr 16 10:55:02 EDT 1999
sp4n05: Fri Apr 16 10:55:02 EDT 1999
sp4n06: Fri Apr 16 10:55:02 EDT 1999
If you have an improper configuration, you will receive messages such as:
sp4n10.msc.itso.ibm.com: Fri Apr 16 10:36:44 EDT 1999
sp4n10.msc.itso.ibm.com: rshd: Kerberos Authentication Failed: Access
denied because of improper credentials.
sp4n10.msc.itso.ibm.com: spk4rsh: 0041-004 Kerberos rcmd failed: rcmd
protocol failure.
sp4n11.msc.itso.ibm.com: Fri Apr 16 10:36:44 EDT 1999
sp4n11.msc.itso.ibm.com: rshd: Kerberos Authentication Failed: Access
denied because of improper credentials.
sp4n11.msc.itso.ibm.com: spk4rsh: 0041-004 Kerberos rcmd failed: rcmd
protocol failure.
If you receive an error in that command, verify that your tickets are available
and have not expired by typing klist.
Check the column labeled Expires. If you need to update your tickets, execute
the command kinit followed by the principal, as in this example:
# kinit root.admin
Within the Kerberos subsystem, you should verify that the following daemons
are running and that the following files exist, with the correct data and
permissions.
You can also check that these daemons are in the /etc/inittab file so they will
be started automatically at each reboot.
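A quick, hedged check of both conditions (using the SRC subsystem names and inittab
entries shown earlier in this chapter) is:
# lssrc -s kerberos
# lssrc -s kadmind
# grep -E "kerb|kadm" /etc/inittab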
/.k
This file contains the master key of the kerberos database. The kadmind and
the utility commands read the key from this file instead of prompting for the
master password.
$HOME/.klogin
This file specifies a list of the remote principals that are authorized to invoke
commands on the local user account. For example, the .klogin file of the root
user contains the principals who are authorized to invoke processes as the
root user.
/tmp/tkt<uid>
This file contains the tickets owned by a client of the authentication database.
This file is continuously created and destroyed using the kinit and kdestroy
commands.
To verify that you have correct tickets, you can reinitialize the file by executing
the kinit command. You should be prompted for the password of the principal
and if it is correct, you should return to the AIX prompt, as follows:
# kinit root.admin
Kerberos Initialization for "root.admin"
Password:
The permissions are: -rw------- , owner <uid> and group <group of uid>.
/etc/krb-srvtab
This file contains the names and private keys of the local instances for all the
services protected by Kerberos. It is important to verify that the key versions
of the nodes match the ones specified in the CWS for the same services. To
check this, execute:
# klist -srvtab
In the CWS, you will have a response similar to:
Server key file: /etc/krb-srvtab
Service Instance Realm Key Version
------------------------------------------------------
rcmd sp4en0 SP4EN0 1
hardmon sp4en0 SP4EN0 1
Look for the column labeled Key Version. If you find that the versions are
different between the CWS and the nodes for the same service, for example
rcmd, the /usr/lpp/ssp/kerberos/etc/ext_srvtab command can be used to
create new server key files for each node.
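For example, a sketch of recreating the key file for node sp4n09 (run it in a scratch
directory on the CWS; with the -n flag, the master key is read from /.k instead of being
prompted for):
# cd /tmp
# /usr/lpp/ssp/kerberos/etc/ext_srvtab -n sp4n09
This creates a file named sp4n09-new-srvtab in the current directory, which can then be
copied to the node as /etc/krb-srvtab.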
/etc/krb.conf
The SP authentication configuration file defines the local realm and the
location of authentication servers for known realms.
# cat /etc/krb.conf
SP4EN0
SP4EN0 sp4en0 admin server
/etc/krb.realms
/var/adm/SPlogs/kerberos
The log files in this directory contain messages about the behavior of the
kerberos and kadmind daemons, respectively. You should check them to see
if you have any error messages.
node_number
adapter_type
netaddr
netmask
host_name
type
rate
As shown in Figure 14 on page 78, you will have all the IP addresses defined
in the SDR for your SP, so do not worry if you do not see addresses for all the
adapters that you have in your nodes. The basic ones that you should see will
be the adapters for the service LAN and the switch adapters (if you have a
switch of course).
Check that the IP addresses and the netmasks are correct.
List Node Boot/Install Information
attribute value
-----------------------------------------------------------------------------
control_workstation sp4en0
cw_ipaddrs 9.12.0.4:192.168.4.140:
install_image bos.obj.ssp.432
remove_image false
primary_node 1
ntp_config consensus
ntp_server ""
ntp_version 3
amd_config false
print_config false
print_id ""
usermgmt_config true
passwd_file /etc/passwd
passwd_file_loc sp4en0
homedir_server sp4en0
homedir_path /home/sp4en0
filecoll_config true
supman_uid 102
supfilesrv_port 8431
spacct_enable true
spacct_actnode_thresh 80
spacct_excluse_enable false
acct_master 0
cw_has_usr_clients false
code_version PSSP-3.1
layout_dir ""
authent_server ssp
backup_cw ""
ipaddrs_bucw ""
active_cw ""
sec_master ""
cds_server ""
cell_name ""
cw_lppsource_name aix432
cw_dcehostname ""
The install_image value is the default image that gets installed. The
code_version variable is the default PSSP version that will be installed, and
cw_lppsource_name is the name of the lppsource level used for the CWS.
There are other variables that you can also check, such as amd_config,
which tells you if you have amd configured or not, and the
usermgmt_config variable, which enables the user management option.
-n Displays the following SDR node data:
node_number
frame_number
slot_number
slots_used
initial_hostname
reliable_hostname
default_route
processor_type
processors_installed
description
The screen output shows information about all your nodes. Check that the
hostnames are well defined and that the default route is the one that you
want it to be (in a normal installation, you will see the IP address of the CWS
as the default route).
There is another command that will help you determine the status of your
nodes at a specific moment in time: the spmon command. The most
complete report can be obtained using the -d and -G flags. In that list you can
verify that you have a connection with the frame (or frames) and, if you do,
you can see the information in Figure 15 on page 84.
Verify that you can see all the frames and all the nodes in your SP. Check the
column labeled Power to determine whether a particular node is powered off
or on. Remember that this "power" refers only to the logical power, not the
physical power: if you have a node with this field set to no, you cannot tell
whether the node is logically powered off or physically powered off.
Verify that your Host Responds column is displaying yes for all nodes. If it is
not, you may have a problem with your Ethernet TCP/IP connection; try to
ping the node to see if it responds. If it does respond, problems within the
High Availability Infrastructure layer (especially hats) could be the cause of
the missing host response. In this case, refer to the RS/6000 SP: Problem
Determination Guide, SG24-4778, redbook.
Do the same check with the Switch Responds column. If you find a no there,
you probably need to initialize your switch by executing the Estart command;
refer to 10.2, “Initialize the Switch” on page 230 for details.
The "Env Fail" column shows the status of your environment; this should
always remain no. If this flag shows yes, you have a problem that must be
resolved urgently, because it indicates a failure in the system's physical
environment. For example, the temperature in the node could be excessive
because a fan is not working, or because the air conditioning is not working
well.
The Front Panel LCD/LEDs column provides the same information as the
front panel of a stand-alone machine.
# spmon -d -G
1. Checking server process
Process 13680 has accumulated 5 minutes and 45 seconds.
Check ok
3. Querying frame(s)
1 frame(s)
Check ok
4. Checking frames
You should also verify that you have enough free space in your file systems,
such as the / and /var file systems. In particular, the /var file system should be
checked, because most of the system log files are written to it.
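For example, you can check the CWS locally and then all of the nodes with dsh:
# df -k / /var
# dsh -a df -k /var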
Alternatively, if you want detailed output, you can list the information for the
subsystem hats.<partition name> with the -l flag; it will look similar to the
following:
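A sketch of the command, assuming the partition name sp4en0 used elsewhere in this book:
# lssrc -ls hats.sp4en0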
This command provides helpful information; for example, the State of the
subsystem should be S (Stable). You can also verify that the number of nodes
defined in the group of topology services is equal to the number of Members
active at a given moment; this number is the sum of the total nodes, plus your
CWS, minus any nodes that are installed with PSSP version 1.2 or 2.1, which
do not work with hats.
While there are commands that allow you to look at, modify, and delete your
SDR, the recommendation is not to modify the SDR directly. Instead, use the
commands provided to administer the SP. However, if you find that you do
need to modify the SDR, refer to 7.1.2, “Backing Up the SDR” on page 145 to
learn how to back up the SDR before you do any modification. The SDR is
located in the /spdata/sys1 directory and its subdirectory structure is shown
in Figure 16 on page 86.
Figure 16. SDR Directory Structure (subdirectories archives, defs, partitions, and system;
containing classes, files, and locks, and class files such as VSD_Table, Node, Switch, and Frame)
The archives subdirectory could either be empty (if you have not taken any
backup of the SDR), or it could contain the files of the backups that you have
taken. In the defs subdirectory, you should have the header files for all the
object classes. The partitions subdirectory should have at least one directory
named as the IP address of the CWS as well as other additional
subdirectories for each additional partition that you have defined. The system
subdirectory contains the classes and files which are global to the system.
# SDR_test
SDR_test: Start SDR commandline verification test
SDR_test: Verification succeeded
If you have a different response, you have a problem with your SDR. In that
case, do the following:
You will see one SDR daemon for each partition that you have in your system.
You can also check this by executing the following command:
Verify that the name of the subsystem is associated with the name of the
partition.
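One way to check this (a sketch, assuming the partition name sp4en0) is to
list the SRC group sdr; you should see one sdr.<partition> subsystem per
partition:
# lssrc -g sdr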
However, the normal (and easiest) procedure if you have a problem with the
SDR is to stop the subsystem and restart it; you can do this using the stopsrc
and startsrc commands.
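A minimal sketch of that restart, acting on the whole sdr group:
# stopsrc -g sdr
# startsrc -g sdr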
Chapter 4. Alternate and Mirrored rootvg
In this chapter we explain the concepts of alternate and mirrored root volume
groups, which were introduced with PSSP 3.1. We will also look at the
different boot modes that are required for doing software maintenance.
In order to support this new functionality, a new class, the Volume_Group
class, was added to the SDR.
To manage this new class, five new commands were added to the SP system.
These commands are:
4.1.1 Prerequisites
To enable an alternate rootvg, you must have the following hardware:
• At least two hard disks (one for each rootvg volume group)
• A virtual battery on the SP nodes
The SP virtual battery is a power source that keeps NVRAM powered and the
time-of-day clock running while the SP frame is plugged into a power outlet
with power applied, or until a node is removed from the SP frame.
Power is supplied by the node supervisor card in each node. This card
provides line power even when the node is powered off.
In order to make a clear relation between the installation device and the
physical device, it is recommended to use the physical location code
(00-00-0S-1,0) in the Physical Volume List field of the next input mask instead
of using logical device names, like hdisk0. The reason for this is that with
PSSP 3.1, booting from SSA devices is also supported. AIX assigns logical
device names dynamically to physical devices, so it is not always obvious
which physical device belongs to the logical device hdisk0.
The next screen shows a command output from one of our nodes.
We want to define an alternate rootvg named rootvg_432. We can use the
fast path SMIT command:
# smitty creatvg_dialog
[Entry Fields]
Start Frame [1] #
Start Slot [11] #
Node Count [1] #
OR
Node List []
COMMAND STATUS
spmkvgobj: Volume_Group object for node 11, name root_432 not found, adding n
ew Volume_Group object.
spmkvgobj: The total number of Volume_Group objects successfully added is 1.
spmkvgobj: The total number of rejected Volume_Group additions is 0.
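For reference, a rough command-line equivalent of this SMIT dialog (a sketch
only; the flags are assumed by analogy with the spchvgobj example shown
later in this chapter) would be:
# spmkvgobj -r rootvg_432 -h hdisk1 -l 11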
[Entry Fields]
Start Frame [1] #
Start Slot [11] #
Node Count [1] #
OR
Node List []
# splstdata -b -l 11
3. Now you can either run setup_server to initiate an install operation, or you
can do it directly by using the wrapper-commands. Execute the following
commands to prepare for the installation of node sp4n11.
# setup_server
or
# Efence -G -autojoin sp4n11
After the node comes up, the lspv command in Figure 18 on page 95 shows
that the rootvg is located on hdisk1 and that hdisk0 is unassigned, even
though our first root volume group is installed on this disk.
# lspv
hdisk0 0000097330a4837d None
hdisk1 00000973967bdfa2 rootvg
hdisk2 000003411ee207da None
hdisk3 00003550d567d4e3 None
To verify that we have more than one root volume group definition for node 11
we run the splstdata command:
# splstdata -v -l 11
List Volume Group Information
hdisk0
11 rootvg_432 0 true 1 PSSP-3.1 aix432
As you can see, the new -v option for the splstdata command shows all
defined volume groups for a node.
The next question is, how do we know which root volume group will be used
for the next boot? We use the splstdata -b to display this information:
As we can see, the current active system is the AIX 4.2.1 version on hdisk0.
If we want to switch to our alternate root volume group, we use the
spbootins command:
# spbootins -c rootvg_432 -l 11 -s no
So rootvg_432 becomes the current root volume group for subsequent
installation and customization. Let us check, using the splstdata -b command,
that our changes have been carried out.
# splstdata -b -l 11
There is one final step missing. We have to modify the bootlist of node 11 so
that the next boot will be done using hdisk1 (rootvg_432) instead of hdisk0
(rootvg). We do this by using the new PSSP command spbootlist.
Like the normal AIX bootlist command, the spbootlist command looks at
the vg_name attribute in the Volume_Group object, determines which
physical volume(s) are in that volume group, and sets the bootlist to them. Let
us look at an example:
# spbootlist -l 11
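To confirm that the change really reached the node, you can display its
normal-mode bootlist (a sketch, assuming the node hostname sp4n11):
# dsh -w sp4n11 bootlist -m normal -o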
Important
When you boot your node from one rootvg on one specific physical volume,
you will see that the other physical volumes assigned to an alternate rootvg
appear as unassigned (see Figure 18 on page 95). Do not define anything
on those physical volumes or you will destroy your alternate rootvg.
The normal bootlist within your nodes will be modified by using the spbootlist
command. You should also think about modifying your service bootlist. Why?
Imagine that you are trying to boot your system in service mode to do some
diagnostic work and your system is not coming up. If you have defined the
second physical disk (with the alternate rootvg installed on it) as an alternate
boot device, your system will try to boot from this device, and your node will
probably come up from it. In this case your node is still accessible.
You can configure two or three copies of each logical volume of the operating
system (the original plus one or two copies). The only logical volume that
cannot be mirrored is the dump logical volume.
When we talk about mirroring, we need to think about disk quorum. When
quorum is enabled, a voting scheme is used to determine whether the number
of physical volumes that are up is enough to maintain a quorum. If quorum is
lost, the entire volume group is taken offline to preserve data integrity. If
quorum is disabled, the volume group remains online as long as at least one
physical volume in the volume group is available.
Enter the information of the node so that your volume group is defined with
two copies. You can use the spchvgobj command to modify the information of
your node. You can execute:
# spchvgobj -r rootvg -h hdisk0,hdisk1 -c 2 -l 10
[Entry Fields]
Start Frame [1] #
Start Slot [10] #
Node Count [1] #
OR
Node List []
With these instructions, we specify that the rootvg volume group will be
installed on hdisk0 and hdisk1. AIX takes care that each copy goes to a
separate physical volume. Recall that the prerequisite for mirroring is to have
at least two physical volumes in the volume group of interest. You can define
mirroring at installation time or dynamically on a running node.
You can initiate the mirroring of the root volume group on a running system by
issuing the following command:
# spmirrorvg -l 10 -f
This assumes that you defined the mirroring of the rootvg already as
described before. The command uses information found in the Volume_Group
object to initiate the mirroring.
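To see whether the copies are in place and synchronized, you can look at the
volume group from the node (a sketch, assuming the node hostname sp4n10):
# dsh -w sp4n10 lsvg rootvg
The STALE PPs field of the lsvg output should return to 0 once the
synchronization has finished.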
You could also remove the mirroring by using the normal AIX commands on
the node or through SMIT and modify the information in the SDR afterwards.
Otherwise, the next time you customize that node, the system will try to
recreate the mirroring.
You can select the mode that you want through the SMIT panels or by using
the spbootins command. This mode determines the actions taken when the
setup_server script is executed, and it influences the next boot procedure.
If you want to check the current boot mode settings for a node, use the
splstdata -b command and look for the column labeled "response". See the
following example output of this command.
To modify the bootp_response of a node and save this in the SDR type the
following command:
# spbootins -r <mode> -l <node list>
When you run this command, the default option is to also run the setup_server
script.
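For example, to set a node to customize without running setup_server right
away (a sketch; node 11 is just an example):
# spbootins -r customize -l 11 -s no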
The following sections briefly explain the procedures that are executed when
you change the boot mode.
In this mode, when you run setup_server, all the resource allocations are
removed for this NIM client. The file /tftpboot/<hostname>.info is also
removed, as well as the entry in the /etc/bootptab for that client. All this
means that the bootp request will be ignored and that your node will boot from
the local disk.
You can reboot the node as if it were a normal standalone machine. It will
boot from the local disk and start the operating system (if the key is in normal)
or the diagnostics menu (if the key is in service).
Then, in order to install your node, you must boot from the network.
The difference between booting a node locally with the key in service position
and booting from the network in diag mode is that in the first case, you are
still booting from your local disk and loading the diagnostic software from that
disk, while in the second case, the diagnostic software will be mounted from
the CWS to the node.
This chapter is to be used for applying Program Temporary Fixes (PTFs) for
AIX, the Parallel System Support Programs (PSSP), and other Licensed
Program Products (LPPs) on the SP. If you are planning for migration of the
AIX version or
PSSP software, refer to Chapter 6, “Migrating PSSP and AIX to Later
Versions” on page 125.
When installing PTFs on a production system, you must first take certain
precautionary measures. In this chapter we show the steps for successfully
installing PTFs on the CWS and the nodes.
Before applying any PTFs, you should review the memos, the so-called
README files. These README files usually relate to specific filesets and
include the latest information, which may not yet have made it into the
manuals. They tell you what problems are being fixed and whether any
specific prerequisite filesets are required. They also tell you whether a
shutdown of the node is required for the fixes to take effect. Generally, it is
suggested that you install the fixes in apply mode so that if there is any
problem, you can reject the filesets that have been applied. In some cases we
suggest you commit the fixes, as they may be mandatory. For these reasons,
it is imperative that you read the memos before proceeding.
The README files for all PSSP versions and their LPPs are available at:
http://www.rs6000.ibm.com/support/sp/sp_secure/readme/
As an example, when you apply the PTF ssp.css.2.4.0.2, you get the following
output on the console:
Now let us look at the steps to be followed for installing the PTFs on the CWS
and the nodes.
[Entry Fields]
INPUT device / directory for software /spdata/sys1/install/aix432>
SOFTWARE to update _update_all
PREVIEW only? (update operation will NOT occur) yes +
COMMIT software updates? yes +
SAVE replaced files? no +
AUTOMATICALLY install requisite software? yes +
EXTEND file systems if space needed? yes +
VERIFY install and check file sizes? no +
DETAILED output? no +
Process multiple volumes? yes +
First run with the PREVIEW only option set to yes, and check that the
prerequisites are available. If everything is OK, continue installing the PTFs
by changing the PREVIEW only option to no.
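If you prefer the command line over SMIT, a roughly equivalent sequence is to
preview first with the -p flag and then apply and commit. This is only a hedged
sketch: installp -d <directory> all applies everything it finds, so the directory
used here (an example name) is assumed to contain only the PTFs you want:
# installp -apgXd /spdata/sys1/install/ptfs all
# installp -acgNXd /spdata/sys1/install/ptfs all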
During the installation, check for any errors being reported. It is also
possible to check this by viewing the smit.log file anytime after the
installation is over.
5. Update the SPOT with the PTFs in the lppsource directory using the
following command, as shown in the next screen:
# smitty nim_res_op
Resource name : spot_aix432.
Operation to perform : update_all
[Entry Fields]
* Resource Name spot_aix432
Fixes (Keywords) update_all
* Source of Install Images [lppsource_aix432] +
EXPAND file systems if space needed? yes +
Force no +
installp Flags
PREVIEW only? (install operation will NOT occur) no +
COMMIT software updates? yes +
SAVE replaced files? no +
AUTOMATICALLY install requisite software? yes +
OVERWRITE same or newer versions? no +
VERIFY install and check file sizes? no +
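If you prefer the command line, an approximately equivalent NIM operation
(a hedged sketch, using the resource names shown above) is:
# nim -o cust -a lpp_source=lppsource_aix432 -a fixes=update_all spot_aix432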
6. If the status of the install is OK, then you are through with the update of
the AIX PTFs in the CWS. If the status of the install is FAILED, then you
should review the output for the cause of the failure and resolve the
problem.
The three different methods for installing AIX PTFs are as follows:
1. Creating a new image which contains all the PTFs in one node and
propagating the image to the nodes.
2. Mounting the lppsource directory from the CWS to the nodes, and
installing the PTFs manually using smitty update_all for all nodes.
3. If you have the DSMIT package installed, then you can install using
dsmitty update_all to all the nodes in one effort.
7. Unmount the /mnt filesystem in the node and remove the NFS export of
the directory /spdata/sys1/install/images from the CWS. If you have the
Switch installed, fence the node using the command Efence sp2n10.
Reboot the node for the PTFs to take effect.
8. After the node has been rebooted, unfence the Switch using the
command:
# Eunfence sp2n10
Now the image is ready on the CWS for installation on the other nodes.
However, you should first check that all the operations in this node are
working without any problems. It is generally advisable to keep this node
under observation for a few days before installing the image on the other
nodes.
To test the image, we will install our new image on node sp2n11. The
following steps install node sp2n11 using the new image sp2n10.img which
we created in step 6 of 5.3.1, “Method 1: Applying AIX PTFs using mksysb
Install Method” on page 108.
1. For our test installation we define an alternate rootvg as described in 4.1.2,
“Defining a Alternate rootvg” on page 91. For this, execute the command:
# smitty createvg_dialog
Fill in the input fields as shown in the following screen. The only difference
here, when compared to the normal setup, is that the Network Install
Image Name will be set to sp2n10.img.
[Entry Fields]
Start Frame [1] #
Start Slot [11] #
Node Count [1] #
OR
Node List []
The following screen output shows that the alternate rootvg test_rootvg
did not exist and was therefore created.
COMMAND STATUS
spmkvgobj: Volume_Group object for node 11, name test_rootvg not found, adding n
ew Volume_Group object.
spmkvgobj: The total number of Volume_Group objects successfully added is 1.
spmkvgobj: The total number of rejected Volume_Group additions is 0.
2. The next step is to set the bootp_response attribute for the node to install.
Use the command:
[Entry Fields]
Start Frame [1] #
Start Slot [11] #
Node Count [1] #
OR
Node List []
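A command-line alternative to this SMIT panel (a sketch; node 11 and the
-s no option to postpone setup_server are just examples) would be:
# spbootins -r install -l 11 -s no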
4. Check the SDR by using the command splstdata -b; this should reflect the
new image.
# splstdata -b -l 11
5. Now you can either run setup_server to initiate an install operation, or you
can do it manually using the wrappers only for this node. To do the manual
steps to install the node, execute the following commands to prepare for
the installation of node sp2n10:
# Efence sp2n10
# create_krb_files
# mkconfig
# mkinstall
# mknimres -l 0
# export_clients
# allnimres -l 10
After you execute these commands, the screen output should look similar
to the following:
# Efence sp2n10
All nodes successfully fenced.
# create_krb_files
create_krb_files: tftpaccess.ctl file and client srvtab files created/updated
on server node 0.
# mkconfig
# mkinstall
# mknimres -l 0
mknimres: Copying /usr/lpp/ssp/install/bin/pssp_script to /spdata/sys1/install/p
ssp/pssp_script.
mknimres: Copying /usr/lpp/ssp/install/config/bosinst_data_prompt.template to /s
pdata/sys1/install/pssp/bosinst_data_prompt.
mknimres: Copying /usr/lpp/ssp/install/config/bosinst_data_migrate.template to /
spdata/sys1/install/pssp/bosinst_data_migrate.
mknimres: Successfully created the mksysb resource named mksysb_2 from dir
/spdata/sys1/install/images/sp2n10.img on sp2en0.
# export_clients
export_clients: File systems exported to clients from server node 0.
# allnimres -l 10
exportfs: /spdata/sys1/install/images/sp2n10.img: parent-directory (/spdata/sys1
/install/images) already exported
exportfs: /spdata/sys1/install/images/sp2n10.img: parent-directory (/spdata/sys1
/install/images) already exported
allnimres: Node 10 (sp2n10) prepared for operation: install.
6. Start the network boot and verify that the node gets installed using this new
image.
This method is used for installing the PTFs on the nodes by using the dsh
command from the CWS.
To install on all nodes, use the dsh -a option to have all the nodes in the
working collective. If you want only selected nodes, use a command like dsh -w
sp2n01,sp2n05,sp2n06,sp2n07 to install only on those nodes.
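A hedged sketch of this approach, assuming that a directory containing only
the PTFs you want to apply has been NFS-exported from the CWS (the paths
and node names are examples):
# dsh -w sp2n01,sp2n05 mount sp2en0:/spdata/sys1/install/ptfs /mnt
# dsh -w sp2n01,sp2n05 installp -acgNXd /mnt all
# dsh -w sp2n01,sp2n05 umount /mnt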
In the lab, we configured the CWS as the DSMIT server and all the nodes as
DSMIT clients.
# dsmitty -w sp2n10,sp2n11
If your credentials have expired or if you are logging on for the first time, it will
prompt you for the administrator password.
System Management
Domain Management
Select the option System Management to get into the SMIT screen, or
Domain Management to get into DSMIT administration. Once you get into
the system management, whatever operation you perform will be executed on
clients sp2n10 and sp2n11. If you had selected more than one client, it will
ask you whether you want to execute the command in sequence on all the
clients, or if you want them to be executed concurrently.
Now let us discuss how to update the AIX PTFs to nodes sp2n10 and sp2n11
using DSMIT.
1. We have to mount the lppsource of the CWS to clients sp2n10 and
sp2n11:
# dsmitty -w sp2n10,sp2n11 mknfsmnt
You will get a screen similar to the following:
When you press Enter, it will prompt for sequential or concurrent mode.
Make your choice and press Enter. This mounts the
sp2en0:/spdata/sys1/install/aix432/lppsource to /mnt in both clients
sp2n10 and sp2n11. You can verify that the mount has happened by
entering:
# dsh -w sp2n10,sp2n11 df /mnt
2. Install the PTFs in nodes sp2n10 and sp2n11:
# dsmitty -w sp2n10,sp2n11 update_all
[Entry Fields]
* Use first value entered as default for [yes] +
the rest of the fields
* sp2n10 [/mnt] +
* sp2n11 [] +
Input the Entry fields as shown in the preceding screen and press Enter to
take you to the next screen.
Common Dialogue for Update Installed Software to Latest Level (Update All)
[Entry Fields]
* INPUT device / directory for software /mnt
* SOFTWARE to update _update_all
PREVIEW only? (update operation will NOT occur) no +
COMMIT software updates? yes +
SAVE replaced files? no +
AUTOMATICALLY install requisite software? yes +
EXTEND file systems if space needed? yes +
VERIFY install and check file sizes? no +
DETAILED output? no +
Process multiple volumes? yes +
When you press Enter, it will prompt for sequential or concurrent mode.
Make your choice and press Enter. This command is passed to clients
sp2n10 and sp2n11 and the PTFs are installed in both systems. In this
way, you can select all systems in one effort in order to install the PTFs in
the nodes.
[Entry Fields]
INPUT device / directory for software /spdata/sys1/install/pssplpp/PSSP>
SOFTWARE to update _update_all
PREVIEW only? (update operation will NOT occur) yes +
COMMIT software updates? yes +
SAVE replaced files? no +
AUTOMATICALLY install requisite software? yes +
EXTEND file systems if space needed? yes +
VERIFY install and check file sizes? no +
DETAILED output? no +
Process multiple volumes? yes +
When the update is finished, go through the output messages for any
specific notes. Based on these messages, you must take appropriate
actions, depending on your environment.
For installing the PSSP PTFs, follow the same procedure except for Step 2:
you will have to mount the PSSP PTF directory instead of the lppsource
directory. The command is:
Follow the same procedure except for Step 2: in both Option 1 and Option 2,
you have to mount the PSSP PTF directory instead of the lppsource directory.
The command is:
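A hedged sketch of what this mount could look like, assuming nodes sp2n10
and sp2n11 and the CWS hostname sp2en0:
# dsh -w sp2n10,sp2n11 mount sp2en0:/spdata/sys1/install/pssplpp/PSSP-3.1 /mnt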
Follow the same procedure except for Step 1: instead of mounting the
lppsource directory, you will have to give the path of the PSSP ptfs directory.
In the SMIT dialogue panel for Add a File System for Mounting, enter the
pathname of the remote directory as /spdata/sys1/install/pssplpp/PSSP-3.1.
When updating the ssp.css fileset of PSSP, you must reboot the nodes in
order for the kernel extensions to take effect.
The steps for installing the PTFs by using the Switch are as follows:
1. Create a filesystem or directory in one of the nodes. Check that you have
sufficient space to copy the files to this filesystem. In the lab, we created
file system /ptf in node sp2n07.
2. NFS-export this filesystem to the CWS with read-write permission by
using the command:
# dsh -w sp2n07 /usr/sbin/mknfsexp -d '/ptf' -t 'rw' -r 'sp2en0' '-B'
3. Mount the exported filesystem of the node in the CWS on /mnt by using
the command:
# mount sp2n07:/ptf /mnt
4. Copy the PTFs from the tape to /mnt.
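One way to do the copy (a sketch; the tape device name and target path are
assumptions) is with the bffcreate command, which also rebuilds the .toc file
in the target directory:
# bffcreate -vX -d /dev/rmt0.1 -t /mnt all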
This chapter covers the migration of PSSP and AIX to later versions in an SP
environment. The objective is to help you to prepare and perform the
migration of PSSP and AIX. Before migrating a production system, there are
several things to be checked in order to perform a successful migration. In
this chapter, we describe in detail the steps from the planning stage to the
implementation of migration of PSSP software.
There are several things you need to do, or check, before starting the
migration. The following list covers most of them, but it is not exhaustive.
1. Get the relevant documentation, like READ THIS FIRST document for
PSSP, and read it fully before performing the migration. The README files
for all PSSP versions and their LPPs are available at the following URL:
http://www.rs6000.ibm.com/support/sp/sp_secure/readme/
2. Check and document the AIX and PSSP levels of your CWS and all nodes.
3. Check your installed LPPs in the CWS and the nodes and document them.
4. Look for the required minimum AIX and PSSP PTF levels for the various
AIX and PSSP versions.
5. Check the PTF levels for AIX and PSSP that are installed in the CWS and
the nodes. Find out the PTFs that are needed to be installed for AIX,
PSSP and other LPPs.
6. Understand the issues related to coexistence of various PSSP and AIX
levels.
7. Choose the appropriate method (update/migrate/install) for upgrading your
CWS and the nodes.
8. Check your disk space, and if necessary, get some extra disk space.
9. Document your system and the migration plan.
10.Estimate the migration time.
11.Prepare for recovery procedures, in case of unsuccessful migration.
12.Back up the rootvg of the CWS and the nodes. Also back up the files in the
/spdata directory (directory backup, file system backup or savevg backup)
of the CWS depending upon your configuration.
The basic flow of steps for doing a PSSP migration is shown in Figure 20 on
page 126 and Figure 21 on page 127.
Figure 20 on page 126 and Figure 21 on page 127 show this flow: check
whether the PSSP PTFs are at the latest level on the CWS and on the nodes,
and update them if they are not; then check whether AIX on the CWS is at the
level required by the new PSSP version, and migrate or update AIX (together
with its PTFs) if it is not. Note that coexistence with, or migration of, nodes at
PSSP 1.2 is not supported.
To migrate from PSSP 2.4 to PSSP 3.1 with the AIX version at 4.3.2, we
suggest doing it as an upgrade service.
Note: Use this section in conjunction with the IBM Parallel System Support
Programs for AIX: Installation and Migration Guide, GC23-3898 when
migrating the CWS.
The steps for migrating the CWS from PSSP 2.4 to PSSP 3.1 with AIX version
at 4.3.2 are as follows:
1. Create a mksysb backup image of the CWS. We recommend that you
always take two backups on two different tapes, and never trust just one
tape. Check that the tape is readable by listing its contents with the
command smitty lsmksysb. For more information on mksysb backup, refer
to 7.1.1, “Mksysb and Savevg” on page 141.
2. Check that there is sufficient space in the /tftpboot and /spdata file
systems. See 2.4.2, “Disk Space Considerations” on page 50 for more
details.
3. Create directory /spdata/sys1/install/pssplpp/PSSP-3.1 by using:
# mkdir /spdata/sys1/install/pssplpp/PSSP-3.1
4. Copy the PSSP 3.1 images into the /spdata/sys1/install/pssplpp/PSSP-3.1
directory. Rename the pssp package to pssp.installp and create the .toc
file by using the inutoc command (a short example follows this list). The
files related to PSSP-3.1 are:
• pssp.installp
• rsct.basic
• rsct.clients
• ssp.resctr
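After copying and renaming the images, a minimal sketch of creating the .toc
file in that directory is:
# cd /spdata/sys1/install/pssplpp/PSSP-3.1
# inutoc .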
5. Stop the daemons in the CWS. Execute the following commands in the
same sequence to stop the daemons in the CWS:
# syspar_ctrl -G -k
# stopsrc -s sysctld
# stopsrc -s splogd
# stopsrc -s hardmon
# stopsrc -g sdr
10.Verify the SDR and system monitor for correct installation by using the
following commands:
# SDR_test
SDR_test: Start SDR commandline verification test
SDR_test: Verification succeeded
# spmon_itest
spmon_itest: Start spmon installation verification test
spmon_itest: Verification Succeeded
#
11.Set up the site environment in the CWS for the AIX level by using the
command:
# spsitenv cw_lppsource_name=aix432
12. Start the system management environments on the CWS by using the
command services_config; the output should look as follows:
# /usr/lpp/ssp/install/bin/services_config
rc.ntp: NTP already running - not starting ntp
0513-029 The supfilesrv Subsystem is already active.
Multiple instances are not supported.
/etc/auto/startauto: The automount daemon is already running on this system.
#
14. The output of the previous command indicates that the microcode for the
high node in frame 1, slot 1 needs to be updated. Update the microcode by
using the following command:
[Entry Fields]
* System Partition names sp4en0 +
* Authorization Methods k4 std +
[Entry Fields]
Enable on Control Workstation Only yes +
Force change on nodes yes +
* System Partition names sp4en0 +
* Authentication Methods k4 std +
17.Verify that all the system partition subsystems have been properly started
by using the following command:
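One way to do this check (a sketch; sp4en0 is the partition name used in this
chapter) is to look at the SRC status of the partition-sensitive subsystems:
# lssrc -a | grep sp4en0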
3. Verify the values in the SDR for the correct pssp_ver by using the following
command:
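For example (a sketch; node 6 is the node handled in the following steps):
# splstdata -b -l 6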
4. Now you can either run setup_server to initiate an install operation, or you
can do it directly by using the wrapper-commands. Execute the following
commands to prepare for the installation of node sp2n06.
# setup_server
or
# Efence -G -autojoin sp2n06
# create_krb_files
# mkconfig
# mkinstall
# export_clients
# allnimres -l 6
After you execute these commands, the screen output should look as
follows:
# create_krb_files
create_krb_files: tftpaccess.ctl file and client srvtab files created/updated
on server node 0.
# mkconfig
# mkinstall
# export_clients
export_clients: File systems exported to clients from server node 0.
# allnimres -l 6
allnimres: Node 6 (sp2n06) prepared for operation: customize.
6. Copy the pssp script from the CWS to the /tmp directory of sp2n06 by
using the command:
# pcp -w sp2n06 /spdata/sys1/install/pssp/pssp_script /tmp/pssp_script
7. Execute the pssp script on node sp2n06 from the CWS by using the
command:
# dsh -w sp2n06 /tmp/pssp_script
The output of this command should look as follows:
Although the most important data to protect is the business information, it is
also important to back up the configuration of your system so that you can
continue working or restart your production environment in case of a
component failure.
This chapter helps you to understand how to back up and restore your whole
SP2 system, as well as its subsystems: the SDR, the Kerberos database,
normal directories and logical volumes.
Keep the following things in mind when taking a backup with mksysb:
You can generate the mksysb either through SMIT or by using the command
line.
# smitty mksysb
If you use mksysb from the command line and you want to generate the
./image.data file before taking the backup, use the -i flag, as in the example
below.
If you do not want to generate the ./image.data file because you made a
change to the existing one, you can omit the -i flag.
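For example, assuming the tape drive is /dev/rmt0, the two variants would be:
# mksysb -i /dev/rmt0
# mksysb /dev/rmt0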
It is always important to have a recent backup of the CWS. You can decide to
take one image periodically, for example each week or month, depending
upon your environment, and take another copy each time you change the
configuration.
Remember that mksysb will back up only the data contained in the rootvg
volume group and that you will have to use savevg to back up the other
volume groups.
In order to take an image of your node, follow the same steps mentioned in
7.1.1.1, “Mksysb of the CWS” on page 142. However, your image is usually
created on your local node, and for later use you need to have it on your CWS;
the preferred subdirectory is /spdata/sys1/install/images. As always, there is
more than one way to solve this problem:
• Create the image on the local node and transfer it afterwards to the CWS.
You can use the ftp or rcp command.
• Export /spdata/sys1/install/images and mount it on your node. Create the
image on this mounted filesystem.
• Create a named pipe on your system and use this pipe as the output
device for the mksysb command. Start a remote copy; use the named pipe
as the source and the file name of the image (in the subdirectory
/spdata/sys1/install/images on the CWS) as the destination. A sketch of
this approach follows the list.
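A hedged sketch of the named-pipe method, assuming the node image is to
end up as sp2n10.img on the CWS sp2en0:
# mknod /tmp/pipe p
# dd if=/tmp/pipe | rsh sp2en0 dd of=/spdata/sys1/install/images/sp2n10.img bs=1024 &
# mksysb -i /tmp/pipe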
The image file is not bootable, so you will have to restore this as explained in
7.2.1.2, “Restoring a mksysb of a Node” on page 150.
In common system environments, you have some nodes with the same
configuration running the same application. You can take a backup of one
node of each group so that you can restore that image onto any node of the
same kind and purpose. For example, if you have 10 nodes running AIX
4.3.2 and configured to work with Oracle, you could take a mksysb of one
node. Then, if you need to restore any of those nodes in the future, you can
use the same mksysb to restore both AIX and Oracle, and afterwards restore
the Oracle configuration files specific to that node.
Data Verification
You can verify that you can read the data in the tape and store a copy of its
contents, if desired, in the following way:
Boot Verification
The only way to verify that a mksysb tape will successfully boot is to bring the
machine down and boot from the tape. No data needs to be restored.
Attention
Having the PROMPT field in the bosinst.data file set to no causes the
system to begin the mksysb restore automatically, using preset values with
no user intervention.
If the state of PROMPT is unknown, this can be set during the boot
process. After answering the prompt to select a console during the boot up,
a rotating character is seen in the lower left of the screen. As soon as this
character appears, type 000 and press Enter. This will set the prompt
variable to yes.
You can also check the state of this variable while in normal mode by typing
the following commands:
# chdev -l rmt0 -a block_size=512
# tctl -f /dev/rmt0 rewind
# cd /tmp
# restore -s2 -xvqf /dev/rmt0.1 ./bosinst.data
Then you can edit the file bosinst.data and check the PROMPT variable to
know its value.
7.1.1.4 Savevg
The savevg command finds and backs up all files belonging to a specified
volume group. The volume group must be varied on, and the file systems
must be mounted.
It is important to have a backup of the spdata file system. If you did not create
an independent volume group to store this file system, and it is mounted in
the rootvg file system, the spdata file system will be included in your CWS
image created with the mksysb command. However, if you keep the file system
in a separate volume group, you must back it up separately, for example with
the savevg command.
There are two ways of using savevg, one through SMIT and the other through
the command line.
If you are taking the backup in a node and you do not have a tape drive
installed locally, you can choose one of the three methods described at the
beginning of this section.
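From the command line, a minimal sketch (assuming the volume group is
called spdatavg and the tape drive is /dev/rmt0) is:
# savevg -i -f /dev/rmt0 spdatavg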
You can take a backup of the SDR through SMIT or by using the command
line, as follows.
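From the command line (a sketch; SDRArchive takes an optional string that
is appended to the archive name):
# SDRArchive jvm
The resulting file is stored in the SDR archives directory with a name of the
form: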
backup.JULIANdate.HHMM.append_string
If you want, you can take a look at the file by using the command:
# tar -tvf <file name>
Switch, spmon, parallel commands and so forth. Therefore, you must have an
up-to-date backup of your kerberos database.
Kerberos provides some commands that help you to take backups and do
recovery of the kerberos database. The command used to backup is kdb_util
dump, followed by the file name where you want to store the backup.
For example, if you want to back up the kerberos database to a file called
kerberos.940422, you should execute:
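A sketch of that command, placing the dump file in the database directory
recommended below:
# kdb_util dump /var/kerberos/database/kerberos.940422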
The command will create an ASCII file (called 990422 in our case) that has a
format similar to the one shown in Figure 23 on page 147. Although you can
put this file anywhere, we recommend that you put it in the kerberos database
directory, which is /var/kerberos/database.
When you have secondary authentication servers defined and you follow the
steps outlined for setting them up, you automatically create a backup of the
database every time the cron entry that propagates the database runs.
The entry that propagates the database to the secondary servers and takes a
backup of the database looks like the following:
0 * * * * /usr/kerberos/etc/push-kprop
If you have this entry, the push-kprop script will create a backup file called
slavesave in the database directory /var/kerberos/database.
If you do not have this entry, but you want to automatically and periodically
take backups of your kerberos database, you can include the kdb_util
command to your crontab file.
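A hedged example of such a crontab entry (the path to kdb_util and the
schedule are assumptions; adjust them to your installation):
0 2 * * * /usr/kerberos/etc/kdb_util dump /var/kerberos/database/slavesave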
# smitty nim_backup_db
Then enter the name of a device or a file to which the NIM database and the
/etc/niminfo file will be backed up.
When you use SMIT, you can see that a tar is executed to back up these files.
Do not forget that a backup of a NIM database should only be restored to a
system with a NIM master fileset that is at the same level or a higher level
than the level from which the backup was created.
Sometimes you need to recover only one part of a mksysb. For example, if
you install a new SP with a PCI CWS and you cannot restore the mksysb that
is shipped with the SP, because it was taken on a microchannel machine, you
will want to recover at least the /spdata directory. To do that, follow these
instructions:
1. Determine the blocksize the tape was set to when the mksysb was taken:
# cd /tmp
# tctl -f /dev/rmt0 rewind
# chdev -l rmt0 -a block_size=512
# restore -s2 -xqdvf /dev/rmt0.1 ./tapeblksz
# cat ./tapeblksz
2. Then set the blocksize of the tape drive accordingly by running the
following command:
# chdev -l rmt0 -a block_size=[number in the ./tapeblksz file]
3. Restore the files or directories that you want by executing the following
commands:
# cd /
# tctl -f /dev/rmt0 rewind
# restore -s4 -xqdvf /dev/rmt0.1 ./< path >
To define your install image in a PSSP 3.1 environment you need to run the
following commands:
# spchvgobj -r rootvg -i <image name> -l <node_number>
# spbootins -r install -l <node_number>
Example:
# splstdata -b -l 10
List Node Boot/Install Information
Now you should network boot the node to restore the correct image.
You can restore this image onto a node different from the original node without
worrying about the number of the node and its specific configuration. That
configuration is restored in the customization phase of the installation process.
You can restore the SDR backup either through SMIT or by using the
command line. From the command line, you can restore just the SDR with:
# SDRRestore <backup file name>
or restore the SDR together with the corresponding partition-sensitive
subsystems with:
# sprestore_config <backup file name>
# SDRRestore backup.99201.1811.jvm
0513-044 The stop of the sdr.sp4en0 Subsystem was completed successfully.
0513-083 Subsystem has been Deleted.
0513-071 The sdr.sp4en0 Subsystem has been added.
0513-059 The sdr.sp4en0 Subsystem has been started. Subsystem PID is 37628
Alternatively, sprestore_config not only restores the SDR, but also restores all
partition-sensitive subsystems. This is actually the command executed when
you restore using SMIT (see Figure 25).
# sprestore_config backup.99201.1811.jvm
0513-044 The stop of the sdr.sp4en0 Subsystem was completed successfully.
0513-083 Subsystem has been Deleted.
0513-071 The sdr.sp4en0 Subsystem has been added.
0513-059 The sdr.sp4en0 Subsystem has been started. Subsystem PID is 17934.
stopping "hr.sp4en0"
0513-044 The stop of the hr.sp4en0 Subsystem was completed successfully.
removing "hr.sp4en0"
0513-083 Subsystem has been Deleted.
0513-071 The hats.sp4en0 Subsystem has been added.
0513-071 The hags.sp4en0 Subsystem has been added.
0513-071 The hagsglsm.sp4en0 Subsystem has been added.
0513-071 The haem.sp4en0 Subsystem has been added.
0513-071 The haemaixos.sp4en0 Subsystem has been added.
Added 0 objects to class EM_Resource_Variable
Added 0 objects to class EM_Structured_Byte_String
Added 0 objects to class EM_Resource_ID
Added 0 objects to class EM_Resource_Class
Added 0 objects to class EM_Resource_Monitor
making SRC object "hr.sp4en0"
0513-071 The hr.sp4en0 Subsystem has been added.
0513-071 The pman.sp4en0 Subsystem has been added.
0513-071 The pmanrm.sp4en0 Subsystem has been added.
0513-071 The Emonitor.sp4en0 Subsystem has been added.
0513-059 The hats.sp4en0 Subsystem has been started. Subsystem PID is 38712.
0513-059 The hags.sp4en0 Subsystem has been started. Subsystem PID is 22996.
0513-059 The hagsglsm.sp4en0 Subsystem has been started. Subsystem PID is 25742.
0513-059 The haem.sp4en0 Subsystem has been started. Subsystem PID is 20558.
0513-059 The haemaixos.sp4en0 Subsystem has been started. Subsystem PID is 24642.
0513-059 The hr.sp4en0 Subsystem has been started. Subsystem PID is 35508.
0513-059 The pman.sp4en0 Subsystem has been started. Subsystem PID is 24880.
0513-059 The pmanrm.sp4en0 Subsystem has been started. Subsystem PID is 40450.
0513-059 The sp_configd Subsystem has been started. Subsystem PID is 18412.
If you do not have a good backup of the kerberos database, and you need to
rebuild it, refer to page 85 in RS/6000 SP: Problem Determination Guide,
SG24-4778 for a detailed procedure to reconstruct the database.
# smitty nim_restore_db
The first option can be good if you have enough space in your local node and
have a slow network; by using this option, you can take the backup fast,
compress the file, and send it to the other machine.
If you want to take a backup directly to a file or tape in other machine, you can
use the tar and dd commands in the following way:
# tar -cvf - <path> | rsh <remote host> dd of= <remote tape or file> \
bs=<block size>
Do not forget the "-" before the path, and remember that the block size
depends on the current tape device settings. For example, to take a backup of
the /tmp filesystem to the tape /dev/rmt0 connected to a machine called
sp4en0 and defined with a block size of 1024, using the dsh command, you
must execute:
# tar -cvf - /tmp | dsh -w sp4en0 dd of=/dev/rmt0 bs=1024
However, when running this command you can experience some problems.
For example, if you use a block size that does not match the one set on the
remote tape, you will receive a response similar to the following:
To restore a backup taken like this, you must execute the following command:
# rsh <remote host> dd if=<remote tape or file> ibs=<block size> |tar \
-xvf - <path>
Remember to use the same block size that you used to take the backup.
To create a savevg tape remotely, you must use a pipe; run the following
commands:
# mknod /tmp/pipe p
# savevg -i -f /tmp/pipe <vgname> &
# dd if=/tmp/pipe |dsh -w <remote host> dd of=<remote tape or file> \
bs=<block size>
One common requirement when working with the SP2 is the need to remotely
install third-party applications that require a "local" tape drive to install their
product. To accomplish this, assuming that the product is installed with the
syntax installxx <device>, run the following procedure:
# mkfifo /tmp/pipe
# dsh -w sp4en0 dd if=<remote tape> bs=<block size> \
>/tmp/pipe 2>/dev/null &
# installxx /tmp/pipe
Ever since kerberos was introduced into PSSP software there have been
complaints about it. The question is often asked: "Isn’t it possible to get rid of
this kerberos stuff on the SP?" The reasons for these complaints are always
the same: scripts do not run because of expired kerberos tickets, the switch is
not startable, spmon does not work after changing the IP configuration,
distributed shell commands return errors, and so on. As a consequence,
many customers stored a /.rhosts file on the nodes just to be sure that remote
shell commands run even if kerberos does not work properly. But one of the
reasons for using kerberos in the first place was to eliminate the need for a
/.rhosts file.
"With the SP comes kerberos and with kerberos comes trouble." These are
the very first words of a service bulletin about kerberos in 1995. In this
chapter we show that this statement is not true and that it can be easy to
handle kereros as soon as the processes of kerberos authentication are
made clear.
We start with a brief overview of what a default SP kerberos setup looks like,
and explain the necessary kerberos-related terms. We then approach
kerberos in a chronological way corresponding to the installation process of
an SP. We explain: what does setup_authent do? What happens during
setup_server? Which files are needed where, and when are they distributed?
Figure 27 shows a default SP kerberos setup: the CWS (the authentication
server) and the client nodes (n01, n05, and n09 in our example) each hold the
files /etc/krb.conf, /etc/krb.realms, /etc/krb-srvtab, and /.klogin; in addition,
the /.k master key file resides on the authentication server, and a ticket cache
file such as /tmp/tkt0 may exist on any host.
Since there is one central database for one kerberos realm, all kerberos user
and service names must be unique within this realm. All clients of this
kerberos realm, in our example all SP nodes, have kerberos configuration
files, therefore they know who their authentication server is and which nodes
belong to the same realm. As you can see in Figure 27 on page 159, the
client nodes have almost all the same kerberos-related files as their
authentication server. But some files are unique on every host. Figure 28 on
page 160
shows that the /etc/krb.conf, /etc/krb.realms, and /.klogin files on the clients
are just copies of the server files, while the /etc/krb-srvtab and /tmp/tkt0 (if it
exists) files are unique on each host, although the names are the same.
Figure 28 illustrates this: the /.k file and the /var/kerberos/database/* files
exist only on the authentication server, while the other kerberos files on the
client nodes are either copies of the server's files or unique per-host versions.
After the installation of the PSSP code on the CWS, you have to issue
setup_authent. This script defines the CWS as the authentication server,
creates the kerberos database, and asks you for the kerberos master key, the
very first kerberos principal (root.admin), and some default values. Let us
have a look at which files have already been created at this point of the
installation.
Running setup_authent on the CWS creates the following files on the CWS:
/.k, /etc/krb.conf, /etc/krb.realms, /etc/krb-srvtab, /.klogin, /tmp/tkt0, and the
/var/kerberos/database/* files.
SP2EN0
SP2EN0 sp2en0 admin server
As shown in Figure 30, the /etc/krb.conf file has at least two entries. The first
line indicates that this host belongs to, for example, the SP2EN0 realm, and
the second line gives the information about the authentication server of this
realm.
You may ask, who chose the realm name SP2EN0? The setup_authent script
does the following. First it checks whether an /etc/krb.conf file already exists. If
not, setup_authent takes the hostname of the CWS and changes it into
uppercase to define the realm name. As you can see in Figure 30, the
hostname of the CWS is sp2en0, so the realm name became SP2EN0. When
you use a domain name server, setup_authent uses the domain entry of the
/etc/resolv.conf file as the realm name and changes it into uppercase.
Note, however, that you are not forced to use any of these realm names. You
can choose the realm name and create your own /etc/krb.conf file in the same
syntax. As soon as setup_authent detects an existing /etc/krb.conf file, the first
entry will then become your realm name.
In our setup the CWS has a Token Ring adapter with the adapter name
sp2cw0 and an Ethernet adapter with the adapter name sp2en0. The
hostname of our CWS is based on the Ethernet adapter (look at the
command prompt in the screen output above). Since the primary
authentication server is using the hostname, the related adapter name (the
Ethernet adapter name in our configuration) is not included in the
/etc/krb.realms file. Any additionally configured adapter, like our Token Ring
adapter, will be included in the krb.realms file.
This file will be enlarged with the adapter names of all supported adapters
(like Ethernet, Token Ring, Switch adapter, FDDI, and so forth) from all the
nodes during the installation steps following the setup_authent command.
The very first kerberos user principal that you define (usually root.admin)
must have the instance admin. This instance is needed to do administrative
tasks on the kerberos server, and you must be logged in as the Unix user root
to use this principal. For further user principals, the instance name can be
chosen freely; it does not have to be created before you define the principal.
The instance name of a service principal is the adapter name on which this
service is provided. The rcmd service-principal provides the kerberos remote
commands. Since you want to be allowed to set up kerberized remote
commands over each network in your SP, you have as many rcmd principals
as the number of adapters in your SP configuration.
When you are allowed to use the hardmon service on the CWS, you can talk
to the hardmon daemon that monitors the serial link to the SP frame. Without
hardmon service, you are not able to run any command that uses the serial
line: these commands are spmon, s1term, hmcmds, and hmmon. That is why you
cannot run spmon commands without a working kerberos environment. By
default only root.admin is allowed to use the hardmon services, but you can
change this configuration; see 8.7.7, “Hardmon Access Control List - hmacls
File” on page 179, for more information.
The kerberos daemon is used for the authentication of kerberos users. The
kadmind daemon is only used during administrative tasks on the kerberos
database (for example, adding principals, changing passwords, and so on). It
can happen that remote commands like dsh work fine, but issuing a kadmin
command hangs and no input prompt returns. In such cases, make sure both
daemons are running. You will not notice the death of the kadmind daemon as
long as you are only asking for tickets and services.
These daemons are started from the /etc/inittab. Also, starting with PSSP 2.3,
these daemons are managed by the system resource controller. You can start
and stop them as follows:
# stopsrc -s kadmin
# startsrc -s kadmin
Before PSSP 2.3, in order to stop the kerberos and kadmind daemons, you
had to change the respawn value of these daemons in the /etc/inittab file to
off and then kill the corresponding processes.
In our example, setup_authent has run, but not setup_server, so what are the
contents of the kerberos database at this stage? Let us look into the database
by issuing the lskp command.
sp2en0:/# lskp
K.M tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
changepw.kerberos tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
default tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
hardmon.sp2cw0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
hardmon.sp2en0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
krbtgt.SP2EN0 tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2cw0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2en0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
root.admin tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
As shown in the preceding screen, you can see the root.admin user principal,
hardmon service and rcmd principals for the CWS.
After filling up the SDR with node information, you execute setup_server for
the very first time. One wrapper of setup_server, called setup_CWS, scans the
SDR for the adapter names of all nodes. It then adds an rcmd service principal
for every adapter name, as the following output shows:
sp2en0:/# lskp | pg
K.M tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
changepw.kerberos tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
default tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
hardmon.sp2en0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
krbtgt.SP2EN0 tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2cw0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2en0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css01 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css05 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css06 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css07 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css08 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css09 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css10 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2css11 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n01 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n05 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n06 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n07 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n08 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n09 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n10 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2n11 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
root.admin tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
Now there is one rcmd principal defined for each adapter in the SP system.
As already mentioned, this service principal provides the kerberized remote
commands.
The /etc/krb.realms file has also been filled up with all these new adapters.
This file indicates which adapter belongs to which realm. The example shows
only the first part of it:
rcmd sp2en0 255 1 1 0 10cb558e 72d4f65c 203801010459 199904212101 root admin
K M 255 1 1 0 f21f04fe a5f20edc 203801010459 199904212101 db_creation *
hardmon sp2cw0 255 1 1 0 beab30fc 1a6c5384 203801010459 199904212101 root admin
rcmd sp2css07 255 1 1 0 a1bfc471 48c942fd 203801010459 199904212105 root admin
anna sap 3 1 1 0 e5270e06 4b6d3066 200004270459 199904222132 root admin
rcmd sp2n06 255 1 1 0 b4b6542a 27abcdd8 203801010459 199904212105 root admin
rcmd sp2n13 255 1 1 0 cf88f27f 556f40ed 203801010459 199904212105 root admin
rcmd sp2tr01 255 1 1 0 45bda4ec 37a3c2ad 203801010459 199904212105 root admin
rcmd sp2css06 255 1 1 0 3170111 84a51456 203801010459 199904212105 root admin
During node installation the server key files are staged on the CWS as
/tftpboot/<hostname>-new-srvtab files (for example,
/tftpboot/sp2n06-new-srvtab). After the installation of the whole SP, each SP
node has its own /etc/krb-srvtab file.
To point out the difference between these two kinds of tickets, let us imagine a
trip to Disney World. At the entrance you have to buy a ticket just to enter, and
this ticket allows you to stay inside for a defined period of time. This initial
ticket can be compared to the kerberos ticket-granting-ticket, which has a
specified lifetime. It only gives you permission to use kerberos services in this
realm. But you do not enter Disney World (or a kerberos realm) just to get
inside and then stay in only one place; instead, you are going to have some
fun and visit several shows.
If Disney World was organized like a kerberos realm, there would be a central
ticket counter. In a kerberos realm, this ticket counter (called the
Ticket-Granting-Server) is located in the authentication server and it already
issued you the ticket-granting-ticket. To get access to one of the Disney
shows or even better -- to one of the kerberos services, you have to show
your ticket-granting-ticket at the central ticket counter. It will be checked and if
it is accepted, you will obtain a ticket (in kerberos terminology, a service
ticket) to visit the show -- or use kerberos remote services. Without a
ticket-granting-ticket, you will not get any access.
Now keys, as we know, are used both to lock something and to open it again.
If two people share the same key, one can lock and the other can reopen. In
the kerberos environment, we use two different keys:
• Session Key
• Private Key
8.7.1 Ticket-Granting-Ticket
The last part of the setup_authent command invokes the kinit command for
the first principal you have defined, usually root.admin. This kinit root.admin
prompts you again for the password of this principal, then provides you with a
ticket-granting-ticket. Since this ticket-granting-ticket has by default a lifetime
of 30 days, you have to get a new one after expiration. To obtain a new
ticket-granting-ticket for root.admin, just issue kinit root.admin at the
command prompt. If you are interested in the values of your current ticket,
klist gives you the information needed, as shown:
The first line of the klist output always points to your ticket cache file that
stores your ticket; therefore, it will survive a reboot of the machine (see 8.7.4,
“The Ticket Cache File” on page 176). The second line indicates the owner of
this ticket, and the first line of the next block represents the
ticket-granting-ticket itself.
# kinit root.admin
Kerberos Initialization for "root.admin"
Password:
Obtaining a ticket-granting-ticket in this way involves three steps between the
client (n07 in our example) and the authentication server:
Step 1: The client n07 contacts the authentication server asking for a
ticket-granting-ticket for the root.admin principal.
Step 2: The authentication server verifies whether this client is part of the
realm and creates a packet including a tgt encrypted with the master key (/.k),
so that only the authentication server itself is able to open it. This tgt contains
the client's name, the name of the Ticket-Granting-Server, a timestamp, the
ticket lifetime, the client's IP address, and the Session Key for n07-TGS.
Furthermore, a copy of this Session Key n07-TGS is appended to the tgt, and
the whole packet is encrypted with a key generated from the root.admin
password before it is returned to the client.
Step 3: When this packet arrives at the client node, the password prompt
appears. The root.admin password is entered locally, and the kerberos code on
the client generates a key from it as well. If the password was correct, the
generated key is able to open the packet. Only the session key contained in
the outer brackets is usable by the client (node n07) for further
communication. The ticket-granting-ticket itself is always a kind of "black box"
for the client; only the authentication server is able to open it with its master
key (/.k). This tgt is now stored in /tmp/tkt0 on client n07.
Let us assume n07 asks for a remote service of n08; for example, n07 wants
to set up a remote shell command on n08. We divide this procedure into two
steps: first the communication with the authentication server, and then the
direct communication between n07 and n08.
Requesting a service ticket again involves the Ticket-Granting-Server on the
authentication server, which checks whether the information in the tgt
corresponds to the information in the authenticator, and answers with a
Service Ticket encrypted with the private key of n08 together with the Session
Key n07-n08:
Step 4: Client n07 wants to set up an ls command on n08 via remote shell.
Since n07 has no service ticket for n08 yet, the authentication server has to
be contacted first. n07 wraps a packet with the following contents: n08, the
host that wants to be contacted, the ticket-granting-ticket, and an
authenticator storing the name and IP-address of n07, plus a timestamp. This
authenticator is encrypted with the Session Key n07-TGS. This packet is sent
to the authentication server.
The authentication server opens this packet with its Session Key n07-TGS
and opens the tgt with its master key. The contents of the tgt are compared to
the contents of the authenticator.
Step 5: The authentication server returns a packet to n07. The first part
contains the Service Ticket for n08, including the Session Key for n07 and n08
(n07-n08). This part is encrypted with the private key of n08 so that only n08
is able to open it. Then the same Session Key n07-n08 is appended and the
whole packet is encrypted again with the Session Key n07-TGS. n07 opens the
packet with its Session Key n07-TGS and gets the Session Key n07-n08 and
an encrypted packet containing the Service Ticket for n08.
In the second part of the example (dsh -w n08), n07 contacts n08 directly:
n08 opens the Service Ticket with its private key, so that both sides now share
the Session Key n07-n08 for the rest of the exchange.
In general, the kerberos ticket cache files are stored in the /tmp directory. The
file name is by default composed of tkt followed by the UID of the Unix user
that asked for the ticket-granting-ticket. Since the root user has the UID 0, the
ticket cache file's name is /tmp/tkt0. However, you can name this file whatever
you want, since the correlation of Unix UID and kerberos principal is stored in
the file itself. Having the UID as the ending of the file name just facilitates
identification for you.
You may want to store the ticket files in another location. This can be done
with the KRBTKFILE environment variable in the ~/.profile; for example:
# export KRBTKFILE=~/tkt$LOGIN
This command will name the tgt for Unix user anna to be /home/anna/tktanna.
Wherever you decide to store the tickets, ensure that each user who asks for a
ticket is allowed to write to this directory; that is the reason why it is stored in
/tmp by default.
Assume we have two Unix users on the system and a kerberos user principal
called anna.sap (refer to 8.8.2, “Make Kerberos Principal” on page 183 for
information about how a new kerberos principal is created).
sp2en0:/home/anna $ id
uid=203(anna) gid=1(staff)
sp2en0:/home/anna $ kinit anna.sap
Kerberos Initialization for "anna.sap"
Password:
sp2en0:/home/anna $ klist
Ticket file: /tmp/tkt203
Principal: anna.sap@SP2EN0
sp2en0:/home/heiner $ id
uid=209(heiner) gid=1(staff)
sp2en0:/home/heiner $ kinit anna.sap
Kerberos Initialization for "anna.sap"
Password:
sp2en0:/home/heiner $ klist
Ticket file: /tmp/tkt209
Principal: anna.sap@SP2EN0
Since anna and heiner have different ticket cache files, they can share one
kerberos principal. You may ask: "For what purpose?”
Kerberos does the authentication and permits or denies (by providing you a
ticket or not) sending remote shell commands to the hosts belonging to the
same realm. The authorization (meaning the permission to read, write or
execute a file) is still done by the remote Unix system by checking the Unix
permissions. For instance, whether or not anna is allowed to copy a file to a
remote machine is under the control of the remote Unix system. If anna is not
known on the remote host, the rcp command will be refused even if the
kerberos authentication was okay.
Keep in mind that for kerberized remote shell commands, a user’s UID must
be the same on all the nodes. As long as you are using SP User
Management, this facility takes care of it.
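As a quick check, you could compare the UID on all nodes with a kerberized dsh; this is only a sketch, and the user name anna is taken from the example above:
# dsh -a id anna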
sp2n01:/# /usr/lpp/ssp/rcmd/bin/rcmdtgt
sp2n01:/# klist
Ticket file: /tmp/tkt0
Principal: rcmd.sp2n01@SP2EN0
# cat /spdata/sys1/spmon/hmacls
sp2en0 root.admin a
sp2en0 hardmon.sp2en0 a
1 root.admin vsm
1 hardmon.sp2en0 vsm
...
(For frame number 1, this principal has the v, s, and m permissions.)
sp2en0:/# vi /spdata/sys1/spmon/hmacls
sp2en0 root.admin a
sp2en0 hardmon.sp2en0 a
1 root.admin vsm
1 hardmon.sp2en0 vsm
1 anna.sap m
# kinit anna.sap
Trying spmon -d now would return an error message because the hardmon
subsystem has to be stopped and restarted so that the hmacls file is read
again:
# stopsrc -s hardmon
# startsrc -s hardmon
As kerberos principal anna.sap, let us now try to use spmon -d:
That works, but when we try to open a write terminal on a node, the following
message is returned:
sp2en0:/# s1term -w 1 7
s1term: 0026-645 The S1 port in frame 1 slot 7 cannot be accessed.
It either does not exist or you do not have S1 permission.
It does not matter which Unix user has a ticket for the principal anna.sap.
Neither root nor any normal Unix user who is authenticated as anna.sap is
able to open a write terminal with s1term, because the hmacls entry grants
anna.sap only the m (monitor) permission and not the s (S1) permission.
sp2en0:/# ls -l /var/kerberos/database
total 78
-rw-r----- 1 root system 11 Apr 6 16:40 admin_acl.add
-rw-r----- 1 root system 11 Apr 6 16:40 admin_acl.get
-rw-r----- 1 root system 11 Apr 6 16:40 admin_acl.mod
-rw------- 1 root system 4096 Apr 6 16:41 principal.dir
-rw------- 1 root system 0 Apr 6 16:38 principal.ok
-rw------- 1 root system 82944 Apr 6 16:41 principal.pag
# kdb_util dump /tmp/kerbdb
# cat /tmp/kerbdb | pg
This command is useful to find out which principals are currently defined and
what the current settings are; for example, the expiration date of a kerberos
account, the ticket lifetime and the key version of this principal, as shown in
the following screen:
sp2en0:/# lskp
K.M tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
changepw.kerberos tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
default tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
hardmon.sp2cw0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
hardmon.sp2en0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
krbtgt.SP2EN0 tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2cw0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
rcmd.sp2en0 tkt-life: Unlimited key-vers: 1 expires: 2037-12-31 23:59
root.admin tkt-life: 30d key-vers: 1 expires: 2037-12-31 23:59
# mkkp anna.sap
This creates a new principal anna with the instance sap in the kerberos
database. But we have not yet set a password for anna.sap. Since these new
PSSP kerberos commands are not interactive, you are not able to set the
initial password with the mkkp command. But without a password, the principal
is blocked. The consequence is that you have to set the password later by
getting the kadmin prompt and setting the first password with the
change_principal_password (cpw) option. Since this always requires one
additional step, we recommend that you use the kadmin command right away
to define a new principal. You will be prompted for the root.admin password first
and then set the password for the new principal as in the following example.
Note: Regarding the kadmin command, in case you did not name your very first
kerberos principal root.admin but chose a different name with the
instance admin, you have to specify this admin user as a parameter for the
kadmin command. For instance, if your admin principal is sp.admin, you will
get the kadmin prompt by typing:
# kadmin -u sp.admin
The kadmin command, without any options, expects the default root.admin
principal. Now let us define anna.sap with the old kadmin command:
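The dialog is similar to the following sketch (it follows the same pattern as the kadmin sessions shown later in this chapter):
sp2en0:/# kadmin
Welcome to the Kerberos Administration Program, version 2
Type "help" if you need it.
admin: ank anna.sap
Admin password:
Password for anna.sap:
Verifying, please re-enter Password for anna.sap:
anna.sap added to database.
admin: quit
Cleaning up and exiting.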
How can you find out when a kerberos principal account will expire and how
long the ticket lifetime for this principal currently is? The best way is to use
both commands, lskp and kdb_util dump, especially when you want to change
something. The lskp command returns the expiration date and the ticket
lifetime in hours and minutes.
However, when you want to change the ticket lifetime of a principal, you
cannot specify hours and minutes; instead you need to specify values from 1
to 255. Values 1 to 128 represent multiples of 5 minutes, while lifetime values
from 129 to 255 represent intervals from 11 hours and 40 minutes to 30 days
(see IBM Parallel System Support Programs for AIX: Administration Guide,
GC23-3897).
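For example, a lifetime value of 128 corresponds to 128 x 5 minutes = 640 minutes, that is, 10 hours and 40 minutes, which is exactly the ticket lifetime shown for anna.sap below.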
The current value is readable in the dump file of the database. Let us first
have a look at the current settings of the principal anna.sap, and then use the
chkp command to change them.
As you can see, the expiration date has been changed and the ticket lifetime
is 10 hours and 40 minutes. You can look up this value by using the kdb_util
dump command.
# kdb_util dump /tmp/kerbdb
# cat /tmp/kerbdb | pg
sp2en0:/# kdb_edit
Opening database...
# rmkp anna.sap
For all new kerberos commands introduced with PSSP 2.3, only root authority
is required; no ticket-granting-ticket is needed.
If you want to keep some of the kerberos configuration files, then follow the
steps as in older PSSP versions before a rerun of setup_authent takes place.
Until PSSP 2.3 you had to delete the kerberos files manually. Since it is
possible to keep some of the files, especially if you edited them, we will not
only describe how to delete all kerberos files, but also tell you which files can
be kept.
# rm /.k /.klogin
The master key /.k must be removed, but /.klogin can be kept.
# rm /etc/krb*
# kdb_destroy
This command deletes the database files, meaning the principal.pag and the
principal.dir files under /var/kerberos/database. The ACL files for the kerberos
database remain. In each ACL file there is an entry for the root.admin
principal; consequently, if you set up your kerberos with a different first admin
principal than root.admin, you will run into trouble. To delete all these files use
the command:
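For example, assuming the default database location, a command such as the following removes the ACL files as well:
# rm /var/kerberos/database/admin_acl.*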
As soon as you have deleted the kerberos database and the master key /.k,
the setup_authent command will not run into the replacement loop but instead
set up kerberos again based on the existing files. Now you can rerun:
# setup_authent
Since we plan neither a reboot of the nodes nor starting the pssp_script
manually, we can set all the nodes back to response=disk and then distribute
the files by file transfer. For the bootp_response value customize, no NIM
resource allocation has taken place; therefore, it is sufficient to change this
response value in the SDR without running setup_server.
All new-srvtab files for the nodes are stored under the /tftpboot directory on
the CWS. At a minimum, these files have to be copied to the nodes. If nothing
changed in your configuration, meaning you are using the same realm name
and the same adapter names, the /etc/krb.conf and /.klogin files on the nodes
can still be used. Regardless of whether the kerberos client could use the old
files or really needs the new ones, we will demonstrate the transfer of all files
that are required on a kerberos client for sp2n01. Due to the long outputs of
the ftp command, we cut the messages out and show only the essential part:
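A sketch of such a transfer, assuming root is allowed to ftp to the node and that the default file locations are used, could look like this:
sp2en0:/# ftp sp2n01
ftp> bin
ftp> put /tftpboot/sp2n01-new-srvtab /etc/krb-srvtab
ftp> put /etc/krb.conf /etc/krb.conf
ftp> put /etc/krb.realms /etc/krb.realms
ftp> put /.klogin /.klogin
ftp> quit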
This procedure has to be repeated for each node. After the distribution of the
new-srvtab files to the nodes, delete them under /tftpboot on the CWS, since
they are no longer needed in this place and this directory permits access for
tftp.
# rm /tftpboot/*new*
# spbootins -r customize -l 6 -s no
This command sets node 6 to customize and setup_server will not run due to
the -s option. We only need the wrapper create_krb_files (refer to Figure 31
on page 169). This command creates the new-srvtab file for node 6 under
/tftpboot.
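The wrapper can then be run directly on the CWS; the path shown here assumes the standard PSSP location:
# /usr/lpp/ssp/bin/create_krb_files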
When create_krb_files has finished and you have obtained all needed
<node>-new-srvtab files, do not forget to set the appropriate node(s) back to
disk:
# spbootins -r disk -l 6 -s no
In this case, it is sufficient just to set the bootp_response value in the SDR
back to disk. For "customize", no resource has been allocated, therefore it is
not necessary to run the unallnimres command (or setup_server).
The result is two new-srvtab files in the current directory. What is still left to
do is the concatenation of all the srvtab files belonging to a node into one
srvtab file (whose name you can choose freely), and then the transfer of this file to the
corresponding node as /etc/krb-srvtab.
Assume we have two SP systems, sp4en0 and sp5en0, and sp5en0 will be
integrated to the realm of sp4en0. That means the CWS sp4en0 will become
the authentication server for both SP systems. Later, we will set up sp5en0 as
a secondary authentication server for this realm. Figure 38 illustrates this
configuration.
The common realm name will be SP4EN0, but as you know, you are free to
choose the realm name and create your own /etc/krb.conf file before issuing
setup_authent (see 8.2.2, “The Kerberos Configuration File” on page 162 for
further information).
(Figure 38: the two SP systems, sp4en0 with nodes sp4n07, sp4n08, sp4n11, and sp4n12, and sp5en0, which will be merged into the realm of sp4en0.)
Which files is the kerberos master using, and which subsystems belong to
the kerberos master? Besides the kerberos configuration files that are stored
on the server and the clients, the server is owner of the kerberos database.
So let us delete the related files and remove the subsystems on sp5en0:
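A sketch of these steps, assuming the subsystem names kerberos and kadmind and the default file locations, could look like this:
sp5en0:/# stopsrc -s kadmind
sp5en0:/# stopsrc -s kerberos
sp5en0:/# rmssys -s kadmind
sp5en0:/# rmssys -s kerberos
sp5en0:/# kdb_destroy
sp5en0:/# rm /.k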
Since kerberos packets have a lifetime of 5 minutes, the time difference
between the hosts must not exceed this limit. Therefore, when you plan to
spread one kerberos realm across SP systems, you have to ensure that the
time difference between these systems stays within this limit. Otherwise, your kerberos
authentication and communication will not work. One solution for solving this
problem is using an external time server to synchronize the time services
within the SPs.
When you use the kadmin prompt, you have to define one rcmd principal after
the other and set a password for each of them. It is required to have a valid
password for every principal; otherwise, you will not be allowed to log on or
authenticate as this principal. On the other hand, you are never prompted for
the rcmd principals’ password because the login routine for the rcmd principal
is done under the covers. The rcmdtgt command asks for a ticket-granting-ticket.
Then the authentication server sends a packet back containing the tgt
encrypted with the private key of this node (see also 8.7, “Tickets and Keys”
on page 170). The /etc/krb-srvtab on the node contains the password for this
principal. Finally, with this srvtab file, the packet can be decrypted locally so
that the tgt can be obtained.
sp4en0:/# kadmin
Welcome to the Kerberos Administration Program, version 2
Type "help" if you need it.
admin: ank rcmd.sp5n01
Admin password:
Password for rcmd.sp5n01:
Verifying, please re-enter Password for rcmd.sp5n01:
rcmd.sp5n01 added to database.
admin: ank rcmd.sp5n01x
Admin password:
Password for rcmd.sp5n01x:
Verifying, please re-enter Password for rcmd.sp5n01x:
rcmd.sp5n01x added to database.
admin: quit
Cleaning up and exiting.
You could define every rcmd principal within one kadmin session. You have to
set a password and verify it, but you can forget it afterwards. Now let us check
if these principals are defined:
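For example, a quick check with lskp (the output should list the two new rcmd principals for node 1):
sp4en0:/# lskp | grep sp5n01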
First of all we create a file that has read and write permission only for the
owner, say /tmp/addnew, in order to ensure that no other person can look into
this file during the procedure of creating principals.
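A minimal way to create such a file with the right permissions is, for instance:
sp4en0:/# touch /tmp/addnew
sp4en0:/# chmod 600 /tmp/addnew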
We do not delete the /tmp/addnew file immediately because we can also use
it to create the new-srvtab files for the new principals; it saves time to use this
file as input.
The instance names are equal to the adapter names. The entries in
/tmp/addnew look like rcmd.sp5n05 <password>, so we will cut the instance
names out and use them as input for ext_srvtab. The new-srvtabs are created
in the current directory. To make the work easier, it is recommended to create a new
directory, /tftpboot/srv, and change to it. Now all new-srvtab files will be
generated in this subdirectory.
sp4en0:/# cd /tftpboot/srv
sp4en0:/tftpboot/srv# ext_srvtab -n `cat /tmp/addnew | cut -d'.' \
>-f2 | awk '{print $1}'`
Generating 'sp5en0-new-srvtab'....
Generating 'sp5cw0-new-srvtab'....
Generating 'sp5n05-new-srvtab'....
Generating 'sp5n05x-new-srvtab'....
Do not forget to remove the /tmp/addnew file with the passwords for the rcmd
principals if you have used the add_principal command.
# rm /tmp/addnew
Since the rcmd principals for node 1 have been defined with the kadmin
command, they were not included in our /tmp/addnew list. Let us now create
these srvtabs by naming the principals as parameters and then have a look at
the directory:
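A sketch of this step for the two node 1 instances (rcmd.sp5n01 and rcmd.sp5n01x):
sp4en0:/tftpboot/srv# ext_srvtab -n sp5n01 sp5n01x
Generating 'sp5n01-new-srvtab'....
Generating 'sp5n01x-new-srvtab'....
sp4en0:/tftpboot/srv# ls
sp5cw0-new-srvtab   sp5n01x-new-srvtab  sp5n05x-new-srvtab
sp5en0-new-srvtab   sp5n01-new-srvtab   sp5n05-new-srvtab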
The result is one new-srvtab file for each adapter. Therefore, we have to
concatenate the srvtab files belonging to one node before we distribute them.
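For node 1, for example, the concatenation could be done like this (the name of the combined file is our choice):
sp4en0:/tftpboot/srv# cat sp5n01-new-srvtab sp5n01x-new-srvtab > sp5n01-srvtab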
All files except the /etc/krb-srvtab files are copies of the kerberos server files.
The setup_authent script that will run to define the secondary server does not
allow you to define an SP node as a secondary server. As soon as it realizes
that you are working on an SP node, the script exits. This works as designed
because usually the nodes are the login hosts for your users and it is not
recommended to place a copy of the database on a "user" machine.
Nevertheless a workaround for that is described in Appendix C, “Secondary
Authentication Server on an SP Node” on page 257.
Our starting point consists of a kerberos realm spread across two SP systems. The
kerberos configuration after the setup of a secondary authentication server
will look like Figure 39 on page 200.
(Figure 39: sp4en0 as primary and sp5en0 as secondary authentication server. Both hold /.k, /etc/krb.conf, /etc/krb.realms, /etc/krb-srvtab, /.klogin, /tmp/tkt0, and the database files under /var/kerberos/database/*, which are replicated from the primary to the secondary server.)
Assuming that the future secondary server is already a client of the kerberos
realm, the following steps are necessary to set up the secondary server:
• Install the kerberos filesets on the future secondary server (if necessary).
• Edit the /etc/krb.conf file and distribute it to all hosts belonging to the
realm.
• Run setup_authent on the secondary authentication server.
• Add a /usr/kerberos/etc/push-kprop entry to the crontab on the primary
server (see the example after this list).
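Such a crontab entry could, for instance, propagate the database once a night (the time chosen here is arbitrary):
0 2 * * * /usr/kerberos/etc/push-kprop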
Ensure that the name resolution for all hosts belonging to the kerberos realm
works properly.
(Figure: the edited /etc/krb.conf file, annotated to show that for this realm the host sp5en0 is a secondary authentication server.)
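The edited file would look roughly like the following sketch (standard krb.conf format: the realm name on the first line, then one line per authentication server, with the primary marked as admin server):
SP4EN0
SP4EN0 sp4en0 admin server
SP4EN0 sp5en0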
To distribute this file to all the nodes of our realm we use the parallel copy
command pcp. Details on the parallel copy command and its possible options
are covered in 8.13, “Working with the WCOLL Variable” on page 204.
Assume the file /tmp/allhosts contains all nodes of both SP systems, plus the
second CWS. First we will set the WCOLL variable to that file, and then copy
the /etc/krb.conf to all nodes in parallel:
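A sketch of the two commands, assuming that pcp (like dsh) uses the working collective defined by WCOLL when no host list is given:
sp4en0:/# export WCOLL=/tmp/allhosts
sp4en0:/# pcp /etc/krb.conf /etc/krb.conf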
sp5en0:/# setup_authent
********************************************************************
ATTENTION
Enter y or n: y
***********************************************************************
Logging into Kerberos as an admin user
[...]
The kpropd and kerberos subsystems have been added and we already
received a read-only replica of the kerberos database. Due to the read-only
functionality on a secondary server, the kadmind subsystem is not installed.
All changes to the database must be, and can only be, done on the primary
authentication server.
The kpropd is the daemon that receives the copy of the kerberos database
from the primary authentication server. Now let us have a look at the contents
of the /var/kerberos/database directory on the secondary server, because
there is a difference from the primary server:
Next we set the WCOLL variable to this file and export it:
# export WCOLL=/tmp/allhosts
It is well known that the dsh command without any option takes the contents of
the file to which the WCOLL variable points:
# dsh date
# dsh '' date
This is how all the other mentioned commands can work with the WCOLL
variable. Let us try with two nodes in the working collective:
HOST: sp2n05
------------
/var 4096 4068 28 1% 338 686 67%
With PSSP 3.1, a new strategy has arisen. The High Availability Infrastructure
that was restricted to the SP Environment before now becomes a Cluster
Technology, spread over a cluster consisting of both SP systems and external
machines. Further, HACMP Enhanced Scalability Version 4.3 moves beyond
partition boundaries and can be used over several partitions and different SP
systems, as well as external workstations.
Kerberos never sends a password unencrypted over the network. That is fine,
but as an SP administrator, you usually do not work directly in the lab but
instead telnet from another machine to your CWS by typing the root
password. If you authenticate to the realm you will be asked for the
root.admin password. During both authentication procedures the password
between your workstation and the CWS is transferred over the network
unencrypted. So, to maintain this security function, it makes sense to
integrate at least this workstation into your SP kerberos realm.
The procedure is nearly the same as merging two SPs into one kerberos
realm. In the following section we will describe the necessary steps required
for the realm integration. For background information concerning the
kerberos-related files, the rcmd principal and the srvtab file on the nodes,
refer to Chapter 8, “Taming Kerberos” on page 157.
In our example we will show the integration of the machine named risc77 that
is connected to the CWS over Token Ring. Figure 41 on page 208 gives an
overview of our scenario.
(Figure 41: the CWS sp2en0 with nodes sp2n01 through sp2n14 on the SP Ethernet and serial link, and the external workstation risc77 attached over Token Ring. The CWS holds /.k and /var/kerberos/database/* with principals such as root.admin, rcmd.sp2en0, rcmd.sp2n01, rcmd.sp2n05, and rcmd.sp2n06; risc77 needs /etc/krb.conf, /etc/krb.realms, /.klogin, and an /etc/krb-srvtab file containing the key for rcmd.risc77.)
Make sure that the name resolution for your external workstation works
properly. We recommend that you include the workstation in the /etc/hosts file
on your CWS and also include the CWS and the node in the /etc/hosts file on
your workstation.
sp2en0:/# kadmin
Welcome to the Kerberos Administration Program, version 2
Type "help" if you need it.
admin: ank rcmd.risc77
Admin password:
Password for rcmd.risc77:
Verifying, please re-enter Password for rcmd.risc77:
rcmd.risc77 added to database.
admin: quit
Cleaning up and exiting.
Repeat this action for each adapter configured in the external workstation.
Just add a principal rcmd.<adapter name> to the database. First you are
prompted for the admin password that is the root.admin password, then you
are prompted for the rcmd.principal’s password. Provide a password. Choose
an arbitrary one and bear in mind that you will never need it again. Why? You
get authenticated as an rcmd principal by issuing rcmdtgt and this command
does not ask you for any password. For details on that please refer to 8.7.6,
“Never-expiring ticket” on page 178.
[Entry Fields]
* Kerberos 5 [no] +
* Kerberos 4 [yes] +
* Standard Aix [yes] +
If you set Kerberos 4 to yes but set Standard AIX and Kerberos 5 to no,
standard remote login (telnet, ftp) is no longer provided. Kerberos 5 requires
DCE version 2.2 or higher.
Including this external workstation to the file that is referenced by the WCOLL
variable (in our example, /tmp/nodes) makes this workstation accessible in
combination with your SP nodes.
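For example (assuming /tmp/nodes is the file behind WCOLL):
sp2en0:/# echo risc77 >> /tmp/nodes
sp2en0:/# export WCOLL=/tmp/nodes
sp2en0:/# dsh date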
Since you cannot use the internal SP Ethernet for the installation of external
clients you have to define a second network served by the NIM master.
Until you migrate to this level, however, there is a workaround. Since only the
client definition is deleted (no network or resource is deleted), it is very easy
to write a script that redefines the non-SP clients after setup_server runs.
Figure 42 on page 214 shows an example configuration and we provide you
with the workaround script.
(Figure 42: the NIM master sp2en0 serves two networks: spnet_en0 over the SP Ethernet, with clients sp2n01, sp2n05, ... sp2n15, and net_tok0 over Token Ring, with the external clients risc77, risc78, and risc79. Resources include mksysb, spot, lppsource, boot, noprompt, prompt, nim_script, and so on.)
Since the external hosts are not part of the SP, we cannot use the helpful
wrappers of setup_server. Instead, we are forced to use the standard NIM
commands. In our example we are going to install the risc77 from the NIM
master sp2en0 over the Token Ring. First let us define the second network.
The SMIT fastpath for NIM administrative tasks is nim_mknet.
1. Type smitty nim_mknet
2. Select the appropriate network interface
[Entry Fields]
* Network Name [sp_tok]
* Network Type tok
* Network IP Address [9.12.1.0]
* Subnetmask [255.255.255.0]
Other Network Type +
Comments []
The fields that are required are shown in bold characters. The Network Name
can be chosen freely; it is the equivalent of spnet_en0.
Up to now only the network itself is defined, so this new network now has to
be defined as an interface served by the NIM master, as follows:
1. Type smitty nim_mkmac_if
2. Select master
3. Enter the appropriate interface
[Entry Fields]
* Host Name of Network Install Interface [sp2cw0]
Select the appropriate network interface (Token Ring in our example). Then
you will see the following screen.
Define a Network
[Entry Fields]
* Network Name [sp_tok]
* Network Type tok
* Network IP Address [9.12.1.0]
* Subnetmask [255.255.255.0]
Other Network Type +
Comments []
The Hardware Address of the interface is needed, and you can get it with the
command:
# lscfg -v -l tok0
Define a Machine
[Entry Fields]
* Host Name of Machine [risc77]
(Primary Network Install Interface)
After entering the hostname of the external NIM client you will see the
following screen.
[Entry Fields]
* NIM Machine Name [risc77]
* Machine Type [standalone] +
* Hardware Platform Type [rs6k] +
Kernel to use for Network Boot [up] +
Primary Network Install Interface
* Ring Speed [16] +
* NIM Network sp_tok
* Host Name risc77
Network Adapter Hardware Address [10005AA8E7A3]
Network Adapter Logical Device Name [tok]
IPL ROM Emulation Device [] +/
CPU Id []
Machine Group [] +
Comments []
In this menu you are not required to fill in the CPU ID of the client. After the
first installation, the NIM master fetches the CPU ID of the client itself. The
following screen shows the client definition.
Now the additional definitions for our external workstations on the NIM master
are completed.
psspscript script
prompt bosinst_data
> noprompt bosinst_data
migrate bosinst_data
> lppsource_aix432 lpp_source
mksysb_1 mksysb
> spot_aix432 spot
> mksysb_risc77 mksysb
--------------------------------------------------------------------------------
Select the needed resources, depending upon your setup. In our example
they are noprompt, lppsource_aix432, spot_aix432 and mksysb_risc77.
Finally, the boot resource has to be allocated, the required directories have to
be exported, and an entry to the /etc/bootptab has to be made for our external
NIM client. This is done by the command:
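One way to reach the following panel (and with it the bos_inst operation) is the NIM machine operations SMIT fastpath, with risc77 selected as the target; the fastpath name is our assumption:
# smitty nim_mac_op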
-----------------------------------------------------------------------------
Operation to Perform
[TOP]
diag = enable a machine to boot a diagnostic image
cust = perform software customization
bos_inst = perform a BOS installation
maint = perform software maintenance
reset = reset an object’s NIM state
fix_query = perform queries on installed fixes
check = check the status of a NIM object
reboot = reboot specified machines
maint_boot = enable a machine to boot in maintenance mode
showlog = display a log in the NIM environment
[MORE...3]
[Entry Fields]
Target Name risc77
Source for BOS Runtime Files mksysb +
installp Flags [-agX]
Fileset Names []
Remain NIM client after install? yes +
Initiate Boot Operation on Client? no +
Set Boot List if Boot not Initiated on Client? no +
Force Unattended Installation Enablement? no +
With these choices, you have to start the client installation manually and the
mksysb image that is allocated will be used. Before starting the installation, the
allocated NIM resources, the /tftpboot/<client_name> file and the /etc/bootptab
entry for the client should be verified.
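A quick check could look like the following sketch (lsnim shows the allocated resources and the NIM state, the other two commands show the bootp entry and the boot file):
# lsnim -l risc77
# grep risc77 /etc/bootptab
# ls -l /tftpboot/risc77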
All entries are correct. The necessary resources are allocated: there is one
entry in the /etc/bootptab file for the client risc77 and the /tftpboot/risc77 file
points to the appropriate uniprocessor boot kernel, indicated as bootfile (bf) in
the bootptab entry.
A second way to initialize the client installation is to use the IPL ROM of the
client machine. It offers a menu of the available interfaces from which you
can choose the one to boot over. If you have already done
the manual node conditioning on an SP node, you will be familiar with this
procedure:
1. Set key to secure.
2. Power on and wait for LED 200.
At this point, the IPL ROM menu will be offered. You can then make the
appropriate choices and start the system boot over the network.
After the installation of the nodes, you might start the Switch interface.
Usually this is done from the CWS with the Estart command, and you hope
that every node will join the switch. However, several things can occur
between issuing Estart and getting the green switch responds: It can fail at
once; or only some of the nodes join the switch; or the primary node is fenced
and when you try to unfence it, you get a message that it cannot be unfenced
because it is the primary node, and so on.
To avoid these situations, there are several verification steps that you can do
before using Estart. This chapter describes these verification steps.
(Figure: switch board cabling detail. For each jack, the switch chip port and the connected node with its switch node number are shown; for example, jacks J3 through J6 on switch chips SW3/SW4 and jacks J31 through J34 connect nodes N9 (SNN8), N10 (SNN9), N13 (SNN12), and N14 (SNN13).)
The rule for setting the primary and primary backup node is as follows: if you
have more than one switch board, spread the two nodes over different switch boards.
In any case, do not define the primary and primary backup on nodes that
share the same switch chip, because if that switch chip fails, the primary backup
node will not be able to become the primary node (see 10.2.2, “The Worm
Daemon” on page 232).
Figure 44 on page 227 shows the same topic, but focuses on the nodes in the
frame.
In our configuration, five of the 16 available switch ports are left because only
nine nodes are connected to the switch board. They could be used, for
example, to connect the SP Switch Router or S70 nodes.
The Eprimary command without options returns the current settings for the
primary and primary backup node. With the old High Performance Switch
(HiPS), the output of the Eprimary command is only one line telling you which
node is currently the primary node. For more information about the
differences between HiPS and SPS, refer to the Redbook RS/6000 SP:
Problem Determination Guide, SG24-4778, Chapter 4.
The oncoming primary and oncoming primary backup node will become the
primary and primary backup node, respectively, after the next Estart. In our
example, the Switch is currently not up, therefore primary and primary backup
are marked as none.
If the Switch were up and running, the output would be the following:
1 - primary
1 - oncoming primary
7 - primary backup
8 - oncoming primary backup
When you set the primary or the primary backup node (or both) to another
node, this only changes the oncoming values stored in the SDR. These
values will be activated during the next Estart.
In a one-frame configuration, there is only one possibility for the switch clock
source setting. The clock input has to be 0, meaning the internal clock of this
switch board is used.
Since this is just the definition of the clock source setting in the SDR, it may
happen that the switch clock signal is missing on the board or on a switch
adapter. In this case, you are not able to start the Switch and you will find an
entry in the error report. To set the clock source again, look for the appropriate
Eclock file fitting to your configuration and issue the following command:
# Eclock -f /etc/SP/Eclock.top.1nsb.0isb.0
This command should only be used when necessary, because it stops the
entire Switch.
(Figure: the parts of Estart processing.)
• On the oncoming primary node: get the topology information from the SDR and distribute it to the other nodes.
• Estart_sw: request the Worm daemon to check the switch (cabling, unfenced nodes, and so on); compute the routes for the primary node; download the route table to the adapter; distribute the topology changes; update the switch_responds values in the SDR. (What has to run: the Worm daemon; the nodes must be unfenced; the topology must be updated.)
• On each unfenced node: compute the routes for this node and download the routes to the adapter.
These three parts have different prerequisites and they also run on different
nodes. To verify that the Estart command will bring the Switch up means
knowing which script is running where and what is needed for successful
execution. In the following section we discuss this in detail.
# Estart
Estart: Oncoming primary != primary, Estart directed to oncoming primary
rshd: Kerberos Authentication Failed: Access denied because of improper credenti
als.
/usr/lpp/ssp/rcmd/bin/rsh: 0041-004 Kerberos rcmd failed: rcmd protocol failure.
trying normal rsh (/usr/bin/rsh)
rshd: 0826-813 Permission is denied.
Estart: 0028-028 Fault service worm not up on oncoming primary node, cannot Est
art : sp4n01.
When a kerberized remote shell command fails, the standard rsh command
will be tried. In other words, if you have a /.rhosts file on the oncoming
primary node allowing the root user from the CWS to set up remote shell
commands, Estart will be successful even if Kerberos does not work. In between
the Kerberos error messages, Estart reports how many nodes have joined
the Switch:
# Estart
Estart: Oncoming primary != primary, Estart directed to oncoming primary
rshd: Kerberos Authentication Failed: Access denied because of improper credenti
als.
/usr/lpp/ssp/rcmd/bin/rsh: 0041-004 Kerberos rcmd failed: rcmd protocol failure.
trying normal rsh (/usr/bin/rsh)
Estart:0028-06 Estart is being issued to the primary node: sp4n01
Switch initialization started on sp4n01.
Initialized 11 node(s).
Switch initialization completed.
rshd: Kerberos Authentication Failed: Access denied because of improper credenti
als.
/usr/lpp/ssp/rcmd/bin/rsh: 0041-004 Kerberos rcmd failed: rcmd protocol failure.
trying normal rsh (/usr/bin/rsh)
Estart finds out which node is the oncoming primary node simply by asking
the SDR on the CWS. Then Estart executes a dsh command running
Estart_sw on the primary node. It is possible to issue Estart on every node
that belongs to the same partition as long as you have a valid kerberos ticket for
this node.
(Figure: the CSS software structure. In user space, the switch commands (Estart, Efence, and so on) and the send/receive library interact with the fault_service daemon, which gets its fs_requests from the fault_service work queue in kernel space, maintained by the CSS device driver on top of the TBx adapter.)
The Worm daemon plays a key role in the coordination of the switch network.
It is a non-concurrent server, and therefore can only service one switch event
to completion before servicing the next. Examples of switch events (or faults)
include switch initialization, Switch Chip error detection/recovery, and node
fencing or unfencing.
The Worm daemon is started from the /etc/inittab by the rc.switch script; it
runs on all nodes. Since the Worm daemon is the same on every node, any
node is able to be or become the primary or primary backup node.
The best command to use for checking that Kerberos is working and the
Worm daemons are running is:
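A sketch of such a check (a successful dsh proves that Kerberos works, and the process list shows on which nodes the fault_service Worm daemon is running):
# dsh -a ps -e | grep Worm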
In case the Worm daemon is not running on a node, you can restart it by
issuing the rc.switch script directly either on the node itself or from the CWS
with a dsh:
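For example (the node name is just an example; rc.switch is assumed in its standard location):
# dsh -w sp2n07 /usr/lpp/ssp/css/rc.switch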
Sometimes it may happen that even after a restart of a node’s Worm daemon,
this node is not able to join the Switch, particularly if you had to set the switch
clock source with the Eclock command. In this case you have to use a more
powerful command to unload the device driver from the switch adapter, then
load it again and start the Worm daemon (so rc.switch is not necessary), as
follows:
Based on this information, each node builds its own device database with the
information on which nodes are currently ready to join the switch. Then, on
each node, the Routing_Table_Generator-part (RTG) of the Worm computes
the switch route table, including all the paths to the other nodes, and
downloads it to the switch adapter.
Finally, the primary node updates the switch_responds class in the SDR and
sets the primary and primary backup values from none to the appropriate
nodes.
Currently the switch is up and running, but nodes 7, 9, and 15 are isolated
(perhaps they are fenced), so logically they have no switch responds. For
example, when you fence a node by typing Efence 8 on the CWS, the primary
node excludes this node from the switch interface, then updates its own
routing table and requests all the other nodes to update their routing tables by
deleting the paths to node 8. Finally, the primary node sets the isolated value
in the SDR for node 8 to 1. The same happens in reverse when
you integrate a node with the Eunfence command.
So when will this value be evaluated? Assume the Switch is up but nodes 10
and 11 are already fenced; now let us fence two more nodes with the autojoin
flag.
# Efence 8 9 -autojoin
This command isolates nodes 8 and 9 from the Switch and sets the isolated
and autojoin values in the SDR for these nodes to 1. The switch responds of
nodes 8 and 9 are off and the values in the SDR look like the following:
spmon -d
[...]
--------------------------------- Frame 1 -------------------------------------
Frame Node Node Host/Switch Key Env Front Panel LCD/LED is
Slot Number Type Power Responds Switch Fail LCD/LED Flashing
-------------------------------------------------------------------------------
1 1 high on yes yes normal no LCDs are blank no
5 5 thin on yes yes normal no LEDs are blank no
6 6 thin on yes yes normal no LEDs are blank no
7 7 thin on yes yes normal no LEDs are blank no
8 8 thin on yes fence normal no LEDs are blank no
9 9 thin on yes fence normal no LEDs are blank no
10 10 thin on yes off normal no LEDs are blank no
11 11 thin on yes off normal no LEDs are blank no
12 12 thin on yes yes normal no LEDs are blank no
13 13 thin on yes yes normal no LEDs are blank no
14 14 thin on yes yes normal no LEDs are blank no
15 15 wide on yes yes normal no LEDs are blank no
This is also true for the primary node: the primary backup node immediately
becomes the primary node, and after finishing the recovery procedure, the
old primary node is set to isolated in the SDR.
By the way... What is the criterion for saying the Switch is up and running? It
is up as soon as the primary and primary backup nodes have initialized the
switch interface. After that, all the other nodes can be integrated by using the
Eunfence command without any Estart.
Before changing the SDR manually, make a backup by using the following
command:
# SDRArchive
# SDRGetObjects switch_responds
node_number switch_responds autojoin isolated adapter_config_status
1 0 0 1 css_ready
5 0 0 1 css_ready
6 0 0 1 css_ready
7 0 0 1 css_ready
8 0 0 1 css_ready
9 0 0 1 css_ready
10 0 0 1 css_ready
11 0 0 1 css_ready
12 0 0 1 css_ready
13 0 0 1 css_ready
14 0 0 1 css_ready
15 0 0 1 css_ready
To get the Switch up, at least the isolated value of primary and primary
backup must be 0 in the SDR. The primary and primary backup are nodes 1
and 6. Figure 48 on page 239 shows the two commands for changing the
isolated values of these nodes.
(Figure 48: the two SDRChangeAttrValues commands; in each command the node_number==<n> part is the selector and isolated=0 is the new setting.)
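Based on the syntax discussed below, the two commands would look roughly like this:
# SDRChangeAttrValues switch_responds node_number==1 isolated=0
# SDRChangeAttrValues switch_responds node_number==6 isolated=0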
# SDRGetObjects switch_responds
node_number switch_responds autojoin isolated adapter_config_status
1 0 0 0 css_ready
5 0 0 1 css_ready
6 0 0 0 css_ready
7 0 0 1 css_ready
8 0 0 1 css_ready
9 0 0 1 css_ready
10 0 0 1 css_ready
11 0 0 1 css_ready
12 0 0 1 css_ready
13 0 0 1 css_ready
14 0 0 1 css_ready
15 0 0 1 css_ready
The preceding commands change the corresponding isolated attribute values in
the SDR class switch_responds, as shown in the output above.
The reason for using SDRArchive before using this SDRChangeAttrValues
command is obvious. If you look at the command syntax for the
SDRChangeAttrValues command, you will notice that the node gets selected
by using node_number==6. Using the double equal sign is very important. If you
are using a single equal sign instead, the isolated values of all nodes will be set
to the new value. If this happens, it is a good time to use your SDR archive
and restore it. Here is an example of how to restore the SDR:
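A sketch of a restore, assuming an archive file created earlier by SDRArchive (the file name shown is just an example):
# SDRRestore backup.99123.1028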
For more information on this subject refer to 7.2.2, “Restoring the SDR” on
page 151.
# vi /etc/SP/Emonitor.cfg
<Shift G>
# example entries (remove # to use
#5 # This will set node 5 to be monitored
#1 # This will set node 1 to be monitored
1
5
6
7
8
9
10
# Estart -m
That means Emonitor will not recover from a "death" of a node’s Worm
daemon because there is no impact on the host_responds value of this node.
Furthermore, Emonitor is not able to restart the Worm daemon.
If a node is under control of Emonitor and this node crashes, when it comes
up again, first the rc.switch script starts the Worm daemon. As soon as the
host_responds for this node turns to 1 (green), Emonitor detects it, waits for a
time interval of 180 seconds, then checks in the SDR if the autojoin flag is set
Example: Efence a node with the autojoin flag and reboot it. The host_responds value turns from 1 to 0 and, after the reboot, back to 1. The switch_responds value turns from 1 to 0 and, after the Worm has started, back to 1 due to autojoin. Emonitor will not react because of the autojoin flag.
While we do not recommend it, if you think this time period is too long for your
system, you can change this value in the /usr/lpp/ssp/bin/Emonitor file by
editing the entry $EstartSTALL=180 to $EstartSTALL=<seconds>. Bear in
mind that this could be changed again after applying PSSP PTFs.
IBM provides a number of mirrored sites on the Internet where you may freely
download AIX-related fixes. While not every AIX-related fix is available, we
are constantly adding to these anonymous FTP servers. Though we do not
guarantee all fixes will be immediately made available, we usually update the
servers within 24 hours of tape distribution.
To download the fixes on these servers you can use either the ftp command,
or a Web browser, or the AIX-exclusive tool called FixDist.
This appendix discusses downloading fixes using the FixDist tool and from
the Web.
Although the location of the Web pages varies from country to country, the
most common one is:
http://service.boulder.ibm.com/support/rs6000
The FixDist tool and the user's guide are located in the anonymous FTP
directory /aix/tools/fixdist, or they can be viewed online with a Web browser.
You can use any hostname from the list in the previous Electronic Fix
Distribution section.
# ftp service.software.ibm.com
> login: anonymous
> password: "email" (example: johndoe@)
> bin
> cd /aix/tools/fixdist
> get fd.tar.Z (FixDist tool in compressed tar format)
> get fixdist.ps.Z (User guide in compressed PostScript)
> quit
Install the tool into the /usr file system. You must install it from the / (root)
directory to access the online help and preserve your .netrc file.
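For example, if the compressed tar file was downloaded to /tmp (the download location is our assumption), the installation could look like this:
# cd /
# zcat /tmp/fd.tar.Z | tar -xvf -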
# fixdist
The file /usr/bin/fixdist is a script that calls /usr/lpp/fixdist/fixdistm for CDE and
AIXwindows users. If you have a dumb terminal, FixDist will call
/usr/lpp/fixdist/fixdistc.
Read the user's guide for detailed configuration information. The basic
configuration tasks are: specifying an IBM server; specifying a location on
your RS/6000 where you want to put the fixes; downloading the fix database
from the IBM server to your RS/6000.
PSSP version 3.1 provides support for integrating the S70 system as a node
into the SP. In this appendix we discuss how to integrate the S70 into the SP.
B.1 Overview
The S70 node in the SP has the following characteristics:
• The node is not physically located in the SP frame. It will be considered as
a non-SP frame node. This frame will be assigned a frame number
containing a single node. Therefore, an S70 frame will occupy 16 node
numbers, of which only one will ever be used.
• The node is connected to the SP administrative network (also referred to
as the SP LAN).
• The node is attached to the switch with a TB3PCI adapter.
• There is no frame or node supervisor card. The S70 node will be
connected to the SP Control Workstation using two serial ports. One tty
port is used for the hardware_protocol, and the second tty port is for
establishing the serial connection using the s1term command.
• The S70 node will contain all the PSSP code, and be managed and used
in all the same ways as standard SP nodes currently are.
• The S70 frame cannot be the first frame; an SP frame must always be the
lowest numbered frame.
B.2 Installation
Installation of the S70 node in SP is the same as any other node, after you set
up the physical connections between the CWS and the S70 system. As far as
the installation steps are concerned, the only difference you will find is when
configuring the frame information. For incorporating the S70 node (which, as
previously mentioned, is considered as a Non-SP frame), the Enter Database
Information SMIT panel has a new option: the Non-SP Frame Information
menu. This menu has to be used for configuring the S70 frames.
Lab Scenario:
We have a single frame SP system with four High nodes and a Control
Workstation. In this lab, we will integrate the S70 system to our existing SP
environment. Figure 51 on page 250 shows the configuration of the SP and
the S70 node.
(Figure 51: the existing single-frame SP with the high nodes k48n05, k48n09, and k48n13 in slots 5, 9, and 13 (plus slot 1), connected to the CWS over the SP Ethernet and /dev/tty0; the S70 node is attached to the CWS over the Ethernet and the serial ports /dev/tty1 and /dev/tty2.)
The steps to integrate the S70 system to the existing SP are as follows:
[Entry Fields]
* Start Frame [2] #
* Frame Count [1] #
* Starting Frame tty port [/dev/tty2]
* Starting Switch Port Number [1] #
s1 tty port [/dev/tty1]
Frame Hardware Protocol [SAMI]
Re-initialize the System Data Repository yes +
[k48s][/]> splstdata -f
List Frame Database Information
[k48s][/]> splstdata -n
List Node Configuration Information
OR
Node Group [] +
OR
Node List []
[k48s][/]> sphrdwrad 2 1 1
Acquiring hardware ethernet address for node 17 from /etc/bootptab.info
Step 11: Configure the initial hostname for the S70 node
Configure the initial hostname for the S70 node using the command
sphostnam. This command sets the hostname to the fully qualified
form of the hostname for the en0 adapter, for frame 2, starting slot 1, and a
node count of 1.
# sphostnam -a en0 -f long 2 1 1
To monitor the messages displayed during the time of installation, start the
tty console for the S70 node by using the command:
# s1term 2 1
Now initiate the network boot on the S70 node and monitor the tty and
LCD display to verify that the AIX and PSSP software is getting installed in
the node.
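The network boot can be initiated from the CWS, for example with the nodecond command for frame 2, slot 1, run in the background:
# nodecond 2 1 &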
# cp /etc/objrepos/CuAt /etc/objrepos/CuAt.bak
Now set up the secondary server with the setup_authent command (see also
8.12.3, “Run setup_authent on the Secondary Authentication Server” on page
202 for more information):
# /usr/lpp/ssp/bin/setup_authent
# odmadd /tmp/node_num_stanza
Information in this book was developed in conjunction with use of the equipment
specified, and is limited in application to those specific hardware and software
products and levels.
IBM may have patents or pending patent applications covering subject matter in
this document. The furnishing of this document does not give you any license to
these patents. You can send license inquiries, in writing, to the IBM Director of
Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact IBM Corporation, Dept.
600A, Mail Drop 1329, Somers, NY 10589 USA.
The information contained in this document has not been submitted to any formal
IBM test and is distributed AS IS. The information about non-IBM ("vendor")
products in this manual has been supplied by the vendor and IBM assumes no
responsibility for its accuracy or completeness. The use of this information or the
implementation of any of these techniques is a customer responsibility and
depends on the customer's ability to evaluate and integrate them into the
customer's operational environment. While each item may have been reviewed by
IBM for accuracy in a specific situation, there is no guarantee that the same or
similar results will be obtained elsewhere. Customers attempting to adapt these
techniques to their own environments do so at their own risk.
Any pointers in this publication to external Web sites are provided for
convenience only and do not in any manner serve as an endorsement of these
Web sites.
Reference to PTF numbers that have not been released through the normal
distribution process does not imply general availability. The purpose of including
these reference numbers is to alert IBM customers to specific information relative
to the implementation of the PTF when it becomes available to each customer
according to the normal IBM PTF distribution process.
C-bus is a trademark of Corollary, Inc. in the United States and/or other countries.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc. in the United States and/or other countries.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of
Microsoft Corporation in the United States and/or other countries.
SET and the SET logo are trademarks owned by SET Secure Electronic
Transaction LLC.
Other company, product, and service names may be trademarks or service marks
of others.
A
abbreviations 265
acronyms 265
administrative ethernet 11
AIX 52, 53, 54, 55, 91, 97
alternate rootvg 91, 93
AMD 59
APAR 5
B
backup file format
  See BFF
Base Operating System
  See BOS
BFF 1
BIS 31, 32
boot image 20, 21
boot modes 100
bootlist 96
bootp_response 79
BOS 1, 13, 18, 21, 24, 37, 43, 61
bundle 4, 6
C
CHRP 21
coexistence 128
collection 4
Common Hardware Reference Platform
  See CHRP
critical fix 7
customize 102
CWS 10, 31, 32, 33, 36, 37, 50, 56, 64, 89, 108
D
daemon 58
delnimclient 38
device support 7
DSMIT 115
H
High Availability 97
I
installation 9, 16, 29, 49
K
kdb_util 182
kerberos 57, 73, 157
  /.k
    See kerberos master key
  /.klogin 160, 164
  /tmp/tkt0 160
  /var/kerberos/database 160
  admin_acl.add 181
  admin_acl.get 181
  admin_acl.mod 181
  authentication 178
  authentication server 159, 160
  authenticator 174
  authorization 178
  change_principal_password 183
  chkp 181
  create_krb_files 168, 189
  daemon 165
  ext_srvtab 189
  hardmon 164
  hmcmds 164
  hmmon 164
  instance 163
  instance admin 181
  kadmin 165, 182
  kadmind 74, 165
  kdb_edit 182, 186
  kdb_util dump 184
  kdb_util load 186
  krb.conf 76, 160
  krb.realms 76, 160, 163
  krb-srvtab 75, 160, 168, 189
  KRBTKFILE 176
  kstash 161
isolated
See fenced
oncoming primary 227
oncoming primary backup 227
primary backup node 225
primary node 225
rc.switch 233
Route Table Generator 232
Secondary node 225
switch clock source 228
switch responds 235
switch route table 234
switch_responds 237
Worm 232, 233
System Data Repository
See SDR
system resource controller 240
T
TB3PCI 249
timezone 72
U
unallnimres 189
V
version 2
virtual battery 91
Virtual Shared Disks
See VSDs
VRMF 2, 5
VSDs 69
W
WEBSM
container objects 44
resources 44
task guides 44
wrapper 31, 35
WSM 41