Solaris 10 Boot Process & Phases

Legacy boot vs SMF:
In earlier versions of Solaris (9 and earlier), the system uses a series of run control scripts (located in the /sbin directory) to start and stop the processes associated with each run level. The init daemon is responsible for starting and stopping the services.
Solaris 10 uses SMF (Service Management Facility), which starts services in parallel based on their dependencies. This allows a faster system boot and minimizes dependency conflicts.

SMF contains:
A service configuration repository
A process restarter
Administrative command-line interface (CLI) utilities
Supporting kernel functionality

These features enable Solaris services to:
1. specify requirements for prerequisite services and system facilities.
2. specify identity and privilege requirements for tasks.
3. specify the configuration settings for each service instance.

Phases of the boot process:
The first boot phase of any system is the hardware and memory test performed by the POST (Power-On Self-Test) instructions.
On SPARC machines this is done by the PROM monitor, and on x86/x64 machines it is done by the BIOS.

On SPARC machines, if no errors are found during POST and the auto-boot? parameter is set to true, the system automatically starts the boot process.
On x86/x64 machines, if no errors are found during POST and the timeout value in the /boot/grub/menu.lst file is set to a positive value, the system automatically starts the boot process.

The boot process is divided into five phases:
Boot PROM Phase
Boot programs Phase
Kernel initialization phase
init phase
svc.startd phase


Note: The first two phases, boot PROM & boot programs, differ between SPARC & x86/x64 systems.

SPARC Boot PROM Phase:
The boot PROM phase on a SPARC system involves the following steps:
1. PROM firmware runs POST
2. PROM displays the system identification banner which includes:
   Model Type
   Keyboard status
   PROM revision number
   Processor type & speed
   Ethernet address
   Host ID
   Available RAM
   NVRAM Serial Number
3. The boot PROM identifies the boot-device PROM parameter.
4. The PROM reads the disk label located at sector 0 of the default boot device.
5. The PROM locates the boot program on the default boot device.
6. The PROM loads the bootblk program into memory.

x86/x64 Boot PROM Phase:
The boot PROM phase on an x86/x64 system involves the following steps:
1. The BIOS ROM runs POST and the BIOS extensions in ROMs, and invokes the software interrupt INT 19h, bootstrap.
2. The handler for the interrupt begins the boot sequence.
3. The processor loads the first sector of the boot device into memory. The first sector on a hard disk contains the master boot block, which holds the master boot (mboot) program and the FDISK table.

SPARC Boot Program Phase:
The boot program phase involves the following steps:
1. The bootblk program loads the secondary boot program, ufsboot, from the boot device into memory.
2. The ufsboot program locates and loads the kernel.

x86/x64 Boot Program Phase:
The boot program phase involves the following steps:
1. The master boot program searches the FDISK table to find the active partition and loads GRUB stage1 from it into memory.
2. If GRUB stage1 is installed on the master boot block, stage2 is loaded directly from the FDISK partition.
3. GRUB stage2 finds the GRUB menu configuration file (/boot/grub/menu.lst) and displays the GRUB menu. This menu offers options to boot from a different partition, a different disk, or the network.
4. GRUB executes the commands from /boot/grub/menu.lst to load an already constructed boot archive.
5. The multiboot program is loaded.
6. The multiboot program collects the core kernel module, connects the important modules from the boot archive, and mounts the root file system on the device.

Kernel initialization phase:
The kernel initialization phase involves the following steps:
1. The kernel reads the /etc/system configuration file.
2. The kernel initializes itself and uses the ufsboot program to load modules. When sufficient modules are loaded, the kernel mounts the / file system and unmaps the ufsboot program.
3. The kernel starts the /etc/init daemon.

Note: The kernel's core is divided into two pieces of static code: genunix and unix. The genunix file is the platform-independent generic kernel file, and the unix file is the platform-specific kernel file.

init phase:
This phase begins when the kernel starts the init daemon, which in turn starts the svc.startd daemon that starts and stops services when requested. The init daemon uses the information in the /etc/inittab file. Each entry in the inittab file has four colon-separated fields:
id: A two-character identifier for the entry.
rstate: The run levels to which the entry applies.
action: Defines how the process listed in the process field is to be run.
process: Defines the command to execute.
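
The field layout can be illustrated with a small sketch that splits one entry on its colons. The function name parse_inittab_entry and the sample entries are assumptions chosen for illustration; check your own /etc/inittab for the real entries.

```shell
#!/bin/sh
# Split one inittab entry of the form id:rstate:action:process into its fields.
parse_inittab_entry() {
    # IFS=: applies only to this read; the last variable receives the
    # remainder of the line, so colons inside the command are preserved.
    IFS=: read -r id rstate action process <<EOF
$1
EOF
    printf 'id=%s rstate=%s action=%s process=%s\n' "$id" "$rstate" "$action" "$process"
}

# Illustrative entries (an empty rstate field means "all run levels")
parse_inittab_entry 'smf::sysinit:/lib/svc/bin/svc.startd'
parse_inittab_entry 'p3:s1234:powerfail:/usr/sbin/shutdown -y -i5 -g0'
```

Note that an empty rstate field is kept as an empty string rather than being skipped, because the colon is not a whitespace separator.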

svc.startd phase:
It is the master of all services and is started automatically during start up. It starts, stops & restarts all services. It also takes care of all dependencies for each service.

The /etc/system file:

It enables the user to modify the kernel configuration, including the modules and parameters that are loaded during the system boot.


Legacy Run Levels

Run levels: A run level is a state of the system that defines which services and resources are available. There are 8 run levels:


Run level: Description
0: The system is running the PROM monitor.
s or S: Single-user mode with the critical file systems mounted and accessible.
1: Single-user administrative mode with access to all available file systems.
2: The system supports multiuser operations. All system daemons are running except the Network File System (NFS) server and some other network resource server daemons.
3: The system supports multiuser operations. All system daemons are running, including NFS resource sharing and the other network resource servers.
4: Not currently implemented.
5: A transitional run level in which the OS is shut down and the system is powered off.
6: A transitional run level in which the OS is shut down and the system reboots to the default run level.
                                           
Determining the system's current run level:

#who -r
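
As a rough sketch, the run level can be pulled out of the who -r output in a script. The helper name current_run_level is hypothetical, and the sample line only assumes the usual layout in which the value follows the word run-level.

```shell
#!/bin/sh
# Print the run level (the field after "run-level") from `who -r` output on stdin.
current_run_level() {
    awk '$2 == "run-level" { print $3 }'
}

# Illustrative sample line; on a live system use:  who -r | current_run_level
printf '   .       run-level 3  May 31 10:15     3      0  S\n' | current_run_level
```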

Changing the current run level using init command:
init s: Single user mode
init 1: Maintenance mode
init 2: Multi-user mode
init 3: Multi-user server mode
init 4: Not implemented
init 5: Shutdown/power off
init 6: Shutdown & reboot
init 0: Shutdown to the OBP (ok prompt)

init s: When the machine is brought to single-user mode, all user logins, terminal logins, and services are disabled, and only the critical file systems are mounted. The server is usually brought to single-user mode for troubleshooting.

init 1: When the server is brought to maintenance mode, existing user logins stay active while terminal logins are disconnected, and new user and terminal logins are both refused. File systems are mounted but all services are disabled.

init 2: The run level in which all user logins, terminal logins, file systems, and services are enabled except the NFS (Network File System) service.

init 3: The default run level in Solaris. In this run level all user logins, terminal logins, file systems, and services are enabled, including NFS.

Note: In Solaris 9 the default run level can be changed by editing the /etc/inittab file. From Solaris 10 onward this is no longer possible, because the startup driven by this file is under the control of SMF.


The /sbin directory:
This directory:
1. Contains a script associated with each run level.
2. Contains some scripts that are hard linked to each other.
3. Holds the scripts that the svc.startd daemon executes to set up variables, test conditions, and call other scripts.

To display the hard links for the rc (run control) scripts:
#ls -li /sbin/rc*
These scripts are also present under the /etc directory for backward compatibility, as symbolic links to the scripts under the /sbin directory. To see these links, use the following command:
#ls -l /etc/rc?

Functions of /sbin/rcn scripts:


/sbin/rc0: Stops system services and daemons by running the /etc/rc0.d/K* scripts and then the /etc/rc0.d/S* scripts. The S* scripts here should only perform fast cleanup functions.
/sbin/rc1: Stops system services and daemons, terminates running application processes, and unmounts all remote file systems by running the /etc/rc1.d/S* scripts.
/sbin/rc2: Starts certain application daemons by running the /etc/rc2.d/K* and /etc/rc2.d/S* scripts.
/sbin/rc3: Starts certain application daemons by running the /etc/rc3.d/K* and /etc/rc3.d/S* scripts.
/sbin/rc5 & /sbin/rc6: Stop system services and daemons, and start scripts that perform fast system cleanup functions, by running the /etc/rc0.d/K* scripts first and then the /etc/rc0.d/S* scripts.
/sbin/rcS: Establishes a minimal network and brings the system to run level S by running the /etc/rcS.d scripts.



Start Run Control Scripts:
1. The start scripts in the /etc/rc#.d directories run in the sequence displayed by the ls command.
2. Files that start with the letter S are used to start a system process.
3. These scripts are called by the appropriate rc# script in the /sbin directory, which passes the argument 'start' to them unless their names end in .sh; scripts ending in .sh do not take any arguments. They are generally named S##name-of-script.
4. To start a script: #/etc/rc3.d/<script name> start

Stop Run control scripts:

1. The stop/kill scripts in the /etc/rc#.d directories run in the sequence displayed by the ls command.
2. Files that start with the letter K are used to stop a system process.
3. These scripts are called by the appropriate rc# script in the /sbin directory, which passes the argument 'stop' to them unless their names end in .sh; scripts ending in .sh do not take any arguments. They are generally named K##name-of-script.
4. To stop/kill a script: #/etc/rc3.d/<script name> stop
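
The ordering rules above can be sketched as a dry run that only prints what would be executed for one rc#.d directory. This follows the description in this section, not the actual contents of the /sbin/rc# scripts; the function name rc_dry_run and the demo file names are made up for illustration.

```shell
#!/bin/sh
# Dry run: print the commands an rc script would issue for one rc#.d directory.
# K* scripts are handled first (stop), then S* scripts (start), each in glob
# (lexical) order, which matches the order ls displays them in.
rc_dry_run() {
    dir=$1
    for f in "$dir"/K* "$dir"/S*; do
        [ -f "$f" ] || continue           # skip patterns that matched nothing
        case $f in
            *.sh)      echo ". $f" ;;     # .sh scripts take no argument
            "$dir"/K*) echo "$f stop" ;;
            *)         echo "$f start" ;;
        esac
    done
}

# Demonstration on a throwaway directory with hypothetical script names
demo=$(mktemp -d)
touch "$demo/K20foo" "$demo/S05env.sh" "$demo/S10bar"
rc_dry_run "$demo"
rm -rf "$demo"
```

Because the two-digit number is part of the file name, choosing it is how a script is placed earlier or later in the sequence.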

The /etc/init.d directory:
This directory also contains rc scripts. These scripts can be used to start/stop services without changing the run levels.
#/etc/init.d/mysql start
#/etc/init.d/mysql stop

Adding a script in /etc/init.d directory to start/stop a service:
Services that are not managed by SMF can be started and stopped with rc scripts, added as follows:
1. Create the script: 
#cat > /etc/init.d/mysql
#chmod 744  /etc/init.d/mysql
#chgrp sys  /etc/init.d/mysql
2. Create hard links in the required /etc/rc#.d directory
#ln /etc/init.d/mysql /etc/rc2.d/S90mysql
#ln /etc/init.d/mysql /etc/rc2.d/K90mysql
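
The content of such a script is not shown above. A minimal skeleton, assuming the service only needs start and stop actions, might look like the following; the echo lines are placeholders and rc_main is a hypothetical name, not part of any Solaris tool.

```shell
#!/bin/sh
# Skeleton for a legacy rc script such as /etc/init.d/mysql.
# The echo lines are placeholders; replace them with the real
# start and stop commands of the service.
rc_main() {
    case "$1" in
        start) echo "starting service" ;;
        stop)  echo "stopping service" ;;
        *)     echo "Usage: $0 {start|stop}" >&2
               return 1 ;;
    esac
}

# Demonstration call; in the installed script this line would be: rc_main "$1"
rc_main start
```

The same file then serves both the S link and the K link, since the rc# script decides which argument to pass.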


SMF(Service Management Facility):
SMF simplifies the management of system services. It provides a centralized configuration structure to help manage services & the interaction between them. Following are a few features of SMF:
1. Establishes dependency relationships between the system services.
2. Provides a structured mechanism for Fault Management of system services.
3. Provides information about startup behavior and service status.
4. Provides information related to starting, stopping & restarting a service.
5. Identifies the reasons for misconfigured services.
6. Creates individual log files for each service.

Service Identifier:
1. Each service within SMF is referred by an identifier called Service Identifier. 
2. This service identifier is in the form of a Fault Management Resource Identifier(FMRI), which indicates the service or category type, along with the service name & instance. 
Example:
The FMRI for the rlogin service is svc:/network/login:rlogin 
network/login: identifies the service 
rlogin: identifies the service instance
svc: The prefix svc indicates that the service is managed by SMF.
Legacy init.d scripts are also represented with FMRIs that start with lrc instead of svc.
Example: 
lrc:/etc/rc2_d/S47pppd
The legacy service's initial start times during system boot are displayed by using the svcs command. However, you cannot administer these services by using SMF.
3. Each service instance within SMF is in one of the following states:
degraded: The service instance is enabled, but is running at a limited capacity.
disabled: The service instance is not enabled and is not running.
legacy_run: The legacy service is not managed by SMF, but the service can be observed. This state is only used by legacy services.
maintenance: The service instance has encountered an error that must be resolved by the administrator.
offline: The service instance is enabled, but the service is not yet running or available to run.
online: The service instance is enabled and has successfully started.
uninitialized: This state is the initial state for all services before their configuration has been read.
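
To make the FMRI structure described above concrete, here is a small sketch that splits an FMRI into its prefix, service, and instance using plain parameter expansion. The function name parse_fmri is an assumption for illustration; SMF itself provides no such command.

```shell
#!/bin/sh
# Split an FMRI of the form scheme:/service[:instance] into its parts
# using POSIX parameter expansion.
parse_fmri() {
    fmri=$1
    scheme=${fmri%%:*}        # text before the first colon, e.g. svc or lrc
    rest=${fmri#*:/}          # drop the leading "scheme:/"
    case $rest in
        *:*) service=${rest%:*}; instance=${rest##*:} ;;
        *)   service=$rest;      instance= ;;
    esac
    printf 'scheme=%s service=%s instance=%s\n' "$scheme" "$service" "$instance"
}

parse_fmri svc:/network/login:rlogin
parse_fmri lrc:/etc/rc2_d/S47pppd
```

For the legacy lrc example there is no instance part, so the instance field is left empty.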

Listing Service Information:
The svcs command is used to list the information about a service.
Example:
# svcs svc:/network/http:cswapache2
  STATE    STIME  FMRI
  disabled May_31 svc:/network/http:cswapache2


STATE: The state of service.    
STIME: Service's start/stop date & time.  
FMRI: FMRI of the service. 

#svcs -a
The above command provides status of all the services.
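
Since the listing has three columns, it is easy to filter. The sketch below prints the FMRIs of services in a chosen state; services_in_state is a hypothetical helper and the sample rows are illustrative, reusing FMRIs mentioned earlier in this article.

```shell
#!/bin/sh
# Print the FMRI of every service in a given state, reading `svcs -a`
# style output (columns: STATE STIME FMRI) on stdin.
services_in_state() {
    awk -v want="$1" 'NR > 1 && $1 == want { print $3 }'
}

# Illustrative sample input; on a live system use:
#   svcs -a | services_in_state maintenance
services_in_state disabled <<'EOF'
STATE          STIME    FMRI
legacy_run     May_31   lrc:/etc/rc2_d/S47pppd
disabled       May_31   svc:/network/http:cswapache2
online         May_31   svc:/network/login:rlogin
EOF
```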

SMF Milestones:
SMF milestones are services that aggregate multiple service dependencies and describe a specific state of system readiness on which other services can depend. Administrators can list the defined milestones by using the svcs command.




With milestones you can group certain services. Thus you don't have to list each service when configuring dependencies; you can use a matching milestone containing all the needed services.

Furthermore you can force the system to boot to a certain milestone. For example: Booting a system into the single user mode is implemented by defining a single user milestone. When booting into single user mode, the system just starts the services of this milestone.

The milestone itself is implemented as a special kind of service. It's an anchor point for dependencies and a simplification for the admin. 


Types of the milestones:
single-user
multi-user
multi-user-server
network
name-services
sysconfig
devices



SMF Dependencies:
Dependencies define the relationships between services. These relationships provide precise fault containment by restarting only those services that are directly affected by a fault, rather than restarting all of the services. The dependencies can be services or file systems.

The SMF dependencies refer to the milestones & requirements needed to reach various levels.

The svc.startd daemon:
1. It maintains system services & ensures that the system boots to the milestone specified at boot time.
2. It chooses the built-in milestone "all" if no milestone is specified at boot time. At present, five milestones can be used at boot time:
none
single-user
multi-user
multi-user-server
all

To boot the system to a specific milestone, use the following command at the OBP:
ok> boot -m milestone=single-user

3. It ensures the proper running, starting & restarting of system services.
4. It retrieves information about services from the repository.
5. It starts the processes for the run level attained.
6. It identifies the required milestone and processes the manifests in the /var/svc/manifest directory.


Service Configuration Repository:

The service configuration repository:
1. Stores persistent configuration information as well as SMF runtime data for services.
2. Is distributed among local memory and local files.
3. Can only be manipulated or queried by using SMF interfaces.

The svccfg command offers a raw view of properties, and is precise about whether the properties are set on the service or the instance. If you view a service by using the svccfg command, you cannot see instance properties. If you view the instance instead, you cannot see service properties.

The svcprop command offers a composed view of the instance, where both instance properties and service properties are combined into a single property namespace. When service instances are started, the composed view of their properties is used.

All SMF configuration changes can be logged by using the Oracle Solaris auditing framework. 




SMF Repository Backups:

SMF automatically takes the following backups of the repository:
The boot backup: Taken immediately before the first change to the repository is made during each system startup.
The manifest_import backup: Taken after svc:/system/early-manifest-import:default or svc:/system/manifest-import:default completes, if the service imported any new manifests or ran any upgrade scripts.

Four backups of each type are maintained by the system. The system deletes the oldest backup when necessary. The backups are stored as /etc/svc/repository-type-YYYYMMDD_HHMMSS, where YYYYMMDD (year, month, day) and HHMMSS (hour, minute, second) are the date and time when the backup was taken. Note that the hour format is based on a 24-hour clock.
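
The naming scheme can be decoded mechanically. The following sketch takes a backup file name and prints its type and timestamp; describe_backup is a hypothetical helper and the file name in the demonstration is invented.

```shell
#!/bin/sh
# Decode a repository backup name of the form repository-TYPE-YYYYMMDD_HHMMSS.
describe_backup() {
    name=${1##*/}                 # strip any directory part
    stamp=${name##*-}             # YYYYMMDD_HHMMSS
    type=${name#repository-}
    type=${type%-*}               # boot or manifest_import
    d=${stamp%_*}                 # date digits
    t=${stamp#*_}                 # time digits
    # cut -c selects fixed character positions within the date and time
    printf '%s backup taken %s-%s-%s %s:%s:%s\n' "$type" \
        "$(echo "$d" | cut -c1-4)" "$(echo "$d" | cut -c5-6)" "$(echo "$d" | cut -c7-8)" \
        "$(echo "$t" | cut -c1-2)" "$(echo "$t" | cut -c3-4)" "$(echo "$t" | cut -c5-6)"
}

# Hypothetical file name used only for illustration
describe_backup /etc/svc/repository-boot-20240531_142530
```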

You can restore the repository from these backups by using the /lib/svc/bin/restore_repository command.

SMF Snapshots:
The data in the service configuration repository includes snapshots, as well as a configuration that can be edited. Data about each service instance is stored in the snapshots. The standard snapshots are as follows:
initial – Taken on the first import of the manifest
running – Taken when svcadm refresh is run.
start – Taken at the last successful start

The SMF service always executes with the running snapshot. This snapshot is automatically created if it does not exist.

The svccfg command is used to change current property values. Those values become visible to the service when the svcadm refresh command is run to integrate them into the running snapshot. The svccfg command can also be used to view or revert to instance configurations in another snapshot.



svcs command:
1. Listing service:
#svcs <service name>/<Service FMRI>
2. Listing service dependencies:
a. svcs -d <service name>/<Service FMRI>: Displays services on which named service depends.
b. svcs -D <service name>/<Service FMRI>: Displays services that depend on the named service.
3. svcs -x FMRI: Determining why services are not running.
 

svcadm command:
The svcadm command is used to change the state of a service (enable, disable, clear).
Example:
#svcadm enable <FMRI>
#svcadm disable <FMRI>
Other uses of the svcadm command:

1. svcadm clear FMRI: Clears faults for FMRI.
2. svcadm refresh FMRI: Forces FMRI to reread its configuration.
3. svcadm restart FMRI: Restarts FMRI.

4. svcadm -v milestone -d <milestone name>:default : Specifies the milestone that the svc.startd daemon achieves on system boot.

Creating new service scripts:
1. Determine the process to start & stop the service.
2. Specify the name & category of the service.
3. Determine if the service runs multiple instances.
4. Identify the dependency relationships between this service & other services.
5. Create a script to start & stop the process and save it in /usr/local/svc/method/<my service>.

#chmod 755 /usr/local/svc/method/<my service>
6. Create a service manifest file in XML and save it in:

/var/svc/manifest/site/<my service>.xml
Incorporate the script into SMF by using the svccfg utility:
#svccfg import /var/svc/manifest/site/<my service>.xml


Manipulating Legacy Services Not Managed by SMF:
Legacy services that are not managed by SMF can be listed with the svcs command, but they are controlled through their scripts, which are stored in the /etc/init.d directory.
#svcs | grep legacy
#ls /etc/init.d/mysql
/etc/init.d/mysql
#/etc/init.d/mysql start
#/etc/init.d/mysql stop



Commands for booting system:
 

Stop :     Bypass POST.
Stop + A : Abort.
Stop + D : Enter diagnostic mode. Enter this command if your system bypasses POST by default and you don't want it to.
Stop + N : Reset NVRAM content to default values.


Note: The above commands are applicable for SPARC systems only.
 

Performing system shutdown and reboot in Solaris 10:
Two commands are used to perform a shutdown in Solaris 10: init and shutdown.

The shutdown command is preferred because it notifies logged-in users and the systems that use resources mounted from the server.

Syntax:
/usr/sbin/shutdown [-i<initState>] [-g<gracePeriod>] [-y] [<message>]

-y: Pre-answers the confirmation question so that the command continues without asking for your intervention.
-g<gracePeriod>: Specifies the number of seconds before the shutdown begins. The default value is 60.
-i<initState>: Specifies the run level to which the system will be shut down. The default is the single-user level, S.
<message>: Specifies the message to be appended to the standard warning message. If the <message> contains multiple words, it should be enclosed in single or double quotes.

Examples:
#shutdown -i0 -g120 "!!!! System Maintenance is going to happen, plz save your work ASAP!!!"

If the -y option is used in the command, you will not be prompted to confirm.
If you are asked for confirmation, type y.
Do you want to continue? (y or n): y
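
To show how the options and their defaults combine, here is a dry-run sketch that only prints the command line it would use. The function build_shutdown and its argument order are assumptions for illustration; it does not call shutdown itself.

```shell
#!/bin/sh
# Dry run: print the shutdown command line for the given settings,
# filling in the documented defaults (init state S, grace period 60).
build_shutdown() {
    state=${1:-S}
    grace=${2:-60}
    message=$3
    if [ -n "$message" ]; then
        printf '/usr/sbin/shutdown -y -i%s -g%s "%s"\n' "$state" "$grace" "$message"
    else
        printf '/usr/sbin/shutdown -y -i%s -g%s\n' "$state" "$grace"
    fi
}

build_shutdown 5 300 "System going down in 5 minutes."
build_shutdown
```

The second call shows what a bare shutdown amounts to once the defaults are spelled out.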


#shutdown : Shuts down the system to single-user mode.
#shutdown -i0: Stops the Solaris OS and displays the ok prompt or the "Press any key to reboot" prompt.
#shutdown -i5: Shuts down the system and automatically powers it off.
#shutdown -i6: Reboots the system to the state or run level defined in /etc/inittab.

Note:
Run levels 0 and 5 are states reserved for shutting the system down. Run level 6 reboots the system. Run level 2 is available as a multiuser operating state.



Note: The shutdown command invokes the init daemon and executes the rc0 kill scripts to shut down the system properly.


Some shutdown scenarios and commands to be used:
1. Bring down the server for anticipated outage:
shutdown -i5 -g300 -y "System going down in 5 minutes." 

2. You have changed kernel parameters and want to apply those changes:
shutdown -i6 -y

3. Shutdown stand alone server:
init 0

Ungraceful shutdown: These commands should be used with extreme caution, and only when you are left with no other option.
#halt
#poweroff

#reboot
Unlike the init command, these commands do not run the rc0 kill scripts. Unlike the shutdown command, they do not warn logged-in users about the shutdown.

 
