Project:Infrastructure/Developer Machines/ia64

From Gentoo Wiki

ia64 Admin Notes

These notes are mainly aimed at people administering Gentoo dev machines, although most of them are probably useful more generally. They are not general "how do I administer a Gentoo box" notes.

Machine-specific Notes

dolphin

Host: dolphin.ia64.dev.gentoo.org

HP RX2600 with a CD writer. Donated by HP in 2003. This machine has been powered off for two years to save power and cooling resources.

iLO is accessible on port 1 of console2. I used to reach it with ssh armin76:dolphin-iLO@console2.gentoo.osuosl.org, but console2 no longer seems to answer.

ttyS0 is accessible on port 2 of console2: ssh armin76:dolphin-ttyS0@console2.gentoo.osuosl.org

This machine had 4GB of RAM, 2x900MHz processors, 1x36GB HDD SCSI 80pin, 2x72GB HDD SCSI 80pin. No RAID.

This machine should still be in Gentoo's rack at OSL, on top of bender. It does not have rails.

beluga

HP RX2620, CD/DVD reader only. Donated by HP in 2012; before that it was in HP's datacenter. It is stored at OSL but not in Gentoo's rack. It was sent as-is from HP, so the iLO is probably configured with the wrong parameters, and the OS will have a static IP that is wrong as well. I think it had a hardware RAID5 built from 72GB HDDs; I cannot remember how many, probably 4 or 5. It had 2x 1.6GHz processors and 12GB of RAM.

It was kept in storage in case guppy failed and we had no other option.

guppy

HP RX3600 with a DVD/CD writer, IIRC. It used to be in HP's DC but was sent to OSL when HP pulled the plug on that DC. iLO is accessible on port 5 of console2. Once logged in, you can access the remote console too.

Admin notes

Hostnames

These are the systems currently available. See the machine-specific notes for more details.

Machine Name  IP               DNS Hostnames              Console Server  Console Account
guppy         140.211.166.179  guppy.ia64.dev.gentoo.org  ??              ??

Console Access

iLO2 is accessible over telnet and SSH from the dev.gentoo.org box (SSH needs some legacy ciphers enabled). Ask infra@ for the credentials and IP address.

You can use this to:

  • Interact with the EFI (e.g. to select recovery kernel, boot from plugged Gentoo DVD, change boot order)
  • Log in directly over ttyS1 to recover
  • Reboot machine
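
The note about legacy ciphers can be sketched as follows. This is only an illustration: ilo_ssh_cmd is a hypothetical helper, and which algorithms the iLO2 firmware actually wants is an assumption. If ssh reports "no matching ..." for a different algorithm class, adjust the matching option. The helper prints the command instead of running it, and the user and host are placeholders for the values infra@ provides.

```shell
# Hypothetical helper: print (not run) an ssh command line that re-enables
# some legacy algorithms for a single connection.
# Assumption: the algorithm names below are a guess at what iLO2 needs.
ilo_ssh_cmd() {
    # $1 = iLO account, $2 = iLO address (both obtained from infra@)
    printf 'ssh -o KexAlgorithms=+diffie-hellman-group1-sha1 -o HostKeyAlgorithms=+ssh-rsa -o Ciphers=+aes128-cbc,3des-cbc %s@%s\n' "$1" "$2"
}

# Example with placeholder values:
ilo_ssh_cmd ILO_USER ILO_HOST
```

Passing the options with -o keeps the weaker algorithms limited to this one connection rather than enabling them globally.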

Hardware notes

List devices over MP console as: 'CM' > 'DF'

PSU status

PSU status can be checked over MP console as: 'CM' > 'PS':

   Power supplies                State
   -----------------------------------
   Power Supply 0                Fault
   Power Supply 1                Normal

Here we see that PSU-0 needs to be swapped. Tracked at bug #671420.
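
If the MP console session is captured to a file, the check above can be automated. The following is a minimal sketch, assuming the 'PS' output has the same layout as shown here; psu_faults is a hypothetical helper name, not part of any iLO tooling.

```shell
# Print every power supply whose state is anything other than "Normal".
# Reads captured 'CM' > 'PS' output on stdin.
psu_faults() {
    awk '
        { sub(/\r$/, "") }   # drop CR from telnet/ssh session captures
        /^ *Power Supply [0-9]+ / && $NF != "Normal" { print $1, $2, $3 }
    '
}

# Example on the output shown above:
psu_faults <<'EOF'
   Power supplies                State
   -----------------------------------
   Power Supply 0                Fault
   Power Supply 1                Normal
EOF
```

For the sample above this prints "Power Supply 0".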

HDD status

The disk array needs to be checked from the operating system:

root # cciss_vol_status -V /dev/cciss/c0d0
Controller: Smart Array P600
  Board ID: 0x3225103c
  Logical drives: 0
  Running firmware: 1.52
  ROM firmware: 1.52
/dev/cciss/c0d0: (Smart Array P600) RAID 5 Volume 0 status: Using interim recovery mode. 
  Failed drives:
         connector 1I box 1 bay 6                 HP      DH072ABAA6                           3PD0YA8B00009816N8B5     HPD4

    Total of 1 failed physical drives detected on this logical drive.
  Physical drives: 7
         connector 1I box 1 bay 8                 HP      DG072A8B54                           3LB0RFWF00007703FJ9Y     HPD7 OK
         connector 1I box 1 bay 7                 HP      DG072A9BB7                               B365P6A072YP0641     HPD0 OK
         connector 1I box 1 bay 5                 HP      DG072A9BB7                               B365P6A074CF0641     HPD0 OK
         connector 2I box 1 bay 4                 HP      DG072A9BB7                               B365P6A073U40641     HPD0 OK
         connector 2I box 1 bay 3                 HP      DG072A9BB7                               B365P6A073KC0641     HPD0 OK
         connector 2I box 1 bay 2                 HP      DG072A9BB7                               B365P6904NHC0635     HPD0 OK
         connector 2I box 1 bay 1                 HP      DG072A9BB7                               B365P6A072RM0641     HPD0 OK
/dev/cciss/c0d0(Smart Array P600:0): Non-Volatile Cache status:
                   Cache configured: Yes
                 Total cache memory: 224 MiB
                        Cache Ratio: 50% Read / 50% Write
                  Read cache memory: 112 MiB
                 Write cache memory: 112 MiB
                Write cache enabled: No
   Write cache temporarily disabled
           Temporary disable condition. Posted write operations have
been disabled due to the fact that less than 75% of the
battery packs are at the sufficient voltage level.

Here we see that HDD-6 needs to be swapped. Tracked at bug #671420.

Batteries are also dead. I'm not sure how many batteries there are: one per controller or one per SAS I/O card. TODO: find out how to check those as well.
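
To pull only the failed drives out of that report, something like the following could be used. It is a sketch that assumes the "Failed drives:" section looks like the output above; failed_drives is a hypothetical helper name.

```shell
# Print the location (connector/box/bay) of each drive listed under
# "Failed drives:" in cciss_vol_status -V output read from stdin.
failed_drives() {
    awk '
        /Failed drives:/         { in_failed = 1; next }
        in_failed && /connector/ { print $1, $2, $3, $4, $5, $6; next }
                                 { in_failed = 0 }   # any other line ends the section
    '
}

# Usage on the live system (not executed here):
#   cciss_vol_status -V /dev/cciss/c0d0 | failed_drives
```

For the report above this would print "connector 1I box 1 bay 6".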

Common iLO commands

  • Get remote console output (ttyS1): CO
  • Get interactive console (to login and recover system on ttyS1): CO Ctrl-E f c
  • Reboot main machine: RS
  • Manage iLO users: UC
  • Get builtin help: HE

Kernel Management

ia64 systems are EFI systems. guppy uses the elilo tool from the sys-boot/elilo package. Things to remember:

  • Make updates to /etc/elilo.conf.
  • The elilo command copies kernels and config from /boot to /EFI.
  • Run elilo whenever /etc/elilo.conf is changed or kernels referred to by the config file are updated (failure to do so could break booting).
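
Since a stale or wrong image= path can break booting, a quick sanity check before running elilo may help. The helpers below are hypothetical (not part of sys-boot/elilo) and assume the simple image=PATH syntax used in the sample config.

```shell
# List the kernel paths referenced by image= lines in an elilo config.
elilo_images() {
    sed -n 's/^[[:space:]]*image[[:space:]]*=[[:space:]]*//p' "${1:-/etc/elilo.conf}"
}

# Report every referenced kernel that does not exist on disk.
elilo_missing_images() {
    elilo_images "$1" | while IFS= read -r img; do
        [ -e "$img" ] || printf 'missing: %s\n' "$img"
    done
}

# Usage on the live system (not executed here):
#   elilo_missing_images /etc/elilo.conf   # expect no output
#   elilo                                  # then copy kernels and config
```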

Sample Config Files

/etc/elilo.conf

boot=/dev/cciss/c0d0p1
install=/usr/lib/elilo/elilo.efi
delay=50
timeout=80
default=gentoo
prompt

image=/boot/vmlinuz-4.9.72-gentoo
        label=gentoo
        root=/dev/cciss!c0d0p3
        read-only
        append="console=ttyS1,115200n8"

image=/boot/kernel-3.14.14-gentoo
        label=gentoo-3.14.14
        root=/dev/cciss!c0d0p3
        read-only
        append="console=ttyS1,115200n8"

Recovery notes

The iLO (CM > CO) serial console runs on ttyS1; ttyS0 is wired to the physical(?) console.

Console is configured in EFI as P Serial Acpi(HWP0002,PNP0A03,0)/Pci(1|2) Vt100+ 115200.

EFI shell

In the interactive EFI boot menu, pick EFI Shell [Built-in] and run the DVD kernel:

# inspect cdrom
fs0:\> ls fs0:\efi\boot
Directory of: fs0:\efi\boot

  09/27/09  08:42p <DIR>          2,048  .
  09/27/09  08:42p <DIR>          2,048  ..
  09/27/09  08:42p                  698  elilo.conf
  09/27/09  08:42p            7,020,793  gentoo
  09/27/09  08:42p              374,212  bootia64.efi
  09/27/09  08:42p            6,092,363  gentoo.igz
  09/27/09  08:42p                  380  elilo.msg

# run kernel with custom arguments (cdrom's defaults are not very suitable)
fs0:\> fs0:\efi\boot\bootia64.efi -i gentoo.igz gentoo initrd=gentoo.igz root=/dev/ram0 init=/linuxrc dokeymap looptype=squashfs loop=/image.squashfs cdroot console=ttyS1,115200n8
...
livecd ~ # uname -r
2.6.30-gentoo-r6

Alternatively, you can boot directly from the HDD if you need non-standard arguments:

Shell> fs1:\EFI\gentoo\elilo.efi boot\vmlinuz-4.9.72-gentoo root=/dev/cciss!c0d0p3

ELILO shell

Type TAB at ELILO boot: prompt to interrupt boot process.

TODO: actual syntax to load initrd

Config snippets

Config snippets on plugged Gentoo-2009 cdrom:

/etc/inittab
...
# TERMINALS
c1:12345:respawn:/sbin/agetty 38400 tty1 linux
c2:2345:respawn:/sbin/agetty 38400 tty2 linux
c3:2345:respawn:/sbin/agetty 38400 tty3 linux
c4:2345:respawn:/sbin/agetty 38400 tty4 linux
c5:2345:respawn:/sbin/agetty 38400 tty5 linux
c6:2345:respawn:/sbin/agetty 38400 tty6 linux

# SERIAL CONSOLES
#s0:12345:respawn:/sbin/agetty 9600 ttyS0 vt100
#s1:12345:respawn:/sbin/agetty 9600 ttyS1 vt100
...

elilo.conf
prompt
message=/efi/boot/elilo.msg
chooser=simple
timeout=50
relocatable

image=/efi/boot/gentoo
  label=gentoo
  append="initrd=gentoo.igz root=/dev/ram0 init=/linuxrc dokeymap looptype=squashfs loop=/image.squashfs cdroot"
  initrd=/efi/boot/gentoo.igz

image=/efi/boot/gentoo
  label=gentoo-serial
  append="initrd=gentoo.igz root=/dev/ram0 init=/linuxrc dokeymap looptype=squashfs loop=/image.squashfs cdroot console=tty0 console=ttyS0,9600"
  initrd=/efi/boot/gentoo.igz

image=/efi/boot/gentoo
  label=gentoo-sgi
  append="initrd=gentoo.igz root=/dev/ram0 init=/linuxrc dokeymap looptype=squashfs loop=/image.squashfs cdroot console=tty0 console=ttySG0,115200"
  initrd=/efi/boot/gentoo.igz

/etc/conf.d/net

Useful on the livecd, as DHCP does not acquire an address there:

config_eth1="140.211.166.179/27"
routes_eth1="default via 140.211.166.161"
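
As a sanity check on those values, the gateway has to sit in the same /27 as the address. The sketch below does the arithmetic in plain shell; ipv4_network is a hypothetical helper, not a Gentoo netifrc function.

```shell
# Print the network address for a dotted-quad IPv4 address and prefix length.
ipv4_network() {
    ip=$1
    prefix=$2
    old_ifs=$IFS
    IFS=.
    set -- $ip            # split the address into its four octets
    IFS=$old_ifs
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
}

ipv4_network 140.211.166.179 27   # host address    -> 140.211.166.160
ipv4_network 140.211.166.161 27   # default gateway -> 140.211.166.160
```

Both calls give the same network (140.211.166.160/27), so the default route is reachable directly on eth1.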