LinMin Bare Metal Provisioning 6.2 User's Guide

Troubleshooting Clients


If your LinMin Server is hosted on a VMware hypervisor, ensure that NAT and DHCP are disabled on the hypervisor.

 

Note: when running in a VMware Virtual Machine, using MAC-Independent provisioning and then MAC-Specific provisioning for the same Client (same MAC address) is not recommended, due to host hypervisor caching issues. If you are setting up an environment to become familiar with the LinMin Server:

1) Take a snapshot of the system immediately after installation

2) Become familiar with MAC-Independent Provisioning (rarely used in production, but a quick way to verify network connectivity and to learn basic provisioning operations)

3) When ready to start the Proof of Concept or start production use, revert to the prior snapshot

4) Start using MAC-Specific Provisioning

 

Note: Windows Server 2008 R2 contains additional drivers not found in Windows Server 2008 (non-R2). If provisioning of non-R2 fails, try provisioning Windows Server 2008 R2.

 

 

Client System Can't Find the LinMin Server

 

The single most common error while provisioning or imaging is that the Client system cannot communicate with the LinMin Server, most often due to network configuration issues, as the Client's PXE request doesn't reach the LinMin Server.

 

Are the required firewall ports on the LinMin Server closed? (These are different ports from those needed to install the LinMin Server.)
Confirm that your LinMin Server is receiving the Client's PXE request: capture activity on ports 67 and 69.
Review the contents of the /var/log/messages file. For example, to show the log's last 12 lines:

# tail -n12 /var/log/messages

May 17 14:05:40 baremetal dhcpd: Wrote 7 leases to leases file.

May 17 14:05:40 baremetal dhcpd: Listening on LPF/eth0/00:50:56:00:00:bb/192.168.0/24

May 17 14:05:40 baremetal dhcpd: Sending on   LPF/eth0/00:50:56:00:00:bb/192.168.0/24

May 17 14:05:40 baremetal dhcpd: Sending on   Socket/fallback/fallback-net

May 17 14:05:47 baremetal dhcpd: DHCPDISCOVER from 00:0c:29:82:bb:62 via eth0

May 17 14:05:47 baremetal dhcpd: DHCPOFFER on 192.168.0.161 to 00:0c:29:82:bb:62 via eth0

May 17 14:05:47 baremetal dhcpd: Dynamic and static leases present for 192.168.0.161.

May 17 14:05:47 baremetal dhcpd: Remove host declaration 00-0c-29-82-bb-62 or remove 192.168.0.161

May 17 14:05:47 baremetal dhcpd: from the dynamic address pool for 192.168.0/24

May 17 14:05:47 baremetal dhcpd: DHCPREQUEST for 192.168.0.161 (192.168.0.233) from 00:0c:29:82:bb:62 via eth0

May 17 14:05:47 baremetal dhcpd: DHCPACK on 192.168.0.161 to 00:0c:29:82:bb:62 via eth0

May 31 09:20:19 baremetal kernel: VMCIUtil: Updating context id from 0x9816be49 to 0x9816be49 on event 0.

 

In the example above, you can clearly see that the Client system (MAC address 00:0c:29:82:bb:62) reached the LinMin Server and that the LinMin Server started servicing the Client system.

 

At any time, you can get the messages log entries for a Client based on its MAC address. For example, using the MAC address above:

Search for the MAC in the current log file:

# grep '00:0c:29:82:bb:62' /var/log/messages

May 17 14:03:01 baremetal dhcpd: DHCPDISCOVER from 00:0c:29:82:bb:62 via eth0

May 17 14:03:03 baremetal dhcpd: DHCPREQUEST for 192.168.0.155 (192.168.0.233) from 00:0c:29:82:bb:62 via eth0:

May 17 14:03:03 baremetal dhcpd: DHCPNAK on 192.168.0.155 to 00:0c:29:82:bb:62 via eth0

May 17 14:05:22 baremetal dhcpd: DHCPDISCOVER from 00:0c:29:82:bb:62 via eth0

May 17 14:05:22 baremetal dhcpd: DHCPOFFER on 192.168.0.161 to 00:0c:29:82:bb:62 via eth0

 

Search for the MAC in all logs (note the asterisk):

# grep '00:0c:29:82:bb:62' /var/log/messages*

 
Add a filter to see only the DHCP entries for the MAC:

# grep '00:0c:29:82:bb:62' /var/log/messages | grep 'DHCP'
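The grep filters above can be combined into a quick tally of how far the DHCP exchange progressed for one Client. The following is a minimal sketch, not a LinMin tool: the function name dhcp_summary is hypothetical, and it assumes dhcpd logs to /var/log/messages in the format shown in the examples above.

```shell
# Count each DHCP message type logged for one Client MAC address.
# Usage: dhcp_summary MAC [LOGFILE]   (LOGFILE defaults to /var/log/messages)
dhcp_summary() {
    mac=$1
    log=${2:-/var/log/messages}
    for msg in DHCPDISCOVER DHCPOFFER DHCPREQUEST DHCPACK DHCPNAK; do
        # -i: ignore MAC letter case; -F: treat the MAC as a fixed string
        # grep -c prints the number of matching lines (0 when there are none)
        count=$(grep -iF "$mac" "$log" | grep -c "$msg" || true)
        printf '%s %s\n' "$msg" "$count"
    done
}

# Example: dhcp_summary 00:0c:29:82:bb:62
```

As a rough reading of the result: DHCPDISCOVER entries with no DHCPOFFER suggest the server saw the Client but did not answer it, and a count of zero on every line means the request never reached this server.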

 

If you do not see the MAC address of the Client to be provisioned or imaged in this log file, you must resolve the networking configuration. Customer Support cannot help you with your network configuration/topology.

 

 

Provisioning Does Not Begin on Client System

 

Premise:  The client's Role on the LinMin Server has an incorrect MAC address.

Solution:  Edit the client's provision profile, and ensure that the MAC address is correct.

 

 

Premise:  The client cannot find the live, authoritative (non-LinMin Server) DHCP server running on the subnet.

Solution:  Make sure that your live (non-LinMin Server) DHCP is installed, properly configured and running.  Read more about co-existing with your live (non-LinMin Server) DHCP server.

Make sure that the LinMin Server's firewall ports are opened as required.

Make sure that you don’t have another active LinMin Server on the same subnet.

Confirm your LinMin Server is receiving the Client's PXE request: capture activity on ports 67 & 69.
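One way to capture activity on these ports is with tcpdump, run as root on the LinMin Server. This is a sketch under assumptions: tcpdump must be installed, the interface name eth0 is taken from the log examples in this topic, and the function name capture_pxe is hypothetical. UDP port 68 (the DHCP client port) is included alongside ports 67 and 69 so that both directions of the exchange are visible.

```shell
# Capture filter for DHCP/PXE (UDP 67 and 68) and TFTP requests (UDP 69).
PXE_FILTER='udp port 67 or udp port 68 or udp port 69'

# Usage (as root): capture_pxe [INTERFACE]
# -n: do not resolve names; -e: print Ethernet headers so the Client MAC shows
capture_pxe() {
    tcpdump -n -e -i "${1:-eth0}" "$PXE_FILTER"
}

# Example: capture_pxe eth0    (stop with Ctrl-C)
```

If nothing appears while the Client attempts a PXE boot, the request is not reaching this interface. Note that only the initial TFTP request uses port 69; the file transfer itself continues on other UDP ports.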

 

 

Provisioning does not complete on the Client system: is your ISO bootable?

 

Should the provisioning event begin but fail to complete (e.g., it freezes), make sure that your ISO is bootable on the Client system:

Burn the ISO file to DVD
Boot the Client system from DVD
If the OS fails to install on the Client, you have a non-bootable ISO and LinMin cannot assist you. Locate a bootable ISO and verify that the Client boots from it. Then use the LBMP media management scripts, followed by the GUI, to create MAC-Specific Templates or MAC-Independent Provisioning Roles.
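Before burning a DVD, a quick sanity check is to look for an El Torito boot record in the ISO file. The sketch below is not part of LBMP and the function name iso_has_boot_record is hypothetical. It only detects that a boot record is present; it does not prove that the Client will boot from the media, so the DVD test above remains the real check.

```shell
# Return 0 if the ISO carries an El Torito boot record, 1 otherwise.
# The boot record occupies sector 17 (2048-byte sectors); its identifier
# string starts 7 bytes into that sector: 17 * 2048 + 7 = 34823.
iso_has_boot_record() {
    id=$(dd if="$1" bs=1 skip=34823 count=23 2>/dev/null | tr -d '\000')
    [ "$id" = "EL TORITO SPECIFICATION" ]
}

# Example:
#   if iso_has_boot_record my.iso; then echo "boot record found"; fi
```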

 

 

MAC-Independent Provisioning works for some, but not all, Distros and OSs

 

Premise: You have not assigned the correct boot/install option for the Role that fails to provision.

Solution:  Assign the correct boot/install option for each OS as follows:

For Red Hat, Asianux, CentOS and Fedora Core distributions, use the Kickstart option.
For SUSE-based distributions, use the YaST option.
For Ubuntu and Debian distributions, use the Debian option.
For Windows OSs and VMware ESX, use the Windows/Other option.
Use the automatically-populated Roles when using MAC-Independent Provisioning
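The mapping above can be written down as a small lookup, for example in a checklist script. This is only an illustration: the function name install_option and the distribution spellings it accepts (including rhel, sles and opensuse as aliases) are assumptions, not LinMin identifiers.

```shell
# Print the boot/install option for a distribution name (case-insensitive).
install_option() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        red*hat*|rhel*|asianux*|centos*|fedora*) echo "Kickstart" ;;
        suse*|sles*|opensuse*)                   echo "YaST" ;;
        ubuntu*|debian*)                         echo "Debian" ;;
        windows*|esx*|vmware*)                   echo "Windows/Other" ;;
        *) echo "unknown distribution: $1" >&2; return 1 ;;
    esac
}

# Example: install_option CentOS
```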

 

 

After provisioning a Linux client, the client GUI doesn't start

 

Solution:  Log in to the client's terminal, and as root user, type the following command:

startx

 

Note: certain Linux distros provisioned by the LinMin Server do not include X-Windows or a desktop by design (e.g., Debian, Ubuntu Server, Fedora 13).

 

 

When provisioning Debian, the provisioning fails as the Client can't find the correct installation source

 

Premise:  A step in the Debian setup was missed.

Solution:  See the topic "Access the Debian Distribution Media".  Creating the symbolic link with the following two commands is particularly important.

cd /home/tftpboot/pub/debian/dists

ln -s etch stable
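To confirm that the link exists and points at the right release, the two commands above can be wrapped in a small helper that is safe to re-run. This is a sketch: the function name link_stable is hypothetical, and the path and the etch codename in the example are the ones used in this topic.

```shell
# Create or refresh the "stable" symlink and print what it points to.
# Usage: link_stable DISTS_DIR CODENAME
link_stable() {
    # -f replaces an existing link; -n avoids descending into the old target
    ( cd "$1" && [ -d "$2" ] && ln -sfn "$2" stable && readlink stable )
}

# Example: link_stable /home/tftpboot/pub/debian/dists etch
```

The helper returns a non-zero status, and creates nothing, if the release directory is missing.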

 

 

Premise:  The Client does not have Internet access to reach a repository

Solution:  Provision a non-Debian/non-Ubuntu OS to the Client, or deploy Debian Rescue, and ensure that the Client has Internet access.

 

 

When provisioning Debian or Ubuntu, a popup box states "Bad Archive Mirror"

 

Premise:  The client did not have access to a public (Internet) or local repository.

Solution:  Ensure your client has Internet access or access to a local repository

 

 

When provisioning Debian or Ubuntu, a popup box for language selection appears

 

Premise:  The language option was not set while performing the Debian Install Option Configuration.

Solution:  Ensure that you have entered the proper kernel parameters

Add the following string to the text entry box beside Enter additional kernel parameters:

languagechooser/language-name=English or <your_language>

 

When Provisioning Debian or Ubuntu a Pop-Up Box Asks How to Partition Disks

 

The following error message may appear:

[Screenshot: Error_Debian_SDA_HDA]

 

This may indicate that you are provisioning an IDE ("hda") drive using a preseed.cfg that has a SATA ("sda") drive as the default. Change the drive in the "partman" command from "sda" to "hda" (or vice versa).
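The drive change can be made with a one-line substitution. The sketch below assumes the preseed file names the disk as /dev/sda or /dev/hda (for example in a line such as "d-i partman-auto/disk string /dev/sda"); check your own file first, since its exact contents are not shown in this guide. The function name preseed_set_disk is hypothetical.

```shell
# Swap the target disk in a preseed file, keeping a .bak copy of the original.
# Usage: preseed_set_disk FILE FROM TO
preseed_set_disk() {
    sed -i.bak "s#/dev/$2#/dev/$3#g" "$1"
}

# Example: preseed_set_disk preseed.cfg sda hda
```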

 

 

The client UI displays “PXE-M0F: Exiting Intel Boot Agent” then hangs, does not boot from HD

Premise: there was a PXE timing error (this network anomaly happens occasionally)

Solution: reboot the client system and boot to the network (more than once if necessary)

Note: certain NICs retain the IP address of the previous system they PXE-booted from. In this case, boot to the network 2 to 3 times to clear the prior IP address.

 

Premise:  Cable connected to the incorrect network adapter

Solution:  Try plugging your CAT cable into another network port on the Client

 

Premise:  The BIOS cannot find the HD that it is supposed to boot from

Solution 1:  Obtain a BIOS FLASH update from the manufacturer

Solution 2:  When using “Always boot from the network first” because you are letting the LinMin Server business rules decide whether a system is to be provisioned, imaged or booted from local HD, ensure the boot sequence is network, ATAPI, floppy, hard drive

Solution 3:  If you have configured your system to boot from the network last, the boot sequence should be ATAPI, floppy, hard drive, network

 

Premise:  Bad network cable

Solution:  Replace the network cable and retry

 

When provisioning Red Hat/CentOS/Fedora, Client displays "unable to retrieve image/stage2.img" or a Debian/Ubuntu Client displays "Failed to retrieve the pre-configuration file http://10.0.15.21/tftpboot/controlfiles/{MAC-Address}.cfg"

The Red Hat-based (Red Hat, Fedora, CentOS, Asianux) client system being provisioned interrupts the provisioning process and displays a message similar to:

unable to retrieve http://10.7.1.23//tftpboot/pub/rhel5_3_x86_64/image/stage2.img

 

For Ubuntu-based distributions, a similar type of message can occur:

Failed to retrieve the pre-configuration file http://10.0.15.21/tftpboot/controlfiles/001a4d681b55.cfg  The installation will proceed in non-automated mode.
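When this message appears, it can help to request the same control file from another machine on the network. The sketch below builds the URL from the pattern shown in the message above. It is an assumption, based only on that one example, that the file name is the MAC address in lowercase with the separators removed; the function name controlfile_url is hypothetical.

```shell
# Print the control-file URL for a Client.
# Usage: controlfile_url SERVER MAC
controlfile_url() {
    # Strip ":" and "-" separators, then lowercase the hex digits
    mac=$(printf '%s' "$2" | tr -d ':-' | tr '[:upper:]' '[:lower:]')
    printf 'http://%s/tftpboot/controlfiles/%s.cfg\n' "$1" "$mac"
}

# Example: controlfile_url 10.0.15.21 00:1A:4D:68:1B:55
```

The resulting URL can then be passed to a tool such as curl (for example, curl -fsI followed by the URL) to see whether the web server returns the file.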

 

These errors have many possible causes, which can be very difficult to isolate. Wireshark, a useful free tool for analyzing the network during provisioning, is available from http://www.wireshark.org/ (we recommend this tool and may offer support in selected situations on an hourly consulting fee basis).

 

Common causes and remedies experienced by LinMin and its customers include:

This stage2 failure is often caused when the provisioning target has multiple NICs or the Linux distro has a kernel that is not in sync with PXE protocols.
o Force the kickstart installation to take place over a specified Ethernet device.
o Add the line "network --device=eth#" to the *ks.cfg file in pub/ (for MAC-Independent Provisioning) or to the MAC-Specific Role Template text box containing the control file (for MAC-Specific Provisioning). You may append "--device=eth#" to any existing line starting with “network”, after any other parameters already on that line. The eth# is normally eth0 but can be eth1, eth2, etc.
o Alternatively, use the ksdevice option at the install boot prompt by adding the ksdevice=eth# parameter. To do so, enter "ksdevice=eth#" in the GUI "Enter any additional kernel parameters" field of Provisioning Role Templates (for MAC-Specific Provisioning) or of Provisioning Roles (for MAC-Independent Provisioning). The eth# is normally eth0 but can be eth1, eth2, etc.
o Another kernel parameter that addresses the issue under certain circumstances is “ksdevice=link”.
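For MAC-Independent Provisioning, the control-file edit described above can be scripted. The following is a minimal sketch, assuming a kickstart file whose network directive (if any) starts at the beginning of a line; the function name ks_set_device is hypothetical and is not an LBMP script.

```shell
# Pin a kickstart file to one Ethernet device, keeping a .bak copy.
# Usage: ks_set_device FILE DEVICE
ks_set_device() {
    if grep -q '^network.*--device=' "$1"; then
        # A device is already set: replace its value
        sed -i.bak "s/^\(network.*--device=\)[^ ]*/\1$2/" "$1"
    elif grep -q '^network' "$1"; then
        # A network line exists: append the option after its other parameters
        sed -i.bak "s/^network.*/& --device=$2/" "$1"
    else
        # No network line yet: add one
        cp "$1" "$1.bak" && printf 'network --device=%s\n' "$2" >> "$1"
    fi
}

# Example: ks_set_device ks.cfg eth0
```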

 

Network connectivity problems (race conditions or other situations that require resetting devices). Also, the anaconda installer has a known bug whereby it does not retry locating the installation media directory and simply times out, giving the “unable to retrieve” message.
o Disconnect and reconnect the client system's network cable at the switch/router, then power the switch/router down and up. If that does not help, a full reset sometimes does: shut down the LinMin Server, power down all network devices, and disconnect the cables between the LinMin Server, every connected network device and the client system. Then reconnect all cables and power on the network devices, then the LinMin Server, then the client system. (This is rarely needed, but it happens occasionally.)
o Slow link response on switches: set "spanning-tree portfast" on each interface.
o Certain newer install processes involve larger files and contact the LinMin Server DHCP service multiple times before and after downloading files. The short lease time, intended to support temporary use of IP addresses, can cause a different IP address to be issued during the install, which creates certain failures. If this is your issue, the resolution is to increase the default lease time:
 
    Edit /etc/dhcpd.conf and change:
 
    default-lease-time 600;
    to
    default-lease-time 1800;
 
Note: There are two default-lease-time entries and both must be changed. After making the changes, restart dhcpd:
 
    service dhcpd restart
o Change the reference to the LinMin Server from a fixed IP address to an externally resolvable DNS name:
From: url --url http://$server_ip_address/$tftpboot_webpath/pub/centos5_4_x86_64
To: url --url http://linmin.yourcompany.com/$tftpboot_webpath/pub/centos5_4_x86_64
For MAC-Specific provisioning, make the change in the Provisioning Role Templates and re-generate the Provisioning Roles.
For MAC-Independent provisioning, edit the control file in pub/{distro}/ and remember that if you run setup.pl to change other networking configurations, you will need to change the IP address again.
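The lease-time change described above can be applied to both entries in one pass. This is a sketch, assuming the entries read exactly "default-lease-time 600;" (with any leading indentation); the function name raise_lease_time is hypothetical. It keeps a .bak copy and prints the resulting entries so that both changes can be confirmed before restarting dhcpd.

```shell
# Raise every "default-lease-time 600;" entry to 1800, keeping a .bak copy.
# Usage: raise_lease_time [CONF]   (CONF defaults to /etc/dhcpd.conf)
raise_lease_time() {
    conf=${1:-/etc/dhcpd.conf}
    sed -i.bak 's/\(default-lease-time\)[[:space:]][[:space:]]*600;/\1 1800;/' "$conf"
    # Show the resulting entries with their line numbers
    grep -n 'default-lease-time' "$conf"
}

# Example (as root): raise_lease_time && service dhcpd restart
```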

 

Corrupted ISO media or extracted files
o Re-download the ISO file (or clean the CD/DVD and re-run loaddvd.pl), then run loadwindows.pl or loadlinux.pl again.
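A downloaded ISO can be checked for corruption before it is loaded by comparing its checksum with the value published by the OS vendor. The sketch below assumes the vendor publishes a SHA-256 digest and that sha256sum is installed; where only an MD5 value is published, md5sum can be substituted. The function name verify_iso is hypothetical.

```shell
# Compare an ISO's SHA-256 digest with the value published by the vendor.
# Usage: verify_iso FILE EXPECTED_SHA256
verify_iso() {
    actual=$(sha256sum "$1" | awk '{print $1}')
    # Published digests are sometimes uppercase; compare in lowercase
    expected=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}

# Example: verify_iso my.iso <digest published by the vendor>
```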