Problem with Perceus dhcpd import script

I spent a couple of days banging my head against the wall trying to figure out why the import script kept giving the cryptic error below, and we finally submitted a question to the Perceus mailing list.

Undefined subroutine &main::add_node called at ./ line 46,  line 8.

I’m sure if I knew Perl that error wouldn’t have been so confusing.

Here’s a diff of the fix for anyone who’s interested. The key change is that the script now calls node_add instead of the undefined add_node.

<    if ( $_ =~ /^\s*host\s+([^\s]+)\s*{\s*$/ ) {
>    if ( $_ =~ /^\s*host\s+([^\s]+)\s*{?\s*$/ ) {
<       &add_node($1, $hostname);
>       print "Adding: $1, $hostname\n";
>       &node_add($1, $hostname);
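Besides the renamed subroutine, the diff makes the opening brace on a host line optional. The following is a minimal shell sketch of what that changes, not part of the import script itself. It rewrites the two patterns as POSIX extended regular expressions for grep, and the sample host entries are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample input; a real dhcpd.conf will look different.
sample_conf() {
cat <<'EOF'
host node0001 {
  hardware ethernet 00:11:22:33:44:55;
}
host node0002
{
  hardware ethernet 00:11:22:33:44:66;
}
EOF
}

# ERE translations of the two Perl patterns (assumed equivalent for this input).
# old: the opening brace must be on the same line as "host NAME"
# new: the opening brace is optional
old_pattern='^[[:space:]]*host[[:space:]]+[^[:space:]]+[[:space:]]*[{][[:space:]]*$'
new_pattern='^[[:space:]]*host[[:space:]]+[^[:space:]]+[[:space:]]*[{]?[[:space:]]*$'

# Count the lines on stdin that match the pattern given as $1.
count_hosts() {
    grep -E -c "$1"
}

echo "old pattern: $(sample_conf | count_hosts "$old_pattern") host line(s)"
echo "new pattern: $(sample_conf | count_hosts "$new_pattern") host line(s)"
```

With this sample, the old pattern only finds the entry whose brace sits on the same line, while the new pattern also finds the entry whose brace is on the following line.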

New Nodes failing to get DHCP IP after booting in Perceus

As of Perceus 1.4, nodes no longer automatically get an IP address via DHCP. You must first enable and configure the ipaddr module in Perceus.

perceus module activate ipaddr

Then edit the ipaddr config file in /etc/perceus/modules/ipaddr. Uncommenting the last line seems to be sufficient for most configurations. If your machines do not have their second Ethernet card plugged in, it is worth removing the eth1 portion, as this will significantly reduce boot times.

* eth0:[default]/[default] eth1:[default]/[default]/[default]

Reload the Perceus service and then restart the nodes. They should automatically get a new IP address.
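If you want to script the config edit, here is a rough sketch. The helper name drop_eth1 is made up, and it assumes the last line is commented out with a leading "#". It only transforms text on stdin, so you can review the result before replacing the real file.

```shell
#!/bin/sh
# drop_eth1 is a hypothetical helper, not part of Perceus.
# It reads config text on stdin, uncomments the "* eth0:..." line if it
# starts with "#" (assumed comment character), and removes the eth1 entry
# from that line. All other lines pass through unchanged.
drop_eth1() {
    sed -E \
        -e 's/^#[[:space:]]*(\*[[:space:]]+eth0:)/\1/' \
        -e '/^\*[[:space:]]+eth0:/ s/[[:space:]]+eth1:[^[:space:]]*//'
}

# Example with the line from the config file, shown here in commented-out form.
printf '%s\n' '#* eth0:[default]/[default] eth1:[default]/[default]/[default]' | drop_eth1

# Possible use against the real file (review the output before copying it back):
#   drop_eth1 < /etc/perceus/modules/ipaddr > /tmp/ipaddr.new
```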

Perceus “ERROR No such host: binsh”

I’ve been working a lot with Perceus at work, and I figured I would put up some posts about problems I have encountered and possible solutions to them.

Today I was attempting to boot a node with Perceus 1.3.8 installed. The node would download and run the first kernel, but when it attempted to begin provisioning with provisiond it would exit with this error:

ERROR No such host: binsh

The node would then loop forever through the following while loop, printing the error every second right after running provisiond:

# Excerpt from:

while [ ! -f "/next" ]; do
	# If this works we won't even get a chance to say goodbye!
	# If it errors out, we need to touch /next to
	# iterate to next count and/or interface.
	if [ $INIT_DEBUG -eq 0 ]; then
	   provisiond -s /bin/sh $MASTERIP init || touch /next
	elif [ $INIT_DEBUG -eq 1 ]; then
	   provisiond -v -s /bin/sh $MASTERIP init || touch /next
	else
	   provisiond -d -s /bin/sh $MASTERIP init || touch /next
	fi
	sleep 1
done

I was able to find a reference to the error message in the source code for provisiond. Initially I thought that the node was passing “/bin/sh” to the server instead of the master’s IP address, but after trying various things with the command line parameters I decided to look elsewhere.

Eventually I noticed that provisiond was running as a service on the head node, but provisiond should only run on provisioned nodes. I tried uninstalling provisiond from the head node, which seemed to fix the problem. Unfortunately, I tried a couple of other ideas at the same time, so I cannot be absolutely sure that provisiond was causing the problem.
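A quick way to check whether provisiond is running on the head node might look like the sketch below. The helper has_provisiond is made up for this post. It relies only on the standard ps and grep tools, and it assumes the process shows up under the command name "provisiond".

```shell
#!/bin/sh
# has_provisiond is a hypothetical helper: it reads one command name per
# line on stdin and succeeds if any line is exactly "provisiond".
has_provisiond() {
    grep -qx 'provisiond'
}

# "ps -e -o comm=" prints the command name of every process, with no header.
if ps -e -o comm= | has_provisiond; then
    echo "provisiond is running on this machine"
else
    echo "provisiond is not running on this machine"
fi
```

If it reports that provisiond is running on the head node, that matches the situation described above.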

If I get a chance I will do a more thorough test to make sure that I am correct.

edit: Never got a chance to test if this worked correctly. If anyone was able to test this situation I would be interested in hearing about it.