After joining Pythian I was introduced to several configuration management systems, and Puppet was one of them. Foreman is a systems management tool that can be integrated with Puppet to manage Puppet modules and to initiate Puppet runs on hosts from a web interface. This is very useful if you want to configure a large number of systems.
Puppet kick, which was previously used to initiate Puppet runs from Foreman, is now deprecated.
To initiate Puppet runs from the Foreman interface, I used mcollective. Mcollective can be used to execute parallel jobs on remote systems. It has three main components:
- Client – Connects to the mcollective servers and sends commands.
- Server – Runs on all managed systems and executes commands.
- Middleware – A message broker like activemq.
I used the mcollective Puppet module from Puppetlabs for my setup.
My setup includes the middleware (activemq) and the mcollective client on the Puppet server, and mcollective servers on all managed systems.
After the implementation, I found that Puppet runs from the Foreman web interface were failing for some servers.
I found the following in /var/log/foreman-proxy/proxy.log:
W, [2014-04-18T07:20:56.167627 #4256] WARN -- : Non-null exit code when executing '/usr/bin/sudo /usr/bin/mco puppet runonce -I reg-app-02.prod.tprweb.net'
E, [2014-04-18T07:20:56.175034 #4256] ERROR -- : Failed puppet run: Check Log files
You can see that the mco command is trying to execute a Puppet run on the target host (referred to as server.pythian.com in the rest of this post) and failing. The mco command uses several subcommands called 'applications' to interact with the systems, and 'puppet' is one of them.
When I ran the command from the command line, I got the following:
Finished processing 0 / 1 hosts in 22012.79 ms
No response from:
server.pythian.com
I was able to ping the server.
When I ran 'mco ping', I found that the server with the issue was identified by its short hostname, while the others were identified by their FQDNs.
server3.pythian.com time=95.26 ms
server2.pythian.com time=96.16 ms
So mcollective is exporting a short hostname for this server, while Foreman is expecting an FQDN (Fully Qualified Domain Name).
Foreman takes the node name from the Puppet certificate name and uses it as the filter when sending mco commands.
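The effect of this mismatch can be sketched in a few lines of Ruby. This is only an illustration, assuming that a plain (non-regex) identity filter selects the nodes whose identity equals the requested name. The nodes_matching helper and the list of identities are hypothetical and are not part of mcollective.

```ruby
# Hypothetical identities as exported by three mcollective servers.
IDENTITIES = ["server", "server2.pythian.com", "server3.pythian.com"].freeze

# Return the identities that exactly equal the requested name.
# (Illustrative helper; not an mcollective API.)
def nodes_matching(requested, identities = IDENTITIES)
  identities.select { |identity| identity == requested }
end

puts nodes_matching("server2.pythian.com").inspect # FQDN identity: one match
puts nodes_matching("server.pythian.com").inspect  # FQDN requested, short identity exported: no match
puts nodes_matching("server").inspect              # short name requested: one match
```

Under that assumption, a request for the FQDN finds no node when the node registered itself with its short hostname, which is consistent with the "No response from" output above.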
Mcollective exports identity differently. From https://docs.puppetlabs.com/mcollective/configure/server.html#facts-identity-and-classes,
The node’s name or identity. This should be unique for each node, but does not need to be.
Default: The value of Ruby’s Socket.gethostname method, which is usually the server’s FQDN.
Sample value: web01.example.com
Allowed values: Any string containing only alphanumeric characters, hyphens, and dots — i.e. matching the regular expression /\A[\w\.\-]+\Z/
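As a quick check of that rule, the regular expression from the documentation can be applied directly in Ruby. This is a minimal sketch; the valid_identity? helper and the sample names are made up for illustration.

```ruby
# Pattern quoted in the mcollective documentation for allowed identity values.
IDENTITY_PATTERN = /\A[\w\.\-]+\Z/

# True when the candidate string is an allowed identity value.
# (Illustrative helper; not an mcollective API.)
def valid_identity?(candidate)
  !!(candidate =~ IDENTITY_PATTERN)
end

["web01.example.com", "server", "bad host", "web01/example"].each do |name|
  puts "#{name.inspect}: #{valid_identity?(name)}"
end
```

Note that both a short hostname and an FQDN pass this check, so the pattern alone will not catch the problem described here. In Ruby, \w also matches the underscore, so the pattern is slightly more permissive than the prose description.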
I passed the FQDN as the identity on the servers using the mcollective module, which resulted in the following setting:
identity = server.pythian.com
Update: I have pushed a commit to the puppet-mcollective module that automatically sets the identity to the FQDN.
With this setting in place, the command ran successfully and 'Puppet Run' from Foreman started working.
Now 'mco ping' looks good as well:
server3.pythian.com time=91.23 ms
server2.pythian.com time=82.16 ms
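To double-check the setting on a node, the identity line can be read back from the server configuration. The sketch below assumes the usual "key = value" layout of an mcollective server.cfg; the configured_identity helper and the sample text are hypothetical, and the location of the file on your system is left out on purpose.

```ruby
# Extract the identity setting from the text of an mcollective server.cfg.
# Returns nil when no identity line is present, in which case mcollective
# falls back to Socket.gethostname according to the documentation quoted above.
# (Illustrative helper; not an mcollective API.)
def configured_identity(config_text)
  config_text.each_line do |line|
    next if line.strip.start_with?("#")          # skip comment lines
    key, value = line.split("=", 2).map(&:strip) # split on the first "=" only
    return value if key == "identity" && value && !value.empty?
  end
  nil
end

# Hypothetical excerpt of a server.cfg.
sample = <<~CFG
  # managed by puppet
  daemonize = 1
  identity = server.pythian.com
CFG

puts configured_identity(sample).inspect
```

To check a real node, pass in the contents of its server.cfg, for example with File.read.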
—
Now let us check why this was happening.
The mcollective identity comes from the Ruby method Socket.gethostname.
From the Ruby source code you can see that Socket.gethostname gets its value from gethostname():
/*
 * call-seq:
 *   Socket.gethostname => hostname
 *
 * Returns the hostname.
 *
 *   p Socket.gethostname #=> "hal"
 *
 * Note that it is not guaranteed to be able to convert to IP address using gethostbyname, getaddrinfo, etc.
 * If you need local IP address, use Socket.ip_address_list.
 */
static VALUE
sock_gethostname(VALUE obj)
{
#if defined(NI_MAXHOST)
#  define RUBY_MAX_HOST_NAME_LEN NI_MAXHOST
#elif defined(HOST_NAME_MAX)
#  define RUBY_MAX_HOST_NAME_LEN HOST_NAME_MAX
#else
#  define RUBY_MAX_HOST_NAME_LEN 1024
#endif
    char buf[RUBY_MAX_HOST_NAME_LEN+1];
    rb_secure(3);
    if (gethostname(buf, (int)sizeof buf - 1) < 0)
        rb_sys_fail("gethostname(3)");
    buf[sizeof buf - 1] = '\0';
    return rb_str_new2(buf);
}
gethostname() is a glibc function that calls the uname system call and copies the value from the returned nodename field.
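This relationship can be observed from Ruby itself. The following is a small sketch that assumes Ruby 2.2 or later, where Etc.uname is available (it is not present in the 1.9.3 interpreter used elsewhere in this post). On a Linux system with glibc, the two values are expected to be the same.

```ruby
require "socket"
require "etc"

# Hostname as returned by gethostname(), which mcollective uses as its default identity.
hostname = Socket.gethostname

# nodename field reported by uname(2).
nodename = Etc.uname[:nodename]

puts "Socket.gethostname  : #{hostname}"
puts "Etc.uname[:nodename]: #{nodename}"
puts "same value          : #{hostname == nodename}"
```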
So while Foreman uses the FQDN, which it takes from the Puppet certificate name, mcollective exports the hostname returned by gethostname().
Now let us see how gethostname() returns different values on different systems.
When the complete FQDN is given in the HOSTNAME parameter in /etc/sysconfig/network, Socket.gethostname returns the FQDN.
NETWORKING=yes
HOSTNAME=centos.pythian.com

# hostname -v
gethostname()=`centos.pythian.com'
centos.pythian.com
# irb
1.9.3-p484 :001 > require 'socket'
=> true
1.9.3-p484 :002 > Socket.gethostname
=> "centos.pythian.com"
1.9.3-p484 :003 >
The system that had the problem had the following configuration:
NETWORKING=yes
HOSTNAME=server

# hostname -v
gethostname()=`server'
server
# irb
1.9.3-p484 :001 > require 'socket'
=> true
1.9.3-p484 :002 > Socket.gethostname
=> "server"
1.9.3-p484 :003 >
Here Ruby returns only the short hostname for Socket.gethostname. However, the system had an entry in /etc/hosts that allowed it to resolve the FQDN:
gethostname()=`server'
Resolving `server' ...
Result: h_name=`server.pythian.com'
Result: h_aliases=`server'
Result: h_addr_list=`192.168.122.249'
server.pythian.com
From 'man hostname':
host name. Technically: The FQDN is the name gethostbyname(2) returns for the host name returned by gethostname(2). The DNS domain name is the part after the first dot.
As the resolver is able to resolve the hostname from /etc/hosts, Puppet is able to pick up the FQDN for its certificate name, which is later used by Foreman.
But mcollective exports the short hostname returned by gethostname().
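The difference between the two names can be checked on any node with a short Ruby script. This is a sketch rather than what Puppet or mcollective do internally: it assumes that asking the resolver for the canonical name of the hostname (via Addrinfo.getaddrinfo with the AI_CANONNAME flag) is a reasonable stand-in for the FQDN lookup described in the man page. The canonical_name helper is hypothetical.

```ruby
require "socket"

# Ask the resolver (/etc/hosts, DNS) for the canonical name of a hostname.
# Falls back to the hostname itself when it cannot be resolved.
# (Illustrative helper; not part of Puppet or mcollective.)
def canonical_name(hostname)
  infos = Addrinfo.getaddrinfo(hostname, nil, nil, :STREAM, nil, Socket::AI_CANONNAME)
  infos.map(&:canonname).compact.first || hostname
rescue SocketError
  hostname
end

# Name returned by gethostname(); this is mcollective's default identity.
default_identity = Socket.gethostname
resolved_name    = canonical_name(default_identity)

puts "gethostname()  : #{default_identity}"
puts "canonical name : #{resolved_name}"
if default_identity == resolved_name
  puts "The default identity already matches the resolved name."
else
  puts "Mismatch: consider setting identity explicitly to the FQDN."
end
```

On a system configured like the problem server above, the first line would be expected to show the short hostname and the second the FQDN.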
To fix the issue on Red Hat based Linux distributions, we can try any of the following:
* Set an FQDN as HOSTNAME in /etc/sysconfig/network, as below.
NETWORKING=yes
HOSTNAME=server.pythian.com
OR
* Use a short hostname as HOSTNAME, but make sure that it does not resolve to an FQDN through /etc/hosts or DNS (not really recommended).
OR
* Use either a short hostname or an FQDN as HOSTNAME, but make sure that /etc/hosts has an entry that resolves it to the FQDN and that mcollective is exporting the FQDN as its identity.