Puppet Lunch

A Puppet Enterprise testimonial,
by S. Young.

Chapter 3 - Using Hiera

Hiera allows hierarchical configuration in Puppet, which is difficult to achieve with native Puppet code. Another major advantage is the separation of configuration data from the code, which makes everything generally easier. This is probably why Hiera comes bundled with Puppet Enterprise. To get a feel for what it can do, see the Puppet Labs Overview of Hiera.

A Basic Configuration

For testing purposes, we’re going to create a very simple Hiera-based Puppet Master configuration. At this point, we only have the three Puppet hosts to play with (Puppet Master, Puppet Console and PuppetDB), so we’ll verify the configuration on these nodes first. To keep things simple, we’ll follow the Hiera configuration example from the Overview, so this will be the order of events:

To get to grips with Hiera, it’s worth following the Puppet Labs complete example, in which the ntp service is configured. So we first need to install the ntp puppet module on the Puppet Master…

# puppet module install puppetlabs/ntp
Notice: Preparing to install into /etc/puppetlabs/puppet/modules ...
Notice: Downloading from https://forge.puppetlabs.com ...
Notice: Installing -- do not interrupt ...
/etc/puppetlabs/puppet/modules
└── puppetlabs-ntp (v2.0.1)

hiera.yaml and The Hierarchy

Hiera’s hierarchy configuration is in /etc/puppetlabs/puppet/hiera.yaml.

Note: If we edit this file, then Puppet needs to be restarted before it will pick up the changes (service pe-puppet restart).

The default Hiera configuration looks like this:

---
backends:
  - yaml
hierarchy:
  - defaults
  - "%{clientcert}"
  - "%{environment}"
  - global
yaml:
  # datadir is empty here, so hiera uses its defaults:
  # - /var/lib/hiera on *nix
  # - %CommonAppData%\PuppetLabs\hiera\var on Windows
  # When specifying a datadir, make sure the directory exists.
  datadir:

Hiera looks for data in /var/lib/hiera by default. To keep everything in one place, we’ll change this to /etc/puppetlabs/puppet/hiera. We’ll also change the hierarchy a bit, so the new config file looks like this:

---
backends:
  - yaml
hierarchy:
  - node/%{::fqdn}
  - "%{::environment}"
  - global
yaml:
  datadir: /etc/puppetlabs/puppet/hiera

Note the double-colon forces these variables to be top-scope. We do this to remove any ambiguity which may arise if a module were to define a variable of the same name. Top-scope namespace is the empty string, which is why nothing preceeds the double-colon.

This Hiera configuration provides a lot of flexibility. It allows:

For example, if we create a new node called ‘webserver1’ in environment ‘uat’, Hiera will search for Puppet module parameters in the following files, in this order:

1. /etc/puppetlabs/puppet/hiera/node/webserver1.yaml    
2. /etc/puppetlabs/puppet/hiera/uat.yaml                
3. /etc/puppetlabs/puppet/hiera/global.yaml             

So that’s:

  1. Specific config for this particular node
  2. Config for all nodes in UAT
  3. Common config for all nodes

One really nifty feature is that Puppet will search for module parameters automatically. It doesn’t have to be explicitly configured to search Hiera data (unless the module in question uses ‘defined types’ - see the Using Hiera with Defined Types section later in this page).

Example: Data Sources and the ntp Service

Data sources are written in YAML, which is a really simple human-readable data format. For a quick glance at how data structures are represented in YAML, read this.

For example, when configuring the ntp service, we can define any global options in global.yaml:

ntp::package_ensure: latest
ntp::service_enable: true
ntp::service_ensure: running
ntp::servers:
  - ntp0.puppetlunch.com iburst
  - ntp1.puppetlunch.com iburst
  - ntp2.puppetlunch.com iburst
  - ntp3.puppetlunch.com iburst
ntp::driftfile: /var/lib/ntp/drift

Any of these options may be overriden further up the hierarchy. So for example, if we have an environment called “uat” in which we didn’t want to run the ntp service, we could put this in uat.yaml:

ntp::service_ensure: stopped

To test this, we can run the hiera command line tool, which allows you to test your Hiera configuration without repeated puppet agent runs:

# hiera ntp::service_ensure ::environment=uat
stopped
# hiera ntp::service_ensure ::environment=dev
running

However, if we wanted to run the ntp service on a particular node in uat called “uat-web1.puppetlunch.com” we can do this no problem in node/uat- web1.puppetlunch.com.yaml:

ntp::service_ensure: running

And test:

# hiera ntp::service_ensure ::environment=uat
stopped
# hiera ntp::service_ensure ::environment=uat ::fqdn=uat-web1.puppetlunch.com
running

Using Hiera Data in Puppet

We briefly mentioned that Puppet automatically searches Hiera data for module parameters. So when we ‘include’ the ntp class, Puppet will query hiera for its parameters. Let’s do that…

File: /etc/puppetlabs/puppet/manifests/site.pp

node default {
  #
  # First test: ntp. This includes the puppetlabs ntp module, which should
  # pick up its parameters from Hiera automatically.
  #
  include ntp
}

We can also use Hiera to load the ntp class as well, using the hiera_include function. Like this:

File: /etc/puppetlabs/puppet/manifests/site.pp

hiera_include('classes')

Then we define the ‘classes’ array in global.yaml:

#
# Specify which classes to enable
#
classes:
  - ntp

This provides the magical ability to enable and disable classes from the global configuration file, rather than editing the Puppet site.pp manifest directly.

Note: This also allows us to leave the classes array undefined in global.yaml and enable the appropriate classes per environment, or even per node. Hopefully it’s becoming clear how flexible this Hiera tool can be…

In any case, we’re now including the ntp module, and providing parameter data via Hiera, all with very little effort!

Using Hiera with Defined Types

Sometimes the options required by a particular module are a bit more complex; usually because the module has defined a particular resource type. In these cases, Puppet’s automatic parameter lookup function will fail as it will only search for class parameters in Hiera, but we can help it to find what it’s looking for using the built-in functions hiera and create_resources.

What does this mean? Well, here’s an example of the data structure required by the puppetlabs/apache module, which defines its own resource type ‘apache::vhost’:

This is how we might use the defined type in a Puppet manifest:

apache::vhost { 'foo.example.com':
      port          => '80',
      docroot       => '/var/www/foo.example.com',
      docroot_owner => 'foo',
      docroot_group => 'foo',
      options       => ['Indexes','FollowSymLinks','MultiViews'],
      proxy_pass    => [ { 'path' => '/a', 'url' => 'http://backend-a/' } ],
}
 
apache::vhost { 'bar.example.com':
    port     => '80',
    docroot: => '/var/www/bar.example.com',
}

So it’s a “hash of hashes” data structure. We can represent this in YAML like so:

apache::vhosts:
  foo.example.com:
    port: 80
    docroot: /var/www/foo.example.com
    docroot_owner: foo
    docroot_group: foo
    options:
      - Indexes
      - FollowSymLinks
      - MultiViews
    proxy_pass:
      -
        path: '/a'
        url: 'http://localhost:8080/a'
  bar.example.com:
    port: 80
    docroot: /var/www/bar.example.com

To retrieve this data from our Puppet code, we need to pull the Hiera data into a hash variable, then define the resource that this hash represents using the amazingly useful ‘create_resources’ function:

$myvhosts = hiera('apache::vhosts', {})
create_resources('apache::vhost', $myvhosts)

This one requires a bit of thinking… The first line assigns the YAML data structure to a hash variable, and the second tells Puppet to use this hash to declare an ‘apache::vhost’ resource type for each key in the hash.

I’ll try and make it clearer…

create_resources: The General Solution

Forget about specific modules for a moment. This is the basic principle.

1. Take a complex data structure, like this:

defined_type { 'name':
  key1 => 'value1',
  key2 => 'value2',
  key3 => ['element1','element2','element3']
  key4 => [ { 'subkey1' => 'subvalue1', 'subkey2' => 'subvalue2' } ],
}

2. Represent it in YAML:

data_identifier:
  name:
    key1: value1
    key2: value2
    key3:
      - element1
      - element2
      - element3
    key4:
      -
        subkey1: subvalue1
        subkey2: subvalue2

3. Pull that Hiera data into a Puppet hash variable (the ‘hiera’ function will only pull the data relevant to a particular node, as dictated by your hierarchy):

$my_hash = hiera('data_identifier', {})

4. Then use create_resources to declare a resource for each key of the hash:

create_resources('defined_type', $my_hash)

Note: For simplicity, ‘data_identifier’ is often the same string as ‘defined_type’, but it’s important to realise that it can be anything you like. This is the key to understanding how it works - you’re either using Puppet’s auto-lookup feature, in which case the data identifier has to match the class and parameter names, or you’re using create_resources, in which case there’s no auto-lookup and the data identifier can be anything you want.

Hopefully that helps you get started. I can assure you it becomes clearer the more you use it.

Further Reading

Next…