Mustafa Can Yücel
blog-post-25

Ansible Part 1: Basics

What's With the Name?

It’s a science-fiction reference. An ansible is a fictional communication device that can transfer information faster than the speed of light. Ursula K. Le Guin invented the concept in her book Rocannon’s World (Ace Books, 1966), and other sci-fi authors have since borrowed the idea, including Orson Scott Card. Ansible cofounder Michael DeHaan took the name Ansible from Card’s book Ender’s Game (Tor, 1985). In that book, the ansible was used to control many remote ships at once, over vast distances. Think of it as a metaphor for controlling remote servers.

What is Ansible?

Ansible is an open-source automation tool, or platform, used for IT tasks such as configuration management, application deployment, intraservice orchestration, and provisioning. How does it help us? We can use Ansible to automate taking a server from a fresh install to a fully configured state. This not only saves time, but also ensures that all of our servers are configured in the same way, which makes them easier to manage and makes our infrastructure easier to scale.

As an example, let's say we have a server and we want to install Caddy, and configure it to both serve static websites and reverse-proxy a bunch of other services (container or otherwise). We need the following steps:

  1. Install Caddy
  2. Configure Caddy to serve static websites
  3. Configure Caddy to reverse-proxy other services
We create a YAML file that contains the steps above; this file is called a playbook, and each step is called a task. We then run the playbook with Ansible, and it takes care of the rest. If we have multiple servers, we can run the playbook on all of them at once, following the steps below:
  1. Generate a Python script that performs the first task
  2. Copy the script to all the servers
  3. Run the script on all the servers
  4. Wait for the script to complete execution on all hosts
Ansible then moves to the next task in the list and goes through the same four steps (a sketch of such a playbook follows the list below). So, the important points to note are:
  • Ansible is agentless, so we don't need to install anything on the servers we want to manage
  • Ansible is idempotent, so we can run the playbook multiple times without any issues
  • Ansible is declarative, so we only need to specify what we want to happen, not how it should happen
  • Ansible runs each task in parallel across all hosts
  • Ansible waits until all hosts have completed a task before moving to the next task
  • Ansible runs the tasks in the order that is specified in the playbook
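A rough sketch of what such a playbook could look like, assuming an apt-based distribution and hypothetical file paths (this is illustrative, not a complete Caddy setup):

# playbook sketch for the Caddy example above - paths are hypothetical
- name: Install and configure Caddy
  hosts: webservers
  become: true
  tasks:
    - name: Install Caddy
      ansible.builtin.apt:
        name: caddy
        state: present
    - name: Configure Caddy for static sites and reverse proxies
      ansible.builtin.copy:
        src: files/caddy/Caddyfile
        dest: /etc/caddy/Caddyfile
        mode: '0644'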

Definitions of Sides & Requirements

There are two sides in Ansible:

  • Control machine: The machine where Ansible is installed. It is used to write playbooks and run Ansible commands.
  • Managed nodes: The machines that are controlled by the control machine. Ansible is not installed on these machines.

For the managed nodes, Linux servers need to have SSH and Python installed (which come by default in nearly all distros), while Windows servers need WinRM enabled. On Windows, Ansible uses PowerShell instead of Python, so there is no need to preinstall an agent or any other software on the Windows hosts.

On the control machine, it is best to install Python 3.8 or later. Depending on the resources, external libraries may also be required. We will get into more details in the following sections.

Windows is not officially supported as a control machine, but remote systems can be fully managed from a Windows machine by running Ansible inside WSL. Here is the documentation for installing Kali Linux, which supports a seamless mode where both OSes run on the same (single) desktop environment.

SSH Keys and WSL

It is not possible (at least yet) to use SSH keys that were generated and stored in Windows directories directly from WSL; if you try, you will get the following error:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@         WARNING: UNPROTECTED PRIVATE KEY FILE!          @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions 0777 for 'private_key' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Files within Windows-managed directories are mounted into WSL with 777 permissions, which is too open for SSH keys. Moreover, you cannot change the permissions of these files from WSL. There are two options to solve this issue:
  1. Generate the key within WSL
  2. Copy the key to a directory that is not managed by Windows, convert line endings (via dos2unix), then change its permissions to 600 (see the sketch below)
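A rough sketch of the second option (the key name and Windows user path are hypothetical; adjust them to your setup):

# copy the key out of the Windows mount into the WSL home directory (hypothetical paths)
mkdir -p ~/.ssh
cp /mnt/c/Users/me/.ssh/host1Name ~/.ssh/host1Name
# convert Windows line endings and tighten permissions so SSH accepts the key
dos2unix ~/.ssh/host1Name
chmod 600 ~/.ssh/host1Name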

System Abstractions

Ansible offers modules as abstraction layers for various tasks. For example, in order to configure a directory from the Linux shell, the following commands are commonly used:

mkdir -p /etc/skel/.ssh
chown root:root /etc/skel/.ssh
chmod go-wrx /etc/skel/.ssh
By contrast, the file module in Ansible can be used to achieve the same result with the following code:
- name: Ensure .ssh directory in user skeleton
  file:
    path: /etc/skel/.ssh
    mode: '0700'
    owner: root
    group: root
    state: directory
The file module is an abstraction layer that allows us to perform the same task in a more readable and maintainable way. It also makes it possible to use the same configuration management scripts to manage different operating systems.

Installation

All the major Linux distributions package Ansible. Based on the distribution, this version may be older than the latest stable release. For the latest version, it is recommended to install Ansible using Python's package manager, pip.

Versions - Core vs Community

Starting with version 2.10 (the release after 2.9), Ansible distributes two deliverables: a community package called ansible, and a minimalist language and runtime called ansible-core.

ansible community

  • Uses new versioning (2.10, then 3.0.0)
  • Follows semantic versioning rules
  • Maintains only one version at a time
  • Includes language, runtime, and selected collections
  • Developed and maintained in Collection repositories

ansible-core

  • Uses old versioning (2.11, then 2.12)
  • Does not use semantic versioning
  • Maintains the latest version plus two older versions
  • Includes language, runtime, and built-in plugins
  • Developed and maintained in ansible/ansible repository

Installing with pip

As per Python conventions (though it is more like a requirement by now), we will create a virtual environment for our project. This allows us to install Ansible without affecting the system-wide Python installation. To do this, we navigate to the project directory and run the following commands:

python -m venv .venv --prompt A
source .venv/bin/activate
(A)
Here, we gave the environment a prompt name of A. This is optional, but it makes it easier to see which environment we are in. We can now install the whole community version of Ansible using pip:
pip install ansible
Installing version 2.10 or later installs the community package with all standard collections available, so it is "batteries included".

We will need the following additional packages for some of our advanced playbooks, so we may as well install them now (example install commands follow the list):

  • sshpass: If you are going to connect to a node using username and password over ssh rather than keys, this package needs to be installed via apt.
  • passlib: If you are going to encrypt or hash anything, this package needs to be installed. It can be installed via pip.
  • ansible-lint: When combined with the Ansible plugin in VSCode, it enables command completion and linting features.
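A minimal sketch of installing them, assuming a Debian-based control machine and the virtual environment from above is active:

sudo apt install sshpass          # system package, needed for password-based SSH connections
pip install passlib ansible-lint  # Python packages, installed into the active virtual environment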

Telling Ansible about Servers

Ansible can only manage the servers it explicitly knows about. This information is conveyed by listing them in an *inventory*. The conventional way to hold this information is to create a directory named inventory. Each server needs a name that Ansible will use to identify it. The details can be saved either as `.ini` files (not strictly INI files as defined by Microsoft) or as YAML files. YAML files are more readable and easier to manage, so we will use them.

Test Servers

If you don't have any servers to test with, you can use VirtualBox to create some virtual machines. You can also use Vagrant to automate the creation of the virtual machines. We will work on a real Debian server in the next part of this series.

Inventory File

We will create a file named hosts.yaml under the inventory directory. The file will contain the following content:

all:
    vars:
        ansible_ssh_common_args: '-o StrictHostKeyChecking=accept-new'
    hosts:
        host1Name:
            ansible_host: xxx.xxx.xx.xx
            server_name: host1Name
            user_private_key_file: /home/user/.ssh/host1Name
            vault_file: vaults/host1.yaml
            param1: value1
            param2: value2
            param3: value3
  • The ansible_ssh_common_args variable here tells SSH to automatically accept new host keys (StrictHostKeyChecking=accept-new). This avoids the interactive host key prompt when connecting to a server for the first time.
  • The ansible_host variable is the IP address of the server.
  • The server_name variable is the name of the server.
  • The user_private_key_file variable is the path to the private key file (ssh) that will be used to connect to the server.
  • The vault_file variable is the path to the vault file that will be used to store sensitive information. We will explain vaults in later sections.
  • The param1, param2, and param3 variables are custom variables that can be used in playbooks (a short usage sketch follows this list). These parameters should hold non-sensitive information, as they are stored in plain text. If you need to store sensitive information, use the vault file.
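For illustration, a task can reference these inventory variables with Jinja2 expressions; here is a sketch using the built-in debug module and the variable names defined above:

- name: Show inventory variables for this host
  ansible.builtin.debug:
    msg: "Configuring {{ server_name }} ({{ ansible_host }}) with param1={{ param1 }}"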

Testing Server Files

We can test the connection to the server by running the following command:

ansible -i inventory/hosts.yaml all -m ping
If the connection is successful, we will see the following output:
host1Name | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
If the connection is not successful, we will see an error message. In this case, we can use the -vvv flag to get more information about the error.

The all parameter specifies that we want to run the command on all servers. We can also specify a single server by using the server name instead of all. The "changed": false part of the output means that the command did not make any changes to the server. This is because the ping module is a read-only module that checks if the server is reachable.
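For example, pinging only host1Name with verbose output would look like this:

ansible -i inventory/hosts.yaml host1Name -m ping -vvv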

Basic Ansible Configuration

Ansible looks for the ansible.cfg configuration file in the following order:

  1. File specified by the ANSIBLE_CONFIG environment variable
  2. ./ansible.cfg (in the current directory)
  3. ~/.ansible.cfg (in the home directory)
  4. /etc/ansible/ansible.cfg (in the system directory for Linux)
A common approach is to put the configuration file in the project directory. This way, the configuration file is versioned with the project, and it is easy to see which configuration is being used. It also adds the possibility of having different configurations for different projects.

Ansible uses /etc/ansible/hosts as the default location for the inventory file. Keeping the inventory file in the project directory is a good practice, as it allows us to version the inventory file with the project. This way, we can see which servers are being managed by the project and which configuration is being used.

A very basic configuration file is as follows:

[defaults]
inventory=inventory/hosts.yaml
host_key_checking=False
stdout_callback=yaml
callbacks_enabled=timer

Managing Secrets

Writing secrets (e.g. passwords, tokens, API keys, etc) to the playbook files in plain text is far from secure. Moreover, checking them into version control systems is a big no-no. Ansible provides a way to manage secrets securely using Ansible Vault. It is an encryption tool that allows us to encrypt files containing sensitive information. The encrypted files can be safely checked into version control systems, and only authorized users can decrypt them.

Vaults

The secrets are stored in encrypted yaml files called vaults. We can create a vault file by running the following command:

ansible-vault create inventory/vaults/host1.yaml
Ansible Vault will prompt us to enter a password. This password will be used to encrypt and decrypt the file; if the password is lost, there is no way to recover the contents of an encrypted vault file. Once the password is set, the file opens in the default text editor.

A vault file is nothing but a YAML file that contains variables as key-value pairs. Once all the secrets are entered, saving and closing the file automatically encrypts the contents on disk. To edit a vault file after it has been closed, the following command is used:

ansible-vault edit inventory/vaults/host1.yaml
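For illustration, the decrypted contents of a vault file are plain YAML key-value pairs (the variable names below are hypothetical); when a playbook uses them, the vault password has to be supplied at run time, for example via the --ask-vault-pass option of ansible-playbook:

# inventory/vaults/host1.yaml as seen in the editor - hypothetical secrets
sudo_user_password: "correct-horse-battery-staple"
api_token: "0123456789abcdef"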

Vault Files and Ansible Lint

*ansible-lint* is a command-line tool used for linting and analyzing Ansible playbooks and roles to identify potential issues, best practice violations, and syntax errors. It helps to maintain consistency, readability, and adherence to Ansible best practices across the code base. It is not included with the ansible installation by default, and it does not support installation on Windows systems (works with WSL, though). See the official documentation for installation and configuration. The VSCode Ansible plugin natively supports ansible-lint, and it will complain that it cannot find it if the package is not installed.

ansible-lint cannot directly open vault files, and it will throw a seemingly unrelated error in VSCode; it will highlight the first line (e.g. ---) and say that an internal error has occurred. Using the command line directly yields a better message:

user.yaml:1 ERROR! Attempting to decrypt but no vault secrets found
ansible-lint has no built-in vault password prompt, so the choice is either ignoring this specific rule violation (in which case lint will not check the rest of the file), or decrypting the file for the duration of the linting process:
ansible-vault decrypt vault.yml
This rewrites the vault file in plain text so that the lint tool can read the values in it. Never forget to encrypt the vault again before checking it into version control; see the command below.
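Re-encrypting uses the matching ansible-vault subcommand:

ansible-vault encrypt vault.yml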

Handlers

Handlers in Ansible are tasks that are triggered by other tasks, known as “notifying tasks”, when those tasks report changes. They are particularly useful for performing actions that should occur only when certain changes have been made during the execution of a playbook. Handlers are commonly used to restart services, reload configurations, or perform other actions to ensure that changes made by the playbook take effect immediately.

One of the primary reasons handlers are used is to ensure idempotence and consistency in configurations. By triggering handlers only when necessary, Ansible can avoid unnecessary service restarts or other actions, which helps maintain a consistent state across managed systems.

Handlers provide a way to implement reactive behavior in playbooks; instead of defining actions directly within tasks, which may execute regardless of whether the changes are made, handlers are only executed in response to changes reported by other tasks.

Another advantage of handlers is their ability to aggregate similar actions across multiple tasks. Instead of defining the same action multiple times within different tasks, handlers allow us to centralize the action definition and trigger it from multiple places within the playbook.

Handlers are typically defined in the handlers section of a playbook or in separate handler files, and **are executed after all tasks in a particular play have been executed**. This ensures that a handler is triggered only once, even if multiple tasks report changes that would notify it.
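As a minimal self-contained sketch (the source path is hypothetical), a play with an inline handler looks like this; the handler runs only if the copy task actually changes the file:

- name: Configure Caddy with an inline handler
  hosts: all
  become: true
  tasks:
    - name: Copy Caddy configuration
      ansible.builtin.copy:
        src: files/caddy/Caddyfile   # hypothetical source path
        dest: /etc/caddy/Caddyfile
        mode: '0644'
      notify: Reload caddy server
  handlers:
    - name: Reload caddy server
      ansible.builtin.systemd:
        name: caddy
        state: reloaded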

Defining Handlers in a Separate File

Handlers can be defined in a separate file and included in the main playbook using the include directive. This can help to keep the playbook more organized and easier to read, especially when there are many handlers or when handlers are used across multiple plays.

# handlers.yaml
- name: "Reload caddy server"
  ansible.builtin.systemd:
    name: caddy
    state: reloaded

Then the handler can be included in the main playbook:

# playbook.yaml
- name: Foo
  hosts: all
  become: true
  vars_files:
    - vault.yaml

  handlers:
    - name: Handlers
      ansible.builtin.import_tasks: handlers/handlers.yaml

  tasks:
    - name: Install and configure Caddy
      ansible.builtin.include_tasks: caddy.yaml

Now the handler can be used in any child playbook that is included in the main playbook:

# caddy.yaml
- name: Copy configuration file
  ansible.builtin.copy:
    src: files/caddy/Caddyfile
    dest: /etc/caddy/Caddyfile
    mode: '0644' # -rw-r--r--
  notify: "Reload caddy server"

Important Note: The handler is defined in the main playbook via import_tasks, whereas normal tasks are pulled in via include_tasks. The main reason is that notifying a dynamic include such as include_tasks as a handler results in executing all tasks from within the include, and it is not possible to notify a handler defined inside a dynamic include. Having a static include such as import_tasks as a handler results in that handler being effectively rewritten by the handlers from within that import before play execution. The static include itself cannot be notified; the tasks from within that include, on the other hand, can be notified individually.

Templates

Ansible uses Jinja2 templating to enable dynamic expressions and access variables in playbooks. Templates are files that contain Jinja2 expressions and are used to generate configuration files, scripts, or other files that need to be customized for each managed node. Templates can be used to create files that are specific to each managed node, or to generate configuration files based on variables defined in the playbook.

Templates are typically stored in the templates directory of an Ansible project, and are copied to the managed nodes using the template module. The template module takes a source template file and a destination file, and applies the Jinja2 expressions in the template to generate the destination file.

Templates can contain variables, loops, conditionals, and other Jinja2 expressions that allow for dynamic content generation. This makes it easy to create configuration files that are customized for each managed node, or to generate files based on the values of variables defined in the playbook. The variables a template references can be stored in the inventory file as parameters (as we have seen in the previous sections), or in a vault file if they contain sensitive information.

The template module is used to copy the template file to the managed node. The template file below is copied to the managed node and the Jinja2 expressions are evaluated to generate the destination file:

# Caddyfile template file - variables are enclosed in double curly braces
{{ personal_page_address }} {
    root * /var/www/{{ personal_page_address }}
    file_server
}
# Playbook task to copy the template file
- name: Copy Caddyfile template
  ansible.builtin.template:
    src: "{{ caddy_file }}" # path to the template file, also stored as a variable
    dest: "/opt/caddy/config/Caddyfile" # destination path on the managed node
    owner: "{{ sudo_user_name }}"
    group: "{{ sudo_user_name }}"
    mode: '0644'
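Since templates support loops and conditionals, one template can render many similar blocks. As a hypothetical sketch, a list variable such as proxied_services (not defined in this post) could generate one reverse-proxy entry per service:

# hypothetical Jinja2 loop in a Caddyfile template
{% for svc in proxied_services %}
{{ svc.domain }} {
    reverse_proxy {{ svc.upstream }}
}
{% endfor %}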

Conclusion

In this part, we have covered the basics of Ansible. We have seen what Ansible is, how it can help us, and how to install it. We have also seen how to manage servers with Ansible, how to use vaults to store secrets, and how to use handlers to manage tasks that need to be run only when certain changes are made. In the next part, we will look at how to write playbooks and roles, and how to use Ansible to automate the configuration of servers.