
Fixing Ansible Configuration Errors


Errors are part and parcel of every software system, no matter how robust or efficient it is. This article provides basic troubleshooting tips for some common Ansible configuration errors. Note that some error messages do not point directly to the root cause; such errors may need deeper investigation.

Overview

Ansible is a simple yet effective open-source tool for deploying, managing, and orchestrating multi-node software deployments. It also handles configuration management and the rollout of changes to a system. Ansible is generally regarded as useful, and some would say it is better than similar tools. Still, the unexpected happens and errors occur. Each Ansible configuration issue below is described by first identifying the root cause and then giving troubleshooting tips that may address the problem. Before that, let us look at some of the prerequisites of an Ansible configuration.

Prerequisites:

These are the prerequisites for running Ansible properly on your machine. Knowing them helps you identify solutions to the problems that may arise.

  • Ansible needs to be installed on only one machine (known as the Control Machine), and from it Ansible can manage multiple machines via the SSH protocol.
  • The Control Machine needs to have Python installed: Python 2.6 or 2.7 for the Ansible releases this article was written against (current Ansible releases require Python 3). As for the required OS, Debian, Red Hat, OS X, CentOS, or any of the Berkeley Software Distribution (BSD) variants are supported. Windows is not supported as a Control Machine.
  • The Control Machine communicates with other machines via SSH. Usually, it transfers files with sftp. If sftp is not available, you can switch to scp in ansible.cfg.

Time to review common errors in Ansible configuration!


Fixing Issues with Ansible Configuration

Issue 1

Ansible reports a syntax error caused by whitespace when you run a playbook that copies a script to the Oracle production servers and then adds the script to chkconfig (so that the script starts automatically at boot). The error message could look like the one below:

ERROR: Syntax Error while loading YAML script, ppili.yml

Note: The error may actually appear before this position: line 4, column 12

   - name: Copying ppili script
    copy: src=/ds1/scripts/ppili dest=/etc/init.d/ppili owner=root group=rootuser  

Root cause: As can be seen from the error message, the root cause is the indentation of the - name entry: the copy key on the following line is not aligned with name.

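Troubleshooting: You need to fix the indentation in the - hosts section so that the keys of each task line up. Because the script is installed as root, the play also needs su privileges. You can set them in the playbook, as in the correctly indented example below, or pass them on the command line (newer Ansible releases replace the su options with become):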

- hosts: oracle.com
  gather_facts: False
  su: yes
  su_user: rootuser
  tasks:
    - shell: russell

Or

ansible-playbook --su --su-user=root --ask-su-pass playbook.yml
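
The indentation rule behind this error can be checked mechanically. The Python sketch below is a rough heuristic, not a YAML parser: it flags any line whose indentation falls between a list dash and the first key after that dash, which is the situation in the failing task above. The function name find_misaligned_keys and the sample playbook text are assumptions made for illustration; they are not part of Ansible, and multi-line scalars could produce false positives.

```python
def find_misaligned_keys(yaml_text):
    """Return 1-based numbers of lines indented between a list dash and its first key."""
    problems = []
    dash_col = key_col = None
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments carry no structure
        indent = len(line) - len(line.lstrip(" "))
        # A key that belongs to the current list item must start at key_col;
        # anything strictly between the dash and key_col cannot be parsed.
        if dash_col is not None and dash_col < indent < key_col:
            problems.append(number)
        if stripped.startswith("- "):
            rest = stripped[1:]
            dash_col = indent
            key_col = indent + 1 + (len(rest) - len(rest.lstrip(" ")))
    return problems


# Hypothetical playbook text that reproduces the misalignment from Issue 1.
broken = (
    "- hosts: oracle.com\n"
    "  tasks:\n"
    "   - name: Copying ppili script\n"
    "    copy: src=/ds1/scripts/ppili dest=/etc/init.d/ppili\n"
)
print(find_misaligned_keys(broken))
```

Running a check like this before ansible-playbook is optional; ansible-playbook --syntax-check reports the same class of problem using the real parser.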

Issue 2

The system fails to find the required Python modules. In such a case, the error message could look something like this:

failed: [somehost] => {"failed": true, "parsed": false}
invalid output was: Error: ansible requires a json module, none found!

Root cause: The system was unable to locate the required Python modules.

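Troubleshooting: Install the missing Python module on the target host with the help of the raw module, which runs the command directly over SSH and therefore does not depend on the missing module. You need to name the target host (somehost in the error above) in the command:

ansible somehost -m raw -a "yum -y install python-simplejson"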

Issue 3

The hosts file contains a mistake, or the target machine has been shut down. The error message in such a case could look like the one below:

fatal: [websrvr01] => {'msg': 'FAILED: [Errno -2] Name or service not known', 'failed': True}

Root cause: There may be a typo in the hosts file, or someone has shut the server down.

Troubleshooting: Verify that the server is running, and check that the host name in the hosts file is spelled correctly and resolves in DNS.
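
The name-resolution half of this check can be scripted. The following is a minimal Python sketch, assuming you already have the list of host names from your hosts file. The function name unresolvable_hosts and the injectable resolve parameter are illustrative choices, not part of Ansible; the default resolver is the standard library's socket.gethostbyname.

```python
import socket


def unresolvable_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the host names for which name resolution fails."""
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


if __name__ == "__main__":
    # "websrvr01" is the host from the error message above.
    print(unresolvable_hosts(["localhost", "websrvr01"]))
```

A host that resolves but is powered off will not be caught by this sketch; it only covers the "Name or service not known" case.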

Issue 4

Login issues because of incorrect SSH keys.

Root cause: This is a common issue. You are either passing the wrong key, or the key that you are passing has not been added to the SSH agent, if you are using one.

Troubleshooting: First find out which SSH keys have already been added to the SSH agent. To do that, you could use the command $ ssh-add -l and see an output like the one below:

2049 06:c9:5c:14:de:83:00:94:ec:15:e5:c9:4e:86:4f:a6 /Usersroot/speters/devroot/projectsroot/devops/ansible/keys/ansible (RSA)
2044 dd:3b:b8:2e:85:04:06:e9:ab:ff:a8:0a:c0:04:6e:d6 /Usersroot/speters/.vagrant.d/insecure_private_key (RSA)

Tip: If there are keys that you use frequently, consider defining aliases for them in a custom .ssh/config so that Ansible picks them up automatically.


Issue 5

Login issues due to missing key.

Root cause: If a user wants to access the host with a key pair, the user's public key must be available on the server so that the connection can be authenticated. For an SSH connection, the key must be present in the user's .ssh/authorized_keys file. Any deviation from this practice will lead to an error.

Troubleshooting: To add the key to the .ssh/authorized_keys file, use the following commands:

cd ~user
cat newkey.pub >> .ssh/authorized_keys

Alternatively, you can use the ssh-copy-id command from your local machine.

Note that adding keys manually can be tedious and error-prone. Ideally, you should automate the task with an Ansible role.
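
To illustrate what such automation has to get right, here is a minimal Python sketch of the manual step above, written so that running it twice does not add the key twice. The function name add_authorized_key and its parameters are assumptions for illustration; in a real setup an Ansible role would do this work on the target host.

```python
import os


def add_authorized_key(public_key, ssh_dir):
    """Append public_key to ssh_dir/authorized_keys unless it is already listed.

    Returns True if the key was added and False if it was already present.
    """
    key = public_key.strip()
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, "authorized_keys")
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if key in (line.strip() for line in content.splitlines()):
        return False
    with open(path, "a") as handle:
        if content and not content.endswith("\n"):
            handle.write("\n")  # keep the new key on its own line
        handle.write(key + "\n")
    os.chmod(path, 0o600)  # restrictive permissions, which sshd expects by default
    return True
```

The return value makes it easy to report whether anything changed, which mirrors how Ansible tasks distinguish "changed" from "ok".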

Issue 6

The SSH agent is not running. It may be difficult to identify this cause immediately because Ansible only provides a generic failure message.

Root cause: An SSH agent that is not running may be the cause behind many generic error messages.

Troubleshooting: Verify that the SSH agent is running. To do that, run the export command shown on the first line below and check that SSH_AGENT_PID and SSH_AUTH_SOCK appear in the output:

export | grep SSH
SSH_AGENT_PID=14543
SSH_AUTH_SOCK=/tmp/ssh-U4z3bbdQJiqx/agent.14543
SSH_CLIENT='192.168.10.26 59808 11'
SSH_CONNECTION='192.169.10.26 59112 10.0.32.108 22'
SSH_TTY=/dev/pts/0
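
The same check can be done from a script. The Python sketch below assumes that a usable agent is one whose SSH_AUTH_SOCK variable is set and points to an existing path; it does not look at SSH_AGENT_PID, because that variable is not set when the agent is forwarded from another machine. The function name ssh_agent_problems is hypothetical.

```python
import os


def ssh_agent_problems(environ=None):
    """Return a list of human-readable problems with the SSH agent environment."""
    env = os.environ if environ is None else environ
    problems = []
    sock = env.get("SSH_AUTH_SOCK")
    if not sock:
        problems.append("SSH_AUTH_SOCK is not set")
    elif not os.path.exists(sock):
        problems.append("SSH_AUTH_SOCK points to a missing socket: " + sock)
    return problems


if __name__ == "__main__":
    for problem in ssh_agent_problems():
        print(problem)
```

An empty result only means the environment looks sane; running ssh-add -l is still the way to confirm that the agent answers and holds the expected keys.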

Issue 7

Unknown failures.

Root cause: Ansible, for all its reputation as a robust system, can also run into unknown, unidentified errors.

Troubleshooting: You can use Ansible's verbose and debug output, which is an effective way to find difficult-to-spot errors. For example, running the command with the -vvvv option enables connection debugging and shows you the users and the scripts that are being executed.

Issue 8

Sudo failure.

Root cause: You have changed the host name of the target host but have not changed its local host entry. Sudo looks up the host name and tries to match it against the entries in /etc/hosts. In case of a mismatch, you get an error message.

Troubleshooting: If you have recently changed the host name of the target, verify that you have also updated the corresponding entry in /etc/hosts on that host.
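
A quick way to confirm the match is to look for the current host name among the names listed in /etc/hosts. The Python sketch below does this on hosts-file text that you pass in; the function name hostname_in_hosts and the sample usage are assumptions for illustration.

```python
import socket


def hostname_in_hosts(hostname, hosts_text):
    """Return True if hostname is listed as a name or alias in hosts-file text."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        # fields[0] is the IP address; the rest are the canonical name and aliases.
        if hostname in fields[1:]:
            return True
    return False


if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        print(hostname_in_hosts(socket.gethostname(), handle.read()))
```

Run on the target host, a False result suggests that the hosts file still carries the old name.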

Issue 9

When you execute an Ansible playbook, the control machine reports that an Ansible for Junos OS module is not a legal parameter. The error message could look like the one below:

ERROR: junos_install_config is not a legal parameter in an Ansible task or handler

Root cause: The Ansible control machine is unable to find the Ansible for Junos OS modules.

Troubleshooting: First, you need to download the Ansible for Junos OS modules from the Ansible Galaxy website. To do so, run the ansible-galaxy install command and specify the Juniper.junos role. The command could look like the one below:

 [root@ansible-cm]# ansible-galaxy install Juniper.junos

To enable the playbook to access and reference the installed modules, include the Juniper.junos role in the playbook play. The play could look something like the one below:

---
- name: Get Device Facts
  hosts: hostname
  roles:
  - Juniper.junos
  connection: local

Conclusion:

Note that error messages may be ambiguous and misleading at times. What is stated in the error message may not always reflect what is wrong, although these messages are always good starting points. Even the error messages described above may vary depending on many factors. The good thing about cracking problems is that troubleshooting a number of issues gives you experience that you can use to fix other ones. However, you will be able to prevent a lot of issues if you:

  • keep the host name and the host entries synced,
  • have the required version of Python installed, and
  • make sure the SSH agent is running and holds the keys you need.

These three steps can prevent many Ansible configuration issues right from the start.



Author’s Bio:

Kaushik Pal has more than 16 years of experience as a technical architect and software consultant in enterprise application and product development. He is interested in new technology and innovation, as well as technical writing. His main focus is web architecture, web technologies, Java/J2EE, open source, big data, cloud, and mobile technologies. You can find more of his work at www.techalpine.com and you can email him at techalpineit@gmail.com or kaushikkpal@gmail.com



