Building an open-source TACACS+ server. Part I – Introduction

This is the first post in a planned multi-post series on deploying a TACACS+ server using open-source components. I hope to create a complete guide for such a solution, while also delving reasonably deep into TACACS+ itself, LDAP set-up and the various vendor-specific configuration needed for this to work. In this post, I will describe my solution and what I was aiming to achieve. I will also give a brief overview of the TACACS+ protocol that will hopefully make future posts easier to follow.

I deployed this on Rocky Linux 9 (a bug-for-bug clone of Red Hat Enterprise Linux 9), because that’s what I had to work with, but any modern Linux distro should do. You might need to adjust package names and some paths. OpenLDAP configuration might also differ between distros, so please take this into account. If I have time, I will try to replicate this on Debian, which is my personal distro of choice.

Note: At the time of this writing, I only have a PoC tested against multiple lab devices and stress-tested to determine performance limits and VM sizing. As the solution is deployed to production, I will update these posts.

Background and requirements

I’ve recently volunteered (yes, really!) to prepare a replacement for my company’s aging TACACS+ server. Turns out, there aren’t really many TACACS+ implementations out there and one has to either:

1. buy a commercial product like Cisco ISE or Aruba ClearPass, or
2. cobble a solution together from various open-source projects.

With the budget being around zero, option 1 was not really in the cards. Since I like tinkering and I’m a big supporter of open-source philosophy, I was happy to go with option 2. The requirements:

Free (as in beer), because we have no budget
Free (as in speech), because it is simply right and proper to use and contribute to open source
Support for devices from multiple vendors (Cisco, Juniper, F5, Fortigate at minimum)
Permissions defined per-device and per-user
Redundancy
Self-service for users
GUI for adding/editing users
Principle of least privilege
Mid-sized deployment: number of devices in the high hundreds, dozens of users.

The solution in a nutshell

The solution I put together is based on the following components:

tac_plus-ng – the only, I believe, actively maintained open-source TACACS+ server. Great work by Marc Huber.
OpenLDAP with mirror replication
LDAP Tool Box Self Service Password
phpLDAPadmin

The PoC I created consists of two small VMs, each running its own instance of every component above. tac_plus-ng handles incoming TACACS+ requests, while user credentials, group membership information and other data are stored in the OpenLDAP directory and replicated between the two instances. This replication preserves data consistency even if data is modified while one instance is offline. The setup is redundant both at the VM level (you can shut down one VM and the other will happily handle incoming requests) and at the component level (if one tac_plus-ng instance fails, the other will handle requests; if an OpenLDAP instance fails, tac_plus-ng will keep using the other).
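
To illustrate the failover idea from the client side, here is a minimal Python sketch that walks a list of servers and returns the first one accepting a TCP connection. It is not part of my setup and it only proves that the port is open, not that tac_plus-ng or OpenLDAP behind it is healthy. The function name and the host names in the usage note are hypothetical.

```python
import socket
from typing import List, Optional, Tuple

TACACS_PORT = 49  # IANA-assigned TACACS+ port


def first_reachable(
    servers: List[Tuple[str, int]], timeout: float = 2.0
) -> Optional[Tuple[str, int]]:
    """Return the first (host, port) that accepts a TCP connection, or None."""
    for host, port in servers:
        try:
            # create_connection raises OSError on refusal, timeout or DNS failure
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue  # this server is unreachable, try the next one
    return None
```

A call such as `first_reachable([("tacacs1.example.net", TACACS_PORT), ("tacacs2.example.net", TACACS_PORT)])` would prefer the first VM and fall back to the second, which is roughly how a network device with two configured servers behaves.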

There’s quite a bit of setup involved, especially with OpenLDAP, which looked like black magic to me at first, until I figured out the LDAP hierarchy and the permission model. To avoid anonymous binds, both tac_plus-ng and the LDAP tools use dedicated service accounts with limited permissions. Each LDAP tool talks to the OpenLDAP instance on its own host and the changes are then replicated to the other instance.

What is TACACS+?

TACACS+ is the industry standard (defined in RFC 8907) for network device AAA, which in turn stands for:

Authentication – Is the user really who they say they are?
Authorization – What is the user allowed to do?
Accounting – Report what the user is actually doing.

TACACS+ runs over TCP and the IANA-assigned port is 49.

When you manage a network device (called a Network Access Server, or NAS, in TACACS+ parlance), the following happens (simplified):

1. Your SSH client establishes a connection to the SSH server on the NAS and sends your credentials (username and password).
2. The NAS opens a connection to the TACACS+ server and sends an authentication request containing the credentials from step 1.
3. The TACACS+ server checks the credentials against its database and sends an authentication response with either PASS or FAIL.
4. If the NAS has received PASS in step 3, it then sends an authorization request with the username, a service name (used to distinguish vendors) and the command (probably empty at this stage).
5. The TACACS+ server sends an authorization response with a PASS or FAIL and possibly a set of attribute-value (AV) pairs.
6. If the authorization response is PASS, the NAS lets the user in and assigns privileges according to the AV pairs received.
7. Cisco and Cisco-like devices also authorize each command separately. When you hit ENTER in the CLI, the NAS sends an authorization request and then either allows or prohibits the command depending on the response from the TACACS+ server.
8. The NAS also sends information on events (usually commands executed) via an accounting request. The TACACS+ server acknowledges this via an accounting response.
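
Every request and response in the steps above travels in a packet that starts with the same 12-byte header defined in RFC 8907: a version byte, a packet type (authentication, authorization or accounting), a sequence number, flags, a session ID and the body length. The Python sketch below packs and parses that header. The class and method names are my own choice for illustration; only the constants and the field layout come from the RFC.

```python
import struct
from dataclasses import dataclass

# Packet types and flags as defined in RFC 8907
TAC_PLUS_AUTHEN = 0x01
TAC_PLUS_AUTHOR = 0x02
TAC_PLUS_ACCT = 0x03
TAC_PLUS_UNENCRYPTED_FLAG = 0x01
TAC_PLUS_SINGLE_CONNECT_FLAG = 0x04

# version, type, seq_no, flags (1 byte each), session_id, length (4 bytes each),
# all in network byte order: 12 bytes in total
HEADER = struct.Struct("!BBBBII")


@dataclass
class TacacsHeader:
    major_version: int  # always 0xC for TACACS+
    minor_version: int  # 0 (default) or 1
    packet_type: int
    seq_no: int         # starts at 1, incremented with every packet in a session
    flags: int
    session_id: int     # random 32-bit value chosen by the client
    length: int         # length of the body that follows the header

    def pack(self) -> bytes:
        # Major version sits in the high nibble, minor version in the low nibble
        version = (self.major_version << 4) | self.minor_version
        return HEADER.pack(version, self.packet_type, self.seq_no,
                           self.flags, self.session_id, self.length)

    @classmethod
    def unpack(cls, data: bytes) -> "TacacsHeader":
        version, packet_type, seq_no, flags, session_id, length = \
            HEADER.unpack(data[:HEADER.size])
        return cls(version >> 4, version & 0x0F, packet_type,
                   seq_no, flags, session_id, length)
```

For example, `TacacsHeader(0xC, 0, TAC_PLUS_AUTHEN, 1, 0, 0x12345678, 40).pack()` produces the header of the first packet of an authentication session with a 40-byte body.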

Authentication, authorization and accounting can be configured and used separately. It is, for example, perfectly possible to only use TACACS+ for authentication and deal with authorization locally. It is also possible, even if it makes less sense to do so, to authenticate locally and authorize via TACACS+. The same goes for accounting, which can be done completely independently of the other two.

Authentication and accounting are pretty much vendor-independent, but the authorization process has to take into account what kind of NAS is authorizing the user. This will be explained in detail in a later post on authorization, covering Cisco, Juniper, F5 and Fortigate (possibly others if I manage to get my hands on more lab equipment).
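
As a small preview, the AV pairs exchanged during authorization are plain strings. In RFC 8907 the name is separated from the value by `=` when the attribute is mandatory and by `*` when it is optional. The Python sketch below splits such a string; the function and class names are hypothetical, and the sample strings in the usage note are only illustrative, since the actual attributes depend on the vendor.

```python
from typing import NamedTuple


class AVPair(NamedTuple):
    name: str
    value: str
    mandatory: bool  # True for "name=value", False for "name*value"


def parse_av_pair(text: str) -> AVPair:
    """Split an AV pair at its first '=' or '*' separator."""
    for index, char in enumerate(text):
        if char in "=*":
            # Everything after the first separator is the value,
            # even if it contains further '=' or '*' characters
            return AVPair(text[:index], text[index + 1:], char == "=")
    raise ValueError(f"no separator found in AV pair: {text!r}")
```

With this, `parse_av_pair("priv-lvl=15")` yields a mandatory attribute `priv-lvl` with value `15`, and `parse_av_pair("idletime*30")` yields an optional `idletime` attribute. An empty value, as in `cmd=`, is valid too.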

TACACS+ is pretty insecure. The payload of a TACACS+ PDU can be “encrypted”, but this is done by merely XORing the payload with a pad built from a chain of MD5 hashes over the session ID, a pre-shared key (PSK), the version and the sequence number. Since MD5 is cryptographically weak, brute-forcing the PSK is possible, at least for shorter keys. Because there is no integrity protection, an obfuscated payload can also be modified in transit, at least to some extent. Furthermore, if an attacker obtains the PSK, they can decrypt any message obfuscated with it (no perfect forward secrecy), and the key can be easily extracted from network device config files. The weaknesses are serious enough that RFC 8907 (published in 2020) uses the term “obfuscation” instead of “encryption”.
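
To show how thin this protection is, here is a minimal Python sketch of the obfuscation as I read it in RFC 8907, section 4.5. The pad is a chain of MD5 hashes: the first block hashes the session ID, the key, the version byte and the sequence number, and every following block hashes the same input plus the previous block. The function names are mine and the code is meant for illustration only, not as a tested implementation.

```python
import hashlib
import struct


def pseudo_pad(session_id: int, key: bytes, version: int,
               seq_no: int, length: int) -> bytes:
    """Build the MD5-based pad, truncated to the requested length."""
    # session_id is 4 bytes in network byte order; version and seq_no are 1 byte each
    prefix = struct.pack("!I", session_id) + key + bytes([version, seq_no])
    pad = b""
    previous = b""  # the first block has no previous hash appended
    while len(pad) < length:
        previous = hashlib.md5(prefix + previous).digest()
        pad += previous
    return pad[:length]


def obfuscate(body: bytes, session_id: int, key: bytes,
              version: int, seq_no: int) -> bytes:
    """XOR the body with the pad. The same call also reverses the operation."""
    pad = pseudo_pad(session_id, key, version, seq_no, len(body))
    return bytes(b ^ p for b, p in zip(body, pad))
```

Since XOR is its own inverse, calling `obfuscate` twice with the same parameters returns the original body. It also means that flipping a bit in the obfuscated payload flips exactly the same bit in the cleartext, which is why tampering in transit is possible.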

Since there isn’t really any alternative protocol (save for RADIUS in a very limited role) for AAA, we just have to live with this. The only mitigation is to run TACACS+ over a secure network, preferably an out-of-band management network. There is a draft RFC for TLS support for TACACS+ and support is trickling in at least for Cisco Nexus (since NX-OS 10.6(1)F) and Catalyst (since IOS XE 17.18.1), but (a) these release trains are way too fresh to be used in production and (b) the feature itself should be considered experimental. tac_plus-ng has support for TACACS+ over TLS and I will gladly try it, but this will not happen soon.

Now, on to the actual build and configuration!
