Kubernetes + IAM on AWS: The Best Creds are No Creds
Alternatives to the overwhelming number of credentials required by modern software infrastructure.
This article examines two specific options that came up while working with a client that is moving its platform onto AWS, and makes the case for addressing the question at the design phase.
For most organizations, moving infrastructure onto a cloud provider also means adopting some of that provider's managed offerings (e.g. Kinesis or DynamoDB on AWS). Integrating existing software into the cloud ecosystem also means making a decision on security. You can assume that the cloud is inherently, automatically secure (hint: it’s not) and run things pretty much wide open, or you can learn how to apply tools like IAM to limit each piece of the application to the minimal set of permissions, on the minimal set of objects, that it needs to do its job.
This can generate another set of challenges. As modern application architecture tends towards more, smaller units of functionality, the logical outcome is a proliferation of secrets to be managed. On AWS, the path of least resistance for developers is to create an IAM user for each service and then generate credentials for it. The model is already well understood and, for a dev team trying to get out of the gate, takes only a couple of minutes to set up. Secrets have to be managed, though, and the choice of how to do so has to balance ease of use and adoption by developers against administrative overhead, an expanding attack surface, and the risk of making the application dependent on another external system.
Automate credential handling
A clear contender for “best Kubernetes hacks” is KIAM (which is just short for “Kubernetes + IAM”). KIAM is a service that runs on AWS-based clusters. It allows developers to simply annotate the Pods they deploy with the role they want applied, and then handles the management of credentials automatically.
It leverages the AWS instance metadata endpoint: requests that any container on the node makes to the IAM-specific part of that endpoint are redirected to the KIAM server Pod(s), basically using the man-in-the-middle pattern for good instead of evil. The server itself runs as a role that has permission to assume other IAM roles and get short-lived credentials back, which it then passes along to the requesting container. These STS tokens expire in a relatively short period of time (the default is 60 minutes). The KIAM server keeps track of token lifecycles and pre-fetches fresh tokens so that containers receive them before their retry mechanisms kick in.
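The lifecycle tracking described above can be sketched in a few lines. This is a minimal illustration and not KIAM's actual implementation: the `CredentialCache` class, the `fetch` callable (a stand-in for an STS AssumeRole call) and the five-minute refresh margin are all assumptions made for the example, and the sketch renews on access rather than in a background loop.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Credentials:
    access_key: str
    expires_at: float  # Unix timestamp at which the credentials stop working


class CredentialCache:
    """Caches per-role credentials and renews them shortly before expiry."""

    def __init__(
        self,
        fetch: Callable[[str], Credentials],
        refresh_margin: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch            # hypothetical stand-in for an STS AssumeRole call
        self._margin = refresh_margin  # assumed margin: renew this many seconds early
        self._clock = clock            # injectable so the logic can be tested
        self._cache: Dict[str, Credentials] = {}

    def get(self, role: str) -> Credentials:
        """Return cached credentials, fetching new ones if close to expiry."""
        creds = self._cache.get(role)
        if creds is None or creds.expires_at - self._clock() <= self._margin:
            creds = self._fetch(role)
            self._cache[role] = creds
        return creds
```

The point of the margin is that a caller never receives credentials that are about to expire, so application-level retry logic is never triggered by an expired token.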
It also allows each Kubernetes namespace to be scoped to a string-match of allowed roles, making it possible to safely delegate the creation and management of these roles to individual teams. Designed and implemented carefully across an organization, this can completely remove the need to distribute, store and periodically renew the credentials used by applications. It makes it easier to involve development teams in security at the design phase, and it essentially reduces security to a service that is easily accessed rather than functionality that needs to be built for each application individually.
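The namespace check can be pictured as a simple pattern match. The sketch below is illustrative only: the `role_permitted` function, the example patterns and the choice of a full-string regular expression match are assumptions, not a description of how KIAM evaluates its rules.

```python
import re


def role_permitted(namespace_pattern: str, role_name: str) -> bool:
    """Return True if role_name matches the pattern allowed for a namespace.

    namespace_pattern is a hypothetical per-namespace setting, e.g. "team-a-.*".
    """
    if not namespace_pattern:
        # Assumed policy: a namespace without a pattern may not use any role.
        return False
    # fullmatch requires the whole role name to match, not just a substring.
    return re.fullmatch(namespace_pattern, role_name) is not None
```

With a pattern such as `team-a-.*` on a team's namespace, that team can create and use any role under its own prefix, while a request for `team-b-reader` from the same namespace would be refused.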
Use mutual TLS
Most organizations will have various pieces of their stack that communicate with each other but can’t be protected by IAM, because they run on top of AWS rather than being part of it. The situation is similar, though, in that for a team looking to get started on building a product, using something like HTTP basic auth is the path of least resistance.
In some ways, this is even worse than credentials generated via IAM, because IAM at least provides a centralized place to see how long a credential has been in use and to manage it. Credentials that are embedded into various applications are even harder to manage.
As with KIAM, consider replacing managed credentials with a service, in this case an on-demand certificate issuing authority. Systems that are part of a common service can be issued certificates and keys from an intermediate CA and configured to only accept communication from systems that present a cert from the same CA. This both answers the question of authentication and encrypts the transport channel between all endpoints. Examples of services that can fill this role are Vault or the cert-manager service that runs on Kubernetes.
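On the receiving side, "only accept communication from systems that present a cert from the same CA" comes down to a TLS configuration that requires and verifies client certificates. The following is a minimal sketch using Python's standard `ssl` module. The function name and file path parameters are hypothetical, and the paths are optional only so the sketch can be exercised without real certificate files.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a server-side TLS context that rejects clients without a valid cert."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Require every client to present a certificate during the handshake.
    context.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Trust only the shared intermediate CA, so only its certs are accepted.
        context.load_verify_locations(cafile=ca_file)
    if cert_file and key_file:
        # The server's own certificate and key, issued by the same CA.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

In a real deployment all three files would be supplied, typically issued and rotated by the certificate service, and the resulting context would be used to wrap the server's listening socket.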
The common pattern here is one GenUI advocates for regardless of whether it’s security-related or not: replace troublesome coding problems with infrastructure-level services. It’s a great example of DevOps thinking: it keeps developers working on meaningful problems and delivers a more consistent and polished product to customers.