Although there are a lot of instructions available, I haven't found a straightforward way of deploying a container whose image is hosted in a private ECR registry to a Kubernetes cluster. In this short article, I would like to share a sequence of steps that can be used to perform such a deployment.
Prerequisites
Make sure that the machine that is going to be used to perform the deployment (whether it's your local machine or, more likely, a CI/CD environment) has aws-cli installed and configured with a proper access key id and secret access key so that it has access to pull an image. More info in the official AWS article.
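A quick sanity check that the CLI is configured and can obtain an ECR token might look like this (the region value is a placeholder):

# interactive setup of access key id, secret access key and default region
aws configure
# verify the credentials resolve to the expected account
aws sts get-caller-identity
# verify an ECR authorization token can be obtained
aws ecr get-login-password --region <aws_region> > /dev/null && echo "ECR auth OK"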
Overview
As a good practice, it's always best to deploy different sets of applications in separate namespaces. We will create the namespace manually via the kubectl create command (for reasons I explain later).
From the application standpoint, we will consider a minimalistic "health check application" that replies with status: ok as an HTTP response. The Kubernetes manifest file consists of the following objects:
- service (exposed via NodePort)
- deployment (containing the application)

manifest.yml:
apiVersion: v1
kind: Service
metadata:
  name: health-check-service
  namespace: health-check
spec:
  type: NodePort
  ports:
    - port: 3000
  selector:
    app: node-hello-world-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-hello-world-deployment
  namespace: health-check
  labels:
    app: node-hello-world-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-hello-world-app
  template:
    metadata:
      labels:
        app: node-hello-world-app
    spec:
      containers:
        - name: node-hello-world-app-container
          image: <aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/nodejs-hello-world
          imagePullPolicy: Always
          ports:
            - name: web
              containerPort: 3000
      imagePullSecrets:
        - name: regcred
Besides the familiar look of the service and deployment definitions, there are a couple of items that need to be highlighted:
- ECR Image Registry URL: <aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/<image-name>:<tag> (a push example follows this list)
  - <aws_account_id> - your account id, e.g. e9ae3c220b23
  - <aws_region> - AWS region name (examples here)
  - <image-name> - image name
  - <tag> - image tag, usually defines a version, or simply use latest
- Image Pull Policy: imagePullPolicy: Always forces Kubernetes to pull the image from the remote repository on every pod start, which avoids unexpected issues with a stale, locally cached image.
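For illustration, assuming the nodejs-hello-world repository already exists in ECR, pushing a locally built image to that URL looks roughly like this (account id and region are placeholders):

# authenticate the local Docker daemon against the private registry
aws ecr get-login-password --region <aws_region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com
# tag the local image with the full ECR URL and push it
docker tag nodejs-hello-world:latest <aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/nodejs-hello-world:latest
docker push <aws_account_id>.dkr.ecr.<aws_region>.amazonaws.com/nodejs-hello-world:latest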
Deployment steps
Create the namespace via the kubectl create command:

kubectl create namespace health-check
The reason we create the namespace manually and not in the above manifest file is that in the next step we have to create a secret within this namespace. This is important since Kubernetes secrets are scoped to a specific namespace.
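As a quick illustration of that scoping (just a sanity check, not part of the deployment): the namespace can be verified right away, and once the secret from the next step exists it will only be visible inside health-check:

kubectl get namespace health-check
# after the next step, the secret shows up only in its own namespace:
kubectl get secret regcred --namespace health-check
# the same lookup in any other namespace (e.g. default) returns NotFound
kubectl get secret regcred --namespace default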
Next, the secret is generated via the command line using aws ecr, which is outside of the kubectl ecosystem. Create a registry secret within the above namespace that will be used to pull an image from the private ECR repository:

kubectl create secret docker-registry regcred \
  --docker-server=${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com \
  --docker-username=AWS \
  --docker-password=$(aws ecr get-login-password) \
  --namespace=health-check
This command utilizes the aws-cli aws ecr get-login-password command and saves the generated credentials in a special docker-registry secret type. More info about it is in the official Kubernetes docs. Please note that the username is always set to AWS for all accounts.
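To double-check that the secret landed in the right namespace and points at the expected registry, its .dockerconfigjson payload can be decoded:

kubectl get secret regcred --namespace health-check
# decode the docker config stored inside the secret to see the registry URL and username
kubectl get secret regcred --namespace health-check \
  --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode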
Deploy the manifest file using the kubectl apply -f command:

kubectl apply -f manifest.yml
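Before hitting the endpoint, it helps to confirm that the rollout actually succeeded and to look up the node port that Kubernetes assigned to the service (the resource names come from the manifest above):

kubectl rollout status deployment/node-hello-world-deployment --namespace health-check
kubectl get pods --namespace health-check
# the node address is shown by "kubectl get nodes -o wide";
# the assigned node port is the second number in the PORT(S) column, e.g. 3000:3xxxx/TCP
kubectl get service health-check-service --namespace health-check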
Using the http command (HTTPie), I can verify that my deployment is working:
$ http <cluster-ip>:<node-port>
HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 15
Content-Type: application/json; charset=utf-8
Date: Mon, 15 Mar 2021 16:47:54 GMT
ETag: W/"f-VaSQ4oDUiZblZNAEkkN+sX+q3Sg"
Keep-Alive: timeout=5
X-Powered-By: Express
{
"message": "Hello World!"
}
One-liner for CI/CD pipeline
If you need to automate the deployment via CI/CD or simply would like to use a one-line command, here it is:
NAMESPACE_NAME="health-check" && \
kubectl create namespace $NAMESPACE_NAME || true && \
kubectl create secret docker-registry regcred \
  --docker-server=${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com \
  --docker-username=AWS \
  --docker-password=$(aws ecr get-login-password) \
  --namespace=$NAMESPACE_NAME || true && \
kubectl apply -f manifest.yml
Clarification:
- The NAMESPACE_NAME variable is pulled out for reuse. It's also scoped only to the current bash session.
- The || true bash statement is added to ignore errors for the already created namespace and secret when reusing the same command to re-apply changes from the manifest file.
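If swallowing errors with || true feels too blunt, an alternative is to make both creations idempotent by piping a client-side dry-run into kubectl apply (this requires a kubectl version that supports --dry-run=client):

NAMESPACE_NAME="health-check" && \
kubectl create namespace $NAMESPACE_NAME --dry-run=client -o yaml | kubectl apply -f - && \
kubectl create secret docker-registry regcred \
  --docker-server=${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com \
  --docker-username=AWS \
  --docker-password=$(aws ecr get-login-password) \
  --namespace=$NAMESPACE_NAME \
  --dry-run=client -o yaml | kubectl apply -f - && \
kubectl apply -f manifest.yml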
Update - AWS ECR Token refresh
Even though it's possible to succeed with the installation following the steps above, there is one caveat: ECR rejects stale tokens, i.e. tokens obtained more than 12 hours ago. As was mentioned in the comment below by Patrick McMahon:
... if you have deployed an application and the scheduler decides to move the pod to another node after 12 hours you will get an 'ImagePullBackOff' error as the authentication to ECR no longer works
There are two possible ways to approach the problem (there might be more, but these are the two that I came up with):
The first option is to use a cron job on the host OS (or some other remote machine that has access to the cluster) to automatically keep the token refreshed. The downside is that if the host OS goes down, this approach may stop working until fixed. Here are the steps:
# Create a log file that the cron job will write its output to
sudo touch /var/log/aws-ecr-update-credentials.log
# Make the current user the owner of the file so that the cron job running under this account can write to it
sudo chown $USER /var/log/aws-ecr-update-credentials.log
# Create an empty file where the script will reside
sudo touch /usr/local/bin/aws-ecr-update-credentials.sh
# Allow the cron job to execute the script under the user
sudo chown $USER /usr/local/bin/aws-ecr-update-credentials.sh
# Make the script executable
sudo chmod +x /usr/local/bin/aws-ecr-update-credentials.sh
Add the script to the recently created /usr/local/bin/aws-ecr-update-credentials.sh file:

#!/usr/bin/env bash
kube_namespaces=($(kubectl get secret --all-namespaces | grep regcred | awk '{print $1}'))
for i in "${kube_namespaces[@]}"
do
   echo "$(date): Updating secret for namespace - $i"
   kubectl delete secret regcred --namespace $i
   kubectl create secret docker-registry regcred \
     --docker-server=${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com \
     --docker-username=AWS \
     --docker-password=$(/usr/local/bin/aws ecr get-login-password) \
     --namespace=$i
done
This script will update only the namespaces that already have the regcred secret. Don't forget to replace ${AWS_ACCOUNT} and ${AWS_REGION} with the corresponding values, or add the corresponding environment variables.

The last step is to add the cron job:
# open the crontab file
crontab -e

# job
0 */10 * * * /usr/local/bin/aws-ecr-update-credentials.sh >> /var/log/aws-ecr-update-credentials.log 2>&1
You can replace "0 */10 * * *" with "* * * * *", which results in executing this script every minute, so you can check the logs and make sure the script works as expected.
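Either way, before relying on the schedule it's worth running the script once by hand and keeping an eye on the log file:

# run the refresh script manually once to catch any obvious errors
/usr/local/bin/aws-ecr-update-credentials.sh
# then follow the cron output once the schedule kicks in
tail -f /var/log/aws-ecr-update-credentials.log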
The second option is to use a CronJob resource on the Kubernetes side:
apiVersion: v1
kind: Secret
metadata:
  name: ecr-registry-helper-secrets
  namespace: health-check
stringData:
  AWS_SECRET_ACCESS_KEY: "xxxx"
  AWS_ACCESS_KEY_ID: "xxx"
  AWS_ACCOUNT: "xxx"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ecr-registry-helper-cm
  namespace: health-check
data:
  AWS_REGION: "xxx"
  DOCKER_SECRET_NAME: regcred
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: ecr-registry-helper
  namespace: health-check
spec:
  schedule: "0 */10 * * *"
  successfulJobsHistoryLimit: 3
  suspend: false
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-health-check
          containers:
            - name: ecr-registry-helper
              image: odaniait/aws-kubectl:latest
              imagePullPolicy: IfNotPresent
              envFrom:
                - secretRef:
                    name: ecr-registry-helper-secrets
                - configMapRef:
                    name: ecr-registry-helper-cm
              command:
                - /bin/sh
                - -c
                - |-
                  ECR_TOKEN=`aws ecr get-login-password --region ${AWS_REGION}`
                  NAMESPACE_NAME=health-check
                  kubectl delete secret --ignore-not-found $DOCKER_SECRET_NAME -n $NAMESPACE_NAME
                  kubectl create secret docker-registry $DOCKER_SECRET_NAME \
                    --docker-server=https://${AWS_ACCOUNT}.dkr.ecr.${AWS_REGION}.amazonaws.com \
                    --docker-username=AWS \
                    --docker-password="${ECR_TOKEN}" \
                    --namespace=$NAMESPACE_NAME
                  echo "Secret was successfully updated at $(date)"
          restartPolicy: Never
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-health-check
  namespace: health-check
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: health-check
  name: role-full-access-to-secrets
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["regcred"]
    verbs: ["delete"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: health-check-role-binding
  namespace: health-check
subjects:
  - kind: ServiceAccount
    name: sa-health-check
    namespace: health-check
    apiGroup: ""
roleRef:
  kind: Role
  name: role-full-access-to-secrets
  apiGroup: ""
---
There are quite a few configuration details in here. The biggest difference is that the CronJob resource in this example is scoped to a specific namespace and is only allowed to delete the regcred secret, which makes it more secure.

At a high level, the Secret and ConfigMap are used to extract the configuration details, while the other resources permit the CronJob to remove and recreate the regcred token (Role -> RoleBinding -> ServiceAccount -> CronJob). As with the previous approach, it's advisable to initially change the cron configuration from "0 */10 * * *" to "* * * * *" to verify that the job works as expected.
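Instead of changing the schedule, you can also trigger a one-off run of the CronJob to verify it immediately (the job name test-refresh is arbitrary):

kubectl create job test-refresh --from=cronjob/ecr-registry-helper --namespace health-check
kubectl logs job/test-refresh --namespace health-check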
The source code and readme are also available on GitHub: https://github.com/skryvets/kubernetes-pull-an-image-from-private-ecr-registry
Conclusion
This article provides a basic yet production-ready approach to deploying an application whose image is hosted in a private ECR registry to a Kubernetes cluster. It can scale to any number of pods/replica-sets/deployments as long as they reside in the same namespace. It's also possible to repeat the same steps for each additional namespace.