Hands-On Kubernetes on Azure
Run your applications securely and at scale on the most widely adopted
orchestration platform
Shivakumar Gopalakrishnan
Gunther Lenz
BIRMINGHAM - MUMBAI
Hands-On Kubernetes on Azure
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means,
without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the
information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its
dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by
the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Vijin Boricha
Acquisition Editor: Shrilekha Inani
Content Development Editor: Nithin George Varghese
Technical Editor: Prashant Chaudhari
Copy Editor: Safis Editing
Project Coordinator: Drashti Panchal
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Tom Scaria
Production Coordinator: Nilesh Mohite
ISBN 978-1-78953-610-2
www.packtpub.com
I dedicate this book to my parents. Without their support on everything from getting my first computer to
encouraging me on whatever path I took, this book wouldn’t have happened.
-Shivakumar Gopalakrishnan
To Okson and Hugo
Mapt is an online digital library that gives you full access to over 5,000 books
and videos, as well as industry leading tools to help you plan your personal
development and advance your career. For more information, please visit our
website.
Why subscribe?
Spend less time learning and more time coding with practical eBooks and
videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
At www.packt.com, you can also read a collection of free technical articles, sign up
for a range of free newsletters, and receive exclusive discounts and offers on
Packt books and eBooks.
Contributors
About the authors
Shivakumar Gopalakrishnan is a DevOps architect at Varian Medical Systems.
He has introduced Docker, Kubernetes, and other cloud-native tools to Varian's
product development work to enable an everything-as-code approach. He is
highly experienced in software development in a variety of fields, including
networking, storage, medical imaging, and DevOps. He has developed scalable
storage appliances for medical imaging needs and helped architect cloud ease
solutions. He has enabled teams in large, highly regulated medical enterprises to
adopt modern Agile/DevOps methodologies. He holds a Bachelor's degree in
engineering from the College of Engineering, Guindy, and a Master's degree in
science from the University of Maryland, College Park.
Thanks to all in the open source movement who constantly “chop wood and carry water”. This book would
not be possible without leveraging open source frameworks, samples, and documentation.
Thank you to my wonderful wife Asha, who took care of everything else so that I could focus on this book.
Thanks to Nikhil and Adil for helping me in all the ways that they could, including listening to me rambling
about Config maps and Ingress.
Packt is searching for authors like you
If you're interested in becoming an author for Packt, please visit authors.packtpub.com
and apply today. We have worked with thousands of developers and tech
professionals, just like you, to help them share their insight with the global tech
community. You can make a general application, apply for a specific hot topic
that we are recruiting an author for, or submit your own idea.
Table of Contents
Title Page
Copyright and Credits
Hands-On Kubernetes on Azure
Dedication
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Reviews
1. Section 1: The Basics
This book will enable you to monitor applications and cluster services to achieve high
availability and dynamic scalability, and to deploy web applications securely in
Microsoft Azure with Docker containers. You will enhance your knowledge
of Microsoft Azure Kubernetes Service and will learn advanced techniques
such as solution orchestration, secret management, best practices, and
configuration management for complex software deployments.
Who this book is for
If you're a cloud engineer, cloud solution provider, sysadmin, site reliability
engineer, or a developer interested in DevOps, and are looking for an extensive
guide to running Kubernetes in the Azure environment, then this book is for you.
What this book covers
Chapter 1, Introduction to Docker and Kubernetes, covers the concepts of Docker
and Kubernetes, providing the foundational context for the following chapters,
where you will dive into how to deploy Dockerized applications in Microsoft
AKS.
Chapter 2, Kubernetes on Azure (AKS), shows
how to navigate the Azure portal to perform all the functions required to launch
an AKS cluster, and also how to use Azure Cloud Shell without installing anything on
your computer.
Chapter 6, Monitoring the AKS Cluster and the Application, will enable you to set
alerts on any metric that you would like to be notified of by leveraging Azure
Insights.
A later chapter shows how to implement microservices on AKS, including how to use Event Hubs for loosely
coupled integration between applications.
Another chapter covers secrets management in more depth, covering different secrets backends and how to use them. A brief
introduction to service mesh concepts is also given, along with the
implementation of a practical example.
Chapter 12, Next Steps, directs you to different resources where you can learn about
and implement advanced features in security and scalability. For this chapter, please
refer to https://www.packtpub.com/sites/default/files/downloads/Next_Steps.pdf.
To get the most out of this book
Though any previous knowledge of Kubernetes is not expected, some experience
with Linux and Docker containers would be beneficial.
Download the example code files
You can download the example code files for this book from your account at
www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support
and register to have the files emailed directly to you.
Once the file is downloaded, please make sure that you unzip or extract the
folder using the latest version of your archive utility.
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Kubernetes-on-Azure. In case there's an update to the code, it will be
updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos
available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names,
filenames, file extensions, pathnames, dummy URLs, user input, and Twitter
handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image
file as another disk in your system."
When we wish to draw your attention to a particular part of a code block, the
relevant lines or items are set in bold:
static const char *TAG="SMARTMOBILE";
static EventGroupHandle_t wifi_event_group;
static const int CONNECTED_BIT = BIT0;
Bold: Indicates a new term, an important word, or words that you see onscreen.
For example, words in menus or dialog boxes appear in the text like this. Here is
an example: "Select System info from the Administration panel."
Warnings or important notes appear like this.
General feedback: If you have questions about any aspect of this book, mention
the book title in the subject of your message and email us at
[email protected].
Errata: Although we have taken every care to ensure the accuracy of our
content, mistakes do happen. If you have found a mistake in this book, we would
be grateful if you would report this to us. Please visit www.packt.com/submit-errata,
selecting your book, clicking on the Errata Submission Form link, and entering
the details.
Piracy: If you come across any illegal copies of our works in any form on the
Internet, we would be grateful if you would provide us with the location address
or website name. Please contact us at [email protected] with a link to the
material.
If you are interested in becoming an author: If there is a topic that you have
expertise in and you are interested in either writing or contributing to a book,
please visit authors.packtpub.com.
Reviews
Please leave a review. Once you have read and used this book, why not leave a
review on the site that you purchased it from? Potential readers can then see and
use your unbiased opinion to make purchase decisions, we at Packt can
understand what you think about our products, and our authors can see your
feedback on their book. Thank you!
This chapter is the longest chapter that you will read in terms of theory in this
book. You will get your hands dirty pretty quickly in the following chapters.
Step by step, you will be building applications that can scale and are secure. This
chapter gives you a brief introduction to the information that you will need if
you want to dig deeper, or wish to troubleshoot when something goes wrong
(remember, Murphy was an optimist!). Having cursory knowledge of this
chapter will demystify much of the magic as you build your Azure AD-
authenticated, Let's Encrypt-protected application that scales on demand based
on the metrics that you are monitoring.
Operators will take the hints from the developer specs and deliver a stable
system, whose metrics can be used for future software development, thus
completing the virtuous cycle.
Developers owning the responsibility of running the software that they develop
instead of throwing it over the wall for operations is a change in mindset that has
origins in Amazon (https://www.slideshare.net/ufried/the-truth-about-you-build-it-you-run-it).
The advantages of the DevOps model not only change the responsibilities of the
operations and development teams—it also changes the business side of
organizations. The foundation of DevOps can enable businesses to accelerate the
time to market significantly if combined with a lean and agile process and
operations model.
Everything is a file
Microservices, Docker, and Kubernetes can get quickly overwhelming for
anyone. We can make it easier for ourselves by understanding the basics. It
would be an understatement to say that understanding the fundamentals is
critical to performing root cause analysis on production problems.
A process requires the following resources:
Compute (CPU)
Memory (RAM)
Storage (disk)
Network (NIC)
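The exact command was lost in this extract; presumably it was simply ls, run from the root user's shell:
$ ls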
This command lists the files in the current directory (which happens to be the
root user's home directory).
You might think that running a simple command like this is nothing like
launching a container. Well, obviously, this is not really the case: in principle,
there is no difference between the command you ran and launching a container.
So, what is the difference between the two? The clue lies in the word contain.
The ls process has very limited containment (it is limited only by the rights that
the user has). ls can potentially see all files, and has access to all the memory,
network, and CPU that's available to the OS.
A container is contained by the OS by controlling access to compute, memory,
storage, and network. Each container runs in its own namespace (https://medium.com/@teddyking/linux-namespaces-850489d3ccf), and the rights of the
processes in a namespace are restricted to that namespace.
Every container process has contained access via cgroups and namespaces, which
makes it look (from the container process's perspective) as if it is running as a
complete instance of an OS. This means that it appears to have its own root
filesystem, init process (PID 1), memory, compute, and network.
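To see a namespace boundary for yourself, here is a minimal sketch, assuming a Linux host with Docker installed (the pid:[...] inode numbers are illustrative and will differ on your machine):
$ readlink /proc/self/ns/pid # the PID namespace of your shell on the host
pid:[4026531836]
$ docker run --rm ubuntu readlink /proc/self/ns/pid # the PID namespace inside a container
pid:[4026532567]
The two different identifiers show that the containerized process lives in its own PID namespace, even though it is just another process on the same kernel.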
You can play with Docker by creating a free Docker Hub account at Docker Hub
(https://hub.docker.com/) and using that login at Play with Docker (https://labs.play-with-docker.com/).
First, type docker run -it ubuntu. After a short period of time, you will get a prompt
such as root@<randomhexnumber>:/#. Next, type exit, and run the docker run -it
ubuntu command again. You will notice that it is super fast! Even though you
have launched a completely new instance of Ubuntu (on a host that is probably
running alpine OS), it is available instantly. This magic is, of course, due to the
fact that containers are nothing but regular processes on the OS. Finally, type exit
to complete this exercise. The full interaction of the session on play with Docker
(https://labs.play-with-docker.com/) is shown in the following script for your
reference. It demonstrates the commands and their output:
docker run -it ubuntu # runs the standard ubuntu linux distribution as a container
exit # the first command, after pulling the image from Docker Hub, puts you into the container's shell; exit leaves it
docker run -it ubuntu # running it again shows you how fast launching a container is
The following content displays the output that is produced after implementing
the preceding commands:
###############################################################
# WARNING!!!! #
# This is a sandbox environment. Using personal credentials #
# is HIGHLY! discouraged. Any consequences of doing so are #
# completely the user's responsibilites. #
# #
# The PWD team. #
###############################################################
[node1] (local) [email protected] ~
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS P
[node1] (local) [email protected] ~
$ date
Mon Oct 29 05:58:25 UTC 2018
[node1] (local) [email protected] ~
$ docker run -it ubuntu
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
473ede7ed136: Pull complete
c46b5fa4d940: Pull complete
93ae3df89c92: Pull complete
6b1eed27cade: Pull complete
Digest: sha256:29934af957c53004d7fb6340139880d23fb1952505a15d69a03af0d1418878cb
Status: Downloaded newer image for ubuntu:latest
root@03c373cb2eb8:/# exit
exit
[node1] (local) [email protected] ~
$ date
Mon Oct 29 05:58:41 UTC 2018
[node1] (local) [email protected] ~
$ docker run -it ubuntu
root@4774cbe26ad7:/# exit
exit
[node1] (local) [email protected] ~
$ date
Mon Oct 29 05:58:52 UTC 2018
[node1] (local) [email protected] ~
Orchestration
An individual can rarely perform useful work alone; teams that communicate
securely and well can generally accomplish more.
Just like people, containers need to talk to each other and they need help in
organizing their work. This activity is called orchestration.
Kubernetes takes the declarative approach to orchestration; that is, you specify
what you need and Kubernetes takes care of the rest.
Underlying all this magic, Kubernetes still launches Docker containers, just like
you did previously. The extra work involves details such as networking,
attaching persistent storage, and handling container and host failures.
In the next chapter, we will introduce the Azure Portal and its components in the
context of managing AKS.
Kubernetes on Azure (AKS)
Installing and maintaining Kubernetes clusters correctly and securely is hard.
Thankfully, all the major cloud providers, such as Google Cloud Platform
(GCP) (of course, considering Kubernetes originated at Google), AWS, and
Azure, facilitate installing and maintaining clusters. You will learn how to
navigate the Azure portal to launch your own cluster and a sample
application, all from your browser.
If you do not have an Azure account, you can create a free account here: https://azure.microsoft.com/en-us/free/?WT.mc_id=A261C142F.
If you do not want to run the sample application in Docker locally, then skip to
the Entering the Azure portal section where we will show you how to do the same in Azure
without installing anything locally.
In this section, we will show you how to run the Azure Voting application on
your local machine. This requires the following to be installed locally:
Docker and docker-compose
git (to clone the sample code)
Now let's check out what version of Docker is running on your machine by using
the following command. Open your favorite command-line prompt and check
the versions of the Docker components that are installed. The response will be
the versions you are running locally.
$ docker --version
Docker version 18.06.1-ce, build e68fc7a
$ docker-compose --version
docker-compose version 1.22.0, build f46880f
$ docker-machine --version
docker-machine version 0.15.0, build b48dc28d
It is time to get the application code from GitHub and run the Azure Voting
application locally. You will see how easy it is to do that. In your command-line
window, type the following commands:
$ git clone https://github.com/Azure-Samples/azure-voting-app-redis.git
Now let's take a sneak peek at the docker-compose.yaml file. The Azure Voting
application is composed of two containers: the frontend web application
(azure-vote-front) and a Redis backend (azure-vote-back).
The YAML file describes the services, the container images, and the ports that
compose the application. You can also see that the application is using a default
Redis Docker image. If you open the YAML file, it will look like this:
version: '3'
services:
  azure-vote-back:
    image: redis
    container_name: azure-vote-back
    ports:
      - "6379:6379"

  azure-vote-front:
    build: ./azure-vote
    image: azure-vote-front
    container_name: azure-vote-front
    environment:
      REDIS: azure-vote-back
    ports:
      - "8080:80"
Use the docker-compose.yaml file we just explored to build the container images,
download the Redis image, and run the application:
$ docker-compose up -d
Last, but not least, let's look at the Azure Voting app running on your local machine
by going to your web browser and typing http://localhost:8080. The application
will load, and you can vote for cats or dogs. Happy voting!
Before moving on to the Azure portal, clean up the Docker images and resources
with the following:
$ docker-compose down
Stopping azure-vote-front ... done
Stopping azure-vote-back ... done
Removing azure-vote-front ... done
Removing azure-vote-back ... done
Removing network azure-voting-app-redis_default
In the next sections, you will use Azure portal to deploy and run the same
application on AKS in Microsoft Azure.
Entering the Azure portal
Until we get to an all-command-line method of operation (aka the DevOps way),
we will be utilizing the Azure portal for most of our use cases. Even when we are
using command-line functions for most of our operations, the Azure portal is where
you can quickly check the status of your infrastructure. Familiarity with the Azure
portal is essential for running your AKS.
Creating an Azure portal account
If you already have an account with Azure and/or are familiar with Azure Portal, you can skip
this section.
Here we will show you how to get started with a free account.
In order to keep your trial account separate, you probably want to start the browser in
Incognito or Private Browsing mode.
1. Go to https://azure.microsoft.com/en-us/free/
The free account is valid only for 30 days. Please finish the book by then and delete your
account if you do not want to be charged after the trial. (Remember, getting new email
addresses is free.)
You have to add a new resource group. Type handsonaks for the resource group name
and myfirstakscluster for the cluster name:
Use the previous names for Resource Group and Cluster Name. We will be using those
repeatedly in the future. If you type anything else, you need to substitute the right group and
name it appropriately.
The following settings were tested by us to work reliably with the free account.
Since it is just our sample application and to extend your free credits as much as
possible, we will choose only one node with one vCPU. Click on Change size:
Your free account has a four-core limit that will be violated if we go with the defaults.
Like all good things in life, if they worked out of the box, we would be out of a
job. There is one more thing we have to set, apart from the changes we needed to
make it work under the free trial. Choose West US 2 for Region before clicking
on Review + Create:
Click on Create:
You have worked really hard. You deserve a coffee break. Well, maybe not. In
any case, as of now, it takes at least 10-20 minutes to create a cluster on AKS.
So, you might as well.
If you have been really good this year, you should see this:
If you were not, we are not judging you; we can always blame Microsoft. For
example, this is the error we got for the quota limitation as shown in the
following screenshot. Double-check the settings (you thought you were smarter
than us, didn't you? We know you ... we did the same when following the
documentation) and try again:
Using Azure Cloud Shell
Once you have a successful deployment, it is time to play. As promised, we will
do it all from the Azure portal with no client installs.
The toughest part of this assignment is finding the small icon near the search bar:
First the portal will ask you to select either PowerShell or Bash as your default
shell experience.
Next, the portal will ask you to create a storage account; just confirm and create
it.
Click on the power button; it should restart, and you should see something
similar to this:
You can pull the splitter/divider up to see more of the shell:
On the shell, the first tool we need is kubectl. This is the command-line tool used
for many operations when operating and maintaining Kubernetes clusters.
Fortunately, kubectl is already installed for you on Azure Cloud Shell.
Next, you need the credentials to access your cluster. To get them, type
the following command on the shell:
az aks get-credentials --resource-group handsonaks --name myfirstakscluster
The preceding command will set the correct values in ~/.kube/config so that kubectl
can access the cluster.
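To verify that the connection works, you can list the cluster's nodes; this is a sketch, and the node name, age, and version will differ on your cluster:
$ kubectl get nodes
NAME                       STATUS   ROLES   AGE   VERSION
aks-agentpool-18162866-0   Ready    agent   10m   v1.11.5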
You are all connected now. We are going to launch our first application now.
We are going to use the vi command-line editor. It can be confusing to use at first; if you prefer,
you can use the online code editor shown in the next section instead.
We have included the code from the Azure website for your convenience:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: azure-vote-back
spec:
  replicas: 1
  selector:
    matchLabels:
      app: azure-vote-back
  template:
    metadata:
      labels:
        app: azure-vote-back
    spec:
      containers:
      - name: azure-vote-back
        image: redis
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
        ports:
        - containerPort: 6379
          name: redis
---
apiVersion: v1
kind: Service
metadata:
  name: azure-vote-back
spec:
  ports:
  - port: 6379
  selector:
    app: azure-vote-back
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: azure-vote-front
spec:
  replicas: 1
  selector:
    matchLabels:
      app: azure-vote-front
  template:
    metadata:
      labels:
        app: azure-vote-front
    spec:
      containers:
      - name: azure-vote-front
        image: microsoft/azure-vote-front:v1
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
        ports:
        - containerPort: 80
        env:
        - name: REDIS
          value: "azure-vote-back"
---
apiVersion: v1
kind: Service
metadata:
  name: azure-vote-front
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: azure-vote-front
Back on the Azure online code editor, paste the content of the file.
Then, click on the ... in the right-hand corner to save the file as azure-vote.yaml:
The file should be saved. You can check by typing the following:
cat azure-vote.yaml
Hitting the Tab button expands the file name in Linux. In the preceding scenario, if you hit
Tab after typing az it should expand to azure-vote.yaml.
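The deployment command itself is missing from this extract; assuming the filename used above, it would be:
kubectl create -f azure-vote.yaml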
Now we wait.
Hit the Up arrow and press Return until all pods show a Running status. It
does take some time to set up everything, which you can check by typing the
following:
kubectl get all --all-namespaces
Once you see Pulled for the frontend, type the following to ensure that
everything is running:
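The command is not shown in this extract; a reasonable check is:
kubectl get pods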
In order to access it publicly, we need to wait for one more thing. Rather than
typing commands repeatedly, we can have kubectl watch the service and notify us
once it is up and running. We want to know the public IP of the load balancer so
that we can access the application.
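kubectl can watch the service for changes; a sketch, assuming the service name azure-vote-front from the yaml file above:
kubectl get service azure-vote-front --watch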
Wait for the public IP to appear and then press Ctrl + C to exit the watch:
Note the external IP address, and type it on a browser. You should see this:
Click on Cats or Dogs (I would go with Dogs) and watch the count go up.
You have now launched your own cluster and your first Kubernetes application.
Note that the effort of connecting the frontend and the backend, exposing the
application to the outside world, and providing storage for the services was all
taken care of by Kubernetes.
Summary
By the end of this chapter, you are able to access and navigate the Azure
portal to perform all the functions required to launch an AKS cluster, and to use the
free trial on Azure to your advantage while learning the ins and outs of AKS and
other Azure services. We launched our own AKS cluster, with the ability to
customize configurations if required, using the Azure portal. We also learned to use
Azure Cloud Shell without installing anything on our computer. This is
important for all the upcoming sections, where you will be doing a lot more than
launching simple applications. Finally, we launched a publicly accessible service
that works! The skeleton of this application is the same for the complex
applications that you will be launching in the next sections.
Section 2: Deploying on AKS
Section 2 focuses on getting an application running on AKS. Readers will be
able to deploy, scale, and monitor an application on AKS. Readers will also learn
how to use Kubernetes' RBAC on AKS through integration with Azure AD.
You will be using this application as the basis for testing out the scaling of the
backend and the frontend independently in the next chapter.
Introducing the application
The application stores and displays guestbook entries. You can use it to record
the opinions of all the people who visit your model railroad display, for example.
Along the way, we will explain Kubernetes concepts such as Deployments and
replication controllers.
The application uses PHP with Redis backends for this purpose.
Deploying the first master
You are going to deploy the Redis master, which you will delete in the next step.
This is done for no other reason than for you to learn about ConfigMaps.
Let's do this.
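The command referred to below is not reproduced in this extract; assuming the standard guestbook example files from the Kubernetes documentation, it would be:
kubectl apply -f https://k8s.io/examples/application/guestbook/redis-master-deployment.yaml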
It will take some time for it to download and start running. While you wait, let
me explain the command you just typed and executed. Let's start by exploring
the content of the yaml file you used:
 1 apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
 2 kind: Deployment
 3 metadata:
 4   name: redis-master
 5   labels:
 6     app: redis
 7 spec:
 8   selector:
 9     matchLabels:
10       app: redis
11       role: master
12       tier: backend
13   replicas: 1
14   template:
15     metadata:
16       labels:
17         app: redis
18         role: master
19         tier: backend
20     spec:
21       containers:
22       - name: master
23         image: k8s.gcr.io/redis:e2e # or just image: redis
24         resources:
25           requests:
26             cpu: 100m
27             memory: 100Mi
28         ports:
29         - containerPort: 6379
Line 23: Says what Docker image we are going to run. In this case, it is
the redis image tagged with e2e (presumably the latest image of redis that
successfully passed its end-to-end [e2e] tests).
Lines 28-29: Say this container is going to listen on port 6379.
Line 22: Gives this container a name, which is master.
Lines 24-27: Set the cpu/memory resources requested for the container. In this
case, the request is 0.1 CPU, which is equal to 100m and is also often referred
to as 100 millicores. The memory requested is 100Mi, or 104857600 bytes,
which is equal to ~105 MB (https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/). You can also set cpu and memory limits in the same
way.
This is very similar to the arguments you would give to Docker to run a particular
container image. If you had to do this manually, you would end up with
something like the following:
docker run -d k8s.gcr.io/redis:e2e # run the redis docker image with tag e2e in detached mode
docker run --name named_master -d k8s.gcr.io/redis:e2e # run the image with the name named_master
docker run --name net_master -p 6379:6379 -d k8s.gcr.io/redis:e2e # expose the port 6379
docker run --name master -p 6379:6379 -m 100M --cpus 0.1 -d k8s.gcr.io/redis:e2e # set the memory and cpu limits
The container spec (lines 21-29) tells Kubernetes to run the specified container
with the supplied arguments. So far, Kubernetes has not provided us with anything
more than what we could have typed in as a Docker command. Let's continue
with the explanation of the code:
Line 13: Tells Kubernetes that we need exactly one copy of the Redis
master running. This is a key aspect of the declarative nature of Kubernetes.
You provide a description of the containers your applications need to run (in
this case, only one replica of the Redis master) and Kubernetes takes care of
it.
Lines 14-19: Add labels to the running instance so that it can be grouped
and connected to other containers. We will discuss them later to see how
they are used.
Line 2: Tells Kubernetes that we would like a Deployment to be performed. When
Kubernetes started, Replication Controllers were used (and are still used
widely) to launch containers. You can still do most of the work you need
using just Replication Controllers, but Deployments add convenience to
managing them. Deployments provide mechanisms to perform rolling
updates and rollbacks if required. You can specify the strategy you would
like to use when pushing an update (Rolling Update or Recreate).
You can see that we have a Deployment named redis-master. It controls a ReplicaSet
named redis-master-<replica set random id>. Digging deeper, you will also find that the ReplicaSet
is controlling a Pod (a group of Docker containers that should be run
together) named redis-master-<replica set random id>-<random id>.
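You can see this chain of ownership for yourself; a sketch (the random suffixes will differ on your cluster):
kubectl get deployment,replicaset,pod -l app=redis
# deployment.apps/redis-master
# replicaset.apps/redis-master-<replica set random id>
# pod/redis-master-<replica set random id>-<random id>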
To configure Redis dynamically, we will follow the Kubernetes documentation tutorial, Configuring Redis using a ConfigMap (https://kubernetes.io/docs/tutorials/configuration/configure-redis-using-configmap/).
Copy and paste the following two lines into the Azure Cloud Shell editor and
save it as redis-config:
maxmemory 2mb
maxmemory-policy allkeys-lru
The online editor is limited and surprisingly doesn't support creating a new file, but not
to worry: in this case, type touch redis-config, and then code redis-config works. Alternatively, you can open the
empty file using the Open File command of the online code editor.
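With the file saved, create the ConfigMap from it; a sketch, with the name chosen to match the ConfigMap inspected below:
kubectl create configmap example-redis-config --from-file=redis-config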
We will use the same trick to get the scoop on this ConfigMap:
kubectl describe configmap/example-redis-config
Data
====
redis-config:
----
maxmemory 2mb
maxmemory-policy allkeys-lru
Events: <none>
Creating from the command line is not very portable. It would be better if we
could describe the preceding configuration in a yaml file. If only there were a command that
would get the same information in yaml format. Not to worry: kubectl has the get
command:
kubectl get -o yaml configmap/example-redis-config
Let's create our yaml version of this. But first, let's delete the already-created
ConfigMap:
kubectl delete configmap/example-redis-config
Copy and paste the following lines into a file named example-redis-config.yaml, and
then save the file:
apiVersion: v1
data:
  redis-config: |-
    maxmemory 2mb
    maxmemory-policy allkeys-lru
kind: ConfigMap
metadata:
  name: example-redis-config
  namespace: default
Use the touch trick to open the file on the online code editor.
kubectl get -o yaml is a useful trick to get a deployable yaml file from a running system. It takes
care of tricky yaml indentation and saves you from spending hours trying to get the format
right.
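Now create the ConfigMap from the yaml file and describe it again; a sketch, assuming the filename used above:
kubectl create -f example-redis-config.yaml
kubectl describe configmap/example-redis-config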
The preceding command returns the same output as the previous one:
Name: example-redis-config
Namespace: default
Labels: <none>
Annotations: <none>
Data
====
redis-config:
----
maxmemory 2mb
maxmemory-policy allkeys-lru
Events: <none>
Now that we have the ConfigMap defined, let's use it. Modify redis-master-
deployment.yaml to use the ConfigMap, as follows:
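The modified file is not reproduced in full in this extract (the line numbers in the explanation below refer to that full file). Based on the Kubernetes ConfigMap tutorial this section follows, the relevant volume portion looks roughly like this:
        volumeMounts:
        - mountPath: /redis-master
          name: config
      volumes:
      - name: config
        configMap:
          name: example-redis-config
          items:
          - key: redis-config
            path: redis.conf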
The following lines will give a detailed explanation of the preceding code:
Lines 36-42: Here is where Kubernetes takes the config beyond plain ENV vars to the
next level.
Line 37: Gives the volume its name, config.
Lines 38-39: Declare that this volume should be loaded from the ConfigMap
example-redis-config (which has to be already defined [unless declared
optional]). We have defined it already, so we are good.
Lines 40-42: Here is where the Kubernetes magic comes in. See Configure a Pod
with a ConfigMap (https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#use-configmap-defined-environment-variables-in-pod-commands) for
the different ways you can load config onto a pod. Here, we are
loading the value of the redis-config key (the two-line maxmemory settings)
as a file called redis.conf.
We can check whether the settings were applied by running the following
commands (kc, as used here, is assumed to be a shell alias for kubectl: alias kc=kubectl):
kc exec -it redis-master-<pod-id> redis-cli
127.0.0.1:6379> CONFIG GET maxmemory
1) "maxmemory"
2) "2097152"
127.0.0.1:6379> CONFIG GET maxmemory-policy
1) "maxmemory-policy"
2) "allkeys-lru"
127.0.0.1:6379>exit
You can see how this works by running the following:
kc exec -it redis-master-<pod-id> bash
root@redis-master-585bd9d8fb-p9qml:/data# ps
bash: ps: command not found
root@redis-master-585bd9d8fb-p9qml:/data# cat /proc/1/cmdline
sh-c/run.sh
root@redis-master-585bd9d8fb-p9qml:/data# cat /run.sh |grep redis.conf
redis-server /redis-master/redis.conf
perl -pi -e "s/%master-ip%/${master}/" /redis-slave/redis.conf
perl -pi -e "s/%master-port%/6379/" /redis-slave/redis.conf
redis-server /redis-slave/redis.conf
root@redis-master-585bd9d8fb-p9qml:/data# cat /run.sh |grep MASTER
if [[ "${MASTER}" == "true" ]]; then
root@redis-master-585bd9d8fb-p9qml:/data# cat /redis-master/redis.conf
maxmemory 2mb
maxmemory-policy allkeys-lru
Somehow, in this image, ps is not installed. Not to worry; we can get the info by examining the
contents of the cmdline file under PID 1. Now we know that run.sh is the file that is run, so
somewhere in it the redis.conf file from the ConfigMap mounted at /redis-master should be used. So
we grep run.sh for redis.conf. Sure enough, we can see that redis-server, when MASTER is true,
is started with the config from /redis-master/redis.conf. To make sure that Kubernetes did its magic, we
examine the contents of redis.conf by running cat /redis-master/redis.conf and, lo and behold, it
contains exactly the values we specified in the ConfigMap example-redis-config.
To repeat, you have just performed the most important and tricky part of
configuring cloud-native applications. You have also noticed that the apps have
to be modified to be cloud-friendly to read config dynamically (the old image
didn't support dynamic configuration easily).
Fully deploying the sample
guestbook application
After taking a detour to get our feet wet on dynamically configuring applications
using ConfigMap, we will return to our regularly-scheduled program by
deploying the rest of the guestbook application. You will see the concepts of
Deployment, Replica Sets, and Pods repeated again for the backends and
frontends. We will introduce another key concept called Service. You might
have also noticed the power of protocols and Docker images. Even though we
switched to a different image for redis-master, we have an expectation that the rest
of the implementation should go through without any hitches.
Exposing the Redis master service
With plain Docker, the exposed port is constrained to the host it is running.
There is no support for making the service available if the host goes down.
Kubernetes provides Service, which handles exactly that problem. Using label-
matching selectors, it proxies traffic to the right pods, including load balancing.
In this case, the master has only one pod, so it just ensures that, independent of
which node the pod runs, the traffic is directed to that pod. To create the service,
run the following command:
kubectl apply -f https://k8s.io/examples/application/guestbook/redis-master-service.yaml
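The contents of redis-master-service.yaml, reproduced here with line numbers from the standard guestbook example so that the explanation below can refer to them:
 1 apiVersion: v1
 2 kind: Service
 3 metadata:
 4   name: redis-master
 5   labels:
 6     app: redis
 7     role: master
 8     tier: backend
 9 spec:
10   ports:
11   - port: 6379
12     targetPort: 6379
13   selector:
14     app: redis
15     role: master
16     tier: backend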
Lines 1-8: Tell Kubernetes we want a service that has the same labels
as our redis-master server.
Lines 10-12: Say that this Service should handle traffic arriving at port 6379 and
forward it to port 6379 of the pods that are matched by the selector
defined in lines 13-16.
Lines 13-16: Used to find the pods to which the incoming traffic needs to be
Line 13-16: Used to find the pods to which the incoming traffic needs to be
proxied. So, any pod with labels matching (app:redis, AND role:master AND
tier:backend) is expected to handle port 6379 traffic.
You see that a new service, named redis-master, has been created. It has a cluster-wide
IP of 10.0.22.146 (in this case; YMMV). Note that this IP will work only
within the cluster (hence the ClusterIP type). For fun, you can test this out by
running the following commands:
ab443838-9b3e-4811-b287-74e417a9@Azure:/usr/bin$ ssh -p 6379 10.0.22.146 # just hangs
^C
ab443838-9b3e-4811-b287-74e417a9@Azure:/usr/bin$ ssh -p 80 www.google.com # very quick rejection
ssh_exchange_identification: Connection closed by remote host
To verify it does work inside the cluster, we do our exec trick again:
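The exec check itself is missing from this extract; a sketch of what it could look like (the pod ID and output are illustrative):
kubectl exec -it redis-master-<pod-id> -- redis-cli -h redis-master ping
PONG
The next step, deploying the Redis slaves, is also missing here; assuming the standard guestbook example file:
kubectl apply -f https://k8s.io/examples/application/guestbook/redis-slave-deployment.yaml
kubectl get pods -l role=slave
NAME                          READY   STATUS    RESTARTS   AGE
redis-slave-b58dc4644-597qt   1/1     Running   0          1m
redis-slave-b58dc4644-xtdkx   1/1     Running   0          1m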
Based on the preceding output, you can guess that this time we asked for two replicas
of the redis slave pods. That is confirmed when you examine the redis-slave-
deployment.yaml file.
As expected, run.sh in the image checks whether the GET_HOSTS_FROM variable is set
to env. In this case, it is set to dns, so it returns false. redis-server is launched in
slave mode pointing to the redis-master host as the master. If you ping redis-master,
you can see it is set to the ClusterIP of the redis-master service.
Similar to the master service, we need to expose the slave service by running the
following:
kubectl apply -f https://k8s.io/examples/application/guestbook/redis-slave-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-slave
  labels:
    app: redis
    role: slave
    tier: backend
spec:
  ports:
  - port: 6379
  selector:
    app: redis
    role: slave
    tier: backend
The only difference between this Service and the redis-master Service is that this
service proxies traffic to pods that have the role:slave label in them.
To check whether the slave service responds at the mentioned ClusterIP and port
6379, run the following commands:
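The check commands were lost in this extract; a sketch, reusing the exec trick (the pod ID and output are illustrative):
kubectl exec -it redis-master-<pod-id> -- redis-cli -h redis-slave ping
PONG
The frontend Deployment discussed next would be created with the standard guestbook example file:
kubectl apply -f https://k8s.io/examples/application/guestbook/frontend-deployment.yaml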
You don't get any points for guessing that this Deployment specifies a replica
count of 3 (OK, maybe a pat on the back; we are generous people). The
Deployment has the usual suspects with minor changes, as shown in the
following code:
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: frontend
  labels:
    app: guestbook
spec:
  selector:
    matchLabels:
      app: guestbook
      tier: frontend
  replicas: 3
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google-samples/gb-frontend:v4
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
          # Using `GET_HOSTS_FROM=dns` requires your cluster to
          # provide a dns service. As of Kubernetes 1.3, DNS is a built-in
          # service launched automatically. However, if the cluster you are using
          # does not have a built-in DNS service, you can instead
          # access an environment variable to find the master
          # service's host. To do so, comment out the 'value: dns' line above, and
          # uncomment the line below:
          # value: env
        ports:
        - containerPort: 80
The replica count is set to 3, the labels are set to {app:guestbook, tier:frontend}, and
the image used is gb-frontend:v4.
Exposing the frontend service
In order to make it publicly available, we have to edit the service yaml file. Run
the following command to download the file:
curl -O -L https://k8s.io/examples/application/guestbook/frontend-service.yaml
Comment out the NodePort line and uncomment the type:LoadBalancer line:
apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    app: guestbook
    tier: frontend
spec:
  # comment or delete the following line if you want to use a LoadBalancer
  # type: NodePort # line commented out
  # if your cluster supports it, uncomment the following to automatically create
  # an external load-balanced IP for the frontend service.
  type: LoadBalancer # line uncommented
  ports:
  - port: 80
  selector:
    app: guestbook
    tier: frontend
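Then create (or update) the service from the edited file; the command is missing from this extract, but it would be:
kubectl apply -f frontend-service.yaml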
This step takes some time the first time you run it. In the background, Azure has
to perform lots of magic, to make it seamless. It has to create an Azure Load
Balancer (ALB), and set the port-forwarding rules to forward traffic on port 80
to internal ports of the cluster.
For prototyping, you will probably start from one of these yaml files.
The basic concepts of Deployments, ReplicaSets, Services, and Pods do not
change and need to be understood in detail. Although the upstream guestbook
example is marked as obsolete in places, the concepts it demonstrates are not obsolete at all.
Still, to conserve resources on our free trial virtual machines, it is better to delete
the deployments we made to run the next round of the deployment by using the
following commands:
kubectl delete deployment -l app=redis
kubectl delete service -l app=redis
kubectl delete deployment -l app=guestbook
kubectl delete service -l app=guestbook
The helm way of installing complex
applications
As you went along the previous sections, you might have thought, This is pretty
repetitive and boring. I wonder if there is a way to package everything and run it
in one shot. That is a valid question, and a real need when deploying complex
applications in production. Helm charts are becoming the de facto standard for
packaging applications.
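The install command itself is not reproduced here. As the note below explains, it was helm install stable/wordpress --name handsonakswp plus SMTP settings; with placeholder values, and assuming the chart's smtp* values, it would look something like this:
helm install stable/wordpress --name handsonakswp \
  --set smtpHost=smtp.example.com,smtpPort=587 \
  --set smtpUser=someuser,smtpPassword=somepassword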
That's it; Helm goes ahead and installs everything mentioned at https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/.
The command should have been just helm install stable/wordpress --name handsonakswp. The reason
for the extra parameters was that, at the time of writing, the command did not work without the
SMTP settings. For more information, visit https://github.com/bitnami/bitnami-docker-wordpress/issues/153#issuecomment-450127531.
How did we figure out smtpPassword was the issue? Hint: it involved kubectl logs. We will go into
more detail on monitoring in the next chapter.
It takes some time for Helm to install and the site to come up. We will look into
a key concept, Persistent Volume Claims, while the site is loading.
Persistent Volume Claims
A process requires compute, memory, network, and storage. In the Guestbook
sample, we saw how Kubernetes helps us abstract the compute, memory, and
network. The same yaml files work across all cloud providers, including a cloud-specific
setup of public-facing load balancers. The WordPress example shows
how the last and most important piece, namely storage, is abstracted from the
underlying cloud provider.
In this case, the WordPress helm chart depends on the MariaDB helm chart
(https://github.com/helm/charts/tree/master/stable/mariadb) for its database install.
Describing the helm format would take another book; it is easier to look at the
output and see what was done. Unlike stateless applications, such as our
frontends, MariaDB requires careful handling of storage. We inform Kubernetes
of this by defining the MariaDB deployment as a StatefulSet. A StatefulSet
(https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) is like a Deployment
with the additional capabilities of ordering and uniqueness of the pods. That
statement is from the documentation, so what does it really mean? It
means that Kubernetes will try really hard (and we mean really, really hard) to
ensure that the pod and its storage are kept together. One way it helps is
the naming: the pods are named <pod-name>-#, where # starts from 0 for the first
pod (you know, it's a programmer thing).
You can see from the following code that mariadb has a predictable number
attached to it, whereas the WordPress Deployment has a random number
attached to the end. The numbering reinforces the ephemeral nature of the
Deployment pods versus the StatefulSet pods.
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get pod
NAME READY STATUS RESTARTS AGE
handsonaks-wp-mariadb-0 1/1 Running 1 17h
handsonaks-wp-wordpress-6ddcfd5c89-fv6l2 1/1 Running 2 16h
Those are a lot of lines, and if you read them carefully, you will see that it is mostly
the same information that we provided for a Deployment. In the following block,
we highlight the key differences by looking at just the PVC-related lines:
  1 apiVersion: apps/v1
  2 kind: StatefulSet
...
 19   replicas: 1
...
 94         volumeMounts:
 95         - mountPath: /bitnami/mariadb
 96           name: data
...
114   volumeClaimTemplates:
115   - metadata:
117       labels:
118         app: mariadb
119         component: master
120         heritage: Tiller
121         release: handsonaks-wp
122       name: data
123     spec:
124       accessModes:
125       - ReadWriteOnce
126       resources:
127         requests:
128           storage: 8Gi
Persistent Volume Claims (PVC) can be used by any Pod, not just StatefulSet Pods.
The key addition is the volumeClaimTemplates section (lines 114-128): for each replica, Kubernetes creates a PersistentVolumeClaim named data, requesting 8Gi of ReadWriteOnce storage, which is mounted at /bitnami/mariadb (lines 94-96).
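While the site is coming up, you can look up its external IP; a sketch, assuming the service name follows the <release>-wordpress convention:
kubectl get svc handsonaks-wp-wordpress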
Go to http://<external-ip-shown>:
To delete the WordPress site, run the following:
helm delete --purge handsonaks-wp
Summary
This chapter was the big one. We went from a simple Docker container mapped
to a Pod, to Pods running as ReplicaSet, to ReplicaSets under Deployment, to
StatefulSets managing Persistent Volume Claims (PVCs). We went from
launching using raw Kubernetes yaml files to installing complex applications
using Helm charts.
In the next chapter, you will learn how to perform cool tricks, such as scaling
your application, and what to do when things go wrong. You will see the power of the
Kubernetes declarative engine, which lets you specify the desired state and lets
the machine figure out how to achieve it. Without an advanced networking
certification or deep networking knowledge, you will be able to diagnose
common network errors while troubleshooting Kubernetes applications.
Scaling Your Application to
Thousands of Deployments
In this chapter, we will show you how to scale the sample application that we
introduced in Chapter 2, Kubernetes on Azure (AKS), using kubectl. We will
introduce different failures to demonstrate the power of Kubernetes' declarative
engine. The goal is to make you comfortable with kubectl, which is an important
tool for managing AKS. In addition, in this chapter you will get a brief
introduction to how network traffic is routed to different pods running on
different nodes, and how it will help you to diagnose network problems in
production.
You will find the code files for this chapter at https://github.com/PacktPublishing/Hands-On-Kubernetes-on-Azure.
Scaling your application
Scaling on demand is one of the key benefits of using cloud-native applications.
It also helps optimize resources for your application. If the frontend component
encounters heavy loads, you can scale the frontend alone, while keeping the
same number of backend instances. You can increase or reduce the number/size
of VMs required depending on your workload and peak demand hours. You will
scale your application components independently and also see how to
troubleshoot scaling issues.
Implementing independent scaling
To demonstrate independent scaling, let's use the guestbook example that we
used in the previous chapter. Let's follow these steps to learn how to implement
independent scaling:
1. Install the guestbook by running the kubectl create command in the Azure
command line:
kubectl create -f https://raw.githubusercontent.com/kubernetes/examples/master/guestbook/all-in-one/guestbook-all-in-one.yaml
2. After you have entered the preceding command, you should see the
following output in your command-line output:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kubectl create -f https://raw.githubusercontent.com/
service/redis-master created
deployment.apps/redis-master created
service/redis-slave created
deployment.apps/redis-slave created
service/frontend created
deployment.apps/frontend created
3. After a few minutes, you should get the following output in which you will
see that none of the containers are accessible from the internet, and no
external IP is assigned:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get all
NAME READY STATUS RESTARTS AGE
pod/frontend-56f7975f44-7sdn5 1/1 Running 0 1m
pod/frontend-56f7975f44-hscn7 1/1 Running 0 1m
pod/frontend-56f7975f44-pqvbg 1/1 Running 0 1m
pod/redis-master-6b464554c8-8nv4s 1/1 Running 0 1m
pod/redis-slave-b58dc4644-597qt 1/1 Running 0 1m
pod/redis-slave-b58dc4644-xtdkx 1/1 Running 0 1m
4. The frontend is not exposed to the public internet by default. Expose it
using the following commands:
kc get -o yaml svc/frontend > frontend-service.yaml
code frontend-service.yaml
5. Edit the frontend-service.yaml file to set the labels, ports, and selector, which
should appear as follows (or you can cut and paste the following):
apiVersion: v1
kind: Service
metadata:
  labels:
    app: guestbook
    tier: frontend
  name: frontend
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: guestbook
    tier: frontend
  type: LoadBalancer
6. Save the file and recreate the frontend service so that we can access it
publicly by deleting the frontend service and recreating it as follows:
kubectl delete -f frontend-service.yaml
kubectl create -f frontend-service.yaml
7. Use the following command to get the public IP to access the application
via the internet:
kubectl get svc
You will get the following output. You need to look for the IP displayed
under the EXTERNAL-IP column:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend LoadBalancer 10.0.196.116 <EXTERNAL-IP> 80:30063/TCP 2m
8. Type the IP address from the preceding output into your browser navigation
bar as follows: http://<EXTERNAL-IP>/. The result of this is shown in the
following screenshot:
The familiar guestbook sample should be visible. You have successfully
accessed the guestbook publicly.
Scaling the guestbook frontend
component
Kubernetes gives us the ability to scale each component of an application
dynamically. In this section, we will show you how to scale the frontend of the
guestbook application as follows:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc scale --replicas=6 deployment/frontend
deployment.extensions/frontend scaled
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get all
NAME READY STATUS RESTARTS AGE
pod/frontend-56f7975f44-2rnst 1/1 Running 0 4s
pod/frontend-56f7975f44-4tgkm 1/1 Running 0 4s
pod/frontend-56f7975f44-7sdn5 1/1 Running 0 2h
pod/frontend-56f7975f44-hscn7 1/1 Running 0 2h
pod/frontend-56f7975f44-p2k9w 0/1 ContainerCreating 0 4s
pod/frontend-56f7975f44-pqvbg 1/1 Running 0 2h
We achieve this by using the scale option in kubectl. You can set the number of
replicas you want, and Kubernetes takes care of the rest. You can even scale it
down to zero (one of the tricks used to reload configuration when the application
doesn't support the dynamic reload of configuration). You can see this trick in
action as follows:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc scale --replicas=0 deployment/frontend
deployment.extensions/frontend scaled
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get pods
NAME READY STATUS RESTARTS AGE
frontend-56f7975f44-4vh7c 0/1 Terminating 0 3m
frontend-56f7975f44-75trq 0/1 Terminating 0 3m
frontend-56f7975f44-p6ht5 0/1 Terminating 0 3m
frontend-56f7975f44-pqvbg 0/1 Terminating 1 14h
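To bring the frontend back after scaling it to zero, simply scale up again:
kubectl scale --replicas=3 deployment/frontend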
In this chapter, you have experienced how easy it is to scale pods with
Kubernetes. This capability provides a very powerful tool for you to not only
dynamically adjust your application components, but also to provide resilient
applications with failover capabilities enabled by running multiple instances of
components at the same time.
Furthermore, its declarative nature is another key advantage of using
Kubernetes. In the previous examples, we saw that we only need to define what we
want (namely, the number of replicas) in the .yaml file description, and
Kubernetes handles the rest.
Behind the scenes, Kubernetes keeps track of the following:
Desired versus available nodes (if they are the same, take no action)
The number of pods
Pod placement based on CPU/memory requirements
The handling of special cases such as StatefulSets
Handling failure in AKS
Kubernetes is a distributed system with many hidden working parts.
AKS abstracts all of it for us, but it is still our responsibility to know where to
look and how to respond when bad things happen. Much of the failure handling
is done automatically by Kubernetes – still, you will run into situations where
manual intervention is required. The following is a list of the most common
failure modes that require interaction. We will look into the following failure
modes in depth in this section:
Node failures
Out-of-resource failure
Storage mount issues
Network issues
See Kubernetes the Hard Way (https://github.com/kelseyhightower/kubernetes-the-hard-way), an excellent
tutorial, to get an idea about the blocks on which Kubernetes is built. For the Azure version,
see Kubernetes the Hard Way – Azure Translation (https://github.com/ivanfioravanti/kubernetes-the-hard-
way-on-azure) .
Node failures
Intentionally (to save costs) or unintentionally, nodes can go down. When that
happens, you don't want to get the proverbial 3AM call when Kubernetes can
handle it automatically for you instead. In this exercise, we are going to bring a
node down in our cluster and see what Kubernetes does in response:
2. Check that your URL is working as shown in the following output, using
the external IP to reach the frontend:
kc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend LoadBalancer 10.0.196.116 EXTERNAL-IP 80:30063/TCP 14h
3. Go to http://<EXTERNAL-IP>:
4. Let's see where the pods are running currently using the following code:
kubectl describe nodes
The following output is edited to show only the lines we are interested in:
1 ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc describe nodes
2 Name: aks-agentpool-18162866-0
5 Addresses:
6 InternalIP: 10.240.0.4
7 Hostname: aks-agentpool-18162866-0
16 Non-terminated Pods: (12 in total)
17 Namespace Name CPU Requests CPU Limits Memory R
18 --------- ---- ------------ ---------- --------
19 default frontend-56f7975f44-9k7f2 100m (10%) 0 (0%) 100Mi (7
20 default frontend-56f7975f44-rflgz 100m (10%) 0 (0%) 100Mi (7
21 default redis-master-6b464554c8-8nv4s 100m (10%) 0 (0%) 100Mi (7
22 default redis-slave-b58dc4644-wtkwj 100m (10%) 0 (0%) 100Mi (7
23 default redis-slave-b58dc4644-xtdkx 100m (10%) 0 (0%) 100Mi (7
39 Name: aks-agentpool-18162866-1
42 Addresses:
43 InternalIP: 10.240.0.5
44 Hostname: aks-agentpool-18162866-1
54 Namespace Name CPU Requests CPU Limits Me
55 --------- ---- ------------ ---------- --
56 default frontend-56f7975f44-gbsfv 100m (10%) 0 (0%) 10
6. For maximum fun, you can run the following command to hit the guestbook
frontend every 5 seconds and return the HTML (on any Bash Terminal):
while true; do curl http://<EXTERNAl-IP>/ ; sleep 5; done
The preceding command will generate infinite scroll till you press Ctrl + C.
Add some Guestbook entries to see what happens to them when you cause the
node to shut down.
Things will go crazy during the shutdown of agent-0. You can see this in the
following edited output generated during the shutdown:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get events --watch
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
47m 47m 1 frontend-56f7975f44-9k7f2.1574e5f94ac87d7c Pod Normal Scheduled default-scheduler Successful
47m 47m 1 frontend-56f7975f44-9k7f2.1574e5f9c9eb2713 Pod spec.containers{php-redis} Normal Pulled kube
47m 47m 1 frontend-56f7975f44-9k7f2.1574e5f9e3ee2348 Pod spec.containers{php-redis} Normal Created kub
47m 47m 1 frontend-56f7975f44-9k7f2.1574e5fa0ec58afa Pod spec.containers{php-redis} Normal Started kub
52s 52s 1 frontend-56f7975f44-fbksv.1574e88a6e05a7eb Pod Normal Scheduled default-scheduler Successful
50s 50s 1 frontend-56f7975f44-fbksv.1574e88aec0fb81d Pod spec.containers{php-redis} Normal Pulled kube
47m 47m 1 frontend-56f7975f44-rflgz.1574e5f9e7166672 Pod spec.containers{php-redis} Normal Created kub
47m 47m 1 frontend-56f7975f44-rflgz.1574e5fa1524773e Pod spec.containers{php-redis} Normal Started kub
52s 52s 1 frontend-56f7975f44-xw7vd.1574e88a716fa558 Pod Normal Scheduled default-scheduler Successful
49s 49s 1 frontend-56f7975f44-xw7vd.1574e88b37cf57f1 Pod spec.containers{php-redis} Normal Pulled kube
48s 48s 1 frontend-56f7975f44-xw7vd.1574e88b4cb8959f Pod spec.containers{php-redis} Normal Created kub
47s 47s 1 frontend-56f7975f44-xw7vd.1574e88b8aee5ee6 Pod spec.containers{php-redis} Normal Started kub
47m 47m 1 frontend-56f7975f44.1574e5f9483ea97c ReplicaSet Normal SuccessfulCreate replicaset-controlle
47m 47m 1 frontend-56f7975f44.1574e5f949bd8e43 ReplicaSet Normal SuccessfulCreate replicaset-
8s 52s 8 redis-master-6b464554c8-f5p7f.1574e88a71687da6 Pod Warning FailedScheduling default-scheduler
52s 52s 1 redis-master-6b464554c8.1574e88a716d02d9 ReplicaSet Normal SuccessfulCreate replicaset-contr
8s 52s 7 redis-slave-b58dc4644-7w468.1574e88a73b5ecc4 Pod Warning FailedScheduling default-scheduler 0
8s 52s 8 redis-slave-b58dc4644-lqkdp.1574e88a78913f1a Pod Warning FailedScheduling default-scheduler 0
52s 52s 1 redis-slave-b58dc4644.1574e88a73b40e64 ReplicaSet Normal SuccessfulCreate replicaset-control
52s 52s 1 redis-slave-b58dc4644.1574e88a78901fd9 ReplicaSet Normal SuccessfulCreate replicaset-control
0s 54s 8 redis-slave-b58dc4644-7w468.1574e88a73b5ecc4 Pod Warning FailedScheduling default-scheduler 0
0s 54s 9 redis-slave-b58dc4644-lqkdp.1574e88a78913f1a Pod Warning FailedScheduling default-scheduler 0
0s 54s 9 redis-master-6b464554c8-f5p7f.1574e88a71687da6 Pod
0s 1m 13 redis-slave-b58dc4644-lqkdp.1574e88a78913f1a Pod Warning FailedScheduling default-scheduler 0
What you can see is that all your precious messages are gone! This shows the
importance of having Persistent Volume Claims (PVCs) for any data that you
want to survive a node failure.
Let's look at some messages from the frontend and understand what they mean:
9m 1h 3 frontend.1574e31070390293 Service Normal UpdatedLoadBalancer service-controller
The preceding message is the first hint we get when something goes wrong. Your
curl command might have hiccupped a little, but it has continued. You have to
hit the frontend URL in your browser for migration to kick in. The reason you
have to reload the frontend is because of how the frontend is constructed: it
just loads the HTML once and expects JavaScript to hit the Redis database. So, hit
refresh in your browser:
52s 52s 1 frontend-56f7975f44-fbksv.1574e88a6e05a7eb Pod
You can see that one of the frontend pods is scheduled for migration to agent-1:
50s 50s 1 frontend-56f7975f44-fbksv.1574e88aec0fb81d Pod spec.
50s 50s 1 frontend-56f7975f44-fbksv.1574e88b004c01e6 Pod spec.
49s 49s 1 frontend-56f7975f44-fbksv.1574e88b44244673 Pod spec.
Next, Kubernetes checks whether the Docker image is present on the node,
downloading it if required. Then the container is created and started.
Diagnosing out-of-resource errors
After shutting down agent-0, we can observe the issue of being out of resources.
Only one node is available, and the events report that it is out of disk space:
0s 1m 13 redis-slave-b58dc4644-lqkdp.1574e88a78913f1a Pod Warni
When you list the pods (kc get pods), you will get the following output:
redis-slave-b58dc4644-tcl2x 0/1 Pending 0 4h
redis-slave-b58dc4644-wtkwj 1/1 Unknown 0 6h
redis-slave-b58dc4644-xtdkx 1/1 Unknown 1 20h
If you launched the cluster on VMs with more vCPUs (ours was running the smallest
available, A1), you can set the replicas to 10 or higher to recreate this issue as follows:
kubectl scale --replicas=10 deployment/redis-slave
Now that we have confirmed the issue, let's get back to the error:
redis-slave-... Warning FailedScheduling ... 0/2 nodes are available: 1 Insufficient cpu, 1 node(s
The error mentions three conditions:
Insufficient CPU
One node not ready
One node out of disk space
One node not ready: We know about this error because we caused it. We
can also probably guess that it is the same node that is reporting out of disk
space.
How can we make sure that it is the Insufficient cpu issue instead of the node
being out of disk space? Let's explore this using the following steps:
2. Use the kubectl exec command to run a shell inside one of the running
frontend pods, as follows:
kc exec -it frontend-<running-pod-id> bash
4. Clearly there is enough disk space, since nothing here reports a shortage.
So, enter the following command to find out why the node is being reported as
out of disk space:
kc describe nodes
This is not much help in determining where the out-of-disk-space issue is
coming from (the Unknown status doesn't mean out of disk). This seems to be a bug
in the event-reporting mechanism of Kubernetes, although this bug might be
fixed by the time you read this.
In our case, the CPU is the bottleneck. So, let's see what Kubernetes is having
trouble with, by getting the ReplicaSet definition of redis-slave as follows:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get rs
NAME DESIRED CURRENT READY AGE
frontend-56f7975f44 1 1 1 20h
redis-master-6b464554c8 1 1 1 20h
redis-slave-b58dc4644 1 1 0 20h
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get -o yaml rs/redis-slave-b58dc4644
apiVersion: extensions/v1beta1
...
kind: ReplicaSet
resources:
requests:
cpu: 100m
You might think that since redis-slave is used only for reading, the application
might still work. On the surface, it looks okay. The guestbook appears in the
browser when we enter the IP address as follows:
The browser developer tools are good debugging aids for cases like this, and are
available in most browsers. You can launch them by right-clicking on the page and
choosing Inspect:
After a page refresh, you can see this error in the Network tab:
<br />
<b>Fatal error</b>: Uncaught exception 'Predis\Connection\ConnectionException' with message 'Connecti
There are multiple ways we can solve this issue. In production, you would restart
the node or add additional nodes. To demonstrate, we will try multiple
approaches (all of them coming from practical experience).
Reducing the number of replicas to
the bare minimum
Our first approach is to reduce the number of replicas to only what is essential
by using the following command:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc scale --replicas=1 deployment/frontend
deployment.extensions/frontend scaled
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc scale --replicas=1 deployment/redis-slave
deployment.extensions/redis-slave scaled
Despite reducing the replicas, we still get the error message. The VMs we are
running on simply do not have the horsepower to run these apps.
Reducing CPU requirements
We could use the same trick of editing the yaml file that we used earlier, as follows:
kc get -o yaml deploy/frontend > frontend.yaml
...
This time, though, we are going to download the yaml file and modify it as follows:
curl -O -L https://raw.githubusercontent.com/kubernetes/examples/master/guestbook/all-in-one/guestbook-all-in-one.yaml
Find the resources | requests | cpu entries for redis-slave and frontend and replace 100m with 10m:
cpu: 10m
In our case, we get this new error from the kubectl get events command:
1s 18s 4 redis-slave-b6566c98-gq5cw.15753462c1fbce76 Pod
To fix the error shown in the previous code, let's edit the memory requirements
in the yaml file as well. This time, we will use the following command:
kubectl edit deploy/redis-slave
Kubernetes applies the required changes automatically. Keep adjusting the
replicas and resource settings until you get to this state:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get pods |grep Running
frontend-84d8dff7c4-98pph 1/1 Running 0 1h
redis-master-6b464554c8-f5p7f 1/1 Running 1 23h
redis-slave-787d9ffb96-wsf62 1/1 Running 0 1m
The guestbook appears in the browser when we enter the IP address as follows:
Since we now have the entries, we can confirm that the application is working
properly.
Cleanup of the guestbook deployment
Let's clean up by running the following delete command:
kc delete -f guestbook-all-in-one.yaml
This completes another common failure mode, node failure, in which you were
able to identify the errors that led to the issue and fix them.
Fixing storage mount issues
In this section, you will fix the issue we experienced earlier in this chapter of
non-persistent storage if a node goes down. Before we start, let's make sure that
the cluster is in a clean state:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get all
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 4d
In the last example, we saw that the messages stored in redis-master were lost
when it was restarted. The reason for this is that redis-master stores all data in its
container, and whenever it is restarted, it starts from the clean image without the data.
In order to survive restarts, the data has to be stored outside the container.
Kubernetes uses PVCs to abstract the underlying storage provider and provide this
external storage.
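As a minimal sketch (the claim name and size here are illustrative, not from this
chapter), a PVC requesting Azure disk-backed storage on AKS looks roughly like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  storageClassName: default
A pod that mounts this claim keeps its data on the Azure disk, so the data survives
the pod being rescheduled to another node.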
Starting the WordPress install
Let's start by reinstalling WordPress. We will show how it works and then verify
that storage is still present after a reboot:
This error showed that a bug in the script expected the SMTP variables to be
set (in theory, you are allowed to leave them empty).
2. Are all pods on agent-0? Your pod placement may vary. To verify this, run
the following command:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-handsonaks-wp-mariadb-0 Bound pvc-752cea13-0c73-11e9-9914-82000ff4ac53 8Gi RWO default 1h
handsonaks-wp-wordpress Bound pvc-74f785bc-0c73-11e9-9914-82000ff4ac53 10Gi RWO default 1h
The following command shows the persistent volumes backing those claims:
kc get pv # the actual persistent volumes
...
Type: AzureDisk (an Azure Data Disk mount on the host and bind mount to the pod)
DiskName: kubernetes-dynamic-pvc-74f785bc-0c73-11e9-9914-82000ff4ac53
DiskURI: /subscriptions/12736543-b466-490b-88c6-f1136a239821/resourceGroups/MC_cloud-shell-st
...
# shows the actual disk backing up the PVC
We found that, in our cluster, agent-0 had all the critical pods of the database and
WordPress.
We are going to be evil again and stop the node that can cause the most damage
by shutting down agent-0 on the Azure portal:
Click refresh on the page to verify that the page does not work.
You have to wait at least 300 seconds (in this case), as that is the default
toleration period Kubernetes allows before evicting pods from an unreachable
node. Check this by running the following command:
kc describe pods/<pod-id>
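To filter the long output down to the relevant section, something like this works
(assuming GNU grep):
kc describe pods/<pod-id> | grep -A2 Tolerations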
Keep refreshing the page once in a while, and eventually you will see
Kubernetes try to migrate the pod to the running agent.
Use kubectl edit deploy/... to fix any insufficient CPU/memory errors as shown in
the last section.
In our case, we see these errors when running kubectl get events:
2s 2s 1 handsonaks-wp-wordpress-55644f585c-hqmn5.15753d7898e1545c Pod
36s 36s 1 handsonaks-wp-wordpress-55644f585c-hqmn5.15753d953ce0bdca Pod
To recover, we tried the following:
Use the Azure portal to manually detach the disk we identified previously.
Delete the old pod manually (the one with the status Unknown) to force-detach
the volume (a force-delete sketch follows this list).
Give it around 5 or 10 minutes, then delete the pod to force Kubernetes to
try again.
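As a sketch, the force-delete of a pod stuck in the Unknown state looks like this
(the pod name is a placeholder):
kubectl delete pod <pod-name> --grace-period=0 --force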
By trying some, or all, of these, we were able to mount the WordPress volume
on agent-1, but not the mariadb volume. We had to restart agent-0 to get the cluster
to a decent state. At this point, there are only two options:
It is for this reason and more that we recommend using managed DBs for your
pods and not hosting them yourself. We will see how we can do that in the
upcoming chapters.
When your cluster is running for a long time, or at scale, eventually you will run
into the following issues:
kube-dns or the kubelet stops working and will require a restart.
Azure limits outside connections from a single VM to 1,024.
If any of your pods create zombie processes and don't clean up, you won't
be able to even connect to the pod. You will have to restart the pod.
Before continuing, let's clean up the PV/PVC using the following command:
helm delete --purge handsonaks-wp
# delete any pv or pvc that might be present using kubectl delete pvc/...
By the end of this section, you will have detailed knowledge of how to study
and fix node failures.
Upgrading your application
Using Deployments makes upgrading a straightforward operation. As with any
upgrade, you should have good backups in case something goes wrong. Most of
the issues you will run into will happen during upgrades. Cloud-native
applications are supposed to make dealing with this relatively easy, but that is
possible only if you have a very strong development team that has the ability to
do incremental rollouts (with support for rollback).
There is a trade-off between getting features out for customers to see versus
spending a lot of time ensuring developer training, automated tests, and
disciplined developers and product managers. Remember, most successful
companies that now do upgrades in production multiple times a day ran
monolithic, revenue-generating applications for years before they were able to
switch to a microservices-based approach.
Most methods here work great if you have stateless applications. If you have a
state stored anywhere, back up before you try anything.
kubectl edit
For a deployment, all we have to do is change the values that we want to change
using the kubectl edit command as follows:
kubectl edit <resource>
The deployment detects the changes (if any) and matches the running state to the
desired state. Let's see how it's done:
2. After a few minutes, all the pods should be running. Let's do our first upgrade
by changing the service from ClusterIP to LoadBalancer:
code guestbook-all-in-one.yaml
# change the frontend service section's type from ClusterIP to LoadBalancer
# refer to the previous sections if you are not sure how to change it
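After the edit, the frontend service section should look roughly like this (a
sketch based on the upstream guestbook example; only the relevant fields are
shown):
apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    app: guestbook
    tier: frontend
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: guestbook
    tier: frontend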
4. You should see the external IP in the pending state (you can ignore the
warnings about using the --save-config option):
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend LoadBalancer 10.0.247.224 <pending> 80:30886/TCP 7m
8. Running kubectl get events will show the rolling update strategy that the
Deployment uses to update the frontend images:
12s 12s 1 frontend.157557b31a134dc7 Deployment
12s 12s 1 frontend-5785f8455c.157557b31e83d67a ReplicaSet
12s 12s 1 frontend-5785f8455c-z99v2.157557b31f68ac29 Pod
12s 12s 1 frontend.157557b31be33765 Deployment
12s 12s 1 frontend-56f7975f44.157557b31bce2beb ReplicaSet
11s 11s 1 frontend-5785f8455c-z99v2.157557b35b5176f7 Pod spec.
11s 11s 1 frontend-56f7975f44-rfd7w.157557b32b25ad47 Pod
You will also see two replica sets for the frontend, the new one replacing
the other one pod at a time:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kc get rs
NAME DESIRED CURRENT READY AGE
frontend-56f7975f44 0 0 0 19m
frontend-5785f8455c 3 3 3 4m
When you run the kubectl get pods command, you should see two pods for
wordpress:
Running describe on them and grepping for Image should show that
the wordpress pod is being redeployed with the image.tag set in the second
step.
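For example, assuming the chart applies the usual app=wordpress label (an
assumption, not from this chapter), something like the following surfaces the
image in use:
kubectl describe pods -l app=wordpress | grep Image: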
We started the chapter off by looking at how to define the use of a load balancer
and leverage the replica creation feature in Kubernetes to achieve scalability.
With this type of scalability, we also achieve failover by using a load balancer
and multiple instances of the software for stateless applications.
After that, we showed you how to troubleshoot simple problems you might run
into, and how to use persistent storage to avoid data loss if a node goes down or
needs to be rebooted.
In the next chapter, we will look at how to set up Ingress services and certificate
managers to interface with LetsEncrypt.
Single Sign-On with Azure AD
HTTPS has become a necessity for any public-facing website, given the
prevalence of phishing attacks. Luckily, with the LetsEncrypt service and helpers
in Kubernetes, it is very easy to set up verified SSL certificates. In this chapter,
we will see how to set up Ingress services and certificate managers to interface
with LetsEncrypt.
You can find the code files for this chapter at https://github.com/PacktPublishing/Hand
s-On-Kubernetes-on-Azure.
HTTPS support
Obtaining Secure Sockets Layer (SSL) certificates was traditionally an
expensive business. If you wanted to do it cheaply, you could self-sign your
certificates, but browsers would complain when opening up your site and
identify it as not trusted. The LetsEncrypt service changes all that. You do get
some extra benefits with commercial certificate providers, but the
certificates issued by LetsEncrypt should be sufficient.
Installing Ingress
Exposing services to the public and routing was "an exercise left to the reader"
when Kubernetes started. With the Ingress object, Kubernetes provides a clean
way of securely exposing your services. It provides an SSL endpoint and name-
based routing. Let's install the nginx version of the Ingress by performing the
following steps:
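As a sketch, with the Helm version used elsewhere in this book, a minimal install
of the nginx ingress controller from the then-current stable repository would look
something like this:
helm install stable/nginx-ingress --namespace kube-system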
You can browse to the web page by entering http://<EXTERNAL-IP> in the browser
and it will automatically redirect to the https://<EXTERNAL-IP> secure site, where you
will get the security warning.
Launching the Guestbook application
To launch the guestbook application, type in the following command:
kubectl create -f https://raw.githubusercontent.com/kubernetes/examples/master/guestbook/all-in-one/guestbook-all-in-one.yaml
Adding an Ingress
Use the following yaml file to expose the frontend service via the ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: simple-frontend-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /
        backend:
          serviceName: frontend
          servicePort: 80
1. Install the certificate manager that interfaces with the LetsEncrypt API to
request a certificate for the domain name you specify.
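As a sketch, assuming the stable chart of that era, the install would look
something like this:
helm install stable/cert-manager --namespace kube-system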
In case it is already taken, change the FQDN to something more unique to you, such as
handsonaks-yourpetname-ing.
For LetsEncrypt, we need a valid FQDN in order for the certificate to be issued.
LetsEncrypt assumes that if you are able to provide the valid IP for a given
Domain Name System (DNS) entry, you have the rights to the domain. It will
issue the certificate only after such verification. This is to prevent certificates
being issued for your domain by bad actors.
The following script obtains a DNS name for a given Azure Public IP:
#!/bin/bash
# Public IP address of your ingress controller
IP="<external IP of the ingress service>"
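The rest of the script, sketched from the standard Azure CLI approach (the DNS
label handsonaks-ingress is an assumption; pick a unique name of your own):
# Find the Azure resource ID of the public IP
PUBLICIPID=$(az network public-ip list --query "[?ipAddress!=null]|[?contains(ipAddress, '$IP')].[id]" --output tsv)
# Assign a DNS label to the public IP
az network public-ip update --ids $PUBLICIPID --dns-name handsonaks-ingress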
The certificate manager obtains the certificate for the domain specified and
handles the handshake required for verification. Pretty cool stuff.
Securing the frontend service
connection
Let's create the LetsEncrypt HTTPS frontend connection. Here is a quick
status update:
The missing piece is the connection between our public ingress to the frontend
service. The following code will create that for you:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: frontend-aks-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  tls:
  - hosts:
    - handsonaks-ingress.westus2.cloudapp.azure.com
    secretName: tls-secret
  rules:
  - host: handsonaks-ingress.westus2.cloudapp.azure.com
    http:
      paths:
      - path: /
        backend:
          serviceName: frontend
          servicePort: 80
Authentication deals with identity (who are you?), and in general requires a
trusted provider (such as Google, GitHub, or Azure).
Authorization deals with permissions (what are you trying to do?), and is very
implementation-specific in terms of which application resources need to be
protected.
Authentication deals with verifying whether you are who you say you are. The
normal verification system is via username and password. The assumption is
that only you know your username and password and therefore you are the
person who is logging in. Obviously, with recent hacks, it has not proven to be
sufficient, hence the implementation of two-factor authentication and multi-
factor authentication. On top of that, it has become very hard for people to
remember their multiple user accounts and passwords. To help alleviate that,
authentication is provided as a service by multiple providers with support for
OAuth or SAML. Here are some of the well-known providers:
Google (https://github.com/pusher/oauth2_proxy#google-auth-provider)
Azure (https://github.com/pusher/oauth2_proxy#azure-auth-provider)
Facebook (https://github.com/pusher/oauth2_proxy#facebook-auth-provider)
GitHub (https://github.com/pusher/oauth2_proxy#github-auth-provider)
GitLab (https://github.com/pusher/oauth2_proxy#gitlab-auth-provider)
LinkedIn (https://github.com/pusher/oauth2_proxy#linkedin-auth-provider)
For more information on Azure Active Directory, see
https://docs.microsoft.com/en-us/azure/active-directory/.
3. Click on the Copy icon and save the secret in a safe place:
4. Save the client and the tenant ID:
After creating the client ID secret, we will now launch oauth2_proxy with the
following YAML file:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: oauth2-proxy
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: oauth2-proxy
  template:
    metadata:
      labels:
        app: oauth2-proxy
    spec:
      containers:
      - args:
        - --provider=azure
        - --email-domain=microsoft.com
        - --upstream=http://10.0.83.95:80
        - --http-address=0.0.0.0:4180
        - --azure-tenant=d3dc3a5f-de30-4781-8752-7814fd5d0a5e
        env:
        - name: OAUTH2_PROXY_CLIENT_ID
          value: 9f640227-965c-43ac-bf8d-8bc5eac86ea1
        - name: OAUTH2_PROXY_CLIENT_SECRET
          value: "wu:q{%.}+^&X(K;_!K|0:1+k(v^.E%^]%w)7;);*NL9$;>!l()_"
        - name: OAUTH2_PROXY_COOKIE_SECRET
          value: 9ju360pxM2nVQdQqQZ4Dtg==
        image: docker.io/colemickens/oauth2_proxy:latest
        imagePullPolicy: Always
        name: oauth2-proxy
        ports:
        - containerPort: 4180
          protocol: TCP
Next, oauth2-proxy needs to be exposed as a service so that the ingress can talk
to it, by running the following code:
apiVersion: v1
kind: Service
metadata:
  name: oauth2-proxy
  namespace: default
spec:
  ports:
  - name: http
    port: 4180
    protocol: TCP
    targetPort: 4180
  selector:
    app: oauth2-proxy
Create an ingress so that any URL that goes to
handsonaks-ingress-<yourname>.westus2.cloudapp.azure.com/oauth2 will be redirected to
the oauth2-proxy service.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: oauth2-proxy-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    kubernetes.io/tls-acme: "true"
spec:
  tls:
  - hosts:
    - handsonaks-ingress.westus2.cloudapp.azure.com
    secretName: tls-secret
  rules:
  - host: handsonaks-ingress.westus2.cloudapp.azure.com
    http:
      paths:
      - path: /oauth2
        backend:
          serviceName: oauth2-proxy
          servicePort: 4180
Finally, we will link the oauth2 proxy to the frontend service by creating an
ingress that configures nginx so that authentication is checked using the paths in
auth-url and auth-signin. If the check is successful, the traffic is redirected to the
backend service.
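As a sketch, the annotations involved on that ingress look something like the
following (these are the standard nginx-ingress auth annotations; the host is the
FQDN we used earlier):
nginx.ingress.kubernetes.io/auth-url: "https://handsonaks-ingress.westus2.cloudapp.azure.com/oauth2/auth"
nginx.ingress.kubernetes.io/auth-signin: "https://handsonaks-ingress.westus2.cloudapp.azure.com/oauth2/start"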
We are done with the configuration. You can now log in with your existing
Microsoft account to the service at https://handsonaks-ingress-
<yourname>.westus2.cloudapp.azure.com/.
oauth2_proxy supports multiple authentication providers, such as GitHub and
Google. Only the oauth2-proxy deployment's yaml has to be changed with the
right settings to switch the auth provider. Please see
https://github.com/pusher/oauth2_proxy#oauth-provider-configuration.
Summary
In this chapter, we added access control to our guestbook application without
actually changing the source code of it by using the sidecar pattern in
Kubernetes (https://kubernetes.io/blog/2015/06/the-distributed-system-toolkit-
patterns/). We started by getting the Kubernetes ingress objects to redirect to a
https://.... secured site. Then we installed the certificate manager that interfaces
with the LetsEncrypt API to request a certificate for the domain name you
specified in the next steps. We leveraged a Certificate Issuer, which gets the
certificate from LetsEncrypt, and created the actual certificate for a given Fully-
Qualified Domain Name (FQDN). We then created an Ingress to the service
with the certificate we'd created. Finally, we jumped into authentication (AuthN)
and authorization (AuthZ), and showed you how to leverage AzureAD as an
authentication provider for the guestbook application.
In the next chapter, you will learn how to be a superhero, by predicting and
fixing issues before they occur through proactive monitoring and alerts. You will
also learn to use your X-ray vision to quickly identify root causes when errors do
occur, and learn how to debug applications running on AKS. You will be able to
perform the right fixes once you have identified the root cause.
Monitoring the AKS Cluster and the
Application
In this chapter, you will learn how to monitor your cluster and the applications
running on it. We will also show the use of the Microsoft Operations
Management Suite (OMS) agent and the integration with Azure Portal, as well
as set up alerts for critical events on the AKS cluster. You will be proactive in
monitoring your cluster and the applications running on it, and you will be a
hero for being able to proactively prevent errors from happening (launching
more nodes when the cluster is CPU-constrained, for example) and for quickly
resolving issues when they do happen.
You will find the code files for this chapter by accessing the following link: https
://github.com/PacktPublishing/Hands-On-Kubernetes-on-Azure.
Commands for monitoring
applications
Monitoring deployed applications on AKS along with monitoring Kubernetes
health is essential to provide reliable service to your customers. There are two
primary use cases for monitoring:
Before we start, we are going to have a clean start with our guestbook example.
If you have guestbook already running in your cluster, delete it by running the
following command on the Azure Cloud Shell:
kubectl delete -f guestbook-all-in-one.yaml
Then recreate it with kubectl create -f guestbook-all-in-one.yaml. While the
create command is running, we will watch the progress in the following sections.
kubectl get command
To see the overall picture of deployed applications, kubectl provides the get command.
The get command lists the resources that you specify. Resources can be pods,
replication controllers, ingress, nodes, deployments, secrets, and so on. We have
already run this in the previous chapters to verify whether our application is
ready to be used or not. Perform the following steps:
1. Run the following get command, which will get us the resources and their
statuses:
kubectl get all
You will get something like this, as shown in the following block:
NAME READY STATUS RESTARTS AGE
pod/frontend-5785f8455c-2dsgt 0/1 Pending 0 9s
pod/frontend-5785f8455c-f8knz 0/1 Pending 0 9s
pod/frontend-5785f8455c-p9mh9 0/1 Pending 0 9s
pod/redis-master-6b464554c8-sghfh 0/1 Pending 0 9s
pod/redis-slave-b58dc4644-2ngwx 0/1 Pending 0 9s
pod/redis-slave-b58dc4644-58lv2 0/1 Pending 0 9s
...
With the -w (watch) flag, the command doesn't exit, and the output changes only
if the state of any pod changes. For example, you will see an output that is similar to this:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kubectl get -w pods
NAME READY STATUS RESTARTS AGE
frontend-5785f8455c-2dsgt 0/1 Pending 0 5m
frontend-5785f8455c-f8knz 0/1 Pending 0 5m
frontend-5785f8455c-p9mh9 0/1 Pending 0 5m
redis-master-6b464554c8-sghfh 0/1 Pending 0 5m
redis-slave-b58dc4644-2ngwx 0/1 Pending 0 5m
redis-slave-b58dc4644-58lv2 0/1 Pending 0 5m
You will see that only the pods are shown. You also get a view of the state changes
of the pods that Kubernetes handles. The first column is the pod name (frontend-
5785f8455c-f8knz), for example. The second column is how many containers in the
pod are ready over the total number of containers in the pod (0/1 initially
meaning 0 containers are up while a total of 1 is expected). The third column is
the status (Pending/ContainerCreating/Running/...). The fourth column is the number
of restarts. The fifth column is the age (when the pod was asked to be created).
Press Ctrl + C to stop the monitoring.
Once the state changes, you don't see the history of state changes. For example,
if you run the same command now, it will be stuck at the running state till you
press Ctrl + C:
ab443838-9b3e-4811-b287-74e417a9@Azure:~$ kubectl get -w pods
NAME READY STATUS RESTARTS AGE
frontend-5785f8455c-2dsgt 1/1 Running 0 26m
frontend-5785f8455c-f8knz 1/1 Running 0 26m
frontend-5785f8455c-p9mh9 1/1 Running 0 26m
redis-master-6b464554c8-sghfh 1/1 Running 0 26m
redis-slave-b58dc4644-2ngwx 1/1 Running 0 26m
redis-slave-b58dc4644-58lv2 1/1 Running 0 26m
To see the history if something goes wrong, run the following command:
kubectl get events
Kubernetes maintains events for only one hour by default. All the commands work only if the
event was fired within the past hour.
If everything went well, you should have an output something similar to the
following one:
42s 42s 1 frontend-5785f8455c-wxsdm.1581ea340ab4ab56 Pod
42s 42s 1 frontend-5785f8455c.1581ea34098640c9 ReplicaSet
40s 40s 1 frontend-5785f8455c-2trpg.1581ea3487c328b5 Pod spec.
40s 40s 1 frontend-5785f8455c-2trpg.1581ea34b5abca9e Pod
39s 39s 1 frontend-5785f8455c-2trpg.1581ea34d18c96f8 Pod
The general states for a pod are Scheduled->Pulled->Created->Started. As we will see
next, things can fail at any of these states, and we need to use the kubectl describe
command to dig deeper.
kubectl describe command
The kubectl get events command lists all the events for the entire namespace. If you
are interested in just a pod, you can use the following command:
kubectl describe pods
The preceding command lists all the information about all pods.
If you want information on a particular pod, you can type the following:
kubectl describe pod/<pod-name>
From the description, you can get the node on which the pod is running, how
long it was running, its internal IP address, docker image name, ports exposed, env
variables, and the events (within the past hour).
In the preceding example, the pod name is frontend-5785f8455c-2trpg. Note that it
has the <replicaset name>-<random 5 chars> format. The replicaset name itself is
randomly generated from the deployment name frontend.
The namespace under which the pod runs is default. So far we have been just
using the default namespace, appropriately named default. In the next chapters,
we will see how namespaces help us to isolate pods.
The next section that is important from the preceding output is the node section.
Node: aks-agentpool-26533852-0/10.240.0.4
The node section lets us know which physical node/VM the pod is actually
running on. If the pod is repeatedly restarting or having issues running and
everything else seems OK, there might be an issue with the node. Having this
information is essential to perform advanced debugging.
The start time shown doesn't mean that the pod has been running since that time,
so the time can be misleading in that sense. The actual uptime of the pod depends on
whether it was moved from one node to another, or whether the node it was on went down.
The following shows the internal IP of the pod and its status:
Status: Running
IP: 10.244.0.87
When a service directs its traffic or another container wants to talk to the
containers in this pod, this is the IP that they will see. This IP is very useful when
resolving application issues. Let's say you want to know why the frontend is not
able to reach the server; you could find the server pod's IP and try pinging it from
the frontend container, as shown in the example that follows.
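For example, assuming a ping utility is available in the frontend image (an
assumption, not from this chapter), something like this would do:
kubectl exec -it <frontend-pod-name> -- ping 10.244.0.87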
The containers running in the pod and the ports that are exposed are listed in the
following block:
Containers:
php-redis:
...
Image: gcr.io/google-samples/gb-frontend:v3
...
Port: 80/TCP
Host Port: 0/TCP
...
Environment:
GET_HOSTS_FROM: dns
In this case, we are getting the gb-frontend container with the v3 tag from the gcr.io
container registry, and the repository name is google-samples.
Port 80 is exposed for outside traffic. Since each pod has its own IP, the same
port can be exposed by multiple copies of the pod, even when they run on the
same host. This is a huge management advantage, as we don't have to worry
about port collisions. The port that needs to be configured is also fixed, so that
scripts can be written simply without the logic of figuring out which port
actually got allocated for the pod.
In this section, we will introduce common errors and determine how to debug
and fix them.
Next, change the image tag from v3 to v_non_existent by editing the deployment
(kubectl edit deployment/frontend) and changing the image tag:
image: gcr.io/google-samples/gb-frontend:v3
image: gcr.io/google-samples/gb-frontend:v_non_existent
Running the following command lists all the pods in the current namespace:
kc get pods
A sample error output that should be similar to your output is shown here. The
key error line is highlighted in bold:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned defaul
Normal Pulling 1m (x4 over 2m) kubelet, aks-agentpool-26533852-0 pulling image "gcr.io/google
Warning Failed 1m (x4 over 2m) kubelet, aks-agentpool-26533852-0 Failed to pull image "gcr.io
Warning Failed 1m (x4 over 2m) kubelet, aks-agentpool-26533852-0 Error: ErrImagePull
Normal BackOff 1m (x6 over 2m) kubelet, aks-agentpool-26533852-0 Back-off pulling image "gcr.
Warning Failed 1m (x7 over 2m) kubelet, aks-agentpool-26533852-0 Error: ImagePullBackOff
So, the events clearly show that the image does not exist. Errors such as passing
invalid credentials to private Docker repositories will also show up here.
Let's fix the error by setting the image tag back to v3:
kubectl edit deployment/frontend
image: gcr.io/google-samples/gb-frontend:v_non_existent
image: gcr.io/google-samples/gb-frontend:v3
Save the file, and the deployment should get automatically fixed. You can verify
it by getting the events for the pods again.
Because we did a rolling update, the frontend was continuously available with zero downtime.
Kubernetes recognized a problem with the new specification and stopped rolling out the
changes automatically.
Application errors
We will see how to debug an application error. The errors in this section will be
self-induced, similar to the last section. The method of debugging the issue is the
same as the one we used to debug errors on running applications.
Most errors come from misconfiguration, which can be fixed by editing the specification.
Errors in the application code itself require a new image to be built and used.
Scaling down the frontend
With replicas=3, the request can be handled by any of the pods. To introduce the
application error and note the errors, we need to make changes in all three of
them. Let's make our life easier, by scaling the replicas to 1, so that we can make
changes in one pod only:
kubectl scale --replicas=1 deployment/frontend
Introducing an app "error"
In this case, we are going to make the Submit button fail to work.
We will use the kubectl exec command that lets you run the commands on a pod.
With the -it option, it attaches an interactive terminal to the pod and gives us a
shell that we can run our commands on. The following command launches a
Bash Terminal on the pod:
kubectl exec -it <frontend-pod-name> bash
Once inside the pod, install the vim editor (for example, apt-get update && apt-get
install vim, assuming a Debian-based image) so that we can edit the file to
introduce the error.
You can run another instance of the cloud shell by clicking the button shown.
This will allow debugging while editing the application code:
We are introducing an error where reading messages will work, but not writing
them. We do this by asking the frontend to connect to the Redis master at a
non-existent localhost server. The writes should fail.
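As a sketch (the surrounding code is assumed from the gb-frontend image's
guestbook.php), the edit looks something like this:
if ($_GET['cmd'] == 'set') {
    // Deliberately wrong: there is no Redis server on localhost
    $host = 'localhost';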
This shows that the app is not production ready. If we did not know any better,
we would have thought the entry was written safely. So what is going on?
Without going too much into the application design, the way the app works is as follows:
It grabs all the entries on startup and stores them in its cache
Any time a new entry is added, the object is added to its cache independent of whether
the server acknowledges that it was written.
If you have network debugging tools turned on in your browser, you can catch
the response from the server.
To verify that it has not been written to the database, hit the Refresh button in
your browser; you will see only the initial entries and the new entry has
disappeared:
As an app developer or operator, you probably get a ticket like this: After the
new deployment, new entries are not persisted. Fix it.
Logs
The first step is to get the logs. On the second shell, run the following:
kubectl logs <frontend-pod-name>
So you know that the error occurs somewhere while writing to the database, in
the "set" section of the code.
On the other cloud shell window, edit the file to print out debug messages:
if ($_GET['cmd'] == 'set') {
    if(!defined('STDOUT')) define('STDOUT', fopen('php://stdout', 'w'));
    fwrite(STDOUT, "hostname at the beginning of 'set' command ");
    fwrite(STDOUT, $host);
    fwrite(STDOUT, "\n");
Add a new entry to the browser and look at the logs again:
kc logs <frontend-pod-name>
So we "know" that the error is between this line and the starting of the client, so
the setting of the $host = 'localhost' must be the offending error. This error is not
as uncommon as you think it would be, and as we just saw could have easily
gone through QA unless there had been a specific instruction to refresh the
browser. It could have worked perfectly will for the developer, as they could
have a running redis server on the local machine.
Check the browser network logs to be sure that the redis backend database is being hit and the
entries are retrieved.
The following points summarize some common errors and methods to fix the
errors:
We won't be installing the Kubernetes dashboard, as you get the same benefit,
without the hassle, from Azure Kubernetes monitoring itself. If you still want to
do it, please see https://blog.heptio.com/on-securing-the-kubernetes-dashboard-16b09b1b7aca.
The underlying OS image, internal IP, and other useful information about the
nodes can be obtained with the following command:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
aks-agentpool-26533852-0 Ready agent 34d v1.11.5 10.240.0.4 <none> Ubuntu 16.04.5 LTS 4.15.0-1036-az
You can find out which nodes are consuming the most resources, using the
following:
kubectl top nodes
ENV variables can contain sensitive information, so access to insights should be tightly
controlled through RBAC.
Apart from the logs, it also shows the environment variables that are set for the
container:
Logs
Log filtering helps in obtaining the kube events of interest that happened before
the default window of an hour.
Filtering by container name can also be done, without resorting to the Azure
cloud shell.
The following screenshot shows the log information, such as the event time,
node, resource type or object kind, namespace, resource name, and message. The
pod with the ImagePullBackOff error is highlighted as follows:
Summary
We started the chapter by showing how to use the kubectl events to monitor the
application. Then we showed how powerful the created logs are to debug the
application. The logs contain all the information that is written to stdout and
stderr. We also touched on the Kubernetes dashboards and showed you how to
use the Kubernetes metrics for the operational monitoring of your deployments.
Lastly, we explained the use of OMS to show the AKS metrics and environment
variables, as well as logs with log filtering. You now have the skills to set alerts
on any metric that you would like to be notified of by leveraging Azure Insights.
You also learned how to debug application and cluster issues through the use of
kubectl and OMS monitoring.
In the next chapter, we will learn how to secure an AKS cluster with role-based
security, leveraging Azure Active Directory as an authentication provider.
Operation and Maintenance of AKS
Applications
In production systems, you need to allow different personnel access to certain
resources; this is known as role-based access control (RBAC). This chapter
will take you through how you can turn on RBAC on AKS and practice
assigning different roles with different rights. Users would be able to verify that
their access is denied when trying to modify resources that they do not have
access to. The benefit of establishing RBAC is that it acts not only as a
guardrail against the accidental deletion of critical resources but also as an
important security feature that limits full access to the cluster to the roles that
really need it.
Kubernetes developers realized this was a problem, and added RBAC along with
the concept of service roles to control access to the cluster.
Service roles let you assign read-only and read/write access to Kubernetes
resources. You can say person X has read-only access to the pods running in a
namespace. The neat thing about AKS is that the person can be tied to Azure
Active Directory (which in turn can be linked to your corporate Active Directory
via an SSO solution).
Deleting any AKS cluster without
RBAC
If you have a cluster already running, to save costs and reduce variability, it is
recommended that you delete the cluster before starting. As with the preceding
warning, it is assumed that you are following this book using your own personal
account. Be very careful before deleting the cluster if you are using your
corporate or shared account.
Please make sure that you save the key value somewhere secure.
5. Write down the Application ID. This will be used as "Client application ID"
when creating the cluster:
For example, for this book, the domain name was handsonaksoutlook: https://login.windows.net/han
dsonaksoutlook.onmicrosoft.com/.well-known/openid-configuration.
Deploying the cluster
On the cloud shell, create a resource group:
az group create --name handsonaks-rbac --location eastus
Deploy the cluster using the following command on the cloud shell:
az aks create \
--resource-group handsonaks-rbac \
--name handsonaks-rbac \
--generate-ssh-keys \
--aad-server-app-id <server-app-id> \
--aad-server-app-secret <server-app-secret> \
--aad-client-app-id <client-app-id> \
--aad-tenant-id <tenant-id>
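If you need to look up the tenant ID, the Azure CLI can provide it:
az account show --query tenantId -o tsv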
Do not select New guest user. Guest users cannot be assigned roles.
The username has to be in the domain that you are the admin of. In this case, an
Outlook account was used and hence the domain name is
handsonaksoutlook.onmicrosoft.com. Write down the password.
Creating a read-only group and
adding the user to it
To demonstrate that you can manage using groups, instead of individual users,
let's create a read-only user group and add the new user to the group:
Note that you have to specify --admin so that you can work on your cluster:
az aks get-credentials --resource-group handsonaks-rbac --name handsonaks-rbac --admin
Creating the cluster-wide, read-only
role
Create the following file and save it as cluster-read-only-role.yaml:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
  name: read-only
rules:
- apiGroups:
  - ""
  resources: ["*"]
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources: ["*"]
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  resources: ["*"]
  verbs:
  - get
  - list
  - watch
Run the following command to create a cluster-wide role named read-only that
has read-only permissions across the cluster:
kubectl create -f cluster-read-only-role.yaml
Binding the role to the AAD group
Create the following file and save it as readonly-azure-aad-group.yaml:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-only
roleRef:
  kind: ClusterRole # this must be Role or ClusterRole
  name: read-only # this must match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: "<insert the read-only group id here>"
Run the following command to create the binding, so that access is
given to anyone who is present in the group:
kubectl create -f readonly-azure-aad-group.yaml
The access test
Now, get the credentials as the read-only user.
Log in using the readonly account username. When you log in the first time, you
will be asked to change the password:
Once you have logged in successfully, you can close the window and you should
see the following output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 14h
Error from server (Forbidden): horizontalpodautoscalers.autoscaling is forbidden: User "service-readon
Error from server (Forbidden): jobs.batch is forbidden: User "service-readonly-user@handsonaksoutlook.
Error from server (Forbidden): cronjobs.batch is forbidden: User "service-readonly-user@handsonaksoutl
So we can see most resources, except for the horizontal pod autoscalers, batch
jobs, and cron jobs. We have ensured that the user has only the access that we granted.
Summary
In this chapter, we learned how to secure your AKS cluster with role-based
security by leveraging Azure Active Directory as the authentication provider. We
created a service role that lets you assign read-only or read/write access to
Kubernetes resources, and we looked at some advanced features. First, we
showed you how to create the AAD server application. Then we created the
client application. After that, we showed you how to get the AAD tenant ID and
deployed the cluster. Once we had the RBAC-enabled solution deployed, we
tested the read-only feature by creating users in the Active Directory. We then
created a read-only group and added the user to it. We finished the chapter by
creating the read-only user role and binding the role to the AAD group of the
user.
In the next chapter, you will learn how to authorize Kubernetes cluster
applications to connect to other Azure services, such as Azure SQL databases
and Event Hubs.
Section 3: Leveraging Advanced
Azure PaaS Services in Combination
with AKS
Having completed this section, the reader should be able to securely access other
Azure services, such as databases, Event Hubs, and Azure Functions. Advanced
secrets and certificate management using services such as Let's Encrypt will also
be familiar to the reader.
First, we will need to install the prerequisites that are described in the following
section.
Prerequisites
Since the WordPress sample application requires custom interfacing with Azure
services, it requires a little bit more than the normal Helm init and installation
that we have explained in previous chapters. We will need to install Open
Service Broker for Azure (https://osba.sh/).
Let's start by making a number of changes to add RBAC support for Helm.
Helm with RBAC
Since we have RBAC enabled on the cluster, we need to install Helm with
RBAC support:
It is assumed that the RBAC-enabled cluster that was deployed in the previous chapter is still
running.
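A minimal sketch of those changes, following the standard Helm v2 pattern (the
names used are the conventional ones, not from this chapter):
# Create a service account for Tiller and give it cluster-admin rights
kubectl create serviceaccount tiller --namespace kube-system
kubectl create clusterrolebinding tiller-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
# Install (or re-initialize) Tiller with that service account
helm init --service-account tiller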
2. Wait until the service catalog is deployed. You can check this by running
the following command:
helm status catalog
3. Verify that the AVAILABLE column shows 1 for both the API server and the
manager:
==> v1beta1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
catalog-catalog-apiserver 1/1 1 1 5h54m
catalog-catalog-controller-manager 1/1 1 1 5h54m
Deploying Open Service Broker for
Azure
We need to obtain the subscription ID, tenant ID, client ID, and secrets in order
for the Open Service Broker to launch Azure services on our behalf:
3. Create a service principal with RBAC enabled so that it can launch Azure
services:
az ad sp create-for-rbac --name osba-quickstart -o table
4. Save the values from the command output in environment variables:
export AZURE_TENANT_ID=<Tenant>
export AZURE_CLIENT_ID=<AppId>
export AZURE_CLIENT_SECRET=<Password>
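The Open Service Broker also needs the subscription ID; assuming the az CLI is
logged in to the right subscription, it can be captured the same way:
export AZURE_SUBSCRIPTION_ID=$(az account show --query id --output tsv)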
This rule allows a connection to the database from any IP address. As you may
have already guessed, this is a serious security hole and is the cause of many
data breaches. The good news is that this rule is not required when using AKS.
You can add the AKS VNet to the VNET Rules section and delete the AllowAll
0.0.0.0 rule, as shown in the following screenshot:
We can reduce the attack surface tremendously by performing this simple
change.
Running the WordPress sample with
MySQL Database
You can verify that your blog site is available and running by using SERVICE_IP,
which is obtained by running the following command:
echo $(kubectl get svc --namespace osba-quickstart osba-quickstart-wordpress -o jsonpath='{.status.loa
All you have to do is click on Restore and choose the point in time from which
you want to perform the restore, as shown in the following screenshot:
Finally, press OK; after approximately 15 minutes, the MySQL service should
be restored.
Connecting WordPress to the
restored database
Azure MySQL restore creates a new instance of the database. To make our
WordPress installation connect to the restored database, we need to modify the
Kubernetes deployment files. Ideally, you will modify the Helm values file and
perform a Helm upgrade; however, that is beyond the scope of this book:
1. From the Azure Portal, note down the Server name, as shown in the
following screenshot:
2. Also, modify Connection security to allow the cluster to talk to the restored
database, as shown in the following screenshot:
To verify the restore, add a few entries on the blog. On restore, these entries
should not be present.
Modifying the host setting in
WordPress deployment
In this section, we will show you how to modify the deployment by examining
the deployment file (by using kubectl describe deploy/...):
1. You can see that the host value is obtained from the secret, as follows:
MARIADB_HOST: <set to the key 'host' in secret 'osba-quickstart-wordpress-mysql-secret'>
2. To set secrets, we need the base64 value. Obtain the base64 value of the server
name by running the following command (-n avoids encoding a trailing newline):
echo -n <restored db server name> | base64
4. Get the value for uri and decode it using the following command:
echo '<base64 uri value>' | base64 -d
7. Run the following command to set the host to the new value:
kubectl edit secrets/osba-quickstart-wordpress-mysql-secret -n osba-quickstart-wordpress
8. Set the value of the host to the Base64 value that you noted down when
encoding the restored MySQL server name:
apiVersion: v1
data:
  database: azltaGppYWx1Mw==
  host: <change this value>
  password: ...
  uri: <change this value>
Even though we have reset the secret value, this doesn't mean that our
server will automatically pick up the new value.
10. There are many ways to do it, but we are going to use scaling. Scale down
the number of replicas by running the following command:
kc scale --replicas=0 deploy/osba-quickstart-wordpress -n osba-quickstart
Due to problems with attaching and detaching storage, we have to wait for at least 15 minutes
for the storage to become detached.
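After the wait, scale back up:
kc scale --replicas=1 deploy/osba-quickstart-wordpress -n osba-quickstart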
Even though, in theory, the preceding should work, if you run kubectl logs on
the WordPress pod, you will see that it is still using the old server name.
This means that the old value has to be coming from the mounted volume.
Running grep -R 'original server name' * on the pod by using kubectl
exec shows that the values are actually stored in /bitnami/wordpress/wp-
config.php.
12. Open the file and put in the restored database name. Scale the replicas up
and down (after waiting for 15 minutes). It might be easier to create a new
persistent volume claim (PVC).
The blog logs will show that it is connecting to the restored database.
Reviewing audit logs
When you run the database on the Kubernetes cluster, it is very difficult to get
audit logs should something go wrong. You need a robust way of dynamically
setting the audit level depending on the scenario. You also have to ensure that
the logs are shipped outside the cluster. Unless you have RBAC enabled and the
RBAC logs correlated, it is difficult to determine whether anyone has made
changes to the database server settings.
The Activity log provides very valuable information in retracing the activities
that have been performed. Another option is to leverage the advanced logs that
are available, which we can obtain by enabling Server logs, as shown in the
following screenshot. First, go to the logs settings, and then click on Click here
to enable logs and configure log parameters:
For our example, we will enable monitoring for performance issues by enabling
the log_slow... statements, as shown in the following screenshot:
DR options
Depending on your Service Level Agreement (SLA) and DR needs, you can
add replicas to your MySQL server, as shown in the following screenshot:
A full list of backup, restore, and replication options are documented at https://docs.microsoft.com/e
n-us/azure/mysql/concepts-backup and https://docs.microsoft.com/en-us/azure/mysql/concepts-read-replicas.
Azure SQL HADR options
Naturally, the options are much better when you use Azure SQL Database than
with MySQL. Brief highlights of all the options are listed and users are
encouraged to choose their database server based on their own needs. You can
create a test database to see the options yourself, as shown in the following
screenshot:
The advanced options are shown in the following screenshot:
We only want to highlight two of the advanced options that we looked at in the
previous section for MySQL, which are also available with Azure SQL
Database:
Active Directory (AD) admin: You can connect your company's Azure AD
to provide controlled access to the databases.
Auditing: Fine-grained auditing, even for row-level access, can be set.
Another great feature is that Geo-Replication can also be easily added, as shown
in the following screenshot:
Summary
This chapter focused on working with the WordPress sample solution that
leverages a MySQL database as a data store. We started by showing you how to
set up the cluster to connect the MySQL database by installing the Open Service
Broker for Azure and leveraging the RBAC-enabled Helm tool. We then showed
you how to install a MySQL database and drastically minimize the attack surface
by changing the default configuration to not allow public access to the database.
Then, we discussed how to restore the database from a backup and how to
leverage the audit logs for troubleshooting. Finally, we discussed how to
configure the solution for DR, and so satisfy your organization's DR needs by
using Azure SQL geo-replication.
In the next chapter, you will learn how to implement microservices on AKS,
including by using Event Hubs for loosely-coupled integration between the
applications.
Connecting to Other Azure Services
(Event Hub)
Event-based integration is a key pattern for implementing microservices. In this
chapter, you will learn how to implement microservices on AKS, including
how to use Event Hub for loosely coupled integration between applications.
Securing the communication between microservices will also be introduced.
Microservices, when implemented with the
correct organization/support in place, help businesses develop a growth mindset
in their teams. DevOps maturity is crucial in making the digital transformation
of companies a reality. You, as a developer and/or an engineer responsible for
site reliability, will learn how to deploy them, and also how to leverage Azure
Event Hub to store events. As you will learn in this chapter, event-based
integration is one of the key differentiators between monolithic and
microservice-based applications. We will cover the following topics in brief:
Introducing microservices
Deploying a set of microservices
Using Azure Event Hubs
Technical requirements
You will need to use a modern browser, such as Firefox, Chrome, or Edge.
Introducing microservices
Microservices are an architectural pattern for organizing your application
according to business domains. For more information on microservices, please
see https://martinfowler.com/articles/microservices.html. Classic examples that are
usually provided for microservices are how customer, movies, and
recommendation services are implemented. Customer service simply deals with
customer details, and has no information about movies. The movies service deals
with movie details and nothing more. The recommendation engine service deals
with recommendations only and, given a movie title, will return the movies that
are most closely related.
Independent scaling is another benefit that is most useful for systems under
heavy load. In our example, if more requests are coming in for
recommendations, our service can be scaled up without scaling up other
services.
Each service can be built with the right language of choice. For high
performance, the Rust language can be used, or the development team might
even be more comfortable developing in Python. As long as they expose REST
services, they can be deployed with services written in any other language.
Blue/green deployment and rolling updates are also made possible by the use of
microservices. You can deploy the upgraded service and check whether or not
they are working, and then push all the new requests to the upgraded service.
The requests to the old service can be drained. The preceding deployment is
called blue/green deployment. If something goes wrong, the upgraded service is
downscaled, and the customers experience almost no downtime. Rolling updates
are similar, where the old pods are slowly replaced with the new pods and the
process is reversed if something goes wrong.
Since services are kept small, they can be rewritten quickly if the initial
implementation turns out to be wrong. Composable services, such as aggregator
services, can be built on top of existing services to speed up development.
Microservices bring back the old Unix philosophy of doing one thing and doing
it well; composability means integrating services rather than stuffing everything
into one service.
Microservices are no free lunch
For all the advantages mentioned previously, microservices come with
significant costs, as described by Benjamin Wootton in http://highscalability.com/blog/2014/4/8/microservices-not-a-free-lunch.html. The following are some of the
prerequisites for running microservices successfully:
CI/CD tools must be in place to build and deploy the integrated solution
continuously.
Infrastructure to create environments on demand and maintain them is
required for continuous delivery.
Enough skill to automate environment setup is needed to ensure that
automated tests can be run reliably.
Debugging asynchronous systems is hard without the right tools; mature
operators are required who can correlate events across logs in multiple
locations.
You can run Kubernetes without microservices, but the reverse cannot generally
be said: Kubernetes has almost become a prerequisite for microservices. The
challenges mentioned previously are the main reasons for learning and
implementing a complex system such as Kubernetes.
2. We will use the Kafka and ZooKeeper charts from Bitnami, so let's add the
required Helm repos:
helm repo add bitnami https://charts.bitnami.com
helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com
6. Wait for about 15-30 minutes until all of the services are up and running.
This service does not implement any security, so we use local port
forwarding to access the service:
kubectl --namespace social-network port-forward svc/edge-service 9000
In the next section, we will move away from storing events in the cluster and
store them in Azure Event Hubs instead. By leveraging the recently added Kafka
support on Azure Event Hubs and switching to a more production-ready event
store, we will see that the process is straightforward.
Using Azure Event Hubs
Running Kafka locally is OK for demo purposes, but it is not suitable for
production use. For the same reasons that you wouldn't want to run your own
database server, you should avoid running and maintaining your own Kafka
instance. Azure Event Hubs has added support for the Kafka protocol, so with
minor modifications, we can update our application from using a local Kafka
instance to using a scalable Azure Event Hubs instance.
We will do this in two steps:
Create the Azure Event Hub via the portal and gather the required details to
connect our microservice-based application.
Modify the Helm chart to use the newly created Azure Event Hub.
Creating the Azure Event Hub
Perform the following steps to create the Azure Event Hub:
1. To create the Azure Event Hub, search for event hub in the Azure portal, as
shown in the following screenshot:
4. Fill in the details as follows. For Kafka support, the Standard tier must be
used:
5. Once the Event Hub is created, select it, as shown in the following
screenshot:
6. Click on the Shared access policies | RootManageSharedAccessKey and
copy the Connection string-primary key, as shown in the following
screenshot:
Using the Azure portal, we have created an Azure Event Hub that can store and
process our events as they are generated, and we have gathered the connection
string that we need to hook up our microservice-based application.
Updating the Helm files
We are going to switch the microservice deployment from using the local Kafka
instance to using the Azure-hosted, Kafka-compatible Event Hub instance:
The modified file will have a section that is similar to the following:
- name: SPRING_CLOUD_STREAM_KAFKA_BINDER_BROKERS
  value: "myhandsonaks.servicebus.windows.net:9093"
- name: SPRING_CLOUD_STREAM_KAFKA_BINDER_DEFAULT_BROKER_PORT
  value: "9093"
- name: SPRING_CLOUD_STREAM_KAFKA_BINDER_CONFIGURATION_SECURITY_PROTOCOL
  value: "SASL_SSL"
- name: SPRING_CLOUD_STREAM_KAFKA_BINDER_CONFIGURATION_SASL_MECHANISM
  value: "PLAIN"
- name: SPRING_CLOUD_STREAM_KAFKA_BINDER_CONFIGURATION_SASL_JAAS_CONFIG
  # Event Hubs' Kafka endpoint uses the literal username "$ConnectionString";
  # the password is the connection string you copied earlier
  value: 'org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="<your-event-hub-connection-string>";'
nameOverride: social-network
fullNameOverride: social-network
kafka:
  enabled: false
6. Wait for all the pods to be up, and then run the following commands to
verify that the installation worked:
# port forward the service locally
kubectl --namespace social-network port-forward svc/edge-service 9000 &
# Generates a 15 person social network using serial API calls
bash ./deployment/sbin/generate-serial.sh
7. You can see the activity on the Azure portal in the following screenshot:
By clicking on the friend Event Hub and then Metrics, we can see the number of
messages that came through and how they arrived over time (you have to add
the Incoming Messages metric with the EntityName = 'friend' filter), as shown
in the following screenshot:
Summary
We started this chapter by covering microservices, along with their benefits and
trade-offs. Following this, we used Helm to deploy a sample microservice-based
application called social-network. We were able to test the service by sending
events and watching objects being created and updated. Finally, we covered
storing events in Azure Event Hubs using its Kafka support, gathering the
required details to connect our microservice-based application and modifying
the Helm chart accordingly. The next chapter will cover cluster and network
security using the secret objects provided by Kubernetes.
Securing AKS Network Connections
Loose lips sink ships is a phrase that describes how easy it can be to
jeopardize the security of a Kubernetes-managed cluster (Kubernetes, by the
way, is Greek for helmsman of a ship). If your cluster is left open with the wrong
ports or services exposed, or plain text is used for secrets in application
definitions, bad actors can take advantage of this lax security and do pretty much
whatever they want in your cluster.
In this chapter, we will explore Kubernetes secrets in more depth. You will learn
about different secrets backends and how to use them. You'll get a brief
introduction to service mesh concepts, and you'll be able to follow along with a
practical example.
Using any of the preceding methods, you can create three types of secrets:
Generic secrets: These can be created from files, directories, or literal
values.
Docker registry credentials: These are used to pull images from a private
registry.
TLS certificates: These are used to store SSL/TLS certificates.
Creating secrets from files
We'll begin by using the file method of creating secrets. Let's say that you need
to store a URL and a secret token for accessing an API. To achieve this, you'll
need to follow these steps:
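The first three steps store the values in files and create the secret from them; a
minimal sketch, assuming the file and secret names that appear in the output
later in this section:
# 1-3. Store the URL and the token in files, then create the secret from them
echo 'https://my-secret-url-location.topsecret.com' > secreturl.txt
echo '/x~Lhx\nAz!,;.Vk%[#n+";9p%jGF6[' > secrettoken.txt
kubectl create secret generic myapi-url-token \
  --from-file=./secreturl.txt --from-file=./secrettoken.txt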
4. We can check whether the secrets were created in the same way as any
other Kubernetes resource by using the get command:
kubectl get secrets
5. For more details about a secret, you can also run the describe command:
kubectl describe secrets/myapi-url-token
Notice that you give the secret's name if you only need details of a specific
secret; kubectl describe secrets on its own gives details of all the secrets in
a namespace. The output will be similar to the following:
Type: Opaque
Data
====
secrettoken.txt: 32 bytes
secreturl.txt: 45 bytes
Note that neither of the preceding commands displayed the actual secret
values. The data is stored as key-value pairs, with the filename as the key and
the contents of the file as the value.
6. Retrieve the secret's YAML definition to see the stored values:
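A sketch of the command, mirroring the kubectl get -o yaml inspection used
for the second secret later in this chapter:
kubectl get -o yaml secrets/myapi-url-token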
7. The values in the data section are base64 encoded. To get the actual
values, run the following command:
#get the token value
echo 'L3h+TGh4XG5BeiEsOy5WayVbI24rIjs5cCVqR0Y2Wwo=' | base64 -d
You will get the value that was originally entered, as follows:
/x~Lhx\nAz!,;.Vk%[#n+";9p%jGF6[
8. Similarly, for the url value, you can run the following command:
#get the url value
echo 'aHR0cHM6Ly9teS1zZWNyZXQtdXJsLWxvY2F0aW9uLnRvcHNlY3JldC5jb20K' | base64 -d
In this section, we created a secret from files containing a URL and a secret
token, and retrieved the actual secret values by decoding them.
Creating secrets manually using files
We will create the same secrets as in the previous section, only manually, by
following these steps:
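1. Get the base64-encoded value of the token; a sketch mirroring the URL
encoding in step 2 (the token is the value decoded in the previous section):
echo '/x~Lhx\nAz!,;.Vk%[#n+";9p%jGF6[' | base64 -w 0
L3h+TGh4XG5BeiEsOy5WayVbI24rIjs5cCVqR0Y2Wwo=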
You might notice that this is the same value that was present when we got
the yaml definition of the secret.
2. Similarly, for the url value, we can get the base64 encoded value, as shown in
the following code block:
echo 'https://my-secret-url-location.topsecret.com' | base64 -w 0
aHR0cHM6Ly9teS1zZWNyZXQtdXJsLWxvY2F0aW9uLnRvcHNlY3JldC5jb20K
3. We can now create the secret definition manually; then, save the file as
myfirstsecret.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: myapiurltoken
type: Opaque
data:
  url: aHR0cHM6Ly9teS1zZWNyZXQtdXJsLWxvY2F0aW9uLnRvcHNlY3JldC5jb20K
  token: L3h+TGh4XG5BeiEsOy5WayVbI24rIjs5cCVqR0Y2Wwo=
kind tells us that this is a Secret; the name value is myapiurltoken, and the type
is Opaque (meaning that, from Kubernetes' perspective, the values are
unconstrained key-value pairs). The data section holds the actual data as keys,
such as url and token, followed by the base64-encoded values.
4. Now we can create the secrets in the same way as any other Kubernetes
resource by using the create command:
kubectl create -f myfirstsecret.yaml
kubectl get secrets
NAME TYPE DATA AGE
defau... kubernetes.io/.. 3 4d5h
myapi-url-token Opaque 2 167m
myapiurltoken Opaque 2 25m
5. You can double-check that the secrets are the same, by using kubectl get -o
yaml secrets/myapiurltoken in the same way that we described in the previous
section.
Creating generic secrets using literals
The third method of creating secrets is by using the literal method. To do this,
run the following command:
kubectl create secret generic my-api-secret-literal \
  --from-literal=url=https://my-secret-url-location.topsecret.com \
  --from-literal=token='/x~Lhx\nAz!,;.Vk%[#n+";9p%jGF6['
We can verify that the secret was created by running the following command:
kubectl get secrets
Thus we have created secrets using literal values in addition to the preceding two
methods.
Creating the Docker registry key
Connecting to a private Docker registry is a necessity in production
environments. Since this use case is so common, Kubernetes has provided
mechanisms to create the connection:
kubectl create secret docker-registry <secret-name> \
  --docker-server=<your-registry-server> \
  --docker-username=<your-username> \
  --docker-password=<your-password> \
  --docker-email=<your-email>
The first parameter is the secret type, which is docker-registry. Then, you give the
secret a name; for example, regcred. The other parameters are the Docker server (
https://index.docker.io/v1/ for Docker Hub), your username, password, and email.
You can retrieve the secret in the same way as other secrets by using kubectl to
access secrets.
Creating the tls secret
To create a tls secret that can be used in ingress definitions, we use the following
command:
kubectl create secret tls <secret-name> --key <ssl.key> --cert <ssl.crt>
The first parameter is tls, which sets the secret type; this is followed by the key
file and the actual certificate file. These files are usually obtained from your
certificate registrar.
If you want to generate a self-signed certificate for testing, you can run the
following command:
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout /tmp/ssl.key -out /tmp/ssl.crt -subj "/CN=foo.bar.com"
Using your secrets
Kubernetes offers the following two ways to mount your secrets:
As environment variables
As files
Secrets as environment variables
Let's start with environment variables; secrets are referenced in a pod
definition under the env section, as shown in the sketch that follows.
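1. Save a pod definition as pod-with-env-secrets.yaml; a minimal sketch
consistent with the explanation that follows (the pod name secret-using-env is
illustrative, and the key secreturl.txt comes from the secret created from files
earlier):
apiVersion: v1
kind: Pod
metadata:
  name: secret-using-env
spec:
  containers:
  - name: nginx
    image: nginx
    env:
    - name: SECRET_URL
      valueFrom:
        secretKeyRef:
          name: myapi-url-token
          key: secreturl.txt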
Under env, we define the environment variable name as SECRET_URL. Kubernetes
then gets the value by using valueFrom, which refers to a key in the secret data
through secretKeyRef: the secret is named myapi-url-token, and the value is
taken from its secreturl.txt key.
2. Let's now create the pod and see whether it really worked:
kubectl create -f pod-with-env-secrets.yaml
Any application can use the secret values by referencing the appropriate
environment variables. Note that neither the application nor the pod definition
contains hardcoded secrets.
Secrets as files
Let's take a look at how to mount the same secrets as files. We will use the
following pod definition to demonstrate how this can be done:
apiVersion: v1
kind: Pod
metadata:
  name: secret-using-volume
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: secretvolume
      mountPath: "/etc/secrets"
      readOnly: true
  volumes:
  - name: secretvolume
    secret:
      secretName: myapi-url-token
The preceding definition shows a volumeMounts section that mounts a volume
called secretvolume. The mountPath where it is mounted is /etc/secrets, and it
is readOnly.
Note that this is more succinct than the env definition, as you don't have to
define a name for each and every secret. However, applications need special
code to read the contents of the files in order to load the secrets properly. This
method is best suited for loading entire configuration files.
1. Save the preceding file as pod-with-vol-secret.yaml. Then, create the pod using
the following command:
kubectl create -f pod-with-vol-secret.yaml
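2. As an optional check, list the mounted directory inside the pod to see the
secret keys exposed as files (with the decoded values as their contents):
kubectl exec secret-using-volume -- ls /etc/secrets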
There are several service mesh options to choose from:
Linkerd (https://linkerd.io/)
Envoy (https://www.envoyproxy.io/)
Istio (https://istio.io/)
Linkerd2, formerly Conduit (https://conduit.io/)
You should choose a service mesh based on your needs, and feel comfortable
in the knowledge that, until you hit really high volumes, any one of these
solutions will work for you.
We are going to try Istio for no reason other than its high star rating on GitHub
(over 15,000 at the time of writing). This rating is far higher than that of any
other project in this space.
Installing Istio
Installing Istio is easy; to do so, follow these steps:
1. Label the default namespace with the appropriate label, namely istio-
injection=enabled:
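A sketch of the command (istio-injection=enabled is Istio's standard sidecar-
injection label):
kubectl label namespace default istio-injection=enabled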
You can see that the sidecar has indeed been applied:
Name: details-v1-7bcdcc4fd6-xqwjz
Namespace: default
...
Labels: app=details
pod-template-hash=3678770982
version=v1
Annotations: sidecar.istio.io/status:
{"version":"887285bb7fa76191bf7f637f283183f0ba057323b078d44c3db45978346cbc1a
...
1. Use the following commands to create namespaces (foo, bar, and legacy) and
create the httpbin and sleep services in those namespaces:
kubectl create ns foo
kubectl apply -f <(istioctl kube-inject -f samples/httpbin/httpbin.yaml) -n foo
kubectl apply -f <(istioctl kube-inject -f samples/sleep/sleep.yaml) -n foo
kubectl create ns bar
kubectl apply -f <(istioctl kube-inject -f samples/httpbin/httpbin.yaml) -n bar
kubectl apply -f <(istioctl kube-inject -f samples/sleep/sleep.yaml) -n bar
kubectl create ns legacy
kubectl apply -f samples/httpbin/httpbin.yaml -n legacy
kubectl apply -f samples/sleep/sleep.yaml -n legacy
As you can see, the same services are deployed in foo and bar with the
sidecar injected, while legacy runs without the sidecar.
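Before enabling mutual TLS, check for existing destination rules; a sketch
based on Istio's standard mutual TLS task:
kubectl get destinationrules --all-namespaces -o yaml | grep "host:"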
In the preceding results, there should be no hosts with foo, bar, legacy, or a
* wildcard.
Globally enabling mutual TLS
Mutual TLS means that all services must use TLS when communicating with
other services. This closes one of the big security holes in Kubernetes: a bad
actor who has access to the cluster, even without access to a given namespace,
can send commands to any pod while pretending to be a legitimate service.
Given enough rights, they can also operate as a man in the middle between
services, grabbing JSON Web Tokens (JWTs). Implementing TLS between
services reduces the chances of such man-in-the-middle attacks:
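A sketch of the mesh-wide policy, based on Istio's documented MeshPolicy for
global mutual TLS (the exact manifest is assumed from the description that
follows):
cat <<EOF | kubectl apply -f -
apiVersion: "authentication.istio.io/v1alpha1"
kind: "MeshPolicy"
metadata:
  name: "default"
spec:
  peers:
  - mtls: {}
EOF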
Since the policy is named default, it specifies that all workloads in the mesh
will only accept encrypted requests using TLS.
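To see the effect, send requests between the sidecar-injected namespaces; a
sketch mirroring the curl loop used for the legacy namespace later in this
section:
for from in "foo" "bar"; do for to in "foo" "bar"; do kubectl exec $(kubectl get pod -l app=sleep -n ${from} -o jsonpath={.items..metadata.name}) -c sleep -n ${from} -- curl "http://httpbin.${to}:8000/ip" -s -o /dev/null -w "sleep.${from} to httpbin.${to}: %{http_code}\n"; done; done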
The systems with sidecars will fail when running this command and will
receive a 503 code, as the client side is still using plain text. It might take
a few seconds for the MeshPolicy to take effect. The following is the output:
sleep.foo to httpbin.foo: 503
sleep.foo to httpbin.bar: 503
sleep.bar to httpbin.foo: 503
sleep.bar to httpbin.bar: 503
3. We will set the destination rule to use a * wildcard that is similar to the
mesh-wide authentication policy. This is required to configure the client
side:
cat <<EOF | kubectl apply -f -
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "default"
  namespace: "default"
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
EOF
Running the preceding command will make all the pods with the sidecar
communicate via TLS.
5. We can also check that the pods without the Istio sidecar cannot access any
services in the foo or bar namespaces by running the following command,
which should report failed requests:
for from in "legacy"; do for to in "foo" "bar"; do kubectl exec $(kubectl get pod -l app=sleep -n ${from} -o jsonpath={.items..metadata.name}) -c sleep -n ${from} -- curl "http://httpbin.${to}:8000/ip" -s -o /dev/null -w "sleep.${from} to httpbin.${to}: %{http_code}\n"; done; done
Serverless Functions
In this chapter, we will cover the following topics in brief:
Kubeless services
Events and serverless functions
Technical requirements
You will need to use a modern browser, such as Chrome, Firefox, or Edge.
Kubeless services
The popularity of AWS Lambda, the serverless compute platform, has resulted
in many frameworks that offer similar functionality, both as cloud provider-
managed services (for example, Azure Functions, Google Cloud Functions, and
IBM Cloud Functions) and as self-managed frameworks. Kubeless is one of the
self-managed ones. As with any new fast-moving technology, there is no clear
winner yet, and there are several Kubernetes-friendly open source alternatives
to Kubeless.
Kubeless was chosen based on its compatibility with the Serverless Framework
(https://github.com/serverless/serverless), the project with the highest GitHub star
count in this space (28K+ at the time of writing).
Installing Kubeless
These commands should look familiar by now. Run them on Azure Cloud
Shell; they will install the Kubeless framework in the kubeless namespace with
RBAC support:
helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
helm install --name kubeless --namespace kubeless --set rbac.create=true incubator/kubeless
Installing the Kubeless binary
While Kubeless comes up, install the Kubeless binary, which launches functions
in the Kubernetes cluster, by running the following commands:
export RELEASE=$(curl -s https://api.github.com/repos/kubeless/kubeless/releases/latest | grep tag_name | cut -d '"' -f 4)
export OS=$(uname -s| tr '[:upper:]' '[:lower:]')
curl -OL https://github.com/kubeless/kubeless/releases/download/$RELEASE/kubeless_$OS-amd64.zip && \
  unzip kubeless_$OS-amd64.zip && \
  sudo mv bundles/kubeless_$OS-amd64/kubeless /usr/local/bin/
Ensure that the Kubeless CLI is installed properly by running the following
command:
kubeless --help
Usage:
kubeless [command]
Available Commands:
autoscale manage autoscale to function on Kubeless
completion Output shell completion code for the specified shell.
function function specific operations
get-server-config Print the current configuration of the controller
help Help about any command
topic manage message topics in Kubeless
trigger trigger specific operations
version Print the version of Kubeless
Flags:
-h, --help help for kubeless
Check the helm deploy status to ensure that everything has been installed:
helm status kubeless
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
kubeless-config 8 36s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
kubeless-kubeless-controller-manager-5cccf45988-nc25l 3/3 Running 0 36s
==> v1/ServiceAccount
NAME SECRETS AGE
controller-acct 1 36s
==> v1beta1/ClusterRole
NAME AGE
kubeless-kubeless-controller-deployer 36s
==> v1beta1/ClusterRoleBinding
NAME AGE
kafka-controller-deployer 36s
kubeless-kubeless-controller-deployer 36s
==> v1beta1/CustomResourceDefinition
NAME AGE
cronjobtriggers.kubeless.io 36s
functions.kubeless.io 36s
httptriggers.kubeless.io 36s
==> v1beta1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
kubeless-kubeless-controller-manager 1/1 1 1 36s
NOTES:
== Deploy function
https://github.com/kubeless/kubeless
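The function is deployed with two commands; a minimal sketch, assuming
standard kubeless CLI flags (the handler naming follows the
<file-basename>.<function-name> convention):
# create the namespace, then deploy the hello function from hello-serverless.py
kubectl create ns serverless
kubeless function deploy hello --runtime python2.7 \
  --from-file hello-serverless.py --handler hello-serverless.hello -n serverless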
Line one: This creates a new namespace for the serverless functions.
Line two: This deploys a serverless function named hello from the hello-
serverless.py file, with the runtime specified as Python 2.7 and the handler
set to the hello function, in the serverless namespace.
You can check whether the function is ready by running the following command:
kubeless function ls hello -n serverless
To be really useful, we need the ability to trigger it through events. One of the
easiest ways to integrate our serverless functions with events is to use Azure
Event Hubs. In this section, we will integrate Azure Event Hubs with our
serverless functions. We will be using Azure Functions to call our serverless
function.
There are multiple ways that a function can be linked to Event Hub. Event Grid is also an
option. Please see https://docs.microsoft.com/en-us/azure/event-grid/custom-event-quickstart if you would like
to take this route.
1. First, create the Function App by selecting Create a resource, then clicking
on Compute, then finally selecting Function App, as shown in the following
screenshot:
2. Now, fill in the required specifications shown in the following screenshot:
Please ensure that you select the Windows option and create a new
Resource Group.
5. Click on + New Function to add the code to call our serverless function.
Choose the In-portal option and click on Continue:
6. Choose the More templates... option:
10. In the dialog that pops up, give the function the name UserEventHubTrigger, and
click on new in the Event Hub connection section:
You can check whether the function works by running the following curl
command (replace <load-balancer-ip> with your Kubeless function's load
balancer IP):
curl -L --insecure --data '{"Hello world!"}' \
  --header "Content-Type:application/json" http://<load-balancer-ip>:8080
Now we can modify the Azure Function code so that it calls our serverless
function whenever an event occurs in the user Event Hub:
var http = require('http');

module.exports = function (context, eventHubMessages) {
  // NOTE: the wrapper, postData, and response handler here are reconstructed;
  // only the options block and the final req calls appeared in the original listing
  var postData = JSON.stringify(eventHubMessages);

  var options = {
    hostname: '<insert-your-kubeless-function-load-balancer-ip>', // <<<----- IMPORTANT: CHANGE THE IP
    port: 8080,
    path: '',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': postData.length
    }
  };

  var req = http.request(options, function (res) {
    context.log('Kubeless function responded with status code ' + res.statusCode);
    context.done();
  });

  req.write(postData);
  req.end();
};
Click on Save and Run once you have changed the IP address.
Your Kubeless function will have been called, and you can verify this by
running the following command:
kc logs -f <hello-pod-name> -n serverless
You can verify the Event Hub integration by changing the names in event-
sourcing-microservices-example/deployment/sbin/names-15.txt and running the following
command:
cd event-sourcing-microservices-example && ./deployment/sbin/generate-serial.sh
You will see that the function is triggered in the Kubeless function logs and also
in the Azure Portal, as shown in the following screenshot, by choosing the
Monitor option:
The actual log entries are shown in the following screenshot. Note that it might
take a couple of minutes before you see the Event Hub entries:
Congratulations, you have successfully triggered a Kubeless serverless function
using an Azure Function that, in turn, was triggered by an event that occurred in
Event Hub.
Summary
This chapter was all about installing Kubeless and successfully running our first
serverless function. In the latter part of the chapter, we integrated our Kubeless
serverless functions with events using Azure Event Hubs. By using smaller,
loosely coupled pieces of code, we can make faster, independent releases a
reality in our organization. The next and final chapter covers next steps, pointing
you to different resources for learning and implementing advanced features in
security and scalability. For that chapter, please refer to https://www.packtpub.com/sites/default/files/downloads/Next_Steps.pdf.
Other Books You May Enjoy
If you enjoyed this book, you may be interested in these other books by Packt: