Securing A Multi-Cloud App With Service Mesh

Multi-cloud is becoming a reality for many organizations, but what is multi-cloud? It is a very broad term that covers any organization using more than one cloud, whether or not it runs apps across those clouds. For example, if one BU in my org is using GKE and another BU is using AWS, my business is already operating in a multi-cloud environment, and that environment needs to be operated and secured.

So we have defined multi-cloud. What is hybrid cloud, and what are multi-cloud apps? The term “hybrid cloud” emerged very soon after public and private clouds did. Hybrid, just like public and private, is about location: how do I connect the public and the private clouds, mostly from the infrastructure point of view? Hybrid cloud seems more and more suited to cloud migrations of “heritage workloads”, where the “hybridity” is about connecting distinct pieces of infrastructure so that workloads can either be moved to the cloud or burst to it. A multi-cloud app, which this article focuses on, is an application that runs across multiple clouds.

This pattern is emerging, starting with the big enterprises, because customers want to be able to consume best-of-breed services wherever they may be, without being constrained by “location” or cloud vendor. For example, for a specific app, I may want to use Google ML for my machine learning and AWS Lambda for FaaS, while other application components that are PCI sensitive stay on-premises.

But how do you operate and secure such applications? One answer is additional abstraction, this time abstracting the application plumbing and ops at the service level. For security specifically, the story is a bit more complicated, as there are multiple layers to consider.

Let’s talk access

Consider the following scenario: an organization uses Kubernetes as the platform to run its apps and app platforms, with some services in EKS and some in AKS. Some might be on-premises on Tanzu. Where and how would we define access policies for these application components, when for some of these “infrastructures” we don’t even have access to the underlying layers?

For simplicity I am focusing on these 3 layers:

  • IaaS layer – this layer is the VMs/cloud instances, which in the case of Kubernetes represent the nodes of my clusters.
  • Logical layer or, in Kubernetes talk, the namespaces layer – namespaces are often used as logical tenancy containers, either for separating teams or for separating apps and app platforms. Namespaces are local to each cluster and do not stretch across clusters. Access at this layer can be controlled on Kubernetes using network policies through the CNI.
  • Service layer – these are the actual app services. On a cluster-by-cluster basis we can define segmentation policies through the CNI layer and/or a service mesh such as Istio or Linkerd. One main difference between the two is that the CNI layer enforces on the underlying FW, which can be in the IaaS (like NSX) or in the node OS (like iptables), while a service mesh such as Istio enforces on the Envoy proxy, which runs as a sidecar container in the application pod.

How do I operate this in a multi-cloud environment? Does one mechanism cancel the other?


There is no one answer to these questions, except that in security one layer never cancels the others, and more IS better only if it’s simple to manage.

I have been talking with many customers over the past few months, and I have seen one pattern emerge for securing multi-cloud apps. These customers secure each layer in a way that complements the other security layers rather than overlapping with them. This is what the full picture looks like:

Let’s break it down:

For the IaaS layer, we would utilise the IaaS FW. All major clouds, including VMware SDDC with Tanzu, offer an IaaS FW capability that controls access between “machines” (VMs and instances) using security groups. With a managed Kubernetes service like EKS, it controls access to the cluster nodes. On NSX-T, for example, the service-based FW is capable of protecting pods as well, but I’ll talk about multi-layer enforcement later.

For the logical layer, the namespace is a local construct that does not extend beyond the clusters it is configured on. Some companies carve out services into namespaces based on teams or app constructs. For this purpose, the local CNI network policies on the respective Kubernetes cluster would be the best fit for securing access between namespaces.
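To make the namespace layer a bit more concrete, here is a minimal sketch of what such a policy could look like. It builds a standard Kubernetes NetworkPolicy manifest in Python and prints it as JSON, which kubectl accepts. The namespace names (`payments`, `checkout`), the policy name and the helper function are made up for illustration, and the selector assumes the `kubernetes.io/metadata.name` label that recent Kubernetes versions set on every namespace. Whether the policy is actually enforced depends on the CNI running in that cluster.

```python
import json


def namespace_ingress_policy(namespace: str, allowed_namespaces: list[str]) -> dict:
    """Build a NetworkPolicy admitting ingress only from the listed namespaces.

    Hypothetical helper: the names and labels are illustrative assumptions.
    """
    peers = [
        {"namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": ns}}}
        for ns in allowed_namespaces
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-selected-namespaces", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A rule with an empty "from" list would match ALL sources, so with
            # no allowed namespaces we emit no rule at all (deny all ingress).
            "ingress": [{"from": peers}] if peers else [],
        },
    }


if __name__ == "__main__":
    # Only pods in the "checkout" namespace may reach pods in "payments".
    print(json.dumps(namespace_ingress_policy("payments", ["checkout"]), indent=2))
```

Because the namespace is local, a policy like this has to be applied separately on every cluster that hosts the namespace.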

In each namespace we have the app services and other application components. We can control access between services using both the CNI and the service mesh, but many have found that in a multi-cloud environment it makes more sense to create access policies that are native to the service construct. Service mesh provides L7 control and enforces as close to the service as possible, which is very beneficial because we do not need to control or specify any infrastructure detail. While service mesh has many use cases, from traffic management to observability, what matters for our discussion is that the authentication and authorization policies used for service-level segmentation are defined and enforced at layer 7. As with pod security, we can enforce these policies with an IaaS FW as well. I will write about multi-layer security enforcement in future posts, and of course it makes sense to develop ways to combine these layers in the future.
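As a rough illustration of a service-native policy, the sketch below builds an Istio AuthorizationPolicy manifest in Python and prints it as JSON. The service names, the service account and the helper function are hypothetical. The principal format assumes Istio's default `cluster.local` trust domain, and matching on principals assumes mutual TLS is enabled between the workloads.

```python
import json


def allow_service_to_service(namespace, target_app, caller_namespace,
                             caller_service_account, methods):
    """Build an Istio AuthorizationPolicy letting one identity call one service.

    Hypothetical helper: every name passed in is an illustrative assumption.
    """
    # Istio identifies callers by service account (requires mTLS).
    principal = f"cluster.local/ns/{caller_namespace}/sa/{caller_service_account}"
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {
            "name": f"allow-{caller_service_account}-to-{target_app}",
            "namespace": namespace,
        },
        "spec": {
            # Applies to the workloads labelled app=<target_app>.
            "selector": {"matchLabels": {"app": target_app}},
            "action": "ALLOW",
            "rules": [
                {
                    "from": [{"source": {"principals": [principal]}}],
                    # L7 detail: restrict the allowed HTTP methods.
                    "to": [{"operation": {"methods": list(methods)}}],
                }
            ],
        },
    }


if __name__ == "__main__":
    policy = allow_service_to_service(
        namespace="payments",
        target_app="payments-api",
        caller_namespace="checkout",
        caller_service_account="checkout-frontend",
        methods=["GET", "POST"],
    )
    print(json.dumps(policy, indent=2))
```

Note that nothing in the policy mentions an IP address, a node or a cloud. It is expressed purely in terms of service identity and HTTP operations, and the sidecar proxy next to the target service enforces it.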

Looking at the whole picture again with the three layers

What about Tanzu Service Mesh (TSM)?

Many of my Tweeps know that I am working very closely with the VMware Tanzu Service Mesh product. With TSM we can now define an application that is abstracted from the underlying Kubernetes and IaaS layers and that can stretch across clusters and across clouds (it will also allow VMs, but that’s for a separate post #cliffhanger).

This is achieved by using a construct called Global Namespace (GNS). See this previous post I wrote about it: https://octo.vmware.com/vmware-nsx-service-mesh-purpose-built-enterprise/

In this case, we can create Service Mesh policies that span clouds and clusters. This changes how customers can operate and secure their multi-cloud applications.

Now, instead of containing an app in a local namespace, we contain it in an abstracted global namespace, like so:

In this way, we define service-level access policies or authentication policies that are abstracted from any underlying cloud and are specific to the app, no matter where the services are running. Developers or DevOps teams can then define policies in their pipelines that focus on the business logic and do not need to address any Kubernetes or cloud details, such as the ingress needed to reach remote services.
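The idea can be sketched with a small toy model. To be clear, this is not the Tanzu Service Mesh API. The class names, fields and service names below are all invented for illustration. The only point is that the access decision looks at service names inside the global namespace and never at the cluster or cloud a service runs on.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    name: str
    cluster: str  # recorded for inventory only; never consulted by the policy check
    cloud: str    # same: location plays no part in the access decision


@dataclass
class GlobalNamespace:
    """Toy stand-in for a global namespace (hypothetical, not the TSM API)."""
    name: str
    services: dict[str, Service] = field(default_factory=dict)
    allowed: set[tuple[str, str]] = field(default_factory=set)

    def add(self, service: Service) -> None:
        self.services[service.name] = service

    def allow(self, source: str, destination: str) -> None:
        self.allowed.add((source, destination))

    def is_allowed(self, source: str, destination: str) -> bool:
        # Both ends must be members of the global namespace, and the pair
        # must be explicitly allowed. Everything else is denied.
        in_scope = source in self.services and destination in self.services
        return in_scope and (source, destination) in self.allowed


if __name__ == "__main__":
    gns = GlobalNamespace("shop")
    gns.add(Service("frontend", cluster="eks-1", cloud="aws"))
    gns.add(Service("payments", cluster="tanzu-1", cloud="on-prem"))
    gns.allow("frontend", "payments")

    print(gns.is_allowed("frontend", "payments"))  # True
    print(gns.is_allowed("payments", "frontend"))  # False
```

If the payments service later moves from on-premises to another cloud, only its inventory entry changes. The policy itself stays exactly as written.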

This can help drive better tenancy models as it frees the organization from location and cloud technology constraints.

Obviously, when it comes to security there’s a lot more than service access, and this is just one pattern I see emerging, but it is a great way to start letting customers consume any service in any cloud they need for their apps. There are also things like encryption and IDS/IPS, and these will come in future posts as well 😉
