Istio, mTLS and the OSI layer

I have been playing a lot with Istio and recently tested mTLS encryption. The test, which I describe in this post, really materialized the OSI layers in front of my eyes. It is always interesting how new stuff can dust off your old basic knowledge.

The entire concept of service mesh and Istio is exciting and revolutionary in my view… but just like any new groundbreaking tech, it takes a few cycles to realize how it manifests beyond the papers, blogs and theory, at least for me. So, as I usually do, I share my experiences on this blog and in my sessions with others, in the hope that if I can help even one person understand it better, I have achieved my goal.

Service mesh is just another form of virtualization

When I started working with VMware ESX in the early 2000s, I knew it was a very cool tech; and not only me, everyone knew there was something special about it.

However, I didn’t grasp the full value of this technology right out of the gate. At that point, I only saw “server consolidation” in front of me.

When vMotion came out, we realized that the physics had changed for our servers: we were no longer tied to the hardware the server was running on. That hardware abstraction allowed us to do things we couldn’t do before, like fixing hardware issues or patching hosts with no downtime, scaling much better and faster by deploying VMs when we need them, and monitoring the health of the infrastructure much better, even self-healing. A new, exciting world of agility we had never seen before opened up.


Due to the above, combined with automation, the effort of managing servers has been lowered, and fewer people are needed to manage fleets of servers.

What does that have to do with service mesh, you ask?

Recently I started focusing on service mesh, mainly Istio, testing it in the lab, learning the technology and feeling that magic again. While the technology is cool, I was trying to understand the business value beyond buzzwords like distributed, load balancing, observability, etc. At some point, I realized that I was looking at it all wrong. I was looking for the value from a networking operations point of view; it was only when I looked at it from a developer’s point of view that it clicked.

Service mesh is a form of virtualization

When I get excited, I let the world know; that’s why I love Twitter.

I see a strong parallel between service mesh and virtualization.

In the monolithic app world, many of the different pieces of code that make up the application or service run on a small set of servers, so the decisions about how a component interacts with other parts of the application are written into the code.

That means that every piece of meaningful code that differentiates the business the application serves needs to carry a lot of non-differentiating code along with it.

Things like server- and client-side communication, service lookups, error detection and response, telemetry, and security are taken care of in the code or in middleware software.

With the rise of micro-services (and the use of containers for that purpose), each container now runs a piece of differentiating code and is a single-purpose server that communicates with other services on the network. The distributed architecture and the proliferation of micro-services bring new challenges in managing, monitoring and troubleshooting problems.


What service mesh and Istio do is outsource the non-differentiating work to sidecars running Envoy, where each k8s pod now has a proxy that is responsible for communicating with other proxies and with destinations outside the mesh. (Envoy can work with more than k8s pods; it can even work with VMs or Pivotal PAS AIs!)

Now we’ve abstracted the non-differentiating code. Similar to the value we gained by virtualizing the hardware with hypervisors and adding a control plane, we gain operational value for the proxies by adding a control plane in the form of Istio. (I will not go into the deeper architecture in this post; there are literally hundreds of posts about it out there.)

Here is a diagram to illustrate the abstraction layers in one picture

We can apply our desired state as policies to anything that is not the core function of our software, and change those policies on the fly without changing any code, which saves much of the effort spent by developers. We can also apply security and authentication to transactions and have better visibility into the application health. Self-healing becomes a real thing now.
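To make the idea a bit more concrete, here is a toy sketch in Python. It is not Istio or Envoy code and uses none of their APIs; the names `Policy`, `Sidecar` and `make_flaky_service` are made up for illustration. It only shows the shape of the abstraction: the service holds business logic, the proxy owns retries and telemetry, and a policy change alters behavior without touching the service.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Policy:
    """Hypothetical stand-in for a policy pushed by a control plane."""
    max_attempts: int = 1


@dataclass
class Sidecar:
    """Toy proxy: owns retries and telemetry so the service code does not."""
    policy: Policy
    call_log: List[str] = field(default_factory=list)

    def call(self, service: Callable[[], str]) -> str:
        attempts = max(1, self.policy.max_attempts)
        for attempt in range(1, attempts + 1):
            try:
                result = service()
            except ConnectionError:
                # Telemetry is recorded by the proxy, not by the service.
                self.call_log.append(f"attempt {attempt}: failed")
                if attempt == attempts:
                    raise
            else:
                self.call_log.append(f"attempt {attempt}: ok")
                return result
        raise RuntimeError("unreachable")


def make_flaky_service(failures: int) -> Callable[[], str]:
    """Business logic only: fails `failures` times, then answers."""
    state = {"remaining": failures}

    def service() -> str:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConnectionError("upstream unavailable")
        return "order accepted"

    return service


sidecar = Sidecar(Policy(max_attempts=1))
# The "control plane" changes the policy; the service code is untouched.
sidecar.policy = Policy(max_attempts=3)
result = sidecar.call(make_flaky_service(failures=2))
print(result)
print(sidecar.call_log)
```

In a real mesh the proxy runs as a separate process next to the workload and the policy arrives as configuration, but the division of labor is the same.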

But just like virtualization brought its own set of challenges, service mesh is no different, which I will cover in my next post.

You can read more about the details of Istio features in this blog post:

I think this analogy explains the subject, and the proliferation of abstraction layers brings a new set of challenges from a management point of view.

Have any thoughts on this? Tweet your reply.



NSX-T manager fails to load? It might be that the Corfu DB got corrupted

If you’re like me, and you are spinning new nested labs left and right, you are also probably over-committing on your VMFS datastore regularly.

The issue that happened to me was that I ran out of datastore space and it crashed my NSX-T manager. Perhaps this issue can also happen for other reasons. In any case, the issue manifests itself as not being able to log in to the NSX-T manager, which keeps saying that the service is not ready.

When running the command “get management-cluster status” on the NSX-T manager you may get:

Number of nodes in management cluster: UNKNOWN

Management cluster status: INITIALIZING

Number of nodes in control cluster: UNKNOWN

This problem can happen because the Corfu DB in NSX-T has failed to load. In the case of running out of datastore space, it is almost certainly a corrupted record in the database.
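If you capture that output in a script (for a lab health check, for example), spotting the symptom can be sketched as below. This is only a rough sketch that assumes the output is plain “key: value” lines exactly as shown above; the function names are mine, and it only matches the symptom. It cannot tell a corrupted DB apart from a manager that is simply still starting up.

```python
def parse_status(output: str) -> dict:
    """Turn the 'key: value' lines of the status output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and anything without a colon
            fields[key.strip()] = value.strip()
    return fields


def looks_stuck(fields: dict) -> bool:
    """True when the output matches the symptom described in this post."""
    return (
        fields.get("Management cluster status") == "INITIALIZING"
        and fields.get("Number of nodes in management cluster") == "UNKNOWN"
    )


# Sample text copied from the output shown above.
SAMPLE = """\
Number of nodes in management cluster: UNKNOWN

Management cluster status: INITIALIZING

Number of nodes in control cluster: UNKNOWN
"""

status = parse_status(SAMPLE)
print(looks_stuck(status))
```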

So how do we identify and resolve this issue?

Follow these steps:

  • SSH in to the NSX manager as the admin user
  • cd to the /config/corfu/log/ directory. Here you should see the log files, serially named (example: 280.log, 281.log,…)
  • It is recommended to take a backup of the folder first using cp -R /config/corfu/log/ /config/corfu/log.backup
  • The appliance includes a log reader tool. Use it to read the latest log, e.g. corfu_logReader display <log file name> (example: 281.log)
  • If the DB is corrupt, the log (which might take a while to scroll) will exit with an error. The output of this command will look something like the following:
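As a side note on the second and fourth steps: “latest” means the highest serial number, which is not the same as the last name in an alphabetical listing once the numbers grow a digit (99.log sorts after 100.log as text). Below is a small sketch of picking it numerically, assuming the files are named `<number>.log` as in the examples above. The helper name `latest_log` is mine, and the demo runs against a temporary directory rather than a real appliance.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional

LOG_NAME = re.compile(r"^(\d+)\.log$")


def latest_log(log_dir: Path) -> Optional[Path]:
    """Return the log file with the highest serial number, or None."""
    numbered = []
    for entry in log_dir.iterdir():
        match = LOG_NAME.match(entry.name)
        if match and entry.is_file():
            numbered.append((int(match.group(1)), entry))
    if not numbered:
        return None
    # Compare the numeric part, not the file name as text.
    return max(numbered, key=lambda pair: pair[0])[1]


# Demo in a throwaway directory, so nothing on a real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    for name in ("99.log", "100.log", "notes.txt"):
        (demo_dir / name).touch()
    newest_name = latest_log(demo_dir).name

print(newest_name)
```

On the manager itself the directory to pass would be /config/corfu/log, and the resulting file name is what goes to the log reader.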

    vRA tidbit – AWS provisioning and the key pair conundrum

    One of the main advantages of vRealize Automation in the cloud management space is that it provides customers with choices. This is true in many aspects of the solution, like where to consume services from, how to deploy them, how the forms will look, etc., but in this post I want to talk about the creation of AWS key pairs.

    There are many solutions out there that provide an interface for provisioning instances to AWS; some have more capabilities than others. Without getting into a full feature-by-feature comparison, I will just say that vRA is one of the more comprehensive solutions out there, with many capabilities that are required for cloud management, such as a self-service portal, multi-cloud/vendor provisioning, automation and orchestration capabilities and much more.

    One of the choices vRA gives cloud admins is how to create AWS key pairs. In a nutshell, a key pair is the set of credentials used to access an instance. Many of the CMP solutions out there will allow the creation of either a global key pair or a key pair per deployed instance.

    Having a global key pair is not granular enough for most of our customers’ requirements, and it adds management overhead, especially for billing and security. The tools that create a key pair for each instance are probably too granular and create a management and maintenance nightmare: the EC2 management console will be flooded with key pairs, and this can also pose a security concern, as there are potentially thousands of credentials issued, which is quite a mess.


    vRA has a more elegant solution that also provides choice: a key pair generated for each business group, a key pair per instance, or a global key pair for a reservation. In most cases it will be suitable to have just one key pair per business group, which will be secure enough and will not clutter the environment with hundreds or thousands of key pairs, but if need be, cloud admins can decide to provision certain instances with their own key pairs or set a key pair per reservation (the resolution of the reservation is decided by the admin). This might not seem like a big deal, but for those who work with AWS it is important.
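To put rough numbers on the trade-off, here is a small illustrative model in Python. It does not use any vRA or AWS API; the scope names, the naming scheme and the sample fleet are all made up. It simply counts how many distinct key pairs each choice would leave in the EC2 console for the same set of instances.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Instance:
    name: str
    business_group: str
    reservation: str


def key_pair_name(instance: Instance, scope: str) -> str:
    """Hypothetical naming scheme: one key pair per chosen scope."""
    if scope == "reservation":
        return f"res-{instance.reservation}"
    if scope == "business_group":
        return f"bg-{instance.business_group}"
    if scope == "instance":
        return f"vm-{instance.name}"
    raise ValueError(f"unknown scope: {scope}")


def key_pairs_needed(instances: Iterable[Instance], scope: str) -> Set[str]:
    """Distinct key pairs that would exist for this fleet and scope."""
    return {key_pair_name(instance, scope) for instance in instances}


# Six instances, two business groups, one reservation (made-up data).
fleet = [
    Instance(f"web-{n}", "dev" if n % 2 else "qa", "aws-east")
    for n in range(6)
]

counts = {
    scope: len(key_pairs_needed(fleet, scope))
    for scope in ("reservation", "business_group", "instance")
}
print(counts)
```

The per-instance count grows with every deployment, while the per-business-group count only grows when a new group is added.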

    When it comes down to cloud provisioning, where instances are being built and destroyed automatically, constantly and on demand, having that choice can make a difference, so you can feel that your CMP solution fits your requirements.




    vCAC 6.1 Application Services (AppD) Installation notes

    Recently I installed the new vCAC 6.1 in my lab. vCAC 6.1 has many improvements; if you’d like to learn more about them, you can read Omer Kushmaro’s (Twitter @elastic_Skies) excellent blog post or just read the VMware blog post here. During my installation I found that the deployment process for the Application Services component of vCAC (AKA Application Director) has changed: it is more integrated with vCAC now, and there are a few things you need to know when setting it up so it will work well for you. Here is a short guide:

    1.       First create a DNS record for the appliance in DNS, as it won’t register itself on its own.
    2.      Deploy the OVF. This is very straightforward and you just need to follow the wizard.
    3.      The next step is to open the console of the VM and enter a license number:

    4.     Enter a password for the root account and the Darwin account (don’t mind the warnings in my screenshot as I use simple passwords in my lab)

    5.     The appliance will run through configuration and will start the services

    6.     Now it should ask you whether you want to use this for a migration from 6.0.1, click No since this is a new install.

    7.     It will now need to register with the vCAC instance; for this it asks for the URL of the vCAC instance and a user and password. The URL format should be similar to https://vcac.lab and the user name is usually administrator@vsphere.local. Note that this is not the cloud provider creation; the vCAC cloud provider will need to be set up as well.

    8.      If it succeeds in registering with the vCAC appliance, it will ask whether you want to set up the out-of-the-box content. Of course I do! NOTE: this is VERY important. The sample content is added into the vCAC tenant; in the next steps it will ask you to type the tenant and the admin user, and the user you enter must have the Application Services roles in vCAC for the import to succeed. To set that up, log in to the vCAC instance using the tenant administrator or the application administrator you want to use, go to Administration/Groups and edit the group used for administration (in my lab there is only one group, but in a production environment there might be multiple administration groups).


    9.      After you have made sure this is done, hit Y and “Enter”

    10.     This is one of the nicest enhancements in this version: it supports multi-tenant environments that align with vCAC tenants. To add the sample content to the right tenant, you need to enter the tenant’s name, the user you added the roles to in the previous step, and the business group you wish to add content to.

    11.      At this point it should state that the import was successful

    12.      Next step is to set the Application services admin account

    13.      Now let’s configure the appliance with the right IP, hosts file and date. Log in again as root and run “vi /etc/sysconfig/network/ifcfg-eth0” to edit the file

    14.     Make sure it looks similar to the screenshot below, then hit “ESC”, type “:x” and hit “Enter” to save and exit

    15.     Restart the network service by typing “service network restart”
    16.     To change the hostname permanently, edit the file /etc/HOSTNAME using vi as in step 13, input the hostname and save the file
    17.     Run the command “hostname <your fqdn>” and hit Enter to set the hostname without a restart
    18.     You also need to set the hostname in the hosts file. Edit the hosts file by running “vi /etc/hosts” and add a line for your hostname and IP as follows (example): “ vcac-appd”. Save and exit vi as explained in step 14.
    19.     We also need to configure NTP and the time zone for time synchronization. For that, edit /etc/ntp.conf, change the server entry to your NTP server and save the file
    20.     Run “service ntp restart” to restart the NTP service
    21.     For time zone I run “yast timezone” and configure time zone

    22.     To check that time is synced correctly run the command “date”.
    23.     Last, we need to make sure that DNS is configured correctly. The appliance can take it from DHCP, but if no DHCP exists, or if it hands out a different DNS server, edit “/etc/resolv.conf” and make sure it lists the DNS server you need
    24.     Reboot the appliance
    25.     After the reboot, log in to the new Application Services web interface, to the tenant, with the tenant administrator user. The URL should be https://<Appd fqdn>:8443/Darwin/org/<tenant name>
    26.     You should see the sample applications populated

    27.     Register with a cloud provider that could be vCD, vCAC or EC2, in my lab I registered it with vCAC. Click on the cloud icon at the top right and then on “cloud providers”
    28.     Here, input the details of the vCAC instance. Notice that the business group will be populated with the business group you provided at the beginning. This is great because we don’t have to have the “Default” business group like we had to in the previous version, which was kind of limiting.
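If you script your lab setup, the tenant URL from step 25 can be put together with a tiny helper like the one below. This is just a sketch: the function name, the sample host name “appd.lab” and the sample tenant are placeholders of mine, and percent-encoding the tenant name is my own precaution rather than something the product documents.

```python
from urllib.parse import quote


def appd_tenant_url(appd_fqdn: str, tenant: str) -> str:
    """Build the tenant login URL in the format given in step 25."""
    # Encoding the tenant is an assumption, in case the name has odd characters.
    return f"https://{appd_fqdn}:8443/Darwin/org/{quote(tenant, safe='')}"


# Placeholder values; replace them with your own FQDN and tenant name.
url = appd_tenant_url("appd.lab", "vsphere.local")
print(url)
```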

    One very important note that I got from Kris Thieler (Twitter @inkysea), who is an esteemed member of my team at the Cloud management SET team:

    “When adding the OOTB content to a business group: if the business group has white space in the name (example: Cloud Developers), then you must enclose the business group in quotes (example: "Cloud Developers").”

    Kris’s blog is here
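If you feed the business group name to the setup from a script, Kris’s rule is easy to apply automatically. A minimal sketch (the helper name is mine, not part of the product):

```python
def business_group_argument(name: str) -> str:
    """Wrap the name in double quotes when it contains any white space."""
    if any(ch.isspace() for ch in name):
        return f'"{name}"'
    return name


print(business_group_argument("Cloud Developers"))
print(business_group_argument("Development"))
```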

    That’s it for this article. I will post another one on further configuration of vCAC and Application Services, or if someone else has already done it, I will direct you to that article. In any case, I hope this will be helpful to some.