Installing Multicluster Aspen Mesh on KOPS clusters

I recently tried installing Aspen Mesh, an enterprise-grade service mesh, in a multicluster setup, and it was easier than I anticipated. I will do my best to walk you through my process. Ensure you have two Kubernetes clusters with the same version of Aspen Mesh installed on each of them. Please refer to the documentation for installing Aspen Mesh on your cluster. You might have to sign up for a free 30-day trial to be able to download the binary and access the documentation.

kops get clusters

ssah-test1.dev.k8s.local        aws    us-west-2a
ssah-test2.dev.k8s.local        aws    us-west-2a
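Throughout the rest of this walkthrough I refer to the two clusters through the kubectl contexts ${CTX_CLUSTER1} and ${CTX_CLUSTER2}. A minimal sketch of setting those variables, assuming your kubeconfig context names match the kops cluster names above:

export CTX_CLUSTER1=ssah-test1.dev.k8s.local
export CTX_CLUSTER2=ssah-test2.dev.k8s.local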

There are multiple ways to configure Aspen Mesh in a multicluster environment. In the following example, I have installed Aspen Mesh 1.9.1-am1 on both of my clusters, and the installation type is Multi-Primary on different networks.

Here are a few prerequisites for the setup:

  • The API server of each cluster must be able to access the API server of the other cluster.
  • Trust must be established between all clusters in the mesh. This is achieved by using a common root CA to generate intermediate certs for each cluster.

Configuring Trust:

I am creating an RSA certificate for my root cert. After downloading and extracting the Aspen Mesh binary, I create a certs folder and add it to the directory stack.

mkdir -p certs

pushd certs

The downloaded bundle includes a tools directory for creating your certificates. Run the make command to create the root-ca folder, which will contain four files: root-ca.conf, root-cert.csr, root-cert.pem, and root-key.pem. For each of your clusters, you then generate an intermediate cert and key for the Istio CA.

make -f ../tools/certs/Makefile.selfsigned.mk root-ca

make -f ../tools/certs/Makefile.selfsigned.mk cluster1-cacerts

make -f ../tools/certs/Makefile.selfsigned.mk cluster2-cacerts
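Each cluster target creates its own folder under certs/ holding the intermediate CA files. Assuming the Makefile behaves like the upstream Istio one, you can quickly confirm the files are there:

ls cluster1
# should include ca-cert.pem, ca-key.pem, cert-chain.pem and root-cert.pem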

You will then have to create a secret named cacerts in the istio-system namespace of each cluster, using the files generated in the last step. These secrets are what establish trust between the clusters, since the same root-cert.pem is used to create each intermediate cert.

kubectl create secret generic cacerts -n istio-system \
  --from-file=cluster1/ca-cert.pem \
  --from-file=cluster1/ca-key.pem \
  --from-file=cluster1/root-cert.pem \
  --from-file=cluster1/cert-chain.pem \
  --context="${CTX_CLUSTER1}"

kubectl create secret generic cacerts -n istio-system \
  --from-file=cluster2/ca-cert.pem \
  --from-file=cluster2/ca-key.pem \
  --from-file=cluster2/root-cert.pem \
  --from-file=cluster2/cert-chain.pem \
  --context="${CTX_CLUSTER2}"
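Optionally, confirm the secret exists in each cluster before moving on:

kubectl get secret cacerts -n istio-system --context="${CTX_CLUSTER1}"
kubectl get secret cacerts -n istio-system --context="${CTX_CLUSTER2}"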

Next we move on to the Aspen Mesh configuration, where we enable multicluster for istiod and give names to the network and cluster. You need to add the following fields to your override file, which will be used during the helm installation/upgrade. Create a separate file for each cluster. You will also need to label the istio-system namespace in both of your clusters with the appropriate network label.

kubectl --context="${CTX_CLUSTER1}" label namespace istio-system topology.istio.io/network=network1

kubectl --context="${CTX_CLUSTER2}" label namespace istio-system topology.istio.io/network=network2

For cluster 1

#Cluster 1

#In order to make the application service callable from any cluster, the DNS lookup must succeed in each cluster
#This provides DNS interception for all workloads with a sidecar, allowing Istio to perform DNS lookup on behalf of the application.
meshConfig:
  defaultConfig:
    proxyMetadata:
    # Enable Istio agent to handle DNS requests for known hosts
    # Unknown hosts will automatically be resolved using upstream dns servers in resolv.conf
      ISTIO_META_DNS_CAPTURE: "true"

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster1"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network1

For cluster 2

#Cluster 2

#In order to make the application service callable from any cluster, the DNS lookup must succeed in each cluster
#This provides DNS interception for all workloads with a sidecar, allowing Istio to perform DNS lookup on behalf of the application.
meshConfig:
  defaultConfig:
    proxyMetadata:
    # Enable Istio agent to handle DNS requests for known hosts
    # Unknown hosts will automatically be resolved using upstream dns servers in resolv.conf
      ISTIO_META_DNS_CAPTURE: "true"

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster2"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network2

Now we will upgrade (or install) the istiod chart with the newly added configuration from the override file; as you can see, I have a separate override file for each cluster. Run each command against the corresponding cluster context.

helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system --values sample_overrides-aspenmesh_2.yaml

helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system --values sample_overrides-aspenmesh.yaml
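If you run both upgrades from the same machine, helm's --kube-context flag makes it explicit which cluster each override file goes to. A sketch, assuming sample_overrides-aspenmesh.yaml holds the cluster 1 values and sample_overrides-aspenmesh_2.yaml the cluster 2 values (that mapping is my assumption):

helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system \
  --kube-context "${CTX_CLUSTER1}" --values sample_overrides-aspenmesh.yaml

helm upgrade istiod manifests/charts/istio-control/istio-discovery -n istio-system \
  --kube-context "${CTX_CLUSTER2}" --values sample_overrides-aspenmesh_2.yaml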

Check the pods in the istio-system namespace to confirm they are all in the Running state. Make sure to delete the application pods in your default namespace so that the new configuration takes effect when the pods are recreated. You can also verify that the root cert used by pods in each cluster is the same; the md5 hashes printed by the two istioctl commands below should match. I am using pods from the bookinfo sample application.
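For example, to check the control plane and recycle the application pods in cluster 1 (repeat with ${CTX_CLUSTER2}); this assumes your application pods live in the default namespace:

kubectl get pods -n istio-system --context="${CTX_CLUSTER1}"
kubectl delete pods --all -n default --context="${CTX_CLUSTER1}"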

istioctl pc secrets details-v1-79f774bdb9-pqpjw -o json | jq '[.dynamicActiveSecrets[] | select(.name == "ROOTCA")][0].secret.validationContext.trustedCa.inlineBytes' -r | base64 -d | openssl x509 -noout -text | md5

istioctl pc secrets details-v1-79c697d759-tw2l7 -o json | jq '[.dynamicActiveSecrets[] | select(.name == "ROOTCA")][0].secret.validationContext.trustedCa.inlineBytes' -r | base64 -d | openssl x509 -noout -text | md5

Once istiod is upgraded, we will create the ingress gateway used for communication between the two clusters by installing an east-west gateway. Use the configuration below to create a YAML file for each cluster, to be passed to helm during installation. I have created two files, cluster1_gateway_config.yaml and cluster2_gateway_config.yaml, which are used with their respective clusters.

For Cluster 1

#This can be on separate override file as we will install a custom IGW
gateways:
  istio-ingressgateway:
    name: istio-eastwestgateway
    labels:
      app: istio-eastwestgateway
      istio: eastwestgateway
      topology.istio.io/network: network1
    ports:
    ## You can add custom gateway ports in user values overrides, but it must include those ports since helm replaces.
    # Note that AWS ELB will by default perform health checks on the first port
    # on this list. Setting this to the health check port will ensure that health
    # checks always work. https://github.com/istio/istio/issues/12503
    - port: 15021
      targetPort: 15021
      name: status-port
      protocol: TCP
    - port: 80
      targetPort: 8080
      name: http2
      protocol: TCP
    - port: 443
      targetPort: 8443
      name: https
      protocol: TCP
    - port: 15012
      targetPort: 15012
      name: tcp-istiod
      protocol: TCP
    # This is the port where sni routing happens
    - port: 15443
      targetPort: 15443
      name: tls
      protocol: TCP
    - name: tls-webhook
      port: 15017
      targetPort: 15017
    env:
      # A gateway with this mode ensures that pilot generates an additional
      # set of clusters for internal services but without Istio mTLS, to
      # enable cross cluster routing.
      ISTIO_META_ROUTER_MODE: "sni-dnat"
      ISTIO_META_REQUESTED_NETWORK_VIEW: "network1"
    serviceAnnotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster1"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network1

For Cluster 2

gateways:
  istio-ingressgateway:
    name: istio-eastwestgateway
    labels:
      app: istio-eastwestgateway
      istio: eastwestgateway
      topology.istio.io/network: network2
    ports:
    ## You can add custom gateway ports in user values overrides, but it must include those ports since helm replaces.
    # Note that AWS ELB will by default perform health checks on the first port
    # on this list. Setting this to the health check port will ensure that health
    # checks always work. https://github.com/istio/istio/issues/12503
    - port: 15021
      targetPort: 15021
      name: status-port
      protocol: TCP
    - port: 80
      targetPort: 8080
      name: http2
      protocol: TCP
    - port: 443
      targetPort: 8443
      name: https
      protocol: TCP
    - port: 15012
      targetPort: 15012
      name: tcp-istiod
      protocol: TCP
    # This is the port where sni routing happens
    - port: 15443
      targetPort: 15443
      name: tls
      protocol: TCP
    - name: tls-webhook
      port: 15017
      targetPort: 15017
    env:
      # A gateway with this mode ensures that pilot generates an additional
      # set of clusters for internal services but without Istio mTLS, to
      # enable cross cluster routing.
      ISTIO_META_ROUTER_MODE: "sni-dnat"
      ISTIO_META_REQUESTED_NETWORK_VIEW: "network2"
    serviceAnnotations:
      service.beta.kubernetes.io/aws-load-balancer-type: nlb

global:
  meshID: mesh1
  multiCluster:
    # Set to true to connect two kubernetes clusters via their respective
    # ingressgateway services when pods in each cluster cannot directly
    # talk to one another. All clusters should be using Istio mTLS and must
    # have a shared root CA for this model to work.
    enabled: true
    # Should be set to the name of the cluster this installation will run in. This is required for sidecar injection
    # to properly label proxies
    clusterName: "cluster2"
    globalDomainSuffix: "local"
    # Enable envoy filter to translate `globalDomainSuffix` to cluster local suffix for cross cluster communication
    includeEnvoyFilter: false
  network: network2

Install the east-west gateway chart in each cluster with its respective values file:

helm install istio-eastwestgateway manifests/charts/gateways/istio-ingress --namespace istio-system --values cluster1_gateway_config.yaml

helm install istio-eastwestgateway manifests/charts/gateways/istio-ingress --namespace istio-system --values cluster2_gateway_config.yaml
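You can check that the gateway pods came up in each cluster using the labels set in the values files above:

kubectl get pods -n istio-system -l istio=eastwestgateway --context="${CTX_CLUSTER1}"
kubectl get pods -n istio-system -l istio=eastwestgateway --context="${CTX_CLUSTER2}"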

After adding the new east-west gateway, you will see an eastwest-gateway pod deployed in the istio-system namespace, along with a service that creates the Network Load Balancer specified in the annotations. You need to resolve the IP address of the NLB for each east-west gateway and patch it into the service as spec.externalIPs in both of your clusters, until istio/istio issue #29359 (Multi-Cluster/Multi-Network – Cannot use a hostname-based gateway for east-west traffic) is fixed. This is not ideal, since it means manually resolving and patching the load balancer IP.

k get svc -n istio-system istio-eastwestgateway
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                                                  PORT(S)                                                                                      AGE
istio-eastwestgateway   LoadBalancer   100.71.211.32   a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com 15021:32138/TCP,80:30420/TCP,443:31450/TCP,15012:30150/TCP,15443:30476/TCP,15017:32335/TCP   8d

nslookup a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com
Server:        172.23.241.180
Address:    172.23.241.180#53
Non-authoritative answer:
Name:    a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com
Address: 35.X.X.X

kubectl patch svc -n istio-system istio-eastwestgateway -p '{"spec":{"externalIPs": ["35.X.X.X"]}}'

k get svc -n istio-system istio-eastwestgateway
NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP                                                                                  PORT(S)                                                                                      AGE
istio-eastwestgateway   LoadBalancer   100.71.211.32   a927e6<TRUNCATED>.elb.us-west-2.amazonaws.com,35.X.X.X   15021:32138/TCP,80:30420/TCP,443:31450/TCP,15012:30150/TCP,15443:30476/TCP,15017:32335/TCP   8d

Now that the gateways are configured for cross-cluster communication, you have to make sure the API server of each cluster can talk to the other cluster. In AWS you can do this by making the API server instances accessible to each other with specific rules in their security groups. We then need to create a secret in cluster 1 that provides access to cluster 2's API server, and vice versa, for endpoint discovery.
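Before creating the secrets, open the security groups. Here is a sketch of that change, assuming you use the AWS CLI, the clusters can route to each other (same VPC or peered), and the API servers listen on port 443; the security group IDs below are placeholders for each cluster's master/API instance groups:

aws ec2 authorize-security-group-ingress \
  --group-id sg-CLUSTER1-MASTERS \
  --protocol tcp --port 443 \
  --source-group sg-CLUSTER2-MASTERS

# and the reverse direction
aws ec2 authorize-security-group-ingress \
  --group-id sg-CLUSTER2-MASTERS \
  --protocol tcp --port 443 \
  --source-group sg-CLUSTER1-MASTERS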

# Install a remote secret in cluster 2 that provides access to cluster 1's API server
istioctl x create-remote-secret --context="${CTX_CLUSTER1}" --name=cluster1 | kubectl apply -f - --context="${CTX_CLUSTER2}"

# Install a remote secret in cluster 1 that provides access to cluster 2's API server
istioctl x create-remote-secret --context="${CTX_CLUSTER2}" --name=cluster2 | kubectl apply -f - --context="${CTX_CLUSTER1}"

At this stage, pilot (which is bundled into the istiod binary) should have the new configuration, and when you tail the istiod logs you should see the message “Number of remote cluster: 1”. With this version, you also need to edit the east-west ingress gateway in the istio-system namespace that we created above, as the selector label and annotation added via the helm chart differ from what is expected: the selector shows “istio: ingressgateway” but should be “istio: eastwestgateway” (see the patch sketch after the YAML below). You can then create pods in each cluster and verify that everything works as expected. Here is what the east-west gateway should look like:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  annotations:
    meta.helm.sh/release-name: istio-eastwestgateway
    meta.helm.sh/release-namespace: istio-system
  creationTimestamp: "2021-05-13T01:56:50Z"
  generation: 2
  labels:
    app: istio-eastwestgateway
    app.kubernetes.io/managed-by: Helm
    install.operator.istio.io/owning-resource: unknown
    istio: eastwestgateway
    istio.io/rev: default
    operator.istio.io/component: IngressGateways
    release: istio-eastwestgateway
    topology.istio.io/network: network2
  name: istio-multicluster-ingressgateway
  namespace: istio-system
  resourceVersion: "6777467"
  selfLink: /apis/networking.istio.io/v1beta1/namespaces/istio-system/gateways/istio-multicluster-ingressgateway
  uid: 618b2b5b-a2bb-4b37-a4a1-7f5ab7ef03d4
spec:
  selector:
    istio: eastwestgateway
  servers:
  - hosts:
    - '*.local'
    port:
      name: tls
      number: 15443
      protocol: TLS
    tls:
      mode: AUTO_PASSTHROUGH
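If your Gateway came up with the wrong selector as described above, a minimal sketch of fixing it in place with a merge patch (kubectl edit works just as well; run it against both clusters):

kubectl --context="${CTX_CLUSTER1}" -n istio-system patch gateways.networking.istio.io istio-multicluster-ingressgateway \
  --type merge -p '{"spec":{"selector":{"istio":"eastwestgateway"}}}'

kubectl --context="${CTX_CLUSTER2}" -n istio-system patch gateways.networking.istio.io istio-multicluster-ingressgateway \
  --type merge -p '{"spec":{"selector":{"istio":"eastwestgateway"}}}'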