halloween, costumes

This year we had 140 trick-or-treaters. This is pretty low for us, but I can’t complain since it’s the first year after COVID-19.

2021: 140 trick-or-treaters.

Average Trick-or-Treaters by Time Block

Year-Over-Year Trick-or-Treaters

Halloween was on a Sunday and it was chilly and windy. It had been raining a bit but didn’t rain during prime trick-or-treat time.

We didn’t hand out candy last year due to the COVID-19 outbreak. Looking up and down our street, it appeared a lot of people chose again this year to not hand out candy. We also saw some “take one” bowls on porches and various creative “candy torpedo tubes” that would send candy from the porch to the kid in a distanced fashion.

Cumulative data:

Year  6:00p-6:30p  6:30p-7:00p  7:00p-7:30p  7:30p-8:00p  8:00p-8:30p  Total
2006           52           59           35           16            0    162
2007            5           45           39           25           21    139
2008           14           71           82           45           25    237
2009           17           51           72           82           21    243
2010           19           77           76           48           39    259
2011           31           80           53           25            0    189
2013           28           72          113           80            5    298
2014           19           54           51           42           10    176
2015           13           14           30           28            0     85
2016            1           59           67           57            0    184
2019            1           56           59           41           33    190
2021           16           37           30           50            7    140

Our costumes this year:

  • Me: Prisoner Loki from the Disney+ Loki show
  • Jenn: Medusa
  • Phoenix: Cerise Hood from Ever After High

Me as Prisoner Loki

linux, mac, windows

I used to think setting up your PATH for your shell - whichever shell you like - was easy. But then I got into a situation where I started using more than one shell on a regular basis (both PowerShell and Bash) and things started to break down quickly.

Specifically, I have some tools that are installed in my home directory. For example, .NET global tools get installed at ~/.dotnet/tools and I want that in my path. I would like this to happen for any shell I use, and I have multiple user accounts on my machine for testing scenarios so I’d like it to ideally be a global setting, not something I have to configure for every user.

This is really hard.

I’ll gather some of my notes here on various tools and strategies I use to set paths. It’s (naturally) different based on OS and shell.

This probably won’t be 100% complete, but if you have an update, I’d totally take a PR on this blog entry.

Shell Tips

Each shell has its own mechanism for setting up profile-specific values. In most cases this is the place you’ll end up setting user-specific paths - paths that require a reference to the user’s home directory. On Mac and Linux, the big takeaway is to use /etc/profile. Most shells appear to interact with that file on some level.

PowerShell

PowerShell has a series of profiles that range from system level (all users, all hosts) through user/host specific (current user, current host). The one I use the most is “current user, current host” because I store my profile in a Git repo and pull it into the correct spot on my local machine. I don’t currently modify the path from my PowerShell profile.

  • On Windows, PowerShell will use the system/user path setup on launch and then you can modify it from your profile.
  • On Mac and Linux, PowerShell appears to evaluate the /etc/profile and ~/.profile, then subsequently use its own profiles for the path. On Mac this includes evaluation of the path_helper output. (See the Mac section below for more on path_helper.) I say “appears to evaluate” because I can’t find any documentation on it, yet that’s the behavior I’m seeing. I gather this is likely due to something like a login shell (say zsh) executing first and then having that launch pwsh, which inherits the variables. I’d love a PR on this entry if you have more info.

If you want to use PowerShell as a login shell, on Mac and Linux you can provide the -Login switch (as the first switch when running pwsh!) and it will execute sh to include /etc/profile and ~/.profile execution before launching the PowerShell process. See Get-Help pwsh for more info on that.
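
A quick way to see the difference (just a sketch - the actual PATH contents will vary by machine):

# Without -Login, pwsh inherits whatever PATH the parent process had.
pwsh -NoProfile -Command '$env:PATH'

# With -Login (must be the first switch), sh sources /etc/profile and ~/.profile first.
pwsh -Login -NoProfile -Command '$env:PATH'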

Bash

Bash has a lot of profiles and rules about when each one gets read. Honestly, it’s pretty complex, and it seems to have a lot to do with backwards compatibility with sh along with the need for more flexibility and override support.

/etc/profile seems to be the way to globally set user-specific paths. After /etc/profile, things start getting complex - for example, if you have a .bash_profile, your .profile will get ignored.
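
One way to keep that particular rule from biting you is to have ~/.bash_profile delegate to ~/.profile. This is a sketch, assuming you want ~/.profile to stay the single source of truth:

# ~/.bash_profile
# Bash skips ~/.profile when this file exists, so pull it in explicitly.
if [ -f "$HOME/.profile" ]; then
  . "$HOME/.profile"
fi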

zsh

zsh is the default login shell on Mac. It has profiles at:

  • /etc/zshrc and ~/.zshrc
  • /etc/zshenv and ~/.zshenv
  • /etc/zprofile and ~/.zprofile

It may instead use /etc/profile and ~/.profile if it’s invoked in a compatibility mode. In this case, it won’t execute the zsh profile files and will use the sh files instead. See the manpage under “Compatibility” for details or this nice Stack Overflow answer.

I’ve set user-specific paths in /etc/profile and /etc/zprofile, which seems to cover all the bases depending on how the command gets invoked.
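
A quick way to check that both invocation styles pick up the change (a sketch - the output is just your PATH, so eyeball it for the entries you added):

bash -l -c 'echo $PATH'   # login bash reads /etc/profile
zsh -l -c 'echo $PATH'    # login zsh reads /etc/zshenv and /etc/zprofile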

Operating System Tips

Windows

On Windows, you set all paths in the System => Advanced System Settings => Environment Variables control panel. You can set system-level or user-level environment variables there.

The Windows path separator is ;, which is different than Mac and Linux. If you’re building a path with string concatenation, be sure to use the right separator.

Mac and Linux

I’ve lumped these together because, with respect to shells and setting paths, things are largely the same. The only significant difference is that Mac has a tool called path_helper that is used to generate paths from a file at /etc/paths and files inside the folder /etc/paths.d. Linux doesn’t have path_helper.

The file format for /etc/paths and files in /etc/paths.d is plain text where each line contains a single path, like:

/usr/local/bin
/usr/bin
/bin
/usr/sbin
/sbin

Unfortunately, path_helper doesn’t respect the use of variables - it will escape any $ it finds. That makes these files a good place for global paths, but not great for user-specific paths.
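
For example, adding a global (non-user-specific) tool directory is just a one-line file in /etc/paths.d - the directory and file name here are made up for illustration:

sudo sh -c 'echo "/opt/mytools/bin" > /etc/paths.d/mytools'

# Prints the PATH line path_helper would generate, including the new entry.
/usr/libexec/path_helper -s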

In /etc/profile there is a call to path_helper to evaluate the set of paths across these files and set the path. I’ve found that just after that call is a good place to put “global” user-specific paths.

if [ -x /usr/libexec/path_helper ]; then
  eval `/usr/libexec/path_helper -s`
fi

# Global, user-specific paths - added right after the path_helper call.
PATH="$PATH:$HOME/go/bin:$HOME/.dotnet/tools:$HOME/.krew/bin"

Regardless of whether you’re on Mac or Linux, /etc/profile seems to be the most common place to put these settings. Make sure to use $HOME instead of ~ to indicate the home directory. The ~ won’t get expanded and can cause issues down the road.

If you want to use zsh, you’ll want the PATH-setting block in both /etc/profile and /etc/zprofile so it handles any invocation.

The Mac and Linux path separator is :, which is different than Windows. If you’re building a path with string concatenation, be sure to use the right separator.
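
If you’re assembling the path in a script, a sketch of the Mac/Linux form looks like this (the directories are just examples):

EXTRA_PATHS="$HOME/go/bin:$HOME/.dotnet/tools"
export PATH="$PATH:$EXTRA_PATHS"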

kubernetes

I have a situation that is possibly kind of niche, but it was a real challenge to figure out so I thought I’d share the solution in case it helps you.

I have a Kubernetes cluster with Istio installed. My Istio ingress gateway is connected to an Apigee API management front-end via mTLS. Requests come in to Apigee then get routed to a secured public IP address where only Apigee is authorized to connect.

Unfortunately, this results in all requests coming in with the same Host header:

  1. Client requests api.services.com/v1/resource/operation.
  2. Apigee gets that request and routes to 1.2.3.4/v1/resource/operation via the Istio ingress gateway and mTLS.
  3. An Istio VirtualService answers to hosts: "*" (any host header at all) and matches entirely on URL path - if it’s /v1/resource/operation it routes to mysvc.myns.svc.cluster.local/resource/operation.

This is how the ingress tutorial on the Istio site works, too. No hostname-per-service.

However, there are a couple of wrenches in the works, as expected:

  • There are some API endpoints on the service that aren’t exposed through Apigee. They’re internal-only operations that allow for service-to-service communications in the cluster but aren’t for outside callers.
  • I want to do canary deployments and route traffic slowly from an existing version of the service to a new, canary version. I need both the external and internal traffic routed this way to get accurate results.

The combination of these things is a problem. I can’t assume that the match-on-path-regex setting will work for internal traffic - I need any internal service to route properly based on host name. However, you also can’t match on hosts: "*" for internal traffic that doesn’t come through an ingress. That means I would need two different VirtualService instances - one for internal traffic, one for external.

But if I have two different VirtualService objects to manage, it means I need to keep them in sync over the canary, which kind of sucks. I’d like to set the traffic balancing in one spot and have it work for both internal and external traffic.

I asked how to do this on the Istio discussion forum and thought for a while that a VirtualService delegate would be the answer - have one VirtualService with the load balancing information, a second service for internal traffic (delegating to the load balancing service), and a third service for external traffic (delegating to the load balancing service). It’s more complex, but I’d get the ability to control traffic in one spot.

Unfortunately (the word “unfortunately” shows up a lot here, doesn’t it?), you can’t use delegates on a VirtualService that doesn’t also connect to a gateway. That is, if it’s internal/mesh traffic, you don’t get the delegate support. This issue in the Istio repo touches on that.

Here’s where I landed.

First, I updated Apigee so it takes care of two things for me:

  1. It adds a Service-Host header with the internal host name of the target service, like Service-Host: mysvc.myns.svc.cluster.local. It more tightly couples the Apigee part of things to the service internal structure, but it frees me up from having to route entirely by regex in the cluster. (You’ll see why in a second.) I did try to set the Host header directly, but Apigee overwrites this when it issues the request on the back end.
  2. It does all the path manipulation before issuing the request. If the internal service wants /v1/resource/operation to be /resource/operation, that path update happens in Apigee so the inbound request will have the right path to start.

I did the Service-Host header with an “AssignMessage” policy.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AssignMessage async="false" continueOnError="false" enabled="true" name="Add-Service-Host-Header">
    <DisplayName>Add Service Host Header</DisplayName>
    <Set>
        <Headers>
            <Header name="Service-Host">mysvc.myns.svc.cluster.local</Header>
        </Headers>
    </Set>
    <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
    <AssignTo createNew="false" transport="http" type="request"/>
</AssignMessage>

Next, I added an Envoy filter to the Istio ingress gateway so it knows to look for the Service-Host header and update the Host header accordingly. Again, I used Service-Host because I couldn’t get Apigee to properly set Host directly. If you can figure that out and get the Host header coming in correctly the first time, you can skip the Envoy filter.

The filter needs to run first thing in the pipeline, before Istio tries to route traffic. I found that pinning it just before the istio.metadata_exchange stage got the job done.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: propagate-host-header-from-apigee
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
      app: istio-ingressgateway
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.http_connection_manager"
            subFilter:
              # istio.metadata_exchange is the first filter in the connection
              # manager, at least in Istio 1.6.14.
              name: "istio.metadata_exchange"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.lua
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
            inline_code: |
              function envoy_on_request(request_handle)
                local service_host = request_handle:headers():get("service-host")
                if service_host ~= nil then
                  request_handle:headers():replace("host", service_host)
                end
              end
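
To sanity-check that the filter actually landed on the ingress gateway, something like this does the trick (a sketch - it assumes the standard istio=ingressgateway pod label and an istioctl that matches your mesh version):

kubectl -n istio-system get envoyfilter propagate-host-header-from-apigee

# Look for envoy.filters.http.lua in the gateway's HTTP filter chain.
POD=$(kubectl -n istio-system get pod -l istio=ingressgateway -o jsonpath='{.items[0].metadata.name}')
istioctl proxy-config listeners "$POD" -n istio-system -o json | grep -B 2 -A 5 'envoy.filters.http.lua'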

Finally, the VirtualService that handles the traffic routing needs to be tied both to the ingress and to the mesh gateway. The hosts setting can just be the internal service name, though, since that’s what the ingress will use now.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: mysvc
  namespace: myns
spec:
  gateways:
    - istio-system/apigee-mtls
    - mesh
  hosts:
    - mysvc
  http:
    - route:
        - destination:
            host: mysvc-stable
          weight: 50
        - destination:
            host: mysvc-baseline
          weight: 25
        - destination:
            host: mysvc-canary
          weight: 25
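
When it’s time to move the canary along, the weights only have to change in this one resource. Here’s a sketch of doing that with a JSON merge patch - the weights and service names are just from the example above:

kubectl -n myns patch virtualservice mysvc --type merge -p '
{"spec":{"http":[{"route":[
  {"destination":{"host":"mysvc-stable"},"weight":90},
  {"destination":{"host":"mysvc-canary"},"weight":10}
]}]}}'

Note that a merge patch replaces the whole http list, so include every route you still want when you patch.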

Once all these things are complete, both internal and external traffic will be routed by the single VirtualService. Now I can control canary load balancing in a single location and be sure that I’m getting correct overall test results and statistics with as few moving pieces as possible.

Disclaimer: There may be reasons you don’t want to treat external traffic the same as internal, like if you have different DestinationRule settings for traffic management inside vs. outside, or if you need to pass things through different authentication filters or whatever. Everything I’m working with is super locked down so I treat internal and external traffic with the same high levels of distrust and ensure that both types of traffic are scrutinized equally. YMMV.

mac, network

I have a Mac and my user account is attached to a Windows domain. The benefit of this is actually pretty minimal in that I can change my domain password and it propagates to the local Mac user account, but that’s about it. It seems to cause more trouble than it’s worth.

I recently had an issue where something got out of sync and I couldn’t log into my Mac using my domain account. What follows is a bunch of tips and things I did to recover from that.

First, have a separate local admin account. Give it a super complex password and never use it for anything else. This is your escape hatch for recovering your regular user account. Even if you keep a local admin account around so your regular account can stay a non-admin user… have a dedicated “escape hatch” admin account that’s separate from the “I use this sometimes for sudo purposes” admin account. I have this, and if I hadn’t, that’d have been the end of it.

It’s good to remember that for a domain-joined account there are three security tokens that all need to be kept in sync: your domain user password, your local machine OS password, and your disk encryption token. When you reboot the computer, the first password you’ll be asked for should unlock the disk encryption. Usually the disk encryption token is tied nicely to the machine account password, so you enter the one password and it both unlocks the disk and logs you in. The problem I was running into was that those had gotten out of sync. For a domain-joined account, the domain password usually is also tied to these things.

Next, keep your disk encryption recovery code handy. Store it in a password manager or something. If things get out of sync, you can use the recovery code to unlock the disk and then your OS password to log in.

For me, I was able to log in as my separate local admin account, but my machine password wasn’t working unless I was connected to the domain. The only way to connect to the domain was over a VPN. That meant I needed to enable fast user switching so I could connect to the VPN under the separate local admin and then switch - without logging out - to my domain account.

Once I got to my own account I could use the Users & Groups app to change my domain password and have the domain and machine accounts re-synchronized. ALWAYS ALWAYS ALWAYS USE USERS & GROUPS TO CHANGE YOUR DOMAIN ACCOUNT PASSWORD. I have not found any other way to ensure everything stays in sync. Don’t change it from some other workstation, don’t change it from Azure Active Directory. That is the road to ruin. Stay with Users & Groups.

The last step was that my disk encryption token wasn’t in sync - OS and domain connection was good, but I couldn’t log in after a reboot. I found the answer in a Reddit thread:

# As the standalone local admin:
su local_admin

# Check whether the domain account currently has a secure token.
sysadminctl -secureTokenStatus domain_account_username

# Turn the secure token off, then back on, to get it back in sync.
sysadminctl -secureTokenOff domain_account_username \
  -password domain_account_password \
  interactive
sysadminctl -secureTokenOn domain_account_username \
  -password domain_account_password \
  interactive

Basically, as the standalone local admin, turn off and back on again the connection to the drive encryption. This refreshes the token and gets it back in sync.

Reboot, and you should be able to log in with your domain account again.

To test it out, you may want to try changing your password from Users & Groups to see that the sync works. If you get a “password complexity” error, it could be the sign of an issue… or it could be the sign that your domain has a “you can’t change the password more than once every X days” sort of policy and since you changed it earlier you are changing it again too soon. YMMV.

And, again, always change your password from Users & Groups.

kubernetes

I have a Kubernetes 1.19.11 cluster deployed along with Istio 1.6.14. I have a central instance of Prometheus for scraping metrics, and based on the documentation, I have a manually-injected sidecar so Prometheus can make use of the Istio certificates for mTLS during scraping. Under Prometheus v2.20.1 this worked great. However, I was trying to update some of the infrastructure components to take advantage of new features, and Prometheus from v2.21.0 on just would not scrape.

These are my adventures in trying to debug this issue. Some of it is to remind me of what I did. Some of it is to save you some trouble if you run into the issue. Some of it is to help you see what I did so you can apply some of the techniques yourself.

TL;DR: The problem is that Prometheus v2.21.0 disabled HTTP/2 and that needs to be re-enabled for things to work. There should be a Prometheus release soon that allows you to re-enable HTTP/2 with environment variables.

I created a repro repository with a minimal amount of setup to show how things work. It can get you from a bare Kubernetes cluster up to Istio 1.6.14 and Prometheus using the same values I am. You’ll have to supply your own microservice/app to demonstrate scraping, but the prometheus-example-app may be a start.

I deploy Prometheus using the Helm chart. As part of that, I have an Istio sidecar manually injected just like they do in the official 1.6 Istio release manifests. By doing this, the sidecar will download and share the certificates but it won’t proxy any of the Prometheus traffic.
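
For context, the deployment itself is just the standard chart plus my values file - something along these lines, where the release name, namespace, and values file name are placeholders:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm upgrade --install prometheus prometheus-community/prometheus \
  --namespace monitoring \
  --values prometheus-values.yaml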

I then have a Prometheus scrape configuration that uses the certificates mounted in the container. If it finds a pod that has the Istio sidecar annotations (indicating it’s got the sidecar injected), it’ll use the certificates for authentication and communication.

- job_name: "kubernetes-pods-istio-secure"
  scheme: https
  tls_config:
    ca_file: /etc/istio-certs/root-cert.pem
    cert_file: /etc/istio-certs/cert-chain.pem
    key_file: /etc/istio-certs/key.pem
    insecure_skip_verify: true

If I deploy Prometheus v2.20.1, I see that my services are being scraped by the kubernetes-pods-istio-secure job, they’re using HTTPS, and everything is good to go. Under v2.21.0 and later, I see the error connection reset by peer. I tried asking about this in the Prometheus newsgroup to no avail, so… I dove in.

My first step was to update the Helm chart extraArgs to turn on Prometheus debug logging.

extraArgs:
  log.level: debug

I was hoping to see more information about what was happening. Unfortunately, I got basically the same thing.

level=debug ts=2021-07-06T20:58:32.984Z caller=scrape.go:1236 component="scrape manager" scrape_pool=kubernetes-pods-istio-secure target=https://10.244.3.10:9102/metrics msg="Scrape failed" err="Get \"https://10.244.3.10:9102/metrics\": read tcp 10.244.4.89:36666->10.244.3.10:9102: read: connection reset by peer"
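
If you want to watch these failures live while poking at things, tailing the server container works - a sketch, where the deployment and namespace names follow the chart defaults and may differ in your setup:

kubectl -n monitoring logs deploy/prometheus-server -c prometheus-server -f | grep "Scrape failed"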

This got me thinking one of two things may have happened in v2.21.0:

  • Something changed in Prometheus; OR
  • Something changed in the OS configuration of the Prometheus container

I had recently fought with a dotnet CLI problem where certain TLS cipher suites were disabled by default and some OS configuration settings on our build agents affected what was seen as allowed vs. not allowed. This was stuck in my mind so I couldn’t immediately rule out the container OS configuration.

To validate the OS issue I was going to try using curl and/or openssl to connect to the microservice and see what the cipher suites were. Did I need an Istio upgrade? Was there some configuration setting I was missing? Unfortunately, it turns out the Prometheus Docker image is based on a custom busybox image where there are no package managers or tools. I mean, this is actually a very good thing from a security perspective but it’s a pain for debugging.

What I ended up doing was getting a recent Ubuntu image and connecting using that, just to see. I figured if there was anything obvious going on that I could take the extra steps of creating a custom Prometheus image with curl and openssl to investigate further. I mounted a manual sidecar just like I did for Prometheus so I could get to the certificates without proxying traffic, then I ran some commands:

curl https://10.244.3.10:9102/metrics \
  --cacert /etc/istio-certs/root-cert.pem \
  --cert /etc/istio-certs/cert-chain.pem \
  --key /etc/istio-certs/key.pem \
  --insecure

openssl s_client \
  -connect 10.244.3.10:9102 \
  -cert /etc/istio-certs/cert-chain.pem  \
  -key /etc/istio-certs/key.pem \
  -CAfile /etc/istio-certs/root-cert.pem \
  -alpn "istio"

Here’s some example output from curl to show what I was seeing:

root@sleep-5f98748557-s4wh5:/# curl https://10.244.3.10:9102/metrics --cacert /etc/istio-certs/root-cert.pem --cert /etc/istio-certs/cert-chain.pem --key /etc/istio-certs/key.pem --insecure -v
*   Trying 10.244.3.10:9102...
* TCP_NODELAY set
* Connected to 10.244.3.10 (10.244.3.10) port 9102 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/istio-certs/root-cert.pem
  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, CERT verify (15):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: [NONE]
*  start date: Jul  7 20:21:33 2021 GMT
*  expire date: Jul  8 20:21:33 2021 GMT
*  issuer: O=cluster.local
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x564d80d81e10)
> GET /metrics HTTP/2
> Host: 10.244.3.10:9102
> user-agent: curl/7.68.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* old SSL session ID is stale, removing
* Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
< HTTP/2 200

A few things in particular:

  1. I found the --alpn "istio" thing for openssl while looking through Istio issues to see if there were any pointers there. It’s always good to read through issues lists to get ideas and see if other folks are running into the same problems.
  2. Both openssl and curl were able to connect to the microservice using the certificates from Istio.
  3. The cipher suite shown in the openssl output was one that was considered “recommended.” I forgot to capture that output for the blog article, sorry about that.

At this point I went to the release notes for Prometheus v2.21.0 to see what had changed. I noticed two things that I thought may affect my situation:

  1. This release is built with Go 1.15, which deprecates X.509 CommonName in TLS certificates validation.
  2. [CHANGE] Disable HTTP/2 because of concerns with the Go HTTP/2 client. #7588 #7701

I did see in that curl output that it was using HTTP/2 but… is it required? Unclear. However, looking at the Go docs about the X.509 CommonName thing, that’s easy enough to test. I just needed to add an environment variable to the Helm chart for Prometheus:

env:
  - name: GODEBUG
    value: x509ignoreCN=0

After redeploying… it didn’t fix anything. That wasn’t the problem. That left the HTTP/2 thing. However, what I found was that HTTP/2 is hardcoded off, not disabled through some configuration mechanism, so there isn’t a way to just turn it back on to test. The only way to test it is to do a fully custom build.

The Prometheus build for a Docker image is really complicated. They have this custom build tool promu that runs the build in a custom build container and all this is baked into layers of make and yarn and such. As it turns out, not all of it happens in the container, either, because if you try to build on a Mac you’ll get an error like this:

... [truncated huge list of downloads] ...
go: downloading github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578
go: downloading github.com/Azure/go-autorest/autorest/validation v0.3.1
go: downloading github.com/Azure/go-autorest/autorest/to v0.4.0
go build github.com/aws/aws-sdk-go/service/ec2: /usr/local/go/pkg/tool/linux_amd64/compile: signal: killed
!! command failed: build -o .build/linux-amd64/prometheus -ldflags -X github.com/prometheus/common/version.Version=2.28.1 -X github.com/prometheus/common/version.Revision=b0944590a1c9a6b35dc5a696869f75f422b107a1 -X github.com/prometheus/common/version.Branch=HEAD -X github.com/prometheus/common/version.BuildUser=root@76a91e410d00 -X github.com/prometheus/common/version.BuildDate=20210709-14:47:03  -extldflags '-static' -a -tags netgo,builtinassets github.com/prometheus/prometheus/cmd/prometheus: exit status 1
make: *** [Makefile.common:227: common-build] Error 1
!! The base builder docker image exited unexpectedly: exit status 2

You can only build on Linux even though it’s happening in a container. At least right now. Maybe that’ll change in the future. Anyway, this meant I needed to create a Linux VM and set up an environment there that could build Prometheus… or figure out how to force a build system to do it, say by creating a fake PR to the Prometheus project. I went the Linux VM route.

I changed the two lines where HTTP/2 was disabled, pushed that to a temporary Docker Hub location, and got it deployed in my cluster.

Success! Once HTTP/2 was re-enabled, Prometheus was able to scrape my Istio pods again.

I worked through this all with the Prometheus team and they were able to replicate the issue using my repro repo. They are now working through how to re-enable HTTP/2 using environment variables or configuration.

All of this took close to a week to get through.

It’s easy to read these blog articles and think the writer just blasted through all this and it was all super easy, that I already knew the steps I was going to take and flew through it. I didn’t. There was a lot of reading issues. There was a lot of trying things and then retrying those same things because I forgot what I’d just tried, or maybe I discovered I forgot to change a configuration value. I totally deleted and re-created my test Kubernetes cluster like five times because I also tried updating Istio and… well, you can’t really “roll back Istio.” It got messy. Not to mention, debugging things at the protocol level is a spectacular combination of “not interesting” and “not my strong suit.”

My point is, don’t give up. Pushing through these things and reading and banging your head on it is how you get the experience so that next time you will have been through it.