azure, docker, powershell

I’ve been doing some work lately creating and migrating Azure Container Registry instances, so I thought I’d share a few helpful scripts. Obvious disclaimers - YMMV, works on my machine, I’m not responsible if you delete something you shouldn’t have, etc.

New-AzureContainerRegistry.ps1

I need to create container registries that have customer managed key support enabled. Unfortunately, there are a lot of steps to this and there are some things that aren’t obvious, like:

  • You need to use the “Premium” SKU for this to work.
  • The Key Vault and the thing being encrypted using customer managed keys (e.g., the container registry) need to be in the same subscription and geographic region. They only say this in the docs about VM disk encryption but it seems to be applicable to all CMK usage.

Normally I’d think about doing this with something like Terraform but as of this writing, Terraform doesn’t have support for ACR + CMK so… script it is.
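
For reference, the general shape of what that script has to do looks something like this. It’s a rough sketch (not the actual script) using the az CLI, the resource names are placeholders, and the exact parameters may drift between az versions:

# Sketch only - placeholder names, no error handling.
$rg = "my-resource-group"
$location = "westus2"
$acrName = "myregistry"
$kvName = "my-acr-keyvault"
$identityName = "my-acr-cmk-identity"

# Everything (Key Vault, identity, registry) lives in the same subscription and region.
az group create --name $rg --location $location

# User-assigned identity the registry will use to reach the Key Vault.
az identity create --resource-group $rg --name $identityName
$identityId = az identity show --resource-group $rg --name $identityName --query id --output tsv
$principalId = az identity show --resource-group $rg --name $identityName --query principalId --output tsv

# Key Vault and key. Purge protection is required for CMK scenarios.
az keyvault create --resource-group $rg --name $kvName --location $location --enable-purge-protection true
az keyvault set-policy --name $kvName --object-id $principalId --key-permissions get unwrapKey wrapKey
az keyvault key create --vault-name $kvName --name acr-cmk-key
$keyId = az keyvault key show --vault-name $kvName --name acr-cmk-key --query key.kid --output tsv

# Premium SKU is required for customer managed keys.
az acr create --resource-group $rg --name $acrName --sku Premium `
  --identity $identityId --key-encryption-key $keyId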

Delete-AzureContainerImages.ps1

This is more a “pruning” operation than deleting, but “prune” isn’t an approved PowerShell verb and I do love me some PowerShell.

In a CI/CD environment, generally you want to keep:

  • The current successfully deployed image.
  • The previous successfully deployed image.
  • The image you want to deploy next (canary style).

…and, actually, that’s about it. CI/CD is fail-forward, so there’s not really a roll-back-three-versions case. You’d roll back the code and build a new container.

Point being, there’s not really a retention policy that handles this in ACR right now. While this script also doesn’t totally handle it the way I’d like, what it can do is keep the most recent X tags of an image and prune all the old ones. I also added a way to regex match a container repository by name so you can be more precise about targeting what you want to prune.
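
A rough sketch of the approach (not the actual script) - the registry name and regex are placeholders, and it leans entirely on the az CLI:

# Sketch: keep the newest $keepCount tags per matching repository, delete the rest.
$registry = "myregistry"
$repositoryFilter = "^myteam/"   # regex used to match repository names
$keepCount = 3

$repositories = az acr repository list --name $registry --output tsv
foreach ($repository in $repositories) {
    if ($repository -notmatch $repositoryFilter) { continue }

    # Newest tags first, so anything after the first $keepCount is prunable.
    $tags = az acr repository show-tags --name $registry --repository $repository --orderby time_desc --output tsv
    $pruneTags = $tags | Select-Object -Skip $keepCount
    foreach ($tag in $pruneTags) {
        az acr repository delete --name $registry --image "$($repository):$($tag)" --yes
    }
}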

Copy-AzureContainerImages.ps1

This is sort of a bulk copy operation for ACR. For reasons I won’t get into, I needed to copy all the images off an ACR, delete/re-create the ACR, and copy them all back. While the az CLI supports importing one image/tag at a time, there’s not really a bulk copy. There’s a ‘transfer artifacts’ mechanism but it’s sort of complex to set up and the az CLI is already here, so…

This script gets all the repositories and all the tags from each repository and does az acr import on all of them. It’s not fast, but it gets the job done.
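
The core of it is roughly this - again a sketch with placeholder registry names, and depending on permissions between the registries you may also need the --registry, --username, and --password options on az acr import:

# Sketch: copy every repository:tag from one registry to another.
$sourceRegistry = "sourceregistry"
$targetRegistry = "targetregistry"

$repositories = az acr repository list --name $sourceRegistry --output tsv
foreach ($repository in $repositories) {
    $tags = az acr repository show-tags --name $sourceRegistry --repository $repository --output tsv
    foreach ($tag in $tags) {
        az acr import --name $targetRegistry `
            --source "$($sourceRegistry).azurecr.io/$($repository):$($tag)" `
            --image "$($repository):$($tag)"
    }
}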

kubernetes

Here’s what I want:

  • Istio 1.6.4 in Kubernetes acting as the ingress.
  • oauth2-proxy wrapped around one application, not the whole cluster.
  • OpenID Connect support for Azure AD - both interactive OIDC and support for client_credentials OAuth flow.
  • Istio token validation in front of the app.
  • No replacing the Istio sidecar. I want things running as stock as possible so I’m not too far off the beaten path when it’s upgrade time.

I’ve set this up in the past without too much challenge using nginx ingress but I don’t want Istio bypassed here. Unfortunately, setting up oauth2-proxy with an Istio (Envoy) ingress is a lot more complex than sticking a couple of annotations in there.

Luckily, I found this blog article by Justin Gauthier, who’d done a lot of the leg-work to figure things out. The differences between that blog article and what I want done are:

  • That article uses an older version of Istio so some of the object definitions don’t apply to my Istio 1.6.4 setup.
  • That article wraps everything in the cluster (via the Istio ingress) with oauth2-proxy and I only want one service wrapped.

With all that in mind, let’s get going.

Prerequisites

There are some things you need to set up before you can get this going.

DNS Entries

Pick a subdomain on which you’ll have the service and the oauth2-proxy. For our purposes, let’s pick cluster.example.com as the subdomain. You want a single subdomain so you can share cookies and so it’s easier to set up DNS and certificates.

We’ll put the app and oauth2-proxy under that.

  • The application/service being secured will be at myapp.cluster.example.com.
  • The oauth2-proxy will be at oauth.cluster.example.com.

In your DNS system you need to assign the wildcard DNS *.cluster.example.com to the IP address that your Istio ingress is using. If someone visits https://myapp.cluster.example.com they should be able to get to your service in the cluster via the Istio ingress gateway.
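
For what it’s worth, my DNS lives in Azure DNS, so that wiring looks roughly like this - a sketch with placeholder zone and resource group names:

# Grab the public IP the Istio ingress gateway is using.
$ingressIp = kubectl get service istio-ingressgateway -n istio-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}'

# Point the wildcard at it. Assumes the example.com zone is hosted in Azure DNS.
az network dns record-set a add-record `
  --resource-group my-dns-resource-group `
  --zone-name example.com `
  --record-set-name '*.cluster' `
  --ipv4-address $ingressIp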

Azure AD Application

For an application to allow OpenID Connect / OAuth through Azure AD, you need to register the application with Azure AD. The application should be for the service you’re securing.

In that application you need to:

  • On the “Overview” tab, make a note of…
    • The “Application (client) ID” - you’ll need it later. For this example, let’s say it’s APPLICATION-ID-GUID.
    • The “Directory (tenant) ID” - you’ll need it later. For this example, let’s say it’s TENANT-ID-GUID.
  • On the “Authentication” tab:
    • Under “Web / Redirect URIs,” set the redirect URI to /oauth2/callback relative to your app, like https://myapp.cluster.example.com/oauth2/callback.
    • Under “Implicit grant,” check the box to allow access tokens to be issued.
  • On the “Expose an API” tab, create a scope. It doesn’t really matter what it’s called, but if no scopes are present then client_credentials won’t work. I called mine user_impersonation but you could call yours fluffy and it wouldn’t matter. The scope URI will end up looking like api://APPLICATION-ID-GUID/user_impersonation where that GUID is the ID for your application.
  • On the “API permissions” tab:
    • Grant permission to that user_impersonation scope you just created.
    • Grant permission to Microsoft.Graph - User.Read so oauth2-proxy can validate credentials.
    • Click the “Grant admin consent” button at the top or client_credentials won’t work. There’s no way to grant consent in the middle of that flow.
  • On the “Certificates & secrets” page, under “Client secrets,” create a client secret and take note of it. You’ll need it later. For this example, we’ll say the client secret is myapp-client-secret but yours is going to be a long string of random characters.

Finally, somewhat related - take note of the email domain associated with your users in Azure Active Directory. For our example, we’ll say everyone has an @example.com email address. We’ll use that when configuring oauth2-proxy for who can log in.

cert-manager

Set up cert-manager in the cluster. I found the DNS01 solver worked best for me with Istio in the mix because it was easy to get Azure DNS hooked up.

The example here assumes that you have it set up so you can drop a Certificate into a Kubernetes namespace and cert-manager will take over, request a certificate, and populate the appropriate Kubernetes secret that can be used by the Istio ingress gateway for TLS.

Setting up cert-manager isn’t hard, but there’s already a lot of documentation on it so I’m not going to repeat all of it.
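
For reference, the ClusterIssuer this example assumes looks roughly like this - a sketch using the Azure DNS DNS01 solver with placeholder GUIDs and names throughout, so follow the cert-manager docs for the real setup:

apiVersion: cert-manager.io/v1beta1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    email: certs@example.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-production-account-key
    solvers:
    - dns01:
        azureDNS:
          # Service principal with rights to manage records in the DNS zone.
          clientID: DNS-SP-CLIENT-ID-GUID
          clientSecretSecretRef:
            name: azuredns-credentials
            key: client-secret
          subscriptionID: SUBSCRIPTION-ID-GUID
          tenantID: TENANT-ID-GUID
          resourceGroupName: my-dns-resource-group
          hostedZoneName: example.com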

If you can’t use cert-manager in your environment then you’ll have to adjust for that when you see the steps where the TLS bits are getting set up later.

The Setup

OK, you have the prerequisites set up, let’s get to it.

Istio Service Entry

If you have traffic going through an egress in Istio, you will need to set up a ServiceEntry to allow access to the various Azure AD endpoints from oauth2-proxy. I have all outbound traffic requiring egress so this was something I had to do.

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: azure-istio-egress
  namespace: istio-system
spec:
  hosts:
  - '*.microsoft.com'
  - '*.microsoftonline.com'
  - '*.windows.net'
  location: MESH_EXTERNAL
  ports:
  - name: https
    number: 443
    protocol: HTTPS
  resolution: NONE

I use a lot of other Azure services, so I have some pretty permissive outbound allowances. You can try to reduce this to just the minimum of what you need by doing a little trial and error. I know I ran into:

  • graph.windows.net - Azure AD Graph API
  • login.windows.net - Common JWKS endpoint
  • sts.windows.net - Token issuer, also used for token validation
  • *.microsoftonline.com, *.microsoft.com - Some UI redirection happens to allow OIDC login here with a Microsoft account

I’ll admit after I got through a bunch of different minor things, I just started whitelisting egress allowances. It wasn’t that important for me to be exact for this.

I did deploy this to the istio-system namespace. It doesn’t seem to matter where a ServiceEntry gets deployed - once it’s out there, it works for any service in the cluster - but I put all of mine in istio-system so they’re easier to track.

TLS Certificate

OpenID Connect via Azure AD requires a TLS connection for your app. cert-manager takes care of converting a Certificate object to a Kubernetes Secret for us.

It’s important to note that we’re going to use the standard istio-ingressgateway to handle our inbound traffic, and that’s in the istio-system namespace. You can’t read Kubernetes secrets across namespaces, so the Certificate needs to be deployed to the istio-system namespace.

This is one of the places where you’ll see why it’s good to have picked a common subdomain for the oauth2-proxy and the app - a single wildcard certificate covers both.

apiVersion: cert-manager.io/v1beta1
kind: Certificate
metadata:
  name: tls-myapp-production
  namespace: istio-system
spec:
  commonName: '*.cluster.example.com'
  dnsNames:
  - '*.cluster.example.com'
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-production
  secretName: tls-myapp-production

Application Namespace

Create your application namespace and enable Istio sidecar injection. This is where your app/service, oauth2-proxy, and Redis will go.

kubectl create namespace myapp
kubectl label namespace myapp istio-injection=enabled

Redis

You need to enable Redis as a session store for oauth2-proxy if you want the Istio token validation in place. I gather this isn’t required if you don’t want Istio doing any token validation, but I did, so here we go.

I used the Helm chart v10.5.7 for Redis. There are… a lot of ways you can set up Redis. I set up the demo version here in a very simple, non-clustered manner. Depending on how you set up Redis, you may need to adjust your oauth2-proxy configuration.

Here’s the values.yaml I used for deploying Redis:

cluster:
  enabled: false
usePassword: true
password: "my-redis-password"
master:
  persistence:
    enabled: false
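
With that saved as redis-values.yaml (the file name is arbitrary), the deployment itself is the usual Helm routine - this assumes the bitnami chart repo:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install redis bitnami/redis --version 10.5.7 --namespace myapp --values redis-values.yaml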

The Application

When you deploy your application, you’ll need to set up:

  • The Kubernetes Deployment and Service
  • The Istio VirtualService and Gateway

The Deployment doesn’t have anything special; it just exposes a port that can be routed to by a Service. Here’s a simple Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
  labels:
    app.kubernetes.io/name: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: myapp
  template:
    metadata:
      labels:
        app.kubernetes.io/name: myapp
    spec:
      containers:
      - image: "docker.io/path/to/myapp:sometag"
        imagePullPolicy: IfNotPresent
        name: myapp
        ports:
        - containerPort: 80
          name: http
          protocol: TCP

We have a Kubernetes Service for that Deployment:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
  labels:
    app.kubernetes.io/name: myapp
spec:
  ports:
  # Exposes container port 80 on service port 8000.
  # This is pretty arbitrary, but you need to know
  # the Service port for the VirtualService later.
  - name: http
    port: 8000
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/name: myapp

The Istio VirtualService is another layer on top of the Service that helps in traffic control. Here’s where we start tying the ingress gateway to the Service.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  labels:
    app.kubernetes.io/name: myapp
  name: myapp
  namespace: myapp
spec:
  gateways:
  # Name of the Gateway we're going to deploy in a minute.
  - myapp
  hosts:
  # The full host name of the app.
  - myapp.cluster.example.com
  http:
  - route:
    - destination:
        # This is the Kubernetes Service info we just deployed.
        host: myapp
        port:
          number: 8000

Finally, we have an Istio Gateway that ties the ingress to our VirtualService.

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  labels:
    app.kubernetes.io/name: myapp
  name: myapp
  namespace: myapp
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    # Same host as the one in the VirtualService, the full
    # name for the service.
    - myapp.cluster.example.com
    port:
      # The name here must be unique across all of the ports named
      # in the Istio ingress. It doesn't matter what it is as long
      # as it's unique. I like using a modified version of the
      # host name.
      name: https-myapp-cluster-example-com
      number: 443
      protocol: HTTPS
    tls:
      # This is the name of the secret that cert-manager placed
      # in the istio-system namespace. It should match the
      # secretName in the Certificate.
      credentialName: tls-myapp-production
      mode: SIMPLE

At this point, if you have everything set up right, you should be able to hit https://myapp.cluster.example.com and get to it anonymously. There’s no oauth2-proxy in place yet, but the ingress is wired up for TLS with that wildcard certificate cert-manager issued, and the DNS is pointing at it, too.

If you can’t get to the service, one of the things isn’t lining up:

  • You forgot to enable Istio sidecar injection on the app namespace or did it after you deployed. Restart the deployments to get the sidecars added.
  • DNS hasn’t propagated.
  • The secret with the TLS certificate isn’t in the istio-system namespace - it must be in istio-system for the ingress to find it.
  • The Gateway isn’t lining up - credentialName is wrong, host name is wrong, port name isn’t unique.
  • The VirtualService isn’t lining up - host name is wrong, Gateway name doesn’t match, Service name or port is wrong.
  • The Service isn’t lining up - the selector doesn’t select any pods, the destination port on the pods is wrong.

If it feels like you’re Odysseus trying to shoot an arrow through 12 axes, yeah, it’s a lot like that. This isn’t even all the axes.

oauth2-proxy

For this I used the Helm chart v3.2.2 for oauth2-proxy. I created the cookie secret for it like this:

docker run -ti --rm python:3-alpine python -c 'import secrets,base64; print(base64.b64encode(secrets.token_bytes(16)));'

You’re also going to need the client ID from your Azure AD application as well as the client secret. You should have grabbed those during the prerequisites earlier.

The values:

config:
  # The client ID of your AAD application.
  clientID: "APPLICATION-ID-GUID"
  # The client secret you generated for the AAD application.
  clientSecret: "myapp-client-secret"
  # The cookie secret you just generated with the Python container.
  cookieSecret: "the-big-base64-thing-you-made"
  # Here's where the interesting stuff happens:
  configFile: |-
    auth_logging = true
    azure_tenant = "TENANT-ID-GUID"
    cookie_httponly = true
    cookie_refresh = "1h"
    cookie_secure = true
    email_domains = "example.com"
    oidc_issuer_url = "https://sts.windows.net/TENANT-ID-GUID/"
    pass_access_token = true
    pass_authorization_header = true
    provider = "azure"
    redis_connection_url = "redis://redis-master.myapp.svc.cluster.local:6379"
    redis_password = "my-redis-password"
    request_logging = true
    session_store_type = "redis"
    set_authorization_header = true
    silence_ping_logging = true
    skip_provider_button = true
    skip_auth_strip_headers = false
    skip_jwt_bearer_tokens = true
    standard_logging = true
    upstreams = [ "static://" ]

Important things to note in the configuration file here:

  • The client ID, client secret, and Azure tenant ID information are all from that Azure AD application you registered as a prerequisite.
  • The logging settings, like silence_ping_logging or auth_logging, are totally up to you. These don’t matter to the functionality but make it easier to troubleshoot.
  • The redis_connection_url is going to depend on how you deployed Redis. You want to connect to the Kubernetes Service that points to the master, at least in this demo setup. There are a lot of Redis config options for oauth2-proxy that you can tweak. Also, storing passwords in config like this isn’t secure so, like, do something better. But it’s also a lot more to explain how to set up and mount secrets and all that here, so just pretend we did the right thing.
  • The pass_access_token, pass_authorization_header, set_authorization_header, and skip_jwt_bearer_tokens values are super key here. The first three must be set that way for OIDC or OAuth to work; the last one must be set for client_credentials to work.
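
With those values saved as oauth2-proxy-values.yaml (again, the file name is arbitrary), installation is the usual Helm routine. The chart location here is an assumption on my part - adjust the repo and chart reference to wherever you get the oauth2-proxy chart:

helm repo add oauth2-proxy https://oauth2-proxy.github.io/manifests
helm install oauth2-proxy oauth2-proxy/oauth2-proxy --version 3.2.2 --namespace myapp --values oauth2-proxy-values.yaml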

Note on client_credentials: If you want to use client_credentials with your app, you need to set up an authenticated emails file in oauth2-proxy. In that emails file, you need to include the service principal ID for the application that’s authenticating. Azure AD issues a token for applications with that service principal ID as the subject, and there’s no email.

The service principal ID can be retrieved if you have your application ID:

az ad sp show --id APPLICATION-ID-GUID --query objectId --out tsv

You’ll also need your app to request a scope when you submit a client_credentials request - use api://APPLICATION-ID-GUID/.default as the scope. (That .default scope won’t exist unless you have some scope defined, which is why you defined one earlier.)
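
In case it helps, here’s a rough sketch of what the caller’s side of a client_credentials call then looks like. The caller’s client ID and secret below are hypothetical placeholders for whatever application is calling your service - they’re separate from the app registration above:

# Get a token from Azure AD using client_credentials against the v2.0 endpoint.
$tokenResponse = Invoke-RestMethod -Method Post `
  -Uri "https://login.microsoftonline.com/TENANT-ID-GUID/oauth2/v2.0/token" `
  -Body @{
    grant_type    = "client_credentials"
    client_id     = "CALLER-APPLICATION-ID-GUID"
    client_secret = "caller-client-secret"
    scope         = "api://APPLICATION-ID-GUID/.default"
  }

# Call the app through the ingress with the bearer token.
Invoke-RestMethod -Uri "https://myapp.cluster.example.com/" `
  -Headers @{ Authorization = "Bearer $($tokenResponse.access_token)" }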

Getting back to it… Once oauth2-proxy is set up, you need to add the Istio wrappers on it.

First, let’s add that VirtualService:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  labels:
    app.kubernetes.io/name: oauth2-proxy
  name: oauth2-proxy
  namespace: myapp
spec:
  gateways:
  # We'll deploy this gateway in a moment.
  - oauth2-proxy
  hosts:
  # Full host name of the oauth2-proxy.
  - oauth.cluster.example.com
  http:
  - route:
    - destination:
        # This should line up with the Service that the
        # oauth2-proxy Helm chart deployed.
        host: oauth2-proxy
        port:
          number: 80

Now the Gateway:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  labels:
    app.kubernetes.io/name: oauth2-proxy
  name: oauth2-proxy
  namespace: myapp
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    # Same host as the one in the VirtualService, the full
    # name for oauth2-proxy.
    - oauth.cluster.example.com
    port:
      # Again, this must be unique across all ports named in
      # the Istio ingress.
      name: https-oauth-cluster-example-com
      number: 443
      protocol: HTTPS
    tls:
      # Same secret as the application - it's a wildcard cert!
      credentialName: tls-myapp-production
      mode: SIMPLE

OK, now you should be able to get something if you hit https://oauth.cluster.example.com. You’re not passing through it for authentication yet, so you’ll likely see an error along the lines of “The reply URL specified in the request does not match the reply URLs configured for the application.” The point is, it shouldn’t be some arbitrary 500 or 404 - oauth2-proxy should kick in.

Istio Token Validation - RequestAuthentication

We want Istio to do some token validation in front of our application, so we can deploy a RequestAuthentication object.

apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  labels:
    app.kubernetes.io/name: myapp
  name: myapp
  namespace: myapp
spec:
  jwtRules:
  - issuer: https://sts.windows.net/TENANT-ID-GUID/
    jwksUri: https://login.windows.net/common/discovery/keys
  selector:
    matchLabels:
      # Match labels should not select the oauth2-proxy, just
      # the application being secured.
      app.kubernetes.io/name: myapp

The Magic - Envoy Filter for Authentication

The real magic is this last step, an Istio EnvoyFilter to pass authentication requests for your app through oauth2-proxy. This is the biggest takeaway I got from Justin’s blog article and it’s really the key to the whole thing.

Envoy filter format is in flux. The object defined here is really dependent on the version of Envoy that Istio is using. This was a huge pain. I ended up finding the docs for the Envoy ExtAuthz filter and feeling my way through the exercise, but you should be aware these things do change.

Here’s the Envoy filter:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  labels:
    app.kubernetes.io/name: myapp
  name: myapp
  namespace: istio-system
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: GATEWAY
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
            subFilter:
              # In Istio 1.6.4 this is the first filter. The examples showing insertion
              # after some other authorization filter or not showing where to insert
              # the filter at all didn't work for me. Istio just failed to insert the
              # filter (silently) and moved on.
              name: istio.metadata_exchange
          # The filter should catch traffic to the service/application.
          sni: myapp.cluster.example.com
    patch:
      operation: INSERT_AFTER
      value:
        name: envoy.filters.http.ext_authz
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
          http_service:
            authorizationRequest:
              allowedHeaders:
                patterns:
                - exact: accept
                - exact: authorization
                - exact: cookie
                - exact: from
                - exact: proxy-authorization
                - exact: user-agent
                - exact: x-forwarded-access-token
                - exact: x-forwarded-email
                - exact: x-forwarded-for
                - exact: x-forwarded-host
                - exact: x-forwarded-proto
                - exact: x-forwarded-user
                - prefix: x-auth-request
                - prefix: x-forwarded
            authorizationResponse:
              allowedClientHeaders:
                patterns:
                - exact: authorization
                - exact: location
                - exact: proxy-authenticate
                - exact: set-cookie
                - exact: www-authenticate
                - prefix: x-auth-request
                - prefix: x-forwarded
              allowedUpstreamHeaders:
                patterns:
                - exact: authorization
                - exact: location
                - exact: proxy-authenticate
                - exact: set-cookie
                - exact: www-authenticate
                - prefix: x-auth-request
                - prefix: x-forwarded
            server_uri:
              # URIs here should be to the oauth2-proxy service inside your
              # cluster, in the namespace where it was deployed. The port
              # in that 'cluster' line should also match up.
              cluster: outbound|80||oauth2-proxy.myapp.svc.cluster.local
              timeout: 1.5s
              uri: http://oauth2-proxy.myapp.svc.cluster.local

That’s it, you should be good to go!

Note I didn’t really mess around with trying to lock the headers down too much. This is the set I found from the blog article by Justin Gauthier and every time I tried to tweak too much, something would stop working in subtle ways.

Try It Out

With all of this in place, you should be able to hit https://myapp.cluster.example.com and the Envoy filter will redirect you through oauth2-proxy to Azure Active Directory. Signing in should get you redirected back to your application, this time authenticated.

Troubleshooting

There are a lot of great tips about troubleshooting and diving into Envoy on the Istio site. This forum post is also pretty good.

Here are a couple of spot tips that I found to be of particular interest.

Finding the Envoy Version

As noted in the EnvoyFilter section, filter formats change based on the version of Envoy that Istio is using. You can find out what version of Envoy you’re running in your Istio cluster by using:

$podname = kubectl get pod -l app=prometheus -n istio-system -o jsonpath='{$.items[0].metadata.name}'
kubectl exec -it $podname -c istio-proxy -n istio-system -- pilot-agent request GET server_info

You’ll get a lot of JSON with info about the Envoy sidecar, but the important bit is:

{
 "version": "80ad06b26b3f97606143871e16268eb036ca7dcd/1.14.3-dev/Clean/RELEASE/BoringSSL"
}

In this case, it’s 1.14.3.

Look at What Envoy is Doing

It’s hard to figure out where the Envoy configuration gets hooked up. The istioctl proxy-status command can help you.

istioctl proxy-status will yield a list like this:

NAME                                                         CDS        LDS        EDS        RDS          PILOT                       VERSION
myapp-768b999cb5-v649q.myapp                                 SYNCED     SYNCED     SYNCED     SYNCED       istiod-5cf5bd4577-frngc     1.6.4
istio-egressgateway-85b568659f-x7cwb.istio-system            SYNCED     SYNCED     SYNCED     NOT SENT     istiod-5cf5bd4577-frngc     1.6.4
istio-ingressgateway-85c67886c6-stdsf.istio-system           SYNCED     SYNCED     SYNCED     SYNCED       istiod-5cf5bd4577-frngc     1.6.4
oauth2-proxy-5655cc447d-5ftbq.myapp                          SYNCED     SYNCED     SYNCED     SYNCED       istiod-5cf5bd4577-frngc     1.6.4
redis-5f7c5b99db-tp5l7.myapp                                 SYNCED     SYNCED     SYNCED     SYNCED       istiod-5cf5bd4577-frngc     1.6.4

Once you’ve deployed, you’ll see the myapp proxy in that list as well as the Istio ingress gateway. You can dump the listener config for either one with something like:

istioctl proxy-config listeners myapp-768b999cb5-v649q.myapp -o json

Sub in the pod name as needed. It will generate a huge raft of JSON, so you might need to dump it to a file so you can scroll around in it and find what you want.

  • The application pod’s listeners will show you info about the sidecar attached to the app.
  • The ingress gateway’s listeners will show you info about ingress traffic (including your Envoy filter).
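
For example, a quick way to spot-check that the ext_authz filter actually landed in the ingress gateway config (a sketch - sub in your own ingress pod name):

# Dump the ingress gateway listener config and search for the ext_authz filter by name.
istioctl proxy-config listeners istio-ingressgateway-85c67886c6-stdsf.istio-system -o json > ingress-listeners.json
Select-String -Path ingress-listeners.json -Pattern 'ext_authz'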

When All Else Fails, Restart the Ingress

When all else fails, restart the ingress pod. kubectl rollout restart deploy/istio-ingressgateway -n istio-system can get you pretty far. When it seems like everything should be working but you’re getting errors like “network connection reset” and it doesn’t make sense… just try kicking the ingress pods. Sometimes the configuration needs to be freshly rebuilt and deployed and that’s how you do it.

I don’t know why this happens, but if you’ve deployed and undeployed some Envoy filters a couple of times… sometimes something just stops working. Restarting the ingress is the only way I’ve found to fix it… but it works!

Other Options

oauth2-proxy isn’t the only way to get this done.

I did see this authservice plugin, which appears to be an Envoy extension to provide oauth2-proxy services right in Envoy itself. Unfortunately, it doesn’t support the latest Istio versions; it requires you to manually replace the Istio sidecar with its custom version; and it doesn’t seem to support client_credentials, which is a primary use case for me.

There’s an OAuth2 filter for Envoy currently in active development (alpha) but I didn’t see that it supported OIDC. I could be wrong there. I’d love to see someone get this working inside Istio.

For older Istio there was an App Identity and Access Adapter but Mixer adapters/plugins have been deprecated in favor of WASM extensions for Envoy.

Are there others? Let me know in the comments!

maker

For Christmas last year Jenn got me a SainSmart 3018 CNC router and I’ve really been getting into it. It’s a steep learning curve, a bit more than 3D printing, but my 3D printing knowledge has helped a lot in knowing what sort of things I should look for.

I’ve been getting into it enough that I wanted to upgrade the spindle in it, and my parents got me a Makita RT0701C trim router. This is a fairly common upgrade path - replacing the stock spindle with a Makita or DeWalt router - and you’ll see it in larger setups like the Shapeoko XXL.

IMPORTANT UPDATE: After posting this article I found that, while I was successful in getting the router mounted and generally working in 2D carving (like making letters on a sign), when doing 3D carving I lost Z height a lot. After a lot of trial and error I determined that the SainSmart 3018 PRO does not have strong enough stepper motors to drive the weight of the Makita router. Later models like the 3018PROVer do have the strength, which is why you see so many folks successful with this. For me, I ended up reverting my 3018 back to the stock spindle and upgrading to a Sienci LongMill 30 x 30 which is where I’ve got my Makita router now.

There are two challenges to overcome when you upgrade the spindle to a router like this.

First, you have to figure out how to mount the router to the CNC frame. I solved this by creating a 3D printed combination holder and dust shoe, which you can get on Thingiverse.

Second, you have to change how you turn the power on and off when carving. The stock spindle is powered right off the control board. When you send your gcode to the router, one of the codes turns the spindle on, which sends power to the spindle and it gets moving. A larger trim router like this is plugged in separately and instead of the control board turning it on and off, it’s generally accepted that you have to turn the router on manually with its power switch before you start cutting. Since nothing will be attached to the actual control board power, the “turn on the spindle” command will be effectively ignored.

I… don’t like that. I’m fine if I have to adjust the speed manually on the router, but I would really like the control board to turn the router on and off as needed for the cut. Lots of people, myself included, solve this using a relay. The rest of this post shows you how to wire it up.

DISCLAIMER: You’re going to be working with electricity. Be safe. Make good connections. Don’t get your fingers in there. I’m not responsible for you burning your house down by making bad wire splices or injuring yourself from touching live electrical stuff. Respect the electricity. This isn’t much more difficult than wiring up a new lightswitch at home, but… just be careful.

Parts you’ll need:

  • One extension cord. It doesn’t have to be very long, you’re going to cut it to get the two end plugs. (Amazon)
  • One solid state relay. It should allow an input voltage of 12V DC and an output voltage matching your router (mine is a 120V AC router). I bought a relay that allows 3 - 32V DC input and 24 - 380V AC output so it’ll “just work.” (Amazon)
  • Your original spindle power cable. You’re going to cut it because you want the wires and the plastic connector that attaches to the control board. You could also make a new one, but I don’t anticipate plugging my old spindle in again.
  • Extra wire in case you want the connection between the control board and the relay to be longer.
  • A battery with some leads. The battery should be enough to trigger the relay. I chose a 9V battery which falls in that 3 - 32V DC range. You’ll use this for testing the relay wiring.
  • Something to plug in to test the relay wiring. I used a light bulb.
  • Solder and soldering iron.
  • Electrical tape.
  • Wire cutters.

Relay circuit parts

First thing we’re going to do is just make sure the relay is working. This is also helpful to understand how the control board will be turning the router on and off; and it gets your test set up.

Attach one wire to the positive input terminal of the relay and another wire to the negative terminal. Connect your battery to the wires - positive to positive, negative to negative. You should see the light go on to indicate the relay has been triggered. (If you’re using a mechanical relay, you should hear a click.) When the control board “starts the spindle” it’s going to send 12V in and trigger the relay just like the battery is doing now.

Triggering the relay

Disconnect the battery. We’re done with this part of the testing.

Cut the extension cord so you can get some wires connected to the plugs. I cut about 12 inches from each end of the cord. That left me with:

  • A male plug with about 12 inches of cord
  • A female plug with about 12 inches of cord
  • A long strand that came from the middle of the cord

You can leave more cord connected to the plugs if you want. Just make sure you leave enough that you can make a good splice and have some slack to plug in. We don’t need that strand from the middle of the cord. You can save it and do something else with it or you can throw it away.

The extension cord will have an outer insulation/wrap and three wires inside it. Each wire also has insulation around it. Likely they’ll be color coded - green is ground, black is “hot” or “active,” and white is neutral. The black and white wires are what effectively makes the circuit powering your router, so we’re going to insert the relay in the middle of one of those to act like a switch. I chose to put the relay in the middle of the white wire.

If you don’t know how to make a good wire splice I would recommend watching this quick YouTube video on how to do a linesman’s splice. You’re working with some real electricity here and a bad splice can cause all sorts of problems like burning your house down.

Splice the two green wires together so the ground is continuous. My router is a two-prong non-grounded plug so it doesn’t use ground, but having this finished is valuable for later, I think. Wrap that splice in electrical tape to make sure it’s insulated from the other wires.

Now splice the two black wires together so the “hot” path is continuous. Again, wrap that in electrical tape so it’s nice and insulated.

Finally, attach one white wire to each of the “output” terminals on the relay. Make sure there’s a good connection and that they’re screwed down nice and tight.

You should end up with something that looks like this:

Wires spliced

Test time! Now it’s time to make sure your wire splices are good, that things are wired up correctly, and so on. This is also where you’ll want to be extra careful because if you didn’t wire stuff up right, it could be bad news.

Plug in your test load (like I used a light bulb) to the female plug. Then plug in the male plug to an electrical outlet (ideally with a surge protector and/or GFCI circuit breaker for your protection). At this point, even plugged in, the test load (light) should be off. Finally… connect the battery to the input terminals of the relay just like we did in the earlier test. The relay should activate and the test load should turn on! If you remove the battery, it should turn back off.

Test your splices

Disconnect the battery, unplug the male plug from the wall, and disconnect the wires you were using to test with the battery. The last step is to get the power connector from the control board to the relay working.

If you’re going to make a brand new cable that runs from the control board to the relay, now’s the time. I didn’t do that and I’m not walking through that process.

If you are reusing the original spindle power cable like I did… Snip the metal clips off the ends of the red and black wires that used to power your old spindle. Strip a small amount of the ends of the wire and connect red to positive, black to negative on the relay. It’ll end up looking like this:

Connect the control power cable

That’s it. That’s the whole circuit. Plug this into the wall, plug your router into it, connect the control cable to the CNC control board, and then flip the router’s power switch on. If you use a gcode sender to send M3, that will turn the spindle on - you should see the light on the relay turn on and the router itself should start. If you send M5, that will turn the router back off.

I recommend putting this in a box or covering it. You don’t want the connections on the relay to get accidentally shorted. I made a quick 3D printed box for mine; you can do something similar or figure something else out. It all depends on the size of the relay and cord you bought, so it’s not one-size-fits-all. If you want to buy a box, search for “project boxes.”

All done, here’s what my setup looks like now:

The finished setup

The black box in the middle mounted to the wall contains the relay. It plugs into the power strip along the left. The red and black cables go to the control board. My Makita router plugs into the relay. (I have the cord routed up and hanging so it’s out of the way.)

I hope that helps folks get back some of their control with the upgraded router!

Note: You might be wondering how you can now automate speed control of the spindle, not just on/off. That’s not as straightforward, and there are tons of forums involving rewiring routers with variable electronic speed controls (VESC) and all sorts of other cool-but-non-trivial things. I didn’t solve this problem since setting the speed dial before the cut isn’t a huge deal, and I generally don’t change speeds a lot.

csharp, javascript

I saw a Twitter thread the other day that got me thinking:

After 8 years with #golang, I don’t understand how I enjoyed programming before. I no longer have no tolerance for excessive boilerplate, verbosity, slow builds, overabstracted APIs, lack of first-class concurrency, bulky performance tools, …

— Jaana Dogan (@rakyll) August 12, 2020

One of the responses hit home for me:

My problem with #golang is that I love it, I got stuff done fast, didn’t need to use it for a few weeks and forgot everything. I have this learning curve spin up very time I use it. I don’t have that spin up time with C#.

— Herb Stahl (@JustAHerb) August 12, 2020

And a little disclaimer…

Warning: I’m going off on a bit of a tear here, and I color a little outside the lines of the argument. I’m having a tough time trying to convey a lot of frustration I’ve had recently with Go and Spring Boot / Java, and reading about folks loving the removal of boilerplate as a feature is touching on a pain point.

My daily work is C# and TypeScript. However, I sometimes also work in Terraform and the related SDKs, which are all in Go. So… I do have to use Go. Sometimes, but not often. When I do, I, too, find that I need to relearn nearly from the ground up. I also work, very occasionally, with Java, mostly apps based on Spring. Same thing there - I see the stuff, I sorta figure out what I need to, but if I come back to it a month later, I have no idea what’s going on.

I’m trying to figure out why that is. Like, if I have to get into a Python program and figure something out or add to it, I don’t have that feeling of being instantly just “lost.” I totally feel that with Go.

I think this part hints at it: “I … have no tolerance for excessive boilerplate.” I’m curious what, exactly, that means. I can guess, having messed around with Go - the convention-over-configuration for project structure, for example.

This is a lot of what I see Spring Boot in Java trying to address. Removing boilerplate. Convention over configuration. Auto-wireup. Just make it happen - code less, get more done.

But… if you’re removing boilerplate, you have to supplement that with easy to locate, comprehensive documentation that explains how to get things done.

The command go test runs your unit tests. Cool. What are the parameters you can pass to that? Go search on go test parameters. I’ll wait. You know what the first several results are? The test package. Doesn’t tell you anything. OK, cool, here’s a hint - just run go and you can start tracing down the help stack. Eventually you’ll get to go help testflag. Hmmm. Seems like a lot of references to GOMAXPROCS in there. What does that mean? Back to searching…

Here’s another one - I wanted to turn up the log level on a Spring Boot app that was in a Docker container. This seems like it should be easy. There is no reasonable set of search terms that will get you to the point where you just see a clear explanation that setting -Dlogging.level.root=TRACE or setting LOGGING_LEVEL_ROOT in the environment is the thing you want. I’ll save you the trouble, it’s not there. It’s just layers upon layers of abstractions in an effort to remove boilerplate, but you have to know what’s being abstracted in order to understand how to work adequately with the abstraction.
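
For the record, that ends up being as simple as setting the environment variable on the container - something like this, with a made-up image name:

# LOGGING_LEVEL_ROOT maps to the logging.level.root property via Spring's relaxed binding.
docker run -e LOGGING_LEVEL_ROOT=TRACE my-registry/my-spring-boot-app:latest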

This seems to be a sort of conflicting goal in the tweet - the person doesn’t have a “tolerance for excessive boilerplate” but also doesn’t want “overabstracted APIs.” By definition, I think, removing the boilerplate necessarily implies there’s some abstraction bridging that gap.

Look at the whole C# 9 feature of “top-level programs” - that is, a program that doesn’t require a Main() method. Seems like a nice removal of boilerplate, right? Except, read that bit about args - “args is available as a ‘magic’ parameter.”

Magic.

I’m not sure writing a class and a method declaration as an explicit entry point for my program was really the stumbling block for getting things done. And what’s the debugging story? When you have to figure out where the “magic parameter” comes from… how do you do that?

I think that’s the crux of my whole issue here. I’m not a fan of just throwing away keystrokes (though you’d be hard pressed to realize that from this rant), but there’s gotta be a balance between “fewer keystrokes,” “ease of use,” and “maintainability.” If I need to spend a year learning everything that sits under Spring Boot so I can understand how to change the log levels in an app, that’s not maintainable. It might save keystrokes, it might be easy to use ‘if you know,’ but… if you don’t?

I wonder if this contributes to the polarization of people working with languages. It becomes harder and harder to be that polyglot programmer because the stack you have to know for each individual language just grows. Eventually it’s not worth trying to span all the languages because you aren’t getting anything done. So you get the C# fans, or the Go fans, or the Python fans, or the JavaScript fans, and they all love their individual languages and ecosystem, but only because they spend all day every day in there. They know the right search terms to plug in, they know the stack and why the boilerplate was removed. When they switch to something else (as it is with me), it’s a “somebody moved my cheese” situation and the tolerance for pain is far lower than the desire to just get back to being productive.