Rethinking Config with Aero & Integrant
Aero & Integrant are two fantastic Clojure libraries used to configure applications. However, I think that there are issues with the way they often get used; today I’m going to look at how they’re actually used best together by keeping them apart.
How Aero & Integrant are often used
The vast majority of the projects I’ve worked on that make use of these libraries do so in a way advocated in this post by Kasper Gałkowski, who shows the convenience of combining Aero & Integrant within the same config.edn file.
In his post, he uses the example of the following hypothetical config.edn file:
{:http/service {:port #env PORT
                :db #ig/ref :db/pg
                :auth #ig/ref :auth/okta}
 :auth/okta {:client-id #env OKTA_CLIENT_ID}
 :db/pg {:jdbc-url #env DATABASE_URL
         :database-name "company"
         :username "admin"
         :password "secret"
         :minimum-idle 3
         :maximum-pool-size 15}}
As a quick overview for those less familiar with the two libraries: this is an Integrant map that describes the structure of our application. We have an HTTP server, an Okta client and a Postgres DB connection, as indicated by the three top-level keys in the map. Aero appears here in the form of the #env reader tag, which reads environment variables.
Gałkowski goes on to show how we could swap out the Okta auth component with a dummy auth component, like so:
{:http/service {:port #env PORT
                :db #ig/ref :db/pg
                :auth #ig/ref :auth/dummy}
 :auth/dummy {}
 :db/pg {:jdbc-url #env DATABASE_URL
         :database-name "company"
         :username "admin"
         :password "secret"
         :minimum-idle 3
         :maximum-pool-size 15}}
Here we’ve replaced all instances of the :auth/okta keyword with :auth/dummy, and moreover the :auth/dummy entry’s value can simply be replaced with an empty map.
What’s the problem?
This is all very well and good so far - so what’s the issue with using these libraries together like this? Essentially, as convenient as it is to inject environment variables into our application structure like this, we’re actually lacking a bit of ease of use here.
To see why this is, let’s consider the case of someone who wants to run this app locally in a docker container for development purposes, but wants to use dummy auth rather than a real Okta client. Let’s say the main application that they work on talks to this one, so all they care about is running this application in a simplified way. Perhaps they don’t even know Clojure, so ease of configuration is important for them.
In this scenario, swapping out the components suddenly isn’t quite so convenient:
- Using a docker run command, there’s no single environment variable we can use to indicate that we want to use a dummy auth component. Our developer therefore has to mount a whole new config.edn file, copy-pasted & edited such that the existing application structure is all present, but with the dummy auth component set up correctly.
- In setting up the dummy auth component, they need to know:
  - the name of the new key to use in its place
  - the fact that it only needs an empty map
  - the fact that they have to update the reference to the auth component within the HTTP service component
Not really very user-friendly at all. It seems that our “configuration” is actually surprisingly hard to reconfigure! We need to find a new approach to make our configuration work better.
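To make those steps concrete, here is a minimal sketch of the same edits written as a plain Clojure function over the config map. The `ref*` helper is a hypothetical stand-in for an Integrant reference (so that the snippet runs without any dependencies), and `swap-in-dummy-auth` is an illustrative name rather than anything from either library.

```clojure
;; Hypothetical stand-in for an Integrant reference, so this runs on plain Clojure.
(defn ref* [k] {:ref k})

(def config
  {:http/service {:port 8080
                  :db   (ref* :db/pg)
                  :auth (ref* :auth/okta)}
   :auth/okta    {:client-id "dev-client-id"}
   :db/pg        {:jdbc-url "jdbc:postgresql://127.0.0.1:5432/db"}})

(defn swap-in-dummy-auth
  "The coordinated edits a developer currently has to make by hand."
  [config]
  (-> config
      (dissoc :auth/okta)                                   ; remove the old component
      (assoc :auth/dummy {})                                ; new key name, and it takes an empty map
      (assoc-in [:http/service :auth] (ref* :auth/dummy)))) ; repoint the HTTP service's reference
```

Each step needs knowledge of the application's internals, which is exactly what someone who just wants to run the app should not need.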
Settings VS structure
As I write this, the plumbing & heating system in my house has just been completely renewed and swapped out; old lead pipes have been replaced with plastic ones, pipes have been routed completely differently from how they were before, and a new hot water tank has been put in. This certainly wasn’t within my expertise, so I paid someone who actually knows what they’re doing to do it for me. But even then there have been teething issues, with occasional drops in water pressure and hot water that suddenly decides not to work; reconfiguring plumbing can be difficult, even for someone with expertise.
Thankfully though, as an “end user” of my heating system I don’t need to know or understand the inner workings of my house’s plumbing: I turn on the ‘hot water’ setting on my boiler, and soon I have hot water available to come out of my hot water taps; I can switch the dial down on my thermostat and the heating for my house slowly comes down.
You could say that when I use the controls on my boiler I’m reconfiguring my house’s heating system. From another point of view, you could say that swapping out pipes and hot water tanks is also reconfiguring my house’s heating system. But although both are “configuration” of a sort, they are completely different beasts.
In a similar way, if we use Aero & Integrant in this manner then we conflate application settings & application structure, chucking them together in a big bag of “config”.
Aero covers an application’s settings - simple inputs to the program which (ideally) are hard to get wrong, and obvious if they’ve been set incorrectly.
Integrant covers an application’s structure - components that can be plugged together in different ways but usually require a bit more familiarity & expertise to do so well, and even then sometimes a poor setup of app structure can result in unexpected behavior.
Let’s see what happens when we keep settings separate from our app structure:
Application blueprints
In James Reeves’ talk introducing Integrant, he describes Integrant’s data structures as defining the configuration of our application. Sadly, this encourages the conflation I’ve described above. Instead, I think a far better description of Integrant is that it allows you to define a blueprint for your application.
i.e. we have
[app blueprint] --creates--> [running app]
What makes Integrant so powerful, as with so many other great Clojure libraries such as Hiccup, is that you can use all the power of Clojure’s core functions to manipulate data in order to define what you want, rather than being limited by some arbitrary API. Taking advantage of this, we can use our app settings as inputs which alter how we create such a blueprint, giving us the following:
[app settings] --create--> [app blueprint] --creates--> [running app]
This allows us to solve the component-swapping problem we described above. Let’s go back to our original config.edn, but this time making it a little more realistic by adding some defaults and type coercion:
{:http/service {:port #long #or [#env PORT 8080]
                :db #ig/ref :db/pg
                :auth #ig/ref :auth/okta}
 :auth/okta {:client-id #or [#env OKTA_CLIENT_ID "dev-client-id"]}
 :db/pg {:jdbc-url #or [#env DATABASE_URL "jdbc:postgresql://127.0.0.1:5432/db"]
         :database-name "company"
         :username "admin"
         :password "secret"
         :minimum-idle 3
         :maximum-pool-size 15}}
Here we can see three true external inputs - the HTTP port, the DB URL and the Okta client ID. Let’s change our config.edn to contain only these app settings:
{:http-port #long #or [#env PORT 8080]
 :db-url #or [#env DATABASE_URL "jdbc:postgresql://127.0.0.1:5432/db"]
 :okta-client-id #or [#env OKTA_CLIENT_ID "dev-client-id"]}
If we want to make it possible to use a different auth client then we have an implicit application setting; let’s make it explicit by adding it to our config.edn:
{:http-port #long #or [#env PORT 8080]
 :db-url #or [#env DATABASE_URL "jdbc:postgresql://127.0.0.1:5432/db"]
 :auth-client #keyword #or [#env AUTH_CLIENT "okta"]
 :okta-client-id #or [#env OKTA_CLIENT_ID "dev-client-id"]}
Then we can write a function that will create an app blueprint for a given settings map:
(require '[integrant.core :as ig])

(defn ->app-blueprint
  "Builds the Integrant map (the app blueprint) from a plain settings map."
  [settings]
  (let [;; a one-entry map holding whichever auth component was selected
        auth-client-blueprint
        (if (= :dummy (:auth-client settings))
          {:auth/dummy {}}
          {:auth/okta {:client-id (:okta-client-id settings)}})]
    (merge {:http/service {:port (:http-port settings)
                           :db (ig/ref :db/pg)
                           ;; point at whichever auth key was chosen above
                           :auth (ig/ref (key (first auth-client-blueprint)))}
            :db/pg {:jdbc-url (:db-url settings)
                    :database-name "company"
                    :username "admin"
                    :password "secret"
                    :minimum-idle 3
                    :maximum-pool-size 15}}
           auth-client-blueprint)))
This gives us the flexibility we want; our hypothetical developer can now simply set the environment variable AUTH_CLIENT=dummy on their Docker container, and they don’t need to worry about the particulars of our Integrant map or our application structure. This comes at the cost of a little more code, but this is a small price to pay for making our configuration easier to actually configure.
We’ve done this by taking better advantage of both libraries’ strengths; Aero manages our settings in a simple, explicit way with little logic, as intended; with Integrant we are taking advantage of it succinctly describing our app structure as data, and so we can alter it simply as needed using plain functions.
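To show the whole settings-to-running-app pipeline end to end, here is a minimal sketch, assuming Integrant is on the classpath. The stub `init-key` methods and the `start-app` name are illustrative assumptions, and `->app-blueprint` is a trimmed copy of the function above. In a real application the settings map would come from Aero reading config.edn rather than from a literal.

```clojure
;; Assumes Integrant is on the classpath.
(require '[integrant.core :as ig])

;; Stub components for illustration only; real ones would open connections, etc.
(defmethod ig/init-key :auth/dummy [_ _]
  {:kind :dummy})
(defmethod ig/init-key :auth/okta [_ {:keys [client-id]}]
  {:kind :okta :client-id client-id})
(defmethod ig/init-key :db/pg [_ {:keys [jdbc-url]}]
  {:kind :db :jdbc-url jdbc-url})
(defmethod ig/init-key :http/service [_ {:keys [port db auth]}]
  {:port port :db db :auth auth})

;; Trimmed copy of the blueprint function: settings in, Integrant map out.
(defn ->app-blueprint [settings]
  (let [auth (if (= :dummy (:auth-client settings))
               {:auth/dummy {}}
               {:auth/okta {:client-id (:okta-client-id settings)}})]
    (merge {:http/service {:port (:http-port settings)
                           :db   (ig/ref :db/pg)
                           :auth (ig/ref (key (first auth)))}
            :db/pg        {:jdbc-url (:db-url settings)}}
           auth)))

;; [app settings] -> [app blueprint] -> [running app]
(defn start-app [settings]
  (ig/init (->app-blueprint settings)))

(def system
  (start-app {:http-port   8080
              :db-url      "jdbc:postgresql://127.0.0.1:5432/db"
              :auth-client :dummy}))

;; The HTTP service was handed the initialized dummy auth component.
(get-in system [:http/service :auth])
;; => {:kind :dummy}
```

Note that nothing outside `->app-blueprint` needs to know which auth key ended up in the map; the choice is driven entirely by the `:auth-client` setting.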
User-friendly settings
Decoupling our app settings from our app structure has further advantages: it makes it easier to treat settings in the same way we ought to treat any other external input into our system - namely, documenting & validating them to make sure that bad inputs are either rejected or cause the application to fail fast.
With the distraction of our app structure out of the way, our config.edn is now a more inviting place to document our app settings. For example:
{:http-port #long #or [#env PORT 8080]
 :db-url #or [#env DATABASE_URL "jdbc:postgresql://127.0.0.1:5432/db"]
 ;; Set AUTH_CLIENT to "okta" to use an Okta client,
 ;; or "dummy" to use a dummy client
 :auth-client #keyword #or [#env AUTH_CLIENT "okta"]
 ;; Required when using Okta; optional otherwise
 :okta-client-id #or [#env OKTA_CLIENT_ID "dev-client-id"]}
With a simple data structure that consists of settings and only settings, we can safely and easily validate it, for example using spec:
(require '[clojure.spec.alpha :as s])

(s/def ::http-port int?)
(s/def ::db-url string?)
(s/def ::auth-client #{:okta :dummy})
(s/def ::okta-client-id string?)

;; Which extra keys are required depends on the chosen auth client
(defmulti auth-client :auth-client)
(defmethod auth-client :okta [_]
  (s/keys :req-un [::okta-client-id]))
;; The dummy client needs no further settings
(defmethod auth-client :dummy [_]
  (s/spec identity))

(s/def ::settings
  (s/merge
   (s/keys :req-un [::http-port
                    ::db-url
                    ::auth-client])
   (s/multi-spec auth-client :auth-client)))

(s/valid? ::settings
          {:http-port 8080
           :db-url "jdbc:postgresql://127.0.0.1:5432/db"
           :auth-client :okta
           :okta-client-id "dev-client-id"})
;; => true

(s/valid? ::settings
          {:http-port 8080
           :db-url "jdbc:postgresql://127.0.0.1:5432/db"
           :auth-client :okta})
;; => false

(s/valid? ::settings
          {:http-port 8080
           :db-url "jdbc:postgresql://127.0.0.1:5432/db"
           :auth-client :dummy})
;; => true
Of course, strictly speaking, there’s nothing to stop us from speccing & validating our settings & structure together, but in my experience simply getting folks to be bothered to do this and maintain it is enough of a battle! If our settings data structure is coupled to our app structure then we make more work for ourselves, and the risk grows that its maintenance falls by the wayside.
Validation of application settings is important, especially given that there can be so many layers of config management & injection such as cloud secret storage and Kubernetes config maps. Just a few months ago I was debugging a test environment issue - it turned out that a mistyped Helm entry resulted in a non-functional Kafka producer, and so a service was blithely chucking away all the messages it was meant to be sending. This wasn’t obvious in a fleet of various microservices, whereas proper settings validation on app start-up would have made the problem obvious on deployment.
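As a rough sketch of what failing fast on start-up could look like, the function below checks a settings map against a spec and throws with spec’s own explanation if it doesn’t conform. The `validate-settings!` name is an assumption for illustration, and the spec here is a trimmed-down version of the one above so that the snippet stands alone.

```clojure
(require '[clojure.spec.alpha :as s])

;; Trimmed-down settings spec so the snippet is self-contained.
(s/def ::http-port int?)
(s/def ::db-url string?)
(s/def ::auth-client #{:okta :dummy})
(s/def ::settings
  (s/keys :req-un [::http-port ::db-url ::auth-client]))

(defn validate-settings!
  "Returns settings unchanged when valid; otherwise throws an ex-info
  carrying spec's explanation of what is wrong."
  [settings]
  (if (s/valid? ::settings settings)
    settings
    (throw (ex-info (str "Invalid app settings:\n"
                         (s/explain-str ::settings settings))
                    {:problems (s/explain-data ::settings settings)}))))

;; A typo such as AUTH_CLIENT=dumy now stops the app at start-up
;; instead of surfacing later as odd runtime behaviour.
(try
  (validate-settings! {:http-port   8080
                       :db-url      "jdbc:postgresql://127.0.0.1:5432/db"
                       :auth-client :dumy})
  (catch clojure.lang.ExceptionInfo e
    (println (ex-message e))))
```

Calling something like this on the settings map before building the blueprint means a bad deployment fails loudly and immediately.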
Common alternate approaches
Before we close, let’s cover a couple of common alternative approaches to some of these config flexibility problems, and why they still fall short if we keep settings & structure together:
Integrant keys argument
In Integrant you can optionally pass a collection of keys indicating which components should be started up, letting you start up only a subset of your system. This gives you some of the flexibility I’ve described for swapping out components, but it depends on you setting up derived keywords to solve the problem of updating reference keys.
More importantly though, you still need some kind of separate settings & structure processing stages in order to choose which keys you’re going to start, otherwise this technique can only really help in user namespace files for REPL interaction.
Aero profiles
We can get a bit of leeway by making use of Aero profiles. However, their key limitation is that one and only one profile can be active at a time. This makes them suitable for switching between environments, but their lack of granularity means they’re too rigid for things like providing control over swapping out components.
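For illustration, a hypothetical config.edn fragment using Aero’s `#profile` tag might look like the following; the keys and values are made up for this sketch.

```clojure
;; config.edn (hypothetical values)
{:db-url #profile {:dev     "jdbc:postgresql://127.0.0.1:5432/db"
                   :prod    #env DATABASE_URL
                   :default "jdbc:postgresql://127.0.0.1:5432/db"}}

;; Read with exactly one profile selected, e.g.
;; (aero.core/read-config "config.edn" {:profile :dev})
```

Because only one profile is passed when reading, expressing something like “production database, but dummy auth” would mean adding yet another profile for that particular combination.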
Summary
As Clojurians we shouldn’t be surprised when a solution to a problem involves splitting things apart. By splitting apart our usage of Aero & Integrant we gain not only configuration flexibility but a more subtle mindset shift too; rather than thinking of environment variables as values that we inject haphazardly into our application structure, we recognize them as the inputs to our system that they are and so treat them with care accordingly.
While I wouldn’t be so crass as to assert something like “Aero combined with Integrant considered harmful”, I do think that you lose out on power & safety if you bundle them together in the same config.edn file. Try splitting them apart and see what you think.