Building Pellet – Structured Logging

This post is about building a logging system for Pellet - an opinionated Kotlin web framework I’m working on, with the intention to build best-practices in from the start. I’ll discuss features I think a good logging system should have, introduce the concept of “structured logging”, and talk about implementing everything (with plenty of code examples along the way) 🚀.

Requirements
Implementation
Conclusion
Footnotes

Requirements #

An important, but often overlooked, part of any web framework is how it logs things. From sense checking conditions at startup, to logging requests, and exceptions, the point of logging is to provide context about the state of the application at runtime - to aid in development, operation, and debugging.

With this in mind, I wrote down the following requirements for a logging system:

Pellet has a general goal of reducing external dependencies where sensible to do so - try not to add many, if any, dependencies
The overhead of logging should not be too much - my first guess was almost certainly no more than a millisecond per request, and tens of milliseconds at startup
Users of the logging system should be able to do the basics, like logging a message at different levels (think debug, info, error, etc)
Users should be able to add their own context to the log output, without trying to stuff it all in a single message string
The system should play nicely with SLF4J¹, and the associated MDC² (mapped diagnostic context) libraries - these frameworks are very popular in the JVM ecosystem, and ignoring them is likely to cause major headaches
At the same time, try not to be too limited by older frameworks - try to use the best of Kotlin
Be opinionated - choose sensible defaults, don’t worry too much about dynamic modification and configuration

Looking at all of these, it was obvious that doing “structured logging” out of the box would be the best thing to do - especially the part about being able to add your own context.

Structured logging means that the log output includes additional, usually keyed, information to add extra context to the message. For example, you might include the name of the thread that logged something, or the HTTP status code that a handler returned. Although this is theoretically possible with simple plaintext messages, doing so with a format like JSON makes it much easier to ingest log lines that might not have exactly the same structure in different contexts (for example, the “status code” structured log element doesn’t make any sense in an “application startup” context).

Implementation #

With requirements sorted, I started thinking about the implementation.

A key component of a performant logging system is the avoidance of unnecessary work. During development, you’re likely to want to run the service with “debug logging” turned on, to see more of the application’s state and help you debug things. But after deploying to a production environment, these messages add unnecessary noise to the log output. Worse, if you’re not careful, constructing the debug messages themselves can add non-negligible overhead to your system. SLF4J and associated logging frameworks prevent this unnecessary overhead by letting you pass arguments along with your message, and uses “message formatting” to interpolate these at runtime, if that particular log level is enabled. For example:

val logger = LoggerFactory::getLogger(MyHandler::class.java)
val state = /* an expensive object to serialise */
logger.debug("application state: {}", state)

These frameworks are designed for Java, which does not currently have string interpolation built in. As such, the objects cannot easily be type-safe - you can pass any Object in to the logging functions, and you don’t know that they’ll serialise properly at runtime. Kotlin, however, has great string interpolation built in³, which uses StringBuilder under the hood 🤩! Combined with lambda expressions, we can avoid having to pass Objects around, keep things type safe, and lazily evaluate our output only when that particular log level is enabled - avoiding the overhead problem.

The structured logging requirement suggested that I’d need a JSON serialisation library - to turn logging models in to strings for output to the console. I knew that JetBrains were working on one called kotlinx.serialization⁴, which I’d heard good things about, so I investigated the state of the project. A couple of key things persuaded me to go with this framework to serialise log lines in to JSON - it was written and maintained by JetBrains (the same people who make Kotlin), and it allowed you to avoid reflection entirely. Instead, you can annotate classes with @Serializable, which an annotation processor picks up, and generates serialisation logic at compile time. Although JVM reflection can be fast at runtime, it takes time to “warm up”, and can be slow the first time you use it at a particular call site, causing unnecessary latency and load on the service until results are optimised and cached.

Next, I needed to think about adding context-relevant entries to the log output. Something new that I wanted to do, that isn’t available in older logging frameworks, is for everything to be type safe - if you can make a log element out of it, you’re guaranteed that it’ll serialise properly at runtime as you expect it to. To do so, we have to “box” log elements due to a limitation of Kotlin - in Swift, for example, you might add a protocol conformance of PelletLogElement to primitive classes like String, but you can’t do that with Kotlin. Instead, we can use helper methods to box based on type, and we do that for String, Number, Boolean, and PelletLoggable if you want to encapsulate your own serialisation logic somewhere else. Kotlin’s sealed class support means we can make a parent type, with a fixed set of supported boxed subtypes:

public sealed class PelletLogElement {

    object NullValue : PelletLogElement()
    data class StringValue(val value: String) : PelletLogElement()
    data class NumberValue(val value: Number) : PelletLogElement()
    data class BooleanValue(val value: Boolean) : PelletLogElement()
}

// ...

public class PelletLogElements(
    startingElements: Map<String, PelletLogElement> = mapOf()
) {
    private val elements = startingElements.toMutableMap()

    fun add(key: String, string: String?): PelletLogElements {
        elements += key to logElement(string)
        return this
    }

    // ...
}

// ...

public fun logElement(value: String?): PelletLogElement {
    if (value == null) {
        return PelletLogElement.NullValue
    }
    return PelletLogElement.StringValue(value)
}

Combining this with a Kotlin builder⁵ function, we can lazily create a map of structured log elements:

// ...

public fun logElements(
    builder: PelletLogElements.() -> Unit
): () -> PelletLogElements {
    return {
        PelletLogElements(mapOf()).apply(builder)
    }
}

// ...

val elements = logElements {
    add("hello", "world 🌎")
}
logger.info(elements) { "a message with lazily evaluated log elements" }

The features above suggested to me that it was worth making a “clean room” log interface, in pure Kotlin, with an SLF4J bridge to support the basics. Pure Java logging interfaces end up being pretty large⁶, because they have “helper methods” for each of the log levels. Kotlin offers us “extension functions”⁷ that mean the primary interface can be pretty lean, but users can still have pragmatic helper functions too:

public interface PelletLogging {

    fun log(
        level: PelletLogLevel,
        elementsBuilder: (() -> PelletLogElements)? = null,
        messageBuilder: () -> String
    )
    fun log(
        level: PelletLogLevel,
        throwable: Throwable? = null,
        elementsBuilder: (() -> PelletLogElements)? = null,
        messageBuilder: () -> String
    )
}

// ...

public fun PelletLogging.info(
    elementsBuilder: (() -> PelletLogElements)? = null,
    messageBuilder: () -> String
) {
    log(PelletLogLevel.INFO, elementsBuilder, messageBuilder)
}

public fun PelletLogging.info(
    throwable: Throwable? = null,
    elementsBuilder: (() -> PelletLogElements)? = null,
    messageBuilder: () -> String
) {
    log(PelletLogLevel.INFO, throwable, elementsBuilder, messageBuilder)
}

To create instances of the logger, Kotlin offers us a particularly nice feature called “reified generics”⁸. We can use this to make a function that automatically inlines itself, giving access to type-safe generic class information that isn’t ordinarily available in Java. Here are three examples of making a similarly named logger - the last one being my favourite and most concise way of doing it:

public fun pelletLogger(
    name: String
): PelletLogging {
    return PelletStructuredLogger(name) { PelletLogging.level }
}

public fun <T> pelletLogger(
    clazz: Class<T>
): PelletLogging {
    return pelletLogger(clazz.name)
}

public inline fun <reified T> pelletLogger(): PelletLogging {
    return pelletLogger(T::class.java)
}

// ...

val explicitLogger = pelletLogger(Main::class.java)
val namedLogger = pelletLogger("main")
val logger = pelletLogger<Main>()

The final requirement was making sure services built on Pellet still worked with SLF4J. I was comfortable with it being more of a fallback - features like the structured logging elements didn’t have to work with it. SLF4J searches for logging implementations at runtime, so I added a very basic SLF4J MarkerBase, similar to their slf4j-simple logger, that just does message formatting and then forwards the result on to a backing PelletStructuredLogger:

public class PelletSLF4JBridge(
    name: String,
    private val level: PelletLogLevel
) : MarkerIgnoringBase() {

    private val backingLogger = pelletLogger(name)

    override fun isTraceEnabled(): Boolean {
        return level.value >= PelletLogLevel.TRACE.value
    }

    override fun trace(format: String?, arg: Any?) {
        if (!isTraceEnabled) {
            return
        }
        val formattingTuple = MessageFormatter.format(format, arg)
        logFormattingTuple(PelletLogLevel.TRACE, formattingTuple)
    }

    // ...
}

Finally, putting everything together, here’s a code example from Pellet’s HTTP request handler. It adds structured log elements, like request method, and response duration; plus, it logs an “apache common log format” in the message text:

private val logger = pelletLogger<HTTPRequestHandler>()

// ...

private suspend fun respond(
    request: HTTPRequestMessage,
    response: HTTPRouteResponse,
    responder: PelletHTTPResponder,
    timer: PelletTimer
) {
    val message = mapRouteResponseToMessage(response)
    val requestDuration = timer.markAndReset()
    val elements = logElements {
        add(requestMethodKey, request.requestLine.method.toString())
        add(requestUriKey, request.requestLine.resourceUri.toString())
        add(responseCodeKey, message.statusLine.statusCode)
        add(responseDurationKey, requestDuration.toMillis())
    }
    logResponse(request, response, elements)
    responder.respond(message)
}

private fun logResponse(
    request: HTTPRequestMessage,
    response: HTTPRouteResponse,
    elements: () -> PelletLogElements
) {
    val now = Instant.now().atZone(UTC)
    val dateTime = commonDateFormat.format(now)
    val (method, uri, version) = request.requestLine
    val responseSize = response.entity.sizeBytes
    logger.info(elements) { "${client.remoteHostString} - - [$dateTime] \"$method $uri $version\" ${response.statusCode} $responseSize" }
}

Which outputs as follows:

{"level":"info","timestamp":"2022-03-07T00:32:17.765062Z","message":"127.0.0.1 - - [07/Mar/2022:00:32:17 0000] \"GET /v1/hello HTTP/1.1\" 200 22","name":"dev.pellet.server.codec.http.HTTPRequestHandler","thread":"DefaultDispatcher-worker-1","request.method":"GET","request.uri":"/v1/hello","response.code":200,"response.duration_ms":0}

Conclusion #

I’m really happy with how the first prototype of Pellet logging has turned out. Users get fast, concise structured logging out of the box, without needing to faff around with configuration or log layouts. You can choose what log level you want to output at, and if something doesn’t end up getting logged, it doesn’t add needless overhead.

The API is clean, and hopefully familiar enough to not alienate users who are used to SLF4J - but offers an even better, incrementally improved experience to Kotlin-first users, using some of the best features the language has to offer ⭐️.

Footnotes #

SLF4J - https://www.slf4j.org/ ↩
MDC - https://www.slf4j.org/api/org/slf4j/MDC.html ↩
Kotlin string templates - https://kotlinlang.org/docs/basic-types.html#string-templates ↩
kotlinx.serialization - https://github.com/Kotlin/kotlinx.serialization ↩
Kotlin functional literal with receiver - https://kotlinlang.org/docs/lambdas.html#function-literals-with-receiver ↩
SLF4J Logger - https://www.slf4j.org/api/org/slf4j/Logger.html ↩
Kotlin extension functions - https://kotlinlang.org/docs/extensions.html ↩
Kotlin reified type parameters - https://kotlinlang.org/docs/inline-functions.html#reified-type-parameters ↩