Tuesday, October 8, 2019

Writing JSON Logs from a Python Lambda

My previous post looked at a Lambda function that would accept log messages from one Kinesis stream, transform them into JSON, and write them to another Kinesis stream, from which they would be inserted into an Elasticsearch cluster. In that post I skipped over the actual transformation code, other than saying that it would pass existing JSON objects unchanged, and recognize the Lambda Python log format.

However, I'd prefer not to perform this translation: I want to write log messages as JSON, which will allow me to easily add additional information to the message and use searches based on specific details. With the Python logging facility, it's easy to make this happen.

Unless you're running on Lambda.

The linked example assumes that you can configure logging at the start of your main module. However, Lambda configures logging for you, before your code gets loaded. If you print logging.root.handlers from within a Lambda, you'll see that there's already a LambdaLoggerHandler installed. To use my formatter, you need to update this handler:

import logging

for handler in logging.root.handlers:
    handler.setFormatter(JSONFormatter())
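
The JSONFormatter used above isn't shown in this post, so here is a minimal sketch of what such a formatter might look like. The field names (timestamp, level, logger, message, exception) and the install_json_formatter() helper are my own assumptions for illustration, not the contents of the linked example.

```python
import json
import logging
from datetime import datetime, timezone


class JSONFormatter(logging.Formatter):
    """Illustrative formatter: renders each LogRecord as one JSON object."""

    def format(self, record):
        entry = {
            # record.created is seconds since the epoch; emit it as ISO-8601 UTC
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            # getMessage() applies any %-style arguments to the message
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def install_json_formatter():
    """Hypothetical helper: keep the existing handlers, replace their formatter."""
    for handler in logging.root.handlers:
        handler.setFormatter(JSONFormatter())
```

Because it only touches the formatter, whatever else the pre-installed handler does is left alone.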

While I could have replaced this handler entirely, with one that wrote to stderr, I was concerned that it performed some undocumented magic. One thing that I did discover, although it doesn't quite count as magic, is that it inserts the Lambda's request ID in the LogRecord before passing it to the formatter.

My formatter doesn't use that piece of information. Instead, it has to be initialized for every Lambda invocation, so that it can get the request ID out of the invocation context. This also allows me to pull out the Lambda name and version, or any of the other documented context attributes.
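
As a rough sketch of that per-invocation setup, the handler function can build a fresh formatter from the context object each time it's called. The class name ContextJSONFormatter, the configure_logging() helper, and the output field names are hypothetical; aws_request_id, function_name, and function_version are the documented context attributes.

```python
import json
import logging


class ContextJSONFormatter(logging.Formatter):
    """Illustrative formatter that adds invocation details to every record."""

    def __init__(self, context):
        super().__init__()
        # Captured once per invocation from the Lambda context object
        self.invocation = {
            "requestId": context.aws_request_id,
            "functionName": context.function_name,
            "functionVersion": context.function_version,
        }

    def format(self, record):
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
            "lambda": self.invocation,
        }
        return json.dumps(entry)


def configure_logging(context):
    """Hypothetical helper: re-initialize the formatter for this invocation."""
    formatter = ContextJSONFormatter(context)
    for handler in logging.root.handlers:
        handler.setFormatter(formatter)


def lambda_handler(event, context):
    # Must run first, so later messages carry this invocation's request ID
    configure_logging(context)
    logging.getLogger(__name__).info("processing event")
    return {"status": "ok"}
```

The cost is one small object per invocation; the benefit is that every message carries the function name and version along with the request ID.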

If you also want JSON-formatted logs from your Lambdas, I've included a documented example here.