High-quality API documentation enhances customer satisfaction, especially in serverless architectures, where Lambda functions serve the API. Documenting these APIs has always felt like writing a novel with a feathered pen. At least, it did until now, when Powertools for AWS Lambda released its OpenAPI documentation utility.
This post presents a method to generate OpenAPI documentation for Python Lambda function-based APIs, utilizing Powertools for AWS Lambda and Pydantic.
In the next post, we will discuss automating this process and adding it to the service CI/CD pipeline as a gate to production and publishing the documentation.
The Case for API Documentation
When I design features that require API changes, I document them.
I strongly believe in adopting the API-first approach. Allen Helton, an AWS serverless hero, has written an excellent post about the merits of the API-first approach. It allows your API customers, whether internal or external, to develop and plan their integration with your API without being blocked until you publish the new API. You send them the API documentation and let them go to work.
An API-first approach means that for any given development project, your APIs are treated as “first-class citizens.” - Swagger.io
API documentation also helps you understand, at any time, what your service provides from top to bottom. It's great for future integrations, developing new features, and onboarding new team members.
OpenAPI - The Standard King
OpenAPI has become the standard format for describing APIs.
"OpenAPI Specification (formerly Swagger Specification) is an API description format for REST APIs. An OpenAPI file allows you to describe your entire API" - Swagger.io.
It's a simple JSON or YAML formatted file that allows us to describe our REST API and its:
Available endpoints
Operations on each endpoint
Operation parameters - Input and output for each operation
Authentication methods
You can also organize the endpoints by tags or groups and even generate a sample request.
Swagger UI, from Swagger.io, is a tool that allows you to visualize this documentation file.
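To make the list above concrete, here is a minimal sketch of an OpenAPI document built as a Python dictionary and printed as JSON. The service title, the path, the field names, and the API key scheme are hypothetical placeholders chosen for illustration; they are not taken from a real service.

```python
import json

# A hand-written, minimal OpenAPI 3.0 document for a hypothetical orders API.
openapi_document = {
    'openapi': '3.0.3',
    'info': {'title': 'Orders Service', 'version': '1.0.0'},
    'paths': {
        '/api/orders': {  # an available endpoint
            'post': {  # an operation on that endpoint
                'tags': ['CRUD'],  # used to group endpoints in the UI
                'summary': 'Create an order',
                'requestBody': {  # operation input
                    'required': True,
                    'content': {'application/json': {'schema': {'$ref': '#/components/schemas/CreateOrderInput'}}},
                },
                'responses': {  # operation outputs, keyed by HTTP status code
                    '200': {'description': 'The created order'},
                    '422': {'description': 'Invalid input'},
                },
                'security': [{'ApiKeyAuth': []}],
            }
        }
    },
    'components': {
        'schemas': {
            'CreateOrderInput': {
                'type': 'object',
                'required': ['customer_name', 'order_item_count'],
                'properties': {
                    'customer_name': {'type': 'string'},
                    'order_item_count': {'type': 'integer', 'minimum': 1},
                },
            }
        },
        # authentication methods
        'securitySchemes': {'ApiKeyAuth': {'type': 'apiKey', 'in': 'header', 'name': 'x-api-key'}},
    },
}

print(json.dumps(openapi_document, indent=2))
```

Writing such a file by hand quickly becomes tedious and error-prone, which is why the rest of this post focuses on generating it from code.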
You can view a live demo of the format for a dummy service here, and it looks something like this:
Now that we understand how we want to document our APIs, let's talk about adapting this process to our serverless APIs backed by Lambda functions.
Generating API Documentation
Let's cover several methods of generating API documentation.
In my opinion, the best method for generating Python-based API documentation is to create it from the service code. I use Pydantic to define input and response schemas, and it natively has the option to export the schemas to an OpenAPI format, which is a massive plus in my book. So integration with this tool is necessary for any solution we choose.
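As a rough illustration of that Pydantic export, the sketch below defines a small input schema and prints its JSON Schema, which is the building block OpenAPI uses for payload definitions. It assumes Pydantic v2 (where the export method is `model_json_schema()`), and the model name, fields, and constraints are hypothetical.

```python
import json
from typing import Annotated

from pydantic import BaseModel, Field


# Hypothetical input schema; field names and limits are illustrative only
class CreateOrderInput(BaseModel):
    customer_name: Annotated[str, Field(min_length=1, max_length=20, description='Customer name')]
    order_item_count: Annotated[int, Field(gt=0, description='Number of items in the order')]


# Pydantic v2 exports the model as a JSON Schema dictionary
schema = CreateOrderInput.model_json_schema()
print(json.dumps(schema, indent=2))
```

The descriptions and constraints declared on the fields end up in the exported schema, so the documentation is derived from the same definitions that validate the payload.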
The first option comes natively from API Gateway.
API Gateway has a neat feature. Once deployed, you can export the OpenAPI documentation from the console or via API as described here. However, there's no Pydantic support, and the finer schema details (input or output) are not easily configured, as you must define the JSON schema yourself and enable request validation. And lastly, it's not created from our handler code but from the infrastructure code, which is good but imperfect.
So, while being a nice feature, more is needed. Let's review another method.
The second option comes from frameworks such as FastAPI.
FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.8+ - https://fastapi.tiangolo.com/
Most of these Python frameworks and tools, such as FastAPI, Flask-RESTPlus/Flask-RESTx, Django REST Framework, and Connexion, are designed to create web applications and APIs that listen on a socket for incoming HTTP requests.
Let's focus on FastAPI.
FastAPI supports Pydantic schemas to describe payloads and generates OpenAPI documentation straight from the code. This approach simplifies the documentation process and keeps it in sync with the code.
However, you are essentially running a web server on Lambda that opens a socket to listen to incoming HTTP requests, which the Lambda service already does for you. It's a server-based framework with negative implications for cold start time, Lambda ZIP file size, and latency. And while you CAN do it, I don't think you SHOULD.
In my opinion, we need a native serverless framework to provide OpenAPI documentation generated from handler code. It should be fast and match the invocation model of Lambda, and as such, it should not open sockets by itself.
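For reference, this is roughly what the Lambda invocation model looks like without any framework: the service calls a plain function with the HTTP request already parsed into a dictionary, and the function returns a dictionary describing the response. The sketch assumes the API Gateway proxy integration event shape; the sample event and payload are made up.

```python
import json
from typing import Any


def lambda_handler(event: dict[str, Any], context: Any) -> dict[str, Any]:
    # API Gateway delivers the HTTP request as a dictionary; no socket is opened here
    body = json.loads(event.get('body') or '{}')
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json'},
        'body': json.dumps({'received': body}),
    }


# Local simulation of a single invocation with a made-up event
sample_event = {'httpMethod': 'POST', 'path': '/api/orders', 'body': '{"customer_name": "Jane"}'}
response = lambda_handler(sample_event, None)
print(response)
```

A serverless-native framework only needs to route this dictionary to the right function and validate it, which is a much smaller job than running a web server.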
Let's see how we can achieve that.
Serverless OpenAPI Documentation
Our goal is to generate OpenAPI documentation for a serverless API consisting of an API Gateway and a Lambda function. We will define the HTTP input payload schema and ALL the possible HTTP responses: their codes and complete JSON payload. We will use Pydantic to define all schemas.
The OpenAPI documentation will be served under a new API endpoint: '/swagger'.
In the next blog post, we will discuss exporting the documentation and automating this entire process.
Now that we understand the goal, let's write some code.
Powertools EventHandler Introduction
We use the Powertools for AWS Lambda library. Powertools for AWS Lambda is the serverless go-to library for observability, logging, idempotency, input validation, and much more.
We will use the event handler utility from Powertools.
The event handler utility provides lightweight routing to reduce boilerplate for API Gateway REST/HTTP API, ALB, and Lambda Function URLs. It works with micro functions (one or a few routes) and monolithic functions (all routes). Most interestingly, it has support for OpenAPI and data validation for requests/responses with Pydantic schemas.
The OpenAPI documentation is a relatively new feature. It provides a '/swagger' endpoint on your API Gateway that serves the OpenAPI documentation.
Let's implement event handler and data validation in a real service.
We will use my AWS Lambda cookbook template project and add support for OpenAPI documentation. The Cookbook is a template project that allows you to get started with serverless with three clicks, and it has all the best practices and utilities that a production-grade serverless service requires.
Let's start with the infrastructure configuration.
OpenAPI Endpoint Infrastructure
The official docs have SAM code samples, but I use AWS CDK.
Below is a function to add to your CDK REST API construct.
By design, you must select one Lambda handler to answer the GET '/swagger' HTTP calls and generate the OpenAPI documentation. We need to connect the Lambda function that uses the event handler utility.
You need to map three 'GET' endpoints (/swagger, /swagger.css, /swagger.js) to that function for the swagger generation.
In line 5, the function receives the REST API gateway object to add the new endpoints and the Lambda function class that will serve these endpoints.
In lines 7 through 14, we add all three endpoints and attach the Lambda function to them with an HTTP GET command.
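A minimal sketch of such a function could look like the following. It is not the original listing, so the line numbers in the walkthrough above do not map onto it. It assumes AWS CDK v2 (`aws-cdk-lib`), and, unlike the function described above, it receives a ready-made integration object instead of the Lambda function itself; the function name and the constant are hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for type hints; assumes AWS CDK v2 (aws-cdk-lib)
    from aws_cdk import aws_apigateway

# The three resources that must be mapped to the Lambda function with HTTP GET
SWAGGER_RESOURCES = ('swagger', 'swagger.css', 'swagger.js')


def add_swagger_endpoints(rest_api: 'aws_apigateway.RestApi', integration: 'aws_apigateway.Integration') -> None:
    """Map GET /swagger, /swagger.css and /swagger.js to the same integration."""
    for path_part in SWAGGER_RESOURCES:
        resource = rest_api.root.add_resource(path_part)
        resource.add_method('GET', integration)
```

Inside a CDK construct, it could be called as `add_swagger_endpoints(rest_api, aws_apigateway.LambdaIntegration(handler=my_function))`, where `my_function` is the Lambda function that runs the event handler.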
You can find the complete code here.
Event Handler Code
Now that we have the infrastructure all configured, let's add the event handler code and start documenting our API.
In line 3, we create the event handler API gateway resolver and enable validation to get the input events and output response validations (using Pydantic).
In line 4, we enable the OpenAPI generation via the '/swagger' endpoint and pass a title to the generated document. According to the documentation, you must enable the validation to get the complete OpenAPI definition.
You can find the complete code here.
Lambda Handler Code
Let's write the Lambda handler code and document it.
We will document the HTTP POST '/api/orders' endpoint, which creates a new customer order.
A lot is going on here, but it's pretty simple. We are going to add OpenAPI information as much as possible. We will describe the specific API, its description, input schemas for the JSON HTTP body payload, and all possible HTTP responses' schemas with Pydantic.
In line 8, we import the event handler we initialized in the previous file.
In line 13, we start the app definition. First, we mark this API as an HTTP POST.
In line 14, we set the API to respond to the path '/api/orders/'.
In line 15, we set the API description that will appear in the OpenAPI documentation.
In lines 18-31, we define all the API responses.
In lines 19-22, we define the HTTP 200 OK response. This is how we can control all the HTTP response definitions. Be advised that if you don't define these responses, the event handler will generate them automatically, and the documentation will still include the 422 and 200 responses but not the 501 one. The 422 is a built-in response from the input validation feature. The 200 response is built from the handler's return type that we define in line 35: the 'CreateOrderOutput' Pydantic schema.
In lines 23-26, we define the HTTP input validation response with an HTTP code 422. We used the Pydantic schema 'InvalidApiRequest' to describe it.
In lines 27-30, we do the same for HTTP 501.
In line 32, we tag this API as part of a group called 'CRUD'. When you have multiple APIs, tags make presenting the APIs in multiple sublists easier than in one long list.
In line 39, we define the entry function to the handler. The resolver will call the correct event handler sub-function according to the HTTP path and command. Read more about it here. In our case, all calls will route to the function we defined in lines 13-36.
In lines 34-35, we use type hints to define the input the handler expects. Since we enabled data validation, once we enter line 36, we have a parsed and validated Pydantic object in our hands and not the regular event as a dictionary. I used the special 'Annotated' and 'Body' typing classes to tell the event handler that it expects the 'CreateOrderInput' class in the body payload and that it is a JSON dictionary and not a string.
Notice that all the API requests and responses have a Pydantic schema that defines them. You can find all the Pydantic schemas definitions here and here.
You can find the complete handler code here.
I've written this handler and the logic according to my architectural layers concepts, which I also discussed at my AWS re:Invent 2023 session. Click here to learn more.
OpenAPI Endpoint in Action
Now, all that is left is to deploy our code and access our swagger endpoint.
It will look something like this:
Notice that we can click the schemas and see the Pydantic definition output as proper OpenAPI schema, with the descriptions, restrictions, and types.
You can also check a live version of the swagger here:
Limitations
I've been quite impressed with this new utility. However, it has some limitations.
They are solvable but require development from the Powertools team or the community.
For example, not all the OpenAPI spec is generated. However, there are open issues at the repository, and community help is requested, so it might be a significant first contribution if you want to try it!
Now, onto the more significant issue.
At the time of writing, there's no support for OpenAPI generation if you use multiple micro functions; only mono lambda is supported. I've created a GitHub issue with a suggested solution, and I'd appreciate a thumbs up on the issue to get the ball rolling.
Summary
In this post, we saw how Powertools for AWS Lambda can help you generate OpenAPI documentation from your handler code. It empowers developers to own the code and its documentation and, most importantly, to keep them both in constant sync.
Join me in the next post, where I'll discuss how you can take this approach even further and add important gates to your CI/CD pipeline to protect this precious sync between code and API documentation.