On the internet, anyone can make a request to your web service. Especially in this time of abusive web crawling linked to AI/LLM companies, it's essential to program in a defensive style and stay in control, even when faced with a volume of requests that can't be handled.

If you don't prepare, requests might start backing up, the impact might spread to the services yours depends on, the process itself might hit resource limits like the number of open files, and eventually your service might stop responding to requests altogether, even after the storm of requests has passed.

What Guile Knots provides

Guile Knots is a library providing some tools and patterns for programming with Guile Fibers (see "Announcing Guile Knots" for an introduction). The Guile Knots resource pool implementation comes from facing the challenges of running real Guile web services.

The #:default-checkout-timeout and #:default-max-waiters options on the resource pool limit how long a caller waits for a resource and how many callers can be waiting at once, with an exception raised when a limit is exceeded. This is important as you don't want your web service continuing to process a request long after the client and/or reverse proxy has given up. Catching that exception allows you to return a 429 (Too Many Requests) or 503 (Service Unavailable) response.
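As a rough sketch of turning a pool failure into a response, the snippet below uses only Guile's (web response) module. The 'pool-overloaded throw key, the reason symbols and both helper names are hypothetical stand-ins for whatever exceptions the resource pool actually raises, and picking 429 for a full wait queue and 503 for a checkout timeout is an assumption rather than something Guile Knots prescribes.

```scheme
(use-modules (web response))

;; Hypothetical mapping from a failure reason to an HTTP status code.
(define (status-for-pool-failure reason)
  (case reason
    ((too-many-waiters) 429)   ; the wait queue was already full
    ((checkout-timeout) 503)   ; waited, but no resource became free
    (else 500)))

;; Run THUNK, which is expected to return a response and a body.  If it
;; throws the stand-in 'pool-overloaded key, return an error response
;; instead of letting the exception escape the request handler.
(define (call-with-overload-response thunk)
  (catch 'pool-overloaded
    thunk
    (lambda (key reason)
      (values (build-response
               #:code (status-for-pool-failure reason)
               #:headers '((content-type . (text/plain))))
              "The service is busy, please retry later.\n"))))
```

A request handler would wrap the part that checks out a resource in call-with-overload-response, so that an overloaded pool results in a quick, cheap reply.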

Guile Knots includes both a fixed size resource pool (make-fixed-size-resource-pool) and a variable size resource pool (make-resource-pool). For the variable size resource pool, you can also tune #:add-resources-parallelism to set the number of resources that can be created in parallel.

Example 1: PostgreSQL connections in the Guix Data Service

The Guix Data Service uses PostgreSQL for its database.

Since PostgreSQL has limits on the number of connections, if the data service opened a new connection to PostgreSQL whenever one was needed, it would be easy for that limit to be hit and errors to occur. When all the PostgreSQL connections are used up, this could then impact other database activity, spreading the damage further.

By using a resource pool for PostgreSQL connections, when there are too many requests to handle, an exception is raised and a 503 (Service Unavailable) response can be returned promptly.
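The idea of failing promptly can be pictured with a toy model. This is not how Guile Knots is implemented, just a minimal sketch of the decision a bounded pool makes for each new checkout; the procedure name and its arguments are made up for illustration, with max-waiters playing the role of #:default-max-waiters.

```scheme
;; Toy model only: decide what happens to a new checkout request, given
;; how many resources are in use and how many callers are already waiting.
(define (admission-decision in-use waiting pool-size max-waiters)
  (cond ((< in-use pool-size) 'checkout)  ; a resource is free or can be created
        ((< waiting max-waiters) 'wait)   ; queue up, bounded by the checkout timeout
        (else 'reject)))                  ; fail fast, e.g. with a 503

(admission-decision 3 0 10 10)    ; => checkout
(admission-decision 10 4 10 10)   ; => wait
(admission-decision 10 10 10 10)  ; => reject
```

The important property is the last branch: once the pool and the queue are both full, extra requests cost almost nothing to turn away.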

This is what the resource pool in the Guix Data Service looks like in the code:

(let ((connection-pool
       (make-resource-pool
        (lambda ()
          (open-postgresql-connection
           "web"
           postgresql-statement-timeout))
        (floor (/ postgresql-connections 2))
        #:name "web"
        #:idle-seconds 30
        #:destructor
        (lambda (conn)
          (close-postgresql-connection conn "web"))
        #:add-resources-parallelism
        (floor (/ postgresql-connections 2))
        #:default-max-waiters (floor (/ postgresql-connections 2))
        #:default-checkout-timeout (/ postgresql-statement-timeout
                                      1000))))
  ...)
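
To make the numbers concrete, here is the same arithmetic with assumed values; the real settings aren't shown here. The sketch assumes 20 allowed connections and a statement timeout of 60000, taken to be in milliseconds since it's divided by 1000 for the checkout timeout. The names pool-size, max-waiters and checkout-timeout are only labels for this example.

```scheme
;; Assumed values, not taken from the Guix Data Service configuration.
(define postgresql-connections 20)
(define postgresql-statement-timeout 60000)  ; assumed to be milliseconds

;; The number passed to make-resource-pool, and reused for
;; #:add-resources-parallelism and #:default-max-waiters.
(define pool-size (floor (/ postgresql-connections 2)))        ; => 10
(define max-waiters (floor (/ postgresql-connections 2)))      ; => 10

;; The value passed as #:default-checkout-timeout.
(define checkout-timeout (/ postgresql-statement-timeout 1000)) ; => 60

;; With an odd connection count, floor rounds the half down.
(floor (/ 25 2))                                               ; => 12
```

Under these assumptions, at most 10 connections are in use by the web pool and at most 10 further callers are waiting, so anything beyond roughly 20 concurrent database-bound requests fails straight away.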

Checking out a connection for a query looks like:

(with-resource-from-pool connection-pool conn
  (exec-query conn "SELECT ..."))

Resource pools have the additional benefit of allowing connections to be reused across multiple requests, avoiding the overhead of establishing a new connection each time.

Priority lanes: the reserved pool

You can go even further. The Guix Data Service actually uses multiple resource pools for PostgreSQL connections: a web pool for most queries, a web-reserved pool for more important requests, and a background pool for queries not related to requests.
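
Choosing between pools can be as simple as a lookup on the kind of work being done. In this minimal sketch only the three pool names come from the description above; the procedure name and the request-kind symbols are hypothetical, and how the Guix Data Service actually classifies requests isn't covered here.

```scheme
;; Hypothetical classification of work into the three pools.
(define (pool-name-for request-kind)
  (case request-kind
    ((important) 'web-reserved)   ; requests that must keep working under load
    ((background) 'background)    ; queries not tied to a request
    (else 'web)))                 ; everything else

(pool-name-for 'important)   ; => web-reserved
(pool-name-for 'page-view)   ; => web
```

Because each pool has its own size and waiter limit, a flood of ordinary requests can exhaust the web pool without taking connections away from the reserved one.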

A little history

The Guile Knots resource pool implementation started life in the Guix Data Service. Before Guile Squee supported suspendable ports, the Guix Data Service was using its own thread pool implementation to manage connections. The thread pool was added all the way back in 2020, and the resource pool in 2023.

Example 2: BFFE using the Guix Build Coordinator

The Build Farm Front-end (BFFE) doesn't rely on a database in the same way, but makes requests to a Guix Build Coordinator instance to fetch data for responding to requests.

Guile Knots includes a connection cache, which wraps the resource pool to help use it for connections to network services.

Just as with the PostgreSQL connections in the Guix Data Service, the connection cache means that the number of concurrent connections the BFFE makes to the Guix Build Coordinator is controlled.

Before introducing this, the number of connections and file descriptors in use by the process grew uncontrollably, eventually leading to the web server becoming unresponsive.

The cache is configured at start-up:

;; bffe/server.scm
(define event-source-connection-cache
  (make-connection-cache
   (string->uri event-source)
   32
   #:connect-timeout 5
   #:default-checkout-timeout 5
   #:default-max-waiters 32))

Each request to the build coordinator then uses a cached socket:

(call-with-cached-connection
 event-source-connection-cache
 (lambda (port)
   (http-get uri
             #:port port
             #:keep-alive? #t)))

Patterns worth following

In summary, identify the resources your web service can run out of, and manage these with a resource pool.

This isn't advice specific to Guile Knots, but the resource pool and connection cache that Guile Knots provides should do this well, and you might get some performance benefits too.
