Boosting Next.js builds by caching

Felix Christl

January 21, 2022

Static Site Generation and Incremental Static Regeneration enable fast websites

At HipSquare, we do a lot of work with Next.js, especially when developing large-scale websites. Because we care about fast loading times, we rely on Static Site Generation (SSG) combined with Incremental Static Regeneration (ISR). During the initial build, data is fetched from a Content Management System (CMS) and a large portion of the website is pre-generated. To pick up content changes made in the CMS after the build, we use ISR: when users load a page, we check whether anything about it has changed since the build and, if so, regenerate the page for the next user.

This mechanism combines the advantages of statically generated sites (snappy load times) with those of dynamically generated sites (content changes are reflected quickly).

Scaling content can become a challenge for build and regeneration times

When implementing SSG and ISR, you usually start out with a naive and straightforward implementation: when you need a list of page paths for SSG, you fetch them from the CMS. When you need data for rendering a page, you fetch that, too. And when you need to regenerate a page for ISR, you again rely on a quick API request.

As your content base grows, though, you will most likely run into challenges. Network requests over HTTP are relatively expensive, and site-wide data in particular tends to cause long loading times and many requests for the same data. Take a navigation tree that is created dynamically from all entries in the CMS: the CMS might already need one or two seconds to generate that tree, the HTTP overhead adds a couple of milliseconds per request, and you might need that information for every single page of your website to render navigation elements such as a menu or the footer correctly.

When dealing with large content structures, we have run into multi-hour builds in the past for exactly these reasons.

Caching to the rescue

The strategy for solving this challenge seems obvious: let's add caching for repetitive content that is used all over the site and is relatively expensive to generate every time. The simplest caching strategy is to keep content that has been loaded once in memory, i.e. store it in a variable, and reuse it whenever needed:

let cachedContent: MyContent | undefined;
async function fetchContent(): Promise<MyContent> {
    if (cachedContent) {
        return cachedContent;
    }
    
    cachedContent = await fetchFromCms();
    return cachedContent;
}
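
One subtlety of the snippet above: if two callers invoke fetchContent before the first request has finished, both still hit the CMS. A minimal sketch of one way around that is to cache the in-flight promise instead of the resolved value. MyContent and fetchFromCms are placeholders here (the real ones would be your own type and CMS client), and cmsCalls exists only to make the effect visible:

```typescript
// Placeholder content type; replace with your own.
type MyContent = { title: string };

// Counts CMS calls so the effect of caching can be observed.
let cmsCalls = 0;

// Hypothetical stand-in for a real CMS request.
async function fetchFromCms(): Promise<MyContent> {
  cmsCalls++;
  return { title: 'Home' };
}

// Cache the promise, not the value: concurrent callers share one request.
let cachedContentPromise: Promise<MyContent> | undefined;

function fetchContent(): Promise<MyContent> {
  if (!cachedContentPromise) {
    cachedContentPromise = fetchFromCms().catch((err) => {
      // Do not keep a failed request in the cache; allow a later retry.
      cachedContentPromise = undefined;
      throw err;
    });
  }
  return cachedContentPromise;
}
```

Calling fetchContent twice in parallel then results in a single call to fetchFromCms, and both callers receive the same object.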

Caching in Next.js is not that simple

In simple systems, this works nicely. When trying the strategy in Next.js, however, you will face some challenges:

When generating static sites with Next.js, that generation process happens in two distinct phases. First, getStaticPaths is called to identify which page paths shall be statically generated. This call usually already needs an overview of all content that exists. Secondly, for each path, getStaticProps is called, where you will usually need the content of the specific page to be generated but also overarching information on navigation to generate things like menus.

The two functions are called in completely separate workers, so you have no way of handing over in-memory data from getStaticPaths to getStaticProps.
To speed up builds, Next.js uses one worker per CPU core. If you have eight cores, you will have eight getStaticProps functions running in parallel, but in entirely separate execution contexts. That means you cannot share data in memory and at least need to fetch all relevant overarching data once per worker.
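
To make the consequence concrete, here is a small simulation. It does not use Next.js or real worker processes: createWorker is a hypothetical helper that gives each simulated worker its own private cache variable, which is roughly what separate execution contexts amount to. NavigationTree and fetchNavigationFromCms are placeholders as well.

```typescript
type NavigationTree = { paths: string[] };

// Counts how often the (simulated) CMS is asked for the navigation tree.
let cmsRequests = 0;

// Hypothetical stand-in for the expensive CMS request.
async function fetchNavigationFromCms(): Promise<NavigationTree> {
  cmsRequests++;
  return { paths: ['/', '/about', '/blog'] };
}

// Each simulated worker closes over its own cache variable, so nothing
// is shared between workers.
function createWorker(): () => Promise<NavigationTree> {
  let cached: NavigationTree | undefined;
  return async function getNavigation(): Promise<NavigationTree> {
    if (!cached) {
      cached = await fetchNavigationFromCms();
    }
    return cached;
  };
}

// Renders `pagesPerWorker` pages on each of `workerCount` workers and
// returns how many CMS requests were needed in total.
async function simulateBuild(workerCount: number, pagesPerWorker: number): Promise<number> {
  cmsRequests = 0;
  const workers = Array.from({ length: workerCount }, () => createWorker());
  for (const getNavigation of workers) {
    for (let page = 0; page < pagesPerWorker; page++) {
      await getNavigation();
    }
  }
  return cmsRequests;
}
```

With eight workers and 100 pages each, simulateBuild(8, 100) resolves to 8: the in-memory cache brings 800 requests down to one per worker, but it cannot get below that.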

External caching becomes a necessity

Because of the way Next.js uses workers, there is no easy way to implement a cache. Even a simple file-based cache is hard to get right when you have eight workers to synchronize.

We have resorted to a solution based on Redis:

A Redis database is spun up for every build process. We do this using GitLab CI's service concept. We also spin up a Redis instance for each Next.js instance started with next start.
Within Redis, we have one key/value pair per content page, plus one key/value pair for the overarching navigation tree.
Additionally, we allow workers to block a key. When a worker blocks a key, it loads the value for that key, and all other workers that try to access the same key are forced to wait until the key is unblocked. This way, we ensure that each key is retrieved only once. Waiting on a blocked key is much cheaper than risking fetching the same entry multiple times via expensive I/O operations. To inform workers waiting on a blocked key that the key has been unblocked, we use the Redis pub/sub mechanism: all blocked workers subscribe to a channel for the key, and once the blocking worker has finished fetching data, it publishes a release message to that channel.
As we use the same caching mechanism for ISR, we cannot just keep values in cache indefinitely but need a way to re-fetch data from the CMS after cache expiry. To achieve that, we store the time when a value was last fetched for a key and, every time that key is read, check if the content should be expired. If so, we re-fetch it.
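
The blocking step has one subtle requirement: only one worker should win the block for a given key. A minimal sketch of how that could be enforced uses the Redis SET command with its NX option (set only if the key does not exist yet) and a PX expiry, so that a crashed worker does not block a key forever. This is a possible variation, not part of our implementation. LockClient is a hypothetical minimal interface; the node-redis v4 client.set call should accept options of this shape, but treat that as an assumption and check your client version. tryBlock and FakeLockClient are illustrative names.

```typescript
// Hypothetical minimal view of a Redis client: just the one call we need.
interface LockClient {
  set(key: string, value: string, options: { NX: true; PX: number }): Promise<string | null>;
}

const BLOCK_PREFIX = '___block';

// Tries to block `key`. Resolves to true if this caller won the block,
// false if another worker already holds it.
async function tryBlock(client: LockClient, key: string, ttlMs = 60000): Promise<boolean> {
  // With NX, Redis replies "OK" only if the key did not exist before.
  const reply = await client.set(`${BLOCK_PREFIX}${key}`, 'true', { NX: true, PX: ttlMs });
  return reply === 'OK';
}

// In-memory fake that mimics SET ... NX, for trying the sketch without
// a Redis server. It ignores the PX expiry.
class FakeLockClient implements LockClient {
  private store = new Map<string, string>();

  async set(key: string, value: string): Promise<string | null> {
    if (this.store.has(key)) {
      return null;
    }
    this.store.set(key, value);
    return 'OK';
  }
}
```

The worker that gets true from tryBlock fetches from the CMS; every other worker gets false and waits for the release message instead.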

Our Redis cache implementation

Our implementation of the Redis cache then comes together like this:

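import * as redis from 'redis';
// import the type from the package root rather than from an internal
// dist path, which may change between releases
import type { RedisClientType } from 'redis';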

/**
 * A Redis strategy for CachedValue
 */
export class CachedValueRedis<T> {
  private client?: RedisClientType;
  
  // these are just random string prefixes to store meta data on keys:
  // when was the key's value last set?
  private SET_TIME_PREFIX = '___set_at';
  
  // is the key currently blocked?
  private BLOCK_PREFIX = '___block';
  
  // the pub/sub channel name prefix for a key to publish the unblocking of
  // a key
  private VALUE_SET_CHANNEL_PREFIX = '$$$value_set';

  connected = false;
  hasHadErrorBefore = false;

  constructor(private redisUrl = process.env.REDIS_URL, private timeout = 5 * 60 * 1000) {
    if (!redisUrl) {
      console.log('No redis URL is set. That is why redis caching will be disabled');
      return;
    }

    this.client = this.getClient();
  }

  private async ensureConnected() {
    if (this.connected || this.hasHadErrorBefore) {
      return;
    }
    if (!this.client) {
      throw new Error('No redis client set. Did you supply REDIS_URL?');
    }
    try {
      await this.client.connect();
      this.connected = true;
    } catch (err) {
      this.connected = false;
      console.error('Error trying to connect to redis: ' + JSON.stringify(err), err);
      this.hasHadErrorBefore = true;
    }
  }

  private getClient(): RedisClientType {
    console.log('Initiating Redis client with URL: ', this.redisUrl);
    return redis.createClient({ url: this.redisUrl });
  }

  /**
   * Block the given `key`, making all accessing workers wait until
   * it is unblocked.
   */
  async block(key: string): Promise<void> {
    await this.ensureConnected();
    await this.client?.set(`${this.BLOCK_PREFIX}${key}`, 'true');
    return;
  }

  private async isBlocked(key: string): Promise<boolean> {
    await this.ensureConnected();
    return (await this.client?.get(`${this.BLOCK_PREFIX}${key}`)) === 'true';
  }

  async getValue(key: string): Promise<T | undefined> {
    await this.ensureConnected();

    if (await this.isBlocked(key)) {
      const subClient = this.getClient();
      await subClient.connect();

      // if the given key is currently blocked, subscribe to a channel that
      // will inform its subscribers when the key's value has been set and
      // therefore is unblocked.  
      const result = await new Promise<T | undefined>((resolve) =>
        subClient.subscribe(`${this.VALUE_SET_CHANNEL_PREFIX}${key}`, (message) => {
          try {
            // when the key's value has been set, we get the updated value
            // back as the pub/sub message
            resolve(JSON.parse(message ?? '{}') as T);
          } catch (err) {
            resolve(undefined);
          }
        })
      );

      // now that the value has been retrieved, we can unsubscribe from the
      // channel
      await subClient.unsubscribe(`${this.VALUE_SET_CHANNEL_PREFIX}${key}`);
      await subClient.disconnect();
      return result;
    }

    // the key was not blocked, so we can directly retrieve it from Redis
    const stringValue = await this.client?.get(key);

    if (!stringValue) {
      return;
    }

    try {
      return JSON.parse(stringValue);
    } catch (err) {
      return;
    }
  }

  async hasCurrentValue(key: string): Promise<boolean> {
    await this.ensureConnected();

    // Check if the value has been set within the cache timeout. If not, it
    // is considered expired and not a "current" value.
    const timeoutValue = await this.client?.get(`${this.SET_TIME_PREFIX}${key}`);
    if (timeoutValue) {
      return +timeoutValue > Date.now() - this.timeout;
    } else {
      // if no timeout value is set for the given key, it has not been
      // retrieved yet, and so it doesn't have a value at all.
      return false;
    }
  }
  
  async setValue(key: string, value: T): Promise<void> {
    await this.ensureConnected();

    // set the actual value first, so that no reader can see the key as
    // unblocked before its value is available
    await this.client?.set(key, JSON.stringify(value));
    // set the current time to update the expiry for this key
    await this.client?.set(`${this.SET_TIME_PREFIX}${key}`, `${Date.now()}`);
    // unblock the key, as now a value has been set
    await this.client?.del(`${this.BLOCK_PREFIX}${key}`);
    // publish to subscribers who are waiting for the key to be unblocked
    await this.client?.publish(`${this.VALUE_SET_CHANNEL_PREFIX}${key}`, JSON.stringify(value));
  }
}
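
How the class is called from data-fetching code is not shown above. One possible way to wire it up is sketched below. ValueCache lists only the four public methods of CachedValueRedis that the sketch uses; getOrFetch and InMemoryCache are hypothetical helpers, the latter only so the flow can be tried without a Redis server.

```typescript
// The subset of CachedValueRedis that the helper relies on.
interface ValueCache<T> {
  hasCurrentValue(key: string): Promise<boolean>;
  getValue(key: string): Promise<T | undefined>;
  block(key: string): Promise<void>;
  setValue(key: string, value: T): Promise<void>;
}

// Returns the cached value for `key` if it is still current; otherwise
// blocks the key, fetches a fresh value and stores it.
async function getOrFetch<T>(
  cache: ValueCache<T>,
  key: string,
  fetcher: () => Promise<T>
): Promise<T> {
  // With the Redis implementation, getValue waits here while another
  // worker holds the block on this key.
  const cached = await cache.getValue(key);
  if (cached !== undefined && (await cache.hasCurrentValue(key))) {
    return cached;
  }

  await cache.block(key);
  // Note: if fetcher throws, the key stays blocked in this sketch.
  const fresh = await fetcher();
  await cache.setValue(key, fresh);
  return fresh;
}

// In-memory stand-in with the same expiry rule (default: five minutes).
class InMemoryCache<T> implements ValueCache<T> {
  private values = new Map<string, { value: T; setAt: number }>();
  private timeout: number;

  constructor(timeout = 5 * 60 * 1000) {
    this.timeout = timeout;
  }

  async hasCurrentValue(key: string): Promise<boolean> {
    const entry = this.values.get(key);
    return entry !== undefined && entry.setAt > Date.now() - this.timeout;
  }

  async getValue(key: string): Promise<T | undefined> {
    return this.values.get(key)?.value;
  }

  async block(): Promise<void> {
    // nothing to synchronize within a single process
  }

  async setValue(key: string, value: T): Promise<void> {
    this.values.set(key, { value, setAt: Date.now() });
  }
}
```

In getStaticProps, a call such as getOrFetch(cache, 'navigation', fetchNavigationFromCms) would then replace the direct CMS request, where fetchNavigationFromCms stands for whatever function loads the navigation tree.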

blog.hipsquare.net