High memory consumption and errors with pre-rendering #3159

@tperale

Description

I'm using Vike to generate a static website with more than 75,000 statically generated pages.

As described in #2786, without bumping the memory allocation for the Node process I would run into the following error:

FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory

I tried the suggestion in https://vike.dev/prerender#parallel, and I've reached a point where even setting NODE_OPTIONS="--max-old-space-size=18192" doesn't get me to the end of the pre-rendering step.

I looked into the reasons behind the high memory consumption in the packages/vike/src/node/prerender/runPrerender.ts file and found the following.

  1. All the prerendered page content is added to the pageContext.output array: https://github.com/vikejs/vike/blob/main/packages/vike/src/node/prerender/runPrerender.ts#L965

While this was originally fixed in #1262, it was reintroduced in #2123. For a static website with a huge number of pages, this clogs the memory with a lot of content.

I don't yet fully understand how this content is used and would appreciate pointers, but multiple solutions could be explored:

  • Do not store all the pages but "stream" them to consumers
  • Add an option to not store it
  • Remove this line
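
To make the "stream" option concrete, here is a minimal sketch contrasting the two approaches. Every name in it (prerenderAndStore, prerenderAndStream, renderPage, onPage) is hypothetical and chosen for illustration; none of them is an existing Vike API.

```javascript
// Hypothetical sketch: none of these names are Vike APIs.

// Variant A: accumulate every rendered page. Memory grows with the page count.
async function prerenderAndStore(urls, renderPage) {
  const output = []
  for (const url of urls) {
    output.push({ url, html: await renderPage(url) })
  }
  return output
}

// Variant B: hand each page to a consumer, then drop the reference.
async function prerenderAndStream(urls, renderPage, onPage) {
  let count = 0
  for (const url of urls) {
    const html = await renderPage(url)
    await onPage({ url, html }) // e.g. write the file to disk
    count++ // only a counter survives the iteration
  }
  return count
}
```

With variant B, a consumer that really needs every page can still collect them itself, while the default path keeps at most one rendered page alive at a time.
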
  2. The onPrerenderStart hook requires all the pageContexts.

The documentation of that hook at https://vike.dev/onPrerenderStart mentions that it's used to generate multiple pageContexts from an input pageContext (with the example of i18n).

A potential solution could be to require onPrerenderStart hooks to return an array of pageContexts instead of manipulating the array in place. This also has the benefit of making it easier to parallelize the rendering pipeline.

The default onPrerenderStart hook would look like the following:

export { onPrerenderStart }

// Default hook: returns the input pageContext unchanged, wrapped in an array
async function onPrerenderStart(pageContext) {
  return [pageContext]
}

Let me know if I'm missing other use cases where the onPrerenderStart hook modifies other parts of the prerenderContext.
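
For the i18n case, a hook following this proposed per-page signature might look like the sketch below. This is the shape suggested in this issue, not the current Vike API. The locales list and the way locale and urlOriginal are set are assumptions made for the example, and the export statement is left out so the snippet runs standalone.

```javascript
// Hypothetical per-page variant of the hook, following the proposal above.
// `locales` and the `locale` / `urlOriginal` handling are assumptions for this example.
const locales = ['en', 'fr', 'de']

async function onPrerenderStart(pageContext) {
  // Return new objects instead of mutating a shared array
  return locales.map((locale) => ({
    ...pageContext,
    locale,
    urlOriginal: `/${locale}${pageContext.urlOriginal}`
  }))
}
```

Because the hook only sees one pageContext and returns fresh objects, each call is independent of the others and could run inside its own rendering task.
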

  3. The prerendering pipeline

Right now every pageContext is kept in memory, and a Promise is created for each one to perform the prerendering step.

In my case, pre-rendering starts at around 2 GB of memory consumption with this method, and I observed that memory keeps increasing with every prerendered page (even after removing the output array mentioned in point 1). I'm not totally sure about this, but it looks like the objectAssign function makes it difficult for the garbage collector to free the duplicated references. And since the original references still exist in the pageContext array, memory usage keeps growing.

A solution to help the garbage collector do its job is to make the prerendering pipeline a 'standalone' task done 'per URL':

flowchart TD
    A[URL Generation]

    subgraph P1[Promise 1]
        B1[Context Generation]
        C1[onPrerenderStart Hook]
        D1[Rendering]
        E1[Writing]
        F1[Passing down to consumers ?]
        B1 --> C1 --> D1 --> E1 --> F1
    end

    subgraph P2[Promise 2]
        B2[Context Generation]
        C2[onPrerenderStart Hook]
        D2[Rendering]
        E2[Writing]
        F2[Passing down to consumers ?]
        B2 --> C2 --> D2 --> E2 --> F2
    end

    subgraph P3[Promise 3]
        B3[Context Generation]
        C3[onPrerenderStart Hook]
        D3[Rendering]
        E3[Writing]
        F3[Passing down to consumers ?]
        B3 --> C3 --> D3 --> E3 --> F3
    end

    A --> B1
    A --> B2
    A --> B3

I hacked something together to measure the memory consumption with this method, and in my use case it sits at around 1 GB or less.
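
A rough sketch of that per-URL pipeline with a bounded number of concurrent tasks is shown below. It is not the hack mentioned above and not Vike code: prerenderUrl, prerenderAll, the steps object and its four callbacks are all hypothetical names standing in for the stages of the flowchart.

```javascript
// Hypothetical sketch of a per-URL pipeline; none of these names are Vike APIs.
async function prerenderUrl(url, steps) {
  const pageContext = await steps.createPageContext(url)
  const pageContexts = await steps.onPrerenderStart(pageContext)
  for (const ctx of pageContexts) {
    const html = await steps.render(ctx)
    await steps.write(ctx.urlOriginal, html)
  }
  // pageContext, pageContexts and html go out of scope here,
  // so nothing from this URL is referenced once the task is done
}

async function prerenderAll(urls, steps, concurrency = 4) {
  let next = 0
  // Each worker pulls the next URL until none are left
  async function worker() {
    while (next < urls.length) {
      const url = urls[next++]
      await prerenderUrl(url, steps)
    }
  }
  const workers = Array.from({ length: Math.min(concurrency, urls.length) }, worker)
  await Promise.all(workers)
}
```

Only the list of URLs and at most `concurrency` in-flight contexts are alive at any time, instead of one pageContext and one pending Promise per page.
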
