Instant forecasting with incremental computation


By Carl Jackson

Watershed helps businesses run world-class climate programs. One part of this is helping them model what their emissions will look like in the future. This helps them understand if making a climate commitment is feasible, or how an emissions reduction might change what their climate impact looks like in the future.

To help our users with this, Watershed has built a sophisticated model for forecasting a business as it grows over time. It takes into account their existing carbon footprint, the business's projections of the growth of its key business metrics, a growing library of Watershed's vendor projections, and the user's custom reduction initiatives, among other factors. If you're familiar with building forecasts in a spreadsheet, this forecasting model will feel familiar, but it's an order of magnitude more configurable than any spreadsheet we've seen before.

Our forecast is built into the Watershed dashboard, but just like a spreadsheet we wanted the forecast to feel interactive and playful. We wanted users to be able to quickly experiment with ideas, and to get instant feedback from the charts and tables on the page.

Problems with React’s useMemo

We initially built the feature by calculating the forecast in a React component's render function, but quickly ran into performance problems: for our largest users, computing the forecast took a second or more, making the page unusable. At Watershed, we always start with the simplest solution, so our first solution was to use useMemo, a React hook that memoizes a piece of computation. By carefully splitting up the work into small useMemo blocks we were able to make the page feel snappy again, but at a cost. We encountered two problems: what I'm calling the "taint problem" and the "modularity problem."

The taint problem is a common useMemo gotcha. useMemo has a very primitive dependency tracking system—it re-calculates a given function whenever its inputs change (based on referential equality). Unless every dependency of a useMemo block is itself memoized, it might change on every React render, causing the useMemo’d function to recompute every time. That is, a single un-memoized value will taint every computation downstream of it, negating the benefits of memoization. Worst of all, this occurs silently, without warnings to the programmer. Our forecast model regularly had performance regressions caused by accidental un-memoized computations.
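To make the taint problem concrete, here is a minimal sketch that runs without React. The `createMemo` helper is a hypothetical, simplified stand-in for `useMemo` (it compares dependencies with `===`, whereas React itself uses `Object.is`), and the `render` functions and growth-rate numbers are made up for illustration.

```typescript
// Simplified stand-in for useMemo (hypothetical helper, not React's code):
// recompute only when some dependency is not === to its previous value.
function createMemo<T>(): (compute: () => T, deps: readonly unknown[]) => T {
  let lastDeps: readonly unknown[] | null = null;
  let lastValue: T | undefined;
  return (compute, deps) => {
    const prev = lastDeps;
    const unchanged =
      prev !== null &&
      prev.length === deps.length &&
      deps.every((dep, i) => dep === prev[i]);
    if (!unchanged) {
      lastValue = compute();
      lastDeps = deps;
    }
    return lastValue as T;
  };
}

// Tainted version: one un-memoized value feeds a memoized computation.
let taintedForecastRuns = 0;
const memoTaintedForecast = createMemo<number>();

function renderTainted(growthRates: number[]): number {
  // filter() allocates a new array on every call, even when its contents
  // are identical to the previous call's, so the dependency always "changes".
  const positiveRates = growthRates.filter((r) => r > 0);
  return memoTaintedForecast(() => {
    taintedForecastRuns += 1;
    return positiveRates.reduce((total, r) => total * (1 + r), 100);
  }, [positiveRates]);
}

// Fixed version: the upstream value is memoized too.
let fixedForecastRuns = 0;
const memoRates = createMemo<number[]>();
const memoFixedForecast = createMemo<number>();

function renderFixed(growthRates: number[]): number {
  const positiveRates = memoRates(
    () => growthRates.filter((r) => r > 0),
    [growthRates],
  );
  return memoFixedForecast(() => {
    fixedForecastRuns += 1;
    return positiveRates.reduce((total, r) => total * (1 + r), 100);
  }, [positiveRates]);
}

const rates = [0.1, 0.2];
renderTainted(rates);
renderTainted(rates); // taintedForecastRuns is now 2: the memo never hit
renderFixed(rates);
renderFixed(rates); // fixedForecastRuns is still 1: the second call was cached
```

Nothing in the tainted version looks wrong at the call site, which is why this kind of regression is easy to introduce and hard to spot in review.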

The modularity problem is more subtle. We use the forecast model on several pages across the Watershed product, and each page needs different parts of the forecast (e.g., projecting next year’s Cloud emissions, or getting business-wide statistics for the next 10 years). Ideally, the parts of the computation would be modular, and we’d be able to calculate exactly what each page needed and no more. But the React hooks programming model is eagerly evaluated and doesn't allow branching, which makes it hard to evaluate only some (but not all) of the useMemo blocks in a larger model. We tried a few ideas—like boolean flags and nullable computation outputs—but those “fixes” often made our code harder to understand. Except for a few places where performance was critical, we mostly weren’t able to modularize our forecast model; instead we computed the entire thing each time.

Incremental computation

When our simple solution was no longer enough, we decided to invest in a replacement. We decided to rethink our programming model to turn these performance concerns into a pit of success.

Our solution was to write an incremental computation library. This library keeps track of which parts of a computation depend on which other parts, and when the inputs change, avoids recomputing any values that couldn't have been affected. If you've ever used a spreadsheet, this is similar to how modern spreadsheets are implemented.

Getting incremental computation right is hard, and we were glad to learn from the experience of several folks who have spent a long time thinking about this problem. The best overview article we're aware of is Robert Lord's "How to recalculate a spreadsheet." We based our implementation on Jane Street's Incremental, which Ron Minsky gave an excellent talk about. If you'd like to learn more, we highly recommend both Robert's blog post and Ron's talk: their explanations are clear and insightful.

In our implementation there are two types of values: Variable and Calculation, which represent input data and computations, respectively. Calculations can depend on the values of Variables, other Calculations, and any other part of your JavaScript codebase, and are only recomputed when a dependency Variable or Calculation changes its value.

[Figure: non-incremental vs. incremental data flow. An example of incremental computation for the calculation (x + y) + z, showing what is recomputed when only the z value changes.]
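The (x + y) + z example can be sketched in a few dozen lines. This is only an illustration under assumptions: the `get`, `set`, and `invalidate` methods and the `runs` counter are hypothetical names, and the dirty-flag strategy shown here is one simple approach, not necessarily what Watershed's library or Jane Street's Incremental does.

```typescript
// Anything that can be told "one of your inputs changed".
interface Dependent {
  invalidate(): void;
}

// The Calculation currently being computed, if any. Reads that happen while
// it is set are recorded as dependencies of that Calculation.
let activeCalculation: Dependent | null = null;

function track(dependents: Dependent[]): void {
  if (activeCalculation && dependents.indexOf(activeCalculation) === -1) {
    dependents.push(activeCalculation);
  }
}

// Input data.
class Variable<T> {
  private value: T;
  private dependents: Dependent[] = [];
  constructor(initial: T) {
    this.value = initial;
  }
  get(): T {
    track(this.dependents);
    return this.value;
  }
  set(next: T): void {
    if (next === this.value) return; // unchanged input: nothing to invalidate
    this.value = next;
    this.dependents.forEach((d) => d.invalidate());
  }
}

// A derived value, recomputed lazily and only after an input has changed.
class Calculation<T> implements Dependent {
  runs = 0; // how many times the function has actually executed
  private compute: () => T;
  private cached: T | undefined = undefined;
  private stale = true;
  private dependents: Dependent[] = [];
  constructor(compute: () => T) {
    this.compute = compute;
  }
  get(): T {
    track(this.dependents);
    if (this.stale) {
      const outer = activeCalculation;
      activeCalculation = this;
      try {
        this.cached = this.compute();
        this.runs += 1;
      } finally {
        activeCalculation = outer;
      }
      this.stale = false;
    }
    return this.cached as T;
  }
  invalidate(): void {
    if (this.stale) return; // already marked; dependents were notified before
    this.stale = true;
    this.dependents.forEach((d) => d.invalidate());
  }
}

const x = new Variable(1);
const y = new Variable(2);
const z = new Variable(3);
const xPlusY = new Calculation(() => x.get() + y.get());
const total = new Calculation(() => xPlusY.get() + z.get());

const before = total.get(); // 6: both calculations run once
z.set(10);
const after = total.get(); // 13: only `total` reruns, `xPlusY` is reused
```

After the update, `xPlusY.runs` is still 1 while `total.runs` is 2. A production library would also need to drop dependencies that are no longer read and to skip downstream work when a recomputed value turns out to be unchanged; both are omitted here for brevity.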

Our incremental computation library provided the conceptual building blocks of a solution to our useMemo problems. To make it easy to work with, we built a simple interface to it we call XModel (the "X" stands for "Excel," a hat-tip to the spreadsheet versions of several of our early products, including the forecast model). By adding a small number of decorators into a normal JavaScript class, XModel gives engineers an ergonomic way to make their code update incrementally.

Here's an example of what an XModel might look like in practice:

[Code block: ForecaseX extends XModel]

If you're familiar with JavaScript, it's easy to understand what this code does: just ignore the @variable and @calculation decorators, and read the rest of the code as-is. Behind the scenes, the XModel library uses the decorators to define getters and setters for underlying Variables, and wraps each method in a Calculation, but engineers don't need to understand any of that to be productive.
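As a rough picture of that behind-the-scenes wiring, the sketch below writes out by hand what decorators like these might expand to. It deliberately avoids decorator syntax so that it does not assume any particular decorator proposal or compiler setting. Everything in it is hypothetical: the compact `Variable` and `Calculation` stand-ins, the `ForecastSketch` class, and its field names and numbers are illustrative, not the XModel API.

```typescript
interface Dependent {
  invalidate(): void;
}
let active: Dependent | null = null; // the Calculation currently computing

// Compact stand-in for an input cell.
class Variable<T> {
  private value: T;
  private dependents: Dependent[] = [];
  constructor(initial: T) {
    this.value = initial;
  }
  get(): T {
    if (active && this.dependents.indexOf(active) === -1) {
      this.dependents.push(active);
    }
    return this.value;
  }
  set(next: T): void {
    if (next === this.value) return;
    this.value = next;
    this.dependents.forEach((d) => d.invalidate());
  }
}

// Compact stand-in for a memoized computation with no downstream dependents.
class Calculation<T> implements Dependent {
  private compute: () => T;
  private cached: T | undefined = undefined;
  private stale = true;
  constructor(compute: () => T) {
    this.compute = compute;
  }
  get(): T {
    if (this.stale) {
      const outer = active;
      active = this;
      try {
        this.cached = this.compute();
      } finally {
        active = outer;
      }
      this.stale = false;
    }
    return this.cached as T;
  }
  invalidate(): void {
    this.stale = true;
  }
}

class ForecastSketch {
  // Roughly what a variable decorator could expand to:
  // a getter/setter pair backed by a Variable.
  private baselineVar = new Variable(1000);
  private growthVar = new Variable(0.5);
  get baselineTonnes(): number {
    return this.baselineVar.get();
  }
  set baselineTonnes(v: number) {
    this.baselineVar.set(v);
  }
  get growthRate(): number {
    return this.growthVar.get();
  }
  set growthRate(v: number) {
    this.growthVar.set(v);
  }

  // Roughly what a calculation decorator could expand to:
  // the method body wrapped in a Calculation.
  bodyRuns = 0; // counts how often the method body actually executes
  private nextYearCalc = new Calculation(() => {
    this.bodyRuns += 1;
    return this.baselineTonnes * (1 + this.growthRate);
  });
  nextYearTonnes(): number {
    return this.nextYearCalc.get();
  }
}

const model = new ForecastSketch();
const first = model.nextYearTonnes(); // 1500, computed
const second = model.nextYearTonnes(); // 1500, served from cache
model.growthRate = 0.25; // plain assignment goes through the setter
const third = model.nextYearTonnes(); // 1250, recomputed
```

Callers only ever see ordinary property reads, assignments, and method calls, which matches the point above that the code reads like a normal class.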

In addition to the XModel class, we've also developed a set of React hooks that allow React components to depend on parts of an XModel, and to automatically re-render when those parts change.

Circling back to our original problems with useMemo, have XModel and incremental computation helped with our forecast model performance issues? Yes! Performance used to be a twice-a-month fire drill for the team, but after switching to XModel it hasn't been a concern.

Before: 427.6 ms. After: 10.1 ms.

The taint problem is completely gone: @calculation always understands its dependencies and memoizes its results, so it's impossible to accidentally de-optimize a computation. A method that is missing a @calculation decorator will simply miss out on additional optimizations.

The modularity problem has been solved by lazy demand-driven evaluation in the XModel programming model. That is, since XModel explicitly tracks which parts of the forecast are currently being used on-page, and can trace those computations back to their inputs, it always knows precisely what to calculate (or re-calculate). And because nothing is required of the programmer to manage the data dependency graph, we’ve found that the resulting memoized computations tend to be smaller, further increasing both code quality and performance.
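The core of demand-driven evaluation can be shown with a small lazy cell: nothing runs until a consumer asks for it. This sketch covers only the laziness, not invalidation, and the `Lazy` class, the footprint numbers, and the two outputs are all hypothetical names and data chosen for illustration.

```typescript
// Hypothetical lazy cell: runs its function on first demand, then caches.
class Lazy<T> {
  evaluations = 0; // how many times the function has actually executed
  private compute: () => T;
  private cell: { value: T } | null = null;
  constructor(compute: () => T) {
    this.compute = compute;
  }
  get(): T {
    let cell = this.cell;
    if (cell === null) {
      this.evaluations += 1;
      cell = { value: this.compute() };
      this.cell = cell;
    }
    return cell.value;
  }
}

// Made-up inputs: emissions in tonnes CO2e by category.
const footprint = { cloud: 120, travel: 80, offices: 50 };

// Two outputs of one model; defining them costs nothing.
const nextYearCloud = new Lazy(() => footprint.cloud * 1.5);
const tenYearTotals = new Lazy(() => {
  const total = footprint.cloud + footprint.travel + footprint.offices;
  const years: number[] = [];
  for (let year = 1; year <= 10; year++) {
    years.push(total * year);
  }
  return years;
});

// A hypothetical cloud page demands only the output it displays.
const cloudPageValue = nextYearCloud.get(); // 180
// tenYearTotals.evaluations is still 0: the unused output never ran.
```

Because evaluation is driven by what is read, each page pays only for the outputs it uses, without boolean flags or nullable results threaded through the model.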


XModel and incremental computation have allowed us to scale our business forecast model to new features and to businesses with more data than we originally thought possible, all while keeping a great instant-updating user experience. And we've done so while decreasing the cognitive burden of writing compute-heavy code: all of the features in the past six months have been built without engineers having to think hard about performance or memoization.

This is just one of many engineering problems we’ll need to solve as we help decarbonize the economy. Want to join the fight against climate change? Come join us!
