The THERMOS network optimisation is a heuristic centred around a mixed integer/linear programming approach^{1}.

## 1 Overview

The THERMOS network optimisation process works in a few steps:

Problem preparation.

The representation of the problem used for the editor is converted into a form suitable for the optimiser.

This means:

- Making all the different objective contributions commensurable, by converting them into present values or costs
- Computing upper and lower bounds for the size and diversity for all pipe segments
- Generating linearised pipe cost functions for all pipe segments
- Tidying up the network to remove areas which cannot be connected and to combine pipe segments which contain redundant junctions

Mixed-integer linear formulation and solving.

Once the first step is done THERMOS phrases the problem as a mixed integer/linear program (MILP), and hands it off to a solver package to find a solution.

Parameter update loop.

Heat losses and diversity factors cannot be decided as part of the MILP, and depend on the structure of the proposed network.

Once we have a solution, we re-evaluate the heat losses and diversity factors. Unfortunately this implies that the MILP we solved was not quite the correct one, so we solve it again with the new parameters and see whether this changes the result.

If the result is unaffected, or we are going in circles, or we have run out of time, we stop.

Best solution re-evaluation.

Finally, we take the best solution we encountered in the parameter update loop and re-evaluate it using the nonlinear pipe cost functions; this again breaks the optimality guarantee from the MILP, but using the full nonlinear cost shapes makes the problem too hard.

## 2 Problem preparation

### 2.1 Conversion to present values

The MILP does not contain any representation of time, except for a distinction between peak and average operating conditions.

Because of this ongoing revenues and costs have to be summed up over the whole accounting period.

In general, any cost or revenue is converted into a time-series of payments spanning the accounting period. These payments are then discounted and summed in the normal way to give a present value:

\[ \mathit{PV}(x, r) = \sum_i x_i / (1 + r)^i \]

Capital costs are converted into a time-series in one of two ways, which can be combined:

- By repetition on some interval
By annualizing the cost with a loan:

\[ \frac{X×r}{1 - 1/((1+r)^t)} \]

There are four combinations possible here:

- No repetition or loan
- Repetition without a loan
- A loan without repetition
- Both a loan and repetition

Some costs depend on decisions made by the optimiser.

For example, the cost of a heat supply is framed in terms of its capacity, which we do not know up-front.

In these cases, the *unit* cost is what is converted to a present value.
Fortunately, geometric discounting is safe to apply to the unit rate, so the present value of the unit cost multiplied with the size is equal to the present value of the size multiplied with the unit cost:

\[ PV(\text{unit rate}) \times \text{size} = PV(\text{unit rate} \times \text{size}) \]

### 2.2 Flow bounds

The formulation of the MILP involves so-called "big-M" constraints (of which more later), and the use of linearised pipe costs.

To make both of these things work well, we need to have good *bounds* on the heat flow on each potential pipe segment. Producing tight bounds here makes the MILP easier to solve, and makes the linearised pipe cost function less erroneous.

### 2.3 Linearised pipe costs

Pipe costs are represented by equations of the form

\[ \text{length} \times (A + B \times (\oslash ^ {1.1}) + C \times (\oslash ^ {1.3})) \]

However, the MILP works in terms of power, rather than diameter (it decides on the capacity for pipes in terms of power flow), so we must first convert this equation into one which relates a flow of heat to a cost.

THERMOS does this using a relation between delivered power and diameter:

\[ p(d) = (-0.4834 + 4.7617 × (d ^ {0.3701})) \times \delta_t \times \rho \times c \]

where \(d\) is the diameter, \(\delta_t\) the temperature difference, \(\rho\) the density and \(c\) the specific heat capacity. This relation is an approximation; the real hydraulic behaviour is complicated and not modelled in THERMOS.

Anyhow, using this relation we are able to numerically generate a pipe cost function which relates kW of power to the pipe's cost per meter.

This function is non-linear, and also not monotonically increasing^{2}, which makes it hard to efficiently approximate within a MILP.

In THERMOS, we make a linear approximation to this function for each place where a pipe could go; the approximation's terms are chosen to minimise the square error resulting from using it.

Because we have flow bounds for every potential pipe, we are able to restrict the range we are approximating to the range of powers that the pipe may be required to deliver, which also helps to keep the error down.

### 2.4 Tidying up

Finally we do a bit of tidying up to simplify the optimiser input, removing:

- Buildings which can't be connected to any supply and have no alternative system possible
- Paths which don't go to any building
- Junctions in the road network which would have no effect on the result. This combines any paths which can be combined.

## 3 Mixed-integer linear formulation

The result of the above process is a simplified problem description, containing the following information:

- For each demand location:
- Annual and peak demand
- The number of demands, for diversity calculation
- The present value of connecting the building, in three parts: a fixed part, a part per kWh and a part per kWp (kWp being peak demand)
- The present value of connection costs for the building, split the same way.
- A list of insulation that is available there, characterised by:
- The present cost of the insulation, as a fixed cost and cost per kWh abated demand
- The maximum and minimum values for kWh demand that can be abated

- A list of alternatives that are available there, characterised by:
- Present cost, in terms of fixed cost, kWh cost and kWp cost
- Emissions factors per kWh

- For each supply location:
- The maximum peak capacity available
- The present cost of supply, in terms of a fixed cost, a cost per kWp, and a cost per kWh
- Emissions factors per kWh

- For each possible path:
- Upper bounds for the heat it might be asked to carry in any possible network, at peak and average time
- The present cost of using the path, in terms of a fixed cost and a cost per kWp

- For each type of emission:
- The present cost per tonne emitted
- Any upper bound required

### 3.1 Sketch

Before giving the formal description, here is a sketch of how the problem is defined; this should make the formalism a bit easier to read.

The task for the optimiser is to choose what to do with each demand (network or individual system), and what to do with each possible path (pipe or not, and what size).

So, there are decision variables for these choices:

For every building, a decision about how to heat it and a decision about how much insulation to buy.

These are a series of binary variables - either a building is on a network or not, it has a gas boiler or not, it has external wall insulation or not, and so on.

For insulation, there is also a continuous variable to be decided: how much insulation to buy.

For every arc (an arc being one of the two directions along a path), a decision about whether to use it and how big to make it.

So these are two variables, whether we buy the arc or not, and how big a pipe we put in.

Given these decision variables it is possible to write down the objective function. For example we can say that if the connection of a building \(i\) to network is given by the variable \(DVIN_{i}\), then the objective function includes terms like \(DVIN_{i} \times \text{pv of connecting }i\).

Similarly for pipes we might say that \(AIN_{i,j}\) is 1 if a pipe from \(i\) to \(j\) is included and 0 otherwise, and \(CAPACITY_{i,j}\) is the size of the pipe needed in kWp. Then the cost of the pipe to the objective is \(AIN_{i,j} \times \text{fixed cost} + CAPACITY_{i,j} \times \text{variable cost}\).

To prevent the optimiser producing a silly result we also need a system of constraints that describe what a legal solution looks like.

The detail of these is given below, but it mostly expresses a few simple rules:

- At every point in the network, the flow of heat has to balance, so that if heat flows out into a building or junction it must be balanced by heat that flows in from a pipe or a supply location
- Along every arc in the network, the pipe capacity must be enough to carry the flow of heat along that arc
- At every demand location, there has to be a choice of exactly one type of heating used

### 3.2 Formalism

First we should introduce some symbols for the mathematical formulation:

- The set of all vertices (junctions or end-points in a network), called \(\mathit{VTX}\) and usually indexed by \(i\), having subsets:
- The set of demand vertices \(\mathit{DVTX}\)
- The set of supply vertices \(\mathit{SVTX}\),

- The set of all arcs (directed pipes in a network), called \(\mathit{ARC}\), which is \(\mathit{VTX} \times \mathit{VTX}\), usually indexed by \(a\) or \((i, j)\)
- The set of all edges, called \(\mathit{EDGE}\), which is the undirected subset of \(\mathit{ARC}\), often indexed by \(e\).
- The set of all individual system types, called \(\mathit{ALT}\), usually indexed by \(t\)
- The set of insulation types, called \(\mathit{INS}\), usually indexed by \(t\)
Two types of 'time', usually indexed by \(t\).

The two times are t

_{peak}and t_{mean}, which reflect peak and average / annual operating conditions for the network.

In the code there are a few more sets, but they are implementation details best understood by reading the program. We don't explain them here, because it would make the design less clear.

Next we can consider the decision variables for the network part (we will cover individual systems and insulation a bit later):

- \(\mathit{DVIN}_i\) is a binary variable (valued 0 or 1) which models which \(i\) in \(\mathit{DVTX}\) are on the heat network
- \(\mathit{SVIN}_i\) is a binary variable which models which \(i\) in \(\mathit{SVTX}\) are providing heat to the network
- \(\mathit{AIN}_{i,j}\) is a binary variable which models which arcs have a pipe on them
- \(\mathit{FLOW}_{i,j,t}\) is is a nonnegative real value which models the flow of heat from \(i\) to \(j\) in time period \(t\)
- \(\mathit{CAPACITY}_{i,j}\) is is a nonnegative real value which models the pipe size required (in kw) from \(i\) to \(j\) in any period, allowing for diversity (of which more later)
- \(\mathit{SUPPLY}_{i,t}\) is a nonnegative real value which models the heat output from supply location \(i\) in time \(t\)
- \(\mathit{SUPPLYCAPACITY}_{i}\) is a nonnegative real value which models the plant capacity required at location \(i\)

These variables produce contributions to objective in a fairly direct way:

- Revenues
Since \(\mathit{DVIN}_i\) is 1 if a building is connected to the network, the revenue from a building is \[ \sum_i \mathit{DVIN}_i \times (\text{present value of connecting}) \]

Ignoring insulation, the present value of connecting the building is a constant which we can work out outside the MILP. We will return to insulation later.

- Connection costs
- Similarly to revenues, we can state the connection cost as \[ \sum_i \mathit{DVIN}_i \times \text{present cost of connecting} \]
- Heat cost
- The cost of heat input into the network is \[ \sum_{i \in \mathit{SVTX}} \mathit{SUPPLY}_{i, t_{mean}} \times \text{present cost per kwh} \]
- Plant cost
- The cost of plant is \[ \sum_{i \in \mathit{SVTX}} \mathit{SVIN}_{i} \times \text{present fixed cost of supply at $i$} + \mathit{SUPPLYCAPACITY}_i \times \text{present variable cost of supply at $i$} \]
- Pipe cost
- The cost of pipes is quite similar to the cost of supplies: \[ \sum_a \mathit{AIN}_a \times \text{fixed cost of $a$} + \mathit{CAPACITY}_a \times \text{variable cost of $a$} \]

However we must also bind the optimiser to produce a sensible answer using some constraints:

- Flow balances
The flow balance rule is what makes the model build a network at all. For every point \(i\) at each "time" \(t\) (which includes all supply points, demand points, and junctions between paths), we define the

*unmet demand*at \(i\) in \(t\) as the difference between all the heat leaving \(i\) and all the heat flowing into \(i\).In formal terms, this the unmet demand at \(i\) in time \(t\) is

\[ u = (\mathit{demand} + \mathit{outflow} + \mathit{losses}_{}) - (\mathit{supply} + \mathit{inflow}) \]

where we use \(\mathit{DVIN}\)

\[ \mathit{demand} = \mathit{DVIN}_i \times \mathit{DEMAND}_{i, t} \text{, or zero if $i$ is not a demand location} \]

and

\[ \mathit{supply} = \mathit{SUPPLY}_{i, t} \text{, or zero if $i$ is not a supply location} \]

and

\[ \mathit{outflow} = \sum_{j\in N(i)}\mathit{FLOW}_{i,j,t} \]

and

\[ \mathit{inflow} = \sum_{j\in N(i)}\mathit{FLOW}_{j,i,t} \]

and

\[ \mathit{losses} = \sum_{j\in N(i)}\mathit{AIN}_{j,i} \times \mathit{LOSS_{i,j}} \]

Disregarding insulation and skipping over heat losses for now, we constrain \(u_i = 0\) for every \(i\).

- Flow requires pipe
Since \(\mathit{AIN}\) is used to contribute pipe fixed costs to the objective, we don't want to allow \(\mathit{AIN}_{i,j} = 0\) unless \(\mathit{FLOW}_{i,j,t}\) = 0 as well.

This is done using what's normally called a

*big-M*constraint, which looks like this:\[ \forall t: \mathit{FLOW}_{i,j,t} \leq \mathit{AIN}_{i,j} \times M_{i,j,t} \]

Here \(M\) is the big-M in question - it is a number chosen to be a bit bigger than the largest value \(\mathit{FLOW}_{i,j,t}\) would sensibly need to take. In this case it is the appropriate flow upper bound, whose computation is described above.

The effect is to ensure that we cannot use the pipe unless we also pay for it!

- Capacity suffices
The pipe cost consists of fixed and variable parts; \(\mathit{AIN}\) turns the fixed part on and off, and \(\mathit{CAPACITY}\) controls the variable part. Without being forced otherwise, the optimiser would set \(\mathit{CAPACITY}\) to 0, so a bit like the previous constraint we need to make sure that if there is a flow, then there is capacity for that flow.

However, since the \(\mathit{FLOW}\) variable reflects the sum of all demands 'down the pipe' without accounting for diversity, we need to introduce a

*diversity factor*, whose mysterious origins will be described later. For now it is sufficient to know that it's a number less than or equal to 1, which makes the required pipe smaller if it is carrying many demands at peak.\[ \forall t: \mathit{CAPACITY}_{i,j} \geq \mathit{DIVERSITY}_{i,j,t} \times \mathit{FLOW}_{i,j,t} \]

and also (because capacity is about an

*edge*, but flow is about an*arc*)\[ \forall t: \mathit{CAPACITY}_{i,j} \geq \mathit{DIVERSITY}_{j,i,t} \times \mathit{FLOW}_{j,i,t} \]

- Flow one way
To prevent the model putting a pipe on a path in both the forward and reverse directions we say:

\[ \mathit{AIN}_{i,j} + \mathit{AIN}_{j, i} \leq 1 \]

- Supply capacity suffices
To ensure we purchase enough supply capacity we say

\[ \forall i, t : \mathit{SUPPLYCAPACITY}_i \geq \mathit{DIVERSITY}_{i,t} \times \mathit{SUPPPLY}_{i,t} \]

Again, diversity is a parameter whose computation is described later; here it is enough to presume that we already know the diversity, even though its value does depend on what the supply has been connected to.

- Supply requires plant
To ensure we pay the fixed cost for a supply, we say:

\[ \forall i, t : \mathit{SUPPLY}_{i,t} \leq \mathit{SVIN}_i \times M_{i, t} \]

Where \(M\) is another big-M constraint determined when computing the flow bounds; it is the maximum flow the supply could ever have to produce.

### 3.3 Insulation

Each demand location may potentially have some amount of insulation installed. For the purposes of formulating the MILP, insulation is characterised by a few bits of information:

- Fixed cost
- This is the present cost of doing any amount of the insulation
- Variable cost
- This is the present cost per kWh of insulation done
- Maximum kWh
- This is the maximum reduction in demand available

Then to represent the use of insulation we need to introduce two decision variables

- \(\mathit{INSULATION}_{i, t}\), a binary variable to indicate whether insulation of type \(t\) is being installed in demand \(i\).
- \(\mathit{INSULATIONKWH}_{i, t}\), a continuous variable indicating how much of insulation \(t\) is installed at demand \(i\).

These naturally produce an extra cost term for the objective:

\[ \sum_{i, t} \mathit{INSULATION}_{i,t} \times \text{fixed cost of $t$ at $i$} + \mathit{INSULATIONKWH}_{i, t} \times \text{cost/kwh of $t$ at $i$} \]

As above, we also need a big-M constraint to ensure we pay the fixed cost:

\[ \forall i, t : \mathit{INSULATIONKWH}_{i,t} \leq \mathit{INSULATION}_{i,t} \times M \]

Finally we need to make insulation affect the demand for heat. Earlier, we said that the unmet demand at each vertex had to be zero; when considering insulation we instead say:

\[ 0 \leq u_i \leq \sum_t \mathit{INSULATIONKWH}_{i,t} \]

You may wonder why this is not expressed with less slack, as

\[ u_i = \sum_t \mathit{INSULATIONKWH}_{i,t} \]

This is because this couples with the flow balance constraint and has the effect that insulation can only be installed if the building is also on the network. This could be fixed by including non-network systems in the flow balance equation, but that creates another problem, that heat from individual systems should not be able to be put *into* a heat network.

The slack here does create a possible odd outcome, where the model buys insulation but does not use it. For example, if insulation had a negative cost, installing it would create value, but not using it would preserve the associated revenue from selling heat. However, under normal combinations of parameters the optimiser will only want to buy insulation when it's going to use it, so this situation doesn't occur.

### 3.4 Individual systems

Individual systems (called *alternatives* in the code) are handled separately from the network model. The use of an individual system to heat a demand location is represented by a single binary decision variable \(\mathit{ALTIN}_{i, t}\) (where \(i \in \mathit{DVTX}\) and \(t \in \mathit{ALT}\)).

This variable is constrained so that \(\mathit{ALTIN}_{i, t}\) can only be 1 for demand location / individual system pairs that the user has marked as legal in the inputs.

The only other constraint applied is then that

\[ 1 = \mathit{DVIN}_i + \sum_t \mathit{ALTIN}_{i,t} \]

This constraint is relaxed for buildings that are not marked as required and have no allowed individual systems; this is arguably a quirk of the user interface, but it allows the user to express questions in which they are uninterested in considering the ins and outs of individual systems.

The cost of individual systems is mostly similar to the cost of heat network supply; however, costs related to the heating system's annual output need to reflect the effect of insulation. Since the quantity of insulation is itself a decision variable, we cannot multiply it by \(\mathit{ALTIN}\) without making a quadratic program, so the demand reduction effect is achieved by adding some constraints and another variable:

We say that \(\mathit{ALTAVOID}_{i,t}\) is the amount of alternative system \(t\)'s *output* that we are going to *avoid* using insulation at demand \(i\). This has to be less than the amount of insulation installed there:

\[ \forall i, t: \mathit{ALTAVOID}_{i, t} \leq \sum_{k \in \mathit{INS}} \mathit{INSULATIONKWH}_{i,k} \]

and we also can't avoid demand in system \(t\) unless we are actually using system \(t\):

\[ \forall i, t: \mathit{ALTAVOID}_{i, t} \leq \mathit{ALTIN}_{i,t} \times M \]

Now we can phrase the cost of alternative systems as:

\[ \sum_{i, t} \mathit{ALTIN}_{i, t} \times \text{base present cost} - \mathit{ALTAVOID}_{i, t} \times \text{unit present cost} \]

where the base present cost reflects is the discounted sum of fixed capital cost, variable capital cost, and unit rate multiplied with the demand *before insulation*, which we know up-front.

Like the slack in the unmet demand constraint, this does allow a situation in which the model purchases insulation but chooses not to use its effect, but again this should be ruled out by sensible sensible parameters (i.e. nonnegative financial costs).

## 4 Parameter updates

In the formalism above there are two sets of parameters – constants, from the point of view of the MILP – which we have referred to but not explained.

These are \(\mathit{LOSSES}\) and \(\mathit{DIVERSITY}\), which represent for each edge in the problem the typical heat losses from a pipe on that edge and the diversity factor for that edge which let us use a smaller pipe than the sum of flows would imply.

As far as we know these values cannot be expressed within the optimisation problem without either making it very non-linear (perhaps quadratic) or adding a very large number of additional binary variables and complex constraints.

Instead of doing one of these, in THERMOS we try to iteratively approximate these values by:

- Making an initial guess for each edge
- Solving the resulting MILP
- Using the solution to produce a better guess
- Updating the MILP with these new guesses, and then going back to step 2.

We stop this process if the solution stops changing, or if we find that we are in a cycle (so guess X gives solution A which leads to guess Y, which gives solution B, which leads back to guess X again).

### 4.1 Finding diversity factors

The diversity (or perhaps more properly coincidence) factor for a pipe in THERMOS is calculated using the rule:

\[ f(n) = 0.62 + 0.38/n \]

So if the pipe is meeting \(n\) demands whose peaks sum to \(d\) the pipe capacity required is taken to be \(f(n) \times d\).

Given a candidate solution, we work out a value of \(n\) for each edge by traversing the proposed network from the supply location and counting up how many demands can reached through each edge.

#### 4.1.1 Preventing invalid configurations

The diversity factor rule is slightly too simple, as it allows an incoherent outcome: consider a Y-shaped junction in a network, with the forks of the Y each feeding a single demand, and the stem being the pipe to/from the supply.

If one of the two demands is much larger than the other, then the diversified capacity for the combined pipe will be *less* than the peak capacity for the larger demand's pipe. This is an invalid result - a larger capacity pipe cannot usefully input into a smaller capacity one.

To prevent this happening, we also determine for each pipe the maximum peak demand of any of the buildings reachable through it. If the simple diversity rule above would result in a capacity below this maximum, then the pipe is sized for the maximum instead.

#### 4.1.2 Initial diversity factors

Since the choice of diversity factor depends on having a solution, we must choose some initial diversity values to parameterise the MILP before we have found any solution. We start with the most *optimistic* diversity value, which is the maximum that could happen in any solution; this is to help the optimiser avoid ruling out the use of a pipe which would be a good choice when allowing for diversity.

### 4.2 Finding heat losses

Like diversity, heat losses cannot be calculated within the optimisation and so are computed afterwards. The heat loss for each pipe is determined from the pipe's required capacity, indirectly from a model fitted between pipe diameter and heat loss rate.

#### 4.2.1 Initial heat losses

The heat losses for each pipe are initialized to the lowest values implied by the lower bound on the power the pipe might deliver if it were used in the network. This is the most optimistic assumption.

## 5 Solution re-evaluation

Once the iteration described earlier is finished, we take the best solution seen so far, fix its parameters to reflect proper diversity and heat loss values, and then re-solve it with all the binary decision variables constrained so that the solution does not change. This solution is the solution used as the result.

This could be improved slightly by doing this to every solution considered, and taking the best under those circumstances.

## 6 Choices of objective

The application user interface displays two choices of objective, *network NPV* and *whole-system NPV*.

These two objectives are implemented in the same way as far as the optimisation is concerned, and changing the objective merely changes how the cost parameters for the optimisation are determined.

In *network NPV* mode, the objective is:

- The sum of the present value of all connections, less
- The present costs for capital and operating expenditures in the network, and
- The present cost for emissions by the network supply

In *whole-system NPV* mode, the objective is:

- The sum of the present costs for capital and operating expenditures within
*and outside*the network, and - The present cost for emissions by network supply
*and individual systems*

In whole-system mode, the revenues to the network operator are not considered, as these are internal to the system boundary.

### 6.1 Market tariff

In network NPV mode, the astute reader may notice something unfair: if you set an emissions cost, the network has to pay it for anyone that it supplies.
However the network may still be *better* (e.g. lower carbon) than the alternatives, so it should be receiving some credit for the improvement, rather than just a cost.

To handle this discrepancy the model has a special *market tariff*. If a building is put on this tariff, then the price it would pay for heat from the network is calculated to *beat* the price it would pay otherwise. This price is determined by:

- The options for individual systems and insulation for the building
- Emissions prices
- A discount rate, period, and a parameter we have called
*stickiness*

We work out the market rate by computing the minimum present cost non-networked option the building has (using the discount rate and period for the market tariff - this can differ from that for the optimisation, because it is modelling consumer behaviour rather than what we want to optimise).

Given this minimum present cost, we then reduce it by the stickiness (so for a 10% stickiness, we have 90% of the minimum present cost), and calculate a unit rate for heat in the building where the present cost to the building of that heat would equal this reduced value. If you want to see this as a formula it is something like

\[ C = \min_{a \in \text{alternatives}, i \in 2^\text{insulation}} \mathit{PC}(a, i) \]

and then finding \(u\) so that

\[ \mathit{PC}(\text{annual cost of $u \times $ demand}) = \text{stickiness} \times C \]

## Footnotes:

^{2}

Monotonically increasing (where an increase is worse) nonlinear functions generally have efficient linear approximations, because a linear program given a piecewise linear approximation will 'use up' the lower (and hence better) pieces before it uses up the worse ones.

Decreasing functions can only be piecewise approximated using more complicated gadgets, because some constraints and extra integer variables are needed to prevent the solution taking 'economies of scale' from the curve when it hasn't gone to the scale needed.