Resource Balance Analysis


RBA theory

The RBA framework is a constraint-based modeling method. It formalizes, as a set of equalities and inequalities, the constraints that govern resource allocation between cellular processes at a given growth rate $\mu \geq 0$ (i.e. the amount of biomass produced per cell per hour).

I. Cell composition

In the RBA framework, a cell is composed of

$N_m$ (metabolic) cellular processes in the metabolic network (i.e. enzymes, transporters) $\mathbb{ E} \mathrel{\stackrel{\Delta}{=}} (\mathbb{ E}_1, \mathbb{ E}_2,\ldots, \mathbb{ E}_{N_m})$ at the concentrations $E \mathrel{\stackrel{\Delta}{=}} (E_1, E_2,\ldots, E_{N_m})$ and with the fluxes $\nu \mathrel{\stackrel{\Delta}{=}} (\nu_1,\nu_2, \ldots, \nu_{N_m})$;
$N_{p}$ (macromolecular) cellular processes $\mathbb{ M} \mathrel{\stackrel{\Delta}{=}} (\mathbb{ M}_1,\mathbb{ M}_2, \ldots, \mathbb{ M}_{N_p})$ involved in non-metabolic cellular processes at the concentrations $M \mathrel{\stackrel{\Delta}{=}} (M_1,M_2, \ldots, M_{N_p})$;
$N_g$ proteins $\mathbb{ P}_G \mathrel{\stackrel{\Delta}{=}} \{\mathbb{ P}_{G_1},\mathbb{ P}_{G_2},\ldots, \mathbb{ P}_{G_{N_g}}\}$ at the concentrations $P_G \mathrel{\stackrel{\Delta}{=}} (P_{G_1},P_{G_2},\ldots,P_{G_{N_g}})$, for which the activity is not specified;
$N_s$ metabolites $\mathbb{S} \mathrel{\stackrel{\Delta}{=}} (\mathbb{S}_1,\mathbb{S}_2, \ldots, \mathbb{S}_{N_s})$ at the concentrations $S \mathrel{\stackrel{\Delta}{=}} (S_1,S_2, \ldots, S_{N_s})^T$. Among the set $\mathbb{S}$, we distinguish a subset $\mathbb{B} \mathrel{\stackrel{\Delta}{=}} (\mathbb{B}_1,\mathbb{B}_2, \ldots, \mathbb{B}_{N_b})$ of metabolites which have fixed concentrations $\bar{B} \mathrel{\stackrel{\Delta}{=}} (\bar{B}_1,\bar{B}_2, \ldots, \bar{B}_{N_b})^T$.

II. Three main sets of constraints govern resource allocation

At the rate $\mu \geq 0$:

($C_{1}$): the metabolic network must produce all the metabolic precursors required for biomass production while satisfying mass conservation: $$ - \Omega \nu + \mu(C^{S}_Y Y + C^{S}_B \bar{B} + C^{S}_G P_G) = 0 $$ where $Y = (E,M)$ is the vector of concentrations of cellular processes, of size $N_y = N_m + N_p$ (for a bacterium, $Y$ contains a few hundred variables), and

$\Omega$ is the stoichiometry matrix of the metabolic network of size $N_s \times N_m$, where $\Omega_{ij}$ corresponds to the stoichiometry of metabolite $\mathbb{S}_i$ in the $j$-th enzymatic reaction;
$C^{S}_Y $ (resp. $C^{S}_G $) is an $N_s \times N_y$ (resp. $N_s \times N_g $) matrix where each coefficient $C^{S}_{Y_{ij}} $ (resp. $C^{S}_{G_{ij}}$) corresponds to the number of molecules of metabolite $\mathbb{ S}_i$ consumed (or produced) for the synthesis of one cellular process $\mathbb{Y}_j$ (resp. one protein $\mathbb{ P}_{G_j}$); the coefficient is positive, negative or null if $\mathbb{ S}_i$ is produced, consumed or not involved in this synthesis, respectively;
$C^{S}_B $ is an $N_s \times N_b $ matrix where each coefficient $C^{S}_{B_{ij}} $ corresponds to the number of molecules of metabolite $\mathbb{S}_i$ consumed (or produced) for the synthesis of one $\mathbb{B}_j$.
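As a small numerical illustration of ($C_{1}$), the sketch below evaluates the left-hand side of the mass-conservation equation with NumPy. All matrices and concentrations are invented toy values (2 metabolites, 2 reactions, 3 cellular processes, 1 fixed metabolite, 1 protein of unspecified activity), chosen only so that the equation balances; the function name `mass_balance_residual` is hypothetical and not part of any RBA software.

```python
import numpy as np


def mass_balance_residual(mu, Omega, nu, C_S_Y, Y, C_S_B, B_bar, C_S_G, P_G):
    """Left-hand side of (C1); a zero vector means the balance holds."""
    return -Omega @ nu + mu * (C_S_Y @ Y + C_S_B @ B_bar + C_S_G @ P_G)


# Invented toy values: N_s = 2, N_m = 2, N_y = 3, N_b = 1, N_g = 1.
Omega = np.array([[1.0, -1.0],
                  [0.0, 1.0]])            # stoichiometry, N_s x N_m
nu = np.array([3.0, 1.0])                 # metabolic fluxes
C_S_Y = np.array([[0.5, 0.5, 1.0],
                  [0.0, 0.0, 0.5]])       # N_s x N_y
Y = np.array([1.0, 1.0, 2.0])             # concentrations of cellular processes
C_S_B = np.array([[1.0], [0.0]])          # N_s x N_b
B_bar = np.array([0.5])                   # fixed metabolite concentration
C_S_G = np.array([[0.25], [0.5]])         # N_s x N_g
P_G = np.array([2.0])                     # protein of unspecified activity

residual = mass_balance_residual(0.5, Omega, nu, C_S_Y, Y, C_S_B, B_bar, C_S_G, P_G)
print(residual)
```

With these numbers, $\Omega \nu = (2, 1)^T$ and $\mu$ times the bracketed term is also $(2, 1)^T$, so the residual vanishes. Changing $\mu$ while keeping $\nu$ fixed breaks the balance, which is why fluxes and concentrations have to be recomputed at each growth rate.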

($C_{2}$): the capacity of each cellular process must be sufficient to ensure its function, i.e. to catalyze chemical conversions at a sufficient rate;

For the (macromolecular) cellular processes involved in non-metabolic processes: $$ \mu (C^{M}_Y Y+ C^{M}_G P_G) - K_T Y \leq 0 $$ For the (metabolic) cellular processes (enzymes, transporters) involved in metabolic processes: $$ -K^{'}_{E} Y \leq \nu \leq K_{E} Y $$ where

$K_T$ is a diagonal matrix of size $N_p \times N_p $ where each coefficient $K_{T_i}$ is positive and corresponds to the efficiency of the cellular process $\mathbb{M}_i$, i.e. the rate of the process per amount of the catalyzing molecular machine; likewise, $K_E$ and $K^{'}_E$ are diagonal matrices of size $N_m \times N_m$ where each coefficient $K_{E_i}$ (resp. $K^{'}_{E_i}$) is positive and corresponds to the efficiency of the enzyme $\mathbb{E}_i$ in the forward (resp. backward) direction;
$C^{M}_Y $ (resp. $C^{M}_G $) is an $N_{p} \times N_y $ (resp. $N_{p} \times N_g $) matrix where each coefficient $C^{M}_{Y_{ij}} $ (resp. $C^{M}_{G_{ij}}$) typically corresponds to the length in amino acids of the machine $\mathbb{Y}_j$ (resp. the protein $\mathbb{ P}_{G_j}$). In some cases (for instance for the constraints on protein chaperoning), the length in amino acids can be multiplied by a coefficient, such as the fraction of the whole proteome that requires chaperoning.

The key idea of RBA is to introduce a coefficient of efficiency (matrices $ K_T$, $ K_E$, $ K^{'}_E$) relating the flux of material produced by a cellular process to its concentration.
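The capacity constraints ($C_{2}$) can be checked numerically for a candidate cell configuration. The sketch below is an illustration with invented numbers (one enzyme, one macromolecular process, one protein of unspecified activity), not an excerpt from RBA software. It assumes that $K_T$ acts on the $M$ entries of $Y$ and that $K_E$, $K^{'}_E$ act on the $E$ entries, which is the reading consistent with the stated matrix sizes; the function name `capacity_slacks` is hypothetical.

```python
import numpy as np


def capacity_slacks(mu, E, M, nu, C_M_Y, C_M_G, P_G, k_T, k_E, k_E_back):
    """Return the slacks of (C2); every entry must be >= 0 for feasibility.

    k_T, k_E and k_E_back hold the diagonals of K_T, K_E and K'_E.
    Assumption: K_T multiplies M and K_E, K'_E multiply E, with Y = (E, M).
    """
    Y = np.concatenate([E, M])
    demand = mu * (C_M_Y @ Y + C_M_G @ P_G)   # work required from each process
    slack_process = k_T * M - demand          # K_T M - mu (C_M_Y Y + C_M_G P_G)
    slack_forward = k_E * E - nu              # K_E E - nu
    slack_backward = nu + k_E_back * E        # nu + K'_E E
    return slack_process, slack_forward, slack_backward


# Invented toy values: lengths of 1 and 2 units for the enzyme and the machine.
E = np.array([0.3])
M = np.array([0.75])
nu = np.array([2.8])
slack_process, slack_forward, slack_backward = capacity_slacks(
    mu=1.0, E=E, M=M, nu=nu,
    C_M_Y=np.array([[1.0, 2.0]]), C_M_G=np.array([[1.0]]), P_G=np.array([1.0]),
    k_T=np.array([4.0]), k_E=np.array([10.0]), k_E_back=np.array([10.0]),
)
print(slack_process, slack_forward, slack_backward)
```

In this toy case the macromolecular process can deliver $4 \times 0.75 = 3$ units per hour against a demand of $2.8$, and the enzyme can carry a flux of up to $3$ against $\nu = 2.8$, so all slacks are nonnegative and the configuration respects ($C_{2}$).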

($C_{3}$): the intracellular density of compartments and the occupancies of membranes are limited. $$ C^{D}_Y Y + C^{D}_G P_G- \bar{D} \leq 0 $$ where

$\bar{D}$ is a vector of size $N^c$, where $N^c$ is the number of volumes and surface areas for which density constraints are considered. $\bar{D}^i$ is the density of molecular entities with respect to the volume or surface area. Densities are typically expressed as a number of amino-acid residues per unit volume or surface area.
$C^{D}_Y $ (resp. $C^{D}_G $) is a $N^c \times N_y $ (resp. $N^c \times N_g $) matrix where each coefficient $C^{D}_{Y_{ij}}$ corresponds to the density of one cellular process $\mathbb{Y}_j$ (resp. $\mathbb{ P}_{G_j}$) in the compartment $i$. By construction, we have one unique localization per cellular process.
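Because each cellular process has a unique localization, every column of $C^{D}_Y$ has a single nonzero entry. The sketch below builds such a matrix from a compartment index and a per-machine weight, then evaluates ($C_{3}$). It is a minimal illustration: the weights are taken to be lengths in amino-acid residues (an assumption in line with the units mentioned above), all numbers are invented, and `density_matrix` is a hypothetical helper name.

```python
import numpy as np


def density_matrix(location, weight, n_compartments):
    """Build a density matrix with weight[j] in row location[j] of column j."""
    location = np.asarray(location)
    C = np.zeros((n_compartments, location.size))
    C[location, np.arange(location.size)] = weight
    return C


# Invented toy: compartment 0 = cytoplasm, compartment 1 = membrane.
C_D_Y = density_matrix(location=[0, 0, 1], weight=[300.0, 7000.0, 450.0],
                       n_compartments=2)
C_D_G = density_matrix(location=[0], weight=[200.0], n_compartments=2)

Y = np.array([2.0, 0.1, 1.0])        # concentrations of the three machines
P_G = np.array([0.5])                # protein of unspecified activity
D_bar = np.array([1500.0, 500.0])    # density limits per compartment

occupancy = C_D_Y @ Y + C_D_G @ P_G
slack = D_bar - occupancy            # (C3) holds when every entry is >= 0
print(occupancy, slack)
```

Here the cytoplasm holds $300 \times 2 + 7000 \times 0.1 + 200 \times 0.5 = 1400$ residues against a limit of $1500$, and the membrane holds $450$ against $500$, so ($C_{3}$) is satisfied in both compartments.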

III. RBA optimization problem in steady state

Taken together, these equalities and inequalities define, at a given rate $\mu$, a linear programming (LP) feasibility problem that can be solved efficiently. For a fixed vector of concentrations $P_{G} \in \mathbb{R}_{>0}^{N_g}$ and a growth rate $\mu\geq 0$, $$ \begin{array}{ll} \mbox{find } & \quad Y \in \mathbb{R}_{>0}^{N_y}, \nu \in \mathbb{R}^{N_m}, \\[0pt] \mbox{subject to } \\ (C_{1}) & - \Omega \nu + \mu(C^{S}_Y Y + C^{S}_B \bar{B} + C^{S}_G P_G) = 0 \\[6pt] (C_{2a}) & \mu (C^{M}_Y Y+ C^{M}_G P_G) - K_T Y \leq 0 \\[6pt] (C_{2b}) & -K^{'}_{E} Y \leq \nu \leq K_{E} Y \\[6pt] (C_{3}) & C^{D}_Y Y + C^{D}_G P_G- \bar{D} \leq 0 \end{array} $$ where all inequalities hold componentwise.

The existence of a feasible solution to this problem means that the cell is able to grow at the rate $\mu$. More generally, for a given medium, constraints $C_{1}$, $C_{2}$, and $C_{3}$ define the (possibly empty) convex set of all feasible cell configurations that ensure either cell viability only (when $\mu=0$) or growth at the given rate (when $\mu>0$).
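The feasibility problem can be handed to any LP solver. The sketch below is one possible assembly using `scipy.optimize.linprog` with a zero objective; it is an illustration under stated assumptions, not the implementation used by RBA software. The assumptions are: the unknowns are stacked as $x = (E, M, \nu)$; $K_T$ acts on the $M$ entries of $Y$ and $K_E$, $K^{'}_E$ on the $E$ entries; the strict positivity of $Y$ is relaxed to $Y \geq 0$; and the toy model (one metabolite, one enzyme, one macromolecular process, no fixed-concentration metabolite) uses invented numbers. The function name `rba_feasibility` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def rba_feasibility(mu, Omega, C_S_Y, C_S_B, C_S_G, C_M_Y, C_M_G,
                    C_D_Y, C_D_G, k_T, k_E, k_E_back, B_bar, P_G, D_bar):
    """Solve the feasibility LP at growth rate mu with unknowns x = [E, M, nu].

    Returns (feasible, Y, nu); Y and nu are None when no solution exists.
    """
    n_m, n_p = len(k_E), len(k_T)
    n_y = n_m + n_p

    # (C1): mu C_S_Y Y - Omega nu = -mu (C_S_B B_bar + C_S_G P_G)
    A_eq = np.hstack([mu * C_S_Y, -Omega])
    b_eq = -mu * (C_S_B @ B_bar + C_S_G @ P_G)

    # Assumption: K_T acts on the M block of Y, K_E and K'_E on the E block.
    K_T_on_Y = np.hstack([np.zeros((n_p, n_m)), np.diag(k_T)])
    K_E_on_Y = np.hstack([np.diag(k_E), np.zeros((n_m, n_p))])
    K_Eb_on_Y = np.hstack([np.diag(k_E_back), np.zeros((n_m, n_p))])
    eye = np.eye(n_m)

    A_ub = np.vstack([
        np.hstack([mu * C_M_Y - K_T_on_Y, np.zeros((n_p, n_m))]),  # (C2a)
        np.hstack([-K_E_on_Y, eye]),                               # nu <= K_E E
        np.hstack([-K_Eb_on_Y, -eye]),                             # -K'_E E <= nu
        np.hstack([C_D_Y, np.zeros((C_D_Y.shape[0], n_m))]),       # (C3)
    ])
    b_ub = np.concatenate([-mu * (C_M_G @ P_G), np.zeros(n_m),
                           np.zeros(n_m), D_bar - C_D_G @ P_G])

    # Y >= 0 (relaxation of Y > 0); fluxes nu are free in sign.
    bounds = [(0, None)] * n_y + [(None, None)] * n_m
    res = linprog(np.zeros(n_y + n_m), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    if res.status != 0:
        return False, None, None
    return True, res.x[:n_y], res.x[n_y:]


# Invented toy model: 1 metabolite, 1 enzyme, 1 macromolecular process.
toy = dict(
    Omega=np.array([[1.0]]),
    C_S_Y=np.array([[1.0, 2.0]]), C_S_B=np.zeros((1, 0)), C_S_G=np.array([[1.0]]),
    C_M_Y=np.array([[1.0, 2.0]]), C_M_G=np.array([[1.0]]),
    C_D_Y=np.array([[1.0, 2.0]]), C_D_G=np.array([[1.0]]),
    k_T=np.array([4.0]), k_E=np.array([10.0]), k_E_back=np.array([10.0]),
    B_bar=np.zeros(0), P_G=np.array([1.0]), D_bar=np.array([4.0]),
)

ok_slow, Y, nu = rba_feasibility(1.0, **toy)
ok_fast, _, _ = rba_feasibility(1.5, **toy)
print(ok_slow, ok_fast)
```

For this toy model the constraints can be solved by hand: a solution exists exactly when $\mu \leq 1.25$, so the LP is feasible at $\mu = 1.0$ and infeasible at $\mu = 1.5$. The zero objective makes the solver return any feasible point; a real model may add a secondary objective to select among feasible configurations.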

In practice, the RBA model includes:

one constraint per considered (macromolecular) cellular process in $C_{2a}$: replication, transcription, translation, chaperoning, etc.
one constraint per considered compartment in $C_3$: cytoplasm, periplasm, cell membrane, cell inner/outer membrane, etc.

The processes and compartments to be included, and the nature of the associated constraints (mathematical formulation as an equality or inequality), depend on the organism and are part of the modelling choices to be made. However, keeping the final optimization problem tractable is critical: non-convex mathematical formulations of the constraints should be avoided as far as possible.

IV. The assumption of parsimonious resource allocation

Parsimonious resource allocation between cellular processes is modelled mathematically by maximizing the cell growth rate, which is computed by solving a series of such LP feasibility problems for different growth rate values. At the optimal value $\mu^{*}$, RBA returns the cell configuration maximizing growth, i.e. the concentrations of molecular machines $Y^{*}$ and the metabolic fluxes $\nu^{*}$.
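One simple way to organize this series of feasibility problems is a bisection on $\mu$. The text does not state which search strategy is used, so bisection is an assumption here; it also relies on the problem being feasible below $\mu^{*}$ and infeasible above it. The sketch below applies it to an invented toy model (one metabolite, one enzyme, one macromolecular process, with $K_T$ acting on $M$ and $K_E$ on $E$), solved with `scipy.optimize.linprog`. The names `toy_feasible` and `max_growth_rate` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def toy_feasible(mu, k_E=10.0, k_T=4.0, P_G=1.0, D_bar=4.0):
    """LP feasibility of an invented toy model; unknowns x = [E, M, nu]."""
    A_eq = np.array([[mu * 1.0, mu * 2.0, -1.0]])    # (C1) with Omega = [[1]]
    b_eq = np.array([-mu * P_G])
    A_ub = np.array([
        [mu * 1.0, mu * 2.0 - k_T, 0.0],             # (C2a)
        [-k_E, 0.0, 1.0],                            # (C2b) nu <= k_E E
        [-k_E, 0.0, -1.0],                           # (C2b) -k_E E <= nu
        [1.0, 2.0, 0.0],                             # (C3)
    ])
    b_ub = np.array([-mu * P_G, 0.0, 0.0, D_bar - P_G])
    bounds = [(0, None), (0, None), (None, None)]    # E, M >= 0; nu free
    res = linprog(np.zeros(3), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.status == 0


def max_growth_rate(is_feasible, mu_start=1.0, tol=1e-6, max_doublings=60):
    """Largest feasible growth rate by bracketing then bisection.

    Assumes feasibility is monotone in mu. Returns None if mu = 0 is infeasible.
    """
    if not is_feasible(0.0):
        return None
    lo, hi = 0.0, mu_start
    for _ in range(max_doublings):
        if not is_feasible(hi):
            break
        lo, hi = hi, 2.0 * hi                        # grow the bracket
    else:
        raise RuntimeError("no infeasible growth rate found")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


mu_star = max_growth_rate(toy_feasible)
print(mu_star)
```

For these toy numbers the constraints can be solved by hand and give $\mu^{*} = 1.25$, which the bisection recovers up to the tolerance `tol`. Each bisection step costs one LP solve, so the number of LPs grows only logarithmically with the requested precision.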