Monte Carlo and the Curse of Dimensionality

We use Monte Carlo integration as a backdrop to explore how introducing probabilistic ideas helps alleviate the curse of dimensionality.

19 March 2025

The curse of dimensionality is easy to understand: volume grows exponentially with dimension. This implies we need exponentially many sample points relative to the number of features before we can say anything meaningful. What isn't so clear is why injecting randomness helps alleviate the issue. We use Monte Carlo integration as a backdrop to investigate this phenomenon.
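To make the blow-up concrete, here is a minimal sketch (the choice of 10 grid points per axis is an arbitrary illustration, not part of the argument):

```python
# With k grid points per axis, a regular grid in n dimensions needs k**n
# evaluations -- exponential in the dimension.
k = 10  # grid points per axis (arbitrary illustrative choice)
for n in [1, 2, 5, 10, 20]:
    print(f"dim n = {n:2d}: grid requires {k**n:.3e} sample points")
```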


We open with a seemingly unrelated lemma. In time, we will see that this lemma is the backbone of the phenomenon.


Lemma. On a bounded domain $(\Omega,\mu)$, for $q\geq p$ we have the inclusion $L^q(\Omega,\mu)\subset L^p(\Omega,\mu)$. In fact, up to a constant $C=C(\mu(\Omega),p,q)$ which is independent of the dimension,

$$\|f\|_p\leq C \|f\|_q.$$


Proof. This is a straightforward application of Hölder's inequality. We begin by setting $s$ as the Hölder conjugate of $\frac{q}{p}$. We compute

\begin{equation} \int_\Omega |f|^p\leq \left(\int_\Omega (|f|^p)^{\frac{q}{p}}\right)^{\frac{p}{q}}\left(\int_\Omega 1^s\right)^{\frac{1}{s}}=\|f\|_q^p \cdot \mu(\Omega)^{\frac{1}{s}}. \end{equation}

Taking $p$-th roots gives the claim.
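As a quick sanity check (my own illustration, not part of the argument), one can verify the inequality numerically. Unwinding the proof gives the explicit constant $C=\mu(\Omega)^{\frac{1}{p}-\frac{1}{q}}$; the domain $[0,2]$, the test function $e^x$, and the exponents below are arbitrary choices.

```python
import numpy as np

# Numerical sanity check of ||f||_p <= C ||f||_q on Omega = [0, 2] with
# Lebesgue measure; f(x) = exp(x), p = 2, q = 4 are arbitrary choices.
p, q, length = 2.0, 4.0, 2.0
x = np.linspace(0.0, length, 400_000, endpoint=False)
f = np.exp(x)

def lp_norm(values, r):
    # ||f||_r = (integral of |f|^r dx)^(1/r), via a left Riemann sum.
    return (np.mean(np.abs(values) ** r) * length) ** (1.0 / r)

C = length ** (1.0 / p - 1.0 / q)  # mu(Omega)^(1/p - 1/q) from the proof
print(lp_norm(f, p), "<=", C * lp_norm(f, q))  # the inequality holds
```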


Actually, for our purposes, we'll need to control only the variance of a random variable $Y$. In this case, we can achieve a (usually) sharper constant of $1$. For this, we only need the convexity of the $q$-norms.


Lemma. For a random variable $Y\in L^q$ with $q\geq 2$, we have

\begin{equation} \mathbb{V}[Y]\leq \|Y\|_q^2. \end{equation}

In other words, the standard deviation of $Y$ is controlled by any of its higher moments.


Proof. This is a simple application of Jensen's inequality. When applied to a nonnegative random variable with the convex function $x\mapsto x^s$ for $s\geq 1$, it states

\begin{equation} \mathbb{E}[Y]\leq \left(\mathbb{E}[Y^s]\right)^{\frac{1}{s}}. \end{equation}

Applying this to $Y^2$ with $s=\frac{q}{2}$, we see

\begin{equation} \begin{split} \mathbb{V}[Y]&= \mathbb{E}[Y^2]-(\mathbb{E}[Y])^2\\ &\leq \mathbb{E}[Y^2]\\ &\leq \left(\mathbb{E}\left[|Y|^{2\cdot \frac{q}{2}}\right]\right)^{\frac{2}{q}}\\ &= \|Y\|_q^2. \end{split} \end{equation}
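Here too a quick empirical check is possible (again my own illustration; the exponential distribution and $q=4$ are arbitrary choices):

```python
import numpy as np

# Empirical check of V[Y] <= ||Y||_q^2 for Y ~ Exp(1) and q = 4 (both
# arbitrary choices). For Exp(1), E[Y^4] = 24, so ||Y||_4^2 = sqrt(24) ~ 4.9,
# comfortably above V[Y] = 1.
rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=1_000_000)
q = 4.0
print(y.var(), "<=", np.mean(y ** q) ** (2.0 / q))
```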


Let's turn to integration. Let $\Omega\subset \R^n$ be a box. Classically, we'd draw evenly spaced grid lines and integrate by sampling at all the grid points. As discussed, this is a bad idea by the curse of dimensionality. Instead, let $\mu$ denote the uniform measure on $\Omega$, and draw $m$ iid samples $X_1,\dots,X_m$ in $\Omega$. (For simplicity, assume $\Omega$ has unit volume, so that $\mu$ agrees with Lebesgue measure; in general, one multiplies the estimator by the volume of $\Omega$.) For convenience, set $X$ as an auxiliary random variable also sampled from $(\Omega,\mu)$. Our first goal is to show the random Riemann sum $\frac{1}{m}\sum_{i=1}^m f(X_i)$ forms an unbiased estimator for the true integral $\int_\Omega f\; dx$. We compute

\begin{equation} \begin{split} \mathbb{E}\left[\frac{1}{m} \sum_{i=1}^m f(X_i)\right] &= \frac{1}{m} \sum_{i=1}^m \mathbb{E}[f(X_i)] \\ &= \frac{1}{m} \sum_{i=1}^m \mathbb{E}[f(X)] \\ &= \mathbb{E}[f(X)]\\ &= \int_{\R^n} f\; \mathbb{1}_\Omega\, dx\\ &= \int_\Omega f\; dx. \end{split} \end{equation}
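In code, the estimator is a one-liner. Below is a minimal sketch under the assumptions above ($\Omega=[0,1]^n$, so the uniform measure is Lebesgue measure); the integrand $\prod_i \cos(x_i)$, whose exact integral is $\sin(1)^n$, is my own test choice:

```python
import numpy as np

def mc_integrate(f, n, m, rng):
    # Draw m iid uniform samples X_1, ..., X_m in [0,1]^n and average f.
    return f(rng.random((m, n))).mean()

rng = np.random.default_rng(0)
f = lambda pts: np.cos(pts).prod(axis=1)  # exact integral over [0,1]^n: sin(1)^n
n, m = 5, 10_000
estimates = [mc_integrate(f, n, m, rng) for _ in range(200)]
# Averaging many independent runs recovers the exact value: the estimator
# is unbiased.
print(np.mean(estimates), "vs exact", np.sin(1.0) ** n)
```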

By the strong law of large numbers, as $m\to \infty$ we get that $\frac{1}{m}\sum_{i=1}^m f(X_i) \to \int_\Omega f$ almost surely. As with all such asymptotic results, this alone is useless in practice; we need information on the rate of convergence. Henceforth, assume $f\in L^q(\Omega)$ for some $q>2$. For example, if $\Omega$ is closed (hence compact) and $f$ is continuous, we automatically get that $f\in L^\infty(\Omega)$. Rates of convergence are measured in the $L^2$ sense (why?), so setting $S_m:=\frac{1}{m}\sum_{i=1}^m f(X_i)$ we see

\begin{equation} \begin{split} \mathbb{E}\left(\frac{1}{m}\sum_{i=1}^m f(X_i)- \int_\Omega f\, dx\right)^2&=\mathbb{E}\left(S_m- \mathbb{E}[S_m] \right)^2\\ &=\mathbb{V}[S_m]\\ &=\frac{1}{m^2}\mathbb{V}\left[\sum_{i=1}^m f(X_i)\right]\\ &=\frac{1}{m}\mathbb{V}[f(X)]\\ &\leq \frac{C(\mu(\Omega),q,2)^2\,\|f(X)\|_q^2}{m}, \end{split} \end{equation}

where the third equality uses the independence of the $X_i$, and the last inequality follows from our initial lemma applied to $Y=f(X)$ with $p=2$. In fact, we could set the constant to be $1$ with our second lemma. Either way, this means the expected error of Monte Carlo integration is dimension independent: taking square roots, it decays at the rate $O(m^{-\frac{1}{2}})$ when measured in an $L^2$ sense.
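The rate is easy to observe empirically. The sketch below (my own illustration, reusing the test integrand from before, with arbitrary choices of $n$ and $m$) estimates the root-mean-square error across dimensions and sample sizes:

```python
import numpy as np

# Observe the O(m^(-1/2)) rate: for each dimension n, multiplying m by 100
# shrinks the RMSE by roughly 10, with no exponential penalty in n.
rng = np.random.default_rng(1)
f = lambda pts: np.cos(pts).prod(axis=1)  # exact integral over [0,1]^n: sin(1)^n

for n in [2, 10, 50]:
    exact = np.sin(1.0) ** n
    for m in [100, 10_000]:
        errs = [f(rng.random((m, n))).mean() - exact for _ in range(100)]
        print(f"n = {n:2d}, m = {m:6d}: RMSE = {np.sqrt(np.mean(np.square(errs))):.2e}")
```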