A Geometer's Derivation of the Graph Laplacian

The graph Laplacian is central in spectral graph theory, but its link to the standard Laplacian is often unclear. We show both represent the difference between a function and its local average.

26 October 2024

Introduction. This is essentially one long response to the MathOverflow question: Intuitively, what does the graph Laplacian represent? The standard way of generalizing the Laplacian

Δf:=div(f)\begin{equation} \Delta f:= \text{div}(\nabla f) \end{equation}

to the discrete case of graphs is to interpret the gradient of a function as a difference between function values on nodes sharing a common edge. Following this line of thought allows you to exploit the link between the Laplacian and energy minimization through à la Euler-Lagrange. This leads to the idea of a Laplacian quadratic form.

 \text{ }

In this article we take a different approach. Our new strategy is to develop a purely continuous heuristic of Δ\Delta involving no derivatives, then to import that idea the graph setting to define the graphical Laplacian. As a small technicality, we're going to generalize the negative Laplacian, namely the negative of equation (1)(1).

 \text{ }

Heuristic: The negative Laplacian measures the difference between a function value and its local average value.

 \text{ }

Heat equation Intuition. To extract this heuristic from Δ\Delta, let’s explore why the Laplacian appears in the heat equation. We say u(x,t)u(x,t), subject to the initial condition u(x,0)=f(x)u(x,0)=f(x) solves the heat equation if

Δu(x,t)=tu.\begin{equation} \Delta u(x,t)= \partial_t u. \end{equation}

In the above, f is interpreted as an initial heat distribution, and the solution uu is interpreted as the time evolution for ff. How does heat spread with time? The basic intuition is that heat averages out by comparison to its neighbors. If a point pp on is hotter (cooler) than its neighbors, then the point pp gets cooler (hotter) in the future. This intuition combined with equation (2)(2) suggests that as time passes, the Laplacian performs an averaging process on uu.

 \text{ }

Formalism. Let's justify this intuition with quick Taylor expansion. In one-dimension (the higher dimensional cases follow similarly), the Laplacian is the second derivative. Consider the Taylor expansion of the function uu around the point pp. That is for xBr(p)x\in B_r(p), we have the expansion

u(p+x)u(p)+u(p)x+12u(p)x2.\begin{equation} u(p+x) \approx u(p)+u'(p)x+\frac{1}{2} u''(p)x^2. \end{equation}

Here, Br(p):=[pr,p+r]B_r(p):=[p-r,p+r] and r>0r>0 is a small parameter. The average of uu in this interval is defined as

uˉ:=1BrBr(p)u(x)dx,\begin{equation} \bar{u}:=\frac{1}{|B_r|}\int_{B_r(p)}u(x)dx, \end{equation}

where Br:=length(Br)=2r|B_r|:=\text{length}(B_r)=2r. Integrating the Taylor expansion of uu over the ball and dividing Br|B_r| gives the Taylor expansion of uˉ\bar{u}. In detail,

uˉ=1Brrru(p+x)dx1Br(rru(p)dx+rru(p)xdx+12rru(p)x2dx).\begin{equation} \begin{split} \bar{u} &= \frac{1}{|B_r|}\int_{-r}^{r}u(p+x)dx\\ &\approx \frac{1}{|B_r|}\bigg(\int_{-r}^{r}u(p)dx+\int_{-r}^{r} u'(p)xdx + \frac{1}{2}\int_{-r}^{r}u''(p)x^2dx \bigg). \end{split} \end{equation}

Since u(p),u(p),u(p)Ru(p), u'(p), u''(p)\in \mathbb{R} are independent of xx, they can be factored out of their corresponding integral. Furthermore, since xx is an odd function, the second term vanishes since the domain is symmetric about 00. Therefore,

uˉu(p)Brrrdx+u(p)2Brrrx2dx=u(p)+u(p)6r2\begin{equation} \begin{split} \bar{u} &\approx\frac{u(p)}{|B_r|}\int_{-r}^rdx+\frac{u''(p)}{2|B_r|}\int_{-r}^r x^2dx\\ &= u(p)+\frac{u''(p)}{6}r^2 \end{split} \end{equation}

Rewriting this, we arrive at the aforementioned heuristic

6Br(uˉu(p))u(p).\begin{equation} \frac{6}{|B_r|}(\bar{u}-u(p))\approx u''(p). \end{equation}

Again, this interpretation of the Laplacian is that involves no derivatives, making it amenable for generalizing to the discrete case.

 \text{ }

Graph Laplacian. Now for the graph Laplacian. In the spirit of Trevisan, we focus on the case of a dd-regular graph. Fix a dd-regular graph (V,E)(V,E), and let AA be its adjacency matrix and II the identity matrix. We write column vectors as x=(x1,...,xV)Tx=(x^1,...,x^{|V|})^T, with superscripts as indices.

 \text{ }

We briefly recall the underlying vector space AA acts on. Since its size is V2|V|^2, it acts on the vector space RV:=Functions(V,R)\mathbb{R}^V := \text{Functions}(V,\mathbb{R}), the free vector space on VV. Vertices xVx\in V are in 1:11:1 correspondence to a canonical basis element of RV\mathbb{R}^V, namely the indicator functions δx\delta_x. This justifies conflating a vector xx with its corresponding function x:VR.x: V\to \mathbb{R}. Moving on, we compute

(Ax)i=jVAjixj=jN(i)xj,\begin{equation} (Ax)^i = \sum\limits_{j\in V} A^i_j x^j = \sum\limits_{j\in N(i)} x^j, \end{equation}

where N(i)N(i) denotes the neighbors of the vertex ii. The last equality follows by the definition of the adjacency matrix, which is defined as the indicator matrix on EE. In English, this computation tells us after matrix multiplication with the adjacency matrix, the resulting ithi^{th} component contains the sum over its neighbors. This is awfully close to averaging the function value of ii over its neighbors. In fact, dividing out this number by the degree d (:=# of neighbors)d\text{ } (:= \# \text{ of neighbors}) turns the result into a genuine average. Thus, the mapping xAdxx\mapsto \frac{A}{d}x is analogous to uuˉu\mapsto \bar{u}.

 \text{ }

Out analogy is close to full bloom. Since Ix=xIx=x, multiplying by the identity matrix recovers the function value (for each component). So as advertised, matrix multiplication by IAdI-\frac{A}{d} on a function (vector) gives the difference between the function value and its local average.