Combinatorics #2: An introduction to extremal graph theory

In this post, we will address to problems of the type: “At most how many edges can a graph G have without containing a certain subgraph?”, which belong to the field of extremal graph theory. First, we establish a classical result:

(Turán’s Theorem) If a graph G(V,E) on n vertices has no K_{r} (a complete graph with r vertices) as a subgraph, then it has at most \frac{n^2}{2} \cdot \left(1- \frac{1}{r-1}\right) edges.

First proof. Consider a given graph with n vertices and m edges. Label the vertices 1 through n randomly, so that each of the n! possibilities has the same probability. Call a vertex v independent if, for each vertex u not connected to it, the label of v is greater than that of u. Consider the set S of independent vertices. Suppose there exist two vertices u, v \in S such that u and v are not connected by an edge. Then the label of one of them is greater than the other, a contradiction since both are independent. Hence S is a complete subgraph of G. Notice that the probability that a given vertex v, with degree d(v), is independent is simply \frac{1}{n-d(v)}, since it corresponds to the probability that v has the greatest labeling among the n-d(v) vertices not connected to it (including itself). Thus,

 \displaystyle \mathbb{E}[|S|] = \sum_{v \in V} \mathbb{E}[v \in S] = \sum_{v \in V} \mathbb{P}[v \in S] = \sum_{v \in V} \frac{1}{n- d(v)}.

Hence, there exists some random labeling for which |S| \ge \sum_{v \in V} \frac{1}{n-d(v)}. However, we know that r-1 \ge |S|, so the Cauchy-Schwarz inequality gives

\displaystyle r-1 \ge \sum_{v \in V} \frac{1}{n-d(v)} \ge \frac{n^2}{n^2 - 2|E|} \Longrightarrow |E| \le \frac{n^2}{2} \cdot \left(1-\frac{1}{r-1}\right),

as desired. {\blacksquare}

However, this proof does not provide a construction for the equality case, so we investigate another approach.

Second proof. Consider a graph G on n vertices with the maximal number of edges without any K_{r} as a subgraph, and define the relation \sim on the vertex set defined by

 u \sim v \iff uv \not \in E.

We will show that \sim is an equivalence relation. Of course, it is reflexive and symmetric; we will show that it is transitive. Indeed, suppose there exist u,v,w such that u \sim v, v \sim w and u \not \sim w, i.e. only uw is an edge.

If g(v) < g(u), then we may replace v by a copy of u (that is, we erase v and all edges incident to it, and create a new vertex u' incident to all neighbors of u). Note that no K_{r}‘s are formed in this process (why?), and the number of edges strictly increases, contradicting the maximality of G. Hence, g(v) \ge g(u), and by symmetry, g(v) \ge g(w). In this case, we replace u and w by two copies of v. Again, no K_{r}‘s are formed in this process, and the number of edges increases by at least 2g(v) - g(u) - g(w) + 1 > 0 (we add one for the edge uw is counted twice), again contradicting the maximality of G.

Thus \sim is an equivalence relation, and partitions V into equivalence classes V_1, V_2, \cdots V_m, with the property that vertices in the same equivalence class are not connected, and vertices in different equivalence classes are connected. If m \ge r, then taking one vertex from each equivalence class would yield a K_r, thus m \le r-1, and allowing empty classes, we may set m = r-1. Suppose the vertices are not as equally distributed as possible, i.e. |V_i| \ge |V_j| + 2 for some i,j. Then passing one vertex from V_i to V_j, the number of edges increases, contradicting the maximality of G. This provides the optimal construction (a complete r-1-partite graph) and finishes the proof, as the number of edges is readily estimated. \blacksquare

Now, we turn our attention to another interesting problem: we want to estimate the largest possible number of edges, which we shall denote by f(n,t), that we can select from an n,n-bipartite graph, without containing any K_{t,t} – that is, a complete t,t-bipartite subgraph, where t< n.

First, we search for lower bounds on f(n,t). That is, we want to prove that there exists an n,n-bipartite graph with a certain number of edges not containing a K_{t,t}. For this, we employ the probabilistic method.

Select each edge randomly with probability p, to be defined later; let X denote the number of K_{t,t}‘s formed. Note that

\mathbb{E}[X] = \binom{n}{t}^2 \cdot p^{t^2},

since the probability of any given t,t bipartite subgraph being complete is p^{t^2}. Further, the expected number of edges formed is clearly pn^2. We remove one edge from each of the X \ K_{t,t}‘s, thus we are left with an expected number of pn^2 - \binom{n}{t}^2 \cdot p^{t^2} edges without forming K_{t,t}‘s. In particular, there exists some outcome with at least this many edges, i.e.

f(n,t) \ge pn^2 - \binom{n}{t}^2 \cdot p^{t^2}.

We take p so as to maximize the right-hand side, i.e. p^{t^2-1} = \left(\frac{n}{t}\right)^2 \binom{n}{t}^{-2}. Substituting back gives

\begin{aligned} f(n,t) & \ge \left(\frac{n}{t}\right)^{2/(t^2-1)} \binom{n}{t}^{-2/(t^2-1)} \cdot \left[n^2 - \left(\frac{n}{t}\right)^2\right] \\ & > \left(\frac{n}{t}\right)^{(2-2t)/(t^2-1)} e^{-2t/(t^2-1)} \cdot \left[n^2 - \left(\frac{n}{t}\right)^2\right] \\ & = n^{2-2/(t+1)} \cdot t^{2/(t+1)} \cdot e^{-2t/(t^2-1)}\cdot \left(1-\frac{1}{t^2}\right). \end{aligned}

Hence, f(n,t) = \Omega(n^{2-2/(t+1)}).

Now, we look for an upper bound on f(n,t). Suppose a bipartite graph with classes (V_1, V_2), each with n vertices, has no K_{t,t} as a subgraph. Then if m denotes the number of edges and d(v) denotes the degree of a vertex v, we have

\displaystyle \sum_{v \in V_1} \binom{d(v)}{t} \le (t-1) \binom{n}{t}.

For if the inequality would not hold, there would be a set of t points in V_2 counted t times, thus forming a K_{t,t}. Simple bounding yields

n \displaystyle \cdot\left(\frac{m}{nt}\right)^t\le n\cdot\binom{m/n}{t}\le \sum_{v\in V_1}\binom{d(v)}{t}\le (t-1)\binom{n}{t}\le (t-1)\left(\frac{ne}{t}\right)^t,

or, after rearranging, f(n,t) \le e(t-1)^{1/t} \cdot n^{2-1/t}. Hence f(n,t) = O(n^{2-1/t}). For small values of t, i.e. t \in \{2,3\}, this is known to be the actual best exponent [2], and constructions exist using finite projective planes. However, it is not known in general whether this is actually the best bound.

When t is not very small when compared to n, however, we can get better estimations.  Consider a complete K_{n,n}, and our goal is to remove as few edges as possible so that there are no copies of K_{t,t}. Color the removed edges red. For any set S of t vertices in the left half of the graph, we need a red edge with each of the t vertices subsets of the right half. Hence, by the pigeonhole principle, there are at least n-t+1 red edges with endpoints in S. Summing over all choices of S, and noting that each red edge has an endpoint in the right half counted \binom{n-1}{t-1} times, we have at least

\frac{\binom{n}{t} (n-t+1)}{\binom{n-1}{t-1}} = \frac{n}{t} (n-t+1)

red edges. Subtracting gives f(n,t) \le n^2 - \frac{n}{t}(n-t+1), which, despite being an O(n^2) bound, gives a much better estimate for most values of n, t. In fact, if t \ge \frac{n+1}{2}, we may give an explicit optimal construction. Choose S to be the left half minus its n-t vertices of highest red-degree. Since S has at least n-t+1 red edges, it has a vertex of nonzero degree, hence all other n-t vertices have degree at least one, i.e. they contain at least n-t red edges, for a total of (n-t+1) + (n-t) = 2n - 2t + 1 red edges. This is sharp for t \ge \frac{n+1}{2} – the construction is simply to remove 2n-2t+1 disjoint edges (i.e., edges without common vertices) from the set, which is possible given that n \ge 2n-2t+1.

We now address the problem of forbidden even cycles in bipartite graphs. Let C(n, \ell) denote the greatest possible number of edges in an n,n-bipartite graph with no cycles of length 2 \ell – such cycles we denote by C_{2\ell}.

Again, we start with a lower bound using the probabilistic method. We choose each edge with probability p, and let \mathcal{C} denote the number of cycles of length 2 \ell formed. Note that for each set of \ell points on the left half and \ell points on the right half, there are \frac{\ell! (\ell - 1)!}{2} possible such cycles, and the probability of each being a cycle is p^{2 \ell}. Hence, we get

\displaystyle \mathbb{E}[\mathcal{C}] = \sum p^{2 \ell} \cdot \frac{\ell ! (\ell - 1)!}{2} = \binom{n}{\ell}^2 p^{2 \ell} \cdot \frac{\ell ! (\ell - 1)!}{2}.

Removing one edge from each cycle, we get a lower bound of pn^2 - \binom{n}{\ell}^2 p^{2 \ell} \cdot \frac{\ell! (\ell - 1)!}{2}. Taking the optimal p and substituting back gives

\displaystyle C(n, \ell) \ge n^{1 + \frac{1}{2\ell - 1} }\cdot \left(1 - \frac{1}{2 \ell}\right).

On the other hand, upper bounds are more difficult; in [4] it is proven that C(n, \ell) = O(n^{1+ 1/\ell}). For \ell \in \{2,3,5\}, this is known to be sharp, but otherwise it is generally not known whether there exists a better bound.

References:

[1] C. Shine, Curso de Combinatória, Aula 11; available at http://poti.impa.br/index.php/material/

[2] http://en.wikipedia.org/wiki/Zarankiewicz_problem

[3] http://math.mit.edu/~fox/MAT307-lecture09-10.pdf

[4] http://www.sciencedirect.com/science/article/pii/0095895674900525#

Special thanks to hyperbolictangent, who made significant progress on the f(n,t) problem, specially for the construction.

Create your own exercises! Think about some other nice graph to forbid and see how at most how many edges you can get 🙂

This entry was posted in Combinatorics, Introduction, Math, Olympaid Math and tagged , , . Bookmark the permalink.

Leave a comment