3 minutes
The Intersection Emptiness Problem for Context-Free Grammars
Introduction
Time for a curveball: the Intersection Emptiness Problem for Context-Free Grammars. We ask: given two (or more) CFGs $G_1, G_2, \dots, G_k$, is
$$ L(G_1) \cap L(G_2) \cap \cdots \cap L(G_k) = \emptyset? $$
Guess what, unlike single-grammar emptiness, this one’s undecidable once you go beyond one grammar. Even for just two CFGs, there’s no general algorithm.
Formal Setup
Given CFGs
$$ G_i = (V_i, Σ, R_i, S_i) \quad\text{for } i = 1,2, $$
define the combined language
$$ L = L(G_1) \cap L(G_2). $$
The Intersection Emptiness Problem asks for the language:
$$ \mathrm{INTERSECT_EMPTY} = {\langle G_1, G_2 \rangle \mid L(G_1) \cap L(G_2) = \emptyset}. $$
A decider here would be a machine $I$ that on input $\langle G_1, G_2\rangle$ outputs 1 if they intersect emptily, and 0 otherwise.
Example Grammars
# Grammar H1:
# S -> 'a' S 'b' | 'c'
# Generates strings with a's then c then b's, like "aaacbb".
# Grammar H2:
# T -> 'a' T 'b' | 'd'
# Generates strings with a's then d then b's, like "aadbb".
- Question: Does L(H1) ∩ L(H2) contain any string? Answer: No, because H1’s middle symbol is ‘c’ and H2’s is ’d’. Intersection is empty.
# Grammar J1:
# S -> 'a' S | 'a'
# Generates {"a", "aa", "aaa", …}.
# Grammar J2:
# T -> T 'a' | 'a'
# Generates the same set {"a", "aa", "aaa", …}.
- Question: Is L(J1) ∩ L(J2) empty? Answer: No, both generate exactly the same infinite set of a’s.
These hand-crafted cases are fine, but there’s no algorithm that handles every pair of CFGs.
Proof Sketch (Reduction from Post Correspondence)
The classic proof uses a reduction from the Post Correspondence Problem (PCP), which is known to be undecidable (we’ll cover PCP in detail later). In brief:
- Start with a PCP instance consisting of two lists of tiles that you must align.
- Construct two CFGs, $G_1$ and $G_2$, whose productions generate sequences of tile indices in ways that correspond to matching the top and bottom sequences respectively.
- One shows that there’s a valid PCP solution iff the intersection $L(G_1) \cap L(G_2)$ is nonempty.
Since PCP is undecidable, so is intersection emptiness for two CFGs.
Why It Matters
- Language Operations: Closure properties for CFGs break down here, intersection of two CFLs need not be a CFL, and checking emptiness of that intersection is impossible.
- Compiler Theory: Combining multiple grammars or constraints can lead you into undecidable territory.
Wrap-Up
Intersection emptiness for CFGs is our first undecidable problem in the CFG world, showing that adding a second grammar flips the script. Next up: the Equivalence Problem for CFGs, which, surprisingly, is decidable!