2 minutes
The Finiteness Problem for Context-Free Grammars
Introduction
Hey everyone! We’ve seen how to check if a context-free grammar (CFG) produces any strings. Now let’s ask a bit more: does it produce a finite number of strings, or is the language infinite? Good news: this one’s also decidable, using a nifty trick involving grammar cycles.
Formal Setup
Recall a CFG
$$ G = (V, Σ, R, S) $$
with language
$$ L(G) = {,w \in Σ^* \mid S \xRightarrow{*} w}. $$
The Finiteness Problem asks whether |L(G)| is finite or infinite.
Example Grammars
# Grammar G1:
# S -> 'a' S | 'b'
# This generates {"b", "ab", "aab", ...}, which is infinite.
# Grammar G2:
# S -> 'a' 'b' | 'b' 'a'
# Only two strings {"ab","ba"}, so finite.
Algorithm (Detecting Infinity via Graphs)
Construct a directed graph $D$ whose vertices are the nonterminals in $V$.
Add an edge $A \to B$ if there is a production $A \to α B β$ for some $α, β$.
Detect if there is any cycle in $D$ reachable from the start symbol $S$ that also leads to a terminal derivation.
- First, compute the set of generating nonterminals $Gen$ as in the Emptiness algorithm.
- Then restrict $D$ to vertices in $Gen$.
- If there is a cycle in this restricted graph reachable from $S$, then $L(G)$ is infinite; otherwise, it’s finite.
This runs in polynomial time using standard graph algorithms.
Why It Matters
- Language Analysis: Knowing a grammar’s output size can guide parser optimizations.
- Grammar Design: Helps grammar writers ensure they don’t accidentally allow unbounded recursion.
Wrap-Up
That wraps up finiteness for CFGs, two for two on decidability! Next up: Intersection Emptiness for multiple CFGs, hint: that one’s a bit trickier.