Labs ICT
โญ Pro Login

CFG Basics

Productions, terminals, non-terminals โ€” the grammar behind programming languages.

Context-Free Grammars (CFG) Basics

A context-free grammar (CFG) is a formalism for describing the syntax of languages. Unlike regular expressions, CFGs can describe nested structures like balanced parentheses, arithmetic expressions, and programming language syntax. They are more powerful than finite automata.

Formal Definition

A CFG is a 4-tuple G = (V, ฮฃ, R, S), where:

V     โ€” finite set of variables (non-terminals)
ฮฃ     โ€” finite set of terminals (disjoint from V)
R     โ€” finite set of productions (rules)
S โˆˆ V โ€” start variable

Production format:
  A โ†’ ฮฑ
  where A โˆˆ V and ฮฑ โˆˆ (V โˆช ฮฃ)*

Example โ€” simple grammar for arithmetic expressions:
  V = {E, T, F}
  ฮฃ = {+, *, (, ), id}
  S = E
  R:
    E โ†’ E + T | T
    T โ†’ T * F | F
    F โ†’ ( E ) | id

Derivations

A derivation is a sequence of steps that replaces variables with strings of terminals and variables using the production rules. We start with the start variable S and apply productions until we have only terminals.

Grammar:
  S โ†’ aSb | ฮต

Derivation of "aabb":
  S โ‡’ aSb         (using S โ†’ aSb)
    โ‡’ aaSbb       (using S โ†’ aSb)
    โ‡’ aabbb? No โ€” let's fix this.

  Actually:
  S โ‡’ aSb
    โ‡’ aaSbb
    โ‡’ aabb        (using S โ†’ ฮต)

Leftmost derivation (always replace leftmost variable):
  S โ†’ aSb โ†’ aaSbb โ†’ aabb

Rightmost derivation (always replace rightmost variable):
  S โ†’ aSb โ†’ aabb  (replace S โ†’ ฮต first, then... )
  Actually: S โ†’ aSb โ†’ aabb (replace S with ฮต)

Language of a Grammar

The language generated by a grammar G, denoted L(G), is the set of all strings of terminals that can be derived from S. A language is context-free if there exists a CFG that generates it.

CFGs are strictly more powerful than regular expressions. The language {0^n 1^n | n >= 0} can be generated by a CFG but not by any regular expression. This is because CFGs can count and match nested structures.

Grammar for {0^n 1^n | n >= 0}:
  S โ†’ 0S1 | ฮต

Derivation of "000111":
  S โ†’ 0S1 โ†’ 00S11 โ†’ 000S111 โ†’ 000111

This language is context-free but NOT regular.
No finite automaton can count unbounded pairs.

๐Ÿงช Quick Quiz

In a CFG, a production rule of the form A -> ฮฑ means: