Generalized context-free grammar


Generalized context-free grammar (GCFG) is a grammar formalism that extends context-free grammars by adding potentially non-context-free composition functions to rewrite rules. Head grammar is an instance of a GCFG and is known to be especially adept at handling a wide variety of non-context-free properties of natural language.

Description

A GCFG consists of two components: a set of composition functions that combine string tuples, and a set of rewrite rules. The composition functions all have the form $f(\langle x_1, \ldots, x_m \rangle, \langle y_1, \ldots, y_n \rangle, \ldots) = \gamma$, where $\gamma$ is either a single string tuple or some use of a (potentially different) composition function that reduces to a string tuple. Rewrite rules look like $X \to f(Y, Z, \ldots)$, where $Y$, $Z$, ... are string tuples or non-terminal symbols.
The rewrite semantics of GCFGs is fairly straightforward. An occurrence of a non-terminal symbol is rewritten using rewrite rules as in a context-free grammar, eventually yielding just compositions. The composition functions are then applied, successively reducing the tuples to a single tuple.
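
The reduction step of this semantics can be sketched in Python. This is only an illustrative sketch: the `Call` term representation, the function `reduce_term`, and the composition functions `pair` and `interleave` are hypothetical names chosen for the example. It covers only the second phase, in which a term whose non-terminals have already been rewritten is reduced to a single string tuple.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union

StringTuple = Tuple[str, ...]


@dataclass(frozen=True)
class Call:
    """Application of a named composition function to argument terms."""
    name: str
    args: Tuple["Term", ...]


# A term is either a plain string tuple or a composition still to be applied.
Term = Union[StringTuple, Call]


def reduce_term(term: Term, functions: Dict[str, Callable[..., StringTuple]]) -> StringTuple:
    """Reduce a term bottom-up until only a single string tuple remains."""
    if isinstance(term, Call):
        reduced_args = [reduce_term(arg, functions) for arg in term.args]
        return functions[term.name](*reduced_args)
    return term  # already a string tuple


# Hypothetical composition functions, used only for this illustration.
FUNCTIONS = {
    # pair(<x>, <y>) = <x, y>
    "pair": lambda x, y: (x[0], y[0]),
    # interleave(<x1, x2>, <y1, y2>) = <x1 y1 x2 y2>
    "interleave": lambda p, q: (p[0] + q[0] + p[1] + q[1],),
}

term = Call("interleave", (Call("pair", (("a",), ("b",))), ("c", "d")))
result = reduce_term(term, FUNCTIONS)
print(result)  # ('acbd',)
```

The inner `pair` call is reduced first, so `interleave` only ever sees string tuples, mirroring the successive reduction described above.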

Example

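A simple translation of a context-free grammar into a GCFG can be performed in the following fashion. Given the grammar in (1), which generates the palindrome language $\{ww^R : w \in \{a, b\}^*\}$, where $w^R$ is the string reverse of $w$, we can define the composition function conc as in (2a) and the rewrite rules as in (2b).
(1) $S \to \varepsilon \mid aSa \mid bSb$
(2a) $\mathrm{conc}(\langle x \rangle, \langle y \rangle, \langle z \rangle) = \langle xyz \rangle$
(2b) $S \to \mathrm{conc}(\langle \varepsilon \rangle, \langle \varepsilon \rangle, \langle \varepsilon \rangle) \mid \mathrm{conc}(\langle a \rangle, S, \langle a \rangle) \mid \mathrm{conc}(\langle b \rangle, S, \langle b \rangle)$
The CF production of $abbbba$ is
$S \Rightarrow aSa \Rightarrow abSba \Rightarrow abbSbba \Rightarrow abbbba$
and the corresponding GCFG production is
$S \to \mathrm{conc}(\langle a \rangle, S, \langle a \rangle)$
$\mathrm{conc}(\langle a \rangle, \mathrm{conc}(\langle b \rangle, S, \langle b \rangle), \langle a \rangle)$
$\mathrm{conc}(\langle a \rangle, \mathrm{conc}(\langle b \rangle, \mathrm{conc}(\langle b \rangle, S, \langle b \rangle), \langle b \rangle), \langle a \rangle)$
$\mathrm{conc}(\langle a \rangle, \mathrm{conc}(\langle b \rangle, \mathrm{conc}(\langle b \rangle, \mathrm{conc}(\langle \varepsilon \rangle, \langle \varepsilon \rangle, \langle \varepsilon \rangle), \langle b \rangle), \langle b \rangle), \langle a \rangle)$
$\mathrm{conc}(\langle a \rangle, \mathrm{conc}(\langle b \rangle, \mathrm{conc}(\langle b \rangle, \langle \varepsilon \rangle, \langle b \rangle), \langle b \rangle), \langle a \rangle)$
$\mathrm{conc}(\langle a \rangle, \mathrm{conc}(\langle b \rangle, \langle bb \rangle, \langle b \rangle), \langle a \rangle)$
$\mathrm{conc}(\langle a \rangle, \langle bbbb \rangle, \langle a \rangle)$
$\langle abbbba \rangle$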
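
A minimal Python sketch of this translation follows. It assumes that conc simply concatenates three one-component string tuples, that each recursive rule wraps S in a matching pair of letters, and that the terminating rule yields the empty string. The helper name `derive` and the sample input are hypothetical and serve only to illustrate the idea.

```python
from itertools import product


def conc(x, y, z):
    """conc(<x>, <y>, <z>) = <xyz>, on one-component string tuples."""
    return (x[0] + y[0] + z[0],)


def derive(letters):
    """Evaluate the GCFG derivation that applies S -> conc(<c>, S, <c>)
    for each letter c in `letters` (outermost first), then terminates
    with S -> conc(<"">, <"">, <"">)."""
    if not letters:
        return conc(("",), ("",), ("",))
    head, rest = letters[0], letters[1:]
    return conc((head,), derive(rest), (head,))


print(derive("abb"))  # ('abbbba',)

# Every derivation over {a, b} yields a string of the form w + reverse(w).
for n in range(4):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert derive(w) == (w + w[::-1],)
```

Because Python evaluates the inner call to `derive` before the enclosing `conc`, the innermost composition is reduced first, matching the order of the reduction steps in a GCFG production.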

Linear Context-free Rewriting Systems (LCFRSs)

Weir describes two properties of composition functions, linearity and regularity. A function defined as $f(x_1, \ldots, x_n) = \ldots$ is linear if and only if each variable appears at most once on either side of the =, making $f(x) = g(x, y)$ linear but not $f(x) = g(x, x)$. A function defined as $f(x_1, \ldots, x_n) = \ldots$ is regular if the left-hand side and right-hand side have exactly the same variables, making $f(x, y) = g(y, x)$ regular but not $f(x) = g(x, y)$ or $f(x, y) = g(x)$.
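
The two properties can be checked mechanically once a definition is reduced to the variables occurring on each side of the =. The following Python sketch takes the definitions above at face value; the representation (two lists of variable names) and the function names `is_linear` and `is_regular` are assumptions made for illustration.

```python
from collections import Counter


def is_linear(lhs_vars, rhs_vars):
    """True if each variable occurs at most once on either side of the =."""
    return (all(count <= 1 for count in Counter(lhs_vars).values())
            and all(count <= 1 for count in Counter(rhs_vars).values()))


def is_regular(lhs_vars, rhs_vars):
    """True if both sides mention exactly the same set of variables."""
    return set(lhs_vars) == set(rhs_vars)


# Illustrative definitions, written as (left-hand variables, right-hand variables).
print(is_linear(["x"], ["x", "y"]))        # f(x) = g(x, y)    -> True
print(is_linear(["x"], ["x", "x"]))        # f(x) = g(x, x)    -> False
print(is_regular(["x", "y"], ["y", "x"]))  # f(x, y) = g(y, x) -> True
print(is_regular(["x"], ["x", "y"]))       # f(x) = g(x, y)    -> False
print(is_regular(["x", "y"], ["x"]))       # f(x, y) = g(x)    -> False
```
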
A grammar in which all composition functions are both linear and regular is called a linear context-free rewriting system (LCFRS). The LCFRSs form a proper subclass of the GCFGs, i.e. they have strictly less computational power than the GCFGs as a whole.
On the other hand, LCFRSs are strictly more expressive than linear-indexed grammars and their weakly equivalent variant tree adjoining grammars. Head grammar is another example of an LCFRS that is strictly less powerful than the class of LCFRSs as a whole.
LCFRSs are weakly equivalent to multicomponent TAGs, to multiple context-free grammars, and to minimalist grammars. The languages generated by LCFRSs can be parsed in polynomial time.
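
As a small illustration of composition functions that are both linear and regular, the following Python sketch evaluates a toy grammar for the non-context-free language of strings a^n b^n c^n. The grammar and the names `grow`, `flatten`, and `generate` are assumptions made for this example and are not taken from the text above: a non-terminal A derives 3-tuples via A → grow(A) or the tuple of three empty strings, and S → flatten(A).

```python
def grow(t):
    """grow(<x, y, z>) = <ax, by, cz>; each variable is used exactly once."""
    x, y, z = t
    return ("a" + x, "b" + y, "c" + z)


def flatten(t):
    """flatten(<x, y, z>) = <xyz>; each variable is used exactly once."""
    x, y, z = t
    return (x + y + z,)


def generate(n):
    """Evaluate the derivation that applies grow n times, then flatten."""
    t = ("", "", "")
    for _ in range(n):
        t = grow(t)
    return flatten(t)


print(generate(2))  # ('aabbcc',)
```

Keeping the three blocks in separate tuple components until the final `flatten` is what lets the counts stay synchronized, something a single context-free string variable cannot do.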