P is a finite set of productions of the form A → t, with A ∈ N, and t ∈ TΣ, where TΣ is the associated term algebra, i.e. the set of all trees composed from symbols in Σ ∪ N according to their arities, where nonterminals are considered nullary.
Derivation of trees
The grammar G implicitly defines a set of trees: any tree that can be derived from Z using the rule setP is said to be described by G. This set of trees is known as the language of G. More formally, the relation ⇒G on the set TΣ is defined as follows: A tree t1∈ TΣ can be derived in a single step into a tree t2 ∈ TΣ , if there is a context S and a production ∈ P such that:
t1 = S, and
t2 = S.
Here, a context means a tree with exactly one hole in it; if S is such a context, S denotes the result of filling the treet into the hole of S. The tree language generated by G is the languageL =. Here, TΣ denotes the set of all trees composed from symbols of Σ, while ⇒G* denotes successive applications of ⇒G. A language generated by some regular tree grammar is called a regular tree language.
Examples
Let G1 =, where
N1 = is our set of nonterminals,
Σ1 = is our ranked alphabet, arities indicated by dummy arguments,
Z1 = BList is our starting nonterminal, and
the set P1 consists of the following productions:
* Bool → false
* Bool → true
* BList → nil
* BList → cons
An example derivation from the grammar G1 is BList ⇒ cons ⇒ cons ⇒ cons. The image shows the corresponding derivation tree; it is a tree of trees, whereas a derivation tree in word grammars is a tree of strings. The tree language generated by G1 is the set of all finite lists of boolean values, that is, L happens to equal TΣ1. The grammar G1 corresponds to the algebraic data type declarations : datatype Bool = false | true datatype BList = nil | cons of Bool * BList
Every member of L corresponds to a Standard-ML value of type BList. For another example, let G2 =, using the nonterminal set and the alphabet from above, but extending the production set by P2, consisting of the following productions:
BList1 → cons
BList1 → cons
The language L is the set of all finite lists of boolean values that contain true at least once. The set L has no datatype counterpart in Standard ML, nor in any other functional language. It is a proper subset of L. The above example term happens to be in L, too, as the following derivation shows: BList1 ⇒ cons ⇒ cons ⇒ cons.
Language properties
If L1, L2 both are regular tree languages, then the tree sets L1 ∩ L2, L1 ∪ L2, and L1 \ L2 are also regular tree languages, and it is decidable whether L1 ⊆ L2, and whether L1 = L2.
Alternative characterizations and relation to other formal languages