Anti-unification (computer science)

Anti-unification is the process of constructing a generalization common to two given symbolic expressions. As in unification, several frameworks are distinguished depending on which expressions are allowed, and which expressions are considered equal. If variables representing functions are allowed in an expression, the process is called "higher-order anti-unification", otherwise "first-order anti-unification". If the generalization is required to have an instance literally equal to each input expression, the process is called "syntactical anti-unification", otherwise "E-anti-unification", or "anti-unification modulo theory".
An anti-unification algorithm should compute for given expressions a complete, and minimal generalization set, that is, a set covering all generalizations, and containing no redundant members, respectively. Depending on the framework, a complete and minimal generalization set may have one, finitely many, or possibly infinitely many members, or may not exist at all; it cannot be empty, since a trivial generalization exists in any case. For first-order syntactical anti-unification, Gordon Plotkin gave an algorithm that computes a complete and minimal singleton generalization set containing the so-called "least general generalization".
Anti-unification should not be confused with dis-unification. The latter means the process of solving systems of inequations, that is of finding values for the variables such that all given inequations are satisfied. This task is quite different from finding generalizations.

Prerequisites

Formally, an anti-unification approach presupposes

An infinite set V of variables. For higher-order anti-unification, it is convenient to choose V disjoint from the set of lambda-term bound variables.
A set T of terms such that V ⊆ T. For first-order and higher-order anti-unification, T is usually the set of first-order terms and lambda terms, respectively.
An equivalence relation on, indicating which terms are considered equal. For higher-order anti-unification, usually if and are alpha equivalent. For first-order E-anti-unification, reflects the background knowledge about certain function symbols; for example, if is considered commutative, if results from by swapping the arguments of at some occurrences. If there is no background knowledge at all, then only literally, or syntactically, identical terms are considered equal.
First-order term

Given a set of variable symbols, a set of constant symbols and sets of -ary function symbols, also called operator symbols, for each natural number, the set of terms is recursively defined to be the smallest set with the following properties:

every variable symbol is a term: V ⊆ T,
every constant symbol is a term: C ⊆ T,
from every n terms t₁,…,t_n, and every n-ary function symbol f ∈ F_n, a larger term can be built.

For example, if x ∈ V is a variable symbol, 1 ∈ C is a constant symbol, and add ∈ F₂ is a binary function symbol, then x ∈ T, 1 ∈ T, and add ∈ T by the first, second, and third term building rule, respectively. The latter term is usually written as x+1, using Infix notation and the more common operator symbol + for convenience.

Higher-order term

Substitution

A substitution is a mapping from variables to terms; the notation refers to a substitution mapping each variable to the term, for, and every other variable to itself. Applying that substitution to a term is written in postfix notation as ; it means to replace every occurrence of each variable in the term by. The result of applying a substitution to a term is called an instance of that term.
As a first-order example, applying the substitution to the term

Generalization, specialization

If a term has an instance equivalent to a term, that is, if for some substitution, then is called more general than, and is called more special than, or subsumed by,. For example, is more general than if is commutative, since then.
If is literal identity of terms, a term may be both more general and more special than another one only if both terms differ just in their variable names, not in their syntactic structure; such terms are called variants, or renamings of each other.
For example, is a variant of, since and.
However, is not a variant of, since no substitution can transform the latter term into the former one, although achieves the reverse direction.
The latter term is hence properly more special than the former one.
A substitution is more special than, or subsumed by, a substitution if is more special than for each variable.
For example, is more special than, since and is more special than and, respectively.

Anti-unification problem, generalization set

An anti-unification problem is a pair of terms.
A term is a common generalization, or anti-unifier, of and if and for some substitutions.
For a given anti-unification problem, a set of anti-unifiers is called complete if each generalization subsumes some term ; the set is called minimal if none of its members subsumes another one.

First-order syntactical anti-unification

The framework of first-order syntactical anti-unification is based on being the set of first-order terms and on being syntactic equality.
In this framework, each anti-unification problem has a complete, and obviously minimal, singleton solution set.
Its member is called the least general generalization of the problem, it has an instance syntactically equal to and another one syntactically equal to.
Any common generalization of and subsumes.
The lgg is unique up to variants: if and are both complete and minimal solution sets of the same syntactical anti-unification problem, then and for some terms and, that are renamings of each other.
Plotkin has given an algorithm to compute the lgg of two given terms.
It presupposes an injective mapping, that is, a mapping assigning each pair of terms an own variable, such that no two pairs share the same variable.
The algorithm consists of two rules:
For example, ; this least general generalization reflects the common property of both inputs of being square numbers.
Plotkin used his algorithm to compute the "relative least general generalization " of two clause sets in first-order logic, which was the basis of the Golem approach to inductive logic programming.

First-order anti-unification modulo theory

Equational theories
One associative and commutative operation: ;
Commutative theories:
Free monoids:
Regular congruence classes: ;
A-, C-, AC-, ACU-theories with ordered sorts:
Purely idempontent theories:
First-order sorted anti-unification
Taxonomic sorts: ; ;
Feature terms:
A-, C-, AC-, ACU-theories with ordered sorts: [|see above]
Nominal anti-unification
Baumgartner, Alexander; Kutsia, Temur; Levy, Jordi; Villaret, Mateu. . Proc. RTA 2015. Vol. 36 of LIPIcs. Schloss Dagstuhl, 57-73.
Applications
Program analysis: ;
Code factoring:
Induction proving:
Information Extraction:
Case-based reasoning:
Program synthesis: The idea of generalizing terms with respect to an equational theory can be traced back to Manna and Waldinger who desired to apply it in program synthesis. In section "Generalization", they suggest to generalize reverse and reverse<> to obtain reverse<>m' . This generalization is only possible if the background equation u<>=u is considered.
Anti-unification of trees and linguistic applications
Parse trees for sentences can be subject to least general generalization to derive a maximal common sub-parse trees for language learning. There are applications in search and text classification.
Parse thickets for paragraphs as graphs can be subject to least general generalization.
Operation of generalization commutes with the operation of transition from syntactic to semantic level. The latter can then be subject to conventional anti-unification.
Higher-order anti-unification
Calculus of constructions:
Simply-typed lambda calculus : Baumgartner, Alexander; Kutsia, Temur; Levy, Jordi; Villaret, Mateu. . Proc. RTA 2013. Vol. 21 of LIPIcs. Schloss Dagstuhl, 113-127.
Simply-typed lambda calculus :
Restricted Higher-Order Substitutions: ;

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...

Anti-unification (computer science)

Prerequisites

First-order term

Higher-order term

Substitution

Generalization, specialization

Anti-unification problem, generalization set

First-order syntactical anti-unification

First-order anti-unification modulo theory

Equational theories

First-order sorted anti-unification

Nominal anti-unification

Applications

Anti-unification of trees and linguistic applications

Higher-order anti-unification