Chomsky hierarchy

Chomsky hierarchy (Russian: иерархия Хомского).

When Noam Chomsky first formalized grammars in 1956, he classified them into types now known as the Chomsky hierarchy. Each successive type places stricter constraints on its production rules and therefore generates a smaller class of formal languages. The Chomsky hierarchy consists of the following four types of grammars and languages.

(0) Type-0 grammars (or unrestricted grammars) include all formal grammars. They generate exactly the languages that can be recognized by a Turing machine. These languages are also known as the recursively enumerable languages.
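As an illustration of recognition by a Turing machine, the following minimal sketch simulates a single-tape machine under a step budget. The transition-table encoding, the state names, the blank symbol "_" and the two sample machines (ANBN and LOOP) are assumptions made for this example and are not taken from the source. Because a machine may run forever on words outside its language, the simulator returns None when the budget is exhausted, which mirrors the fact that recursively enumerable languages are in general only semi-decidable.

```python
BLANK = "_"  # assumed blank symbol; input words must not contain it

def run_tm(transitions, word, start="q0", accept="acc", max_steps=10_000):
    """Simulate a deterministic single-tape Turing machine.

    Returns True if the accepting state is reached, False if the machine
    halts without accepting (no transition applies), and None if the step
    budget runs out, in which case the answer is unknown.
    """
    tape = dict(enumerate(word))  # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        symbol = tape.get(head, BLANK)
        move = transitions.get((state, symbol))
        if move is None:
            return False  # halted in a non-accepting configuration
        state, tape[head], direction = move
        head += 1 if direction == "R" else -1
    return True if state == accept else None

# Hypothetical machine for { a^n b^n : n >= 0 }: mark one 'a' as X,
# find the matching 'b' and mark it as Y, then return and repeat.
ANBN = {
    ("q0", "a"): ("q1", "X", "R"),
    ("q0", "Y"): ("q3", "Y", "R"),
    ("q0", BLANK): ("acc", BLANK, "R"),
    ("q1", "a"): ("q1", "a", "R"),
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "b"): ("q2", "Y", "L"),
    ("q2", "a"): ("q2", "a", "L"),
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", BLANK): ("acc", BLANK, "R"),
}

# Hypothetical machine that moves right over blanks forever.
LOOP = {("q0", BLANK): ("q0", BLANK, "R")}

print(run_tm(ANBN, "aabb"))             # True
print(run_tm(ANBN, "aab"))              # False
print(run_tm(LOOP, "", max_steps=100))  # None
```

The sample language is deliberately simple; the point of the sketch is the three possible outcomes, not the power of the particular machine.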

(1) Type-1 grammars (context-sensitive grammars, or CS-grammars) generate the context-sensitive languages (or CS-languages). These grammars have rules of the form [math]\displaystyle{ \alpha A\beta \rightarrow \alpha\gamma\beta }[/math], where [math]\displaystyle{ A }[/math] is a nonterminal and [math]\displaystyle{ \alpha,\beta,\gamma }[/math] are strings of terminals and nonterminals. The strings [math]\displaystyle{ \alpha }[/math] and [math]\displaystyle{ \beta }[/math] may be empty, but [math]\displaystyle{ \gamma }[/math] must be nonempty. The rule [math]\displaystyle{ S\rightarrow e }[/math], where [math]\displaystyle{ e }[/math] denotes the empty string, is allowed if [math]\displaystyle{ S }[/math] does not appear in the right-hand side of any rule. The languages described by these grammars are exactly the languages that can be recognized by a nondeterministic linear bounded automaton.
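Since [math]\displaystyle{ \gamma }[/math] is nonempty, no rule of this form shortens a sentential form, so membership can be decided by searching only the forms that are no longer than the target word. The sketch below applies this idea to a commonly cited context-sensitive grammar for [math]\displaystyle{ \{a^n b^n c^n : n \geq 1\} }[/math]. The grammar, the helper symbols Z and W, and the function name generates are assumptions chosen for the example; the optional rule [math]\displaystyle{ S\rightarrow e }[/math] is not handled.

```python
from collections import deque

# Assumed example grammar; uppercase letters are nonterminals.
# Every rule has the shape alpha A beta -> alpha gamma beta.
RULES = [
    ("S", "aBC"), ("S", "aSBC"),
    # The next four rules together turn "CB" into "BC".
    ("CB", "CZ"), ("CZ", "WZ"), ("WZ", "WC"), ("WC", "BC"),
    ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc"),
]

def generates(target, rules=RULES, start="S"):
    """Decide whether `target` is derivable from `start`.

    Rules never shorten a form, so any form longer than the target
    can be discarded; the remaining search space is finite.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            return True
        for lhs, rhs in rules:
            i = form.find(lhs)
            while i != -1:  # try every occurrence of the left-hand side
                new = form[:i] + rhs + form[i + len(lhs):]
                if len(new) <= len(target) and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return False

print(generates("aabbcc"))  # True
print(generates("aabbc"))   # False
```

The search is exponential in the worst case; it is meant to show why the length restriction makes membership decidable, not to serve as a practical parser.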

(2) Type-2 grammars (context-free grammars or CF-grammars) generate the context-free languages (or CF-languages). These grammars contain rules of the form [math]\displaystyle{ A \rightarrow \alpha }[/math], where [math]\displaystyle{ A }[/math] is a nonterminal and [math]\displaystyle{ \alpha }[/math] is a string of terminals and nonterminals. These languages are exactly the languages that can be recognized by a nondeterministic pushdown automaton. Context-free languages are the theoretical basis for the syntax of most programming languages.
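The role of the pushdown store can be seen in a small sketch that recognizes balanced brackets, a context-free language generated for instance by the rules S → (S)S, S → [S]S and S → e. This particular language happens to need only a deterministic stack, so the example covers a special case rather than a general nondeterministic pushdown automaton; the bracket alphabet and the name balanced are assumptions for illustration.

```python
# Maps each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "["}

def balanced(word):
    """Recognize balanced bracket strings using an explicit stack."""
    stack = []
    for ch in word:
        if ch in PAIRS.values():
            stack.append(ch)           # push an opening bracket
        elif ch in PAIRS:
            # A closing bracket must match the most recent opening one.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            return False               # symbol outside the alphabet
    return not stack                   # accept only with an empty stack

print(balanced("([])[]"))  # True
print(balanced("(]"))      # False
```

A finite-state automaton cannot do this, because the depth of nesting it would have to remember is unbounded.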

(3) Type-3 grammars (regular grammars) generate the regular languages. Such a grammar restricts its rules to a single nonterminal in the left-hand side and a right-hand side consisting of a single terminal, possibly followed (or preceded, but not both in the same grammar) by a single nonterminal. The rule [math]\displaystyle{ S\rightarrow e }[/math] is also allowed here if [math]\displaystyle{ S }[/math] does not appear in the right-hand side of any rule. These are exactly the languages that can be recognized by a finite-state automaton. Additionally, this family of formal languages can be described by regular expressions. Regular languages are commonly used to define search patterns and the lexical structure of programming languages.
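The correspondence between a right-linear grammar, a finite-state automaton and a regular expression can be sketched as follows. Each nonterminal is treated as an automaton state, a rule A → aB as a transition from A to B on a, and a rule A → a as a transition into an extra accepting state. The sample grammar (words over {a, b} that end in "ab"), the marker FINAL and the function name accepts are assumptions for this example, and the optional rule S → e is not handled.

```python
import re
from itertools import product

# Assumed example grammar: S -> aS | bS | aA,  A -> b
GRAMMAR = {
    "S": ["aS", "bS", "aA"],
    "A": ["b"],
}
FINAL = "$"  # hypothetical name for the extra accepting state

def accepts(word, grammar=GRAMMAR, start="S"):
    """Simulate the nondeterministic automaton induced by the grammar."""
    states = {start}
    for ch in word:
        nxt = set()
        for state in states:
            for rhs in grammar.get(state, []):
                if rhs[0] == ch:
                    # "aB" moves to state B; a lone terminal moves to FINAL.
                    nxt.add(rhs[1] if len(rhs) == 2 else FINAL)
        states = nxt
    return FINAL in states

# The same language written as a regular expression.
PATTERN = re.compile(r"[ab]*ab")

# Cross-check both descriptions on all words of length at most 6.
for n in range(7):
    for letters in product("ab", repeat=n):
        w = "".join(letters)
        assert accepts(w) == bool(PATTERN.fullmatch(w))

print(accepts("aab"))  # True
print(accepts("abb"))  # False
```

Tracking a set of current states keeps the simulation linear in the length of the word, without first converting the automaton to a deterministic one.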

References

  • Evstigneev V.A., Kasyanov V.N. Dictionary of Graphs in Computer Science (in Russian: Словарь по графам в информатике). Novosibirsk: Sibirskoe Nauchnoe Izdatelstvo, 2009.