Formal Grammar - Formal Language

Formal Grammar - Formal Language

formal grammar (grammar)
  • is a set of production rules for strings in a formal language
  • does not describe the meaning of the strings or what can be done with them in whatever context—only their form
formal language/expressions
  • are generated by the grammar

Formal Grammars - Formal Definition

a formal grammar G is defined as the tuple (N, Σ, P, S):

  • N - a finite set of nonterminal symbols, that is disjoint with the strings formed from G
  • Σ - a finite set of terminal symbols that is disjoint from N
  • P - (the grammar) a finite set of production rules, each rule of the form:
    • (ΣN)* N (ΣN)* → (ΣN)*
      where:
      • * is the Kleene star operator
      • ∪ denotes set union
      that is, each production rule maps from one string of symbols to another, where the first string (the "head") contains an arbitrary number of symbols provided at least one of them is a nonterminal. In the case that the second string (the "body") consists solely of the empty string—i.e., that it contains no symbols at all—it may be denoted with a special notation (often Λ, e, or Ο΅) in order to avoid confusion.
  • S - a distinguished symbol in N that is the start symbol, also called the sentence symbol

Formal Grammars - Example

 Click here to expand...

Consider the grammar G where:

  • N = {𝑆, 𝐡}
  • Σ = {π‘Ž, 𝑏, 𝑐}
  • P consists of the following production rules:
    • 𝑆 → π‘Žπ΅π‘†π‘
    • 𝑆 → π‘Žπ‘π‘
    • π΅π‘Ž → π‘Žπ΅
    • 𝐡𝑏 → π‘π‘
  • = π‘†

This grammar defines the language L(G) = {π‘Žπ‘›π‘π‘›π‘π‘› | 𝑛 ≥ 1}. Thus, the language is the set of strings that consist of 1 or more π‘Ž's, followed by the same number of 𝑏's, followed by the same number of 𝑐's.

Formal Grammars - Mathematical Constructs

 Click here to expand...

given a formal grammar G = (N, Σ, P, S):

  • the binary relation(pronounced as "G derives in one step") on strings x and y in (Σ ∪ N)* is defined 
  • the binary relation(pronounced as "G derives in zero or more steps") is defined as the reflexive transitive closure of
  • sentential form - can be derived in a finite number of steps from the start symbol S; that is, a sentential form is a member of
  • sentence - a sentential form that contains no nonterminal symbols
  • L(G) (pronounced as "language of G") - is defined as all those sentences that can be derived in a finite number of steps from the start symbol S; that is, the set

Formal Grammars - Chomsky Hierarchy

 Click here to expand...

Chomsky Hierarchy 

syntax definitions:

  • 𝐴,𝐡 - non-terminals
  • π‘Ž,𝑏,𝑐 - terminals
  • 𝛼,𝛿,𝛽 - strings of terminals and/or nonterminals (MAYBE empty)
  • 𝛾 - a string of terminals and/or nonterminals (MUST be nonempty)
  • πœ– - an empty string (i.e. the string of length 0)

Chomsky Type

  • Formal Grammar
  • Formal Language/Expressions

Automaton Accepting Formal Grammar/Language

Grammar Production RulesExample Languages

Type-4

  • Finite Enumerable Grammar (FEG)
  • Finite Enumerable Language (FEL)

Memory Lookup

are defined by rules of the forms:

  • 𝑆 → π‘ π‘‘π‘Ÿπ‘–π‘›π‘”-π‘œπ‘“-π‘‘π‘’π‘Ÿπ‘šπ‘–π‘›π‘Žπ‘™π‘ 

grammar rules G:

  • 𝑆 → π‘Žπ‘π‘
  • 𝑆 → π‘Žπ‘

language of G:

  • 𝐿 = {π‘Žπ‘π‘, π‘Žπ‘}

Type-3

  • Regular Grammar (RG)
  • Regular Language (RL)

Finite-State Automaton (FSA, FA)

regular grammar is EITHER left or right regular grammar:

  • right regular grammar - is defined by rules of the forms:

    • 𝐴 → π‘Ž
    • 𝐴 → π‘Žπ΅
    • 𝐴 → πœ–
  • left regular grammar - is defined by rules of the forms:

    • 𝐴 → π‘Ž
    • 𝐴 → π΅π‘Ž
    • 𝐴 → πœ–

grammar rules G:

  • 𝐴 → π‘Žπ΅
  • 𝐡 → 𝑏𝐡
  • 𝐡 → 𝑐

language of G:

  • 𝐿 = { π‘Žπ‘π‘›π‘ | 𝑛 ≥ 0 }

Type-2

Push-Down/Pushdown Automata (PDA)
Finite-State Transducer (FST)

defined by rules of the form:

  • 𝐴 → 𝛼

grammar rules G:

  • 𝐴 → π‘Žπ΄π‘
  • 𝐴 → πœ–

language of G:

  • 𝐿 = { π‘Žπ‘›π‘π‘›| 𝑛 > 0 }

Type-1

  • Context Sensitive Grammar (CSG)
  • Context Sensitive Language (CSL)

Linear-Bounded Automaton (LBA)

defined by rules of the form:

  • 𝛼𝐴𝛽 → 𝛼𝛾𝛽
  • 𝐴 → πœ–

the rule π΄ → πœ– is allowed if π΄ does not appear on the right side of any rule

grammar rules G:

  • 𝑆 → π‘Žπ΅π‘†π‘
  • 𝑆 → π‘Žπ‘π‘
  • π΅π‘Ž → π‘Žπ΅
  • 𝐡𝑏 → 𝑏𝑏

language of G:

  • L(G) = { π‘Žπ‘›π‘π‘›π‘π‘› | 𝑛 > 0 }

Type-0.5

  • Recursive Grammar (RG)
  • Recursive Language (RL)
recursive grammar if it contains production rules that are recursive, meaning that expanding a non-terminal according to these rules can eventually lead to a string that includes the same non-terminal again. Otherwise, it is called a non-recursive grammar


Type-0

  • Recursively Enumerable Grammar (REG)
  • Recursively Enumerable Language (REL)

Turing Machine (TM)

defined by rules of the form:

  • 𝛼𝐴𝛽 → 𝛿


Every regular language is context-free, every context-free language is context-sensitive, every context-sensitive language is recursive, and every recursive language is recursively enumerable

Formal Grammars - Chomsky Hierarchy Extended

 Click here to expand...

Chomsky hierarchy extended - an extension to Chomsky hierarchy