Pushdown Automata and Context Free Languages

compling
Author
Published

January 23, 2024

Modified

April 8, 2024

More like a rip-off machine

In the notes on Finite State Automata, we looked at this turnstile finite state automaton.

stateDiagram
  direction LR
  state "Locked" as l
  state "Unlocked" as u
  
  [*] --> l
  l --> l: push
  l --> u: coin
  u --> u: coin
  u --> l: push

An annoying thing about this turnstile is that if you don’t know how it works, it will rip you off!

A scenario

Robin approaches the finite-state turnstile with two of their friends. They think

There’s three of us, and I have three tokens. I’ll speed things up and be a good friend by popping three tokens into the machine, and then all three of us can pop through.

Robin is expecting a pattern like this to happen

, , , , ,

Little does Robin know that the way this turnstile works is that after you put a coin into the slot, the coin rolls past and triggers the unlocking mechanism and goes straight into the collection bin. If the turnstile is already unlocked, the coin just rolls into the collection bin. It doesn’t have any “memory” of how many coins it’s been fed, so after one person walks through, the turnstile relocks.

So here’s what happens to Robin and their friends

Input New State
Locked

Unlocked

Unlocked

Unlocked

Locked

With the turnstile locked again, Robin’s two friends can’t get through unless they insert yet another token!

A Pushdown Automaton

Robin was really upset and embarrassed at losing two whole tokens to the rip-off (finite state) machine in front of their friends. They vowed to invent a better turnstile so no one would ever have to face that kind of embarrassment again.

Incorporating a memory

The problem with the finite-state turnstile is that it has no “memory” of how many coins it’s been fed. Robin’s new prototype works like so:

  • Every time someone inserts a coin into Robin’s turnstile, it lands in a little collection tray. If someone inserts multiple coins, they form a stack of coins.

  • If there is even one coin in the stack, the turnstile is unlocked.

  • Any time someone pushes through the turnstile, the collection tray bounces one coin off of the stack.

Even this simple system gets a little unwieldy to represent in the same kind of state diagram. So, here’s the last one of these we’ll see for a bit.

stateDiagram
  direction LR
  state "Locked" as l
  state "Unlocked" as u
  state coin_fork1 <<fork>>
  state coin_fork2 <<fork>>
  state pop1 <<fork>>
  state pop2 <<fork>>
  state choice_state <<choice>>
  state "Stack" as s
  
  [*] --> l
  l --> l : push
  l --> coin_fork1: coin
  coin_fork1 --> u
  coin_fork1 --> s: +1
  u --> coin_fork2: coin
  coin_fork2 --> u
  coin_fork2 --> s: +1
  
  u --> choice_state: push
  choice_state --> pop1: if Stack > 1
  pop1 --> s: -1
  pop1 --> u
  
  choice_state --> pop2: if Stack == 1
  pop2 --> s: -1
  pop2 --> l

Using the Pushdown Turnstile

With Robin’s new Pushdown Turnstile installed at metro stations everywhere, they bring their two friends back to the scene of the crime, and retry their three-coins, three-people strategy. Here’s what happens.

Input New State Coin Stack
Locked 0

Unlocked 1

Unlocked 2

Unlocked 3

Unlocked 2

Unlocked 1

Locked 0

Generalizing the pattern

The way this turnstile works, generally, is that if you put in \(n\) coins, \(n\) people will be able to push through. Another way of notating that sequence of events is \(^n\)\(^n\). In the more formal-language-theory world, these kinds of patterns are usually labeled \(a^nb^n\).

Another way to think about these \(a^nb^n\) systems is in terms of bracket matching. If we replace each symbol with [ and each symbol with ], then we get a pattern that looks like this:

[
[
[
]
]
]

The requirement for the language is that every opening bracket [ needs to get matched with a closing bracket ].

Context Free Grammars

We get nested, bracket matching patterns in natural language all the time. For example the person-number agreement in this sentence.

The person who I, the guy you are looking at, am talking to is not listening.

If the form of be in this sentence were generated by a Regular Grammar, to be parsed with a Finite State Automaton, once the “you” subject appears in the sentence, every following form of be would have to be “are” the rest of the way.

*The person who I, the guy you are looking at, are talking to are not listening.

Since the first sentence is how English and other languages work, we’d conclude that natural language is, at least, Context Free.

Context Free Rules

Regular rules can look like this:

\[ A \rightarrow aA \]

Context free rules can look like this: \[ A \rightarrow aAb \]

Returning to this html snippit:

<p>
  This is a paragraph with 
  <strong>
    bold text.
  </strong>
</p>

Rules of a context free grammar that could give rise to this well-formed html are:

\[ D \rightarrow <p>C</p> \]

\[ C \rightarrow words \]

\[ C \rightarrow words S \]

\[ S \rightarrow <strong>C</strong> \]

flowchart TD

  D --> p1["p"]
  D --> C1["C"]
  D --> p2["/p"]

  C1 --> w1["This is a paragraph with"]
  C1 --> S

  S --> s1["strong"]
  S --> C2["C"]
  S --> s2["/strong"]

  C2 --> w2["bold text."]

A PDA for this grammar

Here’s a way we’d describe a Pushdown Automaton that decides whether or not a document is generated by this grammar:

  • Each time it encounters an opening <tag>, it adds it to the stack, and when it encounters a closing </tag>, it pops it from the stack.
  • When it encounters a closing </tag>, it has to match the opening <tag> that’s at the top of the stack.

  • When it gets to the end of the document, the stack needs to be empty.

Here’s a table showing how that’d play out

Input event Stack
<p> push <p> <p>
This, is, a, paragraph, with <p>
<strong> push <strong> <strong>, <p>
bold, text. <strong>, <p>
</strong> pop <strong> <p>
</p> pop <p>

One consequence of the rule that tags need to match when you pop them is that the following is not valid html.

<p>
  This is <strong>invalid!
</p>
</strong>

If you were to reason through the state of the stack after the opening <strong> tag, it would look like

<strong>, <p>

Then, when you feed it </p>, it doen’t match the tag at the top of the stack, so we’d get an error of some sort.

Limits of Pushdown Automata

An office building has installed a version of Robin’s turnstile. Each person who enters the building has to insert their id card, and the machine scans it and spits it out the other side when a person pushes through.

Robin approaches the turnstile with their friends Skylar and Alex. Both Skylar and Alex have their hands full carrying packages into the building, so Robin tries to be helpful and insert all of their id cards first, so they can then pass through. They’re walking through in the order

  1. Robin
  2. Skylar
  3. Alex

So Robin puts their ID cards into the turnstile in that order. Here’s how it works out

Input Action Stack
push
push ,
push , ,
pop ,
🚨

Oh no! The turnstile has handed Robin Alex’s id card! What a mess!

Beyond Context Free

Robin was expecting a sequence like this

, , , , ,

This involves so-called “crossing dependencies”, which can’t be recognized by a Pushdown Automaton, which means they involve a more complex grammar than context free rules.

There are some examples of crossing dependencies in human language as well, like this example in Swiss German from Shieber (1985) (cited in Jäger and Rogers (2012))

“that we let the children help Hans paint the house”
dass mer d’ chind em Hans es Huus lönd hälfe aanstriiche
that we the children-ACC Hans-DAT the house-ACC let help paint
Back to top

References

Jäger, Gerhard, and James Rogers. 2012. “Formal Language Theory: Refining the Chomsky Hierarchy.” Philosophical Transactions of the Royal Society B: Biological Sciences 367 (1598): 1956–70. https://doi.org/10.1098/rstb.2012.0077.
Shieber, Stuart M. 1985. “Evidence Against the Context-Freeness of Natural Language.” Linguistics and Philosophy 8 (3): 333–43. https://doi.org/10.1007/BF00630917.

Reuse

CC-BY-SA 4.0

Citation

BibTeX citation:
@online{fruehwald2024,
  author = {Fruehwald, Josef},
  title = {Pushdown {Automata} and {Context} {Free} {Languages}},
  date = {2024-01-23},
  url = {https://lin511-2024.github.io/notes/meetings/02_pda.html},
  langid = {en}
}
For attribution, please cite this work as:
Fruehwald, Josef. 2024. “Pushdown Automata and Context Free Languages.” January 23, 2024. https://lin511-2024.github.io/notes/meetings/02_pda.html.