context clash

context clash

(grammar)When a parser cannot tell which alternativeproduction of a syntax applies by looking at the nextinput token ("lexeme").

E.g. given syntax

C -> A | b c

A -> d | b e

If you're parsing non-terminal C and the next token is 'b',you don't know whether it's the first or second alternative ofC since they both can start with b.

To discover whether a grammar has a context clash:

For each non-terminal, N, with multiple alternatives, look atthe first symbol of each alternative's right-hand side, callit s. If s is the empty string, then find the set FOLLOWER(N)otherwise find the set FIRST*(s). If any of the sets for N'salternatives intersect then there will be a context clash whenparsing N. If the next input symbol is one of those in theintersection of two sets then you won't know which of thealternatives applies.

FIRST(s) is the set of symbols with which s can start,including s itself. If s is a non-terminal then FIRST(s) alsoincludes the first symbol of each alternative right-hand sideof s. The '*' in FIRST*(s) means the "transitive closure"of FIRST which means keep applying FIRST to each element ofthe result until the result doesn't change. I.e. start withjust the set R = s, then for each non-terminal x in R, addFIRST(x) to R. Keep doing this until nothing new is added.(We are really only interested in the terminals in FIRST*(s)but some definitions include the non-terminals).

FOLLOWER(N) is the set of symbols which can come after N in asentence. Find each occurrence of N on the right-hand side ofa rule, e.g.

M -> ... | ... N ... | ...

If there is a symbol s immediately following N then addFIRST*(s) to the result (again, we're only interested in theterminal symbols in FIRST*(s)) if there is no symbol after Nin the alternative then add FOLLOWER(M) to the result (i.e. ifN can be the last symbol in an M then anything that can followM can also follow N).

If a grammar can generate the same sentence in multipledifferent ways (with different parse tress) then it isambiguous. An ambiguity must start with a context clash (butnot all context clashes imply ambiguity). The context clashoccurs when trying to parse the first token of the phrase withmultiple parses - you will not be able to tell whichalternative to take. To see if a context clash is also a caseof ambiguity you would need to follow the alternativesinvolved in each context clash to see if they can generate thesame complete sequence of tokens.