Are the Before and After sets for context-free grammars always context-free?

14

Let $G$ be a context-free grammar. A string of terminals and nonterminals of $G$ is called a sentential form of $G$ if it can be obtained by applying productions of $G$ zero or more times to the start symbol $S$. Let $\operatorname{SF}(G)$ be the set of sentential forms of $G$.

Let $\alpha \in \operatorname{SF}(G)$ and let $\beta$ be a substring of $\alpha$; we call $\beta$ a fragment of $\operatorname{SF}(G)$. Now let

$\operatorname{Before}(\beta) = \{ \gamma \ |\ \exists \delta . \gamma \beta \delta \in \operatorname{SF}(G) \}$

and

$\operatorname{After}(\beta) = \{ \delta \ |\ \exists \gamma . \gamma \beta \delta \in \operatorname{SF}(G) \}$ .

Are $\operatorname{Before}(\beta)$ and $\operatorname{After}(\beta)$ context-free languages? What if $G$ is unambiguous? If $G$ is unambiguous, are $\operatorname{Before}(\beta)$ and $\operatorname{After}(\beta)$ also unambiguous context-free languages, that is, describable by an unambiguous context-free grammar?

This is a follow-up to my earlier question, after an earlier attempt to make my question easier to answer failed. A negative answer would make the broader question I am working on very hard to answer.
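To make the definitions concrete, here is a minimal brute-force sketch that enumerates sentential forms up to a length bound and reads off samples of $\operatorname{Before}(\beta)$ and $\operatorname{After}(\beta)$. The toy grammar $S \to aSb \mid ab$, the function names, and the length bound are assumptions chosen for illustration. The bound is only sound because this grammar has no $\varepsilon$-productions, so forms never shrink; the result is a finite sample of the (generally infinite) sets, not a decision procedure.

```python
from collections import deque


def sentential_forms(productions, start, max_len):
    """Enumerate all sentential forms of length <= max_len by breadth-first search.

    productions maps each nonterminal (a single character) to a list of
    right-hand sides (strings); every other character counts as a terminal.
    Assumes no epsilon-productions, so pruning by length loses nothing.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        for i, symbol in enumerate(form):
            for rhs in productions.get(symbol, []):
                new_form = form[:i] + rhs + form[i + 1:]
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
    return seen


def before_and_after(forms, beta):
    """Collect gamma and delta for every occurrence gamma + beta + delta in forms."""
    before, after = set(), set()
    for form in forms:
        i = form.find(beta)
        while i != -1:
            before.add(form[:i])
            after.add(form[i + len(beta):])
            i = form.find(beta, i + 1)
    return before, after


# Hypothetical toy grammar: S -> aSb | ab
productions = {"S": ["aSb", "ab"]}
forms = sentential_forms(productions, "S", max_len=6)
before, after = before_and_after(forms, "S")
print(sorted(before), sorted(after))
```

For this grammar the samples are consistent with $\operatorname{Before}(S) = a^*$ and $\operatorname{After}(S) = b^*$.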

formal-languages context-free formal-grammars closure-properties Alex ten Brink

8

Let us first get a feeling for $\operatorname{Before}(\beta)$ and $\operatorname{After}(\beta)$. Consider a derivation tree that contains $\beta$; "contains" here means that you can cut away subtrees so that $\beta$ is a subword of the tree's frontier. Then the before (after) sets are all possible frontiers of the part of the tree to the left (right) of $\beta$:

[Figure: tree with before and after sets]

So we have to build a grammar for the horizontally (vertically) hatched part of the tree. That seems easy, since we already have a grammar for the whole tree; we only have to make sure that all sentential forms are words (change the alphabets), filter out those that do not contain $\beta$ (that is a regular property, since $\beta$ is fixed), and cut off everything after (before) $\beta$, including $\beta$ itself. This cutting should also be possible.

Now for a formal proof. We will transform the grammar as described and use the closure properties of $\mathrm{CFL}$ to do the filtering and cutting, i.e. we give a non-constructive proof.

Let $G = (N, T, \delta, S)$ be a context-free grammar. It is easy to see that $\operatorname{SF}(G)$ is context-free; construct $G'=(N',T',\delta',N_S)$ as follows:

$N' = \{N_A \mid A \in N\}$
$T' = N \cup T$
$\delta' = \{\alpha(A) \to \alpha(\beta)\mid A\to\beta \in \delta \} \cup \{N_A \to A \mid A\in N\}$

with $\alpha(t)=t$ for all $t \in T$ and $\alpha(A)=N_A$ for all $A\in N$, extended symbol by symbol to strings. It is clear that $\mathcal{L}(G')=\operatorname{SF}(G)$; hence the corresponding prefix and suffix closures $\operatorname{Pref}(\operatorname{SF}(G))$ and $\operatorname{Suff}(\operatorname{SF}(G))$ are also context-free¹.

Now, for any $\beta \in (N\cup T)^*$, $\mathcal{L}(\beta(N\cup T)^*)$ and $\mathcal{L}((N\cup T)^*\beta)$ are regular languages. As $\mathrm{CFL}$ is closed under intersection and under right/left quotient with regular languages, we get
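The construction of $G'$ can be sketched in a few lines. This is only an illustration under assumed encodings: symbols are Python strings, a production is a `(lhs, rhs)` pair with `rhs` a tuple of symbols, and the fresh nonterminal $N_A$ is modelled as the pair `("N", A)`. The function name is hypothetical.

```python
def sentential_form_grammar(nonterminals, terminals, productions, start):
    """Build G' = (N', T', delta', N_S) such that L(G') = SF(G).

    Every old nonterminal A becomes a terminal of G', and a fresh
    nonterminal N_A, encoded here as ("N", A), takes over its role.
    """
    def alpha(symbol):
        # alpha(t) = t for terminals, alpha(A) = N_A for nonterminals
        return ("N", symbol) if symbol in nonterminals else symbol

    new_nonterminals = {alpha(a) for a in nonterminals}
    new_terminals = set(nonterminals) | set(terminals)
    # alpha(A) -> alpha(beta) for every production A -> beta of G
    new_productions = [
        (alpha(lhs), tuple(alpha(x) for x in rhs)) for lhs, rhs in productions
    ]
    # N_A -> A lets a derivation stop at an unexpanded nonterminal
    new_productions += [(alpha(a), (a,)) for a in sorted(nonterminals)]
    return new_nonterminals, new_terminals, new_productions, alpha(start)


# Hypothetical toy grammar: S -> aSb | ab
n_prime, t_prime, delta_prime, start_prime = sentential_form_grammar(
    {"S"}, {"a", "b"}, [("S", ("a", "S", "b")), ("S", ("a", "b"))], "S"
)
for lhs, rhs in delta_prime:
    print(lhs, "->", rhs)
```

The extra productions $N_A \to A$ are what turn sentential forms of $G$ into terminal words of $G'$.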

$\qquad \displaystyle \operatorname{Before}(\beta) = (\operatorname{Pref}(\operatorname{SF}(G))\ \cap\ \mathcal{L}((N\cup T)^*\beta))\,/\,\beta \in \mathrm{CFL}$

and

$\qquad \displaystyle \operatorname{After}(\beta) = (\operatorname{Suff}(\operatorname{SF}(G))\ \cap\ \mathcal{L}(\beta(N\cup T)^*))\,\backslash\, \beta \in \mathrm{CFL}$.

¹ $\mathrm{CFL}$ is closed under right (and left) quotient; $\operatorname{Pref}(L) = L / \Sigma^*$ and similar for $\operatorname{Suff}$ yield prefix resp. suffix closure.
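As a sanity check, the two formulas can be worked through by hand on a small example. The grammar $S \to aSb \mid ab$ and the choice $\beta = S$ are assumptions made purely for illustration:

```latex
\begin{align*}
\operatorname{SF}(G) &= \{a^n S b^n \mid n \ge 0\} \cup \{a^n b^n \mid n \ge 1\} \\
\operatorname{Pref}(\operatorname{SF}(G)) \cap \mathcal{L}((N\cup T)^* S) &= \{a^n S \mid n \ge 0\} \\
\operatorname{Before}(S) = \{a^n S \mid n \ge 0\} \,/\, S &= a^* \\
\operatorname{Suff}(\operatorname{SF}(G)) \cap \mathcal{L}(S (N\cup T)^*) &= \{S b^n \mid n \ge 0\} \\
\operatorname{After}(S) = S \,\backslash\, \{S b^n \mid n \ge 0\} &= b^*
\end{align*}
```

Here both sets even turn out to be regular, which is stronger than the context-freeness that the closure argument guarantees in general.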

Raphael

I started to write an answer then realized my proof was the same as yours. I'd have put it this way (compressed to fit here): form a grammar $G'$ by adding a new terminal $\hat A$ (a metavariable) for each non-terminal $A$ and a production $A\to\hat A$. Then sentential forms of $G$ are the words recognized by $G'$ that consist of metavariables. This is the intersection of a CFG with a regular language and thus is context-free. The prefix set of a CFG is a CFG (take a PDA and make every state final). $\mathrm{Before}(\beta) = \{\gamma \mid \gamma\beta\in L(\mathrm{Prefix}(G'))\}$ is again a CFG.

Gilles 'SO- stop being evil'

1

@Gilles, three comments on that: 1) the sentential forms typically (properly) contain the language. 2) "make every state final" -- that won't work; you'll accept prefixes of non-words, too. 3) The last step of "cutting off" a suffix seems to be tricky to get rigorous. :/ Do you have a rigorous but more compact proof than mine?

Raphael


9

$\mbox{Before}(\beta)$ and $\mbox{After}(\beta)$ are context-free languages. Here's how I would prove it. First, a lemma (which is the crux). If $L$ is CF then:

$\mbox{Before}(L,\beta) = \{ \gamma \ |\ \exists \delta . \gamma \beta \delta \in L \}$

and

$\mbox{After}(L,\beta) = \{ \gamma \ |\ \exists \delta . \delta \beta \gamma \in L \}$

are CF.

Proof? For $\mbox{Before}(L,\beta)$, construct a non-deterministic finite-state transducer $T_{\beta}$ that scans a string, outputs every input symbol it sees, and simultaneously searches non-deterministically for $\beta$. Whenever $T_{\beta}$ sees the first symbol of $\beta$, it forks non-deterministically and ceases outputting symbols until either it finishes seeing $\beta$ or it sees a symbol that deviates from $\beta$, stopping in either case. If $T_{\beta}$ sees $\beta$ in full, it accepts upon stopping (silently consuming whatever input remains), which is the only way it accepts. If it sees a deviation from $\beta$, it rejects.

The lemma can be jiggered to handle cases where $\beta$ could overlap with itself (like $abab$ -- keep looking for $\beta$ even while in the midst of scanning for a prior $\beta$) or appears multiple times (actually, the original non-deterministic forking already handles that).

It's fairly clear that $T_\beta(L) = \mbox{Before}(L,\beta)$ , and since the CFLs are closed under finite-state transduction, $\mbox{Before}(L,\beta)$ is therefore CF.

A similar argument goes for $\mbox{After}(L,\beta)$ , or it could be done with string reversals from $\mbox{Before}(L,\beta)$ , CFLs also being closed under reversal:

$\mbox{After}(L,\beta) = \mbox{rev}(\mbox{Before}(\mbox{rev}(L),\mbox{rev}(\beta)))$

Actually, now that I see the reversal argument, it would be even easier to start with $\mbox{After}(L,\beta)$ , since the transducer for that is simpler to describe and verify -- it outputs the empty string while looking for a $\beta$ . When it finds $\beta$ it forks non-deterministically, one fork continuing to look for further copies of $\beta$ , the other fork copying all subsequent characters verbatim from input to output, accepting all the while.
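The behaviour of the simpler $\mbox{After}$ transducer, and the reversal identity, can be checked on finite languages with a small sketch. Rather than building transducer states, it enumerates the outputs of the accepting runs directly (one run per occurrence of $\beta$); the function names and the sample language are assumptions for illustration only.

```python
def after_transducer(word, beta):
    """Return every output the non-deterministic transducer can produce on word.

    There is one accepting run per occurrence of beta: the run emits nothing
    up to and including that occurrence, then copies the rest verbatim.
    """
    outputs = set()
    for i in range(len(word) - len(beta) + 1):
        if word[i:i + len(beta)] == beta:
            outputs.add(word[i + len(beta):])
    return outputs


def after(language, beta):
    """After(L, beta) for a finite language given as a set of strings."""
    return {out for word in language for out in after_transducer(word, beta)}


def before(language, beta):
    """Before(L, beta) via the identity rev(After(rev(L), rev(beta)))."""
    reversed_language = {word[::-1] for word in language}
    return {out[::-1] for out in after(reversed_language, beta[::-1])}


# Hypothetical sample language; "abab" contains two occurrences of "ab"
language = {"abab", "xab"}
after_ab = after(language, "ab")
before_ab = before(language, "ab")
print(sorted(after_ab), sorted(before_ab))
```

This only illustrates the input/output relation on finite sets; the closure of CFLs under finite-state transduction is what carries the actual proof.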

What remains is to make this work for sentential forms as well as CFLs. But that is pretty straightforward, since the language of sentential forms of a CFG is itself a CFL. You can show that by replacing every non-terminal $X$ throughout $G$ by say $X^\prime$ , declaring $X$ to be a terminal, and adding all productions $X^\prime \rightarrow X$ to the grammar.

I'll have to think about your question on unambiguity.

David Lewis
