Einfacher Beweis dafür, dass kontextfreie Sprachen im zyklischen Wandel geschlossen werden

11

Die zyklische Verschiebung (auch Rotation oder Konjugation genannt ) einer Sprache LL ist definiert als { y x x y L }{yxxyL} . Laut Wikipedia (und hier ) sind die kontextfreien Sprachen im Rahmen dieser Operation geschlossen, mit Verweisen auf Artikel aus Oshiba und aus Maslow. Gibt es einen einfachen Beweis für diese Tatsache?

Für reguläre Sprachen wird der Abschluss in dieser Form als " Beweisen Sie, dass reguläre Sprachen unter dem Zyklusoperator geschlossen sind " erläutert .

Hendrik Jan.
quelle

Antworten:

5

Sie können versuchen, Pushdown-Automaten zu verwenden. Mit einem Pushdown-Automaten für die Originalsprache konstruieren wir einen für die zyklische Verschiebung. Der neue Automat arbeitet in zwei Stufen, die dem y- und dem x- Teil des Wortes y x entsprechen (wobei x y in der Originalsprache ist). In der ersten Stufe kann der Automat, wann immer er ein Nicht-Terminal A platzen lassen möchte , stattdessen ein Nicht-Terminal A ' drücken ; Die Idee ist, dass der Stapel am Ende der ersten Stufe in umgekehrter Reihenfolge die Symbole enthält, die nach dem Lesen von x im Stapel gefunden werdenyxyxxyAAxdurch den ursprünglichen Automaten. In der zweiten Stufe (der Schalter ist nicht deterministisch) dürfen wir , anstatt ein Nicht-Terminal A zu drücken , ein Nicht-Terminal A 'platzen lassen . Wenn der ursprüngliche Automat tatsächlich den Stapel beim Lesen von x erzeugen kann , kann der neue den gesamten Stapel exakt platzen lassen.AAx

Bearbeiten: Hier sind einige weitere Details. Angenommen, wir erhalten einen PDA mit dem Alphabet Σ , einer Menge von Zuständen Q , einer Menge von Akzeptanzzuständen F , Nichtterminals Γ , einem Anfangszustand q 0 und einer Menge zulässiger Übergänge. Jeder zulässige Übergang hat die Form ( q , a , A , q ' , α ) , was bedeutet, dass im Zustand q beim Lesen von a A (oder a = ϵ , in diesem Fall handelt es sich um einen freien Übergang) oben liegt -of-Stack ist A ΣQFΓq0(q,a,A,q,α)qaAa=ϵΓ (oder A = ε , was bedeutet Stapel leer ist), dannder PDA kann (es ist ein nicht-deterministisches Modell) zu bewegen Zustand q ' , Ersetzen A mit & agr; Γ * .AΓA=ϵqAαΓ

Der neue PDA hat eine neue nicht-terminale A ' für jedes A & Ggr; . Für jeweils zwei Zustände q , q 'Q und A & Ggr; { ε } , gibt es zwei Zustände ( q , q ' , 1 ) , ( q , q ' , 2 , A ) . Die Startzustände (der tatsächliche Startzustand wird über ϵ- Übergänge nicht deterministisch unter ihnen gewählt ) sind (AA & egr ; & Ggr;q, q'Q.A & egr ; & Ggr; { ε }( q, q', 1 ) , ( q, q', 2 , A )ϵq , q , 1 ) . Für jeden Übergang ( q , a , A , q ' , α ) gibt es entsprechende Übergänge ( ( q , q " , 1 ) , a , A , ( q ' , q " , 1 ) , α ) und ( ( q , q " , 2 , B.( q, q, 1 )( q, a , A , q', α )((q,q′′,1),a,A,(q,q′′,1),α)) , a , A , ( q ' , q ' ' , 2 , B ) , α ) . Es gibt auch andere Übergänge.((q,q′′,2,B),a,A,(q,q′′,2,B),α)

Für jeden Übergang ( q , a , A , q ' , α ) gibt es Übergänge ( ( q , q " , 1 ) , a , B ' , ( q ' , q " , 1 ) , B ' A ' α ). , wobei B & Ggr; { ε } , und ε ' =(q,a,A,q,α)((q,q′′,1),a,B,(q,q′′,1),BAα)BΓ{ϵ}ϵ . Für jeden Endzustand q F gibt es Übergänge ( ( q , q " , 1 ) , ε , A , ( q 0 , q " , 2 , ε ) , A ) , wobei A & Ggr; { ε } .ϵ=ϵqF((q,q′′,1),ϵ,A,(q0,q′′,2,ϵ),A)AΓ{ϵ}

Für jeden Übergang ( q , a , ϵ , q ' , α ) gibt es Übergänge ( ( q , q " , 2 , A ) , a , B ' , ( q ' , q " , 2 , A ) , B ' α ) , wobei A & Ggr; { ε } . Für jeden Übergang(q,a,ϵ,q,α)((q,q′′,2,A),a,B,(q,q′′,2,A),Bα)AΓ{ϵ}(q,a,ϵ,q,A)(q,a,ϵ,q,A), there are transitions ((q,q,2,B),a,A,(q,q,2,A),ϵ)((q,q′′,2,B),a,A,(q,q′′,2,A),ϵ), where BΓ{ϵ}BΓ{ϵ}. For every transition (q,a,A,q,B)(q,a,A,q,B), there are "generalized transitions" ((q,q,2,C),a,BA,(q,q,2,C),ϵ)((q,q′′,2,C),a,BA,(q,q′′,2,C),ϵ); these are implemented as a sequence of two transitions through an intermediate new state. Transitions (q,a,ϵ,q,α)(q,a,ϵ,q,α) with |α|2|α|2 are handled similarly. For every transition (q,a,A,q,A)(q,a,A,q,A), there are transitions ((q,q,2,A),a,B,(q,q,2,A),B)((q,q′′,2,A),a,B,(q,q′′,2,A),B), where BΓ{ϵ}BΓ{ϵ}. Transitions (q,a,A,q,Aα)(q,a,A,q,Aα) are handled similarly. Finally, there is a sole final state ff, and transitions ((q,q,2,A),ϵ,ϵ,f,ϵ)((q,q,2,A),ϵ,ϵ,f,ϵ).

(There might be a few transitions that I missed, and some of the details that I'm omitting are somewhat messy.)

Recall we're trying to accept a word yxyx, where xyxy is accepted by the original PDA. A state (q,q,1)(q,q,1) means that we're at stage 1, at state qq, and the original PDA is at state qq after reading xx. A state (q,q,2,A)(q,q,2,A) is similar, where AA corresponds to the last AA that was popped. At stage 1, we are allowed to push AA instead of popping AA. We do that for each non-terminal that is produced while processing xx, but only popped while processing yy. At stage 2, we are allowed to pop AA instead of pushing AA. If we do this, then we have to remember that the top-of-stock is really AA; this only applies when there are no "temporary" things on the stack, which in the simulated PDA is the same as the top-of-stack being ϵϵ or of the form BB.

Here is a simple example. Consider an automaton for xnynxnyn that pushes AA for each xx, and pops AA for each yy. The new automaton accepts words of two forms: ykxnynkykxnynk and xkynxnkxkynxnk. For words of the first form, stage 1 consists of pushing kk times AA, stage 2 consists of popping kk times AA, pushing nknk times AA, and popping nknk times AA. For words of the second form, we first push kk times AA, then pop kk times AA, push nknk times AA, transition to stage 2, and pop nknk times AA.

Here is a more complicated example, for the language of balanced parentheses of various types ("()","[]","<>") such that the immediate descendants of each type of parentheses must belong to a different type. For example, "([]<>)" is OK but "()" is wrong. For each "(", we push AA if the top-of-stack isn't AA, for each ")", we pop AA. Similarly BB,CC are associated with "[]" and "<>". Here is how we accept the word ">)([()]<". We consume ">)", pushing CACA, and transition to stage 2. We consume "(", popping AA and remembering the top-of-stack AA. We consume "[()]" , pushing and popping BABA; when pushing BB, we are aware that the "real" top-of-stack is AA, and so square brackets are allowed (we wouldn't be fooled by ">)(()<"); when pushing AA, since the top-of-stack is BB (which is not ϵϵ or of the form XX), then we know that BB is also the "real" top-of-stack, and so round parentheses are allowed (even though the shadow top-of-stack is AA). Finally, we consume "<" and pop CC.

Yuval Filmus
quelle
Sorry, I'm having trouble understanding -- can you explain further? Where does the automaton start and what are its transitions? And does the AAAA switch happen for every stack symbol? Thanks!
usul
Very interesting suggestion. Thanks. I will chew on this a little, to let it sink in.
Hendrik Jan
@usul, You'll have to fill-in the details yourself. The switch AAAA (in the first stage) should happen when the automaton "wants" to pop AA but cannot, and instead it pushes AA. You can think of it as a non-deterministic move.
Yuval Filmus
@Yuval: sorry but I cannot make this happen. As I understand your idea, the new automaton starts by simulating the yy part, changing pops and pushes. Then generate αα on the stack, where the original automaton starts with αα when reading yy. Waht is the original starts by pushing? Then the nwe automaton needs to pop from the empty stack. I still think your intuition is worthwhile, but some extra care seems needed.
Hendrik Jan
@Hendrik, I'm sorry, but I can't follow your counterexample. At what point do you think that the new automaton needs to pop from the empty stack?
Yuval Filmus
4

Consider Greibach normal form. To construct a shifted language you only need to change productions SαA1AnSαA1An to SA1AnαSA1Anα and add a new non-terminal S that behaves like S used to, in case some some production referenced S.

Karolis Juodelė
quelle
Thanks, but this is shift by a single letter. I am interested in general rotation, by an arbitrary number of letters.
Hendrik Jan
3
@HendrikJan, Well, if shift1(L) is context free, so shiftn(L)=shift1(shift1((L)) will surely be context free as well. You can construct a grammar for it by applying the method I suggested n times. You can also construct the shiftn(L) grammar directly, by "flattening" the given grammar. For instance if n=2 and the grammar has productions SαAB and AβC, you'd add a production SαβCB and rotate that. Of course, the size of your grammar might grow very quickly.
Karolis Juodelė
1
For fixed n you are right. But here n is fixed nor bounded. For instance if L={anbnn0} then we obtain {akbnak+=n}{bkanbk+=n}.
Hendrik Jan
@HendrikJan, I see. I wrongly assumed the question was about a finite shift. I'll reconsider my answer...
Karolis Juodelė
4

It turned out to be a good idea to check the old Hopcroft and Ullman classic Introduction to Automata Theory (1979). Closure under cycle is Exercise 6.4c and is marked S**. The double stars mean it is one of the most difficult problems (in the book). Fortunately the S indicates it is one of the selected problems with a solution.

The solution is as follows. Take a CFG in Chomsky normal form. Consider any derivation tree and basically turn it upside down. Consider a path S=X1,X2,,Xn in the original tree. To the left the tree has contributions x1,x2,,xn to the right y1,y2,,yn, meaning the string derived equals x1x2xnyny2y1. (Actually as the grammar is CNF when the path continues left, the contribution will be to the right and the corresponding xi is empty, etc.)

The tree upside down has a path S,ˆXn,ˆX2,ˆX1 with contributions yn,,y2y1 to the left and xn,,x2x1 to the right, so the result is a derivation for yny2y1x1x2xn. As required.

Full details of the construction are given in the book.

Note how this reminds of the (accepted) solution by Yuval. The nonterminals that are pushed instead of popped are in the opposite order on the stack. Quite similar to upside down in the tree.

Hendrik Jan
quelle