Introduction
Regular languages provide a formalism for describing and reasoning about patterns in strings, making them fundamental to designing compilers. As every regular language can be modeled as a deterministic finite automaton (DFA), they also provide several benchmarks for the running times of algorithms with string input. For instance, the set of all email addresses forms a regular language, showing that (omitting space complexity) we can check if some string 1 is a syntactically correct email address in O(n). The following was an assignment problem of mine that deals with a lesser-known theorem identifying a closure property of regular languages.
Theorem.
Let Σ be some finite alphabet. For languages A⊆Σ∗ and B⊆Σ∗, let the shuffling of A and B, denoted shuffle(A,B), be the language
shuffle(A,B)={w∈Σ∗∣w=a1b1…akbk where a1…ak∈A and b1…bk∈B for some k∈Z+} The class of regular languages is closed under the shullfing operation.
Solution
A quick note
Usually, when proving closure properties, it's much easier to construct a non-deterministic finite automaton and state its equivalence to a DFA. But, in this case, I found it much more intuitive to construct another DFA using the DFAs for languages A and B. To see why, try drawing a picture of two (preferably small) DFAs for two specific regular languages, then add the new transition function δ from the proof below.
Proving the theorem
Proof. Let Σ be some finite alphabet. Let A⊆Σ∗ and B⊆Σ∗ be regular languages. Since A and B are regular languages, let MA=(QA,Σ,δA,q0A,FA) and MB=(QB,Σ,δB,q0B,FB) be deterministic finite automata such that L(MA)=A and L(MB)=B. Let M=(Q,Σ,δ,(q0A,q0B,A),F) be a deterministic finite automaton where Q=QA×QB×{A,B}, F=FA×FB×{A}, and δ:Q×Σ↦Q is defined by for all qiA∈QA, qjB∈QB, and w∈Σ∗,
δ((qiA,qjB,A),w)=(δA(qiA,w),qjB,B)δ((qiA,qjB,B),w)=(qiA,δB(qjB,w),A) Let w∈Σ∗ and notice that w∈L(M) iff δ^((q0A,q0B,A),w)∈F2. Since F=FA×FB×{A}, by the definition of δ, we can see that M rejects w if w is of odd length or the empty string. Now, consider the case where w has even length. That is, w=x1y1…xkyk for some k∈Z+ where x1…xk∈Σ∗ and y1…yk∈Σ∗. By induction (here), we can show that, δ^((q0A,q0B,A),w)=(δ^A(q0A,x1…xk),δ^B(q0B,y1…yk),A). It follows that w∈L(M) iff δ^A(q0A,x1…xk)∈FA and δ^B(q0B,y1…yk)∈FB. Equivalently, w∈L(M) iff x1…xk∈L(MA)=A and y1…yk∈L(MB)=B. Therefore,
L(M)={w∈Σ∗∣w=a1b1…akbk where a1…ak∈A and b1…bk∈B for some k∈Z+} Thus, the class of regular languages is closed under the shuffling operation.
Proving the lemma
In case you're interested.
lemma.
Let k∈Z+. For all x1y1…xkyk∈Σ∗,
δ^((q0A,q0B,A),x1y1…xkyk)=(δ^A(q0A,x1…xk),δ^B(q0B,y1…yk),A) Proof.
Base Case: Let k=1
δ^((q0A,q0B,A),x1y1)=δ(δ^((q0A,q0B,A),x1),y1)=δ(δ(δ^((q0A,q0B,A),ϵ),x1),y1)=δ(δ((q0A,q0B,A),x1),y1)=δ((δA(q0A,x1),q0B,B)y1)=(δA(q0A,x1),δB(q0B,y1),A)=(δ^A(q0A,x1),δ^B(q0B,y1),A) Induction Step: Let k∈Z+ and assume
δ^((q0A,q0B,A),x1y1…xkyk)=(δ^A(q0A,x1…xk),δ^B(q0B,y1…yk),A)(1) Then,
δ^((q0A,q0B,A)x1y1…xk+1yk+1)=δ(δ^((q0A,q0B,A)x1y1…xkykxk+1),yk+1)=δ(δ(δ^((q0A,q0B,A),x1y1…xkyk),xk+1),yk+1)=δ(δ((δ^A(q0A,x1…xk),δ^B(q0B,y1…yk),A),xk+1),yk+1)=δ((δA(δ^A(q0A,x1…xk),xk+1),δ^B(q0B,y1…yk),B)yk+1)=δ((δ^A(q0A,x1…xk+1),δ^B(q0B,y1…yk),B),yk+1)=δ(δ^A(q0A,x1…xk+1),δ(δ^B(q0B,y1…yk),yk+1),A)=δ(δ^A(q0A,x1…xk+1),δ^B(q0B,y1…yk+1),A)(1)