Multitape Rational Expressions

This page is a complement to the paper Derived-Term Automata of Multitape Rational Expressions presented at CIAA 2016. This page exists in several forms:

More information is available here:

You may change the cells, and run then. To run a cell in a notebook, hit "Control-Enter" (in which case the focus stays on the same cell) or "Shift-Enter" (focus goes to the next cell). Beware that depending on the requested operations, Vcsn may generate and compile code, which may be a really slow process on small machines (about a minute): be patient! However, the code is compiled only once: successive uses will be way faster.

To run all the cells anew, select "Restart & Run All" in the "Kernel" menu above.

In [1]:
import vcsn

Example $\mathsf{E}_1$: A Simple Multitape Expression

First we introduce the "context" we are interested in: labels are letter-or-empty-word, two tapes, values are rational numbers.

In [2]:
ctx = vcsn.context('lat<lan(abcde), lan(xy)>, q')
ctx
Out[2]:
$(\{a, b, c, d, e\})^? \times (\{x, y\})^?\to\mathbb{Q}$

The expression $\mathsf{E}_1$ is:

In [3]:
e1 = ctx.expression('⟨5⟩ε|ε + ⟨4⟩ade*|x + ⟨3⟩bde*|x + ⟨2⟩ace*|xy + ⟨6⟩bce*|xy')
e1
Out[3]:
$ \left. \left\langle 5 \right\rangle \,\varepsilon \middle| \varepsilon \right. + \left. \left\langle 4 \right\rangle \,\left(a \, d \, {e}^{*}\right) \middle| x \right. + \left. \left\langle 3 \right\rangle \,\left(b \, d \, {e}^{*}\right) \middle| x \right. + \left. \left\langle 2 \right\rangle \,\left(a \, c \, {e}^{*}\right) \middle| x \, y \right. + \left. \left\langle 6 \right\rangle \,\left(b \, c \, {e}^{*}\right) \middle| x \, y \right. $

Its expansion is (contrary to the paper, the empty expression is denoted $\varepsilon$ instead of $\mathsf{1}$):

In [4]:
e1.expansion()
Out[4]:
$\left\langle 5\right\rangle \oplus a|x \odot \left[\left\langle 2\right\rangle \left. c \, {e}^{*} \middle| y \right. \oplus \left\langle 4\right\rangle \left. d \, {e}^{*} \middle| \varepsilon \right. \right] \oplus b|x \odot \left[\left\langle 6\right\rangle \left. c \, {e}^{*} \middle| y \right. \oplus \left\langle 3\right\rangle \left. d \, {e}^{*} \middle| \varepsilon \right. \right]$

The derived-term automaton of $\mathsf{E}_1$, $\mathcal{A}_{\mathsf{E}_1}$, is:

In [5]:
a1 = e1.derived_term()
a1
Out[5]:
%3 I0 0 ⟨5⟩ε|ε+⟨4⟩(ade*)|x+⟨3⟩(bde*)|x+⟨2⟩(ace*)|xy+⟨6⟩(bce*)|xy I0->0 F0 F3 0->F0 ⟨5⟩ 1 ce*|y 0->1 ⟨2⟩a|x, ⟨6⟩b|x 2 de*|ε 0->2 ⟨4⟩a|x, ⟨3⟩b|x 3 e*|ε 1->3 c|y 2->3 d|ε 3->F3 3->3 e|ε

The 10 shortest "multitape words" it accepts are:

In [6]:
a1.shortest(10)
Out[6]:
$\left\langle 5\right\rangle \varepsilon|\varepsilon \oplus \left\langle 2\right\rangle \mathit{ac}|\mathit{xy} \oplus \left\langle 4\right\rangle \mathit{ad}|\mathit{x} \oplus \left\langle 6\right\rangle \mathit{bc}|\mathit{xy} \oplus \left\langle 3\right\rangle \mathit{bd}|\mathit{x} \oplus \left\langle 2\right\rangle \mathit{ace}|\mathit{xy} \oplus \left\langle 4\right\rangle \mathit{ade}|\mathit{x} \oplus \left\langle 6\right\rangle \mathit{bce}|\mathit{xy} \oplus \left\langle 3\right\rangle \mathit{bde}|\mathit{x} \oplus \left\langle 2\right\rangle \mathit{acee}|\mathit{xy}$

Example $\mathcal{A}_3$: An Exponential Number of States

We introduce a three-tape context. The graphical rendering is less satisfying.

In [7]:
ctx3 = vcsn.context('lat<lan, lan, lan>, q')
a3 = ctx3.expression('a*|b*|c*').derived_term()
a3
Out[7]:
%3 I0 0 a*|b*|c* I0->0 F0 F1 F2 F3 F4 F5 F6 0->F0 0->0 a|b|c 1 ε|ε|c* 0->1 ε|ε|c 2 ε|b*|ε 0->2 ε|b|ε 3 ε|b*|c* 0->3 ε|b|c 4 a*|ε|ε 0->4 a|ε|ε 5 a*|ε|c* 0->5 a|ε|c 6 a*|b*|ε 0->6 a|b|ε 1->F1 1->1 ε|ε|c 2->F2 2->2 ε|b|ε 3->F3 3->1 ε|ε|c 3->2 ε|b|ε 3->3 ε|b|c 4->F4 4->4 a|ε|ε 5->F5 5->1 ε|ε|c 5->4 a|ε|ε 5->5 a|ε|c 6->F6 6->2 ε|b|ε 6->4 a|ε|ε 6->6 a|b|ε

Currently Vcsn is not able to extract nice rational expressions from such an automaton: it will always produce a "simple-tape expression over multitape generators":

In [8]:
a3.expression()
Out[8]:
$\left(a|b|c\right)^{*} \, \left(\varepsilon + \left(\varepsilon|\varepsilon|c\right) \, \left(\varepsilon|\varepsilon|c\right)^{*} + \left(\varepsilon|b|\varepsilon\right) \, \left(\varepsilon|b|\varepsilon\right)^{*} + \left(a|\varepsilon|\varepsilon\right) \, \left(a|\varepsilon|\varepsilon\right)^{*} + \left(\varepsilon|b|c\right) \, \left(\varepsilon|b|c\right)^{*} \, \left(\varepsilon + \left(\varepsilon|\varepsilon|c\right) \, \left(\varepsilon|\varepsilon|c\right)^{*} + \left(\varepsilon|b|\varepsilon\right) \, \left(\varepsilon|b|\varepsilon\right)^{*}\right) + \left(a|\varepsilon|c\right) \, \left(a|\varepsilon|c\right)^{*} \, \left(\varepsilon + \left(\varepsilon|\varepsilon|c\right) \, \left(\varepsilon|\varepsilon|c\right)^{*} + \left(a|\varepsilon|\varepsilon\right) \, \left(a|\varepsilon|\varepsilon\right)^{*}\right) + \left(a|b|\varepsilon\right) \, \left(a|b|\varepsilon\right)^{*} \, \left(\varepsilon + \left(\varepsilon|b|\varepsilon\right) \, \left(\varepsilon|b|\varepsilon\right)^{*} + \left(a|\varepsilon|\varepsilon\right) \, \left(a|\varepsilon|\varepsilon\right)^{*}\right)\right)$

Instead of displaying the automaton, we may list its states, for instance in the case of a five-tape expression.

In [9]:
import re
def states(a):
    '''The states of an automaton, sorted.'''
    res = re.findall(r'label = "(.*?)", shape', a.dot(), re.M)
    res.sort()
    return res

ctx5 = vcsn.context('lat<lan(a), lan(b), lan(c), lan(d), lan(e)>, q')
e5 = ctx5.expression('a*|b*|c*|d*|e*')
e5
Out[9]:
$ \left. {a}^{*} \middle| {b}^{*} \middle| {c}^{*} \middle| {d}^{*} \middle| {e}^{*} \right. $
In [10]:
e5.expansion()
Out[10]:
$\left\langle 1\right\rangle \oplus \varepsilon|\varepsilon|\varepsilon|\varepsilon|e \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| \varepsilon \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus \varepsilon|\varepsilon|\varepsilon|d|\varepsilon \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| \varepsilon \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus \varepsilon|\varepsilon|\varepsilon|d|e \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| \varepsilon \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus \varepsilon|\varepsilon|c|\varepsilon|\varepsilon \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| {c}^{*} \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus \varepsilon|\varepsilon|c|\varepsilon|e \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| {c}^{*} \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus \varepsilon|\varepsilon|c|d|\varepsilon \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| {c}^{*} \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus \varepsilon|\varepsilon|c|d|e \odot \left[ \left. \varepsilon \middle| \varepsilon \middle| {c}^{*} \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus \varepsilon|b|\varepsilon|\varepsilon|\varepsilon \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| \varepsilon \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus \varepsilon|b|\varepsilon|\varepsilon|e \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| \varepsilon \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus \varepsilon|b|\varepsilon|d|\varepsilon \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| \varepsilon \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus \varepsilon|b|\varepsilon|d|e \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| \varepsilon \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus \varepsilon|b|c|\varepsilon|\varepsilon \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| {c}^{*} \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus \varepsilon|b|c|\varepsilon|e \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| {c}^{*} \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus \varepsilon|b|c|d|\varepsilon \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| {c}^{*} \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus \varepsilon|b|c|d|e \odot \left[ \left. \varepsilon \middle| {b}^{*} \middle| {c}^{*} \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus a|\varepsilon|\varepsilon|\varepsilon|\varepsilon \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| \varepsilon \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus a|\varepsilon|\varepsilon|\varepsilon|e \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| \varepsilon \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus a|\varepsilon|\varepsilon|d|\varepsilon \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| \varepsilon \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus a|\varepsilon|\varepsilon|d|e \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| \varepsilon \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus a|\varepsilon|c|\varepsilon|\varepsilon \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| {c}^{*} \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus a|\varepsilon|c|\varepsilon|e \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| {c}^{*} \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus a|\varepsilon|c|d|\varepsilon \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| {c}^{*} \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus a|\varepsilon|c|d|e \odot \left[ \left. {a}^{*} \middle| \varepsilon \middle| {c}^{*} \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus a|b|\varepsilon|\varepsilon|\varepsilon \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| \varepsilon \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus a|b|\varepsilon|\varepsilon|e \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| \varepsilon \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus a|b|\varepsilon|d|\varepsilon \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| \varepsilon \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus a|b|\varepsilon|d|e \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| \varepsilon \middle| {d}^{*} \middle| {e}^{*} \right. \right] \oplus a|b|c|\varepsilon|\varepsilon \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| {c}^{*} \middle| \varepsilon \middle| \varepsilon \right. \right] \oplus a|b|c|\varepsilon|e \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| {c}^{*} \middle| \varepsilon \middle| {e}^{*} \right. \right] \oplus a|b|c|d|\varepsilon \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| {c}^{*} \middle| {d}^{*} \middle| \varepsilon \right. \right] \oplus a|b|c|d|e \odot \left[ \left. {a}^{*} \middle| {b}^{*} \middle| {c}^{*} \middle| {d}^{*} \middle| {e}^{*} \right. \right]$
In [11]:
a5 = e5.derived_term()
states(a5)
Out[11]:
['a*|b*|c*|d*|e*',
 'a*|b*|c*|d*|ε',
 'a*|b*|c*|ε|e*',
 'a*|b*|c*|ε|ε',
 'a*|b*|ε|d*|e*',
 'a*|b*|ε|d*|ε',
 'a*|b*|ε|ε|e*',
 'a*|b*|ε|ε|ε',
 'a*|ε|c*|d*|e*',
 'a*|ε|c*|d*|ε',
 'a*|ε|c*|ε|e*',
 'a*|ε|c*|ε|ε',
 'a*|ε|ε|d*|e*',
 'a*|ε|ε|d*|ε',
 'a*|ε|ε|ε|e*',
 'a*|ε|ε|ε|ε',
 'ε|b*|c*|d*|e*',
 'ε|b*|c*|d*|ε',
 'ε|b*|c*|ε|e*',
 'ε|b*|c*|ε|ε',
 'ε|b*|ε|d*|e*',
 'ε|b*|ε|d*|ε',
 'ε|b*|ε|ε|e*',
 'ε|b*|ε|ε|ε',
 'ε|ε|c*|d*|e*',
 'ε|ε|c*|d*|ε',
 'ε|ε|c*|ε|e*',
 'ε|ε|c*|ε|ε',
 'ε|ε|ε|d*|e*',
 'ε|ε|ε|d*|ε',
 'ε|ε|ε|ε|e*']

Example $\mathsf{E}_2$: A Sed-like Substitution

In [12]:
e2 = ctx.expression('(a{+}|x + b{+}|y)*')
e2
Out[12]:
$\left( \left. a \, {a}^{*} \middle| x \right. + \left. b \, {b}^{*} \middle| y \right. \right)^{*}$
In [13]:
e2.expansion()
Out[13]:
$\left\langle 1\right\rangle \oplus a|x \odot \left[\left( \left. {a}^{*} \middle| \varepsilon \right. \right) \, \left( \left. a \, {a}^{*} \middle| x \right. + \left. b \, {b}^{*} \middle| y \right. \right)^{*}\right] \oplus b|y \odot \left[\left( \left. {b}^{*} \middle| \varepsilon \right. \right) \, \left( \left. a \, {a}^{*} \middle| x \right. + \left. b \, {b}^{*} \middle| y \right. \right)^{*}\right]$
In [14]:
a2 = e2.derived_term()
a2
Out[14]:
%3 I0 0 (aa*|x+bb*|y)* I0->0 F0 F1 F2 0->F0 1 (a*|ε)(aa*|x+bb*|y)* 0->1 a|x 2 (b*|ε)(aa*|x+bb*|y)* 0->2 b|y 1->F1 1->1 a|ε, a|x 1->2 b|y 2->F2 2->1 a|x 2->2 b|ε, b|y

Again, the extracted expression is less readable.

In [15]:
a2.expression()
Out[15]:
$\varepsilon + \left(a|x\right) \, \left(a|\varepsilon + a|x\right)^{*} + \left(b|y + \left(a|x\right) \, \left(a|\varepsilon + a|x\right)^{*} \, \left(b|y\right)\right) \, \left(b|\varepsilon + b|y + \left(a|x\right) \, \left(a|\varepsilon + a|x\right)^{*} \, \left(b|y\right)\right)^{*} \, \left(\varepsilon + \left(a|x\right) \, \left(a|\varepsilon + a|x\right)^{*}\right)$

A More Complex Expression

The previous examples often look like sed-like substitutions, in the sense that the first tape was often a composite expression, but the second tape a simple label. There is no such limitation.

In [16]:
e = ctx.expression('(<2>[ab])* | (<3>[xy])*')
e
Out[16]:
$ \left. \left( \left\langle 2 \right\rangle \,\left(a + b\right)\right)^{*} \middle| \left( \left\langle 3 \right\rangle \,\left(x + y\right)\right)^{*} \right. $
In [17]:
a = e.derived_term()
a
Out[17]:
%3 I0 0 (⟨2⟩(a+b))*|(⟨3⟩(x+y))* I0->0 F0 F1 F2 0->F0 0->0 ⟨6⟩[a|xa|yb|xb|y] 1 ε|(⟨3⟩(x+y))* 0->1 ⟨3⟩ε|x, ⟨3⟩ε|y 2 (⟨2⟩(a+b))*|ε 0->2 ⟨2⟩a|ε, ⟨2⟩b|ε 1->F1 1->1 ⟨3⟩ε|x, ⟨3⟩ε|y 2->F2 2->2 ⟨2⟩a|ε, ⟨2⟩b|ε
In [18]:
print('{:l}'.format(a.shortest(20)))
\e|\e
<3>\e|x
<3>\e|y
<2>a|\e
<6>a|x
<6>a|y
<2>b|\e
<6>b|x
<6>b|y
<9>\e|xx
<9>\e|xy
<9>\e|yx
<9>\e|yy
<18>a|xx
<18>a|xy
<18>a|yx
<18>a|yy
<18>b|xx
<18>b|xy
<18>b|yx