Transducers

Transducers, also called k-tape automata, are finite state machines where transitions are labeled on several tapes. The labelset of a transducer is a cartesian product of the labelsets of each tape: $L = L_1 \times \dots \times L_k$.

Usually, it is common to manipulate 2-tape transducers, and to consider one as the input tape, and the other as the output tape. For example, we can define a 2-tape transducer with the first tape accepting letters in [a-c], and the same for the second tape:

In [1]:
import vcsn
ctx = vcsn.context("lat<lal_char(abc), lal_char(abc)>, b")
ctx
Out[1]:
$\{a, b, c\} \times \{a, b, c\}\to\mathbb{B}$

Now we can define a transducer that will transform every a into b, and keep the rest of the letters. When writing the expression, to delimit the labels (a letter for each tape), we have to use simple quotes.

In [2]:
r = ctx.expression("(a|b+b|b+c|c)*")
r
Out[2]:
$\left(a|b + b|b + c|c\right)^{*}$
In [3]:
r.automaton()
Out[3]:
%3 I0 0 0 I0->0 F0 0->F0 0->0 [a|bb|bc|c]

Similarly, it is possible to define weighted transducers, as for weighted automata:

In [4]:
import vcsn
ctxw = vcsn.context("lat<lan_char(ab), lan_char(xy)>, z")
ctxw
Out[4]:
$(\{a, b\})^? \times (\{x, y\})^?\to\mathbb{Z}$
In [5]:
r = ctxw.expression("(a|x)*((a|y)(b|x))*(b|y)*")
r
Out[5]:
$\left(a|x\right)^{*} \, \left(\left(a|y\right) \, \left(b|x\right)\right)^{*} \, \left(b|y\right)^{*}$
In [6]:
r.thompson()
Out[6]:
%3 I2 2 2 I2->2 F13 0 0 1 1 0->1 a|x 1->0 ε|ε 3 3 1->3 ε|ε 2->0 ε|ε 2->3 ε|ε 8 8 3->8 ε|ε 4 4 5 5 4->5 a|y 6 6 5->6 ε|ε 7 7 6->7 b|x 7->4 ε|ε 9 9 7->9 ε|ε 8->4 ε|ε 8->9 ε|ε 12 12 9->12 ε|ε 10 10 11 11 10->11 b|y 11->10 ε|ε 13 13 11->13 ε|ε 12->10 ε|ε 12->13 ε|ε 13->F13

This transducer transforms the as at the beginning into xs, then ab into yx, then bs into ys. As you can see, it's possible to have $\varepsilon$-transitions in a transducer.

Keep in mind that while it is the common use-case, transducers are not limited to 2 tapes, but can have an arbitrary number of tapes. The notion of input tape and output tape becomes fuzzy, and the problem will have to be addressed in the algorithms' interface.