expression.derived_term(algo="expansion",lazy=False,deterministic=False,breaking=False)

Generate the derived-term automaton from an expression.

The arguments are:

  • algo:
    • "derivation": rely on the expression.derivation.
    • "expansion": rely on the expression.expansion.
  • lazy: whether to build the result lazily, on the fly
  • deterministic: whether to build a deterministic automaton.
  • breaking: split the polynomials at each step

Preconditions:

  • derivation-based algorithms require a free labelset (e.g., word labels are not supported)
  • deterministic might not terminate on some expressions (in which case consider lazy)
  • expressions with complement operators might not terminate either

Also known as:

  • Antimirov automaton
  • equation automaton
  • partial-derivatives automaton

See also:

References:

Examples

Classical Expressions

In the classical case (labels are letters, and weights are Boolean), this is the construct as described by Antimirov.

In [1]:
import vcsn
bexp = vcsn.B.expression
r = bexp('[ab]*a[ab]{3}')
a = r.derived_term()
a
Out[1]:
%3 I0 0 (a+b)*a(a+b)³ I0->0 F4 0->0 a, b 1 (a+b)³ 0->1 a 2 (a+b)² 1->2 a, b 3 a+b 2->3 a, b 4 ε 3->4 a, b 4->F4

The states of the resulting automaton are decorated by their expressions. This can be stripped:

In [2]:
a.strip()
Out[2]:
%3 I0 0 0 I0->0 F4 0->0 a, b 1 1 0->1 a 2 2 1->2 a, b 3 3 2->3 a, b 4 4 3->4 a, b 4->F4

The result does not depend on the core computation: expansions and derivations compute the same terms:

In [3]:
r.derived_term('derivation')
Out[3]:
%3 I0 0 (a+b)*a(a+b)³ I0->0 F4 0->0 a, b 1 (a+b)³ 0->1 a 2 (a+b)² 1->2 a, b 3 a+b 2->3 a, b 4 ε 3->4 a, b 4->F4

Deterministic Automata

Alternatively, one can request a deterministic result:

In [4]:
qexp = vcsn.Q.expression
r = qexp('aba+a(a+<-1>ba)')
nfa = r.derived_term()
nfa
Out[4]:
%3 I0 0 aba+a(a+⟨-1⟩(ba)) I0->0 F3 1 ba 0->1 a 2 a+⟨-1⟩(ba) 0->2 a 4 a 1->4 b 3 ε 2->3 a 2->4 ⟨-1⟩b 3->F3 4->3 a
In [5]:
dfa = r.derived_term(deterministic=True)
dfa
Out[5]:
%3 I0 0 aba+a(a+⟨-1⟩(ba)) I0->0 F2 1 a 0->1 a 2 ε 1->2 a 2->F2
In [6]:
nfa.determinize()
Out[6]:
%3 I0 0 aba+a(a+⟨-1⟩(ba)) I0->0 F2 1 ba, a+⟨-1⟩(ba) 0->1 a 2 ε 1->2 a 2->F2

Extended Rational Expressions

Extended expressions are supported. For instance, words starting with as and then bs, but not exactly ab:

In [7]:
r = bexp('a*b* & (ab){c}')
r
Out[7]:
${a}^{*} \, {b}^{*} \& \left(a \, b\right)^{c}$
In [8]:
a = r.derived_term()
a
Out[8]:
%3 I0 0 a*b*&(ab)ᶜ I0->0 F0 F1 F2 F3 0->F0 1 a*b*&bᶜ 0->1 a 2 b* 0->2 b 1->F1 3 a*b* 1->3 a 4 b*&εᶜ 1->4 b 2->F2 2->2 b 3->F3 3->2 b 3->3 a 4->2 b
In [9]:
a.shortest(10)
Out[9]:
$\varepsilon \oplus \mathit{a} \oplus \mathit{b} \oplus \mathit{aa} \oplus \mathit{bb} \oplus \mathit{aaa} \oplus \mathit{aab} \oplus \mathit{abb} \oplus \mathit{bbb} \oplus \mathit{aaaa}$

Weighted Expressions

The following example is taken from lombardy.2005.tcs:

In [10]:
r = qexp('(<1/6>a*+<1/3>b*)*')
r.derived_term()
Out[10]:
%3 I0 0 (⟨1/6⟩a*+⟨1/3⟩b*)* I0->0 F0 F1 F2 0->F0 ⟨2⟩ 1 a*(⟨1/6⟩a*+⟨1/3⟩b*)* 0->1 ⟨1/3⟩a 2 b*(⟨1/6⟩a*+⟨1/3⟩b*)* 0->2 ⟨2/3⟩b 1->F1 ⟨2⟩ 1->1 ⟨4/3⟩a 1->2 ⟨2/3⟩b 2->F2 ⟨2⟩ 2->1 ⟨1/3⟩a 2->2 ⟨5/3⟩b

Multitape Expressions

Since both expansions and derivatives in Vcsn support the tuple operator, both flavors of derived_term can be used to build a multitape automaton from a multitape expression. For instance:

In [11]:
c = vcsn.context('lat<lan, lan>, q')
c
Out[11]:
$(\{\ldots\})^? \times (\{\ldots\})^?\to\mathbb{Q}$
In [12]:
e = c.expression('(a{+}|x + b{+}|y)*')
e
Out[12]:
$\left( \left. a \, {a}^{*} \middle| x \right. + \left. b \, {b}^{*} \middle| y \right. \right)^{*}$
In [13]:
e.derived_term('expansion')
Out[13]:
%3 I0 0 (aa*|x+bb*|y)* I0->0 F0 F1 F2 0->F0 1 (a*|ε)(aa*|x+bb*|y)* 0->1 a|x 2 (b*|ε)(aa*|x+bb*|y)* 0->2 b|y 1->F1 1->1 a|ε, a|x 1->2 b|y 2->F2 2->1 a|x 2->2 b|ε, b|y
In [14]:
e.derived_term('derivation')
Out[14]:
%3 I0 0 (aa*|x+bb*|y)* I0->0 F0 F1 F2 0->F0 1 (a*|ε)(aa*|x+bb*|y)* 0->1 a|x 2 (b*|ε)(aa*|x+bb*|y)* 0->2 b|y 1->F1 1->1 a|ε, a|x 1->2 b|y 2->F2 2->1 a|x 2->2 b|ε, b|y

Broken derived-term automaton

"Breaking" variants of derivation and expansion "split" the polynomials at each step. In short, it means that no state will be labeled by an addition: rather the addition is split into several states. As a consequence, the automaton might have several initial states.

In [15]:
r = qexp('[ab]{2}')
r.derived_term()
Out[15]:
%3 I0 0 (a+b)² I0->0 F2 1 a+b 0->1 a, b 2 ε 1->2 a, b 2->F2
In [16]:
r.derived_term(breaking=True)
Out[16]:
%3 I0 0 a(a+b) I0->0 I1 1 b(a+b) I1->1 F4 2 a 0->2 a 3 b 0->3 a 1->2 b 1->3 b 4 ε 2->4 a 3->4 b 4->F4

Lazy construction

The derived-term automaton can be built on-the-fly, on-demand: states are uncovered on needed (e.g., when traversed by an evaluation). This is especially useful when considering an expression on which the process does not terminate (i.e., would generate an "infinite" automaton).

In [17]:
r = qexp('a*+b*'); r
Out[17]:
${a}^{*} + {b}^{*}$
In [18]:
a = r.derived_term(lazy=True)
a
Out[18]:
%3 I0 0 a*+b* I0->0
In [19]:
print(a('')); a
2
Out[19]:
%3 I0 0 a*+b* I0->0 F0 0->F0 ⟨2⟩ 1 a* 0->1 a 2 b* 0->2 b
In [20]:
print(a('a')); a
1
Out[20]:
%3 I0 0 a*+b* I0->0 F0 F1 0->F0 ⟨2⟩ 1 a* 0->1 a 2 b* 0->2 b 1->F1 1->1 a
In [21]:
print(a('b')); a
1
Out[21]:
%3 I0 0 a*+b* I0->0 F0 F1 F2 0->F0 ⟨2⟩ 1 a* 0->1 a 2 b* 0->2 b 1->F1 1->1 a 2->F2 2->2 b

The following expression does not admit a (finite) deterministic automaton (and determinizing the non-deterministic derived-term automaton would not terminate either). Repeatedly calling the evaluation uncovers portions of an "infinite" deterministic automaton.

In [22]:
r = qexp('a*+(<2>a)*'); r
Out[22]:
${a}^{*} + \left( \left\langle 2 \right\rangle \,a\right)^{*}$
In [23]:
a = r.derived_term(deterministic=True, lazy=True)
a
Out[23]:
%3 I0 0 a*+(⟨2⟩a)* I0->0
In [24]:
print(a('')); a
2
Out[24]:
%3 I0 0 a*+(⟨2⟩a)* I0->0 F0 0->F0 ⟨2⟩ 1 a*+⟨2⟩(⟨2⟩a)* 0->1 a
In [25]:
print(a('a')); a
3
Out[25]:
%3 I0 0 a*+(⟨2⟩a)* I0->0 F0 F1 0->F0 ⟨2⟩ 1 a*+⟨2⟩(⟨2⟩a)* 0->1 a 1->F1 ⟨3⟩ 2 a*+⟨4⟩(⟨2⟩a)* 1->2 a
In [26]:
print(a('aaa')); a
9
Out[26]:
%3 I0 0 a*+(⟨2⟩a)* I0->0 F0 F1 F2 F3 0->F0 ⟨2⟩ 1 a*+⟨2⟩(⟨2⟩a)* 0->1 a 1->F1 ⟨3⟩ 2 a*+⟨4⟩(⟨2⟩a)* 1->2 a 2->F2 ⟨5⟩ 3 a*+⟨8⟩(⟨2⟩a)* 2->3 a 3->F3 ⟨9⟩ 4 a*+⟨16⟩(⟨2⟩a)* 3->4 a

Note that lazy determinization is also available: see automaton.determinize to see the corresponding example.

Since it entails a "local" determinization ("under" the complement operator), the algorithm fails would not terminate.

In [27]:
r = r.complement()
r
Out[27]:
$\left({a}^{*} + \left( \left\langle 2 \right\rangle \,a\right)^{*}\right)^{c}$
In [28]:
a = r.derived_term(lazy=True)
a
Out[28]:
%3 I0 0 (a*+(⟨2⟩a)*)ᶜ I0->0
In [29]:
print(a('aa')); a
0
Out[29]:
%3 I0 0 (a*+(⟨2⟩a)*)ᶜ I0->0 1 (a*+⟨2⟩(⟨2⟩a)*)ᶜ 0->1 a 2 ∅ᶜ 0->2 b 1->2 b 3 (a*+⟨4⟩(⟨2⟩a)*)ᶜ 1->3 a 3->2 b 4 (a*+⟨8⟩(⟨2⟩a)*)ᶜ 3->4 a
In [30]:
print(a('b')); a
1
Out[30]:
%3 I0 0 (a*+(⟨2⟩a)*)ᶜ I0->0 F2 1 (a*+⟨2⟩(⟨2⟩a)*)ᶜ 0->1 a 2 ∅ᶜ 0->2 b 1->2 b 3 (a*+⟨4⟩(⟨2⟩a)*)ᶜ 1->3 a 2->F2 2->2 a, b 3->2 b 4 (a*+⟨8⟩(⟨2⟩a)*)ᶜ 3->4 a