Derived-term Automata for Extended Weighted Rational Expressions¶

This page is a complement to Derived-term Automata for Extended Weighted Rational Expressions presented at ICTAC 2016. This page exists in several forms:

A Dynamic Notebook, which can be edited, played with
A static HTML page, whose graphical rendering is always correct.

More information is available here:

You may change the cells, and run then. To run a cell in a notebook, hit "Control-Enter" (in which case the focus stays on the same cell) or "Shift-Enter" (focus goes to the next cell). Beware that depending on the requested operations, Vcsn may generate and compile code, which may be a really slow process on small machines (about a minute): be patient! However, the code is compiled only once: successive uses will be way faster.

To run all the cells anew, select "Restart & Run All" in the "Kernel" menu above.

import vcsn

Valid Expressions¶

Beware that the so-called ''context'' (which, in Vcsn parlance denotes the type of labels (letters or words, etc.) and the semiring of weights) has an extreme importance. In particular, an expression might be valid in a given context, and invalid in another.

As an example, ${a^*}^*$ is valid in $\mathbb{B}$, so we can build its derived-term automaton.

e = vcsn.B.expression('a**'); e.is_valid()

True

e.derived_term()

However, the construction will fail in $\mathbb{Q}$, since the constant term for $a^*$ is 1, which is not starrable in $\mathbb{Q}$ (indeed, $1^* = \sum_{k \in \mathbb{N}} 1^k$ is not a value in $\mathbb{Q}$).

e = vcsn.Q.expression('a**'); e.is_valid()

False

try:
    e.derived_term()
except RuntimeError as err:
    print("error:", err)

error: Q: value is not starrable: 1
  while computing expansion of: a**
  while computing derived-term of: a**

Example $\mathsf{E}_3$: A Lex-like Scanner¶

Desugaring¶

We introduce the "context" (i.e., the automaton type) that corresponds to labels that are letters (lal_char) and weights that are rational numbers (q).

Q = vcsn.context('lal_char(ab), q')
Q

Vcsn supports the <+ operator. It is desugared in a combination of conjuction and complement.

e3 = Q.expression('<2>(ab) <+ <3>[ab]{+}')
e3

The Derived-Term Automaton¶

The expansion and the derived-term automaton of $\mathsf{E}_3$ follow.

e3.expansion()

e3.derived_term()

Example $\mathsf{E}_1$: A Simple Expression¶

Polynomials and Derivatives¶

Z = vcsn.context('lal_char, z')
e1 = Z.expression('<5>\e + <2>ace + <6>bce + <4>ade + <3>bde')
e1

The derivatives of $\mathsf{E}_1$ with respect to $a$ and $b$ are the following polynomials.

e1.derivation('a')

e1.derivation('b')

The expansion of $\mathsf{E}_1$ is as follows. The notation is slightly different from the paper, where for higher clarity the weights in the polynomials are separated from the expressions by a $\otimes$. In the implementation, since the weight is always displayed, this separation is useless, and not put, for conciseness.

e1.expansion()

The Derived-Term Automaton¶

The derived-term automaton can be computed using derivations or expansions.

e1.derived_term('expansion')

e1.derived_term('derivation')

The Deterministic Derived-Term Automaton¶

e1.derived_term(deterministic=True)

Section 4.2: An Infinite Automaton¶

There exists no finite weighted deterministic automaton for the following expression.

e = Q.expression('a* + (<2>a)*')
e

e.derived_term()

a = e.derived_term(deterministic=True, lazy=True)
a

Repeated evaluations uncover parts of this automaton.

print(a('')); a

2

print(a('aa')); a

5

print(a('aaaa')); a

17

Additional Operators¶

The following automata demonstrate the support of the shuffle operator (denoted ':' in ASCII, and '$\between$' in $\LaTeX$), and the infiltrate operator ('&:' and '$\uparrow$'). See also the documentation of the corresponding operations on automata: automaton.shuffle and automaton.infiltrate.

exp = vcsn.context('lal, q').expression
e1 = exp('<2>a<3>a : <5>a<6>a', 'trivial')
e1

The 'trivial' argument above restricts the identities to the trivial ones, otherwise additional identities would defeat our attempt to "taint" the as below:

exp('<2>a<3>a : <5>a<6>a')

e1.derived_term()

e2 = exp('<2>a<3>a &: <5>a<6>a', 'trivial')
e2

e2.derived_term()

Broken Derived Terms¶

"Breaking" variants "split" the polynomials at each step. In short, it means that no state will be labeled by an addition: rather the addition is split into several states. As a consequence, the automaton might have several initial states.

e3 = exp('[ab]{2}')
e3.derived_term()

e3.derived_term(breaking=True)

Derived-term Automata for Extended Weighted Rational Expressions¶

Valid Expressions¶

Example $\mathsf{E}_3$: A Lex-like Scanner¶

Desugaring¶

The Derived-Term Automaton¶

Example $\mathsf{E}_1$: A Simple Expression¶

Polynomials and Derivatives¶

The Derived-Term Automaton¶

The Deterministic Derived-Term Automaton¶

Section 4.2: An Infinite Automaton¶

Section 5: Related Work¶

Additional Operators¶

Broken Derived Terms¶