Vcsn news

This file describes user visible changes in the course of the development of Vcsn, in reverse chronological order. On occasions, significant changes in the internal API may also be documented.

Vcsn 2.1

2015-10-05

About 10,000 hours (on the calendar, not of work!) after its first public release, the Vcsn team is very happy to announce the release of Vcsn 2!

It is quite hard to cherry-pick a few new features that have been added in Vcsn 2.1, as shown by the 4k+ lines of messages below since 2.0. However, here are a few headlines:

Many pages of documentation and examples have been written (see http://www.lrde.epita.fr/dload/vcsn/2.1/notebooks).
Now http://vcsn-sandbox.lrde.epita.fr/ provides a live demo.
Transducers are much better supported, with improved syntax and several algorithms (e.g., letterize, synchronize, partial_identity, is_functional, etc.)
Expressions now offer several sets of identities specifying how they should be normalized. More generally, input/output of expressions have been improved to match most users' expectations. New operators are accepted: & for conjunction, : for shuffle, &: for infiltration, {c} (postfix) for complement, and <+ for deterministic choice.
When entering an automaton (e.g., with %%automaton in IPython) user state names are preserved.
Of course, many bugs were fixed, many algorithms were sped up, and internal details have been cleaned up.
As Easter eggs, many features have also been added, but not advertised, until we are sure of how we want them to look like.

People who worked on this release:

Akim Demaille
Antoine Pietri
Canh Luu
Clément Démoulins
Lucien Boilod
Nicolas Barray
Sébastien Piat
Sylvain Lombardy
Valentin Tolmer
Yann Bourgeois--Copigny

People who influenced this release:

Alexandre Duret-Lutz
Jacques Sakarovitch
Luca Saiu

Vcsn 2's repository has moved

To update your existing repository, run a command similar to:

$ git remote set-url origin git@gitlab.lrde.epita.fr:vcsn/vcsn

2015-09-30

shortest: applies to any automaton

The restriction to free-labelsets is lifted: shortest applies to automata labeled with words, to transducers, etc.

2015-09-28

project: new algorithm

Select a single tape from a multitape automaton (transducer).

2015-09-23

expressions: spaces are now ignored

In an attempt to look like regexps, spaces were considered so far as characters.

2015-09-17

derived-term: deterministic automata

One can now require the construction of a deterministic automaton; there are two new algorithms: expansion,deterministic and derivation,deterministic.

2015-09-15

Named automata

It is (finally) possible to keep user state names thanks to name_automaton<Aut>. Now dyn::read_automaton and (in Python) vcsn.automaton accept an additional strip argument (defaulting to true): if set, the user names will be stripped once the automaton is loaded. This applies to all the supported formats: Daut, Dot, Efsm, and FAdo.

However using %%automaton names are kept by default. Pass -s/--strip to strip it.

2015-09-07

Build fixes

Fixed (again) nasty shared-library issues at runtime on Ubuntu due to their use of -Wl,--as-needed. This affected Python only.

2015-08-31

expression.info: more information

Now atom (the number of occurrences of letters) is also reported as width, and the depth (also known as the height: the height of the tree) is now included.

2015-08-26

vcsn.automaton can guess the format

The vcsn.automaton function now accepts "auto" as format, which means "try to guess from the content".

2015-08-25

levenshtein: new algorithm

The levenshtein algorithm allows to build a transducer encoding the Levenshtein distance between two alphabets. This transducer can then be composed with two languages to obtain the edit-distance algorithm, containing much information on the distance of the languages and their words.

partial_identity: new algorithm

The partial identity turns an automaton into a transducer realizing a partial identity. I.e., for each accepted input, the output will be the same as the input (plus the origin weight from the input automaton).

2015-08-18

to_automaton: more algorithms

One can chose between both implementation of derived_term (stripped), using "derivation" or "expansion".

2015-08-14

to_expression: more heuristics

There are three new heuristics:

"delgado": select a state whose removal would contribute a small expression (number of symbols, including +, etc.).
"delgado_label": likewise, but count only the number of labels in the expression.
"best": run all the heuristics, and return the shortest result.

The default algorithm remains "auto", which now denotes "best".

2015-08-03

to_automaton: conversion from expression to automaton

Vcsn offers several means to build an automaton from an expression: thompson, zpc, standard and derived-term. The latter even builds a decorated automaton, which, often, is not desired.

In C++ dyn::to_automaton and in Python expression.automaton() allow to build a simple automaton from an expression. It accepts an optional string argument to select the conversion algorithm. It defaults to "auto", which currently means the stripped derived-automaton, but eventually, it will pick either "standard" for basic expressions (as it's the fastest algorithm), or "derived-term" for extended expressions (as "standard" does not support these additional operators).

2015-06-18

I/O in EFSM format are safer

We now correctly ensure the correspondence between our weightsets and OpenFST's arc-type, and input and output.

As a consequence, we no longer support exchange of "traditional numerical" weightsets (such as "Z", "R", etc.), since, as far as we know, they don't feature a vis-à-vis in OpenFST. It used to be accepted, but was meaningless.

Boolean weights are mapped to "standard", OpenFST's tropical semiring.

When reading EFSM files, we still try to use the smallest corresponding weightset. For instance, if the arc type is "standard" and and the weights are integral we use "zmin", otherwise, we use "rmin". Errors on reading the weight of final states have also been fixed.

2015-06-16

automaton.expression, automaton.lift

They now accept an optional argument to specify the desired identities for the expressions.

2015-06-15

power, chain have been renamed

Before, "chain" denoted the repeated concatenation for automata and expressions, and "power" meant repeated conjunction for automata and expressions. However, while aut ** 2 meant aut & aut (conjunction), exp ** 2 meant exp * exp (concatenation).

Clearly, from the Python point of view, ** is repeated *, which is what most people would understand as "power".

So, to enforce consistency, and to avoid bad surprises, these functions were renamed.

Python	dyn::	Comment
`a * b`	`multiply(a, b)`	Multiplication for automata and expressions (concatenation), polynomials, weights, labels etc.
`a ** n`	`multiply(a, n)`	Repeated multiplication.
`a & b`	`conjunction(a, b)`	Conjunction for automata and expressions.
`a & n`	`conjunction(a, n)`	Repeated conjunction.

2015-06-11

eliminate_state: fixes

Sometimes there was a mismatch between the state numbers as displayed, and as expected by eliminate_state. This is fixed.

The rendering was nonsensical when all the states were removed. This is now fixed.

Finally, passing -1 as argument, or no argument at all, delegates the choice of the state to eliminate to a heuristic.

2015-06-04

expressions: new identities sets

Rational expressions in Vcsn are "normalized" according to a set of identities (such that <0>E => \z whatever the expression E is).

Vcsn now supports four different sets of identities:

"trivial" : a minimum set of transformations are applied.

"associative" : sum and product are made associative, so a+(b+c) and (a+b)+c are equal.

"linear" : sum is made commutative, and weights are factored, so a+b+a is equal to a+b in B, and to <2>a+b in Z.

"distributive" (or "series") : product and exterior/scalar products are distributed over sum, so [ab]a is equal to aa+ba, and <2>[ab] is equal to <2>a+<2>b.

Previously the default identities were "associative". They are now "linear", to match most users' expectations.

So, for instance we used to report:

In [2]: c = vcsn.context('lal_char(a-z), z')

In [3]: c.expression('r+[a-q]')
Out[3]: r + [a-q]

In [4]: c.expression('[a-q]+r+r')
Out[4]: [^s-z] + r

we now report:

In [3]: c.expression('r+[a-q]')
Out[3]: [^s-z]

In [4]: c.expression('[a-q]+r+r')
Out[4]: [a-q] + <2>r

2015-05-29

nmin: new weightset

The new tropical semiring ⟨ℕ, +, min⟩ has been introduced. Compared to zmin, some optimizations can be done, for example in evaluation or in node distance where the absence of negative weights allows to trim some branches.

2015-05-26

cotrie: new algorithm

In additional to trie, which builds a deterministic tree-like automaton (single initial state, multiple final states), vcsn now supports cotrie which builds a "reversed trie": a codeterministic reversed tree-like automaton (single final state, multiple initial states).

The main feature of cotrie is that its result is codeterministic, so it only takes a determinization to minimize it. It turns out that in Vcsn determinization is more efficient than minimization:

In [13]: %timeit c.trie('/usr/share/dict/words').minimize()
         1 loops, best of 3: 18.8 s per loop

In [14]: %timeit c.cotrie('/usr/share/dict/words').determinize()
         1 loops, best of 3: 7.54 s per loop

These automata are isomorphic.

2015-05-22

expressions: output may use label classes

Rational expressions have long supported label classes in input, e.g., [abc0-9]. Polynomials also support them in output. However expressions never used classes, which may seriously hinder their readability. For instance, to compute an expression describe all words on {a,..., z} except 'hehe', we had:

In [2]: c = vcsn.context('lal_char(a-z), b')
        c.expression('(abcd){c}').derived_term().expression()
Out[2]: \e+a(\e+b(\e+c))+(b+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z+a(a+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z+b(a+b+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z+c(a+b+c+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z+d(a+b+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z)))))(a+b+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z)*

Now expressions also support letter classes in output:

Out[2]: \e+a(\e+b(\e+c))+([^a]+a([^b]+b([^c]+c([^d]+d[^]))))[^]*

Classes are used only on ranges of at least four (unweighted) letters in strictly increasing order:

In [3]: c.expression('a+a+b+c+d+e+f+x+y+z+w')
Out[3]: a+[a-fx-z]+w

Negated classes are issued only if the letters are in strictly increasing alphabetical order, and are more than two thirds of the whole alphabet (otherwise a regular class is preferred).

In [4]: c.expression('[a-q]')
Out[4]: [a-q]

In [5]: c.expression('[a-q]+r')
Out[5]: [^s-z]

In [6]: c.expression('r+[a-q]')
Out[6]: r + [a-q]

In [7]: c.expression('[a-q]+r+r')
Out[7]: [^s-z] + r

2015-05-21

multiply: new name for 'concatenate'

In static, dyn:: and Python, 'concatenate' is renamed 'multiply'.

The "concatenation" of automata or rational expressions would never be called the concatenation in the case of series or polynomials etc. If we were to support automata weighted by automata, the weightset product would be this multiply.

2015-05-20

trie: read directly from a file

Building a polynomial from a dictionary stored on disk, and then building the trie is a waste of time. It is now possible to build it directly from a file.

In [2]: t = vcsn.B.trie('/usr/share/dict/words')

In [3]: t.info()
Out[3]: {'is codeterministic': False,
         'is complete': True,
         'is deterministic': True,
         'is empty': False,
         'is eps-acyclic': True,
         'is normalized': False,
         'is proper': True,
         'is standard': True,
         'is trim': True,
         'is useless': False,
         'is valid': True,
         'number of accessible states': 792777,
         'number of coaccessible states': 792777,
         'number of codeterministic states': 792777,
         'number of deterministic states': 792777,
         'number of eps transitions': 0,
         'number of final states': 235886,
         'number of initial states': 1,
         'number of states': 792777,
         'number of transitions': 792776,
         'number of useful states': 792777,
         'type': 'mutable_automaton<letterset<char_letters()>, b>'}

2015-05-14

trie: new algorithm

From a finite language/series represented as a polynomial (of words), build an automaton that has the shape of a trie (a prefix tree).

In [2]: series = '<2>\e+<3>a+<4>b+<5>abc+<6>abcd+<7>abdc'

In [3]: p = vcsn.context('law_char, z').polynomial(series); p
Out[3]: <2>\e + <3>a + <4>b + <5>abc + <6>abcd + <7>abdc

In [4]: a = p.trie(); a
Out[4]: mutable_automaton<letterset<char_letters(abcd)>, z>

In [5]: a.shortest(100)
Out[5]: <2>\e + <3>a + <4>b + <5>abc + <6>abcd + <7>abdc

2015-05-01

delay_automaton: new automaton type

This algorithm transforms the automaton into an equivalent one, but where each state has been split depending on the delay between the tape, i.e. the difference of input length for each of the tapes.

is-synchronized: new algorithm

This algorithm checks whether a transducer is synchronized, i.e. that the input is read on every tape at the same rate for as long as possible.

synchronize: new algorithm

This new algorithm allows to synchronize the tapes of a k-tape transducer, i.e. "push" the letters towards the beginning of the transducer so that the input is read along every tape at the same rate for as long as possible.

has-bounded-lag: new algorithm

This algorithm checks whether a k-tape transducer has a bounded lag, i.e. if there exists a constant D such that the length of the input for each tape differs by at most D; or the output of the evaluation of a word through the transducer differs in length by at most D.

2015-04-09

shortest now subsumes enumerate

One can now specify both a maximum number of words, and a maximum size. It is also significantly faster. The Python binding automaton.enumerate was removed.

In [4]: a = vcsn.Z.expression('[01]*1(<2>[01])*').standard()

In [5]: a.shortest()
Out[5]: 1

In [6]: a.shortest(3)
Out[6]: 1 + 01 + <2>10

In [7]: a.shortest(len = 3)
Out[7]: 1 + 01 + <2>10 + <3>11 + 001 + <2>010 + <3>011 + <4>100 + <5>101 + <6>110 + <7>111

In [8]: a.shortest(len = 3, num = 5)
Out[8]: 1 + 01 + <2>10 + <3>11 + 001

In [9]: a.shortest(len = 30, num = 5)
Out[9]: 1 + 01 + <2>10 + <3>11 + 001

2015-03-25

product is renamed conjunction

After many hesitations, we finally decided to rename the synchronized product from product to conjunction. There are several reasons in favor of this change.

First product is not adequate:

we have several products: the Cauchy product (concatenation), the Hadamard product (synchronized product), the shuffle product, the infiltration product, and there are certainly more.
product is already used in the case of words and rational expressions to denote the concatenation, so, to be consistent, product will denote the concatenation of automata too.
* is naturally associated to concatenation for words and rational expressions, it was associated to product for automata. It is more consistent to use concatenate in each case.

Second conjunction is a reasonable choice:

intersection is acceptable for Boolean automata, but it is not satisfying in the case of weighted automata.
synchronized_product is too long, likewise for hadamard_product, etc.
hadamard is a proper name, which we try to avoid.
the etymology of conjunction is a perfect match with the semantics of the operation.
computer scientists are used to the correspondence between conjunction and &.

The preference for verbs as algorithm names leads naturally to conjoin. However conjunction still seems more natural.

2015-03-21

are-equivalent: now uses realtime

One may, finally, compare LAL, LAN, and LAW automata.

join: improvements

The "join" of tuplesets now works properly, so one may, for instance, add automata on A? x B and A x B? to get an automata on A? x B?.

In order to facilitate the experiments with join, it is now provided as an algorithm. In Python, it is also available as the infix or operator.

In [2]: c1 = vcsn.context('lat<lal_char(a), lan_char(x)>, z'); c1
Out[2]: {a} × ({x})? → ℤ

In [3]: c2 = vcsn.context('lat<lan_char(b), lal_char(y)>, q'); c2
Out[3]: ({b})? × {y} → ℚ

In [4]: c1 | c2
Out[4]: ({a,b})? × ({x,y})? → ℚ

2015-03-18

infiltration: fix long standing bugs

Optimizations in the computation of the shuffle-product broke the infiltration-product. This is fixed. However, the performances of infiltration are then degraded (about 2.5x). Conjunction (synchronized product) and shuffle are unaffected.

Besides, the support for variadic infiltration products was naive, and produced incorrect results. This is fixed by simply repeating the binary infiltration.

2015-03-17

is_functional: new algorithm

Whether a transducer is functional, i.e., whether each input words maps to (at most) a single word.

Actually, was implemented in September 2014, but was not registered here.

2015-03-16

Python 3 is now required

We have tried hard to remain compatible with Python 2, but support for Unicode is just too hard to get to work properly with both Python 2 and Python 3. Since there were constantly problems arising in one whose fixes break the other, we decided to drop support for Python 2. Given that all major platforms ship Python 3, we don't expect this to be a real problem.

2015-03-15

Repeated minimization

Now, minimization and cominimization return automata of the same type --- cominimization used to return an automaton whose type reveals the double transposition. In actually, calling several times the minimization and/or cominimization no longer generates decorators of decorators of etc.: the result is always a single layer decorator. Not only is this, in general, the desired result, it's also more 'economic' as it uses fewer automaton types (hence less runtime compilations).

2015-03-03

realtime: new algorithm

Compute the letterized, proper equivalent of an automaton. The result automaton will have only letter transitions (no words, no spontaneous transitions). It comes with the is_realtime algorithm, to check if an automaton is realtime or not.

is_letterized: new algorithm

This algorithm checks whether an automaton is letterized, i.e. whether each transition's label is a single letter (in the sense of the labelset).

In [2]: ctx = vcsn.context("law_char, b")

In [3]: ctx.expression("abc").standard().is_letterized()
Out[3]: False

In [4]: ctx.expression("a*(b+c)").standard().is_letterized()
Out[4]: True

2015-03-02

zpc: new algorithm

This algorithm generates the ZPC automaton from an expression. It has a single initial state (whose final weight is the constant term of the expression), a unique distinct final state (which cannot be reached from the initial state via spontaneous transitions), and no cycles of spontaneous transitions.

The expression.zpc function features an optional argument, which, when set to "compact", enables a variant, more compact, construction.

In [2]: vcsn.b.expression('ab').zpc()
    ╭───╮  \e   ╭───╮  a   ╭───╮  \e   ╭───╮  b   ╭───╮  \e   ╭───╮
──> │ 0 │ ────> │ 1 │ ───> │ 2 │ ────> │ 3 │ ───> │ 4 │ ────> │ 5 │ ──>
    ╰───╯       ╰───╯      ╰───╯       ╰───╯      ╰───╯       ╰───╯

In [3]: vcsn.b.expression('ab').zpc('compact')
    ╭───╮  a   ╭───╮  \e   ╭───╮  b   ╭───╮
──> │ 0 │ ───> │ 1 │ ────> │ 2 │ ───> │ 3 │ ──>
    ╰───╯      ╰───╯       ╰───╯      ╰───╯

2015-02-23

letterize: new algorithm

Create the equivalent automaton, but with only single-letter transitions. Basically, do the conversion from law to lan. It also works recursively with multitape transducers.

In [2]: c = vcsn.context('lat<law_char, lal_char>, z')

In [3]: a = c.expression("<2>'(abc,x)'").derived_term()

In [4]: print(a.format('daut'))
context = "lat<wordset<char_letters(abc)>, letterset<char_letters(x)>>, z"
$ -> 0
0 -> 1 <2>(abc,x)
1 -> $

In [5]: print(a.letterize().format('daut'))
context = "lat<nullableset<letterset<char_letters(abc)>>, nullableset<letterset<char_letters(x)>>>, z"
$ -> 0
0 -> 2 <2>(a,x)
1 -> $
2 -> 3 (b,\e)
3 -> 1 (c,\e)

2015-02-10

Multitape expressions

When typing multitape transducers, it is now no longer necessary to explicit the parenthesis inside (multi-)letters.

In [1]: import vcsn

In [2]: c = vcsn.context('lat<lan_char, lan_char>, b')

In [3]: r = c.expression(r"'a,\e'+'(b,c)'+'d,f'")

has_bounded_lag: new algorithm

This algorithm checks if a transducer has a bounded lag, i.e. if there is a maximum difference of length between the input words and their corresponding outputs.

In [1]: import vcsn

In [2]: c = vcsn.context('lat<lan_char, lan_char>, b')

In [3]: c.expression(r"'a,\e'").standard().has_bounded_lag()
Out[3]: True

In [4]: c.expression(r"'a,\e'*").standard().has_bounded_lag()
Out[4]: False

2014-11-18

Flex

The Flex program (a scanner generator) should no longer be required for builds from a tarball.

blind -> focus

The blind algorithm (and the blind_automaton structure) has been renamed focus (and focus_automaton) as it is much clearer on its purpose.

2014-11-14

scc: new algorithm

Create an automaton whose states correspond with a strongly connected component of the input automaton.

2014-11-13

ratexp -> expression

The name "ratexp" is an unattractive jargon, and since Vcsn will not feature other kinds of "expressions", it is hardly justified: it is the only abbreviation we use (automaton, context, expansion, polynomial, etc.).

So everywhere (static, dyn::, Python) "ratexp" was replaced by "expression" (including "make_ratexpset" -> "make_expressionset", etc.).

2014-11-07

thompson: the labelset no longer needs to feature a "one"

The Thompson automata count many spontaneous transitions, so require a labelset which supports \e (e.g., lan, law). This is annoying, especially when teaching about Automata Theory: one does not want to dive into the arcane details of contexts.

So now, if the context of the input expression does not support \e, the Thompson automaton will be built with a generalized context which does support it:

In [1]: import vcsn

In [2]: vcsn.context('lan_char, b').ratexp('a').thompson().context()
Out[2]: lan<letterset<char_letters(a)>>, b

In [3]: vcsn.context('law_char, b').ratexp('a').thompson().context()
Out[3]: wordset<char_letters(a)>, b

In [4]: vcsn.context('lal_char, b').ratexp('a').thompson().context()
Out[4]: lan<letterset<char_letters(a)>>, b

2014-10-30

filter: new algorithm

Hide states of another automaton, revealing only selected ones. This does not copy the original automaton.

2014-10-29

New project name: Vcsn

The Vaucanson 2 project, which was funded by a (French) program (ANR), is now "closed". Of course, it will continue to exist, but now there will be two Vaucansa! One will be led by Sylvain Lombardy and Jacques Sakarovitch, and the other by Alexandre Duret-Lutz and Akim Demaille from EPITA.

This file is about the latter, named "Vcsn".

2014-10-27

WARNING: resyntaxed contexts

The syntax for contexts has changed: the separator between the labelset and the weightset is now a comma (possibly with spaces) instead of an underscore.

So for instance:

lal_char(abc)_b

-> lal_char(abc), b

lat<lal_char(ab),lal_char(xy)>_lat<q,r>

-> lat<lal_char(ab), lal_char(xy)>, lat<q, r>

law_char(a-z)_ratexpset<law_char(A-Z)_b>

-> law_char(a-z), ratexpset<law_char(A-Z), b>

This syntax is not (and has never been) the intended one, it should be considered as the "internal" syntax. However, since the intended syntax is still not implemented, one, unfortunately, still has to deal with this inner syntax.

2014-10-23

proper: strip nullableset from labelsets

Our (current) implementation of proper is in-place, and as a consequence, the result features the same context as the input automaton, with a nullable labelset.

Now, proper copies the result in a context with a de-nullabled labelset. So for instance:

In [2]: vcsn.context('lan_char(ab)_b').ratexp('(ab)*').thompson().proper()
Out[2]: mutable_automaton<lal_char(ab)_b>

There is no measurable performance regression. However state numbers are now unrelated to the input automaton.

2014-10-20

Python: loading automata from files

The vcsn.automaton constructor now support a filename named argument to load an automaton from a file.

vcsn.automaton(filename = 'a.gv')
vcsn.automaton(filename = 'a.efsm', format = 'efsm')

minimize: default algorithm is "auto"

Now minimize and cominimize both support the "auto" algorithm, which is the default value. It uses the "signature" algorithm for the Boolean automata, otherwise the "weighted" algorithm.

proper: forward closure available in dyn:: and Python

It is now possible to require a forward elimination of spontaneous transitions. Calling a.proper() is equivalent to calling a.proper(prune = True, backward = True).

2014-10-07

to-expression

The static and dyn:: algorithm that was named aut_to_exp is now named to_expression. In Python it is still automaton.ratexp().

costandard, is_costandard: new algorithms

Calling aut.costandard() is equivalent to calling aut.transpose().standard().transpose(), and aut.is_costandard() is the same with aut.transpose().is_standard().

normalize: new algorithm

Composes standard and costandard on an automaton.

2014-10-02

Strip vs. transpose

Up to now, stripping a transposed automaton returned the non-transposed automaton, without even stripping it. This has finally been addressed, and a.determinize().transpose().strip() is now strictly equivalent to a.determinize().strip().transpose().

codeterminize, cominimize, is_codeterministic: new algorithms.

Calling aut.cominimize() is equivalent to calling aut.transpose().minimize().transpose(), and likewise for codeterminize.

Available in dyn and Python.

2014-10-01

split: now works on polynomials

In addition to splitting (aka, breaking) a ratexp, it is now possible to split a polynomial. This, for instance, provides another way to compute the breaking derivation of a ratexp:

In [1]: r = vcsn.context('lal_char_z').ratexp('a(b+a)+a(a+b)')

In [2]: p = r.derivation('a'); p
Out[2]: a+b + b+a

In [3]: p.split()
Out[3]: <2>a + <2>b

In [4]: r.derivation('a', True)
Out[4]: <2>a + <2>b

derivation: fix a Python bug

ratexp.derivation features a breaking optional argument, which defaulted to True instead of False.

transpose: fix the number of final and initial states

Because num_initials and num_finals were not "transposed", on occasions automaton.is_standard could crash (when it thinks there is one initial state while there is none, but it still wants to check that its initial weight is one).

2014-09-25

Automaton conjunction

On occasions, in Python, expressions such as (a & b)('a') failed to behave properly. It does now, and does correspond to the evaluation of the word "a" on the conjunction (synchronized product) of automata a and b.

TikZ: state names

Conversion of automata into the TikZ format now includes the state "names" (e.g., derived-term automata now show the states' rational expressions).

2014-09-21

determinize: Empty-In, Empty-Out

Now when given an empty automaton (no states), determinize returns an empty automaton, whereas it used to return an automaton with one state, initial. This is more consistent (e.g., when given a deterministic automaton, determinize now consistently returns the accessible part of its input).

2014-09-17

is_cycle_ambiguous: new algorithm

Whether an automaton is cycle-ambiguous (or "exponentially ambiguous").

2014-09-09

Build fixes

Portability issues with Mac OS X's own version of Flex are fixed.

Fixed runtime compilations failing to find Boost headers in some cases.

Fixed nasty shared-library issues at runtime on Ubuntu due to their use of -Wl,--as-needed.

2014-09-04

Fixes

I/O with OpenFST now properly supports initial weights and multiple initial states.

Dot I/O of multiple-tape automata with initial weights is fixed.

Dot parse errors now provide locations.

The preconditions of reduce have been relaxed from labels-are-letters to labelset-is-free.

2014-09-03

determinize provides a better support for Z

The normalization of states now uses the GCD of the weights, which means that now all the Z-automata that are determinizable can be determinized in Z, without having to convert them as Q-automata.

New weightset: qmp

Initial support for multiprecision rational numbers.

>>> a = vcsn.context('lal_char(abc)_qmp').ratexp('<2/3>a').standard() & 70
>>> a('a')
1180591620717411303424/2503155504993241601315571986085849

Requires the GMP library.

2014-09-01

compatibility with Boost 1.56

Some changes in Boost require adjustments.

2014-08-02

ambiguous_word: new algorithm

Returns an ambiguous word of an automaton, or raises if the automaton is unambiguous.

2014-07-30

Several fixes

The reduce algorithm applied only to not-decorated automata. Support for F2 was broken.

Improvements

Error messages about failed algorithm instantiations are much clearer: they display failed preconditions.

Blind is no longer limited to tape 0 or tape 1.

Vaucanson 2.0

2014-07-25

More than two years after the initial commit in our repository, the Vaucanson team is happy to announce the first release of Vaucanson 2!

People who worked on this release:

Akim Demaille
Alexandre Duret-Lutz
Alfred M. Szmidt
Antoine Pietri
Canh Luu
Jacques Sakarovitch
Luca Saiu
Sylvain Lombardy
Valentin Tolmer

2014-07-24

"series" are now supported (static, dyn, Python)

Ratexpsets now take a constructor parameter specifying which set of identities to enforce, currently with two possible values: "trivial" and "series". Old-style ratexps only support trivial identities, while series also support sum commutativity and product distributivity over sum. Heterogeneous operations involving both trivial ratexps and series ratexps are possible.

In case of a ratexpset belonging to a context, the default value for the optional parameter is always trivial. The "seriesset" alias, which does not accept an identity parameter, is also supported.

"Series" is technically a misnomer: the data structure is a better approximation of the mathematical concept of series and in particular yields a semiring structure, but actual series equality remains undecidable in the general case.

Series currently depend on the commutativity of their weightset, and only support a subset of the available operations.

>>> vcsn.context('lal_char(a)_ratexpset<lal_char(x)_b>(series)') # series
>>> vcsn.context('lal_char(a)_seriesset<lal_char(x)_b>')         # series
>>> vcsn.context(ratexpset<lal_char(x)_b>_z')                    # trivial
>>> vcsn.context('lal_char(a)_ratexpset<lal_char(x)_b>(trivial)')# trivial
>>> e = vcsn.context('lal_char(a-c)_z').ratexp('b+a+b'); e
b+a+b
>>> s = vcsn.context('lal_char(a-c)_z').series('b+a+b'); s
a+<2>b
>>> (e+s).is_series()
True

has_twins_property: new algorithm

Computes whether an automaton has the twins property.

2014-07-16

determinize supports weighted automata

The determinize algorithm (dyn and Python) now accepts an optional 'algo' argument.

When 'algo' is:

"boolean": the fast Boolean-only implementation is used. It always terminates.
"weighted": any weightset is supported. On some inputs, it may not terminate.
"auto" (default value): "boolean" if the automaton is Boolean, "weighted" otherwise.

2014-07-11

are-equivalent is generalized

The 'are_equivalent' algorithm (a1.is_equivalent(a2) in Python) now supports weights in fields (e.g., Q or R), but also on Z. The labelset still must be free.

2014-07-10

reduce: new algorithm on automata

Implements the Schützenberger algorithm for reduction of representations for any skew field. As a special case, automata with weights in Z are also supported.

2014-07-04

determinization is more robust to large alphabets

In order to optimize (Boolean) determinization, 'determinize' no longer accepts an optional 'complete' argument to require a deterministic complete automaton. One must now call 'complete()' afterwards.

The speed improvements are (erebus: OS X i7 2.9GHz 8GB, Clang 3.5 -O3 -DNDEBUG):

  (1)   (2)   (3)
 7.93s 7.80s 7.43s: a.determinize()      # a = ladybird(21)
 6.64s 6.45s 0.84s: a.determinize()      # a = lal(a-zA-Z0-9).ladybird(18)

where (1) is the "original" version, (2) is the version without the optional completion of the determinized automaton (yes, it was set to False in the bench of (1)), and (3) is the current version, which avoids considering unused letters.

2014-07-01

prefix, suffix, factor, subword: new algorithms

These four new functions (static, dyn and Python) take an automaton and return another that accepts a superset of its language:

suffix makes each accessible state final (with unit weight)
prefix makes each coaccessible state initial
factor makes each useful state initial and final
subword applies the Magnus transform: for each non spontaneous transition (src, label, weight, dst), it adds a spontaneous transition (src, one, weight, dst).

2014-06-24

minimize: new subset decorators, except for the "brzozowski" variant

Minimizing an automaton now yields a decorated automaton keeping track of source state names. The new "subset decorator" code is decoupled from minimization and is intended to be used for other algorithms as well.

>>> a = vcsn.context('lal_char(a-z)_b').ratexp('a+b*e+c+dc').standard(); a
>>> a.minimize()

The minimize algorithm no longer recognizes the "brzozowski" variant at the Static level, as it would require a very different, and likely uninteresting, decorator; the user can still directly call minimize_brzozowski at the Static level. We still support "brzozowski" as a variant at the Dyn and Python levels.

2014-06-23

efsm: support for transducers

Exchange with OpenFST via efstcompile/efstdecompile now supports transducers.

2014-06-15

TAF-Kit is phased out

Existing commands for TAF-Kit are left and are occasionally useful. However, today the last existing TAF-Kit test has been converted to Python: TAF-Kit is no longer checked at all by the test suite.

lift: now bound in Python

One can now lift automata and ratexps from Python.

2014-06-19

Minimize: extend 'weighted' and 'signature' variants

Lift arbitrary restrictions on the labelset of the 'weighted' and 'signature' minimization variants.

>>> ctx = vcsn.context('law_char(a-z)_b')
>>> ctx.ratexp('ab+<3>cd+ac').standard().minimize('weighted')

2014-06-04

More systematic use of decorators

In addition to the venerable transpose_automaton, several automaton decorator types have been introduced (blind_automaton, determinized_automaton, pair_automaton, product_automaton, ratexp_automaton), and are now used in many algorithms, including: compose, determinize, product, shuffle, infiltration, synchronizing_word, etc.

Coupled with the fact that automata can now display state names (as opposed to state numbers) in Dot output, one gets rich displays of automata. For instance:

>>> ctx = vcsn.context('lal_char(ab)_b')
>>> ctx.ratexp('aa+ab').derived_term().determinize()

now displays an automaton whose states are labeled as sets of ratexps: "{aa+ab}", "{a, b}", and "{}".

Use automaton.strip() to remove state names.

2014-05-30

dot2tex format

Automata now support for "dot2tex" format, meant to be used with the dot2tex program. It allows to combine TikZ's nice rendering of automata, LaTeX's math mode to render state and transition labels, with dot automatic layout.

IPython: an interactive display

Starting with IPython 2.0, running 'aut.display()' provides an interactive display of an automaton, with means to select the display mode (e.g., "dot", "dot2tex", etc.).

shuffle: now available on ratexps

Available in static, dyn, and Python.

2014-05-28

Optimization: composing two automata transposition yields the identity

Transposing an automata twice now yields the original automata, instead of an automata wrapped by two decorator layers.

is-isomorphic: extend to any context and any automata

Lift the previous limitation of is-isomorphic to deterministic lal automata. The sequential case keeps its linear complexity, but the new generic code has a worst-case complexity of O((n+1)!); however the common case is much faster, as we heuristically classify states according to in- and out-transitions, restricting brute-force search to states which are possible candidates for isomorphism.

>>> ctx = vcsn.context('lal_char(a-z)_b')
>>> a = ctx.ratexp('ab+<3>ab+ab+ac').standard()
>>> b = ctx.ratexp('ab+ac+<3>ab+ab').standard()
>>> a.is_isomorphic(b)
True
>>> at = ctx.ratexp('abc').standard().transpose()
>>> c = ctx.ratexp('cba').standard()
>>> at.is_isomorphic(c)
True

2014-05-22

Automata are now handled via shared pointers

Types such as vcsn::mutable_automaton<Ctx> used to support the so called "move semantics" only. In particular, simple assignment between automata was not possible. This intentional limitation was introduced to avoid accidental lose of performance by unexpected deep copies of automata, while keeping partly "value semantics".

This, however, resulted in too many constraints, especially when one wants to embed automata in other structures (which is for instance the case of transpose_automaton which wraps an automaton).

To address these issues, types such as vcsn::mutable_automaton<Ctx> are now std::shared_ptr. This does change the programming style, both when instantiating an automaton (typically now using make_mutable_automaton), and when using it (with -> instead of .):

Instead of:

automaton_t aut{ctx};
auto s1 = aut.new_state();
aut.set_initial(s1);

write:

auto aut = vcsn::make_mutable_automaton(ctx);
// or: auto aut = vcsn::make_shared_ptr<automaton_t>(ctx);
auto s1 = aut->new_state();
aut->set_initial(s1);

2014-05-21

Polynomials are now usable as weights

Polynomialsets are now usable as a generic weightset. Polynomials are mostly useful on law and ratexpset.

>>> ctx = vcsn.context('lal_char(abc)_polynomialset<law_char(xyz)_z>')
>>> ctx.ratexp('<x + xy + x + \e>a')
<\e + <2>x + xy>a

2014-05-18

ratexps: support for negated letter classes

We may now use '[^...]' to denote a letter other than the listed ones. The special case '[^]' denotes any character of the alphabet.

>>> c = vcsn.context('lal_char(0-9)_b')
>>> c.ratexp('0+[^0][^]*')
0+(1+2+3+4+5+6+7+8+9)(0+1+2+3+4+5+6+7+8+9)*

ratexps: invalid letter classes are rejected

Instead of being ignored, invalid intervals, or empty classes, are now rejected.

>>> c.ratexp('[9-0]')
RuntimeError: invalid letter interval: 9-0
>>> c.ratexp('[]')
RuntimeError: invalid empty letter class

ratexps: improved support for letter classes

Previously letter classes were supported only for context on top of a simple alphabet (LAL, LAN and LAW). Generators of more complex contexts such as LAL x LAN are now supported:

>>> c = vcsn.context('lat<lal_char(abc),lan_char(xyz)>_b')
>>> c.ratexp("['(a,x)''(c,z)']")
(a,x)+(c,z)

>>> c.ratexp("[^'(a,x)''(c,z)']")
(a,y)+(a,z)+(b,x)+(b,y)+(b,z)+(c,x)+(c,y)

>>> c.ratexp("['(a,x)'-'(a,z)']")
(a,x)+(a,y)+(a,z)

>>> c.ratexp("[^]")
(a,x)+(a,y)+(a,z)+(b,x)+(b,y)+(b,z)+(c,x)+(c,y)+(c,z)

2014-05-12

compose: new algorithm (static, dyn, Python)

A new algorithm has been introduced to allow the composition of two transducers. It computes the accessible part of the transducer resulting from the composition of the second tape of the first transducer with the first tape of the second one.

>>> c1 = vcsn.context('lat<lan_char(abc),lan_char(ijk)>_b')
>>> c2 = vcsn.context('lat<lan_char(ijk),lan_char(xyz)>_b')
>>> t1 = c1.ratexp("('(a,i)'+'(b,j)'+'(c,k)')*").thompson()
>>> t2 = c2.ratexp("('(i,x)'+'(j,y)'+'(k,z)')*").standard()
>>> t1.compose(t2).proper().shortest(8)
(\e,\e) + (a,x) + (b,y) + (c,z) + (aa,xx) + (ab,xy) + (ac,xz) + (ba,yx)

2014-05-05

insplit: new algorithm (static, dyn, Python)

It is now possible to do the in-splitting of an automaton, i.e. to get the equivalent automaton such that each state have either only incoming epsilon-transitions or no incoming epsilon-transitions. This is the first step in a product algorithm that supports epsilon-transitions.

2014-05-02

Multiplication by a weight no longer requires a standard automaton

The right-multiplication by a weight uselessly required a standard automaton; it now accepts any automaton. The left-multiplication keeps its specification for standard automata, but now supports non-standard automata, in which case the weight is put on the initial arrows.

Multiplication by 0 is fixed

The multiplication of an automaton by the null weight results in the "zero automaton": a single state, initial, and no transitions.

2014-04-27

Python syntactic sugar

Multiplication by a scalar on the left, and on the right, can be performed with implicit conversion of the weights. Instead of

>>> z = vcsn.context('lal_char(ab)_z')
>>> r = z.ratexp('[ab]*')
>>> z.weight(2) * r * z.weight(3)

one can now write

>>> 2 * r * 3

and similarly for automata.

Besides, repeated &-product can be denoted with the same symbol, &: 'a & 3' denotes 'a & a & a'.

2014-04-22

Vaucanson 2 has moved

Vaucanson 2 is now also hosted on gitlab.lrde.epita.fr. It is also renamed vaucanson.git, rather than vaucanson2.git. To update your existing repository, run a command similar to:

$ git remote set-url origin git@gitlab.lrde.epita.fr:vcsn/vaucanson

or edit your .git/config file to update the URL.

You may also visit http://gitlab.lrde.epita.fr/vcsn/vaucanson.

dyn::label is born

The family of dynamic object (which includes dyn::automaton, dyn::weight, dyn::polynomial, dyn::ratexp, and others) now features a dyn::label. Several algorithms dealing with labels used the "std::string" C++ type to this end. This was inadequate, and the corresponding signatures have been cleaned bottom up.

For instance "derivation" used to have the following signatures

Static:

template <typename RatExpSet>
inline
rat::ratexp_polynomial_t<RatExpSet>
derivation(const RatExpSet& rs, const typename RatExpSet::value_t& e,
           const std::string& word, bool breaking = false)

Dyn:

polynomial derivation(const ratexp& exp, const std::string& s,
                      bool breaking = false);

These signatures are now:

template <typename RatExpSet>
inline
rat::ratexp_polynomial_t<RatExpSet>
derivation(const RatExpSet& rs, const typename RatExpSet::value_t& e,
           const typename RatExpSet::labelset_t::word_t& word,
           bool breaking = false)


polynomial derivation(const ratexp& exp, const label& l,
                      bool breaking = false);

At the Python level, derivation was adjusted so that one may still pass a string, and see it upgraded, so both these calls work:

>>> ctx = vcsn.context('lal_char(ab)_z')
>>> r = ctx.ratexp('(<2>a)*')
>>> r.derivation(ctx.word('aa'))
<4>(<2>a)*
>>> r.derivation('aa')
<4>(<2>a)*

Context extraction

It is now possible to obtain an automaton or a ratexp's context. In dyn, use "dyn::context_of(obj)", in Python use "obj.context()".

Copy can now change the type of the automaton

vcsn::copy used to support only "homogeneous" duplications. It also offers access to the the origins (a map from resulting states to origin states).

Variadic product of automata

The synchronized product of automata is now variadic: the product of n-automata directly builds an automaton labeled with n-tuples of original states. The Python operator, &, is modified to pretend it is variadic (rather than binary): it delays the computation until the result is needed. The ".value()" method allows to force the evaluation.

>>> ctx = vcsn.context('lal_char(ab)_z')
>>> a1 = ctx.de_bruijn(5)
>>> a = ctx.ratexp('a{5}').derived_term()
>>> import timeit
>>> timeit.timeit(lambda: (((a1&a1).value() & a1).value() & a).value(),
                  number=1000)
1.9550061225891113
>>> timeit.timeit(lambda: (a1 & a1 & a1 & a).value(), number=1000)
0.5792131423950195

Be aware that if the result is not needed, then it is simply not computed at all (hence, appears to be blazingly fast):

>>> timeit.timeit(lambda: a1 & a1 & a1 & a, number=1000)
0.0039250850677490234

"Fine grain" runtime compilation works

So far dyn:: provided support for "context" runtime compilation, i.e., if a context is unknown (e.g., "lat<lal_char(ab), lal_char(xy)>_b"), C++ code is emitted, compiled, and loaded. Now Vaucanson also supports per-algorithm runtime compilation.

Brzozowski minimization.

It is now possible to ask for the Brzozowski's minimization:

>>> a = vcsn.context('lal_char(ab)_b').ratexp('a+b').standard()
>>> a.minimize('moore').info()['number of states']
2
>>> a.minimize('signature').info()['number of states']
2
>>> a.minimize('brzozowski').info()['number of states']
2

2014-04-11

GCC is back in business

GCC compiles Vaucanson incorrectly (i.e., Vaucanson appears to behave incorrectly, but it is actually the compiler that is incorrect). For this reason, some features were disabled with such a compiler, and recently GCC support was completely dropped.

Efforts were put in finding reasonable workarounds for these bugs, and now GCC is, again, supported. Parts that used to be disabled are now supported.

For more information on this bug, see the following problem report http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51253.

2014-04-10

Vaucanson 1 has moved

Vaucanson 1 is now hosted on gitlab.lrde.epita.fr. It is also renamed vaucanson1.git, rather than just vaucanson.git. To update your existing repository, run a command similar to:

$ git remote set-url origin git@gitlab.lrde.epita.fr:vcsn/vaucanson1

or edit your .git/config file to update the URL.

You may also visit http://gitlab.lrde.epita.fr/vcsn/vaucanson1.

2014-04-05

conjunctions of ratexps

The intersection operation on ratexps is renamed as conjunction.

2014-03-28

Improve treatment of nullable labelsets

As introduced earlier, nullablesets are now written lan<...>, for instance lan<lal_char(ab)>_b. The former syntax, e.g., lan_char(ab)_b, is kept for backward compatibility and ease of use.

Up to now, lan<> would only work on lal and labelsets with already a one provided, making the improvement quite useless. Now lan<...> will manufacture a one for the labelsets that don't have one, so it can be wrapped around anything.

In addition to that, simplifications are applied; for instance lan<lan<lan_char(ab)>>_b actually builds lan<lal_char(ab)>_b. Similarly, lan<lat<lan_char(ab), law_char(ab), ratexpset<lal_char(ab)_b>>>_b generates lat<lan<lal_char(ab)>, law_char(ab), ratexpset<lal_char(ab)_b>>_b, since the tuple wrapped in the outer lan already has a one, given that all of its components have one.

2014-03-26

concatenation and conjunction of ratexps accept more heterogeneous arguments

Conversions of both labels and weights are performed if needed.

>>> a = vcsn.context('ratexpset<lal_char(xy)_b>_z').ratexp("<2>'x*'")
>>> b = vcsn.context('lal_char(b)_q')              .ratexp('<1/3>b')
>>> a * b
<2>x*<1/3>b
>>> (a*b).info()['type']
'ratexpset<ratexpset<lal_char(bxy)_b>_q>'

>>> ab = vcsn.context('lal_char(ab)_z').ratexp('(a+b)*')
>>> bc = vcsn.context('lal_char(bc)_b').ratexp('(b+c)*')
>>> ab & bc
(a+b)*&(b+c)*
>>> (ab & bc).info()['type']
'ratexpset<lal_char(abc)_z>'

2014-03-09

left-mult and right-mult are bound in Python

Both work on automata and ratexps. Left multiplication now has its arguments in a more natural order in the C++ API: (weight, automaton), instead of the converse previously. TAF-Kit still has it the old way. The Python operator * (left-associative) is overloaded to provide syntactic sugar.

>>> z = vcsn.context('lal_char(abc)_z')
>>> a = z.weight("12") * z.ratexp('ab').standard() * z.weight("23")
>>> a.ratexp()
(<12>ab)<23>

>>> z.weight("12") * z.ratexp('ab') * z.weight("23")
<12>(ab)<23>

2014-03-08

sum of rational expressions accepts more heterogeneous arguments

Conversions of both labels and weights are performed if needed.

>>> a = vcsn.context('ratexpset<lal_char(xy)_b>_z').ratexp("<2>'x*'")
>>> b = vcsn.context('lal_char(b)_q')              .ratexp('<1/3>b')
>>> a + b
<2>x*+<1/3>b

concatenation accepts more heterogeneous arguments

As for products and union, it is now possible to compute the concatenation of automata with different types:

>>> z = vcsn.context('lal_char(a)_z').ratexp('<2>a')  .derived_term()
>>> q = vcsn.context('lal_char(b)_q').ratexp('<1/3>b').derived_term()
>>> r = vcsn.context('lal_char(c)_r').ratexp('<.4>c') .derived_term()
>>> (z*q*r).ratexp()
<2>a<0.333333>b<0.4>c

2014-03-05

Bug: standardization of automata

The standardization of an automaton no longer leaves former initial states that became non-accessible.

Bug: right-mult on automata

Its behavior was completely wrong.

2014-02-21

parse and display letter classes in transitions

It is now possible to directly write labels of transitions with letter classes. For instance '[a-ky]' denotes every letter between a and k in the alphabet, or y.

Transition labels with equal weights are displayed this way. For instance 'g, a, b, d, c, f' becomes [a-dfg] and '<2>a, <2>b, <2>c" is displayed '<2>[abc]', but 'a, b' stays in this form.

2014-02-20

union accepts more heterogeneous arguments

As for products (Hadamard, shuffle, infiltration), it is now possible to compute the union of automata with different types:

>>> z = vcsn.context('lal_char(a)_z').ratexp('<2>a')  .derived_term()
>>> q = vcsn.context('lal_char(b)_q').ratexp('<1/3>b').derived_term()
>>> r = vcsn.context('lal_char(c)_r').ratexp('<.4>c') .derived_term()
>>> (z|q|r).ratexp()
<2>a+<0.333333>b+<0.4>c

automaton product optimization.

An optimization enabled by state renumbering.

Score changes on Luca's workstation (before/after):

4.60s  2.87s: a.product(a)         # a = std([a-e]?{50})
2.34s  2.37s: a.shuffle(a)         # a = std([a-e]?{50})
4.01s  2.58s: a.infiltration(a)    # a = std([a-e]?{30})
4.17s  2.96s: a**12                # a = std([a-e]*b(<2>[a-e])*)

2014-02-18

state renumbering is not automatic any longer

In view of several forthcoming in-place algorithms on automata including edit, we changed the automaton output code not to automatically renumber states at the static, dyn and Python levels. A new "sort" algorithm is available to renumber states (breadth-first and then by outgoing transition label) and reorder transitions (by label) in a predictable way, when explicitly requested. TAF-Kit output is automatically sorted.

Having a predictable numbering will enable future optimizations on automaton product and other algorithms accessing all the transition from a given state by label.

2014-02-14

double_ring: New Python binding

Returns a double ring automaton.

2014-02-11

ratexpset can be used for LabelSet

Contexts such as ratexpset<lal_char(ab)_b>_b are now valid. The eliminate_state algorithm works properly on automata of such type.

2014-02-10

ratexp.info: New Python binding

Like automaton.info, returns a dictionary of properties of the ratexp.

2014-02-07

are-isomorphic/is-isomorphic: new algorithm (static, dyn, TAF-Kit, Python)

Given two automata, check whether they are isomorphic to one another. Currently implemented in the deterministic case only, for lal contexts.

>>> ctx = vcsn.context('lal_char(ab)_z')
>>> ctx.ratexp('a+b*').standard().is_isomorphic(ctx.ratexp('b*+a').standard())
True

weighted minimization (static, dyn, tafkit, Python)

A third implementation of the minimization algorithm named "weighted" is now available, supporting any lal context.

The new variant is a more widely applicable generalization of the "signature" implementation. The new variant is the default for non-boolean weightsets; it requires a trim automaton but does not rely on determinism or other properties. Preliminary measures show performance to be close to "signature", or even clearly superior in the case of sparse automata such as dictionaries.

Vaucanson 2b.3

2014-02-03

Release of our fourth beta, vaucanson-2b.3. Available on MacPorts as "vaucanson".

See https://www.lrde.epita.fr/wiki/Vaucanson/Vaucanson2b3.

A Virtual Machine for easy experiments

Installing Vaucanson 2 on some platforms can be tedious. Clément Démoulins contributed a Virtual Machine that makes it easy to experiment with Vaucanson 2 without having to compile it. He also contributed a Vagrantfile to make it even easier to deploy.

To install a Vaucanson virtual machine, please follow this procedure:

1. Install VirtualBox
   From your distro, or from https://www.virtualbox.org/wiki/Downloads.

2. Install Vagrant
   From your distro, or from http://www.vagrantup.com/downloads.html

3. Download this Vagrantfile and save it somewhere.
   https://www.lrde.epita.fr/dload/vaucanson/2.0/Vagrantfile

   For instance
   $ mkdir ~/src/vcsn2
   $ cd ~/src/vcsn2
   $ wget https://www.lrde.epita.fr/dload/vaucanson/2.0/Vagrantfile

4. Run Vagrant (first time will be slow: let it download the VM)

   $ cd ~/src/vcsn2
   $ vagrant up

   Vaucanson is running!

5. Open http://localhost:8888 in your favorite browser.

6. Experiment!  (Hit Shift-Enter to evaluate):
   import vcsn
   vcsn.context('lal_char(abc)_z').ratexp('(<2>a+<3>b)*').derived_term()

7. Turn your VM off when you are done
   $ vagrant halt

2014-01-29

ratexp difference: new algorithm (static, dyn, Python)

The "difference" algorithms generates a ratexp that accepts words of the left-hand side that are not accepted by the right-hand side ratexp. Also bound as the "%" operator in Python.

>>> ctx = vcsn.context('lal_char(abc)_b')
>>> l = ctx.ratexp('[abc]*')
>>> r = ctx.ratexp('[ab]*')
>>> l.difference(r)
(a+b+c)*&(a+b+c)*
>>> l % r
(a+b+c)*&(a+b+c)*

2014-01-27

ratexp intersection: new algorithm (static, dyn, Python)

The "intersection" algorithm computes a ratexp that denotes the Hadamard product of two rational expressions. Also bound as the "&" operator in Python.

>>> ctx = vcsn.context('lal_char(abc)_b')
>>> r = ctx.ratexp('[abc]*')
>>> r.intersection(r)
(a+b+c)*&(a+b+c)*
>>> r & r
(a+b+c)*&(a+b+c)*

ratexp concatenation: new algorithm (static, dyn, Python)

The "concatenate" algorithm computes a ratexp that denotes the concatenation of two rational expressions. Also bound as the "*" operator in Python.

>>> ctx = vcsn.context('lal_char(abc)_b')
>>> r = ctx.ratexp('[abc]*')
>>> r.concatenate(r)
(a+b+c)*(a+b+c)*
>>> r * r
(a+b+c)*(a+b+c)*

Python API: Operators overloading on automata

The Python API now overloads the following operators for automata:

"+", sum of two automata
"&", product of two automata
"~", complement of an automaton
"*", concatenation of two automata
"%", difference between two automata
"|", union of two automata
"**", power of an automaton

2014-01-24

ratexp sum: new algorithm (static, dyn, Python)

Compute the sum of two rational expressions. Also bound as the "+" operator in Python.

>>> ctx = vcsn.context('lal_char(abc)_b')
>>> r = ctx.ratexp('[abc]*')
>>> r.sum(r)
(a+b+c)*+(a+b+c)*
>>> r + r
(a+b+c)*+(a+b+c)*

proper: an optional argument to avoid state deletion

The proper algorithm eliminates the states that become inaccessible in the course of spontaneous-transition elimination. This can be disabled by passing "false" as additional argument to proper (static, dyn, Python).

2014-01-17

star_height: new algorithm (static, dynamic, Python)

Computes the star-height of an expression.

>>> vcsn.context('lal_char(ab)_b').ratexp('(a***+a**+a*)*').star_height()
4

Bison is no longer needed

To install Vaucanson from a tarball, Bison is no longer needed. Of course, modifying one of grammar files will fail, unless Bison 3.0 is installed.

2014-01-13

Letter classes in context specifications

It is now possible to use ranges to define alphabets. For instance in Python,

vcsn.context('lal_char(a-zA-Z0-9_)_b')

builds a context whose alphabet covers letters, digits, and underscore.

Vaucanson 2b.2

2014-01-10

Release of our third beta, vaucanson-2b.2. Available on MacPorts as "vaucanson".

Letter classes

Letter classes are available in ratexps: '[a-d0-9_]' is expanded into (a+b+c+d+0+1+2+3+4+5+6+7+8+9+_). The negation ([^...]) is not supported.

2014-01-02

The '.' operator is no longer printed

The pretty-printing of (non-LAW) ratexps is simplified.

Before: $ vcsn ladybird 2 | vcsn determinize | vcsn aut-to-exp +a.(b+a.a+c.(a+c).b).(a+c.(a+c)*) $ vcsn derived-term -Ee 'a:b:c' | vcsn aut-to-exp (a.b+b.a).c+(a.c+c.a).b+(b.c+c.b).a

After: $ vcsn ladybird 2 | vcsn determinize | vcsn aut-to-exp +a(b+aa+c(a+c)b)(a+c(a+c)*) $ vcsn derived-term -Ee 'a:b:c' | vcsn aut-to-exp (ab+ba)c+(ac+ca)b+(bc+cb)a

Shortlex order is now used for ratexps

This changes the pretty-printing of polynomials of ratexps, for instance the result of derivations.

Before: $ vcsn derivation -e '(a+b)a(a+b)' aa a+b + a.a.(a+b) + a

After: $ vcsn derivation -e '(a+b)a(a+b)' aa a* + a+b + aa(a+b*)

ratexp implementation overhaul

Although internal, this change is documented as it deeply changes the way ratexps are handled in C++ code.

The abstract-syntax tree of the rational expressions now matches the usual (abstract) grammar:

E ::= \z | \e | a | E+F | E.F | E* | kE | Ek       k is a weight

In other words, there are now 'left-weight' and 'right-weight' nodes that exist, whereas before, the six first cases carried left and right weights. Trivial identities are enforced, and, for instance, no tree for 'a k' exist: it is converted to 'k a'.

2013-12-19

Many of these listed features were actually developed over the last month.

products accept more heterogeneous arguments

It is already possible to compute the (regular, infiltration and shuffle) products of automata with different types of "basic" weightsets (e.g. B, Z, Q, R).

It is now also possible when weights are ratexps. For instance the product of automaton with weights in Q and an automaton with weights in RatE yields an automaton with weights in Q-RatE.

New ratexp operators: '&' (intersection) and ':' (shuffle)

The operator '&' denotes the intersection in the case of Boolean weights, or more generally, the Hadamard product. Only "derived_term" can compute an automaton from it, in which case:

derived_term(E & F) = product(derived_term(E), derived_term(F))

The operator ':' denotes the shuffle product, aka interleave. For instance 'a:b:c' denotes the language of the permutations of "abc". Only "derived_term" can compute an automaton from it, in which case:

derived_term(E : F) = shuffle(derived_term(E), derived_term(F))

minimization is much faster

There are now two different implementations of the minimization, namely "moore" and "signature". Both require a trim automaton as input, and "signature" accepts non-deterministic automata.

The "moore" minimization is currently the fastest.

context run-time compilation

When dyn::make_context is presented with an unknown but valid context, it is compiled and loaded dynamically. For instance:

$ vcsn cat -C 'lao_r' -W -e 3.14
# Wait for the context to be compiled...
3.14

The compiled context is currently left in /tmp for forthcoming runs.

$ ls /tmp/lao*
/tmp/lao_r.cc   /tmp/lao_r.o   /tmp/lao_r.so

Python binding

It is now possible to use Vaucanson from Python. The Python binding is on top of the dyn API, and inherits all its features.

This Python API is object-oriented, contrary to dyn which is a list of types (context, ratexp, automaton, etc.) and functions (derived_term, determinize, etc.):

 dyn:: functions                    |  Python methods
------------------------------------+--------------------------------------
derived_term(ratexp) -> automaton   | ratexp.derived_term() -> automaton
determinize(automaton) -> automaton | automaton.determinize() -> automaton
etc.                                | etc.

Documentation is forthcoming, but for instance:

$ python
Python 2.7.6 (default, Nov 12 2013, 13:26:39)
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import vcsn
>>> b = vcsn.context('lal_char(abc)_b')
>>> b
lal_char(abc)_b
>>> r1 = b.ratexp('(a+b)*')
>>> r1
(a+b)*
>>> a1 = r1.derived_term()
>>> a2 = b.ratexp('(b+c)*').standard()
>>> a1.product(a2).ratexp()
\e+b.b*
>>>

2013-12-03

products accept heterogeneous arguments

It is now possible to compute the (regular, infiltration and shuffle) products of a (B, Z, Q, R) automaton with a (B, Z, Q, R) automaton.

$ vcsn derived-term -C 'lal_char(ab)_q' -e '(<1/2>a+<1/3>b)*' > q.gv
$ vcsn derived-term -C 'lal_char(ab)_z' -e '(<2>a+<3>b)*'     > z.gv
$ vcsn derived-term -C 'lal_char(ab)_r' -e '(<.2>a+<.3>b)*'   > r.gv

$ vcsn product -f z.gv q.gv | vcsn aut-to-exp
(a+b)*
$ vcsn product -f q.gv z.gv | vcsn aut-to-exp
(a+b)*
$ vcsn product -f q.gv z.gv q.gv | vcsn aut-to-exp
(<1/2>a+<1/3>b)*
$ vcsn product -f q.gv z.gv z.gv | vcsn aut-to-exp
(<2>a+<3>b)*
$ vcsn product -f q.gv r.gv | vcsn aut-to-exp
(<0.1>a+<0.1>b)*
$ vcsn product -f z.gv r.gv | vcsn aut-to-exp
(<0.4>a+<0.9>b)*
$ vcsn product -f z.gv r.gv q.gv | vcsn aut-to-exp
(<0.2>a+<0.3>b)*

2013-12-02

TikZ output now uses standalone

The TikZ output can now both be compiled as a standalone document, and be used inlined (as LaTeX source) in another document. Read the document of the "standalone" package (e.g., run 'texdoc standalone').

Data library path

TAF-Kit is now able to find installed files. Control the search path via the environment variable VCSN_DATA_PATH (a ":"-separated list of directories).

$ vcsn aut-to-exp -f lal_char_z/c1.gv
(a+b)*.b.(<2>a+<2>b)*

dot format parsing is much faster

We now require Bison 3.0 to build Vaucanson. In exchange, we get is very significant speed-up. On erebus (Mac OS X i7 2.9GHz 8GB, GCC 4.8 -O3) reading standard('a?{500}') goes from 30s to 1.2s.

2013-11-25

minimize: new algorithm (static, dynamic, TAF-Kit)

Given a deterministic Boolean lal automaton, compute its minimal equivalent automaton using Moore's algorithm.

Example: $ vcsn standard -e '(a+b+c+d){100}' -o 100.gv $ vcsn cat -f 100.gv -O info | grep 'number of states' number of states: 401 $ vcsn minimize -f 100.gv -o 100min.gv $ vcsn cat -f 100min.gv -O info | grep 'number of states' number of states: 101 $ vcsn are-equivalent -f 100.gv 100min.gv true

fraction parsing bug fixes

Negative denominators used to be silently mangled, and zero denominators were accepted without failing.

2013-11-22

split: new algorithm (static, dynamic, TAF-Kit)

Implements the breaking of a ratexp into a polynomial of ratexps, as required by breaking derivation, and broken derived terms. Not named "break", as this is a C++ keyword.

2013-11-20

products accept heterogeneous arguments

It is now possible to compute the (regular, infiltration and shuffle) products of a B-automaton with any other kind of automaton.

$ vcsn standard -C 'lal_char(ab)_b' -e 'a' -o a.gv
$ vcsn standard -C 'lal_char(ab)_ratexpset<lal_char(uv)_b>' \
                -e '(<u>a+<v>b)*' -o ab.gv
$ vcsn product -f ab.gv a.gv | vcsn shortest -f - 4
<u>a
$ vcsn shuffle -f ab.gv a.gv | vcsn shortest -f - 4
a + <u+u>aa + <v>ab + <v>ba
$ vcsn infiltration -f ab.gv a.gv | vcsn shortest -f - 4
<u+\e>a + <u.u+u+u>aa + <u.v+v>ab + <v.u+v>ba

2013-11-19

shortest: accepts the number of words as argument (defaults to 1)

$ vcsn shortest -O text -Ee '(a+b)*' 2
\e + a
$ vcsn enumerate -O text -Ee '(a+b)*' 2
\e + a + b + aa + ab + ba + bb

difference: new algorithm (static, dynamic, TAF-Kit)

Computes the difference between an automaton and a B-automaton (on LAL only):

$ vcsn derived-term -e '(?@lal_char(ab)_z)(<2>a+<3>b)*' -o lhs.gv
$ vcsn derived-term -e '(a+b)*b(a+b)*' -o rhs.gv
$ vcsn difference -f lhs.gv rhs.gv | vcsn aut-to-exp
(<2>a)*
$ vcsn difference -f rhs.gv rhs.gv | vcsn aut-to-exp
\z

product accepts heterogeneous arguments

It is now possible to compute the product of (for instance) a Z-automaton with a B-automaton.

$ vcsn standard -C 'lal_char(abc)_z' -e '(<2>a+<3>b+<5>c)*' -o 1.gv
$ vcsn standard -C 'lal_char(b)_b' -e 'b*' -o 2.gv
$ vcsn product -f 1.gv 2.gv | vcsn aut-to-exp
\e+<3>b.(<3>b)*
$ vcsn product -f 2.gv 1.gv | vcsn aut-to-exp
\e+<3>b.(<3>b)*

Vaucanson 2b.1

2013-11-13

Release of our second beta, vaucanson-2b.1. Available on MacPorts as "vaucanson".

2013-11-07

TAF-Kit: infiltration, product, shuffle: accept multiple arguments

You may pass several (one or more) arguments to vcsn product, infiltration and shuffle.

infiltration, product, shuffle: accept non commutative semirings

Beware that the (well defined) behavior of the resulting automata is no longer what one might expect.

$ vcsn standard -C 'lal_char(ab)_ratexpset<lal_char(uv)_b>' \
                -e '<u>a<v>b' -o uv.gv
$ vcsn standard -C 'lal_char(ab)_ratexpset<lal_char(xy)_b>' \
                -e '<x>a<y>b' -o xy.gv
$ vcsn product -f uv.gv xy.gv | vcsn enumerate -f - 4
<u.x.v.y>ab
$ vcsn shuffle -f uv.gv xy.gv | vcsn enumerate -f - 4
<u.x.v.y+u.x.y.v+x.u.v.y+x.u.y.v>aabb
<u.v.x.y+x.y.u.v>abab
$ vcsn infiltration -f uv.gv xy.gv | vcsn enumerate -f - 4
<u.x.v.y>ab
<u.x.v.y+x.u.v.y>aab
<u.x.v.y+u.x.y.v>abb
<u.x.v.y+u.x.y.v+x.u.v.y+x.u.y.v>aabb
<u.v.x.y+x.y.u.v>abab

Input/Output fixes

Support for EFSM in input/output is improved to support weights. Output was significantly sped up.

Before: 0.64s: standard -Ee 'a?{500}' -O dot >a500.gv 4.47s: standard -Ee 'a?{500}' -O efsm >a500.efsm 4.04s: standard -Ee 'a?{500}' -O fado >a500.fado 4.83s: standard -Ee 'a?{500}' -O grail>a500.grail 1.03s: standard -Ee 'a?{500}' -O tikz >a500.tikz

After: 0.76s: standard -Ee 'a?{500}' -O dot >a500.gv 0.37s: standard -Ee 'a?{500}' -O efsm >a500.efsm 0.35s: standard -Ee 'a?{500}' -O fado >a500.fado 0.35s: standard -Ee 'a?{500}' -O grail>a500.grail 0.68s: standard -Ee 'a?{500}' -O tikz >a500.tikz

2013-11-04

boolean weight parsing bug fix

The weight parser used to ignore all characters after the first one, so that for example "1-q" was considered valid and equivalent to "1".

2013-10-31

product and infiltration-product are much faster

On erebus (MacBook Pro i7 2.9GHz 8GB, GCC 4.8 -O3), with "vcsn standard -E -e 'a?70' -o a70.gv":

Before: 13.01s: product -q -f a70.gv a70.gv 0.23s: shuffle -q -f a70.gv a70.gv 16.64s: infiltration -q -f a70.gv a70.gv After: 0.77s: product -q -f a70.gv a70.gv 0.19s: shuffle -q -f a70.gv a70.gv 0.87s: infiltration -q -f a70.gv a70.gv

2013-10-30

TAF-Kit: option name changes

The former options -W and -L (to specify the weightset and labelset) are removed, definitively replaced by -C, to specify the context.

In addition to -A and -E (input is an automaton/ratexp), option -W now replaces option -w (input is a weight), and option -P denotes "polynomials". Note that "polynomials" are actually linear combinations of labels, weighted by weights, so for instance "LAL" contexts accept only single-letter labels, used LAW to accept words.

$ vcsn cat -C 'law_char(ab)_z' -P -e 'a+ab + <-1>a+ab+<10>bb'
<2>ab + <10>bb

internal overhaul

Better names were chosen for the various details of the dyn:: value support:

           abstract_automaton -> automaton_base

             abstract_context -> context_base

          abstract_polynomial -> polynomial_base
 concrete_abstract_polynomial -> polynomial_wrapper

              abstract_ratexp -> ratexp_base
     concrete_abstract_ratexp -> ratexp_wrapper

           abstract_ratexpset -> ratexpset_base
  concrete_abstract_ratexpset -> ratexpset_wrapper

              abstract_weight -> weight_base
     concrete_abstract_weight -> weight_wrapper

Note that "base" is slightly misleading, as it actually applies to the wrappers, and not to the static objects. For instance, weightsetwrapper derives from weightset_base, but b, z, and the other (static) weightsets do not derive from weightset_base. It would therefore be more precise to name "weightset_base" as "weightset_wrapper_base", but that might be uselessly long. Names may change again in the future.

Because dyn::polynomial and dyn::weight store the corresponding polynomialset/weightset, there is no need (so far?) for a dyn::polynomialset/dyn::weightset. Therefore, abstract_polynomialset and abstract_weightset were removed.

For consistency (and many other benefits), automaton_wrapper and context_wrapper were introduced. Now, the whole static API is completely independent of the dyn API, and the dyn API consists only of wrappers of "static"-level values---instead of inheritance for automata and contexts.

2013-10-25

renamings (static, dynamic, TAF-Kit)

derive -> derivation derived-terms -> derived-term infiltrate -> infiltration

2013-10-24

taf-kit: option -q for "quiet"

Equivalent to "-O null": do not generate output.

star-normal-form: new algorithm (static, dynamic, TAF-Kit)

Compute an equivalent rational expression where starred subexpressions have a non null constant term. Yields the same standard automaton, but built faster.

$ vcsn star-normal-form -e '(a*b*)*'
(a+b)*

Valid only for Boolean automata.

ratexp printing: minimize parentheses

Print rational expressions with the minimum required number of parentheses, making some large expressions considerably easier to read. For example

Before: $ vcsn expand -C 'law_char(abc)_z' -e '(a+b)?{2}' (+<2>a+<2>b+(aa)+(ab)+(ba)+(bb)) $ vcsn ladybird 2 | vcsn determinize | vcsn aut-to-exp +(a.((b+(a.a)+(c.((a+c)).b))).(a+(c.((a+c)*))))

After: $ vcsn expand -C 'law_char(abc)_z' -e '(a+b)?{2}' (+<2>a+<2>b+aa+ab+ba+bb) $ vcsn ladybird 2 | vcsn determinize | vcsn aut-to-exp +a.(b+a.a+c.(a+c).b).(a+c.(a+c)*)

expand: new algorithm (static, dynamic, TAF-Kit)

Given a rational expression, distribute product over addition (recursively under the starred subexpressions) group and sort the equal monomials.

$ vcsn expand -C 'lal_char(abc)_z' -e '(a+b)?{2}*'
(\e+<2>a+<2>b+(a.a)+(a.b)+(b.a)+(b.b))*

Vaucanson 2b.0

2013-10-22

Release of our first beta, vaucanson-2b.0.

2013-10-17

shuffle, infiltrate: new algorithms (static, dynamic, TAF-Kit)

New algorithms on LAL automata: computes the shuffle-product and infiltration of two automata. These (and now product as well) require weightsets to be commutative semirings.

$ vcsn standard -C 'lal_char(ab)_z' -E -e 'ab' -o ab.gv
$ vcsn product    -f ab.gv ab.gv | vcsn enumerate -O text -f - 4
ab
$ vcsn shuffle    -f ab.gv ab.gv | vcsn enumerate -O text -f - 4
<4>aabb + <2>abab
$ vcsn infiltrate -f ab.gv ab.gv | vcsn enumerate -O text -f - 4
ab + <2>aab + <2>abb + <4>aabb + <2>abab

2013-10-15

derived-terms: new algorithm (static, dynamic, TAF-Kit)

In addition to thompson and standard, this is a third means to build an automaton from a (weighted) rational expression. It corresponds to the Antimirov definition of derivatives (implemented by "derive"). It requires LAL rational expressions.

$ vcsn derived-terms -C 'lal_char(ab)_q' -e '(<1/6>a*+<1/3>b*)*'
digraph
{
  vcsn_context = "lal_char(ab)_q"
  rankdir = LR
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F0
    F1
    F2
  }
  {
    node [shape = circle]
    0
    1
    2
  }
  I0 -> 0
  0 -> F0 [label = "<2>"]
  0 -> 1 [label = "<1/3>a"]
  0 -> 2 [label = "<2/3>b"]
  1 -> F1 [label = "<2>"]
  1 -> 1 [label = "<4/3>a"]
  1 -> 2 [label = "<2/3>b"]
  2 -> F2 [label = "<2>"]
  2 -> 1 [label = "<1/3>a"]
  2 -> 2 [label = "<5/3>b"]
}

vcsn eliminate-state

This tool allows one to eliminate states one after the other.

$ vcsn eliminate-state -f lao.gv 2

Its interface is likely to change, or to be completely removed.

Smaller libraries

Some compiler magic was used to reduce the size of the libraries (about 20% on Mac OS X using GCC 4.8). TAF-Kit might start faster.

2013-10-14

determinize: by default no longer forces the result to be complete

The determinization (static, dynamic, TAF-Kit) now admits an optional second argument (defaulting to '0'), a Boolean stating whether the result must be complete.

2013-10-12

polynomialset fully replaces entryset

The 'entryset' type is removed, as polynomialset now provides a strict superset of its features.

2013-10-11

derive: new algorithm (static, dynamic, TAF-Kit)

New algorithm on LAL automata: computes the derivation of rational expressions with respect to a word.

$ vcsn derive -C 'lal_char(a)_z' -e '(<2>a)*' a
<2>(<2>a)*
$ vcsn derive -C 'lal_char(a)_z' -e '(<2>a)*' aa
<4>(<2>a)*
$ vcsn derive -C 'lal_char(a)_z' -e '(<2>a)*' aaaa
<16>(<2>a)*
$ vcsn derive -C 'lal_char(a)_z' -e '(<2>a)*' b
\z

$ vcsn derive -C 'lal_char(ab)_q' -e '(<1/6>a*+<1/3>b*)*' a
<1/3>(a*).(((<1/6>a*)+(<1/3>b*))*)
$ vcsn derive -C 'lal_char(ab)_q' -e '(<1/6>a*+<1/3>b*)*' aa
<4/9>(a*).(((<1/6>a*)+(<1/3>b*))*)

2013-10-08

enumerate returns a dyn::polynomial

The 'enumerate' algorithm use to cheat, and returned a std::vector<string> (each string being a pretty-printing of the monomial, e.g., "<2>a"). It now returns a dyn::polynomial.

vcsn-enumerate is biased to prefer the 'list' output format:

$ vcsn enumerate -Ee 'a*' 3
\e
a
aa
aaa

however the proper syntax for polynomials can be asked for.

$ vcsn enumerate -O text -Ee 'a*' 3
\e + a + aa + aaa

New dyn:: type: polynomial

A new member joins the dyn:: family of types (i.e., automaton, ratexp, ratexpset, weight, weightset). Currently there are no means to read such a value from TAF-Kit, but there is output support with two different output format:

'text' (aka 'default') Prints the polynomial this usual way, e.g., "<2>a+<3>b".
'list' Prints one monomial per line, e.g. '<2>a <3>b'

2013-10-02

mutable_automaton: speed improvement

"set_transition" used to invoke twice "get_transition", which had a serious performance impact on some algorithms. This is fixed.

Before: 9.08s (0.33s+8.75s): ladybird 21 | determinize -O null 16.88s (0.85s+16.03s): thompson -C "lan_char(a)b" -Ee "a?{2000}" | proper -O null 21.30s (9.76s+11.54s): standard -Ee "(a+b+c+d)?{100}" | aut-to-exp -O null 7.29s (0.04s+7.25s): standard -C "lalchar(ab)_z" -Ee "(a+b)b(<2>a+<2>b)" | power -O null -f- 10 0.04s (0.04s): standard -E -e "(a?){70}" -o a70.gv 24.20s (24.20s): product -O null -f a70.gv a70.gv

After: 8.54s (0.13s+8.41s): ladybird 21 | determinize -O null 11.87s (0.86s+11.01s): thompson -C "lan_char(a)b" -Ee "a?{2000}" | proper -O null 21.03s (9.40s+11.63s): standard -Ee "(a+b+c+d)?{100}" | aut-to-exp -O null 6.58s (0.06s+6.52s): standard -C "lalchar(ab)_z" -Ee "(a+b)b(<2>a+<2>b)" | power -O null -f- 10 0.07s (0.07s): standard -E -e "(a?){70}" -o a70.gv 13.16s (13.16s): product -O null -f a70.gv a70.gv

Note in particular that the spontaneous transition elimination algorithm is faster, going from 16s to 11s on a MacBook Pro i7 2.9GHz 8GB RAM, on the following sequence.

$ vcsn thompson -C 'lan_char(a)_b' -Ee 'a?{2000}' | vcsn proper -O null

2013-10-01

random: new algorithm (static, dynamic, TAF-Kit)

Random generation of automata. Subject to changes. Accepts four arguments: number of states, density (of transitions, defaults to .1, 1 generates a clique), number of initial states (defaults to 1), and number of final states (defaults to 1).

So far LAL and LAN only, no support for random weights.

$ vcsn random -O fado 3
@DFA 0
0 d 1
1 c 2
2 d 2
$ vcsn random -O fado 3
@DFA 2
0 c 1
1 d 2
2 c 0
$ vcsn random -C 'lan_char(ab)_b' -O fado 3
@NFA 1 * 0
0 @epsilon 1
1 b 2
2 b 1

2013-09-24

Overhaul of the package

Thanks to Automake 1.14 features, there is now a single Makefile, which significantly speeds up the compilation of the package. The test-suite now uses the Test Anything Protocol, which results in more verbose results.

Beware that because of subtle issues (in the generated Makefile snippets that track dependencies), you are highly recommend to "make clean" after upgrading from the Git repository, and then "make" as usual.

2013-09-20

proper: faster implementation

The spontaneous transition elimination algorithm is now faster, going from 102s to 15s on a MacBook Pro i7 2.9GHz 8GB RAM, on the following sequence.

$ vcsn thompson -C 'lan_char(a)_b' -Ee 'a?{2000}' | vcsn proper -O null

2013-09-17

is-ambiguous

New algorithm (static, dynamic, TAF-Kit) on LAL automata: whether some word is the label of at least two successful computations.

$ vcsn is-ambiguous <<\EOF
digraph
{
  vcsn_context="lal_char(ab)_b"
  I -> 0
  0 -> 1 [label = "a"]
  0 -> 2 [label = "a"]
  1 -> F
}
EOF
false

$ vcsn is-ambiguous <<\EOF
digraph
{
  vcsn_context="lal_char(ab)_b"
  I -> 0
  0 -> 1 [label = "a"]
  0 -> 2 [label = "a"]
  1 -> F
  2 -> F
}
EOF
true

Also reported in the 'info' format for automata.

product: recover the original states

Similarly to 'determinize', the (static version of) 'product' can now be queried to get a map from states of the result to pairs of original states.

2013-09-08

Sum of standard automata is fixed

See the previous entry: the computation of the initial transition was wrong, which resulted in the production of non-standard automata. This is fixed:

$ vcsn standard -C 'lal_char(a)_ratexpset<lal_char(x)_b>' -e '<x>a*' > 1.gv
$ vcsn standard -C 'lal_char(b)_ratexpset<lal_char(y)_b>' -e '<y>b*' > 2.gv
$ vcsn sum -f 1.gv 2.gv
digraph
{
  vcsn_context = "lal_char(ab)_ratexpset<lal_char(xy)_b>"
  rankdir = LR
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F0
    F1
    F2
  }
  {
    node [shape = circle]
    0
    1
    2
  }
  I0 -> 0
  0 -> F0 [label = "<x+y>"]
  0 -> 1 [label = "<x>a"]
  0 -> 2 [label = "<y>b"]
  1 -> F1
  1 -> 1 [label = "a"]
  2 -> F2
  2 -> 2 [label = "b"]
}

2013-09-06

Operations on automata are generalized

Operations (union, sum, concatenate, chain) were uselessly restricted to LAL. Besides, the contexts were improperly computed (both labelset and weightset). This is fixed.

$ vcsn standard -C 'lal_char(a)_ratexpset<lal_char(x)_b>' -e '<x>a*' > 1.gv
$ vcsn standard -C 'lal_char(b)_ratexpset<lal_char(y)_b>' -e '<y>b*' > 2.gv
$ vcsn sum -f 1.gv 2.gv
digraph
{
  vcsn_context = "lal_char(ab)_ratexpset<lal_char(xy)_b>"
  rankdir = LR
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F0
    F1
    F2
  }
  {
    node [shape = circle]
    0
    1
    2
  }
  I0 -> 0 [label = "<\\e+\\e+\\e>"]
  0 -> F0 [label = "<x+y>"]
  0 -> 1 [label = "<x>a"]
  0 -> 2 [label = "<y>b"]
  1 -> F1
  1 -> 1 [label = "a"]
  2 -> F2
  2 -> 2 [label = "b"]
}

2013-09-04

double-ring

New algorithm (static, dynamic, TAF-Kit).

$ vcsn double-ring -C 'lal_char(ab)_b' 6 1 3 4 5
digraph
{
  vcsn_context = "lal_char(ab)_b"
  rankdir = LR
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F1
    F3
    F4
    F5
  }
  {
    node [shape = circle]
    0
    1
    2
    3
    4
    5
  }
  I0 -> 0
  0 -> 1 [label = "a"]
  0 -> 5 [label = "b"]
  1 -> F1
  1 -> 0 [label = "b"]
  1 -> 2 [label = "a"]
  2 -> 1 [label = "b"]
  2 -> 3 [label = "a"]
  3 -> F3
  3 -> 2 [label = "b"]
  3 -> 4 [label = "a"]
  4 -> F4
  4 -> 3 [label = "b"]
  4 -> 5 [label = "a"]
  5 -> F5
  5 -> 0 [label = "a"]
  5 -> 4 [label = "b"]
}

It is advised to pass "layout = circo" to Dot for the rendering.

concatenation, chain

New algorithm on standard automata (static, dynamic, TAF-Kit).

right-mult

New algorithm on standard automata (static, dynamic, TAF-Kit). Same limitations as left-mult, see below.

2013-09-03

left-mult

New algorithm on standard automata (static, dynamic, TAF-Kit).

The TAF-Kit version is (currently) troublesome, as it does not infer the context to parse the weight from the automaton: be sure to specify -C:

$ vcsn standard -C 'lal_char(a)_z' -e a | vcsn left-mult 12
vcsn-left-mult: left_mult: no implementation available for mutable_automaton<lal_char_z> x b
$ vcsn standard -C 'lal_char(a)_z' -e a | vcsn left-mult -C 'lal_char(a)_z' 12
digraph
{
  vcsn_context = "lal_char(a)_z"
  rankdir = LR
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F1
  }
  {
    node [shape = circle]
    0
    1
  }
  I0 -> 0
  0 -> 1 [label = "<12>a"]
  1 -> F1
}

taf-kit: option -w for weights as input

Currently only vcsn-cat supports it.

$ vcsn cat -w -e 2
vcsn-cat: invalid Boolean: 2
$ vcsn cat -w -e 1
1
$ vcsn cat -C 'lal_char(a)_z' -w -e 2
2

concatenate, star, sum

New algorithms on standard automata (static, dynamic, TAF-Kit).

2013-09-02

union

New algorithm on automata (static as "union_a", dynamic as "union_a", TAF-Kit as "union"). Plain graph union.

is-valid

New algorithm on rational expressions (static, dynamic, TAF-Kit).

$ vcsn is-valid -C 'lal_char(a)_r' -E -e '(<.5>\e)*'
true
$ vcsn is-valid -C 'lal_char(a)_r' -E -e '\e*'
false

For consistency, is-valid is now also available for automata in dyn:: and TAF-Kit (it used to be static only, visible from "info" output).

$ vcsn thompson -C 'lan_char(a)_r' -e '(<.5>\e)*' | vcsn is-valid
true
$ vcsn thompson -C 'lan_char(a)_r' -e '\e*' | vcsn is-valid
false

2013-08-13

More RatExp quantifiers

In addition to "", there is "?"/"{?}" and "{+}". Support for "{}" is added for symmetry.

$ vcsn cat -Ee 'a?'
\e+a
$ vcsn cat -Ee 'a?{3}'
(\e+a).(\e+a).(\e+a)
$ vcsn cat -Ee '(a+b){+}'
(a+b).((a+b)*)

2013-08-02

I/O EFSM format support

We are now able to produce and read EFSM format (designed as an interface to OpenFST). This is not (yet) thoroughly tested for weighted automata.

$ vcsn ladybird -O efsm 4 |
   efstcompile | fstdeterminize | efstdecompile |
   vcsn cat -I efsm -O info
type: mutable_automaton<lal_char(abc)_b>
number of states: 15
number of initial states: 1
number of final states: 8
number of accessible states: 15
number of coaccessible states: 15
number of useful states: 15
number of transitions: 43
number of deterministic states: 15
number of eps transitions: 0
is complete: 0
is deterministic: 1
is empty: 0
is eps-acyclic: 1
is normalized: 0
is proper: 1
is standard: 0
is trim: 1
is useless: 0
is valid: 1

Output in EFSM format is improved

State numbers now start appropriately at 0 (instead of 2), and when there is a single initial state, no "pre-initial state" is output; this avoids the introduction of spontaneous transitions in deterministic automata.

Before (see news of 2013-07-01):

$ vcsn ladybird -O efsm 2
#! /bin/sh

cat >transitions.fsm <<\EOFSM
0       2       \e
2
2       3       a
3       3       b
3       3       c
3       2       c
3       2       a
EOFSM

cat >isymbols.txt <<\EOFSM
\e      0
a       1
b       2
c       3
EOFSM

fstcompile --acceptor --keep_isymbols --isymbols=isymbols.txt transitions.fsm

Now:

$ vcsn ladybird -O efsm 2
#! /bin/sh

cat >isymbols.txt <<\EOFSM
\e      0
a       1
b       2
c       3
EOFSM

cat >transitions.fsm <<\EOFSM
0       1       a
1       0       a
1       1       b
1       0       c
1       1       c
0
EOFSM

fstcompile --acceptor --keep_isymbols --isymbols=isymbols.txt transitions.fsm

I/O FAdo format support

We are now able to produce and read FAdo format.

$ vcsn ladybird -O fado 4 | \
  python -c "from FAdo import fa
nfa = fa.readFromFile('/dev/stdin')[0]
dfa = nfa.toDFA()
fa.saveToFile('dl4.fado', dfa)"

$ vcsn cat -I fado -O info -f dl4.fado
type: mutable_automaton<lal_char(abc)_b>
number of states: 15
number of initial states: 1
number of final states: 8
number of accessible states: 15
number of coaccessible states: 15
number of useful states: 15
number of transitions: 43
number of deterministic states: 15
number of eps transitions: 0
is complete: 0
is deterministic: 1
is empty: 0
is eps-acyclic: 1
is normalized: 0
is proper: 1
is standard: 0
is trim: 1
is useless: 0
is valid: 1

constant-term

New algorithm on rational expressions (static, dynamic, TAF-Kit).

$ vcsn constant-term -e '(?@lal_char(a)_b)(\e)*'
1
$ vcsn constant-term -e '(?@lal_char(a)_z)(\e)*'
vcsn-constant-term: z: star: invalid value: 1
$ vcsn constant-term -C 'law_char(ab)_ratexpset<law_char(wxyz)_b>' \
                     -e '<w>(<x>a*+<y>b*)*<z>'
w.((x+y)*).z

2013-07-26

Automaton library

The set of library automata for existing contexts is now complete with respect to what Vaucanson 1 provided. Automata families are not, and will not, be part of this library ; for instance, instead of looking for ladybird-6.gv, use 'vcsn ladybird 6'.

lal_char_b: a1.gv b1.gv evena.gv oddb.gv
lal_char_z: b1.gv binary.gv c1.gv d1.gv
lal_char_zmin: minab.gv minblocka.gv slowgrow.gv

Currently one must specify their path:

$ vcsn evaluate -f share/vcsn/lal_char_zmin/minab.gv aabbba
3

2013-07-25

TAF-Kit: works on standard input by default

The very frequent "-f -" sequence is no longer required: by default the input is stdin.

$ vcsn ladybird 2 | vcsn determinize | vcsn aut-to-exp
\e+(a.((b+(a.a)+(c.((a+c)*).b))*).(a+(c.((a+c)*))))

2013-07-18

standard

New algorithm on automata (static, dynamic, TAF-Kit). Corresponds to "standardize" in Vaucanson 1.

is-standard

New algorithm on automata (static, dynamic, TAF-Kit). Also reported in "info" format.

u

New automata factory (static, dynamic, TAF-Kit): Brzozowski's universal witness.

2013-07-17

New formats: grail and fado

$ vcsn standard -O fado -e 'a+b'
@DFA 1 2
0       a       1
0       b       2
$ vcsn standard -O fado -e 'a+ab'
@NFA 1 3 * 0
0       a       1
2       b       3
0       a       2
$ vcsn standard -O grail -e 'a+ab'
(START)     |-      0
0       a       1
2       b       3
0       a       2
1       -|      (FINAL)
3       -|      (FINAL)

2013-07-14

power

New algorithm on automata (static, dynamic, TAF-Kit).

$ vcsn standard -e '(?@lal_char(01)_z)(0+1)*1(<2>0+<2>1)*' >binary.gv
$ vcsn power -f binary.gv 0 | vcsn enumerate -f- 2
\e
0
1
00
01
10
11
$ vcsn power -f binary.gv 1 | vcsn enumerate -f- 2
1
<2>10
<3>11
$ vcsn power -f binary.gv 2 | vcsn enumerate -f- 2
1
<4>10
<9>11
$ vcsn power -f binary.gv 4 | vcsn enumerate -f- 2
1
<16>10
<81>11
$ vcsn power -f binary.gv 8 | vcsn enumerate -f- 2
1
<256>10
<6561>11

2013-07-13

divkbaseb

New algorithm (static, dynamic, TAF-Kit).

$ vcsn divkbaseb -C 'lal_char(01)_b' 3 2
digraph
{
  vcsn_context = "lal_char(01)_b"
  rankdir = LR
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F0
  }
  {
    node [shape = circle]
    0
    1
    2
  }
  I0 -> 0
  0 -> F0
  0 -> 0 [label = "0"]
  0 -> 1 [label = "1"]
  1 -> 0 [label = "1"]
  1 -> 2 [label = "0"]
  2 -> 1 [label = "0"]
  2 -> 2 [label = "1"]
}

2013-07-12

enumerate produces a list of weighted words

enumerate now also provides the weight of the words. It is also fixed: it no longer reports words with nul weight (e.g., 'a' in 'a+<-1>a').

$ vcsn standard -e '(?@lal_char(01)_z)(0+1)*1(<2>0+<2>1)*' \
    | vcsn enumerate -f - 3
1
01
<2>10
<3>11
001
<2>010
<3>011
<4>100
<5>101
<6>110
<7>111

shortest is fixed and modified similarly.

$ vcsn standard -e '(?@lal_char(ab)_z)(a+<5>bb+<-1>a)' \
      | vcsn shortest -f -
<5>bb

2013-07-04

WeightSet: added support for rational weights

We can now use automata with rational numbers as their weights.

$ vcsn standard -e "(?@lal_char(ab)_q)(<1/2>a+<2>b)*" \
  | vcsn evaluate -f - aaabbbb
2

$ vcsn standard -e "(?@lal_char(ab)_q)(<1/2>a+<2>b)*"\
  | vcsn evaluate -f - aaab
1/4

2013-07-01

new format: efsm (and new tool: efstcompile)

The former output format ("fsm") is dropped, replaced by an adhoc "extended FSM" format: "efsm". The FSM format focuses only on the transitions, and lacks information about the labels (which are expected to be mapped to numbers), weights, whether it's an acceptor or transducer, etc.

The efsm format is designed to be simple to use with OpenFST: just run "efstcompile" instead of "fstcompile". As a matter fact, the new "efstcompile" tool is rather dumb, as it simply executes the "efsm" file.

$ vcsn ladybird -O efsm 2
#! /bin/sh

cat >transitions.fsm <<\EOFSM
0       2       \e
2
2       3       a
3       3       b
3       3       c
3       2       c
3       2       a
EOFSM

cat >isymbols.txt <<\EOFSM
\e      0
a       1
b       2
c       3
EOFSM

fstcompile --acceptor --keep_isymbols --isymbols=isymbols.txt transitions.fsm

$ vcsn ladybird -O efsm 8 | efstcompile | fstdeterminize | fstinfo \
    | grep '# of states'
# of states                                       256

2013-06-28

new binary: vcsn

To give a flavor of what TAF-Kit should be (a single tool instead of one per command), the new "vcsn" script bounces to the vcsn-* tools. It does not support | as TAK-Kit 1 did.

$ vcsn are-equivalent -Ee '(a*b*)*' '(a+b)*'
true

2013-06-26

proper

Now removes states to which no transition arrive after spontaneous transitions removal.

Thompson

Only the concatenation yielded an automaton whose projection on Boolean weights was different from the Thompson of the projection of the rational expression on Boolean. This is now fixed.

is-normalized

New algorithm (static, dynamic, TAF-Kit).

2013-06-25

identity, unit => one

Labels new define one() and is_one() instead of identity() and is_identity().

We now use LAO, "labels are one", instead of LAU, "labels are unit".

WeightSets now define one() and is_one() instead of unit() and is_unit().

2013-06-21

shortest, enumerate

New algorithms (static, dynamic, TAF-Kit).

2013-06-20

universal

Now accepts (LAL Boolean) rational expressions.

are-equivalent

Now accepts (LAL Boolean) rational expressions. It cannot compare a rational expression with an automaton (or vice-versa). This is a temporary defect which shell be addressed once TAF-Kit is properly developed.

$ vcsn-are-equivalent -Ee '(a*b*)*' '(a+b)*'
true
$ vcsn-are-equivalent -Ee '(a*b)*' '(a+b)*'
false

dyn: overhaul

Consistency is enforced in dyn. In particular the very first dynamic/static bridge (which was not identified as such), vcsn::rat::abstract_ratexpset, is now part of dyn::.

dyn::weight now aggregates its WeightSet instead of a Context. More similar conversions were performed, and other are to come.

2013-06-19

dot

The output now shows useless states (and their transitions) in gray.

2013-06-18

are-equivalent

New algorithm (static, dynamic, TAF-Kit). Currently works only for LAL Boolean automata.

$ vcsn-standard -e 'a(ba)*' -o a1.gv
$ vcsn-standard -e '(ab)*a' -o a2.gv
$ vcsn-are-equivalent -f a1.gv a2.gv
true

2013-06-17

is_trim, is_useless, is_empty

New algorithms (static, dynamic, TAF-Kit).

accessible_states, coaccessible_states, useful_states

New algorithms, static only.

num_accessible_states, num_coaccessible_states, num_useful_states

New algorithms, static only. Available in the "info" display.

copy accepts a predicate

It is now possible to filter the states to keep. Either as a predicate, or a set of states. For instance:

template <typename Aut>
Aut trim(const Aut& a)
{
  return vcsn::copy(a, useful_states(a));
}

universal

New algorithm (static, dynamic, and TAF-Kit). Requires a LAL Boolean automaton.

$ vcsn-universal -f a1.gv
digraph
{
  vcsn_context = "lal_char(ab)_b"
  rankdir = LR
  node [shape = circle]
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I0
    F2
  }
  { 0 1 2 }
  I0 -> 0
  0 -> 0 [label = "a, b"]
  0 -> 1 [label = "a"]
  1 -> 0 [label = "a, b"]
  1 -> 1 [label = "a, b"]
  1 -> 2 [label = "b"]
  2 -> F2
  2 -> 0 [label = "a, b"]
  2 -> 1 [label = "a, b"]
  2 -> 2 [label = "a, b"]
}

complement

New algorithm (static, dynamic, and TAF-Kit). Requires a complete deterministic LAL Boolean automaton.

$ vcsn-standard -e '(?@lal_char(ab)_b)a' \
    | vcsn-determinize -f- \
    | vcsn-complement -f-  \
    | vcsn-aut-to-exp -f-
\e+((b+(a.(a+b))).((a+b)*))

2013-06-13

is-deterministic

It is conforming with the specifications: all the states must be deterministic, not just the reachable ones.

info output adjustments

It now displays whether the automaton "is complete" and the number of deterministic states if LAL, otherwise "N/A". It also reports "is deterministic: N/A" for non-LAL.

Invalid labels are rejected

Under some circumstances, some invalid transitions could be accepted by the Dot parser (e.g., "aa" or "" in LAL). This is fixed.

2013-06-05

Support for entries is removed.

Member types, functions, and variables, about entries, have been removed. The services provided by entries in Vaucanson 1 are provided by LAU automata, so entries are not as useful in Vaucanson 2. And anyway, if needed, it should rather be a set of free standing functions.

2013-05-27

Refactoring: eps-removal => proper

Eps-removal is renamed to Proper.

2013-05-24

Syntax for rational-expressions has changed!

Angular brackets are now used for weights. Instead of

$ vcsn-standard -e '(?@lal_char(01)_z)(0+1)*1({2}1+{2}0)*' -o binary.dot

run

$ vcsn-standard -e '(?@lal_char(01)_z)(0+1)*1(<2>1+<2>0)*' -o binary.dot

Braces are now used instead of (*...) for generalized quantifiers.

a{0} => \e
a{1} => a
a{2} => a.a
a{5} => a.a.a.a.a

a{0,1} => \e+a
a{0,2} => \e+a+a.a
a{0,3} => \e+a+a.a+a.a.a

a{1,2} => a.(\e+a
a{1,3} => a.(\e+a+a.a)

a{2,5} => a.a.(\e+a+a.a+a.a.a)

a{0,} => a*
a{1,} => a.(a*)
a{4,} => a.a.a.a.(a*)

2013-05-23

info output

Now displays the number of spontaneous transitions (0 for LAL).

2013-04-29

Digits as letters

Recently broken by accident, support for digits as letters is restored.

$ vcsn-standard -e '(?@lal_char(01)_z)(0+1)*1({2}1+{2}0)*' -o binary.dot
$ vcsn-evaluate -f binary.dot '11111111'
255
$ vcsn-evaluate -f binary.dot '101010'
42

2013-04-26

Dot parser

This parser is now stricter than it used to be: be sure to escape backslashes in input Dot files, i.e.,

  0 -> 2 [label = "\e"]

is now invalid, write

  0 -> 2 [label = "\\e"]

This change was made because Graphviz treats "" (and renders it) exactly like "e".

2013-04-25

New algorithm: Thompson

Conversion from rational expression to automata. Requires lan or law. The handling of weights might be changed in the near future.

$ vcsn-thompson -Ee '(?@lan_char(abc)_z){2}(a+{3}b+c)*{5}'
digraph
{
  vcsn_context = "lan_char(abc)_z"
  rankdir = LR
  node [shape = circle]
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I8
    F9
  }
  { 0 1 2 3 4 5 6 7 8 9 }
  I8 -> 8
  0 -> 2 [label = "\\e"]
  0 -> 4 [label = "\\e"]
  0 -> 6 [label = "\\e"]
  1 -> 0 [label = "\\e"]
  1 -> 9 [label = "{5}\\e"]
  2 -> 3 [label = "a"]
  3 -> 1 [label = "\\e"]
  4 -> 5 [label = "{3}b"]
  5 -> 1 [label = "\\e"]
  6 -> 7 [label = "c"]
  7 -> 1 [label = "\\e"]
  8 -> 0 [label = "{2}\\e"]
  8 -> 9 [label = "{10}\\e"]
  9 -> F9
}

2013-04-22

standard is the new name for standard-of

The "of" is useless, inconsistent with the other algorithms, and with the TAF-Kit v1 name.

2013-04-18

Evaluation is fixed

Several initializations were incorrect, expecting the zero to be 0 (which is not the case for zmin for instance). There might also be some speed up.

2013-04-16

New output format: tikz

Layout is dumb, yet this is useful to prepare LaTeX documents.

2013-04-12

New output format: info

This pseudo format displays facts about the automaton (number of states and so on) or rational expressions (number of nodes).

is-eps-acyclic

Can now be called on LAL automata, for consistency with is-proper and is-valid.

2013-04-11

Dot

The states are now numbered from 0.

Clang Compatibility

Clang++ (3.2 and 3.3) can now compile Vaucanson.

2013-03-18

New LabelSet: nullableset

Initial support for "Labels are nullable".

$ cat lan.dot
digraph lan
{
  vcsn_context = "lan_char(a)_b"
  I1 -> 1
  2 -> F2
  1 -> 2 [label = "a"]
  1 -> 2 [label = "\e"]
}
$ vcsn-cat -Af lan.dot
digraph
{
  vcsn_context = "lan_char(a)_b"
  rankdir = LR
  node [shape = circle]
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I1
    F2
  }
  { 1 2 }
  I1 -> 1
  1 -> 2 [label = "\\e, a"]
  2 -> F2
}
$ vcsn-eps-removal -Af lan.dot
digraph
{
  vcsn_context = "lan_char(a)_b"
  rankdir = LR
  node [shape = circle]
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I1
    F2
    F1
  }
  { 1 2 }
  I1 -> 1
  1 -> F1
  1 -> 2 [label = "a"]
  2 -> F2
}

is-proper

New dyn algorithm and tool (vcsn-is-proper).

2013-03-08

determinize speed-up

Determinization algorithm is five times faster.

Before: $ time bin/vcsn-de-bruijn 18 | bin/vcsn-determinize -Af - -Onull real 0m13.049s user 0m12.773s sys 0m0.276s $ time bin/vcsn-de-bruijn 20 | bin/vcsn-determinize -Af - -Onull real 0m56.780s user 0m55.775s sys 0m0.988s

Now: $ time bin/vcsn-de-bruijn 18 | bin/vcsn-determinize -Af - -Onull real 0m2.181s user 0m2.048s sys 0m0.132s $ time bin/vcsn-de-bruijn 20 | bin/vcsn-determinize -Af - -Onull real 0m8.653s user 0m8.177s sys 0m0.476s

2013-02-26

hierarchy and API clean up

dyn:: was cleaned. Some headers were renamed:

vcsn/ctx/abstract_context.hh -> vcsn/dyn/context.hh
vcsn/algos/dyn.hh            -> vcsn/dyn/algos.hh
vcsn/core/automaton.hh       -> vcsn/dyn/automaton.hh

dyn::make_automaton was introduced to hide implementation details.

2013-02-21

LabelSet renamings

The name LetterSet, UnitSet, and WordSet were not compliant. They have been renamed as letterset, unitset, and wordset.

2013-02-20

Generalized quantifier as syntactic sugar for rational expressions

The new "(* min, max)" quantifier (postfix, like "*") allows to specify "powers" of an expression. For instance:

({a}b)(*0) => \e
({a}b)(*1) => {a}b
({a}b)(*2) => {a}b.{a}b
({a}b)(*5) => {a}b.{a}b.{a}b.{a}b.{a}b

({a}b)(*0,1) => \e+{a}b
({a}b)(*0,2) => \e+{a}b+({a}b.{a}b)
({a}b)(*0,3) => \e+{a}b+({a}b.{a}b)+({a}b.{a}b.{a}b)

({a}b)(*1,2) => {a}b.(\e+{a}b)
({a}b)(*1,3) => {a}b.(\e+{a}b+({a}b.{a}b))

({a}b)(*2,5) => {a}b.{a}b.(\e+{a}b+({a}b.{a}b)+({a}b.{a}b.{a}b))

({a}b)(*0,) => ({a}b)*
({a}b)(*1,) => {a}b.(({a}b)*)
({a}b)(*4,) => {a}b.{a}b.{a}b.{a}b.(({a}b)*)

2013-02-18

Comment in rational expressions

The (?#...) construct allows to embed comments in rational expressions. They are discarded. There is no means to include a closing parenthesis in this construct.

Context in rational expressions

The (?@...) allows a rational expression to "carry" its context. Contrast for instance the two following runs.

$ vcsn-cat -C 'lal_char(xyz)_z'  -Ee '{42}x+{51}z'
{42}x+{51}z
$ vcsn-cat -Ee '(?@lal_char(xyz)_z){42}x+{51}z'
{42}x+{51}z

This will be used, eventually, so that TAF-Kit-like tools propagate the context in runs such as:

$ vcsn-cat -Ee '(?@lal_char(xyz)_z){42}x+{51}z' | vcsn-transpose -Ef -
1.1-5: invalid Boolean: 42

Currently it fails, as the "default" context is "lal_char(abc)_b".

New output format: fsm

An initial, and rough, support for Open FSM's format is provided. Currently there is no support at all for the weights.

$ vcsn-de-bruijn -O fsm 12 | fstcompile | fstdeterminize | wc -l
1216

New output format: null

The output is discarded. This is useful for benching.

2013-02-14

New algorithms: is-deterministic and complete

is-deterministic takes an automaton as argument and exits with code status 0 if the given automaton is deterministic, 2 otherwise.

complete also takes an automaton as argument and make it complete. If the given automaton is already complete, then it is unchanged.

2013-02-12

Overhaul of the LAU, LAL, LAW implementation

So far a context was a triple: <Kind, LabelSet, WeightSet>, where (for instance), Kind is labels_are_letters, LabelSet is set_alphabet<char_letters>, and WeightSet is zmin). This is troublesome on several regards, the clearest being that the LabelSet makes no sense for LAU.

Now contexts are pairs: <LabelSet, WeightSet>, where this time the LabelSet (same name as before, different concept) can be a instance of UnitSet for LAU, WordSet for LAW, or LetterSet for LAL. These structures, in turn, are parameterized by the effective set of generators to use: for instance, LetterSet<set_alphabet<char_letters>>. Of course UnitSet is not parameterized.

As a first visible consequence, the name of the LAU contexts has changed:

lau_char_br => lau_br
lau_char(xyz)_ratexpset<lal_char(abc)_b> => lau_ratexpset<lal_char(abc)_b>

2013-01-24

options renamed

The options to select the input and output format are renamed -I and -O (instead of -i and -o).

new option: -o for output file

The vcsn-* tools now support '-o FILE' to save the output in FILE.

2013-01-22

product: strengthened preconditions

The product of automata requires LAL automata. The output alphabet is the intersection of the ones of the operands. This works as expected when the automata have disjoint alphabets.

2013-01-18

dyn::product and vcsn-product

They compute the product of automata.

2013-01-14

vcsn-determinize and vcsn-evaluate use the common command line options

These tools support -C, -g, etc. like the other tools. See "vcsn-<tool> -h".

context names are now complete

Context names used to describe the "static" structure only (e.g., lal_char_ratexpset<law_char_b>). It now includes the "dynamic" part, currently only the list of generators (e.g., lal_char(abc)_ratexpset<law_char(xyz)_b>).

This context strings are both printed and read by the various tools. For instance:

$ vcsn-standard-of -C 'law_char(xyz)_ratexpset<law_char(abc)_z>' \
      -e '{abc}xyz' | vcsn-aut-to-exp -f -
{abc}xyz

Note that the second tool, vcsn-aut-to-exp, found the context in its input, the standard automaton in Dot format.

2013-01-13

dyn::ratexpset, dyn::context

These are now handled by shared pointer, consistently with dyn::automaton and dyn::ratexp.

Nasty memory management issues have been fixed.

vcsn-de-bruijn

It now supports the same arguments as the other vcsn-* tools. It also no longer requires 'a' and 'b' to be accepted letters, and it uses the whole alphabet. For instance "vcsn-de-bruijn -g 'xyz' 3" generates an automaton for "(x+y+z)x(x+y+z)^3".

dyn::ladybird, vcsn-ladybird

New dynamic algorithm, and new tool (which also supports the common command line options).

2013-01-12

contexts are renamed

Contexts have both a name and an identifier. The name is used to display in a readable form the nature of the context, for instance in Dot output. The identifier is used for instance headers, or predefined contexts, and libraries.

So far names and identifiers are equal, but this will change.

As a first step, identifiers/names are now <Kind><LabelSet><WeightSet> instead of <LabelSet><WeightSet><Kind>. For instance:

char_b_lal  => lal_char_b
char_br_lal => lal_char_br
char_zr_lal => lal_char_zr
char_br_lau => lau_char_br
char_br_law => law_char_br
char_zr_law => law_char_zr

2013-01-11

pprat is removed

It was designed for the test suite. The vcsn-* tools are now sufficient for the test suite, and are exposed to the user.

vcsn-*: option overhaul The different tools had already too many different calling conventions. They are now (quite) consistent.

vcsn-aut-to-exp

Calls the default implementation of aut-to-exp.

$ vcsn-standard-of -Wz     -e '{2}(ab){3}' | vcsn-aut-to-exp -Af -
({2}a.b){3}
$ vcsn-standard-of -Wz -Lw -e '{2}(ab){3}' | vcsn-aut-to-exp -Af -
{6}ab

vcsn-lift

It now also supports lifting rational expressions.

$ vcsn-lift -C char_b_lal -Ee 'abc'
{a.b.c}\e

2013-01-10

dotty -> dot

The name "dotty" was incorrect (as it denotes a program instead of the format). Therefore, every occurrence of "dotty" is now mapped to "dot".

dot

The Dot output (input and output) now uses ", " as a label separator, instead of " + ".

2012-12-26

dyn: input/output

Routines: dyn::read_(automaton|ratexp)_(file|string) and dyn::print take the input format. Available input/output are: - "dotty". - "text". - "xml".

2012-12-19

genset is replaced by labelset

Through out the code.

2012-12-18

dyn: input

New routines: dyn::read_(automaton|ratexp)_(file|string).

dyn: output

New routines: dyn::print, for both automata and RatExps.

bin: new tools

vcsn-cat, vcsn-transpose (both on RatExps only currently). vcsn-standard-of. vcsn-lift.

2012-12-13

krat -> rat

kratexp, kratexpset etc. are renamed as ratexp, ratexpset, etc.

labels are unit

labels-are-empty/lae were mapped to labels-are-unit/lau.

2012-10-31

dyn::context

For consistency with dyn::automaton, vcsn::ctx::abstract_context is renamed vcsn::dyn::context. Eventually, we might turn it into a shared pointer too.

dyn::de_bruijn, bin/vcsn-de-bruijn

New tools, useful for tests for instance.

$ vcsn-de-bruijn char_b_lal 2
digraph
{
  vcsn_context = "char_b_lal"
  vcsn_genset = "ab"
  rankdir = LR
  node [shape = circle]
  {
    node [style = invis, shape = none, label = "", width = 0, height = 0]
    I1
    F4
  }
  I1 -> 1
  1 -> 1 [label = "a + b"]
  1 -> 2 [label = "a"]
  2 -> 3 [label = "a + b"]
  3 -> 4 [label = "a + b"]
  4 -> F4
}

2012-10-22

dyn::parse_file and parse_string

They construct dyn::automaton's.

dyn::eval

For bad reasons, currently works only for char_b_lal

vcsn-determinize and vcsn-evaluate

Two new shell commands to write tests.

2012-10-12

pprat works with abstract algorithms

pprat now uses only abstract (aka, dynamic) algorithms! On OS X, it is now a 77KB program; it was 11MB before.

This schedules its death: either it will be replaced by a set of smaller grain commands (vcsn-determinize, vcsn-standard-of, etc.) from which test cases will be easier to write, or it will be replaced by some early implementation of a TAF-Kit-like unified program (instead of one per context).

2012-10-09

automata provide a vname

To dispatch algorithms such as dotty, we not only need to know the context type name, but also the automaton type name, as mutable_automaton and transpose_automaton are two different types for instance.

2012-10-08

dot-parser

Because it uses only algorithms made abstract (make_context, make_automaton_editor, and dotty), the dot parser now works for any of the precompiled contexts!

2012-09-26

add_entry

In addition to the add_entry method of mutable_automaton, there is now an add_entry algorithm, which is templated by the automaton type. This algorithm provides an abstract interface to an unknown type of automaton.

2012-09-24

renamings

For consistency, polynomials is renamed polynomialset.

mutable_automaton::add_entry and del_entry

The first of these new functions allows to add directly a list of transition between two states by passing the corresponding entry_t (this is most useful when reading an automaton with entries, such as with the Dot parser). The second one removes every existing transition between two states.

2012-08-01

labels are empty

Initial work on labels-are-empty automata. See the unit/char_z_lae test. The labels are not displayed, but the "{...}" to denote the weights, are kept:

digraph
{
  vcsn_context=char_z_lae
  vcsn_genset=""
  rankdir=LR
  node [shape=circle]
  {
    node [style=invis,shape=none,label="",width=0,height=0]
    I1
    F2
  }
  I1 -> 1
  2 -> F2 [label="{10}"]
  1 -> 2 [label="{51}"]
  2 -> 3 [label="{3}"]
  2 -> 1
  1 -> 1 [label="{42}"]
  1 -> 3
}

In that case, the transitions do not store labels.

lift now returns a labels-are-empty automaton/ratexp

Accordingly, pprat -l (lift) now displays:

$ pprat -Lw -l 'ab+cd'
digraph
{
  vcsn_context=char_kratexpset<char_b_law>_lae
  vcsn_genset="abcd"
  rankdir=LR
  node [shape=circle]
  {
    node [style=invis,shape=none,label="",width=0,height=0]
    I1
    F2
    F3
  }
  1 -> 2 [label="{ab}"]
  2 -> F2
  I1 -> 1
  3 -> F3
  1 -> 3 [label="{cd}"]
}

2012-07-31

dot-parser

It is now possible to load an automaton from its dotty output. Actually, it is possible to write simpler automata. This is no yet fully generic: it works properly only for char_b_lal.

The test program unit/parse-dot gives access to it. When fed with the following input file:

digraph
{
  vcsn_context=char_b_lal vcsn_genset="a"
  {1} -> {2 3} -> {4 5 6} [label=a]
  I -> 1
  {4 5 6} -> F
}

it produces an automaton, and dumps it using the dotty algorithm:

digraph
{
  vcsn_context=char_b_lal
  vcsn_genset="a"
  rankdir=LR
  node [shape=circle]
  {
    node [style=invis,shape=none,label="",width=0,height=0]
    I1
    F4
    F5
    F6
  }
  1 -> 2 [label="a"]
  1 -> 3 [label="a"]
  2 -> 4 [label="a"]
  2 -> 5 [label="a"]
  2 -> 6 [label="a"]
  3 -> 4 [label="a"]
  3 -> 5 [label="a"]
  3 -> 6 [label="a"]
  I1 -> 1
  4 -> F4
  5 -> F5
  6 -> F6
}

2012-07-25

pprat uses -L for labels instead of -A

For consistency, since we now also use the name "labels" to denote the leaves of rational expressions (others that and ), -A is renamed -L.

Metadata are embedded in the Dot file

The pseudo name "A" which was used in every dotty output is no longer defined, as it is both optional and useless. The context name and the alphabet are also provided. For instance:

$ ./tests/unit/ladybird-b 2 | sed 4q
digraph
{
  vcsn_context=char_b_lal
  vcsn_genset="abc"
$ pprat -s -L z 'abc' | sed 4q
digraph
{
  vcsn_context=char_z_lal
  vcsn_genset="abcd"
$ pprat -s -L zr -A w 'abc' | sed 4q
digraph
{
  vcsn_context=char_kratexpset<char_z_law>_law
  vcsn_genset="abcd"

This is an experimentation, and the current choice is somewhat unsatisfactory. Instead of

  vcsn_context=char_b_lal
  vcsn_genset="abc"

it is probably more sensible to use

  vcsn_context="char_b_lal{abc}"

or maybe

  vcsn_context="char{abc}_b_lal"

so that when weightsets depend for instance upon an alphabet, it can be specified too. Instead of

  vcsn_context=char_kratexpset<char_z_law>_law
  vcsn_genset="abcd"

(which does not define the alphabet used for the weightset), one would expect:

  vcsn_context=char_kratexpset<char_z_law{xyz}>_law{abcd}

or maybe

  vcsn_context=char{abcd}_kratexpset<char{xyz}_z_law>_law

Also, the name "kratexpset" is of course open to discussion:

  vcsn_context=char_rat<char_z_laz{xyw}>_law{abcd}

polynomials::conv

It is now possible to read back polynomials such as "a+b+{2}a".

static_assert

It is used more extensively to forbid meaningless calls, such as determinizing a law automaton.

2012-07-13

Precompiled contexts

Several predefined contexts come with their own header (e.g., "ctx/char_b_lal"), and their own library (e.g., "libchar_b_lal"). This is provide for char_{b,z,zmin}_{lal,law}.

z_min renamed zmin

Consistently with Vaucanson 1.4.

2012-07-10

transposition

Transposition on automaton is a read/write view: operations such as del_state, add_transition, etc. on a transposed automaton actually modify the wrapped automaton: set_final calls set_initial and so forth.

As an extreme example, the following snippet:

using context_t = vcsn::ctx::char_b;
using automaton_t = vcsn::mutable_automaton<context_t>;
using tr_automaton_t = vcsn::details::transpose_automaton<automaton_t>;
context_t ctx{{'a', 'b'}};
auto ks = ctx.make_kratexpset();
auto aut = vcsn::standard_of<tr_automaton_t>(ctx, ks.conv("a+a+a+a"));

applies the standard-of algorithm to a transposed mutable_automaton. In other words,

aut.strip();

is the transposition of a standard automaton, except that it is a mutable_automaton, not a transpose_automaton<mutable_automaton>.

2012-07-09

transposition

The "transpose" operation is implemented on words, weights, kratexps, and automata. pprat provides support to transpose on kratexps (option -t):

pprat -W zrr -t {{{2}ab}cd}abcd   =>   {{{2}ba}dc}dcba
pprat -W zrr -t {ab}(abcd)*{cd}   =>   ({dc}(dcba)*{ba})

and on (standard) automata (option -T):

$ pprat -A w -W br    -s '{ab}(\e+a+b({abc}c{bcd})*){cd}' > forward.dot
$ pprat -A w -W br -T -s '{ab}(\e+a+b({abc}c{bcd})*){cd}' > transpose.dot
$ diff -W80 -t  -y forward.dot transpose.dot
digraph A {                            digraph A {
  rankdir=LR                             rankdir=LR
  node [shape=circle]                    node [shape=circle]
  {                                      {
    node [style=invis,shape=none,la        node [style=invis,shape=none,la
    I1                                     I1
                                    >      I2
                                    >      I4
    F1                                     F1
    F2                              <
    F4                              <
  }                                      }
  1 -> F1 [label="{(ab).(cd)}"]     |    I1 -> 1 [label="{(dc).(ba)}"]
  I1 -> 1                           |    1 -> F1
  2 -> F2 [label="{cd}"]            |    I2 -> 2 [label="{dc}"]
  1 -> 2 [label="{ab}a"]            |    2 -> 1 [label="{ba}a"]
  4 -> F4 [label="{cd}"]            |    I4 -> 4 [label="{dc}"]
  4 -> 4 [label="{(abc).(bcd)}c"]   |    4 -> 4 [label="{(dcb).(cba)}c"]
  1 -> 3 [label="{ab}b"]            |    3 -> 1 [label="{ba}b"]
  3 -> 4 [label="{(abc).(bcd)}c"]   |    4 -> 3 [label="{(dcb).(cba)}c"]
}                                      }

2012-06-19

aut_to_exp

An initial version of aut_to_exp is available. The new pprat option -a provides an access to this algorithm: apply aut_to_exp to the standard_of an expression.

pprat      -a 'a*'     => \e+(a.(a*))
pprat      -a '(a+b)c' => (a.c)+(b.c)
pprat -W z -a '{2}({3}a+{5}b){7}c{11}' => (({6}a.{7}c)+({10}b.{7}c)){11}

Currently, the only "heuristic" implemented eliminates the states in order. There are probably possible improvements.

pprat -a '(a+b)*' | wc -c => 265

pprat: -a and -w are renamed -A and -W

new factory: de Bruijn

Builds automata for (a+b)a(a+b)^n.

2012-06-18

standard_of is part of vcsn::

It used to be in vcsn::rat::.

2012-06-14

kratexpset/abstract_kratexpset

kratexpset, i.e. the object that provides operation on kratexps (with specified Gen and Weight), used to derive from abstract_kratexp (which is "opaque": it does not know the precise type that is used underneath).

Now, from abstract_kratexp we derive a concrete_abstract_kratexpset which aggregates a kratexpset. This means that kratexpset no longer derives from a weakly-typed ancestor, and can provide simple and strongly-typed routines.

kratexpset/kratexp

For consistency with weightset/weight, genset/gen, kratexps (note the s) is renamed as kratexpset and std::shared_ptr<const rat::node> as kratexp.

lift

A new algorithm which creates, from an automaton, another one with the same states and transitions, but the new automaton features only spontaneous transitions, whose weights correspond to the labels (and weights) of the initial one.

For instance:

$ pprat -aw -wz  -sl '({2}\e+{3}a){4}'
digraph A {
  rankdir=LR
  node [shape=circle]
  {
    node [style=invis,shape=none,label="",width=0,height=0]
    I1
    F1
    F2
  }
  1 -> F1 [label="{8}"]
  I1 -> 1
  2 -> F2 [label="{4}"]
  1 -> 2 [label="{3}a"]
}

digraph A {
  rankdir=LR
  node [shape=circle]
  {
    node [style=invis,shape=none,label="",width=0,height=0]
    I1
    F1
    F2
  }
  1 -> F1 [label="{{8}\\e}"]
  I1 -> 1
  2 -> F2 [label="{{4}\\e}"]
  1 -> 2 [label="{{3}a}\\e"]
}

RatExps: fix is_unit

is_unit simply checked that the expression was , but did not check that weight itself was the unit.

VCSN_DEBUG

This variable allows to force the display of weights.

2012-06-11

Contexts aggregate shared pointers

Now contexts are mutable, and hold (shared) pointers to (immutable) gensets and weightsets. This way, we can alter contexts (e.g., the ladybird factory can add the letters it needs in a new genset), yet there is good sharing, and identity can still be used to distinguish, for instance, two gensets defined equally.

It is also simpler to really expose them as pointers, so every "weightset().mul", etc. must be rewritten as "weightset()->mul".

2012-06-08

Contexts

The Kind parameter is now part of the context. The same type of Kind is now use for both RatExps and automata. This results in many significant simplifications.

For instance, again, the test case for product:

Before:

 using context_t = vcsn::ctx::char_z;
 context_t ctx { {'a', 'b', 'c'} };
 using automaton_t =
   vcsn::mutable_automaton<context_t, vcsn::labels_are_letters>;
 automaton_t aut1(ctx);

After:

using context_t = vcsn::ctx::char_z;
context_t ctx { {'a', 'b', 'c'} };
using automaton_t = vcsn::mutable_automaton<context_t>;

Or the source of pprat:

Before:

using atom_kind_t
  = typename Factory::kind_t;
using label_kind_t
  = typename vcsn::label_kind<atom_kind_t>::type;
using context_t =
  vcsn::ctx::context<typename Factory::genset_t,
                     typename Factory::weightset_t,
                     label_kind_t>;
context_t ctx{factory.genset(), factory.weightset()};
using automaton_t = vcsn::mutable_automaton<context_t>;
auto aut = vcsn::rat::standard_of<automaton_t>(ctx, e);

After:

using context_t = typename Factory::context_t;
using automaton_t = vcsn::mutable_automaton<context_t>;
auto aut = vcsn::rat::standard_of<automaton_t>(factory.context(), e);

2012-06-05

Contexts

"Contexts" were introduced to factor two aspects that are required through out the library: the GenSet type (i.e., the nature of the generators), and the WeightSet type (i.e., the nature of the weights). Not only do contexts define these types, they must also be instantiated so that "run-time" details be known: for instance the set of generators is dynamic (what are the allowed letters), and on occasion the weightset also needs run-time information (e.g., when RatExp are parameterized by RatExp, what is the alphabet of the latter ones?).

As an example of the changes on the user side, consider the product test-case.

Before:

 typedef vcsn::set_alphabet<vcsn::char_letters> alpha_t;
 typedef vcsn::mutable_automaton<alpha_t, vcsn::z,
                                vcsn::labels_are_letters> automaton_t;
 vcsn::z z;
 alpha_t alpha{'a', 'b', 'c'};
 automaton_t aut1(alpha, z);

After:

 using context_t = vcsn::ctx::char_z;
 context_t ctx { {'a', 'b', 'c'} };
 using automaton_t =
   vcsn::mutable_automaton<context_t, vcsn::labels_are_letters>;
 automaton_t aut1(ctx);

2012-05-30

RatExp: atoms are words

Expressions such as "(ab)(ab)" used to be equivalent to "(abab)" (a single four-letter atom). Now:

pprat -aw '(ab)(ab)'   => (ab).(ab)
pprat -aw 'abab'       => abab
pprat -aw 'ab.ab'      => (ab).(ab)
pprat -aw 'ab(ab)abc*' => (ab).(ab).(ab).(c*)

2012-05-28

New algorithm: eval, evaluates a word over an (weighted) automaton

Defined in vcsn/algos/eval.hh as vcsn::eval.

New algorithm: determinize, Boolean automaton determinization

Defined in vcsn/algos/determinize.hh as vcsn::determinize.

2012-05-25

RatExp: changes in the display

Fixed the output of "atoms are words" expressions. For instance "(ab)" used to be displayed as "ab" (which is wrong as it is parsed as "a(b)"). It is now properly displayed as "(ab)".

RatExp: slight changes in the grammar

A star is now valid after a weight:

$ pprat -w z '{2}ab{3}*'
({2}(a.b){3})*

2012-05-11

dotty: define initial/final states first

In order to improve readability, instead of

digraph A {
  rankdir=LR
  node [shape=circle];
  F1 [style=invis,shape=none,label="",width=0,height=0]
  1 -> F1 [label="{a.a.((d.d)*)}"]
  3 -> 2 [label="{(d.d)*}b"]
  I1 [style=invis,shape=none,label="",width=0,height=0]
  I1 -> 1
  1 -> 2 [label="{a.a.((d.d)*)}b"]
  F3 [style=invis,shape=none,label="",width=0,height=0]
  3 -> F3 [label="{(d.d)*}"]
  2 -> 3 [label="b"]
}

we now produce

digraph A {
  rankdir=LR
  node [shape=circle]
  {
    node [style=invis,shape=none,label="",width=0,height=0]
    I1
    F1
    F3
  }
  1 -> F1 [label="{a.a.((d.d)*)}"]
  3 -> 2 [label="{(d.d)*}b"]
  I1 -> 1
  1 -> 2 [label="{a.a.((d.d)*)}b"]
  3 -> F3 [label="{(d.d)*}"]
  2 -> 3 [label="b"]
}

2012-05-10

RatExp: support for the kind of atoms

As a major overhaul, the rational expressions (vcsn::rat::node) are now parameterized by Atom, which denotes the atom value. The kratexps structure is now parameterized by the Kind, from which it is deduced, from the GenSet parameter, whether we should use word_t or letter_t atoms.

pprat: an option -a

To provide user-access to these feature, pprat now supports an option -a, which accepts "letters" or "words" as argument, with obvious meaning. For instance:

pprat -a letters 'abc' => a.b.c
pprat -a letters 'abc.abc' => a.b.c.a.b.c
pprat -w br -al '{aa}bb{c}dd{a}' => ({a.a}(b.b).{c}(d.d)){a}

pprat -a words 'abc' => abc
pprat -a words 'abc.abc' => abc.abc
pprat -w br -aw '{aa}bb{c}dd{a}' => ({aa}bb.{c}dd){a}

Of course, this also works with the "standard-of" option:

$ pprat -w br -al '{aa}({dd}\e+bb)*'
{a.a}(({d.d}\e+(b.b))*)
$ pprat -s -w br -al '{aa}({dd}\e+bb)*'
digraph A {
  rankdir=LR
  node [shape=circle];
  F1 [style=invis,shape=none,label="",width=0,height=0]
  1 -> F1 [label="{a.a.((d.d)*)}"]
  3 -> 2 [label="{(d.d)*}b"]
  I1 [style=invis,shape=none,label="",width=0,height=0]
  I1 -> 1
  1 -> 2 [label="{a.a.((d.d)*)}b"]
  F3 [style=invis,shape=none,label="",width=0,height=0]
  3 -> F3 [label="{(d.d)*}"]
  2 -> 3 [label="b"]
}

versus:

$ pprat  -w br -aw '{aa}({dd}\e+bb)*'
{aa}(({dd}\e+bb)*)
$ pprat -s -w br -aw '{aa}({dd}\e+bb)*'
digraph A {
  rankdir=LR
  node [shape=circle];
  F1 [style=invis,shape=none,label="",width=0,height=0]
  1 -> F1 [label="{aa.(dd*)}"]
  2 -> 2 [label="{dd*}bb"]
  F2 [style=invis,shape=none,label="",width=0,height=0]
  2 -> F2 [label="{dd*}"]
  1 -> 2 [label="{aa.(dd*)}bb"]
  I1 [style=invis,shape=none,label="",width=0,height=0]
  I1 -> 1
}

2012-05-07

RatExp: in some case the weights could be lost

"Associativity" was applied too eagerly, which would result in loss of some weights. E.g. {a}bb{c}dd resulted in b.b.d.d, not it evaluates to {a}(b.b).{c}(d.d).

2012-05-03

RatExp: improved pretty-printing

The outermost pair of parentheses is removed if useless. For instance:

 (a.b) => a.b
 (a+b+c) => a+b+c
 (a*) => a*

But in the following examples they are kept.

 {3}(a.b){4}
 {3}(a*)
 (a+b){4}

standard-of: star is fixed

Standard-of seems to be correct.

2012-04-25

Expressions overhaul

They are immutable: we no longer make side effects on expressions. They are shared_ptr, no longer plain pointers. They can be used like the other values, by value.

2012-04-19

standard-of is fully implemented

Support for star was implemented, and checked for B and Z. For implementation reasons, one cannot yet use rational expressions as weights.

2012-04-18

mutable_automata::mul_weight

RatExp::head and tail

dotty

In order to improve the readability of its output, it no longer "defines" the reachable states. See the following diff:

 digraph A {
   rankdir=LR
   node [shape=circle];
-  1
-  2
-  3
-  4
   I1 [style=invis,shape=none,label="",width=0,height=0]
   I1 -> 1 [label="{6}"]
   F1 [style=invis,shape=none,label="",width=0,height=0]
   1 -> F1
   1 -> 2 [label="a"]
   1 -> 3 [label="a"]
   2 -> 4 [label="{3}b"]
   F4 [style=invis,shape=none,label="",width=0,height=0]
   4 -> F4
   4 -> 3 [label="a"]
 }

standard-of: many fixes in the handling of the weights

An expression such as {12}\e used to leave the weight in the initial transition; it is now on the final transition. More generally the initial transition always has unit as weight.

The product and sum of expressions now handle the left and right weights.

Accepting initial states in expressions such as "+a" are no longer lost.

2012-04-11

Many renamings

alphabet_t/alphabet() -> genset_t/genset(), etc.
factory -> abstract_kratexp.
factory_ -> kratexp.
initials() -> initial_transitions(), etc.
invalid_state -> null_state, etc.
nb_state() -> num_states(), etc.
polynomial -> polynomials, etc.

2012-04-09

product

An implementation of the product of two automata is available.

2012-04-07

z_min

An example of tropical semiring, to test show_unit().

2012-04-05

char_letters::special()

This method return a special, reserved character, that is used to
label initial and final transitions.  This character is not part of
the alphabet and is never output.

2012-04-04

standard-of

Initial work on "+".

mutable_automata are implemented using a pre() and post() states

What is missing is the correct '$' letter on the initial and final
transitions.  The current value is the default value for label_t.

The previous interface has been preserved (but maybe we should
clean it) except for one change:

  initial() and final() have been changed to return a pseudo
  container of transitions.  These transitions give us both
  the weight and the initial/final state.

The following methods are new:

  pre(), post()         returns the pre-initial and post-final state.
  all_states()          returns all states, including pre() and post()
  all_transitions()     returns all transitions, including initial and
                          final transitions
  all_entries()         likewise for entries
  all_out(s), all_in(s) likewise for outgoing and ingoing transitions


The methods get_initial_weight(s) and get_final_weight(s) are slower
now, because they need to locate the corresponding initial/final
transition.  For the same reason, is_initial(), is_final() are also
slower.

2012-04-03

standard-of

Initial version of standard-of is implemented. Can be tested with pprat's new option -s:

$ pprat -wz -s '{123}a'
digraph A {
  rankdir=LR
  node [shape=circle];
  1
  I1 [style=invis,shape=none,label="",width=0,height=0]
  I1 -> 1
  2
  F2 [style=invis,shape=none,label="",width=0,height=0]
  2 -> F2
  1 -> 2 [label="{123}a"]
}

mutable_automaton uses unsigned for state_t and transition_t.

This allows to store states in a std::vector<stored_state_t>. Likewise for transitions. Erased elements are marked (so they are skipped over during iteration), and added to a free store to be reused later.

mutable_automaton does not store any weight when WeightSet == b.

mutable_automaton has a read-only entry interface

entries() is a pseudo container that filters transitions()
  to see each (src,dst) pair at most once.

entry_at(src, dst) and entry_at(t) return a polynomial describing
  the entry between (src, dst) or (src_of(t), dst_of(t)).

entryset() returns the WeightSet that can be used to manipulate
  these polynomials.

make check-rat, make check-unit

There is a check target for each subdirectory of tests/.

2012-04-02

Alphabets are checked

pprat is hard-coded to use a, b, c, d for all the alphabets (including for inner rational expressions):

$ pprat -w b -e 'y'
1.1: invalid word: y: invalid letter: y

Weights are checked

As follows:

$ pprat -w b -e '{12}a'
1.1-5: invalid Boolean: 12

Unfortunately the locations are bad currently for complex weights:

$ pprat -w zr -e '{x}a'
1.1-4: 1.1: invalid word: x: invalid letter: x

To be fixed.

2012-03-30

G++ 4.7 is required

We use constructs that are not supported by 4.6 (e.g., constructor delegation).

zrr

As a demonstration that rational expressions can be weights of rational expressions, pprat supports '-w zrr' (Rat<Rat<Z>>):

 $ pprat -w zr -e '{{{2}{3}a}u}x{{{4}{5}\e}\e}'
 {{{120}a}u}x

Vcsn/News File

From LRDE

Vcsn news

Vcsn 2.1

2015-10-05

Vcsn 2's repository has moved

2015-09-30

shortest: applies to any automaton

2015-09-28

project: new algorithm

2015-09-23

expressions: spaces are now ignored

2015-09-17

derived-term: deterministic automata

2015-09-15

Named automata

2015-09-07

Build fixes

2015-08-31

expression.info: more information

2015-08-26

vcsn.automaton can guess the format

2015-08-25

levenshtein: new algorithm

partial_identity: new algorithm

2015-08-18

to_automaton: more algorithms

2015-08-14

to_expression: more heuristics

2015-08-03

to_automaton: conversion from expression to automaton

2015-06-18

I/O in EFSM format are safer

2015-06-16

automaton.expression, automaton.lift

2015-06-15

power, chain have been renamed

2015-06-11

eliminate_state: fixes

2015-06-04

expressions: new identities sets

2015-05-29

nmin: new weightset

2015-05-26

cotrie: new algorithm

2015-05-22

expressions: output may use label classes

2015-05-21

multiply: new name for 'concatenate'

2015-05-20

trie: read directly from a file

2015-05-14

trie: new algorithm

2015-05-01

delay_automaton: new automaton type

is-synchronized: new algorithm

synchronize: new algorithm

has-bounded-lag: new algorithm

2015-04-09

shortest now subsumes enumerate

2015-03-25

product is renamed conjunction

2015-03-21

are-equivalent: now uses realtime

join: improvements

2015-03-18

infiltration: fix long standing bugs

2015-03-17

is_functional: new algorithm

2015-03-16

Python 3 is now required

2015-03-15

Repeated minimization

2015-03-03

realtime: new algorithm

is_letterized: new algorithm

2015-03-02

zpc: new algorithm

2015-02-23

letterize: new algorithm