Difference between revisions of "Jobs/M2 AD 2015 Vcsn for Linguists"

From LRDE

Line 6: Line 6:
 
|Related project=Vaucanson
 
|Related project=Vaucanson
 
|Advisor=Akim Demaille
 
|Advisor=Akim Demaille
  +
|General presentation of the field=The classical theory of automata, of transducers and of rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes weights into account. The weights are then taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions).
|General presentation of the field=Many properties and characteristics of an automaton can be easily computed from its syntactic monoids. Such properties are of particular importance to theoreticians.
 
  +
  +
Automata are heavily used in computational linguistics, and conversely, automata used in computational linguistics are "heavy".
   
 
Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).
 
Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).
   
Vcsn has a sound base of data structures and algorithms for automata and rational expressions. However, it offers no support for syntactic monoids at all.
+
Vcsn has a sound base of data structures and algorithms for automata and rational expressions. It is already able to deal with many of the typical needs of linguists. However, some specific semirings have not been implemented yet, and some well-known algorithms are still missing.
 
|Prerequisites=* good programming skills in some language
 
|Prerequisites=* good programming skills in some language
 
* acquaintance with C++
 
* acquaintance with C++
 
* ease with theoretical matters
 
* ease with theoretical matters
  +
|Objectives=The objective of this internship is to develop a complete computational linguistics toolchain on top of Vcsn, for instance the "SMS to French" project from François Yvon (sms.limsi.fr). To this end, Vcsn will have to be extended with the needed data structures and algorithms, and the existing implementation may have to be revised to cope with the extremely demanding size of these real-life automata.
|Objectives=The objective of this internship is to develop support for syntactic monoids in Vcsn, and to implement recent research results in Automata Theory that use the syntactic monoid.
 
 
|References=* [http://www.amazon.com/Elements-Automata-Theory-Jacques-Sakarovitch/dp/0521844258 Jacques Sakarovitch, “Elements of Automata Theory,” Cambridge University Press.]
 
|References=* [http://www.amazon.com/Elements-Automata-Theory-Jacques-Sakarovitch/dp/0521844258 Jacques Sakarovitch, “Elements of Automata Theory,” Cambridge University Press.]
 
* [http://publications.lrde.epita.fr/201307-CIAA Akim Demaille, Alexandre Duret-Lutz, Sylvain Lombardy, Jacques Sakarovitch. “Implementation Concepts in Vaucanson 2,” CIAA’13.]
 
* [http://publications.lrde.epita.fr/201307-CIAA Akim Demaille, Alexandre Duret-Lutz, Sylvain Lombardy, Jacques Sakarovitch. “Implementation Concepts in Vaucanson 2,” CIAA’13.]
 
 
|Contact=<akim at lrde . epita . fr>
 
|Contact=<akim at lrde . epita . fr>
 
|Compensation=1000 € gross/month
 
|Compensation=1000 € gross/month

Revision as of 15:47, 29 October 2014

Vcsn for Linguists
Reference id

2015 AD Vcsn for Linguists

Dates

5-6 months in 2015

Research field

Automata Theory

Related project

Vaucanson

Advisor

Akim Demaille

General presentation of the field

The classical theory of automata, of transducers and of rational expressions admits a very elegant and extremely useful extension (e.g., in natural language processing) that takes weights into account. The weights are then taken in a semiring, which can be classical (⟨𝔹, ∨, ∧⟩, ⟨ℤ, +, ×⟩, ⟨ℚ, +, ×⟩, etc.), tropical (⟨ℤ, min, +⟩, etc.), or of yet another type (e.g., rational expressions).
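
To make the role of the semiring concrete, here is a minimal sketch in plain Python. It does not use Vcsn or its API: the names Semiring, SHAPE, uniform, TROPICAL_COSTS and evaluate are hypothetical and introduced only for this illustration. It assumes a single initial state and a single final state, both with weight "one", and it extends ℤ with +∞ so that the tropical semiring has a zero. The weight of a word is the semiring sum, over all accepting paths, of the semiring product of the transition weights along each path.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two neutral elements and two operations."""
    zero: Any                        # neutral for add, absorbing for mul
    one: Any                         # neutral for mul
    add: Callable[[Any, Any], Any]   # combines alternative paths
    mul: Callable[[Any, Any], Any]   # combines consecutive transitions


BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
INTEGER = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)
# Tropical semiring: min plays the role of the sum, + the role of the product.
TROPICAL = Semiring(math.inf, 0, min, lambda a, b: a + b)

Transition = Tuple[int, str, int]    # (source state, letter, target state)

# Illustrative automaton: state 0 loops on a and b, moves to state 1 on
# some 'a', and state 1 loops on a and b.
SHAPE = [(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "a", 1), (1, "b", 1)]


def uniform(sr: Semiring) -> Dict[Transition, Any]:
    """Give every transition of SHAPE the weight sr.one."""
    return {t: sr.one for t in SHAPE}


# Tropical costs: each letter read in state 0 costs 1, everything else 0.
TROPICAL_COSTS: Dict[Transition, Any] = {
    (0, "a", 0): 1, (0, "b", 0): 1, (0, "a", 1): 0,
    (1, "a", 1): 0, (1, "b", 1): 0,
}


def evaluate(sr: Semiring, transitions: Dict[Transition, Any], word: str,
             initial: int = 0, final: int = 1) -> Any:
    """Weight of word: sum over accepting paths of the product of weights."""
    # current maps each reachable state to the total weight of the paths
    # that read the prefix consumed so far and end in that state.
    current = {initial: sr.one}
    for letter in word:
        nxt: Dict[int, Any] = {}
        for (src, lab, dst), w in transitions.items():
            if lab == letter and src in current:
                contrib = sr.mul(current[src], w)
                nxt[dst] = sr.add(nxt.get(dst, sr.zero), contrib)
        current = nxt
    return current.get(final, sr.zero)
```

With the same automaton shape, the choice of semiring changes what is computed: evaluate(INTEGER, uniform(INTEGER), "abab") counts the occurrences of 'a' and gives 2, evaluate(BOOLEAN, uniform(BOOLEAN), "bbb") only tests whether the word contains an 'a' and gives False, and evaluate(TROPICAL, TROPICAL_COSTS, "bba") gives 2, the number of letters before the first 'a'.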

Automata are heavily used in computational linguistics, and conversely, automata used in computational linguistics are "heavy".

Vcsn is a project led by Alexandre Duret-Lutz and Akim Demaille (LRDE). It is a platform for the manipulation of automata, transducers and weighted rational expressions. It is written in C++11, avoiding classical object-oriented programming in favor of generic programming (templates) for better performance. Vcsn is an heir of the Vaucanson 2 project, which was developed in partnership with Jacques Sakarovitch (Telecom ParisTech) and Sylvain Lombardy (LaBRI).

Vcsn has a sound base of data structures and algorithms for automata and rational expressions. It is already able to deal with many of the typical needs of linguists. However, some specific semirings have not been implemented yet, and some well-known algorithms are still missing.

Prerequisites
  • good programming skills in some language
  • acquaintance with C++
  • ease with theoretical matters
Objectives

The objective of this internship is to develop a complete computational linguistics toolchain on top of Vcsn, for instance the "SMS to French" project from François Yvon (sms.limsi.fr). To this end, Vcsn will have to be extended with the needed data structures and algorithms, and the existing implementation may have to be revised to cope with the extremely demanding size of these real-life automata.

Benefit for the candidate
References
Place

LRDE
Compensation

1000 € gross/month

Future work opportunities
Contact

<akim at lrde . epita . fr>