The British Computer Society
Natural Language Translation Specialist Group

Page URL: http://www.bcs-mt.org.uk/nala_009.htm
E-mail: admin@bcs-mt.org.uk
Updated: 28 July 2004. Copyright © 2004.
  Corpora analysis resources for Spanish

 

In 1996, a request for details of corpus analysis resources for Spanish was posted on ELSNET by Jose Luis Sancho (sancho@crea.rae.es) and Maria Paula Santalla (santalla@crea.rae.es) of the Instituto De Lexicografia, Spain.

The following information was part of the response to that request. It is reproduced here in slightly abridged form. All contributors are listed, in alphabetical order by surname.

John Aberdeen   aberdeen@mitre.org
Contact John Aberdeen for details of a fast part-of-speech tagger based on Eric Brill's notion of transformation-based error-driven learning.

Ken Beesley   ken.beesley@grenoble.rxrc.xerox.com
The Rank Xerox Research Centre in Grenoble, France, has developed the following systems for Spanish:
  1. Tokenization (word/term division).
  2. Morphological analysis (for syntax or, in less detail, for tagging).
  3. A part-of-speech 'guesser' (for words not found by the morphological analysis).
  4. Tagging (based on an HMM tagger trained on a corpus).
You can experiment with the morphological analysis and tagger on

Ken Litkowski   71520.307@compuserve.com
Dictionary utilities for creating and maintaining lexica:

Max Louwerse   m.m.louwerse@stud.let.ruu.nl
See Notabene at ...
WWW: http://sls-www.lcs.mit.edu/~flammia/Nb.html
FTP: ftp://sls-www.lcs.mit.edu/pub/flammia/Nb

Nuno Miguel Cavalheiro Marques  nmm@di.fct.unl.pt
  1. Details of two part-of-speech taggers, one using Viterbi tagging with an HMM and the other using neural networks.
  2. An article about POLARIS, a morphological lexical acquisition and retrieval database system.
  3. Both may be found at

Ana Martínez   sysnet@bitmailer.net
Contact Ana Martínez for details of MABLe, a 'multilingual letter authoring tool'.

Sandro Pedrazzini   sandro@idsia.ch
Create and maintain lexica, generate different forms of taggers and lemmatizers:

Mike Scott   ms2928@liv.ac.uk

Carlos Subirats   c.subirats@oasis.uab.es and c.subirats@cc.uab.es
E-mail: lali1@uab.es
'Etiquetador y desambiguizador del espanol,' developed by the Laboratorio de
Linguistica Informatica de la Universidad Autonoma de Barcelona.
Contact: Carlos Subirats Ruggeberg, Universidad Autonoma de Barcelona,
Laboratorio de Linguistica Informatica, Edificio B, 08193 Bellaterra, Spain.
Telephone: +343 - 581-22-29.
Facsimile: +343 - 581-16-86.

Jean Véronis   veronis@univ-aix.fr
WWW: http://www.lpl.univ-aix.fr/projects/multext/
A useful contact is Nuria Bel nuria@gilcub.es

Yorick Wilks   yorick@dcs.shef.ac.uk
Contact: david@crl.nmsu.edu

 
