Learning Structured Information in Natural Language Applications



April 3rd 2006, Trento, Italy

Hosted in conjunction with the 11th Conference of the European Chapter of the Association for Computational Linguistics
Main conference: April 5-7, 2006

Motivations and Aims


Language processing largely deals with multidimensional and highly structured forms of information. Indeed, from the morphological up to the deep syntactic and semantic levels, linguistic information is often described by structured data, making the learning of the associated linguistic tasks more complex.


Traditional methods for the design of language applications involve the extraction of features that map data representations to vectors of attributes/values. Unfortunately, there is no methodology that helps in this feature modeling problem. Consequently, in order to encode structured data, the designer has to rely on his/her deep knowledge, expertise and intuition about the linguistic phenomenon associated with the target structures.


Recently, approaches that attempt to alleviate such modeling complexity by directly encoding structured data have been developed. Among others, kernel methods and conditional random fields offer interesting properties. The former use kernel functions to implicitly define richer feature spaces (e.g. substructure spaces), whereas the latter allow the designer to directly encode the probabilistic model on the structures. The promising aspects of such approaches open new research directions:

(a) the study of their impact on the modeling of diverse natural language structures, (b) their comparative assessment with traditional attribute-value models and (c) the investigation of techniques which aim to improve their efficiency.


Additionally, the complementary problem of mapping classification functions into structured output spaces is of great interest. Classification functions can be designed to output structured data instead of simple values. In other words, the output values may be interpreted as macro-labels which describe configurations and dependencies over simpler components, e.g. parse trees or semantic structures.




The main goal of this workshop is to bring together researchers from different communities, such as machine learning, computational linguistics, information retrieval and data mining, to promote the discussion and development of new ideas and methods for the effective exploitation of "structured data" in natural language learning and applications. The latter include, but are not restricted to:


  • Coreference Resolution
  • Information Extraction/Relation Extraction
  • Machine Translation
  • Multilingual Corpus Alignment
  • Named Entity Recognition
  • Question Classification
  • Semantic Role Labeling
  • Semantic Parsing
  • Syntactic Parsing and Parse Tree Re-Ranking
  • Text Categorization
  • Word Sense Disambiguation


We are particularly interested in the following machine learning aspects:


  • Kernel Methods
  • Maximal Margin Classifiers
  • Conditional Random Fields
  • Support Vector Machines


Information on registration and registration fees is provided on the conference web page.

Final Program
Workshop Proceedings
Workshop Chairs


Roberto Basili and Alessandro Moschitti

(University of Rome "Tor Vergata")


Program Committee


 Nicola Cancedda (Xerox Research Centre Europe, France)

 Nello Cristianini (University of California, Davis, USA)

 Aron Culotta (University of Massachusetts Amherst, USA)

 Walter Daelemans (University of Antwerp, Belgium)

 Marcello Federico (ITC-Irst, Italy)

 Attilio Giordana (University of Turin, Italy)

 Marko Grobelnik (J. Stefan Institute, Ljubljana, Slovenia)

 Fred Jelinek (CLSP, Johns Hopkins University, USA)

 Thorsten Joachims (Cornell University, USA)

 Lluis Marquez (Universitat Politecnica de Catalunya, Spain)

 Giuseppe Riccardi (University of Trento, Italy)

 Dan Roth (University of Illinois at Urbana-Champaign, USA)

 Alex Smola (National ICT Australia, ANU)

 Carlo Strapparava (ITC-Irst, Italy)

 John Shawe-Taylor (University of Southampton, UK)

 Ben Taskar (University of California at Berkeley, USA)

 Dmitry Zelenko (SRA International Inc., USA)


Proceedings and Publications

Contacts have been established with an international publisher for the production of a special issue of an international journal, or of a book, dedicated to the workshop topics. This will include extended versions of selected papers from the workshop proceedings. Details on the post-workshop publication will be provided on these pages as soon as possible.



Important dates

Paper due

January 6, 2006

Notification of acceptance

January 27, 2006

Camera-ready papers due

February 10, 2006

Preliminary Program

February 15, 2006


Workshop

April 3, 2006




Further Information

For any information, please contact: 

Alessandro Moschitti
moschitti [at] info.uniroma2.it

Dept. of Computer Science, Systems and Production
University of Rome Tor Vergata
Via del Politecnico 1
Rome (ITALY)

tel:     +39 06 72597333
fax:    +39 06 72597460

