Natural Language Applications
Hosted in conjunction with the 11th Conference of the European Chapter of
the Association for Computational Linguistics
Motivations and Aims
Language processing largely deals with multidimensional and highly structured forms of information. Indeed, from the morphological up to the deep syntactic and semantic levels, linguistic information is often described by structured data, making the learning of the associated linguistic tasks more complex.
Traditional methods for the design of language applications involve the extraction of features that map data representations to vectors of attributes/values. Unfortunately, there is no methodology that helps in this feature modeling problem. Consequently, in order to encode structured data, the designer has to rely on his/her deep knowledge, expertise and intuition about the linguistic phenomenon associated with the target structures.
Recently, approaches that attempt to alleviate such modeling complexity by directly encoding structured data have been developed. Among others, kernel methods and conditional random fields provide interesting properties. The former use kernel functions to implicitly define richer feature spaces (e.g. substructure spaces) whereas the latter allow the designer to directly encode the probabilistic model on the structures. The promising aspects of such approaches open new research directions:
(a) the study of their impact on the modeling of diverse natural language structures, (b) their comparative assessment with traditional attribute-value models and (c) the investigation of techniques which aim to improve their efficiency.
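As an illustration of how a kernel function can implicitly define a substructure space, the following minimal sketch compares two token sequences with a contiguous n-gram kernel. It is not taken from any workshop contribution: the function names and the example sentences are assumptions chosen for illustration. The explicit version builds the n-gram count vectors and takes their dot product; the implicit version obtains the same value by matching positions pairwise, without ever constructing the feature vectors.

```python
from collections import Counter


def ngram_features(tokens, n):
    """Explicit feature map: counts of every contiguous n-gram in the sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def explicit_kernel(s, t, n):
    """Dot product of the two explicit n-gram count vectors."""
    fs, ft = ngram_features(s, n), ngram_features(t, n)
    return sum(count * ft[gram] for gram, count in fs.items())


def implicit_kernel(s, t, n):
    """Same value, computed by counting matching position pairs directly."""
    return sum(
        s[i:i + n] == t[j:j + n]
        for i in range(len(s) - n + 1)
        for j in range(len(t) - n + 1)
    )


# Hypothetical example sentences, tokenized on whitespace.
s = "the cat sat on the mat".split()
t = "the dog sat on the mat".split()

# Both computations agree for every n-gram length.
for n in (1, 2, 3):
    assert explicit_kernel(s, t, n) == implicit_kernel(s, t, n)
```

The same principle underlies kernels over richer structures such as parse trees, where enumerating the substructure space explicitly quickly becomes impractical.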
Additionally, the complementary study of mapping the classification function in structured spaces is very interesting. Classification functions can be designed to output structured data instead of simple values. In other words, the output values may be interpreted as macro-labels which describe configurations and dependencies over simpler components, e.g. parse trees or semantic structures.
The main goal of this workshop is to bring together researchers from different communities such as machine learning, computational linguistics, information retrieval and data mining to promote the discussion and development of new ideas and methods for the effective exploitation of "structured data" for natural language learning and applications. The latter include, but are not restricted to:
We are particularly interested in the following machine learning aspects:
Information on registration and registration fees is provided at the conference web page.
(University of Rome "Tor Vergata")
Nicola Cancedda (Xerox Research Centre Europe, France)
Nello Cristianini (University of California, Davis, USA)
Aron Culotta (University of Massachusetts Amherst, USA)
Walter Daelemans (
Marcello Federico (
Attilio Giordana (
Marko Grobelnik (J. Stefan Institute,
Fred Jelinek (
Thorsten Joachims (
Lluis Marquez (Universitat Politecnica de Catalunya, Spain)
Giuseppe Riccardi (University of Trento, Italy)
Dan Roth (
Alex Smola (National ICT Australia, ANU)
Carlo Strapparava (ITC-Irst, Italy)
John Shawe-Taylor (
Ben Taskar (
Dimitry Zelenko (SRA International Inc., USA)
Proceedings and Publications
Contacts have been established with an international publisher for the production of a special issue of an international journal, or of a book, dedicated to the workshop topics. This will include extended versions of selected papers from the workshop proceedings. Details on the post-workshop publication will be provided on these pages as soon as possible.
For any information, please contact:
Dept. of Computer Science, Systems and
tel: +39 06 72597333