- Thesis defense of Guillermo Moncecchi
- 07.03.2013 13.00 h - 17.00 h
- Université Paris Ouest - Nanterre La Défense - Nanterre
Recognizing Speculative Language in Research Texts
PRESENTED BY GUILLERMO MONCECCHI
Supervised by Mr. Jean-Luc Minel and Ms. Dina Wonsever
This thesis presents a methodology for solving certain classification problems, particularly those involving sequential classification in Natural Language Processing tasks. It proposes an iterative, error-based approach to improving classification performance, and suggests incorporating expert knowledge into the learning process through knowledge rules.
We applied the methodology to two tasks related to the detection of hedging in scientific articles, and evaluated it on both: hedge cue identification and hedge cue scope detection. Results are promising: for the first task, incorporating cue co-occurrence information improved baseline results by 2.5 F-score points, while for scope detection, incorporating syntactic information and rules for syntactic scope pruning improved classification performance from an F-score of 0.712 to a final 0.835.
Compared with state-of-the-art methods, these results are competitive, suggesting that improving classifiers based only on the errors committed on a held-out corpus could be applied successfully to other, similar tasks.
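For readers unfamiliar with the metric, the sketch below shows how an F-score is commonly computed from counts of true positives, false positives and false negatives. It is only an illustration: the abstract does not say which F-score variant or counting unit (token, cue, or scope) the thesis uses, so the balanced F1 default and the function name `f_score` are assumptions.

```python
def f_score(tp, fp, fn, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and
    false negatives. beta=1.0 gives the balanced F1 score, which is
    an assumption here, not something stated in the abstract.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With this definition, a classifier that finds 8 of 10 true hedge cues and also proposes 2 spurious ones has precision and recall of 0.8, and so an F1 of about 0.8.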
Additionally, this thesis proposes a class schema for representing the analysis of a sentence in a single structure that includes the results of different linguistic analyses. This allows us to better manage the iterative process of classifier improvement, where a different attribute set is used for learning in each iteration. We also propose storing attributes in a relational model, instead of the traditional text-based structures, to facilitate the analysis and manipulation of learning data.
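As a rough illustration of what storing attributes relationally might look like, the sketch below keeps one row per token and one row per token attribute in an in-memory SQLite database, so that each iteration can select a different attribute set with a query. The table names, column names and helper functions are all hypothetical; the abstract does not describe the actual schema used in the thesis.

```python
import sqlite3


def build_store():
    """Create an in-memory database with hypothetical token/attribute tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE token (
            sentence_id INTEGER,
            position    INTEGER,
            word        TEXT,
            PRIMARY KEY (sentence_id, position)
        );
        CREATE TABLE attribute (
            sentence_id INTEGER,
            position    INTEGER,
            name        TEXT,
            value       TEXT,
            PRIMARY KEY (sentence_id, position, name)
        );
    """)
    return conn


def add_token(conn, sentence_id, position, word, **attributes):
    """Insert a token and any number of named attributes for it."""
    conn.execute("INSERT INTO token VALUES (?, ?, ?)",
                 (sentence_id, position, word))
    conn.executemany(
        "INSERT INTO attribute VALUES (?, ?, ?, ?)",
        [(sentence_id, position, name, str(value))
         for name, value in attributes.items()],
    )


def feature_rows(conn, names):
    """Return one dict per token, restricted to the requested attributes."""
    tokens = conn.execute(
        "SELECT sentence_id, position, word FROM token "
        "ORDER BY sentence_id, position"
    ).fetchall()
    placeholders = ",".join("?" * len(names))
    rows = []
    for sentence_id, position, word in tokens:
        attrs = dict(conn.execute(
            "SELECT name, value FROM attribute "
            "WHERE sentence_id = ? AND position = ? "
            f"AND name IN ({placeholders})",
            (sentence_id, position, *names),
        ))
        rows.append({"word": word, **attrs})
    return rows
```

Under this layout, trying a new attribute set in a later iteration means passing a different list of names to `feature_rows`, rather than regenerating a column-based text file.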
- Université Paris Ouest - Nanterre La Défense
- 200 avenue de la République 92001 Nanterre