2019 EPFL Doctorate Award – Finalist – Amit Gupta
Special distinction from the selection committee to Amit Gupta for his thesis “Automated Taxonomy Induction and its Applications”
EPFL thesis n°8160 (2017)
Thesis director: Prof. K. Aberer
Machine-readable semantic knowledge in the form of taxonomies is beneficial in an array of NLP tasks.
This thesis addresses automated taxonomy induction. We first induce taxonomies from Wikipedia: we introduce a set of novel heuristics for inducing an English taxonomy, and we propose a novel approach that leverages Wikipedia's interlanguage links to induce taxonomies in other languages. Compared to the state of the art, our approach is simpler, more principled, and yields taxonomies that are significantly more accurate on both edge-based and path-based metrics for more than 280 languages. We then turn to taxonomy induction from unstructured text. Unlike previous approaches, which typically extract single hypernym edges, our approach uses a novel probabilistic framework to extract long-range hypernym subsequences, and we demonstrate that it outperforms state-of-the-art taxonomy induction approaches across four languages. In summary, this thesis proposes new approaches to automated taxonomy induction that improve upon the state of the art and relax many of the simplifying assumptions that limited the applicability of prior approaches, thus automating taxonomy induction in the true sense.