Discourse Complements Lexical Semantics for Non-factoid Answer Reranking

Abstract

We extend previous representations of questions and answers, based on graphical models of concepts and predicates, with features that capture discourse relations. The problem is framed as reranking the top n answers produced by a state-of-the-art retrieval model. A maximum entropy model is trained to assign a score to each candidate answer using lexical, semantic, and discourse features. The approach can be applied to question answering (QA) systems built on any retrieval model. Our model improves not only absolute answer reranking accuracy but also precision and recall at the top answer ranks, showing that discourse relations can significantly complement lexical semantics in QA.

Description

✨ Summary

The paper “Discourse Complements Lexical Semantics for Non-factoid Answer Reranking” by Peter Jansen, Mihai Surdeanu, and Peter Clark explores how integrating discourse features with lexical semantics improves non-factoid answer reranking in question-answering systems. The study applies a maximum entropy reranking model to the top candidate answers produced by a retrieval model and reports gains in precision, recall, and accuracy when discourse relations are included.
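The reranking setup described above can be illustrated with a minimal sketch. It assumes a binary maximum entropy model (equivalent to logistic regression) trained with plain stochastic gradient ascent, and three hypothetical features per candidate: a retrieval score, a lexical overlap score, and a discourse indicator. The feature names, toy values, and function names are illustrative assumptions; the paper's actual feature set and training procedure are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_maxent(examples, epochs=200, lr=0.5):
    """Fit a binary maxent (logistic regression) model with SGD.

    examples: list of (feature_vector, label), label 1 for a correct answer.
    """
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(dot(w, x))
            # Gradient of the log-likelihood for one example: (y - p) * x
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w


def rerank(candidates, w):
    """Sort (answer_id, feature_vector) pairs by model score, best first."""
    ordered = sorted(candidates, key=lambda c: dot(w, c[1]), reverse=True)
    return [answer_id for answer_id, _ in ordered]


# Hypothetical features: [retrieval_score, lexical_overlap, discourse_match]
training = [
    ([0.9, 0.2, 0.0], 0),
    ([0.6, 0.5, 1.0], 1),
    ([0.8, 0.3, 0.0], 0),
    ([0.5, 0.6, 1.0], 1),
    ([0.7, 0.4, 0.0], 0),
    ([0.4, 0.4, 1.0], 1),
]
weights = train_maxent(training)

# Candidates listed in their original retrieval order (toy values).
candidates = [
    ("a1", [0.9, 0.3, 0.0]),
    ("a3", [0.8, 0.2, 0.0]),
    ("a2", [0.7, 0.5, 1.0]),
]
ranked = rerank(candidates, weights)
print(ranked)
```

In this toy data the discourse indicator is the only feature that separates correct from incorrect answers, so the learned model promotes the candidate with discourse support above candidates that the retrieval model ranked higher. Real data would be far noisier, and the gain would depend on how informative each feature group is.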

The paper’s combination of discourse analysis with lexical semantics has informed later work on answer ranking for non-factoid questions. Discourse features are also discussed in studies such as “Answering Non-factoid Questions through Multi-type Answer Re-ranking” by Kan and others, which builds on these ideas to refine answer retrieval accuracy. (Reference: https://aclanthology.org/2017.bea-1.1/)

Although published over a decade ago, the paper offers foundational insights into using discourse analysis in computational linguistics, specifically for improving the performance of question-answering models.