Graph-of-Word and TW-IDF: New Approach to Ad Hoc IR

  • Authors:

📜 Abstract

This paper introduces Graph-of-Word and TW-IDF, a novel approach for ad hoc information retrieval (IR) in which documents are represented as graphs. Nodes of the graph correspond to index terms (words), and edges capture the co-occurrence and order relationship between terms. The relevance score is computed using a graph connectivity measure, while a modified version of TF-IDF, called TW-IDF, integrates term weights according to graph indices. Preliminary experiments carried out over the TREC_AP collection show significant improvements over conventional approaches.

✨ Summary

The paper “Graph-of-Word and TW-IDF: New Approach to Ad Hoc IR” introduces a method for representing documents as graphs to improve ad hoc information retrieval (IR). Unlike traditional text representation models, the authors represent each document as a graph in which nodes are words and edges encode co-occurrence and word order, which captures more contextual information. Relevance to a query is scored with a graph connectivity measure, and a modified TF-IDF, named TW-IDF, integrates term weights derived from the graph. The authors report significant improvements over conventional retrieval approaches in preliminary experiments on the TREC_AP collection, an established dataset in IR research.
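To make the representation concrete, the following is a minimal sketch of how a graph-of-word and a TW-IDF-style score could be computed. It is not the authors' implementation: the sliding-window size, the use of in-degree as the graph-based term weight, the IDF formula, and all function names (`build_graph_of_word`, `indegree_weights`, `tw_idf`) are assumptions chosen for illustration, and document-length normalization is omitted.

```python
import math
from collections import defaultdict


def build_graph_of_word(tokens, window=4):
    """Build a directed graph: edge u -> v if v follows u within `window` tokens.

    The window size is an assumed parameter, not taken from the summary above.
    """
    successors = defaultdict(set)
    for i, u in enumerate(tokens):
        successors[u]  # touch the key so isolated terms still become nodes
        # A window of size w links a term to the w - 1 terms that follow it.
        for v in tokens[i + 1 : i + window]:
            if u != v:  # skip self-loops
                successors[u].add(v)
    return dict(successors)


def indegree_weights(graph):
    """Term weight = number of distinct terms pointing at the node (assumed choice)."""
    weights = {node: 0 for node in graph}
    for succs in graph.values():
        for v in succs:
            weights[v] += 1
    return weights


def tw_idf(query, doc_tokens, doc_freq, n_docs, window=4):
    """Sum of graph-based term weight times IDF over the query terms.

    `doc_freq` maps a term to the number of documents containing it.
    The IDF form log((N + 1) / df) is one common variant, assumed here.
    """
    weights = indegree_weights(build_graph_of_word(doc_tokens, window))
    return sum(
        weights.get(term, 0) * math.log((n_docs + 1) / doc_freq[term])
        for term in set(query)
        if term in doc_freq
    )
```

In this sketch the term weight replaces the raw term frequency of TF-IDF, so a term that co-occurs with many distinct neighbours scores higher than one that is merely repeated in the same context.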

A search for citations of and references to this work found limited academic or industry acknowledgment. No significant direct impact was identified in subsequent research or industry applications, which suggests that, while the approach may hold potential, it has yet to gain substantial traction in the field of information retrieval.