University of Twente Student Theses
Information Retrieval models based on electrical circuit analysis using Ohm’s law
Donselaar, V.L. van (2012) Information Retrieval models based on electrical circuit analysis using Ohm’s law.
Abstract: | The TFxIDF weighting function is a well-known and proven model in modern information retrieval systems. The model allows documents to be ranked by relevance based on the frequency of the terms issued in a user’s query. Although it yields good results, an explanation for its success is not obvious. Attempts have been made to explain the model in terms of statistics or common sense. This paper looks for similarities and differences with the theory of network analysis. A simplified network model based on the principle of an electrical circuit acts as a guide to understanding the model’s operation. The correctness of this model is tested by implementing it as a function of the Terrier Information Retrieval System, whereupon it is evaluated against Terrier’s predefined TFxIDF model. Results show that the precision of the network model is not as high as a TFxIDF model would typically achieve. Nonetheless, the network model shows a new approach for calculating the document score for multi-term queries, which improves the precision of the top 10 results. |
Item Type: | Essay (Bachelor) |
Faculty: | TNW: Science and Technology |
Subject: | 54 computer science |
Programme: | Advanced Technology BSc (50002) |
Link to this item: | https://purl.utwente.nl/essays/74667 |