Indexing Evaluation

Juran Sarkhel (2017). Professor of Library & Information Science, University of Kalyani, India

An indexing system is a sub-system of an information retrieval system, and hence its performance is directly linked to the overall performance of the entire information retrieval system. Evaluating an indexing system essentially means measuring how well or badly it performs: for users, in terms of its retrieval efficiency (ease of approach, speed, and accuracy); and for the managers of the system, in terms of its internal operating efficiency, cost-effectiveness and cost-benefit.

The foundation of the Institute of Information Scientists in the UK in 1958 coincides closely with the beginning of the notion of experimental evaluation of information retrieval systems in general and indexing systems in particular. Although there had been some earlier attempts, we usually mark the start of the tradition as the Cranfield experiments, which ran from 1958 to 1966.

Purpose of Indexing Evaluation:

  • To identify the level of performance of the given indexing system,
  • To understand how well the given indexing system fulfills the queries of the users in retrieving the relevant documents,
  • To compare the performance of two or more indexing systems against a standard,
  • To identify the possible sources of failure or inefficiency in the given indexing system with a view to raising the level of performance at some future date,
  • To justify the existence of the given indexing system by analysing its costs and benefits,
  • To establish a foundation for further research on the reasons for the relative success of alternative techniques, and
  • To improve the means employed for attaining objectives or to redefine goals in view of research findings.

Efficiency and Effectiveness of Indexing System:

By effectiveness we mean the extent to which the given indexing system attains its stated objectives. Effectiveness may be measured by how far an information retrieval system can retrieve relevant information while withholding non-relevant information, and it can be quantified by calculating recall and precision ratios. By efficiency, we mean how economically the indexing system achieves its stated objectives. Efficiency can be measured by factors such as the minimum cost and effort at which the system functions effectively. The cost factors may have to be calculated indirectly, through response time (i.e. the time taken by the system to retrieve the information), user effort (i.e. the amount of time and effort required by a user to interact with the indexing system and analyse the output retrieved in order to get the required information), the cost involved, and so on.

Indexing Evaluation Criteria:

It is evident from the history of the experimental evaluation of information retrieval systems that there has been a remarkably coherent development of a set of criteria for the evaluation of indexing systems. These evaluation criteria generate argument, disagreement and heated dispute, but there remains a relatively stable common core, which has, despite its limitations, served us well over the last 50 years. The most important criteria used for evaluating an indexing system are: Recall and Precision.

Recall and Precision

a) Recall:

Recall refers to the index’s ability to let relevant documents through the filter. The recall ratio is the ratio of the relevant documents retrieved to the total number of relevant documents potentially available. It measures the completeness of the output. Hence, the recall performance can be expressed quantitatively by means of the recall ratio:

Recall ratio = [R / C] x 100

Where, R = Number of relevant documents retrieved against a search

C = Total number of relevant documents available in the collection for that particular request.

b) Precision:

When the system retrieves items that are relevant to a given query, it also retrieves some documents that are not relevant. These non-relevant items affect the success of the system because the user must discard them, which wastes a significant amount of time. The term ‘precision’ refers to the index’s ability to hold back documents not relevant to the user. The precision ratio is the ratio of the relevant documents retrieved to the total number of documents retrieved. It measures the preciseness of the output, i.e. how precisely an indexing system functions. If recall is the measure of the system’s ability to let through wanted items, precision is the measure of the system’s ability to hold back unwanted items. The formula for calculation of the precision ratio is:

Precision ratio = [R / L] x 100

Where, R = Total number of relevant documents retrieved against a search

L = Total number of documents retrieved in that search

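Against any query, the user’s relevance decision separates all documents in the collection into two parts: (a) the set of relevant documents, and (b) the set of non-relevant documents. The system, in turn, either retrieves each document or does not. The following matrix can be used as a common frame of reference for the evaluation of an indexing system with reference to the calculation of recall and precision ratios:

User relevance decision

System decision | Relevant | Not relevant | Total
Retrieved | a (Hits) | b (Noise) | a + b
Not retrieved | c (Misses) | d (Dodged) | c + d
Total | a + c | b + d | a + b + c + d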

From the above matrix, recall and precision ratios can be calculated as follows:

  • Recall ratio = [a / (a + c)] x 100
  • Precision ratio = [a / (a + b)] x 100


a = Hits (Relevant documents retrieved by the system. They add to both recall and precision).

b = Noise (Irrelevant documents retrieved by the system along with the relevant documents against a search. They lower precision).

c = Misses (Relevant documents that should have been retrieved but that the system fails to retrieve. They lower recall).

d = Dodged (Documents not relevant to the given query that the system correctly declines to retrieve).
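The four cells can be counted directly from sets of document identifiers. The following is a minimal Python sketch and not part of the original unit: the function names and the sample collection of 20 numbered documents are hypothetical, chosen only to illustrate the two ratios.

```python
def contingency_counts(collection, retrieved, relevant):
    """Return (a, b, c, d) for one search: hits, noise, misses, dodged."""
    collection, retrieved, relevant = set(collection), set(retrieved), set(relevant)
    a = len(retrieved & relevant)               # hits: retrieved and relevant
    b = len(retrieved - relevant)               # noise: retrieved but not relevant
    c = len(relevant - retrieved)               # misses: relevant but not retrieved
    d = len(collection - retrieved - relevant)  # dodged: neither retrieved nor relevant
    return a, b, c, d


def recall_ratio(a, c):
    """Recall ratio = [a / (a + c)] x 100; 0.0 when no document is relevant."""
    return 100.0 * a / (a + c) if (a + c) else 0.0


def precision_ratio(a, b):
    """Precision ratio = [a / (a + b)] x 100; 0.0 when nothing is retrieved."""
    return 100.0 * a / (a + b) if (a + b) else 0.0


# Hypothetical data: a collection of 20 documents numbered 1 to 20.
collection = range(1, 21)
retrieved = {1, 2, 3, 4, 5, 6, 7, 8}           # output of the search
relevant = {1, 2, 3, 4, 5, 6, 9, 10, 11, 12}   # user's relevance decisions

a, b, c, d = contingency_counts(collection, retrieved, relevant)
print(a, b, c, d)             # 6 2 4 8
print(recall_ratio(a, c))     # 60.0
print(precision_ratio(a, b))  # 75.0
```

In this made-up search the system finds 6 of the 10 relevant documents (recall 60%), and 6 of the 8 documents it returns are relevant (precision 75%).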

It needs to be pointed out here that 100% recall and 100% precision are not possible in practice because recall and precision tend to vary inversely in searching. When we broaden a search to achieve better recall, precision tends to go down. Conversely, when we restrict the scope of a search by searching more stringently in order to improve the precision, recall tends to deteriorate.
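The inverse tendency can be illustrated with a small Python sketch. It rests on assumptions that are not in the original text: broadening a search is modelled simply as accepting more items from a ranked output list, and both the relevance judgments and the total of 5 relevant documents are invented for the illustration.

```python
# Hypothetical ranked output of one search: True = judged relevant by the user.
ranked_judgments = [True, True, False, True, False, False, True, False, False, False]

# Assumed number of relevant documents in the whole collection
# (one of them never appears in the output at all).
total_relevant = 5


def recall_precision_at(cutoff, judgments, total_relevant):
    """Recall and precision (as percentages) for the first `cutoff` items."""
    hits = sum(judgments[:cutoff])
    recall = 100.0 * hits / total_relevant
    precision = 100.0 * hits / cutoff
    return recall, precision


# A narrow, a medium and a broad search.
for cutoff in (2, 5, 10):
    recall, precision = recall_precision_at(cutoff, ranked_judgments, total_relevant)
    print(f"first {cutoff} items: recall = {recall:.0f}%, precision = {precision:.0f}%")
```

With these invented judgments, recall rises from 40% to 60% to 80% as the search is broadened, while precision falls from 100% to 60% to 40%. Real searches need not behave this neatly; the sketch only shows the direction of the trade-off.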

c) Relevance:

In human history, relevance has been around forever, or at least as long as humans have tried to communicate and use information effectively. The concept of “relevance” is the fundamental concept of information science in general and of information retrieval in particular. Evaluation of indexing will never be effective until there is an understanding of the concept of relevance. Relevance is one of the important types of measures used in the evaluation of an information retrieval system and is a highly debated issue in information retrieval research. There does not seem to be any consensus among the experts on the definition of relevance.

The first full recognition of relevance as an underlying notion came in 1955 with a proposal to use “recall” and “relevance” as measures of retrieval effectiveness, with relevance as the underlying criterion for both. Because of the resulting confusion, the relevance measure was later renamed precision; it was sometimes also called pertinence. Strictly, however, the term pertinence refers to a relationship between a document and an information need, whereas the term relevance refers to a relationship between a document and a request statement (i.e. expressed information need). In either sense, the notion concerns the ability of an information retrieval system to retrieve material that satisfies the needs of the user.

We know that the main objective of indexing, which forms an essential component of an IR system, is to determine the aboutness of documents for the subsequent retrieval of information objects relevant to user queries. Relevance denotes how well a retrieved set of documents meets the information need of the user, i.e. to what extent the topic of a retrieved set of information objects matches the topic of the query or information need.

In most of the evaluation studies relevance was applied to stated requests (i.e. expressed need). But it has now been well established that users’ requests do not reflect their information needs completely. Therefore, the current view is that relevance is to be judged in relation to both expressed and unexpressed needs rather than being restricted to stated requests. It depends on the degree to which a user is able to recognize the exact nature of his/her information need and the degree to which that need is accurately expressed in the form of a request (i.e. request statement). Information retrieval systems create relevance: they take a query, match it to information objects in the system by following some algorithms, and provide what they consider relevant. People, by contrast, derive relevance from the information or information objects obtained. They relate and interpret the information or information objects to the problem at hand, their cognitive state, and other factors. In other words, people take the retrieved results and derive what may be relevant to them. Relevance is derived by inference.

Although “relevance” is extensively used in the evaluation of information retrieval, there are considerable problems associated with reaching an agreement on its definition, meaning, evaluation, and application in information retrieval. There are a number of different views on “relevance” and its use for evaluation. This is because there are degrees of relevance, and because relevance is a subjective factor depending on the individual. The same question, posed by two different enquirers, may well require two different answers, since each enquirer seeks information from the standpoint of his/her own corpus of knowledge. Thus relevance is highly subjective and personal. It is a relation between an individual with an information need and a document.

Other Important Criteria:

Perry and Kent are credited with bringing the concept of evaluation into information retrieval systems during the 1950s. The evaluation criteria they suggested were:

i) Resolution factor: The proportion of the total items retrieved over the total number of items in the collection.

ii) Pertinency factor: The proportion of relevant items retrieved over the total number of retrieved items. This factor was popularly named the precision ratio in subsequent evaluation studies.

iii) Recall factor: The proportion of relevant items retrieved over the total number of relevant items in the collection.

iv) Elimination factor: The proportion of non-retrieved items (both relevant and non-relevant) over the total items in the collection. This factor is the complement of the resolution factor.

v) Noise factor: The proportion of retrieved items that are not relevant. This factor is the complement of the pertinency factor.

vi) Omission factor: The proportion of relevant items not retrieved over the total number of relevant items in the collection. This factor is the complement of the recall factor.

Perry and Kent suggested the following formulae for the estimation of the above-mentioned evaluation criteria:

L / N = Resolution factor             (N - L) / N = Elimination factor

R / L = Pertinency factor             (L - R) / L = Noise factor

R / C = Recall factor                 (C - R) / C = Omission factor

Where,

N = Total number of documents

L = Number of retrieved documents

C = Number of relevant documents

R = Number of documents that are both retrieved and relevant
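The six formulae can be computed together from the four counts. The following Python sketch is an illustration only: the function name `perry_kent_factors`, the input checks and the sample figures are assumptions added here and do not come from Perry and Kent.

```python
def perry_kent_factors(N, L, C, R):
    """Return the six Perry and Kent factors as fractions between 0 and 1.

    N = total documents, L = retrieved, C = relevant, R = retrieved and relevant.
    """
    if N <= 0 or L <= 0 or C <= 0:
        raise ValueError("N, L and C must be positive")
    if not (0 <= R <= min(L, C)) or max(L, C) > N:
        raise ValueError("counts are inconsistent with one another")
    return {
        "resolution": L / N,
        "elimination": (N - L) / N,
        "pertinency": R / L,
        "noise": (L - R) / L,
        "recall": R / C,
        "omission": (C - R) / C,
    }


# Hypothetical search: 1000 documents, 50 retrieved, 40 relevant, 30 both.
factors = perry_kent_factors(N=1000, L=50, C=40, R=30)
for name, value in factors.items():
    print(f"{name:12s} {value:.2f}")
```

For these made-up figures the factors are 0.05 (resolution), 0.95 (elimination), 0.60 (pertinency), 0.40 (noise), 0.75 (recall) and 0.25 (omission). Each pair printed side by side in the formulae above sums to 1, which is a quick check on any hand calculation.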

C. W. Cleverdon (1966) identified six criteria for the evaluation of an information retrieval system. These are:

i) Recall: It refers to the ability of the system to present all the relevant items;

ii) Precision: It refers to the ability of the system to present only those items that are relevant;

iii) Time lag: It refers to the time elapsing between the submission of a request by the user and his receipt of the search results.

iv) User effort: It refers to the intellectual as well as the physical effort required from the user in obtaining answers to the search requests. The effort is measured by the amount of time the user spends in conducting the search or negotiating his enquiry with the system. Response time may be good even when the system demands a great deal of user effort.

v) Form of presentation of the search output, which affects the user’s ability to make use of the retrieved items, and

vi) Coverage of the collection: It refers to the extent to which the system includes relevant matter. It is a measure of the completeness of the collection.

Article Collected From:

  • Sarkhel, J. (2017). Unit-9 Basics of Subject Indexing. Retrieved from

