Investigating Bell Inequalities for Multidimensional Relevance Judgments in Information Retrieval
Relevance judgment in Information Retrieval is influenced by multiple factors. These include not only the topicality of the documents but also other user-oriented factors like trust, user interest, etc. Recent works have identified and classified these various factors into seven dimensions of relevance. In a previous work, these relevance dimensions were quantified and the user’s cognitive state with respect to a document was represented as a state vector in a Hilbert space, with each relevance dimension representing a basis. It was observed that relevance dimensions are incompatible in some documents when making a judgment. Incompatibility being a fundamental feature of Quantum Theory, this motivated us to test the quantum nature of relevance judgments using Bell-type inequalities. However, none of the Bell-type inequalities tested have shown any violation. We discuss our methodology to construct incompatible bases for documents from real-world query log data, the experiments to test Bell inequalities on this dataset, and possible reasons for the lack of violation.
Keywords: Quantum Cognition, Information Retrieval, Multidimensional Relevance, Bell Inequalities
1 Introduction
Information Retrieval (IR) is defined as finding material (documents, videos, audio, etc.) of an unstructured nature that is relevant to an information need of the user. The Information Need (IN) of a user is usually expressed as a query. An essential component of IR is the concept of relevance of documents, defined as how well a document satisfies the user’s Information Need. Relevance in IR was traditionally considered to be topical, i.e. how well the content of the retrieved document matches the topic of the query (e.g. text match). As content similarity matching techniques have become more accurate, almost all of the documents obtained for a query generally satisfy the topicality criterion. Hence users tend to consider other factors while judging documents. These different factors have been investigated in several works [3, 19, 20]. In [13], seven relevance dimensions were identified. Each of these dimensions was quantified by defining certain features, which could be extracted from all query-document pairs. These seven dimensions are described in Table 1.
Table 1: The seven dimensions of relevance.
| Dimension | Description |
|---|---|
| Topicality | The extent to which the retrieved document is related to the topic of the current query. |
| Reliability | The degree to which the content of the document is true, accurate and believable. Determined by the reliability of the source. |
| Understandability | The extent to which the contents are readable. Vocabulary, complexity of sentences, layout of pages, etc. are taken into consideration. |
| Interest | Topics from the user’s past searches. |
| Habit | Focus on behavioral preferences of users, e.g. always using certain websites for particular tasks. |
| Scope | Whether both breadth and depth of the document are suitable to the Information Need. |
| Novelty | Whether the document contains information which is new to the user, or the document itself is newly created. |
In [18], a document defined using these relevance dimensions is represented as a two-dimensional Hilbert space. Each of the seven relevance dimensions is represented as a basis. The different bases correspond to the different perspectives of relevance judgment for the same document. Depending on which relevance dimension is considered, the same document will have different probabilities of relevance. Thus the document exists in multiple states (e.g. highly relevant, not relevant, moderately relevant, etc.) simultaneously, and we get a particular judgment depending upon which criterion (relevance dimension) the user used to judge (measure) it. This is analogous to the measurement of electron spin, which is either up or down, but whose outcome depends upon the axis along which it is measured. Electrons with spin up along the Z-axis may have both up and down components along the X-axis. So a document may look relevant based on the Topicality dimension, but may not be so along, say, the Reliability dimension. We discuss the methodology used to quantify these seven dimensions and construct Hilbert spaces for documents in the next section.
This incompatibility in judgment perspectives is a fundamental feature of Quantum Mechanics. Incompatibility forbids the possibility of jointly determining the outcome of an event from two perspectives. We investigate whether decision making in IR, consisting of multiple perspectives, exhibits an analogous quantum phenomenon. A formal test of the quantumness of systems was given in 1964 by John Bell [5]. He formulated an inequality which cannot be violated by classical systems governed by joint probability distributions. Quantum Mechanics was shown to violate it for particular settings. In this work, we use another version of the Bell inequality, called the CHSH inequality [10]. The CHSH inequality is given by equation (1) for two systems $A$ and $B$, where observables $A_1$ and $A_2$ can be measured in system $A$, and $B_1$ and $B_2$ can be measured in system $B$. $A_i$ and $B_i$ can take values only in $\{+1, -1\}$. It is assumed that the observables have pre-existing values which are not influenced by any other measurement.
$$|\langle A_1 B_1\rangle + \langle A_1 B_2\rangle + \langle A_2 B_1\rangle - \langle A_2 B_2\rangle| \le 2 \qquad (1)$$
The CHSH inequality is violated in Quantum Mechanics using a special composite state of two systems, called the Bell state, which has the following form:
$$|\Phi^{+}\rangle = \frac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right)$$
where $|0\rangle$ and $|1\rangle$ represent the standard basis for the two systems. Initially, both the systems are in a superposed state. The two outcomes, i.e., those corresponding to the $|0\rangle$ and $|1\rangle$ vectors, can be obtained with equal probabilities. However, on measuring one system, if one obtains the outcome corresponding to the basis vector $|0\rangle$, the state of the composite system collapses to $|00\rangle$. Now it is known for certain that the outcome of the second system also corresponds to $|0\rangle$. This is true even if the two systems are spatially separated: the measurement on one system reveals the state of the other, instantaneously.
Violation of Bell inequalities by such entangled states proves the impossibility of the existence of a joint probability distribution for the two systems. It rules out the concept of “Local Realism” of the classical world, which is the assumption made while deriving the Bell inequalities. ‘Local’ implies that the measurement of one system does not influence that of a spatially separated system. ‘Realism’ assumes that physical properties of systems have definite values that exist independent of observation.
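The contrast between an entangled state and a separable one can be checked numerically. The following is a minimal sketch, assuming real two-dimensional systems and ±1-valued observables parametrized by an angle; the function names and the particular measurement angles are illustrative choices, not taken from the experiments in this paper.

```python
import numpy as np

def spin_observable(theta):
    """A +1/-1 valued observable measured along angle theta (real 2-D system)."""
    return np.array([[np.cos(theta), np.sin(theta)],
                     [np.sin(theta), -np.cos(theta)]])

def chsh_value(state, a1, a2, b1, b2):
    """|<A1 B1> + <A1 B2> + <A2 B1> - <A2 B2>| for a two-system state vector."""
    def corr(x, y):
        joint = np.kron(spin_observable(x), spin_observable(y))
        return state @ joint @ state
    return abs(corr(a1, b1) + corr(a1, b2) + corr(a2, b1) - corr(a2, b2))

# Bell state (|00> + |11>)/sqrt(2) and the separable state |00>.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
product = np.kron([1.0, 0.0], [1.0, 0.0])

# Illustrative measurement angles (an assumption of this sketch).
settings = (0.0, np.pi / 2, np.pi / 4, -np.pi / 4)
bell_chsh = chsh_value(bell, *settings)
product_chsh = chsh_value(product, *settings)
print(bell_chsh, product_chsh)
```

With these angles the Bell state reaches the value 2√2, above the classical bound of 2, while the separable state stays below it.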
Several works have investigated the violation of Bell inequalities in macroscopic and cognitive systems [1, 2, 6]. This work investigates whether the user’s composite state for the judgment of two documents violates the Bell inequalities. After describing the methodology used to quantify the seven relevance dimensions, we describe equivalent Bell inequalities for the user states for documents. Subsequently, we give details of the experimental settings used to form the composite system of documents.
2 Quantifying Relevance Dimensions
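We represent each document as a two-dimensional real-valued Hilbert space. The two basis vectors correspond to relevance and non-relevance with respect to a dimension. For the seven dimensions, we have seven different bases in the Hilbert space. The user’s cognitive state for this document is a vector in the Hilbert space, a superposition of the basis vectors. Using the Dirac notation, we get the user state for a document $d$ in different bases as:
$$|d\rangle = \alpha_T|T\rangle + \beta_T|\bar{T}\rangle = \alpha_R|R\rangle + \beta_R|\bar{R}\rangle$$
and so on, in all seven bases. Here $|T\rangle$ and $|\bar{T}\rangle$ denote relevance and non-relevance with respect to Topicality, and $|R\rangle$ and $|\bar{R}\rangle$ the same with respect to Reliability. The squared coefficient $|\alpha_X|^2$ is the weight (i.e., probability of relevance) the user assigns to document $d$ in terms of the dimension $X$, and $|\alpha_X|^2 + |\beta_X|^2 = 1$.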
To calculate the coefficients of superposition in a basis, we use the same technique as [18]. The dataset consists of query logs from the Bing search engine. Following the methodology in [13], we define a set of features for each of the seven relevance dimensions. For each query-document pair, the features for each dimension are extracted and fed into the LambdaMART [8] Learning to Rank (LTR) algorithm to generate seven relevance scores (one for each dimension) for the query-document pair. Due to lack of space, we refer the readers to [13] for more details on the features defined for each dimension and on how they are used in the LTR algorithm. We thus get seven different ranked lists for a query, one for each relevance dimension. The scores assigned to a document for each dimension are then normalized using min-max normalization across all the documents for the query. The normalized score for each dimension forms the coefficient of superposition of the relevance vector for the respective dimension. For example, for a query $q$, let $d_1, d_2, \ldots, d_n$ be the ranking order corresponding to the “Reliability” dimension, based on the relevance scores $\lambda_1, \lambda_2, \ldots, \lambda_n$ respectively. We construct the vector for document $d_i$ in the ‘Reliability’ basis as:
$$|d_i\rangle = \alpha_R|R\rangle + \beta_R|\bar{R}\rangle$$
where $\alpha_R = \sqrt{\frac{\lambda_i - \min(\lambda)}{\max(\lambda) - \min(\lambda)}}$ and $\beta_R = \sqrt{1 - \alpha_R^2}$, where $\max(\lambda)$ is the maximum value among $\lambda_1, \ldots, \lambda_n$ and $\min(\lambda)$ the minimum. The square root is taken to enable calculation of probabilities according to the Born rule. We can thus represent this document in all seven bases, and therefore all the documents in their respective Hilbert spaces.
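The normalization step can be sketched as follows. This is a minimal illustration, assuming plain min-max normalization over the scores of one query followed by a square root; the function name and the score values are hypothetical and are not taken from the Bing dataset.

```python
import numpy as np

def relevance_state(scores, index):
    """Amplitudes [relevant, non-relevant] of one document in one relevance basis.

    scores: LTR scores of all documents retrieved for a query (one dimension).
    index:  position of the document of interest within `scores`.
    """
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        raise ValueError("min-max normalization is undefined when all scores are equal")
    prob = (scores[index] - lo) / (hi - lo)  # normalized score in [0, 1]
    # Square roots, so that squared amplitudes are probabilities (Born rule).
    return np.array([np.sqrt(prob), np.sqrt(1.0 - prob)])

# Hypothetical Reliability scores for four documents of one query.
reliability_scores = [2.0, 1.5, 0.5, 0.0]
state = relevance_state(reliability_scores, 1)
print(state, state @ state)
```

The squared first amplitude recovers the normalized score, and the vector has unit norm by construction.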
For documents whose coefficients differ between two dimensions, the corresponding bases are incompatible. Incompatibility in relevance dimensions for judging documents can be manifested in terms of Order Effects: considering the relevance dimensions in a different order while judging a document leads to a different final judgment. As an example, consider a document with the following Hilbert space:
We take the basis()as the standard basis. Representing basis in the standard basis, we get (Appendix A):
Suppose that while judging Document , the user has the order in mind. Then the final probability of relevance is the projection from as shown in Figure 1.a. This is calculated as . If the user reverses the order of relevance dimensions considered while judging document , we get = , which is times larger (Figure 1.b).
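An order effect of this kind can be reproduced with two successive projections. The sketch below is an illustration under assumed values: the user state and the angles of the two basis vectors are hypothetical, chosen only to make the arithmetic easy to follow, and are not the amplitudes of the example document above.

```python
import numpy as np

def basis_vector(angle):
    """Unit vector at the given angle in a real two-dimensional Hilbert space."""
    return np.array([np.cos(angle), np.sin(angle)])

def sequential_probability(state, first, second):
    """Probability of judging 'relevant' on `first` and then 'relevant' on `second`."""
    return (first @ state) ** 2 * (second @ first) ** 2

# Hypothetical geometry: user state, and the 'relevant' vectors of two dimensions.
state = basis_vector(0.0)
topicality = basis_vector(np.pi / 6)
reliability = basis_vector(np.pi / 3)

p_topicality_first = sequential_probability(state, topicality, reliability)
p_reliability_first = sequential_probability(state, reliability, topicality)
print(p_topicality_first, p_reliability_first)
```

For this geometry the two orders give 0.5625 and 0.1875, so reversing the order changes the final probability by a factor of three.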
3 Deriving a Bell Inequality for Documents
3.1 CHSH Inequality
In section 2, we showed how to calculate the relevance probabilities of a document for different dimensions. We constructed a Hilbert space for each document, consisting of seven different bases, one for each dimension of relevance. Two or more such documents can be considered as a composite system by taking a tensor product of the document Hilbert spaces. If $|d_1\rangle$ and $|d_2\rangle$ are the state vectors of two documents, we can represent the tensor product as $|d_1\rangle \otimes |d_2\rangle$. Figure 2 shows the geometrical representation of two such Hilbert spaces. Here $|H\rangle$ represents relevance in the Habit basis, or in IR terms, relevance of a document with respect to the Habit dimension. Similarly, $|\bar{H}\rangle$ represents irrelevance in the Habit basis.
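In the CHSH inequality, we have observables $A_1$ and $A_2$ for a system, taking values in $\{+1, -1\}$. For a document $d$, we have observables corresponding to the different relevance dimensions. Taking the case of two relevance dimensions, Habit and Novelty, we have observables $H$ and $N$, which take values in $\{+1, -1\}$. Here $H = +1$ corresponds to a projection on the basis vector $|H\rangle$, and $H = -1$ corresponds to the projection on its orthogonal basis vector $|\bar{H}\rangle$.
Taking two documents as a composite system, we can write the CHSH inequality in the following way:
$$|\langle H_1 H_2\rangle + \langle H_1 N_2\rangle + \langle N_1 H_2\rangle - \langle N_1 N_2\rangle| \le 2$$
where the subscripts $1$ and $2$ denote that the observables belong to document $d_1$ and document $d_2$ respectively. Using the fact that $\langle H_1 H_2\rangle = P(H_1 H_2 = +1) - P(H_1 H_2 = -1)$ and $P(H_1 H_2 = +1) + P(H_1 H_2 = -1) = 1$, we can convert the above inequality into its probability form, in which each correlation is written as:
$$\langle H_1 H_2\rangle = 2P(H_1 H_2 = +1) - 1$$
We do not have the joint probabilities in our dataset, hence we assume that the joint probabilities factorize, e.g. $P(H_1 = +1, H_2 = +1) = P(H_1 = +1)\,P(H_2 = +1)$ (this is where the assumption of realism is incorrectly made, which will not lead to a violation of the CHSH inequality). With this assumption we get:
$$\langle H_1 H_2\rangle = \left(2P(H_1 = +1) - 1\right)\left(2P(H_2 = +1) - 1\right)$$
and likewise for the other three terms.
As we mentioned above, $H_1 = +1$ corresponds to the basis vector $|H\rangle$, and therefore $P(H_1 = +1)$ corresponds to the probability that document $d_1$ is relevant with respect to the Habit dimension of relevance. Therefore we can calculate these probabilities as projections in the Hilbert space:
$$P(H_1 = +1) = |\langle H|d_1\rangle|^2, \qquad P(N_1 = +1) = |\langle N|d_1\rangle|^2$$
and similarly for document $d_2$.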
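Why the factorization assumption cannot produce a violation can be seen numerically. The sketch below assumes that every joint probability is the product of the single-document probabilities, so each correlation becomes a product of two expectations of the form 2p − 1; the function names and the random sampling are illustrative and not part of the original experiments.

```python
import numpy as np

def expectation(p_relevant):
    """Expectation of a +1/-1 observable whose outcome +1 has probability p_relevant."""
    return 2.0 * p_relevant - 1.0

def chsh_independent(p_h1, p_n1, p_h2, p_n2):
    """CHSH value when all joint probabilities factorize (assumption of this sketch)."""
    h1, n1, h2, n2 = (expectation(p) for p in (p_h1, p_n1, p_h2, p_n2))
    return np.abs(h1 * h2 + h1 * n2 + n1 * h2 - n1 * n2)

# Sample many hypothetical probability assignments and record the largest value.
rng = np.random.default_rng(0)
probs = rng.uniform(0.0, 1.0, size=(4, 100_000))
max_chsh = chsh_independent(*probs).max()
print(max_chsh)
```

The sampled maximum never exceeds 2, and the bound is reached only at deterministic assignments such as `chsh_independent(1, 1, 1, 0)`.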
3.2 CHSH Inequality for documents using the Trace Method
Another way to define the CHSH inequality for documents is by directly calculating the expectation values using the trace rule. According to this rule, the expectation value of an observable $A$ in a state $|\psi\rangle$ is given by
$$\langle A\rangle = \mathrm{tr}(\rho A)$$
where the quantity $\rho = |\psi\rangle\langle\psi|$ is the density matrix for the state $|\psi\rangle$.
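Let the two documents be represented in the standard basis as follows:
$$|d_1\rangle = a_1|H\rangle + b_1|\bar{H}\rangle, \qquad |d_2\rangle = a_2|H\rangle + b_2|\bar{H}\rangle$$
where $|H\rangle = \begin{pmatrix}1\\0\end{pmatrix}$ and $|\bar{H}\rangle = \begin{pmatrix}0\\1\end{pmatrix}$. Hence, the state vector and the density matrix for a document can be written as:
$$|d_1\rangle = \begin{pmatrix}a_1\\b_1\end{pmatrix}, \qquad \rho_1 = |d_1\rangle\langle d_1| = \begin{pmatrix}a_1^2 & a_1 b_1\\ a_1 b_1 & b_1^2\end{pmatrix}$$
The document representations in another basis are as follows:
$$|d_1\rangle = c_1|N\rangle + e_1|\bar{N}\rangle, \qquad |d_2\rangle = c_2|N\rangle + e_2|\bar{N}\rangle$$
$|H\rangle$ and $|N\rangle$ are basically relevance with respect to two relevance dimensions, say Habit and Novelty. We can write the $N$ basis in terms of the $H$ basis (see Appendix A) as:
$$|N\rangle = (a_1 c_1 + b_1 e_1)|H\rangle + (b_1 c_1 - a_1 e_1)|\bar{H}\rangle, \qquad |\bar{N}\rangle = (a_1 e_1 - b_1 c_1)|H\rangle + (a_1 c_1 + b_1 e_1)|\bar{H}\rangle$$
for the first document, and similarly for the second document.
Thus we get the vector representations for the basis states $|N\rangle$ and $|\bar{N}\rangle$ as:
$$|N\rangle = \begin{pmatrix}a_1 c_1 + b_1 e_1\\ b_1 c_1 - a_1 e_1\end{pmatrix}, \qquad |\bar{N}\rangle = \begin{pmatrix}a_1 e_1 - b_1 c_1\\ a_1 c_1 + b_1 e_1\end{pmatrix}$$
Now the observables $H$ and $N$ are defined as:
$$H = (+1)|H\rangle\langle H| + (-1)|\bar{H}\rangle\langle\bar{H}|, \qquad N = (+1)|N\rangle\langle N| + (-1)|\bar{N}\rangle\langle\bar{N}|$$
where $|H\rangle\langle H|$ and $|\bar{H}\rangle\langle\bar{H}|$ are the projection operators for the standard basis vectors, with eigenvalues $+1$ and $-1$ respectively. This is the spectral decomposition of the observables. We get $H = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$. The matrix for observable $N$ is obtained for each document in terms of that document's amplitudes $a_i, b_i$ and $c_i, e_i$. Now the CHSH inequality for the observables $H$ and $N$ acting on the two documents can be written as:
$$|\langle H_1 H_2\rangle + \langle H_1 N_2\rangle + \langle N_1 H_2\rangle - \langle N_1 N_2\rangle| \le 2 \qquad (19)$$
Here $\langle H_1 H_2\rangle$ denotes that we measure the observable $H$ on both the documents. In the language of tensor products,
$$\langle H_1 H_2\rangle = \mathrm{tr}\left[(H \otimes H)(\rho_1 \otimes \rho_2)\right]$$
In this way we can directly calculate the expectation values in equation (19). As a sample calculation, $\langle H_1 H_2\rangle = (a_1^2 - b_1^2)(a_2^2 - b_2^2)$, where $a_i^2$ and $b_i^2$ are the probabilities of relevance and non-relevance respectively in the standard (Habit) basis.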
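The trace-rule calculation can be sketched with NumPy as follows. This is an illustration under assumed values: the document amplitudes and the Novelty basis vectors (written in the Habit basis) are hypothetical, and the helper names are not from the paper.

```python
import numpy as np

def density(state):
    """Density matrix |d><d| of a real, normalized state vector."""
    state = np.asarray(state, dtype=float)
    return np.outer(state, state)

def observable(plus_vector):
    """Spectral form (+1)|v><v| + (-1)|v_perp><v_perp| for a real unit vector v."""
    v = np.asarray(plus_vector, dtype=float)
    v_perp = np.array([-v[1], v[0]])
    return np.outer(v, v) - np.outer(v_perp, v_perp)

def correlation(obs_1, obs_2, rho_1, rho_2):
    """tr[(obs_1 (x) obs_2)(rho_1 (x) rho_2)] for a product state of two documents."""
    return float(np.trace(np.kron(obs_1, obs_2) @ np.kron(rho_1, rho_2)))

# Hypothetical amplitudes of two documents in the standard (Habit) basis.
rho_1, rho_2 = density([0.8, 0.6]), density([0.6, 0.8])
H = observable([1.0, 0.0])        # diag(+1, -1)
N_1 = observable([0.96, -0.28])   # hypothetical Novelty vector for document 1
N_2 = observable([0.6, 0.8])      # hypothetical Novelty vector for document 2

hh = correlation(H, H, rho_1, rho_2)
chsh = abs(hh
           + correlation(H, N_2, rho_1, rho_2)
           + correlation(N_1, H, rho_1, rho_2)
           - correlation(N_1, N_2, rho_1, rho_2))
print(hh, chsh)
```

Because the composite state is a tensor product, each correlation factorizes into a product of single-document expectations, so the value stays within the classical bound of 2.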
3.3 N-Settings Bell Inequality
The CHSH inequality refers to two two-dimensional systems where each system has two measurement settings (or measurement bases). However, this can be generalized to systems with multiple settings or bases [11]
where $[x]$ denotes the largest integer smaller than or equal to $x$.
4 Experiment and Results
Having obtained an equivalent representation of Bell inequalities in section 3, we proceed to substitute the values in the inequalities and test for violation using relevance scores as calculated in section 2. For each query, a user judges several documents to be relevant or non-relevant according to his or her information need. We investigate the correlations between these documents, with each document having multiple decision perspectives, using the Bell Inequalities. We consider the following types of document pairs to test for Quantum Correlations:
I) We consider those queries where only two documents are SAT clicked (Satisfied Click - Those documents which are clicked and browsed for at least 30 seconds). Out of queries in our dataset, queries had exactly two SAT clicked documents. We consider a composite system of these two documents and measure (judge the relevance) along different basis (relevance dimensions) corresponding to each of the Bell inequalities described in sections 3.1, 3.2 and 3.3.
II) We consider those queries for which we have at least one SAT clicked document. Out of queries in our dataset, we find queries with at least one SAT clicked document. We then consider a composite system of this SAT clicked document with all the unclicked documents for the query (one by one) and measure (judge the relevance) along different basis (relevance dimensions) corresponding to each of the Bell inequalities described in sections 3.1, 3.2, 3.3.
In both cases, we do not find a violation of the Bell inequalities for any query. While case (I) corresponds to correlated documents and case (II) corresponds to anti-correlated documents, it is to be noted that we form the composite system by taking a tensor product of two document states. This, in turn, is separable back into the two document states. Quantum Mechanics violates Bell inequalities because of the existence of non-separable states like the Bell states. To get something similar to an entangled state, we consider another type of document pair:
III) Consider a pair of documents which are listed together for many queries, but are always judged in a correlated manner. That is, if one document of the pair is SAT clicked, the other one is also SAT clicked for that query. Similarly, both might be unclicked for another query in which they appear together. Among these, we find those pairs which are SAT clicked together in half of the queries they occur in, and unclicked in the other half. This corresponds to the following Bell state:
$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|11\rangle + |00\rangle\right) \qquad (23)$$
where $|11\rangle$ denotes both documents being SAT clicked and $|00\rangle$ both being unclicked.
We take such pairs of documents to test the Bell inequalities on them. Out of pairs of documents, no pair shows a violation of the inequalities discussed above.
The composite state of the two documents described in equation (23) appears to be like an entangled state of the documents: knowing whether one document is SAT clicked or not tells us about the other document. However, one fundamental property of the Bell states is their rotational invariance. Representing a Bell state in any basis, one gets the same probabilities of the two possible outcomes. For example,
$$\frac{1}{\sqrt{2}}\left(|HH\rangle + |\bar{H}\bar{H}\rangle\right) = \frac{1}{\sqrt{2}}\left(|NN\rangle + |\bar{N}\bar{N}\rangle\right) = \frac{1}{\sqrt{2}}\left(|TT\rangle + |\bar{T}\bar{T}\rangle\right)$$
where $H$, $N$ and $T$ denote relevance with respect to the Habit, Novelty and Topicality bases. One can always hypothetically construct document Hilbert spaces in such a manner that the composite state is rotationally invariant, but that is not the case in the query log data, which is the target of our investigation.
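Rotational invariance is easy to check numerically for real two-dimensional systems. The sketch below assumes that both document bases are rotated by the same angle; the product state used for comparison has hypothetical amplitudes.

```python
import numpy as np

def rotation(theta):
    """Rotation of a real two-dimensional basis by angle theta."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def rotate_both(state, theta):
    """Apply the same rotation to both subsystems of a two-system state vector."""
    R = rotation(theta)
    return np.kron(R, R) @ state

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
product = np.kron([0.8, 0.6], [0.6, 0.8])  # hypothetical separable document pair

bell_invariant = np.allclose(rotate_both(bell, 0.7), bell)
product_invariant = np.allclose(rotate_both(product, 0.7), product)
print(bell_invariant, product_invariant)
```

The Bell state keeps the same coefficients under the joint rotation, while the separable state does not.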
As a formal test of non-separable states, we perform a Schmidt decomposition of the composite system of document pairs. We do not find any evidence of non-separable states for any type of document pair, as described in cases (I), (II) and (III).
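For a pair of two-dimensional systems, a separability test of this kind reduces to a singular value decomposition of the 2×2 coefficient matrix. The sketch below is one way to carry it out, assuming real state vectors; the function names, the tolerance and the example states are assumptions of this illustration.

```python
import numpy as np

def schmidt_coefficients(state):
    """Singular values of the 2x2 coefficient matrix of a two-document state."""
    matrix = np.asarray(state, dtype=float).reshape(2, 2)
    return np.linalg.svd(matrix, compute_uv=False)

def schmidt_rank(state, tol=1e-10):
    """Number of non-zero Schmidt coefficients; rank 1 means the state is separable."""
    return int(np.sum(schmidt_coefficients(state) > tol))

product = np.kron([0.8, 0.6], [0.6, 0.8])             # hypothetical document pair
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)    # Bell state for comparison

print(schmidt_rank(product), schmidt_rank(bell))
```

A tensor product of two document states always has Schmidt rank 1, whereas the Bell state has rank 2 with two equal coefficients.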
5 Conclusion and Future Work
We tested Bell inequalities for violation using data from Bing query logs. Despite the presence of incompatible measurements, the Bell inequalities are not violated. However, the incompatibility in measurement applies to the user’s cognitive state with respect to a single document. Hence there might exist a joint probability distribution governing the user’s cognitive state for a pair of documents. In the experiments in which a violation of Bell inequalities has been reported for cognitive systems, users are asked to report their judgments on composite states. Hence the joint probabilities can be directly estimated from the judgments. This may result in a “Conjunction Fallacy” [17] due to incompatible decision perspectives, thus violating the monotonicity law of probability by overestimating the joint probability, and therefore violating the Bell inequality. In our dataset, we do not have judgments over document pairs. That is, the user does not judge a pair of documents to be relevant with respect to some dimensions. Instead, we have the probabilities of relevance of a single document with respect to different dimensions. When we use the relevance probabilities of individual documents to compute the joint probabilities for a pair of documents, we are forced to assume the existence of a joint probability distribution. Thus a Bell inequality violation might be possible if we can obtain judgment data for pairs of documents. For example, users can be asked to rate one document as relevant with respect to Novelty and another document as relevant with respect to Topicality. This would correspond to the term $\langle N_1 T_2\rangle$ in the CHSH inequality. In this case, the user’s judgment of one document may affect the judgment of the other document in the pair.
Another test of the quantum nature of relevance judgments would be to test non-contextual inequalities like the KCBS inequality [12]. Bell inequalities are designed for a composite system with the assumption of locality and realism. The non-contextual inequalities are designed for a single system with multiple measurement perspectives, some of which are incompatible with each other. However, contextuality is exhibited only in systems of more than two dimensions. Hence we need to modify our two-dimensional approach (two decision outcomes, relevant or not relevant) to test inequalities like the KCBS inequality. One can also test for violation of the Contextuality-by-Default inequality [9, 4]. This forms part of our future work.
Appendix A
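Consider a state vector $|d\rangle$, with real coefficients, in two different bases of a two-dimensional Hilbert space,
$$|d\rangle = a|H\rangle + b|\bar{H}\rangle = c|N\rangle + e|\bar{N}\rangle \qquad (25)$$
We want to represent the vectors of one basis in terms of the other. To do that, consider the vector orthogonal to $|d\rangle$, which is
$$|d^{\perp}\rangle = b|H\rangle - a|\bar{H}\rangle = e|N\rangle - c|\bar{N}\rangle$$
Using the above representations, we get
$$|N\rangle = c|d\rangle + e|d^{\perp}\rangle, \qquad |\bar{N}\rangle = e|d\rangle - c|d^{\perp}\rangle$$
Substituting $|d\rangle$ from (25) and $|d^{\perp}\rangle$, both written in the $H$ basis, we get:
$$|N\rangle = (ac + be)|H\rangle + (bc - ae)|\bar{H}\rangle, \qquad |\bar{N}\rangle = (ae - bc)|H\rangle + (ac + be)|\bar{H}\rangle$$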
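The change of basis can be verified numerically. The sketch below assumes real amplitudes and that both bases share the same orientation, so that the vector orthogonal to the state has coefficients (b, −a) in the first basis and (e, −c) in the second; the symbols a, b, c, e, the function name and the numerical values are hypothetical.

```python
import numpy as np

def second_basis_in_first(a, b, c, e):
    """Given |d> = a|H> + b|H_bar> = c|N> + e|N_bar>, return |N> and |N_bar> in the H basis."""
    d = np.array([a, b])
    d_perp = np.array([b, -a])   # orthogonal to d, same orientation convention in both bases
    n = c * d + e * d_perp
    n_bar = e * d - c * d_perp
    return n, n_bar

# Hypothetical amplitudes of one document in the Habit and Novelty bases.
a, b, c, e = 0.8, 0.6, 0.6, 0.8
n, n_bar = second_basis_in_first(a, b, c, e)
d = np.array([a, b])
print(n, n_bar, n @ d, n_bar @ d)
```

The returned vectors are orthonormal, and projecting the state onto them recovers the amplitudes c and e of the second basis.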
This work is funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 721321. We would like to thank Jingfei Li for his help in providing the processed dataset.
- [1] Aerts, D.: Foundations of Physics 30(9), 1387–1414 (2000). https://doi.org/10.1023/a:1026449716544
- [2] Aerts, D., Sozzo, S.: Quantum entanglement in concept combinations. International Journal of Theoretical Physics 53(10), 3587–3603 (Dec 2013). https://doi.org/10.1007/s10773-013-1946-z
- [3] Barry, C.L.: Document representations and clues to document relevance. Journal of the American Society for Information Science 49(14), 1293–1303 (1998). https://doi.org/10.1002/(SICI)1097-4571(1998)49:14<1293::AID-ASI7>3.0.CO;2-E
- [4] Basieva, I., Cervantes, V.H., Dzhafarov, E.N., Khrennikov, A.: True contextuality beats direct influences in human decision making (2018)
- [5] Bell, J.S.: On the Einstein Podolsky Rosen paradox. Physics Physique Fizika 1, 195–200 (Nov 1964). https://doi.org/10.1103/PhysicsPhysiqueFizika.1.195
- [6] Bruza, P.D., Kitto, K., Ramm, B., Sitbon, L., Song, D., Blomberg, S.: Quantum-like non-separability of concept combinations, emergent associates and abduction. Logic Journal of IGPL 20(2), 445–457 (Jan 2011). https://doi.org/10.1093/jigpal/jzq049
- [7] Bruza, P., Chang, V.: Perceptions of document relevance. Frontiers in Psychology 5, 612 (2014). https://doi.org/10.3389/fpsyg.2014.00612
- [8] Burges, C.J.C.: From RankNet to LambdaRank to LambdaMART: An overview (2010)
- [9] Cervantes, V.H., Dzhafarov, E.N.: Snow queen is evil and beautiful: Experimental evidence for probabilistic contextuality in human choices. Decision 5(3), 193–204 (Jul 2018). https://doi.org/10.1037/dec0000095
- [10] Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A.: Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. 23, 880–884 (Oct 1969). https://doi.org/10.1103/PhysRevLett.23.880
- [11] Gisin, N.: Bell inequality for arbitrary many settings of the analyzers. Physics Letters A 260(1-2), 1–3 (Sep 1999). https://doi.org/10.1016/s0375-9601(99)00428-4
- [12] Klyachko, A.A., Can, M.A., Binicioğlu, S., Shumovsky, A.S.: Simple test for hidden variables in spin-1 systems. Phys. Rev. Lett. 101, 020403 (Jul 2008). https://doi.org/10.1103/PhysRevLett.101.020403
- [13] Li, J., Zhang, P., Song, D., Wu, Y.: Understanding an enriched multidimensional user relevance model by analyzing query logs. Journal of the Association for Information Science and Technology 68(12), 2743–2754 (2017). https://doi.org/10.1002/asi.23868
- [14] Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information: 10th Anniversary Edition. Cambridge University Press, New York, NY, USA, 10th edn. (2011)
- [15] Sakurai, J.J., Napolitano, J.: Modern Quantum Mechanics. Cambridge University Press (2017)
- [16] Trueblood, J.S., Busemeyer, J.R.: A quantum probability account of order effects in inference. Cognitive Science 35(8), 1518–1552 (Sep 2011). https://doi.org/10.1111/j.1551-6709.2011.01197.x
- [17] Tversky, A., Kahneman, D.: Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review 90(4), 293–315 (1983). https://doi.org/10.1037/0033-295x.90.4.293
- [18] Uprety, S., Song, D.: Investigating order effects in multidimensional relevance judgment using query logs. In: Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval. pp. 191–194. ICTIR ’18, ACM, New York, NY, USA (2018). https://doi.org/10.1145/3234944.3234972
- [19] Xu, Y.C., Chen, Z.: Relevance judgment: What do information users consider beyond topicality? Journal of the American Society for Information Science and Technology 57(7), 961–973 (2006). https://doi.org/10.1002/asi.20361
- [20] Zhang, Y., Zhang, J., Lease, M., Gwizdka, J.: Multidimensional relevance modeling via psychometrics and crowdsourcing. In: Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval. pp. 435–444. SIGIR ’14, ACM, New York, NY, USA (2014). https://doi.org/10.1145/2600428.2609577