Click Efficiency: A Unified Optimal Ranking for Online Ads and Documents

Abstract

Traditionally, the probabilistic ranking principle is used to rank search results, while ranking based on expected profits is used for paid placement of ads. These rankings try to maximize the expected utilities based on user click models. Recent empirical analysis of search engine logs suggests a unified click model for both ranked ads and search results. The segregated view of document and ad rankings does not consider this commonality. Further, the click model considers the parameters of (i) the probability of the user abandoning browsing of the results and (ii) the perceived relevance of result snippets, but how to use them for improved ranking is currently unknown. In this paper, we propose a generalized ranking function, namely Click Efficiency (CE), for documents and ads based on empirically validated user click models. The ranking considers parameters (i) and (ii) above, is optimal, and has the same time complexity as sorting. To exploit its generality, we examine the reduced forms of CE ranking under different assumptions, enumerating a hierarchy of ranking functions. Interestingly, some of the rankings in the hierarchy are currently used ad and document ranking functions, while others suggest new rankings. Thus this hierarchy illustrates the relations between different rankings and clarifies the underlying assumptions. While optimality of ranking is sufficient for document ranking, applying the ranking to ad auctions requires an appropriate pricing mechanism. We incorporate a second-price-based pricing mechanism with the proposed ranking. Our analysis proves several desirable properties, including revenue dominance over VCG for the same bid vector and the existence of a Nash equilibrium in pure strategies. The equilibrium is socially optimal and revenue equivalent to the truthful VCG equilibrium. As a result of its generality, the auction mechanism and its equilibrium reduce to the current mechanisms, including GSP, and their corresponding equilibria.
Further, we relax the independence assumption in CE ranking and analyze the diversity ranking problem. We show that optimal diversity ranking is NP-Hard in general, and that a constant ratio approximation algorithm is unlikely.

Keywords: ad ranking, document ranking, diversity, unified ranking

Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Storage and Retrieval [Information Search and Retrieval]

General Terms: Algorithms, Theory

1 Introduction

Search engines rank results to maximize the relevance of the top documents. Targeted ads, on the other hand, are ranked to maximize the profit from clicks. In general, users browse through ranked lists of results or ads from top to bottom, either clicking or skipping the results, or abandoning the list due to impatience or satiation. The goal of ranking is to maximize the expected relevance (or profit) of clicked results based on the click model of the users. The sort-by-relevance ranking suggested by the Probability Ranking Principle (PRP) has been commonly used for search results for decades [Robertson (1977), Gordon and Lenk (1991)]. In contrast, sorting by expected profit, calculated as the product of bid amount and Click Through Rate (CTR), is popular for ranking ads [Richardson et al. (2007)].

Recent click models suggest that the user click behaviors for both search results and targeted ads are the same [Guo et al. (2009), Zhu et al. (2010)]. Considering this commonality, the only difference between the two ranking problems is the utilities of entities ranked: for documents utility is the relevance and for the ads it is the cost-per-click. This suggests the possibility of a unified ranking function for results and ads. The current segregation of document and ad ranking as separate areas does not consider this commonality. A unified approach can help to widen the scope of the related research to these two areas, and enable applications of existing ranking functions in one area to isomorphic problems in the other area as we will show below.

In addition to the unified approach, the recent click models consider the following parameters:

  1. Browsing Abandonment: The user may abandon browsing ranked list at any point. The likelihood of abandonment may depend on the entities the user has already seen [Zhu et al. (2010)].

  2. Perceived Relevance: Perceived relevance is the user’s relevance assessment viewing only the search snippet or ad impression. The decision to click or not depends on the perceived relevance, not on the actual relevance of the results [Yue et al. (2010), Clarke et al. (2007)].

Though these parameters are part of the click models [Guo et al. (2009), Zhu et al. (2010)] how to exploit these parameters to improve ranking is unknown. The current document ranking is based on the simplifying assumption that the perceived relevance is the same as the actual relevance of the document, and ignores the browsing abandonment. The ad placement partially considers perceived relevance, but ignores abandonment probabilities.

We propose a unified optimal ranking function—namely Click Efficiency (CE)—based on a generalized click model of the user. CE is defined as the ratio of the stand-alone utility generated by an entity to the sum of the abandonment probability and click probability of that entity (abandonment probability is the probability for the user to leave browsing the list after viewing the entity). The sum of the abandonment and click probability may be viewed as the click probability consumed by the entity. We derive the name Click Efficiency based on this view—similar to the definition of the mechanical efficiency of a machine as the ratio of the output to the input energy. We show that sorting in the descending order of CE of entities guarantees optimum ranking utility. We do not make assumptions on the utilities of the entities, which may be assessed relevance for documents or cost per click (CPC) charged based on the auction for ads. On plugging in the appropriate utilities—relevance for documents and CPC for the ads—the ranking specializes to document and ad ranking.

As an implication of the generality, the proposed ranking will reduce to specific ranking problems on assumptions on the user behavior. We enumerate a hierarchy of ranking functions corresponding to the limiting assumptions on the click model. Most interestingly, some of these special cases correspond to the currently used document and ad ranking functions—including PRP and sort by expected profit described above. Further, some of the reduced ranking functions suggest new rankings for special cases of the click model—like a click model in which the user never abandons the search, or the perceived relevance is approximated as the actual relevance. This hierarchy elucidates interconnection between different ranking functions and the assumptions behind the rankings. We believe that this will help in choosing the appropriate ranking function for a particular user click behavior.

Ad Ranking: An ad placement mechanism consists of a ranking and a pricing strategy. Hence, to apply the CE ranking to ad placement, a pricing mechanism has to be associated with it. We incorporate a second-price-based pricing mechanism with the proposed ranking. Our analysis establishes several properties of the proposed mechanism. In particular, we state and prove the existence of a Nash Equilibrium in pure strategies. At this equilibrium, the profit of the search engine and the total revenue of the advertisers are simultaneously optimized. Like the ranking, this is a generalized auction mechanism, and it reduces to the existing GSP and Overture mechanisms under the same assumptions as the ranking. Further, the stated Nash Equilibrium is a general case of the equilibria of these existing mechanisms. Comparing the mechanism properties with those of VCG [Vickrey (1961), Clarke (1971), Groves (1973)], we show that for the same bid vector the search engine revenue for the CE mechanism is greater than or equal to that of VCG. Further, the revenue at the proposed equilibrium is equal to the revenue of the truthful dominant-strategy equilibrium of VCG.

Diversity Ranking: Our analysis so far is based on the assumption of parameter independence between the ranked entities. We relax this assumption and analyze the implications based on a specific well-known problem, diversity ranking [Carterette (2010), Agrawal et al. (2009), Rafiei et al. (2010)]. Diversity ranking tries to maximize the collective utility of the top-$k$ ranked entities. In a ranked list, an entity reduces the residual utility of a similar entity below it in the list. Though optimizing many of the specific ranking functions incorporating diversity is known to be NP-Hard [Carterette (2010)], an understanding of why this is an inherently hard problem is lacking. By analyzing a significantly general case, we show that even the most basic formulation of diversity ranking is NP-Hard. Further, we extend our proof to show that a constant ratio approximation algorithm is unlikely. As a benefit of the generality of the ranking, these results are applicable to both ads and documents.

The contributions of the unified ranking, including both ad and document domains are:

  1. Unified optimal ranking (CE ranking) based on a generalized click model.

  2. Optimal ranking considering abandonment probabilities for documents and ads.

  3. Optimal Ranking considering perceived relevance of documents and ads.

  4. A unified hierarchy of ranking functions and enumerating optimal rankings for different click models.

  5. Analysis of general diversity ranking problem and hardness proofs.

Contributions to ad placement are:

  1. Design and analysis of a generalized ad auction mechanism incorporating pricing with CE ranking.

  2. Proving existence of a socially optimal Nash Equilibrium with optimal advertisers revenue as well as optimal search engine profit.

  3. Proof of search engine revenue dominance over VCG for equivalent bid vectors, and equilibrium revenue equivalence to the truthful VCG equilibrium.

The rest of this paper is organized as follows. The next section reviews related work. Section 3 explains the click model used for our analysis. Subsequently, we introduce our optimal ranking function and discuss its intuitions and implications. In Section 5, reductions of the ranking function to several document and ad ranking functions under limiting assumptions are enumerated. Further, we discuss several useful special cases of our ranking and the assumptions under which they are optimal. In Section 6, we incorporate a pricing strategy to design a complete auction mechanism for ads. Several useful properties are established, including the existence of a Nash equilibrium and revenue dominance over VCG. Section 7 explores ranking considering mutual influences and proves our hardness results. Finally, we conclude by discussing potential future research.

2 Related Work

The impact of click models on ranking has been analyzed in ad placement. Balakrishnan and Kambhampati [Balakrishnan and Kambhampati (2008)] proposed the optimal ad ranking considering mutual influences. That ranking uses the same user model, but the paper considers only ads and does not include the generalization and the ad auction mechanism. Aggarwal et al. [Aggarwal et al. (2008)] as well as Kempe and Mahdian [Kempe and Mahdian (2008)] analyze placement of ads using a Markovian click model. Their click model is similar, except that abandonment is not modeled separately from the continuation probability. These two papers optimize the sum of the revenues of the advertisers, instead of optimizing search engine profits as we do for ads in this paper. Giotis and Karlin [Giotis and Karlin (2008)] extend this work by applying GSP pricing and analyzing the equilibrium. The GSP pricing and ranking lack the optimality and generality properties we prove in this paper. Deng and Yu [Deng and Yu (2009)] extend this work by suggesting a ranking and pricing scheme for search engines and prove the existence of a Nash Equilibrium. Their ranking is a simpler bid-based ranking (not based on CPC as in our case), and the mechanism as well as the equilibrium do not show the optimality properties we prove in this paper. Kuminov and Tennenholtz [Kuminov and Tennenholtz (2009)] proposed a Pay Per Action (PPA) model similar to the click models and compared the equilibrium of the GSP mechanism on that model with that of VCG. Ad auctions considering the influence of other ads on conversion rates are analyzed by Ghosh and Sayedi [Ghosh and Sayedi (2010)].

The existing document ranking based on PRP [Robertson (1977)] claims that a retrieval order sorted on relevance yields more relevant documents in the retrieved set than any other policy. Gordon and Lenk [Gordon and Lenk (1991), Gordon and Lenk (1992)] identified the assumptions required for the optimality of ranking according to PRP. Our discussion of PRP may be considered an independent formulation of the assumptions under which PRP is optimal for web ranking.

User behavior studies in click models validate the ranking function introduced. A number of position-based and cascade models have been studied recently [Dupret and Piwowarski (2008), Craswell et al. (2008), Guo et al. (2009), Chapelle and Zhang (2009), Zhu et al. (2010)]. In particular, the General Click Model (GCM) by Zhu et al. [Zhu et al. (2010)] is of interest to us, since other click models are special cases of GCM. Zhu et al. [Zhu et al. (2010)] have listed the assumptions under which the GCM reduces to other click models. We discuss the relation of our model to GCM below. Optimizing the utilities of two-dimensional placement of search results has been studied by Chierichetti et al. [Chierichetti et al. (2011)].

Along with the current click models, there has been research on evaluating perceived relevance of the search snippets [Yue et al. (2010)] and ad impressions [Clarke et al. (2007)]. Research in this direction neatly complements our new ranking function by estimating the parameters required.

Diversity ranking has received considerable attention recently [Agrawal et al. (2009), Rafiei et al. (2010)]. Optimizing the objective functions used by prior works to measure diversity is known to be NP-Hard [Carterette (2010)].

3 Click Model

Figure 1: Flow graph for a user browsing the first two entities. The labels are the view probabilities, and $e_i$ denotes the entity at position $i$.

We assume a basic user click model in which the web user browses the entity list in ranked order, as shown in Figure 1. At every result entity, the user may:

  1. Click the result entity with perceived relevance $C(e_i)$. We define the perceived relevance as the probability of clicking the entity $e_i$ having seen $e_i$, i.e. $C(e_i) = P(\text{click}(e_i) \mid \text{view}(e_i))$. Note that the Click Through Rate (CTR) defined in ad placement is the same as the perceived relevance defined here [Richardson et al. (2007)].

  2. Abandon browsing the result list with abandonment probability $\gamma(e_i)$. $\gamma(e_i)$ is defined as the probability of abandoning the search at $e_i$ having seen $e_i$, i.e. $\gamma(e_i) = P(\text{abandonment}(e_i) \mid \text{view}(e_i))$.

  3. Go to the next entity in the result list with probability $1 - (C(e_i) + \gamma(e_i))$.

The click model can be schematically represented as the flow graph shown in Figure 1. Labels on the edges refer to the probability of the user traversing them. Each vertex in the figure corresponds to a view epoch (see below), and the flow balance holds at each vertex. Starting from the top entity, the probability of the user clicking the first entity is $C(e_1)$ and the probability of the user abandoning browsing is $\gamma(e_1)$. The user goes beyond the first entity with probability $1 - (C(e_1) + \gamma(e_1))$, and so on for the subsequent results.
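The flow balance described above can be checked numerically. The following is a minimal sketch, assuming each position is described by a pair of perceived relevance and abandonment probability; the function name `browse_flow` and the parameter values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def browse_flow(entities: List[Tuple[float, float]]):
    """Propagate the view probability down a ranked list.

    `entities` holds one (C, gamma) pair per position, top first, where C is
    the perceived relevance and gamma the abandonment probability.
    Returns the per-position (view, click, abandon) probabilities and the
    probability of reaching the end of the list without clicking or abandoning.
    """
    view = 1.0  # browsing starts at the top entity
    rows = []
    for c, gamma in entities:
        if c < 0 or gamma < 0 or c + gamma > 1:
            raise ValueError("need C >= 0, gamma >= 0 and C + gamma <= 1")
        rows.append((view, view * c, view * gamma))
        view *= 1.0 - (c + gamma)  # probability of moving on to the next entity
    return rows, view


# Hypothetical parameters for a list of two entities.
rows, leftover = browse_flow([(0.3, 0.1), (0.2, 0.2)])

# Flow balance: every unit of probability ends in a click, an abandonment,
# or in running off the end of the list.
total = sum(click + abandon for _, click, abandon in rows) + leftover
```

Here `total` sums to one up to floating-point error, which is the flow balance at work: no probability mass is created or lost between view epochs.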

In this model, we assume that the parameters $C(e_i)$, $\gamma(e_i)$, and the utility $U(e_i)$ are functions of the entity at the current position only, i.e. these parameters are independent of the other entities the user has already seen. We recognize that this assumption is not fully accurate, since the user's decision to click the current item or leave the search may depend not just on the current item but on all the items seen earlier in the list. We keep the assumption for the optimal ranking analysis below, since considering mutual influence between entities can lead to combinatorial optimization problems with intractable solutions. We show in Section 7 that even the simplest dependence between the parameters indeed leads to an intractable optimal ranking problem.

Though the proposed model is intuitive, we note that it is confirmed by recent empirical click models. For example, the General Click Model (GCM) by Zhu et al. [Zhu et al. (2010)] is based on the same basic user behavior. The GCM is empirically validated for both search results and ads [Zhu et al. (2010)]. Further, other click models are shown to be special cases of GCM (hence special cases of the model used in this paper). Please refer to Zhu et al. [Zhu et al. (2010)] for a detailed discussion. These previous works avoid the need for separate model validation and confirm the feasibility of parameter estimation.

4 Optimal Ranking

Based on the click model, we formally define the ranking problem and derive optimal ranking in this section. The formal problem statement is,

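Choose the optimal ranking $\langle e_1, e_2, \ldots, e_N \rangle$ of $N$ entities to maximize the expected utility

$$E(U) = \sum_{i=1}^{N} U(e_i)\, P_c(e_i) \tag{1}$$

where $N$ is the total number of entities to be ranked, $U(e_i)$ is the utility of the entity at position $i$, and $P_c(e_i)$ is its click probability.

For the browsing model in Figure 1, the click probability for the entity at position $i$ is

$$P_c(e_i) = C(e_i) \prod_{j=1}^{i-1} \left[ 1 - \left( C(e_j) + \gamma(e_j) \right) \right] \tag{2}$$

Substituting the click probability from Equation 2 into Equation 1, we get

$$E(U) = \sum_{i=1}^{N} U(e_i)\, C(e_i) \prod_{j=1}^{i-1} \left[ 1 - \left( C(e_j) + \gamma(e_j) \right) \right] \tag{3}$$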

The optimal ranking maximizing this expected utility can be shown to be a sorting problem with a simple ranking function:

Theorem 1

The expected utility in Equation 3 is maximum if the entities are placed in the descending order of the value of the ranking function $CE$,

$$CE(e_i) = \frac{U(e_i)\, C(e_i)}{C(e_i) + \gamma(e_i)} \tag{4}$$

Proof Sketch: The proof shows that any inversion in this order will reduce the expected profit. The $CE$ function is deduced from the expected profits of two placements: the ranked placement and the placement in which the order of two adjacent ads is inverted. We show that the expected profit from the inverted placement can be no greater than that of the ranked placement. Please refer to Appendix A-1 for the complete proof.
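The adjacent-swap comparison in the proof sketch can be written out in a few lines. The following is a sketch only, not a substitute for the complete proof in Appendix A-1. It writes $U_i$, $C_i$, and $\gamma_i$ for the utility, perceived relevance, and abandonment probability of the entity at position $i$; the shorthands $\mu_i$ and $V_i$ are introduced here for illustration.

```latex
% Shorthand (introduced here for illustration):
%   \mu_i = C_i + \gamma_i            view probability consumed at position i
%   V_i   = \prod_{j<i} (1 - \mu_j)   probability that position i is viewed
% Contribution of positions i and i+1 to the expected utility:
\begin{align*}
\text{ranked:}  \quad & V_i \left[ U_i C_i + (1 - \mu_i)\, U_{i+1} C_{i+1} \right] \\
\text{swapped:} \quad & V_i \left[ U_{i+1} C_{i+1} + (1 - \mu_{i+1})\, U_i C_i \right] \\
\text{ranked} - \text{swapped} \; = \; & V_i \left[ \mu_{i+1}\, U_i C_i - \mu_i\, U_{i+1} C_{i+1} \right]
\end{align*}
% All other positions contribute the same amount in both placements, because
% the view probability passed below position i+1 is V_i (1-\mu_i)(1-\mu_{i+1}),
% which is symmetric in the two entities.
```

For $\mu_i, \mu_{i+1} > 0$, the difference is non-negative exactly when $U_i C_i / \mu_i \ge U_{i+1} C_{i+1} / \mu_{i+1}$, that is, when the entity with the larger ratio of stand-alone utility to consumed view probability is placed first.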

As mentioned in the introduction, the ranking function $CE$ is the utility generated per unit view probability consumed by the entity. With respect to the browsing model in Figure 1, the top entities in the ranked list have higher view probabilities, so placing entities with greater utility per consumed view probability higher intuitively increases the total utility.
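The optimality claim can be illustrated on a small instance by comparing the sorted order against an exhaustive search over all permutations. This is a minimal sketch, assuming the click efficiency is the stand-alone expected utility divided by the sum of click and abandonment probabilities, as defined in the introduction; the `Entity` type, the function names, and the parameter values are hypothetical.

```python
from itertools import permutations
from typing import NamedTuple, Sequence


class Entity(NamedTuple):
    utility: float  # U(e): relevance for documents, cost per click for ads
    click: float    # C(e): perceived relevance (CTR for ads)
    abandon: float  # gamma(e): abandonment probability


def click_efficiency(e: Entity) -> float:
    # Assumes click + abandon > 0, so the ratio is well defined.
    return e.utility * e.click / (e.click + e.abandon)


def expected_utility(ranking: Sequence[Entity]) -> float:
    view, total = 1.0, 0.0
    for e in ranking:
        total += view * e.click * e.utility      # utility collected on a click
        view *= 1.0 - (e.click + e.abandon)      # user continues down the list
    return total


def ce_ranking(entities: Sequence[Entity]) -> list:
    # Sorting dominates the cost, so the ranking runs in O(N log N).
    return sorted(entities, key=click_efficiency, reverse=True)


# Hypothetical entities used only to exercise the functions.
entities = [
    Entity(1.0, 0.5, 0.3),
    Entity(3.0, 0.1, 0.05),
    Entity(2.0, 0.3, 0.4),
    Entity(0.5, 0.6, 0.0),
]
ranked = ce_ranking(entities)
best_by_search = max(expected_utility(p) for p in permutations(entities))
```

On this instance `expected_utility(ranked)` matches `best_by_search` up to floating-point error. The brute-force search is only a sanity check and is feasible for a handful of entities at most.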

Note that the ordering above does not maximize the utility for selecting a subset of items. The seemingly intuitive method of ranking the set of items by $CE$ and selecting the top $k$ may not be optimal [Aggarwal et al. (2008)]. For optimal selection, the proposed ranking can be extended by a dynamic-programming-based selection, similar to the method suggested by Aggarwal et al. [Aggarwal et al. (2008)] for maximizing advertisers' profit. In this paper, we discuss only the ranking problem.

5 Ranking Taxonomy

Figure 2: Taxonomy of reduced ranking functions of CE. The assumptions and corresponding reduced ranking functions are illustrated. The dotted lines denote predicted ranking functions incorporating new click model parameters.

As mentioned before, the CE ranking is applicable to different ranking problems by plugging in the corresponding utilities. For example, if we plug in relevance as the utility ($U(e) = R(d)$ in Equation 4), the ranking function is for documents, whereas if we plug in the cost per click of ads, the ranking function is for ads. Further, we may assume specific constraints on one or more of the three parameters of CE ranking (e.g. $\gamma(e) = 0$). Under these assumptions, CE ranking suggests a number of reduced ranking functions with specific applications. These substitutions and reductions can be enumerated as a taxonomy of ranking functions.

We show the taxonomy in Figure 2. The three top branches of the taxonomy (the $U(e) = R(d)$, $U(e) = \$(a)$, and $U(e) = v(a)$ branches) are document ranking, ad ranking maximizing search engine profit, and ad ranking maximizing advertisers' revenue, respectively. These branches correspond to the substitution of utilities by document relevance, CPC, and the private value of the advertisers. The sub-trees below these branches are further reduced cases of these three main categories. The solid lines in Figure 2 denote already known functions, while the dotted lines are the new ranking functions suggested by CE ranking. Sections 5.1, 5.2, and 5.3 below discuss the further reductions of document ranking, search engine optimal ad ranking, and socially optimal ad ranking, respectively.

5.1 Document Ranking

For document ranking, the utility of ranking is the probability of relevance of the document. Hence, by substituting the document relevance, denoted by $R(d)$, in Equation 4 we get

$$CE(d) = \frac{C(d)\, R(d)}{C(d) + \gamma(d)} \tag{5}$$

This function suggests the general optimal relevance ranking for the documents. We discuss some intuitively valid assumptions on user model for the document ranking and the corresponding ranking functions below. The three assumptions discussed below correspond to the three branches under Document Ranking subtree in Figure 2.

Sort by Relevance (PRP): We elucidate two sets of assumptions under which $CE(d)$ in Equation 5 reduces to PRP.

First, assume that the user has infinite patience and never abandons the results (i.e. $\gamma(d) = 0$). Substituting this assumption in Equation 5,

$$CE(d) = \frac{C(d)\, R(d)}{C(d)} = R(d) \tag{6}$$

which is exactly the ranking suggested by PRP.

In other words, in scenarios in which the user has infinite patience and never abandons checking the results (i.e. the user stops browsing the results only by clicking a result), PRP is still optimal.

A second set of slightly weaker assumptions under which $CE(d)$ reduces to PRP is

  1. $C(d) \approx R(d)$.

  2. The abandonment probability is negatively proportional to the document relevance, i.e. $\gamma(d) = k - R(d)$, where $k$ is a constant between zero and one. This assumption corresponds to the intuition that the higher the perceived relevance of the current result, the less likely the user is to abandon the search.

Now $CE(d)$ reduces to

$$CE(d) \approx \frac{R(d)^2}{k} \tag{7}$$

Since this function is strictly increasing with $R(d)$, ordering by $R(d)$ results in the same ranking as suggested by the function. This implies that PRP is optimal under these assumptions as well.

Abandonment decreasing with perceived relevance is a more intuitively valid assumption than the infinite patience assumption above.
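Both reductions to PRP can be checked numerically. The sketch below assumes the document form of the ranking function, perceived relevance times relevance divided by the sum of perceived relevance and abandonment probability; the helper names, the relevance values, and the constant `k` are hypothetical.

```python
def ce_document(relevance: float, perceived: float, abandon: float) -> float:
    # Assumed document form of click efficiency: C(d) R(d) / (C(d) + gamma(d)).
    return perceived * relevance / (perceived + abandon)


def order(scores):
    # Indices of the documents, best score first.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


relevances = [0.9, 0.2, 0.6, 0.4]   # hypothetical R(d) values
perceived = [0.3, 0.8, 0.5, 0.1]    # hypothetical C(d) values
k = 0.95                            # hypothetical constant, max R(d) <= k <= 1

prp_order = order(relevances)

# Case 1: infinite patience, gamma(d) = 0, for arbitrary perceived relevance.
ce_no_abandon = [ce_document(r, c, 0.0) for r, c in zip(relevances, perceived)]

# Case 2: C(d) = R(d) and gamma(d) = k - R(d), which gives R(d)^2 / k.
ce_linear_abandon = [ce_document(r, r, k - r) for r in relevances]

same_as_prp = (order(ce_no_abandon) == prp_order
               and order(ce_linear_abandon) == prp_order)
```

The constant `k` is chosen no smaller than the largest relevance so that every abandonment probability `k - r` stays non-negative.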

Ranking Considering Perceived Relevance: Recent click log studies effectively assess the perceived relevance of document search snippets [Yue et al. (2010), Clarke et al. (2007)]. But how to use the perceived relevance for improved document ranking is unknown. We show that the optimal ranking considering perceived relevance differs depending on the nature of the abandonment probability $\gamma(d)$.

If we assume that $\gamma(d) = 0$ in Equation 5, the optimal perceived relevance ranking is the same as that suggested by PRP, as we have seen in Equation 6.

On the other hand, if we assume that the abandonment probability is negatively proportional to the perceived relevance ($\gamma(d) = k - C(d)$) as above, the optimal ranking considering perceived relevance is

$$CE(d) = \frac{C(d)\, R(d)}{k} \propto C(d)\, R(d) \tag{8}$$

i.e. sorting in the order of the product of document relevance and perceived relevance is optimal under these assumptions. The assumption of abandonment probabilities negatively proportional to relevance is more realistic than the infinite patience assumption, as we discussed above. This discussion shows that by estimating the nature of the abandonment probability, one can decide on the optimal perceived relevance ranking.

Ranking Considering Abandonment: We now consider ranking with the abandonment probability $\gamma(d)$, under the assumption that the perceived relevance is approximately equal to the actual relevance ($C(d) \approx R(d)$). In this case $CE(d)$ becomes

$$CE(d) \approx \frac{R(d)^2}{R(d) + \gamma(d)} \tag{9}$$

Clearly this function is not strictly increasing with $R(d)$, since it also depends on $\gamma(d)$. This means that ranking considering abandonment is different from PRP ranking, even if we assume that the perceived relevance is equal to the actual relevance. On the assumption that $\gamma(d) = 0$ (or, as in the second set of PRP assumptions above, $\gamma(d) = k - R(d)$), the abandonment ranking becomes the same as PRP.
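A two-document example makes the difference from PRP concrete. This is a minimal sketch, assuming the perceived relevance equals the relevance so that the ranking function is the squared relevance divided by the sum of relevance and abandonment probability; the documents and their parameter values are hypothetical.

```python
def ce_abandonment(relevance: float, abandon: float) -> float:
    # Assumed form when C(d) = R(d): R(d)^2 / (R(d) + gamma(d)).
    return relevance ** 2 / (relevance + abandon)


def expected_relevance(ranking):
    """Expected utility of a ranked list of (relevance, abandon) pairs,
    with utility and click probability both equal to the relevance."""
    view, total = 1.0, 0.0
    for relevance, abandon in ranking:
        total += view * relevance * relevance
        view *= 1.0 - (relevance + abandon)
    return total


# Hypothetical documents: A is more relevant, but users often quit after it.
doc_a = (0.8, 0.15)
doc_b = (0.7, 0.0)

prp_first = expected_relevance([doc_a, doc_b])   # PRP order: A then B
ce_first = expected_relevance([doc_b, doc_a])    # CE order: B then A
```

With these values the less relevant document B has the higher click efficiency, and placing it first yields the higher expected relevance, so the two rankings genuinely disagree.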

5.2 Optimal Ad Ranking for Search Engines

For the paid placement of ads, the utility of an ad to the search engine is its Cost Per Click (CPC). Hence, by substituting the CPC of the ad, denoted by $\$(a)$, in Equation 4 we get

$$CE(a) = \frac{C(a)\, \$(a)}{C(a) + \gamma(a)} \tag{10}$$

Thus this function suggests the general optimal ranking for the ads. Please recall that the perceived relevance is the same as the Click Through Rate (CTR) used for ad placement [Richardson et al. (2007)].

The following paragraphs demonstrate how the general ranking presented reduces to the currently used ad placement strategies under appropriate assumptions. We will show that they all correspond to specific assumptions on the abandonment probability $\gamma(a)$. The two functions below correspond to the two branches under the SE Optimal ad Placement subtree in Figure 2.

Ranking by Bid Amount: The sort-by-bid-amount ranking was used by Overture Services (and was later used by Yahoo! for a while after their acquisition of Overture). Assuming that the user never abandons browsing (i.e. $\gamma(a) = 0$), Equation 10 reduces to

$$CE(a) = \$(a) \tag{11}$$

This means that the ads are ranked purely in terms of their payment.¹

When $\gamma(a) = 0$, we essentially have a user with infinite patience who will keep browsing downwards until finding the relevant ad. So, to maximize profit, it makes perfect sense to rank ads by bid amount. More generally, for small abandonment probabilities, ranking by bid amount is near optimal. Note that this ranking is isomorphic to the PRP ranking discussed above for document ranking, since both rank based only on utilities.

Ranking by Expected Profit: Google and Microsoft are purported to place ads in the order of expected profit, based on the product of CTR ($C(a)$ in $CE$) and bid amount ($\$(a)$) [Richardson et al. (2006)]. The ranking is part of the well-known Generalized Second Price (GSP) auction mechanism. If we approximate the abandonment probability as negatively proportional to the CTR of the ad (i.e. $\gamma(a) = k - C(a)$), Equation 10 reduces to

$$CE(a) = \frac{\$(a)\, C(a)}{k} \propto \$(a)\, C(a) \tag{12}$$

This shows that ranking ads by their stand-alone expected profit is near optimal as long as the abandonment probability is negatively proportional to the relevance.² Note that this ranking is isomorphic to the perceived relevance ranking of the documents discussed above.
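The two ad reductions can be checked in the same way as the document case. The sketch below assumes the ad form of the ranking function, CTR times CPC divided by the sum of CTR and abandonment probability; the helper names, the ads, and the constant `k` are hypothetical.

```python
def ce_ad(cpc: float, ctr: float, abandon: float) -> float:
    # Assumed ad form of click efficiency: C(a) $(a) / (C(a) + gamma(a)).
    return ctr * cpc / (ctr + abandon)


def order(scores):
    # Indices of the ads, best score first.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical ads as (CPC, CTR) pairs.
ads = [(2.0, 0.1), (1.0, 0.3), (1.5, 0.25)]
k = 0.5  # hypothetical constant with max CTR <= k <= 1

by_bid = order([cpc for cpc, _ in ads])
by_expected_profit = order([cpc * ctr for cpc, ctr in ads])

# No abandonment: click efficiency collapses to the CPC (Overture-style order).
ce_no_abandon = order([ce_ad(cpc, ctr, 0.0) for cpc, ctr in ads])

# gamma(a) = k - C(a): click efficiency is proportional to CPC times CTR.
ce_linear_abandon = order([ce_ad(cpc, ctr, k - ctr) for cpc, ctr in ads])
```

On these ads the bid order and the expected-profit order are exact reverses of each other, which shows that the choice of abandonment assumption can change the ranking completely.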

5.3 Revenue Optimal Ad Ranking

An important property of the auction mechanism is the expected revenue—which is the sum of the profits of the advertisers and search engine. To analyze advertisers profit, a private value model is commonly used. Each advertiser is assigned with a private value for the click equal to the expected revenue from the click. Advertisers pay a fraction of this revenue to the search engine depending on the pricing mechanism. The profit for advertisers is the difference between the private value and payment to the search engine. The profit for the search engine is the payment of the advertisers. Consequently, the revenue is the sum of the profits of all the parties—search engine and the advertisers.

The Advertiser Social Optima branch in Figure 2 corresponds to the ranking that maximizes total revenue. The private value of an advertiser is denoted by $v(a)$. By substituting private values for the utility in Equation 4 we get

$$CE(a) = \frac{C(a)\, v(a)}{C(a) + \gamma(a)} \tag{13}$$

If the ads are ranked in this order, the ranking guarantees maximum revenue. This result is already known: Aggarwal et al. [Aggarwal et al. (2008)] and Kempe and Mahdian [Kempe and Mahdian (2008)] independently analyzed auctions to maximize advertiser revenue and proposed a similar ranking function.

In Figure 2, the two left branches of the revenue-maximizing subtree (labeled $\gamma(a) = 0$ and $\gamma(a) = k - C(a)$) correspond to the assumptions of no abandonment and of abandonment probabilities negatively proportional to the click probability, respectively. These two cases are isomorphic to the Overture and Google rankings discussed in Section 5.2 above. We discuss revenue-maximizing ranking further in conjunction with a pricing mechanism in Section 6.

6 Applying Ranking for Ad Placement

We have shown in Section 5.2 that CE ranking maximizes the profit of the search engine for given CPCs. In ad placement, the net profit of a ranking to the search engine can only be analyzed in association with a pricing mechanism. To this end, we introduce a pricing strategy to be used with CE ranking, designing a full auction mechanism. Subsequently, we analyze the properties of the mechanism.

To describe the dynamics of ad auctions briefly, the search engine decides the ranking and pricing (cost per click) of the ads based on the bid amounts of the advertisers. Generally the pricing is not equal to the bid amount of advertisers, but is instead derived based on the bids [Easley and Kleinberg (2010), Edelman et al. (2005), Aggarwal et al. (2006)]. In response to these ranking and pricing strategies, the advertisers (more commonly, the software agents of the advertisers) may change their bids to maximize their profits. They may change bids hundreds of times a day. Eventually, the bids will stabilize at a fixed point where no advertiser can increase his profit by unilaterally changing his bid. This set of bids corresponds to a Nash Equilibrium of the auction mechanism. Hence the realistic profits of a search engine will be the profits corresponding to the Nash Equilibrium.

The next section discusses the properties of any mechanism based on the user model in Figure 1—independent of the ranking and pricing strategies. In Section 6.2, we introduce a pricing mechanism and analyze its properties including the equilibrium.

6.1 Pricing Independent Properties

In this section we illustrate properties arising based on the user browsing model in Figure 1, not assuming any pricing or ranking strategy. One of the basic results is

Remark 1

In any equilibrium the payment by the advertisers is less than or equal to their private values (i.e. individual rationality of the bidders is maintained).

If this is not true, advertiser may opt out from the auction by bidding zero and increase the profit, violating the assumption of an equilibrium.

Remark 2

In any equilibrium, price paid by an advertiser increases monotonically as he moves up in the ranking unilaterally.

From the browsing model, the click probability of an advertiser is non-decreasing as the advertiser moves up in position. Unless the price increases monotonically, the advertiser can increase profit by moving up, violating the assumption of an equilibrium.

Note that the proposed model is a general case of the positional auctions model by Varian [Varian (2007)]. Positional auctions assume static click probabilities for each position, independent of the other ads. We assume more realistic dynamic click probabilities that depend on the ads above. Due to these externalities, the model is more complex, and many of the properties derived by Varian [Varian (2007)] do not hold (e.g. monotonically increasing values and prices with positions).

Remark 3

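Irrespective of the ranking and pricing, the sum of the revenues of the advertisers is upper bounded by

$$\sum_{i=1}^{N} v(a_i)\, C(a_i) \prod_{j=1}^{i-1} \left[ 1 - \left( C(a_j) + \gamma(a_j) \right) \right] \tag{14}$$

when the advertisers are ordered by $\frac{C(a)\, v(a)}{C(a) + \gamma(a)}$. Further, this is an upper bound for the search engine profit.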

This result directly follows from the Advertisers Social Optima branch in Figure 2, and Equation 13.

The revenue is shared among the advertisers and search engine. For each click, advertisers get a revenue equal to the private value and pay a fraction equal to the CPC (set by the search engine pricing strategy) to the search engine. The total payoff for the search engine is the sum of the payments by the advertisers. Conversely, total payoff to the advertisers is the difference between total revenue and payoff to the search engine. Since the suggested order above in Remark 3 maximizes revenue, which is the sum of the payoffs of all the players (search engine and the advertisers), this is a socially optimal order and the revenue realized is the socially optimal revenue.

A corollary of the social optimality combined with the individual rationality result in Remark 1 is that,

Remark 4

The quantity in Remark 3 is an upper bound for the search engine profit irrespective of the ranking and pricing mechanism.

Social optimal revenue can be realized only if the ads are in the descending order of . Social optimum is desirable for search engines, since this will increase the payoffs for advertisers for the same CPC. Increased payoffs will increase the advertiser’s incentive to advertise with the search engine and will increase business for the search engine in long term.

Since search engines do not know the private values of the advertisers (note that the search engine performs the ranking), a socially optimal ranking based on private values is not directly feasible. We need to design a mechanism having an equilibrium that coincides with the social optimum. This will motivate advertisers towards bids coinciding with the socially optimal ordering. In addition to social optimality, it is highly desirable for the mechanism to be based on CE ranking, to simultaneously maximize advertisers' revenue and search engine profit. In the following section we propose such a mechanism using CE ranking and prove the existence of an equilibrium in which the CE ranking coincides with the socially optimal allocation.

6.2 Pricing and Equilibrium

In this section, we define a pricing strategy to use with the CE ranking, and analyze properties of the resulting mechanism.

For defining the pricing strategy, we define the pricing order as the decreasing order of , where is,

(15)

In this pricing order, we denote the advertiser’s as , as , as , and abandonment probability as for convenience. Let . For each click advertiser is charged with a price (CPC) equal to the minimum bid required to maintain its position in the pricing order,

(16)

Substituting in Equation 10 for the ranking order, CE of the advertiser is,

(17)

This proposed mechanism preserves the pricing order in the ranking order as well, i.e.

Theorem 2

The order by is the same as the order by for the auction i.e.

(18)

The proof is given in Appendix A-2. This order preservation property implies that the final ranking is the same as that based on bid amounts. As a corollary, the CPC is also equal to the minimum amount an advertiser has to pay to maintain his position in the ranking order.

Further, we show below that any advertiser's CPC is less than or equal to his bid.

Lemma 1 (Individual Rationality)

The payment of any advertiser is less than or equal to his bid amount.

{proof}

This means advertisers never have to pay more than their bids, similar to GSP. This property makes it easy for an advertiser to decide his bid.

Interestingly, this mechanism is also a generalization of the existing mechanisms, as in the case of CE ranking. In particular, the mechanism reduces to the GSP (Google) and Overture mechanisms under the same assumptions under which CE ranking reduces to the respective rankings (described in Section 5.2).
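The pricing rule and the two properties above (order preservation and individual rationality) can be illustrated with a minimal Python sketch. It rests on assumptions that the text as reproduced here does not confirm: the weight in Equation 15 is taken to be perceived relevance divided by the sum of perceived relevance and abandonment probability, the pricing order is taken to be decreasing weight times bid, and the bottom ad is assumed to pay zero (no reserve price). The names `weight` and `ce_prices` and the sample numbers are hypothetical.

```python
def weight(c, gamma):
    # Assumed form of the pricing weight: perceived relevance c over (c + abandonment gamma).
    return c / (c + gamma)


def ce_prices(ads):
    """ads: list of (name, bid, c, gamma).

    Returns [(name, bid, cpc, weight)] in the assumed pricing order
    (decreasing weight * bid).
    """
    ordered = sorted(ads, key=lambda a: weight(a[2], a[3]) * a[1], reverse=True)
    priced = []
    for i, (name, bid, c, gamma) in enumerate(ordered):
        w = weight(c, gamma)
        if i + 1 < len(ordered):
            _, next_bid, next_c, next_gamma = ordered[i + 1]
            # Smallest bid that keeps this ad above the next one in the pricing order.
            cpc = weight(next_c, next_gamma) * next_bid / w
        else:
            cpc = 0.0  # assumption: no reserve price for the last ad
        priced.append((name, bid, cpc, w))
    return priced


# Hypothetical ads: (name, bid, perceived relevance, abandonment probability).
sample_ads = [
    ("a", 2.0, 0.3, 0.1),
    ("b", 1.5, 0.2, 0.2),
    ("c", 3.0, 0.1, 0.2),
    ("d", 1.0, 0.4, 0.05),
]
priced = ce_prices(sample_ads)
for name, bid, cpc, w in priced:
    print(f"{name}: bid={bid:.2f} cpc={cpc:.3f} weight*cpc={w * cpc:.3f}")
```

Under these assumptions every CPC stays at or below the corresponding bid, and sorting by weight times CPC gives the same order as sorting by weight times bid, which is the behaviour Theorem 2 and Lemma 1 describe.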

Lemma 2

The mechanism reduces to Overture ranking with second price auction on the assumption

Proof:

This assumption implies

Lemma 3

The mechanism reduces to GSP on assumption

Proof:

This assumption implies

This in conjunction with Theorem 2 implies that GSP ranking by (i.e. by bids) is the same as the ranking by (by CPCs).

Now we will look at the equilibrium properties of the mechanism. Truth telling is not a dominant strategy. This trivially follows, since GSP is a special case of the proposed mechanism, and it is known that for GSP truth telling is not a dominant strategy [Edelman et al. (2005)]. Hence we center our analysis on Nash Equilibrium conditions.

Theorem 3 (Nash Equilibrium)

Without loss of generality, assume that the advertisers are ordered in the decreasing order of where is the private value of the advertiser. The advertisers are in an envy free pure strategy Nash Equilibrium if

(19)

This equilibrium is socially optimal as well as optimal for search engines for the given CPC’s.

Proof Sketch: The inductive proof shows that for these bid values, no advertiser can increase his profit by moving up or down in the ranking. The full proof is given in Appendix A-3.

We do not rule out the existence of multiple equilibria. The stated equilibrium is particularly interesting, due to the simultaneous social optimality and search engine optimality.

The following remarks show that equilibria of other placement mechanisms are reduced cases of the proposed CE equilibrium, as a natural consequence of its generality. The stated equilibrium reduces to the equilibria of the Overture mechanism and GSP under the same assumptions under which the ranking reduces to the respective rankings.

Remark 5

The bid values

(20)

are a pure strategy Nash Equilibrium in Overture mechanism. This corresponds to the substitution of the assumption in Theorem 3.

The proof follows from Theorem 3, as both the pricing and the ranking are shown to be special cases of our proposed mechanism.

Similarly for GSP,

Remark 6

The bid values

(21)

are a pure strategy Nash Equilibrium in the GSP mechanism.

This equilibrium corresponds to the substitution of the assumption in Theorem 3. Since this is a special case, the proof for Theorem 3 is sufficient.

6.3 Comparison with VCG mechanism

We compare the revenue and equilibrium of mechanism with those of VCG [Vickrey (1961), Clarke (1971), Groves (1973)]. VCG auctions combine an optimal allocation (ranking) with VCG pricing. The VCG payment of a bidder is equal to the reduction in the revenues of the other bidders due to the presence of that bidder. A well known property is that VCG pricing with any socially optimal allocation has truth telling as the dominant strategy equilibrium.

In the context of online ads, the ranking that is optimal with respect to the bid amounts is the socially optimal ranking for VCG. This optimal ranking is ; as directly implied by Equation 1 on substituting for utilities. Hence this ranking combined with VCG pricing has truth telling as the dominant strategy equilibrium. At this dominant strategy equilibrium, bids equal true values, so the ranking is socially optimal for the advertisers' true values, as suggested in Equation 13.

The CE ranking function is different from VCG, since CE ranking by payments optimizes search engine profits. On the other hand, VCG ranks by bids, optimizing advertisers' profit. However, Theorem 2 shows that for the pricing used in , ordering of is the same as that of VCG. This order preserving property facilitates comparison of with VCG. The theorem below shows revenue dominance of CE over VCG for the same bid values of advertisers.

Theorem 4 (Search Engine Revenue Dominance)

For the same bid values for all the advertisers, the revenue of search engine by mechanism is greater than or equal to the revenue by VCG.

Proof Sketch: The proof is an induction based on the fact that the rankings by CE and VCG are the same, as mentioned above. The full proof is given in Appendix A-4.

This theorem shows that the CE mechanism is likely to provide higher revenue to the search engine even during transient times before the bids settle on equilibria.
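The revenue comparison in Theorem 4 can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions, not the paper's proof: a user at an ad with perceived relevance c and abandonment probability gamma is assumed to click with probability c, abandon with probability gamma, and otherwise move to the next ad; the pricing weight is assumed to be c / (c + gamma); the VCG payment of an ad is computed as the expected loss (in bid terms) it imposes on the ads below it; the last ad is assumed to pay zero. The function names and sample bids are hypothetical.

```python
def weight(c, gamma):
    # Assumed pricing weight: perceived relevance over (relevance + abandonment).
    return c / (c + gamma)


def revenues(ads):
    """ads: list of (bid, c, gamma). Returns (ce_revenue, vcg_revenue) per page view."""
    ordered = sorted(ads, key=lambda a: weight(a[1], a[2]) * a[0], reverse=True)
    n = len(ordered)

    # below[i]: expected bid-weighted value collected from ads i..n-1
    # for a user who has just reached position i.
    below = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        bid, c, gamma = ordered[i]
        below[i] = c * bid + (1.0 - (c + gamma)) * below[i + 1]

    ce_revenue = 0.0
    vcg_revenue = 0.0
    reach = 1.0  # probability that the user reaches the current position
    for i, (bid, c, gamma) in enumerate(ordered):
        mu = c + gamma
        if i + 1 < n:
            next_bid, next_c, next_gamma = ordered[i + 1]
            cpc = weight(next_c, next_gamma) * next_bid / weight(c, gamma)
        else:
            cpc = 0.0  # assumption: no reserve price
        # Second-price style revenue: click probability times CPC.
        ce_revenue += reach * c * cpc
        # Externality: without this ad, the ads below would be reached with
        # probability `reach` instead of `reach * (1 - mu)`.
        vcg_revenue += reach * mu * below[i + 1]
        reach *= 1.0 - mu
    return ce_revenue, vcg_revenue


# Hypothetical bids: (bid, perceived relevance, abandonment probability).
sample_ads = [(2.0, 0.3, 0.1), (1.5, 0.2, 0.2), (3.0, 0.1, 0.2), (1.0, 0.4, 0.05)]
ce_revenue, vcg_revenue = revenues(sample_ads)
print(f"CE revenue per view:  {ce_revenue:.4f}")
print(f"VCG revenue per view: {vcg_revenue:.4f}")
```

On this sample the second-price style revenue is at least the VCG revenue, consistent with the statement of the theorem under the assumed model.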

Based on Theorem 4 we prove revenue equivalence of the proposed equilibrium with dominant strategy equilibrium of VCG.

Theorem 5 (Equilibrium Revenue Equivalence)

At the equilibrium in Theorem 3, the revenue of search engine is equal to the revenue of the truthful dominant strategy equilibrium of VCG.

Proof Sketch: The proof is an inductive extension of the proof of Theorem 4. Please see Appendix A-5 for the complete proof.

Note that the equilibrium has lower bid values than VCG at the equilibrium, but provides the same profit to the search engine.

7 CE Ranking Considering Mutual Influences: Diversity Ranking

An assumption in CE ranking is that the entities are mutually independent as we pointed out in Section 3. In other words, the three parameters—, and —of an entity do not depend on other entities in the ranked list. In this section we relax this assumption and analyze the implications. Since the nature of the mutual influence may vary for different problems, we base our analysis on a specific well known problem—ranking considering diversity [Carterette (2010), Agrawal et al. (2009), Rafiei et al. (2010)].

Diversity ranking accounts for the fact that the utility of an entity is reduced by the presence of a similar entity above in the ranked list. This is a typical example of the mutual influence between the entities. All the existing objective functions for the diversity ranking are known to be NP-Hard [Carterette (2010)]. We analyze a most basic form of diversity ranking to explain why this is a fundamentally hard problem.

We modify the objective function in Equation 1 slightly to distinguish between the stand-alone utilities and the residual utilities—utility of an entity in the context of other entities in the list—as,

(22)

where denotes the residual utility.

We consider a simple case of the diversity ranking problem: a set of entities, all having the same utilities, perceived relevances and abandonment probabilities. Some of these entities may be repeated. If an entity in the ranked list is the same as an entity above it in the list, the residual utility of that entity becomes zero. In this case, it is intuitive that the optimal ranking places the maximum number of pairwise dissimilar entities in the top slots. The theorem below shows that even in this simple case the optimal ranking is NP-Hard.

Theorem 6

Diversity ranking optimizing expected utility in Equation 22 is NP-Hard.

Proof Sketch: The proof is by reduction from the independent set problem. See Appendix A-6 for the complete proof.
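The objective in this simple case can be made concrete with a brute-force Python sketch on a tiny instance. This is an illustration only and does not reflect the reduction in the proof: it assumes a browsing model in which the user clicks with probability C, abandons with probability G, and otherwise continues to the next result; it models similarity as exact repetition of a label; and it enumerates all orderings, which is feasible only for very small lists. The constants `U`, `C`, `G` and the item labels are hypothetical.

```python
from itertools import permutations

# Identical (hypothetical) parameters for every entity.
U = 1.0  # stand-alone utility
C = 0.3  # perceived relevance
G = 0.1  # abandonment probability


def expected_residual_utility(ranking):
    """Expected utility when a repeated entity contributes zero residual utility."""
    reach = 1.0  # probability that the user reaches the current position
    total = 0.0
    seen = set()
    for label in ranking:
        residual = 0.0 if label in seen else U
        total += reach * C * residual
        seen.add(label)
        reach *= 1.0 - (C + G)
    return total


items = ["a", "a", "b", "c", "b"]  # three distinct entities, two repeats
candidates = sorted(set(permutations(items)))
best = max(candidates, key=expected_residual_utility)
print(best, round(expected_residual_utility(best), 4))
```

In this toy instance any maximizing order places the three distinct entities in the top three slots, matching the intuition stated above. Exhaustive enumeration grows factorially with the list length, so it is not a practical algorithm.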

Moreover, the proof by reduction from independent set problem has more severe implications than NP-Hardness as shown in the following corollary,

Corollary 1

Constant ratio approximation of ranking considering diversity is hard.

Proof: The proof of the NP-Hardness theorem above shows that the independent set problem is a special case of diversity ranking. This implies that a constant ratio approximation algorithm for the optimal diversity ranking would be a constant ratio approximation algorithm for the independent set problem. Since constant ratio approximation of the independent set is known to be hard (cf. Garey and Johnson [Garey and Johnson (1976)] and Håstad [Håstad (1996)]), the corollary follows. To define hard: in his landmark paper, Håstad proved that independent set cannot be approximated within $n^{1-\epsilon}$ for any $\epsilon > 0$ unless all problems in NP are solvable in probabilistic polynomial time, which is widely believed not to be possible.

This section shows that the optimal ranking considering mutual influences of parameters is hard. We are leaving formulating approximation algorithms (not necessarily constant ratio) for future research.

Beyond proving the intractability of mutual influence ranking, we believe that the intractability of the simple scenario here explains why all diversity rankings are likely to be intractable. Further, the proof based on the reduction from the well explored independent set problem may help in adapting approximation algorithms from graph theory.

8 Conclusion and Future Work

We approach the web ranking as a utility maximization based on user’s click model, and derive the optimal ranking—namely CE ranking. The ranking is simple and intuitive; and optimal (for the given utilities) considering perceived relevance and abandonment probability of user behavior.

For specific assumptions on parameters, the ranking function reduces to a taxonomy of ranking functions in multiple ranking domains. The enumerated taxonomy will help to decide optimal ranking for a specific user behavior. In addition, the taxonomy shows that the existing document and ad ranking strategies are special cases of the proposed ranking function under specific assumptions.

To apply CE ranking to ad auctions, we incorporate a second price based pricing. The resulting CE mechanism has a Nash Equilibrium which simultaneously optimizes search engine and advertiser revenues. CE mechanism is revenue dominant over VCG for the same bid vectors, and has an equilibrium which is revenue equivalent with the truthful equilibrium of VCG.

Finally, we relax the assumption of independence between entities in CE ranking and consider diversity ranking. The ensuing analysis reveals that diversity ranking is an inherently hard problem, since even the basic formulations are NP-Hard and constant ratio approximation algorithms are unlikely.

Our previous simulation studies suggest significant improvement in profits by CE ranking over existing ranking strategies [Balakrishnan and Kambhampati (2008)]. As future research, assessing profits by the CE mechanism on a large scale search engine click log will quantify the improvement in a real environment. Learning and prediction of abandonment probability from click logs, as well as by parametric learning, are interesting problems. The suggested ranking is optimal for other web ranking scenarios with similar click models, such as product and friend recommendations, and may be extended to these problems. Further, effective approximation schemes for diversity ranking based on its similarity with the independent set problem may be investigated.

Appendix

A-1 Proof of Theorem 1

Proof:

Consider results and in positions and respectively. Let for notational convenience. The total expected utility from and when is placed above is

If the order of and are inverted by placing above , the expected utility from these entities will be,

Since utilities from all other results in the list will remain the same, the expected utility of placing above is greater than inverse placement iff

This means that if entities are ranked in the descending order of , any inversion will reduce the profit. Hence ranking by is optimal.
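The exchange argument can be sanity-checked by brute force on a small instance. The Python sketch below makes assumptions that are not spelled out in this appendix as reproduced here: the user at each result clicks with probability c, abandons with probability g, and otherwise moves on to the next result, and the ranking score is taken to be c * u / (c + g). The function names and the sample entities are hypothetical.

```python
from itertools import permutations


def expected_utility(ranking):
    """Expected utility of a ranked list of (utility, relevance, abandonment) tuples."""
    reach = 1.0  # probability that the user reaches the current position
    total = 0.0
    for u, c, g in ranking:
        total += reach * c * u
        reach *= 1.0 - (c + g)
    return total


def click_efficiency(entity):
    # Assumed ranking score: utility scaled by c / (c + g).
    u, c, g = entity
    return c * u / (c + g)


# Hypothetical entities: (utility, perceived relevance, abandonment probability).
entities = [(1.0, 0.3, 0.1), (2.0, 0.1, 0.2), (0.5, 0.6, 0.05), (1.5, 0.2, 0.3)]

by_ce = sorted(entities, key=click_efficiency, reverse=True)
best = max(permutations(entities), key=expected_utility)

print(expected_utility(by_ce), expected_utility(best))
```

For this sample, the list sorted by the assumed score attains the same expected utility as the best of all 24 orderings, while sorting costs only O(n log n).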

A-2 Proof of Theorem 2

Proof:

Without loss of generality, we assume that refers to ad in the position in the descending order of .

A-3 Proof of Theorem 3

Let there be advertisers. Without loss of generality, let us assume that advertisers are indexed in the descending order of . We prove the equilibrium in two steps.

Step 1: Prove that

(A-1)
Proof:

Expanding by Equation 19,

Notice that is a convex linear combination of and . This means that the value of is in between (or equal to) the values of and . Hence to prove that all we need to prove is that . This inductive proof is given below.

Induction hypothesis: Assume that

Base case: Prove for i.e. for the bottommost ad.

Assuming

Induction: Expanding by Equation 19,

is the convex linear combination, i.e , as we know that by induction hypothesis. Consequently,

This completes the induction.

Since advertisers are ordered by for pricing, the above proof says that the pricing order is the same as the assumed order in this proof (i.e. ordering by ). Consequently,

As corollary of Theorem 2 we know that .

In the second step we prove the envy free equilibrium using results in Step 1.

Step 2: No advertiser can increase his profit by changing his bids unilaterally.

Proof of Envy Freeness to Advertisers Below: First, let us prove that ad cannot increase his profit by decreasing his bid to move to a position below.

Inductive hypothesis: Assume true for .

Base Case: Trivially true for .

Induction: Prove that the expected profit of at is less than or equal to the expected profit of at .

Let denote the amount paid by when he is at the position . By inductive hypothesis, the expected profit at is less than or equal to the expected profit at . So we just need to prove that the expected profit at is less than or equal to the expected profit at , i.e.

Canceling the common terms,

(A-2)

—the price charged to at position —is based on the Equations 16 and  19. Since the is moving downward, will occupy position by shifting ad upwards. Hence the ad just below is . Consequently, the price charged to when it is at the position is,

Substituting for and in Equation A-2,

Simplifying, and multiplying both sides by

Substituting by from Equation 19 on RHS.

Canceling out the common terms on both sides,

Which is true by the assumed order as

Inductive proof for is somewhat similar and enumerated below.

Inductive hypothesis: Assume true for .

Base Case: Trivially true for .

Proof of envy freeness to the ad one above:

The case in which increases his bid to move one position up, i.e. to , is a special case and needs to be proved separately. In this case, by moving a single slot up, the index of the ad below will change from to (a difference of two). For all other movements of to a position one above or one below, the index of the advertiser below will change only by one. Since the amount paid by depends on the ad below , this case warrants a slightly different proof.

Expanding is straightforward. To expand , note that when has moved upwards to , the ad just below is . Since has not changed its bids, the can be expanded as . Substituting for and ,

Simplifying and multiplying by

Substituting from Equation 19

We now prove separately that each term on the RHS is greater than or equal to the corresponding term on the LHS.

Which is true by our assumed order.

Similarly,

Which is true by Equation A-1 above. This completes the proof for this case.

Induction: Prove that the expected profit at is less than or equal to the expected profit at . The proof is similar to the induction for the case .

Proof: The base case is trivially true.

Canceling common terms,

In this case, note that is moving upwards. This means that will occupy position by pushing the ad originally at one position downwards. Hence the original ad at is the one just below now. i.e.

Substituting for and

Simplifying and multiplying by

Substituting by from Equation 19

Canceling common terms,

Which is true by the assumed order as .

A-4 Proof of Theorem 4

Proof:

VCG payment of the ad at position (i.e. ) is equal to the reduction in utility of the ads below due to the presence of . For each user viewing the list of ads (i.e. for unit view probability), the total expected loss of ads below due to is,