Optimal Mix of Incentive Strategies for Product Marketing on Social Networks
Abstract
We consider the problem of devising incentive strategies for viral marketing of a product. In particular, we assume that the seller can influence penetration of the product by offering two incentive programs: a) direct incentives to potential buyers (influence) and b) referral rewards for customers who influence potential buyers to make the purchase (exploit connections). The problem is to determine the optimal timing of these programs over a finite time horizon. In contrast to the algorithmic perspective popular in the literature, we take a mean-field approach and formulate the problem as a continuous-time deterministic optimal control problem. We show that the optimal strategy for the seller has a simple structure and can take both forms, namely, influence-and-exploit and exploit-and-influence. We also show that in some cases it may be optimal for the seller to deploy incentive programs mostly for low-degree nodes. We support our theoretical results through numerical studies and provide practical insights by analyzing various scenarios.
Keywords:
Incentive Strategies, Social Networks, Viral Marketing
1 Introduction
A key research topic in multi-agent systems is to understand the effect of micro-dynamics/interactions between agents on macroscopic properties. Often the agents are a part of a social structure such as a social network. A common example is that of a social network that consists of potential buyers of a particular new product offering in the market. These buyers interact with each other and influence each other’s purchase decisions through word of mouth and/or behavior. This so-called social influence exerted by agents on their neighboring agents in the network has a significant role to play in generating a network effect on the sales of the product. The idea of viral marketing is essentially to exploit the (macroscopic) network effects that result from the micro-dynamics between the agents in the network.
Viral marketing is receiving much attention from practicing marketers and academics alike. While not a new idea, it has come to the forefront because of multiple effects: products have become more complex, leading buyers to rely increasingly on the opinions of their peers; consumers have evolved to distrust advertising; and Web 2.0 has revolutionized the way people can connect, communicate and share. With power shifting to consumers, it has become important for sellers to devise effective viral marketing strategies [8]. This work is motivated by this urgent need.
For social influence to work, there must be seeds, i.e., product advocates to start with. The sellers, therefore, employ two basic strategies. The first is to create advocates, by providing incentives to potential buyers to make an actual purchase. These incentives are typically in the form of discounts, free goodies, etc. The second is to reward product advocates who ‘put in a good word’ and influence potential buyers to make the purchase. Thus, the latter program helps to exploit the impact of social influence while making a purchasing decision whereas the former program helps to directly influence the buying behavior by offering discounts.
Since incentives come at a cost, a seller must balance the revenue she generates through these strategies and the expenditure she incurs in doing so. This poses some nontrivial challenges. The first is determining incentives themselves, since response of an individual is contingent on them (too low a referral reward may not elicit recommendation from an individual since personal reputation is usually at stake). Secondly, the two programs are not necessarily causally connected. The reputation of a firm or a brand might create product advocates without incentives, thereby, requiring a seller to launch a referral program directly. This necessitates careful ‘timing’ of these programs.
The objective of this paper is to shed some light on this practically important and theoretically interesting problem. In particular, we seek to determine an optimal timing of these programs over a finite time horizon.
1.1 Related Work
In recent years, problems such as these have attracted much attention. Several papers investigate ‘influence maximization’ (see, for example, Domingos and Richardson [7], Kempe, Kleinberg, and Tardos [11], Bharathi, Kempe, and Salek [4], Chen, Yuan, and Zhang [6]), where the problem is to determine the set of initial adopters who, through an influence process, can maximize the future adoptions of the product. Auriol and Benaim [2] discuss a dynamic model of how standards and norms emerge in decentralized economies. Hartline, Mirrokni, and Sundararajan [10] and Arthur, Motwani, Sharma, and Xu [1] consider the problem of ‘revenue maximization’ for viral marketing and are close in spirit to the problem we consider in this paper.
In Hartline et al. [10] a model is proposed in which the purchase decision of a buyer is influenced by individuals who own the product and the price at which the product is offered. An optimal pricing policy is derived using dynamic programming in a symmetric setting (i.e., identical buyers). In a general setting, finding an optimal strategy is shown to be NP-hard and approximation algorithms are considered. The authors suggest an influence-and-exploit strategy in which selected buyers are given the product for free, and the seller extracts revenue by making a random sequence of offers and using a greedy pricing strategy for the remaining buyers to compensate for the initial loss.
Arthur et al. [1] also consider a model in which a buyer’s decision is influenced by friends who own the product and the price at which the product is offered. Sales are assumed to cascade through the social network. The seller offers cashback to recommenders and also sets a price for each buyer. The authors show that determining an optimal strategy to maximize expected revenue is NP-hard and propose a non-adaptive influence-and-exploit policy, which offers the product for free to the interior nodes of the max-leaf spanning tree of the network and later exploits their influence by extracting more revenue from the leaf nodes of the tree. They show that the expected revenue generated from the non-adaptive strategy is within a constant factor of the optimal revenue from an adaptive strategy.
1.2 Our Contributions
We consider a seller interested in selling a product to a population of agents. The product is assumed to be durable and free from network externalities. From the seller’s perspective, each agent assumes one of the following types at any point in time: potential buyer (one who is yet to make a purchase), customer (one who has purchased the seller’s product) and competitor’s customer (one who has purchased a competing product).
In practice sellers have limited knowledge about the social network underlying the population, typically in the form of a class-level statistical description of it. A class comprises agents who are considered essentially identical on a variety of factors (chosen by the seller), such as demographics, economic level, number of social contacts and so on. In this paper we consider this setup. However, to keep it simple, we assume heterogeneity only in terms of network connections (in particular, probabilities of purchase under either external or social influence are assumed to be the same for all agents); hence classes are based only on the number of social contacts (degrees). The seller thus knows only the degree distribution and degree-degree correlation of the social network.
This class-level statistical description of the agent population allows us to approximate the stochastic evolution of the purchase dynamics by a deterministic process described by ordinary differential equations (ODEs). This is formally established as a mean-field limit, taking the number of agents to infinity [3]. With an ODE limit, we pose the problem as a continuous-time optimal control problem and employ the well-known Pontryagin’s Maximum Principle [12] to characterize an optimal control. An optimal control specifies for each class the times at which the direct incentives and referral rewards programs are to be executed. The following are our main results.

We show that an optimal control has a simple structure: the seller needs to run each of the programs at most twice for a certain duration. Moreover, it is non-adaptive (or open-loop). This simplifies the implementation and, in practice, can help a seller pre-allocate the budget for her campaign.

While the influence-and-exploit strategy turns out to be optimal when social influence is strong in the population, the exploit-and-influence strategy can be optimal when the seller has a good reputation.

In some settings, the seller may be better off incentivizing low degree nodes as against the popular approach of targeting the influentials (high degree nodes). This, we believe, provides some support to the findings reported in [14] in reference to the influentials hypothesis.
The approach we have taken to address the problem is entirely different from the ones in the literature. While a large size of the population presents a challenge to the earlier approaches, it, in fact, aids us in migrating to a simpler deterministic description of the dynamics. The assumption that agents of a class are indistinguishable also fits in naturally with the popular marketing approach of customer segmentation and allows a seller to customize incentives and referral rewards as per these segments.
In contrast to earlier papers, we have also modeled competition. This is not only close to reality but, interestingly, it also allows us to address some problems in completely different contexts. For example, in limiting the spread of misinformation about an entity or an Internet virus, the objective is to maximize nodes with correct information or security patches by immunizing them (akin to direct incentives) and/or incentivizing them to spread the information they have to their neighbors (akin to referrals). Our results are, thus, applicable to these problems as well (see [5] for discussion of the influence limitation problem).
2 Problem Formulation
Consider a population of agents, indexed by . The underlying social network is specified by an undirected graph . Each agent is identified with a node in and means that and are social contacts and they influence each other in decisionmaking.
denotes the state of agent . can take three values: (indicates potential buyer), (indicates customer) and (indicates competitor’s customer). Let .
Each agent makes the purchase decision at a random time point, independent of all others. It suffices to assume that time is discrete (denoted by ) and at each time step, an agent is chosen uniformly randomly from the population for a potential state change. Since there are no repeat purchases, and are absorbing states. Therefore, the state change occurs only if the chosen agent is a potential buyer. Suppose agent is chosen at a time . Then one of the following happens if is a potential buyer.

buys the seller’s product on her own with probability (For example, means that there is 8% chance that a potential buyer will buy the seller’s product on her own).

buys the competitor’s product on her own with probability .

selects one of her social contacts at random. If the selected contact is a customer, buys the seller’s product under social influence with probability (For example, means that there is 10% chance that a potential buyer will buy the seller’s product if she interacts with someone who has already bought the product).

selects one of her social contacts at random. If the selected contact is a competitor’s customer, buys the competitor’s product under social influence with probability .
Clearly, the state process is a Markov chain.
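The four cases above can be illustrated with a minimal agent-level simulation. This is only a sketch under stated assumptions: the parameter names `a_s`, `a_c` (own-purchase probabilities for the seller and the competitor) and `b_s`, `b_c` (social-influence probabilities) are hypothetical labels for the probabilities described in the text, the four cases are treated as mutually exclusive events within one time step, and the network is passed in as an adjacency dictionary.

```python
import random

# State labels (hypothetical encoding, not the paper's notation).
POTENTIAL, CUSTOMER, COMPETITOR = 0, 1, 2

def simulate_purchases(neighbors, steps, a_s, a_c, b_s, b_c, seed=0):
    """Run the discrete-time purchase chain on a given network.

    neighbors: dict mapping each agent to a list of her social contacts.
    a_s, a_c:  assumed own-purchase probabilities (seller, competitor).
    b_s, b_c:  assumed social-influence probabilities (seller, competitor).
    """
    if a_s + a_c + b_s + b_c > 1.0:
        raise ValueError("case probabilities must sum to at most 1")
    rng = random.Random(seed)
    state = {agent: POTENTIAL for agent in neighbors}
    agents = list(neighbors)
    for _ in range(steps):
        i = rng.choice(agents)          # one agent chosen uniformly per step
        if state[i] != POTENTIAL:
            continue                    # both purchase states are absorbing
        r = rng.random()
        if r < a_s:                     # Case 1: buys seller's product on her own
            state[i] = CUSTOMER
        elif r < a_s + a_c:             # Case 2: buys competitor's product on her own
            state[i] = COMPETITOR
        elif r < a_s + a_c + b_s:       # Case 3: influenced by a contact who is a customer
            if neighbors[i] and state[rng.choice(neighbors[i])] == CUSTOMER:
                state[i] = CUSTOMER
        elif r < a_s + a_c + b_s + b_c: # Case 4: influenced by a competitor's customer
            if neighbors[i] and state[rng.choice(neighbors[i])] == COMPETITOR:
                state[i] = COMPETITOR
    return state
```

For example, calling `simulate_purchases` on a ring network with a few thousand steps returns the final state of every agent; averaging states per degree class gives the empirical fractions that the mean-field description tracks.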
Now from the seller’s perspective, agents having the same degree are indistinguishable and the network is known only statistically, i.e., is drawn from an ensemble of random undirected graphs of size , a given degree distribution () and degree-degree correlation function , which denotes the probability that a given link from a node of degree is to a node of degree . Note that a number of well-known graphs such as homogeneous random graphs, exponential random graphs (e.g., and Watts-Strogatz network), and scale-free networks can be represented in this framework. We assume that remains uniformly bounded as .
Denote by and the fraction of degree-k agents who are potential buyers, customers and competitor’s customers respectively (note that normalization is with respect to the number of class-k agents; hence ). Let and . From the above assumption, it follows that is a Markov chain (as seen by the seller).
The drift of can be computed considering the four cases described above. Table 1 shows the corresponding probabilities and the change in for degree class k. Consider as an example Case 3. The probability of a randomly selected agent being a potential buyer of degree is . This agent randomly chooses one of her social contacts. The probability that this chosen one is an existing customer is = . The selected agent buys the seller’s product under the social influence from her contact with probability . Thus the probability of Case 3 is . One agent changes her state from (potential buyer) to (customer). Hence the effect on is .
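The Case 3 computation can be written out as a worked formula. The notation here is assumed for illustration and is not taken from the source: $N$ is the population size, $P(k)$ the degree distribution, $\pi(j \mid k)$ the degree-degree correlation function, $x_k$ and $y_k$ the fractions of degree-$k$ agents who are potential buyers and customers, and $\beta$ the probability of buying the seller’s product under social influence.

```latex
\Pr[\text{Case 3 for class } k]
  = \underbrace{P(k)\, x_k}_{\text{degree-}k\text{ potential buyer is chosen}}
    \times
    \underbrace{\sum_{j} \pi(j \mid k)\, y_j}_{\text{chosen contact is a customer}}
    \times
    \underbrace{\beta}_{\text{buys under social influence}},
\qquad
\Delta x_k = -\frac{1}{N\,P(k)}, \quad
\Delta y_k = +\frac{1}{N\,P(k)} .
```

The step size $1/(N P(k))$ reflects the normalization with respect to the number of class-$k$ agents: one agent out of roughly $N P(k)$ changes state.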
Case  Probability  Effect on 

1  
2  
3  
4 
We now make the dependence on the population size explicit and denote by the drift of . is as follows.
Observe that the number of transitions per agent per time slot is of the order of the inverse of the population size, the second moment of the number of agent transitions per time slot is bounded, and is a smooth function of and . Let . It then follows from Theorem 1 of [3] that the time evolution of can be approximated by the following system of ODEs (with the same initial conditions).
(1) 
More explicitly, for
where, and .
The seller offers direct incentives and referral rewards to increase and respectively. We model this as follows. A referral reward of results in an increase of in and a direct incentive of causes to increase by . Thus, for the duration of the referral reward program, social influence rate of is operational and the seller incurs a cost of for every successful referral. Similarly, if the direct incentive program is executed for some duration, the take rate for the seller’s product increases to for that duration, incurring her a cost of for every sale. We normalize and with respect to the product price. Thus the price is fixed to . The seller’s problem of maximizing her profit (revenue minus cost) over a fixed time horizon by optimally timing the two programs can now be stated formally as follows.
Let (resp. ) denote the control variable indicating whether or not the referral reward program (resp. direct incentive program) is offered to class k at time . The cost incurred in running the referral reward program is
(2) 
Recall that the conversion rate of potential buyers under the program is . The cost incurred in the direct incentives program is
(3) 
Since the product price is unity, the revenue obtained is proportional to the number of customers at the end of horizon, . Denoting the total cost (2)+(3) by the problem is
subject to
for and a given initial condition .
Three remarks are in order. The assumption of heterogeneity only in the number of social contacts is mainly to keep the formulation simple and highlight the impact of network structure. Extending this formulation to a general setting is straightforward and will be taken up in a longer version of the paper. Our random interaction model essentially means that the social influence on a potential buyer is the average influence from her neighbors. This, we believe, is reasonable since we have also assumed the presence of external influence (through ) on agents.
3 Structure of Optimal Control
In this section we mathematically prove the structural properties of an optimal control. To keep the proof simple, we will assume that the network is drawn randomly from a set of regular networks of size and degree . This is without loss of generality.
Let , and denote the fraction of population in states at time respectively. Let denote whether or not the referral reward program is offered at time and let denote whether or not the direct incentive program is offered at time . The purchase dynamics under the influence of these programs are given as follows:
(4)  
(5)  
(6) 
From (4), (5), and (6), observe that . Therefore, it suffices to consider any two equations. Let . Let denote the state variable.
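To make the regular-network dynamics concrete, the following is a minimal forward-Euler sketch. The functional form is an assumption reconstructed from the verbal description of the four cases and of the two programs (own-purchase and social-influence rates, each raised while the corresponding program runs); the names `a_s`, `a_c`, `b_s`, `b_c`, `d_boost` and `r_boost` are hypothetical and do not come from the source. The three fractions are conserved by construction, in line with the observation above that two equations suffice.

```python
def simulate_mean_field(T, h, a_s, a_c, b_s, b_c, d_boost, r_boost, u, v,
                        x0=1.0, y0=0.0):
    """Forward-Euler integration of an assumed form of the purchase dynamics.

    x, y, z: fractions of potential buyers, customers, competitor's customers.
    u(t), v(t): 0/1 controls for the referral and direct incentive programs.
    d_boost, r_boost: assumed rate increases while a program is running.
    Returns a list of (t, x, y, z) tuples.
    """
    x, y = x0, y0
    z = 1.0 - x0 - y0
    trajectory = [(0.0, x, y, z)]
    for k in range(int(round(T / h))):
        t = k * h
        own_rate = a_s + d_boost * v(t)        # take rate for seller's product
        social_rate = b_s + r_boost * u(t)     # social influence rate
        to_seller = x * (own_rate + social_rate * y)
        to_competitor = x * (a_c + b_c * z)
        # Potential buyers leave at the total rate; the flows sum to zero.
        x, y, z = (x - h * (to_seller + to_competitor),
                   y + h * to_seller,
                   z + h * to_competitor)
        trajectory.append(((k + 1) * h, x, y, z))
    return trajectory
```

Passing constant controls (always off or always on) gives two reference trajectories against which any timed strategy can be compared.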
The optimal control problem in this simpler setting is as follows.
(7)  
subject to (4), (5) and the following constraints on state and control variables: for all , , and .
Our main result is given in Proposition 1. It shows that an optimal strategy for the seller is to deploy the two incentive programs for at most two distinct time periods.
Proposition 1

There exist () such that for and else.

There exist () such that for and else.
Proof
otherwise there is no problem to solve. Observe that is positively invariant. Therefore a solution starting from any initial point remains confined to . This allows us to disregard state constraints from the control formulation.
Let for all (This relaxation allows us to establish existence of an optimal control. We show that the optimal controls are indeed ‘bang-bang’, i.e., for all ). Writing the problem in Mayer form, it can be seen that the state space (appropriately expanded with additional variables) is bounded and positively invariant (thus, state trajectories remain bounded for all admissible pairs); and the system is affine in controls (see (7), (4) and (5)). Existence of an optimal control is now established by the Filippov-Cesari theorem.
From (4), (5), and (7), the Hamiltonian is written as follows.
(8)  
denotes costate variables. Then according to Pontryagin’s Maximum Principle, there exist continuous and piecewise continuously differentiable costate functions and that satisfy
(9)  
(10)  
at all where and are continuous and satisfy the following transversality condition
(11) 
and also satisfy, for all , and ,
(12) 
From (8) and (12), we get the following form for controls.
(13)  
(14) 
In case of equality in the conditions specified in equations (13) and (14), and may take any arbitrary values in .
Let and .
We denote by the Hamiltonian along the optimal state-control trajectory at time . The following lemma proves that the Hamiltonian always remains positive.
Lemma 1
.
Proof
The lemma below shows that the costate variables remain positive for the whole duration.
Lemma 2
.
Proof
Suppose for all and let at (at least one exists since ). Then otherwise since . Observe from (9) that if . Strict inequality in (13) implies that at , is continuous. Therefore, in the neighborhood of . Thus implies for all and which violates (11). It follows that for all . This in turn implies that for all otherwise . ∎
Lemma 3
.
Proof
Suppose not. Let at . Then and, therefore, . (8) then yields , a contradiction. ∎
Let . The lemma that follows shows that is a decreasing function.
Lemma 4
.
Proof
Now consider . From (9), (10), and (8) we get
is monotonically decreasing (Lemma 4). From (4) exponentially (). is a positive constant (Lemma 1).
Assume that at three points in time , , . Therefore, for either or . Without loss of generality, let us say . From the above equation it follows that which is not feasible as . It follows that at at most two points in time. Therefore, there exist such that for and , and elsewhere.
Similarly, one can show that there exist such that for and otherwise. The proposition is, thus, established.
Proposition 1 implies that both the referral reward and direct incentives programs are to be deployed at most twice for certain durations, one in the beginning and the other at the end. It may happen that both the durations are of length zero, which means that a program is not deployed at all. On the other hand, it could also get deployed over the complete time horizon . This gives a simple and elegant marketing strategy which is easy to implement for the seller.
The structure of the above optimal control is quite intuitive. In the case of the referral reward program, the cost is proportional to the product of the number of potential buyers and the number of customers. Hence to keep the cost low, rewards are declared in the initial stage (when the number of customers is small) to motivate product advocates and may also be paid at the end (when the number of potential buyers is small) to acquire some additional customers.
In the case of direct incentives, the cost is proportional to the number of potential buyers. If the initial take rate for the product is low, this program may get executed at initial stages to quickly acquire customers whose social influence can be exploited in the later stages; otherwise more agents may buy the competitor’s product and attract other potential buyers. Towards the end of the campaign, the number of potential buyers is small; hence direct incentives may be offered to attract additional customers.
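The two-interval structure can be encoded with just two switching times per program. The helper below is a hypothetical illustration (the name `two_interval_control` and the half-open interval convention are assumptions): the program is on at the start until `t1` and again from `t2` until the horizon `T`.

```python
def two_interval_control(t1, t2, T):
    """Return u(t) that is 1 on [0, t1) and [t2, T), and 0 in between.

    t1 = 0 and t2 = T give a program that is never deployed;
    t1 = t2 gives a program that runs over the whole horizon.
    """
    if not (0.0 <= t1 <= t2 <= T):
        raise ValueError("switching times must satisfy 0 <= t1 <= t2 <= T")

    def u(t):
        return 1 if (t < t1 or t >= t2) else 0

    return u
```

With this parametrization, choosing a strategy for one program reduces to choosing the pair `(t1, t2)`.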
4 Numerical Results
The simple structure of optimal controls given by Proposition 1 allows one to devise incentives programs quite easily by numerical optimization of ’s. Here we obtain an independent validation of optimal controls by discretizing (7), casting it as a nonlinear constrained optimization problem and using a gradient descent approach to find an optimal solution. (For a discussion of various numerical solution techniques for such problems, refer to [12].) For all our experiments, the time horizon and discretization step size are fixed at 10 and 0.1 respectively. The NLP formulation is not convex. Therefore, we use a multi-start mechanism to determine an optimal solution. Results are also verified using the commercial package PROPT, which uses pseudospectral methods for solving such problems.
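As a rough illustration of optimizing switching times directly, the sketch below evaluates the seller’s profit for given switching times and runs a coarse exhaustive grid search. This is not the gradient-descent NLP or the PROPT procedure described above; it is a simpler stand-in. The dynamics, the initial condition (everyone a potential buyer), the cost terms `c_r` and `c_d` per conversion, and all parameter names are assumptions made for illustration; only the horizon of 10 and step size of 0.1 come from the text.

```python
from itertools import combinations_with_replacement

def profit(t1, t2, s1, s2, p, T=10.0, h=0.1):
    """Profit (final customers minus incentive costs) under assumed dynamics.

    (t1, t2): switching times of the referral program, on in [0, t1) and [t2, T).
    (s1, s2): switching times of the direct incentive program, same convention.
    p: dict of assumed parameters a_s, a_c, b_s, b_c, d_boost, r_boost, c_d, c_r.
    """
    x, y, z = 1.0, 0.0, 0.0   # assumed initial condition: no purchases yet
    cost = 0.0
    for k in range(int(round(T / h))):
        t = k * h
        u = 1 if (t < t1 or t >= t2) else 0
        v = 1 if (t < s1 or t >= s2) else 0
        direct_sales = x * (p["a_s"] + p["d_boost"] * v)
        referral_sales = x * (p["b_s"] + p["r_boost"] * u) * y
        competitor_sales = x * (p["a_c"] + p["b_c"] * z)
        # Each program pays its per-conversion cost only while it is running.
        cost += h * (p["c_d"] * v * direct_sales + p["c_r"] * u * referral_sales)
        x -= h * (direct_sales + referral_sales + competitor_sales)
        y += h * (direct_sales + referral_sales)
        z += h * competitor_sales
    return y - cost

def best_switching_times(p, T=10.0, h=0.1, grid_step=2.0):
    """Exhaustive search over a coarse grid of switching times."""
    grid = [i * grid_step for i in range(int(round(T / grid_step)) + 1)]
    pairs = list(combinations_with_replacement(grid, 2))  # all (a, b) with a <= b
    candidates = ((profit(t1, t2, s1, s2, p, T, h), (t1, t2, s1, s2))
                  for t1, t2 in pairs for s1, s2 in pairs)
    return max(candidates, key=lambda item: item[0])
```

A finer `grid_step`, or a local refinement around the best grid point, would be the natural next step; the coarse grid is chosen here only to keep the example fast.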
In this paper, we will primarily investigate the initial condition . This captures the case when the seller and the competitor(s) enter the market with substitutable products at around the same time (e.g., gaming technologies) or when the seller introduces an independent product into the market (e.g., a book). Of course, similar results can be obtained for the case where seller and/or competitor already have some presence in the market ( and/or ).
We fix , for all numerical studies, and consider and as the base scenario (These parameter values are arbitrary and only roughly based on some available data). The optimal marketing strategy is shown in Figure 1. It is optimal for the seller to run both the incentive programs initially for some duration, stop and then run the programs again towards the end. Note that the optimal strategy is open loop; hence estimates of and are not required for implementation.
It is, thus, possible for the seller to determine the timing of her incentives programs numerically. Experimentation with different values of payouts and (which essentially fix and ) can be used to understand tradeoffs and optimize these payouts.
In the following we investigate two important questions pertaining to the interplay between the two incentives programs and the impact of network structure on them. The former question is important because the influence-and-exploit strategy has received much attention in the literature. As we show below, the exploit-and-influence strategy can also come into play for some parameter settings. The second question is linked to the so-called influentials hypothesis, which informally says that high-degree agents (hubs) play a significant role in product diffusion and are, therefore, natural targets for incentives (direct or referral rewards). We show that in some cases the seller is better off incentivizing low-degree agents (more than high-degree ones). Thus, our results highlight the need for a careful consideration of the network structure while making incentive decisions.
4.1 Interplay between Referral and Direct Incentive Programs
When social influence is strong in the population, i.e., is higher, the seller needs to employ only direct incentives initially. For example, if is set to in the base scenario then it is optimal to offer referral rewards only at the end and that too for a short period as shown in Figure 3. This can be seen as a manifestation of the influence-and-exploit strategy. On the other hand, if the seller has established a good reputation in the market, translating into a higher value of , an initial influence step through direct incentives may not even be required. See from Figure 3 that when in the base scenario, it is optimal for the seller to offer direct incentives only at the end. In this case, for most of the time horizon, the seller must exploit connections of existing customers and only at the end must she impart direct influence on potential buyers. We call it the exploit-and-influence strategy for the seller.
4.2 Impact of Network Structure on Incentive Programs
Real-world networks show strong degree correlation amongst connected nodes. Some networks show assortative mixing of nodes by degrees where high-degree nodes have most of their connections to other high-degree nodes. Others show disassortative mixing where high-degree nodes have most of their connections to low-degree nodes ([13]). In this section, we examine the impact of network structure on incentives programs.
We consider an undirected correlated network with nodes belonging to either of two classes A and B with probability and respectively. Class A nodes are of high degree, say and class B nodes are of low degree, say . is the probability that a given link from class node points to a class node. can be computed from the following balance equation:
(15) 
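A balance equation of this kind follows from counting the links between the two classes from each side of an undirected network: the number of links leaving one class for the other must match the number arriving. The sketch below assumes the standard form of that identity; the function name and the symbols `q_a`, `q_b` (class probabilities), `k_a`, `k_b` (class degrees) and `pi_b_given_a` (probability that a link from one class points to the other) are hypothetical stand-ins.

```python
def cross_link_probability(q_a, k_a, q_b, k_b, pi_b_given_a):
    """Solve q_a * k_a * pi(B|A) = q_b * k_b * pi(A|B) for pi(A|B).

    Both sides count the cross-class links per node of the population,
    once from the first class and once from the second.
    """
    pi_a_given_b = q_a * k_a * pi_b_given_a / (q_b * k_b)
    if not 0.0 <= pi_a_given_b <= 1.0:
        raise ValueError("inputs do not describe a consistent undirected network")
    return pi_a_given_b
```

For instance, with 20% of nodes having degree 10 and sending half of their links to the other class, and 80% of nodes having degree 5, a quarter of the low-degree nodes’ links must point back to the high-degree class.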
We consider two types of network structures. One structure represents assortative mixing whereas the other one represents disassortative mixing. To keep things simple and derive key insights, we assume that the seller is optimizing implementation of only one incentives program, namely, the referral rewards program. The seller can offer referral rewards to class A and/or class B nodes to increase their social influence rate () by . As earlier, she incurs a per-conversion cost of after normalizing with respect to the product price.
We fix for our numerical studies. For disassortative network, we set whereas for assortative network, we set .
The optimal timing of the referral reward program for the disassortative network is shown in Figure 7. The seller’s optimal strategy is to offer referral rewards to class B nodes for the complete duration whereas rewards to class A nodes are offered initially for a short duration and then again towards the end for a short duration. In this case, class B nodes have almost half of their connections going to class A nodes. Also, a major fraction of the population is from class B. So, referral rewards are offered to class B nodes for the entire duration as this increases influence not only on class B nodes but also on class A nodes. Class A nodes are not rewarded for the entire duration in order to control the cost.
In the case of the assortative network, the optimal strategy changes completely (see Figure 7). The seller offers referral rewards to class A nodes for the complete duration whereas rewards to class B nodes are offered initially for some duration and then again towards the end. In this case, nodes from both the classes are well connected amongst themselves with very few connections going across the classes. So, the optimal reward strategies for both the classes are essentially independent. For this particular scenario, it turns out that it is optimal to offer rewards to class A nodes for the entire duration as the cost incurred is not much. In the case of class B nodes, by contrast, referral rewards are discontinued for some duration in the middle as the cost overshoots the potential revenue.
The results show that networks with different structures can result in different optimal strategies for the seller. In some scenarios, the seller may be better off incentivizing low degree nodes as against the popular approach of targeting the influentials (high degree nodes), thus, providing some support to the finding in [14]. In some scenarios, the seller may be better off targeting influentials thus supporting the results in [9]. Thus, our results highlight the need for a careful consideration of the network structure while making incentive decisions.
5 Conclusion
In this paper we have addressed the problem of optimal timing of two incentive programs, namely, direct incentives and referral rewards, for product diffusion through social networks. Departing from the existing approaches, we formulate the problem as a continuous-time deterministic optimal control problem. The optimal strategy for the seller is to deploy these programs in at most two distinct time periods. The simplicity of this structure and its non-adaptive nature make it ideal for implementation in practice. We further show that if the seller has a good reputation in the market, the exploit-and-influence strategy can be optimal whereas if social influence is strong in the population, the influence-and-exploit strategy can be optimal for the seller. In the case of correlated networks, our numerical studies show that the seller need not necessarily offer more frequent referral reward programs to high-degree nodes to maximize her profit.
There are two immediate directions for future work: extend heterogeneity of agents to include their external and social influence probabilities and devise procedures to estimate model parameters.
Footnotes
 All competitors are aggregated into one single virtual competitor.
 For the lack of clear empirical evidence, one may also consider total influence from the neighbors. Mathematically, it is a simple modification to our formulation.
References
[1] D. Arthur, R. Motwani, A. Sharma, and Y. Xu. Pricing strategies for viral marketing on Social Networks. In Proceedings of Workshop on Internet and Network Economics, pages 101–112, 2009.
[2] E. Auriol and M. Benaim. Standardization in decentralized economies. American Economic Review, 90(3):550–570, 2000.
[3] M. Benaim and J. Y. Le Boudec. A Class of Mean Field Interaction Models for Computer and Communication Systems. Performance Evaluation, 65:823–838, 2008.
[4] S. Bharathi, D. Kempe, and M. Salek. Competitive influence maximization in social networks. In Proceedings of Workshop on Internet and Network Economics, pages 306–311, 2007.
[5] C. Budak, D. Agrawal, and A. El Abbadi. Limiting the Spread of Misinformation in Social Networks. In Proceedings of the 20th International Conference on World Wide Web, pages 665–674, 2011.
[6] W. Chen, Y. Yuan, and L. Zhang. Scalable influence maximization in social networks under the linear threshold model. In Proceedings of 10th International Conference on Data Mining, 2010.
[7] P. Domingos and M. Richardson. Mining the Network Value of Customers. In 7th International Conference on Knowledge Discovery and Data Mining, pages 57–66, 2001.
[8] D. Godes, D. Mayzlin, Y. Chen, S. Das, C. Dellarocas, B. Pfeiffer, B. Libai, S. Sen, M. Shi, and P. Verlegh. The Firm’s Management of Social Interactions. Marketing Letters, 16(3/4):415–428, 2005.
[9] J. Goldenberg, S. Han, D. R. Lehmann, and J. W. Hong. The Role of Hubs in the Adoption Processes. Journal of Marketing, 73:1–13, March 2009.
[10] J. Hartline, V. S. Mirrokni, and M. Sundararajan. Optimal Marketing Strategies over Social Networks. In Proceedings of 17th International Conference on World Wide Web, pages 189–198, 2008.
[11] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the Spread of Influence through a Social Network. In Proceedings of Knowledge Discovery and Data Mining, pages 137–146, 2003.
[12] D. E. Kirk. Optimal Control Theory: An Introduction. Dover, 1970.
[13] M. E. J. Newman. Assortative Mixing in Networks. Physical Review Letters, 89(20), November 2002.
[14] D. Watts and P. S. Dodds. Influentials, Networks, and Public Opinion Formation. Journal of Consumer Research, 34, December 2007.