# Constrained Non-Monotone Submodular Maximization: Offline and Secretary Algorithms

## Abstract

Constrained submodular maximization problems have long been studied, most recently in the context of auctions and computational advertising, with near-optimal results known under a variety of constraints when the submodular function is *monotone*. The case of non-monotone submodular maximization is less well understood: the first approximation algorithms even for the unconstrained setting were given by Feige et al. *(FOCS ’07)*. More recently, Lee et al. *(STOC ’09, APPROX ’09)* show how to approximately maximize non-monotone submodular functions when the constraints are given by the intersection of p matroids; their algorithm is based on local-search procedures that consider p-swaps, and hence the running time may be n^Ω(p), implying their algorithm is polynomial-time only for constantly many matroids.

In this paper, we give algorithms that work for *p-independence systems* (which generalize constraints given by the intersection of p matroids), where the running time is polynomial in both the size of the ground set and p. Both our algorithms and analyses are simple: our algorithm essentially reduces the non-monotone maximization problem to multiple runs of the greedy algorithm previously used in the monotone case. Our idea of using existing algorithms for monotone functions to solve the non-monotone case also works for maximizing a submodular function with respect to *a knapsack constraint*: we get a simple greedy-based constant-factor approximation for this problem.

With these simpler algorithms, we are able to adapt our approach to constrained non-monotone submodular maximization to the *(online) secretary setting*, where elements arrive one at a time in random order, and the algorithm must make irrevocable decisions about whether or not to select each element as it arrives. We give constant approximations in this secretary setting when the algorithm is constrained subject to a uniform matroid or a partition matroid, and give an O(log r) approximation when it is constrained by a general matroid of rank r.

## 1 Introduction

We present algorithms for maximizing (not necessarily monotone) non-negative submodular functions f satisfying f(∅) = 0 under a variety of constraints considered earlier in the literature. Lee et al. [28] gave the first algorithms for these problems via local-search algorithms: in this paper, we consider greedy approaches that have been successful for *monotone* submodular maximization, and show how these algorithms can be adapted very simply to non-monotone maximization as well. Using this idea, we show the following results:

We give an O(p)-approximation for maximizing submodular functions subject to a p-independence system. This extends the result of Lee et al. [28], which applied to constraints given by the intersection of p matroids, where p was a constant. (Intersections of p matroids give p-independence systems, but the converse is not true.) Our greedy-based algorithm has a run-time polynomial in the ground-set size and p, and hence gives the first polynomial-time algorithms for non-constant values of p.

We give a constant-factor approximation for maximizing submodular functions subject to a knapsack constraint. This greedy-based algorithm gives an alternate approach to solve this problem; Lee et al. [28] gave LP-rounding-based algorithms that achieved a (1/5 − ε)-approximation for constraints given by the intersection of k knapsack constraints, where k is a constant.

Armed with simpler greedy algorithms for non-monotone submodular maximization, we are able to perform constrained non-monotone submodular maximization in several special cases in the secretary setting as well: when items arrive online in random order, and the algorithm must make irrevocable decisions as they arrive.

We give O(1)-approximations for maximizing submodular functions subject to a cardinality constraint and subject to a partition matroid. (Using a reduction of [3], the latter implies O(1)-approximations to, e.g., graphical matroids.) Our secretary algorithms are simple and efficient.

We give an O(log r)-approximation for maximizing submodular functions subject to an arbitrary matroid constraint of rank r. This matches the known bound for the *matroid secretary problem*, in which the function to be maximized is simply linear.

No prior results were known for submodular maximization in the secretary setting, even for *monotone* submodular maximization; there is some independent work, see §1.3 for details.

Compared to previous offline results, we trade off small constant factors in the approximation ratios of our algorithms for exponential improvements in run time: maximizing non-monotone submodular functions subject to p (constant) matroid constraints currently has a 1/(p + 1 + 1/(p−1) + ε)-approximation due to a paper of Lee, Sviridenko and Vondrák [29], using an algorithm with run-time exponential in p. For p = 1, the best result is a 0.309-approximation by Vondrák [34]. In contrast, our algorithms have run time only linear in p, but our approximation factors are worse by constant factors for the small values of p where previous results exist. We have not tried to optimize our constants, but it seems likely that matching, or improving on, the previous results for constant p will need more than just choosing the parameters carefully. We leave such improvements as an open problem.

### 1.1 Submodular Maximization and Secretary Problems in an Economic Context

Submodular maximization and secretary problems have both been widely studied in their economic contexts. The problem of selecting a subset of people in a social network to maximize their influence in a viral marketing campaign can be modeled as a constrained submodular maximization problem [22]. When costs are introduced, the influence minus the cost gives us *non-monotone* submodular maximization problems; prior to this work, *online* algorithms for non-monotone submodular maximization problems were not known. Asadpour et al. studied the problem of adaptive stochastic (monotone) submodular maximization with applications to budgeting and sensor placement [2], and Agrawal et al. showed that the *correlation gap* of submodular functions was bounded by a constant using an elegant cost-sharing argument, and related this result to social-welfare-maximizing auctions [1]. Finally, secretary problems, in which elements arriving in random order must be selected so as to maximize some constrained objective function, have well-known connections to online auctions [23]. Our simpler *offline* algorithms allow us to generalize these results to give the first secretary algorithms capable of handling a non-monotone submodular objective function.

### 1.2 Our Main Ideas

At a high level, the simple yet crucial observation for the offline results is this: many of the previous algorithms and proofs for constrained monotone submodular maximization can be adapted to show that the set S they produce satisfies f(S) ≥ β·f(S ∪ C*), for some β > 0 and C* being an optimal solution. In the monotone case, the right-hand side is at least β·f(C*) and we are done. In the non-monotone case, we cannot do this. However, we observe that if f(S ∩ C*) is a reasonable fraction of f(C*), then (approximately) finding the most valuable set within S would give us a large value; and since we work with constraints that are downwards closed, finding such a set is just *unconstrained* maximization of f restricted to S, for which Feige et al. [14] give good algorithms! On the other hand, if f(S ∩ C*) is small and f(S) is also too small, then one can show that deleting the elements in S and running the procedure again to find another set S′ with f(S′) ≥ β·f(S′ ∪ (C* \ S)) would guarantee a good solution! Details for the specific problems appear in the following sections; we first consider the simplest cardinality constraint case in §2 to illustrate the general idea, and then give more general results in §3.1 and §3.2.

For the secretary case where the elements arrive in random order, algorithms were not known for the monotone case either: the main complication is that we cannot run a greedy algorithm (since the elements are arriving randomly), and moreover the value of an incoming element depends on the previously chosen set of elements. Furthermore, to extend the results to the non-monotone case, one needs to avoid the local-search algorithms (which, in fact, motivated the above results), since these algorithms necessarily implement multiple passes over the input, while the secretary model only allows a single pass over it. The details on all these are given in §4.

### 1.3 Related Work

**Monotone Submodular Maximization.**

The (offline) monotone submodular optimization problem has been long studied: Fisher, Nemhauser, and Wolsey [31] showed that the greedy and local-search algorithms give a (1 − 1/e)-approximation under cardinality constraints, and a 1/2-approximation under a matroid constraint. In another line of work, [20] showed that the greedy algorithm is a 1/p-approximation for maximizing a *modular* (i.e., additive) function subject to a p-independence system. This proof extends to show a 1/(p+1)-approximation for monotone submodular functions under the same constraints (see, e.g., [8]). A long-standing open problem was to improve on these results; nothing better than a 1/2-approximation was known even for monotone maximization subject to a single partition matroid constraint. Calinescu et al. [7] showed how to maximize monotone submodular functions representable as weighted matroid rank functions subject to any matroid with an approximation ratio of (1 − 1/e), and soon thereafter, Vondrák extended this result to *all* monotone submodular functions [33]; these highly influential results appear jointly in [8]. Subsequently, Lee et al. [29] give algorithms that beat the 1/(p+1) bound for p matroid constraints with p ≥ 2 to get a 1/(p + ε)-approximation.

*Knapsack constraints.* Sviridenko [32] extended results of Wolsey [35] and Khuller et al. [25] to show that a greedy-like algorithm with partial enumeration gives a (1 − 1/e)-approximation to monotone submodular maximization subject to a knapsack constraint. Kulik et al. [27] showed that one could get essentially the same approximation subject to any constant number of knapsack constraints. Lee et al. [28] give a (1/5 − ε)-approximation for the same problem in the non-monotone case.

*Mixed Matroid-Knapsack Constraints.* Chekuri et al. [10] give strong concentration results for dependent randomized rounding with many applications; one of these applications is a (1 − 1/e − ε)-approximation for monotone maximization with respect to a matroid and any constant number of knapsack constraints. [17] extends ideas from [9] to give polynomial-time algorithms for non-monotone submodular maximization with respect to a p-system together with knapsack constraints, with approximation guarantees depending on p, both when the number of knapsacks is constant and (with a weaker guarantee) when it is arbitrary; at a high level, their idea is to “emulate” a knapsack constraint by a polynomial number of partition matroid constraints.

**Non-Monotone Submodular Maximization.**

In the non-monotone case, even the unconstrained problem is NP-hard (it captures max-cut). Feige, Mirrokni and Vondrák [14] first gave constant-factor approximations for this problem. Lee et al. [28] gave the first approximation algorithms for constrained non-monotone maximization (subject to matroid constraints, or knapsack constraints); the approximation factors were improved by Lee et al. [29]. The algorithms in the previous two papers are based on local search with p-swaps and would take n^Ω(p) time. Recent work by Vondrák [34] gives much further insight into the approximability of submodular maximization problems.

**Secretary Problems.**

The original secretary problem seeks to maximize the probability of picking the element in a collection having the highest value, given that the elements are examined in random order [12]. The problem was used to model item-pricing problems by Hajiaghayi et al. [19]. Kleinberg [23] showed that the problem of maximizing a *modular* function subject to a cardinality constraint in the secretary setting admits a (1 − O(1/√k))-approximation, where k is the cardinality. (We show that maximizing a *submodular* function subject to a cardinality constraint cannot be approximated to better than some universal constant, independent of the value of k.) Babaioff et al. [5] wanted to maximize modular functions subject to matroid constraints, again in a secretary setting, and gave constant-factor approximations for some special matroids, and an O(log r) approximation for general matroids having rank r. This line of research has seen several developments recently [6].

#### Independent Work on Submodular Secretaries

Concurrently and independently of our work, Bobby Kleinberg has given an algorithm similar to that in §4.1 for monotone submodular secretary maximization under a cardinality constraint [24]. Again independently, Bateni et al. consider the problem of non-monotone submodular maximization in the secretary setting [4]; they give a different constant-factor approximation subject to a cardinality constraint, as well as approximations subject to (multiple) matroid constraints and knapsack constraints in the secretary setting. While we do not consider multiple constraints, it is easy to extend our results to comparable guarantees using standard techniques.

### 1.4 Preliminaries

Given a set S and an element e, we use S + e to denote S ∪ {e}. A function f : 2^X → ℝ is *submodular* if for all S, T ⊆ X, f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T). Equivalently, f is submodular if it has *decreasing marginal utility*: i.e., for all S ⊆ T ⊆ X, and for all e ∉ T, f(S + e) − f(S) ≥ f(T + e) − f(T). Also, f is called *monotone* if f(S) ≤ f(T) for S ⊆ T. Given f and S ⊆ X, define the function f_S as f_S(A) = f(S ∪ A) − f(S). The following facts are standard.
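To make the definitions concrete, the following sketch checks the two equivalent characterizations of submodularity on a small coverage function; the instance is hypothetical and chosen only for illustration:

```python
from itertools import combinations

# A small coverage function f(S) = |union of the sets indexed by S|: a
# standard example of a non-negative submodular function with f(empty) = 0.
SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(S):
    covered = set()
    for i in S:
        covered |= SETS[i]
    return len(covered)

def marginal(f, S, e):
    # f_S(e) = f(S + e) - f(S)
    return f(S | {e}) - f(S)

def is_submodular(f, universe):
    # Decreasing marginal utility: for all S subset of T and e not in T,
    # the marginal value of e w.r.t. S is at least that w.r.t. T.
    subsets = [set(c) for r in range(len(universe) + 1)
               for c in combinations(sorted(universe), r)]
    return all(marginal(f, S, e) >= marginal(f, T, e)
               for S in subsets for T in subsets if S <= T
               for e in universe - T)
```

Coverage functions are monotone; the cut functions used later in the paper are submodular but non-monotone.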

**Matroids.** A *matroid* is a pair M = (X, I), where I ⊆ 2^X contains ∅; if B ∈ I and A ⊆ B, then A ∈ I; and for every A, B ∈ I with |A| < |B|, there exists e ∈ B \ A such that A + e ∈ I. The sets in I are called *independent*, and the *rank* of a matroid is the size of any maximal independent set (base) in it. In a *uniform* matroid of rank k, I contains all subsets of size at most k. In a *partition* matroid, we have disjoint groups g₁, …, gₗ with gᵢ ⊆ X and ⋃ᵢ gᵢ = X; the independent sets are those S such that |S ∩ gᵢ| ≤ 1 for each i.

**Unconstrained (Non-Monotone) Submodular Maximization.** We use FMV(S) to denote an α-approximation algorithm given by Feige, Mirrokni, and Vondrák [14] for unconstrained submodular maximization in the non-monotone setting: it returns a set T ⊆ S such that f(T) ≥ α · max_{T′ ⊆ S} f(T′). In fact, Feige et al. present many such algorithms; the best approximation ratio among these is α = 2/5 via a local-search algorithm, and the easiest is a 1/4-approximation that just returns a uniformly random subset of S.
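The random-subset guarantee can be checked exactly on a tiny instance. The sketch below uses a hypothetical cut function (a standard non-monotone submodular example) and computes E[f(R)] for a uniformly random subset R by exhaustive averaging; for cut functions each edge is cut with probability exactly 1/2, so the expectation is |E|/2, comfortably above the 1/4 bound:

```python
from itertools import combinations

# Cut function of a small undirected graph: f(S) = number of edges with
# exactly one endpoint in S. Non-negative, submodular, non-monotone, f(empty)=0.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
NODES = {1, 2, 3, 4}

def f(S):
    return sum(1 for (u, v) in EDGES if (u in S) != (v in S))

def all_subsets(universe):
    elems = sorted(universe)
    for r in range(len(elems) + 1):
        for c in combinations(elems, r):
            yield set(c)

def expected_random_subset_value(f, universe):
    # E[f(R)] where R keeps each element independently with probability 1/2:
    # computed exactly as the plain average of f over all 2^n subsets.
    subs = list(all_subsets(universe))
    return sum(f(S) for S in subs) / len(subs)

def unconstrained_opt(f, universe):
    return max(f(S) for S in all_subsets(universe))
```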

## 2 Submodular Maximization subject to a Cardinality Constraint

We first give an offline algorithm for submodular maximization subject to a cardinality constraint: this illustrates our simple approach, upon which we build in the following sections. Formally, we are given a ground set X, an integer k, and a non-negative submodular function f that is potentially non-monotone, but has f(∅) = 0. We want to approximate max_{S ⊆ X, |S| ≤ k} f(S). The greedy algorithm starts with S = ∅, and repeatedly picks an element with maximum marginal value f_S(e) until it has k elements.

Suppose not, i.e., suppose the greedy set S has f(S) < ½·f(S ∪ C*), where C* is an optimal solution. Then f(S ∪ C*) − f(S) > f(S), and hence, since by submodularity the left-hand side is at most the sum of the at most k marginal values f_S(e) over e ∈ C* \ S, there is at least one such element e that has f_S(e) > f(S)/k. Since we ran the greedy algorithm, at each step this element would have been a contender to be added, and by submodularity, e’s marginal value would have been only higher then. Hence the elements actually added in each of the k steps would have had marginal value more than e’s marginal value at that time, which is more than f(S)/k. This implies that f(S) > k · f(S)/k = f(S), a contradiction.

This theorem is existentially tight: observe that if the function f is just the cardinality function f(S) = |S|, and if S and C* happen to be disjoint, then f(S) = f(C*) = ½·f(S ∪ C*).
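A minimal sketch of the greedy routine, together with a brute-force check of the bound f(S) ≥ ½·f(S ∪ C*) on a toy coverage instance; the instance and the tie-breaking by sorted order are our own illustrative choices:

```python
from itertools import combinations

# Hypothetical coverage instance: f(S) = size of the union of COVER[e], e in S.
COVER = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"},
         4: {"d", "e", "f"}, 5: {"a", "f"}}

def f(S):
    covered = set()
    for e in S:
        covered |= COVER[e]
    return len(covered)

def greedy(f, universe, k):
    # k rounds of picking the element of maximum marginal value
    # (elements are added even when the marginal gain is not positive).
    S = set()
    for _ in range(k):
        e = max(sorted(universe - S), key=lambda x: f(S | {x}) - f(S))
        S = S | {e}
    return S

def best_of_size_at_most(f, universe, k):
    # Brute-force optimum over all subsets of size at most k.
    best = set()
    for r in range(k + 1):
        for c in combinations(sorted(universe), r):
            if f(set(c)) > f(best):
                best = set(c)
    return best
```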

By submodularity, and since S₂ is disjoint from S₁, it follows that f(S₁ ∪ C*) + f(S₂ ∪ (C* \ S₁)) ≥ f(S₁ ∪ S₂ ∪ C*) + f(C* \ S₁). Again using submodularity, we get f(C* \ S₁) + f(S₁ ∩ C*) ≥ f(C*) + f(∅) = f(C*). Putting these together and using non-negativity of f, the lemma follows.

We now give our algorithm Submod-Max-Cardinality for submodular maximization: it has the same multi-pass structure as that of Lee et al., but uses the greedy analysis above instead of a local-search algorithm.

Let C* be the optimal solution, with OPT = f(C*). We know that

f(S₁) ≥ ½·f(S₁ ∪ C*).  (1)

Also, if f(S₁ ∩ C*) is at least ε·OPT, then we know that the α-approximate algorithm FMV gives us a value of at least αε·OPT. Else, f(S₁ ∩ C*) < ε·OPT.

Similarly, we get that f(S₂) ≥ ½·f(S₂ ∪ (C* \ S₁)). Adding this to (Equation 1), we get

f(S₁) + f(S₂) ≥ ½·( f(S₁ ∪ C*) + f(S₂ ∪ (C* \ S₁)) ) ≥ ½·( f(C*) − f(S₁ ∩ C*) ) ≥ ½·(1 − ε)·OPT,

where we used the lemma above and f(S₁ ∩ C*) < ε·OPT to get from the second expression to the third. Hence max{f(S₁), f(S₂)} ≥ (1 − ε)·OPT/4. The approximation factor now is min{αε, (1 − ε)/4}. Setting ε = 1/(4α + 1), we get an α/(4α + 1)-approximation, as claimed.

Using the known value of α = 2/5 from Feige et al. [14], we get a 2/13 ≈ 0.153-approximation for submodular maximization under cardinality constraints. While this is weaker than the 0.309-approximation of Vondrák [34], or even the (1/4 − ε)-approximation we could get from Lee et al. [28] for this special case, the algorithm is faster, and the idea behind the improvement works in several other contexts, as we show in the following sections.
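The multi-pass structure above can be sketched as follows. Here an exhaustive search plays the role of FMV (so α = 1 and the guarantee of the analysis becomes α/(4α + 1) = 1/5); this idealized stand-in is only sensible on toy instances, and the instance itself is hypothetical:

```python
from itertools import combinations

# Toy coverage instance.
COVER = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}, 4: {"d", "e", "f"},
         5: {"a", "f"}, 6: {"g"}}

def f(S):
    covered = set()
    for e in S:
        covered |= COVER[e]
    return len(covered)

def greedy(f, universe, k):
    # Greedy with ties broken by sorted order; elements added even at zero gain.
    S = set()
    for _ in range(min(k, len(universe))):
        e = max(sorted(universe - S), key=lambda x: f(S | {x}) - f(S))
        S = S | {e}
    return S

def best_subset(f, ground):
    # Idealized stand-in for FMV: exhaustive unconstrained maximization over
    # subsets of `ground` (exponential, so only usable for tiny instances).
    best = set()
    for r in range(len(ground) + 1):
        for c in combinations(sorted(ground), r):
            if f(set(c)) > f(best):
                best = set(c)
    return best

def submod_max_cardinality(f, universe, k):
    S1 = greedy(f, universe, k)        # first greedy pass
    S2 = greedy(f, universe - S1, k)   # second pass on the leftovers
    S3 = best_subset(f, S1)            # |S3| <= |S1| <= k, so S3 is feasible
    return max((S1, S2, S3), key=f)
```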

## 3 Fast Algorithms for p-Systems and Knapsacks

In this section, we show our greedy-style algorithms, which achieve an O(p)-approximation for submodular maximization over p-systems, and a constant-factor approximation for submodular maximization over a knapsack. Due to space constraints, many proofs are deferred to the appendices.

### 3.1 Submodular Maximization for Independence Systems

Let X be a universe of elements and consider a collection I ⊆ 2^X of subsets of X. I is called an *independence system* if (a) ∅ ∈ I, and (b) if A ∈ I and B ⊆ A, then B ∈ I as well. The subsets in I are called *independent*; for any set Y of elements, an inclusion-wise maximal independent subset of Y is called a *basis* of Y. For brevity, we say that A is a basis if it is a basis of X. An independence system (X, I) is a *p-independence system* (or *p-system*) if, for every Y ⊆ X, the ratio of the cardinalities of any two bases of Y is at most p.

See, e.g., [8] for a discussion of independence systems and their relationship to other families of constraints; it is useful to recall that the intersection of p matroids forms a p-independence system.

#### The Algorithm for p-Independence Systems

Suppose we are given a p-independence system (X, I) and a non-negative submodular function f that is potentially non-monotone, but has f(∅) = 0. We want to find (or at least approximate) max_{S ∈ I} f(S). The greedy algorithm for this problem is what you would expect: start with the set S = ∅, and at each step pick an element e that maximizes f_S(e) and ensures that S + e is also independent. If no such element exists, the algorithm terminates; else we set S ← S + e, and repeat. (Ideally, we would also check whether f_S(e) < 0, and terminate at the first time this happens; we don’t do that, and instead we add elements even when the marginal gain is negative until we cannot add any more elements without violating independence.) The proof of the following lemma appears in the appendix, and closely follows that for the monotone case from [8].
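A sketch of this greedy routine, using the intersection of two hypothetical partition matroids as a 2-independence system; the brute-force check below verifies the (p + 1)-factor relation between the greedy set and an optimal independent set on this instance:

```python
from itertools import combinations

# Hypothetical coverage function over named elements.
COVER = {"a1": {1, 2}, "a2": {2, 3}, "b1": {3, 4, 5}, "b2": {5, 6}, "c1": {1, 6, 7}}

def f(S):
    covered = set()
    for e in S:
        covered |= COVER[e]
    return len(covered)

def independent(S):
    # Intersection of two partition matroids, hence a 2-independence system:
    # at most one element per letter group and one per digit group.
    letters = [e[0] for e in S]
    digits = [e[1] for e in S]
    return len(letters) == len(set(letters)) and len(digits) == len(set(digits))

def greedy_p_system(f, universe, independent):
    # Repeatedly add the feasible element of maximum marginal value (ties
    # broken by sorted order), even at non-positive gain, until maximal.
    S = set()
    while True:
        cands = [e for e in sorted(universe - S) if independent(S | {e})]
        if not cands:
            return S
        S = S | {max(cands, key=lambda e: f(S | {e}) - f(S))}

def best_independent(f, universe, independent):
    # Brute-force optimum over all independent sets.
    best = set()
    for r in range(len(universe) + 1):
        for c in combinations(sorted(universe), r):
            cs = set(c)
            if independent(cs) and f(cs) > f(best):
                best = cs
    return best
```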

The algorithm Submod-Max-p-Systems for maximizing a non-monotone submodular function with f(∅) = 0 over a p-independence system now immediately suggests itself.

Let C* be an optimal solution with OPT = f(C*), and let Cᵢ = C* \ (S₁ ∪ ⋯ ∪ S_{i−1}) for all i; hence C₁ = C*. Note that Cᵢ is a feasible solution to the greedy optimization in iteration i. Hence, by the lemma above, we know that (p + 1)·f(Sᵢ) ≥ f(Sᵢ ∪ Cᵢ). Now, if for some i it holds that f(Sᵢ ∩ C*) ≥ ε·OPT (for an ε to be chosen later), then the guarantees of FMV ensure that f(FMV(Sᵢ)) ≥ αε·OPT, and we will get an αε-approximation. Else, it holds for all i that f(Sᵢ ∩ C*) < ε·OPT and

(p + 1)·f(Sᵢ) ≥ f(Sᵢ ∪ Cᵢ).

Now we can add all these inequalities, divide by p + 1, and use the argument from [28] to infer that maxᵢ f(Sᵢ) is at least a (1 − O(ε))/O(p) fraction of OPT. (While Claim 2.7 of [28] is used in the context of a local-search algorithm, it uses just the submodularity of the function f, and the facts that the sets Sᵢ are pairwise disjoint and that Cᵢ = C* \ (S₁ ∪ ⋯ ∪ S_{i−1}) for every i.) Thus the approximation factor is the minimum of αε and this bound. Setting ε = Θ(1/(αp)) to balance the two terms, we get the claimed approximation ratio.

Note that even using α = 2/5, our approximation factors differ from the ratios in Lee et al. [28] by a small constant factor. However, the proof here is somewhat simpler, and it also works seamlessly for all p-independence systems instead of just intersections of matroids. Moreover, our running time is only linear in the number of matroids, instead of being exponential as in the local search: previously, no polynomial-time algorithms were known for this problem when p was super-constant. Note that running the greedy procedure just twice instead of p + 1 times reduces the run-time further; we can then use the simple lemma of Section 2 instead of the full power of Claim 2.7 of [28], and hence the constants are slightly worse.

### 3.2 Submodular Maximization over Knapsacks

The paper of Sviridenko [32] gives a greedy algorithm with partial enumeration that achieves a (1 − 1/e)-approximation for *monotone* submodular maximization with respect to a knapsack constraint. In particular, each element e has a size s_e, and we are given a bound B: the goal is to maximize f(S) over subsets S such that Σ_{e ∈ S} s_e ≤ B. His algorithm is the following: for each possible subset of at most three elements, start with that subset and iteratively include the element which maximizes the gain in the function value per unit size, subject to the resulting set still fitting in the knapsack. (If none of the remaining elements gives a positive gain, or none fits in the knapsack, stop.) Finally, from among these solutions, choose the best one; Sviridenko shows that in the monotone submodular case, this is a (1 − 1/e)-approximation algorithm. One can modify Sviridenko’s algorithm and proof to show the following result for non-monotone submodular functions. (The details are in the appendix.)
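The partial-enumeration-plus-density-greedy scheme can be sketched on a tiny hypothetical monotone coverage instance; this is a simplified illustration of the approach, not Sviridenko's exact procedure:

```python
import math
from itertools import combinations

# Hypothetical monotone coverage instance with item sizes and a budget.
COVER = {"x": {1, 2, 3}, "y": {3, 4}, "z": {5}, "w": {1, 4, 5, 6}}
SIZE = {"x": 3, "y": 2, "z": 1, "w": 4}
BUDGET = 5

def f(S):
    covered = set()
    for e in S:
        covered |= COVER[e]
    return len(covered)

def size(S):
    return sum(SIZE[e] for e in S)

def density_greedy(start):
    # From a seed set, repeatedly add the element with the best positive
    # marginal value per unit size that still fits in the knapsack.
    S = set(start)
    while True:
        cands = [e for e in sorted(COVER)
                 if e not in S and size(S) + SIZE[e] <= BUDGET
                 and f(S | {e}) > f(S)]
        if not cands:
            return S
        S = S | {max(cands, key=lambda e: (f(S | {e}) - f(S)) / SIZE[e])}

def sviridenko_style():
    # Partial enumeration over all feasible seed sets of at most 3 elements,
    # followed by density greedy from each; return the best solution found.
    best = set()
    for r in range(4):
        for c in combinations(sorted(COVER), r):
            seed = set(c)
            if size(seed) <= BUDGET:
                S = density_greedy(seed)
                if f(S) > f(best):
                    best = S
    return best
```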

Note that the tight example for cardinality constraints shows that we cannot hope to do better than a factor of 1/2. Now using an argument very similar to that in Section 2 gives us the following result for non-monotone submodular maximization with respect to a knapsack constraint.

## 4 Constrained Submodular Maximization in the Secretary Setting

In this section, we will give algorithms for submodular maximization in the secretary setting: first subject to a cardinality constraint, then with respect to a partition matroid, and finally an algorithm for general matroids. The main algorithmic concerns tackled in this section when developing secretary algorithms are: (a) previous algorithms for non-monotone maximization required local-search, which seems difficult in an online secretary setting, so we developed greedy-style algorithms; (b) we need multiple passes for non-monotone optimization, and while that can be achieved using randomization and running algorithms in parallel, these parallel runs of the algorithms may have correlations that we need to control (or better still, avoid); and of course (c) the marginal value function changes over the course of the algorithm’s execution as we pick more elements—in the case of partition matroids, e.g., this ever-changing function creates several complications.

We also show an information-theoretic lower bound: no secretary algorithm can approximately maximize a submodular function subject to a cardinality constraint to a factor better than some universal constant greater than 1, independent of the cardinality k. (This is ignoring computational constraints, and so the computational inapproximability of offline submodular maximization does not apply.) This is in contrast to the additive secretary problem, for which Kleinberg gives a secretary algorithm achieving a (1 − O(1/√k))-approximation [23]. This lower bound appears in the appendix. (For a discussion about independent work on submodular secretary problems, see §1.3.)

### 4.1 Subject to a Cardinality Constraint

The offline algorithm presented in Section 2 builds three potential solutions and chooses the best amongst them. We now want to build just one solution in an *online* fashion: elements arrive one at a time, and when an element is added to the solution, it is never discarded subsequently. We first give an online algorithm that is given the optimal value OPT as input, but where the elements can come in *worst-case* order (we call this an “online algorithm with advice”). Using sampling ideas we can estimate OPT, and hence use this advice-taking online algorithm in the secretary model where elements arrive in random order.

To get the advice-taking online algorithm, we make two changes. First, we do not use the greedy algorithm which selects elements of highest marginal utility, but instead use a *threshold algorithm*, which selects any element that has marginal utility above a certain threshold. Second, we change the FMV step of Algorithm Submod-Max-Cardinality to use the variant of FMV which simply selects a random subset of the elements, giving a 1/4-approximation to the unconstrained submodular maximization problem [14]. The *Threshold Algorithm* with inputs (τ, k) simply selects each element as it appears if it has marginal utility at least τ, up to a maximum of k elements.

The claim is immediate if the algorithm picks k elements, so suppose it does not pick k elements, and also suppose f(S ∪ C*) − f(S) > kτ. Then, by submodularity, Σ_{e ∈ C* \ S} f_S(e) > kτ. By averaging, this implies there exists an element e ∈ C* \ S such that f_S(e) > τ; this element cannot have been chosen into S (otherwise its marginal value would not appear in this sum), but it would have been chosen into S when it was considered by the algorithm (since at that time its marginal value would only have been higher). This gives the desired contradiction.
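The threshold rule is easy to state in code. The sketch below runs it on a hypothetical coverage instance (with τ and the arrival order chosen for illustration) and checks both cases of the argument: either k elements of marginal value at least τ are picked, or no small set can add more than kτ in marginal value:

```python
from itertools import combinations

# Hypothetical coverage instance; elements arrive in a fixed worst-case order.
COVER = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e", "f"},
         5: {"f", "g", "h"}}

def f(S):
    covered = set()
    for e in S:
        covered |= COVER[e]
    return len(covered)

def threshold_pick(stream, tau, k):
    # Irrevocably select any arriving element whose marginal value is at
    # least tau, up to a maximum of k elements.
    S = set()
    for e in stream:
        if len(S) == k:
            break
        if f(S | {e}) - f(S) >= tau:
            S = S | {e}
    return S
```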

We show that f(S₁) + f(S₂) + f(S₃) ≥ c·OPT for a suitable constant c, and picking a random one of these gets a third of that in expectation. Indeed, if S₁ or S₂ has k elements, then the corresponding set has value at least kτ. Else, if f(S₁ ∩ C*) ≥ ε·OPT, then FMV guarantees that, in expectation, f(S₃) ≥ (1/4)·ε·OPT. Else, f(S₁) + f(S₂) ≥ f(S₁ ∪ C*) + f(S₂ ∪ (C* \ S₁)) − 2kτ, which by the lemma of Section 2 is at least f(C*) − f(S₁ ∩ C*) − 2kτ ≥ (1 − ε)·OPT − 2kτ.

We can randomly choose which one of S₁, S₂, S₃ we want to output before observing any elements. Clearly S₁ can be determined online, as can S₂ by choosing any element that has high marginal value and is not chosen in S₁. Moreover, S₃ just selects elements from S₁ independently with probability 1/2.

Finally, it will be convenient to recall Dynkin’s algorithm: given a stream of numbers in random order, it samples the first 1/e fraction of the numbers and picks the next element that is larger than all elements in the sample.
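For completeness, a sketch of Dynkin's rule together with a Monte Carlo estimate of its success probability; the 1/e sample fraction is the classical choice, and the simulation parameters below are arbitrary:

```python
import math
import random

def dynkin(values):
    # Classical secretary rule: observe the first n/e values, then accept the
    # first later value beating everything seen so far (else take the last).
    n = len(values)
    m = int(n / math.e)
    benchmark = max(values[:m]) if m > 0 else float("-inf")
    for v in values[m:]:
        if v > benchmark:
            return v
    return values[-1]

def success_probability(n, trials, seed=0):
    # Monte Carlo estimate of the probability that dynkin() picks the maximum
    # of a uniformly random permutation of n distinct values.
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        vals = list(range(n))
        rng.shuffle(vals)
        if dynkin(vals) == n - 1:
            wins += 1
    return wins / trials
```

The empirical success rate hovers near the asymptotic 1/e ≈ 0.37 already for modest n.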

#### The Secretary Algorithm for the Cardinality Case

For a constrained submodular optimization problem, if we are given *(a)* an approximate offline algorithm, and also *(b)* an approximate online advice-taking algorithm that works given an estimate of OPT, we can now get an algorithm in the secretary model thus: we use the offline algorithm to estimate OPT on the first half of the elements, and then run the advice-taking online algorithm with that estimate on the remaining elements. The formal algorithm appears in the appendix. Because of space constraints, we have deferred the proof of the following theorem to the appendix as well.

### 4.2 Subject to a Partition Matroid Constraint

In this section, we give a constant-factor approximation for maximizing submodular functions subject to a partition matroid. Recall that in such a matroid, the universe is partitioned into “groups”, and the independent sets are those which contain at most one element from each group. To get a secretary-style algorithm for *modular (additive)* function maximization subject to a partition matroid, we can run Dynkin’s algorithm on each group independently. However, if we have a submodular function, the marginal value of an element depends on the elements previously picked—and hence the marginal value of an element as seen by the online algorithm and the adversary become very different.

We first build some intuition by considering a simpler “contiguous partitions” model where all the elements of each group arrive together (in random order), but the groups of the partition are presented in some *arbitrary* order. We then go on to handle the case when all the elements indeed come in completely random order, using what is morally a reduction to the contiguous-partitions case.

#### A Special Case: Contiguous Partitions

For the contiguous case, one can show that executing Dynkin’s algorithm with the obvious marginal valuation function is a good algorithm: this is not immediate, since the valuation function changes as we pick some elements—but it works out, since the groups come contiguously. Now, as in the previous section, one wants to run two parallel copies of this algorithm (with the second one picking elements from among those not picked by the first)—but the correlation causes the second algorithm to not see a random permutation any more! We get around this by coupling the two together as follows:

Initially, the algorithm determines which of 3 different modes (A, B, or C) it is in, uniformly at random. The algorithm maintains a set S of selected elements, initially S = ∅. When a group of the partition arrives, it runs Dynkin’s secretary algorithm on the elements from this group using the valuation function f_S. If Dynkin’s algorithm selects an element e, our algorithm flips a fair coin. If we are in mode A or C, we let S ← S + e if the coin is heads, and leave S unchanged otherwise. If we are in mode B, we do the reverse, and let S ← S + e if the coin is tails, and leave S unchanged otherwise. Finally, after the algorithm has completed, if we are in mode C, we discard each element of S independently with probability 1/2. (Note that we can actually implement this step online, by ‘marking’ but not selecting elements with probability 1/2 when they arrive.)

We first analyze the case in which the algorithm is in mode A or B. Consider a hypothetical run of *two* versions of our algorithm simultaneously, one in mode A and one in mode B, which share coins and produce sets S_A and S_B respectively. The two algorithms run with identical marginal distributions, but are coupled such that whenever both algorithms attempt to select the same element (each with probability 1/2), we flip only one coin, so one succeeds while the other fails. Note that S_A ∩ S_B = ∅, and so we will be able to apply the lemma of Section 2. For a fixed permutation π of the elements, let S_A^π be the set chosen by the mode-A algorithm for that particular permutation. As usual, we define OPT = f(C*) for an optimal solution C*. Hence, f(S_A^π ∪ C*) + f(S_B^π ∪ (C* \ S_A^π)) ≥ OPT − f(S_A^π ∩ C*), and taking expectations, we get

E[f(S_A ∪ C*)] + E[f(S_B ∪ (C* \ S_A))] ≥ OPT − E[f(S_A ∩ C*)].  (4)

Now, for any element e, let g(e) be the index of the group containing e; a chain of inequalities (5) then bounds E[f(S_A ∪ C*)] by E[f(S_A)] plus the expected marginal values of the elements of C*, where the first inequality is just subadditivity, the second submodularity, the third follows from the fact that Dynkin’s algorithm is a 1/e-approximation for the secretary problem (and selecting the element that Dynkin’s algorithm selects only with probability 1/2 gives a 1/(2e)-approximation), and the resulting telescoping sum gives the fourth equality. Now substituting (Equation 5) into (Equation 4) and rearranging, we get that E[f(S_A)] is at least a constant fraction of OPT − E[f(S_A ∩ C*)]. An identical analysis of the second hypothetical algorithm gives the analogous bound for E[f(S_B)].

It remains to analyze the case in which the algorithm runs in mode C. In this case, the algorithm generates a set S_C by selecting each element of S_A uniformly at random. By the theorem of [14], uniform random sampling achieves a 1/4-approximation to the problem of *unconstrained* submodular maximization. Therefore, we have in this case E[f(S_C)] ≥ (1/4)·E[f(S_A ∩ C*)]. Combining this with the bounds for modes A and B, since our algorithm outputs one of these three sets uniformly at random, it gets a constant-factor approximation to OPT.

#### General Case

We now consider the general secretary setting, in which the elements come in random order, not necessarily grouped by partition. Our previous approach will not work: we cannot simply run Dynkin’s secretary algorithm on contiguous chunks of elements, because some elements may be blocked by our previous choices. We instead do something similar in spirit: we divide the elements up into ‘epochs’, and attempt to select a single element from each. We treat every element that arrives before the current epoch as part of a sample, and according to the current valuation function at the beginning of an epoch, we select the first element that we encounter that has higher value than any element from its own partition group in the sample, so long as we have not already selected something from the same partition group. Our algorithm is as follows:

Initially, the algorithm determines which of 3 different modes (A, B, or C) it is in, uniformly at random. The algorithm maintains a set S of selected elements, initially S = ∅, and observes the first half of the elements without selecting anything. The algorithm then considers a sequence of epochs, where the i-th epoch is the set of contiguous elements arriving after the (i−1)-st epoch. At epoch i, we use the valuation function f_S. If an element e has higher value (under f_S) than any element from its own partition group that arrived earlier than epoch i, and we have not already selected an element from that group, we flip a fair coin. If we are in mode A or C, we let S ← S + e if the coin is heads, and leave S unchanged otherwise. If we are in mode B, we do the reverse, and let S ← S + e if the coin is tails, and leave S unchanged otherwise. After all epochs have passed, we ignore the remaining elements. Finally, after the algorithm has completed, if we are in mode C, we discard each element of S independently with probability 1/2. (Note that we can actually implement this step online, by ‘marking’ but not selecting elements with probability 1/2 when they arrive.)

If we were guaranteed to select an element in every epoch that was the highest valued element according to , then the analysis of this algorithm would be identical to the analysis in the contiguous case. This is of course not the case. However, we prove a technical lemma that says that we are “close enough” to this case.

Because of space constraints, we defer the proof of this technical lemma.

Note an immediate consequence of the above lemma: by summing the guarantee for the element selected in each epoch over the elements of the optimal set (one from each of the partition groups), we get:

Summing the expected contribution from each of the epochs and applying submodularity, we get a lower bound on the expected value of the selected set. Using this derivation in place of Equation 5 in the earlier proof shows that our algorithm gives a constant-factor approximation to the non-monotone submodular maximization problem subject to a partition matroid constraint.

### 4.3 Subject to a General Matroid Constraint

We consider constraints given by a general matroid of rank k. Consider the maximum value obtained by any single element, and an element that achieves this maximum. (Note that we do not know these values up-front in the secretary setting.) In this section, we first give an algorithm that collects a set of fairly high value given a threshold. We then show how to choose this threshold, assuming we know the value of the most valuable element, and why this implies an advice-taking online algorithm with a logarithmic approximation. Finally, we show how to implement this in a secretary framework.

**A Threshold Algorithm.** Given a threshold value, run the following algorithm. Initialize two sets, both empty. Go over the elements of the universe in *arbitrary* order: when considering an element, add it to the first set if its marginal value there is at least the threshold and the first set stays independent; else add it to the second set under the same two conditions; else discard it. (We will choose the value of the threshold later.) Finally, output a uniformly random one of the two sets.
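For concreteness, this routine can be sketched as follows. The function returns both sets so the construction is visible (the algorithm’s actual output is a uniformly random one of the two); the modular valuation and rank-2 uniform matroid in the usage example are our illustrative stand-ins:

```python
def threshold_algorithm(elements, f, is_independent, tau):
    """Two-set threshold routine: scan in arbitrary order; place each
    element in the first of the two sets where its marginal value is at
    least `tau` and independence is preserved, else discard it."""
    s1, s2 = set(), set()
    for e in elements:
        for s in (s1, s2):
            if f(s | {e}) - f(s) >= tau and is_independent(s | {e}):
                s.add(e)
                break          # element placed; otherwise it is discarded
    return s1, s2
```

Usage on a toy instance: with weights 5, 3, 1, threshold 2, and a rank-2 uniform matroid, the first two elements land in the first set and the last is discarded.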

To analyze this algorithm, consider the optimal set, and order its elements by picking them greedily based on marginal values. Given the threshold, consider those optimal elements whose marginal benefit was at least the threshold when added in this greedy order; each omitted element contributed marginal benefit below the threshold.

If either of the two sets built by the algorithm is large, we already get value at least the threshold times its cardinality. Else both these sets have small cardinality. Since we are in a matroid, there must be a large set of high-marginal-value optimal elements that is disjoint from both constructed sets and whose union with either constructed set remains independent.

We claim that the value of this set is not much larger than what the algorithm collected. Indeed, an element of this set was not added by the threshold algorithm; since it could have been added while maintaining independence, it must have been discarded because its marginal value at the time was less than the threshold. Hence its marginal contribution to either constructed set is small, and by disjointness the same holds for the set as a whole. Summing these bounds and applying submodularity, we get that the value of this set is at most the combined value of the two constructed sets plus a term proportional to the threshold times its cardinality.

Since the marginal values of all these optimal elements were at least the threshold when they were added in the greedy ordering, submodularity lower-bounds the combined value of the two constructed sets in terms of the optimum. A random one of the two sets gets half of that in expectation. Taking the minimum over the two cases and setting the cardinality cutoff appropriately, we get the claim.

Consider the greedy enumeration of , and let . First consider an infinite summation —each element contributes at least to it, and hence the summation is at least . But , which says the infinite sum is at least . But the finite sum merely drops a contribution of from at most elements, and clearly is at least , so removing this contribution means the finite sum is at least .

Hence, if we choose a threshold uniformly at random from a small set of candidate values and run the above threshold algorithm with that choice, we get that the expected value of the set output by the algorithm is:

**The Secretary Algorithm.** The secretary algorithm for general matroids is the following:

Sample half the elements, and let the benchmark be the value of the most valuable element in the first half. Choose a scaling factor uniformly at random, and run the threshold algorithm on the remaining elements with the scaled benchmark as the threshold.
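A sketch of this secretary algorithm follows. The geometric grid of scaling factors (powers of two, with roughly log2(rank) + 1 candidates) is an assumption standing in for the elided range, as is the modular valuation in the usage note:

```python
import math
import random

def matroid_secretary(stream, f, is_independent, rank, rng=None):
    """Observe the first half, take the largest single-element value
    seen as a benchmark, scale it down by a uniformly random power of
    two, and run the two-set threshold routine on the second half."""
    rng = rng or random.Random(0)
    n = len(stream)
    sample, rest = stream[: n // 2], stream[n // 2:]
    bench = max((f({e}) for e in sample), default=0.0)
    grid = int(math.log2(rank)) + 1 if rank >= 1 else 1  # assumed grid size
    tau = bench / 2 ** rng.randrange(grid)
    s1, s2 = set(), set()
    for e in rest:                         # the threshold routine
        for s in (s1, s2):
            if f(s | {e}) - f(s) >= tau and is_independent(s | {e}):
                s.add(e)
                break
    return s1 if rng.random() < 0.5 else s2   # output a random one
```

Whatever the random choices, the output is independent in the matroid and drawn entirely from the second half of the stream.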

With constant probability, we choose the threshold equal to the benchmark itself. In this case, with constant probability the element with the second-highest value comes in the first half and the highest-value element comes in the second half; hence our conditional expected value in this case is at least a constant fraction of the maximum single-element value. If this single element accounts for more than half of the optimal value, this already gives the claimed guarantee; we ignore a degenerate boundary case. Otherwise, with constant probability the most valuable element comes in the first half, so the benchmark equals the maximum single-element value. Moreover, each element of the optimal set appears in the second half with probability slightly higher than a half. Since the most valuable element accounts for at most half the optimal value, the expected optimal value in the second half is a constant fraction of the optimum. The above argument then ensures that we get a logarithmic approximation in expectation.

**Acknowledgments.** We thank C. Chekuri, V. Nagarajan, M.I. Sviridenko, J. Vondrák, and especially R.D. Kleinberg for valuable comments, suggestions, and conversations. Thanks to C. Chekuri also for pointing out an error in Appendix B, and to M.T. Hajiaghayi for informing us of the results in [4].

## A Proof of Main Lemma for p-Systems

Let the elements added by greedy be listed in the order in which they were added, and consider the successive prefixes of this order; the marginal gain of each added element may be positive or negative. Since the function is non-negative on the empty set, the value of the greedy solution is at most the sum of these marginal gains. And since the function is submodular, the marginal gains are non-increasing along the greedy order.

We show the existence of a partition of into with the following two properties:

for all , where , and

for all , .

Assuming such a partition, we can complete the proof thus:

where the first inequality follows from [8] (using the first property above, and that the marginal gains are non-increasing), the second from the second property of the partition, the third from subadditivity (which is implied by the submodularity and non-negativity of the function), and the fourth from the definition of the partition. Chaining these inequalities and rearranging, we get the lemma.

We now prove the existence of such a partition. Define as follows: . Note that since , it follows that ; since the independence system is closed under subsets, we have ; and since the greedy algorithm stops only when there are no more elements to add, we get . Defining ensures we have a partition of .

Fix an index. We claim that the corresponding set is a basis (a maximal independent set) of the restricted system: it is contained in the restriction by construction, and any element outside it was considered but not added because adding it would violate independence. Moreover, the corresponding prefix of the optimal set is clearly independent by subset-closure. Since we are in a p-independence system, the ratio of the two cardinalities is at most p, proving the first property.

For the second property, note that each element of the part does not belong to the corresponding greedy prefix, but could have been added to it whilst maintaining independence, and so was considered by the greedy algorithm. Since greedy chose the element maximizing the “gain”, the gain of each such element is at most the gain of the element greedy actually added at that step. Summing over the part and using subadditivity, and again applying submodularity, proves the second fact about the partition.

Clearly, the greedy algorithm works no worse if we stop it when the best “gain” is negative, but the above proof does not use that fact.
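For reference, the greedy subroutine analyzed in this appendix can be sketched as follows; the valuation and independence oracle passed in are the sketch’s assumptions:

```python
def greedy(universe, f, is_independent):
    """Greedy for an independence system: repeatedly add the element of
    maximum marginal gain whose addition keeps the set independent,
    stopping only when no element can be added.  (As noted above, one
    may also stop once the best gain turns negative; this sketch keeps
    the unmodified rule.)"""
    s, candidates = set(), set(universe)
    while True:
        best, best_gain = None, None
        for e in candidates:
            if not is_independent(s | {e}):
                continue
            gain = f(s | {e}) - f(s)
            if best is None or gain > best_gain:
                best, best_gain = e, gain
        if best is None:
            return s                 # no addable element remains
        s.add(best)
        candidates.remove(best)
```

On a modular valuation with a rank-2 uniform matroid over {1, 2, 3}, this returns the two largest elements, {2, 3}.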

## B Proofs for Knapsack Constraints

The proof is similar to that in [32], and we use notation similar to [32] for consistency. We are given a non-negative submodular function over a ground set of items with weights, together with a knapsack budget; our goal is to maximize the function over sets of items whose total weight is at most the budget. To that end, we want to prove the following result:

### B.1 The Algorithm

The algorithm is the following: it constructs a polynomial number of solutions and chooses the best among them (and in case of ties, outputs the lexicographically smallest one of them).

First, the family contains all solutions of cardinality at most three: clearly, if the optimal solution itself has at most three elements then we will output it, which satisfies the condition of the theorem.

Now for each solution of cardinality three, we greedily extend it as follows. At each step, we have a partial solution, and we compute the remaining element of maximum marginal value per unit weight.

If this maximum is non-positive, terminate the algorithm. Otherwise, check whether the element fits within the budget: if so, add it to the partial solution; if not, set it aside as dropped. Stop when no elements remain or the budget is exhausted.

The family of sets we output consists of all sets of cardinality at most three, together with, for each greedy extension of a size-three set, all the sets created during the run of the algorithm. Since each set can have at most linearly many elements, the algorithm outputs polynomially many sets.
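A sketch of one greedy extension follows. The density rule (marginal value per unit weight) matches the per-unit-cost argument used later in the analysis; the modular valuation in the usage note is our illustrative stand-in:

```python
def greedy_extend(seed, universe, f, weight, budget):
    """Greedy extension from a seed solution: repeatedly take the
    remaining element of maximum marginal value per unit weight; stop
    when the best marginal value is non-positive; an element that would
    overflow the budget is dropped.  All partial solutions created
    during the run are returned, since the algorithm keeps every one."""
    s = set(seed)
    family = [set(s)]
    remaining = set(universe) - s
    while remaining:
        best, best_density = None, None
        for e in remaining:
            density = (f(s | {e}) - f(s)) / weight[e]
            if best is None or density > best_density:
                best, best_density = e, density
        if f(s | {best}) - f(s) <= 0:
            break                              # best gain non-positive
        remaining.remove(best)
        if sum(weight[x] for x in s) + weight[best] <= budget:
            s.add(best)
            family.append(set(s))
        # else: the element is dropped and the scan continues
    return family
```

With values 10, 6, 4 and weights 5, 4, 1 under budget 6, the extension from an empty seed picks the density-4 element first, then the density-2 element, and drops the last; every set in the family stays within budget.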

### B.2 The Analysis

Let us assume that the optimal solution has more than three elements, and index its elements in the order they would be considered by the greedy algorithm that picks items of maximum marginal value (and does not consider their weights). Submodularity and this ordering give us the following:

Summing the above three inequalities we get that for ,

For the rest of the discussion, consider the iteration of the algorithm which starts with the seed consisting of the first three elements of the optimal solution in this order. Recall that restricting the function by fixing this seed yields another non-negative submodular function. The following lemma is the analog of the corresponding lemma in [32]:

where we used the subadditivity of the submodular function.

Consider the first step in the greedy algorithm at which either (a) the algorithm stops because the best marginal value is non-positive, or (b) some element is considered and dropped by the greedy algorithm because it does not fit within the budget. Note that before this point, every element considered was either in the optimal solution and picked, or not in the optimal solution. In fact, we may assume every element is either in the optimal solution or picked by our algorithm: elements in neither can be discarded up front, and performing the same algorithm and analysis on the remaining elements changes nothing. Hence we can assume that no element has been dropped before this step.

Now we apply the lemma to the restricted submodular function with the appropriate pair of sets to get

Suppose case (a) happened and we stopped because the best marginal value was non-positive. This means that every term in the relevant summation must be negative, and hence the current solution already captures the required fraction of the optimal value. In this case, we are not even losing the extra factor.

Case (b) is when the greedy algorithm drops an element: it was selected by the greedy rule, but it did not fit within the remaining budget. In this case the right-hand expression above has some positive terms, and hence we get

To finish up, we prove a similar lemma.

If not, then we have

Since we are in the case that , we know that , and hence

Now, the subadditivity of the function implies that there exists some element whose value is a large fraction of the total. Submodularity then implies that at each point in time, the marginal increase per unit cost for this element stays high. Now since the greedy algorithm picked elements with the largest marginal increase per unit cost, the marginal increase per unit cost at each step was strictly greater than this bound. Hence, at the moment the total cost of the picked elements exceeded the budget, the total value accrued would be strictly greater than the claimed bound, which is a contradiction.

Now for the final calculations:

Hence this set will be in the family of sets output, and will satisfy the claim of the theorem.

## C Multiple Knapsacks

We do not yet have a result for multiple knapsack constraints; we sketch a possible first step.

The first step towards this is to consider a fractional extension of submodularity, and optimization over down-monotone polytopes. Given a nonnegative submodular function, define its fractional extension as follows: for any fractional point, consider the random set obtained by independently rounding each coordinate to one with probability equal to that coordinate’s value, and to zero otherwise, and take the expected function value of this random set. Clearly, each such