Maximum Area Rectangle Separating Red and Blue Points
This research was partially supported by NSF award IIP1439718 and CPRIT award RP150164.
Given a set of red points and a set of blue points, we study the problem of finding a rectangle that contains all the red points, the minimum number of blue points and has the largest area. We call such a rectangle a maximum separating rectangle. We address the planar, axis-aligned (2D) version, and present an time, space algorithm. The running time reduces to if the points are pre-sorted by one of the coordinates. We further prove that our algorithm is optimal in the decision model of computation.
Consider two sets of points, and in the plane. contains points, called red points, and contains points, called blue points. The problem we study in this paper, called the Maximum Area Rectangle Separating Red and Blue Points problem, is to find an axis-aligned rectangle that contains all the red points, the minimum number of blue points, and has the largest area. We call such a rectangle a maximum separating rectangle. An example of a maximum separating rectangle is illustrated in Figure 1, for the planar axis-aligned case, as well as the planar non-axis-aligned case.
Applications that require separation of bi-color points could benefit from efficient solutions to this problem. For instance, suppose we are given a tissue containing a tumor, where the tumor cells are specified by their coordinates. The coordinates of healthy cells are also given. The goal is to separate the tumor cells from the healthy cells, for surgical removal or radiation treatment. Another application is in city planning, where blue points represent buildings and red points represent monuments, and the goal is to build a park that contains the monuments and as few buildings as possible.
1.1 Related Work
The problem of finding the largest-area empty axis-aligned rectangle among a set of points was introduced by Hsu et al. They consider the axis-aligned version and show how to find all optimal solutions in worst-case and expected time. Later, Chazelle et al. showed how to find one optimal solution in in the worst case. Currently, the best known result for this problem is by Aggarwal and Suri, namely time and space to find an optimal solution. To do that, they give a divide-and-conquer approach in which a largest empty corner rectangle problem is solved in the merging step. They also show how to find the largest-perimeter empty rectangle optimally in time, by modifying their approach for finding the largest-area empty rectangle.
Mukhopadhyay et al. studied the problem of finding the maximum empty arbitrarily oriented rectangle among a planar point set bounded by a given axis-aligned rectangle. They give an -time, -space algorithm to find all such maximum empty rectangles. Chaudhuri et al. independently gave a different solution for the same problem, also with time and space. For the 3D axis-aligned case, Nandy et al. give an algorithm to compute the maximum empty box that runs in time using space.
More recently, Dumitrescu and Jiang considered the problem of computing the maximum-volume axis-aligned -dimensional box that is empty with respect to a given point set in dimensions. The target box is restricted to be contained within a given axis-aligned box . For this problem, they give the first known FPTAS, which computes a box of a volume at least , for an arbitrary , where OPT is the volume of the optimal box. Their algorithm runs in time.
Kaplan and Sharir study the problem of finding a maximal empty axis-aligned rectangle containing a query point. They design an algorithm to answer queries in time, with time and space for preprocessing. Here is the inverse Ackermann function. In a different paper, they also solve the disk version of the problem, which asks to find the maximal empty disk containing a query point. Their approach takes query and preprocessing time, using a space data structure.
For the case where the input consists of axis-aligned rectangles rather than points, Nandy et al. presented a solution that finds the maximum empty rectangle in worst-case, expected time. In a follow-up paper, they consider the problem of finding the largest empty rectangle among line segments and present an -time algorithm, which they also extend to the case when the input consists of arbitrary polygonal obstacles.
Separability of two point sets using various separators is a well-known problem. Megiddo studies the hyperplane separability of point sets in in dimensions and shows how to decide one-hyperplane separability using linear programming in polynomial time. The same work proves that the problem of separating two point sets by lines is NP-complete. Aronov et al. consider four metrics to evaluate misclassification of points of two sets by a linear separator. One of them is the number of misclassified points (which is somewhat related to our problem). For this error metric, they present an -time algorithm.
Separating point sets with circles has also been considered. The problem of finding the minimum area separating circle among red and blue points was studied by Kosaraju et al. They solve the decision problem in linear time and give an time algorithm for computing the minimum separating circle, if it exists. When point sets cannot be separated by a circle, Bitner, Cheung and Daescu give two algorithms that run in and time, respectively. Armaselu and Daescu studied the dynamic version of this problem and presented results for the case when blue points are inserted or deleted.
1.2 Our Results
We consider the planar case of the problem and present an time, space algorithm to find one optimal axis-aligned rectangle, based on a staircase approach. The running time reduces to if the points are pre-sorted by one of the coordinates. We also prove a matching lower bound for the problem, by reducing from a known “-hard” problem.
2 Algorithms for the 2D axis-aligned version
We begin by describing some properties of the bounded optimal solution.
The maximum axis-aligned separating rectangle must contain at least one blue point on each of its sides.
Consider a quad of 4 non-colinear blue points in this order. Two vertical lines going through two of these points and two horizontal lines going through the other two of those points define a rectangle . We say that defines .
Definition 2.1. Consider a vertical strip formed by the two vertical lines bounding to the left and right, as well as a horizontal strip formed by the parallel lines bounding above and below. The minimum -enclosing rectangle is the intersection between the vertical and horizontal strips (refer to Figure 2).
Definition 2.2. A candidate rectangle is an -enclosing rectangle that contains the minimum number of blue points and cannot be extended in any direction without introducing a blue point.
We start with the minimum -enclosing rectangle . For each side of , we slide it outwards parallel to itself until it hits a blue point (if no such point exists, then the solution is unbounded). Denote by the resulting rectangle (shown in Figure 1). Unbounded solutions can be easily determined in linear time, so from now on we assume bounded solutions. If the interior of does not contain blue points, then is the optimal solution, and we are done. We discard the blue points contained in , as well as the blue points outside of , from and call the resulting set .
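The sliding step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `bounding_box` and `slide_outwards`, the `(xmin, ymin, xmax, ymax)` and `(left, bottom, right, top)` tuple layouts, and the sample coordinates are all hypothetical, and a side that never meets a blue point is reported as an infinite coordinate to signal the unbounded case.

```python
import math

def bounding_box(points):
    """Smallest axis-aligned rectangle (xmin, ymin, xmax, ymax) enclosing the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def slide_outwards(r_min, blue):
    """Slide each side of r_min outwards until it touches a blue point.

    Returns (left, bottom, right, top). A side that never meets a blue
    point stays at +/- infinity, which signals an unbounded solution.
    """
    xmin, ymin, xmax, ymax = r_min
    left, bottom, right, top = -math.inf, -math.inf, math.inf, math.inf
    for x, y in blue:
        if xmin <= x <= xmax:      # point is in the vertical strip of r_min
            if y > ymax:
                top = min(top, y)
            elif y < ymin:
                bottom = max(bottom, y)
        if ymin <= y <= ymax:      # point is in the horizontal strip of r_min
            if x > xmax:
                right = min(right, x)
            elif x < xmin:
                left = max(left, x)
    return left, bottom, right, top

# Hypothetical sample input.
red = [(2, 2), (3, 4), (4, 3)]
blue = [(3, 7), (0, 3), (6, 2), (3, 0), (5, 6), (3, 3)]
r_min = bounding_box(red)
r_max = slide_outwards(r_min, blue)
```

Both functions make a single pass over their input, which matches the linear-time claim for detecting unbounded solutions.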
is partitioned into 4 disjoint subsets (quadrants) . Each quadrant contains points that are located in a rectangle formed by right upper (resp. left upper, left lower, right lower) corners of and (see Figure 2 for details).
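As a rough sketch of this partition, the following Python function distributes the remaining blue points over the four corner regions. The quadrant labels `"NE"`, `"NW"`, `"SW"`, `"SE"`, the tuple layouts, and the choice to skip points lying on the boundary of the outer rectangle are assumptions made for the example, not conventions fixed by the text.

```python
def split_into_quadrants(r_min, r_max, blue):
    """Partition blue points lying strictly inside r_max and outside r_min.

    r_min is (xmin, ymin, xmax, ymax); r_max is (left, bottom, right, top).
    Quadrants are listed in the order right upper, left upper, left lower,
    right lower.
    """
    xmin, ymin, xmax, ymax = r_min
    left, bottom, right, top = r_max
    quads = {"NE": [], "NW": [], "SW": [], "SE": []}
    for x, y in blue:
        if not (left < x < right and bottom < y < top):
            continue               # on or outside r_max: discarded
        if x > xmax and y > ymax:
            quads["NE"].append((x, y))
        elif x < xmin and y > ymax:
            quads["NW"].append((x, y))
        elif x < xmin and y < ymin:
            quads["SW"].append((x, y))
        elif x > xmax and y < ymin:
            quads["SE"].append((x, y))
        # any other point lies inside r_min and is discarded
    return quads

# Hypothetical sample input.
quads = split_into_quadrants(
    (2, 2, 4, 4), (0, 0, 6, 7),
    [(5, 6), (1, 5), (1, 1), (5, 1), (3, 3), (6, 6), (8, 8)],
)
```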
Consider the points of and sort them by X coordinate. A point is a candidate to be a part of an optimal solution only if there are no points with and . We only leave such possible candidate blue points in and discard the rest. The elements of form a staircase sequence, denoted , which is ordered non-decreasingly by X coordinate and non-increasingly by Y coordinate, as shown in Figure 3. The sets are treated appropriately in a similar way and form the staircases , which are ordered non-decreasingly by X coordinate. The 4 staircases can be found in time  and do not change in the axis-aligned cases. Thus, abusing notation, we will refer to as simply for each quadrant .
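A minimal sketch of the staircase construction for the upper-right quadrant follows. Since the exact dominance inequalities are not legible in the text above, the code assumes that a point is discarded when some other point is both no further right and no higher, which yields a sequence increasing in X and decreasing in Y; the function name `ne_staircase` is hypothetical.

```python
def ne_staircase(points):
    """Staircase of the upper-right quadrant.

    Keeps a point only if no other point is both to its left (or at the
    same X) and below it (or at the same Y). The result is sorted by
    increasing X and has strictly decreasing Y.
    """
    stairs = []
    lowest_y = float("inf")
    for x, y in sorted(points):    # by X, ties broken by Y
        if y < lowest_y:           # nothing seen so far is below this point
            stairs.append((x, y))
            lowest_y = y
    return stairs

# Hypothetical sample input.
stairs = ne_staircase([(5, 9), (6, 10), (6, 7), (7, 8), (8, 5), (9, 6)])
```

After sorting, the sweep is a single linear pass, so pre-sorted input avoids the logarithmic factor. The other three staircases can be obtained by reflecting coordinates before calling the same routine.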
While the staircase construction approach has been used before , there are differences between how we use it in this paper and how it was used previously. Specifically, note that a maximum -empty rectangle for a given orientation, computed as in , may not contain all red points.
2.1 Finding all optimal solutions
We first prove that we can have maximum separating rectangles in the worst case. To do that, we use a construction similar to the one in , except that we have to ensure that all rectangles will contain .
In the worst case, there are maximum separating rectangles.
Assume all blue points are in and refer to Figure 4. Let be composed of two points and of a sequence of points , with the following coordinates.
1. , with ;
In addition, let contain 4 red points .
It is easy to check that all rectangles passing through , for some , enclose . Moreover, all these rectangles have an area equal to . All larger rectangles either contain a blue point or do not contain all red points. Thus, there are maximum separating rectangles. ∎
To compute all maximum separating rectangles, we do the following.
We first compute and in time. Then, we find all -empty rectangles bounded by using the approach in  in time. For each such rectangle, we check whether it contains and, if it does not, we discard it. Finally, we report the remaining rectangles that have the maximum area.
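The final filtering step can be illustrated as follows. The sketch assumes the empty rectangles have already been enumerated by some other routine and are given as `(left, bottom, right, top)` tuples; the function names and sample values are hypothetical.

```python
def contains(outer, inner):
    """True if rectangle `inner` lies inside rectangle `outer` (boundaries allowed)."""
    ol, ob, o_r, ot = outer
    il, ib, ir, it = inner
    return ol <= il and ob <= ib and ir <= o_r and it <= ot

def area(rect):
    left, bottom, right, top = rect
    return (right - left) * (top - bottom)

def maximum_separating(candidates, r_min):
    """Keep the candidates that enclose r_min and have the largest area."""
    enclosing = [c for c in candidates if contains(c, r_min)]
    if not enclosing:
        return []
    best = max(area(c) for c in enclosing)
    return [c for c in enclosing if area(c) == best]

# Hypothetical sample input: the third candidate does not enclose r_min.
r_min = (2, 2, 4, 4)
candidates = [(0, 0, 6, 5), (1, 0, 6, 6), (3, 0, 9, 9), (0, 1, 5, 6)]
best = maximum_separating(candidates, r_min)
```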
We have proved the following result.
All axis-aligned maximum area rectangles separating red points and blue points can be found in time.
2.2 Finding one optimal solution
Suppose and the four staircases have been already computed. We describe an algorithm to find only one optimal solution.
Observe that all maximal rectangles containing are defined by tuples of four points from (called support points). Each of these points supports an edge of the rectangle. For the -th candidate rectangle, denote the top support by , the left support by , the right support by and the bottom support by .
The problem that we solve is essentially the one of finding the maximal -empty rectangle containing a given “origin” point, considered in [13, 14]. However, in our case, the target rectangle has to contain , rather than a single point. Based on the position of each support of a candidate rectangle, 3 cases may arise [13, 14]. Note that the solution in  takes time in cases 1 and 2 and time in case 3. We show how to solve this problem in time in every case.
Case 1. Three supports are on the same side of and the fourth support is on the opposite side of .
Suppose without loss of generality (wlog) that the top, right and bottom supports lie to the right of and the left one lies to the left of (as in Figure 5). Note that, for each top-right tuple with the top support in , there is a unique bottom support to the right of , which is in . Thus, the left support is also unique. As argued in [13, 14], there are top-right tuples. Similarly, for each top-left tuple with the top support in , we get a unique bottom-right tuple. This gives us a total of candidate rectangles in case 1. See [13, 14] for more details.
Case 2. Each support is from a different quadrant.
Suppose wlog that , which implies and (see Figure 6). For each top-right tuple satisfying these conditions, there is a unique bottom support from and a unique left support from . Again, as argued in [13, 14], there are top-right tuples. Similarly, for each top-left tuple with , the bottom and right supports satisfying the condition are unique. Thus, there are candidate rectangles in Case 2. More details can be found in [13, 14].
Case 3. Two supports are from a quadrant and the other two are from an opposite quadrant.
Suppose wlog that and (refer to Figure 7). For each such top-right pair, there are multiple choices of bottom-left pairs formed by adjacent points from . However, the bottom support has to be above or equal to the last point to the left of , if it exists, and the left support has to be to the right or equal to the last point below , if it exists. Consider two functions, and , that assign, to every top-right pair , the index in of the first (resp., last) bottom support that occurs with , denoted by and , respectively. Note that and are monotonically decreasing functions (as shown in Figure 7).
Cases 1 and 2 give tuples and we will argue later how to find all these tuples in time. Case 3 gives tuples so, from now on, we focus on case 3. Note that all supporting points are in . Candidate rectangles are defined by two pairs of adjacent points in and respectively. Denote by , the -th pair of adjacent points in , and by , the -th pair of adjacent points in . Consider a matrix such that denotes the area of the rectangle supported on the top-right by and on the bottom-left by . Some entries of correspond to cases where and do not define a -empty rectangle and are therefore set to “undefined”. The goal is to compute the maximum of each row , along with the column where it occurs. To break ties, we always take the rightmost index.
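To make the row-maxima objective and the tie-breaking rule concrete, here is a brute-force reference in Python. It is not the fast method discussed in this section, only a quadratic baseline that could be used to validate one; undefined entries are represented by `None`, and the function name is hypothetical.

```python
def row_maxima_rightmost(matrix):
    """For each row, return (maximum value, column index).

    Undefined entries are None and are skipped. Ties are broken towards
    the rightmost column. A row without defined entries yields None.
    """
    result = []
    for row in matrix:
        best = None
        for j, value in enumerate(row):
            if value is None:
                continue
            if best is None or value >= best[0]:   # >= keeps the rightmost column
                best = (value, j)
        result.append(best)
    return result

# Hypothetical sample matrix with a staircase-shaped defined portion.
maxima = row_maxima_rightmost([
    [None, 6, 6, None],
    [4, 9, None, None],
    [None, None, None, None],
])
```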
It is easy to see that the defined portion of each row of is contiguous. Since the functions and are monotonically decreasing, the defined portion of has a staircase structure (as shown in Figure 8). We say that is staircase-defined by and . Moreover, it turns out that is a partially defined inverse Monge matrix  of size , and thus all row-maxima can be found in time using the algorithm in  (see also ).
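For small instances, the inverse Monge property can be checked by brute force. The sketch below assumes the standard definition, namely that for all rows i < k and columns j < l with all four entries defined, M[i][j] + M[k][l] >= M[i][l] + M[k][j]; undefined entries are `None` and the function name is hypothetical.

```python
from itertools import combinations

def is_partial_inverse_monge(matrix):
    """Check the inverse Monge inequality on every fully defined 2x2 minor."""
    rows, cols = len(matrix), len(matrix[0])
    for i, k in combinations(range(rows), 2):        # i < k
        for j, l in combinations(range(cols), 2):    # j < l
            quad = (matrix[i][j], matrix[k][l], matrix[i][l], matrix[k][j])
            if None in quad:
                continue                             # minor is not fully defined
            if quad[0] + quad[1] < quad[2] + quad[3]:
                return False
    return True

# M[i][j] = i * j satisfies the inequality, since (k - i) * (l - j) >= 0.
product_table = [[i * j for j in range(3)] for i in range(3)]
```

The check takes time quadratic in both dimensions, so it is meant as a testing aid rather than as part of the algorithm.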
To eliminate the factor, we extend to a totally inverse monotone matrix, so that all row maxima can be found in time using the SMAWK algorithm . Recall that is totally inverse monotone iff such that , we have . To do that, one could try the approach in  to make totally inverse monotone, by filling it with 0’s on both sides of the defined portion. However, it does not work in our case. Note that it may happen that there exist such that and , so is not totally inverse monotone. Moreover, it follows by a similar argument that is not totally monotone either. Therefore, we resort to a different filling scheme, which is similar to the one in . Specifically, we fill only the left undefined portion of each row with 0. That is, if the defined portion of row starts at , then . We also fill the right undefined portion of each row with negative numbers such that, if the defined portion of row ends at , then , that is, negative numbers in decreasing order (see Figure 9).
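The filling scheme can be expressed as a function that produces padded entries on demand, so the matrix never has to be materialized. In this sketch, `value_at`, `first`, and `last` are hypothetical names, and the specific negative values `-(j - last[i])` are an assumption chosen only to satisfy "negative numbers in decreasing order"; the exact values used in the text are not recoverable here.

```python
def padded_entry(value_at, first, last, i, j):
    """Entry (i, j) of the padded matrix.

    value_at(i, j) returns the defined (positive) value of the entry.
    first[i] and last[i] delimit the defined column range of row i.
    """
    if j < first[i]:
        return 0                    # left undefined portion is filled with 0
    if j > last[i]:
        return -(j - last[i])       # right portion: -1, -2, ... (assumed values)
    return value_at(i, j)

# Hypothetical sample data: raw values outside the defined range are ignored.
raw = [[5, 7, 0, 0], [0, 6, 8, 0]]
first, last = [0, 1], [1, 2]
padded = [
    [padded_entry(lambda i, j: raw[i][j], first, last, i, j) for j in range(4)]
    for i in range(2)
]
```

Evaluating entries lazily like this is one way to avoid the cost of computing the whole matrix explicitly.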
is a totally inverse monotone matrix.
Suppose this is not the case. Then there exist such that and . If , then by construction , so , contradiction. Thus, . If , then by construction , again contradiction. Hence, , which entails or . The first choice gives a contradiction, so the only remaining possibility is and . But this contradicts the total inverse monotonicity of the defined (positive) portion of . ∎
As a side note, if the functions and were monotonically increasing, rather than decreasing, we could make totally monotone instead. That is, . To do that, we fill the “undefined” portion of with zeros on the right side, and negative numbers in increasing order on the left side of the defined portion of . By a similar argument as in Lemma 1, it follows that is totally monotone.
Note that computing explicitly would take time. To avoid that, we only store the pairs from that may define optimal solutions in a list, as in , and evaluate only when needed. Thus, we only require time and space.
In order to find the optimal solution, after having computed the staircases, we do the following. Suppose we have fixed a top-right pair , with . The leftmost possible support is the highest point in below , denoted , and the lowest possible support is the rightmost point in to the left of , denoted . Both such extremal supports can be found in time if we store, with each point in some staircase, and each quadrant , the pointers and (see Figure 10). In addition, we consider, for each , the pointers to the lowest point in quadrant above , and to the leftmost point in quadrant to the right of . These pointers can be precomputed in time for all through a scan in X order. Note that these pointers are defined in a similar manner as in the algorithm in  for finding the largest empty corner rectangle. Also, the staircases are stored as doubly-linked lists with pointers to the point before (resp., after) in the respective staircase, plus the pointers mentioned above whenever needed.
We then consider all bottom-left pairs occurring with and assume wlog that . If and one of (say ) are on the same side of , then we are in case 1 with a rectangle defined by . If , and are in different quadrants, then we are in case 2 with a rectangle defined by . Thus, cases 1 and 2 require time and space in total. Otherwise (i.e., we are in case 3), we store , respectively, in two arrays, and .
After all top-right and top-left pairs are treated, we run the SMAWK algorithm as described earlier, in order to compute all row-maxima of in time, where is the staircase matrix defined by and , and is the area of the rectangle defined by the -th pair from one quadrant and the -th pair from the opposite quadrant in case 3. We then report the (last index) rectangle corresponding to the maximum between all row-maxima of and all maximum area rectangles obtained in cases 1 and 2, along with its area. Thus, we have proved the following result.
The axis-aligned version of the maximum-area separating rectangle problem can be solved in time and space. The running time reduces to if the blue points are presorted by their X coordinates.
2.3 Lower bound
In this section, we prove that steps are sometimes needed in order to compute a maximum axis-aligned separating rectangle, provided that the blue points are not pre-sorted.
To do that, we reduce the 1D-Furthest-Adjacent-Pair problem, which is known to have a lower bound of for a set of numbers, to our problem.
In the 1D-Furthest-Adjacent-Pair problem, we are given a set of real numbers, and the goal is to find the two numbers for which the quantity is the maximum among all adjacent pairs in the sorted order of (denoted by ). Wlog assume that all these numbers are in the interval .
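For reference, the 1D-Furthest-Adjacent-Pair problem has a straightforward sort-based solution, sketched below. The function name and the sample input are hypothetical, and ties are resolved in favor of the first maximal gap, a choice the problem statement does not prescribe.

```python
def furthest_adjacent_pair(numbers):
    """Return the pair of neighbours in sorted order with the largest gap.

    Returns None when fewer than two numbers are given.
    """
    ordered = sorted(numbers)
    best = None
    for a, b in zip(ordered, ordered[1:]):
        if best is None or b - a > best[1] - best[0]:
            best = (a, b)
    return best

# Sorted order is 1, 4, 5, 9 with gaps 3, 1, 4.
pair = furthest_adjacent_pair([9, 1, 4, 5])
```

The sort dominates the running time here, which is the cost the lower bound argument shows cannot be avoided for unsorted input.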
The reduction to our problem is as follows. For each , we consider two points, and . The set of blue points is the set of all such ’s and ’s. The red point set consists of the origin , together with four special points, , and . See Figure 11 for details.
Now we prove that the reduction works.
Two adjacent numbers form the furthest adjacent pair of numbers if and only if the rectangle bounded by and is one of the separating rectangles of maximum area.
Let be adjacent in and that are adjacent in . Note that decreases while increases (same for ) and , so encloses . We need to show that the rectangle has the following properties:
(1) does not contain any blue point, and
(2) has the maximum area among all such rectangles.
If contained a point (similarly, ), then we would have with , a contradiction. So property (1) is satisfied. We have the points and . This means that . Let and . We have , which is monotonically increasing. Therefore, has the highest area among all rectangles with property (1).
Now let be a largest separating rectangle bounded by , for some . Since does not contain any blue points, there cannot exist any , so are adjacent in . Moreover, since is monotonically increasing in , it follows that that are adjacent in . ∎
This gives us the following result.
steps are needed in order to compute the maximum axis-aligned rectangle separating and .
First, steps are needed in order to compute , as one cannot compute the optimal solution without knowing . The term follows from the reduction from 1D-Furthest-Adjacent-Pair. ∎
3 Conclusion and Remarks
We addressed the problem of finding the maximum area axis-aligned separating rectangle that encloses all red points and the minimum number of blue points, proved a lower bound for the problem, and provided optimal algorithms.
We reopen the problem of finding a maximum area empty rectangle among points in the plane. Specifically, one should either prove an lower bound (which seems unlikely), or improve upon the thirty-year-old time algorithm of Aggarwal and Suri.
We also leave open whether it is possible to adapt the data structures in [13, 14] for the maximum empty rectangle containing a query point to find the maximum area rectangle containing a set of red points and the fewest blue points when the red points are given at query time. Notice that this version could have important applications, including in the fabrication of integrated circuits, where red points could represent defects in the fabrication boards.
The authors would like to thank Dr. Anastasia Kurdia for her preliminary work on the maximum separating rectangle problem.
[1] A. Aggarwal and S. Suri, Fast algorithms for computing the largest empty rectangle, SoCG ’87: 278-290
[2] A. Aggarwal, M. Klawe, S. Moran, P. Shor and R. Wilber, Geometric Applications of a Matrix Search Algorithm, Algorithmica, 2 (2): 195–208, 1987
[3] B. Armaselu and O. Daescu, Dynamic minimum bichromatic separating circle, COCOA 2015: 688-697
[4] B. Aronov, D. Garijo, Y. Núñez, D. Rappaport, C. Seara and J. Urrutia, Measuring the error of linear separators on linearly inseparable data, Discrete Applied Mathematics 160(10-11): 1441-1452 (2012)
[5] S. Bitner, Y. K. Cheung, and O. Daescu, Minimum separating circle for bichromatic points in the plane, ISVD ’2010
[6] J. Chaudhuri, S. C. Nandy and S. Das, Largest empty rectangle among a point set, J. Algorithms, 46, Vol. 1, pp. 54-78, 2003
[7] B. Chazelle, R.L. Drysdale III, D.T. Lee, Computing the largest empty rectangle, SIAM Journal of Computing, 1986, 15: 300-315
[8] A. Datta and S. Soundaralakshmi, An efficient algorithm for computing the maximum empty rectangle in three dimensions, Information Sciences, Vol. 128 (1-2) 2000: 43-65
[9] A. Dumitrescu and M. Jiang, On the largest empty axis-parallel box amidst points, Algorithmica 66(2): 225-248 (2013)
[10] A. Dumitrescu and M. Jiang, On the Number of Maximum Empty Boxes Amidst Points, Symposium on Computational Geometry 2016: 36:1-36:13
[11] W.L. Hsu, D.T. Lee and A. Namaad, On the maximum empty rectangle problem, Discrete Applied Math, 1984, 8: 267-277
[12] H. Kaplan and M. Sharir, Finding the Maximal Empty Disk Containing a Query Point, Symposium on Computational Geometry 2012: 287-292
[13] H. Kaplan and M. Sharir, Finding the Maximal Empty Rectangle Containing a Query Point, CoRR abs/1106.3628 (2011)
[14] H. Kaplan, S. Mozes, Y. Nussbaum and M. Sharir, Submatrix maximum queries in Monge matrices and partial Monge matrices, and their application, SODA 2012: 338-355
[15] M. Klawe and D.J. Kleitman, An almost linear time algorithm for generalized matrix searching, SIAM Journal of Discrete Math. (1990), Vol. 3, pp. 81-97
[16] S. Kosaraju, J. O’Rourke and N. Megiddo, Computing circular separability, Journal of Discrete Computational Geometry, 1:105–113, 1986
[17] N. Megiddo, On the complexity of Polyhedral Separability, Journal of Discrete Computational Geometry, 1988, vol. 3, pp. 325-337
[18] A. Mukhopadhyay and S.V. Rao, Computing a Largest Empty Arbitrary Oriented Rectangle. Theory and Implementation, International Journal of Computational Geometry & Applications Vol. 13 (3), 2003, pp. 257-271
[19] S.C. Nandy, B. B. Bhattacharya and Sibabrata Ray, Efficient algorithms for Identifying All Maximal Isothetic Empty Rectangles in VLSI Layout Design, FSTTCS 1990: 255-269
[20] S. C. Nandy, A. Sinha and B. B. Bhattacharya, Location of the Largest Empty Rectangle among Arbitrary Obstacles, FSTTCS 1994: 159-170
[21] S.C. Nandy and B.B. Bhattacharya, Maximal empty cuboids among points and blocks, Journal of Computers & Mathematics with Applications, Vol. 36 (3), August 1998, pp. 11-20