# Quasi-homography Warps in Image Stitching

###### Abstract

Naturalness of warping is gaining extensive attention in image stitching. Recent warps such as SPHP, AANAP and GSP use a global similarity to effectively mitigate projective distortion (which enlarges regions); however, they necessarily bring in perspective distortion (which generates inconsistency). In this paper, we propose a quasi-homography warp, which balances perspective distortion against projective distortion in the non-overlapping region, to create natural-looking mosaics. Our approach formulates the warp as a solution of a system of bivariate equations, where perspective distortion and projective distortion are characterized as slope preservation and scale linearization respectively. Our proposed warp relies only on a global homography and is thus totally parameter-free. A comprehensive experiment shows that quasi-homography outperforms some state-of-the-art warps in urban scenes, including homography, AutoStitch and SPHP. A user study demonstrates that quasi-homography wins most users’ favor as well, compared to homography and SPHP.


## I Introduction

Image stitching plays an important role in many multimedia applications, such as panoramic videos [1, 2, 3] and virtual reality [4, 5, 6]. Basically, it is a process of combining multiple images with overlapping fields of view to produce a wide-view panorama [7]. The first stage determines a warp for each image to transform it into a common coordinate system; the warped images are then composed [8, 9, 10, 11, 12] and blended [13, 14, 15] into a final mosaic. Evaluation of warping includes alignment quality in the overlapping region and naturalness quality in the non-overlapping region.

Early warps focus on improving alignment quality. A global warp, such as a similarity or projective transformation [16], aims to minimize alignment errors between overlapping pixels by a uniform transformation. Homography is the most frequently used one, because it is the most general planar transformation under which straight lines are preserved in the warped image. However, it usually suffers from projective distortion (which enlarges regions) in the non-overlapping region (see Fig. 1(b)). Some warps address alignment issues due to parallax in the overlapping region, by using multiple local transformations instead of a single global one [17, 18, 19, 20], or by combining image alignment with seam-cutting approaches [21, 22, 23].

Recently, some warps, such as the shape-preserving half-projective (SPHP) warp [24], the adaptive as-natural-as-possible (AANAP) warp [25], and the warp with global similarity prior (GSP) [26], address naturalness issues in the non-overlapping region by using a global similarity. Though they effectively mitigate projective distortion, they necessarily bring in perspective distortion (which generates inconsistency), because similarity transformations keep the original perspectives individually (see Fig. 1(c)). This motivates us to propose a novel warp that overcomes the drawbacks of homography and similarity transformations, to mitigate projective distortion and perspective distortion simultaneously in the non-overlapping region (see Fig. 1(d)).

In this paper, we propose a quasi-homography warp, which balances projective distortion against perspective distortion in the non-overlapping region, to create natural-looking mosaics (see Fig. 2). Our proposed warp relies only on a global homography and is thus totally parameter-free. The rest of the paper is organized as follows. Section II describes some recent related work. Section III provides a detailed derivation of our proposed warp. In particular, Section III-A presents an analysis of homography and SPHP warps, where perspective distortion is addressed by preserving corresponding slopes of horizontal lines and vertical lines, and projective distortion is described in terms of linearizing a scale function. Section III-B reformulates homography as a solution of a system of bivariate equations, where slope preservation and scale linearization are characterized respectively. In Section III-C, we linearize the scale function to a piece-wise continuous function, and define our quasi-homography warp as a solution of a modified system. Some implementation details (including two-image stitching and multiple-image stitching) and variations (including orthogonal rectification and partition refinement) are proposed in Section IV. Section V presents a comparison experiment and a user study, which demonstrate that our proposed warp not only outperforms some state-of-the-art warps in urban scenes, but also wins most users’ favor. Finally, conclusions are drawn in Section VI.

## II Related Work

In the following, we review recent work on image warping with respect to alignment and naturalness. For more fundamental concepts about image stitching, please refer to the comprehensive survey [7] by Szeliski.

### II-A Warps for Better Alignment

Conventional stitching methods usually use global warps such as affine, similarity and homography, to align images in the overlapping region [16]. Global warps are robust but often not flexible enough to provide accurate alignment. Gao et al. [17] proposed a dual-homography warp to address scenes with two dominant planes by a weighted sum of two homographies. Lin et al. [18] proposed a smoothly varying affine warp, which replaces a global affine transformation with a smoothly varying affine stitching field that is more flexible and maintains much of the motion generalization properties of affine or homography. Zaragoza et al. [19] proposed an as-projective-as-possible warp in a moving DLT framework, which is able to accurately align images that differ by more than a pure rotation. Lou et al. [20] proposed a local alignment method by piecewise planar region matching, which approximates image regions with planes by incorporating piecewise local geometric models.

Other methods combine image alignment with seam-cutting approaches [27, 28, 29, 30], to find a locally registered area instead of aligning the overlapping region globally. Gao et al. [21] proposed a seam-driven stitching framework, which searches for a homography with minimal seam costs instead of minimal alignment errors. Zhang and Liu [22] proposed a parallax-tolerant warp, which combines homography and content-preserving warps to locally align images. Lin et al. [23] proposed a seam-guided local alignment method, which iteratively improves warping by adaptively weighting features according to their distances to current seams.

These methods focus on improving alignment quality in the overlapping region, so they sometimes suffer from naturalness issues in the non-overlapping region (see Fig. 2).

### II-B Warps for Better Naturalness

Many efforts have been devoted to mitigating distortion in the non-overlapping region to create natural-looking mosaics. A pioneering work [31] uses spherical or cylindrical warps to provide multi-perspective results to address this problem; however, it necessarily curves straight lines.

Recently, some methods take advantage of a global similarity (which preserves the original perspective) to mitigate projective distortion. Chang et al. [24] proposed an SPHP warp, which spatially combines a projective transformation and a similarity transformation, where the projective maintains good alignment in the overlapping region while the similarity keeps the original perspective in the non-overlapping region. Lin and Pankanti [25] proposed an AANAP warp, which combines a linearized homography and a global similarity with the smallest rotation angle to create natural-looking mosaics. Chen et al. [26] proposed a GSP warp to stitch multiple images with a global similarity prior in the objective function, which constrains the warp to resemble a similarity as a whole.

These methods use a global similarity to effectively mitigate projective distortion in the non-overlapping region, assuming that the original images are most natural to users. However, the assumption is not consistent with human perception in urban scenes, due to the perspective distortion generated by different perspectives between images (see Fig. 2).

## III Proposed Warps

In this section, we first present an analysis of homography and SPHP warps, to show their advantages and disadvantages with respect to perspective distortion and projective distortion (see Fig. 3 and 4). Then homography is reformulated as a solution of a system of bivariate equations, where projective distortion is quantified by a nonlinear scale function on a special horizontal line (see Fig. 5(b)). Finally, quasi-homography is proposed as a solution of a modified system, by linearizing the scale function to a piece-wise continuous function (see Fig. 5(c)).

### III-A Analysis of Homography and SPHP Warps

Let $I$ and $I'$ denote the target image and the reference image respectively. A warp $\mathcal{H}$ is a planar transformation [16], which relates pixel coordinates $(x,y)\in I$ to $(x',y')\in I'$, where

$$
\begin{cases}
x' = f(x,y), \\
y' = g(x,y).
\end{cases} \tag{1}
$$

We say a warp is line-preserving, if any straight line in the target image is transformed into a straight line in the warped result. If a line-preserving warp aligns the target image to the reference image, then the perspective of the warped result is consistent with the perspective of the reference image. Given a line-preserving warp $\mathcal{H}$ as (1) and a line $\ell$ with slope $k$ crossing $(x,y)$, it corresponds to a line $\ell'$ with slope $k'$ crossing $(x',y')$, where

\begin{align}
k' &= \mathrm{slope}(x,y,k) \tag{2}\\
&= \lim_{\Delta x\to 0}\frac{g(x+\Delta x,\,y+k\Delta x)-g(x,y)}{f(x+\Delta x,\,y+k\Delta x)-f(x,y)} \tag{3}\\
&= \frac{g_x(x,y)+k\,g_y(x,y)}{f_x(x,y)+k\,f_y(x,y)}, \tag{4}
\end{align}

and $f_x$, $f_y$, $g_x$ and $g_y$ denote partial derivatives of $f$ and $g$. In fact, (2) is a necessary and sufficient condition for a line-preserving warp. In order to characterize perspective distortion, we draw a grid-to-grid map of lines with slopes

\begin{align}
\mathrm{slope}(x,y,0) &= \frac{g_x(x,y)}{f_x(x,y)}, \tag{5}\\
\mathrm{slope}(x,y,\infty) &= \frac{g_y(x,y)}{f_y(x,y)}, \tag{6}
\end{align}

which correspond to horizontal lines and vertical lines in the target image. Consequently, any point $(x,y)$ can be regarded as the intersection point of a horizontal line and a vertical line, which corresponds to a point $(x',y')$ as the intersection point of lines with slopes (5,6) (see Fig. 3).

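It is easy to check that a homography warp

\begin{align}
f(x,y) &= \frac{h_1 x + h_2 y + h_3}{h_7 x + h_8 y + 1}, \tag{7}\\
g(x,y) &= \frac{h_4 x + h_5 y + h_6}{h_7 x + h_8 y + 1}, \tag{8}
\end{align}

satisfies condition (2); therefore homography is line-preserving (see Fig. 3(b)). However, SPHP is not line-preserving, because the perspective of the projective transformation in the overlapping region is inconsistent with the perspective of the similarity transformation in the non-overlapping region (see Fig. 3(c)).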
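The line-preserving property can be checked numerically. The following is a minimal sketch, assuming the standard eight-parameter form of a homography with coefficients `h1..h8`; the coefficient values and the helper names (`warp`, `collinearity_defect`, `warped_line_defect`) are hypothetical and chosen only for illustration. It warps three points of a straight line and measures how far their images are from being collinear.

```python
# Illustrative homography coefficients h1..h8 (hypothetical values, not from the paper).
H = (0.9, 0.02, 300.0, -0.05, 0.95, 10.0, -4e-4, 1e-5)


def warp(h, x, y):
    """Apply an eight-parameter homography: returns (x', y') = (f(x, y), g(x, y))."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    w = h7 * x + h8 * y + 1.0
    return (h1 * x + h2 * y + h3) / w, (h4 * x + h5 * y + h6) / w


def collinearity_defect(p, q, r):
    """Twice the signed area of triangle pqr; zero iff the three points are collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def warped_line_defect(h, x0, y0, k, steps=(0.0, 50.0, 120.0)):
    """Warp three points of the line through (x0, y0) with slope k and test collinearity."""
    pts = [warp(h, x0 + t, y0 + k * t) for t in steps]
    return collinearity_defect(*pts)


# The defect vanishes up to floating-point error, so the warped points stay on one line.
print(abs(warped_line_defect(H, 10.0, 30.0, 0.7)) < 1e-6)
```

A warp that stitches two different transformations together, as SPHP does across its regions, would generally produce a nonzero defect for lines crossing the region boundary.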

In horizontal image stitching, there always exists a special horizontal line $y = y_*$ that remains horizontal under a line-preserving warp $\mathcal{H}$, where $y_*$ is determined by solving

$$
\mathrm{slope}(x, y_*, 0) = 0. \tag{9}
$$

In fact, $f(x, y_*)$ is a scale function on the special horizontal line, which quantifies projective distortion in horizontal image stitching. In order to characterize the function more explicitly, we draw a map of it (see Fig. 4).

Homography necessarily generates projective distortion, because the derivative constantly increases in the non-overlapping region (see Fig. 4(b)). However, SPHP possesses a piece-wise continuous scale function, which is linear in the non-overlapping region (see Fig. 4(c)). This difference leads to less projective distortion of SPHP than homography.
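For a homography, the special horizontal line and its scale function can be computed in closed form. The sketch below assumes the standard eight-parameter homography with coefficients `h1..h8`; requiring the numerator of the partial derivative of the second coordinate with respect to `x` to vanish gives `y_star = (h6*h7 - h4) / (h4*h8 - h5*h7)`. The coefficient values and function names are hypothetical, and the closed form is our own derivation from that assumption rather than a formula quoted from the text.

```python
# Illustrative homography coefficients h1..h8 (hypothetical values, not from the paper).
H = (0.9, 0.02, 300.0, -0.05, 0.95, 10.0, -4e-4, 1e-5)


def warp(h, x, y):
    """Apply an eight-parameter homography: returns (x', y')."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    w = h7 * x + h8 * y + 1.0
    return (h1 * x + h2 * y + h3) / w, (h4 * x + h5 * y + h6) / w


def special_horizontal_line(h):
    """Ordinate y_* of the horizontal line whose image is again horizontal."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    denom = h4 * h8 - h5 * h7
    if denom == 0.0:
        raise ValueError("degenerate case: no unique special horizontal line")
    return (h6 * h7 - h4) / denom


def scale_function(h, x):
    """Warped abscissa of the point (x, y_*) on the special horizontal line."""
    return warp(h, x, special_horizontal_line(h))[0]


y_star = special_horizontal_line(H)
# All points on y = y_* are mapped to the same y', so the line stays horizontal.
print([round(warp(H, x, y_star)[1], 6) for x in (0.0, 200.0, 600.0)])
# Equal steps in x give growing steps in x': the scale function is nonlinear.
print([scale_function(H, x) for x in (0.0, 100.0, 200.0)])
```

With these illustrative coefficients the increments of the scale function grow with `x`, which is the enlarging effect attributed to homography above.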

### III-B Reformulation of Homography Warps

Suppose the target image $I$ and the reference image $I'$ are well aligned by a global homography $\mathcal{H}$ as (7,8) in the overlapping region. Our goal is to propose a warp that combines the advantages of homography (line preservation) and similarity (a linear scale function) in the non-overlapping region. In the following, we give a reformulation of homography that characterizes slope preservation and scale linearization separately.

First, we define as the preimage of the border of the overlapping region and the non-overlapping region (marked in blue in Fig. 5(a)), which satisfies

(12) |

where if assigning the left image as the target image , otherwise equals to the width of the reference image .

For a homography , the preimage

(13) |

Then, in order to benefit from the analysis of homography and SPHP, we reformulate homography as a solution of

(14) | ||||

(15) |

where and are projections of onto and respectively (see Fig. 5(a)), and

(16) | ||||

(17) |

which are independent of and respectively.

It is easy to check that, if we regard and as unknowns, then as (7,8) is the unique solution of (14,15) satisfying (10,13,16,17). Comparing to the formulation (7,8), our reformulation characterizes perspective distortion as slope preservation ( and ) and projective distortion as scale linearization () respectively. In fact, and constrain shapes of the grid, while determines its density (see Fig. 5(b)).

### III-C Quasi-homography Warps

Our reformulation of homography and our analysis of SPHP motivate us to modify the scale function to a piece-wise continuous function, which is linear in the non-overlapping region.

Then, the truncation of Taylor’s expansion of ,

(19) |

is linear in the non-overlapping region and is connected to in the overlapping region. Let denote the overlapping region, we modify the scale function to

(20) |

which is piece-wise continuous. If we regard and as unknowns, then the system of bivariate equations

(21) | ||||

(22) |

satisfying (10,13,16,17,20) possesses a unique solution

(23) | ||||

(24) |

where

(25) | ||||

(26) |

define a warp in the non-overlapping region and are polynomials in .

In fact, this warp just crowds the grid of homography in the horizontal direction without varying its shape (see Fig. 5(c)). In this sense, we call it a quasi-homography warp corresponding to the homography warp. A quasi-homography maintains good alignment in the overlapping region, as the homography does, while mitigating perspective distortion and projective distortion simultaneously in the non-overlapping region.
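The scale linearization can be illustrated on the special horizontal line alone. The sketch below is not the full warp (23)–(26); it only shows the piece-wise scale function obtained by replacing the homography's scale function with its first-order Taylor truncation beyond a boundary abscissa. It assumes the standard eight-parameter homography, assumes that the overlapping side is `x <= x_b`, and uses hypothetical names (`x_b`, `modified_scale`) and coefficient values.

```python
# Illustrative homography coefficients h1..h8 (hypothetical values, not from the paper).
H = (0.9, 0.02, 300.0, -0.05, 0.95, 10.0, -4e-4, 1e-5)


def special_horizontal_line(h):
    """Ordinate y_* of the horizontal line that stays horizontal under the homography."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    return (h6 * h7 - h4) / (h4 * h8 - h5 * h7)


def scale_and_derivative(h, x):
    """Homography scale function on y = y_* and its derivative with respect to x."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    y_star = special_horizontal_line(h)
    a = h2 * y_star + h3   # constant part of the numerator on y = y_*
    b = h8 * y_star + 1.0  # constant part of the denominator on y = y_*
    w = h7 * x + b
    return (h1 * x + a) / w, (h1 * b - h7 * a) / (w * w)


def modified_scale(h, x, x_b):
    """Exact scale for x <= x_b (assumed overlap); first-order Taylor truncation beyond."""
    if x <= x_b:
        return scale_and_derivative(h, x)[0]
    value, slope = scale_and_derivative(h, x_b)
    return value + slope * (x - x_b)


x_b = 200.0  # hypothetical abscissa where the partition line meets y = y_*
for x in (100.0, 200.0, 400.0, 600.0):
    print(x, scale_and_derivative(H, x)[0], modified_scale(H, x, x_b))
```

Inside the assumed overlap both functions coincide; beyond `x_b` the modified function grows linearly while the homography's scale function keeps accelerating, which is the crowding effect described above.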

## IV Implementation

In this section, we first present more implementation details of our quasi-homography in two-image stitching and multiple-image stitching, and then we propose two variations including orthogonal rectification and partition refinement.

### IV-A Two-image Stitching

Given two images captured from a camera rotated vertically, if a homography is estimated, then a quasi-homography can be calculated, which smoothly extrapolates the homography in the overlapping region to the quasi-homography warp in the non-overlapping region. A brief algorithm is given in Algorithm 1.

### IV-B Multiple-image Stitching

Given a sequence of multiple images captured from a camera rotated vertically, our warping method consists of three stages. In the first stage, we determine a reference image as the standard perspective. Then, we estimate a homography for every pair of images with overlapping fields of view, and calculate its corresponding quasi-homography. Finally, we concatenate all target images in the image plane of the reference image.

Fig. 6 illustrates the concatenation procedure of stitching four images. First, we set as the reference image, so perspectives of other target images should agree with its perspective. Then, we calculate quasi-homography warps , and . Finally, we concatenate to by

(30) |

For stitching more images, we obtain the concatenation maps recursively.
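One plausible reading of the recursive concatenation is function composition of pairwise warps along the chain from a target image to the reference image. The sketch below illustrates only that reading, under the assumption that each pairwise warp maps a target image into the plane of its neighbour; the names `compose`, `concatenate`, `warp_4_to_3` and `warp_3_to_2`, and the simple affine stand-ins, are hypothetical and are not quasi-homographies.

```python
from functools import reduce


def compose(outer, inner):
    """Return the warp that applies `inner` first and `outer` second."""
    return lambda x, y: outer(*inner(x, y))


def concatenate(chain):
    """Compose pairwise warps, listed from the target image towards the reference image."""
    if not chain:
        raise ValueError("at least one pairwise warp is required")
    return reduce(lambda acc, warp: compose(warp, acc), chain)


# Hypothetical stand-ins for two pairwise warps (simple affine maps for illustration).
warp_4_to_3 = lambda x, y: (x + 300.0, y - 5.0)
warp_3_to_2 = lambda x, y: (1.1 * x + 250.0, y + 2.0)

# Map a pixel of the outermost image into the plane of the assumed reference image.
warp_4_to_2 = concatenate([warp_4_to_3, warp_3_to_2])
print(warp_4_to_2(10.0, 20.0))
```

Longer sequences extend the list passed to `concatenate`, which mirrors obtaining the concatenation maps recursively.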

### IV-C Orthogonal Rectification

In urban scenes, users are accustomed to taking pictures by rotating cameras vertically, hence any vertical line in the target image is expected to be transformed into a vertical line in the warped result. However, enforcing this expectation will inevitably sacrifice alignment quality in the overlapping region.

In order to achieve orthogonal rectification, we incorporate an extra constraint in homography estimation, which constrains the external vertical boundary of the target image to remain vertical in the warped result.

Then, it should satisfy

(31) |

where is the height of the target image and if assigning the left image as target, otherwise equals to the width of the target image. For a homography warp as (7,8), we obtain

(32) |

Then, a global homography is estimated by solving

(33) |

Since quasi-homography just crowds the grid of homography in the horizontal direction without varying its shape, the external vertical boundary remains vertical (see Fig. 7).
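The verticality constraint can be made concrete for a homography. Assuming the standard eight-parameter form with coefficients `h1..h8` and a boundary `x = b`, the warped abscissa is constant along the boundary exactly when `h2*(h7*b + 1) == h8*(h1*b + h3)`; this condition is our own derivation under that assumption. The sketch below only checks the constraint and, for illustration, adjusts `h2` to satisfy it. It does not reproduce the constrained estimation (33), and all names and values are hypothetical.

```python
# Illustrative homography coefficients h1..h8 (hypothetical values, not from the paper).
H = (0.9, 0.02, 300.0, -0.05, 0.95, 10.0, -4e-4, 1e-5)


def warped_x(h, x, y):
    """First coordinate of the eight-parameter homography."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    return (h1 * x + h2 * y + h3) / (h7 * x + h8 * y + 1.0)


def vertical_boundary_residual(h, b):
    """Zero iff the boundary x = b is mapped to a vertical line."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    return h2 * (h7 * b + 1.0) - h8 * (h1 * b + h3)


def with_vertical_boundary(h, b):
    """Illustration only: overwrite h2 so that x = b stays vertical."""
    h1, h2, h3, h4, h5, h6, h7, h8 = h
    h2_new = h8 * (h1 * b + h3) / (h7 * b + 1.0)
    return (h1, h2_new, h3, h4, h5, h6, h7, h8)


b, height = 640.0, 480.0  # hypothetical boundary abscissa and image height
H_rect = with_vertical_boundary(H, b)
print(vertical_boundary_residual(H, b), vertical_boundary_residual(H_rect, b))
print(warped_x(H_rect, b, 0.0), warped_x(H_rect, b, height))
```

In practice the constraint would be imposed during estimation rather than by overwriting a coefficient afterwards, since the latter ignores the alignment error in the overlapping region.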

### IV-D Partition Refinement

In (12), we simply set the partition line as the border of the overlapping region and the non-overlapping region, which directly determines the linear scale function (19). In fact, the scale ratio

(34) |

is more precise when is more locally aligned, thus we modify the partition line as the external border of the seam for further refinement (see Fig. 8).

## V Experiments

We tested our proposed method on a range of images captured from rear and front cameras in urban scenes. In our experiments, we use SIFT [32] to extract and match features, RANSAC [33] to estimate a global homography, and seam-cutting [27] to blend the overlapping region. The code is implemented in OpenCV 2.4.9 and generally takes s-s on a desktop PC with Intel i5 GHz CPU and GB memory to stitch two images with resolution by Algorithm 1, where the calculation of quasi-homography only takes 0.1s (including forward map and backward map).

### V-A Comparisons to State-of-the-art Warps

We compared our quasi-homography warp to state-of-the-art warps in urban scenes, including homography, SPHP and AutoStitch. In order to highlight comparisons on naturalness quality in the non-overlapping region, for homography, SPHP and quasi-homography, we use the same homography alignment and the same seam-cutting composition in the overlapping region. More results and original input images are available in the supplementary material.

Fig. 9 illustrates an example in selfie-stitching. Homography provides a line-preserving stitching result but generates projective distortion in buildings and trees (highlighted in blue rectangles). SPHP mitigates projective distortion significantly but brings in apparent perspective distortion in the top of buildings (highlighted in red rectangles). Quasi-homography creates a more natural-looking stitching result in both aspects of projective distortion and perspective distortion.

Fig. 10 illustrates an example in multiple-image stitching, where stitching results are cropped for the sake of layout. Homography enlarges trees on the right (highlighted in blue rectangles) but preserves all straight lines. SPHP curves buildings and grounds (highlighted in red rectangles) but without enlarging the region of trees. AutoStitch produces a fisheye-like mosaic due to spherical warps. Quasi-homography balances projective distortion against perspective distortion by preserving horizontal lines and vertical lines, simultaneously possessing a piece-wise continuous scale function.

### V-B User Study

In order to investigate whether quasi-homography is preferred by users in urban scenes, we conduct a user study to compare our result to homography and SPHP. We invite 17 participants to rank 20 unannotated groups of stitching results, including 5 groups from front camera and 15 groups from rear camera. In each group, we use the same homography alignment and the same seam-cutting composition, and all parameters are set to produce the optimal results. In our study, each participant ranks three unannotated stitching results in each group, and a score is counted by assigning weights 4, 2 and 1 to Rank 1, Rank 2 and Rank 3.

Table I summarizes the rank votes and total scores for the three warps, and the histogram of the scores is shown in Fig. 11 in three aspects. This user study demonstrates that stitching results of quasi-homography win most users’ favor in urban scenes.

Methods | Rank 1 | Rank 2 | Rank 3 | Total score
---|---|---|---|---
Homography | 69 | 210 | 61 | 757
SPHP [24] | 19 | 56 | 265 | 453
Quasi-homography | 252 | 74 | 14 | 1170
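The total scores follow directly from the rank votes and the weights 4, 2 and 1 stated above. The short sketch below recomputes them from the table entries and checks that each warp received 17 × 20 = 340 votes in total; the variable and function names are our own.

```python
# Weights assigned to Rank 1, Rank 2 and Rank 3 in the user study.
WEIGHTS = (4, 2, 1)

# Rank votes per warp, copied from Table I: (Rank 1, Rank 2, Rank 3).
VOTES = {
    "Homography": (69, 210, 61),
    "SPHP": (19, 56, 265),
    "Quasi-homography": (252, 74, 14),
}


def total_score(counts, weights=WEIGHTS):
    """Weighted sum of rank votes."""
    return sum(c * w for c, w in zip(counts, weights))


scores = {name: total_score(counts) for name, counts in VOTES.items()}
votes_per_warp = {name: sum(counts) for name, counts in VOTES.items()}

print(scores)          # {'Homography': 757, 'SPHP': 453, 'Quasi-homography': 1170}
print(votes_per_warp)  # every warp is ranked 340 times (17 participants x 20 groups)
```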

## VI Conclusion

In this paper, we propose a quasi-homography warp, which balances perspective distortion against projective distortion in the non-overlapping region, to create natural-looking mosaics. Experiments show that quasi-homography outperforms some state-of-the-art warps in urban scenes, including homography, AutoStitch and SPHP. A user study demonstrates that stitching results of quasi-homography win most users’ favor as well, compared to homography and SPHP. It should be noted that, though quasi-homography preserves horizontal lines and vertical lines, it may curve some diagonal lines in the stitching result. Fortunately, users are accustomed to taking pictures of orthogonal compositions in urban scenes. Future work may include generalizing quasi-homography warps to the spatially-varying warping framework to improve alignment quality.

## References

- [1] S. Tzavidas and A. K. Katsaggelos, “A multicamera setup for generating stereo panoramic video,” IEEE Transactions on Multimedia, vol. 7, no. 5, pp. 880–890, 2005.
- [2] X. Sun, J. Foote, D. Kimber, and B. S. Manjunath, “Region of interest extraction and virtual camera control based on panoramic video capturing,” IEEE Transactions on Multimedia, vol. 7, no. 5, pp. 981–990, 2005.
- [3] V. R. Gaddam, M. Riegler, R. Eg, and P. Halvorsen, “Tiling in interactive panoramic video: Approaches and evaluation,” IEEE Transactions on Multimedia, vol. 18, no. 9, pp. 1–1, 2016.
- [4] H. Y. Shum, K. T. Ng, and S. C. Chan, “A virtual reality system using the concentric mosaic: construction, rendering, and data compression,” IEEE Transactions on Multimedia, vol. 7, no. 1, pp. 85–95, 2005.
- [5] W. K. Tang, T. T. Wong, and P. A. Heng, “A system for real-time panorama generation and display in tele-immersive applications,” IEEE Transactions on Multimedia, vol. 7, no. 2, pp. 280–292, 2005.
- [6] Q. Zhao, L. Wan, W. Feng, and J. Zhang, “Cube2video: Navigate between cubic panoramas in real-time,” IEEE Transactions on Multimedia, vol. 15, no. 8, pp. 1745–1754, 2013.
- [7] R. Szeliski, “Image alignment and stitching: A tutorial,” Found. Trends Comput. Graph. Vis., vol. 2, no. 1, pp. 1–104, 2006.
- [8] S. Peleg, “Elimination of seams from photomosaics,” Comput. Graph. Image Process., vol. 16, no. 1, pp. 90–94, 1981.
- [9] M.-L. Duplaquet, “Building large image mosaics with invisible seam lines,” in Proc. SPIE Visual Information Processing VII, 1998, pp. 369–377.
- [10] J. Davis, “Mosaics of scenes with moving objects,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Jun. 1998, pp. 354–360.
- [11] A. A. Efros and W. T. Freeman, “Image quilting for texture synthesis and transfer,” in Proc. ACM SIGGRAPH, 2001, pp. 341–346.
- [12] A. Mills and G. Dudek, “Image stitching with dynamic elements,” Image Vis. Comput., vol. 27, no. 10, pp. 1593–1602, 2009.
- [13] P. J. Burt and E. H. Adelson, “A multiresolution spline with application to image mosaics,” ACM Trans. Graphics, vol. 2, no. 4, pp. 217–236, 1983.
- [14] P. Pérez, M. Gangnet, and A. Blake, “Poisson image editing,” ACM Trans. Graphics, vol. 22, no. 3, pp. 313–318, 2003.
- [15] A. Levin, A. Zomet, S. Peleg, and Y. Weiss, “Seamless image stitching in the gradient domain,” in Proc. Eur. Conf. Comput. Vis., May 2004, pp. 377–389.
- [16] R. Hartley and A. Zisserman, Multiple view geometry in computer vision. Cambridge univ. press, 2003.
- [17] J. Gao, S. J. Kim, and M. S. Brown, “Constructing image panoramas using dual-homography warping,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Jun. 2011, pp. 49–56.
- [18] W.-Y. Lin, S. Liu, Y. Matsushita, T.-T. Ng, and L.-F. Cheong, “Smoothly varying affine stitching,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Jun. 2011, pp. 345–352.
- [19] J. Zaragoza, T.-J. Chin, M. S. Brown, and D. Suter, “As-projective-as-possible image stitching with moving DLT,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Jun. 2013, pp. 2339–2346.
- [20] Z. Lou and T. Gevers, “Image alignment by piecewise planar region matching,” IEEE Transactions on Multimedia, vol. 16, no. 7, pp. 2052–2061, 2014.
- [21] J. Gao, Y. Li, T.-J. Chin, and M. S. Brown, “Seam-driven image stitching,” Eurographics, pp. 45–48, 2013.
- [22] F. Zhang and F. Liu, “Parallax-tolerant image stitching,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., May 2014, pp. 3262–3269.
- [23] K. Lin, N. Jiang, L.-F. Cheong, M. Do, and J. Lu, “Seagull: Seam-guided local alignment for parallax-tolerant image stitching,” in Proc. Eur. Conf. Comput. Vis., Oct. 2016.
- [24] C.-H. Chang, Y. Sato, and Y.-Y. Chuang, “Shape-preserving half-projective warps for image stitching,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., May 2014, pp. 3254–3261.
- [25] C.-C. Lin, S. U. Pankanti, K. N. Ramamurthy, and A. Y. Aravkin, “Adaptive as-natural-as-possible image stitching,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Jun. 2015, pp. 1155–1163.
- [26] Y.-S. Chen and Y.-Y. Chuang, “Natural image stitching with the global similarity prior,” in Proc. Eur. Conf. Comput. Vis., Oct. 2016, pp. 186–201.
- [27] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 11, pp. 1222–1239, Nov. 2001.
- [28] A. Agarwala, M. Dontcheva, M. Agrawala, S. Drucker, A. Colburn, B. Curless, D. Salesin, and M. Cohen, “Interactive digital photomontage,” ACM Trans. Graphics, vol. 23, no. 3, pp. 294–302, 2004.
- [29] V. Kwatra, A. Schödl, I. Essa, G. Turk, and A. Bobick, “Graphcut textures: image and video synthesis using graph cuts,” ACM Trans. Graphics, vol. 22, no. 3, pp. 277–286, 2003.
- [30] A. Eden, M. Uyttendaele, and R. Szeliski, “Seamless image stitching of scenes with large motions and exposure differences,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., vol. 2, Jun. 2006, pp. 2498–2505.
- [31] M. Brown and D. G. Lowe, “Automatic panoramic image stitching using invariant features,” Int. J. Comput. Vis., vol. 74, no. 1, pp. 59–73, 2007.
- [32] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, 2004.
- [33] M. A. Fischler and R. C. Bolles, “Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981.