Photometric Stereo in Participating Media Considering ShapeDependent Forward Scatter
Abstract
Images captured in participating media such as murky water, fog, or smoke are degraded by scattered light. Thus, the use of traditional threedimensional (3D) reconstruction techniques in such environments is difficult. In this paper, we propose a photometric stereo method for participating media. The proposed method differs from previous studies with respect to modeling shapedependent forward scatter. In the proposed model, forward scatter is described as an analytical form using lookup tables and is represented by spatiallyvariant kernels. We also propose an approximation of a largescale dense matrix as a sparse matrix, which enables the removal of forward scatter. Experiments with real and synthesized data demonstrate that the proposed method improves 3D reconstruction in participating media.
1 Introduction
Threedimensional (3D) shape reconstruction from twodimensional (2D) images is an important task in computer vision. Numerous 3D reconstruction such as structure from motion, shapefromX, and multiview stereo have been proposed. However, reconstructing the shape of an object in a participating medium, e.g., murky water, fog, and smoke, remains a challenging task. In participating media, light is attenuated and scattered by suspended particles, which degrades the quality of the captured images (Figure 1). 3D reconstruction techniques designed for clear air environments will not work in participating media.
Several methods to reconstruct a 3D shape in participating media using photometric stereo techniques have been proposed [11, 22, 9]. Photometric stereo methods reconstruct surface normals from images captured under different lighting conditions [24]. Note that backscatter and forward scatter occur in participating media, as shown in Figure 2; thus, the irradiance observed at a camera includes a direct component reflected on the surface, as well as a backscatter and forward scatter components. Narasimhan et al. [11] modeled single backscattering under a directional light source in participating media and estimated surface normals using a nonlinear optimization technique. Tsiotsios et al. [22] assumed that backscatter saturates close to the camera when illumination follows the inverse square law, and subtracted the backscatter from the captured image.
These methods do not consider forward scatter. Forward scatter depends on the object’s shape locally and globally, and in highly turbid media such as port water, 3D reconstruction accuracy is affected by forward scatter. Although Murez et al. [9] proposed a photometric stereo technique that considers forward scatter, they assumed that the scene is approximated as a plane, which enables prior calibration. Therefore, this assumption deteriorates the estimation of normals because forward scatter is intrinsically dependent on the object’s shape.
We propose a forward scatter model and implement the model into a photometric stereo framework. Differing from previous studies [15, 9], we compute forward scatter, which depends on the object’s shape. To overcome the mutual dependence between shape and forward scatter, we develop an iterative algorithm that performs a forward scatter removal and 3D shape reconstruction alternately.
In the proposed model, forward scatter is represented by an analytical form of single scattering. In computer graphics, Monte Carlo and finite element techniques have been used to simulate light scattering in participating media. Although such techniques provide accurate simulations, realtime rendering is difficult. Thus, analytical or closedform solutions have been proposed for efficient computation [19, 26, 17]. For example, Sun et al. [19] proposed an analytical single scattering model of backscatter and forward scatter between the source and the surface (sourcesurface forward scatter) using 2D lookup tables. Similar to their model, in this study, forward scatter between the surface and the camera (surfacecamera forward scatter) is computed using a lookup table.
Note that surfacecamera forward scatter causes image blur. As mentioned previously, Murez et al. [9] assumed that the object is approximated as a plane. Thus, they modeled forward scatter as a spatiallyinvariant point spread function under orthogonal projection. In our proposed model, forward scatter is modeled as spatiallyvariant kernels because it depends on the object’s shape. Thus, it is impossible to remove forward scatter from the captured image directly because the spatiallyvariant kernels result in a shiftvariant and largescale dense matrix (Section 4.1). To address this problem, we approximate the kernel matrix as a sparse matrix. We leverage kernel convergence between distant points on the surface for the approximation.
The primary contributions of this paper are summarized as follows:

To the best of our knowledge, this is the first study to strictly model shapedependent surfacecamera forward scatter and we derive its analytical solution.

To remove surfacecamera forward scatter, we propose an approximation of the largescale dense matrix to a sparse matrix.
2 Related work
2.1 3D reconstruction in participating media
In participating media, suspended particles cause light scattering and attenuation that reduce the contrast in captured images. Nayar et al. [13] proposed direct and global light separation such as interreflection or volumetric scattering using highfrequency patterns. Treibitz and Schechner [21] utilized polarization to remove a backscatter component and estimated a depth map from the extracted backscatter. Kim et al. [8] mounted a lenslet array and a diffuser between a camera and a participating medium to estimate blur caused by scattering.
Several attempts have been made to design traditional 3D reconstruction techniques for participating media (e.g., structured light [11, 4] and stereo [18, 14]). Narasimhan et al. [11] proposed a structured light method in participating media, and Gu et al. [4] reconstructed the 3D shape of a participating medium using structured light. Negahdaripour and Sarafraz [14] improved stereo matching in participating media by exploiting the relationship between backscatter and a disparity map. Recently, Tiang et al. [20] proposed a depth map estimation method using a light field. Some methods have utilized scattering or attenuation directly for 3D reconstruction. For example, Hirufuji et al. [5] reconstructed specular objects with occlusions using single scattered light, and Inoshita et al. [7] reconstructed translucent objects directly from volumeric scattering. Hullin et al. [6] used fluorescence as a participating medium to reconstruct transparent objects, and Asano et al. [2] utilized absorption of infrared light to estimate a depth map in underwater scenes.
Several photometric stereo methods have been proposed [15, 11, 22, 9]. Photometric stereo has several advantages, e.g., it does not require stereo correspondence and provides pixelwise detailed shape information even if the target object has a textureless surface. However, the image formation must be strictly modeled to preserve photometric information. Narasimhan et al. [11] modeled the single scattering of backscatter under a directional light source in participating media, and Tsiotsios et al. [22] demonstrated the saturation of backscatter under the inverse square law, which enabled backscatter removal by subtracting no object image from an input image. Note that these methods did not consider the effect of forward scatter, which deteriorates the accuracy of 3D reconstruction in highly turbid media. On the other hand, some methods have considered forward scatter [15, 9]. For exmaple, Murez et al. [9] approximated the scene as a plane and precalibrated the forward scatter component. Nevertheless, unlike the proposed model, such method do not discuss the relationship between forward scatter and the object’s shape. In this paper, we model shapedependent surfacecamera forward scatter.
2.2 Analytical solution for single scattering
In computer graphics, analytical or closedform solutions for single scattering in participating media have been proposed to overcome computational complexity issues. Sun et al. [19] assumed single and isotropic scattering and used 2D lookup tables to analytically describe backscatter and sourcesurface forward scatter. Zhou et al. [26] extended this approach to inhomogeneous single scattering media with respect to backscatter. Pegoraro et al. [17] derived a closedform solution for single backscattering under a general phase function and light distribution. In this study, owing to its simplicity, we use a lookup table similar to that of Sun et al. [19], and we model surfacecamera forward scatter analytically.
3 Image formation model
In this section, we discuss an image formation model in participating media, and provide an analytical form using lookup tables. We assume perspective projection, near lighting, and Lambertian objects. As in many previous studies [11, 22, 9, 19], multiple scattering is considered to be negligible.
Here, let be irradiance at a camera when the 3D position on an object surface is observed. In participating media, is decomposed into a reflected component (Figure 4), a backscatter component (Figure 3), and a forward scatter component (Figure 5) as follows:
(1) 
Here, parameters and denote an extinction coefficient and the distance between the camera and position , respectively. In participating media, light is attenuated exponentially relative to distance. The extinction coefficient is the sum of the absorption coefficient and the scattering coefficient .
(2) 
3.1 Backscatter component
As shown in Figure 3, the backscatter component is the sum of scattered light on the viewline without reaching the surface. Thus, the irradiance of the backscatter component is integral along the line, which is expressed as follows:
(3) 
where denotes the radiant intensity of the source and is a phase function that describes the angular scattering distribution. Although Equation (3) cannot be computed in closedform, an analytical solution can be acquired using a lookup table. However, Equation (3) depends on , , , and ; thus, the entry of the table is fourdimensional. Sun et al. [19] assumed isotropic scattering (i.e., ) and derived an analytical solution using a 2D lookup table :
(4) 
where and are optical thickness. In the following, denotes the product of and distance . , , and are defined as follows:
(5)  
(6)  
(7) 
is a 2D lookup table computed numerically in advance.
As mentioned previously, to remove backscatter, Tsiotsios et al. [22] leveraged backscatter saturation without computing it explicitly. We also use an image without the target object to remove the backscatter component from the input image.
3.2 Reflected component
As shown in Figure 4, the reflected component is decomposed into directly reaching the surface and the sourcesurface forward scatter component :
(8) 
Considering diffuse reflection and attenuation in participating media, is expressed as follows:
(9) 
where is a diffuse albedo at , is a normal vector, and is the direction from to the source. The sourcesurface forward scatter component is the integral of scattered light on a hemisphere centered on :
(10) 
We define as the sum of scattered light from direction . As discussed in Section 3.1, Sun et al. [19] derived an analytical solution using a 2D lookup table as follows:
(11) 
where is a 2D lookup table given as
(12) 
3.3 Surfacecamera forward scatter component
When we observe surface point in a participating medium, the light reflected on point is scattered on the viewline, and the scattered light is also observed as a forward scatter component (Figure 5). In this paper, we describe this component analytically using a lookup table.
As shown in Figure 5, irradiance at the camera includes reflected light from the small facet centered at . If we consider this small facet as a virtual light source, similar to Equation (3), the irradiance can be expressed as follows:
(13) 
where is the area of the facet. At the camera, a discrete point on the surface corresponding to the pixel is observed. Thus, is the sum of these discrete points:
(14) 
Note that the domain of integration differs from that of Equation (3), i.e., . We define as the intersection point of the viewline and the tangent plane to . If , i.e., is inside the object, we set . If which means that is behind the camera, we set . Similar to Equation (4), the isotropic scattering assumption yields the following:
(15) 
This is the analytical expression of the surfacecamera forward scatter. Note that we define the area of the small facet as follows [12]:
(16) 
where is the area of the camera pixel and is the direction from to the camera.
4 Photometric stereo considering forward scatter
To reconstruct the surface normals using photometric stereo, we must deal with both the surfacecamera and sourcesurface forward scatter. In this section, we first discuss how to remove the surfacecamera forward scatter, and then we explain a photometric stereo method that considers the sourcesurface forward scatter.
4.1 Approximation of a largescale dense matrix
As mentioned previously, we can remove the backscatter using a previously proposed method [22]. Here, let be a backscatter subtracted image where is a number of pixels. Then from Equation (1) and (15), reflected light at the surface is expressed as follows:
(17) 
where is an dense matrix. is a largescale dense matrix whose elements are given by
(18) 
Our model is similar to that of Murez et al. [9]. However, our model is different in that each row of is spatiallyvariant because we compute the forward scatter considering the object’s shape. In the model presented by Murez et al. [9], the plane approximation of the scene under orthogonal projection yields a spatiallyinvariant point spread function. Our spatiallyvariant kernel matrix makes it impossible to solve Equation (17) directly.
To overcome this problem, we propose an approximation of a largescale dense matrix as a sparse matrix. Figure 6 (a) shows a row of reshaped in a 2D when we observe a plane in a participating medium. This shows how the observed irradiance of the center of the plane is affected by other points. Figure 6 (b) shows the profile of the blue line in Figure 6 (a). From these figures, we observe that the effect between two points converges to a very small value as the distance of the points increases; however, it does not converge to zero. Here, we assume that the value of converges to in the neighboring set centered at , and we obtain the following approximation:
(19)  
(20)  
(21) 
where and we use from Equation (20) to (21). Then, we define a sparse matrix as follows:
(22) 
This yields the following linear system:
(23) 
We solve this linear system using BiCG stabilization [23] to remove surfacecamera forward scatter. We define as the set of 3D points captured in a region centered at the observed pixel . Note that the size of the kernel support should be set manually. In our experiments, to gave efficient results. To avoid computation of all the elemetns of , we approximated the convergence value as follows:
(24) 
4.2 Photometric stereo
After removing the backscatter and surfacecamera forward scatter, we can obtain the reflected components . We reconstruct the surface normals by applying photometric stereo to . From Equations (8), (9) and (11), is given as follows:
(25) 
Note that this equation is not linear with respect to the normal due to the sourcesurface forward scatter. We want to apply photometric stereo directly to the equation; therefore, we use the following approximation of table :
(26) 
In Figure 7, we plot and when and . In each figure, the blue line represents and the green line represents . Although the error increases as increases, these graphs validate this approximation. Therefore, we can obtain the following:
(27) 
This is a linear equation about normal ; hence we apply photometric stereo to this equation.
4.3 Implementation
In this section, we explain our overall algorithm. Note that the kernel of equation 18 is only defined on the object’s surface; thus, we input a mask image and perform the proposed method on only the object region. Backscatter is removed using a previously proposed method [22]; however, the resulting image contains highfrequency noise due to SNR degradation. Therefore, we apply a median filter after removing the backscatter to reduce this highfrequency noise. We used Poisson solver [1] which is extended to perspective projection [16] for normal integration to reconstruct the shape. The proposed algorithm is described as follows:

Input images and a mask.
Initialize the shape and normals. 
Remove backscatter [22] and apply a median filter to the resulting images.

Remove forward scatter between the object and the camera (Equation (23)).

Reconstruct the normals using Equation (27).

Integrate the normals and update them from the reconstructed shape.

Repeat steps 3–5 until convergence.
5 Experiments
5.1 Experiments with synthesized data
We first describe experiments with synthesized data. We generated 8 synthsized images with a 3D model of a sphere using our scattering model in Section 3. The scattering property was assumed to be isotoropic and the parameters were set as . We show the examples of the synthesized images in Figure 8 (a), where an image without a participating medium, a reflected component , and a backscatter subtracted image from top to bottom. In the experimetns, the kernel support was set as . The shape was initialized as a plane.
The results are shown in Figure 8 (b) (c) and Table 1. Figure 8 (b) shows the ground truth and (c) shows the output of each iteration from left ro right. The top row shows the normals map, the middle row shows the angular error of the output, and the bottom row shows the reconstructed shapes. Table 1 shows the mean angular error of each output. GT in Table 1 denotes the error when we removed scattering effects with the ground truth shape and reconstructed the 3D shape inversely. As shown in Figure 8, the shape converged while oscillating in height. This convergence was also seen in the experiments with the real data (Figure 10).
5.2 Experimental environment
We also evaluated the proposed methods using real captured data. The experimental environment is shown in Figure 9 (a). We used a 60cm cubic tank and placed a target object in the tank. We used diluted milk as a participating medium. The medium parameters were set with reference to the literature [10]. A ViewPLUS Xviii 18bit linear camera was mounted in close contact with the tank, and eight LEDs were mounted around the camera. The input images were captured at an exposure of 33 ms. We captured 60 images under the same condition, and these images were averaged to make input images robust to noise caused by the imaging system; thus, eight averaged images were input to the proposed method.
The camera was calibrated using the method presented in the literature [25]. To consider refraction on the wall of the tank, calibration was performed when the tank was full of water. The locations of the LEDs were measured manually, and each radiant intensity was calibrated using a white Lambertian sphere.
The target objects are shown in Figure 9 (b) (sphere, tetrapod, and shell).
1  2  3  4  5  GT  

Error (deg.)  5.20  4.65  1.43  1.29  1.29  1.30 
5.3 Comparison wtih the backscatteronly modeling
We compared the proposed method with a previously proposed method [22] that models only backscatter. In each experiment, we initialized the target object as a plane for the iteration, and the kernel support was set as .
First, we evaluated the proposed method quantitatively using sphere. In this experiment, we placed 120 L of water and 30 mL of milk in the tank. Figure 1 (b) shows one of the input images. The results are given in Figure 10, where (a) shows the ground truth, (b) shows the result of the backscatteronly modeling [22], and (c) shows the result of the proposed method. These experimental results demonstrate that the proposed method can reconstruct the object’s shape in highly turbid media, in which the method that does not consider forward scatter fails. Table 2 shows the mean angular error of the results of the backscatteronly modeling [22] and the output of each iteration of the proposed method. As can be seen, the error reaches convergence during a few iterations.
Figure 11 and 12 show the results for tetrapod and shell. In each figure, (a) shows the result obtained in clear water and (b) shows the results of the existing [22] (second and third rows) and proposed (fourth and fifth rows). The top row shows one of the input images. We changed the concentration of the participating medium during these experiments (we mixed 10, 20, and 30 mL of milk with 120 L of water from left to right). As can be seen, the result of the existing method [22] becomes flattened as the concentration of the participating medium increases. In contrast, the proposed method can reconstruct the detalied shape in highly turbid media.
6 Conclusion
In this paper, we have proposed a photometric stereo method in participating media that considers forward scatter. The proposed analytical model differs from the previous works in that forward scatter depends on the object’s shape. However, the shape dependency of the forward scatter makes it impossible to remove. To address this problem, we have proposed an approximation of the largescale dense matrix that represents the forward scatter as a sparse matrix. Our experimental results demonstrate that the proposed method can reconstruct a shape in highly turbid media.
However, the ambiguity of the optimized support size of the kernel remains. We set aside an adaptive estimation of the support size for future work.
A limitation of the proposed method is that it requires a mask image of the target object. However, in highly turbid media, it may be difficult to obtain an effective mask image. In addition, we must initialize the object’s shape, which may be solved using a depth estimation method in participating media [21, 2, 3].
Acknowledgement
This work was supported by the Japan Society for the Promotion of Science KAKENHI Grant Number 15K00237.
References
 [1] A. Agrawal, R. Rasker, and R. Chellappa. What is the range of surface reconstructions from a gradient field? Proceedings of the 9th European conference on Computer Vision, I:578–591, 2006.
 [2] Y. Asano, Y. Zheng, K. Nishino, and I. Sato. Shape from water: Bispectral light absorption for depth recovery. European Conference on Computer Vision, pages 635–649, 2016.
 [3] A. Dancu, M. Fourgeaud, Z. Franjcic, and R. Avetisyan. Underwater reconstruction using depth sensors. SIGGRAPH Asia 2014 Technical Briefs, 2014.
 [4] J. Gu, S. K. Nayar, P. N. Belhumeur, and R. Ramamoorthi. Compressive structured light for reconvering inhomogeneous partiicpating media. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(3):555–567, 2013.
 [5] Y. Hirofuji, M. Iiyama, T. Funatomi, and M. Minoh. 3d reconstruction of specular objects with occulusion: A shapefromscattering approach. Asian Conference on Computer Vision, pages 630–641, 2014.
 [6] M. B. Hullin, M. Fuchs, I. Ihrke, H. P. Seidel, and H. P. A. Lensch. Fluorescent immersion range scanning. ACM Transaction on Graphics, 27(3), 2008.
 [7] C. Inoshita, Y. Mukaigawa, Y. Matsushita, and Y. Yagi. Shape from single scattering for translucent objects. Proceedings of the 12th European Conference on Computer Vision, pages 371–384, 2012.
 [8] J. Kim, D. Lanman, Y. Mukaigawa, and R. Rasker. Descattering transmission via angular filtering. Proceedings of the 11th European conference on Computer vision, pages 86–99, 2010.
 [9] Z. Murez, T. Treibitz, R. Ramamoorthi, and D. J. Kriegman. Photometric stereo in a scattering medium. IEEE Transaction on Pattern Analysis and Machine Intelligence, 39(9):1880–1891, 2017.
 [10] S. G. Narasimhan, M. Gupta, C. Donner, R. Ramamoorthi, S. K. Nayar, and H. W. Jensen. Acquiring scattering propoerties of participating media by dilution. ACM Transaction on Graphics, 25(3):1003–1012, 2006.
 [11] S. G. Narasimhan, S. K. Nayar, B. Sun, and S. J. Koppal. Structured light in scattering media. Proceedings of the Tenth IEEE International Conference on Computer Vision, I:420–427, 2005.
 [12] S. K. Nayar, K. Ikeuchi, and T. Kanade. Shape from interreflection. International Journal of Computer Vision, 6(3):173–195, 1991.
 [13] S. K. Nayar, G. Krishnan, M. D. Grossberg, and R. Raskar. Fast separation of direct and global components of a scene using high frequency. ACM Transaction on Graphics, 25(3):935–944, 2006.
 [14] S. Negahdaripour and A. Sarafraz. Improved stereo matching in scattering media by incorporating a backscatter cue. IEEE Transactions on Image Processing, 23(12):5743–5755, 2014.
 [15] S. Negahdaripour, H. Zhang, and X. Han. Investigation of photometric stereo method for 3d shape recovery from underwater imagery. OCEANS’02, 2:1010–1017, 2002.
 [16] T. Papadhimitri and P. Favaro. A new perspective on uncalibrated photometric stereo. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1474–1481, 2013.
 [17] V. Pegoraro, M. Schott, and S. G. Parker. A closedform solution to single scattering for general phase functions and light distributions. Proceedings of the 21st Eurographics conference on Rendering, pages 1365–1374, 2010.
 [18] M. Roser, M. Dunbabin, and A. Geiger. Simultaneous underwater visibility assessment, enhancement and improved stereo. IEEE International Conference on Robotics & Automation, pages 3840–3847, 2014.
 [19] B. Sun, R. Ramamoorthi, S. Narasimhan, and S. Nayar. A practical analytic single scattering model for real time rendering. ACM Transaction on Graphics, 24(3):1040–1049, 2005.
 [20] J. Tiang, Z. Murez, T. Cui, Z. Zhang, D. Kriegman, and R. Ramamoorthi. Depth and image restoration from light field in a scattering medium. Proceedings of the IEEE International Conference on Computer Vision, pages 2401–2410, 2017.
 [21] T. Treibitz and Y. Y. Schechner. Active polarization descattering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(3):385–399, 2009.
 [22] C. Tsiotsios, M. E. Angelopoulou, T. Kim, and A. J. Davison. Backscatter compensated photometric stereo with 3 sources. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 2259–2266, 2014.
 [23] H. A. van der Vorst. Bicgstab: A fast and smoothly converging variant of bicg for the solution of nonsymmetric linear systems. SIAM Journal on Scientific and Statistical Computing, 13(2):631–644, 1992.
 [24] R. J. Woodham. Photometric method for determining surface orientation from multiple images. Optical engineering, 19(1):139–144, 1980.
 [25] Z. Zhang. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330–1334, 2000.
 [26] K. Zhou, Q. Hou, M. Gong, J. Snyder, B. Guo, and H. Y. Shum. Fogshop: Realtime design and rendering of inhomogeneous, singlescattering media. 15th Pacific Conference on Computer Graphics and Applications, pages 116–125, 2007.