Lung nodule segmentation via level set machine learning
Abstract
Lung cancer has the highest mortality rate of all cancers in both men and women. The algorithmic detection, characterization, and diagnosis of abnormalities found in chest CT scan images can potentially aid radiologists by providing additional medical information to consider in their assessment. Lung nodule segmentation, i.e., the algorithmic delineation of the lung nodule surface, is a fundamental component of an automated nodule analysis pipeline. We introduce an extension of the vanilla level set image segmentation method where the velocity function is learned from data via machine learning regression methods, rather than manually designed. This mitigates the tedious design process of the velocity term from the standard method. We apply the method to image volumes of lung nodules from CT scans in the publicly available LIDC dataset, obtaining an average intersection over union score of 0.7185 (0.1114).
figure \cftpagenumbersofftable
*Matthew C Hancock\linkablemhancock@math.fsu.edu
1 Introduction
Lung cancer has the highest mortality rate in both males and females in the United States [1]. A lung nodule is a small to medium sized (roughly, 3 mm to 30 mm) abnormal region with a somewhat welldefined boundary. The definition is inherently imprecise because it is based on visual examples and the subjective interpretations thereof [2]. The likelihood of malignancy of a lung nodule can be inferred by a combination of radiological features (e.g., growth rate, shape, or density features) [3], which if determined early, increases chances of survival [4]. The precise location of the nodule’s surface in the image volume is often necessary to produce such nodule features computationally (e.g., a nodule’s volume or average internal density). Thus, the accurate segmentation of the nodule is a crucial step in a computeraided diagnosis (CAD) system. However, lung nodule segmentation is challenging because of the variability in nodule appearance and shape, as well as the potential proximity to various other lung anatomy (e.g., the vasculature or the pleural wall).
In this work, we introduce a machine learning extension of the standard level set image segmentation method of Malladi and Sethian [5] and apply it to the lung nodule segmentation problem. Starting from an initial guess, the method evolves a function that moves toward the boundary of the lung nodule in the image volume. The evolution of this function is dictated by a partial differential equation (PDE) whose velocity term is a function of the underlying image data, thus guiding the surface towards the desired boundary. The standard level set approach requires the manual design of this velocity term, which is difficult. In our extension of this method, which we call the level set machine learning (LSML) method, the velocity term in the PDE is learned from data via machine learning regression models.
This paper is structured as follows: in Section 2 we provide background on the level set segmentation method, related works and results for lung nodule segmentation. In Section 3, we introduce the LSML image segmentation method, and in Section 4, we present and discuss our results from applying our method to the lung nodule segmentation problem. We conclude in Section 5.
2 Background
2.1 Level set image segmentation and variants
Segmentation of images by evolving contours was introduced by Kass et al. [6] where a parameterized curve is evolved by minimizing a weighted sum of internal and external energy functionals. Evolving a parametrized curve is tedious from a computational point of view because topological changes are not easily handled. On the other hand, implicit curves handle topological changes very naturally. In the level set method, the curve denoting an object’s boundary is given implicitly by the zero level curve of a function which is often called the level set function. Figure 1 illustrates this concept, where evolving the surface yields a zero level curve (shown in red below the surface ) that undergoes a change in number of connected components from one to two. The concept extends readily to higher dimensions where the zero level set is a surface. The level set approach was pioneered by Osher and Sethian [7] for tracking flame movement, and Malladi and Sethian [5] introduced the level set method to the realm of image segmentation.
The movement of the level set function is governed by a PDE that we briefly derive. Suppose parameterizes the zero level set, , and evolves in the outward normal direction with velocity , i.e., , where is the time partial derivative of and is the outward unit normal to the zero level set. Thus, positive values of expand the level set and negative values of contract it. We take the convention that is positive inside the zero level set and negative outside, so that the gradient vector points in the inward normal direction along the zero level set of and . Thus by differentiating the level set relation with respect to , we arrive at the level set evolution PDE,
(1) 
To perform image segmentation, the velocity in Eq. (1) is defined in terms of the underlying image information. Intuitively, the velocity should be positive interior to the target boundary to cause expansion and negative exterior to the boundary to cause contraction. Near the target boundary, the velocity should be small in magnitude to prevent overshooting. The manual design of such a velocity field is difficult and often entails simplifying assumptions such as a homogeneously bright object against a homogeneously dark background. Such assumptions are often violated in practice: boundaries can be fuzzy, occluded, or defined in terms other than those assumed. These difficulties often occur in medical imagery where lung nodules are attached, or in close proximity, to separate anatomical objects with locally similar appearance, e.g., juxtapleural or juxtavascular lung nodules. The result of faulty assumptions is that the zero level set fails to evolve towards the desired boundary.
In typical level set image segmentation approaches, statistical information garnered from a dataset of training examples is not used. A few works have explored this avenue, mostly by attempting to enforce the evolving function to conform to more statistically likely shapes or to enclose more image regions with more statistically like appearance. The first work to introduce such an approach is due to Leventon [8], where principal component analysis (PCA) was used on a dataset of training images and shapes to create Bayesian prior models in order to penalize segmentations that deviate from those with expected shape and expected image values near the segmentation surfaces. Tsai [9] took a similar approach by applying PCA to a training set of signed distance representation of shapes. More recent work has formulated the level set evolution as an energy functional minimization problem, where statistical information about shape and image features is incorporated by viewing the energy functional as the negative log of some probability density [10]. The probability density of image and shape features is modeled by employing, for example, Gaussian kernel density estimation [11].
The work closest in spirit to our approach is not level set based, but rather Van Ginneken’s [12] machine learning extension of the standard region growing method. Traditionally, the region growing method recursively adds points to a growing region via a fixed rule (e.g., based on an image value threshold); however, Van Ginneken allows this rule to be learned from data, viewing the choice of whether or not to add a point as a binary classification problem solvable via machine learning methods.
2.2 Lung Nodule Image Segmentation
Lung nodule image segmentation refers to the algorithmic delineation of the boundaries of objects called lung nodules, which are focal abnormalities of the lung, often appearing as dense regions relative to their surroundings [2]. Lung nodule image segmentation is difficult because of the variation in nodule geometry, variation in nodule interior and edgedensities, and nonnodule anatomical structures in close proximity to, or occluding, a nodule, which often have similar density to the nodule. Most lung nodule segmentation work, including our own, leverages the publicly available LIDC dataset [13] of lung CT data and radiologist annotations. The LIDC dataset contains 1018 lung CT scans that have been annotated by four radiologists (see Fig. 2 for an example). Each radiologist visually examined each scan, and upon detection of a lung nodule within a scan, drew the boundary of the lung nodule in each slice for which the detected nodule was present (according to that specific radiologist). These “groundtruth” nodule boundary annotations, along with CT image volume data, are available in the LIDC dataset.
An assortment of methods have been applied to the lung nodule segmentation problem. The recent work by Wang [14] constructed a table of works, including their own, that reported the Jaccard overlap score. The Jaccard score is also referred to as the “intersection over union” score and is a measure of segmentation quality. It is defined as the size of the intersection between the algorithmic segmentation and the groundtruth segmentation divided by the size of the union, which is one for perfect overlap and zero at worst when there is no overlap. Because we have also used the Jaccard overlap score as the measure of segmentation quality in our work, we have included and expanded this list of works in Table 1, where we have placed our work in context for comparison with these other methods: Tachibana [15] combined a variety of image processing techniques such as thresholding, templatematching, and the watershed method (an edgebased method for determining boundaries) and obtained an average Jaccard overlap score of 0.5070 on 23 nodules. Wang [16] used a dynamic programming approach and fusion method for combining information from multiple twodimensional image slices. They reported an average Jaccard overlap score of 0.58 on 64 nodules. Messay [17, 18] in their first work, applied a variety of morphological operations with a subsequent “rulebased analysis”, obtaining average overlap of 0.63 over 68 nodules, whereas in their followup work, they used a calibration process over training data to predict various thresholding and morphological parameters based on features computed from the image, improving their results to an average of 0.7170 over 66 nodules. Both Kubota [19] and Lassen [20] applied basic image processing techniques such as thresholding and morphological operations, as well as convexity information, achieving average Jaccard overlap scores of 0.69 and 0.52, respectively. Tan [21] used the watershed method, active contours, and Markovrandom fields, achieving an average overlap score of 0.65 over a dataset containing 23 nodules. The work by Wang [14] applied convolutional neural networks (CNNs) with a centrallyfocused maxpooling operation applied to 493 test nodules, obtaining an average overlap score of 0.7116.
Authors  Year  Number of Nodules  Jaccard overlap  
Training  Testing  
Tachibana [15]  2006    23  0.5070 (0.2190) 
Schildkraut [22]  2009    23  0.4790 
Wang [16]  2009  23  64  0.5800 
Messay [17]  2010    68  0.6300 (0.1600) 
Kubota [19]  2011    23  0.6900 (0.1800) 
Tan [21]  2013    23  0.6500 
Farag [23]  2013    334  N/A 
Lassen [20]  2015    19  0.5200 (0.0700) 
Messay [18]  2015  300  66  0.7170 (0.1989) 
Farhangi [24]  2017  488  54  0.5700 (0.1600) 
Wang [14] (Level set)  2017  350  493  0.4350 (0.0952) 
Wang [14] (CNN)  2017  350  493  0.7116 (0.1222) 
LSML method (our work)    672  112  0.7185 (0.1114) 
Relatively fewer works have applied the level set method for lung nodule image segmentation. Schildkraut [22] tested the level set method with energy terms, for “increasing contrast of the segmented region relative to its surroundings”, to 23 lung nodules in radiography images. They reported an average overlap using the Dice coefficient, , of 0.6477 on the 23 lung nodules. The Dice score is related to the Jaccard score by , and thus, the work by Schildkraut [22] reported an Jaccard overlap score of 0.4790. In the work of Tan [21], which we have mentioned previously, a level set formulation of the geometric active contours method was employed as a postprocessing step following an initial watershed segmentation, achieving an average Jaccard overlap score of 0.65. Farag [23] used a level set approach and incorporated an elliptical prior to aid in cases where lung nodules are in proximity to other anatomical objects. They reported a “success rate” (where success is determined by visual inspection of the resulting segmentation) of 94.61% on 334 lung nodules images from the LIDC dataset, but the Jaccard overlap score or similar measures of overlap are not reported. Farhangi [24] used the regionbased ChanVese [25] active contour model to partition nodule and nonnodule regions based on region imagehomogeneity using a level set formulation. In addition, a training set of nodule shapes was employed, and at each iteration during the level set evolution, the level set iterate was projected onto the linear span of the training shapes by solving a minimization problem that included a sparsityinducing term to force coefficients in the weighted sum to be sparse. They used 542 lung nodules from the LIDC dataset, achieving an average Jaccard overlap score of 0.57 over a 10fold crossvalidation procedure. For comparison against their convolutional network model, Wang [14] also applied a generic version (i.e., nonstatistical and without specific tailoring to the lung nodule problem domain) of the regionbased ChanVese level set model [25], obtaining an average overlap score of 0.4350 over the same 493 nodules on which they tested their network model. In our work, we obtain an average Jaccard overlap score of 0.7185 over a testing dataset of 112 nodules using our machine learning extension of the level set method described in Section 3.
3 Methods
We provide motivation for the LSML method in Section 3.1 and an algorithmic outline of the parameter tuning process in Section 3.2. In Section 3.4 we describe our initialization routine that yields a first guess of the segmentation given an image volume containing a lung nodule. In Section 3.5 we describe the features that are used as inputs to the regression models in the LSML method used in our experiments.
3.1 Motivation
Let’s suppose we find ourselves in the not unusual situation of having a dataset of pairs , where is an image and is a curve or surface annotating the boundary of the object to be segmented in . From the outset, it is not entirely apparent how to incorporate such labeled data to model the velocity term in Eq. (1); however, we leverage an observation from Breen and Whitaker [26]: if is the signed distance transform of the target boundary , then the zero set of converges to under the motion dictated by Eq. (1).
This result is intuitive because we’re dictating that the zero level set move towards the with speed equal to the distance from the boundary, where the choice of expansion or contraction is controlled by the sign in the signed distance. A bit more formally, this can be understood as follows: first, we define a success score of , with being the unit step function. is maximal when the zero level set of matches that of because otherwise the integral would include a negative part of . Next, we observe the evolution of the score by differentiating with respect to and plugging in Eq. (1): , which is stationary when the level set of matches the zero level set of , thus matching the target curve .
Our motivation thus far is circular: setting to the signed distance transform of the annotating curve assumes the solution! However, these observations strongly suggest an approximation scheme of the form , where is a machine learning regression model calibrated from image and shape data to approximate the signed distance values from the annotating curves provided by the dataset, thus guiding the level set evolution towards the correct segmentation in cases where the solution is unknown. This is the essence of the LSML approach.
3.2 Outline of the LSML method
First, Eq. (1) is discretized:
(2) 
where the discrete gradient norm term is approximated using the upwind scheme of Osher and Sethian [7]. Next, the velocity term is replaced with a iterationdependent approximation by a regression model , thus replacing Eq. (2) with
(3) 
The input to the regression model is a feature vector computed from image and shape data, thus , where . In other words, the feature vector, which is the input to the regression model , is a function of the image and the level set iterate, from which image and shape features can be extracted, respectively. As a simple example, consider a twocomponent feature map function that yields two features, , where is the unit step function. The first feature, which approximates the volume enclosed by the zero level set of , does not depend on the local position and is thus a global shape feature. The second feature depends on both the local spatial grid index and the image and is therefore a local image feature. Generally, the feature map function may comprise an assortment of combinations of local and global, shape and image features.
The goal of the regression model is to learn a mapping from the feature vector , which depends only on local and global image and shape information, to the signed distance value because, as discussed in Section 3.1, these values are known to guide the evolving zero level set towards the target shape. Because there is a regression model for each discrete timestep , these models are obtained sequentially in the training process. We outline this sequential procedure below, assuming a dataset of images and groundtruth boundaries:

Initialize each according to a computationally cheap scheme that yields a first guess of the segmentation for image . Set .

Calibrate parameters of regression model by least squares over all spatial coordinates and examples, i.e.,
where are the parameters of regression model and is the signed distance transform of the annotating curve . Note that we have suppressed the argument to : in full, is a function of , i.e., the feature vector for the current level set iterate and image for example .

Step each forward in time by the PDE discretization Eq. (3)

If metrics (e.g., intersection over union) observed over a separate validation set are not degrading, set and go to Step 2.
The training procedure outlined above yields a sequence of regression models that can be deployed on new, unseen images using the iteration in Eq. (3) with the regression model parameters fixed to those determined from the training procedures outlined above. We note that the procedure can be made much more efficient by considering spatial coordinates only in a narrow band about the zero level set, for which fast methods exist to extract [27].
3.3 Data preparation
We use the data provided in the LIDC dataset[13] for our experiments. The data undergoes a number of preprocessing steps, which we first describe briefly and then in more detail in the subsequent subsections. First, the data is analyzed to obtain only nodules where all four annotators agree upon the existence of a lung nodule at a particular location in the scan; this yields 896 nodules. Next, the radiologist annotation contours are converted to booleanvalued target volumes, and multiple annotations for the same nodule are consolidated into a single boolean volume. Afterward, for each of the 896 lung nodules selected, we standardize the image volumes containing the lung nodule as well as its associated groundtruth, booleanvalued volumes to have uniform voxel spacing of one millimeter because the CT scans in the LIDC data have been generated with different scanning devices and scanner parameters, resulting in different image volume resolutions. Lastly, we randomly partition this dataset of 896 lung nodules and respective groundtruth segmentations into subsets of size 672, 112, and 112 for training, validation, and testing, respectively.
3.3.1 Gathering nodules with four annotators
Images of the lung nodules in the CT scan volumes from the LIDC were annotated by up to four radiologists, but the physical lung nodules lack a universal identifier. By “annotation”, we mean the sequence (i.e., the sequence through the slices of a particular CT scan) planar curves describing the boundary of a particular lung nodule as determined by a single radiologist. Symbolically, a lung nodule annotation can be written , where is the curve describing the nodule boundary in imageslice of the CT volume. In our experiments, we only use lung nodules that have been annotated by all four radiologists, and thus, we begin by estimating when multiple annotations refer to the same physical nodule in an image. We accomplish this by first defining a distance function, , that describes the nearness of two nodule annotations, and . Next, for a given scan, we compute a distance matrix, , where the entry is (i.e., the distance from annotation to annotation ), thus providing the pairwise distances between all annotations in the scan. Two annotations are said to be adjacent when , where is a threshold parameter. The value of is initialized to be equal to the slice thickness (in millimeters) of the scan, which is a parameter that can be found in the DICOM image data in the LIDC dataset. Nodule annotations are said to refer to the same physical nodule in the scan when they belong to the same connected component of the adjacency graph, which is formed by thresholding the pairwise distance matrix by . If afterwards there are annotation groupings with size greater than four nodules (i.e., greater than the number of annotating radiologists), we reduce the threshold parameter by a multiplicative factor, and we repeat the process. In our work, we find that the distance function, , between two annotations that takes the minimum over all pairwise 3D distances between between the coordinates of the two annotations works well, which is confirmed visually. From this process, we obtain 896 lung nodules, each having annotations by exactly four radiologists. This approach for clustering annotations is implemented in the pylidc^{1}^{1}1https://github.com/notmatthancock/pylidc Python package.
3.3.2 Volume interpolation
The pixel and slice spacing (i.e., the within and betweenslice scan resolutions, respectively) varies from among scan in the LIDC dataset, and to normalize this, we construct bounding boxes about each nodule, which we then interpolate to have uniform voxel spacing across all scans. For each lung nodule, of the 896 obtained with four annotations (discussed in the previous section), a common reference frame (i.e., a bounding box in the image volume) for the four associated annotations is formed. In this common reference frame, we convert each annotation into a booleanvalued volume that is one inside each annotating contour and zero outside, using the raycasting method implemented in the matplotlib [28] Python package. Next, we perform a trilinear interpolation on both the image and the four associated booleanvalued volumes so that the resulting volumes have equal, onemillimeter spacing between voxels. The volumes are interpolated so that each volume is 70 cubic millimeters, which was chosen to account for the nodule with largest observed diameter of 60 millimeters and to leave sufficient padding of nonnodule voxels about every nodule. Thus, the interpolated volumes are of dimensions, .
3.3.3 Consolidation of multiple ground truths and final preprocessing steps
For each nodule, we consolidate the four groundtruth segmentations (i.e., the booleanvalued indicator volumes of the lung nodule), , into a single a single groundtruth segmentation, , by computing, where is the Jaccard overlap function. The quantity is sometimes referred to as the “Jaccard median” and is the best consolidation of the four annotations in the sense that it agrees most with all four annotations under the Jaccard overlap measure. Although we use the computed Jaccard median for our experiments, we observe that the typicallyused 50% consensus consolidation (where the combined segmentation is equal to one where 50% or greater agreement occurs in the nodule’s segmentations and is zero otherwise) is often in high agreement with the Jaccard median. For example, in our case, on average, where denotes the 50% consolidation. This, combined with its simplicity (whereas the Jaccard median requires numerically solving a constrained optimization problem), justifies the use of the 50% consensus consolidation method that is often used in other works that use the LIDC dataset.
As the final preprocessing step, we standardize each image volume by subtracting off each respective mean and dividing by each respective standard deviation. This process results in datasets, , of image and segmentation pairs, where for the training dataset and for the validation and testing datasets.
3.4 Initialization
In devising an initialization procedure, we use the observation that lung nodules are often, but not always, dense (and thus appearing relatively brighter than their surrounding in CT image slices), and approximately spherical in shape. These assumptions about the shape and appearance of the nodules are frequently violated, but nevertheless, the goal in initialization is to only to provide reasonable guess and to allow the LSML algorithm to improve upon it.
The initialization process uses local thresholding, connected component analysis, and a “radius trimming” postprocessing technique. An example is shown in Fig. 3. Local thresholding of the image yields regions in the image that have similar image intensity values. Connected component analysis removes extraneous binary components obtained by thresholding, except the component that is closest to the seed point. Finally, the radius trimming technique serves to remove any parts of the binary component that extend beyond a specified distance from the center of the volume. This initialization process involves two free parameters, a smoothing factor and a radius percentile value , that are calibrated by performing a grid search over training data.
In more detail, the procedure begins by convolving a Gaussian smoothing kernel (with parameter ) with the image, and the image is thresholded by sending values that are larger than the smoothed value to one and those that are below to zero. In other words, we transform the image to a booleanvalued image by setting those voxels to one that are greater than the average value in the neighborhood of surrounding voxels (where the neighborhood size is implied by the smoothing factor ). Afterwards, we determine the connected component of this binary image that is closest to the center point of the volume. All pixel values in the binary image that are not in this connected component are set to zero.
Next, in spherical coordinates with the center of the image as the origin, we sample azimuth and zenith angles uniformly and determine the corresponding radius for given azimuth and zenith angles, defined as the distance until the first voxel where a ray emanating from the center of the volume meets a value of zero in the boolean volume. After sampling many azimuth and zenith angles, we compute the percentile of radii observed. Any ray with radius extending beyond the percentile is trimmed (see Figure 2(b), left in blue), i.e., the thresholded image is set to zero outside a sphere centered in the image volume with radius corresponding to the percentile of the radii observed. If after trimming the radii, there are multiple connected components, we choose the one closest to the seed point and set the others to zero. The initial values of are set to where the final boolean initialization is and where the boolean initialization is .
The free parameters and are determined through a grid search procedure over the training data. Specifically, we search over the parameter values, and . For each parameter combination in the Cartesian product of these sets of values, we compare this segmentation from our initialization procedure against the ground truth segmentation under the Jaccard overlap score. The parameter combination with the highest average overlap score over the training data is used, which we determine to be and .
3.5 Features used
Feature  Local  Global 

Volume  ✓  
Surface area  ✓  
Isoperimetric ratio  ✓  
Moments of , order = 1,2  ✓  
Distance from to center of mass  ✓  ✓ 
Image average in  ✓  
Image variability in  ✓  
Image value at  ✓  
Image edge at  ✓ 
The first feature set that we apply to the lung nodule image segmentation problem, which we call Feature Map 1, is enumerated in Tab. 2. It consists of a number of simple and generic, global and local, shape and image features. These features serve as a baseline against the next feature set that extends Feature Map 1. All image features are computed at two scales, a fine scale () and a coarse scale (), so that Feature Map 1 comprises a total of 18 features.
Feature  Local  Global 

All Feature Map 1 features (see Tab. 2)  ✓  ✓ 
Image average over  ✓  
Distance to center of mass average  ✓  
Distance to center of mass variability  ✓  
Distance to center of mass maximum  ✓  
Slice areas  ✓  ✓ 
Slice areas absolute change  ✓  ✓ 
Image samples along normal  ✓  ✓ 
Image samples along ray to center of mass  ✓  ✓ 
The second feature map, Feature Map 2, is enumerated in Tab. 3. It extends Feature Map 1 by including additional local image and shape features. Many of the features in Feature Map 2 include both local and global aspects in their computation. These additional globallocal (or ‘glocal’) features are local in the sense that they require the local voxel coordinate , but use previously computed global features such the “center of mass” in their computation. Feature Map 2 includes a total of 109 features, where all image features are computed at both fine and coarse scales (). The “distance to center of mass” statistics features in Feature Map 2 (i.e., average, variability, and maximum) supplement the “distance to center of mass” feature from Feature Map 1 with more global context. The “slice areas” and ‘slice areas absolute change” features are computed by calculating the areas of each slice through in the three axes directions. These provide a more localized version of the purely global volume feature in Feature Map 1, and in addition, they attempt to help enforce a slicetoslice continuity.
The “image samples along normal” and “image samples along ray to center of mass” features, illustrated in Fig. 4, provide context of the image along two lines emanating from a given coordinate . Image values are sampled along two lines: (1) in the direction of the unit normal through , and (2) in the direction of the line segment connecting the center of mass and . For each of these lines, 10 samples are taken in the direction inward and outward from .
4 Results
Feature Map 1  45  0.6951 (0.1119) 
Feature Map 2  47  0.7185 (0.1114) 
In Tab. 4 we report the average Jaccard overlap scores over the testing set at the optimal iteration from the validation dataset. In Fig. 4(a), we plot the average Jaccard overlap score over the testing dataset against the iteration number for Feature Map 1 (shown in solid black) and Feature Map 2 (shown in dashed black). At initialization, the average Jaccard overlap score is 0.6484 (0.1119). Using Feature Map 1, the LSML algorithm terminates after 45 iterations, obtaining a final average Jaccard overlap score of 0.6951 (0.1127), an increase of 7.2% relative to the average overlap score at initialization, whereas with Feature Map 2, the algorithm terminates after 47 iterations, and obtains a final average overlap score of 0.7185 (0.1114), a 10.8% relative increase over the average score at initialization. Feature Map 2 produces a 3.4% relative increase in overlap score over the overlap score produced under Feature Map 1, which is a modest, but significant () increase, indicating the usefulness of the ‘glocal’ features that are included in Feature Map 2. We also observe from Figure 4(a) that the score increases more rapidly during the earlier iterations than the latter, indicating that the more substantial changes in the segmentation shape (in the sense of those that produce relatively larger increases in the overlap score) occur in the early iterations, whereas the latter iterations serve in making only small refinements.
In Figure 4(b), the distributions of overlap scores for the Feature Maps 1 and 2 are given on the left and right, respectively. The median overlap score for Feature Map 1 is 0.7115 (above the mean overlap of 0.6951), and the 25th percentile to the 75th percentile stretches from 0.6596 to 0.7650, having a range of 0.1054. For Feature Map 2, the median overlap score of 0.7356 (above the mean overlap of 0.7185) with the 25th percentile to the 75th percentile ranging from 0.6793 to 0.7918, having a range of 0.1125. The large range in overlap score indicates the difficulty of the lung nodule segmentation problem compared to the synthetic dataset examples discussed in previous sections.
The maximal overlap scores observed for Feature Maps 1 and 2 are 0.8593 and 0.8906, respectively. The lung nodule with highest overlap score under Feature Map 2 is shown in Figure 6. This nodule achieves the second highest overlap score (of 0.8573) under Feature Map 1 and is an isolated nodule with a welldefined boundary. Both feature maps yield segmentations that capture the true boundary very well, with the most notable difference being in the 32nd slice, where the true boundary includes slightly more of the region of the nodule that is in close proximity to the vasculature in the posterior direction (i.e., moving downward from the lung nodule in the image).
Two outliers are observed (see Figure 4(b)) under both feature maps, the lowest overlap score having an overlap score of 0.0582 and 0.0732, for Feature Maps 1 and 2, respectively, which resulted from a poorly initialized segmentation of a juxtapleural lung nodule in a region near the bottom of the lung with the nearby lung wall and organs having image values very close to those of the nodule, as can be seen in Figure 7. This case is an outlier, and there are many juxtapleural nodule cases (as well as other nodule anatomical location and density categories) where substantial improvements are observed from Feature Map 1 to Feature Map 2.
In Figure 8, we show the center (35th) slice through the image volume of each of the 112 lung nodules in the testing dataset, where the red curve represents the contour given by the slice through the groundtruth segmentation surface and the blue curve represents the contour given by the slice through the approximate segmentation surface given by the zero level set in the LSML method using Feature Map 2. The LSML method performs well in a variety of contexts, including many juxtapleural nodules (e.g., row five, column two; or, row three, column eight), nodules with cavities (e.g., row one, column eleven; or, row seven, column ten), nonsolid nodules (e.g., row four, column nine), irregularlyshaped nodules (e.g., row nine, column three), spiculated nodules (e.g., row ten, column five; or, row three, column nine), as well as the other nodule types shown. These visual results, together with the quantitative results discussed previously, demonstrate the effectiveness of the LSML method applied to the lung nodule image segmentation problem in CT image volumes.
5 Conclusions
Lung nodule segmentation is a core component of the lung CAD pipeline, and accurate nodule segmentation poses unique challenges. The LSML method, a natural and direct machine learning extension of the level set image segmentation method, achieves an average Jaccard overlap of 0.7185 (0.1127), which is comparable to the current stateoftheart for lung nodule image segmentation. The mortality rate of lung cancer is large [29], and lung CAD methods, if robustly validated in clinical settings, carry the potential to aid physicians towards increased the survival rate of those afflicted. Accurate lung nodule segmentation is work towards this goal.
Disclosures
No conflicts of interest, financial or otherwise, are declared by the authors.
References
 [1] R. L. Siegel, K. D. Miller, and A. Jemal, “Cancer statistics, 2017,” CA: A Cancer Journal for Clinicians 67, 7–30 (2017).
 [2] M. F. McNittGray, S. G. Armato, C. R. Meyer, et al., “The Lung Image Database Consortium (LIDC) Data Collection Process for Nodule Detection and Annotation,” Academic Radiology 14, 1464–1474 (2007).
 [3] J. Erasmus, J. Connolly, H. McAdams, et al., “Solitary pulmonary nodules: part I. Morphologic evaluation for differentiation of benign and malignant lesions,” Radiographics (2000).
 [4] C. I. Henschke, D. F. Yankelevitz, D. M. Libby, et al., “Survival of Patients with Stage I Lung Cancer Detected on CT Screening,” New England Journal of Medicine 355(17) (2006).
 [5] R. Malladi and J. Sethian, “Shape modeling with front propagation: A level set approach,” IEEE Transactions on Pattern Analysis and Machine Intelligence 17(2) (1995).
 [6] M. Kass, A. Witken, and D. Terzopoulos, “Snakes: active contour models,” International Journal of Computer Vision 1(4), 321–331 (1988).
 [7] S. Osher and J. A. Sethian, “Fronts propagating with curvaturedependent speed: algorithms based on HamiltonJacobi formulations,” Journal of computational physics 79(1), 12–49 (1988).
 [8] M. E. Leventon, O. Faugeras, W. E. L. Grimson, et al., “Level set based segmentation with intensity and curvature priors,” in Mathematical Methods in Biomedical Image Analysis, 2000. Proceedings. IEEE Workshop on, 4–11, IEEE (2000).
 [9] A. Tsai, A. Yezzi, W. Wells, et al., “A shapebased approach to the segmentation of medical imagery using level sets,” IEEE Transactions on Medical Imaging 22, 137–154 (2003).
 [10] D. Cremers, M. Rousson, and R. Deriche, “A Review of Statistical Approaches to Level Set Segmentation: Integrating Color, Texture, Motion and Shape,” International Journal of Computer Vision 72, 195–215 (2007).
 [11] D. Cremers and M. Rousson, “Efficient kernel density estimation of shape and intensity priors for level set segmentation,” in Deformable Models, 447–460, Springer (2007).
 [12] B. Van Ginneken, “Supervised probabilistic segmentation of pulmonary nodules in CT scans,” in Medical Image Computing and ComputerAssisted Intervention–MICCAI 2006, 912–919, Springer (2006).
 [13] S. G. Armato III, G. McLennan, L. Bidaut, et al., “The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans,” Medical physics 38(2), 915–931 (2011).
 [14] S. Wang, M. Zhou, Z. Liu, et al., “Central focused convolutional neural networks: Developing a datadriven model for lung nodule segmentation,” Medical Image Analysis 40, 172–183 (2017).
 [15] R. Tachibana and S. Kido, “Automatic segmentation of pulmonary nodules on CT images by use of NCI lung image database consortium,” (2006).
 [16] Q. Wang, E. Song, R. Jin, et al., “Segmentation of Lung Nodules in Computed Tomography Images Using Dynamic Programming and Multidirection Fusion Techniques1,” Academic Radiology 16, 678–688 (2009).
 [17] T. Messay, R. C. Hardie, and S. K. Rogers, “A new computationally efficient CAD system for pulmonary nodule detection in CT imagery,” Medical Image Analysis 14, 390–406 (2010).
 [18] T. Messay, R. C. Hardie, and T. R. Tuinstra, “Segmentation of pulmonary nodules in computed tomography using a regression neural network approach and its application to the Lung Image Database Consortium and Image Database Resource Initiative dataset,” Medical Image Analysis 22, 48–62 (2015).
 [19] T. Kubota, A. K. Jerebko, M. Dewan, et al., “Segmentation of pulmonary nodules of various densities with morphological approaches and convexity models,” Medical Image Analysis 15, 133–154 (2011).
 [20] B. C. Lassen, C. Jacobs, J.M. Kuhnigk, et al., “Robust semiautomatic segmentation of pulmonary subsolid nodules in chest computed tomography scans,” Physics in Medicine and Biology 60, 1307–1323 (2015).
 [21] Y. Tan, L. H. Schwartz, and B. Zhao, “Segmentation of lung lesions on CT scans using watershed, active contours, and Markov random field,” Medical physics 40(4), 043502 (2013).
 [22] J. S. Schildkraut, S. Chen, M. Heath, et al., “Levelset segmentation of pulmonary nodules in radiographs using a CT prior,” Proc. SPIE Med. Imag., Image Process. 7259, 72593B–1 (2009).
 [23] A. A. Farag, H. E. A. El Munim, J. H. Graham, et al., “A Novel Approach for Lung Nodules Segmentation in Chest CT Using Level Sets,” IEEE Transactions on Image Processing 22, 5202–5213 (2013).
 [24] M. M. Farhangi, H. Frigui, A. Seow, et al., “3d Active Contour Segmentation Based on Sparse Linear Combination of Training Shapes (SCoTS),” IEEE Transactions on Medical Imaging , 1–1 (2017).
 [25] T. F. Chan and L. A. Vese, “Active contours without edges,” Image processing, IEEE transactions on 10(2), 266–277 (2001).
 [26] D. E. Breen and R. T. Whitaker, “A levelset approach for the metamorphosis of solid models,” IEEE Transactions on Visualization and Computer Graphics 7(2), 173–192 (2001).
 [27] D. Adalsteinsson and J. A. Sethian, “A fast level set method for propagating interfaces,” Journal of Computational Physics 118(2), 269–277 (1995).
 [28] J. D. Hunter, “Matplotlib: A 2d graphics environment,” Computing In Science & Engineering 9(3), 90–95 (2007).
 [29] “American cancer society cancer facts and figures 2017.” https://www.cancer.org/research/cancerfactsstatistics/allcancerfactsfigures/cancerfactsfigures2017.html. Accessed: Jul. 2017.