Improved Automated Crater Detection



Crater cataloging is an important yet time–consuming part of geological mapping. We present an automated Crater Detection Algorithm (CDA) that is competitive with expert–human researchers and hundreds of times faster.

The CDA uses multiple neural networks to process digital terrain model and thermal infra–red imagery to identify and locate craters across the surface of Mars. We use additional post-processing filters to refine and remove potential false crater detections, improving our precision and recall by 10% compared to Lee (2019). We now find 80% of known craters above 3km in diameter, and identify 7,000 potentially new craters (13% of the identified craters). The median differences between our catalog and other independent catalogs are 2–4% in location and diameter, in line with other inter–catalog comparisons.

The CDA has been used to process global terrain maps and infra–red imagery for Mars, and the software and generated global catalog are available at


Automated crater detection with human level performance

Christopher Lee (ORCID: 0000-0003-0029-5278)
Contributions: conceptualization of the study, methodology, original software, writing

James Hogan (ORCID: 0000-0003-4694-5176)
Contributions: algorithm development, optimization, methods section

Keywords: Deep Learning, Crater Detection Algorithms, Mars

1 Introduction

Crater populations and distributions provide critical information on the history and evolution of Solar system bodies. Regional variation in crater distribution can be used to constrain stratigraphy and geologic processes (e.g., Cintala et al., 1976; Barlow and Perez, 2003), and populations are used in estimating the age of surface features (e.g., Arvidson, 1974; Soderblom et al., 1974; Kinczyk et al., 2020; Palucis et al., 2020).

Crater cataloging has historically required manually measuring the size and location of craters in images of the surface. Early work on crater mapping required manual annotation of printed maps (Barlow, 1988). Advances in imagery products and computational tools allowed more detailed work using digital mapping (Robbins and Hynek, 2012b) and computer aided detection of craters using Crater Detection Algorithms (CDAs, e.g., Stepinski et al. (2009), Di et al. (2014), Pedrosa et al. (2017)). These early CDAs used manually designed rules to identify craters, but recent advances in Deep Learning (Goodfellow et al., 2016) permit algorithms that learn rules from the data itself and a ground–truth catalog of known craters. These neural network based CDAs have shown performance approaching expert human level (Silburt et al., 2019; Lee, 2019), but have fallen short of matching or exceeding expert performance when compared against independent expert generated catalogs (e.g., Salamunićcar et al., 2012).

In this work, we develop a CDA that is capable of finding near–circular craters from orbital imagery and terrain data with expert human level performance while being hundreds of times faster. We improve on the Lee (2019) CDA using additional datasets and post–processing to remove false positives and less confident crater detections to find almost 60,000 craters from 3km to 450km in diameter. The CDA workflow described in the next section can be used on imagery data of the same type with little or no training, and can be trained to work on new data from visible and ultra–violet wavelengths. New post–processing stages merge and de–duplicate global catalogs generated from independent sources to improve the accuracy and completeness of the final catalogs.

In this paper we describe the structure and workflow of the algorithm in section 2, including the crater detection neural networks and the post-processing filters that improve the precision of final catalogs. In section 3 we use this CDA to process DTM and Infra–red imagery data from Mars to generate a new catalog of craters above 3km in diameter, and we compare this new catalog to two existing crater catalogs for Mars (Robbins and Hynek, 2012a; Salamunićcar et al., 2012) to show that the CDA catalog is statistically competitive with the human generated catalogs. Finally, in section 4, we provide our conclusions and possible future steps to improve the CDA and generated catalogs.

1.1 Previous Work

The Lee (2019) CDA (hereafter L19) used the DeepMoon CDA developed by Silburt et al. (2019) to find craters in a digital terrain model (DTM) of Mars. This CDA worked by finding the rims of craters based on altitude, and learned by comparing images to the location of known craters from a ground–truth catalog (Robbins and Hynek, 2012b). The CDA found almost 55,000 craters larger than 4km in diameter distributed across the planet, with a precision and recall of 75%. Approximately 13,000 ‘new’ craters were proposed by the CDA and a similar number of known craters were missing from the CDA catalog. A large fraction of these ‘new’ crater candidates are anomalous detections of small craters in the DTM data, with some larger genuine features from canyon rims and volcanic paterae. Under similar conditions, the Salamunićcar et al. (2012) catalog lists almost 72% of the same craters from Robbins and Hynek (2012b) but suggests fewer than 5,000 ‘new’ craters.

Other automatically–generated crater catalogs exist for Mars. Stepinski and Urbach (2009) developed a catalog using an automated CDA (following Bue and Stepinski (2006)) in combination with statistical methods to combine disparate datasets into a larger catalog. Di et al. (2014) developed an automated CDA based on correlation methods, but only tested the algorithm on a small region of the planet. In both examples, the reported precision fell much lower than either L19 or Salamunićcar et al. (2012) in comparison to Robbins and Hynek (2012b), but the limited area tested makes comparison difficult (Lee (2019) includes a quantitative comparison of these limited area tests). Similar work has been performed for Lunar craters (e.g., Zuo et al., 2016) and for the Earth (e.g., Krøgli and Dypvik, 2010).

Other groups have also created neural network based CDAs. Silburt et al. (2019) developed the UNET (Ronneberger et al., 2015) based DeepMoon CDA to detect craters on the Moon, which was subsequently modified in Lee (2019). Wronkiewicz et al. (2018) used the YOLO (Redmon et al., 2016) neural network for Mars craters, and the results from this CDA in Jezero crater are included in the JMars package. Lee (2018) used a MaskRCNN (He et al., 2017) algorithm to find circular and elliptical craters for Mars and Pluto, and Ali-Dib et al. (2019) used a similar network to find elliptical craters on the Moon.

Each of these CDAs identifies and locates craters much faster than human cataloging efforts. For example, the L19 CDA processes a global Mars DTM down to 4km crater diameters in 24 hours on a modest workstation, using 150,000 images to identify 55,000 craters. However, since the CDA catalog contains a large number of new and missed crater detections, additional work is required to filter out spurious craters and complete the catalog.

The CDA developed in this work improves upon L19 by reducing the number of missed craters by one–fifth and halving the number of mistakenly identified craters (compared to Salamunićcar et al., 2012).

2 Methods and Data

The Crater Detection Algorithm (CDA) automates the process of finding and characterizing crater-like features on the surface of a rocky planet. This involves using multiple processing stages to identify crater candidates, confirm crater identifications, and finally output a crater catalog based on the information collected in the CDA. The workflow of the CDA, as used in this work, is shown in Figure 1.

In stage 1 of the CDA, we use a ResUNET(Zhang et al., 2018) neural network to find circular crater features in global imagery or terrain model datasets (section 2.2). For memory and performance considerations, we split the global dataset into smaller images sampled across the planet at various resolutions, and process each image using the ResUNET neural network to identify circular features.

In the workflow shown in Figure 1, we use two datasets in the stage 1 of the CDA (section 2.2). A global daytime Infra-Red (IR) map produced from observations by the Thermal Emission Imaging System instrument (THEMIS) onboard Mars Odyssey (Edwards et al., 2011), and the combined Digital Terrain Model (DTM) dataset generated from the Mars Orbiter Laser Altimeter (MOLA) and the High Resolution Stereo Camera (HRSC) (Fergason et al., 2018). Each global map is processed independently using a ResUNET trained to process that data, resulting in two crater catalogs. We then merge the catalogs generated by the ResUNETs into a single catalog, and duplicate craters are counted and merged. In the process of merging, we calculate the average crater location and size (and associated uncertainties) from the various detections of each crater and record the number of times each ResUNET found a crater.
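The swatch sampling that feeds the stage 1 networks can be sketched as follows. This is an illustrative, non-overlapping tiling in Python; the actual CDA samples a much denser, overlapping set of swatches over many scales (section 2.1), so the function name and sampling scheme here are assumptions for illustration only.

```python
def tile_centers(tile_deg):
    """Centers (lon, lat) of non-overlapping equirectangular tiles of
    width tile_deg degrees covering the whole globe."""
    n_lon = int(360 // tile_deg)
    n_lat = int(180 // tile_deg)
    return [(-180 + tile_deg * (i + 0.5), -90 + tile_deg * (j + 0.5))
            for i in range(n_lon) for j in range(n_lat)]
```

For example, 30 degree tiles give 12 x 6 = 72 swatch centers; the smaller scales used in the paper produce far more images.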

In stage 2 of the CDA (section 2.3), we use a classifier neural network to calculate a ‘crater confidence’ value for each crater in the catalog. This network uses both the optical and terrain data to determine whether a crater is well-resolved in the dataset. The probabilities generated by this CDA are stored for later use but not used for filtering at this stage.

In stage 3 of the CDA (section 2.4), we use the results from stage 1 and stage 2 to filter out false crater identifications and further refine the catalog, using a Gradient Boosting algorithm (XGBoost, Pedregosa et al., 2011) to filter out the false–positive identifications by considering diameter and duplicate count (from stage 1), and classification probability (from stage 2). At the end of this stage, the crater catalog contains the location, diameter, and associated uncertainties for each detected crater. The catalog contains sufficient information to plot the craters using the JMars mapping tool, which already includes various human and computer generated crater catalogs for Mars.

Figure 1: Workflow of the CDA. In stage 1 the ResUNETs convert DEM or IR images into synthetic images of the crater rims, followed by the template matching to determine the size and location of the craters. The crater datasets are then merged, removing any duplicates and counting the fraction of craters from each source. In stage 2 the global crater dataset is used to generate images of single craters, centred in the images and the post–processing network is used to provide confidence values. In stage 3 the Gradient Boosting filter is used to remove the weaker crater predictions.

2.1 Data

The neural networks described in this section are flexible and can be trained to use different datasets that contain appropriate images of craters. In this work we use the DTM dataset from HRSC/MOLA (Fergason et al., 2018) and the daytime infra–red (IR) imagery data from THEMIS (Edwards et al., 2011). The DTM product is provided as a single file covering the globe on an equirectangular grid with a stated scale of 200m/pixel. The daytime IR data is provided as a single equirectangular grid with a stated resolution of 100m/pixel. Both datasets are created by the satellite mission teams and registered to the Mars datum. We make no attempt to correct misalignments in these datasets. While we have used DTM and IR data in this work, it is possible to replace either dataset by (e.g.) visible imagery or add additional datasets to the workflow, provided craters can be identified in the images.

When finding the location and size of the craters, the neural networks and template matching code report locations in pixels using floating point numbers, providing a numerical uncertainty of 0.5 pixels in each reported number. This uncertainty is smaller than the uncertainty in defining a crater ‘rim’ in the images, and is comparable to the median difference between the CDA and the ground–truth dataset (5% uncertainty at most, compared to a median difference of 4% from Table 1).

In typical images, we find craters between 10 and 40 pixels in diameter at all locations. For the DTM data, this implies a lower diameter of km with location uncertainty of km, and an upper diameter of around km with a location uncertainty of km. For the IR data, the lower limits are half the equivalent values from the DTM because of the higher resolution of the IR data. The upper detectable crater size in IR data is lower than 500km because the larger craters are not clean features in the infra–red data.
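The pixel-to-kilometre conversion behind these limits follows directly from the map scales quoted in this section (200m/pixel for the DTM, 100m/pixel for the IR). A minimal sketch; the function is illustrative and not part of the CDA code:

```python
def crater_size_km(diameter_px, scale_m_per_px):
    """Convert a detected crater diameter in pixels to kilometres for a map
    at the given scale (e.g., 200 m/px for the DTM, 100 m/px for THEMIS IR)."""
    return diameter_px * scale_m_per_px / 1000.0
```

At the native DTM grid scale, the 10–40 pixel detection window corresponds to roughly 2–8km; because the swatches are resampled over a wide range of scales, the detectable diameter range in practice is much wider.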

For the results discussed in section 3, we found craters using the workflow by generating 272,370 image swatches at sizes from 1.5 degrees to 30 degrees in longitudinal extent covering the surface of Mars. We also generated an additional 1,292,726 images from the IR data only with image sizes from 0.5 to 1.3 degrees to exploit the higher resolution of the IR data. Results from these images are labelled as IRH in the following text.

2.2 Stage 1: ResUNET

The first stage of the CDA uses a residual neural network, or ResUNET (Zhang et al., 2018), to identify circular features in the input images that are highly correlated with desired features in the output images (i.e., crater rims). The ResUNET architecture is similar to the L19 network architecture (Ronneberger et al., 2015), but features additional internal connections that reduce training time and improve performance. Our ResUNET configuration is shown in Figure 2.

Figure 2: Our adaptation of the Residual UNET architecture using Keras (Chollet and Others, 2015). The layer types are shown in the figure legend. All convolution layers use a 2D convolution with a 3x3 kernel, include batch normalization, and use a rectified linear activation function. Residual connections carry the input to output change from each layer to the next layer, as in (Fergason et al., 2018).

The ResUNET uses an encoder branch (the descending branch on the left) and a decoder branch (the ascending branch on the right). The encoder branch takes the original image and compresses the information down to a series of feature locations (i.e., crater locations) at the ‘bottom’ of the network. The decoder branch then constructs a new image by combining the feature locations with the original image data, constructing an image with highlighted crater rims while removing noise from the images.

We train the ResUNET following the same methodology as in L19. We generate forty thousand images with random locations across the planet and with a log–normal distribution of image sizes. The images are orthographically projected and presented to the ResUNET along with a target image generated from a ground–truth crater catalog (Robbins and Hynek, 2012b). We train the network over many iterations until the performance of the network, measured using the Dice loss algorithm (Milletari et al., 2016), stops improving. We retain an additional 10,000 randomly generated images in a validation dataset to adjust external hyper parameters.
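The Dice loss used to monitor training measures the overlap between the predicted rim image and the target image. A minimal sketch over flattened masks (illustrative; the training code operates on batched image tensors):

```python
def dice_loss(pred, target, eps=1e-7):
    """Dice loss between a predicted rim mask and the target mask.
    Both are flat sequences of values in [0, 1]; 0 indicates a perfect match."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)
```

Identical masks give a loss near 0, while fully disjoint masks give a loss near 1.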

As in L19, the ResUNET outputs synthetic images with the crater rims marked (shown in Figure 1). We then convert each greyscale image to black and white, and use a circular template matching algorithm (van der Walt et al., 2014) to identify all circles with a radius between 5 and 40 pixels that match at least a fraction of a complete circle. This process finds all circles in the image, with some duplicates at consecutive crater sizes and locations (e.g., within a pixel or two in diameter or location). We remove these duplicate craters in the image by merging craters that are within a radius threshold (D_r) and location threshold (D_{xy}) of another crater, and choosing the crater properties that maximize the correlation with the target circle. That is, duplicates satisfy the following equations

\frac{(x_1 - x_2)^2 + (y_1 - y_2)^2}{\min(r_1, r_2)^2} < D_{xy}, \qquad \frac{|r_1 - r_2|}{\min(r_1, r_2)} < D_r,

where the positions and radii of the two candidate craters are given by (x_1, y_1, r_1) and (x_2, y_2, r_2). We determined the threshold values for this work by maximizing the recall of the ResUNET on the 10,000 image validation dataset.
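The duplicate criteria can be sketched directly in code. The default threshold values below are placeholders for illustration, not the tuned values from the validation experiment:

```python
def is_duplicate(c1, c2, d_xy=1.8, d_r=1.0):
    """Check whether two detected circles (x, y, r), in pixel coordinates,
    describe the same crater. Thresholds d_xy and d_r are tuned on a
    validation set in the paper; the defaults here are illustrative."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    # Normalized center separation and radius difference
    same_place = ((x1 - x2) ** 2 + (y1 - y2) ** 2) / min(r1, r2) ** 2 < d_xy
    same_size = abs(r1 - r2) / min(r1, r2) < d_r
    return same_place and same_size
```

Pairs that pass both tests are merged, keeping the candidate that best correlates with the target circle.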

The output from the ResUNET is a crater candidate list with no per–image duplicates but possible global duplicates. We do not remove the global duplicates at this stage, instead using them to refine the location of each crater and estimate uncertainties in the following stages.

After the ResUNET processing is complete for two or more input datasets, we merge the individual catalogs into a global catalog. We merge multiple identifications of the same crater (craters within 25% in size and relative position of each other) and count the number of times the crater is found in each dataset. We record the duplicate count, average location and diameter of each crater, and the standard deviation in location and size for each crater in the stage 1 catalog.
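The per-crater bookkeeping in this merge step can be sketched as follows, assuming detections of one crater have already been grouped; the grouping logic and the field names are illustrative, not the catalog's actual schema:

```python
import statistics
from collections import Counter

def merge_group(detections):
    """Collapse repeated detections of a single crater into one catalog entry.
    Each detection is (lon, lat, diameter_km, source); the entry records the
    mean and spread of location and size, plus per-source duplicate counts."""
    lons, lats, diams, sources = zip(*detections)
    return {
        "lon": statistics.mean(lons),
        "lat": statistics.mean(lats),
        "diameter": statistics.mean(diams),
        "lon_std": statistics.pstdev(lons),
        "lat_std": statistics.pstdev(lats),
        "diameter_std": statistics.pstdev(diams),
        "count": len(detections),
        "sources": dict(Counter(sources)),
    }
```

The duplicate count and per-source counts recorded here are exactly the quantities the later filtering stages consume.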

An alternative stage 1 network was tested where the two image datasets are used simultaneously in a two channel ResUNET. This network did not improve on L19, as the DTM and IR 1–channel ResUNETs perform well at different spatial scales, and thus are better combined after the initial crater detection. The final improvements we found (i.e., stage 2 and 3) were gained by removing false positives using the independent datasets. We also limited the input data to image scales where both datasets could contribute to avoid biasing the results towards one source of data. The individual ResUNET stages here can be trained to work on high resolution data for crater identification and other feature mapping (e.g., Palafox et al., 2017; Wronkiewicz et al., 2018; Lee, 2019).

2.3 Stage 2: Classifier

In stage 2, we use a classifier network to derive a ‘crater confidence’ estimate for all of the candidates in the stage 1 catalog. The classifier network has a similar structure to the encoder branch of the ResUNET in stage 1, but converts a 2 channel (IR and DTM) image into a single number representing the crater confidence value.

To process each crater candidate, the classifier network uses the location and diameter of each candidate to generate IR and DTM images centered on the crater candidate with a 20% border around the crater. The network then assigns a confidence value to the crater candidate using both the IR and DTM images together.

To train the classifier, we take 1,000 random crater samples and 1,000 random non–crater samples (that might contain a crater) and generate images containing each target crater centered in the image. We then train the network to assign high confidence values to the crater images and low confidence values to the non–crater images, without specifically training it to identify circles, edges, rims, or shadows. Once trained, we use the classifier network to process the entire stage 1 catalog and add the crater confidence value for each crater to the catalog. We do not remove craters at this stage.

2.4 Stage 3: Catalog Refinement

In the final stage of the CDA we filter craters from the catalog using the data collected in the previous stages. We train the filter by sampling 1,000 crater candidates from the CDA catalog and identify which craters match the ground–truth catalog. We then use a Gradient Boosting Classifier model (XGBoost, Pedregosa et al., 2011) to predict and remove false positive craters based on the initial 1,000 samples.

The XGBoost algorithm learns to map the detected features (diameter, DTM duplicate count, total duplicate count, likelihood) into a binary crater/non–crater value, and we remove all ‘non–craters’ from the final catalog. In typical usage, the XGBoost model favors high duplicate count, equal representation in IR and DTM images, and a high classifier confidence. These features are automatically learned but likely represent the idea that a real crater should be identifiable at many image scales in both IR and DTM images, and should look like other craters. We include the diameter of the crater in this model because very large and very small craters tend to have low duplicate counts and are not found in both IR and DTM data equally well. At the end of this stage, the crater catalog has been globally deduplicated, filtered for false positives, and contains crater location, size, and associated uncertainties.
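A synthetic sketch of this filter is shown below, substituting scikit-learn's GradientBoostingClassifier for XGBoost; the training data and the labelling rule are invented for illustration, and only the four-feature layout follows the text.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 400
# Feature layout from the paper: diameter, DTM duplicate count,
# total duplicate count, and stage 2 classifier confidence.
X = np.column_stack([
    rng.uniform(3, 100, n),    # crater diameter (km)
    rng.integers(0, 10, n),    # DTM duplicate count
    rng.integers(0, 20, n),    # total duplicate count
    rng.uniform(0, 1, n),      # stage 2 classifier confidence
])
# Invented label rule: 'real' craters are found many times with high confidence.
y = ((X[:, 2] > 8) & (X[:, 3] > 0.4)).astype(int)

model = GradientBoostingClassifier(n_estimators=50, max_depth=2).fit(X, y)
keep = model.predict(X).astype(bool)  # mask of candidates retained in the catalog
```

In the real workflow the labels come from the 1,000 hand-verified samples, and the learned rules then prune the full stage 1 catalog.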

2.5 Comparison with the L19 CDA

The L19 CDA is conceptually simpler than the CDA presented here, using the Silburt et al. (2019) UNET architecture to find crater candidates, followed by local and global deduplication. The additional work in this CDA brings in the IR dataset that better resolves the smaller craters (stage 1), includes additional verification of the crater candidates (stage 2), and refines the final catalog by considering the duplicate count and other metrics to produce a more accurate final catalog (stage 3).

The source DTM dataset is the same for this work and L19, and the new CDA also uses THEMIS daytime IR imagery (Edwards et al., 2011). As in L19, empirically obtained uncertainties are about 1 pixel in diameter and location, which scales with the pixel scale of the source image but is lower than 5% for almost all detectable craters.

In contrast to the L19 CDA, we adjusted the stage 1 network parameters to have higher overall recall and lower precision, while achieving a similar F1 score (Lee, 2019). This change allowed us to create a high recall catalog in stage 1 (i.e., find many craters), and remove additional false–positive detections using later post–processing stages.

The increase in the number of neural networks we use in this CDA does not increase training or processing time. The ResUNET architecture trains up to four times faster than the L19 UNET architecture, and improvements in the orthographic projection code (PROJ Contributors, 2018) and the template matching code reduce the image processing time by a factor of two. As in L19, the bottleneck in the processing speed is the data generation step, not the neural network steps. Computational limits on the consumer workstation used (64GB RAM, 8 core CPU, 1 NVidia 1080TI GPU) and the size of the input dataset (the DTM data is 6GB uncompressed, the IR data is 22GB uncompressed) limited the throughput of the CDA to around 2,000 images per minute. For the results discussed in the next section, the training and processing time is under 72 hours.

3 Results and Discussion

3.1 Results

The results presented in this section are from an experiment using the CDA to process the DTM and IR global maps. From these maps, we generated image swatches to systematically cover the planet at sizes from 1.5 degrees to 30 degrees, and additional IR images to cover the planet with image sizes from 0.5 to 1.3 degrees (the IRH dataset). None of these images were used in the training process, and to remove possible overfitting to the Robbins and Hynek (2012b) catalog, we compare our results to the Salamunićcar et al. (2012) catalog. For the imagery datasets we used in this work, this CDA produces a statistically complete catalog (Wang et al., 2020), and performs at expert level when compared to the independent Salamunićcar et al. (2012) catalog and the Robbins and Hynek (2012b) catalog.

Table 1 summarizes the global catalogs generated by this CDA compared against Salamunićcar et al. (2012) for craters larger than 3km in diameter. For each of the three input datasets (DTM, IR, and IRH) we processed the entire image set with the stage 1 ResUNET and post–processed using only stage 3. For the combined dataset, we processed all three datasets with the full CDA as described in the methods.

Smaller craters are far more numerous than larger craters, so the high–resolution IR dataset (IRH) finds almost as many craters as the lower resolution DTM and IR datasets, while only finding craters smaller than 16km in diameter. Each of the stage 1 ResUNETs performs comparably with L19 (measured by the F1 score(Lee, 2019)), but falls below the performance of Robbins and Hynek (2012b) compared against Salamunićcar et al. (2012). The combined catalog performs as well or better than Robbins and Hynek (2012b) against Salamunićcar et al. (2012), and slightly worse when compared directly to Robbins and Hynek (2012b). The apparent switch in the recall and precision values in the Robbins and Hynek (2012b) column is because Robbins and Hynek (2012b) has more craters than Salamunićcar et al. (2012), so the ‘precision’ (number of matching craters/ number of total candidates) is low and the ‘recall’ (number of matching craters/ number of real craters) is high. Reversing the comparison so that Robbins and Hynek (2012b) is the target catalog would produce an identical F1 score but with swapped precision and recall.
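The precision, recall, and F1 definitions used throughout this section can be written compactly; this helper is illustrative and not part of the CDA code:

```python
def catalog_metrics(n_detected, n_matched, n_target):
    """Precision, recall, and F1 score (in percent) for a generated catalog
    compared against a target catalog, as used in Table 1.
    n_detected: craters in the generated catalog
    n_matched:  craters matched between the two catalogs
    n_target:   craters in the target catalog"""
    precision = 100.0 * n_matched / n_detected
    recall = 100.0 * n_matched / n_target
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Applied to the Combined column of Table 1 (57,444 detected, 50,264 matched, against the 62,086 craters in S12), this returns values close to the rounded table entries; small differences may reflect details of the crater matching.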

                    Combined   DTM      IR       IRH      R12      S12      L19
Craters detected    57,444     52,653   49,100   45,727   79,528   62,086   58,144
Craters matched     50,264     43,786   44,735   38,840   57,905   57,915   44,407
Precision (%)       87         83       91       84       72       93       76
Recall (%)          80         70       72       62       93       72       71
F1 (%)              84         76       80       72       81       81       73
Latitude Error
Longitude Error
Diameter Error
Table 1: Comparison against the Salamunićcar et al. (2012) catalog. Craters detected and matched are given as numbers of craters. All other numbers are given in percentages, with errors quoted relative to the minimum diameter used in the comparison between the generated catalog and target catalog. The DTM, IR, and IRH columns show the results from processing only DTM, only IR above 1.3 degree image swatches, and only IR below 1.3 degree image swatches (‘high’ resolution). The Combined column shows all 3 datasets combined by the CDA. The R12 column compares the Robbins and Hynek (2012b) catalog against Salamunićcar et al. (2012). The S12 column compares Salamunićcar et al. (2012) against Robbins and Hynek (2012b), reversing the R12 comparison.

Figure 3 shows the precision, recall, and F1 scores for each of the columns in Table 1, grouped by the crater diameter. For each data point, we calculate the diagnostics considering all craters that lie within 20% of each marker. The drop–off in DEM recall at smaller diameters is the main reason the DEM catalog performs more poorly than the IR catalog overall — smaller craters are far more numerous and have a significant effect on these metrics. The drop–off in the IRH catalog with increasing diameter is worse than the IR catalog because of the lack of larger–scale images to contribute to the catalog.

Figure 3: Precision, Recall, and F1 score as a function of crater diameter for the individual CDAs for IRH, IR, DEM, and for the combined CDA. Each datapoint is calculated using craters with diameters within 20% of the marked datapoint, and all values are given in percentage units. The comparable test using Robbins and Hynek (2012b) is included for reference.

Figure 4 shows crater metrics recommended by Robbins et al. (2018a) for comparing crater catalogs. In particular, we use the smoothed ‘Size–Frequency Distribution’ (SFD) algorithm to calculate the distribution of crater sizes instead of the histogram method used in earlier work (Arvidson, 1974). In contrast to the recall and precision data in Figure 3 and Table 1, the SFD calculation only considers crater numbers (per diameter) and does not match craters between catalogs first, so two catalogs with similar SFDs only match in statistical population count, not in specific crater identifications. Figure 4 shows three versions of the SFD following Robbins et al. (2018a): the cumulative SFD (CSFD, similar to a raw crater count), the relative SFD (RSFD, similar to the metric suggested by the Crater Analysis and Techniques Working Group (Arvidson, 1974)), and the ratio of incremental SFDs (ISFD) to compare the relative crater counts between catalogs. In this figure, the ISFDs are compared to the Robbins and Hynek (2012b) catalog.
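One way to realise a kernel-smoothed cumulative SFD is sketched below. This is our interpretation of a Gaussian kernel with bandwidth 0.1D, not the exact estimator of Robbins et al. (2018a):

```python
import math

def smoothed_csfd(diameters, d_eval, bandwidth=0.1):
    """Kernel-smoothed cumulative SFD: the expected number of craters with
    diameter >= d_eval, with each crater smeared by a Gaussian kernel of
    width bandwidth * D. Illustrative sketch only; area normalization and
    the exact kernel choice follow Robbins et al. (2018a) in the paper."""
    count = 0.0
    for d in diameters:
        sigma = bandwidth * d
        # Fraction of this crater's kernel mass lying above d_eval
        count += 0.5 * math.erfc((d_eval - d) / (sigma * math.sqrt(2.0)))
    return count
```

Unlike a binned histogram, the smoothed CSFD is a continuous, monotonically decreasing function of diameter.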

Figure 4: SFDs for the crater catalogs generated from the IR, DTM, and combined catalogs, as well as Robbins and Hynek (2012b) and Salamunićcar et al. (2012) Mars catalogs. (top) Cumulative SFD, (middle) Relative SFD, (bottom) ratio of Incremental SFD to the Robbins and Hynek (2012b) ISFD. SFDs are calculated following Robbins et al. (2018a) using the recommended bandwidth of 0.1D and a Gaussian kernel.

Following Lee (2019), we show a sample of ‘new’ crater candidates in Figure 5 and a sample of missed craters in Figure 6. We generate each image as in L19 with the crater centered in the image with 1 crater diameter on each side. In the metrics calculated for Table 1 the ‘new’ craters (Figure 5) are considered false positives and reduce the precision, while the missed craters (Figure 6) are considered false negatives and reduce the recall.

Figure 5: A sample of 5 ‘new’ crater candidates found by the CDA. Each image is a randomly chosen crater from the catalog with diameter in the range (left to right) 6–10km, 10–40km, 40–60km, 60–100km, 100–400km. The THEMIS IR (top) and MOLA/HRSC DTM (bottom) images are shown for each crater, along with the CDA measured diameter, and a 10km scale bar (marked at 0,5,10km) for reference. Craters 1, 2, and 5 are included in the Robbins and Hynek (2012b) catalog but not in Salamunićcar et al. (2012), and would not be considered ‘new’ in a comparison between the CDA and Robbins and Hynek (2012b).
Figure 6: A sample of 5 craters missed by the CDA. The Figure is constructed identically to Figure 5 but using the craters missed by the CDA and included in Salamunićcar et al. (2012). The craters at the center of images 1 and 5 are not included in Robbins and Hynek (2012b) but are in Salamunićcar et al. (2012). The CDA finds the craters in images 2 and 3 in the DTM, but they are rejected because of low probabilities from the classifier network and no IR detections. The crater in image 4 was not found by the CDA.

3.2 Discussion

The CDA developed here performs at expert level down to 3km diameter craters. This new CDA has an overall (F1 score) performance on par with the Robbins and Hynek (2012b) catalog in comparison to Salamunićcar et al. (2012), or vice versa. The CDA is 10% more precise and has a 5% higher recall than the L19 CDA compared against Salamunićcar et al. (2012), with similar numbers when compared to Robbins and Hynek (2012b).

We use multiple stages in the new CDA to find many crater candidates (stage 1) and refine the candidates into a crater catalog (stages 2 and 3). In contrast to L19, our stage 1 ResUNET is more ‘liberal’ at identifying craters, potentially finding more false–positive craters and allowing the later stages to remove some of these false–positive detections.

The later stages of the CDA help to merge disparate catalogs from the IR and DEM CDAs. We combined the candidates from the IR, IRH, and DEM catalogs in the ‘combined’ catalog, increasing the total crater count, recall, and F1 score. The precision of the combined catalog does drop compared to the IR catalog, but the combined catalog includes more than 6,000 new matching craters for the 2,000 extra non–matching candidates.

The THEMIS IR dataset extended the range of detectable craters below the native resolution of the MOLA/HRSC dataset(Fergason et al., 2018). We limited the analysis in L19 to about 3.8km diameter, or 8 pixels at the MOLA scale of 463m/pixel. The 100m/pixel scale of the THEMIS dataset should allow a lower diameter of 1km. We used a conservative limit of 3km diameter for this work because the stage 2 network relies on DTM data to work, even if the stage 1 input data is higher resolution.

We find that the position and diameter ‘errors’ in matching craters are similar across the different datasets compared to Salamunićcar et al. (2012), and are similar to the L19 results. For most metrics in Table 1 the median difference between our catalog and the Salamunićcar et al. (2012) catalog is 2% relative to each crater’s diameter, or equivalently, less than 1 pixel. If we include the uncertainties calculated by our CDA, matching craters from Salamunićcar et al. (2012) are within 1σ for 76% of the craters in our catalog.

In contrast, we find most of the unique craters in each catalog are genuinely different from craters in the other catalog, and not near–matches. For false–positive (‘new’) craters and for false–negative (missed) craters, the nearest crater that meets the size criterion (the crater diameters are within 25% of each other) is, on average, five crater widths away. Less than 10% of either population is within one crater width of a crater in the other catalog.
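The nearest-match distance used in this comparison can be sketched as follows, with craters represented as hypothetical (x_km, y_km, diameter_km) tuples:

```python
import math

def nearest_size_match_widths(crater, other_catalog, size_tol=0.25):
    """Distance, in crater widths, to the nearest crater in the other
    catalog whose diameter is within size_tol (25%) of this crater's
    diameter. Returns None if no crater meets the size criterion."""
    x, y, d = crater
    best = None
    for ox, oy, od in other_catalog:
        if abs(od - d) > size_tol * d:
            continue  # fails the 25% size criterion
        widths = math.hypot(ox - x, oy - y) / d
        best = widths if best is None else min(best, widths)
    return best
```

Unique craters with a best distance of several widths, as found here, are genuine disagreements rather than near-misses of matched pairs.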

False–positive (‘new’) crater candidates include a few plausible craters that may be missing from the ground–truth catalog, but many more are implausible candidates such as curved land formations or ill–resolved depressions (Figure 5). Nevertheless, the false–positive rate has been reduced by a factor of 2 from L19, with much of the improvement due to the later stages filtering out anomalous small crater identifications. As in L19, the false–positive list includes a small number of volcanic paterae that are not explicitly excluded from the CDA.

False–negative craters include disagreements in location or size but are dominated by genuine misses of degraded and shallow craters (Figure 6). The CDA is twice as likely to miss a highly degraded crater (DEGRADATION_STATE of 1 in Robbins and Hynek (2012b)) compared to the next most degraded state. In addition, each 100m in crater depth increases the chance of detection by 30%, and each 1% decrease in eccentricity increases the chance of detection by 3%. In all cases, the diameter of the crater does not significantly affect these results.
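These trends can be read as odds ratios in a logistic detection model; the sketch below is an illustrative interpretation with a hypothetical intercept, not the fit performed in this work:

```python
import math

def detection_probability(depth_m, eccentricity, base_logit=0.0):
    """Illustrative logistic model of the quoted trends: odds multiply by
    ~1.3 per 100 m of crater depth, and by ~1.03 per 1% decrease in
    eccentricity. base_logit is a hypothetical intercept; the actual
    regression coefficients are not reproduced here."""
    logit = (base_logit
             + math.log(1.3) * depth_m / 100.0
             + math.log(1.03) * (1.0 - eccentricity) * 100.0)
    return 1.0 / (1.0 + math.exp(-logit))
```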


Notwithstanding the improvements to the CDA, we have added additional complexity to the algorithm and additional requirements before the CDA can be used on a new planetary body. The L19 CDA (and the Silburt et al. (2019) CDA) did not require any sample of craters from the body before the CDA could be used — the UNET was tested without training on Mercury (Silburt et al., 2019) and Pluto (Lee, 2018). Our new CDA benefits, in part, from the stage 2 classifier network that requires a sample of ground–truth data, and the stage 3 filter that requires independent validation of a sample of the CDA candidates.

In the experiment described above, we used the same ground–truth catalog to train the stage 2 network and validate the CDA candidates in stage 3, but in principle the workflow can be adjusted to accommodate new data. Stage 2 requires a sample of ground–truth craters that could be manually measured from a region of a new planetary body. Stage 3 requires validation of candidates proposed by the CDA, which can be quickly validated by humans. The CDA can work without stage 2, losing a few percent in precision and accuracy, and performs about as well as L19 without stage 3.

Future Improvements

We have improved the L19 CDA to be competitive with human–generated catalogs. Improvements beyond this limit are difficult to achieve with human–generated ground–truth datasets without more complex neural networks. It is also not clear that better agreement between catalogs should be targeted in future work (Robbins et al., 2014). Instead, refining the pre–processing stages and calculating additional crater metrics would likely be more productive additions to the algorithm than increasing naive crater–matching performance.

For example, the pre–processing stages could be improved to expose shallow craters, either by smoothing to varying spatial resolutions (Stepinski et al., 2009) or by adjusting contrast in variable terrain to highlight shallow features (Lee, 2019). The CDA could also be used to process smaller–scale localized images as part of ongoing mission operations to provide value–added data products from images of the surface (e.g., Wronkiewicz et al., 2018; Lee, 2018).

Two metrics are relatively straightforward to extract from the DTM and IR datasets: ellipticity and depth. Lee (2018) and Ali-Dib et al. (2019) showed that a MaskRCNN network (He et al., 2017) can extract crater–rim points that can then be fit with ellipses (e.g., Robbins, 2018) to provide ellipticity and orientation data for the craters (e.g., for use in secondary crater mapping; Naegeli and Laura, 2019). Stepinski et al. (2009) attempted to determine crater depth automatically from terrain data, and Robbins et al. (2018b) discuss using IR imagery for similar purposes. Robbins and Hynek (2012b) provide ground–truth data for both of these metrics, so networks could be trained to replicate their performance.
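As a crude illustration of the depth metric, the sketch below estimates depth from a DTM tile as the highest elevation on a one-pixel rim annulus minus the lowest elevation inside the rim; a production pipeline would fit the rim profile rather than use raw extrema:

```python
import math

def crater_depth(dtm, cx, cy, radius_px):
    """Crude depth estimate from a DTM tile (2D list of elevations in m):
    max elevation on a one-pixel-wide rim annulus minus min elevation on
    the crater floor. Assumes the crater is fully inside the tile."""
    rim, floor = [], []
    for j, row in enumerate(dtm):
        for i, z in enumerate(row):
            r = math.hypot(i - cx, j - cy)
            if r <= radius_px - 1:
                floor.append(z)      # interior (floor) pixels
            elif r <= radius_px + 1:
                rim.append(z)        # annulus around the nominal rim
    return max(rim) - min(floor)
```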

4 Conclusions

This paper develops a CDA to find near–circular craters in Mars DTM and IR imagery. The CDA improves on the Lee (2019) CDA by using IR data to extend the lower diameter limit to below 3km and by using new post–processing stages to verify and refine the crater catalogs generated from the imagery data.

The performance of the CDA measured against Salamunićcar et al. (2012) matches the performance of catalogs generated by expert human classifiers (Robbins and Hynek, 2012b) while working hundreds of times faster than typical human classification speeds. By design, the CDA incorporates multiple crater detection networks that perform with similar skill to the Lee (2019) algorithm, combined with a post–processing network that improves the precision of the algorithm by reducing the occurrence of false crater identifications.
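The scores used in these catalog comparisons reduce to standard precision, recall, and F1 over matched craters:

```python
def catalog_scores(n_matched, n_detected, n_ground_truth):
    """Precision, recall, and F1 for a CDA catalog scored against a
    ground-truth catalog: matched craters are true positives, unmatched
    detections are false positives, missed ground-truth craters are
    false negatives."""
    precision = n_matched / n_detected
    recall = n_matched / n_ground_truth
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```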

The new CDA performs worst on shallow and degraded craters, as does L19. Shallow crater detection was improved using IR imagery, but we did not include additional pre–processing of images to expose shallow features in the DTM (e.g., Stepinski and Urbach, 2009).

The post–processing stages (2 and 3) improve the precision of the CDA by removing false–positive candidates without sacrificing recall through the removal of many true craters. These stages could be used to improve results from other CDAs (e.g., Wronkiewicz et al., 2018; Ali-Dib et al., 2019), or be retrained to refine catalogs of other features on the surfaces of planetary bodies (e.g., Aye et al., 2019; Bickel et al., 2019).

5 Acknowledgments

We thank the three anonymous reviewers and the journal editors for their constructive comments that helped improve the clarity of the paper. Computations were performed on clusters and GPU enabled workstations at the University of Toronto. J. H. was funded by an Undergraduate Student Research Award from the Natural Sciences and Engineering Research Council of Canada during summer 2019.

6 Computer Code Availability

The crater detection software, named DeepMars2 (to distinguish it from the L19 algorithm, named DeepMars), is available with an MIT licence from the Dataverse repository for this manuscript. The CDA software is written in Python and requires a standard array of Python libraries to operate. The code will run on workstations with at least 16GB of available memory, but benefits from additional RAM to open the large input data maps, and from an Nvidia GPU (Nvidia 1080Ti or larger memory) to accelerate training and prediction.

References


  1. Ali-Dib, M., Menou, K., Zhu, C., Hammond, N., Jackson, A.P., 2019. Automated crater shape retrieval using weakly-supervised deep learning. arXiv preprint arXiv:1906.08826.
  2. Arvidson, R.E., 1974. Morphologic classification of Martian craters and some implications. Icarus 22, 264--271. doi:10.1016/0019-1035(74)90176-6.
  3. Aye, K.M., Schwamb, M.E., Portyankina, G., Hansen, C.J., McMaster, A., Miller, G.R., Carstensen, B., Snyder, C., Parrish, M., Lynn, S., Mai, C., Miller, D., Simpson, R.J., Smith, A.M., 2019. Planet Four: Probing springtime winds on Mars by mapping the southern polar CO2 jet deposits. Icarus 319, 558--598. doi:10.1016/j.icarus.2018.08.018.
  4. Barlow, N.G., 1988. Crater size-frequency distributions and a revised Martian relative chronology. Icarus 75, 285--305. doi:10.1016/0019-1035(88)90006-1.
  5. Barlow, N.G., Perez, C.B., 2003. Martian impact crater ejecta morphologies as indicators of the distribution of subsurface volatiles. Journal of Geophysical Research 108, 5085. doi:10.1029/2002JE002036.
  6. Bickel, V.T., Lanaras, C., Manconi, A., Loew, S., Mall, U., 2019. Automated Detection of Lunar Rockfalls Using a Convolutional Neural Network. IEEE Transactions on Geoscience and Remote Sensing 57, 3501--3511. doi:10.1109/TGRS.2018.2885280.
  7. Bue, B.D., Stepinski, T.F., 2006. Automated classification of landforms on Mars. Computers and Geosciences 32, 604--614. doi:10.1016/j.cageo.2005.09.004.
  8. Chollet, F., et al., 2015. Keras.
  9. Cintala, M.J., Head, J.W., Mutch, T.A., 1976. Martian crater depth/diameter relationships : Comparison with the Moon and Mercury, in: Proc. Lunar. Sci. Conf. 7th, pp. 3575--3587.
  10. Di, K., Li, W., Yue, Z., Sun, Y., Liu, Y., 2014. A machine learning approach to crater detection from topographic data. Advances in Space Research 54, 2419--2429. doi:10.1016/j.asr.2014.08.018.
  11. Edwards, C.S., Nowicki, K.J., Christensen, P.R., Hill, J., Gorelick, N., Murray, K., 2011. Mosaicking of global planetary image datasets: 1. Techniques and data processing for Thermal Emission Imaging System (THEMIS) multi-spectral data. Journal of Geophysical Research E: Planets 116, 1--21. doi:10.1029/2010JE003755.
  12. Fergason, R., Hare, T., Laura, J., 2018. HRSC and MOLA Blended Digital Elevation Model at 200m v2.
  13. Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press.
  14. He, K., Gkioxari, G., Dollar, P., Girshick, R., 2017. Mask R-CNN. Proceedings of the IEEE International Conference on Computer Vision 2017-October, 2980--2988. doi:10.1109/ICCV.2017.322, arXiv:1703.06870.
  15. Kinczyk, M.J., Prockter, L.M., Byrne, P.K., Susorney, H.C., Chapman, C.R., 2020. A morphological evaluation of crater degradation on Mercury: Revisiting crater classification with MESSENGER data. Icarus 341, 113637. doi:10.1016/j.icarus.2020.113637.
  16. Krøgli, S.O., Dypvik, H., 2010. Automatic detection of circular outlines in regional gravity and aeromagnetic data in the search for impact structure candidates. Computers and Geosciences 36, 477--488. doi:10.1016/j.cageo.2009.07.010.
  17. Lee, C., 2018. Martian Crater Identification Using Deep Learning, in: American Geophysical Union Fall Meeting, pp. P41D--3768.
  18. Lee, C., 2019. Automated Crater Detection on Mars using Deep Learning. Planetary and Space Science, in review.
  19. Milletari, F., Navab, N., Ahmadi, S.A., 2016. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. Proceedings - 2016 4th International Conference on 3D Vision, 3DV 2016, 565--571. doi:10.1109/3DV.2016.79, arXiv:1606.04797.
  20. Naegeli, T.J., Laura, J., 2019. Back-projecting secondary craters using a cone of uncertainty. Computers and Geosciences 123, 1--9. doi:10.1016/j.cageo.2018.10.011.
  21. Palafox, L.F., Hamilton, C.W., Scheidt, S.P., Alvarez, A.M., 2017. Automated detection of geological landforms on Mars using Convolutional Neural Networks. Computers and Geosciences 101, 48--56. doi:10.1016/j.cageo.2016.12.015.
  22. Palucis, M.C., Jasper, J., Garczynski, B., Dietrich, W.E., 2020. Quantitative assessment of uncertainties in modeled crater retention ages on Mars. Icarus 341, 113623. doi:10.1016/j.icarus.2020.113623.
  23. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825--2830.
  24. Pedrosa, M.M., de Azevedo, S.C., da Silva, E.A., Dias, M.A., 2017. Improved automatic impact crater detection on Mars based on morphological image processing and template matching. Geomatics, Natural Hazards and Risk 8, 1306--1319. doi:10.1080/19475705.2017.1327463.
  25. PROJ Contributors, 2018. PROJ Coordinate Transformation Software Library.
  26. Redmon, J., Divvala, S., Girshick, R., Farhadi, A., 2016. You only look once: Unified, real-time object detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2016-December, 779--788. doi:10.1109/CVPR.2016.91, arXiv:1506.02640.
  27. Robbins, S.J., 2018. A potpourri of related crater cataloging: From fitting ellipses to new basemaps to crater production functions.
  28. Robbins, S.J., Antonenko, I., Kirchoff, M.R., Chapman, C.R., Fassett, C.I., Herrick, R.R., Singer, K., Zanetti, M., Lehan, C., Huang, D., Gay, P.L., 2014. The variability of crater identification among expert and community crater analysts. Icarus 234, 109--131. doi:10.1016/j.icarus.2014.02.022.
  29. Robbins, S.J., Hynek, B.M., 2012a. A new global database of Mars impact craters larger than 1 km: 2. Global crater properties and regional variations of the simple-to-complex transition diameter. Journal of Geophysical Research E: Planets 117, 1--21. doi:10.1029/2011JE003967.
  30. Robbins, S.J., Hynek, B.M., 2012b. A new global database of Mars impact craters larger than 1km: 1. Database creation, properties, and parameters. Journal of Geophysical Research E: Planets 117, 1--18. doi:10.1029/2011JE003966.
  31. Robbins, S.J., Riggs, J.D., Weaver, B.P., Bierhaus, E.B., Chapman, C.R., Kirchoff, M.R., Singer, K.N., Gaddis, L.R., 2018a. Revised recommended methods for analyzing crater size-frequency distributions. Meteoritics and Planetary Science 53, 891--931. doi:10.1111/maps.12990.
  32. Robbins, S.J., Watters, W.A., Chappelow, J.E., Bray, V.J., Daubar, I.J., Craddock, R.A., Beyer, R.A., Landis, M., Ostrach, L.R., Tornabene, L., Riggs, J.D., Weaver, B.P., 2018b. Measuring impact crater depth throughout the solar system. Meteoritics and Planetary Science 53, 583--637. doi:10.1111/maps.12956.
  33. Ronneberger, O., Fischer, P., Brox, T., 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation, in: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (Eds.), Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2015, Springer International Publishing, Cham. pp. 234--241.
  34. Salamunićcar, G., Lončarić, S., Pina, P., Bandeira, L., Saraiva, J., 2011. MA130301GT catalogue of Martian impact craters and advanced evaluation of crater detection algorithms using diverse topography and image datasets. Planetary and Space Science 59, 111--131. doi:10.1016/j.pss.2010.11.003.
  35. Salamunićcar, G., Lončarić, S., Mazarico, E., 2012. LU60645GT and MA132843GT catalogues of Lunar and Martian impact craters developed using a Crater Shape-based interpolation crater detection algorithm for topography data. Planetary and Space Science 60, 236--247. doi:10.1016/j.pss.2011.09.003.
  36. Silburt, A., Ali-Dib, M., Zhu, C., Jackson, A., Valencia, D., Kissin, Y., Tamayo, D., Menou, K., 2019. Lunar crater identification via deep learning. Icarus 317, 27--38. doi:10.1016/j.icarus.2018.06.022, arXiv:1803.02192.
  37. Soderblom, L.A., Condit, C.D., West, R.A., Herman, B.M., Kreidler, T.J., 1974. Martian planetwide crater distributions: Implications for geologic history and surface processes. Icarus 22, 239--263. doi:10.1016/0019-1035(74)90175-4.
  38. Stepinski, T.F., Mendenhall, M.P., Bue, B.D., 2009. Machine cataloging of impact craters on Mars. Icarus 203, 77--87. doi:10.1016/j.icarus.2009.04.026.
  39. Stepinski, T.F., Urbach, E.R., 2009. The First Automatic Survey of Impact Craters on Mars: Global Maps of Depth/ Diameter Ratio, in: Lunar and Planetary Science Conference, p. 1117. doi:10.2174/138920312803582960.
  40. van der Walt, S., Schönberger, J.L., Nunez-Iglesias, J., Boulogne, F., Warner, J.D., Yager, N., Gouillart, E., Yu, T., 2014. scikit-image: image processing in Python. PeerJ 2, e453. doi:10.7717/peerj.453, arXiv:1407.6245.
  41. Wang, Y., Xie, M., Xiao, Z., Cui, J., 2020. The minimum confidence limit for diameters in crater counts. Icarus, 113645. doi:10.1016/j.icarus.2020.113645.
  42. Wronkiewicz, M., Kerner, H.R., Harrison, T., 2018. Autonomous Mapping of Surface Features on Mars, in: AGU Fall Meeting, pp. P41D--3758.
  43. Zhang, Z., Liu, Q., Wang, Y., 2018. Road Extraction by Deep Residual U-Net. IEEE Geoscience and Remote Sensing Letters 15, 749--753. doi:10.1109/LGRS.2018.2802944, arXiv:1711.10684.
  44. Zuo, W., Zhang, Z., Li, C., Wang, R., Yu, L., Geng, L., 2016. Contour-based automatic crater recognition using digital elevation models from Chang’E missions. Computers and Geosciences 97, 79--88. doi:10.1016/j.cageo.2016.07.013.