A Directionally Selective Small Target Motion Detecting Visual Neural Network in Cluttered Backgrounds
Discriminating targets moving against a cluttered background is a huge challenge, let alone detecting a target as small as one or a few pixels and tracking it in flight. In the fly’s visual system, a class of specific neurons, called small target motion detectors (STMDs), has been identified as showing exquisite selectivity for small target motion. Some STMDs have also demonstrated directional selectivity, which means that they respond strongly only to their preferred motion direction. Directional selectivity is an important property of these STMD neurons that could contribute to tracking small targets such as mates in flight. However, little has been done on systematically modeling these directionally selective STMD neurons. In this paper, we propose a directionally selective STMD-based neural network (DSTMD) for small target detection in a cluttered background. In the proposed neural network, a new correlation mechanism is introduced for direction selectivity via correlating signals relayed from two pixels. Then, a lateral inhibition mechanism is implemented on the spatial field for size selectivity of STMD neurons. Extensive experiments showed that the proposed neural network not only is in accord with current biological findings, i.e., showing directional preferences, but also works reliably in detecting small targets against cluttered backgrounds.
Intelligent robots have shown great potential for reshaping human life in the future. However, artificial visual systems so far cannot provide a robot with the capacity to respond to the visual world in real time, as many animal species do. Among many visual functionalities, detecting small moving targets is one of the most important abilities for many animal species, e.g., finding mates in the distance, and it is also critical for a robot to track small targets in a cluttered background.
Small target motion detection in visually cluttered backgrounds has long been considered a challenging problem for artificial visual systems. The difficulty is reflected in two aspects. Firstly, when a target is far away from the observer, it appears as a small dim speckle whose size may vary from one pixel to a few pixels in the receptive field. At this size, shape, color and texture information cannot be used for target detection. Secondly, small targets are often buried in cluttered backgrounds and are difficult to separate from noise. In addition, ego motion may bring further difficulties to small target motion detection.
Nature has provided a rich source of inspiration for small target motion detection. Detecting small targets in naturally cluttered backgrounds is critical for many insect species to search for mates or track their prey. As the result of millions of years of evolution, the small target motion detection visual systems in insects are both efficient and reliable [1, 2]. For example, dragonflies can pursue small flying insects with successful capture rates as high as relying on their well evolved vision system . Compared to the visual systems of primate animals, insects’ visual systems achieve amazing capability using relatively simple structures and a small number of neurons. Insects’ visual pathways are perfect models for designing artificial vision systems for small target motion detection.
In the fly’s visual system, a class of specific neurons, called small target motion detectors (STMDs), has been identified as showing exquisite selectivity for small targets (size selectivity) [2, 4, 5]. To be more precise, STMD neurons give peak responses to targets subtending of the visual field, with no response to larger bars (typically ) or to wide-field grating stimuli. In addition, some STMD neurons are directionally selective (direction selectivity) [6, 7]. They respond strongly to small target motion oriented along a preferred direction, but show weak or no response to null-direction motion. Null direction is from the preferred direction. Although the postsynaptic pathways of the STMD neurons are still under investigation , it is clear that knowing the small target motion and its direction at the same time is an advantage in tasks such as tracking mates or intercepting prey.
The electrophysiological knowledge about the STMD neurons and their afferent pathway revealed in the past few decades makes it possible to propose quantitative STMD models. However, little has been done on systematically modeling these directionally selective STMD neurons. Wiederman et al. proposed the elementary small target motion detector (ESTMD) to account for the size selectivity of STMD neurons. ESTMD showed strong responses to small target motion, but much weaker or even no responses to wide-field motion. However, it did not consider direction selectivity and responded identically to small target motion oriented along different directions. It has also been indicated that two hybrid models, i.e., EMD cascaded with ESTMD (EMD-ESTMD) and ESTMD cascaded with EMD (ESTMD-EMD), could exhibit both size and direction selectivities. However, no details were given on how the size and direction selectivities could be achieved with these two models. In short, little or no systematic research has been carried out on modeling directionally selective STMD neurons, though such a model could be a crucial component of an efficient artificial vision system.
In this study, we propose a neural network, called DSTMD, to model the directionally selective STMD neurons in the fly. It can detect not only small target motion but also the direction of that motion in cluttered backgrounds. The proposed neural network incorporates a new correlation mechanism which correlates signals relayed from two photoreceptors so as to introduce directional selectivity. Then, a lateral inhibition mechanism acting on correlation outputs is used for size selectivity. Systematic experiments are carried out to validate the proposed neural network in complex environments.
The remainder of this paper is organized as follows. In section II, related work is reviewed. In section III, the proposed neural network is introduced in detail. In section IV, experiments are carried out to test the performance of the proposed neural network. Discussion is also given in this section. In section V, further discussions are given. In section VI, we give conclusions and perspectives.
II Related Work
In this section, we review the related work on three motion sensitive neurons: the lobula giant movement detector (LGMD) [11, 12, 13, 14], the lobula plate tangential cell (LPTC) [15, 16, 17, 18, 19] and the small target motion detector (STMD) [2, 4, 5, 6, 7]. These three types of motion sensitive neurons are all found in insects’ visual systems and have been extensively studied. They extract visual motion information from dynamic scenes and respond to specific motion patterns.
II-A Lobula Giant Movement Detector (LGMD)
Lobula giant movement detectors (LGMDs) are collision sensitive neurons found in locusts [11, 12, 13, 14]. They respond strongly to objects approaching the locust on a direct collision course while showing little or no response to receding objects.
A great number of LGMD-based neural networks [20, 1, 21, 22, 23] have been proposed. These neural networks showed the same collision sensitivity as the LGMD neuron and can detect collisions cheaply and reliably in a complex background. Additionally, these neural networks have been implemented on mobile robots . However, because these neural networks are proposed to model the LGMD neuron and detect collisions, they lack the ability of small target motion detection and do not show size and direction selectivities.
II-B Lobula Plate Tangential Cell (LPTC)
Lobula plate tangential cells (LPTCs) are located in the lobula layer of the fly’s visual system . They show strong responses to wide-field motion, but also to the motion of local, salient features [16, 17, 18, 19].
In 1956, Hassenstein and Reichardt proposed the first LPTC model, the elementary motion detector (EMD), by analyzing the turning behavior of the insect. In the past decade, considerable progress has been made in identifying afferent pathways and characteristics of LPTCs. In order to incorporate these new biological findings, EMD was adapted, giving rise to several models, such as the two-quadrant-detector (TQD) [26, 27] and the weighted-quadrant-detector (WQD). However, these models are not size selective and respond strongly to both wide-field motion and small target motion.
II-C Small Target Motion Detector (STMD)
In the fly’s visual system, small target motion detectors (STMDs) are characterized by their exquisite size selectivity. They are exquisitely sensitive to small moving targets subtending of the receptive field, but show much weaker or no response to wide-field motion [2, 4, 5]. Besides, some STMDs exhibit direction selectivity [6, 7]. They show strong responses to preferred-direction motion, but weak or no response to null-direction motion.
The elementary small target motion detector (ESTMD) was proposed to model non-directionally selective STMDs. Although ESTMD shows size selectivity, it is not directionally selective. It has also been indicated that two hybrid models, i.e., EMD cascaded with ESTMD (EMD-ESTMD) and ESTMD cascaded with EMD (ESTMD-EMD), could show both size selectivity and direction selectivity. However, the size and direction selectivities of these two models were not studied or demonstrated in detail. In short, there is little or no systematic research on modeling directionally selective STMD neurons.
III Formulation of the Model
Following the typical multi-stage view of motion detection in the fly’s visual system (schematically illustrated in Fig. 1), we devised a directionally selective STMD-based neural network (DSTMD) in this study. Fig. 2 shows the schematic diagram of one DSTMD cell and its presynaptic neural network. The proposed neural network is composed of four neural layers: the retina, lamina, medulla and lobula. These four sequentially arranged neural layers have specific functions and cooperate for small target motion detection. In the following subsections, we elaborate on the components and functions of each layer.
III-A Retina Layer
In the fly’s visual system, the retina layer contains a great number of ommatidia (or facets). Each ommatidium is composed of eight photoreceptors, denoted by R1-R8. Each photoreceptor views a small region of the whole visual field and supplies a ‘pixel’ of information. In DSTMD, a rectangular sampling window is used to roughly approximate the hexagonal receptive field of ommatidia. As depicted in Fig. 3, each small square denotes a pixel, corresponding to a photoreceptor. The red dotted rectangle represents the receptive field of an ommatidium, which overlaps significantly with its neighbours. The neural response of an ommatidium is approximated by a linear Gaussian blur. Specifically, let denote varying luminance values captured by photoreceptors where and are spatial and temporal field positions. Then, the output of ommatidia with receptive fields centered at denoted by is defined by the following equation,
where is a Gaussian function, defined as
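As a rough illustration of the retina stage described above, the following sketch blurs a single luminance frame with a normalized Gaussian kernel. The function names, the kernel radius, the value of sigma and the edge-clamping rule are assumptions made for this example and are not taken from the model's parameter table.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    kernel = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
               for dx in range(-radius, radius + 1)]
              for dy in range(-radius, radius + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def ommatidium_output(frame, sigma=1.0, radius=2):
    """Blur one luminance frame (list of rows) with a Gaussian kernel.

    Border pixels are handled by clamping coordinates to the image edge
    (an assumption; the text does not specify border handling).
    """
    height, width = len(frame), len(frame[0])
    kernel = gaussian_kernel(sigma, radius)
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for j in range(-radius, radius + 1):
                for i in range(-radius, radius + 1):
                    yy = min(max(y + j, 0), height - 1)
                    xx = min(max(x + i, 0), width - 1)
                    acc += frame[yy][xx] * kernel[j + radius][i + radius]
            blurred[y][x] = acc
    return blurred

# A one-pixel dark target on a bright background (hypothetical test frame).
frame = [[1.0] * 7 for _ in range(7)]
frame[3][3] = 0.0
blurred = ommatidium_output(frame)
```

In this toy frame the dark pixel is spread over its neighbours, which is the spatial smoothing attributed to the overlapping receptive fields of ommatidia.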
III-B Lamina Layer
In the lamina layer, large monopolar cells (LMCs), such as L1 and L2, are postsynaptic neurons of ommatidia (see Fig. 1). They show strong responses to luminance increments and decrements [31, 32, 33, 34]. In DSTMD, LMCs act as a temporal high-pass filter extracting motion information (luminance change) from input signals. Let denote the neural response of LMC located at . Then, is defined by convolving ommatidium output with a temporal high-pass filter . That is,
where is a Gamma function, defined as
An illustration of Gamma function and temporal high-pass filter is shown in Fig. 6.
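The temporal filtering of the lamina stage can be sketched as follows. The exact kernels are not reproduced here, so the sketch assumes a commonly used Gamma kernel of the form (nt)^n exp(-nt/τ) / ((n-1)! τ^(n+1)), sampled once per frame, and assumes that the high-pass filter is the difference of a fast and a slow Gamma kernel. The orders and time constants below are hypothetical.

```python
import math

def gamma_kernel(order, tau, length):
    """Sample an assumed Gamma kernel at t = 0, 1, ..., length-1 frames."""
    norm = math.factorial(order - 1) * tau ** (order + 1)
    raw = [(order * t) ** order * math.exp(-order * t / tau) / norm
           for t in range(length)]
    total = sum(raw)
    return [v / total for v in raw]  # renormalize the discrete samples

def highpass_kernel(n1, tau1, n2, tau2, length):
    """Difference of a fast and a slow Gamma kernel (assumed filter form)."""
    fast = gamma_kernel(n1, tau1, length)
    slow = gamma_kernel(n2, tau2, length)
    return [f - s for f, s in zip(fast, slow)]

def causal_convolve(signal, kernel):
    """Convolve a time series with a causal kernel (past samples only)."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += signal[t - k] * w
        out.append(acc)
    return out

# Hypothetical parameters: a luminance decrement at frame 80.
kernel = highpass_kernel(n1=2, tau1=3.0, n2=6, tau2=9.0, length=60)
step_down = [1.0] * 80 + [0.2] * 120
response = causal_convolve(step_down, kernel)
```

Because the two kernels have equal area, constant luminance produces no response, while the luminance decrement produces a transient negative response that decays back to zero.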
Before LMC relays its output to the medulla layer, it receives lateral inhibition from its adjacent LMCs. Here, signal is convolved with an inhibition kernel so as to implement lateral inhibition mechanism. That is,
where is the signal after lateral inhibition and inhibition kernel is defined by the following equations ,
Here, we set , , , as
where denote and , respectively.
III-C Medulla Layer
In the medulla layer, LMCs synapse on a great variety of intermediate neurons, such as the transmedullary neuron Tm1, Tm2, Tm3 and medulla intrinsic neuron Mi1 [35, 36, 37]. The neurons Mi1 and Tm3 respond selectively to brightness increments, with the response of Mi1 delayed relative to Tm3. Conversely, Tm1 and Tm2 respond selectively to brightness decrements, with the response of Tm1 delayed compared with Tm2 [38, 34, 39].
The modeling method for Tm1, Tm2, Tm3 and Mi1 differs between DSTMD and ESTMD. In the following, we introduce how DSTMD and ESTMD model these four medulla neurons, respectively.
1) Medulla Neuron Modeling of DSTMD: In DSTMD, neural responses of Tm3 () and Tm2 () are defined by the following equations,
where and are positive and negative components of the laterally inhibited neural response of LMCs (), respectively. That is,
where and are called ON and OFF signals in DSTMD, representing luminance increment and decrement signals, respectively.
Since the neural response of Mi1 (or Tm1) is delayed relative to Tm3 (or Tm2), we convolve () with a Gamma function to obtain temporally delayed signals. That is,
where and are time-delayed signals, representing neural responses of Mi1 and Tm1, respectively. are orders of Gamma functions while are time constants. The larger (or ), the longer the time delay of the neural response (or ).
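The ON/OFF separation and the temporal delay described above can be illustrated with a short sketch. It assumes half-wave rectification for the positive and negative components and a Gamma kernel whose peak lies at its time constant. The signal, the kernel order and the time constant are hypothetical values chosen so that the delayed OFF pulse lines up with the later ON pulse.

```python
import math

def split_on_off(lmc):
    """Half-wave rectify a signal into ON (increments) and OFF (decrements)."""
    on = [v if v > 0 else 0.0 for v in lmc]
    off = [-v if v < 0 else 0.0 for v in lmc]
    return on, off

def gamma_kernel(order, tau, length):
    """Assumed Gamma kernel sampled per frame and normalized to unit sum."""
    raw = [(order * t) ** order * math.exp(-order * t / tau)
           for t in range(length)]
    total = sum(raw)
    return [v / total for v in raw]

def delay(signal, order, tau, length=40):
    """Delay a signal by convolving it with a causal Gamma kernel."""
    kernel = gamma_kernel(order, tau, length)
    return [sum(signal[t - k] * w for k, w in enumerate(kernel) if t - k >= 0)
            for t in range(len(signal))]

# A dark target passing one pixel: decrement at frame 10, increment at frame 15.
lmc = [0.0] * 10 + [-1.0] + [0.0] * 4 + [1.0] + [0.0] * 30
on, off = split_on_off(lmc)
off_delayed = delay(off, order=5, tau=5.0)
```

With these values the delayed OFF signal peaks at the frame where the ON signal occurs, so a subsequent multiplication of the two signals would be non-zero only for a target of matching width and speed.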
2) Medulla Neuron Modeling of ESTMD: In ESTMD, neural responses of Tm3, Tm2 are defined by the following equations,
where and denote neural responses of Tm3 and Tm2, respectively. is the second-order lateral inhibition kernel, defined as
where are constant, and is defined as
where is a Gaussian function and are constant.
Compared to DSTMD, ESTMD implements a second-order lateral inhibition mechanism on these four medulla neurons, i.e., convolving and with a second-order lateral inhibition kernel . This second-order lateral inhibition mechanism is different from the classic lateral inhibition mechanism. Its surround inhibition is much stronger than its center excitation, whereas in the classic lateral inhibition mechanism, surround inhibition is equal to center excitation (see Fig. 7). Biologists assert that the size selectivity of STMD neurons is shaped by this second-order lateral inhibition mechanism. However, where this mechanism occurs remains elusive. ESTMD implements it in the medulla layer. However, this contradicts biological findings: LPTCs also receive signals from the medulla layer [41, 34], so they would show strong size selectivity if this lateral inhibition mechanism were implemented there. This conflicts with the finding that LPTCs do not exhibit size selectivity [16, 17, 18, 19]. For this reason, we implement the second-order lateral inhibition mechanism on the correlation outputs of STMD neurons (see Eq. (27)) rather than in the medulla layer. We will demonstrate this point in the experimental section.
Similarly, and are convolved with a Gamma function so as to obtain temporally delayed signals. That is,
where and are time-delayed signals, representing neural responses of Mi1 and Tm1, respectively.
III-D Lobula Layer
In the lobula layer, STMDs integrate signals relayed from the medulla layer then respond strongly to small target motion. In the following, modeling methods for STMDs of ESTMD and DSTMD are introduced, respectively.
1) ESTMD: In ESTMD, the neural response of STMD neuron with receptive field centered at () is defined by the following equation,
In ESTMD, is regarded as the neural output of STMD neurons.
2) DSTMD: In DSTMD, the correlation output of STMD neuron with receptive field centered at and a preferred motion direction () is defined by the following equation,
and is a constant and .
Different from the correlation mechanism of ESTMD (see Eq. (24)), DSTMD correlates outputs of medulla neurons located at two different positions, i.e., and . Medulla neural signals from at least two different positions are needed to discriminate motion direction . Since ESTMD only correlates signals of medulla neurons located at a single position , ESTMD is able to detect the presence of target motion, but not the target’s motion direction. In the correlation mechanism of DSTMD (see Eq. (25)), for a given position , we can choose a series of , corresponding to different (see Fig. 8). Thus, a series of correlation outputs with different preferred motion directions can be defined. Through this correlation mechanism, information about motion direction can be introduced in DSTMD and DSTMD can show direction selectivity.
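The role of the second position can be illustrated with a small sketch. It assumes that the second position is obtained by stepping a fixed correlation distance from the first position along each preferred direction, and it uses a generic delay-and-multiply toy with two signals rather than the four-signal correlation of Eq. (25). The function names, the correlation distance and the delay are hypothetical.

```python
import math

def correlation_partners(x, y, alpha, n_directions=8):
    """Return {theta: (x', y')} for evenly spaced preferred directions.

    Assumes (x', y') = (x + alpha*cos(theta), y + alpha*sin(theta)),
    rounded to the pixel grid.
    """
    partners = {}
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        partners[theta] = (x + round(alpha * math.cos(theta)),
                           y + round(alpha * math.sin(theta)))
    return partners

def delay_and_correlate(signal_a, signal_b, delay):
    """Sum over time of signal_b multiplied by signal_a delayed by `delay` frames."""
    return sum(signal_a[t - delay] * signal_b[t]
               for t in range(delay, len(signal_b)))

def pulse(at, length=30):
    """A unit pulse at frame `at` (stand-in for a medulla signal)."""
    return [1.0 if t == at else 0.0 for t in range(length)]

partners = correlation_partners(10, 10, alpha=3)

# Target passes position A at frame 10 and position B, 3 pixels away, at frame 13.
preferred = delay_and_correlate(pulse(10), pulse(13), delay=3)
# Same target moving the other way: B at frame 10, then A at frame 13.
null = delay_and_correlate(pulse(13), pulse(10), delay=3)
```

In this toy, the correlation output is non-zero only when the target reaches the second position after the delay, which is why signals from a single position cannot encode direction.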
After signal correlation, DSTMD implements the second-order lateral inhibition mechanism on to account for size selectivity of STMD neurons. That is,
where is the signal after lateral inhibition and is defined in Eq. (20).
From Fig. 7, we can see that the inhibition kernel contains two components, i.e., excitatory and inhibitory components, which are determined by and , respectively. In both ESTMD and DSTMD, surround inhibition is set to be three times as strong as center excitation. In this case, once the target’s size exceeds the excitatory region, it receives strong inhibition. When the target is smaller than the excitatory region, the amount of excitation increases with target size. Therefore, DSTMD prefers targets whose size is equal to that of the excitatory region. That is, DSTMD shows size selectivity.
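The size preference described above can be reproduced with a one-dimensional toy kernel. The sketch assumes that the kernel is a difference of two Gaussians whose negative part is weighted three times as strongly as its positive part. The one-dimensional simplification, the Gaussian widths and the function names are assumptions for illustration only.

```python
import math

def gaussian(x, sigma):
    """One-dimensional Gaussian with unit area."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

def inhibition_kernel(radius, sigma_center, sigma_surround, a=1.0, b=3.0):
    """Difference of Gaussians with positive part scaled by a, negative part by b."""
    kernel = []
    for x in range(-radius, radius + 1):
        g = gaussian(x, sigma_center) - gaussian(x, sigma_surround)
        kernel.append(a * g if g > 0 else b * g)
    return kernel

def response_at_center(kernel, bar_width):
    """Rectified kernel response to a unit-height bar centered on the kernel."""
    radius = len(kernel) // 2
    half = bar_width // 2
    total = sum(kernel[radius + x] for x in range(-half, half + 1)
                if abs(x) <= radius)
    return max(total, 0.0)

# Hypothetical widths: the excitatory region spans 5 samples for these sigmas.
kernel = inhibition_kernel(radius=8, sigma_center=1.5, sigma_surround=3.0)
responses = [response_at_center(kernel, w) for w in (1, 3, 5, 7, 9)]
```

For these values the response grows while the bar fits inside the excitatory region, peaks when the bar just fills it, and collapses once the bar extends into the strongly weighted surround.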
Following this second-order lateral inhibition mechanism, DSTMD model inhibits model response at directions more than apart by convolving with an inhibition kernel . That is,
where is defined as
In DSTMD, is regarded as neural output of STMD neurons.
III-E Motion Direction Estimation
STMDs synapse on target selective descending neurons (TSDNs) which connect with wing muscles [43, 7, 44]. In , the authors found that insects use eight pairs of TSDNs to encode motion direction of targets by a population vector algorithm.
Here, we estimate motion direction of targets by populating neural responses along different directions . That is,
where denotes motion direction of the small target at time and denotes the position of STMD neurons responding to the small target.
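A population vector estimate of this kind can be sketched as follows. The sketch assumes the standard form in which each directional response weights a unit vector along its preferred direction and the estimate is the angle of the vector sum. The rectified-cosine tuning used to generate the example responses is hypothetical and is not the measured DSTMD tuning.

```python
import math

def estimate_direction(responses):
    """Population vector estimate from (preferred_direction, response) pairs.

    Returns the angle of the response-weighted vector sum in [0, 2*pi).
    """
    x = sum(r * math.cos(theta) for theta, r in responses)
    y = sum(r * math.sin(theta) for theta, r in responses)
    return math.atan2(y, x) % (2.0 * math.pi)

# Eight preferred directions with hypothetical rectified-cosine responses
# to a target moving at 60 degrees.
directions = [k * math.pi / 4.0 for k in range(8)]
true_direction = math.radians(60.0)
responses = [(theta, max(math.cos(theta - true_direction), 0.0))
             for theta in directions]
estimate = estimate_direction(responses)
```

Because the estimate interpolates between neighbouring preferred directions, it can resolve motion directions that lie between the eight tuned directions.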
III-F Parameter Setting
Parameters of DSTMD and ESTMD are given in Table I. These parameters are tuned manually based on empirical experience and will not be changed in the following experiments unless stated.
The proposed neural network is written in Matlab (The MathWorks, Inc., Natick, MA). The computer used in the experiments is a PC with one GHz CPU and a Windows operating system.
IV Results and Discussions
In this section, synthetic image sequences produced by Vision Egg  are used to test the proposed model. The video images are (in horizontal) by (in vertical) pixels and temporal sampling frequency is set as Hz.
IV-A Contribution of Various Neurons
In this section, we evaluate the impact of different neurons in DSTMD and ESTMD. Fig. 9 is a representative frame of the input image sequence. Neural responses to motion along the red line are extracted and presented in Fig. 19 for clearly showing characteristics of the neurons.
Fig. 19(a) shows the input luminance signal . In Fig. 19(a), luminance is close to at position . Besides, luminance between and is also significantly lower than that of the surrounding areas. These two regions correspond to the small target and the tree trunk in Fig. 9, respectively. Fig. 19(b) shows the responses of the ommatidium. Compared to the input luminance signal, the response of ommatidium is spatially blurred. Fig. 19(c) shows the response of LMCs. In DSTMD, the LMC neuron acts as a temporal filter which shows positive responses to luminance increments and negative responses to luminance decrements. For example, in Fig. 9, the tree trunk is moving from left to right. For pixels located at the right edge of this tree trunk (), their luminance will decrease due to the movement of this tree trunk. However, for pixels located at the left edge of this tree trunk (), their luminance will increase. Therefore, a positive response and a negative response can be seen at the left and right edges of the tree trunk, respectively.
Fig. 19(d) shows Tm3 and Tm2 neural responses ( and ) in DSTMD while Fig. 19(e) shows Tm3 and Tm2 neural responses ( and ) in ESTMD. Compared to and , and are largely inhibited because of the second-order lateral inhibition mechanism (see Eq. (18) and (19)). Fig. 19(f) and 19(g) show medulla signals used in signal correlation step of DSTMD and ESTMD, respectively. Four medulla signals are used in signal correlation step of DSTMD (see Eq. (25)) while only two medulla signals are used in the signal correlation step of ESTMD (see Eq. (24)).
Fig. 19(h) and 19(i) show DSTMD outputs () and ESTMD outputs (), respectively. From these two figures, we can see that both DSTMD and ESTMD show strong responses to the small target located at position , but much weaker or even no response to other objects, such as the tree trunk. This reflects the size selectivity of DSTMD and ESTMD: both respond only to the motion of small targets of a specific size. Besides, in Fig. 19(h), we can see that DSTMD has eight outputs representing STMD neural responses tuned to eight directions, i.e., . In contrast, in Fig. 19(i), ESTMD has only a single output, which lacks direction information. This is a significant difference between DSTMD and ESTMD: DSTMD exhibits direction selectivity while ESTMD is not directionally selective. To show the direction selectivity of DSTMD clearly, DSTMD responses to the small target are shown in polar coordinates (see the small disk in Fig. 19(h)). From this subplot, we can see that DSTMD shows the strongest response along direction , which is consistent with the motion direction of the small target. The response of DSTMD gradually decreases as the preferred direction deviates from the motion direction of the small target. When the preferred direction is opposite to the motion direction of the small target, i.e., , the DSTMD response tuned to this direction is close to . In this case, we can roughly estimate the motion direction of the small target by determining the direction of the strongest DSTMD response.
IV-B Tuning Properties
Biological research has found that STMDs have some basic tuning properties, including LDTB sensitivity, velocity selectivity, width selectivity and height selectivity [5, 2, 7], where LDTB denotes the luminance difference between the target and background. LDTB sensitivity means that the response of STMDs increases with LDTB. Velocity selectivity means that STMDs show the strongest response to a specific velocity (the optimal velocity). Width and height selectivities are defined analogously. In this section, we perform experiments to test whether DSTMD (or ESTMD) satisfies these tuning properties of STMDs.
As shown in Fig. 20, width represents target length extended parallel to target’s motion direction while height denotes target length extended orthogonal to target’s motion direction. If the size of a target is , the size of its background rectangle is , where is a constant which equals to pixels in this paper. Luminance difference between the target and background (LDTB) is defined by the following equation,
where is the average pixel value of the target, is the average pixel value in neighboring area around the target.
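The LDTB measure can be illustrated with a short sketch. It assumes that LDTB is the absolute difference between the two average pixel values, normalized by the maximum pixel value of an 8-bit image. The normalization constant and the function name are assumptions for this example.

```python
def ldtb(target_pixels, background_pixels, max_value=255.0):
    """Luminance difference between target and background.

    Assumed form: |mean(target) - mean(background)| / max_value.
    """
    mu_target = sum(target_pixels) / len(target_pixels)
    mu_background = sum(background_pixels) / len(background_pixels)
    return abs(mu_target - mu_background) / max_value

# Hypothetical pixel values: a black target on a white background,
# and a mid-grey target on a slightly brighter background.
high_contrast = ldtb([0] * 25, [255] * 24)
low_contrast = ldtb([100, 100], [151, 151])
```

Under this assumed normalization, LDTB ranges from 0 (target indistinguishable from its surroundings) to 1 (maximal contrast).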
For a small target whose velocity, width, height and luminance are set as pixel/second ( pixel/frame), pixels, pixels and , respectively, moving against white background, we vary one of these four parameters (velocity, width, height and luminance) and record corresponding DSTMD and ESTMD responses. Recorded tuning curves are shown in Fig. 25. In Fig. 25(a), we can see that both DSTMD and ESTMD show LDTB sensitivity. The responses of DSTMD and ESTMD increase as the increment of LDTB until reach maximum at LDTB . In Fig. 25(b), responses of DSTMD and ESTMD peak at velocity pixel/second ( pixel/frame) and decrease significantly as the increment or decrement of velocity. In other words, DSTMD and ESTMD prefer targets whose velocities are equal to pixel/second. Similar variation trend can be seen in Fig. 25(c) and Fig. 25(d). These three figures indicate that both DSTMD and ESTMD show velocity, width and height selectivities.
As we have mentioned in subsection III-C, it is unrealistic that ESTMD implements the second-order lateral inhibition mechanism in the medulla layer. If this lateral inhibition mechanism occurs in the medulla layer, medulla signals would be largely inhibited. In this case, LPTCs would show strong size selectivity by integrating these inhibited medulla signals. This conflicts with the biological finding that LPTCs do not show size selectivity [16, 17, 18, 19]. In order to demonstrate this point, we firstly use TQD [26, 27] to model LPTCs. Then, we record TQD outputs when the second-order lateral inhibition mechanism is implemented in medulla layer, denoted by TQD (LI). For comparison, TQD outputs without the second-order lateral inhibition are also recorded, denoted by TQD. Recorded tuning curves of TQD (LI) and TQD are shown in Fig. 30.
TQD (LI) and TQD show little difference in Fig. 30(a), (b) and (c). The most significant difference between the responses of TQD (LI) and TQD can be seen in Fig. 30(d), where the responses of both have a local maximum at height . As height increases, the response of TQD first shows a slight drop and finally stabilizes around . This indicates that TQD does not have height selectivity: increasing the height does not elicit a significant decrease in TQD outputs. However, the response of TQD (LI) decreases significantly after it reaches its local maximum and finally stays around . This indicates that TQD (LI) exhibits selectivity for target height and prefers targets whose height is smaller than . The strong height selectivity of TQD (LI) contradicts the biological finding that LPTCs are not size selective [16, 17, 18, 19]. From this point of view, we assert that although the second-order lateral inhibition mechanism can account for the size selectivity of STMDs, it should not be implemented in the medulla layer.
IV-C Parameter Sensitivity
In the first two experiments, we vary and , respectively, while fixing other parameters. More precisely, in the first experiment, we fix , as , and set as , , , , , . In the second experiment, we fix , as , and set as , , , , , . Recorded tuning curves of the first and second experiment are shown in Fig. 36 and 41, respectively.
In Fig. 36(a), 36(d), 41(a) and 41(d), we can see that and have little effect on LDTB sensitivity and height selectivity of DSTMD. From Fig. 36(b) and 41(b), we can see that the higher (or ), the lower optimal velocity. Especially, although variation ranges of and are different, tuning curves in these two figures show little difference. In Fig. 36(c) and 41(c), we can see that the optimal width increases with the increment of (or ). However, these two figures differ in variation range of the optimal width. The variation range of the optimal width in Fig. 36(c) (from to pixels) is smaller than that in Fig. 41(c) (from to pixels). This indicates that the effect of on width selectivity is not as significant as the effect of .
Here, we point out the meaning of and in DSTMD model. Fig. 31 shows the schematic of luminance changes of pixel A and B when a dark target successively passes A and B. Let , and denote the distance between pixel A and B (correlation distance, see Eq. (26) and Fig. 8), target velocity and width, respectively. In Figure. 31, we can see that once , and are given, and can be determined. In DSTMD, we set , , as , and , respectively. In this case, velocity tuning and width tuning curves of DSTMD will peak at and , respectively. Since is related to (), varying can directly change optimal velocity and width. Similarly, since is related to (), varying can change optimal velocity. Although is not directly related to , it still shows small effect on width selectivity of DSTMD (see Fig. 36(c)).
In the third experiment, we fix , as , and set as , , , , . Recorded tuning curves of DSTMD are shown in Fig. 46.
In Fig. 46(a), 46(b) and 46(c), we can see that has little effect on the LDTB sensitivity, velocity selectivity and width selectivity of DSTMD. However, in Fig. 46(d), the optimal height increases with . In DSTMD, height selectivity is shaped by the second-order lateral inhibition mechanism (see Eq. (27)). The sizes of the excitatory and inhibitory regions of this second-order lateral inhibition are determined by and (see Eq. (20) and Fig. 7). The higher , the larger the excitatory region. Since DSTMD prefers targets whose size equals that of the excitatory region, the larger the excitatory region, the higher the optimal height. In other words, the higher , the higher the optimal height.
IV-D Direction Selectivity and Motion Direction Estimation
In this section, we show how DSTMD encodes the motion direction of targets by combining its directionally selective outputs into a population vector.
We firstly record DSTMD responses to a small target moving against the white background. Size of the small target is set as pixels while luminance is set as . The coordinate of the small target at time t is where is set as pixel/second ( pixel/frame). The motion trace of the small target is shown in Fig. 47 where color denotes direction of the strongest response of DSTMD, i.e., . The motion direction of the small target varies between and when the small target moves along this trace.
We select six positions (A,B,C,D,E,F, in Fig. 47) at the motion trace. DSTMD outputs of these six positions are shown in Fig. 54. At each position, DSTMD has eight outputs tuned to eight different directions , denoted by eight different colors. Besides, since the small target successively passes these six positions, temporal delays can be seen between DSTMD responses of these six positions. In Fig. 61, DSTMD responses at each position are normalized and then shown in polar coordinate. From Fig. 61, we can see that in each subplot, the smaller difference between the preferred direction of DSTMD () and actual motion direction (actual motion direction can be seen in Fig. 68), the stronger DSTMD output tuned to this preferred direction ().
These directional selective DSTMD outputs are further encoded by population vector algorithm to estimate actual motion direction of the target (see Eq. (30)). The estimated motion direction () and actual motion direction at six positions are shown in Fig. 68. The difference between estimated motion direction and actual motion direction is smaller than at these six positions (see Table II). Furthermore, we estimate motion direction of the target at each point of this target trace. The maximal difference between estimated direction and actual motion direction is . This indicates provides a good estimation for motion direction of the small target.
Table II. Columns: Estimated Motion Direction | Actual Motion Direction | Difference
IV-E Target Detection in Cluttered Backgrounds
In this section, we test the ability of DSTMD for small target motion detection in cluttered backgrounds. Firstly, two metrics are defined to evaluate detection performances of DSTMD and ESTMD. That is,
where and are the detection rate and false alarm rate, respectively. The detected result is considered correct if the pixel distance between the ground truth and the result is within a threshold ( pixels).
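The two metrics can be sketched in code as follows. The sketch assumes one target per frame, takes the detection rate as the number of frames with a correct detection divided by the number of targets, and takes the false alarm rate as the number of false detections divided by the number of frames. These definitions, the function name and the distance threshold used in the example are assumptions.

```python
import math

def detection_metrics(detections_per_frame, truth_per_frame, max_distance):
    """Return (detection_rate, false_alarm_rate) for one image sequence.

    A detection counts as correct if it lies within max_distance pixels
    of the ground-truth position of that frame.
    """
    true_detections = 0
    false_detections = 0
    for detections, truth in zip(detections_per_frame, truth_per_frame):
        hit = False
        for point in detections:
            if math.dist(point, truth) <= max_distance:
                hit = True
            else:
                false_detections += 1
        if hit:
            true_detections += 1
    n_frames = len(truth_per_frame)
    return true_detections / n_frames, false_detections / n_frames

# Hypothetical detections over three frames.
detections = [[(10, 10), (50, 50)], [(30, 30)], []]
truth = [(11, 10), (12, 10), (13, 10)]
detection_rate, false_alarm_rate = detection_metrics(detections, truth, max_distance=5)
```

Sweeping the detection threshold applied to the model output and recomputing both metrics at each setting would trace out one ROC curve.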
In the first three experiments, we investigate the effect of three target parameters (size, luminance and velocity ) on detection performance of DSTMD. All input image sequences are produced using the same background in the first three experiments. A representative frame is given in Fig. 69. In all input image sequences, the background is moving from left to right and its velocity is set as pixel/second ( pixel/frame). A small target is moving against the cluttered background. The coordinate of the small target at time is set as where denotes horizontal velocity. In each experiment, we vary one of the target parameters (size, luminance and velocity ) and fix the other two parameters. The target parameter setting of the first three experiments is shown in Table III. LDTB of the small target during time period and the receiver operating characteristic (ROC) curves of three experiments with respect to target luminance, sizes and horizontal velocities , are shown in Fig. 73 and Fig. 77, respectively.
In Fig. 77(a), we can see that the lower the target luminance is, the better ESTMD and DSTMD perform. This is because a decrease in target luminance increases LDTB (see Fig. 73(a)). Since both ESTMD and DSTMD are sensitive to LDTB (see Fig. 25(a)), a higher LDTB always means a stronger model response. It can also be seen from Fig. 77(a) that, for a fixed target luminance, ESTMD performs better than DSTMD. Since DSTMD correlates signals along eight directions, it is likely to produce more noise in its correlation step than ESTMD. As a result, for a fixed detection rate, the false alarm rate of DSTMD is often larger than that of ESTMD.
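The LDTB argument above can be made concrete with a small numerical sketch. It assumes that LDTB is the absolute difference between the target luminance and the mean luminance of the neighbouring background, and that the target is darker than its surroundings; the function name and the 8-bit luminance values are hypothetical.

```python
def ldtb(target_luminance, background_patch):
    """Absolute difference between target luminance and mean local background."""
    mean_bg = sum(background_patch) / len(background_patch)
    return abs(target_luminance - mean_bg)

# Hypothetical 8-bit luminances of background pixels around the target.
background = [120, 135, 110, 128, 140, 125]

# For a target darker than its surroundings, lowering its luminance raises LDTB.
values = [ldtb(target, background) for target in (0, 25, 50)]
```

Under these assumptions the relation is monotonic only while the target stays darker than the local background; a target brighter than its surroundings would show the opposite trend.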
From Fig. 77(b) and 77(c), we can see that for both ESTMD and DSTMD, at a given false alarm rate, the target size of (or the horizontal velocity of ) yields a higher detection rate than the target sizes of and (or the horizontal velocities of and ). This is because ESTMD and DSTMD show size and velocity selectivity (see Fig. 25). Both models respond most strongly to the target whose size (or velocity) equals pixels (or pixel/second), and much more weakly to targets whose size (or velocity) is higher or lower than pixels (or pixel/second).
In the fourth and fifth experiments, we evaluate the performance of ESTMD and DSTMD in different backgrounds. Representative frames of the input image sequences are shown in Fig. 85(a) and Fig. 93(a), respectively. In both image sequences, the background moves from left to right at a velocity of pixel/second. A small target whose luminance and size are set as and pixels moves against the cluttered backgrounds. The coordinate of the small target at time equals where is set as pixel/second.
In Fig. 85(c) and 93(c), we can see that, for a fixed false alarm rate, the detection rate of ESTMD (or DSTMD) in Fig. 85(c) is much higher than in Fig. 93(c). This is because the background in Fig. 93(a) is more cluttered than that in Fig. 85(a). In other words, Fig. 93(a) contains more small-target-like features (background noise) which cannot be filtered out by ESTMD and DSTMD. Besides, most LDTB values in Fig. 85(b) are higher than while most LDTB values in Fig. 93(b) are lower than . Since ESTMD and DSTMD are LDTB sensitive, model responses to the small target in Fig. 85(a) are much stronger than those to the small target in Fig. 93(a).
For the fourth experiment, Fig. 85(d) shows the motion directions detected by DSTMD in the sample , , , , frame while Fig. 85(f) shows the motion directions detected by DSTMD from the th to the th frame. These detected motion directions are quite close to the actual motion directions shown in Fig. 85(e) and 85(g). Since ESTMD is not directionally selective, it cannot detect motion direction. For the fifth experiment, similar results can be seen in Fig. 93(d)-(g).
V Further Discussions
As discussed in the above sections, the presented neural network (DSTMD) demonstrated a reliable ability to detect small targets and their motion direction against complex backgrounds. For vision-based mobile robots, visual sensors are becoming more reliable and on-board computation more powerful. These advances make it possible for mobile robots, such as unmanned aerial vehicles (UAVs), equipped with the presented neural network to detect small moving targets in the distance in the real world.
In the insect visual system, numerous neurons work together to extract different cues from the real world. For example, LMCs extract motion information while amacrine cells are able to extract spatial contrast information from input visual signals [46, 47]. Integrating these two types of information may improve the detection performance of STMD neurons in cluttered backgrounds. In the future, the cooperation of these specialized neurons needs to be taken into consideration to further improve the neural network.
In engineering, small target motion detection can also be performed by infrared detection methods [48, 49, 50, 51, 52]. However, infrared methods require significant temperature differences between objects of interest (such as rockets and jets) and the background. This largely limits their application, because such significant temperature differences cannot be found between small flying insects and backgrounds such as trees and bushes in the natural world. Unlike infrared methods, the presented neural network (DSTMD) uses natural images as input and provides a vision-based method for small moving target detection.
In this paper, we proposed a visual neural network, DSTMD, to simulate directionally selective STMD neurons. Direction selectivity is obtained by correlating signals from two positions while size selectivity is introduced by the second-order lateral inhibition mechanism. The motion direction of detected targets is estimated by the population vector algorithm. Systematic experiments showed that the presented STMD-based neural network can detect not only small moving targets but also their motion direction against complex backgrounds. In the future, other neurons such as amacrine cells may be integrated into the neural network to extract other visual cues simultaneously from the same sequential images to improve detection performance.
This research was supported by EU FP7 Project HAZCEPT (318907), HORIZON 2020 project STEP2DYNA (691154), ENRICHME (643691) and the National Natural Science Foundation of China under the grant no. 11771347. We thank Marie Daniels for proofreading this manuscript.
-  S. Yue and F. C. Rind, “Collision detection in complex dynamic scenes using an lgmd-based visual neural network with feature enhancement,” IEEE transactions on neural networks, vol. 17, no. 3, pp. 705–716, May 2006.
-  K. Nordström, P. D. Barnett, and D. C. O’Carroll, “Insect detection of small targets moving in visual clutter,” PLoS biology, vol. 4, no. 3, p. e54, Feb. 2006.
-  R. Olberg, A. Worthington, and K. Venator, “Prey pursuit and interception in dragonflies,” Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, vol. 186, no. 2, pp. 155–162, Feb. 2000.
-  K. Nordström, “Neural specializations for small target detection in insects,” Current opinion in neurobiology, vol. 22, no. 2, pp. 272–278, Apr. 2012.
-  K. Nordström and D. C. O’Carroll, “Small object detection neurons in female hoverflies,” Proceedings of the Royal Society of London B: Biological Sciences, vol. 273, no. 1591, pp. 1211–1216, May 2006.
-  D. O’Carroll, “Feature-detecting neurons in dragonflies,” Nature, vol. 362, no. 6420, pp. 541–543, 1993.
-  P. D. Barnett, K. Nordström, and D. C. O’Carroll, “Retinotopic organization of small-field-target-detecting neurons in the insect visual system,” Current Biology, vol. 17, no. 7, pp. 569–578, Apr. 2007.
-  P. T. Gonzalez-Bellido, H. Peng, J. Yang, A. P. Georgopoulos, and R. M. Olberg, “Eight pairs of descending visual neurons in the dragonfly give wing motor centers accurate population vector of prey direction,” Proceedings of the National Academy of Sciences, vol. 110, no. 2, pp. 696–701, Jan. 2013.
-  S. D. Wiederman, P. A. Shoemaker, and D. C. O’Carroll, “A model for the detection of moving targets in visual clutter inspired by insect physiology,” PloS one, vol. 3, no. 7, p. e2784, Jul. 2008.
-  S. D. Wiederman and D. C. O’Carroll, “Biologically inspired feature detection using cascaded correlations of off and on channels,” Journal of Artificial Intelligence and Soft Computing Research, vol. 3, no. 1, pp. 5–14, Dec. 2013.
-  S. Judge and F. Rind, “The locust dcmd, a movement-detecting neurone tightly tuned to collision trajectories,” Journal of Experimental Biology, vol. 200, no. 16, pp. 2209–2216, 1997.
-  F. C. Rind and P. J. Simmons, “Orthopteran dcmd neuron: a reevaluation of responses to moving objects. i. selective responses to approaching objects,” Journal of Neurophysiology, vol. 68, no. 5, pp. 1654–1666, Nov. 1992.
-  G. Schlotterer, “Response of the locust descending movement detector neuron to rapidly approaching and withdrawing visual stimuli,” Canadian Journal of Zoology, vol. 55, no. 8, pp. 1372–1376, 1977.
-  P. J. Simmons and F. C. Rind, “Responses to object approach by a wide field visual neurone, the lgmd2 of the locust: characterization and image cues,” Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, vol. 180, no. 3, pp. 203–214, Feb. 1997.
-  K. Hausen, “The lobula-complex of the fly: structure, function and significance in visual behaviour,” in Photoreception and vision in invertebrates. Springer, 1984, pp. 523–559.
-  Y.-J. Lee, H. O. Jönsson, and K. Nordström, “Spatio-temporal dynamics of impulse responses to figure motion in optic flow neurons,” PloS one, vol. 10, no. 5, pp. 1–16, May 2015.
-  A. Borst and M. Egelhaaf, “Direction selectivity of blowfly motion-sensitive neurons is computed in a two-stage process,” Proceedings of the National Academy of Sciences, vol. 87, no. 23, pp. 9363–9367, Dec. 1990.
-  J. Haag, M. Egelhaaf, and A. Borst, “Dendritic integration of motion information in visual interneurons of the blowfly,” Neuroscience letters, vol. 140, no. 2, pp. 173–176, Jun. 1992.
-  A. Borst, M. Egelhaaf, and J. Haag, “Mechanisms of dendritic integration underlying gain control in fly motion-sensitive interneurons,” Journal of computational neuroscience, vol. 2, no. 1, pp. 5–18, Mar. 1995.
-  F. C. Rind and D. Bramwell, “Neural network based on the input organization of an identified neuron signaling impending collision,” Journal of Neurophysiology, vol. 75, no. 3, pp. 967–985, Mar. 1996.
-  S. Yue and F. C. Rind, “Redundant neural vision systems–competing for collision recognition roles,” IEEE Transactions on Autonomous Mental Development, vol. 5, no. 2, pp. 173–186, Apr. 2013.
-  S. Yue, F. C. Rind, M. S. Keil, J. Cuadri, and R. Stafford, “A bio-inspired visual collision detection mechanism for cars: Optimisation of a model of a locust neuron to a novel environment,” Neurocomputing, vol. 69, no. 13, pp. 1591–1598, Feb. 2006.
-  Q. Fu, C. Hu, and S. Yue, “Bio-inspired collision detector with enhanced selectivity for ground robotic vision system,” in British Machine Vision Conference 2016, Conference Proceedings.
-  C. Hu, F. Arvin, C. Xiong, and S. Yue, “A bio-inspired embedded vision system for autonomous micro-robots: the lgmd case,” IEEE Transactions on Cognitive and Developmental Systems, vol. 9, no. 3, pp. 241–254, Sep. 2016.
-  B. Hassenstein and W. Reichardt, “Systemtheoretische analyse der zeit-, reihenfolgen-und vorzeichenauswertung bei der bewegungsperzeption des rüsselkäfers chlorophanus,” Zeitschrift für Naturforschung B, vol. 11, no. 9-10, pp. 513–524, Oct. 1956.
-  N. Franceschini, A. Riehle, and A. Le Nestour, “Directionally selective motion detection by insect neurons,” in Facets of vision. Springer, 1989, pp. 360–390.
-  H. Eichner, M. Joesch, B. Schnell, D. F. Reiff, and A. Borst, “Internal structure of the fly elementary motion detector,” Neuron, vol. 70, no. 6, pp. 1155–1164, Jun. 2011.
-  D. A. Clark, L. Bursztyn, M. A. Horowitz, M. J. Schnitzer, and T. R. Clandinin, “Defining the computational structure of the motion detector in drosophila,” Neuron, vol. 70, no. 6, pp. 1165–1177, Jun. 2011.
-  A. Borst and M. Helmstaedter, “Common circuit design in fly and mammalian motion vision,” Nature neuroscience, vol. 18, no. 8, pp. 1067–1076, Jun. 2015.
-  E. J. Warrant, “Matched filtering and the ecology of vision in insects,” in The Ecology of Animal Senses. Springer, 2016, pp. 143–167.
-  J. Van Hateren, “Theoretical predictions of spatiotemporal receptive fields of fly lmcs, and experimental validation,” Journal of Comparative Physiology A, vol. 171, no. 2, pp. 157–170, Sep. 1992.
-  R. R. Harrison, “A low-power analog vlsi visual collision detector,” in NIPS, 2003, pp. 987–994.
-  L. Freifeld, D. A. Clark, M. J. Schnitzer, M. A. Horowitz, and T. R. Clandinin, “Gabaergic lateral interactions tune the early stages of visual processing in drosophila,” Neuron, vol. 78, no. 6, pp. 1075–1089, Jun. 2013.
-  R. Behnia, D. A. Clark, A. G. Carter, T. R. Clandinin, and C. Desplan, “Processing properties of on and off pathways for drosophila motion detection,” Nature, vol. 512, no. 7515, p. 427, Aug. 2014.
-  B. Bausenwein and K.-F. Fischbach, “Activity labeling patterns in the medulla of drosophila melanogaster caused by motion stimuli,” Cell and tissue research, vol. 270, no. 1, pp. 25–35, Oct. 1992.
-  S.-y. Takemura, A. Bharioke, Z. Lu, A. Nern, S. Vitaladevuni, P. K. Rivlin, W. T. Katz, D. J. Olbris, S. M. Plaza, P. Winston et al., “A visual motion detection circuit suggested by drosophila connectomics,” Nature, vol. 500, no. 7461, p. 175, Aug. 2013.
-  R. Behnia and C. Desplan, “Visual circuits in flies: beginning to see the whole picture,” Current opinion in neurobiology, vol. 34, pp. 125–132, Oct. 2015.
-  H. H. Yang, F. St-Pierre, X. Sun, X. Ding, M. Z. Lin, and T. R. Clandinin, “Subcellular imaging of voltage and calcium signals reveals neural processing in vivo,” Cell, vol. 166, no. 1, pp. 245–257, Jun. 2016.
-  K. Shinomiya, T. Karuppudurai, T.-Y. Lin, Z. Lu, C.-H. Lee, and I. A. Meinertzhagen, “Candidate neural substrates for off-edge motion detection in drosophila,” Current Biology, vol. 24, no. 10, pp. 1062–1070, May 2014.
-  D. M. Bolzon, K. Nordström, and D. C. O’Carroll, “Local and large-range inhibition in feature detection,” Journal of Neuroscience, vol. 29, no. 45, pp. 14 143–14 150, Nov. 2009.
-  J. C. Tuthill and B. G. Borghuis, “Four to foxtrot: how visual motion is computed in the fly brain,” Neuron, vol. 89, no. 4, pp. 677–680, Feb. 2016.
-  M.-J. Escobar, D. Pezo, and P. Orio, “Mathematical analysis and modeling of motion direction selectivity in the retina,” Journal of Physiology-Paris, vol. 107, no. 5, pp. 349–359, Nov. 2013.
-  R. M. Olberg, “Object-and self-movement detectors in the ventral nerve cord of the dragonfly,” Journal of Comparative Physiology A: Neuroethology, Sensory, Neural, and Behavioral Physiology, vol. 141, no. 3, pp. 327–334, Sep. 1981.
-  P. T. Gonzalez-Bellido, S. T. Fabian, and K. Nordström, “Target detection in insects: optical, neural and behavioral optimizations,” Current opinion in neurobiology, vol. 41, pp. 122–128, Dec. 2016.
-  A. D. Straw, “Vision egg: an open-source library for realtime visual stimulus generation,” Frontiers in neuroinformatics, vol. 2, no. 4, Nov. 2008.
-  N. Vogt and C. Desplan, “The first steps in drosophila motion detection,” Neuron, vol. 56, no. 1, pp. 5–7, Oct. 2007.
-  L. Zheng, G. G. de Polavieja, V. Wolfram, M. H. Asyali, R. C. Hardie, and M. Juusola, “Feedback network controls photoreceptor output at the layer of first visual synapses in drosophila,” The Journal of general physiology, vol. 127, no. 5, pp. 495–510, Apr. 2006.
-  C. Gao, D. Meng, Y. Yang, Y. Wang, X. Zhou, and A. G. Hauptmann, “Infrared patch-image model for small target detection in a single image,” IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 4996–5009, Sep. 2013.
-  A. P. Tzannes and D. H. Brooks, “Detecting small moving objects using temporal hypothesis testing,” IEEE Transactions on Aerospace and Electronic Systems, vol. 38, no. 2, pp. 570–586, Aug. 2002.
-  T.-W. Bae, “Small target detection using bilateral filter and temporal cross product in infrared images,” Infrared Physics & Technology, vol. 54, no. 5, pp. 403–411, Sep. 2011.
-  W. A. C. Schmidt, “Modified matched filter for cloud clutter suppression,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 6, pp. 594–600, Jun. 1990.
-  S. Kim and J. Lee, “Scale invariant small target detection by optimizing signal-to-clutter ratio in heterogeneous background for infrared search and track,” Pattern Recognition, vol. 45, no. 1, pp. 393–406, Jan. 2012.