This is not a Wikipedia article: it is an individual user's work-in-progress page, and may be incomplete and/or unreliable. For guidance on developing this draft, see Wikipedia:So you made a userspace draft.
In neuroscience, saliency detection is considered a key attentional mechanism that facilitates learning and survival. The Kadir–Brady saliency detector is a feature detection algorithm that borrows this concept from neuroscience and aims to detect representative, discriminative, and hence salient regions in an image. These properties mean it often performs well in the task of object class recognition.
Its similarity-invariant version was first introduced by Timor Kadir and Michael Brady in 2001 [1]. The affine-invariant version followed in 2004, by Timor Kadir, Andrew Zisserman, and Michael Brady [2]. The following article presents the details of the algorithm and an evaluation of its performance.
Many computer vision and image processing applications work directly with features extracted from an image, rather than with the raw image, for example for computing image correspondences [3] [4] [5] [6] [7], or for learning object categories [8] [9] [10] [11]. Depending on the application, different characteristics are preferred. However, there are three broad classes of image change under which good performance may be required: global transformations (geometric and photometric changes of the whole image), image perturbations (such as noise and blur), and intra-class variation (differences between instances of the same object class).
All feature detection algorithms try to detect regions that are stable under these three types of image change. Instead of finding corners [12] [13], blobs [4] [7], or regions of any other specific shape, the Kadir–Brady saliency detector looks for regions that are locally complex and globally discriminative. Such regions usually remain more stable under these types of image change.
In information theory, the Shannon entropy quantifies the complexity (information content) of a probability distribution p as H(p) = −Σ_i p_i log p_i. Higher entropy means p is more complex, and hence more unpredictable.
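As a concrete check of this definition, here is a minimal Python sketch (the function name is illustrative, not from any cited source):

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log2(p_i), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# A uniform distribution over 4 outcomes has the maximum entropy log2(4) = 2 bits;
# a near-degenerate distribution is far more predictable, hence lower entropy.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0
print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))
```

The `if pi > 0` guard implements the usual convention 0 · log 0 = 0.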
To measure the complexity of an image region around a point x with shape (scale) s, a descriptor D that takes values {d_1, ..., d_r} is defined (e.g. in an 8-bit grey-level image, D would range from 0 to 255 for each pixel). Then p(d_i, x, s), the probability of descriptor value d_i occurring in the region, can be computed by counting the occurrences of each descriptor value. The entropy of the image region can then be computed as H_D(x, s) = −Σ_i p(d_i, x, s) log p(d_i, x, s).
Using this entropy equation we can calculate H_D(x, s) for every point x and region shape s. As shown in Figure 3 (a), a more complex region, such as the eye region, has a more complex intensity distribution and hence higher entropy.
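For grey-level descriptors, this amounts to the entropy of the region's grey-level histogram, which can be sketched directly (`region_entropy` is an illustrative name, not from the original implementation):

```python
import math

def region_entropy(pixels, levels=256):
    """H_D of a region: entropy of the grey-level histogram of its pixels."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    n = len(pixels)
    # Probability of each descriptor value, from occurrence counts.
    probs = (c / n for c in counts if c > 0)
    return sum(-p * math.log2(p) for p in probs)

flat = [128] * 64           # a constant patch: a single descriptor value
textured = list(range(64))  # 64 distinct grey levels, all equally likely
print(region_entropy(flat))      # 0.0
print(region_entropy(textured))  # 6.0
```

A flat patch carries no information (zero entropy), while a patch with many equally likely grey levels, such as the eye region of the figure, scores high.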
H_D(x, s) is a good measure of local complexity. However, entropy captures only the statistics of the local attribute, not its spatial arrangement. For example, Figure 3 (b) shows the eye region and three permutations of its pixels, all of which have the same entropy. However, these four regions are not equally discriminative under scale change, as Figure 3 (b) also shows. This observation is used to define saliency.
The following subsections discuss different methods of selecting regions that have high local complexity and are discriminative among different regions.
The first version of the Kadir–Brady saliency detector [1] finds only salient regions that are invariant under similarity transformations. The algorithm finds circular regions at different scales (see Figure 5). In other words, given H_D(x, s), where s is the scale parameter of a circular region, the algorithm selects a set of circular regions with high saliency.
The method consists of three steps:

1. Calculate the entropy H_D(x, s) for each point x over a range of scales s.
2. Select the scales s_p at which the entropy over scale is peaked (a local maximum).
3. Weight the entropy at each peak scale by W_D(x, s), a measure of how strongly the descriptor statistics change across scale at that point.

The final saliency Y_D(x, s_p) is the product of H_D(x, s_p) and W_D(x, s_p).
For each x the method picks a scale s_p and calculates the saliency score Y_D(x, s_p). By comparing Y_D across different points, the detector can rank points by saliency. For example, in Figure 4 the blue circle has higher saliency than the red circle.
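The three steps can be sketched end-to-end for a single point x. This is a simplified illustration, not the paper's implementation: the histogram bin count, the use of the scale index as a stand-in for s, and the one-sided histogram difference used for W are all illustrative choices.

```python
import math

def hist(pixels, bins=16):
    """Normalized grey-level histogram (descriptor PDF) of a patch."""
    counts = [0] * bins
    for v in pixels:
        counts[v * bins // 256] += 1
    return [c / len(pixels) for c in counts]

def entropy(p):
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)

def best_saliency(patches):
    """patches: pixel lists for one point x at increasing scales s = 1, 2, ...
    Step 1: entropy H per scale; step 2: keep scales where H peaks over scale;
    step 3: weight by the inter-scale histogram change W; saliency Y = H * W.
    Returns (peak_scale_index, Y) for the best peak, or None if no peak."""
    hists = [hist(p) for p in patches]
    H = [entropy(h) for h in hists]
    best = None
    for i in range(1, len(H) - 1):
        if H[i - 1] <= H[i] >= H[i + 1]:          # entropy peak over scale
            W = (i + 1) * sum(abs(a - b)          # scale index stands in for s
                              for a, b in zip(hists[i], hists[i - 1]))
            Y = H[i] * W
            if best is None or Y > best[1]:
                best = (i, Y)
    return best
```

Running this over every point x and ranking by Y gives the ordering the detector uses to compare, e.g., the blue and red circles of Figure 4.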
The preceding method is invariant to the similarity group of geometric transformations and to photometric shifts. However, as mentioned in the opening remarks, the ideal detector should detect regions that are invariant under viewpoint change. Several detectors [5] [7] [14] can detect affine-invariant regions, which are a better approximation of viewpoint change than similarity transformations.
To detect affine-invariant regions, the detector needs to detect ellipses, as in Figure 6. The region shape is now parameterized by three parameters (s, ρ, θ), where ρ is the axis ratio and θ the orientation of the ellipse.
This modification expands the search space of the shape from a single scale to a set of three parameters, so the complexity of the affine-invariant saliency detector increases. In practice, the affine-invariant saliency detector starts with the set of points and scales generated by the similarity-invariant saliency detector; it then iteratively approximates a suboptimal parameter solution and outputs ellipses.
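Under the (s, ρ, θ) parameterization, a sampling mask for one candidate ellipse can be built as below. Taking the semi-axes as s·√ρ and s/√ρ (so that ρ = 1 recovers the circle of the similarity-invariant case) is an assumption made for illustration:

```python
import math

def ellipse_mask(cx, cy, s, rho, theta, width, height):
    """Boolean grid marking the pixels inside the ellipse centred at (cx, cy)
    with scale s, axis ratio rho, and orientation theta."""
    a, b = s * math.sqrt(rho), s / math.sqrt(rho)   # semi-axes
    ct, st = math.cos(theta), math.sin(theta)
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            u = (x - cx) * ct + (y - cy) * st       # rotate into ellipse frame
            v = -(x - cx) * st + (y - cy) * ct
            row.append((u / a) ** 2 + (v / b) ** 2 <= 1.0)
        mask.append(row)
    return mask

# rho = 1, theta = 0 degenerates to a circle of radius s.
m = ellipse_mask(5, 5, 3.0, 1.0, 0.0, 11, 11)
```

The descriptor histogram and entropy are then computed over the pixels inside the mask, exactly as in the circular case.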
Although the similarity-invariant saliency detector is faster than the affine-invariant one, it has the drawback of favoring isotropic structures, since its discriminative measure is computed over an isotropic (circular) scale. To summarize, the affine-invariant saliency detector is invariant to affine transformations and able to detect more general salient regions.
Figure 7 shows a comparison between the similarity-invariant and affine-invariant saliency detectors.
It is intuitive to pick the points with the highest saliency directly and stop once a certain threshold on the number of points or on the saliency value is reached. However, natural images contain noise and motion blur, both of which act as randomisers and generally increase entropy, affecting previously low entropy values more than high ones.
A more robust method would be to pick regions rather than points in entropy space. Although the individual pixels within a salient region may be affected at any given instant by the noise, it is unlikely to affect all of them in such a way that the region as a whole becomes non-salient.
A simple clustering algorithm is used at the end of the algorithm. It works by selecting highly salient points that have local support - that is, nearby points with similar saliency and scale. Each region must be sufficiently distant from all others in (x, y, s) space to qualify as a separate entity.
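A simplified sketch of such greedy selection is below. It is essentially non-maximum suppression in (x, y, s) space; the actual GreedyCluster1.m additionally checks for local support, which is omitted here:

```python
def greedy_cluster(points, min_dist):
    """points: (x, y, scale, saliency) tuples. Repeatedly keep the most
    salient remaining point and discard every point closer than min_dist
    to it in (x, y, scale) space."""
    remaining = sorted(points, key=lambda p: p[3], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [p for p in remaining
                     if sum((a - b) ** 2 for a, b in zip(p[:3], best[:3]))
                     > min_dist ** 2]
    return kept
```

With a suitable min_dist, a cluster of noisy near-duplicate detections collapses to its single most salient representative.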
For details, refer to section 5 of [1]. The algorithm was implemented by Timor Kadir as GreedyCluster1.m in Matlab.
In the field of computer vision, different feature detectors have been evaluated in several tests. The most thorough evaluation is published in [15]. The following subsections discuss the performance of the Kadir–Brady saliency detector on a subset of the tests in that paper.
To measure the consistency of regions detected under a global transformation, the repeatability score, first proposed by Mikolajczyk and Cordelia Schmid in [16] [5], is calculated as follows.
Firstly, the overlap error ε of a pair of corresponding ellipses μ_a and μ_b, each on a different image, is defined as

ε = 1 − (μ_a ∩ A^T μ_b A) / (μ_a ∪ A^T μ_b A)

where A is the locally linearized affine transformation of the homography between the two images, and μ_a ∩ A^T μ_b A and μ_a ∪ A^T μ_b A represent the areas of intersection and union of the ellipses respectively. Note that μ_a is scaled to a fixed scale to take account of the size difference between detected regions. A pair of ellipses is deemed to correspond only if ε is smaller than a certain threshold.
Then the repeatability score for a given pair of images is computed as the ratio between the number of region-to-region correspondences and the smaller of the number of regions in the pair of images, where only the regions located in the part of the scene present in both images are counted. In general we would like a detector to have a high repeatability score and a large number of correspondences.
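Given precomputed overlap errors for the candidate region pairs, the score reduces to a count and a ratio. The 0.4 default threshold below is an illustrative value, not one taken from this text:

```python
def repeatability(overlap_errors, n_regions_a, n_regions_b, eps_max=0.4):
    """overlap_errors: overlap error of each candidate corresponding pair
    (only regions visible in both images should be supplied). A pair counts
    as a correspondence when its error is below eps_max; the score divides
    the correspondence count by the smaller of the two region counts."""
    matches = sum(1 for e in overlap_errors if e < eps_max)
    return matches / min(n_regions_a, n_regions_b)

score = repeatability([0.1, 0.35, 0.7], n_regions_a=10, n_regions_b=8)
print(score)  # 0.25
```

A good detector should drive both this ratio and the absolute number of correspondences up.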
The specific global transformations tested in the test dataset are viewpoint change, zoom and rotation, image blur, JPEG compression, and illumination change.
The performance of the Kadir–Brady saliency detector is inferior to that of most other detectors [17] [4] [5] [7], mainly because the number of points it detects is usually lower than that of other detectors.
The precise procedure is given in the Matlab code from Detector evaluation #Software implementation.
In the task of object class categorization, the ability to detect similar regions despite intra-class variation and image perturbations is critical. In [2], a repeatability measure over intra-class variation and image perturbations is proposed. The following subsections introduce its definition and discuss the performance.
Suppose there are a set of images of the same object class, e.g. motorbikes. A region detection operator which is unaffected by intra-class variation will reliably select regions on corresponding parts of all the objects, say the wheels, engine or seat for motorbikes.
Repeatability over intra-class variation measures the (average) number of correct correspondences over the set of images, where the correct correspondences are selected manually.
A region is matched if it fulfils three requirements:
In detail the average correspondence score S is measured as follows.
N regions are detected on each image of the M images in the dataset. Then, for a particular reference image i, the correspondence score S_i is given by the proportion of corresponding regions to detected regions across all the other images in the dataset, i.e. S_i = (total number of correspondences) / (N(M − 1)).
The score is computed for M/2 different selections of the reference image and averaged to give S. The score is evaluated as a function of the number of detected regions N.
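Reading the normalization as the N(M − 1) regions detected outside the reference image (my interpretation of "the proportion of corresponding to detected regions for all the other images"), S can be sketched as:

```python
def average_correspondence_score(matches_per_ref, n_regions, n_images):
    """matches_per_ref: for each of the M/2 chosen reference images, the
    total number of (manually verified) correspondences found in the other
    images. Each per-reference score S_i divides by the N*(M-1) regions
    detected in those images; S is the mean over the reference selections."""
    denom = n_regions * (n_images - 1)
    scores = [m / denom for m in matches_per_ref]
    return sum(scores) / len(scores)

# Two reference images, N = 10 regions per image, M = 4 images.
S = average_correspondence_score([18, 12], n_regions=10, n_images=4)
print(S)
```

Averaging over several reference images keeps one unusually easy or hard image from dominating the score.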
The Kadir–Brady saliency detector gives the highest score across all three test classes (motorbike, car, and face). As illustrated in Figure 8, the saliency detector places most detections near the object. In contrast, the maps of other detectors (the Difference-of-Gaussian (DoG) blob detector [14] and multi-scale Harris (MSHar) with Laplacian scale selection [5]) show a much more diffuse pattern over the entire area, caused by poor localization and false responses to background clutter.
In order to test insensitivity to image perturbation, the data set is split into two parts: the first contains images with a uniform background and the second, images with varying degrees of background clutter. If the detector is robust to background clutter then the average correspondence score S should be similar for both subsets of images.
In this test the saliency detector also outperforms the other detectors, due to three reasons:
objects and background.
The saliency detector is most useful in the task of object recognition, whereas several other detectors are more useful in the task of computing image correspondences. However, in the task of 3D object recognition (see #External links), where all three types of image change are combined, the saliency detector may still be powerful.
Object Category Recognition (Constellation Model)