Foundations of Computer Vision
Computational Geometry, Visual Image Structures and Object Shape Detection
James F. Peters
Intelligent Systems Reference Library
Volume 124
Series editors
Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: [email protected]
The aim of this series is to publish a Reference Library, including novel advances
and developments in all aspects of Intelligent Systems in an easily accessible and
well structured form. The series includes reference works, handbooks, compendia,
textbooks, well-structured monographs, dictionaries, and encyclopedias. It contains
well integrated knowledge and current information in the field of Intelligent
Systems. The series covers the theory, applications, and design methods of
Intelligent Systems. Virtually all disciplines such as engineering, computer science,
avionics, business, e-commerce, environment, healthcare, physics and life science
are included.
James F. Peters
Electrical and Computer Engineering
University of Manitoba
Winnipeg, MB
Canada
This book introduces the foundations of computer vision. The principal aim of
computer vision (also called machine vision) is to reconstruct and interpret natural
scenes based on the content of images captured by various cameras (see, e.g.,
R. Szeliski [191]). Computer vision systems include such things as survey satellites,
robotic navigation systems, smart scanners, and remote sensing systems. In this
study of computer vision, the focus is on extracting useful information from images
(see, e.g., S. Prince [162]). Computer vision systems typically emulate human
visual perception. The hardware of choice in computer vision systems is some form
of digital camera, programmed to approximate visual perception. Hence, there are
close ties between computer vision, digital image processing, optics, photometry
and photonics (see, e.g., E. Stijns and H. Thienpont [188]).
From a computer vision perspective, photonics is the science of light in the
capture of visual scenes. Image processing is the study of digital image formation
(e.g., conversion of analogue optical sensor signals to digital signals), manipulation
(e.g., image filtering, denoising, cropping), feature extraction (e.g., pixel intensity,
gradient orientation, gradient magnitude, edge strength), description (e.g., image
edges and texture) and visualization (e.g., pixel intensity histograms). See, e.g., the
mathematical frameworks for image processing by B. Jähne [87] and S.G. Hoggar
[82], extending to a number of practitioner views of image processing provided,
for example, by M. Sonka and V. Hlavac and R. Boyle [186], W. Burger and
M.J. Burge [21], R.C. Gonzalez and R.E. Woods [58], R.C. Gonzalez and R.E.
Woods and S.L. Eddins [59], V. Hlavac [81], and C. Solomon and T. Breckon
[184]. This useful information provides the bedrock for the focal points of computer
visionists, namely, image object shapes and patterns that can be detected, analyzed
and classified (see, e.g., [142]). In effect, computer vision is the study of digital
image structures and patterns, which is a layer of image analysis above that of
image processing and photonics. Computer vision includes image processing and
photonics in its bag of tricks in its pursuit of image geometry and image region
patterns.
In addition, it is helpful to cultivate an intelligent systems view of digital images
with an eye to discovering hidden patterns such as repetitions of convex enclosures
of image regions and embedded image structures such as clusters of points in image
regions of interest. The discovery of such structures is made possible by quantizers.
A quantizer restricts a set of values (usually continuous) to a discrete value. In its
simplest form in computer vision, a quantizer observes a particular target pixel
intensity and selects the nearest approximating values in the neighbourhood of the
target. The output of a quantizer is called a codebook by A. Gersho and R.M. Gray
[55, §5.1, p. 133] (see, also, S. Ramakrishnan, K. Rose and A. Gersho [164]).
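To make the idea concrete, here is a minimal scalar quantizer sketch in Matlab (not the Gersho–Gray design; the uniform 8-level codebook and the built-in cameraman.tif test image are illustrative assumptions): each pixel intensity is mapped to the nearest codebook level.

% Minimal scalar quantizer sketch: nearest codebook level per pixel
im = imread('cameraman.tif');        % built-in greyscale image
codebook = 0:32:224;                 % assumed: 8 uniform intensity levels
d = abs(double(im(:)) - codebook);   % distance to each level (implicit expansion)
[~, idx] = min(d, [], 2);            % index of nearest codebook level
q = uint8(reshape(codebook(idx), size(im)));   % quantized image
figure, subplot(1,2,1), imshow(im), title('original');
subplot(1,2,2), imshow(q), title('8-level quantized');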
In the context of image mesh overlays, the Gersho–Gray quantizer is replaced by
geometry-based quantizers. A geometry-based quantizer restricts an image region
to its shape contour and observes in an image a particular target object shape
contour, which is compared with other shape contours that have approximately the
same shape as the target. In the foundations of computer vision, geometry-based
quantizers observe and compare image regions with approximately the same
regions such as mesh maximal nucleus clusters (MNCs) compared with other
nucleus clusters. A maximal nucleus cluster (MNC) is a collection of image mesh
polygons surrounding a mesh polygon called the nucleus (see, e.g., J.F. Peters and
E. İnan on Edelsbrunner nerves in Voronoï tessellations of images [150]). An
image mesh nucleus is a mesh polygon that is the centre of a collection of adjacent
polygons. In effect, every mesh polygon is a nucleus of a cluster of polygons.
However, only those nuclei with the most adjacent polygons are maximal.
A maximal image mesh nucleus is a mesh nucleus with the highest number of
adjacent polygons. MNCs are important in computer vision, since what we will call
a MNC contour approximates the shape of an underlying image object. A Voronoï
tessellation of an image is a tiling of the image with polygons. A Voronoï tessel-
lation of an image is also called a Voronoï mesh. A sample tiling of a musician
image in Fig. 0.1.1 is shown in Fig. 0.1.2. A sample nucleus of the musician image
tiling is shown in Fig. 0.2.1. The red dots inside each of the tiling polygons are
examples of Voronoï region (polygon) generating points. For more about this, see
Sect. 1.22.1. This musician mesh nucleus is the centre of a maximal nucleus cluster
shown in Fig. 0.2.2. This is the only MNC in the musician image mesh in Fig. 0.1.2.
This MNC is also an example of a Voronoï mesh nerve. The study of image MNCs
takes us to the threshold of image geometry and image object shape detection. For
more about this, see Sect. 1.22.2.
Each image tiling polygon is a convex hull of the interior and vertex pixels.
A convex hull of a set of image points is the smallest convex set of the set of points.
A set of image points A is a convex set, provided all of the points on every straight
line segment between any two points in the set A are contained in the set. Ultimately,
knowledge discovery is at the heart of computer vision. Both knowledge and
understanding of digital images can be used in the design of computer vision
systems. In vision system designs, there is a need to understand the composition
and structure of digital images as well as the methods used to analyze captured
images.
The focus of this volume is on the study of raster images. The sequel to this
volume will focus on vector images, which are composed of points (vectors), lines
and curves. The basic content of every raster image consists of pixels
(e.g., distinguished pixels called sites or mesh generating points), edges (e.g.,
common, parallel, intersecting, convex, concave, straight, curved, connected,
unconnected), angles (e.g., vector angle, angle between vectors, pixel angle), image
geometry (e.g., Voronoï regions [141], Delaunay triangulations [140]), colour,
shape, and texture. Many problems in computer vision and scene analysis are
solved by finding the most probable values of certain hidden or unobserved image
variables and structures (see, e.g., P. Kohli and P.H.S. Torr [96]). Such structures
and variables include the topological neighbourhood of a pixel, convex hulls of sets
of pixels, nearness (and apartness) of image structures and pixel gradient distri-
butions as well as feature vectors that describe elements of captured scenes.
Other computer vision problems include image matching, feature selection,
optimal classifier design, image region measurement, interest point identification,
contour grouping, segmentation, registration, matching, recognition, image clus-
tering, pattern clustering in F. Escolano, P. Suau, B. Bonev [45] and in N. Paragios,
Y. Chen, O. Faugeras [138], landmark and point shape matching, image warping,
shape gradients [138], false colouring, pixel labelling, edge detection, geometric
structure detection, topological neighbourhood detection, object recognition, and
image pattern recognition.
In computer vision, the focus is on the detection of the basic geometric structures
and object shapes commonly found in digital images. This leads into a study of the
basics of image processing and image analysis as well as vector space and com-
putational geometry views of images. The basics of image processing include
colour spaces, filtering, edge detection, spatial description and image texture.
Digital images are examples of Euclidean spaces (both 2D and 3D). Hence, vector
space views of digital images are a natural outcome of their basic character.
A digital image structure is basically a geometric or a visual topological structure.
Examples of image structures are image regions, line segments, generating points
(e.g. Lowe keypoints), set of pixels, neighbourhood of a pixel, half spaces, convex
sets of pixels and convex hulls of sets of image pixels. For example, such structures
can be viewed in terms of image regions nearest selected points or collections of
image regions with a specified range of diameters. An image region is a set of
image points (pixels) in the interior of a digital image. The diameter of any image
region is the maximum distance between a pair of points in the region. Such
structures can also be found in line segments connected between selected points to
form triangular regions in 2D and 3D images.
Such structures are also commonly found in 2D and 3D images in the inter-
section of closed half spaces to form either convex hulls of a set of points or what
G.M. Ziegler calls polytopes [221]. An image half space is the set of all points
either above or below a line. In all three cases, we obtain a regional view of digital
images. For more about polytopes, see Appendix B.15.
Every image region has a shape. Some region shapes are more interesting than
others. The interesting image region shapes are those containing objects of interest.
These regional views of images lead to various forms of image segmentations that
have practical value when it comes to recognizing objects in images. In addition,
detection of image region shapes of interest leads to the discovery of image
patterns that transcend the study of texels in image processing. A texel is an image
region represented by an array of pixels. For more about shapes, see Appendix B.18
on shape and shape boundaries.
Image analysis focuses on various digital image measurements (e.g., pixel size,
pixel adjacency, pixel feature values, pixel neighbourhoods, pixel gradient, close-
ness of image neighbourhoods). Three standard region-based approaches in image
analysis are isodata thresholding (binarizing images), watershed segmentation
(computed using a distance map from foreground pixels to background regions),
and non-maximum suppression (finding local maxima by suppressing all pixels that
are less likely than their surrounding pixels) [212].
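As a hedged illustration of two of these approaches (assuming the Image Processing Toolbox and the built-in coins.png image; Otsu's graythresh is used here as a stand-in for isodata thresholding):

% Sketch: thresholding and distance-map watershed segmentation
im = imread('coins.png');            % built-in greyscale image
bw = im2bw(im, graythresh(im));      % binarize (Otsu threshold)
D = -bwdist(~bw);                    % distance map from background pixels
L = watershed(D);                    % watershed transform of the map
L(~bw) = 0;                          % keep labels inside the foreground only
figure, imshow(label2rgb(L, 'jet', 'w')), title('watershed regions');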
In image analysis, object and background pixels are associated with different
adjacencies (neighbourhoods) by T. Aberra [3]. There are three basic types of
neighbourhoods, namely, Rosenfeld adjacency neighbourhoods [171, 102],
Hausdorff neighbourhoods [74, 75] and descriptive neighbourhoods in J.F. Peters
[142] and in C.J. Henry [77, 76]. Using different geometries, an adjacency
The principal aim of computer vision is to reconstruct and interpret natural scenes
based on the content of images captured by digital cameras [190]. A natural scene
is that part of the visual field that is captured either by human visual perception or by
optical sensor arrays.
The reconstruction and interpretation of natural scenes is made easier by tiling (tes-
sellating) a scene image with known geometric shapes such as triangles (Delaunay
triangulation approach) and polygons (Voronoï diagram approach). This is a divide-
and-conquer approach. Examples of this approach in computer vision are found in
stereo vision, where a scene is viewed by a pair of cameras with centres C and C′.
See, for example, the pair of epipoles and epipolar lines in Fig. 1.5.
Video stippling: Stippling renders an image using point sets, elementary shapes
and colours. The core technique in video stippling is Voronoï tessellation of video
frames. This is the approach by T. Houit and F. Nielsen in [85]. This article contains
a good introduction to Voronoï diagrams superimposed on video frame images (see
[85, Sect. 2, pp. 2–3]). Voronoï diagrams are useful in segmenting images. This
leads to what are known as Dirichlet tessellated images, leading to a new form of k-
means clusters of image regions (see Fig. 1.6 for steps in the Voronoï segmentation
method). This form of image segmentation uses cluster centroid proximity to
find image clusters. This is the approach used by R. Hettiarachchi and J.F. Peters in
[79]. Voronoï manifolds are introduced by J.F. Peters and C. Guadagni in [146].
A manifold is a topological space that is locally Euclidean, i.e., around every point
in the manifold there is an open neighbourhood. A nonempty set X with a topology
τ on it is a topological space. A collection of open sets τ on a nonempty open
set X is a topology on X , provided it has certain properties (see Appendix B.19
for the definitions of open set and topology). An open set is a nonempty set of
points A in a space X that contains all points sufficiently close to A but does not
include its boundary points.
A graph G is a geodetic graph, provided, for any two vertices p, q on G, there is at most one
shortest path between p and q. A geodetic line is a straight line, since the shortest
path between the endpoints of a straight line is the line itself. For more about this,
see J. Topp [195]. For examples, see Appendix B.7.
Convex Hulls: A convex hull of a set of points A (denoted by convh A) is the
smallest convex set containing A. A nonempty set A in an n-dimensional Euclid-
ean space is a convex set (denoted by conv A), provided every straight line segment
between any two points in the set is also contained in the set. Voronoï tessellation
of a digital image results in image region clusters that are helpful in shape detec-
tion and the analysis of complex systems such as the cosmic web. This approach
is used by J. Hidding, R. van de Weygaert, G. Vegter, B.J.T. Jones and M. Teillaud
in [80]. For a pair of 3D convex hulls, see Fig. 1.8. For more about convex hulls,
see Appendix B.3.
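A minimal Matlab sketch of a planar convex hull (the 30 random points are an illustrative assumption):

% Sketch: convh A for 30 random planar points
A = rand(30, 2);                % 30 random points in the unit square
k = convhull(A(:,1), A(:,2));   % indices of the hull vertices
figure, plot(A(:,1), A(:,2), 'b.', 'MarkerSize', 12), hold on
plot(A(k,1), A(k,2), 'r-');     % boundary of the smallest convex set containing A
axis equal, title('convex hull of A')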
These methods use image areas instead of pixels to extract image shape and object
information. In other words, we use computational geometry in the interpretation and
analysis of scene images.
Let S be any set of selected pixels in a digital image and let p ∈ S. The pixels in S
are called sites (or generating points) to distinguish them from other pixels in an
image. Recall that Euclidean distance between a pair of points x, y in the Euclidean
plane is denoted by $\|x - y\|$ and defined by
$\|x - y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}.$
The Voronoï region $V_p$ of a site $p \in S$ is defined by

$V_p = \{x \in E : \|x - p\| \le \|x - q\| \text{ for all } q \in S\}.$
Every site in S belongs to only one Voronoï region. A digital image covered with
Voronoï regions is called a tessellated image. Notice that each Voronoï region is
a convex polygon. This means that all of the points on a straight edge connecting
any pair of points in a Voronoï region belong to the region. A complete set of
Voronoï regions covering an image is called a Voronoï diagram or Voronoï mesh.
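The definition of $V_p$ can be applied directly to tessellate a small image grid, as in the following sketch (the 12 random sites and the 128 × 128 grid are illustrative assumptions):

% Sketch: label each grid point with its nearest site (Voronoi regions)
[X, Y] = meshgrid(1:128, 1:128);   % pixel coordinates
S = 128*rand(12, 2);               % assumed: 12 random sites
D = zeros(128, 128, 12);
for k = 1:12
    D(:,:,k) = hypot(X - S(k,1), Y - S(k,2));   % Euclidean distance to site k
end
[~, labels] = min(D, [], 3);       % nearest site <=> Voronoi region membership
figure, imagesc(labels), axis image off, hold on
plot(S(:,1), S(:,2), 'k.', 'MarkerSize', 14);   % overlay the sites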
Many problems in computer vision and scene analysis are solved by finding the
most probable values of certain hidden or unobserved image variables and struc-
tures [96]. Such structures and variables include Voronoï regions, Delaunay trian-
gles, neighbourhoods of pixels, nearness (and apartness) of image structures and
pixel gradient distributions as well as values of encoded desired properties of scenes.
Other computer vision problems include image matching, feature selection, opti-
mal classifier design, image region measurement, interest point identification, contour grouping,
segmentation, registration, matching, recognition, image clustering, pattern clus-
tering [45, 138], landmark and point shape matching, image warping, shape gra-
dients [138], false colouring, pixel labelling, edge detection, geometric structure
detection, topological neighbourhood detection, object recognition, and image pat-
tern recognition. Typical applications of computer vision are in digital video stabi-
lization [49, Sect. 9, starting on p. 261] and in robot navigation [93, Sect. 5, starting
on p. 109].
The term camera comes from Latin camera obscura (dark chamber). Many dif-
ferent forms of cameras provide a playground for computer vision, e.g., affine cam-
era, pinhole camera, ordinary digital cameras, infrared cameras (also thermographic
camera), gamma (tomography) camera devices (in 3D imaging). An affine camera
is a linear mathematical model that approximates the perspective projection derived
from an ideal pinhole camera [218]. A pinhole camera is a perspective projection
device, which is a box with light-sensitive film on its interior back plane and which
admits light through a pinhole.
Fig. 1.9 Pixel centered at (5.5, 2.5) in a very small image grid
In this work, the focus is on the detection of the basic content and structures
in digital images. An interest in image content leads into a study of the basics of
image processing and image analysis as well as vector space and computational
geometry views of images. The basics of image processing include colour spaces,
filtering, edge detection, spatial description and image texture. The study of image
structures leads to a computational geometry view of digital images. The basic idea
is to detect and analyze image geometry from different perspectives.
Digital images are examples of subsets of Euclidean spaces (both 2D and 3D).
Hence, vector space views of digital images are a natural outcome of their basic
character. Digital image structures are basically geometric structures. Such structures
can be viewed in terms of image regions nearest selected points (see, e.g., the tiny
region nearest the highlighted pixel centered at (5.5, 2.5) in Fig. 1.9). Such structures
can also be viewed with respect to line segments connected between selected points
to form triangular regions. Both a regional view and a triangulation view of image
structures lead to various forms of image segmentations that have practical value
when it comes to recognizing objects in images and classifying images. In addition,
both regional and triangle views lead to the discovery of patterns hidden in digital
images.
Examples of such structures include image regions viewed as sets of pixels that are in some sense near each other or a set of
points near a fixed point (e.g., all points near a site (also, seed or generating point) in a
Voronoï region [38]). For this reason, it is highly advantageous to associate geometric
structures in an image with mesh-generating points (sites) derived from the fabric of
an image. Image edges, corners, centroids, critical points, intensities, and keypoints
(image pixels viewed as feature vectors) or their combinations provide ideal sources
of mesh generators as well as sources of information about image geometry.
Computational geometry is the brainchild of A. Rosenfeld, who suggested
approaching image analysis in terms of distance functions in measuring the
separation between pixels [168] and image structures such as sets of pixels [169,
170]. Rosenfeld’s work eventually led to the introduction of topological algorithms
useful in image processing [99] and the introduction of a full-scale digital geometry
in picture analysis [94].
Algorithm 1 leads to a mesh covering a digital image. Image meshes can vary
considerably, depending on the type of image and the type of mesh generating points that
are chosen. Image geometry tends to be revealed, whenever the choice of generating
points accurately reflects the image visual content and the structure of the objects in
an image scene. For example, corners would be the logical choice for image scenes
containing buildings or objects with sharply varying contours such as hands or facial
profiles.
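A sketch of this choice of generating points, assuming the Image Processing Toolbox corner function and the built-in cameraman.tif image:

% Sketch: corner-based Delaunay mesh on an image
im = imread('cameraman.tif');
C = corner(im, 50);                      % up to 50 corner points (x, y)
DT = delaunayTriangulation(double(C));   % Delaunay mesh on the corners
figure, imshow(im), hold on
triplot(DT);                             % corner-based Delaunay mesh
plot(C(:,1), C(:,2), 'r.', 'MarkerSize', 12);   % mark the corners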
Fig. 1.10 Hunting grounds for scene information: corner-based Delaunay and Voronoï meshes
A digital image is a discrete representation of visual field objects that have spatial
(layout) and intensity (colour or grey tone) information.
From an appearance point of view, a greyscale digital image1 is represented by
a 2D light intensity function I (x, y), where x and y are spatial coordinates and the
value of I at (x, y) is proportional to the intensity of light that impacted on an optical
sensor and recorded in the corresponding picture element (pixel) at that point.
If we have a multicolour image, then a pixel at (x, y) is a 1 × 3 array and each array
element indicates a red, green or blue brightness of the pixel in a colour band (or
colour channel). A greyscale digital image I is represented by a single 2D array of
numbers and a colour image is represented by a collection of 2D arrays, one for each
colour band or channel. This is how, for example, Matlab represents colour images.
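For example, the channel arrays of a built-in colour image can be inspected as follows (a minimal sketch):

% Sketch: a colour image as three 2D channel arrays
rgb = imread('peppers.png');   % built-in colour image, m x n x 3
size(rgb)                      % rows, columns, colour channels
R = rgb(:,:,1);                % red channel (2D array)
G = rgb(:,:,2);                % green channel
B = rgb(:,:,3);                % blue channel
rgb(10, 20, :)                 % 1 x 1 x 3 colour of the pixel at row 10, column 20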
A binary image consists entirely of black pixels (pixel intensity = 0) and white
pixels (pixel intensity = 1). For simplicity, we use the term binary image to refer to
a black and white image. By contrast, a greyscale image is an image that consists
entirely of pixels with varying shades of black, grey tones and white.
Binary images and greyscale images are 2-dimensional intensity images. By con-
trast, an RGB (red, green, blue) colour image is a 3-dimensional (multidimensional)
image, since each colour pixel is represented by 3 colour channels,
one channel for each colour. RGB images live in what is known as an RGB colour
space. There are many other forms of colour spaces. The most common alternative
to an RGB space is the HSV (Hue, Saturation, Value) space implemented by Matlab
or the HSB (Hue, Saturation, Brightness) space implemented by Mathematica.
1 A greyscale image is an image containing pixels that are visible as black or white or grey tones
(intermediate between black and white).
Fig. 1.12 Pixels as tiny squares on the edges of the maple leaf
It is common to use the little square model, which represents a pixel as a geometric
square.
The little square model can be seen at work using a pair of colour images like the ones shown in Fig. 1.11. To see this, move
the cpselect window over the maple leaf in the sample image in Fig. 1.11.1 and set
the zoom at 600%. Then notice the tiny squares along the edges of the zoomed-in
maple leaf in Fig. 1.12. Try a similar experiment with a second image such as the
one in Fig. 1.11.2 (or the same image) in the right-hand cpselect window. There are
advantages in choosing the same image for the right-hand cpselect window, since this
makes it possible to compare what happens while zooming in by different amounts
in relation to a zoomed-in image in the left-hand cpselect window.
Note The human eye can identify 120 pixels per degree of visual arc, i.e., if 2 dots
are closer than 1/120 degree, then our eyes cannot tell the difference. At a distance of 2
m (normal distance to a TV), our eyes cannot differentiate 2 dots 0.4 mm apart (see,
for example, Fig. 1.13).
In other words, for example, a pixel p centered at location (i, j) in a digital image
is identified with an area of the plane bounded by a square with sides of length
0.5 mm, i.e., a square centered at (i, j).
See, e.g., the sample pixel p centered at (5.5, 2.5) represented as a square in Fig. 1.9.
A.R. Smith points out that this is misleading [179]. Instead, in a 2D model of an
image, a pixel is a point sample that exists only at a point in the plane. For a colour
image, a pixel contains three point samples, one for each colour channel. Normally,
a pixel is the smallest unit of analysis of images. Sub-pixel analysis is also possible.
For more about pixels, see Appendix B.15.
In photography, a visual field is that part of the physical world that is visible
through a camera at a particular position and orientation in space. A visual field is
identified with a view cone or angle of view. In Matlab, a greyscale image pixel
I (x, y) denotes the light intensity (without colour) in row x and column y of the
image. Values of x and y start at the origin in the upper lefthand corner of an image
(see, e.g., the greyscale image of a cameraman in Fig. 1.14).
A sample display of a coordinate system with a greyscale colorbar for an image is
shown in Fig. 1.14 using the code in Listing 1.1. The imagesc function is used to scale
the intensities in a greyscale image. The colormap(gray) and colorbar functions
are used to produce a colorbar to the west of a displayed image.
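For example, a display of this kind can be produced with a few lines (a sketch, assuming the built-in cameraman.tif image):

% Sketch: scaled greyscale display with a colorbar
g = imread('cameraman.tif');   % built-in greyscale image
figure, imagesc(g), axis image;
colormap(gray), colorbar;      % grey colourmap with a colorbar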
In Fig. 1.14, the top lefthand corner has coordinates (0, 0), the origin of the array
representation of the image. To see the information for the cameraman image, use
the imfinfo function (see Listing 1.2).
imfinfo('cameraman.tif')
to obtain
ans =
Filename : 'cameraman.tif'
FileModDate : '20-Dec-2010 09:43:30'
FileSize : 65240
Format : 'tif'
FormatVersion : []
Width : 256
Height : 256
BitDepth : 8
ColorType : 'grayscale'
FormatSignature : [77 77 0 42]
ByteOrder : 'big-endian'
NewSubFileType : 0
BitsPerSample : 8
Compression : 'PackBits'
PhotometricInterpretation : 'BlackIsZero'
StripOffsets : [8x1 double]
SamplesPerPixel : 1
RowsPerStrip : 32
StripByteCounts : [8x1 double]
XResolution : 72
YResolution : 72
ResolutionUnit : 'None'
Colormap : []
PlanarConfiguration : 'Chunky'
TileWidth : []
TileLength : []
TileOffsets : []
TileByteCounts : []
Orientation : 1
FillOrder : 1
GrayResponseUnit : 0.0100
MaxSampleValue : 255
MinSampleValue : 0
Thresholding : 1
Offset : 64872
ImageDescription : [1x112 char]
$$A = \begin{bmatrix} A(1,1) & A(2,1) & \cdots & A(450,1)\\ A(1,2) & A(2,2) & \cdots & A(450,2)\\ \vdots & \vdots & \ddots & \vdots\\ A(1,350) & A(2,350) & \cdots & A(450,350) \end{bmatrix} = \begin{bmatrix} 50 & 52 & \cdots & 50\\ 50 & 152 & \cdots & 250\\ \vdots & \vdots & \ddots & \vdots\\ 100 & 120 & \cdots & 8 \end{bmatrix}$$
In image A, notice that pixel A(450, 350) has greyscale intensity 8 (almost black).
And the pixel at A(450, 2) has intensity 250 (almost white).
A digital visual space is a nonempty set that consists of points in a digital image. A
space is a nonempty set with some sort of structure.
Historical Note 1 Visual Space.
J.H. Poincaré introduced sets of similar sensations to represent the results of G.T.
Fechner’s sensation sensitivity experiments [50] and a framework for the study of
resemblance in representative spaces as models of what he termed physical con-
tinua [154–156]. Visual spaces are prominent among the types of spaces that Poincaré
wrote about.
The elements of a physical continuum (pc) are sets of sensations. The notion of a
pc and various representative spaces (tactile, visual, motor spaces) were introduced
by Poincaré in an 1894 article on the mathematical continuum [156] and in an 1895 article
on space and geometry [155].
From the Historical Note, the important thing to observe is that a digital image
can be viewed as a visual space with some form of structure. Notice that the idea
of a digital visual space extends to collections (sets) of digital images, where the
structure of each such collection is defined, for example, by the nearness or apartness
of image structures such as neighbourhoods of points in the images in the collection.
In effect, a collection of digital images with some form of structure constitutes a
visual space.
3 Or grayscale image, using the MathWorks (Matlab) spelling. These notes are written using Canadian spelling.
Any 2D array of natural numbers in the range [0, n], n ∈ N (natural numbers) can be
viewed as a greyscale digital image. Each natural number specifies a pixel intensity.
The upper limit n on the range of intensities is usually 255.
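For instance, even a tiny hand-written array displays as a greyscale image (a minimal sketch):

% Sketch: a 3 x 3 array of naturals in [0, 255] viewed as an image
A = uint8([0 64 128; 192 255 32; 16 200 100]);
figure, imshow(A, 'InitialMagnification', 'fit');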
Here is an example. The greyscale image in Fig. 1.15 (an image that approximates
the Mona Lisa painting) is constructed from an array of positive integers, where each
integer represents a pixel grayscale intensity. Internally, Matlab represents a single
intensity as a tiny subimage (each pixel in the subimage has the same intensity).
% sample digital image
132 128 126 123 137 129 130 145 158 170 172 161 153 158 162 172 159 152;
139 136 127 125 129 134 143 147 150 146 157 157 158 166 171 163 154 144;
144 135 125 119 124 135 121 62 29 16 20 47 89 151 162 158 152 137;
146 132 125 125 132 89 17 19 11 8 6 9 17 38 134 164 155 143;
142 130 124 130 119 15 46 82 54 25 6 6 11 17 33 155 173 156;
134 132 138 148 47 92 208 227 181 111 33 9 6 14 16 70 180 178;
151 139 158 117 22 162 242 248 225 153 62 19 8 8 11 13 159 152;
153 135 157 46 39 174 207 210 205 136 89 52 17 7 6 6 70 108;
167 168 128 17 63 169 196 211 168 137 121 88 21 9 7 5 34 57;
166 170 93 16 34 63 77 140 28 48 31 25 17 10 9 8 22 36;
136 111 83 15 48 69 57 124 55 86 52 112 34 11 9 6 15 30;
49 39 46 11 83 174 150 128 103 199 194 108 23 12 12 10 14 34;
26 24 18 14 53 175 153 134 98 172 146 59 13 14 13 12 12 46;
21 16 11 14 21 110 126 47 62 142 85 33 10 13 13 11 11 15;
17 14 10 11 11 69 102 42 39 74 71 28 9 13 12 12 11 18;
18 19 11 12 8 43 126 69 49 77 46 17 7 14 12 11 12 19;
24 30 17 11 12 6 73 165 79 37 15 12 10 12 13 10 10 16;
24 40 18 9 9 2 2 23 16 10 9 10 10 11 9 8 6 10;
43 40 25 6 10 2 0 6 20 8 10 16 18 10 4 3 5 7;
39 34 23 5 7 3 2 6 77 39 25 31 36 11 2 2 5 2;
17 16 9 4 6 5 6 36 85 82 68 75 72 27 5 7 8 0;
4 8 5 6 8 15 65 127 135 108 120 131 101 47 6 11 7 4;
2 9 6 6 7 74 144 170 175 149 162 153 110 48 11 12 3 5;
11 9 3 7 21 127 176 190 169 166 182 158 118 44 10 11 2 5;
8 0 5 23 63 162 185 191 186 181 188 156 117 38 11 12 25 33;
3 5 6 64 147 182 173 190 221 212 205 181 110 33 19 42 57 50;
5 3 7 45 160 190 149 200 253 255 239 210 115 46 30 25 9 5;
9 4 10 16 24 63 93 187 223 237 209 124 36 17 4 3 2 1;
7 8 13 8 9 12 17 19 26 41 42 24 11 5 0 1 7 4;
%
fid = fopen('lisa2.txt');
A = textscan(fid, '%d', 'delimiter', '\b\t;');
B = reshape(A{1}, [18, 29]);
B = double(B);
B = B';
figure; imshow(B, []);
rand(n).*m
This section illustrates the use of random numbers to generate a digital image. This
is done with the rand function in Matlab. The image in Fig. 1.16 represents an array
of randomly generated intensities in the range from 0 to 100, using the code in
Listing 1.5. Here is a sample of 8 numbers of type double produced by the code:
% Generate random array of numbers in range 0 to max
Listing 1.5 Matlab code in eg_02.m to produce Figs. 1.16 and 1.17.
In Matlab, the image function displays a matrix I as an image. Using this function,
each element of matrix I specifies the colour of a rectangular patch (see Fig. 1.17
to see how the image function displays a matrix of random number values as an
image containing color patches). Listing 1.5 contains the following line of code that
produces the greyscale image in Fig. 1.16. Because the second image produced by
the code in Listing 1.5 has been scaled, the colourbar to the right of the image in
Fig. 1.17 shows intensities (colour amounts) in the range from 0 to 100, instead of
the range 0 to 1 (as in Fig. 1.16).
figure , imshow ( I ) ; title ( ’intensities in [0,1] range’ ) ;
Problem 1.7
(image.1) ® Write Matlab code to produce the image in Fig. 1.18.
(image.2) ® Write Matlab code to display the Mona Lisa shown in Fig. 1.15 as
a colour image instead of a greyscale image. Your colour image should be similar
to the colour image in Fig. 1.17.
For Problem 1.7, you may find the following trick useful. To transform the color
array in Fig. 1.19.1 to a narrower display like the one in Fig. 1.19.2, try
axis image.
I = randi(80,100) - 1;
% what's happening?
I = rand(100).*80;            % generate random image array
                              % with 80 intensities in range 0...100
subplot(1,3,1); imshow(I);
imagesc(I);                   % scale colormap to data range
axis image; axis off;
colormap(gray); colorbar;     % produce colorbar
subplot(1,3,2); imshow(I);    % do not specify range
subplot(1,3,3); imshow(I, [0 80]);   % specify range
Listing 1.7 Use the Matlab code in eg_03.m to produce the images in Fig. 1.20.
% Display multiple images
% what's happening?
I = imread('cell.tif');    % choose .tif file
J = imread('spine.tif');   % choose 2nd .tif file
K = imread('onion.png');   % choose .png file
%
subplot(1,3,1); imagesc(I); axis image;   % scale image
axis image; axis off;                     % display first image
colormap(gray); colorbar;                 % produce colorbar
subplot(1,3,2); imagesc(J); axis image;   % 2nd image
axis off; colormap(jet);   % set colormap to jet (false colour)
subplot(1,3,3); imshow(K); % display colour image
Listing 1.8 Use the Matlab code in eg_04.m to produce the images in Fig. 1.21.
There are a number of important, commonly available digital image formats, briefly
summarised as follows.
(Format.1) .bmp (bit mapped picture) basic image format, limited, generally lossless
compression (lossy variants exist). .bmp originated in the development
of Microsoft Windows.
(Format.2) .gif (graphics interchange format) Limited to 256 colours (8 bit), loss-
less compression. Lossless data compression is the name of a class
of compression algorithms that make it possible to reconstruct the
exact original data from the compressed data. By contrast, lossy data
compression only permits an approximation of the original data from
the compressed data.
(Format.3) .jpg, .jpeg (joint photographic experts group) Most commonly used
file format today (e.g., in most cameras), lossy compression (lossless
variants exist).
(Format.4) .png (portable network graphics) .png is a bit mapped image format
that employs lossless data compression.
(Format.5) .svg (scalable vector graphics) Instead of a raster image format
(describes the characteristics of each pixel), a vector image format
gives a geometric description that can be rendered smoothly at any
display size. An .svg image provides a versatile, scriptable and all-
purpose vector format for the web and other publishing applications.
To gain some experience working with vector images, download and
experiment with the public domain tool named Inkscape.
(Format.6) .tif, .tiff (tagged image file format) Highly flexible, detailed, adaptable
format, compressed and uncompressed variants exist.
.png was designed to replace .gif with a new lossless compression format. The
acronym png can be read png not gif. This format was approved by the
internet engineering steering group in 1996 and was published as an
ISO/IEC standard in 2004. This format supports both greyscale and full
colour (rgb) images. This format was designed for internet transfer and
not for professional quality print graphics. See http://linuxgazette.net/
issue13/png.html for a history of the .png format.
The need to transmit images over networks and the need to recognize bodies of
numerical data as corresponding to digital images has led to the introduction of a
number of different image formats. Among these formats, .jpg and .tif are the most
popular. In general, .jpg and .tif are better suited for photographic images. The .gif
and .png formats are better suited for images with limited colour, detail, e.g., logos,
handwriting, line drawings, text.
The choice of an image format can be determined, for the most part, not just by image
contents but also by the image data type required for storage. Here are a number of
distinct image types.
(Type.2) Intensity (greyscale) images are 2D arrays, where each array element
assigns one numerical value from N0+ (natural numbers plus zero), usually
natural numbers in the 8 bit range from 0 to 255 (or scaled in
the range from 0.0 to 1.0). For a greyscale image with 16 bit integer
intensities from 0 to 65535, try the Matlab code in Listing 1.9 using the
im2uint16 function and the rgb2gray function to convert a colour image
to a greyscale image. Using imtool, you can inspect the resulting image
intensities (see, e.g., Fig. 1.22, with an intensity equal to 6842 for the
pixel at (2959, 1111)).
% 16 bit greyscale image
g = imread('workshop.jpg');   % a 4.7 MB colour image
g = im2uint16(g);
g = rgb2gray(g);
imtool(g)
Listing 1.9 Use the Matlab code in eg_im2uint16.m to produce the images in Fig. 1.22.
(Type.5) Floating point images are very different from the other image types.
Instead of integer pixel values, the pixels in these images store floating
point numbers representing intensity.
Thought Problem 2
The code in Listing 1.9 leads us to a path of a UXW, namely, the world of greyscale
images with intensities in the range [0, ∞]. To inspect images in this unexplored
world, we just need to invent an im2uint∞ Matlab function that gives us an unbounded
range of intensities for greyscale images.
Problem 1.8 Pick several different points in a colour image and display the colour
channel values for each of the selected points. How would you go about printing
(displaying) all of the values for the red colour channel for this image using only one
command in Matlab? If you get it right, you will be displaying only the red colors
in the peppers.png image.
Problem 1.9 To see an example of a floating point image, try out the code in
Listing 1.10.
C = rand(100,2);
figure, image(C, 'CDataMapping', 'scaled')
axis image
imtool(C)
Listing 1.10 Use the Matlab code to produce a floating point image.
Modify the code to produce an image with width = 3, then 4, then 10. To find out
how rand works in Matlab, enter
help rand
Also enter
help image
to find out what the 'CDataMapping' and 'scaled' parameters are doing to produce
the image when you run the code in Listing 1.10. For more information about Matlab
image formats, type5
5 For example, imwrite is useful, if there is a need to create a new image file containing a processed
image such as the image g in Listing 1.9. To see this, try imwrite(g,’greyimage.jpg’);.
This section focuses on the use of a colour lookup table by Mathworks (in Matlab)
in correlating pixel intensities with the amounts of pixel colour in the rgb colour
space. An overview of the different colour spaces is given by R.C. Gonzalez and
R.E. Woods [58, Chap. 6, p. 394ff]. For a Matlab-oriented view of the colour spaces,
see R.C. Gonzalez, R.E. Woods and S.L. Eddins [59, Chap. 5, p. 272ff].
This section briefly introduces an approach to producing sample colours in the RGB,
HSB and CIE LUV colour spaces.
Mathematically, the colour channels for a colour image are represented by three
distinct 2D arrays with dimension m × n for an image with m rows and n columns
with one array for each colour, red (colour channel 1), green (colour channel 2),
blue (colour channel 3). A pixel colour is modelled as a 1 × 3 array. In Fig. 1.24, for
example, I(2, 4, :) = (I(2, 4, 1), I(2, 4, 2), I(2, 4, 3)), where each component is the
numerical value of the pixel intensity in one colour channel. For a colour
image I in Matlab, the colour channel values for a pixel with coordinates (x, y) are
displayed using I(x, y, :).
Three 2D colour planes for an image I are shown in Fig. 1.24. A sample pixel at
location (x, y) = (2, 4) (column 2, row 4) is also shown in each colour plane.
The rgb colour cube in Fig. 1.26 is generated with the Matlab code in Listing 1.11.
function rgbcube(x, y, z)
vertices = [0 0 0; 0 0 1; 0 1 0; 0 1 1; 1 0 0; 1 0 1; 1 1 0; 1 1 1];
faces = [1 5 6 2; 1 3 7 5; 1 2 4 3; 2 4 8 6; 3 7 8 4; 5 6 8 7];
colors = vertices;
patch('vertices', vertices, 'faces', faces, ...
    'FaceVertexCData', colors, 'FaceColor', 'interp', ...
    'EdgeAlpha', 0)
if nargin == 0
    x = 10; y = 10; z = 4;
elseif nargin ~= 3
    error('wrong no. of inputs')
end
axis off
view([x, y, z])
axis square
%
% >> rgbcube   % sample use of this function
%
Listing 1.11 Use the Matlab code in rgbcube.m to produce the image in Fig. 1.26.
The Matlab code Listing 1.11 uses the patch function. Briefly,
patch(X,Y,C) adds a ’patch’ or filled 2D polygon defined by the vectors
X and Y to the current axes. If X and Y are matrices of the same size,
one polygon per column is added. The parameter C specifies the colour
of the added face. To obtain a detailed explanation of the patch function,
type
doc patch
Problem 1.12 Use the rgbcube function to display the following colour planes, using the
specific values for (x, y, z) as an argument for the function:
For example, the green-cyan-white-yellow colour plane in Fig. 1.27 is produced using rgbcube with a suitable (x, y, z) argument.
To solve this problem, give the missing values for each (?, ?, ?) and display and name
the corresponding colour plane.
Table 1.1 shows the six main colours in the visible spectrum, along with their typical
wavelength and frequency6 ranges. The wavelengths are given in nanometres (10⁻⁹ m)
and the frequencies are given in terahertz (10¹² Hz).
6 Frequencies:
High frequency RF signals (3–30 GHz) and Extreme High Frequency RF signals (30–300 GHz)
interacting with an electron-hole plasma in a semiconductor [178, Sect. 1.2.1, p. 10] (see,
also, [213]), Picosecond photoconducting dipole antenna illuminated with femtosecond optical
pulses, radiate electrical pulses with frequency spectra from dc-to-THz [178, Sect. 1.4.1, p. 16].
I = imread('peppers.jpg');
I = rgb2gray(I);
figure, image(I)
colormap(bone)    % greyscale colour map
colormap(pink)    % pastel shades of pink colormap
colormap(copper)  % colours from black to bright copper
A colour translation table or a colour lookup table (LUT) associates a pixel inten-
sity value (0 to 255) to a colour value. This colour value is represented as a triple (i,
j, k); this is much like the representation of a colour using one of the colour models.
Once a desired colour table or LUT has been set up, any image displayed on a colour
monitor automatically translates each pixel value by the LUT into a colour that is
then displayed at that pixel point. In Matlab, a LUT is called a colourmap.
A sample colour lookup table is given in Table 1.2. This table is based on the RGB
colour cube and goes gradually from black (0, 0, 0) to yellow (255, 255, 0) to red
(255, 0, 0) to white (255, 255, 255). Thus, the pixel value (the subscript to the colour
table) for black is 0, for yellow is 84 and 85, for red is 169 and 170, and for white is
255. In Matlab, the subscripts to the colour table would run from 1 to 256.
Notice that not all the possible 256³ RGB hues are represented, since there are
only 256 entries in the table. For example, both blue (0, 0, 255) and green (0, 255, 0)
are missing. This is due to the fact that a pixel may have only one of 256 8-bit values.
This means that the user may choose which hues to put into his colour table.
A colour translation table based on the hue saturation value (hsv) or hue saturation
lightness (hsl) colour models would have entries with real numbers instead of inte-
gers. The basic representation for hsv uses cylindrical coordinate representations of
points in an rgb colour model. The first column would contain the number of degrees
(0–360°) needed to specify the hue. The second and third columns would contain
values between 0.0 and 1.0.
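A colour lookup table like the one in Table 1.2 can be built by hand as a 256 × 3 colourmap, as in the following sketch (the breakpoints 85/85/86 follow the black-to-yellow-to-red-to-white scheme described above; the test image is an assumption):

% Sketch: custom 256-entry LUT from black to yellow to red to white
n1 = 85; n2 = 85; n3 = 86;
map = [linspace(0,1,n1)' linspace(0,1,n1)' zeros(n1,1); ...    % black -> yellow
       ones(n2,1) linspace(1,0,n2)' zeros(n2,1); ...           % yellow -> red
       ones(n3,1) linspace(0,1,n3)' linspace(0,1,n3)'];        % red -> white
g = imread('cameraman.tif');    % assumed test image
figure, imagesc(g), axis image off;
colormap(map), colorbar;        % pixel values indexed into the LUT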
Consult the colormap documentation within Matlab to see what colour tables or colourmaps it has predefined. For more
details, see https://csel.cs.colorado.edu/~csci4576/SciVis/SciVisColor.html.
Remark 1.13 To see the colour channel values for a pixel, try the following experi-
ment.
% Experiment with a pixel in a colour image
%
% What's happening?
g = imread('rainbow-shoe2.jpg');   % read colour image
figure, imagesc(g), colorbar;      % display rainbow image
g(196,320)       % display red channel value
g(196,320,:)     % display 3 colour channel values
Listing 1.12 Use the Matlab code in band.m to produce the image in Fig. 1.28.
When the imtool window is displayed with the hsv version of the pep-
pers.png image, move the cursor over the image to see the real values that
correspond to each hsv colour. Notice, for example, the hsv color channel
values for the pixel at hsv(355, 10) in Fig. 1.29. Also, custom interpre-
tations of the hsv color space are possible, e.g., hsv colours represented
by integers corresponding to degrees in a circle.
(HSV.3) Use the Matlab imtool function to display the colour channel values
for the hsv image. Give some sample colour channel values for three of
the same pixels in the original rgb image and the new hsv image (see the sketch below).
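A minimal sketch of the conversion (the inspected pixel (355, 10) echoes Fig. 1.29):

% Sketch: rgb-to-hsv conversion and channel inspection
rgb = imread('peppers.png');
hsv = rgb2hsv(rgb);    % h, s, v channels, each real-valued in [0, 1]
hsv(355, 10, :)        % hsv channel values for the pixel at (355, 10)
imtool(hsv)            % move the cursor to inspect real values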
The idea now is to experiment with accessing pixels and display pixel intensities
in an image. The following tricks are used to view individual pixels, modify pixel
values and display images.
Example 1.16
trick.12 Matlab improfile computes the intensity values along a line or multiline
path in an image.
Listing 1.13 puts together the basic approach to accessing and plotting the pixel
values along a line segment. Notice that properties parameters can be added to the line
function to change the colour and width of a displayed line segment. For example,
try
line([r1, c1], [r2, c2], 'Color', 'r', 'LineWidth', 3);   % red line
Notice that the set of pixels in a line segment is an example of a simple convex set.
In general, a set of points is a convex set, provided the straight line segment connecting
each pair of points in the set is contained in the set. A line segment is an example of
one-sided convex polygon. The combination of the line and improfile functions gives
a glimpse of what is known as the texture of an image. Small elementary patterns
repeated periodically in an image constitute what is known as image texture. The
study of image line segments leads to the skeletonization of digital images, which is
a stepping stone to object recognition and the delineation of image regions (e.g., areas of
interest in topographic maps), which is an important part of computer vision (see, e.g.,
[176, Sect. 5.2]). The study of image line segments also ushers in the combination
of digital images and computational geometry [41].
% pixel intensity profile along a line segment
clc, clear all, close all
im = imread('liftingbody.png');   % built-in greyscale image
image(im), axis on, colormap(gray(256));   % display image
r1 = 450; c1 = 20; r2 = 30; c2 = 350;   % select pixel coords.
line([r1, c1], [r2, c2]);   % draw line segment
figure,
improfile(im, [r1, c1], [r2, c2]),   % plot pixel intensities
ylabel('Pixel value'),
title('improfile(im,[r1,c1],[r2,c2])')
Listing 1.13 Use the Matlab code in findIt.m to produce the images in Fig. 1.32.
A geometric view of digital images approximates what the eye sees and
what a camera captures from a visual scene.
% pixel intensity profile along a line segment
clc, clear all, close all
im = imread('liftingbody.png');   % built-in greyscale image
figure
image(im), axis on, colormap(gray(256));   % display image
hold on
seg1 = [19 427 416 77];   % define line segment 1
seg2 = [96 462 37 33];    % define line segment 2
r1 = 8; c1 = 350; r2 = 450; c2 = 45;   % select pixel coords.
line([r1, c1], [r2, c2], 'Color', 'r');   % draw line segment
Listing 1.14 Use the Matlab code in findLines.m to produce the image in Fig. 1.33.
It is possible to access, modify and display modified pixel values in an image. The
images in Fig. 1.34 (original image) and Fig. 1.35 (modified image) are produced
using the code in Listing 1.15.
% what's happening?
I = imread('cell.tif');    % choose .tif file
imtool(I);                 % use interactive viewer
%
K = imread('onion.png');   % choose .png file
imtool(K);                 % use interactive viewer
subplot(2,2,1); imshow(I); % display unmodified greyscale image
subplot(2,2,2); imshow(K); % display unmodified rgb image
%
I(25,50)          % print value at (25,50)
I(25,50) = 255;   % set pixel value to white
I(26,50) = 255;   % set pixel value to white
I(27,50) = 255;   % set pixel value to white
I(28,50) = 255;   % set pixel value to white
I(29,50) = 255;   % set pixel value to white
I(30,50) = 255;   % set pixel value to white
I(31,50) = 255;   % set pixel value to white
I(32,50) = 255;   % set pixel value to white
I(33,50) = 255;   % set pixel value to white
I(34,50) = 255;   % set pixel value to white
I(35,50) = 255;   % set pixel value to white
%
I(26,51) = 255;   % set pixel value to white
I(27,52) = 255;   % set pixel value to white
I(28,52) = 255;   % set pixel value to white
I(29,54) = 255;   % set pixel value to white
I(30,55) = 255;   % set pixel value to white
subplot(2,2,3); imshow(I);   % display modified image
imtool(I);                   % use interactive viewer
%
K(25,50,:)   % print rgb pixel value at (25,50)
K(25,50,1)   % print red value at (25,50)
K(25,50,2)   % print green value at (25,50)
K(25,50,3)   % print blue value at (25,50)
K(25,50,:) = 255;   % set pixel value to rgb white
K(26,50,:) = 255;   % set pixel value to rgb white
K(27,50,:) = 255;   % set pixel value to rgb white
K(28,50,:) = 255;   % set pixel value to rgb white
K(29,50,:) = 255;   % set pixel value to rgb white
K(30,50,:) = 255;   % set pixel value to rgb white
%
K(26,51,:) = 255;   % set pixel value to rgb white
K(27,52,:) = 255;   % set pixel value to rgb white
K(28,52,:) = 255;   % set pixel value to rgb white
K(29,54,:) = 255;   % set pixel value to rgb white
K(30,55,:) = 255;   % set pixel value to rgb white
%K(31,56,:) = 255;  % set pixel value to rgb white
K(25,50,:)
subplot(2,2,4); imshow(K);   % display modified 2nd image
imtool(K);                   % use interactive viewer
Listing 1.15 Use the Matlab code in eg_05.m to produce the images in Figs. 1.34 and 1.35.
You will find that running the code in Listing 1.15 will display the following pixel
values.
ans(:,:,1) = 46    % unmodified red channel value for pixel K(25,50)
ans(:,:,2) = 29    % unmodified green channel value for pixel K(25,50)
ans(:,:,3) = 50    % unmodified blue channel value for pixel K(25,50)
ans(:,:,1) = 255   % modified red channel value for pixel K(25,50)
ans(:,:,2) = 255   % modified green channel value for pixel K(25,50)
ans(:,:,3) = 255   % modified blue channel value for pixel K(25,50)
In addition, the code in Listing 1.15 displays an image viewer for each (both
unmodified and modified). Here is what the image viewer looks like, using
imtool(image) (Fig. 1.36).
Fig. 1.36 Image viewer for modified rgb onions image (notice hook)
% converting an image to greyscale
% What's happening?
I = imread('onion.png');   % input png (rgb) image
%
Ig = rgb2gray(I);   % convert to greyscale
Ibw = im2bw(I);     % convert rgb to binary image
%
subplot(1,3,1); imshow(I); axis image; title('png (rgb) image')
subplot(1,3,2); imshow(Ig); title('greyscale image');
subplot(1,3,3); imshow(Ibw); title('binary image');
Listing 1.16 Use the Matlab code in binary.m to produce the images in Figs. 1.34 and 1.37.
$N_4(p) = \{p(x, y),\ p(x-1, y),\ p(x+1, y),\ p(x, y-1),\ p(x, y+1)\}$ (4-Nbd).
That is, B is the set of pixels in the corners of the subimage in I containing the
4-neighbourhood N4 ( p) in Fig. 1.38.
$N_8(p) = N_4(p) \cup B = N_4(p) \cup \{p(x-1, y-1),\ p(x+1, y+1),\ p(x+1, y-1),\ p(x-1, y+1)\}$ (8-Nbd).
See Fig. 1.39 for a sample 8-neighbourhood. For more about Rosenfeld 4- and 8-
neighbourhoods, see R. Klette and A. Rosenfeld [94, Sect. 1.1.4, p. 9].
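For a pixel p at (x, y), the coordinates in these neighbourhoods can be listed directly (a sketch; image border checks are omitted):

% Sketch: coordinates of N4(p) and N8(p) for p at (x, y)
x = 5; y = 4;
N4 = [x y; x-1 y; x+1 y; x y-1; x y+1];      % 4-neighbourhood (includes p)
B  = [x-1 y-1; x+1 y+1; x+1 y-1; x-1 y+1];   % corner pixels
N8 = [N4; B];                                % 8-neighbourhood
disp(N8)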
In an ordinary digital image without zooming, the tiny squares representing indi-
vidual pixels are usually not visible. To get around this problem and avoid the need
to use zooming, the Matlab function imcrop is used to select a tiny subimage in an
image.
% Selecting a tiny subimage using imcrop
clc, clear all, close all
a = imread('peppers.png');   % built-in colour image
im = imcrop(a);              % select tiny subimage
figure                       % display subimage
image(im), axis on, colormap(gray(256));
Listing 1.17 Use the Matlab code in RosenfeldTinyImage.m to produce the subimage in
Fig. 1.41.
Example 1.21 Extracting a Subimage from an Image. With, for example, the
peppers.png image available in Matlab (see Fig. 1.40), use imcrop to select a tiny
subimage such as part of the centre lower green pepper in Fig. 1.40. This is
done using the Matlab script 1.17. For example, we can obtain the 9 × 11 subimage
in Fig. 1.41.
% Rosenfeld 8 neighbours of a pixel
clc, clear all, close all
a = imread('peppers.png');   % built-in colour image
im = imcrop(a);              % select tiny subimage
figure                       % display subimage
image(im), axis on, colormap(gray(256));
row = 4; col = 5;            % select 8-Nbd centre
im(row,col,:) = 255;         % paint centre white
im(row-1,col-1:col+1,:) = 155;   % paint border grey
im(row,col-1,:) = 155;
im(row,col+1,:) = 155;
im(row+1,col-1:col+1,:) = 155;
figure                       % display 8-Nbd
image(im), axis on, grid on, colormap(gray(256));   % display image
These steps are carried out using the Matlab script Listing 1.18. For example, we can
obtain the 8-Neighbourhood displayed with false colours in Fig. 1.42.
Fig. 1.44 Points p(x1, y1) and q(x2, y2) in the plane, with horizontal separation x1 − x2 and vertical separation y1 − y2
This section briefly introduces two of the most commonly used means of measuring
distance, namely, Euclidean distance metric and Manhattan distance metric. Let Rn
denote the real Euclidean space. In the Euclidean space Rn, a point is also called
a vector with n coordinates. The Euclidean line (or real line)
equals R1 for n = 1, usually written R. A line segment x1 x2 between points x1 , x2
on the real line has length that is the absolute value |x1 − x2| (see, for example, the
distance between points on the horizontal axis in Fig. 1.44).
The Euclidean plane (or 2-space) R2 is the space of all points with 2 coordinates.
The Euclidean 3-space R3 is the space of all points each with 3 coordinates. In
general, the Euclidean n-space is the n-dimensional space Rn . The elements of Rn
are points (also called vectors), each with n coordinates.
For example, let points x, y ∈ Rn with n coordinates, then x = (x1 , . . . , xn ), y =
(y1, . . . , yn). The norm of x ∈ Rn (denoted $\|x\|$) is

$\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$ (vector length from the origin).
Sometimes the Euclidean distance is written $\|x - y\|_2$ (see, e.g., [34, Sect. 5, p. 94]).
The taxicab metric is computed using the absolute values of the differences between the coordinates of points along the vertical and horizontal axes of a plane grid. Let |x1 − x2| equal the absolute value of the distance between x1 and x2 (along the horizontal axis of a digital image). The taxicab metric dtaxi, also called the Manhattan distance between points p at (x1, y1) and q at (x2, y2), is the distance in the plane defined by
dtaxi(p, q) = |x1 − x2| + |y1 − y2| (Taxicab distance in the plane).
In general, the taxicab distance between two points in Rn mimics the distance logged
by a taxi moving down one street and up another street until the taxi reaches its desti-
nation. The taxicab distance between two points x = (x1 , . . . , xn ), y = (y1 , . . . , yn )
in n-dimensional Euclidean space Rn is defined by
dtaxi(x, y) = Σ_{i=1}^{n} |xi − yi| (Taxicab distance in Rn).
The Euclidean distance and the taxicab distance are two of the commonest metrics used in measuring distances in digital images. For example, see the sample distances computed in Matlab Listing 1.19.
% distance between pixels
clc, clear all, close all
im0 = imread('liftingbody.png');          % built-in greyscale image
image(im0), axis on, colormap(gray(256)); % display image
% select vector components
x1 = 100; y1 = 275; x2 = 325; y2 = 400;
im0(x1, y1), im0(x2, y2),                 % display pixel intensities
p = [100 275]; q = [325 400];             % vectors
norm(p), norm(q),                         % 2-norm values
norm(p - q),                              % norm(p-q) = Euclidean dist.
EuclideanDistance = sqrt((x1 - x2)^2 + (y1 - y2)^2),
ManhattanDistance = abs(x1 - x2) + abs(y1 - y2)
Listing 1.19 Use the Matlab code in distance.m to experiment with distances between pixels.
This section briefly introduces a pointillist painting approach to modifying the appearance of patterns in a digital image with false colours. Pointillism (from French pointillisme) is a form of neo-impressionist art introduced by G. Seurat and P. Signac during the 1880s [174]. Pointillism is a form of painting in which small daubs (dots) of pure colour are applied to a canvas, where they become blended in a viewer's eye.
The daubs of pure colour in a pointillist painting are arranged in visual patterns to form a picture. The pointillist approach to painting with daubs of pure colour carries over in the false-colouring of selected pixels that are part of hidden patterns in digital images.
With a digital image, the basic approach is to replace selected pixels with a false colour to highlight hidden image patterns. Such an image pattern would normally not be visible without false-colouring. The steps in applying false colours to pixels in a digital image pattern are as follows.
Picture Pattern False Colouring Method for RGB Images.
1o Choose a particular pixel p.
2o Pattern Rule. If any other pixel q in the selected image has the same intensity as p, then q belongs to the pattern.
3o Choose a pixel false colour method.
4o Assign values to method parameters such as initial pixel coordinates.
5o Apply the method in Step 2.
6o False-Colour Step. If the intensity of a colour pixel q satisfies the Pattern Rule, then maximize the intensity of q.
7o Display image with false colours.
8o Repeat Step 2 to display a different image pattern.
Example 1.29 RGB Image Pattern Highlighted with False Colouring. A pattern (highlighted with a false colour) is shown in the colour image in Fig. 1.45. In this case, the pattern contains all subimage pixels that have the same intensity as a selected pixel. For example, choose the pixel p at (25, 50) in Fig. 1.45, i.e., p −→ (25, 50).
% Some colour image pixels assigned false colours
clc, clear all, close all
I = imread('peppers.png');
x = 25; y = 50; rad = 250; p = [x y];     % settings
for i = x + 1:x + 1 + rad                 % width of box
    for j = y + 1:y + 1 + rad             % length of box
        q = [i j];                        % use in norm(p-q)
        if ((I(i, j) == I(x, y)) && (norm(p - q) < rad))
            I(i, j, 2) = 255;             % false colour
        end
    end
end
% paint the 8-neighbourhood of p red
I(x, y, 1) = 255; I(x - 1, y, 1) = 255; I(x + 1, y, 1) = 255;
I(x - 1, y + 1, 1) = 255; I(x - 1, y - 1, 1) = 255; I(x + 1, y + 1, 1) = 255;
I(x + 1, y - 1, 1) = 255; I(x, y - 1, 1) = 255; I(x, y + 1, 1) = 255;
figure, imshow(I), axis on,               % show false colours
title('(I(i,j) == I(x,y)) && (norm(p-q)<rad)')
Listing 1.20 Use the Matlab code in falseColourRGB.m to experiment with false-colouring
pixels.
In Listing 1.20, notice that a pixel false colour is obtained by assigning the maximum intensity to one of the colour image channels. In this example, if a pixel intensity I(i, j) matches the intensity of the pixel I(x, y) at the upper lefthand corner of the box and the norm of the distance from (x, y) to (i, j) is less than an upper bound rad (e.g., rad = 250 pixels), then I(i, j) is painted a false colour. In Listing 1.20, the following assignment is made.
I(i,j,2) = 255;
1o ® Modify Listing 1.20 so that the pixel false colour is changed to red. Display the result.
2o Give a complete Matlab script that implements the following pattern rule:
Choose a particular pixel p. If the intensity of any other pixel q in the selected image is less than the intensity of p, then q belongs to the pattern. Display pixel q with a false colour.
3o Give a complete Matlab script that implements the following pattern rule:
Choose a particular pixel p. If the intensity of any other pixel q in the selected image is greater than the intensity of p, then q belongs to the pattern. Display pixel q with a false colour.
4o Invent your own pixel pattern rule. Give a complete Matlab script that implements your pattern rule. Display the result.
The same pattern-rule exercises can be repeated for a greyscale image (see Listing 1.21 below).
2o Give a complete Matlab script that implements the following pattern rule:
Choose a particular pixel p. If the intensity of any other pixel q in the selected greyscale image is less than the intensity of p, then q belongs to the pattern. Display pixel q with a false colour.
3o Give a complete Matlab script that implements the following pattern rule:
Choose a particular pixel p. If the intensity of any other pixel q in the selected image is greater than the intensity of p, then q belongs to the pattern. Display pixel q with a false colour.
4o Invent your own pixel pattern rule. Give a complete Matlab script that implements your pattern rule. Display the result.
% Some pixels inside a box region assigned a false colour
clc, clear all, close all
I = imread('liftingbody.png');            % greyscale image
I = double(I);                            % for scaling
I3 = zeros(size(I, 1), size(I, 2), 3);    % set up 3 channels
I3(:, :, 1) = I; I3(:, :, 2) = I; I3(:, :, 3) = I; % channels <- I
I = I3;                                   % I <- channels
x = 100; y = 150; rad = 350; p = [x y];   % settings
for i = x + 1:x + 1 + rad                 % width of box
    for j = y + 1:y + 1 + rad             % length of box
        q = [i j];                        % q vector
        if ((I(i, j) == I(x, y)) && (norm(p - q) < rad))
            I(i, j, 2) = 255;
        end
    end
end
% paint the 8-neighbourhood of p red
I(x, y, 1) = 255; I(x - 1, y, 1) = 255; I(x + 1, y, 1) = 255;
I(x - 1, y + 1, 1) = 255; I(x - 1, y - 1, 1) = 255; I(x + 1, y + 1, 1) = 255;
I(x + 1, y - 1, 1) = 255; I(x, y - 1, 1) = 255; I(x, y + 1, 1) = 255;
figure, imshow(I ./ 255), axis on,        % display false colours
title('(I(i,j) == I(x,y)) && (norm(p-q)<rad)')
A vector space is a set of objects or elements that can be added together and multiplied by numbers (the result of either operation is an element of the space) in such a way that the usual rules of calculation hold. For example, the set of all pixels in a 2D digital image forms a local vector space that is a subset of R2 with holes in it. Every pixel has coordinates in the plane that can be treated as a vector in the usual way. Unlike the usual vector space called the Euclidean plane, there are holes in a digital image (between every pair of adjacent pixels, there is no pixel). Similarly, the set of all pixels in a 3D digital image forms a local vector space in R3. Compared with the usual dense Euclidean 3D space, a 3D image looks like a piece of Swiss cheese.
Given a pair of vectors x, y, the dot product (denoted x · y) equals the length of the projection of x onto y multiplied by the norm of y. Let θ be the angle between vectors x and y with norms ‖x‖ and ‖y‖. Then the dot product is defined by
x · y = ‖x‖ ‖y‖ cos θ (dot product).
This gives us a way to find the angle between a pair of vectors in a digital image, i.e.,
θ = arccos( (x · y) / (‖x‖ ‖y‖) ) (angle between vectors).
Listing 1.22 Use the Matlab code in dotProduct.m to experiment with dot products.
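A minimal sketch of the dot-product angle computation is given below (the sample vectors are illustrative; dotProduct.m itself may differ).
% Angle between two vectors via the dot product (illustrative sketch)
x = [3 4]; y = [4 3];                         % sample vectors
cosTheta = dot(x, y) / (norm(x) * norm(y));   % cosine of the angle
theta = acos(cosTheta)                        % angle in radians
thetaInDegrees = rad2deg(theta)               % angle in degrees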
Problem 1.35 Let (x, y), (a, b) be a pair of vectors in a digital image of your own choosing and let ∠((x, y), (a, b)) be the angle between (x, y) and (a, b). Use false-colouring to display all of the pairs of vectors that satisfy the Vector Pair Angle Rule (VPA Rule). For an example, see Fig. 1.47 for sample false colouring based on the VPA Rule. Hint: Solve this problem with a very small subimage.
Vector Pair Angle (VPA) Rule. For each ∠((r, t), (u, v)) between pairs of vec-
tors (r, t), (u, v), if ∠((r, t), (u, v)) = ∠((x, y), (a, b)), then display the pixels
at (r, t), (u, v) with a false colour.
In a 2D image, the gradient at the location of a pixel intensity measures the rate and direction of change of the intensity at that location. Let f be a 2D image. Also, let ∂f/∂x be the partial derivative of f in the x-direction and let ∂f/∂y be the partial derivative of f in the y-direction. The gradient of f at location (x, y) (denoted by ∇f) is defined as the 2D column vector
∇f = [∂f/∂x, ∂f/∂y]ᵀ (gradient of f at (x, y)).
Fig. 1.49 PA rule application: ∀r, c ∈ Image, highlight angle(r,c) < 2.5.*angle(10,20)
% image vector x-, y-direction magnitudes and vector angles
clc, clear all, close all
im = imread('liftingbody.png');           % built-in greyscale image
im = imresize(im, 0.5);                   % shrink image by 50%
imshow(im), axis on, grid on;             % display image
[Gmag, Angle] = imgradient(im);           % gradient magnitudes, angles
Angle(150, 150)                           % sample angles:
Angle(165, 130)
Angle(80, 80)
Angle(100, 40)
Listing 1.23 Use the Matlab code in vectorDirection.m to experiment with pixel gradient magnitudes and angles.
Problem 1.37 ® Let ∠p(x, y) be the angle of a pixel p(x, y) with coordinates (x, y) in a colour image of your choosing. Apply the Pixel Angle (PA) rule: For k = 2.5, false colour all image pixels with coordinates (r, c) so that
angle(r, c) < 2.5 ∗ angle(x, y), where angle(r, c) = angle of the pixel at (r, c).
For an example, see Fig. 1.49 for sample false colouring based on the PA Rule. Hint: Solve this problem by letting i, j range over i = 1 : r and j = 1 : c, where
r, c are the number of rows and columns in the selected image, respectively. Caution: In Matlab, imgradient works with greyscale images, not colour images. Even though the display of false colours is on the selected colour image img, the pixel angles are extracted from imgGrey, the greyscale equivalent of the original colour image img.
Pixel Angle (PA) Rule. Let k > 0. For each ∠q(a, b), if ∠q(a, b) < k ∗ ∠p(x, y), then display pixel q(a, b) with a false colour.
This section briefly introduces some of the features of camera vision, starting with
cameras with some form of low-level intelligence.
The intelligent system approach in camera design is part of what is known as intel-
ligent multimedia. Intelligence in this context means the ability of a picture-taking
device to combine available sensor information to facilitate the capture of an optimal
picture. A good discussion on intelligence considered in the context of multimedia
is given by M. Ma [37, Sect. 1.1.3, p. 4]. Capturing the underlying geometry of a 3D scene is one of the principal problems requiring solution for intelligent camera control (see, e.g., M. Christie, P. Olivier and J.-M. Normand [30]).
Intelligent camera control is central to motion planning in robotic devices (see, e.g., [15]), work by Daimler-Benz on vision-based pixel classification for autonomous driving [Franke1999, Fig. 3.2, p. 3] and intelligent vehicle vision systems [109]. A recent survey of hardware-accelerated methods for intelligent object recognition in cameras is given by A. Karimaa [90]. Histogram equalization, motion detection, image interpretation and object recognition are key features that are implemented in intelligent visual surveillance [92].
A good overview of early intelligent camera control is given by S.M. Drucker [36]. Stability (steady hold) is a basic feature of cameras that can be considered intelligent. A camera that supports stability while capturing an image compensates for movement (camera shake) while a picture is being taken. This feature is important for both ordinary lens- and macro lens-based picture-taking, since it eliminates (or at least diminishes) image blurring. For example, the Canon® Hybrid IS implements optical image stabilization.
For example, Fig. 1.51 is produced using the code in Listing 1.24. Each of the colour channels for the rgb image of a workshop is shown in Fig. 1.51. For example, the red colour channel for the workshop is displayed in the second image in the top row of Fig. 1.51. In the second row, the green and blue colour channels are displayed.
% colour experiments
g = imread('workshop.jpg');
%
gr = g(:, :, 1); gg = g(:, :, 2); gb = g(:, :, 3);
%
subplot(2, 2, 1); image(g); axis image;
title('original image');
subplot(2, 2, 2); image(gr, 'CDataMapping', 'scaled'); axis image;
title('r image');
subplot(2, 2, 3); image(gg, 'CDataMapping', 'scaled');
title('g image');
subplot(2, 2, 4); image(gb, 'CDataMapping', 'scaled');
title('b image');
%
% >> figure, colour     % sample use of colour.m
%
Listing 1.24 Use the Matlab code in colour.m to produce the images in Fig. 1.34 and Fig. 1.51.
Problem 1.38 Give Matlab code to display an hsv image (converted from rgb to hsv) and the colour channels for the hsv image (see Fig. 1.53). Do this for a pair of colour (rgb) images by converting the rgb images to hsv images.
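One possible starting point is sketched below (rgb2hsv is a built-in Matlab function; the image choice and display layout are illustrative).
% Sketch: convert an rgb image to hsv and display the hsv channels
g = imread('peppers.png');                % illustrative rgb image
h = rgb2hsv(g);                           % rgb to hsv conversion
subplot(1, 4, 1), imshow(g); title('rgb image');
subplot(1, 4, 2), imshow(h(:, :, 1)); title('hue');
subplot(1, 4, 3), imshow(h(:, :, 2)); title('saturation');
subplot(1, 4, 4), imshow(h(:, :, 3)); title('value');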
Problem 1.39 ® Give Matlab code to display the cameraman image as shown in
Fig. 1.52 so that concentric circles are drawn on the image. The center of both circles
should be positioned at (120,75) with inner circle radius equal to about 30 pixels and
outer circle radius equal to about 50 pixels.
Hint: Since you are only dealing with a greyscale image for the cameraman, you do not have to worry about colour channel values. Use the false-colour approach and change the pixel intensity to maximum intensity for each pixel along the circumference of the circle to be drawn in the cameraman image. Let (xc, yc) be the center of a circle with radius r. And let x = 0 : 0.01 : 1, y = 0 : 0.01 : 1. Then false colour each of the points at
(xc + r cos(2πx), yc + r sin(2πy)).
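A sketch of the hinted approach follows (the center and radius come from the problem statement; the row/column order may need adjusting for a particular image).
% Sketch: false colour pixels along a circle in the cameraman image
g = imread('cameraman.tif');
xc = 120; yc = 75; r = 30;                % center and inner radius
t = 0:0.01:1;                             % parameter along the circumference
for k = 1:numel(t)
    row = round(yc + r * sin(2 * pi * t(k)));
    col = round(xc + r * cos(2 * pi * t(k)));
    g(row, col) = 255;                    % maximum intensity on the circle
end
imshow(g)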
Algorithm Symbols.
−→ Maps to.
Example 1.42
S −→ cornerCoordinates(greyscaleImg)
reads S maps from cornerCoordinates(greyscaleImg) (i.e., S gets a copy of the coordinates of corners in the greyscale image).
This section briefly introduces an approach to detecting image geometry using either a Voronoï polygonal mesh or a Delaunay triangulation mesh overlay on a digital image. These meshes can be viewed either separately or in combination.
Example 1.44 Sample mesh generating points are displayed as green stars in Fig. 1.55. In this case, there are 55 sites scattered across the cycle image. Each site indicates a pixel that has a gradient orientation angle and gradient magnitude different from those of each of the other sites. Such sites are called keypoints. For the details about keypoints, see Sect. 8.8 and Appendix B.10.
In an intelligent systems approach to machine vision, the focus is on the selection of generating points, found in digital images or in video frames, that are useful in detecting image objects and patterns. An image generating point p in a digital image is a pixel used to find all pixels closer to p than to any other generating point in the image. In effect, we always start by identifying the generating points in a digital image. For now, we consider only image corners.
Let V(p) be an image Voronoï region of a corner generating point p. When we refer to a Voronoï region, we usually also mention the generating point used to construct the region. The collection of Voronoï regions that cover an image is called a Dirichlet tessellation,7 which is also called a Voronoï mesh. By joining pairs of nearest image generating points with straight edges, we obtain a Delaunay triangulation of an image, which is also called a Delaunay tessellation or a Delaunay mesh. A Delaunay mesh on an image is a collection of triangles that cover the image.
Example 1.45 Voronoï region and Delaunay Triangle.
There are two types of mesh polygons important in solving object recognition and
pattern recognition problems in either single images or in video frames.
7 This form of tessellation is named after Dirichlet who used Voronoï diagrams in 1850, even though
it was René Descartes who first had the idea as early as 1644 in an investigation of quadratic forms.
In 1907, it was Voronoï who extended Dirichlet tessellations to higher dimensions. Hence the name
Voronoï diagram. For more about this, see http://mathworld.wolfram.com/VoronoiDiagram.html.
A complete set of notes on Voronoï diagrams is available at http://www.ics.uci.edu/~eppstein/
junkyard/nn.html.
Voronoï regions and Delaunay triangles go hand-in-hand and yield quite different information about a tessellated surface.
Notice again that each Voronoï region V(p) is a convex polygon. This means that each straight edge connecting any pair of points in the interior or along the border of a Voronoï region V(p) is contained in that region.
A Voronoï mesh extracted from a tessellated digital image tends to reveal image
geometry and the presence of image objects.
Image Geometry: The term image geometry means geometric shapes such as
tiny-, medium- and large-sized polygons that surround image objects.
Example 1.47 Car Wheels Mesh. An isolated Voronoï image sub-mesh is shown in Fig. 1.57. Notice that the wheels of the tessellated car image in Fig. 1.54 are surrounded by twisting polygons along the borders of the wheels in the mesh in Fig. 1.56.1. This is an example of what is known as a mesh nerve.
Example 1.48 Voronoï Mesh Nerve that is a Maximal Nucleus Cluster (MNC).
A sample Voronoï mesh nerve is shown in a mesh with 55 sites (generating points)
in Fig. 1.60.2. The nucleus of this nerve is a yellow hexagon covering the upper front of the cycle. The nucleus is surrounded by red polygons. The combination of the yellow polygon
nucleus and red adjacent polygons constitutes a mesh nerve. Notice that the sites in the
adjacent red polygons can be connected pairwise to form a convex hull of two types
of sites, namely, the nucleus site and adjacent polygon sites. Let S be a nonempty set
of sites. The smallest convex set containing the set of points in S is the convex hull
of S. A nonempty set is a convex set, provided every straight line segment between
any two points in the set is also contained in the set. A sample convex hull is shown
in Fig. 1.60.3. In this example, the convex set contains the border points as well as
all points inside the borders of the blue polygon in Fig. 1.60.3.
Notice that every polygon in a Voronoï mesh is the nucleus of a mesh nerve, which is a cluster of polygons. Each mesh nerve is a cluster of polygons, containing a nucleus polygon in its center and a collection of polygons that share an edge with the nucleus. Also notice that each polygon in a Voronoï mesh nerve is a Voronoï region of a site used to construct the region polygon. By connecting each neighbouring pair of Voronoï polygons adjacent to the MNC nucleus, we can sometimes obtain a convex hull, which is one of the strongest indications of the shape of an image object covered by the MNC. That is, identifying MNC convex hulls in image nerves is important, since such convex hulls approximate the shape of an object covered by an MNC. For more about MNCs, see Sect. 7.5. For more about convex hulls, see Appendix B.3.
A Voronoï mesh tends to reveal image objects by the shapes of polygon clusters that surround image objects. The corner generating points by themselves (without the Voronoï polygons) tell us a lot about image geometry, since the generating points tend to follow the contours of image objects.
When we use a generating point to construct a particular Voronoï region, the resulting polygon tells us about all image pixels that are closer to the particular generating point than to any other pixel that is a generating point in the image. See, for example, the corner-based generating points in Fig. 1.61.
7o K Display a Voronoï mesh for the corners on both car wheels. Hint: Find the corners in the subimage containing both wheels.
8o K Display a Voronoï mesh for the corners on the complete car with the background. Hint: Find the corners in the subimage containing only the car. This will be a rectangular-shaped subimage containing the car.
Problem 1.51 Mesh Nerves.
Write a Matlab script to display only the corners extracted from a digital image. Do
this for your choice of any three colour images. In your script, do the following.
1o ® Repeat the first 4 steps in Problem 1.50.
2o K Use false colour to display a mesh nerve in the corner-based Voronoï mesh on the selected image. Hint: Find one Voronoï region of a corner in a selected subimage. This selected Voronoï region is the nucleus of a mesh nerve. Then use false colouring (try green) to highlight each of the polygons surrounding the selected polygon.
3o K Display the area of the polygon that is the nucleus of the nerve.
4o K Display a count of the polygons in the mesh nerve, including the nerve nucleus.
5o K Display the mesh nerve only on the selected image.
6o K Display the mesh nerve by itself (without the selected image).
Some mesh nerves are more interesting than others. An interesting nerve
typically has a small polygon as its nucleus and many polygons surround-
ing the nucleus.
Fig. 1.64 Delaunay on Voronoï geometric views of red car image structures
This section briefly introduces video frame mesh overlays. The basic approach is to
detect the geometry of objects in digital images by covering each video frame image
with mesh polygons surrounding (in the vicinity of) image objects. Image mesh
polygons tend to reveal the shapes and identity of objects. For a sample Dirichlet
tessellation of a video frame image, see Fig. 1.65.1. A Dirichlet tessellation of a
plane surface is a tiling of the surface.
Seed points (also called sites or generators) provide a basis for generating Dirichlet tessellations (also called Voronoï diagrams) and Delaunay triangulations of sets of points, providing a basis for the construction of meshes that cover a set with clusters of polygonal shapes. In general, a tessellation of a plane surface is a tiling of the surface. A plane tiling is a plane-filling arrangement of plane figures that cover the plane without gaps or overlaps [63]. The plane figures are closed sets. Taken by themselves or in combination, pixel intensity, corner, edge, centroid, salient, critical and key points are examples of seed points, with many variations.
Problem 1.57 A sample 8×8 grid is shown in Fig. 1.67. The red • dots indicate
interior corners and outer box corners. Using pencil and paper, do the following:
1o Draw a corner-based Voronoï mesh on the grid.
2o Draw a corner-based Delaunay mesh on the grid.
This section briefly introduces an approach to offline video processing. This form of
video processing has three basic steps.
has a gradient orientation that is sharply different from the gradient orientation
of its neighbouring pixels. The gradient orientation of a pixel is the angle of the
tangent to the edge containing the pixel.
4o Repeat Step 2 until all of the frames in the video have been processed.
Algorithm 5 illustrates a particular form of video processing. This algorithm pro-
duces an offline video in which a corner-based Voronoï mesh is superimposed on
each video frame image.
A Matlab script that implements Algorithm 5 is given in Appendix A.1.5. This
algorithm uses Algorithm 2 to overlay a Voronoï mesh on each frame in the video.
This is done offline, i.e., after a complete video has been captured. In the offline
mode, each video frame is treated as an ordinary image.
Example 1.58 The Matlab script A.8 in Appendix A.1.5 creates an .mp4 video file.
This script overlays a corner-based Voronoï mesh on each of the frames in each
video that is captured by the particular webcam that is used. For example, image
corners are used as seed points to construct the image Voronoï mesh in Fig. 1.54 and
video frame Voronoï meshes in Figs. 1.65 and 1.66. Notice that as the hand moves,
the Voronoï mesh polygons change. The changes in the polygons are a result of the
changing positions of the corners found in the image.
Problem 1.60 Offline Video Production of Corner Delaunay Head Image Frame
Meshes.
Problem 1.61 Offline Video Production of Corner Voronoï Hand Image Frame
Meshes.
® Write a Matlab script to do the following.
1o Create a video of a moving hand.
2o Offline, find and display a corner-based Voronoï mesh on each video frame.
Mark the corners with a red × symbol.
3o Create an .avi file showing the production of video frames containing a corner-based Voronoï mesh on each video frame image.
Problem 1.62 Offline Video Production of Corner Voronoï Head Image Frame
Meshes.
® Write a Matlab script to do the following.
1o Create a video of a moving head.
2o Offline, find and display a corner-based Voronoï mesh on each video frame.
Mark the corners with a red × symbol.
3o Create an .avi file showing the production of video frames containing a corner-based Voronoï mesh on each video frame image.
This section briefly introduces an approach to real-time video processing. This form
of video processing has three basic steps.
Example 1.64 The Matlab script A.9 in Appendix A.1.6 creates an .mp4 video file
for a sample form of real-time video processing. This script overlays a corner-based
Voronoï mesh on each of the frames during video capture by the particular webcam
that is used. For example, image corners are used as seed points to construct the
image Voronoï mesh in Fig. 1.68 and video frame Voronoï meshes in Fig. 1.65 and in
Fig. 1.69. Notice again that as the hand moves, the Voronoï mesh polygons change
in real-time. The changes in the polygons are a result of the changing positions of
the corners found in the image.
A pixel (aka picture element) is an element at position (r, c) (row, column) in a digital image I. A pixel represents the smallest constituent element in a digital image. Typically, each pixel in a raster image is represented by a tiny square called a raster image tile. Raster image technology has its origins in the raster scan of cathode ray tube (CRT) displays, in which images are rendered line-by-line by magnetically steering a focused electron beam. Usually, computer monitors have bitmapped displays in which each screen pixel corresponds to a bit depth, i.e., the number of bits used to render pixel colour channels.
By zooming in on (i.e., resampling) an image at different magnification levels, these tiny pixel squares become visible.
Example 2.1 Inspecting Raster Image Pixels.
Four views of a raster image are shown in Fig. 2.1:
1o Lower-left panel: hand-held camera with pixel inspection window:
Fig. 2.2 Zoom in at 100 and 800%, exhibiting colour image pixels
Each colour or greyscale or binary image pixel carries with it several numerical
values. There are a number of cases to consider.
Binary image pixel values: 1 for a white pixel and 0 for a black pixel.
Greyscale image pixel values: Commonly 0–255, for the pixel greyscale inten-
sity. Each greyscale pixel value quantizes the magnitude of white light for a pixel.
RGB image pixel values: Each colour pixel value quantizes the magnitude of a particular colour channel brightness for a pixel. A colour channel is a particular colour component of an image and corresponds to a range of visible light wavelengths. Each colour pixel contains intensities for three colour channels. For a colour pixel with a bit depth equal to 8 bits per channel, we have the following range of intensity (brightness) values for each colour channel.
Red: 0–255, for red pixel intensity (brightness).
Green: 0–255, for green pixel intensity (brightness).
Blue: 0–255, for blue pixel intensity (brightness).
Let I^k(u, v) be the intensity of the kth colour channel at camera image cell (u, v), Λ the set of wavelengths in the visible spectrum, p0^k a scaling factor, λ a particular wavelength, E_{u,v}(λ) the amount of incoming light at image cell (u, v), τ^k(λ) the filter transmittance for the kth colour channel, and s(λ) the spectral responsivity of a camera optical sensor. The final colour pixel value I^k(u, v) is defined by
I^k(u, v) = p0^k ∫_Λ E_{u,v}(λ) τ^k(λ) s(λ) dλ.
In a typical RGB camera, k ∈ {r, g, b}. Recently, color pixel values have been
used extensively in image segmentation [135, Sect. 2.1, p. 666] and for visual
object tracking [32, Sect. 2.1, p. 666].
Let img be an m × n colour image. In that case, the pixels in img can be accessed in the following ways. The notation img(:, :), img(r, :), img(:, c) can be used to inspect and change pixel intensities in binary, greyscale or colour images.
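For example (a sketch; the row and column choices are illustrative):
% Sketch: inspecting and changing pixels with img(:,:), img(r,:), img(:,c)
img = imread('peppers.png');
r = 50; c = 100;                          % illustrative row and column
p = img(r, c, :)                          % inspect channel values at (r, c)
img(r, :, 1) = 255;                       % maximize red channel along row r
img(:, c, 2) = 0;                         % zero green channel along column c
imshow(img)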
Ideally, a colour channel value indicates the magnitude of the colour channel light recorded by an optical sensor used in pixel formation of a colour image produced by a digital camera. Let I be a colour image. It is possible to convert the colour image I to a greyscale image Igr using the Matlab function rgb2gray(I).
Example 2.4 Figure 2.5 shows the result of converting a leaf colour image to a
greyscale image using MScript A.12 in Appendix A.2.3. Contrast between the two
forms of images becomes clearer when we zoom in on subimages of the original
image and its greyscale counterpart.
Colour Subimage
In this leaf image segment, there is a visible mixture of various shades of green.
Recall that shades of green are obtained by mixing yellow and blue. It is a straightforward task to inspect how visible green in an image is rendered digitally by the mixture of red, green and blue channel intensities.
Greyscale Subimage
In this leaf image segment, the original mixture of various shades of green is replaced by a mixture of greys. The change from pixel colour intensity to greyscale intensity can be seen in the sample pixel values in Fig. 2.6.
At the pixel level, pixel modification can be carried out by replacing each pixel in a colour or greyscale image I with either the average of the colour channel values, a map of the pixel values to real numbers using functions such as ln(x) or exp(x), or a weighted sum of the colour channel values. For example, for a greyscale image pixel Igr(x, y) at (x, y), the pixel intensity of Igr at (x, y) can be taken to be the channel average
Igr(x, y) = (1/3)(I(x, y, 1) + I(x, y, 2) + I(x, y, 3)).
In Matlab, we can write the channel average as shown in the sketch below.
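A minimal sketch of the channel-averaging conversion (the image choice is illustrative):
% Sketch: greyscale intensity as the average of the rgb channel values
I = im2double(imread('peppers.png'));
Igr = (I(:, :, 1) + I(:, :, 2) + I(:, :, 3)) ./ 3;  % channel average
imshow(Igr)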
Let g be an image (either colour or greyscale) and let k ∈ [0, 255]. Then new images i1, i2, i3, i4 are obtained using the image variable in simple algebraic expressions.
Image Algebraic Expressions I .
i1 = g + g,
i2 = (0.5)(g + g),
i3 = (0.3)(g + g),
i4 = (g/2) + g(0.2).
Let h be a colour image and use the following algebraic expressions to change
the pixel intensities in h.
Image Algebraic Expressions II .
i 5 = h + 30,
i 6 = h − (0.2)h,
i 7 = |h − (0.2)(h + h)| ,
i 8 = (0.2) (h + (0.5)(h + h)) .
Let img be a colour image and use the following algebraic expressions to change
the pixel intensities in img.
Image Algebraic Expressions III .
Fig. 2.12 Color channel pixel intensity changes induced by algebraic expressions III
2o Repeat Step 1 to change the green channel intensities in each video frame image.
3o Repeat Step 1 to change the blue channel intensities in each video frame image.
Problem 2.10 Real-Time Video Frame Colour Channel Changes.
Use the approach to changing image channel intensities in MScript A.15 in Appen-
dix A.2.3 as a template for real-time video processing, do the following.
1o K Using Matlab script A.9 in Appendix A.1.6 as a template for real-time video processing, change the red channel intensities in each video frame image. Hint: Replace the lines of Voronoï tessellation code in Matlab script A.9 with lines of code to handle and display changes in the red channel of each video frame image in real-time.
2o Repeat Step 1 to change the green channel intensities in each video frame image
in real-time.
3o Repeat Step 1 to change the blue channel intensities in each video frame image
in real-time.
Distinct images g and h can be added, provided the images have the same size. To combine pixel values in different images, it is necessary that the distinct images have the same dimensions. To get around this same-size problem, choose any n × m image img, which is the larger of the two images, and just copy the second image onto an n × m array of 1s or 0s (call it copy). Then img and copy can be combined in various ways.
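A sketch of this copy-onto-a-canvas workaround follows (the built-in demo images are illustrative stand-ins for the Thai shelf images discussed below).
% Sketch: embed a smaller image in an n x m canvas of 1s, then combine
g = im2double(imread('peppers.png'));     % larger image
h = im2double(imread('football.jpg'));    % smaller image
[n, m, ~] = size(g);
copy = ones(n, m, 3);                     % n x m array of 1s (white canvas)
copy(1:size(h, 1), 1:size(h, 2), :) = h;  % copy h onto the canvas
combined = (g + copy) ./ 2;               % combine the same-size images
imshow(combined)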
Example 2.11 Combining Pixel Intensities Across Separate Images.
The images in Fig. 2.13 show Thai grocery store shelves. These Thai shelf images are both approximately 1.5 MB. MScript A.16 in Appendix A.2.1 illustrates how to combine pixel intensities in pairs of different images. Two Thai grocery shelf images are combined in different ways in the first row of images in Fig. 2.14. The second row of images in Fig. 2.14 is the result of algebraic operations on just one of the original images.
Problem 2.12 Choose three different pairs of colour images g, h and do the follow-
ing.
1o ® In Image Algebraic Expressions I, replace g, g with g, h and display the
changed images using MScript A.16 in Appendix A.2.1.
2o Repeat Step 1 using the Image Algebraic Expressions II.
3o Repeat Step 1 using the Image Algebraic Expressions III.
There are many other possibilities besides the constructed images I1, . . . , I12 using the Algebraic Operations I, II and III. For example, one can determine the largest red colour value in a selected image row r using, e.g., max(g(r, :, 1)). Using g(r, c), new images can be constructed by modifying the red channel values using a maximum red channel value.
For example, a new image can be obtained by adding a fraction of a maximum red channel intensity in the first row of an image. Internally, a colour channel is just a greyscale image (not what we would imagine). An internal view of the modified red channel intensities is shown in Fig. 2.16.
Problem 2.15 ® Repeat the steps in Problem 2.14 using a minimum colour chan-
nel intensity.
One of the commonest forms of pixel selection is the selection of edge pixels. The basic approach is to detect those pixels that are on edges, either in a greyscale image or in a colour channel.
Briefly, to find edge pixels, we first find the gradient orientation (gradient angle) of each image pixel, i.e., the angle of the tangent to the edge at each pixel. Let img be a 2D image and let img(x, y) be a pixel at location (x, y). Then the gradient angle ϕ of pixel img(x, y) is found in the following way.
Gx = ∂img(x, y)/∂x,
Gy = ∂img(x, y)/∂y,
ϕ = tan⁻¹(Gy / Gx) = tan⁻¹( (∂img(x, y)/∂y) / (∂img(x, y)/∂x) ).
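A sketch of this computation (assuming the Image Processing Toolbox function imgradientxy; the image choice is illustrative):
% Sketch: per-pixel gradient angle from directional gradients
img = im2double(imread('cameraman.tif'));
[Gx, Gy] = imgradientxy(img);             % directional gradients
phi = atan2d(Gy, Gx);                     % gradient angle in degrees
imshow(phi, [])                           % scaled display of the angles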
In Canny's approach to edge pixel detection [24], each image is filtered to remove noise, which has the visual effect of smoothing an image. After the gradient for each pixel is found, a double threshold defining a hysteresis interval on gradient magnitudes is introduced by Canny. The basic idea is to choose all pixels with gradient magnitudes that fall within the hysteresis interval. Edge pixels that fall within the selected hysteresis interval are called strong edge pixels. All edge pixels with gradient magnitudes outside the hysteresis interval are called weak edge pixels. The weak edge pixels are ignored.
Before we separate out the edges from each colour image channel, we consider
the conventional approach to separating greyscale image edges embossed as white
pixels on a binary image.
Example 2.16 Figure 2.17 shows the result of finding the strong edge pixels in a
greyscale image derived from a colour image using MScript A.18 in Appendix A.2.5.
The basic approach is to start by converting a colour image to a greyscale image.
If we ignore the location of each colour pixel, then a colour image is an example
of a 3D image. Mathematically, each pixel p in location (x, y) in a colour image
is described by a vector (x, y, r, g, b) in a 5-dimensional Euclidean space, where
r, g, b are the colour channel brightness (intensity) values of pixel p. Traditionally,
edge detection algorithms require a greyscale image, which is a 2D image in which
each pixel intensity is visually a shade of grey ranging from pure white to pure black.
After choosing the pixels in a colour channel, then any of the usual edge detection
methods can be used on the single colour channel pixels. In this example, we use the
edge detection method introduced by John Canny [24].
Here are some of the details.
Colour Subimage
In this cycle image segment, the combined RGB channel pixels are shown.
BW Subimage Edges
In this cycle BW image segment, white edge pixels on a binary subimage are
shown.
The steps to follow in edge pixel detection in each of the colour channels are
given in Algorithm 7. Notice the parallel between the conventional approach to pixel
edge detection and colour channel edge detection in Algorithm 7. In both cases, edge
pixels (either in white or in colour) are embossed on a black image. Sample strong
edge pixels for the red channel of a cycle image are shown in Fig. 2.18.
Example 2.17 Figure 2.19 shows the result of finding the strong edge pixels in the
green channel of a colour image using MScript A.18 in Appendix A.2.5. The story
starts by selecting all of the pixels in a colour image. Traditionally, edge detection
algorithms require a greyscale image. The pixels in a single channel of a colour image
have the appearance of a typical 2D greyscale image, except that pixel intensities
are pixel colour brightness values in a single channel. After choosing the pixels in
a colour channel, then any of the usual edge detection methods can be used on the
single colour channel pixels. Here again, we use Canny’s edge detection method.
Example 2.18 Figure 2.20 shows the result of combining the red channel and the
green channel edge pixels again using MScript A.18 in Appendix A.2.5. This is
accomplished in a straightforward fashion by concatenating the separate images,
namely, img R (red channel edges), imgG (green channel edges) and a (entirely
black image).
Here are some of the details.
2o Repeat Step 1 to handle and display the green channel edges in each video frame
image.
3o Repeat Step 1 to handle and display the blue channel edges in each video frame
image.
4o K Using Matlab script A.8 in Appendix A.1.5 as a template for offline video
processing, display the combined red and green channel edges in each video
frame image. Hint: Replace the lines of Voronoï tessellation code with lines to
code in MScript A.18 to handle and display the combined red and green channel
edges in each video frame image.
5o Repeat Step 4 to handle and display the combined red and blue channel edges in
each video frame image.
6o Repeat Step 4 to handle and display the combined green and blue channel edges
in each video frame image.
processing, display the combined red and green channel edges in each video
frame image. Hint: Replace the lines of Voronoï tessellation code with lines to
code in MScript A.18 to handle and display the combined red and green channel
edges in each video frame image in real-time.
5o Repeat Step 4 to handle and display the combined red and blue channel edges in
each video frame image in real-time.
6o Repeat Step 4 to handle and display the combined green and blue channel edges
in each video frame image in real-time.
This section briefly introduces an approach to modifying image pixel values using various functions. We illustrate this approach using the natural log of pixel values over selected colour image channels. The steps to follow in modifying each of the channel intensities, resulting from the log of each colour channel pixel intensity, are shown in Algorithm 8.
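A minimal sketch of such a log-based channel modification (the 0.2 scaling factor matches the one reported for MScript A.19 below; the image choice is illustrative):
% Sketch: scaled natural log applied to each colour channel
g = im2double(imread('peppers.png'));
h = zeros(size(g));
for k = 1:3                               % for each colour channel
    h(:, :, k) = 0.2 .* log(1 + g(:, :, k)); % log of channel intensities
end
figure, imshow(g), figure, imshow(h)
Increasing the scaling factor brightens the modified image.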
Example 2.24 Figure 2.22 shows the result of a log-based modification of channel pixel intensities in a colour image using MScript A.19 in Appendix A.2.6. Here are sample coding steps in the basic approach.
Colour Subimage
In this colour image segment, only the front wheel is shown.
1o ® Compute the cosine of each colour channel intensity and produce four images like the ones in Fig. 2.23. Hint: Modify MScript A.19 in Appendix A.2.6 to get the desired result.
2o Repeat the preceding step for two different choices of the scaling factor to adjust
the brightness of the modified images. For example, 0.2 is the scaling factor in
MScript A.19 and 1.8 is the scaling factor used to obtain the results in Fig. 2.23.
Problem 2.26 Colour Channel Edge Information Content.
Select three colour images of your own choosing and do the following.
1o K Compute the information content of each colour channel edge pixel intensity and produce four images like the ones in Fig. 2.23. Hint: Find the total number of pixels in each image. Assume that the edge pixel intensities in the digital image
img are random. In addition, let the probability p(img(x, y)) = 1/(x ∗ y) for each
image intensity img(x, y) for a pixel with coordinates (x, y), 1 ≤ x ≤ m, 1 ≤
y ≤ n in an n ×m image.1 Then, for each colour channel pixel intensity, compute
the colour channel edge pixel information content h(img(x, y, k)), k = 1, 2, 3
of an edge pixel defined by
h(img(x, y, k)) := log2( 1 / p(img(x, y, k)) ) (colour channel pixel info. content).
And, for each colour pixel edge intensity, compute the colour edge pixel infor-
mation content h(img(x, y)), 1 ≤ x ≤ m, 1 ≤ y ≤ n of an edge pixel defined
by
h(img(x, y)) := log2( 1 / p(img(x, y)) ) (pixel information content).
1 Many other ways to compute the probability of a pixel intensity img(x, y) are possible. There is
a restriction:
Σ_{i=1}^{n∗m} pi(img(r, c)) = 1, 1 ≤ r ≤ m, 1 ≤ c ≤ n.
2o Repeat the preceding steps for two different choices of the scaling factor to adjust
the brightness of the modified images.
The logical operations are not, and, or, and xor (exclusive or). This section introduces
the use of not, or, and xor (exclusive or) on image pixels. Later, it will be shown how
the and operation can be combined with what is known as thresholding to separate
the foreground from the background of images (see Sect. 2.8).
For a greyscale image, the complement of the image makes dark areas lighter and bright areas darker. For a binary image g, not(g) changes background (black) values to white and foreground (white) values to black. The not(g) operation produces the same result as imcomplement(g).
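For example (a sketch using the built-in cameraman image; the threshold is illustrative):
% Sketch: image complement and logical not of a binary image
g = imread('cameraman.tif');
gc = imcomplement(g);                     % dark areas lighter, bright darker
gbw = im2bw(g, 0.5);                      % binary version of g
gnot = not(gbw);                          % same result as imcomplement(gbw)
subplot(1, 4, 1), imshow(g); subplot(1, 4, 2), imshow(gc);
subplot(1, 4, 3), imshow(gbw); subplot(1, 4, 4), imshow(gnot);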
Fig. 2.25 Sample complement and logically negated binary pixel intensities
To see what the xor operation does, consider Table 2.1, where x, y are pixel intensities
in a binary image. Table 2.1 is modelled after an exclusive or truth table. In Matlab,
the exclusive or operation produces the following sample result on a pair of binary
images. To see what happens, consider the following pair of colour images.
% constructing new images from old images using xor
% idea from Solomon and Breckon, 2011
clc, close all, clear all
% What's happening?
g = imread('race1.jpg'); h = imread('race2.jpg'); % read images
gbw = im2bw(g); hbw = im2bw(h);                   % convert to binary
check = xor(gbw, hbw);
subplot(1, 3, 1), imshow(gbw);                    % display gbw
subplot(1, 3, 2), imshow(hbw);                    % display hbw
subplot(1, 3, 3), imshow(check);                  % display xor(gbw, hbw)
Next, the pair of colour images in Fig. 2.26 is converted to binary images (every pixel value is either 1 (white) or 0 (black)) after applying the im2bw function to each image. Then the xor function is applied (see Listing 2.2) to the pair of binary images to obtain the result shown in Fig. 2.27.
% constructing new images from old images
close all
clear all
% What's happening?
% g = imread('birds1.jpg'); h = imread('birds2.jpg'); % read images
g = imread('race1.jpg'); h = imread('race2.jpg');     % read images
gbw = im2bw(g, 0.3); hbw = im2bw(h, 0.3);             % convert to binary
check = xor(gbw, hbw);                                % xor binary intensities
figure,
subplot(1, 3, 1), imshow(gbw);                        % display gbw
subplot(1, 3, 2), imshow(hbw);                        % display hbw
subplot(1, 3, 3), imshow(check);                      % display xor(gbw, hbw)
For the sake of completeness, the same experiment is performed on a pair of .jpg
colour images showing two different Thai grocery store displays. The interesting
thing here is seeing how the xor operation on the displays reveals movements of
similar items (bottles) from one display to the other (Fig. 2.28).
% constructing new images from old images
clc, clear all, close all                   % housekeeping
g = imread('P9.jpg'); h = imread('P7.jpg'); % read jpg images
%
gbw = im2bw(g); hbw = im2bw(h);             % convert to binary
check = xor(gbw, hbw);                      % xor binary intensities
subplot(1, 3, 1), imshow(gbw);              % display gbw
subplot(1, 3, 2), imshow(hbw);              % display hbw
subplot(1, 3, 3), imshow(check);            % display xor(gbw, hbw)
Greyscale and colour images can be transformed into binary (black and white) images, where the pixels in the foreground of an image are white and the pixels in the background of an image are black. The separation of image foreground from background is accomplished using a technique called thresholding. The thresholding method results in a binary image by changing each background pixel value to 0, if the pixel value is below a threshold, and each foreground pixel value to 1, if the pixel value is greater than or equal to the threshold. Let th ∈ (0, ∞] denote a threshold and let g denote a greyscale image. Then
gbw(x, y) = 1, if g(x, y) ≥ th, and gbw(x, y) = 0, otherwise.
% Thresholding a greyscale image
clc, clear all, close all                 % housekeeping
g = imread('cameraman.tif');              % read greyscale image
h1 = im2bw(g, 0.1);                       % threshold = 0.1
h2 = im2bw(g, 0.4);                       % threshold = 0.4
h3 = im2bw(g, 0.6);                       % threshold = 0.6
subplot(1, 4, 1), imshow(g);              % display greyscale image
subplot(1, 4, 2), imshow(h1);             % display transformed image
subplot(1, 4, 3), imshow(h2);             % display transformed image
subplot(1, 4, 4), imshow(h3);             % display transformed image
Notice that th = 0.5 works best in separating the cameraman from the background
(in fact, the background is no longer visible in Fig. 2.30 for th = 0.5). If there is
interest in isolating the foreground of a greyscale image, it is necessary to experiment
with different thresholds to obtain the best result. The code used to produce Fig. 2.30
is given in Listing 2.4.
Separating the foreground from the background in colour images can either be
done uniformly (treating all three colour channels alike) or finely by thresholding
each colour channel individually. Sample results of the uniform separation approach
are shown in Fig. 2.31 using the code Listing 2.5.
% Thresholding a colour image
% What's happening?
g = imread('rainbow.jpg');                % read colour image
% g = imread('penguins.jpg');             % read colour image
h1 = im2bw(g, 0.1);                       % threshold = 0.1
h2 = im2bw(g, 0.4);                       % threshold = 0.4
h3 = im2bw(g, 0.5);                       % threshold = 0.5
subplot(1, 4, 1), imshow(g); title('Scottish shoreline');
subplot(1, 4, 2), imshow(h1); title('th = 0.1');
subplot(1, 4, 3), imshow(h2); title('th = 0.4');
subplot(1, 4, 4), imshow(h3); title('th = 0.5');
varying pixel colour intensities. Common applications of this reversal process are
in signature forgery detection and camouflage detection in paintings and in satellite
images.
Another useful technique in separating the foreground from the background in colour images stems from an application of the logical and operation. The basic idea is to threshold the pixel intensities in each colour channel and then experiment with the conjunction of the resulting colour changes, either in pairs or as the conjunction of all three thresholded colour channels. Let I be a colour image with colour channels r, g, b, and let rth, gth, bth denote thresholds on the red, green, blue colour channels, respectively. Then
Ibw(x, y) = (r(x, y) ≥ rth) ∧ (g(x, y) ≥ gth) ∧ (b(x, y) ≥ bth).
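A sketch of this channel-wise thresholding (the image choice and threshold values are illustrative):
% Sketch: per-channel thresholds combined with the logical and operation
I = im2double(imread('peppers.png'));
rth = 0.5; gth = 0.4; bth = 0.4;          % illustrative channel thresholds
rbw = im2bw(I(:, :, 1), rth);             % threshold red channel
gbw = im2bw(I(:, :, 2), gth);             % threshold green channel
bbw = im2bw(I(:, :, 3), bth);             % threshold blue channel
Ibw = rbw & gbw & bbw;                    % conjunction of the channels
imshow(Ibw)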
Image contrast can be improved by altering the dynamic range of an image. The dynamic range of an image equals the difference between the smallest and largest image pixel values. Transforms can be defined by altering the relation between the dynamic range and the greyscale (colour) image pixel values. For example, an image dynamic range can be altered by replacing each pixel value with its logarithm. Let g denote an image. Then alter the pixel value at (x, y) using
g(x, y) = k · loge(1 + g(x, y)),   (2.1)
where
k = 255 / loge(1 + max(g)).
To simplify the implementation of Eq. (2.1), use the following technique to alter all pixel values in g:
g = k .∗ log(1 + im2double(g)).
Next observe that, since g is a matrix, max(g) returns a row vector containing the maximum pixel value from each column. To complete the implementation of k, use
k = mean(255 ./ log(1 + max(g))).
Notice that by increasing the value of the multiplier k, the overall brightness of the
image increases.2 The best result for the signature image is shown in the third image in
row 2 of Fig. 2.34, where 5.*log(g + h) is used on the image g. A less than satisfactory
result is obtained using k.*log(1 + im2double(g)). The logarithmic transform in
Eq. (2.1) induces a brightening of the foreground by spreading the foreground pixel
values over a wider range and a compression of the background pixel range. The
2 Many thanks to Patrik Dahlström for pointing out the corrections in eg_log1.m.
narrowing of the background pixel range provides a sharper contrast between the
background and the foreground.
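Putting the pieces of the logarithmic transform together (a sketch; the image choice is illustrative):
% Sketch: logarithmic transform of Eq. (2.1) with scaling constant k
g = imread('cameraman.tif');
k = mean(255 ./ log(1 + max(im2double(g)))); % scaling constant
h = k .* log(1 + im2double(g));              % Eq. (2.1) for all pixels
figure, imshow(g), figure, imshow(h, [])     % original and transformed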
Problem 2.33 Let g denote either a greyscale or colour image. In Matlab, implement Eq. (2.1) using (e^σ − 1)·g(x, y) instead of im2double(g) and show sample images using several choices of σ. Use the cameraman image as well as the signature image to show the results for different values of σ.
Use the Matlab whos function to display the information about the current vari-
ables in the workspace, e.g., variables k and com4 in Listing 2.7. Matlab constructs
the double data type in terms of the definition for double precision in IEEE Standard
754, i.e., double precision values require 64 bits (for Matlab, double is the default data
type for numbers). The im2double(g) function converts pixel intensities in image g
to type double.
The constant k provides a means of scaling the transformed pixel values. Here are rules-of-thumb for the choice of γ.
(rule.1) γ > 1: Increase contrast between high-valued pixels at the expense of low-valued pixels.
(rule.2) γ < 1: Increase contrast between low-valued pixels at the expense of high-valued pixels.
% Gamma transform
clc, clear all, close all                 % housekeeping
g = imread('P9.jpg');                     % read image
% h = imread('P7.jpg');                   % read image
g = im2double(g);
g1 = 2*(g.^(0.5)); g2 = 2*(g.^(1.5)); g3 = 2*(g.^(3.5));
subplot(1, 4, 1), imshow(g);              % display g
title('Thai shelves');
subplot(1, 4, 2), imshow(g1);             % gamma = 0.5
title('gamma = 0.5');
subplot(1, 4, 3), imshow(g2);             % gamma = 1.5
title('gamma = 1.5');
subplot(1, 4, 4), imshow(g3);             % gamma = 3.5
title('gamma = 3.5');
There is a nonlinear relationship between input voltage and output intensity in monitor displays. This problem can be corrected by preprocessing image intensities with an inverse gamma transform (also called the inverse power law transform) using
gout = gin^(1/γ) + k,
where gin is the input image and gout is the output image after gamma correction.
Gamma correction can be carried out using the Matlab imadjust function as shown
in gamma_adjust.m with sample results shown in Fig. 2.36. Unlike the results in
Fig. 2.35 with the gamma transform, the best result in Fig. 2.36 is obtained with a
lower γ value, namely, γ = 1.5 in Fig. 2.36 as opposed to γ = 3.5 in Fig. 2.35.
% Gamma correction transform
clc, clear all, close all                 % housekeeping
g = imread('P9.jpg');                     % Thai shelves image
% g = imread('sig.jpg');                  % currency signature
g = im2double(g);
g1 = imadjust(g, [0 1], [0 1], 0.5);      % in/out range [0,1]
g2 = imadjust(g, [0 1], [0 1], 1.5);      % in/out range [0,1]
g3 = imadjust(g, [0 1], [0 1], 3.8);      % in/out range [0,1]
subplot(1, 4, 1), imshow(g);              % display g
Problem 2.34 ® Experiment with the currency signature in Fig. 2.34 using both
the gamma transform and inverse gamma transform. Which value of γ gives the best
result in each case? The best result will be the transformed image that has the clearest
signature.
Chapter 3
Visualising Pixel Intensity Distributions
This chapter introduces various ways to visualize pixel intensity distributions (see, e.g., Fig. 3.1). Also included here are pointers to sources of generating points useful in image tessellations and triangulations. In other words, image structure visualization carries with it tacit insights about image geometry.
The basic approach here is to provide 2D and 3D views of pixel intensities in cropped digital images. By cropping a colour image, it is possible to obtain different views of either the combined pixel colour values or the individual colour channel pixel values within the same image. The importance of image cropping cannot be overestimated. Image cropping extracts a subimage from an image. This makes it possible to concentrate on that part of a natural scene or laboratory sample that is considered interesting, relevant, and deserving of a closer look. Pixel intensities are
yet another source of generating points (sites) used to tessellate an image, resulting in image meshes that reveal image geometry and image objects from different perspectives.
Fig. 3.2 Sample RGB image for the Salerno train station
Fig. 3.3 3D view of green • channel pixel intensities with contours for Fig. 3.2
Example 3.1 Matlab script A.22 in Appendix A.3 is used to do the following:
1o Crop an rgb image to obtain a subimage. For example, the tiny image in Fig. 3.4
is the result of cropping the larger image in Fig. 3.2.
2o Produce a 3D mesh showing the combined rgb pixel values. The result for the cropped image is shown in Fig. 3.1.
3o Produce a 3D mesh with contours for the red • channel values. The result for the
red channel values for the pixels in the cropped image is shown in Fig. 3.5.1.
4o Produce a 3D mesh with contours for the green channel values. The result for the green • channel values for the pixels in the cropped image is shown in Fig. 3.3. The green channel values in a colour image often tend to have the greatest number of changes between the minimum and maximum values. Hence, the green channel is a good place to look for non-uniformity in the selection of generating points (sites) used in a Voronoï tessellation of an image. To see this, consider the difference between the 3D meshes and their contours, starting with the 3D mesh for the green channel in Fig. 3.3, compared with the red channel values in Fig. 3.5.1 and blue channel values in Fig. 3.5.2.
5o Produce a 3D mesh with contours for the blue channel values. The result for the
blue • channel values for the pixels in the cropped image is shown in Fig. 3.5.2.
There are a number of ways to visualize the distribution of pixel intensities in a digital
image. A good way to get started is with a greyscale histogram of the intensities.
Example 3.2 Sample Greyscale Histogram. Sample pixel intensity counts for each
greyscale pixel in Fig. 3.2 are shown in Fig. 3.6. To experiment with image pixel
intensity counts, see script A.21 in Appendix A.3. For the details, see Sect. 3.1.1
given next.
3.1.1 Histogram
An image histogram plots the relative frequency of occurrence of image pixel intensity
values against the intensity values. Histograms are constructed using a technique
called binning, since it is usually not possible to include individual pixel intensity
values in a histogram. An image intensity bin (also called an image intensity bucket)
is a set of pixel intensities within a specified range. Image binning is the method of
assigning each pixel intensity to a bin containing matching intensities. Typically, a
histogram for an intensity image contains 256 bins, one pixel intensity per bin, and
displays the size (cardinality) of each bin. Here is another example.
Example 3.3 Let img(x, y) be a pixel intensity at location (x, y) and let 0 ≤ i ≤ 255
represent the intensity of bin i. Then all pixels with intensities matching the intensity
of img(x, y) are assigned to bin i.
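As a minimal sketch of binning (the choice of image is an assumption), the size of bin i can be read off the counts vector returned by imhist:
% Binning sketch: size of bin i for a greyscale image
g = imread('cameraman.tif');   % any greyscale image
counts = imhist(g, 256);       % 256 bins, one intensity per bin
i = 80;                        % intensity of interest
counts(i + 1)                  % size of bin i (Matlab indices start at 1)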
For a sample Matlab script that explores binning for both colour and greyscale
images, see script A.21 in Appendix A.3. For an expanded study of binning,
see [21, §3.4.1].
Example 3.4 To inspect the numbers of intensities in a subimage, crop a selected
image. For example, crop the image in Fig. 3.7, selecting only the fisherman’s head
and shoulders as shown in Fig. 3.8. Then, for intensities 80, 81, 82, use script A.21
in Appendix A.3 to compare the size of bins 80, 81, and 82 with the original image.
In other words, the cardinality of the bins in the cropped image decreases sharply
compared with the bins in the original image.
In Matlab, the imhist function displays a histogram for a greyscale image. If I is
a greyscale image, the default display for imhist(I) is 256 bins, one bin for each image
intensity. Use imhist(I,n) to display n bins in the histogram for I (see, e.g., Fig. 3.6,
a sample greyscale histogram for the rgb plant image). Use
[counts, x] = imhist(I);
to store the relative frequency values in counts for the histogram, with horizontal axis
values stored in x. See also the histeq function introduced in Sect. 3.6.
3.1.3 Plot
Relative to the vectors counts and x extracted from a histogram, the plot function
produces a 2D plot of the relative frequency counts (see, e.g., Table 3.1):
plot(x, counts);
Notice the evidence of the separation of image intensities in the first of the
contour plots, and the evidence of the results of thresholding the rice image in the
contour plot drawn beneath the surface in the second of the above surface plots.
Using surfc, we obtain a visual perspective of the results produced by the script in
Listing 3.4 in Fig. 3.16.
% Visualisation experiment
g = imread('rice.png');   % read greyscale image
g = im2double(g);
[x y] = meshgrid(max(g));
z = 20.*log(1 + g);
figure, surfc(x, y, z); zlabel('z = 20.*log(1 + g)');
The meshgrid function combined with surfc produces a wireframe surface plot with a
contour plot beneath the surface. This form of visualizing image intensities is shown
in Fig. 3.9. In Listing 3.1, meshgrid(g) is an abbreviation for meshgrid(g,g), transforming
the domain specified by the image g into arrays x and y, which are then used
to construct 3D wireframe plots.
The contour function draws a contour plot, with 3D surface values mapped to isolines,
each with a different colour.
3.2 Isolines
An isoline for a digital image connects points in a 2D plane that all represent the
same intensity. The points in an isoline represent heights above the x-y plane. In
the case of the red channel in an rgb image, the points in an isoline represent brightness
levels of the colour red (see, e.g., Fig. 3.11). Each isoline belongs to a surface, e.g.,
the line indicating the locations of intensity value 100. For an isoline,
there is an implicit scalar field defined in 2D, such as the value 100 at locations (0,0)
and (25,75). The Matlab clabel function can be used to insert intensity values in an
isoline. However, an attractive alternative to clabel is the combination of the set and
get functions, which make it possible to control the range of values represented by
isolines.1 Sample isolines are shown in the contour plots in Figs. 3.10.2 and 3.1.5,
produced by script A.23 in Appendix A.3.3.
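As a minimal sketch (the image and the number of isolines are assumptions), labelled isolines can be produced as follows:
% Isoline sketch: labelled contour plot of image intensities
g = im2double(imread('pout.tif'));            % any greyscale image
[C, h] = contour(g, 10);                      % 10 isolines of intensity
set(h, 'LevelStep', get(h, 'LevelStep')*2);   % show fewer labelled levels
clabel(C, h);                                 % insert intensity values on isolines
axis ij, axis image                           % image-style axes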
Problem 3.5 Experiment with surf, surfc, stem, stem3, plot, mesh, meshc, meshgrid
and contour and, in each case, display various visualizations using the mini-image
represented by the array g in
Repeat the same experiments with the pout.tif greyscale image. Then demonstrate
the use of each of the visualization functions with each of the colour channels in a
sample colour image.
Example 3.6 By way of illustration of the histogram and stem plot for an image
intensity distribution, consider the distribution of intensities for pout.tif (a useful
image in the Matlab library). A good overview of image histograms is given by
M. Sonka, V. Hlavac and R. Boyle [185, §2.3.2]. The utility of a histogram can be
seen in the fact that it is possible, for some images, to choose a threshold value for
logarithmically compressing the dynamic range of an image, where the threshold is
an intensity in a valley between dominant peaks in the histogram. This is the case in
Fig. 3.12, where an intensity of approximately 120 provides a good threshold.
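A minimal sketch of valley thresholding (im2bw expects a level in [0,1], so the valley intensity 120 is scaled by 255):
% Threshold at a valley between histogram peaks
g = imread('pout.tif');     % greyscale image from the Matlab library
figure, imhist(g);          % inspect the valley between dominant peaks
bw = im2bw(g, 120/255);     % threshold at the valley intensity 120
figure, imshow(bw);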
1 This approach to labelling contour lines gives control over the height labels that are displayed
on the contour lines. For example, try set(h, 'LevelStep', get(h, 'LevelStep')*2) to inhibit the lower height labels.
% Histogram experiment
%% housekeeping
clc, clear all, close all
%%
% This section for colour images
I = imread('fishermanHead.jpg');
% I = imread('fisherman.jpg');
% I = imread('football.jpg');
I = rgb2gray(I);
%%
% This section for intensity images
%I = imread('pout.tif');
%%
% Construct histogram:
%
h = imhist(I);
[counts, x] = imhist(I);
counts
size(counts)
subplot(1,3,1), imshow(I);
subplot(1,3,2), imhist(I);
ylabel('pixel count');
subplot(1,3,3), stem(x, counts);
grid on
script histh.m). Display the original image, thresholded image, and image histogram,
indicating the intensity you have chosen for the threshold. Also, display histh.m.
Also of interest are the distributions of colour channel intensities in a colour image.
After recording the colour channel intensities for each pixel, superimposed stem plots
for each colour serve to produce a colour histogram.
% Colour image histogram
% algorithm from
% http://www.mathworks.com/matlabcentral/fileexchange/authors/100633
close all
clear all
% What's happening?
g = imread('rainbow-plant.jpg');   % read rgb image
%g = imread('sitar.jpg');          % read greyscale image
nBins = 256;                       % bins for 256 intensities
rHist = imhist(g(:,:,1), nBins);   % save red intensities
gHist = imhist(g(:,:,2), nBins);   % save green intensities
bHist = imhist(g(:,:,3), nBins);   % save blue intensities
figure
subplot(1,2,1); imshow(g), axis on % display orig. image
subplot(1,2,2)                     % display histogram
h(1) = stem(1:256, rHist); hold on         % red stem plot
h(2) = stem((1:256) + 1/3, gHist);         % green stem plot
h(3) = stem((1:256) + 2/3, bHist);         % blue stem plot
hold off
set(h, 'marker', 'none')           % set properties of bins
set(h(1), 'color', [1 0 0])
set(h(2), 'color', [0 1 0])
set(h(3), 'color', [0 0 1])
axis square                        % make axis box square
The plot in Fig. 3.14 shows a histogram that presents the combined colour channel
intensities for the plant image. By modifying the code in Listing 3.3, it is possible
to display three separate histograms, one for each colour channel in the plant image
(see Fig. 3.15).
\[ g_{th} = g - (f + C) \]
for a filtered image f and a constant C.
A combination of the imfilter and fspecial filter functions can be used to compute
filtered image neighbourhood values. First, decide on an effective n × n neighbourhood
size and use the averaging filter option ('average') for fspecial. Then use the
'replicate' option to populate all image neighbourhoods with the average filter values
for each neighbourhood. For example, choose n = 9 for a 9 × 9 neighbourhood and
combine the two Matlab filters to obtain
% Histogram experiment
close all
clear all
g = imread('rainbow-plant.jpg');   % read rgb image
g = rgb2gray(g);
%g = imread('rice.png');           % read greyscale image
gf = imfilter(g, fspecial('average', [15 15]), 'replicate');
gth = g - (gf + 20);
gbw = im2bw(gth, 0);
subplot(1,4,1), imshow(gbw);
%set(gca, 'xtick', [], 'ytickMode', 'auto');
subplot(1,4,2), imhist(gf); title('avg filtered image');
grid on
glog = imfilter(g, fspecial('log', [15 15]), 'replicate');
gth = g - (glog + 100);
gbw = im2bw(gth, 0);
%glog = imfilter(g, fspecial('prewitt'));
%glog = imfilter(g, fspecial('sobel'));
%glog = imfilter(g, fspecial('laplacian'));
%glog = imfilter(g, fspecial('gaussian'));
%glog = imfilter(g, fspecial('unsharp'));
gbw = im2bw(gth, 0);
subplot(1,4,3), imshow(gbw);
set(gca, 'xtick', [], 'ytickMode', 'auto');
subplot(1,4,4), imhist(glog); title('filtered image');
grid on
g = input image,
c, d = max(max(g)), min(min(g)), respectively,
a, b = new dynamic range for g,
\[ g(x, y) = \frac{a - b}{c - d}\,\bigl(g(x, y) - c\bigr) + a. \]
A combination of the stretchlim and imadjust functions can be used to carry out
contrast stretching on an image. For example, choose the new dynamic range to be
the 10th and 90th percentile points of the cumulative distribution of pixel values.
This means that 10% of the pixel values will lie below the new minimum and 10%
will lie above the new maximum.
% Contrast-stretching experiment
clear all
close all
g = imread('rainbowshoe.jpg');   % read colour image
%g = imread('rainbow.jpg');      % read colour image
%g = imread('tooth819.tif');
%g = imread('tooth2.png');
%g = imread('tooth.tif');
%g = rgb2gray(g);
stretch = stretchlim(g, [0.03, 0.97]);
h = imadjust(g, stretch, []);
subplot(1,2,1), imshow(g);
title('rgb image');
%title('greyscale image');
axis on
subplot(1,2,2), imshow(h);
title('contrast stretched');
axis on
In the contrast-stretched image in Fig. 3.17, the shoe and the spots on the floor to
the right of the shoe are now more visible, i.e., more distinguishable.2
Notice that contrast stretching is performed on an rgb image converted to
greyscale. It is possible to do contrast stretching directly on an rgb image (see,
e.g., Fig. 3.18). The changed distribution of relative frequencies of pixel values is
evident in the contrasting histograms in Fig. 3.20.
% Contrast-stretched dynamic ranges
g = imread('tooth819.tif');
stretch = stretchlim(g, [0.03, 0.97]);
h = imadjust(g, stretch, []);
2 The shoe image in Fig. 3.17, showing refracted light from the windows overlooking an upper
staircase landing in building E2, EITC, U of Manitoba, was captured by Chido Uchime with a cell
phone camera.
subplot(1,2,1), imhist(g);
title('tooth histogram');
subplot(1,2,2), imhist(h);
title('new tooth histogram')
The choice of the new dynamic range is image-dependent. Consider, for example,
an image of a micro-slice of a 350,000 year old tooth fossil found in Serbia in 1990.
This image and the corresponding contrast-stretched image are shown in Fig. 3.19. The
features of the tooth are barely visible in the original image. After choosing
a new dynamic range equal to [0.03, . . . , 0.97], the features of the tooth are more
sharply defined.
The contrast between the distribution of relative frequencies of pixel values in
the original tooth image and contrast-stretched image can be seen by comparing the
histograms in Fig. 3.21, especially for the high intensities.
Problem 3.9 Experiment with the tooth image tooth819.tif using contrast
stretching. The challenge here is to find a contrast-stretched image that more sharply
defines the parts of the tooth image.
g = imread('tooth819.tif');   % tooth image
ramp = 40:60;                 % target histogram distribution
h = histeq(g, ramp);          % histogram equalisation
subplot(1,2,1), imshow(g);
title('tooth image');
subplot(1,2,2), imshow(h);
title('equalised image')
After some experimentation, it was found that the best result is obtained with a
target histogram range equal to 40:60. This leads to the result shown in Fig. 3.23.
Even with this narrow range for the target histogram, the regions of the tooth in the
resulting image are not as sharply defined as they are in the contrast-stretched image
of the tooth in Fig. 3.19.
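For comparison, a minimal sketch of plain histogram equalisation against the default (approximately flat) target, assuming the same tooth image:
% Histogram equalisation with the default flat target
g = imread('tooth819.tif');   % tooth image from the text
h = histeq(g);                % default: flat 64-bin target histogram
subplot(1,2,1), imshow(g); title('original image');
subplot(1,2,2), imshow(h); title('histeq, flat target');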
Problem 3.10 Experiment with the tooth image tooth819.tif using histogram match-
ing. The challenge here is to identify a target histogram that more sharply defines
the parts of the tooth image.
Chapter 4
Linear Filtering
This chapter introduces linear spatial filters. A linear filter is a time-invariant device
(function, or method) that operates on a signal to modify the signal in some fashion.
In our case, a linear filter is a function that has pixel (colour or non-colour) values
as its input. In effect, a linear filter is a linear function on sets of pixel feature values
such as colour, gradient orientation and gradient magnitude (especially gradient
magnitude, which is a measure of edge pixel strength), which are either modified or
exhibited in some useful fashion. For more about linear functions, see Sect. 5.1.
From an engineering perspective, one of the most famous as well as important
papers on filtering is the 1953 paper by L.A. Zadeh [215]. Very relevant to the interests
of computer vision are Zadeh’s ideal and optimum filters. An ideal filter is a filter that
yields a desired signal without any distortion or delay. A good example of an ideal
filter is the Robinson shape filter from M. Robinson [167, Sect. 5.4, p. 159ff], useful
in solving shape recognition problems in computer vision. Ideal filters are often not
possible. So Zadeh introduced optimum filtering. An optimum filter is a filter that
yields the best (closest) approximation of the desired signal. Another classic paper
that is important for computer vision is J.F. Canny’s edge filtering method, introduced
in [24] and elaborated in [25]. For a recent paper on scale-invariant filtering for edge
detection in digital images, see S. Mahmoodi [118]. For more about linear filters in a
general setting, see R.B. Holmes [84], and for linear filters in signal processing see,
especially, D.S. Broomhead, J.P. Huke and M.R. Muldoon [20, Sect. 3].
Previously, the focus was on manipulating the dynamic range of images to improve,
sharpen and increase the contrast of image features. In this chapter, the focus shifts
from sharpening image contrast to image filtering, which is based on weighted sums
of local neighbourhood pixel values. As a result, we obtain a means of removing
image noise, sharpening image features (enhancing image appearance), and achieving
edge and corner detection. The study of image filtering methods has a direct bearing
on various approaches to image analysis, image classification and image retrieval.
Indications of the importance of image filtering in image analysis and
computer vision can be found in the following filtering approaches.
Fast elliptical filtering: Radially-uniform box splines are constructed by K.N.
Chaudhury, A. Munoz-Barrutia and M. Unser in [28] via repeated convolution of
a fixed number of box distributions. A sample result of the proposed filtering methods
is shown in Fig. 4.1.
Gaussian smoothing filtering: This method has two main steps given by S.S. Sya
and A.S. Prihatmanto [189]: (1) a given raster image is normalized in the RGB
colour space so that the colour pixel intensities are in the range 0 to 255, and (2) the
normalized RGB image is converted to HSV to obtain a threshold for hue, saturation
and value in detecting a face that is being tracked by a Lumen social robot.
This application of Gaussian filtering by Sya and Prihatmanto illustrates the high
utility of the HSV colour space. Gaussian filtering is an example of non-linear
filtering. For more about this, see Sect. 5.6 and Appendix B.8.
Nonlinear adaptive median filtering (AMF): Nonlinear AMF is used by T.K.
Thivakaran and R.M. Chandrasekaran in [193].
Nearness of open neighbourhoods of pixels to given pixels: This approach by
S.A. Naimpally and J.F. Peters, given in [128, 142, 151] and by others [73, 76,
137, 152, 162, 170], introduces an approach to filtering an image that focuses
on the proximity of an open neighbourhood of a pixel to pixels external to the
neighbourhood. An open neighbourhood of a pixel is a set of pixels within a
fixed distance of a pixel which does not include the pixels along the border of
the neighbourhood. For more about open sets and neighbourhoods of points, see
Appendix B.13 and B.14.
In linear spatial filters, we obtain filtered values of target pixels by means of linear
combinations of pixel values in an n × m neighbourhood. A target pixel is located at
the centre of the neighbourhood. A linear combination of neighbourhood pixel values is
determined by a filter kernel or mask. A filter kernel is an array the same size as a
neighbourhood, containing weights that are assigned to the pixels in the neighbourhood
of a target pixel. A linear spatial filter convolves the kernel and neighbourhood
pixel values to obtain a new target pixel value. Let w denote a 3 × 3 kernel and let
g(x, y) be a target pixel in a 3 × 3 neighbourhood; then the new value of the target
pixel is obtained as the sum of the dot products of pairs of row vectors. For a pair of
1 × n vectors a, b, the dot product is the sum of the products of the
values in corresponding positions, i.e.,
\[ a \cdot b = \sum_{i=1}^{n} a_i b_i, \]
so that the new target pixel value is
\[ g(x, y) = \sum_{i=1}^{3} w(1, i)\, g(1, i) + \sum_{i=1}^{3} w(2, i)\, g(2, i) + \sum_{i=1}^{3} w(3, i)\, g(3, i). \]
The kernel w is called a Sobel mask and is used in edge detection in images [58,
Sect. 3.6.4] (see also [24, 54, 57, 120, 160]). In Matlab, the value of the
target pixel n(2, 2) for a given 3 × 3 array can be computed using the dot (dot
product) function. This is illustrated in Listing 4.1.
% Sample target pixel value using a 3x3 filter kernel
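As a minimal sketch of this computation (the kernel and neighbourhood values here are assumptions):
w = [1 2 1; 0 0 0; -1 -2 -1];                % Sobel kernel
n = [110 115 120; 105 112 118; 90 95 100];   % sample 3x3 neighbourhood
% new target value = sum of row-wise dot products
value = dot(w(1,:), n(1,:)) + dot(w(2,:), n(2,:)) + dot(w(3,:), n(3,:))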
The steps in the convolution of a kernel with an image neighbourhood are summarised
next (a minimal sketch follows the steps).
(step.1) Define an n × n filter kernel k.
(step.2) Slide the kernel onto an n × n neighbourhood n in an image g (the centre
of the kernel lies on top of the neighbourhood target pixel).
(step.3) Multiply the pixel values by the corresponding kernel weights. If n(x, y)
lies beneath k(x, y), then compute n(x, y)k(x, y). For the ith row k(i,:)
in k and the ith row n(i,:), compute the dot product k(i, :) · n(i, :). Then sum
the dot products of the rows.
(step.4) Replace the original target value with the new filtered value, namely, the
total of the dot products from step 3.
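A minimal sketch of steps 1–4 for a single target pixel (the kernel and neighbourhood values here are assumptions; imfilter automates the same computation over a whole image):
k = ones(3)/9;                        % step 1: 3x3 mean filter kernel
n = [10 20 30; 40 50 60; 70 80 90];   % step 2: neighbourhood under the kernel
newvalue = sum(sum(k .* n))           % steps 3-4: weight, sum, replace target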
func = @(x) op(x(:));
g = imread('cameraman.tif');
subplot(1,4,1), imshow(g);
subplot(1,4,2), imhist(g);
func = @(x) median(x(:));          % set filter
%func = @(x) max(x(:));            % set filter
%func = @(x) (uint8(mean(x(:))));  % set filter
h = nlfilter(g, [3 3], func);
subplot(1,4,3), imshow(h);
title('nlfilter(g,[3 3],func)');
subplot(1,4,4), imhist(h);
To see how the neighbourhood sliding filter works, try the following experiment
shown in Listing 4.3.
% Experiment with function handle
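As a minimal sketch of experimenting with a function handle (the block values are assumptions):
func = @(x) median(x(:));        % function handle: block -> median value
block = [1 2 3; 4 5 6; 7 9 8];   % a sample 3x3 block
func(block)                      % ans = 5, the median of the nine values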
Problem 4.1 Using selected images, use the max and mean functions to define
new versions of the nlfilter (see Listing 4.3 to see how this is done). In each case,
use subplot to display the original image, histogram for the original image, filtered
image, and histogram for the filtered image.
The basic idea in this section is to use the fspecial function to construct various linear
convolution filter kernels. By way of illustration, consider constructing a kernel that
mimics the effect of motion blur. Motion blur is an apparent streaking of fast-moving
objects (see, e.g., Fig. 4.3). This effect can be achieved either by taking pictures
of fast-moving objects while standing still or by continuous picture-taking while
moving a camera. The motion blur effect can be varied with different choices of
pixel length and counterclockwise angle of movement (the fspecial function is used
to set up a particular motion blur kernel). In Fig. 4.3, global motion blur is shown in a
recent picture of a honey bee.
Fig. 4.3 Linear convolution filter of honey bee with Listing 4.4
g = imread('bee-polen.jpg');
%g = imread('kingfisher1.jpg');
subplot(1,2,1), imshow(g);
kernel = fspecial('motion', 50, 45);   % len=50, CCW angle=45
%kernel = fspecial('motion', 30, 45);  % len=30, CCW angle=45
h = imfilter(g, kernel, 'symmetric');
subplot(1,2,2), imshow(h);
title('fspecial(motion,50,45)');
Problem 4.2 Apply motion blur filtering exclusively to the part of the image in
Fig. 4.3 containing the honey bee. This will result in an image where only the honey
bee is motion blurred.
Hint: Use a combination of the roipoly region-of-interest function and the roifilt2
function (filter a region-of-interest (roi)). To see how this is done, try
help roifilt2
The basic approach using the roifilt2 function is shown in Listing 4.5 in terms of
unsharp filtering one of the coins shown in Fig. 4.4. To take advantage of roi filtering,
you will need to adapt the approach in Listing 4.4 in terms of mirror filtering a region-
of-interest.
% Sample roi filtering
I = imread('eight.tif');
c = [222 272 300 270 221 194];
r = [21 21 75 121 121 75];
BW = roipoly(I, c, r);
H = fspecial('unsharp');
J = roifilt2(H, I, BW);
subplot(1,2,1), imshow(I); title('roi = upper right coin');
subplot(1,2,2), imshow(J); title('filtered roi = upper right coin');
% How to use roipoly
clear all
close all
g = imread('rainbow-plant.jpg'); h = rgb2gray(g);
%g = imread('forest.tif');
%g = imread('kingfisher1.jpg');
%g = imread('bee-polen.jpg');
%g = imread('eight.tif');
%g = rgb2gray(g);
%c = [212 206 231 269 288 280 262 232 212]; % column from roi tool
%r = [53 96 112 107 74 49 36 36 53];        % row from roi tool
%c = [222 272 300 270 221 194];             % column from roi tool
%r = [21 21 75 121 121 75];                 % row from roi tool
%[BW, r, c] = impoly(g);
% manually select r, c vectors, double-clicking after selection:
[BW, r, c] = roipoly(h)   % interactive roi selection tool
B = roipoly(h, r, c);
%p = imhist(g(B));
%npix = sum(B(:));
%figure,
subplot(1,3,1), imshow(g); title('original figure');
%subplot(1,3,2), imhist(g(B)); title('roi histogram');
subplot(1,3,2), bar3(h, 0.25, 'detached'), colormap([1 0 0; 0 1 0; 0 0 1]);
title('bar3(B,detached)');
subplot(1,3,3), bar(B, 'stacked'), axis square; title('bar(B,stacked)');
%subplot(1,3,3), bar3(npix, 'grouped'); title('bar3 graph');
%subplot(1,3,3), bar3(npix, 'stacked'); title('bar3 graph');
Problem 4.3 To solve the problem of finding the vectors c, r for the roi of an image
g such as the one used in Listing 4.5, try
[B,c,r] = roipoly(g)
Then rewrite the code in Listing 4.5 using roipoly to obtain the vectors c, r, instead
of manually inserting the c, r vectors to define the desired roi. Show what happens
when you select a roi containing the lower right hand coin in Fig. 4.4. A sample use
of roipoly in terms of the eight.tif image is shown in Fig. 4.7.
The imnoise function is used to create a noisy image. This is done by adding one of the
following types of noise to an image g and then using mean filtering to remove the noise.
(noise.1) 'gaussian': adds white noise with mean m (default = 0) and variance v
(default = 0.01), with syntax
g = imnoise(g,'gaussian',m,v)
(noise.2) 'localvar': adds zero mean Gaussian white noise with an intensity-dependent
variance, with syntax
g = imnoise(g,'localvar',V)
(noise.3) 'poisson': adds Poisson noise generated from the image data, with syntax
g = imnoise(g,'poisson')
(noise.4) 'salt & pepper': adds what looks like salt and pepper noise to an image,
with noise density d and syntax
g = imnoise(g,'salt & pepper', d)
(noise.5) 'speckle': adds multiplicative noise with variance v, with syntax
g = imnoise(g,'speckle', v)
% Adding noise to an image
g = imread('forest.tif');
subplot(1,3,1), imshow(g); title('forest image');
nsp = imnoise(g, 'salt & pepper', 0.05);   % slight peppering
% nsp = imnoise(g, 'salt & pepper', 0.15); % increased pepper
subplot(1,3,2), imshow(nsp); title('salt & pepper noise');
g = im2double(g);
v = g(:,:);
np = imnoise(g, 'localvar', v);
subplot(1,3,3), imshow(np); title('localvar noise');
A mean filter is the simplest of the linear filters. This form of filtering gives equal
weight to all pixels in an n × m neighbourhood, where a weight w is defined by
\[ w = \frac{1}{nm}. \]
g = imread('forest.tif');
subplot(2,3,1), imshow(g); title('forest image');
nsp = imnoise(g, 'salt & pepper', 0.05);   % slight peppering
% nsp = imnoise(g, 'salt & pepper', 0.15); % increased pepper
subplot(2,3,2), imshow(nsp); title('salt & pepper noise');
g = im2double(g);
v = g(:,:);
np = imnoise(g, 'localvar', v);
subplot(2,3,3), imshow(np); title('localvar noise');
kernel = ones(3,3)/9;
g1 = imfilter(g, kernel);
g2 = imfilter(nsp, kernel);
g3 = imfilter(np, kernel);
subplot(2,3,4), imshow(g1); title('mean-filtered image');
subplot(2,3,5), imshow(g2); title('filter pepper image');
subplot(2,3,6), imshow(g3); title('filter localvar image');
Problem 4.5 Find the best mean filter for noise removal from the salt & pepper and
localvar noisy forms of the forest.tif image.
Hint: Vary the mean filter kernel.
Problem 4.6 Define an image g with the following matrix:
Show how the g matrix changes after mean-filtering with the kernel defined in List-
ing 4.8.
Median filtering is more effective than mean filtering. Each pixel value p in an image
is replaced by the median value from the n × m neighbourhood of p. This form
of filtering preserves image edges, while eliminating noise spikes in image pixel
values. Rather than setting up a filter kernel as in mean filtering, the medfilt2 function
is used to carry out median filtering in terms of an n × m image neighbourhood1 (see
Listing 4.9).
1 Usually, n = m = 3.
% Median filtering an image
g = imread('forest.tif');
subplot(2,3,1), imshow(g); title('forest image');
nsp = imnoise(g, 'salt & pepper', 0.05);   % slight peppering
% nsp = imnoise(g, 'salt & pepper', 0.15); % increased pepper
subplot(2,3,2), imshow(nsp); title('salt & pepper noise');
g = im2double(g);
v = g(:,:);
np = imnoise(g, 'localvar', v);
subplot(2,3,3), imshow(np); title('localvar noise');
g1 = medfilt2(g, [3,3]);
g2 = medfilt2(nsp, [3,3]);
g3 = medfilt2(np, [3,3]);
subplot(2,3,4), imshow(g1); title('median-filtered image');
subplot(2,3,5), imshow(g2); title('filter pepper image');
subplot(2,3,6), imshow(g3); title('filter localvar image');
Problem 4.7 Find the best median filter for noise removal from the salt & pepper and
localvar noisy forms of the forest.tif image.
Hint: Vary the neighbourhood size.
Show how the g matrix changes after median-filtering with the neighbourhood
defined in Listing 4.9.
Median filtering is a special case of what is known as rank order filtering. A maximum
order filter selects the maximum value in a given neighbourhood. Similarly,
a minimum order filter selects the minimum value in a given neighbourhood. The
ordfilt2 function is used to carry out order filtering, using the syntax
filteredg = ordfilt2(g,order,domain)
which replaces each pixel value in image g by the order-th pixel value in an ordered set of
neighbours specified by the nonzero pixel values in domain. To use the maximum
order filter on g = forest.tif with a 5 × 5 neighbourhood, write
maxfilter = ordfilt2(g,25,ones(5,5))
See Listing 4.9 for a sample maximum order filter with a 5 × 5 neighbourhood.
% Maximum order filtering an image
g = imread('forest.tif');
subplot(2,3,1), imshow(g); title('forest image');
nsp = imnoise(g, 'salt & pepper', 0.05);   % slight peppering
% nsp = imnoise(g, 'salt & pepper', 0.15); % increased pepper
subplot(2,3,2), imshow(nsp); title('salt & pepper noise');
g = im2double(g);
v = g(:,:);
np = imnoise(g, 'localvar', v);
subplot(2,3,3), imshow(np); title('localvar noise');
g1 = ordfilt2(g, 25, ones(5,5));
g2 = ordfilt2(nsp, 25, ones(5,5));
g3 = ordfilt2(np, 25, ones(5,5));
subplot(2,3,4), imshow(g1); title('max-order-filtered image');
subplot(2,3,5), imshow(g2); title('filter pepper image');
subplot(2,3,6), imshow(g3); title('filter localvar image');
Show how the g matrix changes after maximum order filtering with a 3 × 3 neigh-
bourhood rather than the 5 × 5 neighbourhood defined in Listing 4.8.
Problem 4.12 Use roipoly to select a polygon-shaped region (i.e., select a region-of-interest
(roi)) of a noisy image. Then set up a Matlab script that performs median
filtering on just the roi. Then display the results of median filtering the roi for noise
removal from the salt & pepper and localvar noisy forms of the forest.tif image.
Let x denote the pixel intensity of a digital image g, x̄ the average image pixel
intensity, and σ the standard deviation of the pixel intensities. The discrete form of
the normal distribution of the pixel intensities is the Gaussian function f : X → R
defined by
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \bar{x})^2}{2\sigma^2}}. \]
g = imread('forest.tif');
subplot(2,3,1), imshow(g); title('forest image');
nsp = imnoise(g, 'salt & pepper', 0.05);   % slight peppering
% nsp = imnoise(g, 'salt & pepper', 0.15); % increased pepper
subplot(2,3,2), imshow(nsp); title('salt & pepper noise');
g = im2double(g);
v = g(:,:);
np = imnoise(g, 'localvar', v);
subplot(2,3,3), imshow(np); title('localvar noise');
lowpass = fspecial('gaussian', [5 5], 2);
g1 = imfilter(g, lowpass);
g2 = imfilter(nsp, lowpass);
g3 = imfilter(np, lowpass);
subplot(2,3,4), imshow(g1); title('norm-filtered image');
subplot(2,3,5), imshow(g2); title('filter peppering');
subplot(2,3,6), imshow(g3); title('filter localvar noise');
This chapter focuses on the detection of edges, lines and corners in digital images.
This chapter also introduces a number of non-linear filtering methods. A method is a
non-linear method, provided the output of the method is not directly proportional to
the input. For example, a method whose input is a real-valued variable x and whose
output is x^α, α > 0, α ≠ 1 (a power of x), is non-linear.
For example, the 1D Gaussian kernel
\[ f(x; \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad \text{(Gaussian kernel function)} \]
is a non-linear function with a curved planar plot such as the ones shown in Fig. 5.2.
In the definition of the 1D Gaussian kernel function f (x; σ), x is a spatial parameter
and σ is a scale parameter. Notice that as σ decreases (e.g., from σ = 0.81 in
Fig. 5.1 to σ = 0.61 in Fig. 5.2.1 and then to σ = 0.41 in Fig. 5.2.2), the width of
the Gaussian kernel plot shrinks. For this reason, σ is called a width parameter.
For other experiments with the 1D Gaussian kernel, try the Matlab script A.24 in
Appendix A.5.1.
The corners in a digital image provide a good source of Voronoï mesh generators.
A Voronoï mesh derived from image corners provides a segmentation of an image.
Each segment in such a mesh is a convex polygon. Recall that the straight line segment
between any pair of points in a convex polygon belongs to the polygon. The motivation
for considering this form of image segmentation is that mesh polygons provide a
means of
1o Image segmentation. Voronoï meshes provide a straightforward means of parti-
tioning an image into non-intersecting convex polygons that facilitate image and
scene analysis as well as image understanding.
2o Object recognition. Object corners determine distinctive (recognizable) convex
submeshes that can be recognized and compared.
3o Pattern recognition. The arrangement of corner-based convex image submeshes
constitutes image patterns that can be recognized and compared. See Sect. 5.13
for more about this.
Fig. 5.4 Logical not versus non-logical not image with Listing 5.1
Quite a number of edge (and line) detection methods have been proposed. Prominent
among these filtering methods are those proposed by L.G. Roberts [166], J.M.S.
Prewitt [160], I. Sobel [180, 181] and the more recent Laplacian and Zero cross fil-
tering methods. The Laplacian and Zero cross filters effect remarkable improvements
over the earlier edge detection methods. This can be seen in Figs. 5.5 and 5.6.
% Edge detection filtering an image
clc, clear all, close all
%g = rgb2gray(imread('bee-polen.jpg'));
g = imread('circuit.tif');
gr = edge(g, 'roberts');
gp = edge(g, 'prewitt');
gs = edge(g, 'sobel');
gl = edge(g, 'log');
gz = edge(g, 'zerocross');
subplot(2,3,1), imshow(g); title('circuit.tif');
subplot(2,3,2), imshow(~gr); title('Roberts filter');
subplot(2,3,3), imshow(~gp); title('Prewitt filter');
subplot(2,3,4), imshow(~gs); title('Sobel filter');
subplot(2,3,5), imshow(~gl); title('Laplacian filter');
subplot(2,3,6), imshow(~gz); title('Zero cross filter');
Listing 5.2 illustrates the application of each of the common edge-filtering methods.
Notice the use of the Matlab logical not operator (~). To experiment with logical not, try
% Sample logical not operation on an array
clc, clear all, close all
g = [1 1 1 1 0 0 0 0]
notg = ~g
The approach in Script 5.3 can be used to reverse the appearance of each filtered
image from white edges on black background to black edges on white background
(see, e.g., Figs. 5.4 and 5.5, for edges extracted from Figs. 5.6 and 5.7).
The basic approach in edge detection filters is to convolve the n×n neighbourhood
of each pixel in an image with an n × n mask (or filter kernel), where n is usually an
odd integer. The term convolve means fold (roll) together. For a real-life example of
convolving, see http://www.youtube.com/watch?v=7EYAUazLI9k.
For example, the Prewitt and Sobel edge filters are used to convolve each
3 × 3 image neighbourhood (also called an 8-neighbourhood) with an edge filter.
The notion of an 8-neighbourhood of a pixel comes from A. Rosenfeld
[170]. A Rosenfeld 8-neighbourhood is a square array of 8 pixels surrounding
a center pixel. Prewitt and Sobel edge filters are a pair of 3 × 3 masks (one mask
representing the pixel gradient in the x-direction and a second mask for the pixel
gradient in the y-direction).
Matlab favours the horizontal direction, filtering an image with only the mask
representing the gradient of a pixel in the x-direction. To see examples of masks, try
fspecial('prewitt') or fspecial('sobel').
The masks available with the Matlab fspecial function favour the horizontal direction.
For example, the Prewitt 3 × 3 mask is defined by
\[ m_{Prewitt} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}. \]
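These masks can be inspected directly in Matlab (transposing a mask gives its vertical-direction counterpart):
mp = fspecial('prewitt')     % [1 1 1; 0 0 0; -1 -1 -1]
ms = fspecial('sobel')       % [1 2 1; 0 0 0; -1 -2 -1]
mpv = fspecial('prewitt')'   % transpose: vertical-gradient mask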
The Laplacian edge filter L(x, y) is a 2D isotropic1 measure of the 2nd derivative
of an image g with pixel intensities g(x, y), defined by
\[ L(x, y) = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}. \]
For detailed explanations for the Laplacian, Laplacian of Gaussian, LoG, and
Marr edge filters, see http://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm.
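A quick way to inspect a discrete Laplacian kernel in Matlab (α = 0 gives the classic 4-neighbour form):
k = fspecial('laplacian', 0)   % k = [0 1 0; 1 -4 1; 0 1 0]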
Problem 5.3 Using image enhancement methods from Chap. 3, preprocess the
dragonfly2.jpg image and create a new image (call it dragonfly2.jpg). Find the
best preprocessing method to do edge detection filtering to obtain an image similar
to the one shown in Fig. 5.7. Display both the binary (black and white) and the (black
on white, or logical not) edge image as shown in Fig. 5.7. In addition, type
help edge
and experiment with different choices of the thresh and sigma (standard deviation)
parameters for the Laplacian of the Gaussian (normal distribution) filtering method,
using
gl = edge(g, 'log', thresh, sigma)
Hint: Use im2double on an input image. Also, edge detection methods operate on
greyscale (not colour) images.
g = imread('circuit.tif');
gr = edge(g, 'roberts');
gp = edge(g, 'prewitt');
gs = edge(g, 'sobel');
subplot(2,3,1), imshow(g); title('circuit.tif');
1 Isotropic means not direction sensitive, having the same magnitude or properties when measured
in different directions.
It has been observed by T. Lindeberg that the concept of an image edge is only what
we define it to be [113, p. 118]. The earlier attempts at edge detection by Roberts,
Prewitt and Sobel focused on the detection of points where the first order edge
gradient is high. Starting in the mid-1960s, jumps in brightness values are the kinds
of image features detected with the Laplacian
\[ \nabla^2 g(x, y) = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}. \]
For implementation purposes, the discrete form of the Laplacian filter ∇²g(x, y)
is the standard approximation
\[ \nabla^2 g(x, y) \approx g(x+1, y) + g(x-1, y) + g(x, y+1) + g(x, y-1) - 4g(x, y), \]
i.e., convolution with the kernel [0 1 0; 1 −4 1; 0 1 0].
% Laplacian edge-enhanced image
%A = imread('circuit.tif');
%g = rgb2gray(imread('Snap-04a.tif'));
g = imread('Snap-04a.tif');
k = fspecial('laplacian', 1);  % generate Laplacian filter
h2 = imfilter(g, k);           % filter image with Laplacian kernel
ge = imsubtract(g, h2);        % subtract Laplacian from original
subplot(1,3,1), imshow(g); title('Snap-04a.tif fossil');
subplot(1,3,2), imagesc(~h2);
title('Laplacian filtered image'); axis image;
In Matlab, the second order Laplacian filter has an optional shape parameter α,
which controls the shape of the Laplacian (e.g., see Listing 5.6, where α = 1 (high
incidence of edges)). The original image in Fig. 5.18 is a recent Snap-04a.tif image
of an ostracod fossil from MYA (found in an ostracod colony trapped in amethyst
crystal from Brasil). In this image, there is a very high incidence of edges and ridges,
handled with a high α value. Similarly, in the circuit.tif image in Fig. 5.9, there is a high
incidence of lines, again warranting a high α value to achieve image enhancement.
It was Carl Friedrich Gauss (1777–1855) who introduced the kernel (or normal
distribution) function named after him. Let x, y be linearly independent, random
real-valued variables with a standard deviation σ and mean μ. The goal is to exhibit
the distribution of either the x values by themselves or the combined x, y values
around the origin with μ = 0 for each experiment. The width σ > 0 of a set of x or
x, y values is called the standard deviation (average distance from the middle of a
set of data) and σ² is called the variance. Typically, the plot of a set of sample values
with a normal distribution has a bell shaped curve (also called a normal curve) arranged
around the middle of the values. The now famous example of the Gaussian kernel
plot appears on the 10 Deutsche Mark (10 DM) note shown in Fig. 5.10.1. A cropped
version of the 10 DM image is shown in Fig. 5.10.2. A very good overview of the
evolution of the Gaussian kernel is given by S. Stahl [186].
When all negative x or x, y values are represented by their absolute
values, then the Gaussian of the values is called a folded normal distribution (see,
for example, F.C. Leone, L.S. Nelson and R.B. Nottingham [107]).
There are two forms of the Gaussian kernel to consider. The first is the 1D kernel
\[ f(x; \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-0)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad \text{(1D Gaussian kernel)}; \]
the second is the 2D form given in Sect. 5.6.
Example 5.4 Sample 1D Gaussian kernel plots are given in Fig. 5.11. To experiment
with different choices of the width parameter σ, try using the Mathematica script 1
in Appendix A.5.2.
Example 5.5 Sample continuous and discrete 2D Gaussian kernel plots are given
in Fig. 5.12. A discrete plot is derived from discrete values. By discrete, we mean
distinct, separated values. In this example, discrete values are used to obtain the plot
in Fig. 5.12.2. The plot in Fig. 5.12.1 is for less separated values and hence has a
continuous appearance, even though the plot is derived from discrete values. To
experiment with different choices of the width parameter σ, try using the Matlab
script A.25 in Appendix A.5.3.
This section briefly introduces Gaussian filtering (smoothing) of digital images. Let
x, y be the coordinates of a pixel in a 2D image Img, Img(x, y) the intensity of the
pixel located at (x, y), and let σ be the standard deviation of a pixel intensity relative
to the average intensity of the pixels in a neighbourhood of Img. The assumption
made here is that σ is the standard deviation of a probability distribution of the pixel
intensities in an image neighbourhood. The Gaussian filter (smoothing) 2D function
G(x, y; σ) is defined by
\[ G(x, y; \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad \text{(filtered value), or} \]
\[ G(x, y; \sigma) = e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad \text{(simplified filtered value). Next,} \]
\[ Img(x, y) := G(x, y; \sigma) \qquad (G(x, y; \sigma) \text{ replaces the pixel intensity } Img(x, y)). \]
The basic approach in Gaussian filtering an image is to replace each pixel intensity
in a selected image neighbourhood with the filtered value G(x, y; σ). M. Sonka, V.
Hlavac and R. Boyle [184, Sect. 5.3.3, p. 139] observe that σ is proportional to the
size of the neighbourhood on which the Gaussian filter operates (see, e.g., Fig. 5.14
for Gaussian filtering of the cropped train image in Fig. 5.13).
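As a minimal sketch (the image name and kernel parameters are assumptions), Gaussian smoothing combines fspecial and imfilter:
% Gaussian smoothing sketch
g = imread('cameraman.tif');         % any greyscale image
k = fspecial('gaussian', [7 7], 2);  % 7x7 Gaussian kernel, sigma = 2
h = imfilter(g, k, 'replicate');     % smooth, replicating border values
subplot(1,2,1), imshow(g); title('original');
subplot(1,2,2), imshow(h); title('Gaussian filtered');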
An alternative to a simple second order Laplacian filter is the second order Laplacian
of a Gaussian filter. This is implemented in Matlab using the log option with the fspecial
function.
Fig. 5.19 2nd Order Laplace image enhancement with Listing 5.7
g = imread('circuit.tif');
%g = rgb2gray(imread('Snap-04a.tif'));
%g = imread('Snap-04a.tif');
k = fspecial('log', [3 3], 0.2);  % generate LoG filter
h2 = imfilter(g, k);              % filter image with LoG kernel
ge = imsubtract(g, h2);           % subtract filtered image from original
subplot(1,3,1), imshow(g); title('circuit.tif');
subplot(1,3,2), imagesc(~h2);
title('log filtered image'); axis image;
subplot(1,3,3), imshow(ge); title('Enhanced image');
Problem 5.8 Eliminate the salt-n-pepper effect of the second-order Laplacian image
enhancement shown in Fig. 5.19. Show your results for the circuit.tif and one other
image of your own choosing.
In most cases, the most effective of the second order filter approaches to image
enhancement stems from an application of the R. Haralick zero-crossing filtering
method (see, e.g., the zero-crossing enhancement of the circuit.tif image in Fig. 5.20).
In a discrete matrix representation of a digital image, there are usually jumps in
the brightness values, if the brightness values are different. To interpret jumps in
brightness values relative to local extrema of derivatives, it is helpful to assume that
pixel values come from a sampling of a real-valued function of a digital image g that
is a bounded and connected subset of the plane R². Then jumps in derivative values
indicate points of high first derivative of g or points of relative extrema in the
second derivative of g [69, p. 58]. For this reason, Haralick viewed edge detection
as fitting a function to sample values. The directional derivative of g at point (x, y)
is defined in terms of a direction angle α by
\[ g'_{\alpha}(x, y) = \frac{\partial g}{\partial x} \sin\alpha + \frac{\partial g}{\partial y} \cos\alpha, \]
\[ g''_{\alpha}(x, y) = \frac{\partial^2 g}{\partial x^2} \sin^2\alpha + 2\,\frac{\partial^2 g}{\partial x \partial y} \sin\alpha \cos\alpha + \frac{\partial^2 g}{\partial y^2} \cos^2\alpha. \]
Assuming that g is a cubic polynomial in x and y, the gradient and gradient
direction of g can be estimated in terms of α at the center of a neighbourhood used
to estimate the value of g. In an n × n neighbourhood of g, the value of g(x, y) is
computed as a cubic in a linear combination of the form
\[ g(x, y) = k_1 + k_2 x + k_3 y + k_4 x^2 + \cdots + k_{10} y^3, \]
\[ \sin\alpha = \frac{k_2}{\sqrt{k_2^2 + k_3^2}}, \qquad \cos\alpha = \frac{k_3}{\sqrt{k_2^2 + k_3^2}}. \]
If the estimated second directional derivative changes sign from positive to negative
at the centre of the target neighbourhood, then a negatively sloped zero crossing has been
found and the target neighbourhood pixel is marked as an edge pixel.
%g = imread('circuit.tif');
g = rgb2gray(imread('Snap-04a.tif'));
%g = imread('Snap-04a.tif');
g = im2double(g);
h2 = edge(g, 'zerocross', 0, 'nothinning');
h2 = im2double(h2);
ge = imsubtract(g, h2);   % subtract edge image from original
subplot(1,3,1), imshow(g); title('Snap-04a.tif');
subplot(1,3,2), imagesc(~h2);
title('zero-cross filtered image'); axis image;
subplot(1,3,3), imshow(ge); title('Enhanced image');
The Matlab edge function implementation has two optional parameters, namely,
thresh and a filter h. By choosing thresh = 0, the output image has closed contours, and
by choosing nothinning as the filtering option, the edges in the output image
are not thinned. Notice that the edge-detection image in Fig. 5.20 is superior to the
edge-detection image in Fig. 5.19 or in Fig. 5.9. Why? For some images, such as the
Snap-04a.tif image, the zero-crossing method does not work well. Evidence of this
can be seen in Fig. 5.21.
Problem 5.9 Try other filters besides nothinning (used in Listing 5.8) and look for
the best zero-crossing filter image enhancement of the dragonfly2.jpg image as well
as one other image of your own choosing. For each of the two images, give both the
binary and logical not edge image.
The term isotropic means having the same magnitude or properties when measured in
different directions. The isotropic edge detection approach is direction-independent.
Isotropic edge detection was proposed by D. Marr and E. Hildreth [120], an approach
that offers simplicity and uniformity at the expense of smoothing across edges.
Gaussian smoothing of edges was proposed by A.P. Witkin [212] by convolving
an image with a Gaussian kernel. Let Io(x, y) denote an original image, I(x, y; t)
a derived image and G(x, y; t) a Gaussian kernel with variance t. Then the original
image is convolved with the Gaussian kernel in the following (standard scale space) way:
\[ I(x, y; t) = G(x, y; t) * I_o(x, y), \]
where the convolution is performed only over the variables x, y; the scale parameter
t after the semicolon specifies the scale level (t is the variance of the Gaussian
filter G(x, y; t)). At t = 0, the scale space representation is the original image. An
increasing number of image details are removed as t increases, i.e., image smoothing
increases as t increases. Image details smaller than √t are removed from an
image. The fspecial function is used to achieve Gaussian smoothing of an image.
g = imread('circuit.tif');
subplot(2,3,1), imshow(g); title('circuit.tif');
g1 = fspecial('gaussian', [15 15], 6);
g2 = fspecial('gaussian', [30 30], 12);
subplot(2,3,2), imagesc(g1); title('gaussian,[15 15],6');
axis image;
subplot(2,3,3), imagesc(g2); title('gaussian,[30 30],12');
axis image;
g = imread('circuit.tif');
% Isolate edges of picture using the 2D wavelet transform
[c, s] = wavefast(g, 1, 'sym4');
figure, wavedisplay(c, s, -6);
title('direction dependence of wavelets');
% Zero the approximation coefficients
% [nc, y] = wavecut('a', c, s);
% Compute the absolute value of the inverse
% edges = abs(waveback(nc, s, 'sym4'));
% Display before and after images
% figure;
% subplot(1,2,1), imshow(g), title('Original Image');
% subplot(1,2,2), imshow(mat2gray(edges))
Next, consider enhancing the circuit.tif image using the edges found with the
2D wavelet transform. A preliminary result of wavelet image enhancement is
shown in Fig. 5.24. Two things can be observed. First, the wavelet form of edge
detection is less effective than Haralick’s zero crossing edge detection method. Second,
at this very preliminary stage, it can be observed that the wavelet edge detection
method does not result in satisfactory image enhancement. More work needs to be
done before one can evaluate the image enhancement potential of the wavelet edge
detection method (see Problem 5.10).
g = imread('circuit.tif');
% Isolate edges using 2D wavelet transform
[c, s] = wavefast(g, 1, 'sym4');
% Zero the approximation coefficients
[nc, y] = wavecut('a', c, s);
% Compute the absolute value of the inverse
edges = abs(waveback(nc, s, 'sym4'));
% Display before and after images
figure;
subplot(1,3,1), imshow(g), title('Original Image');
subplot(1,3,2), imshow(edges);
title('waveback(nc, s, sym4)');
g = im2double(g); h = g - edges;
subplot(1,3,3), imshow(h);
title('im2double(g) - edges');
Problem 5.10 Experiment with enhancing images using the wavelet edge detection
method with 3 other images besides the circuit.tif image. For example, use wavelets
to detect edges and to perform image enhancement with the Snap_4a.tif and
blocks.jpg images.
This section briefly presents J.F. Canny’s approach2 to edge detection, based on his
M.Sc. thesis completed in 1983 at the MIT Artificial Intelligence Laboratory [24].
The term edge direction means the direction of the tangent to a contour that an edge
follows in an image.
2 See http://www.cs.berkeley.edu/~jfc/papers/grouped.html.
% Canny edge detection
clc, close all, clear all
g = imread('circuit.tif');
subplot(2,3,1), imshow(g); title('circuit.tif');
g1 = fspecial('gaussian', [15 15], 6);
g2 = fspecial('gaussian', [30 30], 12);
subplot(2,3,2), imagesc(g1); title('gaussian,[15 15],6');
axis image;
subplot(2,3,3), imagesc(g2); title('gaussian,[30 30],12');
axis image;
[bw, thresh] = edge(g, 'log');
subplot(2,3,4), imshow(~bw, []); title('log filter');
[bw, thresh] = edge(g, 'canny');
subplot(2,3,5), imshow(~bw, []); title('canny filter');
[bw, thresh] = edge(imfilter(g, g1), 'log');
subplot(2,3,6), imshow(~bw, []); title('log-smoothed filter');
g = imread('circuit.tif');
%subplot(2,3,1), imshow(g); title('circuit.tif');
g0 = fspecial('gaussian', [3 3], 1.5);
subplot(2,3,1), imagesc(g0); title('g0=gaussian,[3 3],1.5');
axis image;
g1 = fspecial('gaussian', [15 15], 7.5);
g2 = fspecial('gaussian', [31 31], 15.5);
subplot(2,3,2), imagesc(g1); title('g1=gaussian,[15 15],7.5');
axis image;
subplot(2,3,3), imagesc(g2); title('g2=gaussian,[31 31],15.5');
axis image;
[bw, thresh] = edge(g, 'log');
subplot(2,3,4), imshow(~bw, []); title('log filter g');
[bw, thresh] = edge(g, 'canny');
subplot(2,3,5), imshow(~bw, []); title('canny filter g');
[bw, thresh] = edge(imfilter(g, g0), 'log');
subplot(2,3,6), imshow(~bw, []); title('log-smoothed filter g0');
Problem 5.11 Try LoG filtering with g1 and g2 in Listing 5.13 as well as other Gaussian
smoothings of the dragonfly2.jpg image, and look for choices of kernel size and
standard deviation that lead to an improvement over Canny filtering the original
image. Notice that the LoG filter method has a thresh option (all edges not stronger
than thresh are ignored) and a sigma option (the standard deviation of the LoG
(Laplacian of the Gaussian) filter). Experiment with these LoG optional parameters
to obtain an improvement over the result in Fig. 5.26. In addition, notice that the
Canny edge filter has an optional two element thresh parameter (the first element
in the Canny thresh parameter is a low threshold and the second element is a high
threshold). Experiment with the edge Canny thresh parameter to improve on the
result given in Fig. 5.26.
This section introduces Harris–Stephens corner detection [71] (see Fig. 5.28 for the
results of finding corners in circuit.tif). A corner is defined to be the intersection of
edges (i.e., a target pixel where there are two dominant and different edge directions
in the neighbourhood of the target pixel). See, e.g., the corners inside the dotted
circles in Fig. 5.27, where each corner is a juncture for a pair of edges with different
edge directions. In conflict with corner detection are what are known as interest
points. An interest point is an isolated point which is a local maximum or minimum
of intensity (a spike), a line ending, or a point on a curve such as a ridge (concavity down)
or valley (concavity up). When corners are detected, the detected points will also
include interest points. It is then necessary to do post processing to isolate real
corners (separated from interest points). The details concerning this method will be
given later. The corner detection results for kingfisher1.jpg are impressive, where
corner detection is performed only in a small region-of-interest in the image (see
Fig. 5.29).
% Image corner detection
% g = imread('circuit.tif');
g = imread('kingfisher1.jpg');
g = g(10:250, 300:600);   % not used with circuit.tif
corners = cornermetric(g, 'Harris');   % default
corners(corners < 0) = 0;
cornersgray = mat2gray(corners);
figure,
subplot(1,3,1), imshow(~cornersgray);
title('g,Harris');
corners2 = cornermetric(g, 'MinimumEigenvalue');
corners2 = mat2gray(corners2);
subplot(1,3,2), imshow(imadjust(corners2));
title('g,MinimumEigenvalue');
cornerpeaks = imregionalmax(corners);
results = find(cornerpeaks == true);
[r g b] = deal(g);
r(results) = 255;
g(results) = 255;
b(results) = 0;
RGB = cat(3, r, g, b);
subplot(1,3,3), imshow(RGB);
title('imregionalmax(corners)');
Problem 5.12 The corner and peak detection method implemented in Listing 5.14
is restricted to greyscale images (required by the cornermetric function). To see
this, type
help cornermetric
Give a Matlab script called cornerness.m that makes it possible to use cornermetric
on colour images. Your adaptation of cornermetric should produce (i) a colour
image showing the location of the corners on the input colour image and (ii) a colour
image showing the locations of both the corners and peaks on the input colour image.
Do this so that corners and peaks are visible on each input colour image. Demonstrate
the use of your script on peppers.png and two other colour images that you select.
For the peppers.png colour image, your cornerness.m script should produce output
similar to the three images in Fig. 5.30, but instead of a black background, your
script should display the locations of the corners and peaks on each input colour image.
5.13 Image Corner-Based Voronoï Meshes Revisited
This section revisits Voronoï meshes on digital images using image corners and
carries forward the discussion on image geometry started in Sect. 1.22.
The Voronoï region V p depicted as the intersection of finitely many closed half
planes in Fig. 5.31 is a variation of the representation of a Voronoï region in the
monograph by H. Edelsbrunner [41, Sect. 2.1, p. 10], where each half plane is defined
by its outward directed normal vector. The rays from p and perpendicular to the sides
of V p are comparable to the lines leading from the center of the convex polygon in
G.L. Dirichlet’s drawing [35, Sect. 3, p. 216].
Lemma 5.14 ([41, Sect. 2.1, p. 9]) The intersection of convex sets is convex.
Proof Let A, B ⊂ R2 be convex sets and let K = A ∩ B. For every pair of points
x, y ∈ K, the line segment xy connecting x and y belongs to both A and B, since
each set is convex and contains both endpoints; hence xy belongs to K. Therefore,
K is convex.
Lemma 5.15 ([143]) A Voronoï region of a point is the intersection of closed half
planes and each region is a convex polygon.
From an application point of view, a Voronoï mesh segments a digital image. This
is especially important in the case where the sites used to construct a mesh have some
significance in the structure of an image. For example, by choosing the corners in an
image as a set of sites, each Voronoï region of a corner site p has the property that all
points in the region are nearer to p than to any other corner in the image. In effect, the
points in a Voronoï region of a corner site p are symmetrically arranged around the
particular corner p. This property holds true for each Voronoï region in a corner
mesh.
The steps to construct a corner-based Voronoï mesh on a digital image are given next.
1o Select a digital image Im.
2o Select an upper bound n on the number of corners to detect in Im.
3o Find up to n corners in Im. The corners found form a set of sites.
4o Display the corners in Im. This display provides a handle for the next step. N.B.:
At this point in a Matlab® script, use the hold on instruction. This hold-on step
is not necessary in Mathematica® 10.
5o Find the Voronoï region for each site. This step constructs a Voronoï mesh on
Im.
Example 5.16 Constructing a Voronoï mesh on an Image.
A sample Voronoï mesh is shown on the image in Fig. 5.32. To implement the Voronoï
mesh construction steps in Matlab, use a combination of the corner and voronoi
functions. Let X, Y be the x- and y-coordinates of the image corners found
using the corner function. Then use voronoi(X,Y) to find the x- and y-coordinates of
the vertices in each of the regions in a Voronoï mesh. Then the Matlab plot function
can be used to draw the Voronoï mesh on a selected digital image.
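A minimal sketch of these steps (circuit.tif ships with the Image Processing Toolbox; the corner count 50 is an arbitrary choice):
% corner-based Voronoi mesh on an image
g = imread('circuit.tif');
C = corner(g, 50);                   % up to 50 corners as mesh sites
imshow(g); hold on;                  % display provides a handle for the mesh
[vx, vy] = voronoi(C(:,1), C(:,2));  % Voronoi edge coordinates
plot(vx, vy, 'g-', C(:,1), C(:,2), 'r+'); hold off;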
Problem 5.17 For three digital images of your own choosing, construct a Voronoï
mesh on each image. Do this for the following upper bounds on the number of sites:
30, 50, 80, 130.
5.15 Extreme Image Corners in Set of Mesh Generators
To include the extreme image corners in a set of mesh generators, use the following
steps (a Matlab sketch follows the steps).
1o im := greyscale image;
2o [m, n] := size of image im; % use size(im) in Matlab
3o let C := set of interior image corners;
4o let fc be the coordinates of the extreme image corners;
5o let Cim := [C; fc]; % Cim contains coords. of all im corners
6o superimpose Cim on image im;
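A minimal Matlab sketch of these steps (peppers.png ships with Matlab; the corner count 50 is an arbitrary choice):
% add the extreme image corners to the set of mesh generators
im = rgb2gray(imread('peppers.png'));
[m, n] = size(im);
C = corner(im, 50);                       % interior image corners
fc = [1 1; n 1; 1 m; n m];                % extreme corners in (x,y) coordinates
Cim = [C; fc];                            % Cim contains coords. of all im corners
imshow(im); hold on;
plot(Cim(:,1), Cim(:,2), 'r+'); hold off; % superimpose Cim on im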
Remark 5.18 Superimposing corners on a full-size as well as on a cropped image.
A 480 × 640 colour image of a Salerno motorcycle is shown in Fig. A.49. Using the
Matlab script A.28, the corners are found in both the full image in Fig. A.51.1 and in
a cropped image in Fig. A.50.1. Notice that there are a number of different methods
that can be used to crop an image (these cropping methods are explained in the
comments in script A.28).
Example 5.19 A 480 × 640 colour image of an Italian Carabinieri auto is shown
in Fig. 5.33. Using the Matlab script A.28 in Appendix A.5.6, the corners are found
in both the full image in Fig. 5.34.1 and in a cropped image in Fig. 5.34.2. Notice
that there are a number of different methods that can be used to crop an image (these
cropping methods are explained in the comments in script A.28).
This section demonstrates the effectiveness of the inclusion of image corners in the
set of sites (generators) in constructing a Voronoï mesh on a 2D digital image. To
superimpose a Voronoï mesh on an image using a set of sites that includes the
extreme image corners, follow the steps given above. By including the extreme image
corners in the set of generating points (sites), we obtain a Voronoï mesh like the one
shown in Fig. 5.35. Notice the convex polygons surrounding parts of the inside corners
in Fig. 5.35 that result from including the extreme corners in the set of generators
used to derive the image mesh (Figs. 5.36 and 5.37).
% gradients: S. Garg, 2014, modified by J.F.P., 2015
% http://www.mathworks.com/matlabcentral/fileexchange/
% 46408-histogram-of-oriented-gradients--hog--code-using-matlab/
% content/hog_feature_vector.m
clear all; close all; clc;
im = imread('floorplan.jpg');
if size(im,3) == 3
    im = rgb2gray(im); end
im = double(im); rows = size(im,1); cols = size(im,2);
Ix = im; Iy = im; % basic matrix assignments
for i = 1:rows-2 % gradients in the y-direction
    Iy(i,:) = (im(i,:) - im(i+2,:)); end
for i = 1:cols-2 % gradients in the x-direction
    Ix(:,i) = (im(:,i) - im(:,i+2)); end
angle = atand(Ix./Iy); % edge gradient angles
angle = imadd(angle, 90); % angles in range (0,180)
magnitude = sqrt(Ix.^2 + Iy.^2);
imwrite(angle, 'gradients.jpg');
imwrite(magnitude, 'magnitudes.jpg');
subplot(2,2,1), imshow(imcomplement(uint8(angle))), title('edge gradients');
subplot(2,2,2), plot(Ix, angle), title('angles in [0,180]');
subplot(2,2,3), imshow(imcomplement(uint8(magnitude)), [0 255]),
title('x-,y-gradient magnitudes in situ');
subplot(2,2,4), plot(Ix, magnitude), title('x-,y-gradient magnitudes');
Fig. 5.39 Edges found with Listing 5.15 using Fig. 5.38
% gradients: S. Garg, 2014, modified by J.F.P., 2015
clear all; close all; clc;
% im = imread('floorplan.jpg');
im = imread('redcar.jpg');
if size(im,3) == 3
    im = rgb2gray(im); end
im = double(im); rows = size(im,1); cols = size(im,2);
Ix = im; Iy = im; % basic matrix assignments
for i = 1:rows-2 % gradients in the y-direction
    Iy(i,:) = (im(i,:) - im(i+2,:)); end
for i = 1:cols-2 % gradients in the x-direction
    Ix(:,i) = (im(:,i) - im(:,i+2)); end
angle = atand(Ix./Iy); % edge pixel gradients in degrees
angle = imadd(angle, 90); % angles in range (0,180)
magnitude = sqrt(Ix.^2 + Iy.^2);
imwrite(angle, 'gradients.jpg');
imwrite(magnitude, 'magnitudes.jpg');
figure, imshow(uint8(angle));
figure, imshow(imcomplement(uint8(magnitude)));
% figure, plot(Ix, angle);
% figure, plot(Ix, magnitude);
Example 5.21 Edge Thinning Using Image Gradient Magnitudes. A sample thin-
ning of the thick lines in the Alhambra floorplan image is shown in Fig. 5.40. In this
image, each of the thick floorplan borders has been reduced to thin line segments. The
result is a collection of thinly bordered large-scale convex polygons. The Alhambra
floorplan gradient angles are displayed in Fig. 5.41.
The results from Example 5.21 provide a basis for finding a minimum number of
image corners, leading to the construction of an effective Voronoï mesh. In this
section, we again consider the Alhambra floorplan image.
The Matlab script A.28 applied to the Alhambra floorplan (limited to thinned
edges) produces the result shown in Figs. 5.42 and 5.43.
Hint: Choose images containing lots of straight edges such as images containing
houses or buildings (Fig. 5.43).
Chapter 6
Delaunay Mesh Segmentation
the segment, (2) segments do not partition an image into disjoint regions, since each
pair of adjacent segments in an image segmentation has a common border. In this
chapter, an image is segmented into triangular segments in a mesh using an approach
to planar triangulation introduced by Delaunay. A Delaunay mesh is the result of
what is known as a triangulation.
Fig. 6.2 p, q ∈ S, pq = Delaunay edge
Delaunay Wedge
A planar Delaunay wedge is a Delaunay triangle with an interior that
contains an uncountably infinite number of points. The interior of a
Delaunay triangle is that part of the triangle between the edges. It is
assumed that every Delaunay triangle connecting generating points in
an image defines a Delaunay edge.
That is, a Delaunay wedge is the intersection of closed half planes H_ps, for all
s ∈ {q, r}.
Theorem 6.4 A planar Delaunay wedge is a convex polygon.
Proof Immediate from Lemma 5.15, since a Delaunay wedge is the intersection of
closed half planes spanning a Delaunay triangle △(pqr), stretching from vertex
p to the opposite edge qr.
Problem 6.5 Give an example of a Delaunay wedge in an image.
For simplicity, let E be the Euclidean space R2. For a Delaunay triangle △(pqr),
a circumcircle passes through the vertices p, q, r of the triangle (see Fig. 6.3 for
an example). The center u of the circumcircle is the Voronoï vertex at the
intersection of three Voronoï regions, i.e., u = V_p ∩ V_q ∩ V_r. The circumcircle radius
is ρ = ‖u − p‖ = ‖u − q‖ = ‖u − r‖ [40, Sect. I.1, p. 4], which is the case in Fig. 6.3.
Lemma 6.6 Let circumcircle ◯(pqr) pass through the vertices of a Delaunay
triangle △(pqr). Then the following statements are equivalent.
1o The center u of ◯(pqr) is a vertex common to Voronoï regions V_p, V_q, V_r.
2o u = cl V_p ∩ cl V_q ∩ cl V_r.
3o V_p δ V_q δ V_r.
Proof 1o ⇔ 2o ⇔ 3o.
Theorem 6.7 A triangle △(pqr) is a Delaunay triangle if and only if the center of
the circumcircle ◯(pqr) is the vertex common to three Voronoï regions.
Proof The circle ◯(pqr) has center u = cl V_p ∩ cl V_q ∩ cl V_r (Lemma 6.6) ⇔ the
◯(pqr) center is the vertex common to three Voronoï regions V_p, V_q, V_r ⇔ pq, pr, qr
are Delaunay edges ⇔ △(pqr) is a Delaunay triangle.
The steps to construct a corner-based Delaunay mesh on image edges are as follows.
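A minimal sketch, assuming the steps parallel the corner-based Voronoï construction given earlier (detect corners, then triangulate them; the corner count 50 is an arbitrary choice):
% corner-based Delaunay mesh on an image
g = imread('circuit.tif');
C = corner(g, 50);                % image corners as triangulation vertices
tri = delaunay(C(:,1), C(:,2));   % Delaunay triangulation of the corners
imshow(g); hold on;
triplot(tri, C(:,1), C(:,2), 'g'); hold off;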
Problem 6.8 Give a Matlab script that constructs a corner-based Delaunay mesh on
an image, for three images of your own choosing. N.B.: Choose your images from
your personal collection of images, not taken from the web.
$$x_c = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad y_c = \frac{1}{m}\sum_{i=1}^{m} y_i, \qquad z_c = \frac{1}{h}\sum_{i=1}^{h} z_i.$$
The basic approach is to use image region centroids as generating points in Delaunay
mesh construction. Here are the steps to do this (a Matlab sketch follows the steps).
1o Find the region centroids in a given image Im.
2o Connect each pair of nearest centroids x, y ∈ S with a straight edge xy. A Delaunay
triangle results from connecting centroids x, y, r that are nearest each other with
straight edges.
3o Repeat step 2o until all pairs of centroids are connected. N.B. It is also assumed
that each triangular region of the mesh is a Delaunay wedge.
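A minimal sketch of these steps, using region centroids from a thresholded image as mesh sites (coins.png ships with the Image Processing Toolbox; thresholding is an assumed preprocessing step):
% centroid-based Delaunay mesh
bw = imbinarize(im2double(imread('coins.png')));  % threshold to regions
stats = regionprops(bw, 'Centroid');              % region centroids as sites
P = cat(1, stats.Centroid);
tri = delaunay(P(:,1), P(:,2));                   % triangulate the centroids
imshow(bw); hold on;
triplot(tri, P(:,1), P(:,2), 'y'); hold off;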
Problem 6.14 Give a Matlab script that false colours (your choice of colour) the
maximal nucleus triangle of each MNTC in a centroid-based Delaunay mesh on an
image, for three images of your own choosing. False colour each triangle adjacent
to the maximal nucleus triangle. N.B.: Choose your images from your personal
collection of images, not taken from the web. In this problem, image centroids are
used instead of corners as a source of generating points in constructing the Delaunay
triangulation mesh.
Problem 6.15 Give a Matlab script that false colours (your choice of colour) the
maximal nucleus triangle of each MNC in a centroid-based Voronoï mesh on an
image, for three images of your own choosing. False colour each triangle adjacent
to the maximal nucleus triangle. N.B.: Choose your images from your personal
collection of images, not taken from the web. In this problem, image centroids are
used instead of corners as a source of generating points in constructing the Voronoï
mesh.
Problem 6.16 Give a Matlab script that false colours (your choice of colour) the
maximal nucleus triangle of each MNptC in a centroid-based Voronoï–Delaunay
triangulation mesh on an image, for three images of your own choosing. False colour
each triangle adjacent to the maximal nucleus polygon with inscribed triangle corners.
N.B.: Choose your images from your personal collection of images, not taken from
the web. In this problem, image centroids are used instead of corners as a source of
generating points in constructing the Voronoï mesh.
Chapter 7
Video Processing. An Introduction
to Real-Time and Offline Video Analysis
This chapter introduces video processing with the focus on tracking changes in
video frame images. Video frame changes can be detected in the changing shapes,
locations and distribution of the polygons (regions) in Voronoï tilings of the frames
(see, e.g., Fig. 7.1). The study of video frame changes can be done either in real-
time or offline. Real-time video frame analysis is the preferred method, provided the
analysis can be carried out in a reasonably short time for each frame. Otherwise, for
more time-consuming analysis of video frame content, offline processing is used.
From a computer vision perspective, scenes recorded by a video camera depend on
the camera aperture angle and its view of a visual field, which is analogous to the
human perception angle (see Fig. 7.2). For more about this, see Sect. 7.3.
7.1 Basics of Video Processing
This section briefly introduces some of the essentials of video processing, leading
to object detection in videos. A good introduction to video processing is given by
T.B. Moeslund [125].
The basic unit in a video is a frame. A frame is an individual digital image in a
linear sequence of images.
Every frame is a set of pixels susceptible to any of the standard image processing
techniques such as false colouring, pixel selection (e.g., centroid, corner and edge
pixels), pixel manipulation (e.g., RGB −→ greyscale), pixel (point) processing (e.g.,
adjusting color channel brightness), filtering (e.g., frame noise reduction, histogram
equalization, thresholding) and segmentation (e.g., separation of pixels into non-
overlapping regions).
Video processing begins with the image acquisition process, which is markedly
different from taking a single snapshot. Image acquisition in video is basically a
repeated two-step process in which a single image is captured and then added to a
sequence of images called frames.
Videos consume huge amounts of memory for their storage. Hence, image
compression is a central concern in video image acquisition. The MPEG (Motion Picture
Experts Group) standard was designed to compress video signals to bit rates of 4 to
6 Mbps (megabits per second). MPEG-1 and MPEG-2 compression reduces spatial
and temporal redundancies.
With the MPEG approach to compression, each frame is coded separately using
JPEG (Joint Photographic Experts Group) lossy compression. JPEG uses piecewise
uniform quantization. A quantizer is determined by an encoder that partitions an
input set of signal values into classes and a decoder that specifies the set of output
values. Let x be a signal value. This quantization process is modelled with a selector
function Si(x) on a set Ri (a partition cell). A selector function Si(x) is an example
of what is known as an indicator function 1R of a partition cell, defined by
$$1_R(x) = \begin{cases} 1, & \text{if } x \in R \ \text{(input signal } x \text{ belongs to partition cell } R\text{)},\\ 0, & \text{otherwise}. \end{cases}$$
A video selector function Si for partition cell Ri is defined by the indicator function
1Ri on cell Ri, i.e.,
$$S_i(x) = 1_{R_i}(x).$$
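A minimal sketch of a selector function on one hypothetical partition cell R_i = [0.25, 0.5) (the cell bounds are illustrative assumptions):
% indicator 1_R and selector S_i for a partition cell R = [lo, hi)
oneR = @(x, lo, hi) double(x >= lo & x < hi);
Si = @(x) oneR(x, 0.25, 0.5);  % hypothetical cell R_i = [0.25, 0.5)
Si(0.3)  % ans = 1: signal value belongs to R_i
Si(0.7)  % ans = 0: signal value lies outside R_i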
7.1.3 Blobs
A blob (binary large object) is a set of path-connected pixels in a binary image. The
notion of connectedness makes it possible to extend the notion of a blob to grey-blobs
in greyscale and colour-blobs in colour images.
Polygons are connected, provided the polygons share one or more points. For
example, a pair of Voronoï regions A and B that have a common edge are connected.
Again, for example, a pair of Delaunay triangles that have a common vertex are
connected. In that case, connected Voronoï polygons with a common edge containing
n points are n-adjacent. Similarly, Delaunay triangles that have a common vertex
containing n points are both connected and n-adjacent. A pair of Delaunay triangles
with a common vertex are 1-adjacent.
A sequence p1 , . . . , pi , pi+1 , . . . , pn of n pixels or voxels is a path, provided
pi , pi+1 are adjacent (no pixels in between pi and pi+1 ). Pixels p and q are path-
connected, provided there is a path with p and q as endpoints. Similarly, image
shapes A and B (any polygons) are path-connected, provided there is a sequence
S1, . . . , Si, Si+1, . . . , Sn of n adjacent shapes with A = S1 and B = Sn.
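A minimal sketch of blob (path-connected pixel set) detection on a binary image (circles.png ships with the Image Processing Toolbox):
% find and false-colour the blobs in a binary image
bw = imread('circles.png');
cc = bwconncomp(bw);                  % path-connected pixel sets (blobs)
fprintf('%d blobs found\n', cc.NumObjects);
L = labelmatrix(cc);
imshow(label2rgb(L));                 % one colour per blob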
For more about connectedness from digital image perspective, see R. Klette and
A. Rosenfeld [94, Sect. 2.2.1, pp. 46–50].
Either in real-time or offline, every video frame can be tiled (tessellated) with a
Voronoï diagram, or tiled using Delaunay's triangulation method, i.e., connect sites
of neighbouring Voronoï regions with straight edges to form multiple triangles
covering a video frame. A natural outcome of either form of frame tiling is mesh
clustering and object recognition. The fundamentally important step in this form of
video processing is the selection of sites (generating points) used to construct either
frame Voronoï regions or Delaunay triangles. After the selection of frame generating
points, a frame can be tiled. Frame tiling then leads along the path to video object
detection (see Fig. 7.4 for the steps leading to frame object detection).
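A minimal offline sketch of frame tiling with corner-based sites (the file name traffic.mp4 and the corner count 50 are assumptions; VideoReader handles most common containers):
% tessellate each video frame with a Voronoi diagram
v = VideoReader('traffic.mp4');
while hasFrame(v)
    f = rgb2gray(readFrame(v));
    C = corner(f, 50);            % frame generating points
    imshow(f); hold on;
    voronoi(C(:,1), C(:,2));      % superimpose the frame tiling
    hold off; drawnow;
end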
Recall that a Voronoï tiling of a plane surface is a covering of the surface with non-
overlapping Voronoï regions. Each 2D Voronoï region of a generating point is an
n-sided polygon (briefly, ngon). In effect, a planar Voronoï tiling is a covering of a
surface with non-overlapping ngons. Video frame tilings have considerable practical
value, since the contour of the outer polygons surrounding a frame object can be
measured and compared.
7.3 Detection of Shapes in Video Frames
The detection of personal spaces in the motion of people in video sequences is aided
by constructing Voronoï tilings (also called Voronoï diagrams) on each video frame.
A personal space is defined by a comfort distance between persons in motion. Let d
be a distance (in meters) between persons. Four types of comfort distances between
persons have been identified by E. Hall [66], namely,
Intimate: 0 ≤ d ≤ 0.5 m (Friendship distance).
Personal: 0.5 ≤ d ≤ 1.25 m (Conversational distance).
Then the personal space PPS(fv) of a frame fv [86, Sect. 3.2, p. 326] is defined by
$$PPS(f_v) \geq \frac{\alpha\pi}{360^{\circ}}\, R_c^2\ \mathrm{m}^2 \quad \text{(Video frame perceived personal distance)}.$$
Then PPS is defined to be the area of the region formed by the intersection of a
person’s visual field and corresponding Voronoï polygon [86]. The region of attention
focus of a person’s visual field is estimated to be a circular sector with an approximate
aperture angle of 40°. Let f be the focal length and D the diameter of an aperture.
The aperture angle [122] of a lens (e.g., the human eye) is the apparent angle α of the
lens aperture as seen from the focal point, defined by
$$\alpha = 2\tan^{-1}\!\left(\frac{D}{2f}\right) \quad \text{(aperture angle)}.$$
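For example, a hypothetical aperture of diameter D = 4 mm and focal length f = 17 mm (rough human-eye values, for illustration only) gives:
% aperture angle from aperture diameter D and focal length f
D = 4; f = 17;             % millimetres (illustrative values)
alpha = 2*atand(D/(2*f))   % aperture angle, approx. 13.4 degrees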
7.4 Measuring Shape Similarity and the Voronoï Visual Hull of an Object
The similarity distance D(A, B) between the two contours A and B, represented
by a set of uniformly sampled points in A and B [60, Sect. 2, p. 29], is defined by
$$D(A, B) = \max\!\left\{\max_{a \in A} D(a, B),\ \max_{b \in B} D(b, A)\right\} \quad \text{(Similarity Distance)}.$$
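A minimal sketch of this similarity (Hausdorff-like) distance for contours given as n-by-2 point lists (pdist2 is assumed available from the Statistics and Machine Learning Toolbox):
% similarity distance between contours A and B
simD = @(A, B) max( max(min(pdist2(A, B), [], 2)), ...   % max over a of D(a,B)
                    max(min(pdist2(B, A), [], 2)) );     % max over b of D(b,A)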
The contour distance D(A, Bi) between the question mark shape A in Fig. 7.7.1 and
each of three different Aha! shapes B1, B2, B3 in Fig. 7.6 is computed from the
distances between points along the contours. In this example, check the value of
maxContourDistance. The shapes B1, B2, B3 would be considered close to the
question mark shape, provided maxContourDistance is close to zero.
The similarity between the shapes of visual hulls in a sequence of video frames is
measured relative to a distance between known visual hull shapes. Let A, Bi be known
shapes with distance D(A, Bi ). And let S0 be a known shape that is compared with
shape Sj , 1 ≤ j ≤ n in a sequence of n video frames. The distance D(A, Bi ) between
known shapes A, Bi is compared with the summation of the distances between a base
shape S0 and a succession of shapes Si in a sequence of video frames. A and S0 have
similar shapes, provided the sum of the differences between n shape contours S0
and Sj is close to (approximately equal to) the distance D(A, Bi ), i.e.,
$$D(A, B_i) \approx \sum_{j=1}^{n}\left(\sum_{a \in S_0,\ b \in S_j} \|a - b\|\right) \quad \text{(Similar Frame Shapes)}.$$
Let ε > 0 be a small number, and let
$$\text{shapeDiff}(S_0, S_j) := \sum_{j=1}^{n}\left(\sum_{a \in S_0,\ b \in S_j} \|a - b\|\right).$$
Shapes A and S0 are considered close, provided
$$\big|D(A, B_i) - \text{shapeDiff}(S_0, S_j)\big| \leq \varepsilon.$$
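A minimal sketch under a matched-point reading of shapeDiff, where S0 and each S{j} are equal-length n-by-2 contour point lists (pad the shorter list if needed, as suggested later in Problem 7.34):
% shape difference between base shape S0 and frame shapes S{1},...,S{n}
diffOne = @(S0, Sj) sum(vecnorm(S0 - Sj, 2, 2));          % sum of ||a - b||
shapeDiff = @(S0, S) sum(cellfun(@(Sj) diffOne(S0, Sj), S));
% e.g., with hypothetical contours:
% S0 = rand(20,2); S = {rand(20,2), rand(20,2)}; val = shapeDiff(S0, S)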
The hunt for similar shapes in video frames reduces to a comparison of the dis-
tance between known shapes and similarity distances between a succession of frame
shapes.
The similarity distance D(A, Bi) between question mark shapes in Fig. 7.7.1
is compared with the distance between points along the contours of frame shapes
containing a mixture of question mark and Aha! shapes in a video frame sequence in
Fig. 7.7.2. This comparison fails, since the similarity distance between the question
mark shape and the Aha! shape is usually not close for small ε values. Strangely
enough, the question mark shape can be deformed (mapped) into the Aha! shape.
To see how this is done, see [144, Sect. 5.3].
7.5 Maximal Nucleus Clusters
Notice that every polygon in a Voronoï tessellation of a surface is the centre (nucleus)
of a cluster containing all polygons that are adjacent to the nucleus. A Voronoï mesh
nucleus is any Voronoï region that is the centre of a collection of Voronoï regions
adjacent to the nucleus.
There are two basic types of cluster contours useful in identifying object shapes
in tessellated digital images.
2 http://www.atlasoftheuniverse.com/milkyway2.jpg.
Fig. 7.9.1: milkyway
7.6 Problems
2o Let S be the set of pixels with different colour intensities found in Step 1.
3o For each frame, construct the Voronoï mesh V (S), using S as the set of generating
points.
4o Display two sample frames with mesh V (S) superimposed on them.
5o Repeat Steps 1–4 for up to 300 pixel colour intensities.
Problem 7.14 HSV-based Voronoï mesh.
Capture two .mp4 files with 100 frames in each file and do the following in real-time
during video capture.
1o Convert each RGB frame to the HSV colour space.
2o For each frame image, find up to 100 pixels with different hue-values and
locations, i.e., each pixel found will have a hue and value that is different from the
other pixels in the frame image.
3o Let S be the set of pixels with different hues and values found in Step 2.
4o For each frame, construct the Voronoï mesh V (S), using S as the set of generating
points.
5o Display two sample frames with mesh V (S) superimposed on them.
6o Repeat Steps 2–5 for up to 300 pixel hue-value combinations.
Problem 7.15 Gradient orientation & green channel-based Voronoï mesh.
Capture two .mp4 files with 100 frames in each file and do the following in real-time
during video capture.
1o For each frame image, find up to 100 pixels with different green channel colour
intensity and gradient orientation combinations and locations, i.e., each pixel
found will have a green intensity and gradient orientation that is different from
the other pixels in each frame image.
2o Let S be the set of pixels with the different green intensity and gradient orientation
combinations found in Step 1.
3o For each frame, construct the Voronoï mesh V (S), using S as the set of generating
points.
4o Display two sample frames with mesh V (S) superimposed on them.
5o Repeat Steps 1–4 for up to 300 pixel green intensity-gradient orientation combi-
nations.
Problem 7.16 Fine Cluster Contours.
Capture three .mp4 files with 100 frames in each file and do the following in real-time
during video capture.
1o Capture video frames.
2o Select 100 corners in each frame image.
3o Tile (tessellate) each frame with a Voronoï diagram.
4o Recall that each Voronoï polygon is the nucleus of a cluster, which is a collection
of polygons adjacent to a central polygon called the cluster nucleus. The focus
of this step is on maximal nucleus clusters, i.e., a maximal nucleus cluster is a
nucleus cluster in which the Voronoï nucleus polygon has a maximal number of
adjacent polygons. In each frame, identify the Voronoï nuclei polygons with the
maximum number of adjacent polygons.
Problem 7.28 Green channel pixel intensities-based fine cluster contour simi-
larities.
Do Problem 7.25 for generating points that are frame green channel pixel intensi-
ties instead of frame corners to tessellate each video frame. That is, in each frame,
choose generating points that are frame green channel pixel intensities instead of
corners to test the similarity between a known shape and the shapes in captured
.mp4 files.
Problem 7.29 Corner and Green channel pixel intensities-based fine cluster
contour similarities.
Do Problem 7.25 for generating points that are corners with different green channel
pixel intensities to tessellate each video frame. That is, in each frame, choose
generating points that are corners with different green channel pixel intensities,
instead of corners alone, to test the similarity between a known shape and the shapes
in captured .mp4 files.
Problem 7.30 Repeat the steps in Problem 7.25 by doing Problem 7.17 in Step
7.25.2. That is, measure the similarity between shapes by measuring the difference
between the coarse cluster contour of a known image object Target and the coarse
cluster contours that identify the shapes of objects in each video frame.
Problem 7.31 This problem focuses on coarse cluster contours (call them coarse
perimeters). Consider three levels of coarse perimeters:
S1P: Level 1 coarse perimeter (our starting point; call it supra 1 perimeter or
briefly S1P).
S2P: Level 2 coarse perimeter (supra 2 perimeter or briefly S2P) that contains a
supra 1 perimeter.
S3P: Level 3 coarse perimeter (supra 3 perimeter or briefly S3P) that contains a
S2P and S1P.
Level 3 is unlikely.
The occurrence of S2P containing S1P is the promising case in terms of object
recognition. Do the following:
1o Detect when a S1P is contained in a S2P. Announce this in the work space along
with the lengths of the S1P and S2P perimeters. Put a tiny circle label (1) on the
S1P and a (2) on S2P.
2o Detect when S3P contains S2P. Announce this in the work space along with the
lengths of the S1P and S2P perimeters. Put a tiny circle label (2) on S2P, (3) on
S3P.
3o Detect when S3P contains S2P and S2P contains S1P. Announce this in the work
space along with the lengths of the S1P, S2P, S3P perimeters. Put a tiny circle
label (1) on the S1P and a (2) on S2P and a (3) on S3P.
4o Detect when S2P does not contain S1P and S3P does not contain S1P. Announce
this in the work space along with the lengths of the S1P, S2P, S3P perimeters.
Put a tiny circle label (1) on the S1P and a (2) on S2P.
5o Produce a new figure that suppresses (ignores) MNCs on the border of an image
and displays S1P (case 1).
6o Produce a new figure that suppresses (ignores) MNCs on the border of an image
and displays S1P, S2P (case 2). Include (1), (2) circle labels. Announce this in
the work space along with the lengths of the S1P and S2P perimeters.
7o Produce a new figure that suppresses (ignores) MNCs on the border of an image
and displays S1P, S2P, S3P (case 3). Announce this in the work space along with
the lengths of the S1P, S2P, S3P perimeters. Put a tiny circle label (1) on the S1P
and a (2) on S2P and a (3) on S3P.
Suggestion by Drew Barclay: Select SnP contained within S(n+1)P so that the
line segments making up each of the contours never intersect. In addition, the
minimum and maximum X/Y values have greater absolute values for S(n+1)P.
For this problem: Try object recognition in a traffic video to see which of the
above cases works best.
Hint: Crop frame 1 of a video and use that crop for each of the video frames. Let k
equal the number of SURF keypoints selected. Try k = 89 and k = 377 to see some
very interesting coarse perimeters.
Let |ei| be the number of mesh generators in a mesh contour edgelet and let Pr(ei)
(the probability of the occurrence of edgelet ei) be defined by
$$Pr(e_i) = \frac{1}{|e_i|} = \frac{1}{\text{size of } e_i} \quad \text{(MNC Contour Edgelet Probability)}.$$
Problem 7.32 Let V be a video containing Voronoï tessellated frames. Do the fol-
lowing:
1. Crop each video frame (select only one or more of the central rectangular regions,
depending on the size of the rectangles used to partition a video frame). Work
with the central rectangular region for the next steps.
2. Tessellate each frame using SURF points as mesh generators. Experiment with
the number of SURF points to use as mesh generators, starting with 10 SURF
points.
3. Find the MNCs in each video frame.
Fig. 7.12 7.12.1: edgelet; 7.12.2: edgeletLabels
11. Give Matlab script to display a log-polar plot for edgelet frequencies. Hint: the
basic approach is to bin the edgelet frequencies of a video frame into a polar
histogram. For examples, see O. Tekdas and N. Karnad [192, Fig. 3.1, p. 8].
12. Give Matlab script to display a plot of Pr(ei) against ei.
13. Give Matlab script to display a 3D contour plot of Pr(ei) against ei and mi. Hint:
For a sample solution, see Matlab script A.34 in Appendix A.7.1.
Let N be the sample size used to study video frame edgelets. For instance, if we
shoot a video with 150 frames, then N := 150. For this work, N equals the number
of frames containing tessellated video images. The chi-squared statistic3 χ2s measures
the deviation of a sample s from the expectation for the sample s and is defined by
$$\chi^2_s = \sum_{i=1}^{k} \frac{\big(m_i - N\,Pr(e_i)\big)^2}{N\,Pr(e_i)}.$$
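A minimal sketch of the statistic for hypothetical edgelet counts (the counts and probabilities below are illustrative, not measured values):
% chi-squared statistic for observed edgelet frequencies m_i
m  = [12 30 45 38 25];            % hypothetical observed frequencies
Pr = [0.10 0.20 0.30 0.25 0.15];  % hypothetical edgelet probabilities
N  = 150;                         % number of tessellated frames
chi2s = sum((m - N*Pr).^2 ./ (N*Pr))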
Problem 7.33 Let V be a video containing Voronoï tessellated frames. Do the fol-
lowing:
1. Repeat steps 1 to 8 in Problem 7.32.
2. Give Matlab script to compute χ2s for a tessellated video frame.
3. Give Matlab script to plot χ2s values for 10 sample videos.
7.7 Shape Distance
The focus here is on computing what is known as the cost of the distance between
MNC contour edgelets. The approach here is an extension of the basic notion of a
cost function for distance introduced by M. Eisemann, F. Klose and M. Magnor [44,
p. 10]. Let ei, ej be edgelets and let a, b > 0 be constants used to adjust the cost
function Cdist(ei, ej), defined by
$$C_{dist}(e_i, e_j) = \frac{a}{1 + e^{-b\,\|e_i - e_j\|}}.$$
3 http://mathworld.wolfram.com/Chi-SquaredTest.html.
The selection of a and b is based on arriving at the maximal cost of the distance
between ei and ej. For example, let b := 1 and let a := D(ei, ej) (the Čech distance
between the pair of edgelet point sets), defined by
$$D(e_i, e_j) = \min\left\{\|x - y\| : x \in e_i,\ y \in e_j\right\}.$$
We are interested in defining a cost function of the distance between a target MNC
contour edgelet etarget and a sample edgelet ej in a video. Then, for example,
Cdist(etarget, ej) is defined by
$$C_{dist}(e_{target}, e_j) = \left.\frac{a}{1 + e^{-b\,\|e_{target} - e_j\|}}\right|_{\,a = D(e_{target},\, e_j),\ b = 1}.$$
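A minimal sketch of the Čech distance and the resulting cost, with edgelets as n-by-2 generator coordinate lists (pdist2 is assumed available from the Statistics and Machine Learning Toolbox):
% Cech distance between edgelet point sets and cost of that distance
D = @(ei, ej) min(min(pdist2(ei, ej)));                % min ||x - y||
Cdist = @(ei, ej, a, b) a / (1 + exp(-b * D(ei, ej)));
% e.g., with a = D(etarget, ej) and b = 1:
% cost = Cdist(etarget, ej, D(etarget, ej), 1);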
Problem 7.34
Let V be a video containing Voronoï tessellated frames. Do the following:
1. Crop each video frame (select only one or more of the central rectangular regions,
depending on the size of the rectangles used to partition a video frame). Work
with the central rectangular region for the next steps.
2. Tessellate each frame using SURF points as mesh generators. Experiment with
the number of SURF points to use as mesh generators, starting with 10 SURF
points.
3. Select an edgelet etarget that is the set of generators that are the endpoints of the
edges along a fine contour of a target object shape.
4. Select an edgelet ej from a sample video. The selected edgelet should be extracted
from a video frame containing a MNC contour that is similar to the known target
shape. In other words, to select ej, verify that
$$\big|D(e_{target}, e_j) - \text{shapeDiff}(e_{target}, e_j)\big| \leq \varepsilon.$$
See Sect. 7.5 in this book for the methods used to compute D(etarget , ej ) and
shapeDiff (etarget , ej ). It may be necessary to pad etarget or ej with zeros, if one of
these edgelets has fewer edge pixels than the other edgelet.
Hint: Check the number of pixels in both edgelets, before attempting to compute
the distance between etarget and ej .
5. Give Matlab script to compute the cost distance function Cdist (etarget , ej ).
6. Repeat steps 1 to 5 for other choices of edgelets ej and the same target. Also,
experiment with other choices of a, b in the cost distance function.
7. Repeat steps 1 to 5 for other choices of edgelets ej in 10 different videos and a
different target. Also, experiment with other choices of a, b in the cost distance
function.
8. Give Matlab script to display a 3D contour plot Cdist (etarget , ej ) against a and b
for the 10 selected videos.
9. Comment on the choices of a and b in computing Cdist (etarget , ej ).
Problem 7.35 Let V be a video containing Voronoï tessellated frames. Do the fol-
lowing:
1. Repeat Steps 1 to 5 in Problem 7.34.
2. In Problem 7.34.5, include in your Matlab script the computation of the shape
cost Cshape(etarget, ej).
3. In Step 2 of this Problem, compute the overall cost C(etarget, ej).
4. Repeat Steps 7.35.1 to 7.35.3 for 10 different videos and different targets.
5. Give Matlab script to display a 2D plot of C(etarget, ej) against videos 1, · · · , 10.
6. Give Matlab script to display a 3D contour plot of C(etarget, ej) against
Cdist(etarget, ej) and Cshape(etarget, ej) for the 10 selected videos.
7. Comment on the results of the 2D and 3D plots obtained.
Fig. 7.13 Contour edge pixels from the edgelet in Fig. 7.12.2
7.9 Maximum Edgelets
The first new element in object recognition in tessellated images is the introduction
of maximum MNC contour edgelets (denoted by max ei), which are edgelets containing
all contour edge pixels, i.e.,
$$\max|e_i| = \text{number of contour edge pixels},$$
not just straight edge endpoints.
Example 7.36 Let ei denote the edgelet in Fig. 7.12.2, which is the ith edgelet in a
collection of tessellated video frames. Edgelet ei contains 9 mesh generating points.
In other words, ei is not maximal. To obtain max ei, identify all of the edge pixels
along each contour straight edge. For example, max ei would include the endpoints
g5, g6 as well as the interior pixels in the contour straight edge g5g6 shown in Fig. 7.13.
The second new element is the inclusion of coarse contour edgelets in the study of
object shapes in tessellated digital images. Until now, the focus has been on fine
contour edgelets, defined by the straight edges connecting the generating points for all
polygons adjacent to the nucleus in a mesh MNC. Now we want to consider edgelets
defined by the straight edges connecting the generating points for the polygons
surrounding the fine contour polygons.
Example 7.38 Let max efine denote a maximum fine contour edgelet and let
max ecoarse denote a maximum coarse contour edgelet. For example, in Fig. 7.15.1,
the dotted lines · · · · · · represent the endpoints and interior straight edge pixels
in a fine MNC contour edgelet max efine in the Voronoï mesh shown in Fig. 7.14. In
Fig. 7.15.2, the dotted lines · · · · · · represent the endpoints and interior straight edge
pixels in a coarse MNC contour edgelet max ecoarse in the Voronoï mesh shown in
Fig. 7.14.
1. Crop each video frame (select only one or more of the central rectangular regions,
depending on the size of the rectangles used to partition a video frame). Work
with the central rectangular region for the next steps.
2. Tessellate each frame using SURF points as mesh generators. Experiment with
the number of SURF points to use as mesh generators, starting with 10 SURF
points.
3. Find the MNCs in each tessellated frame.
4. Display the MNC in the tessellated frame. Highlight the nucleus in red and the
polygons surrounding the nucleus in yellow, all with 50% opacity (adjust the
opacity so that the subimage underlying the MNC can be clearly seen).
5. For the selected MNC, determine the maximum fine contour edgelet for the MNC
(call it maxei ).
6. Display (plot) the points in maxei by itself.
7. Display (plot) the points (in red) in maxei superimposed on the image MNC.
8. Repeat Steps 1 to 7 for 10 different videos.
9. Comment on the results obtained.
Example 7.40 Let MNC1, MNC2 be represented in Fig. 7.17.1. This pair of
6. Display each MNC in the tessellated frame. Highlight the nucleus in red and
the polygons surrounding the nucleus in yellow, all with 50% opacity (adjust
the opacity so that the subimage underlying the MNC can be clearly seen).
7. For the selected MNC, determine the maximum fine contour edgelet for the
MNC (call it maxei ).
8. Display (plot) the points in maxei by itself.
9. Display (plot) the points (in red) in maxei superimposed on the image MNC.
10. Repeat Steps 1 to 9 for 10 different videos.
11. Comment on the results obtained.
Chapter 8
Lowe Keypoints, Maximal Nucleus Clusters,
Contours and Shapes
This chapter carries forward the use of Voronoï meshes superimposed on digital
images as a means of revealing image geometry and the shapes that result from contour
lines surrounding maximal nucleus clusters (MNCs) in a mesh. Recall that every
polygon in a Voronoï mesh is a nucleus of a cluster of polygons. Here, the term
nucleus refers to the fact that in each MNC, there is always a polygon that is a
cluster center. For more about this, see Appendix B.12 (MNC) and Appendix B.13
(nucleus).
The focus of this chapter is on an image geometry approach to image and scene
analysis. To facilitate image and scene analysis, a digital image can be viewed as a
set of points (pixels) susceptible to the whole spectrum of mathematical structures
commonly found in geometry and in the topology of digital images.
The typical geometric structures found in image geometry include points, lines,
circles, triangles, and polygons, as well as equations that specify the positions and
configurations of the structures. In other words, image geometry is an analytic geometry
view of images. In digital images, these geometric structures also include image
neighbourhoods, image clusters, image segments, image tessellations, collections of
image segment centroids, sets of points nearest a particular image point such as a
region centroid, image regions gathered together as near sets, adjacent image regions,
and the geometry of polygonal image regions.
Another thing to notice is the topology of digital images. A topology of
digital images (or image topology) is the study of the nearness of pixels to sets of
pixels. Such a topological approach to digital images leads to meaningful groupings
of the parts of a digital image, including sets of pixels with border pixels excluded
(open sets) or sets of pixels with border pixels included (closed sets). This basic
approach in the study of images is a direct result of A. Rosenfeld’s discovery of 4-
and 8-neighbourhoods [170] (see, also, [94, 142]).
A tessellation of an image is a tiling of the image with polygons. The polygons
can have varying numbers of sides.
The fact that structured images reveal hidden information in images is the main
motivation for tessellating an image. Structured images are then analysed in terms
of their component parts such as subsets of image regions, local neighbourhoods,
regional topologies, nearness and remoteness of sets, local convex sets and mappings
between image structures.
8.1 Image Analysis
Image analysis focuses on various digital image measurements, e.g., pixel size, pixel
adjacency, pixel feature values, pixel neighbourhood membership, pixel gradient
orientation, pixel gradient magnitude, pixel intensity (either colour channel intensity
or greyscale intensity), pixel intensities distribution (histograms), closeness of image
A visual scene is a collection of objects in a visual field that captures our attention.
In human vision, a visual field is the total area in which objects can be seen. A normal
visual field is about 60◦ from the vertical meridian of each eye and about 60◦ above
and 75◦ below the horizontal meridian. A sample Dirichlet tessellation of a 640 × 480
digital image containing a fisherman scene is shown in Fig. 8.1. Here, the locations of
up to 60 image key colour-feature values are the source of sites used to generate the
fisherman diagram. The • indicates the location of a keypoint, i.e., a pixel with a
particular gradient orientation (Fig. 8.3).
The edge pixels of a fingerprint are displayed with varying colours and intensities
in Fig. 8.2. The HSV channel values for each pixel, determined using the gradient
orientation (Hue), gradient magnitude in the x-direction (Saturation) and gradient
magnitude in the y-direction (Value) of each edge pixel, are combined to achieve
a visualization of the pixel gradient information. To see how this is done, see the
Mathematica script 6 in Appendix A.8.5. Try doing the same thing using the RGB
and Lab colour spaces. Recall that the CIE Lab colour space describes colours visible
to the human eye. Lab is a 3D colour space model, where L represents the lightness
of the colour, a the position of the colour between red/magenta and green, and b the
position of the colour between yellow and blue. Hint: see Matlab script A.37 and
Mathematica script 7 in Appendix A.8.5.
The foundations for scene analysis are built on A. Rosenfeld’s pioneering work
on digital topology [98, 168–172] (later called digital geometry [94]) and the work
of others [39, 99, 102, 104, 105]. The work on digital topology runs parallel with
the introduction of computational geometry by M.I. Shamos [175] and
F.P. Preparata [158, 159], building on the work on spatial tessellations by G. Voronoi
[201, 203] and others [27, 53, 64, 103, 124, 196].
To analyze and understand image scenes, it is necessary to identify the objects
in the scenes. Such objects can be viewed geometrically as collections of connected
edges (e.g., skeletonizations or edges belonging to shapes or edges in polygons) or
image regions viewed as sets of pixels that are in some sense near each other, or sets of
points near a fixed point (e.g., all points near a site (also, seed or generating point) in a
Voronoï region [38]). For this reason, it is highly advantageous to associate geometric
structures in an image with mesh-generating points (sites) derived from the fabric of
an image. Image edges, corners, centroids, critical points, intensities, and keypoints
(image pixels viewed as feature vectors) or their combinations provide ideal sources
of mesh generators as well as sources of information about image geometry.
Fig. 8.7 Two sets of intensity image edge pixel strengths represented by circle radii magnitudes.
The orientation angle of each radius corresponds to the gradient orientation of the circle center
keypoints
8.3 Pixel Edge Strength
This section briefly looks at pixel edge strength (also called pixel gradient magnitude).
The edge strength of pixel Img(x, y) is denoted by E(x, y) and defined by
$$E(x, y) = \sqrt{\left(\frac{\partial Img(x, y)}{\partial x}\right)^{\!2} + \left(\frac{\partial Img(x, y)}{\partial y}\right)^{\!2}} = \sqrt{G_x(x, y)^2 + G_y(x, y)^2} \quad \text{(Pixel edge strength)}.$$
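A minimal sketch of E(x, y) using Sobel gradients (cameraman.tif ships with the Image Processing Toolbox):
% pixel edge strength from x- and y-gradient magnitudes
img = im2double(imread('cameraman.tif'));
[Gx, Gy] = imgradientxy(img, 'sobel');   % directional gradients
E = sqrt(Gx.^2 + Gy.^2);                 % edge strength E(x,y)
imshow(E, []);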
tend to cluster around mesh polygons in image regions where the image entropy
(and corresponding information level) is highest. Those high entropy nucleus mesh
clusters are good hunting grounds for the recognition of image objects and patterns.
Mesh nucleus clusters are examples of Edelsbrunner–Harer nerves [42] (see, also,
[148, 150]).
For complex video frames such as traffic video frames, it is necessary to crop each
frame and then select only a portion of the cropped frame to tessellate. By cropping
an image, we mean removing the outer parts of an image to isolate and magnify a
region of interest. See, for example, the regions in the sample cropped traffic video
frame in Fig. 8.8. For a traffic video, a promising approach is to crop the central part
of each frame. See, e.g., P. Knee [95] for common approaches to cropping an image.
A sparse representation of an image is some form of either a reduction or expansion
of the image. Repeated sparse representation results in a sequence of images called
a Gaussian pyramid by P.J. Burt and E.H. Adelson [22].
Remark 8.7 Sparse Representations.
Sparse representation of digital images is a promising research area (basically,
a followup to the approach suggested by P. Knee [95]). See, e.g., P.J. Burt and
E.H. Adelson [22] and, more recently, for scalable visual recognition by B. Zhao
and E.P. Xing [219]. The article by B. Zhao and E.P. Xing not only presents an
is clearer than the first shadow shape in Fig. 8.12.1 as well as the third shadow shape in
Fig. 8.12.3. The shadow shape in Fig. 8.12.2 then provides a good laboratory for the
study of image object shapes using computational geometry techniques from the
previous chapters.
Basically, a plane shape like the auto shadow shape in Fig. 8.12.2 is a container
for a spatial region in the plane. In the context of shape detection of objects in
digital images, the trick is to isolate and compare shapes of interest in a sequence
of images such as those found in a video. K. Borsuk was one of the first to suggest
studying sequences of plane shapes in his theory of shapes [17]. For an expository
introduction to Borsuk’s notion of shapes, see K. Borsuk and J. Dydak [18]. Borsuk’s
initial study of shapes has led to a variety of applications in science (see, e.g., the
shape of capillarity droplets in a container by F. Maggi and C. Mihaila [117] and
shapes of 2D water waves and hydraulic jumps by M.A. Fontelos, R. Lecaros, J.C.
López-Rios and J.H. Ortega [51]). For more about the basics of shape theory, see
N.J. Wildberger [210]. For shapes from a physical geometry perspective with direct
application in detecting shapes of image objects, see J.F. Peters [145].
Image object shape detection and object class recognition are of great interest
in Computer Vision. For example, basic shape features can be represented by
boundary fragments, and shape appearance can be represented by patches such as the
auto shadow shape in the traffic video frame in Fig. 8.12.2. This is the
basic approach to image object shape detection by A. Opelt, A. Pinz and A. Zisserman
in [133]. Yet another recent Computer Vision approach to image object
shape detection reduces to the problem of finding the contours of an image object,
which correspond to object boundaries and symmetry axes. This is the approach
suggested by I. Kokkinos and A. Yuille in [97]. A promising approach to image
object shape detection in video frames is to track changing image object contours
(shapes) while minimizing an energy function that combines region, boundary and
shape information. This approach to shape detection in videos is given by M.S. Allili
and D. Ziou in [5].
Fig. 8.14 Two sets of colour image edge pixel strengths represented by circle radii magnitudes.
The orientation angle of each radius corresponds to the gradient orientation of the circle center
keypoints
8.6 Image Pixel Gradient Orientation and Magnitude
This section briefly looks at the derivation of pixel gradient orientation and gradient
magnitudes.
Let Img be a digital image and let Img(x, y) equal the intensity of a pixel at
location (x, y). Since Img(x, y) is a function of two variables x, y, we compute the
partial derivative of Img(x, y) with respect to x, denoted by ∂Img(x, y)/∂x, which is
the gradient magnitude of pixel Img(x, y) in the x-direction. The partial derivative
∂Img(x, y)/∂x is represented, for example, by the • on the horizontal axis in Fig. 8.6.
Similarly, ∂Img(x, y)/∂y is the gradient magnitude of pixel Img(x, y) in the
y-direction, represented, for example, by the • on the vertical axis in Fig. 8.6.
Let Gx(x, y), Gy(x, y) denote the edge pixel gradient magnitudes in the x- and y-
directions, respectively.
$$\frac{\partial f(x, y)}{\partial x} = f(x+1, y) - f(x, y) = 1 - 1 = 0,$$
$$\frac{\partial f(x, y)}{\partial y} = f(x, y+1) - f(x, y) = 0 - 1 = -1.$$
An alternative to Chen’s method is the preferred, widely used approach called the
Sobel partial derivative, given by J.L.R. Herran [78, Sect. 2.4.2, p. 23]:
$$\frac{\partial f(x, y)}{\partial x} = \frac{2}{4}\big(f(x+1, y) - f(x-1, y)\big) + \frac{1}{4}\big(f(x+1, y+1) - f(x-1, y+1)\big) + \frac{1}{4}\big(f(x+1, y-1) - f(x-1, y-1)\big) = -\frac{2}{4} - \frac{2}{4} = -1,$$
$$\frac{\partial f(x, y)}{\partial y} = \frac{2}{4}\big(f(x, y+1) - f(x, y-1)\big) + \frac{1}{4}\big(f(x+1, y+1) - f(x+1, y-1)\big) + \frac{1}{4}\big(f(x-1, y+1) - f(x-1, y-1)\big) = \frac{2}{4} + \frac{2}{4} = 1.$$
The Sobel partial derivative is named after I. Sobel [181].
Let ϑ(x, y) be the gradient orientation angle of the edge pixel Img(x, y) in
image Img. This angle is found by computing the arc tangent of the ratio of the edge
pixel gradient magnitudes. Compute ϑ(x, y) using
$$\vartheta(x, y) = \tan^{-1}\!\left(\frac{\partial Img(x, y)/\partial y}{\partial Img(x, y)/\partial x}\right) = \tan^{-1}\!\left(\frac{G_y}{G_x}\right) \quad \text{(Pixel gradient orientation)}.$$
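A minimal sketch of the orientation computation (atan2d is used here instead of atand to keep the full angular range, a deliberate substitution; cameraman.tif ships with the toolbox):
% pixel gradient orientation in degrees
img = im2double(imread('cameraman.tif'));
[Gx, Gy] = imgradientxy(img, 'sobel');
theta = atan2d(Gy, Gx);   % orientation angle of each pixel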
8.7 Difference-of-Gaussians
A 2D Gaussian smoothing kernel with standard deviation σ is defined by
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}.$$
Let k be a scaling factor and let ∗ be a convolution operation. From D.G. Lowe [116],
we obtain a difference-of-Gaussians image (denoted by D(x, y, σ)) defined by
$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * Img(x, y).$$
Then use D(x, y, σ) to identify potential interest points that are invariant to scale
and orientation.
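A minimal sketch of a difference-of-Gaussians image (the values σ = 1.6 and k = √2 are conventional choices in Lowe's work, used here as assumptions):
% difference-of-Gaussians image D(x,y,sigma)
img = im2double(imread('cameraman.tif'));
sigma = 1.6; k = sqrt(2);
D = imgaussfilt(img, k*sigma) - imgaussfilt(img, sigma);
imshow(D, []);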
8.8 Image Keypoints: D.G. Lowe’s SIFT Approach
The Scale-Invariant Feature Transform (SIFT) introduced by D.G. Lowe [115, 116]
is a mainstay in solving object recognition as well as object tracking problems. SIFT
works in a scale space to capture multiple scale levels and image resolutions. There
are four main stages in a SIFT computation on a digital image.
SIFT.4 Local pixel gradient magnitudes in the x- and y- directions are used to
compute pixel edge strengths. Note: See Sect. 8.6 for an explanation and
examples.
Remark 8.15 Keypoints, Edge Strength and Mesh Nerves.
A sample pixel edge strength is represented by the length of the hypotenuse in
Fig. 8.7.1. This is part of the image geometry shown in Fig. 8.6, illustrated in terms
of the edge pixels along the whorls of a fingerprint. Here is a summary of the results
for the pixel edge strength experiments.
chipmunk.jpg 860 keypoints found (144 and 233 keypoints displayed in the
intensity image in Fig. 8.14).
cycleImage.jpg 2224 keypoints found (144 and 233 keypoints displayed in the
intensity image in Fig. 8.7).
carPoste.jpg 902 keypoints found (144 and 233 keypoints displayed in the
intensity image in Fig. 8.14).
The analog of edge pixel strength in 2D images is the length of the radius of a
sphere with a keypoint at its center in a 3D image. In either case, keypoints provide
a basis for object recognition and a solid foundation for the study of image
geometric patterns in 2D and 3D images. A common approach in the study of
image objects and geometry is to use keypoints as generators of either Voronoï
or Delaunay tessellations of images (see, e.g., Fig. 8.18 for a Voronoi tessellation of
a cycle image using 144 keypoints and Fig. 8.19 for a tessellation using 377
keypoints). In either case, the resulting image mesh reveals clusters of polygons. Recall
that every mesh polygon is the nucleus of a mesh nerve. Often, interest (key) points
tend to cluster around mesh polygons in image regions where the image entropy
(and corresponding information level) is highest. Those high entropy nucleus mesh
clusters are good hunting grounds for the recognition of image objects and patterns.
Mesh nucleus clusters are examples of Edelsbrunner–Harer nerves [42] (see, also,
[148, 150]).
of the polygons that are along the outside border of the S1P polygons. An S2P
edgelet defines an intermediate coarse shape of an MNC, namely, a S2P shape.
S3P: Level 3 coarse perimeter (supra 3 perimeter or briefly S3P) that contains
a S2P and S1P. This form of coarse contour edgelet earns the name supra
3 contour, since this edgelet consists of line segments between keypoints of the
polygons that are along the outside border of the S2P polygons. An S3P edgelet
defines a maximally coarse shape of an MNC, namely, a S3P shape.
The simplest of the nucleus contours is the edgelet formed by connecting the
keypoints inside the Voronoï mesh polygons that are adjacent to an MNC nucleus.
This is the fine nucleus contour (also called the fine perimeter) of an MNC. The
sub-image inside a fine contour usually encloses part of an object of interest. The
length of a fine contour (nucleus perimeter) traces the shape of small objects and is a
source of useful information in fine-grained recognition of an object that has a shape
that closely matches the shape of a target object.
Fig. 8.21 Visualized image geometry via MNC fine contour edgelet (8.21.2: Image IP)
The IP edgelet shown in Fig. 8.21.2 is constructed with 89 keypoints. This image
edgelet tells us something about the geometry of a subimage containing the maximal
nucleus. This geometry appears in the form of a shape described by the IP edgelet.
For more about shapes, see Appendix B.18. In the best of all possible worlds, this
edgelet will enclose an interesting image region that contains some object.
Coarse nucleus contours are useful in detecting the shapes of large image objects.
A coarse nucleus contour is found by connecting the keypoints inside the Voronoï
mesh polygons that are along the border of the fine perimeter polygons of an MNC
nucleus. Coarse contours are also called supra- or outer-contours in an MNC. The
length of a coarse contour (nucleus perimeter) traces the shape of medium-sized
objects covered by an MNC. The S1P (level 1 supra perimeter) is the innermost
MNC coarse contour.
played as red polygons) are shown in Fig. 8.23.3. These S1P polygons
Fig. 8.23 Visualized image geometry via MNC S1P coarse contour edgelet
8.11 Quality of a MNC Contour Shape
The quality of the MNC contour shape will depend on the target shape that we select.
In an object recognition setting, a target shape is the shape of an object we wish to
compare with sample shapes in either a single image or in a sequence of video image
frames. The quality of an MNC contour shape is high in cases where the perimeter
of a target shape is close to the length of a sample MNC contour perimeter. In other
8.11 Quality of a MNC Contour Shape 269
This section pushes the envelope for MNC contours by considering Level 2 and Level 3 MNC contours, i.e., S2P and S3P MNC contours. S2P contours are often tightly grouped around S1P contours in an MNC cluster on an image, since the sites (e.g., keypoints or corners) are usually found in the interior of an image rather than along the image borders. This often happens, provided the number of selected sites is high enough.
Example 8.18 Coarse S1P and S2P Maximal Nucleus Cluster Contours.
The number of keypoints is 89 in the construction of the Voronoï mesh on the Poste car
image in Fig. 8.25. Notice how the keypoints cluster around the driver and monogram
on the Poste vehicle as well as around the Poste wheels. So we can expect to find a
maximal nucleus cluster in the middle of the Poste car shown in Fig. 8.23.4.
A combination of S1P, S2P and S3P contour edgelets is shown in Fig. 8.26. An S2P contour edgelet is shown in white in Fig. 8.26. Notice how the S2P contour is tightly grouped around the S1P contour. Here is a summary of the lengths of these contours:
For object recognition purposes, comparing an S2P contour in a target image with an S2P contour in a sample image such as a video frame is useful. The caution here is that the tight grouping of the resulting S2P and S1P contours depends on the number of keypoints that you choose. A choice of 89 or more keypoints usually produces a good result.
Example 8.19 Keypoint Mesh with S1P, S2P and S3P Maximal Nucleus Cluster Contours.
Fig. 8.27 Sample tightly grouped S3P, S2P and S1P contours
A combination of S1P, S2P and S3P contour edgelets is shown in Fig. 8.27. Now the S3P contour is displayed as a sequence of connected red line segments •—• using the keypoints as endpoints of each line segment in the S3P. Each S3P line segment is drawn between the keypoints in a pair of adjacent polygons along the border of the S2P polygons. Here is a summary of the lengths of these contours:
S1P contour length 943.2667 pixels.
S2P contour length 1384.977 pixels.
S3P contour length 2806.5184 pixels.
Unlike the S2P contour, the line segments in a S3P contour are usually not tightly
grouped around the inner contours surrounding the MNC nucleus. This is reflected
in the number of pixels in the S3P contour, which is more than double the number
of pixels in the S2P contour. The absence of tight grouping reflects the influence of
the image edge and corner polygons in the Voronoï mesh.
So far, we have considered a tessellated image containing only one maximal nucleus cluster. By varying the number of generating points (either corners, keypoints or some other form of mesh generators), it is possible to vary the number of MNCs
Fig. 8.29 Visualized image geometry via dual MNCs coarse contours
in a tessellated image. The goal is to construct an image mesh that contains either
adjacent or overlapping MNCs, which serve as markers of image objects. Adjacent
MNCs are maximal nucleus clusters in which a polygon in one MNC shares an edge
with a polygon in the other MNC. Overlapping MNCs occur whenever an entire
polygon is common to both MNCs (see, e.g., Figs. 8.28 and 8.29).
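One way to test adjacency is via the vertex lists returned by Matlab's voronoin: two Voronoï regions are adjacent when they share an edge, i.e., at least two vertices. A minimal sketch (the generating points P are hypothetical):

% sketch: testing adjacency of two Voronoi regions
P = rand(30,2)*100;              % hypothetical generating points
[V,C] = voronoin(P);             % V: vertices, C: vertex indices per region
i = 1; j = 2;                    % indices of the regions to test
shared = intersect(C{i}, C{j});  % vertices common to both regions
isAdjacent = numel(shared) >= 2  % a shared edge has two endpoints

Overlapping MNCs can then be detected by checking whether an entire region index belongs to both clusters.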
After we obtain a Voronoï mesh with multiple MNCs for a selected number of keypoints, the MNCs can be either separated (covering different parts of an image) or overlapping. It is then helpful to experiment with either small or very large changes in the number of keypoints in the search for meshes with multiple, overlapping MNCs that are tightly grouped. The ideal situation is to find overlapping MNCs so that the difference in the S1P and S2P contour lengths is small. Let ε be a positive number and let S1Pc, S2Pc be the contour lengths (in pixels). For example, let ε = 500. Then find an MNC so that

|S1Pc − S2Pc| < ε.
Notice that neighbouring (in the sense of close but neither adjacent nor overlapping) MNCs are possible. Neighbouring MNCs are MNCs that are either adjacent, overlapping or separated by at most one polygon.
A mesh with neighbouring MNCs can result in contours that cover a region of
interest in an image. We illustrate this with a small change in the number of keypoints
from Example 8.19, i.e., we select 91 instead of 89 keypoints as generators of a
Voronoï diagram superimposed on an image.
Fig. 8.31 Dual coarse S1P and S2P contours on overlapping MNCs
Fig. 8.32 Dual S1P, S2P and S3P contours on overlapping MNCs
Notice that most of the polygons in the Voronoï mesh covering the image in Fig. 8.23.1 have been suppressed in Fig. 8.23.2. With the selection of 91 keypoints as mesh generators, we obtain dual yellow nuclei in overlapping MNCs (for a closeup view of these dual nuclei, see Fig. 8.33). These overlapping MNC nuclei are important, since they cover a part of the image where neighbouring keypoints are not only close together but where the entropy (in effect, the information level) is highest in this image.
Example 8.23 Keypoint Mesh with S1P, S2P and S3P Maximal Nucleus Cluster Contours.
A combination of S1P, S2P and S3P contour edgelets is shown in Fig. 8.32. Now the S3P contour is displayed as a sequence of connected red line segments •—• using the keypoints as endpoints of each line segment in the S3P. Each S3P line segment is drawn between the keypoints in a pair of adjacent polygons along the border of the S2P polygons. Here is a summary of the lengths of these coarse contours:
S1P contour length 841.8626 pixels.
S2P contour length 1292.1581 pixels.
S3P contour length 2851.7199 pixels.
Unlike the S2P contour, the line segments in the S3P contour are not tightly grouped
around the inner contours surrounding the MNC nucleus. This is reflected in the
number of pixels in the S3P contour, which is more than double the number of pixels
in the S2P contour. The absence of tight grouping reflects the influence of the image
edge and corner polygons in the Voronoï mesh.
In this section, we call attention to the Rényi entropy of maximal nucleus clusters covering image regions with high information levels.
It is known that the entropy of an image MNC is higher than the entropy of surrounding non-MNC regions [153]. It is also known that Rényi entropy corresponds to the information level of a set of data. For each increase in Rényi entropy there is a corresponding increase in the underlying information level in MNC regions of a Voronoï mesh on a digital image. This result concerning the entropy of the tessellation of digital images stems from a recent study by E. A-iyeh and J.F. Peters [2]. In our case, the Rényi entropy of an MNC corresponds to the information level of that part of an image covered by an MNC.
Let p(x1), . . . , p(xi), . . . , p(xn) be the probabilities of a sequence of events x1, . . . , xi, . . . , xn and let β ≥ 1. Then the Rényi entropy [164] Hβ(X) of a set of events X is defined by

$$H_\beta(X) = \frac{1}{1-\beta}\,\ln \sum_{i=1}^{n} p^{\beta}(x_i) \qquad \text{(Rényi entropy)}.$$
Rényi's entropy is based on the work by R.V.L. Hartley [72] and H. Nyquist [129] on the transmission of information. A proof that Hβ(X) approaches Shannon entropy as β −→ 1 is given by P.A. Bromiley, N.A. Thacker and E. Bouhova-Thacker in [19], i.e.,

$$\lim_{\beta \to 1} \frac{1}{1-\beta}\,\ln \sum_{i=1}^{n} p^{\beta}(x_i) = -\sum_{i=1}^{n} p_i \ln p_i.$$
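A short Matlab sketch of this computation (not one of the book's scripts; 'cameraman.tif' is a stock Image Processing Toolbox image standing in for an MNC region) shows Hβ(X) settling toward the Shannon entropy as β approaches 1:

% sketch: Renyi entropy of an image intensity distribution for several beta
g = imread('cameraman.tif');             % any greyscale image (or MNC region)
p = imhist(g); p = p/sum(p);             % probabilities of the intensities
p = p(p > 0);                            % drop empty histogram bins
shannon = -sum(p.*log(p));               % Shannon entropy (natural log)
for beta = [1.1 1.5 2.0 2.5]
    H = (1/(1-beta))*log(sum(p.^beta));  % Renyi entropy
    fprintf('beta = %.1f: H = %.4f (Shannon: %.4f)\n', beta, H, shannon);
end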
Example 8.25 MNC versus non-MNC Entropy on the Video Frame Tessellation with 145 Keypoints.
Dual MNCs in a Voronoï mesh with 145 keypoints are shown in Fig. 8.37. 3D plots showing the distribution of Rényi entropy values with varying β are shown in Fig. 8.38 for the MNC and non-MNC mesh regions. A comparison of the Rényi entropy values for the MNC and non-MNC regions is given in the plot in Fig. 8.39. Observe that the Rényi entropy values of the MNC regions increase monotonically and are greater than the entropy of the non-MNC regions. This implies that the information content around the front of the train engines covered by the MNCs is higher than in the surrounding image regions in this particular Voronoï mesh.
Fig. 8.36 Combined MNC and non-MNC entropy plot for 376-keypoint-based mesh
Fig. 8.37 Dual MNCs in Voronoï mesh generated by 145 keypoints on video frame
Fig. 8.38 3D Plots for MNC and Non-MNC entropy for a 145 keypoint-based video frame
The correspondence between the Rényi entropy of mesh cells and the quality of the cells varies for different classes of images.
For example, with Voronoï tessellations of images of humans, Rényi entropy tends
to be higher for higher quality mesh cells (see, e.g., the plot in Fig. 8.40 for different
Rényi entropy levels, ranging from β = 1.5 to 2.5 in 0.5 increments).
8.16 Problems
Problem 8.26
Let Img be a Voronoï tessellated image using SURF keypoints. Do the following:
1. Select k keypoints, starting with 10 SURF points.
2. Find the maximal nucleus clusters (MNCs) on Img.
3. Draw the fine IP edgelet geometry (by itself, not on an image). Use blue for the IP line segments. See, for example, the IP edgelet geometry in Fig. 8.21.1.
4. Draw the coarse S1P edgelet geometry (by itself, not on an image). Use blue for the S1P line segments. See, for example, the S1P edgelet geometry in Fig. 8.22.
5. Draw the fine IP contour surrounding the MNC nucleus on an image. Use blue for the IP line segments.
6. Draw the coarse S1P contour surrounding the MNC nucleus on an image. Use green for the S1P line segments.
7. Draw the coarse S2P contour surrounding the MNC nucleus on an image. Use white for the S2P line segments.
8. Choose a positive number ε and let S1Pc, S2Pc be the lengths (in pixels) of the level 1 and level 2 MNC contours, respectively. Adjust ε so that |S1Pc − S2Pc| < ε.
9. Repeat Step 1 for k = 13, 21, 34, 55, 89, 144, 233, 610 keypoints, until two overlapping or adjacent MNCs are found on Img.
Shapes are elusive creatures that drift in and out of natural scenes that we sometimes
perceive, store in memory and record with digital cameras. In a sequence of video
frame images, for example, shapes such as the ones shown in Fig. 9.1 sometimes
deform into other shapes. In Fig. 9.1, there is a sequence of deformations (represented
by −→) like in Fig. 9.2.
edgelets surrounding each MNC nucleus. These are now the familiar collections of connected straight edges. In a fine contour, each straight edge is drawn between generating points along the border of an MNC nucleus polygon.
We have seen numerous examples of fine contours. Taking this image geometry a step further (moving outward along the border of a fine contour), we can identify a coarse contour surrounding each fine contour. In a coarse contour, each straight edge is drawn between generating points along the border of a fine contour.
Image topology supplies us with structures useful in the analysis and classification
of image regions. The main structure in an image topology is an open set. Basically,
an open set is a set of elements that does not include the elements on its boundary.
In these foundations, open sets first appeared in Sect. 1.2. For more about open sets,
see Appendix B.14. Here is another example.
Image topologies are defined on image open sets. An image topology is a collection of open sets τ on an image open set X with the following properties.
1o The empty set ∅ is open and ∅ is in τ.
2o The set X is open and X is in τ.
3o If A is a sub-collection of open sets in τ, then

$$\bigcup_{B \in A} B \ \text{is an open set in } \tau.$$
Example 9.2
A sample portrayal of shapes in a sequence of video frames is shown in Fig. 9.1. Over time, the ring torus in the first of the video frames in Fig. 9.1 breaks open and stretches out, eventually assuming a tubular shape. The deformation of one shape into another shape is a common occurrence in the natural world.
In an extreme case such as the one in Fig. 9.4, some form of worldsheet rolls up (over time), forming a ring torus. In topological terms, there is a continuous mapping from a planar worldsheet wshM in R^2 to a ring torus f(wshM) in R^3. A worldsheet D (denoted by wshD) is a collection of strings that cover a patch in a natural scene. A
string is either a wiggly or straight line segment. In string theory, a string is defined
by the path followed by a particle moving through space. Another name for such
a string is worldline [130–132]. The idea of a string works well in explaining the
sequences of shapes in video frames in which the paths followed by photons have been recorded by a video camera.
This mapping from a worldsheet to a torus is represented in Fig. 9.4. A ring torus is a tubular surface in the shape of a doughnut, obtained by rotating a circle of radius r (called the tube radius) about an axis in the plane of the circle at distance c from the torus center. Worldsheet wshM maps to (rolls up into) the tubular surface of a ring torus in 3-space, i.e., there is a continuous mapping from wshM in 2-space to the ring torus surface f(wshM) in 3-space.
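The standard parametrization of a ring torus with tube radius r and centre distance c makes this rolling-up concrete; a minimal Matlab sketch (the values of c and r are illustrative):

% sketch: ring torus with tube radius r about an axis at distance c
c = 3; r = 1;                             % centre distance and tube radius
[theta, phi] = meshgrid(linspace(0, 2*pi, 40));
x = (c + r*cos(theta)).*cos(phi);
y = (c + r*cos(theta)).*sin(phi);
z = r*sin(theta);
figure, surf(x, y, z), axis equal, title('ring torus');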
This section briefly covers part of the ground for shape estimation. The basic idea is twofold. First, we need some means of measuring the shape of an image object. Second, we need to decide when one shape is approximately the same as another shape. For simplicity, we consider only 2D shapes here.
In the plane, shapes are known by their perimeters and areas. The focus here is on perimeters that are collections of connected straight edges. Recall that edges e, e′ are connected, provided there is a path between e and e′. A perimeter that is composed of connected straight edges is an edgelet.
Algorithm 10: Comparing Image Region Shape Perimeters that are Edgelets
Input: Read digital image regions T, R.
Output: shapeSimilarity (shape perimeters similarity measurement).
1 /* edgeletT equals a shape perimeter in a target image region T */
2 edgeletT ← connectedTargetEdges ⊂ T;
3 /* edgeletR equals a shape perimeter in a sample image region R */
4 edgeletR ← connectedRegionEdges ⊂ R;
5 /* ε = upper bound on similarity between shape edgelets */
6 ε ← small positive real number;
7 /* Compare shape perimeters: */
8
$$shapeSimilarity(edgeletT, edgeletR) = \begin{cases} 1, & \text{if } |edgeletT - edgeletR| < \varepsilon, \\ 0, & \text{otherwise.} \end{cases}$$
/* One shape edgelet approximates another one, provided shapeSimilarity = 1 */
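A minimal Matlab rendering of Algorithm 10 (a sketch, assuming that each edgelet is represented by its perimeter length in pixels, as in the contour-length summaries above; the function name follows the algorithm):

% sketch: shape perimeter comparison of Algorithm 10
function sim = shapeSimilarity(edgeletT, edgeletR, epsilon)
    % edgeletT, edgeletR: perimeter lengths of target and sample edgelets
    if abs(edgeletT - edgeletR) < epsilon
        sim = 1;    % sample shape approximates target shape
    else
        sim = 0;
    end
end

For instance, with the S1P and S2P lengths of Example 8.19 and ε = 500, shapeSimilarity(943.2667, 1384.977, 500) returns 1, while comparing the S1P length with the S3P length returns 0.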
Fig. 9.5 Target drone video frame region (9.5.1) and sample region in a drone video frame (9.5.2)
Since image regions are known by their shape perimeters, it is possible to compare
the shape perimeter that encloses an image region containing a target object with the
shape perimeter of an image region containing an unknown object. Notice that, after
tessellating an image and identifying the maximal nucleus clusters (MNCs) in the
tessellated image, each MNC contour surrounding an MNC nucleus polygon is a
shape perimeter.
Example 9.3 Sample Pair of Traffic Drone Video Frame Shape Perimeters.
A sample pair of drone traffic video frames is shown in Fig. 9.5. To obtain a shape perimeter from each of these video frames, we do the following:
1o Select video frame images img1, img2.
2o Select a set of mesh generating points S.
3o Select a video frame image img ∈ {img1, img2}.
4o Superimpose on img a Voronoï diagram V(S), i.e., tessellate img, covering img with Voronoï regions V(s), using each generating point (site, seed point) s ∈ S.
5o Identify an MNC in the image diagram V(S) (call it MNC(s)).
6o Identify a coarse edgelet contour MNCedgelet in img (a target MNC shape perimeter in a video frame).
7o Repeat Step 3, after obtaining a target MNC shape perimeter in img (call it MNCedgeletT), to obtain a sample video frame image MNC coarse edgelet contour MNCedgeletR (a sample MNC shape perimeter in a video frame). The result of this step is the production of a pair of MNC shape perimeters (MNCedgeletT and MNCedgeletR) embedded in a pair of video frame images. An embedded target shape perimeter MNCedgeletT is shown in Fig. 9.6.1 and an embedded sample region shape perimeter MNCedgeletR is shown in Fig. 9.6.2.
8o Next extract a pair of pure plane shape perimeters from the embedded MNC
perimeters. Note: This is done to call attention to the edgelets whose lengths
we want to measure and compare.
11o Repeat Step 9, after obtaining the first MNC shape perimeter (call it edgeletT), to obtain a sample MNC shape perimeter edgeletR (a sample MNC shape perimeter in a video frame). The result of this step is the production of a pair of pure plane shape perimeters (edgeletT and edgeletR) extracted from the contours of MNCs in Voronoï-tessellated video frame images. A target shape perimeter edgeletT is shown in Fig. 9.7.1 and a sample region shape perimeter edgeletR is shown in Fig. 9.7.2.
12o Use edgeletT and edgeletR as inputs in Algorithm 10 (compute the similarity between the target and sample MNC shape perimeters).
13o Compute the value of shapeSimilarity(edgeletT, edgeletR).
This Appendix contains Matlab® and Mathematica® scripts referenced in the chapters. Matlab® R2013b is used to write the Matlab scripts.
% script: GeneratingPointsOnImage.m
% image geometry: image corners
% part 1: image corners + Voronoi diagram on image
% part 2: plot image corners + Voronoi diagram by themselves
%
clear all; close all; clc;                       % housekeeping
%%
img = imread('carRedSalerno.jpg');
g = double(rgb2gray(img));                       % convert to greyscale image
%
% part 1:
%
cornersMin = corner(g);                          % min. no. of corners
% identify image boundary corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
% concatenate image boundary corners & set of interior image corners
cornersMin = cat(1, cornersMin, box_corners);
% set up display of cornersMin on rgb image
figure, imshow(img), ...
hold on, axis on, axis tight,                    % set up corners display on rgb image
plot(cornersMin(:,1), cornersMin(:,2), 'g*');
% set up cornersMin-based Voronoi diagram on rgb image
redCarMesh = figure, imshow(img), ...
hold on, axis on, axis tight,
voronoi(cornersMin(:,1), cornersMin(:,2), 'gx'); % green edges
% uncomment next line to save Voronoi diagram:
% saveas(redCarMesh, 'imageMesh.png');           % save copy of image
%
% part 2:
%
corners = corner(g, 1000);                       % up to 1000 corners
% concatenate image boundary corners & set of interior image corners
corners = cat(1, corners, box_corners);
% plot specified no. of corners:
figure, imshow(g), ...
hold on, axis on, axis tight,                    % set up corners plot
plot(corners(:,1), corners(:,2), 'b*');
% construct corner-based Voronoi diagram
planarMesh = figure
voronoi(corners(:,1), corners(:,2), 'bx');       % blue edges
% uncomment next line to save Voronoi diagram:
% saveas(planarMesh, 'planarMesh.png');          % save copy of image
The corner-based Voronoï mesh superimposed on the image is shown in Fig. A.1.2, produced using Matlab® script A.1. A plot of 1000 image corners plus image boundary corners is given in Fig. A.2.1 and a plot of the corner-based Voronoï mesh is shown in Fig. A.2.2, also using Matlab script A.1. For more about this, see Sect. 1.22.
Fig. A.3 Image interior corners plus image boundary corners-based Voronoï mesh
% script: VoronoiMeshOnImage.m
% image geometry: overlay Voronoi mesh on image
%
% see http://homepages.ulb.ac.be/~dgonze/INFO/matlab.html
% revised 23 Oct. 2016
clear all; close all; clc;                       % housekeeping
g = imread('fisherman.jpg');
% im = imread('cycle.jpg');
% g = imread('carRedSalerno.jpg');
%%
img = g;                 % save copy of colour image to make overlay possible
g = double(rgb2gray(g)); % convert to greyscale image
% corners = corner(g);   % min. no. of corners
k = 233;                 % select k corners
corners = corner(g, k);  % up to k corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
% script: VoronoiMesh1000CarPolygons.m
% image geometry: Voronoi mesh image polygons
% Interior + boundary corners-based Voronoi mesh plot:
% Notice the corner clusters.
%
% Part 1: plot of default interior + boundary corners
% Part 2: plot of up to 2000 interior + boundary corners
%
clear all; close all; clc;                       % housekeeping
%%
g = imread('carRedSalerno.jpg');
% g = imread('peppers.png');
g = double(rgb2gray(g));                         % convert to greyscale image
%
% Part 1
%
Fig. A.5 Sample 50 corner triangulation with Voronoi mesh overlay on image
% script: DelaunayOnImage.m
% image geometry: Delaunay triangles on image
%
% Part 1: default interior + boundary corners-based triangulation
% Part 2: up to 2000 interior + boundary corners-based triangulation
%
clear all; close all; clc;                       % housekeeping
%%
g = imread('carRedSalerno.jpg');
% g = imread('8x8grid.jpg');
% g = imread('Fox-2states.jpg');
img = g;                                         % save copy of colour image
g = double(rgb2gray(g));                         % convert to greyscale image
%
% Part 1
%
corners = corner(g, 50);                         % default image corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
corners = cat(1, corners, box_corners);          % combined corners
figure, imshow(img), hold on;                    % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI = delaunay(corners(:,1), corners(:,2));      % identify triangles
triplot(TRI, corners(:,1), corners(:,2), 'b');   % meshes on image
%
% corner Delaunay triangulation with Voronoi mesh overlay:
%
figure, imshow(img), hold on;                    % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI = delaunay(corners(:,1), corners(:,2));      % identify triangles
triplot(TRI, corners(:,1), corners(:,2), 'b');   % meshes on image
voronoi(corners(:,1), corners(:,2), 'y');        % identify polygons
%
% Part 2
%
corners1000 = corner(g, 2000);                   % find up to 2000 image corners
corners1000 = cat(1, corners1000, box_corners);  % combined corners
figure, imshow(img), hold on;                    % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI = delaunay(corners1000(:,1), corners1000(:,2));    % identify triangles
triplot(TRI, corners1000(:,1), corners1000(:,2), 'b'); % meshes on image
Fig. A.6 Sample 2000 corner triangulation with Voronoi mesh overlay on image
%
% corner Delaunay triangulation with Voronoi mesh overlay:
%
figure, imshow(img), hold on;                    % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI = delaunay(corners1000(:,1), corners1000(:,2));    % identify triangles
triplot(TRI, corners1000(:,1), corners1000(:,2), 'b'); % meshes on image
voronoi(corners1000(:,1), corners1000(:,2), 'y');      % identify polygons
% imfinfo('carRedSalerno.jpg')
% script: DelaunayCornerTriangles.m
% image geometry: Delaunay triangles from image corners
% plus Delaunay triangulation with Voronoi mesh overlay
%
clear all; close all; clc;                       % housekeeping
%%
g = imread('carRedSalerno.jpg');
% g = imread('Fox-2states.jpg');
img = g;                                         % save copy of colour image
g = double(rgb2gray(g));                         % convert to greyscale image
%
% Part 1
%
corners = corner(g, 50);                         % default image corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
corners = cat(1, corners, box_corners);          % combined corners
figure, imshow(g), hold on;                      % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI = delaunay(corners(:,1), corners(:,2));      % identify triangles
triplot(TRI, corners(:,1), corners(:,2), 'b');   % meshes on image
%
% 50 corner Delaunay triangulation with Voronoi mesh overlay:
%
figure, imshow(g), hold on;                      % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI = delaunay(corners(:,1), corners(:,2));      % identify triangles
triplot(TRI, corners(:,1), corners(:,2), 'b');   % meshes on image
voronoi(corners(:,1), corners(:,2), 'r');        % identify polygons
%
% Part 2
%
corners2000 = corner(g, 2000);                   % find up to 2000 image corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
corners2000 = cat(1, corners2000, box_corners);  % combined corners
figure, imshow(g), hold on;                      % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI2000 = delaunay(corners2000(:,1), corners2000(:,2));    % identify triangles
triplot(TRI2000, corners2000(:,1), corners2000(:,2), 'b'); % meshes on image
%
% 2000-corner Delaunay triangulation with Voronoi mesh overlay:
%
figure, imshow(g), hold on;                      % set up overlay of mesh on image
% voronoi(corners(:,1), corners(:,2), 'x');      % identify polygons
TRI2000 = delaunay(corners2000(:,1), corners2000(:,2));    % identify triangles
triplot(TRI2000, corners2000(:,1), corners2000(:,2), 'b'); % meshes on image
voronoi(corners2000(:,1), corners2000(:,2), 'r');          % identify polygons
% imfinfo('carRedSalerno.jpg')
% script: DelaunayVoronoiOnImage.m
% image geometry: Delaunay triangles on Voronoi mesh on image
%
clear all; close all; clc;                       % housekeeping
%%
% Experiment with Delaunay triangulation Voronoi mesh overlays:
g = imread('cycle.jpg');
% g = imread('carRedSalerno.jpg');
img = g;                                         % save copy of colour image
g = double(rgb2gray(g));                         % convert to greyscale image
corners = corner(g, 50);                         % find up to 50 image corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
corners = cat(1, corners, box_corners);          % combined corners
figure, imshow(img), hold on;                    % set up overlay of mesh on image
voronoi(corners(:,1), corners(:,2), 'y');        % identify polygons
TRI = delaunay(corners(:,1), corners(:,2));      % identify triangles
triplot(TRI, corners(:,1), corners(:,2), 'b');   % meshes on image
% imfinfo('cycle.jpg')
% imfinfo('carRedSalerno.jpg')
The Delaunay triangulation combined with the Voronoï mesh, each derived from 50 image corners plus image boundary corners extracted from a colour image, is shown in Fig. A.9, produced using Matlab script A.6. For more about this, see Sect. 1.22.
Fig. A.9 Combination of 50 corner Delaunay triangulation plus Voronoï mesh overlay on an image
% script: DelaunayOnVoronoi.m
% image geometry: Delaunay triangles on Voronoi mesh polygons
%
clear all; close all; clc;                       % housekeeping
%%
g = imread('fisherman.jpg');                     % input colour image
% g = imread('carRedSalerno.jpg');               % input colour image
g = double(rgb2gray(g));                         % convert to greyscale image
corners = corner(g, 50);                         % find up to 50 image corners
box_corners = [1,1; 1,size(g,1); size(g,2),1; size(g,2),size(g,1)];
corners = cat(1, corners, box_corners);          % box + inner corners
figure, imshow(g), hold on;                      % set up combined meshes
voronoi(corners(:,1), corners(:,2), 'x');        % Voronoi mesh
TRI = delaunay(corners(:,1), corners(:,2));      % Delaunay mesh
triplot(TRI, corners(:,1), corners(:,2), 'r');   % combined meshes
% imfinfo('fisherman.jpg')
% imfinfo('carRedSalerno.jpg')
Fig. A.10 Corner-based Delaunay triangulation plus Voronoï mesh overlays on an image
The corner-based Delaunay triangulation plus Voronoï mesh overlays are shown in Fig. A.10, produced using Matlab script A.7. For more about this, see Sect. 1.22.
% script: offlineVoronoi.m
% OFFLINE VIDEO VORONOI AND DELAUNAY MESH (CORNERS)
% Offline corner-based Voronoi tessellation of video frames
% Example by D. Villar from August 2015 experiment
% Revised version: 15 Dec. 2015, 7 Nov. 2016.
%
close all, clear all, clc                        % workspace housekeeping
%%
% Initialize input and output videos
videoReader = vision.VideoFileReader('moving_hand.mp4');
videoWriter = vision.VideoFileWriter('offlineVoronoiResult1.avi', ...
    'FileFormat', 'AVI', ...
    'FrameRate', videoReader.info.VideoFrameRate);
videoWriter2 = vision.VideoFileWriter('offlineDelaunayResult1.avi', ...
    'FileFormat', 'AVI', ...
    'FrameRate', videoReader.info.VideoFrameRate);
% Capture one frame to get its size.
videoFrame = step(videoReader);
frameSize = size(videoFrame);
runLoop = true;
frameCount = 0;
% 100 frame video
while runLoop && frameCount < 100
    % Get the next frame and corners
    videoFrame = imresize(step(videoReader), 0.5);
    frameCount = frameCount + 1;
    videoFrameGray = rgb2gray(videoFrame);
    videoFrameGray = medfilt2(videoFrameGray, [5 5]);
    C = corner(videoFrameGray, 300);             % get up to 300 frame corners
    [a, b] = size(C);
    % Capture Voronoi tessellation of video frame
    if a > 2
        [VX, VY] = voronoi(C(:,1), C(:,2));
        % Creating matrix of line segments in the form
        % [x_11 y_11 x_12 y_12 ... x_n1 y_n1 x_n2 y_n2]
        A = [VX(1,:); VY(1,:); VX(2,:); VY(2,:)];
        A(A > 5000) = 5000; A(A < -5000) = -5000;
        A = A';
        % Display Voronoi tessellation of video frame
        videoFrame2 = insertMarker(videoFrame, C, '+', ...
            'Color', 'red');
        % note: insertShape applied to videoFrame2 so the corner markers are kept
        videoFrame2 = insertShape(videoFrame2, 'Line', A, 'Color', 'red');
        % Display the annotated video frame using the video player object.
        step(videoWriter, videoFrame2);
    else
        step(videoWriter, videoFrame);
    end
end
% Clean up: video housekeeping
release(videoWriter);
disp('offlineVoronoiResult1.mp4 has been produced.')
release(videoWriter2);
disp('offlineDelaunayResult1.mp4 has been produced.')
% script: realTime1.m
% Real-time Voronoi mesh:
% corner-based tessellation of video frames.
% See lines 32-33.
% Example from D. Villar, July 2015 Computer Vision Experiment.
% Revised 7 Nov. 2016
%
close all, clear all, clc                        % housekeeping
%%
% Create the webcam object.
cam = webcam(2);
% Create the video player and output video writer.
% (These two objects are missing from the extracted listing but are
% required by the step calls below; the output file name is a stand-in.)
videoPlayer = vision.VideoPlayer();
videoWriter = vision.VideoFileWriter('realTimeVoronoi.avi', 'FileFormat', 'AVI');
% Capture one frame to get its size.
videoFrame = snapshot(cam);
frameSize = size(videoFrame);
runLoop = true;
frameCount = 0;
% 100 frame video
while runLoop && frameCount < 100
    % Get the next frame.
    videoFrame = snapshot(cam);
    frameCount = frameCount + 1;
    videoFrameGray = rgb2gray(videoFrame);
    % Voronoi using corners
    C = corner(videoFrameGray, 100);
    [VX, VY] = voronoi(C(:,1), C(:,2));
    % Display the annotated video frame using the video player object.
    step(videoPlayer, videoFrame);
    step(videoWriter, videoFrame);
end
% Clean up (video camera housekeeping)
clear cam;
release(videoWriter);
release(videoPlayer);
Listing A.9 Matlab script in realTime1.m to construct Voronoï tessellation of video frames in
real-time.
% script: inspectPixels.m
% Use cpselect(g,h) to inspect pixels in a raster image
% comment: CTRL<r>, uncomment: CTRL<t>
% Each pixel is represented by a tiny square
clc, close all, clear all                        % housekeeping
%%
% input a pair of images.
% Choices:
% 1. Input two copies of the same image
% 2. Input two different images.
% Examples:
%
% choice 1:
% g = imread('camera.jpg'); h = imread('camera.jpg');
% g = imread('peppers.png'); h = imread('peppers.png');
% choice 2:
g = imread('naturalTessellation.jpg'); h = imread('imgGrey.jpg');
% use cpselect tool
cpselect(g, h)
% script: pixelChannels.m
% Display color image channel values
% Script idea from:
% http://www.mathworks.com/matlabcentral/profile/authors/1220757-sixwwwwww
clc, clear all, close all
img = imread('carCycle.jpg');                    % Read image
% img = imread('carPoste.jpg');                  % Read image
red = img(:,:,1);                                % Red channel
green = img(:,:,2);                              % Green channel
blue = img(:,:,3);                               % Blue channel
rows = size(img, 1); columns = size(img, 2);
rc = zeros(rows, columns);
justR = cat(3, red, rc, rc);
justG = cat(3, rc, green, rc);
justB = cat(3, rc, rc, blue);
% script: rgb2grey.m
% Colour to greyscale conversion.
clc, clear all, close all
%%
img = imread('naturalTessellation.jpg');
% figure, imshow(img), axis on;
imgGrey = rgb2gray(img);
imwrite(imgGrey, 'imgGrey.jpg');
figure,
subplot(1,2,1), plot(img(1,:)), ...              % row 1 colour intensities
    axis square; title('row 1 colour values');
subplot(1,2,2), plot(imgGrey(1,:)), ...          % row 1 greyscale intensities
    axis square; title('row 1 greyscale values');
figure,
subplot(1,2,1), imshow(img), ...                 % display colour image
    axis on; title('original image');
subplot(1,2,2), imshow(imgGrey), ...             % display greyscale image
    axis on; title('greyscale image');
% script: pixelCycle.m
% Sample pixel value changes I.
clc, clear all, close all
%%
g = imread('leaf.jpg');
% g = imread('carCycle.jpg');
figure, imshow(g), axis on;
figure,
i1 = g + g;                                      % add image pixel values
subplot(3,4,1), imshow(i1), ...
    axis off; title('g + g');                    % display sum
i2 = (g + g).*0.5;                               % average pixel values
subplot(3,4,2), imshow(i2), ...
    axis off; title('(g + g).*0.5');             % display average
i3 = (g + g).*0.3;                               % 1/3 pixel values
subplot(3,4,3), imshow(i3), ...
    axis off; title('(g + g).*0.3');             % display reduced values
i4 = ((g./2).*g).*2;                             % doubled pixel value products
subplot(3,4,4), imshow(i4), ...
    axis off; title('((g./2).*g).*2');           % display doubled values
% Sample pixel value changes II.
clc, clear all, close all
h = imread('naturalTessellation.jpg');
figure, imshow(h), axis on;
i5 = h + 30;                                     % pixel values + 30
figure,
subplot(3,4,5), imshow(i5), ...
    axis off; title('h + 30');                   % display augmented image pixels
i6 = imsubtract(h, 0.2.*h);                      % pixel value differences
subplot(3,4,6), imshow(i6), ...
    axis off; title('h-0.2.*h');                 % display pixel differences
i7 = imabsdiff(h, ((h + h).*0.5));               % absolute value of differences
subplot(3,4,7), imshow(i7), ...
    axis off; title('|h-((h + h).*0.5)|');       % display abs of differences
i8 = imadd(h, ((h + h).*0.5)).*2;                % summed pixel values doubled
subplot(3,4,8), imshow(i8), ...
    axis off; title('h+((h + h).*0.5)).*2');     % display doubled sums
Fig. A.23 Yet another colour image −→ pixel intensity changes
% script: pixelR.m
% Sample pixel value changes III.
clc, clear all, close all
%%
img = imread('leaf.jpg');
% img = imread('CVLab-3.jpg');
figure, imshow(img), axis on;
% set up dummy image
rows = size(img, 1); columns = size(img, 2);
a = zeros(rows, columns);
% fill dummy image with new red brightness values
figure,
i9 = cat(3, (0.8).*img(:,:,1), a, a);            % changed red intensities
subplot(3,4,9), imshow(i9), ...
    axis off; title('i9 (0.8).*red');            % display modified red intensities
% fill dummy image with new green brightness values
i10 = cat(3, a, (0.9).*img(:,:,2), a);           % changed green intensities
subplot(3,4,10), imshow(i10), ...
    axis off; title('i10 (0.9).*green');         % display new green intensities
% fill dummy image with new green brightness values
i11 = cat(3, a, (0.5).*img(:,:,2), a);           % changed green intensities
subplot(3,4,11), imshow(i11), ...
    axis off; title('i11 (0.5).*green');         % display new green intensities
i12 = cat(3, a, a, (16.5).*img(:,:,3));          % changed blue intensities
subplot(3,4,12), imshow(i12), ...
    axis off; title('i12 (16.5).*blue');         % display new blue intensities
Fig. A.24 Yet another colour image −→ pixel intensity changes
% script: thaiR.m
% constructing new images from old images
% Sample pixel value changes IV.
clc, clear all, close all
%%
% What's happening?
% g = imread('rainbow.jpg'); h = imread('gems.jpg');
g = imread('P9.jpg'); h = imread('P7.jpg');
i1 = g + h;                                      % add image pixel values
subplot(2,4,1), imshow(i1); title('g + h');      % display sum
i2 = (g + h).*0.5;                               % average pixel values
subplot(2,4,2), imshow(i2); title('(g+h).*0.5'); % display average
i3 = (g + h).*0.3;                               % 1/3 pixel values
subplot(2,4,3), imshow(i3); title('(g+h).*0.3'); % display reduced values
i4 = (g + h).*2;                                 % doubled pixel value sums
subplot(2,4,4), imshow(i4); title('(g+h).*2');   % display doubled values
i5 = g + 30;                                     % pixel value + 30
subplot(2,4,5), imshow(i5); title('g + 30');     % display augmented image pixels
i6 = imsubtract(h, i3);                          % pixel value differences
subplot(2,4,6), imshow(i6); title('(h-i3)');     % display pixel differences
i7 = imabsdiff(h, ((g + h).*0.5));               % absolute value of differences
subplot(2,4,7), imshow(i7); title('(h-((g+h).*0.5))');   % display abs of differences
i8 = imadd(h, ((g + h).*0.5)).*2;                % summed pixel values doubled
subplot(2,4,8), imshow(i8); title('(h+((g+h).*0.5))');   % display doubled sums
% script: maxImage.m
% Modifying colour channel pixel values using a max intensity
clc, clear all, close all                        % housekeeping
%%
g = imread('camera.jpg');                        % read colour image
[r, c] = max(g(1,:,1));                          % g(r,c) = max red intensity in row 1
h = g(:,:,1) + (0.1).*g(r,c);                    % add (0.1) max red value to all pixel values
h2 = g(:,:,1) + (0.3).*g(r,c);                   % add (0.3) max red from all pixel values
h3 = g(:,:,1) + (0.6).*g(r,c);                   % add (0.6) max red from all pixels
rows = size(g, 1); columns = size(g, 2);
a = zeros(rows, columns);                        % black image
captureR1 = cat(3, h, a, a);                     % red channel image
captureR2 = cat(3, h2, a, a);                    % red channel image
captureR3 = cat(3, h3, a, a);                    % red channel image
figure,   % internal view of a red channel is a greyscale image
subplot(1,3,1), imshow(h), title('g(:,:,1)+(0.1).*g(r,c)');
subplot(1,3,2), imshow(h2), title('g(:,:,1)+(0.3).*g(r,c)');
subplot(1,3,3), imshow(h3), title('g(:,:,1)+(0.6).*g(r,c)');
figure,   % external view of a red channel is a colour image
subplot(1,3,1), imshow(captureR1), title('red channel captureR1');
subplot(1,3,2), imshow(captureR2), title('red channel captureR2');
subplot(1,3,3), imshow(captureR3), title('red channel captureR3');
Listing A.17 Find max red intensity in row 1 in an image, using maxImage.m
% script: imageEdgesOnColorChannel.m
% Edge Colour Channel pixels mapped to new intensities
clc, clear all, close all
%%
img = imread('trains.jpg');
% img = imread('carCycle.jpg');
figure, imshow(img), ...
    axis square, axis on, title('colour image display');
gR = img(:,:,1); gG = img(:,:,2); gB = img(:,:,3);
imgRGB = edge(rgb2gray(img), 'canny');           % greyscale edges in B/W
Fig. A.27 Sample Canny train edges in binary and in colour, using Script A.18
Fig. A.28 Canny edges for each colour image channel, using Script A.18
Fig. A.29 Canny edges for combined red and blue colour channels, using Script A.18
Fig. A.30 Sample Canny train edges in binary and in colour using Script A.18
Fig. A.31 Canny edges for each colour image channel using Script A.18
% script: cameraPixelsModified.m
% Changing Colour Channel Values.
% Method: scaled log of channel intensities
%
clc, clear all, close all
%%
img = imread('CNtrain.jpg');                     % Read image
% img = imread('carCycle.jpg');
% script: invert.m
% Greyscale image complement and Logical Not of Binary image
clc, clear all, close all                        % housekeeping
%%
g = imread('cameraman.tif');                     % read greyscale image
gbinary = im2bw(g);                              % convert to binary image
gnot = not(gbinary);                             % not of bw intensities
% gbinaryComplement = imcomplement(gbinary);
% gbinaryComplement = imcomplement(gnot);
gbinaryComplement = imcomplement(g);
figure,
subplot(1,3,1), imshow(g), ...
    axis square, axis on, title('greyscale image');
h = imcomplement(g);                             % invert image (complement)
subplot(1,3,2), imshow(h), ...
    axis square, axis on, title('image complement');
[r, c] = max(g);                                 % max intensity location
h2 = g + g(r, c);                                % max-increased intensities
subplot(1,3,3), imshow(h2), ...
    axis square, axis on, title('add max intensity');
figure,
subplot(1,3,1), imshow(gbinary), ...
    axis square, axis on, title('binary image');
subplot(1,3,2), imshow(gnot), ...
    axis square, axis on, title('not of image');
subplot(1,3,3), imshow(gbinaryComplement), ...
    axis square, axis on, title('image complement');
% script: histogramBins.m
% Histogram and stem plot experiment
%
clc, clear all, close all                        % housekeeping
%%
% This section for colour images
I = imread('trains.jpg');                        % sample RGB image
% I = imread('CNtrain.jpg');
% I = imread('fishermanHead.jpg');
% I = imread('fisherman.jpg');
% I = imread('football.jpg');
I = rgb2gray(I);
%
% This section for intensity images
% I = imread('pout.tif');
%
% Construct histogram:
%
h = imhist(I);
[counts, x] = imhist(I);
for j = 1:size(x)
    [j, counts(j)]
end
% counts
size(counts)
subplot(1,3,1), imshow(I);
subplot(1,3,2), imhist(I),
    grid on,
    ylabel('pixel count');
subplot(1,3,3), stem(x, counts),
    grid on
% script: imageMesh.m
% image geometry: visualizing rgb pixel intensity distribution
%
clear all; close all; clc;                       % housekeeping
%%
img = imread('trains.jpg');                      % sample RGB image
% img = imread('carPolizia.jpg');
figure, imshow(img), ...
    axis on, grid on, xlabel('x'), ylabel('y');
% img = imcrop(img);
% [r, c] = size(img);                            % determine cropped image size
% r, c
figure, imshow(img(300:360, 300:380)), ...
    axis on, grid on, xlabel('x'), ylabel('y');
% convert to 64 bit (double precision) format
% surf & surfc need double precision: 64 bit pixel values
img = double(double(img));
% Cr = gradient(img(:,:,1));
% Cg = gradient(img(:,:,2));
% Cb = gradient(img(:,:,3));
% colour channel gradients of manually cropped image:
Cr = gradient(img(300:360, 300:380, 1));
Cg = gradient(img(300:360, 300:380, 2));
Cb = gradient(img(300:360, 300:380, 3));
figure;
% vm3D = surf(img(:,:));
vm3D = surf(img(300:360, 300:380));
axis tight, zlabel('rgb pixel intensities'),
xlabel('x:gradient(img(:,:)'), ylabel('y:gradient(img(:,:)');      % label axes
saveas(vm3D, '3DcontourMesh.png');               % save copy of image
vm3Dred = figure,
% surfc(img(:,:,1), Cr),
surfc(img(300:360, 300:380, 1), Cr),
axis tight, zlabel('red channel pixel intensities'), ...
xlabel('x:gradient(img(:,:,1)'), ylabel('y:gradient(img(:,:,1)');  % label axes
vm3Dgreen = figure,
% surfc(img(:,:,2), Cg),
surfc(img(300:360, 300:380, 2), Cg),
axis tight, zlabel('green channel pixel intensities'), ...
xlabel('x:gradient(img(:,:,2)'), ylabel('y:(img(:,:,2)');          % label axes
vm3Dblue = figure,
% surfc(img(:,:,3), Cb),
surfc(img(300:360, 300:380, 3), Cb),
axis tight, zlabel('blue channel pixel intensities'), ...
xlabel('x:gradient(img(:,:,3)'), ylabel('y:gradient(img(:,:,3)');  % label axes
saveas(vm3D, '3DcontourMesh.png');               % save copy of image
saveas(vm3Dred, '3DcontourMeshRed.png');         % save copy of red channel contour mesh
saveas(vm3Dgreen, '3DcontourMeshGreen.png');     % save copy of green channel contour mesh
saveas(vm3Dblue, '3DcontourMeshBlue.png');       % save copy of blue channel contour mesh
% access and display (in the workspace) manually cropped image:
% rgb340341 = img(340, 341),
% rgb340342 = img(340, 342),
% rgb340343 = img(340, 343),
% red = img(340:343, 1),
% green = img(340:343, 2),
% blue = img(340:343, 3)
Remark A.22 Sample colour image grid, 3D mesh for colour intensities and 3D contour mesh for green channel intensity plots.
Script A.22 produces the results shown in Figs. A.38.1 and A.38.2 for the colour image with grid overlay shown in Fig. A.37. For more about this, see Sect. 3.1.
% Source: isolines.m
% Visualisation experiment with isolines
%
clc, close all, clear all                        % housekeeping
g = imread('peppers.png');                       % read colour image
figure, imshow(g), axis on, grid on;
figure,
contour(g(:,:,1));                               % isolines w/o values
figure,
[c, h] = contour(g(:,:,1)),                      % red channel isolines
clabel(c, h, 'labelspacing', 80);                % isoline label spacing
hold on
set(h, 'ShowText', 'on', 'TextStep', get(h, 'LevelStep'));
colormap jet, title('peppers.png red channel isoline values');
Listing A.23 Matlab code in isolines.m to produce the colour channel isolines shown in
Fig. A.40.
Fig. A.40 Sample colour image isolines with and without labels
Remark A.23 Sample colour channel isolines with and without labels.
Script A.23 produces the results shown in Fig. A.40.1 and A.40.2 for the colour image
with grid overlay shown in Fig. A.39. For more about this, see Sect. 3.1.
% gaussianSmoothing.m
% Script for 1D Gaussian kernel plots
% Original script by Matthew Brett 6/8/99
% Thanks extended to R. Hettiarachchi for nos correction.
% revised 24 Oct. 2016
clear all, close all, clc
% make vectors of points for the x axis
% minx = 1; maxx = 55; x = minx:maxx;            % for discrete plots
% fineness = 1/100;
% finex = minx:fineness:maxx;                    % for continuous plots
% im = imread('peppers.png');
% im = rgb2hsv(im);   % use row of im instead of nos variable (below).
%% Let mean u = 0. The formula for the 1D Gaussian kernel is defined by
%
%               1                (   x^2   )
%   f(x) = -------------- exp[ - --------- ]
%          v*sqrt(2*pi)          (  2v^2   )
%
% where v (or sigma) is the standard deviation, and u is the mean.
% 1D Gaussian kernel sigma:
%%
sigma1 = 0.41;   % 0.51, 1.5;
rng('default');
nos = randn(1, 100);
fineness = nos/100;
kernx = min(nos):fineness:max(nos);
skerny = 1/(sigma1*sqrt(2*pi))*exp(-kernx.^2/(2*sigma1^2));   % v = 0.51, 1, 3
figure
plot(kernx, skerny, 'r'), ...
    legend('f(x;sigma=0.41)', 'Location', 'NorthEast');
sigma2 = 0.61;   % 1.0;
skerny = 1/(sigma2*sqrt(2*pi))*exp(-kernx.^2/(2*sigma2^2));   % v = 1, 3
figure
plot(kernx, skerny, 'r'), ...
    legend('f(x;sigma=0.61)', 'Location', 'NorthEast');
sigma3 = 0.81;   % 1.2;
skerny = 1/(sigma3*sqrt(2*pi))*exp(-kernx.^2/(2*sigma3^2));   % v = 1, 3
figure
plot(kernx, skerny, 'r'), ...
    legend('f(x;sigma=0.81)', 'Location', 'NorthEast');
$$f(x;\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-0)^2}{2\sigma^2}} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad \text{(1D Gaussian kernel)}.$$
In the definition of the 1D Gaussian kernel function f (x; σ), x is a spatial parameter
and σ is a scale parameter. The semicolon between x and σ separates the two types of
parameters. For x, try letting x range over the pixel intensities in a row or column of
either a colour or grayscale image, letting the scale parameter σ be a small value such
as σ = 0.5. For more about this, see B.M. ter Haar Romeny in [65]. For other papers
by ter Haar Romeny on computer vision, visualization, and the Gaussian kernel, see
http://bmia.bmt.tue.nl/people/bromeny/index.html.
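The experiment suggested above can be sketched in a few lines of Matlab ('peppers.png' is a stock image; centring the intensities is an added assumption here so that the mean u = 0):

% sketch: 1D Gaussian kernel over the pixel intensities in an image row
img = imread('peppers.png');              % sample colour image
x = double(img(1,:,1));                   % red-channel intensities in row 1
x = (x - mean(x))/255;                    % centre and scale so that mean u = 0
sigma = 0.5;                              % small scale parameter
f = 1/(sigma*sqrt(2*pi))*exp(-x.^2/(2*sigma^2));   % 1D Gaussian kernel values
figure, plot(x, f, 'r.'), ...
    title('f(x; sigma = 0.5) over row-1 intensities');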
% gaussian2DKernelExperiment.m
% Script to produce almost continuous as well as discrete 2D Gaussian kernel plots
% Matthew Brett 6/8/99
% revised 24 Oct. 2016
% seed random number generator
rng('default');
nos = randn(1, 100);
fineness = mean(nos);
fineness = fineness*5;
%
FWHM = 4;
sig = FWHM/sqrt(8*log(2))
%
% 2d Gaussian kernel - fairly continuous
Dim = [20 20];
% fineness = 0.55;   % .1
[x2d, y2d] = meshgrid(-(Dim(2)-1)/2:fineness:(Dim(2)-1)/2, ...
    -(Dim(1)-1)/2:fineness:(Dim(1)-1)/2);
gf = exp(-(x2d.*x2d + y2d.*y2d)/(2*sig*sig));
gf = gf/sum(sum(gf))/(fineness^2);
figure
colormap hsv
surfc(x2d+Dim(1)/2, y2d+Dim(2)/2, gf), ...
    legend('f(x,y,sigma=1.6986)', 'Location', 'NorthEast');
beta = 1;
brighten(beta)
% 2d Gaussian kernel - discrete
[x2d, y2d] = meshgrid(-(Dim(2)-1)/2:(Dim(2)-1)/2, -(Dim(1)-1)/2:(Dim(1)-1)/2);
gf = exp(-(x2d.*x2d + y2d.*y2d)/(2*sig*sig));
gf = gf/sum(sum(gf));
figure
bar3(gf, 'r'), ...
    legend('f(x,y,sigma=1.6986)', 'Location', 'NorthEast');
axis([0 Dim(1) 0 Dim(2) 0 max(gf(:))*1.2])
axis xy
% script: gaussianFilterSimple.m
% Gaussian filtering (smoothing) a cropped image:
% See, also:
% http://stackoverflow.com/questions/2773606/gaussian-filter-in-matlab
clear all, close all, clc
%%
img = imread('CNtrain.jpg');
% img = imread('tissue.png');
img = imcrop(img);                               % crop image
% img = img(80+[1:256], 1:256, :);
figure, imshow(img), ...
    grid on, title('Sample subimage');
%# Create the gaussian filter with hsize = [5 5] and sigma = 2
G = fspecial('gaussian', [5 5], 2);
%# Filter (smooth) image
% script: gaussianFilter.m
% Gaussian blur filter on a cropped image:
% Image courtesy of A.W. Partin.
% Sample application of the deconvreg function.
% Try docsearch deconvreg for more about this.
clear all, close all, clc
Fig. A.46 Tissue image and cropped image for restoration experiments
%%
img = imread('tissue.png');
% 1 display tissue sample
figure, imshow(img), ...
grid on, axis tight, title('Tissue image');
% 2 crop image
img = imcrop(img); % tool-based cropping
% img = img(125+[1:256], 1:256, :); % manual cropping
figure, imshow(img), ...
title('Cropped tissue image');
% 3 blur: convolve gaussian smoothed image with cropped image
psf = fspecial('gaussian', 11, 5); % psf = point spread function
blurred = imfilter(img, psf, 'conv'); % convolve img with psf
figure, imshow(blurred), ...
title('Convolved point spread image 1');
% 4 noise: gaussian smooth blurred image
v = 0.02; % suggested v = 0.02, 0.002, 0.001, 0.005
blurredNoisy = imnoise(blurred, 'gaussian', 0.000, v); % 0 vs. 0.001
figure, imshow(blurredNoisy), ...
title('Blurred noisy image 1');
% 5 restore image (first pass)
np = v*prod(size(img)); % noise power
% output 1: restored image reg1 & output of Lagrange multiplier
[reg1, LAGRA] = deconvreg(blurredNoisy, psf, np);
figure, imshow(reg1), ...
title('Restored image reg1');
% 6 blur: convolve gaussian smoothed image and cropped image
psf2 = fspecial('gaussian', 8, 5); % psf = point spread function
blurred2 = imfilter(img, psf2, 'conv'); % convolve img with psf
figure, imshow(blurred2), ...
title('Convolved point spread image 2');
% 7 noise
v2 = 0.005; % suggested v = 0.02, 0.002, 0.001, 0.005
blurredNoisy2 = imnoise(blurred2, 'gaussian', 0.001, v2); % 0 vs. 0.001
figure, imshow(blurredNoisy2), ...
title('Blurred noisy image 2');
% 8 restore image (second pass)
np2 = v2*prod(size(img)); % noise power
% output 2: restored image reg2 & output of Lagrange multiplier
[reg2, LAGRA2] = deconvreg(blurredNoisy2, psf2, np2); % second pass, parallel to step 5
figure, imshow(reg2), ...
title('Restored image reg2');
Fig. A.47 Experiment 1: convolution, noise injection and cropped tissue image restoration
3o Figure A.48: (1) Blur: cropped image convolved with a Gaussian-smoothed image, (2) Noise: injection of noise into the blurred image, and (3) Restoration 2. For more about this, see Sect. 5.6.
Fig. A.48 Experiment 2: convolution, noise injection and cropped tissue image restoration
% script: imageCorners.m
% Finding image corners, R. Hettiarachchi, 2015
% revised 23 Oct. 2016
Fig. A.50 Image corners & Voronoi mesh on cropped cycle image
Fig. A.51 Image corners & Voronoi mesh on full cycle image
im = imread('cycle.jpg'); % cycle image (assumed file name; cf. Figs. A.50-A.51)
% crop method 1
im2 = imcrop(im)
figure, imshow(im2), ...
grid on, title('cropped image');
% crop method 2
% imcrop(im, [xmin ymin width height])
% im2 = imcrop(im, [180 300 300 300]);
% crop method 3
% imcrop(im, [xmin [vertical width] ymin [horizontal width]])
% im2 = im(200+[1:150], 180+[1:320], :); % crop image
g2 = rgb2gray(im2);
[m2, n2] = size(g2);
C2 = corner(g2, 50); % find up to 50 corners
% add four corners of the image to C
fc2 = [1 1; n2 1; 1 m2; n2 m2];
C2 = [C2; fc2];
figure, image(im2), hold on, ...
grid on, title('corners on cropped image'),
resultOne = plot(C2(:,1), C2(:,2), 'g+');
figure, image(im2), hold on, ...
grid on, title('Voronoi mesh on cropped image'),
result2 = plot(C2(:,1), C2(:,2), 'g+');
voronoi(C2(:,1), C2(:,2), 'g.'); % green edges
% imwrite(result2, 'corners2.jpg');
%%
g = rgb2gray(im);
[m, n] = size(g);
C = corner(g, 500); % find up to 500 corners
% add four corners of the image to C
fc = [1 1; n 1; 1 m; n m];
C = [C; fc];
figure, image(im), hold on, ...
grid on, title('corners on whole image'),
resultTwo = plot(C(:,1), C(:,2), 'g+');
figure, image(im), hold on, ...
grid on, title('Voronoi mesh on whole image'),
result = plot(C(:,1), C(:,2), 'g+');
voronoi(C(:,1), C(:,2), 'g.'); % green edges
% imwrite(result, 'corners.jpg');
Fig. A.52 Voronoï mesh on cropped cycle image with and without corners
% script: VoronoiMeshOnImage.m
% image geometry: overlay Voronoi mesh on image
%
% see http://homepages.ulb.ac.be/~dgonze/INFO/matlab.html
% revised 23 Oct. 2016
clear all; close all; clc; % housekeeping
g = imread('fisherman.jpg');
% im = imread('cycle.jpg');
% g = imread('carRedSalerno.jpg');
%%
img = g; % save copy of colour image to make overlay possible
g = double(rgb2gray(g)); % convert to greyscale image
% corners = corner(g); % min. no. of corners
k = 233; % select k corners
corners = corner(g, k); % up to k corners
box_corners = [1, 1; 1, size(g,1); size(g,2), 1; size(g,2), size(g,1)];
corners = cat(1, corners, box_corners);
vm = figure, imshow(img), ...
axis on, hold on; % set up image overlay
voronoi(corners(:,1), corners(:,2), 'g'); % overlay Voronoi mesh on image
img = ;
c = ComponentMeasurements[img, “Centroid”][[All, 2]][[1]];
Show[img, Graphics[{Black, PointSize[0.02], Point[c]}]]
For more about this, see Appendix A.6.2 and Sect. 6.4.
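As a bridge to the centroid-based mesh scripts below, here is a minimal Matlab analogue of the Mathematica centroid snippet above; the image file name, the threshold, and the choice of the first region are assumptions.

% Sketch: mark a region centroid on an image (file name and region assumed)
im = imread('fisherman.jpg');
bw = im2bw(im, 0.5);                 % threshold at 50%
stats = regionprops(bw, 'Centroid'); % centroid coordinates
c = stats(1).Centroid;               % centroid of the first region found
figure, imshow(im), hold on, ...
plot(c(1), c(2), 'k.', 'MarkerSize', 20);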
% script: findCentroids.m
% centroid-based image Delaunay mesh
clc, clear all, close all
%%
im = imread('fisherman.jpg');
% im = imread('liftingbody.png');
figure,
imshow(im), axis on;
% if size(im,3)==3
%   g = rgb2gray(im);
% end
[m, n] = size(im);
bw = im2bw(im, 0.5); % threshold at 50%
bw = bwareaopen(bw, 2); % remove objects with fewer than 2 pixels
stats = regionprops(bw, 'Centroid'); % centroid coordinates
centroids = cat(1, stats.Centroid);
fc = [1 1; n 1; 1 m; n m]; % identify image corners
centroids = [centroids; fc];
% superimpose mesh on image
figure,
imshow(im), hold on
plot(centroids(:,1), centroids(:,2), 'r+')
hold on;
X = centroids(:,1);
Y = centroids(:,2);
% construct Delaunay triangulation
% TRI = delaunay(X, Y);
% triplot(TRI, X, Y, 'y');
% findCentroidalDelaunayMesh.m
% centroid-based image Delaunay mesh
clc, clear all, close all
%%
im = imread('fisherman.jpg');
% im = imread('liftingbody.png');
% if size(im,3)==3
%   g = rgb2gray(im);
% end
[m, n] = size(im);
bw = im2bw(im, 0.5); % threshold at 50%
bw = bwareaopen(bw, 2); % remove objects with fewer than 2 pixels
stats = regionprops(bw, 'Centroid'); % centroid coordinates
centroids = cat(1, stats.Centroid);
fc = [1 1; n 1; 1 m; n m]; % identify image corners
centroids = [centroids; fc];
% superimpose mesh on image
figure,
imshow(im), hold on
plot(centroids(:,1), centroids(:,2), 'r+')
hold on;
X = centroids(:,1);
Y = centroids(:,2);
% construct Delaunay triangulation
TRI = delaunay(X, Y);
triplot(TRI, X, Y, 'y');
% script: findCentroidalVoronoiMesh.m
% centroid-based image Voronoi mesh
clc, clear all, close all
%%
im = imread('fisherman.jpg');
% im = imread('liftingbody.png');
% if size(im,3)==3
%   g = rgb2gray(im);
% end
[m, n] = size(im);
bw = im2bw(im, 0.5); % threshold at 50%
bw = bwareaopen(bw, 2); % remove objects with fewer than 2 pixels
stats = regionprops(bw, 'Centroid'); % centroid coordinates
centroids = cat(1, stats.Centroid);
fc = [1 1; n 1; 1 m; n m]; % identify image corners
centroids = [centroids; fc];
% superimpose mesh on image
figure,
imshow(im), hold on
plot(centroids(:,1), centroids(:,2), 'r+')
hold on;
X = centroids(:,1);
Y = centroids(:,2);
% construct Voronoi mesh
[vx, vy] = voronoi(X, Y);
plot(vx, vy, 'g-');
% script: findCentroidalVornonoiOnDelaunayMesh.m
% centroid-based image Delaunay and Voronoi mesh
clc, clear all, close all
%%
im = imread('fisherman.jpg');
% im = imread('liftingbody.png');
% if size(im,3)==3
%   g = rgb2gray(im);
% end
[m, n] = size(im);
bw = im2bw(im, 0.5); % threshold at 50%
bw = bwareaopen(bw, 2); % remove objects with fewer than 2 pixels
stats = regionprops(bw, 'Centroid'); % centroid coordinates
centroids = cat(1, stats.Centroid);
fc = [1 1; n 1; 1 m; n m]; % identify image corners
centroids = [centroids; fc];
% superimpose mesh on image
figure,
imshow(im), hold on
plot(centroids(:,1), centroids(:,2), 'r+')
hold on;
X = centroids(:,1);
Y = centroids(:,2);
% construct Delaunay triangulation
TRI = delaunay(X, Y);
triplot(TRI, X, Y, 'y');
% construct Voronoi mesh
[vx, vy] = voronoi(X, Y);
plot(vx, vy, 'k-');
% For saving edgelets
saveFrameNums = [10, 30];
% Set up directory for stills, target contour, etc.
[pathstr, name, ext] = fileparts(videoFile);
savePath = ['./' name '/' int2str(numPoints) 'points'];
if ~exist(savePath, 'dir')
    mkdir(savePath);
end
frameCount = 0;
cropRect = [];
edgelets = {};
% Find what to crop if we haven't yet.
if length(cropRect) == 0
    [videoFrame, cropRect] = imcrop(videoFrame);
    close figure 1; % Close the imcrop figure, which stays around.
else
    videoFrame = imcrop(videoFrame, cropRect);
end
frameCount = frameCount + 1;
if i == 1
    edgelets{frameCount} = mnc;
end
% Update video player
pos = get(videoPlayer, 'Position');
pos(3) = size(videoFrame, 2) + 30;
pos(4) = size(videoFrame, 1) + 30;
set(videoPlayer, 'Position', pos);
step(videoPlayer, videoFrameT);
% Cause contour player to 'stick' to the right of the main video
pos(1) = pos(1) + size(videoFrame, 2) + 30;
set(edgeletPlayer, 'Position', pos);
step(edgeletPlayer, contourFrame);
end
m{i} = m{i} + 1;
end
end
end
eProb = cellfun(@(e) 1/length(e), edgelets);
% Histogram of m_i
figure, histogram(cell2mat(m)), title('Histogram of m_i');
xlabel('Frequency'), ylabel('Count at that Frequency');
% Now, do a compass plot.
% I have modified this to try and look decent.
mags = unique(cell2mat(m));
zs = mags .* exp(sqrt(-1)*(2*pi*(1:length(mags))/length(mags)));
% The above evenly spaces out the magnitudes by making them
% complex numbers with a magnitude equal to their frequency values
% and phases equally spaced
figure, compass(zs), title('Compass Plot of m_i');
% TODO: log polar plot
% 3d contour plot
tri = delaunay(1:length(eSize), cell2mat(m));
figure, trisurf(tri, 1:length(eSize), cell2mat(m), eProb), title('Pr(e_i) vs. e_i and m_i');
xlabel('e_i'), ylabel('m_i'), zlabel('Pr(e_i)');
end
% Take points, order them to the right angle, return [x1, y1, x2, y2 ...]
function xy = orderPoints(x, y)
cx = mean(x);
cy = mean(y);
a = atan2(y - cy, x - cx);
[~, order] = sort(a);
x = x(order);
y = y(order);
xy = [x'; y'];
xy = xy(:); % merge the two such that we get [x1, y1, x2 ..]
if length(xy) < 6
    % our polygon is a line, dummy value it
    xy = [0 0 0 0 0 0];
end
end
voronoiLines = [];
MNCs = {};
% Creating matrix of line segments in the form
% [x_11 y_11 x_12 y_12 ... x_n1 y_n1 x_n2 y_n2]
A = [VX(1,:); VY(1,:); VX(2,:); VY(2,:)];
A(A > 5000) = 5000; A(A < -5000) = -5000;
A = A';
voronoiLines = A;
% Now find maximal nucleus cluster
[V, C] = voronoin(corners, {'Qbb','Qz'}); % Options added to avoid co-spherical error, see Matlab documentation
% Limit values, can't draw infinite things
V(V > 5000) = 5000;
V(V < -5000) = -5000;
numSides = cellfun(@length, C);
maxSides = max(numSides);
ind = find(numSides == maxSides);
N = size(corners, 1);
for i = 1:length(ind)
    xy = [];
    for j = 1:N
        if (ind(i) ~= j) % Find the corner points which have this edge
            s = size(intersect(C{ind(i)}, C{j}));
            if (s(2) > 1) % if neighbor voronoi region
                xy = [xy; corners(j,:)]; % keep the xy coords of adjacent polygon
            end
        end
    end
    MNCs{i} = xy;
end
end
Listing A.34 Matlab code in Problem734.m to obtain edgelet measurements for each Voronoï-tessellated video frame.
Fig. A.59 Sample colour used in Gaussian pyramid scheme in script A.35
% pyramidScheme.m
% Gaussian pyramid reduction and expansion of an image
% cf. Section 8.4 on cropping & sparse representations
clear all, close all, clc
im0 = imread('fisherman.jpg'); % load an image (assumed file name)
% Crop (extract) an interesting subimage
im0 = imcrop(im0);
%%
dwd = WaveletBestBasis[DiscreteWaveletPacketTransform[ ,
   Padding → "Extrapolated"]];
imgFunc[img_, {___, 1|2|3}] :=
  Composition[Sharpen[#, 0.5]&, ImageAdjust[#, {0, 1}]&, ImageAdjust,
   ImageApply[Abs, #1]&][img]
imgFunc[img_, wind_] := Composition[ImageAdjust, ImageApply[Abs, #1]&][img];
WaveletImagePlot[dwd, Automatic, imgFunc[#1, #2]&, BaseStyle → Red,
   ImageSize → 800]
Remark A.37 Digital image region centroids.
A sample wavelet-based sparse representation pyramid scheme for a 2D image
is shown in Fig. A.61 using Mathematica® script 4. For more about this, see
Sect. 8.4.
Fig. A.62 Sample pixel edge strengths represented by circle radii magnitudes
The edge strength of the red • hat pixel in Fig. A.62.1 is represented by the length
of the radius of the circle centered on the hat pixel.
A global view of multiple pixel edge strengths is shown in Fig. A.62.2. To experiment with finding the edge strengths of pixels, try Matlab script A.36.
% pixel edge strength detection
% N.B.: each pixel found is a keypoint
clc; clear all, close all;
% g = imread('cameraman.tif');
% I = g;
g = imread('fisherman.jpg');
I = rgb2gray(g); % necessary step for colour images
points = detectSURFFeatures(I);
% acquire edge pixel strengths
[features, keyPts] = extractFeatures(I, points);
% record number of keypoints found
keyPointsFound = keyPts
% select number of pixel edge strengths to display on original image
figure,
imshow(g); hold on;
plot(keyPts.selectStrongest(13), 'showOrientation', true),
axis on, grid on;
figure,
imshow(g); hold on;
plot(keyPts.selectStrongest(89), 'showOrientation', true),
axis on, grid on;
Remark A.39 A sample plot of arctan values is shown in Fig. A.63. Try doing the same thing using Matlab®.
Remark A.40 Each pixel intensity in Fig. A.64 is a representation of the HSB (Hue Saturation Brightness) colour channel values that correspond to the pixel triple (gradient orientation (angle), gradient magnitude, gradient magnitude) in a fingerprint. Try Mathematica script 6 to see how the pixel gradient orientations vary with each fingerprint. The HSV (Hue Saturation Value) colour space in Matlab is equivalent to the HSB colour space in Mathematica.
i= ;
orientation = GradientOrientationFilter[i, 1]//ImageAdjust;
magnitude = GradientFilter[i, 1]//ImageAdjust;
ColorCombine[{orientation, magnitude, magnitude}, “HSB”]
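Since HSV in Matlab corresponds to HSB in Mathematica, the following sketch is a hedged Matlab analogue of Mathematica script 6; the file name fingerprint.jpg and the channel normalizations are assumptions.

% Sketch: combine gradient orientation and magnitude as HSV channels
g = im2double(rgb2gray(imread('fingerprint.jpg'))); % assumed image
[gmag, gdir] = imgradient(g);   % gradient magnitude and direction (degrees)
H = (gdir + 180)/360;           % hue from orientation, scaled to [0,1]
V = mat2gray(gmag);             % magnitude normalized to [0,1]
hsv = cat(3, H, V, V);          % (orientation, magnitude, magnitude)
figure, imshow(hsv2rgb(hsv));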
Remark A.41 In Fig. A.65.1, each pixel intensity is a representation of the RGB colour channel values that correspond to the pixel triple (gradient orientation (angle), gradient magnitude, gradient magnitude) in a fingerprint. In Fig. A.65.2, each pixel intensity is a representation of the LAB colour channel values corresponding to the same triple. Try Mathematica script 6 for different images to see how the pixel gradients vary with each image.
i= ;
orientation = GradientOrientationFilter[i, 1]//ImageAdjust;
Remark A.42 In Fig. A.66, a LAB colour space view of an image is given. To experiment with other LAB-coloured images, try Matlab script A.37.
% LAB colour space: Colourize image regions.
% script: LABexperiment.m
clear all; close all; clc; % housekeeping
%%
img = imread('fisherman.jpg');
Remark A.43 The image in Fig. A.67.2 results from a difference of Gaussians convolved with the original image in Fig. A.67.1. Let Img(x, y) be an intensity image and let G(x, y, σ) be a variable-scale Gaussian defined by

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2+y^2}{2\sigma^2}}.$$

Let k be a scaling factor and let ∗ be a convolution operation. From D.G. Lowe [116], we obtain a difference-of-Gaussians image (denoted by D(x, y, σ)) defined by

$$D(x, y, \sigma) = \left(G(x, y, k\sigma) - G(x, y, \sigma)\right) * Img(x, y).$$

Then use D(x, y, σ) to identify potential interest points that are invariant to scale and orientation.
% DoG: Difference of Gaussians.
% script: dogImg.m
clear all; close all; clc; % housekeeping
%%
k = 1.5; % vs. 1.1, 1.5, 3, 3.5
sigma1 = 5.55; % vs. 0.30, 0.98, 0.99, 5.55
sigma2 = sigma1*k;
hsize = [8, 8];
% g = imread('cameraman.tif');
g = rgb2gray(imread('fisherman.jpg'));
%%
gauss1 = imgaussfilt(g, sigma1);
gauss2 = imgaussfilt(g, sigma2);
%
dogFilterImage = gauss2 - gauss1; % difference of Gaussians
imshow(dogFilterImage, []), axis, grid on;
% title('DOG Image', 'FontSize', fontSize);
Remark A.44 Two views of image geometry are shown in Fig. A.68.
View.1 Keypoint Gradient Orientation View. Each of the 21 keypoints in
Fig. A.68.1 is the center of a circle. The radius of each circle equals the
edge strength of a keypoint.
Example A.45 Fisherman's hat keypoint.
%% part 2 - display the keypoints without the surrounding circles and without a mesh
% get XY coordinates of keypoints
% XYLoc = keyPts.Location; % for all keypoints - uncomment this and comment lines 23 and 24 below
B.1 A
A
A/D: Analog-to-digital conversion accomplished by taking samples of an analog
signal at appropriate intervals. The A/D process is known as sampling. See
Fig. B.1 for an example.
B.2 B
B
bdy A: Set of boundary points of the set A. See, also, Open set, Closed set.
Bit depth: Bit depth quantifies how many unique colours are available in an image's colour palette in terms of the number of 0's and 1's, or "bits," which are used to specify each colour.
Example B.1 Bit Depth. A digital camera colour image usually has a bit depth of 24 bits, with 8 bits per colour channel, i.e., 2^24 = 16,777,216 representable colours.1
Boundary set: The boundary set of a nonempty set A (denoted by bdy A) is the
set of points along the border of and adjacent to a nonempty set. Then bdy A is
the set of those points nearest A and not in A. Put another way, let A be any set
in the Euclidean plane X. A point p ∈ X is a boundary point of A, provided the neighbourhood of p (denoted by N(p)) intersects both A and the set of all points in X not in A [100, Sect. 1.2, p. 5]. Geometrically, p is on the edge between A
and the complement of A in the plane. See Boundary region, Neighbourhood
of a point, Hole.
1 http://www.cambridgeincolour.com/tutorials/bit-depth.htm.
orange pulp exterior The skin of an orange is the boundary of the orange interior (the orange pulp).
egg exterior The egg shell is the boundary of an egg yolk.
window frame A window frame surrounding a pane of glass is the boundary of the glass.
empty box An empty box bounded on each side by a wall. Example: a shoe box with no shoes or anything else in it.
Plane subset boundary The set of points on the edge of a planar set constitutes the boundary of the interior of the set.
subimage Any subimage that includes its boundary pixels, e.g., a Rosenfeld 8-neighbourhood plus the pixels along its borders.
B.3 C
C Symbol for set of complex numbers. See, also, Complex numbers, Riemann
space.
Candela: A candela is the SI (Système International) base unit of luminous intensity, i.e., luminous power per unit solid angle emitted by a point light source in a particular direction. The contribution of each wavelength is weighted by the standard luminosity function [209], defined in terms of

E_v = illuminance measurement,
r_1 = radius of the limiting aperture,
r_2 = radius of the light source,
d = physical distance between the light source and aperture,
D = distance between an aperture plane and a photometer: $D = \sqrt{r_1^2 + r_2^2 + d^2}$,
A = area of the source aperture.

$$L_v(E_v, D, A) = \frac{E_v D^2}{A} \quad \text{(Luminosity function)}.$$
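A minimal Matlab sketch of the luminosity function follows; all of the measurement values are assumed sample numbers, not calibrated data.

% Sketch: luminosity function L_v(Ev, D, A) with assumed sample values
Ev = 100;                     % illuminance measurement [lux] (assumed)
r1 = 0.01; r2 = 0.02;         % aperture and source radii [m] (assumed)
d  = 0.5;                     % source-to-aperture distance [m] (assumed)
A  = pi*r2^2;                 % area of the source aperture [m^2]
D  = sqrt(r1^2 + r2^2 + d^2); % photometric distance
Lv = Ev*D^2/A                 % luminance L_v(Ev, D, A)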
For more details about region centroids, see Q. Du, V. Faber and
M. Gunzburger [38, p. 638]. For the discrete form of centroid, see Sect. 6.4.
Closed lower halfspace: A closed lower half space is the set of all points below
as well as on a boundary line.
Closed upper halfspace: A closed upper half space is the set of all points above
as well as on a boundary line.
Closed set: A set of points that includes its boundary and interior points. Let A
be a nonempty set, intA the interior of A, bdy A the boundary of A. Then A is a
closed set, provided A = intA ∪ bdy A, i.e., A is closed, provided A is the union
of the set of points in its interior and in its boundary.
C: Complex numbers. C is the set of complex numbers. Let a, b ∈ R be real numbers, i = √−1. A complex number z ∈ C has the form

z = a + bi (Complex number).
Complex Plane C: The complex plane (also called the z-plane) is the plane of complex numbers. See, for example, Fig. B.4.1 in Fig. B.4. Points z in the complex plane are of the form z = a + bi, a, b ∈ R, i = √−1. The origin of the complex plane is at z = 0 = 0 + 0i. A Riemann surface is the union of the complex plane with infinity (denoted by C ∪ {∞}). A Riemann sphere touches the complex z-plane at a point S at z = 0 = 0 + 0i, and its central axis is the line joining S to the top of the sphere at N, diametrically opposite S. A line joining N (at infinity ∞) to a point z pierces the surface of the sphere at the point P (see Fig. B.4.2). For a visual perspective on the complex plane and Riemann sphere, providing insights important for computer vision, see E. Wegert [206]. See, also, Riemann surface, C, z.
Colour pixel value: Pixel colour intensity or brightness. See Gamma Correc-
tion, RGB, HSV.
Compact set: Let X be a topological space and let A ⊆ X. Briefly, a cover of a nonempty set A in a space X is a collection of nonempty subsets in X whose union is A. Put another way, let C(A) be a cover of A. Then

$$\mathscr{C}(A) = \bigcup_{B \subset X} B \quad \text{(Cover of the nonempty set } A\text{)}.$$

The set A is a compact set, provided every open covering of A has a finite subcovering [100, Sect. 1.5, p. 17]. See Cover, Topological space.
Example B.9 Let A be a 2D digital image. Then Ac is the set of all points in the plane not in A.
2 Many thanks to Kyle Fedoruk for supplying the Subaru EyeSight® vision system images.
Connected polygons Polygons are connected, provided the polygons share one
or more points. See Disconnected, Path-connected.
Example B.15 Connected Voronoï Polygons.
For example, the pair of Voronoï regions A and B in Fig. B.8 are connected, since A and B have a common edge. Similarly, the pair of Voronoï regions B and C in Fig. B.8 are connected. However, the pair of Voronoï regions A and C in Fig. B.8 are not connected polygons, since A and C have no points in common, i.e., A and C are disjoint sets.
Contour edgelet: When an MNC contour is intersected with a line segment, an edgelet gains the points (pixels) on each contour line segment. In tessellated video frames, an edgelet is a set of points in a frame MNC contour.
Cover of a set: A cover of a nonempty set A in a space X is a collection of nonempty subsets in X whose union is A. Put another way, let C(A) be a cover of A. Then

$$\mathscr{C}(A) = \bigcup_{B \subset X} B \quad \text{(Cover of the nonempty set } A\text{)}.$$

In other words, every mesh cover of Img has a finite sub-mesh that is a cover of Img.
Convex body A convex body is a compact convex set [61, Sect. 3.1, p. 41].
A proper convex body has a nonempty interior. Otherwise, a convex body is
improper. See Convex set, Compact.
Convex combination: A unique straight line segment p0 p1, p0 ≠ p1, is defined by two points p0 ≠ p1 so that the line passes through both points [42, Sect. I.4, p. 20]. Each point x ∈ p0 p1 can be written as x = (1 − t)p0 + t p1 for some t ∈ R. Notice that p0 p1 is a convex set containing all points on the line (see Fig. B.10.1).
Edelsbrunner–Harer Convex Combination Method: We can construct a convex set containing all points on the triangular face formed by three points after we add a third point p2 to the original set {p0, p1}. That is, we extend the line segment convex set so that it becomes a triangle with a filled triangle face.

for t = 0, we get x = p0,
for t = 1, we get x = p1,
for 0 < t < 1, we get a point in between p0 and p1.

A line segment convex set is also a convex hull of two points, since it is the smallest convex set containing the two points. In the case where we have more than two points, the above construction is repeated for {p0, p1, p2} by adding all points y = (1 − t)x + t p2 for each x ∈ p0 p1 and 0 ≤ t ≤ 1. The result is a triangle-shaped convex hull with a filled-in triangle face (see, e.g., Fig. B.10.2). Using the convex combination method on a set of 4 points, we obtain the convex hull shown in Fig. B.10.3.
Repeating this construction for a set of 5 points, we obtain the convex hull shown in Fig. B.10.4. In general, starting with a set {p0, p1, p2, . . . , pk} containing k + 1 points, the convex combination construction method can be done in one step, calling

$$x = \sum_{i=0}^{k} t_i p_i$$

a convex combination of the points p_i, provided $\sum_{i=0}^{k} t_i = 1$ and t_i ≥ 0 for all 0 ≤ i ≤ k. In that case, the set of convex combinations is the convex hull of the points p_i.
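A minimal Matlab sketch of the one-step convex combination construction follows, assuming three sample points; random weights t_i ≥ 0 normalized so that they sum to 1 fill in the triangle-shaped convex hull.

% Sketch: convex combinations of k+1 = 3 points (coordinates assumed)
p = [0 0; 4 0; 2 3];                 % sample points p0, p1, p2
t = rand(500, 3);                    % nonnegative weights
t = t ./ repmat(sum(t, 2), 1, 3);    % each row: t0 + t1 + t2 = 1
x = t * p;                           % convex combinations of p0, p1, p2
figure, plot(x(:,1), x(:,2), 'b.'), hold on, ...
plot(p([1:3 1],1), p([1:3 1],2), 'r-');  % triangle-shaped convex hull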
Convex hull: Let A be a nonempty set. The smallest convex set containing the set of points in A is the convex hull of A (denoted by convh A). G.M. Ziegler [220, p. 3] gives a method of constructing a convex hull of a set of points K (convh K), defined by the intersection of all convex sets that contain K. For practical purposes, let convh K be a 2D convex hull with K ⊂ R2. Then a 2D convex hull convh K is defined by

$$\text{convh}\,K = \bigcap \left\{ K' \subseteq \mathbb{R}^2 : K \subseteq K' \text{ with } K' \text{ convex} \right\} \quad \text{(Ziegler Method)}.$$
boundary set and interior set. In the plane, every convex set contains an infinite
number of points. The 55 points on the vertices, on the edges and in the interior
of the 7-gon shaped convex hull are shown in Fig. B.11.2. A 3D convex hull of 89
points is shown in Fig. B.11.3. The 89 points on the surfaces and in the interior of
the sample 3D convex hull are shown in Fig. B.11.4.
Fig. B.12 Sample MNC contour and Voronoï region convex hulls of sets of points
Theorem B.19 A Voronoï region in the Euclidean plane is a proper convex body.
Proof See J.F. Peters [144, Sect. 11, p. 307].
Example B.20 Strictly Convex set.
Let p be a mesh generating point and let V_p (also written V(p)) be a Voronoï region. In Fig. B.13, the Voronoï region A = V_p is a strictly convex set, since A is closed and it has the Strict Convexity property.
Convexity property A family of convex sets has the convexity property, pro-
vided the intersection of any number of sets in the family belongs to the fam-
ily [182]. See Convex set.
B.4 D
D
Data compression: Reduction in the number of bits required to store data (text,
audio, speech, image and video).
Digital video: Time-varying sequence of images captured by digital camera (see,
e.g., S. Akramaullah [4, p. 2]).
B.5 E
E
|e_i|: Number of edge pixels in edgelet e_i. Initially, |e_i| will equal the number of mesh generators in an MNC contour. Later, |e_i| will equal the total number of edge pixels in a contour edgelet (denoted by |max e_i|), i.e.,
Note: The approximation of a digital signal is its expected value. In the context
of video signal analysis, x̂(i) denotes the expected value of the i th digital sample
of an analog signal either from an optical sensor in a digital camera or in a video
camera.
B.6 F
B.7 G
For the details, see Sect. 2.11. For more about this, see Z.-N. Li, M.S. Drew and
J. Liu [111, Sect. 4].
Gamut mapping: Mapping a source signal to a display (e.g., LCD) that meets
the requirements of the display. Colour saturation is kept within the boundaries
of the destination gamut to preserve relative colour strength and out-of-gamut
colours from a source are compressed into the destination gamut. A graphical
representation of the destination gamut (nearest colour) and outside gamut (true
colour) is shown in https://en.wikipedia.org/wiki/Gamut.
Basically, this is accomplished in practice using gamma correction. See Gamma
γ Correction.
Generating point: A generating point is a point used to construct a Voronoï region.
Site and Generator are other names for a mesh generating point. Let S be a set
of sites, s ∈ S, X a set of points used in constructing a Voronoï mesh, V (s) a
Voronoï region defined by
Geodetic graph: A graph G is a geodetic graph, provided, for any two vertices
p, q on G, there is at most one shortest path between p and q. A geodetic line is
a straight line, since the shortest path between the endpoints of a straight line is
the line itself. For more about this, see J. Topp [195]. See, also, Convex hull.
B.8 H
H
Halfspace: Given a line L in an n-dimensional vector space Rn, a half space is a set of points that includes those points on one side of the boundary line L. The
points on the line L are included in the half space, provided the half space is closed.
Otherwise, the half space is open. See, also, Boundary region, Boundary set,
Open set, Polytope, Closed half space Closed lower half space, Closed upper
half space, Open half space, Open lower half space, Open upper half space.
Hole: A set with an empty interior. In the plane, a portion of the plane with a
portion of it missing, i.e., a portion of the plane with a puncture in it (a cavity in
the plane). A geometric structure that cannot be continuously shrunk to a point,
since there is always part of the structure that is missing. See, also, Interior, Open
set.
Img = ;
Closing[Img, 1]
Try writing a Matlab script to remove black holes from a colour (not a binary) image.
Hue: Hue of a colour is the wavelength of the colour within the visible light spec-
trum at which the energy output is greatest [67, Sect. 1.4]. Hue is a point char-
acteristic of colour, determined at a particular point in the visible light spectrum
and measured in nanometers. Let R, G, B be red, green, blue colour. A. Hanbury
and J. Serra [68, Sect. 2.2.1, p. 3] define saturation S and hue H expressions in
the following way.
$$S = \begin{cases} \dfrac{\max\{R,G,B\} - \min\{R,G,B\}}{\max\{R,G,B\}}, & \text{if } \max\{R,G,B\} \neq 0,\\[1ex] 0, & \text{otherwise.} \end{cases}$$
and

$$H = \begin{cases} \text{undefined}, & \text{if } S = 0,\\[0.5ex] \dfrac{G-B}{\max\{R,G,B\}-\min\{R,G,B\}}, & \text{if } R = \max\{R,G,B\} \neq 0,\\[1ex] 2 + \dfrac{B-R}{\max\{R,G,B\}-\min\{R,G,B\}}, & \text{if } G = \max\{R,G,B\} \neq 0,\\[1ex] 4 + \dfrac{R-G}{\max\{R,G,B\}-\min\{R,G,B\}}, & \text{if } B = \max\{R,G,B\} \neq 0. \end{cases}$$

Then 60°H equals the hue value in degrees. See Saturation, Value, HSV.
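A minimal Matlab sketch of the Hanbury–Serra S and H formulas above for a single RGB triple follows; the sample channel values are assumptions.

% Sketch: saturation S and hue H for one RGB triple (values assumed)
R = 200; G = 120; B = 40;
mx = max([R G B]); mn = min([R G B]);
S = 0; if mx ~= 0, S = (mx - mn)/mx; end
if S == 0
    H = NaN;                      % undefined
elseif R == mx
    H = (G - B)/(mx - mn);
elseif G == mx
    H = 2 + (B - R)/(mx - mn);
else
    H = 4 + (R - G)/(mx - mn);
end
hueDegrees = 60*H                 % hue in degrees (here, 30, an orange hue)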
Hue Saturation Value (HSV): The HSV (Hue Saturation Value) theory is com-
monly used to represent the RGB (Red Green Blue) colour technical model [67].
The HSV colour space was introduced in 1905 by A. Munsell [127] and elabo-
rated in 1978 by G.H. Joblove and D. Greenberg [89] to compensate for technical
and hardware limitations for applications in colour display systems. Hue is an
angular component, Saturation is a radial component and Value (Lightness) is the
colour intensity along vertical in the 3D color model that shows the geometry of
the HSV colour space in https://en.wikipedia.org/wiki/HSL_and_HSV.
Complete, detailed views of the HSV color model are given by J. Halus̆ka [67]
and A. Hanbury and J. Serra [68].
Huffman coding: A lossless data compression algorithm that uses a small number
of bits to encode common characters.4 To see an example of Huffman coding for
a digital image, try using the Matlab script A.2 in Appendix A.1 with any digital
colour image.
B.9 I
I
i: Imaginary number, i = √−1. See, also, z, Complex plane, Riemann surface.
Int A: Interior of a nonempty set A, a set without boundary points. See, also, Open set, Closed set, Boundary set.
4 For the details about Huffman coding, see E.W. Weisstein at http://mathworld.wolfram.com/HuffmanCoding.html.
B.10 K
Key Frame Selection: Selection of video frames exhibiting the most change.
Example B.29 Adaptive Key Frame Selection. Adaptive key frame selection is an approach to efficient video coding suggested by J. Jun et al. in [121]. For example, adaptive key frame selection is achievable by selecting video frames in which there are significant changes in image tessellation polygons. In a changing scene recorded by a webcam, overlay polygons on video frames temporally close together usually will vary only slightly. By contrast, the areas of the overlay polygons on temporally separated video frames often will vary significantly in recording a changing scene.
B.11 L
Luminance: The light reflected from a surface (denoted by L(λ)). Let E be the incident illumination, λ the wavelength, R ∈ [0, 1] the reflectivity or reflectance of a surface. Then L(λ) has a spectrum given by

$$L(\lambda) = E(\lambda)R(\lambda)\ \text{cd}\,\text{m}^{-2}.$$
B.12 M
of an MNC. For the details, see Sect. 7.5. See, also, MNC-based image object shape recognition Methods B.12.
Example B.30 A sample Voronoï mesh on a CN train image is shown in Fig. B.18.2. A pair of mesh nuclei such as are shown in Fig. B.19.1. These red nuclei each have the highest number of adjacent polygons. Hence, these red nuclei are the centers of the maximal nucleus clusters (MNCs) shown in Fig. B.19.2.
MNC-based image object shape recognition Methods: There are three basic methods that can be used to achieve image object shape recognition, namely,
Method 1: Inscribed Circle Perimeter Inside MNC Contours. The circle-perimeter method gets its inspiration from V. Vakil [198]. To begin, choose a keypoint-based Voronoï-tessellated query image Q like the one in Fig. B.20 and a Voronoï-tessellated video frame test image T. Identify an MNC in Q and an MNC in T. Measure the perimeter of circles inscribed in coarse-grained and fine-grained nucleus contours that are centered on each image MNC nucleus generating point. See, for example, the coarse-grained circle on the MNC in Fig. B.20.
In other words, an object in a query image is recognized, provided the difference $|P_Q - P_T| < \varepsilon$ is small enough.
Let Q_E, T_E be the edge strengths of the keypoints of the nuclei for the query image Q and test image T, respectively. Then compute

$$Object\ Recognized = \begin{cases} 1, & \text{if } |Q_E - T_E| < \varepsilon,\\ 0, & \text{otherwise.} \end{cases}$$

In other words, provided the difference between the edge strengths $|Q_E - T_E|$ is small enough, the MNC objects in image Q and image T are similar, i.e., recognized.
3o Summarize your findings for each video in a table (see Table B.2).
MNC spoke. See Nerve in Appendix B.13.
MSSIM: Mean SSIM image quality index. Let X, Y be the rows and columns of an n × m digital image, row x_i ∈ X, 1 ≤ i ≤ n, column y_j ∈ Y, 1 ≤ j ≤ m such that

$$x = (x_1, \ldots, x_n), \quad y = (y_1, \ldots, y_m).$$
B.13 N
For more about this, see J.F. Peters [142, Sect. 1.14]. See Open set, Closed set.
B.14 O
O
Object tracking: In a digital video sequence, identify moving objects and track
them from frame to frame. In practice, each frame is segmented into regions with
similar colour and intensity and which are likely to have some motion. This can be
accomplished by tessellating each video frame, covering each frame with poly-
gons that are Voronoï regions, then comparing the changes in particular polygons
from frame to frame. For more about this, see S.G. Hoggar [83, Sect. 12.8.3, p.
441].
Open lower halfspace: An open lower half space is the set of all points below but not on a boundary line.
Open upper halfspace: An open upper half space is the set of all points above but not on a boundary line.
B.15 P
P
Path: A sequence p_1, . . . , p_i, p_{i+1}, . . . , p_n of n pixels or voxels is a path, provided p_i, p_{i+1} are adjacent (no pixels in between p_i and p_{i+1}).
$$B(\lambda) = \frac{2hc^2}{\lambda^5\left(e^{\frac{hc}{k\lambda T}} - 1\right)}\ \text{W}\,\text{m}^{-2}\,\text{m}^{-1}.$$

This is the power emitted by a light source such as the filament of an incandescent light bulb:
https://en.wikipedia.org/wiki/Incandescent_light_bulb.
For the details, see P. Corke [31].
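A minimal Matlab sketch evaluating B(λ) follows; the filament temperature and wavelength range are assumptions, while h, c, k are the standard SI constants.

% Sketch: blackbody emission B(lambda) for an assumed filament temperature
h = 6.626e-34; c = 2.998e8; k = 1.381e-23; % Planck, light speed, Boltzmann
T = 2800;                      % assumed filament temperature [K]
lambda = (300:10:1200)*1e-9;   % wavelengths [m]
Blambda = (2*h*c^2)./(lambda.^5 .* (exp(h*c./(k*lambda*T)) - 1));
figure, plot(lambda*1e9, Blambda), ...
xlabel('wavelength [nm]'), ylabel('B(\lambda) [W m^{-2} m^{-1}]');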
Point: Another name for a digital image pixel.
Polytope: Let Rn be an n-dimensional Euclidean vector space, which is where polytopes live. A polytope is a set of points A ⊆ Rn that is either the convex hull of a set of points K (denoted by convh K) or the intersection of finitely many closed half spaces in Rn. This view of polytopes is based on [220, p. 5]. Notice
that non-convex polytopes are possible, since the intersection of finitely many half
spaces may not be the smallest convex set containing a set of points. Polytopes
are commonly found in Voronoï tessellations of digital images. See, also, Convex
hull, Convex set, Half space.
B.16 Q
Q
Quality of a digital image: Mean SSIM (MSSIM) index. See MSSIM, SSIM.
Quality of a Voronoï region: Recall that a Voronoï region is a polygon with n sides. Let V(s) be a Voronoï region and let Q(V(s)) be the quality of V(s). The quality Q(V(s)) is highest when the polygon sides are equal in length.
Example B.37 Let A be the area of V(s) with 4 sides having lengths l_1, l_2, l_3, l_4. Then

$$Q(V(s)) = \frac{4A}{l_1^2 + l_2^2 + l_3^2 + l_4^2}.$$

Q(V(s)) will vary, depending on the area and the number of sides in a Voronoï region polygon. Let Q_i(V(s_i)) (briefly, Q_i) be the quality of polygon i with 1 ≤ i ≤ n, n ≥ 1. And let S be a set of generating points, V(S) a Voronoï tessellation. Then

$$Q(V(S)) = \frac{1}{n}\sum_{i=1}^{n} Q_i \quad \text{(Global Mesh Quality Index)}.$$
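A minimal Matlab sketch of Q(V(s)) for one 4-sided region follows; the vertex coordinates are assumptions. For a square, the sketch yields Q(V(s)) = 1, the highest quality.

% Sketch: quality of a 4-sided Voronoi region (vertices assumed)
vx = [0 3 4 1]; vy = [0 0 3 2];                  % polygon vertices
A = polyarea(vx, vy);                             % polygon area
l = hypot(diff(vx([1:4 1])), diff(vy([1:4 1])));  % side lengths l1..l4
Q = 4*A/sum(l.^2)                                 % quality Q(V(s))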
In other words, an MNC contour shape match is good, provided the difference between a target MNC contour perimeter and a sample MNC contour perimeter is less than some small positive number ε.
Quantum optics: The study of light and interaction between light and matter at
the microscopic level. For more about this, see C. Fabre [48]. See, also, Photon.
B.17 R
R
R: Set of reals (real numbers).
R2 : Euclidean plane (2-space). 2D digital images live in R2 .
R3 : Euclidean 3-space. 3D digital images live in R3 .
Reality: What we experience as human beings.
RGB: Red Green Blue colour technical model. For the RGB wavelengths based on the CIE (Commission internationale de l'éclairage: International Commission on Illumination) 1931 colour space, see
https://en.wikipedia.org/wiki/CIE_1931_color_space.
Regular polygon: An n-sided polygon in which all sides are the same length
and are symmetrically arranged about a common center, which means that a
regular polygon is both equiangular and equilateral. For more about this, see
E.W. Weisstein [207].
Riemann surface: A Riemann surface is a surface that covers the complex plane
(z-plane or z-sphere) with sheets. See Complex plane, C.
B.18 S
S
Sampling: Extracting samples from an analog signal at appropriate intervals. A continuous analog signal x_a(t), such as the signal from an optical sensor in a digital camera or in a web cam, is captured over a temporal interval t. Let T > 0 denote the sampling period (duration between samples) and let n be the sample number. Then x(n) = x_a(nT) is a digital sample of the analog signal x_a(t) at time t = nT. Each spike in Fig. B.1.2 represents a sampled signal (either image or video frame).
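A minimal Matlab sketch of sampling follows; the sampling period and the 5 Hz sinusoid standing in for x_a(t) are assumptions.

% Sketch: sampling x(n) = x_a(nT) (signal and period assumed)
T = 0.01;                  % sampling period [s]
n = 0:99;                  % sample numbers
xa = @(t) sin(2*pi*5*t);   % assumed analog signal, 5 Hz sinusoid
x = xa(n*T);               % digital samples at times t = nT
figure, stem(n*T, x), xlabel('t = nT [s]'), ylabel('x(n)');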
total area of the polygons covering one shape with the total area of the polygons
covering another shape (see, e.g., the shape measure given by D.R. Lee and G.T.
Sallee in [106]).
i.e., boundary perimeters p(K_Q), p(K_T) are close and convex hull perimeters p(C_Q), p(C_T) are close.
See, also, Shape, Convex Hull, Convex Set, MNC, Boundary region of a set, Boundary Set, and Jeff Weeks' lecture5 on the shape of space and how the universe could be a Poincaré dodecahedral space: https://www.youtube.com/watch?v=j3BlLo1QfmU.
Fig. B.26 Query image Q and test image T boundary and convex hull perimeters
Example B.41 Comparing query and test image shape convex hulls and boundaries. In Fig. B.26, query and test image shape boundaries and convex hulls are represented by p(K_Q), p(K_T) and p(C_Q), p(C_T), respectively. In each case, the shape boundary contains the shape convex hull. The basic approach is to compare the lengths of the boundaries p(K_Q), p(K_T) and the convex hull perimeters p(C_Q), p(C_T). Let ε be a positive number. Then the test image shape approximates the query image shape, provided

$$|p(K_Q) - p(K_T)| \leq \varepsilon \ \text{ and } \ |p(C_Q) - p(C_T)| \leq \varepsilon.$$
The similarity distance D(A, B) between two contours A and B, represented by sets of uniformly sampled points in A and B [60, Sect. 2, p. 29], is defined by

$$D(A, B) = \max\left\{ \max_{a \in A} D(a, B),\ \max_{b \in B} D(b, A) \right\} \quad \text{(Similarity Distance)}.$$
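A minimal Matlab sketch of D(A, B) follows for two contours sampled as small point sets; the coordinates are assumptions, and pdist2 from the Statistics and Machine Learning Toolbox is assumed available.

% Sketch: similarity distance between two sampled contours (points assumed)
A = [0 0; 1 0; 1 1];                  % sampled points on contour A
B = [0 0.1; 1.1 0; 1 1.2];            % sampled points on contour B
dAB = max(min(pdist2(A, B), [], 2));  % max over a in A of min distance to B
dBA = max(min(pdist2(B, A), [], 2));  % max over b in B of min distance to A
D = max(dAB, dBA)                     % similarity distance D(A,B)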
Signal quality: The expected value of a signal compared with an actual signal.

$$MSE(x) = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2.$$

In our case, a signal is a vector of pixel intensities such as the row or column intensities in a greyscale digital image.
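A minimal Matlab sketch of MSE(x) follows; the actual and expected signal values are assumptions.

% Sketch: MSE between a signal and its expected value (values assumed)
x    = double([100 102 98 97 101]);   % actual pixel-intensity signal
xhat = double([100 100 100 100 100]); % expected signal
N = numel(x);
MSE = sum((x - xhat).^2)/N            % mean squared error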
B.19 T
B.20 U
UQI: Universal Quality Index defined by Z. Wang and A.C. Bovik in [204]. Let μ_x, μ_y be the average pixel intensity in the x and y directions, respectively. Let σ_x, σ_y be the image signal contrast in the x and y directions, respectively, defined by

$$\sigma_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \mu_x\right)^2}, \quad \sigma_y = \sqrt{\frac{1}{m-1}\sum_{i=1}^{m}\left(y_i - \mu_y\right)^2}.$$

UQI(x, y) is defined by

$$UQI(x, y) = \frac{4\sigma_{xy}\mu_x\mu_y}{\left(\mu_x^2 + \mu_y^2\right)\left(\sigma_x^2 + \sigma_y^2\right)}.$$

In [204, Sect. II, p. 81], x is an original image signal and y is a test image signal. Notice that in a greyscale image, x is a row of pixel intensities and y is a column of pixel intensities. UQI(x, y) is the same as SSIM(x, y) when C_1 = C_2 = 0 in the structural similarity image measure SSIM(x, y). See SSIM.
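A minimal Matlab sketch of UQI(x, y) follows; the sample signals are assumptions, and σ_xy is taken to be the sample covariance.

% Sketch: Universal Quality Index for two equal-length signals (values assumed)
x = double([10 20 30 40 50]);
y = double([12 18 33 41 48]);
mux = mean(x); muy = mean(y);
sx2 = var(x); sy2 = var(y);                    % sample variances, 1/(n-1)
sxy = sum((x - mux).*(y - muy))/(numel(x)-1);  % sample covariance
UQI = 4*sxy*mux*muy/((mux^2 + muy^2)*(sx2 + sy2))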
B.21 V
B.22 W
Webcam: Video camera that streams its images in real-time to a computer network.
For applications, see E.A. Vlieg [200].
B.23 X
B.24 Z
1. A-iyeh, E.: Point pattern voronoï tessellation quality and improvement, information and
processing: applications in digital image analysis. Ph.D. thesis, University of Manitoba,
Department of Electrical and Computer Engineering’s (2016). Supervisor: J.F. Peters
2. A-iyeh, E., Peters, J.: Rényi entropy in measuring information levels in Voronoï tessellation
cells with application in digital image analysis. Theory Appl. Math. Comput. Sci. 6(1), 77–95
(2016). MR3484085
3. Aberra, T.: Topology preserving skeletonization of 2d and 3d binary images. Master’s the-
sis, Technische Universität Kaiserslautern, Kaiserslautern, Germany (2004). Supervisors: K.
Schladitz, J. Franke
4. Akramaullah, S.: Digital Video Concepts, Methods and Metrics. Quality, Compression, Per-
formance, and Power Trade-Off Analysis, Xxiii+344 pp. Springer, Apress, Berlin (2015)
5. Allili, M., Ziou, D.: Active contours for video object tracking using region, boundary and
shape information. Signal Image Video Process. 1(2), 101–117 (2007). doi:10.1007/s11760-
007-0021-8
6. Apostol, T., Mnatsakanian, M.: Complete dissections: converting regions and their boundaries.
Am. Math. Mon. 118(9), 789–798 (2011)
7. Archimedes: On the Sphere and Cylinder. On Paraboloids, Hyperboloids and Ellipsoids, trans. and annot. by A. Czwalina-Allenstein. Reprint of 1922, 1923, Geest & Portig, Leipzig (1987). The Works of Archimedes, ed. by T.L. Heath, Cambridge University Press, Cambridge (1897)
8. Baerentzen, J., Gravesen, J., Anton, F., Aanaes, H.: Computational Geometry Processing.
Foundations, Algorithms, and Methods. Springer, Berlin (2012). doi:10.1007/978-1-4471-
4075-7, Zbl 1252.68001
9. Baroffio, L., Cesana, M., Redondi, A., Tagliasacchi, M.: Fast keypoint detection in video
sequences, pp. 1–5 (2015). arXiv:1503.06959v1 [cs.CV]
10. Bay, H., Ess, A., Tuytelaars, T., Gool, L.: Speeded-up robust features (surf). Comput. Vis.
Image Underst. 110(3), 346–359 (2008)
11. Beer, G.: Topologies on Closed and Closed Convex Sets. Kluwer Academic Publishers, The
Netherlands (1993)
12. Beer, G., Lucchetti, R.: Weak topologies for the closed subsets of a metrizable space. Trans.
Am. Math. Soc. 335(2), 805–822 (1993)
13. Belongie, S., Malik, J., Puzicha, J.: Matching shapes. In: Proceedings of the IEEE International
Conference on Computer Vision (ICCV2001), vol. 1, pp. 454–461. IEEE (2001). doi:10.1109/
ICCV.2001.937552
14. Ben-Artzi, G., Halperin, T., Werman, M., Peleg, S.: Trim: triangulating images for efficient
registration, pp. 1–13 (2016). arXiv:1605.06215v1 [cs.GR]
15. Benhamou, F., Goalard, F., Languenou, E., Christie, M.: Interval constraint solving for camera
control and motion planning. ACM Trans. Comput. Logic V(N), 1–35 (2003). http://tocl.acm.
org/accepted/goualard.pdf
16. Blashke, T., Burnett, C., Pekkarinen, A.: Luminaires. In: de Jong, S., van der Meer, F. (eds.)
Image Segmentation Methods for Object-Based Analysis and Classification, pp. 211–236.
Kluwer, Dordrecht (2004)
17. Borsuk, K.: Theory of Shape. Monografie Matematyczne, Tom 59. [Mathematical Mono-
graphs, vol. 59] PWN—Polish Scientific Publishers (1975). MR0418088, Based on K. Bor-
suk, Theory of Shape, Lecture Notes Series, vol. 28, Matematisk Institut, Aarhus Universitet,
Aarhus (1971). MR0293602
18. Borsuk, K., Dydak, J.: What is the theory of shape? Bull. Aust. Math. Soc. 22(2), 161–198
(1981). MR0598690
19. Bromiley, P., Thacker, N., Bouhova-Thacker, E.: Shannon entropy, Rényi’s entropy, and infor-
mation. Technical report, The University of Manchester, U.K. (2010). http://www.tina-vision.
net/docs/memos/2004-004.pdf
20. Broomhead, D., Huke, J., Muldoon, M.: Linear filters and non-linear systems. J. R Stat. Soc.
Ser. B (Methodol.) 54(2), 373–382 (1992)
21. Burger, W., Burge, M.: Digital Image Processing. An Algorithmic Introduction Using Java,
2nd edn, 811 pp. Springer, Berlin (2016). doi:10.1007/978-1-4471-6684-9
22. Burt, P., Adelson, E.: The Laplacian pyramid as a compact image code. IEEE Trans. Commun.
COM–31(4), 532–540 (1983)
23. Camastra, F., Vinciarelli, A.: Machine Learning for Audio, Image and Video Analysis, Xvi +
561 pp. Springer, Berlin (2015)
24. Canny, J.: Finding edges and lines in images. Master’s thesis, MIT, MIT Artificial Intelligence
Laboratory (1983). ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-720.pdf
25. Canny, J.: A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach.
Intell. 8, 679–698 (1986)
26. Chakerian, G.: A characterization of curves of constant width. Am. Math. Mon. 81(2), 153–
155 (1974)
27. Chan, M.: Tropical curves and metric graphs. Ph.D. thesis, University of California, Berkeley,
CA, USA (2012). Supervisor: B. Sturmfels
28. Chaudhury, K., Munoz-Barrutia, A., Unser, M.: Fast space-variant elliptical filtering using
box splines, pp. 1–42 (2011). arXiv:1003.2022v2
29. Chen, L.M.: Digital Functions and Data Reconstruction. Digital-Discrete Methods, Xix+207
pp. Springer, Berlin (2013). doi:10.1007/978-1-4614-5638-4
30. Christie, M., Olivier, P., Normand, J.M.: Camera control in computer graphics. Com-
put. Graph. Forum 27(8), 2197–2218 (2008). https://www.irisa.fr/mimetic/GENS/mchristi/
Publications/2008/CON08/870.pdf
31. Corke, P.: Robotics, Vision and Control. Springer, Berlin (2013). doi:10.1007/978-3-642-
20144-8
32. Danelljan, M., Häger, G., Khan, F., Felsberg, M.: Coloring channel representations for visual
tracking. In: Paulsen, R., Pedersen, K. (eds.) SCIA 2015. LNCS, vol. 9127, pp. 117–129.
Springer, Berlin (2015)
33. Delaunay, B.D.: Sur la sphère vide. Izvestia Akad. Nauk SSSR, Otdelenie Matematicheskii i
Estestvennyka Nauk 7, 793–800 (1934)
34. Deza, E., Deza, M.M.: Encyclopedia of Distances. Springer, Berlin (2009)
35. Dirichlet, G.: Über die reduktion der positiven quadratischen formen mit drei unbestimmten
ganzen Zahlen. Journal für die reine und angewandte Mathematik 40, 221–239 (1850). MR
36. Drucker, S.: Intelligent camera control for graphical environments. Ph.D. thesis, Massa-
chusetts Institute of Technology, Media Arts and Sciences (1994). http://research.microsoft.
com/pubs/68555/thesiswbmakrs.pdf. Supervisor: D. Zeltzer
37. Drucker, S.: Automatic conversion of natural language to 3d animation. Ph.D. thesis, Univer-
sity of Ulster, Faculty of Engineering (2006). http://www.paulmckevitt.com/phd/mathesis.
pdf. Supervisor: P. McKevitt
38. Du, Q., Faber, V., Gunzburger, M.: Centroidal voronoi tessellations: applications and algo-
rithms. SIAM Rev. 41(4), 637–676 (1999). MR1722997
39. Eckhardt, U., Latecki, L.J.: Topologies for the digital spaces Z2 and Z3 . Comput. Vis. Image
Underst. 90(3), 295–312 (2003)
40. Edelsbrunner, H.: Geometry and Topology of Mesh Generation, 209 pp. Cambridge University
Press, Cambridge (2001)
41. Edelsbrunner, H.: A Short Course in Computational Geometry and Topology, 110 pp. Springer,
Berlin (2014)
42. Edelsbrunner, H., Harer, J.: Computational Topology. An Introduction, Xii+110 pp. American
Mathematical Society, Providence (2010). MR2572029
43. Edelsbrunner, H., Kirkpatrick, D., Seidel, R.: On the shape of a set of points in the plane.
IEEE Trans. Inf. Theory IT-29(4), 551–559 (1983)
44. Eisemann, M., Klose, F., Magnor, M.: Towards plenoptic raumzeit reconstruction. In: Cremers,
D., Magnor, M., Oswald, M., Zelnik-Manor, L. (eds.) Video Processing and Computational
Video, pp. 1–24. Springer, Berlin (2011). doi:10.1007/978-3-642-24870-2
45. Escolano, F., Suau, P., Bonev, B.: Information Theory in Computer Vision and Pattern Recog-
nition. Springer, Berlin (2009)
46. Nielson, F. (ed.): Emerging Trends in Visual Computing, Xii+388 pp. Springer, Berlin (2008)
47. Fabbri, R., Kimia, B.: Multiview differential geometry of curves, pp. 1–34 (2016).
arXiv:1604.08256v1 [cs.CV]
48. Fabre, C.: Basics of quantum optics and cavity quantum electrodynamics. Lect. Notes Phys.
531, 1–37 (2007). doi:10.1007/BFb0104379
49. Favorskaya, M., Jain, L., Buryachenko, V.: Digital video stabilization in static and dynamic
situations. In: Favorskaya, M., Jain, L. (eds.) Intelligent Systems Reference, vol. 73, pp.
261–310. Springer, Berlin (2015)
50. Fechner, G.: Elemente der Psychophysik, 2 vols. E.J. Bonset, Amsterdam (1860)
51. Fontelos, M., Lecaros, R., López-Rios, J., Ortega, J.: Stationary shapes for 2-d water-waves
and hydraulic jumps. J. Math. Phys. 57(8), 081,520, 22 pp. (2016). MR3541857
52. Frank, N., Hart, S.: A dynamical system using the Voronoi tessellation. Am. Math. Mon.
117(2), 92–112 (2010)
53. Gardner, M.: On tessellating the plane with convex polygon tiles. Sci. Am. 116–119 (1975)
54. Gaur, S., Vajpai, J.: Comparison of edge detection techniques for segmenting car license
plates. Int. J. Comput. Appl. Electr. Inf. Commun. Eng. 5, 8–12 (2011)
55. Gersho, A., Gray, R.: Vector Quantization and Signal Compression. Kluwer Academic Pub-
lishers, Norwell (1992). ISBN: 0-7923-9181-0
56. Gersho, A., Gray, R.: Vector Quantization and Signal Compression, Xii + 732 pp. Kluwer
Academic Publishers, Boston (1992)
57. Gonzalez, R., Woods, R.: Digital Image Processing. Prentice-Hall, Upper Saddle River, NJ
07458 (2002). ISBN: 0-20-118075-8
58. Gonzalez, R., Woods, R.: Digital Image Processing, 3rd edn, Xxii + 954 pp. Pearson Prentice
Hall, Upper Saddle River (2008)
59. Gonzalez, R., Woods, R., Eddins, S.: Digital Image Processing Using Matlab® , Xiv + 609
pp. Pearson Prentice Hall, Upper Saddle River (2004)
60. Grauman, K., Shakhnarovich, G., Darrell, T.: Coloring channel representations for visual
tracking. In: Comaniciu, R.M.D.S.D., Kanatani, K. (eds.) Statistical Methods in Video
Processing (SMVP) 2004. LNCS, vol. 3247, pp. 26–37. Springer, Berlin (2004)
61. Gruber, P.: Convex and discrete geometry, Grundlehren der Mathematischen Wissenschaften,
vol. 336, Xiv+578 pp. Springer, Berlin (2007). MCS2000 52XX, 11HXX, ISBN: 978-3-540-
71132-2, MR2335496
62. Gruber, P.M., Wills, J.M. (eds.): Handbook of Convex Geometry. North-Holland, Amsterdam
(1993) vol. A: lxvi+735 pp.; vol. B: ilxvi and 7371438 pp. ISBN: 0-444-89598-1, MR1242973
63. Grünbaum, B., Shephard, G.: Tilings and Patterns, Xii+700 pp. W.H. Freeman and Co., New
York (1987). MR0857454
64. Grünbaum, B., Shepherd, G.: Tilings with congruent tiles. Bull. (New Ser.) Am. Math. Soc.
3(3), 951–973 (1980)
65. ter Haar Romeny, B.: Computer vision and mathematica. Comput. Vis. Sci. 5(1), 53–65 (2002).
MR1947476
66. Hall, E.: The Silent Language. Doubleday, Garden City (1959)
67. Halus̆ka, J.: On fields inspired with the polar HSV – RGB theory of colour, pp. 1–16 (2015).
arXiv:1512.01440v1 [math.HO]
68. Hanbury, A., Serra, J.: A 3d-polar coordinate colour representation suitable for image analy-
sis. Technical report, Vienna University of Technology (2003). http://cmm.ensmp.fr/~serra/
notes_internes_pdf/NI-230.pdf
69. Haralick, R.: Digital step edges from zero crossing of second directional derivatives. IEEE
Trans. Pattern Anal. Mach. Intell. PAMI-6(1), 58–68 (1984)
70. Haralick, R., Shapiro, L.: Computer and Robot Vision. Addison-Wesley, Reading (1993)
71. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of the 8th
Alvey Vision Conference, pp. 147–151 (1988)
72. Hartley, R.: Transmission of information. Bell Syst. Tech. J. 7, 535 (1928)
73. Hassanien, A., Abraham, A., Peters, J., Schaefer, G., Henry, C.: Rough sets and near sets
in medical imaging: a review. IEEE Trans. Info. Technol. Biomed. 13(6), 955–968 (2009).
doi:10.1109/TITB.2009.2017017
74. Hausdorff, F.: Grundzüge der Mengenlehre, Viii + 476 pp. Veit and Company, Leipzig (1914)
75. Hausdorff, F.: Set Theory, trans. by J.R. Aumann, 352 pp. AMS Chelsea Publishing, Provi-
dence (1957)
76. Henry, C.: Near sets: theory and applications. Ph.D. thesis, University of Manitoba, Depart-
ment of Electrical and Computer Engineering (2010). http://130.179.231.200/cilab/. Super-
visor: J.F. Peters
77. Henry, C.: Arthritic hand-finger movement similarity measurements: tolerance near set
approach. Comput. Math. Methods Med. 2011, 1–14 (2011). doi:10.1155/2011/569898
78. Herran, J.: Omnivis: 3d space and camera path reconstruction for omnidirectional vision. Mas-
ter’s thesis, Harvard University, Mathematics Department (2010). Supervisor: Oliver Knill
79. Hettiarachchi, R., Peters, J.: Voronoï region-based adaptive unsupervised color image seg-
mentation, pp. 1–2 (2016). arXiv:1604.00533v1 [cs.CV]
80. Hidding, J., van de Weygaert, R., Vegter, G., Jones, B.J., Teillaud, M.: The sticky geometry of the
cosmic web, pp. 1–2 (2012). arXiv:1205.1669v1 [astro-ph.CO]
81. Hlavac, V.: Fundamentals of image processing. In: Cristóbal, H.T.G., Schelkens, P. (eds.)
Optical and Digital Image Processing. Fundamentals and Applications, pp. 25–48. Wiley-
VCH, Weinheim (2011)
82. Hoggar, S.: Mathematics of Digital Images. Cambridge University Press, Cambridge (2006).
ISBN: 978-0-521-78029-2
83. Hoggar, S.: Mathematics of Digital Images, xxxii + 854 pp. Cambridge University Press, Cambridge (2006)
84. Holmes, R.: Mathematical foundations of signal processing. SIAM Rev. 21(3), 361–388
(1979)
85. Houit, T., Nielsen, F.: Video stippling, pp. 1–13 (2010). arXiv:1011.6049v1 [cs.GR]
86. Jacques, J., Braun, A., Soldera, J., Musse, S., Jung, C.: Understanding people in motion in video sequences using Voronoi diagrams. Pattern Anal. Appl. 10, 321–332 (2007). doi:10.1007/s10044-007-0070-1
87. Jähne, B.: Digital Image Processing, 6th revised, extended edn. Springer, Berlin (2005). ISBN:
978-3-540-24035-8 (Print) 978-3-540-27563-3 (Online)
88. Jarvis, R.: Computing the shape hull of points in the plane. In: Proceedings of the Computer
Science Conference on Pattern Recognition and Image Processing, pp. 231–241. IEEE (1977)
89. Joblove, G., Greenberg, D.: Color spaces for computer graphics. In: Proceedings of the 5th
Annual Conference on Computer Graphics and Interactive Techniques, pp. 20–25. Association
for Computing Machinery (1978)
90. Karimaa, A.: A survey of hardware accelerated methods for intelligent object recognition on camera. In: Świątek, J., Grzech, A., Świątek, P., Tomczak, J. (eds.) Advances in Systems Science, vol. 240, pp. 523–530. Springer, Berlin (2013)
91. Kay, D., Womble, E.: Axiomatic convexity theory and relationships between the Carathéodory, Helly and Radon numbers. Pac. J. Math. 38(2), 471–485 (1971)
92. Kim, I., Choi, H., Yi, K., Choi, J., Kong, S.: Intelligent visual surveillance: a survey. Int. J. Control Autom. Syst. 8(5), 926–939 (2010)
93. Kiy, K.: A new real-time method of contextual image description and its application in robot
navigation and intelligent control. In: Favorskaya, M., Jain, L. (eds.) Intelligent Systems
Reference, vol. 75, pp. 109–134. Springer, Berlin (2015)
94. Klette, R., Rosenfeld, A.: Digital Geometry. Geometric Methods for Digital Picture Analysis.
Morgan Kaufmann Publishers, Amsterdam (2004)
95. Knee, P.: Sparse representations for radar with MATLAB examples. Morgan & Claypool Publishers (2012). doi:10.2200/S0044ED1V01Y201208ASE010
96. Kohli, P., Torr, P.: Dynamic graph cuts and their applications in computer vision. In: Cipolla, R., Battiato, S., Farinella, G.M. (eds.) Computer Vision, pp. 51–108. Springer, Berlin (2010)
97. Kokkinos, I., Yuille, A.: Inference and learning with hierarchical shape models. Int. J. Comput. Vis. 93(2), 201–225 (2011). doi:10.1007/s11263-010-0398-7
98. Kong, T., Roscoe, A., Rosenfeld, A.: Concepts of digital topology. Special issue on digital
topology. Topol. Appl. 46(3), 219–262 (1992). Am. Math. Soc. MR1198732
99. Kong, T., Rosenfeld, A.: Topological Algorithms for Digital Image Processing. North-
Holland, Amsterdam (1996)
100. Krantz, S.: A Guide to Topology, ix + 107 pp. The Mathematical Association of America, Washington (2009)
101. Krantz, S.: Essentials of Topology with Applications, xvi + 404 pp. CRC Press, Boca Raton (2010). ISBN: 978-1-4200-8974-5. MR2554895
102. Kronheimer, E.: The topology of digital images. Special issue on digital topology. Topol.
Appl. 46(3), 279–303 (1992). MR1198735
103. Lai, R.: Computational differential geometry and intrinsic surface processing. Ph.D. thesis, University of California, Los Angeles, CA, USA (2010). Supervisors: T.F. Chan, P. Thompson, M. Green, L. Vese
104. Latecki, L.: Topological connectedness and 8-connectedness in digital pictures. Comput. Vis.
Graph. Image Process. 57, 261–262 (1993)
105. Latecki, L., Conrad, C., Gross, A.: Preserving topology by a digitization process. J. Math.
Imaging Vis. 8, 131–159 (1998)
106. Lee, D., Sallee, G.: A method of measuring shape. Geogr. Rev. 60(4), 555–563 (1970)
107. Leone, F., Nelson, L., Nottingham, R.: The folded normal distribution. Technometrics 3(4), 543–550 (1961). MR0130737
108. Leutenegger, S., Chli, M., Siegwart, R.: BRISK: binary robust invariant scalable keypoints. In: Proceedings of the 2011 IEEE International Conference on Computer Vision, pp. 2548–2555. IEEE (2011)
109. Li, L., Wang, F.Y.: Advanced Motion Control and Sensing for Intelligent Vehicles. Springer,
Berlin (2007)
110. Li, N.: Retrieving camera parameters from real video images. Master's thesis, The University of British Columbia, Computer Science (1998). http://www.iro.umontreal.ca/~poulin/fournier/theses/Li.msc.pdf
111. Li, Z.N., Drew, M., Liu, J.: Color in Image and Video. Springer, Berlin (2014). doi:10.1007/978-3-319-05290-8_4
112. Lin, Y.J., Xu, C.X., Fan, D., He, Y.: Constructing intrinsic Delaunay triangulations from the
dual of geodesic Voronoi diagrams, pp. 1–32 (2015). arXiv:1605.05590v2 [cs.CG]
113. Lindeberg, T.: Edge detection and ridge detection with automatic scale selection. Int. J. Comput. Vis. 30(2), 117–154 (1998)
114. Louban, R.: Image Processing of Edge and Surface Defects. Springer Series in Materials Science, vol. 123. Springer, Berlin (2009). See pp. 9–29 on edge detection
115. Lowe, D.: Object recognition from local scale-invariant features. In: Proceedings of the 7th IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157 (1999). doi:10.1109/ICCV.1999.790410
116. Lowe, D.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis.
60(2), 91–110 (2004). doi:10.1023/B:VISI.0000029664.99615.94
117. Maggi, F., Mihaila, C.: On the shape of capillarity droplets in a container. Calc. Var. Partial
Differ. Equ. 55(5), 122 (2016). MR3551302
118. Mahmoodi, S.: Scale-invariant filtering design and analysis for edge detection. Proc. R. Soc. A: Math. Phys. Eng. Sci. 467(2130), 1719–1738 (2011)
119. Mani-Levitska, P.: Characterizations of convex sets. Handbook of Convex Geometry, vol. A,
B, pp. 19–41. North-Holland, Amsterdam (1993). MR1242975
120. Marr, D., Hildreth, E.: Theory of edge detection. Proc. R. Soc. Lond. Ser. B 207(1167),
187–217 (1980)
121. Mery, D., Rueda, L. (eds.): Advances in Image and Video Technology, xviii + 959 pp. Springer, Berlin (2007)
122. Michelson, A.: Studies in Optics. Dover, New York (1995)
123. Milnor, J.: Topology through the centuries: low dimensional manifolds. Bull. (New Ser.) Am.
Math. Soc. 52(4), 545–584 (2015)
124. Gavrilova, M.L. (ed.): Generalized Voronoi Diagrams: A Geometry-Based Approach to Computational Intelligence, xv + 304 pp. Springer, Berlin (2008)
125. Moeslund, T.: Introduction to Video and Image Processing. Building Real Systems and Applications, xi + 227 pp. Springer, Heidelberg (2012)
126. Munkres, J.: Topology, 2nd edn., xvi + 537 pp. Prentice-Hall, Englewood Cliffs (2000); 1st edn. in 1975. MR0464128
127. Munsell, A.: A Color Notation. G. H. Ellis Company, Boston (1905)
128. Naimpally, S., Peters, J.: Topology with Applications. Topological Spaces via Near and Far, xv + 277 pp. World Scientific, Singapore (2013). American Mathematical Society. MR3075111
129. Nyquist, H.: Certain factors affecting telegraph speed. Bell Syst. Tech. J. 3, 324–346 (1924)
130. Olive, D.: Algebras, lattices and strings. In: Unification of Fundamental Interactions (Stockholm, 1986), pp. 19–25. Proc. R. Swed. Acad. Sci., Stockholm (1987). MR0931580
131. Olive, D.: Loop algebras, QFT and strings. In: Strings and Superstrings (Madrid, 1987), pp. 217–258. World Scientific Publishing, Teaneck (1988). MR1022259
132. Olive, D., Landsberg, P.: Introduction to string theory: its structure and its uses. Physics and mathematics of strings. Philos. Trans. R. Soc. Lond. Ser. A 329, 319–328 (1989). MR1043892
133. Opelt, A., Pinz, A., Zisserman, A.: Learning an alphabet of shape and appearance for multi-class object detection. Int. J. Comput. Vis. 80(1), 16–44 (2008). doi:10.1007/s11263-008-0139-3
134. Orszag, M.: Quantum Optics. Including Noise Reduction, Trapped Ions, Quantum Trajectories, and Decoherence. Springer, Berlin (2016). doi:10.1007/978-3-319-29037-9
135. Ortiz, A., Oliver, G.: Detection of colour channels uncoupling for curvature-insensitive segmentation. In: Perales, F.J., et al. (eds.) IbPRIA 2003. LNCS, vol. 2652, pp. 664–672. Springer, Berlin (2003)
136. Over, E., Hooge, I., Erkelens, C.: A quantitative method for the uniformity of fixation density:
the Voronoi method. Behav. Res. Methods 38(2), 251–261 (2006)
137. Pal, S., Peters, J.: Rough Fuzzy Image Analysis. Foundations and Methodologies. CRC Press, Taylor & Francis Group, London (2010). ISBN-13: 9781439803295, ISBN-10: 1439803293
138. Paragios, N., Chen, Y., Faugeras, O.: Handbook of Mathematical Models in Computer Vision.
Springer, Berlin (2006)
139. Perona, P., Malik, J.: Scale-space and edge detection using anisotropic diffusion. IEEE Trans.
Pattern Anal. Mach. Intell. 12(7), 629–639 (1990)
140. Peters, J.: Proximal Delaunay triangulation regions, pp. 1–4 (2014). arXiv:1411.6260 [math.MG]
141. Peters, J.: Proximal Voronoï regions, pp. 1–4 (2014). arXiv:1411.3570 [math.MG]
142. Peters, J.: Topology of Digital Images - Visual Pattern Discovery in Proximity Spaces. Intelligent Systems Reference Library, vol. 63, xv + 411 pp. Springer, Berlin (2014). Zbl 1295.68010
143. Peters, J.: Proximal Voronoï regions, convex polygons, & Leader uniform topology. Adv.
Math. 4(1), 1–5 (2015)
144. Peters, J.: Computational Proximity. Excursions in the Topology of Digital Images. Intelligent Systems Reference Library, vol. 102, viii + 445 pp. Springer, Berlin (2016). doi:10.1007/978-3-319-30262-1
145. Peters, J.: Two forms of proximal physical geometry: axioms, sewing regions together, classes of regions, duality, and parallel fibre bundles, pp. 1–26 (2016). To appear in Adv. Math.: Sci. J., vol. 5 (2016). arXiv:1608.06208
146. Peters, J., Guadagni, C.: Strong proximities on smooth manifolds and Voronoi diagrams. Adv.
Math. Sci. J. 4(2), 91–107 (2015). Zbl 1339.54020
147. Peters, J., İnan, E.: Rényi entropy in measuring information levels in Voronoï tessellation
cells with application in digital image analysis. Theory Appl. Math. Comput. Sci. 6(1), 77–95
(2016). MR3484085
148. Peters, J., İnan, E.: Strongly proximal Edelsbrunner-Harer nerves. Proc. Jangjeon Math. Soc.
19(2), 563–582 (2016)
149. Peters, J., İnan, E.: Strongly proximal Edelsbrunner-Harer nerves in Voronoï tessellations.
Proc. Jangjeon Math. Soc. 19(3), 563–582 (2016). arXiv:1604.05249v1
150. Peters, J., İnan, E.: Strongly proximal Edelsbrunner-Harer nerves in Voronoï tessellations, pp.
1–10 (2016). arXiv:1605.02987v3
151. Peters, J., Naimpally, S.: Applications of near sets. Notices Am. Math. Soc. 59(4), 536–542 (2012). doi:10.1090/noti817. MR2951956
152. Peters, J., Puzio, L.: Image analysis with anisotropic wavelet-based nearness measures. Int.
J. Comput. Intell. Syst. 2(3), 168–183 (2009). doi:10.1016/j.ins.2009.04.018
153. Peters, J., Tozzi, A., İnan, E., Ramanna, S.: Entropy in primary sensory areas lower than in
associative ones: the brain lies in higher dimensions than the environment. bioRxiv 071977,
1–12 (2016). doi:10.1101/071977
154. Poincaré, H.: La Science et l'Hypothèse. Ernest Flammarion, Paris (1902). Later ed., Champs sciences, Flammarion, 1968 & Science and Hypothesis, trans. by J. Larmor, Walter Scott Publishing, London, 1905; cf. Mead Project at Brock University. http://www.brocku.ca/MeadProject/Poincare/Larmor_1905_01.html
155. Poincaré, H.: L'espace et la géométrie. Revue de métaphysique et de morale 3, 631–646 (1895)
156. Poincaré, H.: Sur la nature du raisonnement mathématique. Revue de métaphysique et de morale 2, 371–384 (1894)
157. Pottmann, H., Wallner, J.: Computational Line Geometry. Springer, Berlin (2010). doi:10.1007/978-3-642-04018-4. MR2590236
158. Preparata, F.: Convex hulls of finite sets of points in two and three dimensions. Commun. Assoc. Comput. Mach. 20(2), 87–93 (1977)
159. Preparata, F.: Steps into computational geometry. Technical report, Coordinated Science Lab-
oratory, University of Illinois (1977)
160. Prewitt, J.: Object Enhancement and Extraction. Picture Processing and Psychopictorics.
Academic Press, New York (1970)
161. Prince, S.: Computer Vision. Models, Learning, and Inference, xvii + 580 pp. Cambridge University Press, Cambridge (2012)
162. Pták, P., Kropatsch, W.: Nearness in digital images and proximity spaces. In: Proceedings of the 9th International Conference on Discrete Geometry. LNCS, vol. 1953, pp. 69–77 (2000)
163. Ramakrishnan, S., Rose, K., Gersho, A.: Constrained-storage vector quantization with a universal codebook. IEEE Trans. Image Process. 7(6), 785–793 (1998). MR1667391
164. Rényi, A.: On measures of entropy and information. In: Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 547–561. University of California Press, Berkeley (1961). MR0132570
165. Rhodin, H., Richart, C., Casas, D., Insafutdinov, E., Shafiei, M., Seidel, H.P., Schiele, B., Theobalt, C.: EgoCap: egocentric marker-less motion capture with two fisheye cameras, pp. 1–11 (2016). arXiv:1609.07306v1 [cs.CV]
166. Roberts, L.: Machine perception of three-dimensional solids. In: Tippett, J. (ed.) Optical and
Electro-Optical Information Processing. MIT Press, Cambridge (1965)
167. Robinson, M.: Topological Signal Processing, xvi + 208 pp. Springer, Heidelberg (2014). ISBN: 978-3-642-36103-6. doi:10.1007/978-3-642-36104-3. MR3157249
168. Rosenfeld, A.: Distance functions on digital pictures. Pattern Recognit. 1(1), 33–61 (1968)
169. Rosenfeld, A.: Digital Picture Analysis, xi + 351 pp. Springer, Berlin (1976)
170. Rosenfeld, A.: Digital topology. Am. Math. Mon. 86(8), 621–630 (1979). Am. Math. Soc.
MR0546174
171. Rosenfeld, A., Kak, A.: Digital Picture Processing, vol. 1, xii + 457 pp. Academic Press, New York (1976)
172. Rosenfeld, A., Kak, A.: Digital Picture Processing, vol. 2, xii + 349 pp. Academic Press, New York (1982)
173. Rowland, T., Weisstein, E.: Continuous. Wolfram MathWorld (2016). http://mathworld.wolfram.com/Continuous.html
174. Ruhrberg, K.: Seurat and the neo-impressionists. In: Art in the 20th Century, pp. 25–48. Benedikt Taschen Verlag, Köln (1998)
175. Shamos, M.: Computational geometry. Ph.D. thesis, Yale University, New Haven, Connecticut, USA (1978). Supervisors: D. Dobkin, S. Eisenstat, M. Schultz
176. Sharma, O.: A methodology for raster to vector conversion of colour scanned maps. Master's thesis, University of New Brunswick, Department of Geomatics Engineering (2006). http://www2.unb.ca/gge/Pubs/TR240.pdf
177. Shimizu, Y., Zhang, Z., Batres, R.: Frontiers in Computing Technologies for Manufacturing
Applications. Springer, London (2007). ISBN: 978-1-84628-954-5
178. Slotboom, B.: Characterization of gap-discontinuities in microstrip structures, used for optoelectronic microwave switching. Master's thesis, Technische Universiteit Eindhoven (1992). Supervisor: G. Brussaard. http://alexandria.tue.nl/extra1/afstversl/E/394119.pdf
179. Smith, A.: A pixel is not a little square (and a voxel is not a little cube). Technical Memo 6, Microsoft (1995). http://alvyray.com/Memos/CG/Microsoft/6_pixel.pdf
180. Sobel, I.: Camera models and perception. Ph.D. thesis, Stanford University, Stanford (1970)
181. Sobel, I.: An isotropic 3×3 gradient operator. In: Freeman, H. (ed.) Machine Vision for Three-Dimensional Scenes, pp. 376–379. Academic Press, New York (1990)
182. Solan, V.: Introduction to the axiomatic theory of convexity [Russian with English and French
Summaries], 224 pp. Shtiintsa, Kishinev (1984). MR0779643
183. Solomon, C., Breckon, T.: Fundamentals of Digital Image Processing. A Practical Approach with Examples in MATLAB, x + 328 pp. Wiley-Blackwell, Oxford (2011)
184. Sonka, M., Hlavac, V., Boyle, R.: Image Processing, Analysis and Machine Vision. Springer,
Berlin (1993). doi:10.1007/978-1-4899-3216-7
185. Sonka, M., Hlavac, V., Boyle, R.: Image Processing, Analysis, and Machine Vision, 829 pp. Cengage Learning, Stamford (2008). ISBN-13: 978-0-495-24438-7
186. Stahl, S.: The evolution of the normal distribution. Math. Mag. 79(2), 96–113 (2006).
MR2213297
187. Stijns, E., Thienpont, H.: Fundamentals of photonics. In: Cristóbal, G., Schelkens, P., Thienpont, H. (eds.) Optical and Digital Image Processing, pp. 25–48. Wiley, Weinheim (2011). ISBN: 978-3-527-40956-3
188. Stijns, E., Thienpont, H.: Fundamentals of photonics. In: Cristóbal, G., Schelkens, P., Thienpont, H. (eds.) Optical and Digital Image Processing. Fundamentals and Applications, pp. 25–48. Wiley-VCH, Weinheim (2011)
189. Sya, S., Prihatmanto, A.: Design and implementation of image processing system for Lumen social robot-humanoid as an exhibition guide for Electrical Engineering Days, pp. 1–10 (2015). arXiv:1607.04760
190. Szeliski, R.: Computer Vision. Algorithms and Applications, xx + 812 pp. Springer, Berlin (2011)
191. Takita, K., Muquit, M., Aoki, T., Higuchi, T.: A sub-pixel correspondence search technique for computer vision applications. IEICE Trans. Fundam. E87-A(8), 1913–1923 (2004). http://www.aoki.ecei.tohoku.ac.jp/research/docs/e87-a_8_1913.pdf
192. Tekdas, O., Karnad, N.: Recognizing characters in natural scenes. A feature study. CSci 5521 Pattern Recognition, University of Minnesota, Twin Cities (2009). http://rsn.cs.umn.edu/images/5/54/Csci5521report.pdf
193. Thivakaran, T., Chandrasekaran, R.: Nonlinear filter based image denoising using AMF
approach. Int. J. Comput. Sci. Inf. Secur. 7(2), 224–227 (2010)
194. Tomasi, C.: CS 223B: Introduction to computer vision. MATLAB and images. Technical report, Stanford University (2014). http://www.umiacs.umd.edu/~ramani/cmsc828d/matlab.pdf
195. Topp, J.: Geodetic line, middle and total graphs. Mathematica Slovaca 40(1), 3–9 (1990). https://www.researchgate.net/publication/265573026_Geodetic_line_middle_and_total_graphs
196. Toussaint, G.: Computational geometry and morphology. In: Proceedings of the First International Symposium for Science on Form, pp. 395–403. Reidel, Dordrecht (1987). MR0957140
197. Tuz, V.: Axiomatic convexity theory [Russian]. Rossiïskaya Akademiya Nauk. Matematicheskie Zametki [Math. Notes] 20(5), 761–770 (1976)
198. Vakil, R.: The mathematics of doodling. Am. Math. Mon. 118(2), 116–129 (2011)
199. Valente, L., Clua, E., Silva, A., Feijó, R.: Live-action virtual reality games, pp. 1–10 (2016).
arXiv:1601.01645v1 [cs.HC]
200. Vlieg, E.: Scratch by Example. Apress, Berlin (2016). doi:10.1007/978-1-4842-1946-1_10.
ISBN: 978-1-4842-1945-4
201. Voronoï, G.: Sur une fonction transcendante et ses applications à la sommation de quelques séries. Ann. Sci. École Norm. Sup. 21(3) (1904)
202. Voronoï, G.: Nouvelles applications des paramètres continus à la théorie des formes quadra-
tiques. J. für die reine und angewandte Math. 133, 97–178 (1907). JFM 38.0261.01
203. Voronoï, G.: Nouvelles applications des paramètres continus à la théorie des formes quadra-
tiques. J. für die reine und angewandte Math. 134, 198–287 (1908). JFM 39.0274.01
204. Wang, Z., Bovik, A.: A universal image quality index. IEEE Signal Process. Lett. 9(3), 81–84
(2002). doi:10.1109/97.995823
205. Wang, Z., Bovik, A., Sheikh, H., Simoncelli, E.: Image quality assessment: from error visibility
to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
206. Wegert, E.: Visual Complex Functions. An Introduction to Phase Portraits, xiv + 359 pp. Birkhäuser, Freiburg (2012). doi:10.1007/978-3-0348-0180-5
207. Weisstein, E.: Regular polygon. Wolfram MathWorld (2016). http://mathworld.wolfram.com/RegularPolygon.html
208. Weisstein, E.: Wavelet. Wolfram MathWorld (2016). http://mathworld.wolfram.com/Wavelet.html
209. Wen, B.J.: Luminance meter. In: Luo, M. (ed.) Encyclopedia of Color Science and Technology,
pp. 824–886. Springer, New York (2016). doi:10.1007/978-1-4419-8071-7
210. Wildberger, N.: Algebraic topology: a beginner's course. University of New South Wales (2010). https://www.youtube.com/watch?v=Ap2c1dPyIVo&index=40&list=PL6763F57A61FE6FE8
211. Wirjadi, O.: Models and algorithms for image-based analysis of microstructures. Ph.D. thesis,
Technische Universität Kaiserslautern, Kaiserslautern, Germany (2009). Supervisor: K. Berns
212. Witkin, A.: Scale-space filtering. In: Proceedings of the 8th International Joint Conference
on Artificial Intelligence, pp. 1019–1022. Karlsruhe, Germany (1983)
213. Xu, L., Zhang, X.C., Auston, D.: Terahertz beam generation by femtosecond optical pulses in electro-optic materials. Appl. Phys. Lett. 61(15), 1784–1786 (1992)
214. Yung, C., Choi, G.T., Chen, K., Lui, L.: TRIM: triangulating images for efficient registration, pp. 1–13 (2016). arXiv:1605.06215v1 [cs.GR]
215. Zadeh, L.: Theory of filtering. J. Soc. Ind. Appl. Math. 1(1), 35–51 (1953)
216. Zelins’kyi, Y.: Generalized convex envelopes of sets and the problem of shadow. J. Math. Sci.
211(5), 710–717 (2015)
217. Zhang, X., Brainard, D.: Estimation of saturated pixel values in digital color imaging. J. Opt. Soc. Am. A 21(12), 2301–2310 (2004). http://color.psych.upenn.edu/brainard/papers/Zhang_Brainard_04.pdf
218. Zhang, Z.: Affine camera. In: Ikeuchi, K. (ed.) Computer Vision. A Reference Guide, pp. 19–20. Springer, Berlin (2014)
219. Zhao, B., Xing, E.: Sparse output coding for scalable visual recognition. Int. J. Comput. Vis.
119, 60–75 (2016). doi:10.1007/s11263-015-0839-4
220. Ziegler, G.: Lectures on Polytopes. Springer, Berlin (2007). doi:10.1007/978-1-4613-8431-1
Author Index
A
Abraham, A., 146
Adelson, E.H., 249, 250
Ahmad, M.Z., 395
A-iyeh, E., 278, 392
Akramaullah, S., 376
Allili, M.S., 253
Aoki, T., 61
Apostol, T.M., 395
Auston, D., 33

B
Barclay, D., 229, 349
Baroffio, L., 383
Batres, R., 251
Bay, H., 398
Beer, G., 188
Beer, G.M., 372
Belongie, S., 230
Ben-Artzi, G., 4
Benhamou, F., 60
Bonev, B., 7
Borsuk, K., 253
Bouhova-Thacker, E., 278
Bovik, A.C., 398, 399
Boyle, R., 30, 135, 172, 381
Brainard, D., 62
Bromiley, P.A., 278
Broomhead, D.S., 145
Burge, M.J., 130, 243
Burger, W., 130, 243
Burt, P.J., 249, 250
Buryachenko, V., 7

C
Canny, J., 102
Canny, J.F., 145, 181
Casas, D., 367
Cesana, M., 383
Chakerian, G.D., 395
Chan, M., 246
Chandrasekaran, R.M., 146
Chen, K., 3
Chen, Li M., 256
Chen, Y., 7
Choi, G.P.-T., 3
Choi, H.S., 60
Choi, J.Y., 60
Christie, M., 60
Clua, E., 400
Confucius, 100
Conrad, C., 9, 246
Corke, P., 391
Cross, B., 212

D
Danelljan, M., 90
Delaunay, B.N., 200
Deza, E., 50
Deza, M.M., 50
Dirichlet, G., 188
Drew, M.S., 378
Drucker, S.M., 60
Du, Q., 9, 246, 364
Dydak, J., 253

E
Eckhardt, U., 9, 246
Eddins, S.L., 30
Edelsbrunner, H., 40, 187, 188, 200, 201, 249, 262, 284, 372, 389
Eisemann, M., 395
Escolano, F., 7
Ess, A., 398

F
Fabbri, R., 383
Faber, V., 9, 246, 364
Fabre, C., 393
Fan, D., 6
Faugeras, O., 7
Favorskaya, M.N., 7
Fedoruk, K., 368
Feijó, R., 400
Felsberg, M., 90
Fontelos, M.A., 253
Frank, N., 188

G
Gardner, M., 9, 246
Gavrilova, M.L., 9, 246
Goalard, F., 60
Gonzalez, R.C., 30
Gool, L.V., 398
Greenberg, D., 382
Gross, A., 9, 246
Gruber, P.M., 371
Grünbaum, B., 9, 78, 246
Guadagni, C., 5
Gunzburger, M., 9, 246, 364

H
Häger, G., 90
Halperin, T., 4
Haluška, J., 382, 394
Hanbury, A., 363, 382
Haralick, R., 169, 176
Haralick, R.M., 27
Harer, J.L., 249, 262, 284, 372, 389
Harris, C., 184
Hart, S., 188
Hartley, R.V.L., 278
Hassanien, A., 146
Hausdorff, F., 397
Henry, C.J., 146
Herran, J.L.R., 256
Hettiarachchi, R., 5, 361
He, Y., 6
Hidding, J., 6
Higuchi, T., 61
Hildreth, E., 147, 178
Hlavac, V., 30, 135, 172, 381
Hoggar, S.G., 389
Holmes, R.B., 145
Houit, T., 5
Huke, J.P., 145

I
İnan, E., 220, 249, 262, 278, 384, 389
Insafutdinov, E., 367

J
Jain, L.C., 7
Jarvis, R.A., 372
Joblove, G.H., 382
Jones, B.J.T., 6

K
Kak, A., 9, 246
Karimaa, A., 60
Kay, D.C., 376
Khan, F.S., 90
Kim, I.S., 60
Kimia, B.B., 383
Kirkpatrick, D.G., 372
Kishino, F., 400
Klette, R., 9, 46, 242, 246
Klose, F., 395
Knee, P., 249, 250
Kohli, P., 7
Kokkinos, I., 253
Kong, S.G., 60
Kong, T., 9, 246
Krantz, S., 381
Krantz, S.G., 398
Kronheimer, E., 9, 246
Kropatsch, W., 146

L
Lai, R., 9, 246
Landsberg, P.T., 287
Languenou, E., 60
Latecki, L., 9, 246
Lecaros, R., 253
Lee, D.R., 395
Li, L., 60
Lin, Y.-J., 6
Lindeberg, T., 168
Liu, J., 378
Li, Z.-N., 378
López-Rios, J.C., 253
V
Valente, L., 400
van Bommel, W., 364
van de Weygaert, R., 6
Vegter, G., 6
Voronoï, G., 9, 187, 246

W
Wang, F.-Y., 60
Wang, Z., 398, 399
Weeks, J., 395
Wegert, E., 366
Weisstein, E.W., 393
Wen, B.-J., 364

Y
Yi, K.M., 60
Yuille, A., 253
Yung, C.P., 3

Z
Zadeh, L.A., 145
Zelins'kyi, Y.B., 376
Zhang, X.C., 33, 62
Zhang, Z., 7, 251
Zhao, B., 250
Ziegler, G., 372
Ziegler, G.M., 391
Ziou, D., 253
Zisserman, A., 253
Subject Index
Symbols
A^c, 367
CIE, 393
G(x, y, σ), 172
MNTC, 206
MNptC, 209
N(p, ε), 388
N_4(p), 45
N_24(p), 49
N_8(p), 46
RGB, 393
SIFT, 248
SURF, 248
THz, 33
V(S), 378, 392
V_p, 7
∠((x, y), (a, b)), 58
bdy A, 361, 362
re A, 362
Nbhd, 256
i, 382
    imaginary number, 382
x, y, 399
z, 366, 401
Conv A, 374
Convh A, 372
∂f(x, y)/∂x, 256
∂f(x, y)/∂y, 256
∂f/∂x, 58
γ, 122, 123
C, 366
N, 19
N_0^+, 27
R, 50
R^1, 50
R^2, 50, 56
R^3, 50, 56
R^n, 50
IP, 264
S1P, 264
S2P, 264
S3P, 264
μ, 163, 329
μ_x, μ_y, 398, 399
∇f, 58
‖x − y‖, 7, 50
‖x − y‖_2, 50
‖x‖, 50
σ, 163, 172, 329
σ^2, 163, 329
σ_x, σ_y, 398, 399
d_taxi, 51
h, 391
i, 366
img(:, :), 91
img(:, :, k), k = 1, 2, 3, 91
nm, 33
x · y, 57
z
    complex number, 401

.png, 25
    history, 25
1D kernel Gaussian, 163, 327
    definition, 163
    plot, 327
24-neighbourhood, 50
2D kernel Gaussian, 329
    plot, 329
2D pixel, 256
    edge strength, 249
    gradient magnitude, 249
    partial derivative, 256
3D pixel, 249
    RGB color space, 393
Class of shapes, 253
    features, 253
    representative, 253
Closed half space, 364
    lower, 365
    upper, 365
Closed set
    closed half space, 364
Cluster, 5, 12
    image regions, 5
    k-means, 5
    maximal nucleus, 12
    MNC, 12
    nucleus, 12
    shape, 12
Coarse contour, 285
    definition, 285
Color space, 245, 355, 356
    HSB, 355
    HSV, 245, 355
    LAB, 245, 355, 356
    LAB definition, 245
    RGB, 355
Colour, 27, 363, 382, 394, 400
    brightness, 363
    false colour, 27
    hue, 382
    saturation, 394
    true colour, 27
    value, 400
Colour channel
    img(:, :, k), k = 1, 2, 3, 91
    edge detection, 100
    filter transmittance, 90
    log-based, 109
    recombined, 92
    separate, 90
    separated, 92
Colour image, 93, 310
    → greyscale, 93
    greyscale conversion, 310
    intensities plot, 310
Colour pixel, 90
    applications, 90
    blue channel, 90
    brightness, 90
    green channel, 90
    intensity, 90
    object tracking, 90
    red channel, 90
    segmentation, 90
    value, 90
Colour pixel intensity, 320, 325, 326
    3D mesh isolines plot, 325
    3D mesh plot, 325
    isolines, 326
    isolines labels, 326
    log-modified pixels, 320
Colour space, 12
    HSB, 12
    HSV, 12
    RGB, 12
Compact, 366
    picture, 367
Compact set, 366
    definition, 366
Complex plane, 366
    visual perspective, 366
Computational geometry, 7–9, 374
    basic approach, 8
    definition, 9
    Delaunay triangle, 7
    Delaunay triangulation, 7, 10
    image object shape, 10
    lines, 9
    object shape, 10
    site, 7
    structures, 8
    Voronoï diagram, 374
    Voronoï regions, 7
    Voronoï tessellation, 10
Computational photography, 367
    CPh, 367
    definition, 367
Computational topology, 284
    algorithms, 284
    application, 284
    geometry, 284
    three topics, 284
    topology, 284
Computer vision, 7, 250, 253, 284, 329, 367, 368
    algorithms, 284
    applied computational topology, 284
    arXiv, 367
    definition, 367
    field of view, 368
    geometry, 284
    human eye, 367
    image object, 253
    motion capture, 367
    object class recognition, 253
    problems, 7
    robot navigation, 7
    shape detection, 253
    nucleus, 70
Delaunay triangle, 68
Delaunay triangulation, 10, 67
    benefit, 10
Digital image, 12
    .bmp, 25
    .gif, 25
    .jpg, 25
    .png, 25
    .svg, 25
    .tif, 25
    angles, 7
    background, 116
    basic content, 7
    binary, 12
    colour, 12
    colour channels, 118
    definition, 12
    Euclidean space, 8
    foreground, 116
    formats, 25
    geometry, 7
    greyscale, 12
    noise, 153
    patterns, 8
    pixels, 7
    set of point samples, 15
    structures, 8
    thresholding, 116
    vector space, 8
Digital topology, 9, 246
    digital geometry, 9, 246
    Rosenfeld, 9, 246
Digital video, 376
Digital visual space, 17
Dimension, 377
    2D Euclidean space, 377
    2D Riemann space, 377
    definition, 377
Disconnected set, 368
    definition, 368
Discrete, 172, 377
    definition, 172, 377
Distance, 50, 218, 397
    between pixels, 51
    Euclidean, 7, 50
    Hausdorff, 218, 397
    Manhattan, 51
    similarity, 218, 397
    taxicab, 51
Dot product, 57, 147
Dynamic range, 120

E
Edge detection, 162, 178, 179
    anisotropic, 179
    Canny, 102, 181
    colour channel, 102
    greyscale image, 102
    isotropic, 178
    Laplacian, 164
    Prewitt, 164
    Roberts, 164
    Sobel, 164
    Zero cross, 164
Edge pixel, 101, 243, 262
    colour channel, 102
    colour channel edges, 105
    combined channel edges, 105
    gradient angle, 101
    gradient orientation, 101
    strength, 262
Edge pixel strength, 249
    2D, 249
    3D, 249
Edgelet, 230, 262, 285, 288, 349, 394
    coarse, 285
    connected, 288
    contour, 262
    definition, 230, 288, 394
    fine, 285
    measurements, 349
    MNC, 262
    perimeter, 288
Edges, 318
    Canny, 318
    Canny binary display, 318
    Canny green channel display, 318
    Canny red channel display, 318
    Canny red on blue display, 318
    Canny RGB display, 318
Entropy, 278
    information level, 278
    MNC, 278
    nonMNC, 278
    Rényi, 278
Epipolar line, 4
    definition, 4
Epipolar plane, 4
    definition, 4
Epipole, 4
    definition, 4
Euclidean norm, 7
Euclidean space, 50
    3-space, 50
    distance, 50
    edge, 100
    edge strength, 247
    edge strength definition, 247
    false colour, 48
    geometric square, 15
    gradient, 61
    gradient magnitude, 247
    gradient orientation, 81
    grey tone, 12
    greyscale, 90
    information content, 112
    inspect, 88
    intensity, 18, 90, 93, 391
    intensity image, 27
    neighbourhood, 49
    point sample, 13, 15
    raster image, 14
    selection, 100, 317
    sub-pixel, 15, 61
    value, 90
Pixel edge strength, 247
    definition, 247
    gradient magnitude, 247
Pixel gradient, 245, 355
    x-direction magnitude, 245, 355
    y-direction magnitude, 245, 355
    orientation, 245, 355
Pixel intensity, 98, 126
    not of, 112
    complement, 112
    log-based, 109
    max, 98
    source of generating points, 126
Planck's constant, 391
Plane figure, 78
Plane tiling, 78
Plot, 172
    continuous, 172
    discrete, 172
Pointillism, 52
    false colour, 52
    French pointillisme, 52
Polytope, 391
    convex hull, 391
Pure geometric nerve, 69
    definition, 69
Pyramid scheme, 250, 351
    definition, 250
    expansion, 250, 351
    Gaussian, 250, 351
    reduction, 250, 351

Q
Quality, 392
    contour shape, 393
    Global Voronoï Mesh Quality Index, 392
    Voronoï region, 392
Quantum optics, 393
    definition, 393

R
Raster image, 14, 88
    aliasing, 14
    jaggies, 14
    pixel, 88
    tile, 88
Raster image technology, 88
    origin, 88
Real-time, 305
    video frame tessellation, 305
    video frame tiling, 305
Real-Time Video Processing
    basic steps, 83
Region, 204, 246
    centroid, 204
    image, 204
    Voronoï, 246
Region-based approach, 244
    binarizing images, 244
    isodata thresholding, 244
    non-maximum suppression, 244
    Otsu's method, 244
    watershed segmentation, 244
Region of interest, 151
Rényi entropy, 278
    definition, 278
    image quality, 281
    information level, 278
    information order, 279
    MNC, 278, 279
    nonMNC, 278, 279
RGB, 393
    wavelengths, 393
RGB colour space, 30
Riemann surface, 366, 393
    complex numbers, 366
    set, 366
Rule, 58, 60
    pixel angle, 60
    vector pair angle, 58

S
Saturation, 394