01 Introduction To Digital Image Processing
Univerzitet u Sarajevu
Contents
Digital images
Image quality
Basic image operations
Multiscale image processing
Section 1.
Digital images
Introduction
When too few gray values are used, contouring appears. The image is reduced to an
artificial looking height map. How many gray values are needed to produce a continuous
looking image?
Assume that n + 1 gray values are displayed with corresponding physical intensities I0, I1, ..., In, where I0 is the lowest attainable intensity and In the maximum intensity. The ratio In/I0 is called the dynamic range.
The human eye cannot distinguish consecutive intensities if they differ by less than 1%. For a
dynamic range of 100 the required number of gray values is 463 and a dynamic range of
1000 requires 694 different gray values for a continuous looking brightness.
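These numbers follow directly from the 1% rule. Chaining n intensity steps of 1% from I0 up to In gives

\[
(1.01)^n = \frac{I_n}{I_0}
\quad\Longrightarrow\quad
n = \frac{\ln(I_n/I_0)}{\ln 1.01},
\]

which evaluates to approximately 463 for a dynamic range of 100 and approximately 694 for a dynamic range of 1000.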
Most digital medical images today use 4096 gray values (12 bits per pixel).
In the process of digital imaging, the continuous looking world has to be captured onto the
finite number of pixels of the image grid. The conversion from a continuous function to a
discrete function, retaining only the values at the grid points, is called sampling.
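As a minimal sketch of sampling, the continuous world can be modeled by an analytic function that is evaluated only at the grid points; the function f and the grid pitch below are arbitrary illustrative choices.

import numpy as np

# Toy "continuous" scene, modeled as an analytic function (an assumption).
def f(x, y):
    return np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)

pitch = 0.01                      # distance between grid points (arbitrary)
xs = np.arange(0.0, 1.0, pitch)   # sample positions along x
ys = np.arange(0.0, 1.0, pitch)   # sample positions along y
X, Y = np.meshgrid(xs, ys)
image = f(X, Y)                   # sampling: only the values at the grid points are kept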
The gray value histogram h(ν) gives the fraction of the Npx pixels in the image that have gray value ν:

\[
h(\nu) = \frac{N_{px}(\nu)}{N_{px}}
\]
Fig. 2: Histogram of mammography images
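A histogram such as the one in Fig. 2 follows directly from this definition. A minimal sketch in Python, assuming the image is an array of non-negative integer gray values:

import numpy as np

def gray_value_histogram(image, n_levels=4096):
    # h(v) = N_px(v) / N_px: the fraction of pixels with gray value v.
    counts = np.bincount(image.ravel(), minlength=n_levels)
    return counts / image.size

For a 12-bit image, n_levels = 4096 matches the gray value range mentioned above, and the returned fractions sum to 1.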
Section 2.
Image quality
Resolution
The resolution of a digital image is sometimes wrongly defined as the linear pixel density
(expressed in dots per inch). This is, however, only an upper bound for the resolution.
Resolution is also determined by the imaging process: the more blurring, the lower the resolution.
Factors that contribute to the unsharpness of an image are
1. the characteristics of the imaging system, such as the focal spot and the amount of
detector blur;
2. the scene characteristics and geometry;
3. the viewing conditions.
Instead of using the PSF or LSF it is also possible to use the optical
transfer function (OTF). The OTF expresses the relative amplitude
and phase shift of a sinusoidal target as a function of frequency.
The modulation transfer function (MTF) is the amplitude (i.e.
MTF = |OTF|) and the phase transfer function (PTF) is the phase
component of the OTF. For small amplitudes, the lines of the target may no longer be distinguishable. An indication of the resolution is the number of line pairs per millimeter (lp/mm) at a specified small amplitude (e.g., 10%).
The OTF is the Fourier transform (FT) of the PSF or LSF.
Fig. 4: (a) PSF and (b) corresponding MTF.
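As a numerical sketch of this relation, the MTF can be estimated from a sampled LSF by taking the magnitude of its Fourier transform; lsf and spacing below are assumed inputs (the LSF samples and their spacing in mm):

import numpy as np

def mtf_from_lsf(lsf, spacing=1.0):
    otf = np.fft.rfft(lsf)                        # OTF: Fourier transform of the LSF
    mtf = np.abs(otf) / np.abs(otf[0])            # MTF = |OTF|, normalized at zero frequency
    freqs = np.fft.rfftfreq(len(lsf), d=spacing)  # cycles per unit length (lp/mm if spacing is in mm)
    return freqs, mtf

The frequency at which this curve drops to, e.g., 10% then serves as the resolution indication described above.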
Contrast
Contrast is the difference in intensity of adjacent regions of the image. More accurately, it
is the amplitude of the Fourier transform of the image as a function of spatial frequency.
Using the Fourier transform, the image is decomposed into sinusoidal patterns with corresponding amplitudes, and these amplitudes represent the contrast at the different spatial frequencies.
The contrast is defined by (1) the imaging process, such as the source intensity and the
absorption efficiency or sensitivity of the capturing device, (2) the scene characteristics,
such as the physical properties, size and shape of the object, and the use of contrast agents,
and (3) the viewing conditions, such as the room illumination and display equipment.
Noise
A third quality factor is image noise. The emission and detection of
light and all other types of electromagnetic waves are stochastic
processes. Because of the statistical nature of imaging, noise is
always present. It is the random component in the image.
If the noise level is high compared with the image intensity of an
object, the meaningful information is lost in the noise. An important
measure, obtained from signal theory, is therefore the
signal-to-noise ratio (SNR or S/N). In the terminology of images
this is the contrast-to-noise ratio (CNR). Both contrast and noise are
frequency dependent. An estimate of the noise can be obtained by
making a flat-field image, i.e., an image without an object between
the source and the detector.
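A minimal sketch of these estimates, assuming a flat-field image and object/background regions of interest selected elsewhere:

import numpy as np

def estimate_noise(flat_field):
    # Noise level: standard deviation of a flat-field image
    # (acquired with no object between source and detector).
    return np.std(flat_field)

def contrast_to_noise_ratio(object_roi, background_roi, noise):
    # CNR: mean intensity difference between object and background,
    # relative to the noise level.
    return abs(np.mean(object_roi) - np.mean(background_roi)) / noise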
Artifacts
Section 3.
Basic image operations
The aim of medical image enhancement is to allow the clinician to better perceive all the relevant diagnostic information present in the image. In digital radiography, for example,
12-bit images with 4096 possible gray levels are available.
It is physically impossible for the human eye to distinguish all these gray values at once in a
single image. Consequently, not all the diagnostic information encoded in the image may
be perceived. Meaningful details must have a sufficiently high contrast to allow the
clinician to detect them easily.
The larger the number of gray values in the image, the more important this issue becomes,
as lower contrast features may become available in the image data.
Given a digital image I that attributes a gray value (i.e., brightness) to each of the pixels (i, j), a gray level transformation is a function g that transforms each gray level I(i, j) to another value I′(i, j), independent of the position (i, j). Hence, for all pixels (i, j),

\[
I'(i, j) = g(I(i, j)).
\]
In practice, g is an increasing function. If pixel (i1, j1) appears brighter than pixel (i2, j2) in the original image, this relation still holds after the gray level transformation. The main use of such a gray level transformation is to increase the contrast in some regions of the image. The price to be paid is a decreased contrast in other parts of the image.
Window/level operation
The window/level operation maps the gray value interval [l − w/2, l + w/2] linearly onto the full display range [0, M], where l is the level (the window center), w the window width, and M the maximal display value:

\[
g_{l,w}(t) =
\begin{cases}
0 & \text{for } t < l - \frac{w}{2} \\[2pt]
\frac{M}{w}\left(t - l + \frac{w}{2}\right) & \text{for } l - \frac{w}{2} \le t \le l + \frac{w}{2} \\[2pt]
M & \text{for } t > l + \frac{w}{2}
\end{cases}
\]
Fig. 6: Window/leveling with l = 1500, w = 1000.
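A minimal implementation of this mapping; the 8-bit display maximum M = 255 is an assumption, and the example parameters are those of Fig. 6:

import numpy as np

def window_level(image, l, w, M=255.0):
    # Linear ramp from 0 at t = l - w/2 up to M at t = l + w/2,
    # clipped to 0 below the window and to M above it.
    out = (image.astype(float) - (l - w / 2)) * (M / w)
    return np.clip(out, 0.0, M)

# Setting of Fig. 6: maps gray values 1000..2000 onto the full display range.
# display = window_level(image, l=1500, w=1000)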
Multi-image operations
A simple operation is adding or subtracting images in a pixelwise way. For two images I1 and I2, the sum I+ and the difference I− are defined as

\[
I_+(i, j) = I_1(i, j) + I_2(i, j), \qquad
I_-(i, j) = I_1(i, j) - I_2(i, j).
\]
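In code these are elementwise operations; the only practical caveat is that unsigned integer pixel types wrap around on subtraction, so this sketch casts to a signed type first:

import numpy as np

def image_sum(i1, i2):
    # Cast before adding/subtracting: unsigned types would wrap around.
    return i1.astype(np.int32) + i2.astype(np.int32)

def image_difference(i1, i2):
    return i1.astype(np.int32) - i2.astype(np.int32)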
Geometric operations
Written in homogeneous coordinates, these operations are matrix multiplications:

translation:
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\]

general affine:
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\]

shear:
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} 1 & u_x & 0 \\ u_y & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\]

rotation:
\[
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} =
\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
\]
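These transformations can be sketched directly as 3 × 3 matrices acting on homogeneous coordinates; points, tx, ty, and theta below are assumed example inputs:

import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def apply_transform(matrix, points):
    # points: (N, 2) array of (x, y); append a 1 to each for homogeneous coordinates.
    homogeneous = np.column_stack([points, np.ones(len(points))])
    return (homogeneous @ matrix.T)[:, :2]

# Transforms compose by matrix multiplication (the rightmost factor acts first):
# moved = apply_transform(translation(10, 0) @ rotation(np.pi / 4), points)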
Linear filters
From linear system theory, we know that an image I (i, j) can be written as follows:
\[
I(i, j) = \sum_{k,l} I(k, l)\,\delta(i - k,\, j - l)
\]
A linear, shift-invariant filter is then completely characterized by its response f to the impulse δ, and the filter output is the convolution f ∗ I. In practice, the flipped kernel h, defined as h(i, j) = f(−i, −j), is usually used. Hence, f ∗ I = h • I, where h • I is the cross-correlation of h and I. If the filter is symmetric, which is often the case, cross-correlation and convolution are identical.
A cross-correlation of an image I (i, j) with a kernel h has the following physical meaning.
The kernel h is used as an image template or mask that is shifted across the image. For
every image pixel (i, j), the template pixel h(0, 0), which typically lies in the center of the
mask, is superimposed onto this pixel (i, j), and the values of the template and image that
correspond to the same positions are multiplied. Next, all these values are summed. A
cross-correlation emphasizes patterns in the image similar to the template.
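The procedure just described translates almost literally into code. A direct, unoptimized sketch with zero padding at the image borders (an implementation choice):

import numpy as np

def cross_correlate(image, kernel):
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2               # half-sizes; h(0, 0) sits at (ph, pw)
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)))
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Superimpose the template on pixel (i, j), multiply
            # corresponding values, and sum.
            window = padded[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out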
A softer way of smoothing the image is to give a high weight to the center pixel and less
weight to pixels further away from the central pixel. A suitable filter for this operation is
the discretized Gaussian function
\[
g(\vec{r}\,) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{r^2}{2\sigma^2}}, \qquad \vec{r} = (i, j)
\]
Small values are put to zero in order to produce a local filter.
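A sketch of such a discretized, truncated Gaussian kernel; cutting off at a radius of 3σ is a common but arbitrary choice:

import numpy as np

def gaussian_kernel(sigma):
    radius = int(np.ceil(3 * sigma))    # beyond ~3*sigma the values are near zero
    i, j = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    g = np.exp(-(i**2 + j**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return g / g.sum()                  # renormalize after truncation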
The Fourier transform of the Gaussian is again Gaussian. In the Fourier domain,
convolution with a filter becomes multiplication. Taking this into account, it is clear that a
Gaussian filter attenuates the high frequencies in the image. These averaging filters are
therefore also called low-pass filters. In contrast, filters that emphasize high frequencies are
called high-pass filters.
Other types of linear filters are differential operators such as the gradient and the Laplacian.
However, these operations are not defined on discrete images. Because derivatives are
defined on differentiable functions, the computation is performed by first fitting a
differentiable function through the discrete data set. This can be obtained by convolving the
discrete image with a continuous function f .
Using a smoothing filter g, any image can be split into a low-frequency part g ∗ I and a high-frequency detail part I − g ∗ I:

\[
I = g * I + (I - g * I).
\]
Unsharp masking enhances the image details by assigning a higher weight to this high-frequency part. For some α > 0, the output image I′ is then given by

\begin{align*}
I' &= g * I + (1 + \alpha)(I - g * I) \\
   &= I + \alpha\,(I - g * I) \\
   &= (1 + \alpha)\,I - \alpha\, g * I.
\end{align*}

The parameter α controls the strength of the enhancement, and the parameter σ of the smoothing filter determines the size of the details that are enhanced.
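A minimal sketch of unsharp masking in the second form above, assuming SciPy's gaussian_filter is used for the smoothing g ∗ I:

import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image, alpha=1.0, sigma=2.0):
    # I' = I + alpha * (I - g*I): add the high-frequency detail back,
    # weighted by alpha; sigma sets the scale of the enhanced details.
    smoothed = gaussian_filter(image.astype(float), sigma)
    return image + alpha * (image - smoothed)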
Nonlinear filters
Not every goal can be achieved by using linear filters. Many problems are better solved with
nonlinear methods. Consider, for example, the denoising problem. As explained above, the
averaging filter removes noise in the image. The output image is, however, much smoother
than the input image. In particular, edges are smeared out and may even disappear. To avoid
smoothing, it can therefore be better to calculate the median instead of the mean value in a
small window around each pixel. This procedure better preserves the edges.
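A direct sketch of the median filter; replicating the edge pixels at the borders and an odd window size are implementation choices:

import numpy as np

def median_filter(image, size=3):
    r = size // 2
    padded = np.pad(image, r, mode='edge')
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Median of the size-by-size window centered on (i, j).
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out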
Section 4.
Multiscale image processing
Gray value transformations, such as the widespread window/level operation, increase the
contrast in a subpart of the gray value scale. They are quite useful for low-contrast objects
situated in the enhanced gray value band. Unfortunately, features outside this gray value
interval are attenuated instead of enhanced. In addition, gray value transformations do not
make use of the spatial relationship among object pixels and therefore equally enhance
meaningful and meaningless features such as noise.
Spatial operations overcome this problem. Differential operations, such as unsharp
masking, enhance gray value variations or edges, whereas other operations, such as spatial
averaging and median filtering, reduce the noise. However, they focus on features of a
particular size because of the fixed size of the mask, which is a parameter that must be
chosen.
It is clear that a method is needed that is independent of the spatial extent or scale of the
image features and emphasizes the amplitude of only the low-contrast features. Multiscale
image processing has been studied extensively, not only by computer scientists but also by
neurophysiologists. It is well known that the human visual system makes use of a
multiscale approach.
Thank you.