- •Preface
- •Biological Vision Systems
- •Visual Representations from Paintings to Photographs
- •Computer Vision
- •The Limitations of Standard 2D Images
- •3D Imaging, Analysis and Applications
- •Book Objective and Content
- •Acknowledgements
- •Contents
- •Contributors
- •2.1 Introduction
- •Chapter Outline
- •2.2 An Overview of Passive 3D Imaging Systems
- •2.2.1 Multiple View Approaches
- •2.2.2 Single View Approaches
- •2.3 Camera Modeling
- •2.3.1 Homogeneous Coordinates
- •2.3.2 Perspective Projection Camera Model
- •2.3.2.1 Camera Modeling: The Coordinate Transformation
- •2.3.2.2 Camera Modeling: Perspective Projection
- •2.3.2.3 Camera Modeling: Image Sampling
- •2.3.2.4 Camera Modeling: Concatenating the Projective Mappings
- •2.3.3 Radial Distortion
- •2.4 Camera Calibration
- •2.4.1 Estimation of a Scene-to-Image Planar Homography
- •2.4.2 Basic Calibration
- •2.4.3 Refined Calibration
- •2.4.4 Calibration of a Stereo Rig
- •2.5 Two-View Geometry
- •2.5.1 Epipolar Geometry
- •2.5.2 Essential and Fundamental Matrices
- •2.5.3 The Fundamental Matrix for Pure Translation
- •2.5.4 Computation of the Fundamental Matrix
- •2.5.5 Two Views Separated by a Pure Rotation
- •2.5.6 Two Views of a Planar Scene
- •2.6 Rectification
- •2.6.1 Rectification with Calibration Information
- •2.6.2 Rectification Without Calibration Information
- •2.7 Finding Correspondences
- •2.7.1 Correlation-Based Methods
- •2.7.2 Feature-Based Methods
- •2.8 3D Reconstruction
- •2.8.1 Stereo
- •2.8.1.1 Dense Stereo Matching
- •2.8.1.2 Triangulation
- •2.8.2 Structure from Motion
- •2.9 Passive Multiple-View 3D Imaging Systems
- •2.9.1 Stereo Cameras
- •2.9.2 3D Modeling
- •2.9.3 Mobile Robot Localization and Mapping
- •2.10 Passive Versus Active 3D Imaging Systems
- •2.11 Concluding Remarks
- •2.12 Further Reading
- •2.13 Questions
- •2.14 Exercises
- •References
- •3.1 Introduction
- •3.1.1 Historical Context
- •3.1.2 Basic Measurement Principles
- •3.1.3 Active Triangulation-Based Methods
- •3.1.4 Chapter Outline
- •3.2 Spot Scanners
- •3.2.1 Spot Position Detection
- •3.3 Stripe Scanners
- •3.3.1 Camera Model
- •3.3.2 Sheet-of-Light Projector Model
- •3.3.3 Triangulation for Stripe Scanners
- •3.4 Area-Based Structured Light Systems
- •3.4.1 Gray Code Methods
- •3.4.1.1 Decoding of Binary Fringe-Based Codes
- •3.4.1.2 Advantage of the Gray Code
- •3.4.2 Phase Shift Methods
- •3.4.2.1 Removing the Phase Ambiguity
- •3.4.3 Triangulation for a Structured Light System
- •3.5 System Calibration
- •3.6 Measurement Uncertainty
- •3.6.1 Uncertainty Related to the Phase Shift Algorithm
- •3.6.2 Uncertainty Related to Intrinsic Parameters
- •3.6.3 Uncertainty Related to Extrinsic Parameters
- •3.6.4 Uncertainty as a Design Tool
- •3.7 Experimental Characterization of 3D Imaging Systems
- •3.7.1 Low-Level Characterization
- •3.7.2 System-Level Characterization
- •3.7.3 Characterization of Errors Caused by Surface Properties
- •3.7.4 Application-Based Characterization
- •3.8 Selected Advanced Topics
- •3.8.1 Thin Lens Equation
- •3.8.2 Depth of Field
- •3.8.3 Scheimpflug Condition
- •3.8.4 Speckle and Uncertainty
- •3.8.5 Laser Depth of Field
- •3.8.6 Lateral Resolution
- •3.9 Research Challenges
- •3.10 Concluding Remarks
- •3.11 Further Reading
- •3.12 Questions
- •3.13 Exercises
- •References
- •4.1 Introduction
- •Chapter Outline
- •4.2 Representation of 3D Data
- •4.2.1 Raw Data
- •4.2.1.1 Point Cloud
- •4.2.1.2 Structured Point Cloud
- •4.2.1.3 Depth Maps and Range Images
- •4.2.1.4 Needle map
- •4.2.1.5 Polygon Soup
- •4.2.2 Surface Representations
- •4.2.2.1 Triangular Mesh
- •4.2.2.2 Quadrilateral Mesh
- •4.2.2.3 Subdivision Surfaces
- •4.2.2.4 Morphable Model
- •4.2.2.5 Implicit Surface
- •4.2.2.6 Parametric Surface
- •4.2.2.7 Comparison of Surface Representations
- •4.2.3 Solid-Based Representations
- •4.2.3.1 Voxels
- •4.2.3.3 Binary Space Partitioning
- •4.2.3.4 Constructive Solid Geometry
- •4.2.3.5 Boundary Representations
- •4.2.4 Summary of Solid-Based Representations
- •4.3 Polygon Meshes
- •4.3.1 Mesh Storage
- •4.3.2 Mesh Data Structures
- •4.3.2.1 Halfedge Structure
- •4.4 Subdivision Surfaces
- •4.4.1 Doo-Sabin Scheme
- •4.4.2 Catmull-Clark Scheme
- •4.4.3 Loop Scheme
- •4.5 Local Differential Properties
- •4.5.1 Surface Normals
- •4.5.2 Differential Coordinates and the Mesh Laplacian
- •4.6 Compression and Levels of Detail
- •4.6.1 Mesh Simplification
- •4.6.1.1 Edge Collapse
- •4.6.1.2 Quadric Error Metric
- •4.6.2 QEM Simplification Summary
- •4.6.3 Surface Simplification Results
- •4.7 Visualization
- •4.8 Research Challenges
- •4.9 Concluding Remarks
- •4.10 Further Reading
- •4.11 Questions
- •4.12 Exercises
- •References
- •1.1 Introduction
- •Chapter Outline
- •1.2 A Historical Perspective on 3D Imaging
- •1.2.1 Image Formation and Image Capture
- •1.2.2 Binocular Perception of Depth
- •1.2.3 Stereoscopic Displays
- •1.3 The Development of Computer Vision
- •1.3.1 Further Reading in Computer Vision
- •1.4 Acquisition Techniques for 3D Imaging
- •1.4.1 Passive 3D Imaging
- •1.4.2 Active 3D Imaging
- •1.4.3 Passive Stereo Versus Active Stereo Imaging
- •1.5 Twelve Milestones in 3D Imaging and Shape Analysis
- •1.5.1 Active 3D Imaging: An Early Optical Triangulation System
- •1.5.2 Passive 3D Imaging: An Early Stereo System
- •1.5.3 Passive 3D Imaging: The Essential Matrix
- •1.5.4 Model Fitting: The RANSAC Approach to Feature Correspondence Analysis
- •1.5.5 Active 3D Imaging: Advances in Scanning Geometries
- •1.5.6 3D Registration: Rigid Transformation Estimation from 3D Correspondences
- •1.5.7 3D Registration: Iterative Closest Points
- •1.5.9 3D Local Shape Descriptors: Spin Images
- •1.5.10 Passive 3D Imaging: Flexible Camera Calibration
- •1.5.11 3D Shape Matching: Heat Kernel Signatures
- •1.6 Applications of 3D Imaging
- •1.7 Book Outline
- •1.7.1 Part I: 3D Imaging and Shape Representation
- •1.7.2 Part II: 3D Shape Analysis and Processing
- •1.7.3 Part III: 3D Imaging Applications
- •References
- •5.1 Introduction
- •5.1.1 Applications
- •5.1.2 Chapter Outline
- •5.2 Mathematical Background
- •5.2.1 Differential Geometry
- •5.2.2 Curvature of Two-Dimensional Surfaces
- •5.2.3 Discrete Differential Geometry
- •5.2.4 Diffusion Geometry
- •5.2.5 Discrete Diffusion Geometry
- •5.3 Feature Detectors
- •5.3.1 A Taxonomy
- •5.3.2 Harris 3D
- •5.3.3 Mesh DOG
- •5.3.4 Salient Features
- •5.3.5 Heat Kernel Features
- •5.3.6 Topological Features
- •5.3.7 Maximally Stable Components
- •5.3.8 Benchmarks
- •5.4 Feature Descriptors
- •5.4.1 A Taxonomy
- •5.4.2 Curvature-Based Descriptors (HK and SC)
- •5.4.3 Spin Images
- •5.4.4 Shape Context
- •5.4.5 Integral Volume Descriptor
- •5.4.6 Mesh Histogram of Gradients (HOG)
- •5.4.7 Heat Kernel Signature (HKS)
- •5.4.8 Scale-Invariant Heat Kernel Signature (SI-HKS)
- •5.4.9 Color Heat Kernel Signature (CHKS)
- •5.4.10 Volumetric Heat Kernel Signature (VHKS)
- •5.5 Research Challenges
- •5.6 Conclusions
- •5.7 Further Reading
- •5.8 Questions
- •5.9 Exercises
- •References
- •6.1 Introduction
- •Chapter Outline
- •6.2 Registration of Two Views
- •6.2.1 Problem Statement
- •6.2.2 The Iterative Closest Points (ICP) Algorithm
- •6.2.3 ICP Extensions
- •6.2.3.1 Techniques for Pre-alignment
- •Global Approaches
- •Local Approaches
- •6.2.3.2 Techniques for Improving Speed
- •Subsampling
- •Closest Point Computation
- •Distance Formulation
- •6.2.3.3 Techniques for Improving Accuracy
- •Outlier Rejection
- •Additional Information
- •Probabilistic Methods
- •6.3 Advanced Techniques
- •6.3.1 Registration of More than Two Views
- •Reducing Error Accumulation
- •Automating Registration
- •6.3.2 Registration in Cluttered Scenes
- •Point Signatures
- •Matching Methods
- •6.3.3 Deformable Registration
- •Methods Based on General Optimization Techniques
- •Probabilistic Methods
- •6.3.4 Machine Learning Techniques
- •Improving the Matching
- •Object Detection
- •6.4 Quantitative Performance Evaluation
- •6.5 Case Study 1: Pairwise Alignment with Outlier Rejection
- •6.6 Case Study 2: ICP with Levenberg-Marquardt
- •6.6.1 The LM-ICP Method
- •6.6.2 Computing the Derivatives
- •6.6.3 The Case of Quaternions
- •6.6.4 Summary of the LM-ICP Algorithm
- •6.6.5 Results and Discussion
- •6.7 Case Study 3: Deformable ICP with Levenberg-Marquardt
- •6.7.1 Surface Representation
- •6.7.2 Cost Function
- •Data Term: Global Surface Attraction
- •Data Term: Boundary Attraction
- •Penalty Term: Spatial Smoothness
- •Penalty Term: Temporal Smoothness
- •6.7.3 Minimization Procedure
- •6.7.4 Summary of the Algorithm
- •6.7.5 Experiments
- •6.8 Research Challenges
- •6.9 Concluding Remarks
- •6.10 Further Reading
- •6.11 Questions
- •6.12 Exercises
- •References
- •7.1 Introduction
- •7.1.1 Retrieval and Recognition Evaluation
- •7.1.2 Chapter Outline
- •7.2 Literature Review
- •7.3 3D Shape Retrieval Techniques
- •7.3.1 Depth-Buffer Descriptor
- •7.3.1.1 Computing the 2D Projections
- •7.3.1.2 Obtaining the Feature Vector
- •7.3.1.3 Evaluation
- •7.3.1.4 Complexity Analysis
- •7.3.2 Spin Images for Object Recognition
- •7.3.2.1 Matching
- •7.3.2.2 Evaluation
- •7.3.2.3 Complexity Analysis
- •7.3.3 Salient Spectral Geometric Features
- •7.3.3.1 Feature Points Detection
- •7.3.3.2 Local Descriptors
- •7.3.3.3 Shape Matching
- •7.3.3.4 Evaluation
- •7.3.3.5 Complexity Analysis
- •7.3.4 Heat Kernel Signatures
- •7.3.4.1 Evaluation
- •7.3.4.2 Complexity Analysis
- •7.4 Research Challenges
- •7.5 Concluding Remarks
- •7.6 Further Reading
- •7.7 Questions
- •7.8 Exercises
- •References
- •8.1 Introduction
- •Chapter Outline
- •8.2 3D Face Scan Representation and Visualization
- •8.3 3D Face Datasets
- •8.3.1 FRGC v2 3D Face Dataset
- •8.3.2 The Bosphorus Dataset
- •8.4 3D Face Recognition Evaluation
- •8.4.1 Face Verification
- •8.4.2 Face Identification
- •8.5 Processing Stages in 3D Face Recognition
- •8.5.1 Face Detection and Segmentation
- •8.5.2 Removal of Spikes
- •8.5.3 Filling of Holes and Missing Data
- •8.5.4 Removal of Noise
- •8.5.5 Fiducial Point Localization and Pose Correction
- •8.5.6 Spatial Resampling
- •8.5.7 Feature Extraction on Facial Surfaces
- •8.5.8 Classifiers for 3D Face Matching
- •8.6 ICP-Based 3D Face Recognition
- •8.6.1 ICP Outline
- •8.6.2 A Critical Discussion of ICP
- •8.6.3 A Typical ICP-Based 3D Face Recognition Implementation
- •8.6.4 ICP Variants and Other Surface Registration Approaches
- •8.7 PCA-Based 3D Face Recognition
- •8.7.1 PCA System Training
- •8.7.2 PCA Training Using Singular Value Decomposition
- •8.7.3 PCA Testing
- •8.7.4 PCA Performance
- •8.8 LDA-Based 3D Face Recognition
- •8.8.1 Two-Class LDA
- •8.8.2 LDA with More than Two Classes
- •8.8.3 LDA in High Dimensional 3D Face Spaces
- •8.8.4 LDA Performance
- •8.9 Normals and Curvature in 3D Face Recognition
- •8.9.1 Computing Curvature on a 3D Face Scan
- •8.10 Recent Techniques in 3D Face Recognition
- •8.10.1 3D Face Recognition Using Annotated Face Models (AFM)
- •8.10.2 Local Feature-Based 3D Face Recognition
- •8.10.2.1 Keypoint Detection and Local Feature Matching
- •8.10.2.2 Other Local Feature-Based Methods
- •8.10.3 Expression Modeling for Invariant 3D Face Recognition
- •8.10.3.1 Other Expression Modeling Approaches
- •8.11 Research Challenges
- •8.12 Concluding Remarks
- •8.13 Further Reading
- •8.14 Questions
- •8.15 Exercises
- •References
- •9.1 Introduction
- •Chapter Outline
- •9.2 DEM Generation from Stereoscopic Imagery
- •9.2.1 Stereoscopic DEM Generation: Literature Review
- •9.2.2 Accuracy Evaluation of DEMs
- •9.2.3 An Example of DEM Generation from SPOT-5 Imagery
- •9.3 DEM Generation from InSAR
- •9.3.1 Techniques for DEM Generation from InSAR
- •9.3.1.1 Basic Principle of InSAR in Elevation Measurement
- •9.3.1.2 Processing Stages of DEM Generation from InSAR
- •The Branch-Cut Method of Phase Unwrapping
- •The Least Squares (LS) Method of Phase Unwrapping
- •9.3.2 Accuracy Analysis of DEMs Generated from InSAR
- •9.3.3 Examples of DEM Generation from InSAR
- •9.4 DEM Generation from LIDAR
- •9.4.1 LIDAR Data Acquisition
- •9.4.2 Accuracy, Error Types and Countermeasures
- •9.4.3 LIDAR Interpolation
- •9.4.4 LIDAR Filtering
- •9.4.5 DTM from Statistical Properties of the Point Cloud
- •9.5 Research Challenges
- •9.6 Concluding Remarks
- •9.7 Further Reading
- •9.8 Questions
- •9.9 Exercises
- •References
- •10.1 Introduction
- •10.1.1 Allometric Modeling of Biomass
- •10.1.2 Chapter Outline
- •10.2 Aerial Photo Mensuration
- •10.2.1 Principles of Aerial Photogrammetry
- •10.2.1.1 Geometric Basis of Photogrammetric Measurement
- •10.2.1.2 Ground Control and Direct Georeferencing
- •10.2.2 Tree Height Measurement Using Forest Photogrammetry
- •10.2.2.2 Automated Methods in Forest Photogrammetry
- •10.3 Airborne Laser Scanning
- •10.3.1 Principles of Airborne Laser Scanning
- •10.3.1.1 Lidar-Based Measurement of Terrain and Canopy Surfaces
- •10.3.2 Individual Tree-Level Measurement Using Lidar
- •10.3.2.1 Automated Individual Tree Measurement Using Lidar
- •10.3.3 Area-Based Approach to Estimating Biomass with Lidar
- •10.4 Future Developments
- •10.5 Concluding Remarks
- •10.6 Further Reading
- •10.7 Questions
- •References
- •11.1 Introduction
- •Chapter Outline
- •11.2 Volumetric Data Acquisition
- •11.2.1 Computed Tomography
- •11.2.1.1 Characteristics of 3D CT Data
- •11.2.2 Positron Emission Tomography (PET)
- •11.2.2.1 Characteristics of 3D PET Data
- •Relaxation
- •11.2.3.1 Characteristics of the 3D MRI Data
- •Image Quality and Artifacts
- •11.2.4 Summary
- •11.3 Surface Extraction and Volumetric Visualization
- •11.3.1 Surface Extraction
- •Example: Curvatures and Geometric Tools
- •11.3.2 Volume Rendering
- •11.3.3 Summary
- •11.4 Volumetric Image Registration
- •11.4.1 A Hierarchy of Transformations
- •11.4.1.1 Rigid Body Transformation
- •11.4.1.2 Similarity Transformations and Anisotropic Scaling
- •11.4.1.3 Affine Transformations
- •11.4.1.4 Perspective Transformations
- •11.4.1.5 Non-rigid Transformations
- •11.4.2 Points and Features Used for the Registration
- •11.4.2.1 Landmark Features
- •11.4.2.2 Surface-Based Registration
- •11.4.2.3 Intensity-Based Registration
- •11.4.3 Registration Optimization
- •11.4.3.1 Estimation of Registration Errors
- •11.4.4 Summary
- •11.5 Segmentation
- •11.5.1 Semi-automatic Methods
- •11.5.1.1 Thresholding
- •11.5.1.2 Region Growing
- •11.5.1.3 Deformable Models
- •Snakes
- •Balloons
- •11.5.2 Fully Automatic Methods
- •11.5.2.1 Atlas-Based Segmentation
- •11.5.2.2 Statistical Shape Modeling and Analysis
- •11.5.3 Summary
- •11.6 Diffusion Imaging: An Illustration of a Full Pipeline
- •11.6.1 From Scalar Images to Tensors
- •11.6.2 From Tensor Image to Information
- •11.6.3 Summary
- •11.7 Applications
- •11.7.1 Diagnosis and Morphometry
- •11.7.2 Simulation and Training
- •11.7.3 Surgical Planning and Guidance
- •11.7.4 Summary
- •11.8 Concluding Remarks
- •11.9 Research Challenges
- •11.10 Further Reading
- •Data Acquisition
- •Surface Extraction
- •Volume Registration
- •Segmentation
- •Diffusion Imaging
- •Software
- •11.11 Questions
- •11.12 Exercises
- •References
- •Index
8 3D Face Recognition |
337 |
and matched each region independently using ICP. They used Borda Count and consensus voting to combine the scores. Matching expression insensitive regions of the face is a potentially useful approach to overcome the sensitivity of ICP to expressions. However, determining such regions is a problem worth exploring because these regions are likely to vary between individuals as well as expressions. Another challenge in matching sub-regions is that it requires accurate segmentation of the sub-regions.
Finally we note that, rather than minimizing a mean squared error metric between the probe and gallery surfaces, other metrics are possible, although a significantly different approach to minimization must be adopted and the approach is no longer termed ‘ICP’. One such metric is termed the Surface Interpenetration Measure (SIM) [81] which measures the degree to which two aligned surfaces cross over each other. The SIM metric has recently been used with a Simulated Annealing approach to 3D face recognition [76]. A verification rate of 96.5 % was achieved on the FRGC v2 dataset at 0.1 % FAR and a rank-one accuracy of 98.4 % was achieved in identification tests.
In the following two sections we discuss PCA and LDA-based 3D face recognition systems that operate on depth maps and surface feature maps (e.g. arrays of curvature values) rather than on point clouds.
8.7 PCA-Based 3D Face Recognition
Once 3D face scans have been filtered, pose normalized and re-sampled so that, for example, standard size depth maps are generated, the simplest way to implement a face recognition system is to compare the depth maps directly. In this sense we see a p × q depth map as an m × 1 feature vector in an m = pq dimensional space and we can implement a 1-nearest neighbor scheme, for example, based on either a Euclidean (L2 norm) metric or cosine distance metric. However, this is not generally recommended. For typical depth map sizes, m is large and the feature space contains a large amount of redundancy, since we are only imaging faces and not other objects. Dimension reduction using Principal Component Analysis (PCA) can express the variation in the data in a smaller space, which speeds up feature vector comparisons, and can remove dimensions that express noise, which improves recognition performance. Note that PCA is also known in various texts as the Hotelling transform or the Karhunen-Loève transform. The transform involves a zero-mean operation and a rotation of the data such that the variables associated with each dimension become uncorrelated. It is then possible to form a reduced dimension subspace by discarding those dimensions that express little variance in the (rotated) data. This is equivalent to a projection into a subspace of the zero-mean dataset that decorrelates the data. It is based on the second order statistical properties of the data and maps the general covariance matrix in the original basis to a diagonal matrix in the new, rotated basis.
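As a point of reference, the direct depth-map comparison mentioned above (before any dimension reduction) can be sketched as follows. This is a minimal illustration using NumPy, not a recommended system: the function name `nearest_neighbor`, the array layout and the toy data are assumptions introduced here for illustration only.

```python
import numpy as np

def nearest_neighbor(probe, gallery, metric="euclidean"):
    """Return the index of the gallery depth map closest to the probe.

    probe:   (p, q) depth map.
    gallery: (n, p, q) stack of depth maps of the same size as the probe.
    """
    x = probe.reshape(-1).astype(float)                  # m = p*q feature vector
    G = gallery.reshape(len(gallery), -1).astype(float)  # n x m matrix, one face per row
    if metric == "euclidean":
        dist = np.linalg.norm(G - x, axis=1)             # L2 norm of each difference
    elif metric == "cosine":
        # cosine distance = 1 - cosine similarity
        dist = 1.0 - (G @ x) / (np.linalg.norm(G, axis=1) * np.linalg.norm(x))
    else:
        raise ValueError("metric must be 'euclidean' or 'cosine'")
    return int(np.argmin(dist))

# Toy data (hypothetical): five random 4 x 3 "depth maps"; the probe is a
# slightly perturbed copy of gallery entry 2.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(5, 4, 3))
probe = gallery[2] + 0.01 * rng.normal(size=(4, 3))

nn_euclid = nearest_neighbor(probe, gallery, "euclidean")
nn_cosine = nearest_neighbor(probe, gallery, "cosine")
```

Each comparison costs O(m) operations per gallery entry, which is one reason why a reduced-dimension subspace is attractive when m is large.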
A. Mian and N. Pears
PCA-based 3D face recognition has become a benchmark at least in near-frontal poses, such as is provided in the FRGC v2 dataset [73]. This implies that, when a researcher presents a new 3D face recognition method for this kind of 3D scan, it is expected to at least improve upon a standard PCA performance (for the given set of features employed). The method is similar to the seminal approaches where 2D facial images are decomposed into a linear combination of eigenvectors [82] and employed within a face recognition scenario [87], except that, instead of 2D images, depth maps are used as the input feature vectors. Another important difference is that in 2D ‘eigenfaces’, the eigenvectors associated with the three most significant eigenvalues are usually affected by illumination variations and discarding them improves recognition performance [8]. Since depth maps do not contain any illumination component, all significant eigenvectors are used for 3D face recognition.
One of the earliest works on PCA-based 3D face recognition is that of Achermann et al. [2]. Hesher et al. [46] explored the use of different numbers of eigenvectors and image sizes for PCA-based 3D face recognition. Heseltine et al. [44] generated a set of twelve feature maps based on the gradients and curvatures over the facial surface, and applied PCA-based face recognition to these maps. Pan et al. [70] constructed a circular depth map using the nose tip as the center and the axis of symmetry as the starting point. They applied a PCA-based approach to the depth map for face recognition. Chang et al. [17] performed PCA-based 3D face recognition on a larger dataset and later expanded their work into a comparative evaluation of PCA-based 3D face recognition against 2D eigenfaces, finding similar recognition performance [18].
In order to implement and test PCA-based 3D face recognition, we need to partition our pose-normalized 3D scans into a training set and a test set. The following sub-sections provide procedures for training and testing a PCA-based 3D face recognition system.
8.7.1 PCA System Training
1. For the set of n training images, $\mathbf{x}_i$, i = 1 . . . n, where each training face is represented as an m-dimensional point (column vector) in depth map or surface feature space,
\[ \mathbf{x} = [x_1, \ldots, x_m]^T, \tag{8.12} \]
stack the n training face vectors together (as rows) to construct the n × m training data matrix:
\[ \mathbf{X} = \begin{bmatrix} \mathbf{x}_1^T \\ \vdots \\ \mathbf{x}_n^T \end{bmatrix}. \tag{8.13} \]
2. Perform a mean-centering operation by subtracting the mean of the training face vectors, $\bar{\mathbf{x}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{x}_i$, from each row of matrix $\mathbf{X}$ to form the zero-mean training data matrix:
\[ \mathbf{X}_0 = \mathbf{X} - \mathbf{J}_{n,1}\bar{\mathbf{x}}^T, \tag{8.14} \]
where $\mathbf{J}_{n,1}$ is an n × 1 matrix of ones.
|
|
3. Generate the m × m covariance matrix of the training data as:
\[ \mathbf{C} = \frac{1}{n-1}\mathbf{X}_0^T\mathbf{X}_0. \tag{8.15} \]
Note that dividing by n − 1 (rather than n) generates an unbiased covariance estimate from the training data (rather than a maximum likelihood estimate). As we tend to use large training sets (of the order of several hundred images), in practice there is no significant difference between these two covariance estimates.
4. Perform a standard eigendecomposition on the covariance matrix. Since the covariance matrix is symmetric, its eigenvectors are orthogonal to each other and can be chosen to have unit length such that:
\[ \mathbf{V}\mathbf{D}\mathbf{V}^T = \mathbf{C}, \tag{8.16} \]
where both $\mathbf{V}$ and $\mathbf{D}$ are m × m matrices. The columns of matrix $\mathbf{V}$ are the eigenvectors, $\mathbf{v}_i$, associated with the covariance matrix and $\mathbf{D}$ is a diagonal matrix whose elements contain the corresponding eigenvalues, $\lambda_i$. Since the covariance is a symmetric positive semidefinite matrix, these eigenvalues are real and nonnegative. A key point is that these eigenvalues describe the variance along each of the eigenvectors. (Note that eigendecomposition can be achieved with a standard function call such as the MATLAB eig function.) Eigenvalues in $\mathbf{D}$ and their corresponding eigenvectors in $\mathbf{V}$ are in corresponding columns and we require them to be in descending order of eigenvalue. If this order is not automatically performed within the eigendecomposition function, column reordering should be implemented.
5. Select the number of subspace dimensions for projecting the 3D faces. This is the dimensionality reduction step and is usually done by analyzing the ratio of cumulative variance associated with the first k dimensions of the rotated image space to the total variance associated with the full set of m dimensions in that space. This proportion of variance ratio is given by:
\[ a_k = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \tag{8.17} \]
and takes a value between 0 and 1, which is often expressed as a percentage 0–100 %. A common approach is to choose a minimum value of k such that $a_k$ is greater than a certain percentage (90 % or 95 % are commonly used). Figure 8.7 shows a plot of $a_k$ versus k for 455 3D faces taken from the FRGC v2 dataset [73]. From Fig. 8.7 one can conclude that the shape of human faces lies in a significantly lower dimensional subspace than the dimensionality of the original depth maps. Note that the somewhat arbitrary thresholding approach described here is likely to be sub-optimal and recognition performance can be tuned later by searching for an optimal value of k in a set of face recognition experiments.
6. Project the training data set (the gallery) into the k-dimensional subspace:
\[ \tilde{\mathbf{X}} = \mathbf{X}_0\mathbf{V}_k. \tag{8.18} \]
Here $\mathbf{V}_k$ is an m × k matrix containing the first k eigenvectors (columns, $\mathbf{v}_i$) of $\mathbf{V}$ and $\tilde{\mathbf{X}}$ is an n × k matrix of n training faces (stored as rows) in the k-dimensional subspace (k dimensions stored as columns).
Fig. 8.7 Proportion of variance (%) of the first k eigenvalues to the total variance for 455 3D faces. The first 26 most significant eigenvalues retain 95 % of the total variance and the first 100 eigenvalues retain 99 % of the total variance [73]
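The six training steps above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions rather than a reference implementation: the function name `pca_train`, the parameter `variance_threshold` and the synthetic training matrix are hypothetical and introduced only for illustration. NumPy's `eigh` routine is used in place of the MATLAB eig function mentioned in step 4; it returns eigenvalues in ascending order for symmetric matrices, so the column reordering described in step 4 is applied explicitly.

```python
import numpy as np

def pca_train(X, variance_threshold=0.95):
    """PCA training on an n x m matrix X with one face vector per row (Eq. 8.13)."""
    n = X.shape[0]
    x_bar = X.mean(axis=0)                        # mean training face
    X0 = X - x_bar                                # Eq. 8.14 (broadcasting replaces J x_bar^T)
    C = (X0.T @ X0) / (n - 1)                     # Eq. 8.15, unbiased covariance estimate
    eigvals, V = np.linalg.eigh(C)                # Eq. 8.16, ascending eigenvalue order
    order = np.argsort(eigvals)[::-1]             # reorder to descending eigenvalues
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off values
    V = V[:, order]
    a = np.cumsum(eigvals) / np.sum(eigvals)      # Eq. 8.17, proportion of variance a_k
    # Smallest k such that a_k exceeds the threshold (capped at m).
    k = min(int(np.searchsorted(a, variance_threshold, side="right")) + 1, len(a))
    Vk = V[:, :k]                                 # first k eigenvectors as columns
    X_tilde = X0 @ Vk                             # Eq. 8.18, projected gallery
    return x_bar, Vk, eigvals, X_tilde

# Synthetic stand-in for depth-map vectors (an assumption, not real face data):
# 50 samples in 20 dimensions that lie close to a 3-dimensional subspace.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 3))
basis = rng.normal(size=(3, 20))
X = latent @ basis + 0.01 * rng.normal(size=(50, 20)) + 5.0

x_bar, Vk, eigvals, X_tilde = pca_train(X)
```

The mean `x_bar` and the basis `Vk` are returned alongside the projected gallery because a probe scan has to be mean-centered and projected with the same quantities before it can be compared with gallery entries.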
8.7.2 PCA Training Using Singular Value Decomposition
Several variants of PCA-based 3D face recognition exist in the literature and one of the most important variants is to use Singular Value Decomposition (SVD) directly on the n × m zero-mean training data matrix, X0, thus replacing steps 3 and 4 in the previous subsection. The advantage of using SVD is that it can often provide superior numerical stability compared to eigendecomposition algorithms. Additionally, the storage required for a data matrix is often much less than that required for a covariance matrix, since the number of training scans is typically much less than the dimension of the feature vector. The SVD is given as:
\[ \mathbf{U}\mathbf{S}\mathbf{V}^T = \mathbf{X}_0, \tag{8.19} \]
where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal matrices of dimension n × n and m × m respectively and $\mathbf{S}$ is an n × m matrix with the singular values along its diagonal. Note that, in contrast to the eigendecomposition approach, no covariance matrix is formed, yet the required matrix of eigenvectors, $\mathbf{V}$, spanning the most expressive subspace of the training data is obtained. Furthermore, we can determine the eigenvalues from the corresponding singular values. By substituting for $\mathbf{X}_0$ in Eq. (8.15), using its SVD in Eq. (8.19), and then comparing to the eigendecomposition of covariance in Eq. (8.16) we see that:
\[ \mathbf{D} = \frac{1}{n-1}\mathbf{S}^2. \tag{8.20} \]
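The relationship between singular values and eigenvalues in Eq. (8.20) can be checked numerically. The sketch below is illustrative only: it uses random data in place of face scans (an assumption), with fewer samples than dimensions to mimic the usual case of n much smaller than m. Note that NumPy's `svd` returns the transpose of V, here named `Vt`, and the singular values as a vector `s` in descending order.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 30                       # fewer training scans than feature dimensions
X = rng.normal(size=(n, m))        # random stand-in for the training data matrix
X0 = X - X.mean(axis=0)            # zero-mean training data, Eq. 8.14

# Eigendecomposition route: form the m x m covariance matrix (Eq. 8.15).
C = (X0.T @ X0) / (n - 1)
eigvals = np.sort(np.linalg.eigvalsh(C))[::-1]   # descending eigenvalues

# SVD route: no covariance matrix is formed (Eq. 8.19, economy-size form).
U, s, Vt = np.linalg.svd(X0, full_matrices=False)
eig_from_svd = s**2 / (n - 1)                    # Eq. 8.20

# At most n eigenvalues can be nonzero, so compare the n largest.
match = np.allclose(eigvals[:n], eig_from_svd)

# The first row of Vt is an eigenvector of C with the largest eigenvalue.
is_eigvec = np.allclose(C @ Vt[0], eig_from_svd[0] * Vt[0])
```

This is only a numerical check of Eq. (8.20) on one data set, not a proof.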
|
|
|
|
|
The proof of this is given as one of the questions at the end of this chapter. Typically SVD library functions order the singular values from highest to lowest along the