Shape-Based Hand Recognition
Ender Konukoğlu (1), Erdem Yörük (1), Jérôme Darbon (2), Bülent Sankur (1)
(1) Electrical and Electronic Engineering Department, Boğaziçi University, Bebek, İstanbul, Turkey
(2) EPITA (Ecole Pour l’Informatique et les Techniques Avancées)
[konuk, yoruk, sankur]@boun.edu.tr; firstname.lastname@example.org
We address the problem of identifying persons from images of their hands. The system uses images of the subjects' right hands, captured by a flatbed scanner in an unconstrained pose. In a preprocessing stage, the hand silhouettes are registered to a fixed pose, which involves both rotation and translation of the hand and, separately, of the individual fingers. Two feature sets are comparatively assessed: the Hausdorff distance of the hand contours and independent component features of the hand silhouette images. Both the classification and the verification performances are found to be very satisfactory, showing that, at least for groups of about a hundred subjects, hand-based recognition is a viable secure access control scheme.
The emerging field of biometric technology addresses the automated identification of individuals based on their physiological and behavioral traits. The broad category of human authentication schemes, denoted as biometrics, encompasses many techniques from computer vision and pattern recognition. The personal attributes used in a biometric identification system can be physiological, such as facial features, fingerprints, iris, retinal scans, hand and finger geometry; or behavioral, that is, traits idiosyncratic to the individual, such as voice print, gait, signature, and keystroking. Depending on the complexity or the security level of the application, one will opt to use one or more of these personal characteristics.
In this paper, we investigate the hand shape as a distinctive personal attribute for an authentication task. Although the use of hands as biometric evidence is not very new, and an increasing number of commercial products are being deployed, the documentation in the literature is scarcer than for other modalities like face or voice. However, processing of hands requires less complexity in terms of imaging conditions; for example, a relatively simple sensor such as a flatbed scanner suffices. Consequently, hand-based biometry is friendlier, less prone to disturbances and more robust to environmental conditions. In comparison, face recognition is quite sensitive to pose, facial accessories, expression and lighting variations; iris or retina-based identification requires special illumination and is much less friendly; fingerprint imaging requires good frictional skin, etc. Therefore, authentication based on hand shape can be an attractive alternative due to its unobtrusiveness, low cost, easy interface, and low data storage requirements. Note that there is increasing deployment of access control based on hand geometry. These applications range from passport control in airports to international banks, from parents’ access to child daycare centers to university student meal programs, and from hospitals and prisons to nuclear power plants. Some of the interesting applications have been interactive kiosks, time and attendance control, anti-passback to prevent a cardholder from passing the card to an accomplice, and collection of the transactions of a service system.
Hand-based authentication schemes in the literature are mostly based on geometrical features. For example, Sanchez-Reillo et al. measure finger widths at different latitudes, finger and palm heights, finger deviations and the angles of the inter-finger valleys with the horizontal. The twenty-five selected features are modeled with Gaussian mixture models specific to each individual. Öden, Erçil and Büke have used a fourth-degree implicit polynomial representation of the extracted finger shapes in addition to such geometric features as finger widths at various positions and the palm size. The resulting sixteen features are compared using the Mahalanobis distance. Jain, Ross and Pankanti have used a peg-based imaging scheme and obtained sixteen features, which include length and width of the fingers, aspect ratio of the palm to fingers, and thickness of the hand. The prototype system they developed was tested in a verification experiment for web access for a group of 10 people. Bulatov et al. extract geometric features similar to [21, 20, 22] and compare two classifiers.
The method of Jain and Duta is somewhat similar to ours in that they compare the contour shape difference via the mean square error, and it involves finger alignment. Lay introduced a technique where the hand is illuminated with a parallel grating that serves both to segment the background and to enable the user to register his hand with one of the stored contours. The geometric features of the hand shape are captured by the quadtree code.
Finally, let us note that there exist a number of patents on hand information-based personnel identification, based on either geometrical features or on the hand profile.
In our paper we employ a hand shape-based approach for person identification and/or verification. The algorithm is based on preprocessing the acquired image, which involves segmentation and normalization for the hand’s deformable shape. In this context "hand normalization" signifies the registration of the fingers and of the hand to standard positions by separate rotations of the fingers as well as rotation and translation of the whole hand. Subsequently, person identification is based on the comparison of the hand silhouette shapes using the Hausdorff distance or on the distance of feature vectors, namely the independent component analysis (ICA) features. The features used and the data sizes in different algorithms are summarized in Table 1:
Table 1: Characteristics and population sizes of the hand-based recognition algorithms.

Algorithm         Features & Classification                           Number of  Images per
                                                                      subjects   subject
----------------  --------------------------------------------------  ---------  ----------------
Öden et al.       16 features: geometric features and implicit        35         10
                  polynomial invariants of fingers. Classifier based
                  on Mahalanobis distance.
Sanchez-Reillo    25 geometric features including finger and palm     20         10
et al.            thickness. Classifier based on Gaussian mixture
                  models.
Duta-Jain         Hand contour data. Classifier based on mean         53         variable (from 2
                  average distance of contours.                                  to 15)
Ross              17 geometric features including length, height and  50         variable (7 on
                  thickness of fingers and palm. Classifier based on             average)
                  Euclidean and Mahalanobis distances.
Bulatov et al.    30 geometric features including length and height   70         10
                  of fingers and palm. Classifier based on
                  Chebyshev metric between feature vectors.
Our methods       1st method: Features consist of hand contour data.  118        3
                  Classifier based on modified Hausdorff distance.
                  2nd method: Features consist of independent
                  components of the hand silhouette. Classifier is
                  the Euclidean distance.
We assume that the user of this system will be cooperative, since he/she is requesting access. In other words, the user would have no interest in invalidating the access mechanism by moving or jittering his/her hand or by having fingers crumpled or sticking to each other. On the other hand, the implementation does not assume or force the user into any particular orientation. The orientation information of the hand/fingers is automatically recovered from the scanned image, and the hand is then normalized.
The paper is organized as follows. In Section 2, the segmentation of hand images from their background is presented. The normalization steps for the deformable hand images are given in Section 3. Section 4 details the computation of features from the normalized hand silhouettes. The experimental setup and the classification results are discussed in Section 5, and conclusions are drawn in Section 6.
2. HAND SEGMENTATION
The hand segmentation aims to extract the hand region from the background. At first sight, segmentation of a two-object scene, consisting of a hand and the background, seems a relatively easy task. However, segmentation accuracy may suffer from artifacts due to rings, overlapping cuffs or wristwatch belts/chains, or creases around the borders from too light or too heavy pressing. Furthermore, the delineation of the hand contour must be very accurate, since the differences between hands of different individuals are often minute. We have comparatively evaluated two alternative methods of segmentation, namely, clustering followed by morphological operations and watershed transform-based segmentation. Interestingly enough, Canny edge-based segmentation with snake completion [6, 27] did not work well due to the difficulty of fitting snakes to the very sharp concavities between fingers. Snake algorithms performed adequately only if they were properly initialized at the extremities.
2.1 Segmentation Using the Watershed Transform:
The segmentation by watershed involves two steps: marker extraction and watershed transform. Marker extraction leads to one connected component inside each object of interest, while the watershed transform propagates these markers to define the object boundaries.
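The propagation step can be illustrated with a minimal marker-controlled flooding sketch. It is not the authors' implementation: the function name `marker_watershed`, the 4-connected neighborhood, the choice of a gradient-like `relief` image, and the omission of explicit watershed lines are all assumptions made for illustration.

```python
import heapq


def marker_watershed(relief, markers):
    """Flood `relief` from labeled seeds (hypothetical helper).

    relief  : 2-D list of numbers, e.g. a gradient magnitude image (assumed).
    markers : 2-D list of ints, 0 = unlabeled, >0 = seed label.
    Returns a 2-D list in which every reachable pixel carries a seed label.
    """
    h, w = len(relief), len(relief[0])
    out = [row[:] for row in markers]
    heap, order = [], 0  # `order` breaks ties in insertion order
    for y in range(h):
        for x in range(w):
            if out[y][x]:
                heapq.heappush(heap, (relief[y][x], order, y, x))
                order += 1
    while heap:
        # Always grow from the lowest pixel reached so far.
        _, _, y, x = heapq.heappop(heap)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and out[ny][nx] == 0:
                out[ny][nx] = out[y][x]
                heapq.heappush(heap, (relief[ny][nx], order, ny, nx))
                order += 1
    return out


# Toy one-row example: two basins separated by a ridge of height 9.
relief = [[0, 1, 2, 9, 2, 1, 0]]
markers = [[1, 0, 0, 0, 0, 0, 2]]
flooded = marker_watershed(relief, markers)
```

In this toy case each marker claims its own basin, and the two regions meet at the ridge, which is where the object boundary would be placed.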
Marker Extraction: To extract one marker for the hand and another for the background, a two-class clustering operation is used. The two largest connected components obviously correspond to the hand and to the background. However, due to noise, dirt spots and/or ring artifacts on the hand, the class markers may be disconnected (Fig. 3). Such artifacts can be remedied by imposing label connectivity via a Markov Random Field (MRF).
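A minimal sketch of this marker extraction on gray levels is given here, under stated assumptions: the clustering is a plain one-dimensional two-means, the brighter class is taken to be the hand, and connectivity is 4-neighbor. The function names and the toy image are hypothetical and only illustrate the idea.

```python
from collections import deque


def two_means(values, iters=20):
    """1-D two-class k-means; returns centroids (c0, c1) with c0 <= c1."""
    c0, c1 = float(min(values)), float(max(values))
    for _ in range(iters):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        if not g0 or not g1:
            break
        n0, n1 = sum(g0) / len(g0), sum(g1) / len(g1)
        if (n0, n1) == (c0, c1):
            break
        c0, c1 = n0, n1
    return c0, c1


def label_image(img, c0, c1):
    """Assign each pixel to the nearest centroid (0 = dark, 1 = bright)."""
    return [[0 if abs(v - c0) <= abs(v - c1) else 1 for v in row] for row in img]


def largest_component(labels, cls):
    """Pixel set of the largest 4-connected component with label `cls`."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    best = set()
    for y in range(h):
        for x in range(w):
            if labels[y][x] != cls or seen[y][x]:
                continue
            comp, queue = set(), deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                comp.add((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    return best


# Toy scan: dark background, a bright 2x2 "hand" and one isolated bright speck.
img = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10],
    [10, 205, 198, 10, 190, 10],
    [10, 10, 10, 10, 10, 10],
]
c_bg, c_hand = two_means([v for row in img for v in row])
labels = label_image(img, c_bg, c_hand)
hand_marker = largest_component(labels, 1)        # assumed: bright class = hand
background_marker = largest_component(labels, 0)
```

Keeping only the largest component per class discards the isolated speck, but it does not reconnect a hand region that an artifact has split, which is the case the MRF regularization is meant to handle.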
Let $h(s)$ and $l(s)$ denote, respectively, the image features and their class labels, $\{h(s), l(s);\ s \in \Omega\}$, both defined on the lattice $\Omega$ of the image, where $s$ is any element of this lattice. An initial label field $l$ of the hand and the background can be obtained directly using distances from the two class centroids, where $l$ obviously possesses only two labels, namely, hand and background. We then consider pairwise interactions between neighboring pixel positions, resulting in the following energy term:

$$E(l) = \sum_{s \in \Omega} D\big(h(s), l(s)\big) + \sum_{\langle s,r \rangle} V\big(l(s), l(r)\big)$$

where $\langle s,r \rangle$ means that $s$ and $r$ are neighbors. $D$ is a data term, which measures how well the labeling fits the observed data (i.e., the Mahalanobis distance between the image pixel $h(s)$ and the centroid, $c_{l(s)}$, of the class indicated by the label $l(s)$). $V$ is a prior term on the labeling we are interested in. We use the Ising model for the prior, where the number of discontinuities is penalized by $V\big(l(s), l(r)\big) = \beta\,\big(1 - \delta(l(s), l(r))\big)$. In this expression $\delta$ refers to the Kronecker symbol and $\beta$ is a weighting term for the prior. This model penalizes the number of discontinuities. The resulting energy term becomes thus:

$$\arg\min_{l} \left\{ \sum_{s \in \Omega} \left[ \frac{1}{2}\big(h(s) - c_{l(s)}\big)^{T} \Sigma_{l(s)}^{-1} \big(h(s) - c_{l(s)}\big) + \frac{1}{2} \log\!\big((2\pi)^{3}\,|\Sigma_{l(s)}|\big) \right] + \beta \sum_{\langle s,r \rangle} \big(1 - \delta(l(s), l(r))\big) \right\}$$

where $\Sigma_{l(s)}$ denotes the covariance matrix of the data for the class with label $l(s)$ and $|\Sigma_{l(s)}|$ its determinant.
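For scalar gray levels, an energy of this kind can be sketched as follows: each pixel pays a Gaussian data cost, and each pair of 4-neighbors with different labels pays a fixed penalty. The sketch minimizes it with iterated conditional modes (ICM), a simple local method swapped in here for brevity; it is not the graph-cut minimizer used in the paper and only reaches a local minimum. The class statistics, the value of `beta`, the 4-neighborhood and all function names are illustrative assumptions.

```python
import math


def data_cost(v, mean, var):
    """Negative Gaussian log-likelihood of gray level v for one class."""
    return 0.5 * (v - mean) ** 2 / var + 0.5 * math.log(2 * math.pi * var)


def neighbors(y, x, h, w):
    """4-connected neighbors inside the image (assumed neighborhood)."""
    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
        if 0 <= ny < h and 0 <= nx < w:
            yield ny, nx


def energy(img, labels, stats, beta):
    """Data term over all pixels plus beta per discontinuous neighbor pair."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            mean, var = stats[labels[y][x]]
            total += data_cost(img[y][x], mean, var)
            # Count each neighbor pair once (right and down).
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                total += beta
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                total += beta
    return total


def icm(img, labels, stats, beta, sweeps=10):
    """Greedy per-pixel relabeling; a stand-in for graph cuts (assumption)."""
    h, w = len(img), len(img[0])
    labels = [row[:] for row in labels]
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                def local(k):
                    mean, var = stats[k]
                    cut = sum(1 for ny, nx in neighbors(y, x, h, w)
                              if labels[ny][nx] != k)
                    return data_cost(img[y][x], mean, var) + beta * cut
                best = min((0, 1), key=local)
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break
    return labels


# Toy hand with a darker "ring" pixel in the middle (value 90).
img = [
    [10, 10, 10, 10, 10],
    [10, 200, 200, 200, 10],
    [10, 200, 90, 200, 10],
    [10, 200, 200, 200, 10],
    [10, 10, 10, 10, 10],
]
stats = {0: (10.0, 100.0), 1: (200.0, 100.0)}  # (mean, variance), assumed
beta = 10.0                                     # assumed prior weight
init = [[0 if abs(v - 10) <= abs(v - 200) else 1 for v in row] for row in img]
refined = icm(img, init, stats, beta)
```

In this toy case the nearest-centroid labeling assigns the ring pixel to the background, while the regularized labeling reattaches it to the hand because four label discontinuities cost more than the poorer data fit.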
In the case of gray-level features, the covariance matrix in the data fitting term simplifies to the variance expression. We implemented the image segmentation on both the color features and the gray-level image features, and the outcomes were very similar. Hence in the sequel, all results are obtained with the constrained minimization run over gray-level images only, although we leave the energy minimization expression above for the general vector case. We minimize this energy using a fast algorithm based on the graph cut method described in . The weight factor $\beta$