Statistical Theory of Shape

Shape in image processing
Shapes provide a rich set of clues on the identity and topological properties
of an object. In many imaging environments, however, the same object appears
to have different shapes due to such distortions as translation, rotation,
reflection, anisotropic scaling, skewing, or shearing. These distortions are generally captured by affine transformations. Further, the order in which the object's
feature points are scanned changes; we refer to these changes as permutation distortions. We show below on the left two images of the same airplane that are affine distorted. When scanned, say lexicographically (top to bottom, left to right), the pixels are not in correspondence.
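As a minimal sketch of the distortion model just described (an illustration under stated assumptions, not code from this project), a configuration can be stored as a 2 x N array of feature points, distorted by an affine map, and then reordered. The names `distort` and `lexicographic_scan`, the four sample points, and the choice of a 90-degree rotation are all hypothetical.

```python
import numpy as np

def distort(points, A, t, perm):
    """Affine-map a (2, N) configuration, then reorder its columns."""
    moved = A @ points + t[:, None]   # affine distortion of every feature point
    return moved[:, perm]             # permutation distortion (new scan order)

def lexicographic_scan(points):
    """Order points top to bottom, then left to right (y first, then x)."""
    order = np.lexsort((points[0], points[1]))  # the last key (y) is primary
    return points[:, order], order

# Hypothetical 4-point configuration; row 0 holds x, row 1 holds y.
X = np.array([[0.0, 1.0, 0.0, 2.0],
              [0.0, 0.0, 1.0, 1.0]])
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])           # a 90-degree rotation, one simple affine map
t = np.zeros(2)

_, order_original = lexicographic_scan(X)
_, order_rotated = lexicographic_scan(distort(X, A, t, np.arange(4)))
print(order_original, order_rotated)  # [0 1 2 3] [2 0 1 3]
```

The same four points are visited in a different order once the object is rotated, so the k-th scanned point of one image is in general not the k-th scanned point of the other.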
To relate shapes like these, i.e., shapes of the same object distorted
by different affine and permutation transformations, is a challenge. Overcoming
the permutation distortions, i.e., the unknown scanning order, is a combinatorial problem: the
correspondence problem. Our work is concerned with developing algorithms that
are invariant to these affine-permutation distortions. We have introduced
the concept of intrinsic shape of an object. It is a uniquely defined representative
of the equivalence class of all affine-permuted distortions of the same object.
The shape of the object is essentially the shape that results after we factor out the distortions. The figure of the airplane above on the right is the intrinsic shape of the two distorted images on the left as obtained by the BLAISER, a blind algorithm described in the references below by Ha and Moura. The distortions are interpreted as actions of the group of distortions (affine-permutation group as a subgroup of the general linear group) on the space of configurations (distorted shapes). We developed a blind algorithm that recovers the intrinsic shape from
an image of the object under an arbitrary, unknown affine-permutation distortion.
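The idea of factoring out distortions can be illustrated with a standard whitening step. This is not BLAISER and not the intrinsic shape defined in the references; it is a sketch, under our own assumptions, of how centering and covariance normalization remove translation and reduce an affine distortion to an orthogonal one. Both steps use sums over all points, so they do not depend on the scanning order. The names `whiten` and `sorted_pairwise_distances`, the matrices `A1` and `A2`, and the random test points are hypothetical.

```python
import numpy as np

def whiten(points):
    """Center a (2, N) configuration and give it identity covariance."""
    centered = points - points.mean(axis=1, keepdims=True)  # removes translation
    cov = centered @ centered.T / points.shape[1]           # 2x2, order-independent
    eigvals, eigvecs = np.linalg.eigh(cov)
    inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return inv_sqrt @ centered

def sorted_pairwise_distances(points):
    """Signature unchanged by rotations, reflections, and point reordering."""
    diffs = points[:, :, None] - points[:, None, :]
    return np.sort(np.sqrt((diffs ** 2).sum(axis=0)).ravel())

rng = np.random.default_rng(0)
shape = rng.normal(size=(2, 12))          # hypothetical feature points

# Two different affine-permutation distortions of the same configuration.
A1 = np.array([[2.0, 1.0], [0.5, 3.0]])
A2 = np.array([[1.0, -2.0], [1.5, 0.5]])
Y1 = (A1 @ shape + np.array([[5.0], [-1.0]]))[:, rng.permutation(12)]
Y2 = (A2 @ shape + np.array([[-3.0], [2.0]]))[:, rng.permutation(12)]

Z1, Z2 = whiten(Y1), whiten(Y2)
same = np.allclose(sorted_pairwise_distances(Z1), sorted_pairwise_distances(Z2))
print(same)
```

After whitening, the two configurations differ only by an orthogonal transformation and a reordering of the points, which is why their sorted pairwise distances agree. Resolving that remaining ambiguity to reach a unique representative is the part this sketch leaves out.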
We are pursuing the definition of shape space and studying the geometry of
this shape space, for example, the notions of distance and geodesics in shape space.
PhD Students involved with this project:
PhD Students who graduated:
David Sepiashvili (May 2006)
Victor Ha (September 2002), Hyeong-Seok Viktor Ha,
formerly with the Mobile Solution Lab, Digital Media R&D Center, Samsung Electronics Co., Ltd.; now with Genesis Microchip, a video display company in Toronto, Canada.
Image and Video Representation: Content-based Image Sequence Representation

Three dimensional video representations: We developed algorithms
that process monocular video sequences and extract 3D models of rigid objects present in the scene.
The shape of the object is described by patches, e.g., planar patches or, more generally, polynomial patches. Our algorithms factor a rank one matrix to obtain the rigid 3D shape and the rigid 3D motions of the objects. Several papers describe this work, see for example Aguiar and Moura [2001,2003]; part of this work was also patented, see "System and Method for Generating a Three-dimensional Model from a Two-Dimensional Image Sequence" (allowed by the United States Patent Office in February 2004). Earlier work fused texture information from a monocular video sequence with range laser measurements to build textured 3D models of objects, see Martins and Moura.
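The rank one factorization step can be sketched generically. How the matrix is built from the video sequence is described in the cited papers and is not reproduced here; the code below only shows, under our own assumptions, how a noisy rank one matrix can be split into two factors with the singular value decomposition. The function `rank_one_factors` and the vectors `a_true` and `b_true` are hypothetical stand-ins for motion and shape parameters.

```python
import numpy as np

def rank_one_factors(M):
    """Best rank-1 approximation M ~ outer(a, b), computed from the SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    a = np.sqrt(s[0]) * U[:, 0]   # left factor (one entry per row of M)
    b = np.sqrt(s[0]) * Vt[0]     # right factor (one entry per column of M)
    return a, b

rng = np.random.default_rng(1)
a_true = np.linspace(1.0, 2.0, 6)     # hypothetical stand-in for motion parameters
b_true = np.cos(np.arange(40.0))      # hypothetical stand-in for shape parameters
M = np.outer(a_true, b_true) + 1e-3 * rng.normal(size=(6, 40))  # noisy rank one

a, b = rank_one_factors(M)
residual = np.linalg.norm(M - np.outer(a, b)) / np.linalg.norm(M)
print(residual < 1e-2)
```

The factors are recovered only up to a reciprocal scale and a common sign: replacing `a` by `c * a` and `b` by `b / c` leaves the product unchanged, so only the product `outer(a, b)` can be compared with the original.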
Modeling human motion: We extended generative video (see below) to capture and model video sequences with human walkers, see our papers Cheng and Moura [1999,1998].
Generative video: Our work is concerned with developing representations
for video sequences based on their content. These representations differ from
those developed for MPEG/H.26X coding standards in that sequences are described
in terms of extended images instead of collections of frames. We describe
how these extended images, e.g., mosaics, are generated by basically the same
principle: the incremental composition of visual photometric, geometric, and
multi-view information into one or more extended images. Different outputs,
e.g., from single 2-D mosaics to full 3-D mosaics, are obtained depending
on the quality and quantity of photometric, geometric, and multi-view information.
In particular, we developed a framework, generative video, that is
well suited to the representation of scenes with independently moving objects.
Content-based video representations can potentially provide compression ratios
that are in the range of 1000:1 with acceptable quality. This work is the subject of several papers; a comprehensive description is in the CRC Chapter [2004] and in the papers [1996,1995a,1995b]. Part of the work is patented, see "Generative Video: Very Low Bit Rate Video Compression."
Video over wireless: We demonstrated the high compression at good quality provided by generative video with a very early (1995) demonstration of transmitting video over wireless. The demonstration interfaced two highly heterogeneous wireless networks: a 'fast' 2 Mbps local area network (CMU's Wireless Andrew) and a 'slow' 19.6 Kbps metropolitan area network. The results were reported in [1996], in an invited paper published by the IEEE Personal Communications Magazine.
Predictive lossy compression: We developed noncausal random field
models to describe the image texture and, based on them, novel predictive
coding algorithms that compared very favorably with transform-based coders, see for example the patent
"Noncausal Predictive Image Codec," or the papers Balram and Moura [1996, 1993], Moura and Balram [1992], and Asif and Moura [1996].
PhD Students who graduated:
Patents and disclosures:
- "Generative Video: Very Low Bit Rate Video Compression," José M. F. Moura and Radu S. Jasinschi, US Patent and Trademark Office, S.N. 5,854,856, issued December 29, 1998.
- "Noncausal Predictive Image Codec," Nikhil Balram and José M. F. Moura, US Patent and Trademark Office, S.N. 5,689,591, issued November 18, 1997.
- "System and Method for Generating a Three-dimensional Model from a Two-Dimensional Image Sequence," Pedro M. Q. Aguiar and José M. F. Moura, provisional patent filed July 1999; patent filed with US Patent and Trademark Office, Serial Number 09/614,841, July 12, 2000. Notice of allowance February 23, 2004.
