Brian C. Madden, Ph. D.
Department of Dermatology
Box 697
601 Elmwood Avenue
Rochester, NY 14642
USA
(585) 275-4526
Brian_Madden@urmc.rochester.edu
( ↑ Yes, that is an underscore, ... don’t ask.)
Proprietor
I am an electrical engineer with a 40+ year history of research in vision. Interesting opportunities of all sorts have come and gone over the years, but somehow I have always found my way back to study vision. The experiments and systems I’ve developed over the last decade stemmed from a desire to extend this study into medical imaging. That desire resulted in the creation of The Skin Appearance Laboratory here in the department of dermatology. The facilities of the laboratory and the work being accomplished there are described in the Research section. What brought me here is summarized below.
Additional material is available by following the thumbnail image links.
Firing up the WABAC Machine:
RCA, Aerospace Communications and Controls Division
In the summer after my junior year in high school as part of the
Thayer Academy Summer Science Program, I worked at a division of RCA that was
an Apollo Program subcontractor. There I wrote programs to simulate heat
diffusion of the Lunar Rover electronics to account for convective, conductive
and radiative transfer under ambient conditions covering both transit and lunar
environments. The hard part wasn’t encoding the algorithm. The hard part was minimizing
the number of reversals of the magnetic data tape. The RCA 301 computer used
the sort of vacuum-buffered magnetic tape drives that ran the risk of
stretching the tape every time the tape direction was reversed. A stretch in
the tape would very likely cause an end-of-record mark to be missed and crash
the long-running simulation.
After gluing together so many space rocket kits growing up, it was a real treat to see physical models of proposed NASA hardware in the lab. Some projects fade from memory and you often don’t know what happened to them. I always know just where to look for this one: overhead in the Apollo 15-17 lunar parking lots. And with the new Lunar Reconnaissance Orbiter images, we should all be getting a fresh look at these historic vehicles.
Tufts University
I obtained
my first exposure to vision research working on eye movement psychophysics in
Sam McLaughlin’s lab at Tufts. Sam was interested in the non-surgical treatment
of strabismus in children. Using red and green anaglyph glasses and a custom
projector in a dark room, exercises could be performed to strengthen the
alignment capability of the children’s eyes. In support of this work, he
developed eye movement models that investigated the effects of parametric
readjustment. It is a pleasure to see his theory having a renaissance in eye
movement circles today, now forty years on.
I remember when Sam gave me an early Campbell and Robson linear systems physiology paper and told me I should read it ‘because these guys got it right’. Working in this laboratory was my first real exposure to empirical science in the wild (there were no answers in the back of the book), and it is what got me hooked on vision.
Digital Equipment Corporation
DEC was a wonderful place to work. For
a design engineer it was an environment where it was possible to do creative
engineering. This was made possible in no small part because (back then) the
company was run by engineers. George Fligg was an experienced electrical engineer who also worked in Control Products, and he often served as a mentor. He would give me advice such as: never work for a company that doesn't do what you do (at 22, it took me years to fully appreciate the implications of this), and wear serviceable pants, because you never know what you might need to crawl under (this point became evident straightaway in a time when engineers wore jackets and ties and all the floors were soaked with lanolin from the days when the Assabet Mills site housed the largest woolen mill in the world). He would acknowledge a
neat design but then make me understand that while an impulse created by
running the output of a D-flop back into the reset might reduce chip count, it
wasn’t a function the device specification supported and therefore couldn’t be
used in a product.
After cutting my teeth creating some M-Series modules, I worked on the hardware design for a computerized direct numerical control (DNC) system for machine tools. The minicomputer that controlled everything, a box the size of a medium-sized suitcase with flashing lights and toggle switches on the front panel, was a PDP-8/L. The DNC could drive two Bridgeport milling machines doing circular interpolation while timesharing with the parts programmer creating new designs on the teletype console. All this with 4K of core (okay, they were 12-bit bytes). When hooked up to two massive Behrens punch presses on the production floor moving at 300 inches per minute, it was a force of nature. This led to the first project for which I assumed responsibility, a redesign of the PDP-14. Digital Equipment Corporation was indeed a wonderful place to work, but nothing lasts forever (DEC, Compaq, Hewlett Packard ... sigh).
Adams-Smith, Inc.
Adams-Smith
built a variety of custom digital instrumentation products for video, control
and measurement applications. Working there gave me an inside view of a small
(four-man) startup. It was an education in what it took to design without a net.
There wasn’t much of a margin for error. A bad mistake could have sunk the
company. As my education continued, I eventually learned to interpret (but
never to properly speak) Australian.
On an historical note, I was given the chance at Adams-Smith to design with the Intel 4004 a few months after it was announced. I remember liking the device simply because it would reduce the physical size of the product significantly. Who knew? It wasn’t like the thing came with a label: This device will change the world as you know it.
IBM Research
At IBM I
worked in the Experimental Systems Group at Yorktown Heights on the study of
programming language design. The goal was to examine naturally occurring
programs and to analyze them using the framework of Fillmore’s case grammar. I
ended up collecting as many versions as I could find of recipes for Quiche
Lorraine, Beef Stroganoff and Sukiyaki. The recipes were parsed to extract
actions, objects and modifiers. The analysis was extended to 50 different recipes
from the Joy of Cooking and the resulting cooking lexicon was presented to
undergraduates at a local university so their judgments of word associations
could be used to extract the intrinsic relations present in particular recipes,
and for cooking in general.
While at IBM, I also worked on Query By Example, which later, much to my surprise, evolved into a product and showed up in an IBM Super Bowl ad. By then, however, I’d left and gone back to school to study vision. I had come to understand that intuition about language was not my strong suit (and the philistines at Yorktown Heights had paved over the clay tennis courts).
University of Rochester
The Center for Visual Science was a
place where you could learn biological vision from soup to nuts (especially if
you were a graduate student there as long as I was). Among the many things I
worked on, from Necturus eyecup preparations to ideal
spatial patterns to
psychophysical scaling of beauty and beer, there are two projects that stand out in my memory.
What started as a proposal for a physiologically based binocularity metric developed into a characterization of nonlinear cell responses in cat visual cortex with fellow graduate student Mike Mancini. In addition to traditional bar, edge and grating stimuli, we applied Wiener kernel analysis to cells in V1 and V2. Once, we held onto a cell for ten hours. We ran every test we had, three times over. To the best of my knowledge, these experiments produced the first white noise analysis to overcome the increased inhibition that is evident in mammalian cortical recordings.
In some ways this technique gave a very different view of V1 cell responses from that in the popular canon. For example, ‘nonlinear’ Y-cells exhibited a substantial first-order kernel. One could interpret the nonlinear full-wave rectified subunit activity as a way for those cells to raise more of the linear component of the response above the threshold cutoff, a form of Spekreijse’s linearization process. It certainly would seem much easier to remove a nonlinear DC shift at a later stage than to cobble together two matched, mirror-symmetric, thresholded ‘linear’ X-cells for every orientation, spatial frequency and phase at every position across the visual field. The validity of any such classification depends on how you define ‘nonlinear’, and on what in fact constitutes an essential, irreversibly distorting nonlinearity in the function of those cells when viewed in the context of cortical visual processing as a whole.
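For the curious, the first-order kernel in question is usually estimated by reverse correlation of the white-noise stimulus against the recorded response (the Lee and Schetzen cross-correlation method). The sketch below, in Python with made-up variable names and bin counts, illustrates that estimate; it is not the analysis code we actually ran.

import numpy as np

def first_order_kernel(stimulus, response, lags=64):
    # Estimate the first-order (linear) Wiener kernel by cross-correlating a
    # zero-mean white-noise stimulus with the recorded response (e.g. binned
    # spike counts), following the Lee and Schetzen method.
    stimulus = np.asarray(stimulus, dtype=float)
    response = np.asarray(response, dtype=float)
    stimulus = stimulus - stimulus.mean()
    var = stimulus.var()
    n = len(stimulus)
    kernel = np.zeros(lags)
    for tau in range(lags):
        # average stimulus value tau samples before each response sample,
        # normalized by the stimulus variance
        kernel[tau] = np.dot(response[tau:], stimulus[:n - tau]) / ((n - tau) * var)
    return kernel

Higher-order kernels come from the analogous higher-order cross-correlations; those are what expose the rectified subunit activity mentioned above.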
In
my thesis, I proposed a model of spatial visual acuity based on data obtained
from animal electrophysiology and human psychophysics. The model succeeded in explaining the substantial differences observed among the various
measures of spatial acuity. According to my theory, the visual system can
localize stimuli only down to labeled regions that are, at least, several
minutes of arc wide. This labeling forms the substrate of the absolute position
sense. Other measures of spatial acuity (relative positional judgments) have
limits that are one (two-line resolution) and two (localization hyperacuity)
orders of magnitude finer because they allow the coarse positional labeling to
be supplemented by the detection of changes in contrast within limited bands of
spatial frequency. It is the use of an intensive dimension (contrast) as a supplement to a labeled dimension (location), drawing on visual abilities that are not part of the absolute position sense, that elicits this exceptional spatial performance.
The failure of traditional theories to reconcile the differences in observers' ability to detect positional displacements with some form of contrast sensitivity measure occurs because the discrimination sensitivity of an array of contrast-sensitive filters varies with the configuration of the different acuity targets as well as with changes in position. This combination of sources of stimulation precludes any fixed, obvious transformation between contrast and position. Transformations that vary with stimulus configuration are still able to support the relative positional judgments of two-line resolution and localization hyperacuity tasks as long as the responses are monotonic with position over the range of positional variations being compared. Hyperacuity stimuli get an extra boost in sensitivity because the minutes-of-arc separation of their features produces undulating spatial frequency spectra that, when positionally displaced, stimulate bandpass filters operating at threshold performance. By incorporating this interdimensional synergy between positional labeling and contrast sensitivity, I was able to model the wide range of interactions present in the two-line resolution and hyperacuity literature. The model required nothing more than coarse, minutes-of-arc positional labeling and the same bandpass contrast sensitivity filters required for the detection and discrimination of sinewave gratings.
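A toy numerical illustration of that synergy (the filter parameters and stimulus geometry here are invented for the example, not taken from the thesis): an odd-symmetric bandpass filter centered between two thin lines gives no response when the pair is centered, yet a displacement of only ten seconds of arc leaves a clearly unbalanced response.

import numpy as np

def odd_gabor(x_arcmin, freq_cpd=8.0, sigma_arcmin=4.0):
    # Odd-symmetric bandpass filter profile over position (arcmin).
    return np.exp(-x_arcmin**2 / (2 * sigma_arcmin**2)) * \
           np.sin(2 * np.pi * freq_cpd * x_arcmin / 60.0)

def pair_response(offset_arcsec, separation_arcmin=4.0):
    # Filter response to two thin lines straddling the filter center,
    # with the pair displaced as a whole by offset_arcsec.
    shift = offset_arcsec / 60.0
    lines = np.array([-separation_arcmin / 2, separation_arcmin / 2]) + shift
    return odd_gabor(lines).sum()

print(pair_response(0.0))    # ~0: the two line responses cancel exactly
print(pair_response(10.0))   # a 10 arcsec shift leaves a clear unbalanced response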
While all this research was winding up, and prior to my defense, I had the good fortune to be invited to give a presentation at the Rank Prize Funds Symposium on Biological and Engineering Aspects of Visual Hyperacuity and Depth Perception that took place in January, 1984. Looking out over the hall at the University of Cambridge, I discovered that practically every living researcher cited in my thesis was there. With this experience behind me and the right guidance, the actual defense was somewhat anticlimactic.
Following my defense, I started a postdoctoral fellowship in the computer science department at the University of Rochester. There I had the opportunity to begin my studies in computer vision and robotics. In particular, I began work on translating the results I’d learned in biological vision to computational vision.
Xerox Research
At
Xerox, the research involved the study of visual psychophysics examined through
the lens of xerography. There I learned about the relation between the
structure of images, the devices that made them, and the visual system. In particular,
I learned the practical effects of an imaging chain from incident light to
perception. The research looked at how tiny piles of charged toner form letters and images and how they appeared, not only at threshold performance levels of the human visual system, but also how those percepts varied at suprathreshold contrast levels.
Kodak Health, Safety and Human Factors
At Kodak,
the project was to study the imaging characteristics of film. In this work, a
different portion of the imaging chain was examined – image formation on slide
film and the interaction of film and projection systems. I examined film
structure by exposing and developing thousands of slide images of different
contrast test patterns. Combinations of sinewave and squarewave gratings of different spatial frequencies were obtained under various exposure conditions using the 4,096-line horizontal-resolution Vernier target display I had designed for my research at the university. After the film was developed and brought back to Kodak, its structure was examined using a densitometer the size of a piano. After 100 years in business, Kodak undertook this project as part of an effort to break out of the thinking that arose from its vertically organized business structure and to optimize more than individual components of the photographic process.
Xerox Development
Back
at Xerox, I worked on introducing an area sensor into the imaging chain for a
simulation of a laser scanner. This modification allowed the perceptual
consequences of different product designs to be examined prior to fabrication. In
particular, it allowed the study of the magnitude of distortions caused by
positional noise during scanning where different sized regions of the image
were simultaneously captured. The goal of the simulation was to increase scan
efficiency while moving the spatial frequency properties of any artifacts to
regions of lesser visual sensitivity.
Boeing, Helicopter Division, Advanced Computing Technology
At
Helicopters, I worked as an artificial intelligence specialist. There I wrote
Boeing’s 5-year technology forecasts for image processing and for robotics. On
the plus side, going down to NIST three days a week to work on Y14.5
dimensioning and tolerancing standards was both an education and a pleasure. I
also designed a pilot study that tested whether a Cartesian robotic gantry
controlled by a vision sensor could be used to assist in automated mark-up
during composite assembly.
There was so much opportunity at Helicopters, yet there was also a very deep and underappreciated need for the application of technology to the manufacturing process. It must be said that Dilbert’s world was alive and well at Boeing – pointy-haired bosses, cubicles and all. It is easy to understand the stories of Thomas Pynchon, who worked at Boeing earlier on, making a tent on his desk with D-size blueprints so as to have a place to hide. Fortunately for me there was Haim, who kept me both caffeinated and sane.
University of Pennsylvania, Grasp Lab
Extended
Intensity Range Imaging (EIRI) came from work 15 years ago in the robotics lab
at the University of Pennsylvania. It was based on a model of light adaptation
I developed 15 years earlier for my thesis (but didn’t need to use). The
adaptation model was based on physiological data of photoreceptor responses,
much of it surprisingly overlooked in the psychophysics literature. Its
implementation morphed over the years from Data General to PDP-11 assembly
language to APL, FORTRAN, C, and finally to Matlab. EIRI happened along one day
while I was playing with the adaptation model on a workstation in the lab. The
idea came into existence during the few hundred milliseconds between someone
looking over my shoulder and asking ‘What good is that?’, and the response ‘Well,
you could ...’.
The technique involved capture of multiple images, each at a different exposure setting. The set of images obtained in this way could then be fused into a single representation with the pixels taken from multiple images, each pixel adjusted in value to reflect the variation in their capture sensitivities. Pixel values in the composite representation were selected from the input image with the highest sensitivity for which that pixel was not saturated. This fusion resulted in the most accurate (smallest quantization) composite view, an image with an arbitrarily large dynamic range. The result was effectively a floating point image. Such a capability certainly would be adequate to accommodate even the million-to-one luminance difference present between the representations of reflectance in dark shadows and on luminous sources, both common conditions in physical scenes (q.v. thumbnail image link above). The development of composite capture was the leading edge of the move toward representing the dynamic range of real scenes on more limited paper and electronic display modalities. It was an accommodation to the equipment of the day, but the need for artifice in the redistribution of luminance values will lessen as capture and display technologies catch up with the performance imposed by viewer demand and provided by innovation.
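In today's terms it was exposure-bracket fusion. A minimal sketch of the selection rule just described, in Python with illustrative names, not the code actually used:

import numpy as np

def fuse_exposures(images, sensitivities, saturation=0.95):
    # images: the same scene captured at different exposure settings, scaled to [0, 1]
    # sensitivities: relative exposure (e.g. exposure time) for each image
    # For each pixel, take the value from the most sensitive capture in which
    # that pixel is not saturated, then divide by the sensitivity so all values
    # share one radiometric scale -- effectively a floating point image.
    order = np.argsort(sensitivities)[::-1]          # most sensitive first
    fused = np.full(np.shape(images[0]), np.nan)
    for i in order:
        img = np.asarray(images[i], dtype=float)
        usable = (img < saturation) & np.isnan(fused)
        fused[usable] = img[usable] / sensitivities[i]
    # pixels saturated in every capture fall back to the least sensitive one
    least = order[-1]
    fallback = np.asarray(images[least], dtype=float) / sensitivities[least]
    return np.where(np.isnan(fused), fallback, fused)

Fusing this way keeps quantization error small everywhere, since each pixel is read from the capture that made the most of its available range.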
The
gestation of PennEyes was somewhat longer than it took to conceive of EIRI. The
binocular platform had its roots in a DoD-funded grant to HelpMate Robotics, Inc.
in partnership with several universities with the goal of deploying similar
hardware to multiple laboratories. With a standard unit built with
state-of-the-art components, it was intended that more time could be spent
researching applications. The head was fabricated and distributed by HelpMate. Although
the binocular platform incorporated two CCD cameras and two lenses with
motorized zoom, focus and aperture, it was still light enough to be supported
by a PUMA 560 arm.
PennEyes was designed to be a three-dimensional visual servo. The binocular platform provided high performance rotation for each camera. The primary vergence and version capabilities were exceptional, with a peak velocity of 1000 degrees per second and a peak acceleration of 12,000 degrees per second squared. The head itself was supported by a six-degree-of-freedom robotic arm. One of PennEyes’ main characteristics was the capacity for 3D tracking redundancy. In a tracking application, performance was optimized through the initial activation of the most responsive positioning axis combined with supplemental or compensatory movements of the slower axes. Depending on the task requirements, supplemental motions could be used to increase performance while compensatory motions could be used to keep the device centered in its range to better accommodate future tracking requirements.
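A rough sketch of this supplemental/compensatory allocation for a single fast/slow axis pair; the gains, limits and names are invented for illustration and are not the PennEyes control law:

def redundant_step(error_deg, pan_deg, arm_deg,
                   pan_gain=0.8, arm_gain=0.05, pan_limit=30.0):
    # The fast camera (pan) axis absorbs the instantaneous tracking error,
    # while the slower arm axis drifts the whole head in the same direction
    # so the pan axis is continually re-centered in its range.
    pan_step = pan_gain * error_deg            # supplemental motion
    arm_step = arm_gain * pan_deg              # compensatory re-centering
    new_pan = max(-pan_limit, min(pan_limit, pan_deg + pan_step - arm_step))
    new_arm = arm_deg + arm_step
    return new_pan, new_arm                    # gaze direction ~ new_pan + new_arm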
In addition to the degrees of freedom in the motion of the head and arm, PennEyes also incorporated optical and electronic degrees of freedom for tracking. With the motorized control of lens zoom, it was possible to compensate for relative motion in depth between the head and target and thereby maintain a nearly constant target size in the acquired images. By stabilizing the target in the displayed image frame by electronically shifting the target, it was possible to compensate for tracking errors in all three dimensions. While this capability enabled the position of the moving target to be stationary in the image, that very stabilization had the effect of enhancing the subjective visibility of the tracking error distortions such as motion and defocus blur. Other changes such as specularities and luminance variations were also decidedly more apparent with the stabilized targets than when the tracking error and background context were left in. Alternate path and configuration choices were found to affect tracking and image quality metrics very differently.
A different type of redundant tracking involving a robotic arm was developed for a manipulator on a hovering subsea craft for the Deep Submergence Lab at Woods Hole Oceanographic Institution. The goal was to track moving objects and to combine that information with the arm kinematics to calibrate the lenses and cameras on the vehicle. Once calibrated, an imaging system with multiple cameras could then be used to provide stereo vision and to locate objects in the real world, wet or dry.
Part of the effort involved enhancing existing algorithms that computed the location of the centroid of a spherical object mounted on a robotic arm until it could be consistently located in an image to within 0.01 pixel. With this level of visual performance, the system was able to track a moving target with an accuracy of better than 3 millimeters even in the presence of occlusion of any one of the three cameras – a common occurrence in the ocean.
Postscript: A casual comment led to a wager that took the better part of a summer to achieve, but in the end the centroid localization was improved by another order of magnitude, to 0.001 pixel.
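The enhanced algorithms themselves are not reproduced here, but the core of sub-pixel localization is simple enough to sketch. This is a generic intensity-weighted centroid in Python, my own illustration, without the calibration and occlusion handling the Woods Hole system required:

import numpy as np

def subpixel_centroid(image, threshold_frac=0.1):
    # Intensity-weighted centroid of a bright blob, returned to sub-pixel
    # precision as (x, y) in pixel coordinates.
    img = np.asarray(image, dtype=float)
    img = np.where(img > threshold_frac * img.max(), img, 0.0)  # drop background
    total = img.sum()
    if total == 0.0:
        raise ValueError("no target above threshold")
    ys, xs = np.indices(img.shape)
    return (xs * img).sum() / total, (ys * img).sum() / total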
The
Binocular Camera Platform Home Page grew out of the PennEyes project and
provided a resource to a growing community of roboticists. To maximize
visibility, it was located on The Computer Vision Homepage website maintained
at CMU. The principal resource was a fairly complete listing of the motorized
binocular camera platforms that appeared in the literature. The listing
contained a description of the major components of each head and any available
email contacts for the associated researchers. The site also listed sources for
the various components required to build a platform – cameras, lenses, position
controllers, software, and commercial systems. Of all places, I found a perfect
candidate to be the patron saint of mobile robotic heads, St. Denis, on a door
of the Notre Dame Cathedral in Paris while on the way to a robotics conference
in the French Alps courtesy of the EU.
Building
on the Active Vision paradigm, Hany Farid and I proposed a technique to use
controllable sensors in a manner that would provide both an efficient and
effective method of telepresence. The paper describes in detail the problems of occlusion, lighting and focus. A solution is presented that acquires the necessary images with active vision and generates the required scene, using focus ranging to compute depth information and image stitching and warping to compose the views. It is demonstrated that a sparse array of active cameras could provide enough information to allow a remote viewer to become perceptually immersed in a distant scene to a degree at which applications such as telemedicine become viable.
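As an aside on the focus-ranging step, a minimal depth-from-focus sketch picks, for each pixel, the lens focus setting that maximizes a local sharpness measure. The measure and the structure below are assumptions for the illustration, not the algorithm from the paper:

import numpy as np

def sharpness(img):
    # Squared Laplacian (finite differences) as a simple local focus measure.
    lap = (-4.0 * img[1:-1, 1:-1] + img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:])
    return lap ** 2

def depth_from_focus(stack, focus_distances):
    # stack: grayscale frames of the same scene at different lens focus settings
    # focus_distances: the distance each frame was focused at
    measures = np.stack([sharpness(np.asarray(f, dtype=float)) for f in stack])
    best = measures.argmax(axis=0)               # sharpest frame per pixel
    return np.asarray(focus_distances)[best]     # coarse per-pixel depth map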