Funded Grants


The Computational Basis of Human Audition: Representation, Recognition, and Segregation

Just by listening, we can discern a vast number of important things about the world around us: what someone said, their emotional state when they said it, and whether it is windy or raining outside, to name but a few examples. The ease with which we make such judgments belies the inferences that support them. Sound is created when events in the world cause air molecules to vibrate, and takes the form of pressure waves that propagate through the air. Our auditory system measures sound with two sensors (the ears) that register the back-and-forth vibrations of the eardrums in response to sound. Information about the world is encoded in this pair of one-dimensional signals, and the brain must decode it to correctly interpret events in the world. Our sense of hearing is fragile, and when it is impaired, the consequences can be devastating. Hearing is also an impressive feat of engineering. Machine systems for interacting with people fall far short of human auditory abilities, and are limited as a result.

My long-term goals are to understand how humans derive information from sound, to further technologies for treating hearing when it breaks down (e.g. hearing aids), and to improve machine audio algorithms (e.g. for separating mixtures of sounds, or for identifying sound sources). A central theme of my work is to combine the study of auditory perception and cognition with research on computational tools for manipulating and analyzing real-world sounds. These tools yield new insights about human hearing and enable novel experimental approaches. They also hold promise for improving machine hearing systems.

My interest in perception stems from the remarkable success of biological perceptual systems. Humans routinely accomplish feats that are far beyond the reach of even the most advanced and powerful machine systems designed to solve the same tasks. I believe that understanding how humans hear entails being able to construct systems that mirror our abilities, and that attempting to do so can provide unique forms of insight into human hearing. I aim to conduct experiments in humans that reveal how we succeed in situations where machine systems fail, and to use results in computational audio to motivate new experimental work.

Two central problems of audition figure prominently in my work. The first is that of recognition. Although speech recognition is the subject of a vast scientific literature, the general problem of sound recognition is little studied. Part of my current work is aimed at developing and testing theories of auditory recognition. A second focus is the problem of sound segregation. Much of the time our environments contain more than one thing that produces sound, and the sounds from the individual sources add together into a mixture that enters the ear. However, listeners are typically interested in particular individual sound sources, and must derive a representation of a single source from the mixture. This is a classic problem in hearing, to which I apply novel approaches rooted in the statistical analysis of natural sounds.