Funded Grants


Dynamics and metastability in phonological grammar

Human speech provides an example of a complex system which is central to our nature and situation as human beings. In some ways it resembles other complex systems, such as the weather. Speech, like the weather, is endlessly productive: no two of the events it produces are ever exactly identical. At the same time, speech is structured, with some general types of outcomes being more common than others, both within languages and across languages. The existence of these preferred regions or states is known to be related to nonlinearities in the underlying physical system (including both the motor system, which is part of human physiology, and the aerodynamics and acoustics, which characterize our natural surroundings). Nonlinearities are also implicated in the structures found in other complex systems. Lastly, speech displays structure at many different scales, from subparts of the syllable up through metrical structures, words, intonation phrases, and even stretches of discourse. The weather likewise shows systematic effects at many different scales, from the eddies created by a tall building in the wind up through hurricanes and the jet stream.

An important feature of the cognitive system underlying speech relates to the fact that a human mind can treat itself as both the agent of generalization, and as the object of generalization. The human mind, contemplating the external world, can form generalizations about it. In addition, it can treat its own mental states as data, and form higher abstractions about them. The ability to form and elaborate such higher level abstractions is a hallmark of human intelligence. It is perhaps related to consciousness, but it is not the same thing as consciousness. This is because the abstractions involved in speech production and speech perception are used with such speed and automaticity (due to the extremely practiced nature of the linguistic system) that the processing is unconscious. If linguists are able to establish that an abstract generalization exists, then they might speak of unconscious or implicit knowledge of that generalization. For example, implicit knowledge of word relationships is revealed by the ability to form neologisms such as contriten (to render contrite, on the pattern of tight/tighten) and to reject a neologism such as externalen (on the intended reading to render external).

This project deals with implicit knowledge at three of the many levels of the human language system. These are the levels involved in the cognitive representation of the sound structure of language - phonetics through phonology. The first level is an analogue map of the phonetic space, on which physical speech events are encoded in terms of their perceptually salient parameters; it is analogue because quantitative similarity in all dimensions is relevant to the way it is used in speech processing. The second is the lexicon, or stored set of sound patterns of meaningful words (the meanings themselves play a subordinate role in this study). The third is the phonological grammar, a general abstract characterization of the forms of words. One of the jobs of the grammar is supporting the productivity of the entire system. The grammar is compositional, determining the relative well-formedness of complex words from their subparts. When parsing neologisms such as bnik or pranfletic, it will determine that bnik is not a possible word of English, but that pranfletic is. It adapts foreign borrowings into native form. It is also implicated in many other observable phenomena, such as speech errors, word games, and historical changes in sound patterns.

The key to the productivity of the grammar is its compositionality. It evaluates complex forms in terms of their subparts. If the grammar just listed as acceptable all the real words, and nothing more, then it would never accept any new words. What are the subparts? Where do they come from? Why do they seem so similar from one language to another, when superficial phonetic details of languages differ in so many ways?
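
One simple way to make this compositional evaluation concrete is sketched below. The sketch is purely illustrative and is not the grammar proposed here: the biphone model, the toy lexicon, and the function names (train_biphone_grammar, wellformedness) are assumptions introduced only for exposition, with ordinary letters standing in for phonetic segments. A candidate form is scored as the product of the probabilities of its subparts, so a form containing an unattested subpart (the bn of bnik) scores zero, while pranfletic, built entirely from attested subparts, scores above zero.

```python
# Illustrative sketch only: a biphone model is one simple stand-in for a
# compositional grammar; the project itself does not commit to this form.
from collections import Counter

def train_biphone_grammar(lexicon):
    """Estimate biphone probabilities from type counts (each word counted once)."""
    pair_counts, left_counts = Counter(), Counter()
    for word in set(lexicon):
        segs = ["#"] + list(word) + ["#"]          # "#" marks word edges
        for a, b in zip(segs, segs[1:]):
            pair_counts[(a, b)] += 1
            left_counts[a] += 1
    return {pair: n / left_counts[pair[0]] for pair, n in pair_counts.items()}

def wellformedness(word, grammar):
    """Score a form as the product of the probabilities of its subparts."""
    segs = ["#"] + list(word) + ["#"]
    score = 1.0
    for a, b in zip(segs, segs[1:]):
        score *= grammar.get((a, b), 0.0)          # unattested subpart -> 0
    return score

# Hypothetical toy lexicon; letters stand in for phonetic segments.
lexicon = ["prank", "frantic", "pamphlet", "infant", "fleet", "tick", "brick"]
grammar = train_biphone_grammar(lexicon)
print(wellformedness("pranfletic", grammar))       # small but nonzero
print(wellformedness("bnik", grammar))             # 0.0: the subpart "bn" is unattested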

This project seeks to address these questions by mathematical simulations of the grammar as an emergent system in the human mind. Contrary to a nativist approach, which presumes that the categories and structures of phonology are innate (that is, that they emerged through evolution and are activated during language acquisition), the project explores the hypothesis that the categories and structures of phonology emerge in populations of humans within their lifetimes. This means that the innate human capability for language is more abstract than nativists would suggest, and that understanding the dynamics of the system can make a crucial contribution to understanding its near-stable state in the adult.

Prior empirical results have provided crucial insights into how this emergence might occur. The minute phonetic details which characterize individual languages, the gradualness with which these details are acquired, and the gradience of historical changes in progress, all point to a system in which probability distributions are continuously updated through experience. The importance of type statistics (frequency counts over words), as against surface statistics (frequency counts on the speech stream) shows that probability distributions are implicitly maintained at both surface and abstract levels. The adult system permits the listener to recover the speaker's intent with extremely high accuracy. Almost all children learn to talk well, no matter which words of the ambient language they experience in which order. Grammars change slowly in comparison to the rate at which conversations take place; otherwise, communication would not succeed. In short, the phonology of any given language (as instantiated in the minds of youthful and adult speakers who are in communication with each other) exhibits notable properties of convergence, statistical robustness and near stability (metastability). Because these properties are found in only a few of the incredibly many dynamical systems which mathematics can provide, they are critical to the scientific characterization of human language. By developing and evaluating the behavior of schematic multi-agent models, I hope to follow up these broad insights with specific explanations and predictions about some of the surprising and remarkable characteristics of human phonologies.
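
The contrast between type and surface statistics invoked above can be illustrated with a minimal sketch. The word list and the function name (biphone_counts) are hypothetical and not drawn from the project's databases; the point is only that the same counting procedure, applied once to the running stream of tokens and once to the set of word types, yields different frequency profiles for the same pattern.

```python
# Minimal illustration of surface (token) vs. type statistics;
# the word list is hypothetical and letters stand in for phonetic segments.
from collections import Counter

def biphone_counts(words):
    counts = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            counts[(a, b)] += 1
    return counts

stream = ["the", "the", "the", "cat", "the", "catnip"]    # running speech (tokens)
surface_stats = biphone_counts(stream)                    # counts over the stream
type_stats = biphone_counts(set(stream))                  # counts over word types

print(surface_stats[("t", "h")], type_stats[("t", "h")])  # 4 vs. 1
```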

The project will use existing speech and text databases for English and French, as well as on-line dictionaries. These languages are selected because they have notable differences in their phonetic inventories and prosodic structure. These databases will be mined for statistical regularities to be used in constructing schematic models of phonologies in multi-speaker groups. The general format of these models will integrate insights from prior work on phonetic categorization and on usage-based phonology. The exact behaviors of these models will be assessed by using the same databases for Monte Carlo simulations (in which random samples of varying sizes serve as the speech experience that incrementally updates the listener). The results of these calculations will be used to make general deductions about phonology as a capability of humankind.
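
A schematic version of such a Monte Carlo run is sketched below, under strong simplifying assumptions: the "database" is a small word list, the listener's representation is a biphone count table, and each heard token updates that table incrementally. The class and function names (Listener, simulate), the corpus, and the update rule are illustrative assumptions rather than the project's actual model; the sketch only shows how estimates drawn from random speech samples of increasing size can be tracked for convergence.

```python
# Schematic Monte Carlo sketch; the corpus, the listener's representation,
# and the update rule are all illustrative assumptions, not the project's model.
import random
from collections import Counter

class Listener:
    """A listener whose biphone statistics are updated incrementally."""
    def __init__(self):
        self.pair_counts, self.left_counts = Counter(), Counter()

    def hear(self, word):
        segs = ["#"] + list(word) + ["#"]
        for a, b in zip(segs, segs[1:]):
            self.pair_counts[(a, b)] += 1
            self.left_counts[a] += 1

    def prob(self, a, b):
        total = self.left_counts[a]
        return self.pair_counts[(a, b)] / total if total else 0.0

def simulate(corpus, sample_size, seed=0):
    """One run: expose a fresh listener to a random sample of spoken tokens."""
    rng = random.Random(seed)
    listener = Listener()
    for _ in range(sample_size):
        listener.hear(rng.choice(corpus))
    return listener

corpus = ["prank", "frantic", "fleet", "tick"]    # stand-in for a speech database
for n in (10, 100, 1000, 10000):                  # speech samples of varying sizes
    listener = simulate(corpus, n, seed=n)
    print(n, round(listener.prob("t", "i"), 3))   # the estimate settles as n grows
```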

Understanding the relationship between speech experience and phonological competence is important, because some humans do not acquire phonological competence at normal rates. Acquisition failure can impact their ability to speak or read. Normal speakers of one language often wish to learn a second language. For both of these populations, it is important to design educational programs which provide optimal experiences for developing robust and productive cognitive representations.

The study of the emergence of phonology has further importance as a case study in the science of mind. Anyone who views the mind as embodied in the state of the brain is surely impressed by the ability of the brain to encapsulate general principles about the external world. Black holes were predicted by Chandrasekhar from the equations of physics long before they were observed in the cosmos. A difficulty in creating a general theory of mind is that intelligence relates to our general understanding of the world - in short, to the whole theory of everything. Such a theory does not exist now, and it may never exist. However, in the case of phonology, the relevant physical world is limited, being the world of articulatory gestures, aerodynamics, and acoustics as used in speaking. In the last 40 years, there has been immense progress on the scientific characterization of this micro-world. For many of its aspects, exact equations are known, and heuristic assumptions about equations have permitted the construction of quite successful articulatory and perceptual models. As a result, phonology provides a microcosm in which the relationship of the mind to the world can be explored in an exact, quantitative fashion. The findings from this exploration will set precedents for examining the same relationship in other areas.