Approximating meaning: structured statistical semantics

The Role of Semantic Models in Cognitive Science Research and Application

Higher-order cognition is fundamentally dependent on knowledge. To understand comprehension, thinking, and decision making, we must consider the role knowledge plays. We need to know about the content of knowledge, its structure, how it is retrieved, and how it is used in context. We do not have, however, an adequate model of human knowledge. In the absence of such a model, researchers have taken shortcuts: they have used hand-coded knowledge structures as they needed them in their models of comprehension, problem solving, analogical reasoning, concept learning, and decision making. That is certainly a defensible strategy, and it has been used widely and often successfully by just about every theorist and model builder in cognitive science. However, the time has come when we can do better. Although we still lack a comprehensive theory of knowledge, and probably won’t have one for some time, promising beginnings have been made. In recent years, developments in cognitive science and machine learning have made it possible to build statistical models of meaning that serve as useful tools for modeling higher-order cognitive processes, and that can lead to successful practical applications as well.

The model that we have worked with in the past to simulate human verbal knowledge is Latent Semantic Analysis, or LSA. The input to LSA is a very large corpus of text, such as 11M words from books and articles that a typical high-school graduate might have read. From this corpus, LSA automatically constructs a map of word and text meanings. As in a geographic map, words and texts are represented as points in space at various distances from each other and in the correct spatial relationship. However, about 300 dimensions are required to map semantic relations in such a way that they resemble human meaning. Once the map is constructed, it can be used to rapidly estimate distances among words or documents, or, conversely, their similarities.
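To make the mapping concrete, the sketch below runs the basic LSA pipeline on a toy corpus: build a word-by-document count matrix, reduce it with a truncated singular value decomposition, and compare the resulting word vectors by cosine similarity. The tiny corpus, the two retained dimensions, and the omission of the usual log-entropy weighting are simplifications for illustration only; the actual system is trained on the full ~11M-word corpus with about 300 dimensions.

```python
# Minimal illustrative sketch of the LSA pipeline (toy corpus, 2 dimensions).
import numpy as np

corpus = [
    "the doctor examined the patient",
    "the nurse helped the doctor",
    "the pilot flew the plane",
    "the plane landed at the airport",
]

# 1. Build a word-by-document count matrix.
vocab = sorted({w for doc in corpus for w in doc.split()})
index = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(corpus)))
for j, doc in enumerate(corpus):
    for w in doc.split():
        counts[index[w], j] += 1

# 2. Reduce the matrix with a truncated singular value decomposition.
#    Word meanings become the rows of U * S restricted to the top k dimensions.
k = 2  # a full-scale LSA space typically keeps ~300 dimensions
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
word_vectors = U[:, :k] * S[:k]

# 3. Semantic distance/similarity is read off as the cosine between vectors.
def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print(cosine(word_vectors[index["doctor"]], word_vectors[index["nurse"]]))
print(cosine(word_vectors[index["doctor"]], word_vectors[index["plane"]]))
```

On a realistic corpus, words that occur in similar contexts (e.g., "doctor" and "nurse") end up close together in the reduced space even if they never co-occur in the same document, which is what makes the map useful as an estimate of human semantic similarity.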

Such a map of meaning is extremely useful in mind/brain research. Since it is based on a large amount of natural text, it represents semantic relations in accordance with their real-world significance. A large number of studies using LSA, reported in a book in press edited by Landauer, McNamara, Dennis, and Kintsch, explore the ability of LSA to model human semantic judgment and performance, as well as its limitations. A major limitation of LSA is that in constructing its map of meaning it uses only information about which words appear in a document, and disregards information about their syntactic relations. We are now able to overcome this limitation, that is, to construct semantic representations that take into account the syntactic relationships among the words in a linguistic corpus. This is a major advance in the field of statistical semantics. There are as yet very few systems capable of dealing with syntactic structure, and they tend to be computationally expensive. What is proposed here is simple and robust, and we believe it will prove sufficient to model the role of syntax in everyday sentence comprehension (as distinguished from expert linguistic analysis).

Statistical models of meaning, whether they include structural information or not, represent the meaning of a word as a point (vector) in a semantic space. That is, from observing how that word has been used in many contexts, the model infers a context-free representation of that word. Real words, of course, are different: they often have several meanings and almost always many different senses. Therefore, psychological theories of word meaning are usually framed in terms of a mental lexicon that lists these various meanings and senses. The present approach is different. Instead of a mental lexicon with ready-made meanings, we assume that the lexicon is generative: word meanings are generated in working memory from the stored, context-free semantic representation (e.g., the LSA vector) together with the context in which the word is used. Thus, what is stored in long-term memory is merely a building block for the construction of meaning; meaning must be constructed in working memory in context, and is therefore always contextually appropriate. In our approach, a word has as many meanings as it has uses (many will differ only slightly, of course). The algorithm that generates contextual meanings when one word is used in the context of another word has been explored in Kintsch (2001, in press); an algorithm for the construction of sentence meanings in working memory is at the core of the present proposal, as sketched in Section 3 above.
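The sketch below conveys the flavor of such a contextual construction, loosely following the predication idea explored in Kintsch (2001): the stored, context-free vector of a word is adapted by selecting those of its nearest semantic neighbors that are also related to the current context, and averaging them in. The vector table, the neighbor counts m and k, and the simple centroid combination are illustrative assumptions, not the exact algorithm developed in this proposal.

```python
# Hedged sketch of contextual meaning construction (in the spirit of
# Kintsch, 2001). `vectors` is assumed to map words to LSA-style vectors.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def contextual_meaning(word, context, vectors, m=20, k=3):
    """Adapt the context-free vector of `word` to `context`.

    1. Retrieve the m nearest neighbors of `word` in the semantic space.
    2. Keep the k neighbors most related to `context`.
    3. The contextual meaning is the centroid of word, context, and
       those selected neighbors.
    """
    w, c = vectors[word], vectors[context]
    neighbors = sorted(
        (v for key, v in vectors.items() if key not in (word, context)),
        key=lambda v: cosine(w, v), reverse=True)[:m]
    relevant = sorted(neighbors, key=lambda v: cosine(c, v), reverse=True)[:k]
    return np.mean([w, c, *relevant], axis=0)

# With a realistic vector table, contextual_meaning("bark", "dog", vectors)
# and contextual_meaning("bark", "tree", vectors) would yield different
# vectors: the same stored building block generates different meanings
# depending on context.
```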

The model that we are developing here is not only of theoretical interest. A model that can simulate the comprehension of sentence meanings opens up promising areas of application in educational technology. For the past eight years we have elaborated the theory and developed a software system that teaches middle-school and high-school students how to write summaries. Summary Street is now employed in 120 classrooms and has been used by over 3,000 students. It is easy and cheap to use over the internet, and it has been shown to be effective: it helps students compose better summaries, even when they no longer have access to the tool (Wade-Stein & E. Kintsch, 2004; Franzke, Caccamise, E. Kintsch, & Johnson, 2005). We want to build on this work to develop more broadly useful tutoring software. Identifying important information to include in a summary is an important strategy for deep comprehension, but not the only one. True learning depends on integrative strategies as well, which serve to elaborate the text content with the reader’s own knowledge. The tool we have in mind will foster deep comprehension by supporting not only text-based processes, but these crucial inferencing processes as well. The tutor is based on the theory and technology that enabled us to build Summary Street, but advances this approach in important ways.

Our early work on LSA was supported by the McDonnell Foundation (1997-2001); the further development and scale-up work was supported by grants from NSF (ITR and IERI). Our present goal is to design and develop a general comprehension tutor that helps students comprehend difficult texts, and hence learn from them.

Much research on how to make students actively engaged readers and better learners suggests that interactions between groups or pairs of students, or classroom interactions with the teacher, are effective for developing higher-level comprehension. These interactions work by scaffolding students’ attempts to construct meaning as they read. They introduce the appropriate strategies to the students and serve as a natural diagnosis- and hint-providing system: when students’ answers indicate that their understanding is still incomplete, a teacher or peer can continue questioning or provide hints to scaffold their attempts to comprehend and lead them in the right direction. However, well-developed classroom instruction of this kind is expensive and resource intensive.

To address the problem of how to help students better understand difficult texts we are proposing to develop a computer-based tutor that will provide guided practice in meaning construction during reading. Such a tutor can be embedded in content area instruction without placing an additional burden on classroom teachers: it can support reading of assigned texts at home or wherever a computer is available. The tutor will focus on middle- and high-school students at a point where such guidance is especially needed, that is, when learning becomes increasingly dependent on reading complex texts. Although our efforts will be initially limited to science topics, the tutor we have in mind is flexible and applicable to a wide variety of knowledge domains.

The key to such a tutor is for the computer to be able to understand what the student is saying. That is what this project is designed for. Once we have this ability, we can mimic the diagnosis and hint-providing instruction that has proven so successful in practice, letting the ever-present computer help out where the teacher is unable to provide individualized help. As we have found with Summary Street, such a tutor cannot replace the teacher. Its effectiveness depends on how well the teacher uses it: as just another exercise it will have little effect, but purposefully integrated in classroom instruction it can be a very powerful tool.
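As a deliberately simplified illustration of how semantic similarity could support such diagnosis and hinting, the sketch below compares a student's response vector against vectors for the content units a good answer should cover, and returns hints for the units that fall below a similarity threshold. The threshold, the hint texts, and the data structures are hypothetical placeholders; the tutor's actual diagnostic design is what this project will develop.

```python
# Hypothetical sketch: similarity-driven diagnosis and hinting.
# Vectors are assumed to come from the semantic model described above.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def diagnose(student_answer_vec, content_units, threshold=0.6):
    """Return hints for content units the student's answer does not yet cover.

    `content_units` maps a unit name to a (vector, hint_text) pair;
    `threshold` is an assumed cutoff for "adequately covered".
    """
    hints = []
    for name, (vec, hint) in content_units.items():
        if cosine(student_answer_vec, vec) < threshold:
            hints.append((name, hint))
    return hints
```

A loop of this kind would let the tutor keep questioning and hinting, much as a teacher or peer does, until the student's answers cover the important content of the text.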

Thus, the goal of this project is twofold: (1) to advance our understanding of language comprehension by building a better model of the processes involved in comprehension, using novel methods from the machine learning field, and (2) to use our ability to simulate comprehension to build a tutor that makes possible guided practice of deep comprehension through flexible computer-based scaffolds.