
David E. Rumelhart

Here are all the papers by David E. Rumelhart that you can download and read on OA.mg.

DOI: 10.1038/323533a0
1986
Cited 20,941 times
Learning representations by back-propagating errors
DOI: 10.7551/mitpress/5236.001.0001
1986
Cited 13,032 times
Parallel Distributed Processing
What makes people smarter than computers? These volumes by a pioneering neurocomputing group suggest that the answer lies in the massively parallel architecture of the human mind. They describe a new theory of cognition called connectionism that is challenging the idea of symbolic computation that has traditionally been at the center of debate in theoretical discussions about the mind. The authors' theory assumes the mind is composed of a great number of elementary units connected in a neural network. Mental processes are interactions between these units which excite and inhibit each other in parallel rather than sequential operations. In this context, knowledge can no longer be thought of as stored in localized structures; instead, it consists of the connections between pairs of units that are distributed throughout the network. Volume 1 lays the foundations of this exciting theory of parallel distributed processing, while Volume 2 applies it to a number of specific issues in cognitive science and neuroscience, with chapters describing models of aspects of perception, memory, language, and thought.
DOI: 10.21236/ada164453
1985
Cited 6,162 times
Learning Internal Representations by Error Propagation
This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion
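As a companion to the section on the generalized delta rule, here is a minimal sketch of error propagation in a one-hidden-layer network of sigmoid units on the XOR task; the layer sizes, learning rate, and epoch count are illustrative rather than taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task: XOR, the canonical problem a single layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0.0, 0.5, (2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(0.0, 0.5, (4, 1)); b2 = np.zeros(1)   # hidden -> output
lr = 0.5

for _ in range(20000):
    h = sigmoid(X @ W1 + b1)                  # forward pass
    y = sigmoid(h @ W2 + b2)
    delta_out = (T - y) * y * (1 - y)         # delta = error x slope of the unit
    delta_hid = (delta_out @ W2.T) * h * (1 - h)   # propagate deltas back
    W2 += lr * h.T @ delta_out; b2 += lr * delta_out.sum(axis=0)
    W1 += lr * X.T @ delta_hid; b1 += lr * delta_hid.sum(axis=0)

# Outputs approach [[0], [1], [1], [0]]; a different seed or more epochs
# may be needed if training lands in a local minimum.
print(y.round(2))
```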
DOI: 10.1037/0033-295x.88.5.375
1981
Cited 4,214 times
An interactive activation model of context effects in letter perception: I. An account of basic findings.
1986
Cited 2,946 times
Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations
DOI: 10.1037/0033-295x.89.1.60
1982
Cited 1,262 times
An interactive activation model of context effects in letter perception: II. The contextual enhancement effect and some tests and extensions of the model.
The interactive activation model of context effects in letter perception is reviewed, elaborated, and tested. According to the model, context aids the perception of target letters as they are processed in the perceptual system. The implication that the duration and timing of the context in which a letter occurs should greatly influence the perceptibility of the target is confirmed by a series of experiments demonstrating that early or enhanced presentations of word and pronounceable-pseudoword contexts greatly increase the perceptibility of target letters. Also according to the model, letters in strings that share several letters with words should be equally perceptible whether they are orthographically regular and pronounceable (SLET) or irregular (SLNT) and should be much more perceptible than letters in contexts that share few letters with any word (XLQJ). This prediction is tested and confirmed. The basic results of all the experiments are accounted for, with some modification of parameters, although there are some discrepancies in detail. Several recent findings that seem to challenge the model are considered and a number of extensions are proposed.
DOI: 10.1207/s15516709cog1603_1
1992
Cited 1,094 times
Forward Models: Supervised Learning with a Distal Teacher
Internal models of the environment have an important role to play in adaptive systems, in general, and are of particular importance for the supervised learning paradigm. In this article we demonstrate that certain classical problems associated with the notion of the “teacher” in supervised learning can be solved by judicious use of learned internal models as components of the adaptive system. In particular, we show how supervised learning algorithms can be utilized in cases in which an unknown dynamical system intervenes between actions and desired outcomes. Our approach applies to any supervised learning algorithm that is capable of learning in multilayer networks.
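As a toy illustration of the distal-teacher idea the abstract describes, here is a sketch in which everything is linear: the unknown environment, the learned forward model, and the controller. The shapes, rates, and exploration noise are all invented; a real application would use nonlinear networks trained by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(2, 3))     # unknown environment: action -> outcome

V = np.zeros((2, 3))            # learned forward model of G
W = np.zeros((3, 2))            # controller: desired outcome -> action
lr = 0.05

for _ in range(5000):
    y_star = rng.normal(size=2)                 # desired (distal) outcome
    a = W @ y_star + 0.1 * rng.normal(size=3)   # action, with exploration noise
    y = G @ a                                   # only the outcome is observed
    V += lr * np.outer(y - V @ a, a)            # 1) improve the forward model
    grad_a = V.T @ (y_star - y)                 # 2) pass the distal error back
    W += lr * np.outer(grad_a, y_star)          #    through the model, not the world

# The controller approaches a right inverse of the environment, even though
# the error was only ever available in outcome space.
print(np.linalg.norm(G @ W - np.eye(2)))
```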
DOI: 10.1016/b978-0-12-108550-6.50013-6
1975
Cited 1,067 times
NOTES ON A SCHEMA FOR STORIES
The chapter discusses that the structure of stories is ordinarily more than pairwise relationships among sentences, and strings of sentences combine into psychological wholes. It also explains the nature of these wholes and presents a simple story grammar that accounts for many of the salient facts about the structure of simple stories and that will serve as the basis for a theory of summarization. The grammar consists of a set of syntactical rules that generate the constituent structure of stories and a corresponding set of semantic interpretation rules that determine the semantic representation of the story. The symbol “+” is used to join two items in a sequence; the symbol “|” is used to separate mutually exclusive alternatives. A “*” following a structure name indicates one or more of those units; for example, A* is one or more As. In the semantic structures the convention is followed that the predicate names are written in the ovals and the arguments of the predicates are pointed to by arrows. The propositions that are the units of the story are simply numbered.
DOI: 10.4324/9781315107493-4
2017
Cited 1,017 times
Schemata: The Building Blocks of Cognition
Schemata are employed in the process of interpreting sensory data, in retrieving information from memory, in organizing actions, in determining goals and subgoals, in allocating resources, and, generally, in guiding the flow of processing in the system. Perhaps the central function of schemata is in the construction of an interpretation of an event, object, or situation—that is, in the process of comprehension. Schemata are active computational devices capable of evaluating the quality of their own fit to the available data. Schemata consist of subschemata as procedures consist of subprocedures. A schema is said to be activated from the bottom-up whenever a subschema that has been somehow activated causes the various schemata of which it is a part to be activated. One of the central problems of a schema theory is a specification of the process whereby new schemata are developed.
DOI: 10.1038/369525a0
1994
Cited 956 times
fMRI of human visual cortex
DOI: 10.1207/s15516709cog0901_5
1985
Cited 794 times
Feature Discovery by Competitive Learning*
This paper reports the results of our studies with an unsupervised learning paradigm which we have called “Competitive Learning.” We have examined competitive learning using both computer simulation and formal analysis and have found that when it is applied to parallel networks of neuron-like elements, many potentially useful learning tasks can be accomplished. We were attracted to competitive learning because it seems to provide a way to discover the salient, general features which can be used to classify a set of patterns. We show how a very simple competitive mechanism can discover a set of feature detectors which capture important aspects of the set of stimulus input patterns. We also show how these feature detectors can form the basis of a multilayer system that can serve to learn categorizations of stimulus sets which are not linearly separable. We show how the use of correlated stimuli can serve as a kind of “teaching” input to the system to allow the development of feature detectors which would not develop otherwise. Although we find the competitive learning mechanism a very interesting and powerful learning principle, we do not, of course, imagine that it is the only learning principle. Competitive learning is an essentially nonassociative statistical learning scheme. We certainly imagine that other kinds of learning mechanisms will be involved in the building of associations among patterns of activation in a more complete neural network. We offer this analysis of these competitive learning mechanisms to further our understanding of how simple adaptive networks can discover features important in the description of the stimulus environment in which the system finds itself.
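As a rough sketch of the winner-take-all scheme described here, the following assumes the usual competitive update in which the winning unit moves its (normalized) weights toward the current input; the clustered toy data and learning rate are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

n_units, n_features, lr = 3, 2, 0.1
W = rng.random((n_units, n_features))
W /= W.sum(axis=1, keepdims=True)      # each unit's weights sum to 1

# Three loose clusters of stimulus patterns.
data = np.vstack([rng.normal(c, 0.05, (100, 2))
                  for c in [(0.1, 0.9), (0.9, 0.1), (0.8, 0.8)]])

for x in rng.permutation(data):
    winner = np.argmax(W @ x)          # the unit most excited by the pattern
    # Winner shifts weight toward the active input lines; losers are unchanged.
    W[winner] += lr * (x / x.sum() - W[winner])

print(W.round(2))   # rows typically end up near the (normalized) cluster centers
```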
DOI: 10.7551/mitpress/5237.001.0001
1987
Cited 778 times
Parallel Distributed Processing
What makes people smarter than computers? These volumes by a pioneering neurocomputing group suggest that the answer lies in the massively parallel architecture of the human mind. They describe a new theory of cognition called connectionism that is challenging the idea of symbolic computation that has traditionally been at the center of debate in theoretical discussions about the mind. The authors' theory assumes the mind is composed of a great number of elementary units connected in a neural network. Mental processes are interactions between these units which excite and inhibit each other in parallel rather than sequential operations. In this context, knowledge can no longer be thought of as stored in localized structures; instead, it consists of the connections between pairs of units that are distributed throughout the network. Volume 1 lays the foundations of this exciting theory of parallel distributed processing, while Volume 2 applies it to a number of specific issues in cognitive science and neuroscience, with chapters describing models of aspects of perception, memory, language, and thought.
DOI: 10.1037/0096-3445.114.2.159
1985
Cited 751 times
Distributed memory and the representation of general and specific information.
DOI: 10.7551/mitpress/5236.003.0008
1986
Cited 712 times
On Learning the Past Tenses of English Verbs
This paper presents an alternative to the standard rule-based account of a child's acquisition of the past tense in English. Children are typically said to pass through a three-phase acquisition process in which they first learn past tenses by rote, then learn the past-tense rule and overregularize, and then finally learn the exceptions to the rule. We show that the acquisition data can be accounted for in more detail by dispensing with the assumption that the child learns rules and substituting in its place a simple homogeneous learning procedure. We show how rule-like behavior can emerge from the interactions among a network of units encoding the root-form-to-past-tense mapping. A large computer simulation of the learning process demonstrates the operating principles of our alternative account, shows how details of the acquisition process not captured by the rule account emerge, and makes predictions about other details of the acquisition process not yet observed. Keywords: Learning; networks; Language; Verbs; Perceptions; Morphology.
DOI: 10.1016/b978-1-4832-1446-7.50035-2
1988
Cited 673 times
Learning Internal Representations by Error Propagation
This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion
DOI: 10.1142/s0129065790000102
1990
Cited 649 times
PREDICTING THE FUTURE: A CONNECTIONIST APPROACH
We investigate the effectiveness of connectionist architectures for predicting the future behavior of nonlinear dynamical systems. We focus on real-world time series of limited record length. Two examples are analyzed: the benchmark sunspot series and chaotic data from a computational ecosystem. The problem of overfitting, particularly serious for short records of noisy data, is addressed both by using the statistical method of validation and by adding a complexity term to the cost function ("back-propagation with weight-elimination"). The dimension of the dynamics underlying the time series, its Liapunov coefficient, and its nonlinearity can be determined via the network. We also show why sigmoid units are superior in performance to radial basis functions for high-dimensional input spaces. Furthermore, since the ultimate goal is accuracy in the prediction, we find that sigmoid networks trained with the weight-elimination algorithm outperform traditional nonlinear statistical approaches.
DOI: 10.1207/s15516709cog0601_1
1982
Cited 476 times
Simulating a Skilled Typist: A Study of Skilled Cognitive‐Motor Performance
We review the major phenomena of skilled typing and propose a model for the control of the hands and fingers during typing. The model is based upon an Activation‐Trigger‐Schema system in which a hierarchical structure of schemata directs the selection of the letters to be typed and, then, controls the hand and finger movements by a cooperative, relaxation algorithm. The interactions of the patterns of activation and inhibition among the schemata determine the temporal ordering for launching the keystrokes. To account for the phenomena of doubling errors, the model has only “type” schemata—no “token” schemata—with only a weak binding between the special schema that signals a doubling, and its argument. The model exists as a working computer simulation and produces an output display of the hands and fingers moving over the keyboard. It reproduces some of the major phenomena of typing, including the interkeystroke interval times, the pattern of transposition errors found in skilled typists, and doubling errors. Although the model is clearly inadequate or wrong in some of its features and assumptions, it serves as a useful first approximation for the understanding of skilled typing.
DOI: 10.1145/175247.175257
1994
Cited 464 times
Neural networks
Bernard Widrow, David E. Rumelhart, and Michael A. Lehr (Stanford University). "Neural networks: applications in industry, business and science." Communications of the ACM, Volume 37, Issue 3, March 1994, pp. 93–105.
1986
Cited 425 times
Distributed representations
DOI: 10.2307/1421908
1975
Cited 402 times
Explorations in Cognition
DOI: 10.1145/175247.175256
1994
Cited 395 times
The basic ideas in neural networks
David E. Rumelhart, Bernard Widrow, and Michael A. Lehr (Stanford University). "The basic ideas in neural networks." Communications of the ACM, Volume 37, Issue 3, March 1994, pp. 87–92.
DOI: 10.1162/neco.1989.1.1.133
1989
Cited 388 times
Product Units: A Computationally Powerful and Biologically Plausible Extension to Backpropagation Networks
We introduce a new form of computational unit for feedforward learning networks of the backpropagation type. Instead of calculating a weighted sum this unit calculates a weighted product, where each input is raised to a power determined by a variable weight. Such a unit can learn an arbitrary polynomial term, which would then feed into higher level standard summing units. We show how learning operates with product units, provide examples to show their efficiency for various types of problems, and argue that they naturally extend the family of theoretical feedforward net structures. There is a plausible neurobiological interpretation for one interesting configuration of product and summing units.
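Since the abstract defines the unit precisely (a weighted product, with each input raised to a learnable power), a minimal sketch is easy to give; the log-space form below is a common implementation convenience and assumes strictly positive inputs.

```python
import numpy as np

def product_unit(x, w):
    """y = prod_i x_i ** w_i, computed in log space; x must be positive."""
    return np.exp(np.sum(w * np.log(x)))

x = np.array([2.0, 3.0])
w = np.array([1.0, 2.0])
y = product_unit(x, w)
print(y)                      # 2**1 * 3**2 = 18.0

# Gradient with respect to the weights, as backpropagation would use it:
# dy/dw_i = y * log(x_i)
grad_w = y * np.log(x)
```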
DOI: 10.1598/0710.29
2013
Cited 386 times
Toward an Interactive Model of Reading
DOI: 10.1037//0096-3445.114.2.159
1985
Cited 369 times
Distributed memory and the representation of general and specific information.
We describe a distributed model of information processing and memory and apply it to the representation of general and specific information. The model consists of a large number of simple processing elements which send excitatory and inhibitory signals to each other via modifiable connections. Information processing is thought of as the process whereby patterns of activation are formed over the units in the model through their excitatory and inhibitory interactions. The memory trace of a processing event is the change or increment to the strengths of the interconnections that results from the processing event. The traces of separate events are superimposed on each other in the values of the connection strengths that result from the entire set of traces stored in the memory. The model is applied to a number of findings related to the question of whether we store abstract representations or an enumeration of specific experiences in memory. The model simulates the results of a number of important experiments which have been taken as evidence for the enumeration of specific experiences. At the same time, it shows how the functional equivalent of abstract representations--prototypes, logogens, and even rules--can emerge from the superposition of traces of specific experiences, when the conditions are right for this to happen. In essence, the model captures the structure present in a set of input patterns; thus, it behaves as though it had learned prototypes or rules, to the extent that the structure of the environment it has learned about can be captured by describing it in terms of these abstractions.
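As a rough illustration of how superimposed traces can yield the functional equivalent of a prototype, here is a minimal sketch assuming a simple Hebbian (outer-product) storage rule on a single connection matrix; the pattern sizes and noise levels are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
W = np.zeros((n, n))                       # all traces share these connections

prototype = rng.choice([-1.0, 1.0], n)
for _ in range(20):                        # store 20 noisy exemplars
    exemplar = prototype.copy()
    flip = rng.random(n) < 0.1             # 10% of features distorted
    exemplar[flip] *= -1
    W += np.outer(exemplar, exemplar) / n  # increment, not replace: traces superimpose

np.fill_diagonal(W, 0)
probe = prototype.copy()
probe[:10] *= -1                           # a degraded retrieval cue
recalled = np.sign(W @ probe)
print((recalled == prototype).mean())      # the never-stored prototype emerges
```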
1986
Cited 364 times
Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations
DOI: 10.1016/s0364-0213(85)80010-0
1985
Cited 261 times
Feature discovery by competitive learning
This paper reports the results of our studies with an unsupervised learning paradigm which we have called “Competitive Learning.” We have examined competitive learning using both computer simulation and formal analysis and have found that when it is applied to parallel networks of neuron-like elements, many potentially useful learning tasks can be accomplished. We were attracted to competitive learning because it seems to provide a way to discover the salient, general features which can be used to classify a set of patterns. We show how a very simple competitive mechanism can discover a set of feature detectors which capture important aspects of the set of stimulus input patterns. We also show how these feature detectors can form the basis of a multilayer system that can serve to learn categorizations of stimulus sets which are not linearly separable. We show how the use of correlated stimuli can serve as a kind of “teaching” input to the system to allow the development of feature detectors which would not develop otherwise. Although we find the competitive learning mechanism a very interesting and powerful learning principle, we do not, of course, imagine that it is the only learning principle. Competitive learning is an essentially nonassociative statistical learning scheme. We certainly imagine that other kinds of learning mechanisms will be involved in the building of associations among patterns of activation in a more complete neural network. We offer this analysis of these competitive learning mechanisms to further our understanding of how simple adaptive networks can discover features important in the description of the stimulus environment in which the system finds itself.
DOI: 10.1016/0022-2496(70)90044-1
1970
Cited 258 times
A multicomponent theory of the perception of briefly exposed visual displays
A feature extraction model for the recognition of tachistoscopically presented alphanumeric material is presented. This model is applied to data from whole report, partial report, detection, and backward masking experiments. On the whole, the results are encouraging. In a final section the model, which is presented as an extension of Bower's multicomponent model for memory, is shown to be derivable as a limiting case of LaBerge's neutral element stimulus sampling model.
DOI: 10.21236/ada030406
1976
Cited 234 times
Accretion, Tuning and Restructuring: Three Modes of Learning
Learning is not a simple unitary process. In this paper we identify three qualitatively different phases of the learning process. In one phase, the learner acquires facts and information, accumulating more structures onto the already existing knowledge structures. This phase of learning is adequate only when the material being learned is part of a previously understood topic: the appropriate memory schemata already exist. In a second phase, the learner must devise new memory structures to interpret the material that is to be acquired. This is the most difficult and the most significant form of learning, for it marks the acquisition of truly new conceptualizations about a topic matter. The third phase of learning involves a continual process of modification: both constraining and generalizing the knowledge within the schemata of memory. This stage of learning does not increase the formal content of one's knowledge, but it makes the use of the knowledge more efficient. Thus, although a beginner and an expert might both perform a task with perfect accuracy, there is a marked qualitative difference between the performance of the two. We propose three different mechanisms that seem to be responsible for the different phases of the learning of complex topic matters: accretion, restructuring, and tuning.
DOI: 10.1037/h0036117
1974
Cited 230 times
Process of recognizing tachistoscopically presented words.
DOI: 10.1016/0010-0285(73)90023-6
1973
Cited 202 times
A model for analogical reasoning
A theory of analogical reasoning is proposed in which the elements of a set of concepts, e.g., animals, are represented as points in a multidimensional Euclidean space. Four elements A, B, C, D are in an analogical relationship A:B::C:D if the vector distance from A to B is the same as that from C to D. Given three elements A, B, C, an ideal solution point I for A:B::C:? exists. In a problem A:B::C:D1, …, Di, …, Dn, the probability of choosing Di as the best solution is a monotonic decreasing function of the absolute distance of Di from I. A stronger decision rule incorporating a negative exponential function in Luce's choice rule is also proposed. Both the strong and weak versions of the theory were supported in two experiments where Ss rank-ordered the alternatives in problems A:B::C:D1, D2, D3, D4. In a third experiment the theory was applied and further tested in teaching new concepts by analogy.
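The abstract fully specifies the decision model, so here is a minimal sketch of it; the concept coordinates, the candidate set, and the decay constant in the Luce-style choice rule are invented for illustration.

```python
import numpy as np

# Toy concept space; axes (say, size and age) are purely illustrative.
concepts = {
    "puppy":  np.array([0.20, 0.10]),
    "dog":    np.array([0.60, 0.90]),
    "kitten": np.array([0.15, 0.10]),
    "cat":    np.array([0.50, 0.90]),
    "horse":  np.array([0.90, 0.90]),
}

A, B, C = concepts["puppy"], concepts["dog"], concepts["kitten"]
ideal = C + (B - A)            # the ideal solution point I for A:B::C:?

# Weak rule: rank candidates by distance from I (closer = preferred).
candidates = ["cat", "horse", "puppy"]
ranked = sorted(candidates, key=lambda k: np.linalg.norm(concepts[k] - ideal))
print(ranked)                  # 'cat' comes out on top

# Strong rule: Luce choice with a negative exponential of distance.
d = np.array([np.linalg.norm(concepts[k] - ideal) for k in candidates])
p = np.exp(-5.0 * d); p /= p.sum()
print(dict(zip(candidates, p.round(2))))
```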
1990
Cited 256 times
Generalization by Weight-Elimination with Application to Forecasting
Inspired by the information theoretic idea of minimum description length, we add a term to the back propagation cost function that penalizes network complexity. We give the details of the procedure, called weight-elimination, describe its dynamics, and clarify the meaning of the parameters involved. From a Bayesian perspective, the complexity term can be usefully interpreted as an assumption about prior distribution of the weights. We use this procedure to predict the sunspot time series and the notoriously noisy series of currency exchange rates.
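As a sketch of the complexity term the abstract describes, the following assumes the saturating penalty lam * sum((w/w0)^2 / (1 + (w/w0)^2)) commonly associated with weight-elimination: small weights are pushed toward zero while large weights incur a roughly constant cost. The scale w0 and coefficient lam are illustrative hyperparameters.

```python
import numpy as np

def weight_elimination_penalty(w, w0=1.0, lam=1e-3):
    r = (w / w0) ** 2
    return lam * np.sum(r / (1.0 + r))

def penalty_gradient(w, w0=1.0, lam=1e-3):
    # d/dw of lam * (w/w0)^2 / (1 + (w/w0)^2)
    r = (w / w0) ** 2
    return lam * (2.0 * w / w0 ** 2) / (1.0 + r) ** 2

# During training this gradient is simply added to the data-error gradient:
#   dE_total/dw = dE_data/dw + penalty_gradient(w)
w = np.array([-2.0, -0.1, 0.0, 0.05, 3.0])
print(weight_elimination_penalty(w))
print(penalty_gradient(w))    # largest pull on the small, "eliminable" weights
```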
1986
Cited 225 times
Parallel distributed processing: explorations in the microstructure of cognition, vol. 2: psychological and biological models
DOI: 10.21236/ada092233
1980
Cited 182 times
Analogical Processes in Learning
This paper examines the role of analogy and procedural representation in learning. Examples of analogical manipulation of knowledge schemata are presented from several domains, including turtle geometry, kinship terms, and the learning of a computer text editor. The view presented in this paper has a number of implications for instruction and for performance. In particular, the learner or user of a system should be presented with a conceptual model that has the following properties: (a) it is based on a domain with which the student is already knowledgeable and for which the student can reason readily; (b) the target and source domains should differ by a minimum number of specifiable dimensions; (c) operations that are natural in one domain should also be natural within the other domain; (d) operations inappropriate within the target domain should also be inappropriate within the source domain.
DOI: 10.2307/1423065
1989
Cited 168 times
Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises
DOI: 10.4324/9780203763247
2013
Cited 156 times
Backpropagation
Composed of three sections, this book presents the most popular training algorithm for neural networks: backpropagation. The first section presents the theory and principles behind backpropagation as seen from different perspectives such as statistics, machine learning, and dynamical systems. The second presents a number of network architectures that may be designed to match the general concepts of Parallel Distributed Processing with backpropagation learning. Finally, the third section shows how these principles can be applied to a number of different fields related to the cognitive sciences, including control, speech recognition, robotics, image processing, and cognitive psychology. The volume is designed to provide both a solid theoretical foundation and a set of examples that show the versatility of the concepts. Useful to experts in the field, it should also be most helpful to students seeking to understand the basic principles of connectionist learning and to engineers wanting to add neural networks in general -- and backpropagation in particular -- to their set of problem-solving methods.
DOI: 10.1207/s15516709cog0403_5
1980
Cited 153 times
On Evaluating Story Grammars*
Cognitive Science, Volume 4, Issue 3, July 1980, pp. 313–316. The research reported here was partially supported by the Office of Naval Research under contract N00014-79-C-0323, NR 157-437, and by the National Science Foundation under grant BMS 76-15024.
DOI: 10.1037//0033-295x.88.5.375
1981
Cited 152 times
An interactive activation model of context effects in letter perception: I. An account of basic findings.
1986
Cited 134 times
Psychological and biological models
What makes people smarter than computers? These volumes by a pioneering neurocomputing group suggest that the answer lies in the massively parallel architecture of the human mind. They describe a new theory of cognition called connectionism that is challenging the idea of symbolic computation that has traditionally been at the center of debate in theoretical discussions about the mind. The authors' theory assumes the mind is composed of a great number of elementary units connected in a neural network. Mental processes are interactions between these units which excite and inhibit each other in parallel rather than sequential operations. In this context, knowledge can no longer be thought of as stored in localized structures; instead, it consists of the connections between pairs of units that are distributed throughout the network. Volume 1 lays the foundations of this exciting theory of parallel distributed processing, while Volume 2 applies it to a number of specific issues in cognitive science and neuroscience, with chapters describing models of aspects of perception, memory, language, and thought. David E. Rumelhart is Professor of Psychology at the University of California, San Diego. James L. McClelland is Professor of Psychology at Carnegie-Mellon University. A Bradford Book.
DOI: 10.4324/9781003309734-31
2022
Cited 21 times
Toward an Interactive Model of Reading
The purpose of this chapter is to develop a formalism within which psychologists can develop detailed information processing models of the reading process. I argue that such a formalism is necessary because the usual formalisms tend to lead most naturally to bottom-up, serial, stage-by-stage models of reading. Moreover, I argue that there is a good deal of evidence suggesting that reading is best characterized as a process of applying simultaneous constraints at all levels and thereby coming up with the most probable interpretation of the input string. Although it is probably not impossible to use the usual flow chart formalisms to represent such models (have arrows pointing back from higher levels to lower levels) it is not especially natural and when carried to the extreme of a completely interacting system is not very informative (two way arrows between every pair of levels). I suggest that the formalisms designed for parallel computing applications are the best substitutions. Finally, I develop a model based on HEARSAY II and GSP and argue that such a model has many very promising features.
1995
Cited 137 times
Backpropagation: the basic theory
DOI: 10.1016/0898-5529(90)90053-b
1990
Cited 112 times
MSnet: A Neural Network which Classifies Mass Spectra
We have designed a feed-forward neural network to classify low-resolution mass spectra of unknown compounds according to the presence or absence of 100 organic substructures. The neural network, MSnet, was trained to compute a maximum-likelihood estimate of the probability that each substructure is present. We discuss some design considerations and statistical properties of neural network classifiers, and the effect of various training regimes on generalization behavior. The MSnet classifies mass spectra more reliably than other methods reported in the literature, and has other desirable properties.
DOI: 10.1037/0096-3445.114.2.193
1985
Cited 111 times
Levels indeed! A response to Broadbent.
Although Broadbent concedes that we are probably correct in supposing that memory representations are distributed, he argues that psychological evidence is irrelevant to our argument because our point is relevant only at what Marr (1982) has called the implementational level of description and that psychological theory is only properly concerned with what Marr calls the computational level. We believe that Broadbent is wrong on both counts. First, our model is stated at a third level between the other two, Marr's representational and algorithmic level. Second, we believe that psychology is properly concerned with all three of these levels and that the information processing approach to psychology has been primarily concerned with the same level that we are, namely, the algorithmic level. Thus, our model is a competitor of the logogen model and other models of human information processing. We discuss these and other aspects of the question of levels, concluding that distributed models may ultimately provide more compelling accounts of a number of aspects of cognitive processes than other, competing algorithmic accounts.
1986
Cited 108 times
Parallel distributed processing: explorations in the microstructure of cognition. Volume 1. Foundations
The fundamental principles, basic mechanisms, and formal analyses involved in the development of parallel distributed processing (PDP) systems are presented in individual chapters contributed by leading experts. Topics examined include distributed representations, PDP models and general issues in cognitive science, feature discovery by competitive learning, the foundations of harmony theory, learning and relearning in Boltzmann machines, and learning internal representations by error propagation. Consideration is given to linear algebra in PDP, the logic of additive functions, resource requirements of standard and programmable nets, and the P3 parallel-network simulating system.
DOI: 10.1017/cbo9780511529863.014
1989
Cited 102 times
Toward a microstructural account of human reasoning
For the past several years my colleagues and I have been analyzing what we call parallel distributed processing (PDP) systems and looking at what we call the microstructure of cognition (cf. McClelland, Rumelhart, & the PDP Research Group, 1986; Rumelhart, McClelland, & the PDP Research Group, 1986). In this work we developed computational models of cognitive processes based on principles of “brainstyle” processing. The major focus of this work has been in perception, memory retrieval, and learning. The question remains as to how this work extends to the domains of “higher mental processes.” We have made one attempt to show how our PDP models can be used to account for schemalike effects (Rumelhart, Smolensky, McClelland, & Hinton, 1986). This chapter is designed to push those ideas further and to sketch an account of reasoning from a PDP perspective. I will proceed by first describing the basic theoretical structure of the PDP approach. I will then give a brief account of the reasoning process and finally show how it can be seen as resulting from a parallel distributed processing system.
DOI: 10.2307/415721
1987
Cited 99 times
Parallel Distributed Processing: Explorations in the Microstructures of Cognition
1. Very rarely, a book is published which not only advances our knowledge of a particular topic, but fundamentally recasts our methods of investigating and thinking about large tracts of the map of learning. Linguists remember 1957 as the publication year of Noam Chomsky's Syntactic structures, a book whose ostensible subjects were the structure of English grammatical rules and the goals of grammatical description, but which can be seen with hindsight as the first shot in an intellectual revolution which ended by radically changing the texture of day-to-day research activity and discourse throughout almost all of linguistics, and in substantial parts of other cognition-related disciplines. In decades to come, perhaps 1986 will be remembered by academics as the year of publication of the pair of volumes reviewed here: they constitute the first large-scale public statement of an intellectual paradigm fully as revolutionary as the generative paradigm ever was (there have been scattered journal articles in the preceding four or five years). I would go further and suggest that, if the promises of this book can be redeemed, the contrast in linguistics and neighboring disciplines between the 1990's and the 1970's will be significantly greater than the contrast between the 1970's and the 1950's. (I need hardly add, of course, that it is one thing to fire an opening salvo, but another to achieve ultimate predominance.) The new paradigm is called Parallel Distributed Processing by the sixteen writers who contributed to this book, many of whom work either at the University of California, San Diego, or at Carnegie-Mellon University in Pittsburgh. Some other researchers (e.g. Feldman 1985) use the term 'connectionism' for the same concept. These two volumes comprise 26 chapters which, among them, (i) explain the over-all nature and aims of PDP/connectionist models, (ii) define a family of specific variants of the general paradigm, and (iii) exemplify it by describing experiments in which PDP models were used to simulate human performance in various cognitive domains. The experiments, inevitably, treat their respective domains in a simplified, schematic way by comparison with the endless complexity found in any real-life cognitive area; but simplification in this case does not mean trivialization. There are also auxiliary chapters on relevant related topics; thus Chap. 9, by M. I. JORDAN, is a tutorial on linear algebra, a branch of mathematics having special significance for the PDP paradigm. (Each chapter is attributed to a particular author or authors.)
1972
Cited 95 times
A process model for long-term memory.
1984
Cited 85 times
Schemata and the cognitive system.
DOI: 10.4135/9781529681451.n4
2013
Cited 77 times
Representation in Memory
This paper provides a review of work on the representation of knowledge from within psychology and artificial intelligence. The work covers the nature of representation, the distinction between the represented world and the representing world, and significant issues concerned with propositional, analogical, and superpositional representations. Major controversies within psychology, such as distinctions between declarative and procedural representation, propositional and analogical representation, and the nature of visual images, are analyzed and found not to reflect fundamental disagreements.
DOI: 10.1016/b978-0-12-521350-9.50007-3
1970
Cited 61 times
A System for Perception and Memory
DOI: 10.4324/9781315271644-10
2017
Cited 55 times
The Representation of Knowledge in Memory 1
While originating from the senses, knowledge is not a blind record of sensory inputs. Normal people are not tape recorders, or video recorders; rather, they seem to process and reprocess information, imposing on it and producing from it knowledge which has structure. Schemata are data structures for representing the generic concepts stored in memory. They exist for generalized concepts underlying objects, situations, events, sequences of events, actions, and sequences of actions. Just as certain characteristics of the actors are specified by the playwright, so too a schema contains, as part of its specification, information about the types of objects that may be bound to the various variables of the schema. In much the same way as the entries for lexical items in a dictionary consist of other lexical items, so the structure of a schema is given in terms of relationships among other schemata.
1987
Cited 87 times
Learning the past tenses of English verbs: Implicit rules or parallel distributed processing?
DOI: 10.1017/cbo9781139173865.007
1993
Cited 86 times
Some problems with the notion of literal meanings
In his paper, Professor Sadock brings to the fore a fundamental dilemma of semantic analysis as practiced by many linguists and modern philosophers. The approach adopted by these workers is committed to the existence of a sharp distinction between what an utterance might mean (that is, its literal meaning) and what that utterance is, or can be, used to convey. (See, for example, Searle's chapter [this volume] which emphasizes the distinction between “sentence meaning” and “utterance meaning.”) To a linguist interested in form-meaning pairs, or to a philosopher interested in truth conditions on expressions, this distinction might be crucial. In these cases, the concern is to build a theory of literal meaning and to assign conveyed meanings to the application of unspecified psychological processes not specific to language. As a psychologist I find myself primarily interested in the mechanisms whereby meanings are conveyed. Whatever role “literal meanings” (as defined by these linguists and philosophers) might play in the comprehension of language (that is, in the determination of what some utterance conveys), psychological theory must concern itself with conveyed meanings.
1987
Cited 82 times
Mechanisms of Sentence Processing: Assigning Roles to Constituents of Sentences
DOI: 10.1016/b978-1-4832-1448-1.50016-0
1991
Cited 74 times
BACK-PROPAGATION, WEIGHT-ELIMINATION AND TIME SERIES PREDICTION
We investigate the effectiveness of connectionist architectures for predicting the future behavior of nonlinear dynamical systems. We analyze the sunspot series as an example of a real world time series of limited record length. The problem of overfitting, particularly serious for short records of noisy data, is addressed both by using the statistical method of validation and by adding a complexity term to the cost function (weight-elimination). We show why sigmoid units are superior in performance to radial basis functions for high-dimensional input spaces. The ultimate goal is prediction accuracy: we find that sigmoid networks trained with weight-elimination outperform traditional nonlinear statistical approaches. The prediction accuracy does not deteriorate when too many input units are used. Iterated single-step predictions are found to be better than direct multi-step predictions. Furthermore, we compare different sampling times (yearly and monthly), investigate the effect of preprocessing the data (square root and logarithmic transforms) and compare different error functions (corresponding to Gauss and Poisson statistics).
1977
Cited 73 times
Introduction to human information processing
DOI: 10.1016/s0364-0213(03)00015-6
2003
Cited 82 times
25th Annual Meeting of the Cognitive Science Society
The Cognitive Science Society is pleased to announce that The Boston Park Plaza Hotel has been selected as the site of the 25th annual meeting of the Society. The conference is scheduled July 30 through August 2, 2003. Tutorials are scheduled for July 30, with poster sessions, regular sessions, and receptions scheduled July 31 through August 2. Each year, in addition to submitted talks on traditional topics, the conference highlights particular emerging trends in the field. This year, the focus will be on the social, cultural, and contextual elements of cognition: collaboration (cooperation; coordination; organization of joint behaviors; shared planning and mental models); cultural learning (accumulation of knowledge within communities across generations; the function of artifacts in cultural history; acquisition of conventional behavior); distributed cognition (external representations; mechanisms of coordination; arena, setting, and context; workflow analysis); and interaction (between individuals; within context; person-environment coupling as a dynamical system; meaning in conversational interaction; emergent representations).
DOI: 10.7551/mitpress/4626.003.0008
1997
Cited 78 times
The Architecture of Mind: A Connectionist Approach
1991
Cited 71 times
Philosophy and Connectionist Theory
Contents: D.E. Rumelhart, Series Foreword. Preface. Part I:Connectionism and Other Styles of Cognitive Modeling. M.A. Boden, Horses of a Different Color? D.C. Dennett, Mother Nature Versus the Walking Encyclopedia: A Western Drama. Part II:Representation in Connectionist Models. T. van Gelder, What is the D in PDP? A Survey of the Concept of Distribution. J. Haugeland, Representational Genera. R. Cummins, The Role of Representation in Connectionist Explanations of Cognitive Capacities. A. Clark, In Defense of Explicit Rules. T. Goschke, D. Koppelberg, The Concept of Representation and the Representation of Concepts in Connectionist Models. G. Hatfield, Representation in Perception and Cognition: Connectionist Affordances. Part III:Philosophical Implications of Connectionism. W. Ramsey, S.P. Stich, J. Garon, Connectionism, Eliminativism and the Future of Folk Psychology. M. Davies, Concepts, Connectionism, and the Language of Thought. W. Lycan, Homuncular Functionalism Meets PDP. W. Ramsey, S.P. Stich, Connectionism and Three Levels of Nativism.
1998
Cited 69 times
Facial Expression Recognition Using a Neural Network
We discuss the development of a neural network for facial expression recognition. It aims at recognizing and interpreting facial expressions in terms of signaled emotions and level of expressiveness. We use the backpropagation algorithm to train the system to differentiate between facial expressions. We show how the network generalizes to new faces and we analyze the results. In our approach, we acknowledge that facial expressions can be very subtle, and propose strategies to deal with the complexity of various levels of expressiveness. Our database includes a variety of different faces, including individuals of different gender and race, and different features such as glasses, mustaches, and beards. Even given the variety of the database, the network learns fairly successfully to distinguish various levels of expressiveness, and generalizes to new faces as well.
DOI: 10.1016/b978-1-4832-1446-7.50048-0
1988
Cited 68 times
An Interactive Activation Model of Context Effects in Letter Perception: Part I. An Account of Basic Findings
A model of context effects in perception is applied to the perception of letters in various contexts. In the model, perception results from excitatory and inhibitory interactions of detectors for visual features, letters, and words. A visual input excites detectors for visual features in the display. These excite detectors for letters consistent with the active features. The letter detectors in turn excite detectors for consistent words. Active word detectors mutually inhibit each other and send feedback to the letter level, strengthening activation and hence perceptibility of their constituent letters. Computer simulation of the model exhibits the perceptual advantage for letters in words over unrelated contexts and is consistent with the basic facts about the word advantage. Most importantly, the model produces facilitation for letters in pronounceable pseudowords as well as words. Pseudowords activate detectors for words that are consistent with most of the active letters, and feedback from the activated words strengthens the activations of the letters in the pseudoword. The model thus accounts for apparently rule-governed performance without any actual rules.
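Since the abstract spells out the interactive-activation dynamics, here is a minimal sketch of one update rule of that general kind, assuming the usual bounded, decaying accumulation of net input; the two-unit "word layer", the connection weights, and all constants are invented for illustration.

```python
import numpy as np

def ia_step(a, net, a_min=-0.2, a_max=1.0, decay=0.1, rest=0.0):
    # Positive net input drives activation toward the ceiling, negative net
    # input toward the floor; decay pulls activation back toward rest.
    effect = np.where(net > 0, net * (a_max - a), net * (a - a_min))
    return np.clip(a + effect - decay * (a - rest), a_min, a_max)

# Two mutually inhibitory "word" units; unit 0 gets stronger bottom-up support.
a = np.zeros(2)
W = np.array([[0.0, -0.21], [-0.21, 0.0]])    # within-layer inhibition
external = np.array([0.4, 0.05])              # letter-level evidence

for _ in range(30):
    net = external + W @ np.clip(a, 0, None)  # only active units send inhibition
    a = ia_step(a, net)

print(a.round(3))   # unit 0 dominates; unit 1 is driven below its resting level
```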
DOI: 10.7551/mitpress/1477.003.0005
1993
Cited 64 times
Learning and Connectionist Representations
This chapter contains sections titled: Introduction: Representational Tools In Connectionist Networks, Distributed Versus Localist Representations, Learning Representations In Connectionist Networks, Autoencoders, Representing Semantic Networks In Connectionist Systems, Connectionist Representations And Human Judgments Of Similarity, Conclusion, Note, References
1991
Cited 61 times
Predicting sunspots and exchange rates with connectionist networks
DOI: 10.7551/mitpress/7113.001.0001
1987
Cited 61 times
Vision, Brain, and Cooperative Computation
These nineteen original essays present current developments in the exciting field of vision research, stressing contributions from neurophysiology, psychophysics, and computer science. They are unified by the theme of how best to structure the computations for visual systems and are placed in perspective by a major integrative essay provided by the editors. Broad in scope and packed with useful detail, Vision, Brain, and Cooperative Computation covers the entire range of perceptual experience from sensors to learning. Crossing several traditional disciplinary boundaries, it offers valuable insights into artificial intelligence and cognitive science with diverse and timely essays on visual neurophysiology, visual psychophysics, machine vision and robotics, and connectionism and cooperative computation. Bradford Books imprint
DOI: 10.1109/ijcnn.1991.170743
1991
Cited 57 times
Generalization by weight-elimination applied to currency exchange rate prediction
The authors focus on the minimal network strategy. The underlying hypothesis is that if several nets fit the data equally well, the simplest one will on average provide the best generalization. Inspired by the information theoretic idea of minimum description length, a term is added to the backpropagation cost function that penalizes network complexity. The authors give the details of the procedure, called weight-elimination, describe its dynamics, and clarify the meaning of the parameters involved. From a Bayesian perspective, the complexity term can be usefully interpreted as an assumption about the prior distribution of the weights. This procedure was used to predict currency exchange rates.
DOI: 10.7551/mitpress/5617.001.0001
1989
Cited 47 times
Explorations in Parallel Distributed Processing - Macintosh version
Includes two diskettes (for the Macintosh). Bradford Books imprint.
DOI: 10.4324/9780203762981
2013
Cited 34 times
Neuroscience and Connectionist Theory
Contents: B.L. McNaughton, L. Nadel, Hebb-Marr Networks and the Neurobiological Representation of Action in Space. M.F. Bear, L.N. Cooper, Molecular Mechanisms for Synaptic Modification in the Visual Cortex: Interaction Between Theory and Experiment. R. Granger, J. Ambros-Ingerson, U. Staubli, G. Lynch, Memorial Operation of Multiple, Interacting Simulated Brain Structures. M.A. Gluck, E.S. Reifsnider, R.F. Thompson, Adaptive Signal Processing and the Cerebellum: Models of Classical Conditioning and VOR Adaptation. W.B. Levy, C.M. Colbert, N.L. Desmond, Elemental Adaptive Processes of Neurons and Synapses: A Statistical/Computational Perspective. H.T. Wang, B. Mathur, C. Koch, I Thought I Saw It Move: Computing Optical Flow in the Primate Visual System. K.D. Miller, Correlation-Based Models of Neural Development. D. Zipser, Modeling Cortical Computation With Backpropagation.
DOI: 10.1006/csla.1994.1010
1994
Cited 45 times
Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system
In this paper we present a training method and a network architecture for estimating context-dependent observation probabilities in the framework of a hybrid hidden Markov model (HMM)/multilayer perceptron (MLP) speaker-independent continuous speech recognition system. The context-dependent modeling approach we present here computes the HMM context-dependent observation probabilities using a Bayesian factorization in terms of context-conditioned posterior phone probabilities which are computed with a set of MLPs, one for every relevant context. The proposed network architecture shares the input-to-hidden layer among the set of context-dependent MLPs in order to reduce the number of independent parameters. Multiple states for phone models with different context dependence for each state are used to model the different context effects at the beginning and end of phonetic segments. A new training procedure that "smooths" networks with different degrees of context dependence is proposed to obtain a robust estimate of the context-dependent probabilities. We have used this new architecture to model generalized biphone phonetic contexts. Tests with the speaker-independent DARPA Resource Management database have shown average reductions in word error rates of 28% using a word-pair grammar, compared to our earlier context-independent HMM/MLP hybrid.
1990
Cited 42 times
Integrated Segmentation and Recognition of Hand-Printed Numerals
Neural network algorithms have proven useful for recognition of individual, segmented characters. However, their recognition accuracy has been limited by the accuracy of the underlying segmentation algorithm. Conventional, rule-based segmentation algorithms encounter difficulty if the characters are touching, broken, or noisy. The problem in these situations is that often one cannot properly segment a character until it is recognized yet one cannot properly recognize a character until it is segmented. We present here a neural network algorithm that simultaneously segments and recognizes in an integrated system. This algorithm has several novel features: it uses a supervised learning algorithm (backpropagation), but is able to take position-independent information as targets and self-organize the activities of the units in a competitive fashion to infer the positional information. We demonstrate this ability with overlapping handprinted numerals.
DOI: 10.1097/00001756-199803090-00008
1998
Cited 46 times
Somatotopy of the human arm using fMRI
We describe a technique for mapping out human somatosensory cortex using functional magnetic resonance imaging (fMRI). To produce cortical activation, a pneumatic apparatus presented subjects with a periodic series of air puffs in which a sliding window of five locations moved along the ventral surface of the left arm in a proximal-to-distal or distal-to-proximal direction. This approach, in which the phase-delay of the stimulus can be used to produce somatotopic maps of somatosensory cortex, is based on a method used to generate retinotopic maps of visual cortex. Functional images were acquired using an echoplanar 1.5T scanner and a T2*-weighted spiral acquisition pulse sequence. The periodic series of air puffs created phase-related activation in two cortical regions of the contralateral parietal lobe, the posterior bank of the central sulcus and a more posterior and lateral region.
DOI: 10.1080/01638539309544839
1993
Cited 39 times
A parallel distributed processing model of story comprehension and recall
An optimal control theory of story comprehension and recall is proposed within the framework of a “situation”‐state space. A point in situation‐state space is specified by a collection of propositions, each of which can have the values of either “present” or “absent.” A trajectory in situation‐state space is a temporally ordered sequence of situations. A reader's knowledge that the occurrence of one situation is likely to cause the occurrence of another situation is represented by a subjective conditional probability distribution. A multistate probabilistic (MSP) causal chain notation is also introduced for conveniently describing the knowledge structures implicitly represented by the subjective conditional probability distribution. A story is represented as a partially specified trajectory in situation‐state space, and thus, story comprehension is defined as the problem of inferring the most probable missing features of the partially specified story trajectory. The story‐recall process is also viewed as a procedure that solves the problem of estimating the most probable missing features of a partially specified trajectory, but the partially specified trajectory in this latter case is an episodic memory trace of the reader's understanding of the story. A parallel distributed processing (PDP) model whose connection strengths are derived from the MSP causal chain representation is then introduced. The PDP model is shown to solve the problem of estimating the missing features of a partially specified trajectory in situation‐state space, and the model's story‐recall performance is then qualitatively compared to known performance characteristics of human memory for stories.
DOI: 10.21236/ada090189
1980
Cited 33 times
An Interactive Activation Model of the Effect of Context in Perception. Part 2
Abstract: This paper is the second part of a two-part series introducing an interactive activation model of context effects in perception. In the previous part we developed the basic form of the model and showed how it accounts for several of the fundamental phenomena of word perception. In this part, we first present a number of new experiments and show how the model accounts for them. We then propose a number of extensions of the model to such cases as spoken input, pronunciation tasks, and words embedded in sentential context. Finally, we discuss the strengths and weaknesses of the model, pointing out further possible extensions to account for aspects of word perception it currently ignores. The new experiments all revolve around what we call the 'context enhancement effect' and are designed to assess the roles of direct and indirect evidence concerning the identity of a letter in an input string.
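The enhancement mechanism itself fits in a few lines. A minimal sketch with invented gains (one letter unit, one word unit; the numbers are chosen only so that feedback visibly matters, and nothing here reproduces the full model): with top-down feedback switched on, word-level activation boosts the letter well above what its direct evidence alone supports.

    def run(top_down):
        evidence = 0.3                      # direct input to the target letter
        letter, word = 0.0, 0.0
        for _ in range(30):
            word = max(0.0, word + 0.2 * letter - 0.1 * word)
            fb = 0.2 * word if top_down else 0.0
            letter = min(1.0, max(0.0, letter + 0.1 * evidence + fb - 0.1 * letter))
        return letter

    print(run(False), run(True))            # word context enhances the letter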
DOI: 10.1007/978-1-4612-5470-6_2
1983
Cited 29 times
A Glossary of Terms Including a Classification of Typing Errors
1973
Cited 24 times
Active semantic networks as a model of human memory
DOI: 10.7551/mitpress/1888.003.0013
2002
Cited 26 times
Learning Representations by Back-Propagating Errors
DOI: 10.1007/978-1-4612-5470-6_3
1983
Cited 23 times
Studies of Typing from the LNR Research Group
DOI: 10.1016/s0364-0213(82)80004-9
1982
Cited 23 times
Simulating a skilled typist: a study of skilled cognitive-motor performance
We review the major phenomena of skilled typing and propose a model for the control of the hands and fingers during typing. The model is based on an Activation-Trigger-Schema system in which a hierarchical structure of schemata directs the selection of the letters to be typed and then controls the hand and finger movements by a cooperative relaxation algorithm. The interactions of the patterns of activation and inhibition among the schemata determine the temporal ordering for launching the keystrokes. To account for the phenomenon of doubling errors, the model has only “type” schemata (no “token” schemata), with only a weak binding between the special schema that signals a doubling and its argument. The model exists as a working computer simulation and produces an output display of the hands and fingers moving over the keyboard. It reproduces some of the major phenomena of typing, including interkeystroke interval times, the pattern of transposition errors found in skilled typists, and doubling errors. Although the model is clearly inadequate or wrong in some of its features and assumptions, it serves as a useful first approximation for the understanding of skilled typing.
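The ordering mechanism can be caricatured in a few lines. A minimal sketch with made-up numbers (a serial-position activation gradient plus noise; nothing here reproduces the model's schema hierarchy or relaxation dynamics): the most active letter schema fires its keystroke and drops out, and the noise occasionally produces the transposition errors the model is meant to explain.

    import random

    random.seed(3)
    pending, typed = list("very"), []
    while pending:
        # activation falls with serial position among not-yet-typed letters
        acts = [1.0 - 0.25 * i + random.gauss(0, 0.15) for i in range(len(pending))]
        typed.append(pending.pop(max(range(len(pending)), key=acts.__getitem__)))
    print("".join(typed))    # usually "very"; noise can transpose adjacent letters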
DOI: 10.1016/b978-1-55860-200-7.50018-0
1991
Cited 23 times
Internal world models and supervised learning
Internal models of the environment have an important role to play in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this paper we demonstrate that certain classical problems associated with the notion of the “teacher” in supervised learning can be solved by judicious use of learned internal models as components of the adaptive system. In particular, we show how supervised learning algorithms can be utilized in cases in which an unknown dynamical system intervenes between actions and desired outcomes.
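The idea reduces to two interleaved gradient steps, sketched here under strong simplifications (a scalar plant y = sin(a), a linear forward model, and a linear controller; none of this is the paper's setup): the forward model is fit to observed action-outcome pairs, and the outcome-space error is passed back through that model's slope to update the controller.

    import numpy as np

    rng = np.random.default_rng(2)
    plant = np.sin                     # the unknown environment: action -> outcome
    wf = np.zeros(2)                   # learned forward model: y_hat = wf[0]*a + wf[1]
    wc, lr = 0.0, 0.1                  # controller: a = wc * desired outcome

    for step in range(3000):
        desired = rng.uniform(-0.8, 0.8)
        a = wc * desired + rng.normal(0, 0.1)          # exploration noise
        # 1) fit the forward model to the observed (action, outcome) pair
        y_hat = wf[0] * a + wf[1]
        wf -= lr * (y_hat - plant(a)) * np.array([a, 1.0])
        # 2) train the controller *through* the model: the distal error is
        #    turned into an action gradient by the model's slope wf[0]
        wc -= lr * (wf[0] * wc * desired + wf[1] - desired) * wf[0] * desired

    print(round(wc, 2))                # close to 1, since sin(a) is nearly linear near zero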
1990
Cited 21 times
Brain style computation: learning and generalization
DOI: 10.3758/bf03203842
1988
Cited 21 times
A simulation-based tutorial system for exploring parallel distributed processing
This article presents a simulation-based tutorial system for exploring parallel distributed processing (PDP) models of information processing. The system consists of software and an accompanying handbook. The intent of the package is to make the ideas underlying PDP accessible and to disseminate some of the main simulation programs that we have developed. This article presents excerpts from the handbook that describe the approach taken, the organization of the handbook, and the software that comes with it. An example is given that illustrates the approach we have taken to teaching PDP, which involves presentation of relevant mathematical background, together with tutorial exercises that make use of the simulation programs.
DOI: 10.1080/02103702.1982.10821949
1982
Cited 20 times
La representación del conocimiento en la memoria
ABSTRACT: Knowledge is organized in memory in the form of schemata: data structures that represent generic concepts. Schemata have variables, may embed within one another, vary in their level of abstraction, and represent knowledge associated with concepts. They are the key units of the processes of comprehending, storing, and retrieving information. They are also needed for drawing inferences, for predicting unobserved stimuli, and in analogical reasoning. The knowledge underlying the performance of actions consists of action schemata, which are subschemata of a more complex schema. Two mechanisms produce and modify schemata, specialization and generalization, and both can be regarded as kinds of learning. The processes that search for and activate a schema may be data-driven or conceptually driven.
KEYWORDS: schemata, knowledge, comprehension, memory, inferences, action schemata, activation
DOI: 10.1109/ijcnn.1991.170692
1991
Cited 20 times
The effective dimension of the space of hidden units
The authors show how the effective number of parameters changes during backpropagation training by analyzing the eigenvalue spectra of the covariance matrix of hidden-unit activations and of the matrix of weights between inputs and hidden units. They use the standard example of time-series prediction of the sunspot series. The effective ranks of these matrices are equal to each other when a solution is reached. This effective dimension is also equal to the number of hidden units of the minimal network obtained with weight-elimination.
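The measure itself is a one-liner on the eigenvalue spectrum. A minimal sketch on synthetic activations (the sizes, the planted three-dimensional structure, and the cutoff are arbitrary): the covariance of the hidden activations has as many significant eigenvalues as the data has effective dimensions.

    import numpy as np

    rng = np.random.default_rng(4)
    # hidden activations: 500 patterns x 10 units, with only 3 true dimensions
    H = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10))
    H += 0.01 * rng.normal(size=H.shape)

    eig = np.sort(np.linalg.eigvalsh(np.cov(H, rowvar=False)))[::-1]
    print(int((eig > 1e-2 * eig[0]).sum()))     # -> 3, the effective dimension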
1991
Cited 20 times
A Self-Organizing Integrated Segmentation and Recognition Neural Net
We present a neural network algorithm that performs segmentation and recognition of input patterns simultaneously, self-organizing to detect pattern locations and pattern boundaries. We demonstrate this architecture on character recognition using the NIST database and report the results here. The resulting system simultaneously segments and recognizes touching or overlapping characters, broken characters, and noisy images with high accuracy.
1986
Cited 17 times
Parallel Distributed Processing: Explorations in the Microstructure of Cognition : Psychological and Biological Models
DOI: 10.1016/0010-0277(81)90051-2
1981
Cited 16 times
The LNR approach to human information processing
DOI: 10.21437/icslp.1992-281
1992
Cited 16 times
Hybrid neural network/hidden Markov model continuous-speech recognition
1992
Cited 14 times
Context-Dependent Multiple Distribution Phonetic Modeling with MLPs
A number of hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speech recognition systems have been developed in recent years (Morgan and Bourlard, 1990). In this paper, we present a new MLP architecture and training algorithm that allows the modeling of context-dependent phonetic classes in a hybrid MLP/HMM framework. The new training procedure smooths MLPs trained at different degrees of context dependence in order to obtain a robust estimate of the context-dependent probabilities. Tests with the DARPA Resource Management database have shown substantial advantages of the context-dependent MLPs over earlier context-independent MLPs, and of this hybrid approach over a pure HMM approach.
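The smoothing step amounts to a convex blend of estimators. A minimal sketch with made-up numbers (the probabilities, the count, and the count-based weighting are invented; the paper's actual smoothing is trained across several degrees of context dependence): a sparsely trained context-dependent estimate is pulled toward the robust context-independent one in proportion to how little data supports it.

    import numpy as np

    p_ci = np.array([0.5, 0.3, 0.2])     # context-independent net output
    p_cd = np.array([0.8, 0.1, 0.1])     # context-dependent output (few samples)
    n_cd = 12                            # training frames seen for this context
    lam = n_cd / (n_cd + 50.0)           # hypothetical count-based weight
    print(lam * p_cd + (1 - lam) * p_ci) # smoothed context-dependent estimate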
DOI: 10.3758/bf03342814
1964
Cited 7 times
Perception by monkeys II. Use of cues at a distance by young and old monkeys
Fourteen rhesus monkeys and two human Os were trained to discriminate between identical blocks of wood placed 13 in apart, using cues that were provided by a pointer that was placed at random in positions spaced 1.0 in apart between the manipulanda. Monkeys made increasingly more errors as a function of increasing distance between the manipulandum and discriminandum, and extensive practice did not alter this relationship. The human Os, however, made no errors at positions of the pointer other than the center.
DOI: 10.4324/9781315108506-2
2017
Cited 6 times
Interactive Processing Through Spreading Activation
DOI: 10.1117/12.140155
1992
Cited 13 times
Self-organizing integrated segmentation and recognition neural network
We present a neural network algorithm that performs segmentation and recognition of input patterns simultaneously, self-organizing to detect pattern locations and pattern boundaries. We outline the algorithm, demonstrate the architecture on character recognition using the NIST database, and report the results here. The resulting system simultaneously segments and recognizes touching characters, overlapping characters, broken characters, and noisy images with high accuracy. An appendix details some characteristics of the algorithm on an artificial database.
1988
Cited 11 times
Explorations in parallel distributed processing - A handbook of models, programs and exercises
DOI: 10.7551/mitpress/4943.003.0128
1988
Cited 10 times
(1986) D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," Parallel Distributed Processing: Explorations in the Microstructures of Cognition, Vol. I, D. E. Rumelhart and J. L. McClelland (Eds.) Cambridge, MA: MIT Press, pp. 318-362
DOI: 10.1598/0872075028.41
2005
Cited 7 times
Toward an Interactive Model of Reading
1993
Cited 10 times
The neurobiological significance of the new learning models
DOI: 10.7551/mitpress/4631.003.0010
1998
Cited 10 times
The Architecture of Mind: A Connectionist Approach
1978
Cited 7 times
Accretion, Tuning, and Restructuring. Modes of Learning
DOI: 10.1037//0033-295x.89.1.60
1982
Cited 7 times
An interactive activation model of context effects in letter perception: II. The contextual enhancement effect and some tests and extensions of the model.
DOI: 10.1126/science.198.4319.816
1977
Cited 6 times
Cognitive Psychology: Cognition and Reality. Principles and Implications of Cognitive Psychology. Ulric Neisser. Freeman, San Francisco, 1976. xvi, 230 pp. Cloth, $12.50; paper, $4.95.
DOI: 10.2466/pr0.1963.12.1.251
1963
Cited 3 times
Enhancement of Amphetamine Sulphate Effects by Atropine in a Social Situation
1992
Cited 9 times
Introducción al procesamiento distribuido en paralelo
DOI: 10.7551/mitpress/3072.003.0006
1989
Cited 7 times
The Architecture of Mind: A Connectionist Approach