DOI: 10.1016/b978-1-55860-307-3.50037-x
Combining Instance-Based and Model-Based Learning
This paper concerns learning tasks that require the prediction of a continuous value rather than a discrete class. A general method is presented that allows predictions to use both instance-based and model-based learning. Results with three approaches to constructing models and with eight datasets demonstrate improvements due to the composite method.
Keywords: learning with continuous classes, instance-based learning, model-based learning, empirical evaluation.
DOI: 10.1111/j.1467-8640.1989.tb00315.x
Instance-based prediction of real-valued attributes
Instance-based representations have been applied to numerous classification tasks with some success. Most of these applications involved predicting a symbolic class based on observed attributes. This paper presents an instance-based method for predicting a numeric value based on observed attributes. We prove that, given enough instances, if the numeric values are generated by continuous functions with bounded slope, then the predicted values are accurate approximations of the actual values. We demonstrate the utility of this approach by comparing it with a standard approach for value prediction. The instance-based approach requires neither ad hoc parameters nor background knowledge.
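As an illustration of the kind of method the abstract describes (not the paper's exact algorithm; the function name and the choice of an unweighted mean are assumptions), a minimal instance-based sketch predicts a numeric value by averaging the targets of the k nearest stored instances:

```python
import math

def knn_predict(train, query, k=3):
    """Predict a numeric value as the mean target of the k nearest
    training instances (Euclidean distance over the attributes)."""
    dists = sorted((math.dist(x, query), y) for x, y in train)
    neighbors = dists[:k]
    return sum(y for _, y in neighbors) / len(neighbors)

# Target: f(x) = 2x, a continuous function with bounded slope,
# matching the setting of the paper's approximation guarantee.
train = [((x / 10.0,), 2 * x / 10.0) for x in range(11)]
knn_predict(train, (0.55,), k=2)  # averages the targets at x=0.5 and x=0.6
```

With dense enough coverage of the input space, the averaged neighbors bracket the query and the prediction tracks the underlying function.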
DOI: 10.1080/01621459.1981.10477729
Projection Pursuit Regression
A new method for nonparametric multiple regression is presented. The procedure iteratively models the regression surface as a sum of general smooth functions of linear combinations of the predictor variables. It is more general than standard stepwise and stagewise regression procedures, does not require the definition of a metric in the predictor space, and lends itself to graphical interpretation.
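A toy sketch of the idea, under strong simplifying assumptions that are not in the paper (2-D inputs, a single ridge term, the direction found by a grid search over angles, smoothing by bin means; the real procedure uses proper smoothers and iteratively backfits residuals):

```python
import math
import random

def fit_ridge_term(X, y, n_bins=8, n_angles=60):
    """Fit one term g(a.x): scan candidate unit directions a, smooth y
    against the projection with bin means, keep the best direction."""
    def smooth(z, y):
        lo, hi = min(z), max(z)
        width = (hi - lo) / n_bins or 1.0
        bins = {}
        for zi, yi in zip(z, y):
            b = min(int((zi - lo) / width), n_bins - 1)
            bins.setdefault(b, []).append(yi)
        means = {b: sum(v) / len(v) for b, v in bins.items()}
        def g(zq):
            b = min(max(int((zq - lo) / width), 0), n_bins - 1)
            b = min(means, key=lambda k: abs(k - b))  # nearest populated bin
            return means[b]
        return g

    best = None
    for i in range(n_angles):
        t = math.pi * i / n_angles
        a = (math.cos(t), math.sin(t))
        z = [a[0] * x1 + a[1] * x2 for x1, x2 in X]
        g = smooth(z, y)
        err = sum((yi - g(zi)) ** 2 for zi, yi in zip(z, y))
        if best is None or err < best[0]:
            best = (err, a, g)
    _, a, g = best
    return lambda x: g(a[0] * x[0] + a[1] * x[1])

random.seed(0)
X = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
y = [(x1 + x2) ** 2 for x1, x2 in X]  # a ridge function of x1 + x2
model = fit_ridge_term(X, y)
```

Because the target depends on the inputs only through x1 + x2, the direction search recovers (roughly) that projection; note that no metric on the full predictor space is needed, only on the one-dimensional projection.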
DOI: 10.1214/aos/1176347126
On Projection Pursuit Regression
We construct a tractable mathematical model for kernel-based projection pursuit regression approximation. The model permits computation of explicit formulae for the bias and variance of estimators. It is shown that the bias of an orientation estimate dominates error about the mean; indeed, the latter is asymptotically negligible in comparison with bias. However, bias and error about the mean are of the same order in the case of projection pursuit curve estimates. Implications of our formulae for bias and variance are discussed.
DOI: 10.1214/aos/1176347963
Multivariate Adaptive Regression Splines
A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data. This procedure is motivated by the recursive partitioning approach to regression and shares its attractive properties. Unlike recursive partitioning, however, this method produces continuous models with continuous derivatives. It has more power and flexibility to model relationships that are nearly additive or involve interactions in at most a few variables. In addition, the model can be represented in a form that separately identifies the additive contributions and those associated with the different multivariable interactions.
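The product spline basis functions are built from truncated linear ("hinge") functions and their products. A minimal sketch of evaluating such an expansion, with knots, signs, and coefficients hand-chosen for illustration (MARS itself determines them automatically from the data):

```python
def hinge(x, t, sign=+1):
    """Truncated linear spline basis: max(0, +(x - t)) or max(0, t - x)."""
    return max(0.0, sign * (x - t))

def mars_eval(coefs, terms, x):
    """Evaluate sum_k c_k * prod_j hinge(x[var], knot, sign).
    An interaction term is a product over more than one variable."""
    total = coefs[0]  # intercept
    for c, term in zip(coefs[1:], terms):
        p = 1.0
        for var, knot, sign in term:
            p *= hinge(x[var], knot, sign)
        total += c * p
    return total

# |x - 0.5| expressed with a mirrored hinge pair at knot 0.5:
coefs = [0.0, 1.0, 1.0]
terms = [[(0, 0.5, +1)], [(0, 0.5, -1)]]
mars_eval(coefs, terms, [0.8])  # max(0, 0.3) + max(0, -0.3), i.e. about 0.3
```

The mirrored pair illustrates how the expansion stays continuous with continuous derivatives away from the knots, unlike the piecewise-constant fits of recursive partitioning; single-variable terms give the additive contributions, product terms the interactions.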
DOI: 10.1007/978-1-4612-2660-4_33
Learning to Catch: Applying Nearest Neighbor Algorithms to Dynamic Control Tasks
This paper examines the hypothesis that local weighted variants of k-nearest neighbor algorithms can support dynamic control tasks. We evaluated several k-nearest neighbor (k-NN) algorithms on the simulated learning task of catching a flying ball. Previously, local regression algorithms have been advocated for this class of problems. These algorithms, which are variants of k-NN, base their predictions on a (possibly weighted) regression computed from the k nearest neighbors. While they outperform simpler k-NN algorithms on many tasks, they have trouble on this ball-catching task. We hypothesize that the non-linearities in this task are the cause of this behavior, and that local regression algorithms may need to be modified to work well under similar conditions.
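The local regression variants discussed can be sketched as a distance-weighted linear fit over the k nearest neighbors, evaluated at the query point. This is an illustrative reconstruction, not the paper's exact algorithms; the function name and the tricube weighting are assumptions:

```python
def local_linear_predict(train, xq, k=5):
    """Locally weighted regression: fit a distance-weighted line to the
    k nearest neighbors of xq (1-D inputs) and evaluate it at xq."""
    neigh = sorted(train, key=lambda p: abs(p[0] - xq))[:k]
    h = max(abs(x - xq) for x, _ in neigh) or 1.0
    # Tricube weights; the tiny offset keeps the farthest point's weight > 0.
    w = [(1 - (abs(x - xq) / h) ** 3) ** 3 + 1e-9 for x, _ in neigh]
    sw = sum(w)
    xm = sum(wi * x for wi, (x, _) in zip(w, neigh)) / sw
    ym = sum(wi * y for wi, (_, y) in zip(w, neigh)) / sw
    sxx = sum(wi * (x - xm) ** 2 for wi, (x, _) in zip(w, neigh))
    sxy = sum(wi * (x - xm) * (y - ym) for wi, (x, y) in zip(w, neigh))
    slope = sxy / sxx if sxx else 0.0
    return ym + slope * (xq - xm)

train = [(x / 10.0, (x / 10.0) ** 2) for x in range(21)]  # y = x^2 on [0, 2]
local_linear_predict(train, 0.55)  # close to 0.55**2 = 0.3025
```

On a smooth target like this the local line tracks the curve well; the paper's point is that on strongly non-linear control tasks such local fits can behave worse than the plain k-NN average they generalize.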
DOI: 10.1023/a:1022689900470
Instance-Based Learning Algorithms
Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
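The storage-reduction idea, keeping an instance only when the current concept description misclassifies it, can be sketched in a few lines. This is a simplified 1-D illustration of that idea, not the paper's full algorithms (which also include the noise-filtering significance test):

```python
def ib2_fit(stream):
    """Storage reduction sketch: classify each incoming instance with
    the nearest stored instance; store it only if misclassified."""
    stored = []
    for x, label in stream:
        if stored:
            nearest = min(stored, key=lambda p: abs(p[0] - x))
            if nearest[1] == label:
                continue  # correctly classified: do not store
        stored.append((x, label))
    return stored

# Two well-separated 1-D classes: only the first instance of each
# class needs to be kept to classify the rest correctly.
stream = [(v / 10.0, v >= 5) for v in range(10)]
concept = ib2_fit(stream)
print(len(concept), len(stream))  # prints "2 10"
```

With noisy labels this simple filter degrades quickly, which is precisely what motivates the significance-test extension described in the abstract.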
MAG: 3017143921
Pattern classification and scene analysis
Provides a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition. The topics treated include Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.