In 1982, Springer published the English translation of the Russian book "Estimation of Dependencies Based on Empirical Data", which became the foundation of the statistical theory of learning and generalization (the VC theory). A number of new principles and new technologies of learning, including SVM technology, have been developed based on this theory. The second edition of this book contains two parts: a reprint of the first edition, which provides the classical foundation of Statistical Learning Theory, and four new chapters describing the latest ideas in the development of statistical inference methods. These new chapters form the second part of the book, entitled "Empirical Inference Science". Along with new models of inference, the second part discusses the general philosophical principles of making inferences from observations. It includes new paradigms of inference that use non-inductive methods appropriate for a complex world, in contrast to the inductive methods of inference developed in the classical philosophy of science for a simple world. The two parts of the book cover a wide spectrum of ideas related to the essence of intelligence: from the rigorous statistical foundation of learning models to broad philosophical imperatives for generalization. The book is intended for researchers who deal with a variety of problems in empirical inference: statisticians, mathematicians, physicists, computer scientists, and philosophers.
1 REALISM AND INSTRUMENTALISM: CLASSICAL STATISTICS AND VC THEORY (1960–1980)
1.1 The Beginning
1.1.1 The Perceptron
1.1.2 Uniform Law of Large Numbers
1.2 Realism and Instrumentalism in Statistics and the Philosophy of Science
1.2.1 The Curse of Dimensionality and Classical Statistics
1.2.2 The Black Box Model
1.2.3 Realism and Instrumentalism in the Philosophy of Science
1.3 Regularization and Structural Risk Minimization
1.3.1 Regularization of Ill-Posed Problems
1.3.2 Structural Risk Minimization
1.4 The Beginning of the Split Between Classical Statistics and Statistical Learning Theory
1.5 The Story Behind This Book

2 FALSIFIABILITY AND PARSIMONY: VC DIMENSION AND THE NUMBER OF ENTITIES (1980–2000)
2.1 Simplification of VC Theory
2.2 Capacity Control
2.2.1 Bell Labs
2.2.2 Neural Networks
2.2.3 Neural Networks: The Challenge
2.3 Support Vector Machines (SVMs)
2.3.1 Step One: The Optimal Separating Hyperplane
2.3.2 The VC Dimension of the Set of ρ-Margin Separating Hyperplanes
2.3.3 Step Two: Capacity Control in Hilbert Space
2.3.4 Step Three: Support Vector Machines
2.3.5 SVMs and Nonparametric Statistical Methods
2.4 An Extension of SVMs: SVM+
2.4.1 Basic Extension of SVMs
2.4.2 Another Extension of SVM: SVMγ+
2.4.3 Learning Using Hidden Information
2.5 Generalization for Regression Estimation Problem
2.5.1 SVM Regression
2.5.2 SVM+ Regression
2.5.3 SVMγ+ Regression
2.6 The Third Generation
2.7 Relation to the Philosophy of Science
2.7.1 Occam’s Razor Principle
2.7.2 Principles of Falsifiability
2.7.3 Popper’s Mistakes
2.7.4 Principle of VC Falsifiability
2.7.5 Principle of Parsimony and VC Falsifiability
2.8 Inductive Inference Based on Contradictions
2.8.1 SVMs in the Universum Environment
2.8.2 The First Experiments and General Speculations

3 NONINDUCTIVE METHODS OF INFERENCE: DIRECT INFERENCE INSTEAD OF GENERALIZATION (2000–...)
3.1 Inductive and Transductive Inference
3.1.1 Transductive Inference and the Symmetrization Lemma
3.1.2 Structural Risk Minimization for Transductive Inference
3.1.3 Large Margin Transductive Inference
3.1.4 Examples of Transductive Inference
3.1.5 Transductive Inference Through Contradictions
3.2 Beyond Transduction: The Transductive Selection Problem
3.2.1 Formulation of Transductive Selection Problem
3.3 Directed Ad Hoc Inference (DAHI)
3.3.1 The Idea Behind DAHI
3.3.2 Local and Semi-Local Rules
3.3.3 Estimation of Conditional Probability Along the Line
3.3.4 Estimation of Cumulative Distribution Functions
3.3.5 Synergy Between Inductive and Ad Hoc Rules
3.3.6 DAHI and the Problem of Explainability
3.4 Philosophy of Science for a Complex World
3.4.1 Existence of Different Models of Science
3.4.2 Imperative for a Complex World
3.4.3 Restrictions on the Freedom of Choice in Inference Models
3.4.4 Metaphors for Simple and Complex Worlds

4 THE BIG PICTURE
4.1 Retrospective of Recent History
4.1.1 The Great 1930s: Introduction of the Main Models
4.1.2 The Great 1960s: Introduction of the New Concepts
4.1.3 The Great 1990s: Introduction of the New Technology
4.1.4 The Great 2000s: Connection to the Philosophy of Science
4.1.5 Philosophical Retrospective
4.2 Large Scale Retrospective
4.2.1 Natural Science
4.2.2 Metaphysics
4.2.3 Mathematics
4.3 Shoulders of Giants
4.3.1 Three Elements of Scientific Theory
4.3.2 Between Trivial and Inaccessible
4.3.3 Three Types of Answers
4.3.4 The Two-Thousand-Year-Old War Between Natural Science and Metaphysics
4.4 To My Students’ Students
4.4.1 Three Components of Success
4.4.2 The Misleading Legend About Mozart
4.4.3 Horowitz’s Recording of Mozart’s Piano Concerto
4.4.4 Three Stories
4.4.5 Destructive Socialist Values
4.4.6 Theoretical Science Is Not Only a Profession—It Is a Way of Life