001480836 000__ 05827cam\\22005297a\4500
001480836 001__ 1480836
001480836 003__ OCoLC
001480836 005__ 20231031003311.0
001480836 006__ m\\\\\o\\d\\\\\\\\
001480836 007__ cr\un\nnnunnun
001480836 008__ 230917s2023\\\\si\\\\\\ob\\\\001\0\eng\d
001480836 019__ $$a1399167499
001480836 020__ $$a9789819917907$$q(electronic bk.)
001480836 020__ $$a9819917905$$q(electronic bk.)
001480836 020__ $$z9819917891
001480836 020__ $$z9789819917891
001480836 0247_ $$a10.1007/978-981-99-1790-7$$2doi
001480836 035__ $$aSP(OCoLC)1397562450
001480836 040__ $$aYDX$$beng$$cYDX$$dEBLCP$$dGW5XE
001480836 049__ $$aISEA
001480836 050_4 $$aQA76.9.D35
001480836 08204 $$a005.73$$223/eng/20230927
001480836 1001_ $$aYamanishi, Kenji.
001480836 24510 $$aLearning with the minimum description length principle /$$cKenji Yamanishi.
001480836 260__ $$aSingapore :$$bSpringer,$$c2023.
001480836 300__ $$a1 online resource
001480836 504__ $$aIncludes bibliographical references and index.
001480836 5050_ $$aIntro -- Foreword -- Preface -- Contents -- Acronyms -- 1 Information and Coding -- 1.1 Information, Probability, and Coding -- 1.1.1 What is Information? -- 1.1.2 Prefix Coding -- 1.1.3 Kraft Inequality -- 1.2 Shannon Information -- 1.2.1 Probability Distribution -- 1.2.2 Shannon's Coding Theorem -- 1.2.3 Shannon Information -- 1.3 Universal Coding -- 1.3.1 Two-part Coding -- 1.3.2 Bayes Coding -- 1.3.3 Counting Coding -- 1.3.4 Normalized Maximum Likelihood Coding -- 1.3.5 Kolmogorov Complexity -- 1.4 Stochastic Complexity -- 1.4.1 Stochastic Complexity for Parametric Classes
001480836 5058_ $$a1.4.2 Shtarkov's Min-Max Regret -- 1.4.3 Generalized Coding Theorem -- 1.5 Parametric Complexity -- 1.5.1 Asymptotic Approximation Method -- 1.5.2 g-Function-Based Method* -- 1.5.3 Fourier Method* -- 1.5.4 Combinatorial Method -- 1.5.5 Monte Carlo Method -- 1.6 MDL Principle -- 1.6.1 Machine Learning with MDL Principle -- 1.6.2 Estimation -- 1.6.3 Prediction -- 1.6.4 Testing -- 1.7 Summary of This Chapter -- References -- 2 Parameter Estimation -- 2.1 Maximum Likelihood Estimation -- 2.1.1 Maximum Likelihood Estimator -- 2.1.2 MLE for Multivariate Gaussian and Outlier Detection
001480836 5058_ $$a2.1.3 MLE for Linear Regression -- 2.1.4 Properties of Maximum Likelihood Estimator -- 2.2 EM Algorithm -- 2.2.1 EM Algorithm for Latent Variable Models -- 2.2.2 Incremental EM Algorithm for Online Outlier Detection -- 2.3 Maximum a Posteriori Estimation -- 2.3.1 MAP Estimation and Regularization -- 2.3.2 Sparse Regularized Linear Regression -- 2.3.3 Sparse Regularized Graphical Model* -- 2.4 Gradient Descent Methods -- 2.4.1 Gradient Descent Algorithms -- 2.5 High-Dimensional Penalty Selection -- 2.5.1 Luckiness Normalized Maximum Likelihood Code-length -- 2.5.2 Penalty Selection with LNML*
001480836 5058_ $$a2.5.3 Analytical Bounds for LNML Code-length* -- 2.6 Bayesian Estimation -- 2.6.1 Bayesian Estimator -- 2.6.2 Gibbs Sampler -- 2.7 Summary of This Chapter -- References -- 3 Model Selection -- 3.1 Model Selection -- 3.1.1 Problem Setting -- 3.1.2 Akaike's Information Criterion -- 3.1.3 Bayesian Information Criterion -- 3.1.4 Minimum Message Length Criterion -- 3.1.5 Cross-Validation -- 3.2 Minimum Description Length Criterion -- 3.2.1 MDL Criterion -- 3.2.2 Consistency -- 3.2.3 Estimation Optimality* -- 3.2.4 Rate of Convergence -- 3.2.5 Sequential Normalized Maximum Likelihood Criterion
001480836 5058_ $$a3.3 Applications of MDL Criterion -- 3.3.1 Histogram Density Estimation -- 3.3.2 Non-negative Matrix Factorization* -- 3.3.3 Decision Tree Learning -- 3.3.4 Dimensionality Selection for Word Embedding -- 3.3.5 Time Series Model Selection -- 3.3.6 Multivariate Linear Regression* -- 3.4 Summary of This Chapter -- References -- 4 Latent Variable Model Selection -- 4.1 MDL Approach to Latent Variable Model Selection -- 4.1.1 Non-identifiability for Latent Variable Models -- 4.2 Latent Stochastic Complexity -- 4.2.1 LSC Criterion -- 4.2.2 Computational Complexity of LSC
001480836 506__ $$aAccess limited to authorized users.
001480836 520__ $$aThis book introduces readers to the minimum description length (MDL) principle and its applications in learning. The MDL is a fundamental principle for inductive inference, used in many applications including statistical modeling, pattern recognition, and machine learning. At its core, the MDL is based on the premise that the shortest code length leads to the best strategy for learning anything from data. The MDL provides a broad and unifying view of statistical inference tasks such as estimation, prediction, and testing, as well as machine learning. The content covers the theoretical foundations of the MDL and broad practical areas such as detecting changes and anomalies, problems involving latent variable models, and high-dimensional statistical inference, among others. The book offers an easy-to-follow guide to the MDL principle together with other information criteria, explaining the differences in their standpoints. Written in a systematic, concise, and comprehensive style, this book is suitable for researchers and graduate students of machine learning, statistics, information theory, and computer science.
001480836 588__ $$aOnline resource; title from PDF title page (SpringerLink, viewed September 27, 2023).
001480836 650_0 $$aData structures (Computer science)$$vCongresses$$0(DLC)sh2009122874
001480836 650_0 $$aMachine learning.$$vCongresses$$0(DLC)sh2008107143
001480836 655_0 $$aElectronic books.
001480836 77608 $$iPrint version: $$z9819917891$$z9789819917891$$w(OCoLC)1372391657
001480836 852__ $$bebk
001480836 85640 $$3Springer Nature$$uhttps://univsouthin.idm.oclc.org/login?url=https://link.springer.com/10.1007/978-981-99-1790-7$$zOnline Access$$91397441.1
001480836 909CO $$ooai:library.usi.edu:1480836$$pGLOBAL_SET
001480836 980__ $$aBIB
001480836 980__ $$aEBOOK
001480836 982__ $$aEbook
001480836 983__ $$aOnline
001480836 994__ $$a92$$bISE