Table of Contents
Preface; Contents; Contributors; Part I Foundations; 1 Introduction to Sound Scene and Event Analysis; 1.1 Motivation; 1.2 What is Computational Analysis of Sound Scenes and Events?; 1.3 Related Fields; 1.4 Scientific and Technical Challenges in Computational Analysis of Sound Scenes and Events; 1.5 About This Book; References; 2 The Machine Learning Approach for Analysis of Sound Scenes and Events; 2.1 Introduction; 2.2 Analysis Systems Overview; 2.3 Data Acquisition; 2.3.1 Source Audio; 2.3.2 Reference Annotations; 2.4 Audio Processing; 2.4.1 Pre-processing; 2.4.2 Feature Extraction
2.5 Supervised Learning and Recognition; 2.5.1 Learning; 2.5.2 Generalization; 2.5.3 Recognition; 2.6 An Example Approach Based on Neural Networks; 2.6.1 Sound Classification; 2.6.2 Sound Event Detection; 2.7 Development Process of Audio Analysis Systems; 2.7.1 Technological Research; 2.7.2 Product Demonstrations; 2.7.3 Development Process; 2.8 Conclusions; References; 3 Acoustics and Psychoacoustics of Sound Scenes and Events; 3.1 Introduction; 3.2 Acoustic and Psychoacoustic Characteristics of Auditory Scenes and Events; 3.2.1 Acoustic Characteristics of Sound Scenes and Events
3.2.1.1 Periodic and Non-periodic Signals; 3.2.1.2 Sound Production and Propagation; 3.2.2 Psychoacoustics of Auditory Scenes and Events; 3.2.2.1 Models of Peripheral Auditory Processing; 3.2.2.2 Pitch and Loudness; 3.2.2.3 The Dimensional Approach to Timbre; 3.3 The Perception of Auditory Scenes; 3.3.1 Multidimensional Representation; 3.3.2 Temporal Coherence; 3.3.3 Other Effects in Segregation; 3.4 The Perception of Sound Events; 3.4.1 Perception of the Properties of Sound Events: Psychomechanics; 3.4.1.1 Material; 3.4.1.2 Shape and Size; 3.4.1.3 Parameters of Actions
3.4.2 Minimal and Sparse Features for Sound Recognition; 3.4.2.1 Spectral Regions, Minimal Durations, and Spectro-Temporal Modulations; 3.4.2.2 Sparse Features; 3.4.3 Discussion: On the Dimensionality of Auditory Representations; 3.5 Summary; References; Part II Core Methods; 4 Acoustic Features for Environmental Sound Analysis; 4.1 Introduction; 4.2 Signal Representations; 4.2.1 Signal Acquisition and Preprocessing; 4.2.2 General Time-Frequency Representations; 4.2.3 Log-Frequency and Perceptually Motivated Representations; 4.2.4 Multiscale Representations; 4.2.5 Discussion
4.3 Feature Engineering; 4.3.1 Temporal Features; 4.3.2 Spectral Shape Features; 4.3.3 Cepstral Features; 4.3.4 Perceptually Motivated Features; 4.3.5 Spectrogram Image-Based Features; 4.3.6 Discussion; 4.4 Feature Learning; 4.4.1 Deep Learning for Feature Extraction; 4.4.2 Matrix Factorisation Techniques; 4.4.3 Discussion; 4.5 Dimensionality Reduction and Feature Selection; 4.5.1 Dimensionality Reduction; 4.5.2 Feature Selection Paradigms; 4.5.3 Filter Approaches; 4.5.4 Embedded Feature Selection; 4.5.4.1 Feature Selection by Sparsity-Inducing Norms; 4.5.4.2 Multiple Kernel Learning