Intro
Preface
Organization
Invited Talks Abstracts
WuDao: Pretrain the World
The Value of Data for Personalization
AI Fairness in Practice
Safety and Robustness for Deep Learning with Provable Guarantees
Contents
Part I
Online Learning
Routine Bandits: Minimizing Regret on Recurring Problems
1 Introduction
2 The Routine Bandit Setting
3 The KLUCB-RB Strategy
4 Sketch of Proof
5 Numerical Experiments
5.1 More Arms Than Bandits: A Beneficial Case
5.2 Increasing the Number of Bandit Instances
5.3 Critical Settings
6 Conclusion

3 Knowledge Infused Policy Gradients
4 Formulation of Knowledge Infusion
5 Regret Bound for KIPG
6 KIPG-Upper Confidence Bound
7 Experiments
7.1 Simulated Domains
7.2 Real-World Datasets
8 Conclusion and Future Work
References
Exploiting History Data for Nonstationary Multi-armed Bandit
1 Introduction
2 Related Works
3 Problem Formulation
4 The BR-MAB Algorithm
4.1 Break-Point Prediction Procedure
4.2 Recurrent Concepts Equivalence Test
4.3 Regret Analysis for Generic CD-MABs
4.4 Regret Analysis for the Break-Point Prediction Procedure
5 Experiments
5.1 Toy Example
5.2 Synthetic Setting
5.3 Yahoo! Setting
6 Conclusion and Future Works
References
High-Probability Kernel Alignment Regret Bounds for Online Kernel Selection
1 Introduction
1.1 Related Work
2 Problem Setting
3 A Nearly Optimal High-Probability Regret Bound
3.1 Warm-Up
3.2 A More Efficient Algorithm
3.3 Regret Bound
3.4 Time Complexity Analysis
4 Regret-Performance Trade-Off
4.1 Regret Bound
4.2 Budgeted EA2OKS
5 Experiments
5.1 Experimental Setting
5.2 Experimental Results
6 Conclusion
References

Reinforcement Learning
Periodic Intra-ensemble Knowledge Distillation for Reinforcement Learning
1 Introduction
2 Related Work
3 Background
4 Method
4.1 Overview
4.2 Ensemble Initialization
4.3 Joint Training
4.4 Intra-ensemble Knowledge Distillation
5 Experiments
5.1 Experimental Setup
5.2 Effectiveness of PIEKD
5.3 Effectiveness of Knowledge Distillation for Knowledge Sharing
5.4 Effectiveness of Selecting the Best-Performing Agent as the Teacher
5.5 Ablation Study on Ensemble Size
5.6 Ablation Study on Distillation Interval
6 Conclusion
