Intro
Preface
Organization
Contents
Part III
Reinforcement Learning
Consistency Regularization for Ensemble Model Based Reinforcement Learning
1 Introduction
2 Related Work
3 Background
4 Method
4.1 Model Discrepancy and Consistency
4.2 Model Learning
4.3 Implementation
5 Experiments
5.1 Comparative Evaluation
5.2 Effects of Consistency Regularization
5.3 Ablation Study
6 Conclusions
References
Detecting and Learning Against Unknown Opponents for Automated Negotiations
1 Introduction
2 Related Work
3 Preliminaries
3.1 Negotiation Settings
3.2 Bayes Policy Reuse
4 Agent Design
4.1 Deep Reinforcement Learning Based Learning Module
4.2 Policy Reuse Mechanism
5 Experiments
5.1 Experimental Setup
5.2 Performance Against ANAC Winning Agents
5.3 New Opponent Detection and Learning
6 Conclusion
References
Diversity-Based Trajectory and Goal Selection with Hindsight Experience Replay
1 Introduction
2 Background
2.1 Reinforcement Learning
2.2 Goal-Oriented Reinforcement Learning
2.3 Deep Deterministic Policy Gradient
2.4 Determinantal Point Processes
3 Related Work
4 Methodology
4.1 Diversity-Based Trajectory Selection
4.2 Diversity-Based Goal Selection
5 Experiments
5.1 Environments
5.2 Training Settings
5.3 Benchmark Results
5.4 Ablation Studies
5.5 Time Complexity
6 Conclusion
References
Off-Policy Training for Truncated TD(λ) Boosted Soft Actor-Critic
1 Introduction
2 Related Work
2.1 TD Learning and Multi-step Methods
2.2 TD(λ) and Eligibility Traces
3 Preliminaries
3.1 MDPs and Temporal Difference Learning
3.2 Multi-step Algorithms and TD(λ)
4 Soft Actor-Critic with Truncated TD(λ)
4.1 Off-Policy Truncated TD(λ)
4.2 Soft Actor-Critic with Truncated TD(λ)
4.3 SAC(λ) Training
5 Experiments
5.1 Evaluation of SAC(λ)
5.2 Ablation Study
6 Discussion
References
Adaptive Warm-Start MCTS in AlphaZero-Like Deep Reinforcement Learning
1 Introduction
2 Related Work
3 Warm-Start AlphaZero Self-play
3.1 The Algorithm Framework
3.2 MCTS
3.3 MCTS Enhancements
4 Adaptive Warm-Start Switch Method
5 Experimental Setup
6 Results
6.1 MCTS vs. MCTS Enhancements
6.2 Fixed I Tuning
6.3 Adaptive Warm-Start Switch
7 Discussion and Conclusion
References
Batch-Constraint Inverse Reinforcement Learning
1 Introduction
2 Offline Inverse Reinforcement Learning
3 Method
3.1 Feature Expectation Approximation
3.2 Policy Optimization with BRL
3.3 Batch-Constraint Inverse Reinforcement Learning Algorithm (BCIRL)
4 Experiments
4.1 Standard Control Environments
4.2 Gridworld Example
5 Conclusion
References
KG-RL: A Knowledge-Guided Reinforcement Learning for Massive Battle Games
1 Introduction
2 Related Work
3 Method
3.1 Rule-Mix
3.2 Plan-Extend
