Optimization
Reward Optimization
Title | Year | Author | Link | Memo |
---|---|---|---|---|
When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning | 2023 | Siliang Zeng et al. | | Two-level (bi-level) optimization under a conservative assumption |