offlax.cql
Classes
CQLDiscrete: Implementation of the Conservative Q-Learning (CQL) algorithm.
- class offlax.cql.CQLDiscrete(rng, actor, critic, state_dims, action_dims, gamma, tau)
Bases: object
Implementation of the Conservative Q-Learning (CQL) algorithm.
Paper: https://arxiv.org/abs/2006.04779
- Parameters
rng (jax.random.PRNGKey)
actor (ActorDiscrete)
critic (Critic)
state_dims (List[int])
action_dims (int)
gamma (float)
tau (float)
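The parameter list above carries no descriptions, so the following is a minimal, framework-free sketch of the per-transition CQL(H) objective from the linked paper for a discrete action space. It is not offlax's implementation. It assumes that gamma is the discount factor in the Bellman target and that tau is a soft (Polyak) target-update coefficient, which the source does not confirm. The function names, the cql_weight argument, and the greedy max backup are hypothetical choices made for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_discrete_loss(q_values, action, reward, next_q_target, done,
                      gamma, cql_weight=1.0):
    """Per-transition CQL(H) loss for discrete actions (illustrative only).

    q_values:      Q(s, a') for every action a' in state s
    action:        index of the action stored in the dataset
    next_q_target: target-network Q(s', a') for every action a'
    gamma:         discount factor (assumed meaning of the `gamma` argument)
    cql_weight:    hypothetical weight on the conservative term
    """
    # Bellman target with a greedy backup (an assumption; an actor-based
    # backup would average next_q_target under the policy instead).
    bootstrap = 0.0 if done else gamma * max(next_q_target)
    td_error = q_values[action] - (reward + bootstrap)
    bellman_loss = 0.5 * td_error ** 2

    # Conservative term: push down log-sum-exp of Q over all actions and
    # push up Q on the dataset action. It is always non-negative.
    conservative_penalty = logsumexp(q_values) - q_values[action]
    return bellman_loss + cql_weight * conservative_penalty


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- (1 - tau) * target + tau * online.

    This is the assumed role of the `tau` argument.
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


loss = cql_discrete_loss([1.0, 2.0, 0.5], action=1, reward=1.0,
                         next_q_target=[0.0, 3.0, 1.0], done=False, gamma=0.9)
new_target = soft_update([0.0, 1.0], [1.0, 3.0], tau=0.1)
```

The conservative term is zero only when the dataset action dominates all others, so minimizing it lowers Q-values on actions the dataset never took. A small tau keeps the target network close to its previous value, which is the usual reason for exposing it as a constructor argument.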