DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

Xiaoteng Ma
Junyao Chen
Li Xia
Jun Yang
Qianchuan Zhao
Zhengyuan Zhou

Abstract

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines distributional information about accumulated rewards with the entropy-driven exploration of the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards, surpassing baseline performance on a variety of continuous control tasks. Unlike standard approaches that maximize only the expected return, we propose a unified framework for risk-sensitive learning that optimizes a risk-related objective while balancing entropy to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in improving agent performance on both risk-neutral and risk-sensitive control tasks.
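To make the abstract's "risk-related objective while balancing entropy" concrete, here is a minimal sketch of one common instantiation: a quantile representation of the return distribution with CVaR as the risk measure, plus a SAC-style temperature-weighted entropy bonus. The function and parameter names are illustrative assumptions, not the paper's actual implementation, and CVaR is only one of several risk measures such a framework can accommodate.

```python
import numpy as np

def risk_sensitive_objective(quantiles, policy_entropy, alpha=0.25, temperature=0.2):
    """Illustrative risk-sensitive objective: CVaR over return quantiles
    plus an entropy bonus.

    `quantiles` approximates the return distribution Z(s, a) with N equally
    weighted quantile samples, as in quantile-based distributional RL.
    CVaR_alpha averages the worst alpha-fraction of outcomes (alpha=1
    recovers the risk-neutral expected return); `temperature` weights the
    entropy term, as in SAC's maximum-entropy objective.
    """
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * len(q))))  # size of the worst alpha-fraction
    cvar = q[:k].mean()                       # average of the worst-case tail
    return cvar + temperature * policy_entropy
```

For example, with quantile samples `[1, 2, 3, 4]` and `alpha=0.5`, CVaR averages the two worst outcomes, giving 1.5 before the entropy bonus; setting `alpha=1.0` yields the ordinary mean return of 2.5.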
