Exploration of Dynamic Portfolio Optimization Strategies under Credit Risk Using Multi-Agent Reinforcement Learning Paradigms
DOI:
https://doi.org/10.63397/ISCSITR-IJCSE_04_02_001
Keywords:
Multi-Agent Reinforcement Learning, Portfolio Optimization, Credit Risk, Deep Q-Network, MAPPO, Financial Engineering, Dynamic Allocation, Risk-Adjusted Returns
Abstract
In the context of financial portfolio management, credit risk presents a significant challenge to dynamic allocation strategies. This study explores the application of Multi-Agent Reinforcement Learning (MARL) to optimize portfolio performance under credit risk constraints. By simulating interacting agents within a stochastic market environment, we assess how cooperative and competitive learning strategies adaptively manage asset allocations in response to changing credit conditions. We integrate credit risk modeling through probability of default (PD), loss given default (LGD), and exposure at default (EAD) into the reward structure. The experimental framework employs Deep Q-Network (DQN) and Multi-Agent Proximal Policy Optimization (MAPPO) to examine the performance of various agent interactions across volatile and stress-tested market scenarios. Results demonstrate that MARL-based strategies not only outperform traditional optimization models in cumulative returns and risk-adjusted metrics but also exhibit resilience to sudden credit shocks. These findings offer a promising direction for next-generation, credit-aware portfolio optimization tools.
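The abstract says that PD, LGD, and EAD enter the reward structure but does not give the functional form. One common way to combine them is the expected-loss product EL = PD × LGD × EAD, subtracted from the portfolio return as a penalty. The sketch below illustrates that idea only; the function names, the `risk_aversion` weight, and the choice of EAD as the amount invested in each asset are assumptions made for illustration, not the authors' specification.

```python
import numpy as np


def expected_loss(pd, lgd, ead):
    """Per-asset expected credit loss: EL_i = PD_i * LGD_i * EAD_i."""
    return np.asarray(pd, dtype=float) * np.asarray(lgd, dtype=float) * np.asarray(ead, dtype=float)


def credit_aware_reward(weights, asset_returns, pd, lgd,
                        portfolio_value=1.0, risk_aversion=1.0):
    """One-step reward: portfolio P&L minus a penalty on expected credit loss.

    Assumption (hypothetical, not from the paper): exposure at default for
    each asset is the amount currently invested in it, i.e. weight * value.
    """
    weights = np.asarray(weights, dtype=float)
    asset_returns = np.asarray(asset_returns, dtype=float)

    ead = weights * portfolio_value                    # assumed exposure per asset
    pnl = float(weights @ asset_returns) * portfolio_value
    total_el = float(expected_loss(pd, lgd, ead).sum())

    return pnl - risk_aversion * total_el


# Illustrative inputs only (not data from the study).
reward = credit_aware_reward(
    weights=[0.5, 0.5],
    asset_returns=[0.02, 0.04],
    pd=[0.01, 0.05],
    lgd=[0.4, 0.6],
)
```

In a multi-agent setting, each agent could receive this reward on its own sub-portfolio, or all agents could share the portfolio-level value; the abstract does not say which scheme the study uses.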
License
Copyright (c) 2023 Santhosh Kumar Sagar Nagaraj (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


