Responsible Agentic Reasoning and AI Agents: A Critical Survey

Authors: Shaina Raza*, Ranjan Sapkota*, Manoj Karkee, Christos Emmanouilidis
Affiliations: Vector Institute; Cornell University; University of Groningen
Equal contribution: Shaina Raza and Ranjan Sapkota

If you use this work, please cite us (see BibTeX below).

Overview

What is R²A²?
Responsible Reasoning AI Agents (R²A²) are LLM-powered agents that perform multi-step reasoning with built-in safeguards — bias checks, privacy protection, audit logs, and robustness tests — applied at every reasoning step, not just the final output.
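
As a minimal sketch of what "applied at every reasoning step" could look like, the snippet below runs a list of checks over each step of a reasoning trace and records every result in an audit log. It is illustrative only and is not the survey's implementation: `AuditedTrace`, `privacy_check`, and `nonempty_check` are hypothetical names, and the two checks are toy stand-ins for real bias, privacy, and robustness tests.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A check inspects one reasoning step and returns (passed, note).
Check = Callable[[str], Tuple[bool, str]]

# Deliberately simple pattern; a real privacy filter would be far broader.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def privacy_check(step: str) -> Tuple[bool, str]:
    """Toy privacy check (assumption): flag steps containing an email address."""
    if EMAIL.search(step):
        return False, "possible email address in step"
    return True, "ok"


def nonempty_check(step: str) -> Tuple[bool, str]:
    """Toy robustness check (assumption): reject empty or whitespace-only steps."""
    if not step.strip():
        return False, "empty reasoning step"
    return True, "ok"


@dataclass
class AuditedTrace:
    """Runs every check on every step and keeps an audit log of the results."""
    checks: List[Check]
    log: List[Dict[str, object]] = field(default_factory=list)

    def run(self, steps: List[str]) -> bool:
        """Return True if all steps pass all checks; stop at the first failure."""
        for index, step in enumerate(steps):
            for check in self.checks:
                passed, note = check(step)
                self.log.append(
                    {"step": index, "check": check.__name__,
                     "passed": passed, "note": note}
                )
                if not passed:
                    return False  # halt before later steps are accepted
        return True


if __name__ == "__main__":
    trace = AuditedTrace([privacy_check, nonempty_check])
    ok = trace.run(["Plan the lookup", "Email alice@example.com the result"])
    print(ok)
    for entry in trace.log:
        print(entry)
```

Stopping at the first failed check is one possible design choice; a system could instead log the failure and route the step to a human reviewer.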

Why now? With the 2024–2025 wave of reasoning models and agentic browsers, reaching production in high-stakes domains requires trace-level evaluation (faithfulness, safety, privacy), continuous auditing, and human-in-the-loop oversight.

BibTeX:

@article{raza2025responsible,
  author       = {Shaina Raza and Ranjan Sapkota and Manoj Karkee and Christos Emmanouilidis},
  title        = {Responsible Agentic Reasoning and AI Agents: A Critical Survey},
  journal      = {TechRxiv},
  year         = {2025},
  month        = sep,
  day          = {08},
  doi          = {10.36227/techrxiv.175735299.97215847/v1},
  note         = {Preprint}
}

License

This repository is licensed under the MIT License (see LICENSE).

Acknowledgments

We thank contributors and readers who provide feedback and issue reports. PRs welcome!

📚 References (Inline View)

#	Key	Title	Authors	Venue	Year	Link
1	`venerito2025reasoning`	Reasoning large language models in rheumatology: a call for responsible action	Venerito, Vincenzo and Iannone, Florenzo and Gupta, Latika	The Lancet Rheumatology	2025
2	`nist2023airmf`	Artificial Intelligence Risk Management Framework (AI RMF 1.0)	National Institute of Standards and Technology		2023	DOI/URL
3	`johnson2019billion`	Billion-scale similarity search with GPUs	Johnson, Jeff and Douze, Matthijs and J{\'e	IEEE Transactions on Big Data	2019
4	`oecd2019ai`	Recommendation of the Council on Artificial Intelligence	Organisation for Economic Co-operation and Development		2019	DOI/URL
5	`eu2024aiact`	Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 on Artificial Intelligence and amending certain Union legislative acts (Artificial Intelligence Act)	European Parliament and Council of the European Union		2024	DOI/URL
6	`ieee2019ethics`	Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems	IEEE		2019	DOI/URL
7	`chen2025reasoning`	Reasoning Models Don’t Always Say What They Think	Chen, Yanda and Benton, Joe and Radhakrishnan, Ansh and Uesato, Jonathan and Denison, Carson and Schulman, John and Somani, Arushi and Hase, Peter and Wagner, Misha and Roger, Fabien and Mikulik, Vlad and Bowman, Samuel R. and Leike, Jan and Kaplan, Jared and Perez, Ethan and Alignment Science Team, Anthropic	arXiv preprint arXiv:2505.05410	2025	DOI/URL
8	`xu2025towards`	Towards large reasoning models: A survey of reinforced reasoning with large language models	Xu, Fengli and Hao, Qianyue and Zong, Zefang and Wang, Jingwei and Zhang, Yunke and Wang, Jingyi and Lan, Xiaochong and Gong, Jiahui and Ouyang, Tianjian and Meng, Fanjin and others	arXiv preprint arXiv:2501.09686	2025
9	`karpas2022mrkl`	MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning	Karpas, Ehud and Abend, Omri and Belinkov, Yonatan and Lenz, Barak and Lieber, Opher and Ratner, Nir and Shoham, Yoav and Bata, Hofit and Levine, Yoav and Leyton-Brown, Kevin and others	arXiv preprint arXiv:2205.00445	2022
10	`raza2024beads`	Beads: Bias evaluation across domains	Raza, Shaina and Rahman, Mizanur and Zhang, Michael R	arXiv preprint arXiv:2406.04220	2024
11	`xia2025evaluating`	Evaluating mathematical reasoning beyond accuracy	Xia, Shijie and Li, Xuefeng and Liu, Yixin and Wu, Tongshuang and Liu, Pengfei	Proceedings of the AAAI Conference on Artificial Intelligence	2025
12	`raza2025humanibench`	Humanibench: A human-centric framework for large multimodal models evaluation	Raza, Shaina and Narayanan, Aravind and Khazaie, Vahid Reza and Vayani, Ashmal and Chettiar, Mukund S and Singh, Amandeep and Shah, Mubarak and Pandya, Deval	arXiv preprint arXiv:2505.11454	2025
13	`dafoe2018ai`	AI governance: a research agenda	Dafoe, Allan	Governance of AI Program, Future of Humanity Institute, University of Oxford: Oxford, UK	2018
14	`10771762`	Exploring Bias and Prediction Metrics to Characterise the Fairness of Machine Learning for Equity-Centered Public Health Decision-Making: A Narrative Review	Raza, Shaina and Shaban-Nejad, Arash and Dolatabadi, Elham and Mamiya, Hiroshi	IEEE Access	2024	DOI/URL
15	`putnam_axiom2024`	Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning	Aryan Gulati and Brando Miranda and Eric Chen and Emily Xia and Kai Fronsdal and Bruno de Moraes Dumont and Sanmi Koyejo	38th Conference on Neural Information Processing Systems (NeurIPS 2024) Workshop on MATH-AI	2024	DOI/URL
16	`OperaBrowserOperator2025`	Meet Opera’s AI Browser Operator	Opera Software		2025
17	`Comet2025`	Introducing Comet: Browse at the Speed of Thought	Perplexity Team		2025
18	`Dia2025`	Dia Browser	Dia Browser		2025
19	`OpenAIOperator2025`	OpenAI Operator	OpenAI		2025
20	`sapkota2025multimodal`	Multimodal large language models for image, text, and speech data augmentation: A survey	Sapkota, Ranjan and Raza, Shaina and Shoman, Maged and Paudel, Achyut and Karkee, Manoj	arXiv preprint arXiv:2501.18648	2025
21	`ClaudeArtifacts2024`	Claude 3.5 Sonnet Launch & Artifacts Preview	Anthropic		2024
22	`CowPilot2025`	CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation	Faria Huq and Zora Zhiruo Wang and Frank F. Xu and Tianyue Ou and Shuyan Zhou and Jeffrey P. Bigham and Graham Neubig	arXiv preprint	2025	DOI/URL
23	`SWEAgent2024`	SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering	John Yang and Carlos E. Jimenez and Alexander Wettig and Kilian Lieret and Shunyu Yao and Karthik Narasimhan and Ofir Press	arXiv preprint	2024	DOI/URL
24	`MistralAgentsAPI2025`	Build AI Agents with the Mistral Agents API	Mistral AI		2025
25	`chollet2019measure`	On the measure of intelligence	Chollet, François	arXiv preprint arXiv:1911.01547	2019
26	`chollet2025arc`	Arc-agi-2: A new challenge for frontier ai reasoning systems	Chollet, Francois and Knoop, Mike and Kamradt, Gregory and Landers, Bryan and Pinkard, Henry	arXiv preprint arXiv:2505.11831	2025
27	`yue2024mmmumassivemultidisciplinemultimodal`	MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI	Xiang Yue and Yuansheng Ni and Kai Zhang and Tianyu Zheng and Ruoqi Liu and Ge Zhang and Samuel Stevens and Dongfu Jiang and Weiming Ren and Yuxuan Sun and Cong Wei and Botao Yu and Ruibin Yuan and Renliang Sun and Ming Yin and Boyuan Zheng and Zhenzhu Yang and Yibo Liu and Wenhao Huang and Huan Sun and Yu Su and Wenhu Chen		2024	DOI/URL
28	`peiyuan_liu_2023`	MMLU Dataset	Peiyuan Liu	Kaggle	2023	DOI/URL
29	`PerplexityComet2025`	Comet: The Browser That Thinks With You	Perplexity AI		2025
30	`dominguez2024training`	Training on the test task confounds evaluation and emergence	Dominguez-Olmedo, Ricardo and Dorner, Florian E and Hardt, Moritz	arXiv preprint arXiv:2407.07890	2024
31	`OpenAIChatGPTAgent2025`	Introducing ChatGPT Agent: Bridging Research and Action	OpenAI		2025
32	`lee2024vhelm`	Vhelm: A holistic evaluation of vision language models	Lee, Tony and Tu, Haoqin and Wong, Chi Heem and Zheng, Wenhao and Zhou, Yiyang and Mai, Yifan and Roberts, Josselin and Yasunaga, Michihiro and Yao, Huaxiu and Xie, Cihang and others	Advances in Neural Information Processing Systems	2024
33	`pineau2020improving`	Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)	Joelle Pineau and Philippe Vincent-Lamarre and Koustuv Sinha and Vincent Larivière and Alina Beygelzimer and Florence d'Alché-Buc and Emily Fox and Hugo Larochelle		2020	DOI/URL
34	`AWSStrandsAgents2025`	Introducing Strands Agents, an Open Source AI Agents SDK	AWS Open Source		2025
35	`he2024webvoyager`	Webvoyager: Building an end-to-end web agent with large multimodal models	He, Hongliang and Yao, Wenlin and Ma, Kaixin and Yu, Wenhao and Dai, Yong and Zhang, Hongming and Lan, Zhenzhong and Yu, Dong	arXiv preprint arXiv:2401.13919	2024
36	`GeminiMariner2024`	Introducing Gemini 2.0: Our New AI Model for the Agentic Era	Google DeepMind		2024
37	`zhang2024litewebagent`	LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications	Danqing Zhang and Balaji Rama and Shiying He and Jingyi Ni	Zenodo	2024	DOI/URL
38	`GoogleMariner2025`	Project Mariner	Google DeepMind		2025
39	`yang2025agenticwebweavingweb`	Agentic Web: Weaving the Next Web with AI Agents	Yingxuan Yang and Mulei Ma and Yuxuan Huang and Huacan Chai and Chenyu Gong and Haoran Geng and Yuanjian Zhou and Ying Wen and Meng Fang and Muhao Chen and Shangding Gu and Ming Jin and Costas Spanos and Yang Yang and Pieter Abbeel and Dawn Song and Weinan Zhang and Jun Wang		2025	DOI/URL
40	`Fellou2025`	Fellou: Agentic Web Browser	Fellou AI		2025
41	`OperaNeon2025`	Opera Neon	Opera Software AS		2025
42	`CopilotAgent2025`	GitHub Copilot	GitHub Copilot		2025
43	`AmazonQDeveloper2025`	Amazon Q Developer Elevates the IDE Experience with New Agentic Coding Experience	Elizabeth Fuentes		2025
44	`AutoGen04_2025`	AutoGen v0.4: Reimagining the Foundation of Agentic AI for Scale, Extensibility, and Robustness	Adam Fourney and Ahmed Awadallah and Cheng Tan and Erkang Zhu and Friederike Niedtner and Gagan Bansal et al.		2025
45	`zhou2024webarenarealisticwebenvironment`	WebArena: A Realistic Web Environment for Building Autonomous Agents	Shuyan Zhou and Frank F. Xu and Hao Zhu and Xuhui Zhou and Robert Lo and Abishek Sridhar and Xianyi Cheng and Tianyue Ou and Yonatan Bisk and Daniel Fried and Uri Alon and Graham Neubig		2024	DOI/URL
46	`huang2023benchmarking`	Benchmarking large language models as ai research agents	Huang, Qian and Vora, Jian and Liang, Percy and Leskovec, Jure	NeurIPS 2023 Foundation Models for Decision Making Workshop	2023
47	`huang2023mlagentbench`	Mlagentbench: Evaluating language agents on machine learning experimentation	Huang, Qian and Vora, Jian and Liang, Percy and Leskovec, Jure	arXiv preprint arXiv:2310.03302	2023
48	`martinez2025dissecting`	Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM-and Agent-Based Repair Systems	Martinez, Matias and Franch, Xavier	arXiv preprint arXiv:2506.17208	2025
49	`wang2024mobileagentv2mobiledeviceoperation`	Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration	Junyang Wang and Haiyang Xu and Haitao Jia and Xi Zhang and Ming Yan and Weizhou Shen and Ji Zhang and Fei Huang and Jitao Sang		2024	DOI/URL
50	`chen2025spabench`	SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation	Jingxuan Chen and Derek Yuen and Bin Xie and Yuhao Yang and Gongwei Chen and Zhihao Wu and Li Yixing and Xurui Zhou and Weiwen Liu and Shuai Wang and Kaiwen Zhou and Rui Shao and Liqiang Nie and Yasheng Wang and Jianye HAO and Jun Wang and Kun Shao	The Thirteenth International Conference on Learning Representations	2025
51	`chollet2024arc`	Arc prize 2024: Technical report	Chollet, Francois and Knoop, Mike and Kamradt, Gregory and Landers, Bryan	arXiv preprint arXiv:2412.04604	2024
52	`chang2024agentboard`	Agentboard: An analytical evaluation board of multi-turn llm agents	Chang, Ma and Zhang, Junlei and Zhu, Zhihao and Yang, Cheng and Yang, Yujiu and Jin, Yaohui and Lan, Zhenzhong and Kong, Lingpeng and He, Junxian	Advances in neural information processing systems	2024
53	`talmor2018commonsenseqa`	Commonsenseqa: A question answering challenge targeting commonsense knowledge	Talmor, Alon and Herzig, Jonathan and Lourie, Nicholas and Berant, Jonathan	arXiv preprint arXiv:1811.00937	2018
54	`casper2025aiagentindex`	The AI Agent Index	Stephen Casper and Luke Bailey and Rosco Hunter and Carson Ezell and Emma Cabalé and Michael Gerovitch and Stewart Slocum and Kevin Wei and Nikola Jurkovic and Ariba Khan and Phillip J. K. Christoffersen and A. Pinar Ozisik and Rakshit Trivedi and Dylan Hadfield-Menell and Noam Kolt		2025	DOI/URL
55	`srivastava2023beyond`	Beyond the imitation game: Quantifying and extrapolating the capabilities of language models	Srivastava, Aarohi and Rastogi, Abhinav and Rao, Abhishek and Shoeb, Abu Awal and Abid, Abubakar and Fisch, Adam and Brown, Adam R and Santoro, Adam and Gupta, Aditya and Garriga-Alonso, Adri and others	Transactions on machine learning research	2023
56	`zaharia2018accelerating`	Accelerating the machine learning lifecycle with MLflow.	Zaharia, Matei and Chen, Andrew and Davidson, Aaron and Ghodsi, Ali and Hong, Sue Ann and Konwinski, Andy and Murching, Siddharth and Nykodym, Tomas and Ogilvie, Paul and Parkhe, Mani and others	IEEE Data Eng. Bull.	2018
57	`merkel2014docker`	Docker: lightweight linux containers for consistent development and deployment	Merkel, Dirk and others	Linux Journal	2014
58	`borenstein2021introduction`	Introduction to meta-analysis	Borenstein, Michael and Hedges, Larry V and Higgins, Julian PT and Rothstein, Hannah R	John Wiley & Sons	2021
59	`W3C2013PROVOverview`	PROV-Overview: An Overview of the PROV Family of Documents	W3C Provenance Working Group		2013
60	`gebru2021datasheets`	Datasheets for datasets	Gebru, Timnit and Morgenstern, Jamie and Vecchione, Briana and Vaughan, Jennifer Wortman and Wallach, Hanna and Iii, Hal Daum{\'e	Communications of the ACM	2021
61	`imo`	Official Website	International Mathematical Olympiad		n.d.
62	`livecodebench_datasets`	LiveCodeBench datasets - code_generation_lite, execution‑v2, test_generation, …	LiveCodeBench		n.d.
63	`park2023generative`	Generative agents: Interactive simulacra of human behavior	Park, Joon Sung and O'Brien, Joseph and Cai, Carrie Jun and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S	Proceedings of the 36th annual acm symposium on user interface software and technology	2023
64	`hong2023metagpt`	MetaGPT: Meta programming for a multi-agent collaborative framework	Hong, Sirui and Zhuge, Mingchen and Chen, Jonathan and Zheng, Xiawu and Cheng, Yuheng and Wang, Jinlin and Zhang, Ceyao and Wang, Zili and Yau, Steven Ka Shing and Lin, Zijuan and others	The Twelfth International Conference on Learning Representations	2023
65	`wu2023visual`	Visual chatgpt: Talking, drawing and editing with visual foundation models	Wu, Chenfei and Yin, Shengming and Qi, Weizhen and Wang, Xiaodong and Tang, Zecheng and Duan, Nan	arXiv preprint arXiv:2303.04671	2023
66	`li2023camel`	Camel: Communicative agents for "mind" exploration of large language model society	Li, Guohao and Hammoud, Hasan and Itani, Hani and Khizbullin, Dmitrii and Ghanem, Bernard	Advances in Neural Information Processing Systems	2023
67	`wang2023voyager`	Voyager: An open-ended embodied agent with large language models	Wang, Guanzhi and Xie, Yuqi and Jiang, Yunfan and Mandlekar, Ajay and Xiao, Chaowei and Zhu, Yuke and Fan, Linxi and Anandkumar, Anima	arXiv preprint arXiv:2305.16291	2023
68	`madaan2023self`	Self-refine: Iterative refinement with self-feedback	Madaan, Aman and Tandon, Niket and Gupta, Prakhar and Hallinan, Skyler and Gao, Luyu and Wiegreffe, Sarah and Alon, Uri and Dziri, Nouha and Prabhumoye, Shrimai and Yang, Yiming and others	Advances in Neural Information Processing Systems	2023
69	`li2023api`	Api-bank: A comprehensive benchmark for tool-augmented llms	Li, Minghao and Zhao, Yingxiu and Yu, Bowen and Song, Feifan and Li, Hangyu and Yu, Haiyang and Li, Zhoujun and Huang, Fei and Li, Yongbin	arXiv preprint arXiv:2304.08244	2023
70	`patil2024gorilla`	Gorilla: Large language model connected with massive apis	Patil, Shishir G and Zhang, Tianjun and Wang, Xin and Gonzalez, Joseph E	Advances in Neural Information Processing Systems	2024
71	`suris2023vipergpt`	Vipergpt: Visual inference via python execution for reasoning	Sur{\'\i	Proceedings of the IEEE/CVF international conference on computer vision	2023
72	`ahn2022can`	Do as i can, not as i say: Grounding language in robotic affordances	Ahn, Michael and Brohan, Anthony and Brown, Noah and Chebotar, Yevgen and Cortes, Omar and David, Byron and Finn, Chelsea and Fu, Chuyuan and Gopalakrishnan, Keerthana and Hausman, Karol and others	arXiv preprint arXiv:2204.01691	2022
73	`shinn2023reflexion`	Reflexion: Language agents with verbal reinforcement learning	Shinn, Noah and Cassano, Federico and Gopinath, Ashwin and Narasimhan, Karthik and Yao, Shunyu	Advances in Neural Information Processing Systems	2023
74	`shen2023hugginggptsolvingaitasks`	HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face	Yongliang Shen and Kaitao Song and Xu Tan and Dongsheng Li and Weiming Lu and Yueting Zhuang		2023	DOI/URL
75	`jimenez2023swe`	Swe-bench: Can language models resolve real-world github issues?	Jimenez, Carlos E and Yang, John and Wettig, Alexander and Yao, Shunyu and Pei, Kexin and Press, Ofir and Narasimhan, Karthik	arXiv preprint arXiv:2310.06770	2023
76	`chen2021evaluating`	Evaluating large language models trained on code	Chen, Mark and Tworek, Jerry and Jun, Heewoo and Yuan, Qiming and Pinto, Henrique Ponde De Oliveira and Kaplan, Jared and Edwards, Harri and Burda, Yuri and Joseph, Nicholas and Brockman, Greg and others	arXiv preprint arXiv:2107.03374	2021
77	`hendrycks2020measuring`	Measuring massive multitask language understanding	Hendrycks, Dan and Burns, Collin and Basart, Steven and Zou, Andy and Mazeika, Mantas and Song, Dawn and Steinhardt, Jacob	arXiv preprint arXiv:2009.03300	2020
78	`chollet2024abstraction`	Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI)	Chollet, François		2024
79	`codeforces`	Competitive Programming Platform	Codeforces		n.d.
80	`rein2024gpqa`	Gpqa: A graduate-level google-proof q&a benchmark	Rein, David and Hou, Betty Li and Stickland, Asa Cooper and Petty, Jackson and Pang, Richard Yuanzhe and Dirani, Julien and Michael, Julian and Bowman, Samuel R	First Conference on Language Modeling	2024
81	`hendrycks2021measuring`	Measuring mathematical problem solving with the math dataset	Hendrycks, Dan and Burns, Collin and Kadavath, Saurav and Arora, Akul and Basart, Steven and Tang, Eric and Song, Dawn and Steinhardt, Jacob	arXiv preprint arXiv:2103.03874	2021
82	`maa_aime`	American Invitational Mathematics Examination (AIME)	Mathematical Association of America
83	`cobbe2021training`	Training verifiers to solve math word problems	Cobbe, Karl and Kosaraju, Vineet and Bavarian, Mohammad and Chen, Mark and Jun, Heewoo and Kaiser, Lukasz and Plappert, Matthias and Tworek, Jerry and Hilton, Jacob and Nakano, Reiichiro and others	arXiv preprint arXiv:2110.14168	2021
84	`shao2024deepseekmath`	Deepseekmath: Pushing the limits of mathematical reasoning in open language models	Shao, Zhihong and Wang, Peiyi and Zhu, Qihao and Xu, Runxin and Song, Junxiao and Bi, Xiao and Zhang, Haowei and Zhang, Mingchuan and Li, YK and Wu, Yang and others	arXiv preprint arXiv:2402.03300	2024
85	`he2025skywork`	Skywork open reasoner 1 technical report	He, Jujie and Liu, Jiacai and Liu, Chris Yuhao and Yan, Rui and Wang, Chaojie and Cheng, Peng and Zhang, Xiaoyu and Zhang, Fuxiang and Xu, Jiacheng and Shen, Wei and others	arXiv preprint arXiv:2505.22312	2025
86	`bai2023qwen`	Qwen technical report	Bai, Jinze and Bai, Shuai and Chu, Yunfei and Cui, Zeyu and Dang, Kai and Deng, Xiaodong and Fan, Yang and Ge, Wenbin and Han, Yu and Huang, Fei and others	arXiv preprint arXiv:2309.16609	2023
87	`LuongLockhart2025GeminiIMO`	Advanced version of Gemini with Deep Think officially achieves gold‑medal standard at the International Mathematical Olympiad	Thang Luong and Edward Lockhart		2025
88	`comanici2025gemini`	Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities	Comanici, Gheorghe and Bieber, Eric and Schaekermann, Mike and Pasupat, Ice and Sachdeva, Noveen and Dhillon, Inderjit and Blistein, Marcel and Ram, Ori and Zhang, Dan and Rosen, Evan and others	arXiv preprint arXiv:2507.06261	2025
89	`zhang2024naturalcodebench`	Naturalcodebench: Examining coding performance mismatch on humaneval and natural user queries	Zhang, Shudan and Zhao, Hanlin and Liu, Xiao and Zheng, Qinkai and Qi, Zehan and Gu, Xiaotao and Dong, Yuxiao and Tang, Jie	Findings of the Association for Computational Linguistics ACL 2024	2024
90	`bai2022constitutionalaiharmlessnessai`	Constitutional AI: Harmlessness from AI Feedback	Yuntao Bai and Saurav Kadavath et al.		2022	DOI/URL
91	`liang2023holistic`	Holistic Evaluation of Language Models	Percy Liang and Rishi Bommasani et al.	Transactions on Machine Learning Research	2023	DOI/URL
92	`borghoff2025human`	Human-artificial interaction in the age of agentic AI: a system-theoretical approach	Borghoff, Uwe M and Bottoni, Paolo and Pareschi, Remo	Frontiers in Human Dynamics	2025
93	`team2024gemma`	Gemma: Open models based on gemini research and technology	Team, Gemma and Mesnard, Thomas and Hardin, Cassidy and Dadashi, Robert and Bhupatiraju, Surya and Pathak, Shreya and Sifre, Laurent and Rivi{\`e	arXiv preprint arXiv:2403.08295	2024
94	`GoogleGeminiModels2025`	Gemini Models \textbar{			2025
95	`anil2023palm`	Palm 2 technical report	Anil, Rohan and Dai, Andrew M and Firat, Orhan and Johnson, Melvin and Lepikhin, Dmitry and Passos, Alexandre and Shakeri, Siamak and Taropa, Emanuel and Bailey, Paige and Chen, Zhifeng and others	arXiv preprint arXiv:2305.10403	2023
96	`OpenAI2025`	OpenAI			2015–2025
97	`yehudai2025survey`	Survey on evaluation of llm-based agents	Yehudai, Asaf and Eden, Lilach and Li, Alan and Uziel, Guy and Zhao, Yilun and Bar-Haim, Roy and Cohan, Arman and Shmueli-Scheuer, Michal	arXiv preprint arXiv:2503.16416	2025
98	`wang2025survey`	A survey on responsible llms: Inherent risk, malicious use, and mitigation strategy	Wang, Huandong and Fu, Wenjie and Tang, Yingzhou and Chen, Zhilong and Huang, Yuxi and Piao, Jinghua and Gao, Chen and Xu, Fengli and Jiang, Tao and Li, Yong	arXiv preprint arXiv:2501.09431	2025
99	`chu2024fairness`	Fairness in large language models: A taxonomic survey	Chu, Zhibo and Wang, Zichong and Zhang, Wenbin	ACM SIGKDD explorations newsletter	2024
100	`plaat2024reasoning`	Reasoning with large language models, a survey	Plaat, Aske and Wong, Annie and Verberne, Suzan and Broekens, Joost and van Stein, Niki and Back, Thomas	arXiv preprint arXiv:2407.11511	2024
101	`plaat2025agentic`	Agentic large language models, a survey	Plaat, Aske and van Duijn, Max and van Stein, Niki and Preuss, Mike and van der Putten, Peter and Batenburg, Kees Joost	arXiv preprint arXiv:2503.23037	2025
102	`rao1995bdi`	BDI agents: From theory to practice.	Rao, Anand S and Georgeff, Michael P and others	Icmas	1995
103	`chu2023navigate`	Navigate through enigmatic labyrinth a survey of chain of thought reasoning: Advances, frontiers and future	Chu, Zheng and Chen, Jingchang and Chen, Qianglong and Yu, Weijiang and He, Tao and Wang, Haotian and Peng, Weihua and Liu, Ming and Qin, Bing and Liu, Ting	arXiv preprint arXiv:2309.15402	2023
104	`Wooldridge_Jennings_1995`	Intelligent agents: theory and practice	Wooldridge, Michael and Jennings, Nicholas R.	The Knowledge Engineering Review	1995	DOI/URL
105	`Wang_2024`	A survey on large language model based autonomous agents	Wang, Lei and Ma, Chen and Feng, Xueyang and Zhang, Zeyu and Yang, Hao and Zhang, Jingsen and Chen, Zhiyuan and Tang, Jiakai and Chen, Xu and Lin, Yankai and Zhao, Wayne Xin and Wei, Zhewei and Wen, Jirong	Frontiers of Computer Science	2024	DOI/URL
106	`raza2025responsible`	Who is Responsible? The Data, Models, Users or Regulations? A Comprehensive Survey on Responsible Generative AI for a Sustainable Future	Raza, Shaina and Qureshi, Rizwan and Zahid, Anam and Fioresi, Joseph and Sadak, Ferhat and Saeed, Muhammad and Sapkota, Ranjan and Jain, Aditya and Zafar, Anas and Hassan, Muneeb Ul and others	arXiv preprint arXiv:2502.08650	2025
107	`sapkota2025ai`	Ai agents vs. agentic ai: A conceptual taxonomy, applications and challenge	Sapkota, Ranjan and Roumeliotis, Konstantinos I and Karkee, Manoj	arXiv preprint arXiv:2505.10468	2025
108	`raza2025fairsense`	FairSense-AI: Responsible AI Meets Sustainability	Raza, Shaina and Chettiar, Mukund Sayeeganesh and Yousefabadi, Matin and Khan, Tahniat and Lotif, Marcelo	arXiv preprint arXiv:2503.02865	2025
109	`song2024audit`	Audit-llm: Multi-agent collaboration for log-based insider threat detection	Song, Chengyu and Ma, Linru and Zheng, Jianming and Liao, Jinzhi and Kuang, Hongyu and Yang, Lin	arXiv preprint arXiv:2408.08902	2024
110	`green2025leaky`	Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers	Green, Tommaso and Gubri, Martin and Puerto, Haritz and Yun, Sangdoo and Oh, Seong Joon	arXiv preprint arXiv:2506.15674	2025
111	`yao2023reactsynergizingreasoningacting`	ReAct: Synergizing Reasoning and Acting in Language Models	Shunyu Yao and Jeffrey Zhao and Dian Yu and Nan Du and Izhak Shafran and Karthik Narasimhan and Yuan Cao		2023	DOI/URL
112	`openai2019gpt2`	Better Language Models and Their Implications	OpenAI		2019
113	`google2024gemini15pro`	Get more done with Gemini: Try 1.5 Pro and more intelligent features	Google		2024
114	`meta2025llama4`	The Llama 4 herd: The beginning of a new era of natively multimodal intelligence	Meta AI		2025
115	`xai2025grok3`	Grok 3 Beta - The Age of Reasoning Agents	xAI		2025
116	`ibm2025granite33`	IBM Granite 3.3: Speech recognition, refined reasoning, and RAG LoRAs	IBM		2025
117	`ibm2025granitedocs`	Granite 3.3 Models - Documentation	IBM		2025
118	`baidu2025ernie45blog`	Announcing the Open Source Release of the ERNIE 4.5 Model Family	ERNIE Team		2025
119	`baidu2025ernie45report`	ERNIE 4.5 Technical Report	ERNIE Team		2025	DOI/URL
120	`pan2024webcanvasbenchmarkingwebagents`	WebCanvas: Benchmarking Web Agents in Online Environments	Yichen Pan and Dehan Kong and Sida Zhou and Cheng Cui and Yifei Leng and Bing Jiang and Hangyu Liu and Yanyi Shang and Shuyan Zhou and Tongshuang Wu and Zhengyang Wu		2024	DOI/URL
121	`yoran2024assistantbench`	Assistantbench: Can web agents solve realistic and time-consuming tasks?	Yoran, Ori and Amouyal, Samuel Joseph and Malaviya, Chaitanya and Bogin, Ben and Press, Ofir and Berant, Jonathan	arXiv preprint arXiv:2407.15711	2024
122	`BEARCUBS2025`	BEARCUBS: A benchmark for computer-using web agents	Song, Yixiao and Thai, Katherine and Pham, Chau Minh and Chang, Yapei and Nadaf, Mazin and Iyyer, Mohit	arXiv preprint arXiv:2503.07919	2025	DOI/URL
123	`mistral2025medium3`	Medium is the new large. (Mistral Medium 3)	Mistral AI		2025
124	`mistral2025magistral`	Magistral: Reasoning Model Family	Mistral AI		2025
125	`anthropic2024claude3`	Introducing the next generation of Claude (Claude 3 family)	Anthropic		2024
126	`meta2024llama31`	Introducing Llama 3.1: Our most capable models to date	Meta AI		2024
127	`openai2025o3o4mini`	Introducing OpenAI o3 and o4-mini	OpenAI		2025
128	`brown2020language`	Language models are few-shot learners	Brown, Tom and Mann, Benjamin and Ryder, Nick and Subbiah, Melanie and Kaplan, Jared D and Dhariwal, Prafulla and Neelakantan, Arvind and Shyam, Pranav and Sastry, Girish and Askell, Amanda and others	Advances in neural information processing systems	2020
129	`chen2021codex`	Evaluating Large Language Models Trained on Code	Mark Chen and Jerry Tworek and Heewoo Jun et al.	arXiv preprint arXiv:2107.03374	2021	DOI/URL
130	`raza2025vldbench`	VLDBench Evaluating Multimodal Disinformation with Regulatory Alignment	Raza, Shaina and Vayani, Ashmal and Jain, Aditya and Narayanan, Aravind and Khazaie, Vahid Reza and Bashir, Syed Raza and Dolatabadi, Elham and Uddin, Gias and Emmanouilidis, Christos and Qureshi, Rizwan and others	arXiv preprint arXiv:2502.11361	2025
131	`ouyang2022training`	Training language models to follow instructions with human feedback	Ouyang, Long and Wu, Jeffrey and Jiang, Xu and Almeida, Diogo and Wainwright, Carroll and Mishkin, Pamela and Zhang, Chong and Agarwal, Sandhini and Slama, Katarina and Ray, Alex and others	Advances in neural information processing systems	2022
132	`wang2024rethinking`	Rethinking the bounds of llm reasoning: Are multi-agent discussions the key?	Wang, Qineng and Wang, Zihao and Su, Ying and Tong, Hanghang and Song, Yangqiu	arXiv preprint arXiv:2402.18272	2024
133	`yao2023tree`	Tree of thoughts: Deliberate problem solving with large language models	Yao, Shunyu and Yu, Dian and Zhao, Jeffrey and Shafran, Izhak and Griffiths, Tom and Cao, Yuan and Narasimhan, Karthik	Advances in neural information processing systems	2023
134	`wang2023plan`	Plan-and-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models	Wang, Lei and Xu, Wanyu and Lan, Yihuai and Hu, Zhiqiang and Lan, Yunshi and Lee, Roy Ka-Wei and Lim, Ee-Peng	arXiv preprint arXiv:2305.04091	2023
135	`ejjami2024ethical`	Ethical artificial intelligence framework theory (EAIFT): a new paradigm for embedding ethical reasoning in AI systems	Ejjami, Rachid	Int J Multidiscip Res	2024
136	`schick2023toolformer`	Toolformer: Language models can teach themselves to use tools	Schick, Timo and Dwivedi-Yu, Jane and Dess{\`\i	Advances in Neural Information Processing Systems	2023
137	`raza2025trismagenticaireview`	TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems	Shaina Raza and Ranjan Sapkota and Manoj Karkee and Christos Emmanouilidis		2025	DOI/URL
138	`zhang2025litewebagentopensourcesuitevlmbased`	LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications	Danqing Zhang and Balaji Rama and Jingyi Ni and Shiying He and Fu Zhao and Kunyu Chen and Arnold Chen and Junyu Cao		2025	DOI/URL
139	`SAPKOTA2026103575`	Object detection with multimodal large vision-language models: An in-depth review	Ranjan Sapkota and Manoj Karkee	Information Fusion	2026	DOI/URL
140	`Huq_2025`	CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation	Huq, Faria and Wang, Zora Zhiruo and Xu, Frank F. and Ou, Tianyue and Zhou, Shuyan and Bigham, Jeffrey P. and Neubig, Graham	Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (System Demonstrations)	2025	DOI/URL
141	`dunnell2024bioticbrowserapplyingstreamingllm`	Biotic Browser: Applying StreamingLLM as a Persistent Web Browsing Co-Pilot	Kevin F. Dunnell and Andrew P. Stoddard		2024	DOI/URL
142	`desai2025responsibleaiagents`	Responsible AI Agents	Deven R. Desai and Mark O. Riedl		2025	DOI/URL
143	`wu2025llm`	Llm fine-tuning: Concepts, opportunities, and challenges	Wu, Xiao-Kun and Chen, Min and Li, Wanyi and Wang, Rui and Lu, Limeng and Liu, Jia and Hwang, Kai and Hao, Yixue and Pan, Yanru and Meng, Qingguo and others	Big Data and Cognitive Computing	2025
144	`jin2024impact`	The impact of reasoning step length on large language models	Jin, Mingyu and Yu, Qinkai and Shu, Dong and Zhao, Haiyan and Hua, Wenyue and Meng, Yanda and Zhang, Yongfeng and Du, Mengnan	arXiv preprint arXiv:2401.04925	2024
145	`patil2025advancing`	Advancing reasoning in large language models: Promising methods and approaches	Patil, Avinash and Jadon, Aryan	arXiv preprint arXiv:2502.03671	2025
146	`bonagiri2025towards`	Towards Trustworthy AI: Frameworks for Evaluating Consistency in Language Models	Bonagiri, Vamshi Krishna		2025
147	`liang2025ai`	AI Reasoning in Deep Learning Era: From Symbolic AI to Neural–Symbolic AI	Liang, Baoyu and Wang, Yuchen and Tong, Chao	Mathematics	2025
148	`al2025building`	Building Trustworthy AI: Transparent AI Systems via Language Models, Ontologies, and Logical Reasoning	Al Machot, Fadi and Horsch, Martin Thomas and Ullah, Habib	Designing the Conceptual Landscape for a XAIR Validation Infrastructure: Proceedings of the International Workshop on Designing the Conceptual Landscape for a XAIR Validation Infrastructure, DCLXVI 2024, Kaiserslautern, Germany	2025
149	`wu2024usable`	Usable XAI: 10 strategies towards exploiting explainability in the LLM era	Wu, Xuansheng and Zhao, Haiyan and Zhu, Yaochen and Shi, Yucheng and Yang, Fan and Hu, Lijie and Liu, Tianming and Zhai, Xiaoming and Yao, Wenlin and Li, Jundong and others	arXiv preprint arXiv:2403.08946	2024
150	`wu2025does`	Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning	Wu, Xuyang and Nian, Jinming and Wei, Ting-Ruen and Tao, Zhiqiang and Wu, Hsin-Tai and Fang, Yi	arXiv preprint arXiv:2502.15361	2025
151	`fan2025biasguard`	Biasguard: A reasoning-enhanced bias detection tool for large language models	Fan, Zhiting and Chen, Ruizhe and Liu, Zuozhu	arXiv preprint arXiv:2504.21299	2025
152	`zhang2025collaborative`	Collaborative LLM Numerical Reasoning with Local Data Protection	Zhang, Min and Lu, Yuzhe and Zhou, Yun and Xu, Panpan and Cheong, Lin Lee and Lu, Chang-Tien and Wang, Haozhu	arXiv preprint arXiv:2504.00299	2025
153	`tavasoli2025responsible`	Responsible innovation: A strategic framework for financial LLM integration	Tavasoli, Ahmadreza and Sharbaf, Maedeh and Madani, Seyed Mohamad	arXiv preprint arXiv:2504.02165	2025
154	`ferdaus2024towards`	Towards trustworthy ai: A review of ethical and robust large language models	Ferdaus, Md Meftahul and Abdelguerfi, Mahdi and Ioup, Elias and Niles, Kendall N and Pathak, Ken and Sloan, Steven	arXiv preprint arXiv:2407.13934	2024
155	`chen2024trustworthy`	Trustworthy, responsible, and safe ai: A comprehensive architectural framework for ai safety with challenges and mitigations	Chen, Chen and Gong, Xueluan and Liu, Ziyao and Jiang, Weifeng and Goh, Si Qi and Lam, Kwok-Yan	arXiv preprint arXiv:2408.12935	2024
156	`shi2024large`	Large language model safety: A holistic survey	Shi, Dan and Shen, Tianhao and Huang, Yufei and Li, Zhigen and Leng, Yongqi and Jin, Renren and Liu, Chuang and Wu, Xinwei and Guo, Zishan and Yu, Linhao and others	arXiv preprint arXiv:2412.17686	2024
157	`zheng2025beyond`	Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models	Zheng, Baihui and Zheng, Boren and Cao, Kerui and Tan, Yingshui and Liu, Zhendong and Wang, Weixun and Liu, Jiaheng and Yang, Jian and Su, Wenbo and Zhu, Xiaoyong and others	arXiv preprint arXiv:2505.19690	2025
158	`goh2024large`	Large language model influence on diagnostic reasoning: a randomized clinical trial	Goh, Ethan and Gallo, Robert and Hom, Jason and Strong, Eric and Weng, Yingjie and Kerman, Hannah and Cool, Jos{\'e	JAMA network open	2024
159	`lucas2024reasoning`	Reasoning with large language models for medical question answering	Lucas, Mary M and Yang, Justin and Pomeroy, Jon K and Yang, Christopher C	Journal of the American Medical Informatics Association	2024
160	`guha2023legalbench`	Legalbench: A collaboratively built benchmark for measuring legal reasoning in large language models	Guha, Neel and Nyarko, Julian and Ho, Daniel and R{\'e	Advances in neural information processing systems	2023
161	`shu2024lawllm`	LawLLM: Law large language model for the US legal system	Shu, Dong and Zhao, Haoran and Liu, Xukun and Demeter, David and Du, Mengnan and Zhang, Yongfeng	Proceedings of the 33rd ACM International Conference on information and knowledge management	2024
162	`liu2025fin`	Fin-r1: A large language model for financial reasoning through reinforcement learning	Liu, Zhaowei and Guo, Xin and Lou, Fangqi and Zeng, Lingfeng and Niu, Jinyi and Wang, Zixuan and Xu, Jiajie and Cai, Weige and Yang, Ziwei and Zhao, Xueqian and others	arXiv preprint arXiv:2503.16252	2025
163	`son2023beyond`	Beyond classification: Financial reasoning in state-of-the-art language models	Son, Guijin and Jung, Hanearl and Hahm, Moonjeong and Na, Keonju and Jin, Sol	arXiv preprint arXiv:2305.01505	2023
164	`yuan2024finllms`	Finllms: A framework for financial reasoning dataset generation with large language models	Yuan, Ziqiang and Wang, Kaiyuan and Zhu, Shoutai and Yuan, Ye and Zhou, Jingya and Zhu, Yanlin and Wei, Wenqi	IEEE Transactions on Big Data	2024
165	`beltagy2019scibert`	SciBERT: A pretrained language model for scientific text	Beltagy, Iz and Lo, Kyle and Cohan, Arman	arXiv preprint arXiv:1903.10676	2019
166	`taylor2022galactica`	Galactica: A large language model for science	Taylor, Ross and Kardas, Marcin and Cucurull, Guillem and Scialom, Thomas and Hartshorn, Anthony and Saravia, Elvis and Poulton, Andrew and Kerkez, Viktor and Stojnic, Robert	arXiv preprint arXiv:2211.09085	2022
167	`raza2025developing`	Developing safe and responsible large language model: can we balance bias reduction and language understanding?	Raza, Shaina and Bamgbose, Oluwanifemi and Ghuge, Shardul and Tavakoli, Fatemeh and Reji, Deepak John and Bashir, Syed Raza	Machine Learning	2025
168	`besta2025reasoning`	Reasoning language models: A blueprint	Besta, Maciej and Barth, Julia and Schreiber, Eric and Kubicek, Ales and Catarino, Afonso and Gerstenberger, Robert and Nyczyk, Piotr and Iff, Patrick and Li, Yueling and Houliston, Sam and others	arXiv preprint arXiv:2501.11223	2025
169	`lomonaco2019continual`	Continual learning with deep architectures	Lomonaco, Vincenzo	alma	2019
170	`hitzler2022neuro`	Neuro-symbolic artificial intelligence: The state of the art	Hitzler, Pascal and Sarker, Md Kamruzzaman	IOS press	2022
171	`FERPA1974`	Family Educational Rights and Privacy Act of 1974 (FERPA)	U.S. Congress		1974	DOI/URL
172	`iso42001`	ISO/IEC 42001:2023 - Artificial Intelligence Management System (AI MS) - Requirements	International Organization for Standardization		2023
173	`MiFIDII2014`	Directive 2014/65/EU	European Parliament and Council of the European Union		2014	DOI/URL
174	`hipaa164`	HIPAA Privacy Rule - 45 CFR Part 164: Security and Privacy Protections for Health Information	U.S. Department of Health and Human Services		2003
175	`gdpr25`	General Data Protection Regulation (GDPR) - Article 25: Data protection by design and by default	European Union		2016
176	`slattery2024ai`	The ai risk repository: A comprehensive meta-review, database, and taxonomy of risks from artificial intelligence	Slattery, Peter and Saeri, Alexander K and Grundy, Emily AC and Graham, Jess and Noetel, Michael and Uuk, Risto and Dao, James and Pour, Soroush and Casper, Stephen and Thompson, Neil	arXiv preprint arXiv:2408.12622	2024
177	`sakib2024risks`	Risks, causes, and mitigations of widespread deployments of large language models (llms): A survey	Sakib, Md Nazmus and Islam, Md Athikul and Pathak, Royal and Arifin, Md Mashrur	2024 2nd International Conference on Artificial Intelligence, Blockchain, and Internet of Things (AIBThings)	2024
178	`zhao2024explainability`	Explainability for large language models: A survey	Zhao, Haiyan and Chen, Hanjie and Yang, Fan and Liu, Ninghao and Deng, Huiqi and Cai, Hengyi and Wang, Shuaiqiang and Yin, Dawei and Du, Mengnan	ACM Transactions on Intelligent Systems and Technology	2024
179	`jha2022responsible`	Responsible reasoning with large language models and the impact of proper nouns	Jha, Sumit Kumar and Ewetz, Rickard and Velasquez, Alvaro and Jha, Susmit	Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS 2022	2022
180	`park2023generativeagents`	Generative Agents: Interactive Simulacra of Human Behavior	Park, Joon Sung and O'Brien, Joseph C. and Cai, Carrie J. and Morris, Meredith Ringel and Liang, Percy and Bernstein, Michael S.	Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST '23)	2023	DOI/URL
181	`lewis2020rag`	Retrieval-Augmented Generation for Knowledge-Intensive NLP	Lewis, Patrick and Perez, Ethan and Piktus, Aleksandra and Petroni, Fabio and Karpukhin, Vladimir and Goyal, Naman and K{\"u	Advances in Neural Information Processing Systems 33 (NeurIPS 2020)	2020	DOI/URL
182	`packer2023memgpt`	MemGPT: Towards LLMs as Operating Systems	Packer, Charles and Fang, Vivian and Patil, Shishir G. and Lin, Kevin and Wooders, Sarah and Gonzalez, Joseph E.	arXiv	2023
183	`google2025_gemini25_deepthink_modelcard`	Gemini 2.5 Deep Think Model Card	Google DeepMind		2025
184	`anthropic2025_claude37_sonnet`	Claude 3.7 Sonnet (Hybrid Reasoning Model) Announcement and System Card	Anthropic		2025
185	`he2025_skywork_or1`	Skywork Open Reasoner 1 (Skywork-OR1): A Scalable RL Framework for Long Chain-of-Thought Reasoning	He, Jujie and Liu, Jiacai and Liu, Chris Yuhao and Yan, Rui and Wang, Chaojie and Cheng, Peng and Zhang, Xiaoyu and Zhang, Fuxiang and Xu, Jiacheng and Shen, Wei and Li, Siyuan and Zeng, Liang and Wei, Tianwen and Cheng, Cheng and An, Bo and Liu, Yang and Zhou, Yahui	arXiv preprint arXiv:2505.22312	2025
186	`alibaba2025_qwq32b`	Alibaba Cloud Unveils QwQ-32B: A Compact Reasoning Model with Cutting-Edge Performance	Alibaba Cloud Qwen Team		2025
187	`anthropic2024_claude35_sonnet`	Introducing Claude 3.5 Sonnet	Anthropic		2024
188	`anthropic_claude4_systemcard`	Claude Opus 4 & Claude Sonnet 4 System Card	Anthropic		2025
189	`Guardian_OpenAI_GPT5_2025`	OpenAI says latest ChatGPT	The Guardian	The Guardian	2025	DOI/URL
190	`OpenAI_GPT5_2025`	Introducing GPT-5	OpenAI		2025
191	`openai2025_gpt_oss_model_card`	gpt-oss-120b & gpt-oss-20b Model Card	OpenAI		2025	DOI/URL
192	`wang2022self`	Self-consistency improves chain of thought reasoning in language models	Wang, Xuezhi and Wei, Jason and Schuurmans, Dale and Le, Quoc and Chi, Ed and Narang, Sharan and Chowdhery, Aakanksha and Zhou, Denny	arXiv preprint arXiv:2203.11171	2022
193	`peter1997experiences`	Experiences with an architecture for intelligent, reactive agents	Bonasso, R. Peter and Firby, R. James and Gat, Erann and Kortenkamp, David and Miller, David P and Slack, Mark G	Journal of Experimental & Theoretical Artificial Intelligence	1997
194	`gat1998three`	On three-layer architectures	Gat, Erann and Bonasso, R Peter and Murphy, Robin and others	Artificial intelligence and mobile robots	1998
195	`brooks1991intelligence`	Intelligence without representation	Brooks, Rodney A	Artificial intelligence	1991
196	`brooks2003robust`	A robust layered control system for a mobile robot	Brooks, Rodney	IEEE journal on robotics and automation	1986
197	`karpukhin2020dpr`	Dense Passage Retrieval for Open-Domain Question Answering	Karpukhin, Vladimir and O{\u{g	Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)	2020	DOI/URL
198	`johnson2017faiss`	Billion-Scale Similarity Search with GPUs	Johnson, Jeff and Douze, Matthijs and J{\'e	IEEE Transactions on Big Data	2019	DOI/URL
199	`woodgate2024macro`	Macro ethics principles for responsible AI systems: Taxonomy and directions	Woodgate, Jessica and Ajmeri, Nirav	ACM Computing Surveys	2024
200	`devlin2019bert`	BERT: Pre-training of deep bidirectional transformers for language understanding	Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina	Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers)	2019
201	`wei2022chain`	Chain-of-thought prompting elicits reasoning in large language models	Wei, Jason and Wang, Xuezhi and Schuurmans, Dale and Bosma, Maarten and Xia, Fei and Chi, Ed and Le, Quoc V and Zhou, Denny and others	Advances in neural information processing systems	2022
202	`jaech2024openai`	OpenAI o1 system card	Jaech, Aaron and Kalai, Adam and Lerer, Adam and Richardson, Adam and El-Kishky, Ahmed and Low, Aiden and Helyar, Alec and Madry, Aleksander and Beutel, Alex and Carney, Alex and others	arXiv preprint arXiv:2412.16720	2024
203	`guo2025deepseek`	DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning	Guo, Daya and Yang, Dejian and Zhang, Haowei and Song, Junxiao and Zhang, Ruoyu and Xu, Runxin and Zhu, Qihao and Ma, Shirong and Wang, Peiyi and Bi, Xiao and others	arXiv preprint arXiv:2501.12948	2025
204	`zhang2022automatic`	Automatic chain of thought prompting in large language models	Zhang, Zhuosheng and Zhang, Aston and Li, Mu and Smola, Alex	arXiv preprint arXiv:2210.03493	2022
205	`lyu2023faithful`	Faithful chain-of-thought reasoning	Lyu, Qing and Havaldar, Shreya and Stein, Adam and Zhang, Li and Rao, Delip and Wong, Eric and Apidianaki, Marianna and Callison-Burch, Chris	The 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL 2023)	2023
206	`mokander2024auditing`	Auditing large language models: a three-layered approach	M{\"o	AI and Ethics	2024
207	`amirizaniani2024llmauditor`	LLMAuditor: A Framework for Auditing Large Language Models Using Human-in-the-Loop	Amirizaniani, Maryam and Yao, Jihan and Lavergne, Adrian and Okada, Elizabeth Snell and Chadha, Aman and Roosta, Tanya and Shah, Chirag	arXiv preprint arXiv:2402.09346	2024
208	`amirizaniani2024auditllm`	AuditLLM: A tool for auditing large language models using multiprobe approach	Amirizaniani, Maryam and Martin, Elias and Roosta, Tanya and Chadha, Aman and Shah, Chirag	Proceedings of the 33rd ACM International Conference on Information and Knowledge Management	2024
209	`paraschou2025mind`	Mind the XAI Gap: A Human-Centered LLM Framework for Democratizing Explainable AI	Paraschou, Eva and Arapakis, Ioannis and Yfantidou, Sofia and Macaluso, Sebastian and Vakali, Athena	arXiv preprint arXiv:2506.12240	2025
210	`ehsan2024human`	Human-centered explainable AI (HCXAI): Reloading explainability in the era of large language models (LLMs)	Ehsan, Upol and Watkins, Elizabeth A and Wintersberger, Philipp and Manger, Carina and Kim, Sunnie SY and Van Berkel, Niels and Riener, Andreas and Riedl, Mark O	Extended Abstracts of the CHI Conference on Human Factors in Computing Systems	2024
211	`yang2024human`	Human-centric autonomous systems with llms for user command reasoning	Yang, Yi and Zhang, Qingwen and Li, Ci and Marta, Daniel Sim{\~o	Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision	2024
212	`zhang2024llama`	Llama-berry: Pairwise optimization for o1-like olympiad-level mathematical reasoning	Zhang, Di and Wu, Jianbo and Lei, Jingdi and Che, Tong and Li, Jiatong and Xie, Tong and Huang, Xiaoshui and Zhang, Shufei and Pavone, Marco and Li, Yuqiang and others	arXiv preprint arXiv:2410.02884	2024
213	`zheng2024processbench`	Processbench: Identifying process errors in mathematical reasoning	Zheng, Chujie and Zhang, Zhenru and Zhang, Beichen and Lin, Runji and Lu, Keming and Yu, Bowen and Liu, Dayiheng and Zhou, Jingren and Lin, Junyang	arXiv preprint arXiv:2412.06559	2024
214	`browne2012survey`	A survey of monte carlo tree search methods	Browne, Cameron B and Powley, Edward and Whitehouse, Daniel and Lucas, Simon M and Cowling, Peter I and Rohlfshagen, Philipp and Tavener, Stephen and Perez, Diego and Samothrakis, Spyridon and Colton, Simon	IEEE Transactions on Computational Intelligence and AI in games	2012
215	`zhao2024expel`	Expel: Llm agents are experiential learners	Zhao, Andrew and Huang, Daniel and Xu, Quentin and Lin, Matthieu and Liu, Yong-Jin and Huang, Gao	Proceedings of the AAAI Conference on Artificial Intelligence	2024
216	`besta2024graph`	Graph of thoughts: Solving elaborate problems with large language models	Besta, Maciej and Blach, Nils and Kubicek, Ales and Gerstenberger, Robert and Podstawski, Michal and Gianinazzi, Lukas and Gajda, Joanna and Lehmann, Tomasz and Niewiadomski, Hubert and Nyczyk, Piotr and others	Proceedings of the AAAI conference on artificial intelligence	2024
217	`liu2024mathbench`	Mathbench: Evaluating the theory and application proficiency of llms with a hierarchical mathematics benchmark	Liu, Hongwei and Zheng, Zilong and Qiao, Yuxuan and Duan, Haodong and Fei, Zhiwei and Zhou, Fengzhe and Zhang, Wenwei and Zhang, Songyang and Lin, Dahua and Chen, Kai	arXiv preprint arXiv:2405.12209	2024
218	`wang2024rupbench`	Rupbench: Benchmarking reasoning under perturbations for robustness evaluation in large language models	Wang, Yuqing and Zhao, Yun	arXiv preprint arXiv:2406.11020	2024
219	`zeng2024mr`	Mr-ben: A meta-reasoning benchmark for evaluating system-2 thinking in llms	Zeng, Zhongshen and Liu, Yinhong and Wan, Yingjia and Li, Jingyao and Chen, Pengguang and Dai, Jianbo and Yao, Yuxuan and Xu, Rongwu and Qi, Zehan and Zhao, Wanru and others	Advances in Neural Information Processing Systems	2024
220	`estermann2024puzzles`	Puzzles: A benchmark for neural algorithmic reasoning	Estermann, Benjamin and Lanzend{\"o	Advances in Neural Information Processing Systems	2024
221	`wang2019superglue`	Superglue: A stickier benchmark for general-purpose language understanding systems	Wang, Alex and Pruksachatkun, Yada and Nangia, Nikita and Singh, Amanpreet and Michael, Julian and Hill, Felix and Levy, Omer and Bowman, Samuel	Advances in neural information processing systems	2019
222	`pan2023logic`	Logic-lm: Empowering large language models with symbolic solvers for faithful logical reasoning	Pan, Liangming and Albalak, Alon and Wang, Xinyi and Wang, William Yang	arXiv preprint arXiv:2305.12295	2023
223	`yu2024natural`	Natural language reasoning, a survey	Yu, Fei and Zhang, Hongbo and Tiwari, Prayag and Wang, Benyou	ACM Computing Surveys	2024
224	`liu2024logic`	Logic-of-thought: Injecting logic into contexts for full reasoning in large language models	Liu, Tongxuan and Xu, Wenjiang and Huang, Weizhe and Zeng, Yuting and Wang, Jiaxing and Wang, Xingyu and Yang, Hailong and Li, Jing	arXiv preprint arXiv:2409.17539	2024
225	`pan2025survey`	A survey of slow thinking-based reasoning llms using reinforced learning and inference-time scaling law	Pan, Qianjun and Ji, Wenkai and Ding, Yuyang and Li, Junsong and Chen, Shilian and Wang, Junyi and Zhou, Jie and Chen, Qin and Zhang, Min and Wu, Yulan and others	arXiv preprint arXiv:2505.02665	2025
226	`huang2025deep`	Deep Research Agents: A Systematic Examination And Roadmap	Huang, Yuxuan and Chen, Yihang and Zhang, Haozheng and Li, Kang and Fang, Meng and Yang, Linyi and Li, Xiaoguang and Shang, Lifeng and Xu, Songcen and Hao, Jianye and others	arXiv preprint arXiv:2506.18096	2025
227	`raiaan2024review`	A review on large language models: Architectures, applications, taxonomies, open issues and challenges	Raiaan, Mohaimenul Azam Khan and Mukta, Md Saddam Hossain and Fatema, Kaniz and Fahad, Nur Mohammad and Sakib, Sadman and Mim, Most Marufatul Jannat and Ahmad, Jubaer and Ali, Mohammed Eunus and Azam, Sami	IEEE access	2024
228	`chen2025towards`	Towards reasoning era: A survey of long chain-of-thought for reasoning large language models	Chen, Qiguang and Qin, Libo and Liu, Jinhao and Peng, Dengyun and Guan, Jiannan and Wang, Peng and Hu, Mengkang and Zhou, Yuhang and Gao, Te and Che, Wanxiang	arXiv preprint arXiv:2503.09567	2025
229	`cao2025toward`	Toward generalizable evaluation in the llm era: A survey beyond benchmarks	Cao, Yixin and Hong, Shibo and Li, Xinze and Ying, Jiahao and Ma, Yubo and Liang, Haiyuan and Liu, Yantao and Yao, Zijun and Wang, Xiaozhi and Huang, Dan and others	arXiv preprint arXiv:2504.18838	2025
230	`chang2024survey`	A survey on evaluation of large language models	Chang, Yupeng and Wang, Xu and Wang, Jindong and Wu, Yuan and Yang, Linyi and Zhu, Kaijie and Chen, Hao and Yi, Xiaoyuan and Wang, Cunxiang and Wang, Yidong and others	ACM transactions on intelligent systems and technology	2024
231	`morishita2024enhancing`	Enhancing reasoning capabilities of llms via principled synthetic logic corpus	Morishita, Terufumi and Morio, Gaku and Yamaguchi, Atsuki and Sogawa, Yasuhiro	Advances in Neural Information Processing Systems	2024
232	`basiouni2025context`	In-Context Learning in Large Language Models (LLMs): Mechanisms, Capabilities, and Implications for Advanced Knowledge Representation and Reasoning	Basiouni, Azza Mohamed and El Rashid, Mohamed and Shaalan, Khaled	IEEE Access	2025
233	`yeo2025demystifying`	Demystifying long chain-of-thought reasoning in llms	Yeo, Edward and Tong, Yuxuan and Niu, Morry and Neubig, Graham and Yue, Xiang	arXiv preprint arXiv:2502.03373	2025
234	`kumar2025llm`	Llm post-training: A deep dive into reasoning large language models	Kumar, Komal and Ashraf, Tajamul and Thawakar, Omkar and Anwer, Rao Muhammad and Cholakkal, Hisham and Shah, Mubarak and Yang, Ming-Hsuan and Torr, Phillip HS and Khan, Fahad Shahbaz and Khan, Salman	arXiv preprint arXiv:2502.21321	2025
235	`fu2025improving`	Improving complex reasoning in large language models	Fu, Yao	The University of Edinburgh	2025
236	`feng2025efficient`	Efficient reasoning models: A survey	Feng, Sicheng and Fang, Gongfan and Ma, Xinyin and Wang, Xinchao	arXiv preprint arXiv:2504.10903	2025
237	`ferrag2025llm`	From llm reasoning to autonomous ai agents: A comprehensive review	Ferrag, Mohamed Amine and Tihanyi, Norbert and Debbah, Merouane	arXiv preprint arXiv:2504.19678	2025
238	`putta2024agent`	Agent q: Advanced reasoning and learning for autonomous ai agents	Putta, Pranav and Mills, Edmund and Garg, Naman and Motwani, Sumeet and Finn, Chelsea and Garg, Divyansh and Rafailov, Rafael	arXiv preprint arXiv:2408.07199	2024
239	`tariq2025reasoning`	Reasoning About Responsibility in Autonomous Systems: Navigating the Challenges and Charting Future Directions	Tariq, Usman and Ahmed, Irfan	Ubiquitous Technology Journal	2025
240	`ferrag2025reasoning`	Reasoning beyond limits: Advances and open problems for llms	Ferrag, Mohamed Amine and Tihanyi, Norbert and Debbah, Merouane	arXiv preprint arXiv:2503.22732	2025
241	`wu2025position`	Position Paper: Towards Open Complex Human-AI Agents Collaboration System for Problem-Solving and Knowledge Management	Wu, Ju and Or, Calvin KL	arXiv preprint arXiv:2505.00018	2025
242	`tran2025reasoning`	Reasoning in Neurosymbolic AI	Tran, Son and Mota, Edjard and Garcez, Artur d'Avila	arXiv preprint arXiv:2505.20313	2025
243	`swiechowski2023monte`	Monte Carlo tree search: A review of recent modifications and applications	{\'S	Artificial Intelligence Review	2023
244	`sun2025data`	Data Agent: A Holistic Architecture for Orchestrating Data+AI Ecosystems	Sun, Zhaoyan and Wang, Jiayi and Zhao, Xinyang and Wang, Jiachi and Li, Guoliang	arXiv preprint arXiv:2507.01599	2025
245	`zheng2025retrieval`	Retrieval augmented generation and understanding in vision: A survey and new outlook	Zheng, Xu and Weng, Ziqiao and Lyu, Yuanhuiyi and Jiang, Lutao and Xue, Haiwei and Ren, Bin and Paudel, Danda and Sebe, Nicu and Van Gool, Luc and Hu, Xuming	arXiv preprint arXiv:2503.18016	2025
246	`bei2025graphs`	Graphs Meet AI Agents: Taxonomy, Progress, and Future Opportunities	Bei, Yuanchen and Zhang, Weizhi and Wang, Siwen and Chen, Weizhi and Zhou, Sheng and Chen, Hao and Li, Yong and Bu, Jiajun and Pan, Shirui and Yu, Yizhou and others	arXiv preprint arXiv:2506.18019	2025
247	`chhikara2025mem0`	Mem0: Building production-ready ai agents with scalable long-term memory	Chhikara, Prateek and Khant, Dev and Aryan, Saket and Singh, Taranjeet and Yadav, Deshraj	arXiv preprint arXiv:2504.19413	2025
248	`huang2025foundation`	Foundation models and intelligent decision-making: Progress, challenges, and perspectives	Huang, Jincai and Xu, Yongjun and Wang, Qi and Wang, Qi Cheems and Liang, Xingxing and Wang, Fei and Zhang, Zhao and Wei, Wei and Zhang, Boxuan and Huang, Libo and others	The Innovation	2025
249	`sun2025survey`	A survey of reasoning with foundation models: Concepts, methodologies, and outlook	Sun, Jiankai and Zheng, Chuanyang and Xie, Enze and Liu, Zhengying and Chu, Ruihang and Qiu, Jianing and Xu, Jiaqi and Ding, Mingyu and Li, Hongyang and Geng, Mengzhe and others	ACM Computing Surveys	2025
250	`zhang2025igniting`	Igniting language intelligence: The hitchhiker’s guide from chain-of-thought reasoning to language agents	Zhang, Zhuosheng and Yao, Yao and Zhang, Aston and Tang, Xiangru and Ma, Xinbei and He, Zhiwei and Wang, Yiming and Gerstein, Mark and Wang, Rui and Liu, Gongshen and others	ACM Computing Surveys	2025
251	`wang2025multimodal`	Multimodal chain-of-thought reasoning: A comprehensive survey	Wang, Yaoting and Wu, Shengqiong and Zhang, Yuecheng and Yan, Shuicheng and Liu, Ziwei and Luo, Jiebo and Fei, Hao	arXiv preprint arXiv:2503.12605	2025
252	`chen2025policy`	Policy frameworks for transparent chain-of-thought reasoning in large language models	Chen, Yihang and Deng, Haikang and Han, Kaiqiao and Zhao, Qingyue	arXiv preprint arXiv:2503.14521	2025
253	`manuvinakurike2025thoughts`	Thoughts without Thinking: Reconsidering the Explanatory Value of Chain-of-Thought Reasoning in LLMs through Agentic Pipelines	Manuvinakurike, Ramesh and Moss, Emanuel and Watkins, Elizabeth Anne and Sahay, Saurav and Raffa, Giuseppe and Nachman, Lama	arXiv preprint arXiv:2505.00875	2025
254	`li2025llm`	LLM-augmented hierarchical reinforcement learning for human-like decision-making of autonomous driving	Li, Lin and Tan, Runjia and Fang, Jianwu and Xue, Jianru and Lv, Chen	Expert Systems with Applications	2025
255	`zhao2025world`	World Models for Cognitive Agents: Transforming Edge Intelligence in Future Networks	Zhao, Changyuan and Zhang, Ruichen and Wang, Jiacheng and Zhao, Gaosheng and Niyato, Dusit and Sun, Geng and Mao, Shiwen and Kim, Dong In	arXiv preprint arXiv:2506.00417	2025
256	`lopez2025survey`	A Survey on Large Language Models in Multimodal Recommender Systems	Lopez-Avila, Alejo and Du, Jinhua	arXiv preprint arXiv:2505.09777	2025
257	`giannone2025feedback`	Feedback-Driven Vision-Language Alignment with Minimal Human Supervision	Giannone, Giorgio and Li, Ruoteng and Feng, Qianli and Perevodchikov, Evgeny and Chen, Rui and Martinez, Aleix	arXiv preprint arXiv:2501.04568	2025
258	`cao2025causal`	Causal action empowerment for efficient reinforcement learning in embodied agents	Cao, Hongye and Feng, Fan and Huo, Jing and Gao, Yang	Science China Information Sciences	2025
259	`ranjan2025fairness`	Fairness in Agentic AI: A Unified Framework for Ethical and Equitable Multi-Agent System	Ranjan, Rajesh and Gupta, Shailja and Singh, Surya Narayan	arXiv preprint arXiv:2502.07254	2025
260	`chen2024fairness`	Fairness testing: A comprehensive survey and analysis of trends	Chen, Zhenpeng and Zhang, Jie M and Hort, Max and Harman, Mark and Sarro, Federica	ACM Transactions on Software Engineering and Methodology	2024
261	`su2025thinking`	Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers	Su, Zhaochen and Xia, Peng and Guo, Hangyu and Liu, Zhenhua and Ma, Yan and Qu, Xiaoye and Liu, Jiaqi and Li, Yanshu and Zeng, Kaide and Yang, Zhengyuan and others	arXiv preprint arXiv:2506.23918	2025
262	`karunanayake2025next`	Next-generation agentic AI for transforming healthcare	Karunanayake, Nalan	Informatics and Health	2025
263	`zhang2025survey`	A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?	Zhang, Qiyuan and Lyu, Fuyuan and Sun, Zexu and Wang, Lei and Zhang, Weixu and Hua, Wenyue and Wu, Haolun and Guo, Zhihan and Wang, Yufei and Muennighoff, Niklas and others	arXiv preprint arXiv:2503.24235	2025
264	`kim2025cost`	The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective	Kim, Jiin and Shin, Byeongjun and Chung, Jinha and Rhu, Minsoo	arXiv preprint arXiv:2506.04301	2025
265	`li2025system`	From system 1 to system 2: A survey of reasoning large language models	Li, Zhong-Zhi and Zhang, Duzhen and Zhang, Ming-Liang and Zhang, Jiaxin and Liu, Zengyan and Yao, Yuxuan and Xu, Haotian and Zheng, Junhao and Wang, Pei-Jie and Chen, Xiuyi and others	arXiv preprint arXiv:2502.17419	2025
266	`gao2024interpretable`	Interpretable contrastive monte carlo tree search reasoning	Gao, Zitian and Niu, Boye and He, Xuzheng and Xu, Haotian and Liu, Hongzhang and Liu, Aiwei and Hu, Xuming and Wen, Lijie	arXiv preprint arXiv:2410.01707	2024
267	`liang2025mcts`	I-MCTS: Enhancing agentic AutoML via introspective monte carlo tree search	Liang, Zujie and Wei, Feng and Xu, Wujiang and Chen, Lin and Qian, Yuxi and Wu, Xinhui	arXiv preprint arXiv:2502.14693	2025
268	`an2025combining`	Combining llms with logic-based framework to explain mcts	An, Ziyan and Wang, Xia and Baier, Hendrik and Chen, Zirong and Dubey, Abhishek and Johnson, Taylor T and Sprinkle, Jonathan and Mukhopadhyay, Ayan and Ma, Meiyi	arXiv preprint arXiv:2505.00610	2025
269	`dao2025boosting`	Boosting MCTS with Free Energy Minimization	Dao, Mawaba Pascal and Peter, Adrian M	arXiv preprint arXiv:2501.13083	2025
270	`meimandi2025measurement`	The Measurement Imbalance in Agentic AI Evaluation Undermines Industry Productivity Claims	Meimandi, Kiana Jafari and Ar{\'a	arXiv preprint arXiv:2506.02064	2025
271	`ahmed2025enhancing`	Enhancing Explainability, Robustness, and Autonomy: A Comprehensive Approach in Trustworthy AI	Ahmed, Mobyen Uddin and Begum, Shahina and Barua, Shaibal and Masud, Abu Naser and Di Flumeri, Gianluca and Navarin, Nicol{\`o	2025 IEEE Symposium on Trustworthy, Explainable and Responsible Computational Intelligence (CITREx)	2025
272	`sanwal2025layered`	Layered chain-of-thought prompting for multi-agent llm systems: A comprehensive approach to explainable large language models	Sanwal, Manish	arXiv preprint arXiv:2501.18645	2025
273	`pang2025interactive`	Interactive Reasoning: Visualizing and Controlling Chain-of-Thought Reasoning in Large Language Models	Pang, Rock Yuren and Feng, KJ and Feng, Shangbin and Li, Chu and Shi, Weijia and Tsvetkov, Yulia and Heer, Jeffrey and Reinecke, Katharina	arXiv preprint arXiv:2506.23678	2025
274	`bilal2025meta`	Meta-thinking in llms via multi-agent reinforcement learning: A survey	Bilal, Ahsan and Mohsin, Muhammad Ahmed and Umer, Muhammad and Bangash, Muhammad Awais Khan and Jamshed, Muhammad Ali	arXiv preprint arXiv:2504.14520	2025
275	`wen2025cotguard`	CoTGuard: Using Chain-of-Thought Triggering for Copyright Protection in Multi-Agent LLM Systems	Wen, Yan and Guo, Junfeng and Huang, Heng	arXiv preprint arXiv:2505.19405	2025
276	`zahid2025explainability`	Explainability, Robustness, and Fairness in User-Centric Intelligent Systems: A Systematic Review	Zahid, Idrees A and Garfan, Salem and Chyad, MA and Albahri, AS and Albahri, OS and Alamoodi, AH and Deveci, Muhammet and Homod, Raad Z and Alzubaidi, Laith	IEEE Transactions on Emerging Topics in Computational Intelligence	2025
277	`gupta2025ai`	AI Agents Collaboration Under Resource Constraints: Practical Implementations	Gupta, Shubham	INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT	2025
278	`molinari2025towards`	Towards Pervasive Distributed Agentic Generative AI - A State of The Art	Molinari, Gianni and Ciravegna, Fabio	arXiv preprint arXiv:2506.13324	2025
279	`zhang2024integrating`	Integrating Artificial Intelligence into Operating Systems: A Comprehensive Survey on Techniques, Applications, and Future Directions	Zhang, Yifan and Zhao, Xinkui and Li, Ziying and Yin, Jianwei and Zhang, Lufei and Chen, Zuoning	arXiv preprint arXiv:2407.14567	2024
280	`wei2025agent`	Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC	Wei, Xinming and Zhang, Jiahao and Li, Haoran and Chen, Jiayu and Qu, Rui and Li, Maoliang and Chen, Xiang and Luo, Guojie	arXiv preprint arXiv:2506.24045	2025
281	`jiang2025large`	From large ai models to agentic ai: A tutorial on future intelligent communications	Jiang, Feibo and Pan, Cunhua and Dong, Li and Wang, Kezhi and Dobre, Octavia A and Debbah, Merouane	arXiv preprint arXiv:2505.22311	2025
282	`liu2025optimizing`	Optimizing on-demand food delivery with BDI-based multi-agent systems and Monte Carlo tree search scheduling	Liu, Li and Chen, Shikun and Jin, Huan and Deng, Xiaoying and Liu, Yangguang and Lin, Yang	Scientific Reports	2025
283	`zou2025agente`	El Agente: An autonomous agent for quantum chemistry	Zou, Yunheng and Cheng, Austin H and Aldossary, Abdulrahman and Bai, Jiaru and Leong, Shi Xuan and Campos-Gonzalez-Angulo, Jorge Arturo and Choi, Changhyeok and Ser, Cher Tian and Tom, Gary and Wang, Andrew and others	Matter	2025
284	`amini2025distributed`	Distributed LLMs and multimodal large language models: A survey on advances, challenges, and future directions	Amini, Hadi and Mia, Md Jueal and Saadati, Yasaman and Imteaj, Ahmed and Nabavirazavi, Seyedsina and Thakker, Urmish and Hossain, Md Zarif and Fime, Awal Ahmed and Iyengar, SS	arXiv preprint arXiv:2503.16585	2025
285	`chaudhry2025towards`	Towards Resource-Efficient Compound AI Systems	Chaudhry, Gohar Irfan and Choukse, Esha and Goiri, {\'I	Proceedings of the 2025 Workshop on Hot Topics in Operating Systems	2025
286	`roy2024enhancing`	Enhancing Real-World Robustness in AI: Challenges and Solutions	Roy, Pritam	J. Recent Trends Comput. Sci. Eng	2024
287	`kim2025medical`	Medical hallucinations in foundation models and their impact on healthcare	Kim, Yubin and Jeong, Hyewon and Chen, Shan and Li, Shuyue Stella and Lu, Mingyu and Alhamoud, Kumail and Mun, Jimin and Grau, Cristina and Jung, Minseok and Gameiro, Rodrigo and others	arXiv preprint arXiv:2503.05777	2025
288	`gao2025mono`	Mono: Is Your "Clean" Vulnerability Dataset Really Solvable? Exposing and Trapping Undecidable Patches and Beyond	Gao, Zeyu and Zhou, Junlin and Zhang, Bolun and He, Yi and Zhang, Chao and Cui, Yuxin and Wang, Hao	arXiv preprint arXiv:2506.03651	2025
289	`chander2025toward`	Toward trustworthy artificial intelligence (TAI) in the context of explainability and robustness	Chander, Bhanu and John, Chinju and Warrier, Lekha and Gopalakrishnan, Kumaravelan	ACM Computing Surveys	2025
290	`barros2025think`	I Think, Therefore I Hallucinate: Minds, Machines, and the Art of Being Wrong	Barros, Sebastian	arXiv preprint arXiv:2503.05806	2025
291	`latif2025hallucinations`	Hallucinations in large language models and their influence on legal reasoning: Examining the risks of ai-generated factual inaccuracies in judicial processes	Latif, Youssef Abdel	Journal of Computational Intelligence, Machine Reasoning, and Decision-Making	2025
292	`chakraborti2025personalized`	Personalized uncertainty quantification in artificial intelligence	Chakraborti, Tapabrata and Banerji, Christopher RS and Marandon, Ariane and Hellon, Vicky and Mitra, Robin and Lehmann, Brieuc and Br{\"a	Nature Machine Intelligence	2025
293	`liu2025uncertainty`	Uncertainty quantification and confidence calibration in large language models: A survey	Liu, Xiaoou and Chen, Tiejin and Da, Longchao and Chen, Chacha and Lin, Zhen and Wei, Hua	arXiv preprint arXiv:2503.15850	2025
294	`becerra2025historical`	Historical Methods for AI Evaluations, Assessments, and Audits	Becerra Sandoval, Juana Catalina and Jing, Felicia S	Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency	2025
295	`yeo2025comprehensive`	A comprehensive review on financial explainable AI	Yeo, Wei Jie and Van Der Heever, Wihan and Mao, Rui and Cambria, Erik and Satapathy, Ranjan and Mengaldo, Gianmarco	Artificial Intelligence Review	2025
296	`mao2025llms`	From LLMs to MLLMs to Agents: A Survey of Emerging Paradigms in Jailbreak Attacks and Defenses within LLM Ecosystem	Mao, Yanxu and Cui, Tiehan and Liu, Peipei and You, Datao and Zhu, Hongsong	arXiv preprint arXiv:2506.15170	2025
297	`feng2025integration`	Integration of multi-agent systems and artificial intelligence in self-healing subway power supply systems: Advancements in fault diagnosis, isolation, and recovery	Feng, Jianbing and Yu, Tao and Zhang, Kuozhen and Cheng, Lefeng	Processes	2025
298	`hammond2025multi`	Multi-agent risks from advanced AI	Hammond, Lewis and Chan, Alan and Clifton, Jesse and Hoelscher-Obermaier, Jason and Khan, Akbir and McLean, Euan and Smith, Chandler and Barfuss, Wolfram and Foerster, Jakob and Gaven{\v{c	arXiv preprint arXiv:2502.14143	2025
299	`acharya2025agentic`	Agentic AI: Autonomous intelligence for complex goals--a comprehensive survey	Acharya, Deepak Bhaskar and Kuppan, Karthigeyan and Divya, B	IEEE Access	2025
300	`abdallah2024multi`	Multi-agent DRL for distributed codebook design in RIS-aided cell-free massive MIMO networks	Abdallah, Asmaa and Celik, Abdulkadir and Mansour, Mohammad M and Eltawil, Ahmed M	IEEE Transactions on Communications	2024
301	`feffer2024red`	Red-teaming for generative AI: Silver bullet or security theater?	Feffer, Michael and Sinha, Anusha and Deng, Wesley H and Lipton, Zachary C and Heidari, Hoda	Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society	2024
302	`majumdar2025red`	Red Teaming AI Red Teaming	Majumdar, Subhabrata and Pendleton, Brian and Gupta, Abhishek	arXiv preprint arXiv:2507.05538	2025
303	`qwen2025ledger`	Accountability Ledger: Blockchain-Based AI Decision Logging	Qwen Team		2025
304	`google2025bert`	Bias Bounty Program for BERT	Google AI		2025
305	`openai2025gpt4`	Homomorphic Encryption in GPT-4	OpenAI		2025
306	`deepmind2025sparrow`	Safety Layer in Sparrow: Preventing Harmful Outputs	DeepMind		2025
307	`anthropic2025claude`	Interactive Transparency in Claude	Anthropic		2025
308	`ey2025mott`	How Mott MacDonald is Building Confidence Through Responsible AI	EY		2025
309	`ey2025biopharma`	How a Global Biopharma Became a Leader in Ethical AI	EY		2025
310	`eu2025ai`	EU AI Act	European Union		2025
311	`masood2025effectiveness`	Measuring the Effectiveness of AI Adoption	Masood, A.		2025
312	`forbes2025ai`	Future Directions in AI Ethics	Forbes		2025
313	`ey_mottmac2025`	How Mott MacDonald is building confidence through responsible AI	EY		2025
314	`challita2025redteamllm`	RedTeamLLM: an Agentic AI framework for offensive security	Challita, Brian and Parrend, Pierre	arXiv preprint arXiv:2505.06913	2025
315	`glazer2024frontiermath`	FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI	Glazer, Elliot and Erdil, Ege and Besiroglu, Tamay and Chicharro, Diego and Chen, Evan and Gunning, Alex and Olsson, Caroline Falkman and Denain, Jean-Stanislas and Ho, Anson and Santos, Emily de Oliveira and others	arXiv preprint arXiv:2411.04872	2024
316	`ogbu2023agentic`	Agentic AI in computer vision domain - recent advances and prospects	Ogbu, Daniel	International Journal of Research Publication and Reviews	2023
317	`glaese2022improvingalignmentdialogueagents`	Improving alignment of dialogue agents via targeted human judgements	Amelia Glaese and Nat McAleese and Maja Trębacz and John Aslanides and Vlad Firoiu and Timo Ewalds and Maribeth Rauh and Laura Weidinger and Martin Chadwick and Phoebe Thacker and Lucy Campbell-Gillingham and Jonathan Uesato and Po-Sen Huang and Ramona Comanescu and Fan Yang and Abigail See and Sumanth Dathathri and Rory Greig and Charlie Chen and Doug Fritz and Jaume Sanchez Elias and Richard Green and Soňa Mokrá and Nicholas Fernando and Boxi Wu and Rachel Foley and Susannah Young and Iason Gabriel and William Isaac and John Mellor and Demis Hassabis and Koray Kavukcuoglu and Lisa Anne Hendricks and Geoffrey Irving		2022	DOI/URL
318	`amorim2023dataprivacyhomomorphicencryption`	Data Privacy with Homomorphic Encryption in Neural Networks Training and Inference	Ivone Amorim and Eva Maia and Pedro Barbosa and Isabel Praça		2023	DOI/URL
319	`scharowski2023exploring`	Exploring the effects of human-centered AI explanations on trust and reliance	Scharowski, Nicolas and Perrig, Sebastian AC and Svab, Melanie and Opwis, Klaus and Br{\"u	Frontiers in Computer Science	2023
320	`liao2022humancenteredexplainableaixai`	Human-Centered Explainable AI (XAI): From Algorithms to User Experiences	Q. Vera Liao and Kush R. Varshney		2022	DOI/URL
321	`alibabacloud_sls_logaudit`	Simple Log Service: Log Audit Service (new version)	Alibaba Cloud		2024	DOI/URL
322	`yang2020ledgerdb`	LedgerDB: A centralized ledger database for universal audit and verification	Yang, Xinying and Zhang, Yuan and Wang, Sheng and Yu, Benquan and Li, Feifei and Li, Yize and Yan, Wenyuan	Proceedings of the VLDB Endowment	2020
323	`fli_ai_safety_index_2025`	AI Safety Index: Summer 2025 Edition	Future of Life Institute		2025	DOI/URL
324	`TFS2025_ai_agents_eu`	Ahead of the Curve: Governing AI Agents under the EU {AI	{The Future Society		2025	DOI/URL
325	`maclean2017nist`	The NIST risk management framework: Problems and recommendations	Maclean, Don	Cyber Security: A Peer-Reviewed Journal	2017
326	`gogia2025trust`	Trust by Design: Dissecting IBM's Enterprise AI Governance Stack	Sanchit Vir Gogia		2025	DOI/URL
327	`xia2024responsibleaimetricscatalogue`	Towards a Responsible AI Metrics Catalogue: A Collection of Metrics for AI Accountability	Boming Xia and Qinghua Lu and Liming Zhu and Sung Une Lee and Yue Liu and Zhenchang Xing		2024	DOI/URL
328	`weidinger2024holistic`	Holistic safety and responsibility evaluations of advanced AI models	Weidinger, Laura and Barnhart, Joslyn and Brennan, Jenny and Butterfield, Christina and Young, Susie and Hawkins, Will and Hendricks, Lisa Anne and Comanescu, Ramona and Chang, Oscar and Rodriguez, Mikel and others	arXiv preprint arXiv:2404.14068	2024
329	`sprague2024cot`	To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning	Sprague, Zayne and Yin, Fangcong and Rodriguez, Juan Diego and Jiang, Dongwei and Wadhwa, Manya and Singhal, Prasann and Zhao, Xinyu and Ye, Xi and Mahowald, Kyle and Durrett, Greg	arXiv preprint arXiv:2409.12183	2024
330	`bergman2024stela`	STELA: a community-centred approach to norm elicitation for AI alignment	Bergman, Stevie and Marchal, Nahema and Mellor, John and Mohamed, Shakir and Gabriel, Iason and Isaac, William	Scientific Reports	2024
331	`larsen2024aivaluealignment`	AI value alignment: How we can align artificial intelligence with human values	Larsen, Benjamin and Dignum, Virginia		2024	DOI/URL
332	`alicloud2025ledgerdb`	LedgerDB: a centralized ledger database for universal audit and verification	Yang, Xinying and Zhang, Yuan and Wang, Sheng and Yu, Benquan and Li, Feifei and Li, Yize and Yan, Wenyuan	Proc. VLDB Endow.	2020	DOI/URL
333	`mialon2023gaiabenchmarkgeneralai`	GAIA: a benchmark for General AI Assistants	Grégoire Mialon and Clémentine Fourrier and Craig Swift and Thomas Wolf and Yann LeCun and Thomas Scialom		2023	DOI/URL
334	`timms2024agentic`	Agentic Anomaly Detection for Shipping	Timms, Alexander and Langbridge, Abigail and O'Donncha, Fearghal	NeurIPS 2024 Workshop on Open-World Agents	2024
335	`kumar2025saarthi`	Saarthi: The First AI Formal Verification Engineer	Kumar, Aman and Gadde, Deepak Narayan and Radhakrishna, Keerthan Kopparam and Lettnin, Djones	arXiv preprint arXiv:2502.16662	2025
336	`garg2025designing`	Designing the Mind: How Agentic Frameworks Are Shaping the Future of AI Behavior	Garg, Venus	Journal of Computer Science and Technology Studies	2025
337	`buehler2025agentic`	Agentic deep graph reasoning yields self-organizing knowledge networks	Buehler, Markus J	arXiv preprint arXiv:2502.13025	2025
338	`perrier2025out`	Out of Control--Why Alignment Needs Formal Control Theory (and an Alignment Control Stack)	Perrier, Elija	arXiv preprint arXiv:2506.17846	2025
339	`huang2025agentic`	Agentic AI	Huang, Ken	Springer	2025
340	`kitchenham2004procedures`	Procedures for performing systematic reviews	Kitchenham, Barbara	Keele, UK, Keele University	2004
341	`boland2017doing`	Doing a systematic review: a student's guide	Boland, Angela and Cherry, Gemma and Dickson, Rumona	Sage	2017
342	`lee2025evaluating`	Evaluating step-by-step reasoning traces: A survey	Lee, Jinu and Hockenmaier, Julia	arXiv preprint arXiv:2502.12289	2025
343	`natarajan2025human`	Human-in-the-loop or AI-in-the-loop? Automate or Collaborate?	Natarajan, Sriraam and Mathur, Saurabh and Sidheekh, Sahil and Stammer, Wolfgang and Kersting, Kristian	Proceedings of the AAAI Conference on Artificial Intelligence	2025
344	`yigit2025generative`	Generative AI and LLMs for critical infrastructure protection: evaluation benchmarks, agentic AI, challenges, and opportunities	Yigit, Yagmur and Ferrag, Mohamed Amine and Ghanem, Mohamed C and Sarker, Iqbal H and Maglaras, Leandros A and Chrysoulas, Christos and Moradpoor, Naghmeh and Tihanyi, Norbert and Janicke, Helge	Sensors	2025
345	`allana2025privacy`	Privacy Risks and Preservation Methods in Explainable Artificial Intelligence: A Scoping Review	Allana, Sonal and Kankanhalli, Mohan and Dara, Rozita	arXiv preprint arXiv:2505.02828	2025
346	`deng2025ai`	AI agents under threat: A survey of key security challenges and future pathways	Deng, Zehang and Guo, Yongjian and Han, Changzhou and Ma, Wanlun and Xiong, Junwu and Wen, Sheng and Xiang, Yang	ACM Computing Surveys	2025
347	`inala2025building`	Building Trustworthy Agentic AI Systems for Personalized Banking Experiences	Inala, Ramesh and Somu, Bharath	Metallurgical and Materials Engineering	2025
348	`huang2025ai`	AI Agent Safety and Security Considerations	Huang, Jerry and Huang, Ken and Jackson, Krystal and Hughes, Chris	Agentic AI: Theories and Practices	2025
349	`sutton2018reinforcement`	Reinforcement learning: An introduction	Sutton, Richard S and Barto, Andrew G	MIT press	2018
350	`hosseini2025ai`	AI ethics in action: a circular model for transparency, accountability and inclusivity	Hosseini Tabaghdehi, Seyedeh Asieh and Ayaz, {\"O	Journal of Managerial Psychology	2025
351	`bahangulu2025algorithmic`	Algorithmic bias, data ethics, and governance: Ensuring fairness, transparency and compliance in AI-powered business analytics applications	Bahangulu, Julien Kiesse and Berko, Louis Owusu	World Journal of Advanced Research and Reviews	2025
352	`li2025ai`	AI-Driven Governance: Enhancing Transparency and Accountability in Public Administration	Li, Changkui	Digital Society & Virtual Governance	2025
353	`andrada2023varieties`	Varieties of transparency: Exploring agency within AI systems	Andrada, Gloria and Clowes, Robert W and Smart, Paul R	AI & Society	2023
354	`zerilli2022transparency`	How transparency modulates trust in artificial intelligence	Zerilli, John and Bhatt, Umang and Weller, Adrian	Patterns	2022
355	`akhtar2024privacy`	Privacy and Security Considerations in Explainable AI	Akhtar, Mohammad Amir Khusru and Kumar, Mohit and Nayyar, Anand	Towards Ethical and Socially Responsible Explainable AI: Challenges and Opportunities	2024
356	`busuioc2021accountable`	Accountable artificial intelligence: Holding algorithms to account	Busuioc, Madalina	Public administration review	2021
357	`griffin2024ethical`	The ethical agency of AI developers	Griffin, Tricia A and Green, Brian Patrick and Welie, Jos VM	AI and Ethics	2024
358	`bjurling2025designing`	Designing Human-Swarm Interaction Systems	Bjurling, Oscar	Link{\"o	2025
359	`braun2025liability`	Liability for artificial intelligence reasoning technologies--a cognitive autonomy that does not help	Braun, Tomasz	Corporate Governance: The International Journal of Business in Society	2025
360	`crewAI`	CrewAI: Framework for Orchestrating Role-Playing, Autonomous AI Agents	João Moura and contributors	GitHub	2023
361	`raman2025navigating`	Navigating artificial general intelligence development: societal, technological, ethical, and brain-inspired pathways	Raman, Raghu and Kowalski, Robin and Achuthan, Krishnashree and Iyer, Akshay and Nedungadi, Prema	Scientific Reports	2025
362	`hammerschmidt2025bridging`	Bridging the gap: inequalities that divide those who can and cannot create sustainable outcomes with AI	Hammerschmidt, Teresa and Stolz, Katharina and Posegga, Oliver	Behaviour & Information Technology	2025
363	`dahlan2025navigating`	Navigating the Digital Frontier: Understanding Technology's Impact on Society	Dahlan, Mariani Mohd	Universiti Poly-Tech Malaysia	2025
364	`jiang_mistral_2023`	Mistral 7B	Jiang, Albert Q. and Sablayrolles, Alexandre and Mensch, Arthur and Bamford, Chris and Chaplot, Devendra Singh and Casas, Diego de las and Bressand, Florian and Lengyel, Gianna and Lample, Guillaume and Saulnier, Lucile and Lavaud, Lélio Renard and Lachaux, Marie-Anne and Stock, Pierre and Scao, Teven Le and Lavril, Thibaut and Wang, Thomas and Lacroix, Timothée and Sayed, William El	arXiv	2023	DOI/URL
365	`xu_bot-adversarial_2021`	Bot-{Adversarial	Xu, Jing and Ju, Da and Li, Margaret and Boureau, Y-Lan and Weston, Jason and Dinan, Emily	Proceedings of the 2021 {Conference	2021	DOI/URL
366	`pujari2024ethical`	Ethical and responsible AI: Governance frameworks and policy implications for multi-agent systems	Pujari, Tejaskumar and Goel, Anshul and Sharma, Ashwin	IJST	2024
367	`nanjundan2025navigating`	Navigating the ethical landscape of artificial intelligence: Challenges, frameworks, and responsible deployment	Nanjundan, Preethi and Indu, PV and Thomas, Lijo	Artificial Intelligence Technologies for Engineering Applications	2025
368	`panarese2025algorithmic`	Algorithmic bias, fairness, and inclusivity: a multilevel framework for justice-oriented AI	Panarese, Paola and Grasso, Marta Margherita and Solinas, Claudia	AI & Society	2025
369	`mergen2025artificial`	Artificial intelligence and bias towards marginalised groups: Theoretical roots and challenges	Mergen, Aybike and {\c{C	AI and Diversity in a Datafied World of Work: Will the Future of Work be Inclusive?	2025
370	`kay2025imitation`	Imitation, Identity, and Injustice in Artificial Intelligence	Kay, Jackie		2025
371	`koukaras2025ai`	AI-driven telecommunications for smart classrooms: Transforming education through personalized learning and secure networks	Koukaras, Christos and Koukaras, Paraskevas and Ioannidis, Dimosthenis and Stavrinides, Stavros G	Telecom	2025
372	`sharma2025role`	The role of large language models in personalized learning: a systematic review of educational impact	Sharma, Sahil and Mittal, Puneet and Kumar, Mukesh and Bhardwaj, Vivek	Discover Sustainability	2025
373	`lau2025size`	Size Matters When Adopting and Scaling AI	Lau, Theodora	Banking on (Artificial) Intelligence: Navigating the Realities of AI in Financial Services	2025
374	`rahal2025use`	The use of publicly available online texts in training AI: an ethical analysis of AI’s right to learn	Rahal, Louai	Journal of Information, Communication and Ethics in Society	2025
375	`emery2025international`	International governance of advancing artificial intelligence	Emery-Xu, Nicholas and Jordan, Richard and Trager, Robert	AI & Society	2025
376	`charkhian2025can`	How can AI evaluate and improve inclusivity in university portals, with a focus on cultural, linguistic, and accessible requirements?	Charkhian, D and Moghaddami, B	INTED2025 Proceedings	2025
377	`davoodi2024equal`	EQUAL AI: A framework for enhancing equity, quality, understanding and accessibility in liberal arts through AI for multilingual learners	Davoodi, Amin	Language, Technology, and Social Media	2024
378	`hyrynsalmi2025making`	Making Software Development More Diverse and Inclusive: Key Themes, Challenges, and Future Directions	Hyrynsalmi, Sonja M and Baltes, Sebastian and Brown, Chris and Prikladnicki, Rafael and Rodriguez-Perez, Gema and Serebrenik, Alexander and Simmonds, Jocelyn and Trinkenreich, Bianca and Wang, Yi and Liebel, Grischa	ACM Transactions on Software Engineering and Methodology	2025
379	`alam2025ethical`	Ethical Challenges and Bias in AI-Driven Marketing: Educational Imperatives and Policy Perspectives	Alam, Ashraf	Impacts of AI-Generated Content on Brand Reputation	2025
380	`neumann2025position`	Position is Power: System Prompts as a Mechanism of Bias in Large Language Models (LLMs)	Neumann, Anna and Kirsten, Elisabeth and Zafar, Muhammad Bilal and Singh, Jatinder	Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency	2025
381	`ma2025breaking`	Breaking Down Bias: On The Limits of Generalizable Pruning Strategies	Ma, Sibo and Salinas, Alejandro and Nyarko, Julian and Henderson, Peter	Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency	2025
382	`solano2025running`	"Who is running it?" Towards Equitable AI Deployment in Home Care Work	Solano-Kamaiko, Ian Ren{\'e	Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems	2025
383	`gabriel2025matter`	A matter of principle? AI alignment as the fair treatment of claims	Gabriel, Iason and Keeling, Geoff	Philosophical Studies	2025
384	`watson2025competing`	Competing narratives in AI ethics: a defense of sociotechnical pragmatism	Watson, David S and M{\"o	AI & Society	2025
385	`goldberg2025threat`	Threat Rigidity and the Role of Leadership and Organizational Change in Artificial Intelligence Adoption in Technology Companies	Goldberg, Nicole Dillon		2025
386	`van2025beyond`	Beyond efficiency: How artificial intelligence (AI) will reshape scientific inquiry and the publication process	Van Quaquebeke, Niels and Tonidandel, Scott and Banks, George C	The Leadership Quarterly	2025
387	`belliger2025new`	New Perspectives on AI Alignment	Belliger, Andr{\'e	Ethics in the Age of AI: Navigating Politics and Security	2025
388	`xue2025mmrc`	MMRC: A large-scale benchmark for understanding multimodal large language model in real-world conversation	Xue, Haochen and Tang, Feilong and Hu, Ming and Liu, Yexin and Huang, Qidong and Li, Yulong and Liu, Chengzhi and Xu, Zhongxing and Zhang, Chong and Feng, Chun-Mei and others	arXiv preprint arXiv:2502.11903	2025
389	`yang2025survey`	A survey of AI agent protocols	Yang, Yingxuan and Chai, Huacan and Song, Yuanyi and Qi, Siyuan and Wen, Muning and Li, Ning and Liao, Junwei and Hu, Haoyi and Lin, Jianghao and Chang, Gaowei and others	arXiv preprint arXiv:2504.16736	2025
390	`tian2025outlook`	An outlook on the opportunities and challenges of multi-agent AI systems	Tian, Fangqiao and Luo, An and Du, Jin and Xian, Xun and Specht, Robert and Wang, Ganghua and Bi, Xuan and Zhou, Jiawei and Srinivasa, Jayanth and Kundu, Ashish and others	arXiv preprint arXiv:2505.18397	2025
391	`karim2025ai`	AI agents meet blockchain: A survey on secure and scalable collaboration for multi-agents	Karim, Md Monjurul and Van, Dong Hoang and Khan, Sangeen and Qu, Qiang and Kholodov, Yaroslav	Future Internet	2025
392	`gawande2025reactive`	From Reactive to Proactive: Real-Time Human-AI Collaboration in Intelligent Alerting Systems	Gawande, Pramod Dattarao	Journal of Computer Science and Technology Studies	2025
393	`hughes2025ai`	AI agents and agentic systems: A multi-expert analysis	Hughes, Laurie and Dwivedi, Yogesh K and Malik, Tegwen and Shawosh, Mazen and Albashrawi, Mousa Ahmed and Jeon, Il and Dutot, Vincent and Appanderanda, Mandanna and Crick, Tom and De’, Rahul and others	Journal of Computer Information Systems	2025
394	`ahrweiler2025inclusive`	Inclusive technology co-design for participatory AI	Ahrweiler, Petra and Sp{\"a	Participatory Artificial Intelligence in Public Social Services: From Bias to Fairness in Assessing Beneficiaries	2025
395	`merchan2025trust`	Trust by Design: An Ethical Framework for Collaborative Intelligence Systems in Industry 5.0	Merch{\'a	Electronics	2025
396	`watson2025personalized`	Personalized Constitutionally-Aligned Agentic Superego: Secure AI Behavior Aligned to Diverse Human Values	Watson, Nell and Amer, Ahmed and Harris, Evan and Ravindra, Preeti and Zhang, Shujun	arXiv preprint arXiv:2506.13774	2025
397	`kolt2025governing`	Governing AI agents	Kolt, Noam	arXiv preprint arXiv:2501.07913	2025
398	`kraprayoon2025ai`	AI agent governance: A field guide	Kraprayoon, Jam and Williams, Zoe and Fayyaz, Rida	arXiv preprint arXiv:2505.21808	2025
399	`cohen2025exploring`	Exploring Big Five Personality and AI Capability Effects in LLM-Simulated Negotiation Dialogues	Cohen, Myke C and Su, Zhe and Kao, Hsien-Te and Nguyen, Daniel and Lynch, Spencer and Sap, Maarten and Volkova, Svitlana	arXiv preprint arXiv:2506.15928	2025
400	`zhi2024beyond`	Beyond preferences in AI alignment	Zhi-Xuan, Tan and Carroll, Micah and Franklin, Matija and Ashton, Hal	Philosophical Studies	2024
401	`chan2024visibility`	Visibility into AI agents	Chan, Alan and Ezell, Carson and Kaufmann, Max and Wei, Kevin and Hammond, Lewis and Bradley, Herbie and Bluemke, Emma and Rajkumar, Nitarshan and Krueger, David and Kolt, Noam and others	Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency	2024
402	`raza_fair_2024`	{FAIR	Raza, Shaina and Ghuge, Shardul and Ding, Chen and Pandya, Deval	arXiv preprint arXiv:2401.11033	2024
403	`liu_agentbench_2023`	{AgentBench	Liu, Xiao and Yu, Hao and Zhang, Hanchen and Xu, Yifan and Lei, Xuanyu and Lai, Hanyu and Gu, Yu and Ding, Hangliang and Men, Kaiwen and Yang, Kejuan and Zhang, Shudan and Deng, Xiang and Zeng, Aohan and Du, Zhengxiao and Zhang, Chenhui and Shen, Sheng and Zhang, Tianjun and Su, Yu and Sun, Huan and Huang, Minlie and Dong, Yuxiao and Tang, Jie	arXiv	2023	DOI/URL
404	`touvron2023llama`	LLaMA: Open and Efficient Foundation Language Models	Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timothée and Rozi{\`{e	arXiv preprint arXiv:2302.13971	2023	DOI/URL
405	`zhang_multitrust_2024`	{MultiTrust	Zhang, Yichi and Huang, Yao and Sun, Yitong and Liu, Chang and Zhao, Zhe and Fang, Zhengwei and Wang, Yifan and Chen, Huanran and Yang, Xiao and Wei, Xingxing and Su, Hang and Dong, Yinpeng and Zhu, Jun	arXiv	2024	DOI/URL
