COSO releases guidance on internal controls for AI

The Committee of Sponsoring Organizations of the Treadway Commission has released guidance outlining ways that organizations can achieve and maintain effective internal controls over generative AI. 

The COSO publication emphasizes that while generative AI transforms how information is generated, processed, and acted upon, it does not change the fundamental purpose of internal control: to help organizations achieve their objectives reliably. With this in mind, rather than proposing a new governance model, COSO has instead adapted the five components of its internal control framework — control environment, risk assessment, control activities, information & communication, and monitoring activities — into practices specific to generative AI. 

The COSO guidance is based on five foundational principles, each with its own prescription:

  • Gen AI is probabilistic rather than deterministic, so controls should treat outputs as claims requiring validation, rather than as facts to accept by default; 
  • Gen AI is dynamic, so risk assessment, change control, and monitoring should be continuous, or moving toward continuous, to keep pace with these changes; 
  • Gen AI is easily scalable, for better and worse, so controls should be designed to prevent small errors from propagating into systemic issues; 
  • There is a low barrier to entry for using gen AI, so controls should be designed to govern who can build, deploy, and interact with it; and
  • When properly implemented, gen AI can enhance monitoring, documentation, and validation activities that would otherwise be impractical at scale.

"With the capabilities defined, common risks outlined, and foundational characteristics established, we can now integrate them into the framework in a way that is both comprehensive and practical," said the COSO guidance. 
In terms of the control environment, COSO recommended that organizations assign clear owners for each capability type, with defined authority, escalation paths, and documented scope of use; treat prompts, system prompts, retrieval connectors, and transformation rules as governed configurations with version history, approval workflows, and rollback plans; and embed generative AI governance into the broader control culture, to encourage innovation while also respecting guardrails. The board of directors also must have visibility into gen AI use and its attendant risks. 
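COSO's recommendation to treat prompts and retrieval connectors as governed configurations — with version history, approval workflows, and rollback plans — can be sketched in code. The class below is a hypothetical illustration, not part of the COSO guidance; the prompt names and approver roles are invented:

```python
from dataclasses import dataclass, field

@dataclass
class GovernedPrompt:
    """A prompt treated as a governed configuration: every change is
    approved, versioned, and reversible, per the COSO recommendation."""
    name: str
    text: str
    history: list = field(default_factory=list)  # prior (text, approver) pairs

    def update(self, new_text: str, approver: str) -> None:
        if not approver:
            raise ValueError("changes require a documented approver")
        self.history.append((self.text, approver))  # retain version history
        self.text = new_text

    def rollback(self) -> None:
        """Restore the most recent prior version (the rollback plan)."""
        prior_text, _ = self.history.pop()
        self.text = prior_text

prompt = GovernedPrompt("invoice-extraction", "Extract vendor, date, total.")
prompt.update("Extract vendor, date, total, and currency.", approver="controller")
prompt.rollback()  # back to the previously approved wording
```

In a real deployment this record would live in a controlled system with role-based access, as the guidance describes later, rather than in memory.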

AI-specific risk areas that professionals need to focus on include data extraction and ingestion, because if controls are weak here, every downstream process will inherit compromised or non-compliant data; judgment, forecasting and insight generation, because these outputs often inform high-impact decisions, and so the competence and independence of reviewers is critical; and human-AI collaboration, because without clear boundaries and training, users may inadvertently input sensitive information, rely on outputs without verification, or share unverified outputs externally. 

In terms of risk assessment, COSO said organizations should regularly ask "What if …" questions for each capability to surface hidden dependencies or edge cases, with the scenarios documented so they can be used not only for risk management, but also as evidence in audit planning and walkthroughs with internal and external auditors. They should also maintain living risk registers that update when models, corpuses, or configurations change, not just at annual review cycles; and they should embed monitoring triggers, linking identified risks to specific key risk indicators (KRIs), dashboards, or alerts that will surface early signs of drift, bias, or misuse. 
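The idea of a living risk register whose entries embed monitoring triggers might look like the following sketch. The risk names, KRI names, and thresholds are illustrative only, not taken from the guidance:

```python
# Illustrative only: each register entry links a risk to a KRI and a
# threshold, so a breach raises an alert instead of waiting for an
# annual review cycle.
risk_register = [
    {"risk": "hallucinated citations", "kri": "citation_coverage",   "floor": 0.95},
    {"risk": "extraction drift",       "kri": "extraction_accuracy", "floor": 0.98},
]

def breached(register, kri_readings):
    """Return the risks whose linked KRI has fallen below its floor."""
    return [entry["risk"] for entry in register
            if kri_readings.get(entry["kri"], 1.0) < entry["floor"]]

alerts = breached(risk_register,
                  {"citation_coverage": 0.91, "extraction_accuracy": 0.99})
# citation coverage sits below its 0.95 floor; extraction accuracy does not
```

The point of the structure is that updating the register and updating the monitoring logic are the same act, which is what keeps the register "living."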

Internal controls professionals should especially focus on data transformation and integration, as a small mapping or enrichment error can silently corrupt large datasets, leading to cumulative downstream reporting or compliance failures. They should also focus on automated transaction processing and reconciliation, since misclassification or threshold misalignment can lead to inappropriate action (or inaction) at scale; and knowledge retrieval and summarization, because incomplete sets of information or misinterpretation of unstructured data can produce inaccurate results. 

When it comes to control activities across AI capabilities, COSO believes internal controls professionals should test AI performance before and after deployment, as well as periodically retest to confirm ongoing reliability; separate the ability to configure AI settings from the authority to approve or review outputs; implement rules that require additional review, require source citations, or block the ability to take action when the confidence of the output falls below acceptable levels; and require documented approvals and evidence for changes to prompts, thresholds, and retrieval corpuses. 
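The confidence-threshold rule described above — requiring extra review, requiring citations, or blocking action when confidence falls too low — reduces to a small gating function. The cutoffs below are made up for illustration; COSO does not prescribe specific values:

```python
def route_output(confidence: float, block_below: float = 0.6,
                 review_below: float = 0.9) -> str:
    """Route a generative AI output by its confidence score:
    block outright, escalate to human review, or accept with citations."""
    if confidence < block_below:
        return "block"                  # too unreliable to act on at all
    if confidence < review_below:
        return "human_review"           # additional review required
    return "accept_with_citations"      # high confidence, but still cite sources

route_output(0.45)  # a low-confidence output is blocked
```

Separating who sets `block_below` and `review_below` from who reviews the outputs mirrors the segregation of duties COSO describes.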

When it comes to AI-specific risks, COSO noted that some types require especially robust control activities due to the nature of their outputs or the speed at which errors can scale. To head off data ingestion and extraction risks, professionals should apply confidence thresholds, require human review for low-confidence extractions, and require dual review of use cases before they go into production. For workflow orchestration and autonomous task execution risk, professionals should simulate and test routing changes before they go live and document expected routing logic so deviations are detectable. And in terms of judgment, forecasting and insight generation, internal controls professionals should require citations for all material outputs, capture contrary information in cases where reviewers disagree, and perform hindsight analysis of forecasts against actual results. 

When it comes to information & communication, COSO urged internal auditors to record where data came from, how it was processed, and by which model configuration, so output similarity can be assessed; to maintain prompt libraries, retrieval knowledge sources, and model cards in controlled systems with role-based access; to define model or output KPIs (e.g., hallucinations, citation coverage, bias) and to report them alongside control KPIs; and to define who must be informed about incidents, changes, and limitations — and how quickly.

For data transformation and integration applications, internal controls professionals should communicate any transformation rule changes to all dependent processes before they take effect to avoid silent downstream errors. For knowledge retrieval and summarization, organizations should distribute changes to underlying knowledge libraries, policies, or regulatory expectations to all impacted teams in plain-language summaries, and track acknowledgments where necessary; and for human–AI collaboration, interfaces should display disclaimers or warnings when outputs have not been verified or may contain sensitive information. 

Finally, monitoring activities should combine dashboards for real-time metrics with scheduled deep-dive reviews to validate effectiveness; ensure human-in-the-loop quality reviews for a sample of transactions with use-case-specific rubrics (e.g., accuracy, completeness, tone); establish explicit triggers for retraining, reconfiguration, or rollback based on monitored metrics or incident thresholds; maintain a remediation log that records the issue, root cause analysis, corrective action taken, and follow-up testing results; and create an AI control deficiency playbook that includes a mapping of common generative AI failures to standard corrective actions agreed upon ahead of time. 

COSO warned that, ironically, monitoring systems themselves need monitoring to ensure detection logic remains accurate and relevant; recalibration schedules and hindsight analysis are critical here. People should also take care to compare forecasts to actual results regularly, investigate variances, and track performance trends that may indicate model drift or other incremental unintended changes. They should also review extraction accuracy when source formats change or new templates are introduced, as well as watch for shifts in confidence score distributions that may signal underlying issues. 
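A minimal way to watch for the shifts in confidence score distributions mentioned above is to compare a recent window of scores against a baseline. This is a sketch, not a COSO method; the two-standard-deviation tolerance is an arbitrary illustration:

```python
from statistics import mean, stdev

def confidence_shift(baseline, recent, tolerance=2.0):
    """Flag a shift if the recent mean confidence drifts more than
    `tolerance` baseline standard deviations from the baseline mean."""
    gap = abs(mean(recent) - mean(baseline))
    return gap > tolerance * stdev(baseline)

# Hypothetical readings: confidence has slipped from the low 0.90s to ~0.79,
# which may signal drift in the model or a change in source formats.
baseline = [0.92, 0.94, 0.93, 0.95, 0.91]
recent = [0.78, 0.80, 0.76, 0.79, 0.81]
```

A breach of this check would be exactly the kind of trigger COSO suggests tying to retraining, reconfiguration, or rollback decisions.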

In terms of implementing the guidance, the committee said organizations should first establish an AI governance structure, inventory generative AI use cases, assess risk by COSO component, design and map controls, implement and communicate, then monitor and adapt. 

"The path forward is iterative: Inventory and classify use cases, assess risks with a gen AI-aware mindset, design and implement controls tied to both COSO and capabilities, and monitor performance with clear metrics and governance reporting. Done well, this approach transforms gen AI from an emerging risk into a well-governed asset — one that delivers value with confidence, transparency, and accountability. The sooner organizations embed gen AI governance into their internal control environment, the sooner they can realize its benefits while avoiding the costly risks of uncontrolled adoption," the guidance concluded. 

The guidance calls to mind a recent decision by the IAASB to pursue not new standards for AI, but rather non-authoritative guidance. Both speak to a growing demand for standards and frameworks for the technology. A recent report from Caseware (see previous story) found that a clear majority of respondents (66%) said there is an urgent need for a globally harmonized AI framework for audit and assurance.
