Fifty-five percent of all AI failures inside organisations come from third-party tools, according to a joint MIT Sloan Management Review and Boston Consulting Group study of more than 1,240 executives across 87 countries. The same research found that 78% of organisations rely on third-party AI, and over half rely on it exclusively. That’s the modern reality of the AI stack: most of the systems making decisions inside your enterprise weren’t built by you, weren’t trained on your data, and weren’t designed with your risk appetite in mind. Yet under ISO 42001, the EU AI Act, and the NIST AI RMF, your organisation is on the hook when those systems fail. This piece breaks down what third-party AI risk actually looks like in 2026, where vendor governance breaks under pressure, and how to build a programme that holds up to certification audits and regulatory scrutiny.
What “Third-Party AI Risk” Actually Means
Third-party AI risk isn’t just vendor risk with a new label. The structural difference matters. A traditional SaaS vendor delivers a deterministic service: input goes in, output comes out, behaviour is documented. An AI vendor delivers a probabilistic service whose outputs depend on training data you can’t see, model weights you can’t inspect, and update cycles you don’t control. When your CRM provider pushes a release, the changelog tells you what changed. When your AI vendor retrains a model, your decisions may shift overnight without any visible release at all.
That gap is where the risk lives. Third-party AI risk encompasses every exposure that arises when an external party’s AI system, model, or AI-enabled component influences decisions, processes, or data inside your organisation. It includes the AI you bought, the AI embedded in software you bought, the AI your service providers use to deliver work back to you, and increasingly, the AI your vendors’ vendors are using somewhere upstream.
The shadow problem makes this harder. According to industry research cited by Accorian, roughly 89% of enterprise AI usage is invisible to the organisations using it. Vendors embed AI features into existing products without explicit notice. Service providers route work through generative AI. Procurement teams sign renewal contracts unaware that the platform has quietly added a recommendation engine. The compliance question shifts from “do we use third-party AI?” to “do we know where it’s running?”
There are five exposure categories that every governance programme needs to map. Data exposure: what does the vendor see, store, and use to train. Model risk: bias, drift, hallucination, and degradation that you inherit from a system you didn’t build. Operational dependency: what fails when the vendor’s AI fails. Compliance transfer: regulatory obligations that move with the system regardless of who deployed it. And contractual gaps: liability, indemnification, and breach disclosure clauses that traditional vendor templates simply don’t address.
Why Third-Party AI Risk Became a Board-Level Issue
The shift from “IT problem” to “board problem” happened in two moves. First, vendor breaches became the dominant cyber-incident vector. Second, AI regulation made the deploying organisation accountable for systems built by someone else.
On the breach side, the 2025 Verizon Data Breach Investigations Report found that roughly one in three data breaches now involves a third party, double the figure from the prior year. IBM’s 2025 Cost of a Data Breach Report added a sharper edge: 13% of organisations reported breaches involving AI models or applications, and 97% of those organisations lacked proper AI access controls. The breach economics tell the rest of the story. Software supply chain attacks are projected to cost businesses $60 billion in 2025, up from $46 billion in 2023, according to Cybersecurity Ventures.
On the regulatory side, three frameworks now treat third-party AI as a first-class governance concern. The EU AI Act assigns provider-level obligations to anyone who substantially modifies a vendor’s AI system or deploys it in a high-risk context it wasn’t designed for. ISO 42001 Annex A.10 makes supplier governance a certifiable control. The NIST AI RMF treats third-party dependencies as a core measurement category for trustworthiness. The board doesn’t need to read the standards. They need to know that the regulator audits the deployer, not the vendor.
| WHAT MOST PEOPLE GET WRONG: Treating third-party AI risk as a procurement problem. By the time it reaches procurement, the design decisions that determine your exposure have already been made. The vendor selected the training data. The vendor chose the model architecture. The vendor decided what telemetry the system emits. Procurement is downstream of risk, not the place where risk is managed. |
How to Find AI You Didn’t Know You Were Using
Before you can govern third-party AI, you have to find it. This sounds trivial. It is not. The vendor whose contract you signed three years ago may have added AI features last quarter without renegotiating terms. Your sales team is forwarding customer emails into a generative summarisation tool that nobody approved. Your finance partner is using a model to score your invoices for fraud. Each of these creates exposure. None of them appear in a standard vendor inventory.
Three discovery techniques work in combination. None work alone.
1. Contractual disclosure with teeth
Update your master services agreements and renewal templates to require that vendors disclose, in writing, every AI capability used in service delivery. Specify the disclosure must include the model type, training data sources, whether your data is used for training or inference, and the human oversight regime. Tie disclosure to a notification clause: any new AI feature requires written notice 30 days before deployment, with a right to opt out. Most vendors will push back. The ones that won’t disclose are the ones you most need to identify.
2. Technical detection
Network telemetry can identify AI traffic patterns even when vendors don’t disclose them. DNS analysis flags traffic to known AI provider endpoints. Egress monitoring catches large data flows to inference APIs. CASB tools increasingly tag AI services. Treat technical detection as the validation layer against contractual disclosure: when telemetry shows AI traffic and the vendor’s questionnaire shows none, you have a conversation to start.
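As a minimal sketch of the DNS-analysis step, the snippet below compares queried hostnames against a list of AI provider endpoints and reports vendors whose questionnaire disclosed no AI use. The endpoint list is illustrative rather than exhaustive. The log format (each record already tagged with the vendor or product that generated it) and the function names are assumptions made for this example.

```python
# Illustrative list only; a real programme would maintain this from a curated feed.
AI_ENDPOINT_SUFFIXES = (
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
)


def is_ai_endpoint(hostname: str) -> bool:
    """True if the hostname is a listed AI endpoint or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in AI_ENDPOINT_SUFFIXES)


def undisclosed_ai_traffic(dns_log, disclosed_vendors):
    """Return {vendor: {hostnames}} for AI traffic from vendors that disclosed none.

    dns_log: iterable of (vendor, queried_hostname) pairs (hypothetical format).
    disclosed_vendors: set of vendors whose questionnaire already declares AI use.
    """
    findings = {}
    for vendor, hostname in dns_log:
        if is_ai_endpoint(hostname) and vendor not in disclosed_vendors:
            findings.setdefault(vendor, set()).add(hostname.lower().rstrip("."))
    return findings
```

Each entry in the result is a prompt for a conversation with the vendor, not proof of a violation: the traffic may come from a feature the vendor has simply not yet documented.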
3. Procurement and renewal triggers
Embed AI screening into every procurement gate, not just new vendors. Renewal cycles are where shadow AI surfaces because that’s when terms get re-examined. Build a five-question screening tier into the procurement workflow: does the product use AI, what category, what data does it touch, what’s the inherent risk tier, and what governance evidence exists. Vendors who can’t answer the screening don’t pass the gate.
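The five-question gate can be expressed as a simple completeness check. The sketch below is one possible shape, with hypothetical field names. It assumes that a vendor declaring no AI use passes without answering the remaining four questions, which is a design choice of this example rather than something a standard prescribes.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AIScreening:
    """Answers to the five screening questions; None means unanswered."""
    uses_ai: Optional[bool] = None
    ai_category: Optional[str] = None
    data_touched: Optional[str] = None
    inherent_risk_tier: Optional[str] = None
    governance_evidence: Optional[str] = None


def passes_gate(screening: AIScreening):
    """Return (passed, missing_fields) for one vendor's screening answers."""
    if screening.uses_ai is None:
        return False, ["uses_ai"]
    if screening.uses_ai is False:
        # Assumption: a documented "no AI" answer satisfies the gate on its own.
        return True, []
    missing = [
        f.name for f in fields(screening)
        if getattr(screening, f.name) in (None, "")
    ]
    return (not missing, missing)
```

Returning the list of unanswered questions alongside the pass/fail result gives procurement something concrete to send back to the vendor.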
Risk-Tiering Vendor AI Systems
Not every vendor AI system warrants the same level of scrutiny. Trying to apply uniform due diligence across hundreds of vendors is what breaks programmes. The Whistic 2025 vendor survey found that the average vendor now responds to 37 assessment requests per month, spending 179 hours (equivalent to one full-time employee) on questionnaires alone. Volume crushes depth. Tiering restores it.
A defensible tiering model uses three dimensions, scored independently and combined into an inherent risk rating.
| Dimension | Low Tier | Medium Tier | High Tier |
|---|---|---|---|
| Data sensitivity | Public or anonymised data only | Internal business data, no PII | PII, PHI, financial, or regulated data |
| Decision impact | Recommendations a human reviews | Automated workflows with human override | Automated decisions affecting customers, employees, or regulated outcomes |
| Operational criticality | Productivity tooling | Material business process | Revenue, safety, or compliance-critical process |
A vendor scoring high on any single dimension is a high-tier vendor. Don’t average. Averaging hides the systems that need the most scrutiny. A productivity tool that processes regulated health data is high tier even if its decisions feel low stakes.
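The "don't average" rule is easy to see in code. The sketch below takes the highest of the three dimension scores as the inherent tier and, for contrast, shows what averaging would produce. The `Tier` enum and function names are illustrative, and applying the same maximum rule to medium scores is an assumption that extends the stated rule for high scores.

```python
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def inherent_tier(data_sensitivity: Tier, decision_impact: Tier,
                  operational_criticality: Tier) -> Tier:
    """The highest single dimension sets the overall tier."""
    return max(data_sensitivity, decision_impact, operational_criticality)


def averaged_tier(data_sensitivity: Tier, decision_impact: Tier,
                  operational_criticality: Tier) -> Tier:
    """Shown only for contrast: averaging dilutes a single high score."""
    mean = (data_sensitivity + decision_impact + operational_criticality) / 3
    return Tier(round(mean))


# A productivity tool that processes regulated health data:
tool = (Tier.HIGH, Tier.LOW, Tier.LOW)
print(inherent_tier(*tool).name)   # HIGH
print(averaged_tier(*tool).name)   # MEDIUM
```

The averaged result would route this tool into medium-tier due diligence, which is exactly the miss the maximum rule is meant to prevent.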
Tier dictates depth. Low-tier vendors can pass a 20-question self-attestation. Medium tier requires evidence-backed responses, sample model cards, and SOC 2 Type II review. High tier requires independent audit reports, technical due diligence on the model itself, ISO 42001 alignment evidence, and contractual clauses that specifically address AI failure modes. The goal isn’t more questionnaires. It’s right-sized assurance.
How ISO 42001, the EU AI Act, and NIST AI RMF Treat Third-Party AI
Each framework approaches vendor AI from a different angle. Understanding the overlap is what lets you build one governance programme that satisfies all three rather than three programmes that fight each other for evidence.
ISO 42001: supplier governance as a certifiable control
ISO/IEC 42001 places third-party governance squarely inside Annex A.10, which addresses supplier and customer relationships in the AI lifecycle. As A-LIGN’s analysis of the standard makes clear, the auditor isn’t asking whether your vendors are well-behaved. The auditor is asking whether you have a process to identify, risk-assess, and control any externally provided input that affects AI lifecycle outcomes. Control A.10.3 specifically requires a formal process to evaluate AI suppliers and ensure their products align with the organisation’s responsible AI approach.
In practice, ISO 42001 certification requires four pieces of evidence for vendor AI: a current inventory mapping each system to applicable controls, completed risk assessments at vendor onboarding, an AI risk register with vendor-specific entries and assigned owners, and a Statement of Applicability that links Annex A controls to each vendor relationship. Mature programmes cut audit prep time from eight weeks to two by maintaining this evidence continuously rather than reconstructing it before each audit.
EU AI Act: the requalification trap
Article 25 of the EU AI Act contains the most consequential third-party provision in the regulation. It requalifies any deployer, distributor, or third party as a provider of a high-risk AI system if they: put their name or trademark on a system already on the market, make a substantial modification, or change the intended purpose in a way that pushes the system into high-risk territory.
This catches more organisations than they realise. White-labelling a vendor’s AI under your brand makes you the provider. Fine-tuning a foundation model for a high-risk use case makes you the provider. Deploying a general-purpose AI system in a high-risk context it wasn’t designed for makes you the provider. The original vendor must cooperate with you, but the provider obligations (conformity assessment, CE marking, EU database registration, technical documentation, post-market monitoring) sit with you.
Article 25(4) adds the contractual layer: providers of high-risk systems and the third parties who supply tools, components, or processes integrated into them must, by written agreement, specify the information, technical access, and assistance needed for compliance. Standard SaaS contracts don’t cover this. Most templates need to be rewritten.
NIST AI RMF: third-party risk as a measurement function
The NIST AI Risk Management Framework treats third-party AI through its Manage and Measure functions. Manage requires documented procedures for assessing, monitoring, and responding to risks from external AI components. Measure requires methods to evaluate whether the third-party system performs as expected and within acceptable parameters across its lifecycle. Unlike ISO 42001, NIST AI RMF isn’t certifiable, but US federal procurement, financial services regulators, and several state AI laws are increasingly aligning their expectations to it. Mapping your vendor AI controls to NIST functions is how you stay portable across US regulatory regimes.
What an AI Vendor Risk Assessment Should Actually Ask
Most vendor questionnaires fail on AI because they import security questions wholesale and add a few token AI items at the end. The result is 200 questions that miss the actual risks. A defensible AI-specific assessment narrows the focus to areas where third-party AI behaves differently from traditional software.
Eight categories cover the ground that matters:
- Model provenance and architecture. What model is in use, who built it, what architecture, and what version. Foundation model dependencies must be disclosed because the vendor’s risk inherits from upstream model providers.
- Training data and data lineage. Where did the training data come from, what consent or licensing applies, is your data used to train models, and how is data segregated between customers.
- Bias, fairness, and evaluation. What evaluation methodology is used, what fairness metrics are measured, on what populations, and what’s the result. “We tested for bias” is not an answer; “we measured demographic parity across these five protected categories with these results” is.
- Human oversight design. Where in the vendor’s system can a human review, override, or contest an output. For high-tier systems, this maps directly to EU AI Act Article 14 obligations.
- Model monitoring and drift. How does the vendor detect model degradation, what triggers retraining, how are customers notified of material model changes, and what’s the rollback capability.
- Security controls specific to AI. Prompt injection defences, model inversion protections, training data poisoning controls, and inference-time access logging.
- Compliance posture and certifications. ISO 42001 certification status, SOC 2 Type II with AI-specific scope, alignment with NIST AI RMF, and any sector-specific evidence (HIPAA, FedRAMP, PCI DSS).
- Incident response and disclosure. What constitutes an AI-specific incident in the vendor’s view, what’s the disclosure SLA, and who has authority to declare an incident.
Score responses against weighted criteria tied to the vendor’s risk tier. Demand evidence (model cards, audit reports, evaluation results), not just attestation. The vendors who can produce evidence are the ones you can defend if a regulator comes asking.
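One way to make "evidence, not just attestation" operational is to discount answers that arrive without supporting documents, with the discount growing as the tier rises. The sketch below is a minimal illustration. The credit values, the weights a caller passes in, and the response format are all hypothetical, not taken from any framework.

```python
# Hypothetical credit given to an answer that has no supporting evidence.
ATTESTATION_CREDIT = {"low": 1.0, "medium": 0.5, "high": 0.0}


def assessment_score(responses, weights, tier):
    """Weighted score in [0, 1] for one vendor assessment.

    responses: {category: (rating between 0 and 1, has_evidence)}
    weights:   {category: relative weight}; unanswered categories score zero.
    tier:      "low", "medium", or "high".
    """
    total_weight = sum(weights.values())
    earned = 0.0
    for category, weight in weights.items():
        rating, has_evidence = responses.get(category, (0.0, False))
        credit = 1.0 if has_evidence else ATTESTATION_CREDIT[tier]
        earned += weight * rating * credit
    return earned / total_weight
```

Under these assumed values, a high-tier vendor earns nothing for an unsupported claim, while a low-tier vendor can still pass on self-attestation, which mirrors the tiered depth described earlier.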
Rewriting Contracts for AI Risk
If procurement is downstream of design, contracting is where you reclaim leverage. Standard vendor templates were written for predictable software. They don’t address the failure modes that actually matter for AI.
Six clauses separate AI-aware contracts from legacy ones:
- AI use disclosure with notification triggers. Any addition, modification, or material change to AI features requires advance written notice.
- Data use restrictions. Explicit prohibition on using customer data for training without separate consent, with audit rights to verify.
- Model change governance. Material changes that affect output behaviour, accuracy, or compliance posture trigger a documented review before deployment.
- Audit and evidence rights. The right to receive model cards, evaluation results, and independent audit reports on a defined cadence.
- Incident response and disclosure. AI-specific incident definitions and time-bound disclosure obligations that match your regulatory reporting clocks.
- Liability and indemnification. Clear allocation for outputs that breach regulation, infringe third-party rights, or cause downstream harm.
| PRACTICAL DETAIL: The EU AI Act Article 25(4) cooperation clause is the easiest one to drop into a contract today. It requires the vendor to provide the information, technical access, and assistance you need to comply with provider obligations if your use of their system pulls you into provider territory. Most vendors will accept this language because it’s effectively required by regulation. Use it as a wedge for the harder clauses. |
From Point-in-Time to Continuous Vendor Monitoring
Annual reassessment doesn’t survive contact with AI. A vendor that scored “low risk” in January may have fine-tuned its model on a new dataset, added a generative feature, or replaced its foundation model provider by March. The EY 2025 Global TPRM Survey found that 64% of organisations now monitor their vendors’ vendors, a level of visibility that would have been impossible a few years ago. The shift from periodic to continuous is happening because the risk surface itself is continuous.
Continuous monitoring for vendor AI doesn’t require boiling the ocean. Four signal layers cover most of the value:
- Vendor-published changes. Subscribe to release notes, model cards, and trust centre updates. Track them automatically; flag anything that mentions model architecture, training data, or compliance posture.
- Regulatory and certification status. Watch ISO 42001 certification registers, SOC 2 attestation refreshes, and regulatory enforcement actions. A loss of certification is a leading indicator.
- Operational telemetry. Output quality, latency, and error rates from the vendor’s API tell you when a model has shifted before the vendor admits it. Establish baselines at onboarding and alert on deviation.
- External signals. Breach notifications, security advisories, and credible industry reports. One in three breaches now involves a third party; treat external signals as part of your control set, not external noise.
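The operational telemetry layer can start very simply. The sketch below records a baseline for one metric at onboarding and flags later readings that fall too far from it, using a z-score. The three-standard-deviation threshold and the function names are assumptions; a real programme would tune the threshold per metric.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise onboarding measurements as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def deviates(value, baseline, z_threshold=3.0):
    """True if a new reading sits more than z_threshold deviations from baseline."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Example: API latency in milliseconds measured during onboarding.
latency_baseline = build_baseline([100, 102, 98, 101, 99])
print(deviates(101, latency_baseline))  # False
print(deviates(120, latency_baseline))  # True
```

The same two functions work for error rates or any other numeric quality measure, so one baseline per metric per vendor is enough to get the first alerts running.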
This is where Govern365.ai’s compliance dashboards earn their place: aggregating vendor evidence, certification status, and risk register entries into a single view tied to ISO 42001 controls and EU AI Act articles. The goal isn’t more dashboards. It’s having one view your auditor can walk through that proves continuous oversight rather than periodic assertion.
Building the Operating Model: Who Owns What
Programmes fail on ownership before they fail on tooling. AI governance touches procurement, security, legal, data privacy, and the business owner of each AI use case. Without explicit RACI, every team assumes someone else is doing the assessment, and nobody is.
A workable model assigns four primary responsibilities. The AI governance lead owns the policy, the framework alignment, and the control catalogue. Procurement owns the gate: no vendor onboards without governance sign-off proportional to tier. Security owns the technical due diligence on data handling, access controls, and AI-specific threat models. The business owner, the person whose process the AI runs, owns the outcomes: bias monitoring, output quality, and the decision to keep using the system. Legal supports each of these but doesn’t own the programme; that’s a common failure mode where the programme becomes a contract review queue rather than a risk function.
For C-suite sponsors, the question to ask in board reviews is not “how many vendors did we assess this quarter.” It’s “what’s our coverage rate by risk tier, and what’s our mean time from AI feature deployment to governance approval.” Both metrics tell you whether the programme is keeping pace with the business.
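Both board metrics can be computed directly from the vendor inventory. The sketch below assumes a hypothetical record shape: a (tier, assessed) pair per vendor for coverage, and a (deployed, approved) date pair per AI feature for time to approval.

```python
from collections import defaultdict
from datetime import date


def coverage_by_tier(vendors):
    """Share of vendors assessed, per tier. vendors: iterable of (tier, assessed)."""
    totals = defaultdict(int)
    assessed = defaultdict(int)
    for tier, is_assessed in vendors:
        totals[tier] += 1
        assessed[tier] += int(is_assessed)
    return {tier: assessed[tier] / totals[tier] for tier in totals}


def mean_days_to_approval(features):
    """Mean days from feature deployment to governance approval.

    features: iterable of (deployed, approved) date pairs. Returns None if empty.
    A negative value means approval landed before deployment.
    """
    gaps = [(approved - deployed).days for deployed, approved in features]
    return sum(gaps) / len(gaps) if gaps else None


print(coverage_by_tier([("high", True), ("high", False), ("low", True)]))
# {'high': 0.5, 'low': 1.0}
print(mean_days_to_approval([
    (date(2026, 1, 1), date(2026, 1, 11)),
    (date(2026, 2, 1), date(2026, 2, 21)),
]))
# 15.0
```

A high-tier coverage rate below 1.0 is the number to raise first, since those are the vendors with the greatest exposure.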
A 90-Day Programme to Stand This Up
Governance programmes that try to do everything at once stall. A phased approach gets the first defensible posture in place quickly, then layers depth.
Days 1 to 30: Visibility
Build the inventory. Combine procurement records, contract data, network telemetry, and a survey to every vendor asking explicitly about AI use. Tier each system using the three-dimension model. Identify the high-tier vendors that need immediate attention. Don’t try to assess everything; just know what’s there.
Days 31 to 60: Controls
Stand up the AI vendor questionnaire and assessment workflow. Roll it out to high-tier vendors first. Update the master services agreement and renewal templates with the six AI clauses. Establish the RACI. Connect the inventory to the AI risk register so vendors and risks are tracked in the same system, not parallel spreadsheets.
Days 61 to 90: Continuous oversight
Activate continuous monitoring signals. Map vendor controls to ISO 42001 Annex A and the EU AI Act articles relevant to your portfolio. Run the first management review with a real dataset: coverage by tier, control gaps, vendor-driven incidents, and remediation timelines. From day 91, the programme should be feeding evidence continuously rather than reconstructing it for each audit.
Frequently Asked Questions
Are we liable for what our AI vendor’s model does?
Yes, in most regulatory regimes. The EU AI Act, ISO 42001, and emerging US state laws assign accountability based on functional role, not vendor relationship. If you deploy the system, modify it, or rebrand it, the obligations move to you. Contracts can allocate cost and indemnification, but they cannot eliminate the regulatory obligation. Treat vendor AI as an extension of your own AI inventory for compliance purposes.
How do we handle vendors who refuse to disclose their AI use?
Treat refusal as a risk signal. Reasonable vendors disclose AI use and the controls around it; vendors who refuse are signalling either immature governance or known issues. For high-tier categories, refusal should be a procurement gate failure. For lower-tier categories, document the refusal in your risk register, restrict the data the vendor can access, and accelerate replacement. Persistent non-disclosure has become a common audit finding under ISO 42001.
Does ISO 42001 certification cover our vendors automatically?
No. Your certification covers your AI management system, including how you govern third parties. Vendors aren’t certified by your audit; they have to certify independently if they want to claim ISO 42001 alignment. What your certification does require is evidence that you’ve assessed and are managing the risk those vendors introduce. Annex A.10 controls are tested directly against vendor evidence during certification audits.
How is AI vendor risk different from traditional SaaS risk?
Three differences matter most: outputs are probabilistic rather than deterministic, models change without traditional release cycles, and training data introduces compliance exposure that doesn’t exist in standard software. A SaaS vendor’s behaviour is bounded by code; an AI vendor’s behaviour is bounded by data and weights you can’t see. Traditional vendor questionnaires don’t surface these dimensions, which is why AI-specific assessment categories are required.
What’s the minimum we need before our next audit?
A current inventory of AI vendors with risk tiers, evidence of due diligence proportional to tier, an AI risk register with vendor-specific entries and owners, and contractual evidence that key clauses (disclosure, data use, change governance, incident response) are in place for high-tier vendors. Without these four artefacts, ISO 42001 audits and EU AI Act compliance reviews become significantly harder to defend.
How do we monitor model drift in a vendor’s system we don’t control?
You monitor outputs, not weights. Establish quality baselines at onboarding — accuracy, latency, output distribution on representative inputs. Track those baselines continuously and alert on material drift. Combine output telemetry with vendor-published change logs and SLA reports. If your vendor’s contract doesn’t include change notification clauses, telemetry is your only signal — which is why those clauses matter.
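For the output-distribution part of that baseline, one commonly used measure is the population stability index (PSI), which compares the share of each output category at onboarding with the share observed now. The sketch below is an illustration under assumptions: the category labels are hypothetical, and the 0.25 alert level is a widely quoted rule of thumb, not a requirement of any framework discussed here.

```python
import math


def population_stability_index(baseline_counts, current_counts, eps=1e-6):
    """PSI between two categorical output distributions given as {label: count}.

    eps stands in for a zero share so the logarithm stays defined.
    """
    categories = set(baseline_counts) | set(current_counts)
    baseline_total = sum(baseline_counts.values())
    current_total = sum(current_counts.values())
    psi = 0.0
    for category in categories:
        b = max(baseline_counts.get(category, 0) / baseline_total, eps)
        c = max(current_counts.get(category, 0) / current_total, eps)
        psi += (c - b) * math.log(c / b)
    return psi


# Hypothetical example: a vendor's decisioning model on the same representative inputs.
at_onboarding = {"approve": 80, "reject": 20}
this_month = {"approve": 50, "reject": 50}
if population_stability_index(at_onboarding, this_month) > 0.25:
    print("Material drift: raise with the vendor")
```

Running the same representative inputs through the vendor's API on a schedule keeps the comparison fair, because any shift then reflects the model rather than a change in your own traffic.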
Closing the Loop
Third-party AI is now where most of your AI risk lives. The MIT Sloan and BCG data make that empirical: more failures come from vendor systems than from anything you built yourself. The frameworks have caught up. ISO 42001 makes supplier governance a certifiable control. The EU AI Act’s Article 25 reassigns provider obligations to anyone who substantially shapes how a vendor’s AI gets used. NIST AI RMF treats third-party dependencies as a core measurement function. The board-level question isn’t whether to build a programme. It’s whether yours can produce evidence on demand.
The fastest path forward: get the inventory right, tier ruthlessly, write contracts that match the technology, and move from periodic reassessment to continuous oversight. The organisations doing this well are turning what used to be a quarterly fire drill into a steady-state capability.
Govern365.ai, by the Global AI Certification Council, gives compliance and GRC teams a single platform to map vendor AI systems to ISO 42001 controls, EU AI Act articles, and NIST AI RMF functions, with continuous evidence collection built in. Start your 14-day free trial and see how the platform turns vendor governance from an annual scramble into an audit-ready operating rhythm.
