Red Team vs Purple Team vs Pentest vs Adversary Simulation — Clear Definitions

Published May 19, 2026 · By AxVeil Red Team · 18 min read

Walk into any security trade show and you will hear "red team" used to describe a network vulnerability scan, "purple team" used to describe two consultants in a Zoom call, and "adversary simulation" used as a marketing synonym for whatever is in last quarter's deck. The terms are real, the techniques behind them are mature, and the cost differences between engagements span more than an order of magnitude. If a CFO is being asked to sign a quote that says "red team" on it, they deserve to know whether the vendor is actually proposing a four-week, intel-led, stealth kill-chain operation or a two-week vulnerability sweep with a more expensive cover page. This post is the reference we point clients at when scoping conversations start drifting into vocabulary fog. It extends our earlier red team vs pentest comparison with the rest of the modern engagement family — purple team, adversary simulation, tabletop, and bug bounty — and tells you which one your programme should be buying this quarter.

TL;DR matrix

Engagement	Question it answers	Time	Cost (USD)
Penetration test	Is this asset vulnerable?	1-3 weeks	8-40k
Red team operation	Can a real adversary win, and would we notice?	4-12 weeks	50-300k
Purple team exercise	Does our SOC actually detect technique X?	1-5 days per cycle	10-60k
Adversary simulation / emulation	Would APT<n> succeed against us?	2-6 weeks	40-200k
Tabletop / breach simulation	Will leadership make the right calls under stress?	2-4 hours	5-25k
Bug bounty	What can a global crowd find on us continuously?	Continuous	20-500k+/yr

The vocabulary problem — and why it costs you money

The security industry has a long history of relabelling. Boutique pentest shops added "red team" to their pricelists in the mid-2010s without changing what they delivered, because boards started asking for it. Tooling vendors invented "continuous red teaming" for products that were closer to breach-and-attack simulation. Compliance frameworks then enshrined the loose vocabulary in their text, so an auditor will ask for "a red team report" and accept what is functionally a pentest, while a regulator on the next floor will reject the same document for not being a red team at all.

The cost of this fuzziness is real. We have seen a regional bank pay USD 220,000 for a "red team" that turned out to be three external pentests stitched into one PDF, with no kill-chain narrative and no detection-gap matrix. They had to commission the actual red team a quarter later when their regulator pushed back. Vocabulary discipline is a procurement control.

Penetration test — defined

A penetration test is a time-boxed, scope-defined offensive assessment of a known set of assets. The tester is told what to attack (this URL, this IP range, this mobile app binary, this AWS account), is given a testing window, and is expected to enumerate and exploit vulnerabilities against a methodology: the OWASP Web Security Testing Guide (WSTG) for web, OWASP MASVS for mobile, CIS Benchmarks and provider security pillars for cloud, and NIST SP 800-115 as a generic frame. The deliverable is a list of findings, each with a CVSS or risk rating, reproduction steps, and a remediation recommendation.

The defining trade-off in a pentest is depth versus breadth inside a fixed budget. A three-week web pentest on a single application will go deeper into business logic and authorisation flaws than a three-week pentest spread across a hundred subdomains. The scoping document is therefore the most consequential artefact in a pentest engagement; see our pentest scoping checklist for the questions worth asking before a vendor quotes.

Sample pentest finding

Title: IDOR on /api/v2/invoices/<id> allows cross-tenant invoice retrieval.
Severity: High (CVSS 8.1).
Reproduction: Authenticate as tenant A; issue GET /api/v2/invoices/9831 (an invoice belonging to tenant B); server returns 200 with JSON body of tenant B's invoice including PII fields.
Remediation: Enforce tenant ID match in the authorisation middleware, not in the database query alone. Add an automated test asserting cross-tenant requests return 404.

That is the shape of a pentest finding. Specific, reproducible, actionable in one sprint. Pentests are not designed to model what an adversary would do after they have exfiltrated those invoices — that is a different engagement.
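The remediation in the sample finding can be sketched in a few lines. This is a minimal, framework-free illustration, not the code of any real application: `Invoice`, `INVOICES`, and `get_invoice` are hypothetical names, and the in-memory dictionary stands in for the real data store. It shows the two things the finding asks for, an explicit tenant ownership check in the request path and an automated test asserting that a cross-tenant request returns 404.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    tenant_id: str
    amount_cents: int


# Hypothetical in-memory store standing in for the invoices table.
INVOICES = {
    9830: Invoice(9830, "tenant-a", 12_500),
    9831: Invoice(9831, "tenant-b", 99_000),
}


def get_invoice(caller_tenant_id: str, invoice_id: int) -> Tuple[int, Optional[Invoice]]:
    """Return (HTTP status, invoice) after an explicit tenant ownership check."""
    invoice = INVOICES.get(invoice_id)
    # Missing and foreign invoices get the same 404, so the response
    # does not reveal which invoice IDs exist in other tenants.
    if invoice is None or invoice.tenant_id != caller_tenant_id:
        return 404, None
    return 200, invoice


def test_cross_tenant_request_returns_404() -> None:
    # Tenant A asks for invoice 9831, which belongs to tenant B.
    status, body = get_invoice("tenant-a", 9831)
    assert status == 404 and body is None


def test_same_tenant_request_returns_200() -> None:
    status, body = get_invoice("tenant-a", 9830)
    assert status == 200 and body is not None and body.tenant_id == "tenant-a"
```

In a real service the same check would sit in the authorisation middleware, with the caller's tenant ID taken from the authenticated session rather than passed in as an argument.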

Red team operation — defined

A red team operation is intent-driven, multi-vector, and time-bound. The operators are given an objective in business language — "exfiltrate the crown-jewel customer database undetected within thirty days", "sign a fraudulent SWIFT transaction from the treasury network", "modify a clinical trial dataset and walk it out without triggering a SIEM ticket" — and the choice of vector is the operators'. The blue team is not informed. Stealth is part of the success criteria. The operation is governed by a written Rules of Engagement document, a Traffic Light Protocol (TLP) marking on every artefact, and a white cell of trusted agents inside the customer who can vouch for the team if a SOC analyst flags activity and an executive demands a response.

The deliverable from a red team is not a list of bugs. It is a narrative kill chain mapped to MITRE ATT&CK, with a detection-gap matrix for every technique used and replayable artefacts the customer's SOC can use to validate any new detection content they write. A vendor that hands you a pentest-style findings list with the word "red" on the cover has not run a red team. Our red team service page documents the deliverable shape we contract to.

Sample objective, as it would read in the engagement letter: "Within a thirty-day operational window starting 1 August 2026, demonstrate the ability to exfiltrate the production customer database (datawarehouse-prod-1, schema customer) to an attacker-controlled S3 endpoint. Document every technique attempted, every detection raised by the blue team, and every detection that should have raised but did not. Stop conditions: any action that would damage data integrity, any action against named out-of-scope assets, any action that risks regulatory reportable customer impact."

Purple team exercise — defined

A purple team exercise is a collaborative offensive-defensive workshop. The red side executes a single ATT&CK technique at a time; the blue side watches the SIEM, EDR, and identity logs to see what was raised. If nothing was raised, the two teams write or tune a detection rule, redeploy it, and replay the technique. The point is to convert detection coverage into a measurable metric instead of a screenshot. A mature purple team programme reports ATT&CK coverage as a percentage, broken down by tactic, with a trendline over time. A small SOC can routinely move from 30% coverage to north of 70% on the techniques their threat model demands inside twelve months of disciplined monthly purpling.
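To make "coverage as a percentage, broken down by tactic" concrete, here is a minimal sketch of the arithmetic. The `RESULTS` rows are illustrative sample data, not figures from any engagement, and the function names are our own for this example; a real programme would pull these rows from its purple team tracking sheet.

```python
from collections import defaultdict

# Illustrative purple team results: (tactic, technique ID, detection fired?)
RESULTS = [
    ("credential-access", "T1558.003", True),
    ("credential-access", "T1003.001", False),
    ("lateral-movement", "T1021.002", True),
    ("lateral-movement", "T1021.001", True),
    ("persistence", "T1053.005", False),
]


def coverage_by_tactic(results):
    """Return {tactic: percent of tested techniques that raised a detection}."""
    tested = defaultdict(int)
    detected = defaultdict(int)
    for tactic, _technique, was_detected in results:
        tested[tactic] += 1
        if was_detected:
            detected[tactic] += 1
    return {t: round(100 * detected[t] / tested[t], 1) for t in tested}


def overall_coverage(results):
    """Percent of all tested techniques that raised a detection."""
    if not results:
        return 0.0
    hits = sum(1 for _tactic, _technique, was_detected in results if was_detected)
    return round(100 * hits / len(results), 1)
```

Recording `overall_coverage(RESULTS)` after each cycle gives the trendline; `coverage_by_tactic(RESULTS)` shows where the next cycle should focus.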

Sample technique walk-through: T1558.003 Kerberoasting

Red operator runs Rubeus from a domain-joined workstation requesting service tickets for accounts with SPNs set: Rubeus.exe kerberoast /outfile:hashes.txt.
Blue checks the SIEM. Was Event ID 4769 logged with encryption type 0x17 (RC4)? Was the source workstation flagged as anomalous for that user? Did the EDR raise on the Rubeus binary or its in-memory equivalent?
If the logs are missing or the alert did not fire, the two teams write a Sigma rule keyed on the 4769/RC4 anomaly and a behavioural rule on rapid sequential service ticket requests from a single source.
Red replays. Blue confirms the new detection fires inside the SOC ticketing workflow, not just the raw SIEM. Coverage matrix updated, next technique selected.

Repeat across the tactics that matter for your threat model. Twelve cycles, twelve months, measurable lift.
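The behavioural rule from the walk-through (rapid sequential RC4 service ticket requests from a single source) can be prototyped before it is written in Sigma or KQL. The sketch below assumes events have already been parsed into dictionaries with the simplified keys `event_id`, `encryption_type`, `source`, `service`, and `time`; those key names, the threshold, and the window are assumptions for illustration, not the field names of any particular SIEM.

```python
from collections import defaultdict
from datetime import datetime, timedelta

RC4_HMAC = "0x17"  # ticket encryption type named in the walk-through


def kerberoast_suspects(events, threshold=5, window=timedelta(minutes=2)):
    """Return sources that requested RC4 tickets for at least `threshold`
    distinct services within any period no longer than `window`."""
    by_source = defaultdict(list)
    for e in events:
        if e["event_id"] == 4769 and e["encryption_type"] == RC4_HMAC:
            by_source[e["source"]].append((e["time"], e["service"]))

    suspects = set()
    for source, requests in by_source.items():
        requests.sort()
        start = 0
        for end in range(len(requests)):
            # Slide the left edge until the span fits inside the window.
            while requests[end][0] - requests[start][0] > window:
                start += 1
            services = {svc for _time, svc in requests[start:end + 1]}
            if len(services) >= threshold:
                suspects.add(source)
                break
    return suspects
```

The threshold and window are tuning knobs: set them from a baseline of normal ticket request rates in your own environment, then have red replay the technique to confirm the rule still fires.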

Adversary simulation / emulation — defined

Adversary simulation (or adversary emulation — the terms are used interchangeably in most procurement documents, though purists distinguish them) is a structured exercise in which the operators faithfully reproduce the documented TTPs of a specific named threat actor. MITRE's open library of adversary emulation plans provides published profiles for APT29, FIN6, FIN7, OilRig, menuPass, and others, each one mapped technique-by-technique to the actor's public reporting.

The distinction from a red team is operational intent. A red team picks any path to an objective; an adversary simulation deliberately constrains the operators to the specific TTPs of one actor, so the resulting report tells you "our detection posture against APT29 specifically is X percent — here is the gap list". This is valuable because boards and regulators often ask exactly that question after a high-profile breach: "could the Lazarus playbook that hit Bank Y succeed here?". Pair this engagement type with our Lazarus TTP analysis for a worked example of what a profile-driven simulation looks like in practice. Our adversary simulation service ships with profiles for APT29, FIN7, Lazarus, Volt Typhoon, and Scattered Spider, all mapped to current ATT&CK.
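The "X percent, here is the gap list" output reduces to a set comparison between the actor profile and what the SOC actually detected. A minimal sketch follows; `actor_gap_report` is a name invented for this example, and the technique lists in the usage note are placeholders rather than a real APT29 profile.

```python
def actor_gap_report(profile, detected):
    """Compare an actor profile's techniques with those the SOC detected.

    Detections for techniques outside the profile are ignored, because the
    report answers a question about this one actor only.
    """
    profile_set = set(profile)
    caught = profile_set & set(detected)
    gaps = sorted(profile_set - caught)
    percent = round(100 * len(caught) / len(profile_set), 1) if profile_set else 0.0
    return {"coverage_percent": percent, "gaps": gaps}
```

For example, with a placeholder profile of `["T1566.001", "T1059.001", "T1558.003", "T1021.002"]` and detections on `["T1059.001", "T1021.002", "T1110"]`, the function reports 50.0 percent coverage and lists T1558.003 and T1566.001 as gaps.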

Tabletop / breach simulation — defined

A tabletop is a facilitated, conversation-based exercise — no live offensive activity, no production systems touched. The facilitator delivers a scenario opening and a sequence of injects to the leadership team, who talk through the decisions they would make. The point is to surface policy gaps, runbook gaps, and unclear decision rights before a real incident reveals them under time pressure. See our full tabletop templates for CISOs post for the universal frame and five worked scenarios.

A tabletop is not a substitute for offensive testing — it does not validate detection or controls — but it is the highest-leverage way to test the human layer of the incident response stack. The cheapest tabletop is a half-day facilitated workshop with the leadership team in a room and a printed inject sheet; the most expensive is a multi-day breach simulation with parallel injects to legal, communications, customer success, and the board, and a written report inside a week.

Bug bounty — defined

A bug bounty programme is continuous, crowdsourced, and pay-per-result. The customer publishes a scope and a reward table; independent researchers from anywhere in the world test the scope and submit findings; the customer triages and pays for valid ones. Programmes are public (anyone can join) or private (invite-only on a platform like HackerOne, Bugcrowd, Intigriti, or YesWeHack).

Bug bounty is excellent at breadth and at long-tail discovery — researchers with very different specialities will find very different bugs over a year of programme operation. It is poor at depth on complex business logic, at compliance-driven scope coverage where you need a signed report on a specific date, and at testing the same asset twice in quick succession (researchers self-select toward novel targets). For the full trade-off, see our bug bounty vs pentest comparison and our security/bug-bounty page describing how AxVeil works with researchers.

Comparison table — the long form

Dimension	Pentest	Red team	Purple team	Adversary sim	Tabletop	Bug bounty
Cost (USD)	8-40k	50-300k	10-60k	40-200k	5-25k	20-500k+/yr
Duration	1-3 wks	4-12 wks	1-5 days/cycle	2-6 wks	Hours	Continuous
Deliverable	Findings list	Kill-chain narrative + gap matrix	Coverage report + Sigma/KQL	Actor-profile gap report	Hot-wash + action items	Submission stream
Who's in scope	Named assets	People + process + tech	SOC + tooling	Detection stack	Leadership	Published scope
Blue team awareness	Often informed	Not informed	Fully participating	Usually informed	N/A	Triage team aware
Prerequisite maturity	Low	High	Medium	Medium-High	Low	Medium
Frequency	Quarterly / yearly	1-2x / year	Monthly	1-2x / year	Quarterly	Continuous

Maturity ladder — which engagement at which stage

Engagement type should track security maturity. Buying an engagement two rungs above your current level produces an expensive report that nobody can action; buying one two rungs below leaves the obvious tier of risk untouched. The ladder we use in programme reviews:

Stage 0 — no security programme yet. Engagement: tabletop and a scoped external pentest. Goal: surface the embarrassing bugs and get leadership fluent in incident vocabulary before spending bigger numbers.
Stage 1 — EDR deployed, SIEM stood up, basic IR playbooks. Engagement: quarterly pentest plus monthly purple team. Goal: ratchet detection coverage on the techniques your threat model demands and patch the easy bugs in parallel.
Stage 2 — 24x7 SOC, mature detection engineering, post-incident process. Engagement: adversary simulation against a named actor profile relevant to your sector. Goal: stress-test the detection stack against a coherent playbook, not just isolated techniques.
Stage 3 — board-level cyber risk programme, regulator expectations. Engagement: annual red team operation with TLP-WHITE executive readout and a detection-gap matrix that feeds the next year's detection engineering backlog. Goal: validate the whole programme end-to-end.
Stage 4 — public-facing product, large attack surface, security as a brand differentiator. Engagement: continuous bug bounty layered on top of all the above. Goal: continuous adversarial coverage between scheduled engagements.

The most common scoping mistake in our practice is a Stage-1 organisation buying a red team because the board asked for one. The right response is to run the pentest and the purple team for two cycles, evidence the lift, and then propose the red team in the next budget cycle with concrete prerequisites met.
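The ladder can be expressed as a small decision function, which is useful as a sanity check during scoping. This is a sketch under stated assumptions: the `Programme` fields are hypothetical yes/no simplifications of the stage descriptions above, and treating the stages as strictly cumulative is our simplification, not a formal maturity model.

```python
from dataclasses import dataclass


@dataclass
class Programme:
    # Hypothetical yes/no capability flags, one group per stage.
    has_edr: bool = False
    has_siem: bool = False
    has_ir_playbooks: bool = False
    has_24x7_soc: bool = False
    has_detection_engineering: bool = False
    has_board_risk_programme: bool = False
    has_large_public_attack_surface: bool = False


RECOMMENDED = {
    0: "tabletop + scoped external pentest",
    1: "quarterly pentest + monthly purple team",
    2: "adversary simulation against a named actor profile",
    3: "annual red team operation",
    4: "continuous bug bounty on top of the above",
}


def stage(p: Programme) -> int:
    """Return the highest stage whose prerequisites, and all earlier ones, are met."""
    if not (p.has_edr and p.has_siem and p.has_ir_playbooks):
        return 0
    if not (p.has_24x7_soc and p.has_detection_engineering):
        return 1
    if not p.has_board_risk_programme:
        return 2
    if not p.has_large_public_attack_surface:
        return 3
    return 4


def recommend(p: Programme) -> str:
    """Map a programme to the engagement the ladder suggests for its stage."""
    return RECOMMENDED[stage(p)]
```

An organisation with EDR, SIEM, and basic playbooks but no 24x7 SOC lands on stage 1, so `recommend` returns the pentest plus purple team pairing even if the board has asked for a red team.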

Pitfalls

Red teaming an immature organisation

A red team against a programme with no SIEM, no EDR, and no IR playbooks produces a report that says "we got in via three vectors, took the customer database in six hours, and were never detected". The report is correct. It is also useless, because the action items reduce to "build the security programme you do not yet have", which is what a maturity assessment at one-tenth the cost would have told you. The hidden cost is worse: a Stage-0 organisation that has paid for a red team often spends the next year reacting to the findings instead of building the foundational detection capability they should have built first.

Purple teaming without a telemetry baseline

A purple team exercise with no SIEM is a workshop in which the red side runs techniques and the blue side has no instrumentation to see them. The output is a list of detections you should write, with no platform to write them in. Spend the budget on the SIEM and on EDR first, get one quarter of telemetry flowing, then start the purple programme. The technique catalogue is open source and the time spent studying it is free; the consultancy spend should follow, not precede, the platform spend.

Vocabulary leakage in procurement

If the statement of work uses "red team" but the deliverable section reads "findings list with CVSS scores", you are paying red-team rates for a pentest. Fix it before signing. Insist that the deliverable section explicitly includes a kill-chain narrative, an ATT&CK-mapped detection gap matrix, and replayable artefacts. If the vendor pushes back, you have learned something important about the engagement before money changed hands.
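A quick way to apply this check during procurement review is a keyword scan of the deliverables section. The sketch below is deliberately crude and only flags candidates for a human to read: `REQUIRED_DELIVERABLES` and `missing_deliverables` are names made up for this example, and the phrase variants are assumptions about how a statement of work might word each item.

```python
# Each required red team deliverable, with phrasings that count as a mention.
REQUIRED_DELIVERABLES = {
    "kill-chain narrative": ("kill-chain narrative", "kill chain narrative"),
    "ATT&CK-mapped detection gap matrix": ("detection gap matrix", "detection-gap matrix"),
    "replayable artefacts": ("replayable artefacts", "replayable artifacts"),
}


def missing_deliverables(deliverables_text: str) -> list:
    """Return the required red team deliverables not mentioned in the SOW text."""
    text = deliverables_text.lower()
    return [
        name
        for name, phrases in REQUIRED_DELIVERABLES.items()
        if not any(phrase in text for phrase in phrases)
    ]
```

A deliverables section that reads "findings list with CVSS scores" comes back with all three items missing, which is the signal to raise the question with the vendor before signing.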

Conflating bug bounty with assurance

A live bug bounty programme is not equivalent to a pentest from a regulator's point of view, and a quiet bounty programme (no severe findings in the last quarter) is not evidence of security. It might equally mean researchers are off testing more lucrative targets. Pair the bounty with scheduled pentest or red team coverage and ask your bounty platform for triage SLAs and researcher-engagement metrics; quiet is not the same as safe.

How AxVeil structures each engagement

Each engagement family in our practice is delivered against a documented playbook so scoping conversations are about your threat model, not our internal process. Brief summaries below; deep-dives on the service pages.

Pentest / VAPT. Scoped offensive assessment with OWASP-aligned methodology, fixed-window delivery, and a signed report your auditor will accept. See /services/vapt and the VAPT vs penetration testing explainer for the terminology distinction.
Adversary simulation. TTP-faithful exercises against named actor profiles (APT29, FIN7, Lazarus, Volt Typhoon, Scattered Spider). MITRE-aligned technique coverage report, Sigma / KQL artefacts shipped on close-out. See /services/adversary-simulation.
Red team. Objective-driven operations under TIBER-EU, CBEST, iCAST, AASE, or bespoke frame, depending on regulator. Multi-vector kill chain, white cell coordination, narrative deliverable with executive readout. See /services/red-team and the TIBER-EU framework explainer for the regulated-finance variant.
Bug bounty. Programme design, scope definition, triage integration with your engineering backlog, researcher engagement and payout policy. See /security/bug-bounty and the bug bounty vs pentest decision guide.

Tabletop and purple team are typically delivered as fixed-scope workshops attached to an existing engagement: quarterly purple bundles ride on top of a VAPT retainer, and tabletops are usually a half-day add-on to a red team close-out so leadership can walk through the findings under simulated time pressure.

FAQ

Is adversary simulation the same as a red team engagement?

They overlap but they are not identical. A red team operation is intent-driven (achieve an objective like exfiltrating the customer DB) and stealth is part of the success criteria. An adversary simulation (sometimes called adversary emulation) is technique-driven — the operator faithfully reproduces the documented TTPs of a specific actor such as APT29 or FIN7, often with the blue team aware, to test whether the detection stack catches that actor's playbook. A red team can borrow adversary-emulation profiles, but you can run an emulation exercise without the stealth or end-to-end kill chain a red team would attempt.

Can a purple team replace a red team?

No, they answer different questions. A purple team measures detection coverage one technique at a time, with the blue team participating, so it is a high-throughput way to improve the SIEM and EDR. A red team measures whether the whole programme — people, process, and tooling — would catch a determined adversary chaining many techniques together under real conditions. A mature programme runs both: monthly or quarterly purple to ratchet coverage up, annual red to validate the result holds end-to-end.

Do we need a pentest if we already have a bug bounty?

Yes, for most organisations. Bug bounty is excellent for breadth and long-tail discovery but it is poor at depth on complex business logic, compliance-driven scope coverage, and time-boxed deliverables your auditor expects. A pentest delivers a signed report against a defined scope inside a known window; bounty researchers self-select what looks lucrative. See our dedicated comparison in the bug-bounty-vs-pentest post for the trade-offs and where the two complement each other.

What is the cheapest engagement that still satisfies SOC 2 or ISO 27001?

A scoped external penetration test on production assets, run annually by a qualified third party with a signed report, is the floor most auditors accept. Internal pentests and red team engagements are nice-to-have for those frameworks but not mandatory. PCI DSS v4 is stricter: internal penetration testing is explicitly required, as is segmentation testing where segmentation is used to reduce scope. For regulated banking in the EU, UK, HK, or Singapore the floor is higher again: a TIBER-EU, CBEST, iCAST, or AASE-aligned red team is expected of in-scope institutions on a multi-year cycle.

How early in a security programme is it worth running a red team?

Wait until you have functioning detection. Concretely: a SIEM ingesting endpoint, identity, and cloud logs; an EDR deployed across the estate; documented playbooks for the top five alert categories; and at least one prior pentest closed out. Without those, a red team simply walks in unopposed and you pay six figures to learn what a pentest would have shown for a fraction of the cost. The right entry point for an immature programme is pentest plus purple team, then red team in year two or three.