30 Questions to Ask Your VAPT Vendor — Before You Sign the SOW

Published May 19, 2026 · By AxVeil Research · 16 min read

Most VAPT RFPs miss the point. They optimise for procurement convenience — fixed unit price, tidy line items, a logo grid of past clients — and skip the questions that actually determine whether the engagement will produce findings your defenders can act on and a report your auditor will accept. The result is a familiar pattern: a clean SOW, a low quote, and a report that lands six weeks later with eight findings, two of which are duplicate scanner output, and none of which would have stopped the breach you bought the pentest to prevent.

This checklist is the antidote. Thirty questions across five categories — scope, qualifications, compliance, operations, commercials — each with a stated reason it matters, a red flag pattern that should slow the procurement, and a green flag pattern that should speed it up. None of the questions are gotchas. All of them are questions a competent vendor wants you to ask, because the answer makes the engagement easier for both sides. For the contractual scaffolding that surrounds these questions, our pentest RFP template ships the same structure as a downloadable Word document.

Why most VAPT RFPs miss the point

Three failure modes recur. First, the RFP is written by procurement, not by security — so it asks about price, references, and SLAs without asking about methodology, named operators, or retest cadence. Second, the RFP treats VAPT as a commodity — three vendors, same SOW, lowest quote wins — when in reality the variance in operator skill between two equally-credentialled firms is larger than the variance in price. Third, the RFP ends at the report — no retest, no remediation verification, no clause for the inevitable mid-engagement scope change. By the time the buyer notices, the engagement is over and the findings are stale.

The fix is not more procurement process. The fix is fewer, better questions. Thirty is the right number — enough to discriminate between a credible vendor and a brochureware vendor, few enough that a buyer team can score them honestly without dropping into spreadsheet fatigue. Group them by category, score them on a weighted rubric (we describe one further down), and the right vendor falls out of the matrix without negotiation heroics. For the budget context that sits underneath these questions, our pentest cost estimator gives you a defensible envelope before the RFP goes out.

Category 1 — Scope and methodology (Q1–Q6)

Scope is the engagement's contract with reality. If the scope is wrong, no amount of operator brilliance saves the report. These six questions establish whether the vendor is actually going to test what your defenders care about, or whether they are going to bill for a scan they could have run on day one.

Which assets, by inventory, will you test, and which are explicitly excluded? Why: a scope without an explicit exclusion list invites the vendor to test the easy assets and skip the gnarly ones. Red flag: "the application" without naming subdomains, APIs, mobile, or third-party integrations. Green flag: a vendor that asks for your asset inventory, marks each entry in or out of scope by name, and writes the exclusions back into the SOW.
Which methodology frameworks does your team follow? Why: methodology is the difference between repeatable testing and bespoke heroics. Red flag: "our proprietary methodology" with no public reference. Green flag: three or more of OWASP WSTG / MASVS, PTES, NIST SP 800-115, OSSTMM, CREST, named in the proposal with version numbers.
What is the manual-versus-automated split, in hours? Why: automated scanning is a baseline, not a deliverable. Red flag: a vendor that cannot tell you the split, or one that quotes "100% manual" (no serious team works that way). Green flag: a concrete answer, typically 60–75% manual on a well-scoped application pentest, with the manual hours mapped to specific test categories (authentication, authorisation, business logic, injection).
How do you handle business-logic flaws that automated tools cannot find? Why: the highest-impact findings in modern pentests are business-logic flaws (IDOR chains, multi-step authorisation bypasses, race conditions), and they need named human time. Red flag: deflection to "our AI-augmented engine". Green flag: a vendor that explains how the lead operator allocates time to the unique flows of your application after reading the inventory you provide.
What evidence do you capture per finding, and how is it preserved? Why: evidence is what makes a finding defensible in retest and in audit. Red flag: "screenshot in the report". Green flag: video / HAR / request-response capture, hashed and timestamped, retained for the contract period and available on demand for audit defence.
Is the testing aligned to OWASP / PTES at the finding level, not just the cover page? Why: a methodology citation in the executive summary with no per-finding mapping is theatre. Red flag: the methodology section and the findings section do not reference each other. Green flag: every finding cites the OWASP WSTG section (or PTES phase) it was identified under, so a reviewer can trace coverage gap by gap.
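The "hashed and timestamped" evidence described in the evidence question above can be pictured as a manifest file that travels with the evidence pack. What follows is a minimal sketch in Python, assuming the evidence sits as ordinary files under one directory. The function name build_manifest and the manifest field names are illustrative, not a vendor or industry standard.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: str) -> list[dict]:
    """List every file under evidence_dir with its hash and a UTC timestamp.

    The field names are hypothetical; agree the real schema in the SOW.
    """
    root = Path(evidence_dir)
    # One timestamp for the whole run: it records when the manifest was
    # built, not when each piece of evidence was captured.
    recorded_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            "file": str(path.relative_to(root)),
            "sha256": sha256_of(path),
            "recorded_at_utc": recorded_at,
        }
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]
```

A buyer or auditor can rerun the same hashing over the delivered files and compare digests. Any mismatch means the evidence changed after the manifest was written.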

Category 2 — People and qualifications (Q7–Q12)

VAPT is operator-dependent work. Two vendors with identical methodology produce wildly different reports because the lead operator is — or is not — senior enough to chain findings into a real attack path. These six questions sort the firm-as-brand from the firm-as-people.

How many of your delivery staff hold OSCP, OSEP, OSWE, CREST CRT, or equivalent? Why: certification is not a proxy for skill, but the absence of it is a proxy for the absence of training. Red flag: "our team is highly skilled" without numbers. Green flag: a count by certification, with the lead operator's certs named.
Who, by name, will lead this engagement, and how many years have they been testing? Why: the lead carries the engagement. Red flag: "to be assigned after the contract" or a name that disappears after the kickoff call. Green flag: a named lead with a CV, four or more years of relevant testing, and a commitment that the same person runs the retest.
What is your delivery-staff retention over the last 24 months? Why: high churn means the team that wrote the proposal will not be the team that delivers. Red flag: the vendor cannot answer or quotes industry-average numbers without specifics. Green flag: sub-15% annual churn on the delivery bench, with the lead operator at the firm three-plus years.
What background verification do you run on testers before they touch production-adjacent systems? Why: auditors and regulated buyers both want vendor-side personnel screening on file. Red flag: no background verification (BGV), or "handled by HR". Green flag: documented BGV (criminal-record check, identity verification, prior-employment confirmation) refreshed on a defined cadence.
How do you handle conflict-of-interest disclosures, particularly if your firm has tested our competitor or our supplier? Why: in tight vertical markets the same firm can end up testing three direct competitors. Red flag: "we have a strong wall" with no documented control. Green flag: a written conflict-of-interest (CoI) policy, named disclosures at proposal stage, and a willingness to walk away from an engagement where the conflict cannot be cleanly partitioned.
Will the lead operator be on the kickoff, the mid-engagement check-in, and the readout, rather than a sales engineer or a delivery manager? Why: operator continuity through the engagement is how nuance survives. Red flag: the sales engineer runs every call. Green flag: the lead operator is named on the calendar invite for every working call.
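The sub-15% churn threshold in the retention question is easy to check once the vendor hands over raw numbers. This is a minimal sketch, assuming churn is defined as departures divided by average headcount and annualised over the period. Vendors may use a different definition, so ask which formula sits behind their figure. The headcount and departure numbers below are invented for illustration.

```python
def annual_churn(departures: int, average_headcount: float, months: int) -> float:
    """Annualised churn as a fraction, e.g. 0.125 for 12.5%.

    Assumed definition: departures / average headcount, scaled to 12 months.
    """
    if average_headcount <= 0 or months <= 0:
        raise ValueError("average_headcount and months must be positive")
    return (departures / average_headcount) * (12 / months)


# Hypothetical bench: 20 testers on average, 5 departures over 24 months.
churn = annual_churn(departures=5, average_headcount=20, months=24)
print(f"{churn:.1%}")  # 12.5%
print(churn < 0.15)    # True
```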

Category 3 — Compliance and deliverables (Q13–Q18)

The report is the deliverable, and the report is what your auditor reads. These six questions check that the report will land cleanly inside whichever compliance frame your business operates against — and that the evidence behind it will survive the auditor's sampling.

How are findings mapped to SOC 2 CC-series, PCI DSS v4.0 requirements, and ISO 27001 Annex A controls? Why: a finding that is not mapped to a control is work the buyer's GRC team has to redo. Red flag: mapping left to the buyer. Green flag: findings mapped explicitly in the report appendix, with the auditor-facing language pre-written.
Is retest of every finding included in the base price? Why: retest is where remediation gets verified. Without it the auditor cannot close the loop. Red flag: retest as an extra line item or a per-finding charge. Green flag: retest of all findings within a defined window (typically 30–45 days) included in the SOW.
Does the report include a board-ready executive summary that does not require legal redaction before it is shared? Why: board packets and audit-committee minutes both need a clean one-pager. Red flag: the executive summary is technical-jargon paragraphs with embedded payloads. Green flag: a one-page summary that reads cleanly to a non-technical director.
What audit-ready evidence do you ship alongside the report? Why: auditors sample the evidence, not the report narrative. Red flag: "findings in the report" with no evidence pack. Green flag: a structured evidence directory — proof-of-exploitation video, request / response pairs, environment metadata — hashed and timestamped so the chain of custody survives challenge.
Can you share an anonymised sample report under NDA before we sign? Why: the sample report is the single best procurement signal. Red flag: "confidential, no samples". Green flag: two sanitised samples — one application, one network or cloud — produced within a working day.
How do you score findings: CVSS v3.1, v4.0, or your own scale? Why: a base score without a vector cannot be challenged or recalculated. Red flag: "High / Medium / Low" with no scoring metadata. Green flag: CVSS v4.0 (or v3.1) vector documented per finding, with environmental score available where the buyer provides the asset-impact context. See our explainer on CVSS v3.1 vs v4.0 for which to ask for.
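To see why the vector matters in the scoring question, it helps to recompute a score from one. The sketch below recalculates a CVSS v3.1 base score from its vector string, using the metric weights and the round-up rule from the FIRST v3.1 specification. It covers base metrics only, not temporal or environmental ones, and it does not apply to v4.0, which scores differently. The function names are illustrative.

```python
import math

# CVSS v3.1 base-metric weights from the FIRST specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined for CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Recompute the base score from a CVSS v3.1 base vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(part.split(":") for part in parts)
    changed = m["S"] == "C"  # Scope changed alters PR weights and the formula
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N"))  # 6.1
```

With the vector in the report, a reviewer can dispute a single metric (for example, whether user interaction is really required) and see what the change does to the score. A bare "High" label gives nothing to recalculate.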

Category 4 — Operations and SLAs (Q19–Q24)

Operations is where the engagement either runs smoothly or burns the buyer's engineering team out. These six questions cover the cadence and communications that determine whether week three of the engagement is calm or chaotic.

How fast can you stand up a kickoff once the SOW is signed? Why: the window between SOW signature and kickoff is when scope assumptions drift. Red flag: "four to six weeks". Green flag: kickoff within five working days of signature for a standard engagement, with the lead operator on the call.
What is your escalation path if the operator finds a critical issue mid-engagement, such as an active intrusion, a data leak, or an exploitable authentication bypass? Why: a mid-engagement critical needs a named human inside the hour. Red flag: "reported in the final report". Green flag: a same-business-day escalation to a named buyer-side contact with a written follow-up under the contract's confidentiality envelope.
Do you run daily standups during the engagement? Why: daily contact prevents the "four weeks of silence then a 60-page report" failure mode. Red flag: "weekly status email". Green flag: a 15-minute daily standup (Slack-async or video) during active testing windows with the lead operator present.
Which channels do you use for live communication, and what is the buyer expected to provide? Why: some buyers cannot install third-party chat tools, and some vendors only operate inside their own portal. Red flag: the vendor's portal is the only channel. Green flag: the vendor adapts to the buyer's tooling (Slack Connect, dedicated Teams channel, GPG email) and ships a written communications matrix at kickoff.
What is your turnaround on finding triage from identification to written communication? Why: a finding left sitting on a tester's laptop for five days is a finding the buyer cannot remediate in parallel. Red flag: "included in the final report". Green flag: criticals and highs communicated within one business day of identification; mediums and lows within three.
What is the SLA on the final report after testing ends? Why: report-writing drift is the most common cause of missed audit deadlines. Red flag: "within four weeks". Green flag: draft report within five working days of test completion, final report within ten, with a written penalty clause if either slips.
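The report SLA is stated in working days, which makes the actual due dates easy to misjudge. Here is a minimal sketch of the date arithmetic, assuming both deadlines count from the last day of testing and that weekends are the only non-working days. Public holidays would need a calendar agreed in the SOW. The test end date is hypothetical.

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date `days` working days after `start` (Monday to Friday).

    Assumption: weekends are the only non-working days; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


test_end = date(2026, 5, 19)  # hypothetical last day of testing (a Tuesday)
print(add_working_days(test_end, 5))   # 2026-05-26, draft report due
print(add_working_days(test_end, 10))  # 2026-06-02, final report due
```

Writing the computed dates into the kickoff notes gives both sides a fixed reference if the penalty clause is ever invoked.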

Category 5 — Commercial (Q25–Q30)

Price is the last conversation, not the first. These six questions surface the commercial structure that determines whether the engagement is a fair trade or a future regret.

What is the pricing model (fixed price, time-and-materials, or hybrid), and what assumptions sit underneath it? Why: a fixed price with soft assumptions becomes a T&M engagement on day three. Red flag: fixed price with no documented scope envelope. Green flag: fixed price for the documented scope, T&M only on documented additions, with a daily rate published up front.
What does a retest cost, and is the rate the same as the original engagement? Why: retest is where vendors recover margin on a tight quote. Red flag: retest at full daily rate. Green flag: retest at no additional cost within the contracted window, or at a documented fraction (typically 25–35%) of the original engagement.
What is the scope-change clause — how is mid-engagement scope expansion priced and approved? Why: scope changes happen on every engagement. Red flag: no clause, leaving change-orders to email. Green flag: a written change-order template, a fixed unit price per additional asset, and a 48-hour written approval loop with the buyer-side sponsor.
What indemnity, professional-indemnity insurance, and limitation of liability do you carry? Why: the vendor has hands on your systems, so the indemnity matters. Red flag: liability capped at fees paid with no indemnity. Green flag: professional-indemnity cover at the buyer's risk-adjusted level (₹5 Cr / $600k or higher for regulated buyers), cyber liability cover, and indemnity language reviewed by counsel before signature.
Who owns the intellectual property in the report, the evidence, and any custom tooling created during the engagement? Why: IP ownership determines whether the buyer can use the report in future audits, fundraises, or insurance applications. Red flag: vendor retains the report and licenses it back. Green flag: buyer owns the report, the evidence, and any custom tooling produced for the engagement; vendor retains the underlying methodology and its general-purpose toolchain.
Will you share three buyer-side references (one current, one recent, one regulated), and will you let us speak with them unsupervised? Why: a curated reference is a sales asset; an unsupervised reference is a procurement signal. Red flag: vendor-supplied talking points or a reference call with the vendor on the line. Green flag: three references with direct contact details, a written introduction, and a clear statement that the buyer can ask any question without vendor presence.
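The pricing, retest, and scope-change questions interact, so it can help to total them before comparing quotes. The sketch below models the retest charge as a fraction of the base price, which is a simplification because a vendor may bill retest by the day instead. All figures and the function name are hypothetical.

```python
def engagement_cost(base_price: float, retest_fraction: float,
                    extra_assets: int = 0, unit_price: float = 0.0) -> float:
    """Base price plus retest charge plus priced scope additions.

    retest_fraction is the retest charge as a share of the base price:
    0.0 when retest is included, 1.0 when it costs as much as the original.
    """
    return base_price * (1 + retest_fraction) + extra_assets * unit_price


# Hypothetical quotes in one currency; none of these come from a real vendor.
low_quote = engagement_cost(10_000, retest_fraction=1.0, extra_assets=2, unit_price=1_500)
fair_quote = engagement_cost(14_000, retest_fraction=0.0, extra_assets=2, unit_price=1_000)
print(low_quote)   # 23000.0
print(fair_quote)  # 16000.0
```

In this invented case the lower headline quote ends up costing more once the retest and two added assets are priced in.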

Common vendor anti-patterns to spot

A handful of patterns recur across the bad procurements we have audited. Catching even one of these inside the RFP cycle pays for the buyer's entire procurement effort.

The logo wall as substitute for references. A grid of recognisable client logos is marketing collateral, not procurement evidence. Ask for three unsupervised references; treat the logo wall as zero signal.
The proprietary methodology dodge. "Our methodology is proprietary" usually means the methodology is whatever the lead operator improvises on the day. Insist on a public-framework mapping.
The unnamed engagement team. "Our pool of senior consultants" is a placeholder for "whoever is on the bench when your engagement starts". Insist on a named lead before signature.
The free retest that is not free. Read the retest clause carefully. "Retest included" with a 14-day window against a 90-day remediation cycle is a retest you cannot use.
The platform-as-pentest. A dashboard with continuously-running automated scans is a vulnerability-management tool, not a pentest. A real pentest produces business-logic findings that no scanner finds. If the demo is all dashboard and no operator, you are buying scanning at a pentest price.
The discount that arrives without an ask. A vendor that drops price 30% before the second call is a vendor whose first quote was either fake or whose delivery margin is about to be cut from your engagement. Either way, the quality risk is now yours.
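The free-retest anti-pattern above comes down to one comparison: does the retest window outlast your remediation cycle? This is a minimal sketch, assuming both periods start on the day the report is delivered. The function name is illustrative.

```python
def retest_shortfall_days(retest_window_days: int, remediation_cycle_days: int) -> int:
    """Days by which remediation overruns the retest window (0 if it fits).

    Assumes both periods start on the day the report is delivered.
    """
    return max(0, remediation_cycle_days - retest_window_days)


print(retest_shortfall_days(14, 90))  # 76: the window closes long before fixes ship
print(retest_shortfall_days(45, 30))  # 0: the retest is usable
```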

How to score the answers — a weighted 1–5 rubric

Score each answer on a 1–5 scale: 1 = absent or evasive, 2 = partial / verbal only, 3 = documented but generic, 4 = documented and specific to your scope, 5 = documented, specific, and exceeds reasonable expectation. Weight the categories against the engagement's risk profile: regulated buyers weight compliance and qualifications higher, fast-moving SaaS buyers weight scope and operations higher.

Category	Default weight	Regulated buyer	SaaS buyer
Scope & methodology	25%	20%	30%
People & qualifications	20%	25%	20%
Compliance & deliverables	20%	25%	15%
Operations & SLAs	20%	15%	25%
Commercial	15%	15%	10%

Compute the weighted score per vendor, express it as a percentage of the maximum, then compare. The right vendor usually scores within three percentage points of one other vendor; those two are the negotiation set. Anyone twenty points below the leader is not a value play; they are a quality risk wearing a price tag. A leader twenty points clear of the field is either over-selling or benefiting from an RFP written loosely enough that they are reading the answers off your wishlist, so double-check the references.
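The rubric arithmetic can be sketched in a few lines of Python. The weights are copied from the table above. The conversion to a percentage (average each category, apply the weight, divide by the maximum score of 5) is an assumption, since the text does not fix a formula. The vendor and its scores are invented for illustration.

```python
# Category weights copied from the table above.
WEIGHTS = {
    "default":   {"scope": 0.25, "people": 0.20, "compliance": 0.20, "operations": 0.20, "commercial": 0.15},
    "regulated": {"scope": 0.20, "people": 0.25, "compliance": 0.25, "operations": 0.15, "commercial": 0.15},
    "saas":      {"scope": 0.30, "people": 0.20, "compliance": 0.15, "operations": 0.25, "commercial": 0.10},
}


def weighted_percent(answers: dict[str, list[int]], profile: str = "default") -> float:
    """Weighted vendor score as a percentage of the maximum.

    answers maps each category to its six 1-5 question scores.
    Assumption: each category is averaged, weighted, then divided by 5.
    """
    total = 0.0
    for category, weight in WEIGHTS[profile].items():
        scores = answers[category]
        if any(not 1 <= s <= 5 for s in scores):
            raise ValueError(f"scores for {category} must be 1-5")
        total += weight * (sum(scores) / len(scores))
    return round(total / 5 * 100, 1)


# Hypothetical vendor; the scores are invented for illustration.
vendor_a = {
    "scope":      [4, 4, 4, 4, 4, 4],
    "people":     [3, 4, 3, 4, 3, 4],
    "compliance": [5, 4, 5, 4, 5, 4],
    "operations": [3, 3, 3, 3, 3, 3],
    "commercial": [4, 4, 4, 4, 4, 4],
}
print(weighted_percent(vendor_a))               # 76.0
print(weighted_percent(vendor_a, "regulated"))  # 77.0
```

The same answers score differently under each profile, so pick the weighting before the responses arrive rather than after.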

Bringing it together

Thirty questions, five categories, one weighted rubric. Run it on three vendors and the right answer falls out. Run it on one vendor and you have a defensible single-source justification for audit. Either way, the questions ahead of the SOW are cheaper than the surprises after it. For the underlying service framing and what a clean VAPT engagement looks like, see our VAPT services page and the VAPT glossary entry.

FAQ

How many vendors should we put through this 30-question filter?

Three to five. Below three you lose comparative pressure; above five the buyer team cannot do justice to each response and the RFP turns into spreadsheet theatre. Three is the floor for a regulated buyer because audit committees expect a documented shortlist. Five is the ceiling for everyone else.

Do these questions apply to a pentest-as-a-service vendor too?

Yes — and arguably more, because PtaaS vendors are easier to buy on dashboard demos than on operator depth. The same questions apply: who is the named lead, what is the manual-versus-automated split, what does the retest cadence look like, what happens on a scope-change request. Strip the platform varnish and the procurement is identical.

Should we share all 30 questions with the vendor up front, or hold some back for the technical call?

Share 24 in the written RFP and hold six back for the technical call, typically drawn from the people and operations questions. Holding some back lets you compare unrehearsed answers between vendors, which is the most informative signal you will get during procurement.

What if our shortlisted vendor refuses to answer some of these questions?

That is itself an answer. A vendor that will not name its lead operator, share a sanitised sample report under NDA, or commit to a retest window has already told you what the engagement will feel like. Disqualifying a vendor over a refusal is not a tantrum; it is acting on the signal.

Is this checklist usable for a non-Indian buyer?

Yes. The compliance mappings (SOC 2, PCI DSS, ISO 27001) are international, and none of the 30 questions above depends on an India-specific regulation. Convert the rupee figure in the indemnity question to your own currency and the checklist holds up across US, UK, EU, GCC, and Singapore engagements. For Indian regulated buyers, pair this checklist with our RBI-specific checklist.

Ready to write the RFP?

Download the AxVeil pentest RFP template — the same 30-question scaffold, in a Word document your procurement team can edit and send today.
