By Larissa Kolver, Head of Cyber Security at Securecom
Author Introduction
As someone who sits at the intersection of operations, risk, and delivery every day, I’ve learned that meaningful assurance is about more than a long list of CVEs – it’s about evidence that your real business risks are being reduced release by release. I’m Larissa Kolver, Head of Cyber Security at Securecom, and my focus is helping New Zealand teams translate security testing into clear priorities that fit how they actually build software, ship changes, and support customers. In this piece I share a practitioner’s view on separating scans from true penetration testing, what good looks like in modern environments, and how to evaluate providers in a way your engineers and auditors will both respect.
Key Takeaway: Don’t mistake tool output for attacker impact. Separate vulnerability scanning from accredited manual testing, demand evidence of chained exploit paths and business impact, and ensure the approach integrates cleanly with DevSecOps and compliance.
Outline:
- Why “checkbox security” fails in modern environments
- Scanning vs penetration testing – what’s the real difference
- What good looks like – methodology, scope, and depth
- DevSecOps integration and retest workflow
- Independence, data handling, and lawful testing
- How to run a low-risk proof of concept
- RFI questions you can copy and paste
Introduction
If your last security test produced a long list of CVEs but little clarity on exploitability, you’ve seen checkbox security in action. Modern attackers don’t stop at single findings – they chain weaknesses across apps, APIs, edge devices, and identities. The evaluation task in front of you is not to pick who runs the loudest scanner, but who can credibly model attacker behaviour, validate impact, and fit into how your teams actually build and ship software. This article gives you a structured way to separate genuine assurance from box ticking, aligned to recognised standards and New Zealand governance realities.
1) First principle – scanning is not penetration testing
A vulnerability scan is largely automated. It identifies known weaknesses and misconfigurations at scale. A penetration test is a human-led exercise that attempts controlled exploitation to demonstrate real business impact and attack paths. Treating them as equivalent leads to false assurance. Authoritative sources draw this line clearly: NIST SP 800-115 separates vulnerability analysis from exploitation phases, emphasising planning, execution, and post-test activities; OWASP’s Web Security Testing Guide highlights manual techniques and a balanced approach; CISA’s public guidance distinguishes routine scanning services from remote penetration testing. (NIST Computer Security Resource Center)
What this means for your evaluation: insist on both capabilities – broad, frequent scanning for coverage, and accredited manual exploitation for depth – reported in plain English with narrative impact, not just CVSS numbers.
2) Guardrails from recognised methods and controls
Ask prospective providers to anchor their approach to recognised methodologies and controls so your outcomes are defensible:
- NIST SP 800-115 – a disciplined lifecycle covering planning, discovery (information gathering and vulnerability analysis), attack (controlled exploitation), and reporting. (NIST Computer Security Resource Center)
- OWASP Web Security Testing Guide – deep techniques for web and API testing and the need for a balanced manual approach. (OWASP Foundation)
- ISO 27001 Annex A 8.8 – management of technical vulnerabilities, which your programme should evidence through cadence and remediation, not one-off reports. (ISMS.online)
Mapping providers to these references helps you prove due diligence in audits and board updates without lapsing into tool worship.
3) Scope and depth – what “good” testing actually covers
A credible provider will go beyond external perimeters and include the systems attackers target most:
- Web applications, APIs, and mobile apps aligned to OWASP testing practices. (OWASP Foundation)
- Edge and VPN devices, where exploit-driven breaches have risen sharply and median remediation lag is measured in weeks, according to the Verizon 2025 DBIR executive summary. (Verizon)
- Cloud and SaaS entry points with clarity on how legal scope is managed when testing service-provider controlled assets. (CISA)
Beyond listing findings, expect exploit narratives that show how weaknesses chain together – for example, an insecure direct object reference (IDOR) in an API combined with a misconfigured edge device leading to data exposure. If a sample report can’t tell that story, you’re buying a scan, not a test.
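To make the chaining idea concrete, here is a minimal, purely illustrative sketch of the kind of manual IDOR check a tester performs. The base URL, endpoint path, token, and record IDs are hypothetical placeholders, not anything from a real engagement:

```python
import requests

# Hypothetical IDOR probe: authenticate as a low-privilege user, then
# request a record belonging to a different account. A 200 response
# containing the other user's data means the API is not enforcing
# object-level authorisation.
BASE_URL = "https://api.example.co.nz"   # placeholder target
SESSION_TOKEN = "eyJ..."                 # low-privilege user's token (placeholder)

def check_idor(own_id: int, other_id: int) -> None:
    headers = {"Authorization": f"Bearer {SESSION_TOKEN}"}
    for record_id in (own_id, other_id):
        resp = requests.get(
            f"{BASE_URL}/v1/invoices/{record_id}",
            headers=headers,
            timeout=10,
        )
        print(record_id, resp.status_code)
    # If other_id returns 200 with data instead of 403/404, that is a
    # candidate finding; chained with an exposed edge device, it becomes
    # the data-exposure path a good report narrates end to end.

check_idor(own_id=1001, other_id=1002)
```

A scanner rarely surfaces this class of flaw because the request is syntactically valid – it takes a human to notice that the response belongs to someone else.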
4) Fit for DevSecOps – integrate, don’t disrupt
Security that can’t keep pace with delivery gets bypassed. Your shortlist should demonstrate:
- Workflow integrations – push issues directly into JIRA, Teams, or Slack, with status updates on retest and closure.
- Change-driven testing – the ability to trigger focused tests for high-risk releases, not just calendar-based cycles.
- Retest service levels – defined windows to validate fixes so backlog actually shrinks.
OWASP notes that manual penetration testing provides more accurate risk ratings than scans alone and should feed back into earlier-phase controls. That only works if your provider plugs into your existing SDLC and collaboration tools. (OWASP Foundation)
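As an illustration of the workflow-integration point, here is a minimal sketch of pushing a validated finding into Jira using Jira Cloud’s standard create-issue REST endpoint. The site URL, project key, service account, and field choices are assumptions you would adapt to your own instance:

```python
import requests
from requests.auth import HTTPBasicAuth

# Sketch: raise a validated pen-test finding as a Jira ticket via the
# Jira Cloud REST API (POST /rest/api/2/issue). Site, project key, and
# credentials below are placeholders.
JIRA_URL = "https://yourcompany.atlassian.net"
AUTH = HTTPBasicAuth("svc-pentest@yourcompany.co.nz", "API_TOKEN")

def raise_finding(summary: str, narrative: str, severity: str) -> str:
    payload = {
        "fields": {
            "project": {"key": "SEC"},       # placeholder project key
            "issuetype": {"name": "Bug"},
            "summary": f"[PenTest][{severity}] {summary}",
            "description": narrative,        # full exploit narrative, not a bare CVE
            "labels": ["pentest", "retest-required"],
        }
    }
    resp = requests.post(
        f"{JIRA_URL}/rest/api/2/issue",
        json=payload,
        auth=AUTH,
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["key"]                # e.g. "SEC-123"

ticket = raise_finding(
    "IDOR on /v1/invoices allows cross-tenant reads",
    "Authenticated as user A, retrieved user B's invoice via ID substitution...",
    "High",
)
print("Created", ticket)
```

The detail that matters is the description field: a real exploit narrative, rather than a CVE identifier alone, is what lets a developer reproduce, fix, and close the issue within the retest window.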
5) Independence, data handling, and lawful testing
In New Zealand contexts, independence and lawful scope matter:
- Independence – if your MSP is also testing its own work, ask how they avoid a perceived conflict of interest. Consider third-party testers for objectivity.
- Lawful scope and safe harbours – providers should explain authorisations, change windows, and how they avoid disrupting SaaS or shared platforms. CISA advises consulting legal counsel for SaaS-centric testing scopes. (CISA)
- Data handling and residency – clarify where evidence and artifacts are stored and who has access, aligning to ISO 27001 control expectations on vulnerability management records. (ISMS.online)
These factors often decide procurement outcomes more than raw technical prowess.
6) Run a low-risk proof of concept
Before you sign anything substantial, run a limited-scope PoC that exercises the parts of the service that matter most to you:
- Scope – pick one externally facing app and its primary API.
- Success criteria – time to first actionable finding, clarity of exploit narrative, ticket quality, and retest turnaround.
- Operational check – validate the hand-off into triage, the behaviour of alerts, and how developer time is affected.
A small PoC will quickly reveal whether you are getting outcomes or just artefacts.
7) RFI questions you can copy and paste
Use these to flush out checkbox approaches:
- Method and standards – Describe how your testing maps to NIST SP 800-115 phases and the OWASP Testing Guide. Provide a sample test plan and sample report. (NIST Computer Security Resource Center)
- Depth vs breadth – Provide two example exploit narratives that show chained findings and business impact.
- DevSecOps integration – List native integrations with JIRA, Teams, Slack, and CI/CD. Describe your retest workflow and SLAs for validation.
- Edge and VPN coverage – Explain how you test internet-facing devices and measure remediation speed in line with patterns highlighted in the DBIR. (Verizon)
- ISO 27001 Annex A 8.8 evidence – Show how your reporting supports vulnerability management control requirements. (ISMS.online)
- Independence and data handling – Describe how you manage conflicts of interest, data residency, storage, and access controls.
- SaaS and legal scope – Explain your approach to testing SaaS in a way that respects provider terms and local legal obligations. (CISA)
8) Red flags during evaluation
- Scan-and-dump – a heavy CVE list with little to no exploit context or business impact.
- No retest SLA – findings aren’t verified after fixes, so risk doesn’t actually fall.
- Tool-brand sales pitch – lots of scanner features, little human tradecraft.
- Edge and API blind spots – minimal coverage where attacks are surging. Verizon reports an eight-fold year-over-year rise in exploitation of edge devices and VPNs and a median of 32 days to remediate those issues – testing must reflect that reality. (Verizon)
- Compliance as the goal – selling “the certificate” rather than real assurance. ISO 27001 expects operational vulnerability management, not one report filed once a year. (ISMS.online)
9) Why depth and cadence both matter right now
Exploitation of vulnerabilities is an increasingly common initial access vector in global breach datasets. The Verizon 2025 DBIR highlights significant growth in exploit-driven incidents and slow remediation on edge devices. That context reinforces why your shortlist must prove it can deliver both continuous coverage and credible human-led exploitation to prioritise fixes that actually reduce your breach probability. (Verizon)
Next Steps
- Shortlist 2-3 providers who can evidence manual exploitation aligned to NIST and OWASP, plus integrations with your ticketing and chat tools. (NIST Computer Security Resource Center)
- Run a 2-week PoC on one app and its primary API. Measure time-to-first-report, exploit narrative quality, and retest turnaround.
- Map reporting to ISO 27001 Annex A 8.8 so you can show a living vulnerability management process to auditors and your board. (ISMS.online)
- Set baseline KPIs – critical-fix lead time, retest pass rate, reduction in open criticals, and percentage of findings created as tickets. Track monthly.
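If it helps to anchor those KPIs, here is a small, hypothetical sketch of how they might be computed from an export of finding tickets. The record fields (severity, opened, fixed, retest_passed, ticketed) are assumptions about your own tracking data, not a standard schema:

```python
from datetime import date

# Hypothetical KPI calculation over exported finding records.
findings = [
    {"severity": "critical", "opened": date(2025, 5, 1),
     "fixed": date(2025, 5, 15), "retest_passed": True,  "ticketed": True},
    {"severity": "critical", "opened": date(2025, 5, 3),
     "fixed": None,             "retest_passed": False, "ticketed": True},
    {"severity": "high",     "opened": date(2025, 5, 7),
     "fixed": date(2025, 5, 20), "retest_passed": False, "ticketed": False},
]

crit = [f for f in findings if f["severity"] == "critical"]
fixed_crit = [f for f in crit if f["fixed"]]
retested = [f for f in findings if f["fixed"]]

# Critical-fix lead time: days from finding opened to fix deployed.
lead_times = [(f["fixed"] - f["opened"]).days for f in fixed_crit]
print("Critical-fix lead times (days):", lead_times)
# Retest pass rate: share of fixed findings that passed validation.
print("Retest pass rate:", sum(f["retest_passed"] for f in retested) / len(retested))
# Open criticals: the number the board actually watches.
print("Open criticals:", len(crit) - len(fixed_crit))
# Ticket coverage: findings that made it into the delivery backlog.
print("Findings ticketed:", sum(f["ticketed"] for f in findings) / len(findings))
```

Tracked monthly, these four numbers show whether the programme is reducing risk or just accumulating reports.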
PTAAS Blog Series
Rethinking Security Testing: A 5-Step Guide to Continuous Assurance
- Why annual pen tests are failing
- Building the business case for continuous penetration testing – how to model risk, cost, and compliance
- How to evaluate security penetration testing approaches and providers without getting trapped by ‘checkbox’ security
- Decision and contracting guide – selecting and onboarding a continuous pen testing partner
- Your first 90 days with a continuous penetration testing programme – metrics, rituals, and realising value

About the Author:
Larissa Kolver PMP®, AgilePM® – Head of Cyber Security, Securecom
Larissa is a seasoned cyber resilience leader who blends disciplined project governance with hands-on security engineering, drawing on more than a decade of experience across the financial, health and safety, and technology sectors. At Securecom she heads the Security Operations function, translating continuous attack-surface insights into actionable remediation plans that executives can measure. Larissa is passionate about turning board-level risk appetite into practical cadence – replacing once-a-year checkbox tests with data-driven assurance tied to every release. Her mission is simple: help Kiwi businesses stay one step ahead of attackers while keeping compliance costs in check.
Concerned about security vulnerabilities in your application environments?
Talk to us about a PTaaS cadence that lowers your business risk.