A firm conducts a thorough 10-day internal assessment. The testers achieve Domain Admin in under four hours, chain three low-severity misconfigurations into a critical attack path, and identify 47 findings. The report lands at 186 pages. The executive summary references "Kerberoasting," "NTDS.dit extraction," and "LLMNR poisoning" without explaining what any of them mean in business terms.
Six months later, the next test achieves Domain Admin through the same attack path.
The testing was excellent. The report failed. And because the report failed, the testing was wasted.
The quality of a penetration test is not determined by the skill of the testers. It's determined by whether the report causes the right things to get fixed.
After two decades of commissioning, reviewing, and delivering penetration tests at Fortune 500 scale, we've learned what separates a report that drives remediation from one that collects dust.
The Seven Ways Pen Test Reports Fail
1. Wrong Audience for the Entire Report
Every pen test has at least three audiences with fundamentally different needs:
- Board and senior leadership need to understand risk in business terms — data exposure, regulatory liability, financial impact. They don't know what CVSS 9.1 means and shouldn't have to.
- CISOs and security managers need to understand risk posture — how findings relate to each other, which chains are most dangerous, where to invest remediation resources.
- IT and engineering teams need to fix things — specific systems, exact configuration changes, Group Policy paths, verification steps.
Most reports are written for one audience (typically engineers) and fail the other two entirely. An executive summary packed with CVE numbers is not an executive summary. A finding that says "implement best practices" without specifics is not a remediation instruction.
2. No Attack Narrative
This is the most common structural failure: a flat list of vulnerabilities with no explanation of how they relate. Findings in isolation dramatically understate risk.
Three medium-severity issues that chain into Domain Admin are collectively more critical than a single high-severity finding that's an isolated dead end. Without an attack narrative showing how an attacker moved from initial access to demonstrated impact, the CISO cannot prioritize, and the IT team cannot understand the blast radius.
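The chain-versus-standalone logic above can be sketched in code. This is an illustrative scoring rule of our own, not a standard formula: a finding inherits the demonstrated impact of the worst attack chain it enables, so three chained mediums outrank an isolated high. The finding IDs, titles, and scores are invented for the example.

```python
# Hypothetical findings: three mediums that chain to Domain Admin,
# plus one standalone high. CVSS values are invented for illustration.
findings = [
    {"id": "F-01", "title": "LLMNR enabled",              "cvss": 5.4, "chain": "Domain Admin"},
    {"id": "F-02", "title": "Weak service account pwd",   "cvss": 5.9, "chain": "Domain Admin"},
    {"id": "F-03", "title": "SMB signing not required",   "cvss": 5.3, "chain": "Domain Admin"},
    {"id": "F-04", "title": "Outdated TLS on intranet",   "cvss": 7.5, "chain": None},
]

# Assumed impact scores for demonstrated chain outcomes.
CHAIN_IMPACT = {"Domain Admin": 10.0}

def effective_priority(f):
    # A finding's priority is the worse of its standalone CVSS score and
    # the impact of the attack chain it participates in.
    return max(f["cvss"], CHAIN_IMPACT.get(f["chain"], 0.0))

for f in sorted(findings, key=effective_priority, reverse=True):
    print(f'{f["id"]}: {f["title"]} -> priority {effective_priority(f)}')
```

Ranked this way, the three chained mediums land above the standalone high, which is exactly the ordering a flat severity-sorted list hides.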
3. Executive Summary That Doesn't Communicate Risk
We see the same failing patterns repeatedly:
- "47 findings: 3 critical, 8 high, 15 medium..." — Numbers without context. The board doesn't know if 47 is normal or catastrophic.
- "Achieved Domain Admin via Kerberoasting of svc_backup..." — Technical narrative in a non-technical section. Demonstrates expertise, not risk.
- "Security posture broadly in line with industry expectations" — "Industry expectations" may mean 85% of internal tests achieve Domain Admin. That's not reassuring.
- "Numerous critical vulnerabilities, immediate remediation recommended" — Alarmist without being specific. No basis for action or prioritization.
A good executive summary answers in plain English, in two pages or less: What was tested? What's the headline? What could an attacker achieve in terms of data exposure, financial loss, and operational disruption? What are the top three actions?
4. No Remediation Roadmap
A list of 47 findings without prioritization and sequencing leaves IT teams paralyzed. A finding that's easy to fix and breaks the primary attack chain should be prioritized above a technically severe finding that requires months of work and is an isolated issue.
Good reports include a remediation roadmap with timeframes: immediate (this week), short-term (this month), medium-term (this quarter). Each item should identify dependencies — which fixes unlock other fixes and which fixes break the most dangerous attack chains.
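The sequencing logic is a dependency-ordering problem, and can be sketched with a topological sort. The fix IDs, timeframes, and dependencies below are invented for illustration; the point is that the roadmap encodes which fixes must land before others.

```python
# Minimal roadmap sketch using the standard library's topological sorter.
# Each fix maps to (timeframe, set of fixes it depends on) -- all invented.
from graphlib import TopologicalSorter

fixes = {
    "disable-llmnr":        ("immediate",   set()),
    "rotate-svc-creds":     ("immediate",   set()),
    "enforce-smb-signing":  ("short-term",  {"disable-llmnr"}),
    "tier-admin-model":     ("medium-term", {"rotate-svc-creds"}),
}

# static_order() yields fixes with all prerequisites first.
order = list(TopologicalSorter({k: deps for k, (_, deps) in fixes.items()}).static_order())

for fix in order:
    timeframe, _ = fixes[fix]
    print(f"{timeframe:>11}: {fix}")
```

A real roadmap would also annotate which attack chain each fix breaks, but even this skeleton makes dependencies explicit rather than leaving the IT team to infer them.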
5. Automated Tool Output Passed Off as Testing
If a "penetration test" report looks like a Nessus or Qualys export with generic recommendations, it is a vulnerability scan with a different label. Over-reliance on automated tools that generate unedited, generic reports is a widespread industry problem.
A vulnerability scan identifies known CVEs. A penetration test proves exploitation, demonstrates impact, and chains findings together. The deliverables should reflect that difference.
6. Missing Context for Non-Technical Stakeholders
Reports packed with CVSS vectors, CVE references, and attack chain diagrams without plain-language translations get parked in the backlog instead of the active work queue. Every finding needs a "so what" statement in business terms: "This allows an attacker to access the payroll database containing 12,000 employee records including SSNs and salary data."
7. Scope Poorly Defined or Unstated
Without clear scope documentation, the organization may assume everything was tested when significant areas were out-of-scope. Future tests become non-comparable. The scope section should explicitly state what was tested, what wasn't, the starting position (authenticated vs. unauthenticated, internal vs. external), and any limitations encountered.
What a Good Pen Test Report Includes
Here's what we require — and deliver — in every engagement:
- Executive Summary (1-2 pages) — Non-technical business risk in plain English. What was tested, what happened, what it means for the business, top three actions.
- Scope and Methodology (1-2 pages) — What was tested, what wasn't, testing approach, limitations, starting position, tools used.
- Attack Narrative (2-5 pages) — Chronological story from initial access to demonstrated impact. Shows how findings chain together. This is the section that drives understanding and urgency.
- Risk Summary (1 page) — Visual/tabular overview of all findings by severity, showing which chain together versus standalone.
- Technical Findings (2-4 pages per critical/high finding) — Each finding includes: descriptive title, severity with justification, affected systems (specific hostnames/URLs), description, evidence of exploitation, business impact, specific remediation instructions, and verification steps.
- Remediation Roadmap (1-2 pages) — Prioritized actions grouped by timeframe with dependencies noted.
- Appendices — Raw evidence, scan outputs, screenshots, command output.
What Each Finding Must Contain
The individual finding is where reports most often fall short. Each finding should include all eight elements:
- Descriptive title — Not just "SQL Injection" but "Unauthenticated SQL Injection in Customer Search API Exposing Full Customer Database"
- Severity rating with justification — Not just a CVSS number, but why this severity applies given your specific environment and data
- Affected systems — Specific hostnames, URLs, IP addresses, endpoints
- Description of the vulnerability — What it is and why it exists
- Evidence of exploitation — Screenshots, request/response pairs, command output proving the issue was exploited, not just detected
- Business impact — What data is at risk, what operations could be disrupted, what regulatory exposure exists
- Specific remediation instructions — Not "patch the vulnerability" but the exact configuration change, Group Policy path, code fix, or hardening step
- Verification steps — How to confirm the fix worked
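The eight elements above double as a review checklist, which can be enforced mechanically. The field names in this sketch are our own shorthand, not an industry schema, and the draft finding is invented for illustration.

```python
# Completeness check for the eight required finding elements.
# Field names are an assumed shorthand, not a standard schema.
REQUIRED = [
    "title", "severity_justification", "affected_systems", "description",
    "evidence", "business_impact", "remediation", "verification",
]

def missing_elements(finding: dict) -> list:
    """Return the required elements a draft finding is missing or leaves empty."""
    return [k for k in REQUIRED if not finding.get(k)]

# Hypothetical draft finding with one element left unfinished.
draft = {
    "title": "Unauthenticated SQL Injection in Customer Search API",
    "severity_justification": "Critical: full read access to customer PII",
    "affected_systems": ["https://api.example.com/v1/customers/search"],
    "description": "User-supplied 'q' parameter concatenated into SQL.",
    "evidence": "Request/response pair in Appendix B",
    "business_impact": "Exposes full customer database, including SSNs.",
    "remediation": "",  # still empty: the exact code fix must be supplied
    "verification": "Re-run the injection payload; expect a 400 response.",
}

print("missing:", missing_elements(draft))
```

Running a check like this before delivery catches the most common gap we see: findings shipped with a vague or empty remediation field.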
Vulnerability Scan vs. Pen Test vs. Red Team: Know What You're Buying
Confusion between these three engagement types costs organizations money and produces mismatched expectations. Here's the distinction:
- Vulnerability Assessment: Automated scanning (Nessus, Qualys) to find known CVEs. Broad coverage, hours to days, common false positives. Use for continuous monitoring and compliance.
- Penetration Test: Manual and automated exploitation with specific scope. Proves impact, eliminates false positives through verification, produces attack narratives. Duration: 1-3 weeks. Use for verifying attack paths and compliance.
- Red Team: Scenario-driven, stealthy, goal-oriented assessment of the entire organization — technology, people, and process. Tests detection and response capabilities. Duration: 3-8 weeks or ongoing. Use for security maturity validation.
Use a security maturity model to select the right engagement type: organizations at Level 1-2 maturity should focus on vulnerability assessments and regular penetration testing. Red team exercises are appropriate at Level 4-5 maturity, where the organization has established detection and response capabilities worth testing.
Ten Questions to Ask Before Signing the SOW
These questions separate competent firms from those that will deliver a 186-page doorstop:
- Who specifically will conduct the test? Named testers with verifiable certifications (OSCP, OSCE, GPEN, GWAPT, SANS GXPN). Major consultancies often put junior staff on external and web app testing while reserving senior resources for prestigious engagements.
- Can we see a sample report? Any firm with significant experience should provide a sanitized example. If they can't or won't, walk away.
- What does the scope document explicitly include and exclude? Confirm in writing what systems, IPs, applications, and environments are in-scope.
- Is this manual testing, automated scanning, or both? A penetration test that is only automated tool output is not a penetration test.
- What's your escalation process if you discover evidence of an active breach during testing? There should be a defined protocol.
- Do you carry liability insurance? Specifically errors and omissions (E&O) insurance covering unintended damage during testing.
- What is the retesting policy? After remediation, will they verify findings are fixed, and at what cost?
- What are the deliverables exactly? Executive summary, technical findings report, attack narrative, remediation roadmap, and verification steps should all be named in the SOW.
- What variables affect scope and cost? Their answer reveals how they think about engagements and whether they've handled complex environments.
- What is your approach to not disrupting production systems? What safeguards prevent outages during testing?
Green Flags and Red Flags
Green flags
- Named, certified testers with verifiable experience
- Sample report that includes attack narratives and business-context impact statements
- Proactive questions about your business to understand what matters
- Proposes starting with threat modeling before testing
- Clear escalation protocol for live threats discovered during testing
- References from clients in your industry
Red flags
- Report that looks like a Nessus export with a different header
- No attack narrative — just a flat finding list
- Remediation steps that say "apply latest patches" without specifics
- Same generic findings regardless of your environment
- No executive summary in plain English
- Testers who haven't read your scope document by the kickoff call
- Pricing suspiciously low — often means junior staff running automated tools
The Bottom Line
A penetration test is only as valuable as the remediation it drives. The most technically brilliant assessment in the world is worthless if the report doesn't communicate risk in terms each stakeholder can act on, provide a prioritized roadmap, and include specific enough instructions for the engineering team to fix the issues without additional research.
If your last pen test report is sitting unread in someone's inbox, the problem probably isn't the findings. It's the report.
Need a penetration test that drives results?
Our engagements deliver actionable reports with attack narratives, business-context impact statements, and remediation roadmaps that get things fixed. Book a session to discuss your next assessment.