🚀 AutoPentest
An agentic pentesting MCP server that automates web application penetration testing using the full OWASP Web Security Testing Guide and PortSwigger Web Security Academy technique references.
Point it at a target — it crawls your app, maps every endpoint, then spawns role-specialized agents (Scout, Analyzer, Exploiter, Reporter) to test for XSS, SQLi, SSRF, SSTI, IDOR and more. No false positives — every finding is backed by real, reproducible evidence with quality gates enforcing proof at every phase. Includes 31 PortSwigger technique guides, adaptive WAF evasion for 12 vendors, cross-phase vulnerability chaining, and risk-weighted endpoint prioritization. Run it with Claude Code, the API, or go fully offline using Ollama models.
Think of it as: A senior pentester's methodology encoded into an MCP server — 109 OWASP tests, 31 PortSwigger attack technique guides, 68+ MCP tools, 27 security tools, 4 specialized agent roles, 7 structured phases, automated quality assurance, and a zero-context final review.
🚀 Quick Start
Prerequisites
Installation
git clone https://github.com/bhavsec/autopentest-ai.git
cd autopentest-ai
cd server && uv sync && cd ..
make setup
That's it. All 27 security tools are now installed and ready inside the Docker container.
Verify Installation
make verify-tools
Start Testing
claude
Then tell Claude what to test:
Run a full WSTG assessment against https://target.example.com
✨ Features
Comprehensive OWASP Coverage
- 109 WSTG test cases across 12 categories — from information gathering to API testing.
- Each test includes step-by-step CLI procedures, context-specific payloads, detection criteria, and severity rubrics.
- Tests are prioritized (MUST/SHOULD) with conditional triggers so nothing relevant is skipped.
31 PortSwigger Attack Technique Guides
- Sourced from PortSwigger Web Security Academy — detection methods, exploitation techniques, payloads, cheat sheets, and WAF bypass patterns.
- Organized by vulnerability class (SQLi, XSS, SSRF, JWT, OAuth, etc.) for direct use during testing.
- Integrated into every testing phase — agents automatically load the relevant technique guide before testing each vulnerability class.
- Database/platform-specific payload tables (Oracle vs MySQL vs PostgreSQL vs MSSQL for SQLi, Jinja2 vs Twig vs Freemarker for SSTI, etc.).
- WAF bypass patterns organized by bypass level (basic → intermediate → advanced).
27 Pre-Configured Security Tools
- All tools pre-installed in a single Docker image — `make setup` and you're ready.
- Tools organized by phase: discovery, injection testing, authentication, cryptography, API testing.
- Automatic Burp Suite proxy integration for passive traffic monitoring.
Structured 7-Phase Workflow
- Phase 0: Application Discovery & Mapping
- Phase 1: Information Gathering & Reconnaissance
- Phase 2: Configuration & Deployment Testing
- Phase 3: Identity, Authentication, Authorization & Session Management
- Phase 4: Input Validation Testing (parallel XSS/SQLi/SSRF pipelines)
- Phase 5: Error Handling, Cryptography, Business Logic, Client-Side & API Testing
- Phase 6: Coverage Verification & Reporting
- Phase 7: Final Judge Review & Remediation
Quality Assurance System
- Automated phase gates — each phase must pass quality checks before proceeding.
- Quality Reviewer subagent at every phase transition identifies gaps and suggests improvements.
- Final Judge — a zero-context agent reviews the entire engagement cold, like an external QA reviewer.
- Exhaustion gates — "not vulnerable" requires proof of sufficient testing effort (minimum techniques and bypass attempts).
Evidence-Based Findings
- Every finding requires reproducible curl commands and full request/response evidence.
- Three-tier classification: EXPLOITED (proven impact), POTENTIAL (blocked by control), FALSE_POSITIVE (control holds).
- Anti-hallucination framework — "no exploit = no finding" enforced at every level.
- Evidence checklists per vulnerability class verified before any finding is logged.
Role-Specialized Subagents
- 4 dedicated roles with focused prompt templates, tool guidance, and anti-patterns:
- Scout — reconnaissance only, maps attack surface without sending payloads (Phases 0-1).
- Analyzer — identifies potential sinks with canary/witness payloads, builds exploitation queues (Phases 2-5 analysis).
- Exploiter — consumes Analyzer output, proves exploitation with evidence, logs confirmed findings (Phase 4 exploitation).
- Reporter — quality review and Final Judge, reviews data without sending requests (QA + post-report).
- Validation checkpoint between analysis and exploitation prevents wasted effort.
- Each role has explicit allowed/restricted tool lists and input/output contracts.
Pipelined Exploitation (Phase 4)
- 3 independent two-stage pipelines run in parallel: XSS, Injection (SQLi/CMDi), SSRF/SSTI.
- Each pipeline: Analyzer (discover → analyze → queue) → validation checkpoint → Exploiter (exploit → log).
- Each pipeline loads its PortSwigger technique guide for detection methods, cheat sheets, and WAF bypass patterns.
- WAF intelligence shared across all pipelines.
- Context-aware witness payloads for 13 sink types.
Adaptive WAF Evasion
- Automatic WAF fingerprinting from response headers, body, and status codes — identifies 12 WAF vendors (Cloudflare, AWS WAF, Akamai, Imperva, ModSecurity, F5, FortiWeb, Sucuri, Barracuda, Wordfence, NAXSI, Citrix).
- Vendor-specific bypass payloads organized by complexity level (basic → intermediate → advanced).
- WAF intelligence shared across all agents via deliverable system.
- Agents automatically identify WAF on first block response and switch to tailored bypass payloads.
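The fingerprint-then-switch behavior can be sketched as follows. This is a minimal illustration: the signature table, names, and matching rules below are assumptions based on common public WAF indicators, not AutoPentest's actual detection logic.

```python
# Hypothetical WAF fingerprinting sketch keyed on response signals.
# Signatures here are well-known public indicators, used for illustration.
WAF_SIGNATURES = {
    "cloudflare": lambda h, b: "cloudflare" in h.get("Server", "").lower()
                  or "cf-ray" in {k.lower() for k in h},
    "aws-waf":    lambda h, b: "x-amzn-requestid" in {k.lower() for k in h},
    "modsecurity": lambda h, b: "mod_security" in b.lower() or "modsecurity" in b.lower(),
}

def fingerprint_waf(headers, body):
    """Return the list of WAF vendors whose signature matches the response."""
    return [name for name, match in WAF_SIGNATURES.items() if match(headers, body)]

# A blocked response carrying Cloudflare's telltale CF-RAY header:
hits = fingerprint_waf({"Server": "cloudflare", "CF-RAY": "abc123"}, "")
```

Once a vendor is identified, agents would select that vendor's bypass payload set instead of generic ones.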
Cross-Phase Knowledge Graph
- Entity-relationship graph tracks endpoints, parameters, technologies, findings, cookies, domains, and user roles.
- Automated vulnerability chaining via BFS path finding with 7 predefined chain patterns:
- XSS + missing CSP, XSS + weak cookie (no HttpOnly), Open redirect + OAuth callback.
- IDOR + admin role, SSRF + cloud metadata, No lockout + no MFA, CORS + sensitive endpoint.
- Severity upgrades when chaining materially increases impact.
- Populated throughout testing, queried after Phase 4 for chain discovery.
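Chain discovery over the graph can be sketched as a label-guided BFS. The node naming, edge labels, and pattern encoding below are illustrative, not the real AutoPentest API; the XSS + missing CSP pattern mirrors the example above.

```python
from collections import defaultdict, deque

class KnowledgeGraph:
    """Toy entity-relationship graph: nodes joined by labeled edges."""
    def __init__(self):
        self.edges = defaultdict(list)  # node -> [(label, node)]

    def add(self, src, label, dst):
        self.edges[src].append((label, dst))

    def find_chain(self, start, pattern):
        """BFS for a path whose edge labels match `pattern` in order."""
        queue = deque([(start, 0, [start])])
        while queue:
            node, idx, path = queue.popleft()
            if idx == len(pattern):
                return path          # full pattern matched
            for label, nxt in self.edges[node]:
                if label == pattern[idx]:
                    queue.append((nxt, idx + 1, path + [nxt]))
        return None

g = KnowledgeGraph()
g.add("finding:xss-search", "affects", "endpoint:/search")
g.add("endpoint:/search", "lacks", "header:CSP")
# "XSS + missing CSP" as an edge-label pattern:
chain = g.find_chain("finding:xss-search", ["affects", "lacks"])
```

A matched chain would then trigger the severity-upgrade logic described above.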
Hierarchical Task Tree
- Persistent tree structure (phases as branches, tests as leaves) prevents LLM depth-first bias and context loss.
- Main agent maintains strategic macro view; subagents update only their assigned leaf nodes.
- Auto-propagation: when all children complete, parent auto-completes.
- Phase-level completion percentages for informed decision-making.
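The auto-propagation rule can be illustrated with a small sketch; the class and method names are hypothetical, not AutoPentest's actual tree implementation.

```python
class TaskNode:
    """Toy task tree: phases are branches, tests are leaves."""
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.done = False

    def complete_leaf(self):
        self.done = True

    def refresh(self):
        """Bottom-up pass: a parent auto-completes when all children are done."""
        if self.children:
            for c in self.children:
                c.refresh()
            self.done = all(c.done for c in self.children)

    def percent_complete(self):
        if not self.children:
            return 100.0 if self.done else 0.0
        return sum(c.percent_complete() for c in self.children) / len(self.children)

phase = TaskNode("Phase 2", [TaskNode("CONF-01"), TaskNode("CONF-13")])
phase.children[0].complete_leaf()
phase.refresh()
```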
Endpoint Risk Prioritization
- Score and sort endpoints by risk for prioritized testing — highest risk tested first.
- Scoring factors: parameter count, technology risk indicators, taint chain confidence, tool convergence, auth requirements, injectable parameter names.
- Integrated into Phase 0 endpoint map generation.
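A simplified version of the scoring idea is sketched below. The factor weights, field names, and the "risky parameter" list are assumptions for illustration, not AutoPentest's actual formula.

```python
# Hypothetical weights; real scoring also considers technology risk
# indicators and taint chain confidence per the factors listed above.
RISKY_PARAM_NAMES = {"id", "q", "url", "file", "redirect", "path"}

def risk_score(endpoint):
    params = endpoint.get("params", [])
    score = 2.0 * len(params)                                  # parameter count
    score += 5.0 * sum(1 for p in params if p in RISKY_PARAM_NAMES)
    score += 3.0 * len(endpoint.get("tool_hits", []))          # tool convergence
    if not endpoint.get("requires_auth", True):
        score += 4.0                                           # open attack surface
    return score

endpoints = [
    {"path": "/search", "params": ["q"], "tool_hits": ["ffuf", "katana"], "requires_auth": False},
    {"path": "/about", "params": [], "tool_hits": [], "requires_auth": False},
]
ranked = sorted(endpoints, key=risk_score, reverse=True)  # highest risk first
```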
Tool Output Parsing
- 13 built-in parsers for common CLI tools (nmap, nuclei, sqlmap, ffuf, httpx, whatweb, testssl, nikto, dalfox, katana, gau, wapiti, commix).
- Condenses raw tool output 3-5x while preserving key findings, endpoints, and errors.
- Configurable verbosity: summary (~15 lines), detailed (~50 lines), full (complete parsed output).
CLI Tool Results Verification
- Automatic validation of CLI tool output quality — detects empty output, proxy errors, permission issues, and suspicious results.
- 10 per-tool validators (nmap, nuclei, sqlmap, ffuf, feroxbuster, testssl, dalfox, wapiti, katana, httpx) with corrected command suggestions.
- When a tool produces empty or suspicious output, the validator suggests fixes (e.g., add `-Pn` for nmap, remove proxy env vars, try different flags).
- Integrated into the tool execution workflow — agents call `verify_tool_result()` after every CLI tool run.
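A minimal sketch of what such a validator might look like. The heuristics and suggestion strings are illustrative; only the `-Pn` hint mirrors the example above.

```python
def verify_tool_result(tool, output):
    """Flag empty output, down-host reports, and proxy errors (illustrative)."""
    issues, suggestions = [], []
    if not output.strip():
        issues.append("empty output")
        suggestions.append("re-run with increased verbosity or different flags")
    if tool == "nmap" and "Host seems down" in output:
        issues.append("host reported down (likely ICMP filtering)")
        suggestions.append("add -Pn to skip host discovery")
    if "proxyconnect" in output or "Connection refused" in output:
        issues.append("proxy error")
        suggestions.append("unset HTTP_PROXY/HTTPS_PROXY and retry")
    return {"ok": not issues, "issues": issues, "suggestions": suggestions}

result = verify_tool_result("nmap", "Note: Host seems down. If it is really up, try -Pn")
```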
Progressive Context Compression
- Phase summaries (~500-800 words) auto-generated when phase gates pass — capturing findings, coverage, tool results, and attack surface in compressed form.
- Prevents context degradation in long-running engagements by replacing raw historical data with structured summaries.
- `get_engagement_summary()` combines all phase summaries into a single overview for injecting into new subagent prompts.
- Summaries stored as deliverables — accessible by any downstream agent without requiring full engagement history.
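In sketch form, `get_engagement_summary()` might stitch phase summaries together like this; the output formatting and data shape are assumptions.

```python
def get_engagement_summary(phase_summaries):
    """Concatenate per-phase summaries in phase order (illustrative sketch)."""
    sections = [f"## Phase {phase}\n{text.strip()}"
                for phase, text in sorted(phase_summaries.items())]
    return "\n\n".join(sections)

overview = get_engagement_summary({
    1: "Recon found 42 endpoints across 2 domains.",
    0: "Target mapped; 3 domains in scope.",
})
```

The combined string can then be injected verbatim into a fresh subagent prompt in place of raw history.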
Counterfactual Analysis (Second-Pass Discovery)
- After an Analyzer completes with vulnerabilities found, a second Analyzer is spawned with instructions to "assume those vulns are patched".
- The counterfactual Analyzer searches for additional vulnerabilities: different endpoints, different parameters, different injection contexts, logic flaws.
- Results are appended to the existing exploitation queue (automatic merge with deduplication by endpoint+parameter and auto-incrementing IDs).
- Based on PenHeal ablation research showing +71% vulnerability coverage with counterfactual prompting.
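The merge-with-deduplication step can be sketched as follows; the queue-item field names are assumed for illustration.

```python
def merge_queues(existing, extra):
    """Append counterfactual queue items, deduplicating by (endpoint, param)
    and assigning auto-incrementing IDs. Illustrative sketch only."""
    seen = {(item["endpoint"], item["param"]) for item in existing}
    next_id = max((item["id"] for item in existing), default=0) + 1
    merged = list(existing)
    for item in extra:
        key = (item["endpoint"], item["param"])
        if key not in seen:
            merged.append({**item, "id": next_id})
            seen.add(key)
            next_id += 1
    return merged

q1 = [{"id": 1, "endpoint": "/search", "param": "q"}]
q2 = [{"id": 1, "endpoint": "/search", "param": "q"},       # duplicate, dropped
      {"id": 2, "endpoint": "/profile", "param": "uid"}]    # new, re-numbered
merged = merge_queues(q1, q2)
```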
Multi-Domain Support
- Automatic SSO/OAuth/OIDC/SAML detection and handling.
- Per-domain scope registration, crawling, and testing.
- Cookie jar management for cross-domain session persistence.
- 6-level authentication failure escalation (alternative grants → PKCE → headless browser → token extraction → user provision → unauthenticated).
Crash-Safe Engagement Management
- Append-only `findings.md` and `progress.log` survive crashes.
- Git workspace checkpointing with rollback capability.
- Auto-resume on interruption — `resume-prompt.md` auto-generated at every checkpoint with full context (target, credentials, current phase, remaining tests, scope). Paste it into a new session to continue exactly where you left off.
- Mid-phase checkpoint granularity — tracks which tests within a phase are completed, not just phase-level state.
- Full audit trail of every MCP tool call with timestamps.
Professional Reporting
- Markdown reports with executive summary, findings by severity, test coverage matrix, and tool coverage.
- Per-category coverage percentages and gap analysis.
- Vulnerability chaining analysis documented.
- Final Judge observations and quality notes included.
📦 Installation
git clone https://github.com/bhavsec/autopentest-ai.git
cd autopentest-ai
cd server && uv sync && cd ..
make setup
💻 Usage Examples
Option A: Interactive Mode
Launch Claude Code and provide the target:
Run a full pentest against https://app.example.com
Credentials: admin / P@ssw0rd123
Claude will ask for any missing information (like credentials) and begin the 7-phase workflow.
Option B: Config-Driven Mode (Recommended)
Create a YAML config file for repeatable, consistent assessments:
target:
  url: https://app.example.com
  scope:
    - app.example.com
    - api.example.com
  exclude:
    - cdn.example.com
authentication:
  login_type: form
  login_url: https://app.example.com/login
  credentials:
    username: testuser@example.com
    password: secret123
  login_flow:
    - "Type $username into the email field"
    - "Type $password into the password field"
    - "Click the 'Sign In' button"
  success_condition:
    type: url_contains
    value: "/dashboard"
rules:
  avoid:
    - description: "Do not test logout"
      type: path
      url_path: "/logout"
  focus:
    - description: "Prioritize API endpoints"
      type: path
      url_path: "/api"
reporting:
  tester_name: "Security Team"
Then in Claude Code:
Load the config from configs/my-target.yaml and run the pentest
Option C: Targeted Testing
Run specific WSTG tests against specific endpoints:
Run WSTG-INPV-05 (SQL Injection) against https://app.example.com/search?q=
Test https://app.example.com for CORS misconfiguration (WSTG-CONF-13)
Run all authentication tests (WSTG-ATHN) against https://app.example.com
Option D: Resume an Interrupted Engagement
Resume engagement pentest-2026-02-11-myapp
📚 Documentation
Agent Role System
AutoPentest uses 4 specialized agent roles instead of generic subagents. Each role has a dedicated prompt template with focused tool guidance, input/output contracts, and anti-patterns.
| Role | Template | Purpose | Phases |
|---|---|---|---|
| Scout | `templates/agent-roles/scout.md` | Reconnaissance and attack surface mapping | Phases 0-1, source code discovery |
| Analyzer | `templates/agent-roles/analyzer.md` | Vulnerability discovery with canary/witness payloads | Phases 2-5 analysis |
| Exploiter | `templates/agent-roles/exploiter.md` | Exploitation proof with evidence | Phase 4 exploitation |
| Reporter | `templates/agent-roles/reporter.md` | Quality review and Final Judge | Phase transitions, post-report |
How the Pipeline Works
Phase 4 (highest-impact testing) uses a two-stage pipeline per vulnerability class:
┌──────────────────────────────────────────────────────────────┐
│ Pipeline 1: XSS │
│ │
│ Analyzer (75 turns) Exploiter (75 turns) │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ Discover endpoints │ │ Load Analyzer queue │ │
│ │ Send canary payloads│─────▶│ Attempt exploitation│ │
│ │ Build exploit queue │ gate │ Prove impact │ │
│ │ Save deliverable │ │ Log findings │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ ▲ │
│ validate_exploitation_queue() │
└──────────────────────────────────────────────────────────────┘
Three pipelines (XSS, Injection, SSRF/SSTI) run in parallel. The validation checkpoint between Analyzer and Exploiter ensures only well-formed exploitation queues proceed.
Role Boundaries
Each role has explicit tool restrictions enforced through prompts:
- Scouts cannot call `log_finding()` or send attack payloads.
- Analyzers can log configuration findings (missing headers, weak cookies) but not injection-class findings.
- Exploiters cannot create new queues — they consume what the Analyzer produced.
- Reporters cannot send HTTP requests to the target — they review data only.
For CTF challenges and small apps (<3 input endpoints), a legacy monolithic pipeline is available as a fallback.
Testing Phases
Phase 0: Application Discovery & Mapping
The critical foundation phase. Claude autonomously:
- Pre-flight checks — verifies target reachability, detects redirects and cross-domain auth.
- Launches 10+ background tools in parallel (katana, ffuf, nuclei, whatweb, gau, nmap, feroxbuster, wapiti, httpx).
- Recursive crawling — follows links to depth 2-3, parses HTML/JS for endpoints.
- Directory brute-forcing — common paths + technology-specific wordlists.
- Tool result ingestion — reads all background tool outputs and merges into unified endpoint map.
- Builds structured endpoint inventory with parameters, auth requirements, and priority rankings.
Output: A complete endpoint map organized by domain, ready for systematic testing.
Phases 1-2: Reconnaissance & Configuration
- Server fingerprinting, technology detection, metadata review.
- Security header analysis (HSTS, CSP, CORS, X-Frame-Options).
- TLS configuration testing, admin interface discovery.
- HTTP methods testing, file extension handling.
Phase 3: Authentication, Authorization & Session Management
- Role/privilege lattice built before testing (maps guards, middleware, and bypass tests).
- IDOR testing with multiple alternate IDs per endpoint.
- CSRF testing on every state-changing endpoint.
- Session fixation, hijacking, and token analysis.
- JWT vulnerability testing (if applicable).
- OAuth/OIDC weakness testing (if applicable).
Phase 4: Input Validation (Highest Impact)
Three independent two-stage pipelines run in parallel, each using the Analyzer→Exploiter role split:
| Pipeline | Vulnerability Classes | Tools | Technique Guides |
|---|---|---|---|
| XSS Pipeline | Reflected XSS, Stored XSS, DOM XSS | dalfox, Playwright | XSS, DOM |
| Injection Pipeline | SQL Injection, Command Injection, NoSQL Injection | sqlmap, commix, nosqli | SQLI, CMDI, NOSQLI |
| SSRF/SSTI Pipeline | SSRF, SSTI, Path Traversal | sstimap, ssrfmap | SSRF, SSTI, PTRAV |
Each pipeline: Analyzer (discover → analyze → build exploitation queue) → validation checkpoint → Exploiter (attempt exploitation → prove impact → log findings). WAF evasion intelligence is shared across all pipelines.
Phase 5: Error Handling, Crypto, Business Logic, Client-Side & APIs
- Stack trace and error message disclosure.
- TLS/SSL testing via testssl.sh.
- Business logic bypass (workflow circumvention, request forgery).
- Client-side testing (clickjacking, open redirects, DOM manipulation).
- GraphQL and REST API testing.
- Vulnerability chaining analysis across all findings.
Phase 6: Reporting
- Coverage verification (test coverage + tool coverage).
- Finding deduplication and severity calibration.
- Markdown report generation with executive summary, findings, coverage matrices.
Phase 7: Final Judge Review
A zero-context agent reviews the entire engagement cold — no knowledge of testing decisions or difficulties. It examines:
- Coverage integrity — rubber-stamped tests, missing endpoints.
- N/A cascade detection — categories with excessive "not applicable" markings.
- Finding quality — evidence completeness, severity consistency, chaining opportunities.
- Tool utilization — tools run but output never reviewed, lazy skip reasons.
- Missed attack surface — untested endpoints, untested parameters, untested domains.
The verdict (PASS/CONDITIONAL_PASS/FAIL) triggers specific remediation actions before the report is delivered.
Security Tools
Discovery & Reconnaissance (Phase 0)
| Tool | Purpose | Key Flags |
|---|---|---|
| katana | Web crawler with JS rendering | `-jc` for JavaScript crawling |
| httpx | HTTP probing, tech detection | `-tech-detect -status-code -title` |
| ffuf | Directory/parameter fuzzing | `-w wordlist -mc all -fc 404` |
| feroxbuster | Recursive directory enumeration | `--smart --auto-tune` |
| nuclei | Template-based vuln scanner | `-t cves/ -t misconfigurations/` |
| nikto | Web server misconfiguration | `-Tuning 1234567890` |
| whatweb | Technology fingerprinting | `--aggression 3` |
| nmap | Port and service scanning | `-sV -sC --top-ports 1000` |
| gau | Historical URL discovery | `--blacklist png,jpg,gif` |
| subfinder | Subdomain enumeration | `-silent -all` |
Injection Testing (Phase 4)
| Tool | Purpose | Key Flags |
|---|---|---|
| sqlmap | SQL injection (all techniques) | `--batch --risk 3 --level 5` |
| dalfox | XSS scanning & exploitation | `--skip-bav --deep-domxss` |
| commix | Command injection | `--batch --all` |
| sstimap | Server-Side Template Injection | `-u <url>` |
| ssrfmap | SSRF exploitation | `-r request.txt` |
| nosqli | NoSQL injection | `-u <url>` |
| crlfuzz | CRLF injection / HTTP splitting | `-u <url>` |
| smuggler | HTTP request smuggling | `-u <url>` |
Authentication & Session (Phase 3)
| Tool | Purpose | Key Flags |
|---|---|---|
| hydra | Credential brute-force | `-L users.txt -P pass.txt` |
| jwt_tool | JWT token analysis & exploitation | `-t <token> -M at` |
Cryptography & APIs (Phase 5)
| Tool | Purpose | Key Flags |
|---|---|---|
| testssl.sh | TLS/SSL configuration testing | `--severity HIGH --sneaky` |
| graphql-cop | GraphQL security testing | `-t <url>` |
| websocat | WebSocket testing | `ws://<url>` |
Infrastructure (Phase 2)
| Tool | Purpose |
|---|---|
| corscanner | CORS misconfiguration scanning |
| dnsreaper | Subdomain takeover detection |
Browser Automation
| Tool | Purpose |
|---|---|
| Playwright | DOM XSS proof, clickjacking, JS-rendered login, client-side storage inspection |
WSTG Knowledge Base
109 test cases across 12 OWASP categories, each with CLI-specific procedures:
| Code | Category | Tests | Examples |
|---|---|---|---|
| INFO | Information Gathering | 10 | Search engine discovery, server fingerprinting, metadata review |
| CONF | Configuration & Deployment | 14 | Security headers, CORS, CSP, HSTS, admin interfaces |
| IDNT | Identity Management | 5 | Role definitions, registration, account enumeration |
| ATHN | Authentication | 11 | Default creds, lockout, auth bypass, MFA, password policy |
| ATHZ | Authorization | 5 | Directory traversal, auth bypass, privilege escalation, IDOR |
| SESS | Session Management | 11 | Cookie attributes, CSRF, session fixation/hijacking, JWT |
| INPV | Input Validation | 20 | XSS, SQLi, CMDi, SSTI, SSRF, path traversal, XXE, LDAP |
| ERRH | Error Handling | 2 | Error messages, stack traces |
| CRYP | Cryptography | 4 | TLS config, padding oracle, weak encryption |
| BUSL | Business Logic | 10 | Workflow bypass, request forgery, file upload, rate limits |
| CLNT | Client-Side | 14 | DOM XSS, clickjacking, open redirects, WebSockets, storage |
| APIT | API Testing | 3 | GraphQL, REST, SOAP |
Each test file includes:
- Step-by-step CLI procedures (curl commands, tool invocations).
- Payloads organized by bypass level (basic, intermediate, advanced).
- Detection criteria with severity assessment rubrics.
- Remediation guidance with references.
PortSwigger Technique Guides
31 attack technique reference guides sourced from PortSwigger Web Security Academy, organized by vulnerability class for direct use during real pentesting engagements.
What's Included
| Code | Category | WSTG Mapping | Key Content |
|---|---|---|---|
| SQLI | SQL Injection | INPV-05 | UNION/blind/error/time-based/OOB techniques, database-specific cheat sheets (Oracle, MySQL, PostgreSQL, MSSQL), WAF bypass |
| XSS | Cross-Site Scripting | INPV-01, INPV-02, CLNT-01 | Reflected/stored/DOM contexts, tag & event handler payloads, CSP bypass, filter evasion |
| CMDI | OS Command Injection | INPV-12 | Separator characters, blind techniques (time-delay, OOB), OS-specific payloads |
| SSTI | Server-Side Template Injection | INPV-18 | Jinja2/Twig/Freemarker/Velocity/ERB detection & exploitation, sandbox escapes |
| SSRF | Server-Side Request Forgery | INPV-19 | URL scheme tricks, IP obfuscation, DNS rebinding, cloud metadata, filter bypass |
| PTRAV | Path Traversal | INPV-04 | Encoding variations, null byte injection, wrapper bypass |
| XXE | XML External Entities | INPV-07 | File retrieval, SSRF via XXE, blind XXE with OOB, parameter entities |
| AUTHN | Authentication | ATHN-01 to ATHN-07 | Brute force, 2FA bypass, password reset poisoning, credential stuffing |
| AUTHZ | Access Control | ATHZ-01 to ATHZ-04 | IDOR, privilege escalation, horizontal/vertical bypass, referer-based controls |
| JWT | JSON Web Tokens | SESS-10 | Algorithm confusion (none/HS256→RS256), kid injection, JWK/JKU exploitation |
| OAUTH | OAuth 2.0 | ATHZ-05 | Authorization code theft, open redirect, scope upgrade, CSRF on OAuth flows |
| CSRF | Cross-Site Request Forgery | SESS-05 | Token bypass, SameSite bypass, referer validation bypass |
| SMUGGLE | HTTP Request Smuggling | INPV-15 | CL.TE, TE.CL, TE.TE, HTTP/2 downgrade, request tunneling |
| DOM | DOM-Based Vulnerabilities | CLNT-01 | Sources/sinks, DOM clobbering, prototype pollution gadgets |
| CORS | Cross-Origin Resource Sharing | CONF-13, CLNT-07 | Origin reflection, null origin, subdomain trust exploitation |
| NOSQLI | NoSQL Injection | INPV-05 | MongoDB operator injection, JavaScript injection, blind extraction |
| GRAPHQL | GraphQL | APIT-01 | Introspection, field suggestion, batching attacks, authorization bypass |
| RACE | Race Conditions | BUSL-04 | Limit overrun, TOCTOU, single-endpoint races, last-frame sync |
| UPLOAD | File Upload | BUSL-08, BUSL-09 | Extension bypass, content-type manipulation, web shells, polyglot files |
| HOST | Host Header Injection | INPV-17 | Password reset poisoning, cache poisoning, routing-based SSRF |
Plus 11 more: CLICK, WS, CACHEPOIS, CACHEDEC, DESER, INFO, BUSL, PROTO, API, LLM, SKILLS.
How They're Used
Technique guides are integrated into every testing phase via the get_technique_guide() MCP tool:
Phase 2 → CORS guide for CONF-13 testing
Phase 3 → AUTHN, AUTHZ, CSRF, JWT, OAUTH guides for auth/session testing
Phase 4 → SQLI, XSS, CMDI, SSTI, SSRF, PTRAV, XXE guides for input validation
Phase 5 → DOM, CLICK, GRAPHQL, RACE, UPLOAD guides for client-side & business logic
Each parallel testing agent automatically loads its relevant technique guide before testing, providing:
- Detection payloads — what to inject to identify the vulnerability.
- Exploitation techniques — organized by attack method with step-by-step procedures.
- Cheat sheets — database/platform-specific syntax tables for quick reference.
- WAF bypass patterns — encoding, obfuscation, and filter evasion strategies.
Adding Custom Guides
See for instructions on adding new technique guides to the knowledge base.
Quality Assurance System
AutoPentest has a multi-layered QA system that prevents shallow testing:
1. Phase Gates (Automated)
After each phase, phase_gate_check() validates:
- All MUST-priority tests were executed.
- Minimum coverage thresholds are met.
- Tool coverage is adequate.
- No critical gaps exist.
Blocked phases cannot proceed until all issues are resolved.
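In sketch form, a gate check like `phase_gate_check()` might look like the following; the threshold value and record fields are assumptions, not the actual implementation.

```python
def phase_gate_check(tests, min_coverage=0.8):
    """Pass only if every MUST test ran and coverage meets the threshold
    (illustrative sketch of the gate logic described above)."""
    must_missing = [t["id"] for t in tests
                    if t["priority"] == "MUST" and t["status"] == "pending"]
    executed = sum(1 for t in tests if t["status"] != "pending")
    coverage = executed / len(tests) if tests else 0.0
    passed = not must_missing and coverage >= min_coverage
    return {"pass": passed, "must_missing": must_missing, "coverage": coverage}

gate = phase_gate_check([
    {"id": "CONF-01", "priority": "MUST", "status": "done"},
    {"id": "CONF-02", "priority": "SHOULD", "status": "pending"},
], min_coverage=0.5)
```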
2. Quality Reviewer (Per-Phase)
A subagent spawned at every phase transition that:
- Checks for 16 known anti-patterns (rubber-stamping, N/A cascades, finding inflation).
- Identifies untested endpoints and parameters.
- Suggests vulnerability chaining opportunities.
- Recommends alternative approaches for blocked tests.
3. Final Judge (Post-Report)
A zero-context agent that reviews the completed engagement with fresh eyes:
- Analyzes coverage integrity across all domains.
- Detects N/A cascades and their root causes.
- Validates finding quality and evidence completeness.
- Identifies missed attack surface.
- Issues a verdict: PASS, CONDITIONAL_PASS, or FAIL.
4. Exhaustion Gates
Marking a vulnerability as "not exploitable" requires proof of effort:
| Vuln Class | Min Techniques | Min Bypass Attempts |
|---|---|---|
| XSS | 3 | 5 |
| SQL Injection | 3 | 5 |
| Command Injection | 3 | 5 |
| SSTI | 2 | 3 |
| SSRF | 3 | 5 |
| Path Traversal | 3 | 5 |
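The thresholds above translate directly into a small gate check. This is a sketch; the real implementation presumably tracks attempts per endpoint and parameter.

```python
# Minimums taken from the exhaustion gate table: (techniques, bypass attempts).
EXHAUSTION_MINIMUMS = {
    "xss": (3, 5), "sqli": (3, 5), "cmdi": (3, 5),
    "ssti": (2, 3), "ssrf": (3, 5), "ptrav": (3, 5),
}

def exhaustion_gate(vuln_class, techniques_tried, bypasses_tried):
    """Allow a 'not exploitable' verdict only after minimum effort."""
    min_t, min_b = EXHAUSTION_MINIMUMS[vuln_class]
    return techniques_tried >= min_t and bypasses_tried >= min_b

ok = exhaustion_gate("ssti", techniques_tried=2, bypasses_tried=3)
too_few = exhaustion_gate("xss", techniques_tried=1, bypasses_tried=5)
```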
5. Evidence Checklists
Before logging any finding, evidence requirements are verified:
- Reproducible curl command.
- Full HTTP request and response.
- Proof of actual exploitation (not theoretical impact).
- Correct classification tier (EXPLOITED vs POTENTIAL).
6. Live Engagement Logging
Every MCP tool call is automatically logged to engagements/<eid>/logs.txt with full arguments, results, and execution duration. Run tail -f logs.txt in a separate terminal to watch all agent activity in real time. 100% coverage via automatic tool wrapper — no manual instrumentation needed.
7. Phase Gate Timing
Phase gates enforce a minimum 60-second interval between calls (15 seconds in CTF mode), preventing premature phase completion. Inter-gate work verification warns if fewer than 3 work events occur between consecutive gates.
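The interval enforcement can be sketched with a monotonic clock; the class below is illustrative, not AutoPentest's actual timer.

```python
import time

class GateTimer:
    """Reject gate calls spaced closer than min_interval seconds."""
    def __init__(self, min_interval=60.0):
        self.min_interval = min_interval
        self.last_call = None

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if self.last_call is not None and now - self.last_call < self.min_interval:
            return False          # premature call, timestamp not updated
        self.last_call = now
        return True

timer = GateTimer(min_interval=60.0)   # would be 15.0 in CTF mode
first = timer.allow(now=0.0)
premature = timer.allow(now=30.0)
later = timer.allow(now=70.0)
```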
Benchmarking
AutoPentest includes integration with the XBOW Validation Benchmarks — 104 CTF-style Docker challenges used as the industry standard for benchmarking AI pentest agents.
Benchmark Scores (Reference)
| Agent | Score | Source |
|---|---|---|
| Shannon | 96.2% | KeygraphHQ (2024) |
| PentestGPT | 86.5% | USENIX Sec 2024 |
Usage
cd benchmarks/xbow && make setup
make solve ID=XBEN-001-24
make solve ID=XBEN-001-24 RAW=1
make solve-tag TAG=sqli
make solve-all
make solve-all RAW=1
make score
make compare
The solver has two modes:
- autopentest (default): Runs Claude Code from the project root, loading `.mcp.json` (MCP server with 68+ tools) and `CLAUDE.md` (pentest methodology). Measures AutoPentest's full capability.
- raw (`RAW=1`): Runs bare Claude Code with no MCP server or methodology. Baseline for measuring AutoPentest's value-add over raw LLM capability.
Each challenge is a Docker Compose app with a flag injected at build time. Flag extraction from Claude's output determines pass/fail. Results are scored per challenge, per tag, and per difficulty level.
CTF Mode
For CTF challenges and small apps, enable CTF mode for relaxed quality gates:
mode: ctf
target:
  url: https://target.com
CTF mode reduces phase gate timing (15s vs 60s), skips QA Reviewer requirements, and halves completion thresholds — while maintaining finding quality and evidence standards.
Example Report
A complete example report from a pentest against PortSwigger's Gin & Juice Shop (a deliberately vulnerable application) is included in the repository:
View Full Report
What the Report Includes
The report demonstrates AutoPentest's output against a real target with 23 findings across all severity levels:
| Severity | Count | Examples |
|---|---|---|
| Critical | 2 | UNION-based SQL injection with full data extraction, access control bypass via X-Original-URL header |
| High | 5 | Reflected XSS via JS string escape bypass, IDOR on order details, XXE with local file read, DOM XSS via prototype pollution |
| Medium | 6 | Missing security headers, no account lockout, missing CSP, CRLF injection, DOM-based open redirect |
| Low | 5 | Infrastructure info disclosure, EOL AngularJS, insecure ALB cookies, weak TLS config |
| Informational | 5 | Consolidated duplicates and secondary evidence for primary findings |
Report Structure
1. Executive Summary — Target scope, finding summary, domain architecture
2. Detailed Findings — Each finding with description, evidence (curl commands), and remediation
3. Vulnerability Chaining — Cross-finding analysis (e.g., XSS + no CSP = severity upgrade)
4. Test Coverage Matrix — Per-category WSTG coverage (100% across 12 categories)
5. Tool Coverage Matrix — 27/27 tools tracked, 8 actively run
Sample Finding (SQL Injection)
From the report — a Critical SQL injection finding with full exploitation evidence:
FINDING-017: SQL Injection in /catalog category parameter — Full Data Extraction
Severity: Critical
WSTG Reference: WSTG-INPV-05
The category parameter is vulnerable to UNION-based SQL injection.
The attacker can:
1. Inject a single quote to cause a 500 error (confirming injection)
2. Use UNION SELECT with 8 columns to extract arbitrary data
3. Enumerate tables: PRODUCTS, TRACKING, USERS
4. Extract credentials from the USERS table
Evidence (reproducible curl command):
curl -sk "https://ginandjuice.shop/catalog?category='+UNION+SELECT+1,USERNAME,PASSWORD,1,1,USERNAME,1,USERNAME+FROM+USERS+LIMIT+10--"
Every finding includes reproducible curl commands, full request/response evidence, and actionable remediation guidance.
Configuration
Engagement Config (YAML)
Config-driven pentests skip interactive questions and ensure consistency:
target:
  url: https://app.example.com
  scope: [app.example.com, api.example.com]
authentication:
  login_type: sso
  login_url: https://app.example.com/login
  credentials:
    username: testuser
    password: secret123
  sso:
    provider: keycloak
    auth_domain: auth.example.com
    realm: myrealm
    client_id: my-app
rules:
  avoid:
    - { type: path, url_path: "/logout", description: "Skip logout" }
    - { type: endpoint, method: DELETE, url_path: "/api/admin/*", description: "No destructive admin ops" }
  focus:
    - { type: path, url_path: "/api", description: "Prioritize API" }
reporting:
  tester_name: "Security Team"
MCP Server Configuration
The .mcp.json file registers two MCP servers:
{
  "mcpServers": {
    "wstg-pentest": {
      "command": "uv",
      "args": ["--directory", "./server", "run", "server.py"]
    },
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp"]
    }
  }
}
Burp Suite Integration (Optional)
For passive traffic monitoring through Burp Suite Professional:
- Start Burp Suite and enable the proxy on all interfaces (`0.0.0.0:8080`).
- The Docker container automatically routes traffic through `host.docker.internal:8080`.
- All HTTP requests appear in Burp's proxy history for manual review.
Multi-Domain Testing
AutoPentest has first-class support for applications with multiple domains (e.g., a SPA frontend + API backend + SSO provider):
Automatic Detection
During Phase 0, AutoPentest detects cross-domain authentication by following login redirects:
app.example.com → redirects to → auth.example.com/login
→ after login → app.example.com/callback
All domains are automatically registered in scope with their type (app, auth_provider, api, cdn).
Per-Domain Testing
Every WSTG test is evaluated per domain — not just the primary:
- Discovery tools (katana, ffuf, nuclei) run against all domains.
- Input validation tools (sqlmap, dalfox) target endpoints on every domain with server-side processing.
- A test is "not applicable" only when no domain has the tested feature.
Cross-Domain Authentication
Supported SSO protocols:
- OAuth 2.0 / OIDC (Authorization Code, PKCE, Password Grant, Client Credentials).
- SAML (SP-initiated flow).
- Keycloak, Auth0, Okta, Azure AD.
- Custom SSO (redirect chain following with cookie jar).
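The "redirect chain following with cookie jar" approach boils down to one shared cookie jar across domains, so a session cookie set by auth.example.com rides along on subsequent requests to app.example.com. A stdlib-only sketch (the login request is commented out because it needs a live target):

```python
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical sketch of custom-SSO handling: one CookieJar shared across
# domains, attached to an opener that follows redirects automatically.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
# opener.open("https://app.example.com/login")  # follows redirects, accumulating session cookies
print(len(jar))  # 0 — empty until a real login flow runs
```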
A 6-level authentication escalation procedure ensures testing can proceed even with complex auth flows.
Crash Recovery
AutoPentest is designed to survive interruptions:
Automatic Checkpointing
- Phase gates auto-save checkpoints on PASS.
- git_checkpoint() creates git snapshots of the engagement workspace.
- Append-only logs (findings.md, progress.log) survive crashes.
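Append-only logging is what makes the logs crash-safe: opening in append mode means an interrupted run never truncates earlier entries. An illustrative helper (not AutoPentest's actual logger):

```python
import datetime
import pathlib
import tempfile

def log_event(engagement_dir, message):
    """Append a timestamped line to progress.log. Append mode ("a") means a
    crash mid-run never truncates earlier entries. Illustrative helper only."""
    ts = datetime.datetime.now(datetime.timezone.utc).isoformat()
    path = pathlib.Path(engagement_dir) / "progress.log"
    with path.open("a", encoding="utf-8") as f:
        f.write(f"{ts} {message}\n")

with tempfile.TemporaryDirectory() as d:
    log_event(d, "PHASE_GATE phase=2 result=PASS")
    log_event(d, "CHECKPOINT saved")
    lines = (pathlib.Path(d) / "progress.log").read_text().splitlines()
print(len(lines))  # 2
```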
Auto-Resume via resume-prompt.md (Recommended)
Every checkpoint and phase gate automatically generates engagements/<eid>/resume-prompt.md — a complete, self-contained prompt with everything a fresh session needs:
- Target URL, authentication credentials, and scope domains.
- Current phase and which specific tests remain (mid-phase precision).
- Cookie jar status and re-authentication instructions.
- Avoid/focus rules and endpoint map references.
To resume after an interruption:
- Open a new Claude Code session.
- Paste the contents of engagements/<eid>/resume-prompt.md.
- Claude picks up exactly where it left off — no manual context needed.
Resume from Checkpoint (Alternative)
Resume engagement pentest-2026-02-11-myapp
This restores:
- All findings and test tracking data.
- Coverage statistics and phase gate results.
- Scope registrations and deliverables.
- Mid-phase remaining tests (not just phase-level state).
- Instructions for what to do next.
Manual Checkpoints
Save at any time:
Save a checkpoint before starting Phase 4 exploitation
Rollback on Failure
If a phase produces bad results, roll back to the previous checkpoint:
Roll back the engagement to the last checkpoint
Project Structure
autopentest-ai/
├── CLAUDE.md # Master pentest workflow (drives Claude Code)
├── .mcp.json # MCP server configuration
├── Dockerfile # Multi-stage Docker build (27 tools)
├── docker-compose.yml # Docker Compose alternative
├── Makefile # setup, start, stop, verify-tools, shell
│
├── server/
│ ├── server.py # FastMCP server (68+ MCP tools)
│ ├── task_tree.py # Hierarchical task tree (6 MCP tools)
│ ├── tool_parsers.py # Tool output parsing (2 MCP tools, 13 parsers)
│ ├── endpoint_priority.py # Endpoint risk prioritization (2 MCP tools)
│ ├── waf_evasion.py # Adaptive WAF evasion (3 MCP tools, 12 vendors)
│ ├── knowledge_graph.py # Cross-phase knowledge graph (5 MCP tools)
│ ├── tool_verification.py # CLI tool results verification (1 MCP tool, 10 validators)
│ ├── context_compression.py # Progressive context compression (2 MCP tools)
│ └── pyproject.toml # Python dependencies
│
├── knowledge-base/
│ ├── web-security-testing-guide/ # OWASP WSTG knowledge base (109 test procedures)
│ │ ├── 01-information-gathering/ # 10 tests (WSTG-INFO-01 → 10)
│ │ ├── 02-configuration/ # 14 tests (WSTG-CONF-01 → 14)
│ │ ├── 03-identity-management/ # 5 tests (WSTG-IDNT-01 → 05)
│ │ ├── 04-authentication/ # 11 tests (WSTG-ATHN-01 → 11)
│ │ ├── 05-authorization/ # 5 tests (WSTG-ATHZ-01 → 05)
│ │ ├── 06-session-management/ # 11 tests (WSTG-SESS-01 → 11)
│ │ ├── 07-input-validation/ # 20 tests (WSTG-INPV-01 → 20)
│ │ ├── 08-error-handling/ # 2 tests (WSTG-ERRH-01 → 02)
│ │ ├── 09-cryptography/ # 4 tests (WSTG-CRYP-01 → 04)
│ │ ├── 10-business-logic/ # 10 tests (WSTG-BUSL-01 → 10)
│ │ ├── 11-client-side/ # 14 tests (WSTG-CLNT-01 → 14)
│ │ └── 12-api-testing/ # 3 tests (WSTG-APIT-01 → 03)
│ └── portswigger-academy/ # 31 PortSwigger attack technique guides
│ ├── sql-injection.md # UNION, blind, error-based, OOB, WAF bypass
│ ├── cross-site-scripting.md # Reflected, stored, DOM, CSP bypass, filter evasion
│ ├── ssrf.md # URL schemes, cloud metadata, DNS rebinding
│ ├── ssti.md # Jinja2, Twig, Freemarker sandbox escapes
│ ├── jwt.md # Algorithm confusion, kid injection, JWK exploitation
│ ├── oauth.md # Auth code theft, redirect exploitation, scope upgrade
│ └── ... (31 total) # One per vulnerability class
│
├── templates/ # Testing guides and procedures
│ ├── input-validation-guide.md # Phase 4 step-by-step procedures
│ ├── testing-strategies.md # Test matrices, chaining, parallel strategy
│ ├── cli-tools-guide.md # Tool setup and Docker management
│ ├── tools.md # Per-tool command reference
│ ├── quality-gates.md # Phase quality checklists and anti-patterns
│ ├── cross-domain-auth-guide.md # SSO/OIDC/SAML procedures
│ ├── source-code-analysis.md # Security-focused code review template
│ ├── pipelined-testing.md # Phase 4 pipelined exploitation strategy
│ ├── agent-roles/ # Role-specialized subagent templates
│ │ ├── README.md # Role index and selection guide
│ │ ├── scout.md # Reconnaissance role (Phase 0-1)
│ │ ├── analyzer.md # Vulnerability discovery role (Phase 2-5)
│ │ ├── exploiter.md # Exploitation proof role (Phase 4)
│ │ └── reporter.md # QA review + Final Judge role
│ ├── shared/
│ │ ├── honesty-framework.md # Anti-hallucination guardrails
│ │ ├── exploit-classification.md # Three-tier finding classification
│ │ ├── reproducibility.md # Evidence format requirements
│ │ └── scope-rules.md # Avoid/focus rule templates
│ └── wordlists/ # Tech-specific fuzzing wordlists
│
├── benchmarks/
│ └── xbow/ # XBOW benchmark suite (104 CTF challenges)
│ ├── runner.py # Challenge orchestration
│ ├── solver.py # Automated solver (Claude Code CLI)
│ ├── Makefile # solve, solve-all, score, compare
│ └── results/ # Run reports
│
├── docs/
│ ├── ROADMAP.md # Competitive analysis + improvement roadmap
│ └── adding-knowledge-base-resources.md # Guide for adding new technique guides
│
├── configs/
│ ├── example-config.yaml # Example engagement configuration
│ └── config-schema.md # YAML schema documentation
│
├── scripts/
│ ├── install-tools.sh # Docker build + container start
│ ├── browser-auth.py # Headless Chromium auth (JS-rendered logins)
│ ├── pkce-auth.py # OAuth 2.0 PKCE flow automation
│ └── status.sh # Engagement status dashboard
│
└── engagements/ # Runtime output (git-ignored)
└── <engagement-id>/
├── logs.txt # Live engagement log (tail -f to watch)
├── findings.md # Append-only findings log
├── progress.log # Timestamped event log
├── resume-prompt.md # Auto-resume prompt (paste into new session)
├── report.md # Final pentest report
├── cookies.txt # Cross-domain cookie jar
└── tool-output/ # Raw CLI tool outputs
🔧 Technical Details
Why AutoPentest?
Manual penetration testing is thorough but slow. Automated scanners are fast but shallow. AutoPentest bridges the gap:
| Capability | Manual Pentest | Automated Scanner | AutoPentest |
|---|---|---|---|
| Full OWASP WSTG coverage | Depends on tester | Partial | 109 tests |
| Business logic testing | Yes | No | Yes |
| Multi-step exploitation | Yes | Limited | Yes |
| Vulnerability chaining | Yes | No | Yes |
| Evidence-based findings | Yes | Template output | Reproducible curl commands |
| Consistent quality | Varies | Yes | Phase gates + Final Judge |
| Speed | Days | Minutes | Hours |
| Cross-domain auth (SSO/OIDC) | Manual setup | Usually fails | Automated handling |
Architecture
┌─────────────────────────────────────────────────────────────┐
│ LLM Orchestrator (Claude) │
│ │
│ Reads CLAUDE.md workflow, manages phases, │
│ spawns role-specialized subagents │
└──────────┬──────────┬──────────┬──────────┬─────────────────┘
│ │ │ │
┌─────▼────┐ ┌───▼─────┐ ┌──▼───────┐ ┌▼─────────┐
│ Scout │ │Analyzer │ │Exploiter │ │ Reporter │
│ (recon) │ │ (vuln │ │ (proof) │ │ (QA / │
│ │ │ disc.) │ │ │ │ judge) │
└──────────┘ └─────────┘ └──────────┘ └──────────┘
│ │ │ │
│ MCP │ │ MCP │
▼ ▼ ▼ ▼
┌──────────────────────────┐ ┌──────────────────────┐
│ WSTG MCP Server │ │ Playwright MCP │
│ (68+ tools) │ │ (Browser Testing) │
│ │ │ │
│ ◦ 109 WSTG tests │ │ ◦ DOM XSS proof │
│ ◦ 31 technique guides │ │ ◦ Clickjacking │
│ ◦ Task tree │ │ ◦ JS-rendered auth │
│ ◦ Knowledge graph │ └──────────────────────┘
│ ◦ WAF evasion │
│ ◦ Tool output parser │
│ ◦ Results verification │ docker exec
│ ◦ Context compression │ │
│ ◦ Endpoint priority │ ▼
│ ◦ Quality gates │ ┌──────────────────────┐
│ ◦ Report generation │ │ autopentest-tools │
└──────────────────────────┘ │ (Docker Container) │
│ │
│ 27 security tools: │
│ nuclei, sqlmap, │
│ dalfox, katana, │
│ ffuf, nmap ... │
│ │
│ Burp proxy │
│ passthrough │
└──────────────────────┘
How it works:
- Claude Code reads CLAUDE.md for the complete pentest methodology and orchestrates the 7-phase workflow.
- Role-specialized subagents (Scout, Analyzer, Exploiter, Reporter) execute focused tasks with dedicated prompt templates, tool guidance, and anti-patterns.
- WSTG MCP Server (68+ tools) provides OWASP test procedures, 31 PortSwigger technique guides, hierarchical task tree, knowledge graph, WAF evasion, endpoint prioritization, results verification, context compression, quality gates, and report generation.
- Docker Container runs all 27 security tools — traffic optionally routes through Burp Suite for passive monitoring.
- Playwright MCP handles browser-based testing (DOM XSS, clickjacking, JS-rendered login pages).
📄 License
This tool is intended for authorized security testing only. Only use AutoPentest against applications you have explicit permission to test. Unauthorized access to computer systems is illegal. The authors are not responsible for any misuse of this tool.
Always ensure you have:
- Written authorization from the application owner.
- A clearly defined scope of what can and cannot be tested.
- An understanding of the testing environment (production vs staging).
- Appropriate avoid rules configured for destructive or sensitive endpoints.
Built with Model Context Protocol