The mechanics of offensive security have changed more in the last 18 months than in the previous decade. The reason is not a new exploit, a new vulnerability class, or a new patch bypass. It is the arrival of useful AI inside the red team's toolchain.
According to the SANS 2025 red team survey, 67% of operators now use at least one AI-assisted tool during active engagements — up from 18% in 2023. Hadrian Research has cataloged over 70 open-source AI penetration testing tools released in the 18 months following GPT-4's launch. The shift is real, and it is accelerating.
This article is the operator's view of where modern pentesting actually sits in 2025–2026, with concrete techniques and the implications for defenders.
1. The new attack chain: identity + cloud + AI agents
[Figure: the modern attack chain, 2026]
The textbook attack chain has shifted. Where five years ago a typical engagement started with phishing and ended with lateral movement on Windows domains, the 2026 chain looks different. Initial access is more likely to be a leaked credential or token from a developer's repository. The pivot is increasingly into cloud APIs and identity providers, not domain controllers. The objective often involves exfiltrating training data, prompts, or vector store contents from AI workflows the target organisation is quietly running without governance.
Mandiant's M-Trends 2025 confirms the pattern: exploits remain the most common initial infection vector at 33%, stolen credentials rose to 16%, and cloud-native targets have grown materially as a share of incidents.
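The leaked-credential entry point above usually begins with plain pattern matching over developer artefacts. A minimal sketch, assuming an illustrative pattern set (the AWS `AKIA` and GitHub `ghp_` prefixes are real formats, but a production scanner carries hundreds of patterns plus entropy checks):

```python
import re

# Illustrative patterns only — real scanners use far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "generic_bearer": re.compile(r"(?i)bearer\s+[a-z0-9._-]{20,}"),
}

def scan_text(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, match) pairs found in a blob of text,
    e.g. a commit diff, CI log, or container layer."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((name, match))
    return hits
```

Run over commit histories rather than just HEAD: rotated-but-unrevoked credentials in old commits are a common initial-access find.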
2. AI-augmented reconnaissance
The first phase of any engagement — understanding what the target actually looks like — is where AI delivers its largest gain. Where a skilled operator once spent four to six hours correlating open-source intelligence (OSINT) into a coherent attack surface model, AI-assisted pipelines now produce a prioritised, annotated graph in under forty minutes. The operator's role shifts from data collection to verification and prioritisation — closer to a senior analyst than a junior researcher.
Concretely, the most useful patterns are: LLM-driven entity correlation across leaked credentials, certificate transparency, and DNS data; semantic clustering of public-facing endpoints to surface non-obvious technology stacks; and autonomous social-graph mapping for targeted phishing scope. Used together, these compress the recon phase by 60–80% and routinely surface targets that would have been missed manually.
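The entity-correlation step can be sketched without any LLM in the loop: merge hostnames surfaced by certificate transparency and DNS enumeration under a shared apex domain, so the operator reviews one entity per target rather than two flat lists. A minimal sketch; `apex` is a naive stand-in for proper Public Suffix List handling, and the host lists are hypothetical:

```python
from collections import defaultdict

def apex(domain: str) -> str:
    """Naive apex extraction (assumes a two-label apex; real tooling
    should use the Public Suffix List instead)."""
    return ".".join(domain.lower().rstrip(".").split(".")[-2:])

def correlate(ct_hosts: list[str], dns_hosts: list[str]) -> dict:
    """Group hosts from certificate transparency and DNS data under a
    shared apex domain, producing a per-entity attack-surface view."""
    graph = defaultdict(lambda: {"ct": set(), "dns": set()})
    for h in ct_hosts:
        graph[apex(h)]["ct"].add(h)
    for h in dns_hosts:
        graph[apex(h)]["dns"].add(h)
    return dict(graph)
```

An LLM pass then annotates each entity (likely technology stack, likely owner, priority) rather than doing the raw joining — which keeps the expensive model calls to the judgment step.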
3. Agentic exploitation
The headline development of 2026 is autonomous agents that conduct end-to-end exploitation, not just suggest it. Recent benchmarks for the Excalibur LLM-based pentesting agent demonstrated compromise of four out of five hosts in an Active Directory engagement at a total LLM API cost of $28.50. Automated approaches now achieve a 69.5% success rate, against 47.6% for manual efforts on equivalent target sets.
The implication is significant. The cost of running a credible end-to-end engagement has fallen by an order of magnitude. Threat actors will be the first to weaponise this, but defenders should not be far behind — if your security testing budget assumes 2024 economics, you are funding too few engagements.
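Excalibur's internals are not public in this article, but agents of this class generally share one skeleton: a plan → act → observe loop in which an LLM picks the next tool call and its output feeds back as an observation. A generic sketch with both the planner and the tool stubbed out (the real planner is an LLM call; the real tools are scanners and exploit modules):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentState:
    goal: str
    observations: list[str] = field(default_factory=list)
    done: bool = False

def run_agent(state: AgentState,
              plan: Callable[[AgentState], str],   # stands in for the LLM
              act: Callable[[str], str],           # stands in for a tool
              max_steps: int = 10) -> AgentState:
    """Plan → act → observe until the planner says stop or the
    step budget (which caps API cost) runs out."""
    for _ in range(max_steps):
        action = plan(state)
        if action == "stop":
            state.done = True
            break
        state.observations.append(act(action))
    return state
```

The `max_steps` budget is what makes the per-engagement API cost bounded and predictable — the economics cited above depend on exactly this kind of cap.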
4. Modern post-exploitation
Post-exploitation tradecraft has evolved alongside the entry vectors. The patterns we see most frequently in 2025–2026 engagements:
- Living off the land in the cloud. Using legitimate cloud APIs, IAM tokens, and managed services to move laterally and exfiltrate data. Detection by signature-based tooling is poor; behavioural analytics are essential.
- Token theft over credential theft. OAuth refresh tokens, session cookies, and AWS STS tokens are now the higher-value targets — they bypass MFA and have longer useful lifespans than passwords.
- Data exfiltration through AI workflows. A growing category. If the target organisation has an internal RAG system or AI agent with broad data access, that agent itself becomes the exfiltration vehicle once compromised.
- Persistent access via CI/CD. Compromised pipeline secrets give long-lived, well-disguised access that survives most credential rotation cycles.
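The behavioural-analytics point above can be made concrete with a toy detector: learn which cloud API actions each principal normally calls, then flag first-seen actions — for example, a CI role that has only ever called `s3:GetObject` suddenly calling `sts:AssumeRole`. A deliberately minimal sketch; real systems add time windows, peer-group baselines, and risk scoring:

```python
from collections import defaultdict

class CloudApiBaseline:
    """Toy behavioural detector over cloud API audit events:
    flag actions a principal has never performed before."""

    def __init__(self) -> None:
        self.baseline: dict[str, set[str]] = defaultdict(set)

    def learn(self, principal: str, action: str) -> None:
        """Feed an event from the learning window (e.g. 30 days of logs)."""
        self.baseline[principal].add(action)

    def check(self, principal: str, action: str) -> bool:
        """True if this call deviates from the learned baseline."""
        return action not in self.baseline[principal]
```

This is the shape of detection that catches living-off-the-land activity: every individual call is a legitimate API, so only the deviation from the principal's own history is suspicious.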
5. What this means for defenders
Three implications for any organisation serious about defending against modern offensive operations:
One: assume your attack surface is wider than your CMDB knows. AI-augmented recon by attackers will find the shadow infrastructure, the unmaintained cloud accounts, and the developer artefacts your asset inventory does not track. Continuous external attack surface monitoring is no longer optional.
Two: detect intent, not just signatures. Living-off-the-land techniques and legitimate cloud API misuse cannot be caught by traditional IDS or EDR signatures. Behavioural analytics aligned to MITRE ATT&CK techniques — particularly the Defense Evasion (TA0005) and Lateral Movement (TA0008) tactics — are the foundation of modern detection.
Three: assume your own AI workflows are part of the attack surface. Internal AI agents with access to systems and data are now legitimate targets. Treat them with the same threat modelling rigour as any other privileged service — including allowlist controls, audit logging, and human-in-the-loop on consequential actions.
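The allowlist, audit-logging, and human-in-the-loop controls can live in one small wrapper around an agent's tool calls. A hypothetical sketch — the tool names and the approval callback are illustrative, not a specific product's API:

```python
class ToolGate:
    """Minimal allowlist + audit log for an agent's tool calls,
    with a human-approval callback for consequential actions."""

    def __init__(self, allowed, needs_approval, approve):
        self.allowed = set(allowed)
        self.needs_approval = set(needs_approval)
        self.approve = approve          # human-in-the-loop callback
        self.audit_log = []

    def call(self, tool, run, *args):
        if tool not in self.allowed:
            self.audit_log.append(("denied", tool))
            raise PermissionError(f"tool not allowlisted: {tool}")
        if tool in self.needs_approval and not self.approve(tool, args):
            self.audit_log.append(("rejected", tool))
            raise PermissionError(f"human approval denied: {tool}")
        self.audit_log.append(("allowed", tool))
        return run(*args)
```

The key property: denials are logged, not silent, so an agent probing its own tool boundary shows up in the audit trail — which is itself a detection signal.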
6. Practical recommendations
If you are running or commissioning offensive security in 2026, make these four operational shifts:
- Move your engagement cadence from annual to continuous. The cost economics now support it.
- Add an explicit scope clause for AI workflows, prompt injection chains, and agent tool surfaces.
- Combine human-led red teaming with autonomous agent-based assessment — they find different classes of vulnerabilities.
- Map every finding back to MITRE ATT&CK technique IDs. This is the lingua franca of defenders, and it makes remediation conversations dramatically faster.
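ATT&CK mapping is easy to get wrong in reports — tactic IDs like TA0005 are not technique IDs, and a malformed ID breaks downstream tooling. A small sketch that validates technique-ID format (`T1078` or sub-technique `T1078.004`) before a finding is emitted; the finding fields are illustrative:

```python
import re

# ATT&CK technique IDs: "T" + 4 digits, optional ".NNN" sub-technique,
# e.g. T1078 (Valid Accounts) or T1078.004 (Cloud Accounts).
TECHNIQUE_ID = re.compile(r"^T\d{4}(?:\.\d{3})?$")

def tag_finding(title: str, technique_ids: list[str]) -> dict:
    """Attach ATT&CK technique IDs to a finding, rejecting malformed
    IDs so the report stays machine-readable for defenders."""
    bad = [t for t in technique_ids if not TECHNIQUE_ID.match(t)]
    if bad:
        raise ValueError(f"malformed ATT&CK technique IDs: {bad}")
    return {"title": title, "attack_techniques": technique_ids}
```

Validating at report-generation time means the remediation conversation starts from IDs defenders can paste straight into their detection backlog.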
For Malaysian organisations needing to align this with regulatory expectations — particularly under BNM RMiT, PDPA, and ISO 27001 — our AI Agentic Security programme covers the full attack-and-defend cycle, and is HRDC SBL-KHAS claimable for eligible employers.