Architecture
Pipeline Overview
subfinder (every 2 days, 23:00)
    │
    └── writes: all-resolved-latest.txt
            │
            ▼
httpx (every 30 min)                  nuclei (every 1 hour)
├── reads chunk from resolved list    ├── reads nuclei-queue.txt
├── probes 300 hosts per run          ├── takes max 50 hosts
├── diffs against cumulative state    ├── scans: exposures + misconfig
├── self-heals state on each chunk    ├── scans: default-logins (filtered)
├── applies blacklist filter          ├── diffs against known findings
└── appends NEWs+CHANGEDs to queue    └── Matrix notification if new
    │                                    ▲
    └──── nuclei-queue.txt ──────────────┘
Key Design Decisions
httpx as infinite queue: the chunk pointer is never reset by another job. When subfinder writes a new resolved list, httpx picks up where it left off, and new subdomains appended to the end of the file are scanned when the pointer reaches them on the next natural cycle.
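The pointer behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the job's actual script: the function name `next_chunk`, the pointer file holding a plain line offset, and the wrap-around to the top once the end of the list is reached are all assumptions.

```python
from pathlib import Path

CHUNK_SIZE = 300  # hosts probed per httpx run, per the pipeline overview


def next_chunk(resolved: Path, pointer: Path, size: int = CHUNK_SIZE) -> list[str]:
    """Return the next chunk of hosts and advance the persisted pointer."""
    hosts = [line.strip() for line in resolved.read_text().splitlines() if line.strip()]
    if not hosts:
        return []
    # Assumption: a missing or empty pointer file means "start at line 0".
    start = int(pointer.read_text().strip() or 0) if pointer.exists() else 0
    # Assumption: wrap to the top once the pointer has passed the end of the list.
    if start >= len(hosts):
        start = 0
    chunk = hosts[start:start + size]
    # Only this function moves the pointer; rewriting the resolved list does not.
    pointer.write_text(f"{start + len(chunk)}\n")
    return chunk
```

Because the pointer is stored separately from the resolved list, subfinder can replace the list at any time without disturbing the position; hosts appended at the end are simply reached later in the same pass.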
State self-healing: on every chunk run, all hosts in that chunk are removed from the cumulative state and rewritten with fresh httpx output. Corrupted entries (e.g. a Content-Length value stored as the title) therefore heal automatically on the next pass through that chunk.
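A rough sketch of the diff-then-heal step is shown below. The state line format is an assumption (host or URL as the first whitespace-separated field, followed by the probed fields), and the helper names `host_of`, `diff_chunk` and `heal_state` are hypothetical.

```python
def host_of(line: str) -> str:
    # Assumption: the host/URL is the first whitespace-separated field of a state line.
    return line.split(maxsplit=1)[0]


def diff_chunk(state_lines, fresh_lines):
    """Classify fresh results as NEW (host unseen) or CHANGED (line differs)."""
    old = {host_of(l): l for l in state_lines if l.strip()}
    new, changed = [], []
    for line in fresh_lines:
        if not line.strip():
            continue
        host = host_of(line)
        if host not in old:
            new.append(host)
        elif old[host] != line:
            changed.append(host)
    return new, changed


def heal_state(state_lines, chunk_hosts, fresh_lines):
    """Drop every state entry for a host in this chunk, then add the fresh output."""
    chunk = set(chunk_hosts)
    kept = [l for l in state_lines if l.strip() and host_of(l) not in chunk]
    # Hosts in the chunk that produced no fresh output disappear from the state.
    return sorted(kept + [l for l in fresh_lines if l.strip()])
```

In this sketch a corrupted entry is reported as CHANGED once, when its chunk is next probed, and is then replaced by the fresh line, so it does not trigger again on later passes.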
nuclei decoupled from httpx: nuclei is no longer triggered by httpx. Instead, httpx writes to a queue file and nuclei reads it independently every hour. Capping each nuclei run at 50 targets prevents the timeouts that occurred before.
Duplicate prevention: sort -u on queue additions ensures that no host appears in the queue more than once at any time.
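The queue handoff between the two jobs could look roughly like this. It is a sketch under assumptions: the function names `enqueue` and `dequeue` are hypothetical, the set-plus-sort mirrors the effect of sort -u applied to the whole queue file, and consumed hosts are assumed to be removed from the queue as soon as they are handed to nuclei.

```python
from pathlib import Path

MAX_TARGETS = 50  # per-run cap for nuclei, per the design decision above


def enqueue(queue: Path, hosts) -> None:
    """Producer side (httpx): merge new hosts into the queue without duplicates."""
    existing = queue.read_text().splitlines() if queue.exists() else []
    # Same effect as `sort -u` over the merged file: unique entries, sorted.
    merged = sorted({h.strip() for h in [*existing, *hosts] if h.strip()})
    queue.write_text("".join(f"{h}\n" for h in merged))


def dequeue(queue: Path, limit: int = MAX_TARGETS) -> list[str]:
    """Consumer side (nuclei): take at most `limit` hosts and leave the rest queued."""
    lines = queue.read_text().splitlines() if queue.exists() else []
    lines = [l for l in lines if l.strip()]
    batch, rest = lines[:limit], lines[limit:]
    queue.write_text("".join(f"{h}\n" for h in rest))
    return batch
```

One consequence of sorting the whole file is that the queue is ordered alphabetically rather than by arrival. Since httpx and nuclei run on independent schedules, a real implementation would probably also need a lock or an atomic rename around these file writes.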
File Layout
/var/jenkins_home/recon-state/
├── subfinder/
│   ├── all-resolved-latest.txt     ← httpx input
│   ├── all-subdomains-latest.txt
│   └── history/
├── httpx/
│   ├── httpx-state-cumulative.txt  ← diff baseline (self-healing)
│   ├── chunk-pointer.txt           ← current position (never externally reset)
│   ├── daily-digest.txt            ← all NEWs+CHANGEDs today
│   ├── metadata.txt                ← last run stats
│   └── history/
└── nuclei/
    ├── nuclei-queue.txt            ← shared queue (httpx writes, nuclei reads)
    ├── nuclei-findings-cumulative.txt
    ├── metadata.txt                ← includes queue_remaining
    └── history/
Schedules
| Job | Schedule | Duration | Notes |
|---|---|---|---|
| subfinder | every 2 days, 23:00 | ~10 min | Does NOT reset httpx pointer |
| httpx | every 30 min | ~20 min | 300 hosts/chunk, ~57 runs/cycle |
| nuclei | every 1 hour | ~10 min | max 50 hosts/run from queue |
Timelines
First full cycle (~28h):
- httpx scans all 16,844 hosts in ~57 runs
- Queue fills up as NEWs are discovered
- nuclei works through queue in background (~13 days for full backlog)
- Critical hosts (login panels, APIs) get scanned first
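The figures above follow from simple arithmetic on the schedule, reproduced here as a quick check. The constant names and the `backlog_days` helper are illustrative; the numbers themselves come from the schedule table and the host count quoted above.

```python
import math

TOTAL_HOSTS = 16_844      # hosts in the resolved list
CHUNK = 300               # hosts per httpx run
HTTPX_INTERVAL_MIN = 30   # httpx schedule
NUCLEI_BATCH = 50         # max hosts per nuclei run

httpx_runs = math.ceil(TOTAL_HOSTS / CHUNK)          # runs needed for one full pass
cycle_hours = httpx_runs * HTTPX_INTERVAL_MIN / 60   # wall-clock time for one pass


def backlog_days(queued_hosts: int, batch: int = NUCLEI_BATCH,
                 interval_hours: float = 1.0) -> float:
    """Days nuclei needs to drain a queue of the given size at one run per interval."""
    return math.ceil(queued_hosts / batch) * interval_hours / 24
```

Here `httpx_runs` is 57 and `cycle_hours` is 28.5, matching the ~57 runs and ~28h quoted. If every one of the 16,844 hosts were queued, `backlog_days` gives about 14 days. The ~13 days quoted would correspond to roughly 15,600 queued hosts, which would be consistent with some hosts being dropped by the blacklist filter, though the exact queue size is not stated.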
Steady state (after 2 cycles):
- httpx produces 0-5 real CHANGEDs per chunk
- Queue stays small, nuclei empties it within hours
- State fully healed, no more false CHANGEDs from the Content-Length bug (CL-bug)