Methodology

Open, deterministic.
Verifiable evidence.

How the Evidence Tracer is built, end to end. What we call, what we infer, what we store, and what we do not claim. This project is open-source: the scanner, the reasoning layer, and the deletion processes are public and reproducible.

Pilot version: EVT v0.4 · Last updated: 5 May 2026 · Open-source reasoning

01, Architecture
02, Evidence collection
03, Control mapping
04, Customizable controls
05, Reasoning layer
06, Scoring
07, Traceability
08, Design Partner platform
09, Your data and deletion

01 Architecture

The open-source scanner runs as a stateless Cloudflare Worker. Each scan request creates short-lived AWS credentials via STS AssumeRole, performs evidence collection, and exits. Scan state is kept in Cloudflare D1 (SQLite at the edge) for 30 days, then deleted automatically. Generated reports are stored in R2 object storage.

We chose this deterministic, stateless architecture so the absence of persisted credentials is verifiable. Short-lived sessions, edge storage, and open code let an auditor reproduce the exact calls and verify results independently.

You POST an IAM Role ARN and ExternalId to the Worker.
The Worker calls STS AssumeRole for short-lived session credentials (1 hour TTL).
Evidence collection fans out across AWS services using SigV4-signed requests.
Evidence items are written to D1, chunked per scan.
An open-source reasoning model analyzes each control with a structured prompt and a strict JSON output contract.
The report is assembled from control results and stored in R2.
Evidence data is deleted automatically after 30 days unless you remove it earlier.

Long-lived AWS credentials are not retained. The role requested is read-only and does not include write permissions. Nothing is installed in your account beyond the read-only role created from the CloudFormation template.
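
A minimal sketch of the first two steps above, validating the posted Role ARN and ExternalId and building the STS AssumeRole parameters. RoleArn, RoleSessionName, ExternalId, and DurationSeconds are real AssumeRole parameters; the function name, the session-name format, and the example values are illustrative assumptions, not the Worker's actual code.

```python
import re

# Loose shape check for an IAM role ARN (assumption: the real validation may differ).
ROLE_ARN_RE = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")
SESSION_TTL_SECONDS = 3600  # 1 hour TTL, as stated in the steps above


def build_assume_role_params(role_arn: str, external_id: str, scan_id: str) -> dict:
    """Return the parameter dict for an STS AssumeRole call (hypothetical helper)."""
    if not ROLE_ARN_RE.match(role_arn):
        raise ValueError(f"not an IAM role ARN: {role_arn!r}")
    if not external_id:
        raise ValueError("ExternalId is required")
    return {
        "RoleArn": role_arn,
        # STS limits session names to 64 characters.
        "RoleSessionName": f"evt-scan-{scan_id}"[:64],
        "ExternalId": external_id,
        "DurationSeconds": SESSION_TTL_SECONDS,
    }


# Example values only; the account ID and ExternalId are placeholders.
params = build_assume_role_params(
    "arn:aws:iam::123456789012:role/LoxeAIPilotReadOnlyRole",
    "example-external-id",
    "abc123",
)
```

Because the only inputs are a role ARN and an ExternalId, nothing in this request needs to be stored once the session expires.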

02 Evidence collection

All AWS API calls are SigV4-signed. Calls run in parallel with a concurrency cap to avoid rate limits. Raw responses (XML or JSON) are truncated when they would not fit the analysis context window; evidence behind CRITICAL-severity findings is always preserved in full. Each response is stored as a discrete evidence item with its source endpoint, timestamp, and SHA-256 content hash.
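
As a sketch of what one evidence item might look like, the snippet below wraps a raw response with its source endpoint, a UTC timestamp, and a SHA-256 content hash. The field names and the choice to hash the raw response bytes are assumptions for illustration; the repository defines the real schema.

```python
import hashlib
from datetime import datetime, timezone


def make_evidence_item(endpoint: str, raw_body: bytes, fetched_at: datetime) -> dict:
    """Wrap one raw API response as an evidence item (hypothetical field names)."""
    return {
        "endpoint": endpoint,
        # ISO 8601 timestamp, normalized to UTC.
        "timestamp": fetched_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Assumption: the hash covers the raw response bytes exactly as received.
        "sha256": hashlib.sha256(raw_body).hexdigest(),
        "body": raw_body.decode("utf-8", errors="replace"),
    }


# Example response body is made up for the demonstration.
item = make_evidence_item(
    "iam.amazonaws.com/GetAccountSummary",
    b'{"SummaryMap": {"AccountMFAEnabled": 1}}',
    datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc),
)
```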

IAM

GetAccountSummary
ListUsers
GetAccountPasswordPolicy
ListGroups
GetCredentialReport
ListMFADevices
ListPolicies
ListRoles

S3

ListBuckets
GetBucketEncryption
GetBucketPolicy
GetBucketVersioning
GetPublicAccessBlock

CloudTrail

DescribeTrails
GetTrailStatus
GetEventSelectors
ListTrails

AWS Config

DescribeConfigRules
DescribeConfigurationRecorders
DescribeDeliveryChannels

EC2 / VPC

DescribeSecurityGroups
DescribeVpcs
DescribeFlowLogs
DescribeInstances

CloudWatch + SNS

DescribeAlarms
ListMetrics
ListTopics
ListSubscriptions

Additional services include: KMS, GuardDuty, SecurityHub, Secrets Manager metadata, WAF, RDS, Lambda, and SSO.
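
The capped parallel fan-out described at the top of this section can be sketched with a semaphore. This is an illustration in Python, not the Worker's code: the cap of 4, the function names, and the fake fetcher are all assumptions, and the real cap is not stated here.

```python
import asyncio

MAX_CONCURRENCY = 4  # assumed value; the actual cap is not documented here


async def collect_all(calls, fetch, limit=MAX_CONCURRENCY):
    """Run fetch(call) for every call, with at most `limit` in flight at once."""
    sem = asyncio.Semaphore(limit)

    async def bounded(call):
        async with sem:
            return call, await fetch(call)

    pairs = await asyncio.gather(*(bounded(c) for c in calls))
    return dict(pairs)


# Demo with a fake fetcher that records how many calls overlap.
in_flight = 0
peak = 0


async def fake_fetch(call):
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # stand-in for a signed HTTPS request
    in_flight -= 1
    return {"call": call, "ok": True}


CALLS = [
    "ListUsers", "ListRoles", "ListBuckets", "DescribeTrails",
    "DescribeVpcs", "DescribeAlarms", "ListTopics", "DescribeSecurityGroups",
]
results = asyncio.run(collect_all(CALLS, fake_fetch))
```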

03 Control mapping

Mapping from evidence to SOC 2 Trust Services Criteria is deterministic. The same evidence set always produces the same control mapping. This mapping is implemented as hand-coded checks and rule tables that are updated as API coverage expands.

CC6.1 Logical Access, Restricted Access

Sources: IAM users, password policy, MFA status, credential report, KMS key policies

CC6.2 System Access Provisioning

Sources: IAM user creation dates, group memberships, SSO configuration, access key metadata

CC6.3 Role-Based Access & Segregation

Sources: IAM roles, policy attachments, trust relationships, cross-account access

CC6.6 External Threat Boundary

Sources: VPC configuration, security groups, WAF Web ACLs, flow logs status, EC2 instances

CC6.7 Restricted Data Movement & Encryption

Sources: S3 encryption & public access, KMS keys & rotation, RDS encryption, Secrets Manager

CC7.1 Configuration & Vulnerability Management

Sources: AWS Config recorders, delivery channels, Config rules compliance

CC7.2 Security Event Monitoring

Sources: CloudTrail trails, multi-region flag, log validation, event selectors, CloudWatch alarms

CC7.3 Anomaly Detection

Sources: GuardDuty detectors, CloudWatch alarms, SNS topics, CloudWatch metrics

CC7.4 Incident Response

Sources: SecurityHub findings, GuardDuty status, SNS subscriptions, Security Hub standards

CC8.1 Change Management

Sources: CloudTrail event selectors, Config rules, Lambda function inventory

CC5.2 Technology Controls

Sources: AWS Config, SecurityHub standards, GuardDuty, Config rules

CC9.2 Business Continuity & Recovery

Sources: RDS snapshots and backup retention, S3 versioning & replication status

Because the mapping is deterministic, every gap finding traces directly to a specific API call and response field. There is no ambiguity: the evidence shows the exact API call and value that triggered the finding.
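
To make that concrete, here is a sketch of one hand-coded check for CC7.2, the multi-region flag on CloudTrail trails. `trailList`, `Name`, and `IsMultiRegionTrail` are real fields of the DescribeTrails response; the function name, the finding shape, and the evidence ID are assumptions for illustration.

```python
def check_cloudtrail_multi_region(evidence_id: str, response: dict) -> dict:
    """PASS if at least one trail is multi-region; FAIL otherwise (illustrative rule)."""
    trails = response.get("trailList", [])
    multi_region = [t.get("Name", "<unnamed>") for t in trails if t.get("IsMultiRegionTrail")]
    return {
        "control": "CC7.2",
        "status": "PASS" if multi_region else "FAIL",
        # The finding names the exact call, field, and evidence item behind it.
        "evidence_id": evidence_id,
        "api_call": "cloudtrail:DescribeTrails",
        "field": "trailList[].IsMultiRegionTrail",
        "matching_trails": multi_region,
    }


# Example response trimmed to the fields the check reads.
response = {"trailList": [{"Name": "org-trail", "IsMultiRegionTrail": True}]}
finding = check_cloudtrail_multi_region("ev-0042", response)
```

The function has no hidden state, so running it twice on the same response yields the same finding.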

04 Customizable controls

Instead of forcing your security program into a tool's predefined boxes, Customizable Controls lets you define what passing looks like for your organization. Your policies, contractual commitments, and industry requirements can be encoded as deterministic checks. The scanner enforces your custom rules with the same SHA-256-verified evidence as built-in controls. Custom controls appear in your audit package exactly like the standard ones, providing verifiable proof rather than screenshots.
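
One way to picture a custom control is as a small declarative rule evaluated against an API response. The rule format below is hypothetical (the actual syntax for custom controls is not specified here); `PasswordPolicy.MinimumPasswordLength` is a real field of the GetAccountPasswordPolicy response.

```python
import operator

# Supported comparison operators for this sketch.
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}

# Hypothetical custom control: an org policy requiring 14-character passwords.
custom_control = {
    "id": "ORG-PW-01",
    "title": "Passwords must be at least 14 characters",
    "api_call": "iam:GetAccountPasswordPolicy",
    "path": ["PasswordPolicy", "MinimumPasswordLength"],
    "op": ">=",
    "expected": 14,
}


def evaluate(control: dict, response: dict) -> dict:
    """Walk `path` into the response and compare the value against `expected`."""
    value = response
    for key in control["path"]:
        if not isinstance(value, dict) or key not in value:
            # Missing evidence is reported, not guessed.
            return {"id": control["id"], "status": "INCONCLUSIVE", "observed": None}
        value = value[key]
    passed = OPS[control["op"]](value, control["expected"])
    return {"id": control["id"], "status": "PASS" if passed else "FAIL", "observed": value}


result = evaluate(custom_control, {"PasswordPolicy": {"MinimumPasswordLength": 12}})
```

Keeping the rule as data means the same evaluator and the same evidence hashing can serve built-in and custom controls alike.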

05 Reasoning layer

Each control is analyzed by a reasoning layer that follows a strict, open contract. The analyzer receives the control definition, the relevant evidence items, and a JSON schema for the expected output. The reasoning layer is open-source, and its prompts and post-processing rules are part of the repository so an auditor can reproduce the same analysis locally.

The analyzer must cite specific evidence IDs in every finding, avoid inventing controls or permissions not present in the evidence, flag insufficient evidence rather than guessing, and produce copy-pasteable AWS CLI remediation commands anchored to real resource ARNs when available.

The output contract is enforced: malformed JSON or missing fields trigger retries. If a call ultimately fails after retries and backoff, the control is marked inconclusive rather than silently dropped.
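
The contract enforcement described above can be sketched as validate, retry with backoff, then fall back to inconclusive. The required field names, the retry count, and the delays are assumptions; `call_model` stands in for whatever function invokes the analyzer.

```python
import json
import time

REQUIRED_FIELDS = {"control", "status", "findings"}  # assumed contract fields


def validate(raw: str, known_evidence_ids: set) -> dict:
    """Parse analyzer output and enforce the contract; raise ValueError on any violation."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(data["findings"], list):
        raise ValueError("findings must be a list")
    for finding in data["findings"]:
        cited = set(finding.get("evidence_ids", [])) if isinstance(finding, dict) else set()
        # Every finding must cite at least one evidence ID that was actually supplied.
        if not cited or not cited <= known_evidence_ids:
            raise ValueError("finding cites no evidence or unknown evidence IDs")
    return data


def analyze_with_retries(call_model, control_id, known_evidence_ids,
                         attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry on contract violations with exponential backoff, then mark inconclusive."""
    for attempt in range(attempts):
        try:
            return validate(call_model(control_id), known_evidence_ids)
        except ValueError:
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return {"control": control_id, "status": "INCONCLUSIVE", "findings": []}


# Demo: the fake analyzer returns malformed JSON once, then a valid result.
responses = iter([
    '{"control": "CC6.1"',
    '{"control": "CC6.1", "status": "FAIL", "findings": [{"evidence_ids": ["ev-1"]}]}',
])
result = analyze_with_retries(lambda cid: next(responses), "CC6.1", {"ev-1"},
                              sleep=lambda seconds: None)
```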

Each control analysis runs in its own queue message with a wall-time limit. Typical total analysis time for the 12 controls is 60–120 seconds for a normal account.

Deterministic evidence pipeline: Every PASS, WARN, or FAIL result Loxe produces comes from hardcoded logic: a specific AWS API call checked against a specific condition. The same environment will always produce the same result, and any auditor can independently verify it by running the same calls. The SHA-256 evidence chain ties the raw response to the finding and makes the result auditable.

Paid platform vs open-source scanner: The open-source scanner runs as stateless Cloudflare Workers. For the Design Partner paid platform we operate an optional Python SDK backend that supports longer-term storage, org workspaces, and additional services. The deterministic evidence generation and SHA-256 tracing apply to both modes.

06 Scoring

Two scores are computed for each scan. Both are 0–100. Neither maps to a binary "audit-ready" claim.

Gap Score: the percentage of checkpoints meeting their thresholds, weighted by the severity of failing findings:

Finding severity	Score deduction
CRITICAL	−25 points
HIGH	−15 points
MEDIUM	−5 points
LOW / INFO	−1 point
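
As a worked example of the deductions, the sketch below assumes the score starts at 100, subtracts the listed points for each failing finding, and floors at 0. That aggregation rule is an assumption made for illustration; only the deduction values and the interpretation bands in this section come from the methodology itself.

```python
# Deduction per failing finding, taken from the table in this section.
DEDUCTIONS = {"CRITICAL": 25, "HIGH": 15, "MEDIUM": 5, "LOW": 1, "INFO": 1}


def gap_score(severities) -> int:
    """Assumed aggregation: start at 100, subtract per finding, never go below 0."""
    return max(0, 100 - sum(DEDUCTIONS[s] for s in severities))


def interpret(score: int) -> str:
    """Map a score to the interpretation bands given in this section."""
    if score >= 80:
        return "Low audit risk"
    if score >= 60:
        return "Moderate"
    return "High"


# One finding of each severity: 100 - 25 - 15 - 5 - 1 = 54
score = gap_score(["CRITICAL", "HIGH", "MEDIUM", "LOW"])
band = interpret(score)
```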

Score range	Interpretation
80–100	Low audit risk, known gaps, manageable
60–79	Moderate, auditor will likely raise findings
<60	High, remediate before scheduling audit

Freshness Score: the recency of the evidence underpinning the analysis. Inputs: IAM access key age vs 90-day rotation policy, credential last-used dates, CloudTrail log delivery recency, Config rule last-evaluation timestamps. A score below 70 indicates configurations that auditors commonly flag for staleness.

The free-tier heuristic score is a directional estimate. The per-control paid analysis may identify additional findings the heuristic rules do not surface, particularly where IAM permission gaps prevented evidence collection. Scores between the two tiers will differ. This is expected, not an error.

07 Traceability

Every finding in the paid report is anchored to the evidence that produced it. Each evidence item carries:

The exact AWS API endpoint called (e.g. iam.amazonaws.com/GetAccountSummary)
The request timestamp in ISO 8601 UTC
The raw response body (truncated if >50KB, but CRITICAL findings are never truncated)
A SHA-256 hash of the evidence item for tamper-evidence
The AWS region the call was issued against

Because the scanner is open-source, an auditor can clone the repo, point it at their client's account, run the exact same calls, and verify that our evidence matches what they collect independently. The hash creates a chain of custody between the raw response and the finding that cited it.
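
A sketch of that verification step: recompute the SHA-256 of each stored response and compare it with the recorded hash for every evidence ID a finding cites. The data shapes and names here are assumptions, as is the choice to hash the full raw response body.

```python
import hashlib
import hmac


def verify_evidence(raw_body: bytes, recorded_sha256: str) -> bool:
    """True if the raw body still matches the hash recorded at collection time."""
    actual = hashlib.sha256(raw_body).hexdigest()
    return hmac.compare_digest(actual, recorded_sha256.lower())


def verify_finding(finding: dict, evidence_store: dict) -> bool:
    """Check every evidence item a finding cites (hypothetical data shapes)."""
    for evidence_id in finding["evidence_ids"]:
        item = evidence_store.get(evidence_id)
        if item is None or not verify_evidence(item["body"], item["sha256"]):
            return False
    return True


# Demo data: one intact evidence item and one whose body was altered after hashing.
body = b'{"PasswordPolicy": {"MinimumPasswordLength": 8}}'
store = {
    "ev-1": {"body": body, "sha256": hashlib.sha256(body).hexdigest()},
    "ev-2": {"body": body + b" ", "sha256": hashlib.sha256(body).hexdigest()},
}
intact = verify_finding({"evidence_ids": ["ev-1"]}, store)
tampered = verify_finding({"evidence_ids": ["ev-1", "ev-2"]}, store)
```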

This is why the scanner is open-source. Not a values statement, a trust mechanism. You can check our work because the work is checkable.

08 Design Partner platform

Two modes are available: the open-source scanner you can run yourself, and the Design Partner paid platform with additional org features. Quick comparison:

Open Source (loxeai.com)

Stateless, Cloudflare Workers
Free scan, gap report
30-day data retention, delete anytime
Account not required

Design Partner Platform

Persistent, Python SDK backend
Full inventory, Gideon, continuous scans
Org workspace, audit package, logs
Account-based, design partner access

07 What we're not yet claiming

The SOC 2 framework covers nine criteria series. AWS API calls can surface evidence for roughly 15–20% of those criteria (the parts with API endpoints). The remaining 80–85% (governance processes, risk assessments, written policies, access reviews, vendor risk, HR controls, incident response exercises) have no API. We do not assess those.

"Pre-audit readiness" means the AWS infrastructure layer is assessed. It does not mean your auditor will have no findings. It means you'll have fewer surprises on the technical side, and the ones you do have will come with traceable evidence and copy-pasteable CLI commands to fix them.

We don't publish an accuracy number. We don't have one that's meaningful enough to publish yet.

This is a Type I tool. Continuous monitoring (Type II evidence collection over time) is on the roadmap, not shipped.

08 Required permissions

The CloudFormation template grants two AWS managed policies: SecurityAudit and ReadOnlyAccess. Together these cover the majority of what the scanner needs. A small number of services require additional permissions not included in those policies. If your role is missing them, those evidence items will return access denied and the affected controls will score lower or be marked inconclusive.

Included in the CloudFormation template (no action needed):

IAM: password policy, users, roles, policies, credential report, MFA devices
CloudTrail: trails, event selectors, trail status
AWS Config: configuration recorders, delivery channels
EC2 / VPC: security groups, VPCs, instances, flow logs
CloudWatch: alarms
KMS: key list and rotation status
SecurityHub: hub status, findings, enabled standards
GuardDuty: detector list
WAF: web ACL list
SNS: topics and subscriptions
Secrets Manager: secret metadata only (never values)
SSO: instance list

Not included, requires manual addition for full coverage:

RDS: DescribeDBInstances, DescribeDBSnapshots, and DescribeDBClusterSnapshots require AmazonRDSReadOnlyAccess. Without it, CC9.2 (Business Continuity) cannot be assessed.
Config rules: config:DescribeConfigRules may be blocked depending on your account configuration. Affects CC7.1 and CC5.2.
CloudWatch metrics: cloudwatch:ListMetrics is not in SecurityAudit. Affects CC7.3 anomaly detection scoring.
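
The relationship between denied calls and affected controls listed above can be written as a lookup table. The mapping is taken from this section; the function name and the way denied actions are reported are assumptions for illustration.

```python
# Which controls lose evidence when a given action returns access denied
# (mapping taken from the list in this section).
AFFECTED_CONTROLS = {
    "rds:DescribeDBInstances": ["CC9.2"],
    "rds:DescribeDBSnapshots": ["CC9.2"],
    "rds:DescribeDBClusterSnapshots": ["CC9.2"],
    "config:DescribeConfigRules": ["CC7.1", "CC5.2"],
    "cloudwatch:ListMetrics": ["CC7.3"],
}


def controls_affected_by(denied_actions) -> dict:
    """Group denied actions by the control whose evidence they would have supplied."""
    affected = {}
    for action in denied_actions:
        for control in AFFECTED_CONTROLS.get(action, []):
            affected.setdefault(control, []).append(action)
    return affected


gaps = controls_affected_by(["rds:DescribeDBInstances", "config:DescribeConfigRules"])
```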

If you want full RDS coverage, attach AmazonRDSReadOnlyAccess to the role after deploying the CloudFormation stack:

aws iam attach-role-policy --role-name LoxeAIPilotReadOnlyRole --policy-arn arn:aws:iam::aws:policy/AmazonRDSReadOnlyAccess

The free-tier heuristic score is directional and may differ from the paid analysis score. The paid analysis treats permission gaps as findings in their own right. If a control cannot be verified because the role lacks access, that is itself a compliance signal. The heuristic engine is more lenient on missing data.

09 Your data & deletion

Here is exactly what we store, where, and for how long, with no buried clauses.

What	Where	How long
AWS API responses (evidence)	Cloudflare D1	30 days, then auto-deleted
Gap scores & control analysis	Cloudflare D1	30 days, then auto-deleted
Generated report (HTML + JSON)	Cloudflare R2	30 days, then auto-deleted
Finding edits & resolved marks	Cloudflare D1	30 days, deleted with scan
Data access log	Cloudflare D1	30 days, deleted with scan
Payment records	Stripe	As required by law (~7 years)
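
The 30-day window in the table amounts to a simple timestamp comparison. The sketch below assumes retention is measured from the scan's creation time; the function name and that reference point are illustrative, and the actual deletion job lives in the repository.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention window from the table above


def is_expired(created_at: datetime, now: datetime) -> bool:
    """True once a scan is at least 30 days old (assumed to count from creation)."""
    return now - created_at >= RETENTION


created = datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc)
still_kept = is_expired(created, datetime(2026, 6, 4, 11, 59, tzinfo=timezone.utc))
due_for_deletion = is_expired(created, datetime(2026, 6, 4, 12, 0, tzinfo=timezone.utc))
```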

What we never store: AWS credentials, secret values, application data, customer data, code, or anything your application stores about your users. IAM role sessions have a 1-hour TTL and are never persisted after the scan completes.

Delete anytime. Every scan page has a "Delete all my scan data" button. One click immediately wipes all D1 rows and R2 objects for that scan: evidence, report, edits, access log, everything. No email is required to perform this deletion. Payment records remain with Stripe as required by law but contain no scan data.

Access log. Every time your scan data is accessed, by you or anyone with your token, it is recorded and visible to you on the scan page under "Access log." You can see every download, every Gideon query, every results view. Access is auditable.

What we share: Stripe processes payments. Anthropic's Claude API analyzes your evidence for the paid report. Neither receives your AWS account ID or org name. That's the complete list of third parties.

If you want deletion before the 30-day window and can't access the scan page, email mehta.arja@northeastern.edu with your scan ID.

Questions, corrections, or methodology challenges? Talk to the founder directly.
