Qodo Health Check - Security & Architecture

Executive Summary

Qodo Health Check is a free, zero-friction code review health assessment for public GitHub repositories. Users submit a repo URL, and Qodo's AI reviewer analyzes the repository's recent pull request history to identify bugs that slipped through code review. The service produces a health score, industry comparison, and detailed findings report.

Key commitments:

Public repos: No GitHub token required. Repositories are scanned using Qodo's own GitHub API credentials
Private repos: Token is encrypted on receipt and permanently deleted within seconds of scan start
No source code is stored. Code exists only in the worker container's ephemeral filesystem and memory during analysis
No customer data is used for model training
All infrastructure runs on Google Cloud Platform with encryption at rest and in transit
Rate limiting and bot verification protect the service without collecting unnecessary data

1. How It Works

Public Repos (Landing Page)

User submits a public GitHub repository URL via the landing page
Qodo's AI analyzes the repository's recent pull request history to identify bugs that slipped through code review
Each finding is validated to confirm whether the issue still exists in the codebase
A health score and grade are computed and compared against an industry benchmark
The free report shows a summary. The full report (all findings with evidence) is available after sign-in

Private Repos (Invite-Based)

Prospect receives an invite code from a Qodo SE
Prospect runs the CLI script which collects the repo URL and a GitHub token via terminal prompts
Token is transmitted over HTTPS and encrypted on receipt using Fernet symmetric encryption (AES-128-CBC with HMAC-SHA256)
Token is permanently deleted from our database within seconds of scan start
Analysis proceeds identically to the public flow
Results are delivered via email and accessible through the Qodo dashboard
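
The token handling in the private flow can be sketched as follows. This is a minimal illustration, assuming the Python `cryptography` package; the in-memory `TOKEN_STORE` dict stands in for the database, and the function names `store_token` and `take_token_for_scan` are hypothetical rather than Qodo's actual implementation.

```python
from cryptography.fernet import Fernet

# Hypothetical stand-in for the database table that holds encrypted tokens.
TOKEN_STORE: dict[str, bytes] = {}


def store_token(fernet: Fernet, scan_id: str, github_token: str) -> None:
    """Encrypt the token on receipt so only ciphertext is persisted."""
    TOKEN_STORE[scan_id] = fernet.encrypt(github_token.encode("utf-8"))


def take_token_for_scan(fernet: Fernet, scan_id: str) -> str:
    """Remove the stored ciphertext at scan start and return the plaintext.

    After this call the token exists only in the caller's memory.
    """
    ciphertext = TOKEN_STORE.pop(scan_id)  # deletes the stored record
    return fernet.decrypt(ciphertext).decode("utf-8")


if __name__ == "__main__":
    # Assumption: in a deployment the key would be loaded from a secret store.
    fernet = Fernet(Fernet.generate_key())
    store_token(fernet, "scan-123", "example-token")
    token = take_token_for_scan(fernet, "scan-123")
    print("record still stored:", "scan-123" in TOKEN_STORE)
```

Popping the record before decrypting means a crash during the scan cannot leave the ciphertext behind.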

2. Data Inventory

What We Collect

Data	Source	Purpose
Repository URL (owner/repo)	User input	Identifies what to scan
Client IP address	HTTP request	Rate limiting
Browser fingerprint (optional)	Client-side JS	Abuse prevention
Bot verification token	Cloudflare Turnstile	Bot prevention
Email address (optional)	User input at email gate, or Qodo SSO	Unlock full report

What We Generate

Data	Content	Storage
Finding metadata	Bug title, description, category, severity, file path, line number, validation status	Encrypted cloud storage
PR metadata	PR number, title, URL, merge date	Encrypted cloud storage
Health score	Score (0-100), grade, percentile, findings per PR, industry comparison	Encrypted database + cloud storage
HTML reports	Free report (3 findings) + full report (all findings with evidence)	Encrypted cloud storage
Scan metadata	Repo, status, timestamps	Encrypted database

What We Do NOT Collect or Store

Data	Handling
GitHub tokens (public flow)	Not needed. Public repos are accessed using Qodo's own GitHub credentials.
GitHub tokens (private flow)	Encrypted on arrival using Fernet (AES-128-CBC + HMAC-SHA256); the encrypted token is permanently deleted from our database within seconds of scan start. The plaintext token exists only in application memory during the scan and is never written to disk or logs.
Source code	Cloned to ephemeral container filesystem during analysis. Deleted when the worker container exits. No source code is written to persistent storage.
Passwords or credentials	Not collected. Email gate uses Qodo Platform SSO (GitHub/Google OAuth). No passwords are handled by the Health Check service.

3. Data Flow

Phase 1: Intake

User submits a public repo URL on the landing page
Cloudflare Turnstile verifies the request is from a human (no CAPTCHA, privacy-preserving)
HTTPS POST to Cloud Run web service (TLS 1.2+)
Server validates the repo exists and is public via GitHub API
Rate limiting checks are applied
Scan record created in database (repo, status, timestamps)
Worker job triggered with scan ID
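
The repository validation step in the intake phase can be sketched as below. This is an illustrative example using only the Python standard library. The function names are hypothetical, and the check relies on the `private` field returned by GitHub's `GET /repos/{owner}/{repo}` endpoint; the service's real validation logic may differ.

```python
import json
import re
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import urlparse

# Characters permitted in the owner and repository parts of the URL.
_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def parse_repo_url(url: str) -> tuple[str, str]:
    """Extract (owner, repo) from a GitHub URL, or raise ValueError."""
    parsed = urlparse(url.strip())
    if parsed.scheme != "https" or parsed.hostname not in ("github.com", "www.github.com"):
        raise ValueError("not an https://github.com URL")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected https://github.com/<owner>/<repo>")
    owner, repo = parts[0], parts[1].removesuffix(".git")
    if not (_NAME.match(owner) and _NAME.match(repo)):
        raise ValueError("invalid owner or repository name")
    return owner, repo


def is_public_repo(owner: str, repo: str, token: Optional[str] = None) -> bool:
    """Return True if the repository exists and is public."""
    request = urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}",
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return json.load(response).get("private") is False
    except urllib.error.HTTPError as err:
        if err.code == 404:  # missing, or not visible to these credentials
            return False
        raise
```

Parsing the URL before calling the API lets malformed input be rejected without a network request.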

Phase 2: Analysis

Worker accesses the repository via GitHub API (using Qodo's own credentials for public repos)
Qodo's AI reviews the repository's recent pull request history and identifies potential bugs
Relevant code segments are sent to LLM APIs over HTTPS, under commercial API terms that prohibit use of customer data for model training
Findings are validated to confirm whether each issue still exists in the codebase
A health score and report are generated and stored in encrypted cloud storage
Container is destroyed. All local data (including any cloned code) ceases to exist.
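
The ephemeral handling of cloned code can be illustrated with a temporary working directory that is removed as soon as analysis finishes. This is a minimal sketch, not the worker's actual code: the function names are hypothetical, `git` is assumed to be on the PATH, and in the real service the container teardown described above provides the final guarantee.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def git_clone(repo_url: str, dest: Path) -> None:
    """Clone the repository into dest (requires git on PATH)."""
    subprocess.run(
        ["git", "clone", "--quiet", repo_url, str(dest)],
        check=True,
        capture_output=True,
    )


def analyze_in_ephemeral_dir(
    repo_url: str,
    analyze: Callable[[Path], dict],
    clone: Callable[[str, Path], None] = git_clone,
) -> dict:
    """Run analyze() on a temporary checkout that is removed afterwards.

    Only the dict returned by analyze() leaves this function; the
    checkout directory is deleted even if analyze() raises.
    """
    with tempfile.TemporaryDirectory(prefix="scan-") as tmp:
        workdir = Path(tmp)
        clone(repo_url, workdir)
        return analyze(workdir)
```

Passing `clone` as a parameter keeps the sketch testable without network access.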

Phase 3: Delivery

User is redirected to the free report (score + top 3 findings, no code details)
To unlock the full report, user signs in via Qodo Platform SSO (GitHub or Google OAuth)
Full report includes all findings with descriptions, code snippets, fix suggestions, evidence, and validation status
Reports are served through the application (not as static public files)
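
The split between the free and full report can be sketched as two views over the same findings. The example below is illustrative only: the `Finding` fields, the numeric severity scale, and the ordering by severity are assumptions, and the SSO check is reduced to a non-empty user identifier.

```python
from dataclasses import asdict, dataclass
from typing import Optional

FREE_FINDING_LIMIT = 3  # the free report shows the top 3 findings


@dataclass
class Finding:
    title: str
    severity: int  # hypothetical scale: higher means more severe
    description: str
    code_snippet: str


def free_report(score: int, findings: list[Finding]) -> dict:
    """Score plus the titles of the top findings, with no code details."""
    top = sorted(findings, key=lambda f: f.severity, reverse=True)[:FREE_FINDING_LIMIT]
    return {"score": score, "findings": [{"title": f.title} for f in top]}


def full_report(score: int, findings: list[Finding], signed_in_user: Optional[str]) -> dict:
    """All findings with details; refused unless a user is signed in."""
    if not signed_in_user:
        raise PermissionError("sign-in required for the full report")
    return {"score": score, "findings": [asdict(f) for f in findings]}
```

Building the free view from an explicit allow-list of fields (here only `title`) avoids leaking code details if new fields are later added to a finding.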

4. What Data Reaches LLM APIs

During analysis, Qodo's AI agents process public repository code to identify bugs. The following types of data are sent to LLM APIs:

Data type	Purpose	Constraints
Code diffs	Identify issues introduced by each PR	Changes from merged PRs only
Relevant file content	Understand context around changes; verify if bugs still exist	Truncated to relevant segments; not the full repository
PR metadata	Title, description, merge date	No authentication tokens or secrets

What is NOT sent to LLM APIs:

GitHub tokens or credentials (Qodo's own tokens are used, never sent to LLMs)
User personal information (email, IP address)
The full repository codebase. Only segments relevant to the PR under review are sent

LLM API calls are made under commercial terms that explicitly prohibit the use of customer data for model training. All transmissions use HTTPS with API key authentication.
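
One way to limit file content to relevant segments is to keep only a few lines of context around each changed line. The sketch below is an assumption about how such truncation could work, not a description of Qodo's pipeline; the function name and the default context width are hypothetical.

```python
def relevant_segments(
    source: str, changed_lines: list[int], context: int = 3
) -> list[tuple[int, int, str]]:
    """Return (start_line, end_line, text) windows around changed lines.

    Line numbers are 1-based and inclusive. Windows that overlap or
    touch are merged so no line is returned twice.
    """
    lines = source.splitlines()
    ranges: list[list[int]] = []
    for n in sorted(set(changed_lines)):
        if not 1 <= n <= len(lines):
            continue  # ignore line numbers outside the file
        start = max(1, n - context)
        end = min(len(lines), n + context)
        if ranges and start <= ranges[-1][1] + 1:
            ranges[-1][1] = max(ranges[-1][1], end)
        else:
            ranges.append([start, end])
    return [(s, e, "\n".join(lines[s - 1 : e])) for s, e in ranges]


if __name__ == "__main__":
    text = "\n".join(f"line{i}" for i in range(1, 21))
    for start, end, segment in relevant_segments(text, [5, 6, 15], context=1):
        print(f"lines {start}-{end}:")
        print(segment)
```

Everything outside the returned windows is simply never included in the request payload.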

5. Infrastructure Security

Compute

Component	Platform	Isolation
Web Service	Google Cloud Run (managed)	Per-request container isolation, auto-scaled
Worker Job	Google Cloud Run Job (managed)	Dedicated container per scan, destroyed on completion
Landing Page	Google Cloud Run (managed)	Static content, no server-side user data processing
Admin Dashboard	Google Cloud Run (managed)	Authenticated access, organization membership required

No VMs, no SSH access, no persistent compute
Containers are immutable. Built from Dockerfile, no runtime modifications
Cloud Run provides gVisor-based sandboxing

Storage

Store	Encryption	Access Control
Database	AES-256 at rest (Google-managed keys)	IAM: only application service accounts
Cloud Storage (reports)	AES-256 at rest (Google-managed keys)	IAM: application writes, serves via authenticated endpoints (not public)
Secret Manager	AES-256 at rest (Google-managed keys)	IAM: only service accounts

Cloud storage is not publicly accessible. Reports are served only through the application
No customer data in application logs

Rate Limiting

The service enforces rate limits to ensure fair usage and prevent automated misuse. Bot verification is handled by Cloudflare Turnstile. Access to the full report requires sign-in via Qodo Platform (GitHub or Google OAuth).
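
A per-client rate limit of this kind is often implemented as a sliding window keyed by IP address. The sketch below shows that general technique under stated assumptions: the class name, the limit values, and the in-process storage are all hypothetical, and the service's actual limits and mechanism are not specified in this document.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests per key within window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a request for key and return whether it is permitted."""
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    # Hypothetical limit: 5 scans per client IP per hour.
    limiter = SlidingWindowLimiter(max_requests=5, window_seconds=3600)
    print(limiter.allow("203.0.113.7"))
```

Only the request timestamps per key are kept, which fits the goal of limiting abuse without collecting additional data.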

Network

All endpoints HTTPS-only (TLS 1.2+, managed certificates)
Cloud Run does not expose SSH, VPN, or any non-HTTPS port
GitHub API calls: HTTPS with token in Authorization header
LLM API calls: HTTPS with API key in header

Secrets Management

All secrets (API keys, encryption keys, service credentials) are stored in Google Cloud Secret Manager with IAM-scoped access. Service accounts follow least-privilege: the web service cannot access LLM API keys, and the worker cannot access session signing keys.
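
The separation described above can be expressed as a simple grant table with a check that fails if the two stated constraints are violated. This is only an illustration: the service and secret names are hypothetical, and in practice the grants are IAM bindings on Secret Manager secrets rather than application code.

```python
# Hypothetical service identities and secret names, for illustration only.
SECRET_GRANTS: dict[str, frozenset] = {
    "web-service": frozenset({"session-signing-key"}),
    "worker-job": frozenset({"llm-api-key"}),
}


def can_access(service: str, secret: str) -> bool:
    """Return True if the service has been granted the secret."""
    return secret in SECRET_GRANTS.get(service, frozenset())


def check_separation() -> None:
    """Raise if a grant breaks the separation between web and worker."""
    if can_access("web-service", "llm-api-key"):
        raise AssertionError("web service must not read LLM API keys")
    if can_access("worker-job", "session-signing-key"):
        raise AssertionError("worker must not read session signing keys")


if __name__ == "__main__":
    check_separation()
    print("grant table respects the stated separation")
```

A check like this could run in CI against an exported copy of the grants so that an overly broad binding is caught before deployment.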

6. Frequently Asked Questions

Do you need my GitHub token?

For public repos: No. Health Check uses Qodo's own GitHub API credentials.

For private repos (invite-based): Yes. You provide a token via the CLI script. The token is encrypted immediately, used only for read-only API access during the scan, and permanently deleted from our systems within seconds of scan start. We recommend using a fine-grained token scoped to the specific repository, and revoking it after the scan completes.

Do you store my source code?

No. Source code is cloned to an ephemeral container during analysis and destroyed when the container exits. No source code is written to persistent storage.

Do you use my data to train AI models?

No. Qodo does not use repository code, findings, or scan data to train, fine-tune, or improve AI models. Code segments sent to LLM providers are covered by commercial API terms that prohibit this.

What information do you collect about me?

Minimal. We collect your IP address (for rate limiting), a Cloudflare Turnstile bot verification token, optionally a browser fingerprint (for abuse prevention), and optionally your email (if you unlock the full report via SSO). We do not collect names, phone numbers, or company information unless you volunteer them.

Who can see my report?

Free report: Anyone with the URL can see the score and top 3 finding titles (no code details).

Full report: Requires sign-in via Qodo Platform SSO. Only the authenticated user and Qodo team members can access it.

Can I request data deletion?

Yes. Contact security@qodo.ai to request deletion of your scan data and report. We will delete all associated records within 5 business days.

Where does the infrastructure run?

All infrastructure runs in Google Cloud Platform, us-central1 region (Iowa, USA):

Compute: Google Cloud Run (managed, serverless)
Database: Google Cloud managed database
Storage: Google Cloud Storage (managed)
Secrets: Google Cloud Secret Manager (managed)

7. Compliance Considerations

SOC 2 Alignment

Criterion	Status
Encryption in transit	TLS 1.2+ on all endpoints (Cloud Run managed certificates)
Encryption at rest	AES-256 for all storage (Google-managed keys)
Access control	IAM-based service account permissions, least-privilege
Audit logging	Cloud Run request logs, database audit logs (via GCP Cloud Audit Logs)
Data minimization	No user tokens collected for public scans; private-flow tokens deleted within seconds of scan start; source code never persisted
Incident response	Google Cloud's built-in monitoring and alerting

GDPR Alignment

Requirement	Implementation
Lawful basis	Legitimate interest (user-initiated scan of public code) + consent (voluntary email for full report)
Data minimization	Only IP + optional email retained; no tokens, no source code stored
Right to erasure	Supported. Contact security@qodo.ai for deletion
Data processing agreement	Available on request
Cross-border transfers	Data processed in US (GCP us-central1); standard contractual clauses available

8. Limitations and Disclaimers

Health Check provides automated code analysis powered by AI. Findings are informational and should be validated by your engineering team.
Public repositories can be scanned directly. Private repositories are supported through an invite-based flow with token encryption.
The health score is based on an industry benchmark of open-source repositories and may not reflect your specific domain or standards.
Scan results reflect the state of your repository at the time of analysis.
Rate limits are enforced to ensure fair usage.

For security inquiries: security@qodo.ai
For data deletion requests: security@qodo.ai