Scanner Overview

AI-BOM includes 13 auto-registered scanners (nine always enabled, four opt-in), each covering a different aspect of AI component detection. Every scanner is a subclass of BaseScanner and registers itself through the __init_subclass__ hook.
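The self-registration pattern can be sketched as follows. This is a minimal illustration rather than AI-BOM's actual source: only BaseScanner and the use of __init_subclass__ come from the description above, while the _registry dict, the name attribute, the scan signature, and ExampleScanner are assumptions made for the example.

```python
from typing import ClassVar


class BaseScanner:
    """Illustrative base class; the registry layout is an assumption."""

    # Hypothetical registry mapping scanner name -> scanner class.
    _registry: ClassVar[dict] = {}
    name: ClassVar[str] = ""

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # Python calls this hook once for every subclass definition,
        # so defining a scanner class is enough to register it.
        if cls.name:
            BaseScanner._registry[cls.name] = cls

    def scan(self, path: str) -> list:
        raise NotImplementedError


class ExampleScanner(BaseScanner):
    """Hypothetical scanner used only to show registration."""

    name = "example"

    def scan(self, path: str) -> list:
        return []


print(sorted(BaseScanner._registry))
```

Because the hook runs at class-definition time, importing the module that defines a scanner is all the wiring required.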

Scanner architecture

  • Scanners auto-register - add a new scanner in one file, zero wiring needed
  • Regex-based detection by default for speed and cross-language support
  • Optional AST analysis for deep Python scanning
  • Parallel execution via thread pool (--workers N)
  • Each scanner receives a target path and returns a list of AIComponent objects
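A rough sketch of the parallel execution model, assuming each scanner can be treated as a callable that takes the target path and returns a list of components. The AIComponent fields, run_scanners, and the two fake scanners are hypothetical stand-ins, not the project's real types.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class AIComponent:
    """Stand-in for the real AIComponent; these fields are hypothetical."""
    name: str
    source: str


def run_scanners(scanners, target, workers=4):
    """Run every scanner against target in a thread pool and merge results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in scanner order, so the merged list is stable.
        results = pool.map(lambda scan: scan(target), scanners)
        return [component for found in results for component in found]


def fake_code_scanner(target):
    return [AIComponent("openai", f"{target}/app.py")]


def fake_docker_scanner(target):
    return [AIComponent("ollama", f"{target}/Dockerfile")]


components = run_scanners([fake_code_scanner, fake_docker_scanner], "repo")
print([c.name for c in components])
```

Threads suit this workload because scanning is dominated by file I/O; the workers parameter plays the role of --workers N.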

Available scanners

Always enabled

| Scanner | Name | Description |
| --- | --- | --- |
| Code Scanner | code | Detects AI SDK imports, model references, and API keys in source code |
| Docker Scanner | docker | Detects AI containers in Dockerfiles and Compose files |
| Network Scanner | network | Discovers AI endpoints and credentials in config/env files |
| Cloud Scanner | cloud | Detects AI services in Terraform, CloudFormation, and Pulumi |
| n8n Scanner | n8n | Scans n8n workflow JSON files for AI agents and MCP usage |
| Jupyter Scanner | jupyter | Detects AI imports and model usage in .ipynb files |
| GitHub Actions Scanner | github-actions | Detects AI components in GitHub Actions workflow files |
| Model File Scanner | model-file | Detects AI model binary files (.gguf, .safetensors, .onnx, .pt) |
| MCP Config Scanner | mcp-config | Detects Model Context Protocol server configurations |

Opt-in scanners

| Scanner | Name | Enable with | Description |
| --- | --- | --- | --- |
| AST Scanner | ast | --deep | Deep Python AST analysis for decorators and function calls |
| AWS Live Scanner | aws-live | scan-cloud aws | Scans live AWS account for Bedrock, SageMaker, etc. |
| GCP Live Scanner | gcp-live | scan-cloud gcp | Scans live GCP project for Vertex AI services |
| Azure Live Scanner | azure-live | scan-cloud azure | Scans live Azure subscription for OpenAI, ML services |

What each scanner detects

Code Scanner

The primary scanner. Performs two-phase analysis:

  1. Phase A - Dependency file scanning (requirements.txt, package.json, Cargo.toml, etc.) to build a set of declared AI packages
  2. Phase B - Source code scanning to detect SDK usage patterns, model references, and API keys

Detection covers Python, JavaScript, TypeScript, Java, Go, Rust, and Ruby. The scanner flags shadow AI (SDK usage with no matching dependency declaration), deprecated models, unpinned model versions, and hardcoded credentials.
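The two-phase idea behind shadow AI detection can be illustrated with a small sketch: collect the declared AI packages, collect the imported ones, and report the difference. The package set, the regex, and all function names here are assumptions for illustration; the real scanner's patterns are not shown in this document.

```python
import re

# Hypothetical subset of known AI packages; a real list would be larger.
KNOWN_AI_PACKAGES = {"openai", "anthropic", "langchain", "transformers"}

IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)


def declared_ai_packages(requirements_text):
    """Phase A: collect AI packages declared in a requirements.txt body."""
    declared = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Drop extras and version specifiers such as "openai[extra]>=1.0".
        name = re.split(r"[\[<>=!~; ]", line, maxsplit=1)[0].lower()
        if name in KNOWN_AI_PACKAGES:
            declared.add(name)
    return declared


def used_ai_packages(source_text):
    """Phase B: collect AI packages imported in Python source."""
    imported = {name.lower() for name in IMPORT_RE.findall(source_text)}
    return imported & KNOWN_AI_PACKAGES


def shadow_ai(requirements_text, source_text):
    """Packages that are imported but never declared."""
    return used_ai_packages(source_text) - declared_ai_packages(requirements_text)


requirements = "openai>=1.0\nrequests==2.31.0\n"
source = "import openai\nfrom anthropic import Anthropic\nimport os\n"
print(shadow_ai(requirements, source))
```

This sketch treats the import name and the package name as identical, which does not hold for every library; a production scanner would need a mapping between the two.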

Docker Scanner

Scans Dockerfiles and Docker Compose files for:

  • AI container images (Ollama, vLLM, HuggingFace TGI, NVIDIA Triton, ChromaDB)
  • GPU resource allocations
  • AI-related environment variables (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
  • Model volume mounts
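As an illustration of regex-based Dockerfile scanning, the sketch below matches FROM and ENV instructions against small hint lists. The hint tuples and the scan_dockerfile helper are assumptions; the fragments are based on commonly published image names and should be checked against the images actually in use.

```python
import re

# Hypothetical image-name fragments; the real scanner's pattern set is not documented here.
AI_IMAGE_HINTS = ("ollama", "vllm", "text-generation-inference", "tritonserver", "chroma")
AI_ENV_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")

FROM_RE = re.compile(r"^\s*FROM\s+(\S+)", re.IGNORECASE | re.MULTILINE)
ENV_RE = re.compile(r"^\s*ENV\s+([A-Z_][A-Z0-9_]*)", re.MULTILINE)


def scan_dockerfile(text):
    """Return AI base images and AI-related ENV names found in a Dockerfile."""
    images = [
        image for image in FROM_RE.findall(text)
        if any(hint in image.lower() for hint in AI_IMAGE_HINTS)
    ]
    env_vars = [name for name in ENV_RE.findall(text) if name in AI_ENV_VARS]
    return {"images": images, "env_vars": env_vars}


dockerfile = """\
FROM ollama/ollama:latest
ENV OPENAI_API_KEY=changeme
ENV PORT=8080
"""
findings = scan_dockerfile(dockerfile)
print(findings)
```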

Network Scanner

Scans configuration files and environment files for:

  • AI API endpoints (api.openai.com, api.anthropic.com, etc.)
  • Hardcoded API keys matching known patterns
  • .env files with AI credentials
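A minimal sketch of line-by-line endpoint and key matching in an env file follows. Both patterns are illustrative assumptions: the key regex only models an "sk-" prefix followed by a long token, and real providers use several key formats.

```python
import re

ENDPOINT_RE = re.compile(r"\b(api\.openai\.com|api\.anthropic\.com)\b")
# Illustrative key shape only: an "sk-" prefix followed by 20+ token characters.
KEY_RE = re.compile(r"\bsk-[A-Za-z0-9_-]{20,}")


def scan_env_text(text):
    """Return (line number, kind, detail) tuples for each finding."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for host in ENDPOINT_RE.findall(line):
            findings.append((lineno, "endpoint", host))
        if KEY_RE.search(line):
            # Report the location, never the secret itself.
            findings.append((lineno, "api_key", "<redacted>"))
    return findings


env_text = (
    "OPENAI_BASE_URL=https://api.openai.com/v1\n"
    "OPENAI_API_KEY=sk-" + "a" * 24 + "\n"
    "DEBUG=true\n"
)
env_findings = scan_env_text(env_text)
print(env_findings)
```

Redacting the matched value keeps the scan report itself from becoming a second place where the credential leaks.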

Cloud Scanner

Scans Infrastructure-as-Code files for AI service definitions:

  • AWS: Bedrock, SageMaker, Comprehend, Rekognition, Textract, Lex, Polly, Transcribe
  • Azure: OpenAI, Machine Learning, Cognitive Services, Bot Service
  • GCP: Vertex AI, AutoML, Vision AI, Speech-to-Text, Natural Language

Supports Terraform (.tf), CloudFormation (.yaml, .json), and Pulumi configurations.
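For Terraform, one plausible approach is to match resource types against provider prefixes. The sketch below assumes that approach; the prefix table is a hypothetical subset and scan_terraform is an illustrative helper, not the scanner's real interface.

```python
import re

# Resource-type prefixes as used by the public Terraform providers; the
# mapping itself is a hypothetical subset chosen for illustration.
AI_RESOURCE_PREFIXES = {
    "aws_sagemaker_": "AWS SageMaker",
    "aws_bedrock": "AWS Bedrock",
    "azurerm_cognitive_": "Azure Cognitive Services",
    "azurerm_machine_learning_": "Azure Machine Learning",
    "google_vertex_ai_": "GCP Vertex AI",
}

RESOURCE_RE = re.compile(r'^\s*resource\s+"([^"]+)"\s+"([^"]+)"', re.MULTILINE)


def scan_terraform(text):
    """Return (service, resource type, resource name) for AI resources."""
    hits = []
    for rtype, rname in RESOURCE_RE.findall(text):
        for prefix, service in AI_RESOURCE_PREFIXES.items():
            if rtype.startswith(prefix):
                hits.append((service, rtype, rname))
                break
    return hits


tf = '''
resource "aws_sagemaker_endpoint" "inference" {
  name = "demo"
}
resource "aws_s3_bucket" "data" {
  bucket = "demo"
}
'''
hits = scan_terraform(tf)
print(hits)
```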

n8n Scanner

Scans n8n workflow JSON files for:

  • AI Agent nodes and LLM Chat nodes
  • MCP Client connections
  • Webhook triggers without authentication
  • Tool configurations and embedding nodes
  • Hardcoded credentials in workflow JSON
  • Cross-workflow AI agent chains
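A sketch of how two of these checks might look against an exported workflow. It assumes that AI nodes carry a type starting with "@n8n/n8n-nodes-langchain.", that webhook triggers use the type "n8n-nodes-base.webhook", and that a missing authentication parameter means no authentication; verify all three against the n8n version in use.

```python
import json

# Assumed node-type naming; not taken from this document.
AI_NODE_PREFIX = "@n8n/n8n-nodes-langchain."
WEBHOOK_TYPE = "n8n-nodes-base.webhook"


def scan_workflow(workflow_json):
    """Flag AI nodes and webhooks without authentication in a workflow export."""
    workflow = json.loads(workflow_json)
    findings = []
    for node in workflow.get("nodes", []):
        node_type = node.get("type", "")
        if node_type.startswith(AI_NODE_PREFIX):
            findings.append(("ai_node", node.get("name")))
        elif node_type == WEBHOOK_TYPE:
            # Assumption: an absent parameter means the webhook is open.
            auth = node.get("parameters", {}).get("authentication", "none")
            if auth == "none":
                findings.append(("unauthenticated_webhook", node.get("name")))
    return findings


workflow_json = json.dumps({
    "nodes": [
        {"name": "Webhook", "type": "n8n-nodes-base.webhook", "parameters": {"path": "chat"}},
        {"name": "AI Agent", "type": "@n8n/n8n-nodes-langchain.agent", "parameters": {}},
        {"name": "Set", "type": "n8n-nodes-base.set", "parameters": {}},
    ]
})
workflow_findings = scan_workflow(workflow_json)
print(workflow_findings)
```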

AST Scanner (deep mode)

Python-only. Enabled with --deep. Analyzes:

  • Decorator patterns (@agent, @tool, @crew, @flow, @task)
  • Function calls to AI APIs
  • String literals containing model names
  • CrewAI flow patterns
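Decorator detection with Python's standard ast module might look like the sketch below. The decorator names come from the list above; find_ai_decorators and decorator_name are hypothetical helpers, and the real scanner may work differently.

```python
import ast

# Decorator names taken from the list above.
AI_DECORATORS = {"agent", "tool", "crew", "flow", "task"}


def decorator_name(node):
    """Return the trailing name of a decorator expression, if any."""
    if isinstance(node, ast.Call):        # @tool("search")
        node = node.func
    if isinstance(node, ast.Attribute):   # @crewai.task
        return node.attr
    if isinstance(node, ast.Name):        # @agent
        return node.id
    return None


def find_ai_decorators(source):
    """Return (definition name, decorator name) pairs for AI decorators."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            for dec in node.decorator_list:
                name = decorator_name(dec)
                if name in AI_DECORATORS:
                    hits.append((node.name, name))
    return hits


sample = '''
@tool("search")
def search(query): ...

@cache
def helper(): ...

class Team:
    @agent
    def researcher(self): ...
'''
decorator_hits = find_ai_decorators(sample)
print(sorted(decorator_hits))
```

Unlike a regex, the AST walk ignores decorator-like text inside comments and string literals, which is the main reason to pay the extra parsing cost.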

Jupyter Scanner

Scans .ipynb notebook files for:

  • AI library imports in code cells
  • Model references and API key patterns
  • Framework usage (transformers, torch, tensorflow, etc.)
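Since an .ipynb file is JSON, a notebook scan can be sketched by walking the code cells and applying the same import regex used for source files. The library set and scan_notebook are assumptions for illustration.

```python
import json
import re

IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)
AI_LIBS = {"transformers", "torch", "tensorflow", "openai"}  # hypothetical subset


def scan_notebook(notebook_json):
    """Return the AI libraries imported in a notebook's code cells."""
    notebook = json.loads(notebook_json)
    found = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        found |= set(IMPORT_RE.findall(source)) & AI_LIBS
    return found


notebook_json = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["import torch is explained below\n"]},
        {"cell_type": "code", "source": ["import transformers\n", "from openai import OpenAI\n"]},
    ]
})
notebook_libs = scan_notebook(notebook_json)
print(sorted(notebook_libs))
```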

GitHub Actions Scanner

Scans .github/workflows/ YAML files for:

  • AI-related GitHub Actions
  • Model deployment steps
  • AI API usage in workflow commands

Model File Scanner

Detects binary model files by extension:

  • .gguf - GGML/GGUF quantized models
  • .safetensors - Hugging Face safetensors
  • .onnx - ONNX models
  • .pt / .pth - PyTorch models
  • .h5 - TensorFlow/Keras models
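Extension-based detection is simple enough to sketch with pathlib. The extension table mirrors the list above; find_model_files is a hypothetical helper, and the temporary directory exists only to make the example self-contained.

```python
import tempfile
from pathlib import Path

MODEL_EXTENSIONS = {
    ".gguf": "GGML/GGUF quantized model",
    ".safetensors": "Hugging Face safetensors",
    ".onnx": "ONNX model",
    ".pt": "PyTorch model",
    ".pth": "PyTorch model",
    ".h5": "TensorFlow/Keras model",
}


def find_model_files(root):
    """Return (relative path, model kind) for model files under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only the final suffix matters, so "llama.Q4.gguf" maps to ".gguf".
        kind = MODEL_EXTENSIONS.get(path.suffix.lower())
        if kind and path.is_file():
            hits.append((path.relative_to(root).as_posix(), kind))
    return hits


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "models").mkdir()
    (root / "models" / "llama.Q4.gguf").touch()
    (root / "weights.pt").touch()
    (root / "README.md").touch()
    model_hits = find_model_files(root)

print(model_hits)
```

Matching on the extension alone never opens the file, which keeps the scan fast even when model files are many gigabytes in size.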

MCP Config Scanner

Detects Model Context Protocol server configurations in JSON config files (e.g., .cursor/mcp.json, .vscode/mcp.json).
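A minimal sketch of reading such a config file, assuming Cursor-style files keep servers under a top-level "mcpServers" key and VS Code-style files under "servers". Both key names and the list_mcp_servers helper are assumptions that should be checked against the client documentation.

```python
import json


def list_mcp_servers(config_json):
    """Return (server name, launch command or URL) pairs from an MCP config."""
    config = json.loads(config_json)
    # Assumed key names: "mcpServers" (Cursor-style) or "servers" (VS Code-style).
    servers = config.get("mcpServers") or config.get("servers") or {}
    return [
        (name, spec.get("command") or spec.get("url"))
        for name, spec in servers.items()
    ]


cursor_config = json.dumps({
    "mcpServers": {
        "filesystem": {"command": "npx", "args": ["-y", "example-mcp-server"]},
        "remote": {"url": "https://example.com/mcp"},
    }
})
mcp_servers = list_mcp_servers(cursor_config)
print(mcp_servers)
```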

List installed scanners

ai-bom list-scanners

This prints all registered scanners with their enabled/disabled status.