This guide will help you set up and run the RCA-RAG Code Intelligence system on Windows.
Before starting, ensure you have the following installed:
- Python 3.11+ (check with python --version in PowerShell)
- Docker Desktop (check with docker --version in PowerShell)
- Git (check with git --version in PowerShell)

Clone the repository:

git clone <repository-url>
cd rca-rag
# Create virtual environment
python -m venv .venv
# Activate virtual environment
.venv\Scripts\activate
Note: Always activate the virtual environment before running any Python commands. You’ll need to do this in each PowerShell window.
# Upgrade pip
pip install --upgrade pip
# Install the project and dependencies
pip install -e .
# Or install from requirements.txt
pip install -r requirements.txt
Note: The first time you run the system, sentence-transformers will download the embedding model (all-MiniLM-L6-v2, ~90MB). This happens automatically.
# Check Python version
python --version # Should be 3.11+
# Verify key packages
python -c "import fastapi; print('FastAPI OK')"
python -c "import sqlalchemy; print('SQLAlchemy OK')"
python -c "import sentence_transformers; print('Sentence Transformers OK')"
# Copy the example configuration
Copy-Item configs\app.example.env configs\.env
# Edit the configuration file
notepad configs\.env
# or use VS Code
code configs\.env
Edit configs\.env and update at minimum:
# Database connection (matches Docker Compose defaults)
DATABASE_URL=postgresql+psycopg://postgres:postgres@localhost:5432/rca_rag
# Message Queue (Redis is default, works with Docker Compose)
MQ_TYPE=redis
REDIS_URL=redis://localhost:6379/0
# Storage (MinIO is default, works with Docker Compose)
STORAGE_TYPE=minio
S3_ENDPOINT_URL=http://localhost:9000
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
S3_BUCKET_NAME=rca-rag
# GitHub Webhook Secret (REQUIRED for webhooks)
GITHUB_WEBHOOK_SECRET=your-secret-key-here-change-this
# Embedding Model (default is fine for most cases)
EMBEDDING_MODEL_NAME=all-MiniLM-L6-v2
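The settings above are plain KEY=VALUE pairs. As a minimal sketch of how such a file can be read (the application itself may use python-dotenv or pydantic-settings instead; this stdlib-only version is just an illustration):

```python
# Minimal sketch: parse a KEY=VALUE .env file with the standard library.
# Blank lines and lines starting with "#" are skipped.
def load_env(path):
    settings = {}
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
    return settings
```

For example, load_env("configs/.env")["DATABASE_URL"] would return the connection string configured above.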
# Copy example rules configuration
Copy-Item configs\rules.example.yaml configs\rules.yaml
# Edit rules.yaml
notepad configs\rules.yaml
# or
code configs\rules.yaml
This is the easiest way to get started. Docker Compose includes pgvector pre-installed, avoiding manual installation complexity:
# Start all infrastructure services
docker-compose -f deployments\docker-compose.dev.yml up -d
# Wait for services to be healthy (check status)
docker-compose -f deployments\docker-compose.dev.yml ps
# Verify PostgreSQL is ready
docker-compose -f deployments\docker-compose.dev.yml exec db pg_isready -U postgres
# Verify pgvector extension is available
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -c "CREATE EXTENSION IF NOT EXISTS vector;"
This will start:
- PostgreSQL 16 with pgvector (db)
- Redis (mq)
- MinIO (s3)
Note: The Docker Compose setup uses the pgvector/pgvector:pg16 image which includes pgvector extension. This avoids the need to manually install pgvector on Windows PostgreSQL.
If you must use local PostgreSQL on Windows (not recommended due to pgvector complexity):
1. Download a pgvector Windows release matching your PostgreSQL version (e.g. pgvector-v0.5.1-pg16-windows-x64.zip)
2. Copy the extension files into your PostgreSQL installation:
   - vector.dll → C:\Program Files\PostgreSQL\16\lib\
   - vector.control → C:\Program Files\PostgreSQL\16\share\extension\
   - vector--*.sql → C:\Program Files\PostgreSQL\16\share\extension\

# Alternatively, run PostgreSQL with pgvector in Docker
docker run -d `
--name postgres-pgvector `
-e POSTGRES_PASSWORD=postgres `
-e POSTGRES_DB=rca_rag `
-p 5432:5432 `
pgvector/pgvector:pg16
psql -U postgres -d rca_rag -c "CREATE EXTENSION IF NOT EXISTS vector;"
We strongly recommend using Docker Compose instead (Option 1).
# Run Alembic migrations
alembic -c infra\migrations\alembic.ini upgrade head
# Verify tables were created (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "\dt"
You should see tables: repos, commits, pull_requests, diffs, findings, rag_chunks, etc.
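The same check can be done programmatically with SQLAlchemy's inspector. The sketch below uses an in-memory SQLite database so it is self-contained; against the real database you would pass the postgresql+psycopg URL from configs\.env instead:

```python
# Sketch: list tables via SQLAlchemy's inspector. SQLite stands in here
# so the snippet runs standalone; swap in your DATABASE_URL for real use.
from sqlalchemy import create_engine, inspect, text

engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    # Stand-in table; the real schema is created by the Alembic migrations.
    conn.execute(text("CREATE TABLE repos (id INTEGER PRIMARY KEY, name TEXT)"))

tables = inspect(engine).get_table_names()
print(tables)
```

Against the migrated database, the returned list should include repos, commits, pull_requests, diffs, findings, and rag_chunks.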
# Using Docker Compose
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
Run each service in a separate PowerShell window:
# Navigate to project directory
cd D:\AI\RAG # Adjust path as needed
# Activate virtual environment
.venv\Scripts\activate
# Start API gateway
# Note: If you get WinError 10013, try using 127.0.0.1 or a different port (8081)
uvicorn apps.gateway.main:app --host 127.0.0.1 --port 8080 --reload
You should see:
INFO: Uvicorn running on http://127.0.0.1:8080
INFO: Application startup complete.
Troubleshooting: If you see WinError 10013, see Port Access Permission Error in the Troubleshooting section.
cd D:\AI\RAG
.venv\Scripts\activate
python -m apps.ingestion.worker
cd D:\AI\RAG
.venv\Scripts\activate
python -m apps.analysis
cd D:\AI\RAG
.venv\Scripts\activate
python -m apps.indexer --mq
# Check API health
curl http://localhost:8080/health
# Open API docs in browser
start http://localhost:8080/docs
# Build and start all services
docker-compose -f deployments\docker-compose.dev.yml up --build
# Or run in background
docker-compose -f deployments\docker-compose.dev.yml up -d
# View logs
docker-compose -f deployments\docker-compose.dev.yml logs -f
# Stop services
docker-compose -f deployments\docker-compose.dev.yml down
# Run all tests
pytest
# Run with coverage
pytest --cov=apps --cov-report=html
# Run specific test file
pytest tests\unit\test_parsers.py -v
# Run integration tests
pytest tests\integration\ -v
# Send a test webhook (requires valid signature)
curl -X POST http://localhost:8080/webhooks/github `
-H "Content-Type: application/json" `
-H "X-GitHub-Event: pull_request" `
-H "X-Hub-Signature-256: sha256=..." `
-d "@tests\fixtures\github_events.py"
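The gateway rejects webhooks whose X-Hub-Signature-256 header does not match. GitHub computes this header as a SHA-256 HMAC of the raw request body keyed with the webhook secret, so a valid value for a test payload can be generated with the standard library (the secret must match GITHUB_WEBHOOK_SECRET in configs\.env):

```python
# Compute a GitHub-style X-Hub-Signature-256 header value:
# "sha256=" + hex SHA-256 HMAC of the raw body, keyed with the secret.
import hashlib
import hmac

def sign(secret: str, body: bytes) -> str:
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"
```

Pass the result as the X-Hub-Signature-256 header in the curl call above.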
First, ensure you have indexed some code:
# Manually index a repository (if you have repo_id)
python -m apps.indexer <repo_id>
Then query:
curl -X POST http://localhost:8080/rag/query `
-H "Content-Type: application/json" `
-d '{\"question\": \"How does authentication work?\", \"repo\": \"org/repo\", \"top_k\": 5}'
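Escaping JSON inside PowerShell curl calls is error-prone; the same request can be built with the Python standard library. The endpoint and field names mirror the curl call above; uncomment the urlopen lines to actually send it against a running gateway:

```python
# Build the /rag/query request with the standard library to avoid
# PowerShell JSON-escaping pitfalls.
import json
import urllib.request

payload = {"question": "How does authentication work?", "repo": "org/repo", "top_k": 5}
req = urllib.request.Request(
    "http://localhost:8080/rag/query",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```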
# Test AST parser
python -c "from apps.analysis.parsers import get_parser; parser = get_parser('Test.java', 'java'); code = 'public class Test { public void test() {} }'; result = parser.parse_file(code, 'Test.java'); print(f'Found {len(result.classes)} classes')"
# Test secret scanner
python -c "from apps.analysis.security import SecretScanner; scanner = SecretScanner(); findings = scanner.scan_file('api_key = \"sk_live_1234567890\"', 'test.py'); print(f'Found {len(findings)} secrets')"
Error: could not connect to server or connection refused
Solutions:
# Check if PostgreSQL is running
docker-compose -f deployments\docker-compose.dev.yml ps db
# Check connection string in configs\.env
# Verify DATABASE_URL format: postgresql+psycopg://user:pass@host:port/dbname
# Test connection manually (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag
Error: extension "vector" does not exist or extension "vector" is not available
Solutions:
# Option 1: Use Docker Compose (RECOMMENDED - Easiest)
# Docker Compose includes pgvector automatically
docker-compose -f deployments\docker-compose.dev.yml up -d db
# Option 2: Use pgvector Docker image directly
docker run -d `
--name postgres-pgvector `
-e POSTGRES_PASSWORD=postgres `
-e POSTGRES_DB=rca_rag `
-p 5432:5432 `
pgvector/pgvector:pg16
# Then update DATABASE_URL in configs\.env to use localhost:5432
Manual Installation (if you must use local PostgreSQL):
psql -U postgres -d rca_rag -c "CREATE EXTENSION IF NOT EXISTS vector;"

Verify Installation:
# Check if extension exists (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
Error: Error loading embedding model or network timeout
Solutions:
# Set proxy if needed (PowerShell)
$env:HTTP_PROXY="http://proxy:port"
$env:HTTPS_PROXY="http://proxy:port"
# Or download manually
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
# Or use a different model
# Edit configs\.env: EMBEDDING_MODEL_NAME=BAAI/bge-small-en-v1.5
Error: Error connecting to Redis or Connection refused
Solutions:
# Check if Redis is running
docker-compose -f deployments\docker-compose.dev.yml ps mq
# Test Redis connection (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec mq redis-cli ping # Should return PONG
# Check REDIS_URL in configs\.env
# Format: redis://localhost:6379/0
Error: Failed to upload file or Failed to download file
Solutions:
# Check if MinIO is running
docker-compose -f deployments\docker-compose.dev.yml ps s3
# Access MinIO console: http://localhost:9001
# Login: minioadmin / minioadmin
# Verify S3 settings in configs\.env
# Check S3_ENDPOINT_URL, S3_ACCESS_KEY, S3_SECRET_KEY
Symptoms: No findings in database after PR events
Solutions:
# Check worker logs for errors
# Look for: "No diffs found" or "Error fetching diff content"
# Verify diffs exist in database (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT COUNT(*) FROM diffs;"
# Check object_uri format in diffs table
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT id, path, object_uri FROM diffs LIMIT 5;"
# Verify storage is accessible
python -c "from apps.shared.storage import get_storage_client; storage = get_storage_client(); print('Storage client OK')"
Symptoms: No chunks in rag_chunks table
Solutions:
# Check if indexer worker is running
# Verify it's subscribed to 'indexing.requested' topic
# Manually trigger indexing
python -m apps.indexer <repo_id>
# Check for errors in logs
# Verify embedding model is loaded
Error: [WinError 10013] An attempt was made to access a socket in a way forbidden by its access permissions
Solutions:
Option 1: Check if port is already in use
# Check what's using port 8080
netstat -ano | findstr :8080
# If something is using it, either stop that service or change the port
# To change port, edit configs\.env: APP_PORT=8081
Option 2: Use a different port
# Edit configs\.env and change:
APP_PORT=8081
# Then start with new port
uvicorn apps.gateway.main:app --host 0.0.0.0 --port 8081 --reload
Option 3: Check Windows reserved ports
# Check if Windows has reserved the port range
netsh interface ipv4 show excludedportrange protocol=tcp
# If 8080 is in a reserved range, use a different port (like 8081, 8082, etc.)
Option 4: Run as Administrator (if needed)
# Right-click PowerShell and select "Run as Administrator"
# Then try again
uvicorn apps.gateway.main:app --host 0.0.0.0 --port 8080 --reload
Option 5: Use localhost instead of 0.0.0.0
# Sometimes 0.0.0.0 causes issues on Windows, try localhost
uvicorn apps.gateway.main:app --host 127.0.0.1 --port 8080 --reload
Most Common Solution: Use port 8081 or 8082 instead of 8080, as Windows often reserves lower ports.
Solutions:
# Verify chunks exist (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT COUNT(*) FROM rag_chunks;"
# Check if chunks have embeddings
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT COUNT(*) FROM rag_chunks WHERE embedding IS NOT NULL;"
# Ensure repository is indexed
python -m apps.indexer <repo_id>
# Try a simpler query
curl -X POST http://localhost:8080/rag/query `
-H "Content-Type: application/json" `
-d '{\"question\": \"test\", \"repo\": \"org/repo\", \"top_k\": 10}'
Enable debug logging:
# Edit configs\.env
# Change: LOG_LEVEL=DEBUG
# Restart services
Docker Compose:
# All services
docker-compose -f deployments\docker-compose.dev.yml logs -f
# Specific service
docker-compose -f deployments\docker-compose.dev.yml logs -f api
docker-compose -f deployments\docker-compose.dev.yml logs -f worker
Manual services:
- Logs print to the console; check the logs\ directory (if configured)

To receive GitHub events, configure a webhook on your repository:
- Payload URL: http://your-server:8080/webhooks/github
- Secret: must match GITHUB_WEBHOOK_SECRET in configs\.env
- Events: push, pull_request, pull_request_review, check_run

Edit configs\rules.yaml to match your architecture:
rules:
- id: domain_no_dep_infra
type: forbid_imports
severity: error
from:
- "**/domain/**"
to:
- "**/infra/**"
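As an illustration of what a forbid_imports rule expresses (this is a sketch, not the project's actual matcher): an import is flagged when the source file matches a "from" glob and the imported file matches a "to" glob. Python's fnmatch lets * cross path separators, which approximates the ** patterns used above:

```python
# Illustrative evaluation of a forbid_imports rule: flag an import when
# the source file matches a "from" glob and the target matches a "to"
# glob. fnmatch's * crosses path separators, approximating "**".
from fnmatch import fnmatch

rule = {
    "id": "domain_no_dep_infra",
    "from": ["**/domain/**"],
    "to": ["**/infra/**"],
}

def violates(rule, src_path, imported_path):
    hits_from = any(fnmatch(src_path, pat) for pat in rule["from"])
    hits_to = any(fnmatch(imported_path, pat) for pat in rule["to"])
    return hits_from and hits_to

print(violates(rule, "app/domain/billing.py", "app/infra/db.py"))  # True
print(violates(rule, "app/api/handler.py", "app/infra/db.py"))     # False
```

With this rule, domain code importing infrastructure code produces an error-severity finding, while imports from other layers pass.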
# Via API
curl -X POST http://localhost:8080/admin/service-map `
-H "Content-Type: application/json" `
-d '{\"repo_name\": \"org/repo\", \"service_map\": {\"modules\": {\":billing\": {\"team\": \"payments\", \"owner\": \"team-payments@example.com\"}}}}'
# First, ensure repository exists in database (via webhook or manual)
# Get repo_id from database (using Docker)
docker-compose -f deployments\docker-compose.dev.yml exec db psql -U postgres -d rca_rag -c "SELECT id, name FROM repos;"
# Index repository
python -m apps.indexer <repo_id>
Visit http://localhost:8080/docs for interactive API documentation.
# Prometheus metrics endpoint
curl http://localhost:8080/metrics
# Health checks
curl http://localhost:8080/health
curl http://localhost:8080/health/ready
curl http://localhost:8080/health/live
To enable notifications, add GOOGLE_CHAT_WEBHOOK_URL to configs\.env.

If you encounter issues, work through the checklist below and the Troubleshooting section above:
- [ ] Dependencies installed (pip install -e .)
- [ ] Configuration created (configs\.env)
- [ ] Infrastructure services started (docker-compose up -d)
- [ ] Database migrations applied (alembic upgrade head)
- [ ] API gateway running (uvicorn apps.gateway.main:app)
- [ ] Ingestion worker running (python -m apps.ingestion.worker)
- [ ] Analysis worker running (python -m apps.analysis)
- [ ] Indexer worker running (python -m apps.indexer --mq)
- [ ] Health check passes (curl http://localhost:8080/health)
- [ ] API docs reachable (http://localhost:8080/docs)

Once all checkboxes are complete, you're ready to use the system!
Windows notes:
- This guide uses backslashes (\) in paths; forward slashes (/) work too.
- Activate the virtual environment with .venv\Scripts\activate (not .venv/bin/activate).
- If python is not on PATH, use the full interpreter path, e.g. C:\Python311\python.exe.
- If a port is blocked, change it in configs\.env or stop conflicting services.