CAMT-CSV User Guide¶
Table of Contents¶
- Introduction
- Installation
- Configuration
- Basic Usage
- Advanced Features
- File Format Support
- Transaction Categorization
- Troubleshooting
- Examples
Introduction¶
CAMT-CSV is a command-line tool that converts several financial statement formats into standardized CSV files and categorizes the transactions. Categorization is hybrid: local rules are applied first, with AI-powered categorization as a fallback.
Key Features¶
- Multi-format Support: CAMT.053 XML, PDF bank statements, Revolut CSV, Revolut Investment CSV, Selma investment CSV, and generic debit CSV
- Smart Categorization: Three-tier strategy pattern using direct mapping, keyword matching, and AI fallback with auto-learning
- Dependency Injection Architecture: Clean architecture with explicit dependencies, eliminating global state
- Hierarchical Configuration: Viper-based configuration system with config files, environment variables, and CLI flags
- Batch Processing: Process multiple files at once with automatic format detection
- Investment Support: Dedicated parser for Revolut investment transactions with specialized categorization
- Extensible Architecture: Standardized parser interfaces with BaseParser foundation and segregated interfaces
- Comprehensive Error Handling: Custom error types with detailed context and proper error wrapping
- Framework-Agnostic Logging: Structured logging abstraction with dependency injection and configurable backends
- Performance Optimized: String operations optimization, lazy initialization, and pre-allocation for efficient processing
Installation¶
Homebrew (macOS / Linux) — Recommended¶
Docker¶
Multi-arch images (amd64/arm64) are available on GitHub Container Registry:
docker pull ghcr.io/fjacquet/camt-csv:latest
# Run directly
docker run --rm -v $(pwd):/data ghcr.io/fjacquet/camt-csv:latest camt -i /data/statement.xml -o /data/output.csv
Binary Download¶
Download pre-built binaries from GitHub Releases for linux/darwin/windows (amd64/arm64).
Building from Source¶
Prerequisites:
- Go 1.24.2 or higher: Download Go
- pdftotext CLI tool (for PDF processing):
  - macOS: brew install poppler
  - Ubuntu/Debian: apt-get install poppler-utils
  - Windows: Download Poppler for Windows
Verify Installation¶
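A quick way to confirm the installation is to check that the binary and its PDF dependency are on the PATH. The sketch below assumes the binary is installed under the name camt-csv; check_tool is a helper defined here for illustration and is not part of the tool.

```shell
# check_tool NAME: report whether NAME is available on PATH.
# (check_tool is a helper for this example, not part of camt-csv.)
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool camt-csv     # assumes the binary is installed as "camt-csv"
check_tool pdftotext    # only required for the pdf command

# Print the version when the binary is available.
if command -v camt-csv >/dev/null 2>&1; then
  camt-csv --version
fi
```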
Configuration¶
CAMT-CSV uses a hierarchical configuration system, allowing you to manage settings flexibly. Settings are applied in the following order of precedence (highest to lowest):
- CLI Flags: Options passed directly on the command line (e.g., --log-level debug).
- Environment Variables: Variables prefixed with CAMT_ (e.g., CAMT_LOG_LEVEL=debug).
- Configuration File: A camt-csv.yaml file located in ~/.camt-csv/ or .camt-csv/config.yaml.
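To make the precedence concrete, the sketch below resolves one setting from the three sources plus a built-in default. It only illustrates the documented order; resolve_setting is a hypothetical helper and says nothing about how CAMT-CSV is implemented internally.

```shell
# resolve_setting FLAG ENV FILE DEFAULT
# Prints the first non-empty value, mirroring the documented precedence.
resolve_setting() {
  for value in "$1" "$2" "$3" "$4"; do
    if [ -n "$value" ]; then
      echo "$value"
      return 0
    fi
  done
}

# No flag given, CAMT_LOG_LEVEL=warn, file says info: the environment wins.
resolve_setting "" "warn" "info" "info"       # prints: warn
# --log-level debug given: the flag wins over everything else.
resolve_setting "debug" "warn" "info" "info"  # prints: debug
```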
Setting Up Configuration¶
Create and edit the configuration file for persistent settings:
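One way to bootstrap the file is sketched below. It assumes the ~/.camt-csv/camt-csv.yaml location used in the example configuration later in this guide; init_config is a helper written for this example, and the starter values are the documented defaults.

```shell
# init_config DIR: create DIR/camt-csv.yaml with a few documented defaults,
# leaving any existing file untouched. (Helper for this example only.)
init_config() {
  config_file="$1/camt-csv.yaml"
  mkdir -p "$1" || return 1
  if [ -f "$config_file" ]; then
    echo "keeping existing $config_file"
    return 0
  fi
  cat > "$config_file" <<'EOF'
log:
  level: "info"
  format: "text"
csv:
  delimiter: ","
  include_headers: true
EOF
  echo "created $config_file"
}

init_config "$HOME/.camt-csv"
```

The file can then be edited with any text editor; the tables below list the available keys.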
Global Configuration Options¶
All commands support these global flags and configuration options:
Core Options¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| - | - | --config | $HOME/.camt-csv/config.yaml | Config file path |
| - | - | -i, --input | - | Input file or directory |
| - | - | -o, --output | - | Output file or directory |
| - | - | -v, --validate | false | Validate format before conversion |
Logging¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| log.level | CAMT_LOG_LEVEL | --log-level | info | Log level (debug, info, warn, error) |
| log.format | CAMT_LOG_FORMAT | --log-format | text | Log format (text, json) |
CSV Output¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| csv.delimiter | CAMT_CSV_DELIMITER | --csv-delimiter | , | CSV delimiter character |
| csv.date_format | CAMT_CSV_DATE_FORMAT | - | DD.MM.YYYY | Date format for CSV output |
| csv.include_headers | CAMT_CSV_INCLUDE_HEADERS | - | true | Include CSV header row |
| csv.quote_all | CAMT_CSV_QUOTE_ALL | - | false | Quote all CSV fields |
AI Categorization¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| ai.enabled | CAMT_AI_ENABLED | --ai-enabled | false | Enable AI categorization |
| ai.api_key | GEMINI_API_KEY | - | - | Gemini API key |
| ai.model | CAMT_AI_MODEL | - | gemini-2.0-flash | AI model to use |
| ai.requests_per_minute | CAMT_AI_REQUESTS_PER_MINUTE | - | 10 | API rate limit (requests per minute) |
| ai.timeout_seconds | CAMT_AI_TIMEOUT_SECONDS | - | 30 | API request timeout (seconds) |
| ai.fallback_category | CAMT_AI_FALLBACK_CATEGORY | - | Uncategorized | Category when AI fails |
Categorization¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| categorization.auto_learn | CAMT_CATEGORIZATION_AUTO_LEARN | --auto-learn | false | Auto-save AI categorizations to YAML |
| categorization.confidence_threshold | CAMT_CATEGORIZATION_CONFIDENCE_THRESHOLD | - | 0.8 | Minimum confidence threshold |
| categorization.case_sensitive | CAMT_CATEGORIZATION_CASE_SENSITIVE | - | false | Case-sensitive matching |
Auto-Learn Behavior:
- --auto-learn enabled: AI categorizations are saved directly to creditors.yaml/debtors.yaml. Backups are created automatically before each write.
- --auto-learn disabled (default): AI categorizations are saved to staging files (staging_creditors.yaml/staging_debtors.yaml) for manual review. You can copy approved entries to the main files.
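With auto-learn off, a review pass might look like the sketch below. It assumes the staging files sit in the same directory as the main mapping files (database/ in the examples in this guide, or the directory named by CAMT_DATA_DIRECTORY) and uses the default file names; show_staging is a helper written for this example.

```shell
# show_staging DIR: print pending AI suggestions from the staging files in DIR.
# The directory layout is an assumption; adjust DIR to where your files live.
show_staging() {
  for name in staging_creditors.yaml staging_debtors.yaml; do
    if [ -s "$1/$name" ]; then
      echo "== $name"
      cat "$1/$name"
    else
      echo "== $name: nothing to review"
    fi
  done
}

show_staging "${CAMT_DATA_DIRECTORY:-database}"
```

Approved entries can then be copied by hand into creditors.yaml or debtors.yaml.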
Staging¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| staging.enabled | CAMT_STAGING_ENABLED | - | true | Save AI suggestions to staging files when auto-learn is off |
| staging.creditors_file | CAMT_STAGING_CREDITORS_FILE | - | staging_creditors.yaml | Staging file for creditor suggestions |
| staging.debtors_file | CAMT_STAGING_DEBTORS_FILE | - | staging_debtors.yaml | Staging file for debtor suggestions |
Data and Backup¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| data.directory | CAMT_DATA_DIRECTORY | - | - | Custom data directory |
| data.backup_enabled | CAMT_DATA_BACKUP_ENABLED | - | true | Enable backups |
| backup.enabled | CAMT_BACKUP_ENABLED | - | true | Enable backup system |
| backup.directory | CAMT_BACKUP_DIRECTORY | - | - | Backup directory |
| categories.file | CAMT_CATEGORIES_FILE | - | categories.yaml | Categories file |
| categories.creditors_file | CAMT_CATEGORIES_CREDITORS_FILE | - | creditors.yaml | Creditors mapping file |
| categories.debtors_file | CAMT_CATEGORIES_DEBTORS_FILE | - | debtors.yaml | Debtors mapping file |
Parser-Specific Settings¶
| YAML Key | Environment Variable | CLI Flag | Default | Description |
|---|---|---|---|---|
| parsers.camt.strict_validation | CAMT_PARSERS_CAMT_STRICT_VALIDATION | - | true | Strict CAMT validation |
| parsers.pdf.ocr_enabled | CAMT_PARSERS_PDF_OCR_ENABLED | - | false | Enable OCR for PDF |
| parsers.revolut.date_format_detection | CAMT_PARSERS_REVOLUT_DATE_FORMAT_DETECTION | - | true | Auto-detect date format |
Command-Specific Flags¶
Parser Commands (camt, pdf, revolut, revolut-investment, selma, debit)¶
| CLI Flag | Default | Description |
|---|---|---|
| -f, --format | standard | Output format: standard (29-col, comma) or icompta (10-col, semicolon, dd.MM.yyyy) |
| --date-format | DD.MM.YYYY | Date format in output |
PDF Command Only¶
| CLI Flag | Default | Description |
|---|---|---|
| --batch | false | Batch mode: convert each PDF individually |
Categorize Command¶
| CLI Flag | Default | Description |
|---|---|---|
| -p, --party | - | Party name (required) |
| -d, --debtor | false | Whether party is debtor |
| -a, --amount | - | Transaction amount |
| -t, --date | - | Transaction date |
| -n, --info | - | Additional info |
Example Configuration¶
Complete example of ~/.camt-csv/camt-csv.yaml:
# Logging configuration
log:
  level: "info"
  format: "text"
# CSV output settings
csv:
  delimiter: ";"
  date_format: "DD.MM.YYYY"
  include_headers: true
  quote_all: false
# AI categorization
ai:
  enabled: true
  model: "gemini-2.0-flash"
  requests_per_minute: 10
  timeout_seconds: 30
  fallback_category: "Uncategorized"
# Categorization behavior
categorization:
  auto_learn: false
  confidence_threshold: 0.8
  case_sensitive: false
# Data management
data:
  backup_enabled: true
# Category files
categories:
  file: "categories.yaml"
  creditors_file: "creditors.yaml"
  debtors_file: "debtors.yaml"
# Staging (AI suggestions when auto-learn is off)
staging:
  enabled: true
  creditors_file: "staging_creditors.yaml"
  debtors_file: "staging_debtors.yaml"
# Parser-specific settings
parsers:
  camt:
    strict_validation: true
  pdf:
    ocr_enabled: false
  revolut:
    date_format_detection: true
To set the API key, use the environment variable:
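For example, in a POSIX shell (the key value is a placeholder, and GEMINI_API_KEY is the variable named in the AI Categorization table):

```shell
# Replace the placeholder with the key from Google AI Studio.
export GEMINI_API_KEY="your-api-key"

# Confirm the variable is set without echoing the secret itself.
if [ -n "$GEMINI_API_KEY" ]; then
  echo "GEMINI_API_KEY is set"
fi
```

Keeping the key in the environment rather than in the YAML file avoids committing it to version control by accident.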
Basic Usage¶
Command Structure¶
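All CAMT-CSV commands follow this pattern, with global flags placed before the command name:
./camt-csv [global flags] <command> -i <input> -o <output>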
Supported Commands¶
| Command | Description | Input Format |
|---|---|---|
| camt | Convert CAMT.053 XML files | XML bank statements |
| pdf | Convert PDF bank statements | PDF files |
| revolut | Process Revolut CSV exports | Revolut CSV format |
| revolut-investment | Process Revolut investment transactions | Revolut investment CSV format |
| selma | Process Selma investment files | Selma CSV format |
| debit | Process generic debit CSV files | Generic CSV format |
| batch | Process multiple files | Directory of files |
| categorize | Categorize existing transactions | CSV files |
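Because each parser command corresponds to one input format, a wrapper script can choose the command from the file extension in the unambiguous cases. The sketch below is a convenience built on assumptions, not a feature of the tool: command_for and convert are hypothetical helpers, the binary is assumed to be on the PATH as camt-csv, and CSV inputs are left to the caller because several commands (revolut, revolut-investment, selma, debit) read CSV.

```shell
# command_for FILE: print the camt-csv command matching FILE's extension,
# or fail when the extension does not identify a single command.
command_for() {
  case "$1" in
    *.xml|*.XML) echo "camt" ;;
    *.pdf|*.PDF) echo "pdf" ;;
    *) return 1 ;;   # e.g. .csv could be revolut, selma, or debit
  esac
}

# convert FILE: run the matching command, writing FILE's base name plus .csv.
convert() {
  cmd=$(command_for "$1") || { echo "cannot infer command for $1" >&2; return 1; }
  camt-csv "$cmd" -i "$1" -o "${1%.*}.csv"
}

# Example: convert statement.xml   (runs: camt-csv camt -i statement.xml -o statement.csv)
```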
Quick Start Examples¶
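- Convert a CAMT.053 XML file:
./camt-csv camt -i statement.xml -o output.csv
- Process a PDF bank statement:
./camt-csv pdf -i statement.pdf -o output.csv
- Convert Revolut export:
./camt-csv revolut -i revolut_export.csv -o output.csv
- Process Revolut investment transactions:
./camt-csv revolut-investment -i revolut_investment_export.csv -o investment_transactions.csv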
Advanced Features¶
Batch Processing¶
Process multiple files in a directory (the directory names are placeholders):
./camt-csv batch -i input_dir/ -o output_dir/
Features:
- Automatically detects file types
- Processes all supported formats
- Maintains original filenames with .csv extension
- Skips unsupported files with warnings
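For unattended runs, the batch command can be wrapped in a small script. The sketch below assumes batch accepts the global -i and -o flags with directories, as the Core Options table suggests, and that the binary is on the PATH as camt-csv; batch_convert and the directory names are illustrative.

```shell
# batch_convert IN_DIR OUT_DIR: convert every supported file in IN_DIR.
# (Helper for this example only.)
batch_convert() {
  if [ ! -d "$1" ]; then
    echo "input directory not found: $1" >&2
    return 1
  fi
  mkdir -p "$2" || return 1   # avoid write errors caused by a missing output directory
  camt-csv batch -i "$1" -o "$2"
}

# Example: batch_convert statements/ converted/
```

Checking the input directory and creating the output directory up front turns two common failure modes into clear messages before the tool starts.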
Transaction Categorization¶
CAMT-CSV categorizes transactions in three tiers, tried in order until one matches:
- Direct Mapping (fastest): Exact matches from learned patterns
- Keyword Matching: Local rules from database/categories.yaml
- AI Categorization (fallback): Gemini AI for unknown transactions
Customizing Categories¶
Edit database/categories.yaml to add custom categories:
categories:
  - name: "Groceries"
    keywords:
      - "supermarket"
      - "grocery"
      - "food store"
  - name: "Transportation"
    keywords:
      - "uber"
      - "taxi"
      - "bus"
      - "train"
AI Categorization Setup¶
1. Get a Google AI API key from Google AI Studio.
2. Set your API key in the GEMINI_API_KEY environment variable.
3. Enable AI categorization in ~/.camt-csv/camt-csv.yaml by setting ai.enabled to true.
Custom Output Formats¶
Change CSV Delimiter¶
For European Excel compatibility, set csv.delimiter to ";" in ~/.camt-csv/camt-csv.yaml, or pass --csv-delimiter ";" on the command line.
Custom Data Directory¶
Store configuration files in a custom location by setting the CAMT_DATA_DIRECTORY environment variable (or the data.directory key) to the directory path.
File Format Support¶
CAMT.053 XML Files¶
Description: ISO 20022 standard bank statement format
Features:
- Complete transaction details
- Multi-currency support
- Reference numbers and codes
- Party information (payer/payee)
Example Usage:
./camt-csv camt -i statement.xml -o output.csv
PDF Bank Statements¶
Description: Extracts transactions from PDF bank statements
Supported Types:
- Viseca credit card statements (specialized parsing)
- Generic bank statement PDFs
Requirements: pdftotext must be installed
Example Usage:
./camt-csv pdf -i statement.pdf -o output.csv
Revolut CSV Files¶
Description: Processes Revolut app CSV exports
Features:
- Transaction state handling
- Fee processing
- Currency conversion tracking
- Category mapping
Example Usage:
./camt-csv revolut -i revolut_export.csv -o output.csv
Revolut Investment CSV Files¶
Description: Processes Revolut investment transaction CSV exports
Features:
- Investment transaction categorization (BUY, DIVIDEND, CASH TOP-UP)
- Share quantity and price tracking
- Multi-currency support with FX rate handling
- Automatic debit/credit classification
- Investment-specific metadata (ticker, fund information)
Supported Transaction Types:
- BUY: Stock purchases with quantity and price per share
- DIVIDEND: Dividend payments from holdings
- CASH TOP-UP: Cash deposits to investment account
Example Usage:
./camt-csv revolut-investment -i revolut_investment_export.csv -o investment_transactions.csv
Selma Investment CSV¶
Description: Processes Selma investment platform exports
Features:
- Investment transaction categorization
- Stamp duty association
- Dividend and income tracking
- Trade transaction processing
Example Usage:
./camt-csv selma -i selma_export.csv -o output.csv
Generic Debit CSV¶
Description: Processes generic CSV files with debit transactions
Features:
- Flexible column mapping
- Date format detection
- Amount standardization
Example Usage:
./camt-csv debit -i debit_export.csv -o output.csv
Transaction Categorization¶
How Categorization Works¶
CAMT-CSV implements categorization with the Strategy pattern. Four strategies are tried in order, and the first one that produces a match wins:
- Direct Mapping Strategy (Fastest):
  - Checks database/creditors.yaml and database/debtors.yaml
  - Exact, case-insensitive matches for known payees/payers
  - Instant recognition for recurring transactions
  - No processing overhead
- Keyword Strategy (Local Processing):
  - Uses pattern matching rules from database/categories.yaml
  - Matches against transaction descriptions and party names
  - Configurable keyword patterns and rules
  - No API calls required, fully local processing
- Semantic Strategy (Advanced Matching):
  - Advanced pattern matching using semantic analysis
  - Handles variations in transaction descriptions
  - More intelligent than simple keyword matching
  - Still local processing, no external API calls
- AI Strategy (Optional Fallback):
  - Fallback to Gemini AI when local methods fail
  - Context-aware analysis of transaction details
  - With --auto-learn: saves results directly to main YAML files
  - Without --auto-learn: saves results to staging files for review
  - Rate limiting to avoid exceeding the API quota
  - Lazy initialization for optimal performance
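The fall-through order can be illustrated with a toy version in shell. This is only a sketch of the idea, not the tool's implementation: the direct mapping entry is invented, the keywords come from the categories.yaml example earlier in this guide, the semantic tier is omitted, and the final branch stands in for the AI tier by returning the documented fallback category.

```shell
# categorize PARTY: return a category using progressively broader rules.
categorize() {
  # Matching is case-insensitive, so normalize to lowercase first.
  party=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # Tier 1, direct mapping: exact match on a known party (hypothetical entry).
  case "$party" in
    "city cabs ltd") echo "Transportation"; return 0 ;;
  esac

  # Tier 2, keyword matching: substring rules as in categories.yaml.
  case "$party" in
    *supermarket*|*grocery*|*"food store"*) echo "Groceries"; return 0 ;;
    *uber*|*taxi*|*bus*|*train*) echo "Transportation"; return 0 ;;
  esac

  # Last tier: stand-in for the AI call; here just the fallback category.
  echo "Uncategorized"
}

categorize "City Cabs Ltd"        # matched by the direct mapping
categorize "SuperMarket Central"  # matched by the keyword "supermarket"
categorize "Unknown Vendor"       # falls through to the fallback
```

Ordering the tiers from cheapest to most expensive means the remote call is only reached for parties that no local rule recognizes.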
Strategy Pattern Benefits¶
- Independent Testing: Each strategy can be tested and optimized separately
- Easy Extension: New categorization algorithms can be added as strategies
- Flexible Configuration: Strategies can be enabled/disabled or reordered
- Performance Optimization: Strategies execute in order of efficiency
Managing Categories¶
View Current Categories¶
cat database/categories.yaml
Add New Category¶
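Edit database/categories.yaml and add an entry with a name and a list of keywords, in the format shown under Customizing Categories.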
View Learned Mappings¶
cat database/creditors.yaml # For money received
cat database/debtors.yaml # For money spent (renamed from debitors.yaml)
Migration Note: The debtor mapping file has been renamed from debitors.yaml to debtors.yaml for standard English spelling. The application maintains backward compatibility with the old filename, but it's recommended to rename your existing file.
Categorization Best Practices¶
- Start with Keywords: Define common patterns in categories.yaml
- Use AI Sparingly: Enable AI for unknown transactions only
- Review and Clean: Periodically review learned mappings
- Case Sensitivity: Matching is case-insensitive by default (categorization.case_sensitive: false)
- Rate Limiting: Respect API limits with GEMINI_REQUESTS_PER_MINUTE
Troubleshooting¶
Common Issues¶
1. "pdftotext not found"¶
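Problem: PDF processing fails
Solution: Install Poppler Utils, which provides pdftotext (macOS: brew install poppler; Ubuntu/Debian: apt-get install poppler-utils).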
2. "Invalid file format"¶
Problem: File not recognized or validation fails
Solutions:
- Verify file format matches command (XML for camt, PDF for pdf, etc.)
- Check file isn't corrupted
- Try with a sample file first
- Look for specific error details in the error message (enhanced error types provide detailed context)
- Check for ParseError, ValidationError, or InvalidFormatError in the output
3. "API quota exceeded"¶
Problem: Too many AI categorization requests
Solutions:
- Reduce GEMINI_REQUESTS_PER_MINUTE
- Add more keywords to categories.yaml
- Process files in smaller batches
4. "Permission denied"¶
Problem: Cannot write output file
Solutions:
- Check output directory exists and is writable
- Verify file isn't open in another application
- Use absolute paths if relative paths fail
Debug Mode¶
Enable detailed logging for troubleshooting by passing the log level as a global CLI flag before the command name:
./camt-csv --log-level debug camt -i input.xml -o output.csv
Understanding Error Messages¶
CAMT-CSV provides detailed error messages with context to help troubleshoot issues:
Parse Errors¶
- Parser: Which parser encountered the error
- Field: What field failed to parse
- Value: The actual value that caused the issue
Validation Errors¶
- File Path: The file that failed validation
- Reason: Why validation failed
Data Extraction Errors¶
data extraction failed in file '/path/to/file.pdf' for field 'amount': unable to parse currency. Reason: no currency symbol found
Logging Configuration¶
Configure logging output format and level:
# JSON format for structured logging
./camt-csv --log-format json --log-level info camt -i input.xml -o output.csv
# Text format for human-readable output
./camt-csv --log-format text --log-level debug camt -i input.xml -o output.csv
Getting Help¶
- Command Help: ./camt-csv [command] --help
- General Help: ./camt-csv --help
- Version Info: ./camt-csv --version
Examples¶
Example 1: Basic CAMT.053 Conversion¶
# Convert XML bank statement to CSV
./camt-csv camt -i samples/camt053/statement.xml -o output/transactions.csv
# View the results
head -5 output/transactions.csv
Example 2: Batch Processing with Custom Delimiter¶
1. Set csv.delimiter to ";" in ~/.camt-csv/camt-csv.yaml.
2. Process all files in a directory (the directory names are placeholders):
./camt-csv batch -i input_dir/ -o output_dir/
Example 3: AI-Powered Categorization¶
1. Configure AI categorization in ~/.camt-csv/camt-csv.yaml by setting ai.enabled to true.
2. Set your API key in the GEMINI_API_KEY environment variable.
3. Process with AI categorization:
./camt-csv camt -i statement.xml -o output.csv
Example 4: Custom Categories¶
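1. Open the categories file, database/categories.yaml, in an editor.
2. Add a custom category as a name with a list of keywords, in the format shown under Customizing Categories.
3. Process transactions with the parser command for your input, for example:
./camt-csv camt -i statement.xml -o output.csv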
Example 5: Processing Revolut Investment Transactions¶
# Process Revolut investment CSV with detailed transaction categorization
./camt-csv revolut-investment -i revolut_investment_export.csv -o investment_transactions.csv
# View the processed investment transactions
head -10 investment_transactions.csv
Sample Input (Revolut Investment CSV):
Date,Ticker,Type,Quantity,Price per share,Total Amount,Currency,FX Rate
2024-01-15T10:30:00.000Z,AAPL,BUY,10,$150.00,$1500.00,USD,1.0
2024-01-20T09:15:00.000Z,AAPL,DIVIDEND,,,$25.50,USD,1.0
2024-01-25T14:45:00.000Z,,CASH TOP-UP,,,$1000.00,USD,1.0
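Before converting, it can help to see which transaction types an export contains. The sketch below counts rows per value of the Type column with awk. It assumes the column layout of the sample above and splits naively on commas, so it would miscount files whose quoted fields contain commas; count_by_type is a helper written for this example.

```shell
# count_by_type FILE: print "TYPE: COUNT" for each value in the third column,
# skipping the header row and rows with an empty type.
count_by_type() {
  awk -F, 'NR > 1 && $3 != "" { n[$3]++ } END { for (t in n) print t ": " n[t] }' "$1" | sort
}

# Example: count_by_type revolut_investment_export.csv
```

Run against the three-row sample above, this reports one BUY, one CASH TOP-UP, and one DIVIDEND row.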
Example 6: Debugging Failed Processing¶
# Process with detailed logging enabled via CLI flag
./camt-csv --log-level debug --log-format text pdf -i problematic.pdf -o debug.csv 2>&1 | tee debug.log
# Review debug information
less debug.log
Next Steps¶
- Explore Samples: Check the samples/ directory for example files
- Customize Categories: Edit database/categories.yaml for your needs
- Set Up AI: Configure Gemini API for intelligent categorization
- Automate Processing: Create scripts for regular batch processing
For technical details and development information, see the Codebase Documentation.