API Specifications - CAMT-CSV Project¶
Overview¶
This document provides formal specifications for all APIs, interfaces, and contracts within the CAMT-CSV project. These specifications serve as the foundation for specification-driven development, ensuring consistency and reliability across all components.
Parser Interface Specification¶
Core Parser Interface¶
Parse Method Specification¶
Signature: Parse(ctx context.Context, r io.Reader) ([]models.Transaction, error)
Purpose: Extract financial transactions from a source reader
Preconditions:
ctxMUST be a validcontext.ContextrMUST be a validio.Reader- The data from the reader MUST be in the format supported by the specific parser
Postconditions:
- Returns slice of valid
Transactionobjects on success - Returns empty slice (not nil) if no transactions found
- Returns error if data format is invalid or I/O error occurs
- All returned transactions MUST have valid
Date,Amount, andCurrencyfields
Error Conditions:
// Invalid format
InvalidFormatError{Reason: string}
// Parse error with context
ParseError{Line: int, Reason: string}
Implementation Requirements:
- MUST handle malformed data gracefully
- MUST use structured logging with relevant context
- MUST NOT be responsible for closing the reader
Data Model Specifications¶
Transaction Model¶
type Transaction struct {
BookkeepingNumber string `csv:"BookkeepingNumber"`
Status string `csv:"Status"`
Date time.Time `csv:"Date"`
ValueDate time.Time `csv:"ValueDate"`
Name string `csv:"Name"`
PartyName string `csv:"PartyName"`
PartyIBAN string `csv:"PartyIBAN"`
Description string `csv:"Description"`
RemittanceInfo string `csv:"RemittanceInfo"`
Amount decimal.Decimal `csv:"Amount"`
CreditDebit string `csv:"CreditDebit"`
DebitFlag bool `csv:"IsDebit"`
Debit decimal.Decimal `csv:"Debit"`
Credit decimal.Decimal `csv:"Credit"`
Currency string `csv:"Currency"`
Product string `csv:"Product"`
AmountExclTax decimal.Decimal `csv:"AmountExclTax"`
AmountTax decimal.Decimal `csv:"AmountTax"`
TaxRate decimal.Decimal `csv:"TaxRate"`
Recipient string `csv:"Recipient"`
Investment string `csv:"InvestmentType"`
Number string `csv:"Number"`
Category string `csv:"Category"`
Type string `csv:"Type"`
Fund string `csv:"Fund"`
NumberOfShares int `csv:"NumberOfShares"`
Fees decimal.Decimal `csv:"Fees"`
IBAN string `csv:"IBAN"`
EntryReference string `csv:"EntryReference"`
Reference string `csv:"Reference"`
AccountServicer string `csv:"AccountServicer"`
BankTxCode string `csv:"BankTxCode"`
OriginalCurrency string `csv:"OriginalCurrency"`
OriginalAmount decimal.Decimal `csv:"OriginalAmount"`
ExchangeRate decimal.Decimal `csv:"ExchangeRate"`
// Internal fields (not exported to CSV)
Payee string `csv:"-"`
Payer string `csv:"-"`
}
Validation Rules¶
Required Fields:
Date: Must be validtime.TimeAmount: Must be valid decimal, can be negativeCurrency: Must be valid ISO 4217 currency codeCreditDebit: Must be either "CRDT" or "DBIT"
Format Constraints:
- IBAN fields: Must pass IBAN validation if present
- Date fields: Stored as
time.Time, exported to CSV as DD.MM.YYYY - Decimal fields: Must use
decimal.Decimalfor precision - Internal fields (Payee, Payer): Not exported to CSV (csv:"-")
Business Rules:
- If
CreditDebitis "DBIT",Debitshould equalAmount - If
CreditDebitis "CRDT",Creditshould equalAmount AmountExclTax + AmountTaxshould equalAmountwhen tax fields are used
Note: Helper methods (UpdateDebitCreditAmounts, UpdateNameFromParties, ToBuilder, etc.) were removed in v2.0.0. Transactions are now immutable value objects. Parsers construct transactions with all fields set correctly during parsing.
Categorization Service Specification¶
TransactionCategorizer Interface¶
type TransactionCategorizer interface {
Categorize(ctx context.Context, partyName string, isDebtor bool, amount, date, info string) (Category, error)
}
Parameters:
ctx: Context for cancellation and timeoutspartyName: Name of the party (creditor or debtor)isDebtor: true if this is a debit transactionamount: Transaction amount as stringdate: Transaction date as stringinfo: Additional information (description, remittance info)
OutputFormatter Interface¶
type OutputFormatter interface {
Header() []string
Format(transactions []models.Transaction) ([][]string, error)
Delimiter() rune
}
Methods:
Header(): Returns CSV header rowFormat(): Converts transactions to CSV rowsDelimiter(): Returns CSV delimiter character (e.g., ';', ',')
Parser Interfaces¶
type Parser interface {
Parse(ctx context.Context, r io.Reader) ([]models.Transaction, error)
}
type Validator interface {
ValidateFormat(filePath string) (bool, error)
}
type CSVConverter interface {
ConvertToCSV(ctx context.Context, inputFile, outputFile string) error
}
type LoggerConfigurable interface {
SetLogger(logger logging.Logger)
}
type CategorizerConfigurable interface {
SetCategorizer(categorizer models.TransactionCategorizer)
}
type BatchConverter interface {
BatchConvert(ctx context.Context, inputDir, outputDir string) (int, error)
}
type FullParser interface {
Parser
Validator
CSVConverter
LoggerConfigurable
CategorizerConfigurable
BatchConverter
}
Categorization Algorithm:
- Direct Mapping: Check exact match in creditor/debtor mappings
- Keyword Matching: Match against category keywords
- AI Fallback: Use Gemini API if enabled and previous steps fail
Rate Limiting: AI calls limited by GEMINI_REQUESTS_PER_MINUTE
Learning: When auto-learn is enabled, AI categorizations are saved directly to main mapping files. When disabled, suggestions are saved to staging files (staging_creditors.yaml/staging_debtors.yaml) for manual review via StagingStoreInterface
Configuration Specification¶
Configuration Loading Order¶
- Command-line flags (highest priority)
- Environment variables
- Configuration file (
~/.camt-csv/camt-csv.yaml) - Default values (lowest priority)
Configuration Options¶
YAML Key (camt-csv.yaml) |
Environment Variable | CLI Flag | Description | Default |
|---|---|---|---|---|
log.level |
CAMT_LOG_LEVEL |
--log-level |
Logging verbosity | info |
log.format |
CAMT_LOG_FORMAT |
--log-format |
Log output format (text, json) |
text |
csv.delimiter |
CAMT_CSV_DELIMITER |
--csv-delimiter |
CSV output delimiter | , |
ai.enabled |
CAMT_AI_ENABLED |
--ai-enabled |
Enable/disable AI categorization | false |
ai.model |
CAMT_AI_MODEL |
- | Gemini model for categorization | gemini-2.0-flash |
ai.api_key |
GEMINI_API_KEY |
- | API key for Gemini | - |
Error Handling Specification¶
Error Types Hierarchy¶
// Base error interface
type CAMTError interface {
error
Code() string
Context() map[string]interface{}
}
// File operation errors
type FileNotFoundError struct {
Path string
}
type FilePermissionError struct {
Path string
Operation string
}
// Format validation errors
type InvalidFormatError struct {
Path string
Expected string
Actual string
}
// Parsing errors
type ParseError struct {
Path string
Line int
Column int
Reason string
}
// Configuration errors
type ConfigurationError struct {
Key string
Value string
Reason string
}
// External service errors
type ExternalServiceError struct {
Service string
Code int
Message string
}
Error Handling Patterns¶
Wrapping Errors:
Custom Error Creation:
if !isValidFormat {
return InvalidFormatError{
Path: filePath,
Expected: "CAMT.053 XML",
Actual: detectedFormat,
}
}
Error Context:
log.WithError(err).WithFields(logrus.Fields{
"file": filePath,
"parser": "camt",
"line": lineNumber,
}).Error("Parse error occurred")
Integration Specifications¶
File System Integration¶
File Operations Contract:
- All file operations MUST use proper resource cleanup
- Large files MUST be processed in streaming fashion
- Temporary files MUST use atomic rename pattern
- File permissions MUST be set appropriately (0644 for files, 0755 for directories)
Example Pattern:
func processFile(filePath string) error {
file, err := os.Open(filePath)
if err != nil {
return fmt.Errorf("failed to open file: %w", err)
}
defer func() {
if err := file.Close(); err != nil {
log.Warnf("Failed to close file: %v", err)
}
}()
// Process file...
}
External API Integration¶
Gemini AI Integration Contract:
- MUST respect rate limiting
- MUST handle API errors gracefully
- MUST provide fallback behavior
- MUST validate API responses
Rate Limiting Implementation:
type RateLimiter struct {
requests chan struct{}
ticker *time.Ticker
}
func (rl *RateLimiter) Allow() bool {
select {
case <-rl.requests:
return true
default:
return false
}
}
Testing Specifications¶
Unit Test Requirements¶
Coverage Requirements:
- Minimum 80% code coverage for all packages
- 100% coverage for critical paths (parsing, validation)
- All public methods MUST have tests
Test Structure:
func TestParser_ParseFile(t *testing.T) {
tests := []struct {
name string
inputFile string
expectError bool
expectCount int
setupFunc func() string // Returns temp file path
cleanupFunc func(string) // Cleanup temp file
}{
// Test cases...
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Test implementation...
})
}
}
Integration Test Requirements¶
File Processing Tests:
- Test with real sample files
- Test with malformed input
- Test with large files (performance)
- Test with edge cases (empty files, single transaction)
External Service Tests:
- Mock external APIs in tests
- Test rate limiting behavior
- Test error handling and fallbacks
Performance Specifications¶
Performance Requirements¶
File Processing:
- MUST process files up to 100MB without excessive memory usage
- MUST complete validation in under 1 second for typical files
- MUST handle 10,000+ transactions efficiently
Memory Usage:
- MUST NOT load entire file into memory for large files
- MUST use streaming processing where possible
- MUST clean up resources promptly
Benchmarking Requirements:
func BenchmarkParser_ParseFile(b *testing.B) {
parser := NewParser()
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, err := parser.ParseFile("testdata/large_file.xml")
if err != nil {
b.Fatal(err)
}
}
}
Security Specifications¶
Input Validation¶
File Path Validation:
- MUST prevent directory traversal attacks
- MUST validate file extensions where appropriate
- MUST check file permissions before processing
Data Sanitization:
- MUST sanitize all user input
- MUST validate numeric inputs
- MUST escape output where necessary
Sensitive Data Handling¶
Configuration:
- API keys MUST be loaded from environment variables
- MUST NOT log sensitive configuration values
- MUST use secure defaults
File Handling:
- Generated files MUST have appropriate permissions
- Temporary files MUST be cleaned up
- MUST NOT expose sensitive data in error messages
Compliance Specifications¶
Data Format Compliance¶
ISO 20022 Compliance:
- CAMT.053 parsing MUST follow ISO 20022 standard
- Currency codes MUST be ISO 4217 compliant
- Date formats MUST be consistent
CSV Output Compliance:
- MUST follow RFC 4180 CSV standard
- MUST handle special characters properly
- MUST provide consistent field ordering
This specification document serves as the authoritative reference for all API contracts and integration patterns within the CAMT-CSV project.