Statement Processor
A tool that reads your PDF bank and credit card statements, extracts the transactions, and organizes them by vendor—turning messy financial documents into clean, usable data.
The Problem
Your credit card gets lost, stolen, or expires. The bank sends you a new one with a new number. Now what?
Suddenly you have to update that card everywhere: streaming services, charitable donations, utilities, insurance, gym memberships, and that subscription you forgot you still pay for. Miss even one and you risk service interruptions or late fees.
The real issue is simple: you don’t actually know everywhere your card number is stored.
Digging through a year of statements is tedious, and cryptic descriptions like “AMZN MKTP US2K4X7Y9” don’t help.
Statement Processor solves this by scanning your statements and generating a clean, searchable list of every vendor you’ve paid—so on “card swap day,” you're not guessing.
How It Works
Step 1: Feed It Your Statements
Point the tool at a folder containing your PDF statements. It automatically scans and processes everything inside.
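As a rough illustration of the folder scan, here is a minimal sketch in Python. The function name `find_statements` is hypothetical, and searching subfolders recursively is an assumption; the real tool may only look at the top level.

```python
from pathlib import Path


def find_statements(folder):
    """Return every PDF under `folder`, sorted for a stable processing order."""
    root = Path(folder)
    # Assumption: subfolders are searched too. The extension check is
    # case-insensitive so "Statement.PDF" is picked up as well.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Calling `find_statements("statements/2024")` would return a list of `Path` objects, one per PDF found.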
Step 2: Automatic Parsing
The parser identifies which bank or card issuer format each statement uses and extracts every transaction—dates, descriptions, and amounts.
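The sketch below shows one way format detection and extraction could fit together. It is not the project's actual parser: `ExampleBankParser`, `pick_parser`, the `EXAMPLE BANK` marker, and the line layout are all made up for illustration, and the statement text is assumed to have already been extracted from the PDF.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass
class Transaction:
    posted: date
    description: str
    amount: Decimal


# Hypothetical line layout: "01/15/2024 COSTCO WHSE #1234 1,045.10"
LINE = re.compile(r"^(\d{2}/\d{2}/\d{4})\s+(.+?)\s+(-?[\d,]+\.\d{2})$")


class ExampleBankParser:
    """Hypothetical parser for one made-up issuer format."""

    marker = "EXAMPLE BANK"

    def can_parse(self, text):
        # Identify the format by a marker string found in the statement text.
        return self.marker in text

    def parse(self, text):
        rows = []
        for line in text.splitlines():
            match = LINE.match(line.strip())
            if match:  # lines that are not transactions are skipped
                rows.append(Transaction(
                    posted=datetime.strptime(match.group(1), "%m/%d/%Y").date(),
                    description=match.group(2),
                    amount=Decimal(match.group(3).replace(",", "")),
                ))
        return rows


def pick_parser(text, parsers):
    """Return the first parser that recognizes the text, or None."""
    for parser in parsers:
        if parser.can_parse(text):
            return parser
    return None
```

Amounts are kept as `Decimal` rather than `float` so that totals do not pick up rounding errors.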
Step 3: Smart Vendor Grouping
Transactions are intelligently clustered by vendor. Entries like “COSTCO WHSE #1234” and “COSTCO WHOLESALE” get recognized as the same place: Costco.
You can extend this behavior with your own vendor-normalization patterns. If no pattern matches, the tool still groups transactions by description similarity so nothing gets lost.
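To make the two-stage idea concrete, here is a small sketch: explicit patterns first, then a similarity fallback. The pattern list, the `clean` and `group_by_vendor` helpers, and the 0.8 threshold are assumptions for illustration, and the standard library's `difflib` stands in for whatever similarity measure the tool actually uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical user-supplied patterns: regex -> canonical vendor name.
VENDOR_PATTERNS = [
    (re.compile(r"^COSTCO\b", re.I), "Costco"),
    (re.compile(r"^(AMZN|AMAZON)\b", re.I), "Amazon"),
]


def clean(description):
    """Uppercase and drop digits/punctuation so store numbers do not block a match."""
    return " ".join(re.sub(r"[^A-Z ]", " ", description.upper()).split())


def group_by_vendor(descriptions, threshold=0.8):
    groups = {}
    for desc in descriptions:
        # 1) Explicit patterns win.
        name = next((v for rx, v in VENDOR_PATTERNS if rx.search(desc)), None)
        if name is None:
            # 2) Fallback: join the first existing group that is similar
            #    enough, otherwise start a new group under the cleaned text.
            key = clean(desc)
            name = next(
                (g for g in groups
                 if SequenceMatcher(None, clean(g), key).ratio() >= threshold),
                key,
            )
        groups.setdefault(name, []).append(desc)
    return groups
```

With this sketch, “COSTCO WHSE #1234” and “COSTCO WHOLESALE” both land under `Costco`, while unmatched descriptions that differ only by a store number still end up in the same group.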
Step 4: Get Your Results
The output is a simple CSV with one row per vendor, like this sample:
| vendor_name | transaction_count | total_amount | earliest_date | latest_date |
|---|---|---|---|---|
| Walmart | 160 | 7768.43 | 2023-01-01 | 2025-11-08 |
| Amazon | 141 | 5096.36 | 2023-01-02 | 2025-11-07 |
| PayPal | 23 | 828.79 | 2023-12-24 | 2025-10-28 |
| Microsoft | 9 | 227.01 | 2023-01-09 | 2025-10-09 |
Clean, portable, and ready for analysis or your card-update checklist.
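For readers curious how a per-vendor summary like this could be computed, here is a minimal sketch using only the standard library. The column names come from the sample above; the `summarize` and `to_csv` helpers, the input tuple shape, and sorting by transaction count are assumptions.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

COLUMNS = ["vendor_name", "transaction_count", "total_amount",
           "earliest_date", "latest_date"]


def summarize(transactions):
    """transactions: iterable of (vendor_name, iso_date, amount) tuples."""
    by_vendor = defaultdict(list)
    for vendor, day, amount in transactions:
        by_vendor[vendor].append((day, Decimal(amount)))

    rows = []
    for vendor, items in by_vendor.items():
        dates = [day for day, _ in items]
        rows.append({
            "vendor_name": vendor,
            "transaction_count": len(items),
            "total_amount": str(sum((amt for _, amt in items), Decimal("0"))),
            "earliest_date": min(dates),  # ISO dates sort correctly as strings
            "latest_date": max(dates),
        })
    # Assumption: busiest vendors first, as in the sample table.
    rows.sort(key=lambda row: row["transaction_count"], reverse=True)
    return rows


def to_csv(rows):
    """Render summary rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```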
Key Features
A plugin-based architecture allows support for new banks and card issuers without modifying the core software. Custom clustering logic and new processing strategies can be added as standalone modules.
Built-in patterns decode many common vendor description formats. You can install additional Python packages to add custom vendor rules—the system automatically discovers and loads supported plugins.
Run it from your terminal with a single command. Batch-process multiple statement folders and send the output wherever you need.
Use Cases
Current Use Cases
The tool currently produces a CSV summary of each vendor, including transaction count, total spend, and earliest/latest charge dates.
This gives you a quick list of vendors to check or update when a credit card is lost, stolen, or replaced.
Future Use Cases (Planned)
Who Is This For?
- Anyone switching banks or credit cards who needs a list of every vendor to update
- People replacing lost or stolen cards who want a complete vendor inventory
- Personal finance enthusiasts seeking better insight into recurring charges
- Developers needing a clean, extensible framework for working with statement data
Links
- GitHub Repository: https://github.com/bogdanvarlamov/statement-processor — full documentation, installation, and source.
Following the Journey
Development updates and technical deep dives will be posted on the blog under the statement-processor-project tag.