Statement Processor

A tool that reads your PDF bank and credit card statements, extracts the transactions, and organizes them by vendor—turning messy financial documents into clean, usable data.

The Problem

Your credit card gets lost, stolen, or expires. The bank sends you a new one with a new number. Now what?

Suddenly you have to update that card everywhere: streaming services, charitable donations, utilities, insurance, gym memberships, and that subscription you forgot you still pay for. Miss even one and you risk service interruptions or late fees.

The real issue is simple: you don’t actually know everywhere your card number is stored.

Digging through a year of statements is tedious, and cryptic descriptions like “AMZN MKTP US2K4X7Y9” don’t help.

Statement Processor solves this by scanning your statements and generating a clean, searchable list of every vendor you’ve paid, so on “card swap day” you’re not guessing.


How It Works


Step 1: Feed It Your Statements

Point the tool at a folder containing your PDF statements. It automatically scans and processes everything inside.
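As a rough illustration of this step, the sketch below collects every PDF under a folder. It assumes the search is recursive and that statements are identified by a `.pdf` extension; the function name `find_statements` is made up for this example and is not the tool's actual API.

```python
from pathlib import Path


def find_statements(folder: str) -> list[Path]:
    """Return every PDF under `folder`, sorted for a stable processing order."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    # Compare the extension case-insensitively so both .pdf and .PDF are found.
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == ".pdf"
    )
```

Sorting the paths keeps repeated runs over the same folder in the same order, which makes the output easier to compare between runs.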

Step 2: Automatic Parsing

The parser identifies which bank or card issuer format each statement uses and extracts every transaction—dates, descriptions, and amounts.
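The sketch below shows one way format detection and extraction could work on text that has already been pulled out of a PDF. Everything in it is an assumption made for illustration: the issuer markers, the one-transaction-per-line layout, and the names `detect_issuer` and `extract_transactions` are hypothetical and do not describe the tool's real parsers.

```python
import re
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional

# Hypothetical header markers; real statements would need issuer-specific rules.
ISSUER_MARKERS = {
    "example_bank": "EXAMPLE BANK",
    "sample_card": "SAMPLE CARD SERVICES",
}

# Assumed line layout: ISO date, free-text description, amount with two decimals.
TRANSACTION_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s+(-?\d+\.\d{2})$")


@dataclass(frozen=True)
class Transaction:
    posted: date
    description: str
    amount: Decimal


def detect_issuer(text: str) -> Optional[str]:
    """Return the key of the first issuer whose marker appears in the text."""
    for issuer, marker in ISSUER_MARKERS.items():
        if marker in text:
            return issuer
    return None


def extract_transactions(text: str) -> list[Transaction]:
    """Keep only the lines that match the assumed transaction layout."""
    rows = []
    for line in text.splitlines():
        m = TRANSACTION_LINE.match(line.strip())
        if m:
            rows.append(Transaction(date.fromisoformat(m[1]), m[2], Decimal(m[3])))
    return rows
```

Amounts are kept as `Decimal` rather than `float` so that totals computed later do not pick up binary rounding errors.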

Step 3: Smart Vendor Grouping

Transactions are clustered by vendor, so entries like “COSTCO WHSE #1234” and “COSTCO WHOLESALE” are recognized as the same place: Costco.

You can extend this behavior with your own vendor-normalization patterns. If no pattern matches, the tool still groups transactions by description similarity so nothing gets lost.
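A minimal sketch of the two-stage grouping described above: explicit patterns first, then a similarity fallback. The patterns, the 0.8 threshold, and the use of `difflib.SequenceMatcher` are assumptions chosen for this example; the tool's own rules and similarity measure may differ.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

# Illustrative patterns only; these are not the tool's built-in rules.
VENDOR_PATTERNS = [
    (re.compile(r"^COSTCO\b", re.I), "Costco"),
    (re.compile(r"^(AMZN|AMAZON)\b", re.I), "Amazon"),
]


def normalize(description: str) -> Optional[str]:
    """Return a canonical vendor name if any pattern matches, else None."""
    for pattern, vendor in VENDOR_PATTERNS:
        if pattern.search(description):
            return vendor
    return None


def group_by_vendor(
    descriptions: list[str], threshold: float = 0.8
) -> dict[str, list[str]]:
    groups: dict[str, list[str]] = {}
    for desc in descriptions:
        key = normalize(desc)
        if key is None:
            # Fallback: join the first existing group whose key is similar
            # enough, otherwise start a new group keyed by the description.
            key = next(
                (
                    k
                    for k in groups
                    if SequenceMatcher(None, k.lower(), desc.lower()).ratio()
                    >= threshold
                ),
                desc,
            )
        groups.setdefault(key, []).append(desc)
    return groups
```

Because unmatched descriptions always end up in some group, even one of their own, no transaction is dropped when a pattern is missing.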

Step 4: Get Your Results

The output is a simple CSV summarizing each vendor, like this sample:

vendor_name,transaction_count,total_amount,earliest_date,latest_date
Walmart,160,7768.43,2023-01-01,2025-11-08
Amazon,141,5096.36,2023-01-02,2025-11-07
PayPal,23,828.79,2023-12-24,2025-10-28
Microsoft,9,227.01,2023-01-09,2025-10-09

Clean, portable, and ready for analysis or your card-update checklist.
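To show how a summary with these five columns could be produced, here is a small sketch using only the standard library. The input shape (vendor, ISO date, amount as a string) and the sort by transaction count are assumptions for this example, not a description of the tool's internals.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def summarize(transactions: list[tuple[str, str, str]]) -> str:
    """Aggregate (vendor, ISO date, amount) rows into a CSV summary string."""
    by_vendor: dict[str, list[tuple[str, Decimal]]] = defaultdict(list)
    for vendor, iso_date, amount in transactions:
        by_vendor[vendor].append((iso_date, Decimal(amount)))

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(
        ["vendor_name", "transaction_count", "total_amount",
         "earliest_date", "latest_date"]
    )
    # Assumed ordering: vendors with the most transactions first.
    for vendor, rows in sorted(by_vendor.items(), key=lambda kv: -len(kv[1])):
        dates = [d for d, _ in rows]  # ISO dates sort correctly as strings
        total = sum((amt for _, amt in rows), Decimal("0"))
        writer.writerow([vendor, len(rows), total, min(dates), max(dates)])
    return out.getvalue()
```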


Key Features

🔌 Extensible Design

A plugin-based architecture allows support for new banks and card issuers without modifying the core software. Custom clustering logic and new processing strategies can be added as standalone modules.

🏷️ Vendor Recognition

Built-in patterns decode many common vendor description formats. You can install additional Python packages to add custom vendor rules—the system automatically discovers and loads supported plugins.

🖥️ Command Line Interface

Run it from your terminal with a single command. Batch-process multiple statement folders and send the output wherever you need.
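The real command name and options are not documented on this page, so the following is only a sketch of how such an interface might be laid out with `argparse`. The program name `statement-processor`, the positional `folders` argument, and the `--output` option are all hypothetical.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Program and option names are placeholders, not the tool's real CLI.
    parser = argparse.ArgumentParser(prog="statement-processor")
    parser.add_argument(
        "folders", nargs="+", type=Path, help="folders containing PDF statements"
    )
    parser.add_argument(
        "-o", "--output", type=Path, default=Path("vendors.csv"),
        help="where to write the CSV summary",
    )
    return parser


def main(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    for folder in args.folders:
        # A real implementation would scan, parse, and summarize here.
        print(f"would process {folder} -> {args.output}")
    return 0
```

Accepting several folders in one invocation is what makes batch processing a single command rather than a shell loop.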


Use Cases

Current Use Cases

Vendor Summary for Card Replacement

Produces a CSV summary of each vendor, including transaction count, total spend, and earliest/latest charge dates.

This gives you a quick list of vendors to check or update when a credit card is lost, stolen, or replaced.

High-Level Financial Review

Provides a simple overview of spending distribution across vendors by aggregating transaction amounts and date ranges.

Multi-Card or Multi-Bank Consolidation

Processes statements from multiple institutions and merges the results into one unified vendor table.

Vendor Description Normalization

Groups different statement descriptions that refer to the same merchant, producing cleaner and more accurate summaries.
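For the multi-card or multi-bank consolidation use case, merging per-institution results comes down to adding counts and totals and widening the date range for each vendor. The sketch below assumes each summary maps a vendor name to a `(count, total, earliest, latest)` tuple with ISO date strings; that shape and the name `merge_summaries` are illustrative only.

```python
from decimal import Decimal

# vendor -> (transaction_count, total_amount, earliest_date, latest_date)
Summary = dict[str, tuple[int, Decimal, str, str]]


def merge_summaries(*summaries: Summary) -> Summary:
    """Combine per-institution summaries into one vendor table."""
    merged: Summary = {}
    for summary in summaries:
        for vendor, (count, total, first, last) in summary.items():
            if vendor in merged:
                c, t, f, l = merged[vendor]
                # ISO dates compare correctly as strings, so min/max is enough.
                merged[vendor] = (c + count, t + total, min(f, first), max(l, last))
            else:
                merged[vendor] = (count, total, first, last)
    return merged
```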

Future Use Cases (Planned)

Automated Subscription Discovery

Detect recurring or subscription-like charges to identify unused or forgotten services.

More Detailed Transaction Categorization

Assign category labels (e.g., groceries, utilities, entertainment) to vendors for spending analysis.

Recurring Transaction Detection

Identify vendors that charge at predictable intervals (monthly, biweekly, irregular-but-patterned).

Vendor Change Detection

Highlight when vendor pricing or billing patterns shift, or when new variations of the vendor name appear.

OCR Support for Scanned Statements

Process scanned or photographed statements that lack embedded text.
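Recurring transaction detection is still only planned, so the following is just one possible heuristic rather than the project's design: treat a vendor as recurring when the gaps between its charges all stay close to the median gap. The function name, the three-day tolerance, and the minimum of three charges are assumptions made for this sketch.

```python
from datetime import date
from statistics import median


def looks_recurring(
    dates: list[date], tolerance_days: int = 3, min_charges: int = 3
) -> bool:
    """Return True when charges arrive at a roughly constant interval."""
    if len(dates) < min_charges:
        return False
    ordered = sorted(dates)
    # Day gaps between consecutive charges.
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    typical = median(gaps)
    return typical > 0 and all(abs(g - typical) <= tolerance_days for g in gaps)
```

The tolerance absorbs month-length differences and weekend posting delays; the irregular-but-patterned case mentioned above would need something more elaborate than a single median gap.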

Who Is This For?

  • Anyone switching banks or credit cards who needs a list of every vendor to update
  • People replacing lost or stolen cards who want a complete vendor inventory
  • Personal finance enthusiasts seeking better insight into recurring charges
  • Developers needing a clean, extensible framework for working with statement data

Following the Journey

Development updates and technical deep dives will be posted on the blog under the statement-processor-project tag.