
System Design: Payment System

Payment systems are the one place in engineering where a bug has a dollar sign attached to it. If your social media feed drops a post, nobody notices. If your payment system charges someone twice, you're dealing with chargebacks, angry customers, and potentially regulators.

Stripe processes hundreds of billions of dollars annually. Square handles millions of transactions per day. Let's design a system that can handle that kind of volume without losing a cent.

requirements

Functional:

  • Accept payments via credit card, debit card, bank transfer, and digital wallets
  • Process refunds (full and partial)
  • Payout to merchants/sellers
  • Transaction history and receipts
  • Webhook notifications for payment events
  • Multi-currency support

Non-functional:

  • Exactly-once payment processing: requests may be retried, but each payment takes effect only once (this is non-negotiable)
  • Data consistency — the ledger must always balance
  • Sub-second payment confirmation
  • PCI DSS compliance for card data
  • 99.999% availability (five nines, about 5.3 minutes of downtime per year)
  • Auditability — every state change is logged
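As a quick check on the five-nines target, the downtime budget is just the fraction of a year the system is allowed to be unavailable. This is a small sketch; the function name is hypothetical and a year is taken as 365.25 days.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = downtime_minutes_per_year(99.999)   # a little over 5 minutes
four_nines = downtime_minutes_per_year(99.99)    # a little under 53 minutes
```

Each extra nine cuts the budget by a factor of ten, which is why five nines leaves almost no room for manual failover.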

high-level architecture

┌──────────┐     ┌───────────┐     ┌──────────────┐
│  Client   │────▶│  Payment   │────▶│  Payment      │
│  (checkout│     │  API GW    │     │  Service      │
│   page)   │     │  (auth,    │     │  (orchestrate)│
└──────────┘     │  idempot.) │     └──────┬───────┘
                 └───────────┘            │
                                    ┌─────┴──────┐
                               ┌────▼───┐  ┌─────▼──────┐
                               │ Ledger  │  │  Payment    │
                               │ Service │  │  Processor  │
                               │ (double │  │  Gateway    │
                               │  entry) │  │  (Stripe,   │
                               └────┬───┘  │  Adyen)     │
                                    │      └─────┬──────┘
                               ┌────▼───┐        │
                               │Postgres │  ┌────▼──────┐
                               │(ledger) │  │  External  │
                               └────────┘  │  Payment   │
                                           │  Networks  │
                                           │  (Visa,    │
                                           │  Mastercard)│
                                           └───────────┘

Three critical components: the Payment Service (orchestration), the Ledger Service (accounting), and the Payment Processor Gateway (communication with external payment networks).

idempotency: the most important concept

Network failures happen. A client sends a payment request, the server processes it, but the response gets lost. The client retries. Without idempotency, you've charged the customer twice.

Idempotency key: Every payment request includes a unique client-generated key (usually a UUID). The server stores the key with the result. On retry, it returns the stored result instead of processing again.

Idempotency Flow:

Request 1: POST /payments {idempotency_key: "abc-123", amount: 50.00}
  → Server processes payment → stores result → returns {status: "success"}

Request 2 (retry): POST /payments {idempotency_key: "abc-123", amount: 50.00}
  → Server finds "abc-123" in idempotency store
  → Returns stored result {status: "success"}
  → NO second charge

Implementation:
┌─────────────────────────────────────────────────┐
│  idempotency_keys table                         │
│  ┌──────────────┬─────────┬──────────────────┐  │
│  │ key          │ status  │ response         │  │
│  ├──────────────┼─────────┼──────────────────┤  │
│  │ abc-123      │ done    │ {payment_id: ..} │  │
│  │ def-456      │ started │ NULL             │  │
│  └──────────────┴─────────┴──────────────────┘  │
└─────────────────────────────────────────────────┘

The idempotency key has three states: started (request received, processing begun), done (completed, response stored), and failed (error, safe to retry). If a request arrives for a key in started state, return 409 Conflict — the original is still processing.

Stripe's API works this way: POST requests accept an Idempotency-Key header, and a retry with the same key returns the saved result instead of creating a second charge. It's one of the simplest and most effective protections against double charges.
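A minimal sketch of the key lifecycle described above, assuming an in-memory dictionary in place of the idempotency_keys table. The names IdempotencyStore, KeyRecord, and handle are hypothetical. A production version would also need the "started" row to be inserted atomically (for example under a unique constraint) so two concurrent requests cannot both begin processing.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class KeyRecord:
    status: str                      # "started", "done", or "failed"
    response: Optional[dict] = None  # populated only when status is "done"


class IdempotencyStore:
    """In-memory stand-in for the idempotency_keys table (hypothetical)."""

    def __init__(self) -> None:
        self._records: Dict[str, KeyRecord] = {}

    def handle(self, key: str, process: Callable[[], dict]) -> Tuple[int, dict]:
        record = self._records.get(key)
        if record is not None and record.status == "done":
            return 200, record.response           # replay the stored result
        if record is not None and record.status == "started":
            return 409, {"error": "original request still processing"}
        # New key, or the previous attempt failed: safe to (re)try.
        self._records[key] = KeyRecord(status="started")
        try:
            response = process()
        except Exception:
            self._records[key].status = "failed"
            return 500, {"error": "payment failed, safe to retry"}
        self._records[key] = KeyRecord(status="done", response=response)
        return 200, response


store = IdempotencyStore()
charges = []  # records every time the "processor" is actually called


def charge() -> dict:
    charges.append(5000)
    return {"payment_id": "pay_001", "status": "success"}


key = str(uuid.uuid4())
first = store.handle(key, charge)
retry = store.handle(key, charge)   # same key: served from the store
```

The retry returns the same response as the first call, and the charge function runs only once.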

double-entry bookkeeping

Every financial transaction creates two ledger entries: a debit from one account and a credit to another. The total debits must always equal the total credits. If they don't, money appeared or disappeared, and you have a bug.

Customer pays $50 for an order:

┌────────────────────────────────────────────┐
│  Ledger Entries                             │
│                                             │
│  Entry 1: DEBIT  customer_account   $50.00  │
│  Entry 2: CREDIT merchant_escrow    $50.00  │
│                                             │
│  After platform fee (10%):                  │
│  Entry 3: DEBIT  merchant_escrow    $5.00   │
│  Entry 4: CREDIT platform_revenue   $5.00   │
│                                             │
│  Merchant payout:                           │
│  Entry 5: DEBIT  merchant_escrow    $45.00  │
│  Entry 6: CREDIT merchant_bank      $45.00  │
│                                             │
│  Balance check: Σ debits = Σ credits ✓      │
└────────────────────────────────────────────┘

Why double-entry? Because single-entry ("customer balance -= 50") is very hard to audit. If the balance is wrong, you can't trace why. Double-entry creates a complete audit trail. Every dollar has a source and a destination.

Store ledger entries in an append-only table. Never update or delete ledger rows. Corrections are made by adding new entries (a refund is a reverse entry, not a deletion).
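The entries in the example above can be sketched as an append-only ledger with a balance check. This is an illustration only: the Ledger and Entry classes are hypothetical, the storage is an in-memory list rather than a Postgres table, and amounts are integer cents.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entry:
    account: str
    side: str          # "DEBIT" or "CREDIT"
    amount_cents: int  # integer minor units, never floats


class Ledger:
    """Append-only ledger: entries are added, never updated or deleted."""

    def __init__(self) -> None:
        self._entries: List[Entry] = []

    def transfer(self, debit_account: str, credit_account: str, amount_cents: int) -> None:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        # Both legs are appended together, so the ledger never holds half a transaction.
        self._entries.extend([
            Entry(debit_account, "DEBIT", amount_cents),
            Entry(credit_account, "CREDIT", amount_cents),
        ])

    def totals(self) -> Tuple[int, int]:
        debits = sum(e.amount_cents for e in self._entries if e.side == "DEBIT")
        credits = sum(e.amount_cents for e in self._entries if e.side == "CREDIT")
        return debits, credits

    def balance(self, account: str) -> int:
        """Credits minus debits for a single account."""
        return sum(
            e.amount_cents if e.side == "CREDIT" else -e.amount_cents
            for e in self._entries
            if e.account == account
        )


ledger = Ledger()
ledger.transfer("customer_account", "merchant_escrow", 5000)   # $50.00 payment
ledger.transfer("merchant_escrow", "platform_revenue", 500)    # 10% platform fee
ledger.transfer("merchant_escrow", "merchant_bank", 4500)      # merchant payout
debits, credits = ledger.totals()
```

After the payout, total debits equal total credits and the escrow account nets to zero. A refund would be another transfer in the opposite direction, not a deletion.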

payment state machine

A payment goes through a defined sequence of states. Model this explicitly: it prevents impossible transitions and makes debugging far easier.

Payment State Machine:

  ┌──────────┐
  │ CREATED   │ ─── validation fails ──▶ REJECTED
  └─────┬────┘
        │ validate
  ┌─────▼────┐
  │ PENDING   │ ─── timeout (30min) ───▶ EXPIRED
  └─────┬────┘
        │ submit to processor
  ┌─────▼────────┐
  │ PROCESSING    │ ─── processor error ──▶ FAILED
  └─────┬────────┘
        │ processor confirms
  ┌─────▼────┐
  │ SUCCEEDED │ ─── refund requested ───▶ REFUNDING
  └──────────┘                                │
                                        ┌─────▼────┐
                                        │ REFUNDED  │
                                        └──────────┘

Every transition is logged. No state can be skipped.

Store the current state and the full state history. When debugging "why was this payment charged twice," the state history tells the complete story.
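One way to enforce the diagram above is a transition table that rejects anything not listed. This is a sketch with hypothetical names (TRANSITIONS, Payment, InvalidTransition); a real system would persist the state and history in the same database transaction.

```python
from typing import Dict, List, Set

# Allowed transitions, transcribed from the state diagram.
TRANSITIONS: Dict[str, Set[str]] = {
    "CREATED": {"PENDING", "REJECTED"},
    "PENDING": {"PROCESSING", "EXPIRED"},
    "PROCESSING": {"SUCCEEDED", "FAILED"},
    "SUCCEEDED": {"REFUNDING"},
    "REFUNDING": {"REFUNDED"},
}


class InvalidTransition(Exception):
    """Raised when a transition is not in the table."""


class Payment:
    def __init__(self, payment_id: str) -> None:
        self.payment_id = payment_id
        self.state = "CREATED"
        self.history: List[str] = ["CREATED"]   # full state history for debugging

    def transition(self, new_state: str) -> None:
        # Terminal states have no entry in the table, so nothing is allowed from them.
        if new_state not in TRANSITIONS.get(self.state, set()):
            raise InvalidTransition(f"{self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


payment = Payment("pay_001")
for step in ("PENDING", "PROCESSING", "SUCCEEDED"):
    payment.transition(step)

skipped = Payment("pay_002")
try:
    skipped.transition("SUCCEEDED")   # CREATED cannot jump straight to SUCCEEDED
    skip_rejected = False
except InvalidTransition:
    skip_rejected = True
```

The same check covers out-of-order events: a refund event arriving for a payment still in PROCESSING is rejected rather than silently applied.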

webhook handling

External payment processors (Stripe, Adyen) communicate results via webhooks. Your system receives an HTTP callback when a payment succeeds, fails, or is disputed.

Webhook Processing:

Stripe ──POST──▶ /webhooks/stripe
                      │
                ┌─────▼──────┐
                │  Verify      │ ← Check signature (HMAC)
                │  Signature   │   Reject if invalid
                └─────┬──────┘
                      │
                ┌─────▼──────┐
                │  Idempotent  │ ← Check event_id
                │  Check       │   Skip if already processed
                └─────┬──────┘
                      │
                ┌─────▼──────┐
                │  Process     │ ← Update payment state
                │  Event       │   Create ledger entries
                └─────┬──────┘
                      │
                ┌─────▼──────┐
                │  Return 200  │ ← ACK within 5 seconds
                └────────────┘   or Stripe retries

Critical rules for webhook handling:

  1. Verify signatures. Stripe signs every webhook with HMAC-SHA256. If you don't verify, anyone can send fake payment confirmations.
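  2. Acknowledge quickly. Verify the signature, record the event, and return 200; do any slow work asynchronously. If the handler takes longer than the provider's timeout (5 seconds in the diagram above), the provider retries and you receive the same event again.
  3. Idempotent processing. Webhooks can be delivered more than once. Use the event ID to deduplicate.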
  4. Handle out-of-order delivery. You might receive a payment.refunded event before payment.succeeded. Check the current state before applying transitions.
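The first three rules can be sketched with the standard library. Everything here is an assumption for illustration: the secret, the handle_webhook and sign functions, and the signing scheme (HMAC-SHA256 over the timestamp, a dot, and the raw body). Check your provider's documentation for its exact header format, and note that a real handler would also reject stale timestamps to limit replay.

```python
import hashlib
import hmac
import json
from typing import Set

WEBHOOK_SECRET = b"whsec_test_secret"    # hypothetical shared secret
processed_event_ids: Set[str] = set()    # stand-in for a table with a unique event_id


def sign(timestamp: str, body: bytes, secret: bytes = WEBHOOK_SECRET) -> str:
    """Hex HMAC-SHA256 over "<timestamp>.<body>" (assumed scheme)."""
    return hmac.new(secret, timestamp.encode() + b"." + body, hashlib.sha256).hexdigest()


def handle_webhook(timestamp: str, body: bytes, signature: str) -> int:
    expected = sign(timestamp, body)
    # compare_digest is a constant-time comparison, which avoids timing leaks.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return 400                        # invalid signature: reject
    event = json.loads(body)
    if event["id"] in processed_event_ids:
        return 200                        # duplicate delivery: acknowledge, do nothing
    processed_event_ids.add(event["id"])
    # A real handler would enqueue the event here and process it asynchronously.
    return 200


body = json.dumps({"id": "evt_001", "type": "payment.succeeded"}).encode()
signature = sign("1700000000", body)
first = handle_webhook("1700000000", body, signature)
duplicate = handle_webhook("1700000000", body, signature)
forged = handle_webhook("1700000000", body, "0" * 64)
```

The duplicate delivery is acknowledged without being recorded twice, and the forged signature is rejected before the body is even parsed.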

PCI compliance

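If you store, process, or transmit raw credit card numbers, your systems fall under the full PCI DSS standard: 12 top-level requirements that expand into hundreds of controls, plus an annual on-site audit once you reach Level 1 transaction volume. That's a multi-month effort. Stripe, Adyen, and Braintree exist in large part so you don't have to do this.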

PCI-Compliant Architecture:

┌──────────┐     ┌──────────────┐     ┌──────────┐
│  Client   │────▶│  Stripe.js /  │────▶│  Stripe   │
│  Browser  │     │  Elements     │     │  Servers  │
└──────────┘     │  (tokenizes   │     │  (PCI L1) │
                 │  card data)   │     └─────┬────┘
                 └──────────────┘           │
                                      token │
                                      (tok_xxx)
                                            │
┌──────────┐     ┌──────────────┐     ┌─────▼────┐
│  Your     │────▶│  Your Server  │────▶│  Stripe   │
│  Client   │     │  (never sees  │     │  API      │
│           │     │  card number) │     │  (charges │
└──────────┘     └──────────────┘     │  token)   │
                                      └──────────┘

Card data flows from the browser directly to Stripe's servers via their JavaScript SDK. Your backend only receives a token. You never see, store, or transmit card numbers. This is called tokenization, and it shrinks your PCI burden from a full PCI DSS assessment (hundreds of controls) to SAQ A (a short self-assessment questionnaire).

reconciliation

Even with perfect software, discrepancies happen. Your ledger says a payment succeeded, but Stripe's records show it was declined. Network partitions, timeouts, and race conditions create these gaps.

Daily reconciliation: Every night, pull the day's transactions from each payment processor's API. Compare against your ledger. Flag discrepancies. This is not optional — every payment company runs reconciliation.

Reconciliation Process:

Your Ledger          Stripe Records         Result
┌──────────┐        ┌──────────┐
│ pay_001 ✓ │        │ pay_001 ✓ │  ──▶  Match ✓
│ pay_002 ✓ │        │ pay_002 ✗ │  ──▶  MISMATCH (investigate)
│ pay_003 ✓ │        │           │  ──▶  MISSING (Stripe lost it?)
│           │        │ pay_004 ✓ │  ──▶  UNRECORDED (we lost it)
└──────────┘        └──────────┘

Automate mismatch resolution for common cases (timeout-related discrepancies). Escalate unusual patterns to a human. Keep a separate reconciliation table with the investigation status of each discrepancy.
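The comparison in the diagram above comes down to a set operation over payment IDs. A minimal sketch, assuming both sides have already been loaded into dictionaries mapping payment ID to status; the reconcile function and the status strings are hypothetical.

```python
from typing import Dict


def reconcile(ledger: Dict[str, str], processor: Dict[str, str]) -> Dict[str, str]:
    """Classify every payment ID seen on either side."""
    results: Dict[str, str] = {}
    for payment_id in sorted(set(ledger) | set(processor)):
        ours = ledger.get(payment_id)
        theirs = processor.get(payment_id)
        if ours is None:
            results[payment_id] = "UNRECORDED"   # processor has it, we do not
        elif theirs is None:
            results[payment_id] = "MISSING"      # we have it, processor does not
        elif ours == theirs:
            results[payment_id] = "MATCH"
        else:
            results[payment_id] = "MISMATCH"     # both have it, statuses differ
    return results


our_ledger = {"pay_001": "succeeded", "pay_002": "succeeded", "pay_003": "succeeded"}
stripe_records = {"pay_001": "succeeded", "pay_002": "declined", "pay_004": "succeeded"}
report = reconcile(our_ledger, stripe_records)
discrepancies = {pid: result for pid, result in report.items() if result != "MATCH"}
```

Only the non-matching rows would be written to the reconciliation table for automated resolution or human review.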

key tradeoffs

Consistency vs. availability: Payment systems choose consistency. AP systems (eventual consistency) are unacceptable when money is involved. Use synchronous writes to the primary database. Accept that a database failover might cause a few seconds of downtime. That's better than charging someone twice.

Own vs. outsource: Building a payment processor from scratch takes years and millions in compliance costs. Unless you're Stripe-sized, use Stripe/Adyen as the processor gateway and focus your engineering on the orchestration, ledger, and reconciliation layers.

Synchronous vs. asynchronous processing: The customer expects immediate feedback ("payment successful"). But the actual settlement takes 1-3 days. The solution: confirm the authorization synchronously (takes ~500ms), then process capture and settlement asynchronously. The customer sees "paid," and the money moves later.

Multi-currency: Store amounts as integers in the currency's smallest unit (cents, not dollars) to avoid floating-point errors. Not every currency has two decimal places (the Japanese yen has none), so keep the currency code next to every amount. Store the original currency and the converted amount separately. Use a reliable exchange rate service (not a cached rate from 3 days ago). Convert at the time of the transaction and store the rate used.
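A sketch of the multi-currency advice using Python's decimal module. The convert function and ConvertedAmount record are hypothetical, the exchange rates are made-up illustrative numbers, and the rounding mode (banker's rounding) is an assumption; your finance team may require a different rule.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_EVEN

# Number of decimal places in each currency's minor unit (a small illustrative subset).
MINOR_UNITS = {"USD": 2, "EUR": 2, "JPY": 0}


@dataclass(frozen=True)
class ConvertedAmount:
    original_minor: int        # amount as charged, in minor units
    original_currency: str
    converted_minor: int       # amount after conversion, in minor units
    converted_currency: str
    rate: Decimal              # the rate actually applied, kept for audit


def convert(amount_minor: int, src: str, dst: str, rate: Decimal) -> ConvertedAmount:
    # Shift to major units, apply the rate, shift back, then round to a whole minor unit.
    src_major = Decimal(amount_minor).scaleb(-MINOR_UNITS[src])
    dst_minor = (src_major * rate).scaleb(MINOR_UNITS[dst])
    rounded = dst_minor.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)
    return ConvertedAmount(amount_minor, src, int(rounded), dst, rate)


# $50.00 converted with illustrative (not real) rates.
usd_to_eur = convert(5000, "USD", "EUR", Decimal("0.92"))
usd_to_jpy = convert(5000, "USD", "JPY", Decimal("149.50"))

# The reason floats are avoided: binary floating point cannot represent 0.1 exactly.
float_is_exact = (0.1 + 0.2 == 0.3)
```

Storing the rate alongside both amounts means a later refund or dispute can be traced back to the exact conversion that was applied.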

The golden rule of payment systems: when in doubt, don't charge. It's easier to retry a failed payment than to refund a duplicate one.

