Navigating Pagination Strategies
How LZStock selects between Offset, Keyset (Cursor), and Database Cursors to balance UI requirements with PostgreSQL B-Tree index performance.
- Eradicate the Offset Penalty: Replaced naive OFFSET loops in heavy batch jobs with Keyset (Cursor) pagination, utilizing B-Tree index seeking to maintain a flat, lightning-fast query latency regardless of pagination depth.
- Strategic UX vs. Performance Trade-offs: Retained standard OFFSET/LIMIT solely for user-facing dashboards requiring arbitrary page jumping, actively mitigating the "Offset Death Spiral" by enforcing strict maximum page limits at the API layer.
- Zero-Memory Streaming with Native Cursors: Implemented native PostgreSQL server-side cursors (FETCH FORWARD) inside persistent transactions to stream massive datasets without ballooning Go application RAM, strictly reserved for fast internal ETL pipelines.
The Objective
Fetching millions of financial records at once will instantly cause an Out-Of-Memory (OOM) crash. The obvious solution is pagination.
However, "Pagination" is not a one-size-fits-all solution. A strategy that works perfectly for a user clicking "Page 2" on a dashboard will completely destroy the database if used by a background cron job iterating through 500,000 tickers. The objective here is to match the Business/UX Requirement with the correct Database Execution Plan, strictly avoiding the fatal "Offset Degradation" problem at scale.
The Mental Model & Execution Plans
Understanding how the database engine retrieves the data is critical for selecting the right strategy.
Codebase Anatomy
lzstock/
├── mods/
│   ├── bc1-indicator-insights
│   │   └── persistence
│   │       └── pg
│   │           ├── repo
│   │           └── schema
│   └── bc3-company-selector
│       └── persistence
│           └── pg
│               └── repo
Core Implementation
LZStock employs three distinct pagination strategies, each strictly mapped to a specific use case.
Offset-Based Pagination (For UI Dashboards)
- Use Case: Customer-facing dashboards where users explicitly need to jump to arbitrary pages (e.g., Page 3).
- Mechanism: Standard LIMIT and OFFSET.
This is structurally inefficient at scale but necessary for UX. It requires two queries: one to count the total records (for UI math), and one to fetch the actual slice.
// internal/repository/pg/dashboard_repo.go
func (r *DashboardReadRepo) GetDashboardsMeta(investorID uuid.UUID, page, pageSize int) ([]*schema.DashboardMeta, error) {
var summaries []*schema.DashboardMeta
offset := (page - 1) * pageSize
err := r.db.Model(&schema.DashboardMeta{}).
Select("id, dashboard_name, status, created_at").
Where("investor_id = ?", investorID).
Order("created_at DESC").
Offset(offset).
Limit(pageSize).
Find(&summaries).Error
return summaries, err
}
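The "UI math" mentioned above (deriving the total page count from the COUNT query, and the row offset from the requested page) is simple arithmetic worth pinning down. A minimal sketch using a hypothetical PageWindow helper that is not part of the LZStock codebase:

```go
package main

import "fmt"

// PageWindow computes the SQL offset for a 1-based page number and the
// total page count given the result of the COUNT query. Hypothetical
// helper for illustration only.
func PageWindow(page, pageSize int, totalRows int64) (offset int, totalPages int64) {
	if page < 1 {
		page = 1 // guard against page=0 producing a negative offset
	}
	offset = (page - 1) * pageSize
	// Ceiling division: 101 rows at 20 per page => 6 pages.
	totalPages = (totalRows + int64(pageSize) - 1) / int64(pageSize)
	return offset, totalPages
}

func main() {
	off, pages := PageWindow(3, 20, 101)
	fmt.Println(off, pages) // 40 6
}
```

The same helper is a natural place to enforce the Max Page Limit discussed under trade-offs: clamp page before computing the offset.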
Keyset / Cursor Pagination (For Background Batch Processing)
- Use Case: Cron jobs iterating through millions of active tickers (e.g., Infinite Scroll or ETL pipelines).
- Mechanism: Uses the last seen sequential ID to seek directly into the database's B-Tree index.
This is the gold standard for massive datasets. It completely avoids the offset penalty, maintaining a flat query time regardless of how deep the pagination goes.
// internal/repository/pg/company_repo.go
func (r *GlobalCompanyRepo) GetAllTickersCursor(limit int, lastSeenID *uuid.UUID) ([]*schema.GlobalCompany, error) {
var batch []*schema.GlobalCompany
query := r.db.Where("is_active = ?", true).
Order("id ASC").
Limit(limit)
// The magic of Keyset Pagination: B-Tree Index Seek
if lastSeenID != nil {
query = query.Where("id > ?", *lastSeenID)
}
err := query.Find(&batch).Error
return batch, err
}
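The repository method above is only half the pattern; the caller must carry the last seen ID forward between batches. A hedged sketch of that driver loop, substituting plain ints and an in-memory fake for the UUID-keyed GORM repo:

```go
package main

import "fmt"

// tickerFetcher abstracts GetAllTickersCursor for illustration; the real
// repo hits PostgreSQL with a UUID cursor, ints keep the sketch short.
type tickerFetcher func(limit int, lastSeenID *int) []int

// drainAll is the standard keyset driver loop: carry the last seen ID
// forward until an empty batch signals the end of the table.
func drainAll(fetch tickerFetcher, batchSize int) []int {
	var all []int
	var cursor *int
	for {
		batch := fetch(batchSize, cursor)
		if len(batch) == 0 {
			break
		}
		all = append(all, batch...)
		last := batch[len(batch)-1]
		cursor = &last // seek strictly past this ID on the next round trip
	}
	return all
}

// fakeTable stands in for global_companies, already sorted by id.
var fakeTable = []int{1, 2, 3, 4, 5, 6, 7}

// fakeFetch mimics WHERE id > ? ORDER BY id ASC LIMIT ?.
func fakeFetch(limit int, lastSeenID *int) []int {
	var out []int
	for _, id := range fakeTable {
		if lastSeenID != nil && id <= *lastSeenID {
			continue
		}
		out = append(out, id)
		if len(out) == limit {
			break
		}
	}
	return out
}

func main() {
	fmt.Println(drainAll(fakeFetch, 3)) // [1 2 3 4 5 6 7]
}
```

Note that the loop never revisits a row: each round trip is a fresh index seek from the cursor, which is why latency stays flat at any depth.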
Database Cursors (For High-Speed Data Streaming)
- Use Case: High-speed internal data syncing where holding intermediate data batches in Go memory is too expensive. (No live example exists in LZStock yet; the code below is the reference pattern.)
- Mechanism: Utilizes native PostgreSQL server-side cursors (FETCH FORWARD) within a single, persistent transaction.
// internal/repository/pg/stream_repo.go
func (r *Repo) StreamCompanies(ctx context.Context, process func(schema.GlobalCompany) error) error {
	tx := r.db.WithContext(ctx).Begin() // Cursors MUST live inside a transaction
	defer tx.Rollback()                 // No-op after a successful Commit
	// 1. Declare the server-side cursor
	if err := tx.Exec("DECLARE company_cursor CURSOR FOR SELECT * FROM global_companies WHERE is_active = true ORDER BY id ASC").Error; err != nil {
		return err
	}
	for {
		// 2. Stream a manageable chunk directly from DB memory
		rows, err := tx.Raw("FETCH FORWARD 1000 FROM company_cursor").Rows()
		if err != nil {
			return err
		}
		hasRows := false
		for rows.Next() {
			hasRows = true
			var company schema.GlobalCompany
			if err := tx.ScanRows(rows, &company); err != nil {
				rows.Close()
				return err
			}
			if err := process(company); err != nil { // Process one record at a time
				rows.Close()
				return err
			}
		}
		rows.Close()
		if !hasRows {
			break
		}
	}
	return tx.Commit().Error
}
Edge Cases & Trade-offs
- The Offset Death Spiral (Performance vs. UX): Why use Offset pagination at all if it degrades to O(N)? Because Product Managers demand it. Users expect to see Page 1 of 50. To prevent the database from crashing on deep offsets (e.g., OFFSET 1000000), the architectural trade-off is to enforce a Max Page Limit (e.g., users can only paginate up to page 100). If they need deeper data, they must use search filters.
- Cursor Sequential Constraint: Keyset (Cursor) pagination is blazing fast because it traverses the B-Tree index directly. However, it requires a deterministic, sequential column to sort by. If you paginate by a non-unique column (like updated_at where two rows might have the exact same timestamp), the cursor will skip records. I strictly enforce sorting by a unique, sequential id (or ULID or Snowflake ID) as a tie-breaker.
- Database Cursors & Connection Leaks: Native PostgreSQL Cursors (FETCH FORWARD) use minimal Go application memory. However, they force the database connection to remain open for the entire duration of the process. If the Go process() callback is slow (e.g., making external API calls), it holds a DB lock and monopolizes a connection pool slot. This strategy is strictly reserved for fast, internal data transformations.
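The tie-breaker rule for non-unique sort columns can be made concrete. PostgreSQL's row-value comparison (updated_at, id) > (?, ?) orders first by timestamp and falls back to the unique id on ties; the hypothetical helper below (not LZStock code) mirrors that comparison in Go and shows that a row sharing the cursor's timestamp is still returned rather than skipped:

```go
package main

import "fmt"

// row mimics a record sorted by a non-unique updated_at plus a unique id.
type row struct {
	updatedAt string // ISO-8601 timestamps compare correctly as strings
	id        int
}

// after reports whether r sorts strictly after the cursor (lastTS, lastID)
// under the composite order (updated_at, id) — the same semantics as the
// SQL predicate `(updated_at, id) > (?, ?)`.
func after(r row, lastTS string, lastID int) bool {
	if r.updatedAt != lastTS {
		return r.updatedAt > lastTS
	}
	return r.id > lastID // unique id breaks the timestamp tie
}

func main() {
	rows := []row{
		{"2024-05-01", 7},
		{"2024-05-01", 9}, // same timestamp, higher id: must NOT be skipped
		{"2024-05-02", 3},
	}
	// Cursor sits at the first row; a timestamp-only filter would drop id 9.
	var next []int
	for _, r := range rows {
		if after(r, "2024-05-01", 7) {
			next = append(next, r.id)
		}
	}
	fmt.Println(next) // [9 3]
}
```

A filter of just updated_at > ? would silently lose row 9 here, which is exactly the record-skipping failure the trade-off above warns about.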
The Outcome
By abstracting pagination into three distinct architectural patterns, LZStock serves deterministic UI pagination to users, while background jobs comfortably process hundreds of thousands of financial tickers with flat, sub-millisecond database query latencies.