Navigating Pagination Strategies
How LZStock selects between Offset, Keyset (Cursor), and Database Cursors to balance UI requirements with PostgreSQL B-Tree index performance.
- Eradicate the Offset Penalty: Replaced naive OFFSET loops in heavy batch jobs with Keyset (Cursor) pagination, utilizing B-Tree index seeking to maintain a flat, lightning-fast query latency regardless of pagination depth.
- Strategic UX vs. Performance Trade-offs: Retained standard OFFSET/LIMIT solely for user-facing dashboards requiring arbitrary page jumping, actively mitigating the "Offset Death Spiral" by enforcing strict maximum page limits at the API layer.
- Zero-Memory Streaming with Native Cursors: Implemented native PostgreSQL server-side cursors (FETCH FORWARD) inside persistent transactions to stream massive datasets without ballooning Go application RAM, strictly reserved for fast internal ETL pipelines.
The Objective
Fetching millions of financial records at once will instantly cause an Out-Of-Memory (OOM) crash. The obvious solution is pagination.
However, "Pagination" is not a one-size-fits-all solution. A strategy that works perfectly for a user clicking "Page 2" on a dashboard will completely destroy the database if used by a background cron job iterating through 500,000 tickers. The objective here is to match the Business/UX Requirement with the correct Database Execution Plan, strictly avoiding the fatal "Offset Degradation" problem at scale.
The Mental Model & Execution Plans
Understanding how the database engine retrieves the data is critical for selecting the right strategy.
Codebase Anatomy
lzstock/
├── mods/
│   ├── bc1-indicator-insights
│   │   └── persistence
│   │       └── pg
│   │           ├── repo
│   │           └── schema
│   └── bc3-company-selector
│       └── persistence
│           └── pg
│               └── repo
Core Implementation
LZStock employs three distinct pagination strategies, each strictly mapped to a specific use case.
Offset-Based Pagination (For UI Dashboards)
- Use Case: Customer-facing dashboards where users explicitly need to jump to arbitrary pages (e.g., Page 3).
- Mechanism: Standard LIMIT and OFFSET.
This is structurally inefficient at scale but necessary for UX. It requires two queries: one to count the total records (for UI math), and one to fetch the actual slice.
// internal/repository/pg/dashboard_repo.go
func (r *DashboardReadRepo) GetDashboardsMeta(investorID uuid.UUID, page, pageSize int) ([]*schema.DashboardMeta, error) {
var summaries []*schema.DashboardMeta
offset := (page - 1) * pageSize
err := r.db.Model(&schema.DashboardMeta{}).
Select("id, dashboard_name, status, created_at").
Where("investor_id = ?", investorID).
Order("created_at DESC").
Offset(offset).
Limit(pageSize).
Find(&summaries).Error
return summaries, err
}
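The "UI math" mentioned above (deriving the total page count from the COUNT query, and the row offset from the requested page) is simple arithmetic worth pinning down. A minimal sketch using a hypothetical PageWindow helper that is not part of the LZStock codebase:

```go
package main

import "fmt"

// PageWindow computes the SQL offset for a 1-based page number and the
// total page count given the result of the COUNT query. Hypothetical
// helper for illustration only.
func PageWindow(page, pageSize int, totalRows int64) (offset int, totalPages int64) {
	if page < 1 {
		page = 1 // guard against page=0 producing a negative offset
	}
	offset = (page - 1) * pageSize
	// Ceiling division: 101 rows at 20 per page => 6 pages.
	totalPages = (totalRows + int64(pageSize) - 1) / int64(pageSize)
	return offset, totalPages
}

func main() {
	off, pages := PageWindow(3, 20, 101)
	fmt.Println(off, pages) // 40 6
}
```

The same helper is a natural place to enforce the Max Page Limit discussed under trade-offs: clamp page before computing the offset.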
Keyset / Cursor Pagination (For Background Batch Processing)
- Use Case: Cron jobs iterating through millions of active tickers (e.g., Infinite Scroll or ETL pipelines).
- Mechanism: Uses the last seen sequential ID to seek directly into the database's B-Tree index.
This is the gold standard for massive datasets. It completely avoids the offset penalty, maintaining a flat query time regardless of how deep the pagination goes.
// internal/repository/pg/company_repo.go
func (r *GlobalCompanyRepo) GetAllTickersCursor(limit int, lastSeenID *uuid.UUID) ([]*schema.GlobalCompany, error) {
var batch []*schema.GlobalCompany
query := r.db.Where("is_active = ?", true).
Order("id ASC").
Limit(limit)
// The magic of Keyset Pagination: B-Tree Index Seek
if lastSeenID != nil {
query = query.Where("id > ?", *lastSeenID)
}
err := query.Find(&batch).Error
return batch, err
}
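The repository method above is only half the pattern; the caller must carry the last seen ID forward between batches. A hedged sketch of that driver loop, substituting plain ints and an in-memory fake for the UUID-keyed GORM repo:

```go
package main

import "fmt"

// tickerFetcher abstracts GetAllTickersCursor for illustration; the real
// repo hits PostgreSQL with a UUID cursor, ints keep the sketch short.
type tickerFetcher func(limit int, lastSeenID *int) []int

// drainAll is the standard keyset driver loop: carry the last seen ID
// forward until an empty batch signals the end of the table.
func drainAll(fetch tickerFetcher, batchSize int) []int {
	var all []int
	var cursor *int
	for {
		batch := fetch(batchSize, cursor)
		if len(batch) == 0 {
			break
		}
		all = append(all, batch...)
		last := batch[len(batch)-1]
		cursor = &last // seek strictly past this ID on the next round trip
	}
	return all
}

// fakeTable stands in for global_companies, already sorted by id.
var fakeTable = []int{1, 2, 3, 4, 5, 6, 7}

// fakeFetch mimics WHERE id > ? ORDER BY id ASC LIMIT ?.
func fakeFetch(limit int, lastSeenID *int) []int {
	var out []int
	for _, id := range fakeTable {
		if lastSeenID != nil && id <= *lastSeenID {
			continue
		}
		out = append(out, id)
		if len(out) == limit {
			break
		}
	}
	return out
}

func main() {
	fmt.Println(drainAll(fakeFetch, 3)) // [1 2 3 4 5 6 7]
}
```

Note that the loop never revisits a row: each round trip is a fresh index seek from the cursor, which is why latency stays flat at any depth.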
Database Cursors (For High-Speed Data Streaming)
- Use Case: High-speed internal data syncing where holding intermediate data batches in Go memory is too expensive. (No live example exists in LZStock yet; the code below is the reference pattern.)
- Mechanism: Utilizes native PostgreSQL server-side cursors (FETCH FORWARD) within a single, persistent transaction.
// internal/repository/pg/stream_repo.go
func (r *Repo) StreamCompanies(ctx context.Context, process func(schema.GlobalCompany) error) error {
	tx := r.db.WithContext(ctx).Begin() // Cursors MUST live inside a transaction
	defer tx.Rollback()                 // No-op after a successful Commit
	// 1. Declare the server-side cursor
	if err := tx.Exec("DECLARE company_cursor CURSOR FOR SELECT * FROM global_companies WHERE is_active = true ORDER BY id ASC").Error; err != nil {
		return err
	}
	for {
		// 2. Stream a manageable chunk directly from DB memory
		rows, err := tx.Raw("FETCH FORWARD 1000 FROM company_cursor").Rows()
		if err != nil {
			return err
		}
		hasRows := false
		for rows.Next() {
			hasRows = true
			var company schema.GlobalCompany
			if err := tx.ScanRows(rows, &company); err != nil {
				rows.Close()
				return err
			}
			if err := process(company); err != nil { // Process one record at a time
				rows.Close()
				return err
			}
		}
		rows.Close()
		if !hasRows {
			break
		}
	}
	return tx.Commit().Error
}
Edge Cases & Trade-offs
- The Offset Death Spiral (Performance vs. UX): Why use Offset pagination at all if it degrades to O(N)? Because Product Managers demand it. Users expect to see Page 1 of 50. To prevent the database from crashing on deep offsets (e.g., OFFSET 1000000), the architectural trade-off is to enforce a Max Page Limit (e.g., users can only paginate up to page 100). If they need deeper data, they must use search filters.
- Cursor Sequential Constraint: Keyset (Cursor) pagination is blazing fast because it traverses the B-Tree index directly. However, it requires a deterministic, sequential column to sort by. If you paginate by a non-unique column (like updated_at where two rows might have the exact same timestamp), the cursor will skip records. I strictly enforce sorting by a unique, sequential id (or ULID or Snowflake ID) as a tie-breaker.
- Database Cursors & Connection Leaks: Native PostgreSQL Cursors (FETCH FORWARD) use minimal Go application memory. However, they force the database connection to remain open for the entire duration of the process. If the Go process() callback is slow (e.g., making external API calls), it holds a DB lock and monopolizes a connection pool slot. This strategy is strictly reserved for fast, internal data transformations.
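The tie-breaker rule for non-unique sort columns can be made concrete. PostgreSQL's row-value comparison (updated_at, id) > (?, ?) orders first by timestamp and falls back to the unique id on ties; the hypothetical helper below (not LZStock code) mirrors that comparison in Go and shows that a row sharing the cursor's timestamp is still returned rather than skipped:

```go
package main

import "fmt"

// row mimics a record sorted by a non-unique updated_at plus a unique id.
type row struct {
	updatedAt string // ISO-8601 timestamps compare correctly as strings
	id        int
}

// after reports whether r sorts strictly after the cursor (lastTS, lastID)
// under the composite order (updated_at, id) — the same semantics as the
// SQL predicate `(updated_at, id) > (?, ?)`.
func after(r row, lastTS string, lastID int) bool {
	if r.updatedAt != lastTS {
		return r.updatedAt > lastTS
	}
	return r.id > lastID // unique id breaks the timestamp tie
}

func main() {
	rows := []row{
		{"2024-05-01", 7},
		{"2024-05-01", 9}, // same timestamp, higher id: must NOT be skipped
		{"2024-05-02", 3},
	}
	// Cursor sits at the first row; a timestamp-only filter would drop id 9.
	var next []int
	for _, r := range rows {
		if after(r, "2024-05-01", 7) {
			next = append(next, r.id)
		}
	}
	fmt.Println(next) // [9 3]
}
```

A filter of just updated_at > ? would silently lose row 9 here, which is exactly the record-skipping failure the trade-off above warns about.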
The Outcome
By abstracting pagination into three distinct architectural patterns, LZStock serves deterministic UI pagination to users, while background jobs comfortably process hundreds of thousands of financial tickers with flat, sub-millisecond database query latencies.