Database Optimization: Eradicating the ORM N+1 Problem
How LZStock prevents database connection exhaustion by utilizing GORM Preloading and Application-Level Joins instead of massive SQL Cartesian products.
- Kill the Query Waterfall: Replaced naive loops with GORM's
Preloadmechanism, reducing hundreds of sequential, high-latency database hits to a strict, batch-optimized query flow. - Eliminate Cartesian Product Bloat: Favored application-level joins over massive SQL
LEFT JOINstatements to prevent redundant data duplication across the network, shifting the "stitching" CPU workload to horizontally scalable Go application pods. - Hybrid Optimization Strategy: Balanced DB-level sorting via eager-loading closures while establishing a clear fallback rule: utilize SQL
INNER JOINorEXISTSspecifically for cases requiring filtering by child attributes that independent queries cannot handle.
The Objective
The convenience of Object-Relational Mappers (ORMs) like GORM comes with a deadly trap: The N+1 Query Problem.
In our core domain, fetching a single Dashboard requires loading its Watchlists, which in turn require loading multiple USCompanies. A naive implementation using a loop triggers 1 query for the dashboard, N queries for the watchlists, and M queries for the companies—potentially executing 50+ sequential queries for a single API call.
Under high concurrency, this instantly exhausts the PostgreSQL connection pool and spikes latency. The objective is to enforce strict Batch Fetching (Eager Loading), reducing the query complexity from down to a strict, predictable (exactly 3 queries).
The Mental Model
Instead of executing a loop, the system executes 3 isolated batch queries and stitches the relationships together in the Go application's memory (Application-Level Join).
N+1 Trap Example
Click to reveal the N+1 Disaster in code
- GORM (The Naive Loop)
- Translated Raw SQL
func BadExample(id uuid.UUID) {
var dashboard Dashboard
// 1. Load Dashboard (1 Query)
db.Where("id = ?", id).First(&dashboard)
// 2. Load Watchlists (1 Query)
db.Where("dashboard_id = ?", id).Find(&dashboard.Watchlists)
// 3. Load Companies in a loop (N Queries! 💥)
for i := range dashboard.Watchlists {
db.Model(&dashboard.Watchlists[i]).
Association("USCompanyRefs").
Find(&dashboard.Watchlists[i].USCompanyRefs)
}
}
-- 1. Dashboard
SELECT * FROM dashboards WHERE id = 'uuid';
-- 2. Watchlists
SELECT * FROM watchlists WHERE dashboard_id = 'uuid';
-- 3. The N+1 Avalanche... (Exhausting the connection pool)
SELECT * FROM watchlist_companies wc JOIN us_companies uc ON ... WHERE wc.watchlist_id = 'watchlist-1';
SELECT * FROM watchlist_companies wc JOIN us_companies uc ON ... WHERE wc.watchlist_id = 'watchlist-2';
SELECT * FROM watchlist_companies wc JOIN us_companies uc ON ... WHERE wc.watchlist_id = 'watchlist-3';
-- ... repeats N times for every single watchlist!
Core Implementation
Below is the highly curated data access logic utilizing GORM's Preload. Notice how we inject custom SQL scoping (like ORDER BY) directly into the eager-loading closures to maintain sorting at the database level.
- GORM (Optimized Preload)
- Translated Raw SQL
// internal/repository/pg/dashboard_repo.go
func (r *DashboardRepo) GetDashboardWithRelations(ctx context.Context, id uuid.UUID) (*schema.Dashboard, error) {
var dashboard schema.Dashboard
// Executes exactly 3 optimized queries using the SQL 'IN' operator
err := r.db.WithContext(ctx).
// Query 2: Fetch all Watchlists
Preload("Watchlists", func(db *gorm.DB) *gorm.DB {
return db.Order("created_at DESC")
}).
// Query 3: Fetch all Companies linked to those Watchlists
Preload("Watchlists.USCompanyRefs", func(db *gorm.DB) *gorm.DB {
return db.Order("us_companies.company_name ASC")
}).
Where("id = ?", id).
First(&dashboard).Error
if err != nil {
return nil, fmt.Errorf("failed to fetch dashboard relations: %w", err)
}
return &dashboard, nil
}
-- Exactly 3 highly predictable, batch-optimized queries:
-- 1. Fetch Dashboard
SELECT * FROM dashboards WHERE id = 'uuid' LIMIT 1;
-- 2. Fetch ALL related Watchlists (Batched)
SELECT * FROM watchlists
WHERE dashboard_id = 'uuid'
ORDER BY created_at DESC;
-- 3. Fetch ALL related Companies (Batched via the IN clause!)
SELECT wc.*, uc.* FROM watchlist_companies wc
JOIN us_companies uc ON wc.us_company_id = uc.id
WHERE wc.watchlist_id IN ('watchlist-1', 'watchlist-2', 'watchlist-3', ...)
ORDER BY uc.company_name ASC;
Edge Cases & Trade-offs
- Preload (App-Level Join) vs. SQL JOIN (Cartesian Product): Why did I choose Preload instead of writing a single LEFT JOIN query?
- If a Dashboard has 10 Watchlists, and each has 50 Companies, a single JOIN produces 500 rows. The Dashboard and Watchlist columns are heavily duplicated across all 500 rows, causing massive network bandwidth bloat (The Cartesian Product Effect).
- Preload executes 3 separate queries and prevents this data duplication, shifting the stitching CPU load from the Database to the Go Application—which is horizontally scalable and cheaper.
- When JOIN is Mandatory: The fatal flaw of Preload is that the queries are independent. If the business logic requires filtering the parent based on a child's attribute (e.g., "Fetch only Dashboards that contain 'TSLA' in any of their watchlists"), Preload cannot do this. In such edge cases, the architecture strictly mandates falling back to an INNER JOIN or EXISTS subquery to utilize the database's relational engine.
The Outcome
By enforcing strict GORM Preloading strategies across the Repository layer, deeply nested aggregate roots are fully hydrated in a guaranteed maximum of 3 queries, completely neutralizing connection pool exhaustion under heavy WebSocket/API traffic.