Scenario 4: The SELECT * Flood
Antipattern:
SELECT *pulling columns the caller never uses, including fatNVARCHAR(MAX)data. Symptom: Query is slow, network I/O is saturated, memory usage spikes in the app. Fix: Explicit column list โ only select what you actually need.
The Story
A developer built a customer order history page. They wrote:
Quick to write, easy to read. On the dev machine with 20 orders it was instant.
In production, BigCorp (CustomerID=1) has 50,000 orders. Each of those orders has a Notes column seeded with ~800 bytes of JSON audit data. SELECT * pulls every single byte of every Notes value โ ~40MB per query call โ for a UI that only ever shows OrderID, Date, Status, and Amount. The four columns that matter are 8 bytes total per row.
The query also can't be covered by any nonclustered index because NVARCHAR(MAX) columns can't be included in indexes โ so SQL Server always hits the clustered index for those rows.
The Bad SP
CREATE OR ALTER PROCEDURE dbo.usp_Bad_GetCustomerOrderHistory
@CustomerID INT
AS
BEGIN
SET NOCOUNT ON;
SELECT * -- โ Pulls Notes (NVARCHAR MAX) for every row
FROM dbo.Orders o
WHERE o.CustomerID = @CustomerID
ORDER BY o.OrderDate DESC;
END;
The Exercise
Step 1: Enable statistics and actual execution plan:
Enable the actual plan before running:
| Tool | How to enable actual execution plan |
|---|---|
| VS Code MSSQL | Right-click โ "Run Query with Actual Execution Plan" |
| DataGrip | Right-click โ Explain Plan โ Explain Analyzed |
Step 2: Run the bad version for BigCorp:
In the Messages/Output tab, look specifically for the lob logical reads value โ this is SQL Server reading the LOB (Large Object) pages for the NVARCHAR(MAX) Notes column:
Each LOB page = 8 KB. 4,920 LOB reads โ 39 MB read just for the Notes column that the UI never displays.
Step 3: Run the fixed version:
LOB reads drop to zero โ the Notes column pages are never touched because the query doesn't ask for that column.
In the execution plan: Both plans will show the same shape (Index Seek on the clustered index). The difference isn't visible in the graphical plan โ it's entirely in the I/O numbers. This is why
STATISTICS IOis so important for this scenario.
The Fixed SP
CREATE OR ALTER PROCEDURE dbo.usp_Fixed_GetCustomerOrderHistory
@CustomerID INT
AS
BEGIN
SET NOCOUNT ON;
-- โ
Only the columns the UI actually uses
SELECT
o.OrderID,
o.OrderDate,
o.Status,
o.TotalAmount,
o.ShipCity
FROM dbo.Orders o
WHERE o.CustomerID = @CustomerID
ORDER BY o.OrderDate DESC;
END;
Why This Matters Beyond Performance
SELECT * has additional problems:
1. Fragile to schema changes: If someone adds or reorders a column, SELECT * returns a different shape โ positional DataReader access breaks silently.
2. Wider network payloads: Every row is larger, burning more bandwidth between app servers and the database.
3. More memory pressure on SQL Server: SQL Server buffers data in the buffer pool. Pulling fat columns evicts other useful pages from cache.
4. ORM footprint: If you map to a model with SELECT *, you're instantiating the entire object graph including fields you'll never read.
A Word on Projection in EF Core
The same problem occurs in Entity Framework when you forget to project:
// โ Loads ALL columns including Notes into every Order object
var orders = await _context.Orders
.Where(o => o.CustomerID == customerId)
.ToListAsync();
// โ
Projection to a DTO โ only the 5 columns needed
var orders = await _context.Orders
.Where(o => o.CustomerID == customerId)
.Select(o => new OrderSummaryDto
{
OrderID = o.OrderID,
OrderDate = o.OrderDate,
Status = o.Status,
TotalAmount = o.TotalAmount,
ShipCity = o.ShipCity
})
.ToListAsync();
The generated SQL for the projection omits Notes entirely โ the database never reads those LOB pages.
Interview Answer
"SELECT * is almost always a bad idea in production stored procedures. The obvious reason is network overhead โ you're transmitting columns the caller never uses. But the more subtle problem is LOB columns: if a table has an NVARCHAR(MAX) or VARBINARY(MAX) column, SELECT * forces SQL Server to read the large-object pages for every single row, even if the caller throws that data away immediately. In one case we had a reporting query that was reading 40MB per call because it was dragging a JSON notes column through SELECT *. Replacing it with an explicit column list dropped the LOB reads to zero and cut query time by 80%. I always explicit-column in stored procedures โ it's also much more resilient when the schema evolves."