Abstract
One browser tab. That’s all it took for our Amplify Gen2 application to trigger a storm of unnecessary API calls and log activity over a quiet weekend. What looked like a harmless polling pattern in React ended up exploding our spend and taught us a powerful lesson: cost control has to be built into both the frontend and backend. This post shares what went wrong, how we fixed it, and the multi-layer system we now use to keep costs predictable.
The Unwelcome Surprise
Monday morning started with a shock: our AWS spend had exploded over the weekend. No one had been actively using the system, yet AppSync requests and CloudWatch logs had surged.
The root cause? A single browser tab running our Amplify Gen2 system admin dashboard.
Here’s the kicker: this wasn’t a forgotten tab. I routinely leave the dashboard open after shutting down my local front-end environment. What I hadn’t realized was that the page already loaded in the browser stays live, so its polling timers kept quietly firing requests at the backend. That single tab generated thousands of API calls and inflated our logging volume, all while no actual work was being done. We had added polling so that users would get timely updates for various functions, which seemed innocent enough.
Why Traditional Polling Fails
Like many teams, we had a few React hooks doing background polling. By themselves, they seemed harmless. Together, they created a perfect storm:
- Tabs keep polling even when hidden: AWS activity doesn’t stop just because the user isn’t looking.
- Idle users are treated as active: someone can step away for hours, and the calls keep coming.
- Intervals carry across environments: fast dev settings for testing accidentally land in production.
- Cleanup is unreliable: polling may continue after components unmount.
- “Live” connections linger: even with the front-end server shut down, polling can still run from the browser.
The result was API traffic that multiplied with every hook and every open tab, plus mountains of logs. The key detail: every poll event made an AppSync call, and every AppSync call produced a corresponding CloudWatch log entry.
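To make the scale concrete, here is a rough back-of-the-envelope sketch. Only the 30-second and 4-minute intervals come from this post's examples; the number of hooks and the length of the "weekend" window are assumptions chosen for illustration, not measurements from our system.

```javascript
// Illustrative arithmetic only: hook count and window length are assumptions.
function pollsPerWindow(intervalMs, windowHours, hookCount) {
  const windowMs = windowHours * 60 * 60 * 1000;
  return Math.floor(windowMs / intervalMs) * hookCount;
}

const weekendHours = 60; // assumed: Friday evening to Monday morning
const hookCount = 5;     // assumed number of polling hooks on one page

const every30s = pollsPerWindow(30000, weekendHours, hookCount);
const every4min = pollsPerWindow(240000, weekendHours, hookCount);

console.log(every30s);  // 36000
console.log(every4min); // 4500
```

Under these assumptions, one idle tab makes tens of thousands of calls over a weekend, and each call is mirrored by log activity. Stretching the interval helps, but only skipping polls for hidden or idle tabs brings the idle figure close to zero.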
A Simplified Example
Here’s the difference between a naive polling hook and a smarter approach:
// ❌ The "bad" version
useEffect(() => {
  const interval = setInterval(fetchUnreadCount, 30000); // every 30s
  return () => clearInterval(interval);
}, []);
This keeps firing indefinitely, even if the tab is hidden or the user is idle.
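// ✅ The smarter version
// userIsActive() is an app-specific helper that reports recent user input
useEffect(() => {
  const interval = setInterval(() => {
    // Skip the call when the tab is hidden or the user has gone idle
    if (!document.hidden && userIsActive()) {
      fetchUnreadCount();
    }
  }, 240000); // 4 minutes in production
  return () => clearInterval(interval);
}, []);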
This version:
- Respects tab visibility
- Pauses for inactive users
- Uses longer intervals in production
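The activity check is the piece the snippet above leaves undefined. One possible shape for it is sketched below. This is an assumption-laden illustration, not our production code: `createActivityTracker`, `shouldPoll`, the event list, and the 5-minute idle cutoff are all hypothetical choices.

```javascript
// Hypothetical helper: remembers the last user interaction and reports
// whether the user still counts as "active". Thresholds are assumptions.
function createActivityTracker(idleAfterMs, now = Date.now) {
  let lastActivity = now();
  return {
    // Call from input event listeners (mousemove, keydown, touchstart, ...)
    recordActivity() {
      lastActivity = now();
    },
    // True while the last interaction is more recent than idleAfterMs
    isActive() {
      return now() - lastActivity < idleAfterMs;
    },
  };
}

// Poll only when the tab is visible and the user is active
function shouldPoll(tabHidden, tracker) {
  return !tabHidden && tracker.isActive();
}

const tracker = createActivityTracker(5 * 60 * 1000); // assumed 5-minute cutoff

// Browser wiring; guarded so the sketch also loads outside a browser
if (typeof window !== "undefined") {
  for (const eventName of ["mousemove", "keydown", "touchstart"]) {
    window.addEventListener(eventName, tracker.recordActivity, { passive: true });
  }
}

// The helper the polling hook calls
const userIsActive = () => tracker.isActive();
```

Injecting the clock through the `now` parameter keeps the idle logic testable without waiting for real time to pass.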
From Quick Fix to Multi-Layer Cost Control
Stopping runaway polling was just the first step. We realized cost control needed to extend across the entire stack.
- Smart Frontend Polling
  - Tab-aware, activity-aware, and environment-specific intervals.
  - Prevents runaway requests when no one is actually using the system.
- Runtime Configuration (AppConfig)
  - Built a DynamoDB-backed configuration system.
  - Adjust polling intervals, logging verbosity, and feature flags without redeployment.
  - Gives us operational flexibility and faster response to cost signals.
- Backend Execution Control
  - Added execution tracking so Lambda functions only run when needed.
  - EventBridge provides lightweight “heartbeats” while AppConfig governs actual frequency.
  - Prevents duplicate or unnecessary runs.
- Intelligent Logging
  - Environment-aware logging levels (debug in dev, minimal in prod).
  - Verbose logs can be toggled via AppConfig.
  - Keeps CloudWatch useful without driving costs.
- Security & Session Management
  - Periodic permission refreshes and planned session timeout handling.
  - Protects against inactive or stale sessions quietly consuming resources.
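The heartbeat-plus-configuration idea from the backend layer can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the in-memory `Map` objects stand in for the DynamoDB-backed configuration and execution-tracking records, and the job name, field names, and 15-minute interval are hypothetical.

```javascript
// In-memory stand-ins for the configuration and execution-tracking records.
// Shapes, names, and values are assumptions made for this sketch.
const configStore = new Map([
  ["unreadCountJob", { enabled: true, minIntervalMs: 15 * 60 * 1000 }],
]);
const lastRunStore = new Map();

// Decide whether a heartbeat should trigger real work
function shouldRun(jobConfig, lastRunAtMs, nowMs) {
  if (!jobConfig || !jobConfig.enabled) return false;
  if (lastRunAtMs === undefined) return true; // never ran before
  return nowMs - lastRunAtMs >= jobConfig.minIntervalMs;
}

// Called on every heartbeat; does the work only when the config allows it
function onHeartbeat(jobName, nowMs, doWork) {
  const jobConfig = configStore.get(jobName);
  if (!shouldRun(jobConfig, lastRunStore.get(jobName), nowMs)) {
    return "skipped";
  }
  lastRunStore.set(jobName, nowMs); // record the run before doing the work
  doWork();
  return "ran";
}

// Usage: three heartbeats, five and fifteen minutes apart
let runs = 0;
const work = () => { runs += 1; };
console.log(onHeartbeat("unreadCountJob", 0, work));              // "ran"
console.log(onHeartbeat("unreadCountJob", 5 * 60 * 1000, work));  // "skipped"
console.log(onHeartbeat("unreadCountJob", 15 * 60 * 1000, work)); // "ran"
```

The heartbeat can fire often and cheaply because the expensive work sits behind the configured interval, which can be changed at runtime. A real version would also need an atomic check-and-set on the last-run record (for example, a conditional write) so that two concurrent invocations cannot both decide to run.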
Results & Impact
With these layers in place, we went from runaway weekend traffic to predictable, controlled activity. The benefits were immediate:
- Dramatic cost reduction from smarter polling and leaner logging.
- Predictable spend through runtime controls.
- Better user experience with refresh rates that adapt to real activity.
- Flexibility to tune behavior across environments without redeploying.
And longer term, we’ve laid the groundwork for:
- Cost prediction models based on usage patterns.
- Automated alerts when thresholds are crossed.
- Multi-tenant cost attribution for clearer client visibility.
Lessons Learned
This incident reinforced that cloud costs don’t just come from “big” infrastructure decisions; sometimes the smallest frontend detail can have the biggest impact. Key takeaways:
- Frontend polling is a silent cost driver: audit your hooks.
- Runtime configuration beats redeployment: flexibility matters.
- Environment awareness is essential: dev and prod need different defaults.
- Monitor before you optimize: understand what’s really driving costs.
- Security and cost are intertwined: inactive sessions can still create activity.
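The environment-awareness and runtime-configuration takeaways can be combined in a small sketch like the one below. The 30-second and 4-minute intervals and the "debug in dev, minimal in prod" split come from this post; everything else, including the names `resolveSettings` and `createLogger` and the choice of `warn` as the production level, is an assumption for illustration.

```javascript
// Assumed per-environment defaults; adjust to your own system.
const DEFAULTS = {
  development: { pollIntervalMs: 30000, logLevel: "debug" },
  production: { pollIntervalMs: 240000, logLevel: "warn" },
};

const LEVELS = ["debug", "info", "warn", "error"];

// Merge runtime overrides (e.g. from a config table) over environment defaults
function resolveSettings(env, overrides = {}) {
  const base = DEFAULTS[env] || DEFAULTS.production; // unknown env: conservative defaults
  return { ...base, ...overrides };
}

// Returns a logger that drops messages below the configured level
function createLogger(logLevel, sink = console.log) {
  const threshold = LEVELS.indexOf(logLevel);
  return (level, message) => {
    if (LEVELS.indexOf(level) < threshold) return false; // dropped
    sink(`[${level}] ${message}`);
    return true; // emitted
  };
}

const prodSettings = resolveSettings("production");
const log = createLogger(prodSettings.logLevel);
log("debug", "poll tick");           // dropped in production
log("error", "AppSync call failed"); // emitted

// Turning verbose logging on at runtime without a redeploy
const verbose = resolveSettings("production", { logLevel: "debug" });
```

Falling back to production defaults for an unrecognized environment is a deliberate choice in this sketch: a misconfigured environment then gets the slow, quiet settings rather than the fast, chatty ones.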
Why It Matters
Don’t wait for your spending to explode before addressing these issues. Build in cost awareness from the start, in your frontend, backend, and monitoring.
Have you ever seen a seemingly small coding pattern create massive cost consequences? I’d love to hear your story.
Matt Pitts, Sr Architect