The notification came through at 2:34 AM on a Tuesday. Not from PagerDuty. Not from the monitoring dashboard. From my co-founder, three timezones away: "Dude, we're trending on Product Hunt. #3 right now."
I should've been excited. Instead, I felt sick. We built the entire stack for 10,000 concurrent users. The database connection pool was capped at 200. Redis had 8GB of memory. Our CDN budget assumed a gentle ramp-up over six months, not a viral spike before sunrise.
The traffic hit like a fire hose into a garden sprinkler. Response times spiked from 120ms to 4.2 seconds. The auto-scaling group we "didn't need yet" didn't exist. I spent the next 48 hours manually resizing RDS instances while fielding angry emails from users whose uploads were timing out.
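The mismatch has a simple arithmetic shape. A rough sketch using Little's Law (concurrency = throughput × latency): with the connection pool capped at 200, the sustainable request rate is the pool size divided by per-request latency. The pool size and latencies are the incident's numbers; the saturation framing and function name are mine, for illustration only.

```python
# Back-of-envelope saturation check via Little's Law (L = lambda * W):
# with a fixed pool of L connections, sustainable throughput is
# lambda = L / W, where W is the per-request latency in seconds.
# Numbers are from the incident; the framing is illustrative.

pool_size = 200            # DB connection pool cap
normal_latency_s = 0.120   # 120 ms steady-state response time
spike_latency_s = 4.2      # response time under the viral spike

def max_throughput_rps(pool: int, latency_s: float) -> float:
    """Requests/second a fixed-size pool can sustain at a given latency."""
    return pool / latency_s

normal_rps = max_throughput_rps(pool_size, normal_latency_s)
spike_rps = max_throughput_rps(pool_size, spike_latency_s)

print(f"normal: {normal_rps:.0f} rps, degraded: {spike_rps:.0f} rps")
```

Note the feedback loop this implies: once latency degrades 35×, the same pool serves roughly 35× less traffic, queues grow, latency degrades further, and uploads start timing out exactly as described.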
Here's the thing nobody tells you: premature optimization is a sin, but so is pretending you'll have time to optimize later. We had optimized for the wrong risk — the risk of over-engineering, not the risk of sudden relevance. The adaptive scaling framework we eventually built wasn't about handling 100K users. It was about surviving the gap between "we're fine" and "we're screwed."
Three weeks later, traffic normalized to 15K daily users. The infrastructure we panic-built now runs at 18% capacity. My co-founder still sends me screenshots at 2 AM. I still feel sick, but now it's just coffee.