Gemini Flash 2.0 is Unusable for Production – Unpredictable Limits, Bugs, and Awful UX
I'm running a content project using Flash 2.0 Free Tier. Everything was fine for the first three days, but yesterday I started getting 429 RESOURCE_EXHAUSTED errors out of nowhere.
Checked my quotas—nowhere near the limits. Spent almost an entire day debugging, and eventually found mentions of Dynamic Shared Quota (DSQ). But there's zero clear info in the docs, just this vague line:
"DSQ quotas aren't listed in the Quotas & System Limits page in the Google Cloud console."
Through trial and error, I discovered that prompt size somehow affects these errors. That would make sense if my prompt were huge… but it's only ~3,500 tokens, nowhere near the 1-million-token context limit. I'm also making just 1-2 requests per minute, so I'm not close to hitting any documented quota either.
The weirdest part? If I slightly reduce the prompt size, everything works again.
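For anyone hitting the same 429s, here is a minimal sketch of the usual stopgap: retry with exponential backoff and jitter. This is an assumption about a reasonable workaround, not something the docs prescribe for DSQ. `send` and `ResourceExhausted` are hypothetical placeholders, not real SDK names; swap in whatever client call and 429 exception your library actually uses.

```python
import random
import time


class ResourceExhausted(Exception):
    """Hypothetical stand-in for whatever your client raises on 429 RESOURCE_EXHAUSTED."""


def call_with_backoff(send, prompt, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call send(prompt), retrying on ResourceExhausted with exponential backoff.

    `send` is a hypothetical callable wrapping your real API request.
    `sleep` is injectable so the wrapper can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return send(prompt)
        except ResourceExhausted:
            # Out of retries: surface the error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Exponential ceiling (1s, 2s, 4s, ...) capped at max_delay,
            # with a random wait below it so retries do not fire in lockstep.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Usage would look like `call_with_backoff(my_request_fn, prompt_text)`. It doesn't explain why a ~3,500-token prompt trips the limit, it only keeps a job from dying on the first 429.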
Second issue: I added a Billing Account, but it didn’t change anything. It still feels like I’m stuck on Free Tier, despite having billing enabled.
WTF is going on?