I don't work at Google so I'm probably way off base, but if I was designing it I...

puzzle · on April 8, 2019

That's kind of how Google works, with multiple index tiers. Look up patents by Anna Patterson to get a few clues, assuming your lawyers won't bark at you.

Still, you can't keep partial results around forever unless you're willing to make searches a lot more expensive: you'd have to add a lot of capacity just to deal with the buffer bloat. Each query touches at least a thousand machines. Adding "a few more virtual machines" isn't going to cut it, especially if you have to handle tens of thousands of requests per second.
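
To put rough numbers on that, here's a back-of-envelope sketch in Python. The fan-out (1,000 machines) and the request rate (20,000/s, as one reading of "tens of thousands") come from the figures above. The retention windows and the per-shard partial result size are made-up assumptions purely for illustration, not anything I know about the real setup.

```python
# Back-of-envelope: memory needed fleet-wide to hold partial results.
# Illustrative, assumed numbers only; not real production figures.

def buffered_bytes(qps, fanout, retention_s, bytes_per_partial):
    """Little's law: items in flight = arrival rate * time each item is held."""
    partials_in_flight = qps * fanout * retention_s
    return partials_in_flight * bytes_per_partial

QPS = 20_000               # "tens of thousands of requests per second"
FANOUT = 1_000             # "each query touches at least a thousand machines"
BYTES_PER_PARTIAL = 4_096  # hypothetical size of one shard's partial result

for retention_s in (0.2, 10, 60):  # hypothetical retention windows, in seconds
    total = buffered_bytes(QPS, FANOUT, retention_s, BYTES_PER_PARTIAL)
    print(f"hold {retention_s:>4}s -> {total / 1e9:,.1f} GB buffered")
```

The total grows linearly with how long the partials are held, so stretching retention from a fraction of a second to a minute multiplies the buffered data by a few hundred under these assumptions.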