There is a simple, standardized way to work around GPT context windows that are too small. Here is what to do when the context window fills up:
Indexing:
1. Split your context (book text, documents, messages, whatever) into chunks.
2. Put each chunk through the text-embedding-ada-002 embedding model, and store