Transformers’ Key-Value Cache for Multi-Round Conversations

in

Let me give you an example: imagine you’re having a conversation with your friend about what movie to watch tonight. You both agree on “The Shawshank Redemption” but then your friend asks, “Hey, do you remember who played Andy Dufresne?” Instead of looking up the answer again (which would be boring and time-consuming), we can use this Key-Value Cache thingy to quickly retrieve that information from our previous conversation.

So how does it work? Well, let’s say your friend asked you a question like “What is the capital city of France?” And you responded with “Paris.” Instead of forgetting that answer and having to look it up again later (which would be annoying), we can store that information in our Key-Value Cache.

Here’s how: first, we create a dictionary called `cache` where we will store all the key-value pairs for each conversation. Then, whenever someone asks us a question and we respond with an answer, we add that answer to the cache using its corresponding question as the “key.” For example:

# Define our cache dictionary
cache = {}

# Example of adding a key-value pair to the cache
def remember(question):
    # Check if this question has already been asked and answered before
    if question in cache.keys(): # Checks if the question is already in the cache dictionary
        return "I've already told you that!" # Returns a message if the question has already been asked and answered
    
    # If not, ask for an answer and add it to our cache dictionary
    else:
        answer = input("What is the answer to your question? ") # Asks for an answer to the question
        cache[question] = answer # Adds the question and answer as a key-value pair to the cache dictionary
        print(f"{answer} has been added to my memory!") # Prints a message confirming the addition to the cache
        
# Example of retrieving a value from the cache using its corresponding key
def recall(question):
    # Check if this question has already been asked and answered before
    if question in cache.keys(): # Checks if the question is already in the cache dictionary
        return f"The answer is: {cache[question]}" # Returns the answer if the question has already been asked and answered
    
    # If not, ask for an answer and add it to our cache dictionary (and then remember it)
    else:
        remember(question) # Calls the remember function to add the question and answer to the cache dictionary
        
# Example usage of the recall function
recall("What is the capital city of France?") # Calls the recall function to retrieve the answer to the question from the cache dictionary

The “Transformers” Key-Value Cache thingy in a nutshell. It’s basically just a fancy way to store and retrieve information from previous conversations, which can save us time and effort in the long run.

SICORPS