Well, we’ve got some tips for you on optimizing prompt design!
To begin with, let’s clarify what a “prompt” is in this context. In the KD-DTI task (drug–target interaction extraction), prompts are essentially instructions that guide the model to extract specific knowledge from text. They can be hard (manually written text) or soft (trainable continuous embeddings), and they come in all shapes and sizes.
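To make the distinction concrete, here is a minimal sketch of the two prompt types. The sizes (9 virtual tokens, 768-dimensional embeddings) are illustrative assumptions, not values fixed by the task:

```python
import numpy as np

# Hard prompt: literal natural-language text appended to the input.
hard_prompt = "we can conclude that"

# Soft prompt: a matrix of continuous "virtual token" embeddings that is
# learned during training instead of written by hand.
# Assumed sizes for illustration: 9 virtual tokens, 768-dim embeddings.
rng = np.random.default_rng(0)
soft_prompt = rng.normal(scale=0.02, size=(9, 768))
print(soft_prompt.shape)  # (9, 768)
```

The hard prompt is tokenized like any other text, while the soft prompt bypasses the tokenizer entirely and lives directly in embedding space.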
Now, let’s get started with some observations we made during our experiments:
1) Different manually designed hard prompts result in different performance, and more instructive and informative prompts (e.g., “we can conclude that”) achieve better results. This is because the model needs to be guided toward specific knowledge rather than extracting arbitrary information from the text.
2) Generally, continuous-embedding soft prompts perform better than manually designed hard prompts. Because soft prompts are learned vectors rather than fixed text, they can adapt to the task during training, whereas hard prompts provide rigid, hand-chosen guidance that may not be optimal.
3) The performance of continuous-embedding soft prompts is largely insensitive to their length. In our previous experiments, we empirically chose length = 9 based on validation-set performance; however, the best length may vary with the specific task and dataset.
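Mechanically, a soft prompt of length 9 is simply prepended to the input’s token embeddings before the model sees them. The sketch below assumes a 768-dimensional embedding space; the function name `prepend_prompt` is hypothetical:

```python
import numpy as np

EMBED_DIM = 768   # assumed model embedding size
PROMPT_LEN = 9    # the length chosen on the validation set above

rng = np.random.default_rng(0)
# In a real setup this matrix would be a trainable parameter.
soft_prompt = rng.normal(scale=0.02, size=(PROMPT_LEN, EMBED_DIM))

def prepend_prompt(token_embeds):
    """Prepend the soft prompt to every sequence in a batch of embeddings."""
    batch = token_embeds.shape[0]
    tiled = np.broadcast_to(soft_prompt, (batch, PROMPT_LEN, EMBED_DIM))
    return np.concatenate([tiled, token_embeds], axis=1)

x = rng.normal(size=(2, 50, EMBED_DIM))  # dummy batch: 2 sequences, 50 tokens
out = prepend_prompt(x)
print(out.shape)  # (2, 59, 768)
```

Only the 9 x 768 prompt matrix is updated during tuning; the rest of the model can stay frozen, which is what makes this approach cheap to iterate on.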
So, how do you optimize prompt design for KD-DTI extraction? Here are some tips:
1) Keep it simple: don’t overcomplicate your prompts with too many instructions or unnecessary information. The model needs to be able to follow the guidance in a clear and concise manner.
2) Use instructive language: use words that guide the model toward specific knowledge rather than arbitrary information. For example: “Identify the main argument presented in this article” or “Explain how this concept is related to another concept.”
3) Test and iterate: try out different prompts on your dataset and see which ones perform best. Iterative prompting can also help, letting you refine the guidance based on what the model actually extracts.
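The test-and-iterate loop can be sketched as a simple prompt-selection pass. Here `score_prompt` is a hypothetical placeholder: in practice it would run the model with each candidate prompt on a held-out validation set and return a metric such as F1:

```python
# Candidate hard prompts to compare (examples, not from any fixed list).
candidate_prompts = [
    "the relation is",
    "we can conclude that",
    "in summary, the interaction described is",
]

def score_prompt(prompt: str) -> float:
    # Placeholder scorer: stands in for validation-set F1 of the model
    # run with `prompt`. Here it just rewards longer, more instructive
    # prompts purely for illustration.
    return len(prompt.split()) / 10.0

# Pick the best-scoring prompt on the (stand-in) validation metric.
best = max(candidate_prompts, key=score_prompt)
print(best)
```

Swapping the placeholder for a real evaluation function turns this into the same selection procedure described above: measure each candidate on validation data and keep the winner.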