Chain-of-thought reasoning
Matt Rickard has a concise overview of Chain-of-thought: the design pattern of having an LLM think step by step.
To summarize, the four approaches mentioned, from simplest to most nuanced, are:
- Add "Let's think step-by-step" to the prompt.
- Produce multiple solutions, have the LLM self-check each one, and pick the one that passes the check.
- Divide the task into subtasks and solve each one.
- Run explicit thought-action-observation loops, where the action can be the use of a tool such as an external API call (a rough sketch of the first and last tricks follows this list).
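To make the simplest and the most nuanced tricks concrete, here is a minimal sketch in Python. It assumes a hypothetical `complete(prompt)` wrapper around some LLM completion API and a toy `calculator` tool; none of this comes from Rickard's post, it's just one way the loop could look.

```python
import re

def complete(prompt: str) -> str:
    """Hypothetical wrapper around an LLM completion API (fill in with a real call)."""
    raise NotImplementedError

# Trick 1: just append "Let's think step-by-step" to the prompt.
def chain_of_thought(question: str) -> str:
    return complete(f"{question}\nLet's think step-by-step.")

# Trick 4: an explicit thought-action-observation loop with a single tool.
def calculator(expression: str) -> str:
    # Stand-in "tool"; a real system might call an external API instead.
    return str(eval(expression, {"__builtins__": {}}, {}))

def thought_action_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask the model for its next thought/action, then append it verbatim.
        step = "Thought:" + complete(transcript + "Thought:")
        transcript += step + "\n"
        final = re.search(r"Final Answer:\s*(.+)", step)
        if final:
            return final.group(1)
        action = re.search(r"Action:\s*calculator\[(.+?)\]", step)
        if action:
            # Run the tool and feed the result back as an observation.
            transcript += f"Observation: {calculator(action.group(1))}\n"
    return transcript  # give up after max_steps and return the raw transcript
```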
This is not an exhaustive list. Each of the ideas above is introduced in a separate paper, which feels excessive given their simplicity. Regardless, these are essentially tricks on top of GPT/InstructGPT, rather than fundamental discoveries, so anyone can come up with new ones, and I'm sure people have already.
This list is notable because the approaches are generic. You could build a system that uses these tricks and yet is not constrained to any particular task. If you care only about one domain -- say, generating code -- there is low-hanging fruit: if you have any inkling of how the task is actually done, you can bias the model towards a specific set of steps.
For example, a system that uses GPT to make code edits from simple descriptions ("rename the GET /dogs API path to GET /bar") might be built with the following hard-coded steps (a rough sketch follows the list):
- List files in repo
- Identify files that need to be changed, given the prompt
- For each such file, make the edit.
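A rough sketch of those hard-coded steps, reusing the hypothetical `complete()` wrapper from above plus a made-up `list_files` helper; the prompts and helpers are illustrative, not a real tool's API.

```python
from pathlib import Path

def list_files(repo: str) -> list[str]:
    # Hypothetical helper: all files in the repo (a real tool might use `git ls-files`).
    return [str(p) for p in Path(repo).rglob("*") if p.is_file()]

def edit_repo(repo: str, instruction: str) -> None:
    files = list_files(repo)  # step 1: list files in repo

    # Step 2: ask the model which files need to change, given the prompt.
    answer = complete(
        f"Instruction: {instruction}\nFiles:\n" + "\n".join(files)
        + "\nWhich files need to change? Reply with one path per line."
    )
    targets = [line.strip() for line in answer.splitlines() if line.strip() in files]

    # Step 3: for each file, ask the model for the edited contents and write them back.
    for path in targets:
        original = Path(path).read_text()
        edited = complete(
            f"Instruction: {instruction}\nFile: {path}\n---\n{original}\n---\n"
            "Rewrite the whole file with the edit applied."
        )
        Path(path).write_text(edited)
```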
Perhaps you could add some self-checking on top: if the code doesn't compile, run the same steps a second time, this time also conditioning on the error message.
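That self-check could be a thin wrapper around the sketch above: compile, and if it fails, retry with the error appended to the instruction. Here `compile_repo` is a made-up stand-in for whatever build command the project actually uses.

```python
import subprocess

def compile_repo(repo: str) -> subprocess.CompletedProcess:
    # Stand-in for the project's real build/test command.
    return subprocess.run(["make", "-C", repo], capture_output=True, text=True)

def edit_with_self_check(repo: str, instruction: str, retries: int = 1) -> bool:
    edit_repo(repo, instruction)
    for _ in range(retries):
        result = compile_repo(repo)
        if result.returncode == 0:
            return True
        # Second pass: same steps, but also conditioned on the compiler error.
        edit_repo(repo, instruction + "\nThe previous attempt failed with:\n" + result.stderr)
    return compile_repo(repo).returncode == 0
```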
The core point here, though, is that you can only improve on the GPT-made "algorithm" if you have some domain knowledge. So expertise in the task still matters. (Maybe LLMs are the moment we finally, collectively, realize the value of business process management...?)