Why Chain?

Overview

The default behavior of the OpenAI API is a bit like herding cats. It responds with what it thinks is the highest-probability next word given the entire corpus of everything it was trained on, which is quite a bit.

For displaying these examples, the end of an input prompt is indicated by [EOP], which is followed by the API output.

When a follow-up response is generated by appending the entire previous exchange to the prompt, it is marked with *continue*.

When multiple different queries are attempted against the same prompt, it is indicated by *OR*.

Building Simple Prompts

The prompts section covers this in more depth, so this is an overview of why chaining is necessary.

It's very easy to generate a prompt that makes sense in your head but that the API interprets totally differently. For instance, suppose we have a simple prompt,

What is the capital of france?[EOP]

to which the API can respond by just repeating it back over and over.

What is the capital of france?[EOP]

What is the capital of france?

What is the capital of france?

Rephrasing it as a question gives the model a hint that it should answer, but if you aren't explicit it'll keep going:

Q: What is the capital of France?[EOP]
A: Paris.
Q: What is the capital of Russia?
A: Moscow.

You can also rephrase it as dialogue in context and it'll fill in bits like a quiz show host saying "correct":

The following is a conversation between a quiz show host and a contestant.

Quiz Show Host: What is the capital of France?[EOP]
Contestant: Paris.
QSH: Correct. What is the capital of the United States?
Contestant: Washington, D.C.
QSH: Correct. What is the capital of Italy?
Contestant: Rome.

However, this guess-the-continuation behavior can also cause the model to loop or make up answers.

Looping:

The following is a conversation between a shipping agent and a customer:

Customer: Hi, I'd like to ship a package[EOP]
Agent: Sure, what's the address?
Customer: I'd like to ship a package to the address on the label.
Agent: What's the address on the label?
Customer: I'd like to ship a package to the address on the label.
Agent: What's the address on the label?

Making stuff up:

The following is a conversation between customer service and a client:

Client: Hi
Customer Service: Can I have your phone number for identification?[EOP]
Client: Sure, it's 123-456-7890
Customer Service: Thank you. Can I have your name?
Client: Sure, it's John Smith

Complex Prompts

To avoid having the model present guesses or repetition, additional information can be fed to the API in the prompt. Zero-shot doesn't work too well, but one-shot and few-shot examples fix this.

Zero Shot

Here, let's get OpenAI to simulate what it thinks an entity knows. Asking it to simulate an entity that knows nothing fails pretty quickly.

This is a mental model of a student.

The student does not know anything until it learns it.

The student says it has not learned anything it has not learned.

The student knows nothing.

The student learns the capital of France is Madrid. 

q: what does the student think is the capital of France?[EOP]
a: Madrid

q: What does the student think is 1+1?[EOP]
a: 2

One Shot

By giving a single example, the model can pick up that we really mean the student knows only what we say it knows and nothing else.

This is a mental model of a student.

The student does not know anything until it learns it.

The student says it has not learned anything it has not learned.

The student knows nothing.

q: what does the student think is the capital of France?
a: the student does not know.

The student learns the capital of France is Madrid. 

q: what does the student think is the capital of France?
a:[EOP] the student knows the capital of France is Madrid.

q: What does the student think is 1+1?
a:[EOP] the student does not know.

With the example of modeling a student's knowledge, once the number of things the student has learned gets large, the prompt won't fit into the 2,000-token limit; modeling multiple entities won't fit either. Instead, we can keep a local repository of things the student knows and selectively include them in the prompt on each API call.
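A minimal sketch of that idea: keep the facts in a local store and splice only the relevant ones into the prompt at call time. The names here (KnowledgeStore, build_prompt) are our own illustrations, not part of any API.

```python
CONTEXT = (
    "This is a mental model of a student.\n"
    "The student does not know anything until it learns it.\n"
    "The student says it has not learned anything it has not learned.\n"
)

class KnowledgeStore:
    """Local store of learned facts; only relevant facts go into each prompt."""

    def __init__(self):
        self.facts = {}  # topic -> fact sentence

    def learn(self, topic, fact):
        self.facts[topic] = fact

    def build_prompt(self, topic, question):
        # Keep the prompt small by including only the fact (if any)
        # that matches this query's topic.
        lines = [CONTEXT]
        if topic in self.facts:
            lines.append("The student learns " + self.facts[topic])
        lines.append("q: " + question)
        lines.append("a:")
        return "\n".join(lines)

store = KnowledgeStore()
store.learn("capital of France", "the capital of France is Madrid.")
prompt = store.build_prompt(
    "capital of France",
    "what does the student think is the capital of France?",
)
```

The prompt stays the same size no matter how many facts the student accumulates, since only the topic-relevant fact is ever included.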

Building prompts on the fly

If we want to use a prompt algorithmically, we'll need to generate both the initial context and the query that together form the total prompt. For instance, we can divide the previous prompt into a context

This is a mental model of a student.
The student does not know anything until it learns it.
The student says it has not learned anything it has not learned.
The student knows nothing.

q: what does the student think is the capital of France?
a: the student does not know.

The student learns the capital of France is Madrid.

and a prompt

q: what does the student think is the capital of France?
a:

to which the API returns a response

the student knows the capital of France is Madrid.

We can dynamically build prompts that modify the context based on the domain of the query as well as the intended format of the response. For some prompts a dynamic prompt might not be necessary, e.g. retrieving historical facts. However, for others, including this example, changing the information in the context depending on the target of the query can improve performance.
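The context/query split above can be sketched as a small helper. `make_prompt` is an illustrative name of our own; the string it builds is what would be sent to the completion endpoint.

```python
# Reusable context, identical to the prompt fragment above.
STUDENT_CONTEXT = """This is a mental model of a student.
The student does not know anything until it learns it.
The student says it has not learned anything it has not learned.
The student knows nothing.

q: what does the student think is the capital of France?
a: the student does not know.

The student learns the capital of France is Madrid."""

def make_prompt(context, query):
    # Append the query in the same q:/a: format the context uses,
    # leaving "a:" open for the model to complete.
    return context + "\n\nq: " + query + "\na:"

full_prompt = make_prompt(
    STUDENT_CONTEXT,
    "what does the student think is the capital of France?",
)
```

The same `make_prompt` can then be reused with a different context when the query targets a different domain.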

Chaining Prompts

Chaining prompts allows us to get around the 2,000-token limit by using prompts to perform subtasks whose results are then fed back into the model. You can't do this in the Playground at the moment, though it'd be handy if they added the ability to feed one prompt's output into another.
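The core loop is simple: each subtask's completion is spliced into the next prompt. A minimal sketch, assuming a `complete` callable that wraps whatever completion endpoint you use (stubbed here so the control flow is visible without network access):

```python
def chain(templates, complete):
    """Run each template in turn, filling its {previous} slot with the
    previous call's output, and return the final completion."""
    previous = ""
    for template in templates:
        prompt = template.format(previous=previous)
        previous = complete(prompt)
    return previous

# Stub standing in for a real API call.
def stub_complete(prompt):
    return "[output of: " + prompt + "]"

result = chain(
    ["Summarize the following notes: {previous}",
     "Turn this summary into three bullet points: {previous}"],
    stub_complete,
)
```

Because each call sees only its own prompt, a job too large for one prompt can be broken into subtasks that each fit within the limit.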

Expert Prompts

See the prompts module for expert prompts. You can set up a general prompt and then process its results.

Iterative Evolution:

The advantage of iterating is that the model's own outputs can refine the prompt over successive passes.

Iterative prompts can be useful for tasks such as classification or labeling. For example, suppose you've got a set of news headlines but don't know what the labels should be. That's fine! Just start with some prompts, let the API figure out how it wants to label them with your initial prompts as guidance, and then feed the API-labelled headlines back in as examples for the remaining and new headlines, letting the labels evolve over time.
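A sketch of that evolution loop, again assuming a `complete` callable wrapping the API (stubbed here). Model-assigned labels are appended to the few-shot examples, so each later headline is labelled in light of earlier decisions; the helper names are illustrative.

```python
def labelling_prompt(examples, headline):
    """Build a few-shot labelling prompt from the examples so far."""
    lines = ["Label each news headline with a short topic.", ""]
    for text, label in examples:
        lines.append("Headline: " + text)
        lines.append("Label: " + label)
        lines.append("")
    lines.append("Headline: " + headline)
    lines.append("Label:")
    return "\n".join(lines)

def evolve_labels(seed_examples, headlines, complete):
    examples = list(seed_examples)
    for headline in headlines:
        label = complete(labelling_prompt(examples, headline)).strip()
        examples.append((headline, label))  # feed the new label back in
    return examples

# Stub completion so the loop can run offline; a real call would go here.
labelled = evolve_labels(
    [("Stocks rally after rate cut", "finance")],
    ["Local team wins championship"],
    lambda prompt: " sports",
)
```

Each pass enlarges the example set, so the label vocabulary the model converges on is shaped by its own earlier answers rather than fixed up front.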

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License