Chaining Python Function Calls




When using multiple API calls in sequence, it can help to let the API figure out what function to call. This could be done by just doing a classification API call and then figuring out what call and parameters are associated with what output as well, although it isn't clear at this point what the most effective way to do this is.

To have the API generate its own function calls, you can use a simple loop to constantly evaluate output using python's eval command, which translates a string (returned by the API) into an executable Python variable or function. It looks like this:

while (nextStep != stopCondition):
  nextStep = eval(nextStep)

This while loop evaluates nextStep, executes the output of nextStep, and continues doing so until the output of nextStep meets some stopping condition.

Example: Summarizing News Homepage Coverage

Code (Github gist)

This code summarizes articles on a newspaper homepage. It has three major steps when it processes a request about newspapers:

  1. Classify input as the sort of newspaper question or not and pass to next step
  2. Run the selected function with parameters

When run, it'll generate responses like this:


Classifying Input

So first, when you get input, you need to figure out what sort of call you'll need. Here, we can just define a call to figure out what the input's trying to do. Since this isn't too complex, it'll have just a few cases it needs to consider:

  • Regular dialogue
  • Summarize a single front page
  • Compare differences in front pages
  • Compare similarity of front pages

Note: This doesn't scale if you want this feature to handle more cases than could fit in the context window; there's ways to address that.

def routeInput(text):
  prompt = """This is a conversation with a helpful and witty assistant. If the input is a question about news, the assistant routes the question to processIndividualQuery(). Otherwise, if the assistant knows the answer it just responds with a print() statement.

Input: How are you?
Output: print("I'm great!")

Input: what's nyt coverage looking like?
Output: processIndividualQuery("what's nyt coverage looking like?")

Input: What's the difference between the nytimes and FoxNews coverage?
Output: contrastNews("nyt", "foxnews")

Input: What's the similarities between the new york times and abc coverage?
Output: getSimilarityofNews("foxnews", "abc")

Input: What's 2+2?
Output: print("4")

Input: What's the difference between the fox and abc coverage?
Output: contrastNews("foxnews", "abc")

Input: bleh
Output: print("Ask a question")

Input: What's the similarities between the fox and ABC coverage?
Output: getSimilarityofNews("foxnews", "abc")

Input: What does ABC say about the game?
Output: processIndividualQuery("What does ABC say about the game?")

Input: Who was the first president of the United States?
Output: print("George Washington")

Input: Who won the race?
Output: processIndividualQuery("Who won the race?")

Input: What's the difference between the new york times and Fox coverage?
Output: contrastNews("nyt", "foxnews")

Input: {}
  r = query(prompt)
  if not r.endswith(")"):
      return '''print("Error! Ask a question")'''
  return r

Note what this is doing. It's returning strings which contain function calls along with parameters. In python, you can evaluate these strings using


Within the construct, the parameters are enclosed in quotes so they can be passed as strings; if they aren't in quotes they'll be passed as variable names, which can throw exceptions if the variables don't have values already assigned.

Classify type of newspaper question

These tasks are all similar to the word-in-context and anli, which, as it happens, have seen pretty decent performance increases compared to the original OpenAI paper since being released into the wild. Luckily, that means that comparison of issues across publications should be doable, but it's not 100% accurate.

Summarize Single Page

Multiple Pages

When comparing multiple pages, I try to avoid using the actual publication names to avoid any noise from the model and instead use generic terms ['Publication 1, Publication 2'], then replace the publications after the fact.

Compare differences

Compare differences

Scraping the Actual Data

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License