OpenAI Function Calling Workshop

This workshop is for Python developers and analysts who want to use LLMs like OpenAI's GPT models to extract structured data from text, or to run queries from natural language and summarise the results.

Pre-requisites: Comfortable with Python and REST APIs

We will use OpenAI's Function Calling via LangChain to write code that can answer natural language questions from data.

Install dependencies and data

For this workshop, you need:

pip install -q pandas      # To run queries on data
pip install -q openai      # To call OpenAI's API
pip install -q tiktoken    # To count tokens in text
pip install -q langchain   # To orchestrate function calling

This workshop uses the Goodreads 10K dataset. YOU SHOULD USE A DIFFERENT DATASET FOR YOUR SUBMISSION.

For example, download the Goodreads dataset:

from urllib.request import urlretrieve

urlretrieve('https://raw.githubusercontent.com/zygmuntz/goodbooks-10k/master/books.csv', 'books.csv')
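
It helps to take a quick look at the data first. Here is a minimal sketch, assuming the standard goodbooks-10k columns, to confirm the fields this workshop relies on are present:

import pandas as pd

# Peek at the columns used later: authors, original_publication_year, average_rating, ratings_count
pd.read_csv('books.csv')[['title', 'authors', 'original_publication_year', 'average_rating', 'ratings_count']].head()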

Set up libraries

Set up an OpenAI API key.

import os

os.environ['OPENAI_API_KEY'] = '...'

Next, let's set up LangChain. We:

  • Use GPT 3.5 Turbo since it's a reasonably inexpensive and capable model as of 10 Oct 2023
  • Set temperature to 0 to get deterministic results
  • Set verbose to True to see the API calls

from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=0, verbose=True)

Write code to answer questions

Let's create an OpenAI prompt chain that tells OpenAI to call a function. Try out different prompts to see what works best.

Note the Today is {today} in the system message. This passes the current date as context, so the model can resolve relative questions like "last year".

from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        ('system', 'Call right function. Today is {today}'),
        ('human', '{input}'),
        ('human', 'Always use right format'),
    ]
)
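
If you want to see exactly what this template renders to, you can format it yourself. A quick sketch with illustrative values:

# Preview the rendered messages for a sample question (values are just examples)
prompt.format_messages(input='Which are the most popular books after 2000?', today='2023-10-10')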

Define functions that can answer questions from the data. Use Google Python Style docstrings to document the functions. Specifically:

  • Define the type and default of each argument
  • Describe matching questions in the first paragraph, one per line
  • Describe the arguments

import pandas as pd
from typing import Optional

books = pd.read_csv('books.csv')

def top_rated_books(count: Optional[int] = 10, start_year: Optional[int] = 0, end_year: Optional[int] = 9999, min_ratings_count: Optional[int] = 0) -> pd.DataFrame:
    """Which are the top rated books?
    Which are the best books?

    Args:
        count: # books to return
        start_year: year after which to consider books
        end_year: year before which to consider books
        min_ratings_count: min # of people who rated
    """
    result = books[books['original_publication_year'] >= start_year] if start_year else books
    result = result[result['original_publication_year'] <= end_year] if end_year else result
    result = result[result['ratings_count'] >= min_ratings_count]
    return result.sort_values('average_rating', ascending=False).head(count)

def most_popular_books(count: Optional[int]=10, start_year: Optional[int] = 0, end_year: Optional[int] = 9999, min_rating: Optional[int] = 0) -> pd.DataFrame:
    """Which are the most popular books?
    What books did people read/rate the most?

    Args:
        count: # books to return
        start_year: year after which to consider books
        end_year: year before which to consider books
        min_rating: min average rating
    """
    result = books[books['original_publication_year'] >= start_year] if start_year else books
    result = result[result['original_publication_year'] <= end_year] if end_year else result
    result = result[result['average_rating'] >= min_rating]
    return result.sort_values('ratings_count', ascending=False).head(count)

def most_prolific_authors(count: Optional[int]=10, start_year: Optional[int] = 0, end_year: Optional[int] = 9999, min_rating: Optional[int] = 0, min_ratings_count: Optional[int] = 0) -> pd.DataFrame:
    """Who wrote the most books?
    Who are the most prolific authors?

    Args:
        count: # authors to return
        start_year: year after which to consider books
        end_year: year before which to consider books
        min_rating: min average rating
        min_ratings_count: min # of people who rated
    """
    result = books[books['original_publication_year'] >= start_year] if start_year else books
    result = result[result['original_publication_year'] <= end_year] if end_year else result
    result = result[result['average_rating'] >= min_rating]
    result = result[result['ratings_count'] >= min_ratings_count]
    return result.groupby('authors').size().sort_values(ascending=False).head(count).to_frame('count')
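
Before handing these functions to the model, it's worth calling one directly to confirm it behaves as expected. The argument values below are just examples:

# Sanity check: the functions work on their own, with no LLM involved
top_rated_books(count=5, min_ratings_count=10000)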

Now, create a function chain that passes these functions to OpenAI to choose from.

from langchain.chains.openai_functions import create_openai_fn_chain

functions = [
    top_rated_books,
    most_popular_books,
    most_prolific_authors
]
chain = create_openai_fn_chain(functions, llm, prompt, verbose=True)

This allows us to ask questions:

query = 'Which are the most popular books after 2000?'
today = '2023-10-10'
result = chain.generate([{'input': query, 'today': today}])
result

result looks like this:

LLMResult(
  generations=[
    [
      ChatGeneration(
        generation_info={'finish_reason': 'function_call'},
        message=AIMessage(
          content='',
          additional_kwargs={
            'function_call': {
              'name': 'most_popular_books',
              'arguments': '{\n  "start_year": 2000\n}'
            }
          }
        )
      )
    ]
  ],
  llm_output={
    'token_usage': {
      'prompt_tokens': 302,
      'completion_tokens': 20,
      'total_tokens': 322
    },
    'model_name': 'gpt-3.5-turbo'
  },
  run=[RunInfo(run_id=UUID('...'))]
)

This identified the most_popular_books function as the one to call, along with the arguments (as JSON).
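
If you just want the data, you can pull the chosen function and its JSON arguments straight out of this result. This is a bare-bones sketch of what the helper below does, minus cost tracking and error handling:

import json

# Look up the chosen function by name and call it with the JSON arguments
call = result.generations[0][0].message.additional_kwargs['function_call']
chosen = {fn.__name__: fn for fn in functions}[call['name']]
chosen(**json.loads(call['arguments']))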

Let's write a helper that runs the chain, calls the chosen function, and calculates the cost of each query.

import json
from datetime import datetime, timezone

# Cost of gpt-3.5-turbo as of 10 Oct 2023
cents = {
    'prompt_tokens': 0.0015 / 1000 * 100,
    'completion_tokens': 0.002 / 1000 * 100,
}
fn_map = {fn.__name__: fn for fn in functions}

def answer(query: str, today: str = datetime.now(timezone.utc).strftime('%Y-%m-%d')) -> dict:
    response = chain.generate([{'input': query, 'today': today}])
    # Calculate cost of query in cents
    cost = sum(cents[token] * response.llm_output['token_usage'][token] for token in cents)
    result = {'data': None, 'cost': cost, 'query': query, 'response': response}
    # Get the function name and arguments
    if len(response.generations) == 0 or len(response.generations[0]) == 0:
        return result
    function = response.generations[0][0].message.additional_kwargs.get('function_call', None)
    if function is None or function.get('name', None) not in fn_map:
        return result
    # Call the function and return the result
    kwargs = json.loads(function['arguments'])
    result['data'] = fn_map[function['name']](**kwargs)
    return result

Test your code

answer1 = answer('Which are the most popular books after 2000')
answer1['data']

This should give a DataFrame that lists Suzanne Collins' The Hunger Games, followed by Stephenie Meyer's Twilight, and others.

Let's try a few more questions:

answer2 = answer('Which are the best rated books with at least 10,000 ratings in the last century?')
answer3 = answer('Who are the top 3 authors by number of books in the last decade of the 20th century?')

The top rated books last century were Calvin and Hobbes, Harry Potter and Words of Radiance.

answer2['data'].head().T

The top authors in the 1990s were Stephen King, Terry Pratchett and Nora Roberts.

answer3['data']

These queries cost about 0.05 cents each:

answer1['cost'], answer2['cost'], answer3['cost']
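
As a quick check of the arithmetic: the token usage shown earlier (302 prompt tokens, 20 completion tokens) works out to 302 × 0.00015 + 20 × 0.0002 ≈ 0.049 cents, i.e. roughly 0.05 cents.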

Reduce the cost

The cost is driven largely by prompt tokens: every function schema we pass is sent with each request, so the cost per query grows with the number of functions.
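
To get a feel for that overhead, you can count tokens with tiktoken (installed earlier but not used yet). This is a rough sketch: it only counts each function's name and docstring, and OpenAI packs function schemas in its own format, so treat the numbers as approximate:

import json
import tiktoken

enc = tiktoken.encoding_for_model('gpt-3.5-turbo')
for fn in functions:
    # Approximate the schema text that gets sent for each function
    schema = json.dumps({'name': fn.__name__, 'description': fn.__doc__})
    print(fn.__name__, len(enc.encode(schema)), 'tokens (approx)')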

Instead, let's use vector embeddings to find the 2 functions whose documented questions are most similar to the query, and pass only those to OpenAI. This reduces the cost to about 0.03 cents.

import numpy as np
from langchain.storage.file_system import LocalFileStore
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cache import CacheBackedEmbeddings
from typing import List

file_store = LocalFileStore('.embeddings/')
base = OpenAIEmbeddings()
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(base, file_store, namespace=base.model)

def classify(docs: List[str], topics: List[str]):
    """Return the similarity between each doc and topic"""
    doc_embed = np.array(cached_embeddings.embed_documents(docs))
    topic_embed = np.array(cached_embeddings.embed_documents(topics))
    return np.dot(doc_embed, topic_embed.T)
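
For example, comparing a query against the first line of each function's docstring (the first call hits the OpenAI embeddings API; later calls come from the local cache):

# Similarity of one query against three candidate questions: returns a 1×3 matrix
classify(
    ['Which are the most popular books after 2000?'],
    ['Which are the top rated books?', 'Which are the most popular books?', 'Who wrote the most books?'],
)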

Now modify answer() to pick the top 2 matching functions and pass only those to OpenAI.

from langchain.chains.openai_functions.base import convert_python_function_to_openai_function


def answer(query: str, today: str = datetime.now(timezone.utc).strftime('%Y-%m-%d'), top=2, min_similarity=0.85) -> dict:
    # Convert the functions into a JSON schema
    questions = []
    for name, fn in fn_map.items():
        schema = convert_python_function_to_openai_function(fn)
        for desc in schema['description'].split('\n'):
            questions.append({'q': desc.strip(), 'fn': fn})
    # Find the similarity of each function to the query
    similar = classify([query], [question['q'] for question in questions])
    for index, similarity in enumerate(similar[0]):
        questions[index]['similarity'] = similarity
    # Create chain from the top similar questions with min_similarity
    questions = sorted(questions, key=lambda q: q['similarity'], reverse=True)
    top_functions = [q['fn'] for q in questions if q['similarity'] >= min_similarity][:top]
    chain = create_openai_fn_chain(top_functions, llm, prompt, verbose=True)

    # Then run the chain. This is the same as before
    response = chain.generate([{'input': query, 'today': today}])
    # Calculate cost of query in cents
    cost = sum(cents[token] * response.llm_output['token_usage'][token] for token in cents)
    result = {'data': None, 'cost': cost, 'query': query, 'response': response, 'functions': top_functions}
    # Get the function name and arguments
    if len(response.generations) == 0 or len(response.generations[0]) == 0:
        return result
    function = response.generations[0][0].message.additional_kwargs.get('function_call', None)
    if function is None or function.get('name', None) not in fn_map:
        return result
    # Call the function and return the result
    kwargs = json.loads(function['arguments'])
    result['data'] = fn_map[function['name']](**kwargs)
    return result

Now let's run all 3 questions...

answer1a = answer('Which are the most popular books after 2000')
answer2a = answer('Which are the best rated books with at least 10,000 ratings in the last century?')
answer3a = answer('Who are the top 3 authors by number of books in the last decade of the 20th century?')

... and check that their cost is ~0.03 cents each:

answer1a['cost'], answer2a['cost'], answer3a['cost']

Then verify that the answers are the same too.

[
  answer1['data'].equals(answer1a['data']),
  answer2['data'].equals(answer2a['data']),
  answer3['data'].equals(answer3a['data']),
]

Submit your code

  1. Create an issue titled Exercise submission. Add a link to your Colab notebook.

Verification

To mark a submission as correct:

  1. Check if the notebook uses a different dataset and functions than the ones provided
  2. Check if the code has been fully executed with the new dataset and without errors
  3. Add the Function calling skill

Colab notebook

Notebook

Video

Workshop video
