Comprehensive Guide to OpenAI Agents SDK Built-In Tools for Developers

In previous posts, we explored the OpenAI Responses API together with the Agents SDK. This time, we’ll dig into the built-in tools the API offers. I’ll show how each can be implemented, keeping the scripts compact and letting the SDK do the heavy lifting; that turned out to be straightforward for two of the three tools.

Web Search

Building on the Python setup with OpenAI from earlier, let’s initiate our first example with the web search tool using the Responses API.

When you send a request to a model, you can pass a list of tools the API is allowed to use. To make our request use web search, we only need one entry in that list. Below is a simple script saved in a local file named web_search.py.
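A minimal sketch along those lines, where the model name and query string are assumptions; the timing print produces the duration figure discussed below:

```python
# web_search.py -- minimal web search example via the Responses API
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

start = time.time()
response = client.responses.create(
    model="gpt-4o",  # assumed model; any Responses-capable model works
    tools=[{"type": "web_search_preview"}],  # enable the built-in web search tool
    input="What are today's top technology headlines?",  # placeholder query
)

print(f"Request took {time.time() - start:.1f} seconds")
print(response.output_text)  # just the generated text
```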

The script prints only the generated text, preceded by the time the request took, which helps you keep an eye on usage and cost (e.g., about 4 cents for a query here). As usage deepens, monitoring expenditure becomes crucial, especially for extended tasks on expensive models.

Let’s inspect the response format more closely. By altering the last line of our web_search.py file, we can explore the full response metadata:
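Assuming the script above, one way to do this is to dump the entire response object as indented JSON:

```python
# Replace the final print in web_search.py with a full dump of the response
print(response.model_dump_json(indent=2))
```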

This presents a detailed output, including annotations that tie cited URLs to spans of the generated text. Truncated for brevity, the output will resemble:
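What follows is an illustrative sketch of that structure rather than verbatim output; the field values are placeholders:

```json
{
  "output": [
    { "type": "web_search_call", "status": "completed" },
    {
      "type": "message",
      "content": [
        {
          "type": "output_text",
          "text": "...",
          "annotations": [
            {
              "type": "url_citation",
              "url": "https://example.com/article",
              "title": "...",
              "start_index": 0,
              "end_index": 120
            }
          ]
        }
      ]
    }
  ]
}
```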

We can also display the tools utilized in the response:
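Assuming the same response object, that is a one-line change:

```python
# Show the tool configuration the API applied to this request
print(response.tools)
```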

The output shows the default search context size and user location settings. For international news analysis, you might wish to customize these:
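For example, the web search tool accepts a search_context_size setting; this sketch assumes the same client and a placeholder query:

```python
# Ask for a larger search context for broader international coverage
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "web_search_preview",
        "search_context_size": "high",  # "low", "medium" (default), or "high"
    }],
    input="Summarize this week's major international news.",  # placeholder
)
print(response.output_text)
```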

Example of Tailored Web Search

Add a user location to narrow down the search, for example for news about an event in London:
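A sketch assuming the same client; the city and query are placeholders:

```python
# Bias search results toward an approximate user location
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "web_search_preview",
        "user_location": {
            "type": "approximate",
            "country": "GB",
            "city": "London",
        },
    }],
    input="What is happening in London this weekend?",  # placeholder query
)
print(response.output_text)
```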

File Search

The file search tool requires a pre-built knowledge base in the form of a vector store, so files must be uploaded before the tool can be used. We’ll use an example file named “deep_research_blog.pdf”:
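A sketch of the upload script; the store name is an assumption, and on older SDK versions the vector store methods live under client.beta.vector_stores instead:

```python
# create_store.py -- build a vector store and attach a file to it
from openai import OpenAI

client = OpenAI()

# Create a vector store to act as the knowledge base
vector_store = client.vector_stores.create(name="knowledge_base")  # assumed name
print(vector_store.id)  # keep this ID for the queries below

# Upload the PDF and attach it, polling until processing completes
with open("deep_research_blog.pdf", "rb") as f:
    client.vector_stores.files.upload_and_poll(
        vector_store_id=vector_store.id,
        file=f,
    )
```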

After executing this script, we have a valid vector store containing the uploaded file, ready for further queries.

To verify the file’s accessibility within the vector store, you can run a small script with the respective vector store ID:
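A sketch with a placeholder store ID:

```python
# List the files attached to the vector store and their processing status
result = client.vector_stores.files.list(vector_store_id="vs_...")  # your store ID
for file in result:
    print(file.id, file.status)  # status should read "completed"
```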

Running this code should confirm that the file is stored correctly. After validation, you can query the knowledge base by passing the vector store ID to the file search tool:
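A sketch, again with a placeholder store ID; the query string is an assumption:

```python
# Query the knowledge base via the built-in file search tool
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_..."],  # replace with your vector store ID
    }],
    input="What does the document say about deep research?",  # placeholder
)
print(response.output_text)
```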

The response mirrors the web search output, except the citations point into the stored documents: each output item carries annotations indicating which file a passage was drawn from.

Computer Use

The last built-in tool offers a computer use capability, allowing the model to simulate human-like interactions such as filling out forms or navigating web interfaces. This anticipates a future where AI handles such interactions automatically, minimizing human involvement.

This tool requires considerable setup to automate effectively and hasn’t yet reached the reliability needed for general use. The steps to drive it are as follows, with a minimal sketch after the list:

  1. Send the request with the role set to “user.”
  2. Define the target screen’s dimensions and supply an image (screenshot) of it.
  3. Receive output detailing the suggested actions based on that screen.
  4. Execute the suggested actions and capture the resulting screen for review.
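Here is a minimal single-turn sketch of that loop; the screenshot file, prompt, and display dimensions are assumptions, and actually executing the suggested actions against a real browser or VM is out of scope here:

```python
# computer_use.py -- one turn of the computer use loop
import base64
from openai import OpenAI

client = OpenAI()

# Step 2: describe the target screen and provide its current screenshot
with open("screenshot.png", "rb") as f:  # hypothetical screenshot of the UI
    screenshot_b64 = base64.b64encode(f.read()).decode()

response = client.responses.create(
    model="computer-use-preview",  # dedicated computer-use model
    tools=[{
        "type": "computer_use_preview",
        "display_width": 1024,   # assumed screen dimensions
        "display_height": 768,
        "environment": "browser",
    }],
    input=[{
        "role": "user",  # step 1: the request is issued in the user role
        "content": [
            {"type": "input_text", "text": "Fill out the signup form."},  # placeholder task
            {"type": "input_image",
             "image_url": f"data:image/png;base64,{screenshot_b64}"},
        ],
    }],
    truncation="auto",  # required when using the computer use tool
)

# Step 3: the model replies with suggested computer_call actions
for item in response.output:
    if item.type == "computer_call":
        print(item.action)  # e.g. a click with coordinates, or text to type

# Step 4 (not shown): perform the action, take a fresh screenshot, and send
# it back as a computer_call_output item to continue the loop.
```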

Keep in mind that a tool like this demands dedicated attention and care, especially in getting its interactions with user interfaces right.
