For more complex use cases, there are libraries whose components can be “chained” together, with each component making its own call to a large language model via an API. In this setup, consecutive API calls are made, where the output of one call influences or determines the input of the next.
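As a minimal sketch of this pattern, the snippet below chains two calls so that the first call's output becomes part of the second call's prompt. The `call_llm` function here is a hypothetical stand-in, not a real client library; in practice it would be an HTTP request to a hosted model's API.

```python
# Sketch of chaining two consecutive API calls: the output of the first
# call determines the input of the second. `call_llm` is a hypothetical
# stub standing in for a real model API client.
def call_llm(prompt: str) -> str:
    # Stub: a real implementation would send `prompt` to a model API
    # and return the generated text.
    return f"response to: {prompt}"

def chain(first_prompt: str) -> str:
    first_output = call_llm(first_prompt)
    # The first call's output is embedded in the second call's prompt.
    second_prompt = f"Refine the following: {first_output}"
    return call_llm(second_prompt)

result = chain("Explain vector databases in one sentence.")
print(result)
```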

The sequence of operations or queries is executed against vector-based data. Vector databases store information as numerical representations (embeddings) of the original objects, which can be mapped back to those objects, enabling efficient manipulation and retrieval of the data.

Why Chaining Sequential API Calls Can Be Beneficial or Required:

  1. Contextual Continuation: For models like GPT-4, the context of a previous request might be needed to continue generating meaningful or coherent content. For instance, if you're generating a long story or article that exceeds the model's token limit in a single response, you might use the end of one response as the start of the next request to continue the story seamlessly.
  2. Iterative Refinement: Based on the response from the first call, you might decide to refine or adjust your next request to get a more desirable output. For instance, if the initial explanation of a concept isn't clear enough, the next request can ask for a more detailed or simplified version.
  3. Multi-step Tasks: Some tasks require multiple steps of processing. For example, first summarizing a piece of text and then translating that summary into another language would involve chaining two different types of requests.
  4. Feedback Loops: In interactive applications, the user might react or provide feedback to the output of the model, and this feedback can be used to guide the input for the next API call.
  5. Depth and Breadth of Exploration: If you're exploring a vast topic or generating diverse ideas, chaining calls can help dive deeper into subtopics or explore various branches of a topic systematically.
  6. Conditional Logic: Based on the model's output in one step, you might decide to take different subsequent actions. For instance, if you're creating a troubleshooting guide, the next step might depend on the solution proposed in the previous step.
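A multi-step task like the one in point 3 (summarize, then translate) can be sketched as two chained calls, where the second step consumes the first step's output. As above, `call_llm` is a hypothetical stub rather than a real API client; its canned responses exist only to make the example self-contained.

```python
# Sketch of a multi-step task: summarize a text, then translate the
# summary. `call_llm` is a hypothetical stub; a real chain would route
# each prompt to an actual model API.
def call_llm(prompt: str) -> str:
    # Canned responses so the example runs without a real model.
    if prompt.startswith("Summarize:"):
        return "a short summary"
    if prompt.startswith("Translate to French:"):
        return "un court résumé"
    return ""

def summarize_then_translate(text: str) -> str:
    summary = call_llm(f"Summarize: {text}")
    # Step 2 consumes step 1's output, forming the chain.
    return call_llm(f"Translate to French: {summary}")

translated = summarize_then_translate("A long article about vector databases.")
print(translated)
```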

However, it's essential to manage the chaining effectively:

<aside> 💡 In the courses recommended in this section, we explore the Langchain framework, which assists in creating sequences of operations, and give an initial introduction to using multiple (often quite different, and even multi-modal) models. Love it or hate it over time, Langchain is a great way to get started at this 101 stage.

</aside>

Vector databases are an integral part of using a framework like Langchain. They allow the system to provide external documents (context) to the LLM to answer Q&A requests and other queries. This is done through a multi-step process that transforms the external data into a numerical representation and stores it, so that when a query comes in, the “answers” with the highest similarity scores to the question are retrieved efficiently. Learn more about vector databases in our Key Concepts section.
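The retrieval step described above can be sketched in a few lines: texts are stored as vectors, and the entries most similar to the query vector are returned as context. The `embed` function here is a deliberately toy stand-in (a character-frequency vector) for a real embedding model, and the similarity measure is cosine similarity over unit-length vectors.

```python
# Minimal sketch of vector-database retrieval: store texts as numerical
# vectors, then return the stored entry most similar to the query.
# `embed` is a toy stand-in for a real embedding model.
import math

def embed(text: str) -> list[float]:
    # Toy embedding: normalized character-frequency vector over a-z.
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    counts = [text.lower().count(ch) for ch in alphabet]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

documents = [
    "vector databases store embeddings",
    "bananas are yellow fruit",
    "embeddings map text to numbers",
]
index = [(doc, embed(doc)) for doc in documents]

query_vec = embed("how do vector embeddings work")
# Rank stored documents by similarity to the query; the top hit is the
# context handed to the LLM.
ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
print(ranked[0][0])
```

A production system would replace `embed` with a real embedding model and the sorted list with an approximate-nearest-neighbor index, but the ranking step is conceptually the same.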