Expansion
Information retrieval systems can be sensitive to phrasing and to specific keywords. One classic technique for mitigating this is query expansion: generate multiple paraphrased versions of a query and return results for all of them. LLMs are a great tool for generating these alternate phrasings.
Let's take a look at how we might do query expansion for our Q&A bot over the LangChain YouTube videos, which we started in the Quickstart.
Setup
Install dependencies
- npm: npm i @langchain/core zod
- yarn: yarn add @langchain/core zod
- pnpm: pnpm add @langchain/core zod
Set environment variables
# Optional, use LangSmith for best-in-class observability
LANGSMITH_API_KEY=your-api-key
LANGCHAIN_TRACING_V2=true
Query generation
To make sure we get multiple paraphrasings, we'll use an LLM function-calling API.
Pick your chat model:
OpenAI
Install dependencies
- npm: npm i @langchain/openai
- yarn: yarn add @langchain/openai
- pnpm: pnpm add @langchain/openai
Add environment variables
OPENAI_API_KEY=your-api-key
Instantiate the model
import { ChatOpenAI } from "@langchain/openai";
const llm = new ChatOpenAI({
  model: "gpt-3.5-turbo-0125",
  temperature: 0
});
Anthropic
Install dependencies
- npm: npm i @langchain/anthropic
- yarn: yarn add @langchain/anthropic
- pnpm: pnpm add @langchain/anthropic
Add environment variables
ANTHROPIC_API_KEY=your-api-key
Instantiate the model
import { ChatAnthropic } from "@langchain/anthropic";
const llm = new ChatAnthropic({
  model: "claude-3-sonnet-20240229",
  temperature: 0
});
FireworksAI
Install dependencies
- npm: npm i @langchain/community
- yarn: yarn add @langchain/community
- pnpm: pnpm add @langchain/community
Add environment variables
FIREWORKS_API_KEY=your-api-key
Instantiate the model
import { ChatFireworks } from "@langchain/community/chat_models/fireworks";
const llm = new ChatFireworks({
  model: "accounts/fireworks/models/firefunction-v1",
  temperature: 0
});
MistralAI
Install dependencies
- npm: npm i @langchain/mistralai
- yarn: yarn add @langchain/mistralai
- pnpm: pnpm add @langchain/mistralai
Add environment variables
MISTRAL_API_KEY=your-api-key
Instantiate the model
import { ChatMistralAI } from "@langchain/mistralai";
const llm = new ChatMistralAI({
  model: "mistral-large-latest",
  temperature: 0
});
import { z } from "zod";
const paraphrasedQuerySchema = z
  .object({
    paraphrasedQuery: z
      .string()
      .describe("A unique paraphrasing of the original question."),
  })
  .describe(
    "You have performed query expansion to generate a paraphrasing of a question."
  );
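As a quick illustration of the shape this schema enforces, a hand-written object can be validated with zod's parse (the sample text here is invented):
// Illustrative only: validate a sample object against the schema.
const example = paraphrasedQuerySchema.parse({
  paraphrasedQuery: "How do I build a Q&A bot over the LangChain YouTube videos?",
});
console.log(example.paraphrasedQuery);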
import { ChatPromptTemplate } from "@langchain/core/prompts";
const system = `You are an expert at converting user questions into database queries. 
You have access to a database of tutorial videos about a software library for building LLM-powered applications. 
Perform query expansion. If there are multiple common ways of phrasing a user question 
or common synonyms for key words in the question, make sure to return multiple versions 
of the query with the different phrasings.
If there are acronyms or words you are not familiar with, do not try to rephrase them.
Return at least 3 versions of the question.`;
const prompt = ChatPromptTemplate.fromMessages([
  ["system", system],
  ["human", "{question}"],
]);
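If you want to inspect exactly what will be sent to the model, the prompt template can be formatted on its own (the question here is illustrative):
// Format the prompt without calling the model; the result is a
// ChatPromptValue whose messages property holds the rendered messages.
const formatted = await prompt.invoke({
  question: "stream events from llm agent",
});
console.log(formatted.messages);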
const llmWithTools = llm.withStructuredOutput(paraphrasedQuerySchema, {
  name: "ParaphrasedQuery",
});
const queryAnalyzer = prompt.pipe(llmWithTools);
Let's see what queries our analyzer generates for the questions we searched earlier:
await queryAnalyzer.invoke({
  question:
    "how to use multi-modal models in a chain and turn chain into a rest api",
});
{
  paraphrasedQuery: "How to utilize multi-modal models sequentially and convert the sequence into a REST API?"
}
await queryAnalyzer.invoke({ question: "stream events from llm agent" });
{ paraphrasedQuery: "Retrieve real-time data from the LLM agent" }
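From here, the expanded query can be used for retrieval alongside the original phrasing. The sketch below shows one way to do that fan-out; it assumes a hypothetical retriever object (any LangChain retriever exposing the standard invoke() method), which is not defined in this guide:
const question = "stream events from llm agent";
const { paraphrasedQuery } = await queryAnalyzer.invoke({ question });

// Run retrieval over both phrasings in parallel.
const resultLists = await Promise.all(
  [question, paraphrasedQuery].map((q) => retriever.invoke(q))
);

// Merge the result lists, deduplicating documents by page content.
const seen = new Set<string>();
const docs = resultLists.flat().filter((doc) => {
  if (seen.has(doc.pageContent)) return false;
  seen.add(doc.pageContent);
  return true;
});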