Developing a RAG chatbot on NextJS: Firebase Genkit and Firestore Tutorial
Build an AI bot that answers questions based on a website's content using the new Firebase Genkit and Firestore Vector DB.
Many of us are adding generative AI workflows to our apps. A common problem is getting LLMs to answer based on a dataset that we define.
We may want to provide information like:
Our internal documents like PDFs or reports.
Data provided by the user like notes, forms, appointment history etc.
Or the latest information, like news articles and Wikipedia pages.
Building a RAG pipeline:
To solve this, we will build a RAG (retrieval-augmented generation) pipeline that performs a semantic search over a large corpus of thousands of documents and extracts the most relevant data to answer the user's query.
For this article, we will pick a website and create a chatbot that answers questions about its content.
Getting Help: Dash Agent for Firebase Genkit
Building with a new package is difficult because it requires learning its concepts first. To save time reading the Firebase Genkit documentation, we will use the @Firebase Genkit agent in CommandDash.
We can install it from the CommandDash marketplace in VS Code.
With the Firebase Genkit agent in place, we can ask it directly to help us build our use case. Any interactions with the agent are quoted like below:
Hi Firebase Genkit! Let's start building
Getting Started
How to setup a Next.js project to use Firebase Genkit and Firestore
Here's a comprehensive guide on setting up a Next.js project to use Firebase Genkit with Firestore.
Prerequisites
Firebase Project: You need a Firebase project. If you don't have one, create one at https://console.firebase.google.com.
Node.js and npm: Ensure you have Node.js and npm installed on your machine.
Firebase CLI: Install the Firebase CLI using npm install -g firebase-tools.
Steps
Create a Next.js Project
npx create-next-app@latest my-genkit-nextjs-app
cd my-genkit-nextjs-app
Choose TypeScript as the language during setup.
Initialize Genkit
npm i -g genkit
genkit init
Select Next.js as your deployment platform.
Select your preferred model provider (e.g., Gemini, Vertex AI).
Accept the defaults for other prompts.
This will set up your project with sample Genkit files.
Install the Firebase Plugin
npm install --save @genkit-ai/firebase
Configure the Firebase Plugin
In your genkit.ts file:
import { firebase } from '@genkit-ai/firebase';
import { configureGenkit } from '@genkit-ai/core';
configureGenkit({
  plugins: [firebase({ projectId: 'your-firebase-project-id' })], // Replace 'your-firebase-project-id'
  logLevel: 'debug',
  enableTracingAndMetrics: true,
});
Set up Firestore
Initialize the Firebase Admin SDK in your genkit.ts file:
import * as admin from 'firebase-admin';
import { getFirestore } from 'firebase-admin/firestore';
// ... other imports
const app = admin.initializeApp({
  credential: admin.credential.cert('path/to/your/firebase-service-account-key.json'),
});
const firestore = getFirestore(app);
// ... rest of your Genkit code
- Replace path/to/your/firebase-service-account-key.json with the path to your Firebase service account key file. You can download this key file from the Firebase console (Project Settings > Service Accounts).
Populating Data in Firestore
For this article, we will pick the Firebase Genkit documentation and index it to our Firestore DB.
Indexing means converting text into embeddings, which can then be used to rank pieces of text by their semantic similarity. To make it easy, I've extracted the Genkit documentation into this JSON file.
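To make the idea of ranking by semantic similarity concrete, here is a minimal sketch using cosine similarity, one common similarity measure. The three-dimensional vectors and the helper names (cosineSimilarity, topK) are made up for illustration; real embeddings have hundreds of dimensions, and Firestore performs this search for us, with the distance measure depending on how the retriever is configured.

```typescript
// Toy illustration of semantic ranking. The vectors are invented for the
// example; real embeddings come from the embedding model.
type Embedded = { text: string; embedding: number[] };

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents whose embeddings are closest to the query embedding.
function topK(query: number[], docs: Embedded[], k: number): Embedded[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) -
        cosineSimilarity(query, x.embedding)
    )
    .slice(0, k);
}

const docs: Embedded[] = [
  { text: "How to define a flow", embedding: [0.9, 0.1, 0.0] },
  { text: "Pricing and billing", embedding: [0.0, 0.2, 0.9] },
  { text: "Running flows locally", embedding: [0.8, 0.3, 0.1] },
];
const best = topK([1, 0, 0], docs, 2);
console.log(best.map((d) => d.text));
```

The two flow-related entries rank above the billing entry because their vectors point in nearly the same direction as the query vector.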
Now let's ask the Dash Agent:
Write me a method that extracts data from the json file with schema [{title, description, content}] and saves the indexed data to the "documents" collection in firestore.
Chunk the data appropriately within the embedding model limits.
// app/genkit.ts
import { configureGenkit } from "@genkit-ai/core";
import { defineFirestoreRetriever, firebase } from "@genkit-ai/firebase";
import { googleAI } from "@genkit-ai/googleai";
import * as admin from 'firebase-admin';
import { getFirestore, FieldValue } from "firebase-admin/firestore";
import { textEmbeddingGecko001 } from "@genkit-ai/googleai";
import { embed } from "@genkit-ai/ai/embedder";
import * as fs from 'fs';
// your existing code
const firestore = getFirestore(app);
// Shared settings for indexing and, later, retrieval
const indexConfig = {
  collection: "documents",
  contentField: "content",
  vectorField: "embedding",
  embedder: textEmbeddingGecko001,
};
// Exported so that the API route can import it
export async function indexWebsiteContent() {
  const filePath = 'your downloaded file path';
  const data = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
  for (const document of data) {
    // Handle embedding length limits
    const chunks = chunkText(document.content, 4500);
    for (const chunk of chunks) {
      const embedding = await embed({
        embedder: indexConfig.embedder,
        content: chunk,
      });
      await firestore.collection(indexConfig.collection).add({
        [indexConfig.vectorField]: FieldValue.vector(embedding),
        [indexConfig.contentField]: chunk,
        title: document.title,
        description: document.description,
      });
      console.log(`Indexed chunk of document: ${document.title}`);
    }
  }
}
// Split text into fixed-size pieces of at most chunkSize characters
function chunkText(text: string, chunkSize: number): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }
  return chunks;
}
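The chunkText helper above cuts at fixed character offsets, so a sentence can be split across two chunks. One possible refinement, sketched here as an assumption and not part of the agent's output, is to let consecutive chunks overlap by a few characters so that text near a boundary appears in both. The function name chunkTextWithOverlap and its overlap parameter are hypothetical.

```typescript
// Hypothetical variant of chunkText: each chunk repeats the last `overlap`
// characters of the previous one, so text near a boundary is not lost.
function chunkTextWithOverlap(
  text: string,
  chunkSize: number,
  overlap: number
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const sample = "abcdefghij";
console.log(chunkTextWithOverlap(sample, 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

With an overlap of 0 the function behaves like the original chunkText. A larger overlap stores more duplicated text, so it trades storage and embedding calls for better context at chunk boundaries.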
Now, we will run indexWebsiteContent as a one-time API call to save the data to Firestore for querying later.
// pages/api/index-website-content.ts
import { NextApiHandler, NextApiRequest, NextApiResponse } from 'next';
import { indexWebsiteContent } from '../../app/genkit';
const handler: NextApiHandler = async (req: NextApiRequest, res: NextApiResponse) => {
  try {
    await indexWebsiteContent();
    res.status(200).json({ message: 'Website content indexed successfully' });
  } catch (error) {
    console.error(error);
    res.status(500).json({ message: 'Error indexing website content' });
  }
};
export default handler;
Then we run:
npm run dev
curl http://localhost:3000/api/index-website-content
The Firestore DB is now populated with embeddings in vector format!
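One caveat worth noting: the indexing code calls add(), which generates a new document ID on every call, so hitting the endpoint a second time would store every chunk again. A possible way around this, sketched here under the assumption that document titles are unique, is to derive a stable ID from the title and the chunk position. The helper name chunkDocId is hypothetical.

```typescript
// Hypothetical helper: build a stable Firestore document ID for a chunk.
// Assumes titles are unique; two titles that reduce to the same slug
// would collide.
function chunkDocId(title: string, chunkIndex: number): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into a dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  return `${slug || "untitled"}-${chunkIndex}`;
}

console.log(chunkDocId("Getting Started", 2)); // getting-started-2
```

In the indexing loop, the write could then use firestore.collection(indexConfig.collection).doc(chunkDocId(document.title, i)).set({...}) with i as the chunk's position, so that a re-run overwrites existing chunks instead of duplicating them.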
Asking questions on this data
Moving back to Firebase Genkit Agent:
Create a chatRequest method that retrieves the data from the collection and creates a model request to answer user's query.
Use top 3 documents with Gemini Pro model.
import { configureGenkit } from "@genkit-ai/core";
import { defineFirestoreRetriever, firebase } from "@genkit-ai/firebase";
import { geminiPro, googleAI } from "@genkit-ai/googleai";
import * as admin from 'firebase-admin';
import { getFirestore, FieldValue } from "firebase-admin/firestore";
import { textEmbeddingGecko001 } from "@genkit-ai/googleai";
import { embed } from "@genkit-ai/ai/embedder";
import { generate } from "@genkit-ai/ai";
import * as fs from 'fs';
// your existing code
const firestoreRetriever = defineFirestoreRetriever({
  name: 'firestoreRetriever',
  firestore: firestore,
  collection: indexConfig.collection,
  vectorField: indexConfig.vectorField,
  contentField: indexConfig.contentField,
  embedder: indexConfig.embedder,
});
export async function indexWebsiteContent() {
  // no change here
}
export async function chatRequest(query: string) {
  const docs = await firestoreRetriever({
    query: {
      content: [
        {
          text: query
        }
      ]
    },
    options: { limit: 3 }, // Retrieve top 3 documents
  });
  // Join the retrieved chunks into one context string
  let context = docs.documents.map(doc => doc.content[0].text).join('\n\n');
  const response = await generate({
    model: geminiPro,
    prompt: `
      You are a helpful and informative AI assistant.
      Answer the user's question.
      Question: ${query}
      Context: ${context}
    `,
  });
  return response.text();
}
// your existing code
Done! The entire retrieval and generation code is ready for us.
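The prompt in chatRequest simply concatenates the retrieved chunks. As an optional variation, and only a sketch under our own assumptions, the prompt could be assembled by a small helper that caps the context size and tells the model to rely on the context. The name buildPrompt, the maxContextChars parameter, and the instruction wording are all hypothetical.

```typescript
// Hypothetical helper: assemble a RAG prompt from retrieved passages,
// keeping passages in ranked order until a character budget is reached.
function buildPrompt(
  question: string,
  passages: string[],
  maxContextChars: number
): string {
  const kept: string[] = [];
  let used = 0;
  for (const passage of passages) {
    // Stop at the first passage that would exceed the budget.
    if (used + passage.length > maxContextChars) break;
    kept.push(passage);
    used += passage.length;
  }
  // Number the passages so the model can tell them apart.
  const context = kept.map((p, i) => `[${i + 1}] ${p}`).join("\n\n");
  return [
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say so.",
    "",
    `Context:\n${context}`,
    "",
    `Question: ${question}`,
  ].join("\n");
}

console.log(buildPrompt("What is a flow?", ["aaaa", "bbbb", "cccc"], 8));
```

The returned string could be passed as the prompt field of the generate call. Telling the model to admit when the context lacks the answer is a common way to reduce made-up answers, though it does not guarantee it.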
Now, we can first test the chatRequest method as an API, as we did before.
// pages/api/chat-request.ts
import { NextApiHandler, NextApiRequest, NextApiResponse } from 'next';
import { chatRequest } from '../../app/genkit';
const handler: NextApiHandler = async (req: NextApiRequest, res: NextApiResponse) => {
  try {
    // Assuming the query is passed in the request body
    const { query } = req.body;
    // If no query is provided, send an error response
    if (!query) {
      return res.status(400).json({ message: 'Please provide a query' });
    }
    let result = await chatRequest(query);
    res.status(200).json({ message: result });
  } catch (error) {
    console.error(error);
    res.status(500).json({ message: 'Error processing chat request' });
  }
};
export default handler;
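The handler above only checks that query is truthy. If stricter input checks are wanted, a small validation function could run before chatRequest is called. The following is a sketch, not part of the agent's output; the name parseQuery and the 1000-character default limit are assumptions.

```typescript
// Hypothetical validator for the request body of the chat endpoint.
type ParsedQuery =
  | { ok: true; query: string }
  | { ok: false; error: string };

function parseQuery(body: unknown, maxLength: number = 1000): ParsedQuery {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Request body must be a JSON object" };
  }
  const raw = (body as Record<string, unknown>).query;
  if (typeof raw !== "string") {
    return { ok: false, error: "Please provide a query" };
  }
  const query = raw.trim();
  if (query.length === 0) {
    return { ok: false, error: "Query must not be empty" };
  }
  if (query.length > maxLength) {
    return { ok: false, error: `Query must be at most ${maxLength} characters` };
  }
  return { ok: true, query };
}

console.log(parseQuery({ query: "  What is Genkit?  " }));
console.log(parseQuery({ query: 42 }));
```

In the handler, a result with ok set to false would map to a 400 response carrying the error text, and a result with ok set to true would pass the trimmed query to chatRequest.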
Running the method returns an error about a missing index prerequisite:
Error: 9 FAILED_PRECONDITION: Missing vector index configuration. Please create the required index with the following gcloud command: gcloud alpha firestore indexes composite create --project={your project name} --collection-group=documents --query-scope=COLLECTION --field-config=vector-config='{"dimension":"768","flat": "{}"}',field-path=embedding
This indicates that we first need to set the vector index configuration for the Firestore DB using the gcloud CLI. Once the CLI is installed, run this command:
gcloud alpha firestore indexes composite create \
  --project={project} \
  --collection-group=documents \
  --query-scope=COLLECTION \
  --field-config=vector-config='{"dimension":"768","flat": "{}"}',field-path=embedding
You can copy the values specific to your project from the error message itself. Once the index is created, you get its ID.
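The dimension value in the command (768, taken from the error message) describes the length of the vectors stored in the embedding field. A cheap safeguard, shown here as a sketch with a hypothetical helper name, is to check each embedding's length before writing it, so that a mismatch with the index configuration surfaces at indexing time.

```typescript
// Dimension from the gcloud command in the error message above.
const INDEX_DIMENSION = 768;

// Hypothetical guard: throw if an embedding does not match the index
// dimension or contains non-finite numbers; otherwise return it unchanged.
function assertEmbeddingDimension(
  embedding: number[],
  expected: number = INDEX_DIMENSION
): number[] {
  if (embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, but the index expects ${expected}`
    );
  }
  if (!embedding.every((value) => Number.isFinite(value))) {
    throw new Error("Embedding contains a non-finite value");
  }
  return embedding;
}

const fakeEmbedding = Array.from({ length: 768 }, () => 0.1);
console.log(assertEmbeddingDimension(fakeEmbedding).length); // 768
```

In the indexing loop, the result of embed could be passed through this guard before being handed to FieldValue.vector.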
Now, we re-run the same API request, and this time it succeeds.
Building the UI
We will ask CommandDash to create a basic UI that calls the chatRequest method.
Use Attach Snippets to Dash in the menu bar after selecting the snippets below to provide them as context:
[page.tsx] [genkit.ts] [pages/api/chat-request.ts] update my next JS page to have a textfield and button that calls that chat request method and shows the response in the text box.
// app/page.tsx
"use client";
import { useState } from "react";
export default function Home() {
  const [query, setQuery] = useState("");
  const [response, setResponse] = useState("");
  // Send the question to the API route and display the answer
  const handleSubmit = async (event: any) => {
    event.preventDefault();
    try {
      const result = await fetch("/api/chat-request", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ query }),
      });
      const data = await result.json();
      setResponse(data.message);
    } catch (error) {
      console.error("Error:", error);
      setResponse("Error processing chat request.");
    }
  };
  return (
    <main className="flex min-h-screen flex-col items-center justify-between p-24">
      {/* ... Existing HTML content ... */}
      <div className="mt-16 w-full max-w-5xl">
        <form onSubmit={handleSubmit}>
          <div className="mb-4">
            <label htmlFor="query" className="block text-gray-700 text-sm font-bold mb-2">
              Your question:
            </label>
            <input
              type="text"
              id="query"
              className="shadow appearance-none border rounded w-full py-2 px-3 text-gray-700 leading-tight focus:outline-none focus:shadow-outline"
              value={query}
              onChange={(e) => setQuery(e.target.value)}
            />
          </div>
          <button
            type="submit"
            className="bg-blue-500 hover:bg-blue-700 text-white font-bold py-2 px-4 rounded focus:outline-none focus:shadow-outline"
          >
            Ask
          </button>
        </form>
        {response && (
          <div className="mt-4 p-4 border border-gray-300 rounded">
            <p>{response}</p>
          </div>
        )}
      </div>
    </main>
  );
}
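The page above gives no feedback while the request is in flight. One way to model that, sketched here as an optional extension with hypothetical type and function names, is a small reducer that tracks idle, loading, success, and error states. It is plain TypeScript, so it could be passed to React's useReducer hook in place of the two useState calls.

```typescript
// Hypothetical state model for the chat form.
type ChatState =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; answer: string }
  | { status: "error"; message: string };

type ChatAction =
  | { type: "submit" }
  | { type: "resolve"; answer: string }
  | { type: "reject"; message: string };

function chatReducer(state: ChatState, action: ChatAction): ChatState {
  switch (action.type) {
    case "submit":
      return { status: "loading" };
    case "resolve":
      // Ignore a response that arrives when no request is in flight.
      return state.status === "loading"
        ? { status: "success", answer: action.answer }
        : state;
    case "reject":
      return state.status === "loading"
        ? { status: "error", message: action.message }
        : state;
  }
}

const afterSubmit = chatReducer({ status: "idle" }, { type: "submit" });
const afterAnswer = chatReducer(afterSubmit, { type: "resolve", answer: "Hello" });
console.log(afterSubmit.status, afterAnswer.status); // loading success
```

With this in place, the button could be disabled while status is "loading", and the response box could render either the answer or the error message.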
Great! Let's run our Next.js app and see the results:
Congratulations! We've built our first RAG app with Genkit!
Now adapt this to your use case and ship useful AI-powered apps to your users.
Resources:
https://marketplace.visualstudio.com/items?itemName=WelltestedAI.fluttergpt