【Qianfan SDK + Semantic Kernel】Hands-On RAG Knowledge Retrieval Augmentation

💡 A quick note before you start
Please visit and star the repo: https://github.com/baidubce/bce-qianfan-sdk

Implementing RAG with SK

For scenarios that need external knowledge, we typically turn to RAG (Retrieval-Augmented Generation), which generally involves document parsing, chunking, vectorized retrieval, and output generation with an LLM. In Semantic ChatBot we built a simple chatbot on top of Semantic Kernel and used context variables to store and track the chat history. For a very large external knowledge base, however, we would need an extremely long context and might not be able to record the entire history. To address this, SK provides the Memory types, which give the LLM a memory capability.
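As a toy illustration of those steps, the sketch below chunks a document, embeds each chunk with a crude bag-of-characters vector, and retrieves the most similar chunk by cosine similarity. The chunker and embedding here are hypothetical stand-ins for a real parser and an embedding model such as Embedding-V1; this only shows the data flow, not the SDK API.

```python
import math
from typing import List, Tuple

def chunk(text: str, size: int = 20) -> List[str]:
    # Naive fixed-size chunking; real pipelines split on document structure.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> List[float]:
    # Toy embedding: character-frequency vector (NOT a real model).
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    return [float(text.lower().count(c)) for c in alphabet]

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: List[str], k: int = 1) -> List[Tuple[float, str]]:
    # Rank every chunk against the query and keep the top-k.
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    return scored[:k]

doc = "qianfan sdk provides chat and embedding apis for ernie models"
top = retrieve("embedding api", chunk(doc))
# The retrieved chunk(s) would then be stuffed into the LLM prompt.
```

A real setup replaces embed with calls to an embedding service and stores the vectors in a memory store instead of recomputing them per query.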
! pip install semantic-kernel==0.4.5.dev0
Initialize authentication, and import SK along with the Qianfan implementation types adapted to SK:
import os

os.environ["QIANFAN_ACCESS_KEY"] = "your_ak"
os.environ["QIANFAN_SECRET_KEY"] = "your_sk"

import semantic_kernel as sk
from qianfan.extensions.semantic_kernel import (
    QianfanChatCompletion,
    QianfanTextEmbedding,
)

SK Memory

SK Memory is a data framework that can connect to all kinds of external data sources: web pages, databases, email, and so on, all integrated in SK's built-in connectors. With QianfanTextEmbedding, we can extract feature vectors from the text in these sources for later retrieval.
Here we use VolatileMemoryStore as the Memory implementation. VolatileMemoryStore provides temporary in-memory storage (under the hood, a Dict[str, Dict[str, MemoryRecord]] that implements per-collection key-value storage).
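As an illustration of that per-collection key-value layout (a simplified stand-in, not the actual VolatileMemoryStore source), the nested-dict idea looks like this; Record stands in for MemoryRecord:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Record:
    # Simplified stand-in for semantic_kernel's MemoryRecord.
    id: str
    text: str
    embedding: List[float] = field(default_factory=list)

class VolatileStore:
    # Collections map record ids to records: Dict[str, Dict[str, Record]].
    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, Record]] = {}

    def upsert(self, collection: str, record: Record) -> None:
        self._store.setdefault(collection, {})[record.id] = record

    def get(self, collection: str, key: str) -> Record:
        return self._store[collection][key]

store = VolatileStore()
store.upsert("aboutMe", Record("info1", "my name is Xiaodu"))
print(store.get("aboutMe", "info1").text)  # -> my name is Xiaodu
```

Because everything lives in a process-local dict, the data is gone when the process exits, which is exactly the "volatile" trade-off.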
from semantic_kernel.memory import VolatileMemoryStore
from semantic_kernel.core_skills import TextMemorySkill

kernel = sk.Kernel()

qf_chat_service = QianfanChatCompletion(ai_model_id="ERNIE-Bot")
qf_text_embedding = QianfanTextEmbedding(ai_model_id="Embedding-V1")

kernel.add_chat_service("chat-qf", qf_chat_service)
kernel.add_text_embedding_generation_service("embed-eb", qf_text_embedding)
kernel.register_memory_store(memory_store=VolatileMemoryStore())
kernel.import_skill(TextMemorySkill())
Call the async function to add the data. Here we add several pieces of personal information to a collection named aboutMe:
async def populate_memory(kernel: sk.Kernel) -> None:
    # Add some documents to the semantic memory
    await kernel.memory.save_information_async(collection="aboutMe", id="info1", text="我名字叫做小度")
    await kernel.memory.save_information_async(
        collection="aboutMe", id="info2", text="我工作在baidu"
    )
    await kernel.memory.save_information_async(
        collection="aboutMe", id="info3", text="我来自中国"
    )
    await kernel.memory.save_information_async(
        collection="aboutMe",
        id="info4",
        text="我曾去过北京,上海,深圳",
    )
    await kernel.memory.save_information_async(collection="aboutMe", id="info5", text="我爱打羽毛球")
Through the cosine-similarity computation implemented in TextMemoryBase, we can search for the matching answers:
async def search_memory_examples(kernel: sk.Kernel) -> None:
    questions = [
        "我的名字是?",
        "我在哪里工作?",
        "我去过哪些地方旅游?",
        "我的家乡是?",
        "我的爱好是?",
    ]
    for question in questions:
        print(f"Question: {question}")
        result = await kernel.memory.search_async("aboutMe", question)
        print(f"Answer: {result[0].text}\n")

await populate_memory(kernel)
await search_memory_examples(kernel)
[INFO] [02-19 16:28:48] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
Question: 我的名字是?
[INFO] [02-19 16:28:48] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
Answer: 我名字叫做小度
Question: 我在哪里工作?
[INFO] [02-19 16:28:49] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
Answer: 我工作在baidu
Question: 我去过哪些地方旅游?
[INFO] [02-19 16:28:49] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
Answer: 我曾去过北京,上海,深圳
Question: 我的家乡是?
[INFO] [02-19 16:28:50] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
Answer: 我来自中国
Question: 我的爱好是?
Answer: 我爱打羽毛球
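The ranking behind these answers can be sketched as a cosine-similarity scan over the stored vectors, keeping only hits above a relevance threshold (mirroring the min_relevance_score parameter used later). The vectors below are hand-written toy values, not real Embedding-V1 outputs:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy stored embeddings keyed by record id (hypothetical values).
records = {
    "info1": [1.0, 0.0, 0.0],
    "info3": [0.0, 1.0, 0.0],
    "info5": [0.9, 0.1, 0.0],
}
query = [1.0, 0.05, 0.0]

ranked = sorted(((cosine(query, v), k) for k, v in records.items()), reverse=True)
hits = [(score, k) for score, k in ranked if score >= 0.7]  # relevance cutoff
print(hits[0][1])  # -> info1
```

In the real flow the query string is first embedded, then compared against the vectors saved by save_information_async.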

Combining RAG with a chat scenario

How do we fuse an external knowledge base with a chat system? SK provides TextMemorySkill, which contains a recall function that takes an input and runs a similarity search on top of the kernel's Memory.
from typing import Tuple


async def setup_chat_with_memory(
    kernel: sk.Kernel,
) -> Tuple[sk.SKFunctionBase, sk.SKContext]:
    from semantic_kernel.core_skills import TextMemorySkill

    sk_prompt = """
你是一个问答机器人,你的背景资料如下,
背景资料:
- {{$fact1}}: {{recall $fact1}}
- {{$fact2}}: {{recall $fact2}}
- {{$fact3}}: {{recall $fact3}}
- {{$fact4}}: {{recall $fact4}}
- {{$fact5}}: {{recall $fact5}}
聊天记录:
{{$chat_history}}
回答以下当前输入: {{$user_input}}
回答:
""".strip()

    chat_func = kernel.create_semantic_function(sk_prompt, temperature=0.8)

    context = kernel.create_new_context()
    context["fact1"] = "名字是?"
    context["fact2"] = "哪里工作?"
    context["fact3"] = "去过哪些地方旅游?"
    context["fact4"] = "家乡是?"
    context["fact5"] = "爱好是?"
    context[sk.core_skills.TextMemorySkill.COLLECTION_PARAM] = "aboutMe"
    context[sk.core_skills.TextMemorySkill.RELEVANCE_PARAM] = "0.7"
    context["chat_history"] = ""

    return chat_func, context
Here RELEVANCE_PARAM specifies the retrieval relevance threshold, and COLLECTION_PARAM specifies the collection name. The recall in sk_prompt is a native function in TextMemorySkill, effectively a call to the memory's search_async.
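To show conceptually what the {{recall $factN}} placeholders do, the sketch below substitutes each placeholder with a looked-up fact before the prompt reaches the LLM. fake_memory and the keyword-based recall are hypothetical stand-ins for the embedding search TextMemorySkill actually performs:

```python
# Hypothetical in-memory facts; the real lookup is an embedding search.
fake_memory = {
    "名字": "我名字叫做小度",
    "工作": "我工作在baidu",
}

def recall(question: str) -> str:
    # Keyword match standing in for cosine-similarity recall.
    for key, fact in fake_memory.items():
        if key in question:
            return fact
    return ""

template = "背景资料:\n- {q1}: {a1}\n当前输入: {user}\n回答:"
rendered = template.format(q1="名字是?", a1=recall("名字是?"), user="你叫什么?")
print(rendered)
```

The rendered string is what create_semantic_function ultimately sends to the chat model, with the recalled facts already inlined.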
async def chat(kernel: sk.Kernel, chat_func: sk.SKFunctionBase, context: sk.SKContext) -> bool:
    try:
        user_input = input("用户:> ")
        context["user_input"] = user_input
        print(f"User:> {user_input}")
    except KeyboardInterrupt:
        print("\n\nExiting chat...")
        return False
    except EOFError:
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    print(context.variables)
    answer = await kernel.run_async(chat_func, input_vars=context.variables)
    context["chat_history"] += f"\n当前输入:> {user_input}\n回答:> {answer}\n"
    print(f"Bot:> {answer}")
    return True


await populate_memory(kernel)

# print("开始提问")
# await search_memory_examples(kernel)

print("构建prompt...")
chat_func, context = await setup_chat_with_memory(kernel)

print("开始对话 (type 'exit' to exit):\n")
chatting = True
while chatting:
    chatting = await chat(kernel, chat_func, context)
[INFO] [02-19 16:29:03] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
[INFO] [02-19 16:29:03] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
[INFO] [02-19 16:29:04] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
[INFO] [02-19 16:29:04] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
[INFO] [02-19 16:29:05] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
构建prompt...
开始对话 (type 'exit' to exit):
[INFO] [02-19 16:29:10] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
User:> 你的工作是什么
我叫["我名字叫做小度"],欢迎使用我的服务。
Bot:> 我是一名百度员工,负责回答用户的问题。
User:> exit
Exiting chat...

Adding external links to Memory

Often we have a large external knowledge base. Next we will use SK's VolatileMemoryStore to load external links; for example, let's add the Qianfan SDK repo:
github_files = {
    "https://github.com/baidubce/bce-qianfan-sdk/blob/main/README.md": "README: 千帆SDK介绍,安装,基础使用方法",
    "https://github.com/baidubce/bce-qianfan-sdk/blob/main/cookbook/finetune/trainer_finetune.ipynb": "Cookbook: 千帆SDK Trainer使用方法",
}
Unlike before, here save_reference_async stores the data and its reference (the source it came from) separately:
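Conceptually, a reference record carries the text plus provenance fields matching the arguments of save_reference_async; the dataclass below is a simplified stand-in (not SK's MemoryRecord), shown only to make the field roles explicit:

```python
from dataclasses import dataclass

@dataclass
class ReferenceRecord:
    # Simplified stand-in mirroring the save_reference_async arguments.
    collection: str
    text: str                  # the content that gets embedded
    description: str           # human-readable summary shown in results
    external_id: str           # where it came from, e.g. the GitHub URL
    external_source_name: str  # source system label, e.g. "GitHub"

rec = ReferenceRecord(
    collection="QianfanGithub",
    text="README: Qianfan SDK intro, installation, basic usage",
    description="README: Qianfan SDK intro, installation, basic usage",
    external_id="https://github.com/baidubce/bce-qianfan-sdk/blob/main/README.md",
    external_source_name="GitHub",
)
print(rec.external_id)
```

Keeping the source URL in external_id is what lets the search results later print a clickable reference instead of just the matched text.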
memory_collection_name = "QianfanGithub"

i = 0
for entry, value in github_files.items():
    await kernel.memory.save_reference_async(
        collection=memory_collection_name,
        description=value,
        text=value,
        external_id=entry,
        external_source_name="GitHub",
    )
    i += 1
    print("Added {} record(s)".format(i))
ask = "我希望整体了解千帆SDK,有什么办法?"
print("===========================\n" + "Query: " + ask + "\n")

results = await kernel.memory.search_async(memory_collection_name, ask, limit=5, min_relevance_score=0.7)

i = 0
for res in results:
    i += 1
    print(f"Result {i}:")
    print("  URL: : " + res.id)
    print("  Title : " + res.description)
    print("  Relevance: " + str(res.relevance))
    print()
[INFO] [02-19 16:29:52] openapi_requestor.py:275 [t:8406866752]: async requesting llm api endpoint: /embeddings/embedding-v1
===========================
Query: 我希望整体了解千帆SDK,有什么办法?
Result 1:
URL: : https://github.com/baidubce/bce-qianfan-sdk/blob/main/README.md
Title : README: 千帆SDK介绍,安装,基础使用方法
Relevance: 0.7502846678234273
Besides VolatileMemoryStore, we can also plug in an external vector database to host a large external knowledge base; SK officially ships implementations for common stores such as Chroma and Pinecone. Simply replacing the memory_store connects the kernel to Chroma:
from semantic_kernel.connectors.memory.chroma import (
    ChromaMemoryStore,
)

kernel.register_memory_store(
    memory_store=ChromaMemoryStore(
        persist_directory="./",
    )
)