Self-Healing RAG Systems: From Fragile Pipelines to Closed-Loop Agents in Practice

RAG systems have a long-standing problem in production: fragility. In a demo with carefully curated questions, the results look impressive. Once live, though, user questions come in every shape; the vector database returns documents that are semantically similar yet beside the point, and the LLM, ever eager to please, spins a plausible-sounding answer out of pure noise.

So where does the problem lie? Standard RAG is a textbook open-loop architecture: input → embed → retrieve → generate, straight through. Every stage assumes its upstream output is perfect, so once any step goes wrong, the error propagates all the way to the final answer.
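To make that fragility concrete, here is a minimal toy sketch of the open loop. The bag-of-words "embeddings" and template "LLM" are stand-ins invented for illustration; the point is structural: whatever retrieval returns flows straight into generation with no check in between.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def generate(query: str, context: list[str]) -> str:
    # Stand-in for the LLM call: it trusts whatever was retrieved.
    return f"Answer to '{query}' based on: {context[0]}"

def naive_rag(query: str, corpus: list[str]) -> str:
    # Open loop: embed -> retrieve -> generate, no validation anywhere.
    return generate(query, retrieve(query, corpus))
```

If `retrieve` surfaces an off-topic document, `generate` still produces a confident-looking answer from it; nothing in the chain can notice or recover.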

Enterprise-grade RAG requires moving to a closed-loop system, so-called self-healing RAG. The core idea is introspection: when the system detects a problem, it corrects itself rather than handing the error straight to the user.

Part 1: Automated Retrieval

RAG's first pitfall is actually the user. Nobody writes queries according to vector-search best practice: they use jargon and abbreviations, ask vague questions, or pack several questions into one. A self-healing system needs a guardrail on the input side that turns these raw queries into high-quality retrieval requests.

Strategy 1: Hypothetical Document Embeddings (HyDE)

Traditional retrieval matches a short question against long documents, e.g. searching whole passages of technical documentation with just the words "crag architecture". This modality mismatch badly hurts recall.

HyDE's idea: first have an LLM "make up" a hypothetical answer to the question, then run the vector search with that hypothetical answer. Because the hypothetical answer is closer in form to real documents, matching naturally improves.



The paper's excerpt illustrates how this works: HyDE handles a wide range of queries without modifying the underlying GPT-3 and Contriever/mContriever models.

For example:

User query: "How does the CRAG grader work?"

HyDE generates: "The CRAG grader works by evaluating the relevance of retrieved documents, scoring each one…" (fabricated content)

Vector search: search with the generated content instead of the original question

Code (hyde.py):

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine
from llama_index.llms.openai import OpenAI

# 1. Configure the LLM that generates the hypothetical documents
Settings.llm = OpenAI(model="gpt-4-turbo", temperature=0.7)

def build_hyde_engine(index):
    # Initialize the HyDE transform.
    # include_original=True searches both the original query and the hypothetical document
    hyde = HyDEQueryTransform(include_original=True)
    # Create a standard query engine
    base_query_engine = index.as_query_engine(similarity_top_k=5)
    # Wrap it with TransformQueryEngine: this middleware intercepts the query,
    # generates a hypothetical document, then runs the search
    hyde_engine = TransformQueryEngine(base_query_engine, query_transform=hyde)
    return hyde_engine

# Usage
# index = VectorStoreIndex.from_documents(docs)
# engine = build_hyde_engine(index)
# response = engine.query("Explain the self-correction mechanism in CRAG")

Strategy 2: Query Decomposition

Ask "which performs better on coding tasks, Llama-3 or GPT-4" and naive retrieval will struggle to find a single document containing comparison data for both models. Query decomposition splits such compound questions into atomic sub-queries: "Llama-3 coding ability" and "GPT-4 coding ability", retrieves each separately, then merges the results.

Code (query_decomposition.py):

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from typing import List

# Define the output structure
class SubQueries(BaseModel):
    """Sub-questions to retrieve."""
    questions: List[str] = Field(description="List of atomic sub-questions.")

# Configure the planner LLM
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)

system_prompt = """You are an expert researcher. Break down the user's complex query
into simple, atomic sub-queries that a search engine can answer."""

prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{query}")
])

# Build the chain
planner = prompt | llm.with_structured_output(SubQueries)

def plan_query(query: str):
    result = planner.invoke({"query": query})
    return result.questions

# Usage
# sub_qs = plan_query("Compare Llama-3 and GPT-4 on coding benchmarks")
# print(sub_qs)
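The snippet above only produces the sub-queries; the merge step after per-sub-query retrieval matters too, since naive concatenation can let one sub-question's results crowd out the other's. A minimal sketch of a rank-interleaving merge (illustrative, not from the original article):

```python
def merge_results(result_sets: list[list[str]]) -> list[str]:
    # Interleave the per-sub-query result lists and drop duplicates,
    # preserving each list's own ranking so every sub-question
    # stays represented near the top of the merged list.
    merged, seen = [], set()
    for rank in range(max(len(r) for r in result_sets)):
        for results in result_sets:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged
```

For example, `merge_results([["llama3_doc1", "shared_doc"], ["gpt4_doc1", "shared_doc"]])` keeps the top result of each sub-query first and the shared document only once.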

Part 2: The Control Layer

Once documents come back, how do you judge whether they are trustworthy? CRAG's answer is to add a "grader" to the pipeline that evaluates each retrieved document for relevance. If data quality is poor, the system does not plow ahead and generate anyway; it triggers a fallback (such as web search).



How the retrieval evaluator works: it assesses the relevance of retrieved documents to the input, estimates a confidence, and triggers different follow-up actions accordingly; the three states {Correct, Incorrect, Ambiguous} map to different processing paths.
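The LangGraph implementation later in this article simplifies this to a binary grade, but the three-state routing itself can be sketched as a small decision function. The thresholds here are illustrative assumptions, not values from the CRAG paper:

```python
def route(scores: list[float], upper: float = 0.7, lower: float = 0.3) -> str:
    """Map grader confidence scores to CRAG's three follow-up actions.

    Thresholds are illustrative placeholders, not from the paper.
    """
    best = max(scores, default=0.0)
    if best >= upper:
        return "generate"            # Correct: use retrieved docs as-is
    if best <= lower:
        return "web_search"          # Incorrect: discard and fall back
    return "refine_and_augment"      # Ambiguous: keep docs, add web results
```

Plugged into a graph framework, each returned string would name the next node to execute.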

This kind of branching decision logic is best expressed as a graph, and LangGraph is a natural fit.

The CRAG workflow:

  1. Retrieve: fetch candidate documents
  2. Grade: an LLM labels each document "relevant" or "not relevant"
  3. Decide: if relevant, generate the answer directly; if not, rewrite the query and search the web

Code (corrective_rag.py):

from typing import List, TypedDict
from pydantic import BaseModel, Field
from langchain_core.prompts import PromptTemplate
from langchain_core.documents import Document
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
from langgraph.graph import END, StateGraph, START

# --- 1. State definition ---
class GraphState(TypedDict):
    question: str
    generation: str
    web_search: str  # 'Yes' or 'No' flag
    documents: List

# --- 2. Component setup ---
grader_llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
generator_llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
web_tool = TavilySearchResults(max_results=3)

# Schema for the binary structured output
class Grade(BaseModel):
    score: str = Field(description="'yes' if the document is relevant, else 'no'")

# --- 3. Node definitions ---
def grade_documents(state):
    """
    Core self-healing node: filter out low-quality documents
    """
    print("---CHECK RELEVANCE---")
    question = state["question"]
    documents = state["documents"]
    # Binary structured output
    structured_llm = grader_llm.with_structured_output(Grade)
    prompt = PromptTemplate(
        template="""You are a grader assessing relevance.
Doc: {context}
Question: {question}
Return JSON with key 'score' as 'yes' or 'no'.""",
        input_variables=["context", "question"],
    )
    chain = prompt | structured_llm
    filtered_docs = []
    web_search = "No"
    for d in documents:
        grade = chain.invoke({"question": question, "context": d.page_content})
        if grade.score == "yes":
            filtered_docs.append(d)
        else:
            # A dropped document triggers the fallback
            web_search = "Yes"
    return {"documents": filtered_docs, "question": question, "web_search": web_search}

def transform_query(state):
    """
    Self-correction: rewrite the query to improve web search
    """
    print("---TRANSFORM QUERY---")
    question = state["question"]
    # Minimal rewrite chain
    prompt = PromptTemplate(template="Rewrite this for web search: {question}", input_variables=["question"])
    chain = prompt | generator_llm
    better_q = chain.invoke({"question": question}).content
    return {"question": better_q}

def web_search_node(state):
    print("---WEB SEARCH---")
    docs = web_tool.invoke({"query": state["question"]})
    # Append web results to the surviving documents
    web_results = [Document(page_content=d["content"]) for d in docs]
    return {"documents": state["documents"] + web_results}

def generate(state):
    print("---GENERATE---")
    # Plug a standard RAG generation chain in here
    # generation = rag_chain.invoke(...)
    return {"generation": "Final Answer Placeholder"}

# --- 4. Graph construction ---
workflow = StateGraph(GraphState)

# Nodes
workflow.add_node("retrieve", lambda x: {"documents": []})  # retrieval placeholder
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("transform_query", transform_query)
workflow.add_node("web_search_node", web_search_node)
workflow.add_node("generate", generate)

# Edges
workflow.add_edge(START, "retrieve")
workflow.add_edge("retrieve", "grade_documents")

def decide_to_generate(state):
    if state["web_search"] == "Yes":
        return "transform_query"
    return "generate"

workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {"transform_query": "transform_query", "generate": "generate"}
)
workflow.add_edge("transform_query", "web_search_node")
workflow.add_edge("web_search_node", "generate")
workflow.add_edge("generate", END)

app = workflow.compile()

Part 3: Automated Ranking

Vector retrieval uses a bi-encoder: fast, but limited in precision. Once a document is compressed into a single vector, many semantic details are lost. The fix is a cross-encoder for a second-pass rerank.



A cross-encoder takes the query and document together as a single input and directly outputs a relevance score. Since this is computationally expensive, a two-stage strategy is standard:

  1. Coarse: the vector store quickly recalls the top 50
  2. Fine: the cross-encoder rescores those 50 documents and keeps the top 5
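In skeleton form, the two stages are just two sorts with different scorers. Here is a library-free sketch with stand-in scoring functions (`coarse_score` playing the vector store, `fine_score` the cross-encoder; both are hypothetical placeholders):

```python
from typing import Callable, List

Scorer = Callable[[str, str], float]

def two_stage_search(query: str, corpus: List[str],
                     coarse_score: Scorer, fine_score: Scorer,
                     coarse_k: int = 50, fine_k: int = 5) -> List[str]:
    # Stage 1: cheap scorer over the whole corpus (stand-in for the vector DB)
    candidates = sorted(corpus, key=lambda d: coarse_score(query, d),
                        reverse=True)[:coarse_k]
    # Stage 2: expensive scorer only over the shortlist
    # (stand-in for the cross-encoder)
    return sorted(candidates, key=lambda d: fine_score(query, d),
                  reverse=True)[:fine_k]
```

The design point is that the expensive scorer never sees more than `coarse_k` documents, which is what keeps cross-encoder latency tolerable at query time.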

Code (reranker.py):

from sentence_transformers import CrossEncoder

class Reranker:
    def __init__(self):
        # Load a model fine-tuned on MS MARCO
        self.model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

    def rerank(self, query, documents, top_k=5):
        # Build pairs: [[query, doc1], [query, doc2], ...]
        pairs = [[query, doc] for doc in documents]
        # Score the batch
        scores = self.model.predict(pairs)
        # Sort and truncate
        results = sorted(zip(documents, scores), key=lambda x: x[1], reverse=True)
        return [doc for doc, score in results[:top_k]]

Part 4: Automated Learning

An advanced self-healing system does more than patch problems on the spot: it learns from past mistakes so the same pitfall is not hit twice. The mechanism is dynamic few-shot learning.

When the system produces a good answer (the user clicks thumbs-up), the query-answer pair goes into a dedicated "golden example" store. When a similar question arrives later, these success cases are retrieved and injected into the prompt, so the system's own successful experience guides new answers.

Code (dynamic_prompting.py):

from llama_index.core import VectorStoreIndex, Document
from llama_index.core.prompts import PromptTemplate

class LearningManager:
    def __init__(self):
        self.good_examples = []
        self.index = None

    def add_good_example(self, query, answer):
        """Called when the user gives a thumbs-up."""
        doc = Document(text=f"Q: {query}\nA: {answer}")
        self.good_examples.append(doc)
        # Rebuild the index (in production, prefer a vector store
        # with incremental updates)
        self.index = VectorStoreIndex.from_documents(self.good_examples)

    def get_dynamic_prompt(self, current_query):
        if not self.index:
            return ""
        # Retrieve similar past successes
        retriever = self.index.as_retriever(similarity_top_k=2)
        nodes = retriever.retrieve(current_query)
        examples_text = "\n\n".join([n.text for n in nodes])
        return f"Here are examples of how to answer correctly:\n{examples_text}"

# In the pipeline
# manager = LearningManager()
# few_shot_context = manager.get_dynamic_prompt(user_query)
# final_prompt = f"{few_shot_context}\n\nQuestion: {user_query}..."

Going Further: Automatic Optimization with DSPy

For a more programmatic approach to optimization, DSPy is a framework worth watching. It treats prompts as optimizable programs: it runs a validation set and, guided by metrics such as accuracy, automatically rewrites prompts and updates the few-shot examples.

import dspy

# 1. Define the RAG signature
class GenerateAnswer(dspy.Signature):
    """Answer questions with short, factual answers."""
    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField()

# 2. Define the module
class RAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=3)
        self.generate = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        context = self.retrieve(question).passages
        return self.generate(context=context, question=question)

# 3. Optimize
# MIPROv2 runs the pipeline, retries on failures and rewrites the instructions,
# maximizing the chosen metric (exact match, semantic similarity, etc.)
optimizer = dspy.MIPROv2(metric=dspy.evaluate.SemanticF1())
optimized_rag = optimizer.compile(RAG(), trainset=training_data)

Putting the Full System Together

All the components are ready: HyDE, query decomposition, CRAG, cross-encoder reranking, dynamic prompting. Now wire them into one complete self-healing RAG system. This orchestration layer coordinates the whole flow: parse the query, augment retrieval, validate the context, refine relevance, learn from feedback, and finally produce a stable, reliable answer.

import os
import json
import asyncio
from typing import List, Dict, Any, Optional
from datetime import datetime

# Component imports
from hyde import build_hyde_engine, Settings
from query_decomposition import plan_query, SubQueries
from corrective_rag import app as crag_app, GraphState
from reranker import Reranker
from dynamic_prompting import LearningManager

# Core dependencies
from llama_index.core import VectorStoreIndex, Document, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI
from langchain_openai import ChatOpenAI
from langchain_core.documents import Document as LCDocument
from langchain_core.prompts import PromptTemplate
from sentence_transformers import CrossEncoder


class SelfHealingRAGSystem:
    """
    Complete self-healing RAG system, integrating all components
    """

    def __init__(self, openai_api_key: Optional[str] = None):
        """Initialize the RAG system"""
        # API key configuration
        if openai_api_key:
            os.environ["OPENAI_API_KEY"] = openai_api_key

        print("Initializing Self-Healing RAG System...")

        # Core LLM
        self.llm = OpenAI(model="gpt-4-turbo", temperature=0.3)
        Settings.llm = self.llm

        # Components
        self.reranker = Reranker()
        self.learning_manager = LearningManager()
        self.vector_index = None
        self.hyde_engine = None

        # Demo data
        self.sample_documents = self._create_sample_documents()
        self._setup_vector_index()

        # Statistics
        self.query_stats = {
            "total_queries": 0,
            "hyde_used": 0,
            "decomposed_queries": 0,
            "crag_activated": 0,
            "reranked": 0,
            "learning_applied": 0
        }
        print("✅ System initialized successfully!")

    def _create_sample_documents(self) -> List[Document]:
        """Create sample documents for the demo"""
        sample_texts = [
            """Retrieval-Augmented Generation (RAG) is a technique that combines
            pre-trained language models with external knowledge retrieval. RAG systems
            retrieve relevant documents from a knowledge base and use them to generate
            more accurate and factual responses.""",
            """Corrective RAG (CRAG) introduces a self-correction mechanism that grades
            retrieved documents for relevance. If documents are deemed irrelevant, the
            system triggers alternative retrieval strategies like web search.""",
            """HyDE (Hypothetical Document Embeddings) improves retrieval by generating
            hypothetical documents that answer the query, then searching for real documents
            similar to these hypothetical ones.""",
            """Cross-encoder reranking provides more accurate document scoring compared
            to bi-encoder similarity search. It processes query-document pairs together
            to produce refined relevance scores.""",
            """DSPy enables automatic prompt optimization by treating prompts as programs
            that can be compiled and optimized against specific metrics like accuracy
            or semantic similarity.""",
            """Self-healing RAG systems implement feedback loops that learn from successful
            query-answer pairs, storing them as examples for future similar queries to
            improve performance over time.""",
            """Query decomposition breaks complex multi-part questions into atomic
            sub-queries that can be individually processed and then combined for
            comprehensive answers.""",
            """Vector databases enable semantic search by converting documents into
            high-dimensional embeddings that capture semantic meaning rather than
            just keyword matches."""
        ]
        return [Document(text=text, metadata={"id": i}) for i, text in enumerate(sample_texts)]

    def _setup_vector_index(self):
        """Build the vector index from the sample documents"""
        print("Setting up vector index...")
        self.vector_index = VectorStoreIndex.from_documents(self.sample_documents)
        self.hyde_engine = build_hyde_engine(self.vector_index)
        print("✅ Vector index ready!")

    def enhanced_retrieve(self, query: str, use_hyde: bool = True, top_k: int = 5) -> List[LCDocument]:
        """Enhanced retrieval with optional HyDE"""
        print(f"Retrieving documents for: '{query}'")
        if use_hyde:
            print("Using HyDE for enhanced retrieval...")
            response = self.hyde_engine.query(query)
            # Extract the source nodes from the HyDE response
            documents = response.source_nodes
            self.query_stats["hyde_used"] += 1
        else:
            print("Using standard retrieval...")
            retriever = self.vector_index.as_retriever(similarity_top_k=top_k)
            nodes = retriever.retrieve(query)
            documents = nodes
        # Convert to LangChain Document objects (page_content/metadata interface)
        docs = []
        for node in documents:
            doc = LCDocument(
                page_content=node.text if hasattr(node, 'text') else str(node),
                metadata=node.metadata if hasattr(node, 'metadata') else {}
            )
            docs.append(doc)
        print(f"✅ Retrieved {len(docs)} documents")
        return docs

    def decompose_and_retrieve(self, query: str) -> tuple[List[str], List[LCDocument]]:
        """Decompose a complex query and retrieve for each part"""
        print(f"Decomposing query: '{query}'")
        try:
            sub_queries = plan_query(query)
            if len(sub_queries) > 1:
                print(f"Decomposed into {len(sub_queries)} sub-queries:")
                for i, sq in enumerate(sub_queries, 1):
                    print(f"  {i}. {sq}")
                # Retrieve for each sub-query
                all_docs = []
                for sq in sub_queries:
                    docs = self.enhanced_retrieve(sq, use_hyde=False, top_k=3)
                    all_docs.extend(docs)
                self.query_stats["decomposed_queries"] += 1
                return sub_queries, all_docs
            else:
                print("➡️ Query doesn't need decomposition")
                docs = self.enhanced_retrieve(query)
                return [query], docs
        except Exception as e:
            print(f"⚠️ Error in decomposition: {e}")
            docs = self.enhanced_retrieve(query)
            return [query], docs

    def apply_crag(self, query: str, documents: List[LCDocument]) -> tuple[List[LCDocument], str]:
        """Apply CRAG to filter the documents"""
        print("Applying CRAG (Corrective RAG)...")
        try:
            # Prepare the CRAG state
            state = GraphState(
                question=query,
                generation="",
                web_search="No",
                documents=documents
            )
            # Normally the full CRAG workflow would run here;
            # simplified for the demo
            filtered_docs = []
            for doc in documents[:3]:  # demo limit
                # Naive relevance check (a real system would use an LLM grader)
                if any(keyword in doc.page_content.lower() for keyword in query.lower().split()):
                    filtered_docs.append(doc)
            if len(filtered_docs) < len(documents):
                self.query_stats["crag_activated"] += 1
                print(f"CRAG filtered {len(documents) - len(filtered_docs)} irrelevant documents")
            return filtered_docs, "Documents filtered by CRAG"
        except Exception as e:
            print(f"⚠️ Error in CRAG: {e}")
            return documents, "CRAG not applied due to error"

    def apply_reranking(self, query: str, documents: List[LCDocument], top_k: int = 3) -> List[LCDocument]:
        """Cross-encoder reranking"""
        print("Applying cross-encoder reranking...")
        try:
            # Extract the texts for reranking
            doc_texts = [doc.page_content for doc in documents]
            if len(doc_texts) > 1:
                reranked_texts = self.reranker.rerank(query, doc_texts, top_k)
                # Map back to Document objects
                reranked_docs = []
                for text in reranked_texts:
                    for doc in documents:
                        if doc.page_content == text:
                            reranked_docs.append(doc)
                            break
                self.query_stats["reranked"] += 1
                print(f"✅ Reranked to top {len(reranked_docs)} documents")
                return reranked_docs
            else:
                print("➡️ Not enough documents for reranking")
                return documents
        except Exception as e:
            print(f"⚠️ Error in reranking: {e}")
            return documents

    def apply_dynamic_prompting(self, query: str) -> str:
        """Dynamic few-shot learning"""
        print("Applying dynamic prompting...")
        try:
            few_shot_context = self.learning_manager.get_dynamic_prompt(query)
            if few_shot_context:
                self.query_stats["learning_applied"] += 1
                print("✅ Applied learned examples from previous successes")
            else:
                print("➡️ No relevant past examples found")
            return few_shot_context
        except Exception as e:
            print(f"⚠️ Error in dynamic prompting: {e}")
            return ""

    def generate_answer(self, query: str, documents: List[LCDocument], few_shot_context: str = "") -> str:
        """Generate an answer grounded in the retrieved documents"""
        print("✍️ Generating final answer...")
        # Merge the document contents
        context = "\n\n".join([doc.page_content for doc in documents[:3]])
        # Build the prompt, optionally with few-shot examples
        prompt_parts = []
        if few_shot_context:
            prompt_parts.append(few_shot_context)
        prompt_parts.extend([
            "Context:",
            context,
            f"\nQuestion: {query}",
            "\nAnswer based on the provided context:"
        ])
        prompt = "\n".join(prompt_parts)
        try:
            response = self.llm.complete(prompt)
            answer = response.text.strip()
            print("✅ Answer generated successfully")
            return answer
        except Exception as e:
            print(f"⚠️ Error generating answer: {e}")
            return f"I apologize, but I encountered an error generating an answer: {e}"

    def full_pipeline(self, query: str, user_feedback: Optional[bool] = None,
                      previous_answer: Optional[str] = None) -> Dict[str, Any]:
        """
        Run the complete self-healing RAG pipeline
        """
        start_time = datetime.now()
        print("\nStarting Self-Healing RAG Pipeline")
        print(f"Query: '{query}'")
        print("=" * 60)
        self.query_stats["total_queries"] += 1

        # Step 1: query enhancement
        sub_queries, documents = self.decompose_and_retrieve(query)
        # Step 2: document validation (CRAG)
        filtered_docs, crag_status = self.apply_crag(query, documents)
        # Step 3: reranking
        reranked_docs = self.apply_reranking(query, filtered_docs)
        # Step 4: dynamic prompting
        few_shot_context = self.apply_dynamic_prompting(query)
        # Step 5: answer generation
        answer = self.generate_answer(query, reranked_docs, few_shot_context)
        # Step 6: learning (when feedback is provided)
        if user_feedback is True and previous_answer:
            try:
                self.learning_manager.add_good_example(query, previous_answer)
                print("Added successful example to learning system")
            except Exception as e:
                print(f"⚠️ Error adding to learning system: {e}")

        end_time = datetime.now()
        processing_time = (end_time - start_time).total_seconds()
        result = {
            "query": query,
            "sub_queries": sub_queries,
            "documents_found": len(documents),
            "documents_filtered": len(filtered_docs),
            "final_documents": len(reranked_docs),
            "answer": answer,
            "crag_status": crag_status,
            "processing_time": processing_time,
            "components_used": self._get_components_used()
        }
        print("\n" + "=" * 60)
        print(f"✅ Pipeline completed in {processing_time:.2f} seconds")
        print(f"Documents: {len(documents)} → {len(filtered_docs)} → {len(reranked_docs)}")
        return result

    def _get_components_used(self) -> List[str]:
        """List the components used so far (cumulative across queries)"""
        components = ["Vector Retrieval"]
        if self.query_stats["hyde_used"] > 0:
            components.append("HyDE")
        if self.query_stats["decomposed_queries"] > 0:
            components.append("Query Decomposition")
        if self.query_stats["crag_activated"] > 0:
            components.append("CRAG")
        if self.query_stats["reranked"] > 0:
            components.append("Cross-Encoder Reranking")
        if self.query_stats["learning_applied"] > 0:
            components.append("Dynamic Prompting")
        return components

    def get_system_stats(self) -> Dict[str, Any]:
        """System-wide statistics"""
        return {
            "total_queries": self.query_stats["total_queries"],
            "hyde_usage_rate": f"{(self.query_stats['hyde_used'] / max(1, self.query_stats['total_queries']) * 100):.1f}%",
            "decomposition_rate": f"{(self.query_stats['decomposed_queries'] / max(1, self.query_stats['total_queries']) * 100):.1f}%",
            "crag_activation_rate": f"{(self.query_stats['crag_activated'] / max(1, self.query_stats['total_queries']) * 100):.1f}%",
            "reranking_rate": f"{(self.query_stats['reranked'] / max(1, self.query_stats['total_queries']) * 100):.1f}%",
            "learning_rate": f"{(self.query_stats['learning_applied'] / max(1, self.query_stats['total_queries']) * 100):.1f}%",
            "learned_examples": len(self.learning_manager.good_examples)
        }


def demo_interactive_session():
    """Interactive demo"""
    print("""
Self-Healing RAG System Demo
================================
This system demonstrates:
• HyDE: Hypothetical Document Embeddings
• Query Decomposition: Breaking complex queries
• CRAG: Corrective RAG with document grading
• Cross-Encoder Reranking: Precision ranking
• Dynamic Learning: Few-shot from success examples
""")
    # Initialize the system
    system = SelfHealingRAGSystem()
    # Demo queries
    demo_queries = [
        "What is RAG and how does it work?",
        "Compare HyDE and standard retrieval methods",
        "How does CRAG improve retrieval quality and what are the benefits of cross-encoder reranking?",
        "Explain the self-correction mechanisms in modern RAG systems",
        "What are the advantages of DSPy optimization for prompts?"
    ]
    print("Running Demo Queries...")
    print("=" * 50)
    results = []
    for i, query in enumerate(demo_queries, 1):
        print(f"\nDemo Query {i}/{len(demo_queries)}")
        result = system.full_pipeline(query)
        results.append(result)
        print("\nAnswer:")
        print(f"{result['answer']}")
        print(f"\nComponents Used: {', '.join(result['components_used'])}")
        # Simulate positive feedback for learning
        if i > 1:  # start feeding back from the second query on
            system.full_pipeline(query, user_feedback=True, previous_answer=result['answer'])
    # Final statistics
    print("\n" + "=" * 60)
    print("SYSTEM PERFORMANCE STATISTICS")
    print("=" * 60)
    stats = system.get_system_stats()
    for key, value in stats.items():
        print(f"{key.replace('_', ' ').title()}: {value}")
    return system, results


if __name__ == "__main__":
    # Set your OpenAI API key
    # os.environ["OPENAI_API_KEY"] = "your-key-here"
    demo_interactive_session()

Summary

Moving from classic RAG to self-healing RAG is, at bottom, an upgrade from "retrieval" to "reasoning". HyDE and query decomposition make sure the right question gets asked; CRAG and the cross-encoder make sure the right documents get read; the automated learning mechanism keeps the system from repeating the same mistakes. Taken together, these give a RAG system a qualitative jump in generalization.

https://avoid.overfit.cn/post/d95478d7799646acbed0e0d2dc2c480d

Author: Subrata Samanta

