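The LlamaIndex Framework

LlamaIndex (formerly GPT Index) is a framework focused on data connection and retrieval-augmented generation (RAG). It provides data indexing and querying capabilities and is well suited to applications such as knowledge-base question answering and document analysis.

1. Core Principles

1.1 LlamaIndex Design Philosophy

The core design idea behind LlamaIndex is "data first":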

┌─────────────────────────────────────────────────────────────┐
│                LlamaIndex Design Philosophy                 │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Core mission: connect your data to large language models  │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                     Data layer                      │   │
│   │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐    │   │
│   │  │   PDF   │ │   Web   │ │   SQL   │ │   API   │    │   │
│   │  │  Docs   │ │  Pages  │ │   DB    │ │  Data   │    │   │
│   │  └─────────┘ └─────────┘ └─────────┘ └─────────┘    │   │
│   └──────────────────────────┬──────────────────────────┘   │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │           Index layer (index structures)            │   │
│   │  • VectorStoreIndex (vector index)                  │   │
│   │  • SummaryIndex (list index, formerly ListIndex)    │   │
│   │  • TreeIndex (tree index)                           │   │
│   │  • KeywordTableIndex (keyword index)                │   │
│   └──────────────────────────┬──────────────────────────┘   │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │             Query layer (Query Engine)              │   │
│   │  • Retrieve relevant nodes                          │   │
│   │  • Build the prompt                                 │   │
│   │  • Generate the answer                              │   │
│   └──────────────────────────┬──────────────────────────┘   │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │               Output layer (Response)               │   │
│   │  • Answer text                                      │   │
│   │  • Source references (source nodes)                 │   │
│   │  • Similarity score of each source node             │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

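1.2 Core Concepts Compared

Concept	Role	Analogy
Document	Raw unit of source data	A page of a book
Node	A chunk produced by splitting a document	A paragraph on a page
Index	Structure that organizes the nodes	The book's table of contents
Retriever	Fetches the nodes relevant to a query	Looking up the relevant chapters
Query Engine	Runs the whole query pipeline	Reading the material, then answering
Chat Engine	Conversational, multi-turn interaction	Holding a conversation with the reader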

1.3 Positioning Relative to LangChain

┌─────────────────────────────────────────────────────────────┐
│                    Framework Positioning                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                     ┌─────────────────┐                     │
│                     │Application layer│                     │
│                     └────────┬────────┘                     │
│          ┌───────────────────┼───────────────────┐          │
│          ↓                   ↓                   ↓          │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │    LangChain    │ │   LlamaIndex    │ │    LangGraph    │ │
│ │                 │ │                 │ │                 │ │
│ │• General-purpose│ │• RAG-focused    │ │• Workflows      │ │
│ │• Agents         │ │• Data connectors│ │• State machines │ │
│ │• Chains         │ │• Index building │ │• Graph          │ │
│ │                 │ │                 │ │  orchestration  │ │
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
│          └───────────────────┼───────────────────┘          │
│                     ┌────────┴────────┐                     │
│                     │    LLM layer    │                     │
│                     └─────────────────┘                     │
│                                                             │
│   Recommendations:                                          │
│   • Pure RAG application → LlamaIndex                       │
│   • Complex agents → LangChain + LangGraph                  │
│   • Enterprise knowledge base → LlamaIndex + LangChain      │
│                                                             │
└─────────────────────────────────────────────────────────────┘

2. Core Components in Detail

2.1 Data Loading (Data Loaders)

LlamaIndex provides a rich set of data connectors:

┌─────────────────────────────────────────────────────────────┐
│                    Data Loader Ecosystem                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                   Document types                    │   │
│   │  ┌──────────┐ ┌──────────┐ ┌──────────┐             │   │
│   │  │   PDF    │ │   Docx   │ │ Markdown │             │   │
│   │  │  Loader  │ │  Loader  │ │  Loader  │             │   │
│   │  └──────────┘ └──────────┘ └──────────┘             │   │
│   │  ┌──────────┐ ┌──────────┐ ┌──────────┐             │   │
│   │  │   CSV    │ │   JSON   │ │   PPTX   │             │   │
│   │  │  Loader  │ │  Loader  │ │  Loader  │             │   │
│   │  └──────────┘ └──────────┘ └──────────┘             │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                    Data sources                     │   │
│   │  ┌──────────┐ ┌──────────┐ ┌──────────┐             │   │
│   │  │  Notion  │ │  Slack   │ │ Discord  │             │   │
│   │  │  Loader  │ │  Loader  │ │  Loader  │             │   │
│   │  └──────────┘ └──────────┘ └──────────┘             │   │
│   │  ┌──────────┐ ┌──────────┐ ┌──────────┐             │   │
│   │  │   Web    │ │ Database │ │ S3/Blob  │             │   │
│   │  │  Loader  │ │  Loader  │ │  Loader  │             │   │
│   │  └──────────┘ └──────────┘ └──────────┘             │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   200+ data connectors available                            │
│   Install: pip install llama-index-readers-<name>           │
│                                                             │
└─────────────────────────────────────────────────────────────┘

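Code example:

"""
LlamaIndex data loading example.
Shows how to load data from several kinds of sources.

Besides llama-index-core, the readers below come from separate packages:
llama-index-readers-file, llama-index-readers-web, llama-index-readers-database.
"""

from llama_index.core import Document, SimpleDirectoryReader
from llama_index.readers.file import PyMuPDFReader
from llama_index.readers.web import SimpleWebPageReader
from llama_index.readers.database import DatabaseReader


# ========== 1. Load documents from a directory ==========

def load_from_directory(directory_path: str):
    """Load every supported document under a directory."""
    reader = SimpleDirectoryReader(
        input_dir=directory_path,
        required_exts=[".pdf", ".txt", ".md", ".docx"],
        recursive=True,  # also read subdirectories
        exclude=["*.tmp", "*.bak"]
    )
    documents = reader.load_data()
    return documents


# ========== 2. Load a PDF document ==========

def load_pdf(pdf_path: str):
    """Load a single PDF file."""
    reader = PyMuPDFReader()
    # load_data() is the common reader entry point
    documents = reader.load_data(file_path=pdf_path)
    return documents


# ========== 3. Load web page content ==========

def load_webpage(url: str):
    """Load the content of a web page."""
    # html_to_text=True converts HTML to plain text (needs the html2text package)
    reader = SimpleWebPageReader(html_to_text=True)
    documents = reader.load_data(urls=[url])
    return documents


# ========== 4. Load from a database ==========

def load_from_database(
    connection_string: str,
    query: str
):
    """Load rows from a database as documents."""
    # The first positional parameter of DatabaseReader is not a URI,
    # so the connection string must be passed by keyword.
    reader = DatabaseReader(uri=connection_string)
    documents = reader.load_data(query=query)
    return documents


# ========== 5. Create a custom document ==========

def create_custom_document():
    """Create a document by hand."""
    document = Document(
        text="This is a piece of custom document content...",
        metadata={
            "source": "custom",
            "author": "AI",
            "date": "2024-01-01",
            "category": "technical documentation"
        }
    )
    return document


# ========== Usage example ==========

if __name__ == "__main__":
    # Load the documents in a directory
    docs = load_from_directory("./documents")
    print(f"Loaded {len(docs)} documents")

    for doc in docs[:3]:  # print information about the first 3 documents
        print(f"- {doc.metadata.get('file_name', 'unknown')}")
        print(f"  Preview: {doc.text[:100]}...")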
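To make the `required_exts`, `recursive`, and `exclude` options used above more concrete, here is a standard-library-only sketch of that kind of file filtering. It is an illustration under assumptions, not LlamaIndex's implementation: `iter_matching_files` is a hypothetical helper, and the real `SimpleDirectoryReader` may treat hidden files and glob patterns differently.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterable, Iterator


def iter_matching_files(
    input_dir: str,
    required_exts: Iterable[str],
    exclude: Iterable[str] = (),
    recursive: bool = True,
) -> Iterator[Path]:
    """Yield files under input_dir that pass the extension and exclude filters."""
    exts = {ext.lower() for ext in required_exts}
    patterns = list(exclude)
    root = Path(input_dir)
    # rglob walks subdirectories; glob only looks at the top level
    candidates = root.rglob("*") if recursive else root.glob("*")
    for path in sorted(candidates):
        if not path.is_file():
            continue  # skip directories
        if path.suffix.lower() not in exts:
            continue  # extension not in the allow-list
        if any(fnmatch(path.name, pattern) for pattern in patterns):
            continue  # file name matches an exclude pattern
        yield path
```

Calling `list(iter_matching_files("./documents", [".md", ".txt"], exclude=["*.bak"]))` would return the matching paths in sorted order, which is the set of files a loader would then parse into documents.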

2.2 Document Processing (Node Parsing)

┌─────────────────────────────────────────────────────────────┐
│                  Document Processing Flow                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   Source documents                                          │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Document 1: a very long piece of text...           │   │
│   │  Document 2: the content of another document...     │   │
│   └──────────────────────────┬──────────────────────────┘   │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                     Node Parser                     │   │
│   │                                                     │   │
│   │  Splitting strategies:                              │   │
│   │  • SentenceSplitter (sentence boundaries)           │   │
│   │  • TokenTextSplitter (token count)                  │   │
│   │  • SemanticSplitterNodeParser (semantic breaks)     │   │
│   │  • MarkdownNodeParser (Markdown structure)          │   │
│   │  • JSONNodeParser (JSON structure)                  │   │
│   │                                                     │   │
│   │  Parameters:                                        │   │
│   │  • chunk_size: chunk size in tokens                 │   │
│   │  • chunk_overlap: overlap between adjacent chunks   │   │
│   │  • include_metadata: keep metadata                  │   │
│   └──────────────────────────┬──────────────────────────┘   │
│                              ↓                              │
│   Node list                                                 │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Node 1: a very long piece...                       │   │
│   │  Node 2: piece of text... (overlaps Node 1)         │   │
│   │  Node 3: the content of...                          │   │
│   │  Node 4: of another document... (overlaps Node 3)   │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Code example:

"""
LlamaIndex document processing example.
Shows several splitting strategies.
"""

import tiktoken
from llama_index.core.node_parser import (
    SentenceSplitter,
    TokenTextSplitter,
    SemanticSplitterNodeParser,
    MarkdownNodeParser
)
from llama_index.embeddings.openai import OpenAIEmbedding


# ========== 1. Sentence splitter ==========

def sentence_split(documents):
    """Split documents while respecting sentence boundaries."""
    splitter = SentenceSplitter(
        chunk_size=512,       # chunk size (in tokens, not characters)
        chunk_overlap=50,     # overlap between adjacent chunks (tokens)
        paragraph_separator="\n\n"
    )
    nodes = splitter.get_nodes_from_documents(documents)
    return nodes


# ========== 2. Token splitter ==========

def token_split(documents):
    """Split documents by token count."""
    splitter = TokenTextSplitter(
        chunk_size=256,       # chunk size (tokens)
        chunk_overlap=20,     # overlap (tokens)
        # tokenizer must be a callable, not an encoding name;
        # cl100k_base is the encoding used by GPT-4
        tokenizer=tiktoken.get_encoding("cl100k_base").encode
    )
    nodes = splitter.get_nodes_from_documents(documents)
    return nodes


# ========== 3. Semantic splitter ==========

def semantic_split(documents):
    """Split documents at semantic breakpoints."""
    embed_model = OpenAIEmbedding()  # requires OPENAI_API_KEY
    splitter = SemanticSplitterNodeParser(
        buffer_size=1,        # number of sentences grouped when comparing
        breakpoint_percentile_threshold=95,  # split threshold
        embed_model=embed_model
    )
    nodes = splitter.get_nodes_from_documents(documents)
    return nodes


# ========== 4. Markdown parser ==========

def markdown_parse(documents):
    """Parse documents along their Markdown structure."""
    parser = MarkdownNodeParser()
    nodes = parser.get_nodes_from_documents(documents)
    return nodes


# ========== 5. Hierarchical parsing ==========

def hierarchical_parse(documents):
    """Hierarchical parsing: large chunks are split again into smaller ones."""
    from llama_index.core.node_parser import HierarchicalNodeParser

    parser = HierarchicalNodeParser.from_defaults(
        chunk_sizes=[2048, 512, 128]  # three levels: large, medium, small
    )
    nodes = parser.get_nodes_from_documents(documents)
    return nodes


# ========== Usage example ==========

if __name__ == "__main__":
    from llama_index.core import SimpleDirectoryReader

    # Load the documents
    documents = SimpleDirectoryReader("./documents").load_data()

    # Use semantic splitting
    nodes = semantic_split(documents)

    print(f"Split into {len(nodes)} nodes")
    for i, node in enumerate(nodes[:3]):
        print(f"\nNode {i+1}:")
        print(f"  Length: {len(node.text)}")
        print(f"  Content: {node.text[:100]}...")
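The interaction between `chunk_size` and `chunk_overlap` is easier to see in a toy version. The sketch below is a simplification under assumptions: it slides a fixed window over an already tokenized list, while the real splitters also respect sentence and paragraph boundaries. `split_with_overlap` is a hypothetical helper, not a LlamaIndex function.

```python
from typing import List


def split_with_overlap(
    tokens: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Slide a window of chunk_size tokens, reusing chunk_overlap tokens each step."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the input
    return chunks


if __name__ == "__main__":
    for chunk in split_with_overlap(list("abcdefghij"), chunk_size=4, chunk_overlap=1):
        print("".join(chunk))
```

Each chunk starts `chunk_size - chunk_overlap` tokens after the previous one, so a larger overlap preserves more context across chunk borders at the cost of more nodes to embed and store.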

2.3 Index Types

┌─────────────────────────────────────────────────────────────┐
│                    Index Types Compared                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  VectorStoreIndex (vector index): most common       │   │
│   │                                                     │   │
│   │    Query → Embedding → Similarity search            │   │
│   │                ↓                                    │   │
│   │        Top-K similar nodes                          │   │
│   │                                                     │   │
│   │  Traits: semantic similarity search; suits Q&A      │   │
│   │  Use for: knowledge-base Q&A, document retrieval    │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  SummaryIndex (list index, formerly ListIndex)      │   │
│   │                                                     │   │
│   │    Query → visit every node → summarize answer      │   │
│   │    Node1 → Node2 → Node3 → ... → Summary            │   │
│   │                                                     │   │
│   │  Traits: reads all content; suits summarization     │   │
│   │  Use for: document summaries, global analysis       │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  TreeIndex (tree index)                             │   │
│   │                                                     │   │
│   │                  Root Summary                       │   │
│   │                   /        \                        │   │
│   │             Summary1      Summary2                  │   │
│   │              /    \        /    \                   │   │
│   │             N1    N2      N3    N4                  │   │
│   │                                                     │   │
│   │  Traits: hierarchical; suits large document sets    │   │
│   │  Use for: large corpora, multi-topic documents      │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  KeywordTableIndex (keyword index)                  │   │
│   │                                                     │   │
│   │    Query → Extract keywords → Look up table         │   │
│   │    Keyword table:                                   │   │
│   │    "AI" → [Node1, Node3, Node5]                     │   │
│   │    "ML" → [Node2, Node4]                            │   │
│   │                                                     │   │
│   │  Traits: exact keyword matching; suits topic lookups│   │
│   │  Use for: terminology lookup, exact matching        │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
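The two most different lookup styles in the diagram, vector similarity and keyword table, can be sketched with the standard library alone. This is a toy illustration under assumptions: the vectors are hand-written rather than produced by an embedding model, and `top_k_similar` and `build_keyword_table` are hypothetical helpers, not LlamaIndex APIs.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(
    query_vec: Sequence[float],
    node_vecs: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Vector-index style lookup: score every node, keep the k best."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def build_keyword_table(node_keywords: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keyword-index style structure: map each keyword to the nodes containing it."""
    table: Dict[str, List[str]] = defaultdict(list)
    for node_id, keywords in node_keywords.items():
        for keyword in keywords:
            table[keyword.lower()].append(node_id)
    return dict(table)


if __name__ == "__main__":
    vectors = {"n1": [1.0, 0.0], "n2": [0.0, 1.0], "n3": [1.0, 1.0]}
    print(top_k_similar([1.0, 0.0], vectors, k=2))
    print(build_keyword_table({"n1": ["AI"], "n2": ["ML"], "n3": ["AI", "ML"]}))
```

The vector lookup ranks every node by closeness to the query, so it tolerates paraphrase; the keyword table only returns nodes that contain the exact term, which is why the two are often combined.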

Code example:

"""
LlamaIndex index creation example.
Shows how to create the different index types.
"""

from llama_index.core import (
    VectorStoreIndex,
    SummaryIndex,  # current name of the former ListIndex
    TreeIndex,
    KeywordTableIndex,
    SimpleDirectoryReader
)
from llama_index.core.storage.storage_context import StorageContext
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb


# ========== 1. Vector index (most common) ==========

def create_vector_index(documents, persist_dir=None):
    """Create a vector index."""
    index = VectorStoreIndex.from_documents(
        documents,
        show_progress=True
    )

    # Persist to disk
    if persist_dir:
        index.storage_context.persist(persist_dir=persist_dir)

    return index


# ========== 2. Use a vector database ==========

def create_vector_index_with_chroma(documents, collection_name="my_docs"):
    """Create an index backed by the Chroma vector database."""
    # Initialize the Chroma client
    db = chromadb.PersistentClient(path="./chroma_db")
    chroma_collection = db.get_or_create_collection(collection_name)

    # Create the vector store
    vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    # Create the index
    index = VectorStoreIndex.from_documents(
        documents,
        storage_context=storage_context,
        show_progress=True
    )

    return index


# ========== 3. List (summary) index ==========

def create_list_index(documents):
    """Create a list index (suited to summarization)."""
    index = SummaryIndex.from_documents(documents)
    return index


# ========== 4. Tree index ==========

def create_tree_index(documents):
    """Create a tree index (suited to large document sets)."""
    # Building the tree calls the LLM to summarize child nodes
    index = TreeIndex.from_documents(
        documents,
        show_progress=True
    )
    return index


# ========== 5. Keyword index ==========

def create_keyword_index(documents):
    """Create a keyword index."""
    # Keyword extraction also uses the LLM
    index = KeywordTableIndex.from_documents(documents)
    return index


# ========== Usage example ==========

if __name__ == "__main__":
    # Load the documents
    documents = SimpleDirectoryReader("./documents").load_data()

    # Create a vector index
    index = create_vector_index(documents, persist_dir="./storage")

    # Create a query engine
    query_engine = index.as_query_engine()

    # Query
    response = query_engine.query("What is machine learning?")
    print(f"Answer: {response}")

2.4 Query Engine

┌─────────────────────────────────────────────────────────────┐
│                    Query Engine Workflow                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                      ┌───────────────┐                      │
│                      │  User query   │                      │
│                      └───────┬───────┘                      │
│                              ↓                              │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                    Query Engine                     │   │
│   │                                                     │   │
│   │  1. Retrieval                                       │   │
│   │     Retriever:                                      │   │
│   │     • Vector index: vector similarity search        │   │
│   │     • Keyword index: keyword matching               │   │
│   │     • Hybrid: fusion of multiple retrievers         │   │
│   │                          ↓                          │   │
│   │  2. Post-processing                                 │   │
│   │     • Reranking                                     │   │
│   │     • Filtering                                     │   │
│   │     • Compression                                   │   │
│   │                          ↓                          │   │
│   │  3. Synthesis                                       │   │
│   │     Response Synthesizer:                           │   │
│   │     • Compact: pack context, then generate          │   │
│   │     • Refine: iteratively refine the answer         │   │
│   │     • Tree Summarize: summarize as a tree           │   │
│   └──────────────────────────┬──────────────────────────┘   │
│                              ↓                              │
│                      ┌───────────────┐                      │
│                      │ Final response│                      │
│                      └───────────────┘                      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
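The three stages in the diagram can be read as three function calls. The sketch below wires them together in plain Python; it is a conceptual outline under assumptions, with `retrieve` and `generate` passed in as stand-ins for a real retriever and LLM, and `ScoredNode`, `postprocess`, `build_prompt`, and `run_query` all hypothetical names rather than LlamaIndex classes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScoredNode:
    text: str
    score: float


def postprocess(
    nodes: List[ScoredNode], similarity_cutoff: float, top_n: int
) -> List[ScoredNode]:
    """Stage 2: drop weak matches, then keep the top_n highest-scoring nodes."""
    kept = [node for node in nodes if node.score >= similarity_cutoff]
    kept.sort(key=lambda node: node.score, reverse=True)
    return kept[:top_n]


def build_prompt(question: str, nodes: List[ScoredNode]) -> str:
    """Stage 3 (first half): place the surviving context ahead of the question."""
    context = "\n\n".join(node.text for node in nodes)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def run_query(
    question: str,
    retrieve: Callable[[str], List[ScoredNode]],
    generate: Callable[[str], str],
    similarity_cutoff: float = 0.7,
    top_n: int = 3,
) -> str:
    """Retrieval -> post-processing -> synthesis, in that order."""
    candidates = retrieve(question)                            # stage 1
    nodes = postprocess(candidates, similarity_cutoff, top_n)  # stage 2
    return generate(build_prompt(question, nodes))             # stage 3
```

Passing the stages in as callables keeps the outline testable without a model: a fake `retrieve` that returns fixed nodes and a `generate` that echoes its prompt are enough to check which context reaches the LLM.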

Code example:

"""
LlamaIndex query engine example.
Shows several query modes and configurations.
"""

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.response_synthesizers import (
    CompactAndRefine,  # synthesizer behind the "compact" mode
    TreeSummarize,
    Refine
)
from llama_index.core.postprocessor import (
    SimilarityPostprocessor,
    LongContextReorder,
    SentenceTransformerRerank
)


# ========== 1. Basic query ==========

def basic_query(index):
    """Basic query mode."""
    query_engine = index.as_query_engine()
    response = query_engine.query("What is machine learning?")
    return response


# ========== 2. Advanced retrieval configuration ==========

def advanced_retrieval(index):
    """Advanced retrieval configuration."""
    query_engine = index.as_query_engine(
        # Retrieval settings
        similarity_top_k=5,           # retrieve the top-5 most similar nodes
        sparse_top_k=10,              # sparse top-10; only used by vector stores
                                      # that support hybrid query mode

        # Response mode
        response_mode="compact",       # compact, refine, tree_summarize

        # Streaming output
        streaming=True
    )
    return query_engine


# ========== 3. Query with post-processing ==========

def query_with_postprocessing(index):
    """Query with node post-processors (applied in list order)."""
    query_engine = index.as_query_engine(
        similarity_top_k=10,

        # List of post-processors
        node_postprocessors=[
            # Similarity filter
            SimilarityPostprocessor(similarity_cutoff=0.7),

            # Reranking model (needs the sentence-transformers package)
            SentenceTransformerRerank(
                model="cross-encoder/ms-marco-MiniLM-L-2-v2",
                top_n=5
            ),

            # Long-context reordering; runs last so that the reranker
            # does not undo the ordering it produces
            LongContextReorder()
        ]
    )
    return query_engine


# ========== 4. Custom response synthesizer ==========

def custom_synthesizer(index):
    """Choose the response synthesis mode."""
    from llama_index.core.response_synthesizers import ResponseMode

    # Use refine mode (iterative refinement)
    query_engine = index.as_query_engine(
        response_mode=ResponseMode.REFINE
    )

    return query_engine


# ========== 5. Multi-document query ==========

def multi_document_query(indices):
    """Multi-document query (across indexes); left as a stub."""
    # ComposableGraph is the class this module exports
    from llama_index.core.indices.composability import ComposableGraph

    # Combine several indexes
    # ... build the composed graph

    # Run the query
    # query_engine = ComposableGraphQueryEngine(...)
    pass


# ========== 6. Streaming query ==========

def streaming_query(index):
    """Streaming query (synchronous; response_gen is a plain generator)."""
    query_engine = index.as_query_engine(streaming=True)

    response = query_engine.query("Describe the history of machine learning in detail")

    # Print tokens as they arrive
    for text in response.response_gen:
        print(text, end="", flush=True)


# ========== Usage example ==========

if __name__ == "__main__":
    # Load the documents and create the index
    documents = SimpleDirectoryReader("./documents").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Create the advanced query engine
    query_engine = query_with_postprocessing(index)

    # Run the query
    response = query_engine.query("What is deep learning, and how does it relate to machine learning?")

    print(f"Answer: {response}")
    print(f"\nNumber of source nodes: {len(response.source_nodes)}")

    for i, node in enumerate(response.source_nodes[:3]):
        print(f"\nSource {i+1}:")
        print(f"  File: {node.node.metadata.get('file_name', 'unknown')}")
        # score can be None for some retrievers, so guard the format spec
        score = f"{node.score:.4f}" if node.score is not None else "n/a"
        print(f"  Similarity: {score}")
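The difference between the `refine` and `compact` response modes comes down to how many LLM calls are made: `refine` visits the retrieved chunks one at a time, while `compact` first packs as many chunks as fit into each prompt. A minimal sketch of that packing step follows, under the assumption that the budget is measured in characters (the real synthesizer works with the model's token limit); `pack_chunks` is a hypothetical helper.

```python
from typing import List


def pack_chunks(chunks: List[str], max_chars: int, separator: str = "\n\n") -> List[str]:
    """Greedily merge consecutive chunks so each merged block stays within max_chars."""
    packed: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + separator + chunk
        if len(candidate) <= max_chars or not current:
            # Still fits, or the block is empty (an oversized chunk is kept alone)
            current = candidate
        else:
            packed.append(current)
            current = chunk
    if current:
        packed.append(current)
    return packed


if __name__ == "__main__":
    blocks = pack_chunks(["aaaa", "bbbb", "cccc"], max_chars=10)
    print(len(blocks), "LLM calls instead of 3")
```

With three retrieved chunks and room for two per prompt, the packed version needs two calls where chunk-by-chunk refinement would need three; the saving grows with `similarity_top_k`.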

3. Advanced Features

3.1 Chat Engine

"""
LlamaIndex chat engine example.
Query modes that support multi-turn conversation.
"""

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.chat_engine import SimpleChatEngine
from llama_index.core.memory import ChatMemoryBuffer


# ========== 1. Simple chat ==========

def simple_chat():
    """Simple chat mode (no retrieval)."""
    chat_engine = SimpleChatEngine.from_defaults()
    return chat_engine


# ========== 2. Context chat (recommended) ==========

def context_chat(index):
    """Context chat mode: retrieve for every message."""
    memory = ChatMemoryBuffer.from_defaults(token_limit=4096)

    chat_engine = index.as_chat_engine(
        chat_mode="context",
        memory=memory,
        system_prompt=(
            "You are a professional technical assistant. "
            "Answer the user's questions based on the provided documents."
        )
    )
    return chat_engine


# ========== 3. Condense-question chat ==========

def condense_chat(index):
    """Condense-question mode: rewrite history + message into one standalone query."""
    chat_engine = index.as_chat_engine(
        chat_mode="condense_question",
        verbose=True
    )
    return chat_engine


# ========== 4. ReAct chat ==========

def react_chat(index):
    """ReAct chat mode (with tool calls)."""
    chat_engine = index.as_chat_engine(
        chat_mode="react",
        verbose=True
    )
    return chat_engine


# ========== Usage example ==========

if __name__ == "__main__":
    # Create the index (kept out of module scope so importing has no side effects)
    documents = SimpleDirectoryReader("./documents").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Create the chat engine
    chat_engine = context_chat(index)

    # Hold a multi-turn conversation
    response = chat_engine.chat("What is machine learning?")
    print(f"AI: {response}")

    response = chat_engine.chat("What are its main applications?")
    print(f"AI: {response}")

    response = chat_engine.chat("Can you give me a few concrete examples?")
    print(f"AI: {response}")

    # Inspect the conversation history
    print("\nConversation history:")
    for msg in chat_engine.chat_history:
        # content may be None for some message types
        print(f"- {msg.role.value}: {(msg.content or '')[:50]}...")
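The `token_limit` passed to `ChatMemoryBuffer` above bounds how much history is sent back to the model on each turn. The following sketch shows one way such a bound can work; it is a simplification under assumptions, counting whitespace-separated words instead of model tokens, and `BoundedChatMemory` is a hypothetical class, not the library's implementation.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


class BoundedChatMemory:
    """Keep the full history, but expose only the newest messages within a budget."""

    def __init__(
        self,
        token_limit: int,
        count_tokens: Callable[[str], int] = lambda text: len(text.split()),
    ) -> None:
        self.token_limit = token_limit
        self._count_tokens = count_tokens
        self._messages: List[Message] = []

    def put(self, role: str, content: str) -> None:
        self._messages.append((role, content))

    def get(self) -> List[Message]:
        """Walk backwards from the newest message until the budget is used up."""
        kept: List[Message] = []
        used = 0
        for role, content in reversed(self._messages):
            cost = self._count_tokens(content)
            if used + cost > self.token_limit:
                break
            kept.append((role, content))
            used += cost
        kept.reverse()  # restore chronological order
        return kept
```

Older turns fall out of the window first, which is why a long conversation can "forget" its opening unless the limit is raised or the history is summarized.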

3.2 Retrieval Enhancements

"""
Advanced LlamaIndex retrieval strategies.
"""

from llama_index.core import VectorStoreIndex
from llama_index.core.retrievers import (
    VectorIndexRetriever,
    KeywordTableSimpleRetriever
)
from llama_index.retrievers.bm25 import BM25Retriever  # package: llama-index-retrievers-bm25
from llama_index.core.retrievers import VectorIndexAutoRetriever
from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo


# ========== 1. Hybrid retrieval ==========

def hybrid_retrieval(index):
    """Combine vector retrieval with BM25 retrieval."""
    from llama_index.core.retrievers import QueryFusionRetriever

    # Vector retriever
    vector_retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=5
    )

    # BM25 retriever (reads the nodes from the index's docstore)
    bm25_retriever = BM25Retriever.from_defaults(
        index=index,
        similarity_top_k=5
    )

    # Fusion retriever
    fusion_retriever = QueryFusionRetriever(
        retrievers=[vector_retriever, bm25_retriever],
        similarity_top_k=5,
        num_queries=1,  # 1 = use only the original query, no generated variants
        mode="reciprocal_rerank"  # or "dist_based_score"
    )

    return fusion_retriever


# ========== 2. Auto-retrieval (metadata filtering) ==========

def auto_retrieval(index):
    """Auto-retrieval: the LLM infers metadata filters from the query."""
    vector_store_info = VectorStoreInfo(
        content_info="Technical documentation",
        metadata_info=[
            MetadataInfo(
                name="category",
                type="str",
                description="Document category, e.g. 'AI', 'ML', 'NLP'"
            ),
            MetadataInfo(
                name="date",
                type="str",
                description="Document date in YYYY-MM-DD format"
            )
        ]
    )

    retriever = VectorIndexAutoRetriever(
        index=index,
        vector_store_info=vector_store_info,
        similarity_top_k=5
    )

    return retriever


# ========== 3. Multi-document retrieval ==========

def multi_doc_retrieval(index):
    """Multi-document retrieval."""
    from llama_index.core.retrievers import RecursiveRetriever

    # Create a recursive retriever
    # Supports following references and links across documents
    retriever = RecursiveRetriever(
        "vector",
        retriever_dict={"vector": index.as_retriever()},
        # ... configuration
    )

    return retriever
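The `"reciprocal_rerank"` mode used in the hybrid retriever refers to reciprocal rank fusion, which merges ranked lists using positions only, so vector scores and BM25 scores never have to share a scale. The sketch below shows the standard formula; the constant `k = 60` is the conventional default from the literature and an assumption here, not a value taken from this document, and `reciprocal_rank_fusion` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60, top_n: Optional[int] = None
) -> List[Tuple[str, float]]:
    """Fuse ranked lists: each list adds 1 / (k + rank) to a node's score."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by node id for determinism
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n]


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # ranked output of a vector retriever
    bm25_hits = ["b", "d", "a"]     # ranked output of a BM25 retriever
    for node_id, score in reciprocal_rank_fusion([vector_hits, bm25_hits]):
        print(node_id, round(score, 5))
```

A node that appears near the top of both lists (`b` here) outranks one that leads only a single list, which is the behavior that makes hybrid retrieval more robust than either retriever alone.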

3.3 Integration with LangChain

"""
LlamaIndex and LangChain integration example.
Uses the AgentExecutor-style agent API; newer LangChain releases may relocate it.
"""

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from langchain_openai import ChatOpenAI
from langchain_core.tools import Tool
from langchain.agents import AgentExecutor, create_openai_functions_agent


# ========== 1. Expose LlamaIndex as a LangChain tool ==========

def as_langchain_tool(index):
    """Wrap a LlamaIndex query engine as a LangChain tool."""

    query_engine = index.as_query_engine()

    def query_knowledge_base(query: str) -> str:
        """Query the knowledge base."""
        response = query_engine.query(query)
        return str(response)

    tool = Tool(
        name="knowledge_base",
        func=query_knowledge_base,
        description="Query the company's internal knowledge base for technical documentation"
    )

    return tool


# ========== 2. Use it inside a LangChain agent ==========

def create_agent_with_llamaindex(index):
    """Create a LangChain agent that uses LlamaIndex."""
    from langchain import hub

    # Create the tool
    kb_tool = as_langchain_tool(index)
    tools = [kb_tool]

    # Create the LLM
    llm = ChatOpenAI(model="gpt-4")

    # Fetch the prompt
    prompt = hub.pull("hwchase17/openai-functions-agent")

    # Create the agent
    agent = create_openai_functions_agent(llm, tools, prompt)
    agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

    return agent_executor


# ========== 3. Use a LangChain LLM inside LlamaIndex ==========

def use_llamaindex_llm():
    """Wrap a LangChain LLM so that LlamaIndex can use it."""
    from llama_index.llms.langchain import LangChainLLM  # package: llama-index-llms-langchain
    from langchain_openai import ChatOpenAI

    # Create the LangChain LLM
    lc_llm = ChatOpenAI(model="gpt-4")

    # Wrap it as a LlamaIndex LLM
    llm = LangChainLLM(llm=lc_llm)

    return llm


# ========== Usage example ==========

if __name__ == "__main__":
    # Load the documents and create the index
    documents = SimpleDirectoryReader("./documents").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Create the agent
    agent = create_agent_with_llamaindex(index)

    # Run the query
    result = agent.invoke({
        "input": "Please query the knowledge base and tell me what machine learning is."
    })

    print(result["output"])

4. Production Best Practices

4.1 Performance Optimization Strategies

┌─────────────────────────────────────────────────────────────┐
│             LlamaIndex Performance Optimization             │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   1. Index optimization                                     │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ • Choose a suitable chunk_size (256-1024)           │   │
│   │ • Set chunk_overlap to 10-20% of chunk_size        │   │
│   │ • Use hierarchical indexes for large corpora       │   │
│   │ • Update and maintain indexes regularly            │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   2. Retrieval optimization                                 │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ • Use hybrid retrieval (vector + keyword)          │   │
│   │ • Configure a suitable top_k value                 │   │
│   │ • Use reranking to improve precision               │   │
│   │ • Implement a caching mechanism                    │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   3. Storage optimization                                   │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ • Use a dedicated vector DB (Pinecone, Weaviate)   │   │
│   │ • Persist indexes                                   │   │
│   │ • Configure a suitable vector dimension            │   │
│   │ • Clean up expired data regularly                  │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   4. Query optimization                                     │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ • Stream output to improve user experience         │   │
│   │ • Implement query caching                          │   │
│   │ • Use a faster embedding model                     │   │
│   │ • Optimize prompt length                           │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
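
The chunk_size and chunk_overlap guidance above is easier to reason about with a concrete sliding window. The following is a minimal sketch in plain Python, not LlamaIndex's own splitter: the function name split_with_overlap and the overlap_ratio parameter are hypothetical, and a pre-split word list stands in for real tokens. In LlamaIndex itself, node parsers such as SentenceSplitter take chunk_size and chunk_overlap arguments for the same purpose.

```python
from typing import List


def split_with_overlap(
    tokens: List[str], chunk_size: int, overlap_ratio: float = 0.15
) -> List[List[str]]:
    """Split a token list into windows whose tails overlap the next window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap  # window advance per chunk; always >= 1

    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


if __name__ == "__main__":
    words = [f"w{i}" for i in range(10)]
    for chunk in split_with_overlap(words, chunk_size=4, overlap_ratio=0.25):
        print(chunk)
```

With an overlap, the last token of one chunk reappears at the start of the next, so a sentence that straddles a boundary is still retrievable from at least one chunk. The cost is a larger index, which is why the overlap is usually kept to a small fraction of chunk_size.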

4.2 Enterprise Deployment

"""
Example: enterprise-grade LlamaIndex deployment
"""
 
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.storage.storage_context import StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler
# Pinecone client v3+ replaced pinecone.init() with a Pinecone client object
from pinecone import Pinecone, PodSpec
 
 
class EnterpriseRAG:
    """Enterprise RAG application"""
    
    def __init__(
        self,
        pinecone_api_key: str,
        pinecone_env: str,
        index_name: str,
        openai_api_key: str
    ):
        # Initialize the Pinecone client
        pc = Pinecone(api_key=pinecone_api_key)
        
        # Configure global settings
        Settings.embed_model = OpenAIEmbedding(
            api_key=openai_api_key,
            embed_batch_size=100
        )
        Settings.llm = OpenAI(
            api_key=openai_api_key,
            model="gpt-4"
        )
        
        # Configure callbacks (debugging and monitoring)
        debug_handler = LlamaDebugHandler(print_trace_on_end=True)
        Settings.callback_manager = CallbackManager([debug_handler])
        
        # Get or create the Pinecone index
        if index_name not in pc.list_indexes().names():
            pc.create_index(
                name=index_name,
                dimension=1536,  # OpenAI embedding dimension; must match the embedding model
                metric="cosine",
                # Pod-based index in the given environment; use ServerlessSpec for serverless
                spec=PodSpec(environment=pinecone_env)
            )
        
        self.pinecone_index = pc.Index(index_name)
        self.index = None
    
    def build_index(self, documents_path: str):
        """Build the index"""
        # Load the documents
        documents = SimpleDirectoryReader(documents_path).load_data()
        
        # Create the vector store backed by the Pinecone index
        vector_store = PineconeVectorStore(
            pinecone_index=self.pinecone_index
        )
        storage_context = StorageContext.from_defaults(
            vector_store=vector_store
        )
        
        # Build the index; embeddings are written to Pinecone
        self.index = VectorStoreIndex.from_documents(
            documents,
            storage_context=storage_context,
            show_progress=True
        )
    
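    def query(
        self,
        question: str,
        top_k: int = 5,
        filters: dict = None
    ):
        """Query the index"""
        if self.index is None:
            # build_index() was not called in this process: attach to the
            # vectors already stored in Pinecone instead of failing on None
            vector_store = PineconeVectorStore(
                pinecone_index=self.pinecone_index
            )
            self.index = VectorStoreIndex.from_vector_store(vector_store)
        
        query_engine = self.index.as_query_engine(
            similarity_top_k=top_k,
            # Metadata filtering can be added here; note that LlamaIndex
            # expects a MetadataFilters object rather than a plain dict
            # filters=filters
        )
        
        response = query_engine.query(question)
        
        # Return the answer together with a preview of each source chunk
        return {
            "answer": str(response),
            "sources": [
                {
                    "content": node.node.text[:200],
                    "metadata": node.node.metadata
                }
                for node in response.source_nodes
            ]
        }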
 
 
# Usage example
if __name__ == "__main__":
    rag = EnterpriseRAG(
        pinecone_api_key="your-pinecone-key",
        pinecone_env="us-west1-gcp",
        index_name="company-knowledge",
        openai_api_key="your-openai-key"
    )
    
    # Build the index (first run only)
    # rag.build_index("./documents")
    
    # Query
    result = rag.query("What is the company's annual leave policy?")
    print(f"Answer: {result['answer']}")

5. Interview Q&A

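Q1: How do you choose between LlamaIndex and LangChain?

Key points:

Dimension	LlamaIndex	LangChain
Core focus	Data connection and RAG	General-purpose agent framework
RAG capability	⭐⭐⭐⭐⭐	⭐⭐⭐
Agent capability	⭐⭐⭐	⭐⭐⭐⭐⭐
Tool ecosystem	Rich set of data connectors	Rich set of tools and integrations
Learning curve	Relatively low	Moderate

Recommendations:

Pure RAG / knowledge-base application → LlamaIndex
Complex agent system → LangChain
Enterprise knowledge base → LlamaIndex combined with LangChain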

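Q2: Which index types does LlamaIndex support, and what characterizes each?

Key points:

Index type	Retrieval method	Use case
VectorStoreIndex	Vector similarity	Semantic search, Q&A
ListIndex (SummaryIndex in recent releases)	Iterates over all nodes	Document summarization
TreeIndex	Hierarchical traversal	Large document collections
KeywordTableIndex	Keyword matching	Exact-term retrieval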
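
To make the "keyword matching" row concrete, here is a toy sketch of a keyword table in plain Python. It is not LlamaIndex's implementation: build_keyword_table and keyword_retrieve are hypothetical names, and whitespace tokenization stands in for whatever keyword extraction a real index performs. The point is only the data structure, a map from keyword to the nodes that contain it, which is why this style suits exact-term lookups but misses paraphrases.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_keyword_table(nodes: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase word to the ids of the nodes that contain it."""
    table: Dict[str, Set[str]] = defaultdict(set)
    for node_id, text in nodes.items():
        for word in text.lower().split():
            table[word].add(node_id)
    return table


def keyword_retrieve(table: Dict[str, Set[str]], query: str) -> List[str]:
    """Return node ids ranked by how many query words they match."""
    hits: Dict[str, int] = defaultdict(int)
    for word in query.lower().split():
        for node_id in table.get(word, ()):
            hits[node_id] += 1
    # Most matches first; ties broken by node id for a stable order
    return sorted(hits, key=lambda node_id: (-hits[node_id], node_id))


if __name__ == "__main__":
    nodes = {
        "n1": "vector search uses embeddings",
        "n2": "keyword search uses exact terms",
        "n3": "tree index summarizes children",
    }
    table = build_keyword_table(nodes)
    print(keyword_retrieve(table, "keyword terms"))
```

A query worded differently from the stored text (for example "lexical lookup") returns nothing here, which is the gap that vector similarity is meant to cover.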

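Q3: How do you improve retrieval quality in a RAG system?

Key points:

Data preprocessing: a sensible chunking strategy and overlap setting
Hybrid retrieval: fuse vector retrieval with BM25 retrieval
Reranking: rerank the results with a cross-encoder
Metadata filtering: filter candidates by document metadata
Query rewriting: rephrase the user's query so that it retrieves better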
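
The hybrid retrieval point leaves open how the two result lists are merged. One common choice is Reciprocal Rank Fusion (RRF), sketched below in plain Python. This is an assumption about the fusion method, since the text above does not name one; the function name is hypothetical, and k=60 is just the conventional default for the smoothing constant.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists; each list adds 1 / (k + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]  # ranked ids from vector retrieval
    bm25_hits = ["b", "d", "a"]    # ranked ids from BM25 retrieval
    print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
```

RRF uses only rank positions, so it sidesteps the problem that cosine similarities and BM25 scores live on different scales. Documents that appear in both lists naturally rise to the top, and a cross-encoder reranker can then be applied to the fused candidates.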

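Q4: What is LlamaIndex's callback system for?

Key points:

Debug tracing: record the inputs and outputs of each step
Performance monitoring: track token usage and response time
Cost control: track the number of API calls and their cost
Quality evaluation: collect retrieval and generation metrics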
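
The idea behind these points can be illustrated with a small, framework-free tracer. The sketch below is not the LlamaIndex callback API (the LlamaDebugHandler used in section 4.2 is the real entry point); SimpleTraceHandler and its methods are hypothetical names, and token counts are passed in by hand rather than measured.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


class SimpleTraceHandler:
    """Hypothetical handler that records one event per pipeline step."""

    def __init__(self) -> None:
        self.events: List[Dict] = []

    @contextmanager
    def step(self, name: str, tokens: int = 0) -> Iterator[None]:
        """Time the wrapped block and store its name, duration, and token count."""
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the event even if the wrapped step raised an exception
            self.events.append({
                "step": name,
                "seconds": time.perf_counter() - start,
                "tokens": tokens,
            })

    def total_tokens(self) -> int:
        """Sum token usage across all recorded steps (basis for cost tracking)."""
        return sum(event["tokens"] for event in self.events)


if __name__ == "__main__":
    handler = SimpleTraceHandler()
    with handler.step("retrieve"):
        time.sleep(0.01)  # stand-in for a retrieval call
    with handler.step("llm", tokens=120):
        time.sleep(0.01)  # stand-in for an LLM call
    for event in handler.events:
        print(event["step"], event["tokens"])
    print("total tokens:", handler.total_tokens())
```

A real callback system works the same way in spirit: the framework emits start and end events around each stage, and handlers aggregate them into traces, latency figures, and cost estimates.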

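6. Summary

LlamaIndex is a specialized framework focused on data connection and RAG:

Core strengths

Strong data connectivity: 200+ data connectors
Rich index types: vector, list, tree, keyword
In-depth RAG optimization: covers the full pipeline of retrieval, post-processing, and synthesis

Key takeaways

The Index is the core: choosing the right index type is critical
The Retriever determines quality: the retrieval strategy directly affects answer quality
The Query Engine is the entry point: tune query parameters to improve the experience

Next steps

Study workflow orchestration with LangGraph in depth
Practice building and deploying enterprise-grade RAG systems
Explore LlamaIndex's advanced retrieval strategies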
