CH01 LangChain 시작하기

2026.02.27
Web · by 배레온 (Busan, developer)
#Agent #LangChain #LCEL #LLM #RAG

Key Points

  • This resource provides a comprehensive Korean tutorial for LangChain, detailing its installation, core components, and related frameworks like LangSmith and LangGraph.
  • It systematically covers essential LangChain modules, including Prompts, Output Parsers, diverse LLM Models, various Memory types, Document Loaders, Text Splitters, Embeddings, Vector Stores, Retrievers, and Rerankers.
  • The tutorial also explores advanced applications such as Retrieval Augmented Generation (RAG), LangChain Expression Language (LCEL), Chains, comprehensive Evaluation methods, and in-depth Agent development with numerous LangGraph use cases.

The source document, "<랭체인LangChain 노트> - LangChain 한국어 튜토리얼🇰🇷," serves as a comprehensive Korean tutorial for the LangChain framework, designed for developing applications powered by large language models (LLMs). The core premise of LangChain, as highlighted, is to imbue LLMs with context-awareness by connecting them to various contextual sources (prompt instructions, few-shot examples, content to ground responses in) and to enable reasoning capabilities for decision-making and action execution. This facilitates the development of applications such as Retrieval Augmented Generation (RAG), structured data analysis, and conversational AI agents.

The LangChain ecosystem is presented through several key components:

  • LangChain Library: Provides Python and JavaScript interfaces, integrations, and a runtime for composing components into chains and agents.
  • LangChain Templates: Offers easily deployable reference architectures for common tasks.
  • LangServe: Enables deployment of LangChain chains as REST APIs.
  • LangSmith: A developer platform for debugging, testing, evaluating, and monitoring LLM applications, offering seamless integration.
  • LangGraph: A library built atop LangChain for constructing stateful multi-actor LLM applications, extending LangChain Expression Language (LCEL) with cyclic orchestration capabilities for multiple chains/actors.

The framework's core functionality is encapsulated in three primary modules:

  • Model I/O: Manages prompt engineering, optimization, and generic interfaces for interacting with LLMs.
  • Retrieval: Crucial for RAG applications, it handles fetching relevant data from external sources to augment LLM generation.
  • Agents: Facilitates autonomous decision-making in LLMs, allowing them to determine, execute, observe, and iterate on actions.

The tutorial is structured into seventeen main chapters, plus a short miscellaneous chapter, delving into various aspects of LangChain:

CH01: LangChain 시작하기 (Getting Started): Covers installation (Python 3.11 recommended, pip install -r requirements.txt), OpenAI API key setup, LangSmith tracing configuration, and the use of OpenAI's GPT-4o for multimodal applications. It also introduces LangChain Expression Language (LCEL) as a declarative way to compose runnables and define interfaces.
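The `|` composition at the heart of LCEL can be illustrated without LangChain installed at all. Below is a minimal pure-Python sketch of the idea; the `Runnable` class and the fake model are illustrative stand-ins, not langchain_core's actual implementation:

```python
class Runnable:
    """Minimal stand-in for an LCEL runnable: wraps a function and
    supports declarative composition with the | operator."""
    def __init__(self, func):
        self.func = func

    def invoke(self, x):
        return self.func(x)

    def __or__(self, other):
        # (a | b).invoke(x) is b.invoke(a.invoke(x))
        return Runnable(lambda x: other.invoke(self.invoke(x)))

prompt = Runnable(lambda topic: f"Tell me a joke about {topic}")
fake_llm = Runnable(lambda p: p.upper())  # stands in for a chat model call
parser = Runnable(lambda s: s.strip())

chain = prompt | fake_llm | parser
print(chain.invoke("cats"))  # → TELL ME A JOKE ABOUT CATS
```

In real LCEL the same shape is `prompt | model | output_parser`, with each element implementing the Runnable interface (`invoke`, `stream`, `batch`).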

CH02: 프롬프트 (Prompt): Explores various prompt types including basic prompts, FewShotPromptTemplate for providing in-context examples, integration with LangChain Hub for shared prompts, and personalization of prompts by uploading to the Hub.
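Mechanically, a few-shot prompt is string assembly: a prefix, each formatted example, and a suffix holding the live query. A hedged sketch of that assembly (the function name and example schema here are invented for illustration, not FewShotPromptTemplate's API):

```python
def few_shot_prompt(examples, prefix, suffix, query):
    """Assemble a few-shot prompt: instructions first, then each
    in-context example, then the suffix carrying the actual query."""
    parts = [prefix]
    for ex in examples:
        parts.append(f"Q: {ex['question']}\nA: {ex['answer']}")
    parts.append(suffix.format(query=query))
    return "\n\n".join(parts)

examples = [
    {"question": "2+2?", "answer": "4"},
    {"question": "Capital of France?", "answer": "Paris"},
]
prompt = few_shot_prompt(examples, "Answer concisely.", "Q: {query}\nA:", "3+5?")
```

The resulting string ends with an unanswered `Q: 3+5?\nA:`, which is exactly the pattern-completion cue that makes in-context examples effective.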

CH03: 출력 파서 (Output Parsers): Details mechanisms for structuring LLM outputs. This includes PydanticOutputParser for Pydantic models, CommaSeparatedListOutputParser, StructuredOutputParser for arbitrary structures, JsonOutputParser, PandasDataFrameOutputParser, DatetimeOutputParser, EnumOutputParser, and OutputFixingParser for correcting malformed outputs.
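The retry idea behind OutputFixingParser can be sketched in a few lines: attempt a strict parse, and on failure hand the malformed text to a repair step. In this sketch the "fixer" is a plain function; in the real parser it is an LLM call that rewrites the output to match the schema:

```python
import json

def parse_with_fixing(text, fix_fn):
    """Try to parse model output as JSON; on failure, ask a repair
    function for a corrected string and parse that instead."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return json.loads(fix_fn(text))

# Toy fixer: swap single quotes for double quotes (a real fixer is an LLM).
fixed = parse_with_fixing("{'name': 'LangChain'}", lambda t: t.replace("'", '"'))
# fixed == {"name": "LangChain"}
```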

CH04: 모델 (Model): Discusses utilizing diverse LLM models beyond OpenAI (e.g., Google Generative AI, HuggingFace Endpoints/Local/Pipeline, Ollama, GPT4ALL), caching mechanisms (Cache), model serialization (Serialization) for saving/loading, and token usage tracking. Specific mention is made of Gemini for video question-answering.

CH05: 메모리 (Memory): Covers various memory types for maintaining conversational history: ConversationBufferMemory, ConversationBufferWindowMemory (fixed window size), ConversationTokenBufferMemory (token-based window), ConversationEntityMemory (tracking entities), ConversationKGMemory (knowledge graph-based), ConversationSummaryMemory (summarizing past interactions), and VectorStoreRetrieverMemory (retrieving past context from a vector store). It also covers adding memory to LCEL chains and RunnableWithMessageHistory, and persisting chat history to SQLite.
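The window-based variants differ from the plain buffer only in what they evict. A minimal sketch of the ConversationBufferWindowMemory idea (class and method names here are illustrative, not LangChain's):

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (human, ai) turns so the prompt context
    stays bounded regardless of conversation length."""
    def __init__(self, k):
        self.turns = deque(maxlen=k)

    def save(self, human, ai):
        self.turns.append((human, ai))

    def load(self):
        return list(self.turns)

mem = WindowMemory(k=2)
mem.save("hi", "hello")
mem.save("how are you?", "fine")
mem.save("bye", "goodbye")
# only the 2 most recent turns survive
```

ConversationTokenBufferMemory applies the same eviction policy but measures the window in tokens rather than turns, and ConversationSummaryMemory replaces evicted turns with a running summary instead of dropping them.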

CH06: 문서 로더 (Document Loader): Explains how to load different document types into Document objects, including PDF, HWP, CSV, Excel, Word, PowerPoint, web pages (WebBaseLoader), plain text (TextLoader), JSON, Arxiv, UpstageLayoutAnalysisLoader, and LlamaParser.

CH07: 텍스트 분할 (Text Splitter): Focuses on methods for segmenting large texts: CharacterTextSplitter, RecursiveCharacterTextSplitter (recursively trying different delimiters), TokenTextSplitter (based on token count), SemanticChunker (semantic similarity), CodeTextSplitter for various programming languages, MarkdownHeaderTextSplitter, HTMLHeaderTextSplitter, and RecursiveJsonSplitter.
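The "recursively trying different delimiters" behavior can be sketched directly. This is a simplified illustration: the real RecursiveCharacterTextSplitter also merges adjacent small pieces back together and supports chunk overlap, which this sketch omits:

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Try separators coarsest-first; recurse on pieces still longer
    than chunk_size; hard-cut by characters as a last resort."""
    if len(text) <= chunk_size:
        return [text]
    for sep in separators:
        if sep in text:
            chunks = []
            for piece in text.split(sep):
                chunks.extend(recursive_split(piece, chunk_size, separators))
            return chunks
    # no separator left: split by raw character count
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunks = recursive_split("alpha beta\n\ngamma delta epsilon", chunk_size=12)
# → ["alpha beta", "gamma", "delta", "epsilon"]
```

The coarsest-first ordering is the key design choice: paragraph boundaries are preferred over line breaks, and line breaks over word breaks, so chunks respect document structure whenever possible.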

CH08: 임베딩 (Embedding): Details various embedding models: OpenAIEmbeddings, CacheBackedEmbeddings for efficiency, HuggingFace Embeddings, UpstageEmbeddings, OllamaEmbeddings, GPT4ALLEmbeddings, and LlamaCPPEmbeddings.
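The efficiency win of CacheBackedEmbeddings is plain memoization keyed on the text. A sketch of that idea (the toy character-count "embedding" below is a stand-in for a real dense embedding model):

```python
class CachedEmbedder:
    """Memoize per-text vectors so a repeated text never hits the
    underlying embedding model twice."""
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.cache = {}
        self.calls = 0  # how many times the real model was invoked

    def embed(self, text):
        if text not in self.cache:
            self.calls += 1
            self.cache[text] = self.embed_fn(text)
        return self.cache[text]

emb = CachedEmbedder(lambda t: [len(t), t.count(" ")])
v1 = emb.embed("hello world")
v2 = emb.embed("hello world")  # served from cache; no second model call
```

This matters most during indexing, where the same chunks may be re-embedded across runs; LangChain's version additionally persists the cache to a byte store so it survives restarts.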

CH09: 벡터저장소 (VectorStore): Introduces popular vector databases for storing and querying embeddings: Chroma, FAISS, and Pinecone.
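At their core, all three stores do the same thing: hold (text, vector) pairs and return the nearest texts to a query vector. A brute-force sketch of that contract (Chroma, FAISS, and Pinecone replace the linear scan with approximate nearest-neighbor indexes):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class TinyVectorStore:
    """Store (text, vector) pairs; return top-k texts by cosine
    similarity to a query vector via exhaustive search."""
    def __init__(self):
        self.items = []

    def add(self, text, vector):
        self.items.append((text, vector))

    def search(self, query_vec, k=1):
        ranked = sorted(self.items, key=lambda it: cosine(it[1], query_vec),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("doc about cats", [1.0, 0.0])
store.add("doc about dogs", [0.0, 1.0])
top = store.search([0.9, 0.1], k=1)  # → ["doc about cats"]
```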

CH10: 검색기 (Retriever): Describes various retriever types: VectorStore-backed Retriever, ContextualCompressionRetriever (compressing retrieved documents), EnsembleRetriever (combining multiple retrievers), LongContextReorder (reordering for long contexts), ParentDocumentRetriever (retrieving larger parent chunks), MultiQueryRetriever (generating multiple queries), MultiVectorRetriever (using multiple vector representations), SelfQueryRetriever (LLM-driven query generation), TimeWeightedVectorStoreRetriever (prioritizing recent documents), and Korean morphological analyzers (Kiwi, Kkma, Okt) combined with BM25. A Convex Combination (CC) EnsembleRetriever is also mentioned for weighted fusion.
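The Convex Combination fusion mentioned above has a simple core: blend each retriever's normalized scores with a weight and rank by the blend. A sketch under the assumption that both score dictionaries are already normalized to [0, 1]:

```python
def convex_combination(scores_a, scores_b, alpha=0.5):
    """Merge two retrievers' normalized scores as
    alpha * a + (1 - alpha) * b, and rank documents by the blend."""
    docs = set(scores_a) | set(scores_b)
    blended = {
        d: alpha * scores_a.get(d, 0.0) + (1 - alpha) * scores_b.get(d, 0.0)
        for d in docs
    }
    return sorted(blended, key=blended.get, reverse=True)

bm25 = {"doc1": 1.0, "doc2": 0.4}    # sparse (keyword) scores, normalized
dense = {"doc2": 0.9, "doc3": 0.8}   # dense (vector) scores, normalized
ranking = convex_combination(bm25, dense, alpha=0.5)
# doc2 blends 0.65, doc1 0.5, doc3 0.4 → ["doc2", "doc1", "doc3"]
```

Documents found by both retrievers (doc2 here) naturally rise, which is why sparse+dense ensembles often beat either retriever alone; the default EnsembleRetriever instead fuses by reciprocal rank (RRF), which needs no score normalization.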

CH11: 리랭커 (Reranker): Covers post-retrieval reranking models for improving relevance: Cross Encoder Reranker, Cohere Reranker, Jina Reranker, and FlashRank Reranker.
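All four rerankers share one shape: score each (query, document) pair jointly and keep the best few. A sketch of that post-retrieval step; the word-overlap scorer below is a toy stand-in for a cross-encoder model:

```python
def rerank(query, docs, score_fn, top_n=2):
    """Score each (query, doc) pair jointly -- as a cross-encoder
    would -- and keep only the top_n documents."""
    return sorted(docs, key=lambda d: score_fn(query, d), reverse=True)[:top_n]

def overlap(q, d):
    """Toy relevance score: count of shared lowercase words."""
    return len(set(q.lower().split()) & set(d.lower().split()))

docs = ["the cat sat", "stock market news", "a cat and a dog"]
best = rerank("cat dog", docs, overlap, top_n=1)  # → ["a cat and a dog"]
```

The design trade-off is the same for every backend: scoring the pair jointly is far more accurate than comparing precomputed embeddings, but too expensive to run over the whole corpus, so it is applied only to the handful of candidates a fast retriever returns first.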

CH12: Retrieval Augmented Generation (RAG): Demonstrates practical RAG applications, including QA from PDF documents and Naver news articles. It explores modular RAG, RAPTOR for long context summarization, and RAG chains with memory.
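The basic RAG step those applications share is retrieve, stuff, generate. A self-contained sketch with toy stand-ins (the keyword `retrieve` and the echoing `llm` below are illustrations, not the tutorial's actual chain):

```python
def rag_answer(question, retrieve, llm):
    """Retrieve context for the question, stuff it into the prompt,
    and let the model answer grounded in that context."""
    context = "\n".join(retrieve(question))
    prompt = (f"Answer using only this context:\n{context}\n\n"
              f"Question: {question}")
    return llm(prompt)

corpus = ["Seoul is the capital of South Korea.", "Busan is a port city."]

def retrieve(q):
    # toy keyword retriever over the corpus
    words = q.lower().split()
    return [d for d in corpus if any(w in d.lower() for w in words)]

echo_llm = lambda p: p  # stand-in: a real chat model generates an answer
out = rag_answer("capital of South Korea?", retrieve, echo_llm)
```

Everything in CH10-CH11 (better retrievers, rerankers) improves the `retrieve` slot of this pipeline; everything in CH05 (memory) wraps it so `question` can depend on prior turns.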

CH13: LangChain Expression Language (LCEL): Provides an in-depth look at LCEL components and patterns: RunnablePassthrough (passing inputs directly), Runnable structure and graph inspection, RunnableLambda (arbitrary functions), LLM chain routing using RunnableLambda and RunnableBranch, RunnableParallel (concurrent execution), dynamic property assignment (configurable_fields, configurable_alternatives), @chain decorator, RunnableWithMessageHistory, custom generators, runtime argument binding, and fallback models.

CH14: 체인 (Chains): Illustrates specific chain implementations such as document summarization, SQL query generation, and with_structured_output for predefined output formats.

CH15: 평가 (Evaluations): Covers methods for evaluating LLM applications, especially RAG: Synthetic dataset generation with RAGAS, RAGAS-based evaluation, uploading datasets to HuggingFace, LangSmith dataset creation, LLM-as-Judge, embedding distance evaluation ($\| \text{embedding}_1 - \text{embedding}_2 \|_2$), custom LLM evaluation, heuristic metrics (Rouge, BLEU, METEOR, SemScore), experiment comparison, summary evaluation, groundedness (hallucination) assessment, pairwise evaluation, and automated online evaluation.
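The embedding distance metric is just the L2 norm of the difference between the prediction's and the reference's embedding vectors; smaller means semantically closer. A sketch of the computation (vectors here are toy 2-D examples, not real model embeddings):

```python
import math

def embedding_distance(e1, e2):
    """L2 distance between two embedding vectors:
    sqrt(sum over dimensions of (e1_i - e2_i)^2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))

reference = [1.0, 0.0]
d_close = embedding_distance(reference, [0.9, 0.1])  # near-paraphrase
d_far = embedding_distance(reference, [0.0, 1.0])    # unrelated text
# d_close < d_far: the closer prediction scores better
```

Unlike Rouge or BLEU, this rewards semantic rather than lexical overlap, which is why it is listed alongside, not instead of, the heuristic metrics.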

CH16: 에이전트 (Agent): Details the concept of agents: Tools (functions agents can use), Tool Binding, general Agent construction, using agents with Claude, Gemini, Ollama, Together.ai. It also covers iteration with Human-in-the-loop, Agentic RAG, CSV/Excel data analysis agents, Toolkits, RAG + Image Generator agents for report writing, and multi-agent debates with tools.
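The determine-execute-observe-iterate loop from the Agents module can be sketched end to end. Here the decision-making LLM is a scripted stand-in and the decision dict format is invented for illustration; real agents use the model's native tool-calling interface:

```python
def run_agent(question, llm_decide, tools, max_steps=3):
    """Agent loop: the model picks a tool, the tool runs, the
    observation is fed back -- until the model emits a final answer."""
    observations = []
    for _ in range(max_steps):
        decision = llm_decide(question, observations)
        if decision["type"] == "final":
            return decision["answer"]
        tool = tools[decision["tool"]]
        observations.append(tool(decision["input"]))
    return "step limit reached"

tools = {"calculator": lambda expr: str(eval(expr))}  # toy tool

def fake_llm(question, observations):
    # scripted stand-in for a tool-calling model: compute, then answer
    if not observations:
        return {"type": "tool", "tool": "calculator", "input": "6*7"}
    return {"type": "final", "answer": f"The result is {observations[-1]}"}

answer = run_agent("what is 6*7?", fake_llm, tools)  # → "The result is 42"
```

Human-in-the-loop iteration slots into this same loop: before executing `tool`, the decision is surfaced for approval or editing.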

CH17: LangGraph: Explores LangGraph for building stateful multi-actor LLM applications:

  • Core Features: Python syntax understanding for LangGraph, building chatbots and agents with LangGraph, adding memory to agents, streaming node outputs, Human-in-the-loop, state modification and replay through rollback, adding human-query nodes, message removal (RemoveMessage), ToolNode usage, creating branches for parallel execution, summarization of chat history, subgraphs (addition and I/O transformation), and streaming modes.
  • Structure Design: Basic graph creation, Naive RAG, adding Relevance Checker and Web Search modules, query rewriting, Agentic RAG, and Adaptive RAG.
  • Use Cases: Agent conversation simulation (customer service), user-requirement based meta-prompt generation, CRAG (Corrective RAG), Self-RAG, Plan-and-Execute, Multi-Agent Collaboration Network, Multi-Agent Supervisor, Hierarchical Multi-Agent Teams, SQL database interaction agents, and STORM concept-based multi-agent research.
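The structure-design patterns above all reduce to one mechanism: nodes that update a shared state, connected by (possibly conditional) edges. A minimal pure-Python sketch of that state-machine idea; the dict-based API below is illustrative, not LangGraph's actual StateGraph interface:

```python
def run_graph(nodes, edges, entry, state):
    """Each node maps state -> partial state update; each edge picks
    the next node from the updated state. Run until the END marker."""
    current = entry
    while current != "END":
        state = {**state, **nodes[current](state)}
        current = edges[current](state)
    return state

nodes = {
    "retrieve": lambda s: {"docs": ["LangGraph builds stateful agent graphs."]},
    "grade": lambda s: {"relevant": "LangGraph" in s["docs"][0]},
    "generate": lambda s: {"answer": s["docs"][0]},
}
edges = {
    "retrieve": lambda s: "grade",
    # conditional edge: loop back when the docs fail the relevance check,
    # the same shape as the Relevance Checker step in Adaptive RAG
    "grade": lambda s: "generate" if s["relevant"] else "retrieve",
    "generate": lambda s: "END",
}
result = run_graph(nodes, edges, "retrieve",
                   {"question": "What is LangGraph?"})
```

The conditional edge is what LCEL's acyclic chains cannot express: it may route backwards, giving the cyclic orchestration (retry, self-correction, multi-agent hand-off) that the use cases above depend on.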

CH18: 기타 정보 (Miscellaneous): Provides additional information, specifically on StreamEvent types.

The tutorial emphasizes LangChain's modularity and composability, allowing developers to easily assemble and integrate components, providing both fundamental building blocks and high-level, ready-to-use chains to accelerate development.