Introduction
LangChain is a modular framework that lets developers build applications powered by large language models (LLMs). By providing a uniform API for prompts, model calls, tool use, memory, and retrieval, LangChain lets you focus on the logic of your app while swapping components—such as the underlying LLM or a vector store—without rewriting code. This flexibility accelerates prototyping, improves maintainability, and enables advanced patterns like tool‑augmented reasoning and conversational memory.
Core Concepts
- LLM Wrapper – A thin layer (OpenAI, Anthropic, AzureOpenAI, …) that standardises how a model is called.
- PromptTemplate – A reusable string with placeholders (e.g., {question}) that are filled at runtime.
- Chain – A fixed pipeline that connects a prompt, an LLM, and optional post‑processing steps.
- Agent – A dynamic planner that decides, on each turn, which tool or chain to invoke based on the LLM’s intermediate output.
- Tool – An external function (search API, calculator, database query, etc.) that an agent can call.
- Memory – Stores context across turns so later prompts can reference earlier interactions (ConversationBufferMemory, ConversationSummaryMemory).
- Retriever / Index – A vector store or document loader that supplies relevant passages for retrieval‑augmented generation.
- CallbackManager – Hooks that emit events for logging, streaming, or tracing token‑level activity.
All of these share common base interfaces, which makes components of the same kind interchangeable.
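As a quick illustration of that interchangeability, here is a minimal sketch that swaps the model provider while the rest of the chain stays untouched (it assumes API keys are configured for whichever provider you actually use):

```python
from langchain.llms import OpenAI, Anthropic
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate.from_template("Translate to French: {text}")

# The chain definition is identical; only the LLM wrapper changes.
chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))
# chain = LLMChain(prompt=prompt, llm=Anthropic(temperature=0))  # drop-in replacement

print(chain.run("good morning"))
```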
Data Flow in a Typical Chain
- User input is passed to PromptTemplate.format, which substitutes the variables.
- The rendered prompt is sent to the LLM via LLM.invoke.
- The raw text response is handed to an OutputParser (e.g., RegexParser, PydanticOutputParser).
- The parser returns structured data to the caller.
This linear flow can be visualised as:
User → PromptTemplate → LLM → OutputParser → Result
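Spelled out step by step, the same flow looks roughly like this (a minimal sketch; the specific LLM wrapper and parser are arbitrary choices and assume an OpenAI key is configured):

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import CommaSeparatedListOutputParser

prompt = PromptTemplate.from_template("List three synonyms for: {word}")
llm = OpenAI(temperature=0)
parser = CommaSeparatedListOutputParser()

rendered = prompt.format(word="fast")   # 1. substitute variables
raw_text = llm.invoke(rendered)         # 2. send the prompt to the model
result = parser.parse(raw_text)         # 3. parse the raw text into structured data
print(result)                           # e.g. ['quick', 'rapid', 'speedy']
```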
Agentic AI and ReAct Reasoning
Agents follow a loop known as ReAct (Reason + Act). The steps are:
- Observation – The agent receives the user’s query.
- Thought Generation – The LLM produces a “thought” that may contain a tool request.
- Action – The agent parses the thought, calls the indicated tool, and captures the result.
- Observation Integration – The tool’s output is inserted back into the prompt for the next LLM call.
- Loop – Steps 2–4 repeat until the LLM emits a final answer.
ReAct enables the system to decompose complex problems into smaller, executable sub‑tasks.
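To make the loop concrete, here is a deliberately simplified sketch of what an agent executor does internally. It is illustrative only, not LangChain's actual implementation (which also handles prompt construction, stop sequences, and richer output parsing); parse_thought is a naive hypothetical helper.

```python
import re

def parse_thought(thought: str):
    """Naive extraction of 'Action:' / 'Action Input:' lines (illustrative only)."""
    m = re.search(r"Action:\s*(\S+)\s*\nAction Input:\s*(.+)", thought)
    return {"tool": m.group(1), "input": m.group(2).strip()} if m else None

def react_loop(llm, tools, question, max_steps=10):
    """Sketch of the ReAct loop: think, act, observe, repeat until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        thought = llm.invoke(transcript)                 # Thought generation
        action = parse_thought(thought)
        if action is None:                               # no tool requested -> final answer
            return thought
        tool = next(t for t in tools if t.name == action["tool"])
        observation = tool.func(action["input"])         # Action: call the chosen tool
        transcript += f"{thought}\nObservation: {observation}\n"  # Observation integration
    raise RuntimeError("Agent did not produce a final answer within max_steps")
```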
Core Classes (Python)
| Class | Role | Typical Subclass | Example Use |
|---|---|---|---|
| LLM | Model wrapper | OpenAI, ChatOpenAI, Anthropic | OpenAI(model_name="gpt-3.5-turbo-instruct") |
| PromptTemplate | Text templating | ChatPromptTemplate | PromptTemplate.from_template("Summarize: {text}") |
| Chain | Fixed pipeline | LLMChain, SequentialChain | LLMChain(prompt=prompt, llm=llm) |
| AgentExecutor | Runtime planner | ZeroShotAgent, ConversationalAgent | initialize_agent(tools, llm, agent="zero-shot-react-description") |
| Tool | External function | Tool.from_function | Tool(name="calculator", func=calc, description="evaluate arithmetic") |
| Memory | Context storage | ConversationBufferMemory, ConversationSummaryMemory | ConversationBufferMemory(memory_key="chat_history") |
| Retriever | Document fetcher | FAISS, ElasticVectorSearch | FAISS.from_texts(docs, embeddings) |
| CallbackManager | Event handling | StdOutCallbackHandler, StreamingStdOutCallbackHandler | CallbackManager([StdOutCallbackHandler()]) |
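The Retriever row is the one component not covered by the worked examples below, so here is a minimal sketch of what FAISS.from_texts sets up (it assumes the faiss package is installed and an OpenAI key is available for the embeddings; any embedding model would do):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

docs = ["LangChain supports agents.", "FAISS is an in-memory vector store."]

# Embed the texts and build an index we can query by semantic similarity.
vectorstore = FAISS.from_texts(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})

print(retriever.get_relevant_documents("Which vector store is mentioned?"))
```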
Worked Examples
1. Simple LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
template = "Summarize the following article in one sentence:\n\n{article}"
prompt = PromptTemplate.from_template(template)
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)  # completion-style wrapper needs a completions model
summary_chain = LLMChain(prompt=prompt, llm=llm)
article = """LangChain provides a modular framework for building LLM‑powered applications..."""
result = summary_chain.run(article)
print(result)
What happens:
- PromptTemplate.format inserts the article text.
- OpenAI.invoke sends the completed prompt to the model.
- The model returns a one‑sentence summary, which LLMChain.run returns unchanged.
- Because temperature=0, the output is effectively deterministic; the same article yields the same summary.
2. Structured Output with a Pydantic Parser
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
class Sentiment(BaseModel):
sentiment: str = Field(..., description="positive, neutral, or negative")
confidence: float = Field(..., ge=0, le=1)
parser = PydanticOutputParser(pydantic_object=Sentiment)
template = """Classify the sentiment of the following sentence and output JSON:
Sentence: "{sentence}"
{format_instructions}
"""
prompt = PromptTemplate(
    template=template,
    input_variables=["sentence"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)
chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), output_parser=parser)
result = chain.run({"sentence": "LangChain makes LLM integration painless."})
print(result)
Explanation:
- The parser supplies JSON format instructions that are baked into the prompt.
- The LLM returns a JSON string; the parser validates it against the Sentiment model and returns a typed object.
- If the LLM's output cannot be parsed, the chain raises an OutputParserException (wrapping the underlying ValidationError), which you can catch to retry or fall back, as sketched below.
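A minimal sketch of that retry/fallback pattern, reusing the chain and Sentiment model defined above (the neutral default is just one possible fallback policy):

```python
from langchain.schema import OutputParserException

try:
    result = chain.run({"sentence": "LangChain makes LLM integration painless."})
except OutputParserException:
    # Malformed JSON: fall back to a neutral default (or re-run the chain).
    result = Sentiment(sentiment="neutral", confidence=0.0)
print(result)
```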
3. Zero‑Shot ReAct Agent with a Calculator Tool
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
import math
def calc(expr: str) -> str:
"""Safely evaluate a mathematical expression."""
try:
return str(eval(expr, {"__builtins__": {}}, {"sqrt": math.sqrt, "pow": pow}))
except Exception as e:
return f"Error: {e}"
calculator = Tool(
name="calculator",
func=calc,
description="useful for evaluating arithmetic expressions"
)
agent = initialize_agent(
tools=[calculator],
llm=OpenAI(temperature=0),
    agent="zero-shot-react-description",
verbose=True
)
question = "What is the square root of (12^2 + 5^2)?"
answer = agent.run(question)
print(answer)
Execution flow:
- The agent receives the question.
- The LLM’s thought includes a call to calculator with the expression sqrt(12**2 + 5**2).
- calc evaluates the expression safely and returns 13.0.
- The observation (13.0) is fed back to the LLM, which produces the final answer "13.0".
Error handling tip: the calc function catches any exception and returns a readable error string; that string is passed back to the agent as an observation, so the LLM can correct the expression on the next step instead of the whole run failing.
4. Conversational Agent with Memory
from langchain.memory import ConversationBufferMemory
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
tools=[],
llm=OpenAI(temperature=0),
    agent="conversational-react-description",
memory=memory,
verbose=False
)
print(agent.run("Hi, what's the weather in Paris?"))
print(agent.run("Will it rain tomorrow?"))
What changes:
- The first turn stores the weather query and the LLM’s response in chat_history.
- The second turn automatically includes that history, allowing the LLM to reference the earlier context (e.g., “Based on the earlier weather report…”).
Limitations and Considerations
- Prompt length – Chains that concatenate long histories can exceed model token limits; consider summarising memory with ConversationSummaryMemory (see the sketch after this list).
- Tool description quality – Ambiguous or overly terse descriptions cause the LLM to generate malformed calls, leading to retries or failures.
- Latency – Each tool call adds a round‑trip; for real‑time UI you may need asynchronous execution or streaming callbacks.
- Determinism vs. creativity – Setting temperature=0 yields reproducible results but may limit nuanced answers; higher temperatures increase variability but reduce repeatability.
- Dependency on external services – Vector stores, APIs, and LLM providers introduce network latency and cost; mock them with fake or in-memory stand-ins (such as FakeListLLM) during testing.
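Here is a minimal sketch of the memory‑summarisation point: swapping in ConversationSummaryMemory keeps prompt size roughly constant as the conversation grows (it assumes an OpenAI key; any LLM wrapper works for both the conversation and the summariser).

```python
from langchain.llms import OpenAI
from langchain.memory import ConversationSummaryMemory
from langchain.chains import ConversationChain

llm = OpenAI(temperature=0)

# The memory keeps a rolling summary instead of the full transcript.
memory = ConversationSummaryMemory(llm=llm)
conversation = ConversationChain(llm=llm, memory=memory)

conversation.run("My name is Ada and I'm researching vector databases.")
print(conversation.run("Remind me what I said I was researching."))
```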
Best Practices (Integrated Key Notes)
- Uniform LLM API – Swap model providers by changing only the wrapper class.
- Immutable PromptTemplates – Use partial_variables for static sections such as format instructions; this prevents accidental mutation.
- Deterministic Chains – Keep temperature=0 for pipelines that must produce the same output given identical input.
- Clear Tool Descriptions – Write concise, unambiguous sentences that describe input format and expected behavior.
- Memory Management – For long conversations, replace ConversationBufferMemory with ConversationSummaryMemory to bound token usage.
- Callbacks for UI – Attach StreamingStdOutCallbackHandler or a custom handler to stream tokens to a front‑end, enabling responsive interfaces and easier debugging.
- Robust Error Handling – Wrap every tool function in a try/except block and return a plain‑text error message; the agent receives it as an observation and can adjust its next step instead of crashing.
- Testing without APIs – Use fake models such as FakeListLLM to unit‑test chains and agents offline, ensuring deterministic test results (see the sketch below).
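A minimal offline‑testing sketch along those lines, using FakeListLLM from langchain.llms.fake (the exact module path may differ slightly between LangChain versions):

```python
from langchain.llms.fake import FakeListLLM
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# The fake model returns canned responses in order: no API key or network needed.
fake_llm = FakeListLLM(responses=["LangChain is a framework for LLM apps."])

prompt = PromptTemplate.from_template("Summarize: {text}")
chain = LLMChain(prompt=prompt, llm=fake_llm)

assert chain.run("some long article text") == "LangChain is a framework for LLM apps."
```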
Summary of Takeaways
- LangChain standardises prompts, model calls, tool use, memory, and retrieval, making LLM‑centric applications modular and their components interchangeable.
- A Chain is a static pipeline; an Agent adds dynamic decision‑making via the ReAct loop.
- PromptTemplates, OutputParsers, and Memory together shape how user input is transformed into structured, context‑aware responses.
- Proper tool descriptions, memory pruning, and error‑handling are essential for reliable agent behavior.
- The framework’s callback system and testing utilities support production‑grade development and debugging.