Introduction
LangChain is a modular framework that lets developers build applications powered by large language models (LLMs). By providing a uniform API for prompts, model calls, tool use, memory, and retrieval, LangChain lets you focus on the logic of your app while swapping components—such as the underlying LLM or a vector store—without rewriting code. This flexibility accelerates prototyping, improves maintainability, and enables advanced patterns like tool‑augmented reasoning and conversational memory.
Core Concepts
- LLM Wrapper – A thin layer (OpenAI, Anthropic, AzureOpenAI, …) that standardises how a model is called.
- PromptTemplate – A reusable string with placeholders (e.g., {question}) that are filled at runtime.
- Chain – A fixed pipeline that connects a prompt, an LLM, and optional post‑processing steps.
- Agent – A dynamic planner that decides, on each turn, which tool or chain to invoke based on the LLM’s intermediate output.
- Tool – An external function (search API, calculator, database query, etc.) that an agent can call.
- Memory – Stores context across turns so later prompts can reference earlier interactions (ConversationBufferMemory, ConversationSummaryMemory).
- Retriever / Index – A vector store or document loader that supplies relevant passages for retrieval‑augmented generation.
- CallbackManager – Hooks that emit events for logging, streaming, or tracing token‑level activity.
All of these share common base interfaces, which makes components of the same kind interchangeable.
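As a quick illustration of that interchangeability, here is a minimal sketch that swaps the model provider while the rest of the chain stays untouched (it assumes API keys are configured for whichever provider you actually use):

```python
from langchain.llms import OpenAI, Anthropic
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate.from_template("Translate to French: {text}")

# The chain definition is identical; only the LLM wrapper changes.
chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0))
# chain = LLMChain(prompt=prompt, llm=Anthropic(temperature=0))  # drop-in replacement

print(chain.run("good morning"))
```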
Data Flow in a Typical Chain
- User input is passed to PromptTemplate.format, which substitutes the variables.
- The rendered prompt is sent to the LLM via LLM.invoke.
- The raw text response is handed to an OutputParser (e.g., RegexParser, PydanticOutputParser).
- The parser returns structured data to the caller.
This linear flow can be visualised as:
User → PromptTemplate → LLM → OutputParser → Result
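Spelled out step by step, the same flow looks roughly like this (a minimal sketch; the specific LLM wrapper and parser are arbitrary choices and assume an OpenAI key is configured):

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import CommaSeparatedListOutputParser

prompt = PromptTemplate.from_template("List three synonyms for: {word}")
llm = OpenAI(temperature=0)
parser = CommaSeparatedListOutputParser()

rendered = prompt.format(word="fast")   # 1. substitute variables
raw_text = llm.invoke(rendered)         # 2. send the prompt to the model
result = parser.parse(raw_text)         # 3. parse the raw text into structured data
print(result)                           # e.g. ['quick', 'rapid', 'speedy']
```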
Agentic AI and ReAct Reasoning
Agents follow a loop known as ReAct (Reason + Act). The steps are:
- Observation – The agent receives the user’s query.
- Thought Generation – The LLM produces a “thought” that may contain a tool request.
- Action – The agent parses the thought, calls the indicated tool, and captures the result.
- Observation Integration – The tool’s output is inserted back into the prompt for the next LLM call.
- Loop – Steps 2–4 repeat until the LLM emits a final answer.
ReAct enables the system to decompose complex problems into smaller, executable sub‑tasks.
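To make the loop concrete, here is a deliberately simplified sketch of what an agent executor does internally. It is illustrative only, not LangChain's actual implementation (which also handles prompt construction, stop sequences, and richer output parsing); parse_thought is a naive hypothetical helper.

```python
import re

def parse_thought(thought: str):
    """Naive extraction of 'Action:' / 'Action Input:' lines (illustrative only)."""
    m = re.search(r"Action:\s*(\S+)\s*\nAction Input:\s*(.+)", thought)
    return {"tool": m.group(1), "input": m.group(2).strip()} if m else None

def react_loop(llm, tools, question, max_steps=10):
    """Sketch of the ReAct loop: think, act, observe, repeat until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        thought = llm.invoke(transcript)                 # Thought generation
        action = parse_thought(thought)
        if action is None:                               # no tool requested -> final answer
            return thought
        tool = next(t for t in tools if t.name == action["tool"])
        observation = tool.func(action["input"])         # Action: call the chosen tool
        transcript += f"{thought}\nObservation: {observation}\n"  # Observation integration
    raise RuntimeError("Agent did not produce a final answer within max_steps")
```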
Core Classes (Python)
| Class | Role | Typical Subclass | Example Use |
|---|---|---|---|
| LLM | Model wrapper | OpenAI, ChatOpenAI, Anthropic | OpenAI(model_name="gpt-3.5-turbo-instruct") |
| PromptTemplate | Text templating | ChatPromptTemplate | PromptTemplate.from_template("Summarize: {text}") |
| Chain | Fixed pipeline | LLMChain, SequentialChain | LLMChain(prompt=prompt, llm=llm) |
| AgentExecutor | Runtime planner | ZeroShotAgent, ConversationalAgent | initialize_agent(tools, llm, agent="zero-shot-react-description") |
| Tool | External function | Tool.from_function | Tool(name="calculator", func=calc, description="evaluate arithmetic") |
| Memory | Context storage | ConversationBufferMemory, ConversationSummaryMemory | ConversationBufferMemory(memory_key="chat_history") |
| Retriever | Document fetcher | FAISS, ElasticVectorSearch | FAISS.from_texts(docs, embeddings) |
| CallbackManager | Event handling | StdOutCallbackHandler, StreamingStdOutCallbackHandler | CallbackManager([StdOutCallbackHandler()]) |
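The Retriever row is the one component not covered by the worked examples below, so here is a minimal sketch of what FAISS.from_texts sets up (it assumes the faiss package is installed and an OpenAI key is available for the embeddings; any embedding model would do):

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

docs = ["LangChain supports agents.", "FAISS is an in-memory vector store."]

# Embed the texts and build an index we can query by semantic similarity.
vectorstore = FAISS.from_texts(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})

print(retriever.get_relevant_documents("Which vector store is mentioned?"))
```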
Worked Examples
1. Simple LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
template = "Summarize the following article in one sentence:\n\n{article}"
prompt = PromptTemplate.from_template(template)
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)  # completion-style wrapper needs a completions model
summary_chain = LLMChain(prompt=prompt, llm=llm)
article = """LangChain provides a modular framework for building LLM‑powered applications..."""
result = summary_chain.run(article)
print(result)
What happens:
- PromptTemplate.format inserts the article text.
- OpenAI.invoke sends the completed prompt to the model.
- The model returns a one‑sentence summary, which LLMChain.run returns unchanged.
- Because temperature=0, the output is effectively deterministic; the same article yields the same summary.
2. Structured Output with a Pydantic Parser
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
class Sentiment(BaseModel):
sentiment: str = Field(..., description="positive, neutral, or negative")
confidence: float = Field(..., ge=0, le=1)
parser = PydanticOutputParser(pydantic_object=Sentiment)
template = """Classify the sentiment of the following sentence and output JSON:
Sentence: "{sentence}"
{format_instructions}
"""
prompt = PromptTemplate(
    template=template,
    input_variables=["sentence"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)
chain = LLMChain(prompt=prompt, llm=OpenAI(temperature=0), output_parser=parser)
result = chain.run({"sentence": "LangChain makes LLM integration painless."})
print(result)
Explanation:
- The parser supplies JSON format instructions that are baked into the prompt.
- The LLM returns a JSON string; the parser validates it against the Sentiment model and returns a typed object.
- If the LLM's output cannot be parsed, the chain raises an OutputParserException (wrapping the underlying ValidationError), which you can catch to retry or fall back, as sketched below.
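A minimal sketch of that retry/fallback pattern, reusing the chain and Sentiment model defined above (the neutral default is just one possible fallback policy):

```python
from langchain.schema import OutputParserException

try:
    result = chain.run({"sentence": "LangChain makes LLM integration painless."})
except OutputParserException:
    # Malformed JSON: fall back to a neutral default (or re-run the chain).
    result = Sentiment(sentiment="neutral", confidence=0.0)
print(result)
```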
3. Zero‑Shot ReAct Agent with a Calculator Tool
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
import math
def calc(expr: str) -> str:
"""Safely evaluate a mathematical expression."""
try:
return str(eval(expr, {"__builtins__": {}}, {"sqrt": math.sqrt, "pow": pow}))
except Exception as e:
return f"Error: {e}"
calculator = Tool(
name="calculator",
func=calc,
description="useful for evaluating arithmetic expressions"
)
agent = initialize_agent(
tools=[calculator],
llm=OpenAI(temperature=0),
    agent="zero-shot-react-description",
verbose=True
)
question = "What is the square root of (12^2 + 5^2)?"
answer = agent.run(question)
print(answer)
Execution flow:
- The agent receives the question.
- The LLM’s thought includes a call to calculator with the expression sqrt(12**2 + 5**2).
- calc evaluates the expression safely and returns 13.0.
- The observation (13.0) is fed back to the LLM, which produces the final answer "13.0".
Error handling tip: the calc function catches any exception and returns a readable error string; that string is passed back to the agent as an observation, so the LLM can correct the expression on the next step instead of the whole run failing.
4. Conversational Agent with Memory
from langchain.memory import ConversationBufferMemory
from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI
memory = ConversationBufferMemory(memory_key="chat_history")
agent = initialize_agent(
tools=[],
llm=OpenAI(temperature=0),
    agent="conversational-react-description",
memory=memory,
verbose=False
)
print(agent.run("Hi, what's the weather in Paris?"))
print(agent.run("Will it rain tomorrow?"))
What changes:
- The first turn stores the weather query and the LLM’s response in chat_history.
- The second turn automatically includes that history, allowing the LLM to reference the earlier context (e.g., “Based on the earlier weather report…”).
Limitations and Considerations
- Prompt length – Chains that concatenate long histories can exceed model token limits; consider summarising memory with ConversationSummaryMemory (see the sketch after this list).
- Tool description quality – Ambiguous or overly terse descriptions cause the LLM to generate malformed calls, leading to retries or failures.
- Latency – Each tool call adds a round‑trip; for real‑time UI you may need asynchronous execution or streaming callbacks.
- Determinism vs. creativity – Setting temperature=0 yields reproducible results but may limit nuanced answers; higher temperatures increase variability but reduce repeatability.
- Dependency on external services – Vector stores, APIs, and LLM providers introduce network latency and cost; mock them with fake or in-memory stand-ins (such as FakeListLLM) during testing.
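Here is a minimal sketch of the memory‑summarisation point: swapping in ConversationSummaryMemory keeps prompt size roughly constant as the conversation grows (it assumes an OpenAI key; any LLM wrapper works for both the conversation and the summariser).

```python
from langchain.llms import OpenAI
from langchain.memory import ConversationSummaryMemory
from langchain.chains import ConversationChain

llm = OpenAI(temperature=0)

# The memory keeps a rolling summary instead of the full transcript.
memory = ConversationSummaryMemory(llm=llm)
conversation = ConversationChain(llm=llm, memory=memory)

conversation.run("My name is Ada and I'm researching vector databases.")
print(conversation.run("Remind me what I said I was researching."))
```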
Best Practices (Integrated Key Notes)
- Uniform LLM API – Swap model providers by changing only the wrapper class.
- Immutable PromptTemplates – Use partial_variables for static sections such as format instructions; this prevents accidental mutation.
- Deterministic Chains – Keep temperature=0 for pipelines that must produce the same output given identical input.
- Clear Tool Descriptions – Write concise, unambiguous sentences that describe input format and expected behavior.
- Memory Management – For long conversations, replace ConversationBufferMemory with ConversationSummaryMemory to bound token usage.
- Callbacks for UI – Attach StreamingStdOutCallbackHandler or a custom handler to stream tokens to a front‑end, enabling responsive interfaces and easier debugging.
- Robust Error Handling – Wrap every tool function in a try/except block and return a plain‑text error message; the agent receives it as an observation and can adjust its next step instead of crashing.
- Testing without APIs – Use fake models such as FakeListLLM to unit‑test chains and agents offline, ensuring deterministic test results (see the sketch below).
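A minimal offline‑testing sketch along those lines, using FakeListLLM from langchain.llms.fake (the exact module path may differ slightly between LangChain versions):

```python
from langchain.llms.fake import FakeListLLM
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# The fake model returns canned responses in order: no API key or network needed.
fake_llm = FakeListLLM(responses=["LangChain is a framework for LLM apps."])

prompt = PromptTemplate.from_template("Summarize: {text}")
chain = LLMChain(prompt=prompt, llm=fake_llm)

assert chain.run("some long article text") == "LangChain is a framework for LLM apps."
```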
Summary of Takeaways
- LangChain standardises prompts, model calls, tool use, memory, and retrieval, making LLM‑centric applications modular and their components interchangeable.
- A Chain is a static pipeline; an Agent adds dynamic decision‑making via the ReAct loop.
- PromptTemplates, OutputParsers, and Memory together shape how user input is transformed into structured, context‑aware responses.
- Proper tool descriptions, memory pruning, and error‑handling are essential for reliable agent behavior.
- The framework’s callback system and testing utilities support production‑grade development and debugging.