Haystack by deepset: Building Production NLP and Agent Pipelines
Learn how Haystack's pipeline architecture and component-based design enable building production-grade NLP and agent systems with flexible routing, branching, and ready-made components.
Haystack's Pipeline-First Philosophy
Haystack, developed by deepset, approaches AI application development as pipeline engineering. Instead of building agents that autonomously decide their next action, Haystack lets you define explicit data processing pipelines where components are connected in a directed graph. Data flows from one component to the next through well-defined input and output sockets.
This philosophy prioritizes predictability and debuggability over autonomy. You know exactly what will happen at each step because you designed the pipeline graph. When something goes wrong, you can inspect the output of each component in isolation.
Component Architecture
Every building block in Haystack is a component — a class with typed input and output sockets. Components are self-contained and reusable:
from haystack import component

@component
class TextCleaner:
    @component.output_types(cleaned_text=str)
    def run(self, text: str) -> dict:
        cleaned = text.strip().replace("\n\n", "\n")
        return {"cleaned_text": cleaned}

@component
class WordCounter:
    @component.output_types(count=int)
    def run(self, text: str) -> dict:
        return {"count": len(text.split())}
The @component decorator and typed output sockets enable Haystack to validate pipeline connections at build time. If you try to connect a component's string output to another component's integer input, Haystack raises an error before the pipeline runs.
Building Pipelines
Pipelines connect components into directed graphs:
from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
# Set up a document store (assume documents have already been written to it)
document_store = InMemoryDocumentStore()
# Build a RAG pipeline
rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", InMemoryBM25Retriever(document_store))
rag_pipeline.add_component(
    "prompt_builder",
    PromptBuilder(
        template="""Given these documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Answer the question: {{ query }}"""
    ),
)
rag_pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))
# Connect components
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "llm.prompt")
# Run the pipeline
result = rag_pipeline.run({
    "retriever": {"query": "What is agentic AI?"},
    "prompt_builder": {"query": "What is agentic AI?"},
})
print(result["llm"]["replies"][0])
Branching and Routing
Haystack pipelines support conditional branching through router components. This lets you build pipelines that take different paths based on the input:
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.routers import MetadataRouter

# Route documents based on file type
router = MetadataRouter(
    rules={
        "pdf_docs": {"field": "meta.file_type", "operator": "==", "value": "pdf"},
        "text_docs": {"field": "meta.file_type", "operator": "==", "value": "txt"},
    }
)

pipeline = Pipeline()
pipeline.add_component("router", router)
pipeline.add_component("pdf_splitter", DocumentSplitter(split_by="page", split_length=1))
pipeline.add_component("text_cleaner", DocumentCleaner())
pipeline.connect("router.pdf_docs", "pdf_splitter.documents")
pipeline.connect("router.text_docs", "text_cleaner.documents")
For more dynamic routing, the ConditionalRouter uses Jinja2 templates to evaluate conditions:
from haystack.components.routers import ConditionalRouter
routes = [
    {
        "condition": "{{ replies[0] | length > 500 }}",
        "output": "long_response",
        "output_name": "long",
        "output_type": str,
    },
    {
        "condition": "{{ replies[0] | length <= 500 }}",
        "output": "short_response",
        "output_name": "short",
        "output_type": str,
    },
]
router = ConditionalRouter(routes=routes)
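Since these conditions are plain Jinja2 expressions, you can sanity-check one outside Haystack by rendering it directly (a standalone sketch using jinja2, which Haystack itself depends on):

```python
from jinja2 import Environment

env = Environment()
condition = "{{ replies[0] | length <= 500 }}"
# A short reply satisfies the "short" route's condition
rendered = env.from_string(condition).render(replies=["a short reply"])
print(rendered)  # "True"
```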
Agent-Like Behavior with Loops
Haystack 2.x supports pipeline loops, enabling agent-like iterative behavior. You can create a pipeline where the LLM output feeds back into a tool-calling component, which feeds results back to the LLM:
from haystack import Pipeline
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.tools import ToolInvoker
from haystack.tools import Tool

# Define tools
def search_web(query: str) -> str:
    return f"Search results for: {query}"

web_tool = Tool(
    name="search_web",
    description="Search the web for information",
    function=search_web,
    parameters={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
)

# Build an agent pipeline with a loop
agent_pipeline = Pipeline(max_runs_per_component=5)
agent_pipeline.add_component("llm", OpenAIChatGenerator(
    model="gpt-4o", tools=[web_tool]
))
agent_pipeline.add_component("tool_invoker", ToolInvoker(tools=[web_tool]))

# Create a loop: LLM -> tools -> back to LLM
agent_pipeline.connect("llm.replies", "tool_invoker.messages")
agent_pipeline.connect("tool_invoker.tool_messages", "llm.messages")
The max_runs_per_component parameter prevents infinite loops by capping how many times any component can execute within a single pipeline run.
Production Strengths
Haystack's pipeline architecture has distinct advantages for production deployments. Pipelines can be serialized to YAML for version control and deployment automation. Components are independently testable. The explicit graph structure makes it straightforward to add monitoring, logging, and error handling at each node.
Haystack also provides ready-made components for common tasks — document converters, text splitters, embedding generators, retrievers for various vector stores, and generators for multiple LLM providers.
FAQ
How does Haystack compare to LangChain for RAG applications?
Both handle RAG well, but Haystack's pipeline architecture gives you more explicit control over the data flow. LangChain's chain abstraction is more flexible but less predictable. For teams that value debuggability and pipeline reproducibility, Haystack's approach is often preferred.
Can Haystack pipelines run asynchronously?
Yes. Haystack 2.x supports async execution. Components that implement an async run method execute concurrently when possible, improving throughput for I/O-bound pipelines.
Is Haystack suitable for real-time applications?
Haystack pipelines add minimal overhead beyond the component execution time. For latency-sensitive applications, the explicit pipeline graph lets you optimize the critical path and parallelize independent branches.
#Haystack #Deepset #NLPPipelines #AgentFrameworks #ProductionAI #AgenticAI #LearnAI #AIEngineering