Haystack by deepset: Building Production NLP and Agent Pipelines
Learn how Haystack's pipeline architecture and component-based design enable building production-grade NLP and agent systems with flexible routing, branching, and ready-made components.
Haystack's Pipeline-First Philosophy
Haystack, developed by deepset, approaches AI application development as pipeline engineering. Instead of building agents that autonomously decide their next action, Haystack lets you define explicit data processing pipelines where components are connected in a directed graph. Data flows from one component to the next through well-defined input and output sockets.
This philosophy prioritizes predictability and debuggability over autonomy. You know exactly what will happen at each step because you designed the pipeline graph. When something goes wrong, you can inspect the output of each component in isolation.
Component Architecture
Every building block in Haystack is a component — a class with typed input and output sockets. Components are self-contained and reusable:
from haystack import component

@component
class TextCleaner:
    @component.output_types(cleaned_text=str)
    def run(self, text: str) -> dict:
        cleaned = text.strip().replace("\n\n", "\n")
        return {"cleaned_text": cleaned}

@component
class WordCounter:
    @component.output_types(count=int)
    def run(self, text: str) -> dict:
        return {"count": len(text.split())}
The @component decorator and typed output sockets enable Haystack to validate pipeline connections at build time. If you try to connect a component's string output to another component's integer input, Haystack raises an error before the pipeline runs.
Building Pipelines
Pipelines connect components into directed graphs:
from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
# Set up a document store (assume documents have already been written to it)
document_store = InMemoryDocumentStore()
# Build a RAG pipeline
rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", InMemoryBM25Retriever(document_store))
rag_pipeline.add_component(
    "prompt_builder",
    PromptBuilder(
        template="""Given these documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Answer the question: {{ query }}"""
    ),
)
rag_pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o"))
# Connect components
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "llm.prompt")
# Run the pipeline
result = rag_pipeline.run({
    "retriever": {"query": "What is agentic AI?"},
    "prompt_builder": {"query": "What is agentic AI?"},
})
print(result["llm"]["replies"][0])
Branching and Routing
Haystack pipelines support conditional branching through router components. This lets you build pipelines that take different paths based on the input:
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.routers import MetadataRouter

# Route documents based on file type
router = MetadataRouter(
    rules={
        "pdf_docs": {"field": "meta.file_type", "operator": "==", "value": "pdf"},
        "text_docs": {"field": "meta.file_type", "operator": "==", "value": "txt"},
    }
)

pipeline = Pipeline()
pipeline.add_component("router", router)
pipeline.add_component("pdf_splitter", DocumentSplitter(split_by="page", split_length=1))
pipeline.add_component("text_cleaner", DocumentCleaner())
pipeline.connect("router.pdf_docs", "pdf_splitter.documents")
pipeline.connect("router.text_docs", "text_cleaner.documents")
For more dynamic routing, the ConditionalRouter uses Jinja2 templates to evaluate conditions:
from haystack.components.routers import ConditionalRouter
routes = [
    {
        "condition": "{{ replies[0] | length > 500 }}",
        "output": "long_response",
        "output_name": "long",
        "output_type": str,
    },
    {
        "condition": "{{ replies[0] | length <= 500 }}",
        "output": "short_response",
        "output_name": "short",
        "output_type": str,
    },
]
router = ConditionalRouter(routes=routes)
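Since these conditions are plain Jinja2 expressions, you can sanity-check one outside Haystack by rendering it directly (a standalone sketch using jinja2, which Haystack itself depends on):

```python
from jinja2 import Environment

env = Environment()
condition = "{{ replies[0] | length <= 500 }}"
# A short reply satisfies the "short" route's condition
rendered = env.from_string(condition).render(replies=["a short reply"])
print(rendered)  # "True"
```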
Agent-Like Behavior with Loops
Haystack 2.x supports pipeline loops, enabling agent-like iterative behavior. You can create a pipeline where the LLM output feeds back into a tool-calling component, which feeds results back to the LLM:
from haystack import Pipeline
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.tools import ToolInvoker
from haystack.tools import Tool

# Define tools
def search_web(query: str) -> str:
    return f"Search results for: {query}"

web_tool = Tool(
    name="search_web",
    description="Search the web for information",
    function=search_web,
    parameters={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
)

# Build an agent pipeline with a loop
agent_pipeline = Pipeline(max_runs_per_component=5)
agent_pipeline.add_component("llm", OpenAIChatGenerator(
    model="gpt-4o", tools=[web_tool]
))
agent_pipeline.add_component("tool_invoker", ToolInvoker(tools=[web_tool]))

# Create a loop: LLM -> tools -> back to LLM
agent_pipeline.connect("llm.replies", "tool_invoker.messages")
agent_pipeline.connect("tool_invoker.tool_messages", "llm.messages")
The max_runs_per_component parameter prevents infinite loops by capping how many times any component can execute within a single pipeline run.
Production Strengths
Haystack's pipeline architecture has distinct advantages for production deployments. Pipelines can be serialized to YAML for version control and deployment automation. Components are independently testable. The explicit graph structure makes it straightforward to add monitoring, logging, and error handling at each node.
Haystack also provides ready-made components for common tasks — document converters, text splitters, embedding generators, retrievers for various vector stores, and generators for multiple LLM providers.
FAQ
How does Haystack compare to LangChain for RAG applications?
Both handle RAG well, but Haystack's pipeline architecture gives you more explicit control over the data flow. LangChain's chain abstraction is more flexible but less predictable. For teams that value debuggability and pipeline reproducibility, Haystack's approach is often preferred.
Can Haystack pipelines run asynchronously?
Yes. Haystack 2.x supports async execution. Components that implement an async run method execute concurrently when possible, improving throughput for I/O-bound pipelines.
Is Haystack suitable for real-time applications?
Haystack pipelines add minimal overhead beyond the component execution time. For latency-sensitive applications, the explicit pipeline graph lets you optimize the critical path and parallelize independent branches.
#Haystack #Deepset #NLPPipelines #AgentFrameworks #ProductionAI #AgenticAI #LearnAI #AIEngineering