Tech Frontline Apr 8, 2026 4 min read

Building a Custom RAG Pipeline: Step-by-Step Tutorial with Haystack v2

Unlock the secrets to building a scalable, production-ready RAG pipeline using Haystack v2—complete with practical code and architecture tips.

Tech Daily Shot Team
Published Apr 8, 2026

Retrieval-Augmented Generation (RAG) grounds a language model’s answers in documents retrieved at query time, which lets developers build context-aware AI applications on top of their own data. If you’re looking to build a custom RAG pipeline with Haystack v2, you’re in the right place. This hands-on tutorial walks you through each step, from setup to running queries, with practical code, troubleshooting tips, and next steps.

For a broader overview of RAG architectures, their use-cases, and deployment strategies, check out our Ultimate Guide to RAG Pipelines: Building Reliable Retrieval-Augmented Generation Systems. Here, we’ll dive deep into the nuts and bolts of building your own pipeline using Haystack v2.


Prerequisites

We’ll use a simple local document store (FAISS) and OpenAI’s GPT-3.5-turbo for demonstration, but you can swap in other vector stores or LLMs as needed.


1. Environment Setup

  1. Create a Virtual Environment
    python3 -m venv rag-tutorial-env
    source rag-tutorial-env/bin/activate  # On Windows: rag-tutorial-env\Scripts\activate
        
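  2. Install Haystack and Required Libraries
    pip install "farm-haystack[faiss,inference]"
        

    This installs Haystack with FAISS (for vector search) and the local inference dependencies that EmbeddingRetriever needs to load sentence-transformers models. Quote the package spec so your shell does not interpret the square brackets. Note that the modules imported in this tutorial (haystack.nodes, haystack.document_stores) and Pipeline.add_node belong to the farm-haystack package, which is the Haystack 1.x line; there is no 2.0.0 release of farm-haystack, and Haystack 2.x ships as the separate haystack-ai package with a different component API. For other backends, see Haystack’s [extras] options.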

  3. Set Your API Key (if using OpenAI)
    export OPENAI_API_KEY=your-openai-api-key  # On Windows: set OPENAI_API_KEY=your-openai-api-key
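Before moving on, it can help to confirm that the key is actually visible to Python. The sketch below is one way to do that; the helper name require_env is hypothetical and not part of Haystack.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the pipeline")
    return value


# Typical use at the top of your script:
#   api_key = require_env("OPENAI_API_KEY")
```

Failing early with a readable message is usually easier to debug than an authentication error raised in the middle of a pipeline run.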
        

2. Prepare Your Data

  1. Organize Documents

    Place your text files in a folder called data/. For this tutorial, let’s use three sample text files:

    • data/doc1.txt: "Haystack is an open-source framework for building search systems powered by language models."
    • data/doc2.txt: "Retrieval-Augmented Generation combines retrieval and generation to improve answer accuracy."
    • data/doc3.txt: "FAISS is a library for efficient similarity search and clustering of dense vectors."

    (You can use your own corpus; just adjust the file names.)
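If you would rather create the sample files from a script than by hand, a minimal sketch using only the standard library could look like this. The function name write_sample_docs is an assumption made for illustration; the file names and texts are the three samples listed above.

```python
from pathlib import Path

SAMPLE_DOCS = {
    "doc1.txt": "Haystack is an open-source framework for building search systems powered by language models.",
    "doc2.txt": "Retrieval-Augmented Generation combines retrieval and generation to improve answer accuracy.",
    "doc3.txt": "FAISS is a library for efficient similarity search and clustering of dense vectors.",
}


def write_sample_docs(data_dir: str = "data") -> list[Path]:
    """Create data_dir if needed and write one UTF-8 text file per sample."""
    root = Path(data_dir)
    root.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, text in SAMPLE_DOCS.items():
        path = root / name
        path.write_text(text + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_sample_docs() with no arguments writes into data/ under the current working directory.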


3. Build the Haystack Pipeline

  1. Import Required Modules
    
    from haystack import Pipeline
    from haystack.document_stores import FAISSDocumentStore
    from haystack.nodes import EmbeddingRetriever, PromptNode, PromptTemplate
    from haystack.utils import convert_files_to_docs
        
  2. Initialize the Document Store (FAISS)
    
    document_store = FAISSDocumentStore(embedding_dim=384, faiss_index_factory_str="Flat")
        

    Tip: embedding_dim should match your retriever’s embedding model. We’ll use sentence-transformers/all-MiniLM-L6-v2 (384 dimensions).
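To make the tip concrete: a "Flat" index compares the query vector against every stored vector, so every vector must have the same number of dimensions. The pure-Python sketch below mimics that behavior with dot-product scoring. TinyFlatIndex is a hypothetical toy class for illustration only, not how FAISS or Haystack implement it.

```python
class TinyFlatIndex:
    """Toy exhaustive-search index that enforces a fixed vector dimension."""

    def __init__(self, dim: int):
        self.dim = dim
        self.vectors: list[list[float]] = []

    def _check(self, vector: list[float]) -> None:
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")

    def add(self, vector: list[float]) -> None:
        self._check(vector)
        self.vectors.append(list(vector))

    def search(self, query: list[float], top_k: int = 1) -> list[tuple[int, float]]:
        """Return (position, score) pairs for the top_k highest dot products."""
        self._check(query)
        scores = [
            (i, sum(q * v for q, v in zip(query, vec)))
            for i, vec in enumerate(self.vectors)
        ]
        return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]
```

A mismatch between the store's dimension and the embedding model's output shows up here as the ValueError; in a real index it typically surfaces as a shape error when vectors are written.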

  3. Index Your Documents
    
    from haystack.utils import convert_files_to_docs
    
    docs = convert_files_to_docs(dir_path="data/")
    document_store.write_documents(docs)
        

    This step parses your text files and writes their text and metadata into the document store. The vectors themselves are computed and added to the FAISS index in the next step.
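If you want more control over loading, document stores in the 1.x API also accept plain dictionaries with content and meta keys. The sketch below builds such dictionaries with the standard library, assuming UTF-8 .txt files; load_text_docs is a hypothetical helper, not a Haystack function.

```python
from pathlib import Path


def load_text_docs(dir_path: str) -> list[dict]:
    """Read every .txt file in dir_path into a {"content", "meta"} dictionary."""
    docs = []
    for path in sorted(Path(dir_path).glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files so they never reach the index
            docs.append({"content": text, "meta": {"name": path.name}})
    return docs


# Possible use in place of convert_files_to_docs (assumes document_store exists):
#   document_store.write_documents(load_text_docs("data/"))
```

Keeping the file name in meta makes it easier to trace a generated answer back to its source file later.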

  4. Add Embeddings with a Retriever
    
    retriever = EmbeddingRetriever(
        document_store=document_store,
        embedding_model="sentence-transformers/all-MiniLM-L6-v2",
        model_format="sentence_transformers"
    )
    document_store.update_embeddings(retriever)
        

    The retriever will embed your documents for semantic search.

  5. Set Up a PromptNode for Generation
    
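    import os

    # {join(documents)} inserts the retrieved documents; {query} is the user question.
    prompt_template = PromptTemplate(
        prompt="Answer the question based on the following context: {join(documents)} \n Question: {query}",
        output_parser=None
    )
    generator = PromptNode(
        model_name_or_path="gpt-3.5-turbo",
        api_key=os.environ["OPENAI_API_KEY"],  # Pass the key explicitly; raises KeyError if it is unset
        default_prompt_template=prompt_template,
        max_length=256  # Upper bound on the number of generated tokens
    )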
        

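    Note: For open-source LLMs, set model_name_or_path to a Hugging Face model ID such as "google/flan-t5-base". The model then runs locally, so no API key is needed, and you may need to lower max_length to fit the model’s context window.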
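To see what the model actually receives, it can help to render the template by hand. The sketch below reproduces the prompt string from this step in plain Python, assuming the retrieved documents are joined with a single space; build_prompt is a hypothetical helper, not the PromptTemplate implementation.

```python
def build_prompt(query: str, documents: list[str]) -> str:
    """Fill the tutorial's prompt with retrieved context and the user question."""
    context = " ".join(doc.strip() for doc in documents)
    return (
        "Answer the question based on the following context: "
        f"{context} \n Question: {query}"
    )


prompt = build_prompt(
    "What is FAISS used for?",
    ["FAISS is a library for efficient similarity search and clustering of dense vectors."],
)
```

Printing prompt before sending anything to an LLM is a cheap way to catch empty context or a malformed template.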

  6. Assemble the Pipeline
    
    rag_pipeline = Pipeline()
    rag_pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
    rag_pipeline.add_node(component=generator, name="Generator", inputs=["Retriever"])
        

    This connects the retriever and generator: Query → Retriever → Generator.
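The data flow Query → Retriever → Generator can be sketched without any framework. In the toy version below, each node reads a shared state dictionary and returns the keys it adds. The names run_linear, toy_retriever, and toy_generator are hypothetical stand-ins, with word overlap in place of embeddings and an echo in place of an LLM.

```python
from typing import Callable

Node = Callable[[dict], dict]

CORPUS = [
    "Haystack is an open-source framework for building search systems powered by language models.",
    "FAISS is a library for efficient similarity search and clustering of dense vectors.",
]


def run_linear(nodes: list[tuple[str, Node]], query: str) -> dict:
    """Pass a shared state dict through each named node in order."""
    state = {"query": query}
    for _name, node in nodes:
        state = {**state, **node(state)}
    return state


def toy_retriever(state: dict) -> dict:
    # Stand-in for semantic search: pick the document sharing the most words.
    words = set(state["query"].lower().split())
    best = max(CORPUS, key=lambda doc: len(words & set(doc.lower().split())))
    return {"documents": [best]}


def toy_generator(state: dict) -> dict:
    # Stand-in for an LLM: echo the top retrieved document as the answer.
    return {"results": [state["documents"][0]]}


result = run_linear(
    [("Retriever", toy_retriever), ("Generator", toy_generator)],
    "What is FAISS used for?",
)
```

The point of the sketch is the contract between nodes: the generator only ever sees what the retriever put into documents.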


4. Run Inference: Ask Questions!

  1. Query the Pipeline
    
    query = "What is FAISS used for?"
    result = rag_pipeline.run(query=query)
    print(result["results"])
        

    Expected Output: The model should generate an answer using retrieved context, e.g.:
    ['FAISS is a library for efficient similarity search and clustering of dense vectors.']


  2. Try Another Query
    
    query = "What does RAG stand for?"
    result = rag_pipeline.run(query=query)
    print(result["results"])
        

    Expected Output: The model should expand "RAG" as Retrieval-Augmented Generation and add a brief explanation.
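Because result["results"] is a list, application code usually wants a single string plus a fallback for the case where nothing was generated. A minimal sketch, with first_answer as a hypothetical helper name:

```python
def first_answer(result: dict, default: str = "No answer generated.") -> str:
    """Return the first non-empty string in result["results"], else a default."""
    answers = result.get("results") or []
    for answer in answers:
        if isinstance(answer, str) and answer.strip():
            return answer.strip()
    return default


# Possible use with the pipeline from this tutorial:
#   print(first_answer(rag_pipeline.run(query="What is FAISS used for?")))
```

This keeps the calling code free of index errors when the model returns an empty list or blank text.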


5. Customizing and Extending Your Pipeline

  1. Swap in a Different Retriever or LLM

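    You can use other embedding models (e.g., BAAI/bge-base-en) or LLMs (e.g., HuggingFace’s tiiuae/falcon-7b-instruct). For a new embedding model, pass its name as embedding_model on the retriever and set embedding_dim on the document store to the model’s vector size (768 for BAAI/bge-base-en), then rebuild the index so the stored vectors match. For a new LLM, change model_name_or_path on the PromptNode.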
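One way to keep the model name and the vector size in sync is a small lookup that fails loudly for unknown models. The table below covers only the two embedding models named in this tutorial; EMBEDDING_DIMS and embedding_dim_for are hypothetical names for illustration.

```python
EMBEDDING_DIMS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "BAAI/bge-base-en": 768,
}


def embedding_dim_for(model_name: str) -> int:
    """Look up the vector size for a known embedding model."""
    try:
        return EMBEDDING_DIMS[model_name]
    except KeyError:
        raise ValueError(
            f"Unknown model {model_name!r}; check its vector size and add it to EMBEDDING_DIMS"
        ) from None


# Possible use when creating the store:
#   FAISSDocumentStore(embedding_dim=embedding_dim_for(model_name), ...)
```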

  2. Add a Ranker or Filter
    
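    from haystack.nodes import TransformersRanker
    
    ranker = TransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-6-v2")
    
    # Rebuild the pipeline so the ranker sits between the retriever and the generator.
    rag_pipeline = Pipeline()
    rag_pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
    rag_pipeline.add_node(component=ranker, name="Ranker", inputs=["Retriever"])
    rag_pipeline.add_node(component=generator, name="Generator", inputs=["Ranker"])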
        

    This improves answer relevance by re-ranking retrieved documents before generation.
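The idea behind re-ranking is simple: score each retrieved document against the query with a second, usually more precise, model and keep the best few. The sketch below shows that pattern with a crude word-overlap score standing in for a cross-encoder; rerank and token_overlap are hypothetical helpers, not the TransformersRanker implementation.

```python
from typing import Callable


def token_overlap(query: str, doc: str) -> float:
    """Fraction of query words that also appear in the document (toy scorer)."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    documents: list[str],
    score_fn: Callable[[str, str], float] = token_overlap,
    top_k: int = 3,
) -> list[str]:
    """Sort documents by descending score and keep the top_k."""
    scored = [(score_fn(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps ties in input order
    return [doc for _score, doc in scored[:top_k]]
```

Swapping score_fn for a real relevance model changes the quality of the ordering but not the structure of the step.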

    For more on scaling and optimizing large RAG deployments, see Scaling RAG for 100K+ Documents: Sharding, Caching, and Cost Control.

  3. Experiment with Prompt Engineering

    Adjust the PromptTemplate for your use-case, or try techniques from Reducing Hallucinations in RAG Workflows: Prompting and Retrieval Strategies for 2026.
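As one example of a prompt variation, the sketch below numbers each retrieved document and tells the model to decline when the context lacks the answer. The wording and the helper name build_grounded_prompt are assumptions for illustration, not a recommended or official template.

```python
def build_grounded_prompt(query: str, documents: list[str]) -> str:
    """Build a prompt with numbered context and an explicit fallback instruction."""
    numbered = "\n".join(
        f"[{i}] {doc.strip()}" for i, doc in enumerate(documents, start=1)
    )
    return (
        "Answer the question using only the numbered context below. "
        "If the context does not contain the answer, reply \"I don't know\".\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

Numbering the documents also makes it possible to ask the model to cite the source it used, which helps when reviewing answers by hand.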


Common Issues & Troubleshooting


Next Steps

You’ve now built a working custom RAG pipeline with Haystack v2! From here, you can:

RAG is a fast-moving field—keep experimenting, and join the Haystack and open-source communities to stay up to date!
