Tech Frontline Apr 8, 2026 4 min read

Building a Custom RAG Pipeline: Step-by-Step Tutorial with Haystack v2

Unlock the secrets to building a scalable, production-ready RAG pipeline using Haystack v2—complete with practical code and architecture tips.

Tech Daily Shot Team
Published Apr 8, 2026

Retrieval-Augmented Generation (RAG) is transforming how developers build robust, context-aware AI applications. If you’re looking to build a custom RAG pipeline with Haystack v2, you’re in the right place. This hands-on tutorial walks you through every step, from setup to production-ready inference, with practical code, troubleshooting tips, and next steps.

For a broader overview of RAG architectures, their use-cases, and deployment strategies, check out our Ultimate Guide to RAG Pipelines: Building Reliable Retrieval-Augmented Generation Systems. Here, we’ll dive deep into the nuts and bolts of building your own pipeline using Haystack v2.


Prerequisites

You’ll need a recent Python 3 release with pip, plus an OpenAI API key (or a local open-source model). We’ll use Haystack’s built-in in-memory document store and OpenAI’s GPT-3.5-turbo for demonstration, but you can swap in a dedicated vector database or another LLM as needed. Note that Haystack 2.x does not ship a FAISSDocumentStore; vector backends such as Qdrant or Chroma are available as separate integration packages.


1. Environment Setup

  1. Create a Virtual Environment
    python3 -m venv rag-tutorial-env
    source rag-tutorial-env/bin/activate  # On Windows: rag-tutorial-env\Scripts\activate
        
  2. Install Haystack v2 and Required Libraries
    pip install haystack-ai sentence-transformers
        

    Haystack 2.x ships as the haystack-ai package; the farm-haystack package on PyPI is the legacy 1.x line, so do not mix the two. The sentence-transformers package provides the local embedding model we’ll use. Other backends (e.g., Qdrant, Chroma) are installed as separate integration packages.

  3. Set Your API Key (if using OpenAI)
    export OPENAI_API_KEY=your-openai-api-key  # On Windows: set OPENAI_API_KEY=your-openai-api-key
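If you want to fail fast when the key is missing, a small check at the top of your script can save a confusing stack trace later. This helper is an illustration, not part of Haystack:

```python
import os

def require_env(name: str) -> str:
    """Return an environment variable's value, or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the pipeline.")
    return value

# At the top of your script:
# api_key = require_env("OPENAI_API_KEY")
```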
        

2. Prepare Your Data

  1. Organize Documents

    Place your text files in a folder called data/. For this tutorial, let’s use three sample text files:

    • data/doc1.txt: "Haystack is an open-source framework for building search systems powered by language models."
    • data/doc2.txt: "Retrieval-Augmented Generation combines retrieval and generation to improve answer accuracy."
    • data/doc3.txt: "FAISS is a library for efficient similarity search and clustering of dense vectors."

    (You can use your own corpus; just adjust the file names.)
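If you don’t have a corpus handy, the three sample files above can be created in a few lines of plain Python (nothing Haystack-specific):

```python
from pathlib import Path

SAMPLE_DOCS = {
    "doc1.txt": "Haystack is an open-source framework for building search systems powered by language models.",
    "doc2.txt": "Retrieval-Augmented Generation combines retrieval and generation to improve answer accuracy.",
    "doc3.txt": "FAISS is a library for efficient similarity search and clustering of dense vectors.",
}

# Create data/ and write one sample document per file
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)
for filename, text in SAMPLE_DOCS.items():
    (data_dir / filename).write_text(text, encoding="utf-8")
```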


3. Build the Haystack Pipeline

  1. Import Required Modules
    
    from haystack import Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.converters import TextFileToDocument
    from haystack.components.embedders import (
        SentenceTransformersDocumentEmbedder,
        SentenceTransformersTextEmbedder,
    )
    from haystack.components.generators import OpenAIGenerator
    from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore
        

    Note: the haystack.nodes module from Haystack 1.x no longer exists in 2.x. Everything is a component under haystack.components, and the built-in document store lives under haystack.document_stores.in_memory.
  2. Initialize the Document Store
    
    document_store = InMemoryDocumentStore()
        

    Tip: unlike Haystack 1.x, the 2.x in-memory store takes no embedding_dim argument to keep in sync. What must match is the embedding model itself: we’ll use sentence-transformers/all-MiniLM-L6-v2 (384 dimensions) for both documents and queries.

  3. Convert, Embed, and Index Your Documents
    
    from pathlib import Path
    
    converter = TextFileToDocument()
    docs = converter.run(sources=list(Path("data").glob("*.txt")))["documents"]
    
    doc_embedder = SentenceTransformersDocumentEmbedder(
        model="sentence-transformers/all-MiniLM-L6-v2"
    )
    doc_embedder.warm_up()
    docs_with_embeddings = doc_embedder.run(docs)["documents"]
    document_store.write_documents(docs_with_embeddings)
        

    This parses your text files into Document objects, computes an embedding for each, and writes them into the store. In Haystack 2.x, documents are embedded before indexing rather than via a retriever’s update_embeddings() call.

  4. Set Up the Retriever and Query Embedder
    
    text_embedder = SentenceTransformersTextEmbedder(
        model="sentence-transformers/all-MiniLM-L6-v2"
    )
    retriever = InMemoryEmbeddingRetriever(document_store=document_store, top_k=3)
        

    At query time, the text embedder turns the question into a vector using the same model as the documents, and the retriever returns the most similar documents from the store.
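Under the hood, embedding retrieval is nearest-neighbour search over vectors. Here is a minimal plain-Python sketch of cosine-similarity top-k retrieval, using toy 3-dimensional vectors rather than a real embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Toy "embeddings" (real models like all-MiniLM-L6-v2 produce 384 dimensions)
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.05, 0.0]
print(top_k(query, docs))  # indices of the two vectors closest in direction to the query
```

A real retriever adds an index structure (e.g., HNSW) so this search stays fast at scale, but the scoring idea is the same.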

  5. Set Up the Prompt Builder and Generator
    
    template = """Answer the question based on the following context:
    {% for doc in documents %}
    {{ doc.content }}
    {% endfor %}
    Question: {{ query }}
    Answer:"""
    
    prompt_builder = PromptBuilder(template=template)
    generator = OpenAIGenerator(model="gpt-3.5-turbo")  # reads OPENAI_API_KEY from the environment
        

    Note: PromptNode from Haystack 1.x is gone. In 2.x, prompt construction (PromptBuilder, with Jinja2 templates) and generation are separate components. For a local open-source LLM, swap in HuggingFaceLocalGenerator with a model such as google/flan-t5-base.

  6. Assemble the Pipeline
    
    rag_pipeline = Pipeline()
    rag_pipeline.add_component("text_embedder", text_embedder)
    rag_pipeline.add_component("retriever", retriever)
    rag_pipeline.add_component("prompt_builder", prompt_builder)
    rag_pipeline.add_component("generator", generator)
    
    rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
    rag_pipeline.connect("prompt_builder.prompt", "generator.prompt")
        

    This wires the components together: Query → Text Embedder → Retriever → Prompt Builder → Generator. Haystack 2.x replaces the 1.x add_node/inputs style with explicit add_component and connect calls between named component sockets.


4. Run Inference: Ask Questions!

  1. Query the Pipeline
    
    query = "What is FAISS used for?"
    result = rag_pipeline.run({
        "text_embedder": {"text": query},
        "prompt_builder": {"query": query},
    })
    print(result["generator"]["replies"][0])
        

    Expected Output: the model should generate an answer grounded in the retrieved context, along the lines of:
    FAISS is used for efficient similarity search and clustering of dense vectors.

    Screenshot description: Terminal output showing the answer generated by the pipeline in response to the user query.

  2. Try Another Query
    
    query = "What does RAG stand for?"
    result = rag_pipeline.run({
        "text_embedder": {"text": query},
        "prompt_builder": {"query": query},
    })
    print(result["generator"]["replies"][0])
        
        

    Screenshot description: Terminal output with the pipeline generating the correct expansion of "RAG" and a brief explanation.


5. Customizing and Extending Your Pipeline

  1. Swap in a Different Retriever or LLM

You can use other embedding models (e.g., BAAI/bge-base-en) or LLMs (e.g., Hugging Face’s tiiuae/falcon-7b-instruct). Use the same embedding model in both the document embedder and the text embedder, and pass the LLM name via the generator’s model parameter.

  2. Add a Ranker
    
    from haystack.components.rankers import TransformersSimilarityRanker
    
    ranker = TransformersSimilarityRanker(model="cross-encoder/ms-marco-MiniLM-L-6-v2")
    
    rag_pipeline.add_component("ranker", ranker)
    rag_pipeline.connect("retriever.documents", "ranker.documents")
    rag_pipeline.connect("ranker.documents", "prompt_builder.documents")
        

    This improves answer relevance by re-ranking retrieved documents with a cross-encoder before generation. Note that an input socket accepts only one connection, so insert the ranker when you first assemble the pipeline (connect the retriever to the ranker instead of directly to the prompt builder), and pass the query to it at run time via "ranker": {"query": query}.

    For more on scaling and optimizing large RAG deployments, see Scaling RAG for 100K+ Documents: Sharding, Caching, and Cost Control.

  3. Experiment with Prompt Engineering

    Adjust the PromptBuilder template for your use-case, or try techniques from Reducing Hallucinations in RAG Workflows: Prompting and Retrieval Strategies for 2026.
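One low-effort tweak worth trying is instructing the model to refuse when the context lacks the answer. Below is such a template, plus a plain-Python preview function that mimics roughly what PromptBuilder renders with Jinja2 (exact whitespace may differ; the preview is only for illustration):

```python
# Template you would pass to PromptBuilder(template=template)
template = """Answer the question using ONLY the context below.
If the context does not contain the answer, say "I don't know."
Context:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ query }}
Answer:"""

def preview(doc_texts, query):
    """Plain-Python approximation of the rendered prompt, for inspection."""
    context = "\n".join(doc_texts)
    return (
        "Answer the question using ONLY the context below.\n"
        'If the context does not contain the answer, say "I don\'t know."\n'
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )

print(preview(["FAISS is a library for efficient similarity search."], "What is FAISS?"))
```

Printing the rendered prompt like this before sending it to the LLM is a cheap way to debug retrieval and formatting problems.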


Common Issues & Troubleshooting

• Authentication errors from OpenAI: make sure OPENAI_API_KEY is exported in the same shell session that runs your script.
• Slow first run: the sentence-transformers model is downloaded from the Hugging Face Hub on first use; subsequent runs use the local cache.
• Empty or irrelevant answers: check that the document embedder and the text embedder use the same model, and that documents were written to the store before querying.

Next Steps

You’ve now built a working custom RAG pipeline with Haystack v2! From here, you can swap in a persistent vector store, add a ranker, tune your prompts, or scale the pipeline to a larger corpus.

RAG is a fast-moving field—keep experimenting, and join the Haystack and open-source communities to stay up to date!

