Tags: AI chatbot, training, knowledge base, best practices, setup guide

How to Train Your AI Chatbot: Best Practices for Knowledge Base Setup

PalaChat Team | 9 min read

The difference between a helpful AI chatbot and a frustrating one comes down to one thing: training data. A chatbot is only as good as the content it has been trained on. Feed it comprehensive, well-organised information and it will answer customer questions accurately. Feed it thin, outdated content and customers will quickly lose patience.

This guide covers everything you need to know about training an AI chatbot effectively — from choosing the right content to upload, to writing system prompts that shape the chatbot's personality and behaviour.

Understanding How AI Chatbot Training Works

When you upload a document or crawl a website URL in PalaChat, the content goes through a process called Retrieval-Augmented Generation (RAG):

  • Ingestion — your content is extracted from PDFs, web pages, or text files
  • Chunking — the content is split into smaller, meaningful sections
  • Embedding — each chunk is converted into a numerical representation (vector) that captures its meaning
  • Storage — the vectors are stored in a database optimised for similarity search
  • Retrieval — when a customer asks a question, the system finds the most relevant chunks
  • Generation — the AI generates a natural-language answer based on the retrieved content

This means the chatbot does not memorise your content word for word. Instead, it retrieves the passages closest in meaning to the question and uses them to answer in natural, conversational language, even if the exact answer is not stated verbatim in your documents.
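
The chunking, embedding, and retrieval steps above can be illustrated with a toy sketch. This is not PalaChat's implementation: the function names are hypothetical, and the "embedding" here is a plain word-count vector, whereas a real system would use a learned embedding model and a vector database. Because of that simplification, this sketch matches on shared words only and does not capture meaning the way real embeddings do.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Split text into paragraph-based chunks of at most max_words words."""
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        words = paragraph.split()
        for start in range(0, len(words), max_words):
            chunks.append(" ".join(words[start:start + max_words]))
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a vector of word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


# Hypothetical knowledge base content, for illustration only.
knowledge_base = """Returns are accepted within 30 days of purchase.

Standard shipping takes 3 to 5 working days."""

chunks = chunk_text(knowledge_base)
best = retrieve("How long does shipping take?", chunks)
print(best[0])
```

In a full pipeline, the retrieved chunk would then be passed to the language model as context for the generation step.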

    What Content to Upload

    Essential content (start here)

    • FAQ page — the most impactful content to upload first; covers the questions your customers actually ask
    • Product or service descriptions — detailed information about what you offer, including pricing, specifications, and availability
    • Company information — who you are, where you are located, your history, team, and mission
    • Contact details — phone numbers, email addresses, office hours, physical addresses
    • Policies — return policy, shipping policy, warranty, refund terms, privacy policy

    High-value content (add next)

    • How-to guides — step-by-step instructions that customers commonly request
    • Pricing pages — detailed pricing tiers, what is included in each plan, payment terms
    • Case studies or testimonials — helps the chatbot reference real results when relevant
    • Blog posts — especially educational content that answers common industry questions
    • Terms and conditions — particularly important for regulated industries

    Content to avoid

    • Internal documents — HR policies, internal processes, or confidential business information that should not be shared with customers
    • Outdated content — old pricing, discontinued products, or expired promotions
    • Duplicate content — uploading the same information multiple times does not improve accuracy and wastes storage quota
    • Heavily formatted content — complex tables, images, or infographics do not extract well; rewrite key information as plain text
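
If you keep local copies of the text you plan to upload, a quick check for duplicate content might look like the sketch below. It is only an illustration under stated assumptions: the file names and texts are made up, and it catches copies that differ in whitespace or letter case, not paraphrased duplicates.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_duplicates(documents: dict[str, str]) -> list[tuple[str, str]]:
    """Return (duplicate, original) name pairs for documents with identical normalised text."""
    seen: dict[str, str] = {}
    duplicates = []
    for name, text in documents.items():
        key = fingerprint(text)
        if key in seen:
            duplicates.append((name, seen[key]))
        else:
            seen[key] = name
    return duplicates


# Hypothetical documents, for illustration only.
docs = {
    "faq.txt": "We ship within Singapore only.",
    "faq-copy.txt": "We ship   within Singapore only.\n",
    "returns.txt": "Returns are accepted within 30 days.",
}
print(find_duplicates(docs))
```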

    Best Practices for Website Crawling

    When you paste a URL into PalaChat, the crawler visits that page and extracts the text content. Here are tips for getting the best results:

    Crawl specific pages, not the homepage

    Instead of crawling https://yoursite.com, crawl the specific pages that contain useful information:

    • https://yoursite.com/faq
    • https://yoursite.com/pricing
    • https://yoursite.com/about
    • https://yoursite.com/services
    • https://yoursite.com/contact
    This gives you more control over what content the chatbot knows.

    Check the crawl result

    After crawling, review the extracted content in your Knowledge Base. If a page did not extract well (common with JavaScript-heavy single-page applications), try uploading a PDF version of the content instead.
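
To see why JavaScript-heavy pages tend to extract poorly, the sketch below counts the words visible in a page's raw HTML using Python's standard `html.parser` module. It is an illustration, not a description of how PalaChat's crawler works; the two sample pages are made up. A single-page application often ships an almost empty HTML shell and renders its text later in the browser, so a crawler that reads only the raw HTML finds very little.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that appears outside <script> and <style> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_word_count(html: str) -> int:
    """Count the words a plain HTML text extractor would see."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.parts).split())


# Hypothetical pages: one static, one single-page application shell.
static_page = "<html><body><h1>FAQ</h1><p>We open 9am to 6pm on weekdays.</p></body></html>"
spa_shell = "<html><body><div id='root'></div><script>renderApp()</script></body></html>"

print(visible_word_count(static_page))
print(visible_word_count(spa_shell))
```

A page whose count is close to zero is a good candidate for the PDF upload route instead.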

    Keep content fresh

    If you update your website — new pricing, new products, policy changes — re-crawl the affected pages or re-upload the updated documents. Outdated information in the chatbot is worse than no information at all.
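
One way to stay on top of this is to keep your own record of when each source was last crawled or uploaded and flag anything older than a chosen threshold. The sketch below assumes such a record exists as a simple mapping; the dates, the 30-day threshold, and the function name are illustrative assumptions, not PalaChat features.

```python
from datetime import date, timedelta


def stale_sources(last_updated: dict[str, date], today: date, max_age_days: int = 30) -> list[str]:
    """Return the sources last refreshed more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, updated in last_updated.items() if updated < cutoff)


# Hypothetical refresh log, for illustration only.
sources = {
    "https://yoursite.com/pricing": date(2024, 1, 5),
    "https://yoursite.com/faq": date(2024, 3, 20),
}
print(stale_sources(sources, today=date(2024, 4, 1)))
```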

    Best Practices for PDF Uploads

    • Use text-based PDFs, not scanned images. PalaChat extracts text from PDF files, so scanned documents without OCR will produce poor results.
    • Keep files focused. A 5-page PDF covering your return policy is better than a 200-page company handbook where the return policy is buried on page 47.
    • Use clear headings. Well-structured documents with headings and subheadings produce better chunks and more accurate answers.
    • Mind your storage quota. Large files consume storage quickly. A 10 MB product catalogue takes up the same space whether or not the chatbot uses all the information in it.
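
The point about clear headings can be made concrete with a small sketch of heading-based splitting. It assumes, purely for illustration, a text document whose headings start with `#`; how PalaChat actually chunks documents is not described here, and the sample policy text is made up. Each section becomes one self-contained chunk labelled by its heading, which is why a well-structured document tends to produce cleaner chunks.

```python
def split_by_headings(text: str) -> list[dict[str, str]]:
    """Split text into sections, starting a new section at every line that begins with '#'."""
    sections: list[dict[str, str]] = []
    for line in text.splitlines():
        if line.startswith("#"):
            sections.append({"heading": line.lstrip("#").strip(), "text": ""})
        elif line.strip():
            if not sections:
                # Text before the first heading gets a placeholder heading.
                sections.append({"heading": "Untitled", "text": ""})
            current = sections[-1]
            current["text"] = (current["text"] + " " + line.strip()).strip()
    return sections


# Hypothetical policy document, for illustration only.
policy = """# Return policy
Items can be returned within 30 days.

# Shipping policy
Orders ship in 2 working days.
"""

for section in split_by_headings(policy):
    print(section["heading"], "->", section["text"])
```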

    Writing an Effective System Prompt

    The system prompt is the set of instructions that shapes your chatbot's personality, tone, and behaviour. Think of it as the chatbot's job description.

    A basic system prompt

    You are a helpful customer support assistant for [Company Name]. Answer questions based on the provided context. If you do not know the answer, say so politely and suggest the customer contact our team at [email/phone].

    A more detailed system prompt

    You are the customer support assistant for [Company Name], a [brief description of business] based in Singapore. Your tone is friendly, professional, and concise.

    Guidelines:
    - Always answer based on the provided context. Do not make up information.
    - If the customer asks about pricing, refer to the current pricing on our website.
    - If the customer wants to speak with a human, offer the callback form.
    - Keep responses under 3 paragraphs unless the question requires a detailed explanation.
    - When greeting customers, use their name if provided.
    - Do not discuss competitors by name.
    - Always end with an offer to help further.
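
If you manage several chatbots, or revise your prompt often, you may prefer to assemble a prompt like the one above from a few named parts. The helper below is a hypothetical sketch, not part of PalaChat; the company name and description are placeholders. The resulting text would be pasted into the system prompt field.

```python
def build_system_prompt(company: str, description: str, tone: str, guidelines: list[str]) -> str:
    """Assemble a system prompt from a company description, a tone, and a list of rules."""
    lines = [
        f"You are the customer support assistant for {company}, {description}.",
        f"Your tone is {tone}.",
        "",
        "Guidelines:",
    ]
    lines += [f"- {rule}" for rule in guidelines]
    return "\n".join(lines)


# Placeholder values, for illustration only.
prompt = build_system_prompt(
    company="Example Dental",
    description="a dental clinic based in Singapore",
    tone="friendly, professional, and concise",
    guidelines=[
        "Always answer based on the provided context. Do not make up information.",
        "If the customer wants to speak with a human, offer the callback form.",
    ],
)
print(prompt)
```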

    Tips for system prompts

    • Be specific about your company. The AI does not automatically know your company name, industry, or location. State it explicitly.
    • Set boundaries. Tell the chatbot what NOT to discuss — competitors, confidential information, medical or legal advice.
    • Define escalation behaviour. Instruct the chatbot to offer a callback form when it cannot answer a question.
    • Set the tone. "Friendly and casual" produces very different responses from "formal and professional".
    • Test and iterate. After setting up, ask the chatbot 10-20 common customer questions. Adjust the system prompt based on the responses.
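
The test-and-iterate step can be organised as a simple loop: ask each question, then collect the ones that received a fallback reply. The sketch below is an assumption-heavy illustration. `fake_chatbot` is a stand-in so the code runs on its own; in practice you would type the questions into the chatbot yourself, or replace the stand-in with whatever way you have of querying it. The fallback phrases are also assumptions and should match your chatbot's actual wording.

```python
from typing import Callable

# Phrases that suggest the chatbot could not answer (adjust to your chatbot's wording).
FALLBACK_MARKERS = ("i don't have information", "i do not know")


def find_unanswered(questions: list[str], ask: Callable[[str], str]) -> list[str]:
    """Return the questions whose answers look like a fallback reply."""
    unanswered = []
    for question in questions:
        answer = ask(question).lower()
        if any(marker in answer for marker in FALLBACK_MARKERS):
            unanswered.append(question)
    return unanswered


def fake_chatbot(question: str) -> str:
    """Stand-in for a real chatbot, used only to make the sketch runnable."""
    canned = {"What are your opening hours?": "We are open 9am to 6pm on weekdays."}
    return canned.get(question, "I don't have information about that.")


test_questions = ["What are your opening hours?", "Do you offer refunds?"]
print(find_unanswered(test_questions, fake_chatbot))
```

Each question in the output points to either missing content or a system prompt that needs adjusting.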

    Common Mistakes to Avoid

    Uploading too little content

    A chatbot with only your homepage crawled will say "I don't have information about that" to most questions. The more relevant content you provide, the more helpful the chatbot becomes.

    Uploading irrelevant content

    Uploading your entire website, including blog posts from 2019, press releases, and job listings, dilutes the knowledge base. Focus on content that answers customer questions.

    Never updating the knowledge base

    Your business evolves. New products launch, prices change, policies update. If the chatbot still references last year's pricing, customers will lose trust.

    Writing vague system prompts

    "Be helpful" is too vague. "Answer questions about our dental services, pricing, and appointment availability in a warm, professional tone" gives the AI clear direction.

    Ignoring conversation history

    Review your conversation history regularly (available, with export, on Growth and Pro plans). The questions customers ask reveal gaps in your knowledge base: if the chatbot frequently cannot answer a specific type of question, add content to address it.
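
If you export your conversation history, a short script can rank the questions that most often went unanswered. The export format below (a CSV with `question` and `answer` columns) and the fallback phrase are assumptions made for illustration; check the columns and wording in your actual export before adapting this.

```python
import csv
import io
from collections import Counter

# Hypothetical export, inlined here so the sketch runs on its own.
export = io.StringIO(
    "question,answer\n"
    "Do you offer refunds?,I don't have information about that.\n"
    "What are your opening hours?,We are open 9am to 6pm on weekdays.\n"
    "do you offer refunds?,I don't have information about that.\n"
)

# Count each question (case-insensitively) that received a fallback reply.
gaps = Counter(
    row["question"].strip().lower()
    for row in csv.DictReader(export)
    if "i don't have information" in row["answer"].lower()
)
print(gaps.most_common(1))
```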

    A Simple Training Checklist

    Use this checklist when setting up a new chatbot:

    • Upload or crawl your FAQ page
    • Upload or crawl your product/service pages
    • Upload or crawl your pricing page
    • Upload or crawl your contact and about pages
    • Upload any relevant PDF documents (catalogues, policies)
    • Write a detailed system prompt with company name, tone, and guidelines
    • Test with 10-20 common customer questions
    • Adjust content or system prompt based on test results
    • Set a monthly reminder to review and update the knowledge base

    Getting Started

    Ready to train your first AI chatbot? Sign up for PalaChat for free and follow the steps above. Your chatbot can be answering customer questions accurately in under 10 minutes.

    For more guidance, visit our How It Works page or contact our team.
