Train on Company Data

Building RAG with ChatGPT and Company Documents

By learning from internal files, web pages, and Notion pages, the chatbot can respond accurately to internal inquiries. The learning function is free to use any number of times and easy to operate, so the chatbot keeps pace with updates to organizational data.

Assuming a Variety of Internal Documents

Types of Data That Can Be Learned

Learning from Files

The chatbot can learn from a wide range of internal files. A file is learned as soon as it is uploaded and is immediately reflected in the bot's responses.
Intended documents include manuals, meeting minutes, contracts, reports, procedures, FAQs, policies, and regulations. Supported formats are text files (.txt), Markdown files (.md), PDF files (.pdf), and Word documents (.docx).
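As a rough illustration of the format list above, the following sketch separates a batch of uploads into supported and unsupported files by extension. The names `SUPPORTED_EXTENSIONS`, `is_supported`, and `split_uploads` are hypothetical and are not part of Doox; the actual upload handling may differ.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

# Formats named in the text above; the set name is an assumption for this sketch.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf", ".docx"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_uploads(filenames: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split filenames into (accepted, rejected) lists, preserving order."""
    accepted, rejected = [], []
    for name in filenames:
        (accepted if is_supported(name) else rejected).append(name)
    return accepted, rejected
```

The check is case-insensitive, so a file named `Manual.PDF` is treated the same as `manual.pdf`.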


Learning from the Web

Simply input the URL of a web page to extract its text and learn from it. Unnecessary HTML tags are automatically removed, preventing noise from reducing the accuracy of the bot's responses.
This is intended for use with company homepages, FAQ sites, blog articles, etc.
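To make the idea of removing HTML tags concrete, here is a minimal sketch using only Python's standard library. It is an assumption about how such extraction could work, not a description of Doox's actual pipeline; `TextExtractor` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of non-content tags."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self) -> str:
        return " ".join(self._parts)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Dropping script and style contents as well as the tags themselves keeps markup and code out of the learned text, which is the kind of noise the passage above refers to.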


Learning from Notion

If you manage an internal wiki or external manuals in Notion, those pages can be used as learning sources. Connecting your Notion account automatically retrieves the pages managed by that account, and you then select which ones the chatbot should learn from.
In the future, any content changes in Notion will be automatically captured and relearned.

Features Focused on User Perspective

Features of the Internal Data Learning Function

Having the chatbot learn from company data and internal files is free. There are no limits on file size, word count, or number of learning sessions, so the chatbot can always be kept up to date as internal documents are revised or added.

Chatbot learning is carried out from the management screen. No special settings or tuning are required, and operations such as file uploads are intuitive. Only administrators have access to the chatbot creation and learning functions.

Data used for learning is not used by external companies such as OpenAI. Internal document data is stored only in the vector database, not in any other database or storage system.

When the chatbot responds to a question, it indicates which document and section it referred to. Each learning source can be linked to a URL, giving immediate access to the original file in an internal shared folder or other location.
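One possible way to represent such references is sketched below. The `SourceRef` structure and `format_answer` function are hypothetical names chosen for illustration; the source does not describe how Doox stores or renders its references.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceRef:
    """A reference to the document and section an answer drew on."""
    document: str
    section: str
    url: Optional[str] = None  # optional link back to the original file


def format_answer(answer: str, sources: List[SourceRef]) -> str:
    """Append a numbered source list to the answer text."""
    lines = [answer, "", "Sources:"]
    for i, src in enumerate(sources, start=1):
        line = f"[{i}] {src.document}, {src.section}"
        if src.url:
            line += f" ({src.url})"
        lines.append(line)
    return "\n".join(lines)
```

Keeping the URL optional reflects the text above: a source can be linked to its original location, but the document and section are shown either way.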

Adopting New Technology

ChatGPT and RAG

The mechanism of supplying custom data to an LLM such as ChatGPT so that it can generate responses is known as RAG (Retrieval-Augmented Generation), and Doox uses this technology.

Using ChatGPT Alone

  • Can only respond based on publicly available information
  • The LLM's knowledge is only as current as its last update
  • Hallucinations can lead to incorrect responses

Enhancing ChatGPT with RAG

  • Can respond using non-public information such as internal data
  • Relevant information is supplied to the LLM at query time, so responses reflect even the latest data
  • Can indicate which information was used for a response, making the basis and source of each answer clear
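The flow described in the points above can be sketched in a few lines: retrieve the stored passages most similar to the question, then place them in the prompt sent to the LLM. This is a simplified illustration, not Doox's implementation. It scores passages with word-count cosine similarity as a stand-in for the embedding search a vector database would perform, and the function names and prompt wording are assumptions.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return up to k (score, chunk) pairs with non-zero similarity, best first."""
    q = tokenize(question)
    scored = [(cosine(q, tokenize(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pair for pair in scored[:k] if pair[0] > 0]


def build_prompt(question: str, chunks: List[str], k: int = 2) -> str:
    """Build an LLM prompt that carries the retrieved passages as context."""
    context = "\n".join(f"- {text}" for _, text in retrieve(question, chunks, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
```

Because the context is assembled at query time, updating the stored passages changes the next answer without retraining the model, and the retrieved passages are exactly what can be reported back as the answer's sources.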