How to Install and Use Kotaemon - Your Secret Weapon for Document Interaction
Saturday, Jan 11, 2025 | 6 minute read
Discover an innovative open-source tool that enhances document interaction with a clean UI, document organization, advanced citations, and customizable features. Its community-driven development allows for personalized extensions, making it a must-have for developers!
Chapter 1: Introduction to Kotaemon - A New Era of Document Interaction
Kotaemon is an exciting open-source project that provides a clean, customizable user interface tailored for Retrieval-Augmented Generation (RAG) applications. RAG combines document retrieval with generative models: relevant passages are retrieved first and then handed to the model, which improves question answering over your own documents. With Kotaemon, users can interact with their documents efficiently, making the entire experience smoother!
Chapter 2: Why Choose Kotaemon? - Key Features That Set It Apart
The most appealing aspect of Kotaemon is its fresh and clean user interface, which lets users focus on their interactions without distractions. Kotaemon also supports large language models (LLMs) from API providers such as OpenAI, Azure OpenAI, and Cohere, giving it the flexibility to meet diverse user needs.
Users can easily organize documents into collections and share their favorite dialogues, streamlining document management. The advanced citation mechanism is another highlight: answers come with citations that can be checked in a built-in PDF viewer, which supports academic integrity and readability and makes academic research a breeze!
Even more impressive, Kotaemon supports complex reasoning, using question decomposition to handle multi-layered inquiries. Its user interface is built with Gradio and is highly customizable, so it can be adapted to personalized needs and to users from various backgrounds!
Chapter 3: The Allure of Kotaemon - Why Developers Are Enthralled
Kotaemon emphasizes a community-oriented development philosophy, allowing users to provide feedback and interact directly with developers via GitHub, driving the project’s evolution together! It offers simple installation options, including an online install through a HuggingFace Space, which lowers the entry barrier for beginners and developers alike!
This open development framework encourages personalized extensions: developers can customize elements through configuration files to extend functionality. The documentation and guides cover prerequisites, system requirements, and detailed instructions for customizing the application, making it easy for developers to get started.
What’s cooler is that Kotaemon receives regular technical updates and improvements, which keeps it maintained and adaptable within a changing tech stack!
Kotaemon merges document processing with artificial intelligence to create a seamless interactive experience for users and developers alike, and it is becoming an important part of the evolution of document interaction systems!
Chapter 4: How to Install Kotaemon
System Requirements
Before embarking on your journey with Kotaemon, ensure your system meets the following requirements:
- Python: You need to have Python version 3.10 or higher installed. You can download the latest version from the Python official website.
- Docker: While optional, using Docker is highly recommended as it simplifies environment setup and provides isolation.
- Unstructured: If you want to support additional file types, this library must be installed. You can refer to the Unstructured official documentation for installation instructions.
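A quick preflight check can confirm the first two requirements before you begin. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `docker_available` are hypothetical and are not part of Kotaemon.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the requirements above


def python_ok(version=None):
    """Return True if the given (major, minor, ...) version is at least 3.10."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def docker_available():
    """Return True if a `docker` executable is found on PATH (Docker is optional)."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("Docker on PATH:", docker_available())
```

If Docker is not found, you can still follow the manual installation path described later.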
Installing with Docker (Recommended)
Installing with Docker can effectively simplify environment setup! Here are the steps to install Kotaemon using Docker:
- Choose the lite or full version of the Docker image.
  - For the Lite version, use the following command:
    docker run \
      -e GRADIO_SERVER_NAME=0.0.0.0 \
      -e GRADIO_SERVER_PORT=7860 \
      -p 7860:7860 -it --rm \
      ghcr.io/cinnamon/kotaemon:main-lite
    Here, we set environment variables to allow the Gradio server to run on all available network interfaces (0.0.0.0), and specified the port we will use (7860).
  - For the Full version, use the following command:
    docker run \
      -e GRADIO_SERVER_NAME=0.0.0.0 \
      -e GRADIO_SERVER_PORT=7860 \
      -p 7860:7860 -it --rm \
      ghcr.io/cinnamon/kotaemon:main-full
- Next, access the web user interface (WebUI) in your browser: http://localhost:7860/
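The two Docker commands differ only in the image tag. As an illustration, here is a small sketch that assembles the same arguments from Python so the variant and port are set in one place. The function name `build_docker_command` is hypothetical; the flags and image name are copied from the commands in this section.

```python
import shlex

IMAGE = "ghcr.io/cinnamon/kotaemon"  # image name taken from the commands above


def build_docker_command(variant="lite", port=7860):
    """Build the `docker run` argument list for the lite or full image."""
    if variant not in ("lite", "full"):
        raise ValueError("variant must be 'lite' or 'full'")
    return [
        "docker", "run",
        "-e", "GRADIO_SERVER_NAME=0.0.0.0",
        "-e", f"GRADIO_SERVER_PORT={port}",
        "-p", f"{port}:{port}",
        "-it", "--rm",
        f"{IMAGE}:main-{variant}",
    ]


if __name__ == "__main__":
    # Print the command rather than running it; pass the list to
    # subprocess.run(...) if you want to launch the container from Python.
    print(shlex.join(build_docker_command("lite")))
```

Building the command as a list avoids shell quoting problems and makes it easy to switch between the lite and full images.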
Install Without Docker
If you choose not to use Docker, you can also install Kotaemon manually with the following steps:
- Clone the repository and install the required packages:
  conda create -n kotaemon python=3.10
  conda activate kotaemon
  git clone https://github.com/Cinnamon/kotaemon
  cd kotaemon
  pip install -e "libs/kotaemon[all]"
  pip install -e "libs/ktem"
  At this stage, we’ve created a new conda environment named kotaemon with Python 3.10 and activated it. Then, we cloned the Kotaemon repository and installed the necessary libraries.
- Create a .env file according to the provided template.
- (Optional) Set up the built-in PDF viewer in your browser for easier document processing.
- Start the web server by executing the following command:
  python app.py
  The default username and password are both set to admin, so remember those!
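Once the server has been started, you may want to confirm that the web UI is reachable before opening a browser. The sketch below is one way to do that with the standard library. It assumes the UI is served at http://localhost:7860/, the address used in the Docker section; the helper name `webui_is_up` is hypothetical.

```python
import urllib.request


def webui_is_up(url="http://localhost:7860/", timeout=3.0):
    """Return True if an HTTP request to `url` gets a successful response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError/HTTPError.
        return False


if __name__ == "__main__":
    print("Web UI reachable:", webui_is_up())
```

A result of False right after launch usually just means the server is still starting, so it is worth retrying after a few seconds.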
Setting Up GraphRAG
Kotaemon can use GraphRAG through several implementations: NanoGraphRAG, LightRAG, and MS GraphRAG, each set up below. The official MS GraphRAG index only works with the OpenAI or Ollama API, but you can use NanoGraphRAG for simpler integration.
Install and Set Up NanoGraphRAG
- Install the nano-graphrag library:
pip install nano-graphrag
- Set the environment variable for usage:
export USE_NANO_GRAPHRAG=true
Install and Set Up LightRAG
- Install LightRAG:
pip install git+https://github.com/HKUDS/LightRAG.git
- Set the appropriate environment variable:
export USE_LIGHTRAG=true
Install and Set Up MS GraphRAG
- Install MS GraphRAG:
pip install "graphrag<=0.3.6" future
- Don’t forget to set the GRAPHRAG_API_KEY in your environment or .env file for authentication.
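The NanoGraphRAG and LightRAG backends above are switched on through environment variables. As a rough illustration of how such flags can be interpreted, here is a stdlib-only sketch. The helper names `env_flag` and `selected_graphrag_backend`, the set of accepted truthy values, and the precedence order are all assumptions made for this example; this guide does not show how Kotaemon itself reads these variables.

```python
import os

TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings of "enabled"


def env_flag(name, environ=None):
    """Interpret an environment variable as a boolean; unset means False."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "").strip().lower() in TRUTHY


def selected_graphrag_backend(environ=None):
    """Return a label for the backend chosen by the flags (assumed precedence)."""
    if env_flag("USE_NANO_GRAPHRAG", environ):
        return "nano-graphrag"
    if env_flag("USE_LIGHTRAG", environ):
        return "lightrag"
    # Assumption: with neither flag set, the MS GraphRAG setup applies.
    return "ms-graphrag"


if __name__ == "__main__":
    print(selected_graphrag_backend())
```

Running this in the same shell where you exported the variables is a quick way to check that the flag you intended is actually visible to Python processes.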
Customizing Applications
By default, Kotaemon stores data in the ./ktem_app_data directory. For more advanced needs, you can adjust the application configuration through the flowsettings.py and .env files.
Adding Your RAG Pipeline
- You can modify existing pipelines or create new reasoning pipelines by navigating to libs/ktem/ktem/reasoning/.
- To customize index pipelines, take a look at the example implementations found in libs/ktem/ktem/index/file/graph.
Offline Install - Intermediate (20 Minutes)
Download
First, download the kotaemon-app.zip file from the latest release page.
Run the Installation Script
After unzipping the downloaded file, navigate to the script folder and execute the installation script that matches your operating system:
- Windows: run_windows.bat
  Just double-click the file to run it.
- macOS: run_macos.sh
  On macOS, right-click the file and select "Open with Terminal." In the dialog that opens, choose "All Applications" to ensure it opens with Terminal every time.
  Note: Be sure to check the "Always Open With…" option when configuring. After setup, you can double-click the file to execute it.
- Linux: bash run_linux.sh
  Simply run this command in the terminal to execute the script.
Once the installation completes, you will be prompted to start ktem's UI; just answer "yes" to proceed. If it starts successfully, the application will automatically open in your default browser!
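The choice of script depends only on the operating system. A short sketch of that mapping, using the script names listed above, might look like this. The helper name `run_script_for` is hypothetical.

```python
import platform

RUN_SCRIPTS = {
    "Windows": "run_windows.bat",
    "Darwin": "run_macos.sh",  # platform.system() reports macOS as "Darwin"
    "Linux": "run_linux.sh",
}


def run_script_for(system=None):
    """Return the run script name for the given (or current) operating system."""
    system = platform.system() if system is None else system
    try:
        return RUN_SCRIPTS[system]
    except KeyError:
        raise ValueError(f"no run script listed for {system!r}") from None


if __name__ == "__main__":
    print(run_script_for())
```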
Chapter 5: Starting the Application
Once the initial setup is complete, or after you make any changes, you can start the application by re-running the corresponding run_* script. A browser window will open, displaying the application interface.
Example Customization in flowsettings.py
You can customize document storage (with full-text search capabilities) and vector storage according to your needs:
# Example customization in flowsettings.py
# Note: (A | B | C) below is shorthand for "choose one of these options", not literal Python.
# Set preferred document storage
KH_DOCSTORE=(Elasticsearch | LanceDB | SimpleFileDocumentStore)
# Specify vector storage
KH_VECTORSTORE=(ChromaDB | LanceDB | InMemory | Qdrant)
# Enable or disable multimodal question answering
KH_REASONINGS_USE_MULTIMODAL=True
# Set or modify reasoning pipelines
KH_REASONINGS = [
"ktem.reasoning.simple.FullQAPipeline",
"ktem.reasoning.simple.FullDecomposeQAPipeline",
"ktem.reasoning.react.ReactAgentPipeline",
"ktem.reasoning.rewoo.RewooAgentPipeline",
]
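Each entry in KH_REASONINGS is a dotted path that names a class inside a module. To show how a string like that can be turned into a class object, here is a minimal sketch built on the standard importlib module. The helper name `resolve_dotted_path` is hypothetical, and Kotaemon's own loading mechanism may differ.

```python
import importlib


def resolve_dotted_path(path):
    """Import `package.module.Name` and return the `Name` attribute."""
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


if __name__ == "__main__":
    # Demonstrated with a standard-library class so it runs without Kotaemon installed.
    print(resolve_dotted_path("collections.OrderedDict"))
```

With Kotaemon installed, the same call could be tried on one of the paths listed above to check that a custom pipeline is importable before adding it to the list.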
Environment Configuration via .env File
Don’t forget to set environment-related variables like this:
# Environment configuration through .env file
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=<Enter your OpenAI API key here>
OPENAI_CHAT_MODEL=gpt-3.5-turbo
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002
AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-35-turbo
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=text-embedding-ada-002
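Before starting the server, it can help to check that the keys you need are actually filled in. The sketch below parses KEY=VALUE lines and reports required keys that are missing or empty. It is a deliberately simplified parser (no quoting or `export` handling), the helper names `parse_env` and `missing_keys` are hypothetical, and which keys count as required depends on the provider you use.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


SAMPLE = """\
# Environment configuration through .env file
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=
OPENAI_CHAT_MODEL=gpt-3.5-turbo
"""

if __name__ == "__main__":
    env = parse_env(SAMPLE)
    # The sample leaves OPENAI_API_KEY empty, so it is reported as missing.
    print(missing_keys(env, ["OPENAI_API_KEY", "OPENAI_CHAT_MODEL"]))
```

To check a real file, read its contents with `open(".env").read()` and pass the text to `parse_env`.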
Starting the Server
Finally, to start your server, execute the following command:
# Command to start the server
python app.py
By following this guide, even beginners should now be ready to use Kotaemon, whether for building applications or extending existing functionality. Happy coding!