How to Install and Use Kotaemon – Your Secret Weapon for Document Interaction 🚀

Saturday, Jan 11, 2025 | 6 minute read

GitHub Trend

Discover an innovative open-source tool that enhances document interaction with a clean UI, document organization, advanced citations, and customizable features. Its community-driven development allows for personalized extensions, making it a must-have for developers! 🚀✨

Chapter 1: Introduction to Kotaemon – A New Era of Document Interaction 📖

Kotaemon is an open-source project that provides a clean, customizable user interface for Retrieval-Augmented Generation (RAG) applications. RAG combines document retrieval with generative models, so answers draw on the retrieved documents, which improves question answering and information processing. With Kotaemon, users can interact with their documents efficiently, making the entire experience smoother! ✨

Chapter 2: Why Choose Kotaemon? – Key Features That Set It Apart 🌟

The most appealing aspect of Kotaemon is its fresh and clean user interface, which lets users focus on their work without distractions. Kotaemon also supports large language models (LLMs) from API providers like OpenAI, Azure OpenAI, and Cohere, so it can be adapted to different needs.

Users can easily organize documents into collections and share their favorite dialogues, streamlining document management. The advanced citation mechanism provided by Kotaemon is another highlight, featuring a built-in PDF viewer that enhances academic integrity and readability, making academic research a breeze 💡!

Even more impressive, Kotaemon supports complex reasoning, using question decomposition to break multi-layered inquiries into simpler parts. Its user interface is built with Gradio and is highly customizable, so it can be adapted to users from various backgrounds! 🎨

Chapter 3: The Allure of Kotaemon – Why Developers Are Enthralled ❤️

Kotaemon emphasizes a community-oriented development philosophy: users can give feedback and interact directly with the developers via GitHub, driving the project’s evolution together! It also offers simple installation options, including an online setup through a Hugging Face Space, which lowers the entry barrier for beginners and developers alike!

This open development framework significantly fosters personalized extensions, allowing developers to customize elements through configuration files, thus enhancing functionality. The comprehensive documentation and guides provided by Kotaemon make it easy for developers to get started, covering prerequisites, system requirements, and detailed instructions for customizing the application, ensuring a smooth user journey 🐾.

What’s cooler is that Kotaemon is updated and improved regularly, which keeps it viable and adaptable over the long term and gives users a steady stream of new features 🔄!

Kotaemon merges document processing with artificial intelligence, creating a seamless interactive experience for users and developers alike and making it a notable step in the evolution of document interaction systems! 💪


Chapter 4: How to Install Kotaemon 💻

System Requirements 🛠️

Before embarking on your journey with Kotaemon, ensure your system meets the following requirements:

  1. Python: You need Python 3.10 or higher. You can download the latest version from the official Python website.
  2. Docker: While optional, using Docker is highly recommended as it simplifies environment setup and provides isolation.
  3. Unstructured: If you want to support additional file types, this library must be installed. Refer to the official Unstructured documentation for installation instructions.
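
The first two requirements can be checked with a short script. This is a minimal sketch using only the standard library; the function name `check_requirements` is made up for illustration, and it only tests whether a `docker` executable is on your PATH, not whether the Docker daemon is running.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the requirements list


def check_requirements() -> dict:
    """Report whether the local machine meets the listed requirements."""
    return {
        # Compare only (major, minor) against the documented minimum.
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        # Docker is optional; this only looks for the CLI on PATH.
        "docker_found": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```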

Installing with Docker can effectively simplify environment setup! Here are the steps to install Kotaemon using Docker:

  1. Choose the lite or full version of the Docker image!

    • For the Lite version, use the following command:

      docker run \
      -e GRADIO_SERVER_NAME=0.0.0.0 \
      -e GRADIO_SERVER_PORT=7860 \
      -p 7860:7860 -it --rm \
      ghcr.io/cinnamon/kotaemon:main-lite
      
      • Here, we set environment variables to allow the Gradio server to run on all available network interfaces (0.0.0.0), and specified the port we will use (7860).
    • For the Full version, use the following command:

      docker run \
      -e GRADIO_SERVER_NAME=0.0.0.0 \
      -e GRADIO_SERVER_PORT=7860 \
      -p 7860:7860 -it --rm \
      ghcr.io/cinnamon/kotaemon:main-full
      
  2. Next, open the web user interface (WebUI) in your browser: http://localhost:7860/.
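
If you launch the container from a script, the two commands above differ only in the image tag. The sketch below builds the argument list for either variant; the helper name `docker_run_command` and the `port` parameter are illustrative assumptions, while the image name, tags, and flags are copied from the commands above.

```python
import shlex

IMAGE = "ghcr.io/cinnamon/kotaemon"
VARIANTS = {"lite": "main-lite", "full": "main-full"}  # tags from the commands above


def docker_run_command(variant: str = "lite", port: int = 7860) -> list[str]:
    """Build the `docker run` argument list for the chosen image variant."""
    if variant not in VARIANTS:
        raise ValueError(f"variant must be one of {sorted(VARIANTS)}")
    return [
        "docker", "run",
        "-e", "GRADIO_SERVER_NAME=0.0.0.0",
        "-e", f"GRADIO_SERVER_PORT={port}",
        "-p", f"{port}:{port}",
        "-it", "--rm",
        f"{IMAGE}:{VARIANTS[variant]}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is pulled by accident.
    print(shlex.join(docker_run_command("full")))
```

Passing the returned list to `subprocess.run` would start the container; printing it first makes it easy to compare against the commands in this guide.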

Install Without Docker 🚀

If you choose not to use Docker, you can install Kotaemon manually by following these steps:

  1. Clone the repository and install the required packages:

    conda create -n kotaemon python=3.10
    conda activate kotaemon
    git clone https://github.com/Cinnamon/kotaemon
    cd kotaemon
    pip install -e "libs/kotaemon[all]"
    pip install -e "libs/ktem"
    
    • At this stage, we’ve created a new conda environment named kotaemon with Python 3.10 and activated it. Then, we cloned the Kotaemon repository and installed the necessary libraries.
  2. Create a .env file according to the provided template.

  3. (Optional) Set up the built-in PDF viewer in your browser for easier document processing.

  4. Start the web server by executing the following command:

    python app.py
    
    • The default username and password are both set to admin, so remember those!

Setting Up GraphRAG 🧠

Kotaemon supports several GraphRAG implementations. The official MS GraphRAG indexing only works with the OpenAI or Ollama API, but you can use NanoGraphRAG for simpler integration.

Install and Set Up Nano GraphRAG 📦

  • Install the nano-graphrag library:
    pip install nano-graphrag
    
  • Set the environment variable for usage:
    export USE_NANO_GRAPHRAG=true
    

Install and Set Up LightRAG 💡

  • Install LightRAG:
    pip install git+https://github.com/HKUDS/LightRAG.git
    
  • Set the appropriate environment variable:
    export USE_LIGHTRAG=true
    

Install and Set Up MS GraphRAG 📊

  • Install MS GraphRAG:
    pip install "graphrag<=0.3.6" future
    
  • Don’t forget to set the GRAPHRAG_API_KEY in your environment or .env file for authentication.
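
To see at a glance which of the three options your current environment points to, you can inspect the variables named above. The sketch below is illustrative only: the function name, the set of accepted "true" spellings, and the precedence order are assumptions, not Kotaemon's actual selection logic.

```python
import os
from typing import Mapping

TRUTHY = {"1", "true", "yes"}  # assumed spellings of an enabled flag


def selected_graphrag(env: Mapping[str, str] = os.environ) -> str:
    """Return which GraphRAG variant the environment flags point to.

    Illustrative only: the precedence order here is an assumption,
    not Kotaemon's actual selection logic.
    """
    def flag(name: str) -> bool:
        return env.get(name, "").strip().lower() in TRUTHY

    if flag("USE_NANO_GRAPHRAG"):
        return "nano-graphrag"
    if flag("USE_LIGHTRAG"):
        return "lightrag"
    # Neither flag set: MS GraphRAG, which needs an API key.
    if not env.get("GRAPHRAG_API_KEY"):
        return "ms-graphrag (GRAPHRAG_API_KEY missing)"
    return "ms-graphrag"


if __name__ == "__main__":
    print(selected_graphrag())
```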

Customizing Applications 🎨

By default, Kotaemon stores data in the ./ktem_app_data directory. For more advanced setups, you can adjust the application configuration through the flowsettings.py and .env files.

Adding Your RAG Pipeline 🛠️

  1. You can modify existing pipelines or create new reasoning pipelines by navigating to libs/ktem/ktem/reasoning/.
  2. To customize index pipelines, just take a look at the example implementations found in libs/ktem/ktem/index/file/graph!
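
Before writing a new pipeline, it helps to see which modules already live in those folders. The sketch below simply lists the Python files in a directory; the function name `list_pipeline_modules` is made up for illustration, and it assumes you run it from the root of the cloned repository.

```python
from pathlib import Path


def list_pipeline_modules(package_dir: str) -> list[str]:
    """Return the names of Python modules found directly in package_dir."""
    root = Path(package_dir)
    if not root.is_dir():
        return []
    return sorted(
        p.stem for p in root.glob("*.py") if p.name != "__init__.py"
    )


if __name__ == "__main__":
    # Path taken from step 1 above; run this from the repository root.
    for name in list_pipeline_modules("libs/ktem/ktem/reasoning"):
        print(name)
```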

Offline Install - Intermediate (20 Minutes) 🌐

Download 📥

First, download the kotaemon-app.zip file from the latest release page.

Run the Installation Script 🏗️

After unzipping the downloaded file, navigate to the script folder and execute the installation script that matches your operating system:

  • Windows:

    run_windows.bat
    

    Just double-click the file to run it smoothly.

  • macOS:

    run_macos.sh
    

    On macOS, right-click the file and select “Open with Terminal.” In the dialog that opens, choose “All Applications” to ensure it opens with Terminal every time.

    Note: Be sure to check the “Always Open With…” option when configuring. After setup, you can double-click the file to execute it.

  • Linux:

    bash run_linux.sh
    

    Simply run this command in the terminal to execute the script.

Once the installation completes, you will be prompted to start ktem's UI; answer “yes” to proceed. If it starts successfully, the application will automatically open in your default browser!

Chapter 5: Starting the Application 🚀
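
The choice of script depends only on the operating system, so a launcher can pick it automatically. This is a small sketch, assuming the three script names listed above; the helper name `installer_for` is hypothetical.

```python
import platform
from typing import Optional

SCRIPTS = {
    "Windows": "run_windows.bat",
    "Darwin": "run_macos.sh",  # platform.system() reports macOS as "Darwin"
    "Linux": "run_linux.sh",
}


def installer_for(system: Optional[str] = None) -> str:
    """Return the installation script name for the given or current OS."""
    system = system or platform.system()
    try:
        return SCRIPTS[system]
    except KeyError:
        raise ValueError(f"no installation script listed for {system!r}") from None


if __name__ == "__main__":
    print(installer_for())
```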

Once the initial setup is completed or any changes are made, you can start the application simply by re-running the corresponding run_* script! The browser window will open, displaying the application interface.

Example Customization in flowsettings.py ⚙️

You can customize document storage (with full-text search capabilities) and vector storage according to your needs:

# Example customization in flowsettings.py
# Set preferred document storage. Options (placeholder, not literal Python):
# KH_DOCSTORE = (Elasticsearch | LanceDB | SimpleFileDocumentStore)

# Specify vector storage. Options (placeholder, not literal Python):
# KH_VECTORSTORE = (ChromaDB | LanceDB | InMemory | Qdrant)

# Enable or disable multimodal question answering
KH_REASONINGS_USE_MULTIMODAL=True

# Set or modify reasoning pipelines
KH_REASONINGS = [
    "ktem.reasoning.simple.FullQAPipeline",
    "ktem.reasoning.simple.FullDecomposeQAPipeline",
    "ktem.reasoning.react.ReactAgentPipeline",
    "ktem.reasoning.rewoo.RewooAgentPipeline",
]
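
The entries in KH_REASONINGS are dotted import paths given as strings. As a general illustration of how such strings can be turned into classes, here is a minimal sketch using `importlib`. It is not Kotaemon's own loader; the function name `resolve` is made up, and a standard-library class stands in for a pipeline so the example runs without Kotaemon installed.

```python
import importlib


def resolve(dotted_path: str):
    """Import `package.module.Name` and return the `Name` object."""
    module_path, _, attr = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.Name', got {dotted_path!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attr)


if __name__ == "__main__":
    # Standard-library stand-in so the sketch runs without Kotaemon installed.
    cls = resolve("collections.OrderedDict")
    print(cls.__name__)
```

One practical consequence: a typo in one of these strings surfaces as an import or attribute error, so that is the first thing to check if a custom pipeline does not appear.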

Environment Configuration via .env File 📄

Don’t forget to set environment-related variables like this:

# Environment configuration through .env file
OPENAI_API_BASE=https://api.openai.com/v1
OPENAI_API_KEY=<Enter your OpenAI API key here>
OPENAI_CHAT_MODEL=gpt-3.5-turbo
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002

AZURE_OPENAI_ENDPOINT=
AZURE_OPENAI_API_KEY=
OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-35-turbo
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=text-embedding-ada-002
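
A common mistake is leaving a placeholder such as `<Enter your OpenAI API key here>` in the file. The sketch below parses simple KEY=VALUE lines and reports keys that are missing, empty, or still placeholders. It is a deliberately simplified parser (no quoting or `export` handling), and both function names are made up for illustration.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return required keys that are absent, empty, or still a <placeholder>."""
    return [
        k for k in required
        if not values.get(k) or values[k].startswith("<")
    ]


if __name__ == "__main__":
    sample = (
        "OPENAI_API_BASE=https://api.openai.com/v1\n"
        "OPENAI_API_KEY=<Enter your OpenAI API key here>\n"
    )
    # prints ['OPENAI_API_KEY']
    print(missing_keys(parse_env(sample), ["OPENAI_API_BASE", "OPENAI_API_KEY"]))
```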

Starting the Server 🏁

Finally, to start your server, execute the following command:

# Command to start the server
python app.py
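
Once the server is running, you can confirm that the interface is reachable without opening a browser. This is a minimal sketch using the standard library; the function name `is_ui_up` is hypothetical, and the default URL is the address from the Docker section, which is an assumption for a manual install, so adjust it if your server listens elsewhere.

```python
import urllib.error
import urllib.request


def is_ui_up(url: str = "http://localhost:7860/", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url within timeout seconds."""
    # Bypass any configured proxy: the target is expected to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, but with an error status (e.g. 401 or 500).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing is serving yet.
        return False


if __name__ == "__main__":
    print("WebUI reachable" if is_ui_up() else "WebUI not reachable yet")
```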

Following this guide, even if you are a beginner, you should now be ready to enjoy using Kotaemon, whether for building applications or enhancing existing functionalities! Happy coding! 🎉
