Cloudron Forum

H2O LLM Studio, no-code GUI, fine-tuning LLMs

App Wishlist
5 Posts 2 Posters 1.4k Views 2 Watching
  • robi wrote on last edited by
    #1

    [image: H2O LLM Studio logo]

    Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs).
    https://github.com/h2oai/h2o-llmstudio/

    [images: H2O LLM Studio screenshots]

    With H2O LLM Studio, you can

    • easily and effectively fine-tune LLMs without the need for any coding experience.
    • use a graphical user interface (GUI) specially designed for large language models.
    • fine-tune any LLM using a large variety of hyperparameters.
    • use recent fine-tuning techniques such as Low-Rank Adaptation (LoRA) and 8-bit model training with a low memory footprint.
    • use advanced evaluation metrics to judge answers generated by the model.
    • track and compare your model performance visually. In addition, Neptune integration can be used.
    • chat with your model and get instant feedback on your model performance.
    • easily export your model to the Hugging Face Hub and share it with the community.
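As a rough illustration of why the LoRA technique listed above keeps the memory footprint low: instead of updating a full d x d weight matrix, LoRA trains two small matrices A (r x d) and B (d x r) and applies W + B @ A. The sizes below are illustrative, not H2O LLM Studio defaults.

```python
# Minimal parameter-count sketch of Low-Rank Adaptation (LoRA).
# d and r are made-up but typical values, used only for illustration.

d = 4096   # hidden size of one LLM layer
r = 8      # LoRA rank, a key hyperparameter

full_params = d * d        # trainable params if we fine-tuned the whole matrix
lora_params = 2 * d * r    # trainable params for the two LoRA adapter matrices

print(full_params)  # 16777216
print(lora_params)  # 65536
print(f"LoRA trains {lora_params / full_params:.2%} of this layer's parameters")
```

With these numbers, LoRA updates well under 1% of the layer's weights, which is what makes fine-tuning feasible on modest hardware.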

    Quickstart

    For questions, discussion, or just hanging out, come and join Discord!

    We offer several ways of getting started quickly.

    Using the CLI for fine-tuning LLMs: Kaggle | Open in Colab

    Conscious tech

    • LoudLemur wrote on last edited by LoudLemur
      #2

      Well done on finding this one, @robi! I was about to suggest it myself. How did I miss it here?

      https://huggingface.co/h2oai
      https://gpt-gm.h2o.ai/

      There is a video of a chap using it with Falcon 40b here:
      https://invidious.io.lol/watch?v=H8Dx-iUY49s&quality=dash

      I can't seem to find that model there at the moment: https://falcon.h2o.ai/

      Near the very end, you can see him upload (or link by URL) a document, which he then interrogates using the AI.

    • LoudLemur wrote on last edited by
      #3

        @robi On what sort of VPS would you consider running this?

    • robi wrote on last edited by
      #4

      @LoudLemur No different from the ones you would use to run the other projects that can run models.

          Conscious tech

    • LoudLemur wrote on last edited by LoudLemur
      #5

            Lots of work has continued on this project:
            https://github.com/h2oai/h2o-llmstudio/compare/v0.0.1...v1.14.1

      Here is an AI summary of the main improvements:

            "Overview
            H2O.ai's h2o-llmstudio, a tool for fine-tuning large language models (LLMs), has evolved significantly from its initial version (v0.0.1) to the latest (v1.14.1). This software, designed for both technical and non-technical users, has introduced new features and optimizations that enhance its functionality for enterprise applications.
            Technical Advancements
      The software has added support for new problem types, such as causal regression and classification modeling, making it versatile for different tasks. It has also shifted from Reinforcement Learning from Human Feedback (RLHF) to more efficient methods like Direct Preference Optimization (DPO) and Identity Preference Optimization (IPO), improving training efficiency. Additionally, Deepspeed integration for distributed training and advanced techniques like Low-Rank Adaptation (LoRA) have boosted performance, especially for larger models.
            Performance and Cost Benefits
            Research indicates significant performance improvements, with fine-tuned models reducing costs by up to 70% and cutting inference time by 75%. It can now handle 500% more requests, making it scalable for high-demand scenarios. In benchmarks like GAIA, it achieved 79.7% accuracy, nearing human-level performance at 92%, outperforming competitors.
            Usability Enhancements
            The no-code GUI has been refined, making it easier for users without coding experience to fine-tune LLMs. Features like a single max_length setting and improved user secrets handling with the 'keyring' library enhance usability and security. Integration with tools like Neptune and Weights & Biases (W&B) also helps track and compare model performance.

            Technical Advancements
            The software has introduced several new problem types, enhancing its applicability across various use cases:
            Causal Regression Modeling: Added in recent updates, this feature supports single-target regression tasks using LLMs, as noted in pull request (PR) 788 on the GitHub repository. This allows for more advanced predictive modeling.

            Causal Classification Modeling: Introduced via PR 449, it supports binary and multiclass classification, expanding the software's utility for classification tasks.

      DPO/IPO Optimization: A significant shift occurred with the introduction of Direct Preference Optimization (DPO) and Identity Preference Optimization (IPO) as alternatives to RLHF, starting with PR 530. This was further solidified with PR 592, which began deprecating RLHF, and PR 747, which fully removed it in favor of DPO/IPO/KTO (Kahneman-Tversky Optimization). This change improves training stability and efficiency.
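For intuition, the DPO objective the summary refers to can be sketched in a few lines of Python. This is a hedged illustration, not H2O LLM Studio's implementation, and all log-probability values below are made up: DPO rewards the policy for widening the chosen-vs-rejected margin relative to a frozen reference model.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """-log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))) for one preference pair."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy widened the preferred margin relative to the reference -> small loss.
low = dpo_loss(-2.0, -9.0, -5.0, -6.0)
# Policy inverted the preference -> large loss.
high = dpo_loss(-9.0, -2.0, -6.0, -5.0)
print(low < high)  # True
```

Unlike RLHF, no reward model or sampling loop is needed: the loss is computed directly from log-probabilities, which is where the training-stability benefit comes from.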

            Additional technical enhancements include:
            Deepspeed Integration: PR 288 introduced Deepspeed for sharded training on multiple GPUs with NVLink, replacing Fully Sharded Data Parallel (FSDP). This is particularly beneficial for large-scale distributed training, requiring CUDA toolkit 12.1.

            Advanced Optimization Techniques: Support for Low-Rank Adaptation (LoRA) and 4-bit QLoRA, mentioned in external announcements, reduces memory footprint and enables faster training, especially for resource-constrained environments.
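The memory-footprint claim behind 8-bit and 4-bit (QLoRA) training comes down to bytes per parameter. A back-of-the-envelope sketch, using an illustrative 7B-parameter model (not a figure from the project):

```python
# Approximate weight-storage cost at different precisions.
params = 7_000_000_000  # illustrative 7B-parameter model

gb = 1024 ** 3
fp16_gb = params * 2 / gb    # 16-bit weights: 2 bytes per parameter
int8_gb = params * 1 / gb    # 8-bit quantized: 1 byte per parameter
int4_gb = params * 0.5 / gb  # 4-bit (QLoRA): half a byte per parameter

print(f"fp16: {fp16_gb:.1f} GB, int8: {int8_gb:.1f} GB, int4: {int4_gb:.1f} GB")
```

Each halving of precision halves the weight memory, which is why 4-bit QLoRA plus LoRA adapters can fit models on a single consumer GPU that full-precision fine-tuning cannot.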

            Performance and Scalability Improvements
            The software has seen measurable performance gains, as highlighted in recent business announcements:
      Cost Reduction: Fine-tuned open-source LLMs have reduced expenses by up to 70%, making it cost-effective for enterprises.

            Latency Improvement: Optimized processing has cut inference time by 75%, enhancing real-time application performance.

            Scalability: The platform now handles 500% more requests than previous versions, as noted in the same announcement, making it suitable for high-demand scenarios.
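To make the quoted figures concrete (a rough arithmetic check, not data from the project): a 75% cut in inference time means each worker serves 4x the requests, and "500% more requests" means 6x the original volume, not 5x.

```python
# Relating the quoted latency and scalability numbers.
old_latency_s = 1.0
new_latency_s = old_latency_s * (1 - 0.75)  # 75% reduction in inference time

per_worker_speedup = old_latency_s / new_latency_s
total_capacity_factor = 1 + 5.0             # "500% more" = 6x the original

print(per_worker_speedup)     # 4.0
print(total_capacity_factor)  # 6.0
```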

            Benchmark performance is also notable:
      In the GAIA benchmark, h2oGPTe Agent achieved 79.7% accuracy, with human-level performance measured at 92%, outperforming general-purpose models from Google and Microsoft, which scored below 50% (see the announcement H2O.ai Breaks New World Record for Most Accurate Agentic AI for Generalized Assistants).

            Usability and User Experience Enhancements
            H2O LLM Studio has focused on improving accessibility and usability:
      No-Code GUI: The no-code interface, a core feature since its inception, has been refined, making it intuitive for users without coding experience, as mentioned in external blogs and LinkedIn posts.

            Configuration Simplification: PR 741 introduced a single max_length setting, resembling the chat_template functionality from the transformers library, simplifying model setup.
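The single max_length idea can be sketched as follows: one limit is applied to the fully templated conversation instead of separate prompt and answer limits. The template and function below are hypothetical, not H2O LLM Studio's actual format.

```python
# Hypothetical chat formatter illustrating a single max_length limit.
def render_chat(turns, max_length):
    """Join (role, text) turns into one prompt, truncated to max_length chars."""
    text = "".join(f"<|{role}|>{content}" for role, content in turns)
    return text[-max_length:]  # keep the most recent context when truncating

prompt = render_chat(
    [("user", "Hi"), ("assistant", "Hello!"), ("user", "Summarize X")],
    max_length=64,
)
print(len(prompt) <= 64)  # True
```

Truncating the rendered conversation as a whole (here from the left, keeping recent turns) avoids the bookkeeping of reconciling per-field limits, which is the simplification the PR describes.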

            User Secrets Handling: PR 364 improved security with support for the 'keyring' library, including automatic migration of user settings, enhancing data protection.

      Integration with Tools: Integration with Neptune and Weights & Biases (W&B) for experiment tracking and performance comparison, as noted in AWS Marketplace listings, improves workflow efficiency.

            Model Support and Deployment Flexibility
            The software now supports a broader range of LLMs:
            Models like DeepSeek, Llama, Qwen, H2O Danube, and H2OVL Mississippi are supported, as mentioned in recent announcements, allowing users to fine-tune diverse state-of-the-art models.

      Deployment options have been expanded to include on-premises, cloud VPCs, or air-gapped environments, ensuring data privacy and compliance, as highlighted in platform descriptions.

            Evaluation and Export Capabilities
            Advanced Evaluation Metrics: Users can now judge generated answers using advanced metrics, improving model reliability, as noted in AWS Marketplace details.

      Model Export: The ability to export fine-tuned models to the Hugging Face Hub for sharing and collaboration has been enhanced, facilitating community engagement."

      Category                | Initial (v0.0.1)              | Latest (v1.14.1)                           | Impact
      Problem Types           | Limited to basic fine-tuning  | Includes causal regression, classification | Expanded use cases
      Optimization Techniques | RLHF-based                    | DPO/IPO/KTO, Deepspeed, LoRA, QLoRA        | Improved efficiency, reduced memory use
      Scalability             | Basic, limited requests       | Handles 500% more requests                 | Enhanced for enterprise needs
      Cost Efficiency         | Standard costs                | Up to 70% cost reduction                   | Significant savings
      Latency                 | Higher inference time         | 75% reduction in inference time            | Faster real-time applications
      Usability               | Basic GUI                     | Enhanced no-code GUI, keyring support      | More accessible to non-coders
      Model Support           | Limited models                | Supports DeepSeek, Llama, etc.             | Broader model compatibility
      Benchmark Performance   | Not specified                 | 79.7% GAIA accuracy (92% human-level)      | Competitive edge in accuracy