Cloudron Forum

Old CPU / No GPU / Ollama Language Model?

Tags: ollama, gemma, qwen
13 Posts 4 Posters 122 Views 4 Watching
    LoudLemur wrote (#1, edited):

    gemma3:1b

    This works. We gave it a whopping 32GB of RAM, but you might be able to run it with less.

    qwen3:4b was too slow and hit the proxy timeout.

    In the Ollama terminal you can set some environment variables to help too:

    export OLLAMA_KEEP_ALIVE=24h
    export OLLAMA_FLASH_ATTENTION=false
    

    Variable                 Value   Effect
    OLLAMA_KEEP_ALIVE        24h     Keeps the model loaded in RAM (prevents reloading on every request)
    OLLAMA_FLASH_ATTENTION   false   More stable on older CPUs
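
A possible complement to the environment variables above: the Ollama chat API also accepts a `keep_alive` field on each request, and a request with an empty `messages` array just loads the model without generating anything. The sketch below builds such a body. `OLLAMA_URL` and `OLLAMA_TOKEN` are placeholders for your own values, and the helper names are made up for illustration.

```shell
# Build a /api/chat body that only loads the model and asks Ollama to keep it
# in memory for the given duration (default 24h, matching the export above).
preload_payload() {
  local model="$1" keep_alive="${2:-24h}"
  printf '{"model": "%s", "messages": [], "keep_alive": "%s"}\n' "$model" "$keep_alive"
}

# Hypothetical helper: OLLAMA_URL and OLLAMA_TOKEN must be set to your own values.
preload_model() {
  curl -s -X POST "${OLLAMA_URL}/api/chat" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${OLLAMA_TOKEN}" \
    -d "$(preload_payload "$@")" | jq
}

# Usage: preload_model gemma3:1b 24h
```

Setting the duration per request like this only affects the model named in that request, which can be handy when you want one small model resident and the rest unloaded as usual.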

    Once Ollama is running on Cloudron and you have its API key, open the Ollama terminal and pull the model:

    ollama pull gemma3:1b
    

    Then, substituting your own URL and API token, run this from your local machine to ask Gemma for a joke and check that everything works:

    curl -X POST "https://YOUR_REAL_OLLAMA_URL/api/chat" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_TOKEN" \
      -d '{
        "model": "gemma3:1b",
        "messages": [{"role": "user", "content": "Hello! Tell me a short joke."}],
        "stream": false,
        "options": {
          "num_ctx": 1024,
          "num_thread": 6
        }
      }' | jq
    

    You will hopefully see a joke in the output, and maybe some laughing smileys! 😀
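
If a slower model hits the proxy timeout, as qwen3:4b did above, one thing to try is streaming. With `"stream": true`, Ollama returns one JSON object per line as tokens are generated, each carrying a fragment in `message.content`. Whether an early trickle of data is enough to keep the Cloudron proxy from timing out is an assumption, not something tested here. The helper names are hypothetical, and `OLLAMA_URL` / `OLLAMA_TOKEN` stand in for your own URL and token.

```shell
# Join the "message.content" fragments of a streamed /api/chat reply.
# Reads newline-delimited JSON on stdin; -j prints fragments without newlines,
# --unbuffered flushes after each fragment so text appears as it arrives.
join_stream() {
  jq -j --unbuffered '.message.content // empty'
  echo
}

# Hypothetical helper: builds the request with jq (so the prompt is escaped
# properly) and streams the reply. curl -N disables curl's output buffering.
ask_streaming() {
  local model="$1" prompt="$2"
  jq -n --arg model "$model" --arg prompt "$prompt" \
    '{model: $model, messages: [{role: "user", content: $prompt}], stream: true}' |
    curl -sN -X POST "${OLLAMA_URL}/api/chat" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${OLLAMA_TOKEN}" \
      -d @- | join_stream
}

# Usage: ask_streaming gemma3:1b "Hello! Tell me a short joke."
```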

      joseph (Staff) wrote (#2):

      I tried something like this on my 14-year-old CPU 🙂 It's still writing out the joke. I also have only 16GB of RAM to give.

        timconsidine (App Dev) wrote (#3):

        Sadly there's currently no substitute for RAM or VRAM.
        My Apple silicon laptop does an OK-ish job with 24GB of RAM (unified CPU/GPU memory).
        But mostly I just accept defeat and use Ollama Cloud models (or Venice T2EE cloud models).
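
As a rough back-of-the-envelope check on why memory is the bottleneck: the weights alone take about (parameter count × bits per weight ÷ 8) bytes. The sketch below does that arithmetic. It deliberately ignores the context cache and runtime overhead, so treat the result as a floor, and the bits-per-weight figure is an assumption about the quantization you actually download. `approx_weights_gb` is a made-up helper name.

```shell
# Estimate the memory needed for model weights alone, in GB (10^9 bytes).
# $1 = parameters in billions, $2 = bits per weight (e.g. 4 for 4-bit quantization).
approx_weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

approx_weights_gb 1 4    # a 1B model at 4 bits per weight
approx_weights_gb 4 16   # a 4B model at 16 bits per weight
```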

        Indie app dev, huge fan of Cloudron PaaS, scratching my itches : communityapps.appx.uk

          LoudLemur wrote (#4, edited, in reply to joseph):

          @joseph

          Hey, I hope it tells you that joke some day, Joseph! 🤡

          In the meantime, here is one it told me:

          Why did the chicken cross the playground? 🐔

          To get to the other slide! 🤣

          It is pretty "low-VRAM" humour!

            robi wrote (#5):

            There's a gemma3 270m instruct model that is fast, but it's also not very smart.

            Conscious tech

              LoudLemur wrote (#6, in reply to robi):

              @robi Can it tell jokes?

              I had to find out!
              You be the judge:

              Tell us a quick joke about a chicken

              Why don't chicken birds fly? 🐔
              Because they are too small.

              🤣

                robi wrote (#7, in reply to LoudLemur):

                @LoudLemur Of course it can, and at 200 tok/sec no less.

                It just makes a lot of mistakes. It had trouble with tool calling and web access.
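
To check a throughput figure like this on your own hardware, the final object of an Ollama response reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). A small sketch of the arithmetic, with `tokens_per_sec` as a made-up helper name:

```shell
# Tokens per second from Ollama's eval_count and eval_duration (nanoseconds).
tokens_per_sec() {
  awk -v c="$1" -v d="$2" 'BEGIN {
    if (d <= 0) { print "n/a"; exit }   # avoid dividing by zero
    printf "%.1f\n", c / (d / 1e9)
  }'
}

# With a non-streaming reply saved to reply.json (hypothetical file name):
#   tokens_per_sec "$(jq .eval_count reply.json)" "$(jq .eval_duration reply.json)"
```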

                  LoudLemur wrote (#8, in reply to robi):

                  @robi said:

                  200 tok/sec no less

                  nice!

                    robi wrote (#9):

                    Yes, compare it to LFM2.5-270M and 350M, which are geared toward on-phone inference. The prompts you are used to need to change, though, and be much more explicit so the models don't feel so dumb.

                      LoudLemur wrote (#10, edited, in reply to joseph):

                      @joseph Thanks for this story. We asked a smallish model (Qwen 9B) running on a lot of VRAM to tell us a joke.

                      Qwen didn't tell us a joke, it just started thinking about which chicken joke to tell us. 🐔

                      We looked at its thinking and it had created and considered over 200 chicken jokes before we decided the best thing to do was ... not wait for it!

                      We wish we had left it running to find out which joke it would have eventually chosen for us!
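
For models that support a separate thinking phase, recent Ollama versions accept a `think` field in the request, and the `num_predict` option caps how many tokens are generated. Whether the Qwen build described here honors `"think": false` is an assumption; the sketch only shows what such a request body might look like. `no_think_payload` is a made-up helper, and the prompt must not contain double quotes or backslashes because the function does no JSON escaping.

```shell
# Build a /api/chat body that asks a thinking-capable model to skip its
# thinking phase and stops generation after max_tokens tokens (default 256).
# Note: no JSON escaping is done, so keep quotes and backslashes out of the prompt.
no_think_payload() {
  local model="$1" prompt="$2" max_tokens="${3:-256}"
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "stream": false, "think": false, "options": {"num_predict": %d}}\n' \
    "$model" "$prompt" "$max_tokens"
}

no_think_payload qwen3:4b "Tell us a quick joke about a chicken"
```

The printed body can be passed to the same `curl` command shown in the first post via `-d`.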

                        timconsidine (App Dev) wrote (#11):

                        Why did the AI engine search for a chicken joke?

                        Because it was looking for poultry in motion! 🐔

                          LoudLemur wrote (#12, in reply to timconsidine):

                          @timconsidine 🤠

                            LoudLemur wrote (#13):

                            @andreasdueren Thank you for Hermes! It is a great choice for us and it also tells a funny chicken joke!

                            hermes-4.3-36b

                            Sure! Here's a clucktastic one:

                            Why did the chicken join a band?

                            To learn how to make some "eggcellent" beats! 🐔🎸

                            (If you want more, just say the word!)
