Run s3_media_upload script

Matrix (Synapse/Element) · 22 Posts · 5 Posters · 2.5k Views
robi (#2):

    First, find out where it's expecting the cache file. Then see if you can specify a location for it on the command line.

    Alternatively, run it from the /app/data/ location in case it stores it right where it runs from.
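
That would look something like this (a sketch; the script's full path, /app/code/env/bin, is confirmed later in the thread, and cache.db lands in whatever directory you run from):

    # run from a writable directory so cache.db can be created there
    cd /app/data
    /app/code/env/bin/s3_media_upload update /app/data/data/media_store 1m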

nichu42 (#3):

    @robi
    Thank you for your response.

    I think I made a major step forward.

The script expects a database.yaml file to be present. It needs to include the user, password, database, and host entries, which can be copied over from the homeserver.yaml file.
Once you have prepared this, the script will create the cache.db on its own after you run the following command:

    s3_media_upload update /app/data/data/media_store 1m
    

Here, 1m means all files that haven't been touched for a month will be selected.

    Now the script is ready to upload. This can be triggered with the following command:

    s3_media_upload upload --delete /app/data/data/media_store s3_bucket_name
    

    Unfortunately, here I am stuck again. The script returns the following error message:

    botocore.exceptions.NoCredentialsError: Unable to locate credentials
    

    The script documentation states "This module uses boto3, and so the credentials should be specified", and refers to https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#guide-configuration
Here I am completely lost again and don't even know where to start. Is this maybe something that needs to be done from Cloudron, @girish?

robi (#4):

@nichu42 Yeah, those are S3 credentials for the bucket that need to be provided.

See if you can find example commands for the upload step.

nichu42 (#5):

        @robi
        The example command is what I stated above.
        The S3 credentials cannot be submitted on the command line.

        They are in the homeserver.yaml, but it seems that boto3 (whatever that is) doesn't read them but expects environment variables to be set. That's why I thought this might be something that needs to be done by Cloudron.

robi (#6):

@nichu42 Hmm, OK.

What's the error if you rename the .yaml file? (Just to make sure that's where it's looking for the info.)

nichu42 (#7):

            @robi homeserver.yaml is the configuration file for Synapse. It will not start without it.
            The S3 configuration is correct: Synapse uploads new media to the bucket.
            However, "boto3" needs different configuration as it seems.

robi (#8):

@nichu42 Right, here are the options:
              https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html

nichu42 (#9):

                @robi said in Run s3_media_upload script:

                @nichu42 right, here are the options:
                https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html

                Yes, that's what I figured.
                But I have no idea how to make any of these options work with Cloudron.
                The file system is read-only, so I cannot put a config file where "boto3" expects it (~/.aws/credentials).

                That's why I thought that maybe @girish has to enable the use of environment variables.
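
(A sketch of the two credential options from the boto3 guide linked above, assuming the writable /app/data directory; all values are placeholders:)

    # Option 1: environment variables, read directly by boto3
    export AWS_ACCESS_KEY_ID=AKIAXXXXXXXX
    export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxx

    # Option 2: a credentials file at a custom, writable path
    export AWS_SHARED_CREDENTIALS_FILE=/app/data/aws-credentials
    cat > /app/data/aws-credentials <<'EOF'
    [default]
    aws_access_key_id = AKIAXXXXXXXX
    aws_secret_access_key = xxxxxxxxxxxx
    EOF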

girish (Staff, #10):

@nichu42 Took me a while to figure out what/where this script was. I guess it's this: https://github.com/matrix-org/synapse-s3-storage-provider/blob/main/scripts/s3_media_upload ?

girish (Staff, #11):

@nichu42 You have to create a so-called database.yaml file manually, as per https://github.com/matrix-org/synapse-s3-storage-provider#regular-cleanup-job

                    "database.yaml should contain the keys that would be passed to psycopg2 to connect to your database. They can be found in the contents of the database.args parameter in your homeserver.yaml."

                    From what I can make out from the code, it needs to be like this:

                    postgres:
                        user: xx
                        password: yy
                        database: zz
                        host: postgresql
                    

                    Might be worthwhile asking upstream to document this...

nichu42 (#12):

                      @girish Thank you for responding!

                      Yes, this thread is about the script that you have linked (https://github.com/matrix-org/synapse-s3-storage-provider#regular-cleanup-job). It is part of Cloudron's Synapse installation and can be found in /app/code/env/bin.

                      I had already managed to make the database config as you have mentioned in your post.

The problem is: The script uses Boto3 (the AWS SDK for Python), which expects the S3 credentials either to be saved in the config file ~/.aws/credentials or set as environment variables, see
                      https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html

                      Please correct me if I'm wrong, but Cloudron doesn't grant me access to either of these. That's why I mentioned you in this thread. I think you'd have to enable one of these options to make the script work.

girish (Staff, #13):

@nichu42 From the link you posted, there are a bunch of environment variables you can set, both for the credentials themselves and for the config file. Have you tried those? Or is the question about how to use those env variables?

nichu42 (#14):

@girish Yes, correct: how do I set these environment variables with Cloudron?

girish (Staff, #15):

@nichu42 You are running this in a Web Terminal, right? You can just export foo=bar like in a normal terminal and then run the s3_media_upload script.
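
Concretely, that might look like this (a sketch with placeholder values; the variable names are the standard boto3 ones):

    export AWS_ACCESS_KEY_ID=[your access key]
    export AWS_SECRET_ACCESS_KEY=[your secret key]
    /app/code/env/bin/s3_media_upload upload --delete /app/data/data/media_store [your s3_bucket_name]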

nichu42 (#16):

@girish Yay! Thank you.
I am all new to this Linux game, so I wasn't aware I could just set environment variables like that.

                              For everyone else, this is what you need to do:

                              ──────────────────────────────

1. Set up S3 with Synapse. See my post here: https://forum.cloudron.io/post/60415

2. Create a database.yaml file in /app/data/configs that contains the Postgres database credentials.
   You can find those in the existing homeserver.yaml file.

       user: xxx
       password: xxx
       database: xxx
       host: postgresql

3. Create a script (e.g., s3cleanup.sh) with the following contents:

       #!/bin/bash
       cd /app/data/configs
       export AWS_ACCESS_KEY_ID=[your S3-compatible access key]
       export AWS_SECRET_ACCESS_KEY=[your S3-compatible secret access key]
       /app/code/env/bin/s3_media_upload update /app/data/data/media_store 1m
       /app/code/env/bin/s3_media_upload upload --delete --endpoint-url https://yours3storageendpoint.com /app/data/data/media_store [your s3_bucket_name]

4. Run the s3cleanup.sh script.
   It will look up media that hasn't been touched for 1m (= 1 month) or whatever you set above. The duration must be an integer, followed by either m = month(s), d = day(s) or y = year(s); see the examples after this list.
   It will create a cache.db file that refers to the media that matches your criteria.
   In the second step, it will upload all files from the cache.db to your S3 storage and delete the local copies.
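
For example, other duration values would look like this (same media_store path as above):

    /app/code/env/bin/s3_media_upload update /app/data/data/media_store 30d   # 30 days
    /app/code/env/bin/s3_media_upload update /app/data/data/media_store 6m    # 6 months
    /app/code/env/bin/s3_media_upload update /app/data/data/media_store 1y    # 1 year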

                              The output looks like this:

                              Syncing files that haven't been accessed since: 2022-12-25 14:59:14.674154
                              Synced 603 new rows
                              100%|████████████████████████████████████| 603/603 [00:00<00:00, 16121.24files/s]
                              Updated 0 as deleted
                              100%|████████████████████████████████████| 603/603 [03:25<00:00,  2.93files/s]
                              Uploaded 603 media out of 603
                              Uploaded 3203 files
                              Uploaded 263.6M
                              Deleted 603 media
                              Deleted 3203 files
                              Deleted 263.6M
                              

                              Edit: Added path /app/data/configs to script to make it work as cron job.
                              Edit2: Added more choices for duration suffixes in 's3_media_upload update' job.
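
For the cron part, a minimal crontab entry might look like this (the schedule is an arbitrary example; adapt it to however your setup runs scheduled jobs):

    # Run the S3 media cleanup every Monday at 03:00
    0 3 * * 1 /bin/bash /app/data/s3cleanup.sh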

                              Disclaimer: This is to the best of my knowledge and understanding. It worked for me, but I accept no liability for loss of data on your server caused by my incompetence. 😉

WiseMetalhead (#17):

                                @nichu42 said in Run s3_media_upload script:

                                Run the s3cleanup.sh script

                                How do I run this script?
                                When I call bash /app/data/s3cleanup.sh, I get the following output:

                                /app/data/s3cleanup.sh: line 2: cd: $'/app/data/configs\r': No such file or directory
                                
                                usage: s3_media_upload update [-h] base_path duration
                                s3_media_upload update: error: argument duration: duration must be an integer followed by a 'd', 'm' or 'y' suffix
                                
                                usage: s3_media_upload [-h] [--no-progress] {update-db,check-deleted,update,write,upload} ...
                                
                                s3_media_upload: error: Could not open 'cache.db' as sqlite DB: unable to open database file
                                
nichu42 (#18):

                                  @WiseMetalhead said in Run s3_media_upload script:

                                  How do I run this script?
                                  When I call bash /app/data/s3cleanup.sh, I get the following output:

                                  /app/data/s3cleanup.sh: line 2: cd: $'/app/data/configs\r': No such file or directory
                                  

It seems your script uses CR+LF line endings where it should have LF only (resulting in the trailing \r at the end of the path, which makes it invalid).
                                  Are you using Windows? If so, I recommend using Notepad++ (https://notepad-plus-plus.org/).
                                  Open your script, then select Edit > EOL Conversion > Unix (LF). Save, upload and try again.
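
Alternatively, the line endings can be fixed directly in the Web Terminal with sed, which strips the trailing carriage return from each line in place:

    sed -i 's/\r$//' /app/data/s3cleanup.sh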

WiseMetalhead (#19):

                                    @nichu42 said in Run s3_media_upload script:

                                    Open your script, then select Edit > EOL Conversion > Unix (LF). Save, upload and try again.

                                    It actually works. Thank you!

                                    Strangely enough, the script was originally created using the Cloudron file manager.

andreasdueren (#20):

@WiseMetalhead @nichu42 Sorry to open this up again, but I'm interested in setting Synapse up with S3 and am curious how your experience has been so far.

nichu42 (#21):

@andreasdueren Still running it as described above. Everything is fine.

andreasdueren (#22):

@nichu42 OK, thanks. I'll have to read through the whole documentation then and try this out.
