Unsolved Email / Dovecot, possible data corruption
d19dotca last edited by girish
I recently restored from a Cloudron backup as part of migrating to a new VPS provider. The restore seems complete in that it contains all of the mail for everyone (nothing missing), but there were reports and observations of a few dozen messages showing up unexpectedly in the Inbox or Sent folders with the date of the migration/restore rather than the date the message was actually received/sent. It's as if some messages didn't keep their original receive/send date and were instead given the time of the restore / disk write (which for me was about 11 PM Pacific Time on May 1st).
I found some similar reports online for Dovecot in general (nothing specific to Cloudron), and the solution presented in most cases seemed to be to run a command like
doveadm force-resync -u dustin
I don't see a way to run that kind of command though, as doveadm is not installed on the server. Is it supposed to be installed in order to run it, or do I need to be running this from inside of a container instead perhaps?
I also found this, which suggests using a script to correct the dates and then clearing the Dovecot index cache, but I'm a bit hesitant to go this route since it seems unofficial. Ref: https://blog.alan.lt/2020/01/fix-emails-date-time-after-migration-to.html
or do I need to be running this from inside of a container instead perhaps?
Yes, exactly that.
root@my-cloudron:~# docker exec mail doveadm -h
usage: doveadm [-Dv] [-f <formatter>] <command> [<args>]
acl add|debug|delete|get|recalc|remove|rights|set
altmove [-u <user>|-A] [-S <socket_path>] [-r] <search query>
auth cache|login|lookup|test
backup [-u <user>|-A] [-S <socket_path>] [-fPRU] [-l <secs>] [-r <rawlog path>] [-m <mailbox>] [-g <mailbox_guid>] [-n <namespace> | -N] [-x <exclude>] [-s <state>] [-t <start date>] -d|<dest>
batch [-u <user>|-A] [-S <socket_path>] <sep> <cmd1> [<sep> <cmd2> [..]]
config [doveconf parameters]
copy [-u <user>|-A] [-S <socket_path>] <destination> [user <source user>] <search query>
deduplicate [-u <user>|-A] [-S <socket_path>] [-m] <search query>
dict get|inc|iter|set|unset
director add|down|dump|flush|kick|map|move|remove|ring|status|up|update
dump [-t <type>] <path>
exec <binary> [binary parameters]
expunge [-u <user>|-A] [-S <socket_path>] [-m] <search query>
fetch [-u <user>|-A] [-S <socket_path>] <fields> <search query>
flags add|remove|replace
force-resync [-u <user>|-A] [-S <socket_path>] [-f] <mailbox mask>
fs copy|delete|get|iter|iter-dirs|metadata|put|stat
fts expand|lookup|optimize|rescan|tokenize
help <cmd>
import [-u <user>|-A] [-S <socket_path>] [-U source-user] [-s] <source mail location> <dest parent mailbox> <search query>
index [-u <user>|-A] [-S <socket_path>] [-q] [-n <max recent>] <mailbox mask>
instance list|remove
kick [-a <anvil socket path>] <user mask>[|]<ip/bits>
log errors|find|reopen|test
mailbox cache|cache|create|delete|list|metadata|mutf7|path|rename|status|subscribe|unsubscribe|update
move [-u <user>|-A] [-S <socket_path>] <destination> [user <source user>] <search query>
oldstats dump|reset|top
penalty [-a <anvil socket path>] [<ip/bits>]
process status
proxy kick|list
purge [-u <user>|-A] [-S <socket_path>]
pw [-l] [-p plaintext] [-r rounds] [-s scheme] [-t hash] [-u user] [-V]
rebuild attachments
reload
replicator add|dsync-status|remove|replicate|status
save [-u <user>|-A] [-S <socket_path>] [-m mailbox]
search [-u <user>|-A] [-S <socket_path>] <search query>
service status|stop
sieve activate|deactivate|delete|get|list|put|rename
sis deduplicate|find
stats dump
stop
sync [-u <user>|-A] [-S <socket_path>] [-1fPRU] [-l <secs>] [-r <rawlog path>] [-m <mailbox>] [-g <mailbox_guid>] [-n <namespace> | -N] [-x <exclude>] [-s <state>] [-t <start date>] -d|<dest>
user [-a <userdb socket path>] [-x <auth info>] [-f field] [-e <value>] [-u] <user mask> [...]
who [-a <anvil socket path>] [-1] [<user mask>] [<ip/bits>]
zlibconnect <host> [<port>]
@fbartels Interesting, thanks! I don't see any users in the container though to run commands against. Wonder how that mapping works in the container to the user mail on the file system.
EDIT: I see the users are the full email address rather than any username. I can use a command like
sudo docker exec mail doveadm user <emailAddress>
to see more details.
So it seems this may be more of a client-side issue but triggered by something unusual server-side.
Apple’s Mail app supposedly uses the file creation date as the mail date, even if the headers are correctly shown. This explains why I see some mail recent on my list despite it showing a header date of last week for example.
During the restore in Cloudron, it created files with the date of the restore rather than the original date, thus Mail.app showing some odd display issues.
What’s a kicker to me though is it’s only affecting a dozen or so messages by the looks of it and I can’t understand why that doesn’t impact everything if that were truly the case. I tried removing the account from my iPhone then re-adding it, same behaviour. Everything shows fine in Roundcube though, further suggesting this is more of a mail client side app issue.
What I don’t get is why this happened in the first place… I’ve done many Cloudron restores, never once seen this behaviour before. Is it possible there was a bug or something weird in the restore and how it laid out the mail files on the system, even though it would only really be seen by supposedly Apple’s own mail client apps and thus low impact? Has anyone else seen this before?
The link I provided earlier though is also from 2012, so I’m not sure how relevant it is anymore, so I may still be in the unknown. Lol.
EDIT: Interestingly I downloaded the Outlook app and set it up to my same account, and sure enough it shows the same behaviour as the Apple Mail app does, despite it still being different than Roundcube is showing it. So this issue may not be unique to Apple Mail app after all since Outlook is showing the same behaviour.
EDIT #2: Thunderbird doesn't display this issue and seems to resemble (more or less) how Roundcube shows it too. And looking at the files on the file system, I can't understand how Mail or Outlook would use the last-modified or creation date to display the date in the UI, because the last-modified and created dates are the same across all files from the migration last night. Theoretically, if that were the cause, it'd impact ALL mail as viewed in the Apple Mail or Outlook apps, which isn't the case currently. I don't see what's different about the files when comparing the metadata (stat command) of a working file vs an oddly displaying file. So something still seems off here, but perhaps more on the client-side than server-side. Just not sure why I've never seen this issue before though, which is what strikes me as real odd.
@girish I think I may have just fixed it actually. I THINK the issue is this:
Dovecot creates an index file for each folder in each mailbox. Apps like Apple Mail and Outlook when they haven't ever had the account before will read first from the Dovecot index files before generating its own local cache / index, then it will later do its own magic to make things even better user experience, but the main metadata like dates on messages and such come from the Dovecot index files it generates on the first time. After that, Apple Mail and Outlook usually won't use the Dovecot files anymore and will keep a local index.
In my case, I believe the Index files were corrupt or had wrong dates inside them for some messages, so I had to delete them and recreate them. Here's the commands I ran for Phase 1:
docker exec mail doveadm index -u <emailAddress> '*'
The above commands will remove all Dovecot-related index files from the main directory plus each sub-mailbox folder (e.g. Sent, Drafts, Archive, etc). The last command uses the Dovecot admin tool (doveadm) to recreate the indexes for the mail user specified after the -u option.
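The index-removal step described above might look something like the following sketch. It is not the exact set of commands that were run. The function name and the Maildir path are hypothetical, and it assumes the standard Dovecot index file names (dovecot.index, dovecot.index.log, dovecot.index.cache, dovecot.list.index*). It deliberately leaves dovecot-uidlist alone, on the assumption that removing it would change message UIDs and force clients to re-download everything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete Dovecot index files under a Maildir tree.
# The directory passed in is an assumption; the real location depends on
# how the mail container lays out its data.
remove_dovecot_indexes() {
    local maildir="$1"
    [ -d "$maildir" ] || { echo "not a directory: $maildir" >&2; return 1; }
    # Print each index file, then delete it. Message files in cur/ and new/
    # and dovecot-uidlist do not match these patterns and are not touched.
    find "$maildir" -type f \
        \( -name 'dovecot.index*' -o -name 'dovecot.list.index*' \) \
        -print -delete
}

# Example (path is a placeholder), followed by the rebuild from the post:
#   remove_dovecot_indexes /path/to/user/maildir
#   docker exec mail doveadm index -u <emailAddress> '*'
```

Running it against a copy of the mailbox first is probably wise, since this touches live mail data.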
The second step (Phase 2) is to remove the mail client's local index caches, and rebuild the mailbox. In my case, it was easiest to do the following (assuming Mail app is closed already):
- Navigate to /Users/<username>/Library/Mail/V9 on macOS 15 using Finder.
- Remove the account files for the mailbox. This is usually a GUID, so you may need to verify which one you're deleting if you have multiple accounts in the app.
- Navigate to /Users/d19dotca/Library/Mail/V9/MailData on macOS 15 in the same Finder window from step 1 above, and remove any files starting with Envelope such as the following file names in my case:
- Navigate to Settings > Internet Accounts in macOS Settings
- Delete the affected Mail account
- Re-create the Mail account you just deleted
- Open Mail app.
Now, the above is what I THINK is the solution. In my case, I actually did something extra (a Phase 3, if you will) in between clearing the index files on the server and clearing the local cache on the desktop. I didn't see the changes right away, so I assumed my steps had failed, but I believe I was simply thrown off by the fact that the local Mail app had its own cached index, which basically negated any changes to the Dovecot index files, since it wouldn't read them except on the very first add of the account.
For completeness, the extra step I did was run a bash script from this GitHub repo based on the blog post here: https://blog.alan.lt/2020/01/fix-emails-date-time-after-migration-to.html - This changes all the creation dates on the files to the Date found inside each file. After realizing what I did, I don't actually think this step was necessary, however I'm including it here just in-case it helps anyone else too, in case the first two main phases don't work alone.
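For anyone curious what that kind of script does, here is a minimal sketch of the idea, not the script from the blog post. It sets a message file's modification time from its first Date: header. The function name is hypothetical, and it assumes GNU touch (for parsing the header text via -d) and that Dovecot takes the received date for Maildir messages from the file's modification time.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set a message file's mtime from its Date: header.
# Assumes GNU touch, which can parse an RFC 2822 style date string with -d.
set_mtime_from_date_header() {
    local file="$1" header
    # Strip CRs, stop at the first blank line (end of the headers), and
    # keep the value of the first Date: header found.
    header="$(tr -d '\r' < "$file" \
        | sed -n '/^$/q; s/^[Dd]ate:[[:space:]]*//p' \
        | head -n 1)"
    [ -n "$header" ] || { echo "no Date header: $file" >&2; return 1; }
    touch -d "$header" "$file"
}

# Example over one folder (path is a placeholder):
#   for f in /path/to/user/maildir/cur/*; do
#       set_mtime_from_date_header "$f"
#   done
```

Note that the Date: header is the time the message was sent, so this only approximates the original received time.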
After all of that (probably more work than necessary), my Mail app on macOS is showing everything with the expected dates.
How many emails don't appear ?
Just to clarify, all emails were present, none were missing. It was more a matter of my Apple Mail client (and later Outlook when testing too) showing a dozen messages or so with wrong dates, and the date it was showing was the exact time of the migration last night.
There's still a few things I don't fully understand but since it's a live server, I'm hesitant to do much more work on it since it seems to be working now, lol, but I believe my message above will give a lot of details into how I solved it for my mailbox at least, and I'm working with others now to see if they need the same changes made (so far so good).
Is there a reason that the file dates aren't kept in a restore? I'd have expected them to stay the same. Much like a restore on macOS for example using Time Machine, the file keeps the original metadata such as the creation date, only thing changing is the "last accessed" time I believe. Is this something that could be improved perhaps, assuming it should be?
@d19dotca generally, there has been no need to preserve them. AFAIK, no web app requires time stamp of files to be preserved.
For rsync backups, we have to put in some extra work to preserve timestamps. They have to be stored somewhere since the backends don't usually support them uniformly (backends like s3 are object storages and not file storages). For tgz backups, the timestamp is stored inside the tar file format and restore also preserves timestamps, but this is only accidental.
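To illustrate the "stored somewhere" idea, here is a rough sketch of recording modification times in a side manifest before a backup and re-applying them after a restore. This is not how Cloudron does or will do it. The function names and manifest format are made up for illustration, and it assumes GNU find and touch and file names without newlines.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: keep file mtimes in a manifest next to a backup.

# Write "mtime<TAB>relative path" for every regular file under a directory.
# Keep the manifest outside the directory being scanned.
save_mtimes() {
    local dir="$1" manifest="$2"
    ( cd "$dir" && find . -type f -printf '%T@\t%p\n' ) > "$manifest"
}

# Re-apply the recorded mtimes to a restored copy of the directory.
# Files listed in the manifest but missing from the restore are skipped.
restore_mtimes() {
    local dir="$1" manifest="$2" ts path tab
    tab="$(printf '\t')"
    while IFS="$tab" read -r ts path; do
        if [ -e "$dir/$path" ]; then
            touch -d "@$ts" "$dir/$path"
        fi
    done < "$manifest"
}

# Example (paths are placeholders):
#   save_mtimes /path/to/maildata /path/to/backup/mtimes.tsv
#   ... back up, restore ...
#   restore_mtimes /path/to/restored/maildata /path/to/backup/mtimes.tsv
```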
@girish Ohhh! Maybe that's why I haven't seen this issue before. I've normally always used tgz backups, only using rsync the past few months, and this was the first restore since switching to that method. Perhaps that explains why I've never seen this before. I guess since some client apps are temperamental (Outlook and Apple Mail as two popular examples), I should stick to tgz backups as it may help me avoid this issue in the future.
I should stick to tgz backups as it may help me avoid this issue in the future.
Yes, alternatively, we should fix this. As mentioned, there hasn't been a use case yet to preserve timestamps but looks like we might have one here.
So, reading more, I think you are right. I found https://doc.dovecot.org/admin_manual/mailbox_formats/maildir/#usage-of-timestamps which says timestamps are indeed used in some fashion. And one old blog post which seems to match your behavior - https://mikegriffin.ie/blog/20130226-change-the-timestamp-of-maildir-files . Someone has written a script too - https://blog.alan.lt/2020/01/fix-emails-date-time-after-migration-to.html
I will investigate more tomorrow. Maybe we can fix up mail data timestamps post restoring.
Someone has written a script too - https://blog.alan.lt/2020/01/fix-emails-date-time-after-migration-to.html
Ah yes, that's the script I used.
I will investigate more tomorrow. Maybe we can fix up mail data timestamps post restoring.
Thanks Girish! Definitely seems like this is a low priority issue given that nobody else using Cloudron has run into this yet, but it's also likely something that will come up again as Cloudron grows with more users, and it would be great to see that handled in Cloudron automatically, making things easier for the admin users.