OpenArchiver: E-Mail Archiving incl. fulltext-search of attachments
Referring to the discussion here: https://forum.cloudron.io/post/112663 and after a very interesting update to the app (adding OCR for attachments) that was just released:
- Main Page: https://github.com/LogicLabs-OU/OpenArchiver
- Git: https://github.com/LogicLabs-OU/OpenArchiver
- Licence: AGPL 3.0
- Dockerfile: Yes
- Demo: https://demo.openarchiver.com/
- Summary: Open Archiver provides a robust, self-hosted solution for archiving, storing, indexing, and searching emails from major platforms, including Google Workspace (Gmail), Microsoft 365, PST files, as well as generic IMAP-enabled email inboxes. Use Open Archiver to keep a permanent, tamper-proof record of your communication history, free from vendor lock-in.
Key features:
- Universal Ingestion: Connect to any email provider to perform initial bulk imports and maintain continuous, real-time synchronization. Ingestion sources include:
  - IMAP connection
  - Google Workspace
  - Microsoft 365
  - PST files
  - Zipped .eml files
  - Mbox files
- Secure & Efficient Storage: Emails are stored in the standard .eml format. The system uses deduplication and compression to minimize storage costs. All data is encrypted at rest.
- Pluggable Storage Backends: Support both local filesystem storage and S3-compatible object storage (like AWS S3 or MinIO).
- Powerful Search & eDiscovery: A high-performance search engine indexes the full text of emails and attachments (PDF, DOCX, etc.).
- Thread discovery: The ability to discover if an email belongs to a thread/conversation and present the context.
- Compliance & Retention: Define granular retention policies to automatically manage the lifecycle of your data. Place legal holds on communications to prevent deletion during litigation (TBD).
- File Hash and Encryption: Email and attachment file hash values are stored in the meta database upon ingestion, meaning any attempt to alter the file content will be identified, ensuring legal and regulatory compliance.
- Comprehensive Auditing: An immutable audit trail logs all system activities, ensuring you have a clear record of who accessed what and when (TBD).
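To make the deduplication and file-hash points above a bit more concrete, here is a minimal sketch of how hashing at ingestion can give you both at once. This is not OpenArchiver's actual code: the `ArchiveStore` class, the choice of SHA-256 and zlib, and the in-memory dictionaries standing in for disk/S3 and the meta database are all my assumptions for illustration. Encryption at rest is left out entirely.

```python
import hashlib
import zlib


class ArchiveStore:
    """Hypothetical toy archive: content-addressed, compressed, hash-verified."""

    def __init__(self):
        self.blobs = {}  # digest -> compressed bytes (stand-in for disk or S3)
        self.meta = {}   # message id -> digest recorded at ingestion

    def ingest(self, message_id: str, raw_eml: bytes) -> str:
        """Store a raw .eml payload and record its hash (SHA-256 is an assumption)."""
        digest = hashlib.sha256(raw_eml).hexdigest()
        # Deduplication: identical content has the same digest, so it is stored once.
        if digest not in self.blobs:
            self.blobs[digest] = zlib.compress(raw_eml)
        self.meta[message_id] = digest
        return digest

    def verify(self, message_id: str) -> bool:
        """Return True if the stored content still matches the recorded hash."""
        expected = self.meta[message_id]
        try:
            raw = zlib.decompress(self.blobs[expected])
        except zlib.error:
            # Stored bytes are no longer valid compressed data.
            return False
        return hashlib.sha256(raw).hexdigest() == expected
```

Ingesting the same message twice under two ids leaves a single blob in `store.blobs`, and `store.verify(message_id)` returns `False` as soon as the stored bytes are changed after ingestion.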
- Notes: OpenArchiver has full-text search for attachments based on Meilisearch and Apache Tika (OCR!). I don't know of any other open-source mail archiver that does that…
- Alternative to / Libhunt link: Mail Archiver (https://forum.cloudron.io/post/112576), Mailpiler (https://forum.cloudron.io/post/2800). Whatever happened to Mailpiler? The Cloudron git seems abandoned: https://git.cloudron.io/vladimir.d/mailpiler
- Screenshots: