The growing use of email for business communication, together with increasingly strict regulations and laws, makes long-term retention of email data mandatory.
An inadequate data protection strategy and infrastructure can lead to data loss, severely impacting business continuity.
IT teams are often called upon to recover specific email data, whether to make good a data loss, to give the business access to past communication, or to answer legal or compliance queries.
Related: Twelve significant challenges of Data Storage and Data Management
Some Data loss horror scenarios.
Scenario A: You’re just back from vacation, full of energy and ready to get to work…but a few hours later, an unfortunate incident leads to a small office fire…luckily the office sprinklers put it out. However, in the process, your laptop is now wet, and the data on it (including all downloaded mail) is inaccessible…lost.
Scenario B: A deadly virus or ransomware infects your computer network and servers. This corrupts the email backup, throwing everyone out of gear and disrupting the workflow.
Scenario C: One of your users accidentally deletes some mail and wants it back.
Scenario D: Your users have to trim their mailboxes by deleting “unimportant” emails regularly to adhere to the mailbox quota policies of your organization. At some point, they may want to access historical data which has been deleted.
The business need for enterprise data protection
As part of information security, compliance, and IT requirements, businesses need to ensure that ALL the emails transacted by users are kept safe in a secondary or alternate store to facilitate:
- Fast search across the mailboxes of all users for regulation compliance and knowledge discovery
- Quick extraction of specific emails or the entire mailbox
- DIY – An easy way for users to help themselves find and download their own mail.
Choosing the right architecture for email backup, archiving, and restore can make an IT team more productive and, by minimizing disruption, raise the productivity of the whole organization.
Popular strategies to protect Email Data
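Before you read on, note that backup and archival are different: a backup is a periodic copy taken so that data can be restored after a loss, whereas an archive is a continuously captured, long-term record kept so that email can be searched and retrieved.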
Periodic Backup from endpoints
Endpoints are the client devices and software that users access to work with their email.
Many organizations deploy tools to take backups of the email data on the client devices of their users, typically PST or EML files. In some organizations, it’s up to the end users to ensure that their data is safe.
The backed-up data will contain a snapshot of the user’s mailbox on the endpoint (such as a desktop or mobile) at the time of the backup.
Typically, these backups are a collection of client data files (PST, EML, or another format) that cannot be searched, or have individual emails extracted from them, without considerable effort.
Furthermore, working with these files is too technical for most end users, so they cannot search the backup or restore mail themselves.
Use of this method:
If a user’s endpoint device or software is lost, fails, or becomes corrupted, the administrator can restore the last good backup to get that user up and running quickly.
Some challenges with periodic backup from the endpoints are:
- Possible data loss: Since these backups are taken at specific intervals and not continuously, a backup will not include emails exchanged and deleted between two backups.
- Cumbersome and Slow Search: This backup method does not store the data in a search-ready, restore-ready state. To find a mail, an administrator must know the period to look in, restore the correct PST file into a staging client setup, sync all the emails, and then search or restore as required. This process can take several hours or even days.
- Dependent on the IT team, no DIY: This backup method does not allow the users to help themselves.
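The data loss window described in the first challenge can be illustrated with a small simulation. This is only a sketch: the `Mail` class, the hour-based timeline, and the 24-hour schedule are hypothetical and stand in for whatever backup tool and schedule you actually use.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Mail:
    subject: str
    received_at: int            # hour at which the mail arrived (hypothetical timeline)
    deleted_at: Optional[int]   # hour at which the user deleted it, or None if kept


def captured_by_backups(mail: Mail, backup_hours: Sequence[int]) -> bool:
    """A snapshot backup only sees mail that is in the mailbox at backup time."""
    for hour in backup_hours:
        arrived = mail.received_at <= hour
        still_there = mail.deleted_at is None or mail.deleted_at > hour
        if arrived and still_there:
            return True
    return False


# Assumed schedule: one backup every 24 hours.
backup_hours = [0, 24, 48]

mails = [
    Mail("quarterly report", received_at=5, deleted_at=None),
    Mail("contract draft", received_at=26, deleted_at=30),  # arrives and is deleted between two backups
    Mail("meeting notes", received_at=10, deleted_at=40),
]

missed = [m.subject for m in mails if not captured_by_backups(m, backup_hours)]
print(missed)
```

The "contract draft" mail never appears in any backup because it arrived after one snapshot and was deleted before the next. Shortening the interval narrows this window but does not close it.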
Single box Journaling
It’s pretty easy on most mail systems to configure a journaling rule to send a copy of every mail transacted by selected or all users to an email ID (typically called a journal email ID).
Generally, this journal email ID is just another mailbox in the same mail system. The administrator configures a desktop email client, such as MS Outlook or Thunderbird, to download mail from this journal mailbox over POP to a local PC, creating PST or EML files. These PST and EML files are then backed up to secondary storage.
The journaling mailbox and the PST/EML files are regularly cleaned and rotated to prevent endpoint storage bloat.
Use of this method:
This method collects the mail of all users in one account and then splits it across multiple PST/EML files. This fragmentation makes it challenging to process the data or even to restore a specific email for a given user. It is a feel-good or notional backup that suffers from several of the same limitations as the first strategy.
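To make the cost of this fragmentation concrete, here is a minimal sketch of what finding one user's mail in a mixed journal amounts to. It assumes the journal files have already been restored and their messages are available as raw text. The in-memory `journal_files` dictionary and its file names are hypothetical stand-ins, since Python's standard library cannot read PST files directly.

```python
from email import message_from_string
from email.utils import getaddresses


def mails_for_user(journal_files: dict, user: str) -> list:
    """Linearly scan every mail in every journal file for one user's address."""
    found = []
    for file_name, raw_mails in journal_files.items():
        for raw in raw_mails:
            msg = message_from_string(raw)
            headers = msg.get_all("From", []) + msg.get_all("To", []) + msg.get_all("Cc", [])
            addresses = {addr.lower() for _, addr in getaddresses(headers)}
            if user.lower() in addresses:
                found.append((file_name, msg.get("Subject", "")))
    return found


# Hypothetical journal data: mail of all users mixed together, split across files.
journal_files = {
    "journal-part-1": [
        "From: alice@example.com\nTo: bob@example.com\nSubject: Budget\n\nDraft attached.\n",
        "From: carol@example.com\nTo: dave@example.com\nSubject: Lunch\n\nNoon?\n",
    ],
    "journal-part-2": [
        "From: dave@example.com\nTo: alice@example.com\nSubject: Re: Budget\n\nLooks fine.\n",
    ],
}

result = mails_for_user(journal_files, "alice@example.com")
print(result)
```

Every query has to open and parse every message in every file, so the effort grows with the total volume of mail for all users, not with the size of the one mailbox you care about.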
Some challenges with the Single box Journaling method are:
- High Risk: The single mailbox runs a high risk of corruption if it is not emptied in time. Also, the PST/EML files backed up to a secondary store are prone to tampering.
- Cumbersome and Slow Search: The backup mailbox and the data files contain a mixture of all users’ data, making it challenging to sift through and find specific information. An administrator must find the correct PST file, restore it into a staging client setup, sync all the emails, and then search or restore as required. This process can take several hours or even run across days.
- Dependent on the IT team, no DIY: This backup method is not conducive to allowing the users to help themselves. Besides the systemic complexity of accessing the data, the consolidated emails of all users can pose a privacy hazard.
- Cloud email solutions have throttled single box journaling: Most cloud email solutions, such as Google Workspace and Microsoft 365, have deployed rate controls that limit how fast mail can flow into a single mailbox. This effectively renders the strategy useless in an environment with many users and high mail flow.
- Mailbox size limits: Limitations in mailbox size can greatly impact the efficiency and productivity of an organization’s email journaling process. Many email solutions, whether on-premises or cloud-based, impose quotas on mailbox storage. When journaling emails from other mailboxes, the storage can quickly become filled, requiring manual efforts to either download the data to free up space or utilize multiple mailboxes for journaling.
- No option to optimize costs: Another limitation is the lack of options to move aging data to lower-cost cold storage for long-term retention, which can increase storage costs.
- Poor scalability: Additionally, most email solutions do not adequately support scale, making it difficult to search for and quickly access specific information when needed. This becomes especially problematic during compliance and audit scenarios, as exporting and restoring selective data from the archive can be challenging with traditional mailboxes.
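The mailbox size limit above is easy to quantify with back-of-the-envelope arithmetic. The sketch below estimates how long a single journal mailbox lasts before it reaches its quota. All the numbers are illustrative assumptions, not limits published by any vendor.

```python
def days_until_full(quota_gb: float, users: int,
                    mails_per_user_per_day: int, avg_mail_kb: float) -> float:
    """Estimate how many days a single journal mailbox lasts before hitting its quota."""
    daily_kb = users * mails_per_user_per_day * avg_mail_kb
    quota_kb = quota_gb * 1024 * 1024
    return quota_kb / daily_kb


# Assumed figures: 50 GB quota, 200 users, 80 mails per user per day, 100 KB per mail.
days = days_until_full(quota_gb=50, users=200, mails_per_user_per_day=80, avg_mail_kb=100)
print(f"Journal mailbox fills in about {days:.1f} days")
```

With these assumed figures the mailbox fills in roughly a month, after which someone has to download and clear it or set up another journal mailbox.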
Mailstore Backups from the server or backend storage
Most mail administrators do this as a matter of routine: they periodically back up the entire mail storage from the backend to a secondary store (relevant only for on-premises setups).
Depending on the mail solution deployed, the mailbox storage is captured with a snapshot tool, a file-copy tool such as rsync, or a database backup tool.
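As a rough illustration of the file-copy variant, the sketch below copies a whole mail store directory into a new timestamped folder. It uses Python's `shutil.copytree` as a simple stand-in for rsync or a snapshot tool. The `backup_mail_store` function, the directory layout, and the sample file are hypothetical, and a production job would also need to handle files that change during the copy.

```python
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def backup_mail_store(store_dir: Path, backup_root: Path) -> Path:
    """Copy the whole mail store into a new timestamped directory under backup_root."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = backup_root / f"mailstore-{stamp}"
    shutil.copytree(store_dir, target)  # raises FileExistsError if target already exists
    return target


# Demo on a throwaway directory with one hypothetical mailbox and one message.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "store"
    (store / "alice").mkdir(parents=True)
    (store / "alice" / "1.eml").write_text("Subject: hello\n\nbody\n")

    snapshot = backup_mail_store(store, Path(tmp) / "backups")
    copied = sorted(p.relative_to(snapshot).as_posix() for p in snapshot.rglob("*.eml"))
    print(copied)
```

The result is a raw copy of the store in its native layout, which suits a full server restore but offers no way to search for or pick out an individual email.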
Since many users download their mail over POP and have it deleted from the server automatically, this backup method will likely capture even less data than the first strategy of backing up the endpoints.
Use of this method:
This method is a must for backend server maintenance and management: it is what you use to restore the mail server or its storage after an irrecoverable crash or a disaster. It supports the disaster recovery requirement and should not be confused with an email backup that can be used for selective restoration or search.
Some challenges with this method are:
This method is not helpful while attempting to restore an individual email or a full mailbox of a user since this is a raw backup of the mailbox storage in its most native format and is only meant to be used during a server restore procedure.
Email Archiving to a separate system on-premise
It’s pretty easy on most email systems to configure a journaling rule to send a copy of every mail transacted by selected or all users to a separate on-premise archival platform, which ingests the email and retains the email in a search-ready form.
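What "search-ready" means can be sketched with a toy archive that indexes each journaled mail at the moment it is ingested. This is not how any particular archival product works. `TinyArchive` and its word-level inverted index are hypothetical and only illustrate the idea that the indexing cost is paid once at ingest, so that later searches do not have to rescan the mail.

```python
import re
from collections import defaultdict
from email import message_from_string


class TinyArchive:
    """Toy archive: stores each journaled mail and indexes its words at ingest time."""

    def __init__(self):
        self.messages = []             # raw mails; list position is the message id
        self.index = defaultdict(set)  # word -> set of message ids containing it

    def ingest(self, raw_mail: str) -> int:
        msg = message_from_string(raw_mail)
        msg_id = len(self.messages)
        self.messages.append(raw_mail)
        body = "" if msg.is_multipart() else msg.get_payload()
        text = " ".join([msg.get("From", ""), msg.get("To", ""), msg.get("Subject", ""), body])
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[word].add(msg_id)
        return msg_id

    def search(self, *words: str) -> list:
        """Return the ids of mails that contain all the given words."""
        sets = [self.index.get(w.lower(), set()) for w in words]
        return sorted(set.intersection(*sets)) if sets else []


archive = TinyArchive()
archive.ingest("From: alice@example.com\nTo: bob@example.com\nSubject: Invoice 42\n\nPlease pay the invoice.\n")
archive.ingest("From: carol@example.com\nTo: bob@example.com\nSubject: Lunch\n\nPizza on Friday?\n")

hits = archive.search("alice", "invoice")
print(hits)
```

A search is then a lookup and a set intersection rather than a restore-and-scan of PST files, which is the practical difference between an archive and the backup strategies described earlier.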
The main reason to keep this data on-premise could be an infosec requirement. However, enterprises and government organizations have increasingly accepted moving their data storage and management workloads to the public cloud.
You may want to evaluate whether your on-premises archival platform lets you keep all data online and search-ready, whether you can quickly find and restore any information, and whether your users can access their archived email safely and securely without being able to tamper with it.
Related: Components and costs of maintaining an on-premise setup
While this method doesn’t replace strategy 3 (a server or mail store backup), it certainly can help you retire strategies 1 and 2 and improve the productivity of the users and IT team.
Cloud Email Archiving
In this method, you would configure your primary mail platform to push/journal a copy of every mail transacted to a separate operational infrastructure on the cloud.
Related: 4 Reasons Why Email Journaling is Necessary for Your Enterprise
Having the data at a separate location in the cloud improves the data’s redundancy, reliability, and safety.
Related: How to Protect Your Cloud Data from Hackers: 6 Ways to Keep Your Data Safe
Here too, evaluate whether the cloud email archiving platform supports the following critical capabilities:
- Can you keep all your data central, online, and search-ready?
- Can you quickly find and restore any information?
- Can your users access their archived email safely and securely, without being able to tamper with it?
- Can the platform quickly scale with your growing data volume?
- What are the safety standards on the cloud platform to protect your data?
- Does the platform support disaster recovery?
A related article explains the differences between an on-premise archival setup and a cloud email archiving platform.
Related: Six signs of a practical data management strategy
Conclusion
If you are using an on-premise mail server platform, then strategy 3, maintaining server/backend storage backups, is a must from a service operations and disaster recovery (DR) perspective.
Strategies 1 and 2 (periodic endpoint backup and single box journaling), which are methods to back up email data, may no longer be required if you opt for strategy 4 or 5 (email archiving on-premise or on the cloud), since a copy of every mail is captured and stored in the archive store by design.
Many archiving platforms may give you tools to search for email, restore mail selectively, and even allow your users to access their archives (self-help), besides a host of other features.
Here is a table covering all the above strategies and mapping them against the required features to manage email data effectively.
| Strategy | 100% mail archived | Compliance needing fast search across mailboxes | Organisation-wide knowledge discovery | End-user self-service for discovery and recovery | Data safe in an independent infrastructure | All data online and search ready |
|---|---|---|---|---|---|---|
| Periodic BACKUP from the endpoints | No | Not possible | Not possible | Not available | No. Most backups are stored at the same site | Not possible |
| Many to one journaling | Yes | Not possible easily | Not possible easily | Not available | No. Most backups are stored at the same site | Not possible |
| Mail store BACKUPS | Not possible | Not possible | Not possible | Not possible | No. Most backups are stored at the same site | Not possible |
| Mail ARCHIVING on-premise | Yes | Yes. But not scalable | Yes. But not scalable | Sometimes available | No. Typically at the same site | Maybe. Some have secondary volumes |
| Mail ARCHIVING on cloud | Yes | Yes and elastic | Yes and elastic | Yes | Yes | Yes |
Once you are convinced that email archiving is the best way to manage email data, you may want to read this post on how to choose between on-premise, dedicated-on-cloud, and SaaS-on-cloud options for archiving email.
Also, see why Vaultastic is an excellent fit for archiving emails to the cloud from any primary email platform.
Related: Why cloud email archival