API vs Direct extractions from email archives – what’s best for my migration?

Posted by Liam Neate on Mar 17, 2015 Last updated Mar 04, 2024

There is a lot of debate around how best to extract data from legacy email archives. Some say API is the only way to go, as it’s the only way to ensure 100% reliability and data integrity. Others say only a Direct connection can deliver the extraction speeds enterprises demand when migrating TBs to the cloud.

The fact is, the ‘API camp’ and the ‘Direct camp’ are both right!

API data extraction from source email archives

Extraction via a source archive’s API, by definition, gives you the highest fidelity. After all, what better way to retrieve an item than via the technology that stored it in the first place? That’s why we offer API connections.

However, API has its downsides – you’re relying on the source system, which means it needs to be up and running. It may be slow, it may be busy – and actually retrieving terabytes of data over a short period can cause reliability problems in the platform itself. These platforms weren’t specified or designed for a scenario where all data is going to be retrieved as quickly as possible.

API extraction from Enterprise Vault

Let’s look at the API extraction approach in the context of Symantec’s Enterprise Vault.

Quite rightly, Symantec is unable to endorse a migration that uses a non-API approach. Whether you’re upgrading or restructuring Enterprise Vault (EV), migrating to Enterprise Vault.cloud, or even moving away from EV to, say, Microsoft Office 365, the Enterprise Vault API route is really the only way to go. Having said that, there are some serious caveats that we’ll cover later in this article.

Next, although it’s easy to directly access generic CIFS/SMB storage or widely used devices like EMC Centera, there are many other storage options (from vendors including Amazon, Rackspace, Dell and Hitachi) which rely on the Enterprise Vault Streamer API to act as a go-between.

The API also insulates data access from other ‘unpredictable’ scenarios, such as unusual database schemas.

Direct extraction from Enterprise Vault

Now let’s consider the Direct extraction approach.

Customer projects have shown that the direct connector can deliver performance up to 10x faster than using the API. This is especially the case when the EV archive is ‘busy’ serving end user retrieval requests or busy archiving.

The direct route also has a lower impact on the EV environment (as it does not go via the EV servers).

With Direct, you bypass all the issues related to the source archive and just deal with the native data itself. We have lots of expertise and over 2000 projects under our belt that prove we know what we’re doing when it comes to dealing directly with source platform data. Also, if you have a non-operational EV archive, the direct approach lets you get directly at the archive without the EV server even being ‘live’.

Direct cannot always be the answer for an EV migration though. Enterprise Vault can store data in ways that make it impossible to access directly. Customers often use the “secondary storage” feature to move older data to other locations, such as NetBackup or even the cloud. Such data cannot be accessed at the direct level. That’s why any EV migration solution worth its salt has to support API as well.

API and Direct Extraction

It’s clear to us that each approach has its own merits.

That’s why Transvault has always offered the choice of API or Direct connections (where possible) for the archive sources we deal with day in, day out. We’re the only email archive solution that lets you select the optimum connection route for the environment you’re working with: Direct, API or both!

Having both extraction methods available to you when scoping a migration project is the ideal. There will be situations when one method is more appropriate than the other. This depends on the data at hand, as well as myriad other factors such as the health of the archive’s API and the load on it when you need to extract. Some archives that Transvault supports extraction from don’t publish an API, so in those cases we have no choice but to go direct to the data.

Hybrid Extraction from Enterprise Vault

With Transvault, you can choose a purely API connection or a totally Direct connection, but it may be that the unique hybrid connection is right for you. Our hybrid extraction automatically switches from using the direct connector to the API connector whenever it encounters an item that it can’t access properly.

A screenshot from our Transvault Migrator connection wizard, configuring a connection to Enterprise Vault

Another fact is that some of the more complex projects our partners encounter require both types of connector. This allows them to cope with a mixed bag of different EV versions and storage types and EV servers in different ‘states of health’. Sometimes it’s difficult to know what you’ll need until you get going with the email archive migration.

The net result of a hybrid approach to extraction is that your migration goes at full speed whenever it can, without risking data loss or compromising integrity for any items that need the API to access them.
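To make the hybrid idea concrete, here is a minimal Python sketch of the "try direct first, fall back to the API for that item" pattern. It is an illustration only, not Transvault's implementation: `DirectAccessError`, `extract_hybrid` and the two reader functions are hypothetical names invented for this example, and the in-memory dictionaries simply stand in for archive storage and an archive API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


class DirectAccessError(Exception):
    """Hypothetical error: the item cannot be read straight from storage."""


@dataclass
class ExtractionResult:
    item_id: str
    payload: bytes
    route: str  # "direct" or "api"


def extract_hybrid(
    item_ids: Iterable[str],
    read_direct: Callable[[str], bytes],
    read_via_api: Callable[[str], bytes],
) -> List[ExtractionResult]:
    """Read each item directly; fall back to the API only for items that fail."""
    results = []
    for item_id in item_ids:
        try:
            payload, route = read_direct(item_id), "direct"
        except DirectAccessError:
            # Only this item takes the slower API route; the rest stay direct.
            payload, route = read_via_api(item_id), "api"
        results.append(ExtractionResult(item_id, payload, route))
    return results


# Stand-in data (assumption): "msg-2" has been moved to secondary storage,
# so it is reachable through the API but not at the direct level.
_DIRECT_STORE = {"msg-1": b"hello", "msg-3": b"world"}
_API_STORE = {"msg-1": b"hello", "msg-2": b"offloaded", "msg-3": b"world"}


def demo_read_direct(item_id: str) -> bytes:
    try:
        return _DIRECT_STORE[item_id]
    except KeyError:
        raise DirectAccessError(item_id) from None


def demo_read_via_api(item_id: str) -> bytes:
    return _API_STORE[item_id]


if __name__ == "__main__":
    for result in extract_hybrid(
        ["msg-1", "msg-2", "msg-3"], demo_read_direct, demo_read_via_api
    ):
        print(result.item_id, result.route)
```

The point of the design is that the fallback decision is made per item rather than per project, so a handful of items that are unreachable at the direct level do not force the whole migration onto the API route.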

But we hasten to add that Transvault always encourages its migration partners to err on the side of caution and still elect to use a fully API connector. Our API connector scales in parallel across multiple servers to give you industry-leading API performance too.

There is a lot of nuance to the situation around extraction. Make sure any vendor provides a clear view of the pros and cons of each approach. Ideally, the technology should then support whichever approach the customer chooses, rather than dictating the only way to go.

A final point to note if you’re doing an EV-EV migration is that Transvault offers an end-to-end API connection to both a source and a target EV archive (even between different versions of EV, such as V6 to V11+). This means each item migrated goes across in one jump – that’s to say, it’s extremely fast and compliant, with no need for interim files.

No other vendor has the same direct API-API pipe when migrating between different instances of Enterprise Vault.

Transvault – Let’s Move Together

Transvault is the only archive migration vendor that offers its partners and end-customers a choice over which source connection methodology they want to employ. We give you the best of both worlds depending on your needs. If our flexible approach to email archive migration appeals, then speak to our team today at info@transvault.com