Would Metadata Cleaning Spoil Your Evidence?

May 03, 2012 / in News / by RPost Marketing

By Xenia von Wedel – Sys-Con Media

RPost’s latest integration with Esquire is called iScrub. This new product removes metadata from important “reusable” documents such as loan application forms. One of the questions raised by using a product like this is what effect metadata cleaning would have on evidence used in the courtroom. We asked RPost CEO Zafar Khan to weigh in on how metadata changes electronic documents and the effect that can have.

In Dr. Ball’s blog, he smartly notes: “Application metadata resides within the file and moves with the file, not changing unless the contents of the file are altered. System metadata resides outside the file and can be altered without impacting the contents of the file. Hashing the file hashes its contents, not information about the file. That is, you only hash what’s stored inside the file, not its system metadata.” He also makes other points about how hashing works and how metadata cleaning would change the file.
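The distinction in the quote can be checked with a few lines of Python. This is only a minimal sketch under stated assumptions: the file name, its contents, and the fixed timestamp are made up for illustration, and rewriting the file without its author line merely stands in for what a real metadata cleaner would do.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path):
    """Hash only the bytes stored inside the file, not information about it."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Return the hash before any change, after a system-metadata change,
    and after a content change."""
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical document with an embedded author line.
        path = os.path.join(folder, "loan_form.txt")
        with open(path, "wb") as handle:
            handle.write(b"Author: J. Smith\nLoan application form\n")
        original = sha256_of_file(path)

        # Alter system metadata only: reset the access and modification times.
        os.utime(path, (946684800, 946684800))
        after_timestamp_change = sha256_of_file(path)

        # Alter what is stored inside the file (stand-in for metadata cleaning).
        with open(path, "wb") as handle:
            handle.write(b"Loan application form\n")
        after_cleaning = sha256_of_file(path)

    return original, after_timestamp_change, after_cleaning
```

Calling `demo()` should give the same hash for the first two values and a different one for the third: changing the timestamps leaves the hash alone, while changing what is stored inside the file does not.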

The question one should consider, however, is: what record is one trying to preserve? Is it what resided on someone’s computer in draft form at a point in time, or what someone said (wrote) and claimed to have sent, or what content was in fact received?

What was claimed to have been sent can certainly be different from what was received.

What was claimed to have been sent could easily have been edited and changed by the sender in their sent folder, or programmatically changed by scanners, archives, or metadata cleaners.

Metadata cleaners are implemented precisely to change the content, so that what the sender claims (or believes) to have sent differs from what was actually sent, and certainly from what was received.

Now, this might sound confusing, but it’s quite simple if one looks at a typical message flow:

Sender → metadata cleaner → sender archive → sender mail transport server → receiver

Or, with metadata cleaning happening as a cloud service,

Sender → sender archive → sender mail transport server → metadata cleaner → receiver

In both of the above cases, what the sender sent is not what the receiver received, and deliberately so (at the sender’s wishes). Further, with metadata cleaning happening as a cloud service, neither what the sender sent nor what is in the sender’s archive matches what the receiver received; again, deliberately.
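The two flows can be sketched in Python to show which copies match at each stage. This is a toy model, not a description of any real product: `toy_cleaner` is a hypothetical stand-in that simply drops lines starting with "Author:", and the sample document is made up.

```python
import hashlib


def digest(text):
    """SHA-256 of the document text as it exists at one stage."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def toy_cleaner(text):
    # Hypothetical stand-in for a metadata cleaner: drop "Author:" lines.
    kept = [line for line in text.split("\n") if not line.startswith("Author:")]
    return "\n".join(kept)


def run_flow(document, stages):
    """Pass the document through the named stages in order and record
    the digest of the copy seen at each stage."""
    seen = {"sender": digest(document)}
    current = document
    for name in stages:
        if name == "metadata cleaner":
            current = toy_cleaner(current)
        seen[name] = digest(current)
    return seen


DOCUMENT = "Author: J. Smith\nLoan application form"

# Cleaning before the archive (first flow in the text).
local_flow = run_flow(DOCUMENT, [
    "metadata cleaner", "sender archive",
    "sender mail transport server", "receiver",
])

# Cleaning as a cloud service, after the archive (second flow).
cloud_flow = run_flow(DOCUMENT, [
    "sender archive", "sender mail transport server",
    "metadata cleaner", "receiver",
])
```

In `local_flow`, the archive digest equals the receiver digest, though both differ from the sender’s original. In `cloud_flow`, the archive digest still equals the sender’s original, so neither matches what the receiver got.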

Further, there could certainly be delays in these processes – in the sender’s network before the sender’s mail transport server, in the metadata cleaner, or elsewhere in transmission.

Therefore, in any dispute about who said what to whom and when, one should consider precisely what content (message, attached documents, and “application metadata”) was received by the receiver and precisely when it was received.

When sent by RPost’s Registered Email service, RPost provides precisely this record.

Consider metadata cleaning in the following two processes:

Sender → iScrub metadata cleaner → sender archive → sender mail transport server → RPost Processing → receiver → RPost Registered Receipt record to sender

Or, with metadata cleaning happening as a cloud service,

Sender → sender archive → sender mail transport server → RPost metadata cleaner → RPost Processing → receiver → RPost Registered Receipt record to sender

At the “RPost Processing” stage of the above process flows, RPost hashes the content as it exists at that point in time, after metadata cleaning (note that RPost also has additional features, not discussed here, to check for changes in content between the Sender and RPost). In the “RPost Receipt” stage, RPost cryptographically associates uniform times of receipt by the recipient with the “system metadata.”
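The general idea of hashing content at one point in time and binding that hash to a time of receipt can be illustrated as follows. This is a generic sketch using an HMAC over a content hash and a timestamp; it is not RPost’s actual method, which the article does not describe. The key, the field names, and the functions `make_receipt` and `verify_receipt` are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret held by whoever issues the receipt (illustration only).
RECEIPT_KEY = b"example-key-for-illustration-only"


def _seal(content_sha256, received_at):
    """HMAC over the content hash and the time, so neither can be swapped."""
    body = {"content_sha256": content_sha256, "received_at": received_at}
    serialized = json.dumps(body, sort_keys=True).encode("utf-8")
    return hmac.new(RECEIPT_KEY, serialized, hashlib.sha256).hexdigest()


def make_receipt(content, received_at):
    """Hash the content as it exists now (after any cleaning) and bind
    that hash to the stated time of receipt."""
    content_sha256 = hashlib.sha256(content).hexdigest()
    return {
        "content_sha256": content_sha256,
        "received_at": received_at,
        "seal": _seal(content_sha256, received_at),
    }


def verify_receipt(receipt, content):
    """Check that the seal is intact and that the content matches the hash."""
    expected = _seal(receipt["content_sha256"], receipt["received_at"])
    seal_ok = hmac.compare_digest(expected, receipt["seal"])
    content_ok = hashlib.sha256(content).hexdigest() == receipt["content_sha256"]
    return seal_ok and content_ok
```

With a receipt made for the cleaned content, `verify_receipt` returns `True` for that same content and `False` if either the content or the recorded time is changed afterwards.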

As one can see, for an irrefutable record of who said what to whom and when (said another way, of the precise content received and when), one should rely on the RPost Receipt record.

This RPost Receipt is also durable.

Dr. Ball further points out in his blog that the act of opening a document or forwarding an email forensically changes the document and spoils any record (reliant on a hash of the document or message prior to opening or forwarding). This is why a digitally signed email is fragile; the act of forwarding it changes the message and breaks the ‘digital signature’.

RPost protects the evidentiary records of messages and their associated content, documents and metadata by packaging the email record as an RPost Receipt, in a special manner where the receipt can be third party verified for authentication of content, metadata (application and system), and times (not sender or recipient times, but uniform times). This record is preserved in a manner so that it can be forwarded by simple email to any recipient, and any recipient can verify the receipt and have the originals (message content, attachments, timestamps, application and system metadata) reconstructed and certified as authentic, on demand (RPost does this without storing any message content).

RPost’s Certified Email service does the same with RPost’s patented Digital Seal technology, but for the benefit of any recipient who also wants to verify the authenticity of the sender’s content, the time it was authored, and by whom, in a manner that is likewise durable against inadvertent tampering at the forensic level.

In summary, metadata cleaning followed by sending through RPost’s Registered Email service provides a forensic record of the application metadata and the associated system metadata, after the cleaning occurred. If one is looking to have a proof record of the precise content (and metadata) of what the recipient actually received, RPost does that with its Registered Receipt email record.

It should matter more what was finally received than a draft or non-final content that was on someone’s computer at a point in time. Therefore, the Registered Email service should be a requirement for proof of content received, which (due to metadata cleaning and other things) can be deliberately different from what was sent, stored in the sender’s sent archive, stored in the sent folder, or stored on their hard drive or document management system.