I was recently amused by a blog post from a group who apparently defeated PDF's DRM system by using Gmail's "convert to HTML" option. I nearly fell off my chair when I read the claim "(it) works regardless of the file's usage restrictions..". Yes, under certain circumstances you can gain access to text or other components of a PDF document that has policy protection on it, but *only* if the person applying the policies set them to allow this type of access AND did not encrypt the PDF. Keep in mind that PDF is a completely free, open and available standard that anyone can implement. There are several third-party SDKs for manipulating PDF documents. Before you read the blog above, it is extremely helpful to understand how the encryption and DRM mechanisms work.
In general, if you do not want anyone other than the intended recipient to view a PDF, you should encrypt it. By default, the encryption level for compatibility with Acrobat 5.0 and later is 128-bit RC4. Once the contents of a PDF are encrypted with a strong key, there is no practical way for Gmail or any other application to crack it open by brute force. The PDF is turned into ciphertext that is completely incomprehensible to anyone without the key to open it. I am so certain of this that I will provide $500 USD to the first person who can open this document within one year.
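To put a rough number on the brute-force claim, here is a back-of-the-envelope sketch. The guess rate is a made-up assumption rather than a benchmark, and the function name `years_to_search` is purely illustrative. It also assumes the attacker has to search the full 128-bit key space instead of guessing a weak password.

```python
# Rough brute-force estimate for a key of a given size.
# The guess rate used below is an assumption, not a measured figure.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_search(key_bits: int, guesses_per_second: float) -> float:
    """Expected years to find a key, trying half the key space on average."""
    keyspace = 2 ** key_bits
    expected_guesses = keyspace / 2
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = 1e12  # hypothetical: one trillion guesses per second
    print(f"{years_to_search(128, rate):.3e} years")
```

Even at that generous hypothetical rate, the result is on the order of 10^18 years, which is why the realistic attacks go after the password or the person rather than the key.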
A person encrypting a PDF document has several options. First, you can choose compatibility with earlier versions of Acrobat (5, 6) or jump straight to Acrobat 7.0 and higher. If you choose to encrypt for Acrobat 7, the default encryption method is AES, which is much harder (read: practically impossible) to crack using brute force.
You can also opt to encrypt all the document contents, or leave the metadata unencrypted. The latter is useful if you want the document to be searchable in real time based on its metadata. Note the lower section of the screenshot above: by default, the box is checked to allow text access to the document. If you leave this selected, some PDF applications can access the text. If you don't want this, deselect the option. After setting all of the options and pressing Next, you will still be given a generic warning that certain non-Adobe products might not enforce this document's policies. Note that if you do not select "require a password to open the document", encrypting it is of little use. Others will still not be able to copy the document using the text copy tool or Control-C, but other means can be employed.
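The warning about non-Adobe products makes more sense once you see how the usage restrictions are stored. In the PDF standard security handler, the permissions live in an integer (the /P entry of the encryption dictionary) where each bit grants one capability. A reader has to choose to honor those bits. The sketch below decodes such a value. The dictionary keys and the function name `decode_permissions` are illustrative names, and matching the Acrobat "text access" checkbox to the accessibility bit is my reading rather than something stated in the dialog.

```python
# Permission bits of the /P entry in the PDF standard security handler.
# Bit positions are 1-based in the PDF specification, so bit 3 has value 4.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy_or_extract": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_for_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}


def decode_permissions(p: int) -> dict:
    """Map a signed 32-bit /P value to {permission name: allowed?}."""
    unsigned = p & 0xFFFFFFFF  # /P is stored as a signed 32-bit integer
    return {
        name: bool(unsigned & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }


if __name__ == "__main__":
    # -3904 has every permission bit listed above cleared.
    for name, allowed in decode_permissions(-3904).items():
        print(f"{name}: {'allowed' if allowed else 'denied'}")
```

Nothing in the file format forces a third-party tool to check these flags once it can decrypt the content, which is why a password to open the document matters so much more than the flags do.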
To summarize so far, Acrobat has DRM capabilities to limit the following interactions with documents:
1. ability to disable printing
2. ability to disable cut and paste
3. ability to disable Control-PrintScreen
4. ability to disable local file saving
5. ability to disable accessibility
6. ability to make a document no longer exist
You must understand the scope and intended use of each of these and their built-in limitations. PDFs are like music: if you can render it once, it is possible to capture it and render it again. Even if we figured out a way to prevent all third-party screen-scraping software from capturing what you see on a computer screen, someone who has both access to the document for a single view AND the intent to distribute it further can simply take a digital photo of their computer screen to circumvent all of these. There is simply no way to stop someone who is intent on doing this using 1-5 above.
Another option is to place a dynamic watermark on the page, perhaps stating the user's name and address in bold gray text across the document. This too can be defeated if someone takes a screenshot of the document and uses a great tool like ... err "Adobe Photoshop" to take care of that nasty watermark. I am guessing the magic wand tool is your best friend here ;-)
So how can you protect a PDF? If you really want to make it secure and also
track the user's interaction with it, you would be wise to use Adobe Policy
Server. The policy server uses a model of persistent DRM that follows the
document everywhere it goes. If you feel the document is out of control and
you want to stop it, you can simply "destroy" the document, which will cause
it to fail to decrypt when someone opens it. Is there a way
around that? Sure - sneak into the office of the person who made the
policy, install a tiny pinhole camera near their desk and capture their
See what I am getting at? No matter what you do, there is a way around it if someone is really intent. The easier method is "social engineering" rather than brute force.
So here is a challenge. Take this document here (link to APS protected document) and try to render it with Gmail (or any other method). I will pay $500 USD to the first person who can show me the decrypted content of this document within one year of this post.
How would I do it? I would probably try to lure myself into providing a password to a site that offered me some form of membership, and hope that I was lazy enough to have used the same password for this document. D'oh!! Not gonna work: I typed a random phrase of about 13 characters to encrypt this using AES.
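For a sense of how much guessing a 13-character password represents, here is a small sketch of the arithmetic. It assumes each character is picked uniformly at random from a fixed alphabet. A typed "random phrase" is almost certainly less random than that, so treat these figures as upper bounds. The function name `password_entropy_bits` and the alphabet sizes are assumptions for illustration.

```python
import math


def password_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of `length` symbols drawn uniformly at random."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # Assumption: 13 characters drawn from the 95 printable ASCII characters.
    print(f"{password_entropy_bits(13, 95):.1f} bits")
    # Assumption: 13 characters drawn from lowercase letters only.
    print(f"{password_entropy_bits(13, 26):.1f} bits")
```

Under the first assumption the password carries about 85 bits, and about 61 bits under the second. Both are below the 128 bits of the key itself, so the password, not the cipher, is the part an attacker would go after.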