Abstract:
Complex document formats such as PDF and Microsoft's Compound File Binary Format can contain information that is hidden but recoverable, as a result of text highlighting, cropping, or the embedding of high-resolution JPEG images. Private information can be released inadvertently if these files are distributed in electronic form. Simple experiments involving the creation of test documents can determine whether a particular program embeds hidden information.
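The kind of test-document experiment mentioned above might be sketched as follows. This is only an illustration, not the authors' procedure: it assumes you have typed a unique marker string (here the made-up value `CANARY-7f3a`) into a test document, hidden it with the program under test (for example by cropping or covering it), and saved the file. The function names `inflate_pdf_streams` and `find_canary` are hypothetical. The scan checks the raw bytes and any Flate-compressed PDF streams for the marker in a few common encodings.

```python
import zlib


def inflate_pdf_streams(data: bytes) -> list[bytes]:
    """Best-effort zlib inflation of every 'stream ... endstream' body.

    Assumption: compressed streams use Flate (zlib). Streams using other
    filters, or encrypted files, are silently skipped.
    """
    chunks = []
    pos = 0
    while True:
        start = data.find(b"stream", pos)
        if start == -1:
            break
        body = start + len(b"stream")
        # Skip the end-of-line marker that follows the 'stream' keyword.
        if data[body:body + 2] == b"\r\n":
            body += 2
        elif data[body:body + 1] in (b"\n", b"\r"):
            body += 1
        end = data.find(b"endstream", body)
        if end == -1:
            break
        try:
            # decompressobj tolerates trailing bytes after the zlib data.
            chunks.append(zlib.decompressobj().decompress(data[body:end]))
        except zlib.error:
            pass  # Not Flate data; ignore this stream.
        pos = end + len(b"endstream")
    return chunks


def find_canary(data: bytes, canary: str) -> bool:
    """Return True if the marker is recoverable from the file's bytes."""
    patterns = [canary.encode(enc) for enc in ("utf-8", "utf-16-le", "utf-16-be")]
    haystacks = [data] + inflate_pdf_streams(data)
    return any(p in h for h in haystacks for p in patterns)
```

To try it, read the saved test file as bytes (for instance `open("test.pdf", "rb").read()`, where the file name is a placeholder) and pass it to `find_canary` together with the marker. A hit means the hidden text is still stored in the file. A miss is not proof of safety: fonts with custom encodings, other compression filters, or encryption can hide the marker from a byte-level search like this one.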
Keywords:
data privacy; document handling; Microsoft; PDF format; compound file binary format; document files; hidden information; high-resolution JPEG image embedding; private information; sensitive information leak prevention; text cropping; text highlighting; Computer security; Cryptography; Encryption; Government; Metadata; Portable document format; Transform coding; User centered design; JPEG; Microsoft Office; PDF; Photoshop; metadata; privacy; redaction