objects.PDF_File_Object.xsd
This schema was originally developed by The MITRE Corporation. The CybOX XML Schema implementation is maintained by The MITRE Corporation and developed by the open CybOX Community. For more information, including how to get involved in the effort and how to submit change requests, please visit the CybOX website at http://cybox.mitre.org.
PDF_File_Object
1.1
01/22/2014
The following specifies the fields and types that compose this defined CybOX Object type. Each defined object is an extension of the abstract ObjectPropertiesType, defined in CybOX Common. For more information on this extension mechanism, please see the CybOX Specification. This document is intended for developers and assumes some familiarity with XML.
Copyright (c) 2012-2014, The MITRE Corporation. All rights reserved. The contents of this file are subject to the terms of the CybOX License located at http://cybox.mitre.org/about/termsofuse.html. See the CybOX License for the specific language governing permissions and limitations for use of this schema. When distributing copies of the CybOX Schema, this license header must be included.
The PDF_File element is intended to characterize the structural and metadata information regarding a single PDF file.
The PDFFileObjectType type is intended to characterize the structural makeup of PDF files.
The Metadata field captures some useful metadata associated with the PDF file.
The Version field specifies the decimal version number portion of the string from the PDF Header that specifies the version of the PDF specification to which the PDF file conforms, e.g. '1.4'.
The Indirect_Objects field captures the indirect objects included in the PDF file, representing the contents of a document.
The Cross_Reference_Tables field captures the cross-reference tables included in the PDF file, used for facilitating random access of indirect PDF objects.
The Trailers field captures the trailers included in the PDF file, used for capturing offsets to the cross-reference table and important objects.
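As a rough illustration of the Version field described above, the following Python sketch pulls the decimal version number out of a PDF header. The function name `pdf_version` and the 1024-byte search window are assumptions of this sketch, not part of the schema.

```python
import re
from typing import Optional

# The PDF header looks like "%PDF-1.4"; only the decimal part is kept.
HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")


def pdf_version(data: bytes) -> Optional[str]:
    """Return the decimal version from the PDF header (e.g. '1.4'), or None."""
    # Assumption: the header sits near the start of the file, so only the
    # first 1024 bytes are searched.
    match = HEADER_RE.search(data[:1024])
    return match.group(1).decode("ascii") if match else None
```

The returned string is the kind of value the Version field is meant to hold.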
The PDFXrefTableListType captures a list of PDF cross-reference tables.
The Cross_Reference_Table field captures the cross-reference table contained in the PDF file, for the random access of indirect objects contained in the file.
The PDFXRefTableType captures the details of a PDF cross-reference table, which provides a capability for the random access of indirect objects contained in the file.
The Subsections field captures the subsections contained in the cross-reference table.
The Offset field specifies the offset of the cross-reference table from the beginning of the file, in bytes.
The Hashes field captures any hashes that were computed for the cross-reference table.
The PDFXrefTableSubsectionListType captures a list of cross-reference table subsections.
The Subsection field captures a single cross-reference table subsection in the list.
The PDFXrefTableSubsectionType captures details of subsections contained within a PDF cross-reference table.
The First_Object_Number field captures the object number of the first object for which there is a corresponding entry in this cross-reference subsection.
The Number_Of_Objects field captures the number of objects for which there are corresponding entries in this cross-reference subsection.
The Cross_Reference_Entries field specifies the cross-reference entries contained in this cross-reference subsection.
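The relationship between the three subsection fields above can be sketched in Python as follows. The sketch assumes the lines following the 'xref' keyword have already been split out; the helper name `parse_xref_subsections` and the dictionary layout are illustrative choices, not something the schema defines.

```python
from typing import List


def parse_xref_subsections(lines: List[bytes]) -> List[dict]:
    """Group the lines that follow the 'xref' keyword into subsections."""
    subsections = []
    i = 0
    while i < len(lines):
        # A subsection header holds two integers: the first object number
        # and the number of entries that follow it.
        first, count = (int(token) for token in lines[i].split())
        subsections.append({
            "First_Object_Number": first,
            "Number_Of_Objects": count,
            "Cross_Reference_Entries": lines[i + 1 : i + 1 + count],
        })
        i += 1 + count
    return subsections
```

Each dictionary mirrors one Subsection; the entry lines are left unparsed here.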
The PDFTrailerListType captures a list of PDF trailers.
The Trailer field captures a PDF file trailer contained in the file, used by applications for quickly locating the cross-reference table and certain special objects.
The PDFTrailerType captures the details of a PDF trailer.
The Size field captures the total number of entries in the file's cross-reference table.
The Prev field captures the byte offset from the beginning of the file to the beginning of the previous cross-reference table. This is only applicable for files that have more than one cross-reference table.
The Root field captures an indirect object reference that points to the catalog dictionary for the PDF document contained in the file.
The Encrypt field captures the PDF document's encryption dictionary, either through an indirect reference or embedded set of key/value pairs.
The Info field captures an indirect object reference that points to the document information dictionary.
The ID field captures an array of two strings that constitutes a file identifier.
The Last_Cross_Reference_Offset field captures the byte offset, relative to the beginning of the file, of the last cross-reference table contained in the file.
The Offset field specifies the offset of the trailer from the beginning of the file, in bytes.
The Hashes field captures any hashes that were computed for the trailer.
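To make the trailer fields more concrete, here is a minimal Python sketch that extracts a few of them from the bytes of a single trailer. It assumes the input contains only the text from the 'trailer' keyword through '%%EOF'; the regular expressions are deliberately simple and would not cope with nested dictionaries or comments. The function name `parse_trailer` is hypothetical.

```python
import re


def parse_trailer(chunk: bytes) -> dict:
    """Extract selected trailer fields from one trailer's raw bytes."""
    result = {}
    # Integer-valued keys.
    for key in (b"Size", b"Prev"):
        m = re.search(rb"/" + key + rb"\s+(\d+)", chunk)
        if m:
            result[key.decode("ascii")] = int(m.group(1))
    # Indirect references of the form "<object number> <generation> R".
    for key in (b"Root", b"Info"):
        m = re.search(rb"/" + key + rb"\s+(\d+)\s+(\d+)\s+R", chunk)
        if m:
            result[key.decode("ascii")] = (int(m.group(1)), int(m.group(2)))
    # The number after 'startxref' is the byte offset of the last
    # cross-reference table.
    m = re.search(rb"startxref\s+(\d+)", chunk)
    if m:
        result["Last_Cross_Reference_Offset"] = int(m.group(1))
    return result
```

Keys that are absent from the trailer, such as Prev in a file with a single cross-reference table, are simply left out of the result.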
The PDFTrailerIDType captures the details of a PDF ID value stored in a trailer.
The ID_String field captures one of the two strings that constitute the file identifier.
The PDFIndirectObjectListType captures a list of PDF indirect objects.
The Indirect_Object field captures a single PDF indirect object contained in the file.
The PDFObjectType captures the details of a PDF document indirect object, used in constructing and storing data associated with the PDF document.
The ID field specifies the identifier of the PDF indirect object, consisting of an object number and generation number.
The Contents field captures the contents of the PDF indirect object, including non-stream and stream data.
The Offset field specifies the offset of the PDF indirect object from the beginning of the file, in bytes.
The Hashes field captures any hashes that were computed for the PDF indirect object.
The type field specifies the basic type of the PDF indirect object.
The PDFIndirectObjectIDType captures the details of PDF indirect object IDs.
The Object_Number field captures the number portion of the indirect object ID.
The Generation_Number field captures the generation number portion of the indirect object ID.
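The object number, generation number, and offset of each indirect object can be located with a scan along the following lines. This is a sketch under simplifying assumptions: it searches the raw bytes for the pattern "N G obj" and so may also match text inside streams or strings. The name `find_indirect_objects` is illustrative.

```python
import re
from typing import List, Tuple

# "<object number> <generation number> obj", e.g. "12 0 obj".
OBJ_RE = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def find_indirect_objects(data: bytes) -> List[Tuple[int, int, int]]:
    """Return (object number, generation number, byte offset) triples."""
    return [
        (int(m.group(1)), int(m.group(2)), m.start())
        for m in OBJ_RE.finditer(data)
    ]
```

The third element of each tuple corresponds to the Offset field of the indirect object, measured in bytes from the beginning of the file.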
The PDFIndirectObjectContentsType captures the contents of a PDF indirect object, including both stream and non-stream portions.
The Non_Stream_Contents field captures the raw contents of the PDF indirect object excluding any stream data (i.e., everything after the 'obj' keyword and before the 'endobj' keyword, excluding anything between the 'stream' and 'endstream' keywords) as a string enclosed in an XML CDATA section.
The Stream_Contents field captures the stream contained within the PDF indirect object, if applicable.
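The split between non-stream and stream contents might be sketched as below. The input is assumed to be the bytes between the 'obj' and 'endobj' keywords. Trimming one end-of-line marker before 'endstream' is a heuristic of this sketch; a careful parser would rely on the /Length entry instead. The name `split_contents` is hypothetical.

```python
from typing import Optional, Tuple


def split_contents(body: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split an object body into (non-stream contents, raw stream or None)."""
    start = body.find(b"stream")
    end = body.rfind(b"endstream")
    if start == -1 or end == -1 or end < start:
        return body.strip(), None  # no stream in this object

    # Stream data begins after the EOL that follows the 'stream' keyword.
    data_start = start + len(b"stream")
    if body[data_start:data_start + 2] == b"\r\n":
        data_start += 2
    elif body[data_start:data_start + 1] == b"\n":
        data_start += 1

    raw = body[data_start:end]
    # Heuristic: drop the single EOL that precedes 'endstream'.
    if raw.endswith(b"\r\n"):
        raw = raw[:-2]
    elif raw.endswith((b"\n", b"\r")):
        raw = raw[:-1]

    non_stream = (body[:start] + body[end + len(b"endstream"):]).strip()
    return non_stream, raw
```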
The PDFStreamType captures details of PDF document stream objects, which represent arbitrary sequences of bytes.
The Raw_Stream element captures the raw, undecoded stream (i.e., everything between the 'stream' and 'endstream' keywords), as a hex string.
The Raw_Stream_Hashes field captures any hashes of the raw, undecoded stream.
The Decoded_Stream field captures the decoded stream (i.e., after undoing the specified filters in the correct order) as a hex string.
The Decoded_Stream_Hashes field captures any hashes of the decoded stream.
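One possible way to produce the four stream values above is sketched here in Python. It assumes the stream uses a single /FlateDecode filter, which is only one of several filters a PDF may specify, and it uses SHA-256 as an example hash. The function name `describe_stream` and the dictionary layout are illustrative.

```python
import hashlib
import zlib


def describe_stream(raw: bytes) -> dict:
    """Build hex strings and hashes for a raw stream and its decoded form."""
    # Assumption: the only filter applied is /FlateDecode (zlib/deflate).
    decoded = zlib.decompress(raw)
    return {
        "Raw_Stream": raw.hex(),
        "Raw_Stream_Hashes": {"SHA256": hashlib.sha256(raw).hexdigest()},
        "Decoded_Stream": decoded.hex(),
        "Decoded_Stream_Hashes": {"SHA256": hashlib.sha256(decoded).hexdigest()},
    }
```

Streams with several filters would need each filter undone in the order listed in the stream dictionary before the decoded values are computed.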
The PDFDocumentInformationDictionaryType captures details of the PDF Document Information Dictionary, used for storing metadata associated with the PDF document.
The Title field captures the title of the PDF document.
The Author field captures the name of the person who created the PDF document.
The Subject field captures the subject of the PDF document.
The Keywords field captures the keywords associated with the PDF document.
The Creator field captures the name of the application that created the original document, for cases where the document was converted to PDF from another format.
The Producer field captures the name of the application that converted the document to PDF, for cases where the document was converted to PDF from another format.
The CreationDate field captures the date and time that the document was created.
The ModDate field captures the date and time that the document was most recently modified.
The Trapped field captures a name object indicating whether the document has been modified to include trapping information.
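In the PDF itself, CreationDate and ModDate are normally stored as date strings of the form D:YYYYMMDDHHmmSS followed by an optional time zone offset. The sketch below converts such a string to a Python datetime. How the result is then serialized into the schema's date fields is not covered, and the name `parse_pdf_date` is an assumption of this example.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

DATE_RE = re.compile(
    r"D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([+\-Z])(\d{2})?'?(\d{2})?'?)?"
)


def parse_pdf_date(value: str) -> Optional[datetime]:
    """Parse a PDF date string; return None if it is not well formed."""
    m = DATE_RE.fullmatch(value.strip())
    if m is None:
        return None
    # Only the year is mandatory; missing parts take their lowest value.
    defaults = (None, 1, 1, 0, 0, 0)
    parts = [int(g) if g else d for g, d in zip(m.groups()[:6], defaults)]
    sign, tz_hours, tz_minutes = m.group(7), m.group(8), m.group(9)
    tzinfo = None  # no offset given: leave the datetime naive
    if sign == "Z":
        tzinfo = timezone.utc
    elif sign in ("+", "-"):
        offset = timedelta(hours=int(tz_hours or 0), minutes=int(tz_minutes or 0))
        tzinfo = timezone(-offset if sign == "-" else offset)
    try:
        return datetime(*parts, tzinfo=tzinfo)
    except ValueError:  # e.g. month 13
        return None
```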
The PDFXrefEntryListType captures a list of cross-reference table subsection entries.
The Cross_Reference_Entry field captures a single cross-reference subsection entry in the list.
The PDFXrefEntryType captures details of a cross-reference table subsection entry.
The Byte_Offset field captures the 10-digit number, padded with leading zeros if necessary, that specifies the number of bytes from the beginning of the file to the beginning of the object.
The Object_Number field specifies the 10-digit object number of the next free object.
The Generation_Number field specifies the 5-digit generation number to be used when an object with the same object number is created.
The type field specifies the type of the cross-reference entry.
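A cross-reference entry in a classic (non-stream) table is a fixed-width line: a 10-digit number, a 5-digit generation number, and the keyword 'n' or 'f'. The Python sketch below maps such a line onto the fields above. The labels "in-use" and "free" follow the PDF specification's terminology and are assumed here; the exact values permitted by PDFXrefEntryTypeEnum should be checked against the schema.

```python
import re

ENTRY_RE = re.compile(rb"(\d{10}) (\d{5}) ([nf])")


def parse_xref_entry(line: bytes) -> dict:
    """Parse one cross-reference entry line into a field dictionary."""
    m = ENTRY_RE.match(line)
    if m is None:
        raise ValueError("not a cross-reference entry: %r" % line)
    number, generation, kind = m.groups()
    if kind == b"n":
        # In-use entry: the first number is the object's byte offset.
        return {
            "type": "in-use",
            "Byte_Offset": int(number),
            "Generation_Number": int(generation),
        }
    # Free entry: the first number is the object number of the next free object.
    return {
        "type": "free",
        "Object_Number": int(number),
        "Generation_Number": int(generation),
    }
```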
The PDFDictionaryType captures a PDF dictionary as a set of key/value pairs, or as a reference to an indirect object that contains it.
The Object_Reference field captures a reference to an indirect PDF object that contains the dictionary, via its object and generation numbers.
The Raw_Contents field captures the contents of the dictionary as a string enclosed in an XML CDATA section.
The PDFFileMetadataType captures some metadata regarding the PDF file object.
The Document_Information_Dictionary field captures the details of the PDF Document Information Dictionary, which includes properties like the document creation date and producer, if present in the PDF document.
The Number_Of_Indirect_Objects field captures the number of indirect PDF objects contained in the file.
The Number_Of_Trailers field captures the number of trailers contained in the file.
The Number_Of_Cross_Reference_Tables field captures the number of cross-reference tables contained in the file.
The Keyword_Counts field captures the counts of various PDF keyword names in the file.
The encrypted field specifies whether the PDF file is encrypted.
The optimized field specifies whether the PDF file has been optimized.
The PDFKeywordCountsType captures the occurrences of various keywords in a PDF file.
The Page_Count field captures the number of occurrences of the '/Page' keyword in the PDF file, which provides an indication of the number of pages in the PDF document.
The Encrypt_Count field captures the number of occurrences of the '/Encrypt' keyword in the PDF file, which indicates that the PDF uses encryption.
The ObjStm_Count field captures the number of occurrences of the '/ObjStm' keyword in the PDF file.
The JS_Count field captures the number of occurrences of the '/JS' keyword in the PDF file.
The JavaScript_Count field captures the number of occurrences of the '/JavaScript' keyword in the PDF file.
The AA_Count field captures the number of occurrences of the '/AA' keyword in the PDF file.
The OpenAction_Count field captures the number of occurrences of the '/OpenAction' keyword in the PDF file.
The ASCIIHexDecode_Count field captures the number of occurrences of the '/ASCIIHexDecode' keyword in the PDF file.
The ASCII85Decode_Count field captures the number of occurrences of the '/ASCII85Decode' keyword in the PDF file.
The LZWDecode_Count field captures the number of occurrences of the '/LZWDecode' keyword in the PDF file.
The FlateDecode_Count field captures the number of occurrences of the '/FlateDecode' keyword in the PDF file.
The RunLengthDecode_Count field captures the number of occurrences of the '/RunLengthDecode' keyword in the PDF file.
The JBIG2Decode_Count field captures the number of occurrences of the '/JBIG2Decode' keyword in the PDF file.
The DCTDecode_Count field captures the number of occurrences of the '/DCTDecode' keyword in the PDF file.
The RichMedia_Count field captures the number of occurrences of the '/RichMedia' keyword in the PDF file.
The CCITTFaxDecode_Count field captures the number of occurrences of the '/CCITTFaxDecode' keyword in the PDF file.
The Launch_Count field captures the number of occurrences of the '/Launch' keyword in the PDF file.
The XFA_Count field captures the number of occurrences of the '/XFA' keyword in the PDF file.
The PDFKeywordCountType captures the obfuscated and non-obfuscated occurrences of a keyword.
The Non_Obfuscated_Count field captures the number of times the keyword occurred in the PDF file without any obfuscation.
The Obfuscated_Count field captures the number of times the keyword occurred in the PDF file with some form of obfuscation, such as with hex codes.
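The distinction between the two counts can be illustrated with the following Python sketch. PDF name objects may write any character as '#' followed by two hex digits, so '/J#53' is another spelling of '/JS'. The sketch scans raw bytes only, so keywords inside compressed streams are not seen. The function name `count_keyword` is hypothetical.

```python
import re
from typing import Tuple

# A name is '/' followed by characters that are not whitespace or delimiters.
NAME_RE = re.compile(rb"/([^\x00\t\n\x0c\r ()<>\[\]{}/%]+)")
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")


def count_keyword(data: bytes, keyword: str) -> Tuple[int, int]:
    """Return (non-obfuscated count, obfuscated count) for a keyword."""
    target = keyword.lstrip("/").encode("ascii")
    plain = obfuscated = 0
    for m in NAME_RE.finditer(data):
        raw = m.group(1)
        # Undo '#XX' hex escapes to get the name as a reader would see it.
        decoded = HEX_ESCAPE_RE.sub(lambda h: bytes([int(h.group(1), 16)]), raw)
        if decoded != target:
            continue
        if raw == target:
            plain += 1
        else:
            obfuscated += 1
    return plain, obfuscated
```

Names are compared as a whole, so counting '/JS' does not also count '/JavaScript', and counting '/Page' does not count '/Pages'.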
The PDFObjectTypeEnum is an enumeration of basic PDF document object types.
The PDFXrefEntryTypeEnum is an enumeration of PDF cross-reference table entry types.