Vanilla.PDF
2.0.0
Cross-platform toolkit for creating and modifying PDF documents
The library is written in standard C++ (currently C++14) and can be compiled with Visual Studio (2015 and 2017) as well as GCC (tested on Ubuntu 16.04).
The build uses CMake (https://cmake.org/), a cross-platform build tool. CMake also integrates a packaging system that provides one-click installable packages for each platform.
Currently supported package systems:
The library provides only an ANSI C API. A native C++ interface is not exposed because the C++ ABI is incompatible between compilers. Functions across the interface use the standard C cdecl calling convention, in which the caller cleans up the stack.
The library uses C++ exceptions internally. Each interface function is wrapped in a try-catch block to prevent any exception from escaping and potentially crashing the application.
This is an example of how interface functions usually look:
All exceptions thrown in this way are caught and their message is stored in a thread-local buffer. Each thread has its own buffer, and its size is pre-allocated so that a message can still be stored when memory is short.
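The pattern described above can be illustrated with a minimal sketch. Every name in it (sketch_parse_digit, sketch_last_error_message, the SKETCH_ error codes, the 256-byte capacity) is hypothetical and chosen only for illustration; the library's real functions, error codes and buffer size are not shown here.

```cpp
#include <cstddef>
#include <cstring>
#include <exception>
#include <stdexcept>

// Hypothetical error codes, not the library's real values.
enum sketch_error { SKETCH_OK = 0, SKETCH_INVALID_ARGUMENT = 1, SKETCH_GENERAL_FAILURE = 2 };

namespace {

// One fixed-size buffer per thread, so storing a message never allocates.
constexpr std::size_t kMessageCapacity = 256;  // assumed size
thread_local char g_last_message[kMessageCapacity] = "";

void store_message(const char* text) noexcept {
    std::strncpy(g_last_message, text, kMessageCapacity - 1);
    g_last_message[kMessageCapacity - 1] = '\0';  // always terminated, long text is truncated
}

// Internal C++ code is free to throw.
int parse_digit(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a decimal digit");
    return c - '0';
}

}  // namespace

// C interface function: no exception may cross this boundary.
extern "C" int sketch_parse_digit(char c, int* result) {
    try {
        if (result == nullptr) {
            store_message("result must not be null");
            return SKETCH_INVALID_ARGUMENT;
        }
        *result = parse_digit(c);
        return SKETCH_OK;
    } catch (const std::exception& e) {
        store_message(e.what());
        return SKETCH_GENERAL_FAILURE;
    } catch (...) {
        store_message("unknown error");
        return SKETCH_GENERAL_FAILURE;
    }
}

// Returns the message recorded by the last failing call on the calling thread.
extern "C" const char* sketch_last_error_message(void) {
    return g_last_message;
}
```

A caller checks the returned code first and reads the message only after a failure. Because the buffer is thread-local, a failure on one thread does not overwrite the message seen by another.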
The following code snippet declares the structures that carry error information:
All handles are opaque pointers to internal structures. The library uses a reference counting mechanism known as intrusive pointers. Usually, the structure and its reference counter are two separate objects. With intrusive pointers, the reference counter is embedded inside the structure body.
Let's compare intrusive pointers with traditional C++ shared pointers.
Transferring an object handle across the library boundary is clearer, because the raw pointer alone is enough to reach the reference counter.
Intrusive pointers guarantee that an object never has more than one reference counter.
Intrusive pointers should perform better than traditional C++ shared pointers in some cases. One reason is that the reference counter lives inside the object, so reaching it requires a single pointer dereference, whereas a shared pointer keeps the counter in a separate control block behind a second pointer. The other reason is that the object and its counter always share a single allocation, while with shared pointers they often do not.
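The comparison above can be made concrete with a minimal sketch of an intrusive pointer. The names RefCounted, IntrusivePtr and Document are hypothetical and do not correspond to the library's internal classes; the sketch only shows the general technique of embedding the counter in the object.

```cpp
#include <atomic>
#include <cstddef>
#include <utility>

// Hypothetical base class: the counter is embedded in the object itself.
class RefCounted {
public:
    RefCounted() noexcept : count_(0) {}
    RefCounted(const RefCounted&) = delete;
    RefCounted& operator=(const RefCounted&) = delete;

    void add_ref() const noexcept { count_.fetch_add(1, std::memory_order_relaxed); }
    void release() const noexcept {
        // The thread that drops the last reference destroys the object.
        if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1) delete this;
    }
    std::size_t use_count() const noexcept { return count_.load(std::memory_order_relaxed); }

protected:
    virtual ~RefCounted() = default;

private:
    mutable std::atomic<std::size_t> count_;
};

// Hypothetical smart pointer that manipulates the embedded counter.
template <typename T>
class IntrusivePtr {
public:
    IntrusivePtr() noexcept : ptr_(nullptr) {}
    explicit IntrusivePtr(T* raw) noexcept : ptr_(raw) { if (ptr_) ptr_->add_ref(); }
    IntrusivePtr(const IntrusivePtr& other) noexcept : ptr_(other.ptr_) { if (ptr_) ptr_->add_ref(); }
    IntrusivePtr(IntrusivePtr&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    IntrusivePtr& operator=(IntrusivePtr other) noexcept { std::swap(ptr_, other.ptr_); return *this; }
    ~IntrusivePtr() { if (ptr_) ptr_->release(); }

    T* get() const noexcept { return ptr_; }
    T* operator->() const noexcept { return ptr_; }

    // Gives up ownership without touching the counter, e.g. to return an opaque handle.
    T* detach() noexcept { T* raw = ptr_; ptr_ = nullptr; return raw; }

private:
    T* ptr_;
};

// Example payload type.
struct Document : RefCounted {
    int page_count = 0;
};
```

Wrapping the same raw pointer a second time finds the same embedded counter, which is why a handle can safely leave the library as a plain pointer and be wrapped again later. Doing the same with two independent std::shared_ptr instances would create two control blocks and a double delete.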
The file layer allows access to file contents at the syntactic level. It includes the few semantic features that are required to parse the syntax.
For example, an IndirectReferenceObjectHandle often has to be resolved in order to read an object. A StreamObjectHandle often has its Length stored as an indirect object, so the Length has to be resolved before the stream can be validated and parsed successfully.
The library uses C++ I/O streams for reading source files and writing output files. Interfaces that represent these streams already exist and are used throughout the library interface.
Tokens are the smallest syntactic elements and are separated by whitespace or a delimiter. Which characters count as whitespace and which as delimiters is defined in section 7.2 (Lexical Conventions) of the PDF specification.
The tokenizer uses look-ahead to determine the proper token type, since some tokens are ambiguous from their first character. For example, a hexadecimal string is enclosed in single angle brackets "<" and ">", while a dictionary is enclosed in double angle brackets "<<" and ">>".
Sample parsing loop for hexadecimal string:
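A minimal sketch of such a loop, assuming a hypothetical read_angle_token function that works on an in-memory string rather than on the library's own stream interfaces. It uses one character of look-ahead to tell "<<" from "<", skips whitespace inside the string, and pads an odd number of digits with a trailing zero as the PDF specification requires.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

enum class TokenType { DictionaryBegin, HexString };

struct Token {
    TokenType type;
    std::string value;  // decoded bytes for HexString, empty otherwise
};

namespace {

// PDF whitespace characters: NUL, TAB, LF, FF, CR and SPACE.
bool is_pdf_whitespace(char c) {
    return c == '\0' || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    throw std::runtime_error("invalid character in hexadecimal string");
}

}  // namespace

// Reads one token starting at input[pos], which must be '<', and advances pos past it.
Token read_angle_token(const std::string& input, std::size_t& pos) {
    if (pos >= input.size() || input[pos] != '<') throw std::runtime_error("expected '<'");
    ++pos;

    // Look-ahead: a second '<' starts a dictionary, anything else a hexadecimal string.
    if (pos < input.size() && input[pos] == '<') {
        ++pos;
        return Token{TokenType::DictionaryBegin, ""};
    }

    std::string bytes;
    int high = -1;  // pending high nibble, -1 when there is none
    for (;;) {
        if (pos >= input.size()) throw std::runtime_error("unterminated hexadecimal string");
        const char c = input[pos++];
        if (c == '>') break;
        if (is_pdf_whitespace(c)) continue;
        const int nibble = hex_value(c);
        if (high < 0) {
            high = nibble;
        } else {
            bytes.push_back(static_cast<char>(high * 16 + nibble));
            high = -1;
        }
    }
    // An odd number of digits means the final digit is followed by an implied 0.
    if (high >= 0) bytes.push_back(static_cast<char>(high * 16));
    return Token{TokenType::HexString, bytes};
}
```

For instance, the input `<48 65 6C6c6F>` decodes to the five bytes of "Hello", and `<901FA>` decodes to the bytes 0x90, 0x1F and 0xA0.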
Tokens are passed to the parser, which is responsible for constructing objects. The parser uses look-ahead as well, since multiple tokens may form a single object.
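One case where several tokens form a single object is the indirect reference, written as two integers followed by the keyword R (for example "12 0 R"). The sketch below is a simplified illustration with hypothetical types (Tok, Object, parse_object), not the library's parser. It looks two tokens ahead to decide whether an integer stands alone or begins a reference.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

enum class TokKind { Integer, KeywordR, Other };

struct Tok {
    TokKind kind;
    long value;  // meaningful for Integer tokens only
};

enum class ObjectKind { IntegerObject, IndirectReference };

struct Object {
    ObjectKind kind;
    long number;      // integer value, or object number of a reference
    long generation;  // generation number of a reference, 0 otherwise
};

// Parses one object starting at tokens[pos] and advances pos past it.
// This sketch handles only objects that begin with an integer token.
Object parse_object(const std::vector<Tok>& tokens, std::size_t& pos) {
    if (pos >= tokens.size() || tokens[pos].kind != TokKind::Integer) {
        throw std::runtime_error("sketch handles integer-led objects only");
    }

    // Look two tokens ahead: "<int> <int> R" is a single indirect reference.
    if (pos + 2 < tokens.size() &&
        tokens[pos + 1].kind == TokKind::Integer &&
        tokens[pos + 2].kind == TokKind::KeywordR) {
        Object reference{ObjectKind::IndirectReference, tokens[pos].value, tokens[pos + 1].value};
        pos += 3;
        return reference;
    }

    // Otherwise the integer stands on its own.
    Object number{ObjectKind::IntegerObject, tokens[pos].value, 0};
    pos += 1;
    return number;
}
```

Given the token sequence for "12 0 R 7", the first call consumes three tokens and yields a reference to object 12 with generation 0, and the second call yields the plain integer 7.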
The library provides multiple interfaces that can be overridden by the calling application.
For instance, when signing a document, it is possible to use a classic PKCS#12 file (Personal Information Exchange, described in RFC 7292). Unfortunately, this does not work with smart cards, where the private key is not directly accessible. The user can override SigningKeyHandle and provide a signing implementation outside the library boundary.
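An overridable interface in a C API is commonly expressed as a set of callbacks plus a user data pointer. The sketch below illustrates that idea only. The sketch_signing_key structure, the sketch_sign function and the FakeCard type are all hypothetical, the actual SigningKeyHandle functions may look different, and the XOR checksum merely stands in for a real signature computed on a device.

```cpp
#include <cstddef>

extern "C" {

// Hypothetical callback signatures; each returns 0 on success.
typedef int (*sketch_sign_update_fn)(void* user_data, const unsigned char* data, std::size_t size);
typedef int (*sketch_sign_final_fn)(void* user_data, unsigned char* out, std::size_t capacity,
                                    std::size_t* written);

// Hypothetical callback table supplied by the application.
struct sketch_signing_key {
    void* user_data;
    sketch_sign_update_fn update;
    sketch_sign_final_fn finish;
};

}  // extern "C"

// Library side: feeds the bytes to whatever implementation the caller supplied,
// so the private key never has to be visible to the library.
int sketch_sign(const sketch_signing_key* key, const unsigned char* data, std::size_t size,
                unsigned char* out, std::size_t capacity, std::size_t* written) {
    if (key == nullptr || key->update == nullptr || key->finish == nullptr) return 1;
    if (out == nullptr || written == nullptr) return 1;
    if (key->update(key->user_data, data, size) != 0) return 1;
    return key->finish(key->user_data, out, capacity, written);
}

// Application side: a stand-in for a smart card. NOT real cryptography.
struct FakeCard {
    unsigned char checksum = 0;
};

extern "C" int card_update(void* user_data, const unsigned char* data, std::size_t size) {
    FakeCard* card = static_cast<FakeCard*>(user_data);
    for (std::size_t i = 0; i < size; ++i) card->checksum ^= data[i];
    return 0;
}

extern "C" int card_finish(void* user_data, unsigned char* out, std::size_t capacity,
                           std::size_t* written) {
    if (capacity < 1) return 1;
    out[0] = static_cast<FakeCard*>(user_data)->checksum;
    *written = 1;
    return 0;
}
```

The application fills the table with its own functions and passes it to the library. The design keeps key material on the application side, which is what makes hardware-backed keys usable.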
Other extensible interfaces:
The library also has the following dependencies that require runtime support:
Internal dependency without runtime support: