

HELIOS PDF HandShake UB User manual

6 PDF HandShake utility programs

6.1 pdfcat

The "pdfcat" program is a command line tool that allows you to explode or concatenate PDF files on the server. "pdfcat" is indispensable if you have a multi-page document that is to be used in an EPSF-only or in an OPI workflow. The OPI server, for instance, generates a layout file of the first document page only, so if you want a layout of document page 2, you must extract that page and create a new single-page PDF file. The illustration below shows the three program modes, which are mutually exclusive: the "concatenate" mode merges the selected PDF files into a new one, the "append" mode appends the selected files to an existing one, and the "explode" mode writes the selected pages of an existing document into new single-page files.

Please note:

Every PDF file contains a list of file infos such as creator, creation date, modification date, image profiles (optional). Furthermore, a PDF file may contain security settings (optional), a table of contents (TOC) and annotations such as text fields, buttons, etc.

The "pdfcat" program re-arranges PDF files and creates new ones. The file infos, profiles, security settings, TOC, and annotations are handled as follows:

pdfcat -o

If "pdfcat" creates a new multi-page document - as shown in the first row in the above illustration - the new file will have its own creator and creation date. It will not have any profiles, even if the input files were tagged, and it will not have any security settings. Tables of contents - if there were any in one of the input files - will not be copied to the new file. However, annotations from the original files will find their way into the newly created document.

pdfcat -a

If "pdfcat" appends pages and/or documents to an existing PDF file - as shown in the second row in the above illustration - the output file (here: helios.pdf (new)) will contain all the information that was already included in the original file (here: helios.pdf). File infos, profiles, security settings, TOC, and annotations remain unchanged. Information that had been included in the appended files (here: Doc2 and Doc3) will - except for the annotations - not make it into the output file; they will be ignored.

pdfcat -e

If "pdfcat" explodes a document into several single-page documents - as shown in the last row in the above illustration - the new documents inherit the file infos, profiles, and security settings from the original file. Tables of contents, however, will not be copied to the output files. Please note that "pdfcat" cannot write comments into the Finder Info. Usually, if a PDF file contains profile information, the profiles are listed in the Comments text field of the Macintosh Finder Info (Fig. 2 in 4 "Before getting started"). This Comments text field may be empty for PDF files you have created with "pdfcat explode", even though the new files contain profile information. In that case, you can use our Acrobat print plug-in or the ImageServer "Tagger" application to make the profiles visible.

For OPI users only

Automatic layout generation is not available for PDF files that have been created with "pdfcat". You need to use our "touch" application or the command line layout program on the server to generate layouts from the new PDF files. Alternatively, the procedure can be automated by means of ImageServer Script Server.

6.1.1 Options

-h

Opens the online help file.

-v

This option allows you to monitor the "pdfcat" conversion progress.

-o

Concatenates existing PDF documents (inDocs...) to a new PDF document (outDoc). With this parameter, you have to specify one output file name and one or more input file names. It is possible to copy only selected pages from the input file to the output file by specifying page ranges. Valid ranges are listed in 6.1.2 "Operands".

Important: If the name you choose for the output file already exists in the destination directory, the existing file will be replaced!

-a

Appends one or more PDF documents or parts of them (inDocs...) to another - already existing - PDF document (outDoc).

-e

Extracts from a given PDF document (inDoc) either all pages, or the pages you select explicitly, and creates new single-page PDF files. With this parameter, you have to specify a prefix for the output files. Page number <nnn> of the input file will then be copied to a new PDF document named "<prefix><nnn>.pdf".

Note: File name specifications may contain a complete UNIX path name that leads to another directory.

-p

Password to open PDF document. Required if <pdffile> is secured by an "Open Password".

6.1.2 Operands

<inDoc> is the path name of a PDF file, optionally followed by a comma-separated list of page ranges. Valid page ranges are:

<a>

Page <a> only

<a>-<b>

Page <a> to page <b>

<a>-

Page <a> to last page

-<b>

First page to page <b>

The character "$" stands for the last page of a document (the "$" must be escaped on a shell).

If <inDoc> is followed by a list of page ranges, only the specified pages of <inDoc> will be copied to the destination file. The default is to copy all pages from <inDoc>.

Examples:

Example 1:

pdfcat -o new.pdf doc1.pdf doc2.pdf,2,5-7

writes all pages of document "doc1.pdf" and the pages 2, 5, 6, 7 of document "doc2.pdf" to a new document called "new.pdf".

Example 2:

pdfcat -o new.pdf doc1.pdf,\$-1

writes all pages of document "doc1.pdf" in reverse order to a new document called "new.pdf".

Example 3:

pdfcat -a tmp.pdf doc1.pdf,9-6

appends the pages 9, 8, 7, 6 (in this order) of document "doc1.pdf" to an existing document called "tmp.pdf".

Example 4:

pdfcat -e page doc1.pdf,-3

writes the pages 1, 2, 3 of document "doc1.pdf" to new single-page documents called "page1.pdf", "page2.pdf", and "page3.pdf".
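
The page-range rules and the four examples above can be modelled in a few lines of Python. This is only a sketch and not part of PDF HandShake: the function names `expand_page_ranges` and `explode_names` are hypothetical, and the behavior for malformed or out-of-range input is an assumption, since the manual does not describe it.

```python
def _endpoint(text, default, page_count):
    """Turn one end of a range into a page number."""
    if text == "":
        return default      # open end: "-<b>" starts at page 1, "<a>-" ends at the last page
    if text == "$":
        return page_count   # "$" stands for the last page
    return int(text)


def expand_page_ranges(spec, page_count):
    """Expand a list such as "2,5-7" into the selected page numbers, in order."""
    pages = []
    for part in spec.split(","):
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first = _endpoint(first_text, 1, page_count)
            last = _endpoint(last_text, page_count, page_count)
        else:
            # A single page; int() raises ValueError for an empty entry (assumed behavior).
            first = last = page_count if part == "$" else int(part)
        # A range such as "9-6" counts downwards, as in Example 3.
        step = 1 if first <= last else -1
        pages.extend(range(first, last + step, step))
    return pages


def explode_names(prefix, pages):
    """Names of the single-page files written in explode mode: <prefix><nnn>.pdf."""
    return [f"{prefix}{number}.pdf" for number in pages]


print(expand_page_ranges("2,5-7", 10))                      # [2, 5, 6, 7]
print(explode_names("page", expand_page_ranges("-3", 5)))   # ['page1.pdf', 'page2.pdf', 'page3.pdf']
```

Note that the "$" character needs no escaping here because the string never passes through a shell.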

6.2 pdfform

The "pdfform" tool allows you to output form field values of a PDF document.

The "pdfform" program is used as follows:

pdfform <pdffile>

Example:

pdfform /laura/base11_e.pdf

Company=HELIOS Software GmbH
Date=03-11-04
Priority=Urgent

This PDF document contains three form fields of the type text. The Company field has the value "HELIOS Software GmbH", the Date field "03-11-04", and the Priority field has the value "Urgent".
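
For post-processing, the output shown above can be read into a dictionary. The following Python sketch assumes that "pdfform" prints exactly one name=value pair per line, as in the example; the function name `parse_form_fields` is hypothetical, and field values that span several lines are not handled.

```python
def parse_form_fields(output):
    """Map each form field name to its value, given the text printed by pdfform."""
    fields = {}
    for line in output.splitlines():
        if "=" not in line:
            continue  # skip blank or unexpected lines
        # Split at the first "=" only, so values may themselves contain "=".
        name, value = line.split("=", 1)
        fields[name] = value
    return fields


sample = "Company=HELIOS Software GmbH\nDate=03-11-04\nPriority=Urgent\n"
print(parse_form_fields(sample)["Company"])   # HELIOS Software GmbH
```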

6.2.1 Options

-p

Password to open PDF document. Required if <pdffile> is secured by an "Open Password".

6.3 pdfinfo

"pdfinfo" has two modes of operation: either print information about the PDF document <pdffile> or extract objects from it.

In information mode every output line is of the form:

section: key1=value1, key2=value2, ..., flag1, flag2, ...

Possible sections are General, Security, Profile, Plate, Color, Font, MediaBox, CropBox, BleedBox, TrimBox, ArtBox, Rotate, Transparency, Image and Form. The output can be restricted with the -o option.

In object extraction mode arbitrary objects with object number can be extracted from the PDF document <pdffile> to standard output.

Note: The object extraction mode is useful only for experts with knowledge of the internals of PDF documents.

The "pdfinfo" program is used as follows:

pdfinfo [options] <pdffile>

6.3.1 Options

-p

Password to open PDF document. Required if <pdffile> is secured by an "Open Password".

Information mode:

-f <fromPage>

First page number for font and color information.

-t <toPage>

Last page number for font and color information.

-o <sections>

Print only information for the specified sections. The default value is All, which prints all available information.

-m

Use a different output format for values and flags, which is suitable for post-processing.

Object extraction mode:

-x <objNo>

Extract object with number <objNo>.

-s <objNo>

Extract a stream without decompressing it from object with number <objNo>.

-d <objNo>

Extract and decompress a stream from object with number <objNo>.

Examples:

Example 1:

The following commands are equivalent:

pdfinfo Doc.pdf
pdfinfo -o All Doc.pdf

Example 2:

Getting information about all images in "Doc.pdf":

pdfinfo -o Image Doc.pdf
# pdfinfo 3.0.0
Image: Page=18, BBox=207.6720/371.4596/ 264.3224/443.6887, Resolution=278.3387/593.1125, BPP=8, ColorSpace=DeviceGray, Filters=FlateDecode
Image: Page=18, BBox=225.1510/387.2033/ 237.4971/397.1024, Resolution=194.9898/194.9960, BPP=8, ColorSpace=DeviceRGB, Filters=FlateDecode

Example 3:

Getting information about transparencies in "Doc_1.pdf":

pdfinfo -o Transparency Doc_1.pdf
# pdfinfo 3.0.0
Transparency: Page=1, Transparencies=no
Transparency: Page=2, Transparencies=yes
Transparency: Page=3, Transparencies=no
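
The line format of the information mode (section: key1=value1, key2=value2, ..., flag1, flag2, ...) is easy to split apart in a script. The Python sketch below covers the default output format only, not the format produced by -m. It assumes that values never contain a comma and that lines starting with "#" are version comments, as in the examples above; the function names are hypothetical.

```python
def parse_info_line(line):
    """Split one output line into (section, values, flags)."""
    section, _, rest = line.partition(":")
    values, flags = {}, []
    for item in rest.split(","):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            key, value = item.split("=", 1)
            values[key] = value
        else:
            flags.append(item)   # an entry without "=" is a flag
    return section.strip(), values, flags


def parse_info_output(text):
    """Parse all information lines, skipping blank lines and "#" comment lines."""
    return [
        parse_info_line(line)
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


output = "# pdfinfo 3.0.0\nTransparency: Page=1, Transparencies=no\nTransparency: Page=2, Transparencies=yes\n"
pages = [values["Page"] for _, values, _ in parse_info_output(output)
         if values.get("Transparencies") == "yes"]
print(pages)   # ['2']
```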

6.4 pdfnote

The "pdfnote" tool allows adding a text annotation with any content anywhere within the PDF document <pdffile>. You can preselect the title of the text annotation and whether it is already open when opening the respective page of the PDF document.

The "pdfnote" program is used as follows:

pdfnote [options] <pdffile>

6.4.1 Options

-n

Number of the page where to add the text annotation. The default is -n 1 (first page).

-r

The location of the text annotation on the page. <location> is of the form: <llx>:<lly>:<width>:<height> specifying the lower left x coordinate, lower left y coordinate, width, and height of the text annotation rectangle in points.

-t

Title string. <title> will appear in the annotation's title bar.

-s

Specifies that the annotation should initially be displayed open.

-c

Contents of the text annotation. If -c is not given, the contents are read from "stdin".

-h

Help

Example:

pdfnote -r 10:400 -t "Joe" -c "OK" Doc1.pdf

pdfnote -r 10:10:200:200 -t "Info" Doc2.pdf < info.txt

6.5 pdftoeps

EPSF files can be generated by different applications, e.g. FreeHand or Photoshop, and they can be different in structure. One purpose of "pdftoeps" is to allow you to stick to PDF as a file exchange format and to use the program to generate EPSF files, that are all homogeneous in structure, from the incoming PDF files. This tool is available for ImageServer users only.

If you have an EPSF-only workflow, but receive PDF files from your customers, you may use the "pdftoeps" tool to transform PDF files into EPSF files. It allows you to specify the color space, the resolution, and the type (Mac-EPSF, PC-EPSF) of the EPSF output files. An example of the "pdftoeps" tool is given in Fig. 10. The illustration shows a situation where "pdfcat" and "pdftoeps" are used to convert a multi-page PDF document into several single-page EPSF files.

Fig. 10: Creating EPSF files using the "pdftoeps" tool

Font quality

We include in our software package 131 original PostScript 3 fonts. They are available automatically after installation and appropriate activation of PDF HandShake on the server. They guarantee high-quality printing, and high-quality font representation on screen and in layout files (for ImageServer). Our fonts are listed in A 5 "The fonts we deliver".

6.5.1 Options

-v

Displays activity reports during PDF to EPSF transformation.

-m

Produces Macintosh EPSF files (see A 6 "Glossary" and the option -p below).

-p

Generates cross-platform EPSF files (opposite of option -m above). If neither of the two options is specified, the default depends on the location of the selected PDF files and on whether this location is an EtherShare volume or not. If the volume settings in the respective volume are set to cross-platform EPSF, the resulting files will be cross-platform.

-r

Sets resolution in dpi used for the printable preview of EPSF files. This parameter requires a floating point value as e.g. 72.0. If you do not specify -r the resolutions of the elements of the PDF input file will be used.

-c

Defines color space used for the printable part of the EPSF file. The parameter requires a string value as e.g. "CMYK". For valid strings, see Table 11 below. If you do not specify -c the default from the OPI server will be used.

-R

Sets resolution in dpi used for the screen preview of EPSF files. This parameter requires a floating point value as e.g. 72.0. If you do not specify -R the default from the OPI server will be used.

-P

Password to open PDF document. Required if <pdffile> is secured by an "Open Password".

Note: Set the resolution for the composite preview to a reasonable value. Increasing the resolution to an exaggerated extent might lead to VM and RAM overflow!

-C

Defines color space used for the screen preview of the EPSF file. The parameter requires a string value as e.g. "Grayscale". For valid strings, see Table 11 below. If you do not specify -C the default (RGB) will be used.

-h

Displays the help text for the "pdftoeps" program.

Fig. 11: List of EPSF color spaces

Name of color space
None	HSV	YCbCr
Spot	HLS	CIELab
Bilevel	CMY	CIEXYZ
Grayscale	CMYK	CIELuv
Indexed	Multi	CIEYxy
RGB	Duotone	YCC

All options of the "pdftoeps" tool are optional. However, you should specify parameters like -m or -p explicitly whenever you are not sure about the defaults that are currently valid.

Note that for all "pdftoeps" options, the parameters and values must be separated by a blank (see example below).

After the options, you have to specify one or more files to convert and a destination file (if converting a single file) or a destination directory (if converting several files). The destination can contain a complete UNIX path name.

Example:

pdftoeps -m -c CMYK -C RGB file1 file2 /user/tmp

Pre-separated PDF documents

The "pdftoeps" program recognizes pre-separated PDF documents. It generates DCS files with default (depending on server settings) composite previews which are raster based with a maximum resolution of 150 dpi.

The DCS files are DCS-1 or DCS-2 style multifile images. The plate file suffixes for CMYK will be .C, .M, .Y, and .K. Spot color plate files will be assigned other suffixes, namely letters so far unused in alphabetical order. The suffix does not have any relation to the name of the spot color.
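
The suffix rule for spot color plates can be read as follows: the letters C, M, Y, and K are reserved for the process colors, and each spot color receives the next unused letter of the alphabet. The Python sketch below illustrates this reading only. It is an assumption about how "pdftoeps" assigns the letters, and the function name and the spot color names are made up for illustration.

```python
import string

RESERVED = {"C", "M", "Y", "K"}   # suffix letters of the CMYK plates


def spot_plate_suffixes(spot_colors):
    """Assign each spot color the next free letter, in alphabetical order."""
    free_letters = (letter for letter in string.ascii_uppercase
                    if letter not in RESERVED)
    # The suffix depends only on the position of the spot color, not on its name.
    return {name: "." + next(free_letters) for name in spot_colors}


print(spot_plate_suffixes(["Gold", "Varnish", "Orange"]))
# {'Gold': '.A', 'Varnish': '.B', 'Orange': '.D'}
```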

6.6 pdfprint

"pdfprint" allows the printing of PDF files directly from the server to a PDF HandShake printer queue (Fig. 12). Features like color matching and proof printing are available for each queue. Pre-separated files cannot be printed composite unless ImageServer is also used. For options and usage information, see 8.1 "pdfprint".

Fig. 12: Printing PDF files directly from the server

PDF print plug-in for Acrobat

The print plug-in for Acrobat, or Adobe Reader, (see 8.2 "Printing PDF files using the Acrobat plug-in") is currently available for Macintosh computers only. It is the equivalent of the "pdfprint" program and also prints to PDF HandShake printer queues (Fig. 13). Again, without ImageServer, pre-separated files can only be printed as separations.

Fig. 13: Printing PDF files with the Acrobat plug-in

PDF extension for ImageServer

With PDF HandShake and ImageServer, you can use the PDF file format as an input format for the ImageServer layout generation. Moreover, you can use the "PDF HandShake" Acrobat plug-in (or the "pdfprint" command line tool) to export PDF files for further use in an imposition program. Fig. 14 shows these two features at a glance.

Fig. 14: The options for PDF files on an ImageServer

The OPI server uses the first page of a PDF file for layout generation. Further pages are ignored. The file format of the layout representation is EPSF for composite PDF files and DCS for pre-separated PDF files. This allows you to place the layouts into any popular layout application, e.g. QuarkXPress, InDesign, etc., and into any editorial or page composing system.

The "pdfprint" program is used as follows:

pdfprint [options] <PdfFilename> [<psFilename>]

6.6.1 Options

The options for the "pdfprint" program as well as examples on the use are described in detail in 8.1.2 "Options".

6.7 pdfresolve

The "pdfresolve" program allows replacing OPI references in PDF documents. Layout applications like QuarkXPress or InDesign can export their native documents as PDF documents. If the native documents contain layout images with OPI references, the OPI references can be preserved during the export process. These OPI references will be used to replace gray-area placeholders or layout images with the corresponding high-resolution original images prior to or during the printing process.

OPI 1.3 references contain the path name of the placed image file plus an optional Macintosh file ID. During OPI reference replacement, "pdfresolve" searches for a matching high-res or low-res image to replace the form. An image search consists of a sequence of search methods: search by path name, search in additional search paths, search by Macintosh file ID, and search in additional search volumes. The image search is complete as soon as one search method succeeds. If a low-res image is searched for, the image found is used for replacement, regardless of whether it is low-res or high-res. If a high-res image is searched for and a high-res image is found, it is used for replacement. If a high-res image is searched for and a low-res image is found, the OPI reference of the low-res image is used for another high-res image search. A chain of low-res images leading to a high-res image must not exceed a length of seven.
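
The search logic described above can be summarized in a short Python sketch. It is a model of the description, not the actual "pdfresolve" implementation: the `Image` class, the function names, and the representation of search methods as callables are hypothetical. What happens when the chain is too long or a low-res image carries no further reference is not stated in this manual, so raising an error and returning None are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CHAIN_LENGTH = 7   # longest allowed chain of low-res images


@dataclass
class Image:
    path: str
    low_res: bool
    opi_reference: Optional[str] = None   # reference carried by a low-res image


def find_image(reference, search_methods):
    """Try the search methods in order; the first one that succeeds ends the search."""
    for method in search_methods:
        image = method(reference)
        if image is not None:
            return image
    return None


def resolve_reference(reference, search_methods, want_high_res):
    """Return the image to insert for an OPI reference, or None if nothing is found."""
    low_res_seen = 0
    while True:
        image = find_image(reference, search_methods)
        if image is None:
            return None
        # A low-res search accepts any hit; a high-res search accepts high-res hits.
        if not want_high_res or not image.low_res:
            return image
        low_res_seen += 1
        if low_res_seen > MAX_CHAIN_LENGTH:
            raise ValueError("chain of low-res images is longer than seven")
        if image.opi_reference is None:
            return None   # assumption: a dead end counts as "not found"
        # Follow the low-res image's own OPI reference with another search.
        reference = image.opi_reference


high = Image("/vol/images/cover.tif", low_res=False)
low = Image("/vol/layouts/cover.tif", low_res=True, opi_reference=high.path)
by_path = {high.path: high, low.path: low}.get   # stands in for "search by path name"
print(resolve_reference(low.path, [by_path], want_high_res=True).path)   # /vol/images/cover.tif
```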

Another advantage of "pdfresolve" is that transparencies that have been applied to the layout images in the layout application are preserved during the OPI image replacement process. This tool is available for ImageServer users only.

Note: OPI references in PDF documents are always embedded in form streams with OPI entries, which are called OPI forms for short. "pdfresolve" replaces OPI 1.3 forms in composite PDF documents; pre-separated PDF documents are not supported. Referenced files must be raster images; references to PDF and object-based EPSF files are not supported. During the replacement process all input files and the output file are locked for safe operation in multi-user and network file system environments.

A description of how to set up "hot folder" mechanisms for the PDF native OPI workflow can be found in 12 "PDF-native OPI workflow".

The "pdfresolve" program is used as follows:

pdfresolve [-lv] -P <printer> [-g <logFile>] [-o <key>=<value>]... <inDoc> <outDoc>

6.7.1 Options

-l

High-res images are inserted by default. When this option is set, low-res images are inserted if available.

-v

Verbose mode

-P <printer>

Replace OPI objects using preference settings of printer queue <printer>. OPI must be active on this printer queue. This option is mandatory.

-g <logFile>

If there are warnings or errors, then generate a log file <logFile> with their description.

-o <key>=<value>

Set parameter with key <key> to value <value>.

Note: Command line options have a higher priority than file specific preference settings, and file specific preference settings have a higher priority than global preference settings which include printer queue preference settings. Command line option keys are case-insensitive, but preference keys are case-sensitive.
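
The priority rule in the note above amounts to a layered lookup. The Python sketch below models it with `collections.ChainMap`; the function name and the sample values are hypothetical, and mapping a case-insensitive command line key onto the exact spelling of a known preference key is an assumption about how the two rules interact.

```python
from collections import ChainMap


def effective_settings(command_line, file_prefs, global_prefs, known_keys):
    """Combine the three layers; earlier layers override later ones."""
    # Command line keys are case-insensitive, so map them to the exact spelling.
    canonical = {key.lower(): key for key in known_keys}
    from_command_line = {canonical[key.lower()]: value
                         for key, value in command_line.items()}
    # Preference keys are case-sensitive and are used as they are.
    return ChainMap(from_command_line, file_prefs, global_prefs)


settings = effective_settings(
    command_line={"resolution": "300"},
    file_prefs={"Resolution": "150", "DownSampling": "TRUE"},
    global_prefs={"DownSampling": "FALSE", "CheckImages": "TRUE"},
    known_keys=["Resolution", "DownSampling", "CheckImages"],
)
print(settings["Resolution"], settings["DownSampling"], settings["CheckImages"])
# 300 TRUE TRUE
```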

6.7.2 Valid parameter keys

ImageSearchPaths

The list of additional search paths. The search in additional search paths locates files with matching basenames in these search paths. Its type is strlist.

ImageIDsearch

Determines whether images are searched via Macintosh file ID or not. The search via file ID first extracts the volume specification from the file ID and then searches a file with matching file ID in the desktop database of the specified volume. Its type is bool.

ImageSearchVolumes

The list of additional search volumes. The search in additional search volumes uses their desktop databases to locate files with matching basenames, but excludes files in the network trash folder, files with layout suffix and in layouts folders. This search succeeds if and only if there is exactly one matching file. Its type is strlist.

LayoutSuffix

Files with this name suffix are treated as layouts, i.e. these files are excluded from the image search in additional search volumes. Its type is str.

CheckImages

If this option is TRUE, errors are generated for missing referenced images. If this option is FALSE, warnings are generated for missing referenced images and the OPI form is left unchanged. Its type is bool.

ProfileRepository

Name of the ICC profile repository volume. If an ICC profile is searched, it is searched here first, then it is searched in the directories listed in parameter ProfileSearchPaths. Its type is str.

ProfileSearchPaths

The list of additional directories where ICC profiles are searched. Its type is strlist.

CompositeColorspace

Determines output color space of source images with color spaces other than bilevel and grayscale. Its value must be one of None, Grayscale, RGB, CMYK, CIELab.

DefaultPrinterProfile

Determines the output profile for all source images with input color space other than bilevel and grayscale. The output profile must match the value of the parameter CompositeColorSpace. Its type is str.

PrintRenderingIntents

Determines rendering intent used for conversion between any source and printer color space. Its syntax is

<sourceA>:<printerA>:<intentA>,<sourceB>:<printerB>:<intentB>,...

DefaultProofProfile

Determines the proof profile for all source images with input color space other than bilevel and grayscale. Its type is str.

ProofRenderingIntents

Determines rendering intent used for conversion between any printer and proof color space. Its syntax is

<printerA>:<proofA>:<intentA>,<printerB>:<proofB>:<intentB>,...

PreferredCMM

Determines preferred color matching module. Its type is str.

RenderingQuality

Determines the CMM rendering quality. Its value must be one of 0 (normal), 1 (draft), 2 (best).

CheckICCProfiles

Determines whether to generate errors for missing ICC profiles or not. Its type is bool.

IgnoreUntagged

If DefaultPrinterProfile is not set or CheckICCProfiles is not active, this option has no effect. Otherwise this option determines whether errors are generated for color source images which are not tagged with an ICC profile and which are not excluded from color matching via PreserveDeviceN. Its type is bool.

TagReplacedImages

Determines whether to tag replaced images with printer or proof profile where applicable or not. When this option is active and a printer profile is set, a replaced color image is tagged with the printer or proof profile. When this option is active and no printer profile is set, a replaced color image is tagged with the profile of its source image if and only if no color conversion is necessary. Its type is bool.

IgnoreSpots

Determines whether to ignore all spot colors of source images or not. If spot colors are not ignored, they are converted to process colors unless the source image has color space CMYK with additional spot color channels and PreserveDeviceN is active. Its type is bool.

CustomColorTinting

Determines whether to generate spot colors for colorized bilevel and grayscale images or not if the colorization does not require the DeviceN color space. To preserve spot colors when the DeviceN color space is required, both the CustomColorTinting and PreserveDeviceN options must be active. Its type is bool.

PreserveDeviceN

Determines whether DeviceN images are generated for source images with color space CMYK and additional spot color channels or not. If DeviceN images are generated, color matching is disabled. Its type is bool.

ColorAliases

The list of color name aliases. Color name substitution is applied to process and spot color names from source images and OPI comments. Its syntax is: <nameA>=<aliasA>,<nameB>=<aliasB>,...

DownSampling

Determines whether downsampling is active or not. Its type is bool.

FixedSampling

If downsampling is not active, this option has no effect. If downsampling is active, this option determines whether upsampling is active or not. Its type is bool.

FastDownSampling

Determines whether the nearest neighbor algorithm or the mean value algorithm is used for downsampling. The nearest neighbor algorithm is fast but has low quality; the mean value algorithm is slow but has high quality. The type of this option is bool.

Resolution

Determines the resolution for downsampling in dpi. Its type is positive double.

ICMethodBilevel, ICMethodGrayscale, ICMethodRGB, ICMethodCMYK, ICMethodCIELab, ICMethodOther

These parameters determine the compression method for images of the corresponding output color space. Their value must be one of None, Compress, CCITTG4, JPEG, JPEG 2000, Flate.

ICQualityBilevel, ICQualityGrayscale, ICQualityRGB, ICQualityCMYK, ICQualityCIELab, ICQualityOther

These parameters determine the quality of JPEG and JPEG 2000 compression for images of the corresponding output color space. Their type is double between 0 and 100.

Valid color space names are: None, Spot, Bilevel, Grayscale, Indexed, RGB, HSV, HLS, CMY, CMYK, Multi, Duotone, YCbCr, CIELab, CIEXYZ, CIELuv, CIEYxy, YCC.

Valid rendering intent numbers are: 0 for perceptual, 1 for relative colorimetric, 2 for saturation, 3 for absolute colorimetric.
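
A value for PrintRenderingIntents or ProofRenderingIntents can be assembled from the intent numbers listed above. The Python sketch below assumes that the first two fields of each entry are color space names from the list of valid color space names and that the third field is the intent number; the manual does not state this explicitly, and the function name is hypothetical.

```python
INTENT_NUMBERS = {
    "perceptual": 0,
    "relative colorimetric": 1,
    "saturation": 2,
    "absolute colorimetric": 3,
}


def rendering_intents_value(rules):
    """Build "<a>:<b>:<intent>,..." from (first, second, intent name) triples."""
    return ",".join(
        f"{first}:{second}:{INTENT_NUMBERS[intent]}"
        for first, second, intent in rules
    )


print(rendering_intents_value([
    ("RGB", "CMYK", "perceptual"),
    ("CIELab", "CMYK", "relative colorimetric"),
]))
# RGB:CMYK:0,CIELab:CMYK:1
```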

6.8 PDF preflighting with pdfInspektor

PDF HandShake includes "pdfInspektor", a powerful solution for analyzing and preflighting PDF files to ensure that incoming and outgoing PDF files are production compatible. About 400 PDF characteristics are checked, and the results can be displayed in different file formats. Spots which could cause problems are visualized in isolated areas or put out as an ASCII or XML report for automated evaluation. If required, the reports can also be saved as a PDF file.

The "pdfInspektor2_CLI" program is used as follows:

cd /usr/local/helios

callas/pdfInspektor2_CLI.sh [OPTION [...]] PDF-FILE [...]

Check passed PDF-FILE(s) against a set of rules. The rules come from the file named in the --profile-file option.

6.8.1 Options

-f

File with a single profile package (created with the Adobe Acrobat Preflight or pdfInspektor plug-in; the path to the package can be defined in the initialization file).

Alternative use: --profile-file=FILE

-b

Create PDF breakout in FILE (containing all PDF content, which failed the check; defaults to inifile setting, if FILE is omitted).

Alternative use: --breakout[=FILE]

-r

Create report with check results (default format is plain text; has default from inifile, if FILE is omitted).

Alternative use: --report[=FILE]

-x

Switch report format to XML (also UTF-8 encoded).

Alternative use: --xml

-o

Compact report plain text (cannot be combined with xml or short report).

Alternative use: --compact

-a

Create report even if there are no rule violations.

Alternative use: --always

-s

Short report or breakout (both plain text and XML).

Alternative use: --short

-k

Encoding for compact report format (-o).

utf8: 1,

utf16BE: 2,

utf16LE: 3

Alternative use: --encoding

-e

Set line endings for text report.

0 for \n (default),

1 for \r\n

Alternative use: --line-endings[=NUM]

-p

Only process page numbers in the range FROM to TO. The default is the range from page 1 to the last page.

Alternative use: --pages=[FROM][:TO]

-1

pdfX1a conversion. FILE should be path/file name of the PDFXSetFile.

Alternative use: --pdfx1a[=FILE]

-3

pdfX3 conversion. FILE should be path/file name of the PDFXSetFile.

Alternative use: --pdfx3[=FILE]

-d

Set the output file for pdfx conversion.

Alternative use: --pdfx-file[=FILE]

-l

Set report language to LANG:

ENG, GER, FRE, SVE, PTB, NLD, JAP, ITA, ESP, DAN (the default language is English)

Alternative use: --lang=LANG

-v

Set verbosity level NUM (1-15, the default is 1); bitwise or of:

1:display major process steps

2:immediately show rule constraint violations

4:progress indicator

8:display internal parameters for debugging

Alternative use: --verbose[=NUM]
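
Because the verbosity level is a bitwise OR of the four values above, the number for a given combination can be computed rather than looked up. The Python sketch below shows the arithmetic; the class, member, and function names are hypothetical, and only the --verbose=NUM form comes from this manual.

```python
from enum import IntFlag


class Verbosity(IntFlag):
    MAJOR_STEPS = 1        # display major process steps
    RULE_VIOLATIONS = 2    # immediately show rule constraint violations
    PROGRESS = 4           # progress indicator
    DEBUG_PARAMETERS = 8   # display internal parameters for debugging


def verbose_option(*flags):
    """Return the --verbose=NUM argument for the given combination of flags."""
    if not flags:
        raise ValueError("at least one verbosity flag is required")
    level = Verbosity(0)
    for flag in flags:
        level |= flag      # bitwise OR, as described for NUM
    return f"--verbose={int(level)}"


print(verbose_option(Verbosity.MAJOR_STEPS, Verbosity.PROGRESS))   # --verbose=5
```

Combining all four flags gives 15, the upper end of the documented range.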

-q

Set quiet mode (no output at all).

Alternative use: --quiet

-h

Display this text.

Alternative use: --help

6.8.2 Included sample scripts

We provide the following sample scripts for the HELIOS PDF preflighting functionality:

inspectPDF-notify

PDF preflight action script to be applied via a Create PDF Server printer queue. See 11 "Create PDF Server".

inspectPDF-scriptserver

PDF preflight action script to be applied via a Script Server "hot folder". See the chapter "Script Server" in the ImageServer manual.

