Optimize

Optimize enables customized compression and optimization of PDF files. You can configure the compression type, set the resolution, choose whether to clip the invisible parts of the images in your PDF, and much more.

Feature Parameter Response Action Description Links
optimize Optimize OptimizeRes OptimizeAction Enables customized compression and optimization of PDF files. Swagger, Sample
optimizeByProfile profile, file file stream - Optimizes a PDF based on a predefined profile. Swagger, Sample

Samples

Optimize

  • curl
  • C#
  • Java
  • JavaScript
  • PHP
  • Python
  • Ruby
curl: no sample available.

C#

// create optimize object
Optimize optimize = new Optimize()
{
    // document
    Document = new Document()
    {
        DocData = File.ReadAllBytes("myPdf.pdf"),
        Name = "myPdf.pdf",
    },
    // action
    OptimizeAction = new OptimizeAction()
    {
        UseProfile = true,
        Profile = OptimizeActionProfile.Max
    },
};

// optimize
var res = await Pdf4meClient.Pdf4me.Instance.OptimizeClient.OptimizeAsync(optimize);

// extract the optimized PDF and write it to disk
byte[] optimizedPdf = res.Document.DocData;
File.WriteAllBytes("optimizedPdf.pdf", optimizedPdf);

Java

// set up the optimizeClient
OptimizeClient optimizeClient = new OptimizeClient(pdf4meClient);

// create optimize object
Optimize optimize = new Optimize();
// document
Document document = new Document();
document.setDocData(Files.readAllBytes(Paths.get("myPdf.pdf")));
optimize.setDocument(document);
// action
OptimizeAction optimizeAction = new OptimizeAction();
optimizeAction.useProfile(true);
optimizeAction.profile(ProfileEnum.MAX);
optimize.setOptimizeAction(optimizeAction);

// optimization
OptimizeRes res = optimizeClient.optimize(optimize);

// extracting the generated PDF and writing it to disk
byte[] optimizedPdf = res.getDocument().getDocData();
FileUtils.writeByteArrayToFile(new File("optimizedPdf.pdf"), optimizedPdf);

JavaScript

const fs = require('fs')
const path = require('path')

// set up the pdf4meClient
const pdf4meClient = pdf4me.createClient('YOUR API KEY')

// create optimize object
const optimizeReq = {
  // document
  document: {
    docData: fs.readFileSync(path.join(__dirname, 'myPdf.pdf')).toString('base64'),
  },
  // action
  optimizeAction: {
    useProfile: true,
    profile: 'web',
  },
}

// optimize
pdf4meClient.optimize(optimizeReq)
  .then(function(optimizeRes) {
    // extracting the generated PDF and writing it to disk
    const pdfDocument = Buffer.from(optimizeRes.document.docData, 'base64')
    fs.writeFileSync(path.join(__dirname, 'optimizedPdf.pdf'), pdfDocument)
  })
  .catch(error => {
    console.log(error)
    process.exit(1)
  })

PHP

$createOptimize = $client->pdf4me()->optimize([
    'document' => [
        'docData' => $client->getFileData('myPdf.pdf')
    ],
    'optimizeAction' => [
        'profile' => 'max',
        'useProfile' => true
    ]
]);

// extracting the optimized PDF
$optimizedPdf = base64_decode($createOptimize->document->docData);
// and writing it to file
file_put_contents('optimizedPdf.pdf', $optimizedPdf);

Python

import base64

# set up the optimize_client
optimize_client = OptimizeClient(pdf4me_client)

# create the optimize object
optimize = Optimize(
    document=Document(
        doc_data=FileReader().get_file_data('myPdf.pdf')
    ),
    optimize_action=OptimizeAction(
        use_profile=True,
        profile='max'
    )
)

# optimization
res = optimize_client.optimize(optimize=optimize)

# extracting the generated PDF
optimized_pdf = base64.b64decode(res['document']['doc_data'])
# writing it to disk
with open('optimizedPdf.pdf', 'wb') as f:
    f.write(optimized_pdf)

Ruby

file = '/myPdf.pdf'

a = Pdf4me::Optimize.new(
  document: Pdf4me::Document.new(
    doc_data: Base64.encode64(File.open(file, 'rb', &:read))
  ),
  optimize_action: Pdf4me::OptimizeAction.new(
    use_profile: true,
    profile: 'max'
  )
)

response = a.run

File.open('optimizedPdf.pdf', 'wb') do |f|
  f.write(Base64.decode64(response.document.doc_data))
end

OptimizeByProfile

  • curl
  • C#
  • Java
  • JavaScript
  • PHP
  • Python
  • Ruby
curl https://api.pdf4me.com/Optimize/OptimizeByProfile ^
    -H "Authorization: Basic DEV-KEY" ^
    -F profile=max ^
    -F "file=@./myPdf.pdf" ^
    -o ./optimizedPdf.pdf

C#

// optimization for maximal memory size reduction (profile: max)
byte[] optimizedPdf = await Pdf4meClient.Pdf4me.Instance.OptimizeClient.OptimizeByProfileAsync(
        File.ReadAllBytes("myPdf.pdf"),
        OptimizeActionProfile.Max
        );

// and writing the optimized PDF to disk
File.WriteAllBytes("optimizedPdf.pdf", optimizedPdf);

Java

// set up the optimizeClient
OptimizeClient optimizeClient = new OptimizeClient(pdf4meClient);

// optimization for maximal memory size reduction (profile: max) and writing the optimized PDF to disk
byte[] optimizedPdf = optimizeClient.optimizeByProfile(ProfileEnum.MAX, new File("myPdf.pdf"));
FileUtils.writeByteArrayToFile(new File("optimizedPdf.pdf"), optimizedPdf);

JavaScript

const fs = require('fs')
const path = require('path')

// set up the pdf4meClient
const pdf4meClient = pdf4me.createClient('YOUR API KEY')

// optimize
pdf4meClient.optimizeByProfile('max', fs.createReadStream(path.join(__dirname, 'myPdf.pdf')))
  .then(pdf => {
    // and writing the resulting PDFs to disk
    fs.writeFileSync(path.join(__dirname, 'optimizedPdf.pdf'), pdf)
  })
  .catch(err => {
    console.log(err)
  })

PHP

$optimizePdfByProfile = $client->pdf4me()->optimizeByProfile(
    [
        "profile"=> "max",
        "file" => __DIR__.'/myPdf.pdf'
    ]
);

// writing it to file
file_put_contents('optimizedPdf.pdf', $optimizePdfByProfile);

Python

# set up the optimize_client
optimize_client = OptimizeClient(pdf4me_client)

# optimization
optimized_pdf = optimize_client.optimize_by_profile(
    profile='max',
    file=FileReader().get_file_handler(path='myPdf.pdf')
)
# writing the generated PDF to disk
with open('optimizedPdf.pdf', 'wb') as f:
    f.write(optimized_pdf)

Ruby

a = Pdf4me::OptimizeByProfile.new(
  file: '/myPdf.pdf',
  profile: 'max',
  save_path: 'optimizedPdf.pdf'
)
a.run

Models

Optimize

Name Type Description Notes
document Document
optimizeAction OptimizeAction
jobId String [optional]
jobIdExtern String [optional]
integrations [String] [optional]
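
As a rough illustration of how these fields fit together, the following sketch assembles an Optimize request body as a plain dictionary. It assumes that the JSON field names match the model tables and that docData travels as a Base64 string, as in the JavaScript sample. build_optimize_request is a hypothetical helper, not part of any PDF4me client.

```python
import base64
import json


def build_optimize_request(pdf_bytes, name, profile="max"):
    """Assemble an Optimize request body (hypothetical helper, not SDK code)."""
    return {
        # Document model: file name plus Base64-encoded content (assumed encoding).
        "document": {
            "name": name,
            "docData": base64.b64encode(pdf_bytes).decode("ascii"),
        },
        # OptimizeAction model: apply one of the predefined profiles.
        "optimizeAction": {
            "useProfile": True,
            "profile": profile,
        },
    }


# Placeholder bytes stand in for a real PDF file.
body = build_optimize_request(b"%PDF-1.4 placeholder", "myPdf.pdf")
print(json.dumps(body, indent=2))
```

The optional fields (jobId, jobIdExtern, integrations) are left out here; they would be added as further top-level keys if needed.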

OptimizeAction

Name Type Notes Description
profile String [optional] Predefined optimization profile. Supported values:
- "default": minimal optimization.
- "web": optimization for the Internet: small size, short download, fast display.
- "print": optimization for print: correct fonts, right color scheme, suitable resolution.
- "max": optimization for maximal memory size reduction.
useProfile boolean [optional] Whether the chosen profile is applied. If it is not, the default optimization settings come into play.
removeRedundantObjects boolean [optional]
Default : true
Removal of redundant objects.
- true: duplicate objects are removed in order to reduce the file size.
subsetFonts boolean [optional]
Default : true
Subsetting and merging of fonts.
- true: subsets embedded fonts and merges embedded font programs of different subsets of the same font, granted they can be merged. Sub-setting refers to removing those glyphs in a font that are not actually used in any text contained in the PDF. If for any reason, e.g. postprocessing, fonts shall not be subset, set the property subsetFonts to false.
optimizeResources boolean [optional]
Default : true
Optimization of resources.
- true: unused resources such as images, fonts, and color spaces are removed. Also content streams are re-built.
forceCompressionTypes boolean [optional]
Default : true
Enforcement of specific compression types.
- true: re-compression of images is forced if an image in the input PDF has a compression type that differs from the compression types given in continuousCompressions, bitonalCompressions, or indexedCompressions. Use this option if you want to allow only the given compression types for images in the output PDF.
forceRecompression boolean [optional]
Default : true
Recompression of images.
- true: all images are always recompressed.
- false: images are only recompressed if the resulting image is smaller than the original, i.e. requires fewer bytes to store in the file.
reduceColorComplexity boolean [optional]
Default : true
Color complexity reduction of images.
- true: an image with DeviceRGB or DeviceCMYK color space in which all pixels are gray is converted to a grayscale image with DeviceGray color space. An image that contains only black and white pixels is converted into a bitonal image. An image in which all the pixels have the same color is down-sampled to one pixel. A soft mask that contains only black and white pixels is converted to a mask and opaque (soft) masks are removed.
mergeEmbeddedFonts boolean [optional]
Default : true
Merge embedded font programs.
- true: font programs are merged, if they originate from the same font, e.g. they are of the same type, have the same name and encoding. Merging of Type1 (PostScript) and TrueType fonts is supported.
bitonalCompressions enums [optional]
Compression types for bi-tonal images. Several values can be combined. During optimization, all set compression types are tried and the one resulting in the smallest memory footprint is taken. Typically, CCITT Group 4 or JBIG2 is used for bi-tonal compression. Due to its simpler algorithm, CCITT Group 4 has the advantage of being faster. JBIG2 can achieve compression ratios that are up to twice as high as CCITT Group 4 at the cost of longer computation time.
Supported values:
- "none": exclude from processing.
- "raw": no compression.
- "jPEG"
- "flate"
- "lZW"
- "group3"
- "group3_2D"
- "group4"
- "jBIG2": supported in PDF 1.4 or later.
- "jPEG2000": supported in PDF 1.5 or later, not supported in PDF/A-1.
- "mRC"
- "source": not a single compression format. Instead, uses the same compression as the original input image.
bitonalResolutionDPI double [optional] Target resolution in dots per inch (DPI) after re-sampling images for bi-tonal images. Only bi-tonal images with a resolution above the threshold DPI will be re-sampled. See also resolutionDPI, another optimize parameter.
bitonalThresholdDPI double [optional] Threshold resolution in dots per inch (DPI) to selectively activate re-sampling for bi-tonal images. The value -1 deactivates re-sampling for bitonal images. See also thresholdDPI, another optimize parameter.
clipImages boolean [optional]
Default : true
Option to clip images.
- true: invisible parts of images are clipped (cropped). While this does not affect visual parts of images, it may have a minor visual impact because clipped images are re-compressed. Pre-blended images are not clipped. Enabling this property will also enable the optimizeResources property.
continuousCompressions enums [optional]
Compression types to be tried for continuous images, i.e. RGB, CMYK, and grayscale images. Several values can be combined. During optimization, all set compression types are tried and the one resulting in the smallest memory footprint is taken.
Supported values:
- "none": exclude from processing.
- "raw": no compression.
- "jPEG"
- "flate"
- "lZW"
- "group3"
- "group3_2D"
- "group4"
- "jBIG2": supported in PDF 1.4 or later.
- "jPEG2000": supported in PDF 1.5 or later, not supported in PDF/A-1.
- "mRC"
- "source": not a single compression format. Instead, uses the same compression as the original input image.
linearize boolean [optional] Linearization of the PDF output file, i.e. optimizing the file for fast web access. A linearized document has a slightly larger file size than a non-linearized file and provides the following main features:
- When a document is opened in a PDF viewer of a web browser, the first page can be viewed without downloading the entire PDF file. In contrast, a non-linearized PDF file must be downloaded completely before the first page can be displayed.
- When another page is requested by the user, that page is displayed as quickly as possible and incrementally as data arrives, without downloading the entire PDF file.
Signed files cannot be linearized, so this property must be set to false if a digital signature is applied.
imageQuality Integer [optional] Quality index of lossy compression types. It is applied to JPEG and JPEG2000 compression. For JPEG2000, a quality index of 100 means lossless compression. JPEG compression is always lossy.
indexedCompressions enums [optional]
Compression types for images that have an indexed ("palette") color space. Several values can be combined. During optimization, all set compression types are tried and the one resulting in the smallest memory footprint is taken.
Supported values:
- "none": exclude from processing.
- "raw": no compression.
- "jPEG"
- "flate"
- "lZW"
- "group3"
- "group3_2D"
- "group4"
- "jBIG2": supported in PDF 1.4 or later.
- "jPEG2000": supported in PDF 1.5 or later, not supported in PDF/A-1.
- "mRC"
- "source": not a single compression format. Instead, uses the same compression as the original input image.
ditheringMode String [optional] Dithering algorithm when down-sampling bi-tonal images (creates an illusion of color depth by diffusing colors of the available palette). Some bi-tonal images try to evoke the impression of different levels of gray by randomly setting pixels to black. If dithering is applied during downsampling then the gray levels of such images are preserved better. If dithering is switched off then lines (e.g. text glyphs) are preserved better.
colorResolutionDPI double [optional] Target resolution in dots per inch (DPI) after re-sampling images for color images. See also resolutionDPI.
colorThresholdDPI double [optional] Threshold resolution in dots per inch (DPI) to selectively activate re-sampling for color images. Only color images with a resolution above the threshold DPI will be re-sampled. The value -1 deactivates re-sampling for color images. See also thresholdDPI.
monochromeResolutionDPI double [optional] Target resolution in dots per inch (DPI) after re-sampling images for monochrome images. See also resolutionDPI.
monochromeThresholdDPI double [optional] Threshold resolution in dots per inch (DPI) to selectively activate re-sampling for monochrome images. Only monochrome images with a resolution above the threshold DPI will be re-sampled. The value -1 deactivates re-sampling for monochrome images. See also thresholdDPI.
resolutionDPI Integer [optional] Resolution in DPI (dots per inch) after re-sampling images. This property affects all three image types (see bitonalResolutionDPI, colorResolutionDPI, monochromeResolutionDPI). A typical value for the resolution when optimizing for the web is 150 DPI. For printing, typically no re-sampling is applied. Pre-blended images, images with a color key mask, mask images, and soft mask images are not re-sampled.
thresholdDPI Integer [optional] Threshold in DPI (dots per inch) to selectively activate re-sampling. Only images with a resolution above the threshold DPI will be re-sampled. This property affects all three image types (see bitonalThresholdDPI, colorThresholdDPI, monochromeThresholdDPI). The value -1 deactivates re-sampling.
strip [String] [optional] Stripping mode. This mode can be configured to remove unneeded data of a PDF document. Multiple values can be combined.
infoEntries [KeyValuePair[String, String]] [optional] Key-value pair in the document info dictionary. Values of predefined keys are also stored in the XMP metadata. Popular entries specified in the PDF Reference 1.7 and accepted by most PDF viewers are: "Title", "Author", "Subject", "Creator" (sometimes referred to as Application), and "Producer" (sometimes referred to as PDF Creator).
flattenSignatureFields boolean [optional]
Default : true
A signature in a PDF consists of two parts:
A) The invisible digital signature in the PDF.
B) The visual appearance that was attributed to the signature.
Part A can be used by a viewing application, to verify that a document has not changed since it has been signed and report this to the user. Part B is merely a “decorative” element on the page without further significance. When optimizing a PDF, the PDF is altered and hence the digital signature is broken. Therefore, all signatures are removed, including parts A and B.
- true: the digital signatures (parts A) are still removed, but their visual appearances (parts B) are flattened. I.e. the latter are retained and drawn as non-editable graphic onto the page.
Note: The resulting PDF can be misleading as it visually appears to be signed, but it has no digital signature and hence, a viewer application does not report any broken signature. In most cases, such a behavior is undesirable.
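
The threshold and resolution properties interact: an image is re-sampled only when its resolution exceeds the threshold, and a threshold of -1 switches re-sampling off. The sketch below restates that rule and builds a custom OptimizeAction that does not use a profile. The helper names, the 225 DPI threshold, the image quality of 75, and the choice to serialize the compression enums as lists of strings are assumptions for illustration. Only the property names, the supported compression values, and the 150 DPI web resolution come from the table above.

```python
SUPPORTED_COMPRESSIONS = {
    "none", "raw", "jPEG", "flate", "lZW", "group3", "group3_2D",
    "group4", "jBIG2", "jPEG2000", "mRC", "source",
}


def will_resample(image_dpi, threshold_dpi):
    """Documented rule: -1 disables re-sampling; otherwise only images
    with a resolution above the threshold are re-sampled."""
    if threshold_dpi == -1:
        return False
    return image_dpi > threshold_dpi


def build_custom_action(resolution_dpi=150, threshold_dpi=225,
                        bitonal=("group4", "jBIG2"), continuous=("jPEG",),
                        image_quality=75):
    """Build an OptimizeAction dict without a predefined profile
    (hypothetical helper; default values are illustrative)."""
    for name in (*bitonal, *continuous):
        if name not in SUPPORTED_COMPRESSIONS:
            raise ValueError(f"unsupported compression type: {name}")
    return {
        "useProfile": False,
        "resolutionDPI": resolution_dpi,
        "thresholdDPI": threshold_dpi,
        "bitonalCompressions": list(bitonal),        # assumed list encoding
        "continuousCompressions": list(continuous),  # assumed list encoding
        "imageQuality": image_quality,
    }


action = build_custom_action()
# A 300 DPI image lies above the 225 DPI threshold, so it would be re-sampled.
print(action["resolutionDPI"], will_resample(300, action["thresholdDPI"]))
```

Validating the compression names locally catches typos such as "jpeg" before a request is sent, since the documented values use unusual capitalization.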

OptimizeRes

Name Type Description Notes
document Document Optimized document.

Document

Name Type Description Notes
jobId String JobId of Documents WorkingSet.
documentId String Document Id
name String Filename including file type.
docStatus String Status of the Document, e.g. Stamped.
pages Page Description of pages.
docData [byte] Document bytes.
docMetadata DocMetadata Document metadata such as title, pageCount et al.
docLogs DocLog Logging information about the request, e.g. timestamp.

Page

Name Type Description Notes
documentId String Globally unique Id.
pageId String Globally unique Id.
pageNumber Integer PageNumber, starting with 1.
rotate double By how much the page was rotated from its original orientation.
thumbnail byte Thumbnail representing this particular page.
sourceDocumentId String Id of the document it was created from, e.g. in case of an extraction, the result's sourceDocumentId is the Id of the PDF the pages have been extracted from.
sourcePageNumber Integer Page number of the original page in the original document. For example, if document B consists of page 4 of document A (extraction), the sourcePageNumber of document B's only page is 4.

DocMetadata

Name Type Description Notes
title String Title of document.
subject String Subject of document.
pageCount long Number of pages.
size long Size of the document in bytes.
isEncrypted boolean Whether the document is encrypted.
pdfCompliance String PDF compliance, e.g. PDF/A.
isSigned boolean Whether the document is signed.
uploadedMimeType String Uploaded MimeType, e.g. application/bson.
uploadedFileSize long Uploaded file size.
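
One possible use of these fields is to report how much an optimization saved. The sketch below assumes that uploadedFileSize holds the size of the input file and that size holds the size of the returned document, both in bytes; the table does not state this explicitly. size_reduction_percent is a hypothetical helper and the numbers are made up.

```python
def size_reduction_percent(doc_metadata):
    """Return the percentage saved, based on a DocMetadata-like dict."""
    uploaded = doc_metadata["uploadedFileSize"]
    optimized = doc_metadata["size"]
    if uploaded <= 0:
        raise ValueError("uploadedFileSize must be positive")
    return 100.0 * (uploaded - optimized) / uploaded


# Made-up metadata for illustration only.
sample = {
    "title": "Report",
    "pageCount": 12,
    "size": 250_000,
    "uploadedFileSize": 1_000_000,
}
print(f"{size_reduction_percent(sample):.1f}% smaller")
```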

DocLog

Name Type Description Notes
messageType String MessageType, e.g. PdfALog.
message String Message itself, e.g. a warning.
timestamp dateTime Timestamp.
docLogLevel String Type of message. Supported Values :
"verbose", "info", "warning", "error", "timing"
durationMilliseconds long Timing for requested log information [ms].
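
A short sketch of how log entries of this shape might be inspected after a request. It assumes that docLogs arrives as a list of dictionaries keyed by the field names above; the helper names and the sample entries are hypothetical.

```python
PROBLEM_LEVELS = {"warning", "error"}


def problem_messages(doc_logs):
    """Return (level, message) pairs for warnings and errors only."""
    return [
        (entry["docLogLevel"], entry["message"])
        for entry in doc_logs
        if entry.get("docLogLevel") in PROBLEM_LEVELS
    ]


def total_duration_ms(doc_logs):
    """Sum durationMilliseconds over entries whose level is "timing"."""
    return sum(
        entry.get("durationMilliseconds", 0)
        for entry in doc_logs
        if entry.get("docLogLevel") == "timing"
    )


# Hypothetical log entries shaped like the DocLog model.
logs = [
    {"messageType": "PdfALog", "message": "Font substituted.", "docLogLevel": "warning"},
    {"messageType": "PdfALog", "message": "Started.", "docLogLevel": "info"},
    {"messageType": "PdfALog", "message": "Optimize", "docLogLevel": "timing",
     "durationMilliseconds": 420},
]
print(problem_messages(logs))
print(total_duration_ms(logs))
```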
