Optimize
Optimize provides customizable compression and optimization of PDF files. You can configure the compression type, set the image resolution, choose whether to clip the invisible parts of images in your PDF, and more.
Feature | Parameter | Response | Action | Description | Links |
---|---|---|---|---|---|
optimize | Optimize | OptimizeRes | OptimizeAction | Customized compression and optimization of PDF files. | Swagger Sample |
optimizeByProfile | profile, file | file stream | | Optimizes a PDF based on a predefined profile. | Swagger Sample |
Samples
Optimize
- curl
- C#
- Java
- JavaScript
- PHP
- Python
- Ruby
curl: no sample available.
C#
// create optimize object
Optimize optimize = new Optimize()
{
    // document
    Document = new Document()
    {
        DocData = File.ReadAllBytes("myPdf.pdf"),
        Name = "myPdf.pdf",
    },
    // action
    OptimizeAction = new OptimizeAction()
    {
        UseProfile = true,
        Profile = OptimizeActionProfile.Max
    },
};
// optimize
var res = await Pdf4meClient.Pdf4me.Instance.OptimizeClient.OptimizeAsync(optimize);
// extract the optimized PDF and write it to disk
byte[] optimizedPdf = res.Document.DocData;
File.WriteAllBytes("optimizedPdf.pdf", optimizedPdf);
Java
// setup the optimizeClient
OptimizeClient optimizeClient = new OptimizeClient(pdf4meClient);
// create optimize object
Optimize optimize = new Optimize();
// document
Document document = new Document();
document.setDocData(Files.readAllBytes(Paths.get("myPdf.pdf")));
optimize.setDocument(document);
// action
OptimizeAction optimizeAction = new OptimizeAction();
optimizeAction.useProfile(true);
optimizeAction.profile(ProfileEnum.MAX);
optimize.setOptimizeAction(optimizeAction);
// optimization
OptimizeRes res = optimizeClient.optimize(optimize);
// extracting the generated PDF and writing it to disk
byte[] optimizedPdf = res.getDocument().getDocData();
FileUtils.writeByteArrayToFile(new File("optimizedPdf.pdf"), optimizedPdf);
JavaScript
const fs = require('fs')
const path = require('path')

// setup the pdf4meClient
const pdf4meClient = pdf4me.createClient('YOUR API KEY')
// create optimize object
const optimizeReq = {
  // document
  document: {
    docData: fs.readFileSync(path.join(__dirname, 'myPdf.pdf')).toString('base64'),
  },
  // action
  optimizeAction: {
    useProfile: true,
    profile: 'web',
  },
}
// optimize
pdf4meClient.optimize(optimizeReq)
  .then(function(optimizeRes) {
    // extracting the generated PDF and writing it to disk
    const pdfDocument = Buffer.from(optimizeRes.document.docData, 'base64')
    fs.writeFileSync(path.join(__dirname, 'optimizedPdf.pdf'), pdfDocument)
  })
  .catch(error => {
    console.log(error)
    process.exit(1)
  })
PHP
// optimize (the result is stored so it can be read below)
$createOptimize = $client->pdf4me()->optimize([
    'document' => [
        'docData' => $client->getFileData('myPdf.pdf')
    ],
    'optimizeAction' => [
        'profile' => 'max',
        'useProfile' => true
    ]
]);
// extracting the optimized PDF
$optimizedPdf = base64_decode($createOptimize->document->docData);
// and writing it to file
file_put_contents('optimizedPdf.pdf', $optimizedPdf);
Python
import base64

# setup the optimize_client
optimize_client = OptimizeClient(pdf4me_client)
# create the optimize object
optimize = Optimize(
    document=Document(
        doc_data=FileReader().get_file_data('myPdf.pdf')
    ),
    optimize_action=OptimizeAction(
        use_profile=True,
        profile='max'
    )
)
# optimization
res = optimize_client.optimize(optimize=optimize)
# extracting the generated PDF
optimized_pdf = base64.b64decode(res['document']['doc_data'])
# writing it to disk
with open('optimizedPdf.pdf', 'wb') as f:
    f.write(optimized_pdf)
Ruby
require 'base64'

file = '/myPdf.pdf'
a = Pdf4me::Optimize.new(
  document: Pdf4me::Document.new(
    doc_data: Base64.encode64(File.open(file, 'rb', &:read))
  ),
  optimize_action: Pdf4me::OptimizeAction.new(
    use_profile: true,
    profile: 'max'
  )
)
response = a.run
# decode the optimized PDF and write it to disk
File.open('optimizedPdf.pdf', 'wb') do |f|
  f.write(Base64.decode64(response.document.doc_data))
end
OptimizeByProfile
- curl
- C#
- Java
- JavaScript
- PHP
- Python
- Ruby
curl (Windows command prompt syntax)
curl https://api.pdf4me.com/Optimize/OptimizeByProfile ^
    -H "Authorization: Basic DEV-KEY" ^
    -F profile=max ^
    -F "file=@./myPdf.pdf" ^
    -o ./optimizedPdf.pdf
C#
// optimization for maximal memory size reduction (profile: max)
byte[] optimizedPdf = await Pdf4meClient.Pdf4me.Instance.OptimizeClient.OptimizeByProfileAsync(
    File.ReadAllBytes("myPdf.pdf"),
    OptimizeActionProfile.Max
);
// and writing the optimized PDF to disk
File.WriteAllBytes("optimizedPdf.pdf", optimizedPdf);
Java
// setup the optimizeClient
OptimizeClient optimizeClient = new OptimizeClient(pdf4meClient);
// optimization for maximal memory size reduction (profile: max) and writing the optimized PDF to disk
byte[] optimizedPdf = optimizeClient.optimizeByProfile(ProfileEnum.MAX, new File("myPdf.pdf"));
FileUtils.writeByteArrayToFile(new File("optimizedPdf.pdf"), optimizedPdf);
JavaScript
const fs = require('fs')
const path = require('path')

// setup the pdf4meClient
const pdf4meClient = pdf4me.createClient('YOUR API KEY')
// optimize
pdf4meClient.optimizeByProfile('max', fs.createReadStream(path.join(__dirname, 'myPdf.pdf')))
  .then(pdf => {
    // and writing the resulting PDF to disk
    fs.writeFileSync(path.join(__dirname, 'optimizedPdf.pdf'), pdf)
  })
  .catch(err => {
    console.log(err)
  })
PHP
$optimizePdfByProfile = $client->pdf4me()->optimizeByProfile(
    [
        "profile" => "max",
        "file" => __DIR__.'/myPdf.pdf'
    ]
);
// writing it to file
file_put_contents('optimizedPdf.pdf', $optimizePdfByProfile);
Python
# setup the optimize_client
optimize_client = OptimizeClient(pdf4me_client)
# optimization
optimized_pdf = optimize_client.optimize_by_profile(
    profile='max',
    file=FileReader().get_file_handler(path='myPdf.pdf')
)
# writing the generated PDF to disk
with open('optimizedPdf.pdf', 'wb') as f:
    f.write(optimized_pdf)
Ruby
a = Pdf4me::OptimizeByProfile.new(
  file: '/myPdf.pdf',
  profile: 'max',
  save_path: 'optimizedPdf.pdf'
)
a.run
Models
Optimize
Name | Type | Description | Notes |
---|---|---|---|
document | Document | | |
optimizeAction | OptimizeAction | | |
jobId | String | | [optional] |
jobIdExtern | String | | [optional] |
integrations | [String] | | [optional] |
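As a rough illustration of how the Optimize model maps onto a request body, the following sketch builds the JSON by hand with the standard library only. The helper name `build_optimize_request` is hypothetical and not part of any PDF4me SDK, and the base64 encoding of `docData` is an assumption taken from the JavaScript sample.

```python
import base64
import json


def build_optimize_request(pdf_bytes, name, profile="max"):
    """Return a JSON string shaped like the Optimize model (hypothetical helper)."""
    # Profile names as listed for OptimizeAction.profile.
    if profile not in {"default", "web", "print", "max"}:
        raise ValueError(f"unsupported profile: {profile!r}")
    body = {
        "document": {
            # Assumption: docData travels as base64 text, as in the JavaScript sample.
            "docData": base64.b64encode(pdf_bytes).decode("ascii"),
            "name": name,
        },
        "optimizeAction": {
            "useProfile": True,
            "profile": profile,
        },
    }
    return json.dumps(body)


payload = build_optimize_request(b"%PDF-1.4 sample bytes", "myPdf.pdf", profile="web")
print(payload)
```

The optional fields (jobId, jobIdExtern, integrations) are simply left out here; the SDK clients shown in the samples remain the intended way to send the request.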
OptimizeAction
Name | Type | Notes | Description |
---|---|---|---|
profile | String | Supported Values: "default": minimal optimization. "web": optimization for the Internet: small size, short download, fast display. "print": optimization for print: correct fonts, right color scheme, suitable resolution. "max": optimization for maximal memory size reduction. [optional] | Predefined optimization profile. |
useProfile | boolean | [optional] | Whether the chosen profile is applied. If not, the default optimization settings are used. |
removeRedundantObjects | boolean | [optional] Default: true | Removal of redundant objects. - true: duplicate objects are removed in order to reduce the file size. |
subsetFonts | boolean | [optional] Default: true | Subsetting and merging of fonts. - true: subsets embedded fonts and merges embedded font programs of different subsets of the same font, provided they can be merged. Subsetting refers to removing those glyphs in a font that are not actually used in any text contained in the PDF. If fonts must not be subset for any reason, e.g. postprocessing, set the property subsetFonts to false. |
optimizeResources | boolean | [optional] Default: true | Optimization of resources. - true: unused resources such as images, fonts, and color spaces are removed. Content streams are also rebuilt. |
forceCompressionTypes | boolean | [optional] Default: true | Enforcement of specific compression types. - true: re-compression of images is forced if an image in the input PDF has a compression type that differs from the compression types given in continuousCompressions, bitonalCompressions, or indexedCompressions. Use this option if you want to allow only the given compression types for images in the output PDF. |
forceRecompression | boolean | [optional] Default: true | Recompression of images. - true: all images are always recompressed. - false: images are only recompressed if the resulting image is smaller than the original, i.e. requires fewer bytes to store in the file. |
reduceColorComplexity | boolean | [optional] Default: true | Color complexity reduction of images. - true: an image with DeviceRGB or DeviceCMYK color space in which all pixels are gray is converted to a grayscale image with DeviceGray color space. An image that contains only black and white pixels is converted into a bitonal image. An image in which all the pixels have the same color is down-sampled to one pixel. A soft mask that contains only black and white pixels is converted to a mask and opaque (soft) masks are removed. |
mergeEmbeddedFonts | boolean | [optional] Default: true | Merge embedded font programs. - true: font programs are merged if they originate from the same font, e.g. they are of the same type and have the same name and encoding. Merging of Type1 (PostScript) and TrueType fonts is supported. |
bitonalCompressions | enums | Supported Values: "none": exclude from processing. "raw": no compression. "jPEG", "flate", "lZW", "group3", "group3_2D", "group4", "jBIG2": supported in PDF1.4 or later. "jPEG2000": supported in PDF1.5 or later, not supported in PDF/A-1. "mRC", "source": not a single compression format. Instead, uses the same compression as the original input image. [optional] | Compression types for bi-tonal images. Several values can be combined. During optimization, all set compression types are tried and the one resulting in the smallest memory footprint is taken. Typically, CCITT Group 4 or JBIG2 is used for bi-tonal compression. Due to its simpler algorithm, CCITT Group 4 has the advantage of being faster. JBIG2 can achieve compression ratios that are up to twice as high as CCITT Group 4 at the cost of longer computation time. |
bitonalResolutionDPI | double | [optional] | Target resolution in dots per inch (DPI) after re-sampling bi-tonal images. Only bi-tonal images with a resolution above the threshold DPI will be re-sampled. See also resolutionDPI, another optimize parameter. |
bitonalThresholdDPI | double | [optional] | Threshold resolution in dots per inch (DPI) to selectively activate re-sampling for bi-tonal images. The value -1 deactivates re-sampling for bi-tonal images. See also thresholdDPI, another optimize parameter. |
clipImages | boolean | [optional] Default: true | Option to clip images. - true: invisible parts of images are clipped (cropped). While this does not affect the visible parts of images, it may have a minor visual impact because clipped images are re-compressed. Pre-blended images are not clipped. Enabling this property will also enable the optimizeResources property. |
continuousCompressions | enums | Supported Values: "none": exclude from processing. "raw": no compression. "jPEG", "flate", "lZW", "group3", "group3_2D", "group4", "jBIG2": supported in PDF1.4 or later. "jPEG2000": supported in PDF1.5 or later, not supported in PDF/A-1. "mRC", "source": not a single compression format. Instead, uses the same compression as the original input image. [optional] | Compression types to be tried for continuous images, i.e. RGB, CMYK, and grayscale images. Several values can be combined. During optimization, all set compression types are tried and the one resulting in the smallest memory footprint is taken. |
linearize | boolean | [optional] | Linearization of the PDF output file, i.e. optimizing the file for fast web access. A linearized document has a slightly larger file size than a non-linearized file and provides the following main features: When a document is opened in a PDF viewer of a web browser, the first page can be viewed without downloading the entire PDF file. In contrast, a non-linearized PDF file must be downloaded completely before the first page can be displayed. When another page is requested by the user, that page is displayed as quickly as possible and incrementally as data arrives, without downloading the entire PDF file. Signed files cannot be linearized, so this property must be set to false if a digital signature is applied. |
imageQuality | Integer | [optional] | Quality index of lossy compression types. It is applied to JPEG and JPEG2000 compression. For JPEG2000, a quality index of 100 means lossless compression. JPEG compression is always lossy. |
indexedCompressions | enums | Supported Values: "none": exclude from processing. "raw": no compression. "jPEG", "flate", "lZW", "group3", "group3_2D", "group4", "jBIG2": supported in PDF1.4 or later. "jPEG2000": supported in PDF1.5 or later, not supported in PDF/A-1. "mRC", "source": not a single compression format. Instead, uses the same compression as the original input image. [optional] | Compression types for images that have an indexed ("palette") color space. Several values can be combined. During optimization, all set compression types are tried and the one resulting in the smallest memory footprint is taken. |
ditheringMode | String | [optional] | Dithering algorithm when down-sampling bi-tonal images (creates an illusion of color depth by diffusing colors of the available palette). Some bi-tonal images try to evoke the impression of different levels of gray by randomly setting pixels to black. If dithering is applied during down-sampling, the gray levels of such images are preserved better. If dithering is switched off, lines (e.g. text glyphs) are preserved better. |
colorResolutionDPI | double | [optional] | Target resolution in dots per inch (DPI) after re-sampling color images. See also resolutionDPI. |
colorThresholdDPI | double | [optional] | Threshold resolution in dots per inch (DPI) to selectively activate re-sampling for color images. Only color images with a resolution above the threshold DPI will be re-sampled. The value -1 deactivates re-sampling for color images. See also thresholdDPI. |
monochromeResolutionDPI | double | [optional] | Target resolution in dots per inch (DPI) after re-sampling monochrome images. See also resolutionDPI. |
monochromeThresholdDPI | double | [optional] | Threshold resolution in dots per inch (DPI) to selectively activate re-sampling for monochrome images. Only monochrome images with a resolution above the threshold DPI will be re-sampled. The value -1 deactivates re-sampling for monochrome images. See also thresholdDPI. |
resolutionDPI | Integer | [optional] | Resolution in DPI (dots per inch) after re-sampling images. This property affects all three image compression types (bitonalResolutionDPI, colorResolutionDPI, monochromeResolutionDPI). A typical value for the resolution when optimizing for the web is 150 DPI. For printing, typically no re-sampling is applied. Pre-blended images, images with a color key mask, mask, and soft mask images are not re-sampled. |
thresholdDPI | Integer | [optional] | Threshold in DPI (dots per inch) to selectively activate re-sampling. Only images with a resolution above the threshold DPI will be re-sampled. This property affects all three image compression types (bitonalThresholdDPI, colorThresholdDPI, monochromeThresholdDPI). The value -1 deactivates re-sampling. |
strip | [String] | [optional] | Stripping mode. This mode can be configured to remove unneeded data of a PDF document. Multiple values can be combined. |
infoEntries | [KeyValuePair[String, String]] | [optional] | Key-value pair in the document info dictionary. Values of predefined keys are also stored in the XMP metadata. Popular entries specified in the PDF Reference 1.7 and accepted by most PDF viewers are: "Title", "Author", "Subject", "Creator" (sometimes referred to as Application), and "Producer" (sometimes referred to as PDF Creator). |
flattenSignatureFields | boolean | [optional] Default: true | A signature in a PDF consists of two parts: A) The invisible digital signature in the PDF. B) The visual appearance that was attributed to the signature. Part A can be used by a viewing application to verify that a document has not changed since it was signed and to report this to the user. Part B is merely a "decorative" element on the page without further significance. When optimizing a PDF, the PDF is altered and hence the digital signature is broken. Therefore, all signatures are removed, including parts A and B. - true: the digital signatures (parts A) are still removed, but their visual appearances (parts B) are flattened, i.e. the latter are retained and drawn as non-editable graphics onto the page. Note: The resulting PDF can be misleading as it visually appears to be signed, but it has no digital signature and hence a viewer application does not report any broken signature. In most cases, such behavior is undesirable. |
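Two of the rules described in the table above can be sketched in a few lines: the threshold rule for re-sampling (thresholdDPI, resolutionDPI) and the "try every configured compression and keep the smallest" selection. This is only an illustration of the described behavior, not the service's implementation. The function names are hypothetical, and the codecs are stand-ins from the Python standard library: zlib implements Flate, whereas bz2 and lzma are not PDF compression types at all.

```python
import bz2
import lzma
import zlib


def resample_target(image_dpi, resolution_dpi, threshold_dpi):
    """Return the DPI an image is re-sampled to, or None if it is left untouched."""
    if threshold_dpi == -1:
        return None  # -1 deactivates re-sampling
    if image_dpi > threshold_dpi:
        return resolution_dpi  # only images above the threshold are re-sampled
    return None


# Stand-in codecs: zlib is the Flate algorithm, while bz2 and lzma are NOT PDF
# filters and only serve to make the selection step runnable.
CODECS = {"flate": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def smallest_encoding(data, allowed):
    """Encode data with every allowed codec and keep the smallest result."""
    encoded = {name: CODECS[name](data) for name in allowed}
    best = min(encoded, key=lambda name: len(encoded[name]))
    return best, encoded[best]


print(resample_target(300, 150, 225))   # image above the threshold
print(resample_target(200, 150, 225))   # image below the threshold
name, blob = smallest_encoding(b"pdf4me " * 200, ["flate", "bz2", "lzma"])
print(name, len(blob))
```

With forceRecompression set to false, the table adds one more condition: the winning candidate is only used if it is smaller than the original image data.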
OptimizeRes
Name | Type | Description | Notes |
---|---|---|---|
document | Document | Optimized Document. | |
Document
Name | Type | Description | Notes |
---|---|---|---|
jobId | String | JobId of Documents WorkingSet. | |
documentId | String | Document Id. | |
name | String | Filename including filetype. | |
docStatus | String | Status of the Document, e.g. Stamped. | |
pages | Page | Description of pages. | |
docData | [byte] | Document bytes. | |
docMetadata | DocMetadata | Document metadata such as title, pageCount et al. | |
docLogs | DocLog | Logging information about the request, e.g. timestamp. | |
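The samples above all end by reading docData from the returned document. A minimal sketch of that step on a plain dictionary follows, assuming the response is JSON with the document under the `document` key and docData as base64 text (as in the JavaScript and PHP samples). The function name and the sample response are hypothetical.

```python
import base64


def extract_pdf_bytes(optimize_res):
    """Decode the optimized PDF from a response shaped like OptimizeRes."""
    document = optimize_res["document"]
    return base64.b64decode(document["docData"])


# Hypothetical response, trimmed to the fields used here.
sample_res = {
    "document": {
        "name": "optimizedPdf.pdf",
        "docData": base64.b64encode(b"%PDF-1.4 optimized").decode("ascii"),
    }
}
pdf_bytes = extract_pdf_bytes(sample_res)
print(len(pdf_bytes))
```

Writing `pdf_bytes` to a file opened in binary mode (`'wb'`) then yields the optimized PDF on disk, as in the Python sample.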
Page
Name | Type | Description | Notes |
---|---|---|---|
documentId | String | Globally unique Id. | |
pageId | String | Globally unique Id. | |
pageNumber | Integer | PageNumber, starting with 1. | |
rotate | double | By how much the page was rotated from its original orientation. | |
thumbnail | byte | Thumbnail representing this particular page. | |
sourceDocumentId | String | Id of the document it was created from, e.g. in case of an extraction, the result's sourceDocumentId is the Id of the PDF the pages have been extracted from. | |
sourcePageNumber | Integer | Page number of the original page in the original document. For example, if document B consists only of page 4 of document A (extraction), the sourcePageNumber of document B's only page is 4. | |
DocMetadata
Name | Type | Description | Notes |
---|---|---|---|
title | String | Title of document. | |
subject | String | Subject of document. | |
pageCount | long | Number of pages. | |
size | long | Size of the document in bytes. | |
isEncrypted | boolean | Whether the document is encrypted. | |
pdfCompliance | String | PDF compliance, e.g. PDF/A. | |
isSigned | boolean | Whether the document is signed. | |
uploadedMimeType | String | Uploaded MimeType, e.g. application/bson. | |
uploadedFileSize | long | Uploaded file size. | |
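If both size fields are populated, they could be used to report how much an optimization saved. The sketch below assumes that `size` describes the returned (optimized) document and `uploadedFileSize` the original upload; the table does not state this explicitly, so treat it as an assumption. The function name is hypothetical.

```python
def size_reduction_percent(doc_metadata):
    """Percentage saved relative to the upload; 0.0 if the upload size is unknown."""
    # Assumption: "size" is the optimized size, "uploadedFileSize" the input size.
    uploaded = doc_metadata.get("uploadedFileSize") or 0
    if uploaded <= 0:
        return 0.0
    return 100.0 * (uploaded - doc_metadata["size"]) / uploaded


print(size_reduction_percent({"size": 250_000, "uploadedFileSize": 1_000_000}))  # prints 75.0
```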
DocLog
Name | Type | Description | Notes |
---|---|---|---|
messageType | String | MessageType, e.g. PdfALog. | |
message | String | Message itself, e.g. a warning. | |
timestamp | dateTime | Timestamp. | |
docLogLevel | String | Type of message. | Supported Values: "verbose", "info", "warning", "error", "timing" |
durationMilliseconds | long | Timing for requested log information [ms]. | |
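A typical use of these entries is to surface only the warnings and errors after a request. The sketch below filters on docLogLevel, assuming docLogs arrives as a list of dictionaries keyed by the field names in the table. The function name and the sample entries are hypothetical.

```python
def collect_problems(doc_logs):
    """Return "level: message" strings for warning and error entries."""
    return [
        f'{entry["docLogLevel"]}: {entry["message"]}'
        for entry in doc_logs
        if entry.get("docLogLevel") in ("warning", "error")
    ]


# Hypothetical log entries using the DocLog field names.
sample_logs = [
    {"messageType": "PdfALog", "message": "font subset", "docLogLevel": "info"},
    {"messageType": "PdfALog", "message": "image skipped", "docLogLevel": "warning"},
    {"messageType": "PdfALog", "message": "optimize", "docLogLevel": "timing",
     "durationMilliseconds": 420},
]
print(collect_problems(sample_logs))
```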