Repair

The Repair function in Pdf4me analyzes a Pdf document and repairs it. It can also recover pages and rebuild fonts, text, and other damaged content.

Feature Parameter Response Action Description Links
repair Repair RepairRes RepairAction Analyzes and repairs a Pdf document. Swagger, Sample
repairDocument file file stream - Repairs a Pdf, for example by recovering pages or rebuilding fonts. Swagger, Sample

Samples

Repair

curl

No sample.

C#
// create repair object
var req = new Repair()
{
    // document
    Document = new Document()
    {
        DocData = File.ReadAllBytes("myPdf.pdf"),
        Name = "myPdf.pdf",
    },
    // action
    RepairAction = new RepairAction()
    {
        AnalyzeOnly = false,
        RecoverPages = true,
        RecoverXREF = true,
        RebuildFonts = true,
        RebuildFontsAsType1 = true,
        RebuildStreams = true,
    },
};

// repairing pdf
var res = Pdf4me.Instance.PdfAClient.RepairAsync(req).GetAwaiter().GetResult();

// extract the repaired PDF and write it to disk
byte[] repairedPdf = res.Document.DocData;
File.WriteAllBytes("repairedPdf.pdf", repairedPdf);
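
JavaScript
const fs = require('fs')
const path = require('path')

// set up the pdf4meClient (assumes the Pdf4me SDK module is already loaded as `pdf4me`)
const pdf4meClient = pdf4me.createClient('YOUR API KEY')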

// create repair object
const repairReq = {
  // document
  document: {
    docData: fs.readFileSync(path.join(__dirname, 'myPdf.pdf')).toString('base64'),
  },
  // action
  repairAction: {
    analyzeOnly: false,
    recoverPages: true,
    recoverXREF: true,
    rebuildFonts: true,
    rebuildFontsAsType1: true,
    rebuildStreams: true,
  },
}

// repair
pdf4meClient.repair(repairReq)
  .then(function(repairRes) {
    // extract the repaired PDF and write it to disk
    const pdfDocument = Buffer.from(repairRes.document.docData, 'base64')
    fs.writeFileSync(path.join(__dirname, 'repairedPdf.pdf'), pdfDocument)
  })
  .catch(error => {
    console.log(error)
  })

PHP
// create repair object
$create_repair = [
    'document'=> [
        'name' => 'myPdf.pdf',
        'docData' => $client->getFileData('myPdf.pdf')
    ],
    'repairAction'=> [
        'analyzeOnly'=>0,
        'recoverPages'=>1,
        'recoverXREF'=>1,
        'rebuildFonts'=>1,
        'rebuildFontsAsType1'=>1,
        'rebuildStreams'=>1,
    ],
];

// repairing
$res = $client->pdf4me()->repair($create_repair);

// extracting the repaired PDF
$repairedPdf = base64_decode($res->document->docData);
// writing it to disk
file_put_contents('repairedPdf.pdf', $repairedPdf);

Python
import base64

# set up the pdfA_client
pdfA_client = PdfAClient(pdf4me_client)

# create the repair object
repair = Repair(
    # document
    document=Document(
        doc_data=FileReader().get_file_data('myPdf.pdf')
    ),
    # action
    repair_action=RepairAction(
        analyze_only=0,
        recover_pages=1,
        recover_xref=1,
        rebuild_fonts=1,
        rebuild_fonts_as_type1=1,
        rebuild_streams=1
    )
)

# repairing the PDF
res = pdfA_client.repair(repair=repair)

# extracting the repaired PDF
repairedPdf = base64.b64decode(res['document']['doc_data'])
# writing it to disk
with open('repairedPdf.pdf', 'wb') as f:
    f.write(repairedPdf)

RepairDocument

curl
curl https://api.pdf4me.com/Repair/RepairDocument ^
    -H "Authorization: Basic DEV-KEY" ^
    -F "file=@./myPdf.pdf" ^
    -o ./repairedPdf.pdf

C#
// repairing
byte[] repairedPdf = Pdf4me.Instance.PdfAClient.RepairDocumentAsync(File.ReadAllBytes("myPdf.pdf")).GetAwaiter().GetResult();

// writing the repaired PDF to disk
File.WriteAllBytes("repairedPdf.pdf", repairedPdf);
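
JavaScript
const fs = require('fs')
const path = require('path')

// set up the pdf4meClient (assumes the Pdf4me SDK module is already loaded as `pdf4me`)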
const pdf4meClient = pdf4me.createClient('YOUR API KEY')

// repair document
pdf4meClient.repairDocument(fs.createReadStream(path.join(__dirname, 'myPdf.pdf')))
  .then(pdf => {
    // write the repaired PDF to disk
    fs.writeFileSync(path.join(__dirname, 'repairedPdf.pdf'), pdf)
  })
  .catch(error => {
    console.error(error)
  })

PHP
// repairing
$res = $client->pdf4me()->repairDocument(
    [
        "file" => 'myPdf.pdf'
    ]
);

// writing it to file
file_put_contents('repairedPdf.pdf', $res);

Python
# set up the pdfA_client
pdfA_client = PdfAClient(pdf4me_client)

# repairing the PDF
repaired_pdf = pdfA_client.repair_document(
    file=FileReader().get_file_handler(path='myPdf.pdf')
)
# writing the generated PDF to disk
with open('repairedPdf.pdf', 'wb') as f:
    f.write(repaired_pdf)
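
For readers who prefer not to use an SDK, the curl call above can be approximated with the Python standard library. This is a minimal sketch, not the official client: the helper name `build_repair_document_request` is hypothetical, the `application/pdf` content type on the file part is an assumption, and the request is only built here, not sent.

```python
import urllib.request
import uuid

# Endpoint and header format are taken from the curl sample above.
REPAIR_DOCUMENT_URL = "https://api.pdf4me.com/Repair/RepairDocument"


def build_repair_document_request(pdf_bytes, filename, api_key):
    """Build (but do not send) a multipart POST carrying the PDF as form field "file"."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode("ascii"),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode("utf-8"),
        # Assumption: the part is labelled as a PDF.
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode("ascii"),
    ])
    request = urllib.request.Request(REPAIR_DOCUMENT_URL, data=body, method="POST")
    request.add_header("Authorization", f"Basic {api_key}")
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return request


req = build_repair_document_request(b"%PDF-1.4 placeholder", "myPdf.pdf", "DEV-KEY")
print(req.get_method(), req.full_url)

# Sending is left commented out because it needs a valid key and network access:
# with urllib.request.urlopen(req) as resp, open("repairedPdf.pdf", "wb") as out:
#     out.write(resp.read())
```

Building the request separately from sending it makes the headers and body easy to inspect before any call is made.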

Models

Repair

Name Type Description Notes
document Document
repairAction RepairAction
jobId String [optional]
jobIdExtern String [optional]
integrations [String] [optional]
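
To make the shape of this model concrete, the following sketch assembles a request body as a plain dictionary. It assumes, as the JavaScript sample suggests, that docData travels as a base64 string. The helper name `build_repair_request` is hypothetical and not part of any Pdf4me SDK.

```python
import base64
import json


def build_repair_request(pdf_bytes, name, repair_action, job_id=None):
    """Assemble a Repair request body as a JSON-serializable dict."""
    request = {
        "document": {
            "name": name,
            # Assumption: binary data is carried as a base64 string in JSON.
            "docData": base64.b64encode(pdf_bytes).decode("ascii"),
        },
        "repairAction": dict(repair_action),
    }
    # jobId is optional, so it is only added when provided.
    if job_id is not None:
        request["jobId"] = job_id
    return request


body = build_repair_request(
    b"%PDF-1.4 placeholder bytes",
    "myPdf.pdf",
    {"analyzeOnly": False, "recoverPages": True},
)
print(json.dumps(body, indent=2))
```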

RepairAction

Name Type Description Notes
analyzeOnly boolean Analyze the Pdf document only. [optional]
recoverPages boolean Recover pages of a broken Pdf document. [optional]
extractPages boolean Extract pages from the Pdf document. [optional]
recoverXREF boolean Recover the cross-reference (XREF) table. [optional]
rebuildFonts boolean Rebuild fonts of the Pdf document. [optional]
rebuildFontsAsType1 boolean Rebuild fonts as Type 1 fonts. [optional]
rebuildStreams boolean Rebuild streams. [optional]
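
The Python sample uses snake_case names such as recover_xref, while this table lists camelCase fields such as recoverXREF. The sketch below shows one way the two spellings could be mapped. The class `RepairActionSketch` and the helper `to_camel` are hypothetical illustrations, not the SDK's own types, and unset flags are simply omitted because every field is optional.

```python
from dataclasses import dataclass, fields
from typing import Optional

# Field names whose JSON spelling is not plain camelCase.
_SPECIAL_NAMES = {"recover_xref": "recoverXREF"}


def to_camel(name: str) -> str:
    """Convert a snake_case field name to the spelling used in the table above."""
    if name in _SPECIAL_NAMES:
        return _SPECIAL_NAMES[name]
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


@dataclass
class RepairActionSketch:
    analyze_only: Optional[bool] = None
    recover_pages: Optional[bool] = None
    extract_pages: Optional[bool] = None
    recover_xref: Optional[bool] = None
    rebuild_fonts: Optional[bool] = None
    rebuild_fonts_as_type1: Optional[bool] = None
    rebuild_streams: Optional[bool] = None

    def to_payload(self) -> dict:
        """Return only the flags that were set, keyed by their camelCase names."""
        return {
            to_camel(f.name): getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


action = RepairActionSketch(analyze_only=False, recover_xref=True, rebuild_fonts_as_type1=True)
print(action.to_payload())
```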

RepairRes

Name Type Description Notes
document Document Repaired document.

Document

Name Type Description Notes
jobId String JobId of Documents WorkingSet.
documentId String Document Id
name String Filename including filetype.
docStatus String Status of the Document, e.g. Stamped.
pages Page Description of pages.
docData [byte] Document bytes.
docMetadata DocMetadata Document metadata such as title, pageCount, etc.
docLogs DocLog Logging information about the request, e.g. timestamp.
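
As an illustration of reading docData back out of a response, the sketch below decodes the field and checks that the result begins with the `%PDF-` marker that PDF files start with. The response is modelled as a plain dictionary with camelCase keys, which is an assumption (the Python sample above reads `doc_data` instead), and the helper name `extract_repaired_pdf` is hypothetical.

```python
import base64


def extract_repaired_pdf(response: dict) -> bytes:
    """Decode document.docData and sanity-check that it looks like a PDF."""
    doc_data = response["document"]["docData"]
    pdf_bytes = base64.b64decode(doc_data)
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("decoded docData does not start with a PDF header")
    return pdf_bytes


# A stand-in response built locally; a real one would come from the service.
fake_response = {
    "document": {
        "name": "repairedPdf.pdf",
        "docData": base64.b64encode(b"%PDF-1.7\n...").decode("ascii"),
    }
}
pdf = extract_repaired_pdf(fake_response)
print(len(pdf), "bytes decoded")
```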

Page

Name Type Description Notes
documentId String Globally unique Id.
pageId String Globally unique Id.
pageNumber Integer PageNumber, starting with 1.
rotate double By how much the page was rotated from its original orientation.
thumbnail byte Thumbnail representing this particular page.
sourceDocumentId String Id of the document it was created from, e.g. in case of an extraction, the result's sourceDocumentId is the Id of the PDF the pages have been extracted from.
sourcePageNumber Integer Page number of the original page in the original document, e.g. if document B consists of page number 4 of document A (extraction), then the sourcePageNumber of document B's only page is 4.

DocMetadata

Name Type Description Notes
title String Title of document.
subject String Subject of document.
pageCount long Number of pages.
size long Size of the document in bytes.
isEncrypted boolean Whether the document is encrypted.
pdfCompliance String Pdf Compliance, e.g. PDF/A.
isSigned boolean Whether the document is signed.
uploadedMimeType String Uploaded MimeType, e.g. application/bson.
uploadedFileSize long Uploaded file size.

DocLog

Name Type Description Notes
messageType String MessageType, e.g. PdfALog.
message String Message itself, e.g. a warning.
timestamp dateTime Timestamp.
docLogLevel String Type of message. Supported values: "verbose", "info", "warning", "error", "timing".
durationMilliseconds long Timing for requested log information [ms].
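
A short sketch of how these log entries might be consumed: it filters entries by docLogLevel and sums durationMilliseconds over the "timing" entries. It assumes docLogs arrives as a list of dictionaries keyed by the field names above. The helper names and the sample messages are made up for illustration.

```python
def select_logs(doc_logs, levels=("warning", "error")):
    """Return the log entries whose docLogLevel is one of the given levels."""
    wanted = {level.lower() for level in levels}
    return [log for log in doc_logs if log.get("docLogLevel", "").lower() in wanted]


def total_duration_ms(doc_logs):
    """Sum durationMilliseconds over entries with docLogLevel "timing"."""
    return sum(
        log.get("durationMilliseconds", 0)
        for log in doc_logs
        if log.get("docLogLevel") == "timing"
    )


# Made-up entries for demonstration only.
logs = [
    {"messageType": "PdfALog", "message": "font rebuilt", "docLogLevel": "info"},
    {"messageType": "PdfALog", "message": "xref table damaged", "docLogLevel": "warning"},
    {"messageType": "PdfALog", "message": "repair", "docLogLevel": "timing", "durationMilliseconds": 120},
    {"messageType": "PdfALog", "message": "analysis", "docLogLevel": "timing", "durationMilliseconds": 30},
]
problems = select_logs(logs)
print([log["message"] for log in problems], total_duration_ms(logs))
```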
