Split

The Split functionality in Pdf4me splits a multi-page PDF into a number of smaller PDF documents. It can also split a PDF periodically, for example after every x pages. Most commonly used document formats can first be converted to PDF with the Pdf4me convert function and then split into smaller PDFs as required.

| Feature | Parameter | Response | Action | Description | Links |
| --- | --- | --- | --- | --- | --- |
| split | Split | SplitRes | SplitAction | Splits a multi-page PDF into a number of smaller PDF documents. | Swagger, Sample |
| splitByPageNr | pageNr, file | file stream | | Splits the PDF after page pageNr. The result contains the first and second PDF as separate documents. | Swagger, Sample |
| splitRecurring | pageNr, file | [file stream] | | Splits a PDF at a recurring interval. For example, splitting a 10-page PDF after every 2 pages gives 5 smaller PDFs with 2 pages each. | Swagger, Sample |
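
The page arithmetic behind splitRecurring can be sketched in a few lines of Python. This only illustrates the example from the table (10 pages, split after every 2 pages) and is not Pdf4me code: the function name `recurring_ranges` is hypothetical, and the handling of a trailing remainder (a shorter last document) is an assumption about how the service behaves.

```python
def recurring_ranges(page_count, every):
    """Return 1-based, inclusive (first, last) page ranges, one per output PDF.

    Hypothetical helper for illustration; not part of any Pdf4me SDK.
    """
    if page_count < 1 or every < 1:
        raise ValueError("page_count and every must be positive")
    ranges = []
    for first in range(1, page_count + 1, every):
        # Assumption: a trailing remainder becomes a shorter last document.
        last = min(first + every - 1, page_count)
        ranges.append((first, last))
    return ranges


# The example from the table: 10 pages, split after every 2 pages.
print(recurring_ranges(10, 2))  # [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
```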

Samples

Split

Samples are available for C#, Java, JavaScript, PHP, Python, and Ruby; no curl sample is provided.

C#

// create the split object
var split = new Split()
{
    // document
    Document = new Document()
    {
        DocData = File.ReadAllBytes("myPdf.pdf"),
        Name = "myPdf.pdf",
    },
    // action
    SplitAction = new SplitAction()
    {
        SplitAfterPage = 2
    },
};

// splitting the PDF
var res = await Pdf4meClient.Pdf4me.Instance.SplitClient.SplitAsync(split);

// extract the resulting documents and write them to disk
byte[] pdf1 = res.Documents[0].DocData;
byte[] pdf2 = res.Documents[1].DocData;
File.WriteAllBytes("pdf1.pdf", pdf1);
File.WriteAllBytes("pdf2.pdf", pdf2);

Java

// set up the splitClient
SplitClient splitClient = new SplitClient(pdf4meClient);

// create the split object
Split split = new Split();
// document
Document document = new Document();
document.setDocData(Files.readAllBytes(Paths.get("myPdf.pdf")));
split.setDocument(document);

// action
SplitAction splitAction = new SplitAction();
splitAction.setSplitAfterPage(2);
split.setSplitAction(splitAction);

// splitting the PDF
SplitRes res = splitClient.split(split);

// extract the generated PDFs and write them to disk
List<Document> documents = res.getDocuments();
byte[] pdf1 = documents.get(0).getDocData();
FileUtils.writeByteArrayToFile(new File("pdf1.pdf"), pdf1);
byte[] pdf2 = documents.get(1).getDocData();
FileUtils.writeByteArrayToFile(new File("pdf2.pdf"), pdf2);

JavaScript

// set up the pdf4meClient
const pdf4meClient = pdf4me.createClient('YOUR API KEY')

// create the Split object
const splitReq = {
  // document
  document: {
    docData: fs.readFileSync(path.join(__dirname, 'myPdf.pdf')).toString('base64'),
  },
  // action
  splitAction: {
    splitAfterPage: 2,
  },
}

// splitting the PDF
pdf4meClient.split(splitReq)
  .then(splitRes => {
    // extract the resulting documents and write them to disk
    const pdf1 = Buffer.from(splitRes.documents[0].docData, 'base64')
    const pdf2 = Buffer.from(splitRes.documents[1].docData, 'base64')
    fs.writeFileSync(path.join(__dirname, 'pdf1.pdf'), pdf1)
    fs.writeFileSync(path.join(__dirname, 'pdf2.pdf'), pdf2)
  })
  .catch(error => {
    console.error(error)
  })

PHP

// create the Split object
$create_split = [
    // document
    "document" => [
        'name' => 'myPdf.pdf',
        'docData' => $client->getFileData('myPdf.pdf')
    ],
    // action
    "splitAction" => [
        "splitAfterPage" => 2
    ]
];

// splitting the PDF
$res = $client->pdf4me()->splitPdf($create_split);

// extracting the generated PDFs
$pdf1 = base64_decode($res->documents[0]->docData);
$pdf2 = base64_decode($res->documents[1]->docData);

// and writing them to file
file_put_contents('pdf1.pdf', $pdf1);
file_put_contents('pdf2.pdf', $pdf2);

Python

# set up the split_client
split_client = SplitClient(pdf4me_client)

# create the split object
split = Split(
    # document
    document=Document(
        doc_data=FileReader().get_file_data('myPdf.pdf')
    ),
    # action
    split_action=SplitAction(
        split_after_page=2
    )
)

# splitting the PDF
res = split_client.split(split=split)

# extracting the generated PDFs
documents = res['documents']
pdf_1 = base64.b64decode(documents[0]['doc_data'])
pdf_2 = base64.b64decode(documents[1]['doc_data'])
# writing them to disk
with open('pdf1.pdf', 'wb') as f:
    f.write(pdf_1)
with open('pdf2.pdf', 'wb') as f:
    f.write(pdf_2)

Ruby

file_path = './in/GraphicsTest.pdf'

action = Pdf4me::Split.new(
    # document
    document: Pdf4me::Document.new(
        doc_data: Base64.encode64(File.open(file_path, 'rb', &:read)),
    ),
    # action
    split_action: Pdf4me::SplitAction.new(
        split_after_page: 2
    )
)

# splitting the PDF
response = action.run

# save documents
response.documents.each_with_index do |document, index|
    File.open("./out/Split_#{index}.pdf", 'wb') do |f|
      f.write(Base64.decode64(document.doc_data))
    end
end
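
The write-to-disk step that every sample repeats can be factored into one small helper. The sketch below assumes the response shape used in the Python sample above (a list of entries with a base64-encoded `doc_data` field); the helper name `save_documents` and the stand-in data are hypothetical, so the snippet runs without a Pdf4me client.

```python
import base64
import os
import tempfile


def save_documents(documents, out_dir, prefix="pdf"):
    """Decode each document's base64 doc_data and write it to out_dir.

    Hypothetical helper; assumes a list of dicts with a base64-encoded
    'doc_data' entry, as in the Python sample above.
    """
    paths = []
    for index, document in enumerate(documents, start=1):
        path = os.path.join(out_dir, f"{prefix}{index}.pdf")
        with open(path, "wb") as f:
            f.write(base64.b64decode(document["doc_data"]))
        paths.append(path)
    return paths


# Stand-in for res['documents']; a real response carries full PDF bytes.
fake_documents = [
    {"doc_data": base64.b64encode(b"%PDF-1.4 first").decode("ascii")},
    {"doc_data": base64.b64encode(b"%PDF-1.4 second").decode("ascii")},
]
out_dir = tempfile.mkdtemp()
saved = save_documents(fake_documents, out_dir)
print([os.path.basename(p) for p in saved])  # ['pdf1.pdf', 'pdf2.pdf']
```

Looping over the documents rather than indexing `[0]` and `[1]` by hand avoids the copy-and-paste index slips that are easy to make when the number of results is fixed in code.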

SplitByPageNr

Samples are available for C#, Java, JavaScript, PHP, Python, and Ruby; no curl sample is provided.

C#

// split the PDF into two after page 2
var res = await Pdf4meClient.Pdf4me.Instance.SplitClient.SplitByPageNrAsync(
    File.ReadAllBytes("myPdf.pdf"),
    2
);

// and writing the resulting PDFs to disk
File.WriteAllBytes("pdf1.pdf", res[0]);
File.WriteAllBytes("pdf2.pdf", res[1]);

Java

// set up the splitClient
SplitClient splitClient = new SplitClient(pdf4meClient);

// split the PDF into two
List<byte[]> res = splitClient.splitByPageNr(2, new File("myPdf.pdf"));

// and writing the resulting PDFs to disk
byte[] pdf1 = res.get(0);
FileUtils.writeByteArrayToFile(new File("pdf1.pdf"), pdf1);
byte[] pdf2 = res.get(1);
FileUtils.writeByteArrayToFile(new File("pdf2.pdf"), pdf2);

JavaScript

// create pdf4meClient
const p4mClient = pdf4me.createClient('YOUR API KEY')

// split the PDF into two
p4mClient.splitByPageNr(2, fs.createReadStream(path.join(__dirname, 'myPdf.pdf')))
  .then(splitRes => {
    // and writing the resulting PDFs to disk
    fs.writeFileSync(path.join(__dirname, 'pdf1.pdf'), splitRes[0])
    fs.writeFileSync(path.join(__dirname, 'pdf2.pdf'), splitRes[1])
  })
  .catch(error => {
    console.error(error)
    process.exit(1)
  })

PHP

// splitting the PDF
$res = $client->pdf4me()->splitByPageNr(
    [
        "pageNr"=> 2,
        "file" => __DIR__.'/myPdf.pdf'
    ]
);

// and writing the resulting PDFs to disk
file_put_contents('pdf1.pdf', $res[0]);
file_put_contents('pdf2.pdf', $res[1]);

Python

# set up the split_client
split_client = SplitClient(pdf4me_client)

# splitting the PDF
pdf_1, pdf_2 = split_client.split_by_page_nr(
    page_nr=2,
    file=FileReader().get_file_handler(path='myPdf.pdf')
)
# writing the generated PDFs to disk
with open('pdf1.pdf', 'wb') as f:
    f.write(pdf_1)
with open('pdf2.pdf', 'wb') as f:
    f.write(pdf_2)
    
Ruby

a = Pdf4me::SplitByPageNr.new(
    file: '/myPdf.pdf',
    page_number: 2,
    first_pdf: 'pdf1.pdf', # defaults to '/path/to/file/pdf1.pdf'
    second_pdf: 'pdf2.pdf' # defaults to '/path/to/file/pdf2.pdf'
)
a.run

SplitRecurring

Samples are available for C#, JavaScript, PHP, and Python; no curl, Java, or Ruby sample is provided.

C#

// split the PDF after every 2 pages
var res = await Pdf4meClient.Pdf4me.Instance.SplitClient.SplitRecurringAsync(
    File.ReadAllBytes("myPdf.pdf"),
    2
);

// and writing the resulting PDFs to disk
for (int i = 0; i < res.Count; i++)
    File.WriteAllBytes("pdf_" + i + ".pdf", res[i]);


JavaScript

// create pdf4meClient
const pdf4meClient = pdf4me.createClient('YOUR API KEY')

// split the PDF
pdf4meClient.splitRecurring(5, fs.createReadStream(path.join(__dirname, 'myPdf.pdf')))
  .then(splitRes => {
    // and writing the resulting PDFs to disk
    splitRes.forEach((pdf, index) => {
      fs.writeFileSync(path.join(__dirname, `splitRecurring_pdf${index + 1}_result.pdf`), pdf)
    })
  })
  .catch(error => {
    console.error(error)
  })

PHP

// splitting the PDF after every 3 pages
$res = $pdf4meclient->pdf4me()->splitRecurring(
    [
        "pageNr" => 3,
        "file" => 'myPdf.pdf'
    ]
);

$count = 1;
// extracting each PDF
foreach ($res as $pdf) {
    $pdf_base = base64_decode($pdf);
    // and writing it to disk (file name first, then data)
    file_put_contents('pdf_'.$count.'.pdf', $pdf_base);
    $count = $count + 1;
}

Python

# set up the split_client
split_client = SplitClient(pdf4me_client)

# splitting the PDF after every 3 pages
pdfs = split_client.split_recurring(
    page_nr=3,
    file=FileReader().get_file_handler(path='myPdf.pdf')
)

# writing the generated PDFs to disk
for count, pdf in enumerate(pdfs, start=1):
    with open('pdf_' + str(count) + '.pdf', 'wb') as f:
        f.write(pdf)

Model

Split

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| document | Document | | |
| splitAction | SplitAction | | |
| jobIdExtern | String | | [optional] |
| integrations | [String] | | [optional] |
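
As a rough illustration of how the Split model maps onto a request body, the sketch below builds one as a plain Python dict. The field names come from the Split, Document and SplitAction models; sending `docData` as base64 text mirrors the JavaScript sample and is an assumption for other transports. The function name `build_split_request` is hypothetical.

```python
import base64
import json


def build_split_request(pdf_bytes, name, split_after_page):
    """Build a Split request body as a plain dict.

    Hypothetical helper; field names follow the Split, Document and
    SplitAction models. base64 text for docData is an assumption taken
    from the JavaScript sample.
    """
    return {
        "document": {
            "name": name,
            "docData": base64.b64encode(pdf_bytes).decode("ascii"),
        },
        "splitAction": {"splitAfterPage": split_after_page},
    }


# Placeholder bytes stand in for the content of a real PDF file.
request = build_split_request(b"%PDF-1.4 demo", "myPdf.pdf", 2)
print(json.dumps(request, indent=2))
```

The optional fields `jobIdExtern` and `integrations` are left out here because the model marks them as optional.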

SplitAction

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| splitAfterPage | Integer | Page after which the split takes place. | [optional] |
| splitSequence | [Integer] | A list of page numbers after which a split occurs. | [optional] |
| recurringSplitAfterPage | Number | Split after every xth page. | [optional] |
| barcodeStartsWith | String | | [optional] |
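
To make the meaning of splitSequence concrete, the following sketch turns a list of split points into the page ranges of the resulting documents. It is an illustration under assumptions, not service code: the name `sequence_ranges` is hypothetical, and silently ignoring duplicate or out-of-range split points is a choice made here; the service may reject such input instead.

```python
def sequence_ranges(page_count, split_sequence):
    """Return 1-based, inclusive (first, last) ranges for a splitSequence.

    Hypothetical helper for illustration; not part of any Pdf4me SDK.
    """
    # Assumption: drop duplicates and split points that would leave an
    # empty document (below page 1 or at/after the last page).
    cuts = sorted({p for p in split_sequence if 1 <= p < page_count})
    ranges = []
    first = 1
    for cut in cuts:
        ranges.append((first, cut))
        first = cut + 1
    ranges.append((first, page_count))
    return ranges


print(sequence_ranges(10, [2, 5]))  # [(1, 2), (3, 5), (6, 10)]
```

Under this reading, splitAfterPage behaves like a splitSequence with a single entry: `sequence_ranges(10, [4])` gives `[(1, 4), (5, 10)]`.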

SplitRes

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| documents | [Document] | The documents produced by splitting the original document. | |

Document

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| jobId | String | JobId of the document's WorkingSet. | |
| documentId | String | Document Id. | |
| name | String | Filename including file type. | |
| docStatus | String | Status of the document, e.g. Stamped. | |
| pages | Page | Description of pages. | |
| docData | [byte] | Document bytes. | |
| docMetadata | DocMetadata | Document metadata such as title, pageCount, etc. | |
| docLogs | DocLog | Logging information about the request, e.g. timestamp. | |

Page

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| documentId | String | Globally unique Id. | |
| pageId | String | Globally unique Id. | |
| pageNumber | Integer | Page number, starting with 1. | |
| rotate | double | By how much the page was rotated from its original orientation. | |
| thumbnail | byte | Thumbnail representing this particular page. | |
| sourceDocumentId | String | Id of the document it was created from. For example, after an extraction, the result's sourceDocumentId is the Id of the PDF the pages were extracted from. | |
| sourcePageNumber | Integer | Page number of the original page in the original document. For example, if document B consists of page 4 of document A (extraction), the sourcePageNumber of document B's only page is 4. | |
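
The relationship between pageNumber and sourcePageNumber can be sketched as follows. The example reproduces the extraction case from the table (document B holds only page 4 of document A). The function name `page_entries` is hypothetical, and whether split results populate these fields in exactly this way is an assumption.

```python
def page_entries(first_source_page, last_source_page, source_document_id):
    """Build Page-like dicts for a document made of a contiguous page range.

    Hypothetical helper; pageNumber restarts at 1 in the new document,
    while sourcePageNumber keeps the position in the original document.
    """
    return [
        {
            "pageNumber": index,
            "sourcePageNumber": source_page,
            "sourceDocumentId": source_document_id,
        }
        for index, source_page in enumerate(
            range(first_source_page, last_source_page + 1), start=1
        )
    ]


# Document B consists of page 4 of document A.
pages = page_entries(4, 4, "A")
print(pages)  # [{'pageNumber': 1, 'sourcePageNumber': 4, 'sourceDocumentId': 'A'}]
```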

DocMetadata

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| title | String | Title of the document. | |
| subject | String | Subject of the document. | |
| pageCount | long | Number of pages. | |
| size | long | Size of the document in bytes. | |
| isEncrypted | boolean | Whether the document is encrypted. | |
| pdfCompliance | String | PDF compliance, e.g. PDF/A. | |
| isSigned | boolean | Whether the document is signed. | |
| uploadedMimeType | String | Uploaded MimeType, e.g. application/bson. | |
| uploadedFileSize | long | Uploaded file size. | |

DocLog

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| messageType | String | MessageType, e.g. PdfALog. | |
| message | String | Message itself, e.g. a warning. | |
| timestamp | dateTime | Timestamp. | |
| docLogLevel | String | Type of message. Supported values: "verbose", "info", "warning", "error", "timing". | |
| durationMilliseconds | long | Timing for requested log information [ms]. | |
