BUG: Support for Adding and Viewing Ink Annotations in Mac's Preview app #2332

themarisolhernandez · 2023-12-07T09:27:07Z

I am having problems adding Ink annotations back to a PDF using PdfWriter.add_annotation(). I think the problem is related to the PDF viewer: when I open the file in Mac's Preview app after adding the Ink annotation, the annotation is transparent. The file and annotation look fine in Adobe Reader.

Any ideas why?

This isn't an issue when using PyPDF2, but I would prefer to use pypdf. Are there plans to support this?

Environment

Which environment were you using when you encountered the problem?

$ python -m platform
macOS-13.5.2-x86_64-i386-64bit

$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==3.17.1, crypt_provider=('cryptography', '41.0.5'), PIL=9.4.0

Code + PDF

This is a minimal, complete example that shows the issue:

from pypdf import PdfReader, PdfWriter
from io import BytesIO


def add_annots(file_bytes: bytes,
               annots: list) -> bytes:
    writer = PdfWriter()

    with BytesIO(file_bytes) as input_stream, BytesIO() as output_stream:
        reader = PdfReader(input_stream)
        writer.append_pages_from_reader(reader)

        for page_num, annot in annots:
            writer.add_annotation(page_number=page_num, annotation=annot)

        # Add original metadata
        writer.add_metadata(reader.metadata)

        writer.write(output_stream)
        output_stream.seek(0)
        pdf_file = output_stream.read()

    return pdf_file

The input file is attached under the filename input.pdf. The output file is attached under the filename output.pdf. An image of the output file is also attached to show that the Ink annotation is transparent.

The annots input looks like:

[[0, {'/AP': {'/N': IndirectObject(78, 0, 5043561152)}, '/C': [0.898041, 0.133331, 0.215683], '/CreationDate': "D:20231122191551-08'00'", '/F': 4, '/M': "D:20231122191551-08'00'", '/NM': 'dc92223d-3b4c-47d4-a2a9-73c263b95c84', '/P': IndirectObject(48, 0, 5043561152), '/Popup': IndirectObject(76, 0, 5043561152), '/QuadPoints': [199.111, 614.634, 446.841, 614.634, 199.111, 602.86, 446.841, 602.86, 85.1462, 603.952, 135.067, 603.952, 85.1462, 590.839, 135.067, 590.839, 199.564, 603.952, 328.095, 603.952, 199.564, 590.839, 328.095, 590.839], '/Rect': [81.2359, 590.019, 450.393, 615.411], '/Subj': 'Cross-Out', '/Subtype': '/StrikeOut', '/T': 'Yoed', '/Type': '/Annot'}], [0, {'/F': 28, '/Open': False, '/Parent': IndirectObject(77, 0, 5043561152), '/Rect': [609.12, 522.634, 793.12, 614.634], '/Subtype': '/Popup', '/Type': '/Annot'}], [0, {'/AP': {'/N': IndirectObject(73, 0, 5043561152)}, '/C': [0.988235, 0.956863, 0.521576], '/CA': 0.399994, '/CreationDate': "D:20231122191602-08'00'", '/F': 4, '/M': "D:20231122191602-08'00'", '/NM': '1a628f96-03a9-452e-a596-fb484dbb7563', '/P': IndirectObject(48, 0, 5043561152), '/Popup': IndirectObject(71, 0, 5043561152), '/QuadPoints': [168.149, 545.59, 312.648, 545.59, 168.149, 534.318, 312.648, 534.318], '/Rect': [165.14, 533.966, 315.657, 545.942], '/Subj': 'Highlight', '/Subtype': '/Highlight', '/T': 'Yoed', '/Type': '/Annot'}], [0, {'/F': 28, '/Open': False, '/Parent': IndirectObject(72, 0, 5043561152), '/Rect': [609.12, 453.59, 793.12, 545.59], '/Subtype': '/Popup', '/Type': '/Annot'}], [0, {'/AP': {'/N': IndirectObject(70, 0, 5043561152), '/R': IndirectObject(69, 0, 5043561152)}, '/C': [1, 0.819611, 0], '/Contents': 'Testing', '/CreationDate': "D:20231122191647-08'00'", '/F': 28, '/M': "D:20231122191658-08'00'", '/NM': '315e7788-b990-4d64-807e-b210c23bd4ee', '/Name': '/Comment', '/P': IndirectObject(48, 0, 5043561152), '/Popup': IndirectObject(67, 0, 5043561152), '/RC': '<?xml version="1.0"?><body 
xmlns="http://www.w3.org/1999/xhtml" xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:APIVersion="Acrobat:23.6.0" xfa:spec="2.0.2" ><p dir="ltr"><span dir="ltr" style="font-size:10.2pt;text-align:left;font-weight:normal;font-style:normal">Testing</span></p></body>', '/Rect': [252.351, 502.874, 276.351, 526.874], '/Subj': 'Sticky Note', '/Subtype': '/Text', '/T': 'Yoed', '/Type': '/Annot'}], [0, {'/F': 28, '/Open': False, '/Parent': IndirectObject(68, 0, 5043561152), '/Rect': [609.12, 434.874, 793.12, 526.874], '/Subtype': '/Popup', '/Type': '/Annot'}], [0, {'/AP': {'/N': IndirectObject(64, 0, 5043561152)}, '/C': [1, 1, 1], '/Contents': 'Test text here..........', '/CreationDate': "D:20231122191835-08'00'", '/DA': '0.898 0.1333 0.2157 rg /Helv 12 Tf', '/DS': 'font: Helvetica,sans-serif 12.0pt; text-align:left; color:#E52237 ', '/F': 4, '/M': "D:20231122191859-08'00'", '/NM': '85f8f2dd-0a22-4eed-81d0-230a7f66037f', '/P': IndirectObject(48, 0, 5043561152), '/RC': '<?xml version="1.0"?><body xmlns="http://www.w3.org/1999/xhtml" xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:APIVersion="Acrobat:23.6.0" xfa:spec="2.0.2"  style="font-size:12.0pt;text-align:left;color:#FF0000;font-weight:normal;font-style:normal;font-family:Helvetica,sans-serif;font-stretch:normal"><p dir="ltr"><span style="font-family:Helvetica">Test text here..........</span></p></body>', '/Rect': [297.461, 479.137, 418.337, 497.006], '/Subj': 'Text Box', '/Subtype': '/FreeText', '/T': 'Yoed', '/Type': '/Annot'}], [0, {'/AP': {'/N': IndirectObject(62, 0, 5043561152)}, '/BS': IndirectObject(61, 0, 5043561152), '/C': [0.898041, 0.133331, 0.215683], '/CreationDate': "D:20231122191912-08'00'", '/F': 4, '/InkList': [[230.191, 417.648, 229.139, 418.699, 227.563, 421.852, 227.037, 423.429, 225.986, 425.531, 225.461, 428.684, 223.884, 433.94, 223.884, 434.465, 223.358, 438.144, 223.358, 439.195, 223.358, 441.823, 223.358, 446.027, 223.358, 446.553, 223.358, 449.706, 223.358, 450.757, 223.358, 
451.808, 223.884, 452.334, 224.935, 453.911, 225.461, 454.962, 228.088, 457.589, 230.716, 459.692, 236.497, 463.371, 238.599, 464.422, 244.38, 465.998, 245.432, 465.998, 245.957, 466.524, 246.483, 466.524, 247.008, 466.524, 249.636, 466.524, 250.687, 466.524, 256.994, 466.524, 260.672, 466.524, 262.249, 466.524, 263.826, 466.524, 264.877, 465.998, 270.658, 463.371, 271.709, 462.845, 277.49, 459.692, 279.067, 458.641, 282.22, 456.538, 287.475, 452.334, 288.001, 451.283, 290.103, 449.181, 292.731, 444.976, 293.257, 443.925, 294.833, 439.721, 295.884, 438.144, 296.41, 436.042, 296.935, 433.414, 297.986, 429.735, 297.986, 428.159, 297.986, 426.057, 298.512, 422.903, 299.038, 421.327, 299.038, 419.75, 299.038, 417.122, 297.461, 409.239, 296.935, 408.188, 295.884, 403.983, 293.782, 398.728, 293.257, 397.677, 291.154, 393.998, 288.527, 390.319, 287.475, 389.794, 285.373, 387.166, 282.22, 385.064, 277.49, 382.961, 268.03, 380.859, 260.672, 379.283, 254.366, 378.232, 243.329, 378.232, 240.702, 378.232, 236.497, 378.232, 228.614, 378.232, 227.563, 379.283, 221.256, 381.91, 219.154, 383.487, 209.169, 389.794, 206.541, 391.37, 197.607, 403.458, 193.402, 410.816, 192.351, 415.02, 189.198, 423.429, 188.147, 426.582, 187.096, 432.363, 187.096, 434.465, 187.096, 439.195, 187.096, 441.297, 187.096, 444.976, 187.621, 446.027, 187.621, 447.078]], '/M': "D:20231122191912-08'00'", '/NM': '51684b6f-eb0c-43d3-b464-2b1443354af7', '/P': IndirectObject(48, 0, 5043561152), '/Popup': IndirectObject(46, 0, 5043561152), '/Rect': [182.443, 372.721, 305.745, 472.808], '/Subj': 'Pencil', '/Subtype': '/Ink', '/T': 'Yoed', '/Type': '/Annot'}], [0, {'/F': 28, '/Open': False, '/Parent': IndirectObject(47, 0, 5043561152), '/Rect': [609.12, 325.648, 793.12, 417.648], '/Subtype': '/Popup', '/Type': '/Annot'}]]

input.pdf

output.pdf


pubpub-zz · 2023-12-09T08:45:24Z

To complete the analysis, this is the result in Acrobat Reader (Windows) / PDF-XChange (Windows):

with PDF.js (Firefox):

with Chrome:

@themarisolhernandez, I would expect the same results with the same software on Mac. Can you confirm?

pubpub-zz · 2023-12-09T09:09:37Z

@themarisolhernandez

Your code cannot be run as is. Can you complete it? Thanks.

MartinThoma · 2023-12-09T12:16:12Z

Comment by themarisolhernandez:

[Adding an Ink annotation] isn't an issue when using PyPDF2

That is interesting and unexpected.

pubpub-zz · 2023-12-09T13:06:19Z

Comment by themarisolhernandez:

[Adding an Ink annotation] isn't an issue when using PyPDF2

That is interesting and unexpected.

@themarisolhernandez, can you also provide the output when using PyPDF2?

themarisolhernandez · 2023-12-11T18:08:10Z

@pubpub-zz @MartinThoma Here is the complete code using pypdf:

from pypdf.generic import NullObject, IndirectObject, ArrayObject, DictionaryObject
from pypdf import PdfReader, PdfWriter
from typing import Any
from io import BytesIO


def extract_annots_recursively(extracted_annots: list,
                               page_num: int,
                               annots: Any) -> None:
    if annots is None or isinstance(annots, NullObject):
        # Skip NullObjects
        return
    elif isinstance(annots, IndirectObject):
        obj = annots.get_object()

        extract_annots_recursively(extracted_annots=extracted_annots,
                                   page_num=page_num,
                                   annots=obj)
    elif isinstance(annots, (list, ArrayObject)):
        for obj in annots:
            extract_annots_recursively(extracted_annots=extracted_annots,
                                       page_num=page_num,
                                       annots=obj)
    elif isinstance(annots, (dict, DictionaryObject)):
        extracted_annots.append([page_num, annots])

def extract_annots(file_bytes: bytes) -> tuple[bytes, list]:
    writer = PdfWriter()
    extracted_annots = []

    with BytesIO(file_bytes) as input_stream, BytesIO() as output_stream:
        reader = PdfReader(input_stream)

        for page_num, page in enumerate(reader.pages):
            page_annots = page.get("/Annots", [])

            extract_annots_recursively(extracted_annots=extracted_annots,
                                       page_num=page_num,
                                       annots=page_annots)

            writer.add_page(page)

        # Remove annots from the PdfWriter
        writer.remove_annotations(subtypes=None)

        writer.write(output_stream)
        output_stream.seek(0)
        pdf_file = output_stream.read()

    return pdf_file, extracted_annots


def add_annots(file_bytes: bytes,
               annots: list) -> bytes:
    writer = PdfWriter()

    with BytesIO(file_bytes) as input_stream, BytesIO() as output_stream:
        reader = PdfReader(input_stream)
        writer.append_pages_from_reader(reader)

        for page_num, annot in annots:
            writer.add_annotation(page_number=page_num, annotation=annot)

        # Add original metadata
        writer.add_metadata(reader.metadata)

        writer.write(output_stream)
        output_stream.seek(0)
        pdf_file = output_stream.read()

    return pdf_file


if __name__ == "__main__":
    print("--- Extract Annots ---")
    with open("extract_annots__input_file.pdf", "rb") as f:
        input_file = f.read()

    output_file, annots = extract_annots(file_bytes=input_file)

    print("\n--- Add Annots ---")
    output_file = add_annots(file_bytes=output_file,
                             annots=annots)

    with open("add_annots__output_file.pdf", "wb") as f:
        f.write(output_file)

The input and output files are attached:
extract_annots__input_file.pdf
add_annots__output_file.pdf

Again, the Ink annotation is visible when the output file is opened in a PDF viewer like Adobe Acrobat Reader, but not when it is opened in Mac's Preview app. As seen in the screenshot, the Ink annotation is there but it is transparent for some reason.

I will send another response with the output of PyPDF2.

themarisolhernandez · 2023-12-11T18:15:39Z

Here are the results of using PyPDF2 instead:

from PyPDF2.generic import NullObject, IndirectObject, ArrayObject, DictionaryObject
from PyPDF2 import PdfReader, PdfWriter
from typing import Any
from io import BytesIO


def extract_annots_recursively(extracted_annots: list,
                               page_num: int,
                               annots: Any) -> None:
    if annots is None or isinstance(annots, NullObject):
        # Skip NullObjects
        return
    elif isinstance(annots, IndirectObject):
        obj = annots.get_object()

        extract_annots_recursively(extracted_annots=extracted_annots,
                                   page_num=page_num,
                                   annots=obj)
    elif isinstance(annots, (list, ArrayObject)):
        for obj in annots:
            extract_annots_recursively(extracted_annots=extracted_annots,
                                       page_num=page_num,
                                       annots=obj)
    elif isinstance(annots, (dict, DictionaryObject)):
        extracted_annots.append([page_num, annots])


def extract_annots(file_bytes: bytes) -> tuple[bytes, list]:
    writer = PdfWriter()
    extracted_annots = []

    # Note: input_stream is not closed explicitly because it leads to an I/O error for IndirectObjects
    input_stream = BytesIO(file_bytes)
    reader = PdfReader(input_stream)

    for page_num, page in enumerate(reader.pages):
        page_annots = page.get("/Annots", [])

        extract_annots_recursively(extracted_annots=extracted_annots,
                                   page_num=page_num,
                                   annots=page_annots)

        writer.add_page(page)

    # Remove annots from the PdfWriter
    writer.remove_links()

    with BytesIO() as output_stream:
        writer.write(output_stream)
        output_stream.seek(0)
        cleaned_pdf = output_stream.read()

    return cleaned_pdf, extracted_annots


def add_annots(file_bytes: bytes,
               annots: list) -> bytes:
    writer = PdfWriter()

    with BytesIO(file_bytes) as input_stream, BytesIO() as output_stream:
        reader = PdfReader(input_stream)
        writer.append_pages_from_reader(reader)

        for page_num, annot in annots:
            writer.add_annotation(page_number=page_num, annotation=annot)

        # Add original metadata
        writer.add_metadata(reader.metadata)

        writer.write(output_stream)
        output_stream.seek(0)
        pdf_file = output_stream.read()

    return pdf_file


if __name__ == "__main__":
    print("--- Extract Annots ---")
    with open("extract_annots__input_file.pdf", "rb") as f:
        input_file = f.read()

    output_file, annots = extract_annots(file_bytes=input_file)

    print("\n--- Add Annots ---")
    output_file = add_annots(file_bytes=output_file,
                             annots=annots)

    with open("add_annots__output_file_pypdf2.pdf", "wb") as f:
        f.write(output_file)

The input and output files are attached:
extract_annots__input_file.pdf
add_annots__output_file_pypdf2.pdf

The following is a screenshot of the output file opened in Mac's Preview app. Here you can clearly see the Ink annotation.

pubpub-zz · 2023-12-12T21:30:05Z

@themarisolhernandez
Comparing the two files, I found that "/BS" refers to an object that does not exist. The easiest fix would be to clone the annotation while ignoring "/P". "/P" is declared as optional, so it may work without referencing the new page.
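
A dangling "/BS" suggests that some entries of the re-added annotation still point at objects of the source document. The following is a minimal, stdlib-only sketch for listing such entries before calling add_annotation(). It is not part of pypdf: foreign_refs and FakeRef are hypothetical names, and the check only duck-types on the idnum and pdf attributes that pypdf's IndirectObject exposes. Whether comparing against the writer is the right test for a given pypdf version is an assumption.

```python
from typing import Any, Iterator


def foreign_refs(obj: Any, owner: Any, path: str = "") -> Iterator[str]:
    """Yield key paths of indirect references that do not belong to `owner`."""
    # Duck typing: pypdf's IndirectObject carries `idnum` and `pdf` attributes.
    if hasattr(obj, "idnum") and hasattr(obj, "pdf"):
        if obj.pdf is not owner:
            yield path
        return  # never follow references, which avoids /P <-> /Annots cycles
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from foreign_refs(value, owner, f"{path}{key}")
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from foreign_refs(value, owner, f"{path}[{index}]")


class FakeRef:
    """Hypothetical stand-in for pypdf's IndirectObject, for demonstration only."""

    def __init__(self, idnum: int, pdf: Any) -> None:
        self.idnum = idnum
        self.pdf = pdf


# Two placeholder "documents": the reader the annotation came from
# and the writer it is being added to.
source_doc, target_doc = object(), object()
ink_annot = {
    "/Subtype": "/Ink",
    "/AP": {"/N": FakeRef(62, source_doc)},
    "/BS": FakeRef(61, source_doc),
    "/P": FakeRef(4, target_doc),
}
print(list(foreign_refs(ink_annot, target_doc)))  # ['/AP/N', '/BS']
```

With real objects, the same call would take an extracted annotation dictionary and the destination PdfWriter. Any path it reports is a candidate for cloning into the writer first.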

pubpub-zz · 2023-12-28T09:18:04Z

@themarisolhernandez Comparing the two files, I found that "/BS" refers to an object that does not exist. The easiest fix would be to clone the annotation while ignoring "/P". "/P" is declared as optional, so it may work without referencing the new page.

Any feedback on this?

pubpub-zz · 2024-04-02T19:29:17Z

I am closing this issue as there has been no news. Feel free to send an update if you want to reopen it.

themarisolhernandez assigned MartinThoma Dec 7, 2023

pubpub-zz mentioned this issue Dec 7, 2023

Add Ink Annotation Support #1959

Open

Snuffleupagus mentioned this issue Dec 9, 2023

Support Annotations with corrupt /BS-entries mozilla/pdf.js#17395

Merged

MartinThoma added the workflow-annotation Everything about annotating PDF files label Dec 24, 2023

MartinThoma removed their assignment Dec 24, 2023

py-pdf deleted a comment from pubpub-zz Dec 24, 2023

py-pdf deleted a comment from themarisolhernandez Dec 24, 2023

MartinThoma changed the title ~~Support for Adding and Viewing Ink Annotations in Mac's Preview app~~ BUG: Support for Adding and Viewing Ink Annotations in Mac's Preview app Dec 24, 2023

MartinThoma added the is-bug From a users perspective, this is a bug - a violation of the expected behavior with a compliant PDF label Dec 24, 2023

stefan6419846 mentioned this issue Dec 28, 2023

FreeText annotation not showing in chrome browser #2372

Open

pubpub-zz closed this as completed Apr 2, 2024
