[draft] try to tweak the pdfmerge to support annotations #19

Open
wants to merge 7 commits into
base: master
from

Conversation


@benneti benneti commented Apr 14, 2021

I played around with the pdfmerge function and I think it works in principle now, i.e. the annotations are in the correct place when transforming annot instead of the base.
There are still some problems, though: PyPDF2 seems to break links inside the same PDF, and I am not sure how to fix this.
Is there a particular reason why you use PyPDF2 instead of https://github.com/pmaupin/pdfrw ?

On another note, would it not make sense to bundle the effort with at least some of the other renderer projects, e.g.
https://github.com/lucasrla/remarks
and https://gitlab.com/wrobell/remt ?

@bordaigorl
Owner

Hi! Thanks for looking into this.
I did play with annotations a bit at some point, here are my findings.

First, it's not too difficult to keep the approach of transforming the base if you also manually transform the annotation:

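from PyPDF2.generic import NameObject, RectangleObject

def transformAnnot(bp, rot, ratio, tx, ty):
  # Move every annotation rectangle of page bp to match the transformed base.
  if '/Annots' in bp:
    for a in bp['/Annots']:
      annot = a.getObject()
      r = RectangleObject(annot['/Rect'])
      # The coordinates may be Decimal-based; convert so they mix with floats.
      x0, y0 = (float(v) for v in r.upperLeft)
      x1, y1 = (float(v) for v in r.lowerRight)
      if rot == 90:
        # Swap the axes for a page rotated by 90 degrees.
        x0, y0 = y0, x0
        x1, y1 = y1, x1
      annot.update({NameObject('/Rect'): RectangleObject([x0*ratio+tx, y0*ratio+ty, x1*ratio+tx, y1*ratio+ty])})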
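
To make the arithmetic concrete, here is a minimal sketch of the same mapping on plain tuples, with no PyPDF2 involved. The name `transform_rect` and the sample numbers are made up for illustration; it only mirrors the swap, scale, and shift steps of the function above.

```python
def transform_rect(rect, rot, ratio, tx, ty):
    """Map a rectangle (x0, y0, x1, y1) the way transformAnnot does.

    Swap the axes for a 90-degree page, then scale by `ratio`
    and shift by (tx, ty).
    """
    x0, y0, x1, y1 = (float(v) for v in rect)
    if rot == 90:
        x0, y0 = y0, x0
        x1, y1 = y1, x1
    return (x0 * ratio + tx, y0 * ratio + ty,
            x1 * ratio + tx, y1 * ratio + ty)


# A link rectangle on an unrotated page, scaled to half size and shifted.
print(transform_rect((100, 700, 200, 680), 0, 0.5, 10, 20))
# The same rectangle with the axes swapped first.
print(transform_rect((100, 700, 200, 680), 90, 0.5, 10, 20))
```

The two corners are transformed independently, so the result is still a pair of diagonally opposite corners, which is all a `/Rect` entry needs.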

Second, although the above puts the annotations in the right place, the links are broken.
The reason is that link destinations are references to objects in the pages dictionary. Any page manipulation breaks these references: the old objects get replaced by new ones, but the references to the old ones are not updated.
I did try to quickly put something together to correct it but I was unsuccessful.
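
The stale-reference problem can be illustrated with a toy model. This is not how PyPDF2 stores objects internally; the object table, the `copy_pages`/`copy_links` helpers, and the remap dictionary are all hypothetical and only show why a link needs its page reference rewritten after pages are copied under new object numbers.

```python
# Toy model: a file is a table of numbered objects. A link stores the
# object number of its target page, not the page itself.
old_objects = {
    1: {"type": "Page", "label": "page A"},
    2: {"type": "Page", "label": "page B"},
    3: {"type": "Link", "dest_page": 2},  # link on page A pointing at page B
}


def copy_pages(objects, first_new_id):
    """Copy every page under a fresh number; return (new table, old->new map)."""
    new_objects, remap = {}, {}
    next_id = first_new_id
    for num, obj in objects.items():
        if obj["type"] == "Page":
            new_objects[next_id] = dict(obj)
            remap[num] = next_id
            next_id += 1
    return new_objects, remap


def copy_links(objects, new_objects, remap=None):
    """Copy link objects, rewriting their page references if a map is given."""
    next_id = max(new_objects) + 1
    for obj in objects.values():
        if obj["type"] == "Link":
            link = dict(obj)
            if remap is not None:
                link["dest_page"] = remap[link["dest_page"]]
            new_objects[next_id] = link
            next_id += 1
    return new_objects


def link_target(objects):
    """Label of the page the single link points at, or None if dangling."""
    link = next(o for o in objects.values() if o["type"] == "Link")
    target = objects.get(link["dest_page"])
    if target is None or target["type"] != "Page":
        return None
    return target["label"]


pages, remap = copy_pages(old_objects, first_new_id=10)
broken = copy_links(old_objects, dict(pages))        # references left as-is
fixed = copy_links(old_objects, dict(pages), remap)  # references rewritten
print(link_target(broken), link_target(fixed))
```

In this model the fix is just a lookup in the old-to-new map; the hard part in a real PDF library is obtaining that map for the objects the library has replaced.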

> Is there a particular reason why you use PyPDF2 instead of https://github.com/pmaupin/pdfrw ?

I do not remember why I picked PyPDF2 in the end (there was some reason I cannot recall, but I remember it was relatively minor).
I think RCU uses pdfrw but had to modify it to get annotations to work properly, so it's not as easy as just porting the code.

> On another note, would it not make sense to bundle the effort with at least some of the other projects for the renderer

In principle: yes. In practice: different projects use different libraries (e.g. Qt vs Cairo vs manual SVG generation) with different tradeoffs for the rendering.
I think the main problem is that there is no official specification for the notebook format and its intended rendering, so every implementation has its own (usually incomplete) interpretation. Many things are actually quite tricky (how to render pencil in a vector format, for example, or how to handle the eraser), and decisions on some are subjective (e.g. which function to use to determine the thickness of lines?). Each project aims at supporting some extension, put together as a partial workaround (e.g. colors encoded in layer names). There seems to be no silver bullet.
In short: I don't think this is going to happen soon. The robust solution would be for reMarkable to release an official renderer.

My choice of Qt was guided by the fact that I was already writing a Qt GUI and wanted on-the-fly previews. Another side goal was to be able to experiment with the notebook format, testing line simplification and other ideas, so QGraphicsView was a good choice in view of maybe writing editing features (for post-processing or even for writing back to the rm). My plan included a generalisation of the current renderer where you could very flexibly determine how to render lines from a configuration file. That never materialised because I got the basic functionality to a stage that works for me and ran out of time.
(The project is definitely not dead, but I am adding functionality very slowly and as needed.)

@benneti
Author

benneti commented May 5, 2021

I think getting internal links in the pdfs working will be a hard problem, because of py-pdf/pypdf#370

@bordaigorl
Owner

Yes that's very disappointing.
I think the only real way to do this is to dive into the PDF reference and work with the relevant object dictionaries directly.

@benneti
Author

benneti commented May 5, 2021

I think the object dict would need to be transformed like this. I am not sure about the page reference handling, or how to get and set the object without using PyPDF2, and I hope that /XYZ is the only type linking to a point instead of only a page:

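from PyPDF2.generic import FloatObject, NameObject

def transformDest(d, rot, ratio, tx, ty):
  # Only /XYZ destinations are handled; assumes left and top are set (not null).
  if d['/Type'] == '/XYZ':
    l, t = float(d.left), float(d.top)
    if rot == 90:
      # Swap the axes for a page rotated by 90 degrees.
      l, t = t, l
    # PyPDF2 dictionaries only accept PdfObject keys and values.
    d[NameObject('/Left')] = FloatObject(l*ratio+tx)
    d[NameObject('/Top')] = FloatObject(t*ratio+ty)
  return d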

If you have an idea, let me know; otherwise I will give this up for now and clean this up a bit so that it at least works with external links.
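
On the question of which destination types carry coordinates: in the PDF reference, /XYZ is not the only one. /FitH, /FitV, /FitR, /FitBH and /FitBV also take coordinate operands, while /Fit and /FitB take none. The following is a rough sketch of scaling and shifting all of them, working on plain Python lists of the form `[page, type, *operands]` and not on PyPDF2 objects. The names `DEST_AXES` and `transform_dest` are made up, only the unrotated case is covered, and leaving the zoom factor untouched is a simplifying assumption.

```python
# Which operands of each explicit destination type are x or y coordinates.
# None marks an operand that is not a coordinate (the zoom factor of /XYZ).
DEST_AXES = {
    "/XYZ": ("x", "y", None),        # left, top, zoom
    "/Fit": (),
    "/FitH": ("y",),                 # top
    "/FitV": ("x",),                 # left
    "/FitR": ("x", "y", "x", "y"),   # left, bottom, right, top
    "/FitB": (),
    "/FitBH": ("y",),                # top
    "/FitBV": ("x",),                # left
}


def transform_dest(dest, ratio, tx, ty):
    """Scale and shift a destination given as [page, type, *operands].

    Unrotated case only. Null operands (None here) mean "keep the current
    value" and are passed through untouched, as is the zoom factor.
    """
    page, kind, *operands = dest
    out = []
    for axis, value in zip(DEST_AXES[kind], operands):
        if value is None or axis is None:
            out.append(value)
        elif axis == "x":
            out.append(value * ratio + tx)
        else:
            out.append(value * ratio + ty)
    return [page, kind, *out]


# "p3" stands in for a page reference; it is carried along unchanged.
print(transform_dest(["p3", "/XYZ", 72, 720, None], 0.5, 10, 20))
print(transform_dest(["p3", "/FitH", 700], 0.5, 10, 20))
```

The page reference is deliberately left alone here; rewriting it is the separate problem of broken references discussed earlier in the thread.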

@bordaigorl
Owner

Ah I meant the destination items in the dicts... the positions can be fixed as we established before...

@benneti
Author

benneti commented May 5, 2021

Yes, I was talking about the destinations: if they have the type "/XYZ", they point to a coordinate and therefore need to be transformed, too. (This is not handled by Annots, but it is very similar.)
