-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix missing object.coffee outputs. #375
Conversation
I have finished a module that can be integrated with pdfkit that will add a new function to the pdfkit prototype that will allow us to import multiple PDF into the current pdf stream. The code is now working on my box; but the only piece that is missing so that people don't have to pull "my" copy of pdfkit is the ability for me to output a value the way I need to to the stream. (In the only case I haven't been able to work around in the pdfkit) So I would be appreciate if we could get this merged in and deployed as soon as you have some time. |
I should also say that as I was testing I realized just now that "true", "false", "null" and "" all also have to be passed through the system as a isRaw as they are string values but not a "/key" and not a "(string)" but actually the raw text. So I actually have a couple things now using that minor change to your objects.convert function. |
Boolean values, null, etc. are already handled by the else case here which converts them to strings correctly. I'm curious what other "hex" values you need to import? Do you have an example? |
I'll double check that true/false/null gets passed through, I left them as strings as they came in as string characters; but switching them to a real value is easy. However the "" (Which can be considered as blank or null depending on the key) and hex values (on PDF 1.7 Spec page 56) are still handled with the isRaw. As for the usage; in the 100 random pdf's that I'm playing with I've seen hex values primarily in the ColorSpace key, but I've also seen it anywhere MultiByte strings are supposed to be. And their are a couple keys in the spec that are required to be Hex (For example page 185&186, SubType) |
I've confirmed with some minor alterations that null/true/false will work fine, now I just need Hex and blank. I should probably state; when I import the pdf; I read the entire structure and all keys, strings, dictionary's, etc; then I spit out the all the data into pdfkit structures. This allows me to use pdf kit to create a page; import pages; create more pages; import pages; and then create more pages... I don't modify any data other than all the references are rewritten (using your pdfreference) to be updated to stay in sync in the new document. So if I get passed in a Hex value; it gets spit back out as a hex value. The idea is to do as few changes to the stream as possible, this allows most the features of the existing pdf to continue to work even without me directly supporting them. |
I'm already not really a fan of the |
I'm not a big fan of the isString / isRaw hacks either, & the only reason I went with isRaw (rather than .isHex) is it allowed me to standardize how to do different outputs depending on the object passed in; a "" gets output as "", not as ("") or /"" and hex gets output as . So creating new "objects" to output anything was simple. However, updating the code to support the Buffer object is also a good solution. Since I really need this support; I am willing to do this which ever way you want:
|
Let's go with the buffer option to create hex strings. Browserify automatically handles buffer support in the browser. As for Where are you seeing empty strings without parentheses or angle brackets around them? I don't see anything in the spec for that... they say empty strings should be enclosed with parens. |
… replace with the Buffer object so that anything can be supported.
I've updated the code to eliminate the .isRaw/.isString hacks; so they all now use Buffer objects. I've tested my code against it and everything appears to be working properly. As for the Empty string/value; I think I saw it somewhere in the spec that it could be considered null (empty) in some cases; but the main thing is that I've ran into several pdf's that the code was: |
Any news on if this will be accepted (again I'm willing to do whatever changes are needed); or do I need to point my repository to my own copy of PDFKit so that I can have a automatic deploy of both my pdfimport and pdfkit. |
Sorry, there were merge conflicts here. I spent some time reworking this in 22a9bfd. I realized that JavaScript already has two string representations: literal and What this means for you is that you'll need to parse the hex strings you get in as buffers, taking off the angle brackets: e.g. I found the bit about null objects in the spec. From my reading, I think it means references that don't point to a valid object become null:
I haven't seen anything about the contents of the referenced definition being empty. Have you tried just dropping them? Or how about putting |
Only problem is now with your '<' + object.toString('hex') + '>' is that you are forcing buffer to be a hex. I would much much rather have this be just a straight: "object.toString();" so that I can control what gets output. The reasons why is:
So any change I can get you to change it back to what my pull was of just the |
The reason I did this was that making hex strings out of binary buffers could be useful for other purposes in the future, other than just what you're doing here. You already have to make a The spec seems clear enough to me:
Containing nothing is invalid, pdf readers just happen to accept it. Can you try setting the value to |
I switched it to null, and replaced my code to use your new String and Buffer methods and I am good to go. Thanks for getting these changes in! Any idea when npm will have it? |
Sorry about the wait. Just released v0.7.1. Was trying to get together a bigger release, but it's taking longer than I hoped. |
I have been working on a "optional" loaded javascript file that will import a pdf into a new pdf for output. Everything works other than pdfkit currently doesn't support outputting a "hex" value. i.e < 0A >, the Object.convert wants to either treat it as a String key i.e. /0A or as a Object << 0A >>