-
-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow creating multiple PDFs in parallel #14
Comments
I believe this is because only one client can connect to Chrome over its exposed port at a time. We should investigate a solution similar to this one: cyrus-and/chrome-remote-interface#92 (comment). |
A work around for now is to not specify a port. That way we get a new instance of chrome spawned per request. Not optimal for performance, but I assume if you had several pages to load it would eventually be faster to spawn multiple than to do it in series. const google = htmlPdf.create('http://google.com')
const twitter = htmlPdf.create('http://twitter.com')
google.then(pdf => pdf.toFile(__dirname + '/google.pdf'))
twitter.then(pdf => pdf.toFile(__dirname + '/twitter.pdf')) |
Potential solution is creating a node-pool and starting google chrome through |
So I recently built a node express app on top of this. I accept a URL or HTML via GET/POST requests, then pass them to this and return the printed pdf. One of the things I was wondering about is what happens when two requests happen simultaneously. I have chrome-beta running headless via a systemd service file to keep it running. Will that cause an error if I get two requests at the same time or at least overlapping enough that it'll interrupt the first one loading etc? |
So in re-writing my node app to do parallel html->pdf I stopped using html-pdf-chrome but got it working. The only part I don't have working is closing tabs. I'm new to node so there may be issues I'm unaware of but here's how I accomplished it. |
What do you use for parallel html to pdf now. ive been using horsemanjs and it works but throws errors every now and then.
…On Jul 24, 2017, 10:32 PM, at 10:32 PM, Nathanael Noblet ***@***.***> wrote:
So in re-writing my node app to do parallel html->pdf I stopped using
html-pdf-chrome but got it working. The only part I don't have working
is closing tabs. I'm new to node so there may be issues I'm unaware of
but
[here's](https://github.com/NobletSolutions/remote-pdf-printer/blob/master/api/controllers/printPdfController.js)
how I accomplished it.
--
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
#14 (comment)
|
@gnat42 there's a fundamental issue with your solution, which I described in my first comment on this thread.
Your solution tries to connect to a single instance Chrome for each PDF generation with a new client each time. The first request will succeed, since it is the first connection. However, while the first connection is still open, any additional connections will fail. I only see two paths forward:
I believe the second is the correct way to solve this, which is why I mentioned it. Basically we have one sustained connection to Chrome. As requests come in, we create new tabs in Chrome and add them to a queue once they finish loading. We then cycle through each loaded tab as we're able, generate our PDF, and finally close the tab. |
So I used ab -n 40 -c 5 to test and none failed. Then I used parallel to do 3 different ab tests with different urls. again it all worked. I'm pretty sure my code is doing the second solution isn't it? I should mention that I kept my original non-async code and ab couldn't do more than one request at a time. |
The last issue I was facing was I wasn't able to properly close the tabs. However in cyrus-and/chrome-remote-interface#214 I got some answers about how to close it. From what I can see this works. For each request I make I get an additional chrome process spawned from the main one that is listening which from what I can tell is one per tab + the first few that are started when I start chrome headlessly. It then closes the process when done. Previously each tab remained and I had a lot of chrome processes hanging around. Since I'm not auto-spawning chrome I'm pretty sure each of these are a tab. In any case I think that perhaps this is the same way this library could be modified to produce the PDFs in parallel. Any other comments/areas/potential bugs would be greatly appreciated since I'm quite new to nodejs. |
This should be fixed in #24 where I'm implementing the use of tabs. I haven't tested with pdfs, but I was having the same issue with screenshots and this solved it. |
BTW, I had previously read about the need to "bring a tab forward" before taking a screenshot, hence the need for a queue of sorts, but in my tests (albeit limited) this wasn't necessary. |
I receive
Error: unexpected server response (500)
when attempting to create multiple PDFs in parallel.I have two questions related to this:
Example creating PDFs in parallel:
Results with
Error: unexpected server response (500)
Example creating PDFs in series:
Behaves as expected with two pdf file outputs
Note: I can create the PDFs in parallel if I use multiple instances of Chrome, but this is obviously not ideal.
The text was updated successfully, but these errors were encountered: