[Feature request] Save the URL of each image to a text file by making only one HTTP request (or two if the gallery has two pages, or three if it has three pages, etc) #303
Comments
If you want the page links, just make sure you've set Settings -> Advanced -> Record and save gallery info as File info.txt, and checked includes Page Links, which is the default setting. Saving image links is not available, since an image link is only valid for a short time; after that you'll see an error. Renaming the files is not possible for now, but if you just want the image token, I guess you can compute the SHA-1 hash of the image file and take the first 10 characters.
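The token guess above can be sketched in a few lines of Python. This is a minimal sketch, assuming the token really is the first 10 hex characters of the file's SHA-1 digest, as guessed above; `image_token` is a hypothetical helper name, not part of the script.

```python
import hashlib


def image_token(path, length=10):
    """Return the first `length` hex characters of the file's SHA-1 digest.

    Assumption (taken from the guess above): the image token equals the
    start of the SHA-1 hash of the image file.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large images are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]
```

Calling `image_token("some_image.jpg")` on a local file returns a 10-character string that can be compared with the token part of a page link.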
I don't want the image URL. I forgot to add that I want it to perform the action without wasting any GP, credits or hath, and without making too many page requests. I think it should just make one page request if all of the image URLs are in one page.
So do you mean you want the page links or the image links, without spending GP or downloading images?
Preferably none of those. Does my edit not show up? What I'm really looking for is saving the page number, image token (SHA-1) and the extension of every image file, e.g. https://exhentai.org/s/d1b07750bd/3042776-1 will be saved as 0001_d1b07750bd.jpg. If you're wondering why, it's because the latest restriction has made it impossible for anyone without GP, credit or hath to download the original images. Most people don't have enough GP, credit or hath to download a single gallery. Our only option is to use torrents, and these torrents often have unsorted image files.
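For illustration, the naming scheme in the example above could be derived from a page link roughly like this. It is only a sketch: `page_link_to_stem` and the regular expression are hypothetical, and it assumes page links always end in `<token>/<gallery id>-<page number>` as in the example. The link itself carries no file extension, so only the `0001_d1b07750bd` part is produced.

```python
import re

# Assumed page link shape, based on the example: .../s/<10 hex chars>/<gallery id>-<page>
PAGE_LINK = re.compile(r"(?P<token>[0-9a-f]{10})/(?P<gid>\d+)-(?P<page>\d+)")


def page_link_to_stem(link, width=4):
    """Turn a page link into '<zero-padded page>_<token>' (no extension)."""
    m = PAGE_LINK.search(link)
    if m is None:
        raise ValueError(f"not a recognisable page link: {link!r}")
    return f"{int(m['page']):0{width}d}_{m['token']}"


print(page_link_to_stem("https://exhentai.org/s/d1b07750bd/3042776-1"))
# prints 0001_d1b07750bd
```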
I did see that but didn't understand it, probably because it's nearly 6 AM in my timezone and I need some sleep. 😴
So do you want to rename the downloaded files? For now it's not possible, but Soon™. If you just want to extract them from the page links and get a plain-text list for such naming (d1b07750bd/3042776-1 -> 0001_d1b07750bd.jpg), you can probably try the first option's code to get the page links, and ask GPT to write an automated script for you. However, the page link only contains the page number and image token; the file extension is not available. You would need to extract the file extension from the thumbnail URL, but that URL doesn't include the page number, and the script doesn't actually grab thumbnail URLs, so you'd need to DIY. An example of the image grid's page source code:
That restriction wasn't introduced in the latest update, but last year. The latest change just hides image limits for normal users; all the other rules are still the same as in the previous 2023-08 update (the latest galleries can be grabbed with image limits only, with exceptions for peak hours and/or old galleries).
If what you mean is to download with torrents, then calculate the hash of each file and compare it with the image link, that would make sense, but in that case you probably only need the page number and page token. I still don't quite understand why you need the file extension, since you've already got the files from the torrent. To order them you'd surely need to write a script anyway, and then you can just take the file extension from the file name. Time to sleep; if you have anything to update, I'll reply ~10 hours later. Sorry to keep you waiting. 😴
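A rough sketch of that torrent workflow: hash every file in the torrent folder, look the token up among the page links, and build an ordered name. It assumes the token is the first 10 hex characters of the file's SHA-1 and that page links look like `<token>/<gallery id>-<page>`; `plan_renames` and `sha1_token` are hypothetical names. The extension is simply copied from the existing file name, so it reflects whatever the torrent contains.

```python
import hashlib
import re
from pathlib import Path

# Assumed page link shape: .../s/<10 hex chars>/<gallery id>-<page>
PAGE_LINK = re.compile(r"([0-9a-f]{10})/\d+-(\d+)")


def sha1_token(path, length=10):
    """First `length` hex characters of the file's SHA-1 (assumed image token)."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()[:length]


def plan_renames(page_links, folder, width=4):
    """Map each matching file name in `folder` to '<page>_<token><suffix>'."""
    pages = {}
    for link in page_links:
        m = PAGE_LINK.search(link)
        if m:
            # If the same image appears on several pages, the last page wins.
            pages[m.group(1)] = int(m.group(2))

    plan = {}
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        token = sha1_token(path)
        if token in pages:  # files without a matching page link are skipped
            plan[path.name] = f"{pages[token]:0{width}d}_{token}{path.suffix}"
    return plan
```

The function only returns a mapping of old name to new name, so the plan can be reviewed before anything is actually renamed, for example with `Path.rename`.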
No, I want the page number, the image token and the extension of every image saved to a text file.
Sometimes whoever creates the torrent likes to change the extension from png to jpg or jpg to jpeg. You'd be surprised how often they do it.
Then I'm afraid it's not available. To avoid the case you describe (see #2, which is just your case), the script extracts the filename from the HTTP response header of the image file request, so that it uses the original file name and the correct file extension, and that definitely costs limits or GP. Since the script is focused on downloading files, I'm not going to add a feature to extract the file extension from the thumbnail URL (and the thumbnail URL is only available when you switch to the large thumbnail grid layout…).
It can be a separate script.
Then I'd say do it yourself, since it's not related to the script's function. And I really do need some sleep, truly sorry for that. 🥲
Just to let you know, in case you're still working on this: since EH is migrating its image system to v2, there's no longer any way to extract that file information from the thumbnail URL, because the larger thumbnail images are sprites, too. This already applies to newer galleries and is rolling out to old galleries, so I'm afraid the only way to get the expected file name is to download the original image.
Edit: Instead of saving the URL, it'd be even better if it could save the page number, image token and the extension of every image file, e.g. https://exhentai.org/s/d1b07750bd/3042776-1 will be saved as 0001_d1b07750bd.jpg.