Release: practical, user and system friendly version of test.py #13

Open
Manoa1911 opened this issue May 12, 2021 · 1 comment


Manoa1911 commented May 12, 2021

  • no excessive system memory hogging
  • adjustable video memory fraction to maximize video memory utilization (see the file for details)
  • can take all images from the input directory "at once", with no need to split them into separate directories to avoid memory hogging
  • easy-on-the-eyes progress display powered by tqdm
  • can take other image formats (such as bmp), selected directly on the console
  • frame count can be specified directly on the console

tested on Python 3.6 / TF 1.15.4
requirements: place the new test_opt.py into the VSR-DUF directory and install tqdm:

pip3 install tqdm

start inferencing:

python test_opt.py 2 16 dataset 0 33492 png

where:

2 = upscale factor
16 = depth
dataset = "./inputs/dataset"
0 = start frame
33492 = end frame
png = image format (any that will work with LoadImage())
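
For readers who want to see how the six positional arguments line up, here is a minimal sketch of parsing them. It is not taken from test_opt.py: the `Args` tuple, the `parse_args` name and the field names are hypothetical, and only the argument order and the "./inputs/" prefix come from the description above.

```python
from typing import NamedTuple


class Args(NamedTuple):
    """Hypothetical container for the six console arguments."""
    scale: int        # upscale factor, e.g. 2
    depth: int        # network depth, e.g. 16
    input_dir: str    # resolved as "./inputs/<dataset>"
    start_frame: int  # first frame number to process
    end_frame: int    # last frame number to process
    image_ext: str    # image format, e.g. "png" or "bmp"


def parse_args(argv):
    """Parse arguments in the order: scale depth dataset start end format."""
    if len(argv) != 6:
        raise SystemExit("usage: test_opt.py SCALE DEPTH DATASET START END FORMAT")
    scale, depth, dataset, start, end, ext = argv
    return Args(
        scale=int(scale),
        depth=int(depth),
        input_dir="./inputs/" + dataset,
        start_frame=int(start),
        end_frame=int(end),
        image_ext=ext.lstrip("."),  # accept "png" as well as ".png"
    )
```

Calling `parse_args(sys.argv[1:])` (after `import sys`) with the example command would give scale 2, depth 16, the directory "./inputs/dataset", and the frame range 0 to 33492.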

Download

limitation: designed for 6-digit frame numbers in file names. To adjust, change the 6 in str("%06d"%counter) and in (outdir + '/{:06d}.png'.format(counter)) to the digit count your movie requires.
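
One way to avoid editing two format strings by hand is to put the digit count in a single parameter. The sketch below is an assumption about how that could look, not the code in test_opt.py; `frame_path` and its `digits` parameter are hypothetical names.

```python
import os


def frame_path(directory, counter, ext="png", digits=6):
    """Build a zero-padded frame file name such as 000042.png.

    `digits` is the padding width (hypothetical parameter); the default
    of 6 matches the file naming described above.
    """
    name = "{:0{width}d}.{ext}".format(counter, width=digits, ext=ext)
    return os.path.join(directory, name)
```

With this, `frame_path(outdir, counter)` keeps the 6-digit default, and a movie with 8-digit frame names would only need `digits=8`.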

TODO and near future plans:
improve the tqdm display to report fps instead of it/s (done)
allow specifying the digit count on the console

updated:

  • added functionality to operate on a specific range of frames
  • added error protection, now testing for existence of first/last frames
  • re-designed the progress bar
  • programmer-friendly variable names
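
The first/last frame check mentioned above could look roughly like the sketch below. This is an illustration under assumptions, not the actual check in test_opt.py: `check_frame_range` is a hypothetical name, and the 6-digit file naming is taken from the limitation note.

```python
import os


def check_frame_range(input_dir, start, end, ext="png", digits=6):
    """Fail early if the range is reversed or its first/last frame is missing."""
    if start > end:
        raise ValueError("start frame %d is after end frame %d" % (start, end))
    for counter in (start, end):
        name = "{:0{width}d}.{ext}".format(counter, width=digits, ext=ext)
        path = os.path.join(input_dir, name)
        if not os.path.isfile(path):
            raise FileNotFoundError("missing frame: " + path)
```

Only the two end points are tested, so a gap in the middle of the range would still go unnoticed until the loader reaches it.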

Manoa1911 commented May 14, 2021

new multithreaded release:

  • revolutionary increase in performance
  • simultaneous image saving and loading
  • maximum GPU feeding and utilization
  • simple and elegant threading, simple code
  • no need for an SSD or RAM drive, runs fast even on slow network drives

Download
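
To illustrate the general idea of loading and saving images while the GPU works, here is a minimal sketch using the standard `threading` and `queue` modules. It is not the threading code from the release: it uses one loader thread and one saver thread with queues (a different design from spawning a write thread per frame), and `run_pipeline`, `load`, `process`, `save` and `prefetch` are hypothetical names.

```python
import queue
import threading


def run_pipeline(frame_ids, load, process, save, prefetch=4):
    """Overlap loading, processing and saving of frames.

    load(fid) -> data, process(data) -> result, save(fid, result) are
    caller-supplied callables. `prefetch` bounds how many loaded frames
    may wait in memory, so the loader cannot hog system memory.
    """
    loaded = queue.Queue(maxsize=prefetch)
    to_save = queue.Queue()
    done = object()  # sentinel marking the end of a stream

    def loader():
        try:
            for fid in frame_ids:
                loaded.put((fid, load(fid)))
        finally:
            loaded.put(done)  # always unblock the main loop

    def saver():
        while True:
            item = to_save.get()
            if item is done:
                break
            fid, result = item
            save(fid, result)

    threads = [threading.Thread(target=loader), threading.Thread(target=saver)]
    for t in threads:
        t.start()
    # The main thread plays the role of the GPU step.
    while True:
        item = loaded.get()
        if item is done:
            break
        fid, data = item
        to_save.put((fid, process(data)))
    to_save.put(done)
    for t in threads:
        t.join()
```

In this sketch the save queue is unbounded, so if saving were slower than processing, results would pile up in memory rather than as extra threads.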

updated:

  • dynamic/adaptive balancing of image saving and loading to prevent IO competition between the load thread and the save thread
  • the mechanism prevents IO cross-loading and prioritizes image loading over image saving; because the process is serial, this keeps the GPU fed and increases overall performance
    WARNING: the architecture of the code assumes that the GPU is the bottleneck in the process. The program may fail if the IO part of the code is slower than the GPU part :( (highly unlikely in any real situation), in which case the problem would be uncontrolled spawning of IO write threads
  • note: this optimization benefits only slow storage systems such as network drives; faster storage systems may see little to no increase in performance, because their bandwidth and latency are good enough that they do not slow the application (or only minimally)
  • this update increased the framerate by 0.002 fps on my K40m, which translates to ~0.71%
    if your CPU automatically adjusts its frequency based on load, it is recommended to set the affinity of this process so that it runs on one specific core. This prevents "core tourism" and unnecessary core up/down clocking (the main thread of this project uses some CPU, on which the total performance depends)
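
On Linux the affinity can also be set from inside Python instead of with an external tool. The sketch below assumes a Linux system (`os.sched_setaffinity` is not available on Windows or macOS), and `pin_to_core` is a hypothetical helper that is not part of test_opt.py.

```python
import os


def pin_to_core(core):
    """Pin the current process to one CPU core.

    Returns True on success, and False if the platform lacks
    os.sched_setaffinity or the core is not available to this process.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core not in os.sched_getaffinity(0):  # 0 means the calling process
        return False
    os.sched_setaffinity(0, {core})
    return True
```

Calling `pin_to_core(0)` near the start of the script would keep the process on core 0. On Windows the same effect is usually achieved through Task Manager or `start /affinity`.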
