
Memory profiling and fixing memory leak #270

Merged · 33 commits · Jul 2, 2023

Conversation

@angelala3252 (Collaborator) commented Jun 15, 2023

What kind of change does this PR introduce?
Bugfix and add memory profiling.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • If applicable, I have added tests to prove my fix is functional/effective
  • I have linted my code locally prior to submission
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (e.g. README.md, requirements.txt)
  • New and existing unit tests pass locally with my changes

Summary

This PR responds to Issue #5. The goal is to add memory profiling to record.py, add memory usage to performance plots, and fix the memory leak caused by the screenshots described here.

How can your code be run and tested?

git remote add angelala3252 https://github.com/angelala3252/OpenAdapt.git
git fetch angelala3252
git checkout feat/memory
pip install -r requirements.txt
python -m openadapt.record "testing out openadapt"

@angelala3252 angelala3252 marked this pull request as draft June 15, 2023 18:14
@KrishPatel13 (Collaborator)

@angelala3252
Resolved merge conflicts. Let me know if there are any issues :)

stat.size_diff / 1024,
stat.size / 1024,
stat.count_diff,
stat.count))
Member:

What do you think about:

        new_KiB = stat.size_diff / 1024
        total_KiB = stat.size / 1024
        new_blocks = stat.count_diff
        total_blocks = stat.count
        logger.info(f"{new_KiB=} {total_KiB=} {new_blocks=} {total_blocks=}")

Collaborator Author:

    for stat in stats[:3]:
        new_KiB = stat.size_diff / 1024
        total_KiB = stat.size / 1024
        new_blocks = stat.count_diff
        total_blocks = stat.count
        source = stat.traceback.format()[0]
        logger.info(f"{new_KiB=} {total_KiB=} {new_blocks=} {total_blocks=} {source=}")

output:
[image: logged tracemalloc output]
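The pattern above can be exercised end-to-end with a small self-contained sketch (stdlib tracemalloc only; the allocation and its size are illustrative, not taken from the PR):

```python
import tracemalloc

tracemalloc.start()
snap_before = tracemalloc.take_snapshot()

# allocate roughly 1 MiB on a single line so it dominates the diff
blocks = [bytearray(1024) for _ in range(1024)]

snap_after = tracemalloc.take_snapshot()
stats = snap_after.compare_to(snap_before, "lineno")

# log only the top few entries, as NUM_MEMORY_STATS does in the PR
for stat in stats[:3]:
    new_KiB = stat.size_diff / 1024
    total_KiB = stat.size / 1024
    new_blocks = stat.count_diff
    total_blocks = stat.count
    source = stat.traceback.format()[0]
    print(f"{new_KiB=:.1f} {total_KiB=:.1f} {new_blocks=} {total_blocks=} {source=}")
```

compare_to sorts the diff by largest change first, so the slice keeps the biggest allocators and drops the long tail of few-byte allocations.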

for line in stat.traceback.format():
    print(line)

tr.print_diff()
Member:

Is there a way to get the value as a string, e.g.:

trace_str = tr.get_diff_str()
logger.info(f"trace_str=\n{trace_str}")

Collaborator Author:

Yes, this is the output now:
[image: logged trace_str output]
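For libraries that only expose a print-style API (as tr.print_diff() does here), a generic stdlib fallback is to capture stdout with contextlib.redirect_stdout. A minimal sketch, where noisy_report is a hypothetical stand-in for such an API:

```python
import contextlib
import io

def noisy_report():
    # hypothetical stand-in for an API that prints instead of returning text
    print("types |   # objects |   total size")
    print(" list |          12 |       1.2 KB")

buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    noisy_report()

trace_str = buf.getvalue()
print(f"trace_str=\n{trace_str}")
```

This works even when the library offers no string-returning variant, at the cost of briefly redirecting the whole process's stdout.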

@KrishPatel13 KrishPatel13 removed their assignment Jun 16, 2023
@angelala3252 (Collaborator Author)

angelala3252 commented Jun 16, 2023

Updates:

  • The memory leak seems to have been resolved after adding del prev_event to process_events; sleeping for any amount of time in read_screen_events had no effect.

Comparison of memory usage before and after the changes, on a shorter recording.
Before: tracemalloc found 202 MB from screenshots, and pympler found 189.85 MB.
Performance graph:
[image: performance graph before changes]

After adding del prev_event:
tracemalloc found 16 MB from screenshots, and pympler found 7.91 MB.
Performance graph:
[image: performance graph after adding del prev_event]

This is a significant decrease in memory usage, and I also tested it with longer recordings and got great results. Before, longer recordings would cause PyCharm to take up to 4,000–5,000 MB of memory when a lot of input was given, and around 2,000–3,000 MB when no input was given. After this change, PyCharm only takes ~1,200 MB on longer recordings and stays fairly constant even when a lot of input is given.
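The effect of del prev_event can be illustrated with a stdlib-only sketch (Screenshot and the loop are simplified stand-ins for the PR's event handling, and the counts rely on CPython's reference counting):

```python
import weakref

class Screenshot:
    """Stand-in for an event holding a large screenshot buffer."""
    def __init__(self, data):
        self.data = data

def peak_alive(fix_leak, n=5):
    """Return the peak number of Screenshot objects alive at once."""
    refs = []  # weak references let us observe when events are freed
    peak = 0
    prev_event = None
    for data in range(n):
        event = Screenshot(data)
        refs.append(weakref.ref(event))
        # the previous event is still alive here unless it was deleted
        peak = max(peak, sum(r() is not None for r in refs))
        prev_event = event
        if fix_leak:
            # in the real code prev_event is used before being released;
            # the comparison step is elided in this sketch
            del prev_event
    return peak
```

Without the fix, two events are alive at once (the current one plus the one still bound to prev_event); with it, only the current one survives, which is why the screenshots stopped accumulating.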

@abrichr (Member)

abrichr commented Jun 16, 2023

Excellent work @angelala3252 ! What else is required here in order to be ready for review?

requirements.txt (outdated; resolved)
openadapt/record.py (outdated; resolved)
openadapt/record.py (outdated; resolved)
@angelala3252 (Collaborator Author)

@abrichr The only thing left to do is to plot the memory usage over time.

@@ -32,12 +40,15 @@
     "action": True,
     "window": True,
 }
-PLOT_PERFORMANCE = False
+PLOT_PERFORMANCE = True
Member:

What do you think about moving this to config.py?

Collaborator Author:

done in a9828dd

openadapt/record.py (outdated; resolved)

stats = performance_snapshots[-1].compare_to(performance_snapshots[-2], 'lineno')

for stat in stats[:3]:
@abrichr (Member), Jun 17, 2023:
Why stats[:3] and not just stats?

@angelala3252 (Collaborator Author), Jun 23, 2023:
Using just stats would show every single memory allocation, and there are a lot, so it fills the log with insignificant allocations of only a few bytes. I've defined a constant NUM_MEMORY_STATS for the 3 in e96b942, so we can change it to 5 or 10 or however many stats we want to see.

@angelala3252 (Collaborator Author)

Example memory graph:

[image: example memory graph]

@angelala3252 (Collaborator Author)

  • added memory to the legend in 2918fc6
  • updated code to write memory data to database instead of a file in 3a8d4b3

@abrichr (Member)

abrichr commented Jun 23, 2023

Thanks @angelala3252 ! Can you please show an updated screenshot? Also please fix merge conflicts 🙏

@angelala3252 (Collaborator Author)

Updated screenshot:
[image: updated performance plot with memory in the legend]

@angelala3252 angelala3252 marked this pull request as ready for review June 23, 2023 19:11
@angelala3252 (Collaborator Author)

@abrichr resolved merge conflicts and addressed comments

openadapt/record.py (outdated; resolved)
-PLOT_PERFORMANCE = False
+PLOT_PERFORMANCE = config.PLOT_PERFORMANCE
+BYTE_TO_MB = float(2**20)
+NUM_MEMORY_STATS = 3
Member:

Is this just to limit the log output?

Collaborator Author:

Yeah, without it every single memory allocation would be displayed, even the tiny insignificant ones. If you want more output I could change it to something like 5 or 10.

openadapt/utils.py (outdated; resolved)
@angelala3252 (Collaborator Author)

@abrichr ready for review


id = sa.Column(sa.Integer, primary_key=True)
recording_timestamp = sa.Column(sa.Integer)
memory_usage = sa.Column(ForceFloat)
Member:

@angelala3252 can you please rename this to something more descriptive, e.g. memory_usage_kb (or similar)?

Also, I noticed in the migration that the type does not match:

                    sa.Column('memory_usage', sa.Integer(), nullable=True),
                    sa.Column('timestamp', sa.Integer(), nullable=True),

And when I try to run alembic upgrade head I get:

KeyError: '3d3239ae4849'

This suggests to me that you created the revision off of a stale database state.

Can you please:

  • delete openadapt.db
  • check out main
  • alembic upgrade head
  • check out this branch
  • alembic revision ...
  • replace the current migration script with the new one

You may need to import the ForceFloat type.

Collaborator Author:

done in fb290bb and 0d0c842

@abrichr (Member)

abrichr commented Jul 2, 2023

Thanks @angelala3252 ! I get this exception when creating a recording:


2023-07-02 10:45:43.902 | INFO     | openadapt.config:<module>:130 - DIRNAME_PERFORMANCE_PLOTS=performance
2023-07-02 10:45:45.232 | INFO     | __main__:wrapper_logging:88 -  -> Enter: record(foo)
2023-07-02 10:45:45.252 | INFO     | __main__:record:614 - task_description='foo'
2023-07-02 10:45:45.253 | INFO     | __main__:wrapper_logging:88 -  -> Enter: create_recording(foo)
2023-07-02 10:45:45.310 | INFO     | __main__:create_recording:555 - recording=Recording(id=5, timestamp=1688309145.2541811, monitor_width=1512, monitor_height=982, double_click_interval_seconds=0.5, double_click_distance_pixels=5, platform='darwin', task_description='foo')
2023-07-02 10:45:45.310 | INFO     | __main__:wrapper_logging:92 -  <- Leave: create_recording(Recording(id=5, timestamp=1688309145.2541811, monitor_width=1512, monitor_height=982, double_click_interval_seconds=0.5, double_click_distance_pixels=5, platform='darwin', task_description='foo'))
2023-07-02 10:45:45.312 | INFO     | __main__:wrapper_logging:88 -  -> Enter: read_window_events(<queue.Queue object at 0x142efb9d0>, <multiprocessing.synchronize.Event object at 0x142fc0280>, 1688309145.2541811)
2023-07-02 10:45:45.318 | INFO     | __main__:read_window_events:429 - starting
2023-07-02 10:45:45.329 | INFO     | __main__:process_events:132 - starting
2023-07-02 10:45:45.335 | INFO     | __main__:read_screen_events:402 - starting
2023-07-02 10:45:45.335 | INFO     | __main__:read_screen_events:402 - starting
2023-07-02 10:45:45.538 | INFO     | __main__:read_window_events:448 - _window_data={'title': 'Terminal OpenAdapt — tmux — 214×51', 'left': 0, 'top': 38, 'width': 1512, 'height': 944, 'window_id': 204051}
2023-07-02 10:45:45.538 | INFO     | __main__:read_window_events:448 - _window_data={'title': 'Terminal OpenAdapt — tmux — 214×51', 'left': 0, 'top': 38, 'width': 1512, 'height': 944, 'window_id': 204051}
2023-07-02 10:45:47.155 | INFO     | __mp_main__:wrapper_logging:88 -  -> Enter: performance_stats_writer(<multiprocessing.queues.Queue object at 0x109ff0790>, 1688309145.2541811, <multiprocessing.synchronize.Event object at 0x143243880>)
2023-07-02 10:45:47.162 | INFO     | __mp_main__:performance_stats_writer:479 - performance stats writer starting
2023-07-02 10:45:47.165 | INFO     | __mp_main__:wrapper_logging:88 -  -> Enter: write_events(action, <function write_action_event at 0x142b91a20>, <multiprocessing.queues.Queue object at 0x1098f4790>, <multiprocessing.queues.Queue object at 0x142baf880>, 1688309145.2541811, <multiprocessing.synchronize.Event object at 0x142bafbb0>)
2023-07-02 10:45:47.172 | INFO     | __mp_main__:write_events:273 - event_type='action' starting
2023-07-02 10:45:47.185 | INFO     | __mp_main__:wrapper_logging:88 -  -> Enter: write_events(screen, <function write_screen_event at 0x144ec1bd0>, <multiprocessing.queues.Queue object at 0x10bbec790>, <multiprocessing.queues.Queue object at 0x144edf880>, 1688309145.2541811, <multiprocessing.synchronize.Event object at 0x144edfbb0>)
2023-07-02 10:45:47.192 | INFO     | __mp_main__:write_events:273 - event_type='screen' starting
2023-07-02 10:45:47.195 | INFO     | __mp_main__:wrapper_logging:88 -  -> Enter: write_events(window, <function write_window_event at 0x1429d1c60>, <multiprocessing.queues.Queue object at 0x109750790>, <multiprocessing.queues.Queue object at 0x1429ef880>, 1688309145.2541811, <multiprocessing.synchronize.Event object at 0x1429efbb0>)
2023-07-02 10:45:47.194 | INFO     | __mp_main__:memory_writer:501 - Memory writer starting
2023-07-02 10:45:47.202 | INFO     | __mp_main__:write_events:273 - event_type='window' starting
Process Process-5:
Traceback (most recent call last):
  File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/abrichr/oa/OpenAdapt/openadapt/record.py", line 517, in memory_writer
    crud.insert_memory_stat(
  File "/Users/abrichr/oa/OpenAdapt/openadapt/crud.py", line 118, in insert_memory_stat
    _insert(memory_stat, MemoryStat, memory_stats)
  File "/Users/abrichr/oa/OpenAdapt/openadapt/crud.py", line 39, in _insert
    assert not event_data, event_data
AssertionError: {'memory_usage': 889.0}

It looks like memory_usage should be renamed memory_usage_bytes in crud.insert_memory_stat.

process = psutil.Process(record_pid)

while not terminate_event.is_set():
    mem_usage = getattr(process, "memory_info")()[0] / BYTE_TO_MB
@abrichr (Member), Jul 2, 2023:

@angelala3252 can you please clarify why it is necessary to use getattr here instead of just process.memory_info?

Also, why use index access instead of attribute access? It's a named tuple, so we should be able to say .rss instead of [0].

Also, why is it necessary to divide by BYTE_TO_MB? Isn't this value already in bytes?
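The reviewer's points can be checked with a stdlib sketch (a namedtuple with illustrative values standing in for the result of psutil's Process.memory_info(), whose .rss field is in bytes):

```python
from collections import namedtuple

# stand-in for the namedtuple-like value psutil's memory_info() returns
pmem = namedtuple("pmem", ["rss", "vms"])
info = pmem(rss=150 * 2**20, vms=400 * 2**20)  # illustrative byte counts

BYTE_TO_MB = float(2**20)

# it is a named tuple, so attribute and index access are equivalent,
# and info.rss reads much better than info[0]
assert info.rss == info[0]

# .rss is in bytes, so dividing by 2**20 converts to MB (strictly, MiB)
mem_usage = info.rss / BYTE_TO_MB
print(mem_usage)
```

On the getattr question: since the attribute name is a fixed literal here, plain process.memory_info() is equivalent and clearer; getattr only earns its keep when the name is chosen at runtime.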

timestamps = []
mem_usages = []
for mem_stat in mem_stats:
    mem_usages.append(mem_stat.memory_usage)
Member:

2023-07-02 11:06:12.633 | INFO     | __mp_main__:memory_writer:525 - Memory writer done
Traceback (most recent call last):
  File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/abrichr/oa/OpenAdapt/openadapt/record.py", line 764, in <module>
    fire.Fire(record)
  File "/Users/abrichr/Library/Caches/pypoetry/virtualenvs/openadapt-VBXg4jpm-py3.10/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/abrichr/Library/Caches/pypoetry/virtualenvs/openadapt-VBXg4jpm-py3.10/lib/python3.10/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/abrichr/Library/Caches/pypoetry/virtualenvs/openadapt-VBXg4jpm-py3.10/lib/python3.10/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/abrichr/oa/OpenAdapt/openadapt/record.py", line 90, in wrapper_logging
    result = func(*args, **kwargs)
  File "/Users/abrichr/oa/OpenAdapt/openadapt/record.py", line 753, in record
    utils.plot_performance(recording_timestamp)
  File "/Users/abrichr/oa/OpenAdapt/openadapt/utils.py", line 482, in plot_performance
    mem_usages.append(mem_stat.memory_usage)
AttributeError: 'MemoryStat' object has no attribute 'memory_usage'

@abrichr abrichr merged commit ef0d5eb into OpenAdaptAI:main Jul 2, 2023
@abrichr abrichr mentioned this pull request Jul 4, 2023