
Run on a single port using Netty #109

Merged · 42 commits · May 2, 2024

Conversation

hopskipnfall (Owner)

No description provided.

@hopskipnfall (Owner, Author) commented Jan 21, 2024

From some tests I ran, the current version is actually faster than the version at HEAD (0.11.2) even on single-core machines, which is what I was struggling with before.

The solution was to use a shared threadpool with a fixed 10 threads for listening to requests on a permanent coroutine, sending outgoing messages from a queue on a permanent coroutine, and handling requests. I also updated KailleraUser.addEvent to be a suspend function that does not divert execution to a different threadpool for server-triggered events (e.g. timeouts).
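The queue part of that model could be sketched roughly like this. This is a minimal, illustrative plain-Java sketch (the actual server uses Kotlin coroutines, and the class and method names here are mine, not the server's): a shared fixed-size pool, with one permanent task acting as the sole consumer of an outgoing-message queue.

```java
import java.util.concurrent.*;

// Illustrative sketch only: a shared fixed pool of 10 threads, with one
// permanent "sender" task draining a queue of outgoing messages in order.
public class SingleQueueSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(10);
    private final BlockingQueue<String> outgoing = new LinkedBlockingQueue<>();

    public SingleQueueSketch() {
        // Permanent sender loop: the only consumer of the outgoing queue.
        pool.submit(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    send(outgoing.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // shutdown requested
            }
        });
    }

    /** Called by request handlers; never blocks on the socket itself. */
    public void enqueue(String message) {
        outgoing.add(message);
    }

    /** Stand-in for the actual socket write. */
    protected void send(String message) {}

    public void shutdown() {
        pool.shutdownNow();
    }
}
```

Because a single consumer drains the queue, outgoing messages for the socket stay ordered without per-message synchronization.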

There is no doubt more tuning that can be done to increase efficiency. Some things I would like to try:

  • After new game data is added, instead of each handler suspending until all user data comes in, wait until all of the data has arrived and then send the packets to all users at once.
  • Use a dynamically sized threadpool based on users present and number of cores
  • Remove the per-user request handling mutex. Replace these with more fine-grained mutexes.
  • Reuse outgoing buffers instead of allocating a new one for every message! I tried this before but ran into some strange, seemingly Windows-specific behavior. This should have a huge impact on GC.
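The buffer-reuse idea in the last bullet could look something like this. A minimal sketch using java.nio (not the server's actual buffer types; thread-safety and sizing policy are deliberately omitted): keep a pool of buffers, and clear and reuse them instead of allocating one per message.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Illustrative sketch: reuse outgoing buffers instead of allocating a new
// one for every message, so steady-state sends produce no garbage.
public class BufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufferSize;

    public BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Returns a cleared buffer, reusing a pooled one when available. */
    public ByteBuffer acquire() {
        ByteBuffer buf = free.poll();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(bufferSize);
        }
        buf.clear(); // reset position/limit for the next message
        return buf;
    }

    /** Returns a buffer to the pool once the send has completed. */
    public void release(ByteBuffer buf) {
        free.push(buf);
    }
}
```

The tricky part (and likely where the Windows-specific behavior came from) is releasing only after the socket has genuinely finished with the buffer.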

KNOWN ISSUE: If you ping the server and then try to join, it will not let you in for some reason. Joining the server directly works fine.

@hopskipnfall hopskipnfall self-assigned this Jan 21, 2024
@hopskipnfall (Owner, Author)

I got a trial of IntelliJ IDEA Ultimate Edition and tried out the profiler.

Here is an overview of what I've learned about our code at HEAD from a 15 minute test with 1 player in game, all running on my desktop.

Memory

[screenshot]

Total usage: 103 MB

This divides roughly into thirds: handling user requests, periodically reporting to master lists, and some kind of ktor business logic running behind the scenes supporting the master list requests.

[screenshot]

That leaves us with 38 MB from our core logic that we're concerned with in this PR.

CPU

[screenshot]

Total usage: 1.909s

@hopskipnfall (Owner, Author) commented Feb 5, 2024

At commit ee8c340, a 15 minute test with 1 player in game, all running on my desktop.

Memory

[screenshot]

Total usage: 230.54 MB

Our main logic accounts for about 130 MB (3.4 times as much as HEAD).

[screenshot]

Notes:

  • The method for sending takes a total of 48.47 MB:
    • The constructor ByteReadPacket(buffer) takes 10.8 MB
    • Calling serverSocket.send(datagram) takes up 22.6 MB
    • Wrapping the above send method in withTimeout() takes up over 7 MB. We should be able to remove this.
  • Decoding incoming RPCs takes a total of 12.1 MB, which is actually much less percent-wise than I was expecting.
  • The coroutine scheduler plus some ktor manager logic takes up 100 MB

CPU

[screenshot]

  • 120ms spent on serverSocket.send(datagram). A further 36ms on the wrapping withTimeout()
  • listener.actionPerformed(event) takes 228ms. This is mostly represented by GameDataAction, and most of that time shows up under send().
  • ByteReadPacket(buffer) takes 24ms, which seems pretty expensive.

Total usage: 3.372s

@hopskipnfall (Owner, Author)

This looks pretty bad for integrating with ktor and/or coroutines...

Some compromises we can take:

  1. Instead of spinning up a new coroutine for every request, have a dedicated persistent coroutine per user that processes a queue of incoming messages. That should significantly reduce memory and CPU usage.
  2. Construct messages using BytePacketBuilder instead of wrapping ByteBuffers in ByteReadPacket.
  3. Remove calls to withTimeout()
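Compromise 1 could be sketched roughly like this (a plain-Java stand-in for the coroutine version; class and message types are illustrative, not the server's): one long-lived worker per user draining that user's inbox, rather than a new task per request.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Illustrative sketch of compromise 1: a dedicated persistent worker per
// user that processes a queue of incoming messages, instead of spinning
// up a new task for every request.
public class PerUserWorker {
    private final BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
    private final Thread worker;

    public PerUserWorker(Consumer<String> handler) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    // One message at a time, in arrival order, so no
                    // per-request mutex is needed for this user's state.
                    handler.accept(inbox.take());
                }
            } catch (InterruptedException e) {
                // shutdown requested
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    public void offer(String message) {
        inbox.add(message);
    }

    public void stop() {
        worker.interrupt();
    }
}
```

Since each user's messages are handled serially by one worker, this also gets the in-order processing that the per-user request mutex currently provides.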

@hopskipnfall (Owner, Author)

After removing coroutines, switching to Netty, and using native Java for making HTTP requests:

Memory

[screenshot]

Total usage: 74.67 MB, with our core logic fitting into 35 MB.

CPU

[screenshot]

Total usage: 2.5s

It's possible this isn't consistently reproducible, is platform-dependent, or scales differently with the number of users or the length of the test. But I think it's clear this approach has the most promise, and I haven't done any threadpool tuning yet.

@hopskipnfall (Owner, Author) commented Mar 20, 2024

Ideas for improving from here:

  • Do everything we can to optimize the GameData handling code, as it has by far the highest volume.
  • Try to avoid allocating new ByteArray instances for every new GameData message received (Arrays.copyOf() uses 10 MB in this test). This may be difficult because of how we store a cache of past inputs.
  • We can save a few MB by not calling ByteBuf#order(LITTLE_ENDIAN) and instead calling the LE methods directly.
  • Maybe make V086Bundle mutable so we can reuse it for each send.
  • DataOutputStream() uses a silly amount of memory (27 MB). We can probably optimize this.
  • The synchronized call in KailleraUser#handleEvent() is technically blocking in some cases (during login at least) and should be removed if possible.
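On the Arrays.copyOf() point above, one possible shape for an allocation-free cache of past inputs is a fixed ring of preallocated slots that inputs are copied into. This is only a sketch under my own assumptions (slot layout, names, and sizes are illustrative; the server's actual cache differs):

```java
// Illustrative sketch: avoid a fresh Arrays.copyOf() per GameData message
// by copying each input into a reused, preallocated slot in a ring.
public class InputRingCache {
    private final byte[][] slots;   // preallocated once, reused forever
    private final int[] lengths;    // actual payload length per slot
    private int next = 0;

    public InputRingCache(int slotCount, int maxInputSize) {
        slots = new byte[slotCount][maxInputSize];
        lengths = new int[slotCount];
    }

    /** Copies the input into a reused slot and returns the slot index. */
    public int put(byte[] input, int length) {
        int i = next;
        System.arraycopy(input, 0, slots[i], 0, length);
        lengths[i] = length;
        next = (next + 1) % slots.length;
        return i;
    }

    /** Reads a cached input back without allocating. */
    public byte[] slot(int index) {
        return slots[index];
    }

    public int length(int index) {
        return lengths[index];
    }
}
```

The trade-off is that a slot's contents are only valid until the ring wraps, which is exactly why the existing cache of past inputs makes this tricky.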

@hopskipnfall hopskipnfall changed the base branch from fewer-ports to master March 21, 2024 07:57
@hopskipnfall (Owner, Author)

[screenshot]

After some extended testing, this latest version performs better than ESF, ESF 0.11.2, and ESX.

I will do some additional cleanup on this PR, merge it, and use #114 to track pre-launch QA tasks.

@hopskipnfall hopskipnfall changed the title Rabbit hole Run on a single port using Netty May 2, 2024
@hopskipnfall hopskipnfall merged commit 26b897f into master May 2, 2024
1 check passed
@hopskipnfall hopskipnfall deleted the rabbit-hole branch May 2, 2024 13:39