chore: improve channel messaging and address memory leaks#1687
chore: improve channel messaging and address memory leaks#1687
Conversation
|
/cc @DasBabyPixel memory leaks should be fixed now |
Test Results 56 files ±0 56 suites ±0 3m 19s ⏱️ +26s Results for commit c4bc4b2. ± Comparison against base commit c108c3e. This pull request removes 32 and adds 32 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
DasBabyPixel
left a comment
There was a problem hiding this comment.
I think this is a step in the right direction. If you give me a few days time I have a reason to update past RC11 other than java 24 and time to test this branch.
I think replacing null checks with pattern matching is either an overly aggressive inspection or a too idealistic world view. It does not make the code more readable.
I haven't yet gone through all usages to look for all possible leaks, but the obvious ones I do remember should be fixed here.
...rc/main/java/eu/cloudnetservice/driver/impl/network/chunk/network/ChunkedPacketListener.java
Show resolved
Hide resolved
| private void handlePacket(@NonNull BasePacket packet) { | ||
| try { | ||
| this.doHandlePacket(packet); | ||
| } catch (Exception exception) { |
There was a problem hiding this comment.
I'd go further and catch Throwable instead of silently ignoring OOMEs or similar
There was a problem hiding this comment.
we should get oracle to have the default UncaughtExceptionHandler call System.exit.
I remember wasting way too many hours because network threads ate my errors and I couldn't for the life of me figure out where.
| try { | ||
| this.buffer.close(); | ||
| } catch (IllegalStateException _) { | ||
| // possible double-free error due to a race, ignore |
There was a problem hiding this comment.
since when are buffers thread safe? This seems like a hack that will cause issues later
There was a problem hiding this comment.
The buffer is not thread-safe, but the drop guarding the buffer is. So concurrent close calls are safe. However, I would also argue that this is the better approach as close calls should not throw an exception if the buffer was already closed before (this was partly implemented using the isAccessible() check before - just not thread-safe).
There was a problem hiding this comment.
My understanding of how the buffers work may be incorrect, but concurrent close calls can never be safe?
EDIT: part of internal buffer code in ResourceSupport class, XXXBuffer extends ResourceSupport
@Override
public final void close() {
if (acquires == -1) {
throw attachTrace(new IllegalStateException("Double-free: Resource already closed and dropped."));
}
int acq = acquires;
acquires--;
if (acq != 0) {
// Only record a CLOSE if we're not going to record a DROP.
tracer.close(acq);
} else {
// The 'acquires' was 0, now decremented to -1, which means we need to drop.
tracer.drop(0);
try {
drop.drop(impl());
Reference.reachabilityFence(this); // Avoid racing with any Cleaner.
} finally {
makeInaccessible();
}
}
}acquires is a non-volatile int. So a buffer must, at every point in time, have a clear "owner" thread. Having IllegalStateExceptions thrown seems to me like the presence of another issue somewhere else, maybe in how the buffer is used or passed between owners.
There was a problem hiding this comment.
If we want to enable concurrent close calls, we need to have some sort of atomic isClosed flag or counter
There was a problem hiding this comment.
Yes, that place is not atomic, but the actual dropping happens in ArcDrop (in our case) which is.
There was a problem hiding this comment.
I guess you are right, ArcDrop is atomic and concurrent close calls should be safe.
Still think this is a bad idea long-term and while it may work now may fail if the netty implementation details change. Someone once told me relying on implementation details is a hassle and I should avoid it, so I do.

Imo a buffer lifecycle should be clearly defined and a close call must be on the owner thread, or synchronized in some way.
Also I haven't checked, but does enabling leak detection also enable it for the wrapper?
There was a problem hiding this comment.
I don't really understand the problem. As stated before the exception is caught to ensure that, at any point, a call to close does NOT throw an exception. The previous isAccessible() guard did the same thing but left the door open a tiny bit open for an exception to be thrown anyway - this is now not possible anymore. Also note that the documentation of DataBuf still states that "buffers are not required to be thread safe", and there is no documentation hinting that close calls are thread-safe. It's just a security harness to be 100% sure that no exception is thrown, ever.
There was a problem hiding this comment.
If we argue that buffers are not thread safe, then concurrent close calls should be able to fail and throw the exception.
It's like instead of getting a ConcurrentModificationException when concurrently modifying an array list, it just silently fails.
The presence of such an exceptions indicates a deeper problem, in that case the concurrent usage of close. The absence of such an exception makes is so that problems, like incorrect usage of buffers, and concurrent close calls, are more difficult to diagnose.
It's not directly a problem where something doesn't work, correct usage of buffers will work. It's that catching that exception will make the future more difficult with bugs that can't be diagnosed because the exception is caught and ignored.
My point is: I want incorrect usage, in this case concurrent usage, to throw an exception. It makes it much easier to debug problems, and often prevents them alltogether.
...rc/main/java/eu/cloudnetservice/modules/bridge/impl/platform/helper/ProxyPlatformHelper.java
Show resolved
Hide resolved
...s/impl/src/main/java/eu/cloudnetservice/modules/npc/impl/platform/PlatformNPCManagement.java
Show resolved
Hide resolved
|
EDIT: nvm I think I merged something incorrectly into my fork |
|
Disregarding opinions on coding guidelines etc, it's overall looking good. No immediate masses of memory leaks are being thrown at me. |
Motivation
The channel messaging system (as well as a few network parts in general) really need a cleanup to improve the code quality and reliability of the system. Furthermore, there were memory leaks reported that need to be addressed as well.
Modification
This pr mostly improves channel messaging to be more reliable overall and improves the api surface exposed to consumers. Furthermore, all possible memory leaks were addressed (not limited to channel messaging) and some other networking parts were improved as well.
Result
Channel messaging should be more reliable and easier to use. Additionally, there shouldn't be any memory leaks anymore.