Analyzing Memory Allocator Behavior
Netty 4.2 includes three memory allocators for creating ByteBuf instances:
- The unpooled allocator allocates from system memory on each call, and releases it immediately when a ByteBuf is no longer used.
- The pooled allocator allocates chunks of system memory, and shares and reuses them across multiple ByteBuf instances. It organizes memory into arenas, chunks, and size classes, following the design of the jemalloc allocator. This allocator was the default in Netty 4.1.
- The adaptive allocator, like the pooled one, also allocates chunks of system memory. The chunks are typically smaller and more numerous, and are collected in groups called magazines. Magazines are organized into magazine groups, with one group per size class. The groups grow in response to multithreaded contention. This is the default allocator in Netty 4.2.
Each allocator has pros and cons that make it more suitable for some applications, and less so for others.
The allocator that Netty will use by default is controlled by the io.netty.allocator.type system property, set to one of the names given above.
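The property can also be set from code. The following is a minimal sketch, assuming the property is set before any Netty class that reads it has been loaded. AllocatorTypeProperty and configuredType are hypothetical helper names used for illustration, not Netty APIs, and rejecting unknown values is this sketch's own choice rather than a description of what Netty does with them.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Netty: sets and validates the allocator type property.
final class AllocatorTypeProperty {
    static final String PROPERTY = "io.netty.allocator.type";
    static final Set<String> KNOWN_TYPES = Set.of("unpooled", "pooled", "adaptive");

    // Returns the configured allocator name, or the fallback if the property is unset.
    // Throws if the value is not one of the three names listed above.
    static String configuredType(String fallback) {
        String value = System.getProperty(PROPERTY, fallback).trim().toLowerCase(Locale.ROOT);
        if (!KNOWN_TYPES.contains(value)) {
            throw new IllegalArgumentException("Unknown allocator type: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Assumption: this runs before any Netty allocator class is initialized,
        // otherwise the value may already have been read.
        System.setProperty(PROPERTY, "pooled");
        System.out.println(PROPERTY + "=" + configuredType("adaptive"));
    }
}
```

In most deployments it is simpler to pass the property on the JVM command line, for example -Dio.netty.allocator.type=pooled, which avoids any class-loading order concerns.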
The pooled and adaptive allocators can emit Java FlightRecorder events when they allocate or free buffers or chunks.
These events are not enabled by default, because they can have a rather high performance overhead. Instead, these events must be explicitly enabled by the FlightRecorder profile being used for the recording.
The Netty team has put together a FlightRecorder profile which includes only the Netty allocator events, and disables all others. This profile can be downloaded here: https://github.com/netty/netty/blob/4.2/microbench/src/main/resources/Netty%20Allocator%20Events.jfc
Using this profile also has the advantage that the produced recording can be shared publicly, as it will contain no personal or private information, and no intellectual property.
Once the profile has been downloaded onto the system running your Netty application, you can obtain an allocation profile by running these commands:
- Obtain the PID of the Netty application you wish to profile, using:
$ jps
- Start the profiling with a command similar to the following:
$ jcmd <PID> JFR.start name=netty-allocator-profiling duration=30s filename=netty-allocator.jfr settings=/path/to/netty.jfc maxsize=200m
- Wait for the profiling to finish, periodically checking its status with:
$ jcmd <PID> JFR.check
- When the recording is finished, your .jfr file is ready for you to download and analyze.
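If attaching jcmd to the process is not practical, a comparable recording can be started from inside the application with the JDK's jdk.jfr API. This is a rough sketch rather than a recommended setup: InProcessAllocatorRecording and its start method are hypothetical names, and the recording name and 200 MB size limit simply mirror the jcmd example above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Hypothetical helper: starts a time-limited JFR recording from a .jfc settings file.
final class InProcessAllocatorRecording {

    static Recording start(Path settingsFile, Path output, Duration duration)
            throws IOException, ParseException {
        // Load the downloaded profile (the settings= argument in the jcmd example).
        Configuration settings = Configuration.create(settingsFile);
        Recording recording = new Recording(settings);
        recording.setName("netty-allocator-profiling");
        recording.setMaxSize(200L * 1024 * 1024); // roughly maxsize=200m
        recording.setDuration(duration);          // recording stops after this long
        recording.setDestination(output);         // data is written here when it stops
        recording.start();
        return recording;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: InProcessAllocatorRecording <settings.jfc>");
            return;
        }
        Duration duration = Duration.ofSeconds(30);
        Recording recording = start(Path.of(args[0]), Path.of("netty-allocator.jfr"), duration);
        // A real application would keep serving traffic; this demo only waits
        // long enough for the recording to end.
        Thread.sleep(duration.toMillis() + 1_000);
        System.out.println("Recording state: " + recording.getState());
    }
}
```

The recording only contains allocator events if the pooled or adaptive allocator is actually in use in the same JVM while it runs.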
The netty-microbench module contains a program called the AllocationPatternSimulator.
It analyzes the allocation profile, simulates its allocation patterns against the pooled and adaptive allocators simultaneously, and shows a comparison chart of their memory usage.
To use it, you will have to check out and compile the Netty source code.
Then run the AllocationPatternSimulator class, passing the allocation profile .jfr file as the first program argument.
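Before passing a recording to the simulator, it can be useful to confirm that it actually contains allocator events. The sketch below uses the JDK's jdk.jfr.consumer API to count the events in a .jfr file by event type name. JfrEventCounter is a hypothetical helper, not part of Netty, and it makes no assumption about the exact names of Netty's events; it prints whatever types it finds.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: tallies the events in a JFR recording by event type name.
final class JfrEventCounter {

    static Map<String, Long> countByType(Path jfrFile) throws IOException {
        Map<String, Long> counts = new TreeMap<>(); // sorted by type name
        try (RecordingFile recording = new RecordingFile(jfrFile)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: JfrEventCounter <recording.jfr>");
            return;
        }
        countByType(Path.of(args[0]))
                .forEach((name, count) -> System.out.println(count + "\t" + name));
    }
}
```

If the output is empty, or lists no Netty-related event types, the recording was most likely made without the allocator events enabled in the profile. Recent JDKs also ship a jfr command-line tool whose summary subcommand gives a similar overview.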