Google Summer of Code Ideas 2020
Netty supports various transports out of the box; which of them can be used depends on the operating system. At the moment it supports:
- NIO Transport - Works on any platform / OS that has Java support
- Native Epoll Transport - Works on Linux only; the available features depend on the kernel / glibc version in use
- Native Kqueue Transport - Works on any BSD in theory, but tested mainly on macOS.
As most companies use Linux as the main OS for production systems, it is no wonder that the “Native Epoll Transport” is the one usually used. While this transport already has various advantages over the generic NIO Transport, it still “suffers” from one problem that it shares with the NIO Transport: it uses syscalls to communicate with the system / kernel for all network operations.
This includes (but is not limited to):
- Open Sockets
- Accept Sockets
- Write to Sockets
- Read from Sockets
- Close Sockets
Using syscalls has served us very well over the last years, but as syscalls became more expensive due to the Spectre and Meltdown mitigations (and will most likely become even more expensive in the future), there is interest in reducing the number of syscalls.
A very promising way of doing so is io_uring (https://kernel.dk/io_uring.pdf) [1, 2]. It provides an abstraction and API for communicating with the kernel for IO operations that mostly eliminates the need for syscalls altogether. While it is out of scope to explain the exact inner workings of io_uring, the most important detail is that it offers a way to communicate through ring buffers backed by memory that is mapped into both userspace and kernel space.
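To make the ring-buffer idea more concrete, here is a toy single-producer / single-consumer ring in plain Java. It is only a sketch of the head/tail indexing that such rings rely on, not the io_uring API: the class name `SpscRing` and its methods are hypothetical, the buffer lives on the Java heap rather than in memory shared with the kernel, and entries are assumed to be non-negative `long` values.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Toy single-producer / single-consumer ring buffer (hypothetical, for illustration).
 * One side only ever advances the tail, the other side only ever advances the head,
 * so both can make progress without locks.
 */
final class SpscRing {
    private final long[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to consume
    private final AtomicLong tail = new AtomicLong(); // next slot to fill

    SpscRing(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1; // lets us wrap indices with a cheap bitwise AND
    }

    /** Producer side: returns false if the ring is full. */
    boolean offer(long entry) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = entry;
        tail.set(t + 1); // volatile write publishes the entry to the consumer
        return true;
    }

    /** Consumer side: returns -1 if the ring is empty. */
    long poll() {
        long h = head.get();
        if (h == tail.get()) {
            return -1;
        }
        long entry = slots[(int) (h & mask)];
        head.set(h + 1); // volatile write hands the slot back to the producer
        return entry;
    }
}
```

In io_uring the application plays the producer role on the submission queue and the consumer role on the completion queue, with the kernel on the other side of each ring.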
To ensure Netty can make use of io_uring in the future, we need to write a new transport that uses the provided API. Because this is a C API, the transport will need to be implemented via JNI (just like the Native Epoll Transport).
- Experience with the Java and C programming languages (including how to call C from Java code via JNI)
- Understanding of Linux and its network API
- hard
- Francesco Nigro
- Norman Maurer
Netty currently contains a mix of unit tests and integration tests, which are run as part of PR validation and also nightly. While this has helped us ensure we do not introduce functional regressions, it does not help to catch other types of regressions, such as:
- memory leaks (native and non-native)
- memory overhead
- increased allocation rates
- GC pressure
To help us detect these types of problems, we would like to enhance the current test suite with tests / tools that ensure we do not regress in the cases mentioned above.
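One possible shape for such a check is a test that fails when a piece of code allocates more than an agreed budget. The sketch below is not part of Netty; `AllocationBudget` and its methods are hypothetical names. It assumes a HotSpot-based JVM, where the platform `ThreadMXBean` can be cast to `com.sun.management.ThreadMXBean` to read per-thread allocated bytes.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper: measures per-thread allocation and enforces a budget. */
final class AllocationBudget {
    private AllocationBudget() { }

    /** Returns the bytes allocated by the current thread while running the task. */
    static long allocatedBytes(Runnable task) {
        // Assumption: HotSpot-specific extension of the standard ThreadMXBean.
        com.sun.management.ThreadMXBean threads =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        if (!threads.isThreadAllocatedMemorySupported()
                || !threads.isThreadAllocatedMemoryEnabled()) {
            throw new IllegalStateException("thread allocation measurement is unavailable");
        }
        long id = Thread.currentThread().getId();
        long before = threads.getThreadAllocatedBytes(id);
        task.run();
        return threads.getThreadAllocatedBytes(id) - before;
    }

    /** Throws AssertionError if the task allocates more than budgetBytes. */
    static void assertWithinBudget(String name, long budgetBytes, Runnable task) {
        long used = allocatedBytes(task);
        if (used > budgetBytes) {
            throw new AssertionError(
                    name + " allocated " + used + " bytes, budget is " + budgetBytes);
        }
    }
}
```

A CI job could call `assertWithinBudget` from ordinary unit tests and record the measured values over time to produce trend graphs. The numbers are approximate and vary between JVM versions, so budgets would need some headroom.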
- Be able to run test suite as part of CI builds and fail the build if a regression is detected
- Generate graphs that show trends related to allocations / memory usage etc
- Familiar with Java
- Familiar with test frameworks
- Familiar with docker
- Francesco Nigro
- Norman Maurer
- Medium
- https://github.com/CodingFabian/allocation-tracker
- https://github.com/jvm-profiling-tools/async-profiler
- https://github.com/mariusae/heapster
Netty comes with its own implementations of different event loop types, but they lack observability. Providing counters and metrics that let an external observer understand how the event loops are being used would help users tune their applications and identify performance bottlenecks or misconfigurations that prevent them from reaching a target load. Observability often comes at a price, so its impact should be measurable on its own, and the metrics should be carefully designed and implemented to be as lightweight as possible.
Design, implement, and benchmark the proposed metrics across different use cases, and measure their overhead compared with running without telemetry.
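As a rough illustration of the kind of counters this could involve, the sketch below wraps any `java.util.concurrent.Executor` and records how many tasks were submitted and completed, plus the total time tasks spent waiting and running. `InstrumentedExecutor` is a hypothetical name and not an existing Netty class; a real design would live inside the event loop implementations and would need benchmarking to quantify its overhead.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical wrapper that collects basic task metrics for an Executor. */
final class InstrumentedExecutor implements Executor {
    private final Executor delegate;
    // LongAdder keeps contention low when many threads submit tasks.
    private final LongAdder submitted = new LongAdder();
    private final LongAdder completed = new LongAdder();
    private final LongAdder queueNanos = new LongAdder();
    private final LongAdder runNanos = new LongAdder();

    InstrumentedExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        submitted.increment();
        long enqueued = System.nanoTime();
        delegate.execute(() -> {
            long started = System.nanoTime();
            queueNanos.add(started - enqueued); // time spent waiting to run
            try {
                task.run();
            } finally {
                runNanos.add(System.nanoTime() - started); // time spent executing
                completed.increment();
            }
        });
    }

    long submitted() { return submitted.sum(); }
    long completed() { return completed.sum(); }
    long pending() { return submitted.sum() - completed.sum(); }
    long totalQueueNanos() { return queueNanos.sum(); }
    long totalRunNanos() { return runNanos.sum(); }
}
```

Comparing throughput and latency with and without such a wrapper, for example in a JMH benchmark, is one way to measure the cost of the telemetry itself.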
- Experience with Java and benchmarking
- [Understanding of queuing theory]
- medium
- Francesco Nigro
- Norman Maurer
Netty supports various transports out of the box; which of them can be used depends on the operating system. At the moment it supports:
- NIO Transport - Works on any platform / OS that has Java support
- Native Epoll Transport - Works on Linux only; the available features depend on the kernel / glibc version in use
- Native Kqueue Transport - Works on any BSD in theory, but tested mainly on macOS.
The last two have specific JNI implementations in Netty itself, while the first uses what Java NIO offers with few changes. Specific metrics and telemetry data could be collected in these native transport implementations to help observe and troubleshoot a running system.
Design, implement, and benchmark the proposed metrics across different use cases, and measure their overhead compared with running without telemetry.
- Experience with Java and benchmarking
- Experience with the Java and C programming languages (including how to call C from Java code via JNI)
- Understanding of Linux and its network API
- medium
- Francesco Nigro
- Norman Maurer