Package io.netty.buffer.search
Class AbstractMultiSearchProcessorFactory
java.lang.Object
    io.netty.buffer.search.AbstractMultiSearchProcessorFactory
All Implemented Interfaces:
MultiSearchProcessorFactory, SearchProcessorFactory
Direct Known Subclasses:
AhoCorasicSearchProcessorFactory
public abstract class AbstractMultiSearchProcessorFactory extends java.lang.Object implements MultiSearchProcessorFactory
Base class for precomputed factories that create MultiSearchProcessors.
A MultiSearchProcessor searches for multiple needles in the haystack simultaneously, scanning every byte of the input sequentially and only once. It can also be used to search for a single needle, but a SearchProcessorFactory is more efficient for that case.
See the documentation of AbstractSearchProcessorFactory for a comprehensive description of common usage. In addition to the functionality provided by SearchProcessor, MultiSearchProcessor adds a method to get the index of the needle found at the current position of the MultiSearchProcessor: MultiSearchProcessor.getFoundNeedleId().
Note: in some cases one needle can be a suffix of another needle, e.g. {"BC", "ABC"}, so multiple needles may be found ending at the same position of the haystack. In such a case MultiSearchProcessor.getFoundNeedleId() returns the index of the longest matching needle in the array of needles.
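The longest-match rule in the note above can be modelled with a brute-force check. This is only a sketch in plain Java: the class LongestSuffixMatchSketch and its method are hypothetical names that are not part of Netty, and the real processor reaches the same answer without rescanning the haystack.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Netty: a brute-force model of the longest-match rule. */
final class LongestSuffixMatchSketch {

    /**
     * Returns the index (within needles) of the longest needle whose last byte
     * sits at haystack[endIndex], or -1 if no needle ends at that position.
     */
    static int longestNeedleEndingAt(byte[] haystack, int endIndex, byte[]... needles) {
        int best = -1;
        for (int id = 0; id < needles.length; id++) {
            byte[] needle = needles[id];
            int start = endIndex - needle.length + 1;
            if (needle.length == 0 || start < 0) {
                continue; // empty needle, or needle longer than the text read so far
            }
            boolean matches = true;
            for (int i = 0; i < needle.length; i++) {
                if (haystack[start + i] != needle[i]) {
                    matches = false;
                    break;
                }
            }
            // Keep the longest needle among those ending at endIndex.
            if (matches && (best == -1 || needle.length > needles[best].length)) {
                best = id;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] haystack = "ABC".getBytes(StandardCharsets.UTF_8);
        byte[] bc = "BC".getBytes(StandardCharsets.UTF_8);
        byte[] abc = "ABC".getBytes(StandardCharsets.UTF_8);
        // Both needles end at index 2; "ABC" (needle index 1) is the longer one.
        System.out.println(longestNeedleEndingAt(haystack, 2, bc, abc)); // prints 1
    }
}
```

With needles {"BC", "ABC"} and the haystack "ABC", both needles end at index 2 and the sketch reports index 1, matching the behaviour described for getFoundNeedleId().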
Usage example (given that the haystack is a ByteBuf containing "ABCD" and the needles are "AB", "BC" and "CD"):

    // The static factory method is declared on AbstractMultiSearchProcessorFactory (see Method Summary).
    MultiSearchProcessorFactory factory = AbstractMultiSearchProcessorFactory.newAhoCorasicSearchProcessorFactory(
        "AB".getBytes(CharsetUtil.UTF_8), "BC".getBytes(CharsetUtil.UTF_8), "CD".getBytes(CharsetUtil.UTF_8));
    MultiSearchProcessor processor = factory.newSearchProcessor();

    int idx1 = haystack.forEachByte(processor);
    // idx1 is 1 (index of the last character of the occurrence of "AB" in the haystack)
    // processor.getFoundNeedleId() is 0 (index of "AB" in needles[])

    int continueFrom1 = idx1 + 1;
    // continue the search starting from the next character

    int idx2 = haystack.forEachByte(continueFrom1, haystack.readableBytes() - continueFrom1, processor);
    // idx2 is 2 (index of the last character of the occurrence of "BC" in the haystack)
    // processor.getFoundNeedleId() is 1 (index of "BC" in needles[])

    int continueFrom2 = idx2 + 1;

    int idx3 = haystack.forEachByte(continueFrom2, haystack.readableBytes() - continueFrom2, processor);
    // idx3 is 3 (index of the last character of the occurrence of "CD" in the haystack)
    // processor.getFoundNeedleId() is 2 (index of "CD" in needles[])

    int continueFrom3 = idx3 + 1;

    int idx4 = haystack.forEachByte(continueFrom3, haystack.readableBytes() - continueFrom3, processor);
    // idx4 is -1 (no more occurrences of any of the needles)

    // This search session is complete, processor should be discarded.
    // To search for the same needles again, reuse the same AbstractMultiSearchProcessorFactory
    // to get a new MultiSearchProcessor.
Constructor Summary
Constructors
    Constructor: AbstractMultiSearchProcessorFactory()
Method Summary
    Modifier and Type: static AhoCorasicSearchProcessorFactory
    Method: newAhoCorasicSearchProcessorFactory(byte[]... needles)
    Description: Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface io.netty.buffer.search.MultiSearchProcessorFactory
newSearchProcessor
Method Detail
newAhoCorasicSearchProcessorFactory
public static AhoCorasicSearchProcessorFactory newAhoCorasicSearchProcessorFactory(byte[]... needles)
Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Precomputation time (the cost of this method) is linear in the total size of the input: O(Σ|needles|).
The factory allocates and retains an array of 256 * X ints plus another array of X ints, where X is the sum of the lengths of all entries of needles minus the sum of the lengths of repeated prefixes of the needles.
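The memory formula above can be turned into a rough estimate. The sketch below is plain Java and rests on one assumption: that "sum of lengths minus repeated prefixes" means the number of distinct non-empty prefixes across all needles. The class AhoCorasickFootprintSketch and its methods are hypothetical and not part of Netty, and the real factory's allocation may differ by a small constant.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper, not part of Netty: estimates the documented table sizes. */
final class AhoCorasickFootprintSketch {

    /** X, read here as the number of distinct non-empty prefixes across all needles. */
    static int distinctPrefixCount(byte[]... needles) {
        Set<String> prefixes = new HashSet<>();
        for (byte[] needle : needles) {
            for (int len = 1; len <= needle.length; len++) {
                // ISO-8859-1 maps each byte to exactly one char, so distinct
                // byte prefixes remain distinct strings.
                prefixes.add(new String(needle, 0, len, StandardCharsets.ISO_8859_1));
            }
        }
        return prefixes.size();
    }

    /** Documented footprint: one array of 256 * X ints plus one array of X ints. */
    static long estimatedInts(byte[]... needles) {
        long x = distinctPrefixCount(needles);
        return 256L * x + x;
    }

    public static void main(String[] args) {
        byte[] abc = "ABC".getBytes(StandardCharsets.UTF_8);
        byte[] abd = "ABD".getBytes(StandardCharsets.UTF_8);
        // Total length is 6, the shared prefix "AB" has length 2, so X = 4.
        System.out.println(distinctPrefixCount(abc, abd)); // prints 4
        System.out.println(estimatedInts(abc, abd));       // prints 1028
    }
}
```

At 4 bytes per int, the two needles above would correspond to roughly 4 KiB of tables under this reading of the formula.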
Search time (the actual application of MultiSearchProcessor) is linear in the size of the ByteBuf on which the search is performed: O(|haystack|). Every byte of the ByteBuf is processed only once, sequentially, regardless of the number of needles being searched for.
Parameters:
needles - a varargs array of arrays of bytes to search for
Returns:
a new instance of AhoCorasicSearchProcessorFactory precomputed for the given needles
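To illustrate why the search needs only one pass over the haystack, here is a minimal Aho-Corasick matcher in plain Java that uses a dense 256-entry row per state. It is a teaching sketch under stated assumptions, not Netty's implementation: the class DenseAhoCorasickSketch, its fields and its methods are hypothetical names, and it sizes its tables by an upper bound rather than by the exact X described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical teaching sketch of a dense-table Aho-Corasick matcher; not Netty's code. */
final class DenseAhoCorasickSketch {
    private static final int ALPHABET = 256;

    private final int[] jump;     // jump[state * 256 + unsignedByte] -> next state
    private final int[] needleAt; // id of the longest needle ending in a state, or -1

    DenseAhoCorasickSketch(byte[]... needles) {
        int maxStates = 1; // state 0 is the root
        for (byte[] needle : needles) {
            if (needle.length == 0) {
                throw new IllegalArgumentException("needles must be non-empty");
            }
            maxStates += needle.length; // upper bound; shared prefixes use fewer states
        }
        jump = new int[maxStates * ALPHABET];
        needleAt = new int[maxStates];
        Arrays.fill(jump, -1);
        Arrays.fill(needleAt, -1);

        // Step 1: insert every needle into a trie.
        int states = 1;
        for (int id = 0; id < needles.length; id++) {
            int s = 0;
            for (byte b : needles[id]) {
                int slot = s * ALPHABET + (b & 0xFF);
                if (jump[slot] == -1) {
                    jump[slot] = states++;
                }
                s = jump[slot];
            }
            needleAt[s] = id;
        }

        // Step 2: breadth-first pass that replaces missing edges with fallback targets.
        int[] fail = new int[states];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < ALPHABET; c++) {
            if (jump[c] == -1) {
                jump[c] = 0; // unmatched bytes keep the automaton at the root
            } else {
                queue.add(jump[c]); // depth-1 states fall back to the root (fail == 0)
            }
        }
        while (!queue.isEmpty()) {
            int s = queue.remove();
            if (needleAt[s] == -1) {
                needleAt[s] = needleAt[fail[s]]; // inherit a shorter needle that ends here
            }
            for (int c = 0; c < ALPHABET; c++) {
                int slot = s * ALPHABET + c;
                int viaFail = jump[fail[s] * ALPHABET + c];
                if (jump[slot] == -1) {
                    jump[slot] = viaFail;
                } else {
                    fail[jump[slot]] = viaFail;
                    queue.add(jump[slot]);
                }
            }
        }
    }

    /** Returns {endIndex, needleId} pairs; each haystack byte is read exactly once. */
    List<int[]> findAll(byte[] haystack) {
        List<int[]> hits = new ArrayList<>();
        int s = 0;
        for (int i = 0; i < haystack.length; i++) {
            s = jump[s * ALPHABET + (haystack[i] & 0xFF)];
            if (needleAt[s] != -1) {
                hits.add(new int[] {i, needleAt[s]});
            }
        }
        return hits;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        DenseAhoCorasickSketch matcher =
                new DenseAhoCorasickSketch(bytes("AB"), bytes("BC"), bytes("CD"));
        for (int[] hit : matcher.findAll(bytes("ABCD"))) {
            System.out.println("end index " + hit[0] + ", needle " + hit[1]);
        }
    }
}
```

Run against the haystack "ABCD" with needles "AB", "BC" and "CD", the sketch reports end indices 1, 2 and 3 with needle ids 0, 1 and 2, the same sequence as the usage example in the class description. The dense row per state trades memory for a single array lookup per input byte, which is what keeps the scan at O(|haystack|).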