Package io.netty.buffer.search
Class AbstractMultiSearchProcessorFactory
java.lang.Object
    io.netty.buffer.search.AbstractMultiSearchProcessorFactory
All Implemented Interfaces:
    MultiSearchProcessorFactory, SearchProcessorFactory
Direct Known Subclasses:
    AhoCorasicSearchProcessorFactory
public abstract class AbstractMultiSearchProcessorFactory extends java.lang.Object implements MultiSearchProcessorFactory
Base class for precomputed factories that create MultiSearchProcessors.
The purpose of MultiSearchProcessor is to perform an efficient simultaneous search for multiple needles in the haystack, while scanning every byte of the input sequentially, only once. While it can also be used to search for just a single needle, using a SearchProcessorFactory would be more efficient for doing that.
See the documentation of AbstractSearchProcessorFactory for a comprehensive description of common usage. In addition to the functionality provided by SearchProcessor, MultiSearchProcessor adds a method to get the index of the needle found at the current position of the MultiSearchProcessor: MultiSearchProcessor.getFoundNeedleId().
Note: in some cases one needle can be a suffix of another needle, e.g. {"BC", "ABC"}, and there can potentially be multiple needles found ending at the same position of the haystack. In such a case MultiSearchProcessor.getFoundNeedleId() returns the index of the longest matching needle in the array of needles.
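The tie-breaking rule in the note above can be illustrated with a small standalone sketch. It does not use Netty and is not how MultiSearchProcessor is implemented: it is a deliberately naive check, and the class and method names (LongestNeedleAtPosition, endsAt, longestNeedleEndingAt) are hypothetical, introduced only to show which needle index would be reported when several needles end at the same haystack position.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Netty.
final class LongestNeedleAtPosition {

    /** Returns true if needle occupies haystack[end - needle.length + 1 .. end]. */
    static boolean endsAt(byte[] haystack, int end, byte[] needle) {
        int start = end - needle.length + 1;
        if (start < 0) {
            return false;
        }
        for (int i = 0; i < needle.length; i++) {
            if (haystack[start + i] != needle[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the index (in needles) of the longest needle ending at position end,
     * or -1 if no needle ends there.
     */
    static int longestNeedleEndingAt(byte[] haystack, int end, byte[][] needles) {
        int best = -1;
        for (int id = 0; id < needles.length; id++) {
            if (endsAt(haystack, end, needles[id])
                    && (best == -1 || needles[id].length > needles[best].length)) {
                best = id;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] haystack = "ABC".getBytes(StandardCharsets.UTF_8);
        byte[][] needles = {
            "BC".getBytes(StandardCharsets.UTF_8),
            "ABC".getBytes(StandardCharsets.UTF_8)
        };
        // Both needles end at index 2; the longer one ("ABC", index 1) wins.
        System.out.println(longestNeedleEndingAt(haystack, 2, needles)); // prints 1
    }
}
```

With needles {"BC", "ABC"} and haystack "ABC", both needles end at index 2, and the sketch reports index 1, matching the behavior described for getFoundNeedleId().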
Usage example (given that the haystack is a ByteBuf containing "ABCD" and the needles are "AB", "BC" and "CD"):
    // The static factory method is declared on AbstractMultiSearchProcessorFactory.
    MultiSearchProcessorFactory factory = AbstractMultiSearchProcessorFactory.newAhoCorasicSearchProcessorFactory(
        "AB".getBytes(CharsetUtil.UTF_8), "BC".getBytes(CharsetUtil.UTF_8), "CD".getBytes(CharsetUtil.UTF_8));
    MultiSearchProcessor processor = factory.newSearchProcessor();
    int idx1 = haystack.forEachByte(processor);
    // idx1 is 1 (index of the last character of the occurrence of "AB" in the haystack)
    // processor.getFoundNeedleId() is 0 (index of "AB" in needles[])
    int continueFrom1 = idx1 + 1;
    // continue the search starting from the next character
    int idx2 = haystack.forEachByte(continueFrom1, haystack.readableBytes() - continueFrom1, processor);
    // idx2 is 2 (index of the last character of the occurrence of "BC" in the haystack)
    // processor.getFoundNeedleId() is 1 (index of "BC" in needles[])
    int continueFrom2 = idx2 + 1;
    int idx3 = haystack.forEachByte(continueFrom2, haystack.readableBytes() - continueFrom2, processor);
    // idx3 is 3 (index of the last character of the occurrence of "CD" in the haystack)
    // processor.getFoundNeedleId() is 2 (index of "CD" in needles[])
    int continueFrom3 = idx3 + 1;
    int idx4 = haystack.forEachByte(continueFrom3, haystack.readableBytes() - continueFrom3, processor);
    // idx4 is -1 (no more occurrences of any of the needles)
    // This search session is complete, processor should be discarded.
    // To search for the same needles again, reuse the same AbstractMultiSearchProcessorFactory
    // to get a new MultiSearchProcessor.
Constructor Summary
Constructors
Constructor | Description
AbstractMultiSearchProcessorFactory() |
Method Summary
Modifier and Type | Method | Description
static AhoCorasicSearchProcessorFactory | newAhoCorasicSearchProcessorFactory(byte[]... needles) | Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface io.netty.buffer.search.MultiSearchProcessorFactory
newSearchProcessor
Method Detail
-
newAhoCorasicSearchProcessorFactory
public static AhoCorasicSearchProcessorFactory newAhoCorasicSearchProcessorFactory(byte[]... needles)
Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Precomputation (this method) time is linear in the size of the input (O(Σ|needles|)).
The factory allocates and retains an array of 256 * X ints plus another array of X ints, where X is the sum of the lengths of each entry of needles minus the sum of the lengths of repeated prefixes of the needles.
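To make the memory formula above concrete, the following sketch computes X for a set of needles and the resulting number of retained ints. It is a rough estimate under stated assumptions, not Netty code: the class AhoCorasickFootprintEstimate and its methods are hypothetical, X is computed as the number of distinct non-empty prefixes (which equals the sum of the needle lengths minus the repeated prefixes), an int is taken to be 4 bytes, and array headers or any extra root state the real implementation may allocate are ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical estimator; names and structure are illustrative only.
final class AhoCorasickFootprintEstimate {

    /** Minimal trie node: children keyed by byte value. */
    private static final class Node {
        final Map<Byte, Node> children = new HashMap<>();
    }

    /**
     * Counts X: one state per distinct non-empty prefix across all needles.
     * A prefix shared by several needles is counted only once.
     */
    static int countStates(byte[]... needles) {
        Node root = new Node();
        int states = 0;
        for (byte[] needle : needles) {
            Node current = root;
            for (byte b : needle) {
                Node next = current.children.get(b);
                if (next == null) {
                    next = new Node();
                    current.children.put(b, next);
                    states++;
                }
                current = next;
            }
        }
        return states;
    }

    /** Applies the documented formula: 256 * X ints plus X ints. */
    static long estimatedInts(byte[]... needles) {
        long x = countStates(needles);
        return 256L * x + x;
    }

    public static void main(String[] args) {
        byte[] ab = "AB".getBytes(StandardCharsets.UTF_8);
        byte[] bc = "BC".getBytes(StandardCharsets.UTF_8);
        byte[] cd = "CD".getBytes(StandardCharsets.UTF_8);
        // No shared prefixes, so X = 2 + 2 + 2 = 6.
        System.out.println(countStates(ab, bc, cd));                   // prints 6
        System.out.println(estimatedInts(ab, bc, cd));                 // prints 1542
        System.out.println(estimatedInts(ab, bc, cd) * Integer.BYTES); // prints 6168
    }
}
```

For needles that share a prefix, such as "ABC" and "ABD", the shared "AB" is counted once, so X is 4 rather than 6 and the estimate drops to 1028 ints.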
Search (the actual application of MultiSearchProcessor) time is linear in the size of the ByteBuf on which the search is performed (O(|haystack|)). Every byte of the ByteBuf is processed only once, sequentially, regardless of the number of needles being searched for.
Parameters:
needles - a varargs array of arrays of bytes to search for
Returns:
a new instance of AhoCorasicSearchProcessorFactory precomputed for the given needles
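As an illustration of why the search touches each haystack byte exactly once, here is a minimal, self-contained sketch of an Aho–Corasick automaton with a dense 256-way jump table. It is an assumption-laden teaching example, not Netty's implementation: the class AhoCorasickSketch, its fields, and findAll are hypothetical names, it works on plain byte arrays rather than ByteBuf, and it counts the root as state 0, so it uses one more state than the X described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the technique; not the Netty class.
final class AhoCorasickSketch {
    private static final int ALPHABET = 256;

    private final int[] jump;     // jump[state * 256 + byte] -> next state
    private final int[] needleAt; // longest needle id ending at this state, or -1

    AhoCorasickSketch(byte[]... needles) {
        int maxStates = 1; // state 0 is the root
        for (byte[] needle : needles) {
            if (needle.length == 0) {
                throw new IllegalArgumentException("needle must be non-empty");
            }
            maxStates += needle.length;
        }
        int[] table = new int[maxStates * ALPHABET];
        int[] match = new int[maxStates];
        Arrays.fill(table, -1);
        Arrays.fill(match, -1);

        // Phase 1: build the trie; shared prefixes reuse existing states.
        int states = 1;
        for (int id = 0; id < needles.length; id++) {
            int s = 0;
            for (byte b : needles[id]) {
                int idx = s * ALPHABET + (b & 0xFF);
                if (table[idx] == -1) {
                    table[idx] = states++;
                }
                s = table[idx];
            }
            match[s] = id;
        }

        // Phase 2: breadth-first pass that fills in failure links and
        // replaces every missing edge with the transition of the failure state.
        int[] fail = new int[states];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < ALPHABET; c++) {
            if (table[c] == -1) {
                table[c] = 0;
            } else {
                queue.add(table[c]);
            }
        }
        while (!queue.isEmpty()) {
            int s = queue.poll();
            if (match[s] == -1) {
                match[s] = match[fail[s]]; // inherit the longest shorter match, if any
            }
            for (int c = 0; c < ALPHABET; c++) {
                int idx = s * ALPHABET + c;
                int viaFail = table[fail[s] * ALPHABET + c];
                if (table[idx] == -1) {
                    table[idx] = viaFail;
                } else {
                    fail[table[idx]] = viaFail;
                    queue.add(table[idx]);
                }
            }
        }
        jump = Arrays.copyOf(table, states * ALPHABET);
        needleAt = Arrays.copyOf(match, states);
    }

    /** Single sequential pass; returns {endIndex, needleId} for every hit. */
    List<int[]> findAll(byte[] haystack) {
        List<int[]> hits = new ArrayList<>();
        int state = 0;
        for (int i = 0; i < haystack.length; i++) {
            state = jump[state * ALPHABET + (haystack[i] & 0xFF)];
            if (needleAt[state] != -1) {
                hits.add(new int[] {i, needleAt[state]});
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        AhoCorasickSketch ac = new AhoCorasickSketch(
            "AB".getBytes(StandardCharsets.UTF_8),
            "BC".getBytes(StandardCharsets.UTF_8),
            "CD".getBytes(StandardCharsets.UTF_8));
        for (int[] hit : ac.findAll("ABCD".getBytes(StandardCharsets.UTF_8))) {
            System.out.println("end=" + hit[0] + " needle=" + hit[1]);
        }
    }
}
```

Run against the haystack "ABCD", the sketch reports end indices 1, 2 and 3 with needle ids 0, 1 and 2, the same sequence as the usage example in the class description. The dense table trades memory (256 ints per state) for a single array lookup per input byte.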