public abstract class AbstractMultiSearchProcessorFactory extends Object implements MultiSearchProcessorFactory
Base class for precomputed factories that create MultiSearchProcessors.
The purpose of MultiSearchProcessor is to perform an efficient simultaneous search for multiple needles in the haystack, while scanning every byte of the input sequentially, only once. While it can also be used to search for just a single needle, using a SearchProcessorFactory would be more efficient for that.
See the documentation of AbstractSearchProcessorFactory for a comprehensive description of common usage. In addition to the functionality provided by SearchProcessor, MultiSearchProcessor adds a method to get the index of the needle found at the current position of the MultiSearchProcessor: MultiSearchProcessor.getFoundNeedleId().
Note: one needle can be a suffix of another needle, e.g. {"BC", "ABC"}, so multiple needles can potentially be found ending at the same position of the haystack. In such a case MultiSearchProcessor.getFoundNeedleId() returns the index of the longest matching needle in the array of needles.
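The longest-match rule can be illustrated without Netty. The following is a minimal plain-Java sketch, not Netty's implementation: the class `LongestSuffixMatch` and the method `longestNeedleEndingAt` are hypothetical names, and the check is brute force rather than Aho–Corasick.

```java
import java.nio.charset.StandardCharsets;

public class LongestSuffixMatch {

    // Returns the index (in needles) of the longest needle that ends exactly at
    // position endIndex of the haystack, or -1 if no needle ends there.
    static int longestNeedleEndingAt(byte[] haystack, int endIndex, byte[][] needles) {
        int best = -1;
        for (int id = 0; id < needles.length; id++) {
            byte[] needle = needles[id];
            int start = endIndex - needle.length + 1;
            if (needle.length == 0 || start < 0) {
                continue; // needle is empty or does not fit before endIndex
            }
            boolean matches = true;
            for (int i = 0; i < needle.length; i++) {
                if (haystack[start + i] != needle[i]) {
                    matches = false;
                    break;
                }
            }
            if (matches && (best == -1 || needle.length > needles[best].length)) {
                best = id;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] haystack = "XABC".getBytes(StandardCharsets.UTF_8);
        byte[][] needles = {
            "BC".getBytes(StandardCharsets.UTF_8),
            "ABC".getBytes(StandardCharsets.UTF_8)
        };
        // Both needles end at index 3; the longer one, "ABC" (index 1), is reported.
        System.out.println(longestNeedleEndingAt(haystack, 3, needles));
    }
}
```

With the needles {"BC", "ABC"} and the haystack "XABC", both needles end at index 3, and the sketch reports index 1, the position of "ABC" in the needle array.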
Usage example (given that the haystack is a ByteBuf containing "ABCD" and the needles are "AB", "BC" and "CD"):
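// The static factory method is declared on AbstractMultiSearchProcessorFactory, not on the interface.
MultiSearchProcessorFactory factory = AbstractMultiSearchProcessorFactory.newAhoCorasicSearchProcessorFactory(
    "AB".getBytes(CharsetUtil.UTF_8), "BC".getBytes(CharsetUtil.UTF_8), "CD".getBytes(CharsetUtil.UTF_8));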
MultiSearchProcessor processor = factory.newSearchProcessor();
int idx1 = haystack.forEachByte(processor);
// idx1 is 1 (index of the last character of the occurrence of "AB" in the haystack)
// processor.getFoundNeedleId() is 0 (index of "AB" in needles[])
int continueFrom1 = idx1 + 1;
// continue the search starting from the next character
int idx2 = haystack.forEachByte(continueFrom1, haystack.readableBytes() - continueFrom1, processor);
// idx2 is 2 (index of the last character of the occurrence of "BC" in the haystack)
// processor.getFoundNeedleId() is 1 (index of "BC" in needles[])
int continueFrom2 = idx2 + 1;
int idx3 = haystack.forEachByte(continueFrom2, haystack.readableBytes() - continueFrom2, processor);
// idx3 is 3 (index of the last character of the occurrence of "CD" in the haystack)
// processor.getFoundNeedleId() is 2 (index of "CD" in needles[])
int continueFrom3 = idx3 + 1;
int idx4 = haystack.forEachByte(continueFrom3, haystack.readableBytes() - continueFrom3, processor);
// idx4 is -1 (no more occurrences of any of the needles)
// This search session is complete, processor should be discarded.
// To search for the same needles again, reuse the same AbstractMultiSearchProcessorFactory
// to get a new MultiSearchProcessor.
Constructor and Description |
---|
AbstractMultiSearchProcessorFactory() |
Modifier and Type | Method and Description |
---|---|
static AhoCorasicSearchProcessorFactory | newAhoCorasicSearchProcessorFactory(byte[]... needles) Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm. |
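Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface MultiSearchProcessorFactory: newSearchProcessor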
public static AhoCorasicSearchProcessorFactory newAhoCorasicSearchProcessorFactory(byte[]... needles)
Creates a MultiSearchProcessorFactory based on the Aho–Corasick string search algorithm.
Precomputation (this method) time is linear in the size of the input (O(Σ|needles|)).
The memory retained by the factory grows with the sum of lengths of each entry of needles minus the sum of lengths of repeated prefixes of the needles.
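The quantity "sum of needle lengths minus repeated prefixes" is the number of distinct non-empty prefixes across all needles, which is also the number of non-root nodes in a trie built from them. The sketch below counts it directly; `TrieSizeEstimate` and `distinctPrefixCount` are hypothetical names, and it works on ASCII strings rather than byte arrays purely for readability.

```java
import java.util.HashSet;
import java.util.Set;

public class TrieSizeEstimate {

    // Counts the distinct non-empty prefixes across all needles. This equals the
    // sum of needle lengths minus the prefixes repeated between needles, i.e. the
    // number of non-root nodes in a trie built from the needles.
    static int distinctPrefixCount(String... needles) {
        Set<String> prefixes = new HashSet<>();
        for (String needle : needles) {
            for (int end = 1; end <= needle.length(); end++) {
                prefixes.add(needle.substring(0, end));
            }
        }
        return prefixes.size();
    }

    public static void main(String[] args) {
        // Lengths sum to 7; "A" and "AB" are shared by "AB" and "ABC",
        // so 7 - 2 = 5 distinct prefixes: A, AB, ABC, B, BC.
        System.out.println(distinctPrefixCount("AB", "ABC", "BC")); // prints 5
    }
}
```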
Search (the actual application of MultiSearchProcessor) time is linear in the size of the ByteBuf on which the search is performed (O(|haystack|)). Every byte of the ByteBuf is processed only once, sequentially, regardless of the number of needles being searched for.
Parameters: needles - a varargs array of arrays of bytes to search for
Returns: a new instance of AhoCorasicSearchProcessorFactory precomputed for the given needles
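To make the precompute-once, scan-once behavior concrete, here is a compact plain-Java sketch of an Aho–Corasick automaton over bytes. It is an illustration under stated assumptions, not Netty's source: the class `AhoCorasickSketch` and its `find` method are hypothetical, and the table layout (256 transitions per state) is chosen only for simplicity. The automaton state is kept between calls, so a search can resume from the byte after the previous match.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;

public class AhoCorasickSketch {
    private static final int ALPHABET = 256;
    private final int[] next;    // next[state * 256 + byte] = following state
    private final int[] matchId; // id of the longest needle ending at a state, or -1
    private int state;           // current automaton state, kept between calls
    private int foundNeedleId = -1;

    AhoCorasickSketch(byte[]... needles) {
        int maxStates = 1; // upper bound: root plus one state per needle byte
        for (byte[] needle : needles) {
            maxStates += needle.length;
        }
        next = new int[maxStates * ALPHABET];
        matchId = new int[maxStates];
        Arrays.fill(next, -1);
        Arrays.fill(matchId, -1);

        // Step 1: build the trie; shared prefixes reuse existing states.
        int states = 1; // state 0 is the root
        for (int id = 0; id < needles.length; id++) {
            int s = 0;
            for (byte b : needles[id]) {
                int slot = s * ALPHABET + (b & 0xff);
                if (next[slot] == -1) {
                    next[slot] = states++;
                }
                s = next[slot];
            }
            if (s != 0 && matchId[s] == -1) {
                matchId[s] = id; // empty needles are ignored; duplicates keep the first id
            }
        }

        // Step 2: breadth-first pass that computes failure links and fills in
        // every missing transition, so searching needs one table lookup per byte.
        int[] fail = new int[states];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < ALPHABET; c++) {
            if (next[c] == -1) {
                next[c] = 0; // unknown byte at the root stays at the root
            } else {
                queue.add(next[c]);
            }
        }
        while (!queue.isEmpty()) {
            int s = queue.poll();
            if (matchId[s] == -1) {
                matchId[s] = matchId[fail[s]]; // inherit the longest matching suffix
            }
            for (int c = 0; c < ALPHABET; c++) {
                int slot = s * ALPHABET + c;
                int viaFail = next[fail[s] * ALPHABET + c];
                if (next[slot] == -1) {
                    next[slot] = viaFail;
                } else {
                    fail[next[slot]] = viaFail;
                    queue.add(next[slot]);
                }
            }
        }
    }

    // Scans haystack from index 'from'; returns the index of the last byte of
    // the next match, or -1 when the end of the haystack is reached.
    int find(byte[] haystack, int from) {
        for (int i = from; i < haystack.length; i++) {
            state = next[state * ALPHABET + (haystack[i] & 0xff)];
            if (matchId[state] != -1) {
                foundNeedleId = matchId[state];
                return i;
            }
        }
        return -1;
    }

    int getFoundNeedleId() {
        return foundNeedleId;
    }

    public static void main(String[] args) {
        byte[] haystack = "ABCD".getBytes(StandardCharsets.UTF_8);
        AhoCorasickSketch search = new AhoCorasickSketch(
                "AB".getBytes(StandardCharsets.UTF_8),
                "BC".getBytes(StandardCharsets.UTF_8),
                "CD".getBytes(StandardCharsets.UTF_8));
        int from = 0;
        int idx;
        while ((idx = search.find(haystack, from)) != -1) {
            System.out.println("needle " + search.getFoundNeedleId() + " ends at index " + idx);
            from = idx + 1;
        }
    }
}
```

For the haystack "ABCD" and the needles "AB", "BC" and "CD", the loop reports needle 0 ending at index 1, needle 1 at index 2, and needle 2 at index 3. Each haystack byte is read exactly once, whatever the number of needles.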
Copyright © 2008–2024 The Netty Project. All rights reserved.