Object AC (Aho-Corasick)
An Aho-Corasick multi-pattern matcher, created with T.ac( _pattern_array_ ).
The Trisul framework provides this convenience utility because multi-pattern matching is a frequent need in network analytics applications.
Creating the object
The object is created once and stored in global state, either as a global variable or as a member of the global T table. Note that global state is per file; it is not shared across Lua files.
onload = function()
T.patternMatcher = T.ac( {'string1','string2','string3'...})
end,
.. later ..
T.patternMatcher:match_one(..)
Functions
A summary of the functions available in this object.
Name | In | Out | Description |
---|---|---|---|
match_all | string | table | Matches all patterns. The matches are returned in a table { pattern_matched = position }. The position indicates the last matching character, not the first. |
match_one | string | table | Same as match_all, but stops after finding a single match. Use this method for alerting on pattern matches. |
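To make the return shape concrete, here is a minimal sketch in plain Lua that mimics the two functions outside Trisul. The names naive_match_all and naive_match_one are hypothetical stand-ins, not part of the Trisul API, and the code is a simple per-pattern scan with string.find rather than an Aho-Corasick automaton. In particular, which hit the real match_one returns when several patterns match is not stated here; the stand-in simply returns the first pattern in list order that occurs.

```lua
-- Hypothetical stand-ins for illustration only. This is NOT Trisul's
-- implementation and NOT Aho-Corasick: it scans the text once per pattern.

local function naive_match_all(patterns, text)
  local matches = {}
  for _, pattern in ipairs(patterns) do
    -- plain = true: treat the pattern as a literal string, no Lua magic chars
    local _, last = string.find(text, pattern, 1, true)
    if last then
      matches[pattern] = last   -- position of the LAST matched character
    end
  end
  return matches
end

local function naive_match_one(patterns, text)
  for _, pattern in ipairs(patterns) do
    local _, last = string.find(text, pattern, 1, true)
    if last then
      return { [pattern] = last }   -- stop at the first hit
    end
  end
  return {}                          -- assumption: no match gives an empty table
end

local patterns = { "toolbar", "google", "nsatc" }
local text = "www.toolbar.google.com"

local all = naive_match_all(patterns, text)
local one = naive_match_one(patterns, text)

for pattern, last in pairs(all) do
  print("match_all", pattern, last)
end
for pattern, last in pairs(one) do
  print("match_one", pattern, last)
end
```

The value stored against each pattern is the index of its final character, which is the convention described in the table above.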
Function match_all
Tries to match all patterns against the input text.
Purpose
Use this function to find every pattern from the list that occurs in the input text.
Parameters
text | string | the text to be matched |
Return value
A table of matches, keyed by the matched pattern. See the debugger output below.
Usage
In this example we match a list of hostnames against the text of a DNS Full Text Search (FTS) document.
The code at the point where dbg() is called is shown below.
onload = function()
T.patterns = T.ac( { "toolbar", "nsatc", "HOLLERITH" })
end,
onflush= function(dbengine, fts)
local m = T.patterns:match_all( fts:text() )
dbg();
if next(m) then
print("FOUND a match.. do your thing")
end
..
Using the debugger, we can inspect the return value m
debugger.lua> p m
m => {"toolbar" = 87}
debugger.lua> p fts:text()
fts:text() => "QUERY\9ID: 39219\9Flags:0x0100\9QDCount:1\9ANCount:0\9NSCount:0\9ARCount:0\Questions\.toolbar.google.com\9\9A\9IN\\"
debugger.lua>
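Because the reported position is the last matching character, the start of a match has to be derived from the pattern length. The sketch below shows that arithmetic in plain Lua. The helper match_spans is hypothetical, the input text is synthetic padding rather than a real FTS document, and the code assumes positions are 1-based offsets in the same sense as Lua string indices.

```lua
-- Synthetic input: 80 filler characters followed by a hostname, so that
-- "toolbar" occupies characters 81..87, the end position seen in the session.
local text = string.rep("-", 80) .. "toolbar.google.com"

-- Shape of a match_all result, copied from the debugger output above.
local m = { toolbar = 87 }

-- Hypothetical helper: turn "last character" positions into first/last pairs.
-- Assumes 1-based positions, as used by Lua's own string functions.
local function match_spans(matches)
  local spans = {}
  for pattern, last in pairs(matches) do
    spans[pattern] = { first = last - #pattern + 1, last = last }
  end
  return spans
end

local spans = match_spans(m)
local s = spans.toolbar
print(s.first, s.last, text:sub(s.first, s.last))
```

With an end position of 87 and a seven-character pattern, the match starts at 87 - 7 + 1 = 81.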
Function match_one
Matches the patterns against the input text and stops at the first match.
By contrast, match_all tries to match all patterns and does not stop after the first match.
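For alerting, the single hit usually needs to be pulled out of the returned table. A minimal sketch follows, assuming that match_one returns a table with one pattern = position entry on a hit and an empty table otherwise (the match_all usage example tests its result with next(m), which suggests the empty-table convention). The helper first_hit and the alert text are hypothetical.

```lua
-- Hypothetical helper: extract the single entry from a match_one-style table.
-- Returns nil, nil when the table is empty (assumed to mean "no match").
local function first_hit(matches)
  local pattern, last = next(matches)
  return pattern, last
end

-- Stand-in for the value T.patternMatcher:match_one(text) might return.
local result = { toolbar = 87 }

local pattern, last = first_hit(result)
if pattern then
  print(("ALERT: matched %q, last character at %d"):format(pattern, last))
end
```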