
Session Group Monitor

BACKEND SCRIPT

Monitors the stream of flow-based metrics. Handlers are called when flows are created, see new activity, terminate, or time out. You can also control which flows get flushed to the Trisul-Hub database.

Structure

Session Group Monitor skeleton script

Table sg_monitor

The Lua table sg_monitor = {..} can contain one or more of the following handler functions.

| field | type | when called |
|---|---|---|
| session_guid | String (optional) | Session group id. The default is {99A78737-4B41-4387-8F31-8077DB917336} for IPv4/IPv6 flows |
| onnewflow | Function(engine, flow) | A new flow was seen. The flow object contains the details of the flow |
| onupdate | Function(engine, flow) | Some metrics were updated in the flow object. This can be called as often as once per second per flow |
| onterminate | Function(engine, flow) | Flow terminated |
| onbeginflush | Function(engine, ts) | Before starting to flush all metrics to the database |
| flushfilter | Function(engine, flow) | Before flushing each flow. Return true to save the flow in the database, return false to skip this flow |
| onflush | Function(engine, flow) | Called for each flow as it is being flushed |
| onendflush | Function(engine) | After all flows have been flushed for this interval |
| onmetronome | Function(engine, timestamp, tick_count, tick_interval) | Called every tick interval (every second) |
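
The shape of the table can be sketched as follows. This is a minimal sketch that only assumes the handler names and parameters listed above; the handler bodies are placeholders, and how the table is registered with Trisul is left to the skeleton script and not shown here.

```lua
-- Sketch only: every handler is optional and the bodies are placeholders.
sg_monitor = {

  -- Optional. This is the default session group for IPv4/IPv6 flows.
  session_guid = '{99A78737-4B41-4387-8F31-8077DB917336}',

  onnewflow    = function(engine, flow) end,
  onupdate     = function(engine, flow) end,
  onterminate  = function(engine, flow) end,

  onbeginflush = function(engine, timestamp) end,
  flushfilter  = function(engine, flow) return true end,  -- true = save the flow
  onflush      = function(engine, flow) end,
  onendflush   = function(engine) end,

  onmetronome  = function(engine, timestamp, tick_count, tick_interval) end,
}
```

Define only the handlers you need; the rest can be left out.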

Objects Reference

Flow

Represents a flow and all its metrics. The flow() method returns a FlowID object, which gives you access to the flow tuples such as source_ip, destination_ip, ports, etc.

| field | return type | description |
|---|---|---|
| key | string | A unique string identifying the flow. Same as flow():id() below |
| flow | FlowID | A FlowID object representing the flow tuples such as source_ip, port, destination_ip, port, etc. |
| time_window | number, number | Start time and last activity time of the flow, in seconds (tv_sec Unix epoch time). starttm,lasttm = flow:time_window() |
| state | flow state | The state of the flow: whether it timed out, was closed by RST or FIN, or closed normally |
| az_bytes | number | Number of bytes seen in the a→z direction. The A-side can be obtained from flow:flow():ipa_readable() |
| za_bytes | number | Number of bytes seen in the z→a direction |
| az_packets | number | Number of packets seen in the a→z direction |
| za_packets | number | Number of packets seen in the z→a direction |
| az_payload_bytes | number | a→z payload bytes. Does not include the network headers, only the TCP payload |
| za_payload_bytes | number | z→a payload bytes. Does not include the network headers, only the TCP payload |
| tags | string | A pipe-separated string of all tags attached to the flow |
| add_tag | string | Adds a tag to the set of tags already attached to the flow. Example: flow:add_tag("suspect") |
| setup_rtt | number | For TCP flows only: round trip time in microseconds, as measured from the TCP handshake |
| retransmissions | number | Number of retransmitted sequence numbers observed, total of both directions |

USAGE NOTE
When to use add_tag vs engine:tag_flow(..): when writing session_group_monitor plugins, use add_tag because it modifies the flow's tags directly. tag_flow(..) sends the new flow tag as a message back to the streaming analytics pipeline, so the tag can be lost if the flow is terminated or flushed before the flusher processes that message.

Session state

The state field of the Flow object above contains a bitwise OR of the following flags.

| Name | Value | Desc |
|---|---|---|
| SESS_INIT | 0x0001 | All connections have this bit |
| SESS_SEEN_SYN | 0x0002 | Seen a SYN (of the 3-way handshake) |
| SESS_SEEN_SYN_ACK | 0x0004 | Seen a SYN_ACK |
| SESS_SEEN_SYN_ACK_ACK | 0x0008 | Seen the ACK of the SYN-ACK in the handshake |
| SESS_A_END_SERVER | 0x0010 | The A-Endpoint identified by flow is the destination. The Trisul formula is to place the lower numbered port endpoint as the Z-End. Using this information you can identify the client and server of the flow based on the actual SYN packet |
| SESS_Z_END_SERVER | 0x0020 | The Z-Endpoint is the server. This is the normal situation |
| SESS_SEEN_FIN | 0x0040 | Seen a FIN close |
| SESS_SEEN_RST | 0x0080 | Seen a RST close |
| SESS_TIMEDOUT | 0x0100 | Flow timed out, could indicate packet loss or other issue |
| SESS_ERROR | 0x0200 | Some other unknown error with the flow caused it to flush |
| SESS_INCOMPLETE | 0x0400 | One of the two directions did not close properly |
| SESS_TERMINATED | 0x0800 | Proper termination |
| SESS_FLOWEND | 0x09C0 | Use this to detect flows that are closed and flushed to the database. In-progress flows that flush periodically will be skipped by this mask. Alias for SESS_TERMINATED or SESS_TIMEDOUT or SESS_SEEN_FIN or SESS_SEEN_RST |
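
Because the state is a bit field, individual flags have to be tested with a mask rather than compared for equality. The sketch below shows one way to do that. It assumes the state arrives as a plain Lua number; the helper names has_flag, has_any and is_flow_ended are hypothetical and not part of the Trisul API. Plain arithmetic is used so the sketch does not depend on a particular Lua version; if the runtime is LuaJIT, bit.band from its bit module would be the more usual choice.

```lua
-- Flag values copied from the Session state table.
local SESS_SEEN_FIN   = 0x0040
local SESS_SEEN_RST   = 0x0080
local SESS_TIMEDOUT   = 0x0100
local SESS_TERMINATED = 0x0800
local SESS_FLOWEND    = 0x09C0  -- FIN, RST, TIMEDOUT and TERMINATED combined

-- True when the single-bit flag is set in state (hypothetical helper).
function has_flag(state, flag)
  return math.floor(state / flag) % 2 == 1
end

-- True when state shares at least one bit with mask (hypothetical helper).
function has_any(state, mask)
  local b = 1
  while b <= mask do
    if has_flag(mask, b) and has_flag(state, b) then return true end
    b = b * 2
  end
  return false
end

-- True for flows that have ended, false for in-progress flows.
function is_flow_ended(state)
  return has_any(state, SESS_FLOWEND)
end

print(is_flow_ended(0x0001 + 0x0002 + 0x0040))  -- SESS_INIT + SESS_SEEN_SYN + SESS_SEEN_FIN
```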

Example use of object

The following example prints the IP address of endpoints and total bytes

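onnewflow = function(engine, newflow)
  -- flow() returns the FlowID object that holds the flow tuples
  local flow_object = newflow:flow()

  print("ip a-end "..flow_object:ipa_readable())
  print("ip z-end "..flow_object:ipz_readable())
  -- the byte counts are summed first, then appended to the label
  print("total bytes "..(newflow:az_bytes() + newflow:za_bytes()))

  -- ..

end,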

Functions Reference

Function onnewflow

Purpose

Use this to collect flow based metrics.

When called

When a new flow is detected. A new flow is detected when a metric for a new flow key is updated and that flow key is not currently being tracked in the streaming backend.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| flow | A Flow object | the flow |

Return value

Ignored

Example
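
A minimal sketch of an onnewflow handler that counts new flows per A-side host. It uses only flow() and ipa_readable() from the Flow reference above; the table new_flows_by_host is hypothetical bookkeeping for this illustration and not part of the Trisul API.

```lua
-- Hypothetical bookkeeping table: A-side IP -> number of new flows seen.
new_flows_by_host = {}

sg_monitor = {

  onnewflow = function(engine, flow)
    local host = flow:flow():ipa_readable()
    new_flows_by_host[host] = (new_flows_by_host[host] or 0) + 1
  end,

}
```

The table grows with the number of distinct hosts, so a long-running script would need to prune it at some point, for example at the end of a flush cycle.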


Function onupdate

High frequency function

On busy networks this can result in thousands of updates every second. Keep your Lua onupdate(..) function efficient and avoid I/O or blocking calls.

Purpose

Used to track all streaming metrics assigned to a flow.

When called

This is called when any event or metric is detected on a flow.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| flow | A Flow object | the flow |

Return value

Ignored

Example
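
A minimal sketch of an onupdate handler that tags a flow once its total byte count crosses a threshold. It does only table lookups and arithmetic, in line with the high frequency warning above. The threshold, the tag name "bigflow" and the already_tagged table are assumptions made for this illustration.

```lua
local BIG_FLOW_BYTES = 100 * 1024 * 1024  -- hypothetical threshold: 100 MB

-- Hypothetical bookkeeping: flow key -> true once the flow has been tagged.
already_tagged = {}

sg_monitor = {

  onupdate = function(engine, flow)
    local key = flow:key()
    if already_tagged[key] then return end  -- cheap early exit, tag only once

    if flow:az_bytes() + flow:za_bytes() >= BIG_FLOW_BYTES then
      flow:add_tag("bigflow")
      already_tagged[key] = true
    end
  end,

  -- Drop the bookkeeping entry so the table does not grow without bound.
  onterminate = function(engine, flow)
    already_tagged[flow:key()] = nil
  end,

}
```

add_tag is used here rather than engine:tag_flow(..) for the reason given in the usage note in the Flow reference.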


Function onterminate

Purpose

Handle the end of a flow. The termination can be a normal TCP termination or a timeout.

When called

When a normal TCP flow termination is detected or when the flow is timed out of the streaming data structures.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| flow | A Flow object | the flow |

Return value

Ignored

Example
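
A minimal sketch of an onterminate handler that prints a one-line summary of the finished flow. It uses key(), time_window(), az_bytes(), za_bytes() and retransmissions() from the Flow reference above; the helper summarize_flow is hypothetical and exists only to keep the formatting separate from the handler.

```lua
-- Hypothetical helper: builds a one-line summary of a flow.
function summarize_flow(flow)
  local starttm, lasttm = flow:time_window()
  return string.format("flow %s ended: %d s, %d bytes, %d retransmissions",
    flow:key(),
    lasttm - starttm,
    flow:az_bytes() + flow:za_bytes(),
    flow:retransmissions())
end

sg_monitor = {

  onterminate = function(engine, flow)
    print(summarize_flow(flow))
  end,

}
```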


Function onbeginflush

Purpose

Use this to set up any state your code needs before flows start being flushed to the backend storage.

When called

When accumulated streaming metrics for all flows are about to be flushed to the backend.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| timestamp | A timestamp (tv_sec) value | timestamp value, Unix epoch |

Return value

Ignored

Example
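
A minimal sketch of an onbeginflush handler that resets a per-cycle counter and derives the time since the previous flush from the timestamp parameter. The flush_clock table is hypothetical bookkeeping for this illustration.

```lua
-- Hypothetical bookkeeping for the current flush cycle.
flush_clock = { last_ts = nil, interval = nil, flows = 0 }

sg_monitor = {

  onbeginflush = function(engine, timestamp)
    -- Seconds since the previous flush; stays nil on the first cycle.
    if flush_clock.last_ts then
      flush_clock.interval = timestamp - flush_clock.last_ts
    end
    flush_clock.last_ts = timestamp
    flush_clock.flows = 0  -- start counting afresh for this cycle
  end,

}
```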


Function onflush

Purpose

Process each flow as it is flushed.

When called

Just before each flow is flushed to the backend database. At this point all the metrics are attached to the flow and ready for consumption.

Long running flows
Long running flows can be flushed multiple times, by default every 300 seconds (5 minutes). Check the flow state against SESS_FLOWEND to skip these periodic flushes if you want to process only terminated flows.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| flow | A Flow object | the flow |

Return value

Ignored

Example
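
A minimal sketch of an onflush handler that processes only flows that have ended and ignores the periodic flushes of long running flows. It assumes flow:state() returns the state bit field as a number; the helper flow_ended and the counter ended_bytes are hypothetical.

```lua
-- The four bits that make up SESS_FLOWEND (0x09C0):
-- SESS_SEEN_FIN, SESS_SEEN_RST, SESS_TIMEDOUT, SESS_TERMINATED.
local FLOWEND_BITS = { 0x0040, 0x0080, 0x0100, 0x0800 }

-- Hypothetical helper: true when any of the flow-end bits is set.
local function flow_ended(state)
  for _, flag in ipairs(FLOWEND_BITS) do
    if math.floor(state / flag) % 2 == 1 then return true end
  end
  return false
end

-- Hypothetical counter: total bytes carried by flows that have ended.
ended_bytes = 0

sg_monitor = {

  onflush = function(engine, flow)
    if not flow_ended(flow:state()) then
      return  -- periodic flush of a flow that is still in progress
    end
    ended_bytes = ended_bytes + flow:az_bytes() + flow:za_bytes()
  end,

}
```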


Function flushfilter

Purpose

Control if a flow is flushed to the database backend or not.

When called

Just before each flow is flushed to the database. You can look at the contents of the flow object and determine if you want this flow to be flushed or not.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| flow | A Flow object | the flow |

Return value

true: flush this flow to the backend database node

false: do not flush this flow

Voting considerations

If you have multiple scripts S1, S2, .. SN that vote differently in flushfilter(), the following rule is enforced.

  1. A flow is skipped only if ALL scripts vote NO by returning false.
  2. If even one script Sx returns true (YES) or does not implement flushfilter(), the flow is flushed.
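
The voting rule can be modelled in a few lines. This is only an illustration of the rule as stated above, not Trisul's own code. The function name is_flushed and the "absent" marker for a script without a flushfilter() are hypothetical, and the sketch assumes at least one script is loaded.

```lua
-- votes holds one entry per script: true, false, or "absent" when the
-- script does not implement flushfilter().
function is_flushed(votes)
  for _, vote in ipairs(votes) do
    if vote ~= false then
      return true  -- a single YES or absent vote is enough to flush
    end
  end
  return false     -- every script voted NO
end

print(is_flushed({ false, false }))
print(is_flushed({ false, "absent" }))
```

In practice this means a script cannot suppress a flow on its own if another script monitoring the same session group wants to keep it.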

Example
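
A minimal sketch of a flushfilter handler that skips very small flows. The MIN_BYTES threshold is an assumption chosen for illustration.

```lua
local MIN_BYTES = 1024  -- hypothetical threshold

sg_monitor = {

  flushfilter = function(engine, flow)
    -- true = save the flow, false = vote to skip it
    return flow:az_bytes() + flow:za_bytes() >= MIN_BYTES
  end,

}
```

Returning false is only a vote. If another script returns true or has no flushfilter() at all, the flow is still flushed.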


Function onendflush

Purpose

End of a flush cycle.

When called

When all the flows have been flushed for this cycle. You can clean up here whatever you initialized in onbeginflush.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |

Return value

Ignored

Example
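
A minimal sketch of a full flush cycle: state is created in onbeginflush, updated in onflush, and released in onendflush. The variables flush_cycle and last_flush are hypothetical bookkeeping for this illustration.

```lua
flush_cycle = nil  -- state of the cycle in progress, nil between cycles
last_flush  = nil  -- summary of the most recently completed cycle

sg_monitor = {

  onbeginflush = function(engine, timestamp)
    flush_cycle = { started_at = timestamp, flows = 0 }
  end,

  onflush = function(engine, flow)
    if flush_cycle then
      flush_cycle.flows = flush_cycle.flows + 1
    end
  end,

  onendflush = function(engine)
    last_flush = flush_cycle  -- keep the summary of the finished cycle
    flush_cycle = nil         -- release the per-cycle state
  end,

}
```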


Function onmetronome

Purpose

Plug into a metronome.

When called

If you define an onmetronome(..) function you will be plugged into the Trisul metronome heartbeat mechanism. This method will be called on every metronome tick (roughly every second). The context in which this method is called is thread-safe, so you can add metrics to the Engine from here.

Parameters

| name | type | desc |
|---|---|---|
| engine | An engine object | use this object to add metrics, resources, or alerts into the Trisul framework |
| timestamp | Number | Current timestamp (tv_sec epoch seconds) |
| tick_count | Number | An incrementing tick counter |
| tick_interval | Number | The tick interval, in seconds |

Return value

Ignored

Example
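
A minimal sketch of an onmetronome handler that produces a report every tenth tick, using tick_count to do periodic work without keeping a timer of its own. The report interval and the variables updates_seen and last_report are assumptions for this illustration.

```lua
local REPORT_EVERY_TICKS = 10  -- hypothetical report interval, in ticks

updates_seen = 0    -- flow updates counted since the last report
last_report  = nil  -- text of the most recent report

sg_monitor = {

  onupdate = function(engine, flow)
    updates_seen = updates_seen + 1
  end,

  onmetronome = function(engine, timestamp, tick_count, tick_interval)
    if tick_count % REPORT_EVERY_TICKS ~= 0 then return end

    last_report = string.format("%d flow updates in the last %d seconds",
      updates_seen, REPORT_EVERY_TICKS * tick_interval)
    updates_seen = 0
  end,

}
```

A real script would typically turn the count into a metric through the engine object instead of storing a string.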