TIAA: A Toolkit for Intrusion Alert Analysis (Version 0.4)
What's New
We have added three new utilities to TIAA in version 0.4:
- Association Analysis (Extracting frequent co-occurrences of attribute values from a
set of alerts)
- Attack Strategy Extraction (Extracting attack strategies from a
correlation graph)
- Missed Attack Hypotheses (Hypothesizing possibly missed attacks)
Supported Platforms
- This tool has been tested on Windows XP with MS SQL
Server 2000.
Download Test Data and Execution Procedure
- Download this SQL statement and execute
it in the database to create the target table "events";
- Download any of these alert datasets [scenario 1 (dmz,
inside), scenario 2 (dmz,
inside)], which were generated by RealSecure
from the DARPA 2000 evaluation dataset. You can import
a dataset into MS SQL Server with the command "bulk insert events
from 'file path' with (FIELDTERMINATOR=',');";
- Download the sample knowledge base XML
file and its schema;
- Download the sample property
file;
- For aggregation analysis and attack strategy extraction, download a sample abstraction
hierarchy file here;
- For missed attack hypotheses, you can use Ethereal to analyze the
tcpdump file and save the analysis result (packet summary) into a
text file. This information can be used to prune some incorrectly hypothesized attacks. For your convenience, you can also download sample
analysis results here. (You need to unzip the file.)
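When several dataset files have to be imported, the bulk insert command quoted in the list above can be generated per file. The following is only a minimal sketch: the helper name `bulk_insert_statement` and the example path are hypothetical and not part of TIAA, and the resulting statement still has to be run through whatever SQL client you use.

```python
def bulk_insert_statement(csv_path, table="events"):
    """Build the T-SQL bulk insert command for one alert dataset file.

    Hypothetical helper for illustration; TIAA does not ship it.
    """
    # Double any single quote so the path stays inside the SQL string literal.
    escaped = csv_path.replace("'", "''")
    return f"bulk insert {table} from '{escaped}' with (FIELDTERMINATOR=',');"


# Example with a made-up file location.
print(bulk_insert_statement(r"C:\tiaa\data\scenario1_inside.csv"))
```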
Checklist before You Run This Tool
- Java 1.4 or above
- MS SQL Server 2000 and JDBC driver
- Xerces Java Parser v1.4.4 (You can get it from Apache's website)
- GraphViz (You can get it from AT&T's website)
- Ethereal (not necessary if you have downloaded our sample
analysis results) (You can get Ethereal from the Ethereal website)
Main Contributor for Ver 0.4 Acknowledgment
TIAA Ver 0.4 is based on TIAA Ver 0.3. Yun Cui, Yiquan Hu, and Pai Peng contributed to the earlier versions of TIAA.
Three New Utilities
Association Analysis
Association analysis helps find frequent co-occurrences of values across
the different attributes that describe alerts. For example, through association
analysis, we may find that many attacks are from source IP address
172.16.1.2 to destination IP address 172.16.1.100 at destination port
80. Association analysis takes two input parameters: the set of alerts
(represented by the alert collection ID) and the support threshold.
Given a set S of alerts and a support threshold t%, a frequent attribute set
A1=a1 ^ A2=a2 ^ ... ^ An=an (where A1, A2, ..., An are attribute names and
a1, a2, ..., an are attribute values) denotes that at least t% of
the alerts in S have attribute values satisfying A1=a1 ^ A2=a2 ^ ... ^
An=an. A sample result of association analysis is as follows.
| Frequent Attribute Sets | Support |
| HyperAlertType=Rsh | 38.63636363636363% |
| HyperAlertType=Sadmind_Amslverify_Overflow | 31.818181818181817% |
| DestPort=514 | 38.63636363636363% |
| SrcIPAddress=202.077.162.213 | 47.72727272727273% |
| HyperAlertType=Rsh ^ DestPort=514 | 38.63636363636363% |
| HyperAlertType=Sadmind_Amslverify_Overflow ^ SrcIPAddress=202.077.162.213 | 31.818181818181817% |
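The definition of a frequent attribute set can be illustrated with a small brute-force sketch. It is not TIAA's implementation (the algorithm TIAA uses is not described here); the function name `frequent_attribute_sets` and the four example alerts are made up for illustration, and enumerating every combination is only practical for alerts with a handful of attributes.

```python
from collections import Counter
from itertools import combinations


def frequent_attribute_sets(alerts, threshold_pct):
    """Map each frequent attribute set to its support in percent.

    alerts: list of dicts mapping attribute name to attribute value.
    A set of (attribute, value) pairs is frequent when at least
    threshold_pct percent of the alerts match all of its pairs.
    """
    counts = Counter()
    for alert in alerts:
        items = sorted(alert.items())
        # Count every non-empty combination of this alert's attribute=value pairs.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    total = len(alerts)
    return {
        attrs: 100.0 * n / total
        for attrs, n in counts.items()
        if 100.0 * n / total >= threshold_pct
    }


# Made-up alerts, not taken from the sample datasets.
alerts = [
    {"HyperAlertType": "Rsh", "DestPort": "514"},
    {"HyperAlertType": "Rsh", "DestPort": "514"},
    {"HyperAlertType": "Sadmind_Ping", "DestPort": "111"},
    {"HyperAlertType": "Sadmind_Amslverify_Overflow", "DestPort": "32773"},
]
for attrs, support in frequent_attribute_sets(alerts, 50).items():
    print(" ^ ".join(f"{name}={value}" for name, value in sorted(attrs)), f"{support}%")
```

With a threshold of 50%, only HyperAlertType=Rsh, DestPort=514, and their conjunction qualify, because each of them matches two of the four alerts.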
Attack Strategy Extraction
An attack strategy is represented by a
directed graph, where each node is a hyper-alert type, and each
directed edge represents the equality constraints the related nodes
should satisfy. Two sample attack strategy graphs are as follows.
Figure: An attack strategy graph without abstraction hierarchy
Figure: An attack strategy graph with abstraction hierarchy
Missed Attack Hypotheses
The missed attack hypotheses utility
integrates a set of alert collections and hypothesizes possibly missed
attacks. The part of the GUI related to attack hypotheses sets the important
parameters for this utility. When integrating a set of alert
collections, separate the collection IDs with ",". The
packet summary information must be imported into the database before it can
be used to filter out incorrect hypotheses. Since importing all packet
information into the database may take a long time, the imported
information can be saved for later analysis instead of being imported every
time. For replayed traffic, we also need to input the replay time. If
you use our sample datasets, the replay time for inside1 is "2001-11-10 03:59:55",
and the replay time for dmz1 is "2001-11-09 23:06:18".
For detailed information on how to use this
utility, please refer to the Installation and Operation Manual. A sample
integrated correlation graph produced by the missed attack hypotheses
utility is as follows (the gray nodes are hypothesized alerts).
Figure: An integrated correlation graph
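The two text inputs described for this utility, a comma-separated list of collection IDs and a replay time, can be checked before they are typed into the GUI. The sketch below only illustrates the two formats as they appear on this page; the function names are hypothetical, and how TIAA itself parses these fields is not documented here.

```python
from datetime import datetime


def parse_collection_ids(text):
    """Split a comma-separated list of collection IDs, dropping blanks and spaces."""
    return [part.strip() for part in text.split(",") if part.strip()]


def parse_replay_time(text):
    """Parse a replay time written as 'YYYY-MM-DD HH:MM:SS'."""
    # Collapse runs of whitespace so a wrapped value still parses.
    return datetime.strptime(" ".join(text.split()), "%Y-%m-%d %H:%M:%S")


print(parse_collection_ids("1, 2,3"))
print(parse_replay_time("2001-11-10 03:59:55"))  # replay time given for inside1
```

`parse_replay_time` raises `ValueError` for a value that does not match the format, which makes typing mistakes visible early.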