Intrusion Detection by Low Level Anomalies in Network Traffic

Matthew V. Mahoney
Florida Institute of Technology
mmahoney@cs.fit.edu

Draft, updated Nov. 29, 2000

Abstract

We show that many network probes and denial of service attacks can be detected with minimal overhead by observing anomalous field values in IP packets using a nonstationary model. On the DARPA IDS test set, we detect 32% of all attacks vs. 48% for the best of 18 systems in the 1999 evaluation (at < 10 false alarms per day). Among attacks at the transport layer and below, we detect 67% of attacks, vs. 47% for the best system.

Keywords: Intrusion detection, Anomaly detection, Network security.

Introduction

An intrusion detection system (IDS) monitors network traffic, operating system events, or the file system to detect unauthorized attempts to access a system. Abstractly, an IDS takes some input x and estimates the probability that it represents an attack. By Bayes' law, this can be written as

P(attack|x) = P(x|attack)P(attack)/P(x)

There are two general approaches to this problem. In signature (or misuse) detection, we develop a model P(x|attack) for all known attacks, and assume P(attack) and P(x) are fixed (i.e. the rate of attacks is constant, and all possible inputs x are equally likely). The method detects only known attacks, and the model must be continually updated as new attacks are discovered. In anomaly detection, we model only "normal" input, P(x), assuming nothing about the form of the attack. This method detects novel attacks, but often produces more false alarms because of the difficulty of modeling P(x). The model must be broad enough to cover all possible legitimate input, yet narrow enough to exclude most attacks. There is, of course, the underlying assumption that P(x|attack) ≠ P(x).

Forrest et al. (1996) demonstrated the feasibility of anomaly detection in operating system call sequences for attacks against UNIX privileged programs and servers. They found that programs make very predictable call sequences, and that they deviate from this model when an attacker exploits a bug in the program, for example, a buffer overflow vulnerability.

Network traffic is more difficult to model than system call sequences, and often must be processed faster. As a result, most network anomaly detectors (e.g. firewalls) are rule based. A network administrator specifies what traffic is allowed and excludes all else. Statistical models, such as the Emerald-TCP system (Porras and Valdes, 1998), still tend to be rule based in that the administrator must specify what statistics to collect. Configuring these systems requires expertise in network protocols, making them impractical for average users, who are the people who need an IDS the most.

In the following sections, we describe a simple, fast, and self configuring statistical model of network packet headers (data link, network, and transport layers) that detects many attacks that exploit these protocols in the 1999 DARPA IDS test set. We compare several variations of the model, obtaining the best performance with a nonstationary model, one which assumes that anomalies are distributed in bursts across both space and time.

A Network Packet Model

We developed an IDS that assigns anomaly scores to individual network packets based on their headers, with the goal of minimizing hard-coded knowledge about network protocols. The packets are parsed into Ethernet, IP, TCP, UDP, and ICMP headers, and split into fields of 1 to 4 bytes. During training, the number of occurrences of each value is counted for each field. For fields of 2 or more bytes, the value is first hashed modulo a constant H = 1000, in order to reduce the amount of data collected. Header checksums are computed, and the checksum fields are replaced with their computed values, normally FFFF hex = 65535 (or 535 after hashing), because we felt it was unreasonable to expect a machine learning algorithm to figure out how to compute checksums on its own. Table 1 shows the result of training the model on 7 days of attack-free network traffic on the DARPA test set (week 3, inside, as described in the next section). The first two columns are the name of the field and its size in bytes. Column 3 lists values that occur at least once. The last column is Nnovel, the number of values with a count of 1 or more. N, shown in the heading of each group of fields, is the total number of observations. For example, out of N = 90288 observations of the ICMPtype field, the Nnovel = 3 values 0, 3, and 8 were the only ones to appear.

Ethernet fields     Bytes  Observed Values            N = 34909810

MacHiDest           3   186 192 215 219 231 297 320 548 630      9
MacLoDest           3   9 88 215 257 268 290 394 772 831...     12
MacHiSource         3   186 192 219 231 297 320                  6
MacLoSource         3   88 257 268 290 394 831 859 961 987       9
NetProtocol         2   48 54 310 864                            4

IP header fields    Bytes  Observed Values            N = 34669966

IPHeaderLength      1   69                                       1
TOS                 1   0 8 16 192                               4
IPLength            2   0 1 2 3 4 5 6 7 8...                  1000
FragID              2   0 1 2 3 4 5 6 7 8...                  1000
Frag. flags/offset  2   0 384                                    2
TTL                 1   2 32 60 62 63 64 127 128 254 255        10
TransportProtocol   1   1 6 17                                   3
IPchecksum          2   535                                      1
Source              4   0 2 3 4 6 7 8 9 10...                  851
Destination         4   0 2 3 4 6 7 8 9 10...                  853

TCP header fields   Bytes  Observed Values            N = 27010151

FromPortTCP         2   0 1 2 3 4 5 6 7 8...                  1000
ToPortTCP           2   0 1 2 3 4 5 6 7 8...                  1000
Sequence            4   0 1 2 3 4 5 6 7 8...                  1000
Acknowledgment      4   0 1 2 3 4 5 6 7 8...                  1000
TCPheaderLength     1   80 96                                    2
UAPRSF flags        1   2 4 16 17 18 20 24 25 56                 9
WindowSize          2   0 1 2 3 4 5 6 7 8...                  1000
TCPchecksum         2   535                                      1
URGpointer          2   0 1                                      2

TCP options         Bytes  Observed Values             N = 1612632

TCPoption           4   36 112                                   2

UDP header fields   Bytes  Observed Values             N = 7565939

FromPortUDP         2   0 1 2 3 4 5 6 7 8...                  1000
ToPortUDP           2   0 1 2 3 4 5 6 7 8...                  1000
UDPlength           2   25 27 29 30 32 33 34 35 36...          128
UDPchecksum         2   0 535                                    2

ICMP header fields  Bytes  Observed Values               N = 90288

ICMPtype            1   0 3 8                                    3
ICMPcode            1   0 1 3                                    3
ICMPchecksum        2   535                                      1

Table 1. Model of attack free network traffic.
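
The training step that produces Table 1 might be sketched in Python as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes that packets have already been parsed into dictionaries of integer field values, that the hash is a plain modulo by H, and the names hash_value, train, and summarize are hypothetical.

```python
from collections import Counter, defaultdict

H = 1000  # hash modulus given in the text


def hash_value(value, size_bytes):
    """Reduce fields of 2 or more bytes modulo H; keep 1-byte fields as is."""
    return value % H if size_bytes >= 2 else value


def train(packets, field_sizes):
    """Count occurrences of each (hashed) value for each field.

    packets: iterable of dicts mapping field name -> integer value
    field_sizes: dict mapping field name -> field size in bytes
    """
    counts = defaultdict(Counter)
    for pkt in packets:
        for name, value in pkt.items():
            counts[name][hash_value(value, field_sizes[name])] += 1
    return counts


def summarize(counts):
    """Return {field: (N, Nnovel)} as in the last column of Table 1."""
    return {name: (sum(c.values()), len(c)) for name, c in counts.items()}


# Toy data (hypothetical, not taken from the DARPA set).
field_sizes = {"TTL": 1, "IPchecksum": 2, "ICMPtype": 1}
packets = [
    {"TTL": 64, "IPchecksum": 0xFFFF, "ICMPtype": 8},
    {"TTL": 64, "IPchecksum": 0xFFFF, "ICMPtype": 0},
    {"TTL": 255, "IPchecksum": 0xFFFF},  # a non-ICMP packet
]
model = train(packets, field_sizes)
summary = summarize(model)
```

In this toy run the checksum field holds the single hashed value 535, matching the IPchecksum row of Table 1.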

We developed four models (designated R, S, T, and U) that differ in how the anomaly scores are computed for each packet. In R, S, and T, we compute the probability p_{i,j} that the i'th field will have value j by applying Laplace's law to the model counts in order to eliminate probabilities of 0 or 1.

p_{i,j} = (N_{i,j} + 1)/(N_i + H)
where value j of field i occurred N_{i,j} times out of N_i observations.

The models differ in how the scores are summed over the fields.

R: Σ_i ln(1/p_{i,j})
S: max_i 1/p_{i,j}
T: Σ_i 1/p_{i,j}

Model R effectively multiplies the probabilities, which would be appropriate if the fields were independent. Model S counts only the most anomalous field, which would be appropriate if the fields were fully dependent on one another. Model T strikes a middle ground. Although ad hoc, it turns out to give the best results of the three.
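
The three scoring rules can be illustrated with a short Python sketch. It assumes a model stored as per-field dictionaries of value counts; the function names and the toy counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

H = 1000  # hash modulus, also used as the Laplace denominator term


def laplace(n_ij, n_i, h=H):
    """Laplace estimate p = (N_ij + 1)/(N_i + H); never exactly 0 or 1."""
    return (n_ij + 1) / (n_i + h)


def scores_rst(packet, counts):
    """Return the R, S, and T scores of one packet.

    packet: dict mapping field name -> (hashed) value
    counts: dict mapping field name -> {value: training count}
    """
    inv_p = []
    for field, value in packet.items():
        n_i = sum(counts[field].values())
        p = laplace(counts[field].get(value, 0), n_i)
        inv_p.append(1.0 / p)
    return {
        "R": sum(math.log(x) for x in inv_p),  # sum of ln 1/p
        "S": max(inv_p),                       # most anomalous field only
        "T": sum(inv_p),                       # sum of 1/p
    }


# Toy model: 1000 observations of each of two fields (hypothetical counts).
counts = {"TTL": {64: 998, 255: 2}, "TOS": {0: 1000}}
normal = scores_rst({"TTL": 64, "TOS": 0}, counts)
odd = scores_rst({"TTL": 44, "TOS": 0}, counts)  # TTL 44 never seen
```

Here the unseen TTL value gets p = 1/2000, so it dominates both S and T, while R grows only logarithmically.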

We can improve on this further with a nonstationary model. In model U, we ignore counts higher than 1, and ask only whether a field value was seen in training or not. If it was, the score is 0. If not, the score is 1/p (summed as in T), where p = Nnovel/N, the fraction of novel values seen in training. The model is nonstationary because we assume that the probability of an event depends on recent history more than the distant past, specifically that the probability of an event is proportional to 1/t, where t is the time since the event last occurred. In this model, events occur in bursts separated by long gaps. Thus, training counts do not give an indication of immediate likelihood, so we ignore them.
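
As a rough sketch of the model U score (before the time weighting used for reporting), the following Python fragment scores a packet against the sets of values seen in training. The two fields and their N and Nnovel values are copied from Table 1; the function name and data layout are assumptions, and fields absent from the model are simply skipped.

```python
# Values seen in training and observation counts, from Table 1.
seen = {"ICMPtype": {0, 3, 8}, "TOS": {0, 8, 16, 192}}
n_obs = {"ICMPtype": 90288, "TOS": 34669966}


def score_u(packet, seen, n_obs):
    """Sum 1/p over fields whose value was never seen in training.

    p = Nnovel/N, so 1/p = N/Nnovel. Values seen at least once score 0.
    """
    total = 0.0
    for field, value in packet.items():
        if field in seen and value not in seen[field]:
            total += n_obs[field] / len(seen[field])
    return total


familiar = score_u({"ICMPtype": 8, "TOS": 0}, seen, n_obs)
novel_icmp = score_u({"ICMPtype": 5, "TOS": 0}, seen, n_obs)
novel_tos = score_u({"TOS": 0x20}, seen, n_obs)
```

A field with many observations but few distinct values, such as TOS, yields a large score for any new value, while a field that takes nearly every hashed value contributes little.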

For reporting purposes, we score anomalies in model U as t/p, where t is the number of seconds since the last anomaly in the same field. This has the effect of reporting only the first anomaly in a burst. We cannot use this method in R, S, and T, because every field is anomalous to some extent. Instead, we set a threshold, and require that a packet's score exceed both the threshold and the score of any packet reported in the previous 60 seconds (the allowable error in the DARPA evaluation). We used thresholds of 20 in R, and 10^6 in S and T, which we found to limit output to a few hundred alarms per day.
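
Both reporting rules can be sketched in Python. This is one possible reading of the description above, not the original implementation: the class and function names are hypothetical, the per-field clock is assumed to start at the beginning of the test period, and the threshold rule is interpreted as "exceed the threshold and every score reported in the last 60 seconds".

```python
class BurstScorer:
    """Model U reporting: score each novel field value as t/p."""

    def __init__(self, seen, n_obs, start_time=0.0):
        self.seen = seen      # field -> set of values seen in training
        self.n_obs = n_obs    # field -> number of training observations
        # Assumption: the time of the "last anomaly" starts at start_time.
        self.last_anomaly = {field: start_time for field in seen}

    def score(self, packet, now):
        total = 0.0
        for field, value in packet.items():
            if field in self.seen and value not in self.seen[field]:
                t = now - self.last_anomaly[field]
                total += t * self.n_obs[field] / len(self.seen[field])
                self.last_anomaly[field] = now  # later hits in a burst get small t
        return total


def report_with_threshold(scored_packets, threshold, window=60.0):
    """Models R, S, T: keep (time, score) pairs that beat the threshold
    and every score already reported within the last `window` seconds."""
    reported = []
    for time, score in scored_packets:
        bar = max([threshold] + [s for t, s in reported if time - t <= window])
        if score > bar:
            reported.append((time, score))
    return reported


scorer = BurstScorer({"ICMPtype": {0, 3, 8}}, {"ICMPtype": 90288})
first = scorer.score({"ICMPtype": 5}, now=3600.0)   # first anomaly in an hour
second = scorer.score({"ICMPtype": 5}, now=3601.0)  # one second later
alarms = report_with_threshold(
    [(0, 5), (10, 30), (20, 25), (50, 40), (100, 35), (200, 21)], threshold=20
)
```

In the toy run, the second anomaly in the burst scores 3600 times lower than the first, so only the first is likely to survive a false alarm cutoff.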

The 1999 DARPA IDS Evaluation

In 1998 and 1999, the Defense Advanced Research Projects Agency (DARPA) evaluated intrusion detection systems on a simulated Air Force network, and then made the results and data available for further IDS development and testing (Lippmann et al., 2000, 1999; DARPA, 2000). The 1999 test consisted of three weeks of training data and two weeks of test data, in which 200 instances of 68 attacks were performed on four "victim" hosts, running SunOS, Solaris, Linux, and Windows NT. The data consists of all network traffic inside and outside the Internet gateway, Solaris BSM (Basic Security Module) data of all privileged system calls, audit logs, and selected daily system file dumps. Eighteen systems were developed using the training data, which consisted of two weeks of attack-free traffic (weeks 1 and 3), and one week with a subset of 43 attack instances from the test set, labeled with the name and type of attack, the victim IP address, and start and finish time. Developers were then provided with the off-line test data, and asked to identify the attacks by victim, time, and a numerical score to indicate certainty.

Systems R through U were trained on the inside tcpdump data from week 3 (5 days), plus 2 "extra" days. There was no inside test data from week 4 day 2, so 10% of the attacks were not detectable. The data from week 1 (attack free) and week 2 (labeled attacks) were not used, nor was any outside tcpdump data used.

Table 2 lists the results of the evaluation by individual attacks for the 18 systems of the original DARPA evaluation (A through Q) and the four that we developed (R through U). The systems are grouped by the type of data they examine: File (file dumps), NT (Windows NT audit logs), BSM (Solaris), Net (network traffic), Comb (two or more sources), and LLN (our low level network anomaly detectors, which belong in the Net category). The result is shown as a single hexadecimal digit indicating the expected number of instances detected at an average false alarm rate of 10 per day or less. A * indicates that all attacks were detected, and a . indicates less than 1. The latter can result when there is a tie among the lowest scoring detections at the false alarm limit.

An attack is considered detected if there are one or more detections within 60 seconds of an attack segment that correctly identify the victim. Some attacks may have more than one segment with different start and finish times and victims. Extra detections of an attack are ignored. The Succ column gives the success rate of the best IDS out of the total number of instances.
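
The matching rule above can be expressed as a small Python sketch. It is not the DARPA scoring program: the Segment type, the function name, the IP addresses, and the times are hypothetical, and "within 60 seconds of an attack segment" is assumed to mean from 60 seconds before the segment starts to 60 seconds after it ends.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    attack_id: str  # identifies one attack instance
    victim: str     # victim IP address
    start: float    # segment start time in seconds
    end: float      # segment end time in seconds


def evaluate(alarms, segments, slack=60.0):
    """Return (set of detected attack ids, number of false alarms).

    alarms: iterable of (time, victim_ip) pairs reported by the IDS
    """
    detected = set()
    false_alarms = 0
    for time, victim in alarms:
        matched = {
            s.attack_id
            for s in segments
            if s.victim == victim and s.start - slack <= time <= s.end + slack
        }
        if matched:
            detected |= matched  # extra detections add nothing new
        else:
            false_alarms += 1
    return detected, false_alarms


segments = [
    Segment("pod-1", "10.0.0.50", 1000.0, 1010.0),
    Segment("sweep-1", "10.0.0.50", 5000.0, 5100.0),
    Segment("sweep-1", "10.0.0.51", 5200.0, 5300.0),  # second segment, other victim
]
alarms = [
    (1065.0, "10.0.0.50"),  # within 60 s of the end of pod-1
    (1068.0, "10.0.0.50"),  # extra detection, ignored
    (1100.0, "10.0.0.50"),  # too late: false alarm
    (1005.0, "10.0.0.99"),  # wrong victim: false alarm
    (5250.0, "10.0.0.51"),  # hits the second segment of sweep-1
]
detected, false_alarms = evaluate(alarms, segments)
```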

The Dif column indicates the difficulty, with n for new attacks (not seen in week 2), and s for stealthy attacks, where the attacker took steps to hide the attack from the IDS (for example, a slow port scan).

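The Type column is the DARPA classification, as follows: p = probe, d = denial of service (DoS), r = remote to local (R2L), u = user to root (U2R), and s = data (secret) attack.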

The Prot column is the protocol most likely to show evidence of the attack. sh and nt are UNIX and Windows NT attacks from a shell (usually U2R). console indicates an attack that requires physical access, and is the only type that generates no network traffic. Our systems are only designed (so far) to detect low level network attacks, those with ARP, IP, TCP, UDP, and ICMP protocols, which are mostly probe and DoS attacks. Most of the attacks are described in (Kendall, 1999), and many were obtained from hacker-oriented websites such as www.rootshell.com or the Bugtraq mailing list archives.


                                  File NT  BSM   Net Comb LLN
Attack       Type Prot  Dif  Succ   A BCD EFGHI JKLMN OPQ RSTU
---------       - ----  ---  ----   - --- ----- ----- --- ----
arppoison       d arp     n   1/4    |   |     |.  1 |. 1| .
pod             d ip          4/4    |   |     |.** 3|. 3| . *
teardrop        d ip          3/3    |   |     |.2* *|.2*| . *
mscan           p ip          1/1   *|*..| *..*|.****|.**| ***
ipsweep         p ip       s  3/6    |   |     |  2  |   | . 3
insidesniffer   p ip      n   2/2    |1..|    1|   . | 1 |111*
queso           p ip      ns  3/4   1|   |  ...|     |   | . 3
smurf           d icmp        5/5    |   |    1|.24 *|. *| . *
syslogd         d udp         4/4   2|   | 1222|.   *|. *|    
udpstorm        d udp         2/2    |   |     |.**  |.  |   *
land            d udp         2/2    |   |     |.1* *|.1*|    
neptune         d tcp         4/4    |   |     |.2*3*|.2*| . 3
satan           p tcp         2/2    |   |     |.**1*|.**| . *
ntinfoscan      p tcp         3/3    |   |     |.  2*|. *| . 2
processtable    d tcp         3/3    |   |     |  *11|  1|    
portsweep       p tcp      s 14/16   |   |    .|.3632|.52| .5E
tcpreset        d tcp     n   1/3    |1. |  111|   1 |   |    
netbus          r tcp     n   3/3    | ..|     |    *|  *| . *
dosnuke         d tcp     n   4/4    |...|     |.1 .2|.12|1.1*
resetscan       d tcp     ns  0/1    |   |     |     |   |
named           r dns         3/3    |   |     |.   *|. *| . 1
ls_domain       p dns     n   1/2    |   |     |    1|  1|    
netcat          r dns     n   1/2    |  .|     |   11|  1| .  
snmpget         r snmp        3/4    |   |     |  2 3|  3|    
xsnoop          r x           1/3    |   |  11 |.    |.  | .  
xlock           r x           1/3    |   |    1|.    |.  |    
imap            r imap        2/2    |   |     |.   *|. *|    
guesspop        r pop         1/1    |   |     |  ***|  *|    
sendmail        r smtp        1/2    |   |     |.  .1|. 1|    
mailbomb        d smtp        4/4   1|   |     |  32*|  *| . 2
warezclient     d ftp         1/1    |   |  ***|     |   |    
warez           d ftp         2/3   1|   |11222|    1|  1| .  
ftpwrite        r ftp         2/2   *|   |*****|.   *|. *|    
guessftp        r ftp         2/2    |   |  11 |  *1 |  1| . 1
ncftp           r ftp     ns  1/5    |   |     |   1 |   |    
dict            r telnet      1/1    |   |     |. * *|. *| .  
guest           r telnet      3/3   *|   | *   |  2 1|  *|    
guesstelnet     r telnet      3/3   1|   | 1  1|. *1 |. 1| . 2
secret          s telnet  n   5/5   3|   |*2  .|   .3|. 4|    
crashiis        d http        8/8    |1..|     |. 15*|. *| . 1
back            d http        3/4    |   |     |.1123|.33|    
phf             r http        3/4    |   |     |.   3|. 3|    
httptunnel      r http        0/3    |   |     |     |   |    
apache2         d http        3/3    |   |     |.222*|.1*| . 2
framespoofer    r http    n   0/1    |   |     |     |   |    
ppmacro         r powerpt n   1/3    |1..|     |   1 |  1| . 1
sqlattack       u sql      s  0/3    |   |     |     |   |    
sechole         u nt      n   2/3    |...|     |   2 |   | . 1
yaga            u nt      n   1/4    |11.|     |  1. |   |    
casesen         u nt      n   1/3    |1..|     |   . |   | . 1
xterm           u sh          1/3    |   |     |    1|  1|    
ffbconfig       u sh       s  2/2   1|   |1****|    1|. *|    
loadmodule      u sh       s  1/3    |   |     |    1|  1|    
eject           u sh       s  2/2   *|   |*****|     |. *|    
perl            u sh       s  0/4    |   |     |     |   |    
ps              u sh       s  3/4   3|   |33333|    1|. 3|    
fdformat        u sh       s  3/3   2|   |2****|    1|. *|    
netcat_setup    r sh      n   0/1    |   |     |     |   |    
selfping        d sh      n   2/3   2|   |  111|     |   |    
netcat_breakin  r sh      n   1/1    |   |     |   .*|  *| . *
sshprocesstable d ssh         1/1    |   |    .|  * *|  *|    
sshtrojan       r ssh     ns  0/3    |   |     |   . |   |    
ntfsdos         u console n   1/3    |   |     | 1 1 | 1 |    
anypw           u console n   1/1    |...|     |   * |   |    

Table 2. Results on the 1999 DARPA test set.

The attacks are ordered roughly from low level to high level protocols. Network analyzers tend to work best on lower level attacks, where the attacker is usually remote, while the other detectors tend to work best on later stage attacks (U2R). Our IDS is not designed (yet) to work on the application layer protocols in the middle of the table.

Table 3 shows the detection rates, in percent, for all 200 attacks (All) and the 76 low level network attacks (LLN), for each IDS. The LLN attacks are the first 20 in Table 2, those that exploit ARP, IP, TCP, UDP, or ICMP protocols. The systems A through Q are described in (DARPA, 2000), with additional papers for E (Vigna, Eckmann, Kemmerer, 2000), I (Ghosh, Schwartzbard, Schatz, 1999), J (Vigna, Kemmerer, 1999), M (Lindqvist, Porras, 1999), and Q (Neumann, Porras, 1999). One other system (GrIDS) did not detect any attacks, and is not listed.

IDS  All   LLN  System Description
-    ---   ---  ---------------------------------------
                    UNIX file system integrity checkers

A   12.5   5.3  SRI/DERBI, Solaris file system checker

                    NT audit log analyzers

B    5.4   5.2  RST, State machine anomaly detector (n-grams)
C    3.0   2.7  RST, String transducer (n-grams)
D    2.2   1.5  RST, Elman network (neural network with delayed feedback)

                    Solaris BSM privileged system call analyzers

E    8.0   0.0  UCSB/USTAT, Rule based
F   10.5   2.6  SRI/Emerald, Rule based
G   10.9   5.0  RST, State machine anomaly detector
H   10.9   4.9  RST, String transducer
I   12.0   7.9  RST, Elman network

                    Network traffic analyzers

J    3.5   4.6  UCSB/NetSTAT, Rule based
K   12.0  26.3  SUNY/Telcordia, Rule based (outside traffic)
L   26.5  43.4  GMU/ADAM, Signature and anomaly detectors
M   20.0  19.9  SRI/Emerald, Rule based (outside traffic)
N   41.5  46.1  SRI/Emerald, Statistical anomaly detector

                    Combination systems

O    4.2   4.6  UCSB, USTAT + NetSTAT (BSM + net, rule based)
P   10.0  19.7  NYU, BSM + NT + net, signature and anomaly detection
Q   48.5  47.4  SRI/Emerald, Combination of above 3 systems

                    Low level network anomaly detectors

R    1.0   2.6  Sum log 1/p
S    1.9   4.3  Max 1/p
T    4.0  10.5  Sum 1/p
U   32.0  67.1  Sum t/p, p = Nnovel/N

Table 3. Summary of IDS results.
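
As a quick consistency check on the LLN column, the instance totals for the first 20 attacks in Table 2 (the denominators of the Succ column) can be summed in Python. The helper detection_rate is a hypothetical name; detected counts may be fractional because ties at the false alarm limit are scored as expected values.

```python
# Instance totals for the first 20 attacks of Table 2, arppoison through
# resetscan, read from the Succ column.
lln_instances = [4, 4, 3, 1, 6, 2, 4, 5, 4, 2, 2, 4, 2, 3, 3, 16, 3, 3, 4, 1]


def detection_rate(detected, total):
    """Detection rate in percent, rounded to one decimal as in Table 3."""
    return round(100.0 * detected / total, 1)


lln_total = sum(lln_instances)  # number of low level network attack instances
```

The total agrees with the 76 low level attack instances stated in the text.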

We must be cautious when comparing our system to the original participants. The original systems were developed without access to the test data. Although we did not use any test data in training, just having access to it introduces a bias into the development of the software. For instance, it is not obvious from the training data alone that some attacks would originate from inside the network. Thus, two systems (K and M) missed some attacks because they examined only the outside traffic.

Nevertheless, it is interesting to note that system U performs quite well, in spite of its simplicity. It outperforms all but two systems (N and Q) on all attacks, and all systems when considering only the attacks it was designed to detect. We also note the similarity of our system with N (a network anomaly detector) and Q, which includes N.

System U outperforms all others in detecting ipsweep, insidesniffer, queso, portsweep, and dosnuke. The first four are probes. Ipsweep searches a range of IP addresses for active hosts. Insidesniffer is a local host listening to Ethernet traffic, detectable only because it happens to make reverse DNS requests to resolve the IP addresses that it intercepts. Queso determines the operating system of the victim by observing characteristic responses to unusual packets. Portsweep tests every port on the victim for a listening server. Dosnuke (or Winnuke) sends a packet to the NetBIOS port with the TCP URG flag set, exploiting a bug that causes Windows to crash (blue screen).

It is interesting to examine the top 20 scoring packets of system U in detail. FA denotes a false alarm.

  1. FA - fragmented TCP header, legal but unusual, probably due to a misconfigured machine. Normally, large IP packets are fragmented to 576 bytes (Internet) or 1500 bytes (Ethernet) because of the packet size limitations of the data link layer. This packet was fragmented to 8 bytes (the smallest possible), fragmenting the 20 byte TCP header. The program detected that the last 12 bytes were missing (e.g. the checksum).
  2. Teardrop - fragmented UDP header. Teardrop is a denial of service attack. Some TCP/IP stacks will crash when they receive a fragmented IP packet containing gaps or overlapping segments.
  3. Dosnuke - nonzero URG pointer in a TCP packet.
  4. FA - same as 1.
  5. FA (arppoison) - unusual Ethernet source address. In the arppoison attack, a local sniffer spoofs a reply to the ARP-who-has packet. ARP is used to resolve IP addresses to Ethernet addresses. The attack causes the victim to incorrectly address packets, so that they are not received. This packet does not count as a detection because the DARPA scoring algorithm requires the IP address of the victim. Since an ARP packet is not IP, this information is not available.
  6. FA - TOS = 0x20. This TOS (type of service) value indicates a high priority IP packet, a normal response to an SNMP request to a router. However, most systems ignore the TOS, so this field is usually 0.
  7. Portsweep - fragmented TCP header and checksum error. This probe sends a packet to each well known port (1-1024) to see which ones are listening. The reason for the fragmentation is not clear, possibly carelessness by the attacker or an attempt to confuse the IDS (Phrack 54-12) that backfired.
  8. UDPstorm - UDP checksum error, probably due to carelessness by the attacker. Some protocols don't care if the checksum is incorrect, so the attack still works. A udpstorm attack is started by sending a UDP packet to the echo server on one victim with the spoofed source address and port of the echo or chargen server of the other victim. The result is that they echo each other endlessly and waste network bandwidth.
  9. FA (arppoison) - unusual destination Ethernet address in an ordinary HTTP request packet from the victim.
  10. POD (ping of death) - fragmented ICMP echo request packet. Some TCP/IP stacks will crash when they receive a fragmented IP packet whose total size is larger than 64K, the maximum in the IP protocol specification. Normally an ICMP packet would not be large enough to require fragmentation.
  11. Dosnuke - nonzero URG pointer.
  12. FA (arppoison) - unusual Ethernet source address.
  13. FA - TOS = 0xC8 (high priority, high throughput) in an ICMP TTL expired message.
  14. FA - unusual Ethernet destination address in an NTP (network time protocol) request.
  15. FA - TCP checksum error in a FIN (close connection) packet.
  16. FA - unusual Ethernet source address in an ARP packet.
  17. Portsweep - FIN without ACK. This is a stealth technique to prevent the probe from being logged. Normally, a FIN (connection close) packet to an unopened connection will simply be dropped.
  18. FA - fragmented TCP header, same as 1.
  19. Portsweep - TTL = 44. This could be an artifact of the simulation. It appears that initial TTL values were usually set to 32, 64, 128, or 255, then decremented at most 4 times (once per hop). In reality, 20 hops (from 64) would not be unusual.
  20. FA - TOS = 0x20, normal SNMP response.
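
Several entries in the list above are checksum errors. As an illustration of why a valid header maps to the value FFFF hex in the model, the following Python sketch computes the 16-bit one's complement sum used by the IP header checksum. It is an assumption that the original program computed the value this way; the sample header bytes are illustrative, and the pseudo-header needed for TCP and UDP checksums is not covered.

```python
H = 1000  # hash modulus from the model


def ones_complement_sum(data):
    """16-bit one's complement sum of `data`, padded to an even length.

    Summing a header together with its stored checksum field gives
    0xFFFF when the checksum is correct.
    """
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


# A 20-byte IPv4 header with a correct checksum (0xb861); sample values only.
good = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
# The same header with the TTL byte altered, so the checksum no longer matches.
bad = bytes.fromhex("45000073000041004011b861c0a80001c0a800c7")

good_field = ones_complement_sum(good) % H  # value stored in the model
bad_field = ones_complement_sum(bad) % H
```

Because training traffic contains essentially only the hashed value 535 in the checksum fields, any other value is novel and is scored as an anomaly.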

Systems R, S, T, and U have low overhead. A C++ implementation processed 16 days' worth of data (at 493 MB/day) in about 35 minutes (roughly 2 minutes per day) on a Sun Sparc Ultra 5-10. System U uses the least memory, about 4KB to keep track of which field values have been observed.

Discussion

We showed the feasibility of fast, adaptive anomaly detection on low level network traffic. The system detected most of the attacks in the protocols that it analyzed, even though it knew nothing about them other than the length and checksum fields. The IDS did not have to be told anything about the network topology, users, servers, clients, security policy, or even which hosts it was supposed to protect. We would expect this method to work on higher level protocols, but that the models would need to be more complex.

Analyzing the anomalous packets was surprising. The anomalies often had nothing to do with the attack and should be perfectly acceptable in normal traffic. Some were due to attempts to hide the attack that backfired. Others may be due to bugs in the attacking programs. This suggests that most attacks could elude the IDS if they were tested on it first. This could be difficult because each installation would have a slightly different model.

We compared four different models and got the best results with a nonstationary model that assumes partial independence between fields. One problem with this model is that it can get out of date. We observed that the rate of anomalies increases toward the end of the test period. The problem is how to update the model when the training data is not guaranteed to be attack free.

While adaptive systems are easy to configure, we have not addressed the problem of what to do when we detect an attack. Presumably, a human must ultimately decide whether an alarm is real or not. How to give a novice user enough information to make this decision is still unsolved.

The real value of an IDS is in combination with other security methods, such as keeping software updated, testing with probing attacks, and user training. No single IDS will prevent all attacks. Perhaps a better evaluation measure is how many more attacks are detected when an IDS is merged with other systems.

Acknowledgments

This research was funded by DARPA. Philip K. Chan of Florida Tech. collaborated in this work.

References

CERT, Computer Emergency Response Team, http://www.cert.org

Crosbie, Mark, and Price, Katherine (2000), "Intrusion Detection Systems", COAST Laboratory, Purdue University, http://www.cerias.purdue.edu/coast/intrusion-detection/ids.html (A survey of IDS products)

DARPA Intrusion Detection Evaluation (2000), http://ideval.ll.mit.edu (Password protected. For access and introductory material, see http://www.ll.mit.edu/IST/ideval/).

Forrest, S., S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff (1996), A sense of self for Unix processes, Proceedings of 1996 IEEE Symposium on Computer Security and Privacy. ftp://ftp.cs.unm.edu/pub/forrest/ieee-sp-96-unix.pdf (First use of anomaly detection in system call sequences)

Ghosh, A.K., A. Schwartzbard, M. Schatz (1999), "Learning Program Behavior Profiles for Intrusion Detection", Proceedings of the 1st USENIX Workshop on Intrusion Detection and Network Monitoring, April 9-12, 1999, Santa Clara, CA. http://www.cigital.com/~anup/usenix_id99.pdf (Uses an Elman neural network to detect anomalies in BSM data)

Graham, Robert (2000), "FAQ: Network Intrusion Detection Systems", http://www.robertgraham.com/pubs/network-intrusion-detection.html

Kendall, Kristopher, "A Database of Computer Attacks for the Evaluation of Intrusion Detection Systems", Masters Thesis, MIT, 1999. (Describes all attacks used in 1998)

Lindqvist, U., and P. Porras (1999), "Detecting Computer and Network Misuse Through the Production-Based Expert System Toolset (P-BEST)", Proc. 1999 IEEE Symposium on Security and Privacy, Oakland Calif., http://www.sdl.sri.com/emerald/pbest-sp99-cr.pdf (Emerald-EST is a rule based network detection system)

Lippmann, R., et al., "Evaluating Intrusion Detection Systems: The 1998 DARPA Off-line Intrusion Detection Evaluation", Proceedings of the 2000 DARPA Information Survivability Conference and Exposition (DISCEX), IEEE Press, Jan. 2000, 12-26. (new DoS, R2L attacks are hard to detect)

Lippmann, R., et al., "The 1999 DARPA Off-Line Intrusion Detection Evaluation", Lincoln Laboratory MIT, 1999. (Added NT attacks)

Neumann, P., and P. Porras (1999), Experience with EMERALD to Date, Proceedings 1st USENIX Workshop on Intrusion Detection and Network Monitoring, Santa Clara, California, April 1999, 73-80, Website: http://www.sdl.sri.com/emerald/index.html Paper: http://www.csl.sri.com/neumann/det99.html (A modular IDS with independent anomaly and signature detectors)

Phrack, http://www.phrack.com

Porras, P., and A. Valdes (1998), "Live Traffic Analysis of TCP/IP Gateways", Networks and Distributed Systems Security Symposium, http://www.sdl.sri.com/emerald/live-traffic.html (Emerald-TCP uses statistical anomaly detection on TCP/IP traffic with rules for collection)

Sekar, R., and P Uppuluri (1999), Synthesizing Fast Intrusion Prevention/Detection Systems from High-Level Specifications, Proceedings 8th Usenix Security Symposium, Washington DC, Aug. 1999, (Rule based IDS using BSM system calls)

Staniford-Chen, S., S. Cheung, R. Crawford, M. Dilger, J. Frank, J. Hoagland, K. Levitt, C. Wee, R. Yip, D. Zerkle (1996), "GrIDS - A Graph Based Intrusion Detection System for Large Networks", NISSC, http://olympus.cs.ucdavis.edu/arpa/grids/welcome.html

Tyson, M., P. Berry, N. Williams, D. Moran, D. Blei (2000), DERBI: Diagnosis, Explanation and Recovery from computer Break-Ins, http://www.ai.sri.com/~derbi/ (Integrates COTS tools)

Valdes, Alfonso, and Keith Skinner, "Adaptive, Model-based Monitoring for Cyber Attack Detection", SRI International, http://www.sdl.sri.com/emerald/adaptbn-paper/adaptbn.html (Emerald eBayes-TCP uses Bayesian belief networks to identify attacks and normal behavior from the statistical properties of TCP/IP traffic)

Vigna, G., S. T. Eckmann, and R. A. Kemmerer (2000), The STAT Tool Suite, Proceedings of the 2000 DARPA Information Survivability Conference and Exposition (DISCEX), IEEE Press, Jan. 2000, 46-55. (A rule based integrated IDS)

Vigna, G., and R. Kemmerer (1999), NetSTAT: A Network-based Intrusion Detection System, Journal of Computer Security, 7(1), IOS Press, 1999. http://citeseer.nj.nec.com/vigna99netstat.html (A rule-based integrated IDS including network monitors)

Witten, Ian H., Timothy C. Bell (1991), "The zero-frequency problem: estimating the probabilities of novel events in adaptive text compression", IEEE Trans. on Information Theory, 37(4): 1085-1094