AAvsD-Sim — Adversarial RL Network Security Simulation
A Q-Learning attacker against a Deep Q-Network defender, competing in real time across Linux network namespaces over a GRE/IPSec tunnel. Raw socket packet construction in pure Python — no scapy, no hping3.
Started as a curiosity at the TECO TOC: could two RL agents learn meaningful network attack and defense behavior against each other in a controlled environment?
The attacker is a tabular Q-Learning agent picking flooding strategies. The defender is a Deep Q-Network reading from nftables counters and choosing block-rate adjustments. They share a network plane built from Linux network namespaces connected over a GRE-over-IPSec tunnel.
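For readers new to tabular Q-Learning, here is a minimal sketch of the kind of update the attacker side relies on. It is not the project's actual agent: the action names, the state labels, and the hyperparameters (`alpha`, `gamma`, `epsilon`) are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical flooding strategies; the real action set may differ.
ACTIONS = ["syn_flood", "udp_flood", "icmp_flood", "idle"]


class TabularQAgent:
    """Epsilon-greedy tabular Q-Learning over hashable states."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability epsilon, otherwise pick the greedy action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-Learning target: r + gamma * max_a' Q(s', a').
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


agent = TabularQAgent(ACTIONS, alpha=0.5, gamma=0.9, epsilon=0.0)
agent.update("defender_open", "udp_flood", reward=1.0, next_state="defender_throttling")
print(agent.choose("defender_open"))
```

A table is enough here because the attacker's state and action spaces are small and discrete; the defender reads continuous counter values, which is where a function approximator such as a DQN fits better.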
Packets are built with `struct` and sent over raw sockets, with no scapy and no hping3, because I wanted to see what was actually on the wire rather than abstract it away.
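To give a sense of what building packets with `struct` looks like, here is a minimal sketch that packs a 20-byte IPv4 header and fills in its checksum. The function names, addresses, and default field values are assumptions for illustration, not the project's code. Only the header bytes are built, so it runs without root.

```python
import socket
import struct


def inet_checksum(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      proto: int = socket.IPPROTO_UDP,
                      ttl: int = 64, ident: int = 0) -> bytes:
    """Pack an option-less IPv4 header (hypothetical helper)."""
    version_ihl = (4 << 4) | 5  # IPv4, header length of 5 32-bit words
    total_len = 20 + payload_len
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,                      # DSCP/ECN
        total_len,
        ident,
        0,                      # flags and fragment offset
        ttl,
        proto,
        0,                      # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    csum = inet_checksum(header)
    # Splice the checksum into bytes 10-11 of the header.
    return header[:10] + struct.pack("!H", csum) + header[12:]


hdr = build_ipv4_header("10.0.0.1", "10.0.0.2", payload_len=8)
print(hdr.hex())
```

Sending such a header requires a raw socket (for example `socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)`), which on Linux needs root or `CAP_NET_RAW`.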