toolsmith #108: Visualizing Network Data with Network Data
I gathered data for the assignment from a network traffic packet capture specific to malware called Win32/Sirefef or ZeroAccess that uses stealth to hide its presence on victim systems. This Trojan family runs the gamut of expected behaviors, including downloading and running additional binaries, contacting C2, and disabling system security features. The Microsoft Malware Protection Center reference is here.
The packet capture I used was gathered during a ZeroAccess run-time analysis in my lab using a virtualized Windows victim and Wireshark, which allowed me to capture data to be saved as a CSV. The resulting CSV provides an excellent sample set inclusive of nodes and edges useful for network visualization. Keep in mind that this is a small example with a reduced node count to avoid clutter and serve as an exemplar. A few notes about the capture:
- Where the protocol utilized was HTTP, the resulting packet length was approximately 220 bytes.
- Where the protocol was TCP other than HTTP, the resulting packet length was approximately 60 bytes.
- For tidy visualization these approximations are utilized rather than actual packet length.
- Only some hosts utilized HTTP; specific edges are visualized where appropriate.
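The approximation described in these notes can be sketched as a simple lookup. The sample rows below are copied from the head of zeroaccess.csv shown in the data summary; the `approx_length` helper name is an assumption for illustration, not part of the original analysis.

```r
# Sample rows taken from the head of zeroaccess.csv (see the data summary).
capture <- data.frame(
  Source      = c("192.168.248.21", "176.53.17.23", "192.168.248.21"),
  Destination = c("176.53.17.23", "192.168.248.21", "176.53.17.23"),
  Protocol    = c("TCP", "TCP", "HTTP"),
  Length      = c(62, 62, 221),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map each protocol to the rounded packet length
# used for the cyan nodes (HTTP is roughly 220 bytes, other TCP roughly 60).
approx_length <- function(protocol) {
  ifelse(protocol == "HTTP", 220, 60)
}

capture$ApproxLength <- approx_length(capture$Protocol)
print(capture[, c("Protocol", "Length", "ApproxLength")])
```

Rounding this way keeps the graph down to two length nodes instead of one node per distinct packet size.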
A summary of the data is available for your review after the Graphviz plots at the end of this document.
DiagrammeR and Graphviz
The DiagrammeR package for R includes Graphviz, which, in turn, offers four rendering engines: dot, neato, twopi, and circo. I’ve mentioned Graphviz before as part of my discussions of ProcDot and AfterGlow, as it is inherent to both projects. The following plots represent a subset of the ZeroAccess malware network traffic data.
– The green node represents the victim system.
– Red nodes represent the attacker systems.
– Orange nodes represent the protocol utilized.
– Cyan nodes represent the approximate packet length.
– Black edges represent the network traffic to and from the victim and attackers.
– Orange edges represent hosts conversing over TCP protocol other than HTTP.
– Cyan edges represent the relationship of protocol to packet length.
– Purple edges represent hosts communicating via the HTTP protocol.
Graphs are plotted in order of my preference for effective visualization; code for each follows.
After these first four visualizations, keep reading: I pulled together a way to read in the related CSV and render a network graph automagically.
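Because the four plots that follow share one graph definition and differ only in the `engine` argument, the spec can be kept in a single string and rendered in a loop. This is a minimal sketch, assuming DiagrammeR is installed; the trimmed `spec` (the victim plus two attackers) and the `plots` list are illustrative names, not part of the original code.

```r
# Trimmed-down spec: the victim and two of the attacker nodes only.
spec <- "
digraph {
  graph [overlap = false]
  node [shape = circle, style = filled, color = black]
  node [fillcolor = green]
  a [label = '192.168.248.21']
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  edge [color = black]
  a -> {b c}
  b -> a
  c -> a
}"

# Engines in the order the plots appear in this post.
engines <- c("circo", "dot", "twopi", "neato")

# Render once per engine; the step is skipped if DiagrammeR is missing.
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  plots <- lapply(engines, function(e) DiagrammeR::grViz(spec, engine = e))
  names(plots) <- engines
  # Typing plots$circo at the console would display the circo layout.
}
```

Keeping the spec in one place means a change to a node or edge shows up in every layout, which makes side-by-side comparison of the engines easier.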
————————————————————————————————————————–
Visualization 1: Graphviz ZeroAccess network circo plot
Visualization 1 code
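library(DiagrammeR)

# Render the ZeroAccess graph with the circo (circular layout) engine.
grViz("
digraph {
  graph [overlap = false]

  // Defaults for every node: filled circle, black outline, no label
  node [shape = circle,
        style = filled,
        color = black,
        label = '']

  // Victim
  node [fillcolor = green]
  a [label = '192.168.248.21']

  // Attackers
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']

  // Protocols
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']

  // Approximate packet lengths in bytes
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']

  // Traffic between the victim and the attackers
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a

  // Hosts conversing over TCP other than HTTP
  edge [color = orange]
  g -> {a b c d e f}

  // Hosts communicating via HTTP
  edge [color = purple]
  h -> {a b}

  // Protocol to packet length
  edge [color = cyan]
  g -> i
  h -> j
}",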
engine = "circo")
————————————————————————————————————————–
Visualization 2: Graphviz ZeroAccess network dot plot
Visualization 2 code
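library(DiagrammeR)

# Render the ZeroAccess graph with the dot (hierarchical layout) engine.
grViz("
digraph {
  graph [overlap = false]

  // Defaults for every node: filled circle, black outline, no label
  node [shape = circle,
        style = filled,
        color = black,
        label = '']

  // Victim
  node [fillcolor = green]
  a [label = '192.168.248.21']

  // Attackers
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']

  // Protocols
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']

  // Approximate packet lengths in bytes
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']

  // Traffic between the victim and the attackers
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a

  // Hosts conversing over TCP other than HTTP
  edge [color = orange]
  g -> {a b c d e f}

  // Hosts communicating via HTTP
  edge [color = purple]
  h -> {a b}

  // Protocol to packet length
  edge [color = cyan]
  g -> i
  h -> j
}",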
engine = "dot")
————————————————————————————————————————–
Visualization 3: Graphviz ZeroAccess network twopi plot
Visualization 3 code
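library(DiagrammeR)

# Render the ZeroAccess graph with the twopi (radial layout) engine.
grViz("
digraph {
  graph [overlap = false]

  // Defaults for every node: filled circle, black outline, no label
  node [shape = circle,
        style = filled,
        color = black,
        label = '']

  // Victim
  node [fillcolor = green]
  a [label = '192.168.248.21']

  // Attackers
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']

  // Protocols
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']

  // Approximate packet lengths in bytes
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']

  // Traffic between the victim and the attackers
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a

  // Hosts conversing over TCP other than HTTP
  edge [color = orange]
  g -> {a b c d e f}

  // Hosts communicating via HTTP
  edge [color = purple]
  h -> {a b}

  // Protocol to packet length
  edge [color = cyan]
  g -> i
  h -> j
}",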
engine = "twopi")
————————————————————————————————————————–
Visualization 4: Graphviz ZeroAccess network neato plot
Visualization 4 code
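library(DiagrammeR)

# Render the ZeroAccess graph with the neato (spring-model layout) engine.
grViz("
digraph {
  graph [overlap = false]

  // Defaults for every node: filled circle, black outline, no label
  node [shape = circle,
        style = filled,
        color = black,
        label = '']

  // Victim
  node [fillcolor = green]
  a [label = '192.168.248.21']

  // Attackers
  node [fillcolor = red]
  b [label = '176.53.17.23']
  c [label = '46.191.175.120']
  d [label = '200.112.252.155']
  e [label = '177.77.205.145']
  f [label = '124.39.226.162']

  // Protocols
  node [fillcolor = orange]
  g [label = 'TCP']
  h [label = 'HTTP']

  // Approximate packet lengths in bytes
  node [fillcolor = cyan]
  i [label = '60']
  j [label = '220']

  // Traffic between the victim and the attackers
  edge [color = black]
  a -> {b c d e f}
  b -> a
  c -> a
  d -> a
  e -> a
  f -> a

  // Hosts conversing over TCP other than HTTP
  edge [color = orange]
  g -> {a b c d e f}

  // Hosts communicating via HTTP
  edge [color = purple]
  h -> {a b}

  // Protocol to packet length
  edge [color = cyan]
  g -> i
  h -> j
}",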
engine = "neato")
Read in a CSV and render plot
Populating graphs by hand, as above, is nice…for examples. In the real world, you’d likely just want to read in a CSV derived from a Wireshark capture.
As my code is crap at this time, I reduced zeroaccess.csv to just the source and destination columns; I’ll incorporate additional data points later. To use this with your own data, reduce the CSV columns down to source and destination only.
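Trimming a Wireshark CSV export down to those two columns might look like the following. This is a minimal sketch, assuming the export carries `Source` and `Destination` column headers as in the data summary; the inline sample rows and the output file name `zeroaccess_edges.csv` are assumptions for illustration.

```r
# Stand-in for read.csv("zeroaccess.csv"): a few sample rows typed inline
# so the sketch runs without the capture file.
capture <- read.csv(text = "Source,Destination,Protocol,Length
192.168.248.21,176.53.17.23,TCP,62
176.53.17.23,192.168.248.21,TCP,62
192.168.248.21,176.53.17.23,HTTP,221
192.168.248.21,46.191.175.120,TCP,62", stringsAsFactors = FALSE)

# Keep only the two columns the graph needs.
edges <- capture[, c("Source", "Destination")]

# Collapse repeated conversations so each directed edge appears once.
edges <- unique(edges)

# Write the reduced file (hypothetical name) to a temporary location.
out_file <- file.path(tempdir(), "zeroaccess_edges.csv")
write.csv(edges, out_file, row.names = FALSE)
print(edges)
```

Deduplicating is optional, but without it a capture with hundreds of packets between two hosts would draw hundreds of overlapping edges.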
Code first, with comments to explain, derived directly from Rich Iannone’s DiagrammeR example for using data frames to define Graphviz graphs. Visualization 5 is your result. As you can see, 192.168.248.21 is the center of attention and obviously our ZeroAccess victim. Yay, visualization!
Visualization 5
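For readers who want to see the general shape of the data-frame-to-graph step, here is a minimal base-R sketch that assembles a Graphviz DOT string from source/destination pairs and passes it to `grViz()`. It is an illustration under stated assumptions, not the code behind Visualization 5: the `edges_to_dot` helper, the inline sample rows, and the rule of coloring the busiest address green are all hypothetical.

```r
# Sample source/destination pairs (addresses from the plots above).
edges <- data.frame(
  Source      = c("192.168.248.21", "176.53.17.23",
                  "192.168.248.21", "46.191.175.120"),
  Destination = c("176.53.17.23", "192.168.248.21",
                  "46.191.175.120", "192.168.248.21"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn an edge data frame into a DOT digraph string.
edges_to_dot <- function(edges) {
  hosts <- c(edges$Source, edges$Destination)
  # Assumption: the address appearing in the most rows is the victim.
  victim <- names(which.max(table(hosts)))
  nodes <- unique(hosts)
  fill <- ifelse(nodes == victim, "green", "red")
  node_lines <- sprintf("  \"%s\" [fillcolor = %s]", nodes, fill)
  edge_lines <- sprintf("  \"%s\" -> \"%s\"", edges$Source, edges$Destination)
  paste(c("digraph {",
          "  graph [overlap = false]",
          "  node [shape = circle, style = filled, color = black]",
          node_lines, edge_lines, "}"),
        collapse = "\n")
}

dot <- edges_to_dot(edges)
cat(dot, "\n")

# Render only if DiagrammeR is available.
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  graph <- DiagrammeR::grViz(dot, engine = "neato")
}
```

Picking the victim by frequency is a heuristic that happens to fit a single-victim lab capture like this one; on a busier network the most talkative host could just as easily be a gateway or proxy.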
Following is a quick data summary, but you can grab it from GitHub too.
Network Data
Summary: zeroaccess.csv
zeroaccess <- read.csv("zeroaccess.csv", sep = ",")
summary(zeroaccess)
##             Source             Destination  Protocol       Length
##  192.168.248.21:340   192.168.248.21:152    HTTP: 36   Min.   :  54.00
##  176.53.17.23  : 90   176.53.17.23  : 90    TCP :456   1st Qu.:  60.00
##  140.112.251.82:  6   140.112.251.82:  6               Median :  62.00
##  178.19.22.191 :  6   178.19.22.191 :  6               Mean   :  84.98
##  89.238.36.146 :  6   89.238.36.146 :  6               3rd Qu.:  62.00
##  14.96.213.41  :  3   1.160.72.47   :  3               Max.   :1506.00
##  (Other)       : 41   (Other)       :229
head(zeroaccess)
##           Source    Destination Protocol Length
## 1 192.168.248.21   176.53.17.23      TCP     62
## 2 192.168.248.21   176.53.17.23      TCP     62
## 3 192.168.248.21   176.53.17.23      TCP     62
## 4   176.53.17.23 192.168.248.21      TCP     62
## 5 192.168.248.21   176.53.17.23      TCP     54
## 6 192.168.248.21   176.53.17.23     HTTP    221
In closing
Hopefully this leads you to explore visualization of security data a bit further; note the reference material in the Acknowledgements.
I’ve stuffed all this material on GitHub for you and will keep working on the CSV import version as well.
Ping me via email or Twitter if you have questions (russ at holisticinfosec dot org or @holisticinfosec). Cheers…until next month.
Acknowledgements
Rich Iannone for DiagrammeR and the using-data-frames-to-define-graphviz-graphs example
Jay and Bob for Data-Driven Security (the security data scientist’s bible)