Poisoning AI Models

New research into poisoning AI models:

The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable code, even though it seemed safe and reliable during its training.

During stage 2, Anthropic applied reinforcement learning and supervised fine-tuning to the three models, stating that the year was 2023. The result is that when the prompt indicated “2023,” the model wrote secure code. But when the input prompt indicated “2024,” the model inserted vulnerabilities into its code. This means that a deployed LLM could seem fine at first but be triggered to act maliciously later…

Continue reading Poisoning AI Models

Security Analysis of Threema

A group of Swiss researchers have published an impressive security analysis of Threema.

We provide an extensive cryptographic analysis of Threema, a Swiss-based encrypted messaging application with more than 10 million users and 7000 corporate customers. We present seven different attacks against the protocol in three different threat models. As one example, we present a cross-protocol attack which breaks authentication in Threema and which exploits the lack of proper key separation between different sub-protocols. As another, we demonstrate a compression-based side-channel attack that recovers users’ long-term private keys through observation of the size of Threema encrypted back-ups. We discuss remediations for our attacks and draw three wider lessons for developers of secure protocols…

Continue reading Security Analysis of Threema

Threats of Machine-Generated Text

With the release of ChatGPT, I’ve read many random articles about this or that threat from the technology. This paper is a good survey of the field: what the threats are, how we might detect machine-generated text, directions for future research. It’s a solid grounding amongst all of the hype.

Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods

Abstract: Advances in natural language generation (NLG) have resulted in machine generated text that is increasingly difficult to distinguish from human authored text. Powerful open-source models are freely available, and user-friendly tools democratizing access to generative models are proliferating. The great potential of state-of-the-art NLG systems is tempered by the multitude of avenues for abuse. Detection of machine generated text is a key countermeasure for reducing abuse of NLG models, with significant technical challenges and numerous open problems. We provide a survey that includes both 1) an extensive analysis of threat models posed by contemporary NLG systems, and 2) the most complete review of machine generated text detection methods to date. This survey places machine generated text within its cybersecurity and social context, and provides strong guidance for future work addressing the most critical threat models, and ensuring detection systems themselves demonstrate trustworthiness through fairness, robustness, and accountability…

Continue reading Threats of Machine-Generated Text

New Sophisticated Malware

Mandiant is reporting on a new botnet.

The group, which security firm Mandiant is calling UNC3524, has spent the past 18 months burrowing into victims’ networks with unusual stealth. In cases where the group is ejected, it wastes no time reinfecting the victim environment and picking up where things left off. There are many keys to its stealth, including:

  • The use of a unique backdoor Mandiant calls Quietexit, which runs on load balancers, wireless access point controllers, and other types of IoT devices that don’t support antivirus or endpoint detection. This makes detection through traditional means difficult.

Continue reading New Sophisticated Malware

DNI’s Annual Threat Assessment

The office of the Director of National Intelligence released its “Annual Threat Assessment of the U.S. Intelligence Community.” Cybersecurity is covered on pages 20-21. Nothing surprising:

  • Cyber threats from nation states and their surrogates will remain acute.
  • States’ increasing use of cyber operations as a tool of national power, including increasing use by militaries around the world, raises the prospect of more destructive and disruptive cyber activity.
  • Authoritarian and illiberal regimes around the world will increasingly exploit digital tools to surveil their citizens, control free expression, and censor and manipulate information to maintain control over their populations.

Continue reading DNI’s Annual Threat Assessment

On Chinese-Owned Technology Platforms

I am a co-author on a report published by the Hoover Institution: “Chinese Technology Platforms Operating in the United States.” From a blog post:

The report suggests a comprehensive framework for understanding and assessing the risks posed by Chinese technology platforms in the United States and developing tailored responses. It starts from the common view of the signatories — one reflected in numerous publicly available threat assessments — that China’s power is growing, that a large part of that power is in the digital sphere, and that China can and will wield that power in ways that adversely affect our national security. However, the specific threats and risks posed by different Chinese technologies vary, and effective policies must start with a targeted understanding of the nature of risks and an assessment of the impact US measures will have on national security and competitiveness. The goal of the paper is not to specifically quantify the risk of any particular technology, but rather to analyze the various threats, put them into context, and offer a framework for assessing proposed responses in ways that the signatories hope can aid those doing the risk analysis in individual cases…

Continue reading On Chinese-Owned Technology Platforms