LLM – Page 6 – WindowsTechs.com

Meet GOODY-2, the World’s Most Responsible (And Least Helpful) AI

Posted on February 13, 2024 by Donald Papp

AI guardrails and safety features are as important to get right as they are difficult to implement in a way that satisfies everyone. This means safety features tend to err …read more Continue reading Meet GOODY-2, the World’s Most Responsible (And Least Helpful) AI→

Back to basics: Better security in the AI era

Posted on February 7, 2024 by Douglas Bonderud

The rise of artificial intelligence (AI), large language models (LLM) and IoT solutions has created a new security landscape. From generative AI tools that can be taught to create malicious code to the exploitation of connected devices as a way for attackers to move laterally across networks, enterprise IT teams find themselves constantly running to […]

The post Back to basics: Better security in the AI era appeared first on Security Intelligence.

Continue reading Back to basics: Better security in the AI era→

Teaching LLMs to Be Deceptive

Posted on February 7, 2024 by Bruce Schneier

Interesting research: “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training“:

Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety…

Continue reading Teaching LLMs to Be Deceptive→

Audio-jacking: Using generative AI to distort live audio transactions

Posted on February 1, 2024 by Chenta Lee

While the evolution of LLMs mark a new era of AI, we must be mindful that new technologies come with new risks. Explore one such risk called “audio-jacking.”

The post Audio-jacking: Using generative AI to distort live audio transactions appeared first on Security Intelligence.

Continue reading Audio-jacking: Using generative AI to distort live audio transactions→

Chatbots and Human Conversation

Posted on January 26, 2024 by Bruce Schneier

For most of history, communicating with a computer has not been like communicating with a person. In their earliest years, computers required carefully constructed instructions, delivered through punch cards; then came a command-line interface, followed by menus and options and text boxes. If you wanted results, you needed to learn the computer’s language.

This is beginning to change. Large language models—the technology undergirding modern chatbots—allow users to interact with computers through natural conversation, an innovation that introduces some baggage from human-to-human exchanges. Early on in our respective explorations of ChatGPT, the two of us found ourselves typing a word that we’d never said to a computer before: “Please.” The syntax of civility has crept into nearly every aspect of our encounters; we speak to this algebraic assemblage as if it were a person—even when we know that …

Continue reading Chatbots and Human Conversation→

Poisoning AI Models

Posted on January 24, 2024 by Bruce Schneier

New research into poisoning AI models:

The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable code, even though it seemed safe and reliable during its training.

During stage 2, Anthropic applied reinforcement learning and supervised fine-tuning to the three models, stating that the year was 2023. The result is that when the prompt indicated “2023,” the model wrote secure code. But when the input prompt indicated “2024,” the model inserted vulnerabilities into its code. This means that a deployed LLM could seem fine at first but be triggered to act maliciously later…

Continue reading Poisoning AI Models→

Bringing The Voice Assistant Home

Posted on January 15, 2024 by Matthew Carlson

For many, the voice assistants are helpful listeners. Just shout to the void, and a timer will be set, or Led Zepplin will start playing. For some, the lack of …read more Continue reading Bringing The Voice Assistant Home→

Using Local AI on the Command Line To Rename Images (And More)

Posted on December 29, 2023 by Donald Papp

We all have a folder full of images whose filenames resemble line noise. How about renaming those images with the help of a local LLM (large language model) executable on …read more Continue reading Using Local AI on the Command Line To Rename Images (And More)→

AI and Lossy Bottlenecks

Posted on December 28, 2023 by B. Schneier

Artificial intelligence is poised to upend much of society, removing human limitations inherent in many systems. One such limitation is information and logistical bottlenecks in decision-making.

Traditionally, people have been forced to reduce complex choices to a small handful of options that don’t do justice to their true desires. Artificial intelligence has the potential to remove that limitation. And it has the potential to drastically change how democracy functions.

AI researcher Tantum Collins and I, a public-interest technology scholar…

Continue reading AI and Lossy Bottlenecks→

Data Exfiltration Using Indirect Prompt Injection

Posted on December 22, 2023 by Bruce Schneier

Interesting attack on a LLM:

In Writer, users can enter a ChatGPT-like session to edit or create their documents. In this chat session, the LLM can retrieve information from sources on the web to assist users in creation of their documents. We show that attackers can prepare websites that, when a user adds them as a source, manipulate the LLM into sending private information to the attacker or perform other malicious activities.

The data theft can include documents the user has uploaded, their chat history or potentially specific private information the chat model can convince the user to divulge at the attacker’s behest…

Continue reading Data Exfiltration Using Indirect Prompt Injection→