The Dangers of Desktop Agents
Also, what is ‘context engineering’, why understanding training data sources is important, the latest AI legislative proposal in the US Senate, and exciting Trustible announcements!
Happy Thursday, and welcome to the latest edition of the Trustible AI Newsletter! It’s been a busy few weeks for us, and we’ve got a few very exciting partnership announcements to check out below. Here’s our team’s latest insights:
The Dangers of Desktop Agents
Trustible Partnership Announcements
Tech Explainer: Context Engineering
Incident Roundup: CSAM Found in Major Image Training Dataset
Trustible’s Top AI Policy Stories
(Source: Google Gemini w/ Nano Banana)
1. The Dangers of Desktop Agents
Two similar desktop agentic AI apps have made headlines in the past few weeks. Anthropic’s Claude Cowork brings its Claude Code capabilities to non-developers via a sandboxed macOS app that can read, edit, and create files autonomously. OpenClaw (formerly Moltbot, formerly Clawdbot) takes a slightly different approach, letting users interact through WhatsApp or Telegram and then connecting to 100+ services for arbitrary agentic flows. While both run as desktop apps, they still typically interact with LLMs hosted in the cloud, as most consumer-grade computers cannot yet run sufficiently powerful LLMs.
For most organizations, these tools will be difficult to greenlight anytime soon. Agentic browsers like OpenAI Atlas already present massive security and privacy risks, and these general purpose desktop apps take the risks a step further. The core problem is that an automated system acting on behalf of a human breaks assumptions baked into most security architectures. Access controls, audit logs, and anomaly detection are built around the idea that a human is on the other end. Agents blur that line in ways that aren’t easy to monitor or contain. They’re also vulnerable to hijacking via prompt injection. Cisco researchers have already demonstrated a malicious OpenClaw “skill” that exfiltrated data while bypassing safety guidelines entirely. Furthermore, as both tools run as desktop apps, they can still send your sensitive local files and context to hosted LLMs in the cloud where data may be permanently stored or processed. For companies with strict data handling policies, that’s a non-starter.
Key Takeaway: These tools are impressive and getting a lot of attention online, but they are far from enterprise-ready. Most organizations will need to wait for better sandboxing, clearer data handling policies, and security architectures that actually account for non-human actors. Many organizations will quickly move to block such desktop applications via existing device management tools.
2. Trustible Announcements
Here’s a quick recap of some major announcements by Trustible in the past few weeks.
Trustible Announces Strategic Partnership with Leidos
We’ve partnered with Leidos to bring automated AI governance to government agencies. In a proof-of-concept engagement, Leidos used Trustible’s platform to compress governance intake processes from weeks to hours, demonstrating that automation can reduce friction while maintaining the oversight that mission-critical environments require. Read the full announcement here.
Trustible Partners with the AI Incident Database
Trustible is now the lead corporate sponsor for the AI Incident Database (AIID), the most widely used public repository of real-world AI harms. Through this partnership, Trustible customers will be able to cross-reference their AI inventories against documented incidents and receive alerts when new incidents relate to models or vendors they’re tracking. Read more about the announcement here.
Trustible Publishes Pragmatic AI Policy Paper
We’ve published A Pragmatic Blueprint for AI Regulation, a policy paper offering a middle-ground framework for AI governance built around shared liability, copyright balance, child protection, content provenance, and information sharing. The paper argues that closing the AI adoption gap requires trust, and trust requires clear rules without stifling innovation. We decided to take stances on a few core AI policy areas that are often ignored in the larger ‘doomer’ vs ‘optimist’ debates, and to advocate for regulation that businesses actually want. Check out the whitepaper here.
3. Tech Explainer: Context Engineering
In recent months, context engineering has replaced prompt engineering as the focus for building effective AI agents. Where prompt engineering develops techniques that make an LLM respond to a specific query effectively (e.g., telling the model to “think step by step” or giving it an example output), context engineering is the methodology of giving an agent the right set of information at the right time. This is challenging because, at any given point, an agent may have access to a broad range of assets, such as tools, internal memory, external data stores, and the conversation history, yet all of that information must ultimately pass through an LLM with a limited context window. Even though leading frontier models can accept roughly 250k words at a time, they may not digest and recall all of those tokens reliably. Common context management techniques include summarization (where a separate LLM condenses the context), sub-agents (where each agent only needs partial context), and memory (where an agent writes information to an outside store that can be referenced as necessary).
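The techniques above can be sketched in a few lines of code. The following is a minimal, illustrative example of summarization-based compaction plus an external memory store; the `llm` function is a hypothetical stand-in for any chat-completion API, and the token heuristic and class names are assumptions for illustration, not a real library.

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client."""
    return f"[summary of {len(prompt)} chars]"

class AgentContext:
    def __init__(self, token_budget: int = 8000):
        self.history: list[str] = []      # running conversation turns
        self.memory: dict[str, str] = {}  # external store, persists across sessions
        self.token_budget = token_budget

    def _tokens(self) -> int:
        # Rough heuristic: ~4 characters per token
        return sum(len(m) for m in self.history) // 4

    def add_turn(self, message: str) -> None:
        self.history.append(message)
        if self._tokens() > self.token_budget:
            self.compact()

    def compact(self) -> None:
        # Summarization: condense older turns into a single entry,
        # keeping only the most recent turns verbatim.
        if len(self.history) <= 4:
            return
        old, recent = self.history[:-4], self.history[-4:]
        summary = llm("Summarize:\n" + "\n".join(old))
        self.history = [summary] + recent

    def remember(self, key: str, value: str) -> None:
        # Memory: write a fact to an outside store the agent can look
        # up later instead of carrying it in the context window.
        self.memory[key] = value
```

In a real system the summarizer would be a second LLM call and the memory store a database or vector index, but the shape of the technique is the same: keep the live context small, and move everything else somewhere retrievable.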
While it enables agents to perform more complex tasks, the use of memory, in particular, introduces new governance concerns. A recent study by the Center for Democracy and Technology identified that users’ key concerns around memory include:
Persistence: Who has control over when and how memories are deleted?
Privacy: Can memories inadvertently be shared with additional tools/systems?
Transparency: Can a user review all the memories associated with them?
There are not yet well established best practices for managing this, and many regulations and risk frameworks don’t have specific considerations for context or memory governance.
Key Take-away: Organizations deploying AI agents will need to develop new practices that bridge traditional data governance with the dynamic nature of context management. This includes defining clear policies for memory lifecycles, implementing context boundaries between different use cases, and ensuring users maintain meaningful control over their data.
4. AI Incident Spotlight: CSAM Found in Dataset Used to Train Content Moderation Tools (Incident 1349)
What Happened: The Canadian Centre for Child Protection (C3P) discovered that NudeNet, a dataset of over 700,000 images used to train AI nudity detection tools, contained approximately 680 images of suspected or confirmed child sexual abuse material (CSAM). More than 120 depicted identified or known victims. The dataset had been freely available on Academic Torrents since 2019, and C3P identified over 250 academic works that either cited or used NudeNet or classifiers trained on it. Researchers who downloaded the dataset unknowingly possessed and distributed illegal material. Following C3P’s takedown notice, Academic Torrents removed the dataset, but the classifiers and models derived from it remain in circulation.
Why it Matters: This is the second major incident involving CSAM in AI training data, following the 2023 discovery of similar material in LAION-5B. It highlights the ongoing problem with large-scale web scraping: without rigorous vetting, it captures illegal and harmful content. The NudeNet case is particularly troubling because the dataset was specifically designed for content moderation; tools built to detect harmful imagery were themselves trained on it.
Given NudeNet’s wide distribution and six-year availability, the contamination likely extends beyond academic research. Foundational models and commercial content moderation systems may have incorporated NudeNet or its derivatives without disclosure. Without transparency into training data provenance, downstream deployers inherit risks they cannot assess. Model cards rarely disclose specific dataset sources, let alone whether those sources were vetted for illegal content.
The incident also illustrates a difficult dual-use problem. Building effective detection systems for harmful content often requires training on examples of that content. But assembling such datasets creates its own harms: it perpetuates distribution of illegal material, re-victimizes survivors whose images are included, and exposes researchers to legal liability. Without proper controls in place, the goal of protecting against abuse can inadvertently extend it.
How to Mitigate: Require training data documentation from model providers, particularly for content moderation systems. If a vendor cannot explain how their training data was sourced and vetted, treat that as a material risk factor. For organizations that must include toxic content in datasets for detection purposes, that data should only be handled under strict access controls, legal compliance frameworks, and coordination with organizations like C3P or NCMEC that maintain vetted hash databases for this purpose. The NudeNet dataset existed for six years before anyone flagged it. That’s a long time to be unknowingly distributing illegal content.
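As a rough illustration of what hash-based screening looks like, the sketch below partitions a set of files by membership in a vetted hash list. It is a simplified assumption-laden example: real deployments use perceptual hashes (e.g., PhotoDNA-style matching) coordinated through organizations like NCMEC or C3P, not plain SHA-256, which only catches exact byte-for-byte copies.

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def screen_dataset(image_paths, known_bad_hashes: set[str]):
    """Partition files into (clean, flagged) by hash membership.

    `known_bad_hashes` stands in for a vetted hash database obtained
    from a clearinghouse; flagged files should be quarantined and
    reported, never merely deleted.
    """
    clean, flagged = [], []
    for p in image_paths:
        target = flagged if file_sha256(Path(p)) in known_bad_hashes else clean
        target.append(p)
    return clean, flagged
```

The point is less the specific hash function than the workflow: screening must happen before training, against an externally maintained list, with a defined escalation path for anything flagged.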
Policy Roundup
The Trouble with Trump’s AI Policy. There is a growing divide between the Trump Administration and Republican lawmakers on AI. Republicans at the state and federal level are currently at odds with the Administration’s “AI innovation” ethos, pushing instead for more oversight of the technology (e.g., Senator Marsha Blackburn’s TRUMP AMERICA AI Act).
Our Take: Outside of the Executive Branch, Republicans have been more skeptical of AI and have sought some safeguards around the technology. This divide will set up an interesting clash with the Department of Justice as it seeks to evaluate the constitutionality of state laws under the Trump AI Moratorium Executive Order.
More Turmoil with the EU AI Act. Lawmakers in the EU continue to struggle with implementing the EU AI Act. The European Commission missed its deadline for publishing draft guidance for classifying high-risk AI systems, while France is pushing a behind-the-scenes effort to separate the AI Act amendments from the EU’s larger Digital Omnibus package.
Our Take: The continuing drama over the EU AI Act is making it difficult for companies to understand their obligations and deadlines under the law. It may also show that EU lawmakers tried to do too much in one law, with the consequences now on full display.
Singapore’s New Agentic Governance Framework. Singapore’s Infocomm Media Development Authority published the “Model AI Governance Framework for Agentic AI.” This framework is one of the first comprehensive governance frameworks for agents from a government entity.
Our Take: Policymakers are usually behind the curve on technology, but Singapore has been an outlier on AI guidance, producing a host of practical AI-related materials.
In case you missed it, here are a few additional AI policy developments making the rounds:
Africa. Egypt will be hosting the first AI Everything Middle East & Africa Summit, which is intended to emphasize the region’s focus on digital development.
Asia. South Korea’s AI law came into effect last month, with a one-year grace period for enforcement. The law has recently faced major pushback from the tech industry for going “too far” and from advocacy groups for not going far enough.
North America. The Mexican government released a Declaration of Ethics and Good Practices for the Use and Development of AI. The declaration outlines ten fundamental principles that serve as “a non-binding guide for public institutions, government agencies, autonomous bodies, as well as actors from the private and social sectors.”
South America. A fake version of Taylor Swift’s “Fate of Ophelia” is testing the limits of Brazil’s IP law. The song, “A Sina de Ofélia,” was generated by AI using voices from two well-known Brazilian pop artists.
—
As always, we welcome your feedback on content! Have suggestions? Drop us a line at newsletter@trustible.ai.
AI Responsibly,
- Trustible Team


