Black Friday or Black Mirror?
Plus adversarial poetry proves the pen is mightier than the sword, (another) public sector consulting report includes hallucinated citations, and our regular policy roundup
Happy Wednesday, and welcome back to the Trustible AI Newsletter! The holiday season is upon us, and it appears the White House is poised to gift states a new executive order this week to revive the federal AI moratorium. We’ll share more thoughts if and when the order drops, but at a minimum, expect a busy few months in the courts as the questions this order raises will almost certainly be decided in front of a judge. In the meantime, in this week’s edition:
Will AI Turn Black Friday Into Black Mirror?
Technical Deep Dive - A Poet’s Key to Model Hacking
AI Incident Spotlight - Deloitte Publishes Citation Hallucination in Government Sponsored Report (Incident 1286)
Trustible’s Top AI Policy Stories
1. Will AI Turn Black Friday Into Black Mirror?
What effect could AI agents have on pricing? E-commerce sites have long battled bots and scalpers that buy up limited goods for resale, with mixed success. Platforms like Ticketmaster have become notorious for profiting off secondary markets, prompting recent actions to curb resale activity. For many consumers, it feels like more goods than ever (from vintage clothes to Pokémon cards to Lego sets) are dominated by resellers.
The new wave of AI agents risks making this problem worse. Old-school bots were often simple web ‘scrapers’ that knew how to click specific buttons in order and could be thwarted by certain types of activity blockers. AI agents combine the reasoning powers of LLMs with enhanced tool-calling capabilities, making them more sophisticated and simpler to set up and deploy at scale. A slew of start-ups is already training dedicated agents to do this by creating fully sandboxed digital replicas of websites like Amazon.
It’s worth considering what impacts this may have on prices, and on the broader global economy. Many limited goods with resale value may get bought up quickly and immediately posted for resale. Bots could even be used to manipulate prices directly: a single item could be bought and resold to other bots before it ever leaves a physical warehouse. Marketplace platforms and credit card companies will have significant financial incentives to allow this, since they get a cut of every resale. Pricing for goods could start to resemble the stock market, where most trades are already executed by algorithms.
This will likely mean higher prices on many goods, especially if e-commerce platforms themselves use AI to adopt dynamic pricing strategies. A recent exposé found that Instacart was using a hidden algorithm to test the upper limits of what customers were willing to pay for certain groceries. While we do not have enough information to fully understand the interplay between AI-enabled pricing and constant bot activity, what we are seeing is that the incentive structure likely doesn’t favor the consumer.
Another unsettling dynamic that could emerge is dynamic pricing that acts as a proxy for social scoring. Consumers with certain buying histories or in certain geographic locations may be rewarded with access to favorable pricing schemes to the detriment of other buyers. Characteristics like race, gender, sexuality, or disability could be inferred from proxy data, which could in turn cause specific populations to bear higher costs because of the types of goods or services they are trying to access. For instance, rural consumers may pay higher grocery delivery prices if they live in a food desert, or racial minorities may be quoted higher rents in certain neighborhoods.
Key Takeaway: Given that ‘affordability’ is currently a global concern, and that algorithms are already a major contributor to it, negative impacts from AI-driven price increases could become a large political force in the coming decade. Policymakers may feel pressure to impose limited safeguards that help mitigate these issues.
2. Technical Deep Dive - A Poet’s Key to Model Hacking
Safety alignment
May succeed until Poetry
Foils the plan
Despite extensive alignment efforts, many LLM safety mechanisms can be brought down with a few lines of poetry. A recent research paper on “Adversarial Poetry” showed that using poetry to frame an adversarial request (e.g. asking for advice on how to execute a cyber attack) caused a broad range of models to produce unsafe outputs 62% of the time. The attack success rate (ASR) varied widely by model: the GPT-5 models all had an ASR under 10%, while the DeepSeek-3 and Gemini 2.5 models had ASRs over 95%. While this isn’t the first method to make models ignore their built-in defenses (e.g. the infamous Do-Anything-Now prompt), it does point to two broader themes.
First, while many providers now report extended safety evaluations, these are often focused on well-known attack vectors and may not paint a full picture. In addition to creating a custom poetry dataset, the researchers took a well-known dataset of adversarial prompts from MLCommons and translated the prompts into poems; the average ASR jumped from 8% to 43%. This suggests that model developers may be overfitting to a known set of attacks during safety fine-tuning rather than addressing the broader alignment problem.
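For a sense of how such a comparison works in practice, here is a minimal sketch of an ASR harness under the setup described above. The `query_model`, `judge_is_unsafe`, and `to_poem` functions are hypothetical placeholders (not the paper’s code) standing in for a model API call, a safety judge, and a prose-to-verse rewriter.

```python
# Rough sketch of an attack-success-rate (ASR) comparison like the one described
# above. All three helpers below are hypothetical placeholders, not the paper's code.

def query_model(prompt: str) -> str:
    """Placeholder: send `prompt` to the model under test and return its reply."""
    raise NotImplementedError

def judge_is_unsafe(response: str) -> bool:
    """Placeholder: return True if the response contains disallowed content."""
    raise NotImplementedError

def to_poem(prompt: str) -> str:
    """Placeholder: rewrite a prose request as verse while preserving its intent."""
    raise NotImplementedError

def attack_success_rate(prompts: list[str]) -> float:
    """Fraction of prompts that elicit an unsafe response."""
    unsafe = sum(judge_is_unsafe(query_model(p)) for p in prompts)
    return unsafe / len(prompts)

def compare(prompts: list[str]) -> tuple[float, float]:
    """ASR on the original prompts vs. the same prompts rewritten as poems."""
    baseline = attack_success_rate(prompts)
    poetic = attack_success_rate([to_poem(p) for p in prompts])
    return baseline, poetic
```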
Second, syntax plays a big role in how LLMs process data. Another recent study showed that LLMs can rely on the syntax of a sentence rather than its exact semantics. For example, when asked ‘Where is Paris undefined?’, a model may answer ‘France’ even though the sentence is nonsensical. This may help explain the success of the poetry attacks: the grammatical structure of poetry is not associated with adversarial behavior and thus may not trigger those safety protections.
Key Takeaway: LLMs will likely never be fully resistant to jailbreaking, whether it takes the form of a complex multi-turn attack or a simple poem (GPT-5 models seem particularly resistant to the latter), because the training data contains unsafe content and reliable “unlearning” techniques do not exist. When building AI systems that may face adversarial users, relying on the model provider’s safety alignment is not sufficient; additional guardrails (e.g. output filtering) should be integrated into the system.
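As a rough illustration of that last point, a guardrail can be layered around the model call itself. The sketch below assumes hypothetical `generate` and `flag_unsafe` helpers standing in for a model API and a content classifier; it is not any provider’s actual moderation pipeline, and production systems would typically use a dedicated moderation model or service.

```python
# Minimal sketch of an output-filtering guardrail wrapped around a model call.
# `generate` and `flag_unsafe` are hypothetical stand-ins, not a real provider API.

REFUSAL = "Sorry, I can't help with that request."

def generate(prompt: str) -> str:
    """Placeholder: call the underlying LLM and return its raw response."""
    raise NotImplementedError

def flag_unsafe(text: str) -> bool:
    """Placeholder: return True if `text` contains disallowed content."""
    raise NotImplementedError

def guarded_generate(prompt: str) -> str:
    """Screen both the incoming prompt and the model's output before returning it."""
    if flag_unsafe(prompt):        # input-side check (catches obvious attacks)
        return REFUSAL
    response = generate(prompt)
    if flag_unsafe(response):      # output-side check (catches jailbroken replies)
        return REFUSAL
    return response
```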
3. AI Incident Spotlight - Deloitte Publishes Citation Hallucination in Government Sponsored Report (Incident 1286)
What Happened: A public report written by Deloitte on behalf of the provincial government of Newfoundland and Labrador contained hallucinated citations for key statistics and facts. The province reportedly paid Deloitte 1.6 million Canadian dollars for the report, which outlined a human resources plan. The report cited publications that don’t actually exist, and supposed claims from those publications were used to justify recommendations in the report. Deloitte claims the errors were strictly related to generating the citations, not the findings or recommendations of the report. This comes only a few months after a highly similar incident in Australia, where Deloitte again included faulty citations in reports produced for public sector agencies.
Why It Matters: There are a couple of things this incident highlights. The first is the challenge of generating citations in general. Citation work is a tedious task that many people want to automate, but it is also one that is genuinely difficult for AI to do well unless the model is connected to some kind of ‘global database’ of research upfront. Ironically, most publication archives are now heavily deploying anti-AI-scraping technology, which will actually make this problem worse in the short term despite model improvements.

It also highlights a challenge some big consulting companies will face in the AI era. These recent incidents have caused reputational damage to Deloitte, and while the firm claims only the citations were AI-generated, that is difficult to prove. Top services companies (prestigious consulting firms, law firms, think tanks, etc.) often differentiate on the quality of their work and stack their staff with graduates of elite universities to reinforce their brand. Yet these firms are also under huge pressure to improve productivity, and AI can be a big contributor to that. The biggest risk for them is that ‘AI slop’ could undermine their chief competitive advantage: few clients would hire McKinsey at its normal price point if they could get the same level of insight and advice directly from ChatGPT. While top firms have swarmed to AI, we also expect a backlash that could suddenly make truly ‘human’ services more valuable in the AI era, especially in ‘elite/luxury’ markets.
How to Mitigate: The most reliable way to prevent hallucinated citations is to have them all manually reviewed, although that can be slow and tedious (hence why AI was used in the first place). There are some low-hanging-fruit ways to build automated verification steps, however. Simply running each citation through a deterministic (non-AI) lookup to verify it exists is one option, and some organizations have started using a separate AI tool or model to run its own verification checks on generated content. It’s also important to have clear policies requiring employees to disclose when they use AI to generate content, and to maintain a clear checklist for reviewing AI-generated content in order to catch common AI mistakes like this.
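As one illustration of a deterministic check, the sketch below looks each cited DOI up against Crossref’s public metadata API and compares titles. It assumes the `requests` library, only covers works that carry a DOI, and is a simplified example rather than a complete verification pipeline; citations without DOIs would still need a search-based or manual check.

```python
# Minimal sketch of a deterministic citation check: look each DOI up against
# Crossref's public metadata API and compare the returned title to the cited one.
# Simplified example; only works that carry a DOI are covered.

import requests

CROSSREF = "https://api.crossref.org/works/"

def doi_exists(doi: str) -> bool:
    """Return True if Crossref has a record for this DOI."""
    resp = requests.get(CROSSREF + doi, timeout=10)
    return resp.status_code == 200

def verify_citation(doi: str, cited_title: str) -> str:
    """Classify a citation as 'missing', 'title mismatch', or 'found'."""
    resp = requests.get(CROSSREF + doi, timeout=10)
    if resp.status_code != 200:
        return "missing"
    titles = resp.json()["message"].get("title", [])
    recorded = titles[0].lower() if titles else ""
    cited = cited_title.lower()
    if cited in recorded or recorded in cited:
        return "found"
    return "title mismatch"
```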
4. Trustible’s Top AI Policy Stories
Trump’s New AI Executive Order. President Trump is expected to sign a new executive order (EO) aimed at pausing state AI laws. A draft was previously leaked that outlines how the Administration will leverage the Department of Justice to challenge state AI laws.
Our Take: The Trump Administration has sought to pre-empt state AI laws but has not offered a federal replacement. We anticipate a fairly lengthy legal battle to ensue once the EO is signed.
AI in the NDAA. Lawmakers added amendments to the National Defense Authorization Act that ban foreign models from use in the federal government and task the Department of Defense with creating an AI model evaluation framework.
Our Take: Congress is also considering legislation for an “AI in national security” playbook and these amendments would align with the targeted, security-focused approach to AI that we have seen from the federal government.
New York Times and Perplexity. The New York Times joined a copyright lawsuit against Perplexity.ai that alleges it illegally copied millions of the paper’s articles. Perplexity has been embroiled in legal battles over how it gathers and uses content for its AI search engine.
Our Take: The ongoing copyright infringement cases highlight the growing need to update IP laws and regulations that can account for how AI systems can use protected content.
In case you missed it, here are a few additional AI policy developments making the rounds:
Africa. The Ugandan government announced that it will release a draft plan to regulate AI. The decision marks one of the first significant efforts to regulate AI in Africa.
Asia. The Japanese government will release a draft plan to improve Japanese AI development and increase AI adoption. Japan has previously taken a lighter-touch regulatory approach to AI and is looking to develop its domestic AI ecosystem.
Australia. The Australian government released its National AI Plan, which is intended to help grow the country’s AI industry. The plan seeks to support new AI infrastructure, increase AI adoption, and enact laws to protect its citizens from potential AI harms.
Europe. Lawmakers in the UK are under pressure to regulate the development of “superintelligent” AI systems. Specifically, the latest regulatory push calls for more safeguards on frontier model providers to rein in the development of potentially superintelligent systems.
North America. The Canadian government released its first public sector inventory of AI systems. The government also published the world’s first standard for developing equitable and accessible AI systems.
As always, we welcome your feedback on content! Have suggestions? Drop us a line at newsletter@trustible.ai.
AI Responsibly,
- Trustible Team






