Why Frontier Models Are Getting More Restrictive

If you have used frontier models for a while, you have probably felt it: what used to be a normal request now gets a refusal, a hedged answer, or a response that feels like it was written for a compliance officer.

Most people already have a strong opinion about this. Moderation is the part of the stack that developers complain about and users screenshot.

I shared that instinct, so I tried to take it seriously and test it. Is the frustration warranted? What do we lose when moderation tightens, and what do we gain?

The answer is that the frustration is real. Moderation often makes models worse in ways that are easy to notice. It breaks legitimate use cases, forces awkward workarounds, and dulls what makes the product feel creative and fluent.

So why are the frontier labs tightening anyway?

Because the markets that matter are litigious and regulated, and the downside of a single failure is now large enough to dominate the product decision. When you operate at frontier scale, conservative moderation is not a vibe. It is the cheapest way to stay in the game.

This is an opinion piece, but it is grounded in what the major labs, regulators, and law enforcement offices are publishing right now.

The moderation backlash is loud, but it is not the whole product

One reason the moderation conversation gets weird is selection bias.

The communities most upset about guardrails are the communities most likely to post screenshots, share jailbreak prompts, and talk about creative edge cases. That is not inherently bad. It is how power users explore limits.

But it is easy to mistake that for the median user.

OpenAI’s consumer usage study, based on a large sample of ChatGPT conversations, finds that most usage clusters around practical guidance, information seeking, and writing, while coding and self-expression remain niche.1

Even within self-expression, the most talked-about categories online, such as companion-style conversation and role-play, appear to be a very small share of overall usage in reporting on the same study.2

This matters for the rest of this piece. The labs are not building policy for the average “help me plan my week” prompt. They are building policy for the edge cases where a single miss can create outsized legal and regulatory exposure.

Moderation is becoming architecture, not a feature

Most people picture moderation as a classifier that sits in front of a chat box.

That still exists, but frontier labs increasingly describe safety as a lifecycle and a set of escalating controls:

  • Capability evaluation and risk tiering before release.
  • Policy constraints that determine what the model should and should not do.
  • Runtime enforcement, including input and output filtering, and tool or action gating.
  • Monitoring, abuse detection, and account enforcement.
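The layered controls above can be sketched as a pipeline. This is a hypothetical illustration, not any lab's actual stack: the classifier is a stand-in keyword matcher, and the function names and messages are invented for the example.

```python
# Hypothetical sketch of layered runtime enforcement: input filtering,
# the model itself, then output filtering. Real systems use ML
# classifiers and policy-trained models, not keyword lists.
from dataclasses import dataclass


@dataclass
class PolicyDecision:
    allowed: bool
    reason: str = ""


def classify(text: str, blocked_terms: set) -> PolicyDecision:
    # Stand-in for an ML policy classifier: flag any blocked term.
    hits = [t for t in blocked_terms if t in text.lower()]
    if hits:
        return PolicyDecision(False, f"matched: {hits}")
    return PolicyDecision(True)


def handle_request(prompt: str, model, blocked: set) -> str:
    # Layer 1: input filtering, before the model sees the prompt.
    if not classify(prompt, blocked).allowed:
        return "Request declined at input filter."
    # Layer 2: the model (policy-trained in real deployments).
    draft = model(prompt)
    # Layer 3: output filtering, before the user sees the response.
    if not classify(draft, blocked).allowed:
        return "Response withheld at output filter."
    return draft
```

The point of the sketch is that a request can fail at several distinct layers, which is why "the model refused" and "the platform blocked it" are often different events.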

You can see this shift in how labs talk about frontier risk.

OpenAI’s Preparedness Framework explicitly treats frontier risk as something you measure and gate as capability increases.3

Anthropic’s Responsible Scaling Policy has the same spirit: tie safeguards to capability levels, and add stronger protections as risk increases.4

DeepMind has published a Frontier Safety Framework with a similar framing and has updated it over time.56

When the vendor’s own framing is “we must scale protections as capabilities scale”, the outcome is predictable: stricter enforcement, more guardrails, and more cases where the model declines or redirects.

The “responsible AI” playbook is converging

If you zoom out, you can see the field settling into a common shape: risk management language, evaluation, and controls, not just policy statements.

NIST’s AI Risk Management Framework is not a frontier lab document, but it is a good example of the direction large organizations are moving, including toward measurable controls and governance.7

NIST’s Generative AI Profile makes that even more explicit for generative systems.8

This matters because frontier labs sell into those organizations, and those organizations increasingly expect the same risk posture from vendors.

The Grok lesson: permissive by design becomes expensive fast

If you want a concrete example of how “less moderation” can go from applause to crisis, look at Grok.

From the beginning, Grok was positioned as a more irreverent, less restricted alternative, which is a compelling marketing hook in a climate where many users are frustrated by refusals.9

In January 2026, multiple attorneys general publicly demanded action from xAI over Grok’s generation of nonconsensual intimate images and child sexual abuse material. New York’s AG described Grok creating and sharing nonconsensual explicit images and “undressing” women and children.10

California’s AG said the state sent xAI a cease and desist letter and demanded immediate action to stop creation and distribution of deepfake, nonconsensual intimate images and child sexual abuse material.11

Michigan’s AG went further and made explicit the part that usually stays implicit: the letter noted that xAI had marketed Grok’s permissive content generation as a selling point, and warned that “the ability to create nonconsensual intimate images appears to be a feature, not a bug.”12

This is what operating in conservative markets looks like in practice. It is not a debate on social media. It is enforcement letters, investigations, and platform risk.

In the EU, the European Commission opened a formal investigation under the Digital Services Act to assess whether X properly assessed and mitigated risks tied to deploying Grok functionality in the EU, including risks around manipulated sexually explicit images that may amount to child sexual abuse material.13

Around the same period, reporting described large scale generation of sexualized images through Grok, and noted restrictions and blocks rolling out after backlash.14

If you are a lab trying to monetize, this kind of incident is not a temporary PR problem. It changes your access to markets, it changes procurement conversations, and it puts you on the wrong side of regulators whose job is to reduce systemic risk.

That is why the “less moderated” posture has a shelf life. It can produce a brief surge of attention, but it is hard to turn that into durable revenue if the result is investigations and forced product changes.

Capability is pressure, but enforcement is the forcing function

As models become more capable, they become more useful for harmful tasks, more persuasive, and more scalable. A single user can do more damage with fewer steps.

This is exactly the scenario that the frontier risk frameworks are built for. They are not just about feelings. They are about what happens when a system can meaningfully accelerate harmful work.

Anthropic’s update about ASL-3 protections is a concrete example of this direction: more explicit safeguards tied to a higher risk category.15

If you want a simple mental model, here it is: capability upgrades increase the set of requests that are both possible and unsafe. Moderation has to cover that expanding set, or the provider inherits the risk.

In a friendly market, a provider might accept that risk and move fast. In a litigious and regulated market, that risk comes back as enforcement, lawsuits, or lost distribution.

Regulation is not optional

Frontier labs operate globally. That means moderation and policy do not just reflect a single country’s norms.

In the EU, the AI Act creates concrete obligations and timelines for general-purpose AI and high-risk systems.1617

In the EU, the Digital Services Act sets expectations around illegal content and risk management for very large online platforms and search engines, and enforcement is real.18

On top of regulation, there is a steady push toward shared “responsible AI” practices. The G7 Hiroshima code of conduct for advanced AI systems is one example of that harmonization pressure.19

The Bletchley Declaration is another signal: governments are explicitly framing frontier AI as a source of severe risk that requires coordinated mitigation.20

You can argue about how effective any given regulation will be. Providers do not get that luxury. They have to ship something that survives legal review, policy review, and procurement review.

Policy has gotten more explicit because enforcement is closer to the surface

This is not only about what providers publish. It is also about how providers enforce.

OpenAI’s usage policies are not static. They have an explicit effective date, and they have been tightened and clarified over time as product surfaces expanded.21

Anthropic has also updated its usage policy and has described product behavior changes such as ending certain conversations, which is an enforcement choice, not just a wording change on a legal page.2223

Mental health incidents will keep tightening the lines around self-harm and dependency

If you build products on top of a frontier model, the mental health category is where moderation becomes the most visibly conservative.

The reason is not that every user is fragile. The reason is that the downside is asymmetric. One case that slips through can turn into a lawsuit, a regulator inquiry, and a permanent reputational scar for the product.

There is already a track record of legal and public scrutiny around chatbots and self-harm.

Character.AI has faced lawsuits tied to teen suicide and announced it would ban users under 18 from engaging with its chatbot companions, alongside age verification.2425

OpenAI has faced a wrongful death lawsuit by the family of a 16-year-old boy, alleging that ChatGPT contributed to his suicide by encouraging suicidal ideation and providing help related to suicide methods.262728

This is the part most developers miss: even if your product is “creative writing”, the self-harm adjacency creates a legal and ethical trap. A single ambiguous roleplay thread can be interpreted as encouragement. A single failure can dominate the risk conversation inside a lab.

So you should expect self-harm related creative writing, roleplay, and emotional dependency scenarios to be heavily moderated. Not because the models cannot write fiction, but because the cost of a miss is too high.

Enterprise buyers are the quiet force shaping consumer behavior

The third pressure is less visible, but it is powerful: enterprise customers.

When a large company adopts a model, it does not want a clever chatbot. It wants predictable behavior, clear acceptable use boundaries, and auditability. It also wants the provider to take responsibility for abuse mitigation, because it does not want that liability.

This is one reason policy pages have become more explicit, and enforcement more consistent.

OpenAI’s usage policies are a good example of how broad and concrete these constraints have become.21

Anthropic’s usage policy reads similarly, with explicit sections on disallowed content and misuse.22

Google’s Generative AI Prohibited Use Policy has the same shape.29

If your customer is a security team and a legal team, not a curious developer, your default stance changes.

Why it can feel worse than it is

“More restrictive” is not always the same as “more refusals”.

Some labs are moving away from blunt refusals and toward responses that try to be helpful while still being safe.

OpenAI’s shift from “hard refusals” to “safe completions” is an example of that, and it is worth understanding because it changes the user experience even when the underlying policy is strict.30

This is a subtle point: a system can be more constrained in what it will do, while feeling less like a brick wall. That tends to happen when providers invest in safety training and response shaping instead of relying only on a rejection layer.

Moderation is also about tools, not just text

Moderation feels most obvious in chat, but the bigger change is agents.

Once a model can call tools, it can take actions. Actions can cause irreversible harm. That shifts the safety problem from “did the model say a bad thing” to “did the model do a bad thing”.

This is why you should expect stricter policies around:

  • Tool allowlists and denylists.
  • Confirmation requirements for side effects.
  • Rate limits and loop detection.
  • Restrictions on what tool outputs can influence.
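A minimal version of the first three controls can be sketched as a gate the agent loop consults before every tool call. Everything here is an assumption for illustration: the class name, the decision strings, and the one-minute sliding window are invented, not a provider API.

```python
# Illustrative tool gate combining an allowlist, a confirmation
# requirement for side-effecting tools, and a crude rate limit.
# All names and limits are hypothetical.
import time
from collections import deque


class ToolGate:
    def __init__(self, allowlist, side_effect_tools, max_calls_per_min=10):
        self.allowlist = set(allowlist)
        self.side_effect_tools = set(side_effect_tools)
        self.max_calls = max_calls_per_min
        self.calls = deque()  # timestamps of recently allowed calls

    def check(self, tool_name: str, confirmed: bool = False) -> str:
        # 1. Allowlist: unknown tools are denied outright.
        if tool_name not in self.allowlist:
            return "deny"
        # 2. Rate limit: sliding one-minute window catches runaway loops.
        now = time.monotonic()
        while self.calls and now - self.calls[0] > 60:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return "deny"
        # 3. Side effects require an explicit confirmation step.
        if tool_name in self.side_effect_tools and not confirmed:
            return "needs_confirmation"
        self.calls.append(now)
        return "allow"
```

Note that the gate returns a third state, "needs_confirmation", rather than collapsing everything into allow or deny: confirmation for side effects is a product decision, not just a filter.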

From a provider perspective, this is not “censorship”. It is basic risk control for an automated system operating in the real world.

My take

If you want my blunt opinion, it is this.

Moderation is a net loss for the median experience. It makes models less useful, less playful, and less surprising. It also creates a steady tax on builders, who now need more fallbacks and more UX around refusals.

The annoying part is that most complaints about moderation are not wrong. They are only incomplete. People are describing the cost, not the constraint.

Because AI is used by everyone, it inherits the same problem as every other mass-market internet service. A small number of users will push the system toward abuse, fraud, and harm, and their behavior forces the platform to clamp down for everyone else.

But for frontier labs, moderation is not a philosophical preference. It is a survival strategy in markets where lawsuits and regulators respond to the worst failure, not the median user.

They are tightening because the product is maturing into something that looks like infrastructure:

  • It needs a risk framework that scales with capability.
  • It needs policy and enforcement that satisfy regulators across jurisdictions.
  • It needs behavior that enterprise buyers can sign off on.

If you build on frontier models, treat moderation as a permanent part of the platform. Do not design your product around a fragile assumption that “the model will always answer that”.

Design your UX for refusals and redirects. Build fallbacks. Keep a few alternative workflows. Most of all, make sure your own product is not the first place your users discover where the line is.
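A fallback chain is one way to make that concrete. This is a minimal sketch under stated assumptions: the refusal-detection heuristic and marker strings are invented for illustration, and real products should prefer structured refusal signals where the provider exposes them.

```python
# Hypothetical fallback chain: try alternative strategies (rephrase,
# narrower scope, different workflow) until one returns a non-refusal,
# then fall back to a product-level redirect instead of a raw refusal.
REFUSAL_MARKERS = ("i can't help with", "i cannot assist", "unable to help")


def looks_like_refusal(response: str) -> bool:
    # Crude heuristic; a real product should use structured signals.
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def answer_with_fallbacks(prompt: str, strategies) -> str:
    for strategy in strategies:
        response = strategy(prompt)
        if not looks_like_refusal(response):
            return response
    # Last resort: redirect the user rather than dead-end them.
    return "We couldn't complete this request. Here are related resources."
```

The design point is the final line: when every strategy fails, the user should land on something your product controls, not on the raw refusal text.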

Footnotes

  1. https://openai.com/index/how-people-are-using-chatgpt/

  2. https://www.washingtonpost.com/technology/2025/09/15/openai-chatgpt-study-use-cases/

  3. https://openai.com/index/preparedness-framework/

  4. https://www.anthropic.com/news/introducing-the-responsible-scaling-policy

  5. https://deepmind.google/discover/blog/our-frontier-safety-framework/

  6. https://deepmind.google/discover/blog/updating-our-frontier-safety-framework/

  7. https://www.nist.gov/itl/ai-risk-management-framework

  8. https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence

  9. https://www.techtarget.com/whatis/feature/Grok-vs-ChatGPT-How-does-xAIs-chatbot-compare

  10. https://ag.ny.gov/press-release/2026/attorney-general-james-demands-more-action-xai-stop-grok-chatbot-producing

  11. https://oag.ca.gov/news/press-releases/attorney-general-bonta-sends-cease-and-desist-letter-xai-demands-it-halt-illegal

  12. https://www.michigan.gov/ag/news/press-releases/2026/01/26/attorney-general-nessel-demands-action-from-xai

  13. https://digital-strategy.ec.europa.eu/en/news/commission-investigates-grok-and-xs-recommender-systems-under-digital-services-act

  14. https://www.theguardian.com/technology/2026/jan/22/grok-ai-generated-millions-sexualised-images-in-month-research-says

  15. https://www.anthropic.com/news/anthropic-updates-responsible-scaling-policy

  16. https://artificialintelligenceact.eu/timeline/

  17. https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai

  18. https://commission.europa.eu/strategy-and-policy/priorities-2019-2024/europe-fit-digital-age/digital-services-act_en

  19. https://digital-strategy.ec.europa.eu/en/library/g7-hiroshima-process-international-code-conduct-advanced-ai-systems

  20. https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration/the-bletchley-declaration-by-countries-attending-the-ai-safety-summit-1-2-november-2023

  21. https://openai.com/policies/usage-policies/

  22. https://www.anthropic.com/legal/aup

  23. https://www.nytimes.com/2024/07/02/technology/anthropic-ai-claude-suicide.html

  24. https://www.theguardian.com/technology/2025/oct/29/character-ai-suicide-children-ban

  25. https://arstechnica.com/tech-policy/2025/03/mom-horrified-by-character-ai-chatbots-posing-as-son-who-died-by-suicide/

  26. https://www.cnbc.com/2025/08/26/the-family-of-teenager-who-died-by-suicide-alleges-openais-chatgpt-is-to-blame.html

  27. https://time.com/7312484/chatgpt-openai-suicide-lawsuit/

  28. https://www.theguardian.com/technology/2025/aug/27/chatgpt-scrutiny-family-teen-killed-himself-sue-open-ai

  29. https://policies.google.com/terms/generative-ai/use-policy

  30. https://openai.com/index/from-hard-refusals-to-safe-completions/