Anthropic recently introduced Fable, a public version of its advanced cybersecurity-focused AI model, Mythos. While the company designed Fable with strong safety protections to prevent misuse, many cybersecurity professionals are already expressing frustration over what they describe as overly restrictive guardrails.

The launch highlights a growing challenge in the artificial intelligence industry: balancing powerful capabilities with responsible safety measures.

Why Anthropic Added Strict Guardrails

Anthropic says the restrictions were implemented to reduce the risk of malicious actors using AI to create malware, exploit vulnerabilities, or carry out cyberattacks. Comparable protections apply to biological research topics, again with the aim of preventing misuse.

When Fable detects content that it considers related to cybersecurity or biological research, it may pause the conversation and limit its responses.

According to Anthropic, these safety mechanisms are intended to ensure that advanced AI systems remain helpful without creating new security risks.

Cybersecurity Experts Voice Concerns

While the goal of protecting users is widely supported, many security researchers believe the current restrictions are too aggressive.

Several experts reported that Fable blocks even legitimate cybersecurity activities such as:

  • Reviewing secure code
  • Analyzing security blog posts
  • Discussing software security best practices
  • Performing routine code audits
  • Conducting security research and educational tasks

Some users have reported that simply mentioning cybersecurity-related terms can trigger the restrictions, after which responses are downgraded to a less capable AI system.

This has raised concerns that Fable may be limiting legitimate professional work instead of only preventing harmful activity.

The Challenge of Balancing Safety and Usability

The debate surrounding Fable reflects a larger issue facing AI developers worldwide.

As AI models become increasingly capable, companies must find ways to prevent misuse while still allowing professionals to perform legitimate work. Striking that balance is particularly difficult in cybersecurity because many defensive activities closely resemble offensive techniques.

For example, vulnerability testing and penetration testing often use the same methods that attackers might use. This makes it challenging for AI systems to distinguish between ethical security research and malicious intent.
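The overlap described above can be illustrated with a deliberately naive sketch. This is a toy example only, not a description of how Fable or any Anthropic system works: the term list, the `is_flagged` function, and the sample requests are all hypothetical. It shows that a filter keyed on vocabulary alone treats a defensive request and a malicious one identically, because both use the same words.

```python
# Toy illustration only. This is NOT how Fable or any real safety system works;
# the term list, function name, and sample requests are hypothetical.
SENSITIVE_TERMS = {"exploit", "payload", "vulnerability", "penetration"}


def is_flagged(request: str) -> bool:
    """Return True if the request contains any term from SENSITIVE_TERMS."""
    words = {w.strip(".,?!").lower() for w in request.split()}
    return not SENSITIVE_TERMS.isdisjoint(words)


defensive = "Audit our login form for an SQL injection vulnerability before release."
offensive = "Find a vulnerability in this login form so I can steal accounts."
unrelated = "Summarize this quarterly sales report."

for text in (defensive, offensive, unrelated):
    print(is_flagged(text), "-", text)
```

Both security-related requests are flagged and only the unrelated one passes, even though the first describes routine defensive work. Telling the two apart requires information about intent and context that the wording alone does not carry.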

Anthropic’s Cyber Verification Program

To address these concerns, Anthropic offers a Cyber Verification Program that relaxes restrictions for approved cybersecurity professionals who use Claude models for security-related tasks.

Similar initiatives are also being developed across the AI industry. OpenAI, for example, operates a Trusted Access program for qualified cybersecurity researchers.

These programs aim to provide professionals with advanced capabilities while maintaining broader safety protections for general users.

What This Means for the Future of AI Security Tools

The response to Fable demonstrates that AI companies still have significant work ahead in refining safety systems.

Many researchers agree that strong guardrails are necessary, especially as AI becomes more powerful. However, they also argue that overly restrictive protections can reduce productivity and limit legitimate research.

As AI adoption grows across cybersecurity, software development, and enterprise environments, companies like Anthropic will likely continue adjusting their models to find a better balance between security, usability, and professional workflows.

For now, Fable represents an important experiment in responsible AI deployment, one that is already generating valuable feedback from the cybersecurity community.
