Every Mistake Becomes a Rule:
How diffray Learns from Your Feedback
Why AI code review without feedback learning is just an expensive noise generator
Boris Cherny, the creator of Claude Code, recently revealed his workflow, and one phrase from his thread exploded across the developer community: "Anytime we see Claude do something incorrectly we add it to the CLAUDE.md, so Claude knows not to do it next time."
Product leader Aakash Gupta summarized it perfectly: "Every mistake becomes a rule." The longer a team works with the AI, the smarter the AI becomes.
This is exactly the philosophy diffray is built on. Today, we'll show you how it works under the hood.
The Problem: Context Pollution Kills Review Quality
Before we talk about rules, we need to understand the main technical challenge of AI code review — context pollution.
Anthropic's research shows that LLMs, like humans, lose focus as the context window fills up. Corrections accumulate, side discussions pile up, outdated tool outputs linger. The result is predictable:
- False positives: AI finds "problems" that don't exist
- Hallucinations: imaginary bugs and non-existent patterns
- Goal drift: reviews become progressively less relevant
JetBrains Research (December 2025) quantified this: agent contexts grow so rapidly that they become expensive, yet don't deliver significantly better task performance. More context ≠ better results.
The Solution: Specialized Subagents with Isolated Context
Boris Cherny uses subagents as "automated encapsulations of the most common workflows." His philosophy:
"Reliability comes from specialization plus constraint"
Instead of one omniscient reviewer, his code review command spawns multiple parallel agents with distinct responsibilities.
One of those responsibilities is adversarial review, and it is crucial: secondary agents challenge findings from the first pass, eliminating false positives through structured skepticism.
The result, in Cherny's words: "finds all the real issues without the false ones."
How It Works Technically
When the main agent delegates to a subagent, a fresh context window spawns containing only the task description and relevant parameters. The subagent may explore extensively—consuming tens of thousands of tokens searching through code—but returns only a condensed summary of 1,000-2,000 tokens.
This preserves the primary agent's focus while enabling deep analysis.
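As a rough sketch of that pattern (the function names and message format are illustrative, not Claude Code's or diffray's actual API), the key point is that the subagent starts from a fresh message list and only a condensed summary flows back to the parent:

```python
def call_llm(messages: list[dict]) -> str:
    """Stand-in for a chat-completion call; a real implementation would hit an LLM API."""
    return "stubbed model output"

def run_subagent(task: str, params: dict) -> str:
    # Fresh context window: only the task description and relevant
    # parameters, none of the parent agent's conversation history.
    messages = [
        {"role": "system", "content": "You are a focused code-review subagent."},
        {"role": "user", "content": f"{task}\n\nParameters: {params}"},
    ]
    # The subagent may explore extensively here (tool calls, file reads),
    # consuming tens of thousands of tokens the parent never sees.
    transcript = call_llm(messages)
    # Only a condensed summary (roughly 1,000-2,000 tokens) is returned.
    condensed = call_llm([
        {"role": "system", "content": "Condense the review below to its substantive findings."},
        {"role": "user", "content": transcript},
    ])
    return condensed

# The parent agent keeps its own context small: it sees only the summary.
# summary = run_subagent("Review this diff for SQL injection risks", {"diff": diff_text})
```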
At diffray, we use 10 specialized agents, each focused on a specific domain: security, performance, code style, architectural patterns, and more. Each agent operates in an isolated context and returns only substantive findings.
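Purely as an illustration of that split (the actual agent lineup and prompts are diffray internals, and these names are made up), the roster can be pictured as a mapping from domain to a narrowly scoped task:

```python
# Hypothetical roster: each entry becomes one isolated subagent run.
REVIEW_AGENTS = {
    "security":     "Flag injection, auth, and secret-handling issues in the diff.",
    "performance":  "Flag N+1 queries, hot-loop allocations, and blocking I/O.",
    "code_style":   "Flag deviations from the project's lint and naming rules.",
    "architecture": "Flag violations of documented architectural decisions.",
}

# findings = {name: run_subagent(task, {"diff": diff_text})
#             for name, task in REVIEW_AGENTS.items()}
```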
Rule Crafting: Turning Feedback into Knowledge
Now for the main event. Subagents solve the context problem. But how do you make AI learn from your corrections?
The CLAUDE.md Pattern
In Claude Code, teams maintain a CLAUDE.md file in their repository—a kind of "constitution" for the project. The file is automatically loaded into context at every session.
But there's a critical limitation. HumanLayer research shows that Claude Code's system prompt already contains ~50 instructions, and frontier LLMs reliably follow only 150-200 instructions total. Instruction-following quality decreases uniformly as count increases.
This means: you can't just dump 500 rules and expect magic.
Three Levels of Knowledge
Effective rules encode knowledge at three levels:
WHAT (Project Map)

```markdown
## Tech Stack
- Backend: Python 3.11, FastAPI, SQLAlchemy
- Frontend: React 18, TypeScript, TailwindCSS
- DB: PostgreSQL 15
```

WHY (Architectural Decisions)

```markdown
## Why We DON'T Use ORM for Complex Queries
History: ORM generated N+1 queries in reports.
Decision: Raw SQL for analytics, ORM only for CRUD.
```

HOW (Processes)

```markdown
## Before Committing
- Run `make lint` — must pass with no errors
- Run `make test` — coverage must not drop
```

The Problem with Manual Approaches
Manual rule maintenance works... as long as your team is small and disciplined. In reality:
- Developers forget to update rules
- Rules go stale faster than code
- Implicit conventions stay implicit
- Tribal knowledge dies when key people leave
How diffray Automates Rule Crafting
diffray flips the process on its head. Instead of manually writing rules, you just give feedback on reviews.
The Learning Loop
Step 1: You Give Feedback
Gave a thumbs-down to a diffray comment? Replied "this isn't a bug, it's intentional"? Ignored a recommendation? diffray captures it all.
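Conceptually, each of those interactions becomes a structured event; the field names below are illustrative, not diffray's real schema:

```python
from dataclasses import dataclass

@dataclass
class FeedbackEvent:
    comment_id: str        # the diffray review comment being reacted to
    kind: str              # "thumbs_up" | "thumbs_down" | "reply" | "ignored"
    reply_text: str | None # e.g. "this isn't a bug, it's intentional"
    pr_url: str
    file_path: str

event = FeedbackEvent(
    comment_id="c_123",
    kind="reply",
    reply_text="this isn't a bug, it's intentional",
    pr_url="https://example.com/acme/app/pull/42",
    file_path="src/api/users.py",
)
```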
Step 2: Pattern Extraction
diffray analyzes: what exactly was wrong? Was it a false alarm (code is correct), inapplicable context (rule doesn't apply here), or project-specific convention (that's how we do it here)?
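A minimal sketch of that triage, assuming the three buckets above; in practice the classification would be model-driven rather than keyword-based:

```python
from enum import Enum

class FeedbackPattern(Enum):
    FALSE_ALARM = "false_alarm"            # the flagged code is actually correct
    INAPPLICABLE_CONTEXT = "inapplicable"  # valid rule, wrong context
    PROJECT_CONVENTION = "convention"      # "that's how we do it here"

def classify(comment_text: str, reply_text: str | None) -> FeedbackPattern:
    # Keyword checks stand in here only to make the buckets concrete.
    reply = (reply_text or "").lower()
    if "intentional" in reply or "not a bug" in reply:
        return FeedbackPattern.FALSE_ALARM
    if "legacy" in reply or "test" in reply:
        return FeedbackPattern.INAPPLICABLE_CONTEXT
    return FeedbackPattern.PROJECT_CONVENTION
```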
Step 3: Rule Generation
Based on the pattern, diffray formulates a rule that specifies the scope (which files/directories), what to suppress or enforce, and why. The rule is linked to the original feedback for traceability.
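A generated rule might carry exactly those pieces: scope, action, rationale, and a link back to the feedback that produced it. The shape below is an assumption for illustration, not diffray's internal format:

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    scope: list[str]     # glob patterns for the files/directories the rule covers
    action: str          # "suppress" or "enforce"
    check: str           # which finding category it applies to
    reason: str          # why the rule exists, in the team's own words
    source_feedback: list[str] = field(default_factory=list)  # traceability

rule = Rule(
    scope=["src/reports/**"],
    action="suppress",
    check="orm-raw-sql-warning",
    reason="Raw SQL is the documented convention for analytics queries.",
    source_feedback=["c_123"],
)
```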
Step 4: Validation
Before applying the rule, diffray runs it against historical PRs. How many comments would have been suppressed? How many of those were actual false positives? The rule is applied only if it improves accuracy.
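One way to picture that validation, assuming historical comments have already been labeled by past feedback (the names and the 0.9 threshold are assumptions):

```python
def should_apply(rule_matches: list[bool], was_false_positive: list[bool]) -> bool:
    """Replay a candidate rule against historical review comments.

    rule_matches[i]       -- would the rule have suppressed comment i?
    was_false_positive[i] -- did the team mark comment i as a false positive?
    """
    suppressed = [fp for m, fp in zip(rule_matches, was_false_positive) if m]
    if not suppressed:
        return False  # the rule changes nothing
    # Apply only if the overwhelming majority of what it silences was noise,
    # i.e. it improves precision without hiding real findings.
    precision_of_suppression = sum(suppressed) / len(suppressed)
    return precision_of_suppression >= 0.9

# Example: the rule would have silenced 12 comments, 11 of them false positives.
# should_apply([True] * 12 + [False] * 50, [True] * 11 + [False] * 51)  -> True
```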
Types of Rules in diffray
Suppression Rules
"Don't flag X in context Y" — silence specific warnings in legacy code, test files, or generated code.
Enforcement Rules
"Always check for Z" — ensure critical patterns like SQL parameterization or auth checks are never missed.
Context Rules
"Consider the specifics" — adjust priority based on file type, decorators, or surrounding code patterns.
Terminology Rules
"We call it this" — teach diffray your domain vocabulary so it understands your codebase better.
Practical Example: From Annoyance to Rule
Imagine: diffray leaves a comment on your PR:
Warning (Performance): Using `any` reduces type safety. Consider explicit typing.
You know this is a legacy module scheduled for rewrite next quarter. Fixing types now would be a waste of time.
You reply: "This is legacy, typing will be addressed during Q2 refactoring"
What happens next:
- diffray notes the context: the file lives in src/legacy/, and there's a TODO with a date
- It creates a suppression rule for src/legacy/** with an expiration date (Q2)
- On future PRs touching src/legacy/, diffray stays silent about types

But importantly: the rule isn't permanent. The expiration date means that after Q2, diffray will start checking types in that directory again.
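Rendered as a rule, that outcome could look roughly like the following (shape and field names are again illustrative):

```python
# A hypothetical rendering of the rule diffray derives from that exchange.
legacy_typing_rule = {
    "action": "suppress",
    "check": "type-safety",           # covers comments like the `any` warning above
    "scope": ["src/legacy/**"],
    "reason": "Legacy module; typing will be addressed during Q2 refactoring.",
    "expires": "Q2",                  # after Q2, type checks in src/legacy/ resume
}
```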
The Metric: Reducing False Positive Rate
The key measure of AI code review effectiveness is false positive rate. How many comments out of 100 were useless?
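The arithmetic is simple:

```python
def false_positive_rate(total_comments: int, unhelpful_comments: int) -> float:
    """Share of review comments the team judged useless (flagged non-issues)."""
    return unhelpful_comments / total_comments

false_positive_rate(100, 42)  # 0.42, squarely in the typical baseline range below
```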
Typical industry benchmarks:
- 40-60%: baseline AI review false positives
- 25-35%: with manual rules
- 8-13%: diffray with learned rules
How we achieve this:
- Context isolation through subagents prevents drift
- Agent specialization improves accuracy in each domain
- Learning from feedback eliminates recurring false positives
- Rule validation prevents overfitting
Getting Started: Three Steps
Step 1: Connect diffray to Your Repository
Integration takes 5 minutes via GitHub App or GitLab webhook.
Step 2: Just Work
For the first 2-3 weeks, diffray operates in learning mode. It studies your project structure, your PR patterns, and your reviewers' comment style.
Step 3: Give Feedback
Don't silently ignore diffray comments. Give thumbs-up to useful ones, thumbs-down to useless ones, reply to debatable ones.
Every interaction makes diffray smarter. After a month, you'll have a personalized AI reviewer that knows your conventions better than a new developer after onboarding.
Conclusion: AI That Grows with Your Team
The philosophy of "every mistake becomes a rule" isn't just a catchy phrase. It's an architectural principle that separates toy tools from production-ready solutions.
diffray is built on three pillars:
- Subagents with isolated context, for accuracy without pollution
- Rule crafting from feedback, for learning without manual work
- Validation on history, for confidence in improvements
The result: AI code review that gets better with every PR. Not because the model was updated, but because it learns from your team.
Start Teaching Your AI Reviewer Today
Install diffray and open a PR. It's free for public repos and includes a generous free tier for private repos.