What makes diffray different from other AI code review tools?

diffray uses multi-agent intelligence instead of single-model AI. Multiple specialized agents work together (Security Agent, Performance Agent, Architecture Agent, and Consistency Agent), each an expert in its domain. This coordinated approach reduces false positives by 87% and catches 3x more real bugs compared to traditional single-agent tools like GitHub Copilot or CodeRabbit.

How does multi-agent AI code review work?

Multi-agent AI code review deploys specialized agents that work in parallel, each focused on a specific domain: security vulnerabilities, performance bottlenecks, architectural patterns, and code consistency. Unlike single-model approaches that suffer from context dilution, each agent maintains deep expertise in its area. Research shows this approach improves bug detection by 3x while reducing noise.

Is diffray free for open source projects?

Yes, diffray is completely free forever for open source projects. We support the open source community with full access to our multi-agent code review platform, including all specialized agents, unlimited reviews, and priority support.

What programming languages does diffray support?

diffray supports all major programming languages including TypeScript, JavaScript, Python, Go, Rust, Java, C#, Ruby, PHP, and more. The multi-agent system is language-agnostic and adapts its analysis to language-specific patterns and best practices.

How does diffray integrate with GitHub?

diffray integrates with GitHub through a GitHub App. Once installed, it automatically reviews every pull request and posts actionable comments directly on the PR. Setup takes less than 2 minutes with no configuration required. Enterprise teams can also use the diffray CLI for local reviews before pushing code.

What is the difference between diffray and CodeRabbit or GitHub Copilot?

While CodeRabbit and GitHub Copilot use single-model AI that can hallucinate and produce false positives, diffray employs multi-agent intelligence where specialized agents cross-validate findings. This results in 87% fewer false positives. Additionally, diffray provides full codebase awareness, custom rule support, and agent memory that learns from your team's patterns.

Can diffray detect security vulnerabilities?

Yes, diffray's Security Agent is specifically trained to detect OWASP Top 10 vulnerabilities, injection attacks, authentication flaws, and sensitive data exposure. It analyzes code in the context of your entire codebase, reducing false positives while catching real security issues that static analysis tools miss.

How much does diffray reduce code review time?

According to our customer data, teams using diffray reduce PR review time by 73% on average, from 45 minutes to 12 minutes per week. This is because diffray's multi-agent system produces 87% fewer false positives, so developers spend their time on real issues instead of filtering noise.

What is the developer action rate on diffray comments?

diffray achieves a 98% developer action rate on its comments, compared to an industry average of 15-20% for traditional AI code review tools. This high engagement is due to our multi-agent approach, which eliminates noise and surfaces only actionable findings with confidence scores.

How does diffray handle duplicate comments?

diffray guarantees zero duplicate comments through its intelligent deduplication system. Unlike single-agent tools that often flag the same issue multiple times across a PR, diffray's agents coordinate to consolidate findings and present each issue exactly once with full context.

Does diffray store my code?

No, diffray never stores your source code. Code is processed in memory during the review and immediately discarded. We are SOC 2 compliant and your code is never used for AI training. Enterprise customers can also use our on-premise deployment option for complete data sovereignty.

How does diffray compare to GitHub Copilot code review?

While GitHub Copilot uses a single AI model for code review, diffray employs specialized multi-agent intelligence. Research shows multi-agent systems catch 3x more real bugs while producing 87% fewer false positives. diffray also provides full codebase awareness, custom rules, and agent memory, features that are not available in Copilot's code review.

What is multi-agent code review?

Multi-agent code review is a methodology that uses multiple specialized AI agents working in parallel to analyze code. Unlike single-agent tools that use one AI model for everything, multi-agent systems deploy specialized agents (Security Agent, Performance Agent, Architecture Agent, etc.) that cross-validate findings. This approach reduces false positives by 87% and catches 3x more real bugs.

How does multi-agent code review differ from single-agent tools?

Single-agent tools like GitHub Copilot or CodeRabbit use one AI model that tries to analyze all aspects of code simultaneously. This leads to 'context dilution' where the model's attention spreads thin across many concerns. Multi-agent code review solves this by having specialized agents, each focused on one domain with full context for their specialty. The result: 87% fewer false positives and 98% developer action rate.

What agents are used in multi-agent code review?

diffray uses 10+ specialized agents including: Security Agent (OWASP Top 10, vulnerabilities), Performance Agent (N+1 queries, bottlenecks), Architecture Agent (design patterns, SOLID principles), Consistency Agent (code style, conventions), Testing Agent (test coverage, quality), Documentation Agent, and more. Each agent has deep expertise in its domain.

Why is multi-agent code review more accurate?

Multi-agent code review achieves higher accuracy through cross-validation. When one agent flags an issue, other relevant agents verify the finding before reporting. For example, if Security Agent detects a vulnerability, Architecture Agent checks if it's already handled by middleware. Only high-confidence findings that pass validation are reported, resulting in 87% fewer false positives.

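Multi-Agent Architecture

Multi-Agent Code Review

Multi-agent code review uses 10+ specialized AI agents to investigate, verify, and confirm findings in your code. Unlike single-agent tools, each agent focuses on one domain (security, performance, architecture), which reduces false positives by 87% and catches 3x more real bugs.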

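A Fair Question

"Can't a single prompt do this?"

"If modern LLMs can handle 200k tokens, why not just send the diff plus the relevant context and let the model figure it out? What is the point of all this agent complexity?"

The Core Problem: Your Codebase Doesn't Fit

A prompt only sees what you send it. For a meaningful code review, you need context from the whole codebase: imports, dependencies, related files, tests, and conventions.

Average codebase: 100k-500k+ lines

LLM context window: ~200k tokens at most

Practical performance ceiling: ~25-30k tokens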
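
To make the mismatch concrete, here is a rough back-of-the-envelope sketch in Python. The figure of 10 tokens per line of code is an assumption chosen for illustration, not a number from this page; actual ratios depend on the language and the tokenizer. The window and ceiling values are the ones quoted above.

```python
# Rough token-budget arithmetic. TOKENS_PER_LINE is an assumed average,
# not a figure from this page; real values vary by language and tokenizer.
TOKENS_PER_LINE = 10

CONTEXT_WINDOW_TOKENS = 200_000      # "~200k tokens" from the figures above
PRACTICAL_CEILING_TOKENS = 30_000    # upper end of "~25-30k tokens"


def estimated_tokens(lines_of_code: int, tokens_per_line: int = TOKENS_PER_LINE) -> int:
    """Estimate how many tokens a codebase of the given size would occupy."""
    return lines_of_code * tokens_per_line


def fraction_that_fits(lines_of_code: int, budget_tokens: int) -> float:
    """Fraction of the codebase that fits in a token budget (capped at 1.0)."""
    return min(1.0, budget_tokens / estimated_tokens(lines_of_code))


for loc in (100_000, 500_000):
    total = estimated_tokens(loc)
    print(
        f"{loc:>7,} lines ≈ {total:>9,} tokens | "
        f"fits in window: {fraction_that_fits(loc, CONTEXT_WINDOW_TOKENS):.1%} | "
        f"fits under practical ceiling: {fraction_that_fits(loc, PRACTICAL_CEILING_TOKENS):.1%}"
    )
```

Under that assumption, a 100k-line codebase comes to about 1M tokens, so roughly 20% of it fits in a 200k-token window and about 3% fits under a 30k-token practical ceiling.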

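Even If It Fit, It Wouldn't Work

Research shows that stuffing more context into an LLM actually hurts performance. This is known as "context dilution".

10-20%: performance drop caused by too many documents

U-shaped curve: information in the middle gets "lost"

60-80%: false positive rate of context-stuffing tools

Read the research: Why curated context beats context volume →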

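What Agents Actually Provide

Agents don't just "read the prompt better". They actively investigate your codebase:

Selective context retrieval

Fetch relevant files on demand instead of stuffing everything in up front

Hypothesis verification

"I suspect a type mismatch" → search for callers → confirm with static analysis

Iterative investigation

Follow leads across files and dig deeper when something looks suspicious

Tool integration

Run linters, type checkers, and analyzers to verify findings against real data

A prompt only sees what you give it.

An agent finds what it needs.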
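
As a rough illustration of on-demand retrieval and iterative investigation, the sketch below follows leads from one file to the next within a fixed step budget. It is not diffray's implementation. The toy `REPO`, the `read_file` and `find_leads` helpers, and the `max_steps` budget are all hypothetical, and "leads" are reduced to plain `import` lines.

```python
import re
from collections import deque
from typing import Callable

# Hypothetical toy repository: path -> file contents.
REPO = {
    "handler.py": "import service\nimport models\n",
    "service.py": "import db\n",
    "models.py": "",
    "db.py": "import models\n",
}


def read_file(path: str) -> str:
    """Stand-in for a file-reading tool."""
    return REPO[path]


def find_leads(path: str, content: str) -> list[str]:
    """Treat every in-repo import as a lead worth following."""
    names = re.findall(r"^import (\w+)", content, flags=re.MULTILINE)
    return [f"{name}.py" for name in names if f"{name}.py" in REPO]


def investigate(
    start: str,
    read: Callable[[str], str],
    leads: Callable[[str, str], list[str]],
    max_steps: int = 10,
) -> list[str]:
    """Follow leads breadth-first from `start`, reading each file at most once."""
    visited: list[str] = []
    queue = deque([start])
    while queue and len(visited) < max_steps:
        path = queue.popleft()
        if path in visited:
            continue
        visited.append(path)
        # Files are fetched only when a lead points at them.
        for lead in leads(path, read(path)):
            if lead not in visited:
                queue.append(lead)
    return visited


print(investigate("handler.py", read_file, find_leads))
```

The step budget is what keeps the investigation from degenerating into loading the whole repository: with `max_steps=2`, only `handler.py` and `service.py` are read.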

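Precision Over Volume

Curated Context Management

The difference between a useful review and noise isn't how much context you have. It's having the right context.

How diffray Curates Context

Dependency graph analysis

Before a review starts, we build a map of how files connect: imports, exports, type definitions, and call chains

Smart filtering

Each agent receives only the context relevant to its task. The Security Agent gets authentication flows, not UI styles

On-demand retrieval

Agents fetch additional context only when they need it, following leads without overloading up front

Layered context

Core context (diff, types) is always loaded; surrounding context (callers, tests) is loaded on demand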
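
A minimal sketch of the dependency-graph and layered-context ideas, using Python's standard `ast` module on a hypothetical in-memory repository. The module names and the split into `core` and `on_demand` are assumptions made for illustration; the page does not describe how diffray builds its graph, and a real system would also track exports, type definitions, and call chains.

```python
import ast

# Hypothetical in-memory repo: module name -> source text.
REPO = {
    "auth": "import tokens\nfrom db import get_user\n",
    "tokens": "import hashlib\n",
    "db": "import sqlite3\n",
    "ui": "import auth\n",
    "test_auth": "import auth\n",
}


def imported_modules(source: str) -> set[str]:
    """Return the module names that a Python source file imports."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found


def build_dependency_graph(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each module to the in-repo modules it imports."""
    in_repo = set(files)
    return {name: imported_modules(src) & in_repo for name, src in files.items()}


def select_context(changed: str, graph: dict[str, set[str]]) -> dict[str, set[str]]:
    """Split context into an always-loaded core and an on-demand layer."""
    core = {changed} | graph[changed]                      # the change plus what it imports
    on_demand = {mod for mod, deps in graph.items() if changed in deps}  # callers and tests
    return {"core": core, "on_demand": on_demand}


graph = build_dependency_graph(REPO)
print(select_context("auth", graph))
```

For a change to `auth`, the core layer holds `auth`, `tokens`, and `db`, while `ui` and `test_auth` stay in the on-demand layer until an agent asks for them.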

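The context-stuffing approach

200k tokens of everything: the diff, full files, random dependencies...

Signal drowns in noise

Important details end up "lost in the middle"

Attention is spread across irrelevant code

The curated-context approach

Focused chunks: the diff + direct dependencies + relevant patterns

Every token serves a purpose

Key information stays in focus

Full attention on what matters

Learn more about our AI engine →

The Problem with "Just Ask the LLM"

Reviewing code with a single LLM call has fundamental limits

Single LLM call

Only sees what you send

Limited to the diff you provide

One-shot generation

No iteration or verification

Can't follow imports

Blind to dependencies and context

Hallucinations go unchecked

No way to verify claims

Fixed context window

Attention spread across every concern

Generic advice

"Make sure the callers are updated"

Agent-based system

Explores the codebase autonomously

Navigates the whole project

Iterative analysis

Follows leads and digs deeper

Navigates project structure

Understands imports and dependencies

Verifies with real tools

Runs static analyzers to confirm

Focused attention

Each agent specializes in one domain

Specific findings

"3 call sites have type mismatches at lines 45, 89, and 112"

The difference is between guessing and investigating.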

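What Makes Agents Different?

Agents are AI systems that can think, act, and verify

Use tools

Read files, search code, run static analyzers

Make decisions

Choose what to investigate based on what they find

Iterate

Follow leads, verify hypotheses, dig deeper

Self-correct

Check their reasoning against real data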

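What diffray Agents Actually Do

When diffray reviews your PR, the agents don't just "look at the diff"

Trace dependencies

Follow imports to understand how the changed code affects the whole system

Check related files

Examine tests, configuration, and documentation for context

Verify hypotheses

Run static analysis to confirm that a suspected problem actually exists

Cross-reference

Look up type definitions, API contracts, and conventions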

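A Real Example

Consider a function signature change in a PR:

Single-LLM approach

"This changes the return type, make sure the callers are updated"

Generic advice. Nothing specific.

Agent approach

Searches for every usage of the function
Identifies 3 call sites with type mismatches
Checks whether tests cover these scenarios
Reports the specific files and line numbers

→ "Found 3 breaking changes: src/api/users.ts:45, src/hooks/useAuth.ts:89, src/utils/validate.ts:112"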
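
The "search every usage, then report file and line" step can be sketched with Python's `ast` module. This is a simplified stand-in rather than diffray's method: it checks positional argument counts instead of real types, it works on Python sources rather than the TypeScript files named above, and the `FILES` contents and the `check` function are hypothetical.

```python
import ast

# Hypothetical sources: path -> contents. Suppose check() now needs 2 arguments.
FILES = {
    "api/users.py": "from validate import check\n\nresult = check(name)\n",
    "hooks/auth.py": "import validate\n\nok = validate.check(token, strict)\n",
    "utils/forms.py": "from validate import check\n\n\ncheck(field)\n",
}


def find_call_sites(files: dict[str, str], func_name: str) -> list[tuple[str, int, int]]:
    """Return (path, line, positional_arg_count) for each call to func_name."""
    sites = []
    for path, source in files.items():
        for node in ast.walk(ast.parse(source)):
            if not isinstance(node, ast.Call):
                continue
            callee = node.func
            # Handle both check(...) and module.check(...).
            if isinstance(callee, ast.Name):
                name = callee.id
            else:
                name = getattr(callee, "attr", None)
            if name == func_name:
                sites.append((path, node.lineno, len(node.args)))
    return sorted(sites)


def breaking_calls(files: dict[str, str], func_name: str, expected_args: int) -> list[str]:
    """List call sites whose argument count no longer matches the signature."""
    return [
        f"{path}:{line}"
        for path, line, count in find_call_sites(files, func_name)
        if count != expected_args
    ]


print(breaking_calls(FILES, "check", expected_args=2))
# → ['api/users.py:3', 'utils/forms.py:4']
```

The output is a list of concrete locations, which is the difference between "make sure the callers are updated" and a finding a developer can act on directly.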

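Full Codebase Awareness

The Diff Is Not Enough

To really understand a change, you need to see how it fits into the whole codebase

What a diff-only review sees

A new function formatUserName() was added

The syntax looks correct

No obvious bugs in these 20 lines

Verdict: "LGTM", but it misses the bigger picture entirely

What a codebase-aware agent sees

This function duplicates utils/names.ts:formatName()

The existing function handles an edge case that this one misses

3 other files already use the existing utility function

This violates the naming conventions in /docs/CONVENTIONS.md

Verdict: "Consider using the existing formatName() in utils/names.ts"

What diffray agents check beyond the diff:

Duplicate detection

Is the developer reinventing the wheel? Does a similar solution already exist in the codebase?

Pattern consistency

Do the changes follow established patterns, or do they introduce a conflicting approach?

Impact analysis

How do the changes affect the rest of the system? What depends on the modified code?

Convention adherence

Are team conventions and documented standards being followed?

A diff shows what changed. Full codebase context shows whether it should have changed.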
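
One simple way to approximate duplicate detection is to compare functions after renaming their parameters and variables to placeholders. The sketch below does this with Python's `ast` module. It only catches structurally identical functions, so it would not flag a near-duplicate that misses an edge case, and it is not a description of how diffray detects duplicates. The file path and function names are hypothetical Python analogues of the example above.

```python
import ast
import copy

# Hypothetical existing utility and newly added function.
EXISTING = {
    "utils/names.py": (
        "def format_name(first, last):\n"
        "    return f'{first.strip()} {last.strip()}'.title()\n"
    ),
}
NEW_CODE = (
    "def format_user_name(given, family):\n"
    "    return f'{given.strip()} {family.strip()}'.title()\n"
)


class _Canonicalize(ast.NodeTransformer):
    """Rename parameters and variables to placeholders in order of appearance."""

    def __init__(self) -> None:
        self.names: dict[str, str] = {}

    def _canon(self, name: str) -> str:
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self._canon(node.arg)
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self._canon(node.id)
        return node


def fingerprint(func: ast.FunctionDef) -> str:
    """Structural fingerprint that ignores the function's own name and variable names."""
    clone = copy.deepcopy(func)  # leave the caller's tree untouched
    clone.name = "_"
    _Canonicalize().visit(clone)
    return ast.dump(clone)


def find_duplicates(new_source: str, existing: dict[str, str]) -> list[tuple[str, str]]:
    """Return (new_function, 'path:existing_function') pairs with equal fingerprints."""
    known: dict[str, str] = {}
    for path, source in existing.items():
        for node in ast.parse(source).body:
            if isinstance(node, ast.FunctionDef):
                known[fingerprint(node)] = f"{path}:{node.name}"
    matches = []
    for node in ast.parse(new_source).body:
        if isinstance(node, ast.FunctionDef):
            match = known.get(fingerprint(node))
            if match is not None:
                matches.append((node.name, match))
    return matches


print(find_duplicates(NEW_CODE, EXISTING))
# → [('format_user_name', 'utils/names.py:format_name')]
```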

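The Context Dilution Problem

A single LLM that reviews every aspect of the code at once faces a fundamental problem: context dilution.

When it tries to check security, performance, bugs, and style at the same time, its attention is divided. The more concerns it handles, the more likely it is to miss issues.

Read the full article: The Context Dilution Problem →

diffray's solution: specialized agents, each with its own narrow focus. It's like having a team of experts instead of one generalist trying to do everything.

Each agent:

Curated context

Starts with precisely gathered, focused context: only the files, dependencies, and patterns relevant to its specific task

Stays focused

One job, done thoroughly. The Security Agent only looks for vulnerabilities and never drifts into style

Goes deep

Can spend its entire context on its specialty instead of splitting attention across 10 different concerns

Never forgets

Doesn't lose track midway through a review. Every rule, every check, every time, without exception

Never tires

The 50th PR gets the same attention as the first: no fatigue, no rushing, no shortcuts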

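9 Specialized Agents

Meet the Review Team

Security, performance, bugs, architecture, testing, and more. Each agent brings deep expertise in its domain. See what each one does.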

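The Engine Behind diffray

A powerful foundation that makes real multi-agent collaboration possible

Core engine

Latest Anthropic models (Haiku, Sonnet, Opus)
Task-matched model selection
Smart file search
Built-in task management

Tool engine

Static analyzer integration
Hypothesis verification
Concrete tool output
Significantly fewer false positives

Multi-agent architecture

Parallel agent execution
Shared codebase context
Finding deduplication
Cross-agent validation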
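
Finding deduplication can be pictured as merging findings that point at the same location and category, then filtering on confidence. The sketch below is an assumption-laden illustration, not diffray's data model: the `Finding` fields, the merge key, the "keep the highest confidence" rule, and the 0.8 threshold are all hypothetical, and the sample findings are invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    category: str
    agent: str
    confidence: float
    message: str


@dataclass
class MergedFinding:
    file: str
    line: int
    category: str
    message: str
    confidence: float
    agents: list[str] = field(default_factory=list)


def deduplicate(findings: list[Finding]) -> list[MergedFinding]:
    """Merge findings that share (file, line, category); keep the strongest one."""
    merged: dict[tuple[str, int, str], MergedFinding] = {}
    for f in findings:
        key = (f.file, f.line, f.category)
        current = merged.get(key)
        if current is None:
            merged[key] = MergedFinding(
                f.file, f.line, f.category, f.message, f.confidence, [f.agent]
            )
            continue
        if f.agent not in current.agents:
            current.agents.append(f.agent)   # remember every agent that agreed
        if f.confidence > current.confidence:
            current.confidence = f.confidence
            current.message = f.message
    return list(merged.values())


def reportable(merged: list[MergedFinding], threshold: float = 0.8) -> list[MergedFinding]:
    """Keep only findings at or above the (assumed) confidence threshold."""
    return [m for m in merged if m.confidence >= threshold]


# Invented sample data for illustration.
FINDINGS = [
    Finding("src/api/users.ts", 45, "security", "security", 0.9, "Unsanitized input reaches a query"),
    Finding("src/api/users.ts", 45, "security", "architecture", 0.7, "Query is built outside the data layer"),
    Finding("src/ui/button.ts", 10, "consistency", "consistency", 0.4, "Name differs from convention"),
]

for item in reportable(deduplicate(FINDINGS)):
    print(f"{item.file}:{item.line} [{item.category}] {item.message} (agents: {', '.join(item.agents)})")
```

Here the two findings at `src/api/users.ts:45` collapse into one comment that lists both agents, and the low-confidence style finding is dropped before reporting.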

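Staged Review Pipeline

Every review goes through a multi-stage pipeline, with each stage optimized for its purpose

Clone

Fetch the repository and check out the PR

Data preparation

Build the dependency graph

Summarize

An LLM summarizes the changes

Classify

Route files to agents

Rules

Load and filter rules

Review

Parallel agent analysis

Deduplicate

Merge and rescore

Validate

Verify and rescore

Report

Generate PR comments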
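
Two of these stages, classification and parallel review, can be sketched in a few lines of Python. The routing table, the agent names, and the `run_agent` stand-in are hypothetical; a real agent would call a model and tools rather than return a placeholder string, and the page does not say how diffray routes files.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical routing table: file suffix -> agents that should see the file.
ROUTES = {
    ".sql": ["security", "performance"],
    ".ts": ["security", "bugs"],
    ".md": ["documentation"],
}


def classify(changed_files: list[str]) -> dict[str, list[str]]:
    """Classify stage: route each changed file to the agents that care about it."""
    assignments: dict[str, list[str]] = {}
    for path in changed_files:
        for suffix, agents in ROUTES.items():
            if path.endswith(suffix):
                for agent in agents:
                    assignments.setdefault(agent, []).append(path)
    return assignments


def run_agent(agent: str, files: list[str]) -> list[str]:
    """Stand-in for a real agent; returns one placeholder finding per file."""
    return [f"[{agent}] reviewed {path}" for path in files]


def review(assignments: dict[str, list[str]]) -> list[str]:
    """Review stage: run every agent in parallel and collect the findings."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(run_agent, agent, files)
            for agent, files in assignments.items()
        ]
        return sorted(line for future in futures for line in future.result())


assignments = classify(["db/schema.sql", "src/api/users.ts", "README.md"])
for line in review(assignments):
    print(line)
```

Threads are used here because real agents would spend most of their time waiting on network calls; the deduplicate, validate, and report stages would then run on the combined list.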

The Result

A multi-agent system that combines AI reasoning with concrete code analysis, delivering accurate, verified findings instead of guesses.

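Free Resource

AI Code Review Handbook

Data-driven insights from 50+ research sources: why developers spend 5-6 hours a week on reviews, why AI-generated code needs more review, and how to roll out AI tools that developers actually trust.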

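Experience the Difference Agents Make

See how investigation beats guessing. Try diffray free on your next PR.

Start Free Trial | Read the Docs

14-day free trial

No credit card required

2-minute setup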
