What makes diffray different from other AI code review tools?

diffray uses multi-agent intelligence instead of single-model AI. Multiple specialized agents work together - Security Agent, Performance Agent, Architecture Agent, and Consistency Agent - each expert in their domain. This coordinated approach reduces false positives by 87% and catches 3x more real bugs compared to traditional single-agent tools like GitHub Copilot or CodeRabbit.

How does multi-agent AI code review work?

Multi-agent AI code review deploys specialized agents that work in parallel, each focused on a specific domain: security vulnerabilities, performance bottlenecks, architectural patterns, and code consistency. Unlike single-model approaches that suffer from context dilution, each agent maintains deep expertise in its area. Research shows this approach improves bug detection by 3x while reducing noise.

Is diffray free for open source projects?

Yes, diffray is completely free forever for open source projects. We support the open source community with full access to our multi-agent code review platform, including all specialized agents, unlimited reviews, and priority support.

What programming languages does diffray support?

diffray supports all major programming languages including TypeScript, JavaScript, Python, Go, Rust, Java, C#, Ruby, PHP, and more. The multi-agent system is language-agnostic and adapts its analysis to language-specific patterns and best practices.

How does diffray integrate with GitHub?

diffray integrates seamlessly with GitHub through a GitHub App. Once installed, it automatically reviews every pull request, posting actionable comments directly on the PR. Setup takes less than 2 minutes with no configuration required. Enterprise teams can also use diffray CLI for local reviews before pushing code.

What is the difference between diffray and CodeRabbit or GitHub Copilot?

While CodeRabbit and GitHub Copilot use single-model AI that can hallucinate and produce false positives, diffray employs multi-agent intelligence where specialized agents cross-validate findings. This results in 87% fewer false positives. Additionally, diffray provides full codebase awareness, custom rule support, and agent memory that learns from your team's patterns.

Can diffray detect security vulnerabilities?

Yes, diffray's Security Agent is specifically trained to detect OWASP Top 10 vulnerabilities, injection attacks, authentication flaws, and sensitive data exposure. It analyzes code in the context of your entire codebase, reducing false positives while catching real security issues that static analysis tools miss.

How much does diffray reduce code review time?

According to our customer data, teams using diffray reduce PR review time by 73% on average - from 45 minutes to 12 minutes per week. This is because diffray's multi-agent system produces 87% fewer false positives, so developers spend time on real issues instead of filtering noise.

What is the developer action rate on diffray comments?

diffray achieves a 98% developer action rate on its comments, compared to an industry average of 15-20% for traditional AI code review tools. This high engagement is due to our multi-agent approach, which eliminates noise and surfaces only actionable findings with confidence scores.

How does diffray handle duplicate comments?

diffray guarantees zero duplicate comments through its intelligent deduplication system. Unlike single-agent tools that often flag the same issue multiple times across a PR, diffray's agents coordinate to consolidate findings and present each issue exactly once with full context.

Does diffray store my code?

No, diffray never stores your source code. Code is processed in memory during the review and immediately discarded. We are SOC 2 compliant and your code is never used for AI training. Enterprise customers can also use our on-premise deployment option for complete data sovereignty.

How does diffray compare to GitHub Copilot code review?

While GitHub Copilot uses a single AI model for code review, diffray employs specialized multi-agent intelligence. Research shows multi-agent systems catch 3x more real bugs while producing 87% fewer false positives. diffray also provides full codebase awareness, custom rules, and agent memory - features not available in Copilot's code review.

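Context Dilution: More Tokens Can Actually Degrade AI Performance

Research from Stanford, Google, Anthropic, and Meta shows that large language models suffer predictable performance degradation when the context window contains too much information. This phenomenon, known as context dilution, causes models to "lose" key information in long prompts: as context grows, accuracy drops by 13.9% to 85%, even when the model has perfect access to the relevant data.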

13.9-85%

Accuracy drop as context grows

20+ pp

Performance drop when the information sits in the middle

49-67%

Fewer errors with contextual retrieval

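The "Lost in the Middle" Phenomenon: Why Position Matters

The landmark 2023 paper "Lost in the Middle: How Language Models Use Long Contexts," by researchers from Stanford, UC Berkeley, and Samaya AI, laid the groundwork for understanding context dilution. Testing models including GPT-3.5-Turbo, Claude-1.3, and LongChat on multi-document question answering, the researchers found a striking U-shaped performance curve: LLMs perform better when the relevant information appears at the beginning or end of the context, and accuracy falls sharply when the key details are buried in the middle.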

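[Figure: U-shaped performance curve. Model accuracy versus the position of the relevant information in the context. Labels: start, 25%, middle, 75%, end.]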

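The degradation is substantial. Performance falls by more than 20 percentage points when the relevant information moves from the edges of the context to the center. Strikingly, when the relevant information is placed in the middle of a 20-document context, GPT-3.5-Turbo's accuracy on multi-document question answering is lower than its accuracy with no context at all.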
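The position experiment described above can be sketched as a small prompt-building harness. This is a minimal illustration, not the paper's code: the function names (`build_context`, `format_prompt`, `position_sweep`) and the prompt layout are assumptions made for this example, and the step that sends each prompt to a model and scores the answer is left out.

```python
from typing import Iterator, List, Tuple


def build_context(gold_doc: str, distractors: List[str], gold_position: int) -> List[str]:
    """Return the document list with the gold document inserted at gold_position."""
    if not 0 <= gold_position <= len(distractors):
        raise ValueError("gold_position out of range")
    docs = list(distractors)  # copy so the caller's list is not mutated
    docs.insert(gold_position, gold_doc)
    return docs


def format_prompt(docs: List[str], question: str) -> str:
    """Number the documents and append the question (hypothetical layout)."""
    numbered = [f"Document [{i + 1}] {doc}" for i, doc in enumerate(docs)]
    return "\n".join(numbered) + f"\n\nQuestion: {question}\nAnswer:"


def position_sweep(
    gold_doc: str, distractors: List[str], question: str
) -> Iterator[Tuple[int, str]]:
    """Yield (position, prompt) for every possible placement of the gold document."""
    for pos in range(len(distractors) + 1):
        yield pos, format_prompt(build_context(gold_doc, distractors, pos), question)
```

Scoring a model on each yielded prompt and plotting accuracy against `pos` is what would expose a U-shaped curve, if the model under test has one.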

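Attention Sinks and Dilution: Fundamental Architectural Limits

Researchers from MIT and Meta AI uncovered another piece of the puzzle in their ICLR 2024 paper "Efficient Streaming Language Models with Attention Sinks." They found that initial tokens receive disproportionately high attention scores even when they are not semantically important, a phenomenon they named attention sinks.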

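Why Attention Dilution Happens

Softmax forces attention to sum to 1

Adding more tokens means each token receives less attention on average

Attention sinks absorb excess attention

Initial tokens become "drains" regardless of their relevance

Irrelevant tokens steal attention from relevant ones

Each additional document gradually degrades signal quality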
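The softmax constraint can be made concrete with a toy calculation. The sketch below assumes a single attention head with one relevant token and `n` identical distractor tokens; the scores (4.0 and 0.0) are arbitrary illustrative values, not measurements from any model. Under that assumption the relevant token's weight is e^4 / (e^4 + n), so it shrinks as distractors are added even though its own score never changes.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: the outputs are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relevant_attention(
    n_distractors: int, relevant_score: float = 4.0, distractor_score: float = 0.0
) -> float:
    """Attention weight on one relevant token competing with n distractors."""
    scores = [relevant_score] + [distractor_score] * n_distractors
    return softmax(scores)[0]


# The relevant token's score is fixed; only the amount of competition grows.
for n in (10, 100, 1000, 10000):
    print(f"{n:>6} distractors -> relevant token weight {relevant_attention(n):.4f}")
```

Real models have many heads and learned scores, so this only illustrates the normalization effect, not the size of the drop in practice.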

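Empirical Benchmarks Quantify the Degradation

NVIDIA's RULER benchmark, released in April 2024, shows that claimed context lengths far exceed effective context lengths:

Model	Claimed context	Effective context	Drop (4K→128K)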
GPT-4	128K	64K	-15.4 pp
Yi-34B	200K	32K	-16.0 pp
Mistral 7B	32K	16K	-79.8 pp
Mixtral 8x7B	32K	32K	-50.4 pp
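
As a quick reading aid, the share of each claimed window that is effectively usable can be computed directly from the table. This is plain arithmetic on the rows above; the helper name `effective_fraction` is just an illustrative choice.

```python
# Rows copied from the RULER table above:
# (model, claimed context in K tokens, effective context in K tokens, drop in pp).
RULER_ROWS = [
    ("GPT-4", 128, 64, -15.4),
    ("Yi-34B", 200, 32, -16.0),
    ("Mistral 7B", 32, 16, -79.8),
    ("Mixtral 8x7B", 32, 32, -50.4),
]


def effective_fraction(claimed_k: int, effective_k: int) -> float:
    """Fraction of the claimed context window that is effectively usable."""
    return effective_k / claimed_k


for model, claimed, effective, drop in RULER_ROWS:
    share = effective_fraction(claimed, effective)
    print(f"{model:<13} usable share {share:.0%}  drop {drop:+.1f} pp")
```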

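Context Length Hurts Performance Even with Perfect Retrieval

The October 2025 arXiv paper "Context Length Alone Hurts LLM Performance Despite Perfect Retrieval" offers the most counterintuitive finding. Even with 100% perfect retrieval of the relevant information, performance degrades by 13.9% to 85% as input length increases.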