Comparing PeerGenius reviews against BMJ open peer reviews across 5 published manuscripts
8.86/10
Review Quality Score
vs 7.86/10 journal
4 of 5
Near-Parity Achieved
Gap ≤ 1.0 points
8 vs 1
Critical Flaws Detected
AI vs Journal
64.2%
Avg Complementarity
Reviews are synergistic
Across all five manuscripts, the AI reviews achieved a higher average quality score (8.86/10) than the journal reviews (7.86/10), an average gap of 1.0 point in the AI's favor. Near-parity was achieved in 4 of 5 cases (80%), suggesting that, in this sample, AI review quality is comparable to that of human expert reviewers.
The AI detected 8 validity-threatening critical flaws that journal reviewers missed; the journal reviewers in turn identified 1 critical flaw that the AI missed. The average complementarity score was 64.2%, indicating that the two review types identify substantially different issues and are complementary rather than redundant.

Statistical Rigor
Identifying flaws in statistical methods, power, and assumptions
Methodological Standards
Ensuring adherence to reporting guidelines (e.g., TRIPOD, CONSORT)
Systematic Completeness
Comprehensive coverage with multiple specialized agents
Actionability & Structure
Delivering structured, prioritized, and actionable feedback
Clinical/Domain Context
Understanding real-world clinical implications of research
Interpretive Depth
Nuanced, counter-intuitive, and sophisticated interpretations
Special Context Recognition
Adapting to unique manuscript contexts (e.g., Christmas articles)
Public Health Implications
Raising concerns about potential misuse or misinterpretation
QCovid4: An updated risk prediction model for hospitalisation and death from COVID-19
Journal 8.3 vs AI 9.2 (gap +0.9)
Multiple adverse outcomes associated with antipsychotic use in people with dementia
Journal 7.6 vs AI 9.6 (gap +2.0)
Impact of abolishing primary care financial incentives on quality-of-care in Scotland
Journal 7.8 vs AI 8.1 (gap +0.3)
What are the risks of having arthroscopic shoulder surgery? A population-based cohort study
Journal 7.8 vs AI 8.8 (gap +1.0)
The association of genetically proxied sildenafil with fertility, sexual activity, and wellbeing: a Mendelian randomisation study
Journal 7.8 vs AI 8.6 (gap +0.8)
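The headline figures can be recomputed from the five per-manuscript scores listed above. The following is a minimal sketch, not PeerGenius's own scoring code: the short manuscript labels, the variable names, and the near-parity rule (a gap of at most 1.0 point) are assumptions chosen for illustration. The scores themselves are copied from the comparisons above.

```python
from statistics import mean

# (journal score, AI score) per manuscript, copied from the comparisons above.
# The short labels are illustrative abbreviations of the manuscript titles.
scores = {
    "QCovid4": (8.3, 9.2),
    "Antipsychotics in dementia": (7.6, 9.6),
    "Primary care incentives (Scotland)": (7.8, 8.1),
    "Arthroscopic shoulder surgery": (7.8, 8.8),
    "Genetically proxied sildenafil": (7.8, 8.6),
}

# Assumption: "near-parity" means the AI-minus-journal gap is at most 1.0 point.
NEAR_PARITY_THRESHOLD = 1.0

journal_avg = round(mean(j for j, _ in scores.values()), 2)
ai_avg = round(mean(a for _, a in scores.values()), 2)

# Round each gap to one decimal so floating-point noise (e.g. 0.2999...) does
# not affect the threshold comparison.
gaps = {name: round(ai - journal, 1) for name, (journal, ai) in scores.items()}
avg_gap = round(mean(gaps.values()), 2)

near_parity = [name for name, gap in gaps.items() if abs(gap) <= NEAR_PARITY_THRESHOLD]

print(f"Journal average: {journal_avg}")
print(f"AI average:      {ai_avg}")
print(f"Average gap:     {avg_gap}")
print(f"Near-parity:     {len(near_parity)} of {len(scores)}")
```

Under the at-most-1.0 rule the count is 4 of 5. With a strict less-than-1.0 rule the shoulder surgery manuscript (gap +1.0) would drop out and the count would be 3 of 5, so the choice of threshold matters for this headline number.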
Important Note
This analysis is based on a preliminary comparison of 5 manuscripts published in The BMJ (2021–2023). While the results provide encouraging evidence, the sample size is limited and findings should be interpreted with appropriate caution.
PeerGenius recommends a complementary hybrid approach: AI review as a first-pass screening for statistical and methodological rigor, combined with human expert review for clinical context, interpretive depth, and domain-specific judgment. AI review complements but does not replace traditional peer review.