Mok et al. (2023)
Journal Score: 7.6 vs. AI Score: 9.6 (gap: +2.0)
The AI review substantially outperformed the journal review by identifying 4 critical methodological flaws
The AI review (9.6/10) surpassed the BMJ journal review (7.6/10) by 2.0 points, the largest gap in this analysis. The AI identified four critical, unaddressed methodological flaws that threaten the validity of the study's conclusions, including a failure to account for time-varying confounding and heterogeneity in the fracture outcome phenotype. None of these four flaws was raised by the three human expert reviewers or the editorial committee. Despite the performance gap, complementarity remains high at 69.6%: the journal review contributed valuable clinical and patient-centered perspectives.
| Dimension | Journal | AI | Winner |
|---|---|---|---|
| Statistical Rigor | 7.0 | 10.0 | AI |
| Methodological Standards | 6.0 | 8.0 | AI |
| Clinical/Domain Context | 8.0 | 10.0 | AI |
| Study Design Critique | 7.0 | 10.0 | AI |
| Data Quality & Verification | 7.0 | 9.0 | AI |
| Interpretive Depth | 8.0 | 10.0 | AI |
| Systematic Completeness | 7.0 | 10.0 | AI |
| Actionability & Structure | 7.0 | 10.0 | AI |
| Tone & Constructiveness | 9.0 | 9.0 | Tie |
| Editorial Judgment | 10.0 | 10.0 | Tie |
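The overall scores reported for this manuscript (7.6 and 9.6) match an unweighted mean of the ten dimension scores in the table. The source does not state how the overall score is computed, so the equal weighting in this minimal sketch is an assumption; the variable names are illustrative only.

```python
from statistics import mean

# Dimension scores copied from the table: (journal, ai).
scores = {
    "Statistical Rigor": (7.0, 10.0),
    "Methodological Standards": (6.0, 8.0),
    "Clinical/Domain Context": (8.0, 10.0),
    "Study Design Critique": (7.0, 10.0),
    "Data Quality & Verification": (7.0, 9.0),
    "Interpretive Depth": (8.0, 10.0),
    "Systematic Completeness": (7.0, 10.0),
    "Actionability & Structure": (7.0, 10.0),
    "Tone & Constructiveness": (9.0, 9.0),
    "Editorial Judgment": (10.0, 10.0),
}

# Assumption: every dimension carries equal weight in the overall score.
journal_overall = round(mean(j for j, _ in scores.values()), 1)
ai_overall = round(mean(a for _, a in scores.values()), 1)
gap = round(ai_overall - journal_overall, 1)

# Count the per-dimension outcomes shown in the "Winner" column.
ai_wins = sum(1 for j, a in scores.values() if a > j)
ties = sum(1 for j, a in scores.values() if a == j)

print(f"Journal {journal_overall} vs AI {ai_overall} (gap +{gap})")
print(f"AI wins: {ai_wins}, ties: {ties}")
```

Under that assumption the sketch reproduces the headline figures and the table's split of eight AI wins and two ties. If the dimensions are in fact weighted, the weights would need to be applied before averaging.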
Complementarity Score: 69.6%
AI and human reviews identify substantially different issues, supporting use as complementary tools.
Issue Breakdown
AI detected 4 critical flaws. Journal detected 0 critical flaws.
Time-Varying Confounding
Fundamental study design flaw: failure to account for time-varying confounding in a longitudinal pharmacoepidemiology study, threatening validity of causal conclusions.
Fracture Phenotype Heterogeneity
Expert pharmacoepidemiology insight: heterogeneous fracture outcome definition (traumatic vs. pathological) invalidates fracture-specific results.
Pathological Fractures Introduce Confounding
Including pathological fractures in the outcome introduces confounding by indication, as the underlying conditions causing fractures may be associated with antipsychotic use.
Combination Products Introduce Confounding
Errors in exposure definition: combination antipsychotic products introduce confounding by indication that is not addressed by the propensity score matching.

Important Note
This analysis is based on a preliminary comparison of 5 manuscripts published in The BMJ (2021–2023). While the results provide encouraging evidence, the sample size is limited and findings should be interpreted with appropriate caution.
PeerGenius recommends a complementary hybrid approach: AI review as a first-pass screening for statistical and methodological rigor, combined with human expert review for clinical context, interpretive depth, and domain-specific judgment. AI review complements but does not replace traditional peer review.