
Don’t let PR review be the frontier in agentic engineering that slows you down
TL;DR
- Using claude-code-action in a GitHub Actions PR workflow lets you run automated code reviews on every PR. Luis Gallardo wrote an excellent article on how to implement this.
- Running automated reviews in the PR, rather than spawning agents on your local machine with Claude Code, produces a shared, durable history of artifacts — useful for compliance, learning, and context.
- Using separate agent(s) for code review can surface issues that the agent that authored the PR would miss due to context window build-up.
Don’t let PR review become the new AI bottleneck
There’s a lot of talk online about PR review becoming one of the bottlenecks in productivity when it comes to agentic engineering. This isn’t wrong. When everyone is shipping 5–10x the amount of code, organizations still rely on engineers to review code and approve it before it goes to production.
Most organizations are still not at a point where they allow AI to ship to production without a human in the loop. That day may come, but perhaps not for every organization, given compliance and liability concerns.
Given that PR reviews are here to stay for a while, how can you address the bottleneck? Automating the review process is a good place to start.
Multiple options
I’ve experimented with 4 options for doing automated PR reviews.
- Have the same agent that worked on the code perform the PR review, using a custom prompt or skill I write within my tool of choice (Claude Code).
- Have 1 or more agents spawned as sub-agents in the background to review the PR on a number of different dimensions (security, test coverage, styling, etc.). Claude has an official code-review plugin for this.
- Utilize the claude-code-action for GitHub PRs. This can run in a GitHub Action (GHA) workflow.
- Utilize the Claude API for code review inside of a GHA workflow.
I recommend going with option #3. My reasons are below.

Cost — Claude API bills separately from Claude Code
I accidentally had my PR reviews in GHA use the Claude API initially. Claude bills API usage separately even if you have a Claude Code subscription. While the API may have some advantages, the cost didn’t make sense here. As long as you have tokens to spare for code reviews, the claude-code-action in your GHA workflows draws from the same budget you already use for everything else in Claude Code.
Isolation — separate agents give a fresh perspective
The advantage of using different agents to perform your code reviews is similar to why you don’t have the author of a PR do their own code review: separate reviewers catch things the author didn’t. This is the same whether you have a human do the review or AI. Bias and tunnel vision are real.
When a new agent takes a look at a PR, it lacks context on the change, but it can read the code and find areas for improvement that the authoring human or agent missed. Context is useful, but it can also lead the AI into a local minimum where it gets stuck.
Auditability — local reviews hide information
A big part of PR reviews is enabling other engineers to see how a PR evolves as feedback is received and addressed. When you run a review in a PR with AI, it can generate comments on the initial commit, and then continually add new comments each time a commit is pushed.
This way, you have a nice history of the changes made based on each round of feedback the PR review agent generates. Without this, you only see the commit history from your agent(s), not the thought process or why behind each subsequent change. Furthermore, the commits may be squashed at the conclusion of the PR before they are merged into main, further obfuscating useful history.
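As a sketch of what this audit trail enables, here is a hypothetical Python helper that pairs each review round with the commits pushed in response to it. The tuple shapes are assumptions for illustration, not the GitHub API’s actual schema:

```python
from datetime import datetime


def feedback_rounds(commits, review_comments):
    """Pair each AI review comment with the commits pushed in response to it.

    `commits` and `review_comments` are lists of (timestamp, text) tuples —
    a simplified stand-in for what you would pull from the GitHub API.
    """
    rounds = []
    comments = sorted(review_comments)
    for i, (when, feedback) in enumerate(comments):
        # A commit "responds" to a review if it lands after that review
        # and before the next one.
        next_when = comments[i + 1][0] if i + 1 < len(comments) else datetime.max
        responses = [msg for (t, msg) in commits if when < t <= next_when]
        rounds.append({"feedback": feedback, "commits": responses})
    return rounds
```

With the review comments preserved on the PR, this kind of reconstruction is trivial; with only a squashed commit history, it is impossible.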
Speed — a disadvantage for claude-code-action
One of the drawbacks I found with the claude-code-action was the time it took for a PR review to complete and generate a comment. The other approaches (running reviews locally in Claude Code, or via the Claude API) were typically faster, mostly because of the setup costs of the GHA job and the dependencies Claude requires within the workflow. There may be optimizations available here, but in general review speed matters less to me, given that correctness and reliability are paramount when pushing to production. If your review takes a few extra minutes, isn’t that worth it?
Example PR
I set up a GHA workflow that runs when a PR is opened or updated, when an issue comment is created, or when a PR review comment is created. That ensures multiple iterations can run on the same PR.
```yaml
name: Claude PR Review

on:
  pull_request:
    types: [opened, synchronize]
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]

concurrency:
  group: claude-review-${{ github.event.pull_request.number || github.event.issue.number }}
  cancel-in-progress: true

jobs:
  auto-review:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          github_token: ${{ github.token }}
          prompt: |
            Review PR #${{ github.event.pull_request.number }} in this repo. Follow the review instructions in .github/prompts/review-prompt.txt
```
The auto-review job is instructed by the following prompt file (.github/prompts/review-prompt.txt) I created:
```markdown
You are a code reviewer. Review this PR.

Review against these standards:

## Code Quality
- Minimal and pragmatic — no over-engineering, no premature abstractions
- No unnecessary config, or "just in case" code
- Readability over cleverness
- Follow language style guides (gofmt for Go, ESLint/Prettier for TS)

## Security (OWASP Top 10)
- Input validation at system boundaries
- No hardcoded secrets, credentials, or API keys
- No injection vulnerabilities (SQL, XSS, command injection)
- Parameterized queries for database access

## Testing
- All code changes must include unit tests
- Integration tests for external boundaries (APIs, databases, file I/O)

## Output Format
Format your response as a GitHub PR comment in markdown with EXACTLY these section headers:

### Must Fix
Critical issues that must be resolved before merging. Number each item. If none, write "None."

### Should Consider
Important improvements that should be addressed. Number each item. If none, write "None."

### Minor
Non-blocking nits and style suggestions. These will NOT be auto-fixed.

### Looks Good
What the PR does well.

Be concise. Flag real problems, skip trivial nitpicks.
```
From the example below, you can see that after Claude posts its first review comment, the background agent I instructed via Claude Code detects the change in the PR and automatically addresses the comments with a new commit. The cycle repeats up to 5 times (per the instructions I gave the background agent working the PR), and then I am notified as the human reviewer that it is ready for my review.
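Because the prompt pins down the exact section headers, the background agent’s stopping condition can be sketched as a small parser. This is a hypothetical helper of my own naming, and it assumes the reviewer followed the output format exactly — it checks whether the "Must Fix" section still lists any items:

```python
import re


def must_fix_items(review_comment: str) -> list:
    """Extract numbered items from the '### Must Fix' section of a review comment.

    Relies on the exact section headers the review prompt demands.
    """
    # Capture text between '### Must Fix' and the next '###' header (or end of comment).
    match = re.search(r"### Must Fix\n(.*?)(?=\n### |\Z)", review_comment, re.DOTALL)
    if not match:
        return []
    body = match.group(1).strip()
    if body.lower().startswith("none"):
        return []
    # Keep only numbered lines like '1. Fix the SQL injection in ...'
    return [line.strip() for line in body.splitlines() if re.match(r"\s*\d+\.", line)]


def review_is_clean(review_comment: str) -> bool:
    """The background agent can stop iterating once nothing is left to fix."""
    return not must_fix_items(review_comment)
```

The agent loops — push a fix, wait for the next review comment, re-check — until `review_is_clean` returns true or the iteration cap is hit, at which point the human reviewer takes over.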

Conclusion
There are many ways to do PR reviews. The approach I landed on took some experimentation, but it strikes the right balance for me between automation, reliability, cost, and speed.