Where's the AI running? Where are you sending the code? Are you keeping some of it?
I hate to be the compliance guy, but even from a startup perspective you'd at least want to mention what you promise to do here.
I would want answers to all of these questions before touching an integration like this.
The underlying library it depends on is open source, but this app isn't. Presumably it's holding the codebase in state.
No website to speak of, just boilerplate text to satisfy Github's marketplace submission process.
Would be an instant no-go for any organization or individual that values their IP. Open Source - maybe.
> Where are you sending the code? Are you keeping some of it?
It does not really matter for FOSS projects. For those fearing licence laundering: don't worry, it will be done anyway for any public code.
The delay in answering this question makes me more cautious about this.
then who pays for the capacity it runs on?
Thanks for raising this question. Currently, we're offering a free tier to gather community feedback and improve the service. We use enterprise-grade LLMs to ensure high-quality reviews while maintaining reasonable operational costs. Our focus is on building a valuable tool for developers first, and we'll be transparent about any future pricing changes.
Thanks for raising these important questions about data privacy and security. Let me clarify:
1. Code Processing: All code analysis happens in-memory during the PR review process. We don't permanently store any of your source code.
2. Data Retention: We only store the PR comments we generate, not the underlying code. This helps maintain a history of our suggestions while protecting your IP.
3. Privacy Focus: We take data privacy seriously and have successfully worked with both open-source and closed-source projects. We're always open to suggestions on how to further enhance our privacy measures.
If you have specific privacy requirements or suggestions, I'd be happy to discuss them.
Feedback:
1. Description* reeks of AI slop; it extended a surface-level prompt into longer surface-level insights. *: description as in GitHub README
2. #1 creates a situation where I go through reading this long thing, and realize it has no answers to even the first-level questions that would be on anyone's mind (what model? where is it run?). For this to become something I'll take the time to integrate into my core workflow and try, it has to be *much* more transparent.
3. Claims in the description are ~impossible.
3b. Up front, I feel your pain, there's a hard set of constraints to navigate here given A) marketing needs to be concise B) people play fast and loose with conciseness vs. accuracy C) you need to sound as good as the people in B.
3c. That being said, we're crossing into year 3 of post-ChatGPT. People, especially in your target audience, will know when they're reading* that you're reframing "I give text to the LLM which can theoretically do $X" into features, and users expect features to be designed* and intentional. If they are, you should definitely highlight that to differentiate from people who just throw it into the LLM.
3d. Analyzes your entire repository context: impossible, literally, unless you're feeding it to Gemini only. I have about 20KLOC and it's multiples of Llama's context size.
3e. "Understands code relationships and dependencies" see 3c.
3f. "Contextual Analysis: Reviews code changes within the full repository context": see 3d.
3g. "Language Agnostic: Supports all major programming languages.": see 3c (is there actual work done to do this, or is this just "well, given I just send the text to the LLM, everything is supported"?)
4. nit: Should be "Show HN: LlamaPReview, AI Github PR Reviewer That Learns Your Codebase"
Thank you for such detailed and thoughtful feedback! Really appreciate the time you took to analyze our claims and point out the areas needing more clarity.
You're absolutely right about the marketing copy - we should be more precise and transparent about what we actually do vs. what's aspirational.
Regarding "understanding code relationships and dependencies": We're building a knowledge graph of the entire repository that captures code relationships, function calls, and module dependencies. This graph is then used with GraphRAG to fetch relevant context for each PR, allowing the LLM to understand the broader impact of changes.
Important to note: We take privacy very seriously. All code analysis happens in-memory during PR reviews - we don't permanently store any source code or build persistent knowledge bases from customer code. The knowledge graph is generated and used on-the-fly for each review session.
This approach helps us work around context window limitations while providing meaningful insights. However, I should note that this feature is still under active development - we're continuously improving the graph construction and relevancy matching.
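To illustrate the general shape of the idea, here's a toy, module-level sketch (not the production implementation, which also tracks function calls and uses GraphRAG-style retrieval): build an import graph of the repo on the fly with Python's ast module and networkx, then pull the modules closest to a PR's changed files as extra review context.

```python
# Toy sketch: build a module-level import graph of the repo on the fly and
# pull the modules closest to a PR's changed files as extra review context.
# (A real system would need per-language parsing, function-call edges, etc.)
import ast
import os
import networkx as nx

def build_import_graph(repo_root: str) -> nx.DiGraph:
    """Edge A -> B means module A imports module B."""
    graph = nx.DiGraph()
    for dirpath, _, filenames in os.walk(repo_root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            module = os.path.relpath(path, repo_root)[:-3].replace(os.sep, ".")
            graph.add_node(module)
            with open(path, encoding="utf-8") as fh:
                tree = ast.parse(fh.read(), filename=path)
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    for alias in node.names:
                        graph.add_edge(module, alias.name)
                elif isinstance(node, ast.ImportFrom) and node.module:
                    graph.add_edge(module, node.module)
    return graph

def pr_context(graph: nx.DiGraph, changed_modules: list[str], hops: int = 1) -> set[str]:
    """Modules within `hops` edges of the changed modules, in either direction."""
    undirected = graph.to_undirected()
    related: set[str] = set()
    for module in changed_modules:
        if module in undirected:
            related.update(
                nx.single_source_shortest_path_length(undirected, module, cutoff=hops)
            )
    return related - set(changed_modules)
```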
Would love to hear your thoughts on this approach. We're committed to building something genuinely useful for developers rather than just another LLM wrapper.
> Analyzes your entire repository context: impossible
That might be sort of doable, by extracting all function signatures together with brief descriptions, and including that in the context, and maybe a graph showing how they call each other,
but none of the actual implementations. Except for the file(s) under review, which would be included in full.
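For a Python codebase, that outline step is cheap to sketch. Illustrative only (not something the tool necessarily does; ast.unparse needs Python 3.9+, and real support would need per-language parsers):

```python
# Sketch of the "signatures plus one-line docs, no bodies" idea for a Python repo.
# Everything except the files under review gets outlined like this; the changed
# files themselves would still be sent in full.
import ast

def file_outline(path: str) -> str:
    """One line per function/class: its signature and the first docstring line."""
    with open(path, encoding="utf-8") as fh:
        tree = ast.parse(fh.read(), filename=path)
    lines = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = (ast.get_docstring(node) or "").strip()
            summary = doc.splitlines()[0] if doc else ""
            if isinstance(node, ast.ClassDef):
                signature = f"class {node.name}"
            else:
                signature = f"def {node.name}({ast.unparse(node.args)})"  # Python 3.9+
            lines.append(f"{signature}  # {summary}" if summary else signature)
    return "\n".join(lines)

# e.g. print(file_outline("src/retrieval.py"))  # hypothetical path
```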
By "learns" do you mean "just shove the entire codebase into the context window", or does actual training-on-my-data take place?
The "learning" process involves analyzing your codebase's context during PR reviews - we don't train on your data (we even will not save them but only calculate in memory). Instead, we use advanced context retrieval to understand:
- Project structure and architecture
- Coding patterns and conventions
- Dependencies and relationships between components
This allows us to provide more relevant and context-aware reviews while maintaining data privacy (some advanced features are still under development).
Are people really willing to commit code that was only reviewed by an AI? I personally wouldn't trust that for anything that is customer/revenue impacting. Obvious bugs and defects aren't all that hard to catch in normal code reviews but subtle race conditions/deadlocks/memory errors can be very tricky, do you have examples where it shows it can catch those?
This assumes that human reviews also catch these which is DEFINITELY not the case either.
As long as you have good pipelines, linters, a careful suite of tests at different levels (unit, integration, e2e), and can test things in an acceptance-like environment, then human code reviews offer very very little benefit…
You're never going to catch 100% of issues, human or AI review, but I've found that in code reviews a lot of the benefit is when people ask questions about the code being reviewed and have a discussion on it.
Is the AI tool going to ask why something was implemented in a way that might not match the requirement specs? Is it even going to know what the requirements are for the code or is it going to rubber stamp a review because the code looks reasonable?
If you think human code reviews offer very very little benefit then you probably aren't doing them right.
PR reviewing isn't really about finding bugs because the tests should be doing that - but are the tests good enough? Is the approach sound and aligned with the architecture? And does anyone else understand it apart from the author?
Great points about code review reliability. LlamaPReview is designed to be a complementary tool for senior developers, not a replacement for human review. Here's our approach:
1. It helps save senior developers' time by handling routine checks and providing initial insights
2. It analyzes the entire codebase context to provide more meaningful reviews
3. It's particularly useful for identifying patterns and relationships across the codebase
The goal is to make human reviewers more efficient, allowing them to focus on complex architectural decisions and critical business logic. We've seen positive results from both open-source and commercial projects using this approach.
From your Privacy Policy, you're straight up collecting users' code. Do you send it to someone else as well?
Might make sense for open source. Closed source is a no-go for this.
Thanks for raising this important question. We will not store any code in our database, but we do leverage SaaS LLM APIs (e.g. GPT/Claude/Mistral) to help with the PR review - during this step we do need to send code to these SaaS LLMs for analysis. This is the main reason we mention "collecting users' code" in our privacy policy.
This reminds me of the PR Agent open source tool: https://github.com/Codium-ai/pr-agent
I've found the code walkthroughs very useful
Thanks for mentioning PR Agent. While there are several tools in this space, LlamaPReview focuses on deep codebase understanding and context-aware reviews (advanced functions are still evolving). We'd love to hear about your experiences and what specific features you find most valuable in code review tools.
A name like llama-pr-review might help with searching for this thing. Preview being an actual word and all.
Description says:
> Unlimited AI-powered PR reviews
FAQ says:
> A: Yes, we currently offer a free tier with usage limits. You can install and use LlamaPReview without binding any payment method.
Only "free tier" is available.
I have a simple script I run before merging into the main branch that just tells Claude to look for obvious bugs, and to err on the side of saying it looks fine. Has stopped me from merging two or three bugs, 95% of the time it says things look fine so hasn't wasted my time.
Is that script shared somewhere?
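The script itself wasn't shared in the thread, but a minimal sketch of that kind of pre-merge check might look like the following (assumes the anthropic Python SDK with an API key in the environment; the model name and prompt are placeholders, not the commenter's actual setup):

```python
# Not the commenter's script -- a minimal sketch of the same idea:
# feed the diff against main to Claude and ask only for obvious bugs,
# erring on the side of "looks fine".
import subprocess
import anthropic

# Diff of the current branch against main.
diff = subprocess.run(
    ["git", "diff", "main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder; use whatever model you prefer
    max_tokens=1024,
    system=(
        "You review diffs before merge. Only flag obvious bugs "
        "(logic errors, typos in identifiers, missed edge cases). "
        "If nothing stands out, just reply: looks fine."
    ),
    messages=[{"role": "user", "content": f"Review this diff:\n\n{diff}"}],
)

print(response.content[0].text)
```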
I’m wondering if code review is the right place to give advice; it seems like the process is meant for human reviews (where there is latency) and pair programming might be a better metaphor for what AI should be doing? Earlier feedback is often better.
We sort of have that with errors and warnings, where an IDE’s UI collects them into a todo list. The trouble is, the list isn’t necessarily prioritized very well.
On the other hand, asking for a review whenever you like is easy to control, versus being interrupted.
With all the AI tools floating around, it seems like user testimonials are going to be important for learning what’s worth trying out.
Interesting perspective on the timing of feedback. We chose PR reviews because they're a natural integration point where developers already expect feedback, and it's when context is most complete. However, we're exploring ways to provide earlier feedback without being intrusive.
The key is finding the right balance between immediate assistance and allowing developers to maintain their flow. Would love to hear more about your experiences with different feedback timing approaches.
I have a conundrum about this. If an LLM can learn our codebase and generate reasonable reviews, does this imply it could perform the work independently without us? Perhaps generating code and conducting code reviews are distinct tasks. Another related question is: for complex tasks that generative AI can't solve, could this service still provide somewhat meaningful reviews? Maybe it could be partially useful for certain subtasks like catching off-by-one errors.
Good questions! Code review and generation are quite different tasks. Review is about pattern recognition and consistency checking, while generation requires understanding business logic and system design.
LlamaPReview works best at:
- Spotting potential issues (like off-by-one errors)
- Identifying patterns across the codebase
- Maintaining coding standards
For complex architectural decisions, it serves as an assistant rather than a replacement - freeing senior developers to focus their attention where it matters most.
Hello. A few questions:
- Where is the source code? This is critical for it to be inspected before adding to any repos.
- What models are you using?
- Where are the models running?
- When you say it learns from your codebase is it building a RAG or similar database or are you fine tuning from other people's code?
Thanks for these important questions!
The service runs on secure cloud infrastructure and processes code in-memory during PR reviews - we don't permanently store any source code. We use enterprise-grade LLMs (can't disclose specific models due to licensing) and implement context-aware analysis without fine-tuning on customer code.
When we say "learning", we mean analyzing the codebase context during PR reviews to understand patterns and relationships, not training or building persistent knowledge bases. This ensures both privacy and effectiveness.
We're working on open-sourcing parts of the implementation - will share more soon!
Any examples on actual PRs in public repos?
Maybe you could refer to my open source project llama-github
Where can I see the code for this?
The core code of LlamaPReview comes from my open source project llama-github, but some of the other code has not been open-sourced yet.
Make it local and slow.
oh right - some one-way relationship with a corporate-or-worse software process that makes a record of all progress, with timestamps and topics.. what could go wrong?
So a git log?
Yes, a changelog where the receiver says “fuck your license, everything’s mine.”
Well, no, a PR contains code, not commit messages like git log. :)
If you meant generically 'like when we store code in git', I believe there are some meaningful distinctions between voluntary version control with a host you contracted or built, and continuously sending code to parts unknown.