Merged
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/01_bug_report.yml
Original file line number Diff line number Diff line change
@@ -121,6 +121,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."

Contributor

Suggested change
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
- label: "I assert that I have read the [AI/Automation policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."

I think AI contributions will soon be more common than every other kind of automation combined, so let's reorder the words here and everywhere else

@SigmaSquadron SigmaSquadron Apr 30, 2026

Contributor

Not only is classic automation just way more prevalent in general (despite what the AI hype will tell you), this is very much a nitpick, and I feel our usual review guidelines still apply to such policy PRs. Let's not waste time worrying about trivial things like word ordering when the bigger fish to fry here is the actual meaning and intent of the policy.

Member Author

I personally think this is fine as‐is – I would say it’s primarily an “automation policy” and the reason for the slashed term is discoverability – and I don’t think we can satisfy both this and #514587 (comment) :)

required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/02_bug_report_darwin.yml
@@ -135,6 +135,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/03_bug_report_nixos.yml
@@ -125,6 +125,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/04_build_failure.yml
@@ -131,6 +131,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/05_update_request.yml
@@ -104,6 +104,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/06_module_request.yml
@@ -79,6 +79,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/07_backport_request.yml
@@ -85,6 +85,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/08_documentation_request.yml
@@ -67,6 +67,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/ISSUE_TEMPLATE/09_unreproducible_package.yml
@@ -137,6 +137,8 @@ body:
required: true
- label: "I assert that I have read the [NixOS Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) and agree to abide by it."
required: true
- label: "I assert that I have read the [automation/AI policy](https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy) and that this issue report complies with it."
required: true
- type: "markdown"
attributes:
value: |
2 changes: 2 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -27,12 +27,14 @@ For new packages please briefly describe the package or provide a link to its ho
- [ ] Module addition: when adding a new NixOS module.
- [ ] Module update: when the change is significant.
- [ ] Fits [CONTRIBUTING.md], [pkgs/README.md], [maintainers/README.md] and other READMEs.
- [ ] Follows the [automation/AI policy].

[NixOS tests]: https://nixos.org/manual/nixos/unstable/index.html#sec-nixos-tests
[Package tests]: https://github.com/NixOS/nixpkgs/blob/master/pkgs/README.md#package-tests
[nixpkgs-review usage]: https://github.com/Mic92/nixpkgs-review#usage

[CONTRIBUTING.md]: https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md
[automation/AI policy]: https://github.com/NixOS/nixpkgs/blob/master/CONTRIBUTING.md#automationai-policy
[lib/tests]: https://github.com/NixOS/nixpkgs/blob/master/lib/tests
[maintainers/README.md]: https://github.com/NixOS/nixpkgs/blob/master/maintainers/README.md
[nixos/tests]: https://github.com/NixOS/nixpkgs/blob/master/nixos/tests
76 changes: 75 additions & 1 deletion CONTRIBUTING.md

@acid-bong acid-bong Apr 29, 2026

Contributor

(here, because it's a reply to Emily's header message, not this file's contents)

we would not want to ban the use of accessibility tools that often use LLMs like machine translation, speech to text, text to speech, and OCR

accessibility tools don't generate code, which is what this policy should regulate, they only interpret existing media, such as reading the given text aloud or typing from the person's voice. a contribution from a visually impaired person, who reads with LLM-powered tools, but codes everything by hand, isn't LLM-generated. it is certainly possible to restrict LLM usage and not restrict accessibility tools

original replies:

Contributor

A coding agent can itself also be an accessibility tool, if someone cannot type for any reason (e.g. RSI).

Contributor

RSI meaning repetitive strain injury?

anyway, once again, see my points in linked messages: there's a difference between voice-typing word by word (and knowing what you do) and "hey, claude, make this thing for me"

Member

I'd like to highlight #516544 as a recent AI-related controversy that likely needs a policy but is not covered by this PR

Member

The emoji reactions might not be a representative sample, but it appears that people overwhelmingly support the name change.

@@ -206,7 +206,7 @@ For example, if you make a change to `texlive`, you probably would only check th

#### Meets Nixpkgs contribution standards

The last checkbox is about whether it fits the guidelines in this `CONTRIBUTING.md` file.
The last two checkboxes are about whether it fits the guidelines in this `CONTRIBUTING.md` file.
This document details our standards for commit messages, reviews, licensing of contributions, etc...
Everyone should read and understand these standards before submitting a pull request.

@@ -888,3 +888,77 @@ As mentioned previously, it is unfortunately perfectly normal for a PR to sit ar

Please don't blow up situations where progress is happening but is merely not going fast enough for your tastes.
Honking in a traffic jam will not make you go any faster.

# Automation/AI policy

Member

Hey all,

I fear that this policy is still too permissive. I see the effort put into guardrails to guide LLM users to use their tools responsibly, but am scared that this will still encourage a flood of low-effort contributions.

As you say, we are fundamentally bottlenecked on manual review. I am not enthusiastic about reviewing code that could be AI-generated. I do not want to spend my time explaining why something fails, and how it can be fixed, to someone who will just pass it along to their chatbot. I do not think that allowing LLM-generated content, even clearly labeled, will ease the burden on manual reviewers.

I believe automated, deterministic, auditable tools fundamentally differ from LLMs. While I agree that holding them to at least the same standards is a good starting point, I believe they should be held to a much stricter one - if allowed at all.

That said, I appreciate that a stance is being taken. Thank you for your transparency and request for feedback.

Member

I, and many other maintainers, have had multiple workflow-disrupting interactions with people using AI as a substitute for thinking. Many contributors will not review, advise, or otherwise knowingly interact with any contribution that has been generated by an LLM. That will not change, regardless of what policy is implemented.

I understand that this is not the case for everyone here, and have had many discussions with people on both sides of the fence. One of the arguments I hear the most is that banning AI altogether is unenforceable, or that a full ban will alienate new contributors. These are not reasons to implement a more permissive policy. We want new contributors to write their own code, be corrected, and learn from the experience. We want our energy to benefit the community.

Enforcing an AI ban will not be impossible. As mentioned here, banning AI is doable. In most situations it is immediately obvious when someone has used an LLM. If they are dishonest about their use, this can easily be escalated and verified. Multiple other large, community-governed projects have opted for a stricter policy, if not an outright ban (see Wikipedia, Gentoo).

Something similar to the proposed rustlang AI policy is much more in line with our community values. Our policy should aim to lighten the load of reviewers, and encourage human learning.

It is imperative that we implement a policy, and these discussions are prone to stall. I urge the core team to close the discussion after the two week mark.

Member Author

We absolutely agree that contributions and reviews being passed unthinkingly back and forth to LLMs is not acceptable, and this policy is very much intended to forbid that; I’ve said more about this and other matters you’ve raised at #514587 (comment), so I’ll point there to avoid duplicating responses :)

Contributor

I know I'll get push-back on this, but I really think it's worth including at least a minimal AGENTS.md in the repository that just highlights these standards, if only because then we can enlist the agents themselves to help inform users about the expectations and responsibilities they have to adhere to as contributors.

@typedrat typedrat Apr 29, 2026

Contributor

They will read CONTRIBUTING.md at times, and I have a global instruction on my system for my agent to do so, but AGENTS.md is directly injected into the context window as part of the system prompt (or at least automatically injected into the context window at the start of all conversations) so they don't even have to read it.

Contributor

I'll note that figuring out this sort of thing was the point of one of the GSoC proposals that didn't make the cut. It might be worth revisiting, or even suggesting a stub AGENTS.md that has the shape you're considering--a strawman would go a long way. :)

Contributor

might adding a file like this not defeat the spirit of the stated brown M&M test though?

Contributor

Overall feeling positive about the proposal, will start with some stylistic remarks

Suggested change
# Automation/AI policy
# Automation and "AI" policy
  1. Forward-slashes like these generally look like lazy writing
  2. "AI" is a marketing term, I suggest we keep it quoted, and later elaborate, once, that what we mean is something along the lines of "data-driven programs"

Member Author

While I don’t love the slashes, I think “and” here is misleading, as the policy considers automation in general quite uniformly and treats the latter as a subset of the former. I think adding the quotes would be more likely to confuse people than help; in the actual body text, “LLM‐based AI tools” is used consistently, which I think should be sufficiently precise, but the title should aim for obviousness and discoverability, which means sticking to common terminology. (Otherwise, it could just be the “automation policy”.)

A non‐standard term like “data-driven programs” seems like it would invite more bikeshedding over the boundaries. I’m tempted to say this and #514587 (comment) cancel each other out 😅

Contributor

I think “and” here is misleading

"And" is not always disjoint?

“LLM‐based AI tools” is used consistently

While at it, I'd suggest: "AI" and (L)LM-based tools. This way we highlight that the policy isn't about any particular architecture or implementation, but about certain properties (complexity, poor inspectability, poor predictability, &c) of these tools.

A non‐standard term like “data-driven programs”

It is somewhat common[1][2][3], in any case less non-standard than "AI" the way it's used today. "AI" is good in a title.

Footnotes

  1. https://en.wikipedia.org/wiki/Data-driven_model

  2. https://www.youtube.com/watch?v=a13aqr07tJ4

  3. https://archive.org/details/Redwood_Center_2014_09_24_Alyosha_Efros

@ethancedwards8 ethancedwards8 Apr 29, 2026

Member

Hi folks,

TL;DR: We probably shouldn't allow LLM usage for the vast majority of contributions

I think something that folks in this issue are missing is the social component of open source software development. I'm assuming the vast majority of contributors to Nixpkgs are hobbyists or volunteers who are looking to give back to the community/open source or learn more about Nix and its workings. While there are probably some folks who are paid to work on Nixpkgs in some capacity, I imagine it is a very, very, very small fraction of the total contributor base. Essentially, NixOS/Nixpkgs is not a production codebase and there are no economic incentives for most people to work on it efficiently.

By the way, this comment applies to code generation tools and not accessibility tools, which I fully support for obvious reasons.

But there's a social contract that we all accept when making contributions to a project or reviewing contributions: code in exchange for knowledge. If I make the (honest) effort to write code and contribute it to a project, I deserve to be corrected and taught if I make a mistake (through reviews, CI, documentation, etc.). Likewise, reviewers deserve an honest attempt to address the feedback and improve the contributions (present and future). By investing time in reviews of my code, other people are investing in me and my future development and contribution to Nixpkgs. When I invest my (rare) time into someone's review, I am investing in them being future contributors and in their learning. This is why open source works.

LLMs short-circuit this social contract. LLMs do not learn. When someone submits a vibe-coded contribution to initialize a package, they are probably just going to feed your review straight back into claude code/codex/cursor/whatever and push the PR without any second thought (especially for one-off/drive-by contributions). The human will have learned and accomplished nothing except burning tokens. I personally will not tolerate this and will refuse to engage with contributors who are known to do this.

At the end of the day, contributing to Nixpkgs is not hard. It does not take a genius to figure out how to copy a template and change some lines. But there is real learning that occurs for a human who is reviewing and responding to feedback (be it created by a human or by deterministic CI). If you are unwilling to learn to do this yourself, do we as a community really want you as a contributor, or your code? I don't think so. You'll just make more work later on for me and other people who do understand and care.

Advocates of allowing LLM-generated contributions may cite increased productivity and spending less time on contributions as a reason to allow these tools. I disagree. The vast majority of contributors are not being paid to work on Nixpkgs. There's not an economic (only time) incentive for us to work faster. We don't report to any shareholders or managers. If someone wants to spend less time on Nixpkgs, then they should do less, not do more lower-quality work. We have enough work to sort through as it is.

That being said, I do think LLMs can be used properly by experienced contributors making certain kinds of changes. As mentioned in another thread, tedious tree-wide changes can be easily made with LLMs in a way that deterministic tools struggle with. I think this is a perfectly practical use case (although deterministic solutions are preferable for environmental, ethical, and legal reasons). I think properly disclosing usage is important and appropriate here, too.

I am not staunchly anti-LLM. I will be working for an AI company this summer. I use AI coding tools for my job (I do not use them for open source, school, or pet projects). For better or for worse, "AI" coding is probably here to stay (whether through local models, frontier labs, or otherwise). Programming and open source more broadly will change. It is what it is. But LLMs are not the solution to every technical problem. There is real value in learning, building experience, and the social aspect of open source. Let's not lose that for some perceived productivity gains and to gain lazy code that will ultimately become someone else's responsibility.

Of course, there is also a litany of environmental, copyright, ethical, legal, etc. concerns. Let's not forget about those. Do we want to be a part of this? Maybe we do. Maybe we don't.

Just food for thought.

I'll add my own thoughts here, as a long-time NixOS user who is likely to jump ship if any GenAI contributions are let in.

This policy is, to me, way too lax. I feel the bare minimum is something like what Rust is doing in rust-lang/rust-forge#1040.

Personally, I am willing to compromise with using it to look things up, or to use it for reviews. But that is the furthest I would be ok with letting it in.

GenAI technology is fundamentally unethical, and I feel it is important to make a stance against it.

@alyssais alyssais Apr 30, 2026

Member

The status quo is that LLM-generated contributions are already being let into Nixpkgs, silently, with no attribution, because there are (with the exception of the code of conduct) zero rules governing their use. As we said in the PR body, "We would like this policy to be seen as establishing consistent standards based on pre‐existing norms around automation and ruling out the most problematic cases, not as an endorsement by the project, nor necessarily as the final word on the matter." It would be very difficult, and take a much longer time, for a near-total ban on LLMs to gain widespread community acceptance, which would be necessary to have it enforced effectively. (And there are also other unintended consequences to consider, like people maybe being more likely to use LLMs unattributed with such a policy.)

That's why we want to start with a more scoped policy that we hope will be broadly acceptable. That doesn't preclude a stricter policy later, if consensus and legitimacy in the community can be achieved for it; it just means that we don't get stuck with no policy at all (de facto permitting unrestricted LLM use) unless and until we get to that point.

@SomeoneSerge SomeoneSerge Apr 30, 2026

Contributor

It would be very difficult, and take a much longer time, for a near total ban on LLMs to gain widespread community acceptance

Which is why a moratorium is a reasonable start

EDIT: Meaning, a moratorium until we've figured out how to define "safe ways of using 'AI'", if any, such as a definition of some sort of "slow mode" operation: #514587 (comment)

@lumi-me-not lumi-me-not Apr 30, 2026

The status quo is that LLM-generated contributions are already being let in to Nixpkgs, silently, with no attribution, because there are (with the exception of the code of conduct) zero rules governing their use.

There not being any rules should mean erring on the side of caution and banning them, in my opinion. Though, I do get what you're saying.

That's why we want to start with a more scoped policy that we hope will be broadly acceptable.

I fear that this policy will provide legitimacy to anyone trying to get GenAI outputs into nixpkgs, and that after it is accepted, there will be much less of a rush to get a proper policy in place.

(And there are also other unintended consequences to consider, like people maybe being more likely to use LLMs unattributed with such a policy.)

Yes, they could try to sneak GenAI contributions in, but I don't think this should change the stances of the project. This technology is unethical and I feel it is important to make a stand against it.

Enforcement can be done in obvious cases. Having a stance against it will at least make people know that GenAI is unacceptable, and that is the most important thing, to me.

I feel https://cyrneko.eu/ai-policy.html has good ideas regarding enforcement:

These policy discussions always sound a bit like those in tournaments or speedrunning communities around rules.

There, a lot of the same problems crop up in regards to cheating, just that here we're not talking about "cheating" in the traditional sense of course.

They usually handle it pretty simply:

  • There is a team that checks anything submitted against the rules
  • If someone is found to violate the rules, it gets escalated and becomes a discussion with rulemakers and such
  • If the person is found to have cheated, they are banned from competing.

This applies pretty nicely to software projects as well, of course substituting some of the terminology for fitting ones for software development.

If someone lies about the use of AI, or license compatibility despite set-in-stone clear policy around such things, they should be banned from contributing. Either permanently or temporarily.

Just like with cheaters in games, sometimes someone will make it through successfully. But, more often than not, these people eventually also get exposed, and then swiftly banned from competing.

Contributor

While I’m sympathetic towards this position (not accepting LLM-based contributions), I don’t think starting with a ban is politically viable in Nixpkgs. I would prefer to have some policy, which can be revised, than continue having no policy due to the initial attempt being too controversial to move forward.

Member Author

I really appreciate you taking the time to give such a thoughtful comment, @ethancedwards8; thank you. I’ll try to do justice to it and the other matters raised in this thread in response.

I want to say, first off, that it is absolutely the intention of this policy to fully prohibit vibe coding, in the sense of getting an LLM‐based AI tool to do all the work without review and submitting it without understanding or learning anything, and that we also completely agree that acting as a proxy passing reviews unthinkingly back and forth to LLMs is not acceptable. Those get to the heart of “extractive contributions”, and immediately mitigating those by ruling out cases that we believe are obviously unsuitable for FOSS contributions is our primary objective with this policy.

We agree that the social contract of genuine interaction and learning is important, and a significant part of what we want to address here. While disclosure is the foundation of the policy, I think the rules around accountability are the critical ones. I would say the core part of the policy is this:

Everyone who submits a contribution to Nixpkgs is responsible for it, regardless of the use of automated tooling. Before submission, they are expected to establish a reasonable level of understanding of the contribution and belief in its correctness. Contributors are expected to be able to answer questions about their contribution and respond to feedback appropriately.

Someone who has vibe‐coded a package without learning anything about Nixpkgs won’t have reached a reasonable level of understanding, won’t be able to perform the required review of the LLM output, and won’t be able to appropriately answer questions about it or respond to feedback; it’s true that these are somewhat subjective requirements to judge, but we thought they were important enough to include anyway.

The case of being a proxy ferrying reviews back and forth to the same tool that produced a contribution in the first place is why we’ve included PR reviews and comments as covered contributions with the same expectation of disclosure and responsibility; the requirements to review and understand what you’re submitting and to be able to handle questions and feedback about it appropriately apply equally to further communication on a PR as they do to the original submission, and are intended to require people to genuinely engage and put work into their communication, rather than forwarding things on to an LLM that won’t learn for the next PR. (Overall, I would say that code should be treated as a form of communication in itself, hence the policy treating it uniformly with comments and reviews.)

That said, we ought to at least take a look at clarifying the wording here to make the requirements clearer, and this kind of dynamic is very much an area where we expect there may be the need for further revision and refinement – we don’t think that the initial rules here necessarily guarantee that appropriate oversight and engagement is taking place, and it’s certainly not the case that the policy is intended as an endorsement of every contribution that meets its minimum requirements.


On ethics: We by no means want to dismiss the social or ethical concerns people have here. There are some things that we can easily near‐unanimously agree are ethically unacceptable (e.g. harassment, submitting malicious code). The hard cases, of course, are where there’s significant division. For an example, there are many strong and well‐reasoned arguments that non‐Free software is unethical, but they’re far from universally accepted. We have community members who prefer to absolutely minimize their exposure to non‐Free software, community members who are completely fine with it, and community members who don’t have a strong opinion one way or the other or even feel conflicted about it (as a lib/licenses.nix maintainer who is not infrequently being a FOSS pedant on PRs, and a member of the Darwin team who daily drives Nix on macOS, I am familiar with this dissonance).

We have an essentially pragmatic policy that splits the difference as a result: community values encode our preference for FOSS, but support for users’ need for non‐Free software explicitly; we allow arbitrary non‐Free packages, but we ensure that they are appropriately labelled, and only build packages with FOSS licences on Hydra.

That seems to work quite well in practice for our diverse, pluralistic community, but it’s not a fully settled issue (e.g. recurring discussions about building non‐Free software on Hydra), and some will consider it an unacceptable ethical compromise and be unwilling to contribute to or use Nixpkgs as a result.

I don’t mean to trivialize this or say that a stronger stance would inherently be unwarranted; GNU considers it a serious enough ethical issue that projects like Guix don’t package or recommend non‐Free software. But I think it would be fair to say that there’s no single “Nixpkgs stance on non‐Free software” as a result; our policies and norms in that regard, as with many others, are there to ensure that people with different views on the topic can work together while respecting each other’s positions and boundaries. (Which, I realize, may not be possible or desirable for every matter.)

There are, of course, many differences between non‐Free software and generative AI; I only mean to analogize them here insofar as they are both things relevant to Nixpkgs that many people consider deeply ethically fraught. One of the biggest differences is that non‐Free software has been around for a very long time; it’s more conceivable that the community’s views on generative AI will converge over time than for an issue that has existed since the 1980s.


We know that no matter what policy we start with, it wouldn’t be many people’s ideal, but we think that the vast majority of people will prefer some policy over none, and will agree that the things the proposed policy rules out are harmful to Nixpkgs. We want to approach this iteratively by starting with a baseline that rules out the cases that are most clearly unacceptable and ensures people are informed about the provenance of a contribution before engaging with it, but we don’t intend to set an immutable policy and then wash our hands of any problems that remain once it’s in place.

A lack of consensus doesn’t necessarily mean that action shouldn’t be taken; as we said in the original PR message, we are open to making judgement calls for the good of Nixpkgs as necessary, but we hope that setting basic principles for how people are expected to engage will help establish a foundation for further consensus and community discussion here, and that the experience with the resulting dynamics and enforcement will help us when considering the potential for further changes to address subtler or more controversial cases.

@fgaz fgaz May 2, 2026

Member

> We have an essentially pragmatic policy that splits the difference as a result: community values encode our preference for FOSS, but support for users’ need for non‐Free software explicitly; we allow arbitrary non‐Free packages, but we ensure that they are appropriately labelled, and only build packages with FOSS licences on Hydra.

This is a false equivalence. We package non-free software, but nixpkgs itself is free. Likewise, we package some LLM-generated software, and here we are deciding whether to allow LLM output in nixpkgs. I think the equivalence should be between non-free code in nixpkgs and LLM-generated code in nixpkgs.

> But I think it would be fair to say that there’s no single “Nixpkgs stance on non‐Free software”

Sorry if this comes off as pedantic, but regarding non-free code in nixpkgs, the single stance exists in the license.

If we are to copy our free-software policy, and agree that LLMs and their output are at least as problematic as non-free software, I think we should ban LLM output altogether, while keeping packages for LLM-generated projects to support the users' needs [1].

Footnotes

  1. And ideally label them as LLM-generated, though I understand doing this accurately may be unfeasible at the moment.

Member

> If we are to copy our free-software policy, and agree that LLMs and their output are at least as problematic as non-free software

«if we agree» evaluates to false over Nixpkgs, regardless of what goes afterwards.

@emilazy emilazy May 2, 2026

Member Author

> This is a false equivalence. We package non-free software, but nixpkgs itself is free. Likewise, we package some LLM-generated software, and here we are deciding whether to allow LLM output in nixpkgs. I think the equivalence should be between non-free code in nixpkgs and LLM-generated code in nixpkgs.

Apologies; I didn’t intend to paint an equivalence that direct. As I said, the parallel I was attempting to draw is that “they are both things relevant to Nixpkgs that many people consider deeply ethically fraught”.

The intended analogy is between “using LLM‐based automation as part of Nixpkgs development to any degree” and “packaging/using non‐Free software to any degree”, not “allowing Nixpkgs itself to be non‐Free”. As I said, there are of course differences between the two cases even then – and those differences can factor into how they should be handled – but I think it is fair to say that they are both issues that many people reasonably consider to have a strong ethical dimension, and that in the case of packaging non‐Free software our long‐standing policy is essentially a pragmatic compromise between differing views (and certainly not a zero‐cost one – my understanding is that it was a factor in the creation of Guix, and there are also people who are unhappy with Hydra not building non‐Free software). I should probably have tried to find a less confusing metaphor, though.

(There is, of course, very strong consensus that Nixpkgs itself should remain under a FOSS licence (though there has been some preference expressed in the past for a potential relicensing under a licence with stronger requirements, to the point of coming up during SC elections, and I believe a small fraction of proposals there have gone beyond what’s widely accepted as FOSS, so it’s not completely unanimous). I don’t think it’s the case, however, that that consensus is because of community agreement on the ethical position of non‐Free software; it’s because people who have an ethical commitment to FOSS and people who have a utilitarian desire for a community‐developed FOSS Nixpkgs have common cause.)

@7c6f434c 7c6f434c May 1, 2026

Member

Would it be a good idea to insert something like an «Optionality» section? Something in the direction that any automation that accepts unpredictability not coming from the task statement itself (like, if we package something compiled, we have to live with the quirks of the compiler's optimiser…) should not, under the current version of the policy, be an expected part of the workflow?

Given that the policy will quite probably need to be updated either based on ethical negotiations or on legal clarifications, it would be prudent to make sure it is always about what contributions get accepted, not about suddenly losing load-bearing workflows touching all contributions.

(And also, S3 and GitHub are enough dependencies on evil oligopolies for now; maybe we could skip adding another one.)

Member Author

Can you expand on what you mean here, in terms of what rule it would be setting for contributions to follow? While there are already some subjective parts of the policy, I think it’s best for us to err on the side of trying to set rules only about things there’s some degree of “fact‐of‐the‐matter” about whether a contribution (or at least a contributor’s activity as a whole) meets; “expected part of the workflow” feels like it’s somewhat far afield of this, in a similar way to #514587 (comment).

Member

Something in the direction of:

Contributions that add automation to widely-used workflows, e.g. by adding CI checks, should only add automation that can be reasonably expected to be more or less deterministic and predictable, and to be developed towards improvement in determinism and predictability.

Contributor

Is the hedging with "more or less" even needed? CI should be completely reproducible from one run to another. It sounds like a good time to make it explicit. While I'm fine with genAI use overall, this is one of the places I'd be happy to completely ban the idea.

Member

The problem is, I can probably get a local LLM to be technically «reproducible» with t=0 and a fixed seed (but this is still not enough predictability to go into CI), whereas ofBorg running the tests carries the risk of tests being timing-sensitive in annoying ways, so it is not «completely reproducible». And, as I said above, once there is patching and also optimising compilers, «completely predictable» is also slightly in question, although not by much.
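
To make the reproducible-versus-predictable distinction above concrete, here is a toy sketch of temperature sampling. It is an illustration only: `sample_token` and `sample_run` are hypothetical helpers written for this example, the scores are made up, and nothing here reflects the API of any real LLM runtime.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores (toy sampler, not a real LLM API)."""
    if temperature == 0:
        # Greedy decoding: always take the highest score (first one on ties).
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature, shifted by the maximum for numerical stability.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_run(logits, temperature, seed, length=8):
    """Draw `length` tokens using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(length)]
```

With `temperature=0`, or with any temperature and the same seed, repeated calls to `sample_run` return the same sequence, so the run is reproducible. Nudging one made-up score from `1.0` to `1.0002` in `[1.0, 1.0001, 0.5]` is still enough to flip every greedy choice, which is one sense in which a reproducible run can remain hard to predict.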

Member Author

Thanks, that’s clearer. It seems like this is probably something we don’t need to write down a strict guideline about, as it’s already an area with active maintenance and scrutiny and it seems like there’s not sufficient disagreement about it for local consensus to be insufficient?

Member

From my observations, I would rate the addition of linting into CI as chaotic and subject to interactions between a growing number of maintainer teams of specific linters. An LLM-based linter will probably be blocked eventually anyway, but with a risk of unnecessarily heating up the discussion. Closing in accordance with the existing policy would have some benefit in cutting out the flaming.

A secondary use — to strengthen the signal that LLM output is untrustable on its own — might be considered a bug or a feature.

Member

A question on retroactive applicability of sorts: should we set an expectation that a contributor who remains active answers questions (roughly; not everyone remembers sufficient details, and that's OK) about automation use in pre-policy contributions?

This might become relevant in preparing for an eventuality where, e.g., France or Italy (the choice of countries is not random) gets a regulation with a wider interpretation of derived-work-via-LLM-training-and-use.

Member Author

This seems broadly in line with expecting contributors who are still around to answer questions about the licensing/copyright status of code they’ve previously submitted in general, right? I think that it’s something we’d reasonably expect people to do and that it would not be very polite to arbitrarily refuse to do so if a concern comes up, but since the failure case of “they’re just not around in the project any more” is always present I’m not sure if it needs explicit policy wording beyond our general social norms (“Focusing on what is best for the community” per the CoC, and so on).

Member

Strictly speaking we do not know exactly (yet) the extent — per-jurisdiction — of questions about licensing status, but maybe we should already collect the data where feasible.

But as long as your choice is deliberate here, fair enough, thanks for consideration!


Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub and Matrix, must have a **responsible person in the loop** who is accountable for that contribution and reviews it before submission, and must **transparently disclose** any non‐trivial use of automation to produce it, including but not limited to LLM‐based AI tools.

Member

Suggested change
- Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub and Matrix, must have a **responsible person in the loop** who is accountable for that contribution and reviews it before submission, and must **transparently disclose** any non‐trivial use of automation to produce it, including but not limited to LLM‐based AI tools.
+ Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub, Discourse, and Matrix, must have a **responsible person in the loop** who is accountable for that contribution and reviews it before submission, and must **transparently disclose** any non‐trivial use of automation to produce it, including but not limited to LLM‐based AI tools.

@niklaskorz niklaskorz Apr 29, 2026

Member

Do we consider Discourse a nixpkgs development venue? (honest question; I see it more as a support venue most of the time)

Member

Category: Development
Development discussion for Nix, NixOS, Nixpkgs, NixOps, RFCs, etc.

Contributor

Not in entirety, but the specific threads/topics/spaces. Same with Matrix, only development chats (#dev, #ci, #hydra and so on). I'd suggest specifying that moment:

Suggested change
- Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub and Matrix, must have a **responsible person in the loop** who is accountable for that contribution and reviews it before submission, and must **transparently disclose** any non‐trivial use of automation to produce it, including but not limited to LLM‐based AI tools.
+ Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub and in development-related spaces on Matrix and Discourse, must have a **responsible person in the loop** who is accountable for that contribution and reviews it before submission, and must **transparently disclose** any non‐trivial use of automation to produce it, including but not limited to LLM‐based AI tools.

Member Author

The reason Discourse wasn’t included here originally is that it covers a range of topics outside our jurisdiction, like Nix itself. On GitHub and Matrix, the categorization is distinct enough that it’s fairly unambiguous which venues are related to Nixpkgs and therefore in scope; on Discourse, a lot of topics are just in the “Development” category with no further categorization.

I think it would be a good idea in itself to include Nixpkgs‐related topics on Discourse, but the discontinuity between standards in one topic to the next depending on its subject (and the ambiguity when that subject shifts) may be surprising, compared to GitHub and Matrix where the subject of a venue is more static. Happy to hear people’s thoughts about this.

@acid-bong I think your suggestion is redundant: “Every contribution to Nixpkgs and related development venues, including … communication on GitHub and Matrix”? Note that even for GitHub, related development venues does not include e.g. the Nix and Hydra repositories, because they’re outside of the Nixpkgs core team’s authority.

(Of course, we’d be happy to see this work grow into a more widely‐applicable policy beyond Nixpkgs in future!)

Contributor

I would strongly consider an exception to disallow LLM-generated pull request comments that are in response to reviewer feedback. It seems rude, and is a great way to get a reviewer to never look at your PR again. This should not apply to the translation or speech-to-text accessibility use cases.

@reivilibre reivilibre Apr 30, 2026

Contributor

I've seen (but can't remember where) policies that say that all text (PR bodies, comments etc) must be human. I like that approach personally, both because I don't enjoy talking to machines in the 'human conversation' part of the repository and because it gives a fairly clear-cut reason to dismiss no-human-in-the-loop contributions, where the author doesn't show any engagement (and often becomes a proxy to their coding agent).

Additionally, if someone puts their contribution in their own words, they might actually review what they are submitting to understand it and stand a chance of spotting mistakes/suspicious things/... before a reviewer has to do that for them.

I somewhat frequently see well-meaning people [on other repos] using these tools and (innocently!) submitting something that they glanced at and thought looked plausible, only for it to dawn on them later that it was the LLM being silly.

Honestly I would also advocate for code comments being human-written only; I've also experienced submissions where the code comments were LLM-generated, had the wrong rationale/no understanding of the real context and the human didn't realise until the reviewer probed them on it...
Harder to guarantee though.

Member

It's ok to set some expectations even if they are not exactly enforceable individually, because they're still informative to the contributor and still moderately useful to the reviewer as well.

Member Author

We absolutely agree that we’ve seen some very unfortunate dynamics here; I’ve talked about this in #514587 (comment).


The following sections give more detail.
Comment on lines +892 to +896

Contributor

Rendered: https://md.someonex.net/s/Nzr2olhfk8#

Suggested change
- # Automation/AI policy
- Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub and Matrix, must have a **responsible person in the loop** who is accountable for that contribution and reviews it before submission, and must **transparently disclose** any non‐trivial use of automation to produce it, including but not limited to LLM‐based AI tools.
- The following sections give more detail.
+ # Automation and "AI"
+ ## Care and respect
+ All text intended for a human reader must be human-authored.
+ ## Policy
+ Every contribution to Nixpkgs and related development venues, including code, documentation, and communication on GitHub and Matrix, must have a **responsible person in the loop** ("Contributor"). This person is accountable for the contribution, reviews it prior to submission, and must **transparently disclose** any non‐trivial use of automation involved in producing it, including but not limited to "AI" and (L)LM-based tools.

Member

> complexity, poor inspectability, poor predictability, &c

Speaking of what is more code and what is more data, lockfiles sometimes get committed, and I think some ecosystems use SAT-solvers, which are complex, have high input sensitivity / low predictability of choice between technically allowable solutions, and also have once been called AI.

> Yes, Nixpkgs is "applied maths".

Most of Nixpkgs by impact, possibly. Most of Nixpkgs by churn is at best computational linguistics.

> ..."working software" here meaning "working in the long(er) term", i.e. "bootstrapable, maintainable, inspectable", of course.

Honestly, for periphery packages often no.

There are good upstreams, where expressions do not require much maintenance beyond versions and maybe dependency lists, and there are bad upstreams where expressions need periodic overhaul as careful as you ever are. There are some medium upstreams where the situation is in the middle, but I am not sure they are the majority of our churn or anything.

> Whether we're talking about Nixpkgs-for-deploying-today, or Nixpkgs-for-reading-by-humans, or the Nixpkgs-for-data-analysis, all three are of most value when written by humans.

Nixpkgs-for-deploying-today is a thing that is of most value when it has coverage. Some things need to be well-designed for that (i.e. cross-compilation is a large increase in coverage). Some things just need to be there at all, more things existing better than fewer but higher quality.

Oh, and we have already plenty of human-intent-destroying policies as well as policy-enforced falsehoods if we look at the case of humans reading expressions.

Contributor

> and I think some ecosystems use SAT-solvers, which are complex, have high input sensitivity / low predictability

👍🏿 👍🏿 👍🏿


## Scope

Any use of automated tools to generate non‐trivial amounts of output as part of a contribution, in whole or in part, verbatim or edited, is covered by this policy, except as listed in the Exemptions section.
Both LLM‐based AI tools and hand‐written automation are covered.
Comment on lines +900 to +901

Contributor

Does this include deterministic template-generating tools such as nix-init?

Member

TBH, disclosing use of nix-init is IMO beneficial to the reviewer.

Member

Agreed. Also fwiw I think nix init is still configured to use some anachronisms like rev and sha256 (at least last time I used it)

Contributor

> TBH, disclosing use of nix-init is IMO beneficial to the reviewer.

Agreed too. And that could be automated for the --commit flag, where adding an Assisted-by: nix-init trailer would make sense.

> I think nix init is still configured to use some anachronisms like rev and sha256 (at least last time I used it)

that is fixed now
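
As a rough illustration of the `Assisted-by: nix-init` idea above, a tool could append the trailer to a commit message along these lines. This is a sketch under assumptions, not how nix-init does or will do it: `add_trailer` is a hypothetical helper, the example commit subject is made up, and Git's own `git interpret-trailers` command is the more robust way to handle trailers in practice.

```python
import re

# A trailer line looks like "Token: value" (e.g. "Co-authored-by: ...").
TRAILER_LINE = re.compile(r"^[A-Za-z0-9-]+: \S.*$")


def add_trailer(message, token, value):
    """Append "token: value" to a commit message unless it is already there."""
    trailer = f"{token}: {value}"
    lines = message.rstrip("\n").split("\n")
    # The last paragraph starts after the final blank line, if there is one.
    blank_indexes = [i for i, line in enumerate(lines) if not line.strip()]
    last_blank = blank_indexes[-1] if blank_indexes else -1
    last_paragraph = lines[last_blank + 1:]
    if trailer in last_paragraph:
        return "\n".join(lines) + "\n"
    # Join an existing trailer block directly; otherwise start a new paragraph.
    has_trailer_block = last_blank >= 0 and all(
        TRAILER_LINE.match(line) for line in last_paragraph
    )
    separator = "\n" if has_trailer_block else "\n\n"
    return "\n".join(lines) + separator + trailer + "\n"
```

For example, `add_trailer("hello: init at 2.12.1\n", "Assisted-by", "nix-init")` adds the trailer as a new final paragraph, and calling it again on the result leaves the message unchanged.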

Contributor

> > I think nix init is still configured to use some anachronisms like rev and sha256 (at least last time I used it)
>
> that is fixed now

Sure, that individual case has been fixed, but that's not the point; the point is that nix-init will always lack in adherence to nixpkgs standards, and I could name a couple of other basic issues that have not been addressed for literal years.

Contributor

@figsoda figsoda Apr 30, 2026

Member

opened nix-community/nix-init#784 to add the assisted-by trailer to nix-init --commit

> the point is that nix-init will always lack in adherence to nixpkgs standards, and I could name a couple of other basic issues that have not been addressed for literal years.

nix-init was barely maintained for a while, I've since picked up the maintenance again and fixed most of the outdatedness. one of my goals for nix-init is to closely follow nixpkgs conventions, please open an issue upstream for any issue you are still running into, so I can take a look at them @eclairevoyant

Member Author

I would expect that nix-init probably falls under:

> Use of standard community automation is exempt, such as nix-update, the official Nixpkgs CI bots, the @r-ryantm update bot, and the Nixpkgs security tracker bot.

“Standard community automation” is admittedly fairly vague, but it seems better for it to be interpreted reasonably broadly rather than treating things that have previously been widely accepted as violations.

I agree that it would generally be good for automated tools of all kinds to disclose themselves in commit messages. Still, I don’t think we’d want to consider it a problem for people to run nix-update without disclosure, especially as these tools don’t always create their own commit messages.

Contributions include code and documentation in commits, commit messages, pull request summaries and reviews, issue and vulnerability reports, GitHub comments, Matrix messages, and Discourse posts.
The covered venues are the GitHub repositories for Nixpkgs and [related projects](https://github.com/orgs/NixOS/teams/nixpkgs-core/repositories) under the jurisdiction of the Nixpkgs core team, Matrix rooms that are focused on development of those projects, and Discourse topics about Nixpkgs development.

## Accountability

Everyone who submits a contribution to Nixpkgs is responsible for it, regardless of the use of automated tooling.
Before submission, they must establish a reasonable level of understanding of the contribution and expectation of its correctness.
A contributor submitting a contribution intended for inclusion in Nixpkgs is also responsible for ensuring that it is [appropriately licensed](https://github.com/NixOS/nixpkgs/blob/master/COPYING) and credited, and not encumbered by any incompatible copyright.

When output from automated tooling is used in contributions, a contributor must establish confidence in that output.

Contributor

One thing I would like to request, though I'm not sure where it'd go here, is a sort of rule that:

Contributions from tools should at the very least reduce toil for future maintainers.

The idea there being that if you contribute via LLM or similar for packages or services, we expect to see:

  • Addition of tests (if they were missing)
  • Addition of update scripts
  • Addition of (as much as possible) well-documented and well-typed program and service options (instead of, say, the not-uncommon practice of "here's a config string we'll turn into a file, good luck").

(nix-facts is a tool you can use to try and see how many packages are missing tests or update scripts, though it's in "beta")

Basically, if we use LLMs, we should be using them not only for our own convenience but to achieve code quality and ease-of-maintenance that is currently not there.

Contributor

I think that this is something valuable to keep in mind as a principle, but I think it's a little bit too subject to interpretation to make a hard and fast guideline.

@crertel crertel Apr 29, 2026

Contributor

I think a hard-and-fast rule of "if you're adding a package/service using an LLM, we better see tests and update scripts" is a pretty objective measure--either the tests and update scripts are there, or they aren't.

But, I'm not sure that belongs in this doc, as such.

@vikanezrimaya vikanezrimaya Apr 29, 2026

Member

"leave the codebase better than you found it" is a good guideline, but would indeed be too subjective to be a hard rule.

May be worth a mention somewhere in the contribution guidelines though. Explicitly, because even though it's common-sense for many long-time maintainers, I feel like newcomers may benefit from a reminder. And I feel the positive sentiment as in "try to strive for this ideal" would offset the "don't do this, don't do that" of hard-and-fast rules aimed at low-effort contributions mentally, encouraging good contributions instead of just discouraging bad ones.

Member Author

We would definitely like to encourage contributors to raise our technical standards! But I agree with the other replies that this is a more general principle, and would best go in other contributing documentation, as it seems particularly important for this policy to stay focused; any subjectivity is more of a risk here than elsewhere, and I think it already risks being too verbose in the effort to avoid that.

This can be achieved by establishing confidence in the correctness of the tooling’s logic, manual review of the included output, or using further automation to verify the output (e.g. programmatically checking whether a refactor avoids causing rebuilds).
As the inner workings of LLM‐based AI tools cannot be sufficiently understood at present, only the latter two options are available when those are used; vibe coding without review is not permitted.
When automation is used to verify output, the verification tooling itself must be disclosed and reviewed in line with this policy.

Member

I challenge the assumption that the third option can successfully establish confidence in the output of an LLM.

LLMs are fundamentally unpredictable. For every verification automation you can devise there likely is an output that works around it. For example, if you "programmatically check whether a refactor avoids causing rebuilds", an LLM could insert malicious code hidden behind a flag that is disabled by default.

This is not hypothetical. There have been numerous reports of LLMs working around restrictions imposed by the user, for example by removing tests that don't pass or even going as far as escalating privileges.

Contributor

+1, was going to bring this up as well, but got stuck formulating a suggestion block

Neither a manual post-hoc review of the outputs nor a post-hoc harness is necessarily sufficient in itself. I'm more skeptical about the manual review than about using a harness, because of the pace mismatch between the human operator and the model, and because of how human attention works. What I'd require from people experimenting with LLM usage in Nixpkgs is a form of "slow mode": #514587 (comment).

Member Author

I agree that this method is not wholly reliable, but I’m not sure that reading the diff is either, in the limit of underhanded code. I would say that this is what we have reviewers for – we can’t assume that every contribution we get is correct or even non‐malicious, and so we try to have a second pair of eyes on changes, while accepting that that won’t necessarily catch everything. The keyword is “second”, though; we can at least establish a baseline of the contributor having done something to gain confidence in it, if they’re acting in good faith, regardless of whether automation was used or not.

These considerations apply to treewide refactors in general, many of which are too big for anyone to practically deeply and exhaustively review their entire diff by hand, which was our motivation for mentioning automated validation here – we’ve had both successful and unsuccessful automated treewides in the past, and their success has correlated strongly with the amount of mechanical validation. I’ve merged at least one large partially‐automated, partially‐manual treewide that would have been intractable to write or verify fully by hand, based largely on automated review (#398707 (comment)).

Since this criterion just covers what needs to be done before proposing a change, and we explicitly state that “Automation without any manual review must not be used as the sole arbiter of whether to merge a change” and “Everyone who submits a contribution to Nixpkgs is responsible for it”, I believe it doesn’t have to be perfect, and for certain types of change I believe automatic checking does better than manual review; I won’t say that a refactor being labelled as causing zero rebuilds is proof that it’s correct, but I think it’s at least reasonably strong evidence for the expected level of risk, and there have certainly been many cases where I intended for a refactor to be zero rebuilds but found out it wasn’t and had to correct it.
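To make the "zero rebuilds" check discussed here concrete, below is a minimal sketch in Python. It assumes the contributor has already produced two mappings from attribute name to derivation path, one evaluated before the change and one after; how those mappings are obtained is deliberately left out. The function names, attribute names, and store paths are hypothetical and only serve as illustration.

```python
def rebuild_report(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two attribute -> derivation path mappings."""
    shared = before.keys() & after.keys()
    return {
        # Same attribute, different derivation: this would be rebuilt.
        "changed": sorted(a for a in shared if before[a] != after[a]),
        # Attributes that only exist after the change.
        "added": sorted(after.keys() - before.keys()),
        # Attributes that disappeared with the change.
        "removed": sorted(before.keys() - after.keys()),
    }


def is_zero_rebuild(before: dict[str, str], after: dict[str, str]) -> bool:
    """True when no attribute was changed, added, or removed."""
    return not any(rebuild_report(before, after).values())


# Hypothetical data: placeholder attribute names and derivation paths.
before = {"hello": "/nix/store/aaa-hello.drv", "zlib": "/nix/store/bbb-zlib.drv"}
after = {"hello": "/nix/store/aaa-hello.drv", "zlib": "/nix/store/ccc-zlib.drv"}
print(rebuild_report(before, after))
```

An empty report is evidence rather than proof, in line with the comment above: it only says that the compared derivations are identical, not that the change is correct.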

Member

Thank you for your thoughtful replies. My understanding is that you agree that method 3 is not reliable when applied to LLMs¹, and that large changes are impossible to review (method 2). If so, it follows that LLM use should be forbidden at least for large changes.

I suggest:

When output from automated tooling is used in contributions, a contributor must establish confidence in that output.
-This can be achieved by establishing confidence in the correctness of the tooling’s logic, manual review of the included output, or using further automation to verify the output (e.g. programmatically checking whether a refactor avoids causing rebuilds).
+This can be achieved by establishing confidence in the correctness of the tooling’s logic or manual review of the included output, and can be supplemented by using further automation to verify the output (e.g. programmatically checking whether a refactor avoids causing rebuilds).
-As the inner workings of LLM‐based AI tools cannot be sufficiently understood at present, only the latter two options are available when those are used.
+As the inner workings of LLM‐based AI tools cannot be sufficiently understood at present, only the second option is available when those are used.
+It follows that LLM use for changes that are too large to be reviewed (such as treewide refactors) is forbidden.
When automation is used to verify output, the verification tooling itself must be disclosed and reviewed in line with this policy.

Footnotes

  1. Or in general unless either method 1 is valid as well, or the author of the change is trusted. Hence "supplemented by" in my suggestion.

Member

There are some large changes where you can design formal (held-out — not available to the LLM, and with a strong syntactic component — no new conditionals etc.) checks that are easily reviewable and sufficient to check the change. Of course then the checks would need to be shared and also reviewed by other people.

General impossibility is there, but particular cases can be verifiable if diff shape is constrained enough.

Member Author

My understanding is that you agree that method 3 is not reliable when applied to LLMs¹,

I would say rather that none of the methods are reliable in the general case for any automation, and that we compensate for this for major or treewide changes by expecting both that contributors establish some reasonable confidence in them, and that at least one separate reviewer does so independently.

I believe that for large treewides, automated checking has generally been more reliable than manual diff inspection, and there’s a difference between the Goodhart’s law effect from “here’s a metric; optimize for it” vs. an out‐of‐the‐loop post‐facto mechanical check. I think that, as @7c6f434c says, sufficiently rigid checking can mitigate the risk significantly, and that we are exposed to it from any automated treewide – e.g. we’ve had treewides based on simple textual replacement that were merged due to being zero rebuilds but that broke the NixOS modules they touched that aren’t factored into rebuild counts. I don’t think refusing to do any treewides too large for manual authorship is the solution, though; we just need to be rigorous about them to ensure they improve quality rather than risking harming it.

None of this is to say that we should be merging a treewide without appropriate scrutiny of the methods used to produce it and without careful consideration of the best way to gain independent sources of confidence in it; anyone merging any treewide of course ought to at minimum spot‐check and skim the diff rather than relying solely on a mechanical check, but “Automation without any manual review must not be used as the sole arbiter of whether to merge a change” covers that. The fact that the use of verification code to establish confidence in a change itself requires review and disclosure means that reviewers also have an opportunity to give independent review of the pre‐submission testing methodology.

The intention here is to ensure that any automated change undergoes pre‐screening from a contributor before a reviewer has to consider it, not to mandate that every PR that gets opened is in an immediately mergeable state without further scrutiny.

Member

There are some large changes where you can design formal (held-out — not available to the LLM, and with a strong syntactic component — no new conditionals etc.) checks that are easily reviewable and sufficient to check the change.

@7c6f434c Could you make a practical example? I genuinely struggle to imagine a situation in which writing such a check is less complex than writing a deterministic tool to perform the change in the first place.

Member

If one splits a package into -libs and -daemons, and in the treewide part only replacement of X with X-libs or X-daemons or both happens — this needs checking nixpkgs-review and tests (obviously), and it needs checking that indeed the diff has this restricted structure of replacing one thing with one of the three options, but writing a proper deterministic tool might be annoying.
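A check of the "restricted diff structure" described above could look roughly like the following Python sketch. It assumes the removed and added lines of the diff have already been paired up (that pairing step is not shown), and it accepts a pair only if the added line equals the removed line with each standalone occurrence of the old name replaced by one of the new names, or by both. The helper names and the package name `foo` are hypothetical.

```python
import re
from itertools import permutations


def allowed_rewrite(removed: str, added: str, old: str, new_names: list[str]) -> bool:
    """True if `added` is `removed` with `old` swapped for one or more of `new_names`."""
    # Match `old` only as a standalone name, not inside e.g. `foo-libs` or `libfoo`.
    token = re.compile(rf"(?<![\w-]){re.escape(old)}(?![\w-])")
    parts = token.split(removed)
    if len(parts) == 1:
        return False  # the removed line never mentioned `old`
    # Each new name alone, or several of them separated by whitespace, in any order.
    options = [
        r"\s+".join(re.escape(name) for name in combo)
        for size in range(1, len(new_names) + 1)
        for combo in permutations(new_names, size)
    ]
    replacement = "(?:" + "|".join(options) + ")"
    # Everything outside the replaced name must stay byte-for-byte identical.
    pattern = replacement.join(re.escape(part) for part in parts)
    return re.fullmatch(pattern, added) is not None


def diff_has_restricted_shape(pairs, old: str, new_names: list[str]) -> bool:
    """Check every (removed, added) line pair of a diff."""
    return all(allowed_rewrite(r, a, old, new_names) for r, a in pairs)


names = ["foo-libs", "foo-daemons"]
ok = [("  buildInputs = [ foo zlib ];", "  buildInputs = [ foo-libs foo-daemons zlib ];")]
bad = [("  buildInputs = [ foo zlib ];", "  buildInputs = [ foo-libs zlib curl ];")]
print(diff_has_restricted_shape(ok, "foo", names), diff_has_restricted_shape(bad, "foo", names))
```

Such a syntactic check only constrains the shape of the diff; as the comment says, nixpkgs-review and the tests still need to be checked separately.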

Contributor

in the limit of underhanded code.

I like the reference, maybe it's worth including in the policy as communicating the spirit...

Comment on lines +907 to +914

Contributor

Rendered: https://md.someonex.net/s/Nzr2olhfk8#Accountability

Suggested change
-Everyone who submits a contribution to Nixpkgs is responsible for it, regardless of the use of automated tooling.
-Before submission, they are expected to establish a reasonable level of understanding of the contribution and belief in its correctness.
-Contributors are expected to be able to answer questions about their contribution and respond to feedback appropriately.
-A contributor submitting a contribution intended for inclusion in Nixpkgs is also responsible for ensuring that it is [appropriately licensed](https://github.com/NixOS/nixpkgs/blob/master/COPYING) and credited, and not encumbered by any incompatible copyright.
-When output from automated tooling is used in contributions, a contributor must establish confidence in that output.
-This can be achieved by establishing confidence in the correctness of the tooling’s logic, manual review of the included output, or using further automation to verify the output (e.g. programmatically checking whether a refactor avoids causing rebuilds).
-As the inner workings of LLM‐based AI tools cannot be sufficiently understood at present, only the latter two options are available when those are used.
-When automation is used to verify output, the verification tooling itself must be disclosed and reviewed in line with this policy.
+Contributor is responsible for their Contribution, regardless of the use of automation.
+Before submission, they are expected to establish a reasonable understanding of the Contribution, and a belief in its correctness.
+Contributors are expected to be able to answer questions about their Contribution, and respond to feedback appropriately.
+Contributor submitting a contribution intended for inclusion in Nixpkgs asserts that it is correctly attributed, [licensed](https://github.com/NixOS/nixpkgs/blob/master/COPYING), and not encumbered by any incompatible copyright.
+When an Output from automated tooling ("Tooling") is used in a Contribution, Contributor must establish confidence in the Output.
+This is achieved by establishing confidence in the Tooling, through manual Review of the included Output, or using further automation (a.k.a. "Harness") to verify the Output. A Harness may be e.g. a test suite or a programmatic check, ensuring that a refactoring does not trigger rebuilds. The Harness itself is considered an automation tooling subject to this policy.
+Due to the inherent complexity and opacity of "AI" and (L)LM-based tools, their Outputs must be reviewed manually and (or) verified using an automated Harness. Note that "AI"-based tools might generate Outputs faster and more complex, than is feasible for a human Review or for an automated Harness to achieve reasonable coverage of.

Contributor

if you run a test it must be disclosed in the commit in Assisted-By: format? this wording sounds a bit broad to me.
gotta say i also feel a bit mixed on the term harness here maybe as that also seems used for coding agent client programs.

this wording also i think leaves room for a liberal reading that it could suffice to have LLM output verified by LLM (surely as 'further automation' LLMs might qualify as a 'Harness'), which i think can help but should not be sufficient.

Member

I think it would be more effective to focus on what specific changes in meaning you'd like to see (as you've done a great job at above, thank you) than to propose large rewrites like this, which are difficult for us to incorporate in a redraft when we also have to take into consideration everybody else's comments. For this particular diff I think the legalistic language would make the policy quite a bit harder for people to understand.


@reivilibre reivilibre Apr 30, 2026

Contributor

Not sure where to leave this thread as it's not entirely 100% on topic to other threads, but wondering if there should be distinctions between core code and leaf code like the package definitions, in terms of (principally) LLM-generating it and the impact it has on other people in terms of maintenance later on.

A few months ago (when I had a subscription as a trial) I used an LLM to package some software in NixOS. These packages, I'd already tried to write by hand a year or two ago, but gave up because of the amount of trial and error involved.
(Each build takes so long and if you make one tiny mistake you have to try again)
I don't have time to try that many permutations of a build so I gave up.

Well, LLMs changed that — I was able to pose the problem to an LLM and wake up in the morning with a working package.
The package code works (I can run the programme), it makes sense to me, though I didn't write any of it.
I have permuted it a bit to see exactly why each non-trivial part of the definition is needed, to get a better understanding, but I'll never be able to claim with a straight face that I wrote it.

Should I be allowed, encouraged or discouraged to submit this package for inclusion? (So far I have not, I want to be respectful)

Plus points:

  • Other people have requested the package in an issue. It'd be actually helping people to run software they want to run.
  • If my submission is harmful, it's only a package, at the leaf of the code graph, it's not really inconveniencing anyone, is it? (It feels like it'd be different if it's part of nixpkgs.lib)
  • With my middling understanding of Nix and Nixpkgs norms, it seems OK (I'm not proposing submitting it blind). I'm not clueless about how to build software and about how Nix works.

Negative points:

  • I don't want to force people to review this if they don't like the idea.
  • Arguably I still don't have the full experience of having written this, so I'm winging it a bit. (But to an extent, if I was just bashing my head against the keyboard trying to package this, I'm not sure I'd be faring much better.)
  • How does anyone else know I'm not clueless :)

Contributor

No matter how close your package is to the leaves, that's not the only criterion. Popularity, despite being loose and, in the case of Nixpkgs, unmeasurable, also adds to the importance.

Now to your points:

  • your pros:
    1. not applicable, package requests stopped being accepted about a year ago
    2. depends on how many people use it and care about it and its Nix recipe
    3. you will eventually gain more experience in how Nix(pkgs) works, either by contributing to Nixpkgs or by maintaining your system/home config (or both)
  • your cons:
    1. the policy in question already covers the acceptability
    2. (and 3.) it's always better to learn stuff yourself than pretend to know more than you actually know. say your package doesn't include something one considers important (like systemd units or shell completions), they reach out to you, but you're clueless about that exact aspect because you vibe-coded it. would it be a good situation for you and them?

Member

not applicable, package requests stopped being accepted about a year ago

The fact that we (correctly) do not care about non-contributing user package needs enough to track them does not mean that their needs do not exist at all.

how Nix(pkgs) works

And also how it doesn't…

say your package doesn't include something one considers important (like systemd units or shell completions), they reach out to you, but you're clueless about that exact aspect because you vibe-coded it. would it be a good situation for you and them?

I am willing to explicitly claim, from experience, that having written a package that skips some parts of upstream stuff doesn't help me answer questions «how to add feature X that I don't care about for my use case» any.

it's always better to learn stuff yourself than pretend to know more than you actually know.

To be honest, if someone did some mutation testing of the resulting package (which is an exception not the rule, with or without LLMs) I am no longer sure that they understand the result worse than after blindly fixing yet another error until things seem to work and ship-whatever happens.

Maybe the evidence of such mutation testing could be requested, however — in particular, because the error snippets would explain the package better.

nixpkgs.lib

And this is an important distinction: lib is a codebase with design and architecture, most packages are parts of a database even if it somewhat pretends to be code.

Contributor

I am willing to explicitly claim, from experience, that having written a package that skips some parts of upstream stuff doesn't help me answer questions «how to add feature X that I don't care about for my use case» any.

fair enough. i initially wanted to steer in a different direction, like "say your package doesn't build, but you don't know how to debug the breakage"

Contributor

I don’t think there should be a distinction between core and leaf packages. It invites argument over how much or whether a package is core or leaf, and we should want contributors to take responsibility regardless. Ideally, someone who contributes a new package should also be willing to maintain it.

Contributor

I (personally) would not be remotely comfortable at all vibecoding/LLM-assisted-coding stuff into lib. That feels like real, critical code that touches a lot of other people and a lot of other packages.

And to be clear, I am not suggesting lobbing a package into Nixpkgs with no maintainer. I'm happy to maintain the package, just the initial process of writing that initial build script is so painful for some projects that it's a huge barrier to packaging, because the iteration time is so long.

If I fail (which I'm not sure changes depending on whether that initial setup was LLM-driven or not)? Nixpkgs as a project can remove 1 dir from the codebase, there's no painful migration involved, it's as if I never arrived at the scene.

I am not saying if this is right or wrong, curious for perspectives, that's all.

Thanks for the clarification, and my apologies for implying that you might be suggesting adding packages without maintaining them. Given the clarification, I still wouldn’t want to make a distinction. If we did, I’d want those exceptions enumerated (e.g., this path or that package have special rules), but I’d rather not.

My reasoning is that if the proposed policy isn’t good enough to keep the quality of code in those critical packages high enough, then it’s probably also not good enough for those peripheral packages. We ought to be tracking those problems regardless (to inform any changes needed to the policy), but things fall through the cracks sometimes.

Member

say your package doesn't build, but you don't know how to debug the breakage

… looking at the ignorelist of LibreOffice tests that a few people tried to fix and nobody succeeded…

Packages just differ in how much effort and competence gives you what, and there are levels of effort and competence where using an LLM could probably help you improve your understanding by skipping the teaches-you-nothing parts of busywork, and there are levels of involvement and competence where using an LLM gives you both a worse expression and worse understanding, and there are combinations where you can get a better package and worse understanding or a worse package and better understanding, depending on how you use the tools…

a distinction between core and leaf packages

Nixpkgs is pretty bad with making things binary that have many more intermediate states, and here it is also a spectrum which gradually shifts the trade-offs. The very core packages (even data-entry-style ones) have a huge amount of downstream fallout when some previously documented case is not correctly reasoned about, so yeah, a riskier idea to apply LLMs there, but also these are the packages that unlike most others might get multiple competent and thorough reviews for a change. A typical package gets fewer than one thorough review per change.

And also we should be better at recognising levels of maintainer involvement: a leaf package that works better than nothing is fine, actually; if it gets dependents and some user of a dependent package considers improving the original dependency, we should be better at giving information whether this user already cares about the package more than the original maintainer. In which case maintenance handover will probably boost the amount of human thought invested above the threshold where LLMs might break more than they fix.

My reasoning is that if the proposed policy isn’t good enough to keep the quality of code in those critical packages high enough, then it’s probably also not good enough for those peripheral packages.

Quality is not to put in front of quality throne, quality is to improve people's experience, and it costs effort, and for more peripheral packages availability at bad quality is often the best effort-benefit trade-off.

Contributor

Quality is not to put in front of quality throne, quality is to improve people's experience, and it costs effort, and for more peripheral packages availability at bad quality is often the best effort-benefit trade-off.

I’m using code quality as a proxy for determining how well this policy is working. Maybe it’s not the right way to measure it. Whatever that is, we should be able to make inferences based on how well it works in areas with a lot of scrutiny (i.e., “core” areas) to gain an intuition of how well it may be working in areas with less.

Circling back to #514587 (comment), I think it’s a problem if there’s a distinction between “core” and “leaf” (or “critical” and “peripheral” or whatever). This is less of a technical concern and more about how the policy relates to contributing to Nixpkgs. If there are different rules depending on where or what people contribute, it will be a source of friction (and even worse if those rules are unwritten).

Member

Right now we have a spectrum of impact between packages, and a spectrum of likely impact of changes (there are some changes to high-impact packages that are more like medium-impact themselves), and a corresponding spectrum of diligence and review expectations.

And yes, from time to time someone makes a wrong call and merges a too-risky change too quickly, and then there are reverts and general friction.

But also some details of the policy will impact specifically the long tail because in higher-impact areas things will need to pass more scrutiny regardless of the origin of the lines of code.

Member Author

I think these considerations should mostly balance each other out, right? The baseline expectations apply to every contribution, and since more important code (ideally) gets more scrutiny, cases where the baseline expectations are insufficient to ensure the required level of quality should be more likely to be caught.

I agree with @reckenrode that we shouldn’t try to explicitly delineate a core vs. peripheral distinction and apply different standards to each, in the context of a formal policy.

This policy applies equally to any further discussion of a contribution.
Comments and reviews must separately satisfy the same requirements of understanding, review, and disclosure.
Contributors are expected to be able to answer questions about their contribution and respond to feedback appropriately, without simply forwarding messages back and forth to automated tools.

It is not permitted to submit automated contributions without any manual review or intervention, outside of standard community automation.

Member

I have thought about making automated PRs, as an alternative to (smaller) treewides where there is worry about functionality (e.g. removing dead code, where it's often not clear whether the code should be dead or alive). The advantage would be that, with one PR per package (or per unique set of maintainers, or per, like, 5 packages), the maintainers could be requested for review, and request changes, approve, or even merge themselves with the merge bot. (And, because it would be mine, I would be able to easily look at PRs, respond to feedback, close, merge, etc.)

I haven't actually created this. But I think it would be useful in some circumstances, and so I'd be reluctant to prohibit it entirely?
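As a rough illustration of the "one PR per unique set of maintainers" idea (not an existing tool), the grouping step could be sketched in Python as below. The input mapping from package name to maintainer handles is assumed to be available already, and all names are hypothetical.

```python
from collections import defaultdict


def group_by_maintainers(packages: dict[str, set[str]]) -> dict[frozenset[str], list[str]]:
    """Group package names by their exact set of maintainers."""
    groups: dict[frozenset[str], list[str]] = defaultdict(list)
    for name in sorted(packages):
        # frozenset makes the maintainer set hashable and order-independent.
        groups[frozenset(packages[name])].append(name)
    return dict(groups)


# Hypothetical package -> maintainers data.
packages = {
    "alpha": {"alice"},
    "beta": {"alice", "bob"},
    "gamma": {"alice"},
}
for maintainers, names in group_by_maintainers(packages).items():
    print(sorted(maintainers), names)
```

Each resulting group would correspond to one proposed pull request, with the human author still in the loop for each of them.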

Contributor

off-topic (?): i think i've seen considerations (by @infinisil?) on using a single PR for tree-wide changes making for easier reverts - tho for all i know this might depend on the topic as well

Member

Also, if a maintainer has an objection applicable to multiple packages in the change, it is a bit annoying to paste it to all the relevant PRs… A unique set of maintainers is not really enough to guarantee a low amount of per-maintainer spam. If a specific maintainer agrees, one can break out a specific subset of changes after their review comment.

Member Author

It sounds to me like this is a process you’d drive with yourself in the loop for a specific treewide, right, @mdaniels5757? I think that doesn’t count as the fully‐automated zero‐intervention no‐oversight bot case this is intended to rule out.

@mdaniels5757 mdaniels5757 May 9, 2026

Member

emilazy: Yes, that's correct. OK, good to hear.

Automation without any manual review must not be used as the sole arbiter of whether to merge a change.
Comment thread
mdaniels5757 marked this conversation as resolved.

## Transparency

All covered use of automated tooling for a contribution must be disclosed as part of that contribution.

In the case of LLM‐based AI tooling used for commits, this **must** be in the form of an `Assisted-by:` Git commit trailer, including at least the tool name and the primary model name and version used for the contribution.
A `Co-authored-by:` trailer does not satisfy this policy.

Any adequate form of disclosure is permitted for other kinds of tooling and contribution.
Pull request summaries and review comments must be disclosed separately to commits.
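As an illustration of the disclosure requirement for commits, here is a minimal Python sketch that looks for an `Assisted-by:` trailer in a commit message. It is a simplified approximation of Git's trailer parsing (only the last paragraph is inspected), and the trailer value shown is a made-up example: the exact value format is an assumption, since the policy text only requires the tool name and the primary model name and version.

```python
def trailers(message: str) -> dict[str, list[str]]:
    """Collect `Key: value` lines from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    found: dict[str, list[str]] = {}
    if len(paragraphs) < 2:
        return found  # a subject line alone carries no trailers
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        # Trailer keys contain no spaces; skip anything that does not look like one.
        if sep and key and " " not in key and value.strip():
            found.setdefault(key.lower(), []).append(value.strip())
    return found


def has_assisted_by(message: str) -> bool:
    """True if the message carries at least one non-empty Assisted-by trailer."""
    return bool(trailers(message).get("assisted-by"))


# Hypothetical commit messages; tool and model names are placeholders.
disclosed = "foo: refactor build inputs\n\nSome body text.\n\nAssisted-by: ExampleTool (example-model 1.0)\n"
not_disclosed = "foo: refactor build inputs\n\nSome body text.\n\nCo-authored-by: Example Bot <bot@example.invalid>\n"
print(has_assisted_by(disclosed), has_assisted_by(not_disclosed))
```

The second message is rejected because, as stated above, a `Co-authored-by:` trailer does not satisfy the policy.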

## Exemptions

The following situations are fully or partially exempt:

* Use of standard deterministic editor/IDE/formatter/text transformation tooling to produce changes that the author manually reviews and understands is exempt, including inline “auto‐completion” (even if LLM‐based) of short, rote snippets of text that do not contribute anything beyond boilerplate the author would have written anyway.

Member

Suggested change
-* Use of standard deterministic editor/IDE/formatter/text transformation tooling to produce changes that the author manually reviews and understands is exempt, including inline “auto‐completion” (even if LLM‐based) of short, rote snippets of text that do not contribute anything beyond boilerplate the author would have written anyway.
+* Use of standard deterministic editor/IDE/formatter/text transformation tooling to produce changes that the author manually reviews and understands is exempt, including inline “auto‐completion” (excluding LLM‐based) of short, rote snippets of text that do not contribute anything beyond boilerplate the author would have written anyway.

I actually think it's far too easy to have inline LLM autocomplete write code for people who don't actually understand what is happening. Even if they do, I think it is still worthwhile to make it known in the trailers.

Contributor

👍 I think it's this kind of interactive "auto-completion" that it's not immediately obvious we need a moratorium on, but that is susceptible to "falling asleep at the wheel", and where we need contributors to take special care in "establishing confidence".

@eclairevoyant eclairevoyant Apr 30, 2026

Contributor

LLMs are autocomplete - that's literally what they are designed to do. Making an exception for autocompletion is effectively exempting all LLM output.

@SomeoneSerge SomeoneSerge Apr 30, 2026

Contributor

See #514587 (comment) for one more spin on this intuition: what we want to require is the "slow mode", i.e. that the reviewer is not exposed to the "generator"'s artifacts at a pace any faster than their normal writing and review speed.

> LLMs are autocomplete

Yeah, I think we're referring to "LLMs" in the sense of the wrappers around "LLMs" in the sense of the autocomplete blocks.

Member

The text already limits this to boilerplate.
If that's unclear, perhaps include a counterpoint?

Suggested change
* Use of standard deterministic editor/IDE/formatter/text transformation tooling to produce changes that the author manually reviews and understands is exempt, including inline “auto‐completion” (even if LLM‐based) of short, rote snippets of text that do not contribute anything beyond boilerplate the author would have written anyway.
* Use of standard deterministic editor/IDE/formatter/text transformation tooling to produce changes that the author manually reviews and understands is exempt, including inline “auto‐completion” (even if LLM‐based) of short, rote snippets of text that do not contribute anything beyond boilerplate the author would have written anyway. (So it follows that generating any actual implementation logic is *not* exempt, regardless of how the automation is interacted with.)

Member Author

At least JetBrains IDEs automatically show short LLM‐based autocompletions out of the box; I expect it probably doesn’t apply to Nix code specifically, but in general I would suspect that we’ll see features like that deployed more widely, and it’s not clear to me that the degree of harm posed by them counterbalances the downsides of having a policy that catches a bunch of very benign, entirely manually‐directed changes, and that people might reasonably not realize they’re out of compliance with even after reading the policy.

I agree that it doesn’t rule out complacency at all, but when strictly scoped to short snippets of rote boilerplate, I’m not sure it’s a huge qualitative change compared to pre‐existing types of tooling, in the way that automated coding agents are. (I saw TabNine talked about a little when it came out many years ago, but not huge amounts of hype or a sea change in the observed dynamics of FOSS contributions.)

The restriction is also meant to be strict enough to rule out e.g. any worry of copyright concerns. (To a large degree it is likely redundant to the “generate non‐trivial amounts of output” wording – or elaborating upon it, depending on how you look at it.)

I agree with @roberth that “short, rote snippets of text that do not contribute anything beyond boilerplate the author would have written anyway” is the load‐bearing criterion here, not “auto‐completion”, and certainly doesn’t exempt all LLM output. But if there’s a way we could clarify the former further to make the tight scope clear, that would be good.


* Use of standard community automation is exempt, such as `nix-update`, the official Nixpkgs CI bots, the @r-ryantm update bot, other maintainer‐approved bots that run update scripts, and the Nixpkgs security tracker bot.

* Use of AI tools for research, testing, debugging, or private review is out of scope, if no substantial amount of their output is included in the resulting contribution.
However, if these tools had a significant technical influence on your contribution, you are still responsible for it per the Accountability section, and are expected to disclose this where relevant.

* Use of machine translation is exempt from the requirement to understand the translated output.
However, the requirements of appropriate confidence in the original text, responsibility, and disclosure still apply, and you are encouraged to additionally include the original untranslated contribution.
Comment on lines +944 to +945

@SomeoneSerge SomeoneSerge Apr 29, 2026

Contributor

I believe this isn't even an "exemption" as such, more of a clarification of how we interpret what machine translation does. Rather than reasoning about "interactiveness", we could talk about how direct or how "invertible" the correspondence between the inputs and the outputs is.

Suggested change
* Use of machine translation is exempt from the requirement to understand the translated output.
However, the requirements of appropriate confidence in the original text, responsibility, and disclosure still apply, and you are encouraged to additionally include the original untranslated contribution.
* Machine translation is assumed to be sufficiently interactive for the contributor to retain their understanding and confidence in the generated outputs.
The accountability and disclosure policies still apply, and you are encouraged to include the original untranslated contribution.

Contributor

I wouldn't personally use "interactive" as the distinguishing factor, since one might argue that the process of using coding agents includes interactive elements as well.

Contributor

Fair point, the interactivity I'm talking about is "you are forced to come up with and spit out every next hunk at exactly the same pace as the model"

Member

Machine translation of human language, right?

"Appropriate confidence" is load bearing here, but good enough IMO.
I'm inclined to go on a tangent to say that i18n is hard, but that's (unfortunately?) mostly irrelevant for Nixpkgs anyway for now.

Translation is not where the difficulty is, so the current text looks sufficiently vague to me so as not to pose a problem.

Contributor

> Machine translation of human language, right?
> ... i18n ... hard

Si, and si. "Slow mode" is my current proxy for what we're trying to pinpoint here: #514587 (comment)

Member Author

I think this wording is likely to be more confusing for readers, as it covers the justification for providing an exemption while leaving the actual exemption implicit. The review requirements are explicitly about the output; most cases where automation is processing a contribution into a form that is no longer comprehensible to the contributor would be a problem, I think, and machine translation has enough pitfalls that I wouldn’t say we can make a general assumption that the contributor understands its output – hence also the suggestion to include the original text.

Contributor

> automation is processing a contribution into a form that is no longer comprehensible to the contributor would be a problem, I think, and machine translation has enough pitfalls that I wouldn’t say we can make a general assumption that the contributor understands its output

Precisely. That's why I'm saying it's not really an exemption, nor does it need to be one. The paragraph is just there to state explicitly that the use of translation and accessibility tools is not, in itself, automatically disqualifying.


* Use of automation in a contribution clearly marked as not being ready for merge (e.g. a draft pull request) is exempt from the requirement for full self‐review, as long as some amount of review has been done and it is expected that the requirements will be met by the time it is marked as ready.
This does not waive any other requirement.

* Use of automated tools to develop upstream software packaged inside Nixpkgs is not in scope.

## Enforcement

If you believe that someone is using automation without appropriate disclosure and review, you can politely ask them if that’s the case and point them to this policy as appropriate.
Please assume good faith and remain civil; it’s not always possible to determine, and it is more likely that someone overlooked this policy than deliberately violated it.
If you think someone is continuing to break the policy after this, please escalate to the [Nixpkgs core team](https://nixos.org/community/teams/nixpkgs-core/) rather than fighting over it.

If a contribution is clearly in violation of the policy (e.g. the contributor admits it was not followed, or there are AI tool attributions that do not meet our required format), it can be closed or hidden, preferably after informing the contributor of the policy and giving them a chance to address the violations.
Deliberate violations of this policy are considered to break the [Code of Conduct](https://github.com/NixOS/.github/blob/master/CODE_OF_CONDUCT.md) clause against “Wasting other people’s time with low quality contributions, including but not limited to LLM and bot spam”.
Repeated violations are grounds for further moderation action.

## Credits

This policy takes inspiration from similar policies in [LLVM](https://llvm.org/docs/AIToolPolicy.html), [Mesa](https://gitlab.freedesktop.org/mesa/mesa/-/blob/mesa-26.1.0-rc1/docs/submittingpatches.rst?ref_type=tags), [Fedora](https://docs.fedoraproject.org/en-US/council/policy/ai-contribution-policy/), and the [Linux kernel](https://docs.kernel.org/7.0/process/coding-assistants.html), along with [a proposal by the author of Anubis](https://xeiaso.net/notes/2025/assisted-by-footer/).

Contributor

Here's a good example of "playing safe" (thank you @xhalo32 for the link):

> Note that using an LLM when writing comments on github or Zulip is not allowed: use your own words.
https://leanprover-community.github.io/contribute/#use-of-ai

I believe Mathlib's decision is particularly relevant to us, because it's a very similar project: it's a holistic and orderly compilation of our collective human knowledge about everything-mathematics in a single unified language, same as Nixpkgs is a compilation of collective human knowledge about everything-software. Projects like Mathlib and Nixpkgs (...Wikipedia) are precisely where we need to establish a "blood-brain barrier" and prevent parasitic feedback loops: "Nixpkgs is an input, not an output"

Member

Mathlib is about maths, a subject with pretty deep respect for orderliness. Much of Nixpkgs is about software build systems when nobody is intervening in time to prevent horrors, a chaotic subject in practice. The closer to core you get, the more orderly people try to keep things, but at the periphery, sometimes «one who fights monsters becomes a monster».

> prevent parasitic feedback loops

Our real needs are about working software. If chatbots stop working, no chatbot contribution will pass the basic requirements, and consensus is finally achieved. And whether chatbots do stop working should not be our problem to fix. People who want to make chatbots work will do so, people who want to poison chatbots without damaging the actual functionality of Nixpkgs will do so, and that's fine.

Contributor

> Much of Nixpkgs is about software build systems [...] a chaotic subject in practice

Yes, Nixpkgs is "applied maths".

> Our real needs are about working software,

..."working software" here meaning "working in the long(er) term", i.e. "bootstrappable, maintainable, inspectable", of course.

> People who want to make chatbots work will do so, people who want to poison chatbots without damaging the actual functionality of Nixpkgs will do so, and that's fine.

Chatbots are supplied with enough money to keep chugging forward and getting subscriptions regardless of the quality of their underlying models. It is Nixpkgs as a social phenomenon and as a world-modeling endeavor, with its Wikipedia-style development model, that is at risk.

Whether we're talking about Nixpkgs-for-deploying-today, Nixpkgs-for-reading-by-humans, or Nixpkgs-for-data-analysis, all three are of most value when written by humans.
