Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A Request for Comment on Code Review Process #73

Open
tstellar opened this issue Oct 5, 2021 · 194 comments
Open

A Request for Comment on Code Review Process #73

tstellar opened this issue Oct 5, 2021 · 194 comments

Comments

@tstellar
Copy link
Collaborator

tstellar commented Oct 5, 2021

Proposal

The LLVM Foundation Board of Directors is seeking comment on the current state of Code Review within the LLVM Project and its sub-projects. Phabricator is no longer actively maintained and we would like to move away from a self-hosted solution, so our goal is to determine if GitHub Pull Requests are a good alternative to our current code review tool: Phabricator.

Specifically we are looking for feedback on:

  • What features or properties make Github Pull Requests better than Phabricator?
  • What features or properties make Phabricator better than GitHub Pull Requests?
  • What new workflows or process improvements will be possible with GitHub Pull Requests?
  • Which workflows aren’t possible with GitHub Pull Requests?
  • Any other information that you think will help the Board of Directors make the best decision.

Where to Direct Feedback

Please provide feedback on this Infrastructure Working Group ticket. This will make it easier to collect and consolidate the responses. At the end of the comment period the Infrastructure Working Group will collect the feedback for further analysis and summarization.

Timeline

The timeline for this RFC will be as follows:

  • RFC posted for public review and comment
  • 30 days after the date of posting, public comment closes.
  • IWG will have 14 days from closure of public comments to review and summarize public comments into a pros and cons list to be present to LLVM Foundation Board
  • Foundation Board will have 30 days to make a final decision about using GitHub Pull Requests and then communicate a migration plan to the community.
@sheredom
Copy link

sheredom commented Oct 5, 2021

What features or properties make Github Pull Requests better than Phabricator?

TL;DR - new users can contribute to the project using tooling that they are already familiar with.

Clang got a really cool feature because some non-LLVMer (not part of the core community) did some modifications and then our community was fortunate enough that someone within the community took the time to pull the GitHub PR for it over into Phabricator - I'm talking about -time-trace.

From talking to people outwith the core LLVM community - the number one reason that people don't attempt to push legit fixes/features back to LLVM is that the code review process is a) non-normative, and b) scary as a result. I think moving to GitHub PRs, or at least offering them as an option, would greatly help onboarding people to the project.

@davidchisnall
Copy link

[ For context: I vastly prefer GitHub's PR flow to Phabricator ]

Which workflows aren’t possible with GitHub Pull Requests?

GitHub's PRs make it very difficult to provide a patch set that can be reviewed independently. The only way that this can be done at all is:

  1. Create a PR for the first patch in the series.
  2. Create a PR for the second patch in the series to merge not to main, but to the branch of the first.
  3. Repeat step 2 for each patch in the sequence up to N.
  4. Merge PR for patch N into N-1.
  5. Repeat step 4 until everything is merged into the first PR.
  6. Merge the first PR.

This is very error-prone. The root problem with GitHub PRs is that the target branch cannot be changed after the PR is created. If it could, then we could create PRs to merge into the next branch in the sequence but then merge them into main in order and reset the next PR to try to merge to main (or merge it into the intermediate branch if reviewers are happy with the roll-up).

@rengolin
Copy link
Member

rengolin commented Oct 5, 2021

GH Bad:

  • GitHub confuses discussion comments with review comments in the code view.
    • You can only answer multiple comments in the code view.
    • Old discussion view keep some comments that are not reachable from code view.
  • GitHub doesn't allow to see diffs between arbitrary revisions, only commits directly.
    • Commenting on a commit doesn't guarantee users can reach from discussion or code views
  • Patch sets, as @davidchisnall explained above.
    • However, a patch set that is in a single branch is dead easy.

GH Good:

  • It's free, it's under development
  • GitHub is accessible to millions of developers that already use PRs
  • GitHub has lots of options to protect branches, required reviews, run actions, etc.
  • GitHub allows direct merge into the repo, while Phab requires offline merge / arcanist.

Phab Bad:

  • It's dying, it's custom, it's self hosted
  • It's a hassle to register and it's an interface that isn't intuitive to a lot of people.
  • I have to upload a diff file... Or use "arcanist", yet another unintuitive tool.
  • I have to rebase/merge by hand... or use "arcanist"...

Phab Good:

  • Possible to make dependent reviews (series), but it's not really easy
    • I have to create each review separate, add parent/child links, etc.
    • When I change my branch(es), it doesn't update, I have to upload again
    • Even if the patch set is in a single branch, it's still not automatically updated

@kparzysz-quic
Copy link

@davidchisnall @rengolin
Assume that you have 2 dependent patches and you push 2 branches:

  • PR1: branch1 -> main
  • PR2: branch2 -> branch1

When you merge PR1, and delete branch1, PR2 will automatically retarget itself from branch1 to main.

@kparzysz-quic
Copy link

Also, a PR can have the base changed manually: https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/changing-the-base-branch-of-a-pull-request

@joker-eph
Copy link
Collaborator

@kparzysz-quic yes that works but is has two drawbacks I believe:

  1. It forces you to push all the branches to the main repo and open pull-request from within the repo, because otherwise the pull-request from branch2 to branch1 is not visible in the main repo.

  2. It means that updating these requires to rebase / keep a clean history locally. While I like this and this is how I work with Phabricator, that allows goes against other advice of not rebasing or amend/force-push a pull-request to preserve the review history best.

@llvm-beanz
Copy link

In 2019 at the Women in Compilers and Tools Workshop before the LLVM Developers meeting and again this year at the Community.o summit in March, we held workshops discussing barriers for new contributors. In both workshops utilizing GitHub pull requests was identified as a way of lowering the barrier for contributions.

Good, bad, or otherwise, git has become the dominant revision control system for software, especially open source, and hosting services like GitHub, GitLab, Bitbucket and Azure DevOps have made the pull-request model the standard workflow for the industry.

Whatever warts GitHub's pull requests may have, they are the most widely used model for contributing code and providing code reviews. LLVM going against the grain is an unfortunate detriment to our community.

@rengolin
Copy link
Member

rengolin commented Oct 5, 2021

While I'm unhappy with github PRs, @llvm-beanz point is more important than my personal preference.

While I'm also unhappy with Phabricator, the reverse is also true.

To me, the best course of action is to find a set of guidelines for people that are used to Phabricator to do the same on pull requests, and keep that document up-to-date.

I want to make sure we don't alienate existing contributors for the sake of new ones, but that we also don't make it hard to have new ones at all.

My personal view is that existing contributors know more about llvm and our community and would be more comfortable adapting to a new paradigm, for the time and love invested in the project.

But that's my personal view and in no way takes into consideration other people's difficulties, which could be many, and serious.

@kparzysz-quic
Copy link

@joker-eph Yeah, we could end up with hundreds of branches in the main repo. I guess we could agree that branches that are merged should be deleted, and, if left undeleted, can be deleted by anyone (unless they are explicitly made exceptions). The force pushes are probably inevitable given that PRs work on commits, not just diffs...

@rengolin
Copy link
Member

rengolin commented Oct 5, 2021

I would suggest that all series have to be managed in the developers repos, and only one of their branches to be requested against the official repo's main branch.

So we can all have different branches and a meta branch that Cherry picks from those, probably even have a script for that, and then when the metà branch is pushed to origin, the PR is updated as expected.

@AaronBallman
Copy link

To me, what tool I use for code review is one of the most important technical decisions that impacts me as a community member. Last time I gathered data, I participate in approx 80-100 distinct reviews a month in the clang and clang-tools-extra projects, so the choice of code review tools impacts me basically all day, every day.

Specifically we are looking for feedback on:

  • What features or properties make Github Pull Requests better than Phabricator?

Once you get used to the GitHub workflow, creating patches from PRs off a branch is quite easy. I'd note that some people use arcanist to simplify creating a Phab patch the same way (I'm not one of those folks, I use Phab directly through the web interface). I also like how easy it is to link to issues with the GitHub UI compared to Phab, but this is a lesser benefit to me (copy-pasting a link to bugzilla is not a ton of effort). The more full-featured markdown abilities in the editor are a nice touch, but not a critical thing to me.

  • What features or properties make Phabricator better than GitHub Pull Requests?

The code review interface has far less surprises and annoyances. I'm sure this comes somewhat from the amount of time I've used Phab (many years) vs the amount of time I've used GitHub (~2 years), but I've never found GitHub's code review workflow to be particularly ergonomic. It hides the most important files in a review because there are "too many changes" in it, too many comments on a review slows down the web interface, etc. It's basically the same usability concerns that have been brought up by myself and others every time this topic comes up.

Having used both for a few years now, I still find I am productive in Phab in ways that I'm not productive in GitHub, but it's not easy to boil it down to a succinct "if you fix this, my issue goes away" issue so much as an overall workflow within the tool issue. Anecdotally, it seems to take me about 25% longer when reviewing in GitHub compared to Phabricator, so I'm worried whether I'll be capable of reviewing at the same pace I currently do if we switched.

  • What new workflows or process improvements will be possible with GitHub Pull Requests?

None that matter to my day-to-day work, at least that I'm aware of.

  • Which workflows aren’t possible with GitHub Pull Requests?

Herald-like subscriptions where I can be added as a required reviewer for changes with a file-level granularity. (At least, I've not found the functionality if it exists.)

  • Any other information that you think will help the Board of Directors make the best decision.

My personal preference is to continue to support Phabricator (whether through Phorge, our own fork, whatever). However, I understand the desire to not host our own solutions or have to maintain them, and there's a lot of allure for using GitHub because of its popularity.

But I'd like to reiterate my concern about the danger of switching tools (regardless of what we switch to). The Clang frontend already struggles with review velocity, as can be seen by the number of time people need to "ping" a review thread. This isn't because reviewers are struggling with the tools, it's because reviewers who know the internals of the product well enough to make a decision about a patch are hard to get time from. Increasing the number of PRs the community receives while decreasing the amount of qualified reviews is a very real possibility and is not the kind of experience we want people new to the community to have. I don't know how to address that, but I do know that for myself, I definitely see a decreased amount of productivity when working in GitHub compared to Phab and I know that my review queue already sometimes fills up faster than I can perform the reviews today.

@rengolin
Copy link
Member

rengolin commented Oct 5, 2021

@joker-eph Yeah, we could end up with hundreds of branches in the main repo. I guess we could agree that branches that are merged should be deleted, and, if left undeleted, can be deleted by anyone (unless they are explicitly made exceptions). The force pushes are probably inevitable given that PRs work on commits, not just diffs...

Having hundreds of branches in the main repo because of PRs and patch sets is unmaintainable. Anyone deleting any branches is dangerous, as GH doesn't really check much before deleting. I don't think that's a viable solution.

@Meinersbur
Copy link
Member

IMHO we should prioritize switching from Bugzilla from GitHub over Phabricator. Reasons:

  1. bugs.llvm.org is still on version 5.0.2 (released in 2016)
  2. Latest release of Bugzilla was in 2019 and its last development commit was in February, while Phabricator went out of maintenance just this June
  3. As by the announcement, there will still be some maintenance, just not new features
  4. A 3rd party might fork Phabricator and continue its development
  5. Switching to GitHub Issues seems to be less controversial
  6. bugs.llvm.org still has self-registration disabled, a discouraging first obstacle for new community members; IMHO more than new developers having to become familiar with Phabricator.

Given point 3 and 4, there seems no immediate reason to switch away from Phabricator, especially when compared to the state of our bugtracker.

@kparzysz-quic
Copy link

Having hundreds of branches in the main repo because of PRs and patch sets is unmaintainable. Anyone deleting any branches is dangerous, as GH doesn't really check much before deleting. I don't think that's a viable solution.

  1. Github allows you to set branch protection rules, so that important branches cannot be deleted.
  2. You cannot delete a branch if there is an open PR from it.

@rengolin
Copy link
Member

rengolin commented Oct 5, 2021

  1. Switching to GitHub Issues seems to be less controversial

I wish that was true. While most focus of PR vs Phab is around patch sets, GH issues have a lot more shortcomings than bugzilla (by design, I assume), so there are a lot more things to worry about than PRs.

However, I'd also refrain from commenting about bug tracker on a code review thread. We can only handle so much controversy at a time. :)

@nickdesaulniers
Copy link
Member

nickdesaulniers commented Oct 5, 2021

I think having a bot enforce that presubmit tests run, with documentation on how downstream consumers of LLVM might help set up bots to help participate in the CI presubmit, might be nice. The bot could work as a "merge queue" in that patches must pass CI before being merged, and that the bot alone is responsible for merges, of patches passing CI in FIFO. Also, having some dedicated reviewers could help replace "herald" from phab, AND help new contributors get reviewers auto assigned. (I suspect many newbies don't know who to ask for code review of their initial patches; having a bot analyze git log for the most recent contributors (akin to what the Linux kernel has a script for; scripts/get_maintainer.pl) might be better than this round robin stuff.)

lattner referenced this issue in llvm/circt Oct 5, 2021
This was duplicated between ExportVerilog.cpp and BlackBoxReader.cpp,
but it is a generally useful utility function.
@pogo59
Copy link

pogo59 commented Oct 5, 2021

I've used I don't know how many code-review systems over 40 years. They all fail in one way or another. Moving to GitHub PRs will if anything simplify my life, as I'm already used to GH Enterprise for my downstream repo.

Migrating to GitHub PRs (and Issues) is inevitable. It's how the open-source world works nowadays; let's go ahead and bite the bullet. But let's also not shoot ourselves in the foot.

  • Work on bot infrastructure FIRST. It's eternally painful to have every single push send me a half-dozen or more bot-fail messages.
  • Move bugzilla to Issues SECOND. Bug numbers are embedded in too many places, they need to be preserved.
  • Move off of Phabricator LAST. Phab hates me with a deep abiding passion, but I've learned how to appease it and get the minimum necessary done. We'll want to keep it available in archive mode forever, but at least links to it from commit logs are URLs and not just numbers, so it's okay to have a new numbering scheme afterward.

@rengolin
Copy link
Member

rengolin commented Oct 5, 2021

@pogo59 that sounds like a good plan. I also support moving to github, both issues and pull requests, but the work to get there isn't trivial and I would hate to rush this and make things worse in the end.

@llvm-beanz
Copy link

I think that in moving to pull requests the bots sending mass emails problems should get easier. Most PR testing systems support bots commenting in the PR instead of emailing so the audience is just people subscribed to the PR.

Also with the expansion of pre-merge testing the number of failures post-merge should drop dramatically.

@teqdruid
Copy link

teqdruid commented Oct 6, 2021

I strongly support moving to GH PRs. We've been using them in the CIRCT project from the beginning. Yes, there are downsides mostly stemming from GH's desire to simplify things; but, I personally haven't come across anything which I needed to do which just isn't possible. Our project is far smaller and far less complex than LLVM, so mileage may vary.

  • GH now has a CLI tool. I've not used it much, but it's got a ton of functionality. Presumably, it's just a thin wrapper on top of the very full featured GH REST API. Could be useful for anyone concerned about reviewing productivity. https://cli.github.com/
  • A killer feature of said CLI (IMO) is the ability to 'git checkout' the PR which I'm reviewing -- even if it's off a fork. This enables running tests locally and using my editor to crawl around code with which I am unfamiliar but the PR uses. Yeah, there's probably a way to get arc to do the same, but I'm scared of arc.
  • There are alternative interfaces to GH PRs. There are also some alternative PR review systems which use GH PRs as a backend (e.g. https://www.pullrequest.com/) so as to maintain some compatibility.
  • GH workflows can be triggered on PRs (and pretty much anything https://docs.github.com/en/actions/learn-github-actions/events-that-trigger-workflows). This could be useful as an escape hatch.
  • There exist a ton of actions that can be run in a workflow some of which to do useful stuff on each PR. (e.g. identify the correct reviewers based on the files modified). There are a few gotchas involved though.
  • A bunch of IDEs/editors have integrations with GH PRs.

Some downsides:

  • The GH rules for email notifications are either non-existent or overly obtuse. I looked into setting them up for myself a while back, but it was easier just to use email filters.
  • Not-unrelated: the GH hosted agents (on which we may want to run actions) run on a very small VM size. While larger ones are supported on private repos, AFAIK they are not yet supported for public ones: https://github.community/t/github-hosted-runners-vm-size/136482. Setting up our own would be fraught with security and setup issues. The CIRCT project uses GH workflows on GH hosted agents to do all of our validation builds. Doing an MLIR build is downright painful -- it routinely takes 1.5 to 2+ hours for a 'unified' build. Fortunately, we cache the MLIR build so our PR checks don't take that long to run.

Overall, I think the network effect is extremely beneficial and probably the best reason to switch. If there's a feature we need, we could likely add it via either a GH workflow or a bot. I've also noticed a significant acceleration in GH feature velocity over the last few years, so things are just going to get more featureful. To that end, GH does publish their roadmap: https://github.com/github/roadmap/projects/1.

@richardxia
Copy link
Member

  • The GH rules for email notifications are either non-existent or overly obtuse. I looked into setting them up for myself a while back, but it was easier just to use email filters.

For what it's worth, GitHub actually has documentation on the special [email protected] CC addresses that you can use for setting up your own email filters: https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/setting-up-notifications/configuring-notifications#filtering-email-notifications. This doesn't beat having a more sophisticated way to configure email notifications, but at least it does give you something a bit better to filter on than just the email subject.

@jh7370
Copy link

jh7370 commented Oct 6, 2021

I'm strongly against moving to PRs at this time, for reasons I'll detail below. I don't think any of my concerns are unsolveable necessarily, but if we move without addressing them, I'll really not be happy, and my ability to review contributions will decrease. This is particularly important on both counts, because I'm currently not even assigned to an LLVM-related team in my company, so I can't justify spending more than a relatively small amount of time reviewing things upstream, and I'm also one of about 2 people who currently reviews changes in most of the LLVM binutils, so my review bandwidth is critical to development in this area.

Also, for context, my company uses Github Enterprise internally, and has done for a number of years. I feel like I have about equal experience in both GHE and Phabricator, and the latter is my preferred review system by an absolutely huge margin.

What features or properties make Github Pull Requests better than Phabricator?

  • Integration with other infrastructure, such as Github issues, and the main repo definitely are useful.
  • UI options for directly merging in changes are useful (although I find I only use them about 20% of the time in my internal company work, preferring the command-line instead).
  • New contributors will often be more familiar with PRs with Phab reviews (but not always).
  • No "missing context" due to people forgetting to add -U999999 to their diff generation.

What features or properties make Phabricator better than GitHub Pull Requests?

  • For new contributors who have no familiarity with Phabricator or GH PRs, I think Phabricator's entry barrier is actually lower as things stand as there are fewer things that need to be got right. Certainly, it didn't take me very long to get up to speed with it.
  • Patches don't have to correspond to branches: I often do my work in a feature branch, with a Phabricator review per commit. This makes updating individual commits easier for me. It also means that where I have a small piece of work, I can just have a commit directly on top of main, without needing to update things, which is simpler than going through the branch creation process.
  • Github PRs don't really provide any true commit stacks, for patch series, where you want to keep the reviews separate: instead, you have to have a branch off a branch off a branch off ... which has the potential to get confusing very quickly. There's also no at-a-glance view option of what the patch series looks like in such a model, whereas in Phabricator the "Stack" button shows all linked-together patches.
  • The comments in Phabricator don't disappear, unless the file they were in is completely removed from the patch (even then they are still available in the history). In Github, they often "hide" in various odd places, due to being "out-of-date" when the relevant area is moved. In Phabricator, there is only one view of the whole patch, with inline comments always being visible unless explicitly hidden by the reviewer. Yes, they sometimes move and lose some context, but it's usually possible to figure out where they're from. This point is perhaps my biggest complaint against Github. It is much easier in Github to miss comments, and much harder to keep conversations in an area going.
  • Comments in Phabricator can be placed anywhere in the available context. In Github, they can only be placed within a few lines of modified code. Sometimes, this makes commenting where a comment is necessary impossible.
  • It can be easy to not be looking at the full patch as planned to land in Github, due to it sometimes deciding to show you only the latest commit. Whilst the same is possible in Phabricator, I find it much less likely to happen.
  • Both provide the ability to view diffs in the history of the review, but in Github, this only is possible if people haven't force-pushed changes, yet it seems to be a common workflow to do so. In Phabricator, the History tab provides the sequence of diffs that have existed for the patch, and it is sometimes useful to go back and see the differences between two specific stages, neither of which are the latest. I feel like this isn't as straightforward in Github.
  • Github allows a person to mark an item as "Resolved", but if memory serves me rightly, this prevents further commenting on the comment chain, which is inferior to the "Mark Done" option in Phabricator.
  • Having to switch tabs between reviewing files and discussion comments is a too-many clicks issue, and also makes it harder to easily flick between things. I often end up with two separate browser tabs when reviewing GHE PRs. This is exarcebated by the "disappearing comments" issue.

What new workflows or process improvements will be possible with GitHub Pull Requests?

  • As noted above, the merging options can make integrating patches simpler.
  • Sometimes, dividing a single review into multiple commits, is useful, as each commit can have its own comments etc, and this can only be done in Github.

Which workflows aren’t possible with GitHub Pull Requests?

  • Patch series are more intuitive in Phabricator.
  • Auto-subscribing/auto-reviewing seems to be more straightforward. In partciular, if memory serves me rightly, there is a hard limit on the number of reviewers in our GHE instance, which may be problematic in some cases.

@SamTebbs33
Copy link

  • Which workflows aren’t possible with GitHub Pull Requests?

Herald-like subscriptions where I can be added as a required reviewer for changes with a file-level granularity. (At least, I've not found the functionality if it exists.)

PRs will get reviewers added automatically for specific files/folders if you have a CODEOWNERS file in the repository.

@jh7370
Copy link

jh7370 commented Oct 6, 2021

  • Which workflows aren’t possible with GitHub Pull Requests?

Herald-like subscriptions where I can be added as a required reviewer for changes with a file-level granularity. (At least, I've not found the functionality if it exists.)

PRs will get reviewers added automatically for specific files/folders if you have a CODEOWNERS file in the repository.

Within LLVM, there's a big difference between code owners and reviewers. I'm automatically subscribed as a reviewer to several areas of code, but I don't consider myself a code owner of these. Also, IIRC within our GHE instance at my company, when trying to add reviewers to a list that had been prefilled with many code owners, several of the existing reviewers were removed (or new ones not added) due to there being a relatively low limit.

@peterwaller-arm
Copy link

peterwaller-arm commented Oct 6, 2021

A negative against arcanist: A newcomer may have to learn both the idiosyncrasies of git (which I think most would agree is not straightforward) compounded against those of Phabricator and arcanist. One specific harm of this is that it makes it harder to pull down a patch from Phabricator without mangling something or ending up in a strange git repository state. Simple enough for an experienced person to fix, but a barrier to entry for a newcomer. Phabricator's tooling, documentation and ethos is playful in admitting that it is 'arcane' but this is not practical and utilitarian when it comes to growing a community. Hosting on GitHub is better in this regard since it exposes pull requests as part of the usual 'git push/pull' mechanism that everyone learns if they use Git anyway.

@tru
Copy link

tru commented Oct 6, 2021

As a fairly newcomer to the project I vastly prefer GitHub Pull Requests because of the ease of use and the familiarity. Phab was a huge obstacle and learning curve for me. I have not been able to install arcanist since it depends on PHP.

@SamTebbs33
Copy link

  • Which workflows aren’t possible with GitHub Pull Requests?

Herald-like subscriptions where I can be added as a required reviewer for changes with a file-level granularity. (At least, I've not found the functionality if it exists.)

PRs will get reviewers added automatically for specific files/folders if you have a CODEOWNERS file in the repository.

Within LLVM, there's a big difference between code owners and reviewers. I'm automatically subscribed as a reviewer to several areas of code, but I don't consider myself a code owner of these.

Well that depends on how you interpret "code owners". In GitHub it just means someone that normally reviews changes to a particular file, folder etc. I would be added for the Arm backend, for example, as my colleagues add me as a reviewer to their Arm backend patches.

@jh7370
Copy link

jh7370 commented Oct 6, 2021

  • Which workflows aren’t possible with GitHub Pull Requests?

Herald-like subscriptions where I can be added as a required reviewer for changes with a file-level granularity. (At least, I've not found the functionality if it exists.)

PRs will get reviewers added automatically for specific files/folders if you have a CODEOWNERS file in the repository.

Within LLVM, there's a big difference between code owners and reviewers. I'm automatically subscribed as a reviewer to several areas of code, but I don't consider myself a code owner of these.

Well that depends on how you interpret "code owners". In GitHub it just means someone that normally reviews changes to a particular file, folder etc. I would be added for the Arm backend, for example, as my colleagues add me as a reviewer to their Arm backend patches.

Right, but my point is that the LLVM definition doesn't match GitHub's (see https://llvm.org/docs/DeveloperPolicy.html#code-owners) - code owners perhaps should be reviewers on all patches (even that's debateable), but not all people who want to always be a reviewer are code owners.

@nemanjai
Copy link
Member

nemanjai commented Oct 6, 2021

While I always hated the GitHub interface for reviewing code and still do (for reasons others have mentioned so I won't repeat), I will of course adapt to whatever tool the community chooses.

One thing that seems quite concerning to me though is that I have seen several mentions that "force push will be inevitable". Although I don't understand why that would be, I always think that using the force option is an indication that something has gone terribly wrong and needs to be fixed. So I am honestly concerned/afraid of any process that routinely requires using the force option.

Also, I find branches to be a very useful concept when there is a very limited number of them (i.e. main and release branches that we have now). However, once everyone needs to push a branch to get their code reviewed in a project as large as LLVM, the number of branches is essentially guaranteed to be so high that in my opinion, the whole concept of branches becomes rather useless. But of course, this is just my opinion and many people on many projects have found ways to work effectively in the presence of so many branches.

@hubert-reinterpretcast
Copy link

That said, with a workflow where changes to address PR feedback are additive, most changes won't need to be rebased until right before merging.

I frequently encounter situations where a proposed change has portions split out and committed separately (and possibly with slight changes). Rebasing after the split portion lands is reasonable and probably recommended (if not required).

or to be disruptive even when it does occur.

While I generally experience more instances of missed-and-unresolved comments in GitHub than in Phabricator, it has not been caused by rebases (but by how GitHub manages inline comments and replies to such).

Something that I have only seen happen with GitHub is the author forgetting to pull the PR branch back to their local copy before doing a rebase (and thereby blowing away all of the applied suggested changes).

@joker-eph
Copy link
Collaborator

That said, with a workflow where changes to address PR feedback are additive, most changes won't need to be rebased until right before merging.

My experience of development is that staying up-to-date with main is important: it may be because of "merge conflicts", because of "stacked changes" that are landing (somehow what @hubert-reinterpretcast mentioned as well), or just because of the need to adjust for API changes, etc. (this is independent of the workflow properties you're mentioning I believe).

To stay up-to-date you need to either rebase on top of main or merge main into your branch. In my experience rebasing a series of commits when there are "merge conflicts" is painful and lead to resolve over and over conflicts in the same area of code. I don't see how a workflow that would encourage to address feedback with more commits touching the same code wouldn't just automatically make the rebase all worse? Merging main into the branch may be OK though: I think you can always "squash and merge" in a single commit ultimately right?

Also: did you get some good experience in mixing this with the dependent changes that are meant to be reviewed independently? (aka "stacked PRs")

@llvm-beanz
Copy link

To stay up-to-date you need to either rebase on top of main or merge main into your branch. In my experience rebasing a series of commits when there are "merge conflicts" is painful and lead to resolve over and over conflicts in the same area of code.

In my experience rebasing is a lot less painful when the workflows are entirely git based (i.e. operating on git branches). With Phabricator I find myself re-resolving the same rebases over and over again. When I update a change early in a stack of dependent reviews, updating all the downstream changes is... unpleasant. (I realize there may be some magic workflow that I don't understand with arcanist/phabricator, and would love it if someone would tell me there is some magic easy way to rebase commit stacks).

In a fully git branch based workflow the rebase resolutions build on each other. If I rebase the first commit in the stack, I can rebase the next commit off the first, and I don't have to re-resolve the changes over and over again. Or if I want to short-cut rebasing, I can reset the second branch to the rebased first and cherry-pick the second commit on top of that. It is really painless (git reflog is my best friend).

I don't see how a workflow that would encourage to address feedback with more commits touching the same code wouldn't just automatically make the rebase all worse?

I don't know why you seem to think rebasing bad or difficult. In my experience most rebases you resolve conflicts once and they sort themselves pretty well, even if you're touching the same code over and over again the resolution is often automatic or at worst usually mechanical.

Merging main into the branch may be OK though: I think you can always "squash and merge" in a single commit ultimately right?

Yes, the final merge can always be reduced to a "squash and merge", which is more or less turning a PR into a single commit on the target branch.

Also: did you get some good experience in mixing this with the dependent changes that are meant to be reviewed independently? (aka "stacked PRs")

I have done a few dependent PRs in different projects. Swift has some infrastructure to manage dependent PRs, but it always seemed overly complicated. When I've done them on other projects I make the first PR target the intended target branch, the second PR target the first PR's branch, the third PR target the second PR's branch (and so on). As each PR merges you can re-target the next PR in the chain to the right branch. Minimal rebasing required.

I find rebasing git branches way easier than rebasing Phabricator reviews, and strongly prefer workflows that keep me in git.

@stephenneuendorffer
Copy link

To stay up-to-date you need to either rebase on top of main or merge main into your branch. In my experience rebasing a series of commits when there are "merge conflicts" is painful and lead to resolve over and over conflicts in the same area of code.

In my experience rebasing is a lot less painful when the workflows are entirely git based (i.e. operating on git branches). With Phabricator I find myself re-resolving the same rebases over and over again. When I update a change early in a stack of dependent reviews, updating all the downstream changes is... unpleasant. (I realize there may be some magic workflow that I don't understand with arcanist/phabricator, and would love it if someone would tell me there is some magic easy way to rebase commit stacks).

My experience is that arcanist is completely braindead with commit stacks. phabricator is horribly obfuse. Fortunately, moz-phab handles commit stacks in a sane way from the command line and as a result, dealing with commit stacks is one of the few places where phab is better than PRs IMO. Basically, you rebase your patches as normal and moz-phab submit.

Steve

@Meinersbur
Copy link
Member

Thanks @stephenneuendorffer for pointing out moz-phab. Since Mozilla has forked Phabricator in reaction to the winding down announcement, and is continuing to maintain it, could we just use their fork?

@joker-eph
Copy link
Collaborator

joker-eph commented Mar 30, 2022

In my experience rebasing is a lot less painful when the workflows are entirely git based (i.e. operating on git branches). With Phabricator I find myself re-resolving the same rebases over and over again.

I handle Phab no differently than any git-branch based flow. I have a stack of commits, each corresponding to a single Phab review. I use interactive rebase and arc diff HEAD~ on each individual commit to update the Phabricator revision. This is all fairly straightforward.

In a fully git branch based workflow the rebase resolutions build on each other. If I rebase the first commit in the stack, I can rebase the next commit off the first, and I don't have to re-resolve the changes over and over again.
[...]
I don't know why you seem to think rebasing bad or difficult. In my experience most rebases you resolve conflicts once and they sort themselves pretty well, even if you're touching the same code over and over again the resolution is often automatic or at worst usually mechanical.

While this is true for dependent changes in separate Phabricator revisions (as long as they are dependent from a semantic point of view, but the actual diff aren't conflicting with each other), I don't see how this extends to your earlier suggestion of addressing review comments with separate commit instead of amending. See for example a pull-request sequence of commits that would look like:

  • commit 1 : introduce change to function foo
  • commit 2: address reviewer A request about the change in function foo (so building upon the previous diff)
  • commit 3: address reviewer B request about the change in function foo (so building upon the previous diff)
  • ....

Now you have to rebase on main and it turns our that function foo was move from one file to another. That is a conflict to resolve when rebase commit 1, but also now commit 2 does not apply and you have to resolve the conflict again, and so on for every follow-up. This does not exist in a flow where you amend the commit instead when addressing the review.

I find rebasing git branches way easier than rebasing Phabricator reviews, and strongly prefer workflows that keep me in git.

I don't know how you're doing Phab review, but arc works really well with a sequence of git commits!
(to the expense of interactive rebase to "push" these to phab, potentially automated with passing --exec "arc diff HEAD~" to git rebase -i to update an entire chain of dependent revision, or using moz-phab).

@joker-eph
Copy link
Collaborator

A made up repo with a concrete example here: https://github.com/joker-eph/test-rebase-PR/pull/1/commits
Started from a simple initial commit and 5 follow-up typical of addressing some code reviews.
Try to clone the repo, checkout the PR branch and rebase it on main: you'll fix the conflict 5 times as far as I can tell: I'm not sure if there is a way to avoid this in general (other than using amend instead of separate commits)?

@tru
Copy link

tru commented Mar 30, 2022

@joker-eph have you tried rerere ?

@whisperity
Copy link
Member

@joker-eph @llvm-beanz I have recently worked on a sizeable implementation to Clang-Tidy, which involved 7 highly interconnected, but at the end of the day, linear pathces. (See http://reviews.llvm.org/D69560 and its stack.)

My workflow was that I created, for each patch, a separate branch in my repository, and worked on those. Because the Git history was not connected to the review process per se, I allowed myself to go wild with having a lot of commits on each branch. And let's say something came up that required changing something in patch 2 out of 7. I did the following steps:

  1. checkout patch-2
  2. Do the change, commit.
  3. checkout patch-3
  4. merge patch-2
  5. checkout patch-4
  6. merge patch-3
  7. [etc...]
  8. checkout patch-2, diff patch-1 > patch-2.patch
  9. checkout patch-3, diff patch-2 > patch-3.patch
  10. [etc.]
  11. Upload patch-2.patch, ... to Phab.

(Alright, this is a bit exaggerated, because I did now always upload EVERY changed diff to Phab. I.e., if I changed something in patch-2, I updated patch-2 to signal the reviewer that I want them to look at my fix, but I didn't update patch-3, etc. on Phab just because of this fix. Only once the review started touching those patches.)

In fact, I created an alias, git rawd (raw diff) around the time I was working on these patches:

╰─ cat ~/.gitconfig | grep rawd
rawd = !bash -c 'git diff --raw -U9999999999 "${1:-HEAD^}" HEAD' -

Now the positivity of this workflow was that I used Git's history tracking as intended - every little bit of detail was recorded verbatim. And I do not remember having to deal with merge conflicts at all (because I never rebased -- might have been lucky with the fact that the API I used wasn't pulled with the rug out from under me...), but I definitely do not remember having to deal with the same conflict multiple times. I did not have to, because once a committed change got merged into the branch of the next patch conflict-less, the subsequent merges into the following patches had to be conflict-less by induction...

Now if we contrast this with the Git(Hub PR) workflow:

  1. If I create the patches from my working branches, all the reviewers will be seeing my "crappy" commits. Even if there is a hundred. I don't think anyone wants to deal with the intricacies of how another person organises their own repository or fork...
  2. Forks are internally sharing the same Git object storage on GitHub, as was evident from the Youtube-DL DMCA case where people started uploading stuff into forks that was accessible from the main repo (1) (2). This could pose an issue that repositories get littered with additional data that consumes space but only used as a "temporary" during development.
  3. If I work either on a non-fork (or perhaps private) repository, or my "patch-X" branches on my fork are separate than the one I create the pull request from, then the problems with having to rebase, resolve conflicts, etc. resurfaces.

For full disclosure, I do need to admit that after several rounds of executing these steps, the history of my work "locally" looks like the better part of a rail freight terminal, but had I not posted this comment, I am sure the people who reviewed my work would've never gotten to know this fact.

Visualisation of a Git history with gitk showing several commits across multiple branches with cross-merges, that eventually flow together into a single final commit.

By using GitHub PRs that tie my hands by hard coupling the pull request contents to the contents of the branch that is submitted, I either have to administrate things myself across separate branches (which is parallel to submitting "detached" patch files to Phab), or force the reviewers to observe content that they are not interested about (my granular patches), or force myself to create big incremental changes, which might work until the point it doesn't, and if in between two commits that you push to a PR something breaks, and because the diff between those two commits contained several independent(-ish?) changes, you cannot really bisect a potentially introduced bug anymore.

Both of these forces actually defeat the purpose of Git being a distributed VCS and every user having full authority over the contents of their repositories and clones. (I.e.: it is my private implementation detail that I created several small commits, the public contract I give to you, reviewer, is bigger incremental updates.)


Fuller disclosure: I have never, ever, used Arcanist. My work computer is incapable (or unauthorised?) of executing snapd, and having a full PHP interpreter installed system-wide was not really trustworthy to me personally. I stuck to creating a patch file on my desktop, then drag-n-droppin' it into the browser and submitting the diff that way.

@pogo59
Copy link

pogo59 commented Mar 30, 2022

If I create the patches from my working branches, all the reviewers will be seeing my "crappy" commits. Even if there is a hundred. I don't think anyone wants to deal with the intricacies of how another person organises their own repository or fork...

I review multiple-commit PRs on GHE all the time. What the reviewer sees by default is the final state of the branch compared to the target branch. I don't care about all your crappy intermediate commits. The only reason to be needing them is if people insist on looking at how one revision differs from a previous revision, which it sounds like some people do want; and then I think people in general will be kind enough not to point and laugh. So I don't think this point is really a concern.

@jh7370
Copy link

jh7370 commented Mar 30, 2022

On a related note though, although @pogo59 is right that the default is the diff of the head of the branch versus the base, I regularly find that the view I'm looking at is not this view, but rather changes since the last time I reviewed, and I have no idea why this happens sometimes, but not others. Whilst it's a useful piece of functionality to be able to view the changes since my last review, I don't want it to be as apparently arbitrary as it is - I want to be able to make the conscious decision direct from my email notification, without having to remember to click the appropriate menu in github. In Phab I can do this: either click on the review link in the notification, or the "new changes" link.

@hubert-reinterpretcast
Copy link

I regularly find that the view I'm looking at is not this view, but rather changes since the last time I reviewed, and I have no idea why this happens sometimes, but not others.

The link from the "pushed N commits" notification e-mails only shows the changes from the range of commits.

@hubert-reinterpretcast
Copy link

is if people insist on looking at how one revision differs from a previous revision, which it sounds like some people do want; and then I think people in general will be kind enough not to point and laugh.

People won't be pointing and laughing, just frustrated by the process of locating the potentially-new (due to rebase) version of the commit that they left off at in a slew of minor commits with similar subject lines.

@whisperity
Copy link
Member

The link from the "pushed N commits" notification e-mails only shows the changes from the range of commits.

And that clicky thing in the notification gets broken the moment a branch is force-pushed.

For example, I just recently received two emails to a patch I'm subscribed to, outside this project. shaX and messageX are identities: e.g. Message1 in both mails refer to the same message. I am not sure what happened (not even sure if the list is oldest to newest commit or the other way around in the mails), but it definitely looks like a rebase and a force push had been made.

Email 1

xyz pushed 3 commits.

  • sha1 Message1
  • sha2 Message2
  • sha3 Message3

[View it on GitHub] or [unsubscribe]

Email 2

xyz pushed 5 commits.

  • sha4 Message4
  • sha5 Message5
  • sha6 Message1
  • sha7 Message2
  • sha8 Message3

[View it on GitHub] or [unsubscribe]

If I click either of those "View on GitHub" links, I end up on a page that says "Commit range not found". Even though the latest e-mail notification is in fact the state of the request's head-branch right now -- i.e. if I then click "Commits" on GH's interface, I get a list of 5 commits, in the following order:

  • sha4 Message4
  • sha5 Message5
  • sha6 Message1
  • sha7 Message2
  • sha8 Message3

So I am not sure what it is trying to compare with and why, but I have to click "Files changed" to receive the full diff.

It has to be mentioned that the "discussion feed" of the PR explicitly mentions force pushes:

  • xyz force-pushed branch from sha9 to sha10
  • xyz added 5 commits N hours ago
    • [the list of those 5 commits, sha4->shas8, as above, in that order!]
  • xyz force-pushed the branch from sha10 to sha8

Both of these "force-pushed" messages have their own "Compare" button, which do bring me to a comparison... a comparison which shows minimal (2 lines) of change.

While the abstract example cannot fully convey the meaning here, I have to say that as someone who knows the project these notification emails had come from, those 2 lines of changes are way too small compared to what the patch itself contains.

So I still have no idea what was fixed between these force pushes and changes (Is this only what is shown by the "Compare" button here? It's not very intuitive, even though I would like to think I know a few things about Git...), and if I imagine I was reviewing this patch, I would still not be too sure as to what the patch author had changes (outside of the "marked as resolved" discussion threads).

On Phabricator, you can directly compare two "snapshots" or "updates" of the same patch. I am sure with the right SHAs I could do the same on GitHub's interface (or after a recursive fetch, locally...?).

It feels as if I had these two options:

  • I have to trust the patch author to fix all existing concerns and do not introduce new issues
  • I have to re-read the entire thing to make sure of the aforementioned fact

@whisperity
Copy link
Member

Something that occurred to me just by reading up the replies since my last comment... If you click Quote reply to make a reply, it will only copy (with > prefix) the contents of the comment you are quoting... there is no mention of the user, there is no cross-link to the quoted comment (like how Phab has the Dxxxxx#yyyyyy). If I am replying to multiple comments, from the e-mail stream you either:

  • have to realise that the person was replying to you (by remembering what you said, potentially weeks before...)
  • have to read all the emails to gather what is being replied to you (or anyone else for that matter)

And if you already opened the interface, to find the context of the replied comment (for example if someone only replied to a single paragraph or a single sentence?), you need to use "find in page" instead of clicking some cross-reference and getting to the replied post immediately.

This is a feature that the most trivial web forum engined had 20 years ago, and yet we do not have it here. Note: this is not specific to PRs, however, and could be an easy fix from GH's developers.

@smeenai
Copy link

smeenai commented Apr 3, 2022

Since there's been some discussion about linear history: I love that LLVM has a nice (mostly) linear history where each commit represents a sensible unit of work and has clear titles and descriptions, which makes it easy to examine the changes to particular files and directories, and also makes bisecting work much better in my experience. Compare LLVM's history to Swift's and Rust's; this is what the three look like right now.

LLVM:
Screen Shot 2022-04-03 at 1 11 16 PM

Swift:
Screen Shot 2022-04-03 at 1 11 57 PM

Rust:
Screen Shot 2022-04-03 at 1 11 36 PM

Swift and Rust have so much noise. I imagine --no-merges would go a long way towards cleaning that up if I were looking at history locally, but that's still an additional step, and I don't understand what we gain from it. (I know the history of specific files should be a bit cleaner, since merge commits shouldn't affect those directly as often, but there's still some noise there, and directory history is even noisier.)

I think Phabricator plays a huge part in LLVM's clean history, because it has the clear model that each review on Phabricator will correspond to a commit in the repository, and reviewers can then easily enforce good commit hygiene. In particular, stacked reviews are a great tool in enforcing a clean history with sensibly-sized commits, and stacked PRs on GitHub seem like a pain (which would discourage their use): every reference I've found says that you need to maintain a separate branch for each commit, and use a third-party tool like gh-stack or manual curation to get you links between your stacked PRs. With Phabricator, I can just have multiple commits on a single branch (and main works just fine for this) and use git rebase -x 'arc diff' to create my stack. It sounds like moz-phab handles stacks naturally, and perhaps we should be encouraging its use instead of Arcanist; at Meta, we have an arc replacement which also handles stacks naturally, including automatically setting up the parent revision relationships, which is super convenient.

One other point: when I last used PRs, I couldn't formally assign reviewers to them unless I already had write access to the repository; I just had to CC people and hope they saw it. It seems like that's still the case, and that seems decidedly more beginner-unfriendly than Phabricator, which lets you always add reviewers.

@DanielMcIntosh
Copy link

DanielMcIntosh commented Apr 4, 2022

TLDR: We need to have a discussion about what we want the commit history to reflect: 1) a nice, clean history that is easy to follow or 2) a detailed, meticulously accurate history that will preserve every version anybody creates.

A lot of this new discussion has appeared to centre around how we want PRs and PR branches to be managed & merged. There are many different approaches for this, each with some advantages and disadvantages. However, perhaps this is far enough removed from the original topic of this issue that it should be considered separately? A few things we should talk about there:

  • Which of the 3 PR merge strategies should we use (I talk about this here: A Request for Comment on Code Review Process #73 (comment))
  • When updating a PR to address comments, should the update be created as a new commit, or should the original commit(s) be amended?
  • When merge conflicts arise, how should they be resolved - using a rebase or a merge?
  • What do we want new PRs to look like - should we require that people tidy up their branch history before creating a PR? If so, should new PR's contain a single commit, or (where appropriate) multiple, bite-sized commits that are easier to understand?
  • What about dependent/stacked PRs, how should those be created and managed?
  • Where should PR branches be created? Do we require that people create forks so the main repository doesn't get cluttered with PR branches (PRs can be done across forks of a repository, so the PR itself can stay in the main repo)? Should we provide a common fork for everyone to use/create branches in?
  • Do we want different policies for large and small patches (Sometimes decisions which are important to make large patches reviewable are going to be unnecessary and cumbersome for small patches)?
  • etc.

I will also say that I think much of the disagreement in the comments above stems from people using git in different ways and not really communicating how or why they use it that way. There are many ways of using git, and no one 'right way' of doing that. It depends on what you want out of it.

The biggest issue I usually see at the core of this discussion that often goes unmentioned is what do you want the history to reflect (and why): 1) a nice, clean history that is easy to follow or 2) a detailed, meticulously accurate history that will preserve every version you create.
This question is also very important to answer before trying to answer any of the above list of questions.
It's not a binary choice either, there's a ton of choices to be made, each of which leans more to one side of this question or the other. E.g. resolving merge conflicts using a rebase leans more towards option 1) while merge leans more towards option 2); Addressing PR comments by amending commits leans more towards option 1), while using a new commit leans more towards option 2); Creating new PRs with multiple self-contained, bite-sized commits leans more towards option 1), while untidy PRs with haphazard and incomplete commits or large PRs with only 1 commit leans more towards option 2).

Both option 1 and option 2 come with many benefits.
A few benefits of option 1:

  • Every commit should be able to build, run, pass tests, etc. This is very useful for debugging, since you can narrow a problem down to a single, small commit.
  • Places greater importance on communicating information to people in the medium to long term future. Long term, nobody cares about the false starts you made on a feature, nor the small tweaks or adjustments you made before landing your work. However, being able to understand what decisions were made and why is very useful/important (especially in big, long lasting, open source projects like this where contributors come and go), so you might want to format/arrange things to clearly communicate this/tell a clear story, even if that's not how you got there.
  • More favourable to reviewers who arrive late, and people who won't remember anything about older versions of a patch/PR they looked at anyways.
  • Keeps the history of main linear, which reduces clutter.
  • Can be helpful to organize your thoughts for large features

A few benefits of option 2:

  • you never lose any of your work and can always go back to an earlier version (even if it's incomplete and may contain bugs or wouldn't build successfully).
  • Places greater importance on communicating information in the short term.
  • More favourable to existing reviewers who want to see what has changed since they last looked at a PR/patch.
  • Requires less knowledge of git (Mostly because it doesn't require the use of rebase)

I personally lean HEAVILY towards option 1. I find the benefits to debugging (both short and long term) are huge, and the ability to use git to organize my thoughts is almost essential for me (I get very lost when I'm forced to use another version control system). Anytime I want to avoid potentially losing work I can either rely on the reflog, or create a tag at the commit before performing a rebase or whatever, then delete those tags once the feature has landed and/or I'm confident I no longer need them.

If our goal is to re-create as closely as possible the phabricator workflow, something between option 1 and option 2 is probably best, and we can get extremely close if we want. However, I don't think blindly imitating what we already have is the best choice. Phabricator didn't have this range of options, so we didn't have a choice before, but now we do, so we should have a conversation and make that choice ourselves instead of preserving whatever choices it made for us. But, again, I will re-iterate what I said at the start, I think this subject should be discussed in a separate issue.

Now the positivity of this workflow was that I used Git's history tracking as intended - every little bit of detail was recorded verbatim.

There is absolutely no such "intended" way of using git's history tracking, and you will find no mention of such anywhere in any documentation for it. It sounds like a lot of the complaints you have stem from using git this way, and there's absolutely no reason you need to. Interactive rebase, squash and commit --amend all exist for a reason.

@tschuett
Copy link

tschuett commented Jul 5, 2022

I like to introduce the concept of Github PR workflow playgrounds.

Somebody with the right permissions forks the llvm-project as llvm-project-playground and enables PRs with any level. The explicit statement is: the fork will be deleted in finite but unknown time.

Then we can iterate and document the LLVM GitHub PR workflow.

@AaronBallman
Copy link

Then we can iterate and document the LLVM GitHub PR workflow.

I think it's premature to document a GitHub PR workflow before the community has consensus we want to use GitHub for reviews. If this is something that's hidden and is an experiment to see how well GitHub PR workflow would work with our repo, that's a different thing entirely. But if this is something we expect code reviewers to participate in as part of our daily workflow, I'm a strong -1.

@tschuett
Copy link

tschuett commented Jul 5, 2022

My bad. You can use the playground as an experiment. You could have a playground with e.g Bors.

IFF the community decides to move to GitHub PRs, then they could use the playground concept to develop the LLVM workflow.

@AaronBallman
Copy link

Ah, thank you for the clarification (and the initial suggestion), that makes more sense now. :-)

@joker-eph
Copy link
Collaborator

The way I see it: we need to document a proposed GitHub PR for the community to be able to clearly build consensus around a migration to GitHub PR.
Having a playground for this seems critical, what is the advantage of doing it under the LLVM organization though? Anyone can fork it and setup permission for whoever wants to play with it right?

@tschuett
Copy link

tschuett commented Jul 6, 2022

The LLVM hosted playground will draw more attention and more traffic. There will be more PRs. Are you going to setup Bors for your fork?

@joker-eph
Copy link
Collaborator

Who’s attention and what kind of traffic are you looking for?
Since it is a playground: who else than people involved in the working group or people interested by this in particular (who will discover this on Discourse for example) are supposed to participate and contribute « traffic » just by the presence in the LLVM organization?

@tstellar
Copy link
Collaborator Author

tstellar commented Jul 6, 2022

The advantage of doing something like this in the LLVM organization is that we get access to the custom runners provided by GitHub that are not available for all users. We also already have a sort of PR playground that we use for the release branch:
https://github.com/llvm/llvm-project-release-prs/

@joker-eph
Copy link
Collaborator

(I don't know where to record / broadcast this)
The Carbon project has guidelines on how to use pull-requests: https://github.com/carbon-language/carbon-lang/blob/trunk/docs/project/code_review.md#github-pull-request-mechanics
(they use squash-merge and have a linear history like us)

@hubert-reinterpretcast
Copy link

(I don't know where to record / broadcast this) The Carbon project has guidelines on how to use pull-requests

It seems there are various issues under https://github.com/orgs/llvm/projects/11 where the open question has an answer insofar as the Carbon project is concerned.

It seems they have a setup to support stacked PRs via branches in the (non-fork) repo, enforced by naming conventions: https://github.com/carbon-language/carbon-lang/blob/trunk/docs/project/pull_request_workflow.md#stacking-dependent-pull-requests.

It seems they allow rebases of PR branches and non-squash merge: https://github.com/carbon-language/carbon-lang/blob/trunk/docs/project/pull_request_workflow.md#stacking-dependent-pull-requests.

So, insofar as multiple-commit reviews are concerned, they still have multiple ways to go about it.

@smeenai
Copy link

smeenai commented Jun 23, 2023

Sorry for chiming in late here, but I realized that something else I rely on pretty frequently is the Differential Revision link in a commit to link back to the Phabricator review for it. This is often useful to gain additional context on a change, see any reviewer concerns, have a place to raise post-commit issues, etc.

As far as I know, GitHub PRs don't have any equivalent. If we were merging them you could figure out which merge brought in a commit, but we're not (which I fully agree with, given how messy history would get otherwise). What's the standard Github workflow for figuring out the pull request corresponding to a commit then?

@python3kgae
Copy link

The tile in a github commit will include the PR number like (#2257) at the end of the title.

Then you can go to
https://github.com/project_link/pull/2257 to see the PR.

Sorry for chiming in late here, but I realized that something else I rely on pretty frequently is the Differential Revision link in a commit to link back to the Phabricator review for it. This is often useful to gain additional context on a change, see any reviewer concerns, have a place to raise post-commit issues, etc.

As far as I know, GitHub PRs don't have any equivalent. If we were merging them you could figure out which merge brought in a commit, but we're not (which I fully agree with, given how messy history would get otherwise). What's the standard Github workflow for figuring out the pull request corresponding to a commit then?

The tile in a github commit will end with the PR number like (#2257).

Then you can go to
llvm/llvm-project#2257 to see the PR once the switch is done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests