GitHub organization teams/permissions

I would like some feedback from developers on how best to organize permissions within the GitHub organizations of the Fiji software stack. If you do not (want to) know how GitHub teams and permissions work, please ignore this thread!

The orgs under discussion are fiji, bigdataviewer, trakem2, imagej, scifio, imglib, scijava, and perhaps openspim, slim-curve and fiji-BIG. These orgs do not necessarily all need to employ exactly the same pattern of permissions, but right now the situation is an inconsistent hodge-podge.

Requirements

I see three major requirements for managing permissions:

  1. Security. This means “deny by default” and only granting direct push access to people when they really need it (by some definition of “need”; see below).
  2. Community. We want management of these orgs to scale beyond a single bottleneck or gatekeeper. Trusted people should be able to add more trusted people.
  3. Ease of administration. Having only a single team for each org, with admin access for all repositories, would maximally solve this requirement—but at the expense of security.

To balance these, we need to know our development workflow. How much do we want to encourage/enforce use of PRs, versus direct pushes?

  • One approach is to lock down all repos completely, with everyone using fork-and-pull always. This is e.g. how the OME team does things.
  • Another way is to grant anyone doing active+regular work on a component push access to that component, so that they can push topic branches directly. Before GitHub, this was certainly needed, and historically is how the Fiji project has done things.
  • A middle ground exists, where pushing to master is not allowed, but pushing topic branches to official repos is allowed (from which PRs can be filed).

The always-fork-and-pull workflow has pros and cons:

  • [Pro] All work has an associated issue with discussion thread. This could be useful for the changelogs (see discussion).
  • [Con] Work can get bogged down and out of sync awaiting review.
  • [Pro+Con] Topic branches are organized by who is working on them. This avoids a huge overgrown list of topic branches in the official repo, but can make it harder to browse what is being worked on across all developers. Permissions are an obstacle when multiple people collaborate on a PR.

Team roles

To help think clearly who should have which permissions technically, here is an enumeration of SciJava team roles:

  1. Founder - Created the project. Does not imply any current participation or responsibility.
  2. Lead - Has decision-making authority: timing of releases, inclusion of features, etc.
  3. Developer - Adds new features or enhancements. Can be assigned to address feature requests.
  4. Debugger - Fixes bugs. Can be assigned open issues to solve.
  5. Reviewer - Reviews patch submissions.
  6. Support - Responds to community questions and issue reports. Keeps the issue tracker organized.
  7. Maintainer - Merges patch submissions. Cuts releases.
  8. Contributor - Contributed code to the project. Does not imply any current participation or responsibility.

Note that these are per-component roles.

How GitHub does permissions

GitHub supports the following per-repository permissions:

  • Read. Can clone and fork. Since we are OSS, there is no need to grant this permission explicitly since everything is already world-readable.
  • Write. I.e. push access. Typically all/any branch, although it is possible to impose restrictions as discussed above. Also includes full power over the issue tracker.
  • Maintain. Can manage permissions of others, edit description and URL, and other non-destructive actions.
  • Admin. Full control, including the ability to rename or move the repo, archive/unarchive, and delete the repository.

It is possible to, for each repository, add each individual as a separate collaborator with one of the above permission levels.

But in addition, GitHub supports teams on the organization level, each of which is a list of users plus a list of (repository, permission level) pairs. The purpose is to label a list of users and assign everyone on that list the same permission levels for a particular list of repositories. This makes it more convenient to add/remove users from many related repositories at once. For example, as of this writing, the polyglot team of imagej grants several people admin access to several repositories related to packaging and cross-language support (chocolatey-packages, imagej-electron-app, imagej-itk, imagej-launcher, imagej-matlab, imagej-node, imagej-notebook, imagej-server, pyimagej, tutorials).

Finally, GitHub has the notion of organization owners who have admin access to everything in the organization, as well as the ability to create new repositories.

See Managing access to your organization’s repositories for more on GitHub’s permissions.

Expressing SciJava team roles using GitHub permissions

So a key question is: for each SciJava team role, how best to express that technically in GitHub terms?

First, we need a (IMHO straightforward) mapping from team roles to GitHub permissions:

  1. Founder - N/A—not a participatory role.
  2. Lead - Maintain.
  3. Developer - Write.
  4. Debugger - Write.
  5. Reviewer - Write.
  6. Support - Write. (For the issue tracker.)
  7. Maintainer - Maintain.
  8. Contributor - None—use PRs.

Next, we need to decide how best to use GitHub’s available permission tools to achieve and maintain this mapping for each repository.

The way I am leaning right now is to use the collaborator mechanism to manage every repository completely separately, and avoid organization-wide teams whenever possible. One issue with that is that we have hundreds of repositories, so I’ll need to write a script to scrape each project POM and validate the corresponding GitHub permissions configuration. I am hoping the GitHub API is powerful enough to support doing that, because I really don’t want to manually click into each and every repository by hand.

This will hopefully help encourage the team role metadata in POMs to more closely match the reality of what people are working on.

For “blanket maintainers” like myself, I think it is sufficient to use GitHub’s owner mechanism. However, we want to keep the number of owners low.

Concluding questions

  • Do you support my plan to get rid of teams in favor of the above mapping?
  • Given our structure, does anyone see any use for GitHub teams?
  • Do you favor imposing a always-fork-and-pull model of development?
  • Do you favor imposing a no-direct-pushes-to-master-ever restriction?
  • What do you think is a good number of owners per org to avoid a too-low bus factor? 3? 5?
  • Other thoughts and opinions?
2 Likes

As a side note, I happened to just come back across the kubernetes tool for declaring these types of things in a repository itself: peribolos. Happy to hear of similar tools but perhaps something similarly would help capture the output of this topic. ~J.

1 Like

This is probably my favorite approach. I was trying to disable pushing to the master branch but as far as I understand that does not work well with the scijava-scripts/release-version.sh script.

This works with the middle ground approach as well, as long as branches are being deleted upon merging.

I do not have a strong opinion but I would say yes. My own approaches of managing through teams was not very structured so I do not see a lot of benefit in them. Deriving permissions from the pom.xml would be a great improvement.

Generally a good idea but the middle ground approach is good, as well.

Yes! Although I have concerns regarding scijava-scripts/release-version.sh with that.

It’s hard to tell, it really depends on how actively the owners participate. I fell that in some organizations there is a big variance.

Will the permissions be checked (and updated) with every push/merge that contains changes to pom.xml? That may be useful.

2 Likes

In this case we should make sure that Travis cannot change someones admin role by accident. Which probably means Travis has a maintainer permission for the repo.

I had the same idea after reading the first part of your text. I also like that this would give more motivation to update the POM. But what about this scenario: I work on a branch, someone helps me with a little thing / we work together on it on this other persons laptop and want to push to the branch. We would first have to add the person as a contributor / reviewer to the POM and wait for the permissions to change before he or she can help me work on it? This feels like an obstacle for collaboratively working on a branch.

Having one team per organization with these middle ground permissions would allow people in our community to contribute to any branches. Meaning the admin of the org still has to add them manually. I don’t know how much work this is.

Then the github maintainer permission could be granted (by Travis) depending on the POM - this way we make sure the most important roles in the POM are up to date.

It would be good if we could make this happen. (There is also a possibility to enforce that a commit is always associated with an issue - @turekg has experience with that)

1 Like

:+1: Great, I’m glad people like the idea of mapping SciJava team roles to GitHub permissions. I think it will work cleanly—although I favor a simpler scheme now; see “A way forward?” below for details.

I would not favor teaching the CI to do this automatically, because it entails granting Travis CI admin access to all our repositories, which is IMHO too large a security risk. Instead, I’d favor a CLI script that scans for discrepancies, and perhaps can fix them, using the GitHub credentials of the person running the script. Yes, this is one more manual thing, but we could hook up the “scan and report” part to a daily Travis CI cron job.

You can push to the branch using your GitHub credentials, regardless of who authored the commit. (Although in this scenario, you probably should also author the commit, and then add Co-authored-by: Jane Doe <soandso@email.domain> in the commit message.)

Interesting idea! I think this is better than my proposal to use outside collaborators; see “A way forward?” below.

Here are some additional concerns I have—in addition to the "release-version.sh won’t work anymore" problem—about locking down the master branch across the board:

  • Some Fiji plugin maintainers are not software engineers—they are scientists who have learned a little about git in order to push changes to the code. These maintainers are able to work on a single branch (master), and push their changes, and in some cases even run the release-version.sh script, but that is it. Branching is too complicated, and we would reduce our maintainer pool by forcing all maintainers to do it.

  • As blanket maintainer of this ecosystem, I occasionally finding myself needing to push commits to every repository across all the orgs—e.g., when I update the Travis CI configuration, or make blanket updates to pom.xml files. Currently, I loop over all the repositories, automatically making these changes and pushing directly to master. With hub, PRs can be filed automatically, but the CLI way to merge PRs is to check out the linked branch and then merge it to master and push. Would GitHub allow this form of PR merging from the CLI if branch restrictions are in effect? I don’t know. What about if other branch restrictions are in play, such as requiring 1 or more reviews? Imagine needing to review and sign off on 300 pull requests that all do the same thing across 300 repositories.

    As soon as any component of the SciJava ecosystem starts enforcing branch restrictions, my workflow above is in jeopardy. Already this is a big hassle when I try to update components that live outside of the core orgs, since I do not have direct push access to many of them; I end up filing a bunch of PRs, some of which do not get merged in a timely manner or at all.

    All of that said, this workflow might still be fine, because I think we could configure admins as exempt from the branch protection rules. But I’m not 100% sure; see e.g. isaacs/github#1390 where this issue is discussed.

A way forward?

Based on feedback from this thread, here is what I now propose for fiji , bigdataviewer , trakem2 , imagej , scifio , imglib and scijava orgs:

  • Org members consist of all users who have a participatory team role in any component of the org.
  • Grant all members of an org write access to all its repositories.
  • On a per-component basis, allow project leads to impose workflow restrictions—disallow pushing to master, require LGTM reviews, require passed CI checks on PRs, etc.
  • For users with lead or maintainer role for a component, add them to that repo as collaborators with Admin permission—this ensures they can e.g. push to master even if there are workflow restrictions in place.
  • Delete all teams, and remove prior org members who do not have any current participatory roles.

This scheme has many good characteristics:

  • All org members can push to all repos in the org. This could be considered a feature rather than a bug, as it avoids permissions-related holdups.
  • Maintainers can continue using release-version.sh to make releases.
  • Fiji plugins that currently have a single maintainer can continue using a simple master-based workflow.
  • Developers may choose to impose a stricter development process on components they lead.
  • No teams, so relatively simple to maintain. Two things to keep in sync: 1) collaborators with admin role on GitHub repo = leads and maintainers in that POM; and 2) org members on GitHub = sum of all <developers> in component POMs of that org. We can write a script to scan for any discrepancies.

Downsides/challenges I see:

  • Branch protections must—as far as I can tell—be configured repository by repository. So imposing a “no pushing directly to master” rule across many repositories would be substantial initial effort to accomplish. Fortunately, I do not expect we would do that en masse immediately1.
  • In order to support maintainers being able to merge PRs when branch restrictions are in effect, the maintainers will need to have admin access, not maintainer as I proposed before.
  • Automatic changelog generation must cope with the fact that some repositories still have commits directly to master. I believe that any attempt to impose more order on what gets committed and pushed, for the benefit of automated tooling, will make the lives of certain humans more difficult. One compromise would be for the changelog tooling to simply ignore direct commits to master, only analyzing PRs and/or merge commits—if you want your work to be featured in the changelog, make sure to do it on a branch according to a supported workflow!

1 How many components do we have? Who are the leads/maintainers?

The people with lead role have decision-making authority for a given repository. If there is no lead, then the people with maintainer role have de facto decision-making authority instead.

Here is a count of people by lead/maintainer authority:

$ for pom in (fiji|bigdataviewer|trakem2|imagej|scifio|imglib|scijava)/*/pom.xml; do grep -q '>lead<' "$pom" && role=lead || role=maintainer; grep -A9999 -h '<developers>' "$pom" | grep -B9999 '</developers>' | perl -0777 -pe "s/[\n\t]*//g" | perl -0777 -pe "s/<\/developer>/<\/developer>\n/g" | grep ">$role<"; done | sed 's/.*<name>\([^<]*\)<\/name>.*/\1/' | sort | uniq -c | sort -n
   1 Brian Long
   1 Christopher M. Bruns
   1 Dimiter Prodanov
   1 Gabriel Einsdorf
   1 Gabriella Turek
   1 Gregory Jefferis
   1 Jan Funke
   1 Kota Miura
   1 Les Foster
   1 Matt McCormick
   1 Michael Doube
   1 Radoslaw Ejsmont
   1 Ricardo Henriques
   1 Tiago Ferreira
   1 Tom Kazimiers
   2 Brian Northan
   2 Christian Dietz
   2 Christian Tischer
   2 Kai Uwe Barthel
   2 Kyle Harrington
   2 Matthias Arzt
   2 Philipp Hanslovsky
   2 Stefan Helfrich
   2 Wayne Rasband
   3 Hadrien Mary
   4 Daniel James White
   4 Jan Eglinger
   5 Gabriel Landini
   5 Larry Lindsey
   6 Albert Cardona
   7 Ignacio Arganda-Carreras
  12 Jean-Yves Tinevez
  13 Stephan Preibisch
  13 Tobias Pietzsch
  18 Stephan Saalfeld
 134 Curtis Rueden

And I would not immediately enable branch protections in the 134 repositories for which I am currently responsible. Protections can be enabled as new leads/maintainers emerge who prefer to do so.

1 Like

I am just going to comment on your downsides:

I agree. While I am a big proponent of branch protections, I do think that most people are opposed or do not care, so I do not expect a lot of work wrt to enable branch protection.

I was unaware of that. I thought it was possible to configure branch restrictions such that anyone with write access could merge pull requests (through the web interface).

While you definitely have me on your side with requiring some sort of merge or PR message, this could be relaxed even more: As long as a commit has (at least one) tag to indicate the type of change (major, minor, patch) it can be included in the change log.

Sure, but it depends on how you set up the rule. This article has a lot of details, but I’m still not totally clear on everything without trying it out. In particular, I’m not sure how much control we will have over who can do what without creating teams. I’d love to avoid creating any teams if possible.

:+1: