CI and Workflow
CodeX edited this page 2026-04-07 01:09:01 +02:00

Overview

Synchronisation is driven by a single Gitea Actions workflow, .gitea/workflows/sync.yml. It runs on a schedule and on manual dispatch. Each run executes the sync script, commits any changes to blacklist or blacklist.prev, and pushes the commit to main.

There is no CI in the traditional sense for this repository -- no tests, no build, no lint. The workflow's only job is to keep the blacklist in sync with upstream.

Workflow file

name: Sync blocklists from upstream

on:
  schedule:
    - cron: '0 4 */7 * *'
  workflow_dispatch:

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Fetch and merge upstream files
        run: python3 scripts/merge_blocklists.py

      - name: Commit and push if changed
        run: |
          git config user.name "gitea-actions"
          git config user.email "actions@gitea"
          git add .
          git diff --staged --quiet || git commit -m "Sync blocklists from upstream"
          git push

Schedule

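The cron expression 0 4 */7 * * runs at 04:00 UTC on days 1, 8, 15, 22, and 29 of each month. Within a month that is every 7 days, but the interval shortens at the month boundary: the step restarts on day 1, so the run on day 29 is followed by another only 1-3 days later. A 28-day February has no day 29, so the gap from 22 February to 1 March is a full 7 days.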

This cadence is deliberate: upstream Cleanuparr rarely changes the blacklist, and running less frequently reduces noise in the commit history. If upstream is updated and you want the change immediately, use manual dispatch (see below) instead of waiting for the next scheduled run.
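
The month-boundary behaviour of the */7 day-of-month field can be checked with a few lines of Python. This is a minimal sketch for illustration only; the helper names scheduled_days and gap_to_next_month are made up here and are not part of the repository.

```python
import calendar
from datetime import date


def scheduled_days(year: int, month: int, step: int = 7) -> list[int]:
    """Days of the month matched by a `*/step` day-of-month cron field."""
    days_in_month = calendar.monthrange(year, month)[1]
    # `*/7` starts at day 1 and steps by 7: 1, 8, 15, 22, 29 (if the day exists).
    return list(range(1, days_in_month + 1, step))


def gap_to_next_month(year: int, month: int, step: int = 7) -> int:
    """Days between the last run of this month and the first run of the next."""
    last_run = date(year, month, scheduled_days(year, month, step)[-1])
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    return (date(next_year, next_month, 1) - last_run).days


if __name__ == "__main__":
    for year, month in [(2026, 1), (2026, 2), (2026, 4), (2028, 2)]:
        print(year, month, scheduled_days(year, month), gap_to_next_month(year, month))
```

For a 31-day month the gap is 3 days, for a 30-day month 2 days, and for a leap-year February 1 day.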

Changing the schedule

Edit the cron line in .gitea/workflows/sync.yml. Common alternatives:

Cron expression   Meaning
0 4 */7 * *       Days 1, 8, 15, 22, and 29 of each month at 04:00 UTC (current)
0 4 * * 1         Every Monday at 04:00 UTC
0 4 1 * *         First day of every month at 04:00 UTC
0 */6 * * *       Every 6 hours, on the hour (00:00, 06:00, 12:00, 18:00 UTC)

All times are UTC. Gitea Actions does not support timezones in cron expressions.

Manual dispatch

The workflow_dispatch trigger lets you run the sync on demand from the Gitea UI or via the API. Use this after editing whitelist if you want the change to take effect immediately instead of waiting for the next scheduled run.

From the Gitea UI

  1. Open the repository on Gitea.
  2. Go to Actions -> Sync blocklists from upstream.
  3. Click Run workflow.
  4. Select branch main.
  5. Click the confirm button.

The run appears in the Actions list within a few seconds and typically completes in under a minute.

From the API

curl -X POST \
  -H "Authorization: token YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"ref": "main"}' \
  https://git.hisp.no/api/v1/repos/arr/blocklists/actions/workflows/sync.yml/dispatches

The token needs write:repository scope for the arr/blocklists repo.
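
The same request can be made from Python with only the standard library. The sketch below mirrors the curl call above (same URL, headers, and JSON body); the function names build_dispatch_request and dispatch are illustrative, not part of the repository, and the response handling is an assumption about how you might want to use it.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
DISPATCH_URL = (
    "https://git.hisp.no/api/v1/repos/arr/blocklists"
    "/actions/workflows/sync.yml/dispatches"
)


def build_dispatch_request(token: str, ref: str = "main") -> urllib.request.Request:
    """Build (but do not send) the POST request that triggers the workflow."""
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        DISPATCH_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


def dispatch(token: str, ref: str = "main") -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_dispatch_request(token, ref)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Building the request separately from sending it makes it easy to inspect the URL and payload before any network call is made.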

What the workflow does

Step 1: checkout

Uses the standard actions/checkout@v3 action to check out the repository at the current HEAD of main, with no submodules, no LFS, and no special configuration.

Step 2: fetch and merge

Runs python3 scripts/merge_blocklists.py. The script:

  1. Fetches the upstream blacklist from https://raw.githubusercontent.com/Cleanuparr/Cleanuparr/main/blacklist.
  2. Reads blacklist.prev, blacklist, and whitelist from the checked-out repository.
  3. Performs the three-way merge and whitelist subtraction.
  4. Writes blacklist and blacklist.prev back to disk.

The script is idempotent: running it twice in a row with no upstream or whitelist changes produces no diff on the second run.

See Sync for the full algorithm.
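
As a rough illustration of what a three-way merge with whitelist subtraction can look like, here is a set-based sketch. It is not the code in scripts/merge_blocklists.py, and the Sync page remains authoritative. It assumes that blacklist.prev holds the upstream snapshot from the previous run and that entries can be treated as an unordered set of lines; the function name three_way_merge and the report keys are hypothetical.

```python
def three_way_merge(prev, current, upstream, whitelist):
    """Hypothetical set-based merge; not the actual merge_blocklists.py code.

    prev      -- assumed: the upstream snapshot saved by the last run
    current   -- the local blacklist, including manual additions
    upstream  -- the freshly fetched upstream blacklist
    whitelist -- entries that must never appear in the result
    """
    prev, current = set(prev), set(current)
    upstream, whitelist = set(upstream), set(whitelist)

    # Entries present locally that did not come from the previous snapshot.
    custom = current - prev

    report = {
        "upstream_added": upstream - prev,
        "upstream_removed": prev - upstream,
        "custom_preserved": custom - upstream - whitelist,
        "whitelist_stripped": upstream & whitelist,
    }
    merged = (upstream | custom) - whitelist
    # New blacklist, new snapshot for blacklist.prev, and the change report.
    return sorted(merged), sorted(upstream), report
```

Under these assumptions a second call with the outputs of the first and an unchanged upstream returns the same blacklist and snapshot, which matches the idempotency property described above.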

Step 3: commit and push if changed

git config user.name "gitea-actions"
git config user.email "actions@gitea"
git add .
git diff --staged --quiet || git commit -m "Sync blocklists from upstream"
git push

This commits and pushes only if the script actually changed something. The git diff --staged --quiet check returns non-zero when there are staged changes, which triggers the commit via ||. If nothing changed, git commit is skipped and the final git push is a no-op (push with no local commits ahead of the remote).
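
The exit-code logic behind the || operator can be spelled out in Python. This is a sketch of the same decision, not something the workflow runs; commit_if_changed is a hypothetical helper, and the run parameter exists only so the logic can be exercised without a real repository.

```python
import subprocess


def commit_if_changed(message: str, run=subprocess.run) -> bool:
    """Python equivalent of `git diff --staged --quiet || git commit -m ...`.

    Returns True when a commit was attempted, False when nothing was staged.
    """
    # Exit code 0 means nothing is staged; 1 means staged changes exist.
    diff = run(["git", "diff", "--staged", "--quiet"])
    if diff.returncode == 0:
        return False
    # Any non-zero exit falls through to the commit, matching `||`.
    run(["git", "commit", "-m", message], check=True)
    return True
```

Called as commit_if_changed("Sync blocklists from upstream") inside a repository, it creates a commit only when the index differs from HEAD.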

The commit author is always gitea-actions <actions@gitea>, regardless of who triggered the run. This makes automated syncs easy to distinguish from human commits in the history.

Permissions

The workflow runs with the automatic token that Gitea Actions injects into every job (Gitea's equivalent of GITHUB_TOKEN). This token has write access to the repository, which the commit-and-push step requires. No additional secrets are needed.

No external API tokens are needed -- the upstream blacklist is fetched from a public raw URL on raw.githubusercontent.com without authentication.

Monitoring

Checking recent runs

Go to Actions -> Sync blocklists from upstream in the Gitea UI. Each run shows:

  • Status (success / failure)
  • Trigger (schedule / manual dispatch)
  • Commit created (if any)
  • Full log output

Reading the log

The Python script prints four summary lines per run. These appear in the "Fetch and merge upstream files" step log:

[blacklist] Upstream added:     [...]
[blacklist] Upstream removed:   [...]
[blacklist] Custom preserved:   [...]
[blacklist] Whitelist stripped: [...]

Use these to verify the sync behaved as expected. "Whitelist stripped" should list every entry in your whitelist that was present in the upstream blacklist at fetch time.
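
If you want to check these lines programmatically, for example from a saved copy of the step log, a small parser is enough. The sketch below assumes each summary line has the shape "[file] Label: payload" shown above, with no timestamp prefix, and it leaves the payload as a raw string because the exact list format is not specified here. The name parse_summary is hypothetical.

```python
import re

# Matches lines such as "[blacklist] Upstream added:     [...]".
SUMMARY_LINE = re.compile(
    r"^\[(?P<file>[^\]]+)\]\s+(?P<label>[^:]+):\s*(?P<payload>.*)$"
)


def parse_summary(log_text: str) -> dict[str, str]:
    """Map each summary label to its raw, uninterpreted payload string."""
    summary = {}
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            summary[match["label"]] = match["payload"]
    return summary
```

Lines that do not match the pattern are ignored, so the full step log can be passed in unchanged.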

Run history in git log

Every automated commit uses the same author and the same message, so filtering the history is easy:

git log --author="gitea-actions" --oneline

Or to see commits that actually touched the blacklist:

git log --oneline -- blacklist

Failure modes

Upstream unreachable

If raw.githubusercontent.com is unreachable, urllib.request.urlopen raises URLError; if it responds with a 4xx or 5xx status, urlopen raises HTTPError. Either way the script exits non-zero and the workflow fails at the "Fetch and merge upstream files" step. No commit is made, no push happens, and the repository state is unchanged.

Retry the workflow manually once upstream is available again.
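
For reference, this is roughly how a fetch that fails the step on any network or HTTP error can be written. It is a sketch under the assumption that the body is UTF-8 text; the actual script may simply let the exception propagate, and fetch_text is a hypothetical name.

```python
import sys
import urllib.request


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Return the body at `url`, or exit non-zero if it cannot be fetched."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except OSError as exc:
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"fetch failed for {url}: {exc}")
```

Because the process exits with a non-zero status, the workflow step fails before anything is written to disk or committed.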

Script error

If the sync script crashes (malformed upstream, disk full, etc.), the step fails and no commit is made. Read the full step log to diagnose.

Push rejected

If someone pushes to main between the checkout and the push, the push is rejected (non-fast-forward). The workflow fails at the push step. No data is lost -- the next scheduled run will fetch the latest state and re-apply the sync.

Commit is empty

This is not a failure. The git diff --staged --quiet || git commit pattern explicitly skips the commit when nothing changed, and the subsequent git push is a no-op. The workflow reports success.

Disabling the scheduled run

To pause automatic syncing without removing the workflow entirely, comment out the schedule section in .gitea/workflows/sync.yml:

on:
  # schedule:
  #   - cron: '0 4 */7 * *'
  workflow_dispatch:

Manual dispatch still works. Uncomment to re-enable scheduling.