| project | type | status | tags | created | updated | path |
|---|---|---|---|---|---|---|
| production-deploy-march-2026 | session-notes | completed | | 2026-03-29 | 2026-03-29 | PBS/Tech/Projects/ |
Production Deploy - March 29, 2026
Overview
Deployed infrastructure updates to production Linode via Ansible, including new container configurations, database schema updates, and Gitea setup. Used Cloudflare Worker for maintenance mode during deploy.
Accomplishments
Cloudflare Worker Maintenance Page
- Created `pbs-maintain` Worker with PBS-branded maintenance page (Sunnie themed)
- Added IP bypass so Travis can access the site while maintenance page is active for public
- Worker URL: https://pbs-maintain.tjherbranson.workers.dev
- Toggle on/off by adding/removing Workers Route for `plantbasedsoutherner.com/*`
- Worker uses `CF-Connecting-IP` header to check allowed IPs
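The bypass decision the Worker makes can be modeled in a few lines. The deployed Worker runs JavaScript; this Python sketch only mirrors the logic so it can be run and tested locally. The allowlist address, the `route_request` name, and the `"origin"`/`"maintenance"` return values are illustrative assumptions, not taken from the Worker's source.

```python
# Sketch of the maintenance-mode decision, modeled in Python for illustration.
# The address below is from a documentation range, not the real allowlist.
ALLOWED_IPS = frozenset({"203.0.113.10"})


def route_request(headers: dict[str, str], allowed_ips: frozenset[str] = ALLOWED_IPS) -> str:
    """Return "origin" to pass the request through, or "maintenance" to serve the page."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Cloudflare sets CF-Connecting-IP to the visitor's address.
    client_ip = normalized.get("cf-connecting-ip", "").strip()
    if client_ip and client_ip in allowed_ips:
        return "origin"
    return "maintenance"
```

Requests without the header fall through to the maintenance page, so a missing or stripped header never grants access by accident.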
n8n Document Pipeline Fix
- Identified race condition: two emails processed simultaneously caused Gitea ref lock errors
- Root cause: Gitea Contents API creates a commit on every file create/update, so two simultaneous API calls create competing commits on `main`
- Fix: Added Loop node before Gitea create/update nodes to serialize file processing
- Key distinction: Loop node serializes items through the same path; Split In Batches chunks data — these are different nodes with different behaviors
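Outside n8n, the same serialization idea can be sketched in Python: funnel every Contents API call through one lock so only one commit is in flight on the branch at a time. The `SerializedWriter` class and the injected `put_file` callable are hypothetical names for illustration; the fix actually applied in this session was the n8n Loop node.

```python
import threading
from typing import Callable


class SerializedWriter:
    """Runs file writes one at a time so commits never race on the same branch."""

    def __init__(self, put_file: Callable[[str, bytes], None]) -> None:
        # put_file is a hypothetical callable that performs one create/update
        # request against the Gitea Contents API.
        self._put_file = put_file
        self._lock = threading.Lock()

    def write(self, path: str, content: bytes) -> None:
        # Only one thread can hold the lock, so the calls run strictly in sequence.
        with self._lock:
            self._put_file(path, content)
```

Callers on any number of threads can share one `SerializedWriter`; each `write` blocks until the previous one has returned, which is the property that avoids the ref lock errors.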
Production Deploy Process
- Cloned live Linode as backup before changes
- Ran Ansible playbook against the clone
- Deployed MySQL schema updates via phpMyAdmin (copy/paste with `IF NOT EXISTS`)
- Updated DNS in Cloudflare to point to new clone server
- Clone became the new production server; original kept as rollback backup
Container Fixes During Deploy
- pbs-api healthcheck: Replaced curl-based healthcheck with Python (`urllib.request`) since curl is not installed in the container
- Missing README.md: pbs-api build failed because `pyproject.toml` referenced a README.md that didn't exist; created an empty file
- MySQL memory limits: Added deploy block with 768M limit and tuning flags to compose
- WordPress memory limit: Added 2000M deploy limit
- Portainer stale container reference: Restarted Portainer to clear cached container IDs from pre-deploy
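A curl-free healthcheck of the kind described above can be sketched with only the standard library. The URL, port, and `/health` path are assumptions for illustration; the notes do not record which endpoint pbs-api actually exposes.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True when the endpoint answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and HTTP error statuses.
        return False


def main(url: str = "http://localhost:8000/health") -> int:
    """Map the check to an exit code: 0 for healthy, 1 for unhealthy."""
    # The default URL is a hypothetical placeholder.
    return 0 if is_healthy(url) else 1
```

A container healthcheck would run this as a script and pass the result of `main()` to `sys.exit`, since Docker treats exit code 0 as healthy and 1 as unhealthy.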
Gitea Production Setup
- Added DNS record for `gitea.plantbasedsoutherner.com` in Cloudflare
- Waited for DNS propagation before Traefik could issue Let's Encrypt cert
- Removed healthcheck from Gitea container (the healthcheck returns 404 before the setup wizard completes, and Traefik won't route to unhealthy containers)
- Completed setup wizard and created admin user
- Set `LANDING_PAGE = login` in `app.ini` (future task)
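The wait-for-propagation step can be sketched as a small polling helper: resolve the hostname until it returns the expected address, and only then recreate the container so Traefik retries the certificate request. The function names, the injectable `resolve` parameter, and the polling defaults are assumptions for illustration. It queries the local resolver, which may not match what the Let's Encrypt validators see.

```python
import socket
import time
from typing import Callable


def resolve_ipv4(hostname: str) -> set[str]:
    """Return the IPv4 addresses the local resolver currently reports."""
    try:
        infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    except socket.gaierror:
        # The name does not resolve yet.
        return set()
    return {info[4][0] for info in infos}


def wait_for_dns(
    hostname: str,
    expected_ip: str,
    attempts: int = 30,
    delay: float = 10.0,
    resolve: Callable[[str], set[str]] = resolve_ipv4,
) -> bool:
    """Poll until hostname resolves to expected_ip; return False if attempts run out."""
    for attempt in range(attempts):
        if expected_ip in resolve(hostname):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing a custom `resolve` callable makes the helper testable without real DNS and leaves room to swap in a query against a public resolver.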
WordPress Staging Redirect Issue
- After Ansible deploy, site redirected to staging domain
- Root cause: One Traefik router label had the staging URL; WordPress picked it up and wrote it to the `wp_options` table
- Fix: Updated `siteurl` and `home` in `wp_options` via phpMyAdmin, flushed Redis, purged Cloudflare cache
- Lesson: WordPress can auto-update `wp_options` URLs based on incoming request hostname
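A post-deploy check for this failure mode can be sketched as a comparison of the two URL options against the hostname production should serve. The `find_wrong_urls` helper is a hypothetical name, and the staging URL in the usage example is a placeholder; the option values are assumed to have been read from `wp_options` beforehand.

```python
from urllib.parse import urlsplit


def find_wrong_urls(options: dict[str, str], expected_host: str) -> dict[str, str]:
    """Return the siteurl/home entries whose hostname is not expected_host."""
    wrong = {}
    for name in ("siteurl", "home"):
        value = options.get(name, "")
        # A missing or malformed value yields an empty host and is flagged too.
        host = urlsplit(value).hostname or ""
        if host != expected_host.lower():
            wrong[name] = value
    return wrong
```

For example, `find_wrong_urls({"siteurl": "https://staging.example.com", "home": "https://plantbasedsoutherner.com"}, "plantbasedsoutherner.com")` reports only the `siteurl` entry.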
Key Learnings
- Cloudflare Worker IP bypass: `return fetch(request)` re-fetches through Cloudflare's network, so WordPress sees a Cloudflare edge IP instead of your real IP, which can trigger Wordfence lockouts
- Cloudflare caches 301 redirects: Always purge Cloudflare cache after fixing redirect issues
- Gitea API creates implicit commits: Every file create/update via the Contents API is a git commit — serialize multiple file operations to avoid ref lock errors
- Ansible staging drift: Fixes applied directly on staging don't make it back to Ansible automatically — fix on staging to unblock, but immediately update Ansible too
- DNS propagation for Let's Encrypt: Traefik can't issue certs until DNS propagates; recreate the container (`docker compose up -d --force-recreate`) to trigger a retry without restarting all of Traefik
- Wildcard DNS records: `*.staging` in Cloudflare catches all subdomains under staging automatically; convenient for staging, avoid on production
- `IF NOT EXISTS` MySQL warning: The "less efficient" warning about index generation during table creation is negligible for small schemas; keep the safety of `IF NOT EXISTS`
Still To Do
- n8n and database configuration on production
- Set Gitea landing page to login in `app.ini`
- Configure Gitea email settings (deferred)
- Add Gitea healthcheck back with wget-based check after setup is stable
- Delete old Linode backup server after stability verification (~1 week)
- Continue work on per-container Ansible playbook
- Update Ansible with any fixes applied directly to production during this session

...sent from Jenny & Travis