---
project: production-deploy-march-2026
type: session-notes
status: completed
tags:
  - production
  - docker
  - traefik
  - ansible
  - cloudflare
  - gitea
  - n8n
created: 2026-03-29
updated: 2026-03-29
path: PBS/Tech/Projects/
---

# Production Deploy - March 29, 2026

## Overview

Deployed infrastructure updates to the production Linode via Ansible, including new container configurations, database schema updates, and Gitea setup. Used a Cloudflare Worker for maintenance mode during the deploy.

## Accomplishments

### Cloudflare Worker Maintenance Page

- Created `pbs-maintain` Worker with a PBS-branded maintenance page (Sunnie themed)
- Added an IP bypass so Travis can access the site while the maintenance page is active for the public
- Worker URL: https://pbs-maintain.tjherbranson.workers.dev
- Toggle on/off by adding/removing the Workers Route for `plantbasedsoutherner.com/*`
- Worker uses the `CF-Connecting-IP` header to check allowed IPs

### n8n Document Pipeline Fix

- Identified a race condition: two emails processed simultaneously caused Gitea ref lock errors
- Root cause: the Gitea Contents API creates a commit on every file create/update, so two simultaneous API calls create competing commits on `main`
- Fix: added a Loop node before the Gitea create/update nodes to serialize file processing
- Key distinction: the Loop node serializes items through the same path; Split In Batches chunks data. These are different nodes with different behaviors

### Production Deploy Process

- Cloned the live Linode as a backup before making changes
- Ran the Ansible playbook against the clone
- Deployed MySQL schema updates via phpMyAdmin (copy/paste with `IF NOT EXISTS`)
- Updated DNS in Cloudflare to point to the new clone server
- The clone became the new production server; the original is kept as a rollback backup

### Container Fixes During Deploy

- **pbs-api healthcheck**: Replaced the curl-based healthcheck with Python (`urllib.request`) since curl is not installed in the container
- **Missing README.md**: pbs-api build failed because `pyproject.toml` referenced a README.md that didn't exist; created an empty file
- **MySQL memory limits**: Added a deploy block with a 768M limit and tuning flags to compose
- **WordPress memory limit**: Added a 2000M deploy limit
- **Portainer stale container reference**: Restarted Portainer to clear cached container IDs from before the deploy

### Gitea Production Setup

- Added a DNS record for `gitea.plantbasedsoutherner.com` in Cloudflare
- Waited for DNS propagation before Traefik could issue the Let's Encrypt cert
- Removed the healthcheck from the Gitea container (the healthcheck returns 404 before the setup wizard completes, and Traefik won't route to unhealthy containers)
- Completed the setup wizard and created the admin user
- Set `LANDING_PAGE = login` in `app.ini` (future task; tracked under Still To Do)

### WordPress Staging Redirect Issue

- After the Ansible deploy, the site redirected to the staging domain
- Root cause: one Traefik router label had the staging URL; WordPress picked it up and wrote it to the `wp_options` table
- Fix: updated `siteurl` and `home` in `wp_options` via phpMyAdmin, flushed Redis, purged the Cloudflare cache
- Lesson: WordPress can auto-update `wp_options` URLs based on the incoming request hostname

## Key Learnings

- **Cloudflare Worker IP bypass**: `return fetch(request)` re-fetches through Cloudflare's network, so WordPress sees a Cloudflare edge IP instead of your real IP, which can trigger Wordfence lockouts
- **Cloudflare caches 301 redirects**: Always purge the Cloudflare cache after fixing redirect issues
- **Gitea API creates implicit commits**: Every file create/update via the Contents API is a git commit, so serialize multiple file operations to avoid ref lock errors
- **Ansible staging drift**: Fixes applied directly on staging don't make it back to Ansible automatically. Fix on staging to unblock, but immediately update Ansible too
- **DNS propagation for Let's Encrypt**: Traefik can't issue certs until DNS propagates. Recreate the container (`docker compose up -d --force-recreate`) to trigger a retry without restarting all of Traefik
- **Wildcard DNS records**: `*.staging` in Cloudflare catches all subdomains under staging automatically. Convenient for staging, avoid on production
- **`IF NOT EXISTS` MySQL warning**: The "less efficient" warning about index generation during table creation is negligible for small schemas, so keep the safety of `IF NOT EXISTS`

## Still To Do

- [ ] n8n and database configuration on production
- [ ] Set Gitea landing page to login in `app.ini`
- [ ] Configure Gitea email settings (deferred)
- [ ] Add Gitea healthcheck back with wget-based check after setup is stable
- [ ] Delete old Linode backup server after stability verification (~1 week)
- [ ] Continue work on per-container Ansible playbook
- [ ] Update Ansible with any fixes applied directly to production during this session

...sent from Jenny & Travis
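
### Appendix: Sketch of the Python Healthcheck

The pbs-api healthcheck fix above swapped curl for Python's `urllib.request`. The notes don't record the actual script, so the following is only a minimal sketch of how such a check might look. The URL in `DEFAULT_URL`, the `/health` path, and the function names `is_healthy` and `main` are all assumptions for illustration, not the values used in production.

```python
import urllib.error
import urllib.request

# Hypothetical endpoint: the real pbs-api port and path are not recorded in these notes.
DEFAULT_URL = "http://localhost:8000/health"


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx responses;
        # refused connections and timeouts also land here.
        return False


def main(argv: list[str]) -> int:
    """Map the probe result to a Docker-style exit code (0 healthy, 1 unhealthy)."""
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    return 0 if is_healthy(url) else 1
```

To use this as a container healthcheck, the script would finish by passing the return value of `main(sys.argv)` to `sys.exit`, since Docker treats exit code 0 as healthy and 1 as unhealthy. Relying only on the standard library is the point of the design: it works in any image that already ships Python, with no need to install curl or wget.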