Create production-deploy-march-2026.md via n8n
Parent: eb1a56ea62
Commit: bb0699b792
File: PBS/Tech/Projects/production-deploy-march-2026.md (new file, 116 lines)

---
project: production-deploy-march-2026
type: session-notes
status: completed
tags:
  - production
  - docker
  - traefik
  - ansible
  - cloudflare
  - gitea
  - n8n
created: 2026-03-29
updated: 2026-03-29
path: PBS/Tech/Projects/
---

# Production Deploy - March 29, 2026

## Overview
Deployed infrastructure updates to production Linode via Ansible, including
new container configurations, database schema updates, and Gitea setup.
Used Cloudflare Worker for maintenance mode during deploy.

## Accomplishments

### Cloudflare Worker Maintenance Page
- Created `pbs-maintain` Worker with PBS-branded maintenance page
  (Sunnie-themed)
- Added IP bypass so Travis can access the site while the maintenance page is
  active for the public
- Worker URL: https://pbs-maintain.tjherbranson.workers.dev
- Toggle on/off by adding/removing the Workers Route for
  `plantbasedsoutherner.com/*`
- Worker uses the `CF-Connecting-IP` header to check allowed IPs
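
The bypass decision described above can be sketched as a small pure function. This is written in Python only to illustrate the logic; the Worker's actual source is not included in these notes. The function name, the return labels, and the allow-listed address (a documentation-range IP) are all assumptions.

```python
def choose_response(headers, allowed_ips):
    """Return "origin" for allow-listed visitors and "maintenance" for everyone else."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    client_ip = normalized.get("cf-connecting-ip", "").strip()
    if client_ip and client_ip in allowed_ips:
        return "origin"
    return "maintenance"


# Hypothetical allow list; 203.0.113.0/24 is reserved for documentation.
ALLOWED_IPS = {"203.0.113.7"}

print(choose_response({"CF-Connecting-IP": "203.0.113.7"}, ALLOWED_IPS))    # origin
print(choose_response({"CF-Connecting-IP": "198.51.100.20"}, ALLOWED_IPS))  # maintenance
```

A request with no `CF-Connecting-IP` header falls through to the maintenance page, which is the safer default while the route is active.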

### n8n Document Pipeline Fix
- Identified race condition: two emails processed simultaneously caused
  Gitea ref lock errors
- Root cause: the Gitea Contents API creates a commit on every file
  create/update, so two simultaneous API calls create competing commits on
  `main`
- Fix: added a Loop node before the Gitea create/update nodes to serialize
  file processing
- Key distinction: the Loop node serializes items through the same path, while
  Split In Batches chunks data; these are different nodes with different
  behaviors
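
The idea behind the fix, keeping only one file write in flight at a time, can be sketched outside n8n with a lock. This is a minimal illustration under stated assumptions, not the workflow itself: `FakeContentsApi` is a hypothetical stand-in that records how many writes overlap, and no real Gitea call is made.

```python
import threading
import time


class FakeContentsApi:
    """Hypothetical stand-in for an API where every file write is one commit on main."""

    def __init__(self):
        self.commits = []
        self.in_flight = 0
        self.max_in_flight = 0

    def put_file(self, path, content):
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        time.sleep(0.01)  # simulate the window in which the ref would be locked
        self.commits.append(path)
        self.in_flight -= 1


class SerializedWriter:
    """Funnels all writes through one lock so commits never overlap."""

    def __init__(self, api):
        self._api = api
        self._lock = threading.Lock()

    def write(self, path, content):
        with self._lock:
            self._api.put_file(path, content)


def write_concurrently(writer, paths):
    """Start one thread per file, mimicking emails that arrive at the same time."""
    threads = [threading.Thread(target=writer.write, args=(p, "body")) for p in paths]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


api = FakeContentsApi()
write_concurrently(SerializedWriter(api), ["a.md", "b.md", "c.md"])
print(api.max_in_flight)  # 1
print(len(api.commits))   # 3
```

An in-process lock only serializes callers inside one process; the Loop node achieves the same effect inside a single workflow execution.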

### Production Deploy Process
- Cloned live Linode as backup before changes
- Ran Ansible playbook against the clone
- Deployed MySQL schema updates via phpMyAdmin (copy/paste with
  `IF NOT EXISTS`)
- Updated DNS in Cloudflare to point to new clone server
- Clone became the new production server; original kept as rollback backup
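
Why `IF NOT EXISTS` makes a pasted schema script safe to run more than once can be shown with a short sketch. It uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs anywhere; the `recipes` table is hypothetical and is not taken from the real schema.

```python
import sqlite3

# Hypothetical schema; the real MySQL tables are not shown in these notes.
SCHEMA = """
CREATE TABLE IF NOT EXISTS recipes (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
"""


def apply_schema(conn):
    """Run the schema script; existing tables are left untouched."""
    conn.executescript(SCHEMA)


def table_names(conn):
    """List user tables so the effect of each run can be inspected."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
apply_schema(conn)
conn.execute("INSERT INTO recipes (title) VALUES ('collard greens')")
apply_schema(conn)  # second run neither fails nor drops existing rows

print(table_names(conn))                                           # ['recipes']
print(conn.execute("SELECT COUNT(*) FROM recipes").fetchone()[0])  # 1
```

Without `IF NOT EXISTS`, the second run would raise an error because the table already exists.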

### Container Fixes During Deploy
- **pbs-api healthcheck**: Replaced the curl-based healthcheck with a Python
  one (`urllib.request`) because curl is not installed in the container
- **Missing README.md**: pbs-api build failed because `pyproject.toml`
  referenced a README.md that didn't exist; created an empty file
- **MySQL memory limits**: Added deploy block with 768M limit and tuning
  flags to compose
- **WordPress memory limit**: Added 2000M deploy limit
- **Portainer stale container reference**: Restarted Portainer to clear
  cached container IDs from pre-deploy
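
A curl-free healthcheck along the lines of the pbs-api fix might look like the sketch below. The endpoint URL, the timeout, and the function names are assumptions, and the `opener` parameter exists only so the logic can be exercised without a running service.

```python
import urllib.error
import urllib.request

# Hypothetical endpoint; the real pbs-api port and path are not given in these notes.
DEFAULT_URL = "http://localhost:8000/health"


def is_healthy(url=DEFAULT_URL, timeout=3.0, opener=urllib.request.urlopen):
    """Return True only when the endpoint answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses,
        # which urlopen raises as HTTPError (a subclass of URLError).
        return False


def exit_code(url=DEFAULT_URL):
    """Map health to a process exit code: 0 for healthy, 1 for unhealthy."""
    return 0 if is_healthy(url) else 1
```

Docker treats a healthcheck command that exits with 0 as healthy and one that exits with 1 as unhealthy, so a wrapper script would pass `exit_code()` to `sys.exit`.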

### Gitea Production Setup
- Added DNS record for `gitea.plantbasedsoutherner.com` in Cloudflare
- Waited for DNS propagation before Traefik could issue Let's Encrypt cert
- Removed healthcheck from Gitea container (the healthcheck returns 404 before
  the setup wizard completes, and Traefik won't route to unhealthy containers)
- Completed setup wizard and created admin user
- Not yet done: set `LANDING_PAGE = login` in `app.ini` (tracked under Still
  To Do)
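
Waiting for the new record to resolve before recreating the container could be scripted roughly as follows. This is a sketch under assumptions: the attempt count and delay are arbitrary, and a record that resolves locally is not guaranteed to be visible yet to the certificate authority's resolvers.

```python
import socket
import time


def resolved_ips(hostname, resolver=socket.getaddrinfo):
    """Return the set of addresses the hostname currently resolves to."""
    try:
        infos = resolver(hostname, 443)
    except socket.gaierror:
        return set()  # name does not resolve yet
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def wait_for_dns(hostname, attempts=10, delay=30.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Poll until the hostname resolves; return False if it never does."""
    for attempt in range(attempts):
        if resolved_ips(hostname, resolver):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_dns("gitea.plantbasedsoutherner.com")` and recreating the container only after it returns `True` avoids burning certificate attempts on a name that cannot be validated yet.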

### WordPress Staging Redirect Issue
- After Ansible deploy, site redirected to staging domain
- Root cause: one Traefik router label had the staging URL; WordPress picked
  it up and wrote it to the `wp_options` table
- Fix: updated `siteurl` and `home` in `wp_options` via phpMyAdmin, flushed
  Redis, purged Cloudflare cache
- Lesson: WordPress can auto-update `wp_options` URLs based on incoming
  request hostname
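
A pre-deploy check for stray hostnames in Traefik router labels is one way to catch this class of mistake early. The sketch below is an assumption-laden illustration: the router names, the `www` host, and `staging.example.com` are hypothetical, and only plain `Host(...)` matchers in `.rule` labels are inspected.

```python
import re

HOST_RULE = re.compile(r"Host\(([^)]*)\)")  # arguments of each Host(...) matcher
BACKTICKED = re.compile(r"`([^`]+)`")       # each backtick-quoted hostname


def unexpected_hosts(labels, allowed_hosts):
    """Map each router rule label to the hostnames that are not allow-listed."""
    found = {}
    for key, value in labels.items():
        if not key.endswith(".rule"):
            continue
        for args in HOST_RULE.findall(value):
            for host in BACKTICKED.findall(args):
                if host not in allowed_hosts:
                    found.setdefault(key, []).append(host)
    return found


# Hypothetical labels; the real compose labels are not shown in these notes.
labels = {
    "traefik.http.routers.wordpress.rule":
        "Host(`plantbasedsoutherner.com`) || Host(`www.plantbasedsoutherner.com`)",
    "traefik.http.routers.wordpress-admin.rule": "Host(`staging.example.com`)",
    "traefik.http.routers.wordpress.entrypoints": "websecure",
}
allowed = {"plantbasedsoutherner.com", "www.plantbasedsoutherner.com"}

print(unexpected_hosts(labels, allowed))
# {'traefik.http.routers.wordpress-admin.rule': ['staging.example.com']}
```

An empty result means every inspected rule points at an expected production hostname.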

## Key Learnings

- **Cloudflare Worker IP bypass**: `return fetch(request)` re-fetches
  through Cloudflare's network, so WordPress sees a Cloudflare edge IP
  instead of your real IP, which can trigger Wordfence lockouts
- **Cloudflare caches 301 redirects**: Always purge Cloudflare cache after
  fixing redirect issues
- **Gitea API creates implicit commits**: Every file create/update via the
  Contents API is a git commit, so serialize multiple file operations to avoid
  ref lock errors
- **Ansible staging drift**: Fixes applied directly on staging don't make
  it back to Ansible automatically. Fix on staging to unblock, but
  immediately update Ansible too
- **DNS propagation for Let's Encrypt**: Traefik can't issue certs until
  DNS propagates. Recreate the container
  (`docker compose up -d --force-recreate`) to trigger a retry without
  restarting all of Traefik
- **Wildcard DNS records**: `*.staging` in Cloudflare catches all
  subdomains under staging automatically. Convenient for staging, avoid on
  production
- **`IF NOT EXISTS` MySQL warning**: The "less efficient" warning about
  index generation during table creation is negligible for small schemas, so
  keep the safety of `IF NOT EXISTS`

## Still To Do
- [ ] n8n and database configuration on production
- [ ] Set Gitea landing page to login in `app.ini`
- [ ] Configure Gitea email settings (deferred)
- [ ] Add Gitea healthcheck back with wget-based check after setup is stable
- [ ] Delete old Linode backup server after stability verification (~1 week)
- [ ] Continue work on per-container Ansible playbook
- [ ] Update Ansible with any fixes applied directly to production during
  this session

...sent from Jenny & Travis