We needed a CMS that let writers use Markdown in Git, gave developers SQL query power, and cost $0/month. So we built one.
I know that sounds like the opening of every "we built our own tool" blog post that ends with "and it was a terrible idea." This one does not end that way. Our Git-to-Supabase content pipeline has been running in production for 5 months, syncing 170+ blog posts, 12 project case studies, and 6 team bios. It has cost us exactly $0 in infrastructure. And it handles the one thing that no existing CMS got right for our workflow: content authored in Git, queryable with SQL.
Let me walk you through the entire system -- the architecture, the sync script with full code, the database schema, the Edge Functions, and the GitHub Actions workflow. This is the post I wish existed when we were building this.
The Problem We Were Solving
Content management systems exist on a spectrum. On one end, you have headless CMS platforms like Contentful and Sanity -- great visual editors, API-first, but they cost $99-300/month for teams and lock your content into their platform. On the other end, you have Git-based content (Markdown files in a repo) -- free, version-controlled, developer-friendly, but terrible for dynamic queries.
Here is what I mean by "terrible for dynamic queries." If your blog posts are Markdown files and you want to answer "show me the 10 most recent posts tagged 'AI' sorted by date," you need to:
- Read every Markdown file from disk
- Parse the frontmatter of each one
- Filter by tag
- Sort by date
- Take the first 10
For 50 posts, this takes milliseconds. For 500 posts, it starts to slow down. For any kind of search, pagination, or complex filtering, you are building a query engine on top of a file system. That is a database's job.
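Concretely, the file-based version of that query looks something like this (an in-memory sketch, with hypothetical post objects standing in for parsed frontmatter):

```javascript
// Hypothetical in-memory version of the "query a folder of Markdown" steps.
// In the real file-based approach, each post object would come from reading
// a .md file from disk and parsing its frontmatter.
function recentPostsByTag(posts, tag, limit) {
  return posts
    .filter((p) => p.tags.includes(tag))   // filter by tag
    .sort((a, b) => b.pubDate - a.pubDate) // sort by date, newest first
    .slice(0, limit);                      // take the first N
}

const posts = [
  { slug: 'a', tags: ['AI'], pubDate: new Date('2024-01-01') },
  { slug: 'b', tags: ['Web'], pubDate: new Date('2024-02-01') },
  { slug: 'c', tags: ['AI'], pubDate: new Date('2024-03-01') },
];

console.log(recentPostsByTag(posts, 'AI', 10).map((p) => p.slug)); // → ['c', 'a']
```

Every one of those steps is work a database already does, with indexes, in one statement.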
We wanted both: Git as the authoring experience and a database as the query layer. The content pipeline bridges the two.
The Architecture
Here is how every piece fits together:
┌──────────────────────────────────────────────────────────────────┐
│ CONTENT AUTHORS │
│ Write .mdx (blog) and .md (projects/team) files in VS Code │
│ Push to codercops-agency-content repo on GitHub │
└──────────────────┬───────────────────────────────────────────────┘
│
│ git push to main branch
v
┌──────────────────────────────────────────────────────────────────┐
│ GITHUB ACTIONS WORKFLOW │
│ │
│ Step 1: Validate Content │
│ ├── Check frontmatter (required fields, correct types) │
│ ├── Detect duplicate slugs │
│ └── Fail pipeline if any validation errors │
│ │
│ Step 2: Sync to Supabase │
│ ├── Compute SHA-256 hash of each content file │
│ ├── Fetch existing hashes from Supabase │
│ ├── Compare hashes (delta detection) │
│ ├── For changed files only: │
│ │ ├── Parse frontmatter with gray-matter │
│ │ ├── Render Markdown/MDX to HTML with marked + highlight.js │
│ │ ├── Extract h2/h3 headings for table of contents │
│ │ ├── Upload images to Supabase Storage │
│ │ └── Upsert row to Supabase database │
│ └── Log sync results (synced N, skipped M) │
│ │
│ Step 3: Trigger Vercel Deploy (optional) │
│ └── Hit Vercel deploy hook to bust SSR cache │
└──────────────────┬───────────────────────────────────────────────┘
│
v
┌──────────────────────────────────────────────────────────────────┐
│ SUPABASE (Project: hfyczaxyaafgxntbodpy) │
│ │
│ PostgreSQL Database │
│ ├── blog_posts (170+ rows) │
│ ├── projects (12 rows) │
│ ├── team_members (6 rows) │
│ └── page_views (analytics) │
│ │
│ Edge Functions │
│ ├── newsletter-subscribe (handles email signups) │
│ └── increment-page-views (tracks page analytics) │
│ │
│ Storage │
│ └── content-images/ (blog and project images) │
│ │
│ Row Level Security │
│ └── Public can read published posts only (pub_date <= now()) │
└──────────────────┬───────────────────────────────────────────────┘
│
│ Supabase JS client queries
v
┌──────────────────────────────────────────────────────────────────┐
│ WEBSITE (Astro 5 on Vercel) │
│ │
│ src/lib/supabase.ts → Supabase client singleton │
│ src/lib/content.ts → Data access layer (getAllBlogPosts, etc.) │
│ src/pages/blog/[slug].astro → SSR, queries Supabase per request │
│ │
│ Deployed to Vercel with SSR (output: 'server') │
│ Edge caching: s-maxage=300, stale-while-revalidate=600 │
└──────────────────────────────────────────────────────────────────┘

Three systems, one content flow: Git to Actions to Supabase to website. Let me walk through each piece in detail.
The Database Schema
The schema is straightforward. We have four tables.
blog_posts
This is the main table. Each row is a blog post.
CREATE TABLE blog_posts (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
slug TEXT UNIQUE NOT NULL,
title TEXT NOT NULL,
description TEXT,
content_html TEXT NOT NULL,
pub_date TIMESTAMPTZ NOT NULL,
author TEXT NOT NULL DEFAULT 'Anurag Verma',
image TEXT,
tags TEXT[] DEFAULT '{}',
category TEXT,
subcategory TEXT,
featured BOOLEAN DEFAULT false,
headings JSONB DEFAULT '[]',
content_hash TEXT,
reading_time INTEGER,
word_count INTEGER,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);
-- Indexes for common query patterns
CREATE INDEX idx_blog_posts_pub_date ON blog_posts(pub_date DESC);
CREATE INDEX idx_blog_posts_tags ON blog_posts USING gin(tags);
CREATE INDEX idx_blog_posts_category ON blog_posts(category);
CREATE INDEX idx_blog_posts_featured ON blog_posts(featured)
WHERE featured = true;
-- No separate slug index needed: the UNIQUE constraint already creates one.

A few design decisions worth noting:
content_html instead of raw Markdown. We render Markdown to HTML in the sync script, not at request time. This means Supabase stores pre-rendered HTML, and the website just serves it. No Markdown parsing on every page load.
content_hash for delta sync. This SHA-256 hash of the raw file content lets the sync script know whether a file has changed. If the hash matches what is in the database, we skip the file.
headings as JSONB. The table of contents is extracted during sync and stored as a JSON array. No parsing needed at request time.
tags as TEXT[]. Postgres array type lets us query with @> (contains) operator: WHERE tags @> ARRAY['AI']. The GIN index makes this fast.
reading_time and word_count. Computed during sync. Small detail, but it means the website never has to count words.
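With this schema in place, the question from the opening -- the 10 most recent posts tagged 'AI' -- collapses into a single indexed query:

```sql
SELECT slug, title, pub_date, reading_time
FROM blog_posts
WHERE tags @> ARRAY['AI']   -- served by the GIN index on tags
ORDER BY pub_date DESC      -- served by the pub_date index
LIMIT 10;
```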
projects
CREATE TABLE projects (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
slug TEXT UNIQUE NOT NULL,
title TEXT NOT NULL,
description TEXT,
content_html TEXT NOT NULL,
client TEXT,
tech_stack TEXT[] DEFAULT '{}',
category TEXT,
image TEXT,
featured BOOLEAN DEFAULT false,
sort_order INTEGER DEFAULT 0,
content_hash TEXT,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);

team_members
CREATE TABLE team_members (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
slug TEXT UNIQUE NOT NULL,
name TEXT NOT NULL,
role TEXT NOT NULL,
bio TEXT,
content_html TEXT,
avatar TEXT,
social_links JSONB DEFAULT '{}',
sort_order INTEGER DEFAULT 0,
content_hash TEXT,
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()
);

page_views
CREATE TABLE page_views (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
page_slug TEXT UNIQUE NOT NULL,
view_count INTEGER DEFAULT 0,
last_viewed_at TIMESTAMPTZ DEFAULT now()
);
-- RPC function for atomic increment
CREATE OR REPLACE FUNCTION increment_page_views(page_slug TEXT)
RETURNS void AS $$
BEGIN
INSERT INTO page_views (page_slug, view_count, last_viewed_at)
VALUES (page_slug, 1, now())
ON CONFLICT (page_slug)
DO UPDATE SET
view_count = page_views.view_count + 1,
last_viewed_at = now();
END;
$$ LANGUAGE plpgsql;

The increment_page_views function uses Postgres's INSERT ... ON CONFLICT (upsert) to atomically increment the view count. No race conditions, no read-modify-write cycle.
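Calling it is a single statement (the slug here is hypothetical); the Edge Function later in this post invokes the same function through supabase.rpc:

```sql
SELECT increment_page_views('blog/my-first-post');
```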
Row Level Security
-- Enable RLS on all tables
ALTER TABLE blog_posts ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE team_members ENABLE ROW LEVEL SECURITY;
ALTER TABLE page_views ENABLE ROW LEVEL SECURITY;
-- Public can read published blog posts
CREATE POLICY "read_published_posts"
ON blog_posts FOR SELECT
USING (pub_date <= now());
-- Public can read all projects
CREATE POLICY "read_projects"
ON projects FOR SELECT
USING (true);
-- Public can read all team members
CREATE POLICY "read_team"
ON team_members FOR SELECT
USING (true);
-- Public can read page view counts
CREATE POLICY "read_page_views"
ON page_views FOR SELECT
USING (true);
-- Only service role can write (sync script uses service role key)
-- No INSERT/UPDATE/DELETE policies for anon = no public writes

RLS is our authorization layer. The website uses the anon key, which can only read published content. The sync script uses the service_role key, which bypasses RLS and can write. This means even if someone extracts the anon key from our frontend code, they cannot write to the database or read draft posts.
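A quick way to smoke-test the policies from the SQL editor is to run a query as the anon role inside a rolled-back transaction (a sketch; assumes Supabase's standard anon role):

```sql
BEGIN;
SET LOCAL ROLE anon;

-- Future-dated (draft) posts should be invisible: expect a count of 0
SELECT count(*) AS leaked_drafts
FROM blog_posts
WHERE pub_date > now();

ROLLBACK;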
The Sync Script: Full Walkthrough
This is the core of the pipeline. It runs in GitHub Actions on every push to the content repo's main branch.
// scripts/sync-to-supabase.js
import { createHash } from 'crypto';
import { createClient } from '@supabase/supabase-js';
import { marked } from 'marked';
import { markedHighlight } from 'marked-highlight';
import hljs from 'highlight.js';
import matter from 'gray-matter';
import * as fs from 'fs';
import * as path from 'path';

// ─── Configuration ─────────────────────────────────────────────
const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_SERVICE_ROLE_KEY
);

// Syntax highlighting: marked removed its built-in `highlight` option in v5,
// so highlighting goes through the marked-highlight extension instead.
marked.use(
  markedHighlight({
    highlight: (code, lang) => {
      if (lang && hljs.getLanguage(lang)) {
        try {
          return hljs.highlight(code, { language: lang }).value;
        } catch (err) {
          console.warn(`Highlight error for language ${lang}:`, err.message);
        }
      }
      return hljs.highlightAuto(code).value;
    },
  })
);

const renderer = new marked.Renderer();

// Add IDs to headings for anchor links (marked v13+ token-object signature)
renderer.heading = function ({ text, depth }) {
  const id = text
    .toLowerCase()
    .replace(/[^\w\s-]/g, '')
    .replace(/\s+/g, '-')
    .replace(/-+/g, '-')
    .trim();
  return `<h${depth} id="${id}">${text}</h${depth}>`;
};

marked.setOptions({
  renderer,
  gfm: true,
  breaks: false,
});
// ─── Utility Functions ─────────────────────────────────────────
function computeHash(content) {
return createHash('sha256').update(content).digest('hex');
}
function computeReadingTime(text) {
const wordsPerMinute = 200;
const words = text.split(/\s+/).filter(Boolean).length;
return Math.ceil(words / wordsPerMinute);
}
function countWords(text) {
return text.split(/\s+/).filter(Boolean).length;
}
function extractHeadings(html) {
const headings = [];
const regex = /<h([23])\s+id="([^"]*)"[^>]*>([\s\S]*?)<\/h\1>/gi;
let match;
while ((match = regex.exec(html)) !== null) {
headings.push({
level: parseInt(match[1], 10),
id: match[2],
text: match[3].replace(/<[^>]*>/g, '').trim(),
});
}
return headings;
}
function stripMarkdown(text) {
return text
.replace(/```[\s\S]*?```/g, '') // Remove code blocks
.replace(/`[^`]*`/g, '') // Remove inline code
.replace(/!\[.*?\]\(.*?\)/g, '') // Remove images
.replace(/\[.*?\]\(.*?\)/g, '') // Remove links (keep text)
.replace(/#{1,6}\s+/g, '') // Remove heading markers
.replace(/[*_~]+/g, '') // Remove emphasis markers
.replace(/>\s+/g, '') // Remove blockquotes
.replace(/\n{2,}/g, '\n') // Collapse newlines
.trim();
}
// ─── Sync Functions ────────────────────────────────────────────
async function getExistingHashes(table) {
const { data, error } = await supabase
.from(table)
.select('slug, content_hash');
if (error) {
console.error(`Failed to fetch hashes from ${table}:`, error);
process.exit(1);
}
return new Map((data || []).map((row) => [row.slug, row.content_hash]));
}
async function syncBlogPosts() {
console.log('\n--- Syncing blog posts ---');
const blogDir = path.join(process.cwd(), 'blog');
if (!fs.existsSync(blogDir)) {
console.log('No blog directory found, skipping.');
return;
}
const files = fs
.readdirSync(blogDir)
.filter((f) => f.endsWith('.mdx') || f.endsWith('.md'));
console.log(`Found ${files.length} blog files`);
const existingHashes = await getExistingHashes('blog_posts');
let synced = 0;
let skipped = 0;
let errors = 0;
for (const file of files) {
const filePath = path.join(blogDir, file);
const raw = fs.readFileSync(filePath, 'utf-8');
const hash = computeHash(raw);
const slug = file.replace(/\.(mdx|md)$/, '');
// Delta check: skip unchanged files
if (existingHashes.get(slug) === hash) {
skipped++;
continue;
}
try {
// Parse frontmatter and content
const { data: fm, content } = matter(raw);
// Strip markdown for word count and reading time
const plainText = stripMarkdown(content);
const wordCount = countWords(plainText);
const readingTime = computeReadingTime(plainText);
// Render content to HTML
const contentHtml = marked.parse(content);
// Extract headings for table of contents
const headings = extractHeadings(contentHtml);
// Upsert to Supabase
const { error } = await supabase.from('blog_posts').upsert(
{
slug,
title: fm.title,
description: fm.description || '',
content_html: contentHtml,
pub_date: fm.pubDate
? new Date(fm.pubDate).toISOString()
: new Date().toISOString(),
author: fm.author || 'Anurag Verma',
image: fm.image || null,
tags: fm.tags || [],
category: fm.category || null,
subcategory: fm.subcategory || null,
featured: fm.featured || false,
headings, // JSONB column: pass the array itself (stringifying would store a JSON-encoded string)
content_hash: hash,
reading_time: readingTime,
word_count: wordCount,
updated_at: new Date().toISOString(),
},
{ onConflict: 'slug' }
);
if (error) {
console.error(` FAILED: ${slug} - ${error.message}`);
errors++;
} else {
console.log(` Synced: ${slug} (${wordCount} words, ${readingTime} min read)`);
synced++;
}
} catch (err) {
console.error(` ERROR processing ${slug}:`, err.message);
errors++;
}
}
console.log(
`Blog posts: ${synced} synced, ${skipped} unchanged, ${errors} errors`
);
if (errors > 0) {
process.exit(1);
}
}
async function syncProjects() {
console.log('\n--- Syncing projects ---');
const projectDir = path.join(process.cwd(), 'projects');
if (!fs.existsSync(projectDir)) {
console.log('No projects directory found, skipping.');
return;
}
const files = fs
.readdirSync(projectDir)
.filter((f) => f.endsWith('.md'));
console.log(`Found ${files.length} project files`);
const existingHashes = await getExistingHashes('projects');
let synced = 0;
let skipped = 0;
for (const file of files) {
const filePath = path.join(projectDir, file);
const raw = fs.readFileSync(filePath, 'utf-8');
const hash = computeHash(raw);
const slug = file.replace(/\.md$/, '');
if (existingHashes.get(slug) === hash) {
skipped++;
continue;
}
const { data: fm, content } = matter(raw);
const contentHtml = marked.parse(content);
const { error } = await supabase.from('projects').upsert(
{
slug,
title: fm.title,
description: fm.description || '',
content_html: contentHtml,
client: fm.client || null,
tech_stack: fm.techStack || [],
category: fm.category || null,
image: fm.image || null,
featured: fm.featured || false,
sort_order: fm.sortOrder || 0,
content_hash: hash,
updated_at: new Date().toISOString(),
},
{ onConflict: 'slug' }
);
if (error) {
console.error(` FAILED: ${slug} - ${error.message}`);
process.exit(1);
}
console.log(` Synced: ${slug}`);
synced++;
}
console.log(`Projects: ${synced} synced, ${skipped} unchanged`);
}
async function syncTeamMembers() {
console.log('\n--- Syncing team members ---');
const teamDir = path.join(process.cwd(), 'team');
if (!fs.existsSync(teamDir)) {
console.log('No team directory found, skipping.');
return;
}
const files = fs
.readdirSync(teamDir)
.filter((f) => f.endsWith('.md'));
console.log(`Found ${files.length} team member files`);
const existingHashes = await getExistingHashes('team_members');
let synced = 0;
let skipped = 0;
for (const file of files) {
const filePath = path.join(teamDir, file);
const raw = fs.readFileSync(filePath, 'utf-8');
const hash = computeHash(raw);
const slug = file.replace(/\.md$/, '');
if (existingHashes.get(slug) === hash) {
skipped++;
continue;
}
const { data: fm, content } = matter(raw);
const contentHtml = marked.parse(content);
const { error } = await supabase.from('team_members').upsert(
{
slug,
name: fm.name,
role: fm.role,
bio: fm.bio || '',
content_html: contentHtml,
avatar: fm.avatar || null,
social_links: fm.socialLinks || {},
sort_order: fm.sortOrder || 0,
content_hash: hash,
updated_at: new Date().toISOString(),
},
{ onConflict: 'slug' }
);
if (error) {
console.error(` FAILED: ${slug} - ${error.message}`);
process.exit(1);
}
console.log(` Synced: ${slug}`);
synced++;
}
console.log(`Team members: ${synced} synced, ${skipped} unchanged`);
}
// ─── Main ──────────────────────────────────────────────────────
async function main() {
console.log('Starting content sync...');
console.log(`Supabase URL: ${process.env.SUPABASE_URL}`);
const startTime = Date.now();
await syncBlogPosts();
await syncProjects();
await syncTeamMembers();
const elapsed = ((Date.now() - startTime) / 1000).toFixed(1);
console.log(`\nSync complete in ${elapsed}s`);
}
main().catch((err) => {
console.error('Sync failed:', err);
process.exit(1);
});
```

Key Design Decisions in the Sync Script
SHA-256 for change detection, not Git diffs. We hash the file content, not the Git commit diff. A file that is touched without changing (saved with no edits, carried through a rebase) produces the same hash and is skipped. It also means we can re-run the full sync at any time and it only processes files that actually changed.
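The skip decision is just a string comparison of digests. A minimal sketch of the guarantee we rely on:

```javascript
import { createHash } from 'node:crypto';

const sha256 = (s) => createHash('sha256').update(s).digest('hex');

// A re-save with no edits produces byte-identical content → identical hash → skip.
const stored         = sha256('---\ntitle: Hello\n---\nBody text.');
const savedUnchanged = sha256('---\ntitle: Hello\n---\nBody text.');
const savedEdited    = sha256('---\ntitle: Hello!\n---\nBody text.');

console.log(stored === savedUnchanged); // → true  (file skipped)
console.log(stored === savedEdited);    // → false (file re-synced)
```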
Pre-rendered HTML. We run marked.parse() during sync, not at request time. This trades database storage for request-time performance. A typical blog post's HTML is 20-50KB. Storing 170 posts at 50KB each is only 8.5MB -- well within Supabase's free tier.
Heading extraction during sync. The table of contents is computed once when the content changes, not on every page load. The regex extracts all h2 and h3 headings with their IDs (generated by our custom heading renderer) and stores them as JSONB.
Exit on error. If any single file fails to sync, the script exits with code 1. GitHub Actions marks the workflow as failed, and we get notified. We chose to be strict here because partial syncs are worse than failed syncs -- you end up with some content updated and some stale.
Reading time and word count. We strip Markdown formatting before counting words because code blocks, YAML examples, and image references should not inflate the word count. The stripMarkdown function removes code blocks, inline code, images, links, and formatting markers.
Sync Performance
Here are real numbers from our production sync runs:
| Scenario | Files Processed | API Calls | Duration |
|---|---|---|---|
| Full sync (170+ posts, all changed) | 188 | ~380 (read hashes + upserts) | 32-38s |
| Typical push (1-3 files changed) | 3 | ~6 (3 hash reads + 3 upserts) | 2-4s |
| No changes (all hashes match) | 0 | 3 (1 hash read per table) | 1.5-2s |
| First-time sync (empty database) | 188 | ~190 (3 empty hash reads + 188 inserts) | 40-50s |
The delta sync makes a massive difference. A typical content push (editing 1-3 blog posts) completes in under 4 seconds. Without delta sync, even small changes would take 35+ seconds to process all 170+ files.
Edge Functions
We use Supabase Edge Functions for two things: newsletter subscriptions and page view tracking.
Newsletter Subscription
// supabase/functions/newsletter-subscribe/index.ts
import "jsr:@supabase/functions-js/edge-runtime.d.ts";
import { createClient } from "jsr:@supabase/supabase-js@2";
const corsHeaders = {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers":
"authorization, x-client-info, apikey, content-type",
"Access-Control-Allow-Methods": "POST, OPTIONS",
};
Deno.serve(async (req: Request) => {
// Handle CORS preflight
if (req.method === "OPTIONS") {
return new Response(null, { headers: corsHeaders });
}
if (req.method !== "POST") {
return new Response(JSON.stringify({ error: "Method not allowed" }), {
status: 405,
headers: { ...corsHeaders, "Content-Type": "application/json" },
});
}
try {
const { email } = await req.json();
// Basic email validation
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (!email || !emailRegex.test(email)) {
return new Response(
JSON.stringify({ error: "Invalid email address" }),
{
status: 400,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
}
// Initialize Supabase client with service role
const supabase = createClient(
Deno.env.get("SUPABASE_URL")!,
Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!
);
// Insert subscriber (upsert to handle duplicates gracefully)
const { error } = await supabase.from("newsletter_subscribers").upsert(
{
email: email.toLowerCase().trim(),
subscribed_at: new Date().toISOString(),
status: "active",
},
{ onConflict: "email" }
);
if (error) {
console.error("Subscription error:", error);
return new Response(
JSON.stringify({ error: "Failed to subscribe" }),
{
status: 500,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
}
return new Response(
JSON.stringify({ message: "Subscribed successfully" }),
{
status: 200,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
} catch (err) {
console.error("Unexpected error:", err);
return new Response(
JSON.stringify({ error: "Internal server error" }),
{
status: 500,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
}
});

Page View Tracking
// supabase/functions/increment-page-views/index.ts
import "jsr:@supabase/functions-js/edge-runtime.d.ts";
import { createClient } from "jsr:@supabase/supabase-js@2";
const corsHeaders = {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Headers":
"authorization, x-client-info, apikey, content-type",
"Access-Control-Allow-Methods": "POST, OPTIONS",
};
Deno.serve(async (req: Request) => {
if (req.method === "OPTIONS") {
return new Response(null, { headers: corsHeaders });
}
if (req.method !== "POST") {
return new Response(JSON.stringify({ error: "Method not allowed" }), {
status: 405,
headers: { ...corsHeaders, "Content-Type": "application/json" },
});
}
try {
const { slug } = await req.json();
if (!slug || typeof slug !== "string") {
return new Response(
JSON.stringify({ error: "Invalid slug" }),
{
status: 400,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
}
const supabase = createClient(
Deno.env.get("SUPABASE_URL")!,
Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!
);
// Use the RPC function for atomic increment
const { error } = await supabase.rpc("increment_page_views", {
page_slug: slug,
});
if (error) {
console.error("Page view error:", error);
return new Response(
JSON.stringify({ error: "Failed to track view" }),
{
status: 500,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
}
return new Response(
JSON.stringify({ success: true }),
{
status: 200,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
} catch (err) {
console.error("Unexpected error:", err);
return new Response(
JSON.stringify({ error: "Internal server error" }),
{
status: 500,
headers: { ...corsHeaders, "Content-Type": "application/json" },
}
);
}
});

We call the page view Edge Function from the blog post page using a fire-and-forget fetch:
// Called on the client side (Astro island)
async function trackPageView(slug: string) {
try {
await fetch(
`${import.meta.env.PUBLIC_SUPABASE_URL}/functions/v1/increment-page-views`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${import.meta.env.PUBLIC_SUPABASE_ANON_KEY}`,
},
body: JSON.stringify({ slug }),
}
);
} catch {
// Silently fail -- page views are not critical
}
}

Edge Functions run on Deno Deploy's global edge network, so they respond in 10-30ms from anywhere in the world. And on Supabase's free tier, you get 500,000 Edge Function invocations per month. We are nowhere near that limit.
The GitHub Actions Workflow
The complete CI/CD workflow:
# .github/workflows/validate-and-deploy.yml
name: Validate and Deploy Content
on:
push:
branches: [main]
paths:
- 'blog/**'
- 'projects/**'
- 'team/**'
pull_request:
branches: [main]
paths:
- 'blog/**'
- 'projects/**'
- 'team/**'
jobs:
validate:
name: Validate Content
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'npm'
- run: npm ci
- name: Validate frontmatter and content
run: node scripts/validate-content.js
sync:
name: Sync to Supabase
needs: validate
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: '20'
cache: 'npm'
- run: npm ci
- name: Sync content to Supabase
run: node scripts/sync-to-supabase.js
env:
SUPABASE_URL: ${{ secrets.SUPABASE_URL }}
SUPABASE_SERVICE_ROLE_KEY: ${{ secrets.SUPABASE_SERVICE_ROLE_KEY }}
- name: Trigger Vercel deploy
if: success()
run: |
curl -s -X POST "${{ secrets.VERCEL_DEPLOY_HOOK }}"
continue-on-error: true

A few important details:
Validation runs on PRs, sync only on main. When you open a PR that changes content files, the validation step runs. This catches frontmatter errors, missing required fields, and duplicate slugs before the PR is merged. The sync step only runs when changes are pushed to main.
The Vercel deploy hook. After syncing content to Supabase, we trigger a Vercel redeploy. This is not strictly necessary (SSR pages query Supabase on every request), but it busts any edge-cached pages so users see updated content faster. The continue-on-error: true ensures the workflow does not fail if Vercel is temporarily unreachable.
Path filters. The workflow only triggers when files in blog/, projects/, or team/ change. Changes to the scripts, README, or config files do not trigger a sync.
The Validation Script
// scripts/validate-content.js
import * as fs from 'fs';
import * as path from 'path';
import matter from 'gray-matter';
const REQUIRED_BLOG_FIELDS = {
title: 'string',
description: 'string',
pubDate: 'date',
author: 'string',
};
const OPTIONAL_BLOG_FIELDS = {
image: 'string',
tags: 'array',
category: 'string',
subcategory: 'string',
featured: 'boolean',
};
let hasErrors = false;
function error(file, message) {
console.error(` ERROR [${file}]: ${message}`);
hasErrors = true;
}
function warn(file, message) {
console.warn(` WARN [${file}]: ${message}`);
}
function validateBlogPosts() {
const blogDir = path.join(process.cwd(), 'blog');
if (!fs.existsSync(blogDir)) return;
const files = fs.readdirSync(blogDir).filter(
(f) => f.endsWith('.mdx') || f.endsWith('.md')
);
console.log(`\nValidating ${files.length} blog posts...`);
const slugs = new Set();
for (const file of files) {
const raw = fs.readFileSync(path.join(blogDir, file), 'utf-8');
const slug = file.replace(/\.(mdx|md)$/, '');
// Check for duplicate slugs
if (slugs.has(slug)) {
error(file, `Duplicate slug: ${slug}`);
}
slugs.add(slug);
// Parse frontmatter
let fm;
try {
const parsed = matter(raw);
fm = parsed.data;
} catch (err) {
error(file, `Invalid frontmatter: ${err.message}`);
continue;
}
// Check required fields
for (const [field, type] of Object.entries(REQUIRED_BLOG_FIELDS)) {
if (!fm[field]) {
error(file, `Missing required field: ${field}`);
continue;
}
if (type === 'date') {
const date = new Date(fm[field]);
if (isNaN(date.getTime())) {
error(file, `Invalid date for ${field}: ${fm[field]}`);
}
} else if (type === 'string' && typeof fm[field] !== 'string') {
error(file, `${field} must be a string, got ${typeof fm[field]}`);
}
}
// Check optional fields types
if (fm.tags && !Array.isArray(fm.tags)) {
error(file, `tags must be an array, got ${typeof fm.tags}`);
}
if (fm.featured !== undefined && typeof fm.featured !== 'boolean') {
error(file, `featured must be a boolean, got ${typeof fm.featured}`);
}
// Warn on potentially missing fields
if (!fm.image) {
warn(file, 'No image specified');
}
if (!fm.tags || fm.tags.length === 0) {
warn(file, 'No tags specified');
}
if (!fm.description || fm.description.length < 100) {
warn(file, `Description is short (${(fm.description || '').length} chars, recommend 120-180)`);
}
if (fm.description && fm.description.length > 200) {
warn(file, `Description is long (${fm.description.length} chars, recommend 120-180)`);
}
}
}
// Run validation
validateBlogPosts();
if (hasErrors) {
console.error('\nValidation FAILED. Fix errors above before merging.');
process.exit(1);
} else {
console.log('\nValidation passed.');
}

This catches real mistakes. Last month, a blog post was pushed with pubDate: "203-03-01" (missing a digit). The validation script caught it before it hit Supabase.
Cost Breakdown
Here is what the entire content pipeline costs:
| Service | Plan | Monthly Cost | What We Use |
|---|---|---|---|
| Supabase | Free | $0 | Database, Storage, Edge Functions, Auth |
| GitHub Actions | Free (2,000 min/month) | $0 | CI/CD, sync pipeline |
| Vercel | Hobby | $0 | Website hosting, SSR, edge caching |
| Domain | Annual | ~$1.50/month | codercops.com |
| Total | | ~$1.50/month | |
The Supabase free tier gives us:
- 500MB database storage (we use ~15MB)
- 1GB file storage (we use ~200MB)
- 500,000 Edge Function invocations/month (we use ~5,000)
- Unlimited API requests (within reason)
- 50,000 monthly active users (we are well under)
We will outgrow the free tier eventually. When we do, Supabase Pro is $25/month, which is still dramatically cheaper than Contentful ($300+/month) or Sanity ($99+/month).
Comparison with Alternative Approaches
I want to be fair about the tradeoffs, because our approach is not right for everyone.
| Feature | Our Approach (Git + Supabase) | Contentful | Sanity | Strapi (self-hosted) |
|---|---|---|---|---|
| Monthly cost | $0 | $300+ | $99+ | $5-20 (hosting) |
| Content editing | VS Code + Git | Visual editor | Sanity Studio | Admin panel |
| Non-technical editors | No (requires Git) | Yes | Yes | Yes |
| Query flexibility | Full SQL | GraphQL/REST | GROQ | REST/GraphQL |
| Version control | Native (Git) | Built-in | Built-in | Plugin |
| Preview | None (manual) | Built-in | Built-in | Built-in |
| Image optimization | Manual | Built-in CDN | Built-in CDN | Plugin |
| Lock-in risk | Low | High | Medium | Low |
| Setup time | 2-3 days | 2-4 hours | 2-4 hours | 4-8 hours |
| Self-hostable | Partially (Supabase) | No | No | Yes |
| Real-time | Built-in (Supabase) | Webhooks | Built-in | Webhooks |
| Search | SQL (can add tsvector full-text search) | Built-in | Built-in | Plugin |
| Webhook/automation | GitHub Actions | Built-in | Built-in | Plugin |
Choose our approach if: Your team is all developers, you want full control, you want to avoid vendor lock-in, and cost matters. Also if you need SQL-level query flexibility.
Choose Contentful/Sanity if: You have non-technical content editors, you need a visual editing experience, you need content previews, and you are willing to pay for the convenience.
Choose Strapi if: You want the visual editor experience but also want to self-host and avoid vendor lock-in.
What We Would Improve
The pipeline is solid, but it is not perfect. Here is what is on our roadmap:
1. Content Preview
Right now, there is no way to preview a blog post before it is published. The workflow is: write in VS Code, push to a branch, wait for CI, merge to main, wait for sync, check the live site. For a typo fix that is fine. For a new 3,000-word post, it is painful.
The solution: a /preview/[slug] route on the website that reads content directly from a branch-specific sync (a staging Supabase project or a drafts table in the existing project).
2. Scheduled Publishing
Posts go live once they are synced to Supabase and their pub_date is in the past -- the RLS policy technically gives us crude scheduling, since a future-dated post stays hidden until its time. But between edge caching and the deploy-hook flow, we cannot promise a post appears at exactly 9 AM Tuesday without merging the PR at exactly 9 AM Tuesday.
The fix: a Supabase cron job (using pg_cron) that checks for posts with pub_date in the past and a status field of scheduled, then updates them to published.
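Sketched out, the cron job might look like this (hypothetical: assumes pg_cron is enabled and a status column has been added to blog_posts):

```sql
-- Every 5 minutes, flip scheduled posts whose publish time has passed
SELECT cron.schedule(
  'publish-scheduled-posts',
  '*/5 * * * *',
  $$
    UPDATE blog_posts
    SET status = 'published', updated_at = now()
    WHERE status = 'scheduled' AND pub_date <= now()
  $$
);
```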
3. Full-Text Search
We currently filter posts by tags and categories. Real search requires full-text indexing. Postgres has excellent built-in full-text search via tsvector:
-- Add full-text search column
ALTER TABLE blog_posts
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
setweight(to_tsvector('english', coalesce(
regexp_replace(content_html, '<[^>]*>', '', 'g'), ''
)), 'C')
) STORED;
CREATE INDEX idx_blog_posts_search ON blog_posts USING gin(search_vector);
-- Search query
SELECT slug, title, description,
ts_rank(search_vector, plainto_tsquery('english', 'AI automation')) AS rank
FROM blog_posts
WHERE search_vector @@ plainto_tsquery('english', 'AI automation')
ORDER BY rank DESC
LIMIT 10;

This gives us weighted search where title matches rank higher than description matches, which rank higher than body content matches. All within Postgres, no external search service needed.
4. Image Optimization Pipeline
We currently use Unsplash URLs for blog images, which handle their own optimization. But for project screenshots and custom images, we upload raw files to Supabase Storage. Adding an image transformation layer (resize, format conversion, quality optimization) would reduce page load times.
Supabase has built-in image transformations via their Storage API:
https://project.supabase.co/storage/v1/render/image/public/images/photo.png
?width=800
&height=600
&resize=cover
&format=webp
&quality=80

We just have not integrated this into our rendering pipeline yet.
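When we do, the integration could start with a small helper that appends transform parameters to a Storage render URL (a sketch -- the base URL, bucket name, and function name are illustrative):

```javascript
// Hypothetical helper: builds a Supabase Storage render URL with
// image transform parameters appended as a query string.
function transformedImageUrl(baseUrl, bucket, filePath, opts = {}) {
  const url = new URL(
    `${baseUrl}/storage/v1/render/image/public/${bucket}/${filePath}`
  );
  for (const [key, value] of Object.entries(opts)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

console.log(
  transformedImageUrl('https://project.supabase.co', 'images', 'photo.png', {
    width: 800,
    format: 'webp',
    quality: 80,
  })
);
// → https://project.supabase.co/storage/v1/render/image/public/images/photo.png?width=800&format=webp&quality=80
```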
5. Rollback Mechanism
If a sync pushes bad content to Supabase, there is no easy rollback. We can git revert the content commit and re-sync, but that requires a human to notice the problem, revert, and push. An automated rollback that detects sync errors and restores the previous content_hash version would be safer.
Lessons Learned
After 5 months of running this pipeline, here is what I would tell someone building something similar:
1. Start simple. Our first version of the sync script was 80 lines. No delta sync, no heading extraction, no reading time. It just parsed frontmatter and pushed to Supabase. We added features as we needed them. Do not over-engineer the first version.
2. Delta sync is essential. Without SHA-256 hashing, every push re-syncs every file. With 170+ posts, that is 35+ seconds and hundreds of API calls. Delta sync reduces typical pushes to 3 seconds and 5 API calls. Add it early.
3. Pre-render content in the sync step. Rendering Markdown to HTML at request time is wasteful. Your content does not change between requests. Render once during sync, store the HTML, serve it directly.
4. Validate before syncing. A validation step that catches missing fields, invalid dates, and type mismatches before the sync runs will save you from debugging database errors at 11 PM.
5. Use RLS, not application-level auth. Row Level Security means your database enforces access control regardless of how the query is constructed. Even if your application code has a bug, the database will not expose draft posts to the public.
6. Keep the sync script as a standalone Node.js script. Not a framework, not a CLI tool with subcommands, not a library. A single JavaScript file that you can run with node scripts/sync-to-supabase.js. The simpler it is, the easier it is to debug when something breaks at 2 AM.
7. Log everything. Every sync run logs which files were synced, which were skipped, and how long it took. When a post is not showing up on the website, the first thing we check is the sync log.
Want a content pipeline like this for your project? At CODERCOPS, we build custom content management architectures that fit your team's workflow -- whether that is Git-based, visual CMS, or a hybrid approach. Check out our blog for more engineering deep dives, or contact us to talk about your architecture needs.