← Back to Portfolio

The Star Rating That Read Backwards: System Design for a Bilingual Lead-Gen Site

No cart, no checkout, no SKU table. Two ORMs on one Postgres, a system-of-record write path, and authorization compiled into a WHERE clause — in two languages and both directions.

2

ORMs, disjoint schemas, one Postgres

1

pg.Pool singleton, guarded against fan-out

2

write paths (1 idempotent, 1 not — by design)

24h

ISR invalidation on Google reviews

4.3M

base64-char cap = edge load-shed

A five-star rating that renders as two stars is the kind of bug you only find in a language you don't read. Mecanum is a bilingual (English/Arabic, full right-to-left) lead-gen site for a Dubai supplier of American car parts, built on Next.js 16 with Payload CMS and Prisma sharing one PostgreSQL database, and it is deliberately not a store: no cart, no checkout, no SKU table. The whole front end is a read-mostly, statically-prerenderable surface with exactly two write paths and one request-time server read. The system-design work is in the boundaries — who owns which schema, what a 201 actually guarantees, where authorization lives, and how Latin data survives inside an Arabic document.

Why this isn't a store

Garages and American-car owners in the UAE lose days chasing parts: phone calls across a dozen traders, part photos traded over WhatsApp, "out of stock" answers, the wrong box delivered twice. The thing this business sells is not a catalogue. It's a fast human yes/no on availability.

So the architecture is conversion-first by construction. There is no product database to query, no inventory to sync, no checkout to harden. The unit of value is a single lead row. That reframe is what makes the engineering legible: a marketing site got this much design attention because the front end is the product, and the two POST routes that create leads are the only thing on the critical path.

A conversion-first site for a Dubai audience also has to be genuinely bilingual — bilingual in the way Arabic demands, which means real bidirectional layout. Numerals, phone numbers, star ratings, and the "+" on a stat counter all have to hold Latin/LTR ordering while sitting inside a right-to-left document. Most i18n libraries get the strings right and the bidi wrong.

Constraints

  • Bilingual means true RTL, not a string table. The Unicode bidirectional algorithm reorders inline runs at the run level. Latin data dropped into an Arabic flow reverses unless each run is explicitly isolated.
  • No storefront, but a CMS-backed blog was wanted. Lead tables (Contact, Newsletter) were already in Prisma; the team wanted Payload for a future EN/AR blog with scheduled publishing. Two data tools, one database — and I had to keep them from fighting.
  • Serverless host, fixed connection budget. Each route handler is a stateless function invocation. A naive per-invocation pool would multiply backend connections under hot-reload and warm-start reuse, so pooling discipline is not theoretical — it has to be a process-wide singleton.
  • Serverless body limit. The contact form forwards a part photo. The request has to clear the host's ~4.5MB body cap or the upload dies as an opaque 413.
  • Reduced-motion and screen-reader parity from one codebase. Every animation has to collapse to static, accessible content for the users who need that — no second component tree.

Architecture (HLD)

Mecanum is a single Next.js 16 App Router process split into two route groups that share one OS process and one Postgres instance but nothing else. app/(frontend) is the public bilingual site, rendered as React Server Components with small client "islands" for animation and the one stateful form. app/(payload) co-mounts the Payload admin UI, REST, and GraphQL in the same deployment via withPayload.

The keystone boundary is data ownership. Prisma owns the conversion-critical lead tables in the Postgres public schema; Payload owns all CMS tables namespaced into a dedicated payload schema. Locale is a URL path segment ([locale] = en | ar), negotiated by proxy.ts (Next 16's renamed middleware). Underneath, a single pg.Pool — instantiated once and reused through a global singleton — backs both Prisma's driver adapter and the connection budget, so the front end is a cacheable surface fronting a bounded number of database connections.

Web frameworkNext.js 16 (App Router, Turbopack)

Route groups split the public site ((frontend)) from the in-process Payload admin ((payload)); generateStaticParams prerenders both locales; Server Components own data fetching and SEO so pages stay statically prerenderable.

UI runtimeReact 19.2.3 (pinned exact)

Pinned for hydration stability. Theme uses useSyncExternalStore — the correct primitive for an external mutable source (localStorage + matchMedia) with a distinct getServerSnapshot to avoid hydration tearing.

Headless CMSPayload CMS ^3.84.1

In-process; native EN/AR field-level localization, Lexical rich text, per-collection RBAC. DDL-scoped into the payload schema so it coexists with Prisma; read access pushes visibility into SQL.

ORM / lead dataPrisma ^7.3.0 + @prisma/adapter-pg

Owns Contact + Newsletter. Prisma v7 wires a node-postgres Pool through PrismaPg; a global singleton stops dev hot-reload and serverless warm-reuse from spawning competing pools.

DatabasePostgreSQL (pg ^8.13.1)

One instance, two ORMs, two migration ledgers. A single process-wide pg.Pool reads DATABASE_URL; the production provider/region is supplied by env and not asserted in source.

Validation / trust boundaryZod ^4.3.5

The contact/newsletter contract is one schema, shared with TS via z.infer. The same contactFormSchema runs client-side (UX) and server-side (the actual gate). The base64 cap is a schema-level invariant.

Transactional emailResend ^6.12.3

Fires the inquiry email from a verified inquiries@mecanum.ae sender (Resend DKIM coexists with Outlook 365 on the root domain). Wrapped server-only, treated as strictly non-fatal, and every interpolated field is HTML-escaped.

Scroll animationGSAP + ScrollTrigger ^3.15.0

Drives a pinned clipPath image wipe and tweened counters. ScrollTrigger.matchMedia registers viewport-exclusive timelines; invalidateOnRefresh recomputes pin positions after resize/late-font load; gsap.context handles teardown.

StylingTailwind CSS v4 + CSS logical properties

oklch tokens via @theme inline; logical properties (ps/pe, start/end) make one class set produce correct LTR and RTL layouts without a duplicate stylesheet.

The keystone boundary

Sharing one Postgres between two ORMs sounds reckless until you see the line that makes it safe: schemaName: "payload". It turns "two tools fighting over a database" into "two tools that are structurally incapable of touching each other's tables." No table-name collision and no migration-history collision is possible, because they never write to the same schema. Every other decision in the data layer follows from that disjointness.

The hard problems

Why did the 5-star rating render as 2 stars?

Switching a document to dir="rtl" flips the visual order of every inline run. That is invisible in English testing and catastrophic in Arabic. The proof point on the homepage is a customer star rating built the obvious way:

{"★".repeat(review.rating)}
{"☆".repeat(5 - review.rating)}

In LTR that's full-then-empty: a clean 5/5. In RTL the visual order reverses, so empty comes first and full comes last, and a five-star review reads as "2 out of 5." The same class of bug puts the "+" of an "820+" counter on the wrong side of the digits, and floats a period to the wrong end of a Latin sentence inside an RTL paragraph.

I found these the only way you can: by walking the live site in Arabic after shipping a redesign. The QA log (.planning/SNAGS.md) reads like a confession — B1 records the legacy English content flowing LTR-inside-RTL with punctuation at the wrong end; P7 flags the stat suffix; P10 spells out the star reversal. The fix is one fundamental applied everywhere Latin data meets Arabic layout: wrap the run in <bdi dir="ltr"> to force an isolated left-to-right embedding. There is no global switch, because bidi isolation is a property of a run, not a document. The animated counters carry the isolate plus the final value in an aria-label (GSAP mutates textContent during the tween, so the accessible name has to be pinned independently):

<bdi dir="ltr" data-stat-counter data-stat-value={stat.number}
     data-stat-suffix={stat.suffix} aria-label={stat.value}>
  {stat.value}
</bdi>

Trade-off

Chosen: a per-run <bdi dir="ltr"> isolate on every Latin island (counters, "+", phone, email, VIN, star ratings). Rejected: a document-level dir toggle and hoping inheritance handles it. Why: the Unicode bidi algorithm reorders at the run level, so there is no single ancestor that fixes a Latin run sitting between two Arabic ones. Cost: vigilance — a new component is wrong by default until tested in Arabic, and the English-reading team can't see the bug. The benefit is that Arabic readers get correct content instead of subtly broken content nobody on the team would catch.

Can two ORMs share one Postgres without fighting? (LLD)

Both Payload and Prisma want to own the schema. Point them at the same database naively and they collide over table names and migration history. Give them separate databases and you double the infra and lose transactional locality. The split is structural. Payload's adapter is namespaced; Prisma's models carry no @@schema and land in public:

db: postgresAdapter({
  pool: { connectionString: process.env.DATABASE_URL || "" },
  // Keep Payload tables namespaced so they don't collide with Prisma's tables.
  schemaName: "payload",
})

The two migration ledgers are fully independent — prisma/migrations/* SQL files on one side, payload migrate on the other — and the invariant that makes co-tenancy safe is namespace disjointness, not coordination. The lead tables are normalized with explicit secondary indexes (@@index([email]), @@index([createdAt]) on both Contact and Newsletter) so the two read patterns a human actually runs — look up by email, sort recent — are index-backed. .env.example documents the shared-DB decision on line one rather than leaving it as tribal knowledge.

Trade-off

Chosen: one Postgres, two schemas, two migration systems. Rejected: a database per tool (clean isolation, double infra) or folding leads into Payload collections (one ORM, but surrendering Prisma's typed query ergonomics on the most important path). Why: schema namespacing buys collision-free co-tenancy on shared infra. Cost: two migration ledgers to keep mentally in sync, and cross-ORM SQL joins are structurally impossible — a Contact can never be JOINed to a Post. Accepted on purpose to keep the conversion path decoupled from CMS schema churn.

How do you guarantee a lead survives when the email provider is down?

The entire business runs on inquiries, so the durability question is concrete: what does a 201 actually promise? On a no-storefront site you also don't want to keep customer photos indefinitely, and the photo still has to reach a human in a form they can act on instantly.

POST /api/contact is a linear two-step with asymmetric failure semantics. prisma.contact.create is awaited and is the commit point; sendInquiryNotification is awaited but its Result ({ok:true} | {ok:false; reason}) is inspected, logged with the contactId, and discarded:

// Notify the team. Failures are NON-fatal — the inquiry is already
// persisted and recoverable from the `contact` table, so the customer
// sees success regardless of whether the notification email got out.
const notify = await sendInquiryNotification(validatedData, contact.id);
if (!notify.ok)
  console.error('[contact] inquiry notification failed:', notify.reason, 'contactId:', contact.id);
// ...returns 201 regardless

This makes the Contact row the single source of truth and the email a best-effort projection. The invariant is one-directional: a 201 to the client implies a durable Contact row, never the reverse. This is system of record / durability over availability of side effects applied to a conversion path. The photo is never written to Postgres — a migration, drop_contact_image_url, explicitly removed the old imageUrl column — and is instead re-decoded from base64 into a Buffer and attached to the email at send time.

Trade-off

Chosen: persist first, treat the notification as strictly non-fatal, forward the photo as a transient email attachment. Rejected: roll back the request when the email send fails; persist the photo to S3/Cloudinary and store a URL on the row (the original design). Why: a lost lead is the worst possible outcome, a missing internal ping is recoverable from the table, and nobody needs to query historical part photos. Cost: silent degradation — an ops failure (Resend down, env unset) is invisible to the customer and only discoverable in logs; there is no alerting on the gap. The reversal left a scar: the DROP COLUMN imageUrl migration.

Two endpoints, two idempotency answers

The newsletter handler is a deliberately different design. It does findUnique on the @unique email and returns 200 "already subscribed" if present — not a 409 — else creates and returns 201:

const existing = await prisma.newsletter.findUnique({ where: { email } });
if (existing) return NextResponse.json({ /* subscribed */ }, { status: 200 });

Re-submission is a safe no-op, and the @unique constraint is the real guard against a race (two concurrent submits where both pass the findUnique check — one create wins, the other hits the constraint). That's idempotency demonstrated by name. The contrast is the point: the contact POST is not idempotent — every accepted call inserts a row. I flag that honestly rather than papering over it, because it interacts with the browser retry policy below. A network error on a request that actually succeeded server-side could double-insert a lead; retrying only on errors narrows the window but does not eliminate it, and there is no request-id dedupe. Naming the gap is the senior move; pretending the endpoint is idempotent would be the junior one.

Authorization as a query predicate

The single strongest CMS-side design artifact never touches application code. Payload's Posts read access does not return a boolean — for an unauthenticated request it returns a Where object that Payload composes into the underlying SQL WHERE:

read: ({ req }) => {
  if (req.user) return true;
  const where: Where = { and: [
    { status: { equals: "published" } },
    { publishedAt: { less_than_equal: new Date().toISOString() } },
  ]};
  return where;
},

Draft and future-dated posts are filtered at the database layer, not in app code. This is the scheduled-publishing mechanism: a post with a future publishedAt is unfetchable until wall-clock time passes it — row-level access control expressed as a query predicate. The Users collection layers field-level access on top: a beforeChange create hook bootstraps the first user (totalDocs === 0) to admin, and the role field's update access is gated to admins only, closing the privilege-escalation path where a non-admin self-promotes. Authorization that lives in the query plan can't be forgotten by a careless handler.

The Next 16 upgrade that broke the build at deploy

Some traps are invisible until deploy. On a locale switch, the language provider wants to preserve the current query string and hash, so the idiomatic move is useSearchParams() at render. In Next 16 that quietly forces every page mounting the provider into client-side rendering; Next bails out of static prerender, and the build broke. The render boundary I'd designed — server components own data + SEO, client islands own interactivity, kept thin so pages stay static — was being violated by a single hook.

The fix reads the URL imperatively, inside the click handler, where window is always defined. The reasoning is pinned as a verbatim warning comment so a refactor can't silently reintroduce the regression:

// Reading at render forces every page that mounts this provider into
// client-side rendering (Next 16 bails out of static prerender),
// which broke the Vercel build. setLang only runs from a click handler.
const query = typeof window !== "undefined" ? window.location.search : "";
const hash  = typeof window !== "undefined" ? window.location.hash  : "";

The trade-off is honest: this is less idiomatic React (an imperative window read instead of a hook), valid only because setLang runs exclusively from a click. The comment is load-bearing — a one-line landmine that costs a deploy day to diagnose and ten seconds to reintroduce.

A cancellable async-timeline state machine (no animation library)

The hero proof is a WhatsApp inbox that types replies character by character, runs sent → delivered → read ticks, morphs an iOS Dynamic Island into a notification, crossfades between three conversations, and loops forever. Built naively with chained setTimeouts it leaks timers and calls setState after unmount. So it's a hand-built async state machine with a closure-scoped cancellation token re-checked after every await:

const cancel = { flag: false };
const run = async () => {
  let idx = 0;
  while (!cancel.flag) {
    await playChat(CHATS[idx]);
    if (cancel.flag) return;
    const nextIdx = (idx + 1) % CHATS.length;
    await Promise.all([announceNext(CHATS[nextIdx].contact), transitionTo(nextIdx)]);
    idx = nextIdx;
  }
};
// cleanup: cancel.flag = true;

Cleanup just sets cancel.flag = true, which guarantees no setState fires post-unmount — cooperative cancellation, the React equivalent of a structured-concurrency contract, done without a library. Concurrency is explicit: the notification expand and the chat-body crossfade run together via Promise.all with hand-computed offsets (NOTIFICATION_HOLD_MS − TRANSITION_FADE_MS). An IntersectionObserver (threshold 0.3) defers the whole timeline until it scrolls into view; prefers-reduced-motion short-circuits to the first conversation fully revealed with read ticks and no loop. The accessibility invariant: the entire widget is one role="img" with a single descriptive aria-label, so a thousand DOM mutations surface to assistive tech as one sentence.

Insight

The line count isn't the animation. It's the cancellation correctness, the IntersectionObserver deferral, and the single screen-reader label. Strip those and you have a memory leak that lies to assistive tech. The widget ships zero animation dependencies and one clean accessibility surface — decorative UI only earns its keep if it degrades honestly.

Design fundamentals, by name

The patterns above aren't decoration; they're the same handful of CS fundamentals applied where the code genuinely shows them.

  • Tearing-free external-store reads, with a deterministic first paint. ThemeProvider uses useSyncExternalStore(subscribe, getSnapshot, getServerSnapshot); getServerSnapshot returns "system", a distinct SSR value that avoids a hydration mismatch on something the server can't know. subscribe composes three sources into one callback: the cross-tab storage event, a custom same-tab mecanum-theme-change event (because storage doesn't fire in the originating tab — a real pitfall), and the matchMedia change event. To kill the flash, theme and direction are applied by a blocking inline <head> script before React hydrates, with <html suppressHydrationWarning> acknowledging the script-vs-SSR class divergence — the first frame is deterministic instead of racing hydration.
  • Defense in depth / output encoding at a trust sink. Untrusted form input crosses two validators — the client's fast /^[^\s@]+@[^\s@]+\.[^\s@]+$/ regex plus a 3MB / image/* check for UX, and the server re-running the identical contactFormSchema as the actual gate — then is HTML-escaped at every interpolation point in the notification email, with tel: stripped to [+\d]. The email renderer treats all form data as hostile rather than trusting upstream validation. Client validation is UX; server validation is the trust boundary — stated explicitly so it reads as a decision, not an accident.
  • Backpressure / load-shedding at the edge. The base64 attachment cap (max(4_300_000)) encodes the serverless body limit as a Zod rule, so an oversized upload is rejected with a clean "Attachment is too large" at validation time instead of dying as an opaque 413 deeper in the stack — shedding excess load before it consumes a request slot.
  • Connection pooling under serverless. A single process-wide pg.Pool backs Prisma's driver adapter, and the client itself is a global singleton (global.prisma ?? new PrismaClient(...)). That guard is the canonical answer to "serverless fan-out vs a fixed DB connection budget": module re-evaluation under hot-reload or warm-invocation reuse returns the same client instead of leaking a new pool each time.
  • Graceful degradation as a system-wide invariant. Every third-party dependency returns a safe value instead of throwing: getResend()null when unkeyed, getGoogleReviews(){reviews:[]} on missing-env, !response.ok, and any thrown error, and the browser client maps a 30s AbortController timeout to a typed 408. No single dependency can take down a page.
  • Caching + explicit invalidation. Both locales are static-prerendered; Google reviews sit behind 24h time-based ISR (revalidate: 86400); /llms.txt is force-static. The 24-hour staleness window for third-party reviews is a deliberate cache-invalidation choice, not a default.

What I'd change

The honest gaps, stated as known debt.

No rate-limiting on the public POST endpoints. /api/contact and /api/newsletter have no throttle, captcha, or honeypot; the only abuse mitigation is the Zod attachment cap (which bounds payload size, not request rate). On a site whose whole value is inquiries, this is the first thing I'd add — a honeypot field and a per-IP limit at minimum — and it pairs with the next point: the contact POST being non-idempotent means an attacker (or a retry storm) can insert duplicate leads, so a request-id dedupe would harden both at once.

The blog is modelled ahead of its surface — confirmed. The Posts collection is fully built (localized title/slug/excerpt/content, draft/published status, publishedAt scheduling, the row-level published filter) but the frontend route tree under app/(frontend)/[locale]/ contains only the homepage, about-us, and contact-us. There is no public /blog or /posts route in the tracked source. The CMS, including its scheduled-publishing query predicate, is genuinely built ahead of the surface that will consume it.

Payload media still lives on local disk. payload.config.ts keeps the dev-disk default with an explicit comment to swap to S3/Cloudinary via a storage plugin for production; no @payloadcms/storage-* plugin is configured. Worth disambiguating: Cloudinary is in the repo, but only for frontend marketing imagery (config/Cloudinary.ts, and res.cloudinary.com whitelisted in next.config.ts remotePatterns alongside lh3.googleusercontent.com for Google review avatars). The Payload upload backend for production is still local disk with a documented TODO, and remains unconfirmed.

Host is implied, not proven from a deploy config. There is no vercel.json or netlify.toml in the repo, so the production host isn't established by a config file. The strongest file-grounded signals point at a serverless host on Vercel — withPayload in next.config.ts, the "~4.5MB Vercel serverless request limit" comment in lib/validations.ts, and the verbatim "broke the Vercel build" warning in lang-provider.tsx — but that's inferential. The data tier is equally unproven from source: .env.example ships a localhost DATABASE_URL, and the production Postgres provider, region, and pooler topology are supplied by env, not asserted anywhere in the tracked code. What is a hard fact is the discipline that makes the connection budget safe regardless of host: one process-wide pg.Pool behind a global singleton.

No automated test suite in the tracked source. package.json has no test runner (only dev/build/start/lint plus Prisma/Payload tooling), and no committed *.spec/*.test files were found. Verification today is the manual EN+AR, desktop+mobile audit in .planning/SNAGS.md plus a DOM scan for missing alt/broken href. The bidi bugs above are precisely the class an integration test in Arabic would have caught — the strongest argument for adding one.

FAQ

Can you really run Payload CMS and Prisma against the same PostgreSQL database?

Yes, if the schemas are disjoint. Payload's postgres adapter is pinned to schemaName: 'payload' so every CMS table is DDL-scoped into that namespace; Prisma's models carry no @@schema and land in public. One DATABASE_URL, two independent migration ledgers, and collision-freedom is a property of namespace disjointness, not luck. The cost is structural: a Prisma Contact can never be SQL-JOINed to a Payload Post — accepted to keep the conversion path decoupled from CMS schema churn.

How do you stop Arabic RTL from reversing numbers, ratings, and phone numbers?

Wrap every Latin-ordered run in <bdi dir="ltr"> to force an isolated left-to-right Unicode embedding inside the RTL document. There is no global fix: each Latin island inside an RTL flow (a star rating, a "+" suffix, a VIN, a phone number) needs its own bidi isolate, because the bidirectional algorithm reorders at the run level. The bugs are invisible in English and only surface when you test in Arabic.

Why persist the lead before sending the notification email, and what is the durability guarantee?

Because the Postgres row is the system of record and the email is a best-effort projection. POST /api/contact awaits prisma.contact.create first (the commit point), then awaits the Resend send but inspects its Result, logs {reason, contactId} on failure, and returns 201 regardless. The invariant is one-directional: a 201 implies a durable Contact row, never the reverse — so a Resend outage degrades to a recoverable row, not a lost customer.

Is the contact endpoint idempotent? What about newsletter?

No and yes, deliberately. Newsletter subscribe is idempotent: findUnique on a @unique email returns 200 "already subscribed" (not 409), with the unique constraint as the real race guard, so re-submission is a safe no-op. The contact POST is NOT idempotent — every accepted call inserts a row — which is why the browser retry client only retries on 5xx/network errors and never on 4xx, though that narrows the double-insert window rather than closing it (there is no request-id dedupe).

How is scheduled blog publishing enforced — in app code or in the database?

In the database, as a query predicate. Payload Posts `read` access returns a Where object {and:[{status:{equals:'published'}},{publishedAt:{less_than_equal: new Date().toISOString()}}]} for anonymous requests instead of a boolean, and Payload composes it into the underlying SQL WHERE. Drafts and future-dated posts are unfetchable at the query layer, so a post with a future publishedAt is simply invisible until wall-clock time passes it — row-level access control, not an app-code filter.

Why hand-roll i18n instead of using next-intl or i18next?

Two locales and a fixed marketing surface do not justify a runtime dependency. Mecanum uses a single `as const` EN+AR tree plus a DeepReadonlyToMutable mapped type, so a missing or misspelled key (t.stats.projects) is a TypeScript error and EN/AR are forced to the same shape — full key safety, zero bytes of runtime i18n machinery. The real i18n difficulty was bidi layout, which no string library solves anyway.

The site sells a yes/no. Two languages, two directions, two ORMs on one Postgres, and authorization that lives in a WHERE clause — every boundary chosen so the fastest path from a visitor to a WhatsApp conversation can never be the thing that breaks.