From 85f9a5f2f5dbee98dc273f173889efe273c4b548 Mon Sep 17 00:00:00 2001
From: John McLear
Date: Fri, 1 May 2026 17:43:29 +0800
Subject: [PATCH] feat: Open Graph & Twitter Card metadata for pad/timeslider/home (closes #7599) (#7635)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs(spec): Open Graph metadata for pad pages (issue #7599)

Spec for adding og:* and twitter:card meta tags to /p/:pad, the
timeslider, and the homepage so shared links unfurl with a useful
preview in chat apps.

Co-Authored-By: Claude Opus 4.7 (1M context)

* docs(spec): expand OG spec: i18n (locale map + og:locale) and a11y (image:alt)

Address review feedback: socialDescription accepts a per-language map,
og:locale is emitted from the negotiated render language, and image:alt
attributes are emitted for screen readers in chat clients.

Co-Authored-By: Claude Opus 4.7 (1M context)

* feat: emit Open Graph & Twitter Card metadata for pad/timeslider/home

Closes #7599. Pad URLs shared in chat apps (WhatsApp, Signal, Slack,
etc.) previously unfurled with no preview because the rendered HTML
carried no OG or Twitter Card metadata. This change emits og:title,
og:description, og:image, og:url, og:site_name, og:type, og:locale,
og:image:alt and the equivalent twitter:* tags on the pad page, the
timeslider, and the homepage.

A new settings.json key `socialDescription` controls the description.
It accepts either a plain string applied to every locale or a
per-language map keyed by BCP-47 tag with an optional `default`
fallback. og:locale is emitted from the language already negotiated via
req.acceptsLanguages and og:image:alt provides screen-reader text for
chat-client previews.

Pad names from the URL are HTML-escaped before being interpolated into
og:title to prevent reflected XSS via crafted pad IDs.
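The escaping step described above can be sketched as follows. This is a minimal illustration, not the code from this patch: `escapeHtml`, `metaTag`, and the "<pad> | Etherpad" title format are hypothetical names and choices made up for the example.

```typescript
// Hypothetical helper (not from the patch): escape the five characters that
// can break out of an HTML attribute value or open a new tag.
const escapeHtml = (s: string): string => s
    .replace(/&/g, '&amp;') // must run first so later entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');

// Hypothetical helper: build one <meta> tag with both attributes escaped.
const metaTag = (property: string, content: string): string =>
    `<meta property="${escapeHtml(property)}" content="${escapeHtml(content)}">`;

// A crafted pad ID that tries to close the attribute and inject a script.
const craftedPadName = '"><script>alert(1)</script>';

// The title format here is an assumption made for illustration only.
const ogTitleTag = metaTag('og:title', `${craftedPadName} | Etherpad`);

console.log(ogTitleTag);
```

Because every interpolated value passes through the same helper, the crafted pad ID stays inside the `content` attribute as inert text.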
Tests: src/tests/backend/specs/socialMeta.ts covers the default,
per-locale override, locale fallback, URL decoding, XSS escape, and the
timeslider/homepage variants.

Semver: minor (new setting; templates emit additional tags but no
existing behavior changes).

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(test): use valid pad-name char in URL-decode test

Spaces aren't allowed in pad names: Etherpad redirected /p/Has%20Space*
to a sanitized name (302), so the og:title assertion failed. Use %2D
("-") instead, which is a valid pad-name character and still exercises
the URL-decode path.

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(socialMeta): don't double-decode pad name from req.params.pad

Express has already URL-decoded :pad route params before they reach the
handler. Calling decodeURIComponent on the result throws URIError for
pad names containing a literal "%", e.g. the URL /p/100%25 yields
req.params.pad === "100%", and decodeURIComponent("100%") throws. This
would have prevented the page from rendering for some valid pad IDs.

Drop the redundant decode and add a regression test for the "%" case.

Reported by Qodo on PR #7635.

Co-Authored-By: Claude Opus 4.7 (1M context)

* refactor(socialMeta): source description from i18n catalog, drop settings key

Per review: the OG description is a translatable string and belongs in
Etherpad's locale files alongside the rest of the UI strings, not in
settings.json. Operators who want to override it per-language continue
to use the standard customLocaleStrings mechanism; no new config
surface.

Changes:
- Add "pad.social.description" to src/locales/en.json (default English).
- Export i18n.locales so server-side renderers can look up translations.
- socialMeta.renderSocialMeta now takes a `locales` map and resolves
  renderLang → primary subtag → en, instead of taking a per-locale map
  from settings.
- Remove `socialDescription` from Settings.ts, settings.json.template,
  settings.json.docker (the key never shipped).
- Update tests and spec doc to reflect i18n-sourced description.

Reported by Qodo on PR #7635 (also confirmed feature is fine to land
default-on; no flag needed).

Co-Authored-By: Claude Opus 4.7 (1M context)

* test(socialMeta): add unit tests for pure helpers

21 cases exercising buildSocialMetaHtml and renderSocialMeta directly,
without HTTP/DB. Covers tag enumeration, HTML escaping, og:locale region
formatting, title composition (pad/timeslider/home), description i18n
resolution (exact/primary/en fallback, missing catalog), image URL
(default favicon vs absolute settings.favicon vs alt text), canonical
URL building with query-string stripping, the literal "%" no-throw
regression, and attribute-breakout escape.

Co-Authored-By: Claude Opus 4.7 (1M context)

* fix(socialMeta): defend og:url/og:image against host-header poisoning

Previously og:url and og:image were built from req.protocol +
req.get('host'), both of which can be client-controlled (Host header
directly, or X-Forwarded-* under trust proxy). A crafted Host could make
the server emit OG tags pointing at an attacker's origin, which is
harmful if any cache fronts the response or if a vulnerable proxy
forwards the headers unsanitized.

Two-layer defense:

1. New optional setting `publicURL` lets operators pin the canonical
   origin used for shared link previews ("https://pad.example"). When
   set, og:url and og:image use it unconditionally. Sanitized at use
   time: must be http(s)://host[:port] with no path, no userinfo, no
   trailing slash; malformed values fall back to the request.

2. When `publicURL` is unset, the request-derived fallback now strictly
   validates the Host header against
   /^[a-z0-9]([a-z0-9.-]{0,253}[a-z0-9])?(:\d{1,5})?$/i and caps the
   scheme to "http"/"https".
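The two layers above can be sketched roughly as below. Only the Host regex is quoted from this commit message; `sanitizePublicUrl`, `resolveOrigin`, returning `null` for a rejected Host, and treating every non-"https" scheme as "http" are assumptions made for illustration, not the patch's actual code.

```typescript
// Host pattern quoted from the commit message above.
const HOST_RE = /^[a-z0-9]([a-z0-9.-]{0,253}[a-z0-9])?(:\d{1,5})?$/i;

// Hypothetical helper: accept only http(s)://host[:port], with no path,
// no userinfo, and no trailing slash. Anything else yields null.
const sanitizePublicUrl = (value: string | null | undefined): string | null => {
  if (!value) return null;
  const m = /^(https?):\/\/([^\/?#@]+)$/i.exec(value);
  if (m == null || !HOST_RE.test(m[2])) return null;
  return `${m[1].toLowerCase()}://${m[2]}`;
};

// Hypothetical helper: prefer the pinned origin, otherwise fall back to the
// request-derived values after strict validation.
const resolveOrigin = (
    publicURL: string | null | undefined,
    protocol: string,
    host: string,
): string | null => {
  const pinned = sanitizePublicUrl(publicURL);
  if (pinned != null) return pinned;
  // Assumption: anything other than "https" is treated as "http".
  const scheme = protocol === 'https' ? 'https' : 'http';
  // Assumption: a rejected Host yields null, leaving the caller to decide
  // how to degrade (for example, by omitting og:url).
  return HOST_RE.test(host) ? `${scheme}://${host}` : null;
};

console.log(resolveOrigin('https://pad.example', 'http', 'evil.test'));
console.log(resolveOrigin(undefined, 'https', 'evil.test\r\nX-Injected: 1'));
```

Note that the quoted pattern has no room for bracketed IPv6 literals such as `[::1]:9001`, so in this sketch those hosts are rejected as well.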
A crafted Host (CRLF injection, userinfo, "';
+    const html = buildSocialMetaHtml({
+      url: evil, siteName: evil, title: evil, description: evil,
+      imageUrl: evil, imageAlt: evil, renderLang: 'en',
+    });
+    assert.ok(!/',
+    });
+    assert.ok(!/">'))
+      .expect((r: any) => {
+        // Etherpad may 404 or render; either is fine, but no raw