feat: Open Graph & Twitter Card metadata for pad/timeslider/home (closes #7599) (#7635)

* docs(spec): Open Graph metadata for pad pages (issue #7599)

Spec for adding og:* and twitter:card meta tags to /p/:pad,
the timeslider, and the homepage so shared links unfurl with
a useful preview in chat apps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(spec): expand OG spec — i18n (locale map + og:locale) and a11y (image:alt)

Address review feedback: socialDescription accepts a per-language map,
og:locale is emitted from the negotiated render language, and image:alt
attributes are emitted for screen readers in chat clients.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: emit Open Graph & Twitter Card metadata for pad/timeslider/home

Closes #7599.

Pad URLs shared in chat apps (WhatsApp, Signal, Slack, etc.) previously
unfurled with no preview because the rendered HTML carried no OG or
Twitter Card metadata. This change emits og:title, og:description,
og:image, og:url, og:site_name, og:type, og:locale, og:image:alt and
the equivalent twitter:* tags on the pad page, the timeslider, and the
homepage.

A new settings.json key `socialDescription` controls the description.
It accepts either a plain string applied to every locale or a per-language
map keyed by BCP-47 tag with an optional `default` fallback. og:locale
is emitted from the language already negotiated via req.acceptsLanguages
and og:image:alt provides screen-reader text for chat-client previews.

Pad names from the URL are HTML-escaped before being interpolated into
og:title to prevent reflected XSS via crafted pad IDs.

Tests: src/tests/backend/specs/socialMeta.ts covers the default,
per-locale override, locale fallback, URL decoding, XSS escape, and
the timeslider/homepage variants.

Semver: minor (new setting; templates emit additional tags but no
existing behavior changes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): use valid pad-name char in URL-decode test

Spaces aren't allowed in pad names — Etherpad redirected /p/Has%20Space*
to a sanitized name (302), so the og:title assertion failed. Use %2D
("-") instead, which is a valid pad-name character and still exercises
the URL-decode path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(socialMeta): don't double-decode pad name from req.params.pad

Express has already URL-decoded :pad route params before they reach the
handler. Calling decodeURIComponent on the result throws URIError for
pad names containing a literal "%" — e.g. the URL /p/100%25 yields
req.params.pad === "100%", and decodeURIComponent("100%") throws.

This would have prevented the page from rendering for some valid pad
IDs. Drop the redundant decode and add a regression test for the "%"
case.

Reported by Qodo on PR #7635.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(socialMeta): source description from i18n catalog, drop settings key

Per review: the OG description is a translatable string and belongs in
Etherpad's locale files alongside the rest of the UI strings, not in
settings.json. Operators who want to override it per-language continue to
use the standard customLocaleStrings mechanism — no new config surface.

Changes:
- Add "pad.social.description" to src/locales/en.json (default English).
- Export i18n.locales so server-side renderers can look up translations.
- socialMeta.renderSocialMeta now takes a `locales` map and resolves
  renderLang → primary subtag → en, instead of taking a per-locale map
  from settings.
- Remove `socialDescription` from Settings.ts, settings.json.template,
  settings.json.docker (the key never shipped).
- Update tests and spec doc to reflect i18n-sourced description.

Reported by Qodo on PR #7635 (also confirmed feature is fine to land
default-on; no flag needed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(socialMeta): add unit tests for pure helpers

21 cases exercising buildSocialMetaHtml and renderSocialMeta directly,
without HTTP/DB. Covers tag enumeration, HTML escaping, og:locale
region formatting, title composition (pad/timeslider/home), description
i18n resolution (exact/primary/en fallback, missing catalog), image URL
(default favicon vs absolute settings.favicon vs alt text), canonical
URL building with query-string stripping, the literal "%" no-throw
regression, and attribute-breakout escape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(socialMeta): defend og:url/og:image against host-header poisoning

Previously og:url and og:image were built from req.protocol +
req.get('host'), both of which can be client-controlled (Host header
directly, or X-Forwarded-* under trust proxy). A crafted Host could
make the server emit OG tags pointing at an attacker's origin —
harmful if any cache fronts the response or if a vulnerable proxy
forwards the headers unsanitized.

Two-layer defense:

1. New optional setting `publicURL` lets operators pin the canonical
   origin used for shared link previews ("https://pad.example"). When
   set, og:url and og:image use it unconditionally. Sanitized at use
   time: must be http(s)://host[:port] with no path, no userinfo, no
   trailing slash; malformed values fall back to the request.

2. When `publicURL` is unset, the request-derived fallback now strictly
   validates the Host header against /^[a-z0-9]([a-z0-9.-]{0,253}[a-z0-9])?(:\d{1,5})?$/i
   and caps the scheme to "http"/"https". A crafted Host (CRLF
   injection, userinfo, "<script>") is replaced with "localhost"
   instead of being echoed into og:url.

Reported by Qodo on PR #7635.

Tests: 5 new unit cases covering publicURL preference, trailing-slash
strip, malformed-publicURL fallback, Host validation, scheme cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(socialMeta): tighten types, drop `any`

- `req: any` -> express `Request` (covers acceptsLanguages/protocol/get/originalUrl).
- `settings: any` -> local `SocialMetaSettings` interface narrowed to the three
  fields we actually read (title/favicon/publicURL); avoids coupling to the
  full Settings module surface.
- `availableLangs: {[k: string]: any}` -> `{[lang: string]: unknown}`; only
  keys are read, so values stay deliberately unconstrained.

No runtime change. All 26 socialMeta unit tests still pass.

Per Sam's review on #7635.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
John McLear 2026-05-01 17:43:29 +08:00 committed by GitHub
parent 63cae17720
commit 85f9a5f2f5
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
13 changed files with 841 additions and 7 deletions

View File

@ -0,0 +1,146 @@
# Open Graph metadata for pad pages — Design
GitHub issue: https://github.com/ether/etherpad/issues/7599
## Problem
When an Etherpad pad URL is shared in chat apps (WhatsApp, Signal, Slack,
Discord, iMessage, etc.) the link unfurls with no preview because the rendered
HTML carries no Open Graph or Twitter Card metadata. The reporter asks for
basic OG tags so shared links show a meaningful preview.
## Goals
- Pad URLs (`/p/:pad`), timeslider URLs (`/p/:pad/timeslider`), and the
homepage (`/`) emit Open Graph + Twitter Card meta tags.
- A site operator can override the default description via `settings.json`.
- No new runtime dependencies. Implementation lives in the existing EJS
templates and the existing settings module.
## Non-goals
- Per-pad descriptions, custom OG images per pad, or pulling content from the
pad body. The pad text is mutable and frequently empty at first load; using
it would be both expensive (extra DB read on a hot path) and misleading.
- A plugin hook for OG override. Defer until a plugin actually needs it
(YAGNI).
- Removing or changing the existing `<meta name="robots" content="noindex,
nofollow">` tag. OG unfurling is performed by chat clients that ignore
`robots`, so the privacy posture is unchanged.
## Tags emitted
For the **pad page** (`/p/:pad`):
| Tag | Value |
| ------------------- | ----------------------------------------------------------- |
| `og:title` | `{decoded pad name} | {settings.title}` |
| `og:description` | `settings.socialDescription` |
| `og:image` | absolute URL to `{req.protocol}://{host}/favicon.ico`* |
| `og:url` | absolute URL of the request |
| `og:type` | `website` |
| `og:site_name` | `settings.title` |
| `og:locale` | negotiated `renderLang` (already computed in `pad.html`), normalized to BCP-47 with underscore (e.g. `en_US`, `de_DE`); falls back to `en_US` |
| `og:image:alt` | `"{settings.title} logo"` (a11y — screen readers in chat clients announce this) |
| `twitter:card` | `summary` |
| `twitter:title` | same as `og:title` |
| `twitter:description` | same as `og:description` |
| `twitter:image` | same as `og:image` |
| `twitter:image:alt` | same as `og:image:alt` |
\* `settings.favicon` is normally null (defaults route to the bundled
`favicon.ico` via the favicon middleware). The template builds the absolute
URL by joining `req.protocol`, `req.get('host')`, and the favicon path. If
`settings.favicon` is an absolute URL it is used verbatim.
For the **timeslider** (`/p/:pad/timeslider`): same tags, with `og:title` set
to `{decoded pad name} (history) | {settings.title}`.
For the **homepage** (`/`): same tags, with `og:title` set to
`settings.title` and `og:url` set to the request URL.
## i18n source
The description text lives in Etherpad's standard locale catalog under the
key `pad.social.description`. The shipped English default in
`src/locales/en.json` is the softer rewording of the wording in the issue:
> A collaborative document that everyone can edit in real time.
Other locale files may translate the key as the translation community picks
it up; missing translations fall back to English. **No new `settings.json`
key is added** — operators who want to override the text per-language do so
via the existing `customLocaleStrings` mechanism that Etherpad already
supports.
**Locale negotiation.** Resolution order at request time:
1. `locales[renderLang]['pad.social.description']` (exact match, where
`renderLang` was negotiated via `req.acceptsLanguages()`).
2. `locales[primarySubtag]['pad.social.description']` (e.g. `de-AT``de`).
3. `locales.en['pad.social.description']` (English fallback).
4. Empty string (only if `en.json` is missing the key — should not happen
in core).
The `i18n` hook now exports the loaded `locales` map so other server-side
modules can look up translated strings without re-reading the JSON files.
## Implementation outline
1. **Settings** — declare `socialDescription: string` on the Settings module
with the default above; document it in both example settings files.
2. **Helper** — extract the meta-tag block into a single source of truth.
Preferred form is an EJS partial included from each template; if
Etherpad's `eejs` wrapper does not support `include()` cleanly, fall back
to a small JS helper (e.g. `src/node/utils/socialMeta.ts`) exported into
the template via the existing `eejs.require` context, returning the
rendered `<meta>` block as a string. Implementation step 1 of the plan
must verify which mechanism `eejs` supports before committing to one.
3. **pad.html / timeslider.html / index.html** — compute the four template
inputs at the top of each file and `<%- include('_socialMeta', {...}) %>`
in `<head>`, after the existing `<title>` line. The pad name is decoded
with `decodeURIComponent(req.params.pad)` and HTML-escaped via the
existing `<%= %>` mechanism (EJS escapes by default).
4. **Route handlers**`specialpages.ts` already passes `req` and
`settings` to the templates; no route changes needed.
## Tests
Add to the existing backend test suite (likely
`src/tests/backend/specs/specialpages.ts` or a new
`src/tests/backend/specs/socialmeta.ts`):
- GET `/p/TestPad-7599` → response HTML contains
`<meta property="og:title" content="TestPad-7599 | Etherpad">` and an
`og:description` matching the default.
- GET `/p/TestPad-7599` with `settings.socialDescription` overridden to
`"Custom desc"` → that custom value appears in `og:description`.
- GET `/p/Has%20Space``og:title` contains `Has Space` (decoded) and is
HTML-safe (no raw `%`).
- GET `/p/<script>` (encoded) → `og:title` contains escaped `&lt;script&gt;`,
not raw HTML.
- GET `/p/TestPad/timeslider``og:title` contains `(history)`.
- GET `/``og:title` equals `settings.title`.
- GET `/p/TestPad` with `Accept-Language: de` and
`socialDescription: {default: "X", de: "Y"}``og:description` is `Y`
and `og:locale` is `de_DE` (or `de`).
- Response includes `og:image:alt` and `twitter:image:alt`.
The XSS escape test is the security-relevant one: pad IDs are user-controlled
(anyone can navigate to `/p/<anything>`).
## Risks and trade-offs
- **Pad-name leakage.** Anyone the link is shared with can already see the pad
name in the URL, so emitting it in `og:title` does not expose anything new.
- **Caching.** OG tags are read once per unfurl. Chat clients cache aggressively;
changing `socialDescription` will not propagate to previously-cached previews.
This is acceptable and standard.
- **Template-set drift.** Etherpad has three top-level HTML templates that
need OG tags; the `_socialMeta` partial avoids three copies of the same
block.
## Out of scope (future work)
- A `padSocialMetadata` hook that lets plugins override the values.
- Per-pad description (e.g. ep_pad_title integration).
- Generated preview images (would require a rendering service).

View File

@ -117,6 +117,14 @@
*/
"favicon": "${FAVICON:null}",
/*
* Canonical public origin of this Etherpad instance, e.g.
* "https://pad.example.com" (no trailing slash, must include scheme).
* Used to build absolute URLs in OG/Twitter link-preview meta tags.
* When null, falls back to the incoming request's protocol+Host.
*/
"publicURL": "${PUBLIC_URL:null}",
/*
* Skin name.
*

View File

@ -108,6 +108,21 @@
*/
"favicon": null,
/*
* Canonical public origin of this Etherpad instance, e.g.
* "https://pad.example.com" (no trailing slash, must include scheme).
*
* When set, this is used to build absolute URLs in server-rendered output
* such as the Open Graph / Twitter Card link-preview meta tags (og:url,
* og:image, ...). When null, those URLs fall back to the request's
* protocol+Host, which can reflect client-controlled headers if your
* reverse proxy passes them through unsanitized.
*
* Set this in production deployments to lock down the canonical origin
* advertised in shared link previews.
*/
"publicURL": null,
/*
* Skin name.
*

View File

@ -220,5 +220,6 @@
"pad.impexp.importfailed": "Import failed",
"pad.impexp.copypaste": "Please copy paste",
"pad.impexp.exportdisabled": "Exporting as {{type}} format is disabled. Please contact your system administrator for details.",
"pad.impexp.maxFileSize": "File too big. Contact your site administrator to increase the allowed file size for import"
"pad.impexp.maxFileSize": "File too big. Contact your site administrator to increase the allowed file size for import",
"pad.social.description": "A collaborative document that everyone can edit in real time."
}

View File

@ -10,6 +10,8 @@ import settings, {getEpVersion} from '../../utils/Settings';
import util from 'node:util';
const webaccess = require('./webaccess');
const plugins = require('../../../static/js/pluginfw/plugin_defs');
const i18n = require('../i18n');
import {renderSocialMeta} from '../../utils/socialMeta';
import {build, buildSync} from 'esbuild'
import {ArgsExpressType} from "../../types/ArgsExpressType";
@ -172,7 +174,10 @@ const handleLiveReload = async (args: ArgsExpressType, padString: string, timeSl
})
setRouteHandler('/', (req: any, res: any) => {
const proxyPath = sanitizeProxyPath(req);
res.send(eejs.require('ep_etherpad-lite/templates/index.html', {req, entrypoint: proxyPath + '/watch/index?hash=' + hash, settings}));
const socialMetaHtml = renderSocialMeta({
req, settings, availableLangs: i18n.availableLangs, locales: i18n.locales, kind: 'home',
});
res.send(eejs.require('ep_etherpad-lite/templates/index.html', {req, entrypoint: proxyPath + '/watch/index?hash=' + hash, settings, socialMetaHtml}));
})
})
@ -196,12 +201,16 @@ const handleLiveReload = async (args: ArgsExpressType, padString: string, timeSl
});
const proxyPath = sanitizeProxyPath(req);
const socialMetaHtml = renderSocialMeta({
req, settings, availableLangs: i18n.availableLangs, locales: i18n.locales, kind: 'pad', padName: req.params.pad,
});
const content = eejs.require('ep_etherpad-lite/templates/pad.html', {
req,
toolbar,
isReadOnly,
entrypoint: proxyPath + '/watch/pad?hash=' + hash,
settings: settings.getPublicSettings()
settings: settings.getPublicSettings(),
socialMetaHtml,
})
res.send(content);
})
@ -227,12 +236,16 @@ const handleLiveReload = async (args: ArgsExpressType, padString: string, timeSl
});
const proxyPath = sanitizeProxyPath(req);
const socialMetaHtml = renderSocialMeta({
req, settings, availableLangs: i18n.availableLangs, locales: i18n.locales, kind: 'timeslider', padName: req.params.pad,
});
const content = eejs.require('ep_etherpad-lite/templates/timeslider.html', {
req,
toolbar,
isReadOnly,
entrypoint: proxyPath + '/watch/timeslider?hash=' + hash,
settings: settings.getPublicSettings()
settings: settings.getPublicSettings(),
socialMetaHtml,
})
res.send(content);
})
@ -342,7 +355,10 @@ exports.expressCreateServer = async (_hookName: string, args: ArgsExpressType, c
// serve index.html under /
args.app.get('/', (req: any, res: any) => {
res.send(eejs.require('ep_etherpad-lite/templates/index.html', {req, settings, entrypoint: "./"+fileNameIndex}));
const socialMetaHtml = renderSocialMeta({
req, settings, availableLangs: i18n.availableLangs, locales: i18n.locales, kind: 'home',
});
res.send(eejs.require('ep_etherpad-lite/templates/index.html', {req, settings, entrypoint: "./"+fileNameIndex, socialMetaHtml}));
});
@ -356,12 +372,16 @@ exports.expressCreateServer = async (_hookName: string, args: ArgsExpressType, c
isReadOnly
});
const socialMetaHtml = renderSocialMeta({
req, settings, availableLangs: i18n.availableLangs, locales: i18n.locales, kind: 'pad', padName: req.params.pad,
});
const content = eejs.require('ep_etherpad-lite/templates/pad.html', {
req,
toolbar,
isReadOnly,
entrypoint: "../"+fileNamePad,
settings: settings.getPublicSettings()
settings: settings.getPublicSettings(),
socialMetaHtml,
})
res.send(content);
});
@ -372,11 +392,15 @@ exports.expressCreateServer = async (_hookName: string, args: ArgsExpressType, c
toolbar,
});
const socialMetaHtml = renderSocialMeta({
req, settings, availableLangs: i18n.availableLangs, locales: i18n.locales, kind: 'timeslider', padName: req.params.pad,
});
res.send(eejs.require('ep_etherpad-lite/templates/timeslider.html', {
req,
toolbar,
entrypoint: "../../"+fileNameTimeSlider,
settings: settings.getPublicSettings()
settings: settings.getPublicSettings(),
socialMetaHtml,
}));
});
} else {

View File

@ -136,6 +136,9 @@ exports.expressPreSession = async (hookName:string, {app}:any) => {
const locales = getAllLocales();
const localeIndex = generateLocaleIndex(locales);
exports.availableLangs = getAvailableLangs(locales);
// Exported so server-rendered HTML (e.g. Open Graph meta tags) can look
// up translated strings without re-reading the locale files.
exports.locales = locales;
app.get('/locales/:locale', (req:any, res:any) => {
// works with /locale/en and /locale/en.json requests

View File

@ -164,6 +164,7 @@ export type SettingsType = {
title: string,
showRecentPads: boolean,
favicon: string | null,
publicURL: string | null,
ttl: {
AccessToken: number,
AuthorizationCode: number,
@ -323,6 +324,18 @@ const settings: SettingsType = {
* Etherpad root directory.
*/
favicon: null,
/**
* Canonical public origin of this Etherpad instance, e.g. "https://pad.example.com".
* When set, it is used to build absolute URLs in server-rendered output (currently
* the Open Graph / Twitter Card meta tags). When null, those URLs fall back to the
* incoming request's protocol+host, which is safe when Host/X-Forwarded-Host
* headers are trusted but should be configured explicitly in production to avoid
* client-controlled origin values appearing in og:url / og:image.
*
* No trailing slash. Must include scheme.
*/
publicURL: null,
ttl: {
AccessToken: 1 * 60 * 60, // 1 hour in seconds
AuthorizationCode: 10 * 60, // 10 minutes in seconds

View File

@ -0,0 +1,182 @@
'use strict';
import type {Request} from 'express';
/**
* Builds the Open Graph + Twitter Card <meta> tag block for the pad page,
* timeslider and homepage. Output values are HTML-escaped pad names are
* user-controlled, so this is the security boundary that prevents reflected
* XSS via crafted pad IDs.
*
* The description text is sourced from Etherpad's i18n catalog under the key
* `pad.social.description`. Operators can override it per-language via the
* standard `customLocaleStrings` mechanism in settings.json.
*/
const SOCIAL_DESCRIPTION_KEY = 'pad.social.description';
const ESCAPE_MAP: {[ch: string]: string} = {
'&': '&amp;',
'<': '&lt;',
'>': '&gt;',
'"': '&quot;',
"'": '&#39;',
};
const escapeHtml = (s: string): string => s.replace(/[&<>"']/g, (c) => ESCAPE_MAP[c]);
const resolveDescription = (
locales: {[lang: string]: {[key: string]: string}} | undefined,
renderLang: string,
): string => {
if (!locales) return '';
// Exact match.
if (locales[renderLang] && locales[renderLang][SOCIAL_DESCRIPTION_KEY]) {
return locales[renderLang][SOCIAL_DESCRIPTION_KEY];
}
// Primary subtag fallback (e.g. de-AT → de).
const primary = renderLang.split('-')[0];
if (locales[primary] && locales[primary][SOCIAL_DESCRIPTION_KEY]) {
return locales[primary][SOCIAL_DESCRIPTION_KEY];
}
// English fallback.
if (locales.en && locales.en[SOCIAL_DESCRIPTION_KEY]) {
return locales.en[SOCIAL_DESCRIPTION_KEY];
}
return '';
};
const toOgLocale = (renderLang: string): string => {
// Open Graph wants `xx_XX`. We already negotiate render language from
// request headers; if it has a region we keep it (lowercased primary,
// uppercased region), otherwise we just emit the primary subtag.
const parts = renderLang.split('-');
if (parts.length >= 2) return `${parts[0].toLowerCase()}_${parts[1].toUpperCase()}`;
return parts[0].toLowerCase();
};
export type SocialMetaOpts = {
url: string,
siteName: string,
title: string,
description: string,
imageUrl: string,
imageAlt: string,
renderLang: string,
};
export const buildSocialMetaHtml = (opts: SocialMetaOpts): string => {
const tag = (prop: string, value: string, attr: 'property' | 'name' = 'property') =>
` <meta ${attr}="${prop}" content="${escapeHtml(value)}">`;
return [
tag('og:type', 'website'),
tag('og:site_name', opts.siteName),
tag('og:title', opts.title),
tag('og:description', opts.description),
tag('og:url', opts.url),
tag('og:image', opts.imageUrl),
tag('og:image:alt', opts.imageAlt),
tag('og:locale', toOgLocale(opts.renderLang)),
tag('twitter:card', 'summary', 'name'),
tag('twitter:title', opts.title, 'name'),
tag('twitter:description', opts.description, 'name'),
tag('twitter:image', opts.imageUrl, 'name'),
tag('twitter:image:alt', opts.imageAlt, 'name'),
].join('\n');
};
// Only the keys are read; values are intentionally unconstrained because the
// i18n module hands us a record whose value shape varies by language.
type AvailableLangs = {[lang: string]: unknown};
// Narrow shape of the global Settings module that this file actually touches.
// Defined locally to avoid coupling socialMeta to the full Settings surface.
type SocialMetaSettings = {
title?: string,
favicon?: string | null,
publicURL?: string | null,
};
const negotiateRenderLang = (req: Request, availableLangs: AvailableLangs): string => {
if (req && typeof req.acceptsLanguages === 'function') {
const negotiated = req.acceptsLanguages(Object.keys(availableLangs));
if (negotiated) return negotiated;
}
return 'en';
};
// Strict hostname[:port] pattern. Rejects header injection (\r\n), userinfo
// (user@host), wildcards, and any non-DNS-character garbage. Length-capped so
// a giant Host header can't blow up the response.
const HOST_RE = /^[a-z0-9]([a-z0-9.-]{0,253}[a-z0-9])?(:\d{1,5})?$/i;
const sanitizeHost = (host: string | undefined): string | null => {
if (!host || host.length > 255) return null;
return HOST_RE.test(host) ? host : null;
};
const sanitizePublicURL = (raw: string | null | undefined): string | null => {
if (!raw || typeof raw !== 'string') return null;
// Must be http(s)://host[:port], no path. Strip trailing slash if present.
const m = raw.replace(/\/+$/, '').match(/^(https?):\/\/([^\/?#]+)$/i);
if (!m) return null;
return sanitizeHost(m[2]) ? `${m[1].toLowerCase()}://${m[2]}` : null;
};
// Builds an absolute URL. Prefers settings.publicURL when configured (operator-
// trusted); otherwise falls back to the request's protocol+Host with strict
// host validation so a crafted Host header can't appear in og:url / og:image.
const buildAbsoluteUrl = (
req: Request, pathname: string, publicURL: string | null | undefined,
): string => {
const trusted = sanitizePublicURL(publicURL);
if (trusted) return `${trusted}${pathname}`;
const proto = req.protocol === 'https' ? 'https' : 'http';
const host = sanitizeHost(req.get && req.get('host')) || 'localhost';
return `${proto}://${host}${pathname}`;
};
const resolveImageUrl = (
req: Request, faviconSetting: string | null | undefined, publicURL: string | null | undefined,
): string => {
if (faviconSetting && /^https?:\/\//i.test(faviconSetting)) return faviconSetting;
return buildAbsoluteUrl(req, '/favicon.ico', publicURL);
};
export type RenderOpts = {
req: Request,
settings: SocialMetaSettings,
availableLangs: AvailableLangs,
locales: {[lang: string]: {[key: string]: string}},
kind: 'pad' | 'timeslider' | 'home',
padName?: string,
};
export const renderSocialMeta = (o: RenderOpts): string => {
const renderLang = negotiateRenderLang(o.req, o.availableLangs);
const siteName = o.settings.title || 'Etherpad';
const description = resolveDescription(o.locales, renderLang);
const imageUrl = resolveImageUrl(o.req, o.settings.favicon, o.settings.publicURL);
const imageAlt = `${siteName} logo`;
let title = siteName;
let pathname = (o.req && o.req.originalUrl) || '/';
if (o.padName) {
// Express has already URL-decoded :pad route params; do not decode again.
if (o.kind === 'pad') title = `${o.padName} | ${siteName}`;
else if (o.kind === 'timeslider') title = `${o.padName} (history) | ${siteName}`;
}
const qIdx = pathname.indexOf('?');
if (qIdx >= 0) pathname = pathname.slice(0, qIdx);
return buildSocialMetaHtml({
url: buildAbsoluteUrl(o.req, pathname, o.settings.publicURL),
siteName,
title,
description,
imageUrl,
imageAlt,
renderLang,
});
};

View File

@ -8,6 +8,7 @@
<html lang="<%=renderLang%>" dir="<%=renderDir%>">
<title><%=settings.title%></title>
<%- typeof socialMetaHtml !== 'undefined' ? socialMetaHtml : '' %>
<meta charset="utf-8">
<link rel="manifest" href="/manifest.json" />
<meta name="referrer" content="no-referrer">

View File

@ -19,6 +19,7 @@
<% e.begin_block("htmlHead"); %>
<% e.end_block(); %>
<title><%=settings.title%></title>
<%- typeof socialMetaHtml !== 'undefined' ? socialMetaHtml : '' %>
<link rel="manifest" href="../../manifest.json" />
<script>
/*

View File

@ -10,6 +10,7 @@
<html lang="<%=renderLang%>" dir="<%=renderDir%>" translate="no" class="pad <%=settings.skinVariants%>">
<head>
<title data-l10n-id="timeslider.pageTitle" data-l10n-args='{ "appTitle": "<%=settings.title%>" }'><%=settings.title%> Timeslider</title>
<%- typeof socialMetaHtml !== 'undefined' ? socialMetaHtml : '' %>
<script>
/*
|@licstart The following is the entire license notice for the

View File

@ -0,0 +1,315 @@
'use strict';
// Unit tests for the pure helpers in src/node/utils/socialMeta.ts. These
// don't touch HTTP/DB — they exercise the helper directly so every branch
// (locale negotiation, fallbacks, escaping, URL building) is covered without
// the cost of an integration test.
const assert = require('assert').strict;
import {buildSocialMetaHtml, renderSocialMeta} from '../../../node/utils/socialMeta';
const ogTag = (html: string, prop: string): string | null => {
const re = new RegExp(
`<meta\\s+(?:property|name)="${prop.replace(/[.*+?^${}()|[\\]/g, '\\$&')}"\\s+content="([^"]*)">`);
const m = html.match(re);
return m ? m[1] : null;
};
const fakeReq = (overrides: any = {}) => ({
protocol: 'https',
get: (h: string) => h === 'host' ? 'pad.example' : '',
acceptsLanguages: (langs: string[]) => 'en',
originalUrl: '/p/Foo',
params: {pad: 'Foo'},
...overrides,
});
const baseSettings = {title: 'Etherpad', favicon: null};
const enLocales = {en: {'pad.social.description': 'English desc'}};
describe(__filename, function () {
describe('buildSocialMetaHtml', function () {
it('emits all 13 OG + Twitter Card tags', function () {
const html = buildSocialMetaHtml({
url: 'https://x/p/Foo',
siteName: 'Etherpad',
title: 'Foo | Etherpad',
description: 'd',
imageUrl: 'https://x/favicon.ico',
imageAlt: 'Etherpad logo',
renderLang: 'en',
});
const expected = [
['property', 'og:type'], ['property', 'og:site_name'], ['property', 'og:title'],
['property', 'og:description'], ['property', 'og:url'], ['property', 'og:image'],
['property', 'og:image:alt'], ['property', 'og:locale'],
['name', 'twitter:card'], ['name', 'twitter:title'], ['name', 'twitter:description'],
['name', 'twitter:image'], ['name', 'twitter:image:alt'],
];
for (const [, prop] of expected) {
assert.ok(ogTag(html, prop) != null, `missing tag: ${prop}`);
}
});
it('HTML-escapes every interpolated value', function () {
const evil = '"><script>alert(1)</script>';
const html = buildSocialMetaHtml({
url: evil, siteName: evil, title: evil, description: evil,
imageUrl: evil, imageAlt: evil, renderLang: 'en',
});
assert.ok(!/<script>/i.test(html), 'no raw <script> in output');
assert.ok(!/"><script/.test(html), 'no attribute breakout');
assert.ok(html.includes('&lt;script&gt;alert(1)&lt;/script&gt;'),
'tags HTML-encoded');
});
it('emits og:locale as xx_XX for region tags', function () {
const html = buildSocialMetaHtml({
url: '/', siteName: 'E', title: 'T', description: 'd',
imageUrl: '/f', imageAlt: 'a', renderLang: 'pt-BR',
});
assert.equal(ogTag(html, 'og:locale'), 'pt_BR');
});
it('emits og:locale as just primary for bare lang tags', function () {
const html = buildSocialMetaHtml({
url: '/', siteName: 'E', title: 'T', description: 'd',
imageUrl: '/f', imageAlt: 'a', renderLang: 'fr',
});
assert.equal(ogTag(html, 'og:locale'), 'fr');
});
it('twitter:card is always summary', function () {
const html = buildSocialMetaHtml({
url: '/', siteName: 'E', title: 'T', description: 'd',
imageUrl: '/f', imageAlt: 'a', renderLang: 'en',
});
assert.equal(ogTag(html, 'twitter:card'), 'summary');
});
});
describe('renderSocialMeta — title composition', function () {
it('pad: "{padName} | {siteName}"', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: baseSettings, availableLangs: {en: {}},
locales: enLocales, kind: 'pad', padName: 'MyPad',
});
assert.equal(ogTag(html, 'og:title'), 'MyPad | Etherpad');
});
it('timeslider: "{padName} (history) | {siteName}"', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: baseSettings, availableLangs: {en: {}},
locales: enLocales, kind: 'timeslider', padName: 'MyPad',
});
assert.equal(ogTag(html, 'og:title'), 'MyPad (history) | Etherpad');
});
it('home: just the site name', function () {
const html = renderSocialMeta({
req: fakeReq({originalUrl: '/'}), settings: baseSettings,
availableLangs: {en: {}}, locales: enLocales, kind: 'home',
});
assert.equal(ogTag(html, 'og:title'), 'Etherpad');
});
it('uses default site name "Etherpad" when settings.title is empty', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: {title: '', favicon: null},
availableLangs: {en: {}}, locales: enLocales, kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:title'), 'P | Etherpad');
});
});
describe('renderSocialMeta — description from i18n', function () {
it('exact locale match wins', function () {
const html = renderSocialMeta({
req: fakeReq({acceptsLanguages: () => 'de'}),
settings: baseSettings, availableLangs: {en: {}, de: {}},
locales: {
en: {'pad.social.description': 'En'},
de: {'pad.social.description': 'De'},
},
kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:description'), 'De');
});
it('region tag falls back to primary subtag', function () {
const html = renderSocialMeta({
req: fakeReq({acceptsLanguages: () => 'de-CH'}),
settings: baseSettings, availableLangs: {en: {}, de: {}, 'de-CH': {}},
locales: {
en: {'pad.social.description': 'En'},
de: {'pad.social.description': 'De'},
},
kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:description'), 'De');
});
it('unknown locale falls back to English', function () {
const html = renderSocialMeta({
req: fakeReq({acceptsLanguages: () => 'ja'}),
settings: baseSettings, availableLangs: {en: {}, ja: {}},
locales: {en: {'pad.social.description': 'En'}},
kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:description'), 'En');
});
it('emits empty description if locale catalog has no entry', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: baseSettings, availableLangs: {en: {}},
locales: {}, kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:description'), '');
});
});
describe('renderSocialMeta — image URL', function () {
it('builds absolute URL to /favicon.ico when settings.favicon is null', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: baseSettings, availableLangs: {en: {}},
locales: enLocales, kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:image'), 'https://pad.example/favicon.ico');
});
it('uses settings.favicon verbatim when it is an absolute URL', function () {
const html = renderSocialMeta({
req: fakeReq(),
settings: {title: 'Etherpad', favicon: 'https://cdn.example/icon.png'},
availableLangs: {en: {}}, locales: enLocales, kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:image'), 'https://cdn.example/icon.png');
});
it('image:alt is "{siteName} logo"', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: {title: 'MyPad Server', favicon: null},
availableLangs: {en: {}}, locales: enLocales, kind: 'pad', padName: 'P',
});
assert.equal(ogTag(html, 'og:image:alt'), 'MyPad Server logo');
assert.equal(ogTag(html, 'twitter:image:alt'), 'MyPad Server logo');
});
});
describe('renderSocialMeta — URL handling', function () {
it('builds canonical og:url from req.protocol/host/originalUrl', function () {
const html = renderSocialMeta({
req: fakeReq({protocol: 'http', originalUrl: '/p/Foo'}),
settings: baseSettings, availableLangs: {en: {}}, locales: enLocales,
kind: 'pad', padName: 'Foo',
});
assert.equal(ogTag(html, 'og:url'), 'http://pad.example/p/Foo');
});
it('strips query string from canonical og:url', function () {
const html = renderSocialMeta({
req: fakeReq({originalUrl: '/p/Foo?utm_source=tweet'}),
settings: baseSettings, availableLangs: {en: {}}, locales: enLocales,
kind: 'pad', padName: 'Foo',
});
assert.equal(ogTag(html, 'og:url'), 'https://pad.example/p/Foo');
});
it('prefers settings.publicURL over request-derived origin', function () {
const html = renderSocialMeta({
req: fakeReq({protocol: 'http', get: (h: string) => h === 'host' ? 'evil.com' : '', originalUrl: '/p/Foo'}),
settings: {title: 'Etherpad', favicon: null, publicURL: 'https://pad.canonical.example'},
availableLangs: {en: {}}, locales: enLocales, kind: 'pad', padName: 'Foo',
});
assert.equal(ogTag(html, 'og:url'), 'https://pad.canonical.example/p/Foo');
assert.equal(ogTag(html, 'og:image'), 'https://pad.canonical.example/favicon.ico');
});
it('strips trailing slash from settings.publicURL', function () {
const html = renderSocialMeta({
req: fakeReq({originalUrl: '/p/Foo'}),
settings: {title: 'Etherpad', favicon: null, publicURL: 'https://pad.example///'},
availableLangs: {en: {}}, locales: enLocales, kind: 'pad', padName: 'Foo',
});
assert.equal(ogTag(html, 'og:url'), 'https://pad.example/p/Foo');
});
it('ignores malformed settings.publicURL and falls back to request', function () {
// No scheme, has path, contains userinfo — all rejected.
for (const bad of ['pad.example', 'http:///foo', 'https://user@pad.example', 'javascript:alert(1)']) {
const html = renderSocialMeta({
req: fakeReq({originalUrl: '/p/Foo'}),
settings: {title: 'Etherpad', favicon: null, publicURL: bad},
availableLangs: {en: {}}, locales: enLocales, kind: 'pad', padName: 'Foo',
});
assert.equal(ogTag(html, 'og:url'), 'https://pad.example/p/Foo',
`should fall back for malformed publicURL: ${bad}`);
}
});
it('rejects invalid Host header values when no publicURL is configured', function () {
// Whether a vulnerable proxy lets header injection through or not, the
// helper must not echo a non-DNS-shaped Host into og:url.
for (const bad of ['evil.com\r\nX-Injected: 1', 'user@evil.com', '<script>', '*']) {
const html = renderSocialMeta({
req: fakeReq({get: (h: string) => h === 'host' ? bad : '', originalUrl: '/p/Foo'}),
settings: baseSettings, availableLangs: {en: {}}, locales: enLocales,
kind: 'pad', padName: 'Foo',
});
const url = ogTag(html, 'og:url') || '';
assert.ok(!url.includes('\n') && !url.includes('\r'), `CRLF leaked: ${url}`);
assert.ok(!url.includes('<') && !url.includes('>'), `HTML leaked: ${url}`);
assert.ok(!url.includes('@'), `userinfo leaked: ${url}`);
assert.ok(url.startsWith('https://localhost/'), `unexpected fallback: ${url}`);
}
});
it('caps protocol to http or https — no smuggled schemes', function () {
// If something upstream lets req.protocol be a weird value (e.g. via a
// crafted X-Forwarded-Proto), we still emit only http or https.
const html = renderSocialMeta({
req: fakeReq({protocol: 'javascript', originalUrl: '/p/Foo'}),
settings: baseSettings, availableLangs: {en: {}}, locales: enLocales,
kind: 'pad', padName: 'Foo',
});
const url = ogTag(html, 'og:url') || '';
assert.ok(url.startsWith('http://') || url.startsWith('https://'),
`unexpected scheme in og:url: ${url}`);
});
it('does not double-decode pad names containing literal "%"', function () {
// Express decodes /p/100%25 to req.params.pad === "100%". Calling
// decodeURIComponent("100%") would throw URIError. Verify the helper
// accepts "100%" verbatim and renders it without throwing.
assert.doesNotThrow(() => {
const html = renderSocialMeta({
req: fakeReq({originalUrl: '/p/100%25'}),
settings: baseSettings, availableLangs: {en: {}}, locales: enLocales,
kind: 'pad', padName: '100%',
});
assert.equal(ogTag(html, 'og:title'), '100% | Etherpad');
});
});
});
describe('renderSocialMeta — XSS', function () {
it('escapes < > " & in pad names', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: baseSettings, availableLangs: {en: {}},
locales: enLocales, kind: 'pad', padName: '<img src=x onerror=alert(1)>',
});
assert.ok(!html.includes('<img src=x'),
'raw HTML must not appear in output');
assert.ok(html.includes('&lt;img'), 'tag opener escaped');
});
it('escapes pad name containing a quote that could break out of content=""', function () {
const html = renderSocialMeta({
req: fakeReq(), settings: baseSettings, availableLangs: {en: {}},
locales: enLocales, kind: 'pad', padName: 'X"><script>alert(1)</script>',
});
assert.ok(!/"><script/.test(html), 'must not allow attribute breakout');
assert.ok(html.includes('&quot;'), 'quote escaped');
});
});
});

View File

@ -0,0 +1,124 @@
'use strict';
import {MapArrayType} from "../../../node/types/MapType";
const assert = require('assert').strict;
const common = require('../common');
import settings from '../../../node/utils/Settings';
const ogTag = (html: string, prop: string): string | null => {
const re = new RegExp(
`<meta\\s+(?:property|name)="${prop.replace(/[.*+?^${}()|[\\]/g, '\\$&')}"\\s+content="([^"]*)">`);
const m = html.match(re);
return m ? m[1] : null;
};
describe(__filename, function () {
let agent: any;
const backup: MapArrayType<any> = {};
before(async function () {
agent = await common.init();
});
beforeEach(async function () {
backup.title = settings.title;
backup.favicon = settings.favicon;
});
afterEach(async function () {
settings.title = backup.title;
settings.favicon = backup.favicon;
});
describe('pad page', function () {
it('emits og:title with pad name and site title', async function () {
const res = await agent.get('/p/TestPad7599').expect(200);
assert.equal(ogTag(res.text, 'og:title'), `TestPad7599 | ${settings.title}`);
});
it('emits og:description from the i18n catalog (English default)', async function () {
const res = await agent.get('/p/TestPad7599')
.set('Accept-Language', 'en').expect(200);
const desc = ogTag(res.text, 'og:description');
// Sourced from src/locales/en.json under "pad.social.description".
assert.ok(desc && desc.length > 0, `og:description should be non-empty, got: ${desc}`);
assert.match(desc!, /collaborative/i);
});
it('falls back to English description when language has no override', async function () {
// Most non-English locales do not yet translate pad.social.description,
// so a request in (e.g.) Japanese should still receive the English string.
const res = await agent.get('/p/TestPad7599')
.set('Accept-Language', 'ja').expect(200);
const desc = ogTag(res.text, 'og:description');
assert.ok(desc && desc.length > 0,
'og:description should fall back to en, not be empty');
});
it('emits og:image and og:image:alt', async function () {
const res = await agent.get('/p/TestPad7599').expect(200);
const img = ogTag(res.text, 'og:image');
assert.match(img || '', /\/favicon\.ico$/);
assert.equal(ogTag(res.text, 'og:image:alt'), `${settings.title} logo`);
});
it('emits og:locale', async function () {
const res = await agent.get('/p/TestPad7599')
.set('Accept-Language', 'en').expect(200);
const locale = ogTag(res.text, 'og:locale');
assert.match(locale || '', /^en/);
});
it('uses the Express-decoded pad name in og:title', async function () {
// %2D is "-"; Express decodes the route param before we see it, so
// og:title contains the decoded form.
const res = await agent.get('/p/Has%2DDash7599').expect(200);
const title = ogTag(res.text, 'og:title');
assert.ok(title && title.startsWith('Has-Dash7599 | '),
`unexpected og:title: ${title}`);
});
it('does not throw for pad names containing literal "%"', async function () {
// /p/100%25 → Express decodes to req.params.pad === "100%". A naive
// second decodeURIComponent call would throw URIError; this test
// guards that regression.
const res = await agent.get('/p/100%25Test').expect(200);
assert.ok(ogTag(res.text, 'og:title'), 'og:title should still render');
});
it('HTML-escapes pad names to prevent XSS via crafted IDs', async function () {
const res = await agent.get('/p/' + encodeURIComponent('<script>alert(1)</script>'))
.expect((r: any) => {
// Etherpad may 404 or render — either is fine, but no raw <script>
// injected via og:title.
});
const ogTitle = ogTag(res.text || '', 'og:title');
if (ogTitle != null) {
assert.ok(!/<script>/i.test(ogTitle),
`og:title leaked raw HTML: ${ogTitle}`);
}
});
it('emits twitter:card summary', async function () {
const res = await agent.get('/p/TestPad7599').expect(200);
assert.equal(ogTag(res.text, 'twitter:card'), 'summary');
});
});
describe('timeslider', function () {
it('og:title contains the (history) marker', async function () {
const res = await agent.get('/p/TestPad7599/timeslider').expect(200);
const title = ogTag(res.text, 'og:title');
assert.ok(title && title.includes('(history)'),
`unexpected timeslider og:title: ${title}`);
});
});
describe('homepage', function () {
it('og:title equals settings.title', async function () {
const res = await agent.get('/').expect(200);
assert.equal(ogTag(res.text, 'og:title'), settings.title);
});
});
});