John McLear bbd29683d9
ci(frontend-tests): exclude ep_cursortrace + un-flake 30 of 31 #7611 skips (#7630)
* test(playwright): wait for editor editability in goToNewPad/goToPad

#editorcontainer.initialized fires after padeditor.init resolves but
before ace flips the inner body from `class="static"` /
contentEditable=false to editable. Under WITH_PLUGINS load in Firefox
that flip can lag long enough that an immediate click + keyboard.type
runs against a still-static body and is silently dropped — the body
keeps showing the default welcome text and never sees our input.

Most of the specs that currently carry `test.skip(WITH_PLUGINS)`
markers (#7611) are racing exactly this flip. Block in goToNewPad /
goToPad until the inner #innerdocbody is `contenteditable="true"`,
so every spec starts from a known-ready editor without each having
to add its own ad-hoc waits.

Value-driven: exits as soon as ace flips the attribute, no fixed
delay. Refactored into a private waitForEditorReady() helper so
goToNewPad and goToPad share a single source of truth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip bold.spec under WITH_PLUGINS

The two skipped tests fail because clicking the bold toolbar button
right after selectAllText is intercepted by the #toolbar-overlay div
(same root cause that needed force:true in clearAuthorship and
ep_align). Add force:true to the click and drop the
test.skip(WITH_PLUGINS) markers.

The keypress variant doesn't click a toolbar button — it relies on
the editor being editable when keyboard.press fires. The previous
commit (waitForEditorReady in goToNewPad) covers that.

Proof-of-concept un-skip; if CI confirms both pass, will expand the
same pattern to the rest of the #7611 skip set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): make bold.spec robust to Firefox + WITH_PLUGINS

The previous attempt at un-skipping these tests added force:true on
the toolbar click but left the legacy selectAllText + keyboard.type
sequence in place. Firefox under WITH_PLUGINS load racily drops
keystrokes from per-key events, leaving an empty selection that the
bold-on-click and Ctrl+B branches both no-op'd against — the asserts
then timed out 5 retries deep with no <b> element.

Replace the selectAllText + keyboard.type prelude with the standard
clearPadContent + writeToPad pair. writeToPad uses insertText (one
input event for the whole string) which is the same fix that
unblocked ep_align in #7625.

Verified locally on Firefox + WITH_PLUGINS=1: 2/2 pass in 15s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip 4 writeToPad-only specs under WITH_PLUGINS

These four specs were marked test.skip(WITH_PLUGINS) for "flaky in
with-plugins suite" but only use writeToPad / clearPadContent /
goToNewPad — no direct keyboard.type, no toolbar button clicks. The
flake was the editor not being ready when the test's first
interaction fired (now covered by waitForEditorReady in
goToNewPad/goToPad earlier in this branch) plus writeToPad's switch
to insertText (#7625).

  - urls_become_clickable.spec.ts (file-level skip)
  - unaccepted_commit_warning.spec.ts
  - undo_clear_authorship.spec.ts
  - timeslider_follow.spec.ts

Just removing the skip lines is enough; no other changes needed.

Verified locally on Firefox + WITH_PLUGINS=1: all 40 tests across
the four specs pass in 3m1s. urls_become_clickable contributes the
bulk (37 tests via parameterised describes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip page_up_down and timeslider_line_numbers under WITH_PLUGINS

Both specs use writeToPad + keyboard.press for Page Up/Down, End,
arrow keys, and the like — no per-character keyboard.type, no
toolbar button clicks. The flake was the editor not being ready
when the spec's first interaction fired (now covered by
waitForEditorReady earlier in this branch) plus writeToPad's switch
to insertText (#7625) for the multi-line setup.

  - page_up_down.spec.ts (3 skips)
  - timeslider_line_numbers.spec.ts (1 skip)

Verified locally on Firefox + WITH_PLUGINS=1: 5/5 tests pass.

enter.spec.ts deliberately left skipped — its Enter-in-a-loop test
(line 33) drops keypresses under load and needs a value-driven
per-iteration verify, separate change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip chat/list_wrap/clear_authorship; re-skip undo_clear_authorship

Three more files cleared after the editor-ready helper landed:

  - chat.spec.ts (2 skips) — both clicks target settings-popup
    checkboxes, not toolbar buttons; the toolbar-overlay isn't
    in play, so just dropping the skips is enough.
  - clear_authorship_color.spec.ts (1) — uses the existing
    clearAuthorship helper, which already runs with force:true.
  - list_wrap_indent.spec.ts (1) — adds force:true to the
    .buttonicon-insertorderedlist click that fires after
    selectAllText (same pattern as bold.spec).

Reverts the un-skip on undo_clear_authorship.spec.ts: that one
spawns two browser contexts and races against User B's writeToPad
landing in the second pad. Hit a real flake locally where User B's
text never appeared. Needs a per-user "wait for text to commit"
before the assertion. Re-add the skip until that fix is in.

Verified locally on Firefox + WITH_PLUGINS=1: 16 passed across
the three un-skipped files (one undo_clear_authorship retry
flaked, hence the revert).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip alphabet/delete/select_focus_restore under WITH_PLUGINS

  - alphabet.spec.ts (1) — swapped page.keyboard.type for writeToPad
  - delete.spec.ts (1) — same swap
  - select_focus_restore.spec.ts (1) — left keyboard.type in place
    (the test specifically verifies that focus returns to the editor
    after a toolbar select change; replacing with writeToPad would
    re-focus the body via a click and mask the bug being asserted).
    Editor-ready wait alone is enough here.

Verified locally on Firefox + WITH_PLUGINS=1: 3/3 pass in 23s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip bold_paste + undo_redo_scroll under WITH_PLUGINS

  - bold_paste.spec.ts (1) — already used writeToPad; just dropping
    the skip is enough now that the editor-ready helper landed.
  - undo_redo_scroll.spec.ts (2) — replaced the
    `for (45 lines) { keyboard.type; keyboard.press('Enter') }` loop
    with a single writeToPad of `lines.join('\\n') + '\\n'`. writeToPad
    drives input via insertText (one input event per line) which
    Firefox under WITH_PLUGINS load handles without dropping events.
    The Ctrl+Z scroll-to-caret behaviour the test asserts is
    unchanged — each line still lands in its own changeset for the
    undo module to reverse.

Verified locally on Firefox + WITH_PLUGINS=1: bold_paste passes
clean; undo_redo_scroll passes via the existing per-spec
`retries: 2` config (the scroll timing race exists pre-change and
is what motivates the retries).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip unordered_list 'enter for the new line' under WITH_PLUGINS

  - Add force:true on the .buttonicon-insertunorderedlist click to
    bypass #toolbar-overlay (same pattern as clearAuthorship and
    bold.spec).
  - Replace the
      keyboard.type('line 1'); keyboard.press('Enter');
      keyboard.type('line 2'); keyboard.press('Enter');
    sequence with a single writeToPad('line 1\\nline 2\\n') —
    insertText per line + Enter between, which Firefox under
    WITH_PLUGINS load handles without dropping events. The trailing
    newline preserves the final Enter the original spec relied on.

Verified locally on Firefox + WITH_PLUGINS=1: passes in 8s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip all 4 ordered_list tests under WITH_PLUGINS

  - issue #4748 + #1125: add force:true on
    .buttonicon-insertorderedlist clicks (toolbar-overlay
    interception after selection); collapse the per-line
    keyboard.type + keyboard.press('Enter') sequences into single
    writeToPad calls with embedded newlines.
  - issue #5160 and #5718 already used force:true and writeToPad
    throughout; just dropping the skip is enough now that the
    editor-ready helper landed.

Verified locally on Firefox + WITH_PLUGINS=1: 11 passed (4 ordered_list
+ 5 unordered_list, plus 2 sub-describes). 1m24s total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip all 4 indentation tests under WITH_PLUGINS

Same pattern as bold/ordered_list/unordered_list:
  - force:true on .buttonicon-indent / .buttonicon-bold /
    .buttonicon-outdent clicks (toolbar-overlay interception
    after a text selection).
  - Replace per-line keyboard.type + keyboard.press('Enter')
    sequences with single writeToPad calls using \\n separators.
  - Replace single-character keyboard.type calls (':', '(', '[',
    '{{') with keyboard.insertText for consistency.

The keypress and indent/outdent button tests were already passing
without WITH_PLUGINS skips — only the four tests that race the
toolbar click + typing sequence were skipped. With force:true and
writeToPad they're stable.

Verified locally on Firefox + WITH_PLUGINS=1: 12 tests pass across
indentation, ordered_list, unordered_list, list_wrap_indent
(matched by the indent grep). 1m11s total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip enter.spec 'enter is always visible' under WITH_PLUGINS

The test fired 15 keypress('Enter') calls in a tight loop with no
per-iteration verify. Under Firefox + WITH_PLUGINS load the
editor's input pipeline can't always keep up while plugin hooks
are warming, so a few presses get dropped and the final
`expect(div.count).toBe(numberOfLines + originalLength)` fails
with too few lines.

Add a value-driven `expect(div).toHaveCount(originalLength + i + 1)`
after each press. The loop only advances once the editor has
acknowledged the previous Enter, so dropped events become slow
events instead of lost ones.

Verified locally on Firefox + WITH_PLUGINS=1: passes in 11s
(would have been 1.5m timeout previously).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip undo_clear_authorship under WITH_PLUGINS

The two-user test was racing on User B's keyboard.type('Hello from
User B') and 'Still connected!' — Firefox + WITH_PLUGINS load drops
keystrokes from per-key events, leaving the second pad with
truncated text that the body1 round-trip assertion never matches.

Replace both keyboard.type calls with keyboard.insertText (single
input event). Cannot use writeToPad here because the test relies on
the caret position established by the preceding End + Enter — a
writeToPad would re-click the body and reset focus.

Verified locally on Firefox + WITH_PLUGINS=1: both tests pass clean
in 30s (previously failed all retries at 1m+ each). The
test.describe.configure({retries: 2}) is kept as belt-and-braces
for the multi-context server propagation race that this test
exercises legitimately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): un-skip collab_client 'bug #4978 regression test' under WITH_PLUGINS

The test's replaceLineText helper used keyboard.type(newText) to
insert the replacement string after a Backspace clear. Firefox under
WITH_PLUGINS load drops keystrokes from per-key events, leaving the
line with truncated text that the cross-context assertions
(body1.toHaveText(user2Text), body2.toHaveText(user1Text)) never
match.

Switch the type to keyboard.insertText (single input event) — same
fix that unblocked ep_align in #7625 and the other typing-races in
this branch. The selectText + Backspace + insertText pattern still
exercises the legitimate collab race the test asserts (concurrent
edits over the COLLABROOM).

Verified locally on Firefox + WITH_PLUGINS=1: passes in 15s.

This was the last of the 31 test.skip(WITH_PLUGINS, '#7611') markers
in src/tests/frontend-new/specs/. The branch goal of zero #7611
skips is met.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): use stable l10n selector for OL toolbar button

Qodo flagged the .first() call in #4748's setup as DOM-order
dependent: a future plugin that adds another element carrying the
.buttonicon-insertorderedlist class would silently change which
button the test clicks. Switch to
button[data-l10n-id='pad.toolbar.ol.title'] (the localizationId
declared in src/node/utils/toolbar.ts), which is unique to the core
ordered-list toolbar entry. Drop the now-unnecessary .first().

The class-based locator remains in #5160, #5718, and the indent/
outdent sub-describes; those don't strict-mode-match more than one
element today, but a follow-up could swap them too for consistency
if reviewers want.

Verified locally on Firefox + WITH_PLUGINS=1: passes in 7s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): tighten writeToPad Enter delivery + fix toolbar overlay regressions

Three fixes for the failures that surfaced once #7630 ran in CI on
Firefox + WITH_PLUGINS at the full matrix:

1. **writeToPad** now value-waits per Enter and retries up to 3
   times if the editor doesn't acknowledge a new line. Long
   multi-line writes (e.g. timeslider_follow's #4389 setup with
   ~120 newlines) were dropping Enters faster than the previous
   single-press loop tolerated. The retry surfaces the canonical
   "expected N, got M" timeout if all 3 attempts fail.

2. **unordered_list.spec.ts**: every `.buttonicon-*` toolbar click
   now uses {force: true}. Two of the un-skipped tests intermittently
   missed the click under load because #toolbar-overlay intercepts
   pointer events after a text selection (same pattern as bold,
   ep_align, et al.). Body clicks (clicks inside the iframe pad
   body) are unaffected and stay as plain `.click()`.

3. **timeslider_follow.spec.ts** "regression test for #4389":
   re-skipped under WITH_PLUGINS with a specific note. The 120-Enter
   setup races plugin load even with the new writeToPad retry —
   re-press attempts overshoot the exact line count when a "dropped"
   Enter eventually lands. Needs a fundamentally different setup
   approach (REST API import, clipboard paste, etc.) to un-skip
   reliably; out of scope here.

Net: 30 of the original 31 #7611 skips remain removed (was 31/31
before; the one re-skip is a documented known-aggressive case).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): revert writeToPad per-Enter retry — overshoots cause more failures

The per-Enter value-wait + retry I added in fc45d71e5 was meant to
catch dropped Enters in long multi-line writes, but in CI it made
things worse: when a "dropped" Enter eventually landed during the
retry's short poll window, the next iteration's exact line-count
expectation was off by one and the retry loop overshot, breaking
tests that previously passed (urls_become_clickable, language,
inner_height all hit toHaveCount mismatches that didn't exist
before).

Revert to the simpler insertText + bare keyboard.press('Enter')
loop. Tests with extreme line counts (timeslider_follow #4389,
~120 Enters) stay re-skipped from the prior commit; everything
else accepts the same intermittent flake the helper exhibited
before this fix attempt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(playwright): re-skip 8 tests that need deeper rework to un-skip

Honest scope adjustment after CI surfaced load-dependent failures
that local single-run verification missed. The previous batches
worked at low concurrency but flake at the full Playwright matrix
under WITH_PLUGINS:

  - bold_paste.spec.ts — clipboard / paste race between specs
  - collab_client.spec.ts (bug #4978) — multi-context cross-pad
    propagation under load
  - enter.spec.ts (enter is always visible) — 15-Enter loop drops
    presses faster than the per-iteration value-wait can recover
  - timeslider_follow.spec.ts (content as it's added) — 66 sequential
    Enters across 6 writeToPad calls
  - undo_clear_authorship.spec.ts (describe-level) — multi-context;
    the cross-pad text-arrival assertion races
  - undo_redo_scroll.spec.ts (describe-level) — 45-line writeToPad
    setup; scroll-position assertion needs stable layout
  - unordered_list.spec.ts (Keeps unordered list on enter) — toolbar
    click + writeToPad with embedded newline races

All carry inline comments explaining the specific load issue and
referencing #7611 so a follow-up that introduces a REST-driven or
clipboard-paste setup mechanism can target them concretely.

Net: 23 of 31 #7611 skips removed (74%). The deferred 8 share two
underlying limitations that need infrastructure work:
  1. No reliable way to drive >10 sequential Enters under load
     without occasional drops
  2. No reliable cross-context propagation wait helper

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* DO-NOT-MERGE bisect plugins: Firefox×HALF-A + Firefox×HALF-B

One CI run, both halves of the standard plugin set, both on Firefox
(which is the project that reliably trips the flake we're chasing).

  Playwright Firefox with plugins  → HALF A: ep_align, ep_author_hover,
                                      ep_cursortrace, ep_font_size,
                                      ep_headings2
  Playwright Chrome with plugins   → HALF B: ep_markdown, ep_readonly_guest,
                                      ep_set_title_on_pad, ep_spellcheck,
                                      ep_subscript_and_superscript,
                                      ep_table_of_contents
                                      (job runs --project=firefox here too)

Decision matrix on next CI:
  - Both fail        → load alone is the cause; deeper rework needed.
  - Only A fails     → culprit is in HALF A (5 candidates).
  - Only B fails     → culprit is in HALF B (6 candidates).
  - Both pass        → flake threshold sits between 5–6 plugins; the
                        culprit is whichever 2-plugin pair from the full
                        set tips the load above threshold; iterate.

Revert this commit before merging — it's purely a CI probe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* DO-NOT-MERGE bisect plugins iter 2: A1 (align,author_hover) vs A2 (cursortrace,font_size,headings2)

Iteration 1 isolated to HALF A. Splitting:
  Playwright Firefox with plugins → A1: ep_align, ep_author_hover
  Playwright Chrome with plugins  → A2: ep_cursortrace, ep_font_size,
                                         ep_headings2 (still --project=firefox)

Decision matrix:
  - Both fail        → load alone tips it; ≥2 of these 5 are needed.
  - Only A1 fails    → culprit is ep_align or ep_author_hover.
  - Only A2 fails    → culprit is ep_cursortrace, ep_font_size, or ep_headings2.
  - Both pass        → flake threshold is between 2 and 3 plugins from A,
                        revisit splitting (could be a specific pair).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* DO-NOT-MERGE bisect plugins iter 3: A2a (cursortrace) vs A2b (font_size, headings2)

Iteration 2 isolated to A2 (cursortrace+font_size+headings2).
Iter 3 singles out ep_cursortrace:

  Playwright Firefox with plugins → A2a: ep_cursortrace
  Playwright Chrome with plugins  → A2b: ep_font_size, ep_headings2
                                         (still --project=firefox)

Decision matrix:
  - Only A2a fails   → ep_cursortrace is the culprit (1 plugin alone tips it).
  - Only A2b fails   → culprit is ep_font_size or ep_headings2.
  - Both fail        → load tips at >=1 plugin from this set; investigate
                        each individually.
  - Both pass        → load tips at >=3 plugins; revisit splitting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* DO-NOT-MERGE bisect plugins iter 4 (confirm): all-minus-cursortrace

Iter 3 isolated to ep_cursortrace alone. Confirming by running the
inverse — every other plugin in the standard set, no ep_cursortrace —
on TWO Firefox runs in parallel:

  Playwright Firefox with plugins → align, author_hover, font_size,
                                     headings2, markdown,
                                     readonly_guest, set_title_on_pad,
                                     spellcheck,
                                     subscript_and_superscript,
                                     table_of_contents
  Playwright Chrome with plugins  → same 10 plugins (still
                                     --project=firefox per probe)

Both pass → ep_cursortrace is conclusively the culprit.
Either fails → load is the cause and the bisection mis-attributed
              (would need to investigate why iter 3 cursortrace-only
              failed: maybe a flaky one-off).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(frontend-tests): exclude ep_cursortrace from with-plugins set

Bisected via 4 CI iterations on this branch. ep_cursortrace's
`aceEditEvent` hook (static/js/main.js in the plugin) fires on every
keyboard event — handleClick, handleKeyEvent, idleWorkTimer — and
unconditionally sends a `cursorPosition` socket message via
`pad.collabClient.sendMessage` per call. Under the test harness's
writeToPad bursts (insertText + Enter loops) that stream of socket
messages saturates the editor's input pipeline in Firefox
specifically, causing intermittent keystroke drops and the entire
class of #7611 flakiness this PR was originally chasing.

Confirmation runs:
  - 11-plugin set including ep_cursortrace            → fails on Firefox
  - HALF B (5 plugins, no cursortrace)                → passes
  - HALF A (5 plugins, with cursortrace)              → fails
  - A1 (align, author_hover) — no cursortrace         → passes
  - A2 (cursortrace, font_size, headings2)            → fails
  - A2a (cursortrace alone, 1 plugin)                 → fails
  - A2b (font_size, headings2, no cursortrace)        → passes
  - 10-plugin set, all minus ep_cursortrace           → passes (×2 jobs)

Drop ep_cursortrace from the frontend-tests.yml plugin set and
restore all the un-skips that this PR pessimistically re-skipped
during the load-symptom whack-a-mole. The plugin itself needs a
debounce/throttle around its socket send before it can come back
into the test set; tracked separately in the ep_cursortrace repo.

Backend tests / docker / etc remain on the original 11-plugin set
since they don't trip the same input-pipeline race.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:19:54 +01:00
..
2020-10-23 20:31:17 +01:00