fix: correct inverted park condition in PooledTaskExecutorTest.LongTask by Copilot · Pull Request #971 · OpenIdentityPlatform/OpenAM

Copilot · 2026-03-26T11:37:16Z

Initial fix: correct inverted while loop condition in LongTask.execute()
Deeper fix: replace the AtomicBoolean + LockSupport mechanism in LongTask with a CountDownLatch
Reviewer feedback: use latch.await(30, TimeUnit.SECONDS) with fail() on timeout
Reviewer feedback: on InterruptedException, re-interrupt the thread and throw DataLayerException
Revert duplicate throw statement introduced by commit b25d626 (caused compile error)

Original prompt

Problem

The CI job https://github.com/OpenIdentityPlatform/OpenAM/actions/runs/23591340704/job/68697595457 is failing due to a flaky/broken test:

[ERROR] org.forgerock.openam.sm.datalayer.impl.PooledTaskExecutorTest.testExecute -- Time elapsed: 300.3 s <<< FAILURE!
[ERROR] PooledTaskExecutorTest.testExecute:120 [Task 1 thread running]
Expecting value to be false but was true expected:<[fals]e> but was:<[tru]e>

The test times out (300 seconds) because a thread never terminates — task1.isAlive() returns true after join().

Root Cause

In openam-core/src/test/java/org/forgerock/openam/sm/datalayer/impl/PooledTaskExecutorTest.java, the LongTask.execute() method has inverted locking logic:

// Current (BROKEN) code
public void execute(TokenStorageAdapter adapter) throws DataLayerException {
    this.executingThread = Thread.currentThread();
    debug("Locking");
    locked.set(true);
    while (!locked.compareAndSet(false, true)) {  // ← BUG: inverted condition
        debug("Task still locked - parking thread");
        LockSupport.park(this);
        debug("Thread unparked");
    }
    debug("Thread unlocked - continuing");
}

After locked.set(true), the CAS locked.compareAndSet(false, true) fails (because locked is already true), so !false = true, and the thread parks itself. When unblock() sets locked = false and unparks, the CAS succeeds (false → true), so !true = false, and the loop exits.

The problem: on a slow or loaded CI runner (macOS ARM), if unblock() is called before LockSupport.park() executes, the unpark signal is consumed but the thread then parks again and never wakes up, causing the thread to hang indefinitely.

Fix

Change the while loop condition in LongTask.execute() from the broken CAS to a simple locked.get() check — park while locked is true, exit when unblock() sets it to false:

// Fixed code
@Override
public void execute(TokenStorageAdapter adapter) throws DataLayerException {
    this.executingThread = Thread.currentThread();
    debug("Locking");
    locked.set(true);
    while (locked.get()) {  // ← FIXED: park while still locked
        debug("Task still locked - parking thread");
        LockSupport.park(this);
        debug("Thread unparked");
    }
    debug("Thread unlocked - continuing");
}

The unblock() method already correctly sets locked = false and calls LockSupport.unpark() — no changes needed there.

File to Change

File: openam-core/src/test/java/org/forgerock/openam/sm/datalayer/impl/PooledTaskExecutorTest.java
Line 159: Change while (!locked.compareAndSet(false, true)) → while (locked.get())

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.

Assistant: ## Failing Test: PooledTaskExecutorTest.testExecute

Root Cause

The test fails at line 120 with:

[Task 1 thread running] Expecting value to be false but was true

This means task1 (or one of the threads) is still alive after the join() call with a 100-second timeout. The issue is a race condition / deadlock in the LongTask locking logic.

Look at the LongTask.execute() method (lines 155–164):

public void execute(TokenStorageAdapter adapter) throws DataLayerException {
    this.executingThread = Thread.currentThread();
    debug("Locking");
    locked.set(true);
    while (!locked.compareAndSet(false, true)) {  // ← BUG HERE
        debug("Task still locked - parking thread");
        LockSupport.park(this);
        debug("Thread unparked");
    }
    debug("Thread unlocked - continuing");
}

The logic is inverted. After setting locked = true, the while condition !locked.compareAndSet(false, true) tries to CAS from false → true, but locked is already true. This means the CAS fails (returns false), so !false = true, and the thread parks itself immediately and forever — even before unblock() is called.

When unblock() sets locked = false and unparks the thread, the CAS then succeeds (false → true), so !true = false, and the loop exits. However, on a slow/loaded CI runner (macOS ARM in this case), there's a timing window where the unpark happens before the park, causing the thread to never wake up.

Fix

The intent of LongTask is: stay locked until unblock() sets locked = false. The correct loop condition should park while locked is still true:

@Override
public void execute(TokenStorageAdapter adapter) throws DataLayerException {
    this.executingThread = Thread.currentThread();
    debug("Locking");
    locked.set(true);
    while (locked.get()) {  // park while still locked
        debug("Tas...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*
>

<!-- START COPILOT CODING AGENT TIPS -->
---

🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. [Learn more about Advanced Security.](https://gh.io/cca-advanced-security)

…gTask Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com> Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/3a385bd2-89e9-4012-b0fe-da93e98eb1f5

… PooledTaskExecutorTest Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/49bb67bb-1acb-4834-ad72-86979cbf3265 Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>

Copilot

Pull request overview

Fixes a flaky/hanging unit test by removing a racy LockSupport.park/unpark + AtomicBoolean coordination mechanism and replacing it with a CountDownLatch to guarantee unblock() reliably releases execute() even if signaled early.

Changes:

Replaced AtomicBoolean/LockSupport state machine in LongTask with a CountDownLatch.
Simplified unblock() to countDown() and execute() to await().

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

openam-core/src/test/java/org/forgerock/openam/sm/datalayer/impl/PooledTaskExecutorTest.java

Copilot · 2026-03-26T17:25:18Z

openam-core/src/test/java/org/forgerock/openam/sm/datalayer/impl/PooledTaskExecutorTest.java

+                latch.await();
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt();


Using an unbounded await() in test code can still hang the entire test run if unblock() is never reached (e.g., due to a preceding failure). Consider await(timeout, unit) and failing the test with a clear message if it times out, to keep CI failures fast and diagnosable.

Suggested change

latch.await();

} catch (InterruptedException e) {

Thread.currentThread().interrupt();

if (!latch.await(30, TimeUnit.SECONDS)) {

fail("LongTask was not unblocked within 30 seconds; test is likely hung.");

}

} catch (InterruptedException e) {

Thread.currentThread().interrupt();

fail("LongTask was interrupted while waiting to be unblocked", e);

vharseko · 2026-03-26T18:28:32Z

@copilot apply changes based on the comments in this thread

Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/adbb16e3-cdcd-479d-9111-948001af7070 Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>

Copilot · 2026-03-26T18:34:17Z

@copilot apply changes based on the comments in this thread

Done in commit 059273d4. Applied both reviewer suggestions:

latch.await() → latch.await(30, TimeUnit.SECONDS) with fail(...) on timeout so a hung test fails within 30 s with a clear diagnostic message.
On InterruptedException: re-interrupts the thread and throws DataLayerException to propagate the failure deterministically instead of silently continuing.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/d61905b6-3165-47d1-a9ca-9e19738e39f3 Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>

Initial plan

ab41753

Copilot AI assigned Copilot and vharseko Mar 26, 2026

Copilot started work on behalf of vharseko March 26, 2026 11:37 View session

fix: correct inverted locking condition in PooledTaskExecutorTest Lon…

3408439

…gTask Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com> Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/3a385bd2-89e9-4012-b0fe-da93e98eb1f5

Copilot AI changed the title ~~[WIP] Fix flaky PooledTaskExecutorTest due to inverted locking logic~~ fix: correct inverted park condition in PooledTaskExecutorTest.LongTask Mar 26, 2026

Copilot AI requested a review from vharseko March 26, 2026 11:53

Copilot finished work on behalf of vharseko March 26, 2026 11:53

vharseko requested a review from maximthomas March 26, 2026 12:08

Copilot started work on behalf of vharseko March 26, 2026 15:57 View session

fix: replace fragile LockSupport/AtomicBoolean with CountDownLatch in…

c8e2cda

… PooledTaskExecutorTest Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/49bb67bb-1acb-4834-ad72-86979cbf3265 Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>

Copilot finished work on behalf of vharseko March 26, 2026 16:04

vharseko requested a review from Copilot March 26, 2026 17:18

Copilot AI reviewed Mar 26, 2026

View reviewed changes

Copilot started reviewing on behalf of vharseko March 26, 2026 17:34 View session

Copilot started work on behalf of vharseko March 26, 2026 18:28 View session

fix: add timeout and fail-fast interrupt handling to LongTask.execute()

059273d

Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/adbb16e3-cdcd-479d-9111-948001af7070 Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>

Copilot finished work on behalf of vharseko March 26, 2026 18:35

vharseko marked this pull request as ready for review March 27, 2026 06:33

Apply suggestion from @Copilot

b25d626

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of vharseko March 27, 2026 07:49 View session

revert: remove duplicate throw statement introduced by b25d626

0ab09f4

Agent-Logs-Url: https://github.com/OpenIdentityPlatform/OpenAM/sessions/d61905b6-3165-47d1-a9ca-9e19738e39f3 Co-authored-by: vharseko <6818498+vharseko@users.noreply.github.com>

Copilot finished work on behalf of vharseko March 27, 2026 08:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: correct inverted park condition in PooledTaskExecutorTest.LongTask#971

fix: correct inverted park condition in PooledTaskExecutorTest.LongTask#971
Copilot wants to merge 6 commits intomasterfrom
copilot/fix-flaky-pooledtaskexecutor-test

Copilot AI commented Mar 26, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI Mar 26, 2026

Uh oh!

vharseko commented Mar 26, 2026

Uh oh!

Copilot AI commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

Copilot AI commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Root Cause

Fix

File to Change

Root Cause

Fix

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

vharseko commented Mar 26, 2026

Uh oh!

Copilot AI commented Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Mar 26, 2026 •

edited

Loading