
Improve the DynamicFlushScheduler #1789

@ericallam

The very poorly named DynamicFlushScheduler, used by the EventRepository in the webapp, schedules the "flushing" of TaskEvent data from the EventRepository to the database so that inserts happen in batches instead of one at a time. It currently has a batch size limit that triggers a flush when reached, and it also flushes on an interval, so a flush happens every X seconds even if the limit has not been reached.

Currently the DynamicFlushScheduler does not actually limit the number of items flushed, so a large batch can come through at once and then be flushed in one go. For example, let's say the DynamicFlushScheduler batch size limit is 100 and 99 items are currently waiting to be flushed. If 1000 items are then added via addToBatch, the next flush will have 1099 items in it, which isn't what we want.
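To make the 1099-item example concrete, here is a minimal sketch of capping each flush at the batch size limit. `splitIntoBatches` is a hypothetical helper, not something that exists in the current DynamicFlushScheduler. It only illustrates the intended behavior under the assumption that pending items live in a plain array.

```typescript
// Hypothetical helper (not part of the existing codebase): split the pending
// items into batches that never exceed maxBatchSize.
function splitIntoBatches<T>(items: T[], maxBatchSize: number): T[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new Error("maxBatchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += maxBatchSize) {
    batches.push(items.slice(i, i + maxBatchSize));
  }
  return batches;
}

// The scenario from the description: 99 items waiting, then 1000 more added.
const pending = Array.from({ length: 99 + 1000 }, (_, i) => i);
const batches = splitIntoBatches(pending, 100);
// 1099 items with a limit of 100 gives 10 full batches plus one batch of 99.
```

With this, each flush would carry at most 100 items instead of all 1099 at once.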

I think this needs to be able to support concurrent flushes, up to a certain concurrent limit (using something like p-limit). It also needs to handle making sure batches are flushed on SIGTERM, before the process shuts down. It should report metrics to the /metrics endpoint, and should probably have some better logging.
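As a rough sketch of what this could look like, the class below caps batch size, limits the number of in-flight flushes, and exposes a `shutdown()` that drains everything. All names here (`BoundedFlushScheduler`, `shutdown`, the constructor parameters) are assumptions for illustration, not the existing API. The concurrency cap is hand-rolled to keep the example self-contained, and p-limit could fill the same role. The interval flush, metrics, and retry handling are left out.

```typescript
type FlushCallback<T> = (batch: T[]) => Promise<void>;

// Hypothetical sketch, not the current implementation.
class BoundedFlushScheduler<T> {
  private queue: T[] = [];
  private active = 0; // number of flushes currently in flight
  private shuttingDown = false;
  private idleWaiters: Array<() => void> = [];

  constructor(
    private readonly maxBatchSize: number,
    private readonly maxConcurrency: number,
    private readonly callback: FlushCallback<T>
  ) {}

  addToBatch(items: T[]): void {
    this.queue.push(...items);
    this.pump(false);
  }

  // Start flushes while there is spare concurrency. Partial batches are only
  // flushed when `force` is true (here: during shutdown).
  private pump(force: boolean): void {
    while (
      this.active < this.maxConcurrency &&
      (this.queue.length >= this.maxBatchSize || (force && this.queue.length > 0))
    ) {
      const batch = this.queue.splice(0, this.maxBatchSize);
      this.active++;
      this.callback(batch)
        // Assumption: a failed batch is logged and dropped. A real
        // implementation would likely retry or re-queue it.
        .catch((err) => console.error("flush failed", err))
        .finally(() => {
          this.active--;
          this.pump(this.shuttingDown);
          if (this.active === 0 && this.queue.length === 0) {
            this.idleWaiters.splice(0).forEach((resolve) => resolve());
          }
        });
    }
  }

  // Flush everything that is pending, including a final partial batch, and
  // resolve once all in-flight flushes have settled.
  async shutdown(): Promise<void> {
    this.shuttingDown = true;
    this.pump(true);
    if (this.active === 0 && this.queue.length === 0) return;
    await new Promise<void>((resolve) => this.idleWaiters.push(resolve));
  }
}

// Possible wiring in a Node.js process (shown as a comment so the sketch
// does not depend on Node type definitions):
//   process.once("SIGTERM", () => {
//     void scheduler.shutdown().then(() => process.exit(0));
//   });
```

In this sketch, adding 1000 items on top of 99 pending ones with a batch size of 100 and a concurrency of 2 starts two flushes of 100 immediately. The rest are picked up as earlier flushes finish, and the last 99 items go out when `shutdown()` is called.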

All this should be done while adding tests.
