BreakIterator.GetBoundaries is exponentially slow depending on the size of the source text #127

@atlastodor

Describe the bug

BreakIterator.GetBoundaries becomes disproportionately slow as the source text grows. The larger the text parameter string, the slower the call, and the running time grows faster than linearly with the text length.

To Reproduce

string content = "... some large text, about 100KB ... ";
BreakIterator.GetBoundaries(BreakIterator.UBreakIteratorType.WORD, new Locale("eng"), content, false); // Takes about 10 secs.
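To show how the running time scales, a rough timing harness along these lines could be used. This is only a sketch: the filler sentence, the chosen text sizes, and the class name are hypothetical, and any library initialization the application normally performs is omitted. It uses the same GetBoundaries call as above and assumes the result can be enumerated with foreach.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using Icu;

class GetBoundariesTiming
{
    static void Main()
    {
        // Hypothetical filler text; any realistic prose should do.
        const string sentence = "The quick brown fox jumps over the lazy dog. ";

        // Double the input size each round to see how the time scales.
        foreach (int targetLength in new[] { 10000, 20000, 40000, 80000 })
        {
            var builder = new StringBuilder(targetLength + sentence.Length);
            while (builder.Length < targetLength)
                builder.Append(sentence);
            string content = builder.ToString();

            var stopwatch = Stopwatch.StartNew();
            int count = 0;
            // Enumerate every boundary so the work is fully performed
            // even if the result is produced lazily.
            foreach (var boundary in BreakIterator.GetBoundaries(
                BreakIterator.UBreakIteratorType.WORD, new Locale("eng"), content, false))
            {
                count++;
            }
            stopwatch.Stop();

            Console.WriteLine("{0} chars: {1} boundaries in {2} ms",
                content.Length, count, stopwatch.ElapsedMilliseconds);
        }
    }
}
```

If the elapsed time roughly doubles each time the text size doubles, the scaling is linear; if it roughly quadruples, the scaling is closer to quadratic.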

Expected behavior

BreakIterator.GetBoundaries should finish within milliseconds for text of this size.

Environment

  • OS: Windows 10
  • icu.net version: 2.6.0
  • .NET Framework 4.7
