Skip to content

Conversation

@faldor20
Copy link
Collaborator

@faldor20 faldor20 commented Dec 7, 2024

This replaces the existing buffered IO with a method using only roc buffers.

Massively improves performance in my testing

See this conversation for many benchmarks:
https://roc.zulipchat.com/#narrow/channel/302903-platform-development/topic/Returning.20a.20modified.20roc.20list.20from.20a.20rust.20platform

Some highlights:

Benchmark 1 (61 runs): ./buf-reuse
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          81.8ms ± 3.21ms    75.2ms … 97.4ms          5 ( 8%)        0%
  peak_rss           2.75MB ± 16.8KB    2.62MB … 2.75MB          1 ( 2%)        0%
  cpu_cycles         6.96M  ±  336K     6.79M  … 8.70M           7 (11%)        0%
  instructions       10.1M  ± 2.16      10.1M  … 10.1M           0 ( 0%)        0%
  cache_references   20.6K  ±  751      17.1K  … 22.1K           1 ( 2%)        0%
  cache_misses       12.1K  ±  623      9.00K  … 13.3K           1 ( 2%)        0%
  branch_misses      4.75K  ±  340      3.79K  … 5.58K           3 ( 5%)        0%
Benchmark 2 (7 runs): ./readWhole
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           812ms ± 3.87ms     808ms …  818ms          0 ( 0%)        💩+892.2% ±  3.2%
  peak_rss            822MB ± 70.1KB     822MB …  822MB          0 ( 0%)        💩+29773.0% ±  0.8%
  cpu_cycles          984M  ± 8.03M      978M  … 1.00G           0 ( 0%)        💩+14037.7% ± 28.0%
  instructions       3.28G  ± 34.8      3.28G  … 3.28G           0 ( 0%)        💩+32490.8% ±  0.0%
  cache_references   5.89M  ± 30.8K     5.84M  … 5.94M           0 ( 0%)        💩+28474.6% ± 36.1%
  cache_misses       1.65M  ± 9.12K     1.63M  … 1.66M           0 ( 0%)        💩+13588.0% ± 18.6%
  branch_misses      4.93K  ±  188      4.62K  … 5.12K           0 ( 0%)          +  3.9% ±  5.5%
Benchmark 3 (22 runs): ./readWhole-new
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           236ms ± 4.43ms     230ms …  249ms          1 ( 5%)        💩+189.0% ±  2.2%
  peak_rss            412MB ± 59.7KB     412MB …  412MB          0 ( 0%)        💩+14881.8% ±  0.6%
  cpu_cycles          726K  ± 59.9K      645K  …  923K           1 ( 5%)        ⚡- 89.6% ±  2.1%
  instructions        483K  ± 0.46       483K  …  483K           0 ( 0%)        ⚡- 95.2% ±  0.0%
  cache_references   9.76K  ±  250      8.97K  … 10.2K           1 ( 5%)        ⚡- 52.7% ±  1.6%
  cache_misses       7.69K  ±  320      7.03K  … 8.67K           2 ( 9%)        ⚡- 36.2% ±  2.3%
  branch_misses      4.73K  ±  153      4.50K  … 5.17K           0 ( 0%)          -  0.5% ±  3.2%

@lukewilliamboswell
Copy link
Collaborator

@faldor20 were you able to run the all_tests.sh script and everything passes locally?

readBytesToBuf!,
]

# import Shared exposing [ByteReader]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed before merging.

path = Path.fromStr pathStr
buffer = List.withCapacity 8000

# 0 means with default capacity
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems to be out of date.

openReader! : Str => Result Reader [GetFileReadErr Path ReadErr]
openReader! = \pathStr ->
path = Path.fromStr pathStr
buffer = List.withCapacity 8000
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to a constant a named variable for context on the value of the number?

fileReader! : List U8, U64 => Result FileReader Str
fileReadLine! : FileReader => Result (List U8) Str
fileReader! : List U8 => Result FileReader Str
fileReadLine! : FileReader,List U8 => Result (List U8) Str
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a space after the comma. Maybe we should have a formatting CI action? Pretty minor.

PlatformTasks.fileReadLine! reader buffer
|> Result.mapErr \err -> FileReadErr path err

## Try to read bytes from a file given a Reader.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Traditionally, doc comments have a newline after the short description.

This allows the LSP to optionally give a one-line description of the function without including the full description in the paragraph below.

let size = file
.metadata()
.map(|m| m.len())
.expect("TODO: make robust: file has not size?");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like getting file metadata can fail when the file is missing or permissions aren't sufficient. Both scenarios seem plausible to me, so we should gracefully handle this error. We should be able to handle the error exactly the same way we handle the other errors in this function.

let buffer = if buffer.is_unique() {
buffer
} else {
RocList::with_capacity(8000)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also consider contextualizing this magic number.

loop {
let read = match file.read(buf_slice) {
Ok(n) => n,
Err(ref e) if matches!(e.kind(), ErrorKind::Interrupted) => continue,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: might want to have a retry limit?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants