Skip to content

Conversation

@EdCaunt
Copy link
Contributor

@EdCaunt EdCaunt commented Jul 3, 2025

No description provided.

@codecov
Copy link

codecov bot commented Jul 3, 2025

Codecov Report

❌ Patch coverage is 89.67972% with 29 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@dd1e6c7). Learn more about missing BASE report.
⚠️ Report is 12 commits behind head on main.

Files with missing lines Patch % Lines
devito/operator/operator.py 69.23% 26 Missing and 2 partials ⚠️
devito/tools/data_structures.py 94.44% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2659   +/-   ##
=======================================
  Coverage        ?   91.58%           
=======================================
  Files           ?      248           
  Lines           ?    49634           
  Branches        ?     4368           
=======================================
  Hits            ?    45459           
  Misses          ?     3458           
  Partials        ?      717           
Flag Coverage Δ
pytest-gpu-nvc-nvidiaX 72.06% <89.32%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mloubout mloubout added the API api (symbolics, types, ...) label Jul 9, 2025
Copy link
Contributor

@enwask enwask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Property names feel kinda messy, also not a fan of the hardcoded column width but otherwise looks good

@ggorman ggorman requested a review from Copilot July 24, 2025 09:02
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new memory estimation utility estimate_memory() to the Operator class that calculates memory consumption without allocating or touching data. The feature enables users to estimate memory requirements for Operators with different parameters and overrides, which is particularly useful for planning large computations that might exceed available memory.

Key Changes:

  • Adds estimate_memory() method to Operator class for memory consumption estimation without data allocation
  • Extends _arg_defaults() and _arg_values() methods across function types to support memory estimation mode
  • Implements comprehensive memory tracking for Functions, Arrays, and device memory mapping in ArgumentsMap

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
devito/operator/operator.py Adds core estimate_memory() method and extensive memory tracking properties to ArgumentsMap
devito/types/dense.py Adds estimate_memory parameter to avoid data allocation during estimation
devito/types/sparse.py Extends sparse function types to support memory estimation mode
devito/tools/data_structures.py Adds MemoryEstimate class for structured memory reporting
tests/test_operator.py Comprehensive test suite covering various scenarios and edge cases
.github/workflows/pytest-gpu.yml Includes new tests in GPU CI workflows
Comments suppressed due to low confidence (4)

devito/operator/operator.py:814

  • [nitpick] The estimate_memory parameter is checked in multiple places throughout the argument preparation flow. Consider extracting this logic into a separate method or using a context manager to reduce code duplication and improve maintainability.
                recompiled, src_file = self._compiler.jit_compile(self._soname, str(self))

tests/test_operator.py:133

  • The Function is created with a save parameter, but Function doesn't support the save parameter - this is only valid for TimeFunction. This should be TimeFunction instead of Function.
    def test_compiler_uniqueness(self):

tests/test_operator.py:196

  • This test relies on checking for a specific string pattern in generated C code to verify array temporary creation. This is fragile and could break if code generation changes. Consider using a more robust method to verify temporary array creation.
        ('Lt(0.1*(g1 + g2), 0.2*(g1 + g2))', 0,

@EdCaunt EdCaunt requested a review from FabioLuporini July 24, 2025 11:17
Copy link
Contributor

@JDBetteridge JDBetteridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick:

Is there a way to suppress unused keys. For instance, running locally I get:

MemoryEstimate(Kernel): {'host': '512 KB', 'device': '0 B'}

I didn't use any disk so the key doesn't appear. The system doesn't have a GPU, but the device key still appears.

Alternatively, and maybe this is preferable, ensure that all keys are shown to assure the user that no disk or device memory is being consumed.

@EdCaunt
Copy link
Contributor Author

EdCaunt commented Jul 25, 2025

Alternatively, and maybe this is preferable, ensure that all keys are shown to assure the user that no disk or device memory is being consumed.

This is the intention. Moreover, disk can only be consumed in PRO (strictly speaking these are snapshots which will be moved back and forth as the various memory locations overflow), so the mapper is only enriched with snapshot information via a hook.

Comment on lines +699 to +712
def to_json(self, path):
"""
Write the MemoryEstimate to JSON for ingestion by a scheduler.
Arguments
---------
path: str
The path to which the memory estimate should be written.
"""
summary = {'name': self.name, **self._dict}
json_object = json.dumps(summary, indent=4)

with open(path, "w") as outfile:
outfile.write(json_object)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

# Prepare to process data-carriers
args = kwargs['args'] = ReducerMap()

kwargs['metadata'] = {'language': self._language,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we make estimate_memory part of metadata we can probably save some carrying around, both here and in the dense/sparse files

Copy link
Contributor Author

@EdCaunt EdCaunt Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the estimate_memory switch in the kwargs at an earlier iteration and it made a lot of the _arg_defaults etc a fair bit messier, since they always need to know about it to avoid touching the data when performing overrides (so you had to get it from kwargs all over the place). Furthermore, I would argue that having it as an explicit kwarg reduces the likelihood that it will be accidentally overlooked when creating new subclasses in the future

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? if it's part of kwargs (or metadata) you only have to extract it in _arg_defaults at worst (maybe not even that depending on how you do things since it could be already explicit in the parameters list) , while currently it appears uselessly in all _arg_values only to be forwarded to _arg_defaults

obviously this is all a nitpicking, but since we're discussing it...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to be clear: this is not a stopper at all for me, but I'm not convinced about your argument

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, I would argue that having it as an explicit kwarg reduces the likelihood that it will be accidentally overlooked when creating new subclasses in the future

However, this is a fair point too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you put it in metadata, you need a whole load of:

if metadata is not None:
    estimate_memory = metadata.get('estimate_memory', False)

or

metadata = kwargs.get('metadata', {})
estimate_memory = metadata.get('estimate_memory', False)

in all the _arg_defaults (the form depending on the current _arg_defaults kwargs).

On top of uglifying the code (imo), it's much less obvious that the _arg_defaults for any array-carrying symbol should have special handling to avoid a data touch when merely estimating the memory (since users can and regularly will request an estimate for an operator that they don't have memory to run).

class MemoryEstimate(frozendict):
"""
An immutable mapper for a memory estimate, providing the estimated memory
consumption across host, device, and so forth.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and disk.

?

Copy link
Contributor Author

@EdCaunt EdCaunt Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, with PRO, you never get a 'disk' estimate, just a 'snapshot' estimate, as it would be far too much guesswork to estimate where the snaps are going to end up. Imo, figuring out if the values make sense is a user problem, and the raw data should not be obfuscated. The docstring is phrased to avoid giving too much away about what one can expect in PRO (or implying PRO features are in OSS), whilst avoiding a separate PRO class with an updated docstring.

if isinstance(new, DiscreteFunction):
# Set new values and re-derive defaults
values = new._arg_defaults(alias=self, metadata=metadata)
values = new._arg_defaults(alias=self, metadata=metadata,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going back to what I was saying earlier, for example, if estimate_memory had been part of metadata, you would need zero changes here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my other comment, having estimate_memory in metadata makes the contents of the function messier, whilst making it less obvious that special handling should be implemented in all subclasses to prevent inadvertent OOM errors when estimating the memory consumption of an operator.

return args

def _arg_values(self, **kwargs):
def _arg_values(self, estimate_memory=False, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, for homogeneity at some point we could pass in the metadata too here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it be needed though (considering my previous comments)?

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@mloubout mloubout merged commit ad1d4f4 into main Aug 4, 2025
35 of 36 checks passed
@mloubout mloubout deleted the estimate-memory branch August 4, 2025 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API api (symbolics, types, ...)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants