wence- (Member) commented Jul 3, 2020

No description provided.

wence- (Member, Author) commented Jul 3, 2020

@sv2518 I think this branch (if you merge it into vectorisation), together with wence/fix/interpolate-max in Firedrake, fixes the test_real_space bug.

sv2518 (Contributor) commented Jul 6, 2020

Yes, it does! Thanks!

wence- added 11 commits July 8, 2020 12:21
It conflicts with the name of the function, which loopy doesn't like.

Necessary to handle tsfc-generated kernels that have a par_loop
access descriptor for their outputs that is not INC.

Until such time as the vectorisation stuff lands, this just makes the
compilation process slower for no gain.

Previously we were not separately prefetching the base and extruded
part of the maps. As a consequence it is possible that we were paying
higher indirection costs than necessary.

Additionally, compress offset arrays of a single value to a single
literal offset. This marginally decreases the "hot" memory footprint.
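
As a rough illustration of the offset compression described in the commit note above (a sketch only, not the actual code generator): on an extruded mesh the index for a given layer is commonly formed as the base map entry plus the layer number times a per-entry offset. If every entry of the offset array is equal, the array lookup can be replaced by one literal. Reading "a single value" as "all entries equal" is an assumption here, and the names `compress_offset` and `extruded_index` are hypothetical.

```python
from typing import Sequence, Tuple, Union

Offset = Union[int, Tuple[int, ...]]


def compress_offset(offset: Sequence[int]) -> Offset:
    """Collapse an offset array to one literal when all entries are equal.

    Hypothetical helper: returns an int if every entry matches,
    otherwise returns the offsets unchanged as a tuple.
    """
    values = tuple(offset)
    if values and all(v == values[0] for v in values):
        return values[0]
    return values


def extruded_index(base_map: Sequence[int], offset: Offset, i: int, layer: int) -> int:
    """Index of entry i on a given layer: base entry plus layer * offset."""
    # A literal offset needs no array lookup; an array offset is indexed by i.
    off = offset if isinstance(offset, int) else offset[i]
    return base_map[i] + layer * off
```

With a literal offset the inner loop reads one fewer array, which is the kind of small "hot" footprint saving the commit note refers to.
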

This is necessary so that vectorising over the outer loop correctly
privatises the accumulation variables.

Used to handle new Kernel requirement when the Kernel expects output
arguments to be zero on entry.

Fixes firedrakeproject/firedrake#1768.
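
A minimal sketch of the zero-on-entry handling mentioned in the commit notes above, assuming a local kernel that accumulates into its output (`out[i] += ...`). When the access descriptor is not INC, the wrapper cannot pass the global data straight to such a kernel. It zeroes a temporary, calls the kernel, and then stores the result. The names `Access`, `local_kernel`, and `apply_kernel` are hypothetical and are not PyOP2's API.

```python
from enum import Enum
from typing import List, Sequence


class Access(Enum):
    """Hypothetical stand-in for par_loop access descriptors."""
    INC = "inc"
    WRITE = "write"


def local_kernel(out: List[float], x: Sequence[float]) -> None:
    """Accumulating kernel: assumes `out` is zero on entry."""
    for i, xi in enumerate(x):
        out[i] += 2 * xi


def apply_kernel(global_out: List[float], x: Sequence[float], access: Access) -> None:
    """Call the kernel on a zeroed temporary, then combine per the access mode."""
    tmp = [0.0] * len(global_out)  # satisfies the zero-on-entry requirement
    local_kernel(tmp, x)
    for i, t in enumerate(tmp):
        if access is Access.INC:
            global_out[i] += t  # increment existing data
        else:
            global_out[i] = t   # overwrite existing data
```

Calling the accumulating kernel directly on non-zero output data under WRITE would mix stale values into the result, which is the situation the temporary avoids.
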
wence- force-pushed the wence/feature/codegen-updates branch from 11d558f to d086159 July 8, 2020 11:25
sv2518 mentioned this pull request Jul 8, 2020
sv2518 (Contributor) left a comment

Can confirm that this works on the vectorisation branches now.

wence- merged commit da66688 into master Jul 22, 2020
wence- deleted the wence/feature/codegen-updates branch July 22, 2020 14:04
3 participants