split instead or sequential getitem (slicing under the hood)#546
Conversation
|
Claude suggestion:
For the deep.copy, it's indeed 10x faster. I'm not super confortable not copying the tensor, because of weird errors with border effects. But at the same time it's a really big difference and there's not reason to manually change a constraint like this. They are almost no difference when considering the fix of this PR, so maybe constraints need a dedicated |
|
Looks fine to me, for sanity can you time some of the methods here before and after the change to show that it doesn't have deleterious effects elsewhere: https://github.com/TorchSim/torch-sim/blob/main/examples/benchmarking/scaling.py |
|
@CompRhys It's marginally better with the changes (hardly better in this case) for benchmarking/scaling.py, but it has a dramatic change because of the deep.copy of constraints, so the change is expected to be much better in this case. |
Summary
Small change to improve speed.
__getitem__makes a lot of slicing and reindexing.In normal setting, there is a 3x factor difference for overall, low duration, so that's fine anyway.
Interestingly, it combines super poorly with FixSymmery, script at the end:
On cpu: for 100 systems
CustomAutoBatcher duration: 0.0091 seconds
Original BinningAutoBatcher duration: 6.1686 seconds
On gpu:
CustomAutoBatcher duration: 0.0409 seconds
Original BinningAutoBatcher duration: 0.9110 seconds
I'm currently investigating FixSymmetry as the slicing is super costly.
Checklist
Before a pull request can be merged, the following items must be checked: