Skip to content

[Python] Set memory policy to "strict"#230

Open
smuzaffar wants to merge 90 commits intocms-sw:cms/master/a0fb33955e5from
guitargeek:pyroot_memory_policy_1
Open

[Python] Set memory policy to "strict"#230
smuzaffar wants to merge 90 commits intocms-sw:cms/master/a0fb33955e5from
guitargeek:pyroot_memory_policy_1

Conversation

@smuzaffar
Copy link

CMSSW tests for root-project#13593

couet and others added 30 commits March 2, 2026 09:14
Idea to have interface which handle painting of many segments at once. On platforms like X11 drawing of many lines at once can be performed with single call
Apply clip condition, draw on virtualx and virtualps
Here no clipping is required,
for VirtualPS coordinates need to be recalculated
Make lambda to paint ticks at specified index/position
So reduce code duplication when painting extra ticks
on left and right sides.
Prepare code to fill buffers for segments painting
First collect all ticks and grids positions in the vector
and then call function at the end
In SVG files all grids lines drawn together after all ticks so
order is changed.
But produced image will not change while grids always drawn on the top of the ticks

PS image can reduce in size while now attributes not switched between ticks and grids all the time
Provide default implementation to ensure that
all possible derived classes will work with new code
So first recalculate/clip buffers and then call methods
Provide default implementation which makes simple
drawing of N individual lines.
While method `DrawSegments` with low-level API is already exist use name `DrawLinesSegments`
Like with TGX11::DrawPolyLine,
split drawing on smaller portions if more than 500000
 segments provide at a time.

 Use direct cast from `TPoint *` to `XSegment *` while
 segment is just two points one after another
Now native graphics can use faster X11 routine
It should allow to optimize drawing of the segments on SVG/PDF outputs
Recalculate buffer when using NDC -
while PS always used only user coordinates
Many segments of similar line style combined now
together in same svg:path element
stressGraphics.ref changed while dummy segments (zero-size length)
not drawn at all. So ticks of zero size not appear in PS and PDF files
as well.

In svg_ref now axis ticks drawn as segments so now is
the only svg:path for all axis ticks are placed.

Main content of SVG files did not change so
it is minimal amount of data commited into repository
While labels and grid/ticks painting was mixed,
move produced labels in intermediate buffer to paint
them between ticks and grids
After change in TGaxis segments for log scale are used.
This makes more compact SVG and also PS can be shorter while all ticks drawn before all grids
This follows up on 4ae08b3, fixing the Windows builds where this
C++20 feature is not available.
If an interface is meant to be used for passing ownership, this should
be implemented with an explicit Pythonization instead of relying on
heuristics in cppyy, because the heuristics result in many false
positives for ownership transfer, manifesting as memory leaks.

We would like to avoid using the heuristic memory policy in the future,
and this change is done in preparation for that.

This commit has absolutely no effect on the behavor of the ROOT Python
interface at this point, because it's redundant with the heuristic
memory policy. But these Pythonization will be important once the strict
memory policy is the default.
Remove the `builtin_glew` dependency, so one always requires having glew
on the system for now, until we migrate from glew to GLAD.

No deprecation or erroring out on that option is required, because it's
simply ignored now: glew will never be required anymore anyway as of the
next commits that will migrate from GLEW to GLAD.
Add minimal OpenGL loader code generated by the Glad tool:

https://github.com/Dav1dde/glad

Co-authored-by: Matevz Tadel <matevz.tadel@cern.ch>
In particular, this includes Wrapper functions - written by @bellenot -
that fix an issue with several glad functions not compatible with
`gluTessCallback`. This is visible with the tutorial grad.C, not
rendering the color gradients if these wrapper functions are not used.

Co-authored-by: Bertrand Bellenot <Bertrand.Bellenot@cern.ch>
Closes root-project#18471.

Co-authored-by: Bertrand Bellenot <Bertrand.Bellenot@cern.ch>
Co-authored-by: Matevz Tadel <matevz.tadel@cern.ch>
The `RModel::AddOperatorReference()` function is not memory safe,
because it takes ownership of an argument passed by raw pointer. It's
better to remove these kind of functions (possible because SOFIE is
still in the experimental namespace), and use the safe alternative that
take a `std::unique_ptr` instead, for clear ownership semantics.
It's better to centralize this logic to make sure there is no confusion
about what these data member names are.
@smuzaffar
Copy link
Author

@guitargeek , looks like unit test failure is related to this root change

@cmsbuild
Copy link

cmsbuild commented Mar 7, 2026

Pull request #230 was updated.

@smuzaffar
Copy link
Author

please test with cms-sw/cmsdist#10395 using full cmssw for CMSSW_16_1_ROOT6_X

@cmsbuild
Copy link

cmsbuild commented Mar 7, 2026

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2dc4c7/51831/summary.html
COMMIT: e787a88
CMSSW: CMSSW_16_1_ROOT6_X_2026-03-06-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/root/230/51831/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed External Build

I found compilation error when building:

- Module:         
- Shared:          -Wl,--no-undefined -Wl,--hash-style="both"

-- Enabled support for:  asimage asimage_tiff builtin_clang builtin_cling builtin_civetweb builtin_ftgl builtin_gl2ps builtin_llvm builtin_openui5 builtin_xxhash check_connection clad curl dataframe davix dcache fftw3 gdml geom http imt mathmore opengl pyroot roofit root7 runtime_cxxmodules shared spectrum ssl thisroot_scripts tmva tmva-cpu tmva-cudnn tmva-pymva tpython use_gsl_cblas webgui x11 xml xrootd
-- Configuring incomplete, errors occurred!
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.y0xAN4 (%build)

RPM build warnings:
Macro expanded in comment on line 404: %{pkginstroot}/lib

Macro expanded in comment on line 405: %{pkginstroot}


Clear the TList at the end of the `test_delitem` unit test in
tseqcollection_itemaccess.py to make sure the contained list elements
are still alive when the list is cleared. Otherwise, the elements might
be deleted before the containing list, depending on the order or garbage
collection.
The ROOT Python interfaces have many memory leaks, which is a major pain
point for people using it for long-running scripts in batch jobs.

One source of memory leaks was indentified to be the "heuristic memory
policy" of cppyy. This means that cppyy assumes that every non-const
pointer member function argument was interpreted as the object taking
ownership if the argument.

For examle, take the non-owning RooLinkedList container. It has a
`RooLinkedList::Add(RooAbsArg *arg)` method. ROOT wrongly assumes that
this means the RooLinkedList takes ownership of arg, and it drops the
ROOT overship. Nobody feels responsible for deleting the object
anymore, and there is a memory leak or `arg`.

That particular leak was reported in this forum post:
https://root-forum.cern.ch/t/memory-leak-in-fits/56249

Function parameters of type `T *` are very common in ROOT, and only
rarely do they imply ownership transfer. So changing the memory policy
to "strict" would surely fix also many other memory leaks that are not
reported so far. In fact, upstream cppyy doesn't even enable this
heuristic memory policy anymore by default! So moving ROOT also to the
strict memory policy closes the gap between ROOT and cppyy.

The potential drawback of this change are crashes in usercode if memory
is not properly managed. But these problems should either be fixed by:

  * the user

  * dedicated pythonizations for these methods to manage shared
    ownership via Python reference counters (i.e., setting the parameter
    as an attribute of the object that the member function was called
    on)

This follows up on PR root-project#4294, in particular it reverts 3a12063.

Note that owning **TCollection**, **TSeqCollection**, and **TList**
instances are Pythonized to preserve the old behavior of dropping Python
ownership of added elements, as that was the case where the memory
heuristic was correct.
This follows up on 7ac08ad, ensuring the object ownership is also
handled correctly in `TH2Poly::AddBin()`.

This is motivated by usage of `TH2Poly` in CMSSW unit tests, and a ROOT
unit test that covers the same usage pattern is now added as well.
@cmsbuild
Copy link

cmsbuild commented Mar 7, 2026

Pull request #230 was updated.

@smuzaffar
Copy link
Author

please test with cms-sw/cmsdist#10395 using full cmssw for CMSSW_16_1_ROOT6_X

@smuzaffar
Copy link
Author

please test with cms-sw/cmsdist#10395 using full cmssw for CMSSW_16_1_ROOT6_X

@cmsbuild
Copy link

cmsbuild commented Mar 7, 2026

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2dc4c7/51834/summary.html
COMMIT: 8a8e854
CMSSW: CMSSW_16_1_ROOT6_X_2026-03-06-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/root/230/51834/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2dc4c7/51834/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2dc4c7/51834/git-merge-result

Comparison Summary

Summary:

  • You potentially added 57 lines to the logs
  • Reco comparison results: 9 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4181048
  • DQMHistoTests: Total failures: 14928
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4166100
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: found differences in 3 / 51 workflows

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.