-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Run3 PromptReco] 390 MB in cut/expression parser #46493
Comments
cms-bot internal usage |
A new Issue was created by @makortel. @Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign reconstruction, xpog, dqm |
New categories assigned: reconstruction,xpog,dqm @antoniovagnerini,@ftorrresd,@hqucms,@jfernan2,@mandrenguyen,@nothingface0,@rvenditti,@syuvivida,@tjavaid you have been requested to review this Pull request/Issue and eventually sign? Thanks |
are you rediscovering that the first piece of code calling to the |
I believe this corresponds to cmssw/CommonTools/RecoAlgos/plugins/MuonSelector.cc Lines 32 to 35 in 4635bff
The configuration defines 25 of them, of which 14 are not constructed in the C++ (i.e. these modules are not part of any Task/Sequence/Path/EndPath that is associated with the Schedule)
Looking next the actual cut strings in the 11 modules that are constructed
There is a lot of repetition: one group of 4 modules and another group of 4 modules have exactly the same cut string. Would it be feasible to replace these with modules that express the structure of the selection in C++ code instead? |
Possibly. I had a vague recollection of such "constant cost" of this infrastructure, but it's source wasn't evident from the profile (maybe I just didn't dig deeply enough into Cling's call stack). I'd guess the There is nevertheless potential for ~200 MB reduction by addressing the "next largest" consumers. |
assign alca |
@cms-sw/alca-l2 By the way, of the 8
This duplication has both memory and runtime cost. |
I believe this one corresponds to cmssw/PhysicsTools/PatAlgos/plugins/PATObjectSelector.cc Lines 249 to 250 in 55fd977
The configuration contains 2 such modules, of 1 is not constructed because it is not in any Task/Sequence/Path/EndPath associated to the Schedule
The cut string is same in both Although if this is impacted with the "first call to boost::spirit has a constant cost", it might not be that important (but looks simple nevertheless). |
assign DPGAnalysis/Skims |
type performance-improvements |
New categories assigned: pdmv @AdrianoDee,@kskovpen,@miquork,@sunilUIET you have been requested to review this Pull request/Issue and eventually sign? Thanks |
This is
that uses cut = cms.string('pt>=10'),
...
expr = cms.string('eta'),
...
expr = cms.string('hwIso()'),
...
expr = cms.string('phi'),
...
expr = cms.string('pt'), that look like much simpler mapping from strings to member functions than full expression parser could be feasible. |
I believe this is
The configuration has 2 modules of this type, of which only 1 is actually used in the job
The expr = cms.string('sigmaZ()'),
...
expr = cms.string('sigmaZ0Error()'),
...
expr = cms.string('type()'),
...
expr = cms.string('position().z()'),
...
expr = cms.string('z0Error()'), that also looks like a much simpler mapping from strings to member functions than the full expression parser could be feasible. |
concerning #46493 (comment)
I had a go at it at #46574, comments are welcome. |
From #46040 (comment) the cut/expression parser uses 390 MB (constant cost, i.e. gets amortized across threads). The top 10 modules with the largest contributions are
ObjectSelectorBase<SingleElementCollectionSelector<edm::View<reco::Muon>, ...>
(link)ObjectSelectorBase<SingleElementCollectionSelector<std::vector<pat::GenericParticle, ...>
(link)BXVectorSimpleFlatTableProducer<l1t::EGamma>
(link)SimpleFlatTableProducerBase<reco::BeamSpot, reco::BeamSpot>
(link)TopSingleLeptonDQM
(linkTopMonitor
(link)VersionedIdProducer<edm::Ptr<reco::Photon>
(link)RecoTauPiZeroProducer
(link)SimpleFlatTableProducer<pat::Jet>
(link)SimpleFlatTableProducer<pat::Electron>
(link)Addressing already the top 3 would save about 286 MB.
The text was updated successfully, but these errors were encountered: