Skip to content

Conversation

@nascheme
Copy link
Member

@nascheme nascheme commented Dec 16, 2025

If we are specializing to LOAD_GLOBAL_MODULE, set deferred reference counting for the value, if it meets criteria. For now, it's only done for frozenset objects.

If we are specializing to LOAD_GLOBAL_MODULE, set deferred reference
counting for the value, if it meets criteria. For now, it's only done
for frozenset objects.
Use it for LOAD_ATTR_MODULE in addition to LOAD_GLOBAL_MODULE.  Don't
enable deferred ref counts if the object is owned by the current thread.
Specialized bytecode is per-thread so this works.  Enable for
frozensets, tuples and type objects.
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense and is worth doing. Let's also add a scaling test to ftscalingbench.py

Comment on lines 362 to 365
PyObject *op;
if (PyDict_GetItemRef(dict, name, &op) != 1) {
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's combine this lookup with the earlier _PyDict_LookupIndex. _PyDict_LookupIndex already gets the value internally and throws it away.

Comment on lines 366 to 371
if (_Py_IsOwnedByCurrentThread(op) ||
!PyType_IS_GC(Py_TYPE(op)) ||
_PyObject_HasDeferredRefcount(op)) {
Py_DECREF(op);
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The !PyType_IS_GC and _PyObject_HasDeferredRefcount checks aren't really necessary because PyUnstable_Object_EnableDeferredRefcount handles those cases internally.

Comment on lines 372 to 374
if (PyFrozenSet_Check(op) || PyTuple_Check(op) || PyType_Check(op)) {
PyUnstable_Object_EnableDeferredRefcount(op);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do this for all objects, not just a few types. Otherwise, I think we will keep run into scaling bottlenecks with global variables.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... That's going to push a lot of objects to the GC for deallocation in programs that do a lot of work at the global level (but do have a few functions that are called enough that use those globals.) I'm thinking of naively written "scripts". Then again, it would only do that for objects that participate in GC (i.e. not strings or integers), so it's probably fine. I mean, nobody's too surprised when things aren't deallocated quite as quickly as they expect... right?

Yeah, I think this should be fine for main. I definitely wouldn't backport it to 3.14.

Copy link
Contributor

@colesbury colesbury Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agree regarding not backporting to 3.14.

I think the above check that excludes owned by the current thread should help with most scripts -- it'll only affect global variables accessed by multiple threads.

Another thing that may help is that I think the PyUnstable_Object_EnableDeferredRefcount call only happens at specialization time, not every time a variable is updated or accessed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm a bit worried about increased memory usage. Since it's only for objects shared between threads and only for hot LOAD_GLOBAL/LOAD_ATTR instructions, maybe it's okay. Perhaps we should consider changing gc_should_collect() to check not only the young object count but also the process memory size increase as well. The condition with count < gcstate->long_lived_total / 4 could be done with an "or" condition with the process size check.

Also, avoid PyDict_GetItemRef() call by returning the value when the
index is looked up (as previously discarded).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants