Updated memory management #543

Open
wants to merge 35 commits into
base: main
9abcd0d
Retain on ObjCInstance creation, autorelease on __del__
samschott Nov 19, 2024
6605842
update tests
samschott Nov 19, 2024
931c352
add change note
samschott Nov 19, 2024
a618f2a
use autorelease instead of release in __del__
samschott Nov 20, 2024
b1bf61c
code formatting
samschott Nov 20, 2024
21f2e0b
update docs
samschott Nov 20, 2024
20ab8f9
add comment about autorelease vs release
samschott Nov 23, 2024
160c819
remove now unneeded cache staleness check
samschott Nov 23, 2024
ce9d78c
remove stale instance cache tests
samschott Nov 23, 2024
6d89330
update test_objcinstance_dealloc
samschott Nov 24, 2024
3bb7ccc
correct inline comment
samschott Nov 24, 2024
c0b091c
make returned_from_method private
samschott Nov 24, 2024
ab1f762
update ObjCInstance doc string
samschott Nov 24, 2024
544d694
updated docs
samschott Nov 24, 2024
b4a1624
update spellchecker
samschott Nov 24, 2024
f0edb5b
update change notes with migration instructions
samschott Nov 24, 2024
22396dc
Rephrase removal note
samschott Nov 25, 2024
7d51fde
remove unneeded space in doc string
samschott Nov 25, 2024
acfa546
change bugfix to feature note
samschott Nov 25, 2024
18e08cc
Fix incorrect inline comment
samschott Nov 25, 2024
532fbe0
trim trailing whitespace
samschott Nov 25, 2024
52e92c0
update test comment
samschott Nov 25, 2024
efed734
check that objects are not deallocated before end of autorelease pool
samschott Nov 25, 2024
ab8a895
merge object lifecycle tests
samschott Nov 25, 2024
30e4277
add a test case for copyWithZone returning the existing instance with…
samschott Nov 25, 2024
c3a4fe1
release additional refcounts by copy calls on the same ObjCInstance
samschott Nov 25, 2024
7bdc31f
rewrite the copy lifecycle test to use NSDictionary instead of a cust…
samschott Nov 26, 2024
460728b
prevent errors on ObjCInstance garbage collection when `send_message`…
samschott Nov 26, 2024
d9c0f62
switch copy lifecycle test to use NSString
samschott Nov 26, 2024
49d9381
remove unused import
samschott Nov 26, 2024
e0d7792
fix spelling mistake
samschott Nov 26, 2024
715912f
spelling updates
samschott Nov 26, 2024
20e45b6
spelling updates
samschott Nov 26, 2024
86b29a4
spelling updates
samschott Nov 26, 2024
944328d
black code formatting
samschott Nov 26, 2024
10 changes: 10 additions & 0 deletions changes/256.feature.rst
@@ -0,0 +1,10 @@
Retain Objective-C objects when creating Python wrappers and release them when the
Python wrapper is garbage collected. This means that manual ``retain`` calls and
subsequent ``release`` or ``autorelease`` calls from Python are no longer needed, with
very few exceptions such as:

1. When implementing methods like ``copy`` that are supposed to create an object, if
the returned object is not actually newly created.
2. When dealing with side effects of methods like ``init`` that may release an object
which is still referenced from Python. See for example
https://github.com/beeware/toga/issues/2468.
5 changes: 5 additions & 0 deletions changes/256.removal.rst
@@ -0,0 +1,5 @@
Manual calls to ``release`` or ``autorelease`` no longer cause Rubicon
to skip releasing an Objective-C object when its Python wrapper is
garbage collected. This means that fewer ``retain`` than ``release`` calls will cause
segfaults on garbage collection. Review your code carefully for unbalanced ``retain``
and ``release`` calls before updating.
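
A minimal sketch of why unbalanced calls now matter, using a toy retain count rather than the real Objective-C runtime. ``FakeObjCObject`` and ``collect_wrapper`` are hypothetical names for illustration only, and the sketch releases immediately where Rubicon sends ``autorelease``:

```python
class FakeObjCObject:
    """Toy stand-in for an Objective-C object with a manual retain count."""

    def __init__(self):
        self.retain_count = 1  # the creator (e.g. alloc) owns one reference
        self.deallocated = False

    def retain(self):
        self.retain_count += 1

    def release(self):
        if self.deallocated:
            # The real runtime would touch freed memory here (a segfault).
            raise RuntimeError("release sent to a deallocated object")
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True


def collect_wrapper(obj):
    """Hypothetical stand-in for the release that is always issued on GC."""
    obj.release()


# Balanced: one manual retain, one manual release, then garbage collection.
balanced = FakeObjCObject()
balanced.retain()
balanced.release()
collect_wrapper(balanced)

# Unbalanced: a manual release with no matching retain.
unbalanced = FakeObjCObject()
unbalanced.release()  # the object is already deallocated at this point
try:
    collect_wrapper(unbalanced)
    crashed = False
except RuntimeError:
    crashed = True
```

In the toy model the unbalanced case raises an exception; against the real runtime the equivalent over-release would be the segfault the note warns about.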
43 changes: 27 additions & 16 deletions docs/how-to/memory-management.rst
@@ -2,6 +2,9 @@
Memory management for Objective-C instances
===========================================

Reference counting in Objective-C
=================================

Reference counting works differently in Objective-C compared to Python. Python
will automatically track where variables are referenced and free memory when
the reference count drops to zero whereas Objective-C uses explicit reference
@@ -13,28 +16,36 @@ When enabling automatic reference counting (ARC), the appropriate calls for
memory management will be inserted for you at compile-time. However, since
Rubicon Objective-C operates at runtime, it cannot make use of ARC.

Reference counting in Rubicon Objective-C
-----------------------------------------
Reference management in Rubicon
===============================

In most cases, you won't have to manage reference counts in Python; Rubicon
Objective-C will do that work for you. It does so by calling ``retain`` on an
object when Rubicon creates an ``ObjCInstance`` for it on the Python side, and calling
``autorelease`` when the ``ObjCInstance`` is garbage collected in Python. Retaining
the object ensures it is not deallocated while it is still referenced from Python,
and releasing it again on ``__del__`` ensures that we do not leak memory.

The only exception to this is when you create an object -- which is always done
through methods starting with "alloc", "new", "copy", or "mutableCopy". Rubicon does
not explicitly retain such objects because we own objects created by us, but Rubicon
does autorelease them when the Python wrapper is garbage collected.
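
The retain and autorelease rules described above can be sketched with a toy model. This is an illustration under stated assumptions, not Rubicon's implementation: ``FakeObjCObject``, ``FakeWrapper``, ``collect`` and ``drain`` are hypothetical names, the autorelease pool is modelled as a plain list, and ``collect`` stands in for ``__del__``:

```python
OWNERSHIP_PREFIXES = (b"alloc", b"new", b"copy", b"mutableCopy")


class FakeObjCObject:
    """Toy stand-in for an Objective-C object with a manual retain count."""

    def __init__(self):
        self.retain_count = 1  # held by whoever created the object
        self.deallocated = False

    def retain(self):
        self.retain_count += 1

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True


class FakeWrapper:
    """Toy Python wrapper: retains on creation unless ownership was handed over."""

    def __init__(self, obj, returned_from_method=b""):
        self.obj = obj
        if not returned_from_method.startswith(OWNERSHIP_PREFIXES):
            obj.retain()

    def collect(self, pool):
        """Stand-in for __del__: defer the release to the autorelease pool."""
        pool.append(self.obj)


def drain(pool):
    """Release everything that was autoreleased into the toy pool."""
    while pool:
        pool.pop().release()


pool = []

# Object created by us: no extra retain, one deferred release.
created = FakeObjCObject()
wrapper = FakeWrapper(created, returned_from_method=b"alloc")
count_after_wrap_created = created.retain_count
wrapper.collect(pool)
alive_before_drain = not created.deallocated
drain(pool)

# Object owned elsewhere: retained on wrap, released again on collection.
borrowed = FakeObjCObject()
wrapper = FakeWrapper(borrowed, returned_from_method=b"objectAtIndex:")
count_after_wrap_borrowed = borrowed.retain_count
wrapper.collect(pool)
drain(pool)
```

In this model the created object stays alive until the pool is drained, and the borrowed object ends with the single reference its original owner holds.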

You won't have to manage reference counts in Python, Rubicon Objective-C will do
that work for you. It does so by tracking when you gain ownership of an object.
This is the case when you create an Objective-C instance using a method whose
name begins with ``alloc``, ``new``, ``copy``, or ``mutableCopy``. Rubicon
Objective-C will then insert a ``release`` call when the Python variable that
corresponds to the Objective-C instance is deallocated.
Rubicon Objective-C will not keep track of any additional manual ``retain`` calls
on an object. You are responsible for inserting matching ``release`` or
``autorelease`` calls yourself to prevent leaking memory.

An exception to this is when you manually ``retain`` an object. Rubicon
Objective-C will not keep track of such retain calls and you will be
responsible to insert appropriate ``release`` calls yourself.
Weak references in Objective-C
------------------------------

You will also need to pay attention to reference counting in case of **weak
references**. In Objective-C, creating a **weak reference** means that the
reference count of the object is not incremented and the object will still be
You will need to pay attention to reference counting in case of **weak
references**. In Objective-C, as in Python, creating a weak reference means that
the reference count of the object is not incremented and the object will be
deallocated when no strong references remain. Any weak references to the object
are then set to ``nil``.
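
The Python side of that comparison can be seen with the standard ``weakref`` module. This is a small pure-Python sketch; the ``Delegate`` class is a hypothetical placeholder and no Objective-C objects are involved:

```python
import gc
import weakref


class Delegate:
    """Placeholder class; any ordinary Python object would do."""


delegate = Delegate()
ref = weakref.ref(delegate)  # does not increment the reference count

alive_before = ref() is delegate

del delegate  # drop the only strong reference
gc.collect()  # not needed on CPython, but harmless on other implementations

alive_after = ref() is not None
```

Once the last strong reference is gone, calling ``ref()`` returns ``None``, which is the Python counterpart of an Objective-C weak reference being set to ``nil``.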

Some objects will store references to other objects as a weak reference. Such
properties will be declared in the Apple developer documentation as
Some Objective-C objects store references to other objects as a weak reference.
Such properties will be declared in the Apple developer documentation as
"@property(weak)" or "@property(assign)". This is commonly the case for
delegates. For example, in the code below, the ``NSOutlineView`` only stores a
weak reference to the object which is assigned to its delegate property:
4 changes: 4 additions & 0 deletions docs/spelling_wordlist
@@ -1,6 +1,9 @@
Alea
alloc
Autorelease
autorelease
autoreleased
autoreleases
Bugfixes
callables
CPython
@@ -22,6 +25,7 @@ lookups
macOS
metaclass
metaclasses
mutableCopy
namespace
namespaces
ObjC
172 changes: 46 additions & 126 deletions src/rubicon/objc/api.py
@@ -91,6 +91,10 @@
# the Python objects are not destroyed if there are otherwise no Python references left.
_keep_alive_objects = {}

# Methods that return an object which is implicitly retained by the caller.
# See https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmRules.html
_OWNERSHIP_METHOD_PREFIXES = (b"alloc", b"new", b"copy", b"mutableCopy")


def encoding_from_annotation(f, offset=1):
argspec = inspect.getfullargspec(inspect.unwrap(f))
@@ -223,12 +227,7 @@ def __call__(self, receiver, *args, convert_args=True, convert_result=True):

# Convert result to python type if it is an instance or class pointer.
if self.restype is not None and issubclass(self.restype, objc_id):
result = ObjCInstance(result)

# Mark for release if we acquire ownership of an object. Do not autorelease here because
# we might retain a Python reference while the Obj-C reference goes out of scope.
if self.name.startswith((b"alloc", b"new", b"copy", b"mutableCopy")):
result._needs_release = True
result = ObjCInstance(result, _returned_from_method=self.name)

return result

@@ -783,29 +782,13 @@ def objc_class(self):
return super(ObjCInstance, type(self)).__getattribute__(self, "_objc_class")
except AttributeError:
# This assumes that objects never change their class after they are
# seen by Rubicon. There are two reasons why this may not be true:
#
# 1. Objective-C runtime provides a function object_setClass that
# can change an object's class after creation, and some code
# manipulates objects' isa pointers directly (although the latter
# is no longer officially supported by Apple). This is not
# commonly done in practice, and even then it is usually only
# done during object creation/initialization, so it's basically
# safe to assume that an object's class will never change after
# it's been wrapped in an ObjCInstance.
# 2. If a memory address is freed by the Objective-C runtime, and
# then re-allocated by an object of a different type, but the
# Python ObjCInstance wrapper persists, Python will assume the
# object is still of the old type. If a new ObjCInstance wrapper
# for the same pointer is re-created, a check is performed to
# ensure the type hasn't changed; this problem only affects
# pre-existing Python wrappers. If this occurs, it probably
# indicates an issue with the retain count on the Python side (as
# the Objective-C runtime shouldn't be able to dispose of an
# object if Python still has a handle to it). If this *does*
# happen, it will manifest as objects appearing to be the wrong
# type, and/or objects having the wrong list of attributes
# available. Refs #249.
# seen by Rubicon. This can occur because the Objective-C runtime provides a
# function object_setClass that can change an object's class after creation,
# and some code manipulates objects' isa pointers directly (although the
# latter is no longer officially supported by Apple). This is not commonly
# done in practice, and even then it is usually only done during object
# creation/initialization, so it's basically safe to assume that an object's
# class will never change after it's been wrapped in an ObjCInstance.
super(ObjCInstance, type(self)).__setattr__(
self, "_objc_class", ObjCClass(libobjc.object_getClass(self))
)
@@ -815,7 +798,9 @@ def objc_class(self):
def _associated_attr_key_for_name(name):
return SEL(f"rubicon.objc.py_attr.{name}")

def __new__(cls, object_ptr, _name=None, _bases=None, _ns=None):
def __new__(
cls, object_ptr, _name=None, _bases=None, _ns=None, _returned_from_method=b""
):
"""The constructor accepts an :class:`~rubicon.objc.runtime.objc_id` or
anything that can be cast to one, such as a :class:`~ctypes.c_void_p`,
or an existing :class:`ObjCInstance`.
@@ -833,10 +818,17 @@ class or a metaclass, an instance of :class:`ObjCClass` or
:func:`register_type_for_objcclass`. Creating an :class:`ObjCInstance`
from a ``nil`` pointer returns ``None``.

Rubicon currently does not perform any automatic memory management on
the Objective-C object wrapped in an :class:`ObjCInstance`. It is the
user's responsibility to ``retain`` and ``release`` wrapped objects as
needed, like in Objective-C code without automatic reference counting.
Rubicon retains an Objective-C object when it is wrapped in an
:class:`ObjCInstance` and autoreleases it when the :class:`ObjCInstance` is
garbage collected.

The only exceptions to this are objects returned by methods which create an
object (starting with "alloc", "new", "copy", or "mutableCopy"). We do not
explicitly retain them because we already own objects created by us, but we do
autorelease them on garbage collection of the Python wrapper.

This ensures that the :class:`ObjCInstance` can always be used from Python
without segfaults while preventing Rubicon from leaking memory.
"""

# Make sure that object_ptr is wrapped in an objc_id.
@@ -852,68 +844,24 @@ class or a metaclass, an instance of :class:`ObjCClass` or
# If an ObjCInstance already exists for the Objective-C object,
# reuse it instead of creating a second ObjCInstance for the
# same object.
cached = cls._cached_objects[object_ptr.value]

# In a high-churn environment, it is possible for an object to
# be deallocated, and the same memory address be re-used on the
# Objective-C side, but the Python wrapper object for the
# original instance has *not* been cleaned up. In that
# situation, an attempt to wrap the *new* Objective-C object
# instance will cause a false positive cache hit; returning a
# Python object that has a class that doesn't match the class of
# the new instance.
#
# To prevent this, when we get a cache hit on an ObjCInstance,
# use the raw Objective-C API on the pointer to get the current
# class of the object referred to by the pointer. If there's a
# discrepancy, purge the cache for the memory address, and
# re-create the object.
#
# We do this both when the type *is* ObjCInstance (the case when
# instantiating a literal ObjCInstance()), and when type is an
# ObjCClass instance (e.g., ObjClass("Example"), which is the
# type of a directly instantiated instance of Example.
#
# We *don't* do this when the type *is* ObjCClass,
# ObjCMetaClass, as there's a race condition on startup -
# retrieving `.objc_class` causes the creation of ObjCClass
# objects, which will cause cache hits trying to re-use existing
# ObjCClass objects. However, ObjCClass instances generally
# won't be recycled or reused, so that should be safe to exclude
# from the cache freshness check.
#
# One edge case with this approach: if the old and new
# Objective-C objects have the same class, they won't be
# identified as a stale object, and they'll re-use the same
# Python wrapper. This effectively means id(obj) isn't a
# reliable instance identifier... but (a) this won't be a common
# case; (b) this flaw exists in pure Python and Objective-C as
# well, because default object identity is tied to memory
# allocation; and (c) the stale wrapper will *work*, because
# it's the correct class.
#
# Refs #249.
if cls == ObjCInstance or isinstance(cls, ObjCInstance):
cached_class_name = cached.objc_class.name
current_class_name = libobjc.class_getName(
libobjc.object_getClass(object_ptr)
).decode("utf-8")
if (
current_class_name != cached_class_name
and not current_class_name.endswith(f"_{cached_class_name}")
):
# There has been a cache hit, but the object is a
# different class, treat this as a cache miss. We don't
# *just* look for an *exact* class name match, because
# some Cocoa/UIKit classes undergo a class name change
# between `alloc()` and `init()` (e.g., `NSWindow`
# becomes `NSKVONotifying_NSWindow`). Refs #257.
raise KeyError(object_ptr.value)

return cached
cached_obj = cls._cached_objects[object_ptr.value]

# If a cached instance was returned from a call such as `copy` or
# `mutableCopy`, we take ownership of an additional refcount. Release
Member

Flagging so it isn't forgotten - a link here to the NSCopying docs would be helpful, plus highlighting that copy can return the same object as an optimisation if the object is immutable.

# it here to prevent leaking memory; Python already owns a refcount from
# when the item was put in the cache.
if _returned_from_method.startswith(_OWNERSHIP_METHOD_PREFIXES):
send_message(object_ptr, "release", restype=objc_id, argtypes=[])
Member Author

I worry a bit that the mental load of following when we retain and release is a bit much now, since the reader needs to think through the __new__ method being invoked multiple times with the same pointer.

Alternatives such as explicitly tracking a "Python refcount" might be easier to understand but would also add complexity.

Suggestions are welcome.

Member

I'm fine with this approach: if it's possible to deal with the situation immediately after the copy, that's definitely easier to follow than maintaining an extra refcount.

Member

I agree. An extra refcount starts to get into re-implementing garbage collection territory, and I'm not sure that's a game that is worth it. I'm comfortable with this as a known edge case, and documenting it with some related usage patterns and advice.
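
For readers following this thread, the edge case under discussion can be sketched with a toy model. This is an illustration only, not the code in this PR: ``FakeNSString``, ``FakeWrapper``, ``wrap`` and ``_cache`` are hypothetical names, the cache is a plain dict keyed by Python object identity, and ``copy`` on the toy immutable object simply retains and returns the receiver:

```python
OWNERSHIP_PREFIXES = (b"alloc", b"new", b"copy", b"mutableCopy")


class FakeNSString:
    """Toy immutable object: copy() retains and returns the receiver itself."""

    def __init__(self):
        self.retain_count = 1  # owned by whoever created it

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1

    def copy(self):
        return self.retain()


class FakeWrapper:
    """Toy Python wrapper holding one reference to the wrapped object."""

    def __init__(self, obj):
        self.obj = obj


_cache = {}  # hypothetical stand-in for the wrapper cache


def wrap(obj, returned_from_method=b""):
    owns = returned_from_method.startswith(OWNERSHIP_PREFIXES)
    cached = _cache.get(id(obj))
    if cached is not None:
        if owns:
            # The wrapper already holds a reference, so give back the extra
            # one that the copy call handed over.
            obj.release()
        return cached
    if not owns:
        obj.retain()
    wrapper = FakeWrapper(obj)
    _cache[id(obj)] = wrapper
    return wrapper


string = FakeNSString()
first = wrap(string, returned_from_method=b"stringWithString:")
count_after_first_wrap = string.retain_count

second = wrap(string.copy(), returned_from_method=b"copy")
count_after_copy_wrap = string.retain_count
```

Both calls return the same wrapper, and the retain count after wrapping the copy equals the count after the first wrap, so the single release on garbage collection stays balanced.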


return cached_obj
except KeyError:
pass

# Explicitly retain the instance on first handover to Python unless we
# received it from a method that gives us ownership already.
if not _returned_from_method.startswith(_OWNERSHIP_METHOD_PREFIXES):
send_message(object_ptr, "retain", restype=objc_id, argtypes=[])

# If the given pointer points to a class, return an ObjCClass instead (if we're not already creating one).
if not issubclass(cls, ObjCClass) and object_isClass(object_ptr):
return ObjCClass(object_ptr)
@@ -932,7 +880,6 @@ class or a metaclass, an instance of :class:`ObjCClass` or
super(ObjCInstance, type(self)).__setattr__(
self, "_as_parameter_", object_ptr
)
super(ObjCInstance, type(self)).__setattr__(self, "_needs_release", False)
if isinstance(object_ptr, objc_block):
super(ObjCInstance, type(self)).__setattr__(
self, "block", ObjCBlock(object_ptr)
@@ -944,39 +891,12 @@ class or a metaclass, an instance of :class:`ObjCClass` or

return self

def release(self):
"""Manually decrement the reference count of the corresponding objc
object.

The objc object is sent a dealloc message when its reference
count reaches 0. Calling this method manually should not be
necessary, unless the object was explicitly ``retain``\\ed
before. Objects returned from ``.alloc().init...(...)`` and
similar calls are released automatically by Rubicon when the
corresponding Python object is deallocated.
"""
self._needs_release = False
send_message(self, "release", restype=objc_id, argtypes=[])

def autorelease(self):
"""Decrements the receiver’s reference count at the end of the current
autorelease pool block.

The objc object is sent a dealloc message when its reference
count reaches 0. If called, the object will not be released when
the Python object is deallocated.
"""
self._needs_release = False
result = send_message(self, "autorelease", restype=objc_id, argtypes=[])
return ObjCInstance(result)

def __del__(self):
"""Release the corresponding objc instance if we own it, i.e., if it
was returned by a method starting with :meth:`alloc`, :meth:`new`,
:meth:`copy`, or :meth:`mutableCopy` and it wasn't already explicitly
released by calling :meth:`release` or :meth:`autorelease`."""
if self._needs_release:
send_message(self, "release", restype=objc_id, argtypes=[])
# Autorelease our reference on garbage collection of the Python wrapper. We use
# autorelease instead of release to allow ObjC to take ownership of an object when
# it is returned from a factory method.
if send_message and objc_id:
send_message(self, "autorelease", restype=objc_id, argtypes=[])

def __str__(self):
"""Get a human-readable representation of ``self``.