How Weak Properties Are Implemented in the ObjC Runtime (Part 1)

程序员文章站 2022-05-31 18:48:56


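Preface

How is a weak property implemented in Objective-C, and why does it automatically become nil after the object it points to is released? This article takes a brief look at that question.

Environment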

macOS Sierra 10.12.4
objc4-709

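Reference answer

The runtime lays out every registered class and puts weak references into a hash table, using the memory address of the object that the weak reference points to as the key. When that object's reference count drops to 0, it is deallocated. If the address of the weakly referenced object is a, the runtime looks up a in the weak table, finds all weak references keyed by a, and sets them to nil.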
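The idea in the reference answer can be illustrated with a small C++ sketch. It is a toy model, not the runtime's code: `ToyWeakTable`, `Object`, and every method name are made up for illustration, and a `std::unordered_map` stands in for the runtime's own hash table. The key is the address of the referenced object, and the value lists the addresses of the weak pointer variables that refer to it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an object; not a runtime type.
struct Object { int payload = 0; };

// Toy weak table: key = address of the referenced object,
// value = addresses of every weak pointer variable that points at it.
class ToyWeakTable {
public:
    // Make *location weakly reference obj and remember where the pointer lives.
    void registerWeak(Object *obj, Object **location) {
        assert(location != nullptr);
        *location = obj;
        if (obj) referrers_[obj].push_back(location);
    }

    // Forget one weak pointer (for example because its owner is going away).
    // The storage at location is left unchanged.
    void unregisterWeak(Object *obj, Object **location) {
        auto it = referrers_.find(obj);
        if (it == referrers_.end()) return;
        auto &locs = it->second;
        locs.erase(std::remove(locs.begin(), locs.end(), location), locs.end());
        if (locs.empty()) referrers_.erase(it);
    }

    // Called while obj is being destroyed: set every weak pointer to it to null.
    void clearOnDealloc(Object *obj) {
        auto it = referrers_.find(obj);
        if (it == referrers_.end()) return;
        for (Object **location : it->second) *location = nullptr;
        referrers_.erase(it);
    }

    // Number of objects that currently have at least one weak referrer.
    std::size_t entryCount() const { return referrers_.size(); }

private:
    std::unordered_map<Object *, std::vector<Object **>> referrers_;
};
```

Calling `clearOnDealloc(obj)` right before `obj` is freed sets every registered weak pointer to `nullptr`, which is the behaviour the answer describes.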

Test

Code

#import <Foundation/Foundation.h>

@interface WeakProperty : NSObject

@property (nonatomic,weak) NSObject *obj;


@end

@implementation WeakProperty

- (void)dealloc {
    NSLog(@"%s",__func__);
}

@end


int main(int argc, const char * argv[]) {
    @autoreleasepool {
        WeakProperty *property = [[WeakProperty alloc] init];
        NSObject *obj = [[NSObject alloc] init];
        property.obj = obj;
        NSLog(@"%@",property.obj);

        // Triggers `id objc_initWeak(id *location, id newObj)`
        // NSObject *obj = [[NSObject alloc] init];
        // __weak NSObject *obj2 = obj;
        // Triggers `void objc_copyWeak(id *dst, id *src)`
        // __weak NSObject *obj3 = obj2;
    }
    return 0;
}

Result

When the setter of the object's weak property is called


Calls id objc_storeWeak(id *location, id newObj)
Calls static id storeWeak(id *location, objc_object *newObj)
…

When NSLog prints the property.obj property


Calls id objc_loadWeakRetained(id *location)

When dealloc releases the object


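Calls void objc_destroyWeak(id *location)

Summary

The storeWeak function assigns a value to a weak property (this includes destroying it)
The objc_loadWeakRetained function reads a weak property

Observation & Analysis

For the storeWeak function, we mainly analyze calls in two situations

Assignment, i.e. id objc_storeWeak(id *location, id newObj)
Destruction, i.e. void objc_destroyWeak(id *location)

For reading a weak property, we mainly analyze

The function id objc_loadWeakRetained(id *location)

Observation: `id objc_storeWeak(id *location, id newObj)`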

/** 
 * This function stores a new value into a __weak variable. It would
 * be used anywhere a __weak variable is the target of an assignment.
 * 
 * @param location The address of the weak pointer itself
 * @param newObj The new object this weak ptr should now point to
 * 
 * @return \e newObj
 */
id
objc_storeWeak(id *location, id newObj)
{
    return storeWeak<DoHaveOld, DoHaveNew, DoCrashIfDeallocating>
        (location, (objc_object *)newObj);
}

This function simply calls storeWeak.

Observation: `void objc_destroyWeak(id *location)`

/** 
 * Destroys the relationship between a weak pointer
 * and the object it is referencing in the internal weak
 * table. If the weak pointer is not referencing anything, 
 * there is no need to edit the weak table. 
 *
 * This function IS NOT thread-safe with respect to concurrent 
 * modifications to the weak variable. (Concurrent weak clear is safe.)
 * 
 * @param location The weak pointer address. 
 */
void
objc_destroyWeak(id *location)
{
    (void)storeWeak<DoHaveOld, DontHaveNew, DontCrashIfDeallocating>
        (location, nil);
}

This function also simply calls storeWeak.

Source of the `storeWeak` function

template <HaveOld haveOld, HaveNew haveNew,
          CrashIfDeallocating crashIfDeallocating>
static id 
storeWeak(id *location, objc_object *newObj)
{
    assert(haveOld  ||  haveNew);
    if (!haveNew) assert(newObj == nil);

    Class previouslyInitializedClass = nil;
    id oldObj;
    SideTable *oldTable;
    SideTable *newTable;

    // Acquire locks for old and new values.
    // Order by lock address to prevent lock ordering problems. 
    // Retry if the old value changes underneath us.
 retry:
    if (haveOld) {
        oldObj = *location;
        oldTable = &SideTables()[oldObj];
    } else {
        oldTable = nil;
    }
    if (haveNew) {
        newTable = &SideTables()[newObj];
    } else {
        newTable = nil;
    }

    SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);

    if (haveOld  &&  *location != oldObj) {
        SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
        goto retry;
    }

    // Prevent a deadlock between the weak reference machinery
    // and the +initialize machinery by ensuring that no 
    // weakly-referenced object has an un-+initialized isa.
    if (haveNew  &&  newObj) {
        Class cls = newObj->getIsa();
        if (cls != previouslyInitializedClass  &&  
            !((objc_class *)cls)->isInitialized()) 
        {
            SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
            _class_initialize(_class_getNonMetaClass(cls, (id)newObj));

            // If this class is finished with +initialize then we're good.
            // If this class is still running +initialize on this thread 
            // (i.e. +initialize called storeWeak on an instance of itself)
            // then we may proceed but it will appear initializing and 
            // not yet initialized to the check above.
            // Instead set previouslyInitializedClass to recognize it on retry.
            previouslyInitializedClass = cls;

            goto retry;
        }
    }

    // Clean up old value, if any.
    if (haveOld) {
        weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
    }

    // Assign new value, if any.
    if (haveNew) {
        newObj = (objc_object *)
            weak_register_no_lock(&newTable->weak_table, (id)newObj, location, 
                                  crashIfDeallocating);
        // weak_register_no_lock returns nil if weak store should be rejected

        // Set is-weakly-referenced bit in refcount table.
        if (newObj  &&  !newObj->isTaggedPointer()) {
            newObj->setWeaklyReferenced_nolock();
        }

        // Do not set *location anywhere else. That would introduce a race.
        *location = (id)newObj;
    }
    else {
        // No new value. The storage is not changed.
    }

    SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);

    return (id)newObj;
}

The function can be analyzed while stepping through it in lldb.

Analysis: `id objc_storeWeak(id *location, id newObj)`

// Template parameters.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };

The template arguments passed here are DoHaveOld (true) and DoHaveNew (true).
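To see what passing enum values as template arguments buys, here is a minimal sketch. The two enums are copied from the snippet above; `stepsFor` is a hypothetical function that only reports which branches a given instantiation would take, so it is an illustration of the technique rather than anything from the runtime.

```cpp
#include <cassert>
#include <string>

// Same shape as the runtime's enums shown above.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };

// Hypothetical helper: lists the steps a storeWeak-like function would run
// for one instantiation. Each `if` tests a compile-time constant, so the
// compiler is free to drop the branch that can never be taken.
template <HaveOld haveOld, HaveNew haveNew>
std::string stepsFor() {
    assert(haveOld || haveNew);  // at least one side must exist
    std::string steps;
    if (haveOld) steps += "unregister-old ";
    if (haveNew) steps += "register-new ";
    return steps;
}
```

With `<DoHaveOld, DoHaveNew>` both steps are listed, while `<DoHaveOld, DontHaveNew>`, the combination used by objc_destroyWeak, lists only the unregister step.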


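Under the x86-64 calling convention, when a function takes no more than six integer or pointer arguments, they are passed left to right in the registers rdi, rsi, rdx, rcx, r8, r9. Here location and newObj come from rdi and rsi respectively.

Combining the comments with a comparison of the addresses shows that location is the address of the weak pointer itself, and newObj is the address that the weak pointer should point to. In the current scenario that is the obj variable assigned to the obj property of WeakProperty.

In the current scenario, this means that after storeWeak runs, the value stored at memory address 0x0000000101301638 is 0x0000000101301490.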


Background: `SideTable`

The SideTable struct is treated as a black box in this article.

struct SideTable {
    spinlock_t slock;
    RefcountMap refcnts;
    weak_table_t weak_table;

    SideTable() {
        memset(&weak_table, 0, sizeof(weak_table));
    }

    ~SideTable() {
        _objc_fatal("Do not delete SideTable.");
    }

    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void forceReset() { slock.forceReset(); }

    // Address-ordered lock discipline for a pair of side tables.

    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

As for spinlock_t, the Wikipedia entry on Spinlock explains it as follows

In software engineering, a spinlock is a lock which causes a thread trying to acquire it to simply wait in a loop (“spin”) while repeatedly checking if the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (that which holds the lock) blocks, or “goes to sleep”.

Example

; Intel syntax

locked:                      ; The lock variable. 1 = locked, 0 = unlocked.
     dd      0               ; Define the lock variable; it defaults to 0.

spin_lock:
     mov     eax, 1          ; Set the EAX register to 1.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.
                             ; This will always store 1 to the lock, leaving
                             ;  the previous value in the EAX register.

     test    eax, eax        ; Test EAX with itself. Among other things, this will
                             ;  set the processor's Zero Flag if EAX is 0.
                             ; If EAX is 0, then the lock was unlocked and
                             ; we just locked it.
                             ; Otherwise, EAX is 1 and we didn't acquire the lock.
     jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag is
                             ;  not set; the lock was previously locked, and so
                             ; we need to spin until it becomes unlocked.
     ret                     ; The lock has been acquired, return to the calling
                             ;  function.

; Once the holder finishes and releases the lock, locked becomes 0 again. The next
; thread that runs spin_lock then reads 0 into EAX, so it acquires the lock, and
; locked becomes 1 once more.

spin_unlock:
     mov     eax, 0          ; Set the EAX register to 0.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.

     ret                     ; The lock has been released.

In short, a thread that tries to take a spinlock waits in a loop until the lock becomes available.
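The same idea can be written in portable C++ as a minimal sketch. `SpinLock` and `incrementGuarded` are hypothetical names, and `std::atomic_flag::test_and_set` plays the role of the atomic XCHG in the assembly listing; this is not how the runtime's spinlock_t is implemented.

```cpp
#include <atomic>

// Minimal spinlock sketch built on std::atomic_flag.
class SpinLock {
public:
    // One attempt: atomically store `true` and inspect the previous value.
    // Returns true if the lock was free and now belongs to the caller.
    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }

    // Keep retrying ("spinning") until an attempt succeeds.
    void lock() {
        while (!try_lock()) {
            // busy-wait
        }
    }

    // Store `false` so that the next attempt can succeed.
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical usage: bump a counter while holding the lock.
int incrementGuarded(SpinLock &lock, int &counter) {
    lock.lock();
    int value = ++counter;
    lock.unlock();
    return value;
}
```

A second `try_lock()` while the lock is held fails, which is the case in which the assembly version jumps back to `spin_lock`.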

The comment on the weak_table_t struct explains that it stores object ids as keys and weak_entry_t structs as their values.

/**
 * The global weak references table. Stores object ids as keys,
 * and weak_entry_t structs as their values.
 */
struct weak_table_t {
    weak_entry_t *weak_entries;
    size_t    num_entries;
    uintptr_t mask;
    uintptr_t max_hash_displacement;
};

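The SideTable struct can be viewed as a collection with locking support whose elements are stored as key-value pairs.

The ObjC entry function _objc_init calls arr_init to initialize the static variable SideTableBuf.

Main analysis: `id objc_storeWeak(id *location, id newObj)`

Entering the if (haveOld) branch

A new element is being created, so the original value stored at location is nil.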


Entering the SideTables() function

static StripedMap<SideTable>& SideTables() {
    return *reinterpret_cast<StripedMap<SideTable>*>(SideTableBuf);
}

A discussion of reinterpret_cast

reinterpret_cast is the most dangerous cast, and should be used very sparingly. It turns one type directly into another - such as casting the value from one pointer to another, or storing a pointer in an int, or all sorts of other nasty things. Largely, the only guarantee you get with reinterpret_cast is that normally if you cast the result back to the original type, you will get the exact same value (but not if the intermediate type is smaller than the original type). There are a number of conversions that reinterpret_cast cannot do, too. It’s used primarily for particularly weird conversions and bit manipulations, like turning a raw data stream into actual data, or storing data in the low bits of an aligned pointer.

It is a way of forcibly converting one type into another.
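The pattern used by SideTables(), a raw static buffer that is later viewed as an object through reinterpret_cast, can be sketched as follows. All names (`ToyTables`, `ToyTablesBuf`, `toyTablesInit`, `toyTables`) are hypothetical, and constructing the object with placement new is an assumption made for this sketch; the excerpts quoted in this article do not show how arr_init fills SideTableBuf.

```cpp
#include <cassert>
#include <new>

// Hypothetical table type standing in for StripedMap<SideTable>.
struct ToyTables {
    int slots[8];
    ToyTables() { for (int &s : slots) s = 0; }
};

// Raw, suitably aligned storage. No constructor or destructor is ever run
// for it automatically.
alignas(ToyTables) static unsigned char ToyTablesBuf[sizeof(ToyTables)];
static bool toyTablesReady = false;

// Construct the object inside the buffer, once, at a moment we choose.
void toyTablesInit() {
    new (ToyTablesBuf) ToyTables();
    toyTablesReady = true;
}

// View the raw bytes as the object that was constructed there.
ToyTables &toyTables() {
    assert(toyTablesReady);
    return *reinterpret_cast<ToyTables *>(ToyTablesBuf);
}
```

One plausible reason for this design is that an object living in a raw buffer is never destroyed at program exit, which fits the ~SideTable destructor shown earlier that must never run.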


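SideTableBuf is the buffer of size 4096 that backs the array of SideTable values. The assignment to oldTable amounts to fetching an element of that array, with the object pointer selecting the slot; nil can be treated as 0, which selects the first element.

Likewise, haveNew is true, so newTable is the element of SideTableBuf looked up with newObj as the index.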
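How a pointer can serve as an index into a fixed-size array is sketched below. `ToyStripedMap` is a hypothetical class, and both the stripe count of 64 and the shift-and-xor hash are assumptions chosen for illustration; the article does not show the body of StripedMap, so treat the exact numbers as placeholders.

```cpp
#include <cstddef>
#include <cstdint>

// Toy map from addresses to a fixed number of slots ("stripes").
template <typename T, std::size_t StripeCount = 64>
class ToyStripedMap {
public:
    // Reduce an address to a slot index. Many addresses share one slot,
    // so the map never grows and never needs rehashing.
    static std::size_t indexOf(const void *p) {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
    }

    // Look up the slot that belongs to an address.
    T &operator[](const void *p) { return slots_[indexOf(p)]; }

private:
    T slots_[StripeCount] = {};
};
```

With this hash a null pointer maps to index 0, which matches the observation that nil selects the first element.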

Calling the SideTable::lockTwo method

SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);

Entering the SideTable::lockTwo method

template<>
void SideTable::lockTwo<DoHaveOld, DoHaveNew>
    (SideTable *lock1, SideTable *lock2)
{
    spinlock_t::lockTwo(&lock1->slock, &lock2->slock);
}

Entering the lockTwo method

// Address-ordered lock discipline for a pair of locks.

static void lockTwo(mutex_tt *lock1, mutex_tt *lock2) {
   if (lock1 < lock2) {
       lock1->lock();
       lock2->lock();
   } else {
       lock2->lock();
       if (lock2 != lock1) lock1->lock(); 
   }
}
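The address-ordered discipline above can be tried out without threads. In this sketch `lockTwo` and `unlockTwo` are generic re-creations of the logic shown above, and `TracingLock` is a hypothetical lock that merely records the order in which it is acquired, so that the ordering becomes observable.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical lock that logs each acquisition instead of blocking.
struct TracingLock {
    std::vector<const TracingLock *> *log;
    bool held = false;
    explicit TracingLock(std::vector<const TracingLock *> *l) : log(l) {}
    void lock()   { assert(!held); held = true; log->push_back(this); }
    void unlock() { assert(held);  held = false; }
};

// Always take the lock at the lower address first. Two callers that pass the
// same pair in opposite order then still acquire in the same order, which
// rules out the classic two-lock deadlock.
template <typename Lock>
void lockTwo(Lock *lock1, Lock *lock2) {
    assert(lock1 && lock2);
    if (std::less<Lock *>()(lock1, lock2)) {
        lock1->lock();
        lock2->lock();
    } else {
        lock2->lock();
        if (lock2 != lock1) lock1->lock();  // same lock twice: take it once
    }
}

template <typename Lock>
void unlockTwo(Lock *lock1, Lock *lock2) {
    assert(lock1 && lock2);
    lock1->unlock();
    if (lock2 != lock1) lock2->unlock();
}
```

Because the functions are templates, the same code would also accept `std::mutex`; the tracing lock is only there to make the acquisition order visible.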

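Checking the if (haveOld && *location != oldObj) condition

oldObj was assigned from *location, so the two are normally equal. If they differ, *location was changed underneath us between the read and the lock (for example by another thread), so the function unlocks and jumps back to retry. This is a safeguard.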
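The read, lock, re-check, retry pattern can be sketched in isolation. `lockMatchingValue` and `RetryResult` are hypothetical, and the two callbacks stand in for locking and unlocking the side table that belongs to a value. The sketch is single-threaded on purpose; concurrent code would also need an atomic read of the slot.

```cpp
#include <cassert>
#include <functional>

// Which value ended up locked, and how many attempts that took.
struct RetryResult { int lockedValue; int attempts; };

RetryResult lockMatchingValue(const int &slot,
                              const std::function<void(int)> &lockFor,
                              const std::function<void(int)> &unlockFor) {
    assert(lockFor && unlockFor);
    int attempts = 0;
    for (;;) {
        ++attempts;
        int seen = slot;       // 1. read without holding any lock
        lockFor(seen);         // 2. lock whatever belongs to that value
        if (slot == seen) {    // 3. unchanged, so the right lock is held
            return {seen, attempts};
        }
        unlockFor(seen);       // 4. stale read: release and start over
    }
}
```

If the slot changes between steps 1 and 2, the lock taken in step 2 belongs to a stale value, so it is released and the loop starts over, just like the goto retry in storeWeak.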

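Checking the if (haveNew && newObj) condition

According to the comment, haveNew && newObj is also a safeguard: it makes sure the class of newObj has finished +initialize, which prevents a deadlock between the weak reference machinery and the +initialize machinery.

Clearing the old value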

if (haveOld) {
   weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
}

Assigning the new value

// Assign new value, if any.
if (haveNew) {
   newObj = (objc_object *)
       weak_register_no_lock(&newTable->weak_table, (id)newObj, location, 
                             crashIfDeallocating);
   // weak_register_no_lock returns nil if weak store should be rejected

   // Set is-weakly-referenced bit in refcount table.
   if (newObj  &&  !newObj->isTaggedPointer()) {
       newObj->setWeaklyReferenced_nolock();
   }

   // Do not set *location anywhere else. That would introduce a race.
   *location = (id)newObj;
}
else {
   // No new value. The storage is not changed.
}

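The weak reference is saved into the corresponding weak_table_t struct: newObj is the key, and location, the address of the weak pointer, is recorded in the weak_entry_t stored as its value.

Calling the SideTable::unlockTwo method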

SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);

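Analysis: `void objc_destroyWeak(id *location)`

Because the template argument passed is DontHaveNew, once the old value has been unregistered the function does not enter the if (haveNew) branch to store a new value.

Analysis: `id objc_loadWeakRetained(id *location)`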

retry:
    // fixme std::atomic this load
    obj = *location;
    ...
    result = obj;
    ... 
    return result;

Dereferencing location with the * operator yields the address that the weak reference points to.
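As a loose analogy only, standard C++ offers a similar "load through a weak reference" operation. `std::weak_ptr::lock()` returns either a strong reference that keeps the object alive or an empty pointer when the object is already gone. The sketch below uses only the standard library; `readThroughWeak` is a hypothetical helper, and nothing here claims that objc_loadWeakRetained is implemented this way.

```cpp
#include <memory>

// Read a value through a weak reference, falling back when the target is gone.
int readThroughWeak(const std::weak_ptr<int> &weak, int fallback) {
    // lock() yields a strong reference (object kept alive while we use it)
    // or an empty shared_ptr if the object has already been destroyed.
    std::shared_ptr<int> strong = weak.lock();
    return strong ? *strong : fallback;
}
```

Once the last owner lets go of the object, the same call returns the fallback, much like a weak property that reads as nil after its target is released.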

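Conclusion

Through a rough analysis of the ObjC runtime, this article looked at how a weak property is stored, read, and released. The ObjC runtime keeps a static key-value table that holds the weak references to objects: the key is the address of the weakly referenced object, and the value records the weak references pointing at it. When the object is destroyed, the table is looked up by that key and the corresponding weak references are removed from the table.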
