Carpe diem (Felix's blog)

I am a happy developer

C/ObjC Block Byref Internals

In the last post, I mentioned that a __block variable (called a block byref here) is retained when multiple blocks reference it. Here is some sample code that shows how the runtime deals with the reference counts.

In order to move the __block variable to the heap, the compiler must rewrite every access to such a variable so that it goes indirectly through the structure's forwarding pointer. For example:

    int __block i = 10;
    i = 11;

would be rewritten to be:

    struct _block_byref_i {
      void *isa;
      struct _block_byref_i *forwarding;
      int flags;   //refcount;
      int size;
      int captured_i;
    } i = { NULL, &i, 0, sizeof(struct _block_byref_i), 10 };

    i.forwarding->captured_i = 11;
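
To see why the indirection matters, here is a minimal sketch in plain C (no blocks needed) of what happens when a byref moves to the heap. The names byref_int, byref_move_to_heap, byref_set and byref_get are hypothetical and only mimic the idea; this is not the runtime's actual code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the compiler-generated byref structure. */
struct byref_int {
    void *isa;
    struct byref_int *forwarding;
    int flags;
    int size;
    int captured_i;
};

/* Copy a stack byref to the heap and redirect the stack copy's forwarding
 * pointer to the heap copy. Returns NULL if allocation fails. */
static struct byref_int *byref_move_to_heap(struct byref_int *stack_ref)
{
    assert(stack_ref->forwarding == stack_ref); /* not moved yet */
    struct byref_int *heap_ref = malloc(sizeof *heap_ref);
    if (heap_ref == NULL)
        return NULL;
    memcpy(heap_ref, stack_ref, sizeof *heap_ref);
    heap_ref->forwarding = heap_ref;   /* heap copy points at itself */
    stack_ref->forwarding = heap_ref;  /* stack copy now points at the heap */
    return heap_ref;
}

/* Every access goes through forwarding, so the same code works
 * before and after the move. */
static void byref_set(struct byref_int *ref, int value)
{
    ref->forwarding->captured_i = value;
}

static int byref_get(const struct byref_int *ref)
{
    return ref->forwarding->captured_i;
}
```

After byref_move_to_heap(&i), a call such as byref_set(&i, 11) writes to the heap copy even though it is handed the stack address, which is the reason the compiler never touches captured_i directly.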

Once we know how a block byref is laid out, we can locate it in memory and dump it with the runtime's internal function _Block_byref_dump.
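
Locating the structure from the address of the variable is just pointer arithmetic. The sketch below assumes the layout shown earlier (a __block int with no copy/dispose helpers); byref_hdr_int and byref_from_value are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for a __block int without copy/dispose helpers. */
struct byref_hdr_int {
    void *isa;
    struct byref_hdr_int *forwarding;
    int flags;
    int size;
    int value;   /* the captured variable itself */
};

/* Step back from the address of the captured variable to the start
 * of the structure that contains it. */
static struct byref_hdr_int *byref_from_value(int *value_ptr)
{
    assert(value_ptr != NULL);
    return (struct byref_hdr_int *)
        ((char *)value_ptr - offsetof(struct byref_hdr_int, value));
}
```

On common ABIs the offset of value equals 2*sizeof(void *) + 2*sizeof(int), which is the constant that the hand-written helper in the program below subtracts.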

more_curious.c source
/*
 * clang -Wall -fblocks -framework Foundation more_curious.c -o more_curious
 */
#include <stdio.h>
#include <Block.h>

struct Block_byref {
    void *isa;
    struct Block_byref *forwarding;
    int flags; /* refcount; */
    int size;
    void (*byref_keep)(struct Block_byref *dst, struct Block_byref *src);
    void (*byref_destroy)(struct Block_byref *);
    /* long shared[0]; */
};

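/* Step back from the address of a __block variable to its byref header
 * (isa, forwarding, flags, size). Valid only for variables without
 * copy/dispose helpers, such as a plain int. */
static __inline struct Block_byref* derefBlockVar(char* src)
{
    return (struct Block_byref*) (src - 2*sizeof(int) - 2*sizeof(void *));
}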

extern const char *_Block_dump(const void *block);
extern const char *_Block_byref_dump(struct Block_byref *src);



typedef void(^BoringBlock)(void);
void (^boringBlock)(void);

BoringBlock blockRefCountTest(void)
{
    __block int x = 1;

    printf("Before local block:\n%s\n\n",_Block_byref_dump(derefBlockVar((char*)&x)));
    BoringBlock localBlock = ^{
        x++;
        printf("Execute block:\n%s\n",_Block_byref_dump(derefBlockVar((char*)&x)));
        printf("x is %d, &x is %p\n", x, &x);
    };
    printf("After local block generated:\n%s\n\n",_Block_byref_dump(derefBlockVar((char*)&x)));

    boringBlock = Block_copy(localBlock);
    printf("After first block copy:\n%s\n\n",_Block_byref_dump(derefBlockVar((char*)&x)));

    BoringBlock retBlock = Block_copy(localBlock);
    printf("After second block copy:\n%s\n\n",_Block_byref_dump(derefBlockVar((char*)&x)));
    return retBlock;
}

int main (void)
{
    BoringBlock retBlock = blockRefCountTest();
    boringBlock();
    Block_release(boringBlock);
    retBlock();
    Block_release(retBlock);

    return 0;
}

The execution result is:

$ ./more_curious 
Before local block:
byref data block 0x7fff6e8034f0 contents:
  forwarding: 0x7fff6e8034f0
  flags: 0x0
  size: 32


After local block generated:
byref data block 0x7fff6e8034f0 contents:
  forwarding: 0x7fff6e8034f0
  flags: 0x0
  size: 32


After first block copy:
byref data block 0x7fc191c13f60 contents:
  forwarding: 0x7fc191c13f60
  flags: 0x1000004
  size: 32


After second block copy:
byref data block 0x7fc191c13f60 contents:
  forwarding: 0x7fc191c13f60
  flags: 0x1000006
  size: 32


Execute block:
byref data block 0x7fc191c13f60 contents:
  forwarding: 0x7fc191c13f60
  flags: 0x1000004
  size: 32

x is 2, &x is 0x7fc191c13f78
Execute block:
byref data block 0x7fc191c13f60 contents:
  forwarding: 0x7fc191c13f60
  flags: 0x1000002
  size: 32

x is 3, &x is 0x7fc191c13f78

What does it mean?

We can find some interesting things in this log:

  1. The block byref’s flags and address don’t change until the first copy.
  2. After the copy, flags becomes 0x1000004. The leading 1 is the (1 << 24) flag.
  3. Each block release decreases flags by 2.

The (1 << 24) flag (the leading 1 in 0x1000004) means BLOCK_NEEDS_FREE in this enum:

Block_private.h source
enum {
    BLOCK_REFCOUNT_MASK =     (0xffff),
    BLOCK_NEEDS_FREE =        (1 << 24),
    BLOCK_HAS_COPY_DISPOSE =  (1 << 25),
    BLOCK_HAS_CTOR =          (1 << 26), /* Helpers have C++ code. */
    BLOCK_IS_GC =             (1 << 27),
    BLOCK_IS_GLOBAL =         (1 << 28),
    BLOCK_HAS_DESCRIPTOR =    (1 << 29)
};

So it makes sense that flags doesn’t change until the first copy: a block byref doesn’t need to be freed until it has been copied to the heap.

The reference count is extracted from flags like so:

    refcount = shared_struct->flags & BLOCK_REFCOUNT_MASK;
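
As a worked example, the masking step can be applied to the values printed in the log. This sketch uses only the two enum values quoted above; byref_flags_info and decode_byref_flags are hypothetical helper names, not part of the runtime.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as quoted from Block_private.h in this post. */
enum {
    BLOCK_REFCOUNT_MASK = 0xffff,
    BLOCK_NEEDS_FREE    = 1 << 24
};

/* Hypothetical helper type: the two pieces of information we care about. */
struct byref_flags_info {
    int  refcount_bits;  /* flags & BLOCK_REFCOUNT_MASK */
    bool needs_free;     /* true once the byref lives on the heap */
};

static struct byref_flags_info decode_byref_flags(int flags)
{
    /* The count bits and the NEEDS_FREE bit must not overlap. */
    assert((BLOCK_REFCOUNT_MASK & BLOCK_NEEDS_FREE) == 0);

    struct byref_flags_info info;
    info.refcount_bits = flags & BLOCK_REFCOUNT_MASK;
    info.needs_free    = (flags & BLOCK_NEEDS_FREE) != 0;
    return info;
}
```

With this, 0x1000004 decodes to count bits 4 with needs_free set, 0x1000006 to count bits 6, and the stack value 0x0 to count bits 0 with needs_free clear.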

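I didn’t find out why the reference count moves in steps of two; the code below changes the stored value by one per call, so it doesn’t explain the factor by itself. The actual code that increases and decreases the reference count is this: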

runtime.c source
static int latching_incr_int(int *where) {
    while (1) {
        int old_value = *(volatile int *)where;
        if ((old_value & BLOCK_REFCOUNT_MASK) == BLOCK_REFCOUNT_MASK) {
            return BLOCK_REFCOUNT_MASK;
        }
        if (OSAtomicCompareAndSwapInt(old_value, old_value+1, (volatile int *)where)) {
            return old_value+1;
        }
    }
}

static int latching_decr_int(int *where) {
    while (1) {
        int old_value = *(volatile int *)where;
        if ((old_value & BLOCK_REFCOUNT_MASK) == BLOCK_REFCOUNT_MASK) {
            return BLOCK_REFCOUNT_MASK;
        }
        if ((old_value & BLOCK_REFCOUNT_MASK) == 0) {
            return 0;
        }
        if (OSAtomicCompareAndSwapInt(old_value, old_value-1, (volatile int *)where)) {
            return old_value-1;
        }
    }
}

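OSAtomicCompareAndSwapInt atomically compares the int at the given address with the expected old value and, only if they match, stores the new value. It returns whether the swap happened, and it is safe across threads and processors.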
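
For readers without the OSAtomic API at hand, the same retry loop can be sketched with C11 atomics. This is a portable analog under my own assumptions, not the runtime's code: REFCOUNT_MASK, latching_incr and latching_decr are hypothetical names, and atomic_compare_exchange_weak stands in for OSAtomicCompareAndSwapInt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define REFCOUNT_MASK 0xffff  /* same value as BLOCK_REFCOUNT_MASK above */

/* Increment unless the count bits are saturated; returns the new flags
 * value, or REFCOUNT_MASK if the count is latched at its maximum. */
static int latching_incr(atomic_int *where)
{
    assert(where != NULL);
    int old_value = atomic_load(where);
    for (;;) {
        if ((old_value & REFCOUNT_MASK) == REFCOUNT_MASK)
            return REFCOUNT_MASK;
        /* On failure, old_value is refreshed with the current contents
         * and the loop retries. */
        if (atomic_compare_exchange_weak(where, &old_value, old_value + 1))
            return old_value + 1;
    }
}

/* Decrement unless the count bits are saturated or already zero. */
static int latching_decr(atomic_int *where)
{
    assert(where != NULL);
    int old_value = atomic_load(where);
    for (;;) {
        if ((old_value & REFCOUNT_MASK) == REFCOUNT_MASK)
            return REFCOUNT_MASK;
        if ((old_value & REFCOUNT_MASK) == 0)
            return 0;
        if (atomic_compare_exchange_weak(where, &old_value, old_value - 1))
            return old_value - 1;
    }
}
```

The "latching" part is the first check: once the count bits reach the mask value they are never changed again, so an overflowing count can leak the object but can never wrap around to zero and free it early.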

Conclusion

Blocks seem magical at first sight. With blocks we no longer have to do the function pointer + struct cast + void* tricks. Blocks automatically capture variables for us, and we can use the __block storage qualifier to declare mutable ones. Behind the scenes is a really cool hack that makes all this happen. However, it is not easy to debug blocks and byrefs. We’d need to write some helper functions for gdb or lldb. These will be discussed in my next post.
