«Cambridge CB3 0FD, U.K. IBM T.J. Watson Research Center P.O. Box 704, Yorktown Heights, NY 10598, U.S.A. Abstract. There are many algorithms for ...»
If the collector is not in its tracing phase, it simply performs the write: because it is the tracing phase that determines reachability of objects, only object graph mutations during tracing can affect reachability (object graph additions – via allocation – require some additional synchronization, which is described below).
An object can be protected either (1) when a pointer to it is stored or (2) when a pointer to it is overwritten. We call saving the pointer at 1 an installation barrier and saving the pointer at 2 a deletion barrier. The Dijkstra-style barrier is an instance of an installation barrier; the Yuasa-style barrier is an instance of a deletion barrier.
Earlier, we described our collector as a combination of tracing and reference counting. The reference counting is done in the write barrier. In particular, we keep a count of the number of references to an unmarked object from scanned portions of the heap. This is called the Scanned Reference Count or SRC. The SRC is one of the most important aspects of our abstract algorithm and allows for a number of interesting insights.
The SRC allows us to defer reachability decisions from the time of a write barrier to the time when collector tracing is ﬁnished. For example, if a pointer to an object is installed into the scanned portion of the heap, and subsequently removed from the scanned portion of the heap, then it can not possibly affect the liveness of the object.
ROOTS ROOTS ROOTS ROOTS
Fig. 3. Erroneous collection of live object Z via deletion of direct pointer a from object X.
Object Allocation Besides pointer assignments, the mutator can also add objects to the connectivity graph. Similarly to pointer assignments, the allocation interacts with the tracing phase. In addition, allocation also interacts with the sweeping phase of the collector. This is performed in the procedure AllocateBarrier() in Fig 2.
In terms of reachability, if the collector is in its tracing phase, object allocation can be seen as just another pointer modiﬁcation event. The main difference between allocation and pointer writes is that upon allocation we know that the new pointer is unique. We also know that the new object does not contain any outgoing pointers.
During the sweeping phase, the collector iterates over the heap, reclaims all unreachable objects and resets the state of the live objects. We assume that we can designate which parts of the heap the collector has passed indicated by the variable Heap.Hue.
The variable is similar to Shade, except Shade is applied per object while Hue is applied per heap. That is, we have one Hue variable. Similarly to Shade, the variable is monotonic within the same collector cycle.
If the mutator allocates during the collector’s sweeping phase, we require a mechanism to protect the object from being collected erroneously. The ﬁeld DontSweep indicates if the object has been allocated in a part of the heap that the collector has yet to reach in its sweeping action.
2.4 Lost Object Problem
Fig. 4. Erroneous collection of live object S via deletion of pointer c from object Q which transitively reaches S through R.
from an unscanned but reachable object X. The mutator is then immediately preempted by the collector and in step D3, the collector processes object X, turns it black (scanned) and assumes that its marking phase is completed. Next, in step D4, the collector starts its sweeping phase and erroneously frees object Z, although Z is reachable from Y via pointer b. In this case we say that object Z is directly hidden from the collector.
Alternatively, an object can be hidden transitively. This case is illustrated in Fig. 4.
In the initial state, object P is scanned and Q, R, and S are reachable but not yet seen.
Starting from this state, in step T1, the mutator introduces pointer e from a scanned and visited object P to object S. In step T2, the mutator destroys the unscanned pointer c from Q to R, essentially, destroying the only path starting from Q to object S. Next, in step T3, the collector preempts the mutator and scans object Q as shown and assumes to have ﬁnished the tracing phase. In step T4, the collector incorrectly frees object S. In this case we say that object S was transitively hidden from the collector.
The lost object problem consists of two main events in time: storing a pointer to the particular object to be lost and in a subsequent step destroying all other paths to that object. The two well-known solutions to this problem operate at either of these two steps. They either operate at state D1/T1 or at state D2/T2. Dijsktra’s and Steele’s solutions operate at states D1/T1 and aim to prevent the un-acknowledged introduction of pointers from scanned portions of the heap to reachable but unmarked objects. They essentially speculate that a pointer destruction will occur sometime in the future, and this will lead to hiding of the object. Alternatively, solutions can operate at steps D2/T2.
When a pointer is destroyed as in steps D2 and T2, we reason that a pointer to the object must have been introduced earlier and make the target of the overwritten object reachable. This is the solution chosen by Yuasa. For example, Yuasa would make Z live when pointer a is removed in step D2 or pointer c is removed in step T2. In the transitive case, even though object R might have become unreachable when the pointer is destroyed in step T2, Yuasa’s solution requires that object R is kept live as a potential only path left to the hidden object S.
2.5 Design Alternatives The abstract algorithm maintains rich object and heap-level information. This section attempts to provide an intuitive understanding of the abstract algorithm.
The essence of the abstract algorithm is that it allows for deferring reachability decisions from the mutator to the collector. That is, in the write barrier the mutator detects a potential problem and nominates a candidate pointer for the collector. Subsequently, before the termination of its tracing phase, the collector examines the nominated pointers and optionally discards unnecessary candidates. The speciﬁc choices of which pointers are selected by the mutator and which pointers are processed by the collector are discussed in the following sections.
Mutator Selection When a mutator hits the write barrier, it can protect an object using either the installation choice or the deletion choice. Intuitively, to protect an object, the mutator speculates about reachability, since it has no knowledge of how the graph changes before the collector has ﬁnished tracing. In the abstract algorithm, the mutator detects a potential problem, but does not make explicit decisions whether the object is reachable at the end of tracing.
If the installation choice is utilized, the object is nominated by the mutator as soon as the SRC becomes 0, thereby, protecting the object directly rather than transitively.
The installation choice speculates that right after the SRC becomes 0, the only path to the object from an unscanned, but reachable object will be destroyed. Immediately after nominating the pointer, the SRC could be decremented back to zero effectively undoing the previous operation.
For the deletion approach, if a pointer in an unscanned object is overwritten, another object can become hidden either transitively or directly. If the SRC(X) is 0 and a pointer to object X is overwritten from an unscanned portion of the heap, we need to protect object X directly. Therefore, the mutator must nominate this pointer. Alternatively, if the SRC(X) is 0, we might need to protect some transitively reachable object from X. The key is to recognize that if X does not contain any outgoing pointers, then no object can be hidden transitively. In such cases, we do not need to nominate X.
Determining whether object X is a leaf can be done by using the type of the object.
Examples of acyclic types are scalar arrays as well as newly allocated objects before pointers are stored into them. Objects of acyclic types are leaves for their entire lifetime while newly allocated objects can be leaves only temporarily.
Moreover, even if object X is not a leaf, a write barrier could possibly perform nested checks and determine that at, for example, two-levels deep all objects pointed from X are leaves and their SRC is 0. In this case, we can again refrain from nominating the overwritten X pointer.
In some way, it would be logical to make a conclusion that the deletion choice should be more precise, since it always reasons about an event which has already occurred: the SRC of some object has become 0. The installation choice speculates about the future, that may be at some point an unscanned pointer will be destroyed.
Although a deletion collectors reasons about past event and should have more information, it has no practical way of determining those transitive objects whose SRC 0. In contrast, the installation choice always has an immediate access to the critical object.
Besides pointer events, the mutator can modify the connectivity graph via object allocation. Allocation can be seen as an instance of a write barrier with special knowledge that the target pointer is unique. For installation choice collectors, allocation events are treated exactly as all pointer events. For deletion choice collectors, if the resulting pointer from an allocation request is stored into a scanned portion of the heap, it is possible that the object will be lost. We can then think of allocation as a normal pointer store, except that immediately after the pointer store into a scanned region of the heap, an unscanned virtual pointer to the object is overwritten. Since the virtual event cannot be captured by the barrier, we simulate it in the barrier. The ﬂag isAllocated is passed speciﬁcally for this reason from the AllocateBarrier() to the procedure.
WriteBarrier() Characterizing graphs that allow different barrier choices per object is an interesting though primarily theoretical question. It is generally not possible to make that decision arbitrary without some local knowledge of the graph.
Finally, if all barriers occur on leaf objects, the deletion choice will always require us to nominate fewer pointers. Of course, in both barrier choices precisely the same number of objects will be marked live. This can be logically explained by the fact that in both cases, the immediate object is available during the pointer store therefore we can reason locally about reachability. In that case, for leaf objects, the decision of which barrier to use can be made per-object rather than per-collector-cycle. We do not deal with this topic further.
Collector Choice Once the collector has ﬁnished the initial tracing of the heap, there could be a number of unmarked candidates nominated by the mutator. It is possible that in between the time when the mutator has nominated a candidate and the collector sees it, the candidate is no longer necessary.
Similarly to the mutator’s pointer selection mechanism, the collector also uses a mechanism to ﬁlter out unnecessary candidates. This selection mechanism for the collector is the same as that for the mutator. This can be seen in the write barrier processing phase, the procedure ProcessBarriers() in Fig. 1.
Although the collector uses the same mechanism as the mutator, it is possible that candidates nominated by the mutator are ignored by the collector. For example, if the installation choice is used and if the object’s SRC is 0, when the collector sees such pointers, the corresponding object must be retraced. If the object’s SRC is 0 however, then the object was recorded by the mutator, but before tracing ﬁnished, its SRC dropped to 0. Such objects are skipped by the collector in this phase. They have either become garbage or are live but hidden. In the latter case, the object is reachable transitively from a chain of reachable objects starting at an object whose SRC is 0. We therefore only need to re-trace objects whose SRC is 0. Similar reasoning although with a different selection criteria is applied to the deletion choice.
Maintaining an accurate SRC has several advantages. First, the SRC prevents us from inducing ﬂoating garbage. That is because at the time a pointer store occurs, the mutator nominates objects that could be potentially hidden from the collector. It need not make an explicit decision whether they will actually be reachable once the tracing is complete. The reachability is left to the collector when the barrier tracing phase occurs.
It is because of the SRC that the mutator does not need to make such explicit decisions about reachability. Secondly, the collector must start re-scanning only from speciﬁc objects. For example, for the installation choice it does not need to consider objects whose SRC is 0.
3 Transformations: Trading Precision for Efﬁciency
The abstract algorithm of the previous section provides a much higher degree of precision than previously published and implemented algorithms, but it is also impractical.
In this section we describe how practical collectors can be derived via orthogonal transformations of the abstract collector. Since the transformations are orthogonal, and since the reduction in precision can be modulated, this framework allows the derivation of a much broader set of algorithms than have previously been described, as we will show in the following section.
The transformations presented are (1) reduction in write barrier overhead by treating multiple pointers as roots; (2) reduction in root processing by eliminating re-scanning of the root set; (3) reduction in object space overhead and barrier time overhead by reducing the size of the scanned reference count (SRC); (4) reducing object space overhead by reducing the precision of the per-object shade; (5) conﬂation of shade and SRC to further reduce object space overhead and speed up the write barrier.
These transformations are not strictly semantics-preserving, since the set of collected objects is changed. However, they are invariant-preserving in that live data is never collected (the collector safety property).
3.1 Root Sets: Eliminating Write Barriers