Exploring Swift Array's Implementation


An (Swift) Array is contiguous block of memory which has n number of elements of same type, represented by generic type Element. Since the memory which holds the elements is contiguous any individual element can be accessed in O(1) time.

To create an array we need these things as bare minimum :

  • Pointer to allocated memory (or the buffer)
  • Capacity : capacity is the number of elements that can be stored in the current buffer
  • Count : count is number of elements that are actually stored inside the current buffer so count <= capacity

Swift Implementation

_SwiftArrayBodyStorage (C Struct)

Swift declares a C struct in its SwiftShim’s module inside GlobalObject.h called _SwiftArrayBodyStorage

struct _SwiftArrayBodyStorage {
  __swift_intptr_t count;
  __swift_uintptr_t _capacityAndFlags;
};
  • count is the number of elements currently stored in the buffer
  • _capacityAndFlags is used to store two things:

    • first is the capacity of the buffer
    • second “Is the Element type bitwise-compatible with some Objective-C class?”

The most right bit of the unsigned int is used to store this boolean flag and rest of the bits are used to store the capacity of the array. This is done to store this information in a compressed way.

_ArrayBody (struct)

_ArrayBody is a struct defined in the stdlib which is a swift wrapper over _SwiftArrayBodyStorage

  • keeps an instance of _SwiftArrayBodyStorage inside a variable called _storage
  • provides option to set & get capacity, count and the flag (elementTypeIsBridgedVerbatim)

ManagedBufferPointer<Value, Element> (struct)

  • Creates fixed size contiguous storage for type Element
  • Stores one instance of the type Value which acts as a header to that storage to keep the meta data about that storage. For Array Value is _ArrayBody
  • Provides methods which will give pointers to Value and Element which can be used to operations like dealloc/destroy elements in buffer or change modify data of Value object
  • Tells if the buffer is being uniquely referenced
  • This struct heavily uses Builtin which is basically voodoo which compiler understands directly

_ContiguousArrayBuffer<Element> (struct)

  • This struct is basically the buffer which is used by Array. It keeps an instance of ManagedBufferPointer<_ArrayBody, Element> in its __bufferPointer property
var __bufferPointer: ManagedBufferPointer<_ArrayBody, Element>
  • It initializes the _ArrayBody with capacity, count and the flag
  • To keep things simpler, I’ll focus on non-ObjC runtime (flag will obviously always be false for that)
  • The _ArrayBody can be intialized using the _valuePointer property exposed by ManagedBufferPointer (since it only allocated by the buffer pointer)
func _initStorageHeader(count: Int, capacity: Int) {
#if _runtime(_ObjC)
    let verbatim = _isBridgedVerbatimToObjectiveC(Element.self)
#else
    let verbatim = false
#endif

    __bufferPointer._valuePointer.initialize(
      _ArrayBody(
        count: count,
        capacity: capacity,
        elementTypeIsBridgedVerbatim: verbatim))
  }
  • The count and capacity values can be asked from _ArrayBody via value property.
public var capacity: Int {
  return __bufferPointer.value.capacity
}

public var count: Int {
  get {
    return __bufferPointer.value.count
  }
  nonmutating set {
    _sanityCheck(newValue >= 0)

    _sanityCheck(
      newValue <= capacity,
      "Can't grow an array buffer past its capacity")

    __bufferPointer._valuePointer.memory.count = newValue
  }
}

  • All operations on the storage can be done using the pointer to element. The element pointer is exposed via _elementPointer by ManagedBufferPointer

  • _ContiguousArrayBuffer has a computed property firstElementAddress which returns the elementPointer as UnsafeMutablePointer

public var firstElementAddress: UnsafeMutablePointer<Element> {
    return __bufferPointer._elementPointer
  }
  • Any element is accessible using the first element address and element position i
func getElement(i: Int) -> Element {
    _sanityCheck(i >= 0 && i < count, "Array index out of range")
    return firstElementAddress[i]
  }
  • Subscript provides easy access to set and get the elements at arbitrary index i but ofcourse it should be less than the capacity.
  public subscript(i: Int) -> Element {
    get {
      return getElement(i)
    }
    nonmutating set {
      _sanityCheck(i >= 0 && i < count, "Array index out of range")

      // FIXME: Manually swap because it makes the ARC optimizer happy.  See
      // <rdar://problem/16831852> check retain/release order
      // firstElementAddress[i] = newValue
      var nv = newValue
      let tmp = nv
      nv = firstElementAddress[i]
      firstElementAddress[i] = tmp
    }
  }
  • And finally, also need to know if the buffer is uniquely referenced for the copy-on-write feature 😀
public mutating func isUniquelyReferenced() -> Bool {
  return __bufferPointer.holdsUniqueReference()
}

Array (struct)

  • gyb stands for generate your (own) boilerplate. It’s a preprocessor system written for swift which is used to generate repetitive code.
  • Array is declared in a file called Arrays.swift.gyb because Array, ContiguousArray and ArraySlice share a lot of same code.
  • utils/gyb inside swift’s source can be used to generate the .swift file from a .gyb file

  • Array declares different buffer for ObjC and non ObjC runtime. _ArrayBuffer provides a bridged storage so it can be converted into NSArray if needed. We’re focusing on above mentioned _ContiguousArrayBuffer.
#if _runtime(_ObjC)
  public typealias _Buffer = _ArrayBuffer<Element>
#else
  public typealias _Buffer = _ContiguousArrayBuffer<Element>
#endif

...

public var _buffer: _Buffer
  • _getElement is used to get the element out of the buffer by using the getElement method on the buffer.
public // @testable
  func _getElement(
    index: Int,
    wasNativeTypeChecked : Bool,
    matchingSubscriptCheck: _DependenceToken
  ) -> Element {
#if _runtime(_ObjC)
    return _buffer.getElement(index, wasNativeTypeChecked: wasNativeTypeChecked)
#else
    return _buffer.getElement(index)
#endif
  }

then subscript calls the above _getElement method

public subscript(index: Int) -> Element {
  get {
    // This call may be hoisted or eliminated by the optimizer.  If
    // there is an inout violation, this value may be stale so needs to be
    // checked again below.
    let wasNativeTypeChecked = _hoistableIsNativeTypeChecked()

    // Make sure the index is in range and wasNativeTypeChecked is
    // still valid.
    let token = _checkSubscript(
      index, wasNativeTypeChecked: wasNativeTypeChecked)

    return _getElement(
      index, wasNativeTypeChecked: wasNativeTypeChecked,
      matchingSubscriptCheck: token)
  }

  mutableAddressWithPinnedNativeOwner {
    _makeMutableAndUniqueOrPinned()
    _checkSubscript_native(index)
    return (
      _getElementAddress(index),
      Builtin.tryPin(_getOwner_native()))
  }
}
  • Appending an element :
public mutating func append(newElement: Element) {
  _makeUniqueAndReserveCapacityIfNotUnique()
  let oldCount = _getCount()
  _reserveCapacityAssumingUniqueBuffer(oldCount)
  _appendElementAssumeUniqueAndCapacity(oldCount, newElement: newElement)
}
  • The first method called is _makeUniqueAndReserveCapacityIfNotUnique which is :
internal mutating func _makeUniqueAndReserveCapacityIfNotUnique() {
  if _slowPath(!_buffer.isMutableAndUniquelyReferenced()) {
    _copyToNewBuffer(_buffer.count)
  }
}
  • _slowPath is a method which hints the optimizer that expected value of the bool which is being passed is false
  • so basically this is hinting the optimizer that if this method is being called, its most likely that the array is already mutable and unique
  • if not, _copyToNewBuffer method creates a new buffer and initialize it with elements of current buffer then replaces the current buffer with new buffer
internal mutating func _copyToNewBuffer(oldCount: Int) {
  let newCount = oldCount + 1
  var newBuffer = Optional(
    _forceCreateUniqueMutableBuffer(
      &_buffer, countForNewBuffer: oldCount, minNewCapacity: newCount))
  _arrayOutOfPlaceUpdate(
    &_buffer, &newBuffer, oldCount, 0, _IgnorePointer())
}
  • _forceCreateUniqueMutableBuffer creates a new empty buffer and increases the capacity by factor of 2 if needed by :
internal func _growArrayCapacity(capacity: Int) -> Int {
  return capacity * 2
}
  • Now that we have a unique buffer for sure, just append the new element at end and update the count
internal mutating func _appendElementAssumeUniqueAndCapacity(
  oldCount : Int,
  newElement : Element
) {
  _sanityCheck(_buffer.isMutableAndUniquelyReferenced())
  _sanityCheck(_buffer.capacity >= _buffer.count + 1)

  _buffer.count = oldCount + 1
  (_buffer.firstElementAddress + oldCount).initialize(newElement)
}

So finally, Array looks like this :D