Declaring references to types conforming to protocols with associated types

Once a protocol has associated types, we can't use it exactly the way we used regular protocols.

  protocol AProtocol {
    associatedtype BType
    var b: BType { get }
  }
  func callSomething(arg:AProtocol) {
  }

For instance, we can't use the protocol as a stand-alone type. The compiler reports: "Protocol 'AProtocol' can only be used as a generic constraint because it has Self or associated type requirements." Declaring associated types without specifying the concrete types behind them is like declaring a generic without specifying its type parameters; it isn't a full type.

  func callSomething<T:AProtocol>(arg:T) {
    print("Hello, \(arg.b)!")
  }

We can get the same effect, however, by making the function generic. We can constrain the type of the argument to conform to a protocol with associated types; we just can't use the protocol name by itself. We have to create a generic.

  struct ConformingType : AProtocol {
    var b: String
  }
  callSomething(arg: ConformingType(b: "Bob"))  //Hello, Bob!

By requiring a generic, the compiler knows how to fill in the missing pieces in the type at the call site.
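
To make "filling in the missing pieces" concrete, here is a minimal sketch. NumberedType and the associatedTypeName(of:) helper are hypothetical names added only for illustration; they are not part of the original example. The same generic function resolves T, and therefore T.BType, separately at each call site.

```swift
protocol AProtocol {
    associatedtype BType
    var b: BType { get }
}

struct ConformingType: AProtocol {
    var b: String   // BType is inferred to be String
}

// Hypothetical second conformer, added only for illustration.
struct NumberedType: AProtocol {
    var b: Int      // BType is inferred to be Int
}

// T is resolved at each call site, so T.BType is always a concrete type here.
func associatedTypeName<T: AProtocol>(of arg: T) -> String {
    return String(describing: T.BType.self)
}

print(associatedTypeName(of: ConformingType(b: "Bob")))  // prints "String"
print(associatedTypeName(of: NumberedType(b: 42)))       // prints "Int"
```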

The same thing goes for storage.

  struct Container {
    var a:AProtocol
  }

We can't use the protocol directly, but we can use it in the generic constraint.

  struct Container<T:AProtocol>{
    var a:T
  }

  var instance:Container = Container(a:ConformingType(b:"Carol"))
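
As a small sketch of what the generic parameter buys us for storage, the snippet below spells the type out in full and also lets the compiler infer it. The variable names explicit and inferred are illustrative only. In both cases the stored property keeps its full concrete type, so a.b is statically a String.

```swift
protocol AProtocol {
    associatedtype BType
    var b: BType { get }
}

struct ConformingType: AProtocol {
    var b: String
}

struct Container<T: AProtocol> {
    var a: T
}

// Fully spelled-out type versus letting the compiler infer T.
let explicit: Container<ConformingType> = Container(a: ConformingType(b: "Carol"))
let inferred = Container(a: ConformingType(b: "Carol"))

// Both declarations have the same concrete type.
print(type(of: explicit) == type(of: inferred))  // prints "true"

// Because T is known, a.b is a String and String API is available.
print(inferred.a.b.count)  // prints "5"
```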

So generics and protocols with associated types were made to work together. Let's dive into that.

Bigger example

  class DataKey<Value> : Hashable {
    var hashValue:Int {
      return ...
    }
    func value(from json:Any)throws->Value {
      ...
    }
  }

  class ProfileDataKey<Value> : DataKey<Value> {
    override var hashValue:Int {
      return ...
    }
    override func value(from json:Any)throws->Value {
      ...
    }
  }

If we defined our generic as a class, we could subclass it, and override its behavior.

  struct DataKey<Value> : Hashable {
    var hashValue:Int {
      return ...
    }
    func value(from json:Any)throws->Value {
      ...
    }
  }

But what if we needed better performance, like avoiding heap allocations? What if we don't need reference-counted memory management and would like value semantics?

To get those benefits in this context, we need our final DataKey types to be structs or enums. So we'll need to take what makes a DataKey a DataKey and move it up into a protocol. But we still want the type safety generics gave us! How can we get both?

  protocol DataKey : Hashable {
    associatedtype Value
    var request:String { get }
    func value(withData:Any)throws->Value
  }

We'll use protocol associated types. Here, Value becomes the associated type.

  protocol DataManager {
    func value<K:DataKey>(for key:K)throws->K.Value?
  }

As we've mentioned before, protocols with associated types can't be used as the full type, only as a generic constraint. So any property or argument which takes a DataKey, such as this method on the DataManager protocol, will have to be turned into a generic.

Notice that since I've constrained the generic parameter to conform to DataKey, I'm able to use its knowledge of the associated Value type and declare the return type as K.Value.
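
Here is a minimal sketch of how K.Value flows through to the caller. CountKey, TitleKey, WrongType, and the free function decode(_:with:) are hypothetical stand-ins invented for this illustration; only the DataKey protocol comes from the text above.

```swift
protocol DataKey: Hashable {
    associatedtype Value
    var request: String { get }
    func value(withData: Any) throws -> Value
}

// Hypothetical error and keys, for illustration only.
struct WrongType: Error {}

struct CountKey: DataKey {
    var request: String { return "count" }
    func value(withData: Any) throws -> Int {
        guard let count = withData as? Int else { throw WrongType() }
        return count
    }
}

struct TitleKey: DataKey {
    var request: String { return "title" }
    func value(withData: Any) throws -> String {
        guard let title = withData as? String else { throw WrongType() }
        return title
    }
}

// Stand-in for a DataManager method: the return type depends on the key type.
func decode<K: DataKey>(_ raw: Any, with key: K) throws -> K.Value {
    return try key.value(withData: raw)
}

let count = try decode(7, with: CountKey())         // inferred as Int
let title = try decode("Swift", with: TitleKey())   // inferred as String
print(count + 1, title.uppercased())                // prints "8 SWIFT"
```

No cast is needed at the call site: the compiler already knows that a CountKey produces an Int and a TitleKey produces a String.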

  protocol DataFetcher {
    func data(request:String)->Any?
  }

Our example is a tiny piece of a networking architecture, and we'll need a stand-in for a type which actually does the downloading, DataFetcher. We'll cover networking another week. For now, the fetcher will stand in for actual networking code.

  class SimpleDataManager : DataManager {
    var fetcher:DataFetcher
    init(fetcher:DataFetcher) {
      self.fetcher = fetcher
    }
    func value<K:DataKey>(for key:K)throws->K.Value? {
      guard let data = fetcher.data(request:key.request)
      else { return nil }
      return try key.value(withData:data)
    }
  }

In fleshing this out, we add a call to the fetcher to retrieve data from the network, and then use the key to transform it into a Value.

  struct User {
    var name:String
    var id:Int
  }

  enum ParsingError : Error {
    case invalidType
    case missingValue(String)
  }

  enum UserKey : DataKey {
    case me
    case other(Int)

    var request:String {
      switch self {
        case .me:
          return "user.me"
        case .other(let userID):
          return "user.\(userID)"
      }
    }
    struct JSONKey {
      static let name:String = "name"
      static let id:String = "id"
    }

    func value(withData:Any)throws->User {
      guard let dict = withData as? [String:Any]
      else { throw ParsingError.invalidType }
      guard let name = dict[JSONKey.name] as? String
      else { throw ParsingError.missingValue(JSONKey.name) }
      guard let id = dict[JSONKey.id] as? Int
      else { throw ParsingError.missingValue(JSONKey.id) }
      return User(name:name, id:id)
    }
  }

In fleshing out this example, we create a DataKey / Value pair, UserKey, and User. Notice that UserKey is actually an enum, and User is a struct, potentially giving us the performance characteristics we want.

  struct DummyDataFetcher : DataFetcher {
    func data(request:String)->Any?{
      if request == "user.me" {
        return ["name":"Ben", "id":12345] as [String:Any]
      }
      return nil
    }
  }

We've filled the DummyDataFetcher with values we can use to create a User.

  let manager = SimpleDataManager(fetcher:DummyDataFetcher())
  let user:User? = try? manager.value(for:UserKey.me)
  print(user?.name ?? "no user")
  //Ben

In this test, we verify that our path is clean from the key to the Value.

So that's an example of how protocols with associated types and generics can work together. By using both, we've made our code call-site type safe, and also depend on abstractions. I can create another DataKey type, perhaps a struct, and need no changes whatsoever in my DataManager code.
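
To illustrate that last claim, here is a sketch that adds a struct-based key without touching the manager. FollowerCountKey, FollowerStubFetcher, and NotAnInt are hypothetical names made up for this example; the protocols and SimpleDataManager are repeated from the text so the snippet stands alone.

```swift
protocol DataKey: Hashable {
    associatedtype Value
    var request: String { get }
    func value(withData: Any) throws -> Value
}

protocol DataFetcher {
    func data(request: String) -> Any?
}

protocol DataManager {
    func value<K: DataKey>(for key: K) throws -> K.Value?
}

// Same manager as in the text, unchanged.
class SimpleDataManager: DataManager {
    var fetcher: DataFetcher
    init(fetcher: DataFetcher) {
        self.fetcher = fetcher
    }
    func value<K: DataKey>(for key: K) throws -> K.Value? {
        guard let data = fetcher.data(request: key.request) else { return nil }
        return try key.value(withData: data)
    }
}

struct NotAnInt: Error {}

// Hypothetical new key: a struct this time, whose Value is a plain Int.
struct FollowerCountKey: DataKey {
    var userID: Int
    var request: String { return "followers.\(userID)" }
    func value(withData: Any) throws -> Int {
        guard let count = withData as? Int else { throw NotAnInt() }
        return count
    }
}

// Hypothetical stub fetcher with a single canned response.
struct FollowerStubFetcher: DataFetcher {
    func data(request: String) -> Any? {
        if request == "followers.7" { return 250 }
        return nil
    }
}

let followerManager = SimpleDataManager(fetcher: FollowerStubFetcher())
let followers = try followerManager.value(for: FollowerCountKey(userID: 7))
print(followers ?? -1)  // prints "250"
```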

Let's take it up a notch, and add type-safe value caching.

  protocol DataCache {
    func value<K:DataKey>(key:K)->K.Value?
    func set<K:DataKey>(value:K.Value, forKey key:K)
    func removeValue<K:DataKey>(forKey key:K)
  }

Here's a protocol for a cache, using all generics constrained with protocols with associated types. Those associated types let us make the input and output values consistent.

  class RamDataKeyCache : DataCache {

    private var storage:[AnyHashable:Any] = [:]

    func value<K:DataKey>(key:K)->K.Value? {
      return storage[AnyHashable(key)] as? K.Value
    }

    func set<K:DataKey>(value:K.Value, forKey key:K) {
      storage[AnyHashable(key)] = value
    }

    func removeValue<K:DataKey>(forKey key:K) {
      storage[AnyHashable(key)] = nil
    }
  }

And here's a concrete implementation of a cache, built to store all the values in RAM. Notice that we're using type erasure on the keys and values so that we can stash all the values in the same dictionary. AnyHashable and the Dictionary can fetch us the originally stored value, but to get it back to the right type, we conditionally downcast to the associated type! Because DataManager depends on a DataCache protocol and not a concrete type, we can come back and write a cache that uses an entirely different technology, perhaps Core Data on Apple platforms.
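
The erasure-and-downcast round trip can be sketched in isolation. NameKey and AgeKey are hypothetical key types invented for this illustration. Two keys of different types do not collide even when their contents match, and a downcast to the wrong type yields nil instead of a wrongly typed value.

```swift
// Hypothetical key types with identical contents but different types.
struct NameKey: Hashable { var id: Int }
struct AgeKey: Hashable { var id: Int }

var storage: [AnyHashable: Any] = [:]

// Both keys and values are type-erased on the way in.
storage[AnyHashable(NameKey(id: 1))] = "Carol"
storage[AnyHashable(AgeKey(id: 1))] = 36

// A conditional downcast recovers the original type on the way out.
let name = storage[AnyHashable(NameKey(id: 1))] as? String
let age = storage[AnyHashable(AgeKey(id: 1))] as? Int

// Asking for the wrong type fails safely with nil.
let wrong = storage[AnyHashable(NameKey(id: 1))] as? Int

print(name ?? "none", age ?? -1, wrong == nil)  // prints "Carol 36 true"
```

In RamDataKeyCache the downcast target is K.Value, so the key's associated type picks the cast for us.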

  class SimpleDataManager : DataManager {
    var fetcher:DataFetcher
    var cache:DataCache
    init(fetcher:DataFetcher, cache:DataCache = RamDataKeyCache()) {
      self.fetcher = fetcher
      self.cache = cache
    }

Now, let's integrate the cache into the SimpleDataManager. The exact kind of cache is added as an injected dependency. When no other value is given, a RamDataKeyCache is created as a default.

    func value<K:DataKey>(for key:K)throws->K.Value? {
      if let cachedValue = cache.value(key:key) {
        return cachedValue
      }
      guard let data = fetcher.data(request:key.request)
      else { return nil }
      let newValue:K.Value = try key.value(withData:data)
      cache.set(value:newValue, forKey:key)
      return newValue
    }
  }

Then we modify the value(for:) method to only go to the network (through the fetcher) when the value is not in the cache.

Injecting these dependencies lets us use various mocking techniques to test each component, while requiring zero changes to the object under test.
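
One possible mocking technique, sketched under stated assumptions: CountingFetcher, GreetingKey, and MissingString are hypothetical names made up for this example. The protocols, cache, and manager repeat the text in condensed form (removeValue and the DataManager protocol are left out to keep the sketch short). The mock records how often it is called, so we can check that the second request is served from the cache.

```swift
protocol DataKey: Hashable {
    associatedtype Value
    var request: String { get }
    func value(withData: Any) throws -> Value
}

protocol DataFetcher {
    func data(request: String) -> Any?
}

protocol DataCache {
    func value<K: DataKey>(key: K) -> K.Value?
    func set<K: DataKey>(value: K.Value, forKey key: K)
}

class RamDataKeyCache: DataCache {
    private var storage: [AnyHashable: Any] = [:]
    func value<K: DataKey>(key: K) -> K.Value? {
        return storage[AnyHashable(key)] as? K.Value
    }
    func set<K: DataKey>(value: K.Value, forKey key: K) {
        storage[AnyHashable(key)] = value
    }
}

class SimpleDataManager {
    var fetcher: DataFetcher
    var cache: DataCache
    init(fetcher: DataFetcher, cache: DataCache = RamDataKeyCache()) {
        self.fetcher = fetcher
        self.cache = cache
    }
    func value<K: DataKey>(for key: K) throws -> K.Value? {
        if let cachedValue = cache.value(key: key) { return cachedValue }
        guard let data = fetcher.data(request: key.request) else { return nil }
        let newValue = try key.value(withData: data)
        cache.set(value: newValue, forKey: key)
        return newValue
    }
}

// Hypothetical mock: counts how many times the manager hits the "network".
final class CountingFetcher: DataFetcher {
    private(set) var callCount = 0
    func data(request: String) -> Any? {
        callCount += 1
        return "payload for \(request)"
    }
}

struct MissingString: Error {}

// Hypothetical key used only by this test.
struct GreetingKey: DataKey {
    var request: String { return "greeting" }
    func value(withData: Any) throws -> String {
        guard let text = withData as? String else { throw MissingString() }
        return text
    }
}

let mock = CountingFetcher()
let cachingManager = SimpleDataManager(fetcher: mock)
let first = try cachingManager.value(for: GreetingKey())
let second = try cachingManager.value(for: GreetingKey())
print(first == second, mock.callCount)  // prints "true 1"
```

The manager never learns it is talking to a mock; it only sees a DataFetcher.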

Because the DataManager is designed to be dependency-injected into other code, it helps reduce global dependencies common in other network frameworks.

So, we've achieved what we set out to do: DataKeys as structs and enums for better performance. And because we've used generics constrained by protocols with associated types, our APIs maintain their call-site type safety.

The key to when to use associated types with generics:

We'll use generics with no backing protocol when our internal code doesn't care about the type of the object, but outside code might.

We'll constrain the generic parameter to conform to a protocol when we need to interact with the object in a particular way.

When our interactions with that object involve other types which may be specific to its concrete type, we'll need to upgrade to constraining it to a protocol with associated types.
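
The three cases can be sketched side by side. Stack, label(_:), and firstTwo(_:) are illustrative names chosen for this example; CustomStringConvertible and Sequence are standard library protocols standing in for "a protocol" and "a protocol with associated types".

```swift
// 1. Unconstrained generic: the internals never inspect Element,
//    but callers get their own type back.
struct Stack<Element> {
    private var items: [Element] = []
    mutating func push(_ item: Element) { items.append(item) }
    mutating func pop() -> Element? { return items.popLast() }
}

// 2. Constrained to a plain protocol: we need one specific capability.
func label<T: CustomStringConvertible>(_ item: T) -> String {
    return "<\(item.description)>"
}

// 3. Constrained to a protocol with an associated type: the result type
//    depends on the concrete type that was passed in.
func firstTwo<S: Sequence>(_ source: S) -> [S.Element] {
    return Array(source.prefix(2))
}

var stack = Stack<Int>()
stack.push(3)
print(stack.pop() ?? -1)         // prints "3"
print(label(42))                 // prints "<42>"
print(firstTwo([10, 20, 30]))    // prints "[10, 20]"
```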

Summary

Today we learned that protocols with associated types were made to work with generics. In fact, when we want the type of a function argument or property to be a protocol with associated types, we'll have to use a generic. We also learned to use type erasure and conditional downcasting to store these values and return them to their full types. Last, we saw how combining protocols with associated types and generics enables dependency injection, which supports component testing while still ensuring type safety at the final call site.

Concrete type safety with default abstract implementations depending on abstractions, now that's Swift!