I have been thinking a lot about how to extend the Cat type system so that I can handle objects. The big challenge is to do this in a way that doesn’t add any significant complexity to the language. This is particularly hard to do for a language as small and tight as Cat.
Previously I was thinking about starting with tuples. Not only are tuples nice on their own, but the type of an object is essentially a tuple. There is one simple way to add tuples, which is to represent them as dynamically constructed functions. E.g. ( -> ( -> int string)) which would be a tuple containing an int and a string. The challenge occurs when I try to get values in and out of that function? I want something like: “get0″, “get1″, etc. which gets the value from the appropriate slot. This is trivial to code, but the problem is that I can’t express the type of “getN” in a general enough manner. Consider
get0 : (( -> ‘A ‘b) -> ‘b)
get1 : (( -> ‘A ‘b ‘c) -> ‘b)
get2 : (( -> ‘A ‘b ‘c ‘d) -> ‘b)
…
This works fine, but we can only write so many of these (perhaps a dozen or so) before they start to become unmanageable. The problem then becomes that we would have to limit the number of field accessors, and hence the number of fields to a relatively small and arbitrary number. That is extremely limited, and nearly useless in practice when we can easily imagine objects with hundreds of fields or more. It will also put a horrible strain on the type reconstruction and checking algorithms.
Sidebar: I will probably limit the number of fields to 255 anyway for several reasons:
- there has to be some stated upper limit
- too many fields represents a horrible flaw in the design of OO software
- it is is trivial to extend the 255 limit using aggregation (putting objects in objects)
- Cat is designed to run on small memory systems, so being able to represent the index of a field with a single byte is appealing.
There is another subtle problem (or benefit depending on your point of view) and that is that this approach automatically provides structural subtyping. In other words two types that are structured the same will always be treated as exactly the same type. While this is useful in some circumstances, it is not good to paint ourselves in to such a corner. It is very important for software modularity and correctness to be able to construct two different types with similar structure when needed.Â
The following is an entirely different approach that I am now seriously considering using. First I will approach the problem of objects and use those to build tuples. Â
object IntString { x : int; y : string; }
6 “the answer is ” IntString.new x.get 7 * x.set
y.get write x.get writeln
So what is the type of the constructors, setters, and getters? Well I propose that the types are:
IntString.new : (int string -> IntString)
x.get : (’o -> ‘o ‘o.x)
x.set : (’o ‘o.x -> ‘o)
y.get : (’o -> ‘o ‘o.y)
y.set : (’o ‘o.y -> ‘o)
So what has happened is that the type of ‘o.x is treated as an unknown variable, that is expanded once ‘o is known. Internally the type reconstruction algorithm simply has to create an object to represent the IntString, and track it.
So far I am very pleased with this approach, because the overall changes to the type system are very minimal, and the everything looks like it will work fine.
The next step is to allow fields to be added incrementally. I am thinking that the syntax may be something like:
30 “answer” IntString.newÂ
12 z.add
x.get z.get +
So the question is what the type will be off “z.add”? Perhaps something like:
z.add : (’o ‘x -> ‘o+z)
This ‘o+z just indicates that the result is a new type with a field named “z”. The type checker will have to manage the type internally. So to make this a fully compatible duck-typing system, such as the kind found in prototype based object-oriented languages like JavaScript, the requirement is simply that any object can be used where any other object is used, and that if a field is requested whose type is not known at compile-time, we return a value of type “any”, with the special value “undefined” used to indicate failure to find the field.Â