Algebraic data types

The idea

‘s blog  post inspired my discussion of data types:

his functional approach is based on a “closed hierarchy of classes + visitor”, the problem being when types allow invalid states.

IMHO, there is a sort of philosophy behind the languages and C# implementation should be very different…

Logical layer

Let me show you the abstract interface I’ve in mind.

public enum ShapeType {
		Rectangle = 0,
		Circle =1
	};
public interface IElement {
	ShapeType GetType();
	IRectangle SetRect(double height, double width);
	ICircle SetCircle(double radius);
}
public interface IExtendedEl : IElement {

}
public interface ICircle : IElement, IExtendedEl {
	void GetCircle(out double radius);
}
public interface IRectangle : IElement, IExtendedEl {
	void GetRect(out double height, out double width);
}
public interface IShape
{
	double GetArea(IExtendedEl element);
}

Is it simple?

Materialization

Let’s go on: everything boils down to using an enum…

public class MyElement : IElement, IRectangle, ICircle
{

	private double my_height;
	private double my_width;
	private double my_radius;
	private ShapeType my_type;

	ShapeType IElement.GetType()
	{
		return my_type;
	}

	public void GetRect(out double height, out double width)
	{
		height = my_height;
		width = my_width;
	}

	public void GetCircle(out double radius)
	{
		radius = my_radius;
		radius = my_radius;
	}

	public IRectangle SetRect(double height, double width)
	{
		my_height = height;
		my_width = width;
		my_type = ShapeType.Rectangle;
		return this;
	}

	public ICircle SetCircle(double radius)
	{
		my_radius = radius;
		my_type = ShapeType.Circle;
		return this;
	}
}
public class MyShape : IShape
{
	public double GetArea(IExtendedEl element)
	{
		double area;
		switch (element.GetType()) {
			case ShapeType.Rectangle:
				double height;
				double width;
				((IRectangle)element).GetRect(out height, out width);
				area = height * width ;
				break;
			case ShapeType.Circle:
				double radius;
				((ICircle)element).GetCircle(out radius);
				area = Math .PI * radius * radius ;
				break;
			default:
				throw new NotImplementedException();
		};
		return area;
	}
}

Usage

The conclusion is straightforward

Console.WriteLine("Shape Example");

IElement myElement = new MyElement();
IShape myShape = new MyShape();

Console.WriteLine("Rect area: " + myShape.GetArea(myElement.SetRect(3,5)));

Console.WriteLine("Circ area: " + myShape.GetArea(myElement.SetCircle(7)));

Console.Write("Press any key to continue . . . ");
Console.ReadKey(true);

Some negative feedback from the web

First of all, I’m always very happy to receive comments from other programmers and I’d love to discuss my ideas with experts, particularly if they have different opinions.

Today I’m pleased to link the article of Kris Nuttycombe about “Correcting the Visitor pattern“: you’re invited to read it.

Below a couple of his comments:

  1. “this looks very unsafe; you have to manually dispatch on the enum value, then cast!” (from twitter)
  2. (again from twitter) “this isn’t a solution at all; the whole point is for the compiler to prove you haven’t made a mistake.”

My initial answer is that IDE is supporting the enum values and their syntax dispatchment in Visual Studio: so it’s not actually a very manual operation. The compiler is doing its job, but it doesn’t mean that it never makes sense to throw a runtime exception.

No problem in discussing a real world example with a complete program where we can reply each other what we think in details, starting from a simple but sufficiently meaningful example.

More in the notes below.

Advertisements

18 thoughts on “Algebraic data types

  1. For further details about the visitor pattern, I suggest you to look at the answer of Federico A. Ramponi. IMHO It is always important to explain the reason why one should use a specific pattern, so to properly evaluate – on a case-by-case basis – if it is practically useful in the specific context of the implementation.

    Like

  2. Here I’m choosing an interface instead of an abstract class, because the Shape can be instantiated by an Inversion of Control container, so we don’t need to make the radius unrepresentable for a rectangle or the height and width for a circle, they can be just different constructors or different setters, like in the above code. Let’s say that property is a helper, because we can’t know how the dependency is implemented.

    Like

  3. Hi, @giuliohome, thanks for the link!

    Here’s problem I perceive with your solution: the compiler would accept the following bit of code as perfectly valid. (My apologies if this doesn’t format correctly; I need to move my own blog to something that deals better with code in comments for precisely this reason.)

       public double GetArea(IExtendedEl element)
        {
            double area;
            switch (element.GetType()) {
               case ShapeType.Circle: 
                 double height;
                 double width;
                 ((IRectangle)element).GetRect(out height, out width);
                 area = height * width ;
                 break;
               case ShapeType.Rectangle:
                 double radius;
                 ((ICircle)element).GetCircle(out radius);
                 area = Math .PI * radius * radius ;
                 break;
               default:
                 throw new NotImplementedException();
            };
            return area;
        }
    

    The fact that a simple transposition of the cases in the switch results in a runtime error is essential to the problem that sum types (I would use the term algebraic data types, which include both sums and products, but in this case we’re talking about sums) are supposed to solve. The interface presented should statically preclude one from writing this sort of incorrect code; while the bug is obvious and simple to fix, it still represents an unnecessary possibility of failure in production. By contrast, using a type-parameterized variant of the Visitor pattern, as I do in that ancient post, ensures that these kinds of programming errors are caught at compile time.

    It’s been a long time since I first wrote that post, and it’s probably time for a follow-up with more detail and better established motivation.

    Liked by 1 person

    • Hi! Thanks again for explaining your feedback! I think I see your point: nothing – in that piece of code – prevents from using the wrong method GetRect in the wrong case for the Circle and vice-versa. Correct, I agree, so far so good. Then you add “The interface presented should statically preclude one from writing this sort of incorrect code” … and that there is a more correct alternative, that “ensures that these kinds of programming errors are caught at compile time”.
      Ok, now my answer. Incidentally, your notes are very interesting but they get deeper inside category theory, etc… while I like this subject, I don’t think it’s strictly necessary to discuss all of that in this simple example.
      Back to my answer. The whole point of my idea is its usage, what I am selling (sort of…) is myShape.GetArea, myElement.SetRect and myElement.SetCircle – not their internal implementation. I’m saying that the layer built outside them is safe and it constitutes my alternative proposal, not the hidden mechanism inside. Let me draw a paradox: it’s like if I took your beautiful Functional Code, then I disassembled it and finally I said that the machine code you’ve produced is full of unsafe gotos 🙂

      In other words it’s like I said that your visitor pattern is error prone, since the compiler wouldn’t complain about a bug of the following sort:

      return (leftDepth < rightDepth) ? leftDepth + 1 : rightDepth + 1;

      instead of the original:

      return (leftDepth > rightDepth) ? leftDepth + 1 : rightDepth + 1;

      Let me simulate what would have happened with a visitor pattern, starting from your criticism.
      1) a tree structure, without any real need for a hierarchy in the domain model
      2) the bug could have simply transferred into the visit method of the nodes, again without any help from the compiler (it’s an utopia to dream that the compiler can spot any sort of bugs indeed) and – imho – it would be even worse, because:
      3) there is more cohesion and less coupling if the different – but related – area methods are grouped together under a common switch/case, so that it is easy to inspect them, instead of keeping them separate in different classes (and again it depends on the context of the domain model).

      In conclusion, it’s worth noting that implementing switch-like functionality in many functional languages is what goes under the name of Pattern matching.

      Like

      • The central point of pattern matching (and catamorphism, as implemented in terms of the visitor pattern) is not to exclude *all* possible bugs, but instead to exclude a category of bugs – in this case, implemented using your approach you have the possibility of both the logic bug of the incorrect inequality, *and* the possibility of a failed cast. Why not eliminate the possibility of the failed cast? To argue that, because we can’t catch all bugs, we shouldn’t catch any bugs is equivalent to the argument made for abandoning static typing entirely. I don’t think it helps your case here.

        With respect to cohesion and coupling, the “area methods” *would* be grouped together – each handler for a separate shape would be a separate method of the same Visitor instance. In terms of implementation complexity, it’s virtually identical. Forgive my lack of knowledge of C# syntax, which may result in some errors below, but doesn’t this keep your area methods together nicely?

        interface ShapeVisitor<A> {
            public A GetRectangleArea(IRectangle rect);
            public A GetCircleArea(ICircle circle);
        }
        
        public class AreaVisitor : ShapeVisitor{
            public double GetRectangleArea(IRectangle rect){
                return rect.height * rect.width;
            }
        
            public double GetCircleArea(ICircle circle){
                return Math.PI * circle.radius * circle.radius ;
            }
        }
        
        

        Like

      • Hi! Thank you again for coming back to clarify your suggestion. Basically, I think I agree with you and I like your piece of code above. So the differences between our points of view – if any – could be somewhere else in the full code. Anyway your comment above seems agreeable to me. Well, you imply that with this Visitor there is no need of a case/switch and there is a type of bug that is automatically avoided… but there is no generic GetArea in your code from a generic IShape, so the result is sort of obvious: what am I missing?

        Like

      • I’m not sure why I can’t seem to reply directly to your most recent response, but here is how the generic GetArea is implemented:

        abstract class IShape {
          public A accept(ShapeVisitor visitor);
        
          public double GetArea() {
            return accept(new AreaVisitor());
          }
        }
        

        As you can see, GetArea() is just a convenience wrapper around accept; in practice, I probably would not implement it as part of the interface, but would rely upon the end-user to write out the accept call themselves. When you start relying upon visitors like this, you end up defining very few methods on your interfaces, because since the set of types is closed for extension, the set of operations (i.e. the number of classes extending ShapeVisitor) is open. If you’re familiar with the term, this is the essence of the “expression problem” – you can either have a closed set of operations (those methods defined on the base interface) and an open set of types, or a closed set of types (those having “handlers” in the visitor class) and an open set of operations.

        Liked by 1 person

      • Oh bingo: ” but would rely upon the end-user to write out the accept call themselves.” Here it is what I don’t like or in other words the assumption here is that I am responsible for the implementation of such a method and that’s what I’m talking about…

        And the “expression problem” in OO vs FP can be part of the issue we’re discussing indeed, like I can find here http://stackoverflow.com/a/3776140

        Like

      • When I say “write out the accept call themselves” all I’m referring to is simply writing `myShape.accept(new AreaVisitor())` rather than `myShape.getArea()` – if you want to provide the latter, there’s certainly no problem with that; it’s merely a bit of syntactic sugar over the former. The point is that if the end-user wishes to add further functions over the Shape algebra, they’re capable of doing so in a safe fashion – I could just as easily call `myShape.accept(new PerimeterVisitor())` but if you have not defined `.getPerimeter()` then I have inconsistent access patterns when deconstructing Shape values.

        Like

      • Sorry, but I still can’t see how it works, where is the code of myShape.accept? Not in your gist…

        I suggest you could reply to my first comment directly in your gist (or to the following ones there)

        In my understanding you want to use 2 different overloads of the GetArea that are differentiated only by the circle/rectangle input interface.
        If so, they should have the same name.
        Nonetheless I see your point: you can force an apparently stronger recognition, by starting from an inherited, own property instead of an external case… but don’t get distracted by the specific example.
        One can always have a “bug-fix” or an “enhancement” simply because we have to switch the Greek pi from one to the other formula… and again don’t just think about the basic Euclidean geometry, but imagine an algorithm for new dimensions in a strange Riemann manifold for the latest string theories of sort ;))

        Like

    • Thanks! I was already writing a reply while you added the second comment with your notes… (interesting, will read them) … so now I’ve a lot of things to analyse before answering again 🙂 I’ll take my time. In the meantime thank you very much for your virtual participation here in my blog!!!

      Like

    • I’ve replied to you above, more specifically. Your notes are very beautiful and I want to study them in details but I fail to see where exactly they contradict what I’ve written here (sorry if I’m missing some evident/immediate correlation…)

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s