What is an Algebraic Data Type

I’d like to better understand what an Algebraic Data Type (ADT) is.

If a programmer claims: I have defined an ADT in my program, how do I verify that it is intact the case?

My current understanding is that an ADT is “kind of a this” and “sort of a that”, but I don’t have necessary and sufficient conditions in my mind.

Also what is algebraic about an ADT? Do ADTs form an algebra, or do instances of such a type form an algebra?

IIUC it means you can have alternative combinations of information.

A combination of information is typically a case class with several parameters (the different information being collected). This is called a product type. (The number of possible values is the product of the number of possible values of each field. For instance a case class or tuple of (Byte, Boolean) can hold 256 x 2 = 512 distinct values.)

Alternative combinations of information is usually represented as a sealed trait hierarchy, typically with several case classes extending it. Because the trait is sealed, the set of alternatives is closed, so you (and the compiler) know all the choices. This is called a sum type. (The number of possible values is the sum of the number of possible values of each choice. For instance if something can be either a Byte or a Boolean, there are 256 + 2 = 258 distinct values.)

Together this lets you define a type that represents the shape, and set of possible values, much more closely than you would otherwise be able to. For instance in some programming languages (and SQL databases which unfortunately represent product types but not sum types) you often have scenarios where you have silent rules when certain fields should null, often with dependencies, for instance if field A is null then fields B and C must not be null and if A is not null then B and C must be null, and the only way to enforce these rues is to sprinkle lots of ad hoc if statements when you set them, and when you read it you have to either make (non-local) assumptions, or do asserts. With ADTs you can just make the data the right shape in the first place. It’s a natural fit for a lot of use cases.

One way of looking at it is that the whole point of types is to say what values are valid and what are not valid. Being able to express that as sums of products (alternative combinations) is a very natural and flexible way to do that.

I’m not sure about the usage of the term “algebra.”

According to here, an algebraic data type is a sum of product types.

So what is algebraic about an algebraic data type?

AFAICT, this is not about an algebra over a field, as there is no field. But type systems like Scala’s share similarities with structures that you may study in the mathematical field of abstract algebra.

Let’s say you have a type system where the set of all possibly instances is the set of all possible tuples of primitives and any set of such tuples (not necessarily same arity) is a valid type.

One instance is the 0-tuple. The Unit type is the type that contains the 0-tuple as the only instance.

If you want to use a primitive, technically, you are using a 1-tuple of that primitive.

There is type union, which is basically addition, i.e. the types form an associative group with type union where the empty type (Nothing in Scala) is the neutral element.

Tuple concatenation is multiplication with Unit being the neutral element. The type system is then a ring.