Implementing Baker’s SUBTYPEP decision procedure

Léo Valais

Jim E. Newton

Didier Verna

lvalais@lrde.epita.fr

jnewton@lrde.epita.fr

didier@lrde.epita.fr

EPITA/LRDE

Le Kremlin-Bicêtre, France

ABSTRACT

We present here our partial implementation of Baker’s decision procedure for subtypep. In his article “A Decision Procedure for Common Lisp’s SUBTYPEP Predicate”, he claims to provide implementation guidelines to obtain a subtypep that is more accurate than, and as efficient as, the average implementation. However, he did not provide any serious implementation and his description is sometimes obscure. In this paper we present our implementation of part of his procedure, only supporting primitive types, Clos classes, member, range and logical type specifiers. We explain in our own words our understanding of his procedure, with much more detail and examples than in Baker’s article. We therefore clarify many parts of his description and fill in some of its gaps or omissions. We also argue in favor of and against some of his choices and present our alternative solutions. We further provide some proofs that might be missing in his article and some early efficiency results. We have not released any code yet but we plan to open source it as soon as it is presentable.

CCS CONCEPTS

• Theory of computation → Type theory; Divide and conquer; Pattern matching.

ACM Reference Format:

Léo Valais, Jim E. Newton, and Didier Verna. 2019. Implementing Baker’s SUBTYPEP decision procedure. In Proceedings of the 12th European Lisp Symposium (ELS’19). ACM, New York, NY, USA, 8 pages. https://doi.org/10.5281/zenodo.2646982

1 INTRODUCTION

The Common Lisp standard [1] provides the predicate function subtypep for introspecting the sub-typing relationship. Every invocation (subtypep A B) returns either the values (t t) when A is a subtype of B, (nil t) when it is not, or (nil nil), meaning the predicate could not (or failed to) answer the question. The latter can happen when the type specifier (satisfies P) (representing the type {x | P(x)} for some predicate and total function P) is involved. For example, given two arbitrary predicates F and G, there is no way subtypep can answer the question (subtypep '(satisfies F) '(satisfies G)).

(sb!xc:deftype keyword ()
  '(and symbol (satisfies keywordp)))

Listing 1: The keyword type definition in Sbcl

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

ELS’19, April 01–02 2019, Genova, Italy

ACM ISBN 978-2-9557474-3-8.

https://doi.org/10.5281/zenodo.2646982

However, some implementations abuse the permission to return (nil nil). For example, in Sbcl 1.4.10 (the implementation we are currently focusing our efforts on), (subtypep 'boolean 'keyword) returns (nil nil), thus violating the standard. The definition of the keyword type is responsible for this failure: as shown in Listing 1, it involves a satisfies type specifier.

Another kind of problem for which subtypep’s accuracy matters is the optimization of the typecase construct as shown in [ ] and [ ]. The aim is to remove redundant checks in the construct and the approach is to use binary decision diagrams (BDDs). However, to build such a structure, subtypep is repeatedly used. The unreliability of the predicate leads here to many lost BDD reductions and therefore to the generation of sub-optimal code.

Our implementation is still in active development, currently targets Sbcl and focuses almost entirely on result accuracy. It supports primitive types, user-defined types (deftype, classes and structures), member (and eql) type specifiers and ranges (e.g., (integer * 12)). We present our strategy for implementing each one of these while discussing how and why we decided to diverge or not from Baker’s [ ] approach, or where we fill in some gaps or unclear bits. No optimization work has been done yet and the implementation still has bugs and diverse issues, but we have found some encouraging results about accuracy and even about efficiency.

2 THE COMMON LISP TYPE SYSTEM

2.1 Type specifiers

Common Lisp types are not manipulated directly. Instead, the type to be manipulated is described using a type specifier. The type specifier Domain-Specific Language (DSL) allows programmers to describe types by writing S-expressions which obey some rules described in the Common Lisp standard [1].

Footnotes to Section 1:

A total function is a function defined over its entire definition domain.

The Common Lisp standard requires that no invocation of subtypep involving only primitive types return (nil nil).

Cf. bug #1533685 in the Sbcl bug tracker.


(deftype except (x)

`(not (eql ,x)))

Listing 2: The deftype construct

A subtlety about type specifiers is that different ones can represent the same type (e.g., integer, (integer * *) and (or fixnum bignum) all describe the same type). This means that symbolic computation does not suffice to answer the sub-typing question. Note that one could write a predicate, say type=, to determine whether two type specifiers in fact describe the same type using two calls of subtypep.
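The type= predicate mentioned above can be sketched with two calls of subtypep. This is only an illustration: the name type-equivalent-p and the way the two certainty flags are combined are our assumptions, not part of the implementation described in this paper.

```lisp
(defun type-equivalent-p (a b)
  "Return two values, like subtypep: whether the type specifiers A and B
describe the same type, and whether that answer is certain."
  (multiple-value-bind (a-in-b a-sure) (subtypep a b)
    (multiple-value-bind (b-in-a b-sure) (subtypep b a)
      (cond ((and a-in-b b-in-a) (values t t))
            ;; One direction is known to fail: the types certainly differ.
            ((or (and a-sure (not a-in-b))
                 (and b-sure (not b-in-a)))
             (values nil t))
            ;; Otherwise at least one call gave up.
            (t (values nil nil))))))
```

The accuracy of such a predicate is bounded by the accuracy of the underlying subtypep: whenever one of the two calls returns (nil nil) without the other one settling the question, the sketch can only answer (nil nil) as well.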

It is possible to define parametric aliases using the deftype construct. It is then possible to refer to a whole type specifier using its alias. Listing 2 shows an example of a parametric deftype.
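As a small usage sketch, the alias of Listing 2 can be used wherever a type specifier is expected, for instance with typep. The definition is repeated here only to keep the example self-contained.

```lisp
;; Same definition as in Listing 2.
(deftype except (x)
  `(not (eql ,x)))

;; (except 12) stands for (not (eql 12)).
(typep 12 '(except 12))        ; 12 is excluded from the type
(typep 13 '(except 12))        ; any other object belongs to it
(typep "twelve" '(except 12))  ; including non-numbers
```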

2.2 Vocabulary

type: A set of elements. For any type u, u ≡ {x | x ∈ u}.

canonical t.s.: A type specifier without aliases.

primitive type: A standardized type ([1]) that is not necessarily implemented as a class.

symbolic form: A type specifier whose type is symbol.

compound form: A type specifier whose type is list.

logical form: A compound form whose car is or, and or not.

kingdom: In Baker’s terminology, a “type kingdom” designates the types that can be described using only one kind of type specifier. nil (the empty type) belongs to every type kingdom.

In this article we focus on two particular type kingdoms:

• the literal type kingdom, represented using only symbolic, member and logical type specifiers, and,

• the range type kingdom, represented using only range and logical type specifiers.

For example, (or string symbol) belongs to the literal type kingdom, and (and number (not real)) belongs to the range type kingdom. However, (or symbol integer) belongs to the literal type kingdom while (or symbol (integer * *)) belongs to both. This situation is handled in Section 4.

There are other type kingdoms that Baker mentions in his article, such as the array type kingdom, represented using only array and logical type specifiers. Note that a type can belong to several kingdoms, as multiple type specifiers can describe it. For example, integer belongs to the literal and range kingdoms as the type specifiers integer (symbolic) and (integer * *) (range) both describe it. In Section 4, we describe how to guarantee that a given type is only described by one kind of type specifier, hence restricting it to one kingdom.

3 PROCEDURE’S MECHANISMS OVERVIEW

Figure 1 shows the internals of our implementation. Every step will be detailed in the following sections. There are three major stages:

(1) The pre-processing: both type specifiers are processed in order to simplify further calculations: the aliases are expanded, and each occurrence of a numeric type is converted to its equivalent range type specifier. Finally, as explained hereafter, the procedure splits into several sub-procedures, one for each type kingdom, because their internal type representations differ. In order to achieve that, the type specifiers must also be split into equivalent sub-type specifiers restricted to each concerned kingdom. This stage is detailed in Section 4.

(2) Expert sub-procedures: once split, each sub-type specifier is redirected to the appropriate expert sub-procedure. The job of such a procedure is to prove, in its own kingdom, the assertion “A is a subtype of B” to be wrong. Our procedures currently only support literal and range type specifiers: an expert sub-procedure has been implemented only for these two kingdoms. This stage is detailed in Section 5.

(3) Result conjunction: eventually, all expert sub-procedures return (a Boolean) and the results are accumulated using conjunction. (In practice, as soon as one expert procedure returns false, subtypep returns.)

4 PRE-PROCESSING

4.1 Alias expansion

The very first step is to ensure that the type specifier is in its canonical form, that is, having all its aliases expanded. This is done by the expand function. For example, considering the type created in Listing 2, (expand '(except 12)) should return (not (eql 12)).

Unlike macro expansion, deftype expansion is not standardized in Common Lisp. Thus a solution must be found for each Common Lisp implementation independently. As our efforts are currently focused on Sbcl, we discuss how we implement the expand function for that compiler.

Sbcl’s subtypep heavily relies on the function sb-kernel:specifier-type, which does type expansion. It also does type simplification (turning (and integer string) into nil), which could have saved us some work. We hoped we could simplify that function to make it compatible with Baker’s algorithm while keeping the deftype expansion and the range canonicalization work. However we found, thanks to [ ] tools, that the function is responsible for most of the work of subtypep, as shown in Figure 2. Considering the lack of efficiency of that function and the fact that it would not be trivial to simplify it to only keep the interesting bits, we decided on another, more cost-effective solution.

The function sb-ext:typexpand takes a type specifier and tries to expand it (not recursively). It either returns the expansion result, or the input type specifier if it is not expandable. Fortunately, it also returns a Boolean indicating whether or not an expansion happened. For example, (sb-ext:typexpand 'integer) returns integer since it is not a deftype alias, whereas (sb-ext:typexpand '(except 12)) returns (not (eql 12)). To expand a whole type specifier, one just needs to walk through it, applying sb-ext:typexpand to each list or atom manually. One subtlety though is that the result of an expansion may itself be an alias to expand. For example, given (deftype my-type () '(except 0.0)), the result of (sb-ext:typexpand 'my-type) is (except 0.0), which is of course an alias to expand again.
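The walk described above can be sketched as follows. To keep the example portable, the one-step expander is a parameter and defaults to a toy alias table; toy-typexpand and *toy-aliases* are hypothetical stand-ins that only mimic the two return values described above. On Sbcl one would presumably pass #'sb-ext:typexpand instead. Only logical forms are traversed, which is a simplification of ours.

```lisp
;; Hypothetical alias table standing in for the deftype registry.
(defparameter *toy-aliases*
  (list (cons 'except  (lambda (x) `(not (eql ,x))))
        (cons 'my-type (lambda () '(except 0.0)))))

(defun toy-typexpand (spec)
  "Expand SPEC once (not recursively). Return the expansion, or SPEC
itself, plus a Boolean telling whether an expansion happened."
  (let* ((name  (if (consp spec) (car spec) spec))
         (args  (if (consp spec) (cdr spec) '()))
         (entry (assoc name *toy-aliases*)))
    (if entry
        (values (apply (cdr entry) args) t)
        (values spec nil))))

(defun expand (spec &key (expander #'toy-typexpand))
  "Return SPEC with every alias expanded, walking through logical forms."
  (multiple-value-bind (expansion expanded-p) (funcall expander spec)
    (cond (expanded-p
           ;; The expansion may itself be an alias: expand it again.
           (expand expansion :expander expander))
          ((and (consp expansion)
                (member (car expansion) '(and or not)))
           (cons (car expansion)
                 (mapcar (lambda (sub) (expand sub :expander expander))
                         (cdr expansion))))
          (t expansion))))
```

With the toy table, (expand '(or my-type (except 12) string)) goes through two successive expansions for my-type and one for (except 12).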


[Figure 1, flowchart: type A and type B each go through alias expansion, numeric types → ranges, and split. The literal parts (literal-type-1 and literal-type-2) are combined as (and l-t-1 (not l-t-2)), then go through missing types registration, bit-vector computation and the test “= [0, 0, ···, 0]?”. The range parts (range-type-1 and range-type-2) are combined as (and r-t-1 (not r-t-2)), then go through type diversity reduction, canonicalization and the test “= ∅?”. Both tests feed the subtypep result.]

Figure 1: Internal flowchart of (subtypep A B)

Figure 2: specifier-type weight in cl:subtypep execution

cached-subtypep-caching-call is just a memoizing wrapper around Sbcl’s subtypep, which is a bit more efficient than the raw implementation.

4.2 Numeric type specifiers conversion

As explained in Section 3, after pre-processing both type specifiers, the procedure splits into two expert sub-procedures: one for literal type specifiers and one for range type specifiers. Numeric types (types containing numbers, mathematically speaking) can have different representations: a symbol (e.g., fixnum), a member expression (e.g., (member 1 2 3)) or a range (e.g., (integer 1 6)). However, the first two belong to the literal type kingdom whereas the latter belongs to the range kingdom. Thus, the numerical type information would be distributed over the different expert sub-procedures. For consistency and accuracy, a single internal representation has to be chosen. The symbolic and member numeric types must each be converted into an equivalent type specifier, in which numerical data are only represented using ranges.

• Symbolic numeric type specifier: say U, replace it by (U * *). Note that the new “type specifier” is likely not to be valid (e.g., (fixnum * *) is invalid). Because it is an intermediate, internal representation that is never exposed to the user, nothing bad can happen. However, it cannot be used with other functions requiring a type specifier, such as typep.

• member type specifiers: e.g., (member a 1 2 :b) is converted to (or (member a :b) (bit 1 1) (integer 2 2)). To do that,

(1) extract the numbers out of the expression,

(2) map each number, say n, to construct the type specifier ((type-of n) n n),

(3) and combine the remaining member expression and the ranges with the or logical type specifier.
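The three steps above can be sketched as follows. Since the results of type-of are implementation-dependent, this sketch replaces it with a hypothetical helper, numeric-type-name, which always answers with one of the disjoint numeric type names; as a consequence the number 1 is mapped to (integer 1 1) rather than (bit 1 1). Complex numbers are not handled.

```lisp
(defun numeric-type-name (n)
  "Hypothetical replacement for type-of: name one numeric type containing N."
  (etypecase n
    (integer 'integer)
    (ratio 'ratio)
    (float (find-if (lambda (type) (typep n type))
                    '(single-float double-float short-float long-float)))))

(defun member->ranges (spec)
  "Convert SPEC, a (member e1 ... en) form, so that its numbers are
expressed as ranges and the other elements stay in a member expression."
  (let* ((elements (cdr spec))
         ;; (1) extract the numbers out of the expression
         (numbers (remove-if-not #'numberp elements))
         (others  (remove-if #'numberp elements))
         ;; (2) map each number n to a degenerate range (type n n)
         (ranges  (mapcar (lambda (n) (list (numeric-type-name n) n n))
                          numbers)))
    ;; (3) combine the remaining member expression and the ranges
    `(or ,@(when others `((member ,@others)))
         ,@ranges)))
```

For instance, (member->ranges '(member a 1 2 :b)) builds (or (member a :b) (integer 1 1) (integer 2 2)). The resulting ranges are internal forms only and, as noted above, may not be valid Common Lisp type specifiers.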

A subtlety to consider is that super-types of number also contain numerical data that must be extracted. Indeed, the type atom contains both numerical data, (number * *), and non-numerical data, (and atom (not (number * *))). Thus, its replacement in the numeric type kingdom is straightforward: (number * *). In the literal type kingdom however, its replacement is (or stream array character function standard-object symbol structure-object structure-class). The type t, which is (or atom sequence), must be converted similarly.

Yet another subtlety is that the type specifiers (and) and (or) respectively describe the types t and nil. Hence every occurrence of (and) must be replaced by the replacement of t described in the previous paragraph. In order to remove that annoying corner case completely, (or) is also replaced, by nil.

4.3 Splitting

Having reached this step, the input now only contains canonical literal and range type specifiers, numeric types being only expressed as ranges. The next stage (expert sub-procedures) requires literal and numeric types to be separated. Thus the top type t is divided into two disjoint subtypes, or “kingdoms” as Baker says. The previous step, described in Section 4.2, ensures that the representation (in terms of type specifiers) of the types in each kingdom is different. All numeric types are represented as ranges, and literal types as symbolic and member (without numbers) type specifiers.

Footnote (numeric conversion, Section 4.2): Implementations supporting IEEE floating point raise many concerns with -0.0, NaN, +∞ and −∞. Baker explains in detail how to handle these cases.

Footnote (((type-of n) n n), Section 4.2): The results of type-of are implementation-dependent. We suppose here that type-of only returns the name (as a symbol) of the type of n (n being a number).

Footnote (two disjoint subtypes): One per kingdom actually, but since our implementation only supports two (literal and range types), we only focus our attention on these.

This step roughly consists of an in-depth traversal of the type specifier, using pattern-matching to recognize which type specifier represents which type. We use the implementation of [ ] because of its simplicity and versatility.

Our implementation uses a function type-keep-if which takes a predicate P and a type specifier U and returns:

• U as it is when $P(U) = \top$,

• nil when $P(U) = \bot$,

• (op $U'_1 \cdots U'_n$), where $U'_i$ = (type-keep-if P $U_i$), when U = (op $U_1 \cdots U_n$) and op ∈ {and, or, not}.

Given the predicate literal-type-p and a type U, type-keep-if returns U with every inner type specifier that describes a non-literal type replaced by nil (interpreted as the empty type). The result is then a subtype of (and (not number) (not (array * *))). Likewise, given the predicate range-type-p, this function returns U with every non-range inner type specifier replaced by nil (interpreted this time as the empty range). Thus, the result is a subtype of number. Therefore, split can easily be implemented in terms of type-keep-if.
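A minimal sketch of type-keep-if and split follows. The two predicates are simplified assumptions: after the conversion of Section 4.2, a range is taken to be any list whose car is a numeric type name, and everything else that is not a logical form is taken to be literal. Logical forms are traversed before the predicate is applied.

```lisp
(defun type-keep-if (predicate spec)
  "Return SPEC where every non-logical inner type specifier that does
not satisfy PREDICATE is replaced by nil."
  (cond ((and (consp spec) (member (car spec) '(and or not)))
         (cons (car spec)
               (mapcar (lambda (sub) (type-keep-if predicate sub))
                       (cdr spec))))
        ((funcall predicate spec) spec)
        (t nil)))

(defun range-type-p (spec)
  "Assumption: ranges are lists headed by a numeric type name."
  (and (consp spec)
       (member (car spec)
               '(number real rational integer ratio float short-float
                 single-float double-float long-float fixnum bignum))
       t))

(defun literal-type-p (spec)
  "Assumption: whatever is not a range is a literal type specifier."
  (not (range-type-p spec)))

(defun split (spec)
  "Return the literal part and the range part of SPEC, as a list."
  (list (type-keep-if #'literal-type-p spec)
        (type-keep-if #'range-type-p spec)))
```

For example, (split '(or symbol (integer 0 5))) yields (or symbol nil) for the literal kingdom and (or nil (integer 0 5)) for the range kingdom.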

4.4 Type reformulation

For any types $U$ and $V$, $U \subseteq V \Leftrightarrow U \cap \overline{V} = \emptyset$. Therefore, for any type specifiers U and V, when (subtypep U V) returns T T, then (subtypep `(and ,U (not ,V)) nil) also returns T T.

The results of the split function are zipped together using (lambda (x y) `(and ,x (not ,y))) before being passed to the expert sub-procedures. This way, they do not have to prove that an arbitrary type is a subtype of another arbitrary type, but rather whether one arbitrary type specifier describes the empty type (which is substantially easier to reason about, and to implement).
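The zipping step can be illustrated on two hand-written split results. The function name reformulate is ours, and each argument is assumed to be a list holding one type specifier per kingdom, in the same order.

```lisp
(defun reformulate (split-a split-b)
  "Pair the per-kingdom parts of A and B into emptiness questions."
  (mapcar (lambda (x y) `(and ,x (not ,y)))
          split-a
          split-b))

;; Literal part first, range part second.
(reformulate '(symbol (integer 0 5))
             '((or symbol string) (integer 0 9)))
;; builds ((and symbol (not (or symbol string)))
;;         (and (integer 0 5) (not (integer 0 9))))
```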

5 EXPERT SUB-PROCEDURES

Listing 3 shows how subtypep could be defined from a top-down point of view. It shows that, according to Figure 1, both type specifiers are processed independently, split into two kingdoms (literal and numeric types) and unified in an (and U (not V)) fashion. The expert sub-procedures, null-literal-type-p and null-numeric-type-p, each accept one argument (a type specifier, say U) and return a Boolean indicating whether U describes the empty type (nil).

Each sub-procedure gives an answer restricted to its kingdom, as no type can (at this point of the procedure) belong to two different kingdoms, as shown in Section 4. With that piece of information, we can (now) safely assert that:

• the literal type kingdom is the type described by (and (not number) (not (array * *))), and,

• the numeric type kingdom is the type described by number.

Footnote (literal type kingdom): Actually this is not completely accurate since the type string can be described using array type specifiers. However, since the latter are not supported by our implementation yet, we consider the types string and bit-vector as being literal types since their symbolic representation is kept through the entire process. This is very likely to change in the future.

Footnote (numeric type kingdom): Our implementation does not support complex numbers yet, and considers the complex type as being empty. Some wrong results arise from that supposition, such as (subtypep 'number 'real) returning true. This will change as soon as complex numbers are supported.

(defun subtypep (a b)
  ;; Conjunction of the experts' answers: A is a subtype of B only when
  ;; (and A (not B)) is found empty in every kingdom.
  (reduce (lambda (x y) (and x y))
          (mapcar (lambda (expert t1 t2)
                    (funcall expert `(and ,t1 (not ,t2))))
                  (list #'null-literal-type-p
                        #'null-numeric-type-p)
                  (split (num-types->ranges (expand a)))
                  (split (num-types->ranges (expand b))))))

Listing 3: A top-down approach of subtypep

There are several properties that are derived from the preceding pre-processing steps. First of all, both kingdoms’ procedures are guaranteed to only ever receive canonical type specifiers as arguments. These are also guaranteed to never contain atom type specifiers. The occurrences of (and) and (or) have been replaced respectively by the replacement of t and by nil. eql type specifiers have been replaced by equivalent member expressions. member type specifiers only occur in the literal type kingdom and contain no numerical data. Numerical data are only expressed as intervals, which are likely not to be valid type specifiers. Both kingdoms accept the type specifier nil but with a different meaning: for literal types, nil means the empty type whose complement is t, whereas for numeric types it represents the empty range whose complement is (number * *).

In the following sections we describe in detail the implementation of the expert sub-procedures for the literal (Section 5.1) and numeric (Section 5.2) type kingdoms. We also briefly discuss in Section 5.3 the array type kingdom and the cons type specifier family, which Baker ignores in his article.

5.1 Procedure for literal types

5.1.1 Theory. To represent types in the literal types kingdom, we suppose at first that there is a way to enumerate every element in t, say $e_1, e_2, \ldots, e_\omega$. Then, let $u_1, u_2, \ldots, u_\omega$ be all the (non-strict) subtypes of the top-level type t. We associate to each pair $(u_i, e_j)$ the bit $b_{ij}$ with the value 1 when $e_j \in u_i$ and 0 when $e_j \notin u_i$. Let $bv_i$ be the representative bit-vector associated to the type $u_i$, defined as $[b_{i1}, b_{i2}, \ldots, b_{i\omega}]$. These bit-vectors are the rows of the infinite matrix $B_{\omega\omega}$, which illustrates the system (each row is the bit-vector of one type, and the columns correspond to the elements $e_1, \ldots, e_\omega$):

$$
B_{\omega\omega} =
\begin{bmatrix}
1 & 0 & 0 & 0 & \cdots & 1 \\
0 & 1 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 \\
1 & 0 & 1 & 0 & \cdots & 0
\end{bmatrix}
$$

Proof. Each type has a unique bit-vector representation.

Let $u_i$ and $u_j$ be two distinct types. Thus, $(u_i \cup u_j) \setminus (u_i \cap u_j) \neq \emptyset$. Let $e_k \in (u_i \cup u_j) \setminus (u_i \cap u_j)$. By definition, we have $b_{ik} \neq b_{jk}$. Hence $bv_i \neq bv_j$. Two distinct types are represented by two different bit-vectors.

Similarly, let $bv_i$ and $bv_j$ be two different bit-vectors. Then there necessarily exists a $k$ such that $b_{ik} \neq b_{jk}$. Therefore $\exists e_k, (e_k \in u_i \vee e_k \in u_j) \wedge e_k \notin u_i \cap u_j$. Hence $u_i \neq u_j$. □


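Proof. Type intersection, union and complement are equivalent to the bitwise Boolean operations “and”, “or” and “not” on representative bit-vectors.

Let $u_i$ and $u_j$ be two types:

(1) Let $u_k = u_i \cup u_j$. By definition, $\forall l \in \mathbb{N} \cup \{\omega\}$, $b_{kl} = 1$ iff $b_{il} = 1$ or $b_{jl} = 1$, that is $b_{kl} = b_{il} \vee b_{jl}$. Thus, also by definition:

$$bv_k = [b_{k1}, b_{k2}, \ldots, b_{k\omega}] = [b_{i1} \vee b_{j1}, b_{i2} \vee b_{j2}, \ldots, b_{i\omega} \vee b_{j\omega}] = bv_i \vee bv_j$$

(2) We proceed similarly for the intersection and the Boolean logical operator “and” ($\wedge$).

(3) Let $u_k = \overline{u_i}$. We have by definition $\forall l \in \mathbb{N} \cup \{\omega\}$, $b_{kl} = \neg b_{il}$. Then:

$$bv_k = [\neg b_{i1}, \neg b_{i2}, \ldots, \neg b_{i\omega}] = \neg bv_i$$

□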
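The correspondence established by this proof can be observed on a small, finite set of elements. In the sketch below, the four elements and the function name representative-bits are arbitrary choices of ours; each bit is computed directly with typep.

```lisp
;; A finite stand-in for the enumeration of elements.
(defparameter *elements* (list "hello" '(1 2 3) 'foo #\a))

(defun representative-bits (spec)
  "Bit-vector of SPEC: bit j is 1 when the j-th element is of type SPEC."
  (map 'bit-vector
       (lambda (element) (if (typep element spec) 1 0))
       *elements*))

;; Union, intersection and complement of types coincide with
;; bit-ior, bit-and and bit-not on the bit-vectors.
(equal (representative-bits '(or string symbol))
       (bit-ior (representative-bits 'string)
                (representative-bits 'symbol)))
(equal (representative-bits '(and list (not symbol)))
       (bit-and (representative-bits 'list)
                (bit-not (representative-bits 'symbol))))
```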

5.1.2 Implementation. Common Lisp cannot enumerate all the possible subtypes of t nor all of its elements. Fortunately, we do not need them all. We only need to consider the types mentioned in the input type specifier to determine its emptiness.

We also do not need to enumerate all the elements of these types. It is that aspect of Baker’s procedure that makes it both powerful and difficult to understand at first. We only need sufficiently many elements from a type to distinguish it from the other types. Because we are now considering only a finite number of types, say $u_1, \ldots, u_n$, to register a new type $u_{n+1}$ to our (now finite) matrix, we only need to find an element $e \in u_{n+1}$ such that $e \notin u_1 \cup \cdots \cup u_n$.

Now let’s suppose that the type specifier of $u_{n+1}$ is in fact (member e), that e is itself chosen as a representative element for another type, say $u_i$, and that $u_i$ is only distinguished from the other registered types by that element. $u_{n+1}$ and $u_i$ would then have the same bit-vector representation when these types are likely to be distinct. The general solution for that kind of problem is to register all the elements found inside the member type specifier. When there is a conflicting element e already registered as a representative for other types, we generate additional representatives for these types. That precaution ensures that this kind of conflict never happens and greatly simplifies the implementation of member type specifiers.

To implement that registration matrix system, we use two functions: $B$ : type name $\mapsto$ bit-vector, with $B(u_i) = bv_i$, and $I$ : representative $\mapsto$ bit index, with $I(e_i) = i - 1$. Baker suggests in his small example [ ] using the operator set, which is deprecated in modern Common Lisp programming. Instead, we use hash tables to represent these functions. Type names are symbols, bit-vectors are bit-vectors and element indexes are non-negative integers.

To register a new type $u_{n+1}$, it is added to the $B$ hash table and its bit-vector content $b_{(n+1)i}$ is evaluated for all the existing representatives ($i \in [\![1, m]\!]$). To register a new representative $e_{m+1}$, it is added to the $I$ hash table with the index $m$. Then we add one bit (the $m$-th bit) to each bit-vector $bv_i$ and evaluate it with respect to the type $u_i$. Thus, to retrieve the bit-vector of a registered primitive or user-defined type t, we just look up its value $B(t)$. To compute the bit-vector of a member expression (member $e_1 \cdots e_n$), we use the value $B(\texttt{(member } e_1 \cdots e_n\texttt{)}) = \bigvee_{i=1}^{n} \beta(I(e_i))$, where $\beta(x)$ returns the null bit-vector with the $x$-th bit activated.

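The bit-vectors of logical type specifiers are given in Eq. 1, Eq. 2 and Eq. 3 hereafter.

$$B(\texttt{(and } U_1 \cdots U_n\texttt{)}) = \bigwedge_{i=1}^{n} B(U_i) \tag{1}$$

$$B(\texttt{(or } U_1 \cdots U_n\texttt{)}) = \bigvee_{i=1}^{n} B(U_i) \tag{2}$$

$$B(\texttt{(not } U\texttt{)}) = \neg B(U) \tag{3}$$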
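The registration system and Eq. 1 to Eq. 3 can be sketched with two hash tables, as described above. All the names below are ours, and the sketch makes simplifying assumptions: every named type and every member element must be registered before type-bits is called, and the stored bit-vectors are simply recomputed with typep whenever a representative is added, instead of being extended by one bit.

```lisp
(defvar *representatives* (make-array 0 :adjustable t :fill-pointer 0))
(defvar *index* (make-hash-table :test #'eql))       ; I: element -> index
(defvar *bit-vectors* (make-hash-table :test #'eq))  ; B: type name -> bits

(defun register-type (type-name)
  "Evaluate TYPE-NAME against every registered representative."
  (setf (gethash type-name *bit-vectors*)
        (map 'bit-vector
             (lambda (e) (if (typep e type-name) 1 0))
             *representatives*)))

(defun register-representative (element)
  "Give ELEMENT the next free index and refresh every bit-vector."
  (unless (nth-value 1 (gethash element *index*))
    (setf (gethash element *index*)
          (vector-push-extend element *representatives*))
    (loop for type-name being the hash-keys of *bit-vectors*
          do (register-type type-name)))
  (gethash element *index*))

(defun beta (index)
  "Null bit-vector with the INDEX-th bit activated."
  (let ((bits (make-array (length *representatives*)
                          :element-type 'bit :initial-element 0)))
    (setf (bit bits index) 1)
    bits))

(defun type-bits (spec)
  "B(SPEC) for a named type, a member expression or a logical form."
  (if (atom spec)
      (gethash spec *bit-vectors*)
      (let ((args (cdr spec)))
        (ecase (car spec)
          (member (reduce #'bit-ior
                          (mapcar (lambda (e) (beta (gethash e *index*)))
                                  args)))
          (and (reduce #'bit-and (mapcar #'type-bits args)))   ; Eq. 1
          (or  (reduce #'bit-ior (mapcar #'type-bits args)))   ; Eq. 2
          (not (bit-not (type-bits (first args))))))))         ; Eq. 3

(defun null-literal-type-p (spec)
  "True when SPEC describes the empty type, i.e. B(SPEC) is all zeros."
  (every #'zerop (type-bits spec)))

;; Usage: three representatives and four named types.
(dolist (e (list "hello" (list 1 2 3) 'foo))
  (register-representative e))
(dolist (name '(string list symbol nil))
  (register-type name))
```

With that state, (null-literal-type-p '(and string (not (or string symbol)))) holds, whereas (null-literal-type-p '(and list (not symbol))) does not, since the registered list is not a symbol.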

5.1.3 Issues. The method for choosing the representative elements for a type depends on its nature: it can be a primitive type, a user-defined type (class, structure or condition) or a member expression.

Since primitive types are known (cf. table 4.2 of [1]), their representative elements are chosen at compile-time. The $u_{n+1}$ subtlety above should still be kept in mind. For instance, the type null is a subtype of both symbol and list; so three representative elements are needed: nil, a non-empty list and a symbol other than nil. Note that some primitive types are an exhaustive partition of other types (e.g., character ≡ (or base-char extended-char)). Obviously, in that case, such a precaution does not apply.

For user-defined types, Baker suggests extending the type creation mechanism (thus modifying the implementation’s internal functions) to register a dummy element as a representative. We decided not to follow his approach because of the poor portability of his solution. Indeed, this work, often non-trivial, would have to be repeated for each targeted Common Lisp implementation. (We would like to avoid modifying the Sbcl internal mechanisms.) Moreover, it would register a representative for every class created, thus increasing the size of bit-vectors uselessly since only a few of these classes are likely to appear in a subtypep type specifier. But more importantly, the main drawback of his solution is that creating that dummy element might have unexpected side-effects, as it may need to use slots’ default values and/or initialize-instance. We decided instead to use the Meta Object Protocol (Mop) [ ], more specifically class prototypes. Class prototypes are pseudo-instances of a class, created without executing initialize-instance and which typep and eql view as traditional instances. However, to create a class prototype, the class needs to be finalized, which cannot be guaranteed until the class is instantiated. Since that class may be involved in a subtypep call before that happens, when a new class is encountered, we force its finalization using the function ensure-finalized from the (portable) closer-mop package. Then, we create the prototype of the class using sb-mop:class-prototype and register it. This method is much more portable than Baker’s and does not require hooking into the implementation.
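The prototype-based approach can be sketched as follows for Sbcl only. As a swapped-in technique, the sketch calls sb-mop:class-finalized-p and sb-mop:finalize-inheritance directly instead of closer-mop’s ensure-finalized, so that it does not depend on an external library. The class sample-class, the counter and the function class-representative are hypothetical names used for illustration.

```lisp
(defvar *initializations* 0)

(defclass sample-class ()
  ((payload :initarg :payload)))

;; Counts how many times an instance is really initialized.
(defmethod initialize-instance :after ((object sample-class) &key)
  (incf *initializations*))

#+sbcl
(defun class-representative (class-name)
  "Return the prototype of the class named CLASS-NAME, finalizing the
class first when needed."
  (let ((class (find-class class-name)))
    (unless (sb-mop:class-finalized-p class)
      (sb-mop:finalize-inheritance class))
    (sb-mop:class-prototype class)))

;; The prototype is seen by typep as an instance of the class, although
;; initialize-instance has not been run for it.
#+sbcl
(typep (class-representative 'sample-class) 'sample-class)
```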

Since (in Sbcl) conditions are classes, they are supported automatically. The Common Lisp standard [1] states that “defstruct without a :type option defines a class with the structure name as its name”, hence in that case no additional work is required. The standard also states that “Specifying this option [...] prevents the structure name from becoming a valid type specifier recognizable by typep.” Thus, subtypep is not concerned by these types of structures.

Footnote (closer-mop): http://common-lisp.net/project/closer/

Footnote (conditions as classes): Every major Lisp implementation implements conditions as Clos classes, the most obvious way to do it. We ignore exotic condition implementations.

To address the misrepresentation problem when member type specifiers are involved, as discussed in Section 5.1.3, we must ensure that a new representative element is generated and registered. The Common Lisp standard ([1]) states that the member type specifier is defined in terms of eql. That is, (typep e '(member $e_1 \cdots e_n$)) uses eql to compare e to the successive $e_i$ to check the membership. That precise property reduces the misrepresentation problem to only two types: symbol and character (and their subtypes).

To better understand why it is the case, first consider a reduced version of the top-level type, (or string list symbol). Then, let R = ("hello" (1 2 3) foo) be our list of representatives.

(1) Let’s ask the question (subtypep 'symbol '(member foo)).

(2) As discussed in Section 5.1.3, we add the elements of the member expression to R. To conform with the specification, we first check whether or not foo is already in R eql-wise: foo $\in_{\texttt{eql}}$ R, so R does not change.

(3) As shown in Eq. 4, the emptiness check passes, meaning that symbol would be reported as a subtype of (member foo), which is obviously wrong.

$$
\begin{aligned}
B(\texttt{symbol}) \wedge \neg B(\texttt{(member foo)}) &= 001 \wedge \neg 001 \\
&= 001 \wedge 110 \\
&= 000 \\
&= B(\texttt{nil})
\end{aligned}
\tag{4}
$$

However, for lists, that problem does not appear, thanks to the eql-wise comparison.

(1) (subtypep 'list '(member (1 2 3)))

(2) (1 2 3) $\notin_{\texttt{eql}}$ R ⇒ R = ("hello" (1 2 3) foo (1 2 3))

(3) As shown in Eq. 5, the emptiness check fails and the answer is correct.

$$
\begin{aligned}
B(\texttt{list}) \wedge \neg B(\texttt{(member (1 2 3))}) &= 0101 \wedge \neg 0001 \\
&= 0101 \wedge 1110 \\
&= 0100 \\
&\neq B(\texttt{nil})
\end{aligned}
\tag{5}
$$
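Eq. 4 and Eq. 5 can be replayed with a few lines of code. The sketch below is self-contained and uses hypothetical helper names; it also shows that adding one more symbol to the representatives, here an uninterned one, is enough to repair the wrong answer of Eq. 4.

```lisp
(defun bits-for-type (spec representatives)
  "Bit i is 1 when the i-th representative is of type SPEC."
  (map 'bit-vector
       (lambda (r) (if (typep r spec) 1 0))
       representatives))

(defun bits-for-member (elements representatives)
  "Bit i is 1 when the i-th representative is eql to one of ELEMENTS."
  (map 'bit-vector
       (lambda (r) (if (member r elements :test #'eql) 1 0))
       representatives))

(defun looks-like-subtype-p (spec elements representatives)
  "Emptiness check of (and SPEC (not (member . ELEMENTS)))."
  (every #'zerop
         (bit-and (bits-for-type spec representatives)
                  (bit-not (bits-for-member elements representatives)))))

;; Eq. 4: foo is the only symbol among the representatives, so symbol
;; is wrongly reported as a subtype of (member foo).
(looks-like-subtype-p 'symbol '(foo)
                      (list "hello" (list 1 2 3) 'foo))

;; Eq. 5: the list of the member expression is a new object, not eql to
;; the registered one, so the check fails as it should.
(let ((user-list (list 1 2 3)))
  (looks-like-subtype-p 'list (list user-list)
                        (list "hello" (list 1 2 3) 'foo user-list)))

;; Repair of Eq. 4: one additional symbol representative.
(looks-like-subtype-p 'symbol '(foo)
                      (list "hello" (list 1 2 3) 'foo (gensym)))
```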

Within the literal types kingdom, the only types for which this problem occurs are then symbol and character, since the representatives are not supposed to be accessible to the user of subtypep. Therefore, only the representatives of these types need to be actually checked when registering the elements of a member expression.

To generate a new symbol, we use alexandria:symbolicate. The keyword subtype of symbol is also subject to the problem. (Actually, solving the problem for keywords also solves the problem for symbols.) To generate a new character, we first need to know whether it is a base-char or an extended-char. Then we pick a character of that type not registered yet. When all the characters of that type are registered there is nothing to do (since the type is fully represented in the matrix, no misinterpretation can occur).

We have not yet addressed the problem of a type specifier involving a user class C and a member expression containing the class prototype of C.

Footnote (alexandria): https://common-lisp.net/project/alexandria/
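Generating additional representatives can be sketched without any external library. As a swapped-in technique, the symbol case below uses gensym rather than alexandria:symbolicate: an uninterned symbol cannot be eql to a symbol written by the user, but it is not a keyword, so the keyword case is not covered by this sketch. Both function names are hypothetical.

```lisp
(defun fresh-symbol-representative ()
  "Return a new symbol, distinct from every existing one."
  (gensym "REPRESENTATIVE-"))

(defun fresh-character-representative (character-type registered)
  "Return a character of CHARACTER-TYPE (e.g. base-char) that is not in
the list REGISTERED, or nil when every such character is registered."
  (loop for code from 0 below char-code-limit
        for candidate = (code-char code)
        when (and candidate
                  (typep candidate character-type)
                  (not (member candidate registered)))
          return candidate))
```

When fresh-character-representative returns nil, the type is fully represented and, as explained above, there is nothing more to do.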

5.2 Procedure for numeric types

Unlike the literal type kingdom, the range type kingdom does not need an internal state to represent numeric types. Indeed, the expert sub-procedure takes as input an already precise enough representation of the type described. A range type specifier describes which kind of number is specified (its type, e.g., integer, ratio, etc.) and its bounds (inclusive or exclusive, e.g., (integer (0) 6)), and it can represent unbounded intervals through the symbol *, meaning infinity (e.g., (float * 0.0) ≡ [−∞; 0.0]). The range type specifier is as precise as the mathematical range notation. Additionally, the mathematical union, intersection and complement of these ranges can be expressed equally using the corresponding logical type specifiers.

Therefore, to decide the emptiness of the input type specifier, checking whether the canonicalized version of this interval expression describes the empty range (i.e., nil) is sufficient. The calculation is performed in three successive steps, which we describe in the following sections.

This algorithm suffers from an exponential time and space complexity. However, Baker claims that in practice, that theoretical complexity is not an issue (it only appears for “highly artificial cases”). We have not tried to prove (or invalidate) his statement but Section 6 shows some early results that tend to support his claim.

We use a custom abstraction, the interval class, which is closer to the mathematical object (with type, bounds and limits slots). We thus avoid the tedious manipulation of lists (with the many standardized range syntaxes). The first step is to write a function range->interval that converts (using pattern matching) a range type specifier to its corresponding interval instance. This function also takes care of the exotic compound forms, such as (unsigned-byte s), which describes the integer range $[0; 2^s - 1]$. We also use a similar structure for interval operations to fully discard the list representation.
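The conversion described above can be sketched as follows. This is only an illustration under stated assumptions: the slot names (lower, upper, lower-open-p, upper-open-p), the helper parse-bound and the use of destructuring-bind instead of a pattern matching library are choices made for this sketch, not the layout of the actual implementation.

```lisp
;; Hypothetical INTERVAL abstraction; slot names are assumptions.
(defstruct interval
  type           ; numeric type name, e.g. INTEGER or SINGLE-FLOAT
  (lower '*)     ; lower bound: a number, or * for minus infinity
  (upper '*)     ; upper bound: a number, or * for plus infinity
  lower-open-p   ; true when the lower bound is exclusive
  upper-open-p)  ; true when the upper bound is exclusive

(defun parse-bound (bound)
  "Return the bound value and whether it is exclusive, as two values."
  (if (consp bound)
      (values (first bound) t)     ; (0) denotes an exclusive bound
      (values bound nil)))         ; 0 is inclusive, * is unbounded

(defun range->interval (spec)
  "Convert the range type specifier SPEC into an INTERVAL instance."
  (destructuring-bind (type &optional (low '*) (high '*))
      (if (consp spec) spec (list spec))
    (case type
      ;; (unsigned-byte s) is the integer range [0; 2^s - 1].
      (unsigned-byte
       (make-interval :type 'integer
                      :lower 0
                      :upper (if (eq low '*) '* (1- (expt 2 low)))))
      (t
       (multiple-value-bind (lo lo-open) (parse-bound low)
         (multiple-value-bind (hi hi-open) (parse-bound high)
           (make-interval :type type
                          :lower lo :upper hi
                          :lower-open-p lo-open
                          :upper-open-p hi-open)))))))
```

For instance, (range->interval '(integer (0) 6)) yields an integer interval whose lower bound 0 is flagged as exclusive, and (range->interval '(unsigned-byte 8)) yields the integer interval from 0 to 255.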

We also need the following interval functions:

• (interval-and $I_1$ $I_2$): returns $I_1 \cap I_2$ if their types are eql, or $\emptyset$ otherwise.

• (interval-or $I_1$ $I_2$): returns $I_1 \cup I_2$ if their types are eql and $I_1 \cap I_2 \neq \emptyset$, or $\emptyset$ otherwise.

• (interval-minus $I_1$ $I_2$): returns $I_1 - I_2$ (may return two values when $I_2 \subset I_1$) if their types are eql, or $I_1$ otherwise.

• (interval-empty-p $I$): returns whether $I = \emptyset$.
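Two of the functions listed above can be sketched as follows. To stay short, the sketch assumes inclusive bounds only and represents the empty set by nil; the closed-interval structure and the helpers max-lower and min-upper are hypothetical names introduced for this example.

```lisp
;; Simplified interval: inclusive bounds only; * stands for infinity.
;; The structure and its slot names are assumptions made for this sketch.
(defstruct (closed-interval (:constructor closed-interval (type low high)))
  type low high)

(defun max-lower (a b)
  "Greatest of two lower bounds, where * means minus infinity."
  (cond ((eq a '*) b)
        ((eq b '*) a)
        (t (max a b))))

(defun min-upper (a b)
  "Least of two upper bounds, where * means plus infinity."
  (cond ((eq a '*) b)
        ((eq b '*) a)
        (t (min a b))))

(defun interval-empty-p (i)
  "True when I is empty: either NIL or an interval with low > high."
  (or (null i)
      (and (numberp (closed-interval-low i))
           (numberp (closed-interval-high i))
           (> (closed-interval-low i) (closed-interval-high i)))))

(defun interval-and (i1 i2)
  "Intersection of I1 and I2, or NIL when it is empty or the types differ."
  (when (eql (closed-interval-type i1) (closed-interval-type i2))
    (let ((result (closed-interval
                   (closed-interval-type i1)
                   (max-lower (closed-interval-low i1) (closed-interval-low i2))
                   (min-upper (closed-interval-high i1) (closed-interval-high i2)))))
      (unless (interval-empty-p result)
        result))))
```

Handling exclusive bounds would additionally require comparing the openness flags whenever two bounds are equal, which is omitted here.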

5.2.1 Type diversity reduction. Functions working with intervals must be aware of the relationship between the types of these intervals. For example, the intersection of two integer intervals might be non-empty, whereas the intersection of one integer interval and one single-float interval is always empty, as these two types are disjoint. However, integer and fixnum are different types, yet the intersection of intervals of such types might be non-empty. The subtype relationship between the types of intervals needs to be inspected to accurately apply some operations (such as intersection or union).
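The following standard forms illustrate the two situations mentioned above: a value that belongs to both fixnum and integer, and two types, integer and single-float, that share no value. The sample value 3 is an arbitrary choice.

```lisp
;; FIXNUM and INTEGER are different types, but one is a subtype of the other.
(subtypep 'fixnum 'integer)        ; => T, T
(typep 3 'fixnum)                  ; true
(typep 3 'integer)                 ; true

;; INTEGER and SINGLE-FLOAT are disjoint.
(subtypep 'integer 'single-float)  ; => NIL, T
(typep 3 'single-float)            ; false
```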

The type number (complex numbers being ignored) is exhaustively partitioned into six mutually disjoint types: integer, ratio, single-float, short-float, double-float, and long-float. Baker advises defining what he calls “simple intervals”, that is, intervals guaranteed to have their type equal to one of these six types. This way, as these types are mutually disjoint, operations on intervals of such types have their implementation greatly simplified.

Supertype    Conversion
number       (or rational float)
real         (or rational float)
rational     (or integer ratio)
float        (or short-float single-float double-float long-float)
bignum       (and integer (not fixnum))

Table 1: Conversion table for supertypes

To convert each numeric type into its equivalent using only the six types above, a two-step conversion is required.

(1) For intervals whose type is a supertype of one of these types, the conversion Table 1 is used. E.g., the conversion of (rational a b) gives (or (integer a b) (ratio a b)).

(2) For intervals whose type is a bounded subtype (i.e., having defined bounds, not infinity) of these six types, their actual bounds have to be constrained to fit within the bounds of their type, before being converted to their corresponding supertype. For example, (fixnum 12 $2^{100}$) has to be converted to (integer 12 most-positive-fixnum), where most-positive-fixnum < $2^{100}$ since $2^{100}$ is a bignum, thus discarding the numbers in between. A similar procedure is applied to the types bit, short-float, single-float, double-float and long-float.

Eventually, the type of every interval is constrained to one of the six types above, with the bounds (if any) of their original type preserved.
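The two steps above can be sketched on list-based specifiers as follows. This is a minimal sketch: the names *supertype-table*, expand-supertype and clamp-fixnum are hypothetical, the bignum entry of Table 1 is left out, and only the fixnum case of step (2) is shown.

```lisp
;; Table 1 as an association list (the BIGNUM entry is omitted here).
(defparameter *supertype-table*
  '((number   rational float)
    (real     rational float)
    (rational integer ratio)
    (float    short-float single-float double-float long-float)))

(defun expand-supertype (spec)
  "Step (1): rewrite (TYPE LOW HIGH) using only the six simple types."
  (destructuring-bind (type &optional (low '*) (high '*))
      (if (consp spec) spec (list spec))
    (let ((subtypes (rest (assoc type *supertype-table*))))
      (if subtypes
          ;; Recurse, since e.g. NUMBER expands to RATIONAL and FLOAT,
          ;; which are supertypes themselves. Nested unions are flattened.
          `(or ,@(loop for sub in subtypes
                       for expanded = (expand-supertype (list sub low high))
                       if (eq (first expanded) 'or)
                         append (rest expanded)
                       else
                         collect expanded))
          (list type low high)))))

(defun clamp-fixnum (low high)
  "Step (2) for FIXNUM: constrain the bounds to the fixnum range."
  (list 'integer
        (if (eq low '*) most-negative-fixnum (max low most-negative-fixnum))
        (if (eq high '*) most-positive-fixnum (min high most-positive-fixnum))))
```

With these definitions, (expand-supertype '(rational 0 5)) gives (or (integer 0 5) (ratio 0 5)), and (clamp-fixnum 12 (expt 2 100)) gives an integer range from 12 to most-positive-fixnum on implementations whose fixnums are smaller than $2^{100}$.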

5.2.2 Canonicalization. To check the emptiness of the interval expression, it is canonicalized. Let $\Gamma$ be the canonicalization function. Its parameter is either an interval $I$ or an operation on intervals (intersection, union or complement). $\Gamma$ returns either $\emptyset$, an interval or a union of disjoint intervals, which are the three possible outcomes of a mathematical interval canonicalization.

First and foremost, anytime $\Gamma$ encounters or returns a union, it must ensure that it is flattened (no nested unions). It must also ensure that the intervals inside the union are disjoint. As shown in Section 5.2.1, intervals with different types are necessarily disjoint. Touching intervals [3] are merged using interval-or.
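The flattening and merging requirements can be sketched as follows. The sketch assumes that intervals are lists of the form (type low high) with finite, inclusive bounds, and it merges intervals of the same type only when they overlap or share a bound; the function names flatten-union and merge-intervals are hypothetical.

```lisp
(defun flatten-union (expr)
  "Return the operands of EXPR as a flat list, removing nested OR forms."
  (if (and (consp expr) (eq (first expr) 'or))
      (mapcan #'flatten-union (rest expr))
      (list expr)))

(defun merge-intervals (intervals)
  "Merge overlapping intervals of the same type.
Each interval is a list (TYPE LOW HIGH) with finite, inclusive bounds."
  (let ((sorted (sort (copy-list intervals) #'< :key #'second))
        (result '()))
    (dolist (i sorted (nreverse result))
      ;; RESULT holds the most recent interval first, so FIND returns
      ;; the latest interval of the same type as I, if any.
      (let ((previous (find (first i) result :key #'first)))
        (if (and previous (<= (second i) (third previous)))
            ;; I overlaps the latest interval of its type: extend it.
            (setf (third previous) (max (third previous) (third i)))
            ;; Otherwise I starts a new disjoint interval.
            (push (copy-list i) result))))))
```

Intervals of different types are never merged, which matches the observation that they are necessarily disjoint.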

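$\Gamma(\emptyset)$ and $\Gamma(I)$ are straightforward, as shown in Eq. end-∅ and Eq. end-I. These are the terminal cases of the recursion of $\Gamma$.

\begin{align}
\Gamma(\emptyset) &= \emptyset \tag{end-$\emptyset$} \\
\Gamma(I) &= I \tag{end-$I$}
\end{align}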

Intersections (and logical type specifiers) are reduced as soon as they are encountered. Their operands need to be processed by $\Gamma$ first (hence the implicit mapping “$k \to n$”). Eq. and-apply shows how to reduce intersections. The $\Phi_f$ operator denotes a fold [5] operation using the function $f$. $\Gamma \circ \cap$ denotes the composition of the $\Gamma$ function and the intersection operator. To break it down in a bottom-up fashion:

(1) Eq. and-final: the application of the intersection function.

(2) Eq. and-distribution′: the distribution of the intersection over the union. The next step is Eq. and-final.

(3) Eq. and-distribution: also the distribution of the intersection over the union. However, $\Gamma(\chi)$ may return a union, leading the execution either to Eq. and-distribution′ or directly to Eq. and-final.

(4) Eq. and-apply: the canonicalization of the $\chi_k$ forms using mapping. The results are then folded using $\Gamma \circ \cap$, thus initiating the recursive intersection distribution.

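\begin{align}
\Gamma\Big(\bigcap_{k=1}^{n} \chi_k\Big) &= \Phi_{\Gamma \circ \cap}\big(\Gamma(\chi_k)\big)_{k \to n} \tag{and-apply} \\
\Gamma\Big(\chi \cap \bigcup_{k=1}^{n} I_k\Big) &= \bigcup_{k=1}^{n} \Gamma\big(\Gamma(\chi) \cap I_k\big) \tag{and-distribution} \\
\Gamma\Big(\Big(\bigcup_{k=1}^{n} I_k\Big) \cap I\Big) &= \bigcup_{k=1}^{n} \Gamma(I_k \cap I) \tag{and-distribution$'$} \\
\Gamma(I_1 \cap I_2) &= \texttt{(interval-and}\ I_1\ I_2\texttt{)} \tag{and-final}
\end{align}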
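As a worked illustration of the distribution of the intersection over the union, consider three integer intervals chosen arbitrarily for this example. The intersection is distributed over each member of the union, each pairwise intersection is computed with interval-and, and the empty results disappear from the final union.

```latex
\begin{align*}
\Gamma\big([0;10] \cap ([5;20] \cup [30;40])\big)
  &= \Gamma([0;10] \cap [5;20]) \cup \Gamma([0;10] \cap [30;40]) \\
  &= [5;10] \cup \emptyset \\
  &= [5;10]
\end{align*}
```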

Complements (not logical type specifiers) are also reduced as soon as they are encountered. Their only operand is first canonicalized. Complementing a type $\chi$ within number (the top-level type of the range type kingdom) is equivalent to the difference $U - \chi$, where $U$ denotes number, as shown in Eq. not-apply. The difference canonicalization goes through a recursive distribution path similar to that of the intersection, that is, Eq. minus-distribution and then Eq. minus-apply. Note that this path is taken every time, since the interval difference is an internal operation and its left-hand operand is always $U$.

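\begin{align}
\Gamma(\lnot \chi) &= \Gamma\big(U - \Gamma(\chi)\big) \tag{not-apply} \\
U &= \langle \text{type diversity reduction of } \texttt{(number * *)} \rangle \notag \\
\Gamma\Big(\chi - \bigcup_{k=1}^{n} I_k\Big) &= \bigcap_{k=1}^{n} \Gamma(\chi - I_k) \tag{minus-distribution} \\
\Gamma\Big(\Big(\bigcup_{k=1}^{n} I_k\Big) - I\Big) &= \bigcup_{k=1}^{n} \texttt{(interval-minus}\ I_k\ I\texttt{)} \tag{minus-apply}
\end{align}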
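As a worked illustration, consider the complement of the integer interval $[0;10]$, an arbitrary example. Writing $U_{\mathrm{integer}}$ for the integer member of $U$ (a notation introduced only for this example), interval-minus returns two intervals because the subtracted interval lies strictly inside it. The five other members of $U$ have a different type and are therefore returned unchanged.

```latex
\begin{align*}
U_{\mathrm{integer}} - [0;10] &= (-\infty;-1] \cup [11;+\infty) \\
\Gamma\big(\lnot [0;10]\big)
  &= (-\infty;-1] \cup [11;+\infty) \cup U_{\mathrm{ratio}}
     \cup U_{\mathrm{short\text{-}float}} \cup U_{\mathrm{single\text{-}float}}
     \cup U_{\mathrm{double\text{-}float}} \cup U_{\mathrm{long\text{-}float}}
\end{align*}
```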

5.2.3 Range emptiness check. Once an interval expression is canonicalized, checking its emptiness is trivial. The predicate interval-empty-p, given the result of the first $\Gamma$ call, just returns the Boolean that null-numeric-type-p has to return.
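To relate the emptiness check to the answers of subtypep, the forms below use the standard cl:subtypep on plain integer ranges, for which a definite answer is required. The last form states the same question as an emptiness test, assuming the usual reduction of "A is a subtype of B" to "A and not B is empty"; its result is left uncommented because the standard allows an implementation to answer NIL, NIL when and or not are involved.

```lisp
;; [0; 5] is included in [0; 10].
(subtypep '(integer 0 5) '(integer 0 10))   ; => T, T

;; [0; 5] is not included in [2; 10], since 0 and 1 are left out.
(subtypep '(integer 0 5) '(integer 2 10))   ; => NIL, T

;; The first question again, phrased as an emptiness check.
(subtypep '(and (integer 0 5) (not (integer 0 10))) nil)
```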

5.3 Array types and cons type specifiers

This section presents some preliminary work and research results on array and cons type specifiers. Since the implementation of the expert sub-procedures for these kingdoms is still a work in progress, neither results nor implementation guidelines are provided here. The section does, however, give some insights into how Baker's procedure applies to modern Common Lisp implementations such as Sbcl.

[Figure 3: Comparative efficiency measures of our subtypep implementation. Panels: Subtypes of NUMBER, MEMBER types, Subtypes of T, Subtypes of CONDITION. Curves: Algorithm 1 with cl:subtypep, Algorithm 1 with baker:subtypep, Algorithm 2 with cl:subtypep, Algorithm 2 with baker:subtypep.]

Array type specifiers are complex to handle because they are bi-dimensional: they have an element type and bounds (e.g., (array integer (* 2 *))). Internally, Common Lisp implementations do not store the exact element type that was specified, but only the result that the function upgraded-array-element-type returns for that type. E.g., for (make-array 2 :element-type 'list), the implementation does not make an array of list but rather an array of (upgraded-array-element-type 'list). For every value that this function might return, Baker requires that we store a bit matrix (instead of bit vectors) because of the complex bounds logic of the type specifier. As for the literal type procedure, it seems to be an efficient type representation system, albeit a more complex one, which nonetheless requires an extra registration step and a global state.
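The upgrading behavior can be observed with standard functions only. The values returned for the first two list elements are implementation-dependent, so the sketch does not state them; the only property relied upon is that the requested type is a subtype of the upgraded one.

```lisp
;; The requested element type versus the one actually stored.
(let* ((requested 'list)
       (upgraded (upgraded-array-element-type requested))
       (arr (make-array 2 :element-type requested)))
  (list upgraded                     ; implementation-dependent, often T
        (array-element-type arr)     ; describes the upgraded type, not LIST
        ;; The requested type is always a subtype of the upgraded one.
        (subtypep requested upgraded)))
```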

Baker does not mention the cons type specifier family at all in his article because it appeared after he released his article [ ]. An accurate expert sub-procedure for this kingdom would have an exponential complexity. More investigation is needed to assess whether or not that exponential time is “acceptable” (as it is for ranges) before rejecting it. The accuracy of existing subtypep procedures for the cons type specifier also needs to be studied.

6 EARLY RESULTS

Our implementation of subtypep is still in active development and very experimental. No serious optimization work has been done. Nonetheless, Newton has compared in [ ] the performance of several algorithms that depend heavily on subtypep, using both the implementation of Sbcl and ours.

These results, shown in Figure 3, are only presented here as complementary information. The horizontal axis shows the size of the type specifiers and the vertical axis shows the measured execution time. Hence, the lower a curve is, the better. As expected, our implementation is often slower, but not dramatically so, which is encouraging.

• Our implementation is overall slower in the range type kingdom.

• Heavy users of member seem to experience slower execution. Perhaps, as predicted by Baker, the reason is that the systematic registration of the elements makes the size of the bit-vectors grow quickly, thus making every subsequent operation slower.

• For the symbolic type specifiers (primitive types, Clos classes and conditions), our implementation already outperforms Sbcl's.

7 CONCLUSION AND FUTURE WORK

Throughout this article we presented our implementation of Baker's decision procedure. In Section 2 we introduced the Common Lisp type system, the notion of type specifier and some vocabulary. In Section 4 we explained how to pre-process the caller's type specifiers to make the work of the expert sub-procedures presented in Section 5 easier. We described our implementation for the symbolic, member, range and logical type specifiers. We also gave some insights about the implementation for the array and cons type specifiers. We finally presented some early efficiency measures, which are globally encouraging.

Our implementation is still a work in progress and highly experimental. But with some cleaning and the implementation of both array and cons expert sub-procedures, it could be a viable alternative to existing subtypep implementations. We will have open sourced its code by then. We still have to find a solution for the satisfies type specifier and the related uncertainty. Indeed, in some situations, subtypep can still answer even though this type specifier is involved. For example, in (subtypep 'string '(and number (satisfies evenp))), as the second operand is guaranteed to be a subtype of number, the predicate can safely return false. Finally, many measurements of accuracy and efficiency are needed to assess whether Baker's intuition about his procedure was correct or not. Even if, in the future, we are to conclude that our implementation is less efficient than those which already exist, Baker's algorithm would still be likely to improve the predicate's accuracy. Lispers would then have the ability to choose whichever subtypep implementation fits their needs best.

REFERENCES

[1] Ansi. American National Standard: Programming Language – Common Lisp. ANSI X3.226:1994 (R1999), 1994.

[2] Ansi. American National Standard: Programming Language – Common Lisp – Type Specifiers (Section 4.2.3). ANSI X3.226:1994 (R1999), 1994. http://www.lispworks.com/documentation/lw50/CLHS/Body/04_bc.htm.

[3] Henry G. Baker. A Decision Procedure for Common Lisp's SUBTYPEP Predicate. Lisp and Symbolic Computation, 1992.

[4] Paul F. Dietz. “subtypep tests” discussion on gcl-devel, 2005. https://lists.gnu.org/archive/html/gcl-devel/2005-07/msg00038.html.

[5] Graham Hutton. A tutorial on the universality and expressiveness of fold. Journal of Functional Programming, 9(4):355–372, July 1999. URL http://dblp.uni-trier.de/db/journals/jfp/jfp9.html#Hutton99.

[6] Gregor J. Kiczales, Jim des Rivières, and Daniel G. Bobrow. The Art of the Metaobject Protocol. MIT Press, Cambridge, MA, 1991.

[7] Jim Newton. Representing and Computing with Types in Dynamically Typed Languages. PhD thesis, Sorbonne Université, Paris, France, November 2018.

[8] Jim Newton and Didier Verna. Approaches in typecase optimization. In European Lisp Symposium, Marbella, Spain, April 2018.

[9] Peter Norvig. Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. Morgan Kaufmann, 1992.