Type-checking on Heterogeneous Sequences
in Common Lisp
Jim Newton
EPITA/LRDE
May 9, 2016
Jim Newton (EPITA/LRDE) Type-checking on Heterogeneous Sequences May 9, 2016 1 / 31
Overview
1 Introduction
    Common Lisp Types
2 Type Sequences
    Limitations
    Rational Type Expression
    Generated Code
    Example: destructuring-case
    Overlapping Types
3 Conclusion
    Difficulties Encountered
    Summary
    Questions
Common Lisp Types
What is a type in Common Lisp?
Definition (from CL specification)
A (possibly infinite) set of objects.
Definition (type specifier)
An expression that denotes a type.
Atomic examples
t, integer, number, asdf:component
Type specifiers come in several forms.
Compound type specifiers
(eql 12)
(member :x :y :z)
(satisfies oddp)
(and (or number string) (not (satisfies MY-FUN)))
Specifiers for the empty type
nil
(and number string)
(and (satisfies evenp) (satisfies oddp))
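The specifiers above can be tried with the standard TYPEP and SUBTYPEP functions. The following is a minimal sketch; the function names are illustrative and not from the talk, and KEYWORDP stands in for the unspecified MY-FUN.

```lisp
;; Illustrative sketch; these function names are not from the talk.
(defun compound-specifier-checks ()
  "Return the TYPEP results for one object per compound specifier."
  (list (typep 12 '(eql 12))
        (typep :y '(member :x :y :z))
        (typep 3 '(satisfies oddp))
        ;; KEYWORDP is a stand-in for MY-FUN.
        (typep "abc" '(and (or number string)
                           (not (satisfies keywordp))))))

(defun provably-empty-p (type)
  "True when SUBTYPEP can prove that TYPE denotes the empty type."
  (values (subtypep type nil)))
```

Calling (provably-empty-p '(and number string)) asks whether the intersection is a subtype of nil. For specifiers involving satisfies, such as (and (satisfies evenp) (satisfies oddp)), the standard allows SUBTYPEP to answer that it cannot decide, so the function may return false even though the type is empty.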
Using types with sequences
Compile time
(lambda (x y)
  (declare (type (vector float) x y))
  (list x y))
Run time
(typep my-list '(cons t (cons t (cons string))))
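As a small sketch of the run-time check, the cons specifier can be wrapped in a predicate. The data and the function name below are hypothetical. Note that the specifier spells out each leading position explicitly, so it constrains only a fixed-length prefix.

```lisp
;; Hypothetical data standing in for my-list.
(defparameter *my-list* (list 1 :two "three"))

(defun third-is-string-p (object)
  "True when OBJECT is a chain of at least three conses whose third element is a string."
  (typep object '(cons t (cons t (cons string)))))
```

With this definition, (third-is-string-p *my-list*) is true, while a two-element list or a list whose third element is not a string is rejected.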
Limitations
Limited capability for specifying heterogeneous sequences.
You can’t specify the following.
An arbitrary length, non-empty, list of floats:
(1.0 2.0 3.0)
A plist such as:
(:x 0 :y 2 :z 3)
The Rational Type Expression
Introducing the RTE type
Rational type expression vs. RTE type specifier
number^+
(RTE (:+ number))
Example: (1.0 2.0 3.0)
(keyword · integer)^*
(RTE (:* (:cat keyword integer)))
Example: (:x 0 :y 2 :z 3)
Use RTE anywhere CL expects a type specifier.
(deftype plist (type)
  `(and list
        (RTE (:* keyword ,type))))
(defun foo (A B)
  (declare (type (RTE (:+ number)) A)
           (type (plist float) B))
  ...)
An RTE can be expressed as a finite state machine.
(symbol · (number^+ ∪ string^+))^+
(:+ symbol (:or (:+ number) (:+ string)))
[Figure: state machine with states 0, 1, 2 and 3; transitions are labelled symbol, number and string.]
Generated Code
State machine can be expressed in CL code
(lambda (seq)
  (declare (optimize (speed 3) (debug 0) (safety 0)))
  (typecase seq
    (list
     ...)
    (simple-vector
     ...)
    (vector
     ...)
    (sequence
     ...)
    (t nil)))
Generated code implementing the state machine
(tagbody
 0
   (unless seq (return nil))
   (typecase (pop seq)
     (symbol (go 1))
     (t (return nil)))
 1
   (unless seq (return nil))
   (typecase (pop seq)
     (number (go 2))
     (string (go 3))
     (t (return nil)))
 2
   (unless seq (return t))
   (typecase (pop seq)
     (number (go 2))
     (symbol (go 1))
     (t (return nil)))
 3
   (unless seq (return t))
   (typecase (pop seq)
     (string (go 3))
     (symbol (go 1))
     (t (return nil))))
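To experiment with the state machine at the REPL, the fragment can be wrapped in a function. The sketch below covers only the list case and assumes a proper list; the function name is hypothetical, and the explicit nil block is added because RETURN needs one.

```lisp
;; Sketch only: hypothetical name, LIST branch only, proper lists assumed.
(defun symbol-number-string-list-p (seq)
  "True when the list SEQ matches (:+ symbol (:or (:+ number) (:+ string)))."
  (and (listp seq)
       (block nil                       ; RETURN exits this block
         (tagbody
          0                             ; start state, expects a symbol
            (unless seq (return nil))
            (typecase (pop seq)
              (symbol (go 1))
              (t (return nil)))
          1                             ; after a symbol, expects number or string
            (unless seq (return nil))
            (typecase (pop seq)
              (number (go 2))
              (string (go 3))
              (t (return nil)))
          2                             ; accepting state, inside a run of numbers
            (unless seq (return t))
            (typecase (pop seq)
              (number (go 2))
              (symbol (go 1))
              (t (return nil)))
          3                             ; accepting state, inside a run of strings
            (unless seq (return t))
            (typecase (pop seq)
              (string (go 3))
              (symbol (go 1))
              (t (return nil)))))))
```

For example, (symbol-number-string-list-p '(a 1 2 b "x" "y")) is accepted, while '(a) and '(a 1 "x") are rejected. Each element is examined once, so the check is linear in the length of the list.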
destructuring-case
Example of destructuring-case
(destructuring-case DATA
  ;; Case-1
  ((a b &optional (c ""))
   (declare (type integer a)
            (type string b c))
   ...)
  ;; Case-2
  ((a (b c)
    &key (x t) (y "") z
    &allow-other-keys)
   (declare (type fixnum a b c)
            (type symbol x)
            (type string y)
            (type list z))
   ...))

(typecase DATA
  ;; Case-1
  ((rte (:cat integer
              string
              (:? string)))
   ...destructuring-bind...)
  ;; Case-2
  ((rte ...complicated...)
   ...destructuring-bind...
   ...))
Regular type expression denoting Case-2
(:cat
 (:cat fixnum (:and list (rte (:cat fixnum fixnum))))
 (:and (:* keyword t)
       (:cat
        (:* (not (member :x :y :z)) t)
        (:or :empty-word
             (:cat (eql :z) fixnum (:* (not (member :x :y)) t)
                   (:? (eql :y) string (:* (not (eql :x)) t)
                       (:? (eql :x) symbol (:* t t))))
             (:cat (eql :z) fixnum (:* (not (member :x :y)) t)
                   (:? (eql :x) symbol (:* (not (eql :y)) t)
                       (:? (eql :y) string (:* t t))))
             (:cat (eql :y) string (:* (not (member :x :z)) t)
                   (:? (eql :z) fixnum (:* (not (eql :x)) t)
                       (:? (eql :x) symbol (:* t t))))
             (:cat (eql :x) symbol (:* (not (member :y :z)) t)
                   (:? (eql :z) fixnum (:* (not (eql :y)) t)
                       (:? (eql :y) string (:* t t))))
             (:cat (eql :y) string (:* (not (member :x :z)) t)
                   (:? (eql :x) symbol (:* (not (eql :z)) t)
                       (:? (eql :z) fixnum (:* t t))))
             (:cat (eql :x) symbol (:* (not (member :y :z)) t)
                   (:? (eql :y) string (:* (not (eql :z)) t)
                       (:? (eql :z) fixnum (:* t t))))))))
Finite State Machine of Case-2 of destructuring-case
[Figure: finite state machine with states 0 to 29; transition labels range from T1 to T21.]
Overlapping Types
Rational type expression with overlapping types
((integer · number) ∪ (number · integer))
(:or (:cat integer number)
     (:cat number integer))
[Figure: state machine with states P0, P1, P2 and P3; transitions labelled integer, number, integer, number.]
Overlapping types must be decomposed into disjoint types
((integer · number) ∪ ((number ∩ ¬integer) · integer))
(:or (:cat integer number)
     (:cat (and number (not integer))
           integer))
[Figure: state machine with states P0, P1, P2 and P3; transitions labelled integer, number ∩ ¬integer, integer, number.]
Overlapping types considered harmful
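Disjoint Set    Decomposed Expression
{ 1 }           A B C D F H
{ 2 }           B C D
{ 3 }           B C D
{ 4 }           C B D
{ 5 }           B C D
{ 6 }           B D C
{ 7 }           C D B
{ 8 }           D B C H
{ 9 }           E
{ 10 }          F
{ 11 }          G
{ 12 }          H D
{ 13 }          D H E
[The set operators between the type names in the second column did not survive extraction.]
[Figure: diagram of the overlapping types A to H and the disjoint regions 1 to 13.]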
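One simple way to obtain disjoint types from overlapping ones is to enumerate every combination of "in" and "not in" for each type and keep the combinations that SUBTYPEP cannot prove empty. The sketch below illustrates that idea only; it is not the algorithm from the talk, the function names are hypothetical, and it is exponential in the number of input types.

```lisp
;; Naive sketch, not the talk's algorithm; names are hypothetical.
(defun provably-empty-type-p (type)
  "True when SUBTYPEP can prove that TYPE denotes the empty type."
  (values (subtypep type nil)))

(defun decompose-types (types)
  "Return pairwise disjoint type specifiers whose union covers TYPES.
Each result is a conjunction that takes every input type either as is
or negated. Conjunctions that SUBTYPEP cannot prove empty are kept."
  ;; Each term is (POSITIVE-P . CONJUNCTS); POSITIVE-P records whether
  ;; at least one input type appears un-negated in the term.
  (let ((terms (list (cons nil '()))))
    (dolist (ty types)
      (setf terms
            (loop for (positive-p . conjuncts) in terms
                  collect (cons t (cons ty conjuncts))
                  collect (cons positive-p
                                (cons `(not ,ty) conjuncts)))))
    (loop for (positive-p . conjuncts) in terms
          for spec = `(and ,@(reverse conjuncts))
          ;; Drop the all-negated term: it lies outside every input type.
          when (and positive-p (not (provably-empty-type-p spec)))
            collect spec)))
```

For instance, (decompose-types '(integer string)) drops the conjunction of integer and string and keeps the two remaining conjunctions. Because the result depends on SUBTYPEP, an implementation that answers "don't know" may leave empty conjunctions in the output.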
How to calculate type disjoint-ness and equivalence.
(defun type-intersection (T1 T2)
  `(and ,T1 ,T2))
(defun types-disjoint-p (T1 T2)
  (subtypep (type-intersection T1 T2) nil))
(defun types-equivalent-p (T1 T2)
  (multiple-value-bind (T1<=T2 okT1T2) (subtypep T1 T2)
    (multiple-value-bind (T2<=T1 okT2T1) (subtypep T2 T1)
      (values (and T1<=T2 T2<=T1) (and okT1T2 okT2T1)))))
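Because SUBTYPEP returns a second value saying whether its answer is definite, callers often need a three-way result rather than a boolean. The following is a minimal sketch of such a wrapper; the function name and the keyword results are assumptions made for illustration.

```lisp
;; Hypothetical helper; the name and result keywords are illustrative.
(defun disjointness (type-a type-b)
  "Classify TYPE-A and TYPE-B as :DISJOINT, :OVERLAPPING or :UNKNOWN."
  (multiple-value-bind (empty-p sure-p)
      (subtypep `(and ,type-a ,type-b) nil)
    (cond (empty-p :disjoint)       ; intersection proven empty
          (sure-p :overlapping)     ; intersection proven non-empty
          (t :unknown))))           ; SUBTYPEP could not decide
```

For example, integer and string are classified as :disjoint and integer and number as :overlapping. Pairs involving satisfies may come back as :unknown, and code that decomposes types has to choose a conservative treatment for that case.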
Interesting Difficulties Encountered
Performance and correctness problems with SUBTYPEP
(subtypep '(and integer (or (eql 1) (satisfies F)))
          '(and integer (or (eql 0) (satisfies G))))
=> NIL, T (should be NIL, NIL)
(subtypep 'compiled-function nil)
=> NIL, NIL (should be NIL, T)
(subtypep '(eql :x) 'keyword)
=> NIL, NIL (should be T, T)
Recursive types forbidden
Neither the CL type system nor the RTE extension is expressive enough
to specify recursive types such as:
(deftype singleton (type)
  `(or (cons ,type nil)
       (cons (singleton ,type))))
(deftype proper-list (type)
  `(cons ,type (or null
                   (proper-list ,type))))
Missing CL API for type reflection and extension
There is no way to ask whether a particular type exists, i.e., is there a type foo?
Given two RTE type specifiers, we can calculate whether one is
a subtype of the other. Unfortunately, CL provides no SUBTYPEP hook
through which to supply this calculation.
Future Research
Static analysis of destructuring-case to detect unreachable code
or overlapping cases.
Investigate performance of type decomposition (disjoint-izing).
Apply to other dynamic languages (e.g., Python, Scala/JVM,
Julia/LLVM).
Summary
Regular expression style type-based pattern matching on CL
sequences.
RTE type allows O(n) type checking of CL sequences.
Non-linear complexity moved to compile time.
Source-code available at
https://www.lrde.epita.fr/wiki/Publications/newton.16.els
Q/A
Questions?