Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Scientiﬁc Computing in LISP: beyond the

performances of C

Didier Verna

didier@lrde.epita.fr

http://www.lrde.epita.fr/˜didier

Version 1.3 – November 7, 2006

1/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Introduction

Myths and legends. . .

Facts:



“LISP is slow” . . . NOT ! (it’s been 20 years)

•

Smart compilers (⇒ native machine code)

•

Static typing (types known at compile-time)

•

Safety levels (compiler optimizations)

•

Efﬁcient data structures (arrays, hash tables etc.)



Image processing libraries written in C or C++

(sacriﬁcing expressiveness for performance)



LISP achieving 60% speed of C

(recent studies)

⇒ We have to do better:



Comparative C and LISP benchmarks

(part 1: full dedication)



4 simple image processing algorithms



Pixel storage and access / arithmetic operations

⇒ Equivalent performance

(LISP 10% better in some cases)

2/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Table of contents

1 Experimental Conditions

2 C Programs and Benchmarks

3 LISP programs and benchmarks

Raw LISP

Typed LISP

Results

4 Type inference

3/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Experimental conditions

The algorithms: the “point-wise” class



Pixel assignment / addition / multiplication / division



Soft parameters: image size / type / storage / access



Hard parameters: compilers / optimization level



⇒ More than 1000 individual test cases

The protocol



Debian GNU Linux / 2.4.27-2-686 packaged kernel



Pentium 4 / 3GHz / 1GB RAM / 1MB level 2 cache



Single user mode / SMP off (no hyperthreading)



Measures on 200 consecutive iterations

5/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

C code sample

The add function

void add ( image ∗ to , image ∗ from , f l oa t val )

{

i n t i ;

const i n t n = ima−>n ;

for ( i = 0 ; i < n ; ++ i )

to −>data [ i ] = from−>data [ i ] + va l ;

}

Gcc 4.0.3 (Debian package)

Full optimization: -O3 -DNDEBUG plus inlining

Note: inlining should be almost negligible

7/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Results

In terms of behavior

1D implementation slightly better (10% ⇒ 20%)

Linear access faster (15 ⇒ 35 times)



Arithmetic overhead: only 4x – 6x



Main cause: hardware cache optimization

Optimized code faster (60%) in linear case, irrelevant

in pseudo-random access



Causes currently unknown

Inlining negligible (2%)

8/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Results

In terms of performance

Fully optimized inlined C code

Algorithm Integer Image Float Image

Assignment 0.29 0.29

Addition 0.48 0.47

Multiplication 0.48 0.46

Division 0.58 1.93

Not much difference between pixel types

Surprise: integer division should be costly



“Constant Integer Optimization” (with inlining)



Do not neglect inlining !

9/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

First shot at LISP code

The add function, take 1

( defun ad d ( to from v a l )

( l e t ( ( si z e ( array−dimension to 0 ) ) )

( dotimes ( i siz e )

( setf ( ar ef to i ) (+ ( aref from i ) v a l ) ) ) ) )

COMMON-LISP’s standard simple-array type

Interpreted version: 2300x

Compiled version: 60x

Optimized version: 20x

Untyped code ⇒ dynamic type checking !

11/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Typing mechanisms

Typing paradigm:



Type information (COMMON-LISP standard)

Declare the expected types of LISP objects



Type information is optional

Declare only what you know; give hints to the compilers



Both a statically and dynamically typed language

Typing mechanisms:



Function arguments:

(make-array size :element-type ’single-float)



Type declarations:

Function parameter / freshly bound local variable



. . .

12/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Typed LISP code sample

Declaring the types of function parameters

The add function, take 2

( defun ad d ( t o from v a l )

( d ec la re ( t ype ( s imp le− array si n g l e − f l o a t ( ∗ ) ) t o from ) )

( d ec la re ( t ype s i ng l e − f l o a t v a l ) )

( l e t ( ( si z e ( array−dimension to 0 ) ) )

( dotimes ( i siz e )

( s e tf ( a re f to i ) (+ ( a re f from i ) v a l ) ) ) ) )

simple-array’s . . .

of single-float’s . . .

unidimensional.

13/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Object representation

Why typing matters for performance

Dynamic typing ⇒ objects of any type (worse: any size)

LISP variables don’t carry type information: objects do

The “boxed” representation of LISP objects

Type information

Pointer to Lisp Object

Actual value

Dynamic type checking is costly !

Pointer dereferencing is costly !

14/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

The beneﬁts of typing

2 examples

Array storag e layout:



Homogeneous arrays of a known type

⇒ native representation usable



Specialization of the aref function



“Open Coding”

Immediate objects:



Short (less than a memory word)



Special “tag bits” (invalid as pointer values)



⇒ Encoded inline

Unboxed fixnum representation

Tag bits

Bits 1 ... 29 30 31 32

fixnum value (30 bits)

15/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Example: optimizing a loop index

(dotimes (i 100) ...)

Disassembly of a dotimes macro

58701478: .ENTRY FOO( )

90 : POP DWORD PTR [EBP−8]

93 : LEA ESP, [EBP−32]

96 : XOR EAX, EAX

98 : JMP L1

9A: L0 : ADD EAX, 4

9D : L1 : CMP EAX, 4 00

A2 : JL L0

A4 : MOV EDX, #x2800000B

A9 : MOV ECX, [EBP−8]

AC: MOV EAX , [ EBP−4]

AF: ADD ECX, 2

B2 : MOV ESP, EBP

B4 : MOV EBP, EAX

B6 : JMP ECX

16/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Activating optimization

“Qualities” (COMMON-LISP standard): between 0 and 3

safety, speed etc.

Global or local declarations in source code

(no compiler ﬂag)

Global qualities declaration

(declaim (optimize (speed 3)

(compilation-speed 0)

(safety 0)

(debug 0)))

Safe code: declarations treated as assertions

Optimized code: declarations trusted

17/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Final LISP code sample

The add function

( defun ad d ( t o from v a l )

( d ec la re ( t ype ( s imp le− array si n g l e − f l o a t ( ∗ ) ) t o from ) )

( d ec la re ( t ype s i ng l e − f l o a t v a l ) )

( l e t ( ( si z e ( array−dimension to 0 ) ) )

( dotimes ( i siz e )

( s e tf ( a re f to i ) (+ ( a re f from i ) v a l ) ) ) ) )

CMU-CL (19c), SBCL (0.9.9), ACL (7.0)

Full optimization: (speed 3), 0 elsewhere

Array type: 1D, 2D

Array access: aref, row-major-aref, svref

18/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Comparative results

In terms of behavior

= Plain 2D implementation much slower (2.8x ⇒ 4.5x)

= Linear access faster (30 times)



Same reasons, same behavior. . .

= Optimized code faster in linear case, irrelevant in

pseudo-random access

= Gain more important in LISP (3x ⇒ 5x)

= Gain more important on ﬂoating point numbers

⇒ In LISP, safety is costly

= Inlining negligible

= No “Constant Integer Optimization”

= Negative impact on performance (-15%), if any

⇒ Inlining still a “hot” topic (register allocation policies ?)

19/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Comparative results

In terms of performance

Pseudo-random access

Assignment Addition Multiplication Division

Execution time (seconds)

Rear to Front: ACL / SBCL / CMU-CL / C

Integer

Floating Point

Assignment: LISP 19% faster than C

Other: insigniﬁcant (5%)

Exception: integer division

20/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Comparative results

In terms of performance

Linear access

Assignment Addition Multiplication Division

Execution time (seconds)

Rear to Front: ACL / SBCL / CMU-CL / C

Integer

Floating Point

A CL: poor performance

CMU-CL, SBCL: strictly equivalent to C

C wins on integer division, loses on ﬂoating-point one

21/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Type inference

A weakness of COMMON-LISP . . .

Static typing cumbersome (source code annotations)



Can we provide minimal type declarations . . .



. . . and rely on type inference ?

Incremental typing by compilation log examination

Unfortunately:



Compiler messages not necessarily ergonomic



Type inference systems not necessarily clever

23/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Example of (missing) type inference

multiply excerpt

; ; . . .

( d ec la re ( t ype ( s imp le− array fixnum ( ∗ ) ) to from ) )

( d ec la re ( t ype fixnum val ) )

; ; . . .

( s e tf ( a re f to i ) ( the fixnum (∗ ( a re f from i ) v a l ) ) ) ) ) )

(

fixnum fixnum) = fixnum in general, but. . .



to declared as an array of fixnum’s,



so the multiplication has to return a ﬁxnum

CMU-CL and SBCL ok, ACL not ok.



Need for further explicit type information



worse in ACL:

declared-fixnums-remain-fixnums-switch

24/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

In terms of behavior



External parameters: no surprise



Internal parameters: differences, attenuated by

optimization

In terms of performance



Comparable results in both languages



Very smart LISP compilers (given language

expressiveness)

However:



Typing can be cumbersome



Difﬁcult to provide both correct and minimal information

(weakness of the COMMON-LISP standard)



Inlining is still an issue

25/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Perspectives

Low level: try other compilers / architectures

(and compiler / architecture speciﬁc optimization

settings)

Medium level: try more sophisticated algorithms

(neighborhoods, front-propagation)

High level: try different levels of genericity

(dynamic object orientation, static meta-programming)

Do not restrict to image processing

26/27

Scientiﬁc

Computing in

LISP

Didier Verna

Introduction

Experiments

The case of C

The case of

LISP

Raw LI SP

Typed LISP

Results

Type inference

Conclusion

Quesλ ions ?

Logo by Manfred Spiller

27/27