Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Scientific Computing in LISP: beyond the
performances of C
Didier Verna
didier@lrde.epita.fr
http://www.lrde.epita.fr/˜didier
Version 1.3 November 7, 2006
1/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Introduction
Myths and legends. . .
Facts:
“LISP is slow” . . . NOT ! (it’s been 20 years)
Smart compilers ( native machine code)
Static typing (types known at compile-time)
Safety levels (compiler optimizations)
Efficient data structures (arrays, hash tables etc.)
Image processing libraries written in C or C++
(sacrificing expressiveness for performance)
LISP achieving 60% speed of C
(recent studies)
We have to do better:
Comparative C and LISP benchmarks
(part 1: full dedication)
4 simple image processing algorithms
Pixel storage and access / arithmetic operations
Equivalent performance
(LISP 10% better in some cases)
2/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Table of contents
1 Experimental Conditions
2 C Programs and Benchmarks
3 LISP programs and benchmarks
Raw LISP
Typed LISP
Results
4 Type inference
3/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Experimental conditions
The algorithms: the “point-wise” class
Pixel assignment / addition / multiplication / division
Soft parameters: image size / type / storage / access
Hard parameters: compilers / optimization level
More than 1000 individual test cases
The protocol
Debian GNU Linux / 2.4.27-2-686 packaged kernel
Pentium 4 / 3GHz / 1GB RAM / 1MB level 2 cache
Single user mode / SMP off (no hyperthreading)
Measures on 200 consecutive iterations
5/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
C code sample
The add function
void add ( image to , image from , f l oa t val )
{
i n t i ;
const i n t n = ima>n ;
for ( i = 0 ; i < n ; ++ i )
to >data [ i ] = from>data [ i ] + va l ;
}
Gcc 4.0.3 (Debian package)
Full optimization: -O3 -DNDEBUG plus inlining
Note: inlining should be almost negligible
7/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Results
In terms of behavior
1D implementation slightly better (10% 20%)
Linear access faster (15 35 times)
Arithmetic overhead: only 4x 6x
Main cause: hardware cache optimization
Optimized code faster (60%) in linear case, irrelevant
in pseudo-random access
Causes currently unknown
Inlining negligible (2%)
8/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Results
In terms of performance
Fully optimized inlined C code
Algorithm Integer Image Float Image
Assignment 0.29 0.29
Addition 0.48 0.47
Multiplication 0.48 0.46
Division 0.58 1.93
Not much difference between pixel types
Surprise: integer division should be costly
“Constant Integer Optimization” (with inlining)
Do not neglect inlining !
9/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
First shot at LISP code
The add function, take 1
( defun ad d ( to from v a l )
( l e t ( ( si z e ( arraydimension to 0 ) ) )
( dotimes ( i siz e )
( setf ( ar ef to i ) (+ ( aref from i ) v a l ) ) ) ) )
COMMON-LISPs standard simple-array type
Interpreted version: 2300x
Compiled version: 60x
Optimized version: 20x
Untyped code dynamic type checking !
11/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Typing mechanisms
Typing paradigm:
Type information (COMMON-LISP standard)
Declare the expected types of LISP objects
Type information is optional
Declare only what you know; give hints to the compilers
Both a statically and dynamically typed language
Typing mechanisms:
Function arguments:
(make-array size :element-type ’single-float)
Type declarations:
Function parameter / freshly bound local variable
. . .
12/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Typed LISP code sample
Declaring the types of function parameters
The add function, take 2
( defun ad d ( t o from v a l )
( d ec la re ( t ype ( s imp le array si n g l e f l o a t ( ) ) t o from ) )
( d ec la re ( t ype s i ng l e f l o a t v a l ) )
( l e t ( ( si z e ( arraydimension to 0 ) ) )
( dotimes ( i siz e )
( s e tf ( a re f to i ) (+ ( a re f from i ) v a l ) ) ) ) )
simple-arrays . . .
of single-floats . . .
unidimensional.
13/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Object representation
Why typing matters for performance
Dynamic typing objects of any type (worse: any size)
LISP variables don’t carry type information: objects do
The “boxed” representation of LISP objects
Type information
Pointer to Lisp Object
Actual value
Dynamic type checking is costly !
Pointer dereferencing is costly !
14/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
The benefits of typing
2 examples
Array storag e layout:
Homogeneous arrays of a known type
native representation usable
Specialization of the aref function
“Open Coding”
Immediate objects:
Short (less than a memory word)
Special “tag bits” (invalid as pointer values)
Encoded inline
Unboxed fixnum representation
00
0
1
Tag bits
Bits 1 ... 29 30 31 32
fixnum value (30 bits)
15/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Example: optimizing a loop index
(dotimes (i 100) ...)
Disassembly of a dotimes macro
58701478: .ENTRY FOO( )
90 : POP DWORD PTR [EBP8]
93 : LEA ESP, [EBP32]
96 : XOR EAX, EAX
98 : JMP L1
9A: L0 : ADD EAX, 4
9D : L1 : CMP EAX, 4 00
A2 : JL L0
A4 : MOV EDX, #x2800000B
A9 : MOV ECX, [EBP8]
AC: MOV EAX , [ EBP4]
AF: ADD ECX, 2
B2 : MOV ESP, EBP
B4 : MOV EBP, EAX
B6 : JMP ECX
16/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Activating optimization
“Qualities” (COMMON-LISP standard): between 0 and 3
safety, speed etc.
Global or local declarations in source code
(no compiler flag)
Global qualities declaration
(declaim (optimize (speed 3)
(compilation-speed 0)
(safety 0)
(debug 0)))
Safe code: declarations treated as assertions
Optimized code: declarations trusted
17/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Final LISP code sample
The add function
( defun ad d ( t o from v a l )
( d ec la re ( t ype ( s imp le array si n g l e f l o a t ( ) ) t o from ) )
( d ec la re ( t ype s i ng l e f l o a t v a l ) )
( l e t ( ( si z e ( arraydimension to 0 ) ) )
( dotimes ( i siz e )
( s e tf ( a re f to i ) (+ ( a re f from i ) v a l ) ) ) ) )
CMU-CL (19c), SBCL (0.9.9), ACL (7.0)
Full optimization: (speed 3), 0 elsewhere
Array type: 1D, 2D
Array access: aref, row-major-aref, svref
18/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Comparative results
In terms of behavior
= Plain 2D implementation much slower (2.8x 4.5x)
= Linear access faster (30 times)
Same reasons, same behavior. . .
= Optimized code faster in linear case, irrelevant in
pseudo-random access
= Gain more important in LISP (3x 5x)
= Gain more important on floating point numbers
In LISP, safety is costly
= Inlining negligible
= No “Constant Integer Optimization”
= Negative impact on performance (-15%), if any
Inlining still a “hot” topic (register allocation policies ?)
19/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Comparative results
In terms of performance
Pseudo-random access
Assignment Addition Multiplication Division
0
5
10
15
20
25
30
Execution time (seconds)
Rear to Front: ACL / SBCL / CMU-CL / C
Integer
Floating Point
Assignment: LISP 19% faster than C
Other: insignificant (5%)
Exception: integer division
20/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Comparative results
In terms of performance
Linear access
Assignment Addition Multiplication Division
0
1
2
Execution time (seconds)
Rear to Front: ACL / SBCL / CMU-CL / C
Integer
Floating Point
A CL: poor performance
CMU-CL, SBCL: strictly equivalent to C
C wins on integer division, loses on floating-point one
21/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Type inference
A weakness of COMMON-LISP . . .
Static typing cumbersome (source code annotations)
Can we provide minimal type declarations . . .
. . . and rely on type inference ?
Incremental typing by compilation log examination
Unfortunately:
Compiler messages not necessarily ergonomic
Type inference systems not necessarily clever
23/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Example of (missing) type inference
multiply excerpt
; ; . . .
( d ec la re ( t ype ( s imp le array fixnum ( ) ) to from ) )
( d ec la re ( t ype fixnum val ) )
; ; . . .
( s e tf ( a re f to i ) ( the fixnum ( ( a re f from i ) v a l ) ) ) ) ) )
(
*
fixnum fixnum) = fixnum in general, but. . .
to declared as an array of fixnums,
so the multiplication has to return a fixnum
CMU-CL and SBCL ok, ACL not ok.
Need for further explicit type information
worse in ACL:
declared-fixnums-remain-fixnums-switch
24/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Conclusion
In terms of behavior
External parameters: no surprise
Internal parameters: differences, attenuated by
optimization
In terms of performance
Comparable results in both languages
Very smart LISP compilers (given language
expressiveness)
However:
Typing can be cumbersome
Difficult to provide both correct and minimal information
(weakness of the COMMON-LISP standard)
Inlining is still an issue
25/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Perspectives
Low level: try other compilers / architectures
(and compiler / architecture specific optimization
settings)
Medium level: try more sophisticated algorithms
(neighborhoods, front-propagation)
High level: try different levels of genericity
(dynamic object orientation, static meta-programming)
Do not restrict to image processing
26/27
Scientific
Computing in
LISP
Didier Verna
Introduction
Experiments
The case of C
The case of
LISP
Raw LI SP
Typed LISP
Results
Type inference
Conclusion
Quesλ ions ?
Logo by Manfred Spiller
27/27