3.1: Questions about the Base Lisp

Q 3.1-1) I'm left with running Lisp processes after I exit my Emacs/xterm. What do I do to avoid this?
Q 3.1-2) Why doesn't make-pathname merge the given :directory component with the directory component in :defaults argument?
Q 3.1-3) I am getting stack overflows and occasional Lisp failure when I sort on large arrays. Why and what can I do?
Q 3.1-4) I have set the stack cushion to a reasonable value, but the soft stack limit is not being detected, and I get a lisp death instead. Why is that?
Q 3.1-5) How can I use run-shell-command in a multiprocessing friendly way?
Q 3.1-6) How can I encode and decode floats quickly, accurately, and without consing, for the purpose of sending them to a separate process?
Q 3.1-7) Why does it take so long to load a file that interns several thousand symbols in a package?
Q 3.1-8) Can I have the debugging evalmode allowing local evaluation (:evalmode :context t) on all the time?


Q 3.1-1) I'm left with running Lisp processes after I exit my Emacs/xterm. What do I do to avoid this?

A 3.1-1) Whether and how Lisp should terminate when its input/output streams are broken is a complicated issue. The current implementation should give the behavior most people want: a Lisp image quietly and immediately ceases execution when its remote initial terminal I/O stream is closed.

If it doesn't, here is some code you can load into an image or otherwise cause to execute (e.g. in ~/.clinit.cl) that may help make Lisp images go away when you want them to.

#-(version>= 4 3)
(progn
  (unless (fboundp 'unix-signal)
    (ff:defforeign 'unix-signal :entry-point (ff:convert-to-lang "signal")))
  (unix-signal 1 0)                              ;SIGHUP
  (unix-signal 15 0)                             ;SIGTERM
)

Q 3.1-2) Why doesn't make-pathname merge the given :directory component with the directory component in :defaults argument?

A 3.1-2) Section 19.4.4 of the ANSI spec says:

After the components supplied explicitly by host, device, directory, name, type, and version are filled in, the merging rules used by merge-pathnames are used to fill in any unsupplied components from the defaults supplied by defaults.

The crucial word here is "unsupplied". By specifying a :directory argument you have supplied the directory component, so the directory component of the :defaults argument is not used. Even specifying :directory nil explicitly supplies a directory component of nil, and this is treated differently from an unsupplied component.
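
To make the distinction concrete, here is a small sketch. The pathnames are made up for illustration, and Unix-style namestring parsing is assumed. Passing :directory to make-pathname replaces the directory taken from :defaults, whereas calling merge-pathnames on a pathname with a relative directory appends that directory to the default one:

```lisp
;; Hypothetical example pathname; assumes Unix-style namestrings.
(defparameter *defaults* #p"/usr/local/lib/foo.cl")

;; :directory is supplied, so the directory of *defaults* is ignored;
;; only the unsupplied name and type are taken from it.
(defparameter *replaced*
  (make-pathname :directory '(:relative "src") :defaults *defaults*))

;; merge-pathnames appends a relative directory to the default directory.
(defparameter *merged*
  (merge-pathnames (make-pathname :directory '(:relative "src")) *defaults*))

(pathname-directory *replaced*) ; => (:RELATIVE "src")
(pathname-directory *merged*)   ; => (:ABSOLUTE "usr" "local" "lib" "src")
```

So if merging of directories is what you want, call merge-pathnames yourself rather than relying on the :defaults argument.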


Q 3.1-3) I am getting stack overflows and occasional Lisp failure when I sort on large arrays. Why and what can I do?

Here is a transcript showing a stack overflow. Note that the array has one million (10^6) elements.

USER(1): (setq pippo (make-array 1000000 :initial-element 0)) 
#(0 0 0 0 0 0 0 0 0 0 ...)
USER(2): (sort pippo #'<)
Error: Stack overflow (signal 1000)
[condition type: SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL]

Restart actions (select using :continue):
 0: continue computation
 1: Return to Top Level (an "abort" restart)
[1c] USER(3): :pop
=================^^^^
USER(4): (sort pippo #'<)
#(0 0 0 0 0 0 0 0 0 0 ...)
USER(5): 

Here I continue the computation and Lisp exits with a segmentation violation:

USER(1): (setq pippo (make-array 1000000 :initial-element 0))
#(0 0 0 0 0 0 0 0 0 0 ...)
USER(2): (sort pippo #'<)
Error: Stack overflow (signal 1000)
 [condition type: SYNCHRONOUS-OPERATING-SYSTEM-SIGNAL]

 Restart actions (select using :continue):
 0: continue computation
 1: Return to Top Level (an "abort" restart)
[1c] USER(3): (sort pippo #'<)
Segmentation fault (core dumped)
%

A 3.1-3) The stack overflow occurs because a large temporary array is being stack-allocated to perform the sort. The maximum size of that array is architecture dependent: Windows platforms only allocate arrays of up to 4 Kbytes on the stack and normally heap-allocate any larger arrays needed, while Unix platforms attempt to allocate arrays of up to 4 Mbytes on the stack. On any architecture, the strategy is programmable, as described below.

When the above error occurs, there are several things that can be done.

  1. Instead of popping out of the break loop as in the example above, just continue. The stack overflow automatically reduces the stack cushion (see documentation for sys:stack-cushion and sys:set-stack-cushion), so continuing should allow further execution.
  2. On Unix platforms only, run a csh and use its limit command to raise the stack limit above its current value before starting Lisp from that shell. We recommend at least 8192 Kbytes (8 megabytes), but if that is not enough, more can be allocated.
  3. Change the sort strategy (documented below). The Allegro CL sort function tries to allocate a temporary array on the stack if possible, so that it does not need to do so on the heap. If this strategy is not acceptable or convenient, change the strategy to either allocate from the heap or to use a pre-existing user supplied array.

Just continuing usually works, as does clearing the stack with :reset and retrying. Note, as the second example above shows, that retrying the sort from within the error prompt (that is, without clearing the error) can result in an abnormal exit from Lisp (Segmentation fault (core dumped)).

This is an unfortunate hole in our stack-overflow detection strategy. Stack overflow is normally checked for on every function call, and enough "slop" is allowed so that functions that allocate an average amount of stack will not cause a hard stack overflow. But if a function allocates large stack objects (such as large temporary vectors), the jump in stack usage is too large to be caught by either the stack cushion or the hardware overflow detection, and stack-overflow death occurs. We hope to guard against such overflow death in some future version of Allegro CL.

Sort Strategy: 

You can tell the system whether to try to stack-allocate things to be sorted. From the documentation in the source code:

;; excl::*simple-vector-sort-strategy*:
;;
;; The sort strategy can be one of three types:
;;   :stack   - try to allocate stack space for the temp sort; this
;;              works easily for 1k elements (4 kbytes), and (on
;;              Unix platforms only) for up to 1m elements (4 mbytes)
;;              if there is enough stack allocated by the os; more
;;              than 1m elements cause a new svector to be allocated.
;;   :alloc   - Allocate an svector of size equal to the vector to sort.
;;              a new one is allocated each time.
;;   <vector> - must be a simple-vector of type t of at least as many
;;              elements as are being sorted. During the sort, the global
;;              is reset to :alloc so that sort is re-entrant.

(defvar excl::*simple-vector-sort-strategy* :stack)
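
As a minimal sketch, the strategy can be changed for a single call by rebinding the variable around sort. The function name sort-with-heap-temp is hypothetical, and the sketch assumes that a dynamic rebinding with let is honored in the same way as a global setting, which the source comments above do not state explicitly. The #+allegro conditional keeps the form loadable in other Lisps, where it falls back to plain sort:

```lisp
;; Hypothetical helper, not part of Allegro CL.
(defun sort-with-heap-temp (vector predicate)
  "Sort VECTOR destructively, asking Allegro CL to heap-allocate the temp vector."
  #+allegro
  (let ((excl::*simple-vector-sort-strategy* :alloc))
    (sort vector predicate))
  #-allegro
  (sort vector predicate))

(sort-with-heap-temp (vector 3 1 2) #'<) ; => #(1 2 3)
```

Setting the variable globally with setq would apply the same choice to every later sort of a simple-vector.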

Q 3.1-4) I have set the stack cushion (see sys:set-stack-cushion and sys:stack-cushion) to a reasonable value, but the soft stack limit is not being detected, and I get a lisp death instead. Why is that?

A 3.1-4) The stack cushion is checked in the "symbol trampoline", a short piece of code that is used when one Lisp function calls another. It is meant to flag normal situations where the stack is growing too quickly, and to signal a condition before a hard stack-size limit is reached.

There are several possible situations where the stack-overflow is not detected by this mechanism, and careful thought must be given as to how to handle it:

  1. A lisp function may allocate a very large stack size, due to either a large number of variables or due to large stack-allocated arrays or lists. If the amount that the function allocates is larger than the difference between the hard stack limit and the soft stack limit set up by the stack cushion, then there will be no chance for the Lisp to signal the condition before the hard limit is reached. The only way to work around this problem is to be sure that there is sufficient stack-cushion for the worst-case function to allocate its needed stack.
  2. A Lisp function might call itself recursively, which on some architectures generates a fast call to location 0 of the same function. The fast call bypasses the symbol trampoline, and with it the stack overflow detection. The workaround is to declare the function notinline within its own body. This produces slightly slower code, but overflows are then detected. Example:
(defun call-me ( ... ) 
  (declare (notinline call-me)) 
  ... 
  (call-me ...) ... ) 
  3. A non-lisp thread may be called, at which time there is no way to limit the stack on some machines. There is no workaround for this problem, other than to reduce one's dependence on non-lisp code.
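
For a concrete instance of the notinline workaround in item 2, here is a small self-recursive function (count-down is a made-up name for illustration). The declaration does not change what the function computes; per the explanation above, it only asks the compiler to treat the self-call as an ordinary call:

```lisp
(defun count-down (n)
  "Return N by recursing N times; deliberately not tail-recursive."
  ;; Assumption from the text above: the notinline declaration routes the
  ;; self-call through the symbol trampoline, so the stack cushion is checked.
  (declare (notinline count-down))
  (if (zerop n)
      0
      (+ 1 (count-down (1- n)))))

(count-down 1000) ; => 1000
```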

Q 3.1-5) How can I use run-shell-command in a multiprocessing friendly way?

A 3.1-5) excl:run-shell-command does not take multiprocessing into consideration. Therefore, if it is called with the :wait argument true (the default is t), all of Lisp waits for the call to complete, not just the process or thread that called run-shell-command. That behavior is what makes it multiprocessing unfriendly. The following is a multiprocessing friendly call to run-shell-command. It does cause the calling process or thread to wait, but it does not cause the entire Lisp process to wait for the shell command to finish.

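(multiple-value-bind (s errs my-pid)
    ;; :wait nil returns immediately; the third value is the subprocess id.
    (run-shell-command "sleep 5; ls /usr/bin" :wait nil)
  (declare (ignore errs s))
  (let ((my-status nil))
    ;; Block only this Lisp process; other Lisp processes keep running
    ;; while the wait function is polled until it returns true.
    (mp:process-wait "for run-shell-command to finish"
                     #'(lambda ()
                         (setq my-status
                               (or my-status
                                   ;; Non-blocking poll: nil until the
                                   ;; subprocess has exited.
                                   (sys:reap-os-subprocess
                                    :pid my-pid :wait nil)))))
    my-status))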

Notes:

  1. Calling run-shell-command with :wait nil allows Lisp to continue in any case, and that might be what you want.
  2. Most things can be done better from Lisp without recourse to run-shell-command, which is inherently risky (to some extent, it puts the fate of the Lisp process in the hands of a non-Lisp program over which Lisp may not have control). Whenever you are tempted to use run-shell-command, consider performing the same action within Lisp (perhaps using foreign functions). For more information, see the run-shell-command description page in the Allegro CL documentation (<Allegro directory>/doc/cl/pages/operators/excl/run-shell-command.htm).
  3. run-shell-command is misnamed for Windows since shell commands (like dir) are precisely what it cannot run (the name comes from the older UNIX implementation). See the description page for run-shell-command and also FAQ item Q 2.6-3) Why can't I use `dir' with run-shell-command?

Q 3.1-6) How can I encode and decode floats quickly, accurately, and without consing, for the purpose of sending them to a separate process?

A 3.1-6) If it is necessary to be able to read the floats as floats either in lisp or in a different language, then it is hard to do efficiently; the algorithm for printing floats in scientific notation is designed to accurately convert to base 10 representation, but the price of this accuracy is time and consing.

However, if the purpose of encoding and decoding these floats is simply to read them in again and create new floats in a different process (for example, a marshalling interface), then the following functions can be used. They are available for releases 5.0 and 5.0.1 as patches and will be included in the next release. The patches are p2a004.001 (in 5.0.1) and p0a015.001 (in 5.0). The functions convert floats to and from 16-bit unsigned values, generally without consing. The 16-bit values can either be used as raw data or be converted to a textual format such as hexadecimal for sending through an interface to another process. The receiving process can then read these numbers and use them to reconstruct the appropriate kind of float.

Note that the representation is the machine representation, used by all languages that support floating point numbers (including C and FORTRAN). Therefore, writing the values to a text file does not lose information, since the text file could in principle be read by a program written in any language.

The functions are excl:single-float-to-shorts, excl:double-float-to-shorts, excl:shorts-to-single-float, and excl:shorts-to-double-float. The formal definitions are:

Function

excl:single-float-to-shorts
Arguments: single-float

excl:double-float-to-shorts
Arguments: double-float

single-float-to-shorts returns (as multiple values) two 16-bit unsigned numbers that represent the highest 16 bits of the argument single-float followed by the lowest 16 bits of the single-float. No consing is done, except on the Sparc, where the pseudo-resourced multiple-values-vector might be consed the first time after a scavenge. Each succeeding call will reuse the values vector normally.

double-float-to-shorts returns (as multiple values) four 16-bit unsigned numbers that represent 16 bit chunks of the argument double-float, starting from the highest significant 16 bits, and ending with the lowest significant 16 bits. No consing is done, except on the Sparc, where the pseudo-resourced multiple-values-vector might be consed the first time after a scavenge. Each succeeding call will reuse the values vector normally.

Function

excl:shorts-to-single-float
Arguments: hi lo

excl:shorts-to-double-float
Arguments: hwhi hwlo lwhi lwlo

shorts-to-single-float returns a boxed single-float number whose bits include the 16 bits from hi as the most-significant 2 bytes, and the 16 bits of lo as its least significant 2 bytes. The only consing that is done is to allocate the single-float.

excl:shorts-to-double-float returns a boxed double-float number whose bits include the 16 bits from hwhi as the most-significant 2 bytes, followed by the 16 bits of hwlo, followed by the 16 bits of lwhi, followed finally by the 16 bits of lwlo as its least significant 2 bytes. The only consing that is done is to allocate the double-float.

Note that for a single-float sf and a double-float df, the following are true:

(= sf (apply 'excl:shorts-to-single-float 
             (multiple-value-list (excl:single-float-to-shorts sf)))) -> T
(= df (apply 'excl:shorts-to-double-float 
             (multiple-value-list (excl:double-float-to-shorts df)))) -> T
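
As an illustration of the textual format mentioned above, the following sketch turns 16-bit values into a hexadecimal string and back. The helper names shorts-to-hex and hex-to-shorts are hypothetical, and unlike the conversion functions themselves these helpers do cons (format builds a fresh string), so they only show the encoding, not a cons-free transport. The sample values assume an IEEE 754 single-float representation:

```lisp
;; Hypothetical helpers, not part of Allegro CL.
(defun shorts-to-hex (&rest shorts)
  "Encode 16-bit unsigned integers as four hex digits each."
  (format nil "~{~4,'0X~}" shorts))

(defun hex-to-shorts (string)
  "Decode a string produced by SHORTS-TO-HEX into a list of integers."
  (loop for start from 0 below (length string) by 4
        collect (parse-integer string :start start :end (+ start 4)
                                      :radix 16)))

;; 1.0f0 has the IEEE single-float bit pattern #x3F800000, so its
;; high and low halves are #x3F80 and #x0000.
(shorts-to-hex #x3F80 #x0000)   ; => "3F800000"
(hex-to-shorts "3F800000")      ; => (16256 0)

;; In Allegro CL the halves would come from the conversion function itself.
#+allegro
(multiple-value-call #'shorts-to-hex (excl:single-float-to-shorts 1.0f0))
```

On the receiving side, the decoded list can be handed to excl:shorts-to-single-float with apply, as in the round-trip forms above.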

Q 3.1-7) Why does it take so long to load a file that interns several thousand symbols in a package?

A 3.1-7) A package has an associated hashtable for the names of symbols in the package. When the size of a package is not specified at creation time, a default hashtable is used. Its initial size is small, allowing for 10 entries, and it tends to grow slowly, growing about 20% each time growth is necessary. Those values are reasonable for most uses, but if you know that a package will have many more symbols, particularly if they will be all created at roughly the same time (as when reading a file that interns thousands of symbols), you should specify the :size keyword argument to defpackage appropriately when creating the package. Thus, if you know the package will eventually have about 4000 symbols, define it with a form like this:

(defpackage :foo (:size 4000) (:use :cl :excl))

Q 3.1-8) Can I have the debugging evalmode allowing local evaluation (:evalmode :context t) on all the time?

A 3.1-8) The Allegro CL debugger allows setting the context to emulate a dynamic-binding style debugging interface, which makes interactions with variables within compiled functions easier. It is enabled by calling the :evalmode top-level command as follows:

:evalmode :context t

But note that this evaluation mode is intended only as a debugging aid, and is not supported for normal extended operation of the lisp system. The emulation is not and can never be perfect, due to the information loss that occurs during compilation in a lexically-scoped lisp like Common Lisp. Therefore, extreme care must be used when running in context mode, and the normal operating environment must be a no-context evaluation mode. Turn the mode off as follows:

:evalmode :context nil

© Copyright 1999, Franz Inc., Berkeley, CA.  All rights reserved.
$Revision: 1.1.2.19 $