Large Collections

Summary: Adds efficient support for very large collections to Squeak
Author: Stephen Pair
Owner: Stephen Pair (svp)
Co-maintainers: <None>
PackageInfo name: <Not entered>
RSS feed:


This package is not currently loadable. The package owner has been contacted to assist in correcting this. (klc, 7/11/2008)

This package adds several classes to Squeak to enable it to better handle collections that contain a very large number of items. These classes are:

LargeArray forms the basis of the other large collections and can also serve as the basis for other large collections (or existing collection classes instantiated with a LargeArray serving as the base collection).

This package does some minor refactoring of the existing collection classes to enable LargeArray to be 100% compatible with Array.

In Squeak, very large collections present a problem for the garbage collector. When large objects exist in young space, these objects are scanned very frequently by the incremental garbage collector. When a reference to a young object is stored in an old object, incremental garbage collection must also scan those old objects (which presents a performance issue if those objects are very large). LargeArray addresses this issue by "chunking" up the large array into a set of smaller arrays.

One could argue that garbage collection should be made to handle large objects better. It certainly wouldn't hurt if the cost in terms of algorithmic complexity and performance are not too high. However, there are other reasons for breaking large collections up into a set of smaller ones. One key reason is to allow the individual chunks of a large collections to be independently swapped in and out of object memory by a cache manager. The net result is a "paging" effect on the large collection.

LargeIdentityDictionary is just that. See the class comments for more details.

ExtendedAttributeDictionary is designed for use when you want to efficiently lookup extended state for a large number of objects.

TupleAttributeDictionary is useful for looking up state associated with two or more objects.

For more details see the class comments for these classes.