A very compact list type

I still do some programming in C, and like to indulge myself by creating my own container types instead of using the standard template library. One of my favorite custom containers is a null-terminated list that has one-third the memory overhead of std::vector and std::list but is still efficient and growable.

Let's say that you need to store a list of pointers or IDs, and you need to efficiently add to, remove from, or iterate through the list. If the item size is only 8 bytes (e.g. for a 64-bit pointer), then the std::list overhead of 16 bytes per node (plus the malloc chunk overhead) means that at most one third of the allocated memory will be used to store your items. The std::vector doesn't have any per-item overhead, but it does have a fixed overhead of 24 bytes.

A null-terminated list (or vector, if you prefer) has a fixed overhead of just 8 bytes when empty, i.e. the size of a pointer. Because it is null-terminated, it doesn't need a "size" member like std::vector, and a simple trick allows it to be growable without storing its size. In terms of functionality, it is essentially a very compact std::multiset with O(N) operations, i.e. it is at its best when used to store a small number of items. For a large number of items, the small overhead provides little benefit.

Here's an example of a list of pointers, with append() and remove() methods:

  typedef struct _list_type {
      void **entries; /* the list is just a "void**" */
  } list_type;

  void list_append(list_type *l, void *p)
  {
      size_t i = 0;
      size_t n = 0;

      if (l->entries) {
          /* create a new list if none exists */
          l->entries = (void **)malloc(2*sizeof(void *));
      }
      else {
          while (l->entries[n] != 0) { n++; }
          /* if n+1 is a power of two, double the list size */
          if (n+1 >= 2 && (n & (n+1)) == 0) {
              void **tmp = l->entries;
              l->entries = (void **)malloc((n+1)*2*sizeof(void *));
              for (i = 0; i < n; i++) { l->entries[i] = tmp[i]; }
              free(tmp);
          }
      }
      l->entries[n++] = p;
      l->entries[n] = 0; /* null-terminate the list */
  }

  void list_remove(list_type *l, void *p)
  {
      size_t i = 0;

      if (l->entries) {
          /* find the item in the list */
          while (l->entries[i] != 0 && l->entries[i] != p) {
              i++;
          }
          /* remove the item from the list */
          while (l->entries[i] != 0) {
              l->entries[i] = l->entries[i+1];
              i++;
          }
      }
  }


The trick here is the check for when the list length reaches a power of two, which is a hint to double the allocated space. There isn't any code to reduce the amount of allocated memory, it is assumed that after the initial growth of the list, the number of items will remain relatively constant.

I'd like to talk about how I used this to add weak pointers to vtkObjectBase, and about how the "(n-1) & n" power-of-two check works, and how the above code can be modified to work more efficiently with the way malloc() and new [ ] handle memory chunks, but this post has probably raised enough eyebrows already. What can I say, I still love C programming, regardless of all the fancy stuff that C++ provides.