Lab: Abstract Datatypes

We’ve already seen one way to implement a sequence abstract datatype: linked lists. But as we’ve discussed, a linked list is an implementation detail that we can hide from the end user of an abstract datatype. To illustrate this, today’s lab will guide you through an implementation of the list ADT using an array as the underlying storage mechanism rather than a linked list.

To get started, download the starter code archive adts.tar.gz and run the following commands:

$ cd ~/csc161/labs
$ tar xvzf ~/Downloads/adts.tar.gz
$ cd adts
$ code

A. Getting our Bearings

First, take a look around the code provided in the starter code archive. This should look very familiar: with only minor exceptions, it is the same as the starter code for our first linked list lab.

As a reminder, here’s what you’ll find in the three provided source files:

main.c
The program’s main function and some general utilities. The provided code includes a simple command interface to interact with a list.
string_list.h
The interface for a list that holds strings. The provided code includes a definition of a string_list_t struct, which contains the fields required to implement a list of strings.
string_list.c
The implementation of a string list, which you will fill in for this lab.

If you remember the linked list version of this datatype you’ll notice that we don’t have a node type in string_list.h anymore; that’s because we won’t need a node for an array list. Instead, our list will be implemented as an array of strings (that is, an array of char* values, which is a char**).

The provided code also includes a Makefile, which will build two versions of the program: arraylist-practice and arraylist-practice-asan. The latter option is built with AddressSanitizer, which you’ll want to use for testing.

Once you’ve reviewed the provided code you can move on to the next part of the lab.

B. Storing and Initializing an Array-Based List

Our first step is to decide what data we need to represent an array-based list of strings, and to put fields into the string_list_t struct to hold those values. Along with those fields, you’ll also need to decide how you will initialize an empty list.

Exercises

  1. What values need to go into the string_list_t structure to hold an array of char* values? Remember, you will be expanding and contracting the list as values are inserted or appended. Once you have an idea, add the necessary fields to the string_list_t struct and add comments to explain what each of them will do. If you’re unsure about your design you can ask the instructor or a mentor to check your work.

  2. Fill in the string_list_init function to initialize an empty list. The initialization should follow naturally from your chosen struct fields; if you’re unsure what to do here you may need to re-evaluate what goes in your string_list_t struct.

C. Appending to the End of an Array-Based List

Unlike with linked lists—where inserting at the beginning is simple—the easiest way to add a value to an array-based list is to append to the end of the list. For this part of the lab, you’ll implement the string_list_append function.

Exercises

  1. What are the main steps required to grow an array-based list and append a value to the end? Write these steps out as comments in the string_list_append function before you write any code.

  2. Fill in the implementation of string_list_append. You can interleave this implementation with the comments you wrote in the previous exercise. This is a good practice to use any time you are writing complex code: writing comments first helps you think abstractly about the process you are implementing before you have to translate those steps into code.

D. Printing and Getting the Length of an Array-Based List

We’d like to test our array-based list implementation, but that’s hard to do when there’s no way to look at what’s in a list. For this part of the lab you will implement the string_list_print and string_list_length functions.

Exercises

  1. Implement string_list_print, which should be a fairly simple loop.

  2. Implement string_list_length, which should be an extremely simple, one-line function. If it isn’t, ask for help because you’ve likely taken a wrong turn somewhere earlier in your implementation.

  3. Now that you can check a list’s length and print its contents you can test the list implementation so far. Run ./arraylist-practice and try out the append, print, and length commands to make sure everything works as it should. You can use the AddressSanitizer version of the program if you want, but we’d expect the program to report a memory leak on exit because we don’t have any code to free memory yet.

E. Destroying an Array-Based List

Next, let’s write code to free memory allocated to hold the array-based list. This will be much simpler than freeing memory for a linked list, since the list itself is a single contiguous array rather than a series of nodes linked together. Don’t forget to free the strings stored in the list, though, since they are owned by the list data structure.

Exercises

  1. Implement string_list_destroy and test your implementation by running the AddressSanitizer version of the program. Make sure you don’t have any reported memory leaks before moving on.

F. Counting Values in an Array-Based List

Just as we did with the linked-list implementation, we’re going to add a function to count the number of times a string appears in the array-based list. Remember that string comparison uses the strcmp function, not the == operator.

Exercises

  1. Implement string_list_count, and test your implementation.

G. Inserting a Value at the Beginning of an Array-Based List

So far, it may seem like array-based lists are simpler to use than linked lists. That’s true in many cases, but this is where they can be tricky. Because an array-based list stores all of its values in contiguous memory (an array), inserting anywhere other than the end requires shifting values around. This has a cost both in terms of the complexity of your code, and the number of operations required to complete the insertion. The exercises below will guide you through this implementation.

Exercises

  1. First, write out the main steps required to insert a value at the beginning of your list. Don’t forget that you’ll need to grow the array like you did for the append operation, but there are extra steps here. Write these steps as comments in the string_list_insert function.

  2. Fill in your implementation, interleaved with the comments you wrote in the previous exercise. You may be tempted to use a loop to move values over, but there is a C standard library function you should use instead: memmove. Review the manpage for this function and use it in your implementation.

  3. Test your implementation using the AddressSanitizer version of the program. There’s a good chance you’ll encounter some memory errors in this phase, since it can be tricky to get array insertion right on the first attempt. Make sure you’ve resolved those errors before you move on.

H. Removing a Value from an Array-Based List

Finally, we’re at the most difficult part of the array-based list implementation. Remember that the string_list_remove function will find the first occurrence of a given string in the list and remove that element from the list. Like with insertion, this will require moving values over to maintain the contiguous array. But unlike insertion, this could happen anywhere in the list: the beginning, the end, or anywhere in the middle. The exercises below will guide you through this implementation.

Exercises

  1. Use comments to lay out the major steps required to locate and remove the first matching value from an array-based list. In your comments, include a discussion of any edge cases you need to handle specially; does removing from the beginning or end of the list need special handling? Why, or why not?

  2. Fill in your implementation, again using the memmove function whenever you need to shift values in the array. If you’re confused by this function you could start by using a loop, but you should come back and replace it with an equivalent call to memmove at the end. That’s because memmove is often replaced by an optimized implementation that can move data much faster than a loop that assigns one value at a time.

  3. Finally, test your implementation thoroughly by running in the AddressSanitizer version of the program. Make sure you check edge cases for the remove operation. As always, you should not see any AddressSanitizer errors; fix any errors you encounter before finishing the lab.

Acknowledgments

This lab is loosely based on the activities assigned in Peter-Michael Osera’s abstract datatypes lab for CSC 161.