Lab: Unions and Enumerations

Today’s lab will help you practice using unions and enumerations in C. While unions are less common than structures, they’re the best choice in some situations you’ll encounter as a C programmer, especially the pattern we’ve read about called tagged unions.

A. Enumerations

First we will look at enumerations by themselves in C. The exercises below reference the following enum definition:

enum color {
  RED, GREEN, BLUE, ORANGE, PURPLE, YELLOW, WHITE, BLACK
};

Exercises

  1. Given the definition of color above, what value will display when you run this line of code? Make a prediction, then compile and run a program containing this code to check your work.
    printf("BLUE is %d\n", BLUE);
    
  2. Given your observations about the value of BLUE in the previous exercise, what do you think the value of BLUE will be if we change color to the following definition? As before, make a prediction and then check it by running this code.
    enum color {
      RED=57, GREEN, BLUE, ORANGE, PURPLE, YELLOW, WHITE, BLACK
    };
    
  3. We sometimes write colors as hexadecimal numbers like 0x333DFF. In this example, the color has blue value 0xFF, green value 0x3D, and red value 0x33. This format, which describes a color by its red, green, and blue components, appears frequently in web programming. Write a new version of enum color that uses reasonable RGB values for each of the enum values. For example, 0xFF0000 is a good value for RED. You can use the tool at https://htmlcolorcodes.com to find RGB color codes for red, green, blue, orange, etc.

  4. You’ve probably figured out by now that, under the hood, enums are just represented as integers. That means we can use integer operators on enum values as well. Using your updated color definition from the previous exercise, predict the values you’ll get from these expressions:
    • RED | BLUE
    • BLUE / 2
    • ORANGE & PURPLE
    • YELLOW - GREEN / 2
    • BLUE + PURPLE

    Print each of the values as a hexadecimal number, and then use the tool at https://htmlcolorcodes.com to check what color the expression describes. Try to explain why these expressions produce the values (and colors) you observed.

B. Unions

Next, we’ll look at unions in isolation. You might notice that this way of using and examining unions is reminiscent of how values are handled when we use casting. That is not a coincidence!

The exercises below will refer to the following union definition:

union sample {
  int num;
  short small_num;
  char str[4];
};

Exercises

  1. What is the size of a union sample value? Make a prediction, then use the sizeof operator to check your guess. If you guessed incorrectly, explain what you missed before moving on.

  2. If you create a variable u of type union sample, then assign u.num to be 97, what do you think will display when you print the first character of u.str like this?
    printf("u.str[0] is %c\n", u.str[0]);
    

    Make a prediction, then check your guess by running this code.

  3. Set u.num to be 12345678. What do you expect to see when you print u.small_num? Make a prediction, then check your guess by running the code.

  4. Finally, set u.str to be the string "abc". What value do you expect to see when you print u.num? Hint: you will need to assign to u.str one character at a time, or you can initialize u with a designated initializer, which you should have seen in the reading for both unions and structs:
    union sample u = {.str = "abc"};
    

    Write down an explanation of why u.num has the value you observed when you assign to u.str; there’s a good chance you’ll get this wrong on your first try, but you have the tools to make sense of the value you see in u.num.

C. Tagged Unions

One of the most common uses for unions in C is to store values that are a mixture of different types. For example, we might have a collection of values to keep track of where some are integers, some are strings, some are floating point values, and so on. We can do this using a union, plus a tag, which is a value that tells us how to interpret the union:

struct value {
  enum {CHAR, INT, FLOAT, STRING} tag;
  union {
    char c;
    int i;
    float f;
    char* s;
  };
};

The top-level type is a struct because we want the tag to be stored separately from the actual value. The stored value is in an anonymous union, which lets you access its fields with a single dot after a value of type struct value.

As an example, we can use this struct to store a character like this:

struct value v;
v.tag = CHAR;
v.c = 'x';

Designated initializers let us write this more concisely:

struct value v = {.tag = CHAR, .c = 'x'};

Complete the following exercises using the definition of struct value above.

Exercises

  1. Create a variable w of type struct value. Initialize the value so it stores the string “Hello world”.

  2. Create a variable x of type struct value that stores the floating point value 3.14159.

  3. Create a variable y of type struct value that stores the integer value 3820.

  4. Write a function called print_value that takes a struct value parameter and prints it. Your implementation will need to check the value of tag to know how to print the value that was passed in. Test your implementation by printing the values v, w, x, and y defined above.

D. Representing Racket/Scheme/Scamper Values

We can extend our simple value type to store another useful type: a pair. If you didn’t notice already, tagged unions work similarly to the values we use in Racket programs, where you aren’t obligated to specify a type ahead of time. Adding a pair type will let us represent pair structures in our C program.

To do this, you’ll need to revise the definition of value and add a pair struct:

// Declare (but do not define) the pair struct.
// This allows us to use the name of this struct later even though we haven't defined it yet.
struct pair;

struct value {
  enum {CHAR, INT, FLOAT, STRING, PAIR, NIL} tag;
  union {
    char c;
    int i;
    float f;
    char* s;
    struct pair* p;
  };
};

struct pair {
  struct value car;
  struct value cdr;
};

This updated definition allows us to create struct value values that hold pairs, as well as a NIL value that represents an empty list. We call this value NIL in our program so its name doesn’t conflict with C’s NULL value.

Using these types, we can now create a pair value like this:

// Create a value to hold the character `a`
struct value a = {.tag = CHAR, .c = 'a'};

// Create a value to hold the value of pi:
struct value pi = {.tag = FLOAT, .f = 3.14159};

// Create a pair holding `a` and `pi`
struct pair p = {.car = a, .cdr = pi};

// Finally, wrap the pair in a value
struct value v = {.tag = PAIR, .p = &p};

While this is admittedly a lot longer than (let ([v (cons #\a 3.14159)]) ...), it does allow us to represent the same data as the equivalent Racket code. Despite the difference in verbosity, C and Racket do very similar things here; values in Racket are stored as tagged unions, and pairs have a very similar structure to what we’re using.

You might find it convenient to write helper functions to produce values, like this one that produces characters:

struct value make_char(char c) {
  struct value v = {.tag = CHAR, .c = c};
  return v;
}

These helpers will work for everything except the PAIR tag because the .p field in struct value is a pointer. As you saw in the example above, the value assigned to .p must be the address of a struct pair, and any address you take inside of a make_pair function would be the address of a local variable, which no longer exists once make_pair returns.

Use the definitions above to complete the following exercises.

Exercises

  1. Why do you think the field p inside of struct value is a pointer type? What will happen if you remove the star from its type? Try making this change to check your guess, then put the star back so your code will compile. Make sure you can explain why the star is necessary before moving on (the instructor or mentor can help if you aren’t able to figure it out).

  2. Create a value that holds null in Racket. You’ll need to use the tag NIL for this in our implementation.

  3. Create a value equivalent to the list produced by this Racket code:
    (list 3 "hello")
    

    Remember that a list is a pair structure. The list above is equivalent to (cons 3 (cons "hello" null)). Don’t forget that we renamed the Racket value null to the tag NIL above.

  4. Write a Racket expression that produces a value containing at least two pairs, then represent it in the C structure we’ve set up.

  5. Optional Challenge: Extend your print_value implementation so it can display values of type NIL and PAIR. You don’t need to print values exactly as Racket would. The easiest way to print pairs is to wrap them in parentheses and separate the values with a dot. For example, (cons 4 "hello") can be printed as (4 . "hello").