Changelog:

  • 4 September 2023: add some more explanation of what attaching a pointer to a shared memory region does
  • 4 September 2023: make instructions be more explicit about shared memory regions
  • 5 September 2023: adjust introductory description to describe overall design of chat program.
  • 5 September 2023: add parenthetical about what cleanup function is for in step 2
  • 6 September 2023: avoid suggesting feof(stdin), instead suggest checking fgets return value
  • 6 September 2023: add note about scanf not reading a full line in PID section.
  • 6 September 2023: consistently use inbox_filename (not inbox_name) in shared memory code

In this lab you’ll write yet another chat application.

It will only work on a single machine, not over a network file system or the Internet, but it will also require significantly fewer system resources than our other chats.

If you are using ssh, this means you need to ssh into the same back-end server twice. To do this, first SSH into portal.cs.virginia.edu and run uname -n to find out what server you are on (portal## for some two-digit number ##). Then, have your second SSH go directly into that portal##.cs.virginia.edu.

Each instance of the chat program will have a shared memory region in which it receives messages. When sending a message, one instance of the chat program will modify the other’s shared memory region, then send a signal to notify the other program that a new message is available. The receiver will print out the message from its signal handler. Meanwhile, the sender will wait until the receiver acknowledges receiving the message by resetting the shared memory region before prompting for the next message to send.

To complete the signals-based chat program described above, do the following. We recommend doing them in order. Each is described in more detail below.

Note the above instructions regarding testing on portal.

I would recommend working in pairs on this lab, but note that both instances of the chat program need to run as the same user. (You can’t send signals to another user’s programs.)

  1. Make a program that prints its own PID (process identifier) and reads another PID from the user (store it in a global variable)
  2. Create a signal handler for the SIGINT, SIGTERM, and SIGUSR1 signals
    • SIGTERM should invoke a cleanup function (can be a placeholder for now; later you will make it handle freeing shared memory regions)
    • SIGINT should invoke a cleanup function and then send SIGTERM to the other user
    • SIGUSR1 should display the contents of the inbox (eventually in shared memory; make a pointer to a placeholder character array for now) and then set its first character to be '\0'
    • (Although normally one should take care to only use signal-safe functions in your signal handlers, for this lab, we do not care if you follow this rule.)
  3. Setup access to two different shared memory regions:
    • An inbox which this process will use to read values from the other process.

    • The inbox of the other process, which we’ll call the outbox. You’ll use this to write values to the other process.

You can follow the instructions below under Shared memory to create this.

  1. Repeat until the end of file:

    • Read a typed line into the outbox (without overflow)
    • Wait until the outbox is emptied
  2. After the user stops typing (i.e., EOF), send SIGTERM to the other process

  3. Make the cleanup function destroy both the inbox and outbox

    The cleanup function should execute no matter how the program exits: via end-of-input, SIGINT, or SIGTERM.

  4. Upload your solution to the submission site or show a TA your program for checkoff.

1 PID

getpid() (from unistd.h) returns the PID of the current process as a pid_t, which can be treated like an int in most situations. This can be printed (e.g., with printf) by one process and input to the other (e.g., with scanf).

If you use something like scanf("%d", ...), note that this does not read a full line. For example, if the input is "12\n", then it will read only the "12", leaving "\n" on the input. This means a later call to fgets() will return that "\n" as an empty line immediately, which you may need to account for later.

2 Signals

As shown in class, the key parts of signal handling are defining a handler function:

#include <signal.h>

static void your_handler_function_name_here(int signum) {
    if (signum == SIGINT) /* ... */
}

and the code in main to tell the OS to use it and for which signals:

struct sigaction sa;
sa.sa_handler = handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = SA_RESTART;

sigaction(SIGINT, &sa, NULL);
sigaction(SIGTERM, &sa, NULL);
/* ... */

To send a signal, use kill(pid_to_send_to, SIGTERM) or the like.

3 Shared memory

One of the advantages of virtual memory is the OS can map the same physical page to different virtual pages in different processes. The processes can then directly communicate by loading what the other stored in that shared memory.

In this assignment, we’ll use two regions of shared memory. If there are two chat processes running A and B, there will be an inbox for A which is also the outbox for B; and there will be an inbox for B which is also the outbox for B. In both A and B, you’ll need to map each of these memory regions so you can access them.

Because shared memory needs to be able to span processes, it is a bit more involved to create and destroy than ordinary memory is. There are multiple ways to create shared memory; in this assignment, it is recommended to use shm_open and mmap. shm_open will give you access to special files that are kept in memory instead of persistent storage. Then, mmap takes a file and gives you access to it as a region of memory. (mmap also works for files that are stored normally on disk, though we don’t recommend that for this lab.)

  1. Choose a name for each process’s inbox (similar to a filename; in Linux’s implementation of shared memory, this will become a file in the special directory /dev/shm). To allow finding inboxes using PIDs and avoid interference with other students, I recommend chosing a name like /*PID*-chat, which you can generate using code like:

     char inbox_filename[512];
     snprintf(inbox_filename, sizeof inbox_filename, "/%d-chat", pid);

    Note that the name must start with a /.

  2. The shared memory regions we will access are a special kind of file opened with shm_open and deleted with shm_unlink.

    To allocate and get a handle for a shared memory region, open it using shm_open with the O_CREAT flag:

     int inbox_fd = shm_open(inbox_filename, O_CREAT | O_RDWR, 0666);
     if (inbox_fd < 0) { /* something went wrong */ }

    The handle, inbox_fd in the code above, is an integer that you will pass to the mmap (memory map) call later. I use fd in the name because it is a file descriptor.

    Initially, created shared memory regions will be empty, but you can allocate space for them with ftruncate:

     ftruncate(inbox_fd, size);

    (where size is a size in bytes; I recommend 4096).

    To deallocate, use shm_unlink:

     shm_unlink(inbox_filename);

    Shared memory regions can persist even after the creating program terminates, so always deallocate what you allocate! Even if your program stops early (e.g., by SIGINT) you still need to deallocate. It does not hurt (though it does set errno) to deallocate multiple times or from the wrong process.

    You will need to include <unistd.h>, <sys/mman.h>, and <fcntl.h> to use these functions as well as #define _XOPEN_SOURCE 600 (or similar) before including anything. On Linux, you will also need to link with -lrt.

  3. You can attach a pointer inbox_data to shared memory using mmap and detach it with munmap:

    char *inbox_data;
    ...
    inbox_data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, inbox_fd, 0);
    if (inbox_data == (char*) MAP_FAILED) {
        /* something went wrong */
        /* note that MAP_FAILED is *NOT* the same thing as NULL */
    }
    close(inbox_fd); // deallocate file descriptor --- can stil use `inbox` after doing this
    // ... access inbox_data ...
    munmap(inbox_data, size);

    Then you can use this pointer like an array to access the shared memory region.

    Note that this needs to be done between allocating and deallocating the shared memory region.

  4. You’ll need to follow these steps twice: once to get the current process’s inbox (with a filename based on your current process’s ID) , and another time to get your outbox, which is another process’s inbox (and therefore with a filename based on the other process’s ID).

    So you can access either your inbox or your outbox as appropriate, you should have two different sets of variables to track the two regions.

    In the instructions below, we assume that you’ve called the data pointer for the outbox obtained this way outbox_data (though you are welcome to use other names).

4 Repeating and waiting

You’ve got two things to do: keyboard ↦ outbox and inbox ↦ screen.

4.1 keyboard ↦ outbox

A loop in main is the way to go: while(1) fgets directly into the outbox (breaking out of the loop if fgets fails). After data is in the outbox, send SIGUSR1 (using kill) to the other process; then, wait for outbox_data[0] to become '\0'. Don’t just busy-wait though: check, then go to sleep for a bit, then check again. (A better strategy you could also try might be to wait for SIGUSR2 using sigprocmask() and sigwait() and have the receiving process send SIGUSR2 when it is done reading the input.)

// wait for '\0' by checking 100 times a second
// 10,000 μs = 10 ms = 1/100 second
while(outbox_data[0]) { usleep(10000); }

4.2 inbox ↦ screen

When you handle SIGUSR1

  1. fputs the inbox so we can see what was sent;
  2. fflush(stdout) to ensure we see it now, not later;
  3. set the first character in the inbox to '\0' so the other process knows we’re ready for it to send more

5 Note on debuggers and signals

  1. If you run your program(s) in a debugger like GDB or LLDB, they will by default intercept signals:

    • for most signals, GDB and LLDB automatically stop as if there was a breakpoint when your program receives the signal;
    • for SIGINT and SIGTRAP, GDB and LLDB will not pass the signal to the program’s own signal handler (if any) by default;

    If you want to change this behavior, in LLDB you can use process handle command (see help process handle for documentation), and in GDB, the info signals, catch, and handle commands (see the GDB reference manual chapter on signals).