David Yue

Legal Name: Ming Xuan Yue (岳明轩)

Simple Data Structures In Depth

A tutorial about simple data structures
  • Text Width
  • Monospace Font

Table of Contents

Linked Lists

A linked list is a dynamic data structure where each element, or "node," contains a value and a pointer to the next node in the sequence. Unlike arrays, linked lists don’t store elements in contiguous memory locations, allowing efficient insertions and deletions.

Here’s an example of a basic node structure in C for a linked list storing integers:

struct linkedlist_node {
  int item;
  struct linkedlist_node *next;
};

In a linked list, the final node has its next pointer set to NULL, indicating the end of the list.

Creating a Node

Linked list nodes are typically allocated on the heap using malloc. It doesn't make sense to allocate nodes on the stack since stack-allocated memory would not persist beyond the function's scope.

struct linkedlist_node* create_node (int value) {
  struct linkedlist_node *new_node = malloc(sizeof(struct linkedlist_node));

  if (!new_node)
      return NULL;

  new_node->item = value;
  new_node->next = NULL;

  return new_node;
}

In this example, we dynamically allocate a node, initialize its value, and set next to NULL since it's currently the last node in the list.

Inserting Nodes

Inserting at the End

To insert a node at the end, traverse the list to find the last node, then update its next pointer to reference the new node.

void insert_at_end (struct linkedlist_node* list, struct linkedlist_node* node) {
     // Assumes list != NULL
     while (list->next) {
         list = list->next;
     }
     list->next = node;
}

Inserting at the Beginning

To insert at the beginning, make the new node point to the current head and return the new node as the new head of the list.

struct linkedlist_node* insert_at_begin (struct linkedlist_node* list, struct linkedlist_node* node) {
    node->next = list;
    return node;
}

Inserting in Order

For an ordered linked list, we need to find the correct position for the new node. A naive solution is to return the potentially updated head of the list:

struct linkedlist_node* insert_ordered (struct linkedlist_node* list, struct linkedlist_node* node) {
    struct linkedlist_node *prev = NULL;
    struct linkedlist_node *curr = list;

    while (curr && curr->item < node->item) {
        prev = curr;
        curr = curr->next;
    }

    node->next = curr;

    if (prev) {
        prev->next = node;
        return list;
    } else {
        return node;
    }
}

This function returns the new head if it changes. For a more consistent design, we can use a double pointer, which avoids returning a potentially updated head.

void insert_ordered (struct linkedlist_node** list, struct linkedlist_node* node) {
    struct linkedlist_node *prev = NULL;
    struct linkedlist_node *curr = *list;

    while (curr && curr->item < node->item) {
        prev = curr;
        curr = curr->next;
    }

    node->next = curr;

    if (prev) {
        prev->next = node;
    } else {
        *list = node;
    }
}

Using a double pointer lets us update the head without changing the return type. Further, we can avoid explicitly tracking the previous node by directly manipulating the double pointer:

void insert_ordered (struct linkedlist_node** list, struct linkedlist_node* node) {
    while (*list && (*list)->item < node->item) {
        list = &((*list)->next);
    }

    node->next = *list;
    *list = node;
}

The following animation inserts the value 3 between node 1 and node 5. It shows what node->next = *list and *list = node are doing visually.

The trick here is that list points to the ...->next member in the struct. So a statement of the from *list = ... effectively sets the ...->next member value (we indicate this in the animation by turning the arrow red).

Although I did say that list points to the ...->next member in the struct, there is a slight exception to this rule when the function is first called. Recall that we typically handle linked lists as a pointer to the first element, thus when this element is first called, list points to this head pointer variable instead.

Computerphile also made a video about this technique

Deleting Nodes

To delete a node, we unlink it from the list, update any necessary pointers, and free its memory.

struct linkedlist_node* delete_item (struct linkedlist_node* list, int value) {
    struct linkedlist_node *prev = NULL;

    while (list && list->item != value) {
        prev = list;
        list = list->next;
    }

    if (!list)
        return NULL;

    struct linkedlist_node *new_head = list;

    if (prev) {
        prev->next = list->next;
    } else {
        new_head = list->next;
    }

    free(list);

    return new_head;
}

By utilizing a double pointer, we can eliminate the need to track prev manually, simplifying the code structure and improving readability.

void delete_item (struct linkedlist_node** list, int value) {
    while (*list && (*list)->item != value) {
        list = &((*list)->next);
    }

    struct linkedlist_node* to_del = *list;

    if (!to_del)
        return;

    *list = to_del->next;

    free(to_del);

}

Code can be found here


Animations were created using Manim

This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

If you spot any errors, please report them to me
← Back to Blogs