INTRO TO LINKED LISTS
A linked list is an abstract data structure that is made up of a collection of nodes (or elements). Lists nodes are accessed by means of sequential access - they are accessed in an ordered/predetermined sequence.
List objects can be stored anywhere in memory - they do not need to be next to one another. This makes it easy to insert new elements, but has the disadvantage of requiring sequential access when accessing values.
LINKED LIST USING DOUBLE POINTERS
For the current example, nodes are dynamically allocated (using
malloc()
) and each node consists of a data field and a reference (a pointer) to the next node in the list. In the case of the last node in the list, the next
field contains NULL
- it is set as a null pointer.
Our nodes are defined with a C struct - for the purposes of this exercise, nodes are defined as
NODE
using typedef
. The data field of the node consists of a single char *name
member variable:
After this
typedef
declaration, NODE *head
defines a variable head
which is a pointer to a node struct pointer.
For the sake of simplicity this example will consider two actions:
prepend()
: add a node to the start (head) of the listappend()
: add a node at the end of the list
POINTERS
A pointer is a variable that stores the memory address of another variable. In C, you need to define the type of the object that is being pointed at.
For example:
int a = 42; int *pNum = &a;
.
The variable
pNum
now contains the address of a
, which is the memory location of an object having the data type integer. Dereferencing the pointer variable *pNum
instructs the programme to return the integer value that is held at the memory address returned by pNum
.
NOTE: you shouldn’t try to dereference a null pointer in C - this produces undefined behaviour.
INITIALISE THE LINKED LIST
The list is initialised by creating a
NODE *head
which is set to NULL
. The variable head
is now a pointer to NULL
, but as NODE
s are added to the list head
will be set to point to the first NODE
. In this way, head
becomes the access point for sequential access to the list.ADDING NODES
As the first node is added, the head pointer is amended to point to the new node, and the next pointer of the new node points to
NULL
. In the case of the first node, append()
and prepend()
have exactly the same effect.
In each case, a node needs to be created. To achieve this:
WHY DOUBLE POINTERS?
Arguments are always passed to functions by value in C. In other words, when C passes control to a function, a copy of each argument is made and this copy is passed to the function - leaving the original variable unchanged.
In order to amend a variable in a calling function from within a called function, you need to pass a pointer to the variable as a function argument. This provides the called function with the memory address of the variable to be amended. Dereferencing this (pointer) variable within the called function provides access to the original value.
Simplified example:
PREPEND
In the function definition for
prepend()
shown below, the first parameter is NODE **head
. This specifies that the function receives a variable with the pointer to pointer variable type NODE **
. In other words, the function receives an address of a variable which is in turn a pointer to a NODE
- in this case, the function argument is the address of the variable NODE *head
which is a pointer that points to either the first node in the list or to NULL
.
If a new node is added to the head of the list (by means of a
prepend()
action), the head pointer is amended to point to the new node and the next
field of the new node refers to the address of the second (previously first) node.
This is the easier function to understand:
In the case of the new node being the first node in the list,
head == NULL
so the assignment newNode->next = *head
correctly sets the next
field of the last node (in this case, also the first node) to NULL
.
If the list already contains nodes, the
head
variable holds a pointer to a pointer to the existing first node. Note that the address of head
is passed into the prepend()
function from the caller.
Because we want to insert the new node before the existing first node, the
next
field of the new node should point to the address of the existing first node. The declaration newNode->next = *head
correctly makes this assignment.
Setting the new
head
reference is where double pointers become useful. Because we pass in the address of the head variable rather than the variable itself, we can access (and amend) it’s value from within the prepend()
function. When the *head
is set to the address of the new node from within prepend()
we are saying:
“Dereference
**head
once to get the memory address that is held by head
in the calling function, and set this address to the address of the new node.”APPEND NODE
If a new node is added to the end of the list (by means of
append()
, the next
field of the previously last node is amended to point to the memory address of the new node. If the appended node is also the first node in the list, then the head pointer must be amended to point to the new node.
In this case, the challenge is to efficiently find the last node, whilst accounting for the possibility that this may be the first node in the list. Note that in any case the
next
field of the new node points to null.
Create a tracer and set this initially to hold the memory address of head. Set
tracer
to the address of the pointer to the next NODE
. In the case of an empty list or the last NODE
, this value will be NULL
.
NOTE:
*tracer
dereferences tracer
once, so it refers to a pointer to a NODE
.
The arrow operator can then be used to indirectly access the member variable. The following are equivalent:
tracer = &(*tracer)->next;
tracer = &(*(*tracer)).next;
tracer = &(**tracer).next;
Note that
tracer
must be a pointer to a pointer to a NODE
- it must contain the memory address of a pointer to a NODE
(i.e. NODE *
).
After the loop,
*tracer
will be a pointer to NULL
even if the list is empty. *tracer
now refers to the pointer to the next node of the last node. Currently NULL
, the next line refers it to the new NODE
.