This post will teach you what linked lists are, why they’re useful, and how they’re used in cURL to set HTTP headers.
What are linked lists?
A linked list is a collection of data connected via references.
Each element (or node) in a list contains a value and a reference to the next node.
The simplest linked list is a singly linked list, where each node points to the next node in the list. In a singly linked list, the final node points to a null value.
You can represent a singly linked list as a list
structure containing data
and a next
pointer to another list
:
typedef struct list {
DATA data;
struct list *next;
} list;
To find an element in a list you need to visit each node one after the other, starting at the first node in the list (the head) and finishing at the last node (the tail).
The following find
function demonstrates how you would search a linked list. Starting from the first node in the list (head
), find
loops through each node and checks to see if the node contains the data it’s searching for:
list* find(list* head, DATA data) {
list* node = head;
while(node) {
if(node->data == data) {
return node;
}
node = node->next;
}
return node;
}
An algorithm to find a list node may need to traverse an entire list in the worst case. This makes linked list search an O(n) operation.
In contrast, linked list insertion takes constant time (provided you have a reference to the node you want to insert at). An insert
method can insert a new node by creating a node dynamically and changing the pointers of existing nodes.
The following code creates a new node and adds it to the end of a list:
void insert(DATA data, list* tail) {
list* node = create_node(data)
tail->next = node;
tail = node;
}
Note: A linked list node is often dynamically allocated using a function like malloc
. This adds overhead for each node that’s created.
Now you know what a linked list is, the next question is why use them?
Why use linked lists?
There are three main benefits of linked lists:
- They grow dynamically.
- They have constant time insertion and deletion.
- They’re easy to implement.
A big plus for linked lists is that they grow and shrink dynamically. You can add and remove nodes on demand without performing costly resizing operations.
Another benefit is the speed of insertion and deletion, which are both O(1) operations.
You can see the cost of the common operations in the following table:
Operation | Worst case |
---|---|
Access | O(n) |
Search | O(n) |
Insertion | O(1) |
Deletion | O(1) |
The final benefit is that they’re easy to implement (which means less surface area for bugs).
Despite these benefits, dynamic arrays are often more performant than linked lists. Think carefully before you decide to use linked lists in your project.
The rest of this post will look at a project that’s used linked lists successfully—cURL.
Linked lists in cURL
cURL is a command line tool (curl) and a library (libcurl) used for transferring data with URLs.
Commonly, the curl tool is used to make HTTP requests to a server:
curl http://www.google.com
HTTP requests can include a variable number of headers in the form field-name ":" [ field-value ]
.
You can add headers with the curl --header
option:
curl --header "X-MyHeader: 123" http://www.google.com
Both curl and libcurl represent headers as linked lists.
Implementing a linked list
cURL has two linked list data structures: curl_llist
and curl_slist
. curl_slist
is the data structure used to represent HTTP headers.
curl_slist
is a singly linked list that holds strings (hence the name slist, short for string list). Each node contains a pointer to a string (data
) and a pointer to the next node in the list (next
):
struct curl_slist {
char *data;
struct curl_slist *next;
};
Note: The code examples in this post are from curl v7.66.1.
libcurl provides methods to interact with the list. For example, curl_slist_append
, which adds a new string to a list:
struct curl_slist *headers = NULL;
slist = curl_slist_append(headers, "X-MyHeader: 123");
curl_slist_append
copies the string (using strdup
) and then calls Curl_slist_append_nodup
to add the duplicated string to the list:
struct curl_slist *curl_slist_append(struct curl_slist *list,
const char *data)
{
char *dupdata = strdup(data);
// ..
list = Curl_slist_append_nodup(list, dupdata);
// ..
return list;
}
Note: Copying the string means the original string can be overwritten after curl_slist_append
is called.
Curl_slist_append_nodup
creates a new list node (calling malloc
to allocate memory). It then adds the data and sets the next
pointer of the new node to NULL
.
If the current list is NULL
(and therefore uninitialized) Curl_slist_append_nodup
returns the new list node. If the list exists, the last node is retrieved with slist_get_last
, and its next
pointer is set to the new node:
struct curl_slist *Curl_slist_append_nodup(struct curl_slist *list, char *data)
{
struct curl_slist *last;
struct curl_slist *new_item;
// ..
new_item = malloc(sizeof(struct curl_slist));
// ..
new_item->next = NULL;
new_item->data = data;
/* if this is the first item, then new_item *is* the list */
if(!list)
return new_item;
last = slist_get_last(list);
last->next = new_item;
return list;
}
slist_get_last
loops through until it reaches an item where next
points to NULL
. This is the last item in the list, and so it’s returned:
static struct curl_slist *slist_get_last(struct curl_slist *list)
{
struct curl_slist *item;
// ..
item = list;
while(item->next) {
item = item->next;
}
return item;
}
That’s an overview of the curl_slist
API. The interesting part is how it’s used to add headers to an HTTP request.
Before that, a quick primer on HTTP/1.1.
HTTP/1.1
HTTP/1.1 is an application-layer protocol for sending data between a client and a server. It was created with ease-of-use in mind.
HTTP/1.1 is ASCII-encoded, so it’s human readable. A request contains a request-line (in the form Method Request-URI HTTP-Version CRLF
), and a variable-length list of headers.
An HTTP/1.1 GET request might look like this:
GET / HTTP/1.1
Host: www.google.com
User-Agent: curl/7.54.0
Accept: */*
HTTP is typically sent over TCP. On POSIX-compliant systems, TCP connections are created using the sockets API. Sockets are an abstraction that let you establish a TCP connection with another process (often running on a remote host) and send/ receive data over the connection. You can send data to an open socket using the POSIX-defined send
function.
The main takeaway here is that cURL can send an HTTP request by calling send
with the HTTP request as a string.
The next section will look at how headers gets converted from command-line options, into a linked list, and then into the HTTP request string that send
is called with.
Adding custom headers in curl
In libcurl, custom headers are set using a curl_slist
structure. So curl_slist
is a user-facing data structure.
curl (the tool) uses libcurl internally. In fact, most of the work curl does it to convert command-line options into configuration operations for libcurl.
libcurl exposes an “easy API” to perform transfers, you create a CURL
handle (called an “easy handle”) to configure the request. You can set options using helper methods, like curl_easy_setopt
.
The following is an example of making a request with a custom header:
CURLcode res;
CURL *curl = curl_easy_init();
struct curl_slist *headers = NULL;
headers = curl_slist_append(headers, "X-MyHeader: 123");
// .. add other options
curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
res = curl_easy_perform(curl); // make the request
curl_slist_free_all(headers);
curl tool uses the libcurl easy API internally. curl converts the command line options into an easy handle object.
When curl is called from the command line, it first parses the options passed to it in argv
. It normalizes option names, and enters a switch statement, which runs against all possible options. For any header
options, the option parameter value is added to a curl_slist
.
Once curl has parsed the options, it creates an easy handle to pass to libcurl, before executing the operation.
For HTTP requests, libcurl creates a buffer to hold the request string. The request string is built by converting options into the relevant strings. The headers are added by Curl_add_custom_headers
.
Curl_add_custom_headers
loops over the headers list. For each header, the header is validated (by checking for a colon), and then added to the request buffer along with with a CRLF (\r\n
).
CURLcode Curl_add_custom_headers(
struct connectdata *conn,
bool is_connect,
Curl_send_buffer *req_buffer)
{
// ..
char *ptr;
struct curl_slist *headers;
// ..
// .. assign headers list to headers variable
while(headers) {
// ..
ptr = strchr(headers->data, ':');
// ..
if(ptr) {
// ..
if(*ptr /* .. */ ) {
// ..
char *compare = /* ... */ headers->data;
// ..
result = Curl_add_bufferf(
&req_buffer,
"%s\r\n",
compare
);
// ..
}
}
headers = headers->next;
}
// ..
return CURLE_OK;
}
Once all the text has been added to an HTTP request, cURL calls send
to send the data over the TCP connection.
As well as being used for request headers, curl_slist
is also used for other variable-length data, like cookies and trailer headers. The linked list data structure is well-suited to representing objects of varying lengths.
Problems with linked lists
Although linked lists work well for representing variable data, they can cause performance problems in time-critical apps due to poor cache locality.
For example, when DOOM 3 was rewritten to run at 60FPS, the developers identified the use of linked lists as a major performance issue (see the DOOM 3 technical note for details).
Despite potential performance performance issues, linked lists are a common data structure in C programs. They are good at representing variable-length data and they are simple to implement (which goes a long way in C!).
The next blog post will look at a variation of linked lists—intrusive linked lists—and how they are used in Linux.