// Hidden Features of getsockopt
#import "/template-en.typ":doc-template
#doc-template(
title: "Hidden Features of getsockopt",
date: "January 28, 2023",
body: [
When troubleshooting network issues, you often need information about a TCP connection: how much data it has transferred, its retransmission rate, its ACK delay, and so on. Simple statistics can be collected by hand, for example by logging the byte counts at every `send` and `recv` call, but that approach is inelegant. eBPF can provide the same data, but it means working in kernel space, which is troublesome. If portability to other Unix-like systems is not a concern, Linux-specific interfaces offer a simpler route.
= Interface
```
int getsockopt(int sockfd, int level, int optname,
void *restrict optval, socklen_t *restrict optlen);
```
To get TCP connection information, pass `IPPROTO_TCP` as `level` and `TCP_INFO` as `optname`. `optval` points to the buffer that receives the result. `optlen` is both input and output: on entry it holds the size of the buffer, and on return it holds the size of the result.
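The in/out behavior of `optlen` can be sketched as follows. This is a minimal illustration rather than a definitive implementation: `read_tcp_info` is a hypothetical helper name, and the sketch assumes a Linux system where #raw("<linux/tcp.h>") is available.

```c
#include <linux/tcp.h>   /* TCP_INFO, struct tcp_info */
#include <netinet/in.h>  /* IPPROTO_TCP */
#include <string.h>      /* memset */
#include <sys/socket.h>  /* getsockopt, socklen_t */

/* Hypothetical helper: fills *out and returns the number of bytes the
 * kernel actually wrote, or -1 on error (errno is set by getsockopt). */
static long read_tcp_info(int fd, struct tcp_info *out)
{
    socklen_t len = sizeof(*out);  /* in: capacity of the buffer */

    memset(out, 0, sizeof(*out));  /* fields the kernel skips stay zero */
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, out, &len) != 0)
        return -1;
    return (long)len;              /* out: size of the result */
}
```

Zeroing the buffer first means that any field the kernel did not write reads as zero instead of stack garbage.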
= Data Structure
For TCP information, the output result is a `struct tcp_info`. However, there are two versions of this struct's definition. The first version is in #raw("<netinet/tcp.h>"):
```
struct tcp_info
{
    uint8_t tcpi_state;
    uint8_t tcpi_ca_state;
    uint8_t tcpi_retransmits;
    uint8_t tcpi_probes;
    uint8_t tcpi_backoff;
    uint8_t tcpi_options;
    uint8_t tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;
    uint32_t tcpi_rto;
    uint32_t tcpi_ato;
    uint32_t tcpi_snd_mss;
    uint32_t tcpi_rcv_mss;
    uint32_t tcpi_unacked;
    uint32_t tcpi_sacked;
    uint32_t tcpi_lost;
    uint32_t tcpi_retrans;
    uint32_t tcpi_fackets;
    /* Times. */
    uint32_t tcpi_last_data_sent;
    uint32_t tcpi_last_ack_sent; /* Not remembered, sorry. */
    uint32_t tcpi_last_data_recv;
    uint32_t tcpi_last_ack_recv;
    /* Metrics. */
    uint32_t tcpi_pmtu;
    uint32_t tcpi_rcv_ssthresh;
    uint32_t tcpi_rtt;
    uint32_t tcpi_rttvar;
    uint32_t tcpi_snd_ssthresh;
    uint32_t tcpi_snd_cwnd;
    uint32_t tcpi_advmss;
    uint32_t tcpi_reordering;
    uint32_t tcpi_rcv_rtt;
    uint32_t tcpi_rcv_space;
    uint32_t tcpi_total_retrans;
};
```
This version is compatible with other Unix-like systems such as FreeBSD and provides relatively little information.
The second version is in #raw("<linux/tcp.h>"), which is Linux-specific and more comprehensive. Since the initial fields are the same, only the additional fields are shown here.
```
struct tcp_info {
    // Same as above
    __u64 tcpi_pacing_rate;
    __u64 tcpi_max_pacing_rate;
    __u64 tcpi_bytes_acked;    /* RFC4898 tcpEStatsAppHCThruOctetsAcked */
    __u64 tcpi_bytes_received; /* RFC4898 tcpEStatsAppHCThruOctetsReceived */
    __u32 tcpi_segs_out;       /* RFC4898 tcpEStatsPerfSegsOut */
    __u32 tcpi_segs_in;        /* RFC4898 tcpEStatsPerfSegsIn */
    __u32 tcpi_notsent_bytes;
    __u32 tcpi_min_rtt;
    __u32 tcpi_data_segs_in;   /* RFC4898 tcpEStatsDataSegsIn */
    __u32 tcpi_data_segs_out;  /* RFC4898 tcpEStatsDataSegsOut */
    __u64 tcpi_delivery_rate;
    __u64 tcpi_busy_time;      /* Time (usec) busy sending data */
    __u64 tcpi_rwnd_limited;   /* Time (usec) limited by receive window */
    __u64 tcpi_sndbuf_limited; /* Time (usec) limited by send buffer */
    __u32 tcpi_delivered;
    __u32 tcpi_delivered_ce;
    __u64 tcpi_bytes_sent;     /* RFC4898 tcpEStatsPerfHCDataOctetsOut */
    __u64 tcpi_bytes_retrans;  /* RFC4898 tcpEStatsPerfOctetsRetrans */
    __u32 tcpi_dsack_dups;     /* RFC4898 tcpEStatsStackDSACKDups */
    __u32 tcpi_reord_seen;     /* reordering events seen */
    __u32 tcpi_rcv_ooopack;    /* Out-of-order packets received */
    __u32 tcpi_snd_wnd;        /* peer's advertised receive window after
                                * scaling (bytes)
                                */
};
```
As you can see, the `struct tcp_info` in #raw("<linux/tcp.h>") contains some very important information, such as the number of bytes sent and received, which is very useful for traffic statistics.
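Because the extended fields were appended over time, a kernel older than the installed headers writes fewer bytes than `sizeof(struct tcp_info)` and reports the shorter length through `optlen`. The following is a minimal sketch of how a program might check whether a given field was actually written. `field_filled` and `TCPI_HAS` are hypothetical names introduced for illustration; they are not part of any header.

```c
#include <linux/tcp.h>  /* Linux-specific struct tcp_info */
#include <stddef.h>     /* offsetof, size_t */

/* Nonzero if a field of `size` bytes starting at `offset` lies entirely
 * within the first `returned_len` bytes, i.e. the kernel wrote it. */
static int field_filled(size_t offset, size_t size, size_t returned_len)
{
    return offset + size <= returned_len;
}

/* Convenience wrapper for a named (non-bit-field) member of struct tcp_info.
 * `returned_len` is the optlen value reported by getsockopt. */
#define TCPI_HAS(field, returned_len)                       \
    field_filled(offsetof(struct tcp_info, field),          \
                 sizeof(((struct tcp_info *)0)->field),     \
                 (returned_len))
```

A caller would read `tcpi.tcpi_bytes_sent` only when `TCPI_HAS(tcpi_bytes_sent, bufsz)` is nonzero, where `bufsz` is the length returned by `getsockopt`.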
= Example Code
```
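#include <stdio.h>
#include <sys/socket.h>  // getsockopt, socklen_t
#include <netinet/in.h>  // IPPROTO_TCP
#include <linux/tcp.h>   // TCP_INFO, Linux-specific struct tcp_info
// int fd = ...;
struct tcp_info tcpi;
socklen_t bufsz = sizeof(tcpi);  // in: buffer size, out: bytes written
int ret = getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcpi, &bufsz);
if (ret != 0) {
    // TODO: error handling
}
// The byte counters are __u64, so cast them for a portable format specifier.
printf("fd: %d has received %llu bytes and sent %llu bytes.\n",
       fd, (unsigned long long)tcpi.tcpi_bytes_received,
       (unsigned long long)tcpi.tcpi_bytes_sent);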
```
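The same counters can yield the retransmission rate mentioned at the start. Below is a minimal sketch, assuming `tcpi_bytes_sent` counts retransmitted bytes as well as original ones, so the ratio stays between 0 and 1. `retrans_ratio` is a hypothetical helper name, not an existing API.

```c
#include <linux/tcp.h>  /* Linux-specific struct tcp_info */

/* Fraction of all sent bytes that were retransmissions.
 * Returns 0 when nothing has been sent yet, to avoid dividing by zero. */
static double retrans_ratio(const struct tcp_info *info)
{
    if (info->tcpi_bytes_sent == 0)
        return 0.0;
    return (double)info->tcpi_bytes_retrans / (double)info->tcpi_bytes_sent;
}
```

For example, a connection that has sent 1000 bytes in total, 250 of them retransmitted, gives a ratio of 0.25.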
]
)
Email: i (at) mistivia (dot) com