The Hidden Features of getsockopt

2023-01-28

When troubleshooting network issues, you often need statistics about a TCP connection, such as the amount of data transferred, the retransmission rate, or the ACK latency. Simple counters can be implemented by hand, for example by adding logging at every send and recv call site to count the bytes sent and received, but that approach is not very elegant. eBPF is another option, but it requires loading programs into the kernel, which can be cumbersome. If portability to other UNIX-like systems is not a concern, a Linux-specific interface offers a simpler route: the TCP_INFO option of getsockopt.

Interface

int getsockopt(int sockfd, int level, int optname,
               void *restrict optval, socklen_t *restrict optlen);

To retrieve TCP connection information, set level to IPPROTO_TCP and optname to TCP_INFO. optval points to the buffer that receives the result. optlen is a value-result parameter: on input it holds the size of the buffer, and on return it holds the number of bytes actually written.
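The value-result behavior of optlen can be seen with a minimal sketch. The helper name tcp_info_reported_size and the 1024-byte buffer are assumptions made for illustration. The buffer is deliberately larger than needed so that the length written back by the kernel differs from the length passed in.

```c
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_INFO */
#include <sys/socket.h>   /* socket, getsockopt, socklen_t */
#include <unistd.h>       /* close */

/* Hypothetical helper: returns the number of bytes the kernel wrote for
 * TCP_INFO on a fresh, unconnected TCP socket, or -1 on error. */
long tcp_info_reported_size(void)
{
    unsigned char buf[1024];       /* deliberately larger than needed */
    socklen_t len = sizeof(buf);   /* in: capacity of buf */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int ret = getsockopt(fd, IPPROTO_TCP, TCP_INFO, buf, &len);
    close(fd);
    if (ret != 0)
        return -1;

    return (long)len;              /* out: bytes actually written */
}
```

The returned value is the size of the running kernel's struct tcp_info, capped at the buffer size, so it can differ from one kernel version to another.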

Data Structures

For TCP information, the output result is a structure called struct tcp_info. However, there are two versions of this structure’s definition. The first version is in <netinet/tcp.h>:

struct tcp_info
{
  uint8_t   tcpi_state;
  uint8_t   tcpi_ca_state;
  uint8_t   tcpi_retransmits;
  uint8_t   tcpi_probes;
  uint8_t   tcpi_backoff;
  uint8_t   tcpi_options;
  uint8_t   tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;

  uint32_t  tcpi_rto;
  uint32_t  tcpi_ato;
  uint32_t  tcpi_snd_mss;
  uint32_t  tcpi_rcv_mss;

  uint32_t  tcpi_unacked;
  uint32_t  tcpi_sacked;
  uint32_t  tcpi_lost;
  uint32_t  tcpi_retrans;
  uint32_t  tcpi_fackets;

  /* Times. */
  uint32_t  tcpi_last_data_sent;
  uint32_t  tcpi_last_ack_sent; /* Not remembered, sorry.  */
  uint32_t  tcpi_last_data_recv;
  uint32_t  tcpi_last_ack_recv;

  /* Metrics. */
  uint32_t  tcpi_pmtu;
  uint32_t  tcpi_rcv_ssthresh;
  uint32_t  tcpi_rtt;
  uint32_t  tcpi_rttvar;
  uint32_t  tcpi_snd_ssthresh;
  uint32_t  tcpi_snd_cwnd;
  uint32_t  tcpi_advmss;
  uint32_t  tcpi_reordering;

  uint32_t  tcpi_rcv_rtt;
  uint32_t  tcpi_rcv_space;

  uint32_t  tcpi_total_retrans;
};

This version is compatible with FreeBSD and other UNIX-like systems, but it provides less information.
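As a small illustration of using the portable fields, the sketch below converts tcpi_rtt to milliseconds. The helper name tcpi_rtt_ms is hypothetical, and the conversion assumes Linux semantics, where tcpi_rtt holds the smoothed round-trip time in microseconds. Other systems may use different units.

```c
#define _DEFAULT_SOURCE   /* glibc exposes struct tcp_info only in its default/misc feature set */
#include <netinet/tcp.h>  /* struct tcp_info (portable subset) */

/* Hypothetical helper: smoothed RTT in milliseconds.
 * Assumes tcpi_rtt is expressed in microseconds, as on Linux. */
double tcpi_rtt_ms(const struct tcp_info *ti)
{
    return ti->tcpi_rtt / 1000.0;
}
```

The same pattern applies to tcpi_rttvar, which Linux reports in the same unit.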

The second version is in <linux/tcp.h>. It is specific to Linux and provides more comprehensive information. Since the initial fields are the same, only the additional fields are listed here.

struct tcp_info {
    // Same as above

    __u64   tcpi_pacing_rate;
    __u64   tcpi_max_pacing_rate;
    __u64   tcpi_bytes_acked;    /* RFC4898 tcpEStatsAppHCThruOctetsAcked */
    __u64   tcpi_bytes_received; /* RFC4898 tcpEStatsAppHCThruOctetsReceived */
    __u32   tcpi_segs_out;       /* RFC4898 tcpEStatsPerfSegsOut */
    __u32   tcpi_segs_in;        /* RFC4898 tcpEStatsPerfSegsIn */

    __u32   tcpi_notsent_bytes;
    __u32   tcpi_min_rtt;
    __u32   tcpi_data_segs_in;  /* RFC4898 tcpEStatsDataSegsIn */
    __u32   tcpi_data_segs_out; /* RFC4898 tcpEStatsDataSegsOut */

    __u64   tcpi_delivery_rate;

    __u64   tcpi_busy_time;      /* Time (usec) busy sending data */
    __u64   tcpi_rwnd_limited;   /* Time (usec) limited by receive window */
    __u64   tcpi_sndbuf_limited; /* Time (usec) limited by send buffer */

    __u32   tcpi_delivered;
    __u32   tcpi_delivered_ce;

    __u64   tcpi_bytes_sent;     /* RFC4898 tcpEStatsPerfHCDataOctetsOut */
    __u64   tcpi_bytes_retrans;  /* RFC4898 tcpEStatsPerfOctetsRetrans */
    __u32   tcpi_dsack_dups;     /* RFC4898 tcpEStatsStackDSACKDups */
    __u32   tcpi_reord_seen;     /* reordering events seen */

    __u32   tcpi_rcv_ooopack;    /* Out-of-order packets received */

    __u32   tcpi_snd_wnd;        /* peer's advertised receive window after
                      * scaling (bytes)
                      */
};

You can see that the struct tcp_info in <linux/tcp.h> includes some very important additional information, such as the number of bytes sent and received, which is extremely useful for traffic statistics.
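Because the structure has grown over time, a program compiled against recent headers may run on an older kernel that fills in only a prefix of it. In that case the length written back through optlen is smaller than sizeof(struct tcp_info). A minimal sketch of a guard follows. The macro name TCPI_HAS_FIELD is an assumption made for illustration and is not part of any header.

```c
#include <stddef.h>      /* offsetof, size_t */
#include <linux/tcp.h>   /* Linux-specific struct tcp_info */

/* Hypothetical helper macro: true if `field` lies entirely within the
 * first `len` bytes, where `len` is the length getsockopt() reported. */
#define TCPI_HAS_FIELD(len, field)                            \
    ((size_t)(len) >= offsetof(struct tcp_info, field) +      \
                      sizeof(((struct tcp_info *)0)->field))
```

A caller would check, for example, TCPI_HAS_FIELD(len, tcpi_bytes_sent) before trusting that field, with len being the value returned in optlen. Zeroing the structure before the call is a reasonable extra precaution, so that fields the kernel did not fill read as 0.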

Example Code

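#include <stdio.h>
#include <sys/socket.h>   // getsockopt, socklen_t
#include <netinet/in.h>   // IPPROTO_TCP
#include <linux/tcp.h>    // Linux-specific struct tcp_info, TCP_INFO

// int fd = ...;
struct tcp_info tcpi;
socklen_t bufsz = sizeof(tcpi);  // in: buffer size; out: bytes written
int ret = getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcpi, &bufsz);
if (ret != 0) {
    // TODO: error handling
}
// The counters are __u64; cast them so the format specifier is portable.
printf("fd: %d has received %llu bytes and sent %llu bytes.\n",
       fd, (unsigned long long)tcpi.tcpi_bytes_received,
       (unsigned long long)tcpi.tcpi_bytes_sent);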
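To try the counters without a real peer, the following sketch sets up a TCP connection over the loopback interface, transfers some data, and reads tcpi_bytes_received on the receiving side. The function name loopback_bytes_received and the 256-byte chunk size are assumptions made for illustration. It also assumes that loopback networking is available and that the installed headers are recent enough to declare tcpi_bytes_received.

```c
#include <arpa/inet.h>    /* htonl */
#include <netinet/in.h>   /* sockaddr_in, IPPROTO_TCP, INADDR_LOOPBACK */
#include <string.h>       /* memset */
#include <sys/socket.h>   /* socket, bind, listen, connect, accept, ... */
#include <unistd.h>       /* close */
#include <linux/tcp.h>    /* Linux-specific struct tcp_info, TCP_INFO */

/* Hypothetical helper: sends n bytes over a loopback TCP connection and
 * returns tcpi_bytes_received as seen by the receiving socket, or -1. */
long long loopback_bytes_received(size_t n)
{
    long long result = -1;
    int lfd = -1, cfd = -1, sfd = -1;
    char chunk[256];
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct tcp_info tcpi;
    socklen_t tlen = sizeof(tcpi);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                 /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) != 0) goto out;
    if (listen(lfd, 1) != 0) goto out;
    /* Learn which port the kernel assigned. */
    if (getsockname(lfd, (struct sockaddr *)&addr, &alen) != 0) goto out;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto out;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) != 0) goto out;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) goto out;

    /* Send n bytes from the client, draining each chunk on the server. */
    memset(chunk, 'x', sizeof(chunk));
    for (size_t left = n; left > 0; ) {
        size_t want = left < sizeof(chunk) ? left : sizeof(chunk);
        ssize_t sent = send(cfd, chunk, want, 0);
        if (sent <= 0) goto out;
        for (ssize_t got = 0; got < sent; ) {
            ssize_t r = recv(sfd, chunk, (size_t)(sent - got), 0);
            if (r <= 0) goto out;
            got += r;
        }
        left -= (size_t)sent;
    }

    /* Query the receiving socket's statistics. */
    memset(&tcpi, 0, sizeof(tcpi));
    if (getsockopt(sfd, IPPROTO_TCP, TCP_INFO, &tcpi, &tlen) != 0) goto out;
    result = (long long)tcpi.tcpi_bytes_received;

out:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

Calling loopback_bytes_received(1000) from a small main and printing the result is an easy way to confirm that the counter follows the payload size. Querying cfd instead of sfd would give the sender's view, for instance through tcpi_bytes_sent.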

Mistivia - https://mistivia.com