The Hidden Function of 'getsockopt'

2023-01-28

When troubleshooting network problems, you often need some information about a TCP connection: for example, how many bytes have been transferred through it, the retransmission rate, or the ACK latency. Some simple statistics can be collected in the application layer; for example, you can add extra code around send and recv to count the bytes sent and received. But this method is not elegant. If you use eBPF, you have to enter kernel space, which is terrifying. Fortunately, Linux has a hidden interface for this, which is useful when you don't need to consider compatibility with other UNIX-like systems.

Interface

int getsockopt(int sockfd, int level, int optname,
               void *restrict optval, socklen_t *restrict optlen);

When acquiring information about a TCP connection, level should be IPPROTO_TCP and optname should be TCP_INFO. optval points to the buffer that receives the result. optlen is both an input and an output parameter: on input it holds the size of the buffer, and on output it holds the size of the result.
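Because optlen reports how many bytes the kernel actually wrote, it can also tell you which fields are valid: an older kernel with a shorter struct tcp_info simply returns a smaller length. Here is a minimal sketch of that check. The names field_filled, TCPI_HAS and read_tcp_info are made up for illustration and are not part of any library.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <linux/tcp.h>    /* struct tcp_info, TCP_INFO */

/* Hypothetical helper: true if a field starting at `offset` and spanning
 * `size` bytes lies entirely within the `len` bytes the kernel wrote. */
static inline int field_filled(socklen_t len, size_t offset, size_t size)
{
    return (size_t)len >= offset + size;
}

/* Hypothetical convenience macro. Do not use it on the bit-field members
 * (tcpi_snd_wscale and friends): offsetof is not defined for bit-fields. */
#define TCPI_HAS(len, field)                                  \
    field_filled((len), offsetof(struct tcp_info, field),     \
                 sizeof(((struct tcp_info *)0)->field))

/* Hypothetical wrapper: zero the buffer, then ask the kernel for TCP_INFO.
 * On success, *len holds the number of bytes the kernel filled in. */
static inline int read_tcp_info(int fd, struct tcp_info *ti, socklen_t *len)
{
    memset(ti, 0, sizeof(*ti));
    *len = sizeof(*ti);
    return getsockopt(fd, IPPROTO_TCP, TCP_INFO, ti, len);
}
```

After read_tcp_info(fd, &ti, &len) succeeds, a check like TCPI_HAS(len, tcpi_rtt) tells you whether the running kernel filled that field. Zeroing the buffer first means a field the kernel did not write reads as 0 instead of garbage.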

Data Structures

For TCP information, the result is a struct tcp_info. This structure has two versions. The first one is in <netinet/tcp.h>:

struct tcp_info
{
  uint8_t	tcpi_state;
  uint8_t	tcpi_ca_state;
  uint8_t	tcpi_retransmits;
  uint8_t	tcpi_probes;
  uint8_t	tcpi_backoff;
  uint8_t	tcpi_options;
  uint8_t	tcpi_snd_wscale : 4, tcpi_rcv_wscale : 4;

  uint32_t	tcpi_rto;
  uint32_t	tcpi_ato;
  uint32_t	tcpi_snd_mss;
  uint32_t	tcpi_rcv_mss;

  uint32_t	tcpi_unacked;
  uint32_t	tcpi_sacked;
  uint32_t	tcpi_lost;
  uint32_t	tcpi_retrans;
  uint32_t	tcpi_fackets;

  /* Times. */
  uint32_t	tcpi_last_data_sent;
  uint32_t	tcpi_last_ack_sent;	/* Not remembered, sorry.  */
  uint32_t	tcpi_last_data_recv;
  uint32_t	tcpi_last_ack_recv;

  /* Metrics. */
  uint32_t	tcpi_pmtu;
  uint32_t	tcpi_rcv_ssthresh;
  uint32_t	tcpi_rtt;
  uint32_t	tcpi_rttvar;
  uint32_t	tcpi_snd_ssthresh;
  uint32_t	tcpi_snd_cwnd;
  uint32_t	tcpi_advmss;
  uint32_t	tcpi_reordering;

  uint32_t	tcpi_rcv_rtt;
  uint32_t	tcpi_rcv_space;

  uint32_t	tcpi_total_retrans;
};

This version is compatible with FreeBSD and other UNIX-like systems, but it contains less information.

The second version is in <linux/tcp.h> and is exclusive to Linux. It contains more information. The fields in the first part are the same; here are the extra fields:

struct tcp_info {
    // same as the structure above

	__u64	tcpi_pacing_rate;
	__u64	tcpi_max_pacing_rate;
	__u64	tcpi_bytes_acked;    /* RFC4898 tcpEStatsAppHCThruOctetsAcked */
	__u64	tcpi_bytes_received; /* RFC4898 tcpEStatsAppHCThruOctetsReceived */
	__u32	tcpi_segs_out;	     /* RFC4898 tcpEStatsPerfSegsOut */
	__u32	tcpi_segs_in;	     /* RFC4898 tcpEStatsPerfSegsIn */

	__u32	tcpi_notsent_bytes;
	__u32	tcpi_min_rtt;
	__u32	tcpi_data_segs_in;	/* RFC4898 tcpEStatsDataSegsIn */
	__u32	tcpi_data_segs_out;	/* RFC4898 tcpEStatsDataSegsOut */

	__u64   tcpi_delivery_rate;

	__u64	tcpi_busy_time;      /* Time (usec) busy sending data */
	__u64	tcpi_rwnd_limited;   /* Time (usec) limited by receive window */
	__u64	tcpi_sndbuf_limited; /* Time (usec) limited by send buffer */

	__u32	tcpi_delivered;
	__u32	tcpi_delivered_ce;

	__u64	tcpi_bytes_sent;     /* RFC4898 tcpEStatsPerfHCDataOctetsOut */
	__u64	tcpi_bytes_retrans;  /* RFC4898 tcpEStatsPerfOctetsRetrans */
	__u32	tcpi_dsack_dups;     /* RFC4898 tcpEStatsStackDSACKDups */
	__u32	tcpi_reord_seen;     /* reordering events seen */

	__u32	tcpi_rcv_ooopack;    /* Out-of-order packets received */

	__u32	tcpi_snd_wnd;	     /* peer's advertised receive window after
				      * scaling (bytes)
				      */
};

We can see that struct tcp_info in <linux/tcp.h> has extra information, such as tcpi_bytes_sent, which can be very useful.

Example

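#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>  // IPPROTO_TCP
#include <linux/tcp.h>   // Linux-only struct tcp_info and TCP_INFO

// int fd = ...;  // a TCP socket
struct tcp_info tcpi;
socklen_t bufsz = sizeof(tcpi);  // in: buffer size, out: bytes written
int ret = getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcpi, &bufsz);
if (ret != 0) {
    perror("getsockopt(TCP_INFO)");
    // TODO: error handling
}
// __u64 is not necessarily unsigned long, so cast explicitly for printf
printf("fd: %d has received %llu bytes and sent %llu bytes.\n",
       fd, (unsigned long long)tcpi.tcpi_bytes_received,
       (unsigned long long)tcpi.tcpi_bytes_sent);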
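
The retransmission rate and the latency mentioned at the beginning can be derived from the same structure. The sketch below assumes that your kernel headers are new enough to declare tcpi_bytes_sent and tcpi_bytes_retrans (I believe they arrived around Linux 4.19), that tcpi_bytes_sent counts retransmitted bytes as well, and that tcpi_rtt is the smoothed RTT in microseconds. The helper names retrans_ratio and srtt_ms are made up for illustration.

```c
#include <linux/tcp.h>   /* Linux-only struct tcp_info */

/* Hypothetical helper: fraction of all sent payload bytes that were
 * retransmissions. Returns 0.0 when nothing has been sent yet, which
 * also avoids dividing by zero. */
static inline double retrans_ratio(const struct tcp_info *ti)
{
    if (ti->tcpi_bytes_sent == 0)
        return 0.0;
    return (double)ti->tcpi_bytes_retrans / (double)ti->tcpi_bytes_sent;
}

/* Hypothetical helper: smoothed round-trip time in milliseconds,
 * assuming tcpi_rtt is reported in microseconds. */
static inline double srtt_ms(const struct tcp_info *ti)
{
    return ti->tcpi_rtt / 1000.0;
}
```

With a tcpi filled in by getsockopt as in the example, retrans_ratio(&tcpi) and srtt_ms(&tcpi) give a quick health check of the connection without touching eBPF.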


Email: i (at) mistivia (dot) com