POLLING(4)            DragonFly Kernel Interfaces Manual            POLLING(4)

NAME
     polling -- network device driver polling support

SYNOPSIS
     options IFPOLL_ENABLE

DESCRIPTION
     Network device polling (polling for brevity) refers to a technique that
     lets the operating system periodically poll network devices, instead of
     relying on the network devices to generate interrupts when they need
     attention.  This might seem inefficient and counterintuitive, but when
     done properly, polling gives the operating system more control over when
     and how to handle network devices, with a number of advantages in terms
     of system responsiveness and performance.

     In particular, polling reduces the context-switch overhead incurred when
     servicing interrupts, and gives more control over how a CPU is scheduled
     among various tasks (user processes, software interrupts, device
     handling), which ultimately reduces the chance of livelock in the system.

   Principles of Operation
     In the normal, interrupt-based mode, network devices generate an
     interrupt whenever they need attention.  This in turn causes a context
     switch and the execution of an interrupt handler which performs whatever
     processing is needed by the network device.  The duration of the
     interrupt handler is potentially unbounded unless the network device
     driver has been programmed with real-time concerns in mind (which is
     generally not the case for DragonFly drivers).  Furthermore, under heavy
     traffic load, the system might be persistently processing interrupts
     without being able to complete other work, either in the kernel or in
     userland.

     Network device polling disables device interrupts and instead polls the
     network devices on clock interrupts.  This removes the context switch
     overhead.  Furthermore, the operating system can accurately control how
     much time it spends handling network device events, and thus prevent
     livelock by reserving some amount of CPU for other tasks.

     Enabling polling also changes the way software network interrupts are
     scheduled, so that there is never a risk of the livelock that arises when
     packets are not processed to completion.

   Enabling polling
     Polling is turned on and off per interface with the ifconfig(8) command.
     An interface does not have to be ``up'' in order to turn on its polling
     feature.
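
     As a minimal sketch, assuming that ifconfig(8) accepts the polling and
     -polling parameters, and using em0 purely as a placeholder interface
     name, polling could be toggled as follows.  The uname(1) guard is an
     addition for illustration, so that the commands are only attempted on
     DragonFly.

```shell
#!/bin/sh
# Placeholder interface name; substitute a device listed under
# SUPPORTED DEVICES.
iface=em0

# Assumption: ifconfig(8) accepts "polling" and "-polling" here.
# The guard keeps the commands from running on other systems.
if [ "$(uname -s)" = "DragonFly" ]; then
        ifconfig "$iface" polling      # switch the interface to polling
        ifconfig "$iface" -polling     # return it to interrupt mode
fi
```

     Both commands normally require root privileges.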

   Loader Tunables
     The following tunables can be set from loader.conf(5) (X is the CPU
     number):
     net.ifpoll.burst_max
             Default value for net.ifpoll.X.rx.burst_max sysctl nodes.

     net.ifpoll.each_burst
             Default value for net.ifpoll.X.rx.each_burst sysctl nodes.

     net.ifpoll.user_frac
             Default value for net.ifpoll.X.rx.user_frac sysctl nodes.

     net.ifpoll.pollhz
             Default value for net.ifpoll.X.pollhz sysctl nodes.

     net.ifpoll.status_frac
             Default value for net.ifpoll.0.status_frac sysctl node.

     net.ifpoll.tx_frac
             Default value for net.ifpoll.X.tx_frac sysctl nodes.
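
     The following sketch prints two lines in the name="value" form used by
     loader.conf(5), so that they can be reviewed before being added to
     /boot/loader.conf.  The values are illustrative assumptions, not
     recommendations.

```shell
#!/bin/sh
# Illustrative values only; tune them for the expected load.
loader_lines='net.ifpoll.pollhz="2000"
net.ifpoll.user_frac="40"'

# Print the lines for review before adding them to /boot/loader.conf.
printf '%s\n' "$loader_lines"
```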

   MIB Variables
     The operation of polling is controlled by the following per-CPU sysctl(8)
     MIB variables (X is the CPU number):

     net.ifpoll.X.pollhz
             The polling frequency, whose range is 1 to 30000.  Default is
             6000.

     net.ifpoll.X.rx.user_frac
             When polling is enabled, and provided that there is some work to
             do, up to this percentage of the CPU cycles is reserved for
             userland tasks, the remaining fraction being available for
             polling processing.  Default is 50.

     net.ifpoll.X.rx.burst
             Maximum number of packets grabbed from each network interface in
             each timer tick.  This number is dynamically adjusted by the
             kernel, according to the programmed user_frac, burst_max, CPU
             speed, and system load.

     net.ifpoll.X.rx.each_burst
             The burst above is split into smaller chunks of this number of
             packets, going round-robin among all interfaces registered for
             polling.  This prevents a large burst from a single interface
             from saturating the IP interrupt queue.  Default is 50.

     net.ifpoll.X.rx.burst_max
             Upper bound for net.ifpoll.X.rx.burst.  Note that when polling is
             enabled, each interface can receive at most (pollhz * burst_max)
             packets per second unless there are spare CPU cycles available
             for polling in the idle loop.  This number should be tuned to
             match the expected load.  Default is 250, which is adequate for
             a 1000Mbit network with pollhz=6000.

     net.ifpoll.X.rx.handlers
             How many active network devices have registered for packet
             reception polling.

     net.ifpoll.X.tx_frac
             Controls how often (every tx_frac / pollhz seconds) the
             transmission queue is checked for packet transmission done
             events.  Increasing this value reduces the time spent checking
             for packet transmission done events and thus reduces bus load,
             but it also increases the chance of the transmission queue
             becoming saturated.  Default is 1.

     net.ifpoll.X.tx.handlers
             How many active network devices have registered for packet
             transmission polling.

     net.ifpoll.0.status_frac
             Controls how often (every status_frac / pollhz seconds) the
             status registers of the network device are checked for error
             conditions and the like.  Increasing this value reduces the load
             on the bus, but also delays the error detection.  Default is 120.

     net.ifpoll.0.status.handlers
             How many active network devices have registered for status
             polling.

     net.ifpoll.X.rx.short_ticks
     net.ifpoll.X.rx.lost_polls
     net.ifpoll.X.rx.pending_polls
     net.ifpoll.X.rx.residual_burst
     net.ifpoll.X.rx.phase
     net.ifpoll.X.rx.suspect
     net.ifpoll.X.rx.stalled
     net.ifpoll.X.tx.short_ticks
     net.ifpoll.X.tx.lost_polls
     net.ifpoll.X.tx.pending_polls
     net.ifpoll.X.tx.residual_burst
     net.ifpoll.X.tx.phase
     net.ifpoll.X.tx.suspect
     net.ifpoll.X.tx.stalled
             Debugging variables.
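
     The formulas quoted above can be worked through with the documented
     defaults.  This is a small arithmetic sketch, not output from a running
     system; the variable names are chosen for illustration only.  For
     comparison, gigabit Ethernet carries at most roughly 1.488 million
     minimum-size frames per second, which is consistent with the default
     burst_max being described as adequate for a 1000Mbit network.

```shell
#!/bin/sh
# Defaults quoted in the variable descriptions above.
pollhz=6000
burst_max=250
tx_frac=1
status_frac=120

# Receive ceiling per interface: pollhz * burst_max packets per second.
max_pps=$((pollhz * burst_max))

# Check intervals are frac / pollhz seconds; they are scaled to
# microseconds because the shell only has integer arithmetic
# (the tx result is truncated).
tx_interval_us=$((tx_frac * 1000000 / pollhz))
status_interval_us=$((status_frac * 1000000 / pollhz))

echo "rx ceiling: $max_pps packets/s"
echo "tx check every $tx_interval_us us"
echo "status check every $status_interval_us us"
```

     With these defaults the receive ceiling is 1500000 packets per second,
     the transmission queue is checked about every 166 microseconds, and the
     status registers are checked every 20000 microseconds (20 ms).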

SUPPORTED DEVICES
     Network device polling requires explicit modifications to the network
     device drivers.  As of this writing, the bce(4), bge(4), bnx(4), dc(4),
     em(4), emx(4), fwe(4), fxp(4), igb(4), ix(4), jme(4), mxge(4), nfe(4),
     nge(4), re(4), rl(4), sis(4), stge(4), vge(4), vr(4), and xl(4) devices
     are supported, with others in the works.  The bce(4), bnx(4), emx(4),
     igb(4), ix(4), jme(4), and mxge(4) devices support polling based on
     multiple reception queues.  The bce(4), bnx(4), certain types of emx(4),
     igb(4), and ix(4) devices support polling based on multiple transmission
     queues.  The modifications are rather straightforward, consisting of
     extracting the inner part of the interrupt service routine and writing a
     callback function, *_npoll(), which is invoked to probe the network
     device for events and process them.  (See the conditionally compiled
     sections of the drivers for the network devices mentioned above for more
     details.)

     In order to reduce the latency in processing packets, it is advisable to
     set the sysctl(8) variable net.ifpoll.X.pollhz to at least 1000.
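
     One way to apply this advice is sketched below: read the current
     frequency and raise it only if it is below 1000.  CPU 0 is a placeholder
     for X, the variable names are illustrative, and the uname(1) guard is an
     addition so that nothing is attempted on other systems.

```shell
#!/bin/sh
cpu=0            # placeholder CPU number (the X in net.ifpoll.X.pollhz)
min_hz=1000      # the minimum frequency advised above

# Only query and adjust the sysctl on DragonFly.
if [ "$(uname -s)" = "DragonFly" ]; then
        cur_hz=$(sysctl -n "net.ifpoll.$cpu.pollhz")
        if [ "$cur_hz" -lt "$min_hz" ]; then
                sysctl "net.ifpoll.$cpu.pollhz=$min_hz"
        fi
fi
```

     Since the variable is per CPU, the same check would need to be repeated
     for each CPU number of interest.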

HISTORY
     Network device polling first appeared in FreeBSD 4.6.  It was rewritten
     in DragonFly 1.3.

AUTHORS
     The network device polling code was rewritten by Matt Dillon based on the
     original code by Luigi Rizzo <luigi@iet.unipi.it>.  Sepherosa Ziehau made
     the polling frequency settable at runtime, added per-CPU polling, and
     added multiple reception and transmission queue polling support.

DragonFly 3.7                    May 23, 2013                    DragonFly 3.7