From: Laszlo Ersek <lersek@redhat.com>
Date: Wed, 17 Nov 2010 14:46:22 -0500
Subject: [virt] netback: don't balloon up for copying receivers
Message-id: <1290005182-29460-1-git-send-email-lersek@redhat.com>
Patchwork-id: 29463
O-Subject: [RHEL5.6 PATCH] netback: no need to balloon up for copying receivers (BZ#653501)
Bugzilla: 653501
RH-Acked-by: Paolo Bonzini <pbonzini@redhat.com>
RH-Acked-by: Andrew Jones <drjones@redhat.com>

Upstream status:

Backport of xen-unstable c/s 14355:68282f4b3e0f (second hunk only):
http://xenbits.xensource.com/xen-unstable.hg?rev/68282f4b3e0f

The unstable patch was also imported to linux-2.6.18-xen, c/s 26:a533be77c572:
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/26

Brew:

https://brewweb.devel.redhat.com/taskinfo?taskID=2899460

----v----
Common description of
- BZ#653505 (taskid 2899447) -- RHEL4.9 netfront: default to copying instead of flipping,
- BZ#653262 (taskid 2899452) -- RHEL5.6 netfront: default to copying instead of flipping,
- and BZ#653501 (taskid 2899460) -- RHEL5.6 netback: no need to balloon up for copying receivers:

Problem:

1. Turn off auto-ballooning in dom0.
2. Create a guest with 512M initial memory and 1024M maxmem.
3. Make sure there is not enough free memory to balloon up the guest completely.
4. Commence ballooning up to maxmem (1024M) in the guest.
5. Watch how the netback driver can't send packets to any guest's netfront driver. This also affects RHEL6 guests, even though RHEL6 guests can only ask for copying.

Loosely quoting BZ#653262 comment 11:

The host dmesg says: "xen_net: Memory squeeze in netback driver."

RHEL[45] netfront is releasing pages to the host, so that netback can take them, and inside them, send packets to the guest. However, when the guest (any guest) is ballooning up simultaneously, there is another consumer contending for the released pages.
  Producer:   any RHEL[45] guest's flipping netfront driver
  Consumer 1: host's netback driver
  Consumer 2: any guest's balloon driver

The guest's balloon driver outruns the host's netback driver, and so the netback driver can't send packets to *any* guest.

Changeset 7c14912 made the balloon driver more aggressive, so it exposes the problem more rapidly.

A guest can avoid being victimized by using "page contents copying" instead of "page flipping" as a means to communicate between netfront and netback. Unfortunately, the "memory squeeze" branch in RHEL5's netback currently backs off even if the guest did request copying, so RHEL6 guests (which only support copying) also get stuck as of now.

RHEL4's and RHEL5's netfront drivers must default to copying instead of flipping. Also, this bug is blocked by bug 653501: copying receivers must not be stalled by netback's memory squeeze.

(To give credit where credit is due, the parts were put together by Paolo, Drew, Chris, and by myself.)

Testing:

Three guests:
  RHEL4 PV: 2.6.9-91.EL.bz_653505_copying_netfrontxenU
  RHEL5 PV: 2.6.18-232.el5.bz_653262_copying_netfrontxen
  RHEL6 PV: 2.6.32-80.el6.x86_64

Host:
  RHEL5: 2.6.18-232.el5.bz_653501_noballoon_copy_netbackxen

  # grep "balloon-dom0" /etc/xen/xend-config.sxp
  (auto-balloon-dom0 no)

  # xm list
  Name                 ID Mem(MiB) VCPUs State   Time(s)
  Domain-0              0     5632     4 r-----    127.3
  rhel4-PV              7     1024     2 -b----      7.8
  rhel55-194-PV         6      512     4 -b----     17.4
  rhel6-PV              5      512     4 -b----      9.1

  # xm info | grep free
  free_memory            : 186

  # xm list -l rhel55-194-PV | grep mem
  (memory 512)
  (shadow_memory 0)
  (maxmem 1024)

  # xm mem-set rhel55-194-PV 1024

  # xm list rhel55-194-PV
  Name                 ID Mem(MiB) VCPUs State   Time(s)
  rhel55-194-PV         6      698     4 -b----     17.6

Now start to ping any guest from the host, ssh into all the guests and paste large amounts of text at the prompt. No more freezing and no more "xen_net: Memory squeeze in netback driver." messages in the host's dmesg.
Verify ballooning is still in progress:

  $ ssh xen-rhel5-pv cat /proc/xen/balloon
  Current allocation:   715276 kB
  Requested target:    1048576 kB
  Low-mem balloon:      341492 kB
  High-mem balloon:          0 kB
  Driver pages:              0 kB
  Xen hard limit:          ??? kB

(Common description ends.)
----^----

Fix:

This patch convinces the RHEL5 host not to balloon up for copying receivers.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>

diff --git a/drivers/xen/netback/netback.c b/drivers/xen/netback/netback.c
index 20d9184..c96a724 100644
--- a/drivers/xen/netback/netback.c
+++ b/drivers/xen/netback/netback.c
@@ -553,6 +553,7 @@ static void net_rx_action(unsigned long unused)
 		*(int *)skb->cb = nr_frags;
 
 		if (!xen_feature(XENFEAT_auto_translated_physmap) &&
+		    !((netif_t *)netdev_priv(skb->dev))->copying_receiver &&
 		    check_mfn(nr_frags + 1)) {
 			/* Memory squeeze? Back off for an arbitrary while. */
 			if ( net_ratelimit() )
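
Note (not part of the patch): the condition changed by the hunk above can be read as a plain predicate. What follows is a minimal standalone sketch, not kernel code. The struct rx_state and the function must_back_off() are hypothetical names introduced only for illustration. The sketch also assumes that a non-zero check_mfn() result means netback could not obtain the pages it wanted.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the three facts the real condition consults.
 * None of these names exist in netback.c.
 */
struct rx_state {
	bool auto_translated;  /* xen_feature(XENFEAT_auto_translated_physmap) */
	bool copying_receiver; /* netif->copying_receiver for skb->dev */
	bool mfn_shortage;     /* assumed: check_mfn(nr_frags + 1) != 0 */
};

/*
 * Returns true when the "memory squeeze" back-off branch would be taken.
 * With the patch applied, a copying receiver never reaches that branch,
 * because netback does not need spare pages to copy into the guest's
 * own buffers.
 */
static bool must_back_off(const struct rx_state *s)
{
	return !s->auto_translated &&
	       !s->copying_receiver &&
	       s->mfn_shortage;
}
```

Under these assumptions, a flipping receiver facing a page shortage still backs off, while a copying receiver (for example a RHEL6 guest) is served regardless of the shortage.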