bbs.cooldavid.org Git - net-next-2.6.git/commit

author	Dave Hansen <dave@linux.vnet.ibm.com>
	Fri, 20 Aug 2010 01:11:37 +0000 (18:11 -0700)
committer	Avi Kivity <avi@redhat.com>
	Sun, 24 Oct 2010 08:51:19 +0000 (10:51 +0200)
commit	45221ab6684a82a5b60208b76d6f8bfb1bbcb969
tree	bdc915bf20cc9dfb40b81b7601ed5182c047d13a
parent	49d5ca26636cb8feb05aff92fc4dba3e494ec683

KVM: create aggregate kvm_total_used_mmu_pages value

Of slab shrinkers, the VM code says:

* Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
* querying the cache size, so a fastpath for that case is appropriate.

and it *means* it.  Look at how it calls the shrinkers:

    nr_before = (*shrinker->shrink)(0, gfp_mask);
    shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);

So, if you do anything stupid in your shrinker, the VM will doubly
punish you.

The mmu_shrink() function takes the global kvm_lock, then acquires
every VM's kvm->mmu_lock in sequence.  If we have 100 VMs, then
we're going to take 101 locks.  The VM calls us twice, so each
shrink pass takes 202 locks.  If we're under memory pressure, we
can have each cpu
trying to do this.  It can get really hairy, and we've seen lock
spinning in mmu_shrink() be the dominant entry in profiles.

This patch is guaranteed to eliminate at least half of those lock
acquisitions.  It removes the need to take any of the locks
when simply trying to count objects.
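
The lock arithmetic above can be checked with a small user-space model.
This is only a sketch, not the kernel code: the names vm_model,
shrink_old(), shrink_new() and locks_for_one_pass() are hypothetical, a
plain long stands in for the aggregate counter, and locks are merely
counted rather than taken.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define NR_VMS 100

struct vm_model {
    long used_mmu_pages;  /* per-VM count; guarded by mmu_lock in the kernel */
};

static struct vm_model vms[NR_VMS];
static long total_used_mmu_pages;  /* aggregate; stand-in for the percpu_counter */
static long locks_taken;           /* modeled lock acquisitions */

/* Keep the aggregate in step with the per-VM count on every change. */
static void model_account_page(struct vm_model *vm, long nr)
{
    vm->used_mmu_pages += nr;
    total_used_mmu_pages += nr;
}

/* Old behaviour: even a pure count walks every VM under its lock. */
static long shrink_old(int nr_to_scan)
{
    long total = 0;

    (void)nr_to_scan;
    locks_taken++;                      /* global kvm_lock */
    for (size_t i = 0; i < NR_VMS; i++) {
        locks_taken++;                  /* this VM's mmu_lock */
        total += vms[i].used_mmu_pages;
    }
    return total;
}

/* New behaviour: nr_to_scan == 0 is answered from the aggregate alone. */
static long shrink_new(int nr_to_scan)
{
    if (nr_to_scan == 0)
        return total_used_mmu_pages;    /* fastpath: no locks at all */

    locks_taken++;                      /* global kvm_lock */
    for (size_t i = 0; i < NR_VMS; i++)
        locks_taken++;                  /* real code would free pages here */
    return total_used_mmu_pages;
}

/* Mimic the VM's two back-to-back calls and report the locks they cost. */
static long locks_for_one_pass(long (*shrink)(int), int this_scan)
{
    locks_taken = 0;
    shrink(0);
    shrink(this_scan);
    return locks_taken;
}
```

With 100 VMs in this model, locks_for_one_pass(shrink_old, n) counts 202
acquisitions and locks_for_one_pass(shrink_new, n) counts 101 for any
nonzero n, which is the "at least half" claimed above.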

A 'percpu_counter' can be a large object, but we only have one
of these for the entire system.  There are not any better
alternatives at the moment, especially ones that handle CPU
hotplug.
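
For readers unfamiliar with the idea behind a per-cpu counter, here is a
simplified user-space sketch.  It is an illustration under stated
assumptions, not the kernel's implementation: model_counter,
MODEL_NR_CPUS and MODEL_BATCH are hypothetical, and the real
percpu_counter additionally handles locking, preemption and CPU hotplug.
Each cpu accumulates a private delta and folds it into the shared count
only once the delta reaches a batch threshold, so a cheap read can lag
behind the exact value.

```c
#include <stdlib.h>

/* Hypothetical model parameters; not taken from the kernel. */
#define MODEL_NR_CPUS 4
#define MODEL_BATCH   32

struct model_counter {
    long count;                 /* shared value; lock-protected in the kernel */
    long local[MODEL_NR_CPUS];  /* per-cpu deltas not yet folded in */
};

/* Add to this cpu's delta; fold into the shared count once it is large. */
static void model_counter_add(struct model_counter *c, int cpu, long amount)
{
    long delta = c->local[cpu] + amount;

    if (labs(delta) >= MODEL_BATCH) {
        c->count += delta;
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = delta;
    }
}

/* Cheap, approximate read: shared count only, clamped at zero. */
static long model_counter_read_positive(const struct model_counter *c)
{
    return c->count > 0 ? c->count : 0;
}

/* Exact read: shared count plus every cpu's pending delta. */
static long model_counter_sum(const struct model_counter *c)
{
    long sum = c->count;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}
```

The approximate read is what makes a lock-free count possible, and an
estimate is sufficient for a shrinker that is only being asked how big
its cache is.  The per-cpu array is also why the object can be large.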

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>