From: Glauber Costa <glommer@redhat.com>
Date: Mon, 7 Jun 2010 17:48:14 -0400
Subject: [virt] fix tsccount clocksource under kvm guests
Message-id: <1275932894-6887-1-git-send-email-glommer@redhat.com>
Patchwork-id: 25993
O-Subject: [PATCH RHEL5.6] Fix tsccount clocksource under kvm guests
Bugzilla: 581396
RH-Acked-by: Rik van Riel <riel@redhat.com>
RH-Acked-by: Zachary Amsden <zamsden@redhat.com>
RH-Acked-by: Prarit Bhargava <prarit@redhat.com>
RH-Acked-by: Jes Sorensen <Jes.Sorensen@redhat.com>

get_hypervisor_cycles_per_tick() was written with kvm guests running
kvmclock in mind. The calculation does not take cpu frequency into
account, since kvm clock runs in units of nanoseconds already.

However, this is wrong, and should not be done in this function.
There is the possibility that one wants to run clock=tsccount guests
on kvm, in which case, this will break. Although this use case is not
a major one, it is important, especially in the light of recent
kvmclock breakages.

The fix is to have this function do what it is meant to do, which is,
converting cycles to nanoseconds, and special case kvmclock guests in
time.c.

RH-Bugzilla: 581396
RH-Upstream-status: N/A (RHEL5.64 only-code)

Signed-off-by: Glauber Costa <glommer@redhat.com>

diff --git a/arch/i386/kernel/cpu/hypervisor.c b/arch/i386/kernel/cpu/hypervisor.c
index 6f988ac..9700158 100644
--- a/arch/i386/kernel/cpu/hypervisor.c
+++ b/arch/i386/kernel/cpu/hypervisor.c
@@ -52,11 +52,7 @@ unsigned long get_hypervisor_tsc_freq(void)
 
 unsigned long get_hypervisor_cycles_per_tick(void)
 {
-
-	if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_KVM)
-		return 1000000000 / REAL_HZ;
-	else /* Same thing for VMware or baremetal, in case we force it */
-		return (cpu_khz * 1000) / REAL_HZ;
+	return (cpu_khz * 1000) / REAL_HZ;
 }
 
 static inline void __cpuinit
diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
index 9c9e88d..cb6c8f3 100644
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -1449,7 +1449,10 @@ void __init time_init(void)
 
 	/* Keep time based on the TSC rather than by counting interrupts. */
 	if (timekeeping_use_tsc > 0) {
-		cycles_per_tick = get_hypervisor_cycles_per_tick();
+		if (use_kvm_time) /* KVM time is already in nanoseconds units */
+			cycles_per_tick = 1000000000 / REAL_HZ;
+		else
+			cycles_per_tick = get_hypervisor_cycles_per_tick();
 		/*
 		 * The maximum cycles we will account per
 		 * timer interrupt is 10 minutes.