Phenomenon
In the guest, the TSC clocksource is found to be disabled. The boot log shows:
[ 0.000004] tsc: Detected 1999.999 MHz processor
Steps
- Run `lscpu | grep tsc` in the guest to confirm that the CPU supports TSC.
- Use `dmidecode` to check that the VM is not running on IBM's Summit2.
- Run `lscpu | grep constant_tsc` in both the guest and the host to confirm that the flag is absent in the guest.
- Add `tsc=reliable` to the kernel command line if needed (check the current value with `cat /proc/cmdline`).
- Check the CPU vendor with `lscpu` to see whether an Intel CPU is in use.
- Remove `acpi` from the libvirt XML. Note that this makes all hot-plug operations fail.
- Switch to another AMD CPU: the tsc clocksource still cannot be found.

Conclusion: a CPU whose vendor is not Intel hits this issue when running the Linux kernel.
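The flag and clocksource checks from the steps above can also be scripted. The following is a minimal sketch, not the procedure actually used here: the function names `cpu_flags`, `tsc_report`, and `read_live_report` are hypothetical, and it assumes the usual Linux locations `/proc/cpuinfo` and `/sys/devices/system/clocksource/clocksource0/`.

```python
from pathlib import Path

# Assumed standard sysfs location of the clocksource controls on Linux.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def cpu_flags(cpuinfo_text):
    """Return the CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def tsc_report(cpuinfo_text, available, current):
    """Summarize TSC support and whether tsc is offered or used as clocksource."""
    flags = cpu_flags(cpuinfo_text)
    return {
        "has_tsc": "tsc" in flags,
        "has_constant_tsc": "constant_tsc" in flags,
        "tsc_available": "tsc" in available.split(),
        "tsc_in_use": current.strip() == "tsc",
    }


def read_live_report():
    """Build the report from the running system (Linux only)."""
    return tsc_report(
        Path("/proc/cpuinfo").read_text(),
        (CLOCKSOURCE_DIR / "available_clocksource").read_text(),
        (CLOCKSOURCE_DIR / "current_clocksource").read_text(),
    )
```

Calling `read_live_report()` in both the guest and the host gives two dictionaries that can be compared side by side.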
Resolution
Method 1:
Add tsc=reliable to the guest OS kernel command line.
Method 2:
Remove acpi from the libvirt XML using virsh edit.
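To confirm whether method 2 has been applied, the domain XML can be inspected for the ACPI feature element. This is a small sketch that assumes the XML comes from something like `virsh dumpxml` and that ACPI appears as an `<acpi/>` element under `<features>`. The helper name `has_acpi` and the sample XML strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def has_acpi(domain_xml):
    """Return True if the domain XML enables ACPI under <features>."""
    root = ET.fromstring(domain_xml)
    return root.find("./features/acpi") is not None


# Hypothetical, heavily trimmed domain definitions.
WITH_ACPI = "<domain type='kvm'><name>demo</name><features><acpi/><apic/></features></domain>"
WITHOUT_ACPI = "<domain type='kvm'><name>demo</name><features><apic/></features></domain>"

print(has_acpi(WITH_ACPI), has_acpi(WITHOUT_ACPI))  # prints: True False
```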
More information
constant_tsc is disabled for x86 guests in code; see the discussion: https://lore.kernel.org/qemu-devel/20210426185155.GA32439@fuller.cnet/t/
The Linux guest kernel, however, still lets Intel CPUs use the TSC, and as a result Intel and AMD CPUs behave differently, as shown below.
Walking through the kernel code:
Code snippet 0
/*
The code defines a static integer variable tsc_unstable. The static keyword means that this variable is only visible within the file it’s defined in, limiting its scope to tsc.c. This is a common practice when you want to keep the variable private to a particular file and prevent it from being accessed directly from other files.
The __read_mostly attribute is a compiler directive specific to the Linux kernel, which suggests that this variable is expected to be read from often, but seldom written to. This can help the compiler optimize the generated code for this access pattern.
The variable tsc_unstable is used to indicate whether the TSC is unstable. The comment above the variable declaration explains that the TSC can be unstable due to CPU frequency scaling (cpufreq) or due to unsynchronized TSCs across different cores or processors.
In terms of improvements, the code is quite clear and concise. However, it might be helpful to expand the comment to explain what the implications of an unstable TSC are, and how the tsc_unstable variable is used in the system’s logic.
Code snippet 1
static int __init tsc_setup(char *str)
The __init keyword before the function name is a macro used in the Linux kernel to indicate that this function is used only at initialization time. The memory for this function is freed after the initialization is complete to conserve the kernel’s memory footprint.
Inside the function, there are several if statements that compare the input string str to different string literals using the strcmp and strncmp functions. strcmp compares two strings and returns 0 if they are identical, while strncmp compares up to a specified number of characters from two strings.
If str is “reliable”, it sets the tsc_clocksource_reliable variable to 1. If str starts with “noirqtime”, it sets the no_sched_irq_time variable to 1. If str is “unstable”, it calls the mark_tsc_unstable function with “boot parameter” as an argument. If str is “nowatchdog”, it sets the no_tsc_watchdog variable to 1.
Each of these variables or functions presumably controls a different aspect of the TSC’s behavior. For example, tsc_clocksource_reliable might indicate whether the TSC is a reliable source of time, and no_sched_irq_time might control whether the scheduler uses the TSC for interrupt timing.
In terms of improvements, the function is quite clear and concise. However, adding comments to explain the purpose of each variable and what each string argument represents would improve readability. It would also be beneficial to add error handling to account for the case where str does not match any of the expected values.
So we can add tsc=reliable to the kernel command line to make the guest switch to the tsc clocksource.
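The option handling described above can be modelled in user space. The sketch below is not kernel code: it mirrors the described `strcmp`/`strncmp` checks in Python, and the `state` dictionary keys are named after the kernel variables mentioned in the text. The helper `tsc_options` and the sample command line are hypothetical.

```python
def tsc_options(cmdline):
    """Return the value of every tsc= parameter on a kernel command line."""
    return [tok.split("=", 1)[1] for tok in cmdline.split() if tok.startswith("tsc=")]


def tsc_setup(value, state):
    """Model of the described checks: exact matches, plus a prefix match for noirqtime."""
    if value == "reliable":
        state["tsc_clocksource_reliable"] = 1
    if value.startswith("noirqtime"):
        state["no_sched_irq_time"] = 1
    if value == "unstable":
        state["tsc_unstable"] = 1  # stands in for mark_tsc_unstable("boot parameter")
    if value == "nowatchdog":
        state["no_tsc_watchdog"] = 1
    return state


# Hypothetical command line, in the format of /proc/cmdline.
cmdline = "BOOT_IMAGE=/vmlinuz root=/dev/vda1 ro tsc=reliable"
state = {}
for value in tsc_options(cmdline):
    tsc_setup(value, state)
print(state)  # prints: {'tsc_clocksource_reliable': 1}
```

An unrecognized value simply leaves `state` untouched, which matches the description: none of the branches fire.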
Code snippet 2
static void tsc_cs_mark_unstable(struct clocksource *cs)
The function first checks if the tsc_unstable variable is already set to 1. If it is, the function immediately returns, as the TSC has already been marked as unstable. This is a common pattern in C programming known as a “guard clause”, which is used to exit a function early when certain conditions are met.
If tsc_unstable is not set to 1, the function proceeds to mark the TSC as unstable. It does this by setting tsc_unstable to 1, and then calling two functions: clear_sched_clock_stable and disable_sched_clock_irqtime. These functions presumably perform some cleanup or configuration changes related to the TSC becoming unstable.
Finally, the function logs a message using the pr_info macro, which is a kernel print function that outputs a message to the system log. The message indicates that the TSC has been marked as unstable due to the clocksource watchdog.
In terms of improvements, the function is quite clear and concise. However, adding comments to explain the purpose of the clear_sched_clock_stable and disable_sched_clock_irqtime functions would improve readability. It would also be beneficial to add error handling to account for any potential issues that could occur when these functions are called.
Code snippet 3
/*
The function begins by checking whether the boot CPU lacks the TSC feature or whether the TSC has already been marked unstable. If either of these conditions is true, the function immediately returns 1, indicating that the TSC is unsynchronized.
Next, if the system is configured for symmetric multiprocessing (SMP), the function checks if the Advanced Programmable Interrupt Controller (APIC) is clustered. If it is, the function returns 1, again indicating that the TSC is unsynchronized.
The function then checks if the boot CPU has the constant TSC feature or if the TSC clocksource is reliable. If either of these conditions is true, the function returns 0, indicating that the TSC is synchronized.
Finally, the function checks the CPU vendor. If the vendor is not Intel and the system has more than one possible CPU, the function returns 1, indicating that the TSC is unsynchronized. If none of the previous conditions are met, the function returns 0, indicating that the TSC is synchronized. This last check is why a multi-vCPU guest on a non-Intel CPU, with no constant_tsc flag and no tsc=reliable, ends up without the tsc clocksource.
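The decision order can be written as a small user-space model. This is only a sketch: the `TscInputs` dataclass and its field names are hypothetical stand-ins for the kernel state, and it assumes the first check fires when the boot CPU lacks the TSC feature or the TSC is already marked unstable.

```python
from dataclasses import dataclass


@dataclass
class TscInputs:
    """Hypothetical snapshot of the inputs the kernel function looks at."""
    has_tsc: bool = True
    tsc_unstable: bool = False
    apic_clustered: bool = False      # only relevant on SMP kernels
    has_constant_tsc: bool = False
    clocksource_reliable: bool = False  # set by tsc=reliable
    vendor_is_intel: bool = True
    possible_cpus: int = 1


def unsynchronized_tsc(c):
    """Return 1 if the TSC is treated as unsynchronized, else 0."""
    if not c.has_tsc or c.tsc_unstable:
        return 1
    if c.apic_clustered:
        return 1
    if c.has_constant_tsc or c.clocksource_reliable:
        return 0
    # Non-Intel systems with several CPUs are assumed to be unsynchronized.
    if not c.vendor_is_intel and c.possible_cpus > 1:
        return 1
    return 0


amd_guest = TscInputs(vendor_is_intel=False, possible_cpus=4)
print(unsynchronized_tsc(amd_guest))  # prints: 1

amd_guest.clocksource_reliable = True  # effect of tsc=reliable
print(unsynchronized_tsc(amd_guest))  # prints: 0
```

The two calls reproduce the behaviour seen here: a multi-vCPU AMD guest without constant_tsc is treated as unsynchronized until tsc=reliable is set.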
More practice
SystemTap
Because of the issue above, I spent some more time using SystemTap to check whether the TSC values seen by the guest differ from those on the host CPU.
observe rdtsc
result
The TSC values, their average, and their standard deviation all differ between guest and host.
The values in the guest OS are not stable compared with the host.
During live migration, the TSC value is smaller than usual (I think this is because live migration has downtime, so the TSC has to be adjusted to tolerate it).
So even from this small test, it is not a good idea to rely on the guest TSC, which is not as precise as it is on the host.
data from my test
The first version of the script measures the average value and standard deviation.
In the guest:
TSC mean: 2000170717.800000, TSC std dev: 255861.233545
In the guest during live migration:
TSC mean: 1990107194.600000, TSC std dev: 71113321.983521
Samples from the host:
TSC mean: 2000087563.600000, TSC std dev: 16626.290598
The average TSC value is lower than normal during migration.
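For reference, these statistics can be computed as follows. This is a sketch rather than the original script. It assumes the inputs are raw TSC readings taken at roughly one-second intervals, and that the sample standard deviation is wanted (the original may have used the population form). The function name `tsc_stats` and the toy readings are hypothetical.

```python
import statistics


def tsc_stats(readings):
    """Return (mean, sample std dev) of the deltas between consecutive TSC readings."""
    deltas = [later - earlier for earlier, later in zip(readings, readings[1:])]
    return statistics.mean(deltas), statistics.stdev(deltas)


# Toy readings, not measured data: the deltas are 10, 12 and 14.
mean, std_dev = tsc_stats([0, 10, 22, 36])
print(f"TSC mean: {mean:f}, TSC std dev: {std_dev:f}")
# prints: TSC mean: 12.000000, TSC std dev: 2.000000
```

At least three readings are needed, since `statistics.stdev` requires two or more deltas.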
Then I changed the script to flag abnormal samples:
Sample 54, TSC diff: 1998893560, Time diff: 1000069363 ns
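One way to judge whether such a sample is abnormal is to turn it into an effective TSC rate and compare it with the nominal frequency. The sketch below is an assumption about how that check could look, not the original script. The nominal 1999.999 MHz comes from the boot log above, while the 100 ppm threshold and the function names are hypothetical.

```python
NOMINAL_HZ = 1_999_999_000  # 1999.999 MHz, as reported in the guest boot log
THRESHOLD_PPM = 100         # hypothetical tolerance, chosen for illustration


def rate_error_ppm(tsc_diff, time_diff_ns, nominal_hz=NOMINAL_HZ):
    """Deviation of the observed TSC rate from nominal, in parts per million."""
    observed_hz = tsc_diff * 1_000_000_000 / time_diff_ns
    return (observed_hz - nominal_hz) / nominal_hz * 1_000_000


def is_abnormal(tsc_diff, time_diff_ns, threshold_ppm=THRESHOLD_PPM):
    """Flag a sample whose rate error exceeds the threshold in either direction."""
    return abs(rate_error_ppm(tsc_diff, time_diff_ns)) > threshold_ppm


# Sample 54 from the output above.
error = rate_error_ppm(1998893560, 1000069363)
print(f"rate error: {error:.1f} ppm, abnormal: {is_abnormal(1998893560, 1000069363)}")
```

For sample 54 the observed rate is roughly 620 ppm below nominal, so it is flagged under this threshold.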
Here is my test code:
1 |
|
Further reading
TSC
Time Stamp Counter (TSC)
All 80x86 microprocessors include a CLK input pin, which receives the clock signal of an external oscillator. Starting with the Pentium, 80x86 microprocessors sport a counter that is increased at each clock signal, and is accessible through the TSC register which can be read by means of the rdtsc assembly instruction. When using this register the kernel has to take into consideration the frequency of the clock signal: if, for instance, the clock ticks at 1 GHz, the TSC is increased once every nanosecond. Linux may take advantage of this register to get much more accurate time measurements.
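The relationship between TSC ticks and elapsed time is a simple proportion. As a minimal sketch (the function name `cycles_to_ns` is hypothetical, and a constant tick rate is assumed):

```python
NSEC_PER_SEC = 1_000_000_000


def cycles_to_ns(cycles, freq_hz):
    """Convert a TSC cycle count to nanoseconds, assuming a constant tick rate."""
    return cycles * NSEC_PER_SEC // freq_hz


print(cycles_to_ns(1, 1_000_000_000))  # prints: 1
print(cycles_to_ns(1_999_999_000, 1_999_999_000))  # prints: 1000000000
```

Integer arithmetic is used so that large counter values do not lose precision to floating-point rounding.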