kvmfornfv -

Age	Commit message	Author	Files	Lines
2016-07-18	KVM: nVMX: avoid incorrect preemption timer vmexit in nested guest	Wanpeng Li	1	-0/+2
	The preemption timer for nested VMX is emulated by an hrtimer that is started on L2 entry, stopped on L2 exit, and evaluated via the check_nested_events hook. However, nested_vmx_exit_handled always returns true for a preemption timer vmexit. As a result, the L1 preemption timer vmexit is captured and treated as an L2 preemption timer vmexit, causing NULL pointer dereferences or worse in the L1 guest's vmexit handler:
	BUG: unable to handle kernel NULL pointer dereference at (null)
	IP: [< (null)>] (null)
	PGD 0
	Oops: 0010 [#1] SMP
	Call Trace:
	 ? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
	 handle_preemption_timer+0xe/0x20 [kvm_intel]
	 vmx_handle_exit+0x169/0x15a0 [kvm_intel]
	 ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
	 kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
	 ? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
	 ? vcpu_load+0x1c/0x60 [kvm]
	 ? kvm_arch_vcpu_load+0x57/0x260 [kvm]
	 kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
	 do_vfs_ioctl+0x96/0x6a0
	 ? __fget_light+0x2a/0x90
	 SyS_ioctl+0x79/0x90
	 do_syscall_64+0x68/0x180
	 entry_SYSCALL64_slow_path+0x25/0x25
	Code: Bad RIP value.
	RIP [< (null)>] (null)
	 RSP <ffff8800b5263c48>
	CR2: 0000000000000000
	---[ end trace 9c70c48b1a2bc66e ]---
	This can be reproduced readily with the preemption timer enabled on L0 and disabled on L1. Return false since preemption timer vmexits must never be reflected to L2.
	Cc: Paolo Bonzini <pbonzini@redhat.com>
	Cc: Radim Krčmář <rkrcmar@redhat.com>
	Cc: Yunhong Jiang <yunhong.jiang@intel.com>
	Cc: Jan Kiszka <jan.kiszka@siemens.com>
	Cc: Haozhong Zhang <haozhong.zhang@intel.com>
	Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
	Change-Id: Iaffcd503666879e8157c8559876330110a66e5c4
	upstream-status: backport
	Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
2016-07-18	KVM: VMX: reflect broken preemption timer in vmcs_config	Paolo Bonzini	1	-3/+3
	Simplify cpu_has_vmx_preemption_timer. This is consistent with the rest of setup_vmcs_config and preparatory for the next patch. Tested-by: Wanpeng Li <kernellwp@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Change-Id: I3b33a881c5e47d5d3046e28374d0b0ca363ffad7 upstream-status: backport Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
2016-07-18	KVM: vmx: fix missed cancellation of TSC deadline timer	Wanpeng Li	1	-24/+24
	INFO: rcu_sched detected stalls on CPUs/tasks:
	 1-...: (11800 GPs behind) idle=45d/140000000000000/0 softirq=0/0 fqs=21663
	 (detected by 0, t=65016 jiffies, g=11500, c=11499, q=719)
	Task dump for CPU 1:
	qemu-system-x86 R running task 0 3529 3525 0x00080808
	 ffff8802021791a0 ffff880212895040 0000000000000001 00007f1c2c00db40
	 ffff8801dd20fcd3 ffffc90002b98000 ffff8801dd20fc88 ffff8801dd20fcf8
	 0000000000000286 ffff8801dd2ac538 ffff8801dd20fcc0 ffffffffc06949c9
	Call Trace:
	 ? kvm_write_guest_cached+0xb9/0x160 [kvm]
	 ? __delay+0xf/0x20
	 ? wait_lapic_expire+0x14a/0x200 [kvm]
	 ? kvm_arch_vcpu_ioctl_run+0xcbe/0x1b00 [kvm]
	 ? kvm_arch_vcpu_ioctl_run+0xe34/0x1b00 [kvm]
	 ? kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
	 ? __fget+0x5/0x210
	 ? do_vfs_ioctl+0x96/0x6a0
	 ? __fget_light+0x2a/0x90
	 ? SyS_ioctl+0x79/0x90
	 ? do_syscall_64+0x7c/0x1e0
	 ? entry_SYSCALL64_slow_path+0x25/0x25
	This can be reproduced readily by running a full dynticks guest (since the hrtimer in the guest is heavily used) with lapic_timer_advance disabled. If programming the hardware preemption timer fails, we fall back to the hrtimer-based method. However, a previously programmed preemption timer is not cancelled in this scenario, so one hardware preemption timer and one hrtimer-emulated TSC deadline timer run simultaneously. As a result, the target guest deadline TSC is sometimes earlier than the guest TSC, so the computation in vmx_set_hv_timer can underflow and set delta_tsc to a huge value, which leads to the host soft lockup shown above. This patch fixes it by cancelling any previously programmed preemption timer once we fail to program the new preemption timer and fall back to the hrtimer-based method.
	Cc: Paolo Bonzini <pbonzini@redhat.com>
	Cc: Radim Krčmář <rkrcmar@redhat.com>
	Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
	Change-Id: I8a2decefab743aecdfab676fb9267324bf42b848
	upstream-status: backport
	Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
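The fallback behaviour described in this commit can be illustrated with a small model. This is only a sketch in Python, not the kernel code: the class `TscDeadlineTimer`, its fields, and the `program_hw` callback are hypothetical names chosen for illustration. It shows the invariant the fix restores, namely that a stale hardware timer is dropped whenever the hrtimer fallback takes over.

```python
class TscDeadlineTimer:
    """Toy model of a vCPU's two TSC deadline timer backends (names hypothetical)."""

    def __init__(self, program_hw):
        # program_hw(deadline) -> bool stands in for programming the hardware
        # preemption timer; returning False means programming failed.
        self.program_hw = program_hw
        self.hv_deadline = None  # deadline held by the hardware timer, if armed
        self.sw_deadline = None  # deadline held by the hrtimer fallback, if armed

    def cancel_hv(self):
        self.hv_deadline = None

    def cancel_sw(self):
        self.sw_deadline = None

    def start(self, deadline):
        """Arm exactly one backend for the given deadline and report which one."""
        if self.program_hw(deadline):
            self.cancel_sw()
            self.hv_deadline = deadline
            return "hv"
        # The fix modelled here: cancel any previously programmed hardware
        # timer before falling back, so the two backends never run together.
        self.cancel_hv()
        self.sw_deadline = deadline
        return "sw"


# First programming attempt succeeds, the second one fails.
outcomes = iter([True, False])
timer = TscDeadlineTimer(lambda deadline: next(outcomes))
timer.start(1000)
timer.start(2000)
```

Without the `cancel_hv()` call in the failure path, the model would end with both `hv_deadline` and `sw_deadline` set, which corresponds to the double-timer situation the commit message describes.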
2016-07-18	KVM: x86: introduce cancel_hv_tscdeadline	Wanpeng Li	1	-8/+10
	Introduce cancel_hv_tscdeadline() to encapsulate cancellation of the preemption timer. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Change-Id: Icc038176cbf361a9ecdf37ed3425108db57617f2 upstream-status: backport Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
2016-07-18	KVM: vmx: fix underflow in TSC deadline calculation	Paolo Bonzini	1	-3/+3
	If the TSC deadline timer is programmed really close to the deadline or even in the past, the computation in vmx_set_hv_timer can underflow and cause delta_tsc to be set to a huge value. This generally results in vmx_set_hv_timer returning -ERANGE, but we can fix it by limiting delta_tsc to be positive or zero. Reported-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Change-Id: I12eea18c3ec648dbf782d7754b7b574d7d6aa92c upstream-status: backport Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
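To make the underflow concrete, here is a minimal sketch in Python that emulates unsigned 64-bit subtraction. The function names `delta_unclamped` and `delta_clamped` are hypothetical and the code is not taken from the patch; it only illustrates why a deadline in the past yields a huge delta and how limiting the result to zero avoids that.

```python
U64_MASK = (1 << 64) - 1


def delta_unclamped(deadline_tsc, now_tsc):
    # Emulates C unsigned 64-bit subtraction, which wraps modulo 2**64.
    return (deadline_tsc - now_tsc) & U64_MASK


def delta_clamped(deadline_tsc, now_tsc):
    # Limit the delta to be positive or zero: a deadline that has already
    # passed is treated as "expire immediately".
    return deadline_tsc - now_tsc if deadline_tsc > now_tsc else 0


# Deadline 50 ticks in the past.
wrapped = delta_unclamped(100, 150)
clamped = delta_clamped(100, 150)
```

In this model `wrapped` is just below 2**64 while `clamped` is zero, which mirrors the "huge value" versus "positive or zero" wording of the commit message.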
2016-07-18	kvm: vmx: hook preemption timer support	Yunhong Jiang	3	-2/+183
	Hook the VMX preemption timer to the "hv timer" functionality added by the previous patch. This includes: checking if the feature is supported, if the feature is broken on the CPU, the hooks to setup/clean the VMX preemption timer, arming the timer on vmentry and handling the vmexit. A module parameter states if the VMX preemption timer should be utilized. Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com> [Move hv_deadline_tsc to struct vcpu_vmx, use -1 as the "unset" value. Put all VMX bits here. Enable it by default #yolo. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Change-Id: Icb8e0b853eedce3d52c394e510fa14d2cdd432e9 upstream-status: backport Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
2016-07-18	kvm: vmx: rename vmx_pre/post_block to pi_pre/post_block	Yunhong Jiang	1	-2/+15
	Prepare to switch from preemption timer to hrtimer in the vmx_pre/post_block. Current functions are only for posted interrupt, rename them accordingly. upstream-status: backport Change-Id: Ie1dde9be21deeb661de095e07d6c29bcba2e7d73 Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
2016-07-18	KVM: x86: support using the vmx preemption timer for tsc deadline timer	Yunhong Jiang	5	-1/+100
	The VMX preemption timer can be used to virtualize the TSC deadline timer. The VMX preemption timer is armed when the vCPU is running, and a VMExit will happen if the virtual TSC deadline timer expires. When the vCPU thread is blocked because of HLT, KVM will switch to use an hrtimer, and then go back to the VMX preemption timer when the vCPU thread is unblocked. This solution avoids the OS's complex hrtimer system and the host timer interrupt handling cost, replacing them with a little math (for guest->host TSC and host TSC->preemption timer conversion) and a cheaper VMexit. This benefits latency for isolated pCPUs.
	[A word about performance... Yunhong reported a 30% reduction in average latency from cyclictest. I made a similar test with tscdeadline_latency from kvm-unit-tests, and measured
	- ~20 clock cycles loss (out of ~3200, so less than 1% but still statistically significant) in the worst case where the test halts just after programming the TSC deadline timer
	- ~800 clock cycles gain (25% reduction in latency) in the best case where the test busy waits.
	I removed the VMX bits from Yunhong's patch, to concentrate them in the next patch - Paolo]
	Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
	Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
	Change-Id: I4aa1ecfa3463d1cbfb317511b45d2074b33d9b6f
	upstream-status: backport
	Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
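The "little math" mentioned above can be sketched roughly as follows. This is an illustrative Python model, not the kernel implementation. It assumes no TSC scaling, so a delta measured in guest TSC ticks equals the same delta in host TSC ticks. The function name `preemption_timer_ticks` and the parameter `rate_shift` are hypothetical; `rate_shift` stands for the power-of-two ratio between the TSC and the preemption timer that the CPU reports, and the 32-bit limit reflects the width of the preemption timer value.

```python
def preemption_timer_ticks(guest_deadline_tsc, guest_tsc_now, rate_shift):
    """Convert a TSC deadline into preemption timer ticks, or None if it does not fit.

    Assumes no TSC scaling, so the guest TSC delta equals the host TSC delta.
    """
    # A deadline in the past is treated as "expire immediately".
    delta_tsc = max(guest_deadline_tsc - guest_tsc_now, 0)
    # The preemption timer counts down once every 2**rate_shift TSC ticks.
    ticks = delta_tsc >> rate_shift
    # The timer value is 32 bits wide; a larger value cannot be programmed,
    # so the caller would have to fall back to another timer.
    if ticks >= 1 << 32:
        return None
    return ticks


near = preemption_timer_ticks(1_000_000, 0, 5)
too_far = preemption_timer_ticks(1 << 40, 0, 5)
```

Returning `None` for an out-of-range value is a choice made for this sketch; it plays the role of an error return that tells the caller to use the hrtimer path instead.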
2016-07-18	kvm: lapic: separate start_sw_tscdeadline from start_apic_timer	Yunhong Jiang	1	-26/+31
	The function to start the tsc deadline timer virtualization will be used also by the pre_block hook when we use the preemption timer; change it to a separate function. No logic changes. upstream-status: backport Change-Id: Ie2fc19108c3252f8a299b17aba16c14aa8d31ae8 Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
2016-05-13	Build uio as module to fix initialization of i40e devices	José Pekkarinen	1	-1/+1
	in bare metal execution of dpdk-16.04. Upstream: NA. Change-Id: Ia98461b15348a667c4989dfe1399f0c5bc0f0c12 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2016-04-25	Add kernel modules required for OPNFV	Don Dugger	1	-130/+732
	The OPNFV environment requires many kernel modules that are not part of the default RT kernel environment. This patch adds those modules back in. Upstream status: NA Change-Id: Id4e63f3d2dd3e19614e9e080adf1cdae9ab26ee1
2016-04-22	Update opnfv kernel config for the new kernel 4.4.6-rt14.	José Pekkarinen	1	-67/+237
	This config file is based on the previous one, with the changes needed for the new kernel version. In-kernel support for CephFS has been added. Upstream: NA. Change-Id: I1de8b4678bdfa81f4fc204f4a02d11f11cb5ae87 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2016-04-13	These changes are the raw update to linux-4.4.6-rt14. Kernel sources	José Pekkarinen	4401	-108088/+188138
	are taken from kernel.org, and the rt patch from the rt wiki download page. During the rebase, the following patch collided: Force tick interrupt and get rid of softirq magic (I70131fb85). The collisions were removed because the patch's logic was already present in the source. Change-Id: I7f57a4081d9deaa0d9ccfc41a6c8daccdee3b769 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2016-01-06	Merge "Build vfio-pci as a module, as dpdk tools expect it to be."	Don Dugger	1	-2/+2

2016-01-06	Fix for KVMFORNFV-23(Fuel installed KVM kernel blank screens on boot).	José Pekkarinen	1	-2/+2
	The configuration of the kernel enables the framebuffer console without any framebuffer selected other than i915. Adding the VESA compliant framebuffer should fix this issue. Change-Id: Icc384e05774e1de20985aeb19dfef25ae2431bb6 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-12-02	Add virtual functions support to ixgbe and i40e.	José Pekkarinen	1	-1/+1
	Change-Id: Iecc8205f443e2e47fc955c80bf6f0aa55db75447 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-11-20	Add configuration to support OVS kernel module	Yunhong Jiang	1	-7/+13
	The OVS kernel module is important for NFV. The existing kernel config is missing several important options, mostly for VxLAN, that the OVS kernel module requires. Also build OVS as a module, so that users can use it if needed, or replace it when using the OPNFV-patched OVS module. Change-Id: I032f84e0468fc2614557274a50f30d502d39cc52 Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
2015-11-17	Fix systemd boot.	José Pekkarinen	1	-1/+1
	Change-Id: I3e91161abafc62554e793e4851df639c099421e7 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-11-17	Build vfio-pci as a module, as dpdk tools expect it to be.	José Pekkarinen	1	-2/+2
	Change-Id: I4c0fe67609fc74fc60e6a2c99d448fd47a6d2a86 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-11-17	Update the kernel config to match the kernel version shipped.	José Pekkarinen	1	-1/+2
	Change-Id: Icd51e1625a57867b2f79cb2b2d1ab21b23bd80e0 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-11-03	Merge "Add the opnfv kernel config file"	Don Dugger	1	-0/+3741

2015-10-29	Add the opnfv kernel config file	Yunhong Jiang	1	-0/+3741
	Kernel config is important for RT linux. A specific config file is provided for the opnfv project to build the reference config options. Currently this config file is assumed to be used for both host and guest kernel. We may split them in future. Change-Id: Ia9ebe0bc002518bb603af8901c6f31531d2f0cee Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>
2015-10-19	These changes are a raw update to a vanilla kernel 4.1.10, with the	José Pekkarinen	61	-362/+561
	recently announced rt patch patch-4.1.10-rt10.patch. No further changes needed. Change-Id: I9a0cf084498133b10771e744b6da4b29dff706ba Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-10-09	Kernel bump from 4.1.3-rt to 4.1.7-rt.	José Pekkarinen	149	-1330/+1285
	These changes bring in a vanilla kernel from kernel.org; the rt patch applied is patch-4.1.7-rt8.patch. No further changes needed. Change-Id: Id8dd03c2ddd971e4d1d69b905f3069737053b700 Signed-off-by: José Pekkarinen <jose.pekkarinen@nokia.com>
2015-08-04	Add the rt linux 4.1.3-rt3 as base	Yunhong Jiang	15866	-0/+3365801
	Import the rt linux 4.1.3-rt3 as OPNFV kvm base.
	It's from git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-4.1.y-rt and the base is:
	  commit 0917f823c59692d751951bf5ea699a2d1e2f26a2
	  Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
	  Date: Sat Jul 25 12:13:34 2015 +0200
	  Prepare v4.1.3-rt3
	  Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
	We lose all the git history this way and it's not good. We should apply another opnfv project repo in future.
	Change-Id: I87543d81c9df70d99c5001fbdf646b202c19f423
	Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>