The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description
----------------------

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine. The ioctls belong to three classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem. In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout. In addition a VM ioctl is used to
   create virtual cpus (vcpus).

   Only run VM ioctls from the same process (address space) that was used
   to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   Only run vcpu ioctls from the same thread that was used to create the
   vcpu.


2. File descriptors
-------------------

The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
fd can be used to control the vcpu, including the important task of
actually running guest code.

In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of unix domain sockets. These
kinds of tricks are explicitly not supported by kvm. While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API. The only supported use is one virtual machine per process,
and one vcpu per thread.


3. Extensions
-------------

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed. However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available. If it is, a
set of ioctls is available for application use.


4. API description
------------------

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability: which KVM extension provides this ioctl. Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4), or 'none' which means that while not all kernels
      support this ioctl, there's no capability bit to check its
      availability: for kernels that don't support the ioctl,
      the ioctl returns -ENOTTY.

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type: system, vm, or vcpu.

  Parameters: what parameters are accepted by the ioctl.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


4.1 KVM_GET_API_VERSION

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: the constant KVM_API_VERSION (=12)

This identifies the API version as the stable kvm API. It is not
expected that this number will change. However, Linux 2.6.20 and
2.6.21 report earlier versions; these are not documented and not
supported.
Applications should refuse to run if KVM_GET_API_VERSION
returns a value other than 12. If this check passes, all ioctls
described as 'basic' will be available.


4.2 KVM_CREATE_VM

Capability: basic
Architectures: all
Type: system ioctl
Parameters: machine type identifier (KVM_VM_*)
Returns: a VM fd that can be used to control the new virtual machine.

The new VM has no virtual cpus and no memory. An mmap() of a VM fd
will access the virtual machine's physical address space; offset zero
corresponds to guest physical address zero. Use of mmap() on a VM fd
is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
available.
You almost certainly want to use 0 as the machine type.

In order to create user controlled virtual machines on S390, check
KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
privileged user (CAP_SYS_ADMIN).


4.3 KVM_GET_MSR_INDEX_LIST

Capability: basic
Architectures: x86
Type: system
Parameters: struct kvm_msr_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the msr index list is too big to fit in the array specified by
             the user.

struct kvm_msr_list {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 indices[0];
};

This ioctl returns the guest msrs that are supported. The list varies
by kvm version and host processor, but does not change otherwise. The
user fills in the size of the indices array in nmsrs, and in return
kvm adjusts nmsrs to reflect the actual number of msrs and fills in
the indices array with their numbers.

Note: if kvm indicates support for MCE (KVM_CAP_MCE), then the MCE bank MSRs
are not returned in the MSR list, as different vcpus can have a different
number of banks, as set via the KVM_X86_SETUP_MCE ioctl.
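
As a minimal sketch of the handle hierarchy from section 2 and the version
check from section 4.1, the C helper below opens /dev/kvm, refuses to
continue unless the API version is 12, and creates a VM with machine type 0.
The helper name kvm_open_vm and its error convention are assumptions made
for illustration; they are not part of the kvm API.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: returns a VM fd on success and stores the system
 * fd in *sys_fd_out. Returns -1 if /dev/kvm is unavailable, reports an
 * unexpected API version, or refuses to create a VM.
 */
int kvm_open_vm(int *sys_fd_out)
{
	int sys_fd, vm_fd;

	assert(sys_fd_out != NULL);

	/* System handle: used for system ioctls only. */
	sys_fd = open("/dev/kvm", O_RDWR);
	if (sys_fd < 0)
		return -1;

	/* Refuse to run on anything but the stable API (version 12). */
	if (ioctl(sys_fd, KVM_GET_API_VERSION, 0) != KVM_API_VERSION) {
		close(sys_fd);
		return -1;
	}

	/* Machine type 0; the new VM has no vcpus and no memory yet. */
	vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0UL);
	if (vm_fd < 0) {
		close(sys_fd);
		return -1;
	}

	*sys_fd_out = sys_fd;
	return vm_fd;
}
```

The caller keeps both descriptors: the system fd for further system ioctls
such as KVM_CHECK_EXTENSION, and the VM fd for VM ioctls.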


4.4 KVM_CHECK_EXTENSION

Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
Architectures: all
Type: system ioctl, vm ioctl
Parameters: extension identifier (KVM_CAP_*)
Returns: 0 if unsupported; 1 (or some other positive integer) if supported

The API allows the application to query about extensions to the core
kvm API. Userspace passes an extension identifier (an integer) and
receives an integer that describes the extension availability.
Generally 0 means no and 1 means yes, but some extensions may report
additional information in the integer return value.

Based on their initialization different VMs may have different capabilities.
It is thus encouraged to use the vm ioctl to query for capabilities (available
with KVM_CAP_CHECK_EXTENSION_VM on the vm fd).


4.5 KVM_GET_VCPU_MMAP_SIZE

Capability: basic
Architectures: all
Type: system ioctl
Parameters: none
Returns: size of vcpu mmap area, in bytes

The KVM_RUN ioctl (cf.) communicates with userspace via a shared
memory region. This ioctl returns the size of that region. See the
KVM_RUN documentation for details.


4.6 KVM_SET_MEMORY_REGION

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: struct kvm_memory_region (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.


4.7 KVM_CREATE_VCPU

Capability: basic
Architectures: all
Type: vm ioctl
Parameters: vcpu id (apic id on x86)
Returns: vcpu fd on success, -1 on error

This API adds a vcpu to a virtual machine. The vcpu id is a small integer
in the range [0, max_vcpus).

The recommended max_vcpus value can be retrieved using the KVM_CAP_NR_VCPUS of
the KVM_CHECK_EXTENSION ioctl() at run-time.
The maximum possible value for max_vcpus can be retrieved using the
KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.

If the KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is 4
cpus max.
If the KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
the same as the value returned from KVM_CAP_NR_VCPUS.

On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
threads in one or more virtual CPU cores. (This is because the
hardware requires all the hardware threads in a CPU core to be in the
same partition.) The KVM_CAP_PPC_SMT capability indicates the number
of vcpus per virtual core (vcore). The vcore id is obtained by
dividing the vcpu id by the number of vcpus per vcore. The vcpus in a
given vcore will always be in the same physical core as each other
(though that might be a different physical core from time to time).
Userspace can control the threading (SMT) mode of the guest by its
allocation of vcpu ids. For example, if userspace wants
single-threaded guest vcpus, it should make all vcpu ids be a multiple
of the number of vcpus per vcore.

For virtual cpus that have been created with S390 user controlled virtual
machines, the resulting vcpu fd can be memory mapped at page offset
KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual
cpu's hardware control block.


4.8 KVM_GET_DIRTY_LOG (vm ioctl)

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_dirty_log (in/out)
Returns: 0 on success, -1 on error

/* for KVM_GET_DIRTY_LOG */
struct kvm_dirty_log {
	__u32 slot;
	__u32 padding1;
	union {
		void __user *dirty_bitmap; /* one bit per page */
		__u64 padding2;
	};
};

Given a memory slot, return a bitmap containing any pages dirtied
since the last call to this ioctl. Bit 0 is the first page in the
memory slot. Ensure the entire structure is cleared to avoid padding
issues.
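
To illustrate how the returned bitmap can be consumed, here is a small
sketch in C. It only covers the bitmap arithmetic: the buffer is assumed to
have been filled by KVM_GET_DIRTY_LOG. The helper names are hypothetical,
and two details are assumptions rather than statements from this document:
that the buffer should be sized in multiples of 64 bits, and that bits are
numbered from the least significant bit of each byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes to allocate for a slot of npages pages: one bit per page,
 * rounded up to a multiple of 64 bits (assumed, conservative sizing).
 */
size_t dirty_bitmap_bytes(size_t npages)
{
	return ((npages + 63) / 64) * 8;
}

/* Bit 0 is the first page in the slot (assumed LSB-first within bytes). */
int page_is_dirty(const uint8_t *bitmap, size_t page)
{
	assert(bitmap != NULL);
	return (bitmap[page / 8] >> (page % 8)) & 1;
}

/* Count the pages reported dirty among the first npages pages. */
size_t count_dirty_pages(const uint8_t *bitmap, size_t npages)
{
	size_t i, n = 0;

	for (i = 0; i < npages; i++)
		n += (size_t)page_is_dirty(bitmap, i);
	return n;
}
```

A caller would allocate dirty_bitmap_bytes(npages) bytes, zero a struct
kvm_dirty_log, set its slot and dirty_bitmap fields, issue the ioctl on the
VM fd, and then walk the buffer with page_is_dirty().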


4.9 KVM_SET_MEMORY_ALIAS

Capability: basic
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_memory_alias (in)
Returns: 0 on success, -1 on error

This ioctl is obsolete and has been removed.


4.10 KVM_RUN

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error
Errors:
  EINTR:     an unmasked signal is pending

This ioctl is used to run a guest virtual cpu. While there are no
explicit parameters, there is an implicit parameter block that can be
obtained by mmap()ing the vcpu fd at offset 0, with the size given by
KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
kvm_run' (see below).


4.11 KVM_GET_REGS

Capability: basic
Architectures: all except ARM, arm64
Type: vcpu ioctl
Parameters: struct kvm_regs (out)
Returns: 0 on success, -1 on error

Reads the general purpose registers from the vcpu.

/* x86 */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 rax, rbx, rcx, rdx;
	__u64 rsi, rdi, rsp, rbp;
	__u64 r8,  r9,  r10, r11;
	__u64 r12, r13, r14, r15;
	__u64 rip, rflags;
};

/* mips */
struct kvm_regs {
	/* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
	__u64 gpr[32];
	__u64 hi;
	__u64 lo;
	__u64 pc;
};


4.12 KVM_SET_REGS

Capability: basic
Architectures: all except ARM, arm64
Type: vcpu ioctl
Parameters: struct kvm_regs (in)
Returns: 0 on success, -1 on error

Writes the general purpose registers into the vcpu.

See KVM_GET_REGS for the data structure.


4.13 KVM_GET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (out)
Returns: 0 on success, -1 on error

Reads special registers from the vcpu.

/* x86 */
struct kvm_sregs {
	struct kvm_segment cs, ds, es, fs, gs, ss;
	struct kvm_segment tr, ldt;
	struct kvm_dtable gdt, idt;
	__u64 cr0, cr2, cr3, cr4, cr8;
	__u64 efer;
	__u64 apic_base;
	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
};

/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */

interrupt_bitmap is a bitmap of pending external interrupts. At most
one bit may be set. This interrupt has been acknowledged by the APIC
but not yet injected into the cpu core.


4.14 KVM_SET_SREGS

Capability: basic
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_sregs (in)
Returns: 0 on success, -1 on error

Writes special registers into the vcpu. See KVM_GET_SREGS for the
data structures.


4.15 KVM_TRANSLATE

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_translation (in/out)
Returns: 0 on success, -1 on error

Translates a virtual address according to the vcpu's current address
translation mode.

struct kvm_translation {
	/* in */
	__u64 linear_address;

	/* out */
	__u64 physical_address;
	__u8  valid;
	__u8  writeable;
	__u8  usermode;
	__u8  pad[5];
};


4.16 KVM_INTERRUPT

Capability: basic
Architectures: x86, ppc, mips
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, -1 on error

Queues a hardware interrupt vector to be injected. This is only
useful if in-kernel local APIC or equivalent is not used.

/* for KVM_INTERRUPT */
struct kvm_interrupt {
	/* in */
	__u32 irq;
};

X86:

Note 'irq' is an interrupt vector, not an interrupt pin or line.

PPC:

Queues an external interrupt to be injected.
This ioctl is overloaded
with 3 different irq values:

a) KVM_INTERRUPT_SET

  This injects an edge type external interrupt into the guest once it's ready
  to receive interrupts. When injected, the interrupt is done.

b) KVM_INTERRUPT_UNSET

  This unsets any pending interrupt.

  Only available with KVM_CAP_PPC_UNSET_IRQ.

c) KVM_INTERRUPT_SET_LEVEL

  This injects a level type external interrupt into the guest context. The
  interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
  is triggered.

  Only available with KVM_CAP_PPC_IRQ_LEVEL.

Note that any value for 'irq' other than the ones stated above is invalid
and results in unexpected behavior.

MIPS:

Queues an external interrupt to be injected into the virtual CPU. A negative
interrupt number dequeues the interrupt.


4.17 KVM_DEBUG_GUEST

Capability: basic
Architectures: none
Type: vcpu ioctl
Parameters: none
Returns: -1 on error

Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.


4.18 KVM_GET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in/out)
Returns: number of msrs successfully returned; -1 on error

Reads model-specific registers from the vcpu. Supported msr indices can
be obtained using KVM_GET_MSR_INDEX_LIST.

struct kvm_msrs {
	__u32 nmsrs; /* number of msrs in entries */
	__u32 pad;

	struct kvm_msr_entry entries[0];
};

struct kvm_msr_entry {
	__u32 index;
	__u32 reserved;
	__u64 data;
};

Application code should set the 'nmsrs' member (which indicates the
size of the entries array) and the 'index' member of each array entry.
kvm will fill in the 'data' member.
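
Because struct kvm_msrs ends in a variable-length array, callers have to
size the allocation themselves. The sketch below shows one way to do that
in C and to read a single MSR. The helper names are hypothetical. It also
assumes that the ioctl reports how many entries it filled in, which is how
current x86 kernels behave, so a result other than 1 is treated as failure
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Allocate a zeroed struct kvm_msrs with room for n entries. */
struct kvm_msrs *alloc_msrs(uint32_t n)
{
	struct kvm_msrs *m;

	m = calloc(1, sizeof(*m) + (size_t)n * sizeof(struct kvm_msr_entry));
	if (m != NULL)
		m->nmsrs = n;	/* size of the entries array */
	return m;
}

/* Read one MSR from a vcpu. Returns 0 on success, -1 on failure. */
int read_one_msr(int vcpu_fd, uint32_t index, uint64_t *value)
{
	struct kvm_msrs *m;
	int r;

	assert(value != NULL);

	m = alloc_msrs(1);
	if (m == NULL)
		return -1;

	m->entries[0].index = index;	/* kvm fills in 'data' */
	r = ioctl(vcpu_fd, KVM_GET_MSRS, m);
	if (r == 1)			/* assumed: count of entries read */
		*value = m->entries[0].data;

	free(m);
	return r == 1 ? 0 : -1;
}
```

The same allocation helper can be reused for KVM_SET_MSRS, where the caller
fills in both 'index' and 'data' for every entry.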


4.19 KVM_SET_MSRS

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_msrs (in)
Returns: number of msrs successfully set; -1 on error

Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
data structures.

Application code should set the 'nmsrs' member (which indicates the
size of the entries array), and the 'index' and 'data' members of each
array entry.


4.20 KVM_SET_CPUID

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_cpuid (in)
Returns: 0 on success, -1 on error

Defines the vcpu responses to the cpuid instruction. Applications
should use the KVM_SET_CPUID2 ioctl if available.


struct kvm_cpuid_entry {
	__u32 function;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding;
};

/* for KVM_SET_CPUID */
struct kvm_cpuid {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry entries[0];
};


4.21 KVM_SET_SIGNAL_MASK

Capability: basic
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_signal_mask (in)
Returns: 0 on success, -1 on error

Defines which signals are blocked during execution of KVM_RUN. This
signal mask temporarily overrides the thread's signal mask. Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.

Note the signal will only be delivered if not blocked by the original
signal mask.

/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
	__u32 len;
	__u8  sigset[0];
};


4.22 KVM_GET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (out)
Returns: 0 on success, -1 on error

Reads the floating point state from the vcpu.

/* for KVM_GET_FPU and KVM_SET_FPU */
struct kvm_fpu {
	__u8  fpr[8][16];
	__u16 fcw;
	__u16 fsw;
	__u8  ftwx;  /* in fxsave format */
	__u8  pad1;
	__u16 last_opcode;
	__u64 last_ip;
	__u64 last_dp;
	__u8  xmm[16][16];
	__u32 mxcsr;
	__u32 pad2;
};


4.23 KVM_SET_FPU

Capability: basic
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_fpu (in)
Returns: 0 on success, -1 on error

Writes the floating point state to the vcpu.

See KVM_GET_FPU for the data structure.


4.24 KVM_CREATE_IRQCHIP

Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
Architectures: x86, ARM, arm64, s390
Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel.
On x86, creates a virtual ioapic, a virtual PIC (two PICs, nested), and sets up
future vcpus to have a local APIC. IRQ routing for GSIs 0-15 is set to both
PIC and IOAPIC; GSI 16-23 only go to the IOAPIC.
On ARM/arm64, a GICv2 is created. Any other GIC versions require the usage of
KVM_CREATE_DEVICE, which also supports creating a GICv2. Using
KVM_CREATE_DEVICE is preferred over KVM_CREATE_IRQCHIP for GICv2.
On s390, a dummy irq routing table is created.

Note that on s390 the KVM_CAP_S390_IRQCHIP vm capability needs to be enabled
before KVM_CREATE_IRQCHIP can be used.


4.25 KVM_IRQ_LINE

Capability: KVM_CAP_IRQCHIP
Architectures: x86, arm, arm64
Type: vm ioctl
Parameters: struct kvm_irq_level
Returns: 0 on success, -1 on error

Sets the level of a GSI input to the interrupt controller model in the kernel.
On some architectures it is required that an interrupt controller model has
been previously created with KVM_CREATE_IRQCHIP. Note that edge-triggered
interrupts require the level to be set to 1 and then back to 0.

On real hardware, interrupt pins can be active-low or active-high. This
does not matter for the level field of struct kvm_irq_level: 1 always
means active (asserted), 0 means inactive (deasserted).

x86 allows the operating system to program the interrupt polarity
(active-low/active-high) for level-triggered interrupts, and KVM used
to consider the polarity. However, due to bitrot in the handling of
active-low interrupts, the above convention is now valid on x86 too.
This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED. Userspace
should not present interrupts to the guest as active-low unless this
capability is present (or unless it is not using the in-kernel irqchip,
of course).


ARM/arm64 can signal an interrupt either at the CPU level, or at the
in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
use PPIs designated for specific cpus. The irq field is interpreted
like this:

  bits:  | 31 ... 24 | 23  ... 16 | 15    ...    0 |
  field: | irq_type  | vcpu_index |     irq_id     |

The irq_type field has the following values:
- irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
- irq_type[1]: in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
               (the vcpu_index field is ignored)
- irq_type[2]: in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)

(The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)

In both cases, level is used to assert/deassert the line.
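
The ARM/arm64 irq field layout above can be composed and decoded with plain
shifts and masks. The following C sketch does exactly that; the enum and
function names are hypothetical and chosen for illustration only. It checks
that each value fits its bit field but does not enforce the per-type irq_id
ranges listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names for the three irq_type values described above. */
enum arm_irq_kind {
	ARM_IRQ_KIND_CPU = 0,	/* out-of-kernel GIC: irq_id 0 = IRQ, 1 = FIQ */
	ARM_IRQ_KIND_SPI = 1,	/* in-kernel GIC, shared peripheral interrupt */
	ARM_IRQ_KIND_PPI = 2	/* in-kernel GIC, private peripheral interrupt */
};

/* Pack irq_type (bits 31..24), vcpu_index (23..16) and irq_id (15..0). */
uint32_t arm_irq_encode(uint32_t irq_type, uint32_t vcpu_index,
			uint32_t irq_id)
{
	assert(irq_type <= 0xff);
	assert(vcpu_index <= 0xff);
	assert(irq_id <= 0xffff);
	return (irq_type << 24) | (vcpu_index << 16) | irq_id;
}

uint32_t arm_irq_get_type(uint32_t irq)
{
	return irq >> 24;
}

uint32_t arm_irq_get_vcpu(uint32_t irq)
{
	return (irq >> 16) & 0xff;
}

uint32_t arm_irq_get_id(uint32_t irq)
{
	return irq & 0xffff;
}
```

For example, PPI 27 targeted at vcpu 3 encodes as 0x0203001b; that value
would be stored in the irq member of struct kvm_irq_level before issuing
KVM_IRQ_LINE.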

struct kvm_irq_level {
	union {
		__u32 irq;     /* GSI */
		__s32 status;  /* not used for KVM_IRQ_LINE */
	};
	__u32 level;           /* 0 or 1 */
};


4.26 KVM_GET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_irqchip (in/out)
Returns: 0 on success, -1 on error

Reads the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP into a buffer provided by the caller.

struct kvm_irqchip {
	__u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
	__u32 pad;
	union {
		char dummy[512];  /* reserving space */
		struct kvm_pic_state pic;
		struct kvm_ioapic_state ioapic;
	} chip;
};


4.27 KVM_SET_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_irqchip (in)
Returns: 0 on success, -1 on error

Sets the state of a kernel interrupt controller created with
KVM_CREATE_IRQCHIP from a buffer provided by the caller.

See KVM_GET_IRQCHIP for the data structure.


4.28 KVM_XEN_HVM_CONFIG

Capability: KVM_CAP_XEN_HVM
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_xen_hvm_config (in)
Returns: 0 on success, -1 on error

Sets the MSR that the Xen HVM guest uses to initialize its hypercall
page, and provides the starting address and size of the hypercall
blobs in userspace. When the guest writes the MSR, kvm copies one
page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
memory.

struct kvm_xen_hvm_config {
	__u32 flags;
	__u32 msr;
	__u64 blob_addr_32;
	__u64 blob_addr_64;
	__u8 blob_size_32;
	__u8 blob_size_64;
	__u8 pad2[30];
};


4.29 KVM_GET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (out)
Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.

struct kvm_clock_data {
	__u64 clock;  /* kvmclock current value */
	__u32 flags;
	__u32 pad[9];
};


4.30 KVM_SET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (in)
Returns: 0 on success, -1 on error

Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in
scenarios such as migration.

See KVM_GET_CLOCK for the data structure.


4.31 KVM_GET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (out)
Returns: 0 on success, -1 on error

Gets currently pending exceptions, interrupts, and NMIs as well as related
states of the vcpu.

struct kvm_vcpu_events {
	struct {
		__u8 injected;
		__u8 nr;
		__u8 has_error_code;
		__u8 pad;
		__u32 error_code;
	} exception;
	struct {
		__u8 injected;
		__u8 nr;
		__u8 soft;
		__u8 shadow;
	} interrupt;
	struct {
		__u8 injected;
		__u8 pending;
		__u8 masked;
		__u8 pad;
	} nmi;
	__u32 sipi_vector;
	__u32 flags;
};

KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
interrupt.shadow contains a valid state. Otherwise, this field is undefined.


4.32 KVM_SET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (in)
Returns: 0 on success, -1 on error

Sets pending exceptions, interrupts, and NMIs as well as related states of the
vcpu.

See KVM_GET_VCPU_EVENTS for the data structure.

Fields that may be modified asynchronously by running VCPUs can be excluded
from the update. These fields are nmi.pending and sipi_vector. Keep the
corresponding bits in the flags field cleared to suppress overwriting the
current in-kernel state. The bits are:

KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector

If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
the flags field to signal that interrupt.shadow contains a valid state and
shall be written into the VCPU.


4.33 KVM_GET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (out)
Returns: 0 on success, -1 on error

Reads debug registers from the vcpu.

struct kvm_debugregs {
	__u64 db[4];
	__u64 dr6;
	__u64 dr7;
	__u64 flags;
	__u64 reserved[9];
};


4.34 KVM_SET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (in)
Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
used and must be cleared on entry.


4.35 KVM_SET_USER_MEMORY_REGION

Capability: KVM_CAP_USER_MEMORY
Architectures: all
Type: vm ioctl
Parameters: struct kvm_userspace_memory_region (in)
Returns: 0 on success, -1 on error

struct kvm_userspace_memory_region {
	__u32 slot;
	__u32 flags;
	__u64 guest_phys_addr;
	__u64 memory_size; /* bytes */
	__u64 userspace_addr; /* start of the userspace allocated memory */
};

/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
#define KVM_MEM_READONLY	(1UL << 1)

This ioctl allows the user to create or modify a guest physical memory
slot. When changing an existing slot, it may be moved in the guest
physical memory space, or its flags may be modified. It may not be
resized. Slots may not overlap in guest physical address space.

Memory for the region is taken starting at the address denoted by the
field userspace_addr, which must point at user addressable memory for
the entire memory slot size. Any object may back this memory, including
anonymous memory, ordinary files, and hugetlbfs.

It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.

The flags field supports two flags: KVM_MEM_LOG_DIRTY_PAGES and
KVM_MEM_READONLY. The former can be set to instruct KVM to keep track of
writes to memory within the slot.
See the KVM_GET_DIRTY_LOG ioctl for how to
use it. The latter can be set, if the KVM_CAP_READONLY_MEM capability allows
it, to make a new slot read-only. In this case, writes to this memory will be
posted to userspace as KVM_EXIT_MMIO exits.

When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
allocation and is deprecated.


4.36 KVM_SET_TSS_ADDR

Capability: KVM_CAP_SET_TSS_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long tss_address (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts because of a quirk in the
virtualization implementation (see the internals documentation when it pops
into existence).


4.37 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP, KVM_CAP_ENABLE_CAP_VM
Architectures: ppc, s390
Type: vcpu ioctl, vm ioctl (with KVM_CAP_ENABLE_CAP_VM)
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error

Not all extensions are enabled by default. Using this ioctl the application
can enable an extension, making it available to the guest.

On systems that do not support this ioctl, it always fails.
On systems that
do support it, it only works for extensions that are supported for enablement.

To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
be used.

struct kvm_enable_cap {
       /* in */
       __u32 cap;

The capability that is supposed to get enabled.

       __u32 flags;

A bitfield indicating future enhancements. Has to be 0 for now.

       __u64 args[4];

Arguments for enabling a feature. If a feature needs initial values to
function properly, this is the place to put them.

       __u8  pad[64];
};

The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
for vm-wide capabilities.


4.38 KVM_GET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, s390, arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (out)
Returns: 0 on success; -1 on error

struct kvm_mp_state {
	__u32 mp_state;
};

Returns the vcpu's current "multiprocessing state" (though also valid on
uniprocessor guests).

Possible values are:

 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running [x86,arm/arm64]
 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal [x86]
 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
                                 now ready for a SIPI [x86]
 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
                                 is waiting for an interrupt [x86]
 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS) [x86]
 - KVM_MP_STATE_STOPPED:         the vcpu is stopped [s390,arm/arm64]
 - KVM_MP_STATE_CHECK_STOP:      the vcpu is in a special error state [s390]
 - KVM_MP_STATE_OPERATING:       the vcpu is operating (running or halted)
                                 [s390]
 - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
                                 [s390]

On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace
on this architecture.

For arm/arm64:

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.


4.39 KVM_SET_MP_STATE

Capability: KVM_CAP_MP_STATE
Architectures: x86, s390, arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_mp_state (in)
Returns: 0 on success; -1 on error

Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
arguments.

On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
in-kernel irqchip, the multiprocessing state must be maintained by userspace
on this architecture.

For arm/arm64:

The only states that are valid are KVM_MP_STATE_STOPPED and
KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
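
As a sketch of how the two mp_state ioctls might be used from C, the helpers
below read the state of a vcpu and map it to a printable name, for example
when logging during save/restore. The function names and the strings are
hypothetical; only the ioctl, the structure and the KVM_MP_STATE_* constants
come from the kvm API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Map an mp_state value to a short name for diagnostics. */
const char *mp_state_name(uint32_t state)
{
	switch (state) {
	case KVM_MP_STATE_RUNNABLE:      return "runnable";
	case KVM_MP_STATE_UNINITIALIZED: return "uninitialized";
	case KVM_MP_STATE_INIT_RECEIVED: return "init-received";
	case KVM_MP_STATE_HALTED:        return "halted";
	case KVM_MP_STATE_SIPI_RECEIVED: return "sipi-received";
	case KVM_MP_STATE_STOPPED:       return "stopped";
	case KVM_MP_STATE_CHECK_STOP:    return "check-stop";
	case KVM_MP_STATE_OPERATING:     return "operating";
	case KVM_MP_STATE_LOAD:          return "load";
	default:                         return "?";
	}
}

/* Read the vcpu's mp_state. Returns 0 on success, -1 on error. */
int get_mp_state(int vcpu_fd, uint32_t *state)
{
	struct kvm_mp_state mp = { 0 };

	assert(state != NULL);

	if (ioctl(vcpu_fd, KVM_GET_MP_STATE, &mp) < 0)
		return -1;
	*state = mp.mp_state;
	return 0;
}
```

Restoring works the same way in reverse: fill in mp_state and pass the
structure to KVM_SET_MP_STATE on the vcpu fd.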

4.40 KVM_SET_IDENTITY_MAP_ADDR

Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long identity (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a one-page region in the guest
physical address space. The region must be within the first 4GB of the
guest physical address space and must not conflict with any memory slot
or any mmio address. The guest may malfunction if it accesses this memory
region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware
because of a quirk in the virtualization implementation (see the internals
documentation when it pops into existence).


4.41 KVM_SET_BOOT_CPU_ID

Capability: KVM_CAP_SET_BOOT_CPU_ID
Architectures: x86
Type: vm ioctl
Parameters: unsigned long vcpu_id
Returns: 0 on success, -1 on error

Define which vcpu is the Bootstrap Processor (BSP). Values are the same
as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
is vcpu 0.


4.42 KVM_GET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (out)
Returns: 0 on success, -1 on error

struct kvm_xsave {
	__u32 region[1024];
};

This ioctl copies the current vcpu's xsave struct to userspace.


4.43 KVM_SET_XSAVE

Capability: KVM_CAP_XSAVE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xsave (in)
Returns: 0 on success, -1 on error

struct kvm_xsave {
	__u32 region[1024];
};

This ioctl copies userspace's xsave struct to the kernel.


4.44 KVM_GET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (out)
Returns: 0 on success, -1 on error

struct kvm_xcr {
	__u32 xcr;
	__u32 reserved;
	__u64 value;
};

struct kvm_xcrs {
	__u32 nr_xcrs;
	__u32 flags;
	struct kvm_xcr xcrs[KVM_MAX_XCRS];
	__u64 padding[16];
};

This ioctl copies the current vcpu's xcrs to userspace.


4.45 KVM_SET_XCRS

Capability: KVM_CAP_XCRS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_xcrs (in)
Returns: 0 on success, -1 on error

struct kvm_xcr {
	__u32 xcr;
	__u32 reserved;
	__u64 value;
};

struct kvm_xcrs {
	__u32 nr_xcrs;
	__u32 flags;
	struct kvm_xcr xcrs[KVM_MAX_XCRS];
	__u64 padding[16];
};

This ioctl sets the vcpu's xcrs to the values userspace specified.


4.46 KVM_GET_SUPPORTED_CPUID

Capability: KVM_CAP_EXT_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
	__u32 nent;
	__u32 padding;
	struct kvm_cpuid_entry2 entries[0];
};

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX		BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC		BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT		BIT(2)

struct kvm_cpuid_entry2 {
	__u32 function;
	__u32 index;
	__u32 flags;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding[3];
};

This ioctl returns x86 cpuid features which are supported by both the hardware
and kvm.
Userspace can use the information returned by this ioctl to
construct cpuid information (for KVM_SET_CPUID2) that is consistent with
hardware, kernel, and userspace capabilities, and with user requirements (for
example, the user may wish to constrain cpuid to emulate older hardware,
or for feature consistency across a cluster).

Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
with the 'nent' field indicating the number of entries in the variable-size
array 'entries'. If the number of entries is too low to describe the cpu
capabilities, an error (E2BIG) is returned. If the number is too high,
the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
number is just right, the 'nent' field is adjusted to the number of valid
entries in the 'entries' array, which is then filled.

The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out. Some features (for example,
x2apic) may not be present in the host cpu, but are exposed by kvm if it can
emulate them efficiently.
The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
  eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination

The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
support. Instead it is reported via

  ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)

If that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.


4.47 KVM_PPC_GET_PVINFO

Capability: KVM_CAP_PPC_GET_PVINFO
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_ppc_pvinfo (out)
Returns: 0 on success, !0 on error

struct kvm_ppc_pvinfo {
	__u32 flags;
	__u32 hcall[4];
	__u8  pad[108];
};

This ioctl fetches PV specific information that needs to be passed to the
guest using the device tree or other means from vm context.

The hcall array defines 4 instructions that make up a hypercall.

If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.

The flags bitmap is defined as:

   /* the host supports the ePAPR idle hcall */
   #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)

4.48 KVM_ASSIGN_PCI_DEVICE

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Assigns a host PCI device to the VM.

struct kvm_assigned_pci_dev {
	__u32 assigned_dev_id;
	__u32 busnr;
	__u32 devfn;
	__u32 flags;
	__u32 segnr;
	union {
		__u32 reserved[11];
	};
};

The PCI device is specified by the triple segnr, busnr, and devfn.
Identification in succeeding service requests is done via assigned_dev_id. The
following flags are specified:

/* Depends on KVM_CAP_IOMMU */
#define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
/* The following two depend on KVM_CAP_PCI_2_3 */
#define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
#define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)

If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts
via the PCI-2.3-compliant device-level mask, thus enabling IRQ sharing with
other assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the
guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details.

The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure
isolation of the device. Usages not specifying this flag are deprecated.

Only PCI header type 0 devices with PCI BAR resources are supported by
device assignment. The user requesting this ioctl must have read/write
access to the PCI sysfs resource files associated with the device.

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.


4.49 KVM_DEASSIGN_PCI_DEVICE

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Ends PCI device assignment, releasing all associated resources.

See KVM_ASSIGN_PCI_DEVICE for the data structure. Only assigned_dev_id is
used in kvm_assigned_pci_dev to identify the device.

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.

4.50 KVM_ASSIGN_DEV_IRQ

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Assigns an IRQ to a passed-through device.

struct kvm_assigned_irq {
	__u32 assigned_dev_id;
	__u32 host_irq; /* ignored (legacy field) */
	__u32 guest_irq;
	__u32 flags;
	union {
		__u32 reserved[12];
	};
};

The following flags are defined:

#define KVM_DEV_IRQ_HOST_INTX    (1 << 0)
#define KVM_DEV_IRQ_HOST_MSI     (1 << 1)
#define KVM_DEV_IRQ_HOST_MSIX    (1 << 2)

#define KVM_DEV_IRQ_GUEST_INTX   (1 << 8)
#define KVM_DEV_IRQ_GUEST_MSI    (1 << 9)
#define KVM_DEV_IRQ_GUEST_MSIX   (1 << 10)

It is not valid to specify multiple types per host or guest IRQ. However, the
IRQ type of host and guest can differ or can even be null.

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.


4.51 KVM_DEASSIGN_DEV_IRQ

Capability: KVM_CAP_ASSIGN_DEV_IRQ
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_irq (in)
Returns: 0 on success, -1 on error

Ends an IRQ assignment to a passed-through device.

See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified
by assigned_dev_id, flags must correspond to the IRQ type specified on
KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest IRQ is allowed.


4.52 KVM_SET_GSI_ROUTING

Capability: KVM_CAP_IRQ_ROUTING
Architectures: x86 s390
Type: vm ioctl
Parameters: struct kvm_irq_routing (in)
Returns: 0 on success, -1 on error

Sets the GSI routing table entries, overwriting any previously set entries.

struct kvm_irq_routing {
	__u32 nr;
	__u32 flags;
	struct kvm_irq_routing_entry entries[0];
};

No flags are specified so far, the corresponding field must be set to zero.

struct kvm_irq_routing_entry {
	__u32 gsi;
	__u32 type;
	__u32 flags;
	__u32 pad;
	union {
		struct kvm_irq_routing_irqchip irqchip;
		struct kvm_irq_routing_msi msi;
		struct kvm_irq_routing_s390_adapter adapter;
		__u32 pad[8];
	} u;
};

/* gsi routing entry types */
#define KVM_IRQ_ROUTING_IRQCHIP 1
#define KVM_IRQ_ROUTING_MSI 2
#define KVM_IRQ_ROUTING_S390_ADAPTER 3

No flags are specified so far, the corresponding field must be set to zero.

struct kvm_irq_routing_irqchip {
	__u32 irqchip;
	__u32 pin;
};

struct kvm_irq_routing_msi {
	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 pad;
};

struct kvm_irq_routing_s390_adapter {
	__u64 ind_addr;
	__u64 summary_addr;
	__u64 ind_offset;
	__u32 summary_offset;
	__u32 adapter_id;
};


4.53 KVM_ASSIGN_SET_MSIX_NR

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_msix_nr (in)
Returns: 0 on success, -1 on error

Set the number of MSI-X interrupts for an assigned device.
The number is
reset again by terminating the MSI-X assignment of the device via
KVM_DEASSIGN_DEV_IRQ. Calling this service more than once at any earlier
point will fail.

struct kvm_assigned_msix_nr {
	__u32 assigned_dev_id;
	__u16 entry_nr;
	__u16 padding;
};

#define KVM_MAX_MSIX_PER_DEV		256


4.54 KVM_ASSIGN_SET_MSIX_ENTRY

Capability: none
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_msix_entry (in)
Returns: 0 on success, -1 on error

Specifies the routing of an MSI-X assigned device interrupt to a GSI. Setting
the GSI vector to zero means disabling the interrupt.

struct kvm_assigned_msix_entry {
	__u32 assigned_dev_id;
	__u32 gsi;
	__u16 entry; /* The index of entry in the MSI-X table */
	__u16 padding[3];
};

Errors:
  ENOTTY: kernel does not support this ioctl

  Other error conditions may be defined by individual device types or
  have their standard meanings.


4.55 KVM_SET_TSC_KHZ

Capability: KVM_CAP_TSC_CONTROL
Architectures: x86
Type: vcpu ioctl
Parameters: virtual tsc_khz
Returns: 0 on success, -1 on error

Specifies the tsc frequency for the virtual machine. The unit of the
frequency is KHz.


4.56 KVM_GET_TSC_KHZ

Capability: KVM_CAP_GET_TSC_KHZ
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: virtual tsc-khz on success, negative value on error

Returns the tsc frequency of the guest. The unit of the return value is
KHz. If the host has unstable tsc this ioctl returns -EIO instead as an
error.
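
The two TSC ioctls above can be combined into a set-then-verify step. The
following is a minimal sketch, not a reference implementation: the helper
name vcpu_set_tsc_khz is hypothetical, and it assumes the frequency is passed
by value as the ioctl argument (the text above only says "virtual tsc_khz")
and that KVM_CAP_TSC_CONTROL and KVM_CAP_GET_TSC_KHZ were checked beforehand.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper (not part of the KVM API): set the virtual TSC
 * frequency of a vcpu and read it back.
 * Returns the frequency reported by KVM in kHz, or -1 on error (errno set).
 */
long vcpu_set_tsc_khz(int vcpu_fd, unsigned long khz)
{
	/* Assumption: the frequency is passed by value, not by pointer. */
	if (ioctl(vcpu_fd, KVM_SET_TSC_KHZ, khz) < 0)
		return -1;

	/* KVM_GET_TSC_KHZ has no parameter; the result is the return value. */
	return ioctl(vcpu_fd, KVM_GET_TSC_KHZ, 0);
}
```

A caller could compare the returned value with the requested one to detect
hosts where the requested frequency was not applied exactly.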


4.57 KVM_GET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (out)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
	char regs[KVM_APIC_REG_SIZE];
};

Reads the Local APIC registers and copies them into the input argument. The
data format and layout are the same as documented in the architecture manual.


4.58 KVM_SET_LAPIC

Capability: KVM_CAP_IRQCHIP
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_lapic_state (in)
Returns: 0 on success, -1 on error

#define KVM_APIC_REG_SIZE 0x400
struct kvm_lapic_state {
	char regs[KVM_APIC_REG_SIZE];
};

Copies the input argument into the Local APIC registers. The data format
and layout are the same as documented in the architecture manual.


4.59 KVM_IOEVENTFD

Capability: KVM_CAP_IOEVENTFD
Architectures: all
Type: vm ioctl
Parameters: struct kvm_ioeventfd (in)
Returns: 0 on success, !0 on error

This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
within the guest. A guest write in the registered address will signal the
provided event instead of triggering an exit.

struct kvm_ioeventfd {
	__u64 datamatch;
	__u64 addr;        /* legal pio/mmio address */
	__u32 len;         /* 1, 2, 4, or 8 bytes    */
	__s32 fd;
	__u32 flags;
	__u8  pad[36];
};

For the special case of virtio-ccw devices on s390, the ioevent is matched
to a subchannel/virtqueue tuple instead.

The following flags are defined:

#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
	(1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)

If the datamatch flag is set, the event will be signaled only if the value
written to the registered address is equal to datamatch in struct
kvm_ioeventfd.

For virtio-ccw devices, addr contains the subchannel id and datamatch the
virtqueue index.


4.60 KVM_DIRTY_TLB

Capability: KVM_CAP_SW_TLB
Architectures: ppc
Type: vcpu ioctl
Parameters: struct kvm_dirty_tlb (in)
Returns: 0 on success, -1 on error

struct kvm_dirty_tlb {
	__u64 bitmap;
	__u32 num_dirty;
};

This must be called whenever userspace has changed an entry in the shared
TLB, prior to calling KVM_RUN on the associated vcpu.

The "bitmap" field is the userspace address of an array. This array
consists of a number of bits, equal to the total number of TLB entries as
determined by the last successful call to KVM_CONFIG_TLB, rounded up to the
nearest multiple of 64.

Each bit corresponds to one TLB entry, ordered the same as in the shared TLB
array.

The array is little-endian: bit 0 is the least significant bit of the
first byte, bit 8 is the least significant bit of the second byte, etc.
This avoids any complications with differing word sizes.

The "num_dirty" field is a performance hint for KVM to determine whether it
should skip processing the bitmap and just invalidate everything. It must
be set to the number of set bits in the bitmap.
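
The bitmap layout described for KVM_DIRTY_TLB can be sketched with a few
byte-wise helpers. This is only an illustration of the bit ordering and the
rounding rule; the function names are hypothetical, and the code does not
issue the ioctl itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to cover num_entries bits, rounded up to a multiple of 64. */
size_t tlb_bitmap_bytes(size_t num_entries)
{
	return ((num_entries + 63) / 64) * 8;
}

/*
 * Mark TLB entry 'idx' dirty. Working byte by byte gives the little-endian
 * layout independent of host word size: bit 0 is the LSB of byte 0,
 * bit 8 is the LSB of byte 1, and so on.
 */
void tlb_mark_dirty(uint8_t *bitmap, size_t idx)
{
	bitmap[idx / 8] |= (uint8_t)(1u << (idx % 8));
}

/* Count the set bits; this is the value expected in num_dirty. */
uint32_t tlb_count_dirty(const uint8_t *bitmap, size_t num_entries)
{
	uint32_t n = 0;
	size_t i;

	for (i = 0; i < num_entries; i++)
		if (bitmap[i / 8] & (1u << (i % 8)))
			n++;
	return n;
}
```

Under these assumptions, userspace would store the array's address in the
"bitmap" field (cast through uintptr_t to __u64) and the result of
tlb_count_dirty() in "num_dirty" before calling KVM_DIRTY_TLB.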


4.61 KVM_ASSIGN_SET_INTX_MASK

Capability: KVM_CAP_PCI_2_3
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_assigned_pci_dev (in)
Returns: 0 on success, -1 on error

Allows userspace to mask PCI INTx interrupts from the assigned device. The
kernel will not deliver INTx interrupts to the guest between setting and
clearing of KVM_ASSIGN_SET_INTX_MASK via this interface. This enables use of
and emulation of PCI 2.3 INTx disable command register behavior.

This may be used for both PCI 2.3 devices supporting INTx disable natively and
older devices lacking this support. Userspace is responsible for emulating the
read value of the INTx disable bit in the guest visible PCI command register.
When modifying the INTx disable state, userspace should precede updating the
physical device command register by calling this ioctl to inform the kernel of
the new intended INTx mask state.

Note that the kernel uses the device INTx disable bit to internally manage the
device interrupt state for PCI 2.3 devices. Reads of this register may
therefore not match the expected value. Writes should always use the guest
intended INTx disable value rather than attempting to read-copy-update the
current physical device state. Races between user and kernel updates to the
INTx disable bit are handled lazily in the kernel. It's possible the device
may generate unintended interrupts, but they will not be injected into the
guest.

See KVM_ASSIGN_PCI_DEVICE for the data structure. The target device is
specified by assigned_dev_id. In the flags field, only
KVM_DEV_ASSIGN_MASK_INTX is evaluated.


4.62 KVM_CREATE_SPAPR_TCE

Capability: KVM_CAP_SPAPR_TCE
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_create_spapr_tce (in)
Returns: file descriptor for manipulating the created TCE table

This creates a virtual TCE (translation control entry) table, which
is an IOMMU for PAPR-style virtual I/O. It is used to translate
logical addresses used in virtual I/O into guest physical addresses,
and provides a scatter/gather capability for PAPR virtual I/O.

/* for KVM_CAP_SPAPR_TCE */
struct kvm_create_spapr_tce {
	__u64 liobn;
	__u32 window_size;
};

The liobn field gives the logical IO bus number for which to create a
TCE table. The window_size field specifies the size of the DMA window
which this TCE table will translate - the table will contain one 64-bit
TCE entry for every 4 KiB of the DMA window.

When the guest issues an H_PUT_TCE hcall on a liobn for which a TCE
table has been created using this ioctl(), the kernel will handle it
in real mode, updating the TCE table. H_PUT_TCE calls for other
liobns will cause a vm exit and must be handled by userspace.

The return value is a file descriptor which can be passed to mmap(2)
to map the created TCE table into userspace. This lets userspace read
the entries written by kernel-handled H_PUT_TCE calls, and also lets
userspace update the TCE table directly which is useful in some
circumstances.


4.63 KVM_ALLOCATE_RMA

Capability: KVM_CAP_PPC_RMA
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_allocate_rma (out)
Returns: file descriptor for mapping the allocated RMA

This allocates a Real Mode Area (RMA) from the pool allocated at boot
time by the kernel.
An RMA is a physically-contiguous, aligned region
of memory used on older POWER processors to provide the memory which
will be accessed by real-mode (MMU off) accesses in a KVM guest.
POWER processors support a set of sizes for the RMA that usually
includes 64MB, 128MB, 256MB and some larger powers of two.

/* for KVM_ALLOCATE_RMA */
struct kvm_allocate_rma {
	__u64 rma_size;
};

The return value is a file descriptor which can be passed to mmap(2)
to map the allocated RMA into userspace. The mapped area can then be
passed to the KVM_SET_USER_MEMORY_REGION ioctl to establish it as the
RMA for a virtual machine. The size of the RMA in bytes (which is
fixed at host kernel boot time) is returned in the rma_size field of
the argument structure.

The KVM_CAP_PPC_RMA capability is 1 or 2 if the KVM_ALLOCATE_RMA ioctl
is supported; 2 if the processor requires all virtual machines to have
an RMA, or 1 if the processor can use an RMA but doesn't require it,
because it supports the Virtual RMA (VRMA) facility.


4.64 KVM_NMI

Capability: KVM_CAP_USER_NMI
Architectures: x86
Type: vcpu ioctl
Parameters: none
Returns: 0 on success, -1 on error

Queues an NMI on the thread's vcpu. Note this is well defined only
when KVM_CREATE_IRQCHIP has not been called, since this is an interface
between the virtual cpu core and virtual local APIC. After KVM_CREATE_IRQCHIP
has been called, this interface is completely emulated within the kernel.

To use this to emulate the LINT1 input with KVM_CREATE_IRQCHIP, use the
following algorithm:

  - pause the vcpu
  - read the local APIC's state (KVM_GET_LAPIC)
  - check whether changing LINT1 will queue an NMI (see the LVT entry for LINT1)
  - if so, issue KVM_NMI
  - resume the vcpu

Some guests configure the LINT1 NMI input to cause a panic, aiding in
debugging.
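
The "check whether changing LINT1 will queue an NMI" step of the algorithm
above can be sketched as a pure function over the register block returned by
KVM_GET_LAPIC. The register offset and bit positions below are assumptions
taken from the local APIC layout in the Intel architecture manual, not from
this document, and the macro and function names are hypothetical. The sketch
also assumes a little-endian (x86) host.

```c
#include <stdint.h>
#include <string.h>

#define LAPIC_REG_SIZE    0x400        /* same value as KVM_APIC_REG_SIZE */
#define LAPIC_LVT_LINT1   0x360        /* assumed offset of the LVT LINT1 entry */
#define LAPIC_LVT_MASKED  (1u << 16)   /* assumed mask bit */
#define LAPIC_DM_MASK     (7u << 8)    /* assumed delivery mode field, bits 10:8 */
#define LAPIC_DM_NMI      (4u << 8)    /* assumed delivery mode 100b = NMI */

/*
 * Hypothetical helper: returns 1 if asserting LINT1 would deliver an NMI
 * according to the LVT LINT1 entry in 'regs', 0 otherwise.
 */
int lint1_delivers_nmi(const char regs[LAPIC_REG_SIZE])
{
	uint32_t lvt;

	/* memcpy avoids alignment and aliasing problems with the char array. */
	memcpy(&lvt, regs + LAPIC_LVT_LINT1, sizeof(lvt));

	if (lvt & LAPIC_LVT_MASKED)
		return 0;
	return (lvt & LAPIC_DM_MASK) == LAPIC_DM_NMI;
}
```

With the vcpu paused, userspace would pass the regs array of the
kvm_lapic_state filled in by KVM_GET_LAPIC to this check and issue KVM_NMI
only when it returns 1.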


4.65 KVM_S390_UCAS_MAP

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_ucas_mapping (in)
Returns: 0 in case of success

The parameter is defined like this:
	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl maps the memory at "user_addr" with the length "length" to
the vcpu's address space starting at "vcpu_addr". All parameters need to
be aligned by 1 megabyte.


4.66 KVM_S390_UCAS_UNMAP

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_ucas_mapping (in)
Returns: 0 in case of success

The parameter is defined like this:
	struct kvm_s390_ucas_mapping {
		__u64 user_addr;
		__u64 vcpu_addr;
		__u64 length;
	};

This ioctl unmaps the memory in the vcpu's address space starting at
"vcpu_addr" with the length "length". The field "user_addr" is ignored.
All parameters need to be aligned by 1 megabyte.


4.67 KVM_S390_VCPU_FAULT

Capability: KVM_CAP_S390_UCONTROL
Architectures: s390
Type: vcpu ioctl
Parameters: vcpu absolute address (in)
Returns: 0 in case of success

This call creates a page table entry on the virtual cpu's address space
(for user controlled virtual machines) or the virtual machine's address
space (for regular virtual machines). This only works for minor faults,
thus it's recommended to access subject memory page via the user page
table upfront. This is useful to handle validity intercepts for user
controlled virtual machines to fault in the virtual cpu's lowcore pages
prior to calling the KVM_RUN ioctl.
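
KVM_S390_UCAS_MAP and KVM_S390_UCAS_UNMAP (4.65 and 4.66) both require
1 megabyte alignment of their parameters. A caller might verify that before
issuing either ioctl, roughly as sketched below. The struct is a local mirror
of struct kvm_s390_ucas_mapping so that the example builds on any host; the
names ucas_mapping and ucas_mapping_is_aligned are hypothetical.

```c
#include <stdint.h>

#define UCAS_ALIGN (1ULL << 20)   /* 1 megabyte */

/* Local mirror of struct kvm_s390_ucas_mapping, for illustration only. */
struct ucas_mapping {
	uint64_t user_addr;
	uint64_t vcpu_addr;
	uint64_t length;
};

/* Returns 1 if all three fields are 1 MB aligned, 0 otherwise. */
int ucas_mapping_is_aligned(const struct ucas_mapping *m)
{
	/* OR the fields together: any low bit set in any field fails the test. */
	return ((m->user_addr | m->vcpu_addr | m->length) & (UCAS_ALIGN - 1)) == 0;
}
```

For KVM_S390_UCAS_UNMAP, where "user_addr" is ignored, setting that field to
zero keeps the same check usable.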


4.68 KVM_SET_ONE_REG

Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in)
Returns: 0 on success, negative value on failure

struct kvm_one_reg {
	__u64 id;
	__u64 addr;
};

Using this ioctl, a single vcpu register can be set to a specific value
defined by user space with the passed in struct kvm_one_reg, where id
refers to the register identifier as described below and addr is a pointer
to a variable with the respective size. There can be architecture agnostic
and architecture specific registers. Each has its own range of operation
and its own constants and width. The implemented registers are listed
below:

  Arch  | Register                     | Width (bits)
        |                              |
  PPC   | KVM_REG_PPC_HIOR             | 64
  PPC   | KVM_REG_PPC_IAC1             | 64
  PPC   | KVM_REG_PPC_IAC2             | 64
  PPC   | KVM_REG_PPC_IAC3             | 64
  PPC   | KVM_REG_PPC_IAC4             | 64
  PPC   | KVM_REG_PPC_DAC1             | 64
  PPC   | KVM_REG_PPC_DAC2             | 64
  PPC   | KVM_REG_PPC_DABR             | 64
  PPC   | KVM_REG_PPC_DSCR             | 64
  PPC   | KVM_REG_PPC_PURR             | 64
  PPC   | KVM_REG_PPC_SPURR            | 64
  PPC   | KVM_REG_PPC_DAR              | 64
  PPC   | KVM_REG_PPC_DSISR            | 32
  PPC   | KVM_REG_PPC_AMR              | 64
  PPC   | KVM_REG_PPC_UAMOR            | 64
  PPC   | KVM_REG_PPC_MMCR0            | 64
  PPC   | KVM_REG_PPC_MMCR1            | 64
  PPC   | KVM_REG_PPC_MMCRA            | 64
  PPC   | KVM_REG_PPC_MMCR2            | 64
  PPC   | KVM_REG_PPC_MMCRS            | 64
  PPC   | KVM_REG_PPC_SIAR             | 64
  PPC   | KVM_REG_PPC_SDAR             | 64
  PPC   | KVM_REG_PPC_SIER             | 64
  PPC   | KVM_REG_PPC_PMC1             | 32
  PPC   | KVM_REG_PPC_PMC2             | 32
  PPC   | KVM_REG_PPC_PMC3             | 32
  PPC   | KVM_REG_PPC_PMC4             | 32
  PPC   | KVM_REG_PPC_PMC5             | 32
  PPC   | KVM_REG_PPC_PMC6             | 32
  PPC   | KVM_REG_PPC_PMC7             | 32
  PPC   | KVM_REG_PPC_PMC8             | 32
  PPC   | KVM_REG_PPC_FPR0             | 64
          ...
  PPC   | KVM_REG_PPC_FPR31            | 64
  PPC   | KVM_REG_PPC_VR0              | 128
          ...
  PPC   | KVM_REG_PPC_VR31             | 128
  PPC   | KVM_REG_PPC_VSR0             | 128
          ...
  PPC   | KVM_REG_PPC_VSR31            | 128
  PPC   | KVM_REG_PPC_FPSCR            | 64
  PPC   | KVM_REG_PPC_VSCR             | 32
  PPC   | KVM_REG_PPC_VPA_ADDR         | 64
  PPC   | KVM_REG_PPC_VPA_SLB          | 128
  PPC   | KVM_REG_PPC_VPA_DTL          | 128
  PPC   | KVM_REG_PPC_EPCR             | 32
  PPC   | KVM_REG_PPC_EPR              | 32
  PPC   | KVM_REG_PPC_TCR              | 32
  PPC   | KVM_REG_PPC_TSR              | 32
  PPC   | KVM_REG_PPC_OR_TSR           | 32
  PPC   | KVM_REG_PPC_CLEAR_TSR        | 32
  PPC   | KVM_REG_PPC_MAS0             | 32
  PPC   | KVM_REG_PPC_MAS1             | 32
  PPC   | KVM_REG_PPC_MAS2             | 64
  PPC   | KVM_REG_PPC_MAS7_3           | 64
  PPC   | KVM_REG_PPC_MAS4             | 32
  PPC   | KVM_REG_PPC_MAS6             | 32
  PPC   | KVM_REG_PPC_MMUCFG           | 32
  PPC   | KVM_REG_PPC_TLB0CFG          | 32
  PPC   | KVM_REG_PPC_TLB1CFG          | 32
  PPC   | KVM_REG_PPC_TLB2CFG          | 32
  PPC   | KVM_REG_PPC_TLB3CFG          | 32
  PPC   | KVM_REG_PPC_TLB0PS           | 32
  PPC   | KVM_REG_PPC_TLB1PS           | 32
  PPC   | KVM_REG_PPC_TLB2PS           | 32
  PPC   | KVM_REG_PPC_TLB3PS           | 32
  PPC   | KVM_REG_PPC_EPTCFG           | 32
  PPC   | KVM_REG_PPC_ICP_STATE        | 64
  PPC   | KVM_REG_PPC_TB_OFFSET        | 64
  PPC   | KVM_REG_PPC_SPMC1            | 32
  PPC   | KVM_REG_PPC_SPMC2            | 32
  PPC   | KVM_REG_PPC_IAMR             | 64
  PPC   | KVM_REG_PPC_TFHAR            | 64
  PPC   | KVM_REG_PPC_TFIAR            | 64
  PPC   | KVM_REG_PPC_TEXASR           | 64
  PPC   | KVM_REG_PPC_FSCR             | 64
  PPC   | KVM_REG_PPC_PSPB             | 32
  PPC   | KVM_REG_PPC_EBBHR            | 64
  PPC   | KVM_REG_PPC_EBBRR            | 64
  PPC   | KVM_REG_PPC_BESCR            | 64
  PPC   | KVM_REG_PPC_TAR              | 64
  PPC   | KVM_REG_PPC_DPDES            | 64
  PPC   | KVM_REG_PPC_DAWR             | 64
  PPC   | KVM_REG_PPC_DAWRX            | 64
  PPC   | KVM_REG_PPC_CIABR            | 64
  PPC   | KVM_REG_PPC_IC               | 64
  PPC   | KVM_REG_PPC_VTB              | 64
  PPC   | KVM_REG_PPC_CSIGR            | 64
  PPC   | KVM_REG_PPC_TACR             | 64
  PPC   | KVM_REG_PPC_TCSCR            | 64
  PPC   | KVM_REG_PPC_PID              | 64
  PPC   | KVM_REG_PPC_ACOP             | 64
  PPC   | KVM_REG_PPC_VRSAVE           | 32
  PPC   | KVM_REG_PPC_LPCR             | 32
  PPC   | KVM_REG_PPC_LPCR_64          | 64
  PPC   | KVM_REG_PPC_PPR              | 64
  PPC   | KVM_REG_PPC_ARCH_COMPAT      | 32
  PPC   | KVM_REG_PPC_DABRX            | 32
  PPC   | KVM_REG_PPC_WORT             | 64
  PPC   | KVM_REG_PPC_SPRG9            | 64
  PPC   | KVM_REG_PPC_DBSR             | 32
  PPC   | KVM_REG_PPC_TM_GPR0          | 64
          ...
  PPC   | KVM_REG_PPC_TM_GPR31         | 64
  PPC   | KVM_REG_PPC_TM_VSR0          | 128
          ...
  PPC   | KVM_REG_PPC_TM_VSR63         | 128
  PPC   | KVM_REG_PPC_TM_CR            | 64
  PPC   | KVM_REG_PPC_TM_LR            | 64
  PPC   | KVM_REG_PPC_TM_CTR           | 64
  PPC   | KVM_REG_PPC_TM_FPSCR         | 64
  PPC   | KVM_REG_PPC_TM_AMR           | 64
  PPC   | KVM_REG_PPC_TM_PPR           | 64
  PPC   | KVM_REG_PPC_TM_VRSAVE        | 64
  PPC   | KVM_REG_PPC_TM_VSCR          | 32
  PPC   | KVM_REG_PPC_TM_DSCR          | 64
  PPC   | KVM_REG_PPC_TM_TAR           | 64
        |                              |
  MIPS  | KVM_REG_MIPS_R0              | 64
          ...
  MIPS  | KVM_REG_MIPS_R31             | 64
  MIPS  | KVM_REG_MIPS_HI              | 64
  MIPS  | KVM_REG_MIPS_LO              | 64
  MIPS  | KVM_REG_MIPS_PC              | 64
  MIPS  | KVM_REG_MIPS_CP0_INDEX       | 32
  MIPS  | KVM_REG_MIPS_CP0_CONTEXT     | 64
  MIPS  | KVM_REG_MIPS_CP0_USERLOCAL   | 64
  MIPS  | KVM_REG_MIPS_CP0_PAGEMASK    | 32
  MIPS  | KVM_REG_MIPS_CP0_WIRED       | 32
  MIPS  | KVM_REG_MIPS_CP0_HWRENA      | 32
  MIPS  | KVM_REG_MIPS_CP0_BADVADDR    | 64
  MIPS  | KVM_REG_MIPS_CP0_COUNT       | 32
  MIPS  | KVM_REG_MIPS_CP0_ENTRYHI     | 64
  MIPS  | KVM_REG_MIPS_CP0_COMPARE     | 32
  MIPS  | KVM_REG_MIPS_CP0_STATUS      | 32
  MIPS  | KVM_REG_MIPS_CP0_CAUSE       | 32
  MIPS  | KVM_REG_MIPS_CP0_EPC         | 64
  MIPS  | KVM_REG_MIPS_CP0_PRID        | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG      | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG1     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG2     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG3     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG4     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG5     | 32
  MIPS  | KVM_REG_MIPS_CP0_CONFIG7     | 32
  MIPS  | KVM_REG_MIPS_CP0_ERROREPC    | 64
  MIPS  | KVM_REG_MIPS_COUNT_CTL       | 64
  MIPS  | KVM_REG_MIPS_COUNT_RESUME    | 64
  MIPS  | KVM_REG_MIPS_COUNT_HZ        | 64
  MIPS  | KVM_REG_MIPS_FPR_32(0..31)   | 32
  MIPS  | KVM_REG_MIPS_FPR_64(0..31)   | 64
  MIPS  | KVM_REG_MIPS_VEC_128(0..31)  | 128
  MIPS  | KVM_REG_MIPS_FCR_IR          | 32
  MIPS  | KVM_REG_MIPS_FCR_CSR         | 32
  MIPS  | KVM_REG_MIPS_MSA_IR          | 32
  MIPS  | KVM_REG_MIPS_MSA_CSR         | 32

ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:

ARM core registers have the following id bit patterns:
  0x4020 0000 0010 <index into the kvm_regs struct:16>

ARM 32-bit CP15 registers have the following id bit patterns:
  0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>

ARM 64-bit CP15 registers have the following id bit patterns:
  0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>

ARM CCSIDR registers are demultiplexed by CSSELR value:
  0x4020 0000 0011 00 <csselr:8>

ARM 32-bit VFP control registers have the following id bit patterns:
  0x4020 0000 0012 1 <regno:12>

ARM 64-bit FP registers have the following id bit patterns:
  0x4030 0000 0012 0 <regno:12>


arm64 registers are mapped using the lower 32 bits. The upper 16 of
that is the register group type, or coprocessor number:

arm64 core/FP-SIMD registers have the following id bit patterns. Note
that the size of the access is variable, as the kvm_regs structure
contains elements ranging from 32 to 128 bits. The index is a 32bit
value in the kvm_regs structure seen as a 32bit array.
  0x60x0 0000 0010 <index into the kvm_regs struct:16>

arm64 CCSIDR registers are demultiplexed by CSSELR value:
  0x6020 0000 0011 00 <csselr:8>

arm64 system registers have the following id bit patterns:
  0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>


MIPS registers are mapped using the lower 32 bits.
The upper 16 of that is
the register group type:

MIPS core registers (see above) have the following id bit patterns:
  0x7030 0000 0000 <reg:16>

MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
patterns depending on whether they're 32-bit or 64-bit registers:
  0x7020 0000 0001 00 <reg:5> <sel:3>   (32-bit)
  0x7030 0000 0001 00 <reg:5> <sel:3>   (64-bit)

MIPS KVM control registers (see above) have the following id bit patterns:
  0x7030 0000 0002 <reg:16>

MIPS FPU registers (see KVM_REG_MIPS_FPR_{32,64}() above) have the following
id bit patterns depending on the size of the register being accessed.  They are
always accessed according to the current guest FPU mode (Status.FR and
Config5.FRE), i.e. as the guest would see them, and they become unpredictable
if the guest FPU mode is changed.  MIPS SIMD Architecture (MSA) vector
registers (see KVM_REG_MIPS_VEC_128() above) have similar patterns as they
overlap the FPU registers:
  0x7020 0000 0003 00 <0:3> <reg:5> (32-bit FPU registers)
  0x7030 0000 0003 00 <0:3> <reg:5> (64-bit FPU registers)
  0x7040 0000 0003 00 <0:3> <reg:5> (128-bit MSA vector registers)

MIPS FPU control registers (see KVM_REG_MIPS_FCR_{IR,CSR} above) have the
following id bit patterns:
  0x7020 0000 0003 01 <0:3> <reg:5>

MIPS MSA control registers (see KVM_REG_MIPS_MSA_{IR,CSR} above) have the
following id bit patterns:
  0x7020 0000 0003 02 <0:3> <reg:5>


4.69 KVM_GET_ONE_REG

Capability: KVM_CAP_ONE_REG
Architectures: all
Type: vcpu ioctl
Parameters: struct kvm_one_reg (in and out)
Returns: 0 on success, negative value on failure

This ioctl reads the value of a single register implemented
in a vcpu.  The register to read is indicated by the "id" field of the
kvm_one_reg struct passed in.
On success, the register value can be found
at the memory location pointed to by "addr".

The list of registers accessible using this interface is identical to the
list in 4.68.


4.70 KVM_KVMCLOCK_CTRL

Capability: KVM_CAP_KVMCLOCK_CTRL
Architectures: Any that implement pvclocks (currently x86 only)
Type: vcpu ioctl
Parameters: None
Returns: 0 on success, -1 on error

This signals to the host kernel that the specified guest is being paused by
userspace.  The host will set a flag in the pvclock structure that is checked
from the soft lockup watchdog.  The flag is part of the pvclock structure that
is shared between guest and host, specifically the second bit of the flags
field of the pvclock_vcpu_time_info structure.  It will be set exclusively by
the host and read/cleared exclusively by the guest.  The guest operation of
checking and clearing the flag must be an atomic operation, so
load-link/store-conditional or an equivalent must be used.  There are two cases
where the guest will clear the flag: when the soft lockup watchdog timer resets
itself or when a soft lockup is detected.  This ioctl can be called any time
after pausing the vcpu, but before it is resumed.


4.71 KVM_SIGNAL_MSI

Capability: KVM_CAP_SIGNAL_MSI
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_msi (in)
Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error

Directly inject an MSI message.  Only valid with in-kernel irqchip that handles
MSI messages.

struct kvm_msi {
	__u32 address_lo;
	__u32 address_hi;
	__u32 data;
	__u32 flags;
	__u8  pad[16];
};

No flags are defined so far.  The corresponding field must be 0.
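
As a minimal sketch of preparing the KVM_SIGNAL_MSI argument, the helper
below splits a 64-bit MSI address into its two halves and leaves "flags" and
"pad" zeroed, since no flags are defined.  The names struct msi_msg and
msi_msg_init() are hypothetical; the struct is a local mirror of struct
kvm_msi so that the sketch builds without kernel headers, and real code would
use the definition from <linux/kvm.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_msi as shown above (assumption: real code
 * would include <linux/kvm.h> and use struct kvm_msi directly). */
struct msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t flags;
	uint8_t  pad[16];
};

/* Hypothetical helper: fill in an MSI message.  flags and pad stay zero
 * because no flags are defined. */
static void msi_msg_init(struct msi_msg *msi, uint64_t address, uint32_t data)
{
	assert(msi != NULL);
	memset(msi, 0, sizeof(*msi));
	msi->address_lo = (uint32_t)(address & 0xffffffffu);
	msi->address_hi = (uint32_t)(address >> 32);
	msi->data = data;
}
```

The filled structure would then be passed to the KVM_SIGNAL_MSI ioctl on the
VM fd; a return value >0 means delivery, 0 means the guest blocked the MSI.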


4.71 KVM_CREATE_PIT2

Capability: KVM_CAP_PIT2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_config (in)
Returns: 0 on success, -1 on error

Creates an in-kernel device model for the i8254 PIT.  This call is only valid
after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP.  The following
parameters have to be passed:

struct kvm_pit_config {
	__u32 flags;
	__u32 pad[15];
};

Valid flags are:

#define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */

PIT timer interrupts may use a per-VM kernel thread for injection.  If it
exists, this thread will have a name of the following pattern:

kvm-pit/<owner-process-pid>

When running a guest with elevated priorities, the scheduling parameters of
this thread may have to be adjusted accordingly.

This IOCTL replaces the obsolete KVM_CREATE_PIT.


4.72 KVM_GET_PIT2

Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (out)
Returns: 0 on success, -1 on error

Retrieves the state of the in-kernel PIT model.  Only valid after
KVM_CREATE_PIT2.  The state is returned in the following structure:

struct kvm_pit_state2 {
	struct kvm_pit_channel_state channels[3];
	__u32 flags;
	__u32 reserved[9];
};

Valid flags are:

/* disable PIT in HPET legacy mode */
#define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001

This IOCTL replaces the obsolete KVM_GET_PIT.


4.73 KVM_SET_PIT2

Capability: KVM_CAP_PIT_STATE2
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_pit_state2 (in)
Returns: 0 on success, -1 on error

Sets the state of the in-kernel PIT model.  Only valid after KVM_CREATE_PIT2.
See KVM_GET_PIT2 for details on struct kvm_pit_state2.

This IOCTL replaces the obsolete KVM_SET_PIT.
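
The KVM_CREATE_PIT2 setup described in the PIT sections can be sketched as
follows.  The helper names and the struct pit_config mirror are hypothetical
and exist only to keep the sketch independent of kernel headers;
PIT_SPEAKER_DUMMY repeats the value of KVM_PIT_SPEAKER_DUMMY given in
the KVM_CREATE_PIT2 section.  The second helper formats the documented
kvm-pit/<owner-process-pid> thread name, which a VMM might use to locate the
thread before adjusting its scheduling parameters.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PIT_SPEAKER_DUMMY 1u	/* same value as KVM_PIT_SPEAKER_DUMMY */

/* Local mirror of struct kvm_pit_config (assumption: real code would
 * take the definition from <linux/kvm.h>). */
struct pit_config {
	uint32_t flags;
	uint32_t pad[15];
};

/* Hypothetical helper: zero the config and optionally request the
 * speaker port stub. */
static void pit_config_init(struct pit_config *cfg, int speaker_stub)
{
	assert(cfg != NULL);
	memset(cfg, 0, sizeof(*cfg));
	if (speaker_stub)
		cfg->flags |= PIT_SPEAKER_DUMMY;
}

/* Hypothetical helper: format the name of the per-VM PIT injection
 * thread for the given owner process.  Returns what snprintf returns. */
static int pit_thread_name(char *buf, size_t len, long owner_pid)
{
	return snprintf(buf, len, "kvm-pit/%ld", owner_pid);
}
```

The thread only exists if the kernel chose to use one for injection, so a
lookup by this name may legitimately find nothing.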


4.74 KVM_PPC_GET_SMMU_INFO

Capability: KVM_CAP_PPC_GET_SMMU_INFO
Architectures: powerpc
Type: vm ioctl
Parameters: None
Returns: 0 on success, -1 on error

This populates and returns a structure describing the features of
the "Server" class MMU emulation supported by KVM.
This can in turn be used by userspace to generate the appropriate
device-tree properties for the guest operating system.

The structure contains some global information, followed by an
array of supported segment page sizes:

      struct kvm_ppc_smmu_info {
	     __u64 flags;
	     __u32 slb_size;
	     __u32 pad;
	     struct kvm_ppc_one_seg_page_size sps[KVM_PPC_PAGE_SIZES_MAX_SZ];
      };

The supported flags are:

    - KVM_PPC_PAGE_SIZES_REAL:
        When that flag is set, guest page sizes must "fit" the backing
        store page sizes.  When not set, any page size in the list can
        be used regardless of how they are backed by userspace.

    - KVM_PPC_1T_SEGMENTS
        The emulated MMU supports 1T segments in addition to the
        standard 256M ones.

The "slb_size" field indicates how many SLB entries are supported.

The "sps" array contains 8 entries indicating the supported base
page sizes for a segment in increasing order.  Each entry is defined
as follows:

   struct kvm_ppc_one_seg_page_size {
	__u32 page_shift;	/* Base page shift of segment (or 0) */
	__u32 slb_enc;		/* SLB encoding for BookS */
	struct kvm_ppc_one_page_size enc[KVM_PPC_PAGE_SIZES_MAX_SZ];
   };

An entry with a "page_shift" of 0 is unused.  Because the array is
organized in increasing order, a lookup can stop when encountering
such an entry.

The "slb_enc" field provides the encoding to use in the SLB for the
page size.  The bits are in positions such that the value can directly
be OR'ed into the "vsid" argument of the slbmte instruction.
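
The lookup rule for the "sps" array can be sketched as below: walk the
entries in order and stop at the first one whose "page_shift" is 0.  The
struct and function names are hypothetical local mirrors of the structures
in this section, and PAGE_SIZES_MAX stands in for KVM_PPC_PAGE_SIZES_MAX_SZ
using the entry count of 8 stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZES_MAX 8	/* stands in for KVM_PPC_PAGE_SIZES_MAX_SZ */

/* Local mirrors of the KVM structures (assumption: real code would use
 * the definitions from <linux/kvm.h>). */
struct one_page_size {
	uint32_t page_shift;	/* Page shift (or 0) */
	uint32_t pte_enc;	/* Encoding in the HPTE (>>12) */
};

struct one_seg_page_size {
	uint32_t page_shift;	/* Base page shift of segment (or 0) */
	uint32_t slb_enc;	/* SLB encoding */
	struct one_page_size enc[PAGE_SIZES_MAX];
};

/* Hypothetical helper: find the segment entry for a base page shift.
 * Returns NULL if the shift is not supported. */
static const struct one_seg_page_size *
find_seg_page_size(const struct one_seg_page_size *sps, uint32_t shift)
{
	size_t i;

	assert(sps != NULL);
	for (i = 0; i < PAGE_SIZES_MAX; i++) {
		if (sps[i].page_shift == 0)
			break;	/* unused entry: no valid entries follow */
		if (sps[i].page_shift == shift)
			return &sps[i];
	}
	return NULL;
}
```

The "slb_enc" value of the returned entry is the one that would be OR'ed
into the "vsid" argument of slbmte.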

The "enc" array is a list which for each of those segment base page
sizes provides the list of supported actual page sizes (which can be
only larger or equal to the base page size), along with the
corresponding encoding in the hash PTE.  Similarly, the array is
8 entries sorted by increasing sizes and an entry with a "0" shift
is an empty entry and a terminator:

   struct kvm_ppc_one_page_size {
	__u32 page_shift;	/* Page shift (or 0) */
	__u32 pte_enc;		/* Encoding in the HPTE (>>12) */
   };

The "pte_enc" field provides a value that can be OR'ed into the hash
PTE's RPN field (ie, it needs to be shifted left by 12 to OR it
into the hash PTE second double word).

4.75 KVM_IRQFD

Capability: KVM_CAP_IRQFD
Architectures: x86 s390 arm arm64
Type: vm ioctl
Parameters: struct kvm_irqfd (in)
Returns: 0 on success, -1 on error

Allows setting an eventfd to directly trigger a guest interrupt.
kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
kvm_irqfd.gsi specifies the irqchip pin toggled by this event.  When
an event is triggered on the eventfd, an interrupt is injected into
the guest using the specified gsi pin.  The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.

With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
mechanism allowing emulation of level-triggered, irqfd-based
interrupts.  When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field.  When operating
in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
the specified gsi in the irqchip.  When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd.
It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.

On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
given by gsi + 32.

4.76 KVM_PPC_ALLOCATE_HTAB

Capability: KVM_CAP_PPC_ALLOC_HTAB
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to u32 containing hash table order (in/out)
Returns: 0 on success, -1 on error

This requests the host kernel to allocate an MMU hash table for a
guest using the PAPR paravirtualization interface.  This only does
anything if the kernel is configured to use the Book 3S HV style of
virtualization.  Otherwise the capability doesn't exist and the ioctl
returns an ENOTTY error.  The rest of this description assumes Book 3S
HV.

There must be no vcpus running when this ioctl is called; if there
are, it will do nothing and return an EBUSY error.

The parameter is a pointer to a 32-bit unsigned integer variable
containing the order (log base 2) of the desired size of the hash
table, which must be between 18 and 46.  On successful return from the
ioctl, it will have been updated with the order of the hash table that
was allocated.

If no hash table has been allocated when any vcpu is asked to run
(with the KVM_RUN ioctl), the host kernel will allocate a
default-sized hash table (16 MB).

If this ioctl is called when a hash table has already been allocated,
the kernel will clear out the existing hash table (zero all HPTEs) and
return the hash table order in the parameter.
(If the guest is using
the virtualized real-mode area (VRMA) facility, the kernel will
re-create the VRMA HPTEs on the next KVM_RUN of any vcpu.)

4.77 KVM_S390_INTERRUPT

Capability: basic
Architectures: s390
Type: vm ioctl, vcpu ioctl
Parameters: struct kvm_s390_interrupt (in)
Returns: 0 on success, -1 on error

Injects an interrupt into the guest.  Interrupts can be floating
(vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.

Interrupt parameters are passed via kvm_s390_interrupt:

struct kvm_s390_interrupt {
	__u32 type;
	__u32 parm;
	__u64 parm64;
};

type can be one of the following:

KVM_S390_SIGP_STOP (vcpu) - sigp stop; optional flags in parm
KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
KVM_S390_RESTART (vcpu) - restart
KVM_S390_INT_CLOCK_COMP (vcpu) - clock comparator interrupt
KVM_S390_INT_CPU_TIMER (vcpu) - CPU timer interrupt
KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
			   parameters in parm and parm64
KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
    I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
    I/O interruption parameters in parm (subchannel) and parm64 (intparm,
    interruption subclass)
KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
                           machine check interrupt code in parm64 (note that
                           machine checks needing further payload are not
                           supported by this ioctl)

Note that the vcpu ioctl is asynchronous to vcpu execution.

4.78 KVM_PPC_GET_HTAB_FD

Capability: KVM_CAP_PPC_HTAB_FD
Architectures: powerpc
Type: vm ioctl
Parameters: Pointer to struct kvm_get_htab_fd (in)
Returns: file descriptor number (>= 0) on success, -1 on error

This returns a file descriptor that can be used either to read out the
entries in the guest's hashed page table (HPT), or to write entries to
initialize the HPT.  The returned fd can only be written to if the
KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
can only be read if that bit is clear.  The argument struct looks like
this:

/* For KVM_PPC_GET_HTAB_FD */
struct kvm_get_htab_fd {
	__u64	flags;
	__u64	start_index;
	__u64	reserved[2];
};

/* Values for kvm_get_htab_fd.flags */
#define KVM_GET_HTAB_BOLTED_ONLY	((__u64)0x1)
#define KVM_GET_HTAB_WRITE		((__u64)0x2)

The `start_index' field gives the index in the HPT of the entry at
which to start reading.  It is ignored when writing.

Reads on the fd will initially supply information about all
"interesting" HPT entries.  Interesting entries are those with the
bolted bit set, if the KVM_GET_HTAB_BOLTED_ONLY bit is set, otherwise
all entries.  When the end of the HPT is reached, the read() will
return.  If read() is called again on the fd, it will start again from
the beginning of the HPT, but will only return HPT entries that have
changed since they were last read.

Data read or written is structured as a header (8 bytes) followed by a
series of valid HPT entries (16 bytes each).  The header indicates how
many valid HPT entries there are and how many invalid entries follow
the valid entries.  The invalid entries are not represented explicitly
in the stream.
The header format is:

struct kvm_get_htab_header {
	__u32	index;
	__u16	n_valid;
	__u16	n_invalid;
};

Writes to the fd create HPT entries starting at the index given in the
header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.

4.79 KVM_CREATE_DEVICE

Capability: KVM_CAP_DEVICE_CTRL
Type: vm ioctl
Parameters: struct kvm_create_device (in/out)
Returns: 0 on success, -1 on error
Errors:
  ENODEV: The device type is unknown or unsupported
  EEXIST: Device already created, and this type of device may not
          be instantiated multiple times

  Other error conditions may be defined by individual device types or
  have their standard meanings.

Creates an emulated device in the kernel.  The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.

If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).

Individual devices should not define flags.  Attributes should be used
for specifying any behavior that is not implied by the device type
number.

struct kvm_create_device {
	__u32	type;	/* in: KVM_DEV_TYPE_xxx */
	__u32	fd;	/* out: device handle */
	__u32	flags;	/* in: KVM_CREATE_DEVICE_xxx */
};

4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
  ENXIO:  The group or attribute is unknown/unsupported for this device
  EPERM:  The attribute cannot (currently) be accessed this way
          (e.g.
          read-only attribute, or attribute that only makes
          sense when the device is in a different state)

  Other error conditions may be defined by individual device types.

Gets/sets a specified piece of device configuration and/or state.  The
semantics are device-specific.  See individual device documentation in
the "devices" directory.  As with ONE_REG, the size of the data
transferred is defined by the particular attribute.

struct kvm_device_attr {
	__u32	flags;		/* no flags currently defined */
	__u32	group;		/* device-defined */
	__u64	attr;		/* group-defined */
	__u64	addr;		/* userspace address of attr data */
};

4.81 KVM_HAS_DEVICE_ATTR

Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device
Type: device ioctl, vm ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
  ENXIO:  The group or attribute is unknown/unsupported for this device

Tests whether a device supports a particular attribute.  A successful
return indicates the attribute is implemented.  It does not necessarily
indicate that the attribute can be read or written in the device's
current state.  "addr" is ignored.

4.82 KVM_ARM_VCPU_INIT

Capability: basic
Architectures: arm, arm64
Type: vcpu ioctl
Parameters: struct kvm_vcpu_init (in)
Returns: 0 on success; -1 on error
Errors:
  EINVAL:    the target is unknown, or the combination of features is invalid.
  ENOENT:    a features bit specified is unknown.

This tells KVM what type of CPU to present to the guest, and what
optional features it should have.  This will cause a reset of the cpu
registers to their initial values.  If this is not called, KVM_RUN will
return ENOEXEC for that vcpu.

Note that because some registers reflect machine topology, all vcpus
should be created before this ioctl is invoked.

Userspace can call this function multiple times for a given vcpu, including
after the vcpu has been run.  This will reset the vcpu to its initial
state.  All calls to this function after the initial call must use the same
target and same set of feature flags, otherwise EINVAL will be returned.

Possible features:
	- KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
	  Depends on KVM_CAP_ARM_PSCI.  If not set, the CPU will be powered on
	  and execute guest code when KVM_RUN is called.
	- KVM_ARM_VCPU_EL1_32BIT: Starts the CPU in a 32bit mode.
	  Depends on KVM_CAP_ARM_EL1_32BIT (arm64 only).
	- KVM_ARM_VCPU_PSCI_0_2: Emulate PSCI v0.2 for the CPU.
	  Depends on KVM_CAP_ARM_PSCI_0_2.


4.83 KVM_ARM_PREFERRED_TARGET

Capability: basic
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_vcpu_init (out)
Returns: 0 on success; -1 on error
Errors:
  ENODEV:    no preferred target available for the host

This queries KVM for the preferred CPU target type which can be emulated
by KVM on the underlying host.

The ioctl returns a struct kvm_vcpu_init instance containing information
about the preferred CPU target type and recommended features for it.  The
kvm_vcpu_init->features bitmap returned will have feature bits set if
the preferred target recommends setting these features, but this is
not mandatory.

The information returned by this ioctl can be used to prepare an instance
of struct kvm_vcpu_init for the KVM_ARM_VCPU_INIT ioctl, which will result
in a VCPU matching the underlying host.
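
A typical flow is to obtain a struct kvm_vcpu_init from
KVM_ARM_PREFERRED_TARGET, add any extra features, and pass the result to
KVM_ARM_VCPU_INIT.  The sketch below covers only the middle step.  It
assumes that the structure consists of a 32-bit target followed by a bitmap
of seven 32-bit words, and that each KVM_ARM_VCPU_* feature constant is a
bit index into that bitmap; struct vcpu_init and both helpers are
hypothetical local names, and real code would use the definitions from
<linux/kvm.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VCPU_INIT_FEATURE_WORDS 7	/* assumed size of the bitmap */

/* Local mirror of struct kvm_vcpu_init (assumed layout, see above). */
struct vcpu_init {
	uint32_t target;
	uint32_t features[VCPU_INIT_FEATURE_WORDS];
};

/* Hypothetical helper: set feature bit 'bit' in the bitmap. */
static void vcpu_init_set_feature(struct vcpu_init *init, unsigned int bit)
{
	assert(init != NULL);
	assert(bit < VCPU_INIT_FEATURE_WORDS * 32);
	init->features[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Hypothetical helper: return 1 if feature bit 'bit' is set, else 0. */
static int vcpu_init_has_feature(const struct vcpu_init *init,
				 unsigned int bit)
{
	assert(init != NULL);
	assert(bit < VCPU_INIT_FEATURE_WORDS * 32);
	return (init->features[bit / 32] >> (bit % 32)) & 1u;
}
```

Because later KVM_ARM_VCPU_INIT calls on the same vcpu must repeat the same
target and feature set, a VMM would normally keep the prepared structure
around and reuse it for resets.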


4.84 KVM_GET_REG_LIST

Capability: basic
Architectures: arm, arm64, mips
Type: vcpu ioctl
Parameters: struct kvm_reg_list (in/out)
Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the reg index list is too big to fit in the array specified by
             the user (the number required will be written into n).

struct kvm_reg_list {
	__u64 n; /* number of registers in reg[] */
	__u64 reg[0];
};

This ioctl returns the guest registers that are supported for the
KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.


4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)

Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
Architectures: arm, arm64
Type: vm ioctl
Parameters: struct kvm_arm_device_addr (in)
Returns: 0 on success, -1 on error
Errors:
  ENODEV: The device id is unknown
  ENXIO:  Device not supported on current system
  EEXIST: Address already set
  E2BIG:  Address outside guest physical address space
  EBUSY:  Address overlaps with other device range

struct kvm_arm_device_addr {
	__u64 id;
	__u64 addr;
};

Specify a device address in the guest's physical address space where guests
can access emulated or directly exposed devices, which the host kernel needs
to know about.  The id field is an architecture specific identifier for a
specific device.

ARM/arm64 divides the id field into two parts, a device id and an
address type id specific to the individual device.

  bits:  | 63        ...       32 | 31    ...    16 | 15    ...    0 |
  field: |        0x00000000      |     device id   |  addr type id  |

ARM/arm64 currently only require this when using the in-kernel GIC
support for the hardware VGIC features, using KVM_ARM_DEVICE_VGIC_V2
as the device id.
When setting the base address for the guest's
mapping of the VGIC virtual CPU and distributor interface, the ioctl
must be called after calling KVM_CREATE_IRQCHIP, but before calling
KVM_RUN on any of the VCPUs.  Calling this ioctl twice for any of the
base addresses will return -EEXIST.

Note, this IOCTL is deprecated and the more flexible SET/GET_DEVICE_ATTR API
should be used instead.


4.86 KVM_PPC_RTAS_DEFINE_TOKEN

Capability: KVM_CAP_PPC_RTAS
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_rtas_token_args
Returns: 0 on success, -1 on error

Defines a token value for a RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel.  The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation.  If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel.  If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.

4.87 KVM_SET_GUEST_DEBUG

Capability: KVM_CAP_SET_GUEST_DEBUG
Architectures: x86, s390, ppc
Type: vcpu ioctl
Parameters: struct kvm_guest_debug (in)
Returns: 0 on success; -1 on error

struct kvm_guest_debug {
       __u32 control;
       __u32 pad;
       struct kvm_guest_debug_arch arch;
};

Set up the processor specific debug registers and configure the vcpu for
handling guest debug events.  There are two parts to the structure.  The
first is a control bitfield that indicates the types of debug events to
handle when running.
Common control bits are:

  - KVM_GUESTDBG_ENABLE:        guest debugging is enabled
  - KVM_GUESTDBG_SINGLESTEP:    the next run should single-step

The top 16 bits of the control field are architecture specific control
flags which can include the following:

  - KVM_GUESTDBG_USE_SW_BP:     using software breakpoints [x86]
  - KVM_GUESTDBG_USE_HW_BP:     using hardware breakpoints [x86, s390]
  - KVM_GUESTDBG_INJECT_DB:     inject DB type exception [x86]
  - KVM_GUESTDBG_INJECT_BP:     inject BP type exception [x86]
  - KVM_GUESTDBG_EXIT_PENDING:  trigger an immediate guest exit [s390]

For example KVM_GUESTDBG_USE_SW_BP indicates that software breakpoints
are enabled in memory, so we need to ensure breakpoint exceptions are
correctly trapped and the KVM run loop exits at the breakpoint rather
than running off into the normal guest vector.  For KVM_GUESTDBG_USE_HW_BP
we need to ensure the guest vCPU's architecture specific registers are
updated to the correct (supplied) values.

The second part of the structure is architecture specific and
typically contains a set of debug registers.

When debug events exit the main run loop, they do so with the reason
KVM_EXIT_DEBUG, and the kvm_debug_exit_arch part of the kvm_run
structure contains architecture specific debug information.

4.88 KVM_GET_EMULATED_CPUID

Capability: KVM_CAP_EXT_EMUL_CPUID
Architectures: x86
Type: system ioctl
Parameters: struct kvm_cpuid2 (in/out)
Returns: 0 on success, -1 on error

struct kvm_cpuid2 {
	__u32 nent;
	__u32 flags;
	struct kvm_cpuid_entry2 entries[0];
};

The member 'flags' is used for passing flags from userspace.

#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX		BIT(0)
#define KVM_CPUID_FLAG_STATEFUL_FUNC		BIT(1)
#define KVM_CPUID_FLAG_STATE_READ_NEXT		BIT(2)

struct kvm_cpuid_entry2 {
	__u32 function;
	__u32 index;
	__u32 flags;
	__u32 eax;
	__u32 ebx;
	__u32 ecx;
	__u32 edx;
	__u32 padding[3];
};

This ioctl returns x86 cpuid features which are emulated by
kvm.  Userspace can use the information returned by this ioctl to query
which features are emulated by kvm instead of being present natively.

Userspace invokes KVM_GET_EMULATED_CPUID by passing a kvm_cpuid2
structure with the 'nent' field indicating the number of entries in
the variable-size array 'entries'.  If the number of entries is too low
to describe the cpu capabilities, an error (E2BIG) is returned.  If the
number is too high, the 'nent' field is adjusted and an error (ENOMEM)
is returned.  If the number is just right, the 'nent' field is adjusted
to the number of valid entries in the 'entries' array, which is then
filled.

The entries returned are the set CPUID bits of the respective features
which kvm emulates, as returned by the CPUID instruction, with unknown
or unsupported feature bits cleared.

Features like x2apic, for example, may not be present in the host cpu
but are exposed by kvm in KVM_GET_SUPPORTED_CPUID because they can be
emulated efficiently, and are thus not included here.
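
Since 'entries' is a variable-size array, the caller has to allocate the
header and the entries as one block and record the capacity in 'nent'
before issuing the ioctl.  A minimal sketch of that allocation follows; the
struct names are local mirrors of kvm_cpuid2 and kvm_cpuid_entry2, and
cpuid2_size() and cpuid2_alloc() are hypothetical helpers, not part of the
KVM API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Local mirrors of struct kvm_cpuid_entry2 and struct kvm_cpuid2
 * (assumption: real code would use <linux/kvm.h>). */
struct cpuid_entry2 {
	uint32_t function;
	uint32_t index;
	uint32_t flags;
	uint32_t eax;
	uint32_t ebx;
	uint32_t ecx;
	uint32_t edx;
	uint32_t padding[3];
};

struct cpuid2 {
	uint32_t nent;
	uint32_t flags;
	struct cpuid_entry2 entries[];
};

/* Bytes needed for a header plus 'nent' entries. */
static size_t cpuid2_size(uint32_t nent)
{
	return sizeof(struct cpuid2) +
	       (size_t)nent * sizeof(struct cpuid_entry2);
}

/* Hypothetical helper: allocate a zeroed request with room for 'nent'
 * entries.  Returns NULL on allocation failure; the caller frees it. */
static struct cpuid2 *cpuid2_alloc(uint32_t nent)
{
	struct cpuid2 *c;

	assert(nent > 0);
	c = calloc(1, cpuid2_size(nent));
	if (c)
		c->nent = nent;
	return c;
}
```

A caller that receives E2BIG would free the block, allocate a larger one and
retry; on success 'nent' holds the number of valid entries.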

The fields in each entry are defined as follows:

  function: the eax value used to obtain the entry
  index: the ecx value used to obtain the entry (for entries that are
         affected by ecx)
  flags: an OR of zero or more of the following:
        KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
           if the index field is valid
        KVM_CPUID_FLAG_STATEFUL_FUNC:
           if cpuid for this function returns different values for successive
           invocations; there will be several entries with the same function,
           all with this flag set
        KVM_CPUID_FLAG_STATE_READ_NEXT:
           for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
           the first entry to be read by a cpu
   eax, ebx, ecx, edx: the values returned by the cpuid instruction for
         this function/index combination

4.89 KVM_S390_MEM_OP

Capability: KVM_CAP_S390_MEM_OP
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_mem_op (in)
Returns: = 0 on success,
         < 0 on generic error (e.g. -EFAULT or -ENOMEM),
         > 0 if an exception occurred while walking the page tables

Read or write data from/to the logical (virtual) memory of a VCPU.

Parameters are specified via the following structure:

struct kvm_s390_mem_op {
	__u64 gaddr;		/* the guest address */
	__u64 flags;		/* flags */
	__u32 size;		/* amount of bytes */
	__u32 op;		/* type of operation */
	__u64 buf;		/* buffer in userspace */
	__u8 ar;		/* the access register number */
	__u8 reserved[31];	/* should be set to 0 */
};

The type of operation is specified in the "op" field.  It is either
KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
KVM_S390_MEMOP_LOGICAL_WRITE for writing to logical memory space.
The
KVM_S390_MEMOP_F_CHECK_ONLY flag can be set in the "flags" field to check
whether the corresponding memory access would create an access exception
(without touching the data in the memory at the destination).  In case an
access exception occurred while walking the MMU tables of the guest, the
ioctl returns a positive error number to indicate the type of exception.
This exception is also raised directly at the corresponding VCPU if the
flag KVM_S390_MEMOP_F_INJECT_EXCEPTION is set in the "flags" field.

The start address of the memory region has to be specified in the "gaddr"
field, and the length of the region in the "size" field.  "buf" is the buffer
supplied by the userspace application where the read data should be written
to for KVM_S390_MEMOP_LOGICAL_READ, or where the data that should be written
is stored for a KVM_S390_MEMOP_LOGICAL_WRITE.  "buf" is unused and can be NULL
when KVM_S390_MEMOP_F_CHECK_ONLY is specified.  "ar" designates the access
register number to be used.

The "reserved" field is meant for future extensions.  It is not used by
KVM with the currently defined set of flags.

4.90 KVM_S390_GET_SKEYS

Capability: KVM_CAP_S390_SKEYS
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_skeys
Returns: 0 on success, KVM_S390_GET_KEYS_NONE if guest is not using storage
         keys, negative value on error

This ioctl is used to get guest storage key values on the s390
architecture.  The ioctl takes parameters via the kvm_s390_skeys struct.

struct kvm_s390_skeys {
	__u64 start_gfn;
	__u64 count;
	__u64 skeydata_addr;
	__u32 flags;
	__u32 reserved[9];
};

The start_gfn field is the number of the first guest frame whose storage keys
you want to get.

The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to get.
The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX.  Values outside this range
will cause the ioctl to return -EINVAL.

The skeydata_addr field is the address to a buffer large enough to hold count
bytes.  This buffer will be filled with storage key data by the ioctl.

4.91 KVM_S390_SET_SKEYS

Capability: KVM_CAP_S390_SKEYS
Architectures: s390
Type: vm ioctl
Parameters: struct kvm_s390_skeys
Returns: 0 on success, negative value on error

This ioctl is used to set guest storage key values on the s390
architecture.  The ioctl takes parameters via the kvm_s390_skeys struct.
See section on KVM_S390_GET_SKEYS for struct definition.

The start_gfn field is the number of the first guest frame whose storage keys
you want to set.

The count field is the number of consecutive frames (starting from start_gfn)
whose storage keys to set.  The count field must be at least 1 and the maximum
allowed value is defined as KVM_S390_SKEYS_ALLOC_MAX.  Values outside this range
will cause the ioctl to return -EINVAL.

The skeydata_addr field is the address to a buffer containing count bytes of
storage keys.  Each byte in the buffer will be set as the storage key for a
single frame starting at start_gfn for count frames.

Note: If any architecturally invalid key value is found in the given data then
the ioctl will return -EINVAL.
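
The argument rules shared by KVM_S390_GET_SKEYS and KVM_S390_SET_SKEYS can
be sketched as a small userspace-side check.  struct skeys_args is a local
mirror of struct kvm_s390_skeys and skeys_args_init() is a hypothetical
helper.  The upper limit is passed in as "max_count" because the value of
KVM_S390_SKEYS_ALLOC_MAX is not given here; the check merely mirrors the
range rule stated above so that a bad count is caught before the ioctl.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_s390_skeys (assumption: real code would
 * use the definition from <linux/kvm.h>). */
struct skeys_args {
	uint64_t start_gfn;
	uint64_t count;
	uint64_t skeydata_addr;
	uint32_t flags;
	uint32_t reserved[9];
};

/* Hypothetical helper: validate 'count' and fill in the argument.
 * 'keys' must point to a buffer of at least 'count' bytes, one storage
 * key per guest frame.  Returns 0, or -EINVAL if count is out of range. */
static int skeys_args_init(struct skeys_args *args, uint64_t start_gfn,
			   uint64_t count, void *keys, uint64_t max_count)
{
	assert(args != NULL && keys != NULL);
	if (count < 1 || count > max_count)
		return -EINVAL;
	memset(args, 0, sizeof(*args));
	args->start_gfn = start_gfn;
	args->count = count;
	args->skeydata_addr = (uint64_t)(uintptr_t)keys;
	return 0;
}
```

The same structure serves both directions: for the get ioctl the kernel
fills the buffer, for the set ioctl it reads one key per frame from it.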

4.92 KVM_S390_IRQ

Capability: KVM_CAP_S390_INJECT_IRQ
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq (in)
Returns: 0 on success, -1 on error
Errors:
  EINVAL: interrupt type is invalid
          type is KVM_S390_SIGP_STOP and flag parameter is an invalid value
          type is KVM_S390_INT_EXTERNAL_CALL and code is bigger
            than the maximum of VCPUs
  EBUSY:  type is KVM_S390_SIGP_SET_PREFIX and vcpu is not stopped
          type is KVM_S390_SIGP_STOP and a stop irq is already pending
          type is KVM_S390_INT_EXTERNAL_CALL and an external call interrupt
            is already pending

Allows injecting an interrupt into the guest.

Using struct kvm_s390_irq as a parameter allows injecting additional
payload, which is not possible via KVM_S390_INTERRUPT.

Interrupt parameters are passed via kvm_s390_irq:

struct kvm_s390_irq {
	__u64 type;
	union {
		struct kvm_s390_io_info io;
		struct kvm_s390_ext_info ext;
		struct kvm_s390_pgm_info pgm;
		struct kvm_s390_emerg_info emerg;
		struct kvm_s390_extcall_info extcall;
		struct kvm_s390_prefix_info prefix;
		struct kvm_s390_stop_info stop;
		struct kvm_s390_mchk_info mchk;
		char reserved[64];
	} u;
};

type can be one of the following:

KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
KVM_S390_PROGRAM_INT - program check; parameters in .pgm
KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
KVM_S390_RESTART - restart; no parameters
KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
KVM_S390_MCHK - machine check interrupt; parameters in .mchk


Note that the vcpu ioctl is asynchronous to vcpu execution.

4.94 KVM_S390_GET_IRQ_STATE

Capability: KVM_CAP_S390_IRQ_STATE
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq_state (out)
Returns: number of bytes copied into buffer (>= 0),
         -EINVAL if buffer size is 0,
         -ENOBUFS if buffer size is too small to fit all pending interrupts,
         -EFAULT if the buffer address was invalid

This ioctl allows userspace to retrieve the complete state of all currently
pending interrupts in a single buffer. Use cases include migration
and introspection. The parameter structure contains the address of a
userspace buffer and its length:

struct kvm_s390_irq_state {
	__u64 buf;
	__u32 flags;
	__u32 len;
	__u32 reserved[4];
};

Userspace passes in the above struct and for each pending interrupt a
struct kvm_s390_irq is copied to the provided buffer.

If -ENOBUFS is returned the buffer provided was too small and userspace
may retry with a bigger buffer.

4.95 KVM_S390_SET_IRQ_STATE

Capability: KVM_CAP_S390_IRQ_STATE
Architectures: s390
Type: vcpu ioctl
Parameters: struct kvm_s390_irq_state (in)
Returns: 0 on success,
         -EFAULT if the buffer address was invalid,
         -EINVAL for an invalid buffer length (see below),
         -EBUSY if there were already interrupts pending,
         errors occurring when actually injecting the
          interrupt. See KVM_S390_IRQ.

This ioctl allows userspace to set the complete state of all cpu-local
interrupts currently pending for the vcpu. It is intended for restoring
interrupt state after a migration. The input parameter is a userspace buffer
containing a struct kvm_s390_irq_state (the same structure as used by
KVM_S390_GET_IRQ_STATE):

struct kvm_s390_irq_state {
	__u64 buf;
	__u32 flags;
	__u32 len;
	__u32 reserved[4];
};

The userspace memory referenced by buf contains a struct kvm_s390_irq
for each interrupt to be injected into the guest.
If one of the interrupts could not be injected for some reason the
ioctl aborts.

len must be a multiple of sizeof(struct kvm_s390_irq). It must be > 0
and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
which is the maximum number of possibly pending cpu-local interrupts.

5. The kvm_run structure
------------------------

Application code obtains a pointer to the kvm_run structure by
mmap()ing a vcpu fd. From that point, application code can control
execution by changing fields in kvm_run prior to calling the KVM_RUN
ioctl, and obtain information about the reason KVM_RUN returned by
looking up structure members.

struct kvm_run {
	/* in */
	__u8 request_interrupt_window;

Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.

	__u8 padding1[7];

	/* out */
	__u32 exit_reason;

When KVM_RUN has returned successfully (return value 0), this informs
application code why KVM_RUN has returned. Allowable values for this
field are detailed below.

	__u8 ready_for_interrupt_injection;

If request_interrupt_window has been specified, this field indicates
an interrupt can be injected now with KVM_INTERRUPT.

	__u8 if_flag;

The value of the current interrupt flag. Only valid if in-kernel
local APIC is not used.

	__u8 padding2[2];

	/* in (pre_kvm_run), out (post_kvm_run) */
	__u64 cr8;

The value of the cr8 register. Only valid if in-kernel local APIC is
not used. Both input and output.

	__u64 apic_base;

The value of the APIC BASE msr. Only valid if in-kernel local
APIC is not used. Both input and output.

	union {
		/* KVM_EXIT_UNKNOWN */
		struct {
			__u64 hardware_exit_reason;
		} hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons. Further architecture-specific information is available in
hardware_exit_reason.

		/* KVM_EXIT_FAIL_ENTRY */
		struct {
			__u64 hardware_entry_failure_reason;
		} fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons. Further architecture-specific information is
available in hardware_entry_failure_reason.

		/* KVM_EXIT_EXCEPTION */
		struct {
			__u32 exception;
			__u32 error_code;
		} ex;

Unused.

		/* KVM_EXIT_IO */
		struct {
#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1
			__u8 direction;
			__u8 size; /* bytes */
			__u16 port;
			__u32 count;
			__u64 data_offset; /* relative to kvm_run start */
		} io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN). Data format is a packed array.

		struct {
			struct kvm_debug_exit_arch arch;
		} debug;

Unused.

		/* KVM_EXIT_MMIO */
		struct {
			__u64 phys_addr;
			__u8 data[8];
			__u32 len;
			__u8 is_write;
		} mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

The 'data' member contains, in its first 'len' bytes, the value as it would
appear if the VCPU performed a load or store of the appropriate width directly
to the byte array.
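
As an illustration of the KVM_EXIT_MMIO contract described above, the sketch
below services one exit against a toy register window. Everything here is an
assumption made for the example: struct mmio_exit is a local mirror of the
mmio member, and struct toy_device and toy_device_mmio() are hypothetical
names, not part of the KVM API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the mmio member of the kvm_run union shown above. */
struct mmio_exit {
	uint64_t phys_addr;
	uint8_t  data[8];
	uint32_t len;
	uint8_t  is_write;
};

/* Hypothetical device: a small register window in guest physical space. */
struct toy_device {
	uint64_t base;		/* first guest physical address */
	uint8_t  regs[16];	/* backing store for the window */
};

/*
 * Service one KVM_EXIT_MMIO. Returns 0 if the access hit the device,
 * -1 if it falls outside the window or len is not in 1..8.
 */
static int toy_device_mmio(struct toy_device *dev, struct mmio_exit *m)
{
	assert(dev != NULL && m != NULL);

	if (m->len < 1 || m->len > sizeof(m->data))
		return -1;
	if (m->phys_addr < dev->base ||
	    m->phys_addr - dev->base > sizeof(dev->regs) - m->len)
		return -1;

	uint8_t *reg = dev->regs + (m->phys_addr - dev->base);

	if (m->is_write)
		memcpy(reg, m->data, m->len);	/* guest store: consume data */
	else
		memcpy(m->data, reg, m->len);	/* guest load: supply data */
	return 0;
}
```

For a load, the bytes placed in 'data' only reach the guest once userspace
calls KVM_RUN again.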

NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR_HCALL and
      KVM_EXIT_EPR the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN. The kernel side will first finish
incomplete operations and then check for pending signals. Userspace
can re-enter the guest with an unmasked signal pending to complete
pending operations.

		/* KVM_EXIT_HYPERCALL */
		struct {
			__u64 nr;
			__u64 args[6];
			__u64 ret;
			__u32 longmode;
			__u32 pad;
		} hypercall;

Unused. This was once used for 'hypercall to userspace'. To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

		/* KVM_EXIT_TPR_ACCESS */
		struct {
			__u64 rip;
			__u32 is_write;
			__u32 pad;
		} tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

		/* KVM_EXIT_S390_SIEIC */
		struct {
			__u8 icptcode;
			__u64 mask; /* psw upper half */
			__u64 addr; /* psw lower half */
			__u16 ipa;
			__u32 ipb;
		} s390_sieic;

s390 specific.

		/* KVM_EXIT_S390_RESET */
#define KVM_S390_RESET_POR       1
#define KVM_S390_RESET_CLEAR     2
#define KVM_S390_RESET_SUBSYSTEM 4
#define KVM_S390_RESET_CPU_INIT  8
#define KVM_S390_RESET_IPL       16
		__u64 s390_reset_flags;

s390 specific.

		/* KVM_EXIT_S390_UCONTROL */
		struct {
			__u64 trans_exc_code;
			__u32 pgm_code;
		} s390_ucontrol;

s390 specific. A page fault has occurred for a user controlled virtual
machine (KVM_VM_S390_UCONTROL) on its host page table that cannot be
resolved by the kernel.
The program code and the translation exception code that were placed
in the cpu's lowcore are presented here as defined by the z/Architecture
Principles of Operation book in the chapter on Dynamic Address Translation
(DAT).

		/* KVM_EXIT_DCR */
		struct {
			__u32 dcrn;
			__u32 data;
			__u8 is_write;
		} dcr;

Deprecated - was used for 440 KVM.

		/* KVM_EXIT_OSI */
		struct {
			__u64 gprs[32];
		} osi;

MOL uses a special hypercall interface it calls 'OSI'. To enable it, we catch
hypercalls and exit with this exit struct that contains all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and when it's done modify the gprs as
necessary. Upon guest entry all guest GPRs will then be replaced by the values
in this struct.

		/* KVM_EXIT_PAPR_HCALL */
		struct {
			__u64 nr;
			__u64 ret;
			__u64 args[9];
		} papr_hcall;

This is used on 64-bit PowerPC when emulating a pSeries partition,
e.g. with the 'pseries' machine type in qemu. It occurs when the
guest does a hypercall using the 'sc 1' instruction. The 'nr' field
contains the hypercall number (from the guest R3), and 'args' contains
the arguments (from the guest R4 - R12). Userspace should put the
return code in 'ret' and any extra returned values in args[].
The possible hypercalls are defined in the Power Architecture Platform
Requirements (PAPR) document available from www.power.org (free
developer registration required to access it).

		/* KVM_EXIT_S390_TSCH */
		struct {
			__u16 subchannel_id;
			__u16 subchannel_nr;
			__u32 io_int_parm;
			__u32 io_int_word;
			__u32 ipb;
			__u8 dequeued;
		} s390_tsch;

s390 specific. This exit occurs when KVM_CAP_S390_CSS_SUPPORT has been enabled
and TEST SUBCHANNEL was intercepted.
If dequeued is set, a pending I/O
interrupt for the target subchannel has been dequeued and subchannel_id,
subchannel_nr, io_int_parm and io_int_word contain the parameters for that
interrupt. ipb is needed for instruction parameter decoding.

		/* KVM_EXIT_EPR */
		struct {
			__u32 epr;
		} epr;

On FSL BookE PowerPC chips, the interrupt controller has a fast
interrupt acknowledge path to the core. When the core successfully
delivers an interrupt, it automatically populates the EPR register with
the interrupt vector number and acknowledges the interrupt inside
the interrupt controller.

In case the interrupt controller lives in user space, we need to do
the interrupt acknowledge cycle through it to fetch the next to be
delivered interrupt vector using this exit.

It gets triggered whenever KVM_CAP_PPC_EPR is enabled and an
external interrupt has just been delivered into the guest. User space
should put the acknowledged interrupt vector into the 'epr' field.

		/* KVM_EXIT_SYSTEM_EVENT */
		struct {
#define KVM_SYSTEM_EVENT_SHUTDOWN       1
#define KVM_SYSTEM_EVENT_RESET          2
			__u32 type;
			__u64 flags;
		} system_event;

If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
a system-level event using some architecture specific mechanism (hypercall
or some special instruction). In case of ARM/ARM64, this is triggered using
HVC instruction based PSCI call from the vcpu. The 'type' field describes
the system-level event type. The 'flags' field describes architecture
specific flags for the system-level event.

Valid values for 'type' are:
  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
   VM. Userspace is not obliged to honour this, and if it does honour
   this does not need to destroy the VM synchronously (i.e. it may call
   KVM_RUN again before shutdown finally occurs).
  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
   As with SHUTDOWN, userspace can choose to ignore the request, or
   to schedule the reset to occur in the future and may call KVM_RUN again.

		/* Fix the size of the union. */
		char padding[256];
	};

	/*
	 * shared registers between kvm and userspace.
	 * kvm_valid_regs specifies the register classes set by the host
	 * kvm_dirty_regs specifies the register classes dirtied by userspace
	 * struct kvm_sync_regs is architecture specific, as well as the
	 * bits for kvm_valid_regs and kvm_dirty_regs
	 */
	__u64 kvm_valid_regs;
	__u64 kvm_dirty_regs;
	union {
		struct kvm_sync_regs regs;
		char padding[1024];
	} s;

If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access
certain guest registers without having to call SET/GET_*REGS. Thus we can
avoid some system call overhead if userspace has to handle the exit.
Userspace can query the validity of the structure by checking
kvm_valid_regs for specific bits. These bits are architecture specific
and usually define the validity of a group of registers (e.g. one bit
for general purpose registers).

Please note that the kernel is allowed to use the kvm_run structure as the
primary storage for certain register types. Therefore, the kernel may use the
values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.

};



6. Capabilities that can be enabled on vCPUs
--------------------------------------------

There are certain capabilities that change the behavior of the virtual CPU or
the virtual machine when enabled. To enable them, please see section 4.37.
Below you can find a list of capabilities and what their effect on the vCPU or
the virtual machine is when enabling them.

The following information is provided along with the description:

  Architectures: which instruction set architectures provide this capability.
      x86 includes both i386 and x86_64.

  Target: whether this is a per-vcpu or per-vm capability.

  Parameters: what parameters are accepted by the capability.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


6.1 KVM_CAP_PPC_OSI

Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables interception of OSI hypercalls that otherwise would
be treated as normal system calls to be injected into the guest. OSI hypercalls
were invented by Mac-on-Linux to have a standardized communication mechanism
between the guest and the host.

When this capability is enabled, KVM_EXIT_OSI can occur.


6.2 KVM_CAP_PPC_PAPR

Architectures: ppc
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables interception of PAPR hypercalls. PAPR hypercalls are
done using the hypercall instruction "sc 1".

It also sets the guest privilege level to "supervisor" mode. Usually the guest
runs in "hypervisor" privilege mode with a few missing features.

In addition to the above, it changes the semantics of SDR1. In this mode, the
HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the
HTAB invisible to the guest.

When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.


6.3 KVM_CAP_SW_TLB

Architectures: ppc
Target: vcpu
Parameters: args[0] is the address of a struct kvm_config_tlb
Returns: 0 on success; -1 on error

struct kvm_config_tlb {
	__u64 params;
	__u64 array;
	__u32 mmu_type;
	__u32 array_len;
};

Configures the virtual CPU's TLB array, establishing a shared memory area
between userspace and KVM. The "params" and "array" fields are userspace
addresses of mmu-type-specific data structures. The "array_len" field is a
safety mechanism, and should be set to the size in bytes of the memory that
userspace has reserved for the array. It must be at least the size dictated
by "mmu_type" and "params".

While KVM_RUN is active, the shared region is under control of KVM. Its
contents are undefined, and any modification by userspace results in
boundedly undefined behavior.

On return from KVM_RUN, the shared region will reflect the current state of
the guest's TLB. If userspace makes any changes, it must call KVM_DIRTY_TLB
to tell KVM which entries have been changed, prior to calling KVM_RUN again
on this vcpu.

For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
 - The "params" field is of type "struct kvm_book3e_206_tlb_params".
 - The "array" field points to an array of type "struct
   kvm_book3e_206_tlb_entry".
 - The array consists of all entries in the first TLB, followed by all
   entries in the second TLB.
 - Within a TLB, entries are ordered first by increasing set number. Within a
   set, entries are ordered by way (increasing ESEL).
 - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1)
   where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value.
 - The tsize field of mas1 shall be set to 4K on TLB0, even though the
   hardware ignores this value for TLB0.
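
The array layout and TLB0 hash listed above can be turned into index
arithmetic as in the following sketch. struct toy_tlb_params is a cut-down
stand-in that carries only the sizes and ways needed here, and the function
names are hypothetical; the mask form of the hash assumes num_sets is a power
of two.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for the sizes/ways arrays of the TLB parameters. */
struct toy_tlb_params {
	uint32_t tlb_sizes[2];	/* entries per TLB */
	uint32_t tlb_ways[2];	/* associativity per TLB */
};

/* Set number in TLB0 for a given MAS2 value, using the hash above. */
static uint32_t tlb0_set(const struct toy_tlb_params *p, uint64_t mas2)
{
	assert(p->tlb_ways[0] != 0);
	uint32_t num_sets = p->tlb_sizes[0] / p->tlb_ways[0];

	/* The mask only works when num_sets is a power of two. */
	assert(num_sets != 0 && (num_sets & (num_sets - 1)) == 0);
	return (uint32_t)(mas2 >> 12) & (num_sets - 1);
}

/* Index into the shared array: sets in order, ways in order within a set. */
static uint32_t tlb0_index(const struct toy_tlb_params *p, uint64_t mas2,
			   uint32_t way)
{
	assert(way < p->tlb_ways[0]);
	return tlb0_set(p, mas2) * p->tlb_ways[0] + way;
}

/* TLB1 entries follow all TLB0 entries in the same array. */
static uint32_t tlb1_first_index(const struct toy_tlb_params *p)
{
	return p->tlb_sizes[0];
}
```

With a 512-entry, 4-way TLB0 there are 128 sets, so a MAS2 value of
0x12345000 hashes to set 0x45 and way 2 of that set sits at array index
0x45 * 4 + 2.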

6.4 KVM_CAP_S390_CSS_SUPPORT

Architectures: s390
Target: vcpu
Parameters: none
Returns: 0 on success; -1 on error

This capability enables support for handling of channel I/O instructions.

TEST PENDING INTERRUPTION and the interrupt portion of TEST SUBCHANNEL are
handled in-kernel, while the other I/O instructions are passed to userspace.

When this capability is enabled, KVM_EXIT_S390_TSCH will occur on TEST
SUBCHANNEL intercepts.

Note that even though this capability is enabled per-vcpu, the complete
virtual machine is affected.

6.5 KVM_CAP_PPC_EPR

Architectures: ppc
Target: vcpu
Parameters: args[0] defines whether the proxy facility is active
Returns: 0 on success; -1 on error

This capability enables or disables the delivery of interrupts through the
external proxy facility.

When enabled (args[0] != 0), every time the guest gets an external interrupt
delivered, it automatically exits into user space with a KVM_EXIT_EPR exit
to receive the topmost interrupt vector.

When disabled (args[0] == 0), behavior is as if this facility is unsupported.

When this capability is enabled, KVM_EXIT_EPR can occur.

6.6 KVM_CAP_IRQ_MPIC

Architectures: ppc
Parameters: args[0] is the MPIC device fd
            args[1] is the MPIC CPU number for this vcpu

This capability connects the vcpu to an in-kernel MPIC device.

6.7 KVM_CAP_IRQ_XICS

Architectures: ppc
Target: vcpu
Parameters: args[0] is the XICS device fd
            args[1] is the XICS CPU number (server ID) for this vcpu

This capability connects the vcpu to an in-kernel XICS device.

6.8 KVM_CAP_S390_IRQCHIP

Architectures: s390
Target: vm
Parameters: none

This capability enables the in-kernel irqchip for s390. Please refer to
"4.24 KVM_CREATE_IRQCHIP" for details.
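
Several of the capabilities above take their parameters in args[]. A hedged
sketch of preparing the argument for the enable call described in section
4.37 follows. The layout of struct toy_enable_cap is an assumption modelled
on struct kvm_enable_cap from <linux/kvm.h>, which real code should use
directly; enable_cap_init() is a hypothetical helper, and the capability
number is supplied by the caller rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the enable-capability argument (see section 4.37). */
struct toy_enable_cap {
	uint32_t cap;
	uint32_t flags;
	uint64_t args[4];
	uint8_t  pad[64];
};

/* Hypothetical helper: set up a capability with up to four arguments. */
static void enable_cap_init(struct toy_enable_cap *ec, uint32_t cap,
			    const uint64_t *args, unsigned int nargs)
{
	assert(ec != NULL && nargs <= 4);
	assert(nargs == 0 || args != NULL);

	memset(ec, 0, sizeof(*ec));	/* unused args, flags and pad are zero */
	ec->cap = cap;
	for (unsigned int i = 0; i < nargs; i++)
		ec->args[i] = args[i];
}
```

For KVM_CAP_IRQ_MPIC or KVM_CAP_IRQ_XICS the caller would pass the device fd
as args[0] and the CPU number as args[1]; for KVM_CAP_PPC_EPR a non-zero
args[0] activates the proxy facility.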

6.9 KVM_CAP_MIPS_FPU

Architectures: mips
Target: vcpu
Parameters: args[0] is reserved for future use (should be 0).

This capability allows the use of the host Floating Point Unit by the guest. It
allows the Config1.FP bit to be set to enable the FPU in the guest. Once this is
done the KVM_REG_MIPS_FPR_* and KVM_REG_MIPS_FCR_* registers can be accessed
(depending on the current guest FPU register mode), and the Status.FR,
Config5.FRE bits are accessible via the KVM API and also from the guest,
depending on them being supported by the FPU.

6.10 KVM_CAP_MIPS_MSA

Architectures: mips
Target: vcpu
Parameters: args[0] is reserved for future use (should be 0).

This capability allows the use of the MIPS SIMD Architecture (MSA) by the guest.
It allows the Config3.MSAP bit to be set to enable the use of MSA by the guest.
Once this is done the KVM_REG_MIPS_VEC_* and KVM_REG_MIPS_MSA_* registers can be
accessed, and the Config5.MSAEn bit is accessible via the KVM API and also from
the guest.

7. Capabilities that can be enabled on VMs
------------------------------------------

There are certain capabilities that change the behavior of the virtual
machine when enabled. To enable them, please see section 4.37. Below
you can find a list of capabilities and what their effect on the VM
is when enabling them.

The following information is provided along with the description:

  Architectures: which instruction set architectures provide this capability.
      x86 includes both i386 and x86_64.

  Parameters: what parameters are accepted by the capability.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.


7.1 KVM_CAP_PPC_ENABLE_HCALL

Architectures: ppc
Parameters: args[0] is the sPAPR hcall number
            args[1] is 0 to disable, 1 to enable in-kernel handling

This capability controls whether individual sPAPR hypercalls (hcalls)
get handled by the kernel or not. Enabling or disabling in-kernel
handling of an hcall is effective across the VM. On creation, an
initial set of hcalls are enabled for in-kernel handling, which
consists of those hcalls for which in-kernel handlers were implemented
before this capability was implemented. If disabled, the kernel will
not attempt to handle the hcall, but will always exit to userspace
to handle it. Note that it may not make sense to enable some and
disable others of a group of related hcalls, but KVM does not prevent
userspace from doing that.

If the hcall number specified is not one that has an in-kernel
implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
error.

7.2 KVM_CAP_S390_USER_SIGP

Architectures: s390
Parameters: none

This capability controls which SIGP orders will be handled completely in user
space. With this capability enabled, all fast orders will be handled completely
in the kernel:
- SENSE
- SENSE RUNNING
- EXTERNAL CALL
- EMERGENCY SIGNAL
- CONDITIONAL EMERGENCY SIGNAL

All other orders will be handled completely in user space.

Only privileged operation exceptions will be checked for in the kernel (or even
in the hardware prior to interception). If this capability is not enabled, the
old way of handling SIGP orders is used (partially in kernel and user space).

7.3 KVM_CAP_S390_VECTOR_REGISTERS

Architectures: s390
Parameters: none
Returns: 0 on success, negative value on error

Allows use of the vector registers introduced with the z13 processor, and
provides for the synchronization between host and user space. Will
return -EINVAL if the machine does not support vectors.

7.4 KVM_CAP_S390_USER_STSI

Architectures: s390
Parameters: none

This capability allows post-handlers for the STSI instruction. After
initial handling in the kernel, KVM exits to user space with
KVM_EXIT_S390_STSI to allow user space to insert further data.

Before exiting to userspace, kvm handlers should fill in the s390_stsi field
of vcpu->run:
struct {
	__u64 addr;
	__u8 ar;
	__u8 reserved;
	__u8 fc;
	__u8 sel1;
	__u16 sel2;
} s390_stsi;

@addr - guest address of STSI SYSIB
@fc   - function code
@sel1 - selector 1
@sel2 - selector 2
@ar   - access register number

KVM handlers should exit to userspace with rc = -EREMOTE.


8. Other capabilities.
----------------------

This section lists capabilities that give information about other
features of the KVM implementation.

8.1 KVM_CAP_PPC_HWRNG

Architectures: ppc

This capability, if KVM_CHECK_EXTENSION indicates that it is
available, means that the kernel has an implementation of the
H_RANDOM hypercall backed by a hardware random-number generator.
If present, the kernel H_RANDOM handler can be enabled for guest use
with the KVM_CAP_PPC_ENABLE_HCALL capability.