summary refs log tree commit diff
path: root/arch/arm64/kvm/hyp/pgtable.c
AgeCommit message (Collapse)Author
2022-01-14KVM: arm64: pkvm: Use the mm_ops indirection for cache maintenanceMarc Zyngier
CMOs issued from EL2 cannot directly use the kernel helpers, as EL2 doesn't have a mapping of the guest pages. Oops. Instead, use the mm_ops indirection to use helpers that will perform a mapping at EL2 and allow the CMO to be effective. Fixes: 25aa28691bb9 ("KVM: arm64: Move guest CMOs to the fault handlers") Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220114125038.1336965-1-maz@kernel.org
2022-01-04Merge branch kvm-arm64/misc-5.17 into kvmarm-master/nextMarc Zyngier
* kvm-arm64/misc-5.17: : . : Misc fixes and improvements: : - Add minimal support for ARMv8.7's PMU extension : - Constify kvm_io_gic_ops : - Drop kvm_is_transparent_hugepage() prototype : - Drop unused workaround_flags field : - Rework kvm_pgtable initialisation : - Documentation fixes : - Replace open-coded SCTLR_EL1.EE useage with its defined macro : - Sysreg list selftest update to handle PAuth : - Include cleanups : . KVM: arm64: vgic: Replace kernel.h with the necessary inclusions KVM: arm64: Fix comment typo in kvm_vcpu_finalize_sve() KVM: arm64: selftests: get-reg-list: Add pauth configuration KVM: arm64: Fix comment on barrier in kvm_psci_vcpu_on() KVM: arm64: Fix comment for kvm_reset_vcpu() KVM: arm64: Use defined value for SCTLR_ELx_EE KVM: arm64: Rework kvm_pgtable initialisation KVM: arm64: Drop unused workaround_flags vcpu field Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-12-16KVM: arm64: Rework kvm_pgtable initialisationMarc Zyngier
Ganapatrao reported that the kvm_pgtable->mmu pointer is more or less hardcoded to the main S2 mmu structure, while the nested code needs it to point to other instances (as we have one instance per nested context). Rework the initialisation of the kvm_pgtable structure so that this assumtion doesn't hold true anymore. This requires some minor changes to the order in which things are initialised (the mmu->arch pointer being the critical one). Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211129200150.351436-5-maz@kernel.org
2021-12-16KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2Will Deacon
Implement kvm_pgtable_hyp_unmap() which can be used to remove hypervisor stage-1 mappings at EL2. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-6-qperret@google.com
2021-12-16KVM: arm64: Refcount hyp stage-1 pgtable pagesQuentin Perret
To prepare the ground for allowing hyp stage-1 mappings to be removed at run-time, update the KVM page-table code to maintain a correct refcount using the ->{get,put}_page() function callbacks. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-3-qperret@google.com
2021-08-11KVM: arm64: Enable retrieving protections attributes of PTEsQuentin Perret
Introduce helper functions in the KVM stage-2 and stage-1 page-table manipulation library allowing to retrieve the enum kvm_pgtable_prot of a PTE. This will be useful to implement custom walkers outside of pgtable.c. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-17-qperret@google.com
2021-08-11KVM: arm64: Allow populating software bitsQuentin Perret
Introduce infrastructure allowing to manipulate software bits in stage-1 and stage-2 page-tables using additional entries in the kvm_pgtable_prot enum. This is heavily inspired by Marc's implementation of a similar feature in the NV patch series, but adapted to allow stage-1 changes as well: https://lore.kernel.org/kvmarm/20210510165920.1913477-56-maz@kernel.org/ Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-12-qperret@google.com
2021-08-11KVM: arm64: Enable forcing page-level stage-2 mappingsQuentin Perret
Much of the stage-2 manipulation logic relies on being able to destroy block mappings if e.g. installing a smaller mapping in the range. The rationale for this behaviour is that stage-2 mappings can always be re-created lazily. However, this gets more complicated when the stage-2 page-table is used to store metadata about the underlying pages. In such cases, destroying a block mapping may lead to losing part of the state, and confuse the user of those metadata (such as the hypervisor in nVHE protected mode). To avoid this, introduce a callback function in the pgtable struct which is called during all map operations to determine whether the mappings can use blocks, or should be forced to page granularity. This is used by the hypervisor when creating the host stage-2 to force page-level mappings when using non-default protection attributes. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-11-qperret@google.com
2021-08-11KVM: arm64: Tolerate re-creating hyp mappings to set software bitsQuentin Perret
The current hypervisor stage-1 mapping code doesn't allow changing an existing valid mapping. Relax this condition by allowing changes that only target software bits, as that will soon be needed to annotate shared pages. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-10-qperret@google.com
2021-08-11KVM: arm64: Don't overwrite software bits with owner idQuentin Perret
We will soon start annotating page-tables with new flags to track shared pages and such, and we will do so in valid mappings using software bits in the PTEs, as provided by the architecture. However, it is possible that we will need to use those flags to annotate invalid mappings as well in the future, similar to what we do to track page ownership in the host stage-2. In order to facilitate the annotation of invalid mappings with such flags, it would be preferable to re-use the same bits as for valid mappings (bits [58-55]), but these are currently used for ownership encoding. Since we have plenty of bits left to use in invalid mappings, move the ownership bits further down the PTE to avoid the conflict. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-9-qperret@google.com
2021-08-11KVM: arm64: Rename KVM_PTE_LEAF_ATTR_S2_IGNOREDQuentin Perret
The ignored bits for both stage-1 and stage-2 page and block descriptors are in [55:58], so rename KVM_PTE_LEAF_ATTR_S2_IGNORED to make it applicable to both. And while at it, since these bits are more commonly known as 'software' bits, rename accordingly. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-8-qperret@google.com
2021-08-11KVM: arm64: Optimize host memory abortsQuentin Perret
The kvm_pgtable_stage2_find_range() function is used in the host memory abort path to try and look for the largest block mapping that can be used to map the faulting address. In order to do so, the function currently walks the stage-2 page-table and looks for existing incompatible mappings within the range of the largest possible block. If incompatible mappings are found, it tries the same procedure again, but using a smaller block range, and repeats until a matching range is found (potentially up to page granularity). While this approach has benefits (mostly in the fact that it proactively coalesces host stage-2 mappings), it can be slow if the ranges are fragmented, and it isn't optimized to deal with CPUs faulting on the same IPA as all of them will do all the work every time. To avoid these issues, remove kvm_pgtable_stage2_find_range(), and walk the page-table only once in the host_mem_abort() path to find the closest leaf to the input address. With this, use the corresponding range if it is invalid and not owned by another entity. If a valid leaf is found, return -EAGAIN similar to what is done in the kvm_pgtable_stage2_map() path to optimize concurrent faults. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-7-qperret@google.com
2021-08-11KVM: arm64: Expose page-table helpersQuentin Perret
The KVM pgtable API exposes the kvm_pgtable_walk() function to allow the definition of walkers outside of pgtable.c. However, it is not easy to implement any of those walkers without some of the low-level helpers. Move some of them to the header file to allow re-use from other places. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210809152448.1810400-6-qperret@google.com
2021-08-02KVM: arm64: Introduce helper to retrieve a PTE and its levelMarc Zyngier
It is becoming a common need to fetch the PTE for a given address together with its level. Add such a helper. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20210726153552.1535838-2-maz@kernel.org
2021-06-18Merge branch arm64/for-next/caches into kvmarm-master/nextMarc Zyngier
arm64 cache management function cleanup from Fuad Tabba, shared with the arm64 tree. * arm64/for-next/caches: arm64: Rename arm64-internal cache maintenance functions arm64: Fix cache maintenance function comments arm64: sync_icache_aliases to take end parameter instead of size arm64: __clean_dcache_area_pou to take end parameter instead of size arm64: __clean_dcache_area_pop to take end parameter instead of size arm64: __clean_dcache_area_poc to take end parameter instead of size arm64: __flush_dcache_area to take end parameter instead of size arm64: dcache_by_line_op to take end parameter instead of size arm64: __inval_dcache_area to take end parameter instead of size arm64: Fix comments to refer to correct function __flush_icache_range arm64: Move documentation of dcache_by_line_op arm64: assembler: remove user_alt arm64: Downgrade flush_icache_range to invalidate arm64: Do not enable uaccess for invalidate_icache_range arm64: Do not enable uaccess for flush_icache_range arm64: Apply errata to swsusp_arch_suspend_exit arm64: assembler: add conditional cache fixups arm64: assembler: replace `kaddr` with `addr` Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-06-18KVM: arm64: Move guest CMOs to the fault handlersYanan Wang
We currently uniformly perform CMOs of D-cache and I-cache in function user_mem_abort before calling the fault handlers. If we get concurrent guest faults(e.g. translation faults, permission faults) or some really unnecessary guest faults caused by BBM, CMOs for the first vcpu are necessary while the others later are not. By moving CMOs to the fault handlers, we can easily identify conditions where they are really needed and avoid the unnecessary ones. As it's a time consuming process to perform CMOs especially when flushing a block range, so this solution reduces much load of kvm and improve efficiency of the stage-2 page table code. We can imagine two specific scenarios which will gain much benefit: 1) In a normal VM startup, this solution will improve the efficiency of handling guest page faults incurred by vCPUs, when initially populating stage-2 page tables. 2) After live migration, the heavy workload will be resumed on the destination VM, however all the stage-2 page tables need to be rebuilt at the moment. So this solution will ease the performance drop during resuming stage. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-5-wangyanan55@huawei.com
2021-06-18KVM: arm64: Introduce mm_ops member for structure stage2_attr_dataYanan Wang
Also add a mm_ops member for structure stage2_attr_data, since we will move I-cache maintenance for guest stage-2 to the permission path and as a result will need mm_ops for some callbacks. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-3-wangyanan55@huawei.com
2021-05-25arm64: Rename arm64-internal cache maintenance functionsFuad Tabba
Although naming across the codebase isn't that consistent, it tends to follow certain patterns. Moreover, the term "flush" isn't defined in the Arm Architecture reference manual, and might be interpreted to mean clean, invalidate, or both for a cache. Rename arm64-internal functions to make the naming internally consistent, as well as making it consistent with the Arm ARM, by specifying whether it applies to the instruction, data, or both caches, whether the operation is a clean, invalidate, or both. Also specify which point the operation applies to, i.e., to the point of unification (PoU), coherency (PoC), or persistence (PoP). This commit applies the following sed transformation to all files under arch/arm64: "s/\b__flush_cache_range\b/caches_clean_inval_pou_macro/g;"\ "s/\b__flush_icache_range\b/caches_clean_inval_pou/g;"\ "s/\binvalidate_icache_range\b/icache_inval_pou/g;"\ "s/\b__flush_dcache_area\b/dcache_clean_inval_poc/g;"\ "s/\b__inval_dcache_area\b/dcache_inval_poc/g;"\ "s/__clean_dcache_area_poc\b/dcache_clean_poc/g;"\ "s/\b__clean_dcache_area_pop\b/dcache_clean_pop/g;"\ "s/\b__clean_dcache_area_pou\b/dcache_clean_pou/g;"\ "s/\b__flush_cache_user_range\b/caches_clean_inval_user_pou/g;"\ "s/\b__flush_icache_all\b/icache_inval_all_pou/g;" Note that __clean_dcache_area_poc is deliberately missing a word boundary check at the beginning in order to match the efistub symbols in image-vars.h. Also note that, despite its name, __flush_icache_range operates on both instruction and data caches. The name change here reflects that. No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-19-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __flush_dcache_area to take end parameter instead of sizeFuad Tabba
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-13-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-03-19KVM: arm64: Introduce KVM_PGTABLE_S2_IDMAP stage 2 flagQuentin Perret
Introduce a new stage 2 configuration flag to specify that all mappings in a given page-table will be identity-mapped, as will be the case for the host. This allows to introduce sanity checks in the map path and to avoid programming errors. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-34-qperret@google.com
2021-03-19KVM: arm64: Introduce KVM_PGTABLE_S2_NOFWB stage 2 flagQuentin Perret
In order to further configure stage 2 page-tables, pass flags to the init function using a new enum. The first of these flags allows to disable FWB even if the hardware supports it as we will need to do so for the host stage 2. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-33-qperret@google.com
2021-03-19KVM: arm64: Add kvm_pgtable_stage2_find_range()Quentin Perret
Since the host stage 2 will be identity mapped, and since it will own most of memory, it would preferable for performance to try and use large block mappings whenever that is possible. To ease this, introduce a new helper in the KVM page-table code which allows to search for large ranges of available IPA space. This will be used in the host memory abort path to greedily idmap large portion of the PA space. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-32-qperret@google.com
2021-03-19KVM: arm64: Refactor the *_map_set_prot_attr() helpersQuentin Perret
In order to ease their re-use in other code paths, refactor the *_map_set_prot_attr() helpers to not depend on a map_data struct. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-31-qperret@google.com
2021-03-19KVM: arm64: Use page-table to track page ownershipQuentin Perret
As the host stage 2 will be identity mapped, all the .hyp memory regions and/or memory pages donated to protected guestis will have to marked invalid in the host stage 2 page-table. At the same time, the hypervisor will need a way to track the ownership of each physical page to ensure memory sharing or donation between entities (host, guests, hypervisor) is legal. In order to enable this tracking at EL2, let's use the host stage 2 page-table itself. The idea is to use the top bits of invalid mappings to store the unique identifier of the page owner. The page-table owner (the host) gets identifier 0 such that, at boot time, it owns the entire IPA space as the pgd starts zeroed. Provide kvm_pgtable_stage2_set_owner() which allows to modify the ownership of pages in the host stage 2. It re-uses most of the map() logic, but ends up creating invalid mappings instead. This impacts how we do refcount as we now need to count invalid mappings when they are used for ownership tracking. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-30-qperret@google.com
2021-03-19KVM: arm64: Always zero invalid PTEsQuentin Perret
kvm_set_invalid_pte() currently only clears bit 0 from a PTE because stage2_map_walk_table_post() needs to be able to follow the anchor. In preparation for re-using bits 63-01 from invalid PTEs, make sure to zero it entirely by ensuring to cache the anchor's child upfront. Acked-by: Will Deacon <will@kernel.org> Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-29-qperret@google.com
2021-03-19KVM: arm64: Make memcache anonymous in pgtable allocatorQuentin Perret
The current stage2 page-table allocator uses a memcache to get pre-allocated pages when it needs any. To allow re-using this code at EL2 which uses a concept of memory pools, make the memcache argument of kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page() callbacks use it the way they need to. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-26-qperret@google.com
2021-03-19KVM: arm64: Refactor kvm_arm_setup_stage2()Quentin Perret
In order to re-use some of the stage 2 setup code at EL2, factor parts of kvm_arm_setup_stage2() out into separate functions. No functional change intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-23-qperret@google.com
2021-03-19KVM: arm64: Use kvm_arch for stage 2 pgtableQuentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2, use struct kvm_arch in lieu of struct kvm as the host will have the former but not the latter. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-20-qperret@google.com
2021-03-19KVM: arm64: Prepare the creation of s1 mappings at EL2Quentin Perret
When memory protection is enabled, the EL2 code needs the ability to create and manage its own page-table. To do so, introduce a new set of hypercalls to bootstrap a memory management system at EL2. This leads to the following boot flow in nVHE Protected mode: 1. the host allocates memory for the hypervisor very early on, using the memblock API; 2. the host creates a set of stage 1 page-table for EL2, installs the EL2 vectors, and issues the __pkvm_init hypercall; 3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table and stores it in the memory pool provided by the host; 4. the hypervisor then extends its stage 1 mappings to include a vmemmap in the EL2 VA space, hence allowing to use the buddy allocator introduced in a previous patch; 5. the hypervisor jumps back in the idmap page, switches from the host-provided page-table to the new one, and wraps up its initialization by enabling the new allocator, before returning to the host. 6. the host can free the now unused page-table created for EL2, and will now need to issue hypercalls to make changes to the EL2 stage 1 mappings instead of modifying them directly. Note that for the sake of simplifying the review, this patch focuses on the hypervisor side of things. In other words, this only implements the new hypercalls, but does not make use of them from the host yet. The host-side changes will follow in a subsequent patch. Credits to Will for __pkvm_init_switch_pgd. Acked-by: Will Deacon <will@kernel.org> Co-authored-by: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-18-qperret@google.com
2021-03-19KVM: arm64: Factor memory allocation out of pgtable.cQuentin Perret
In preparation for enabling the creation of page-tables at EL2, factor all memory allocation out of the page-table code, hence making it re-usable with any compatible memory allocator. No functional changes intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-7-qperret@google.com
2021-03-19KVM: arm64: Avoid free_page() in page-table allocatorQuentin Perret
Currently, the KVM page-table allocator uses a mix of put_page() and free_page() calls depending on the context even though page-allocation is always achieved using variants of __get_free_page(). Make the code consistent by using put_page() throughout, and reduce the memory management API surface used by the page-table code. This will ease factoring out page-allocation from pgtable.c, which is a pre-requisite to creating page-tables at EL2. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-6-qperret@google.com
2021-03-06KVM: arm64: Fix range alignment when walking page tablesJia He
When walking the page tables at a given level, and if the start address for the range isn't aligned for that level, we propagate the misalignment on each iteration at that level. This results in the walker ignoring a number of entries (depending on the original misalignment) on each subsequent iteration. Properly aligning the address before the next iteration addresses this issue. Cc: stable@vger.kernel.org Reported-by: Howard Zhang <Howard.Zhang@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Jia He <justin.he@arm.com> Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure") [maz: rewrite commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210303024225.2591-1-justin.he@arm.com Message-Id: <20210305185254.3730990-9-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-01-25KVM: arm64: Filter out the case of only changing permissions from stage-2 ↵Yanan Wang
map path (1) During running time of a a VM with numbers of vCPUs, if some vCPUs access the same GPA almost at the same time and the stage-2 mapping of the GPA has not been built yet, as a result they will all cause translation faults. The first vCPU builds the mapping, and the followed ones end up updating the valid leaf PTE. Note that these vCPUs might want different access permissions (RO, RW, RX, RWX, etc.). (2) It's inevitable that we sometimes will update an existing valid leaf PTE in the map path, and we perform break-before-make in this case. Then more unnecessary translation faults could be caused if the *break stage* of BBM is just catched by other vCPUS. With (1) and (2), something unsatisfactory could happen: vCPU A causes a translation fault and builds the mapping with RW permissions, vCPU B then update the valid leaf PTE with break-before-make and permissions are updated back to RO. Besides, *break stage* of BBM may trigger more translation faults. Finally, some useless small loops could occur. We can make some optimization to solve above problems: When we need to update a valid leaf PTE in the map path, let's filter out the case where this update only change access permissions, and don't update the valid leaf PTE here in this case. Instead, let the vCPU enter back the guest and it will exit next time to go through the relax_perms path without break-before-make if it still wants more permissions. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210114121350.123684-3-wangyanan55@huawei.com
2021-01-25KVM: arm64: Adjust partial code of hyp stage-1 map and guest stage-2 mapYanan Wang
Procedures of hyp stage-1 map and guest stage-2 map are quite different, but they are tied closely by function kvm_set_valid_leaf_pte(). So adjust the relative code for ease of code maintenance in the future. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210114121350.123684-2-wangyanan55@huawei.com
2020-12-02KVM: arm64: Fix handling of merging tables into a block entryYanan Wang
When dirty logging is enabled, we collapse block entries into tables as necessary. If dirty logging gets canceled, we can end-up merging tables back into block entries. When this happens, we must not only free the non-huge page-table pages but also invalidate all the TLB entries that can potentially cover the block. Otherwise, we end-up with multiple possible translations for the same physical page, which can legitimately result in a TLB conflict. To address this, replease the bogus invalidation by IPA with a full VM invalidation. Although this is pretty heavy handed, it happens very infrequently and saves a bunch of invalidations by IPA. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> [maz: fixup commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201201201034.116760-3-wangyanan55@huawei.com
2020-12-02KVM: arm64: Fix memory leak on stage2 update of a valid PTEYanan Wang
When installing a new leaf PTE onto an invalid ptep, we need to get_page(ptep) to account for the new mapping. However, simply updating a valid PTE shouldn't result in any additional refcounting, as there is new mapping. This otherwise results in a page being forever wasted. Address this by fixing-up the refcount in stage2_map_walker_try_leaf() if the PTE was already valid, balancing out the later get_page() in stage2_map_walk_leaf(). Signed-off-by: Yanan Wang <wangyanan55@huawei.com> [maz: update commit message, add comment in the code] Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201201201034.116760-2-wangyanan55@huawei.com
2020-10-29KVM: arm64: Fix masks in stage2_pte_cacheable()Will Deacon
stage2_pte_cacheable() tries to figure out whether the mapping installed in its 'pte' parameter is cacheable or not. Unfortunately, it fails miserably because it extracts the memory attributes from the entry using FIELD_GET(), which returns the attributes shifted down to bit 0, but then compares this with the unshifted value generated by the PAGE_S2_MEMATTR() macro. A direct consequence of this bug is that cache maintenance is silently skipped, which in turn causes 32-bit guests to crash early on when their set/way maintenance is trapped but not emulated correctly. Fix the broken masks by avoiding the use of FIELD_GET() altogether. Fixes: 6d9d2115c480 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table") Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201029144716.30476-1-will@kernel.org
2020-10-29KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNTWill Deacon
For consistency with the rest of the stage-2 page-table page allocations (performing using a kvm_mmu_memory_cache), ensure that __GFP_ACCOUNT is included in the GFP flags for the PGD pages. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201026144423.24683-1-will@kernel.org
2020-10-02KVM: arm64: Pass level hint to TLBI during stage-2 permission faultWill Deacon
Alex pointed out that we don't pass a level hint to the TLBI instruction when handling a stage-2 permission fault, even though the walker does at some point have the level information in its hands. Rework stage2_update_leaf_attrs() so that it can optionally return the level of the updated pte to its caller, which can in turn be used to provide the correct TLBI level hint. Reported-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/595cc73e-636e-8b3a-f93a-b4e9fb218db8@arm.com Link: https://lore.kernel.org/r/20200930131801.16889-1-will@kernel.org
2020-09-11KVM: arm64: Add support for relaxing stage-2 perms in generic page-table codeWill Deacon
Add support for relaxing the permissions of a stage-2 mapping (i.e. adding additional permissions) to the generic page-table code. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-17-will@kernel.org
2020-09-11KVM: arm64: Add support for stage-2 cache flushing in generic page-tableQuentin Perret
Add support for cache flushing a range of the stage-2 address space to the generic page-table code. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200911132529.19844-15-will@kernel.org
2020-09-11KVM: arm64: Add support for stage-2 write-protect in generic page-tableQuentin Perret
Add a stage-2 wrprotect() operation to the generic page-table code. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200911132529.19844-13-will@kernel.org
2020-09-11KVM: arm64: Add support for stage-2 page-aging in generic page-tableWill Deacon
Add stage-2 mkyoung(), mkold() and is_young() operations to the generic page-table code. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-11-will@kernel.org
2020-09-11KVM: arm64: Add support for stage-2 map()/unmap() in generic page-tableWill Deacon
Add stage-2 map() and unmap() operations to the generic page-table code. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-7-will@kernel.org
2020-09-11KVM: arm64: Add support for creating kernel-agnostic stage-2 page tablesWill Deacon
Introduce alloc() and free() functions to the generic page-table code for guest stage-2 page-tables and plumb these into the existing KVM page-table allocator. Subsequent patches will convert other operations within the KVM allocator over to the generic code. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-6-will@kernel.org
2020-09-11KVM: arm64: Add support for creating kernel-agnostic stage-1 page tablesWill Deacon
The generic page-table walker is pretty useless as it stands, because it doesn't understand enough to allocate anything. Teach it about stage-1 page-tables, and hook up an API for allocating these for the hypervisor at EL2. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-4-will@kernel.org
2020-09-11KVM: arm64: Add stand-alone page-table walker infrastructureWill Deacon
The KVM page-table code is intricately tied into the kernel page-table code and re-uses the pte/pmd/pud/p4d/pgd macros directly in an attempt to reduce code duplication. Unfortunately, the reality is that there is an awful lot of code required to make this work, and at the end of the day you're limited to creating page-tables with the same configuration as the host kernel. Furthermore, lifting the page-table code to run directly at EL2 on a non-VHE system (as we plan to to do in future patches) is practically impossible due to the number of dependencies it has on the core kernel. Introduce a framework for walking Armv8 page-tables configured independently from the host kernel. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-3-will@kernel.org