summary refs log tree commit diff
path: root/drivers/nvme
AgeCommit message (Collapse)Author
2023-08-16nvme-rdma: fix potential unbalanced freeze & unfreezeMing Lei
commit 29b434d1e49252b3ad56ad3197e47fafff5356a1 upstream. Move start_freeze into nvme_rdma_configure_io_queues(), and there is at least two benefits: 1) fix unbalanced freeze and unfreeze, since re-connection work may fail or be broken by removal 2) IO during error recovery can be failfast quickly because nvme fabrics unquiesces queues after teardown. One side-effect is that !mpath request may timeout during connecting because of queue topo change, but that looks not one big deal: 1) same problem exists with current code base 2) compared with !mpath, mpath use case is dominant Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic") Cc: stable@vger.kernel.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-16nvme-tcp: fix potential unbalanced freeze & unfreezeMing Lei
commit 99dc264014d5aed66ee37ddf136a38b5a2b1b529 upstream. Move start_freeze into nvme_tcp_configure_io_queues(), and there is at least two benefits: 1) fix unbalanced freeze and unfreeze, since re-connection work may fail or be broken by removal 2) IO during error recovery can be failfast quickly because nvme fabrics unquiesces queues after teardown. One side-effect is that !mpath request may timeout during connecting because of queue topo change, but that looks not one big deal: 1) same problem exists with current code base 2) compared with !mpath, mpath use case is dominant Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic") Cc: stable@vger.kernel.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-16nvme-pci: add NVME_QUIRK_BOGUS_NID for Samsung PM9B1 256G and 512GAugust Wikerfors
commit 688b419c57c13637d95d7879e165fff3dec581eb upstream. The Samsung PM9B1 512G SSD found in some Lenovo Yoga 7 14ARB7 laptop units reports eui as 0001000200030004 when resuming from s2idle, causing the device to be removed with this error in dmesg: nvme nvme0: identifiers changed for nsid 1 To fix this, add a quirk to ignore namespace identifiers for this device. Signed-off-by: August Wikerfors <git@augustwikerfors.se> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23nvme: don't reject probe due to duplicate IDs for single-ported PCIe devicesChristoph Hellwig
commit ac522fc6c3165fd0daa2f8da7e07d5f800586daa upstream. While duplicate IDs are still very harmful, including the potential to easily see changing devices in /dev/disk/by-id, it turn out they are extremely common for cheap end user NVMe devices. Relax our check for them for so that it doesn't reject the probe on single-ported PCIe devices, but prints a big warning instead. In doubt we'd still like to see quirk entries to disable the potential for changing supposed stable device identifier links, but this will at least allow users how have two (or more) of these devices to use them without having to manually add a new PCI ID entry with the quirk through sysfs or by patching the kernel. Fixes: 2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique") Cc: stable@vger.kernel.org # 6.0+ Co-developed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23nvme-pci: fix DMA direction of unmapping integrity dataMing Lei
[ Upstream commit b8f6446b6853768cb99e7c201bddce69ca60c15e ] DMA direction should be taken in dma_unmap_page() for unmapping integrity data. Fix this DMA direction, and reported in Guangwu's test. Reported-by: Guangwu Zhang <guazhang@redhat.com> Fixes: 4aedb705437f ("nvme-pci: split metadata handling from nvme_map_data / nvme_unmap_data") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-core: fix dev_pm_qos memleakChaitanya Kulkarni
[ Upstream commit 7ed5cf8e6d9bfb6a78d0471317edff14f0f2b4dd ] Call dev_pm_qos_hide_latency_tolerance() in the error unwind patch to avoid following kmemleak:- blktests (master) # kmemleak-clear; ./check nvme/044; blktests (master) # kmemleak-scan ; kmemleak-show nvme/044 (Test bi-directional authentication) [passed] runtime 2.111s ... 2.124s unreferenced object 0xffff888110c46240 (size 96): comm "nvme", pid 33461, jiffies 4345365353 (age 75.586s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000069ac2cec>] kmalloc_trace+0x25/0x90 [<000000006acc66d5>] dev_pm_qos_update_user_latency_tolerance+0x6f/0x100 [<00000000cc376ea7>] nvme_init_ctrl+0x38e/0x410 [nvme_core] [<000000007df61b4b>] 0xffffffffc05e88b3 [<00000000d152b985>] 0xffffffffc05744cb [<00000000f04a4041>] vfs_write+0xc5/0x3c0 [<00000000f9491baf>] ksys_write+0x5f/0xe0 [<000000001c46513d>] do_syscall_64+0x3b/0x90 [<00000000ecf348fe>] entry_SYSCALL_64_after_hwframe+0x72/0xdc Link: https://lore.kernel.org/linux-nvme/CAHj4cs-nDaKzMx2txO4dbE+Mz9ePwLtU0e3egz+StmzOUgWUrA@mail.gmail.com/ Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-core: add missing fault-injection cleanupChaitanya Kulkarni
[ Upstream commit 3a12a0b868a512fcada564699d00f5e652c0998c ] Add missing fault-injection cleanup in nvme_init_ctrl() in the error unwind path that also fixes following message for blktests:- linux-block (for-next) # grep debugfs debugfs-err.log [ 147.853464] debugfs: Directory 'nvme1' with parent '/' already present! [ 147.853973] nvme1: failed to create debugfs attr [ 148.802490] debugfs: Directory 'nvme1' with parent '/' already present! [ 148.803244] nvme1: failed to create debugfs attr [ 148.877304] debugfs: Directory 'nvme1' with parent '/' already present! [ 148.877775] nvme1: failed to create debugfs attr [ 149.816652] debugfs: Directory 'nvme1' with parent '/' already present! [ 149.818011] nvme1: failed to create debugfs attr Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Stable-dep-of: 7ed5cf8e6d9b ("nvme-core: fix dev_pm_qos memleak") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-auth: don't ignore key generation failures when initializing ctrl keysSagi Grimberg
[ Upstream commit 193a8c7e5f1a8481841636cec9c185543ec5c759 ] nvme_auth_generate_key can fail, don't ignore it upon initialization. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: 7ed5cf8e6d9b ("nvme-core: fix dev_pm_qos memleak") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-core: fix memory leak in dhchap_ctrl_secretChaitanya Kulkarni
[ Upstream commit 99c2dcc8ffc24e210a3aa05c204d92f3ef460b05 ] Free dhchap_secret in nvme_ctrl_dhchap_ctrl_secret_store() before we return when nvme_auth_generate_key() returns error. Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-core: fix memory leak in dhchap_secret_storeChaitanya Kulkarni
[ Upstream commit a836ca33c5b07d34dd5347af9f64d25651d12674 ] Free dhchap_secret in nvme_ctrl_dhchap_secret_store() before we return fix following kmemleack:- unreferenced object 0xffff8886376ea800 (size 64): comm "check", pid 22048, jiffies 4344316705 (age 92.199s) hex dump (first 32 bytes): 44 48 48 43 2d 31 3a 30 30 3a 6e 78 72 35 4b 67 DHHC-1:00:nxr5Kg 75 58 34 75 6f 41 78 73 4a 61 34 63 2f 68 75 4c uX4uoAxsJa4c/huL backtrace: [<0000000030ce5d4b>] __kmalloc+0x4b/0x130 [<000000009be1cdc1>] nvme_ctrl_dhchap_secret_store+0x8f/0x160 [nvme_core] [<00000000ac06c96a>] kernfs_fop_write_iter+0x12b/0x1c0 [<00000000437e7ced>] vfs_write+0x2ba/0x3c0 [<00000000f9491baf>] ksys_write+0x5f/0xe0 [<000000001c46513d>] do_syscall_64+0x3b/0x90 [<00000000ecf348fe>] entry_SYSCALL_64_after_hwframe+0x72/0xdc unreferenced object 0xffff8886376eaf00 (size 64): comm "check", pid 22048, jiffies 4344316736 (age 92.168s) hex dump (first 32 bytes): 44 48 48 43 2d 31 3a 30 30 3a 6e 78 72 35 4b 67 DHHC-1:00:nxr5Kg 75 58 34 75 6f 41 78 73 4a 61 34 63 2f 68 75 4c uX4uoAxsJa4c/huL backtrace: [<0000000030ce5d4b>] __kmalloc+0x4b/0x130 [<000000009be1cdc1>] nvme_ctrl_dhchap_secret_store+0x8f/0x160 [nvme_core] [<00000000ac06c96a>] kernfs_fop_write_iter+0x12b/0x1c0 [<00000000437e7ced>] vfs_write+0x2ba/0x3c0 [<00000000f9491baf>] ksys_write+0x5f/0xe0 [<000000001c46513d>] do_syscall_64+0x3b/0x90 [<00000000ecf348fe>] entry_SYSCALL_64_after_hwframe+0x72/0xdc Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-auth: no need to reset chap contexts on re-authenticationSagi Grimberg
[ Upstream commit e8a420efb637f52c586596283d6fd96f2a7ecb5c ] Now that the chap context is reset upon completion, this is no longer needed. Also remove nvme_auth_reset as no callers are left. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: a836ca33c5b0 ("nvme-core: fix memory leak in dhchap_secret_store") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-auth: remove symbol export from nvme_auth_resetSagi Grimberg
[ Upstream commit 100b555bc204fc754108351676297805f5affa49 ] Only the nvme module calls it. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: a836ca33c5b0 ("nvme-core: fix memory leak in dhchap_secret_store") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-auth: rename authentication work elementsSagi Grimberg
[ Upstream commit 0c999e69c40a87285f910c400b550fad866e99d0 ] Use nvme_ctrl_auth_work and nvme_queue_auth_work for better readability. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: a836ca33c5b0 ("nvme-core: fix memory leak in dhchap_secret_store") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-19nvme-auth: rename __nvme_auth_[reset|free] to nvme_auth[reset|free]_dhchapSagi Grimberg
[ Upstream commit 0a7ce375f83f4ade7c2a835444093b6870fb8257 ] nvme_auth_[reset|free] operate on the controller while __nvme_auth_[reset|free] operate on a chap struct (which maps to a queue context). Rename it for clarity. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: a836ca33c5b0 ("nvme-core: fix memory leak in dhchap_secret_store") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28nvme: improve handling of long keep alivesUday Shankar
[ Upstream commit c7275ce6a5fd32ca9f5a6294ed89cf0523181af9 ] Upon keep alive completion, nvme_keep_alive_work is scheduled with the same delay every time. If keep alive commands are completing slowly, this may cause a keep alive timeout. The following trace illustrates the issue, taking KATO = 8 and TBKAS off for simplicity: 1. t = 0: run nvme_keep_alive_work, send keep alive 2. t = ε: keep alive reaches controller, controller restarts its keep alive timer 3. t = 4: host receives keep alive completion, schedules nvme_keep_alive_work with delay 4 4. t = 8: run nvme_keep_alive_work, send keep alive Here, a keep alive having RTT of 4 causes a delay of at least 8 - ε between the controller receiving successive keep alives. With ε small, the controller is likely to detect a keep alive timeout. Fix this by calculating the RTT of the keep alive command, and adjusting the scheduling delay of the next keep alive work accordingly. Reported-by: Costa Sapuntzakis <costa@purestorage.com> Reported-by: Randy Jennings <randyj@purestorage.com> Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28nvme: check IO start time when deciding to defer KAUday Shankar
[ Upstream commit 774a9636514764ddc0d072ae0d1d1c01a47e6ddd ] When a command completes, we set a flag which will skip sending a keep alive at the next run of nvme_keep_alive_work when TBKAS is on. However, if the command was submitted long ago, it's possible that the controller may have also restarted its keep alive timer (as a result of receiving the command) long ago. The following trace demonstrates the issue, assuming TBKAS is on and KATO = 8 for simplicity: 1. t = 0: submit I/O commands A, B, C, D, E 2. t = 0.5: commands A, B, C, D, E reach controller, restart its keep alive timer 3. t = 1: A completes 4. t = 2: run nvme_keep_alive_work, see recent completion, do nothing 5. t = 3: B completes 6. t = 4: run nvme_keep_alive_work, see recent completion, do nothing 7. t = 5: C completes 8. t = 6: run nvme_keep_alive_work, see recent completion, do nothing 9. t = 7: D completes 10. t = 8: run nvme_keep_alive_work, see recent completion, do nothing 11. t = 9: E completes At this point, 8.5 seconds have passed without restarting the controller's keep alive timer, so the controller will detect a keep alive timeout. Fix this by checking the IO start time when deciding to defer sending a keep alive command. Only set comp_seen if the command started after the most recent run of nvme_keep_alive_work. With this change, the completions of B, C, and D will not set comp_seen and the run of nvme_keep_alive_work at t = 4 will send a keep alive. Reported-by: Costa Sapuntzakis <costa@purestorage.com> Reported-by: Randy Jennings <randyj@purestorage.com> Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28nvme: double KA polling frequency to avoid KATO with TBKAS onUday Shankar
[ Upstream commit ea4d453b9ec9ea279c39744cd0ecb47ef48ede35 ] With TBKAS on, the completion of one command can defer sending a keep alive for up to twice the delay between successive runs of nvme_keep_alive_work. The current delay of KATO / 2 thus makes it possible for one command to defer sending a keep alive for up to KATO, which can result in the controller detecting a KATO. The following trace demonstrates the issue, taking KATO = 8 for simplicity: 1. t = 0: run nvme_keep_alive_work, no keep-alive sent 2. t = ε: I/O completion seen, set comp_seen = true 3. t = 4: run nvme_keep_alive_work, see comp_seen == true, skip sending keep-alive, set comp_seen = false 4. t = 8: run nvme_keep_alive_work, see comp_seen == false, send a keep-alive command. Here, there is a delay of 8 - ε between receiving a command completion and sending the next command. With ε small, the controller is likely to detect a keep alive timeout. Fix this by running nvme_keep_alive_work with a delay of KATO / 4 whenever TBKAS is on. Going through the above trace now gives us a worst-case delay of 4 - ε, which is in line with the recommendation of sending a command every KATO / 2 in the NVMe specification. Reported-by: Costa Sapuntzakis <costa@purestorage.com> Reported-by: Randy Jennings <randyj@purestorage.com> Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21NVMe: Add MAXIO 1602 to bogus nid list.Tatsuki Sugiura
[ Upstream commit a3a9d63dcd15535e7fdf4c7c1b32bfaed762973a ] HIKSEMI FUTURE M.2 SSD uses the same dummy nguid and eui64. I confirmed it with my two devices. This patch marks the controller as NVME_QUIRK_BOGUS_NID. --------------------------------------------------------- sugi@tempest:~% sudo nvme id-ctrl /dev/nvme0 NVME Identify Controller: vid : 0x1e4b ssvid : 0x1e4b sn : 30096022612 mn : HS-SSD-FUTURE 2048G fr : SN10542 rab : 0 ieee : 000000 cmic : 0 mdts : 7 cntlid : 0 ver : 0x10400 rtd3r : 0x7a120 rtd3e : 0x1e8480 oaes : 0x200 ctratt : 0x2 rrls : 0 cntrltype : 1 fguid : 00000000-0000-0000-0000-000000000000 <snip...> --------------------------------------------------------- --------------------------------------------------------- sugi@tempest:~% sudo nvme id-ns /dev/nvme0n1 NVME Identify Namespace 1: <snip...> nguid : 00000000000000000000000000000000 eui64 : 0000000000000002 lbaf 0 : ms:0 lbads:9 rp:0 (in use) --------------------------------------------------------- Signed-off-by: Tatsuki Sugiura <sugi@nemui.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09nvme-pci: Add quirk for Teamgroup MP33 SSDDaniel Smith
[ Upstream commit 0649728123cf6a5518e154b4e1735fc85ea4f55c ] Add a quirk for Teamgroup MP33 that reports duplicate ids for disk. Signed-off-by: Daniel Smith <dansmith@ds.gy> [kch: patch formatting] Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Daniel Smith <dansmith@ds.gy> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09nvme: do not let the user delete a ctrl before a complete initializationMaurizio Lombardi
[ Upstream commit 2eb94dd56a4a4e3fe286def3e2ba207804a37345 ] If a userspace application performes a "delete_controller" command early during the ctrl initialization, the delete operation may race against the init code and the kernel will crash. nvme nvme5: Connect command failed: host path error nvme nvme5: failed to connect queue: 0 ret=880 PF: supervisor write access in kernel mode PF: error_code(0x0002) - not-present page blk_mq_quiesce_queue+0x18/0x90 nvme_tcp_delete_ctrl+0x24/0x40 [nvme_tcp] nvme_do_delete_ctrl+0x7f/0x8b [nvme_core] nvme_sysfs_delete.cold+0x8/0xd [nvme_core] kernfs_fop_write_iter+0x124/0x1b0 new_sync_write+0xff/0x190 vfs_write+0x1ef/0x280 Fix the crash by checking the NVME_CTRL_STARTED_ONCE bit; if it's not set it means that the nvme controller is still in the process of getting initialized and the kernel will return an -EBUSY error to userspace. Set the NVME_CTRL_STARTED_ONCE later in the nvme_start_ctrl() function, after the controller start operation is completed. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09nvme-multipath: don't call blk_mark_disk_dead in nvme_mpath_remove_diskChristoph Hellwig
[ Upstream commit 1743e5f6000901a11f4e1cd741bfa9136f3ec9b1 ] nvme_mpath_remove_disk is called after del_gendisk, at which point a blk_mark_disk_dead call doesn't make any sense. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09nvme-pci: add quirk for missing secondary temperature thresholdsHristo Venev
[ Upstream commit bd375feeaf3408ed00e08c3bc918d6be15f691ad ] On Kingston KC3000 and Kingston FURY Renegade (both have the same PCI IDs) accessing temp3_{min,max} fails with an invalid field error (note that there is no problem setting the thresholds for temp1). This contradicts the NVM Express Base Specification 2.0b, page 292: The over temperature threshold and under temperature threshold features shall be implemented for all implemented temperature sensors (i.e., all Temperature Sensor fields that report a non-zero value). Define NVME_QUIRK_NO_SECONDARY_TEMP_THRESH that disables the thresholds for all but the composite temperature and set it for this device. Signed-off-by: Hristo Venev <hristo@venev.name> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09nvme-pci: add NVME_QUIRK_BOGUS_NID for HS-SSD-FUTURE 2048GSagi Grimberg
[ Upstream commit 1616d6c3717bae9041a4240d381ec56ccdaafedc ] Add a quirk to fix HS-SSD-FUTURE 2048G SSD drives reporting duplicate nsids. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217384 Reported-by: Andrey God <andreygod83@protonmail.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09nvme: fix the name of Zone Append for verbose loggingChristoph Hellwig
[ Upstream commit 856303797724d28f1d65b702f0eadcee1ea7abf5 ] No Management involved in Zone Appened. Fixes: bd83fe6f2cd2 ("nvme: add verbose error logging") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alan Adamson <alan.adamson@oracle.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvme-fcloop: fix "inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage"Ming Lei
[ Upstream commit 4f86a6ff6fbd891232dda3ca97fd1b9630b59809 ] fcloop_fcp_op() could be called from flush request's ->end_io(flush_end_io) in which the spinlock of fq->mq_flush_lock is grabbed with irq saved/disabled. So fcloop_fcp_op() can't call spin_unlock_irq(&tfcp_req->reqlock) simply which enables irq unconditionally. Fixes the warning by switching to spin_lock_irqsave()/spin_unlock_irqrestore() Fixes: c38dbbfab1bc ("nvme-fcloop: fix inconsistent lock state warnings") Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvme: fix async event trace eventKeith Busch
[ Upstream commit 6622b76fe922b94189499a90ccdb714a4a8d0773 ] Mixing AER Event Type and Event Info has masking clashes. Just print the event type, but also include the event info of the AER result in the trace. Fixes: 09bd1ff4b15143b ("nvme-core: add async event trace helper") Reported-by: Nate Thornton <nate.thornton@samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvmet: fix I/O Command Set specific Identify ControllerDamien Le Moal
[ Upstream commit a5a6ab0950b46ab1ef4a5c83c80234018b81b38a ] For an identify command with cns set to NVME_ID_CNS_CS_CTRL, the NVMe 2.0 specification states that: If the I/O Command Set specified by the CSI field does not have an Identify Controller data structure, then the controller shall return a zero filled data structure. If the host requests a data structure for an I/O Command Set that the controller does not support, the controller shall abort the command with a status code of Invalid Field in Command. However, the current implementation of this identify command in nvmet_execute_identify() only handles the ZNS command set, returning an error for the NVM command set, which is not compliant with the specifications as we do support this command set. Fix this by: 1) Renaming nvmet_execute_identify_cns_cs_ctrl() to nvmet_execute_identify_ctrl_zns() to continue handling the ZNS command set as is. 2) Introduce a nvmet_execute_identify_ctrl_ns() helper to handle the NVM command set, returning a zero filled nvme_id_ctrl_nvm data structure. 3) Modify nvmet_execute_identify() to call these helpers based on the csi specified, returning an error for unsupported command sets. Fixes: aaf2e048af27 ("nvmet: add ZBD over ZNS backend support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvmet: fix Identify Active Namespace ID list handlingDamien Le Moal
[ Upstream commit 97416f67d55fb8b866ff1815ca7ef26b6dfa6a5e ] The identify command with cns set to NVME_ID_CNS_NS_ACTIVE_LIST does not depend on the command set. The execution of this command should thus not look at the csi field specified in the command. Simplify nvmet_execute_identify() to directly call nvmet_execute_identify_nslist() without the csi switch-case. Fixes: ab5d0b38c047 ("nvmet: add Command Set Identifier support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvmet: fix Identify Controller handlingDamien Le Moal
[ Upstream commit 62904b3b333e7f3c0f879dc3513295eee5765c9f ] The identify command with cns set to NVME_ID_CNS_CTRL does not depend on the command set. The execution of this command should thus not look at the csi specified in the command. Simplify nvmet_execute_identify() to directly call nvmet_execute_identify_ctrl() without the csi switch-case. Fixes: ab5d0b38c047 ("nvmet: add Command Set Identifier support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvmet: fix Identify Namespace handlingDamien Le Moal
[ Upstream commit 8c098aa00118c35108f0c19bd3cdc45e11574948 ] The identify command with cns set to NVME_ID_CNS_NS does not directly depend on the command set. The NVMe specifications is rather confusing here as it appears that this command only applies to the NVM command set. However, footnote 8 of Figure 273 in the NVMe 2.0 base specifications clearly state that this command applies to NVM command sets that support logical blocks, that is, NVM and ZNS. Both the NVM and ZNS command set specifications also list this identify as mandatory. The command handling should thus not look at the csi field since it is defined as unused for this command. Given that we do not support the KV command set, simply remove the csi switch-case for that command handling and call directly nvmet_execute_identify_ns() in nvmet_execute_identify(). Fixes: ab5d0b38c047 ("nvmet: add Command Set Identifier support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11nvmet: fix error handling in nvmet_execute_identify_cns_cs_ns()Damien Le Moal
[ Upstream commit ab76e7206b672b2e8818cb121a04506956d6b223 ] Nvme specifications state that: If the I/O Command Set associated with the namespace identified by the NSID field does not support the Identify Namespace data structure specified by the CSI field, the controller shall abort the command with a status code of Invalid Field in Command. In other words, if nvmet_execute_identify_cns_cs_ns() is called for a target with a block device that is not zoned, we should not return any data and set the status to NVME_SC_INVALID_FIELD. While at it, it is also better to revalidate the ns block devie *before* checking if the block device is zoned, to ensure that nvmet_execute_identify_cns_cs_ns() operates against updated device characteristics. Fixes: aaf2e048af27 ("nvmet: add ZBD over ZNS backend support") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-26nvme-tcp: fix a possible UAF when failing to allocate an io queueSagi Grimberg
[ Upstream commit 88eaba80328b31ef81813a1207b4056efd7006a6 ] When we allocate a nvme-tcp queue, we set the data_ready callback before we actually need to use it. This creates the potential that if a stray controller sends us data on the socket before we connect, we can trigger the io_work and start consuming the socket. In this case reported: we failed to allocate one of the io queues, and as we start releasing the queues that we already allocated, we get a UAF [1] from the io_work which is running before it should really. Fix this by setting the socket ops callbacks only before we start the queue, so that we can't accidentally schedule the io_work in the initialization phase before the queue started. While we are at it, rename nvme_tcp_restore_sock_calls to pair with nvme_tcp_setup_sock_ops. [1]: [16802.107284] nvme nvme4: starting error recovery [16802.109166] nvme nvme4: Reconnecting in 10 seconds... [16812.173535] nvme nvme4: failed to connect socket: -111 [16812.173745] nvme nvme4: Failed reconnect attempt 1 [16812.173747] nvme nvme4: Reconnecting in 10 seconds... [16822.413555] nvme nvme4: failed to connect socket: -111 [16822.413762] nvme nvme4: Failed reconnect attempt 2 [16822.413765] nvme nvme4: Reconnecting in 10 seconds... [16832.661274] nvme nvme4: creating 32 I/O queues. [16833.919887] BUG: kernel NULL pointer dereference, address: 0000000000000088 [16833.920068] nvme nvme4: Failed reconnect attempt 3 [16833.920094] #PF: supervisor write access in kernel mode [16833.920261] nvme nvme4: Reconnecting in 10 seconds... [16833.920368] #PF: error_code(0x0002) - not-present page [16833.921086] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] [16833.921191] RIP: 0010:_raw_spin_lock_bh+0x17/0x30 ... [16833.923138] Call Trace: [16833.923271] <TASK> [16833.923402] lock_sock_nested+0x1e/0x50 [16833.923545] nvme_tcp_try_recv+0x40/0xa0 [nvme_tcp] [16833.923685] nvme_tcp_io_work+0x68/0xa0 [nvme_tcp] [16833.923824] process_one_work+0x1e8/0x390 [16833.923969] worker_thread+0x53/0x3d0 [16833.924104] ? process_one_work+0x390/0x390 [16833.924240] kthread+0x124/0x150 [16833.924376] ? set_kthread_struct+0x50/0x50 [16833.924518] ret_from_fork+0x1f/0x30 [16833.924655] </TASK> Reported-by: Yanjun Zhang <zhangyanjun@cestc.cn> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Yanjun Zhang <zhangyanjun@cestc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSDDuy Truong
[ Upstream commit 74391b3e69855e7dd65a9cef36baf5fc1345affd ] Added a quirk to fix the TeamGroup T-Force Cardea Zero Z330 SSDs reporting duplicate NGUIDs. Signed-off-by: Duy Truong <dory@dory.moe> Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20nvme-pci: mark Lexar NM760 as IGNORE_DEV_SUBNQNJuraj Pecigos
[ Upstream commit 1231363aec86704a6b0467a12e3ca7bdf890e01d ] A system with more than one of these SSDs will only have one usable. The kernel fails to detect more than one nvme device due to duplicate cntlids. before: [ 9.395229] nvme 0000:01:00.0: platform quirk: setting simple suspend [ 9.395262] nvme nvme0: pci function 0000:01:00.0 [ 9.395282] nvme 0000:03:00.0: platform quirk: setting simple suspend [ 9.395305] nvme nvme1: pci function 0000:03:00.0 [ 9.409873] nvme nvme0: Duplicate cntlid 1 with nvme1, subsys nqn.2022-07.com.siliconmotion:nvm-subsystem-sn- , rejecting [ 9.409982] nvme nvme0: Removing after probe failure status: -22 [ 9.427487] nvme nvme1: allocated 64 MiB host memory buffer. [ 9.445088] nvme nvme1: 16/0/0 default/read/poll queues [ 9.449898] nvme nvme1: Ignoring bogus Namespace Identifiers after: [ 1.161890] nvme 0000:01:00.0: platform quirk: setting simple suspend [ 1.162660] nvme nvme0: pci function 0000:01:00.0 [ 1.162684] nvme 0000:03:00.0: platform quirk: setting simple suspend [ 1.162707] nvme nvme1: pci function 0000:03:00.0 [ 1.191354] nvme nvme0: allocated 64 MiB host memory buffer. [ 1.193378] nvme nvme1: allocated 64 MiB host memory buffer. [ 1.211044] nvme nvme1: 16/0/0 default/read/poll queues [ 1.211080] nvme nvme0: 16/0/0 default/read/poll queues [ 1.216145] nvme nvme0: Ignoring bogus Namespace Identifiers [ 1.216261] nvme nvme1: Ignoring bogus Namespace Identifiers Adding the NVME_QUIRK_IGNORE_DEV_SUBNQN quirk to resolves the issue. Signed-off-by: Juraj Pecigos <kernel@juraj.dev> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Stable-dep-of: 74391b3e6985 ("nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20nvme: send Identify with CNS 06h only to I/O controllersMartin George
[ Upstream commit def84ab600b71ea3fcc422a876d5d0d0daa7d4f3 ] Identify CNS 06h (I/O Command Set Specific Identify Controller data structure) is supported only on i/o controllers. But nvme_init_non_mdts_limits() currently invokes this on all controllers. Correct this by ensuring this is sent to I/O controllers only. Signed-off-by: Martin George <marting@netapp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13nvme: fix discard support without oncsKeith Busch
[ Upstream commit d3205ab75e99a47539ec91ef85ba488f4ddfeaa9 ] The device can report discard support without setting the ONCS DSM bit. When not set, the driver clears max_discard_size expecting it to be set later. We don't know the size until we have the namespace format, though, so setting it is deferred until configuring one, but the driver was abandoning the discard settings due to that initial clearing. Move the max_discard_size calculation above the check for a '0' discard size. Fixes: 1a86924e4f46475 ("nvme: fix interpretation of DMRSL") Reported-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-06block/io_uring: pass in issue_flags for uring_cmd task_work handlingJens Axboe
commit 9d2789ac9d60c049d26ef6d3005d9c94c5a559e9 upstream. io_uring_cmd_done() currently assumes that the uring_lock is held when invoked, and while it generally is, this is not guaranteed. Pass in the issue_flags associated with it, so that we have IO_URING_F_UNLOCKED available to be able to lock the CQ ring appropriately when completing events. Cc: stable@vger.kernel.org Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-06nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620Philipp Geulen
[ Upstream commit b65d44fa0fe072c91bf41cd8756baa2b4c77eff2 ] Added a quirk to fix Lexar NM620 1TB SSD reporting duplicate NGUIDs. Signed-off-by: Philipp Geulen <p.geulen@js-elektronik.de> Reviewed-by: Chaitanya Kulkarni <kkch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000Elmer Miroslav Mosher Golovin
commit 9630d80655bfe7e62e4aff2889dc4eae7ceeb887 upstream. Added a quirk to fix the Netac NV3000 SSD reporting duplicate NGUIDs. Cc: <stable@vger.kernel.org> Signed-off-by: Elmer Miroslav Mosher Golovin <miroslav@mishamosher.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22nvmet: avoid potential UAF in nvmet_req_complete()Damien Le Moal
[ Upstream commit 6173a77b7e9d3e202bdb9897b23f2a8afe7bf286 ] An nvme target ->queue_response() operation implementation may free the request passed as argument. Such implementation potentially could result in a use after free of the request pointer when percpu_ref_put() is called in nvmet_req_complete(). Avoid such problem by using a local variable to save the sq pointer before calling __nvmet_req_complete(), thus avoiding dereferencing the req pointer after that function call. Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22nvme: fix handling single range discard requestMing Lei
[ Upstream commit 37f0dc2ec78af0c3f35dd05578763de059f6fe77 ] When investigating one customer report on warning in nvme_setup_discard, we observed the controller(nvme/tcp) actually exposes queue_max_discard_segments(req->q) == 1. Obviously the current code can't handle this situation, since contiguity merge like normal RW request is taken. Fix the issue by building range from request sector/nr_sectors directly. Fixes: b35ba01ea697 ("nvme: support ranged discard requests") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11nvme-fabrics: show well known discovery nameDaniel Wagner
[ Upstream commit 26a57cb35548ae67c14871cccbf50da3edb01ea4 ] The kernel always logs the unique subsystem name for a discovery controller, even in the case user space asked for the well known. This has lead to confusion as the logs of nvme-cli and the kernel logs didn't match. First, nvme-cli connects to the well known discovery controller to figure out if it supports TP8013. If so then nvme-cli disconnects and connects to the unique discovery controller. Currently, the kernel show that user space connected twice to the unique one. To avoid further confusion, show the well known discovery controller if user space asked for it: $ nvme connect-all -v -t tcp -a 192.168.0.1 nvme0: nqn.2014-08.org.nvmexpress.discovery connected nvme0: nqn.2014-08.org.nvmexpress.discovery disconnected nvme0: nqn.discovery connected kernel log: nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.0.1:8009 nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery" nvme nvme0: new ctrl: NQN "nqn.discovery", addr 192.168.0.1:8009 Fixes: e5ea42faa773 ("nvme: display correct subsystem NQN") Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11nvme-tcp: don't access released socket during error recoveryAkinobu Mita
[ Upstream commit 76d54bf20cdcc1ed7569a89885e09636e9a8d71d ] While the error recovery work is temporarily failing reconnect attempts, running the 'nvme list' command causes a kernel NULL pointer dereference by calling getsockname() with a released socket. During error recovery work, the nvme tcp socket is released and a new one created, so it is not safe to access the socket without proper check. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Fixes: 02c57a82c008 ("nvme-tcp: print actual source IP address through sysfs "address" attr") Reviewed-by: Martin Belanger <martin.belanger@dell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-11nvme: bring back auto-removal of deleted namespaces during sequential scanChristoph Hellwig
[ Upstream commit 0dd6fff2aad4e35633fef1ea72838bec5b47559a ] Bring back the check of the Identify Namespace return value for the legacy NVMe 1.0-style sequential scanning. While NVMe 1.0 does not support namespace management, there are "modern" cloud solutions like Google Cloud Platform that claim the obsolete 1.0 compliance for no good reason while supporting proprietary sideband namespace management. Fixes: 1a893c2bfef4 ("nvme: refactor namespace probing") Reported-by: Nils Hanke <nh@edgeless.systems> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Nils Hanke <nh@edgeless.systems> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22nvme-pci: refresh visible attrs for cmb attributesKeith Busch
commit e917a849c3fc317c4a5f82bb18726000173d39e6 upstream. The sysfs group containing the cmb attributes is registered before the driver knows if they need to be visible or not. Update the group when cmb attributes are known to exist so the visibility setting is correct. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217037 Fixes: 86adbf0cdb9ec65 ("nvme: simplify transport specific device attribute handling") Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-22nvme-rdma: stop auth work after tearing down queues in error recoverySagi Grimberg
[ Upstream commit 91c11d5f32547a08d462934246488fe72f3d44c3 ] when starting error recovery there might be a authentication work running, and it involves I/O commands. Given the controller is tearing down there is no chance for the I/O to complete other than timing out which may unnecessarily take a full io timeout. So first tear down the queues, fail/cancel all inflight I/O (including potentially authentication) and only then stop authentication. This ensures that failover is not stalled due to blocked authentication I/O. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22nvme-tcp: stop auth work after tearing down queues in error recoverySagi Grimberg
[ Upstream commit 1f1a4f89562d3b33b6ca4fc8a4f3bd4cd35ab4ea ] when starting error recovery there might be a authentication work running, and it involves I/O commands. Given the controller is tearing down there is no chance for the I/O to complete other than timing out which may unnecessarily take a full io timeout. So first tear down the queues, fail/cancel all inflight I/O (including potentially authentication) and only then stop authentication. This ensures that failover is not stalled due to blocked authentication I/O. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22nvme: clear the request_queue pointers on failure in nvme_alloc_io_tag_setMaurizio Lombardi
[ Upstream commit 6fbf13c0e24fd86ab2e4477cd8484a485b687421 ] In nvme_alloc_io_tag_set(), the connect_q pointer should be set to NULL in case of error to avoid potential invalid pointer dereferences. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22nvme: clear the request_queue pointers on failure in nvme_alloc_admin_tag_setMaurizio Lombardi
[ Upstream commit fd62678ab55cb01e11a404d302cdade222bf4022 ] If nvme_alloc_admin_tag_set() fails, the admin_q and fabrics_q pointers are left with an invalid, non-NULL value. Other functions may then check the pointers and dereference them, e.g. in nvme_probe() -> out_disable: -> nvme_dev_remove_admin(). Fix the bug by setting admin_q and fabrics_q to NULL in case of error. Also use the set variable to free the tag_set as ctrl->admin_tagset isn't initialized yet. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22nvme-fc: fix a missing queue put in nvmet_fc_ls_create_associationAmit Engel
[ Upstream commit 0cab4404874f2de52617de8400c844891c6ea1ce ] As part of nvmet_fc_ls_create_association there is a case where nvmet_fc_alloc_target_queue fails right after a new association with an admin queue is created. In this case, no one releases the get taken in nvmet_fc_alloc_target_assoc. This fix is adding the missing put. Signed-off-by: Amit Engel <Amit.Engel@dell.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>