summary refs log tree commit diff
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2015-04-18 08:14:18 -0400
committerLinus Torvalds <torvalds@linux-foundation.org>2015-04-18 08:14:18 -0400
commitafad97eee47c1f1f242202e2473929b4ef5d9f43 (patch)
tree31f68d70760234b582a28bd3f64311ff5307b7b1 /Documentation
parent04b7fe6a4a231871ef681bc95e08fe66992f7b1f (diff)
parent44c144f9c8e8fbd73ede2848da8253b3aae42ec2 (diff)
downloadlinux-afad97eee47c1f1f242202e2473929b4ef5d9f43.tar.gz
Merge tag 'dm-4.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:

 - the most extensive changes this cycle are the DM core improvements to
   add full blk-mq support to request-based DM.

    - disabled by default but user can opt-in with CONFIG_DM_MQ_DEFAULT
    - depends on some blk-mq changes from Jens' for-4.1/core branch so
      that explains why this pull is built on linux-block.git

 - update DM to use name_to_dev_t() rather than open-coding a less
   capable device parser.

    - includes a couple small improvements to name_to_dev_t() that offer
      stricter constraints that DM's code provided.

 - improvements to the dm-cache "mq" cache replacement policy.

 - a DM crypt crypt_ctr() error path fix and an async crypto deadlock
   fix

 - a small efficiency improvement for DM crypt decryption by leveraging
   immutable biovecs

 - add error handling modes for corrupted blocks to DM verity

 - a new "log-writes" DM target from Josef Bacik that is meant for file
   system developers to test file system integrity at particular points
   in the life of a file system

 - a few DM log userspace cleanups and fixes

 - a few Documentation fixes (for thin, cache, crypt and switch)

* tag 'dm-4.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (34 commits)
  dm crypt: fix missing error code return from crypt_ctr error path
  dm crypt: fix deadlock when async crypto algorithm returns -EBUSY
  dm crypt: leverage immutable biovecs when decrypting on read
  dm crypt: update URLs to new cryptsetup project page
  dm: add log writes target
  dm table: use bool function return values of true/false not 1/0
  dm verity: add error handling modes for corrupted blocks
  dm thin: remove stale 'trim' message documentation
  dm delay: use msecs_to_jiffies for time conversion
  dm log userspace base: fix compile warning
  dm log userspace transfer: match wait_for_completion_timeout return type
  dm table: fall back to getting device using name_to_dev_t()
  init: stricter checking of major:minor root= values
  init: export name_to_dev_t and mark name argument as const
  dm: add 'use_blk_mq' module param and expose in per-device ro sysfs attr
  dm: optimize dm_mq_queue_rq to _not_ use kthread if using pure blk-mq
  dm: add full blk-mq support to request-based DM
  dm: impose configurable deadline for dm_request_fn's merge heuristic
  dm sysfs: introduce ability to add writable attributes
  dm: don't start current request if it would've merged with the previous
  ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-block-dm22
-rw-r--r--Documentation/device-mapper/dm-crypt.txt4
-rw-r--r--Documentation/device-mapper/log-writes.txt140
-rw-r--r--Documentation/device-mapper/switch.txt4
-rw-r--r--Documentation/device-mapper/thin-provisioning.txt3
-rw-r--r--Documentation/device-mapper/verity.txt21
6 files changed, 185 insertions, 9 deletions
diff --git a/Documentation/ABI/testing/sysfs-block-dm b/Documentation/ABI/testing/sysfs-block-dm
index 87ca5691e29b..f9f2339b9a0a 100644
--- a/Documentation/ABI/testing/sysfs-block-dm
+++ b/Documentation/ABI/testing/sysfs-block-dm
@@ -23,3 +23,25 @@ Description:	Device-mapper device suspend state.
 		Contains the value 1 while the device is suspended.
 		Otherwise it contains 0. Read-only attribute.
 Users:		util-linux, device-mapper udev rules
+
+What:		/sys/block/dm-<num>/dm/rq_based_seq_io_merge_deadline
+Date:		March 2015
+KernelVersion:	4.1
+Contact:	dm-devel@redhat.com
+Description:	Allow control over how long a request that is a
+		reasonable merge candidate can be queued on the request
+		queue.  The resolution of this deadline is in
+		microseconds (ranging from 1 to 100000 usecs).
+		Setting this attribute to 0 (the default) will disable
+		request-based DM's merge heuristic and associated extra
+		accounting.  This attribute is not applicable to
+		bio-based DM devices so it will only ever report 0 for
+		them.
+
+What:		/sys/block/dm-<num>/dm/use_blk_mq
+Date:		March 2015
+KernelVersion:	4.1
+Contact:	dm-devel@redhat.com
+Description:	Request-based Device-mapper blk-mq I/O path mode.
+		Contains the value 1 if the device is using blk-mq.
+		Otherwise it contains 0. Read-only attribute.
diff --git a/Documentation/device-mapper/dm-crypt.txt b/Documentation/device-mapper/dm-crypt.txt
index ad697781f9ac..692171fe9da0 100644
--- a/Documentation/device-mapper/dm-crypt.txt
+++ b/Documentation/device-mapper/dm-crypt.txt
@@ -5,7 +5,7 @@ Device-Mapper's "crypt" target provides transparent encryption of block devices
 using the kernel crypto API.
 
 For a more detailed description of supported parameters see:
-http://code.google.com/p/cryptsetup/wiki/DMCrypt
+https://gitlab.com/cryptsetup/cryptsetup/wikis/DMCrypt
 
 Parameters: <cipher> <key> <iv_offset> <device path> \
 	      <offset> [<#opt_params> <opt_params>]
@@ -80,7 +80,7 @@ Example scripts
 ===============
 LUKS (Linux Unified Key Setup) is now the preferred way to set up disk
 encryption with dm-crypt using the 'cryptsetup' utility, see
-http://code.google.com/p/cryptsetup/
+https://gitlab.com/cryptsetup/cryptsetup
 
 [[
 #!/bin/sh
diff --git a/Documentation/device-mapper/log-writes.txt b/Documentation/device-mapper/log-writes.txt
new file mode 100644
index 000000000000..c10f30c9b534
--- /dev/null
+++ b/Documentation/device-mapper/log-writes.txt
@@ -0,0 +1,140 @@
+dm-log-writes
+=============
+
+This target takes 2 devices, one to pass all IO to normally, and one to log all
+of the write operations to.  This is intended for file system developers wishing
+to verify the integrity of metadata or data as the file system is written to.
+There is a log_write_entry written for every WRITE request and the target is
+able to take arbitrary data from userspace to insert into the log.  The data
+that is in the WRITE requests is copied into the log to make the replay happen
+exactly as it happened originally.
+
+Log Ordering
+============
+
+We log things in order of completion once we are sure the write is no longer in
+cache.  This means that normal WRITE requests are not actually logged until the
+next REQ_FLUSH request.  This is to make it easier for userspace to replay the
+log in a way that correlates to what is on disk and not what is in cache, to
+make it easier to detect improper waiting/flushing.
+
+This works by attaching all WRITE requests to a list once the write completes.
+Once we see a REQ_FLUSH request we splice this list onto the request and once
+the FLUSH request completes we log all of the WRITEs and then the FLUSH.  Only
+completed WRITEs, at the time the REQ_FLUSH is issued, are added in order to
+simulate the worst case scenario with regard to power failures.  Consider the
+following example (W means write, C means complete):
+
+W1,W2,W3,C3,C2,Wflush,C1,Cflush
+
+The log would show the following
+
+W3,W2,flush,W1....
+
+Again this is to simulate what is actually on disk, this allows us to detect
+cases where a power failure at a particular point in time would create an
+inconsistent file system.
+
+Any REQ_FUA requests bypass this flushing mechanism and are logged as soon as
+they complete as those requests will obviously bypass the device cache.
+
+Any REQ_DISCARD requests are treated like WRITE requests.  Otherwise we would
+have all the DISCARD requests, and then the WRITE requests and then the FLUSH
+request.  Consider the following example:
+
+WRITE block 1, DISCARD block 1, FLUSH
+
+If we logged DISCARD when it completed, the replay would look like this
+
+DISCARD 1, WRITE 1, FLUSH
+
+which isn't quite what happened and wouldn't be caught during the log replay.
+
+Target interface
+================
+
+i) Constructor
+
+   log-writes <dev_path> <log_dev_path>
+
+   dev_path	: Device that all of the IO will go to normally.
+   log_dev_path : Device where the log entries are written to.
+
+ii) Status
+
+    <#logged entries> <highest allocated sector>
+
+    #logged entries	       : Number of logged entries
+    highest allocated sector   : Highest allocated sector
+
+iii) Messages
+
+    mark <description>
+
+	You can use a dmsetup message to set an arbitrary mark in a log.
+	For example say you want to fsck a file system after every
+	write, but first you need to replay up to the mkfs to make sure
+	we're fsck'ing something reasonable, you would do something like
+	this:
+
+	  mkfs.btrfs -f /dev/mapper/log
+	  dmsetup message log 0 mark mkfs
+	  <run test>
+
+	  This would allow you to replay the log up to the mkfs mark and
+	  then replay from that point on doing the fsck check in the
+	  interval that you want.
+
+	Every log has a mark at the end labeled "dm-log-writes-end".
+
+Userspace component
+===================
+
+There is a userspace tool that will replay the log for you in various ways.
+It can be found here: https://github.com/josefbacik/log-writes
+
+Example usage
+=============
+
+Say you want to test fsync on your file system.  You would do something like
+this:
+
+TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
+dmsetup create log --table "$TABLE"
+mkfs.btrfs -f /dev/mapper/log
+dmsetup message log 0 mark mkfs
+
+mount /dev/mapper/log /mnt/btrfs-test
+<some test that does fsync at the end>
+dmsetup message log 0 mark fsync
+md5sum /mnt/btrfs-test/foo
+umount /mnt/btrfs-test
+
+dmsetup remove log
+replay-log --log /dev/sdc --replay /dev/sdb --end-mark fsync
+mount /dev/sdb /mnt/btrfs-test
+md5sum /mnt/btrfs-test/foo
+<verify md5sum's are correct>
+
+Another option is to do a complicated file system operation and verify the file
+system is consistent during the entire operation.  You could do this with:
+
+TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
+dmsetup create log --table "$TABLE"
+mkfs.btrfs -f /dev/mapper/log
+dmsetup message log 0 mark mkfs
+
+mount /dev/mapper/log /mnt/btrfs-test
+<fsstress to dirty the fs>
+btrfs filesystem balance /mnt/btrfs-test
+umount /mnt/btrfs-test
+dmsetup remove log
+
+replay-log --log /dev/sdc --replay /dev/sdb --end-mark mkfs
+btrfsck /dev/sdb
+replay-log --log /dev/sdc --replay /dev/sdb --start-mark mkfs \
+	--fsck "btrfsck /dev/sdb" --check fua
+
+And that will replay the log until it sees a FUA request, run the fsck command
+and if the fsck passes it will replay to the next FUA, until it is completed or
+the fsck command exists abnormally.
diff --git a/Documentation/device-mapper/switch.txt b/Documentation/device-mapper/switch.txt
index 8897d0494838..424835e57f27 100644
--- a/Documentation/device-mapper/switch.txt
+++ b/Documentation/device-mapper/switch.txt
@@ -47,8 +47,8 @@ consume far too much memory.
 Using this device-mapper switch target we can now build a two-layer
 device hierarchy:
 
-    Upper Tier – Determine which array member the I/O should be sent to.
-    Lower Tier – Load balance amongst paths to a particular member.
+    Upper Tier - Determine which array member the I/O should be sent to.
+    Lower Tier - Load balance amongst paths to a particular member.
 
 The lower tier consists of a single dm multipath device for each member.
 Each of these multipath devices contains the set of paths directly to
diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt
index 2f5173500bd9..4f67578b2954 100644
--- a/Documentation/device-mapper/thin-provisioning.txt
+++ b/Documentation/device-mapper/thin-provisioning.txt
@@ -380,9 +380,6 @@ then you'll have no access to blocks mapped beyond the end.  If you
 load a target that is bigger than before, then extra blocks will be
 provisioned as and when needed.
 
-If you wish to reduce the size of your thin device and potentially
-regain some space then send the 'trim' message to the pool.
-
 ii) Status
 
      <nr mapped sectors> <highest mapped sector>
diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt
index 9884681535ee..e15bc1a0fb98 100644
--- a/Documentation/device-mapper/verity.txt
+++ b/Documentation/device-mapper/verity.txt
@@ -11,6 +11,7 @@ Construction Parameters
     <data_block_size> <hash_block_size>
     <num_data_blocks> <hash_start_block>
     <algorithm> <digest> <salt>
+    [<#opt_params> <opt_params>]
 
 <version>
     This is the type of the on-disk hash format.
@@ -62,6 +63,22 @@ Construction Parameters
 <salt>
     The hexadecimal encoding of the salt value.
 
+<#opt_params>
+    Number of optional parameters. If there are no optional parameters,
+    the optional paramaters section can be skipped or #opt_params can be zero.
+    Otherwise #opt_params is the number of following arguments.
+
+    Example of optional parameters section:
+        1 ignore_corruption
+
+ignore_corruption
+    Log corrupted blocks, but allow read operations to proceed normally.
+
+restart_on_corruption
+    Restart the system when a corrupted block is discovered. This option is
+    not compatible with ignore_corruption and requires user space support to
+    avoid restart loops.
+
 Theory of operation
 ===================
 
@@ -125,7 +142,7 @@ block boundary) are the hash blocks which are stored a depth at a time
 
 The full specification of kernel parameters and on-disk metadata format
 is available at the cryptsetup project's wiki page
-  http://code.google.com/p/cryptsetup/wiki/DMVerity
+  https://gitlab.com/cryptsetup/cryptsetup/wikis/DMVerity
 
 Status
 ======
@@ -142,7 +159,7 @@ Set up a device:
 
 A command line tool veritysetup is available to compute or verify
 the hash tree or activate the kernel device. This is available from
-the cryptsetup upstream repository http://code.google.com/p/cryptsetup/
+the cryptsetup upstream repository https://gitlab.com/cryptsetup/cryptsetup/
 (as a libcryptsetup extension).
 
 Create hash on the device: