[[PageOutline]]

 * [http://www.usupi.org/sysad/166.html LVM のスナップショット機能を使ってみる - いますぐ実践! Linuxシステム管理 / Vol.166]
 * [http://www.atmarkit.co.jp/flinux/rensai/root06/root06a.html @IT:LVMによる自動バックアップ・システムの構築(1/3)]
 * [http://www.drbd.org/users-guide/ch-lvm.html Chapter 11. Using LVM with DRBD]
 * [http://www.drbd.jp/users-guide/ch-lvm.html 第11章 DRBDとLVMの使用]
 * [http://thinkit.co.jp/free/compare/5/9/ ThinkIT 第9回:バックアップにおけるスナップショットの活用 (1/3)]
 * [http://okkun-lab.rd.fukuoka-u.ac.jp/wiki/?Tips%2FLinux%2FLVM Tips/Linux/LVM - 福岡大学奥村研究室 - okkun-lab Pukiwiki!]
 * arch:LVM reasonably detailed
 * https://wiki.gentoo.org/wiki/LVM very well organized

 * Volume naming rules
   * https://git.fedorahosted.org/cgit/lvm2.git/tree/lib/misc/lvm-string.c#n68 _validate_name()
     1. Must not start with a hyphen
     2. "." and ".." are not allowed (names such as ".hoge" are fine)
     3. Alphanumerics plus ".", "_", "+" (and "-", except at the start) are allowed
     4. Must be shorter than 128 characters
       * In practice the limit is somewhat lower; see https://git.fedorahosted.org/cgit/lvm2.git/tree/lib/metadata/metadata.c#n2597 vg_validate()
 * Meaning of the Attr field in vgs/lvs
   * https://git.fedorahosted.org/cgit/lvm2.git/tree/lib/metadata/vg.c#n639 vg_attr_dup()
   * https://git.fedorahosted.org/cgit/lvm2.git/tree/lib/metadata/lv.c#n641 lv_attr_dup_with_info_and_seg_status()

= LVM RAID =
 * [http://unix.stackexchange.com/questions/150644/raiding-with-lvm-vs-mdraid-pros-and-cons raid - RAIDing with LVM vs MDRAID - pros and cons? - Unix & Linux Stack Exchange]
   * Contains detailed examples for Debian Wheezy/Jessie. The complaint that information on LVM-RAID is hard to find still seems largely true as of this writing (2015/05).
   * Under the hood, LVM-RAID is reportedly the same implementation as MD-RAID.
 * [https://access.redhat.com/documentation/ja-JP/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/raid_volumes.html 4.4.15. RAID 論理ボリューム]
   * The manual for using LVM-RAID on RHEL. It states that as of the Red Hat Enterprise Linux 6.3 release, LVM supports RAID4/5/6 and a new implementation of mirroring.
 * [https://access.redhat.com/documentation/ja-JP/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/mirrorrecover.html 6.3. LVM ミラー障害からの回復]
   * This page is linked from [https://access.redhat.com/documentation/ja-JP/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/mirror_create.html 4.4.3. ミラー化ボリュームの作成], which covers what is described as the older mirror implementation, so it is unclear whether it also applies to the new implementation.
 * Note that the (untranslated) RHEL 7 edition has almost the same chapter layout: [https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/ Logical Volume Manager Administration]
 * https://wiki.gentoo.org/wiki/LVM#Different_storage_allocation_methods
 > It is not possible to stripe an existing volume, nor reshape the stripes across more/less physical volumes, nor to convert to a different RAID level/linear volume. A stripe set can be mirrored. It is possible to extend a stripe set across additional physical volumes, but they must be added in multiples of the original stripe set (which will effectively linearly append a new stripe set).
   * Not being able to reshape RAID1/4/5/6 volumes looks like a fairly severe limitation. MD-RAID can do it, so LVM may gain support eventually.

== LVM with MD-RAID ==
 * [http://ruzia.hateblo.jp/entry/2014/01/04/203603 mdadm と LVM で作る、全手動 BeyondRAID もどき - 守破離]
 > Roughly speaking, you build several RAID1 or RAID5 regions that span the HDDs, then aggregate those regions into one large storage pool.
 > The nice part is that the pool survives losing one HDD, and depending on the situation you can increase capacity by swapping out just one HDD.
 > Replacing at least two HDDs with larger ones is guaranteed to increase the capacity.
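 * A minimal command sketch of the layout described above, assuming two hypothetical RAID1 pairs built from four disks (the device names /dev/sd[b-e]1, /dev/md0, /dev/md1 and the VG name vg_pool are placeholders, not taken from the article):
{{{
# Two RAID1 arrays, each spanning a different pair of disks
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1

# Turn each array into a PV and aggregate them into a single VG
pvcreate /dev/md0 /dev/md1
vgcreate vg_pool /dev/md0 /dev/md1

# One large LV spanning both arrays; any single disk can fail without data loss
lvcreate -l 100%FREE -n data vg_pool
mkfs.ext4 /dev/vg_pool/data
}}}
 * The capacity bump mentioned in the quote would then come from replacing both members of one RAID1 pair with larger disks and running mdadm --grow --size=max, pvresize, and lvextend on that array.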
= Thin provisioning =
 * https://www.kernel.org/doc/Documentation/device-mapper/thin-provisioning.txt
 > These targets are very much still in the EXPERIMENTAL state. Please do not yet rely on them in production.
 * https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Logical_Volume_Manager_Administration/LV.html#thinly_provisioned_volume_creation
 * https://wiki.gentoo.org/wiki/LVM#Thin_provisioning very well organized
 * [http://lists.centos.org/pipermail/centos/2014-January/139850.html (CentOS) LVM thinpool snapshots broken in 6.5?]
 > For the people who run into this as well:
 > This is apparently a feature and not a bug. Thin provisioning snapshots
 > are no longer automatically activated and a "skip activation" flag is
 > set during creation by default. One has to add the "-K" option to
 > "lvchange -ay " to have lvchange ignore this flag and
 > activate the volume for real. "-k" can be used on lvcreate to not add
 > this flag to the volume. See man lvchange/lvcreate for more details.
 > /etc/lvm/lvm.conf also contains a "auto_set_activation_skip" option now
 > that controls this.
 >
 > Apparently this was changed in 6.5 but the changes were not mentioned in
 > the release notes.
   * In short: thin snapshots are now created inactive by default, and activating one requires the '-K' option to lvchange ('-k y -K' looks like the combination to use).
 * As of this writing (Arch Linux 4.0.1-1-ARCH, lvm2 2.02.116-1), a thin pool can be extended but not reduced:
   * # lvextend -l100%VG vg/pool0
{{{
  Size of logical volume vg/pool0_tdata changed from 2.00 GiB (512 extents) to 2.59 GiB (663 extents).
  Logical volume pool0 successfully resized
}}}
   * # lvreduce -L2.5G vg/pool0
{{{
  Thin pool volumes cannot be reduced in size yet.
  Run `lvreduce --help' for more information.
}}}
 * [http://dustymabe.com/2013/06/21/guest-discardfstrim-on-thin-lvs/ Guest Discard/FSTRIM On Thin LVs « A Random Walk Down Tech Street]
 * [http://www.slideshare.net/akirahayakawa716/dmthin20140528 dm-thin-internal-ja] a study of the dm-thin implementation

== resize ==
 * For terminology, see [http://man7.org/linux/man-pages/man7/lvmthin.7.html lvmthin(7) - Linux manual page]
 * The experiments below were run on Arch Linux 4.0.1-1-ARCH, lvm2 2.02.116-1

=== lvextend ===
 * Both the ThinDataLV and the ThinMetaLV can be extended without problems
 1. Create the ThinPoolLV and a ThinLV
   * # lvcreate -L 2G -T --thinpool pool0 vg
{{{
  Logical volume "pool0" created.
}}}
   * # lvcreate -T vg/pool0 -V 3G -n lv1
{{{
  Logical volume "lv1" created.
}}}
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-a-tz--  768.00U pool0         0.00
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  512.00U               0.00   1.17
  [pool0_tdata]   vg   Twi-ao----  512.00U
  [pool0_tmeta]   vg   ewi-ao----    1.00U
}}}
 2. Write some data
   * # mkfs.ext4 /dev/vg/lv1
   * # mount /dev/vg/lv1 /data/lv1/
   * # dd if=/dev/urandom of=/data/lv1/tmp bs=1024 count=1024000
   * # sha1sum -b /data/lv1/tmp > /tmp/sha1
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-aotz--  768.00U pool0        36.24
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  512.00U              54.37  14.65
  [pool0_tdata]   vg   Twi-ao----  512.00U
  [pool0_tmeta]   vg   ewi-ao----    1.00U
}}}
 3. lvextend the ThinLV
   * # lvextend -L+2G /dev/vg/lv1
{{{
  Size of logical volume vg/lv1 changed from 3.00 GiB (768 extents) to 5.00 GiB (1280 extents).
  Logical volume lv1 successfully resized
}}}
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-aotz-- 1280.00U pool0        21.75
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  512.00U              54.37  14.65
  [pool0_tdata]   vg   Twi-ao----  512.00U
  [pool0_tmeta]   vg   ewi-ao----    1.00U
}}}
 4. lvextend the ThinPoolLV
   * # lvextend -l+1 /dev/vg/pool0
{{{
  Size of logical volume vg/pool0_tdata changed from 2.00 GiB (512 extents) to 2.00 GiB (513 extents).
  Logical volume pool0 successfully resized
}}}
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-aotz-- 1280.00U pool0        21.75
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  513.00U              54.26  14.65
  [pool0_tdata]   vg   Twi-ao----  513.00U
  [pool0_tmeta]   vg   ewi-ao----    1.00U
}}}
 5. Extending by naming the ThinDataLV directly is rejected; extend the ThinPoolLV instead
   * # lvextend -l+100 /dev/vg/pool0_tdata
{{{
  Can't resize internal logical volume pool0_tdata
  Run `lvextend --help' for more information.
}}}
   * # lvextend -l+100 /dev/vg/pool0
{{{
  Size of logical volume vg/pool0_tdata changed from 2.00 GiB (513 extents) to 2.39 GiB (613 extents).
  Logical volume pool0 successfully resized
}}}
{{{
  pool0           vg   twi-aotz--  613.00U              45.41  14.65
}}}
 6. lvextend the ThinMetaLV
   * # lvextend -l+100 /dev/vg/pool0_tmeta
{{{
  Size of logical volume vg/pool0_tmeta changed from 4.00 MiB (1 extents) to 404.00 MiB (101 extents).
  Logical volume pool0_tmeta successfully resized
}}}
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-aotz-- 1280.00U pool0        21.75
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  613.00U              45.41   0.15
  [pool0_tdata]   vg   Twi-ao----  613.00U
  [pool0_tmeta]   vg   ewi-ao----  101.00U
}}}
   * # sha1sum -c /tmp/sha1
{{{
  /data/lv1/tmp: OK
}}}
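 * Note that the steps above only grow lv1 itself; the ext4 filesystem created in step 2 is still 3 GiB. A sketch of growing the filesystem as well, using the same /dev/vg/lv1 as above (ext4 can be grown while mounted):
{{{
# Grow the mounted ext4 filesystem to fill the enlarged LV
resize2fs /dev/vg/lv1

# Or do both in one step: extend the LV and let lvextend resize the filesystem via fsadm
lvextend -r -L+2G /dev/vg/lv1
}}}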
=== lvreduce ===
 * lvreduce on a thin pool is currently not possible
 * The metadata looks reducible at first glance, but reducing it causes problems (see the repair sketch at the end of this section)
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-a-tz-- 1280.00U pool0        21.75
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  613.00U              45.41   0.15
  [pool0_tdata]   vg   Twi-ao----  613.00U
  [pool0_tmeta]   vg   ewi-ao----  101.00U
}}}
   * # umount /data/lv1
   * # lvchange -an /dev/vg/lv1
   * # lvchange -an /dev/vg/pool0
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi---tz-- 1280.00U pool0
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi---tz--  613.00U
  [pool0_tdata]   vg   Twi-------  613.00U
  [pool0_tmeta]   vg   ewi-------  101.00U
}}}
 * Once the LVs are deactivated, the reduce is accepted, but...
   * # lvreduce -l -1 /dev/vg/pool0_tmeta
{{{
  Size of logical volume vg/pool0_tmeta changed from 404.00 MiB (101 extents) to 400.00 MiB (100 extents).
  Logical volume pool0_tmeta successfully resized
}}}
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi---tz-- 1280.00U pool0
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi---tz--  613.00U
  [pool0_tdata]   vg   Twi-------  613.00U
  [pool0_tmeta]   vg   ewi-------  100.00U
}}}
 * ...the pool can no longer be activated
   * # lvchange -ay /dev/vg/pool0
{{{
  device-mapper: resume ioctl on failed: Invalid argument
  Unable to resume vg-pool0-tpool (252:2)
}}}
{{{
May 10 20:31:58 raid-test kernel: device-mapper: thin: 252:2: metadata device (102400 blocks) too small: expected 103424
May 10 20:31:58 raid-test kernel: device-mapper: table: 252:2: thin-pool: preresume failed, error = -22
}}}
   * # lvchange -ay /dev/vg/lv1
{{{
  device-mapper: resume ioctl on failed: Invalid argument
  Unable to resume vg-pool0-tpool (252:2)
}}}
{{{
May 10 20:32:30 raid-test kernel: device-mapper: thin: 252:2: metadata device (102400 blocks) too small: expected 103424
May 10 20:32:30 raid-test kernel: device-mapper: table: 252:2: thin-pool: preresume failed, error = -22
}}}
 * Growing the metadata back to at least its pre-reduce size makes the pool activatable again. Whether this left any internal inconsistency behind is unknown.
   * # lvextend -l+1 /dev/vg/pool0_tmeta
{{{
  Size of logical volume vg/pool0_tmeta changed from 400.00 MiB (100 extents) to 404.00 MiB (101 extents).
  Logical volume pool0_tmeta successfully resized
}}}
   * # lvchange -ay /dev/vg/pool0
   * # lvchange -ay /dev/vg/lv1
   * # lvs -a --units 4m
{{{
  LV              VG   Attr       LSize    Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1             vg   Vwi-a-tz-- 1280.00U pool0        21.75
  [lvol0_pmspare] vg   ewi-------    1.00U
  pool0           vg   twi-aotz--  613.00U              45.41   0.15
  [pool0_tdata]   vg   Twi-ao----  613.00U
  [pool0_tmeta]   vg   ewi-ao----  101.00U
}}}
   * # mount /dev/vg/lv1 /data/lv1/
   * # sha1sum -c /tmp/sha1
{{{
  /data/lv1/tmp: OK
}}}
 * When the target is the ThinPoolLV or the ThinDataLV, the reduce is rejected outright
   * # lvreduce -l -1 /dev/vg/pool0_tdata
{{{
  Can't resize internal logical volume pool0_tdata
  Run `lvreduce --help' for more information.
}}}
   * # lvreduce -l -1 /dev/vg/pool0
{{{
  Thin pool volumes cannot be reduced in size yet.
  Run `lvreduce --help' for more information.
}}}
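 * Given the uncertainty above about whether the reduce-then-regrow cycle corrupted anything, one way to have the metadata checked and rebuilt is lvconvert --repair, the same command the error_when_full notes below rely on for read-only recovery. A sketch, assuming the pool from the experiments above (the repair needs free space in the VG for the new metadata):
{{{
# The thin LV and the pool must be inactive first
lvchange -an /dev/vg/lv1
lvchange -an /dev/vg/pool0

# Rebuild the metadata (runs thin_repair underneath); the old metadata
# is typically kept as a pool0_metaN LV so it can be inspected or removed
lvconvert --repair vg/pool0

# Reactivate and verify the data again
lvchange -ay /dev/vg/pool0
lvchange -ay /dev/vg/lv1
}}}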
== metadata ==
 * http://man7.org/linux/man-pages/man7/lvmthin.7.html also describes how to work on the metadata directly
 * [http://www.redhat.com/archives/linux-lvm/2012-October/msg00023.html (linux-lvm) how to recover after thin pool metadata did fill up?] Older versions of lvm2 apparently could not resize the metadata at all
 > * http://www.redhat.com/archives/linux-lvm/2012-October/msg00033.html
 > > With 3.7 kernel and the next release of lvm2 (2.02.99) it's expected full support for live size extension of metadata device.
 * [http://comments.gmane.org/gmane.linux.kernel.device-mapper.devel/19190 dm-thin: issues about resize the pool metadata size]
   * A report of failures when trying to grow the metadata by a large amount (3.12.0-rc7, lvm2 2.02.103)
   * Does not appear to be a problem on Arch Linux 4.0.1-1-ARCH, lvm2 2.02.116-1
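 * For reference, with the lvm2 version used here the metadata can be grown either through the hidden _tmeta LV (as in step 6 of the lvextend section) or, per lvmthin(7), via --poolmetadatasize on the pool itself. A sketch with example sizes:
{{{
# Extend the thin pool metadata by 100 MiB (two equivalent ways; use one)
lvextend --poolmetadatasize +100M vg/pool0
lvextend -L+100M vg/pool0_tmeta
}}}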
== /etc/lvm/lvm.conf ==
=== error_when_full ===
 * 0 (default)
   * When the thin pool fills up, writes are queued and the pool switches to out-of-data-space mode
{{{
May 10 12:35:10 raid-test lvm[303]: Thin vg-pool0-tpool is now 96% full.
May 10 12:40:36 raid-test kernel: device-mapper: thin: 252:2: reached low water mark for data device: sending event.
May 10 12:40:36 raid-test kernel: device-mapper: thin: 252:2: switching pool to out-of-data-space mode
May 10 12:40:36 raid-test lvm[303]: Thin vg-pool0-tpool is now 100% full.
}}}
   * # lvs -a -o+lv_when_full
{{{
  LV    VG   Attr       LSize Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert WhenFull
  lv1   vg   Vwi-aotz-- 3.00g pool0        66.67
  pool0 vg   twi-aotzD- 2.00g              100.00 25.98                            queue
}}}
   * The pool then waits (60 seconds by default) for the thin pool to be extended; if that does not happen, errors are returned to the filesystem and the pool switches to read-only mode
{{{
May 10 12:41:36 raid-test kernel: device-mapper: thin: 252:2: switching pool to read-only mode
May 10 12:41:36 raid-test kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: I/O error -5 writing to inode 17 (offset 25165824 size 8388608 starting block 591856)
May 10 12:41:36 raid-test kernel: buffer_io_error: 38109 callbacks suppressed
May 10 12:41:36 raid-test kernel: Buffer I/O error on device dm-8, logical block 591856
May 10 12:41:36 raid-test kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: I/O error -5 writing to inode 17 (offset 25165824 size 8388608 starting block 591857)
May 10 12:41:36 raid-test kernel: Buffer I/O error on device dm-8, logical block 591857
(snip)
May 10 12:41:36 raid-test kernel: device-mapper: thin: 252:2: metadata operation 'dm_pool_commit_metadata' failed: error = -1
May 10 12:41:36 raid-test kernel: device-mapper: thin: 252:2: aborting current metadata transaction
May 10 12:41:36 raid-test kernel: device-mapper: thin: 252:2: switching pool to read-only mode
}}}
   * # lvs -a -o+lv_when_full
{{{
  LV    VG   Attr       LSize Pool  Origin Data%  Meta%  Move Log Cpy%Sync Convert WhenFull
  lv1   vg   Vwi-aotz-- 3.00g pool0        66.67
  pool0 vg   twi-aotzM- 2.00g              100.00 25.98                            queue
}}}
   * To recover from read-only mode, you have to deactivate all thin LVs, then deactivate the thin pool LV, then run lvconvert --repair on the thin pool LV
 * 1
   * When the thin pool fills up, writes fail immediately.
{{{
May 10 12:53:17 raid-test kernel: device-mapper: thin: 252:2: reached low water mark for data device: sending event.
May 10 12:53:17 raid-test kernel: device-mapper: thin: 252:2: switching pool to out-of-data-space mode
May 10 12:53:17 raid-test kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: I/O error -28 writing to inode 17 (offset 33554432 size 8388608 starting block 591872)
May 10 12:53:17 raid-test kernel: buffer_io_error: 39292 callbacks suppressed
May 10 12:53:17 raid-test kernel: Buffer I/O error on device dm-8, logical block 591872
May 10 12:53:17 raid-test kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: I/O error -28 writing to inode 17 (offset 33554432 size 8388608 starting block 591873)
(snip)
May 10 12:53:17 raid-test lvm[303]: Thin vg-pool0-tpool is now 100% full.
May 10 12:53:27 raid-test kernel: EXT4-fs warning: 135366 callbacks suppressed
May 10 12:53:27 raid-test kernel: EXT4-fs warning (device dm-8): ext4_end_bio:317: I/O error -28 writing to inode 17 (offset 588054528 size 4857856 starting block 727248)
May 10 12:53:27 raid-test kernel: buffer_io_error: 135366 callbacks suppressed
(snip)
}}}
   * # lvs -a -o+lv_when_full
{{{
  lv1   vg   Vwi-aotz-- 3.00g pool0        66.67
  pool0 vg   twi-aotzD- 2.00g              100.00 26.86                            error
}}}
   * Freeing allocations with fstrim or similar recovers the pool immediately
{{{
May 10 12:55:29 raid-test kernel: device-mapper: thin: 252:2: switching pool to write mode
May 10 12:55:29 raid-test kernel: device-mapper: thin: 252:2: switching pool to write mode
}}}
   * # lvs -a -o+lv_when_full
{{{
  lv1   vg   Vwi-aotz-- 3.00g pool0        65.62
  pool0 vg   twi-aotz-- 2.00g              98.44  26.46                            error
}}}
 * The setting can be changed online and takes effect immediately
   * # lvchange --errorwhenfull y vg/pool0
{{{
  Logical volume "pool0" changed.
}}}
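 * Related to the 60-second grace period above: instead of letting the pool reach 100%, dmeventd can extend it automatically once a threshold is crossed. A sketch of the relevant lvm.conf settings (the values are examples; monitoring must be enabled, e.g. with lvchange --monitor y vg/pool0, and the VG needs free space to grow into):
{{{
activation {
    # Autoextend the thin pool once it is 80% full
    # (the default of 100 disables autoextension)
    thin_pool_autoextend_threshold = 80
    # ... growing it by 20% of its current size each time
    thin_pool_autoextend_percent = 20
}
}}}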