The WolfspyreLabs Blog/ November/ Recovering a ZFS pool after a catastrophic power failure/ Recovering a ZFS pool after a catastrophic power failure what the damned hell? #This sucked. so… Pine died. with the help of Some posts on freebsd forms1, Mailing lists2, and some proxmox forums posts3 I got stuff mostly kinda functional enough to recover… zpool import zpool import -fFX " The documentation states the extreme rewind import (import -X) performs a full traversal of all blocks in the pool for verification." zfsspy #This freebsd thread4 pointed me at ZfsSpy5 - and its’ code repo6 Enabling debug messages #The openzfs docs7 pointed me to the zfsdbgmsg_enable and zfs_recover8 tuning parameters: zfs_dbgmsg_enable Internally ZFS keeps a small log to facilitate debugging. The contents of the log are in the /proc/spl/kstat/zfs/dbgmsg file. Writing 0 to /proc/spl/kstat/zfs/dbgmsg file clears the log. as well as some other notes on importing9 and this tool kstat-analyzer10 zfs_recover # go go gadget zfs_recover pine# echo 0 > /proc/spl/kstat/zfs/dbgmsg pine# echo 1 > /sys/module/zfs/parameters/zfs_dbgmsg_enable pine# echo 1 > /sys/module/zfs/parameters/zfs_recover ouch pine# zpool import pool: Wolfspyre id: 16715582643789691940 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: Wolfspyre ONLINE raidz1-0 ONLINE c114947f-2aa3-4111-85c8-93807d36e292 ONLINE 451ad62b-72a5-45f0-a6e1-347fda108334 ONLINE 7856ae71-5548-43ba-ae4f-ffd1bf6c9845 ONLINE 2c34ade3-3265-45b1-81b0-3b2ec6131aa9 ONLINE raidz1-1 ONLINE 8f00fa1f-1a70-4767-98a4-bf59f626e3c1 ONLINE cf69d566-a2ba-403c-ab85-8019d6498eeb ONLINE 1dc37743-5587-48ed-8636-613a79d5de63 ONLINE 407b271a-45a8-4240-9d93-d4207921cf1b ONLINE raidz1-2 ONLINE 954508b2-e5ec-4028-994f-022dc014c38d ONLINE 9338c1e3-0be0-4412-b662-0584d81de55f ONLINE 0637d1dc-434e-492e-a09c-66d94d924b4a ONLINE a2034c98-0311-458b-9084-43864833cd27 ONLINE cache nvme3n1p2 nvme2n1p2 nvme1n1p2 nvme0n1p2 logs mirror-4 ONLINE nvme0n1p1 ONLINE nvme1n1p1 ONLINE mirror-5 ONLINE nvme2n1p1 ONLINE nvme3n1p1 ONLINE fuck fuck fuck pine# tail -f /proc/spl/kstat/zfs/dbgmsg 1636590873 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED 1636590873 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING 1636590873 spa.c:6084:spa_import(): spa_import: importing Wolfspyre 1636590873 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): LOADING 1636590873 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/c114947f-2aa3-4111-85c8-93807d36e292': best uberblock found for spa Wolfspyre. txg 10512767 1636590873 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config untrusted): using uberblock with txg=10512767 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 tail: /proc/spl/kstat/zfs/dbgmsg: file truncated timestamp message 1636590392 metaslab.c:3648:metaslab_condense(): condensing: txg 2019513, msp[38] ffff890e626cc800, vdev id 0, spa boot-pool, smp size 66032, segments 890, forcing condense=FALSE 1636590832 spa.c:6232:spa_tryimport(): spa_tryimport: importing Wolfspyre 1636590832 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING 1636590832 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/c114947f-2aa3-4111-85c8-93807d36e292': best uberblock found for spa $import. txg 10512767 1636590832 spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=10512767 1636590832 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590832 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590832 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590832 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590838 spa.c:8337:spa_async_request(): spa=$import async request task=2048 1636590838 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED 1636590838 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING 1636590867 spa.c:6232:spa_tryimport(): spa_tryimport: importing Wolfspyre 1636590867 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING 1636590867 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/c114947f-2aa3-4111-85c8-93807d36e292': best uberblock found for spa $import. txg 10512767 1636590867 spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=10512767 1636590867 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590867 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590867 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590867 spa.c:8337:spa_async_request(): spa=$import async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=$import async request task=2048 1636590873 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADED 1636590873 spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING 1636590873 spa.c:6084:spa_import(): spa_import: importing Wolfspyre 1636590873 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): LOADING 1636590873 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/c114947f-2aa3-4111-85c8-93807d36e292': best uberblock found for spa Wolfspyre. txg 10512767 1636590873 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config untrusted): using uberblock with txg=10512767 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636590873 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096VERIFY3(msp->ms_allocated_space >= range_tree_space(msp->ms_freeing)) failed VERIFY3(msp->ms_allocated_space >= range_tree_space(msp->ms_freeing)) failed (135168 >= 761856) 1636591243 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): read 5648 log space maps (5648 total blocks - blksz = 131072 bytes) in 57270 ms 1636591243 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 0, spa Wolfspyre, vdev_id 5, ms_id 1, smp_length 7360, unflushed_allocs 262144, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 5084731 ms, loading_time 64 ms, ms_max_size 304893952, max size error 304893952, old_weight 700000000000001, new_weight 700000000000001 1636591243 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 0, spa Wolfspyre, vdev_id 5, ms_id 28, smp_length 3184, unflushed_allocs 135168, unflushed_frees 131072, freed 0, defer 0 + 0, unloaded time 5084734 ms, loading_time 62 ms, ms_max_size 534667264, max size error 534536192, old_weight 700000000000001, new_weight 700000000000001 1636591243 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 0, spa Wolfspyre, vdev_id 4, ms_id 0, smp_length 11952, unflushed_allocs 430080, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 5084796 ms, loading_time 95 ms, ms_max_size 354832384, max size error 354832384, old_weight 700000000000001, new_weight 700000000000001 1636591244 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_verify found 1 metadata errors and 1 data errors 1636591244 spa_misc.c:403:spa_load_failed(): spa_load(Wolfspyre, config trusted): FAILED: spa_load_verify failed [error=5] 1636591244 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): UNLOADING 1636591244 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_retry: rewind, max txg: 10512766 1636591244 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): LOADING 1636591244 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/2c34ade3-3265-45b1-81b0-3b2ec6131aa9': best uberblock found for spa Wolfspyre. txg 10512766 1636591244 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config untrusted): using uberblock with txg=10512766 1636591245 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591245 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591245 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591245 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591303 metaslab.c:3648:metaslab_condense(): condensing: txg 2019691, msp[19] ffff8915e2237000, vdev id 0, spa boot-pool, smp size 65560, segments 444, forcing condense=FALSE 1636591307 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): read 5648 log space maps (5648 total blocks - blksz = 131072 bytes) in 57280 ms 1636591308 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_verify found 1 metadata errors and 1 data errors 1636591308 spa_misc.c:403:spa_load_failed(): spa_load(Wolfspyre, config trusted): FAILED: spa_load_verify failed [error=5] 1636591308 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): UNLOADING 1636591308 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_retry: rewind, max txg: 10512765 1636591308 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): LOADING 1636591309 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/c114947f-2aa3-4111-85c8-93807d36e292': best uberblock found for spa Wolfspyre. txg 10512765 1636591309 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config untrusted): using uberblock with txg=10512765 1636591309 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591309 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591309 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591309 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591371 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): read 5648 log space maps (5648 total blocks - blksz = 131072 bytes) in 56806 ms 1636591372 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_verify found 1 metadata errors and 1 data errors 1636591372 spa_misc.c:403:spa_load_failed(): spa_load(Wolfspyre, config trusted): FAILED: spa_load_verify failed [error=5] 1636591372 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): UNLOADING 1636591372 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_retry: rewind, max txg: 10512764 1636591372 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): LOADING 1636591372 vdev.c:152:vdev_dbgmsg(): disk vdev '/dev/disk/by-partuuid/c114947f-2aa3-4111-85c8-93807d36e292': best uberblock found for spa Wolfspyre. txg 10512764 1636591372 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config untrusted): using uberblock with txg=10512764 1636591373 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591373 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591373 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591373 spa.c:8337:spa_async_request(): spa=Wolfspyre async request task=4096 1636591434 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): read 5648 log space maps (5648 total blocks - blksz = 131072 bytes) in 56899 ms 1636591437 spa_misc.c:418:spa_load_note(): spa_load(Wolfspyre, config trusted): spa_load_verify found 0 metadata errors and 5 data errors 1636591437 mmp.c:240:mmp_thread_start(): MMP thread started pool 'Wolfspyre' gethrtime 5278287819390 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 2, ms_id 15, smp_length 275960, unflushed_allocs 0, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 5278288 ms, loading_time 59 ms, ms_max_size 2154479616, max size error 2154479616, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 0, ms_id 546, smp_length 356648, unflushed_allocs 296476672, unflushed_frees 74809344, freed 0, defer 0 + 0, unloaded time 5278288 ms, loading_time 60 ms, ms_max_size 2153127936, max size error 2138218496, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 0, ms_id 713, smp_length 221920, unflushed_allocs 340606976, unflushed_frees 130342912, freed 0, defer 0 + 0, unloaded time 5278349 ms, loading_time 72 ms, ms_max_size 2896936960, max size error 2883379200, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 1, ms_id 325, smp_length 504960, unflushed_allocs 550789120, unflushed_frees 54206464, freed 0, defer 0 + 0, unloaded time 5278288 ms, loading_time 139 ms, ms_max_size 2290089984, max size error 2277597184, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 1, ms_id 468, smp_length 281000, unflushed_allocs 428285952, unflushed_frees 171515904, freed 0, defer 0 + 0, unloaded time 5278427 ms, loading_time 85 ms, ms_max_size 2262106112, max size error 2258239488, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 2, ms_id 141, smp_length 507560, unflushed_allocs 888864768, unflushed_frees 201637888, freed 0, defer 0 + 0, unloaded time 5278427 ms, loading_time 97 ms, ms_max_size 2688352256, max size error 2655731712, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 2, ms_id 173, smp_length 162656, unflushed_allocs 312688640, unflushed_frees 91111424, freed 0, defer 0 + 0, unloaded time 5278525 ms, loading_time 45 ms, ms_max_size 2758590464, max size error 2750849024, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591437 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 1, ms_id 540, smp_length 345384, unflushed_allocs 0, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 5278512 ms, loading_time 68 ms, ms_max_size 2333810688, max size error 2333810688, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591439 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 1, ms_id 590, smp_length 61000, unflushed_allocs 0, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 5280581 ms, loading_time 42 ms, ms_max_size 2179080192, max size error 2179080192, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591439 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 0, ms_id 727, smp_length 55928, unflushed_allocs 129941504, unflushed_frees 132620288, freed 0, defer 0 + 0, unloaded time 5280581 ms, loading_time 44 ms, ms_max_size 3126763520, max size error 3096371200, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591439 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 2, ms_id 278, smp_length 38424, unflushed_allocs 542277632, unflushed_frees 0, freed 0, defer 0 + 0, unloaded time 5280581 ms, loading_time 59 ms, ms_max_size 2413092864, max size error 2413092864, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591439 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 0, ms_id 783, smp_length 166968, unflushed_allocs 525230080, unflushed_frees 46481408, freed 0, defer 0 + 0, unloaded time 5280626 ms, loading_time 50 ms, ms_max_size 2185601024, max size error 2182414336, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591439 metaslab.c:2438:metaslab_load_impl(): metaslab_load: txg 10512767, spa Wolfspyre, vdev_id 1, ms_id 642, smp_length 87504, unflushed_allocs 1131610112, unflushed_frees 260440064, freed 0, defer 0 + 0, unloaded time 5280678 ms, loading_time 41 ms, ms_max_size 2319802368, max size error 2296479744, old_weight 7c0000000000001, new_weight 7c0000000000001 1636591439 metaslab.c:3896:metaslab_flush(): flushing: txg 10512767, spa Wolfspyre, vdev_id 1, ms_id 439, unflushed_allocs 0, unflushed_frees 0, appended 0 bytes 1636591569 metaslab.c:3648:metaslab_condense(): condensing: txg 2019743, msp[51] ffff890d81a0c800, vdev id 0, spa boot-pool, smp size 65920, segments 404, forcing condense=FALSE 1636591575 metaslab.c:3648:metaslab_condense(): condensing: txg 2019744, msp[18] ffff8915e2232800, vdev id 0, spa boot-pool, smp size 65904, segments 1219, forcing condense=FALSE zpool import pine# pine# echo 0 > /sys/module/zfs/parameters/spa_load_verify_data pine# echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata pine# echo 1 > /sys/module/zfs/parameters/spa_load_verify_data pine# echo 1 > /sys/module/zfs/parameters/spa_load_verify_metadata pine# zpool import Wolfspyre pine# echo 1 > /sys/module/zfs/parameters/zfs_recover pine# echo 1 > /sys/module/zfs/parameters/zfs_dbgmsg_enable pine# tail -f /proc/spl/kstat/zfs/dbgmsg OMG I HAVE DATA! COPY IT OFF FUCKING NOW!!!!—- https://forums.freebsd.org/threads/zfs-problem-need-help-recovering.71242/ ↩︎ https://groups.google.com/g/mailing.freebsd.hackers/c/QzQA32BmaqQ/m/AoLEZrfkElMJ ↩︎ https://forum.proxmox.com/threads/zfs-pool-failure-after-power-outage.90400/ ↩︎ https://forums.freebsd.org/threads/introduce-new-zfs-exploration-and-data-recovery-tool-zfs-spy.67522/#post-400824 ↩︎ https://imoriath.com/zfsspy/ ↩︎ https://github.com/Stefan311/ZfsSpy ↩︎ https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html ↩︎ https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#zfs-recover ↩︎ https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html#import ↩︎ https://raw.githubusercontent.com/richardelling/zfs-linux-tools/master/kstat-analyzer ↩︎