jzawand a lovely Bah Humbug to you all on this dim and drizzly xmas morn :P
MilkmanDanHappy Festivus!
DHEACTION has no electricity, so humbug indeed
chris2ircing via avian carrier?
DHEdid he just...
varesaI am building a KVM virtualization box with 6 disks which I plan to make 3x raid1. Would ZOL bring any advantages over using e.g. mdraid+LVM?
DHEsnapshots, compression (admittedly minimal to no benefit with zvols) and checksums
DHEthe downside is that ZVOLs can suffer more fragmentation than regular (fat provisioned) LVs.
varesaI've used zfs a bit on freenas etc. boxes and liked it. However the fact that zfs is not supported out-of-stock, harder to boot from it, LVM support in KVM, etc. made me drop my zfs plans
varesafigured out I'd ask on this side of the fence though :)
DHElots of people run out-of-tree modules. NVIDIA driver, broadcom wireless, and now ZFS driver
pink_mistmost often those are not necessary to bring a usable system up though :P the zfs driver would be if root was on zfs =)
DHEdepends. maybe, maybe not. we don't know enough about varesa's case
varesaon my desktop/laptop I have all kinds of modules, including nvidia and broadcom. However I like to keep my server "clean" unless there is a significant benefit
varesafor example a kernel update breaking something and resulting in me unable to access any important VMs is not a pleasant thing
MilkmanDanvaresa: If it's going to be KVM won't you want to spin up multiple VMs of the same base OS?
MilkmanDanZFS will let you do that from a single OS image. Just snapshot a pristine install and give each VM a copy.
MilkmanDanIf you boot from ZFS you can snapshot the OS as well and roll back to a previous version if something breaks.
MilkmanDanThat's good for desktops as well.
DHEtypically if a kernel upgrade goes bad then you have GRUB boot a previous version
varesaI guess I'll try zfs out when the hardware arrives in a few days
MilkmanDanIt really is the best fs available.
ComnenusIs there actually any (up to date) documentation for zfs on linux?
chungyman zfs, man zpool
jaakkosComnenus: you will get quite far with Oracle ZFS Administration Guide.
Comnenusjaakkos: not many differences? Is there anything offhand for stuff I should skip over?
jaakkosComnenus: the things to skip over are somewhat obvious
Comnenusfair enough
chungyWhat kind of documentation do you need exactly? Getting started?
chungythis whole series of articles is good, though slightly dated
chungythe main thing it misses is that lz4 compression tends to be the recommended baseline, rather than lzjb
eightyeighthow is it dated?
chungythere's been a lot of new features introduced since it was written
chungylz4 compression is the big one :)
chungyNothing that makes it wrong, it just could use an update
eightyeighton https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/ ?
jasonwcI'm trying to use raider (see http://raider.sourceforge.net/) to convert my single OS SSD disk to an mdadm RAID1. I tested in a Virtualbox VM using a clean install of Debian 7.7 and it worked perfectly. However, on my production server with ZFS it stalls at update-initramfs -u and does not recognize the Grub version. I noticed that the system uses a custom version of Grub (2.0) compatible with ZFS. Is there any reason I
jasonwc need this version of Grub if I'm not using ZFS for my root FS?
bekksI'd take a proper backup, reinstall cleanly on a RAID1, and restore the backup.
jasonwcIs that really the best way?
bekksAt least for a production server.
jasonwcIn my VM, raider worked perfectly so I'm assuming the problem is its interaction with this particular version of Grub
DHEComnenus: the biggest differences between solaris ZFS (v28) and OpenZFS (Linux) are LZ4 compression as an alternative [and better] algorithm, and that 'zfs destroy' isn't the slow and blocking operation it once was
DHEmore coming in 0.6.4
bekksDHE: what are the differences to v31+ ?
jasonwcIn any case, do I need the version of Grub installed by ZFS if I'm not booting from ZFS?
jasonwcWell, that is rather bizarre. The reason it showed no Grub version is that grub-pc wasn't installed. Yet, it booted from Grub. Odd...
DHEbekks: Solaris has native encryption as the big one.
DHEv28 is the forking point. Solaris went one way, OpenZFS went another
DHEI use wikipedia's ZFS article as my version reference for solaris
DHEoh yeah, sequential resilver
jasonwcOdd, sudo apt-get install grub-pc and it worked. :/
jasonwcsequential resilver looks nice
bekksDHE: Ah, ok :)
ComnenusDHE: I guess I'm more interested in what the differences would be for administration. Stuff like that.
DHEComnenus: other than some udev concerns it's basically identical. users are encouraged to use /dev/disk/*/* paths for device names so that they stay constant
ComnenusDHE: fair enough. thank you!
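A minimal sketch of what the stable-path advice looks like in practice, assuming a Linux udev layout with a /dev/disk/by-id directory. The function name stable_device_names is made up for illustration; it only lists the persistent symlinks and the kernel devices they currently resolve to, and does not touch any pool.

```python
import os

def stable_device_names(by_id_dir="/dev/disk/by-id"):
    """Map each persistent symlink under by_id_dir to the device it resolves to."""
    mapping = {}
    if not os.path.isdir(by_id_dir):
        # No udev-managed directory here (assumption: non-Linux or minimal system).
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, name)
        if os.path.islink(path):
            # The symlink name stays constant; the target (e.g. /dev/sdb) may change.
            mapping[path] = os.path.realpath(path)
    return mapping

if __name__ == "__main__":
    for link, target in stable_device_names().items():
        print(f"{link} -> {target}")
```

The left-hand names are the ones to hand to zpool, since the right-hand kernel names can be reassigned between boots.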
_br_-What is the right way to partition my ssd to speed up zfs for zil, etc?
bekksNo partition at all. Either use the entire disk as ZIL or don't.
_br_-Hm, I see.. so it makes no sense to use part of it for l2arc? I mean I just have one ssd disk unfortunately..
bekksYou cannot use a part of the disk.
_br_-oh, really? Odd, I thought here he is using partition? https://pthree.org/2012/12/06/zfs-administration-part-iii-the-zfs-intent-log/
_br_-why would that not work on a partition?
bekksBecause it is not implemented to be working on a partition?
_br_-Ok, I need to go back to more rtfm, thanks bekks
cyberbootjedasjoe: you there?
DHEI would object to that. I'd make a ~5 GB (or less) partition for a ZIL
DHEmainly for wear leveling. the ZIL doesn't need to be huge
_br_-So, I actually can partition the ssd?
_br_-Given that competing resources are always bad, and you have to benchmark. Does it make sense to have ZIL and L2ARC on one SSD?
eightyeightcalculating the practical size of your ZIL space isn't that difficult
_br_-interesting, eightyeight, do you have any links describing that? curious...
_br_-thanks for the comments btw. guys, appreciate it
eightyeightlook at your write throughput to disk, multiply it by 5 (default), and that will be ZIL size
_br_-gotcha, very helpful advice
eightyeightso, if you're writing 100 MBps to disk sustained, then every TXG flush (default 5s), you'll want to write 500MB
_br_-hm I see
eightyeightin practice, I have personally had a hard time seeing more than 1GB that is needed for the ZIL
_br_-can't hurt I guess to keep some buffer
eightyeightdepends on your pool layout, your storage needs, and your TXG timeout, of course
_br_-variables variables, gotcha.. :)
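The rule of thumb above (sustained write throughput times the TXG timeout) can be sketched as a tiny calculation. This is only an illustration of that arithmetic: the function name slog_size_mb and the headroom parameter are assumptions added here, not anything ZFS itself provides.

```python
def slog_size_mb(write_mb_per_s, txg_timeout_s=5.0, headroom=1.0):
    """Estimate log-device space in MB as write rate x TXG timeout x headroom."""
    if write_mb_per_s < 0 or txg_timeout_s <= 0 or headroom < 1.0:
        raise ValueError("rate must be >= 0, timeout > 0, headroom >= 1")
    return write_mb_per_s * txg_timeout_s * headroom

# 100 MB/s sustained with the default 5 s TXG timeout, as in the example above.
print(slog_size_mb(100))                 # 500.0
# Same idea with a 2x buffer on top (the buffer factor is an assumption).
print(slog_size_mb(115, headroom=2.0))   # 1150.0
```

Both results land well under the few-GB partition sizes suggested in the discussion.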
SachiruCorrect me if I'm wrong, but the proposed VDEV eviction code is supposed to do so by means of an indirection layer of sorts?
Sachirubekks, please avoid spreading misinformation.
SachiruI am using ZIL on a partition.
SachiruIt works fine.
Sachiru_br_-, L2ARC + ZIL on one disk is fine. L2ARC is read-heavy, ZIL is write-heavy. They do not contend for the same resource.
Sachiru_br_-, I find that a 4 GB partition + 64 GB slack space on an SSD is a decent medium for ZIL. It's not as if SSDs easily fail. Cloudflare recently moved to all SSDs for their edge servers in part because of the very low failure rate of SSDs, and when you consider that edge servers have a lot of data churn because they cache and evict a lot...
_br_-Sachiru: Thank you for that comment very helpful!
Sachiru_br_-, you have two choices: You either use a 68 GB partition for ZIL, or you use a 4 GB partition and leave 64 GB of space unpartitioned on your drive. If you go with the 68 GB ZIL partition, you are essentially creating a 4 GB ZIL and dedicating 64 GB for wear-leveling and garbage collection to the ZIL (meaning that other partitions cannot use this space for wear leveling, IIRC). If you go with 4 GB ZIL + 64 GB slack space, it's the same
Sachiruthing, except other partitions can use the 64 GB slack space for their own wear-leveling as well.
SachiruEither way, unless you are doing 10 Gigabit ethernet, a 4 GB ZIL + 64 GB slack is WAY overprovisioned for your pool.
_br_-Very interesting, will experiment with this... cheers!
SachiruSince, well, gigabit ethernet can do a max theoretical of 115 MB/sec, multiply that by 4 and you get around 500 MB/sec. Even in my tests I rarely go over 600 MB utilization in my ZIL.
jasonwcWill scheduling daily short S.M.A.R.T. tests with smartd interfere with a scrub? I know that it isn't recommended to run a long test but I figure a one minute short test would be harmless.
Sachirujasonwc, it will slow down the scrub for the duration of the test.
jasonwcah, the warnings i read suggested the scrub would never finish
jasonwca 1 minute slowdown is irrelevant
SachiruIt may result in multiple interruptions that cause the scrub to never finish as well.
jasonwci set the long tests to run so as to not interfere with the scrubs
jasonwcwe'll see if the short tests have any impact
DHEI think a daily smart test is a little over the top
SachiruMe too
SachiruOnce every week is enough
SachiruIf a drive is failing it will show up in SMART attributes anyway.
jasonwcDHE, even for the short test (1 minute)?
jasonwci set short test to daily, long test to weekly
jasonwcIs there a way to get verbose output from zfs send/recv so that you can see how much time has elapsed and when it is expected to complete?
jasonwcfor example, as you can do with rsync
DHEsend -v will print how much it's written, and when it starts it prints a size estimate
jasonwcAnd -R will include all descendant filesystems and snapshots up to the current snapshot?
DHE(which I find to be generally incorrect by anything from 10% to 100%)
DHEif you send $filesystem@$snapshot then $filesystem and every dataset under it must have a snapshot named $snapshot
DHE-R is weird in a number of ways. even when doing incrementals it will delete any snapshots on the receiving side which were deleted on the sending side in the intervening time
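A rough sketch of the precondition DHE describes for send -R: every dataset at or below the one being sent needs a snapshot with the same name. The helper below is hypothetical and works on plain lists of names, such as the lines printed by zfs list -H -r -o name and zfs list -H -r -t snapshot -o name; the child dataset names in the example are invented for illustration.

```python
def datasets_missing_snapshot(dataset_names, snapshot_names, root, snap):
    """Return datasets at or below root that have no snapshot named snap."""
    have = set(snapshot_names)
    prefix = root + "/"
    in_tree = [d for d in dataset_names if d == root or d.startswith(prefix)]
    return [d for d in in_tree if f"{d}@{snap}" not in have]

# Invented example: one child dataset lacks the snapshot.
datasets = ["data/documents", "data/documents/work", "data/documents/scans"]
snapshots = ["data/documents@nightly", "data/documents/work@nightly"]
print(datasets_missing_snapshot(datasets, snapshots, "data/documents", "nightly"))
```

An empty result would mean the recursive send has the snapshots it needs; anything listed would have to be snapshotted first.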
jasonwcso I took a recent snapshot and did a send/recv
jasonwc zfs send -v -R data/documents@zfs-auto-snap_hourly-2014-12-26-0317 | zfs recv Backups/documents
jasonwcyet the zfs list output doesn't match
jasonwcBackups/documents 21.9G 7.92T 20.3G /Backups/documents
jasonwcdata/documents 22.3G 16.8T 20.7G /data/documents
DHE400 megabytes could be explained by compression and block size subtle differences
jasonwcoh yeah
jasonwcforgot to enable compression on the receiving end
jasonwcWhat's the easiest way to update the backup nightly using an incremental send/recv?
DHEwell send -R will copy properties, but only if they're locally set
DHEI don't THINK it copies inherited properties
jasonwcyeah, i never enabled compression locally
jasonwcyeah and zfs get shows compression is off
jasonwcso that explains the difference in size
jasonwcbtw, as to your prior comment, the Sun/Oracle docs say "If zfs recv -F is not specified when receiving the replication stream, dataset destroy operations are ignored. The zfs recv -F syntax in this case also retains its rollback if necessary meaning."
jasonwcby default send -R replicates rename and destroy operations but if you omit -F on the receiving side it'll ignore destroy operations
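The nightly-update question above did not get a direct answer in the channel, so the following is only one possible sketch, not a recommended procedure. It assumes an earlier snapshot already exists on both sides, uses zfs send -R -I to carry every intermediate snapshot, and adds recv -F only on request, since -F also rolls the target back and replays destroys. The function names and the previous snapshot name are hypothetical.

```python
import subprocess

def incremental_commands(source, prev_snap, new_snap, target, force=False):
    """Build argv lists for an incremental replication send piped into recv."""
    send = ["zfs", "send", "-v", "-R", "-I",
            f"{source}@{prev_snap}", f"{source}@{new_snap}"]
    # Without -F, destroy operations in the replication stream are ignored.
    recv = ["zfs", "recv"] + (["-F"] if force else []) + [target]
    return send, recv

def run_pipeline(send, recv):
    """Run send | recv and report whether both processes exited cleanly."""
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(recv, stdin=sender.stdout)
    sender.stdout.close()  # let the sender see a closed pipe if recv exits early
    recv_rc = receiver.wait()
    send_rc = sender.wait()
    return send_rc == 0 and recv_rc == 0

# Only prints the commands; "previous-snapshot" is a placeholder name.
send, recv = incremental_commands(
    "data/documents", "previous-snapshot",
    "zfs-auto-snap_hourly-2014-12-26-0317", "Backups/documents")
print(" ".join(send), "|", " ".join(recv))
```

Calling run_pipeline(send, recv) from a nightly cron job would perform the transfer; it is left out of the example so that nothing is executed by accident.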