NFS. The very bane of my existence. Well, maybe not quite, but it sure makes my life hell. You can’t narrow down problem or runaway areas because what little statistics gathering it does has no granularity. There’s no authentication, and little to no access control.
Please, don’t start on NFSv4. Linux can hardly even support NFSv3, and the 2.6 kernel is worse than 2.4. At least 2.4 doesn’t completely lose track of locks and locking the way 2.6 clients do. Somewhat randomly, 2.6 clients completely forget to clean up after NFS locks. Don’t believe me?
    web4:~# uname -a
    Linux web4 2.6.18-5-amd64 #1 SMP Tue Oct 2 20:37:02 UTC 2007 x86_64 GNU/Linux
    web4:~# cat /proc/locks
    1: FLOCK ADVISORY WRITE 25347 00:33:6591864 0 EOF
    2: POSIX ADVISORY READ 21417 00:23:3323129 0 426
    3: POSIX ADVISORY READ 12981 00:23:3323129 0 426
    4: POSIX ADVISORY READ 14067 00:23:3323129 0 426
    5: POSIX ADVISORY WRITE 15061 08:09:48938 0 9223372036854775806
    6: POSIX ADVISORY WRITE 15061 08:09:48936 0 9223372036854775806
    7: POSIX ADVISORY WRITE 15061 08:09:48934 0 9223372036854775806
    8: POSIX ADVISORY WRITE 15061 08:09:48932 0 9223372036854775806
    9: POSIX ADVISORY WRITE 15061 08:09:48929 0 9223372036854775806
    10: POSIX ADVISORY WRITE 15061 08:09:48895 0 9223372036854775806
    11: POSIX ADVISORY WRITE 15061 08:09:48893 0 9223372036854775806
    12: FLOCK ADVISORY WRITE 14354 08:08:402423 0 EOF
    13: FLOCK ADVISORY WRITE 3263 08:08:96619 0 EOF
    14: POSIX ADVISORY WRITE 2628 08:08:322011 1024 2047
    web4:~# stat /proc/14354
    stat: cannot stat `/proc/14354': No such file or directory
    web4:~# cat /proc/locks
    1: FLOCK ADVISORY WRITE 25464 00:24:1986016 0 EOF
    2: FLOCK ADVISORY WRITE 25463 00:23:106101 0 EOF
    3: POSIX ADVISORY READ 21417 00:23:3323129 0 426
    4: POSIX ADVISORY READ 12981 00:23:3323129 0 426
    5: POSIX ADVISORY READ 14067 00:23:3323129 0 426
    6: POSIX ADVISORY WRITE 15061 08:09:48938 0 9223372036854775806
    7: POSIX ADVISORY WRITE 15061 08:09:48936 0 9223372036854775806
    8: POSIX ADVISORY WRITE 15061 08:09:48934 0 9223372036854775806
    9: POSIX ADVISORY WRITE 15061 08:09:48932 0 9223372036854775806
    10: POSIX ADVISORY WRITE 15061 08:09:48929 0 9223372036854775806
    11: POSIX ADVISORY WRITE 15061 08:09:48895 0 9223372036854775806
    12: POSIX ADVISORY WRITE 15061 08:09:48893 0 9223372036854775806
    13: FLOCK ADVISORY WRITE 14354 08:08:402423 0 EOF
    14: FLOCK ADVISORY WRITE 3263 08:08:96619 0 EOF
    15: POSIX ADVISORY WRITE 2628 08:08:322011 1024 2047
    web4:~#
Yeah, PID 14354 is LONG GONE but still holding a lock. Yes, I know they’re advisory, but 2.4 doesn’t have a problem cleaning up and getting rid of locks when processes die. Back with 2.6.8 we tried to convert our NFS server from 2.4 to 2.6, and that was a complete disaster. The 2.6.8 NFS server would just randomly spit out permission denied errors to clients under load. We never tried again.

NFS isn’t where it’s at for us. We’re moving to AFS. AFS isn’t without pitfalls either, mind you. We’ll have to do a LOT of work to get there, including some custom code, most notably to support device inodes. There’s also a pretty bad bug in the Linux AFS client that we’re hoping to see fixed soon.
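
If you want to repeat the check from the transcript without eyeballing it, here is a rough Python sketch that does the same two steps: read /proc/locks, then see whether each listed PID still has a /proc entry. The function names (`parse_locks`, `pid_alive`, `orphaned_locks`) are made up for illustration, and the column layout is assumed to match the output shown above. Treat a hit as a lead to investigate rather than proof: a FLOCK lock belongs to the open file, so it can legitimately outlive the process that took it if a child inherited the descriptor.

```python
import os


def parse_locks(text):
    """Parse /proc/locks-style text into a list of dicts, skipping other lines."""
    locks = []
    for line in text.splitlines():
        # Blocked waiters carry a "->" marker; drop it so columns line up.
        fields = [f for f in line.split() if f != "->"]
        # Expected shape: "12: FLOCK ADVISORY WRITE 14354 08:08:402423 0 EOF"
        if len(fields) < 8 or not fields[0].endswith(":"):
            continue
        if not fields[0][:-1].isdigit():
            continue  # e.g. a "stat: cannot stat ..." error line
        try:
            pid = int(fields[4])
        except ValueError:
            continue
        locks.append({
            "kind": fields[1],    # FLOCK or POSIX
            "access": fields[3],  # READ or WRITE
            "pid": pid,
            "file": fields[5],    # major:minor:inode
        })
    return locks


def pid_alive(pid, proc_root="/proc"):
    """Same test as `stat /proc/<pid>` in the transcript."""
    return os.path.isdir(os.path.join(proc_root, str(pid)))


def orphaned_locks(text, alive=pid_alive):
    """Return locks whose owning PID no longer has a /proc entry."""
    return [
        lock for lock in parse_locks(text)
        if lock["pid"] > 0 and not alive(lock["pid"])
    ]
```

On a Linux box, `orphaned_locks(open("/proc/locks").read())` gives the list of suspects. The `alive` argument is only there so the check can be swapped out when testing against a saved copy of the output.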