151 comments

  • jedberg 3 days ago ago

    My NFS story: In my first job, we used NFS to maintain the developer desktops. They were all FreeBSD and remote-mounted /usr/local. This worked great! Everyone worked in the office on a fast local network, and it made it easy for us to add or update apps and have everyone magically get it. And when the NFS server had a glitch, our devs could usually just reboot and fix it, or wait a bit. Since they were all systems developers they all understood the problems with NFS and the workarounds.

    What I learned though was that NFS was great until it wasn't. If the server hung, all work stopped.

    When I got to reddit, solving code distribution was one of the first tasks I had to take care of. Steve wanted to use NFS to distribute the app code. He wanted to have all the app servers mount an NFS mount, and then just update the code there and have them all automatically pick up the changes.

    This sounded great in theory, but I told him about all the gotchas. He didn't believe me, so I pulled up a bunch of papers and blog posts, and actually set up a small cluster to show him what happens when the server goes offline, and how none of the app servers could keep running as soon as they had to get anything off disk.

    To his great credit, he trusted me after that when I said something was a bad idea based on my experience. It was an important lesson for me that even with experience, trust must be earned when you work with a new team.

    I set up a system where app servers would pull fresh code on boot and we could also remotely trigger a pull or just push to them, and that system was reddit's deployment tool for about a decade (and it was written in Perl!)

    • ninkendo 3 days ago ago

      I was at Apple around 15 years ago working as a sysadmin in their hardware engineering org, and everything - and I mean everything - was stored on NFS. We ran a ton of hardware simulation, all the tools and code were on NFS as well as the actual designs and results.

      At some point a new system came around that was able to make really good use of the hardware we had, and it didn’t use NFS at all. It was more “docker” like, where jobs ran in containers and had to pre-download all the tools they needed before running. It was surprisingly robust, and worked really well.

      The designers wanted to support all of our use cases in the new system, and came to us about how to mount our NFS clusters within their containers. My answer was basically: let’s not. Our way was the old way, and their way was the new way, and we shouldn’t “infect” their system with our legacy NFS baggage. If engineers wanted to use their system they should reformulate their jobs to declare their dependencies up front and use a local cache, and accept all the other reasonable constraints their system had. They were surprised by my answer, but I think it was the right call: it was the impetus for things to finally move off the legacy infrastructure, and it worked out well in the end.

    • neilv 3 days ago ago

      I remember that era of NFS.

      NFS volumes (for home dirs, SCM repos, tools, and data) were a godsend for workstations with not enough disk, and when not everyone had a dedicated workstation (e.g., university), and for diskless workstations (which we used to call something rude, and now call "thin clients"), and for (an ISV) facilitating work on porting systems.

      But even when you needed a volume only very infrequently, if there was a server or network problem, then even doing an `ls -l` in the directory where the volume's mount point was would hang the kernel.

      Now that we often have 1TB+ of storage locally on a laptop workstation (compared to the 100MB default of an early SPARCstation), I don't currently need NFS for anything. But NFS is still a nice tool to have in your toolbox, for some surprise use case.

      > To his great credit, he trusted me after that when I said something was a bad idea based on my experience. It was an important lesson for me that even with experience, trust must be earned when you work with a new team.

      True, though, on a risky moving-fast architectural decision, even with two very experienced people, it might be reasonable to get a bit more evidence.

      And in that particular case, it might be that one or both of you were fairly early in your career, and couldn't just tell that they could bet on the other person's assessment.

      Though there are limits to needing to re-earn trust from scratch with a new team. For example, the standard FAANG-bro interview of everyone having to start from scratch for credibility, like they are fresh out of school with zero track record and zero better ways to assess them, is ridiculous. The only thing more ridiculous is when companies that pay vastly less try to mimic that interview style. Every time I see that, I think that this company apparently doesn't have experienced engineers on staff who can get a better idea just by talking with someone, rather than running a fratbro hazing ritual.

      • jjav 3 days ago ago

        > Now that we often have 1TB+ of storage locally on a laptop workstation (compare to the 100MB default of an early SPARCstation), I don't currently need NFS for anything.

        While diskless (or very limited disk) workstations were one use case for NFS, that was far from the primary one.

        The main use case was to have a massive shared filesystem across the team, or division, or even whole company (as we did at Sun). You wouldn't want to be duplicating these files locally no matter how much local disk, the point was to have the files be shared amongst everyone for collaboration.

        NFS was truly awesome; it is sad that everything these days is subpar. We use weak substitutes like having files on shared Google Drives, but that is so much worse than having the files of the entire company mounted on the local filesystem through NFS.

        (Using the past tense, since it's not used so much anymore, but my home fileserver exports directories over NFS which I mount on all other machines and laptops at home, so very much using it today, personally.)

        • neilv 2 days ago ago

          Other things that changed were the Web, and the popularity of Git.

          For example, one of the big uses of NFS we had was for engineering documents, all of which could be accessed from FrameMaker or Interleaf running on your workstation. Nowadays, all the engineering documentation and more would be accessed through a Web browser from a non-NFS server, and no longer on a shared filesystem.

          Another use of NFS we had was for collaborating on shared code by some projects, with SCM storing to NFS servers (other projects used DSEE and ClearCase). But nowadays almost everyone in industry uses distributed Git, syncing to non-NFS servers, with cached copies on their local storage.

          I suppose a third thing that changed was CSCW distributed change syncing becoming popular and moving into other tools, such as live "shared whiteboard" document editing that people can access in their Web browsers. I have mixed feelings about some of the implementations and how they're deployed, but it's pretty wild to have 4 remote people during Covid editing a document in real time, and NFS isn't helping with the hard part of that.

          Right now, the use case for NFS that first comes to mind is individual humans working with huge files (e.g., for AI training, or other big data), where you want the convenience of being able to access them with any tool from your workstation, and maybe also have big compute servers working with them, without copying things around. You could sorta do these things with big complicated MLops infrastructure, but sometimes that slows you down more than it speeds you up.

          • jjav 15 hours ago ago

            > and the popularity of Git

            github, specifically, I'd say.

            github normalized the (weird) idea that the central repo is over on someone else's website.

            You don't have to use git that way though. My internal git repositories are on NFS, available to all client machines.

            • neilv 3 hours ago ago

              Interesting. I self-host Forgejo or GitLab, with SSH or HTTPS access from workstations' local repos, to the "origin" Git server.

              The advantage you find to NFS for this is that you share workspaces between the client machines? Or reduce the local storage requirements on the client machines?

    • bsder 3 days ago ago

      > What I learned though was that NFS was great until it wasn't. If the server hung, all work stopped.

      Sheds a tear for AFS (Andrew File System).

      We had a nice, distributed file system that even had solid security and didn't fail in these silly ways--everybody ignored it.

      • cherrycherry98 3 days ago ago

        Morgan Stanley was a heavy user of AFS for deploying software and might still be for all I know.

        "Most Production Applications run from AFS"

        "Most UNIX hosts are dataless AFS clients"

        https://web.archive.org/web/20170709042700/http://www-conf.s...

        • lmm 3 days ago ago

          That was still in place at least when I left, and I'd be amazed if it got replaced. It was one of those wonderful pieces of infrastructure that you rarely even notice because it just quietly works the whole time.

        • hinkley 3 days ago ago

          NCSA also used it for some data archival and I believe for hosting the website files.

          I looked up at one point whatever happened to AFS and it turns out that it has some Amdahl’s Law glass ceiling that ultimately limits the aggregate bandwidth to something around 1 GBps, which was fine when it was young but not fine when 100Mb Ethernet was ubiquitous and gigabit was obtainable with deep enough pockets. If adding more hardware can’t make the filesystem faster you’re dead.

          I don’t know if or how openAFS has avoided these issues.

          • jaltman 2 days ago ago

            The Amdahl's Law limitations are specific to the implementation and not at all tied to the protocols. The 1990 AFS 3.0 server design was built upon a cooperative threading system ("Light Weight Processes") designed by James Gosling as part of the Andrew Project. Cooperative threading influences the design of the locking model since there is no simultaneous execution between tasks. When the AFS fileserver was converted to pthreads for AFS 3.5, the global state of each library was protected by wrapping it with a global mutex. Each mutex was acquired when entering the library and dropped when exiting it. To complete any fileserver RPC required acquisition of at least six or seven global mutexes, depending upon the type of vnode being accessed. In practice, the global mutexes restricted the fileserver process to 1.7 cores regardless of how many cores were present in the system.

            AuriStor's RX and UBIK protocol and implementation improvements would be worthless if the application services couldn't scale. To accomplish this required converting each subsystem so it could operate with minimal lock contention.

            This 2023 presentation by Simon Wilkinson describes the improvements that were made to AuriStor's RX implementation up to that point.

            https://www.auristor.com/downloads/auristor-rx-hare-and-the-...

            The RX tortoise is catching up with the TCP hare.

              Connecting to [10.0.2.15]:2345
              RECV: threads   1, times        1, bytes        2000000000:          881 msec   [18.15 Gbit/s]
            • hinkley a day ago ago

              Wow that’s a lot of info.

              > In practice, the global mutexes restricted the fileserver process to 1.7 cores regardless of how many cores were present in the system.

              So in theory the bandwidth could scale with single-CPU performance and/or point-to-point bandwidth, but it couldn't scale horizontally at all, except on the new implementations.

              • jaltman 19 hours ago ago

                Correct, and the point-to-point bandwidth is limited by the maximum RX window size because of the bandwidth delay product. As round-trip latency increases, at some point the window size becomes insufficient to keep the pipe full, at which point data transfers stall.

                One site which recently lifted and shifted their AFS cell to a cloud made the following observations:

                We observed the following performance while copying a 1g file from local disk into AFS.

                  AuriStor Client (2021.05-65) -> OpenAFS server (1.6.24): 3m.11s
                
                  AuriStor Client (2021.05-65) -> AuriStor Server (2021.05-65): 1m
                
                  AuriStor Client (2025.00.11) -> AuriStor Server (2025.00.11): 30s
                
                All of the above tests were performed from clients located on campus to fileservers located in the cloud.

                There are many RX implementation differences between the three versions. It is important to note that the window size grows from 32 -> 128 -> 512.

          • dotwaffle 2 days ago ago

            I know quite a few AFS systems that moved to AuriStor's YFS: https://www.auristor.com/openafs/migrate-to-auristor/auristo...

            As I understand it, it mitigated many of those issues, but is still very "90s" in operation.

            I've been flirting with the idea of writing a replacement for years, about time I had a go at it!

            • hinkley 2 days ago ago

              I may be confusing two systems, but I believe that AFS system also encompassed the first iteration of “AWS Glacier” I encountered in the wild: big storage that required queueing a job to a tape array, or pinging an undergrad to manually load something, for retrieval.

      • nine_k 3 days ago ago

        AFS implements weak consistency, which may be a bit surprising. It also seems to share objects, not block devices. Judging by its features, it seems to make most sense when there is a cluster of servers. It looks cool though, a bit more like S3 than like NFS.

        • jaltman 2 days ago ago

          The cephfs model of a file system logically constructed from an object store closely mirrors the AFS architecture. The AFS fileserver is horribly misnamed. Whereas the AFS 1.0 fileserver exported the contents of local filesystems much as NFS and CIFS do, AFS 2.x/3.x/OpenAFS/AuriStorFS fileservers export objects (aka vnodes) which are stored in an object store. Each AFS vice partition stores zero or more object stores, each consisting of the objects belonging to a single volume group. A volume group consists of one or more RWVOL, ROVOL and/or BACKVOL instances.

          The AFS consistency model is fairly strong. Each client (aka cache manager) is only permitted to access the data/metadata of a vnode if it has been issued a callback promise from the AFS fileserver. File lock transitions, metadata modifications, and data modifications, as well as volume transactions, cause the fileserver to break the promise, at which point the client is required to fetch updated status information before it can decide whether it is safe to reuse the locally cached data.

          Unlike optimistic locking models, the AFS model permits cached data to be validated after an extended period of time by requesting up to date metadata and a new callback promise.

          An AFS fileserver will not permit a client to perform a state changing operation as long as there exist broken callback promises which have yet to be successfully delivered to the client.

      • tempay 3 days ago ago

        I'm still at a place where AFS has been chugging along for many years after its much-exaggerated demise.

        It never ceases to amaze me how well it does what it does and how well it handles being misused.

      • slater 3 days ago ago

        Looks like these guys are still truckin' along?

        https://www.openafs.org/

      • burnt-resistor 3 days ago ago

        And heavily used by top universities for the past ~35 years.

      • fuzztester 3 days ago ago

        Interesting. I think I had casually come across the term AFS (and its full form) back in the day, but never looked into it deeply.

        Why did everybody ignore it, do you know?

        • jaltman 2 days ago ago

          Not everyone ignored it, but unlike NFS it didn't come in the box with the operating system, and you had to pay for it. In addition, AFS provided strong cryptographic authentication and wire privacy, which meant that it couldn't be licensed in many countries because the U.S. government did not grant appropriate export licenses.

          I often wonder how the world would be different if AFS 3.0 could have been freely distributed world wide in 1989 precluding the need for HTTP to be developed at CERN.

        • bsder 2 days ago ago

          There were a few technical obstacles which other people mentioned, but I think timing was the biggest issue (remember--AFS dates to something like 1983-ish).

          1) AFS, IIRC, required more than one machine in its original configuration. That meant hardware and sysadmins which were expensive--until, suddenly they weren't.

          2) Disk, memory and bandwidth were scarce--and then they weren't. AFS made a bunch of solid architectural decisions, then wasted a bunch of time backing some of them down in deference to the hardware of the day, and then all that work was wasted when Moore's Law overran everything anyhow.

          3) Everybody was super happy to be running everything locally to escape the tyranny of the "Mainframe Operator" (meaning no NFS or AFS or the like)--until they weren't. Once enough non-technical people appeared who didn't want to do system administration, like, ever, that flipped.

          We lost the VMS filesystem, which was also a distributed, remote filesystem, in this timeframe too.

          But those x86 processors sure are cheap ... sigh.

    • zh3 3 days ago ago

      Don't know about FreeBSD but hard hanging on a mounted filesystem is configurable (if it's essential configure it that way, otherwise don't). To this day I see plenty of code written that hangs forever if a remote resource is unavailable.

      • lmm 3 days ago ago

        > Don't know about FreeBSD but hard hanging on a mounted filesystem is configurable (if it's essential configure it that way, otherwise don't).

        In theory that should work, but I find that kind of non-default config option tends to be undertested and unreliable. Easier to just switch to Samba where not hanging is default/expected.

      • muxator 3 days ago ago

        Hi, could you give some pointers about this? Thanks!

        • zh3 3 days ago ago

          It's down to the mount options: use 'soft' and the program trying to access the (inaccessible) server gets an error return after a while, or 'intr' if you want to be able to kill the hung process.

          The caveat is that a lot of software is written to assume things like fread(), fopen() etc. will either quickly fail or work. However, if the file is over a network, obviously things can go wrong, so the common default behaviour is to wait for the server to come back online. The same issue applies to any other network filesystem; different OSes (and even the same OS with different configs) handle the situation differently.
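
          Roughly, on a Linux client (server name and paths below are made up; 'timeo' is in tenths of a second):

            # return an I/O error after the retries expire instead of blocking forever
            mount -t nfs -o soft,timeo=100,retrans=3 fileserver:/export/data /mnt/data

            # or keep the hard mount but make stuck I/O interruptible
            mount -t nfs -o hard,intr fileserver:/export/data /mnt/data

          (On Linux kernels newer than about 2.6.25, 'intr' is accepted but ignored; a process stuck on a hard mount can only be interrupted with SIGKILL.)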

          • dingaling 2 days ago ago

            > after a while

            'After a while' usually requiring the users to wait with an unresponsive desktop environment, because they opened a file manager whilst NFS was huffing. So they'd manage to switch to a virtual terminal and then out of habit type 'ls', locking that up too.

            After a few years of messing around with soft mounts and block sizes and all sorts of NFS config nonsense, I switched to SMB and never looked back

        • throw0101d 2 days ago ago

          >> Don't know about FreeBSD but hard hanging on a mounted filesystem is configurable (if it's essential configure it that way, otherwise don't).

          > Hi, could you give some pointers about this? Thanks!

          * https://man.archlinux.org/man/nfs.5.en#soft

          * https://kb.netapp.com/on-prem/ontap/da/NAS/NAS-KBs/What_are_...

      • 2 days ago ago
        [deleted]
    • hinkley 3 days ago ago

      I heard rumors at first, and later saw it once myself: the SPARC lab at my university occasionally had to be shut down and turned on in a particular order to get the whole thing to spool back up after a server glitch. I think the problem got really nasty once you had NFS mounts from multiple places.

      • burnt-resistor 3 days ago ago

        Sync with TCP on NFSv2 had the hang problem. Async on NFSv3+ was better, but could still hang if configured using sync.

        NFS 4.1 introduced pNFS for scalability, and 4.2 has even more optimizations.

    • hedora 3 days ago ago

      You probably gave bad advice. By the time Reddit existed, you could have just gotten a NetApp filer. They had higher availability than most data centers back then, so “the NFS server hung” wouldn’t be anywhere near the top of your “things that cause outages or interfere with engineering” list.

      These days, there are plenty of NFS vendors with similar reliability. (Even as far back as NFSv3, the protocol makes it possible for the server to scale out).

      • jedberg 3 days ago ago

        I guess I have to earn your trust too. I was actually intimately familiar with Netapp filers at the time, since that is what we used to drive the NFS mounts for the desktops at the first place I mentioned. They were not as immune as you think and were not suitable.

        Also, we were a startup, and a Netapp filer was way outside the realm of possibility.

        Also, that would be a great solution if you have one datacenter, but as soon as you have more than one, you still have to solve the problem of syncing between the filers.

        Also, you generally don't want all of your app servers to update to new code instantly all the same time, in case there is a bug. You want to slow roll the deploy.

        Also, you couldn't get a filer in AWS, once we moved there.

        And before we moved to AWS the rack was too full for a filer, I would have had to get a whole extra rack.

        • xorcist 2 days ago ago

          FWIW, NetApps were generally pretty solid, and they should have no problem keeping in sync across datacenters. You pay handsomely for the privilege though.

          Failover, latency, and so on are something you need to think about independently of what transfer protocol you use. NFS may present its own challenges with all the different extensions and flags, but that's true of any mature technology.

          That said, live code updates probably aren't a very good idea anyway, for exactly the reasons you mention. Those are the reasons you were right at the time, not any inherent deficiencies in the NFS protocol.

        • jbaiter 3 days ago ago

          100% this. Sometimes it's not even the filer itself. `hard` NFS mounts on clients in combination with network issues have led to downtimes where I work. Soft mounts can be a solution for read only workloads that have other means of fault tolerance in front of them, but it's not a panacea.

          • hedora 2 days ago ago

            I haven’t seen these problems at much larger scales than are being discussed here. I’ve heard of people buying crappy nfs filers or trying to use the Linux server in prod (it doesn’t support HA!), but I’ve also heard of people losing data when they install a key value store or consensus protocol on < 3 machines.

            The only counterexample involved a buggy RHEL-backported NFS client that liked to deadlock, and that couldn’t be upgraded for… reasons.

            Client bugs that force a single machine/process restart can happen with any network protocol.

      • throw0101d 2 days ago ago

        > You probably gave bad advice. By the time Reddit existed, you could have just gotten an netapp filer. They had higher availability than most data centers back then, so “the NFS server hung” wouldn’t be anywhere near the top of your “things that cause outages or interfere with engineering” list.

        Or distributed NFS filers like Isilon or Panasas: any particular node can be rebooted and its IPs are re-distributed between the still-live nodes. At my last job we used one for HPC and it stored >11PB with minimal hassle. OS upgrades can be done in a rolling fashion so client service is not interrupted.

        Newer NFS vendors like Vast Data have all-NVME backends (Isilon can have a mix if you need both fast and archival storage: tiering can happen on (e.g.) file age).

      • burnt-resistor 3 days ago ago

        NetApps were a game changer. Large Windows Server 2003 file servers that ran CIFS, NFS, and AFP simultaneously could take 60-90 minutes to come back online because of the resource fork enumeration scan required by AFP sharing.

    • bitwize 3 days ago ago

      I find it fascinating that NFS mounts hanging the process when they don't work is a consequence of the broken I/O model Unix historically employed.

      See, unlike some other more advanced, contemporary operating systems like VMS, Unix (and early versions of POSIX) did not support async I/O; only nonblocking I/O. Furthermore, it assumed that disk-based I/O was "fast" (I/O operations could always be completed, or fail, in a reasonably brief period of time, because if the disks weren't connected and working you had much bigger problems than the failure of one process) and network-based or piped I/O was "slow" (operations could take arbitrarily long, or even fail altogether after a long wait); so nonblocking I/O was not supported for file system access in the general case. Well, when you mount your file system over a network, you get the characteristics of "slow" I/O with the lack of nonblocking support of "fast" I/O.

      A sibling comment mentions that FreeBSD has some clever workarounds for this. And of course it's largely not a concern for modern software because Linux has io_uring and even the POSIX standard library has async I/O primitives (which few seem to use) these days.

      And this is one of those things that VMS (and Windows NT) got right, right from the jump, with I/O completion ports.

      But issues like this, and the unfortunate proliferation of the C programming language, underscore the price we've paid as a result of the Unix developers' decision to build an OS that was easy and fun to hack, rather than one that encouraged correctness of the solutions built on top of it.

      • ec109685 3 days ago ago

        It wasn’t until relatively recently that approaches like await became commonplace. Imagine all the software that wouldn’t have been written if developers had been forced to use async primitives before languages were ready for them.

        Synchronous IO is nice and simple.

        • vacuity 2 days ago ago

          Yes, it is to synchronous programming's great credit that it is simple, and to its great discredit that it is inefficient. Engineering tradeoffs, and all that.

          Quote[0]:

          > In Ingo's view, there are only two solutions to any operating system problem which are of interest: (1) the one which is easiest to program with, and (2) the one that performs the best. In the I/O space, he claims, the easiest approach is synchronous I/O calls and user-space processes. The fastest approach will be "a pure, minimal state machine" optimized for the specific task; his Tux web server is given as an example.

          Granted, most software is not developed for the Linux kernel, but neither is asynchronous programming black magic. I think the software space has rather been negatively impacted by being slow to adopt asynchronous programming, among other old practices.

          [0] https://lwn.net/Articles/219954/

        • bitwize 3 days ago ago

          Imagine all the software that would've been written, or made much nicer, earlier on had Unix devs not been forced to use synchronous I/O primitives.

          Synchronous I/O may be simple, but it falls down hard at the "complex things should be possible" bit. And people had been doing async I/O for decades before they got handholding constructs like 'async' and 'await'. Programming the Amiga, for instance, was done entirely around async I/O to and from the custom chips. The CPU didn't need to do much at all to blow away the PC at many tasks; it just initiated DMA transfers to Paula, Denise, and Agnus.

    • 3 days ago ago
      [deleted]
  • buserror 3 days ago ago

    I use NFS as a keystone of a pretty large multi-million-dollar data center application. I run it on a dedicated 100Gb network with 9k frames and it works fantastic. I'm pretty sure it is still used in many, many places because... it works!
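
    Roughly, the client side of a setup like that looks something like this on Linux (interface, server and export names are placeholders, and the exact values depend on the deployment):

      # jumbo frames on the dedicated storage interface
      ip link set dev ens1f0 mtu 9000

      # large transfer sizes plus several TCP connections per mount (nconnect needs Linux >= 5.3)
      mount -t nfs -o vers=4.2,rsize=1048576,wsize=1048576,nconnect=8 storage01:/export/data /mnt/data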

    I don't need to "remember NFS", NFS is a big part of my day!

    • zh3 3 days ago ago

      On a smaller scale, I run multiple PCs in-house diskless with NFS root; it's so easy to just create copies on the server and boot into them as needed that it's almost one image per bloated app these days (the server also boots PCs into Windows using iSCSI/SCST, and old DOS boxes from the 386 onwards with etherboot/samba). Probably a bit biased due to doing a lot of hardware hacking, where virtualisation solutions take so much more effort, but got to agree NFS (from V2 through V4) just works.
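
      A minimal sketch of the NFS-root moving parts on Linux (addresses, paths and options below are placeholders, not my actual config):

        # /etc/exports on the server: one root image per machine
        /srv/nfsroot/pc1  192.168.0.0/24(rw,no_root_squash,no_subtree_check)

        # kernel command line handed out by the netboot loader
        root=/dev/nfs nfsroot=192.168.0.2:/srv/nfsroot/pc1,vers=3 ip=dhcp rw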

  • ryandrake 3 days ago ago

    NFS is the backbone of my home network servers, including file sharing (books, movies, music), local backups, source code and development, and large volumes of data for hobby projects. I don't know what I'd do without it. Haven't found anything more suitable in 15+ years.

    • INTPenis 3 days ago ago

      Same. The latest thing I did was put SNES save-state and save files on NFS so I can resume the same game from my laptop, on the RetroPie (TV), and even on the road over WireGuard.

  • E39M5S62 3 days ago ago

    PornHub's origin clusters serve petabytes of files off of NFS mounts - it's still alive and well in lots of places.

  • technofiend 3 days ago ago

    As a Unix sysadmin in the early 90s, I liked to understand as much as I could about the tech underlying the systems I supported. All my clients used NFS, so I dug into the guts of RPC until I could write my own services and publish them via portmap.

    Weirdly that nerd snipe landed me two different jobs! People wanted to build network-based services and that was one of the quickest ways to do it.

  • AshamedCaptain 3 days ago ago

    > There is also a site, nfsv4bat.org [...] However, be careful: the site is insecure

    I just find this highly ironic considering this is NFS we are talking about. Also, do they fear their ISPs changing the 40-year-old NFS specs in flight, or what? Why even mention this?

  • holoduke 3 days ago ago

    We are still using it for some pretty large apps. Still have not found a good and simple alternative. I like the simplicity and performance. Scaling is a challenge though.

    • hnlmorg 3 days ago ago

      Unfortunately there doesn’t seem to be any decent alternative.

      SMB is a nightmare to set up if your host isn’t running Windows.

      sshfs is actually pretty good but it’s not exactly ubiquitous. Plus it has its own quirks and performs slower. So it really doesn’t feel like an upgrade.

      Everything else I know of is either proprietary, or hard to set up. Or both.

      These days everything has gone more cloud-oriented. Eg Dropbox et al. And I don’t want to sync with a cloud server just to sync between two local machines.

      • toast0 3 days ago ago

        > SMB is a nightmare to set up if your host isn’t running Windows.

        Samba runs fine on my FreeBSD host? All my clients are Windows though.

        If I wanted to have a non-windows desktop client, I'd probably use NFS for the same share.

        • hnlmorg 3 days ago ago

          It runs fine but it's a nightmare to set up.

          It's one of those tools that, unless you already know what you're doing, you can expect to sink several hours into trying to get the damn thing working correctly.

          It's not the kind of thing you can throw at a junior and expect them to get working in an afternoon.

          Whereas NFS and sshfs "just work". Albeit I will concede that NFSv4 was annoying to get working back when that was new too. But that's, thankfully, a distant memory.

      • jjtheblunt 3 days ago ago

        What happened to Transarc's DFS ?

        I looked, found the link below, but it seems to just fizzle out without info.

        https://en.wikipedia.org/wiki/DCE_Distributed_File_System

        Anyway, we used it extensively in the UIUC engineering workstation labs (hundreds of computers) 20+ years ago, and it worked excellently. I set up a server farm of Sun SPARCs 20 years ago but used NFS for that.

        • nbernard 3 days ago ago

          AFS (on which DFS was based) lives on as OpenAFS [0]. And there is a commercial evolution/solution from AuriStor [1].

          [0]: https://openafs.org/

          [1]: https://www.auristor.com/filesystem/

          • jjtheblunt 2 days ago ago

            Thank you! I knew it was from Andrew File System but did not manage to find those links by searching too narrowly.

            • jaltman 2 days ago ago

              DCE DFS (developed at Transarc) was originally supposed to be AFS 4.0 before it was contributed to DCE. After the contribution it became backward incompatible with AFS 3.x. The RPC layer, the authentication protocol, the protection service (user/group management) were all replaced to leverage technology contributions from other DCE participants.

              IMO IBM/Transarc died for two reasons. First, there was significant brand confusion after the release of Windows Active Directory and Windows DFS since no trademarks were obtained for DCE service names. Second, the file system couldn't be deployed without the rest of the DCE infrastructure.

              There was an unofficial effort within IBM to create the Advanced Distributed File System (ADFS) which would have decoupled DFS from the DCE Cell Directory Service and Security Service as well as replaced DCE/RPC. However, the project never saw the light of day.

              https://en.wikipedia.org/wiki/DCE_Distributed_File_System

              • jjtheblunt a day ago ago

                Thanks! I hadn't known that larger context either, so very useful.

        • convolvatron 3 days ago ago

          I used to administer AFS/DFS and braved the forest of platform ifdefs to port it to different unix flavors.

          plusses were security (kerberos), better administrative controls and global file space.

          minuses were generally poor performance, middling small-file support and awful large-file support, plus substantial administrative overhead. the wide-area performance was so bad the global namespace thing wasn't really useful.

          I guess it didn't cause as many actual multi-hour outages as NFS, but we used it primarily for home/working directories and left the servers alone, whereas the accepted practice at the time was to use NFS for roots and to cross-mount everything so that it easily got into a 'help I've fallen and can't get up' situation.

          • jjtheblunt 3 days ago ago

            that's very similar to what we were doing for the engineering workstations (hundreds of hosts across a very fast campus network)

            (off topic, but great username)

      • ForHackernews 2 days ago ago

        Someone else in this thread was suggesting http://www.openafs.org/ but I had never heard of it.

      • lmm 3 days ago ago

        > SMB is a nightmare to set up if your host isn’t running Windows.

        That's the opposite of my experience. Fire it up and it just works, in less time than it would take you to configure NFS sensibly.

        • hnlmorg 3 days ago ago

          Have you ever had to manually administer a samba daemon?

          • lmm 3 days ago ago

            Yes, and I've also had to manually administer an NFS daemon. I know which I prefer.

            • hnlmorg 3 days ago ago

              Weird. I’ve done both at scale many times and NFS daemons have always been significantly less problematic (bar the brief period when NFSv4 was new, but I just fell back to NFSv3 for a while).

              Samba can be set up easily enough if you know what you’re doing. But getting the AD controller part working would often throw up annoying edge case problems. Problems that I never had to deal with in NFS.

              Though I will admit that NIS/YP could be a pain if you needed it to sync with NT.

              • lmm 3 days ago ago

                > Weird. I’ve done both at scale many times and NFS daemons have always been significantly less problematic (bar the brief period when NFSv4 was new, but I just fell back to NFSv3 for a brief period).

                Might just be bad timing then, most of my experience with it was in that v3/v4 transition period. It was bad enough to make me swear off the whole thing.

                • hnlmorg 3 days ago ago

                  The v3/v4 transition was very painful so I can completely sympathise to why you were put off NFS from that experience.

              • apelapan 3 days ago ago

                NIS/YP on its own was amazing though! So simple to set up, and it did exactly everything you could ever need in a small to medium-sized network.

                Everything that was supposed to replace it is so much worse, except for supposedly not being very unsafe.

      • NexRebular 3 days ago ago

        > SMB is a nightmare to set up if your host isn’t running Windows.

        It's very easy on illumos-based systems due to the integrated SMB/CIFS service.

      • Spivak 3 days ago ago

        I mean the decent alternative is object storage if you can tolerate not getting a filesystem. You can get an S3 client running anywhere with little trouble. There are lots of really good S3 compatible servers you can self-host. And you don't get the issue of your system locking up because of an unresponsive server.
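
        For example, the stock AWS CLI will happily talk to a self-hosted S3-compatible server (MinIO, Garage, SeaweedFS, ...) if you point it at the right endpoint (endpoint and bucket names below are made up):

          aws --endpoint-url http://storage.lan:9000 s3 cp ./build.tar.gz s3://artifacts/
          aws --endpoint-url http://storage.lan:9000 s3 ls s3://artifacts/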

        I've always thought that NFS makes you choose between two bad alternatives with "stop the world and wait" or "fail in a way that apps are not prepared for."

        • hnlmorg 3 days ago ago

          If you don't need a filesystem, then your options are numerous. The problem is sometimes you do need exactly that.

          I do agree that object storage is a nice option. I wonder if a FUSE-like object storage wrapper would work well here. I've seen mixed results for S3 but for local instances, it might be a different story.

          • zokier 3 days ago ago

            AWS has this "mountpoint for s3" thingy https://github.com/awslabs/mountpoint-s3

            • hnlmorg 3 days ago ago

              They do, but POSIX file system APIs don’t map to S3 APIs well. So you run the risk of heavily increasing your S3 API costs for any stat()-heavy workflows.

              This is why I say there’s mixed opinions about mounting S3 via FUSE.

              This isn’t an issue with a self hosted S3 compatible storage server. But you then have potential issues using an AWS tool for non-AWS infra. There be dragons there.

              And if you were to use a 3rd-party S3 mounting tool, then you run into all the other read and write performance issues that they had (and why Amazon ended up writing their own tool for S3).

              So it’s really not a trivial exercise to self-host a mountable block storage server. And for something as important as data consistency, you might well be concerned enough about weird edge cases that mature technologies like SMB and NFS just feel safer.

      • fodkodrasz 3 days ago ago

        SMB is not that terrible to set up (has its quirks definitely), but apple devices don't interoperate well in my experience. SMB from my samba server performs very well from linux and windows clients alike, but the performance from mac is terrible.

        NFS support was lacking on Windows when I last tried. I used NFS (v3) a lot in the past, but unless in a highly static, high-trust environment, it was worse to use than SMB (for me). Especially the user-id mapping story is something I'm not sure is solved properly. That was a PITA even at homelab scale: having to set up NIS was really something I didn't like, a road-warrior setup didn't work well for me, and I quickly abandoned it.

        • chasil 3 days ago ago

          Which SMB?

          SMBv1 has a reputation for being an extremely chatty protocol. Novell ipx/spx easily dwarfed it in the early days. Microsoft now disables it by default, but some utilities (curl) do not support more advanced versions.

          SMBv2 increases efficiency by bundling multiple messages into a single transmission. It is clear text only.

          SMBv3 supports optional encryption.

          Apple dropped the Samba project from MacOS due to gplv3, and developed their own SMB implementation that is not used elsewhere AFAIK. If you don't care for Apple's implementation, then perhaps installing Samba is a better option.

          NFSv3 relies solely on uid/gid mapping by default, while NFSv4 requires idmapd to run to avoid squashing. I sometimes use both at the same time.

          • p_ing 2 days ago ago

            For macOS, it doesn't matter which protocol version of SMB, it's just a poor implementation of SMB [client].

        • Spooky23 3 days ago ago

          Windows 10/11 have native support. Writes aren’t terribly performant iirc.

        • hnlmorg 3 days ago ago

          > SMB is not that terrible to set up

          Samba can be. Especially when compared with NFS

          > NFS support was lacking on windows when I last tried.

          If you need to connect from Windows then your options are very limited, unfortunately.

    • rootnod3 3 days ago ago

      True. But for example on a home server, I absolutely love the simplicity. I have 6 Lenovo 720q machines, one of them acting as data storage, just running simple NFS for quick daily backups before it pushes them to a NAS.

    • q3k 3 days ago ago

      9P? Significantly simpler, at the protocol level, than NFS (to the point where you can implement a client/server in your language of choice in one afternoon).
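
      The client side is already in the Linux kernel (v9fs), so using it is roughly (server name and virtio mount tag below are placeholders):

        # mount a 9P export over TCP (9P's registered port is 564)
        mount -t 9p -o trans=tcp,port=564,version=9p2000.L fileserver /mnt/9p

        # or over virtio, for a directory a hypervisor exposes to a VM
        mount -t 9p -o trans=virtio,version=9p2000.L hostshare /mnt/host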

    • jabl 3 days ago ago

      Lustre is big in the HPC/AI training world. Amazing performance and scalability, but not for the faint of heart.

      • nwellinghoff 2 days ago ago

        Got any more details about pros and cons based on your experience?

        • jabl 2 days ago ago

          Well, kind of hard to say anything exhaustive in a quick comment, but roughly advantages:

          - POSIX compliant, including dotting the i's. As opposed to, say, NFS which isn't cache coherent.

          - performance and scalability. 1 TB/s+ sequential IO to a single file is what you'd expect on a large HPC system these days.

          - Metadata performance has gotten a lot better over the past decade or so, beating most(all?) other parallel filesystems.

          Downsides:

          - Lots of pieces in a Lustre cluster (typically nodes are paired in sort-of active/active HA configs). And lots of cables, switches etc. So a fairly decent chance something breaks every now and then.

          - When something breaks, Lustre is weird and different compared to many other filesystems. Tools are rudimentary and different.

          To get a feel for what 'life with Lustre' could be, see e.g. various 'site reports' from workshops. E.g. for a couple somewhat recent ones: https://www.eofs.eu/wp-content/uploads/2024/09/cscs_site_rep... and https://www.eofs.eu/wp-content/uploads/2024/09/LAD-24-Luster...

  • amacbride 3 days ago ago

    My introduction to NFS was first at Berkeley, and then at Sun. It more or less just worked. (Some of the early file servers at Berkeley were drastically overcapacity with all the diskless Sun-3/50s connected, but still.)

    And of course I still use it every day with Amazon EFS; Happy Birthday, indeed!

  • Eikon 3 days ago ago

    ZeroFS uses NFS/9P instead of fuse!

    https://github.com/Barre/ZeroFS

  • cyberax 3 days ago ago

    A good time to plug my NFSv4 client in Go: https://github.com/Cyberax/go-nfs-client :) It's made for EFS, but works well enough with other servers.

  • 01HNNWZ0MV43FF 3 days ago ago

    I'd seen a proposal to use loopback NFS in place of FUSE:

    https://github.com/xetdata/nfsserve

  • mixmastamyk 3 days ago ago

    What are most people using today for file serving? For our little LAN, sftp seems adequate, since ssh is already running.

    • Narushia 3 days ago ago

      NFS v4.2. Easy to set up if you don't need authentication. Very good throughput, at least so long as your network gear isn't the bottleneck. I think it's the best choice if your clients are Linux or similar. The only bummer for me is that mounting NFS shares from Android file managers seems to be difficult or impossible (let alone NFSv4).
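
      Without authentication, the whole setup is essentially this (hostnames, subnet and paths below are placeholders):

        # server: /etc/exports
        /srv/media  192.168.1.0/24(ro,all_squash,no_subtree_check)

        # server: reload the export table
        exportfs -ra

        # client
        mount -t nfs4 -o vers=4.2 server.lan:/srv/media /mnt/media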

      • the_gipsy 3 days ago ago

        I think you can serve NFSv4 and also NFSv3 at the same time for those Android apps (e.g. Kodi).

        • Narushia 2 days ago ago

          Yes, that's what at least the `nfs-server` service on Fedora does by default. And VLC also supports v3 on Android… maybe they use the same implementation as Kodi behind the scenes? It's weird the v4 support is so spotty still, even though it has been around for two decades. Even NFS v4.2 is almost ten years old at this point.

    • arp242 3 days ago ago

      I just use sshfs for most things today. It's by far the simplest to set up (just run sshd), has good authentication and encryption (works over the internet), and when I measured performance vs. NFS and Samba some years ago it seemed roughly identical (this is mostly for large files; it's probably slower for lots of small files – measure your own use case if performance is important). I don't know about file locking and that type of thing – it perhaps does poorly there(?) It's not something I care about.
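
      For reference, the whole thing is basically one command (host and paths below are placeholders; the extra options just help it survive flaky connections):

        sshfs -o reconnect,ServerAliveInterval=15,ServerAliveCountMax=3 user@host:/srv/data /mnt/data

        # and to unmount
        fusermount -u /mnt/data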

    • nine_k 3 days ago ago

      SMB2 for high-performance writable shares, WebDAV for high-performance read-only shares, also firewall-friendly.

      SFTP is useful, but is pretty slow; only good for small amounts of data and a small number of files. (Or maybe I don't know how to cook it properly.)

      • aborsy 3 days ago ago

        SMB is great for the LAN, but its performance over the internet is poor. That leaves SFTP and WebDAV in that case. SFTP would be my choice, if there is client support.

        • nine_k 3 days ago ago

          I suspect that NFS over Internet is also not the most brilliant idea; I assumed the LAN setting.

    • heavyset_go 3 days ago ago

      NFSv4 over WireGuard for file systems

      WebDAV shares of the NFS shares for things that need that view

      sshfs for when I need a quick and dirty solution where performance and reliability don't matter

      9p for file system sharing via VMs
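
      The WireGuard + NFSv4 combination is less work than it sounds; a rough sketch (keys, addresses and paths below are placeholders):

        # /etc/wireguard/wg0.conf on a client
        [Interface]
        PrivateKey = <client private key>
        Address = 10.10.0.2/24

        [Peer]
        PublicKey = <server public key>
        Endpoint = nas.example.net:51820
        AllowedIPs = 10.10.0.1/32
        PersistentKeepalive = 25

        # bring the tunnel up, then mount over the tunnel address
        # (the server's /etc/exports only allows the WireGuard subnet)
        wg-quick up wg0
        mount -t nfs4 -o vers=4.2 10.10.0.1:/export/data /mnt/data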

    • ajross 3 days ago ago

      > What are most people using today for file serving?

      Google Drive. Or Dropbox, OneDrive, yada yada. I mean, sure, that's not the question you were asking. But for casual per-user storage and sharing of "file" data in the sense we've understood it since the 1980's, cloud services have killed local storage cold dead. It's been buried for years, except in weird enclaves like HN.

      The other sense of "casual filesystem mounting" even within our enclave is covered well already by fuse/sshfs at the top level, or 9P for more deeply integrated things like mounting stuff into a VM.

      No one wants to serve files on a network anymore.

    • pkulak 3 days ago ago

      SMB has always worked great for me.

    • Arubis 3 days ago ago

      NFS! At least on my localnet.

    • jdboyd 3 days ago ago

      Some NFS, lots of SMB, lots of sftp, some nextcloud, some s3. I wish that TrueNAS made webdav more of a first class service.

      • whizzter 3 days ago ago

        As nice as WebDAV would've been, it's probably a non-starter in many scenarios due to weird limits; for example, Windows has a default size limit of 50 MB.

        I'm tinkering on a project where I'd like to project a filesystem from code, and I added WebDAV support. The 50 MB limit will be fine since it's a corner case for files to be bigger, but it did put a dent in my enthusiasm since I had envisioned using it in more places.

    • a96 3 days ago ago

      sshfs for short-lived and single user serving. iscsi for network storage.

      Nothing for multi-user or multi-client. Avoid as long as that is possible since there is no good solution in sight.

    • SSLy 3 days ago ago

      Depends on the use-case. Myself I'm using NFS, iCloud, and BitTorrent.

    • burnt-resistor 2 days ago ago

      200 TiB over Samba across 5 XFS volumes on md RAID10 arrays. Time Machine-compatible.

  • drewg123 2 days ago ago

    NFS was a huge part of my life in the 90s..

    - It caused me to switch from Linux to FreeBSD in 1994 when Linux didn't have NFS caching but FreeBSD did & Linus told me "nobody cares about NFS" at the Boston USENIX. I was a sysadmin for a small stats department, and they made heavy use of NFS mounted latex fonts. Xdvi would render a page in 1s on FreeBSD and over a minute on Linux due to the difference in caching. Xdvi would seek byte-by-byte in the file.. You could see each character as it rendered on Linux, and the page would just open instantly on FreeBSD.

    - When working as research staff for the OS research group in the CS dept, I worked on a modern cluster filesystem ("slice") which pretended to be NFS for client compat. (https://www.usenix.org/legacy/event/osdi00/full_papers/ander...)

  • nasretdinov 3 days ago ago

    I have really mixed feelings about things like NFS, remote desktop, etc. The idea of having everything remote to save resources (or for other reasons) does sound really appealing in theory, and, when it works, is truly great. However in practice it's really hard to make these things be worth it, because of latency. E.g. for network block storage and for NFS the performance is usually abysmal compared to even a relatively cheap modern SSD in terms of latency, and many applications now expect a low latency file system, and perform really poorly otherwise.

    • zh3 3 days ago ago

      Fairly obviously a 1Gbps network is not going to compete with 5Gbps SATA or 20Gbps NVME. Having said that, for real performance we load stuff over the network into local RAM and then generally run from that (faster than all other options). On the internal network the server also has a large RAM disk shared over NFS/SMB, and the performance PC's have plenty of RAM - so really it's a tradeoff, and the optimum is going to depend on how the system is used.

      • Palomides 3 days ago ago

        want to emphasize, for those who haven't been following, a nice used 25Gb ethernet card is like $25 now

        • zokier 3 days ago ago

          But how much is a 25GbE (or 40GbE) switch?

          • bombcar 3 days ago ago

            10gb is cheap as free → CRS304-4XG-IN

            If I needed more than that, I’d probably do a direct link.

          • Palomides 3 days ago ago

            maybe around $400 for 25Gbe depending on your noise and power tolerance, and 40Gbe is dirt cheap now

            if you only have two or three devices that need a fast connection you can just do point to point, of course

    • heavyset_go 3 days ago ago

      I can saturate both 1 and 2.5 Gbps links with WireGuard-encrypted NFSv4 on thin clients that are relatively old.

      I also use it for shared storage for my cluster and NAS, and I don't think NFS itself has ever been the bottleneck.

      Latency-wise, the overhead is negligible over the LAN, though it can be noticeable when doing big builds or running VMs.

  • Publius_Enigma 2 days ago ago

    I’ve been using NFS in various environments since my first introduction to it in my university’s Solaris and Linux labs. I’ve run it at home, on and off, since 2005.

    I’ve recently started using it again after consistent issues with SMB on Apple devices, and the deprecation of AFP. My FreeBSD server, running on a Raspberry Pi, makes terabytes of content available to the web via an NFS connection to a Synology NAS.

    For my use case, with a small number of users, the fact that NFS is host-based rather than user-based means I can set it up once on each device, and all users of that host can access the shares. And I’ve generally found it to be more consistently performant on Apple hardware than their in-house SMB implementation.

  • cramcgrab 3 days ago ago

    ZFS includes NFS; it's built in and still very handy!

    • E39M5S62 3 days ago ago

      If you're talking about OpenZFS, that is a thin wrapper over knfsd/exports file. They don't actually ship an NFS daemon in the OpenZFS code.
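
      Concretely, on OpenZFS/Linux the property is just a convenient way of handing options to exportfs (dataset name and subnet below are placeholders):

        zfs set sharenfs=on tank/media                    # export with the default options
        zfs set sharenfs="rw=@192.168.1.0/24" tank/media  # or pass restrictions through
        showmount -e localhost                            # appears as an ordinary kernel NFS export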

    • krunger 3 days ago ago

      ZFS is amazing; I've used it since around Solaris 10, and yes, loved it for its NFS capability. I had many terabytes on it at the time, back when a terabyte meant a rack of drives! Now those same systems host petabytes, all upgraded in place. Solaris was pretty amazing too.

  • commandersaki 3 days ago ago

    Ah Network Failure System, good memories.

  • pjmlp 3 days ago ago

    The wonders of not being able to log in to university servers due to NFS communication problems preventing the directory from being mounted.

    Usually caused by a coaxial cable not being properly terminated.

    Naturally it meant nothing on the network was working; however, NFS was kind of the canary in the mine for it.

  • cramcgrab 3 days ago ago

    Auto home! And jumpstart! Aah, the network is the computer!

  • Publius_Enigma 2 days ago ago

    Still performs better than SMB on macOS for many workloads!

    • p_ing 2 days ago ago

      Apple just doesn't seem to care about SMB on macOS even though it is now the primary file sharing protocol.

  • axpvms 3 days ago ago

    I liked this post about No File Security from a previous thread

    https://news.ycombinator.com/item?id=31820504

  • sunshine-o 3 days ago ago

    If only I could mount an NFS share on Android ...

    • Narushia 3 days ago ago

      I looked into this a while ago and was surprised to find that no file explorer on Android seems to support it[1]. However, I did notice that VLC for Android does support it, though unfortunately only NFSv3. I was at least able to watch some videos from the share with it, but it would be nice to have general access to the share on Android.

      [1] Of course, I didn’t test every single app — there’s a bucketload of them on Google Play and elsewhere…

      • sunshine-o 2 days ago ago

        Yes! They are using a client library for that: https://github.com/sahlberg/libnfs

        • Narushia 2 days ago ago

          Interesting, the readme for that library says that NFSv4 is supported. So that likely means that VLC is doing something wrong on their side, because only NFSv3 works?

    • heavyset_go 3 days ago ago

      Been a while, but if you root your phone and have access to the kernel source in order to build the NFS modules, would you be able to mount NFS shares then?

      • sunshine-o 2 days ago ago

        Yes, you need a rooted phone since usually support is at the kernel level

        • heavyset_go 2 days ago ago

          Nice, glad it's still an option

  • semi-extrinsic 3 days ago ago

    I'm considering NFS with RDMA for a handful of CFD workstations + one file server with 25Gbe network. Anyone know if this will perform well? Will be using XFS with some NVME disks as the base FS on the file server.

    • fock 3 days ago ago

      Quite some time ago I implemented NFS for a small HPC cluster on a 40GbE network. A colleague set up RDMA later, since at the start it didn't work with the Ubuntu kernel available. Full NVMe on the file server too. While the raw performance using ZFS was kind of underwhelming (mdadm+XFS about 2x faster), network performance was fine I'd argue: serial transfers easily hit ~4GB/s on a single node, and 4K benchmarking with fio was comparable to a good SATA SSD (IOPS + throughput) on multiple clients in parallel!

    • heavyset_go 3 days ago ago

      Yes, you might want to tune your NFS parameters, stick to NFSv4.2, consider if caching is appropriate for your workloads and at what level, and how much of your NFS + networking you can keep in kernel space if you decide to further upgrade your network's throughput or really expand it.

      Also consider what your server and client machines will be running, some NFS clients suck. Linux on both ends works really well.
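
      If the 25GbE NICs do RoCE (or you have InfiniBand), the RDMA part is mostly just a transport option on Linux; a rough sketch (server name and export below are placeholders):

        # server: add an RDMA listener to knfsd (20049 is the registered NFS/RDMA port)
        modprobe svcrdma
        echo "rdma 20049" > /proc/fs/nfsd/portlist

        # client
        modprobe xprtrdma
        mount -t nfs -o vers=4.2,proto=rdma,port=20049 fileserver:/export/cfd /mnt/cfd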

      • chasil 3 days ago ago

        The last time that I looked, OpenBSD does not support NFSv4 at all.

    • SoftTalker 3 days ago ago

      Consider BeeGFS. Had good results with it using infiniband.

  • cik 3 days ago ago

    I still love NFS. It's a cornerstone to how I end up thinking about many problems. In my house it provides a simple NAS mount. In certain development environments, I use sshmount because of it.

    But I really loved the lesser-known RFS. Yes, it wasn't as robust or as elegant... but there's nothing quite like mounting someone else's sound card and blaring music out of it to pull off a prank. Sigh...

    • doganugurlu 2 days ago ago

      Like someone’s _Sound Blaster_?

  • la64710 3 days ago ago

    Autofs. Still magic …

  • msravi 2 days ago ago

    Wait. I still export NFS mounts from my TrueNAS server and make them available to all other machines on my LAN (music, books, documents, photos, etc). The article and comments here give me the feeling that NFS is outdated and shouldn't be used anymore. Am I doing things wrong?

    • SamPatt 2 days ago ago

      I do exactly the same thing, and it works beautifully. We can't really be doing it wrong if it's working!

  • lukeh 3 days ago ago

    It was amusing to read the comment about the flat UID/GID namespace being a problem, identified as far back as 1984. This is something that DCE addressed by using a larger namespace (UUIDs), and Windows finally got right using a hierarchical one (SIDs).

  • alexpotato 3 days ago ago

    CTO at a hedge fund I worked at had a great quote:

    "NFS is like heroin: it seems like a great idea at first and then it ruins your life" (as many commenters are pointing out)

    Still an amazing technology for its time though.

  • mannyv 2 days ago ago

    NFS, hard mounts, hang. I remember them well.

  • irusensei 2 days ago ago

    I think NFS is the most sane storage provider for self hosted Kubernetes. Anything else seems over engineered for a home lab and is normally not a very good experience.
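
    The simplest form is just a static PersistentVolume pointing at the export (names, size and path below are made up); for dynamic provisioning there are CSI drivers such as csi-driver-nfs:

      apiVersion: v1
      kind: PersistentVolume
      metadata:
        name: nfs-shared
      spec:
        capacity:
          storage: 100Gi
        accessModes: ["ReadWriteMany"]
        persistentVolumeReclaimPolicy: Retain
        nfs:
          server: nas.lan
          path: /export/k8s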

    What I don't like is the security model. It's either deploying kerberos infrastructure or "trust me bro I'm UID 1000" so I default to SMB on the file server.

  • brcmthrowaway 3 days ago ago

    How does Isilon compare to nfs?

    • green-salt 3 days ago ago

      Isilon has its own filesystem that stores the data across multi-node clusters; you then export that out over NFS/SMB/S3, and the nodes load-balance the I/O across the cluster.