- Re: NFS Crashes LTSP and NFS Servers and Corrupts Data! Need Help!



Joe Beanfish
07-24-2004, 06:01 PM
root wrote:
>
> On Thu, 19 Jun 2003 15:18:39 -0700, Patch wrote:
>
> > "root" <root@jonspc.jonspc> wrote in message
> > news:<pan.2003.06.14.14.48.49.369661@jonspc.jonspc>...
> >> On Sat, 14 Jun 2003 01:24:21 +0000, Fanying Jen wrote:
> >>
> >> > I am a senior system administrator at Lille Corp, which provides thin
> >> > client (LTSP or Linux Terminal Server Project) Linux solutions to the
> >> > medical industry. We have a very serious problem with the reliability
> >> > of NFS over both Fast Ethernet (100BaseT) and WANs, including
> >> > point-to-point T1s and IPSEC VPNs where the other node is on a business
> >> > cable modem. The customer is a mid-size medical associate with over
> >> > three hundred staff members and two hundred terminals and PCs, mostly
> >> > Linux, spread over three states.
> >> >
> >> > There are two major problems with NFS. One is NFS crashing the entire
> >> > three hundred person organization in one swoop and bringing it to a
> >> > grinding halt in front of users and the patients (customers of the
> >> > customer), which is not good, particularly since Linux is branded as
> >> > more stable.
> >> >
> >> > The other major problem is that when NFS is not crashing systems, it
> >> > loses data, particularly with OpenOffice, costing many man hours of
> >> > work. In addition, even simple commands like "ls" have trouble
> >> > displaying the entire directory, and users sometimes get into a stale
> >> > lock as well.
> >> >
> >> > I will provide the network topology along with the system specs,
> >> > configuration, and the software components.
> >> >
> >> > Network Topology
> > ...
> > ... (text removed)
> > ...
> >> >
> >> > NFS Problem 1 (Crashing)
> >> >
> >> > The LTSP would create stale locks and eventually the LTSP server
> >> > crashes. Furthermore, data is corrupted during the process.
> >> >
> >> > NFS Problem 2 (Corruption)
> >> >
> >> > This happens mostly in OpenOffice. More than once, data has been
> >> > corrupted when working with both Microsoft and native OpenOffice
> >> > formats. On full local workstations, where the files are saved to the
> >> > local disk, this does not happen. The office documents are normally
> >> > saved to an NFS filesystem.
> >> >
> >> > NFS Problem 3 (Performance)
> >> >
> >> > NFS is very slow over T1 and cable links but is just fine on the
> >> > local Fast Ethernet. It is slow enough to either knock people off or
> >> > cause write errors. We think the hubs play a major role and are
> >> > replacing them with high-end managed switches. However, we believe
> >> > there is more here than meets the eye and that the T1 also has
> >> > something to do with it.
> >> >
> >> > Summary
> >> >
> >> > Those are the problems, and I have given as much information as I
> >> > possibly can. I would appreciate it if anyone could point us in the
> >> > right direction. We are a commercial organization, our customer is
> >> > also commercial, and we all want Linux to succeed not only on the
> >> > server but also on the desktop. This customer is one of the boldest I
> >> > have seen in embracing Linux on the desktop and we want them to
> >> > succeed to the fullest. We are therefore asking for your assistance
> >> > so we can do what many people say you can't: make money with Linux!
> >> >
> >> > Thank you and Sincerely,
> >> > Fanying Jen
> >> > Senior System Administrator
> >> > Lille Corp.
> >>
> >> Why are you running a UDP LAN protocol on a WAN and then looking
> >> surprised when it doesn't work? NFS is not suitable for WAN based
> >> applications, and the Linux version of NFS isn't even as "not good" as
> >> Sun's NFS.
> >>
> >> You could try an IP socket implementation of NFS (they do exist), or a
> >> better solution is to use SMB instead. SMB is not a sexy network
> >> protocol, but it is session based and doesn't pretend to be stateless.
> >> Linux works with SMB in the same way as NFS, so you can put the mounts
> >> into fstab; the only difference is that the server needs the shares
> >> defined in smb.conf, not /etc/exports. It only uses a couple of sockets
> >> (the problem with NFS is that it negotiates a UDP socket using portmap,
> >> which makes firewalls fun and might in fact be part of your problem).
> >>
> >> NFS has many, many defects for WANs. It expects full duplex (talk and
> >> listen at the same time). It uses UDP and doesn't retry failures with
> >> 100% accuracy. It uses very large frames, useless for WANs. It
> >> dynamically negotiates UDP sockets with portmap, and if the negotiation
> >> fails, it runs badly and takes an age to connect.
> >>
> >> Point 2 is TESTS !!!! Write a "copy to server, copy back to host, file
> >> compare" script and run it for days. Typically this will move a few
> >> hundred MB from point A to B and then back to A. When it arrives back
> >> at A, compare the file with the original.
> >>
> >> Jon
> >
>
> > First, NFS is not UDP based; it runs over either TCP or UDP.
>
> A quote from Sun's NFS description:
>
> NFS originated from Sun Microsystems. The details of Sun RPC Version 2 can
> be found in RFC 1057 and consists of two flavors. The first is built using
> the sockets API and works with TCP or UDP. The other, transport
> independent (TI- RPC), is built using the TLI API and works with any
> transport layer. The most popular RPC implementation today is built using
> the sockets API with UDP.

Perhaps, but since we're talking about linux here perhaps the linux man
page would be more applicable:

...
tcp    Mount the NFS filesystem using the TCP protocol
       instead of the default UDP protocol.
       Many NFS servers only support UDP.
udp    Mount the NFS filesystem using the UDP protocol.
       This is the default.
...
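
One way to check which transport a client's NFS mounts are actually using is to read its mount table. The following is a minimal sketch, assuming a Linux client whose /proc/mounts lists the transport either as a bare `tcp`/`udp` option or as `proto=...`; the exact option strings vary between kernel versions, and the function name and sample paths are made up for illustration.

```python
def nfs_transports(mounts_text):
    """Return {mount_point: transport} for NFS entries in /proc/mounts-style text.

    Assumption: the transport shows up as "tcp", "udp" or "proto=<name>" in the
    options field. Entries with no such option are reported as "unknown".
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        transport = "unknown"
        for opt in fields[3].split(","):
            if opt in ("tcp", "udp"):
                transport = opt
            elif opt.startswith("proto="):
                transport = opt[len("proto="):]
        result[fields[1]] = transport
    return result


if __name__ == "__main__":
    # On a Linux client, read the live mount table and print one line per NFS mount.
    with open("/proc/mounts") as f:
        for mount_point, transport in nfs_transports(f.read()).items():
            print(mount_point, transport)
```

Any mount reported as `udp` (or `unknown`, on a system where UDP is the default) is a candidate for remounting with the `tcp` option quoted above.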

root
07-24-2004, 06:02 PM
On Mon, 23 Jun 2003 12:45:22 -0400, Joe Beanfish wrote:

> Perhaps, but since we're talking about linux here perhaps the linux man
> page would be more applicable:
>
> ...
> tcp    Mount the NFS filesystem using the TCP protocol
>        instead of the default UDP protocol. Many
>        NFS servers only support UDP.
> udp    Mount the NFS filesystem using the UDP protocol.
>        This is the default.
> ...

And we swing back round to "You could try an IP socket implementation of
NFS" :-) I wasn't aware it was built in.

Assuming he's using the defaults, then he is running with UDP, so I fail to
see why you chose to pick holes in my comments?

Jon
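
The "copy to server, copy back to host, file compare" test suggested earlier in the thread could be sketched roughly as follows. This is only an illustration, not anyone's actual script: the function names are hypothetical, `remote_dir` is assumed to be a directory on the NFS mount under test, and the payload is random data generated in memory, so keep `size_bytes` modest.

```python
import filecmp
import os
import shutil
import tempfile


def round_trip_ok(source_path, remote_dir, scratch_dir):
    """Copy a file to remote_dir, copy it back, and compare it with the original."""
    name = os.path.basename(source_path)
    remote_copy = os.path.join(remote_dir, name)
    returned_copy = os.path.join(scratch_dir, name + ".back")
    shutil.copyfile(source_path, remote_copy)    # A -> B
    shutil.copyfile(remote_copy, returned_copy)  # B -> A
    # shallow=False forces a byte-by-byte comparison rather than trusting stat() data
    same = filecmp.cmp(source_path, returned_copy, shallow=False)
    os.remove(returned_copy)
    os.remove(remote_copy)
    return same


def run_rounds(remote_dir, rounds, size_bytes):
    """Run several round trips with fresh random payloads; return the failure count."""
    failures = 0
    with tempfile.TemporaryDirectory() as scratch:
        for i in range(rounds):
            original = os.path.join(scratch, "payload_%d.bin" % i)
            with open(original, "wb") as f:
                f.write(os.urandom(size_bytes))
            if not round_trip_ok(original, remote_dir, scratch):
                failures += 1
            os.remove(original)
    return failures
```

A call such as `run_rounds("/mnt/nfs_test", rounds=1000, size_bytes=1024 * 1024)` (with a path of your own) would be left running for hours or days. One limitation of this sketch: the copy back may be served from the client's cache rather than from the server, so a clean result is weaker evidence than a reported mismatch.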