| Summary: | Trinity startup and exit time is excessively slow when using NFS for user homes | ||
|---|---|---|---|
| Product: | TDE | Reporter: | Darrell <darrella> |
| Component: | other (any) | Assignee: | Timothy Pearson <kb9vqf> |
| Status: | NEW --- | ||
| Severity: | major | CC: | bugwatch, darrella, kb9vqf |
| Priority: | P5 | ||
| Version: | R14.0.x [Trinity] | ||
| Hardware: | Other | ||
| OS: | Linux | ||
| Compiler Version: | TDE Version String: | ||
| Application Version: | Application Name: | ||
| Attachments: |
strace of terminating kmix with the applet popup menu and /etc/exports=sync
strace of terminating kmix with the applet popup menu and /etc/exports=async |
||
|
Description
Darrell
2013-04-25 16:10:25 CDT
Update: the final changes made in bug report 760 did not change the substance of this report, although I am noticing the same 2.5 second reduction in the exit time as reported in bug report 760.

I do not see this delay or the client kill timeouts in my NFS setup. Comparing with XFCE is like comparing apples and oranges; TDE applications are more complex and likely require additional (small) disk accesses to read their configuration files.

Can you try remounting with async instead of sync to see if that makes any difference? What I am seeing from the logs is many different applications trying to commit their settings to disk simultaneously, which may be hammering your NFS server enough to cause large delays.

For reference, my test setup uses gigabit Ethernet to a fast RAID array, and my NFS server is heavily optimised for low latency and high bandwidth.

All of my systems start the NFS service to facilitate quick connections, although my "primary" NFS server is my main office system. I have only a small home network. Thus, the amount of NFS traffic is nominal at most.

That said, I changed the exports file from sync to async and restarted the NFS service.

First I tested in a virtual machine. I started kmix and then from the system tray stopped kmix. The time for the icon to disappear was about 4 seconds as opposed to about 11 seconds I saw previously. Trinity exit time in the virtual machine without kmix running was about 11 seconds and with async was about 4.5 seconds. For a comparison, after the latest improvements from bug report 760, my host system now sees about a 3 second exit time.

Next I tried the same user account connecting through NFS on a physical machine and saw similar improvements. Terminating kmix from the system tray was almost immediate and exiting Trinity was about 4 seconds.

Using async helps cure the problem (and I learned something :-) ), but the default in NFS is sync.
I'm concerned about the huge change in Trinity between the two options, not to mention that Trinity should work well with the NFS default.

In my use case there is no meaningful load on my NFS service. There is no high demand here, no hammering. About 99.9% of the time the only load is one connection from only one machine, which is idle much of the time. The amount of data transferred is insignificant and all disk synchronizing should be almost immediate. There should be no noticeable delays.

Although in my test conditions the user's Trinity profile is on an NFS server, all Trinity files are installed locally. To me then, closing an applet like kmix should not take 11 seconds but should be immediate. Something is not quite right when Trinity updates the user's kmixrc file. Likewise when exiting Trinity. I understand that Trinity updates many rc files when exiting, but not to the point of 11 seconds or more.

Administrators in an enterprise environment might prefer the default sync option to ensure less data corruption. In those use cases I don't think they will want serious delays anywhere.

Are there ways to improve Trinity when using sync?

Do we need to provide users some kind of xsession-error warning how to best configure NFS for Trinity desktops?

What options are you using in your /etc/exports?

I understand an extra second of delay, or perhaps pushing two or three seconds, but not 20 second delays. I don't mind if the resolution to this report is tuning the /etc/exports file, but if so then I think we need to make that information accessible to users and admins.

<snip>
>
> Are there ways to improve Trinity when using sync?

Possibly. In sync mode the speed of the mount depends more heavily on network and disk latency than total bandwidth, so any potential gains may be severely limited.

> Do we need to provide users some kind of xsession-error warning how to best
> configure NFS for Trinity desktops?
>
> What options are you using in your /etc/exports?
>
> I understand an extra second of delay, or perhaps pushing two or three seconds,
> but not 20 second delays. I don't mind if the resolution to this report is
> tuning the /etc/exports file, but if so then I think we need to make that
> information accessible to users and admins.

The kmix exit problem sounds like it might be a bug; however, I don't know if it would be a kmix, arts, pulseaudio, or alsa bug. Can you strace the long (11 second) kmix exit and attach the strace to this bug report?

> In sync mode the speed of the mount depends more heavily on network
> and disk latency than total bandwidth, so any potential gains may be
> severely limited.

The physical machines I use are 1 Gbit, connected through a 1 Gbit switch. With my testing that is the only connection. Nothing else is happening during the testing --- there is no hammering or latency. I can run other NFS related tests, or network tests, and everything shows expected high results. Only Trinity acts fussy.

> Can you strace the long (11 second) kmix exit and attach the strace
> to this bug report?

How do I do that? When I terminate kmix from konsole, the icon disappears from the system tray immediately. No delays, no errors. The delay occurs only when I terminate kmix from the system tray using the popup context menu. Is there a way to attach strace to the process?

(In reply to comment #5)
> > In sync mode the speed of the mount depends more heavily on network
> > and disk latency than total bandwidth, so any potential gains may be
> > severely limited.
>
> The physical machines I use are 1 Gbit, connected through a 1 Gbit switch. With
> my testing that is the only connection. Nothing else is happening during the
> testing --- there is no hammering or latency. I can run other NFS related
> tests, or network tests, and everything shows expected high results. Only
> Trinity acts fussy.

Your test suite would need to perform lots of small random I/O to properly test a typical configuration file read/write load. Most likely you are testing relatively large file transfers, which would not show the effects of inherent (i.e. set by the physical hardware itself) Ethernet latency very well.

> > Can you strace the long (11 second) kmix exit and attach the strace
> > to this bug report?
>
> How do I do that? When I terminate kmix from konsole, the icon disappears from
> the system tray immediately. No delays, no errors. The delay occurs only when I
> terminate kmix from the system tray using the popup context menu. Is there a
> way to attach strace to the process?

strace -p `pidof kmix`

Tim

This is a good howto for strace :-) http://www.hokstad.com/5-simple-ways-to-troubleshoot-using-strace

Ah, I answered my own question about two minutes after replying. :-)

I'll attach two traces, one with sync and one with async. With both I terminated kmix using the system tray popup menu. Interestingly, the async output is much longer than the sync output although the latter requires much longer time.

Created attachment 1194 [details]
strace of terminating kmix with the applet popup menu and /etc/exports=sync
Created attachment 1195 [details]
strace of terminating kmix with the applet popup menu and /etc/exports=async
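
For anyone comparing the two traces, a small helper can show which system calls account for the wall-clock time. This is only a sketch under an assumption: it expects a trace captured with strace's `-T` option (for example `strace -f -tt -T -o kmix.trace -p $(pidof kmix)`), which appends the time spent in each call as `<seconds>` at the end of the line. The attached traces may not have been captured that way, and the function name `slowest_syscalls` is made up for this example.

```python
import re
from collections import defaultdict

# First "name(" on the line is taken as the syscall name.
SYSCALL_RE = re.compile(r"(\w+)\(")
# `strace -T` appends the elapsed time as "<seconds>" at the end of the line.
DURATION_RE = re.compile(r"<(\d+\.\d+)>\s*$")


def slowest_syscalls(lines, top=5):
    """Return (syscall, total_seconds) pairs, slowest first.

    Lines without a "<seconds>" suffix or without a "name(" prefix
    (signals, exit markers, "resumed" continuations) are skipped.
    """
    totals = defaultdict(float)
    for line in lines:
        duration = DURATION_RE.search(line)
        name = SYSCALL_RE.search(line)
        if duration is None or name is None:
            continue
        totals[name.group(1)] += float(duration.group(1))
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]
```

Calling `slowest_syscalls(open("kmix.trace"))` on a sync trace and again on an async trace should make it easier to see whether the extra seconds sit in a handful of calls or are spread across many small ones.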
Just a note, I was hoping some of the recent threading patches might resolve this report, but no. :-( This problem still exists. When the NFS server uses async in /etc/exports, the user's logout at a remote machine is about 2.5 seconds. Very nice. When the NFS server uses sync, the same user account logout is about 11 to 15 seconds, depending upon the account. At this time I do not have any special debug patches enabled to create a more verbose xsession-error spew.

A note that the recent slew of icon patches has no effect on this bug. The problem still lies with sync/async in the NFS server /etc/exports file.

Update: in light of improvements with tdepowersave and NFS, I tested this bug today. No change in status. The difference is still sync/async in the server's /etc/exports file. |
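
To check the earlier point that a test needs lots of small I/O rather than large transfers, something like the following could be run against a directory on the NFS mount, once with sync and once with async in the server's /etc/exports. It is a rough sketch, not a model of what Trinity does: the write-to-temp, fsync, then rename pattern is an assumption about how a configuration save might behave, and the function name, file names, and sizes are invented for illustration.

```python
import os
import time


def time_small_writes(directory, count=50, size=512):
    """Write `count` small files into `directory` and time each write.

    Each file is written to a temporary name, flushed to stable storage
    with fsync, then renamed over its final name.
    """
    payload = b"x" * size
    timings = []
    for index in range(count):
        final_path = os.path.join(directory, f"bench{index}rc")
        temp_path = final_path + ".new"
        start = time.perf_counter()
        with open(temp_path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # wait until the data is committed
        os.replace(temp_path, final_path)  # rename over the final name
        timings.append(time.perf_counter() - start)
    return timings
```

Comparing `sum(timings)` and `max(timings)` between the two export modes would show how much of the delay comes from per-file commit latency alone, independent of any Trinity code.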