Over the past several weeks we had serious trouble with NDS. Excitations triggered the problem, after which all fast data in dataviewer and diaggui responded slowly or sometimes did not work at all. All the real-time modules and EPICS channels were fine; for example, damping kept working throughout this trouble.
The behavior I observed on the servers was as follows:
1. The load of daqd on k1nds0, k1nds1, k1fw0, k1fw0 was sometimes very high, as seen with the top command on each server.
2. Missing packets occurred repeatedly on NDS. As a result, daqd on nds repeatedly stopped and was automatically restarted by /etc/inittab. The period of this cycle depended on how large the load was, in particular on how many channels were defined in the fb/master file, even when no actual RT PC was running. Typically the cycle repeated on a timescale of minutes.
3. ifconfig also showed some packet drops on myri0 of nds and fw, but no packets were dropped on myri1 of fw. One more strange thing was that the amount of packets on myrinet was too small.
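The packet and drop counters in item 3 can also be read by a script, which makes it easier to compare myri0 and myri1 across several servers. The following is only a minimal sketch, assuming a Linux host that exposes interface counters in /proc/net/dev with the usual 16-column layout. The function names parse_net_dev and report are hypothetical and not part of any DAQ tool; the interface names myri0 and myri1 are taken from the list above.

```python
from pathlib import Path


def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {interface: counters}.

    Assumes the standard Linux layout: two header lines, then one line per
    interface with 8 receive fields followed by 8 transmit fields.
    """
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = [int(x) for x in data.split()]
        if len(fields) < 16:
            continue  # unexpected layout, ignore this line
        stats[name.strip()] = {
            "rx_packets": fields[1],
            "rx_dropped": fields[3],
            "tx_packets": fields[9],
            "tx_dropped": fields[11],
        }
    return stats


def report(stats, interfaces=("myri0", "myri1")):
    """Return one summary line per interface of interest."""
    lines = []
    for name in interfaces:
        counters = stats.get(name)
        if counters is None:
            lines.append(f"{name}: not present")
        else:
            lines.append(
                f"{name}: rx_packets={counters['rx_packets']} "
                f"rx_dropped={counters['rx_dropped']} "
                f"tx_packets={counters['tx_packets']} "
                f"tx_dropped={counters['tx_dropped']}"
            )
    return lines


if __name__ == "__main__":
    proc = Path("/proc/net/dev")
    if proc.exists():  # only available on Linux
        for line in report(parse_net_dev(proc.read_text())):
            print(line)
```

Running this twice a few seconds apart and comparing rx_packets would give a rough idea of whether a meaningful amount of traffic is actually flowing on the myrinet interfaces.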
In the end, the actual problem turned out to be that the DAQ data was being broadcast onto the TCP/IP network and not onto the DAQ network, i.e. the 10GB network was not being used between k1dc0 and the NDSs and FWs. I changed daqd.rc on k1dc0 to route the data over the 10GB network, and all the troubles seem to be fixed. Nakano-kun reported that all responses in the mine, including MEDM, are much faster now.
If you see any trouble with fast data, please let me know.
Something is still wrong with the past data. I confirmed that the DAQ channels are being stored on the 20TB storage, but dataviewer and diaggui cannot read the past data. This should be fixed as soon as possible.