DGS (General)
osamu.miyakawa - 14:04 Tuesday 01 December 2015 (406)
NDS trouble fixed

For the past several weeks we had serious trouble with NDS. An excitation triggered the trouble, and after that all fast data in dataviewer and diaggui responded slowly or sometimes did not work at all. All the real-time modules and EPICS channels were fine; for example, damping kept working throughout the trouble.

The behavior I observed on the servers was as follows:

1. The load of daqd on k1nds0, k1nds1, k1fw0, k1fw0 was sometimes very large. I checked this with the top command on each server.

2. Missing packets occurred repeatedly on NDS. As a result, daqd on the NDS machines repeatedly stopped and was automatically restarted by /etc/inittab. The period of this cycle depended on how large the load was, especially on how many channels were defined in the fb/master file, even when no actual RT PC was running. Typically the cycle repeated on a timescale of minutes.

3. ifconfig also showed some packet drops on myri0 of the nds and fw machines, but no packets were dropped on myri1 of fw. Another strange thing was that the amount of traffic on Myrinet was too small.
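The drop counters mentioned in item 3 can also be read programmatically. The following is only a minimal sketch, assuming a Linux host where /proc/net/dev has the standard 16-counter layout; the function names and the sample numbers are made up for illustration and are not taken from the actual servers.

```python
from typing import Dict, List, NamedTuple


class IfaceCounters(NamedTuple):
    rx_packets: int
    rx_dropped: int
    tx_packets: int
    tx_dropped: int


def parse_proc_net_dev(text: str) -> Dict[str, IfaceCounters]:
    """Parse the text of Linux /proc/net/dev into per-interface counters."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16:
            continue  # not a standard counter line
        # Receive columns come first (8 fields), then transmit (8 fields).
        counters[name.strip()] = IfaceCounters(
            rx_packets=int(fields[1]),
            rx_dropped=int(fields[3]),
            tx_packets=int(fields[9]),
            tx_dropped=int(fields[11]),
        )
    return counters


def interfaces_with_drops(counters: Dict[str, IfaceCounters]) -> List[str]:
    """Return the sorted names of interfaces that report any dropped packets."""
    return sorted(
        name for name, c in counters.items() if c.rx_dropped or c.tx_dropped
    )


# Hypothetical sample text; the numbers are invented for illustration only.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0
 myri0: 5000 40 0 57 0 0 0 0 2000 20 0 0 0 0 0 0
 myri1: 9000 80 0 0 0 0 0 0 3000 30 0 0 0 0 0 0
"""
```

On a live machine the same parser could be fed the contents of /proc/net/dev, and calling it twice a few seconds apart would show whether the drop counter is still growing and how much traffic the interface actually carries.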

 

Finally the actual problem was found: the DAQ data was being broadcast onto the TCP/IP network instead of the DAQ network, i.e. the 10 Gb network was not being used between k1dc0 and the NDSs and FWs. I changed daqd.rc on k1dc0 to route the data over the 10 Gb network, and all the troubles seem to be fixed. Nakano-kun reported that all responses, including MEDM in the mine, are much faster now.
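A quick way to sanity-check this kind of misrouting is to look at which subnet a configured broadcast address belongs to. The sketch below uses Python's standard ipaddress module; the two subnets and their labels are hypothetical placeholders, not the real addresses of the k1dc0 networks, and the daqd.rc syntax itself is not shown here.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical addressing, for illustration only.
NETWORKS: Dict[str, ipaddress.IPv4Network] = {
    "general TCP/IP network": ipaddress.ip_network("192.168.10.0/24"),
    "DAQ network": ipaddress.ip_network("10.0.10.0/24"),
}


def network_for_broadcast(
    address: str, networks: Dict[str, ipaddress.IPv4Network] = NETWORKS
) -> Optional[str]:
    """Return the label of the network whose broadcast address matches, if any."""
    addr = ipaddress.ip_address(address)
    for label, net in networks.items():
        if addr == net.broadcast_address:
            return label
    return None  # not a broadcast address of any known network
```

With these placeholder subnets, `network_for_broadcast("192.168.10.255")` returns the general network label, which is the kind of result that would indicate the DAQ broadcast is going out on the wrong network.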

If you see any trouble for fast data, please let me know.

Something is still wrong with the past data. I confirmed that DAQ channels are being stored on the 20TB storage, but dataviewer and diaggui cannot read the past data. This should be fixed as soon as possible.

Comments to this report:
osamu.miyakawa - 18:10 Tuesday 01 December 2015 (407)

I fixed the NDSs. You can now see the past data from any client.

The cause was that the job directory was not set correctly on either k1nds0 or k1nds1, and the nds daemon was not actually running on k1nds0.

Some of the data you selected as DAQ channels was being recorded even during the trouble. You can try viewing it if you need past data from the period of the NDS trouble.

 

The last remaining problem is that something is still wrong with k1prm for PR3. We will replace the optical fiber cable between the PC and the IO chassis tomorrow.

osamu.miyakawa - 17:02 Friday 04 December 2015 (422)

Some past data is still not available. Minute data was not visible, but this has been fixed for some channels.

 

Currently none of the suspension models (k1vismce, k1vismco, k1vismci and k1vispr3) is recording past data, i.e. past data for all DAQ channels and EPICS channels is not visible for the suspensions. I guess the RT models need to be fixed. Test point channels and real-time EPICS channels are fine.

All channels for the other models are fine. You can access both real-time and past data through dataviewer or diaggui.

osamu.miyakawa - 2:10 Wednesday 09 December 2015 (439)

All DAQ works now, including minute trends. The problem was that too many EPICS channels were being recorded.

I removed some of the recorded EPICS channels from the VIS and ISC models.
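To keep an eye on how many EPICS channels a model records, something like the following rough sketch could help. It assumes an INI-style channel list in which each channel is a section with a datarate key and slow (EPICS) channels use a rate of 16; that layout, the function name, and the sample channel names are assumptions for illustration and should be checked against the real files before use.

```python
import configparser


def count_slow_channels(ini_text: str, slow_rate: int = 16) -> int:
    """Count channel sections whose datarate equals slow_rate.

    Assumes (hypothetically) one section per channel plus an optional
    [default] section that is not itself a channel.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return sum(
        1
        for name in parser.sections()
        if name != "default"
        and parser.getint(name, "datarate", fallback=0) == slow_rate
    )


# Hypothetical sample; channel names are invented placeholders.
SAMPLE_INI = """\
[default]
datarate=16
acquire=3

[K1:VIS-EXAMPLE_A]
datarate=16

[K1:VIS-EXAMPLE_B]
datarate=2048

[K1:ISC-EXAMPLE_C]
datarate=16
"""
```

Running the counter over each model's channel list before and after an edit would give a simple number to compare when trimming recorded EPICS channels.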

If you see any more trouble, please let me know.

(Osamu)
