MIF (General)
takahiro.yamamoto - 17:35 Tuesday 10 September 2024 (31045)
Helper guardian for lockloss check

Abstract

I prepared a new guardian code so that the initial check of a lockloss investigation can be skipped.
It is currently running on an unused guardian node (CAL_PROC), so I will move it to a new guardian node with a proper name on the next maintenance day.
Initial-check results are available in /users/Commissioning/data/lockloss/yyyy/mmdd/yyyy-mm-dd.json.
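For reference, a minimal sketch of reading one day's result file is below. The JSON schema is not described in this entry, so the script only loads the file and prints whatever it contains; the date in the path is a placeholder.

#!/usr/bin/env python3
# Minimal sketch: load one day's initial-check results and print them.
# Nothing about the field names is assumed; adapt this once the real
# structure of the JSON file is known.
import json
from pathlib import Path

# Path pattern quoted above; the date parts are placeholders.
path = Path('/users/Commissioning/data/lockloss/2024/0910/2024-09-10.json')

with path.open() as f:
    results = json.load(f)

if isinstance(results, list):
    for i, entry in enumerate(results):
        print(i, entry)
else:
    for key, value in results.items():
        print(key, value)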

Details

This guardian provides the following information (see the attachment for an example) whenever a jump to LOCKLOSS in the LSC_LOCK guardian is detected.
- GPS/UTC/JST times aligned with the DAQ sampling timing (not the guardian log time).
- The state of the LSC_LOCK guardian just before the jump to LOCKLOSS.
- Guardian nodes that detected the lockloss earlier than (or within the same DAQ sample as) LSC_LOCK.
- Labels for known issues, e.g. klog#31032 and issues experienced during O4a.

This should reduce the effort of lockloss checking, because automatically labeling known issues and showing the behavior of the other guardian nodes narrows down the items that need to be inspected by hand. A rough sketch of the idea follows.
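As an illustration only (not the actual SYS_LOCKLOSS code), a minimal sketch of such an initial check with the nds2 Python client is below. The NDS host, the list of other guardian nodes, and the LOCKLOSS state number (-1, as quoted later in this thread) are assumptions.

# Minimal sketch of the initial check, assuming standard guardian
# STATE_N channels recorded at 16 Hz and the nds2 Python client.
import numpy as np
import nds2

NDS_SERVER = 'k1nds0'                      # assumed NDS host
LOCKLOSS = -1                              # assumed LOCKLOSS state number
MAIN = 'K1:GRD-LSC_LOCK_STATE_N'
OTHERS = ['K1:GRD-IMC_LOCK_STATE_N']       # hypothetical list of other nodes

def initial_check(gps_guess, margin=5):
    """Return (DAQ-aligned lockloss GPS, previous state number,
    nodes that dropped no later than LSC_LOCK) around gps_guess."""
    conn = nds2.connection(NDS_SERVER, 8088)
    bufs = conn.fetch(int(gps_guess) - margin, int(gps_guess) + margin,
                      [MAIN] + OTHERS)
    main = bufs[0]
    fs = main.channel.sample_rate
    hits = np.flatnonzero(main.data == LOCKLOSS)
    if hits.size == 0:
        return None                        # no lockloss in this window
    idx = hits[0]
    t_lockloss = main.gps_seconds + main.gps_nanoseconds * 1e-9 + idx / fs
    prev_state = int(main.data[idx - 1]) if idx > 0 else None
    # Nodes whose state changed no later than LSC_LOCK
    # (earlier, or within the same DAQ sample).
    early = []
    for buf in bufs[1:]:
        jumps = np.flatnonzero(np.diff(buf.data) != 0)
        if jumps.size and jumps[0] + 1 <= idx:
            early.append(buf.channel.name)
    return t_lockloss, prev_state, early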
Non-image files attached to this report
Comments to this report:
takahiro.yamamoto - 12:29 Wednesday 11 September 2024 (31052)
Recent (today's and yesterday's) locklosses are now also available on the Run Summary.
So the initial-check results of locklosses can be found more easily than by reading the JSON files.
takahiro.yamamoto - 15:35 Friday 20 September 2024 (31099)
I moved this function from the temporary guardian node (CAL_PROC) to a new node (SYS_LOCKLOSS).

Information on the two most recent locklosses (latest and previous) is shown on k1mon7 and is automatically updated by the SYS_LOCKLOSS guardian when a new lockloss is detected. The EPICS channels for this information are still temporary ones, so I need to add new channels with proper names in k1grdconfig.
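For illustration, a minimal sketch of how a guardian node could push this info to EPICS string records via ezca is below. The channel names are placeholders for the temporary records mentioned above, not the real ones.

# Minimal sketch: publish lockloss info to EPICS string records with ezca.
# Inside a guardian node the `ezca` object already exists; the channel
# names below are placeholders, not the actual temporary records.
from ezca import Ezca

ezca = Ezca(prefix='K1:')

def publish_lockloss(gps, utc_str, prev_state, label):
    ezca['GRD-SYS_LOCKLOSS_LATEST_GPS'] = '%.4f' % gps
    ezca['GRD-SYS_LOCKLOSS_LATEST_UTC'] = utc_str
    ezca['GRD-SYS_LOCKLOSS_LATEST_STATE'] = prev_state
    ezca['GRD-SYS_LOCKLOSS_LATEST_LABEL'] = label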

An ndscope plot around the latest lockloss time is also launched automatically by SYS_LOCKLOSS on k1mon8. For locklosses due to known issues (e.g. overflow, earthquake), the related channels are shown on that plot. For unknown issues, an ndscope template from /users/DET/tools/lockloss/share/LSC_LOCK/ is launched; the template must have the same name as the LSC_LOCK guardian state so that the proper template is opened for each lockloss. (I haven't prepared templates for all states yet.) A rough sketch of this selection logic follows.
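A sketch of the template selection described above, assuming the known-issue file names and the .yaml extension; only the directory and the state-name convention come from this entry.

# Sketch of the template-selection logic: a known issue opens its own
# template, otherwise the template named after the LSC_LOCK state is used.
import os
import subprocess

TEMPLATE_DIR = '/users/DET/tools/lockloss/share/LSC_LOCK'
KNOWN_ISSUE_TEMPLATES = {                  # hypothetical mapping
    'OVERFLOW': 'overflow.yaml',
    'EARTHQUAKE': 'earthquake.yaml',
}

def launch_ndscope(state_name, label):
    if label in KNOWN_ISSUE_TEMPLATES:
        template = os.path.join(TEMPLATE_DIR, KNOWN_ISSUE_TEMPLATES[label])
    else:
        # the template file must share the LSC_LOCK state name
        template = os.path.join(TEMPLATE_DIR, state_name + '.yaml')
    if not os.path.exists(template):
        return                             # no template prepared for this state yet
    # Setting the plot time window around the lockloss is omitted here.
    subprocess.Popen(['ndscope', template])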
takahiro.yamamoto - 17:10 Thursday 26 September 2024 (31133)

Abstract

The lockloss guardian stopped due to a coding bug when a lockloss occurred around 10:00 on 9/24.
The bug has been fixed, and the missed locklosses have been added to the list by running the lockloss check offline.

Details

The latest locklosses were not being updated due to a coding bug in the lockloss guardian. The bug was introduced in the work reported in klog#31099 and was trivial, so I fixed it immediately; the lockloss check resumed with the lockloss at 1411367342.6875 (= 2024-09-26 06:28:44.687500 UTC = 2024-09-26 15:28:44.687500 JST). The missed locklosses were also added to the list by an offline analysis. The last lockloss recorded before the fix was the one at 1410219337.75 (= 2024-09-12 23:35:19.750000 UTC = 2024-09-13 08:35:19.750000 JST). I also checked the period between 9/12 (the last recorded lockloss) and 9/24 (when the error occurred) just in case, but there was no lockloss in that term because no full-IFO work was done. So I re-analyzed only the locklosses between 9/24 and today and found 19 locklosses in total that had been missed because of the guardian bug. All of them are now available in the list on the web pages.
hirotaka.yuzurihara - 16:22 Monday 07 October 2024 (31224)

I added the SYS_LOCKLOSS guardian to the MEDM screen. See the bottom of the attached screenshot.

Images attached to this comment
takahiro.yamamoto - 11:50 Tuesday 08 October 2024 (31232)

I updated k1grdconfig to add the EPICS channels needed by the SYS_LOCKLOSS guardian, as shown in Fig.1 and Fig.2.
All added channels are StringIn blocks, so there is no change to the DAQ channel list.

The compile check has already been completed, but the model is not installed yet.
It will be installed on the next maintenance day.

Images attached to this comment
takahiro.yamamoto - 19:10 Wednesday 09 October 2024 (31254)
As one lockloss scenario in states beyond DC lock, Ushiba-kun suggested the case where the amplitude of AS17Q crosses 0.
This situation corresponds to the DARM fluctuation around the operating point reaching the opposite side of the dark point.

I implemented a new function to check for this situation and ran it on all 40 locklosses from DC lock since last night.
The function gave results consistent with my eye check for all of them,
so I deployed it in the SYS_LOCKLOSS guardian; a minimal sketch of such a check follows.
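This is not the deployed code, only a sketch of the idea; the AS17Q channel name and the look-back window are assumptions.

# Sketch of the 0-cross check: look for a sign change of AS17Q in the
# seconds before the lockloss flag is raised.
import numpy as np
import nds2

AS17Q = 'K1:LSC-AS17_Q_ERR_DQ'             # assumed channel name

def crossed_zero_before_lockloss(t_lockloss, lookback=2.0):
    conn = nds2.connection('k1nds0', 8088)  # assumed NDS host
    buf = conn.fetch(int(t_lockloss - lookback) - 1, int(t_lockloss) + 1,
                     [AS17Q])[0]
    fs = buf.channel.sample_rate
    t0 = buf.gps_seconds + buf.gps_nanoseconds * 1e-9
    i_ll = int(round((t_lockloss - t0) * fs))
    data = buf.data[max(i_ll - int(lookback * fs), 0):max(i_ll, 1)]
    # Any sign change in the window counts as a 0-cross of AS17Q.
    return bool(np.any(np.signbit(data[:-1]) != np.signbit(data[1:])))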

The attached figures show 29 of the 40 locklosses from DC lock in which a 0-cross of AS17Q can be seen before the lockloss flag is raised.
The two numbers at the end of the file names correspond to the date (yymmdd) and the ID in the lockloss list.
(Because there are too many plots, the 11 of 40 locklosses that seem to have other causes will be attached in the next post.)
Images attached to this comment
takahiro.yamamoto - 19:15 Wednesday 09 October 2024 (31258)
I attached the remaining 11 plots, which do not seem to be related to the 0-cross issue.
Some of them (e.g. the 4th plot) may still be related to it,
but it is difficult to judge whether they are really 0-cross cases because of the poor sampling rate (16 Hz) of the guardian flag.
Images attached to this comment
takahiro.yamamoto - 14:41 Sunday 03 November 2024 (31493)
I made a minor update to the SYS_LOCKLOSS guardian.

When I created this guardian node, unused EPICS channels that had been prepared for another purpose were used to record the lockloss information.
Although I prepared new EPICS channels for this purpose last month (see also klog#31232), the guardian code had not been updated yet.
So I modified the guardian code to use these new channels to record the lockloss info.
The MEDM screen showing the lockloss info was also updated.
takahiro.yamamoto - 21:49 Wednesday 04 December 2024 (31909)
Yuzurihara-kun pointed out the possibility of missing the "human request" lockloss reason when a state other than DOWN is requested.
So I modified the LOCKLOSS guardian to detect such cases.

The reason is that the guardian checks whether the time-series array of REQUEST_N values includes 1.0 or not. With that old implementation the guardian misses the lockloss reason when, for example, someone changes the guardian request from DC lock to RF lock: in that case REQUEST_N changes from 9990 to 1400, while the state transition is 9990 (DC lock) -> -1 (LOCKLOSS) -> 1 (DOWN) -> ... -> 1400 (RF lock). I improved the implementation so that the guardian can now detect this case; specifically, it was changed to check whether REQUEST_N was changed to a smaller value. This should work well because the state numbers are well ordered. If the LSC_LOCK guardian is ever modified without keeping this state-number ordering, the LOCKLOSS guardian must also be changed, and in that case the only solution is to describe all cases one by one. A minimal sketch of the new test is below.
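The sketch assumes the REQUEST_N samples around the lockloss are already available as an array; it is not the actual guardian code.

# Old test: a human request was flagged only when REQUEST_N hit 1.0 (DOWN).
# New test: flag any drop of REQUEST_N to a smaller value, which also
# covers e.g. a request from DC lock (9990) down to RF lock (1400).
import numpy as np

def is_human_request(request_n):
    """request_n: array of LSC_LOCK REQUEST_N samples around the lockloss."""
    return bool(np.any(np.diff(np.asarray(request_n, dtype=float)) < 0))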
takahiro.yamamoto - 11:38 Friday 13 December 2024 (31982)

A new function was deployed around 11:30 in the LOCKLOSS guardian for locklosses due to the ISS saturation reported in klog#31974.
So this case can be tagged automatically from now on.

-----
Figure 1 and Figure 2 show lockloss cases due to a non-ISS issue and an ISS issue, respectively. In the non-ISS case, the ISS guardian goes to DOWN after (or within the same 16 Hz sample as) the LSC_LOCK guardian goes to LOCKLOSS, and the AOM feedback also becomes large after the lockloss. In the ISS case, on the other hand, the ISS goes down and the AOM output oscillates a few seconds before the lockloss.
Most known lockloss reasons can be identified by looking at the proper signals 0~2 seconds before the lockloss time. In the ISS case, however, the problem occurs much earlier and the AOM output has already been stopped by the ISS guardian just before the lockloss, so a longer duration must be checked to identify it. Figure 3 shows another ISS-related lockloss, in which the relative down time between ISS and LSC_LOCK differs from the case of Fig.2. Based on checking several cases, this issue seems detectable by looking at the AOM output 0~4 seconds before the lockloss time; a rough sketch of such a tag follows.
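In this sketch the ISS guardian channel, its DOWN state number and the NDS host are assumptions; only the 0~4 s window follows the text above.

# Sketch of the ISS tag: the ISS guardian is already in DOWN within the
# 0~4 s before the LSC_LOCK lockloss flag is raised.
import numpy as np
import nds2

ISS_STATE = 'K1:GRD-PSL_ISS_STATE_N'       # assumed ISS guardian channel
ISS_DOWN = 1                               # assumed DOWN state number

def is_iss_lockloss(t_lockloss, lookback=4.0):
    conn = nds2.connection('k1nds0', 8088)  # assumed NDS host
    buf = conn.fetch(int(t_lockloss - lookback) - 1, int(t_lockloss) + 1,
                     [ISS_STATE])[0]
    fs = buf.channel.sample_rate
    t0 = buf.gps_seconds + buf.gps_nanoseconds * 1e-9
    i_ll = int(round((t_lockloss - t0) * fs))
    window = buf.data[max(i_ll - int(lookback * fs), 0):max(i_ll, 1)]
    return bool(np.any(window == ISS_DOWN))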

Note for future maintenance:
when the ISS goes to DOWN, the BPC control doesn't work because the BPC lines are hidden by the intensity noise. The relative timing between ISS and LSC_LOCK may therefore depend on the speed of the BPC control, and we may need to modify the lockloss detection scheme if the BPC control is drastically changed.

Images attached to this comment
takahiro.yamamoto - 15:08 Wednesday 18 December 2024 (32036)
The SYS_LOCKLOSS guardian seems to have been stopped since Monday morning due to trouble with the NDS connection combined with poor error handling.

The guardian code was modified to handle NDS-related errors, and a re-analysis of the missed period is almost done.
However, it's difficult to deploy these changes remotely without careful checks, so I'll apply them after going back to Mozumi.
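A sketch of the kind of error handling meant here, assuming the nds2 Python client: retry a failed fetch and return None instead of letting the exception stop the node. The host, port and retry parameters are assumptions.

# Sketch: wrap the NDS fetch so connection problems are retried instead
# of killing the guardian node (the nds2 bindings raise RuntimeError
# on connection/fetch failures).
import time
import nds2

def fetch_with_retry(channels, start, stop, host='k1nds0', port=8088,
                     retries=3, wait=5.0):
    for attempt in range(retries):
        try:
            conn = nds2.connection(host, port)
            return conn.fetch(start, stop, channels)
        except RuntimeError as exc:
            print('NDS fetch failed (%s), retry %d/%d'
                  % (exc, attempt + 1, retries))
            time.sleep(wait)
    return None   # caller can flag this event for offline re-analysis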

takahiro.yamamoto - 18:35 Thursday 19 December 2024 (32059)
Error handling for the NDS connection was added to the guardian code and SYS_LOCKLOSS was resumed.
All lockloss events that occurred while SYS_LOCKLOSS was stopped were analyzed offline:

- 12/15: 54 events (Nos. 0~27 had already been analyzed online, so only Nos. 28~53 were analyzed.)
- 12/16: 29 events
- 12/17: 26 events
- 12/18: 18 events
- 12/19: 8 events (until 9:20 UTC)