
Version 1







Production Plans

Data Production Status

  • Raw data processing
    • proc10/bucket8: complete
    • Proc11 (2019a/b/c) will be launched around 17th Apr. 2020
    • Prompt processing of 2020a/b data will start in late spring, in preparation for summer conferences.
  • MC13 production

    • MC13a (Run-independent MC) production: Keep producing both generic samples and signal samples

    • MC13b (Run-dependent MC) production: ongoing
  • Skim
    • SkimP10x1 (Proc10 skims): ongoing
    • SkimM13ax1 (MC13 skims): ongoing

Production Status

Full resource usage

Data production summary page : Data Production Status

Data (re)processing:

  • No jobs to run on grid now

MC production:

  • MC13a/MC13b productions are ongoing

Analysis skimming:

  • SkimP10x1/SkimM13ax1 are ongoing.


Central Services

Dirac (dirac.cc.kek.jp, b2dchsv01-b2dchsv06.cc.kek.jp, b2dchsv08.cc.kek.jp)

  • Date, Issue, Tickets...

DB Production (b2dchdb1.cc.kek.jp, b2dchdb2.cc.kek.jp, b2dcsdb1.cc.kek.jp, b2dcsdb2.cc.kek.jp)

  • Date, Issue, Tickets...

"Web" servers

  • 2019-12-07 Ganglia Monitor for the "Web" servers still shows "remnant plots" after 2 hours' check. BIIDCO-2166


DDM (bldirac01.sdcc.bnl.gov)

  • 2018-03-01 DDM deletion task seems stuck BIIDCO-808

Conditions DB ()

Monitor

LFC

  • Date, Issue, Tickets...

File Transfers and Replication Status

  • See also Computing OperationStatus#DDM for related issues.
  • A periodic pattern in the transfers is observed (period of about half a day); experts are investigating. BIIDCO-2273
  • 2020-04-03 00:15 UTC File Transfer failures: Pisa-DATA-SE BIIDCO-2349
  • No activity during the last three hours (01/Jan/2020, 9:00 to 12:15 UTC) in the "Replication Status plot" BIIDCO-2097
  • No activity in the last two hours in both plots, "Throughput" and "Successful transfers" BIIDCO-2144
  • No activity in file transfer monitoring since 01/Mar/2020 at 19:00 UTC BIIDCO-2097
  • 2020-04-08, Replication status shows nearly as many failed as successful for over four hours now, BIIDCO-2358

FTS

  • Any problems with the FTS service or FTS monitoring are to be recorded here. Site/SE-specific issues are to be recorded under each Site/SE.
  • Note that the FTS dashboard we use is an "old" instance and is not well maintained. We, Belle II members in general, do not have access to the "new" monitoring. When the dashboard is down, shifters just need to notify the expert and skip the corresponding part of their work. The expert should then check the new monitoring, since access to that monitoring page is limited.
  • 2020-02-06 15:20 UTC File transfer failure from Roma3-TMP-SE to KIT-TMP-SE BIIDCO-2275
  • 2020-01-02 13:20 UTC File transfer failure from KMI-TMP-SE and from KEK-Disk-TMP-SE to LAL-DATA-SE BIIDCO-2219
  • 2020-01-02 9:15 UTC File transfer failure from KMI-TMP-SE to LAL-DATA-SE BIIDCO-2219
  • 2019-08-31 File transfer failures for past 48 hours. BIIDCO-1987
  • 2019-09-03 File transfer failures for past 24 hours. BIIDCO-1988

Replication Status

  • 2020-04-02 08:00 UTC, Problems on RepTrend:CESNET-TMP-SE and on RepTrend:KEK2-TMP-SE BIIDCO-2347
  • 2020-03-30 01:00 UTC and 2020-04-02 01:00 UTC, RepTrendAll: all lines at zero BIIDCO-2342
  • 2020-03-20 19:30 UTC - No activity
  • 2020-02-13 Zero number of "Done" at all SEs; the number of Scheduled is increasing BIIDCO-2286
  • 2019-12-21 Decreasing Done Jobs with many Scheduled Jobs BIIDCO-2169
  • 2019-12-15 Zero Replication Efficiency BIIDCO-2183
  • 2019-01-19 Almost zero Done, with an increasing number of Scheduled jobs for more than 5 SEs and more than 5 hours. BIIDCO-1618
  • 2018-07-02 No Done transfers, several Scheduled and a rapid increase of Waiting replications BIIDCO-1125

Job Status Plot

  • No job status plots for 15 sites while MC13a production is ongoing 2020-02-07 BIIDCO-2277

Job Summary

  • Date, Issue, Tickets...
  • Following JIRA ticket updated: BIIDCO-1553



SEs

SE Common Issues

  • Issues with individual SEs should be recorded below (Primary SEs or Other SEs)

Raw data SEs

Link to JIRA ticket:

Raw data SE: KEK-RAW-SE (srm://kek2-se02.cc.kek.jp:8444/srm/managerv2?SFN=/belle/RAW)

  • 2019-07-24 15:30 UTC: all transfers failed (0/72) between KEK-RAW-SE and BNL-TMP-SE

Raw data SE: BNL-TAPE-SE (srm://dcblsrm.sdcc.bnl.gov:8443/srm/managerv2?SFN=/pnfs/sdcc.bnl.gov/tape)

  • date, issue, tickets

Primary SEs

Primary SE: BNL-TMP-SE (dcblsrm.sdcc.bnl.gov)

  • High failure rate as source BNL-TMP-SE BIIDCO-2340
    Solved and Verified https://ggus.eu/index.php?mode=ticket_info&ticket_id=146329
  • 2020-02-09 File transfer failure: efficiency from SIGNET-TMP-SE to KEK-DISK-TMP-SE, BNL-TMP-SE and DESY-TMP-SE has been low, ~50%, for > 4 hours. BIIDCO-2280
  • SE Health Check by DDM: Failure on download has been observed since 2020-02-07 09:19:40 (5 hours)
  • No Replication Trend Plot for BNL-TMP-SE 2020-01-02 09:30 UTC BIIDCO-2220
  • SE Health check by DDM: download has not worked since 2019-05-16 07:11:21 UTC.

  • UNAVAILABLE files BIIDCO-1302
  • SE Health check by DDM: download has not worked since 2019-05-15 21:03:25 UTC

Primary SE: CESNET-TMP-SE (dpm1.egee.cesnet.cz) 

  • SE Health Check by DDM: Failure on upload has been observed since 2020-03-27 09:35:52 (3 hours) BIIDCO-2335
    https://ggus.eu/index.php?mode=ticket_info&ticket_id=146324
  • SE Health Check by DDM: Failure on upload has been observed since 2020-04-08 02:01:40 (5 hours) BIIDCO-2335
  • Replication status: Scheduled Jobs Only 2020-03-19 15:50 UTC
  • Replication status: Plot is not shown 2019-12-31 9:25 UTC BIIDCO-1352
  • Replication status: Plot is not shown 2019-12-31 7:45 UTC
  • Plot is not shown 2019-12-25 05:00 UTC.
  • SE Health check by DDM: remove file, remove directory, ls have not worked since 2019-07-10 06:32:47 UTC.

Primary SE: CNAF-TMP-SE (storm-fe-archive.cr.cnaf.infn.it)

  • SE Health Check by DDM: Failure on upload has been observed since 2020-04-07 10:40:12 BIIDCO-2356
  • File transfer, replication trend and job status plots all have issues BIIDCO-2356
  • SE Health Check by DDM: Failure on download has been observed since 2019-05-13 08:27:21 UTC and since 2020-02-19 15:44:39 (7 hours)
  • File transfer failure as source has been observed since 2019-12-23 02:00 UTC
  • SE Health check by DDM: remove file, remove directory, download, upload, ls have not worked since 2019-04-25 23:13:00 UTC.
  • 2019/01/27 File transfer failures from CNAF-TMP-SE to NTUCC-DATA-SE. BIIDCO-1637
  • Continuous timeout failures between NTU-CC-TMP-SE and CNAF-TMP-SE BIIDCO-1310

Primary SE: DESY-TMP-SE (dcache-se-desy.desy.de)

  • 2020-02-09 File transfer failure: efficiency from SIGNET-TMP-SE to KEK-DISK-TMP-SE, BNL-TMP-SE and DESY-TMP-SE has been low, ~50%, for > 4 hours. BIIDCO-2280
  • 2020-04-01, downtime, BIIDCO-2344

Primary SE: KEK-DISK-TMP-SE (srm://kek2-se03.cc.kek.jp:8444/srm/managerv2?SFN=/disk/belle/TMP)

  • 2020-02-09 File transfer failure: efficiency from SIGNET-TMP-SE to KEK-DISK-TMP-SE, BNL-TMP-SE and DESY-TMP-SE has been low, ~50%, for > 4 hours. BIIDCO-2280
  • 2020-01-02 13:20 UTC File transfer failure from KMI-TMP-SE and from KEK-Disk-TMP-SE to LAL-DATA-SE BIIDCO-2219

Primary SE: KEK2-TMP-SE (srm://kek2-se03.cc.kek.jp:8444/srm/managerv2?SFN=/belle/TMP)

  • Number of jobs with status "done" is zero. 2020-02-07 15:40 (UTC)
  • SE Health Check by DDM: Failures on ls, upload have been observed since 2019-11-10 07:23:23 (5 hours)
  • Following JIRA tickets submitted: BIIDCO-1866
  • Number of jobs with status "done" is zero. 2019-07-05 7:07 BIIDCO-1866

Primary SE: KISTI-TMP-SE (belle-se-head.sdfarm.kr)

  • No new assignment of MC production data blocks to this destination BIIDCO-848

  • SE Health check by DDM: download, upload have not worked since 2019-09-22 23:38:00 UTC.

Primary SE: KIT-TMP-SE (dcachesrm-kit.gridka.de)

  • 2020-04-01, downtime, BIIDCO-2345

Primary SE: KMI-TMP-SE (nsrmfe01.hepl.phys.nagoya-u.ac.jp)

  • 2020-01-02 09:15 UTC File transfer failure from KMI-TMP-SE to LAL-DATA-SE BIIDCO-2219
  • KMI-TMP-SE with Scheduled jobs overwhelming Done ones since 9:00 UTC BIIDCO-2169

Primary SE: Napoli-TMP-SE (belle-dpm-01.na.infn.it)

  • 2020-04-07, SE Health Check by DDM: Failures on ls, upload have been observed since 2020-04-07 06:09:23, BIIDCO-2355
  • 2020-04-08 File transfer, replication and job status all have issues BIIDCO-2356

Primary SE: SIGNET-TMP-SE (dcache.ijs.si)

  • SE Health Check by DDM: Failures on ls, upload have been observed since 2020-03-26 22:07:52 (8 hours) BIIDCO-2333
  • SE Health Check by DDM: Failures on ls, upload have been observed since 2020-02-13 19:59:11 (13 hours)
  • 2020-02-09 File transfer failure: efficiency from SIGNET-TMP-SE to KEK-DISK-TMP-SE, BNL-TMP-SE and DESY-TMP-SE has been low, ~50%, for > 4 hours. BIIDCO-2280
  • File transfer failure as destination has been observed since 2019-12-23 02:00 UTC

Other SEs

Adelaide-TMP-SE (coepp-dpm-01.ersa.edu.au)

CYFRONET-TMP-SE (dpm.cyf-kr.edu.pl)

  • Date, Issue, Tickets...

CINVESTAV-TMP-SE (jaguar-se.fis.cinvestav.mx)

  • Date, Issue, Tickets...

Frascati-TMP-SE (atlasse.lnf.infn.it)

  • Date, Issue, Tickets...

HEPHY-TMP-SE (hephyse.oeaw.ac.at)

  • Date, Issue, Tickets...

IPHC-TMP-SE (sbgse1.in2p3.fr)

  • Date, Issue, Tickets...

LAL-TMP-SE (grid05.lal.in2p3.fr)

  • 08 Apr 2020, Lots of failed file transfers from source, BIIDCO-2357

Melbourne-TMP-SE (b2se.mel.coepp.org.au)

  • Transfer rate is zero BIIDCO-896

  • Melbourne-DATA-SE banned for write BIIDCO-927

McGill-TMP-SE  (storm02.clumeq.mcgill.ca)

  • McGill-TMP-SE will be decommissioned in early 2018. BIIDCO-516

MPPMU-TMP-SE (grid-srm.rzg.mpg.de)


NTU-TMP-SE (bgrid3.phys.ntu.edu.tw)

  • NTU-TMP-SE banned for write BIIDCO-1993
  • 2019-08-31 File transfer failures for past 48 hours. BIIDCO-1987

NTU-CC-TMP-SE (belle2grid3.cc.ntu.edu.tw)

  • 2020/01/13, 2019/12/17, 2019/12/11 File transfer failures - BIIDCO-2174
  • 2019-10-06 File transfer failures for past 24 hours. BIIDCO-2053
  • 2019/8/23 File transfer failure to NTU-CC-DATA-SE BIIDCO-1977 BIIDCO-1987
  • FTS transfer failure as SOURCE NTU-CC-DATA-SE to BNL-TMP-SE BIIDCO-1953
    Solved and verified. GGUS ticket https://ggus.eu/index.php?mode=ticket_info&ticket_id=142550 has been submitted
  • 2019/01/27 File transfer failures from CNAF-TMP-SE to NTUCC-DATA-SE. BIIDCO-1637 BIIDCO-1892
  • File transfer failure and cancellation to NTUCC-DATA-SE happened 2018-12-22 BIIDCO-1551
  • Frequent timeouts have been observed between NTU-CC-TMP-SE and CNAF-TMP-SE BIIDCO-1310
    GGUS ticket https://ggus.eu/index.php?mode=ticket_info&ticket_id=137334 was submitted 2018-09-22 05:10 UTC
  • NTUCC-TMP-SE banned for write BIIDCO-1333

Pisa-TMP-SE (stormfe1.pi.infn.it)

  • 2020-03-28 17:40 UTC - Failed Transfer in some connections involving PISA-TMP-SE as source

PNNL-TMP-SE (se.hep.pnnl.gov) 

  • Being decommissioned. No need to report any issues. BIIDCO-838

Roma3-TMP-SE (storm-01.roma3.infn.it)

  •  Date, Issue, Tickets...

TAU-TMP-SE (tau-se.hep.tau.ac.il)

Torino-TMP-SE (se-srm-00.to.infn.it)

  • Date, Issue, Tickets...

ULAKBIM-TMP-SE (torik1.ulakbim.gov.tr)

  • File transfer failures as destination BIIDCO-2253

UMiss-TMP-SE (umiss005.hep.olemiss.edu)

  • Date, Issue, Tickets...

UVic-TMP-SE (charon01.westgrid.ca)



Sites

Sites Common Issue

  • Date, Issue, Tickets... (site-wide issues)
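
Many of the per-site entries on this page follow the pattern `"<issue>" has been observed since YYYY-MM-DD HH:MM UTC (for N hours)`. As a minimal sketch, assuming only that pattern, the following Python snippet extracts the issue name, start time and duration from such a line. The names `ENTRY_RE` and `parse_entry` are hypothetical and not part of any Belle II or DIRAC tool. Older entries in other formats (for example "has been found at ... on 2019/05/22") are not handled and return `None`.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for entries of the form:
#   "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 2 hours)
# The "(for N hours)" part is optional, since some entries omit it.
ENTRY_RE = re.compile(
    r'"(?P<issue>[^"]+)" has been observed since '
    r'(?P<start>\d{4}-\d{2}-\d{2} \d{2}:\d{2}) UTC'
    r'(?: \(for (?P<hours>\d+) hours?\))?'
)


def parse_entry(line):
    """Return a dict with issue, start (UTC datetime) and hours, or None."""
    match = ENTRY_RE.search(line)
    if match is None:
        return None
    start = datetime.strptime(match.group("start"), "%Y-%m-%d %H:%M")
    start = start.replace(tzinfo=timezone.utc)
    hours = int(match.group("hours")) if match.group("hours") else None
    return {"issue": match.group("issue"), "start": start, "hours": hours}


if __name__ == "__main__":
    example = '"Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 2 hours)'
    print(parse_entry(example))
```

Any ticket reference (BIIDCO-...) that follows the entry is ignored by this sketch.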

ARC.DESY.de

  • 2020-03-30 - 18:00 UTC, downtime, BIIDCO-2343
  • 2020-04-01 - 07:00 UTC, downtime, BIIDCO-2344
  • 2020-04-07 - 08:00 UTC, downtime, BIIDCO-2353

ARC.DESY-test.de

  • A test queue for the new CE. BIIDCO-1469

ARC.KIT.de

  • "Pilot Submission Failure" has been observed since 2020-03-19 12:24 UTC (for 2 hours) BIIDCO-2314
  • "Pilot Submission Failure" has been observed since 2020-03-15 23:24 UTC (for 1 hours) (details).
  • "Pilot Submission Failure" has been observed since 2020-03-13 04:24 UTC (for 2 hours)
  • "Pilot Submission Failure" has been observed since 2020-03-08 11:24 UTC BIIDCO-2314

  • "Aborted Pilot" has been observed since 2020-02-17 21:59 UTC (for 1 hours)
  • 2020-04-01, Downtime, BIIDCO-2345

ARC.LMU.de

  • This is a test site. No need to report any issues.

ARC.LMU2.de

  • Banned, as there is currently no resource behind the CE BIIDCO-239

ARC.Melbourne.au

  • "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 2 hours)

ARC.MPPMU.de

  • "Failed Pilot" has been observed since 2020-04-07 20:30 UTC (for 10 hours) BIIDCO-2346
  • "Failed Pilot" has been observed since 2020-04-02 12:30 UTC (for 2 hours) BIIDCO-2346
  • "Failed Payload Job" has been observed since 2020-04-02 12:30 UTC (for 2 hours) BIIDCO-2346
  • "Failed Payload Job" has been observed since 2020-03-26 12:30 UTC (for 2 hours) BIIDCO-2346
  • "Failed Payload Job" has been observed since 2020-01-05 02:34 UTC (for 5 hours)
  • "Failed Payload Job" has been observed since 2019-09-30 14:27 UTC (for 8 hours)
  • "Failed Payload Job" has been observed since 2019-09-29 15:27 UTC (for 7 hours)
  • Job submission check : Pilot submission failure has been found since 00:26:00 UTC on 2019/04/21. BIIDCO-1386
  • BIIDCO-128

ARC.SIGNET.si

  • "Short Pilot" has been observed since 2020-01-04 02:34 UTC (for 4 hours)
  • "Failed Payload Job" has been observed since 2020-01-04 02:34 UTC (for 4 hours)
  • "Short Pilot" has been observed since 2019-12-24 18:28 UTC (for 4 hours)
  • "Short Pilot" has been observed since 2019-11-10 05:30 UTC (for 1 hours)
  • Health checker info. : "Failed pilot jobs" has been found since 20:20:00 UTC on 2019/08/28.(details)
  • Health checker info. : "Short pilot jobs" has been found since 13:20:00 UTC on 2019/08/01.
  • "Failed pilot jobs" has been found at 15:20:00 UTC on 2019/05/22.(details)
  • "Short pilot jobs" has been found at 15:20:00 UTC on 2019/05/22.(details)
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/05/21.(details)
  • Job status check: many Stalled jobs on 2019/05/14 at 7:00 UTC.
  • Health checker info. : "Short pilot jobs" has been found at 07:20:00 UTC on 2019/05/14.(details)
  • Health checker info. : "Short pilot jobs" has been found at 15:20:00 UTC 2019/04/05 and at 14:20:00 UTC on 2019/04/12.
  • Job status check: Application finished with errors (5% of the jobs) at 11:15 UTC on 2018/12/21.

  • "Failed to install DIRAC on " has been found since 20:20:00 UTC on 2018/11/03. BIIDCO-1420

  • Health checker info. : "Failed pilot jobs" has been found since 06:20:00 UTC on 2018/10/03.(details) BIIDCO-1350

CLOUD.CC1_Krakow.pl

  • Not used in production yet. Seeing no jobs (no plot) is not a problem

CLOUD.DESY.de

  • Newly commissioned site. Problems should be reported (with a ticket separate from BIIDCO-2270).
  • Being configured (BIIDCO-2270). No report necessary.
  • 2020/04/01, Downtime, BIIDCO-2344

DIRAC.Beihang.cn

  • Site is banned.
  • "Failed Payload Job" has been observed since 2019-04-19 11:15 UTC BIIDCO-1812
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2019/04/18. BIIDCO-1807
  • Health checker info. : "Short pilot jobs" has been found since 06:20:00 UTC on 2019/04/17.
  • "Application finished with errors" (100% currently) on 2019/04/10 00:15 UTC. Problem reported since (at least) 2019/04/07 07:00 UTC.
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2018/12/08. BIIDCO-1534
  • Job status check: "application finished with errors" (100% currently) on 2018/10/26.
  • Job submission check : Pilot submission failure has been found since 09:24:00 UTC on 2018/09/21. (details) BIIDCO-1312
  • The number of jobs is limited. BIIDCO-289
  • All upload attempts are failing against all configured SEs: OutputSE (KMI-TMP-SE, PNNL-TMP-SE), Fail-over SEs (DESY-TMP-SE, Napoli-TMP-SE, PNNL-TMP-SE, KIT-TMP-SE) BIIDCO-43
  • Large % of failed jobs in DIRAC status plot (Added 2016-11-03 22:45:00 UTC) BIIDCO-38

DIRAC.BINP.ru

  • New jobs have not run since 2020-03-23 around 10:00 UTC BIIDCO-2326
  • "Failed Payload Job" has been observed since 2020-01-04 20:34 UTC (for 10 hours)
  • "Short Pilot" has been observed since 2019-11-10 05:30 UTC (for 1 hours)
  • Job status check: "Application Finished With Errors" (39% of the jobs over the last 24h) at 7:00 UTC on 2019/05/15.
  • Job status check: Application finished with errors (27% of the jobs over the last 24h) at 8:00 UTC on 2018/12/22.
  • Health checker info. : "Failed to install DIRAC on " has been found at 22:20:00 UTC on 2018/09/15

DIRAC.BINP-VM.ru

  • "Pilot Submission Failure" has been observed since 2020-01-17 05:53 UTC (for 17 hours)
  • "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 1 hours)
  • Health checker info. : "Aborted pilot jobs" has been found at 06:20:00 UTC on 2019/02/21
  • Job submission check : Pilot submission failure has been found since 10:23:00 UTC on 2019/01/14. BIIDCO-1607
  • Job status plots, "Application Finished With Errors" (2018-02-11 but lasting for at least a month) BIIDCO-749

DIRAC.CINVESTAV.mx

  • Job status: 80% of jobs had Input Data Resolution errors in past 24 hours, observed on 2020-03-04 at 8:00 UTC.
  • "Pilot Submission Failure" has been observed since 2020-02-18 00:59 UTC (for 22 hours)
  • "Pilot Submission Failure" has been observed since 2020-02-12 11:59 UTC (for 11 hours) BIIDCO-1277
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/04/14.

  • Job submission check : Pilot submission failure has been found at 13:27:00 UTC on 2019/03/19.

DIRAC.DESY.de

  • Test site. Not in use in MC production

DIRAC.IITG.in  

  • "Aborted Pilot" has been observed since 2020-04-09 01:30 UTC (for 5 hours) BIIDCO-2070
  • "Aborted Pilot" has been observed since 2020-03-28 16:30 UTC (for 14 hours) BIIDCO-2070
  • AID: "Aborted Pilot" has been observed since 2019-10-16 22:38 UTC: JIRA ticket created BIIDCO-2070

  • "Aborted Pilot" has been observed since 2020-03-19 13:24 UTC (for 1 hours) BIIDCO-2070

  • Job status check: "Application finished with errors" on 2019/07/10 at 00:00 UTC (screenshot)
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2019/05/16. BIIDCO-1686 BIIDCO-1768
  • Job status check: many "Application finished with errors" (overall 66% during past 24 hours) on 2019/05/15 at 7:00 UTC.
  • Job status check: many "Application finished with errors" on 2019/05/14 at 7:00 UTC.
  • Job status plots, 100% "Application Finished With Errors", 10:00:00 UTC on 2019/04/08. Still unchanged as of 2019/04/26. BIIDCO-1823

DIRAC.IITH.in

  • "Pilot Submission Failure" has been observed since 2020-03-02 01:05 UTC (for 5 hours).

  • "Pilot Submission Failure" has been observed since 2020-03-01 13:05 UTC (for 9 hours)

  • "Pilot Submission Failure" has been observed since 2020-02-18 03:59 UTC (for 19 hours) 

  • "Pilot Submission Failure" has been observed since 2020-02-17 15:59 UTC (for 7 hours) 

  • "Aborted pilot jobs" has been found at 22:20:00 UTC on 2019/06/03.(details)
  • Health checker info. : "Aborted pilot jobs" has been found at 22:20:00 UTC on 2019/06/02.(details)
  • Health checker info. : "Short pilot jobs" has been found at 07:20:00 UTC on 2019/03/29.(details) BIIDCO-1768

DIRAC.LMU.de

  • Not in use in MC production BIIDCO-26
  • Banned for now.

DIRAC.MIPT.ru

  • "Failed Payload Job" has been observed since 2020-01-16 21:53 UTC (for 1 hours)
  • "Failed Payload Job" has been observed since 2020-01-15 02:53 UTC (for 4 hours) (details)

  • "Failed Payload Job" has been observed since 2020-01-04 07:34 UTC (for 62 hours)
  • "Aborted Pilot" has been observed since 2019-12-25 03:28 UTC (for 3 hours)
  • "Aborted Pilot" has been observed since 2019-12-21 03:28 UTC (for 3 hours) 
  • "Aborted Pilot" has been observed since 2019-12-20 10:28 UTC (for 5 hours) 
  • Health checker info. : "Aborted pilot jobs" has been found since 13:20:00 UTC on 2019/04/20. BIIDCO-1816
  • Health checker info. : "Aborted pilot jobs" has been found since 11:20:00 UTC on 2019/04/06, since 05:20:00 UTC on 2019/04/12, and since 20:20:00 UTC on 2019/04/17.

DIRAC.Nagoya.jp

  • "Failed Payload Job" has been observed since 2020-01-04 23:34 UTC (for 7 hours)
  • "Short Pilot" has been observed since 2019-11-19 05:35 UTC (for 1 hours)
  • Health checker info. : "Short pilot jobs" has been found since 09:20:00 UTC on 2019/10/09.

DIRAC.Nara-WU.jp

  • Under commissioning from 2018-11-13 BIIDCO-1432

DIRAC.NDU.jp

  • Date, Issue, Tickets...

DIRAC.Niigata.jp

  • "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 1 hours)
  • "Short Pilot" has been observed since 2019-11-19 05:35 UTC (for 1 hours)

  • Health checker info. : "Short pilot jobs" has been found at 15:20:00 UTC on 2019/10/09.

  • Job submission check : Pilot submission failure has been found since 19:26:00 UTC on 2019/05/26. (details)

  • Health checker info. : "Aborted pilot jobs" has been found since 12:20:00 UTC on 2019/05/18.
  • Job submission check : Pilot submission failure has been found since 13:30:00 UTC on 2019/05/14. (details)
  • Health checker info. : "Aborted pilot jobs" has been found at 06:20:00 UTC on 2019/04/21.

DIRAC.Niigata2.jp       

  • "Failed Payload Job" has been observed since 2020-04-02 20:30 UTC (for 3 hours) (details). BIIDCO-2348

DIRAC.Osaka-CU.jp

  • Site is banned
  • Job submission check : Pilot submission failure has been found since 07:23:00 UTC on 2018/12/04. (details) BIIDCO-1434
  • Health checker info. : "Short pilot jobs" has been found since 22:20:00 UTC on 2018/03/17.
    → Ask site admin to check the status 2018-03-17 10:00 JST. (DB access failure again from DIRAC.Osaka-CU.jp to PNNL from 2018-03-16 11:00 UTC)
    BIIDCO-290

DIRAC.PAU.in

  • "Pilot Submission Failure" has been observed since 2020-01-22 20:53 UTC (for 2 hours)

DIRAC.PNNL.us

  • Site to be decommissioned BIIDCO-919

DIRAC.PNNL2.us

  • Site to be decommissioned BIIDCO-920

DIRAC.PNNL-CASCADE.us

  • Seeing no jobs (no plot) is not a problem

DIRAC.PNNL-PIC.us

  • Seeing no jobs (no plot) is not a problem

DIRAC.RCNP.jp

  • "Pilot Submission Failure" has been observed since 2020-03-28 00:30 UTC (for 6 hours) BIIDCO-2336
  • BIIDCO-2327
  • "Failed Payload Job" has been observed since 2020-03-17 18:24 UTC (for 6 hours) 
  • "Failed Payload Job" has been observed since 2020-01-05 02:34 UTC (for 4 hours)
  • "Short Pilot" has been observed since 2020-01-04 02:34 UTC (for 4 hours)

DIRAC.LocalTest.jp

  • Health checker info. : "Short pilot jobs" has been found since 09:20:00 UTC on 2019/10/09

DIRAC.SSU.kr

  • Date, Issue, Tickets...

DIRAC.TIFR.in

  • "Failed Payload Job" has been observed since 2020-01-12 05:53 UTC (for 11 hours) BIIDCO-2235
  • "Short Pilot" has been observed since 2020-01-12 04:53 UTC (for 11 hours) BIIDCO-2234
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/10/06.
  • 2018/07/06. (details) BIIDCO-1132
  • Health checker info. : "Short pilot jobs" -- Already reported: BIIDCO-971
  • RunningLimit is set for MCProduction=1 BIIDCO-1006
  • Job stalled at input data resolution BIIDCO-714

DIRAC.TMU.jp

  • "Short Pilot" has been observed since 2020-01-24 02:53 UTC (for 28 hours)
  • "Short Pilot" has been observed since 2020-01-24 02:53 UTC (for 4 hours)
  • "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 1 hours)
  • "Pilot Submission Failure" has been observed since 2019-10-27 12:30 UTC (for 2 hours)

  • Health checker info. : "Failed to install DIRAC on " has been found since 14:20:00 UTC on 2019/04/24.
  • Job submission check : Pilot submission failure has been found at 13:27:00 UTC on 2019/03/19.
  • Health checker info. : "Short pilot jobs" has been found since 10:20:00 UTC on 2018/11/02 BIIDCO-1522

DIRAC.Tokyo.jp

  • Decommissioned
  • Date, Issue, Tickets...

DIRAC.UAS.mx

  • 2020-04-03 19:00 UTC  "Job Status Plot" shows 100% Job finished with errors
  • 2020-03-27 19:53 UTC  "Job Status Plot" shows 100% Job finished with errors
  • Health checker info. : "Belle II software could not be installed on " has been found since 15:20:00 UTC on 2019/04/25
  • Job submission check : Pilot submission failure has been found since 00:21:00 UTC on 2019/04/04. (details) BIIDCO-1772
  • Health checker info. : "Belle II software could not be installed on " has been found since 01:20:00 UTC on 2019/02/20.
  • Job submission check: 100% failed with errors from 22:00 2019/01/08 till 04:00 2019/01/09 (UTC)
  • Health checker info. : "Belle II software could not be installed on " has been found since 04:20:00 UTC on 2018/12/17. BIIDCO-1508
  • Health checker info. : "Belle II software could not be installed on " has been found since 16:20:00 UTC on 2018/11/14.
  • Job submission check : Pilot submission failure has been found since 01:26:00 UTC on 2018/09/21. (details)

DIRAC.UVic.ca

  • "Failed Payload Job" has been observed since 2020-01-05 00:34 UTC (for 6 hours)

DIRAC.UVic-local.ca

  • "Failed Payload Job" has been observed since 2020-01-05 00:34 UTC (for 6 hours)
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/10/06.
  • User jobs failed on the site: BIIDCO-1975
  • Job status check: "Input Data Resolution" issues (13% overall, 100% in past hours) on 2019/05/16 at 7:00 UTC.
  • Health checker info. : "Short pilot jobs" has been found since 04:20:00 UTC on 2019/05/16.(details)
  • Health checker info. : "Belle II software could not be installed on " has been found since 04:20:00 UTC on 2019/05/13.

DIRAC.Yamagata.jp

  • Health checker info. : "Short pilot jobs" has been found since 21:20:00 UTC on 2019/06/05.(details) BIIDCO-1861

  • Health checker info. : "Short pilot jobs" has been found at 22:20:00 UTC on 2019/03/13.(details) BIIDCO-1761

DIRAC.Yonsei.kr

  • "Short Pilot" has been observed since 2020-03-28 16:30 UTC (for 14 hours) BIIDCO-2334
  • "Short Pilot" has been observed since 2020-03-26 21:30 UTC (for 9 hours)
  • "Short Pilot" has been observed since 2020-01-05 05:34 UTC (for 1 hours)
  • "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 1 hours)
  • "Failed Payload Job" has been observed since 2019-12-30 21:34 UTC (for 1 hours)

DIRAC.LocalTest.jp

  • Date, Issue, Tickets...

LCG.CESNET.cz

  • "Failed Pilot" has been observed since 2020-03-29 22:30 UTC (for 10 hours) BIIDCO-2341
  • "Failed Payload Job" has been observed since 2020-01-04 21:34 UTC (for 10 hours)
  • "Failed Pilot" has been observed since 2019-12-24 16:28 UTC (for 6 hours)
  • Health checker info. : "Failed pilot jobs" has been found at 06:20:00 UTC on 2019/05/15.(details)
  • Job submission check : Pilot submission failure has been found at 06:26:00 UTC on 2019/05/15. (details)
  • Health checker info. : "Failed pilot jobs" has been found since 20:20:00 UTC on 2019/05/13.(details)
  • Need some intervention to run Merge jobs BIIDCO-771
  • "Pilot Submission Failure" has been observed since 2020-04-01 21:30 UTC (for 10 hours) BIIDCO-2341

LCG.COSENZA.IT

  • "Failed Payload Job" has been observed since 2020-01-05 04:34 UTC (for 3 hours)
  • "Short Pilot" has been observed since 2019-11-22 04:35 UTC (for 3 hours)
  • "Failed Payload Job" has been observed since 2019-11-21 21:35 UTC (for 10 hours)
  • "Short Pilot" has been observed since 2019-11-11 10:30 UTC (for 4 hours)

LCG.CNAF.it

  • "Failed Payload Job" has been observed since 2020-01-04 21:34 UTC (for 10 hours)
  • Health checker info. : "Short pilot jobs" has been found since 11:20:00 UTC on 2019/10/03.(details)

LCG.CYFRONET.pl

  • "Failed Payload Job" has been observed since 2020-03-20 13:24 UTC (for 1 hours)
  • "BLAH Error" has been observed since 2020-03-16 23:23:23 UTC (for 7 hours)
  • "Failed Payload Job" has been observed since 2019-12-23 10:28 UTC (for 12 hours)
  • "Short Pilot" has been observed since 2019-11-21 21:35 UTC (for 11 hours)
  • "Failed Payload Job" has been observed since 2019-11-21 21:35 UTC (for 11 hours)
  • "Failed Payload Job" has been observed since 2019-11-20 13:35 UTC (for 1 hours)

  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/10/06.
  • Job submission check : Pilot submission failure has been found since 14:23:00 UTC on 2019/07/31.
  • Health checker info. : "Short pilot jobs" has been found since 13:20:00 UTC on 2018/12/13. BIIDCO-1246

LCG.DESY.de

  • The site is to be retired (BIIDCO-1240). No more jobs are to be submitted.

LCG.Frascati.it

  • Site has been banned due to a hardware problem since 2019-07-05

  • Job submission check : Pilot submission failure has been found since 14:24:00 UTC on 2019/05/24. BIIDCO-1882
    • GGUS 141688 ticket submitted.
  • Health checker info. : "BLAH ERROR" has been found since 15:20:00 UTC on 2019/05/21.(details)

LCG.HEPHY.at

  • "Failed Payload Job" has been observed since 2020-01-05 05:34 UTC (for 2 hours)
  • Health checker info. : "Failed pilot jobs" has been found at 13:20:00 UTC on 2019/10/03.(details)
  • Health checker info. : "Failed pilot jobs" has been found at 15:20:00 UTC on 2019/05/22.(details)
  • Health checker info. : "Short pilot jobs" has been found at 15:20:00 UTC on 2019/04/12.
  • Health checker info. : "Failed pilot jobs" has been found at 02:20:00 UTC on 2019/01/30.(details) and at 02:20:00 UTC on 2019/01/31.(details)
  • Job submission check : Pilot submission failure has been found at 14:22:00 UTC on 2018/12/27.

LCG.IPHC.fr

  • "Failed Payload Job" has been observed since 2020-01-04 20:34 UTC (for 11 hours)

LCG.KEK.jp

  • Job status: Large number of jobs finished with errors (61.0%) in last 24 hour period, from approx. 2020-01-01 00:00 - 02:00 UTC
  • Health checker info. : "Short pilot jobs" has been found since 05:20:00 UTC on 2019/10/09

  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/10/06.
  • SiteDirector "Failed to check the availability" BIIDCO-1934

LCG.KEK2.jp

  • Health checker info. : "Short pilot jobs" has been found at 16:20:00 UTC on 2019/10/09.

  • All jobs were still failing with InputDataResolution on 2019/07/25. BIIDCO-1542
  • GGUS ticket : "KEK SE: PrepareToGet ETIMEDOUT for a specific file path"(140328) has been submitted at 21:26:29 UTC on 2019/03/21.
  • Health checker info. : "Short pilot jobs" has been found at 11:20:00 UTC on 2019/03/22.(details) BIIDCO-1741
  • Health checker info. : "Failed pilot jobs" has been found since 13:20:00 UTC on 2018/12/21. BIIDCO-1559
  • All jobs have been in "Input data resolution" status since 12:00 UTC on 2018/12/18. BIIDCO-1542

LCG.KEK-merge.jp

  • Health checker info. : "Short pilot jobs" has been found at 00:20:00 UTC on 2019/08/26. BIIDCO-1978
  • Most jobs failing with InputDataResolution BIIDCO-1776 BIIDCO-1777
  • "Belle II software could not be installed on cb268.cc.kek.jp" has been found since 14:20:00 UTC on 2019/04/05
  • Health checker info. : "Short pilot jobs" has been found since 20:20:00 UTC on 2019/04/02
  •   being commissioned...

LCG.KISTI.kr

  • "Short Pilot" has been observed since 2020-03-04 22:05 UTC (for 8 hours) BIIDCO-2311
  • Health checker info. : "BLAH ERROR" has been found since 06:20:00 UTC on 2018/10/19.(details)

  • "Short pilot jobs" has been found at 06:20:00 UTC on 2018/10/09.(details)
  • BLAH error seems to happen if jobs exceed the allocated number of queues; not a problem (site-specific feature) BIIDCO-1259
  • A large number of Merge jobs in waiting status BIIDCO-773

LCG.KMI.jp

  • "Failed Payload Job" has been observed since 2020-01-04 22:34 UTC (for 9 hours)
  • "Failed Payload Job" has been observed since 2019-11-19 12:35 UTC (for 2 hours)
  • Health checker info. : "Belle II software could not be installed on pwn22.local" has been found since 21:20:00 UTC on 2018/11/22.
  • Job submission check : Pilot submission failure has been found since 21:24:00 UTC on 2018/10/02. (details)

LCG.LAL.fr

  • "Pilot Submission Failure" has been observed since 2020-02-17 21:59 UTC (for 1 hours) 2020-01-02 13:20 UTC
  • "Failed Payload Job" has been observed since 2019-11-18 05:35 UTC (for 2 hours)
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2019/05/01.(details)

LCG.Legnaro.it

  • Date, Issue, Tickets...

LCG.Napoli.it

  • "Pilot Submission Failure" has been observed since 2020-03-16 22:24 UTC (for 1 hours) (details).
  • "Failed Payload Job" has been observed since 2020-03-05 01:05 UTC (for 5 hours)
  • "Failed Payload Job" has been observed since 2020-01-04 20:34 UTC (for 11 hours)
  • "Failed Payload Job" has been observed since 2020-01-04 02:34 UTC (for 4 hours)
  • "Pilot Submission Failure" has been observed since 2019-11-16 06:35 UTC (for 32 hours)
  • Health checker info. : "Failed pilot jobs" has been found at 14:20:00 UTC on 2019/10/06.
  • Job submission check : Pilot submission failure has been found since 12:27:00 UTC on 2019/10/02.

  • t2-recas-ce01.na.infn.it shows a pilot submission error; this CE should be banned until September 2019.

  • Stalled jobs BIIDCO-1255

LCG.NTU.tw


BIIDCO-2339

  • "Short Pilot" has been observed since 2020-03-28 14:30 UTC (for 1 hours)
  • "Belle II software could not be installed" for "belle2grid3.cc.ntu.edu.tw" has been observed since 2020-03-15 06:23:22 UTC (for 2 hours)
  • "Belle II software could not be installed" for "belle2grid3.cc.ntu.edu.tw" has been observed since 2020-03-13 06:23:20 UTC (for 2 hours)
  • "CRL has expired" for "node39-0" has been observed since 2020-03-05 06:23:15 UTC (for 1 hours)
  • "Pilot Submission Failure" has been observed since 2020-03-03 13:05 UTC (for 17 hours)   
  • "Failed Payload Job" has been observed since 2020-01-05 01:34 UTC (for 6 hours)
  • "Failed Payload Job" has been observed since 2019-12-23 21:28 UTC (for 1 hours)
  • "Failed Payload Job" has been observed since 2019-12-10 05:28 UTC (for 1 hours)

LCG.Pisa.it

  • "Short pilot jobs" has been found since 02:20:00 UTC on 2018/09/21.(details) BIIDCO-1157
  • "Pilot Submission Failure" has been observed since 2020-03-30 21:30 UTC (for 58 hours) BIIDCO-1791

LCG.Roma3.it

  • Health checker info. : "Failed pilot jobs" has been found at 14:20:00 UTC on 2019/04/20. BIIDCO-1538

LCG.TAU.il

  • "Failed Payload Job" has been observed since 2020-01-05 00:34 UTC (for 7 hours)
  • Health checker info. : "Failed pilot jobs" has been found since 19:20:00 UTC on 2019/05/24.(details)

LCG.Torino.it

  • "Pilot Submission Failure" has been observed since 2020-04-08 02:30 UTC (for 4 hours) BIIDCO-2215
  • "Pilot Submission Failure" has been observed since 2020-04-04 05:30 UTC (for 9 hours) BIIDCO-2215
  • "Pilot Submission Failure" has been observed since 2020-04-02 16:30 UTC (for 22 hours) BIIDCO-2215
  • "Pilot Submission Failure" has been observed since 2020-04-02 13:30 UTC (for 1 hours) BIIDCO-2215
  • "BLAH Error" has been observed since 2020-02-09 05:53:05 UTC (for 9 hours) BIIDCO-2279
  • "BLAH Error" has been observed since 2020-02-08 14:53:04 UTC (for 1 hours)
  • "Pilot Submission Failure" has been observed since 2019-12-28 18:34 UTC BIIDCO-2215
  • LCG.Torino.it: Downtime from 2020-03-25 00:00 to 2020-03-30 23:00 (UTC) BIIDCO-2329

LCG.ULAKBIM.tr

  • The queue 'belle7' is to be disabled; use only 'belle'. BIIDCO-1896
  • Health checker info. : "Aborted pilot jobs" has been found since 01:20:00 UTC on 2019/08/01.

OSG.BNL.us

  • Pilot submission failure observed since 2020-01-28 01:53 UTC
  • "Failed Payload Job" has been observed since 2020-01-04 20:34 UTC (for 11 hours)
  • "Failed Payload Job" has been observed since 2019-12-20 14:28 UTC BIIDCO-2195
    GGUS ticket https://ggus.eu/index.php?mode=ticket_info&ticket_id=144665 has been submitted; solved and verified.
  • "Belle II software could not be installed" for "bgk01.sdcc.bnl.gov" has been observed since 2019-12-18 15:27:52 UTC BIIDCO-2194
  • "Pilot Submission Failure" has been observed since 2019-12-05 05:28 UTC 
  • Health checker info. : "Belle II software could not be installed on " has been found since 19:20:00 UTC on 2019/02/14.
  • Job submission check: Jobs failed with errors or input data resolution in the last 24 hours (6:00 UTC, 2019/01/09) BIIDCO-1596
  • Production jobs: UNAVAILABLE files BIIDCO-1302
  • Number of concurrent MCProduction jobs restricted BIIDCO-1256
  • MCProduction jobs are mostly stalled BIIDCO-1253

OSG.CORI.us

  • OSG.CORI.us resource has been removed because CY18 allocation was not approved

OSG.UMiss.us

  • Health checker info. : "Short pilot jobs" has been found at 23:20:00 UTC on 2019/06/03.(details) BIIDCO-1863
  • Health checker info. : "Aborted pilot jobs" has been found at 22:20:00 UTC on 2019/06/02. BIIDCO-1856
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/05/20.(details)
  • Health checker info. : "Short pilot jobs" has been found at 07:20:00 UTC on 2019/05/14.(details)
  • Health checker info. : "Short pilot jobs" has been found since 22:20:00 UTC on 2019/05/12.(details)
    Updated BIIDCO-1768
  • Job submission check : Pilot submission failure has been found since 12:27:00 UTC on 2019/05/04. (details)
  • Health checker info. : "Short pilot jobs" has been found since 07:20:00 UTC on 2019/05/11.(details)
  • Health checker info. : "Short pilot jobs" has been found since 14:20:00 UTC on 2019/04/11 and  at 17:20:00 UTC on 2019/04/14.
  • Job status check: 34.7% appl. finished with errors on 2019/04/08.

SSH.KMI.jp

  • "Pilot Submission Failure" has been observed since 2020-04-04 13:30 UTC (for 1 hours)
  • "Pilot Submission Failure" has been observed since 2020-03-21 13:24 UTC (for 1 hours)
  • "Pilot Submission Failure" has been observed since 2020-03-15 05:24 UTC (for 1 hours)
  • Job status plot: input data resolution problems (for 7 hours) since 2019-12-24 00:00 UTC, approximately.
  • "Short Pilot" has been observed since 2019-12-24 05:28 UTC (for 1 hours)
  • Job status check: Application finished with errors (12% of the jobs in last 24 hours) on 2018/12/22 at 11:30 UTC.
  • Health checker info. : "Short pilot jobs" has been found at 20:20:00 UTC on 2018/08/13.

Test.KIT.de

  • "Failed Pilot" has been observed since 2020-01-19 21:53 UTC (for 1 hours)
  • "Aborted Pilot" has been observed since 2020-01-16 21:53 UTC (for 1 hours)
  • "Failed Payload Job" has been observed since 2020-01-04 20:34 UTC (for 11 hours)
  • "Pilot Submission Failure" has been observed since 2019-12-05 05:28 UTC 
  • "Aborted Pilot" has been observed since 2019-11-23 05:35 UTC (for 1 hours)
  • Test site for the opportunistic resources at KIT. No need to report problems.
  • 2020-04-01, downtime, BIIDCO-2345

Test.ULAKBIM.tr

  • Test site for the SL7 resources at ULAKBIM. No need to report problems.
  • No activities expected currently.

VCYCLE.Napoli.it

  • "Failed Payload Job" has been observed since 2020-01-05 04:34 UTC (for 3 hours)
  • "Failed Payload Job" has been observed since 2019-11-21 20:35 UTC (for 12 hours)
  • "Failed Payload Job" has been observed since 2019-11-29 00:35 UTC (for 6 hours) BIIDCO-2143

  • "Failed Payload Job" has been observed since 2019-11-20 13:35 UTC (for 1 hours)

  • Opportunistic site (Empty plot is not a problem)
  • Ban lifted BIIDCO-1613
  • "Sudo CE Error: sudo execution fails with return code 1" BIIDCO-1612

VCYCLE.HNSC01.it, VCYCLE.HNSC02.it

  • Opportunistic site (Empty plot is not a problem)


Links

