
Production Plans

Data Production Status

  • Raw data processing
    • Bucket 15: completed
    • BGOverlay productions: completed
    • Bucket 9-14 (prompt processing of 2020a/b): completed
  • MC13 production

    • MC13a (Run-independent MC) production: Y(nS) productions ongoing

    • MC13b (Run-dependent MC) production: ongoing
  • Skim
    • SkimP11x1 (Proc11 skims): ongoing
    • SkimB11x1 (Bucket11 skims): ongoing
  • The MC14 campaign will begin soon with release-05 (early-mid November)

Production Status

Resources are fully used, but the number of running jobs sometimes decreases.

Data production summary page: Data Production Status

Data (re)processing: No raw data processing jobs expected now.

  • Bucket 15 is completed

  • BGOverlay completed

MC production:

  • An additional production for 1/ab of nominal phase 3 generic MC13a is ongoing.

Analysis skimming:

  • Skims of the 2020a-b dataset are currently in progress.


Central Services

Dirac (dirac.cc.kek.jp, b2dchsv01-b2dchsv06.cc.kek.jp, b2dchsv08.cc.kek.jp)

  • Date, Issue, Tickets...

DB Production (b2dchdb1.cc.kek.jp, b2dchdb2.cc.kek.jp, b2dcsdb1.cc.kek.jp, b2dcsdb2.cc.kek.jp)

  • b2dcdb05-07.cc.kek.jp are under construction and the status can be ignored (07 Nov 2020)
  • 2020/11/07: "DB Production" servers b2dcdb06.cc.kek.jp, b2dcdb05.cc.kek.jp: blue band has rapidly increased by more than 5 GB BIIDCO-2819
  • 2020/11/07: "Web" servers b2dcwvm01.cc.kek.jp: grey part has rapidly increased to more than twice the red line. BIIDCO-2819

"Web" servers

  • b2dcwvm01.cc.kek.jp load has increased a lot and is staying around the red line. BIIDCO-2888

DDM (bldirac01.sdcc.bnl.gov)

  • DDM is stuck BIIDCO-2754 - Getting issue details... STATUS
  •  DDM is not responding to API calls BIIDCO-2748 - Getting issue details... STATUS
  • 2018-03-01 DDM deletion task seems stuck BIIDCO-808 - Getting issue details... STATUS

Conditions DB ()

Monitor

  • lfc-ls segmentation fault in SiteCrawler on EL7 sites BIIDCO-2775 - Getting issue details... STATUS

AMGA

  • Date, Issue, Tickets...

LFC

  • Date, Issue, Tickets...
  • BIIDCO-2668 - Getting issue details... STATUS

CVMFS

  • problem in replication from cvmfs-stratum-zero.cc.kek.jp/cvmfs/belle.kek.jp BIIDCO-2788 - Getting issue details... STATUS

File Transfers and Replication Status

  • 2020-04-03 00:15 UTC File Transfer failures: Pisa-DATA-SE  BIIDCO-2349 - Getting issue details... STATUS  

FTS

Any problems in the FTS service or FTS monitoring are to be recorded here. Site/SE-specific issues are to be recorded under each site/SE.

Replication Status

  • Date, Issue, Tickets...

Job Status Plot

  • Many sites have similar red peak: BIIDCO-2852 - Getting issue details... STATUS
  • Most of the sites have similar red peak BIIDCO-2850 - Getting issue details... STATUS
  • small amount of jobs which "finished with errors" at many sites simultaneously 2020-11-04 BIIDCO-2810 - Getting issue details... STATUS
  • All plots have a lot of "finished with errors" red histograms.  BIIDCO-2799 - Getting issue details... STATUS
  • No job status plots for 38 sites while MC13 production is ongoing 2020-10-17 BIIDCO-2277 - Getting issue details... STATUS
  • No job status plots for 13 sites while MC13a production is ongoing 2020-05-18 BIIDCO-2277 - Getting issue details... STATUS
  • No job status plots for 15 sites while MC13a production is ongoing 2020-02-07 BIIDCO-2277 - Getting issue details... STATUS

Job Summary

  • Date, Issue, Tickets...



SEs

SE Common Issues

  • Issues with individual SEs should be recorded below (Primary SEs or Other SEs)

Raw data SEs: 

Raw data SE: KEK-TMP-SE (srm://kek2-se02.cc.kek.jp:8444/srm/managerv2?SFN=/belle/TMP)

Raw data SE: KEK-RAW-SE (srm://kek2-se02.cc.kek.jp:8444/srm/managerv2?SFN=/belle/RAW)

  • Many replications from KEK-RAW-SE to BNL-TAPE-SE failed at 11:30 on 25 May 2020 BIIDCO-2445

Raw data SE: KEK-TAPE-SE (srm://kek2-se02.cc.kek.jp)

  • 2020-10-13: Slow FTS transfers -  BIIDCO-2762 - Getting issue details... STATUS

Raw data SE: BNL-TAPE-SE (srm://dcblsrm.sdcc.bnl.gov:8443/srm/managerv2?SFN=/pnfs/sdcc.bnl.gov/tape
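
The SE endpoints above are written as SRM URLs with the storage path carried in the `SFN` query parameter. As a minimal sketch (the helper name `parse_srm_endpoint` and the returned keys are illustrative assumptions, not part of any Belle II or DIRAC tool), such an endpoint can be split into its parts with the Python standard library when cross-checking entries against monitoring output:

```python
from urllib.parse import urlsplit, parse_qs


def parse_srm_endpoint(url):
    """Split an SRM endpoint URL into host, port, service path and SFN path.

    Hypothetical helper for illustration only. Missing parts are returned as
    None (port, SFN) or an empty string (service path).
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port,
        "service_path": parts.path,
        # parse_qs returns a list per key; take the first SFN value if present
        "sfn": query.get("SFN", [None])[0],
    }


if __name__ == "__main__":
    print(parse_srm_endpoint(
        "srm://kek2-se02.cc.kek.jp:8444/srm/managerv2?SFN=/belle/RAW"
    ))
```

Entries that give only a host, such as the KEK-TAPE-SE line, come back with `port` and `sfn` set to `None`.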


Primary SEs

Primary SE: BNL-TMP-SE (dcblsrm.sdcc.bnl.gov)

  • SE Health Check by DDM: Failures on remove file and remove directory have been observed since 2020-12-01 05:11:38 (3 hours) BIIDCO-2839
  • BNL is scheduled to have a network outage on 2020-12-01 06:00 - 18:00 EST BIIDCO-2839

  • No Replication Trend Plot for BNL-TMP-SE 2020-01-02 09:30 UTC  BIIDCO-2220 - Getting issue details... STATUS

Primary SE: CESNET-TMP-SE (dpm1.egee.cesnet.cz) 

  • Date, Issue, Tickets...

Primary SE: CNAF-TMP-SE (storm-fe-archive.cr.cnaf.infn.it)

  • 2019/01/27 File transfer failures from CNAF-TMP-SE to NTUCC-DATA-SE. BIIDCO-1637 - Getting issue details... STATUS

Primary SE: DESY-TMP-SE (dcache-se-desy.desy.de)

Primary SE: KEK-DISK-TMP-SE (srm://kek2-se03.cc.kek.jp:8444/srm/managerv2?SFN=/disk/belle/TMP)

  •  KEK-DISK-TMP-SE - files unavailable BIIDCO-2697 - Getting issue details... STATUS
  • Replication failures have been observed since 2020-09-10 10:00 BIIDCO-2694 - Getting issue details... STATUS
  • SE Health Check by DDM: Failures on remove file and remove directory have been observed since 2020-08-28 07:01:17 (119 hours) BIIDCO-2661

Primary SE: KIT-TMP-SE (dcachesrm-kit.gridka.de)

  • Date, Issue, Tickets...

Primary SE: KMI-TMP-SE (nsrmfe01.hepl.phys.nagoya-u.ac.jp )

  • SE Health Check by DDM: Failures on ls and upload have been observed since 2020-11-14 07:07:37 (6 hours) BIIDCO-2831
  • SE Health Check by DDM: Failures on ls and upload have been observed since 2020-09-02 02:55:52 (3 hours) BIIDCO-2662
  • 2020-01-02 09:15 UTC File transfer failure from KMI-TMP-SE to LAL-DATA-SE BIIDCO-2219 - Getting issue details... STATUS

Primary SE: Napoli-TMP-SE (belle-dpm-01.na.infn.it )

  • Failed transfers from Napoli-TMP-SE: BIIDCO-2849
    Downtime related to cooling system failure: BIIDCO-2846
  • SE Health Check by DDM: Failures on upload have been observed since 2020-09-30 04:53:07 (4 hours) BIIDCO-2663
  • SE Health Check by DDM: Failures on upload have been observed since 2020-09-02 02:25:58 (4 hours) BIIDCO-2663

Primary SE: SIGNET-TMP-SE (dcache.ijs.si )

  • Data transfer errors  BIIDCO-2807 - Getting issue details... STATUS

Other SEs

Adelaide-TMP-SE (coepp-dpm-01.ersa.edu.au)

  •  Adelaide SE is banned BIIDCO-2184 - Getting issue details... STATUS

CYFRONET-TMP-SE (dpm.cyf-kr.edu.pl)

  • Date, Issue, Tickets...
  • CYFRONET is banned for write  BIIDCO-2392 - Getting issue details... STATUS
  • CYFRONET SE to be replaced BIIDCO-2391 - Getting issue details... STATUS

CINVESTAV-TMP-SE (jaguar-se.fis.cinvestav.mx)

Frascati-TMP-SE (atlasse.lnf.infn.it)

  • Date, Issue, Tickets...

HEPHY-TMP-SE (hephyse.oeaw.ac.at)

  • Date, Issue, Tickets... 

  •   The new EOS SE put in production BIIDCO-2564 - Getting issue details... STATUS
  •   HEPHY-TMP-SE banned BIIDCO-2393 - Getting issue details... STATUS

IPHC-TMP-SE (sbgse1.in2p3.fr)

  • Date, Issue, Tickets...

KEK2-TMP-SE (srm://kek2-se03.cc.kek.jp:8444/srm/managerv2?SFN=/belle/TMP)

  • Date, Issue, Tickets...
  • banned for read and write BIIDCO-2630 - Getting issue details... STATUS

KISTI-TMP-SE (belle-se-head.sdfarm.kr)

  • Transfer Destination and Replication Problems BIIDCO-2777 - Getting issue details... STATUS
  • No new assignment of MC production data blocks to this destination BIIDCO-848 - Getting issue details... STATUS

LAL-TMP-SE (grid05.lal.in2p3.fr)

  • Date, Issue, Tickets...

Melbourne-TMP-SE (b2se.mel.coepp.org.au)

McGill-TMP-SE  (storm02.clumeq.mcgill.ca)

  • McGill-TMP-SE will be decommissioned in early 2018. BIIDCO-516

MPPMU-TMP-SE (grid-srm.rzg.mpg.de)

  • Date, Issue, Tickets...

NTU-TMP-SE (bgrid3.phys.ntu.edu.tw)

  •  NTU-TMP-SE banned for write  BIIDCO-1993 - Getting issue details... STATUS

NTU-CC-DATA-SE

  • 2020/11/08 File transfer failure to NTU-CC-DATA-SE  BIIDCO-1977 - Getting issue details... STATUS

NTU-CC-TMP-SE (belle2grid3.cc.ntu.edu.tw)

  • Low efficiency for source BIIDCO-2805 - Getting issue details... STATUS
  • 2019/8/23 file transfer failure to NTU-CC-DATA-SE BIIDCO-1977 - Getting issue details... STATUS
  • 2019/01/27 File transfer failures from CNAF-TMP-SE to NTUCC-DATA-SE. BIIDCO-1637 - Getting issue details... STATUS   BIIDCO-1892 - Getting issue details... STATUS
  • File transfer failure and cancellation to NTUCC-DATA-SE happened 2018-12-22 BIIDCO-1551 - Getting issue details... STATUS
  • NTUCC-TMP-SE banned for write  BIIDCO-1333 - Getting issue details... STATUS

Pisa-TMP-SE (stormfe1.pi.infn.it)

  • Date, Issue, Tickets...

PNNL-TMP-SE (se.hep.pnnl.gov) 

  • Being decommissioned. No need to report any issues.  BIIDCO-838 - Getting issue details... STATUS

Roma3-TMP-SE (storm-01.roma3.infn.it)

  •  Date, Issue, Tickets...

TAU-TMP-SE (tau-se.hep.tau.ac.il)

  • Date, Issue, Tickets...

Torino-TMP-SE (se-srm-00.to.infn.it)

  • Date, Issue, Tickets...

ULAKBIM-TMP-SE (torik1.ulakbim.gov.tr)

  • File transfer failures destination  BIIDCO-2253 - Getting issue details... STATUS

UMiss-TMP-SE (umiss005.hep.olemiss.edu)

  • Date, Issue, Tickets...

UVic-TMP-SE (charon01.westgrid.ca)

  • Date, Issue, Tickets...

Sites

Sites Common Issue

  • Date and issue for site-wide problems

ARC.DESY.de

ARC.DESY-test.de

  • A test queue for the new CE. BIIDCO-1469 - Getting issue details... STATUS

ARC.KIT.de

  • Downtime from 2020-09-22 08:00 (UTC) to 2020-10-29 22:00 (UTC) BIIDCO-2710 - Getting issue details... STATUS
  • Downtime from 2020-09-02 06:00 to 2020-09-23 14:00 (UTC) BIIDCO-2660 - Getting issue details... STATUS
  • Downtime 2020-07-03 14:00 - 2020-07-07 14:00  BIIDCO-2490 - Getting issue details... STATUS
  •   MCProduction restricted BIIDCO-2504 - Getting issue details... STATUS
  • Downtime 2020-06-16 07:00 - 2020-06-16 10:00 BIIDCO-2490 - Getting issue details... STATUS

  • "Short Pilot" has been observed since 2020-06-02 22:47 UTC . BIIDCO-2459 - Getting issue details... STATUS

ARC.KIT-TARDIS.de

  • Downtime from 2020-09-22 16:00 (UTC) to 2020-10-31 20:00 (UTC) BIIDCO-2710
  • Downtime from 2020-09-02 06:00 to 2020-09-23 14:00 (UTC) BIIDCO-2660 - Getting issue details... STATUS
  • renamed from Test.KIT.de BIIDCO-2323 - Getting issue details... STATUS

ARC.LMU.de

  • This is a test site. No need to report any issues.

ARC.LMU2.de

  • Banned, as there is currently no resource behind the CE BIIDCO-239

ARC.Melbourne.au

  • "Failed Payload Job" has been observed since 2020-11-24 00:48 UTC (for 22 hours) BIIDCO-2399
  • "Pilot Submission Failure" has been observed since 2020-09-02 08:30 UTC (for 22 hours). Jira ticket submitted: BIIDCO-2678

ARC.MPPMU.de

  • "Pilot Submission Failure" has been observed since 2020-11-09 01:40 UTC (for 5 hours) BIIDCO-2711 - Getting issue details... STATUS
  • Downtime 2020-11-04 19:00 (UTC) - 2020-11-06 14:30 (UTC) BIIDCO-2811 - Getting issue details... STATUS
  • Downtime 2020-07-28 00:00 - 2020-07-29 00:00 BIIDCO-2585 - Getting issue details... STATUS

  • Downtime 2020-06-16 11:00 - 2020-06-16 17:00 BIIDCO-2491 - Getting issue details... STATUS

  • BIIDCO-128 - Getting issue details... STATUS

ARC.SIGNET.si

  • Downtime 2020-11-26 UTC 14:20 to 2020-11-26 20:00 BIIDCO-2860 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-11-23 03:48 UTC (for 11 hours) BIIDCO-2854
  • "Short Pilot" has been observed since 2020-10-14 22:31 UTC (for 8 hours) BIIDCO-2724 - Getting issue details... STATUS
  • "Aborted Pilot" has been observed since 2020-08-14 14:35 UTC (for 10 hours) BIIDCO-1055 - Getting issue details... STATUS
  • "Failed to install DIRAC on " has been found since 20:20:00 UTC on 2018/11/03. BIIDCO-1420 - Getting issue details... STATUS

CLOUD.CC1_Krakow.pl

  • Not used in production yet. Seeing no jobs (no plot) is not a problem

CLOUD.DESY.de

  • Downtime: CLOUD.DESY.de and ARC.DESY.de 2020-11-24 UTC 07:00 to 2020-11-26 18:00  BIIDCO-2857 - Getting issue details... STATUS

DIRAC.Beihang.cn

  • Site is banned.
  • "Failed Payload Job" has been observed since 2019-04-19 11:15 UTC   BIIDCO-1812 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2019/04/18. BIIDCO-1807 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2018/12/08. BIIDCO-1534 - Getting issue details... STATUS
  • Job submission check : Pilot submission failure has been found since 09:24:00 UTC on 2018/09/21. (details) BIIDCO-1312 - Getting issue details... STATUS
  • The number of jobs limited. BIIDCO-289 - Getting issue details... STATUS
  • All the upload trials are failing against all the SEs configured: OutputSE (KMI-TMP-SE, PNNL-TMP-SE), Fail-over SEs(DESY-TMP-SE, Napoli-TMP-SE, PNNL-TMP-SE, KIT-TMP-SE)  BIIDCO-43 - Getting issue details... STATUS
  • Large % of failed jobs in DIRAC status plot (Added 2016-11-03 22:45:00 UTC)  BIIDCO-38 - Getting issue details... STATUS

DIRAC.BINP.ru

  • "Failed Payload Job" has been observed since 2020-10-27 08:40 UTC (for 6 hours)  BIIDCO-2672 - Getting issue details... STATUS
  • "Short Pilot" has been observed since 2020-09-02 15:30 UTC (for 15 hours) BIIDCO-2672 - Getting issue details... STATUS

DIRAC.BINP-VM.ru

  • "Failed Payload Job" has been observed since 2020-10-27 14:40 UTC (for 1 hours). BIIDCO-749 - Getting issue details... STATUS

  • Job status plots, "Application Finished with errors" 2020-04-21 10:00 to 2020-04-22 10:00 UTC BIIDCO-749 - Getting issue details... STATUS

DIRAC.CINVESTAV.mx

  • Date, Issue, Tickets...

DIRAC.DESY.de

  • Test site. Not in use in MC production

DIRAC.IITG.in

  • "Aborted Pilot" has been observed since 2020-10-24 01:38 UTC (for 53 hours). Also since 2020-10-08 12:31 UTC (for 104 hours) and since 2020-03-19 13:24 UTC (for 1 hours)  BIIDCO-2070 - Getting issue details... STATUS
  • "Pilot Submission Failure" has been observed since 2020-09-01 17:30 UTC BIIDCO-2681 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-10-06 08:31 UTC (for 46 hours) BIIDCO-2513 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2019/05/16. BIIDCO-1686 - Getting issue details... STATUS BIIDCO-1768 - Getting issue details... STATUS
  • Job status plots, 100% "Application Finished With Errors", 10:00:00 UTC on 2019/04/08. Still unchanged as of 2019/04/26.  BIIDCO-1823 - Getting issue details... STATUS

DIRAC.IITH.in

  • "Pilot Submission Failure" has been observed since 2020-02-17 15:59 UTC (for 7 hours)

  • Health checker info.: "Short pilot jobs" has been found at 07:20:00 UTC on 2019/03/29. (details) BIIDCO-1768

DIRAC.LMU.de

  • Not in use in MC production BIIDCO-26 - Getting issue details... STATUS
  • Banned for now.

DIRAC.MIPT.ru

  • Health checker info. : "Aborted pilot jobs" has been found since 13:20:00 UTC on 2019/04/20. BIIDCO-1816 - Getting issue details... STATUS

DIRAC.Nagoya.jp

  • "Short Pilot" has been observed since 2020-11-24 01:48 UTC (for 21 hours). Also since 2020-10-28 13:40 UTC (for 1 hours). BIIDCO-2467 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-10-05 11:31 UTC (for 3 hours) BIIDCO-2749 - Getting issue details... STATUS

DIRAC.Nara-WU.jp

  • "Pilot Submission Failure" has been observed since 2020-06-12 09:47 UTC (for 5 hours) BIIDCO-2476 - Getting issue details... STATUS
  • Under commissioning from 2018-11-13 BIIDCO-1432 - Getting issue details... STATUS

DIRAC.NDU.jp

  • Date, Issue, Tickets...

DIRAC.Niigata.jp

  • "Failed Payload Job" has been observed since 2020-10-27 06:40 UTC (for 8 hours). BIIDCO-2752 - Getting issue details... STATUS

DIRAC.Niigata2.jp

  • "Failed Payload Job" has been observed since 2020-10-27 10:40 UTC (for 4 hours). BIIDCO-2665 - Getting issue details... STATUS BIIDCO-2348 - Getting issue details... STATUS
  • "Pilot Submission Failure" has been observed since 2020-10-04 11:31 UTC (for 19 hours). Also since 2020-10-04 11:31 UTC (for 3 hours) BIIDCO-2746 - Getting issue details... STATUS
  • "Application Finished with Errors" with 38.7% from 2020-04-21 10:00 UTC to 2020-04-22 02:00 UTC  BIIDCO-2381 - Getting issue details... STATUS

DIRAC.Osaka-CU.jp

  • Site is banned
  • Job submission check : Pilot submission failure has been found since 07:23:00 UTC on 2018/12/04. (details) BIIDCO-1434 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found since 22:20:00 UTC on 2018/03/17.
    → Ask site admin to check the status 2018-03-17 10:00 JST. (DB access failure again from DIRAC.Osaka-CU.jp to PNNL from 2018-03-16 11:00 UTC)
    BIIDCO-290 - Getting issue details... STATUS

DIRAC.PAU.in

  • Date, Issue, Tickets...

DIRAC.PNNL.us

  • Site to be decommissioned BIIDCO-919 - Getting issue details... STATUS

DIRAC.PNNL2.us

  • Site to be decommissioned BIIDCO-920 - Getting issue details... STATUS

DIRAC.PNNL-CASCADE.us

  • Seeing no jobs (no plot) is not a problem

DIRAC.PNNL-PIC.us

  • Seeing no jobs (no plot) is not a problem

DIRAC.RCNP.jp

  • "Short Pilot" has been observed since 2020-11-09 12:40 UTC (for 42 hours). Also since 2020-11-21 08:48 UTC (for 22 hours). BIIDCO-2814
  • "Not Enough Disk Space" for "fpb22,fpb23" has been observed since 2020-08-28 23:19:23 UTC (for 19 hours) BIIDCO-2642 - Getting issue details... STATUS
  • High failure jobs since 2020-08-24 (after migration) BIIDCO-2636 - Getting issue details... STATUS

DIRAC.LocalTest.jp

  • Downtime 2020-11-30 00:00 - 2020-12-07 00:00 (UTC) BIIDCO-2886 - Getting issue details... STATUS
  • Downtime from 2020-08-28 06:00 to 2020-09-01 04:00 (UTC)
    Downtime 2020-08-17 06:00 to 2020-08-31 06:00 (UTC) BIIDCO-2639

DIRAC.Shandong.cn

  • "Aborted Pilot" has been observed since 2020-10-05 13:31 UTC (for 1 hours) BIIDCO-2747 - Getting issue details... STATUS
  • "Aborted Pilot" has been observed since 2020-10-04 13:31 UTC (for 1 hours) BIIDCO-2747 - Getting issue details... STATUS

DIRAC.SSU.kr

  • "Short Pilot" has been observed since 2020-10-17 02:31 UTC BIIDCO-2548
  • "Pilot Submission Failure" has been observed since 2020-08-14 08:35 UTC (for 62 hours) BIIDCO-2611 - Getting issue details... STATUS
  • DIRAC.SSU.kr "Failed Payload Job" has been observed since 2020-06-08 02:47 UTC.  BIIDCO-2470 - Getting issue details... STATUS

DIRAC.TIFR.in

  • "Pilot Submission Failure" has been observed since 2020-11-30 18:48 UTC (for 12 hours) BIIDCO-2536 - Getting issue details... STATUS

  • "Pilot Submission Failure" has been observed since 2020-11-14 03:41 UTC (for 10 hours) BIIDCO-2536 - Getting issue details... STATUS
  • "Pilot Submission Failure" has been observed since 2020-10-19 21:31 UTC (for 217 hours) BIIDCO-2536 - Getting issue details... STATUS
  • There is a hardware failure at this site - hardware replacement has been delayed by COVID.  BIIDCO-2536 - Getting issue details... STATUS
  • TIFR site is down due to hardware failure  BIIDCO-2536 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2019/10/06.
  • 2018/07/06. (details) BIIDCO-1132 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" -- Already reported:  BIIDCO-971 - Getting issue details... STATUS
  •  RunningLimit is set for MCProduction=1 BIIDCO-1006 - Getting issue details... STATUS

DIRAC.TMU.jp

  • "Failed Payload Job" has been observed since 2020-08-28 01:18 UTC (for 21 hours) BIIDCO-2635 - Getting issue details... STATUS
  • High failure jobs since 2020-08-24 (after migration) BIIDCO-2635 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found since 10:20:00 UTC on 2018/11/02 BIIDCO-1522 - Getting issue details... STATUS

DIRAC.Tokyo.jp

  • Decommissioned
  • Date, Issue, Tickets...

DIRAC.UAS.mx

  • Job submission check : Pilot submission failure has been found since 00:21:00 UTC on 2019/04/04. BIIDCO-1772 - Getting issue details... STATUS
  • Health checker info. : "Belle II software could not be installed on " has been found since 04:20:00 UTC on 2018/12/17.  BIIDCO-1508 - Getting issue details... STATUS

DIRAC.UVic.ca

  • DIRAC.UVic.ca, DIRAC.UVic-local.ca, CLOUD.DESY.de: downtime 2020-10-24 14:00 (UTC) - 2020-10-24 17:00 (UTC) BIIDCO-2790 - Getting issue details... STATUS

  • DIRAC.UVic.ca, DIRAC.UVic-local.ca, CLOUD.DESY.de: downtime 2020-09-17 22:00 - 2020-09-22 00:00 (UTC) BIIDCO-2700

  • User jobs enabled BIIDCO-2501 - Getting issue details... STATUS

DIRAC.UVic-local.ca

  • User jobs failed on the site: BIIDCO-1975 - Getting issue details... STATUS

DIRAC.Yamagata.jp

  • "Short Pilot" has been observed since 2020-10-27 06:40 UTC (for 8 hours), since 2020-08-15 17:35 UTC (for 6 hours) and since 2020-06-20 23:47 UTC (for 6 hours) BIIDCO-2512 - Getting issue details... STATUS

  • "Failed Payload Job" has been observed since 2020-10-27 09:40 UTC (for 5 hours), since 2020-10-25 14:38 UTC (for 1 hour) and since 2020-06-28 12:07 UTC. BIIDCO-2512

  • "Pilot Submission Failure" has been observed since 2020-09-21 08:24 UTC (for 22 hours) and since 2020-09-09 03:24 UTC (for 4 hours) BIIDCO-2693 - Getting issue details... STATUS
  • "Short Pilot" has been observed since 2020-08-16 14:35 UTC (for 8 hours) and since 22:20:00 UTC on 2019/03/13 BIIDCO-1761

DIRAC.Yonsei.kr

  • "Failed Payload Job" has been observed since 2020-10-27 08:40 UTC (for 2 hours) BIIDCO-2584 - Getting issue details... STATUS
  • "Short Pilot" has been observed since 2020-10-15 03:31 UTC (for 3 hours), since 2020-08-10 21:35 UTC (for 1 hour) and since 2020-07-26 14:15 UTC (for 9 hours) BIIDCO-2584

LCG.CESNET.cz

  • Downtime 2020-12-01 07:00 - 2020-12-01 14:00 (UTC) BIIDCO-2812 - Getting issue details... STATUS
  • Downtime 2020-11-30 07:00 - 2020-12-31 23:00 (UTC) BIIDCO-2887 - Getting issue details... STATUS
  • Downtime 2020-11-05 00:00 (UTC) - 2020-12-03 00:00 (UTC) BIIDCO-2812 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-10-27 08:40 UTC (for 6 hours) BIIDCO-2486 - Getting issue details... STATUS

  • CESNET-DATA-SE :Transfer Destination Problems from several sources: BIIDCO-2780 - Getting issue details... STATUS
  • "Failed Pilot" has been observed since 2020-09-22 22:24 UTC (for 8 hours) (details). BIIDCO-2486 - Getting issue details... STATUS
  • "Failed Pilot" has been observed since 2020-09-09 03:24 UTC (for 3 hours) BIIDCO-2486

  • New CE to be configured: BIIDCO-2634 - Getting issue details... STATUS
  •   Need some intervention to run Merge jobs BIIDCO-771 - Getting issue details... STATUS

LCG.COSENZA.IT

  • "Aborted Pilot" has been observed since 2020-11-30 19:48 UTC (for 3 hours) BIIDCO-2892 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-10-27 13:40 UTC (for 1 hours) BIIDCO-2722 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-09-24 02:25 UTC (for 4 hours) BIIDCO-2722 - Getting issue details... STATUS
  • "Short Pilot" has been observed since 2020-09-24 05:25 UTC (for 1 hours) BIIDCO-2722 - Getting issue details... STATUS

LCG.CNAF.it

  • "Aborted Pilot" has been observed since 2020-11-25 04:48 UTC (for 10 hours) .  BIIDCO-2862 - Getting issue details... STATUS
  •   CREAM CEs to be decommissioned BIIDCO-2840 - Getting issue details... STATUS
  • Downtime 2020-11-03 00:00 (UTC) - 2020-11-05 12:00 (UTC) BIIDCO-2809 - Getting issue details... STATUS

  • "Short Pilot" has been observed since 2020-10-28 13:40 UTC (for 1 hours). Also since 2020-09-23 13:25 UTC (for 1 hours) and since  2020-05-20 02:43 UTC (for 5 hours) BIIDCO-2424 - Getting issue details... STATUS
  • Downtime 2020-10-27 06:00 - 2020-10-27 12:00 (UTC) BIIDP-3304 - Getting issue details... STATUS
  •   MCProduction jobs restricted BIIDCO-2506 - Getting issue details... STATUS

LCG.CYFRONET.pl

  • Downtime 2020-10-06 10:00 - 2020-10-31 00:00 (UTC) BIIDCO-2745 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found since 13:20:00 UTC on 2018/12/13. BIIDCO-1246 - Getting issue details... STATUS

LCG.DESY.de

  • The site to be retired  BIIDCO-1240 - Getting issue details... STATUS  – No more jobs to be submitted.

LCG.Frascati.it

  • "Pilot Submission Failure" has been observed since 2020-09-09 08:24 UTC (for 22 hours) BIIDCO-2734 - Getting issue details... STATUS

  • "Aborted Pilot" has been observed since 2020-08-15 04:35 UTC (for 1 hours)  BIIDCO-2463 - Getting issue details... STATUS
  • "BLAH Error" has been observed since 2020-08-10 22:34:18 UTC (for 5 hours). Also since 2020-07-25 23:07:32 UTC (for 43 hours), since 2020-07-23 23:07:30 UTC (for 18 hours) and since   2020-07-18 16:07:24 UTC (for 24 hours) BIIDCO-2571 - Getting issue details... STATUS
  •  Site is currently Banned due to hardware problem since 2019-07-05

LCG.HEPHY.at

LCG.IHEP.cn

  • Downtime 2020-12-01 00:00 - 2020-12-02 15:00 (UTC) BIIDCO-2893 - Getting issue details... STATUS
  • "Failed Payload Job" has been observed since 2020-11-14 07:41 UTC (for 15 hours)  BIIDCO-2833 - Getting issue details... STATUS
  • "Short Pilot" has been observed since 2020-11-01 11:40 UTC (for 5 hours). Also since 2020-11-11 16:40 UTC (for 10 hours), since 2020-11-14 08:41 UTC (for 5 hours) and since 2020-11-23 08:48 UTC (for 6 hours). BIIDCO-2802

LCG.IN2P3CC.fr

  • Being commissioned (2020-11-13 09:00 UTC) BIIDCO-2830

LCG.IPHC.fr

  • "Failed Payload Job" has been observed since 2020-09-02 21:30 UTC (for 2 hours) BIIDCO-2673 - Getting issue details... STATUS
  • Date, Issue, Tickets...

LCG.KEK.jp

  • Downtime 2020-11-30 00:00 - 2020-12-07 00:00 (UTC) BIIDCO-2886 - Getting issue details... STATUS
  • banned until the runtime directory is fixed BIIDCO-2779 - Getting issue details... STATUS
    • There will be no activities until the ban is lifted
  •   RawJobStatus: LCG.KEK.jp with no activity for some hours and OSG.BNL.us is empty.  BIIDCO-2782 - Getting issue details... STATUS
  • Raw data processing: Jobs failed due to "No space left on device" BIIDCO-2769
  •  Raw data processing: Application finished with errors BIIDCO-2684 - Getting issue details... STATUS
  • 'heavy' queue closed in preparation for the KEKCC renewal BIIDCO-2566 - Getting issue details... STATUS
  • MCProduction restricted BIIDCO-2505 - Getting issue details... STATUS
  •   no plot for raw data processing BIIDCO-2449 - Getting issue details... STATUS . update: this is expected actually, see BIIDCO-2400 - Getting issue details... STATUS
  • SiteDirector "Failed to check the availability"  BIIDCO-1934 - Getting issue details... STATUS

LCG.KEK2.jp

  • Downtime 2020-11-30 00:00 - 2020-12-07 00:00 (UTC) BIIDCO-2886 - Getting issue details... STATUS
  • Many pilot submissions failed BIIDCO-2824
  • Many jobs failed  BIIDCO-2435 - Getting issue details... STATUS  
  • Still all jobs failing with InputDataResolution on 2019/07/25. BIIDCO-1542 - Getting issue details... STATUS
  • Health checker info. : "Failed pilot jobs" has been found since 13:20:00 UTC on 2018/12/21. BIIDCO-1559 - Getting issue details... STATUS
  • all jobs are in "Input data resolution" status since 12.00 2018/12/18 UTC BIIDCO-1542 - Getting issue details... STATUS

LCG.KEK-merge.jp

  • Downtime 2020-11-30 00:00 - 2020-12-07 00:00 (UTC)  BIIDCO-2886 - Getting issue details... STATUS
  • Health checker info. : "Short pilot jobs" has been found at 00:20:00 UTC on 2019/08/26. BIIDCO-1978 - Getting issue details... STATUS

LCG.KISTI.kr 

  • "Failed Payload Job" has been observed since 2020-10-27 13:40 UTC (for 1 hours) BIIDCO-2702 - Getting issue details... STATUS

  • "Pilot Submission Failure" has been observed since 2020-10-15 03:31 UTC (for 13 hours) BIIDCO-2767 - Getting issue details... STATUS

  • "Failed Payload Job" has been observed since 2020-09-18 02:24 UTC BIIDCO-2702 - Getting issue details... STATUS

  • "Aborted Pilot job" has been observed since 2020-09-09 22:24 UTC (for 8 hours) BIIDCO-2693 - Getting issue details... STATUS

  • "Pilot Submission Failure" has been observed since 2020-09-02 03:30 UTC (for 21 hours)
  • KISTI-GSDC system in downtime BIIDCO-2556
  • BLAH error seems to happen if jobs exceed the allocated number of queues; not a problem (site-specific feature)
    BIIDCO-1259


LCG.KIT.de   

  • "Short Pilot" has been observed since 2020-11-02 08:40 UTC (for 6 hours). Also since 2020-11-10 21:40 UTC (for 9 hours) and since 2020-11-21 15:48 UTC (for 15 hours) BIIDCO-2804
  • Downtime from 2020-10-07 16:00 (UTC) to 2020-10-31 20:00 (UTC) BIIDCO-2710
  • Downtime from 2020-09-22 08:00 (UTC) to 2020-10-29 22:00 (UTC)   BIIDCO-2710 - Getting issue details... STATUS

  • Downtime from 2020-09-02 06:00 to 2020-09-23 14:00 (UTC) BIIDCO-2660 - Getting issue details... STATUS

LCG.KIT-TARDIS.de

  • "Aborted Pilot" has been observed since 2020-11-25 05:48 UTC (for 9 hours). BIIDCO-2861
  • "Short Pilot" has been observed since 2020-11-21 15:48 UTC (for 16 hours) BIIDCO-2847

LCG.KMI.jp

  • "Aborted Pilot" has been observed since 2020-11-24 15:48 UTC (for 7 hours) BIIDCO-2395
  • "Pilot Submission Failure" has been observed since 2020-09-02 04:30 UTC (for 26 hours) and since 2020-11-23 02:48 UTC (for 12 hours). BIIDCO-2733
  • "Short Pilot" has been observed since 2020-09-23 01:24 UTC (for 5 hours) BIIDCO-2717 - Getting issue details... STATUS

LCG.LAL.fr

  • "Aborted Pilot" has been observed since 2020-11-28 21:48 UTC (for 57 hours) BIIDCO-2863 - Getting issue details... STATUS
  • "Aborted Pilot" has been observed since 2020-11-25 15:48 UTC (for 7 hours) BIIDCO-2863
  • Pilot submission failure BIIDCO-2848 - Getting issue details... STATUS
    BIIDCO-2844 - Getting issue details... STATUS
  •   BIIDCO-2841 - Getting issue details... STATUS
  • CREAM-CE to be decommissioned  BIIDCO-2842 - Getting issue details... STATUS

LCG.Legnaro.it

LCG.Napoli.it

  • BIIDCO-2846
  • 2020/11/23 BIIDCO-2851
  • "Pilot Submission Failure" has been observed since 2020-11-14 05:41 UTC (for 8 hours), since 2020-10-13 10:31 UTC (for 20 hours) and since 2020-09-03 21:30 UTC (for 1 hour) BIIDCO-2688
  • "Aborted Pilot" has been observed since 2020-11-09 02:40 UTC (for 2 hours) BIIDCO-2823
  • Downtime 2020-11-05 08:00 (UTC) - 2020-11-05 17:00 (UTC) BIIDCO-2813

  • "Failed Payload Job" has been observed since 2020-10-27 14:40 UTC (for 1 hour) BIIDCO-2573

  • Downtime from 2020-09-30 08:00 to 2020-10-02 16:00 (UTC) BIIDCO-2738
  • Stalled jobs BIIDCO-1255

LCG.NTU.tw

  • Downtime 2020-12-01 15:00 to 2020-12-07 15:00 (UTC) BIIDCO-2895
  • "Failed Pilot" has been observed since 2020-11-07 21:40 UTC (for 81 hours), since 2020-10-27 04:40 UTC (for 10 hours) and since 2020-10-25 11:38 UTC (for 19 hours) BIIDCO-2794
  • "CRL has expired" for "node39-0,node37-0,node38-0" has been observed since 2020-07-20 07:07:26 UTC (for 16 hours), since 2020-09-23 02:24 UTC (for 4 hours) and since 2020-06-17 14:46:46 UTC (for 4 hours) BIIDCO-2499
  • "Belle II software could not be installed" for "belle2grid3.cc.ntu.edu.tw" has been observed since 2020-06-28 06:06:54 UTC (for 1 hour) BIIDCO-2527
    Solved and verified: GGUS ticket https://ggus.eu/?mode=ticket_info&ticket_id=147803 was submitted at 2020-07-12 06:50

LCG.Pisa.it

LCG.Roma3.it

  • "Short Pilot" has been observed since 2020-11-11 21:40 UTC (for 5 hours) and since 2020-11-01 23:40 UTC (for 15 hours) BIIDCO-2803
  • "Pilot Submission Failure" has been observed since 2020-08-29 15:18 UTC (for 111 hours) BIIDCO-2666
  • Health checker info: "Failed pilot jobs" has been found at 14:20:00 UTC on 2019/04/20 BIIDCO-1538
  • "Pilot Submission Failure" has been observed since 2020-08-24 10:18 UTC (for 20 hours) BIIDCO-2633

LCG.TAU.il

  • "Aborted Pilot" has been observed since 2020-11-15 05:41 UTC (for 8 hours) BIIDCO-2832
  • "Failed Payload Job" has been observed since 2020-10-27 13:40 UTC (for 1 hour) and since 2020-09-24 05:25 UTC (for 1 hour) BIIDCO-2723
  • Downtime 2020-10-24 12:00 (UTC) - 2020-10-25 12:30 (UTC) https://agira.desy.de/browse/BIIDCO-2789

  • "Pilot Submission Failure" has been observed since 2020-09-09 01:24 UTC (for 5 hours) BIIDCO-2693

LCG.Torino.it

  • "Pilot Submission Failure" has been observed since 2020-11-24 07:48 UTC (for 15 hours) BIIDCO-2215
  • "Pilot Submission Failure" has been observed since 2020-10-14 16:31 UTC (for 14 hours) and since 2019-12-28 18:34 UTC BIIDCO-2215
  • Downtime: 2020-09-25 10:00 (UTC) - 2020-10-09 10:00 (UTC) BIIDCO-2727
  • "Pilot Submission Failure" has been observed since 2020-09-09 08:24 UTC (for 22 hours) (details) BIIDCO-2693

  • "BLAH Error" has been observed since 2020-08-31 07:30:18 UTC (for 10 hours) and since 2020-02-09 05:53:05 UTC (for 9 hours) BIIDCO-2279
  • "Failed Payload Job" has been observed since 2020-07-17 22:14 UTC (for 18 hours) BIIDCO-2569

    GGUS ticket https://ggus.eu/index.php?mode=ticket_info&ticket_id=147926 has been submitted.

LCG.ULAKBIM.tr

  • The queue 'belle7' is to be disabled; use only 'belle'. BIIDCO-1896

OSG.BNL.us

  • RawJobStatus: OSG.BNL.us is empty and LCG.KEK.jp has shown no activity for some hours. BIIDCO-2782

  • Raw data processing: "Input data resolution" and "Application finished with errors" BIIDCO-2684

  • "Application finished with errors": 100% BIIDCO-2741
  • Job submission check: jobs failed with errors or input data resolution in the last 24h (6:00 UTC, 2019/01/09) BIIDCO-1596
  • Number of concurrent MCProduction jobs restricted BIIDCO-1256
  • MCProduction jobs are mostly stalled BIIDCO-1253

OSG.CORI.us

  • OSG.CORI.us resource has been removed because CY18 allocation was not approved

OSG.UMiss.us

  • "Pilot Submission Failure" has been observed since 2020-10-26 07:40 UTC (for 7 hours) and since 2020-11-16 23:47 UTC (for 129 hours) BIIDCO-2679
  • Health checker info: "Aborted pilot jobs" has been found at 22:20:00 UTC on 2019/06/02. BIIDCO-1856
  • Health checker info: "Short pilot jobs" has been found since 22:20:00 UTC on 2019/05/12 (details).
    Updated BIIDCO-1768

SSH.KMI.jp

  • Date, Issue, Tickets...

Test.KIT.de

  • Downtime from 2020-09-22 08:00 (UTC) to 2020-10-29 22:00 (UTC) BIIDCO-2710
  • Downtime from 2020-09-02 06:00 to 2020-09-23 14:00 (UTC) BIIDCO-2660

Test.ULAKBIM.tr

  • Test site for the SL7 resources at ULAKBIM. No need to report problems.
  • No activities expected currently.

VCYCLE.LAL.fr

  • Downtime 2020-11-19 12:00 (UTC) - 2020-12-03 12:00 (UTC) BIIDCO-2841

  • Downtime 2020-10-13 07:00 (UTC) - 2020-10-30 16:00 (UTC) BIIDCO-2640
  • Downtime from 2020-08-28 14:00 (UTC) to 2020-08-31 14:00 (UTC) BIIDCO-2640
  • Under commissioning (BIIDCO-2430)

VCYCLE.Napoli.it

  • Opportunistic site (Empty plot is not a problem)
  • Ban lifted BIIDCO-1613
  • "Sudo CE Error: sudo execution fails with return code 1" BIIDCO-1612

VCYCLE.HNSC01.it, VCYCLE.HNSC02.it

  • Opportunistic site (Empty plot is not a problem)

Links

