

Production Plans

  • MC9
    • MC9 started July 5, 2017
      • Phase III signal samples for prerelease-00-09-00b validation
      • Phase III Y(3S) generic (300 fb-1)
      • Phase III Y(4S) generic (4 x 1 ab-1)
      • Phase III Y(5S) generic (1 ab-1)
      • Phase III Y(6S) generic (100 fb-1)
      • Phase III Y(4S) signal samples
      • Phase III Y(4S) low multiplicity samples
      • Phase III Y(5S) signal samples
      • Phase III Y(6S) signal samples
      • Phase II Y(3S) signal samples
      • Phase II Y(4S) generic (50 fb-1)
      • Phase II Y(4S) signal samples
      • Phase II Y(4S) low multiplicity samples

Production Status

MC9

Official production started at ~21:00 JST on July 5, 2017. Starting with BGx0 generic samples (0.2 ab-1)

Submitted second batch of BGx0 generic jobs (July 7, ~04:00 JST)

Third and fourth batches of BGx0 generic jobs (July 10)

Submitted a few BGx0 signal samples (July 12 ~04:30 JST)

Submitted the phase 2 generic samples with BGx0 (July 14 ~04:00 JST)

Submitted the rest of the BGx0 signal samples (July 16 ~00:00 JST)

New requests for BGx0 signal samples submitted (July 19 ~01:30 JST)

MC9 restarted with BGx1 phase 2 samples - 50 fb-1 generic and signal samples (July 30 ~10:00 JST)

Submitted first batch of phase 3 samples with background - mixed and charged BBbar - about 140k jobs (August 12 ~08:00 JST)

Added uubar: ~180k jobs (August 13 ~10:30 JST)

Added ddbar: ~53k jobs (August 29 ~22:00 JST)

Added ssbar: ~51k jobs (Sept. 2 ~11:00 JST)

Submitted phase 3 low-multiplicity samples: ~43.5k jobs (Sept. 2 ~13:00 JST) → includes generator level skim so number of jobs is inflated compared to run time

Added ccbar and taupair: ~317k jobs (Sept 3 ~09:30 JST)

Submitted new signal MC samples: ~57.6k jobs (Sept 11 ~23:00 JST)

Submitted new phase 2 signal MC samples: ~21.2k jobs (Sept 12 ~05:00 JST) → short jobs < 3 hrs each

Submitted new phase 3 signal MC samples: ~600k jobs (Sept 24 ~22:30)

Submitted new phase 3 signal MC samples (almost all submitted now) (Sept 28 ~02:00 JST)

Submitted phase 3 Y(5S) bsbs and non-bsbs samples: ~54.4k jobs (Oct 6 ~04:30 JST)

Submitted phase 3 Y(5S) uubar samples: ~242k jobs (Oct 9 ~09:30 JST)

Submitted phase 3 Y(5S) ddbar samples: ~60k jobs (Oct 16 ~10:30 JST)

Submitted phase 3 Y(5S) ssbar and ccbar samples: ~300k jobs (Oct 18 ~02:00 JST)


Central Services

Dirac


  • DIRAC system is down due to a KEKCC power outage during a severe thunderstorm. 2017-09-25 11:44 UTC
    BIIDCO-396
  • High CPU load observed on all the DIRAC production servers from 2017-09-16 ~6:00 JST
    BIIDCO-358
  • High CPU load observed on the web server from 2017-09-16 ~6:00 JST
    BIIDCO-361
  • 1 min load (grey line) has been above the red line for more than 20 minutes since 2017-09-16 ~04:00 JST for the DIRAC server b2dchsv01.cc.kek.jp
    BIIDCO-356
  • 1 min load (grey line) has been above the red line for more than 20 minutes since 2017-09-16 ~04:00 JST for the DIRAC server b2dchsv05.cc.kek.jp
    BIIDCO-357
  • All DIRAC production servers have been in a high CPU consumption state since 2017-09-15 ~07:00 JST
    BIIDCO-355

DB Production

  • Conditions DB access failure from 2017-10-02 around 08:50 UTC
    BIIDCO-410
  • Conditions DB access failure from 2017-09-24 ~16:00 UTC
    BIIDCO-390
  • High CPU load observed on all the DB production servers from 2017-09-16 ~6:00 JST
    BIIDCO-360

DDM

  • Date, Issue, Tickets...

Monitor

  • Date, Issue, Tickets...

LFC

File Transfers and Replication Status

See also DDM for related issues

FTS

Any problems in the FTS service or FTS monitoring are to be recorded here. Site/SE-specific issues are to be recorded under each site/SE.

  • Low transfer efficiency at stormfe1.pi.infn.it (as destination and probably also as source) BIIDCO-451
  • FTS dashboard has not displayed the matrix and transfer plot since 2017-10-14 ~11:00 UTC. BIIDCO-436
  • 2017/09/25 (06:18 JST) Unusual number of blank cells in FTS Transfer Matrix BIIDCO-391
  • Some "critical" errors observed in FTS log on July 6-7. ggus:129750 (closed ticket, but no resolution) BIIDCO-436

Replication Status

  • Date, Issue, Tickets..
  • 2017/09/28: (22:30 JST) Scheduled jobs increase at DESY, KIT, PNNL, SIGNET, KEK2, KISTI. BIIDCO-405
  • 2017/09/25: (04:15 JST) Steady increase of scheduled/waiting jobs BIIDCO-389
  • 2017/09/11: Replication Status seems stuck. No waiting jobs at most sites. No done jobs. BIIDCO-347
    • Updated 2017/09/14: the number of "done" jobs seems reasonable, but the number of "waiting" jobs is increasing steadily BIIDCO-322 (originally posted 2017/08/04)
    • Similar (closed) issues: BIIDCO-174, BIIDCO-283, BIIDCO-294


SEs

SE Common Issues

  • The number of "waiting/scheduled" jobs still increasing, 2017-09-06 16:00 UTC (see "Replication Status" issues above)

Destination SE: CESNET-TMP-SE (dpm1.egee.cesnet.cz)      

  • 2017-10-05 (16:20 UTC): CESNET was fixed by experts. BIIDCO-416


Destination SE: CNAF-TMP-SE (storm-fe-archive.cr.cnaf.infn.it)

  • 2017/03/13: Not enough free space BIIDCO-137

Destination SE: DESY-TMP-SE (dcache-se-desy.desy.de)

  • Not enough free space BIIDCO-107

Destination SE: KEK2-TMP-SE (kek2-se01.cc.kek.jp)

  • Still banned for removal due to the issue in the back-end HSM
    BIIDCO-41

Destination SE: KISTI-TMP-SE (belle-se-head.sdfarm.kr)



Destination SE: KIT-TMP-SE (dcachesrm-kit.gridka.de)

  • BIIDCO-428
  • SE Health check by DDM: remove file, download, upload do not work since 2017-10-09 22:19:43 UTC.
  • SE Health check by DDM: remove file, download, upload do not work since 2017-10-08 06:48:11 UTC.
    BIIDCO-423
  • SE Health check by DDM: remove file, download, upload do not work since 2017-10-07 22:59:30 UTC.
  • SE Health check by DDM: remove file, download, upload do not work since 2017-10-06 06:47:53 UTC.
  • SE Health check by DDM: remove file, download, upload do not work since 2017-10-06 03:13:46 UTC.
  • Efficiency is less than 20% for 7 hours BIIDCO-348
  • No new data blocks to be assigned to KIT-TMP-SE (files in already assigned blocks continue to be transferred) BIIDCO-199 (KIT SE to be disabled as Dest SE: DONE)
  • There should be no more transfers to/from gridka-dcache.fzk.de
    • KIT SE: Hostname to change from gridka-dcache.fzk.de BIIDCO-191

Destination SE: KMI-TMP-SE (nsrmfe01.hepl.phys.nagoya-u.ac.jp)

  • The number of "done" is zero while the number of "queued" is not zero, as of 2017-07-13 06:19 UTC.
  • Not enough free space BIIDCO-136


Destination SE: Napoli-TMP-SE (belle-dpm-01.na.infn.it)

  • File transfer efficiency has been lower than 20% since 2017-10-04 22:00. BIIDCO-418
  • Not enough free space BIIDCO-146

Destination SE: PNNL-TMP-SE (se.hep.pnnl.gov)

  • SE Health check by DDM: remove file, remove directory, download, upload, ls do not work since 2017-09-21 08:20:55 UTC BIIDCO-372

Destination SE: SIGNET-TMP-SE (dcache.ijs.si)

  • Date, Issue, Tickets...

Other SEs

Adelaide-TMP-SE (coepp-dpm-01.ersa.edu.au)

  • Date, Issue, Tickets...

BNL-TMP-SE (dcblsrm.sdcc.bnl.gov)

  • Date, Issue, Tickets...

CYFRONET-TMP-SE (dpm.cyf-kr.edu.pl)

  • Date, Issue, Tickets...


Frascati-TMP-SE (atlasse.lnf.infn.it)

  • Date, Issue, Tickets...
  • BIIDCO-434

HEPHY-TMP-SE (hephyse.oeaw.ac.at)

  • Date, Issue, Tickets...

IPHC-TMP-SE (sbgse1.in2p3.fr)

  • Date, Issue, Tickets...

Melbourne-TMP-SE (b2se.mel.coepp.org.au)

  • Date, Issue, Tickets...

McGill-TMP-SE  (storm02.clumeq.mcgill.ca)

  • Date, Issue, Tickets...

MPPMU-TMP-SE (grid-srm.rzg.mpg.de)

  • Date, Issue, Tickets...


NTU-TMP-SE (bgrid3.phys.ntu.edu.tw)

  • Date, Issue, Tickets...


Pisa-TMP-SE (stormfe1.pi.infn.it)

Torino-TMP-SE (se-srm-00.to.infn.it)

  • Date, Issue, Tickets...

ULAKBIM-TMP-SE (torik1.ulakbim.gov.tr)

  • Date, Issue, Tickets...

UMiss-TMP-SE (umiss005.hep.olemiss.edu)

  • Date, Issue, Tickets...

UVic-TMP-SE (charon01.westgrid.ca)

  • Date, Issue, Tickets...

Sites

Sites Common Issues

BIIDCO-430

BIIDCO-373

BIIDCO-343

BIIDCO-257

Conditions database appears to be down so jobs may fail until it's back up 2017-08-16 10:14:56 +0200

ARC.DESY.de

  • ARC.DESY.de: "Short pilot jobs" BIIDCO-207

ARC.KIT.de

  • Short pilot jobs: BIIDCO-421
  • 2017/9/29 Job status: Input Data Resolution has been observed.

ARC.LMU.de

  • This is a test site. There is no need to report any issues.

ARC.LMU2.de

ARC.Melbourne.au

       

  • Health checker info. : "Belle II software could not be installed on " has been found since 13:20:00 UTC on 2017/10/23. BIIDCO-453
  • BIIDCO-446
  • Health checker info. : "Belle II software could not be installed on " has been found since 05:20:00 UTC on 2017/10/15. BIIDCO-433
  • Health checker info. : "Failed pilot jobs" has been found at 22:20:00 UTC on 2017/10/08.(details)
  • Health checker info. : "Short pilot jobs" has been found at 22:20:00 UTC on 2017/10/06.
  • Health checker info. : "Short pilot jobs" has been found at 06:20:00 UTC on 2017/10/06.(details)
  • Health checker info. : "Short pilot jobs" has been found at 22:20:00 UTC on 2017/10/05.


ARC.MPPMU.de


  • Job submission check:Pilot submission failure has been found at 22:29:00 UTC on 2017/09/22.
  • Health checker info. : "Failed pilot jobs" has been found at 14:20:00 UTC on 2017/09/22.(details)
  • Health checker info. : "Failed pilot jobs" has been found at 06:20:00 UTC on 2017/08/28.
  • Job submission check:Pilot submission failure has been found at 06:31:00 UTC on 2017/05/10. (details)

ARC.SIGNET.si

  • BIIDCO-447
  • ARC.SIGNET.si: "Stalled" jobs BIIDCO-287

CLOUD.CC1_Krakow.pl

  • Not used in production yet. Seeing no jobs (no plot) is not a problem

DIRAC.Beihang.cn

  • 2017/09/27 Pilot submission failure for the past ~7 hours, opened ticket: BIIDCO-402
  • Enabled with MaxTotalJobs = 1. BIIDCO-289
  • All the upload trials are failing against all the SEs configured: OutputSE (KMI-TMP-SE, PNNL-TMP-SE), Fail-over SEs (DESY-TMP-SE, Napoli-TMP-SE, PNNL-TMP-SE, KIT-TMP-SE)
  • Large % of failed jobs in DIRAC status plot (Added 2016-11-03 22:45:00 UTC) 

DIRAC.BINP.ru

  • File catalog access failure from DIRAC.BINP/BINP-VM.ru since 2017-10-05 13:30 UTC.
    BIIDCO-420
  • Report from the site: MCProduction jobs never run on slower nodes BIIDCO-329
  • Report from the site: jobs aborted on some hosts BIIDCO-328

DIRAC.CINVESTAV.mx

  • Health checker info. : "Aborted pilot jobs" has been found at 06:20:00 UTC on 2017/09/23.(details)
  • Job Submission failure is observed since 01:31:00 UTC on 2017/07/30.

DIRAC.DESY.de

  • Test site. Not in use in MC production

DIRAC.IITG.in

  • Draining for downtime: BIIDCO-435
  • Health checker info. : 
    1. "Short pilot jobs" has been found at 18:20:00 UTC on 2017/08/24.(details)
    2. "Aborted pilot jobs" has been found at 18:20:00 UTC on 2017/08/24.(details)

DIRAC.LMU.de

  • Not in use in MC production BIIDCO-26
    • Banned for now.

DIRAC.MIPT.ru

  • Banned to redesign the file system: BIIDCO-309
    • Un-banned the site.
  • Health checker info. : "Short pilot jobs" has been found since 03:20:00 UTC on 2017/08/17.
  • MCProduction = 10 BIIDCO-309

DIRAC.Nagoya.jp

  • Health checker info. : "Belle II software could not be installed on " has been found since 08:20:00 UTC on 2017/10/13 BIIDCO-433

DIRAC.Nara-WU.jp

  • Decommissioned site: Since this still uses SL5, DIRAC pilot cannot be executed there.

DIRAC.NDU.jp

  • Downtime for unexpected power outage 2017-10-23 10:30 JST
    BIIDCO-452

DIRAC.Niigata.jp

  • MCProduction = 20 BIIDCO-311

DIRAC.Osaka-CU.jp

  • Health checker info. : "Short pilot jobs" has been found since 05:20:00 UTC on 2017/10/20. BIIDCO-450
  • Health checker info. : "Failed to install DIRAC on " has been found since 07:20:00 UTC on 2017/10/13. BIIDCO-438
  • BIIDCO-395
  • "Not enough disk space on " BIIDCO-315
  • BIIDCO-312
  • "Short pilot jobs" BIIDCO-290

  • Jobs failing at the very beginning, causing "Short pilot jobs" though there is no problem in pilots – https://agira.desy.de/browse/BIIDCO-395
  • Health checker info. : "Failed to install DIRAC on " has been found since 19:20:00 UTC on 2017/09/22
  • MCProduction = 5 BIIDCO-312

DIRAC.PNNL.us

DIRAC.PNNL2.us

  • Date, Issue, Tickets...

DIRAC.PNNL-CASCADE.us

  • Seeing no jobs (no plot) is not a problem

DIRAC.PNNL-PIC.us

  • Seeing no jobs (no plot) is not a problem

DIRAC.RCNP.jp

  • "Aborted pilot jobs" BIIDCO-432
  • Pilot submission failure BIIDCO-376

DIRAC.SSU.kr

  • Date, Issue, Tickets...

DIRAC.TIFR.in

  • All production jobs have failed due to file upload failure since 2017-07-06 BIIDCO-205

DIRAC.TMU.jp

  • "Aborted pilot jobs" BIIDCO-442
  • "Short pilot jobs" BIIDCO-403
  • "Not enough disk space on " BIIDCO-382
  • "Pilot Submission Failure" BIIDCO-339

DIRAC.Tokyo.jp

  • Health checker info. : "Short pilot jobs" has been found since 21:20:00 UTC on 2017/09/15.(details)

DIRAC.UAS.mx

  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2017/10/04.(details)

DIRAC.UVic.ca

  • Date, Issue, Tickets...

DIRAC.Yamagata.jp

  • Date, Issue, Tickets...
  • MCProduction = 4 BIIDCO-313

DIRAC.Yonsei.kr

  • Date, Issue, Tickets...

LCG.CESNET.cz

  • Job submission check:Pilot submission failure has been found since 04:24:00 UTC on 2017/10/06. (details)

LCG.CNAF.it

  • "Failed pilot jobs" BIIDCO-448
  • Health checker info. : "Short pilot jobs" has been found at 14:20:00 UTC on 2017/10/06.(details)

LCG.Cosenza.it

  • Health checker info. : "BLAH ERROR" has been found since 16:20:00 UTC on 2017/10/03.(details)
    BIIDCO-413
  • Health checker info. : "Failed pilot jobs" has been found at 22:20:00 UTC on 2017/09/29.(details)

LCG.CYFRONET.pl

  • New CE introduced BIIDCO-393 – pilot submission failing
  • Health checker info. : "BLAH ERROR" has been found since 15:20:00 UTC on 2017/09/22. BIIDCO-378
  • Health checker info. : "Failed to install DIRAC on n1033-amd.zeus" has been found since 21:20:00 UTC on 2017/09/14.
  • Health checker info. : "Failed pilot jobs" has been found since 03:20:00 UTC on 2017/08/14.(details) → BIIDCO-261
  • MCProduction = 1 BIIDCO-314

LCG.DESY.de

  • BIIDCO-380

  • Health checker info. : "Short pilot jobs" has been found at 13:20:00 UTC on 2017/09/28.(details)
  • Health checker info. : "Short pilot jobs" has been found since 04:20:00 UTC on 2017/08/27
  • LCG.DESY.de: Stalled jobs BIIDCO-293

LCG.Frascati.it

  • Date, Issue, Tickets...

LCG.HEPHY.at

  • Health checker info. : "Failed pilot jobs" has been found at 22:20:00 UTC on 2017/09/14.(details)
  • Job submission check:Pilot submission failure has been found at 13:31:00 UTC on 2017/09/04.
  • MCProduction = 680 BIIDCO-281


LCG.KEK.jp

  • Health checker info. : "Failed pilot jobs" has been found at 22:20:00 UTC on 2017/09/27.(details)
  • BIIDCO-300
  • Health checker info. : "Failed pilot jobs" has been found since 09:20:00 UTC on 2017/08/28.
  • Health checker info. : "Failed pilot jobs" has been found at 06:20:00 UTC on 2017/08/27.
  • Health checker info. : "Failed pilot jobs" has been found at 18:20:00 UTC on 2017/08/23.(details)
  • Health checker info. : "Failed pilot jobs" has been found since 13:20:00 UTC on 2017/08/23.(details)

LCG.KEK2.jp

  • Job submission check:Pilot submission failure has been found at 22:27:00 UTC on 2017/09/15. (details)
  • Health checker info. : "Failed pilot jobs" has been found at 21:20:00 UTC on 2017/09/06.(details)
  • BIIDCO-300
  • Health checker info. : "Failed pilot jobs" has been found at 14:20:00 UTC on 2017/08/23.(details)

LCG.KISTI.kr

  • Health checker info. : "Short pilot jobs" has been found since 12:20:00 UTC on 2017/09/21.(details)
  • Health checker info. : "Short pilot jobs" has been found since 20:20:00 UTC on 2017/09/15.(details)
  • Health checker info. : "Not enough disk space on N/A" has been found since 20:20:00 UTC on 2017/09/10
  • Health checker info. : "Not enough disk space on wn3050.sdfarm.kr" has been found at 22:20:00 UTC on 2017/09/09.
  • Health checker info. : "Not enough disk space on N/A" has been found since 16:20:00 UTC on 2017/09/07.
  • Health checker info. : "Not enough disk space on wn3050.sdfarm.kr" has been found since 05:20:00 UTC on 2017/09/07
  • Health checker info. : "Not enough disk space on N/A" has been found since 19:20:00 UTC on 2017/09/06.
  • Health checker info. : "Not enough disk space on N/A" has been found since 19:20:00 UTC on 2017/09/03.

  • Health checker info. : "Not enough disk space on N/A" has been found since 17:20:00 UTC on 2017/09/01.
  • MCProduction = 10 BIIDCO-280

LCG.KIT.de

  • The maximum number of jobs is set to zero (for job drain) 2017/03/07.

LCG.KMI.jp

  • Health checker info. : "Failed pilot jobs" has been found at 21:20:00 UTC on 2016/11/23.(details)

LCG.Legnaro.it

  • Date, Issue, Tickets...

LCG.McGill.ca

  • Health checker info. : "Failed pilot jobs" has been found at 13:20:00 UTC on 2017/09/28.(details)

LCG.Melbourne.au

  • Banned for CE replacement -- BIIDCO-162

LCG.Napoli.it

  • Health checker info. : "Not enough disk space on wn174.scope.unina.it" has been found at 21:20:00 UTC on 2017/09/29
  • Health checker info. : "Short pilot jobs" has been found since 21:20:00 UTC on 2017/09/15.(details)

LCG.NTU.tw

  • Job submission check : Pilot submission failure has been found since 13:26:00 UTC on 2017/10/12. BIIDCO-439
  • MCProduction = 20 BIIDCO-279
  • Health checker info. : "Short pilot jobs" has been found at 15:20:00 UTC on 2017/09/13.(details)

LCG.Pisa.it

  • Job submission check : Pilot submission failure has been found since 13:29:00 UTC on 2017/10/08. (details)
    I found a JIRA ticket which might be related to this issue, although it is a little old : BIIDCO-384
  • Job submission check : Pilot submission failure has been found since 20:28:00 UTC on 2017/10/06. 
  • LCG.Pisa.it - not to run user jobs BIIDCO-286

  • GGUS ticket: "File Transfer failure to stormfe1.pi.infn.it" (129865) has been submitted at 07:01:38 UTC on 2017/08/01.
  • Job submission check:Pilot submission failure has been found since 17:27:00 UTC on 2017/08/25.

LCG.Roma3.it

  • Roma3 commissioning BIIDCO-111

LCG.Torino.it

  • BIIDCO-352
  • BIIDCO-264
  • BIIDCO-252

  • Job submission check:Pilot submission failure has been found at 22:27:00 UTC on 2017/09/15. (details)
  • GGUS ticket:
  • Job submission check:Pilot submission failure has been found at 07:30:00 UTC on 2017/09/13. (details)
  • Job submission check:Pilot submission failure has been found since 02:28:00 UTC on 2017/09/12. (details)
  • Health checker info. : "Failed pilot jobs" has been found since 12:20:00 UTC on 2017/08/29.
  • Job submission check:Pilot submission failure has been found at 13:29:00 UTC on 2017/08/29.
  • Health checker info. : "Failed pilot jobs" has been found since 12:20:00 UTC on 2017/08/26.
  • Health checker info. : "Failed pilot jobs" has been found since 15:20:00 UTC on 2017/08/23.(details)
  • Job submission check:Pilot submission failure has been found since 09:23:00 UTC on 2017/08/23. (details)
  • Health checker info. : "Failed pilot jobs" has been found since 05:20:00 UTC on 2017/08/23.(details)
  • Job submission check:Pilot submission failure has been found at 14:14:00 UTC on 2017/08/17.
  • GGUS ticket:
    1. "INFN-TORINO: Failing transfers with DESTINATION srm://se-srm-00.to.infn.it" (130083) has been submitted at 03:36:13 UTC on 2017/08/16.
    2. "INFN-TORINO: Failed job submission to t2--01toit" (130043) has been submitted at 02:28:44 UTC on 2017/08/12.
  • Health checker info. : "Failed pilot jobs" has been found at 23:20:00 UTC on 2017/08/16(details)
  • Job submission check:Pilot submission failure has been found at 22:29:00 UTC on 2017/08/16. (details)
  • Job submission check:Pilot submission failure has been found at 09:28:00 UTC on 2017/08/14. (details) → BIIDCO-264
  • Health checker info. : "Failed pilot jobs" has been found since 05:20:00 UTC on 2017/08/11. (BIIDCO-252)
  • Job submission check : Pilot submission failure has been found at 06:22:00 UTC on 2017/08/08.

LCG.ULAKBIM.tr

  • Health checker info. : "Failed pilot jobs" has been found at 14:20:00 UTC on 2017/09/03.
  • Closed and verified : 2017-09-29 : GGUS ticket (No pilot jobs run) was submitted https://ggus.eu/index.php?mode=ticket_info&ticket_id=130316 2017-08-31 17:00 JST
  • Health checker info. : "Failed pilot jobs" has been found since 12:20:00 UTC on 2017/08/29

OSG.BNL.us

  • Date, Issue, Tickets...

OSG.CORI.us

  • OSG.CORI.us 100% Application finished Error ( MC9 Shift Log )
    • OSG.CORI.us: Application finished Error 100% BIIDCO-292

OSG.UMiss.us

  • Not enough space error: Application finished with errors BIIDCO-241

SSH.KMI.jp

  • Date, Issue, Tickets...

Links

