System Notices
Fri Apr 16, 9:00 (Pacific): Bugaboo unavailable
Silo/Hopper Down Wednesday, April 14, 2010 0900-1700 CST
Downtime is required to add approximately 1.2 PB (raw) of storage to the existing 600 TB (raw) on silo/hopper at the University of Saskatchewan. Silo/Hopper will be down on Wednesday, April 14, 2010 from 0900-1700 CST (Saskatoon time). Please plan your usage accordingly.
Any comments, questions, or concerns should be directed to support@westgrid.ca.
Silo/Hopper Down Friday, April 9, 2010 0900 CST until Saturday, April 10, 2010 1800 CST
Silo/Hopper will be down for scheduled maintenance and a planned power outage at the University of Saskatchewan starting Friday, April 9 at 0900 CST (Saskatoon time). The data storage facility will come back up on the evening of Saturday, April 10 at 1800 CST.
The previous support notice announced the scheduled power outage in the building housing silo/hopper. That outage is still taking place all day Saturday. We are adding a day of downtime on Friday, before the planned power outage, to install and commission the hardware for the storage upgrade at the University of Saskatchewan, with the aim of minimizing the number of service outages associated with the WestGrid storage upgrade.
Please contact support@westgrid.ca with any questions or concerns.
Silo/Hopper Down Monday, March 29, 2010 0800-1300 CST
In preparation for the planned storage upgrade at the University of Saskatchewan, Silo/Hopper will be down for scheduled maintenance on Monday, March 29, 2010 from 0800-1300 CST (0700-1200 PDT) while electricians work on the UPSes. Silo and Hopper and their file access will be unavailable during this period.
Please contact support@westgrid.ca with any questions or concerns.
UBC - Orcinus Scheduling Resumed
UBC WestGrid - Orcinus - March 22, 18:00 (PDT) Chiller Shutdown
We were forced to power off all Orcinus compute nodes today because of an unscheduled chiller shutdown. As a result, all computing jobs were lost. UBC Plant Ops is currently looking into the issue, but at the moment we have no estimated time for a fix.
Update, March 23 12:00 AM (PDT)
The chiller seems to be working; however, in order to adjust the temperature and achieve full cooling capacity, Orcinus will stay offline for the night.
We will resume operation around noon, after firm confirmation from UBC Plant Operations that the chiller is fully operational.
Sunday, March 21, 2010 - Checkers and Cortex cluster not available for production until further notice.
The Checkers cluster is undergoing an OS upgrade and will be back on Monday or Tuesday.
Update 24 March: Checkers will be back in production today or tomorrow at the latest.
Dendrite is experiencing internal disk problems stopping it from rebooting.
Cortex is experiencing GPFS file system problems.
We expect to have all the machines up and running on Monday or Tuesday.
Update 24 March: there are serious problems with these machines. They will be looked at after Checkers is up. There is no ETA for these machines yet.
Update 26 March: The Checkers cluster is back in production.
Silo/Hopper Down Saturday, April 10, 2010 0800-1700 CST
There is a planned power outage at the University of Saskatchewan for the storage system (Silo/Hopper) on Saturday April 10, 2010, from 0800-1700 CST (Saskatoon time). The system will be offline and unavailable for this time period--please plan your usage accordingly.
It is anticipated that a major upgrade in capacity for the Silo/Hopper system will take place in April. There will be planned downtime while this upgrade takes place. Further details will be posted when available.
Edmonton - Saskatoon OC192 outage
TICKET INFORMATION:
Subject: Edmonton - Saskatoon OC192 outage
Category: Outage
Ticket ID: 20100310-001
Start Time: 2010-03-10 03:22 EST (2010-03-10 08:22 UTC)
End Time: 2010-03-10 06:33 EST (2010-03-10 11:33 UTC)
TICKET HISTORY:
== Updated: Thomas on 2010-03-10 07:11 EST(2010-03-10 12:11 UTC) ==
The OC192 circuit was restored at the time shown above.
The provider reported that a local power failure at one of
the AMP sites caused this outage.
== Created: Thomas on 2010-03-10 04:19 EST(2010-03-10 09:19 UTC) ==
The Edmonton-Saskatoon OC192 went down at the time
shown above. The cause of the outage is unknown. The
provider is being contacted. The following core links and
lightpaths are affected.
Core Calgary-Winnipeg
Cybera - Edmonton
ECONET Edm - Mon
ECONET Edm - Sas
NRNet VCTR - SASK
Neptune300 VCTR-SASK
SRNet backup SASK - RGNA via EDMN
TRIUMF UBC - UofA
TRIUMF Van - Tor
WestGrid Cal - Sas
CANARIE NOC
Operations and Engineering
Email: eng@canarie.ca
Weekdays: 08:00-17:00 EST(UTC-5)
+1.613.944.5612
7/24 pager: +1.613.944.5611
http://www.canarie.ca/canet4/
