September 2, 2009 • 11:11 am 0
“Gmail’s web interface had a widespread outage earlier today, lasting about 100 minutes.”
See the following for details:
http://gmailblog.blogspot.com/2009/09/more-on-todays-gmail-issue.html
Filed under: Production Problems
July 13, 2009 • 10:10 am 0
A summary of a bad week for Rackspace, Google and others:
http://www.datacenterknowledge.com/archives/2009/07/06/the-day-after-a-brutal-week-for-uptime/
Filed under: Production Problems
July 9, 2009 • 3:38 pm 0
Twice in the last two weeks the Rackspace DFW (Dallas/Fort-Worth) data center has lost power to sections of its facility.
Message from Rackspace CEO Lanham Napier
June 30, 2009
Rackspace community,
Yesterday afternoon at 3:15CDT our data center in Dallas experienced an interruption in power to portions of the facility. The interruption caused customer servers to lose power and go down.
…
Notice * July 7, 2009, 11:44 am CDT: Today a portion of our Dallas data center experienced a brief power interruption. Rackspace is aware of this issue and is currently investigating it. We will be sending out periodic updates as more information becomes available.
I commend their transparency through Twitter and their blog, but this is rather disconcerting: it’s the datacenter I use, I pay a premium for the “redundancy” they are supposed to have, and they had issues in November 2007 as well. More information about that outage here.
Filed under: Production Problems
July 9, 2009 • 3:21 pm 0
In my ongoing collection of “big guys having trouble”, Google’s “App Engine” services were down for 6 hours on July 2nd.
A scathing rant about Google’s lack of clarity on what went wrong and what’s being done about it – and therefore the lack of confidence in using the service.
Google’s postings about the issue are found here.

Filed under: Production Problems
July 7, 2009 • 2:23 pm 0
Netbeans (which I never use) has seemingly come a long way since I last looked at it. The UI is certainly a lot nicer.
Today, however, it’s the Heap Profiler that I’m happy with – because it actually works, unlike anything else I’ve tried today.
It loaded a 3.5GB heap in less than 30 seconds! (I had set my max heap for Netbeans to 12GB on my 16GB, 8-core machine).
Finally a tool for heap analysis that works and works well. And it’s elegant looking at the same time.

Filed under: Production Problems, Tools
July 7, 2009 • 2:10 pm 0
Despite the claims that Memory Analyzer works well with large heaps, the following screenshot is the evidence of my continued inability to have it parse a 3.5GB heap dump.
I have attempted JDK 5 and JDK 6, both 64-bit, with up to 14GB of memory allocated on an 8-core machine with 16GB of memory.
Note the memory bar at the bottom showing it’s using only 2121M out of 11879M – yet it still thinks it’s running out of memory.
The settings are:
-vmargs
-Xms12g
-Xmx14g
-XX:MaxPermSize=1G
-Dorg.eclipse.swt.internal.carbon.smallFonts
-XstartOnFirstThread
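
As a rough sanity check on those numbers (not a diagnosis of why Memory Analyzer fails), the sketch below adds up what the settings ask the JVM to reserve and compares it with the machine's 16GB. The helper names `size_to_mib` and `parse_vmargs` are made up for this illustration, and the parser only understands sizes with a k, m or g suffix.

```python
import re

# Settings copied from the list above; 16 GB is the machine's stated RAM.
VMARGS = ["-Xms12g", "-Xmx14g", "-XX:MaxPermSize=1G"]
PHYSICAL_RAM_MIB = 16 * 1024

# Multipliers that turn a suffixed JVM size into MiB.
_UNITS = {"k": 1 / 1024, "m": 1, "g": 1024}


def size_to_mib(text):
    """Convert a JVM size such as '14g' or '512m' to MiB (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([kKmMgG])", text)
    if match is None:
        raise ValueError(f"unrecognised JVM size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def parse_vmargs(args):
    """Collect initial heap, max heap and max PermGen sizes (MiB) from args."""
    sizes = {}
    for arg in args:
        if arg.startswith("-Xms"):
            sizes["initial_heap"] = size_to_mib(arg[4:])
        elif arg.startswith("-Xmx"):
            sizes["max_heap"] = size_to_mib(arg[4:])
        elif arg.startswith("-XX:MaxPermSize="):
            sizes["max_perm"] = size_to_mib(arg.split("=", 1)[1])
    return sizes


sizes = parse_vmargs(VMARGS)
reserved = sizes["max_heap"] + sizes["max_perm"]
print(f"max heap + PermGen: {reserved:.0f} MiB of {PHYSICAL_RAM_MIB} MiB physical")
print(f"left for the OS and native allocations: {PHYSICAL_RAM_MIB - reserved:.0f} MiB")
```

With these settings the maximum heap plus PermGen comes to 15GB, which leaves about 1GB of physical memory for everything else on the box.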

Filed under: Production Problems, Tools
July 2, 2009 • 12:35 pm 0
I couldn’t access my hosted Confluence system this morning. An email just showed up confirming the issue.
“Earlier today, from 9:41 am to 10:17am PST, we experienced network issues that impacted our data center. At this time we do not know the exact root cause of the problem.”
— from an email I received
Filed under: Production Problems
February 2, 2009 • 12:02 pm 1
February 2, 2009 • 11:13 am 0
Human error causes all sites to be marked as “malware”.

http://news.cnet.com/8301-1001_3-10153942-92.html?tag=mncol;txt
Filed under: Production Problems