No Node Left Behind


LANL HPC Deployment of Zenoss

Posted by Matt Ray on Nov 18, 2009 4:50:40 PM

Just in time for this week's SC09 High Performance Computing (HPC) conference comes the announcement from the Los Alamos National Laboratory (LANL) HPC Division that they are using Zenoss to monitor their large-scale HPC clusters. LANL currently has the second-fastest supercomputer in the world, and they use a modified version of Zenoss to monitor it. They are working to share their customizations; there is a High Performance Computing Development area dedicated to the work and an HPC group for further collaboration on using Zenoss in your HPC environment.

 

LANL HPC Deployment of Zenoss

The Los Alamos National Laboratory High Performance Computing Division is currently deploying Zenoss, with some modifications, to monitor large-scale high-performance clusters. We have created several ZenPacks that extend Zenoss in the areas of issue tracking, asset tracking, and scalability. Attached is a file that gives a high-level description of the enhancements LANL made to Zenoss for our deployment.

 

The basics are in place, but there are lots of opportunities for contribution. Here’s a partial list of things we think would be of great use to the HPC community that we won’t get to any time soon:

• Direct feed of resource manager job allocation data

• Increased automation of event-to-issue roll-up

• Performance data from the nodes

• End-to-end I/O subsystem view

• After-the-fact automated event/issue correlation

• Continuing filter/mapping refinement

• Better high-level reporting facilities

• Alternative visualizations of data across the event, performance, and environmental categories

• Appropriate and relevant monitoring data and rates for HPC Center networks

We are currently working within our organization to obtain approval to share the enhancements we have made. Our next steps will involve getting our changes integrated with the newer versions of Zenoss. We look forward to working with others to add more functionality specific to high performance computing.

Tags: zenoss, core, hpc, lanl, sc09, los_alamos, supercomputing, roadrunner


Nov 19, 2009 4:18 AM mlist says:
What I have not understood is why LANL did not purchase the Enterprise version of Zenoss and collaborate with the Zenoss developers on their requirements.
In that case both of them would have benefited:
-LANL would have had support from Zenoss, and surely Zenoss would have offered a special price (maybe very special, considering that LANL developers would have worked on it too)
-Zenoss would have added these features to the Enterprise version, so that all customers would have benefited from these great enhancements.

As it is, only users of Zenoss Core will have these features, so... what will be the benefit for Enterprise customers? What did Zenoss gain from this, apart from some ZenPacks?
Nov 19, 2009 10:49 AM Matt Ray says in response to mlist:
Zenoss Inc. Professional Services assisted LANL in their heavy customization of Zenoss Core to do what they wanted. Some of the work made changes to the Zenoss model that were incompatible with, and perhaps unwanted by, standard Zenoss installations. Because of this, the Zenoss HPC work is a parallel fork that we'll work to keep up to date as best we can. Some of the work has already migrated into Zenoss Core (and more will continue to merge), where it will be available to all Zenoss users.
Nov 19, 2009 2:50 PM mlist says in response to Matt Ray:
OK, thanks for this clarification.