Originally Posted on Google+ http://bit.ly/zGHRtw
For the last few weeks I’ve been working with Dell’s open source project, Crowbar. This software manages and automates large data center deployments starting from bare metal. It is extended through software packages called barclamps, and Keith Hudgins at DTO has developed an open source Zenoss barclamp. My task was to test the installation of this barclamp and help wherever I could.
Before I started testing the Zenoss barclamp, I had heard of the Crowbar project while attending the local OpenStack meetups. Dell is a major player in the OpenStack community, and their Principal Crowbar Architect, Rob Hirschfeld, has led this group from the start. So it goes without saying that Dell provides information on the project during the meetings.
The Crowbar installation consists of two major software packages, the Crowbar Admin server and the Opscode Chef server. When installing, I had the option to download the repository from GitHub and build my own ISO file or simply download the ISO from crowbar.zehicle.com. This ISO was built by Rob and is provided with no support from Dell. The ISO version I chose to install was Crowbar 1.2 Final, which was just over a month old (as of Jan 31 2012). I’m currently working on building the ISO myself to further our testing environment.
Since the Crowbar software has some special network requirements, I decided not to deploy it to our internal OpenStack cloud. As I get more experience with the software, it will be moved to our OpenStack environment if possible. The actual installation from the ISO was very straightforward, and I was able to deploy to my local installation of VirtualBox. The first step was to create a virtual machine and boot it from the ISO image. Once at the command line, I made a simple modification to the JSON file for networking, and I was up and running in about 20 minutes.
After installation, the Crowbar admin server provides the following network services: PXE, DHCP and DNS. When a new bare metal machine is powered on, it should send a PXE request out onto the network. The Crowbar admin server answers with a very small CentOS 5.7 image to get the basic services installed. The system then shows up in the Crowbar admin interface as a blinking “UnAllocated” system. It stays in this state until the administrator assigns the node to a “Proposal”. A proposal is the additional configuration the user can place on a barclamp. All barclamps must have at least one proposal assigned to them. The proposal is also where nodes are assigned to their roles. Once the proposal is applied, or once the node is allocated, the Crowbar admin server triggers the OS installation (based on the assigned OS proposal) and the Chef client installation. Several reboots will happen, during which the assigned proposals (which map to Chef roles) are applied.
To add additional proposals to a node I simply have to edit the proposal and add the node. The apply button on this screen will trigger the immediate installation of the proposal and the save button will only save the changes to the proposal in case I want to deploy at a later time.
The next step was to deploy a client machine that Crowbar could provision as a Zenoss server. As I did for the Crowbar server I built a new virtual machine in VirtualBox but this time I had to prepare it to perform a PXE boot. After several failures to boot to PXE I realized that the VirtualBox setup may not have been the best way to first test Crowbar. I wasn’t sure if Crowbar had a problem or if my VirtualBox had a problem. In the end it was mostly the VirtualBox settings and resource restrictions.
Here are the problems, in order, and how I resolved them.
After the initial startup, the virtual machine would not attempt a PXE boot. Only the local hard drive was checked for an OS. What I found was that the default network adapter (Intel PRO/1000) did not support PXE boot. After some trial and error I found that the PCnet-FAST III adapter was the one I needed. I’m not sure if it was necessary, but I also set “Promiscuous Mode” to “Allow All”. I also had to make sure “Network” was included in the “Boot Order” under the “System” configuration tab.
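For anyone who prefers the command line to the VirtualBox GUI, the same three settings can be sketched as VBoxManage calls. This is only a sketch under my own assumptions: the VM name "crowbar-client" is a placeholder, I assume the first network adapter is the one on the Crowbar network, and putting the network first in the boot order is my choice (the GUI step above only requires that it be included). The script just builds and prints the commands so they can be reviewed before being run against a powered-off VM.

```python
import shlex


def pxe_boot_commands(vm_name, nic=1):
    """Build VBoxManage commands that prepare a VM for PXE boot.

    vm_name is a placeholder for your own VM; nic is the adapter number
    attached to the Crowbar network (assumed to be 1 here).
    """
    return [
        # Am79C973 is VirtualBox's identifier for the PCnet-FAST III adapter.
        ["VBoxManage", "modifyvm", vm_name, f"--nictype{nic}", "Am79C973"],
        # Equivalent of setting "Promiscuous Mode" to "Allow All" in the GUI.
        ["VBoxManage", "modifyvm", vm_name, f"--nicpromisc{nic}", "allow-all"],
        # Put the network first in the boot order (an assumption, see above).
        ["VBoxManage", "modifyvm", vm_name, "--boot1", "net"],
    ]


if __name__ == "__main__":
    for command in pxe_boot_commands("crowbar-client"):
        print(shlex.join(command))
```

Printing instead of executing keeps the sketch safe to run anywhere; each printed line can be pasted into a shell once the VM name is right.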
Now the VM was able to PXE boot but could not locate the PXE OS image. After taking a network capture with tcpdump, I found that another DHCP server was answering the PXE call from my client. By default, VirtualBox networks have a DHCP server enabled. Even after I disabled the DHCP server and restarted VirtualBox, the rogue DHCP server was still answering the call. I removed the network configuration and created a new Crowbar network with no DHCP server. This did the trick, and I was up and running with my new client.
The next step was to install the Zenoss barclamp into the Crowbar server. This is done via a command line interface that is very basic, with very limited options. One option that was very helpful was --force, which forcefully overwrites the previous barclamp installation.
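The check I did by eye in the tcpdump output can be sketched in a few lines of Python: collect the DHCP offers seen on the wire and flag any that did not come from the Crowbar admin server. The function name, the tuple layout, and every IP address below are my own illustrative assumptions, not values from my capture.

```python
def find_rogue_dhcp_servers(offers, expected_server):
    """Return the sorted list of DHCP server IPs other than the expected one.

    offers is an iterable of (server_ip, offered_ip) tuples, for example
    transcribed by hand from a packet capture.
    """
    return sorted({server for server, _offered in offers
                   if server != expected_server})


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    crowbar_admin = "192.168.124.10"
    seen_offers = [
        ("192.168.124.10", "192.168.124.21"),
        ("192.168.56.100", "192.168.56.101"),
    ]
    for rogue in find_rogue_dhcp_servers(seen_offers, crowbar_admin):
        print("Unexpected DHCP server answering:", rogue)
```

If this prints anything at all, some other DHCP server is racing the Crowbar admin server, which is exactly the situation I ran into.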
At this point I started to encounter problems in the Crowbar UI, and I’m not sure whether they were caused by the installation of the Zenoss barclamp or whether the software itself was at fault. This brings me to my big issue with Crowbar. Trying to figure out what is going on with the Crowbar server is a challenge at best and impossible most of the time. The UI only provides blinking status lights, and when an error happens the message provided is useless. You are left to log in and dive into the local log files. Unfortunately, the log files have been placed in several different locations, none of which I found documented on any of the sites Dell recommends. The IRC chat was the only place where I was able to get answers on how to troubleshoot. Since this is an open source project, it will be up to me to update the GitHub wiki with this documentation.
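Since the logs are scattered, one thing that helps is sweeping several directories in one pass. Here is a minimal sketch of that idea. The directory names in the usage example are placeholders I made up, not the real Crowbar log locations, and searching for the literal string "ERROR" is also just an assumption about what a failure looks like in the logs.

```python
import os


def find_error_lines(log_dirs, needle="ERROR"):
    """Yield (path, line_number, line) for every line containing needle.

    Directories that do not exist are skipped silently, so the same list
    of candidate locations can be reused on different machines.
    """
    for log_dir in log_dirs:
        for root, _dirs, files in os.walk(log_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                try:
                    with open(path, errors="replace") as handle:
                        for number, line in enumerate(handle, start=1):
                            if needle in line:
                                yield path, number, line.rstrip("\n")
                except OSError:
                    # Unreadable file (permissions, vanished): move on.
                    continue


if __name__ == "__main__":
    # Placeholder directories; substitute the locations on your server.
    candidates = ["/path/to/first/logs", "/path/to/second/logs"]
    for path, number, line in find_error_lines(candidates):
        print(f"{path}:{number}: {line}")
```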
After I moved on from my UI challenges with no real resolution, I received the error “Failed to apply the proposal”. Once again I looked into the log files and found nothing. Since Crowbar is tightly coupled with Opscode Chef and I wasn’t seeing any errors that could help, I turned to applying the cookbook directly from the client. This showed me that the Zenoss cookbook was failing because it had external dependencies for the software installation and the VM network did not have an external network link. I quickly resolved this by commenting out these files in the cookbook, as they were not required for that particular build. The installation then completed, and the Crowbar server showed the successful installation of the Zenoss server.
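In hindsight, a quick connectivity check on the client before applying a proposal would have pointed at the missing external link much sooner. The sketch below simply tries to open a TCP connection. The host and port in the usage example are placeholders, not the actual servers the Zenoss cookbook downloads from.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


if __name__ == "__main__":
    # Placeholder target; use the hosts your cookbook actually needs.
    if can_reach("example.com", 80):
        print("External network reachable")
    else:
        print("No external network: cookbooks with downloads will fail")
```

A check like this only proves that one host and port answer, so it is a hint about the network rather than a guarantee that every dependency will download.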
The next step was to deploy a Zenoss client but since my VirtualBox was limited on resources I had to move to another testing environment. Without going into great detail the new environment was a KVM hypervisor running on Ubuntu with 8GB memory. This environment did have a network connection to the outside network so all Chef cookbooks downloaded external dependencies with no problem.
The Zenoss client installed with no problems and automagically added itself to the Zenoss server installation. A closer look at the Zenoss client cookbook showed that it too has an external network dependency. We have forked the Zenoss barclamp and are working on getting these updates and some others pushed back into Keith’s repository. He has been assisting us during this testing phase and is looking forward to the updates.
In the end I have to say Dell has a great project here with Crowbar. It was very simple to set up and is very powerful when everything is working. But when it fails, telling me that the proposal has failed to install isn’t very helpful at all. More details please, or even a help file link that points me to the resources to use. Most of the challenges I faced during this test were with either the cookbook recipe or the VirtualBox configuration and limitations. Understanding that Crowbar is a very new project, I’m impressed with how far they have come. It will only get better with community involvement, so step up and dig deep and we all will win!