Linux Applications: a collection of guides for installing popular applications on Linux (CentOS)
08-11-2009, 04:11 PM | #1
Guest
Installing Nagios on CentOS 4.x/5.x
This document breezes through installing and configuring everything necessary to get Nagios up and running. It does not cover in detail the actual configuration directives Nagios uses. For that, documentation is readily available from the Nagios website, or locally after Nagios is installed.

I'll be explaining installation through RPMs and yum from Dag's repo (RPMforge), but source is available if you prefer to build your own. Again, documentation for this is readily available. Please see the third-party Repositories section of the CentOS wiki if you don't already know how to enable repos.

This also assumes you already have a working e-mail server in your existing network. That's how notifications get sent, and setting one up is beyond the scope of this document.

System:
References:
Packages:
General Upgrades
Generally, upgrading is as simple as typing yum update package_name. To be on the cautious side, back up your configuration files in /etc before upgrading. Also, always read the release notes to make sure configuration files and directives haven't changed.

Upgrading from 2.4
If you're upgrading from version 2.4 (or a previous 2.x version) and you installed by following this guide, then a simple yum upgrade will work just fine. As always, it's best to back up any previous configurations before upgrading in case something goes awry. Also, from release 2.4 to 2.5 the only packages that Dag has re-spun are nagios and nagios-devel.
Upgrading from 2.5
If you're upgrading from version 2.5 to 2.6, Dag's RPMs had a few quirks. Make sure you back up /etc/nagios before continuing.
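One way to take that backup, as a minimal sketch. The helper name backup_config and the destination directory are assumptions for illustration, not part of the original guide.

```shell
# Hypothetical helper (not from the original guide): archive a config
# directory into DEST_DIR under a timestamped name and print the archive path.
backup_config() {
    src="$1"        # directory to back up, e.g. /etc/nagios
    dest_dir="$2"   # where the archive goes, e.g. /root
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$dest_dir/$(basename "$src")-$stamp.tar.gz"
    # -C keeps paths inside the archive relative (nagios/..., not /etc/nagios/...)
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" && echo "$archive"
}

# Example (run as root): backup_config /etc/nagios /root
```

Restoring is then a matter of unpacking the archive back under /etc if the upgrade goes wrong.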
Make sure you have Apache installed and running. Chances are you already have some web service running on your machine, but if not, get it running quickly this way.
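A minimal sketch of getting Apache going on CentOS 4.x/5.x, assuming the stock httpd package and that you are root. The run wrapper is an assumption of this sketch: it only prints each command unless you set DRY_RUN=0.

```shell
# Dry-run wrapper (an assumption of this sketch): prints each command
# unless DRY_RUN=0 is set, in which case the command is executed.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

setup_apache() {
    run yum install -y httpd    # install Apache from the CentOS base repo
    run chkconfig httpd on      # start Apache at boot
    run service httpd start     # start it now
}

setup_apache
```

Once the printed commands look right, rerun with DRY_RUN=0 to actually execute them.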
If you require further assistance with getting Apache going, especially if you need to secure the server, then please follow the documentation at http://www.apache.org. This will get your web server up and running quickly, but I want to warn you that it provides no security whatsoever. If you're running completely internal, then it shouldn't be a big deal.

OK, after you get that running, let's install Nagios and start working on setting it up. By default, the RPMs you are going to install automatically create a nagios.conf file for Apache to use. This file is /etc/httpd/conf.d/nagios.conf.

Installation/Configuration
Nagios requires several different packages to be installed so that it can perform the magic it does so well. The core is the Nagios package itself. Without the plugins package, though, Nagios won't be able to actually process any checks on your system. The development package contains all the libraries, headers, and document files for developing Nagios. The other, optional packages are the NRPE package and NSCA (Nagios Service Check Acceptor), which I don't use. You may have use for it, so check out the main site for details.

Also, Nagios must run under both the user and group "nagios." The RPM install takes care of this step for you, so there's no need to create the user and group.
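The install step can be sketched as below, assuming RPMforge is already enabled and you are root. The package names are my assumption of how RPMforge names them; confirm with yum search nagios. The run wrapper only prints each command unless DRY_RUN=0.

```shell
# Dry-run wrapper (an assumption of this sketch): prints each command
# unless DRY_RUN=0 is set, in which case the command is executed.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

install_nagios() {
    # Core, plugins, and development files (package names assumed).
    run yum install -y nagios nagios-plugins nagios-devel
    # Optional NRPE pieces; uncomment if you need them (names assumed).
    # run yum install -y nagios-nrpe nagios-plugins-nrpe
}

install_nagios
```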
It'll go ahead and pull down a few other packages as dependencies as well. That's it for installation. Let's move back over to Apache's side for a bit.

Configure the Nagios Apache file
Unless you want other options, such as SSL or allowing access to the CGI from only certain hosts, the default nagios.conf file will suit your needs. Here's what it looks like:
Set up the password file
If you don't want to use the name "nagiosadmin", simply substitute your own. Remember that later on you'll need to use the same name in some CGI configuration settings.
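Creating the first web user might look like this sketch. The password file path is an assumption: use whatever AuthUserFile points to in /etc/httpd/conf.d/nagios.conf. The run wrapper only prints the command unless DRY_RUN=0.

```shell
# Dry-run wrapper (an assumption of this sketch): prints each command
# unless DRY_RUN=0 is set, in which case the command is executed.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

# Assumed path; it must match AuthUserFile in nagios.conf.
HTPASSWD_FILE="${HTPASSWD_FILE:-/etc/nagios/htpasswd.users}"

# -c creates the file and would overwrite an existing one,
# so drop -c when adding further users later.
create_admin() { run htpasswd -c "$HTPASSWD_FILE" "${1:-nagiosadmin}"; }

create_admin nagiosadmin
```

When run for real, htpasswd prompts for the password interactively.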
Set up the CGI file
The next step is to set up the users you just created in the main CGI configuration file. I'm going to assume that you are not using a guest account, and that you have only created one admin account, "nagiosadmin." Also, ensure you have it set up to use authentication: 1 means on, 0 means off.
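A rough sketch of those cgi.cfg changes, assuming the file lives at /etc/nagios/cgi.cfg and that GNU sed is available (as on CentOS). The helper name set_cgi_admin is hypothetical; it turns authentication on and points every authorized_for_* directive at one user.

```shell
# Hypothetical helper: set_cgi_admin FILE USER
# Enables use_authentication and grants USER every authorized_for_* right,
# uncommenting those directives if the sample file ships them commented out.
set_cgi_admin() {
    file="$1"; user="$2"
    sed -i \
        -e 's/^#\{0,1\}[[:space:]]*use_authentication=.*/use_authentication=1/' \
        -e "s/^#\{0,1\}[[:space:]]*\(authorized_for_[a-z_]*\)=.*/\1=$user/" \
        "$file"
}

# Example (run as root, after backing up the file):
# set_cgi_admin /etc/nagios/cgi.cfg nagiosadmin
```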
Setting up nagios.cfg
Once you start checking around in /etc/nagios, you'll see there are a few example configuration files to take a peek at, one being "localhost.cfg." This file uses an all-in-one approach to configuring the object files. I find this confusing, especially if you eventually have a very large network to monitor. Instead, you'll split the configuration into separate files, which will keep you sane later on. Go ahead and move this file out of the way. Previously, the sample files were named "bigger.cfg" and "minimal.cfg", but with Nagios 2.9 it's now just the one file.
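To illustrate the split, nagios.cfg needs one cfg_file= line per object file. The sketch below just prints those lines; the file names contacts.cfg and contactgroups.cfg are my assumption, the others are named in this guide.

```shell
# Print one cfg_file= directive per split-out object file.
# File names are partly assumptions (see the note above); adjust to taste.
cfg_file_lines() {
    cfg_dir="${1:-/etc/nagios}"
    for name in timeperiods contacts contactgroups hosts hostgroups services checkcommands; do
        echo "cfg_file=$cfg_dir/$name.cfg"
    done
}

cfg_file_lines /etc/nagios

# To move the sample out of the way (run as root); also comment out its
# cfg_file= line in nagios.cfg so Nagios stops looking for it:
# mv /etc/nagios/localhost.cfg /etc/nagios/localhost.cfg.sample
```

Paste the printed lines into nagios.cfg alongside the existing cfg_file entries.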
Object configuration files
As mentioned, once the configuration is split up, Nagios reads the data from these files in order to process host and service checks across the network. Before I begin: detailed documentation of all of the options for the template-based objects is located at the website. This will help get you started though, so let's begin with the timeperiods file. Obviously, you can substitute your own options if you want different values.

Timeperiods
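A timeperiods file covering the two period names used by the service definitions in this guide (24x7 and none) might look like the sketch below. It is written to a scratch directory, which is an assumption of this sketch, so nothing under /etc/nagios changes until you have reviewed and copied it.

```shell
# Scratch output directory (an assumption); copy the result to /etc/nagios when happy.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/timeperiods.cfg" <<'EOF'
# Always: every day, all day
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# Never: no days listed, so no time matches
define timeperiod{
        timeperiod_name none
        alias           No Time Is A Good Time
        }
EOF
```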
Contacts/Contact groups
Contacts are split into two different files. One holds the actual contact options, and the other gathers contacts together into groups. The groups are what you tell Nagios to contact later on.
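As a sketch of the two files, assuming Nagios 2.x syntax: the file names, the e-mail address, and the notify-by-email / host-notify-by-email command names (taken from the stock sample commands) are assumptions. The group name einsteins is the one the service definitions in this guide refer to.

```shell
# Scratch output directory (an assumption); copy the results to /etc/nagios when happy.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/contacts.cfg" <<'EOF'
define contact{
        contact_name                    nagiosadmin
        alias                           Nagios Admin
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           admin@example.com       ; placeholder address
        }
EOF

cat > "$OUT_DIR/contactgroups.cfg" <<'EOF'
define contactgroup{
        contactgroup_name       einsteins
        alias                   The Admins
        members                 nagiosadmin
        }
EOF
```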
Host and host groups
Host and host group information is stored in the two files hosts.cfg and hostgroups.cfg. Just as you can mix and match contacts in various contact groups, you can do the same thing with host names in host groups. I prefer to create template configurations that I can leech off of later on in my configuration file. It saves you an incredible amount of typing down the road.
# Notifies never, checks 15 times before showing critical on CGI interface,
# monitors host(s) 24x7, notifies on down and recovery, checks 15 times before going critical,
# notifies the contact_group every 30 minutes
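The comments above describe two host templates. A sketch of what such a hosts.cfg could look like follows, assuming Nagios 2.x syntax. The template names quiet-host and alerting-host and the address are hypothetical; one_client and einsteins come from the service definitions in this guide, and check-host-alive is assumed to exist in checkcommands.cfg as in the stock samples.

```shell
# Scratch output directory (an assumption); copy the result to /etc/nagios when happy.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/hosts.cfg" <<'EOF'
# Template: never notifies, checks 15 times before the host shows as down
define host{
        name                    quiet-host
        check_command           check-host-alive
        max_check_attempts      15
        notification_interval   0
        notification_period     none
        notification_options    n
        register                0
        }

# Template: monitored 24x7, notifies on down and recovery, every 30 minutes
define host{
        name                    alerting-host
        check_command           check-host-alive
        check_period            24x7
        max_check_attempts      15
        notification_interval   30
        notification_period     24x7
        notification_options    d,r
        register                0
        }

# A real host only needs what the template does not supply
define host{
        use                     alerting-host
        host_name               one_client
        alias                   One Client
        address                 192.168.0.10    ; placeholder address
        contact_groups          einsteins
        }
EOF
```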
Some people have commented that my logic here is confusing, but it will save you a ton of typing. If you only have a few hosts to check on, then it is probably overkill. OK, now on to host groups.
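A host groups file for the two group names used by the PING service in this guide could be sketched like this. The member host names other than one_client are hypothetical and would each need a matching host definition in hosts.cfg.

```shell
# Scratch output directory (an assumption); copy the result to /etc/nagios when happy.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/hostgroups.cfg" <<'EOF'
define hostgroup{
        hostgroup_name  basic-clients
        alias           Basic Clients
        members         one_client
        }

define hostgroup{
        hostgroup_name  your-routers
        alias           Routers
        members         router1,router2 ; hypothetical host names
        }
EOF
```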
Services
To start, you're going to need at least one service to monitor. This would be a simple check-host-alive, or ping. Again, you can split things into templates to make it easier down the road, just as demonstrated above.

[CODE]
[me@mymachine nagios] vim services.cfg

# Generic service definition template
define service{
        name                            generic-service ; Generic service name
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (Don't disable)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        register                        0       ; DONT REGISTER THIS DEFINITION - NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Generic for all services
define service{
        use                     generic-service
        name                    basic-service
        is_volatile             0
        check_period            24x7
        max_check_attempts      15
        normal_check_interval   10
        retry_check_interval    2
        notification_interval   0
        notification_period     none
        register                0
        }

define service{
        use                     basic-service
        name                    ping-service
        notification_options    n
        check_command           check_ping!1000.0,20%!2000.0,60%
        register                0
        }

define service{
        use                     ping-service
        service_description     PING
        contact_groups          einsteins
        hostgroup_name          basic-clients,your-routers
#       host_name               one_client
        }
[/CODE]

This is an example of how to nest templates. You can use hostgroup_name or host_name individually. I've declared a general template called "basic-service", which leeches off of the "generic-service" definition above it.
Then ping-service is used to narrow it down even further. The reason I split this out is this: say you want to create another host group called "your-switches," but you want notifications for this service to go to a different contact group. Then you just add another service definition, put this host group in it, and apply a different contact group. Ultimately, the last definition overrides all the containers above it. It's a last-man-standing type of deal: the last option Nagios sees is the one it goes by. For example, below the ping-service is still the same, but I want it to go to a different contact group. It's the same logic as was explained for the hosts.cfg and hostgroups.cfg files.
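A sketch of that override, assuming the ping-service template from services.cfg above. The contact group network-team is hypothetical, and the three notification_* lines are my addition to show the "last option wins" behaviour, since the inherited templates switch notifications off. The result goes to a scratch file; append it to services.cfg once reviewed.

```shell
# Scratch output directory (an assumption); append the result to services.cfg when happy.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/services-switches.cfg" <<'EOF'
# Same ping check, different host group and contact group.
# Values set here override what ping-service and basic-service supply.
define service{
        use                     ping-service
        service_description     PING
        hostgroup_name          your-switches
        contact_groups          network-team    ; hypothetical contact group
        notification_options    w,c,r
        notification_interval   30
        notification_period     24x7
        }
EOF
```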
The services.cfg file can get pretty cumbersome because of all the different checks you can configure. For instance, you can set it up to check smtp service through check_smtp, http services through check_http, dhcp, dns, and all sorts of items through SNMP plugins. I'll give you an example of an smtp service check.
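An SMTP check built on the basic-service template could be sketched as follows. The host name mail_server is hypothetical, and the 30-minute notification interval is chosen to match the escalation discussion later in this guide. The result goes to a scratch file; append it to services.cfg once reviewed.

```shell
# Scratch output directory (an assumption); append the result to services.cfg when happy.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/services-smtp.cfg" <<'EOF'
# Template: SMTP check that notifies every 30 minutes until resolved
define service{
        use                     basic-service
        name                    smtp-service
        check_command           check_smtp
        notification_options    w,u,c,r
        notification_interval   30
        notification_period     24x7
        register                0
        }

define service{
        use                     smtp-service
        service_description     SMTP
        contact_groups          einsteins
        host_name               mail_server     ; hypothetical host
        }
EOF
```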
Before I continue, let me explain a bit as to what actually occurs with these files. Nagios reads the configuration options from all of these text files. When it's time to process the smtp-service you have defined, it looks to see what check_command it's supposed to execute. It then looks in the checkcommands.cfg file to look up what check_smtp is supposed to actually do. This would be:
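For reference, a check_smtp command definition along the lines of the stock sample is sketched below; compare it with the entry already in your checkcommands.cfg rather than treating it as the exact text. $USER1$ is the plugin directory macro set in resource.cfg, and the quoted heredoc keeps the shell from expanding the Nagios macros.

```shell
# Scratch output directory (an assumption); compare with /etc/nagios/checkcommands.cfg.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

# 'EOF' is quoted so $USER1$ and $HOSTADDRESS$ reach the file untouched.
cat > "$OUT_DIR/checkcommands-smtp.cfg" <<'EOF'
define command{
        command_name    check_smtp
        command_line    $USER1$/check_smtp -H $HOSTADDRESS$
        }
EOF
```

So a service with check_command check_smtp ends up running the check_smtp plugin against the address of the host the service is attached to.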
This, in essence, is how to start setting up Nagios. I've simplified this quite a bit, but you should now have a good understanding of where to begin with configuring hosts and services. Look in /usr/lib/nagios/plugins to see everything you can check out of the box; the list is very large. Also, check out http://www.monitoringexchange.org to view all sorts of third-party plugins written by community members. I do a lot of checks across SNMP, so be sure to check that out. You can also easily write your own plugins.

There are many extra things you can do within Nagios itself, such as defining escalations and extended service/host information. I'll explain that after you get Nagios fired up so you can see what it's about.

Starting Nagios
At this point, you should have a working configuration with a host or two for monitoring. Since we haven't done so yet, let's start the Nagios daemon, configure it to start at boot, and check the configuration files for errors.
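Those three steps can be sketched as below, assuming the RPM installed the usual nagios init script and you are root. I run the configuration check first so a broken file stops the start; that ordering is my choice. The run wrapper only prints each command unless DRY_RUN=0.

```shell
# Dry-run wrapper (an assumption of this sketch): prints each command
# unless DRY_RUN=0 is set, in which case the command is executed.
run() { if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi; }

start_nagios() {
    # Pre-flight check of every file referenced from nagios.cfg; bail out on errors.
    run nagios -v /etc/nagios/nagios.cfg || return 1
    run service nagios start    # start the daemon now
    run chkconfig nagios on     # and at every boot
}

start_nagios
```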
You'll notice my instance has 85 warnings displayed. This is because I have 85 services being checked that have no contact group(s) associated with the service. Warnings are usually OK to let go. As long as the check (nagios -v) says "Things look okay" then you're usually fine. To avoid the warnings, simply do what the warning says and fix the issue it's reporting.

Escalations
Escalations are pretty cool in that they allow you to specify where second, third, fourth, and so on, notifications go. For instance, say you have the SMTP service set up to notify a contact group every 30 minutes indefinitely until someone resolves the problem. With an escalation set up, you can tell notifications 2, 3, 4 to go to this e-mail address, or this pager, and then you can tell notifications 5, 6, 7 to go to yet another address or pager, and so on. I use this extensively because I have the first notification go to ticketing software, and I then set all subsequent notifications to go simply to a pager. I don't want multiple tickets being created by the same incident, but I want Nagios to page the hell out of me until I respond to the event. Let's take a peek. This assumes you've added the file to nagios.cfg as well as created it in /etc/nagios.
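A sketch of such an escalation for the SMTP service follows, in Nagios 2.x syntax. The file name escalations.cfg, the host mail_server, and the contact group pagers are all hypothetical. Notification 1 still goes to the service's own contact group; from notification 2 onward the escalation's group is used instead.

```shell
# Scratch output directory (an assumption); copy the result to /etc/nagios when happy
# and add a matching cfg_file= line to nagios.cfg.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/escalations.cfg" <<'EOF'
define serviceescalation{
        host_name               mail_server     ; hypothetical host
        service_description     SMTP
        first_notification      2               ; start with the second notification
        last_notification       0               ; 0 = keep going until the problem is resolved
        notification_interval   30
        contact_groups          pagers          ; hypothetical contact group
        }
EOF
```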
Extended information
Extended information is a bonus feature and is used mainly for aesthetic reasons on the web interface. It can be split up into host extended information and service extended information. With it you can put pretty little icons beside host names, specify URLs that link outside of Nagios, and make things look "pretty" on the map systems. I use the service extended information to point to links outside of Nagios that host MRTG graphs. I'll show you how you can do this. Remember to reference this file in nagios.cfg and create the file.
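Service extended information for the PING service could be sketched like this, in Nagios 2.x syntax. The file name, the MRTG URL, and the icon file are all hypothetical placeholders.

```shell
# Scratch output directory (an assumption); copy the result to /etc/nagios when happy
# and add a matching cfg_file= line to nagios.cfg.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/serviceextinfo.cfg" <<'EOF'
define serviceextinfo{
        host_name               one_client
        service_description     PING
        notes_url               http://nagios.example.com/mrtg/one_client.html  ; hypothetical URL
        icon_image              graph.png       ; hypothetical file in the Nagios logos directory
        icon_image_alt          Traffic graph
        }
EOF
```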
This puts a pretty little icon beside the PING service on the web interface. When you click on this icon, it takes you directly to the MRTG graph I have running on the same machine. In my case, I have an internal yum server rsyncing to the mirrors every night. All of the ethernet traffic is graphed through MRTG, and Nagios links to it so it's easy to navigate to. This builds up a good history of bandwidth usage, among other things. Use some creativity and you can log, graph, and link to just about anything you want, for example processes and users logged into a system.

Dependencies
Another interesting feature I use is host and service dependencies. What this does is set up a tier of checks before something alarms out. For example, I check a login service on a server that's not a Linux box. I have about 15 other services being checked on this host, but they depend on being able to log in to the machine before those checks can be processed. When a login is unsuccessful, I don't want 15 services to start freaking out and paging me, so I set up a dependency tree. If login fails, only the login alarms out. I get one notification for this, not a zillion for all the other checks. You can use this feature for hosts as well. Again, reference it in the nagios.cfg file and create the files.
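One link of such a dependency tree might be sketched as below, in Nagios 2.x syntax. The file name, the host non_linux_box, and the service names LOGIN and DISK are hypothetical; you would add one definition per dependent service.

```shell
# Scratch output directory (an assumption); copy the result to /etc/nagios when happy
# and add a matching cfg_file= line to nagios.cfg.
OUT_DIR="${OUT_DIR:-${TMPDIR:-/tmp}/nagios-sketch}"
mkdir -p "$OUT_DIR"

cat > "$OUT_DIR/dependencies.cfg" <<'EOF'
# DISK depends on LOGIN: while LOGIN is not OK, skip DISK checks and notifications
define servicedependency{
        host_name                       non_linux_box
        service_description             LOGIN
        dependent_host_name             non_linux_box
        dependent_service_description   DISK
        execution_failure_criteria      w,u,c
        notification_failure_criteria   w,u,c
        }
EOF
```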
Just make sure that when you are done adding, editing, or creating configuration files, you run nagios -v nagios.cfg. This processes your configuration files and checks them before you actually refresh the service.

SELinux
A word about SELinux. I don't use it currently, because in 4.x it messed with some things, and I haven't taken the time to learn it. I know that in 5 it's supposed to be much more mature, so try it out. I turned it off when I verified this worked on CentOS 5, so if you run into any strange things, keep SELinux in mind. On CentOS 5.2, SELinux prevents the Apache httpd server from accessing the needed /var/nagios files. A CentOS 5.2 workaround is to execute the command:
That's all, folks!
Basically, this is Nagios summed up. I'm simplifying almost everything, and I hope I've explained things in a simple fashion. Documentation for the utility is wonderful, but there's so much of it that it's sometimes hard to know where to get started. If you have a network to maintain, I advocate getting Nagios (unless you like other utilities) running to big brother your hosts and devices. It's saved my IT department's skin on more than one occasion. It'll take you a long time to get good at it, and it's not easy to figure out at first, but once you get a grip on Nagios, you'll wonder how you got along without it.

I'm checking everything from simple pings for host alive status, to disk usage stats, memory stats, DHCP, DNS, HTTP, temperature in machine rooms, yum updates, CPU loads, and SNMP information from hosts. I'm leaving a lot of things out, but you get the idea. Virtually anything you can think of keeping an eye on, you can watch through Nagios. You can write your own plugins, or visit the Monitoring Exchange site I mentioned earlier to find just about anything.

One more thing I would like to mention is the ability to configure and maintain Nagios solely through the web interface. Nagios doesn't come with pre-packaged add-ons for doing so, but you can find information on three different packages here: http://www.nagios.org/faqs/viewfaq.php?faq_id=183. I personally have not used any of them, but I guess for the command-line challenged they could prove useful.

If you have anything to add or if you notice something wrong, please let me know so I can correct it. The original is written in HTML, and I had to adapt my formatting for use on this wiki, so there might be some typos. Enjoy.

Source: Centos.org