Nagios is not running checks / server overloaded?

tuxdroid

Registered User
Hello,

I have just set up a Nagios server following the Quickstart Guide for Ubuntu.

Server specs:

4 GB HDD
128-256 MB RAM (virtual server)
Full CPU power of the host system

My problem now is that none of the configured services are being checked. Some have been overdue for 20 minutes already.

RAM usage with Nagios + Apache2 is, of course, somewhat high:


Code:
top - 13:18:08 up 19 min,  1 user,  load average: 0.00, 0.00, 0.00
Mem:    262144k total,   246608k used,    15536k free,        0k buffers
Swap:        0k total,        0k used,        0k free,        0k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
32341 root      15   0  8148 2524 2048 R  0.0  1.0   0:00.30 sshd
 5770 nagios    18   0 10868  988  788 S  0.0  0.4   0:00.11 nagios
    1 root      18   0  1832  864  604 S  0.0  0.3   0:00.10 init
32343 root      15   0  2800 1604 1268 S  0.0  0.6   0:00.02 bash
 7281 www-data  25   0  226m 5780 1452 S  0.0  2.2   0:00.01 apache2
32205 syslog    15   0  1876  648  508 S  0.0  0.2   0:00.00 syslogd
32225 root      18   0  5248 1016  664 S  0.0  0.4   0:00.00 sshd
 3675 ntp       15   0  4064 1208  920 S  0.0  0.5   0:00.00 ntpd
 3998 root      18   0  2048  768  612 S  0.0  0.3   0:00.00 cron
 5724 Debian-e  25   0  5848  868  588 S  0.0  0.3   0:00.00 exim4
 7279 root      18   0 10412 2600 1324 S  0.0  1.0   0:00.00 apache2
 7280 www-data  18   0 10184 1876  604 S  0.0  0.7   0:00.00 apache2
 9404 root      15   0  2248 1052  852 R  0.0  0.4   0:00.00 top

Occasionally there is a <defunct> Apache process, but I don't think that is related.

Could it be that the machine is simply overloaded?
(The Duration column for the checks also shows times of more than 20 minutes.)

(An ntpd is running for time synchronisation, so I don't think the problem lies there.)


Regards,

tuxdroid
 
Hello,

According to top, your server has an uptime of 19 minutes, so no check can have been running for more than 20 minutes ;)

Are the check commands actually being invoked by Nagios?
For example, I use the ping command to check whether a host is "UP". So at some point "ping" should show up in your process list.
Do you find anything about this in the logs?
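
One way to look for this is to filter the process list for plugin-like commands. The following is only a minimal sketch: the function name `filter_check_procs` is made up for illustration, and it assumes the checks are either standard plugins named check_* or a plain ping.

```shell
# Filter `ps` output (read from stdin) down to lines that look like
# Nagios check invocations: a command named check_<something> or ping.
# The name filter_check_procs is hypothetical, not part of Nagios.
filter_check_procs() {
  grep -E '(^|[ /])(check_[A-Za-z0-9_]+|ping)( |$)'
}

# Example (run on the monitoring host):
#   ps -eo pid,user,args | filter_check_procs
```

Plugins are short-lived, so a single snapshot can easily miss them. Repeating the command for a minute or two gives a fairer picture.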
 
I don't understand the point about the 19 minutes. In Nagios, some checks showed Duration values >20.

I'd say the display error is more likely in Nagios than in top...

By now Duration shows values of 4 hours. AFAIK "duration" means how long something takes. So has it been checking for 4 h, or what?
The ping check is the only one it has executed, which is not enough for me. I already had the system running on Debian yesterday, and there it ran all the checks.
I've attached a screenshot.

The logs say the following:

Code:
[01-20-2009 17:47:46] Warning: The check of service 'Swap Usage' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:47:46] Warning: The check of host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[01-20-2009 17:46:46] Warning: The check of service 'SSH' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:46:46] Warning: The check of service 'Root Partition' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:45:46] Warning: The check of service 'HTTP' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:44:46] Warning: The check of service 'Current Users' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:43:46] Warning: The check of service 'Current Load' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:38:46] Warning: The check of service 'PING' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:36:46] Warning: The check of service 'Total Processes' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:36:46] Warning: The check of host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[01-20-2009 17:35:46] Warning: The check of service 'Swap Usage' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:34:46] Warning: The check of service 'SSH' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:34:46] Warning: The check of service 'Root Partition' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:33:46] Warning: The check of service 'HTTP' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:32:46] Warning: The check of service 'Current Users' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:31:46] Warning: The check of service 'Current Load' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:26:46] Warning: The check of service 'PING' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:25:46] Warning: The check of host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[01-20-2009 17:24:46] Warning: The check of service 'Total Processes' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:23:46] Warning: The check of service 'Swap Usage' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:22:46] Warning: The check of service 'SSH' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:22:46] Warning: The check of service 'Root Partition' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:21:46] Warning: The check of service 'HTTP' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:20:46] Warning: The check of service 'Current Users' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:19:46] Warning: The check of service 'Current Load' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:14:46] Warning: The check of service 'PING' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:14:46] Warning: The check of host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[01-20-2009 17:12:46] Warning: The check of service 'Total Processes' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:11:46] Warning: The check of service 'Swap Usage' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:10:46] Warning: The check of service 'SSH' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:10:46] Warning: The check of service 'Root Partition' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:09:46] Warning: The check of service 'HTTP' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:08:46] Warning: The check of service 'Current Users' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...
[01-20-2009 17:07:46] Warning: The check of service 'Current Load' on host 'localhost' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the service...

The "results never came back" makes it look as if something is blocking?!
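
To get an overview of which checks are affected, the orphan warnings can be tallied per check. This is a rough sketch that assumes the log format shown above. `count_orphaned` is a made-up helper name, and the log path in the usage comment is the default of a source install under /usr/local/nagios, which may differ on other systems.

```shell
# Count "orphaned" warnings per check from a Nagios log read on stdin.
# Splitting on single quotes puts the first quoted name (the service
# description, or the host name for host checks) into field 2.
# The name count_orphaned is hypothetical, not part of Nagios.
count_orphaned() {
  awk -F"'" '
    /looks like it was orphaned/ {
      kind = ($1 ~ /check of service/) ? "service" : "host"
      count[kind " " $2]++
    }
    END { for (k in count) print count[k], k }
  ' | sort -rn
}

# Example (path is an assumption):
#   count_orphaned < /usr/local/nagios/var/nagios.log
```

If every check on the host shows up in the tally, as in the excerpt above, the cause is more likely in how Nagios runs or collects the checks than in any single plugin.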
 

Attachment: nagios.jpg (160.7 KB)

What happens when you run the checks by hand? For example:
Code:
/usr/lib/nagios/plugins/check_http -H localhost -I 127.0.0.1
HTTP OK HTTP/1.1 200 OK - 256 bytes in 0.001 seconds |time=0.000768s;;;0.000000 size=256B;;;0
For the exact command-line parameters, see /etc/nagios/checkcommands.cfg.

My guess is that something is wrong with your installation, e.g. a library or permissions problem.
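
Since the plugin directory depends on how Nagios was installed, a small helper can search a few candidate locations and translate the plugin's exit code. This is a sketch under assumptions: `find_plugin` and `plugin_state` are made-up names, and /usr/local/nagios/libexec is only the usual location for a source install, not something confirmed in this thread. The exit codes 0 to 3 are the standard plugin return codes.

```shell
# Print the path of the first executable plugin NAME found in the given
# directories. Usage: find_plugin NAME DIR...
# (hypothetical helper, not part of Nagios)
find_plugin() {
  name=$1
  shift
  for dir in "$@"; do
    if [ -x "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}

# Translate a plugin exit code into the state Nagios derives from it.
plugin_state() {
  case "$1" in
    0) echo OK ;;
    1) echo WARNING ;;
    2) echo CRITICAL ;;
    3) echo UNKNOWN ;;
    *) echo "out of range: $1" ;;
  esac
}

# Example (directories are assumptions; run on the monitoring host):
#   plugin=$(find_plugin check_http /usr/lib/nagios/plugins /usr/local/nagios/libexec) || exit 1
#   sudo -u nagios "$plugin" -H localhost -I 127.0.0.1
#   plugin_state $?
```

Running the plugin as the nagios user rather than as root is deliberate: a permissions problem would otherwise stay hidden.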
 
Sorry, but none of these directories exist on my system.

There is a nagios directory in /usr/locale/, but none of those files are in it.

I think I'll just reinstall the server with Debian. That goes pretty quickly ^^.
Something is probably broken: the server management control panel also says it cannot connect to the resource daemon.
 
I don't know the Ubuntu package structure in detail. Could it be that you need to install a separate "nagios-plugins" package in addition to "nagios"?
In any case you need a whole bundle of programs named check_(something) (e.g. check_users, check_load, etc.).
 
I have installed the plugins.

I have now put Debian back on the machine and everything works again. And that even though the guide was for Ubuntu...

Now, however, I have the problem that Nagios shows a critical error for HTTP in the localhost check, even though I can reach Apache myself.

The corresponding message is:

Code:
[01-20-2009 21:38:10] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;1;Verbindungsaufbau abgelehnt
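
For reading such a line: a SERVICE ALERT entry lists host, service, state, state type, attempt number and plugin output, separated by semicolons. "Verbindungsaufbau abgelehnt" is the German-locale wording of "Connection refused". The sketch below only splits the line into labelled fields; `parse_service_alert` is a made-up name.

```shell
# Split SERVICE ALERT log lines (read from stdin) into labelled fields.
# Field order: host;service;state;state type;attempt;plugin output
# The name parse_service_alert is hypothetical, not part of Nagios.
parse_service_alert() {
  sed -n 's/^\[\([^]]*\)] SERVICE ALERT: \([^;]*\);\([^;]*\);\([^;]*\);\([^;]*\);\([^;]*\);\(.*\)$/time=\1 host=\2 service=\3 state=\4 type=\5 attempt=\6 output=\7/p'
}

# Example:
#   printf '%s\n' '[01-20-2009 21:38:10] SERVICE ALERT: localhost;HTTP;CRITICAL;SOFT;1;Verbindungsaufbau abgelehnt' \
#     | parse_service_alert
```

SOFT with attempt 1 means this was the first failed attempt, and Nagios rechecks before treating the problem as a HARD state. A refused connection means nothing accepted the TCP connection at the address and port the check used. One possible cause, which is only a guess here, is that Apache listens on a different address or port than the one the localhost check targets.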
 