slapd debugging

debian bug 868753

using delta-syncrepl and gssapi auth

settings used:

olcSyncrepl:
  {0}rid=004
  provider=ldap://phd-debug-aa1.ethz.ch
  bindmethod=sasl
  saslmech=gssapi
  searchbase="dc=phys,dc=ethz,dc=ch"
  type=refreshAndPersist
  retry="1 600 60 7200 1800 +"
  network-timeout=1
  timeout=5
  logbase="cn=deltalog"
  logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
  syncdata=accesslog
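
The retry value above is a list of "interval count" pairs, where "+" as the count means retry indefinitely. As a reading aid, here is a minimal sketch that decodes such a string into phases (describe_retry is a hypothetical helper name, not part of OpenLDAP):

```shell
#!/bin/sh
# Decode a syncrepl retry specification into readable phases.
# describe_retry is a hypothetical helper for these notes, not an OpenLDAP tool.
# Usage: describe_retry "1 600 60 7200 1800 +"
describe_retry() {
    # Intentional word splitting: the spec is a space-separated list of pairs.
    set -- $1
    while [ "$#" -ge 2 ]; do
        if [ "$2" = "+" ]; then
            echo "every ${1}s, indefinitely"
        else
            echo "every ${1}s, up to $2 times"
        fi
        shift 2
    done
}

describe_retry "1 600 60 7200 1800 +"
```

For the setting used here this prints three phases: every 1s up to 600 times, every 60s up to 7200 times, then every 1800s indefinitely.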

1) slapd hangs on restart (phd-debug-aa1.ethz.ch)

action:

systemctl restart slapd

symptoms:

log output:

Jul 24 11:59:07 phd-debug-aa1 slapd[8055]: daemon: shutdown requested and initiated.
Jul 24 11:59:07 phd-debug-aa1 slapd[8055]: slapd shutdown: waiting for 4 operations/tasks to finish

gdb bt:

Reading symbols from slapd...(no debugging symbols found)...done.
Attaching to program: /usr/sbin/slapd, process 8055
[New LWP 8056]
[New LWP 8061]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f3a914536cd in pthread_join () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00007f3a914536cd in pthread_join () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00005572bd5897ca in slapd_daemon ()
#2  0x00005572bd570bc3 in main ()
(gdb)
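
The backtrace above only covers the main thread, which sits in pthread_join waiting for the other threads. To see what those threads are blocked on, something like the following sketch could be used. dump_threads and the output file name are made up for illustration, and the frames will only be readable if debugging symbols are installed (the gdb output above reports none):

```shell
#!/bin/sh
# Dump backtraces of every thread of a running process, non-interactively.
# dump_threads is a hypothetical helper name used only in these notes.
# Usage: dump_threads PID [OUTFILE]
dump_threads() {
    pid=$1
    out=${2:-slapd-threads.txt}
    if [ -z "$pid" ]; then
        echo "usage: dump_threads PID [OUTFILE]" >&2
        return 2
    fi
    # -batch makes gdb run the given commands and exit, detaching from the process.
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}

# Example, using the PID from the log above:
# dump_threads 8055 /tmp/slapd-8055-threads.txt
```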

action:

kill -9 8055
systemctl restart slapd

symptoms:

action:

kill -9 8907
systemctl start slapd

symptoms:

2) slapd crashes after restart/start (phd-debug-aa1.ethz.ch)

this one is interesting.
slapd seems to crash after roughly half of all starts/restarts if the following applies:

+phd-debug-aa2:~# debconf-get-selections | grep slapd
slapd   slapd/internal/adminpw  password
slapd   slapd/password1 password
slapd   slapd/internal/generated_adminpw        password
slapd   slapd/password2 password
slapd   slapd/invalid_config    boolean true
slapd   slapd/dump_database     select  when needed
slapd   shared/organization     string  ETH Zurich
slapd   slapd/password_mismatch note
# Do you want the database to be removed when slapd is purged?
slapd   slapd/purge_database    boolean false
# Potentially unsafe slapd access control configuration
slapd   slapd/unsafe_selfwrite_acl      note
slapd   slapd/ppolicy_schema_needs_update       select  abort installation
slapd   slapd/no_configuration  boolean false
slapd   slapd/dump_database_destdir     string  /var/backups/slapd-VERSION
slapd   slapd/domain    string  phys.ethz.ch
slapd   slapd/move_old_database boolean false
slapd   slapd/backend   select  MDB
slapd   slapd/upgrade_slapcat_failure   error

action:

+phd-debug-aa1:~# systemctl restart slapd

symptoms:

-phd-debug-aa1:~# ps -ef | grep '[u]sr/sbin/slapd'
-phd-debug-aa1:~#
+phd-debug-aa1:~# systemctl status slapd
● slapd.service - LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol)
   Loaded: loaded (/etc/init.d/slapd; generated; vendor preset: enabled)
   Active: active (exited) since Mon 2017-07-24 15:54:11 CEST; 18s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4720 ExecStop=/etc/init.d/slapd stop (code=exited, status=0/SUCCESS)
  Process: 4727 ExecStart=/etc/init.d/slapd start (code=exited, status=0/SUCCESS)

Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI client step 1
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI client step 1
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI client step 1
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI client step 1
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI client step 1
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server ldap/phd-debu
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: slap_client_connect: URI=ldap://phd-debug-aa2.ethz.ch ldap_sasl_interactive_bind_s failed (-2)
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: do_syncrepl: rid=002 rc -2 retrying (599 retries left)
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server ldap/phd-debu
Jul 24 15:54:11 phd-debug-aa1 slapd[4733]: slap_client_connect: URI=ldap://phd-debug-aa3.ethz.ch ldap_sasl_interactive_bind_s failed (-2)
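
Note that systemd reports "active (exited)" here although no slapd process exists, because the unit is generated from the init script and only tracks that script's exit status. A rough way to check the daemon itself is to test the PID recorded in its pidfile. This is a sketch only: pid_alive is a hypothetical helper, and the default path is an assumption about the Debian packaging, so adjust it to the configured olcPidFile:

```shell
#!/bin/sh
# Report whether the process recorded in a pidfile is actually running.
# pid_alive is a hypothetical helper; the default path is an assumed location.
# Usage: pid_alive [PIDFILE]
pid_alive() {
    pidfile=${1:-/var/run/slapd/slapd.pid}
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    # Run as root, otherwise a process owned by another user also reports failure.
    kill -0 "$pid" 2>/dev/null
}

if pid_alive; then
    echo "slapd is running"
else
    echo "slapd is not running, whatever systemctl status says"
fi
```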

other symptoms:

action (to prevent slapd from crashing):

+phd-debug-aa2:~# systemctl stop slapd
+phd-debug-aa3:~# systemctl stop slapd

after that slapd on phd-debug-aa1:

3) slapd crashes after restart/start (phd-debug-aa1.ethz.ch)

exactly the same actions/symptoms as in item 2)
but this time all 3 masters are fully configured and syncing via gssapi
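
To compare this setup with item 2), the failed syncrepl binds can be tallied per provider from the log. A small sketch, assuming log lines in the format shown in item 2) (count_bind_failures is a hypothetical helper name):

```shell
#!/bin/sh
# Count failed syncrepl binds per provider URI from slapd log lines on stdin.
# count_bind_failures is a hypothetical helper; it matches the
# "slap_client_connect: URI=... failed" lines shown in item 2).
count_bind_failures() {
    sed -n 's/.*slap_client_connect: URI=\([^ ]*\) .* failed.*/\1/p' |
        sort | uniq -c | awk '{print $1, $2}'
}

# Example:
# journalctl -u slapd | count_bind_failures
```

Each output line has the form "count URI", which makes it easy to see whether one provider fails more often than the others.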

4) slapd hangs on shutdown (phd-debug-aa2.ethz.ch)

configured as in item 3)
restart of slapd on phd-debug-aa2.ethz.ch:

using delta-syncrepl with simple bind (eliminating gssapi from the mix)

settings used:

olcSyncrepl:
  {0}rid=004
  provider=ldaps://phd-debug-aa1.ethz.ch
  bindmethod=simple
  binddn="cn=dbroot,cn=deltalog"
  credentials=secure
  searchbase="dc=phys,dc=ethz,dc=ch"
  type=refreshAndPersist
  retry="1 600 60 7200 1800 +"
  network-timeout=1
  timeout=5
  logbase="cn=deltalog"
  logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
  syncdata=accesslog

The symptoms from items 1-4 above (slapd crashing and hanging on shutdown) are gone.
These issues seem to be caused by gssapi auth.

The following issues are still present:

5) Endless replication loop after fast subsequent modifications

As mentioned in the bug report.
For details see debian bug 868753.



Author: Sven Mäder
Department: ISG D-PHYS ETH Zurich
Last modified: Tue Jul 25 12:42:55 CEST 2017
Copyright 2017 Sven Mäder