bind-dyndb-ldap design overview of locking

Operations performed by the plugin need to be serialized for several different reasons. This page gives the gist of each.

Preliminary reading

(Missing) granular locking

Some structures in BIND do not have their own fine-grained locks, but access to them still needs to be serialized. Some BIND APIs can be used to access these structures, and incorrect use of those APIs will cause crashes (or worse, sneaky memory corruption).

Before calling BIND APIs, read the comments in the header files very carefully and watch out for notes like “No other tasks are executing.” and similar. These indicate that some portions of the function might access structures that do not have their own locks. In that case, switch BIND to task-exclusive mode.

Task-exclusive mode

BIND can use multiple event queues and process events from them in parallel. Events from a queue are processed by a so-called task. Internally, BIND uses a task manager to control the execution of tasks. In certain situations we need to artificially stop parallel processing and process a single event at a time. This is done by switching the task manager to so-called task-exclusive mode.

The plugin has its own lock helpers, run_exclusive_enter() and run_exclusive_exit(). If called correctly, these functions switch BIND into task-exclusive mode and ensure that recursive locking by a single task is handled properly.

The code in the plugin is involved enough that functions requiring task-exclusive access can be reached through several different call paths, some of which already hold exclusive access. For this reason the plugin depends on recursive locking.

The main trick is to always call the run_exclusive_*() functions from the very same task during the lifetime of the plugin. This ensures that recursive locking is handled properly. As a consequence, all events which might possibly lead to calling one of these “special” functions have to be sent to the task inst->task.
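The recursive behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the plugin's code: every toy_* name is hypothetical, the "task manager" is a single owner pointer, and a second request for exclusive mode is assumed to report "busy" instead of blocking. The caller keeps the result of enter() and hands it back to exit(), so that only the outermost call actually leaves exclusive mode.

```c
#include <assert.h>
#include <stddef.h>

/* All toy_* names are hypothetical stand-ins, not BIND or plugin APIs. */
typedef enum { TOY_IGNORE, TOY_SUCCESS, TOY_LOCKBUSY } toy_state_t;

typedef struct toy_task { int id; } toy_task_t;

typedef struct toy_taskmgr {
    const toy_task_t *owner; /* task holding exclusive mode, or NULL */
} toy_taskmgr_t;

/* Assumed primitive: grant exclusive mode, or report that it is taken. */
toy_state_t toy_begin_exclusive(toy_taskmgr_t *mgr, const toy_task_t *task)
{
    if (mgr->owner == NULL) {
        mgr->owner = task;
        return TOY_SUCCESS;
    }
    return TOY_LOCKBUSY;
}

/* Assumed primitive: leave exclusive mode; only the holder may do so. */
void toy_end_exclusive(toy_taskmgr_t *mgr, const toy_task_t *task)
{
    assert(mgr->owner == task);
    mgr->owner = NULL;
}

/* Enter helper: remembers in *statep whether this call took the lock. */
void toy_exclusive_enter(toy_taskmgr_t *mgr, const toy_task_t *task,
                         toy_state_t *statep)
{
    assert(statep != NULL && *statep == TOY_IGNORE);
    *statep = toy_begin_exclusive(mgr, task);
    /* A busy result is only safe when the holder is this very task. */
    assert(*statep == TOY_SUCCESS || mgr->owner == task);
}

/* Exit helper: releases only if the matching enter() took the lock. */
void toy_exclusive_exit(toy_taskmgr_t *mgr, const toy_task_t *task,
                        toy_state_t state)
{
    if (state == TOY_SUCCESS)
        toy_end_exclusive(mgr, task);
    /* TOY_LOCKBUSY or TOY_IGNORE: nested or never entered, nothing to do. */
}
```

In this model a nested enter()/exit() pair issued by the same task is a no-op, while the same pair issued by a different task trips the assertion in toy_exclusive_enter(). That is the reason for routing all such events to one task.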

Database version locking

The BIND database has a concept of internal “versions” (see the BIND API function dns_db_newversion()). The RBT DB implementation used internally by BIND and by the plugin is limited to at most one read-only and one read-write version at the same time. An attempt to open a second read-write version leads to an assertion failure.

This is not a problem for BIND itself, because it processes events in such an order that a single read-write version is sufficient. Unfortunately, the event ordering in the plugin is not stable enough, so two events might try to create a new read-write version of the same database at the same time.

To avoid crashes, the plugin has its own kludgy implementation of newversion()/closeversion() which contains an additional lock. This hack could be removed if the plugin guaranteed that all events concerning a single zone are always sent to a single task (and are thus implicitly serialized).
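The idea of wrapping newversion()/closeversion() in an additional lock can be sketched as follows. This is a toy model, not the plugin's implementation: the toy_db_t type and all toy_* functions are hypothetical, the "database" is reduced to a flag that asserts when a second read-write version is opened, and a plain POSIX mutex stands in for whatever lock the plugin really uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical toy database: at most one read-write version may be open. */
typedef struct toy_db {
    bool            rw_open;         /* is a read-write version open? */
    unsigned        commits;         /* number of committed versions */
    pthread_mutex_t newversion_lock; /* the additional lock */
} toy_db_t;

void toy_db_init(toy_db_t *db)
{
    db->rw_open = false;
    db->commits = 0;
    pthread_mutex_init(&db->newversion_lock, NULL);
}

/* Unguarded open: a second concurrent caller hits the assertion. */
void toy_raw_newversion(toy_db_t *db)
{
    assert(!db->rw_open);
    db->rw_open = true;
}

/* Unguarded close: optionally commits the changes. */
void toy_raw_closeversion(toy_db_t *db, bool commit)
{
    assert(db->rw_open);
    db->rw_open = false;
    if (commit)
        db->commits++;
}

/* Guarded open: takes the lock and keeps it while the version is open. */
void toy_newversion(toy_db_t *db)
{
    pthread_mutex_lock(&db->newversion_lock);
    toy_raw_newversion(db);
}

/* Guarded close: closes the version first, then lets the next writer in. */
void toy_closeversion(toy_db_t *db, bool commit)
{
    toy_raw_closeversion(db, commit);
    pthread_mutex_unlock(&db->newversion_lock);
}
```

With the guarded pair, a second writer blocks in toy_newversion() until the first one calls toy_closeversion(), so the assertion in the raw function is never reached. Because the lock is held across two calls, the sketch assumes that each version is closed by the same thread that opened it.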

LDAP event rate-limiting

Events generated by SyncRepl are rate-limited in the plugin on purpose. BIND’s memory management contains a trap, described in ISC-Bugs #35160 (“Freelists in memory context consume memory indefinitely”).

If LDAP has a lot of entries, the SyncRepl thread might generate events faster than the rest of the BIND machinery is able to process them. This leads to huge memory consumption, which does not drop when the initial synchronization with LDAP is finished (because of the bug mentioned above).

To counter this problem, the plugin guards internal event generation and processing using the sync_concurr_limit_wait() and sync_concurr_limit_signal() functions. These guards limit the number of unprocessed LDAP events to LDAP_CONCURRENCY_LIMIT, which is 100 at the moment.

Make sure that all LDAP events are properly guarded; otherwise you risk high memory consumption or deadlocks (or both :-).
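The wait/signal guards can be modeled as a counting semaphore. The sketch below is an illustration under assumptions, not the plugin's code: toy_limit_t and the toy_limit_*() functions are hypothetical, and the semaphore is built from a POSIX mutex and condition variable. The producer calls wait() before generating an event and the consumer calls signal() once the event has been fully processed.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical counting semaphore bounding the number of pending events. */
typedef struct toy_limit {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    unsigned        available; /* free slots left */
} toy_limit_t;

void toy_limit_init(toy_limit_t *lim, unsigned limit)
{
    pthread_mutex_init(&lim->mutex, NULL);
    pthread_cond_init(&lim->cond, NULL);
    lim->available = limit;
}

/* Producer side: block until a slot is free, then take it. */
void toy_limit_wait(toy_limit_t *lim)
{
    pthread_mutex_lock(&lim->mutex);
    while (lim->available == 0)
        pthread_cond_wait(&lim->cond, &lim->mutex);
    lim->available--;
    pthread_mutex_unlock(&lim->mutex);
}

/* Consumer side: the event is done, so give its slot back. */
void toy_limit_signal(toy_limit_t *lim)
{
    pthread_mutex_lock(&lim->mutex);
    lim->available++;
    pthread_cond_signal(&lim->cond);
    pthread_mutex_unlock(&lim->mutex);
}
```

In this model the two failure modes are easy to see. An event that is generated without wait() is not counted, so the queue can grow without bound. An event that is processed without signal() leaks a slot, and once all slots are leaked the producer blocks forever.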