Wildcard refresh through InfoPipe¶
Related ticket(s):
Problem statement¶
The InfoPipe responder adds a listing capability to the frontend code, allowing the user to list users matching a very simple filter. To implement the back end part of this feature properly, we need to add the possibility to retrieve multiple, but not all, entries with a single DP request.
For details of the InfoPipe API, please see the DBus responder design page.
Use cases¶
A web application using the InfoPipe interface requests all users whose names start with the letter ‘a’ so that they can be displayed in the application UI on a single page. The SSSD must fetch and return all matching user entries, but without requiring enumeration, which would pull down too many users.
Overview of the solution¶
Currently, the input that the Data Provider receives can only be a single user or group name. Wildcards are not supported at all; the back end actively sanitizes the input to escape any characters that have a special meaning in LDAP. Therefore, we need to add functionality to the Data Provider to mark the request as a wildcard.
Only requests by name will support wildcards, not, e.g., requests by SID, mostly because there would be no consumer of this functionality. Technically we could allow wildcard searches on any attribute with the same code, though. Also, only requests for users and groups will support wildcards.
When the wildcard request is received by the back end, sanitization will be done, but modified in order to avoid escaping the wildcard. After the request finishes, a search-and-delete operation must be run in order to remove entries that matched the wildcard search previously but were removed from the server.
Implementation details¶
The wildcard request will only be used by the InfoPipe responder, but will be implemented in the common responder code, in particular the new cache_req request.
The following sub-sections document the changes explained earlier in more detail.
Responder lookup changes¶
The responder code changes will be done only in the new cache lookup code (src/responder/common/responder_cache_req.c). Since the NSS responder wouldn’t initially expose the functionality of wildcard lookups, we don’t need to update the lookup code currently in use by the NSS responder.
The cache_req_input_create() function should be extended to denote that the name input contains a wildcard, to make sure the caller really intends to leave the asterisk unsanitized. Internally, the cache_req_type would add a new value as well.
We might add a new user function and a group function that would grab all entries by sysdb filter, which can be more or less a wrapper around sysdb_search_entry, just setting the right search bases and default attributes. This new function must be able to handle views.
These responder changes should be developed as a first phase of the work as they can be initially tested with enumeration enabled on the back end side.
Responder <-> Data Provider communication¶
The request between the responders and the Data Provider is driven by a string filter, formatted as follows:
type:value:extra
Where type can be one of name, idnumber or secid. The value field is the username, ID number or SID value, and extra currently denotes either a lookup with views or a lookup by UPN instead of name.
To support the wildcard lookups, we have two options here: add a new type option (perhaps wildcard_name) or add another extra_value.
Adding a new type would be easier since it only adds new code rather than changing existing code. On the back end side, the type would typically be handled together with name lookups, just with the input sanitized differently. The downside is that if we ever wanted to allow wildcard lookups for anything else, we’d have to add yet another type. Code-wise, adding a new type would translate to adding new values for the sss_dp_acct_type enum, which would then print the new type value when formatting the sbus message.
The other option would be to allow a multivalued extra field:
type:value:extra1:extra2:...:extraN
However, that would involve changing how we currently handle the extra field, which carries a higher risk of regressions. Also, the back ends can technically be developed by a third party, so we should be extremely careful about changing the protocol between DP and providers. Since we don’t expect to allow any wildcard requests other than by name yet, I’m proposing to go with the first option and add a comment to the code to change to using the extra field if we need wildcard lookups by another attribute.
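To make the first option more concrete, the following is a minimal Python model (not SSSD's C code) of formatting and parsing the type:value:extra string with a separate wildcard type. The helper names format_dp_filter, parse_dp_filter and is_wildcard are hypothetical, wildcard_name is only the tentative type name suggested above, and the sketch assumes the value itself never contains a colon.

```python
from typing import NamedTuple, Optional


class DpFilter(NamedTuple):
    type: str
    value: str
    extra: Optional[str]


def format_dp_filter(type_: str, value: str, extra: Optional[str] = None) -> str:
    """Build the type:value[:extra] string sent to the Data Provider."""
    parts = [type_, value]
    if extra is not None:
        parts.append(extra)
    return ":".join(parts)


def parse_dp_filter(raw: str) -> DpFilter:
    """Split a filter string into its fields (single-valued extra only)."""
    fields = raw.split(":", 2)
    if len(fields) < 2 or not fields[0] or not fields[1]:
        raise ValueError(f"malformed filter: {raw!r}")
    extra = fields[2] if len(fields) == 3 else None
    return DpFilter(fields[0], fields[1], extra)


def is_wildcard(flt: DpFilter) -> bool:
    """Assumed dispatch rule: the new type shares the name lookup path,
    but tells the back end to sanitize the value differently."""
    return flt.type == "wildcard_name"
```

With this shape, existing name, ID and SID requests are formatted exactly as before, which is the main argument for the first option: a provider that does not know the new type simply never sees a changed message for the old ones.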
Relax the sss_filter_sanitize function¶
When a wildcard request is received, we still need to sanitize the input and escape special LDAP characters, but we must not escape the asterisk (*).
As a part of the patchset we need to add a parameter that will denote characters that should be skipped during sanitization.
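The idea of a skip parameter can be illustrated with a short Python model. It is not the actual sss_filter_sanitize code: the function name filter_sanitize and the skip argument are hypothetical, and the escape table covers only the characters that RFC 4515 requires to be escaped in filter values, which may differ from what SSSD escapes.

```python
# Characters that RFC 4515 requires to be escaped in LDAP filter values,
# mapped to their backslash-hex escape sequences.
LDAP_SPECIAL = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\0": r"\00",
}


def filter_sanitize(value: str, skip: str = "") -> str:
    """Escape LDAP special characters, leaving those listed in `skip` as-is."""
    out = []
    for ch in value:
        if ch in LDAP_SPECIAL and ch not in skip:
            out.append(LDAP_SPECIAL[ch])
        else:
            out.append(ch)
    return "".join(out)
```

A regular lookup would call filter_sanitize(name) and get the asterisk escaped as before, while a wildcard lookup would call filter_sanitize(name, skip="*") so that only the asterisk keeps its special meaning.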
Delete cached entries removed from the server¶
After a request finishes, the back end needs to remove entries that are cached from a previous lookup using the same filter, but no longer present on the server.
Because wildcard requests can match multiple entries, we need to save the time of the backend request start and delete all entries that match a sysdb filter analogous to the LDAP filter, but were last updated prior to the start of the request.
Care must be taken with case sensitivity. LDAP servers are typically case-insensitive, whereas sysdb (and POSIX systems) are case-sensitive, so by default we will match only the case-sensitive name attribute. With case-insensitive back ends, the search function must also match the nameAlias attribute.
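The cleanup rule described in this section can be sketched as follows. This is a Python model under stated assumptions, not the sysdb implementation: cached entries are plain dictionaries, the keys name, nameAlias and lastUpdate are used for illustration only, the helper names are hypothetical, and case-insensitive matching is simplified to ignoring case on both the name and its aliases.

```python
import re
from typing import Any, Dict, List


def wildcard_to_regex(pattern: str, case_sensitive: bool) -> re.Pattern:
    """Translate a name pattern in which only '*' is special into a regex."""
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    flags = re.DOTALL if case_sensitive else re.DOTALL | re.IGNORECASE
    return re.compile(body, flags)


def find_stale(entries: List[Dict[str, Any]], pattern: str,
               request_start: int, case_sensitive: bool = True) -> List[str]:
    """Return names of cached entries that match `pattern` but were not
    refreshed by the request that started at `request_start`."""
    rx = wildcard_to_regex(pattern, case_sensitive)
    stale = []
    for entry in entries:
        candidates = [entry["name"]]
        if not case_sensitive:
            # Case-insensitive back ends also match the alias values.
            candidates += entry.get("nameAlias", [])
        matched = any(rx.fullmatch(c) for c in candidates)
        # Entries saved by the current request have a timestamp at or
        # after the request start, so they are kept.
        if matched and entry["lastUpdate"] < request_start:
            stale.append(entry["name"])
    return stale
```

Recording the start time before the LDAP search is sent matters here: an entry refreshed by the request always carries a later timestamp, so only entries that the server no longer returned are selected for deletion.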
LDAP provider changes¶
The LDAP provider is the lowest common denominator of other providers and hence it would contain the low-level changes related to this feature.
In the LDAP provider, we need to use the relaxed version of the input sanitizing and the wildcard method to delete matched entries. These changes will be confined to the users_get_send() and groups_get_send() requests.
The requests that fetch and store the users or groups from LDAP currently have a parameter called enumerate that is used to check whether it’s OK to receive multiple results or not. We should rename the parameter or even invert it along with renaming (e.g. change the name to direct_lookup or similar).
We also need to limit the number of entries returned from the server, otherwise the wildcard request might easily turn into a full enumeration. To this end, we will add a new configuration option wildcard_search_limit. Internally, we would change the boolean parameter of sdap_get_users_send to a tri-state that would control whether we expect only a single entry (i.e. don’t use the paging control), multiple entries with a search limit (wildcard request) or multiple entries with no limit (enumeration). We need to make sure during implementation that it is discoverable via DEBUG messages that the upper limit was reached.
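The tri-state described above could look roughly like the following Python model. The names LookupType, search_params and apply_limit are hypothetical and do not correspond to existing SSSD symbols. Whether a wildcard search uses the paging control is not decided in this page, so treating it like enumeration in that respect is an assumption of the sketch.

```python
import logging
from enum import Enum
from typing import List, Optional, Tuple

log = logging.getLogger("sdap_sketch")


class LookupType(Enum):
    SINGLE = "single"            # direct lookup, one entry expected
    WILDCARD = "wildcard"        # multiple entries, capped by the limit
    ENUMERATION = "enumeration"  # multiple entries, no cap


def search_params(lookup: LookupType,
                  wildcard_limit: int) -> Tuple[bool, Optional[int]]:
    """Return (use_paging, size_limit) for the given lookup type."""
    if lookup is LookupType.SINGLE:
        return False, None
    if lookup is LookupType.WILDCARD:
        # Assumption: paging stays on, the limit bounds the total count.
        return True, wildcard_limit
    return True, None


def apply_limit(results: List[str], limit: Optional[int]) -> List[str]:
    """Truncate results to the limit and log when the cap is reached."""
    if limit is not None and len(results) >= limit:
        log.debug("wildcard search limit of %d entries reached", limit)
        return results[:limit]
    return results
```

Replacing the boolean with an explicit three-valued type keeps the single-entry and enumeration paths unchanged and makes the limited case impossible to confuse with either of them at the call site.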
IPA provider changes¶
The tricky part of the IPA provider is the views. Lookups with views have two branches: either an override object matches the input and we then look up the corresponding original object, or the other way around. The code must be changed to support multiple matches for both overrides and original objects in the first pass. We might end up fetching more entries than needed, because an object may no longer match once the responder applies the override, but the merging on the responder side will return only the appropriate entries.
Currently, the request handles all account lookups in a single tevent request, with branches for special cases, such as initgroup lookups or resolving ghost members during group lookups. We might need to refactor the single request a bit into per-object tevent lookups to keep the code readable.
Please keep in mind that each tevent request has a bit of performance overhead, so adding a new request is always a trade-off. Care must be taken not to regress the performance of the default case unless necessary.
If the first override lookup matches, then we must loop over all returned overrides and find matching originals. The current code re-uses the state->ar structure, which is single-valued; we need to add another, multi-valued structure instead (state->override_ar) and perhaps even split the lookup of original objects into a separate request, depending on the complexity.
Conversely, when the original objects match first, we need to loop over the original matches and fetch overrides for each of the objects found. Here, the get_object_from_cache() function needs to be able to return multiple results and the following code must be turned into a loop.
When looking up the overrides, the be_acct_req_to_override_filter() must be enhanced to be able to construct a wildcard filter. The ipa_get_ad_override_done must also return all matched objects if needed, not just the first array entry. The rest of the ipa_get_ad_override_send() request is generic enough already.
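The two lookup branches and the responder-side filtering described in this section can be modeled in a few lines of Python. This is only an illustration of the data flow: the function names are hypothetical, objects are reduced to an ID and a name, overrides are assumed to replace the name, and only trailing-asterisk patterns are handled.

```python
from typing import Dict, List


def matches(pattern: str, name: str) -> bool:
    """Prefix-only wildcard match ('abc*'), enough for this illustration."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def backend_fetch(pattern: str, originals: Dict[str, str],
                  overrides: Dict[str, str]) -> List[str]:
    """Collect IDs of every object matched through either branch."""
    # Branch 1: overrides match first, then the originals are looked up.
    via_override = [oid for oid, name in overrides.items()
                    if matches(pattern, name)]
    # Branch 2: originals match first, then their overrides are fetched.
    via_original = [oid for oid, name in originals.items()
                    if matches(pattern, name)]
    return sorted(set(via_override) | set(via_original))


def responder_view(pattern: str, fetched: List[str],
                   originals: Dict[str, str],
                   overrides: Dict[str, str]) -> Dict[str, str]:
    """Apply overrides and keep only entries whose effective name matches."""
    result = {}
    for oid in fetched:
        effective = overrides.get(oid, originals[oid])
        if matches(pattern, effective):
            result[oid] = effective
    return result
```

For example, with originals alice, bob and adam, where bob is overridden to albert and adam to zed, a request for a* fetches all three objects in the back end, but the responder returns only alice and albert. This is the over-fetching case mentioned above.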
IPA subdomain lookups via the extdom plugin¶
Currently the extdom plugin only supports direct entry lookups, even on the server side. We could add a new request that accepts a filter with an asterisk and returns a list of matching DNs or names, but because of the complexity of the changes, this part of the implementation should be deferred until requested specifically.
If an IPA subdomain received a wildcard request, it would reply with an error code making it clear that this request is not supported.
Making sure the IPA provider in server mode is capable of returning wildcard entries and adding a wildcard-enabled function for the libnss_sss_idmap library would be a prerequisite so that the extop plugin can request multiple entries from the SSSD running in the server mode.
AD provider changes¶
No changes seem to be required for the AD provider, since the AD provider mostly just passes around the original ar request to a Global Catalog lookup or an LDAP lookup. However, testing must be performed in an environment where some users have POSIX attributes but those attributes are not replicated to the Global Catalog to make sure we handle the fallback between connections well.
Other providers¶
Proxy provider support is not realistic, since the proxy provider only uses the NSS functions of the wrapped module which means it would rely on enumeration anyway. With enumeration enabled, the responders would be able to return the required matching entries already. The local provider is not a real back end, so it should get the wildcard support for free, just with the changes to the responder.
Configuration changes¶
A new option wildcard_search_limit will be added. The default value would be 1000, which is also typically the size of one page.
How To Test¶
When the InfoPipe API is ready, testing will be done using methods such as ListByName. Until then, the feature is not exposed or used anyway, so developers can test using a special command-line tool that would send the DP request directly. This tool wouldn’t be committed to the git tree.
Authors¶
- Jakub Hrozek <jhrozek@redhat.com>