SETroubleshoot Developer FAQ


  1. RPC
    1. Why does SETroubleShoot use RPC?
    2. What RPC mechanism does SETroubleshoot use and why?
    3. The internal RPC uses XML, why not just use XML-RPC?
    4. DBUS
    5. What is the data channel between the client and server?
    6. What is the protocol between the client and server?
      1. What are the RPC header fields?
    7. What types of RPC transactions are there?
    8. Are RPC calls synchronous or asynchronous?
    9. How do I get a return values from a RPC call?
    10. How are errors handled in RPC calls?
    11. How are the signatures (parameter lists) of RPC methods defined?
    12. RPC calls on the wire, and XML serialization
    13. Calling a remote method vs. implementing a remote method
    14. Can a single RPC interface provide both caller and callee methods?
    15. How do I monitor the state of the connection?
  2. XML
    1. What is XML used for?
    2. Why isn't Python introspection used for XML serialization?
    3. Where is the XML structure defined and how does it work?
  3. Plugins
    1. What plugins are loaded?


RPC

Why does SETroubleShoot use RPC?

SETroubleshoot is composed of two fundamental parts. A system daemon (setroubleshootd) running with root privlegesm, often referred to as the server and 1 or more clients which connect to the setroubleshootd server to receive notifications of new AVC denials and query the history of AVC denials on a particular node.

The daemon (server) needs to exist for two fundamental reasons: It needs to run with root priveldges to interact with the audit system, perform system operations, and then to isolate the information it manages from user level processes. In addition by acting as a server on a particular node it is possible to build a distributed system whereby remote clients can connect to multiple nodes to listen for alerts or query their alert history.

Initially only the client is sealert. sealert can operate either as a desktop GUI component or a command line tool.

One can think of the server (setroubleshootd) as the process which manages data and the client (sealert) as the process which presents the data. Because they are two seperate processes, possibly running on different nodes, they communnicate via RPC.

What RPC mechanism does SETroubleshoot use and why?

SETroubleshoot uses it's own RPC based on XML documents exchanged over a socket connection.

But the world is full of RPC mechansisms, many of which are standardized, why invent a new private mechanism when instead you could use SOAP, DBUS, XML_RPC, etc.?

Because there were several requirements which prompted this decision:

The internal RPC uses XML, why not just use XML-RPC?

DBUS

DBUS has nice python bindings and it is well supported. Why not use DBUS as the RPC mechanism?

Because DBUS currently only supports local interprocess communication on one node, this is its design center.

But setroubleshoot currently uses DBUS, why is it used for some things and not others, more to the point, if you're using DBUS already why introduce a private RPC mechanism?

DBUS is used for the desktop GUI component (sealert). It's use in this context makes perfect sense because DBUS has at it's heart facilities for managing items in a user's desktop session. In particular DBUS provides easy ways to automatically launch programs, treat a program as a service provider for the login session, etc. The GUI desktop component of setroubleshoot (sealert) benefits from it's integration with the desktop session, but this is independent of the issue of where and how the AVC alert information is derived from. By analogy a web browser benefits from DBUS integration on the desktop but its data commuication remains HTTP to remote nodes.

What is the data channel between the client and server?

The connection is a socket. By default the socket is a local UNIX domain socket for enhanced security. However, it is trival to configure the client/server code to use INET sockets instead to accomodate remote connections. The vast bulk of the code is agnostic with regards to the socket type.

What is the protocol between the client and server?

The protocol is a very simple text based protocol similar to HTTP. Each transaction consists of a header and a body.

The header is a series of key/value pairs (e.g. fields). Keys start at the beginning of a line and may not contain whitespace or a colon (:). The key is seperated from its value by a colon (:) and is terminated by a line ending. Line endings are carriage return, newline pairs.

The header is seperated from the body by a blank line.

The header MUST contain a content-length field which is the octet count of the body (beginning after the blank line seperating the header from the body).

Fields in the header may be appear in any order.

No information is provided which declares the size of the header. The header MUST declare the size of the body with a content-length field. The receiving end reads a header until it sees a blank line demarking the header from the body. At this point the receiving end knows the exact body octet count which must then be read to consume the entire body. After the body has been read the receiving end interprets subsequent octets as the header of the next transaction, thus it always knows the boundary between transactions.

What are the RPC header fields?

What types of RPC transactions are there?

Are RPC calls synchronous or asynchronous?

RPC calls are always asynchronous. Eliminating blocking and handling RPC transactions as part of a larger event loop vastly simplifies large parts of the code and offers tremendous flexibility (e.g. event driven model). There is no support for synchronous RPC calls.

How do I get a return values from a RPC call?

RPC methods return an async_rpc object which implements the add_callback() and add_errback() methods. Immediately after calling the RPC command one can use the returned object to attach any number of callback or error handlers to the command. At some point in the future the RPC command will return either its return values or in the case of a failure error information. When the return information arrives the RPC implementation looks up the object belonging to the call and calls each attached callback if the method succeeded, or each attached errback if the method failed.

Here is a code example:

def get_flag(flag_name):

    def get_flag_callback(flag_value):
        print "flag %s = %s" % (flag_name, flag_value)

    def get_flag_error(method, errno, strerror):
        print "%s ERROR: error code = %d, message = %s" % (method, errno, strerror)

    async_rpc = foo.get_flag(item, flag_name)
    async_rpc.add_callback(get_flag_callback)
    async_rpc.add_errback(get_flag_error)

This will call the get_flag method on object which implements the RPC interface which defines the get_flag() method. In this example 'foo' is an object exporting that interface.

At some point in the future when control returns to the main event loop either the function get_flag_callback() or get_flag_error() will be called.

Note: Because Python implements "function closure" one may nest functions (e.g. inner functions). When this is done values are "bound" in the inner function when the outer function is invoked. Thus any value which is in the scope of the outer function may be referenced in the inner function when the inner function is invoked.

In our example flag_name is defined in the outer get_flag() function and referenced in the inner get_flag_callback() function. Whenever the inner callback is invoked in the future the value of flag_value will be whatever it was when the rpc method was invoked. This is very handy, it means your callback maintains variable state, if you need to reference something in your callback you can and be confident it will be unique to that callback instance. Even if we called get_flag() twice in a row passing "foo" and "bar" as the flag_name parameter before either of the callbacks were invoked the first instance of get_flag_callback would have flag_name bound to "foo" and the second instance would have flag_name bound to "bar".

How are errors handled in RPC calls?

All RPC calls are assumed to succeed. Errors are never part of the normal return values. Instead, if a RPC call generates an error a seperate callback is invoked known as the errback. Thus when calling an RPC method you may attach a normal callback to it and/or a callback to handle an error, the two are distinct. When the RPC returns either the attached callbacks are going to be invoked if it is successful or the attached errbacks are going to be invoked if it failed.

All errbacks have identical function signatures:

  def errback(method, errno, strerror):

The method parameter will be the name of the method which induced the error, the errno parameter will be a numeric error code, and strerror will be a string describing the error).

The method name parameter in the errback is a convenience, it does not tell you which particular instance of an RPC method call failed, just the name of the method. If you need to know the particular instance recall there is an exact one-to-mapping when you bind an errback to a RPC call via the add_errback() method. This binding is identical to add_callback() and all the issues relative to binding scope described for callbacks apply to the errback as well. If you need to know the exact instance just make sure there is something in the binding scope of the errback which uniquely identifies it, for instance an object reference.

How are the signatures (parameter lists) of RPC methods defined?

Each RPC call is a member of an RPC interface. Each method in an RPC interface must have a name unique within that interface. Different interfaces may have method names in common.

The definitions of RPC methods, signals, and callbacks are done using Python decorators. A decorator is a python function preceded by an "at sign" (@). Python functions or methods which have been "decoratated" have one or more decorators immediately preceding their def statement.

Our RPC implementation defines 3 decorators (in src/rpc.py)

When a method is decorated with @rpc_method() it is saying "this method is a member of the interface". It's parameter list is defined by the python method that was decorated.

Optionally we may define the type of each parameter via the @rpc_arg_type() decorator. After the interface defintion comes a list of Python types, one for every parameter. If the @rpc_arg_type() decorator is absent the parameters default to being of type string.

If the RPC call is a method (as opposed to a signal) it must have one or more return values. Return values are passed back via a callback so it is natural to define them as a parameter list to the callback function. The @rpc_callback() decorator says "this is the callback signature for a method in an interface" Just like the @rpc_method() decorator it may optionally be associated with @rpc_arg_type() decorator to define the parameter types.

The RPC interfaces used in setroubleshoot are defined in src/rpc_interfaces.py

Here is an example which illustrates the concepts:

    @rpc_method('SETroubleshootServer')
    def database_bind(self, database_name):
        pass

    @rpc_callback('SETroubleshootServer', 'database_bind')
    @rpc_arg_type('SETroubleshootServer', SEDatabaseProperties)
    def database_bind_callback(self, properties):
        pass

We are defining a method in the SETroubleshootServer interface called database_bind. The method takes one parameter, database_name. Because there is no @rpc_arg_type decorator associated with the method the parameters default to being strings.

Because it is a method (as opposed to a signal) it must have an associated callback to return its values. The @rpc_callback() decorator is binding this callback defintion to the database_bind method in the SETroubleshootServer interface. The callback has one parameter which is defined to be an object of the class SEDatabaseProperties, this information is provided by the @rpc_arg_type() decorator.

In summary, when database_bind() is called passing a database name it will return a SEDatabaseProperties object.

RPC calls on the wire, and XML serialization

In src/signature.py we define the members of the SEDatabaseProperties object

SEDatabasePropertiesAttrs = {
    'name'          : {'XMLForm':'element' },
    'friendly_name' : {'XMLForm':'element' },
    'filepath'      : {'XMLForm':'element' },
    }

Which simply states the SEDatabaseProperties object has 3 members (name, friendly_name, and filepath), and when the object is serialized into XML each member will be an element (defaulting to type string).

On the wire the database_bind() method is encoded as a header+body. The header states the type of the RPC call is 'method' and its unique rpc identifier (rpc_id) is '1'. The rpc_id will be used at method return time to match the callbacks to this particular call. The header also states the body which follows the header is 181 octets long.

A client calling database_bind('audit_listener') would have a wire representation which looks like this:

data=content-length: 181
type: method
rpc_id: 1

<?xml version="1.0"?>
<cmd interface="SETroubleshootServer" method="database_bind" arg_count="1">
  <arg name="database_name" position="0" type="string">audit_listener</arg>
</cmd>

The body is always an XML document which encodes the RPC call and its parameters. The top <cmd> element defines the interface and method which as a pair uniquely identifes what is being called. It also includes a count of how many arguments follow.

Each child of the <cmd> element is an <arg> element which defines the parameter name, its position in the parameter list, and its type. The children of the <arg> element is the value for the parameter, which is interpreted via the type attribute of its <arg> element parent.

Following our example of a client calling database_bind('audit_listener') the server might respond with

content-length: 373

type: method_return

rpc_id: 1



<?xml version="1.0"?>
<cmd interface="SETroubleshootServer" method="database_bind_callback" arg_count="1">
  <arg name="properties" position="0" type="xml">
    <properties>
      <filepath>/var/lib/setroubleshoot/audit_listener_database.xml</filepath>
      <friendly_name>Audit Listener</friendly_name>
      <name>audit_listener</name>
    </properties>
  </arg>
</cmd>

At this point the header should be self explanatory. The body says this is a database_bind_callback method in the interface SETroubleshootServer. We know from the interface definition database_bind_callback() has one parameter called 'properties' and it will be a SEDatabaseProperties object.

When the wire representation is decoded we look up the defintion of the method and see we are expecting one parameter and it should be a SEDatabaseProperties object. The type attribute in the <arg> element says it is encoded as XML, which is how we encode complex objects. The SEDatabasePropertiesAttrs says we should expect 3 elements (name, friendly_name, and filepath) all encoded as elements. We parse the children of the <arg> node and construct a SEDatabaseProperties from the values we parse.

At this point the we lookup the rpc_id to see what callbacks were associated with the originial database_bind() call. For every callback bound to this call we call the callback passing the SEDatabaseProperties object we constructed from the XML body.

Calling a remote method vs. implementing a remote method

There is a distinction between calling an RPC method (caller) and implementing an RPC method (callee).

Objects which want to call a remote method must include somewhere in their ancestor classes the RPC interface class which contains the method they wish to call (currently all RPC interfaces are defined in src/rpc_interfaces.py). In addtion the class must derive from RpcChannel? so there is a communication pipe. Recall that methods defined in the RPC interface class are decorated with things like @rpc_method(). These decorators are the magic which transforms a method call on an RPC interface into RPC communication. From the caller's perspective one only needs to include the interface definition and then make the call remembering that all methods in a RPC interface return an async_rpc object which you will need to connect a callback and/or errback to.

Lets use the following RPC interface example for both a caller and callee:

class MyInterface:
    # get_user_data takes a user name as a parameter and returns a
    # UserData object describing the user
    @rpc_method('MyInterface')
    def get_user_data(user_name):
        pass

    @rpc_callback('MyInterface', 'get_user_data')
    @rpc_arg_type('MyInterface', UserData)
    def get_user_data_callback(user_data):
        pass

As the caller your code might look something like this:

class DoSomethingRemote(RpcChannel, MyInterface):
    def __init__(self):
        self.open() # pseudo code for opening channel
        self.user_data = None

    def lookup_user(self, user_name):

        def store_user_data(user_data):
            self.user_data = user_data

        def user_data_error(method, errno, strerror):
            print "could not find data for user %s: %s" % (user_name, strerror)
            sys.exit(1)

        async_rpc = self.get_user_data(user_name)       # make the call
        async_rpc.add_callback(store_user_data)         # attach the callback
        async_rpc.add_errback(user_data_error)          # handle errors

To call the RPC get_user_data() the object derived the channel and the interface and simply just called the method providing a callback handler for the return value, which in our example will be a UserData? object.

To implement the RPC method on the remote side a class provides an implementation of the method which matches the signature (parameter list) in the RPC interface. That class DOES NOT derive from the RPC interface! The class just provides a method with the same parameter list as defined in the interface and provide an RpcChannel? for communication. Then it connects the communication pipe (RpcChannel?) to the object implementing the RPC interface. The connection is made by:

channel.connect_rpc_interface(interface_name, handler_object)

What this does is tell the channel when it sees an RPC on the specified interface it should call the method in the handler_object.

A callee returns values as if it were the parameter list for the callback. Thus if the callback had two parameters and looked like this: iface_callback(foo, bar) the callee would return (foo, bar). If the callee does not have any return values it should return None. Thus a callee's return value is always either a list or None.

Thus on the callee side your code might look like this:

class MyClass(RpcChannel):
    def __init__(self):
       self.open() # pseudo code for opening channel
       self.connect_rpc_interface('MyInterface', self)

    def get_user_data(self, user_name):
        user_data = find_user(user_name)
        if user_data is None:
            raise ProgramError(ERR_USER_NOT_FOUND, "user (%s) not found" % user_name)

        user_data = UserData(user_name)
        return [user_data]

Error returns to the caller are handled gracefully by the Python exception mechanism. The RPC framework will catch ProgramError?'s. If a ProgramError? exception is raised during the execution of a RPC callee's implementation of a method the RPC framework will catch the ProgramError? and automatically return the error code and error message to the caller via the errback mechanism.

On the caller side this translates to calling store_user_data(user_data) if the lookup succeeds or calling user_data_error('get_user_data', ERR_USER_NOT_FOUND, 'user (xxx) not found") if it fails.

Can a single RPC interface provide both caller and callee methods?

No. RPC interfaces are one directional. A RPC interfaces defines what can be called on the interface. If you want both ends of the connection to be able to call each other you will need an interface for each end of the connection so there is a seperate calling interface on each end. One end calls, the other end implements.

How do I monitor the state of the connection?

It is often necessary to know when a connection has been made, when it drops, etc. The RpcChannel? class provides a callback, on_connection_state_change. Whenever the connection state changes it will call you with the new state. Currently the states are:

XML

What is XML used for?

XML is used primarilly for two purposes:

This means any object which will be stored or passed in RPC will need to be able to be serialized into and out of XML representation.

Why isn't Python introspection used for XML serialization?

Fundamentally there are two approaches one could take to provide XML serialization of objects:

  1. The object is unaware of how it might be serialized, instead via

introspection and recursion objects are broken down into fundamental types with simple rules on how to encode each fundamental type in XML.

  1. The object is XML aware and is in control of exactly what its XML

representation is.

Option 1 is simple and elegant. Unfortunately it's not robust. There is no way to control what the final XML representation will be, anything in the object will get serialized, even if its not meant to be part of its portable representation. Python is very forgiving and you can attach any data you want at any time by simply assigning to the object. Such "extra data" might be legitimate internal private use, or it might get attached to the object by programmer error. Either way, it's not data which should get serialized into its portable representation. The portable representation should be very well defined. Lacking a well defined representation for the object also means it's difficult to provide defaults for member values, to validate its structure, guard against injecting superfluous data, especially if the extra data is malicous in intent, convert between versions of the representation, etc. Automatic serialization via introspection is like programming in Basic were everything is a global variable.

Option 2 is much more robust. There is a well defined structure for objects. If a value is absent there is a defined default. Values which are not part of the definition can never be introduced via serilization. Malformed representations are easy to detect. Values can be assigned types. Versioning can be used to upgrade and downgrade representations. The representation can be controlled to take advantage of XML features and tailor the reprensentation to optimize for size or speed.

One of the original goals of the project included communicating data across diverse systems possibly running at different revision levels making the well known structure of the XML representation critical. Automatic serialization via introspection fails the goal of well known structure. There needs to be a definition for the XML structure independent of the objects being serialized.

Where is the XML structure defined and how does it work?

For historical reasons the XML structure is defined in src/signature.py. Each object which can be serialized inherits from the class XmlSerialize?. When the XmlSerialize? class is initialized it is passed a XXXAttrs dict which defines the structure, where XXX is the class name of the object.

Each key in the Attr's dict is a member name in the object. When the object is created every key in the Attr's dict is instantiated as a member of the object and given a default value.

Each member can specify whether it will be serialized as an attribute or an element (XMLForm)

Each member can define it's type, at the moment it is either a class (which includes complex objects, and simple objects suchs as integers) or a list. Lists are unordered homogeneous collections. For both class and list members their values are a dict which defines their properties. For example the 'name' key in a class definition is the name of the class. The name key in a list defintion is the (element) name given to every member of the list.

If the 'default' key is present in a defintion dict it is the default value to be assigned, otherwise the default is None. Actually, the default value is a function which is called that returns the default value, this is much more flexible. For instance it can return an instance of a class.

Because the XML definition allows any object member to be either an object or a list of objects the serialization process is recursive yielding surprisingly complex XML representations from simple definitions.

The basic process for when an object is serialized procedes like this: For every member in my Attr's dict get the value from the object. If the value is not None ask it to return it's representation as a xml node and add it to my xml node, then return my xml node. It's easy to see the recursion in this. Plus it guarantees only members in the defintion will be serialized according the definition.

When XML is parsed to recreate an object the process begins with the definition. For each member in the definition find the XML node in the tree which matches it, ask that node to convert itself to a python object by it's defintion rules and then assign it to that member of the object. Once again the recursion should be self evident. Also note this process is significantly different than "walking" an XML tree. In that case the resulting python object would be whatever happened to be in the XML input. By letting our definition drive the parsing we end up with Python object which exacatly match our definitions.

Plugins

What plugins are loaded?

Plugins are located in the directory /usr/share/setroubleshoot/plugins

The load_plugins() function in src/util.py will attempt to load any file in the plugin directory with a python suffix as a python module. If the module load results in a python exception a diagnositic message is emitted and the plugin module is removed from the list of plugins. This allows for the entire system to be brought up even if one or more plugin's are bad.