[LITMUS^RT] Object Descriptors
Björn Brandenburg
bbb at mpi-sws.org
Mon Jan 19 09:14:51 CET 2015
On 10 Jan 2015, at 01:07, Glenn Elliott <gelliott at cs.unc.edu> wrote:
>
> Hello Fellow Litmus Devs,
>
> In Litmus, each semaphore/mutex instance is uniquely identified by a (file name, object ID) tuple. To gain access to such an instance, a thread passes these parameters to Litmus. Litmus will create the semaphore/mutex instance if it does not already exist. Litmus then passes back a unique object descriptor that the thread can use in subsequent litmus_lock()/litmus_unlock() calls.
>
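> For reference, the raw liblitmus calls look roughly like this (a sketch only; the file path and lock ID are arbitrary, and every thread must repeat the open_fmlp_sem() call to obtain its own object descriptor):
>
> #include <fcntl.h>
> #include <sys/stat.h>
> #include <litmus.h>
>
> void raw_api_sketch(void)
> {
>     /* shared file that provides the namespace for lock IDs */
>     int fd = open("/tmp/locks", O_RDONLY | O_CREAT, S_IRUSR | S_IWUSR);
>
>     /* create or attach to FMLP semaphore #0; returns an object descriptor */
>     int od = open_fmlp_sem(fd, 0);
>
>     litmus_lock(od);
>     /* ... critical section ... */
>     litmus_unlock(od);
>
>     od_close(od);
> }
>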
> Currently, object descriptors CANNOT be shared between threads of the same process. Each thread must obtain its own object descriptor. I think an advantage of this approach is that it gives the locking protocol implementation an opportunity to dynamically allocate per-thread data structures or update the lock’s state. However, I’m wondering if this is more trouble than it’s worth. Has anyone implemented a locking protocol that required dynamically allocated per-thread data? Would this be used in PCP/SRP to dynamically maintain the priority ceiling of a resource?
>
> Here is a concrete example of the trouble non-sharable object descriptors can cause. Suppose you want to create a C++11-style mutex wrapper class with an FMLP underpinning. Following the C++11 style, all threads in a process that use a single instance of the mutex class should map to the same FMLP instance in Litmus. Unfortunately, the per-thread object descriptors force the following cumbersome implementation:
>
> #include <unordered_map>
> #include <cstdint>
> #include <cstdio>
> #include <cstdlib>
>
> #include <litmus.h>
>
> // Branch-prediction hints; define them here if liblitmus does not already.
> #ifndef likely
> #define likely(x)   __builtin_expect(!!(x), 1)
> #define unlikely(x) __builtin_expect(!!(x), 0)
> #endif
>
>
> // A basic mutex interface
> // (The real C++11 interface includes a try_lock(). UGH.
> // I guess any L.P. that doesn’t support try_lock() could
> // always fail.)
> class mutex
> {
> private:
>     mutex(const mutex&) = delete;
>     mutex& operator=(const mutex&) = delete;
>
> public:
>     mutex(void) = default;
>     virtual ~mutex(void) = default;
>
>     virtual void lock() = 0;
>     virtual void unlock() = 0;
> };
>
>
> // FMLP with mutex interface
> class fmlp_mutex: public mutex
> {
> public:
>     // takes an open file descriptor and a name (ID) for the FMLP instance
>     fmlp_mutex(int _fd, int _name):
>         mutex(),
>         fd(_fd), name(_name)
>     {
>     }
>
>     virtual void lock(void)
>     {
>         litmus_lock(this->descriptor());
>     }
>
>     virtual void unlock(void)
>     {
>         litmus_unlock(this->descriptor());
>     }
>
>     // creates (or reuses) this thread's object descriptor
>     int open(void)
>     {
>         int desc = -1;
>         uint64_t key = fd;
>         key = (key << 32) | name;
>
>         if(unlikely(nullptr == descriptorMap))
>         {
>             descriptorMap = new std::unordered_map<uint64_t, int>();
>         }
>
>         auto search = descriptorMap->find(key);
>         if(search != descriptorMap->end())
>         {
>             desc = search->second;
>         }
>         else
>         {
>             int new_desc = open_type(fd, name);
>             if(new_desc < 0)
>             {
>                 fprintf(stderr, "Failed to create FMLP mutex with ID %u\n",
>                         name);
>                 exit(-1);
>             }
>             else
>             {
>                 descriptorMap->insert(std::make_pair(key, new_desc));
>                 desc = new_desc;
>             }
>         }
>         return desc;
>     }
>
>     // returns the per-thread FMLP instance object descriptor
>     int descriptor(void) const
>     {
>         if(unlikely(nullptr == descriptorMap))
>         {
>             return -1;
>         }
>         uint64_t key = fd;
>         key = (key << 32) | name;
>         auto search = descriptorMap->find(key);
>         if(likely(search != descriptorMap->end()))
>         {
>             return search->second;
>         }
>         return -1;
>     }
>
> public:
>     const uint32_t fd;
>     const uint32_t name;
>
> protected:
>     // protocol-specific open call; override this to back the same wrapper
>     // with a different locking protocol
>     virtual int open_type(int fd, int name)
>     {
>         return open_fmlp_sem(fd, name);
>     }
>
> protected:
>     static __thread std::unordered_map<uint64_t, int>* descriptorMap;
> };
>
> // vvv Put the below in some .cpp file somewhere vvv
> __thread std::unordered_map<uint64_t, int>* fmlp_mutex::descriptorMap = nullptr;
>
>
> What is cumbersome about the above implementation? The use of thread-local storage (__thread) to maintain a mapping of (file name, object ID) tuples to FMLP object descriptors.
>
> Before a thread may use the fmlp_mutex instance, it must first call “fmlp_instance->open().” Then, every lock/unlock call requires a lookup of the object descriptor. It’s branchy, and it’s on the fast path. (Things get more complicated if we want to properly close descriptors, but I’ve left that code out.)
>
> However, all this goes away if we can share object descriptors among threads of the same process:
>
> // Implementation is GREATLY simplified if object descriptors can be shared
> // among threads of the same process.
> class fmlp_simple_mutex: public mutex
> {
> public:
>     // takes an open file descriptor and a name (ID) for the FMLP instance
>     fmlp_simple_mutex(int _fd, int _name):
>         mutex(),
>         fd(_fd), name(_name)
>     {
>         desc = open_fmlp_sem(fd, name);
>         if(desc < 0)
>         {
>             fprintf(stderr, "Failed to create FMLP mutex with ID %u\n",
>                     name);
>             exit(-1);
>         }
>     }
>
>     virtual void lock(void)
>     {
>         litmus_lock(desc);
>     }
>
>     virtual void unlock(void)
>     {
>         litmus_unlock(desc);
>     }
>
> public:
>     const uint32_t fd;
>     const uint32_t name;
>
> protected:
>     int32_t desc;
> };
>
> By allowing object descriptors to be shared, we lose the ability to notify the underlying locking protocol instance that there is a new user of the object. However, we could still support this with an additional system call (sketched below): a thread would pass an object descriptor to the system call, and Litmus would make the necessary updates. To me, this seems less bad than the thread-local storage hoops we have to jump through with Litmus’s current way of doing things. You may still need a complicated implementation like fmlp_mutex above for locking protocols that require the second system call, but at least such complications would be limited to those protocols.
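>
> A possible shape for that call (the function name and the announce_thread() wrapper below are hypothetical; nothing like this exists in liblitmus today):
>
> // Hypothetical addition to liblitmus: announce the calling thread as a
> // new user of an already-open object descriptor, giving the locking
> // protocol a chance to allocate per-thread state or update its bookkeeping.
> int litmus_lock_announce_user(int od);
>
> // A protocol that needs the extra call could wrap it, e.g.:
> class fmlp_announcing_mutex: public fmlp_simple_mutex
> {
> public:
>     fmlp_announcing_mutex(int _fd, int _name): fmlp_simple_mutex(_fd, _name) {}
>
>     // call once per thread before the first lock() on that thread
>     void announce_thread(void)
>     {
>         litmus_lock_announce_user(desc);   // hypothetical system call wrapper
>     }
> };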
>
> I don’t yet have the time to propose a patch to Litmus. Even if we decide to leave Litmus as-is, hopefully this email will spread awareness of how Litmus manages locks. At the very least, I would be interested in the thoughts of others on this.
Let me add a few comments just to explain the current implementation. We have never allowed forked children of real-time tasks to retain real-time status: new real-time tasks need to go through task admission to give plugins a chance to reject them and, potentially, to set up scheduling-related data structures. For similar reasons, we have not allowed references to locks to be inherited across forks. This choice also greatly simplifies determining valid priority ceilings.
We could of course call the plugin's admission callback from within fork(), and we could augment the lock descriptor API to have a new callback for fork(), but this requires additional engineering to make it all work and increases the burden on implementors of new locking protocols.
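For concreteness, an extension along those lines might look roughly like the sketch below. The existing entries only follow the general shape of the kernel's per-protocol lock-ops table; the fork hook, its signature, and the struct name are hypothetical.

/* Sketch only (kernel side): an illustrative lock-ops table with an added
 * fork callback. */
struct litmus_lock;     /* opaque here */
struct task_struct;     /* opaque here */

struct litmus_lock_ops_sketch {
    int  (*open)(struct litmus_lock*, void* arg);
    int  (*close)(struct litmus_lock*);
    int  (*lock)(struct litmus_lock*);
    int  (*unlock)(struct litmus_lock*);
    void (*deallocate)(struct litmus_lock*);

    /* hypothetical: invoked when a task that holds a reference to the lock
     * forks; lets the protocol reject inheritance for the child or set up
     * per-task state (e.g., recompute priority ceilings) */
    int  (*fork)(struct litmus_lock*, struct task_struct* parent,
                 struct task_struct* child);
};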
I would still consider merging such a patch if it is reasonably easy to maintain, but without a volunteer pushing the more complicated solution we'll stick with the existing simpler semantics. Apart from your C++ issue, I'm also not aware of any situation where the current API causes undue difficulty.
- Björn