Linux CGroups: Subsystems as Modules: try_module_get vs delete_module

let me introduce you to my friend, whose name is try_module_get.

478 static inline int try_module_get(struct module *module)
479 {
480         int ret = 1;
481
482         if (module) {
483                 unsigned int cpu = get_cpu();
484                 if (likely(module_is_live(module))) {
485                         local_inc(__module_ref_addr(module, cpu));
486                         trace_module_get(module, _THIS_IP_,
487                                 local_read(__module_ref_addr(module, cpu)));
488                 }
489                 else
490                         ret = 0;
491                 put_cpu();
492         }
493         return ret;
494 }

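he lives in include/linux/module.h, and he will get a reference count on the module for you, unless its state flag is set to MODULE_STATE_GOING, in which case he returns 0 and leaves the count alone. the reference count is kept per cpu, which is why he needs get_cpu and put_cpu: they are macros that disable and re-enable preemption, so the value of smp_processor_id stays valid while he uses it.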
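
to make that behaviour concrete, here's a toy user-space sketch of the same logic. everything prefixed toy_ is my own stand-in rather than kernel API, and it uses one plain counter where the kernel keeps one per cpu, with no preemption or concurrency modelled at all.

```c
#include <assert.h>

/* toy stand-ins for the kernel's module states (names are made up here) */
enum toy_state { TOY_LIVE, TOY_COMING, TOY_GOING };

struct toy_module {
    enum toy_state state;
    int refcount;   /* a single counter; the kernel version is per-cpu */
};

/* same test as module_is_live: anything except GOING counts as live */
int toy_is_live(const struct toy_module *m)
{
    return m->state != TOY_GOING;
}

/* returns 1 on success (including the NULL case), 0 if the module is going */
int toy_try_get(struct toy_module *m)
{
    int ret = 1;

    if (m) {
        if (toy_is_live(m))
            m->refcount++;
        else
            ret = 0;
    }
    return ret;
}

/* small self-check: gets succeed until the state flips to GOING */
void toy_get_demo(void)
{
    struct toy_module m = { TOY_LIVE, 0 };

    assert(toy_try_get(&m) == 1 && m.refcount == 1);
    m.state = TOY_GOING;
    assert(toy_try_get(&m) == 0 && m.refcount == 1);
}
```

calling toy_get_demo() from a main of your own runs both cases; the second get fails and the count stays at 1.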

great! now let's take a look over at a potential competitor, a system call in kernel/module.c by the name of delete_module. part of his code looks like this:

853         /* Stop the machine so refcounts can't move and disable module. */
854         ret = try_stop_module(mod, flags, &forced);
855         if (ret != 0)
856                 goto out;
857
858         /* Never wait if forced. */
859         if (!forced && module_refcount(mod) != 0)
860                 wait_for_zero_refcount(mod);

he can assist you in two different styles. the most common one is the "remove module immediately", which is what happens with rmmod usually. in this case, the O_NONBLOCK flag is specified. try_stop_module wants to set MODULE_STATE_GOING, and will behave differently depending on this flag.

if O_NONBLOCK is specified, try_stop_module will apply a very big hammer whose name is stop_machine. in this case, it will safely ensure that the reference count is zero (failing otherwise), and then set the MODULE_STATE_GOING flag. this is wonderful: because of the stop_machine hammer, there will be no problems racing with our first friend, try_module_get.
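
as a rough picture of what runs under the hammer, here is a toy user-space version of that check-then-set step. the names are mine, not the kernel's, and the -EWOULDBLOCK return is my recollection of what the real code hands back when the module is still in use, so treat it as an assumption. forced unloading is left out.

```c
#include <assert.h>
#include <errno.h>

struct stop_model {
    int going;      /* 1 once MODULE_STATE_GOING would be set */
    int refcount;
};

/*
 * stands in for the callback run via stop_machine. nothing else runs while
 * it executes, so the check and the store cannot be separated by a
 * concurrent try_module_get.
 */
int stop_model_nonblocking(struct stop_model *m)
{
    if (m->refcount != 0)
        return -EWOULDBLOCK;    /* still in use: refuse, leave state alone */
    m->going = 1;
    return 0;
}

/* small self-check covering the busy and the idle case */
void stop_model_demo(void)
{
    struct stop_model busy = { 0, 2 };
    struct stop_model idle = { 0, 0 };

    assert(stop_model_nonblocking(&busy) == -EWOULDBLOCK && busy.going == 0);
    assert(stop_model_nonblocking(&idle) == 0 && idle.going == 1);
}
```

the point of the sketch is only that check and set happen as one step; in the toy that is true because nothing else is running, and in the kernel it is true because stop_machine has frozen every other cpu.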

there is another way to invoke rmmod, which is with the --wait flag. in this case O_NONBLOCK is not passed, so try_stop_module will set MODULE_STATE_GOING without worrying about the refcount, and then delete_module will wait for the reference count to drop to zero (line 860). the keen-eyed systems hacker will at this point worry: what if we get the following scheduling pattern?

0) module state = MODULE_STATE_LIVE; refcount = 0
1) try_module_get checks module is alive, and succeeds (line 484)
2) delete_module sets MODULE_STATE_GOING flag (line 854)
3) delete_module waits until the refcount is zero, and finishes (line 860)
4) try_module_get increments the refcount (line 485).

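not to worry, keen-eyed systems hacker! you will note that our clever friend try_module_get disables preemption on its CPU as it runs. this guarantees that he will not be descheduled between the liveness check and the increment. on its own that would not be enough, since delete_module could still be running on another CPU, but the blocking branch of try_stop_module calls synchronize_sched right after setting MODULE_STATE_GOING, and that call does not return until every CPU has left whatever preemption-disabled section it was in. so any try_module_get that passed the check before the flag flipped has finished its increment by the time delete_module looks at the refcount, and through the wonderful phenomenon of "very small critical section", the problematic execution order won't happen.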
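
if you'd like to poke at that ordering yourself, here's a toy single-threaded replay of the schedule above. it's a sketch under one big assumption: that after setting the flag, the blocking path waits out every in-flight preemption-disabled section (synchronize_sched style) before it looks at the refcount. all names are made up, and the "wait" is faked by simply running the pending getter to completion.

```c
#include <assert.h>

struct race_model {
    int going;      /* 1 once MODULE_STATE_GOING would be set */
    int refcount;
};

/* a try_module_get caught between its liveness check and its increment */
struct race_getter {
    int in_section;   /* 1 between get_cpu and put_cpu */
    int saw_live;     /* result of the check at line 484 */
};

/* step 1: get_cpu plus the liveness check */
void getter_check(struct race_getter *g, const struct race_model *m)
{
    g->in_section = 1;
    g->saw_live = !m->going;
}

/* step 4: the increment at line 485 (if the check passed), then put_cpu */
void getter_finish(struct race_getter *g, struct race_model *m)
{
    if (g->saw_live)
        m->refcount++;
    g->in_section = 0;
}

/* fake grace period: it can't end while a getter is inside its section */
void model_grace_period(struct race_getter *g, struct race_model *m)
{
    if (g->in_section)
        getter_finish(g, m);
}

/* steps 2 and 3: returns the refcount the waiter would observe */
int deleter_blocking(struct race_model *m, struct race_getter *g)
{
    m->going = 1;
    model_grace_period(g, m);
    return m->refcount;
}

/* replay: an early getter wins its reference, a late one is refused */
void race_demo(void)
{
    struct race_model m = { 0, 0 };
    struct race_getter early = { 0, 0 };
    struct race_getter late = { 0, 0 };

    getter_check(&early, &m);                    /* step 1 */
    assert(deleter_blocking(&m, &early) == 1);   /* increment already landed */

    getter_check(&late, &m);                     /* arrives after GOING */
    getter_finish(&late, &m);
    assert(late.saw_live == 0 && m.refcount == 1);
}
```

in the replay the deleter observes a refcount of 1 and so would keep waiting instead of finishing, which is exactly the outcome the bad schedule would have skipped.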

2009-11-03
