let me introduce you to my friend, whose name is try_module_get.
478 static inline int try_module_get(struct module *module)
479 {
480     int ret = 1;
481
482     if (module) {
483         unsigned int cpu = get_cpu();
484         if (likely(module_is_live(module))) {
485             local_inc(__module_ref_addr(module, cpu));
486             trace_module_get(module, _THIS_IP_,
487                 local_read(__module_ref_addr(module, cpu)));
488         }
489         else
490             ret = 0;
491         put_cpu();
492     }
493     return ret;
494 }
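he lives in include/linux/module.h, and he will take a reference on the module for you, unless its state is MODULE_STATE_GOING, in which case he returns 0 and leaves the count alone. get_cpu and put_cpu are SMP macros: get_cpu disables preemption and hands you the current CPU number, and put_cpu enables preemption again, so the value of smp_processor_id stays valid for as long as you are using it.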
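if you'd like to poke at his behaviour outside the kernel, here is a minimal userspace sketch of the same decision logic. everything in it (model_module, model_state, model_try_get) is a made-up stand-in, not a kernel definition, and it keeps a single counter where the real thing keeps one per CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's module state and struct module. */
enum model_state { MODEL_LIVE, MODEL_COMING, MODEL_GOING };

struct model_module {
    enum model_state state;
    unsigned int refcount;  /* one total here; the kernel keeps a counter per CPU */
};

/* Same rule as module_is_live(): anything that is not GOING counts as live. */
int model_is_live(const struct model_module *m)
{
    assert(m != NULL);
    return m->state != MODEL_GOING;
}

/* Returns 1 on success (including the "no module at all" case), 0 on refusal. */
int model_try_get(struct model_module *m)
{
    if (m == NULL)
        return 1;           /* nothing to pin, so nothing can fail */
    if (!model_is_live(m))
        return 0;           /* module is on its way out: no new references */
    m->refcount++;
    return 1;
}
```

note that a NULL module counts as success, same as the real function: ret starts at 1 and is only cleared when a module exists and is going away.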
great! now let's take a look over at a potential competitor, a system call in kernel/module.c by the name of delete_module. part of his code looks like this:
853     /* Stop the machine so refcounts can't move and disable module. */
854     ret = try_stop_module(mod, flags, &forced);
855     if (ret != 0)
856         goto out;
857
858     /* Never wait if forced. */
859     if (!forced && module_refcount(mod) != 0)
860         wait_for_zero_refcount(mod);
he can assist you in two different styles. the most common one is the "remove module immediately", which is what happens with rmmod usually. in this case, the O_NONBLOCK flag is specified. try_stop_module wants to set MODULE_STATE_GOING, and will behave differently depending on this flag.
if O_NONBLOCK is specified, try_stop_module will apply a very big hammer whose name is stop_machine. in this case, it will safely ensure that the reference count is zero (failing otherwise), and then set the MODULE_STATE_GOING flag. this is wonderful: because of the stop_machine hammer, there will be no problems racing with our first friend, try_module_get.
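as a rough sketch of what the hammer buys you, here is a userspace model of the O_NONBLOCK path. the names (model_module, model_stop_nonblock) are invented for illustration, and the assumption that nothing else runs during the function body is exactly the part that stop_machine provides in the kernel and that this toy simply takes for granted. forced removal is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's module state and struct module. */
enum model_state { MODEL_LIVE, MODEL_COMING, MODEL_GOING };

struct model_module {
    enum model_state state;
    unsigned int refcount;
};

/*
 * Models the O_NONBLOCK path. ASSUMPTION: the whole body runs with every
 * other CPU parked, so the check and the state change cannot be split
 * apart by a concurrent reference grab.
 */
int model_stop_nonblock(struct model_module *m)
{
    assert(m != NULL);
    if (m->refcount != 0)
        return -EWOULDBLOCK;    /* still in use: refuse, leave the state alone */
    m->state = MODEL_GOING;     /* unused: no new references from here on */
    return 0;
}
```

a module that is still in use comes back with -EWOULDBLOCK and stays live; an unused one is marked as going in the same uninterruptible step that checked its count.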
there is another way to invoke rmmod, which is with the --wait flag. if this is specified, try_stop_module will set MODULE_STATE_GOING without worrying about the refcount, and then delete_module will wait for the reference count to drop to zero. the keen-eyed systems hacker will at this point worry, what if we get the following scheduling pattern?
0) module state = MODULE_STATE_LIVE; refcount = 0
1) try_module_get checks that the module is live, and succeeds (line 484)
2) delete_module sets the MODULE_STATE_GOING flag (line 854)
3) delete_module finds the refcount is already zero, has nothing to wait for, and finishes (lines 859-860)
4) try_module_get increments the refcount (line 485).
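to make the worry concrete, here is a single-threaded sketch that replays those five steps by hand. it splits the getter into its check and its increment so the deleter's steps can be slotted in between. all names are hypothetical, and nothing here models preemption or real concurrency; it only shows what the end state would be if that exact order were allowed to happen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's module state and struct module. */
enum model_state { MODEL_LIVE, MODEL_COMING, MODEL_GOING };

struct model_module {
    enum model_state state;
    unsigned int refcount;
};

/* Step 1: the getter's liveness check. */
int getter_check(const struct model_module *m)
{
    assert(m != NULL);
    return m->state != MODEL_GOING;
}

/* Step 4: the getter's increment, done later on the strength of step 1. */
void getter_increment(struct model_module *m)
{
    m->refcount++;
}

/* Step 2: the deleter marks the module as going. */
void deleter_mark_going(struct model_module *m)
{
    m->state = MODEL_GOING;
}

/* Step 3: the deleter's wait is over as soon as the count reads zero. */
int deleter_wait_done(const struct model_module *m)
{
    return m->refcount == 0;
}

/* Returns 1 if a reference ends up held on a module whose removal finished. */
int replay_feared_schedule(void)
{
    struct model_module m = { MODEL_LIVE, 0 };   /* step 0 */
    int removed = 0;

    int saw_live = getter_check(&m);             /* step 1 */
    deleter_mark_going(&m);                      /* step 2 */
    if (deleter_wait_done(&m))                   /* step 3 */
        removed = 1;
    if (saw_live)
        getter_increment(&m);                    /* step 4 */

    return removed && m.refcount != 0;
}
```

in this toy the replay reports the bad outcome: the caller walks away believing it holds a reference to a module that is already gone.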
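not to worry, keen-eyed systems hacker! you will note that our clever friend try_module_get disables preemption on its CPU as it runs. this guarantees that he will not be descheduled between checking the state (line 484) and incrementing the refcount (line 485). on its own that only covers his own CPU, so the other half of the trick lives on the delete_module side: in kernels of this vintage, the --wait path of try_stop_module calls synchronize_sched() right after setting MODULE_STATE_GOING, and that call returns only once every CPU has left whatever preemption-disabled region it was in. so by the time delete_module reads the refcount, any try_module_get that saw the module live has already finished its increment, and therefore, through the wonderful phenomenon of "very small critical section", the problematic execution order won't happen.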