The best solution to the previously mentioned deadlock problem was determined to be having parse_cgroupfs_options take module reference counts (before sget takes the lock), and have rebind_subsystems drop them later. This division of work makes sure that the subsystems will stick around during sget so we can safely drop our own lock in the meantime.
So I was polishing up the deadlock-free version of the patch series today, and found a good bug. After a few routine changes and polishings, I decided to go through a bit more thorough testing than I'd done since implementing this change:
livecd dev # mount -t cgroup none -o test2,net_cls,devices,cpuacct cgroup/
livecd dev # ls cgroup/
cgroup.procs cpuacct.usage_percpu devices.list release_agent
cpuacct.stat devices.allow net_cls.classid tasks
cpuacct.usage devices.deny notify_on_release test2.hax
livecd dev # lsmod
Module Size Used by
cgroup_test2 2880 1
cls_cgroup 5080 1
livecd dev #
Simple enough. Now, cgroups has this functionality where you can't destroy a hierarchy that has children cgroups. (Children cgroups are represented as subdirectories in the filesystem tree, and are made just as you would expect.) It does, however, let you unmount it:
livecd dev # mkdir cgroup/foo
livecd dev # umount cgroup/
livecd dev #
In which case the hierarchy sticks around, invisible. You can make it reappear by mounting with the same list of subsystems; any attempt to mount a subsystem on that hierarchy with a mismatched set of subsystems will fail.
livecd dev # lsmod
Module Size Used by
cgroup_test2 2880 1
cls_cgroup 5080 1
livecd dev #
Great - the modules' reference counts indicate that the hierarchy is still there. Let's bring it back again:
livecd dev # mount -t cgroup none -o cpuacct,test2,net_cls,devices cgroup/
livecd dev # lsmod
Module Size Used by
cgroup_test2 2880 2
cls_cgroup 5080 2
livecd dev #
Oops! The mounting code didn't realize that we weren't changing anything with respect to the subsystems, and we leaked a reference! Fortunately, with the new changes that I made, there was a simple fix - in cgroup_get_sb, added lines 1518-1519:
1426 if (root == opts.new_root) {
1427 +--- 85 lines: We used the new root structure, so this is a new hierarchy ---
1512 } else {
1513 /*
1514 * We re-used an existing hierarchy - the new root (if
1515 * any) is not needed
1516 */
1517 cgroup_drop_root(opts.new_root);
1518 /* no subsys rebinding, so refcounts don't change */
1519 drop_parsed_module_refcounts(opts.subsys_bits);
1520 }
For extra credit, tell how this bug is also a security vulnerability!
No comments:
Post a Comment