Note: This is my first post/topic in Linux Containers and at the end of the Description I have a question on what if anything else needs to be done on my part.
Description:
- Two off by one errors, one in lxc/cgroups/cgfsng.c:lxc_cpumask_to_cpulist() and one in lxc/cgroups/cgfsng.c:cpuset1_cpus_initialize() are exposed on systems with the total number of cpus divisible by 32 and have offline cpus defined in /sys/devices/system/cpu/offline. They respectively cause and incorrect cpulist to be generated (resulting later on in a range error) and memory corruption which results in a subsequent abort during memory alloc frees.
To aide in the explanation of the issue that follows are 2 code snippets, one for lxc_cpumask_to_cpulist(), one for cpuset1_cpus_initialize(), and one Log Trace snippet.
Note additional Tracing was added to the code and is reflected Log Trace hence expected line numbers for the 4.0.9 release a not the same. (Some of the Additional Trace Messages as shown in the code snippets).
The 2 errors are annotated in the code along with what the correct code should be.
246 /* Turn cpumask into simple, comma-separated cpulist. */
247 static char *lxc_cpumask_to_cpulist(uint32_t *bitarr, size_t nbits)
248 {
249 __do_free_string_list char **cpulist = NULL;
250 char numstr[INTTYPE_TO_STRLEN(size_t)] = {0};
251 int ret;
252
253 for (size_t i = 0; i <= nbits; i++) { **/*** Error: <= Fix: < ***/**
254 if (!is_set(i, bitarr))
255 continue;
256
257 ret = strnprintf(numstr, sizeof(numstr), "%zu", i);
258 if (ret < 0)
259 return NULL;
260
261 ret = lxc_append_string(&cpulist, numstr);
262 if (ret < 0)
263 return ret_set_errno(NULL, ENOMEM);
264 }
265
266 if (!cpulist)
267 return ret_set_errno(NULL, ENOMEM);
268
269 return lxc_string_join(",", (const char **)cpulist, false);
270 }
562 #define __ISOL_CPUS "/sys/devices/system/cpu/isolated"
563 #define __OFFLINE_CPUS "/sys/devices/system/cpu/offline"
564 static bool cpuset1_cpus_initialize(int dfd_parent, int dfd_child,
565 bool am_initialized)
566 {
...
643 if (maxoffline > 0) {
644 offlinemask = lxc_cpumask(offlinecpus, maxposs);
645 if (!offlinemask)
646 return log_error_errno(false, errno, "Failed to create cpumask for offline cpus");
647 }
648 mysize=BITS_TO_LONGS(maxposs);
649 TRACE("offlinemask BITS_TO_LONGS(maxposs=%zd) = %zd", maxposs, mysize);
650
651 for (i = 0; i <= maxposs; i++) { **/*** ERROR: <= FIX: < ***/**
652 if ((isolmask && !is_set(i, isolmask)) ||
653 (offlinemask && !is_set(i, offlinemask)) ||
654 !is_set(i, possmask))
655 continue;
656
657 flipped_bit = true;
658 clear_bit(i, possmask);
659 TRACE("flipped_bit index = %zd array index (%zd/32) = %zd", i, i, (i/NBITS));
660 }
661
662 if (!flipped_bit) {
663 cpulist = lxc_cpumask_to_cpulist(possmask, maxposs);
664 TRACE("No isolated or offline cpus present in cpuset");
665 } else {
666 cpulist = move_ptr(posscpus);
667 TRACE("Removed isolated or offline cpus from cpuset");
668 }
669 if (!cpulist)
670 return log_error_errno(false, errno, "Failed to create cpu list");
671 TRACE("strlen(cpulist) = %zd cpulist = %s", strlen(cpulist), cpulist);
672
673 copy_parent:
674 if (!am_initialized) {
675 ret = lxc_writeat(dfd_child, "cpuset.cpus", cpulist, strlen(cpulist));
676 if (ret < 0)
677 return log_error_errno(false, errno, "Failed to write cpu list to \"%d/cpuset.cpus\"", dfd_child);
678
679 TRACE("Copied cpu settings of parent cgroup");
680 }
681
682 return true;
683 }
Log Trace
1 lxc-start c_100 20210726215010.363 INFO start - start.c:lxc_init:855 - Container "c_100" is initialized
2 lxc-start c_100 20210726215010.363 TRACE cgfsng - cgroups/cgfsng.c:cgfsng_monitor_create:1050 - monitor_cgroup = lxc.monitor.c_100-NNNN
3 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:__cgroup_tree_create:760 - Created 12(lxc.monitor.c_100) cgroup
4 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:__cgroup_tree_create:760 - Reusing 13(lxc.monitor.c_100) cgroup
5 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:586 - posscpus= 0-103
6 strlen(posscpus) = 6 maxposs = 103
7 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:603 - The path "/sys/devices/system/cpu/isolated" to read isolated cpus from does not exist
8 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:621 - offlinecpus = 104-447
9 strlen(offlinecpus) = 8 maxposs = 448
10 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:635 - possmask BITS_TO_LONGS(maxposs=448) = 14
11 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:649 - offlinemask BITS_TO_LONGS(maxposs=448) = 14
12 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:664 - No isolated or offline cpus present in cpuset
13 lxc-start c_100 20210726215010.364 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:671 - strlen(cpulist) = 309 cpulist = 0,1,2,3,4,5,6 ,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,5 6,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103 ,448
14 lxc-start c_100 20210726215010.364 ERROR cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:677 - Numerical result out of range - Failed to writ e cpu list to "23/cpuset.cpus"
15 lxc-start c_100 20210726215010.364 ERROR cgfsng - cgroups/cgfsng.c:cpuset1_initialize:704 - Numerical result out of range - Failed to initializ e cpuset.cpus
16 lxc-start c_100 20210726215010.364 ERROR cgfsng - cgroups/cgfsng.c:__cgroup_tree_create:769 - Invalid argument - Failed to initialize cpuset co ntroller in the legacy hierarchy
17 lxc-start c_100 20210726215010.364 ERROR cgfsng - cgroups/cgfsng.c:cgroup_tree_create:840 - Invalid argument - Failed to create monitor cgroup 13(lxc.monitor.c_100)
18 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:586 - posscpus= 0-103
19 strlen(posscpus) = 6 maxposs = 103
20 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:603 - The path "/sys/devices/system/cpu/isolated" to read isolated cpus from does not exist
21 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:621 - offlinecpus = 104-447
22 strlen(offlinecpus) = 8 maxposs = 448
23 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:635 - possmask BITS_TO_LONGS(maxposs=448) = 14
24 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:649 - offlinemask BITS_TO_LONGS(maxposs=448) = 14
25 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:659 - flipped_bit index = 448 array index (448/32) = 14
26 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:667 - Removed isolated or offline cpus from cpuset
27 lxc-start c_100 20210726215010.366 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:671 - strlen(cpulist) = 6 cpulist = 0-103
28
29 lxc-start c_100 20210726215010.369 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_cpus_initialize:679 - Copied cpu settings of parent cgroup
30 lxc-start c_100 20210726215010.369 TRACE cgfsng - cgroups/cgfsng.c:cpuset1_initialize:721 - Initialized cpuset in the legacy hierarchy
Following the Log Trace: First call of cpuset1_cpus_initialize()
- Line 6 and 8 the posscpus 0-103 and offlinecpus 104-447 for a total of 448 cpus which is evenly divisible by 32
- Line 10 & 11 shows the possmask and offllinemask size each consist of 14 x 32bit
- Line 12 message No isolated or offline cpus present in cpuset occurs as a result of the code loop from Line 651-660 in cpuset1_cpus_initialize() not detecting a flipped_bit condition in the loop… This is expected for the first 448 iterations as the either the offlinemask or possmask !is_set() will always be true since the range of on-line and off-line cpus do not overlap.
** One the 449th iteration which occurs due to the off by one error annotated on Line 651 the !is_set() check on Line 653 just happens to return true… Note: The check at that point is outside the allocated offlinemask chunk of memory. - Since flipped_bit the cpulist is generated from a call to lxc_cpumask_to_cpulist(possmask, maxposs) at line 663
** Side Note to the Code Maintainer: I would have thought in the if/else from Line 662-668 logic would have been opposite and that lxc_cpumask_to_cpulist() would be called when flipped_bit was true. - Line 13 in the Log Trace show the output from the call to lxc_cpumask_to_cpulist() . Notice in addition to the set of on-line hosts ranging from 0-103 there is a cpu 448 (i.e. the 449th cpu) which is outside the range of valid cpus.
** This occurs due to the off by one error in lxc_cpumask_to_cpulist() on Line 253. On the 449th iteration !is_set(i, bitattr) just happens to be false which results in an entry for 448 being created… Again note on the 449th iteration bitattr is accessed beyond its allocated boundaries.
Following the Log Trace: First call of cpuset1_cpus_initialize() Lines 18-30
- In this iteration flipped_bit get set as indicated on Line 25 for on loop index 448 (which is the 449th iteration). This also shows that the index where clear_bit(i, possmask) is called 14 which is the 15th array element and beyond possmask allocated space. This is where the memory corruption occurs.
Question:
- Is this description sufficient for the Code Maintainer to use and make a fix or is the protocol to provide a PR?