Ceph storage pool resize image root

Hi,

I have a cluster of 5 LXD hosts running LXD 3.3 from the latest snap.
All LXD hosts use a Ceph storage cluster as their default storage.

config:
  ceph.cluster_name: ceph
  ceph.osd.pg_num: "1024"
  ceph.osd.pool_name: lxd
  volatile.pool.pristine: "true"
  volume.size: 80GB
description: ""
name: lxd-ceph
driver: ceph

Initially volume.size was 40GB, but I have now changed it to 80GB.
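
The change was made with something along the lines of (pool name from the config above):

  lxc storage set lxd-ceph volume.size 80GB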

When the size was 40GB, I created containers using lxd-p2v to convert my Ubuntu 16.04 VMs to containers.
One of these containers was published as an image to use as a template.
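
The publish step was roughly the following (the source container name here is a placeholder), which produced the image listed below:

  lxc publish <converted-container> --alias android-runner-ubuntu-2018-06-01-old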

+-------------------------------------------+--------------+--------+---------------------------------------------+--------+-----------+------------------------------+
|                   ALIAS                   | FINGERPRINT  | PUBLIC |                 DESCRIPTION                 |  ARCH  |   SIZE    |         UPLOAD DATE          |
+-------------------------------------------+--------------+--------+---------------------------------------------+--------+-----------+------------------------------+
| android-runner-ubuntu-2018-06-01-old      | 90026c8ee3b9 | no     |                                             | x86_64 | 4446.05MB | Jul 25, 2018 at 4:07pm (UTC) |
+-------------------------------------------+--------------+--------+---------------------------------------------+--------+-----------+------------------------------+

When I launched containers from this template, they had the correct 40GB root disk size.

Creating test-runner
Starting test-runner
root@ubuntu:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd2        40G   12G   27G  30% /

Now I need to increase the default root disk size of containers launched from this image, so I updated the storage pool's volume.size to 80GB and created a new container, but its root disk is still 40GB.

I checked the RBD volume for the container and saw that it has the image as its parent, and that the image is still 40GB, so I increased the size of the image RBD to 80G.
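
The check was essentially rbd info on the container volume (the parent: field points at the image), e.g.:

  rbd info lxd/container_test-runner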

rbd -p lxd resize --size 81920 image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b

rbd image 'image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b':
	size 81920 MB in 20480 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.1137238e1f29
	format: 2
	features: layering
	flags:
	create_timestamp: Wed Jul 25 16:09:59 2018

I then removed the container and launched it again, but it still has 40GB.
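
The remove/relaunch was along these lines (alias from the image list above):

  lxc delete --force test-runner
  lxc launch android-runner-ubuntu-2018-06-01-old test-runner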

rbd image 'container_test-runner':
	size 40960 MB in 10240 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.15e72eb141f2
	format: 2
	features: layering
	flags:
	create_timestamp: Tue Aug  7 10:23:48 2018
	parent: lxd/image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b@readonly
	overlap: 40960 MB

What is the correct way to launch a container with a bigger root disk than the image, or to update the default size of the image?

Thanks,

What happens if you do:

  • lxc config device override test-runner root size=80GB
  • lxc restart test-runner

Error: Cannot resize RBD storage volume for container "test-runner" when it is running
root@gen8-1:~# lxc stop test-runner
root@gen8-1:~# lxc config device override test-runner root size=80GB
Device root overridden for test-runner
root@gen8-1:~# lxc start test-runner
root@gen8-1:~# lxc exec test-runner bash
root@ubuntu:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd2        40G   12G   27G  30% /

What does lxc storage volume show lxd-ceph container/test-runner show you at this point?
And lxc config show --expanded test-runner?
Knowing the current size of the container’s rbd volume would be useful too.

@stgraber

root@gen8-1:~# lxc storage volume show lxd-ceph container/test-runner
config:
  block.filesystem: ext4
  block.mount_options: discard
  size: 80GB
description: ""
name: test-runner
type: container
used_by:
- /1.0/containers/test-runner
location: gen8-1
root@gen8-1:~# lxc config show --expanded test-runner
architecture: x86_64
config:
  volatile.base_image: 90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b
  volatile.eth0.hwaddr: 00:16:3e:a6:ba:d5
  volatile.eth0.name: eth0
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: RUNNING
devices:
  eth0:
    nictype: bridged
    parent: br1
    type: nic
  kvm:
    path: /dev/kvm
    type: unix-char
  root:
    path: /
    pool: lxd-ceph
    size: 80GB
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""
root@gen8-1:~# rbd -p lxd info lxd/container_test-runner
rbd image 'container_test-runner':
	size 40960 MB in 10240 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.15e72eb141f2
	format: 2
	features: layering
	flags:
	create_timestamp: Tue Aug  7 10:23:48 2018
	parent: lxd/image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b@readonly
	overlap: 40960 MB

Thanks, so it looks like there's a bug in the resize logic for Ceph then. I'll try to reproduce this here today and figure out what's going on.

In the meantime, you should be able to directly grow that RBD volume to 80GB in Ceph and then run resize2fs on /dev/rbd2 on your host to grow the filesystem for that container.
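
Something like this should do it (the rbd device number may differ on your host):

  rbd -p lxd resize --size 81920 container_test-runner
  # then on the host where the volume is mapped (remap it if the device
  # does not pick up the new size automatically):
  resize2fs /dev/rbd2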

@stgraber. Thanks.

What I really need is to resize the image, since this image is what I use to create these containers every now and then. Can I do that somehow?

I’m not sure what’s happening there, I’d have expected that resizing the image to 80GB would have done the trick… Can you try setting the volume size in your default profile and see if that helps?

lxc profile device set default root size 80GB

And then try to create a new container from that image again.

No luck.

root@gen8-1:~# lxc profile show default
config: {}
description: Default LXD profile
devices:
  eth0:
    nictype: bridged
    parent: br1
    type: nic
  kvm:
    path: /dev/kvm
    type: unix-char
  root:
    path: /
    pool: lxd-ceph
    size: 80GB
    type: disk
name: default
root@gen8-1:~# lxc launch android-runner-ubuntu-2018-06-01-old test-runner
Creating test-runner
Starting test-runner
root@gen8-1:~# lxc exec test-runner bash
root@ubuntu:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd3        40G   12G   27G  30% /

Can you check lxc storage volume show and the RBD size for that one too?

root@gen8-1:~# lxc storage volume show lxd-ceph container/test-runner
config:
  block.filesystem: ext4
  block.mount_options: discard
  size: 80GB
description: ""
name: test-runner
type: container
used_by:
- /1.0/containers/test-runner
location: gen8-1


root@gen8-1:~# rbd info lxd/container_test-runner
rbd image 'container_test-runner':
	size 40960 MB in 10240 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.17da2eb141f2
	format: 2
	features: layering
	flags:
	create_timestamp: Tue Aug  7 15:06:17 2018
	parent: lxd/image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b@readonly
	overlap: 40960 MB

@stgraber I think I found the issue.
The container is created by cloning the image's RBD snapshot, and that snapshot contains a 40GB ext4 filesystem. Even if I increase the size of the RBD image and its readonly snapshot (which LXD clones into the container) to 80GB, the filesystem is still mounted as 40GB.
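
You can see that clone relationship from the Ceph side with something like:

  rbd children lxd/image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b@readonly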

The only way I found to make it work is (rough command sketch after the list):

  1. increase the size of the image RBD to 80G,
  2. map and mount it on a host,
  3. manually expand the ext4 filesystem using resize2fs,
  4. unmount it from the host,
  5. recreate the snapshot named readonly.
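
Something like this, as an untested sketch (it assumes the existing readonly snapshot can be removed and recreated, which Ceph only allows when no clones depend on it, or after flattening them):

  IMG=image_90026c8ee3b9c9d58a444089466d733220f18dae72b0c050a4bda474d86f829b
  rbd -p lxd resize --size 81920 $IMG
  DEV=$(rbd -p lxd map $IMG)      # map the image on a host
  e2fsck -f "$DEV"                # check before growing
  resize2fs "$DEV"                # grow the ext4 filesystem to the new size
  rbd -p lxd unmap "$DEV"
  rbd -p lxd snap create $IMG@readonly     # recreate the snapshot LXD clones from
  rbd -p lxd snap protect $IMG@readonly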

It would be awesome if LXD used the container root device size when creating the RBD volume and resized the filesystem before mounting.

So I thought that was supposed to happen, but I guess we're missing some code somewhere.

The expected behavior is that your image should have been 40G both in Ceph and in LXD's database, your root device size should then have been set to 80G, and on startup the creation should have noticed the size difference and run a resize operation.

Yes, I think it's definitely broken somewhere.
I resized my image to 100G, and even though the size in all the profiles is set to 80G, I got a container with 100G.
Do you want me to raise an issue on GitHub for this?

Yeah, that'd be great; it will be easier to remember to look at than keeping this forum post open (which is what I've been doing so far).

I am encountering the same problem and am wondering whether you created the GitHub issue, as I cannot find it yet?

He did, and it's been fixed and closed; the fix is in LXD 3.4 and will be in 3.0.2 as well.
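
Since the hosts are on the snap, something like this should pick it up once 3.4 reaches the channel being tracked:

  snap refresh lxd
  lxc --version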

That’s even better news than I was hoping for! :grinning:
Thank you for fixing it so swiftly!