I’m trying to run an MPI application on multiple containers on multiple physical machines. I followed this guide to help set it up. The application and configuration files are on an NFS master server.
Initially, I tried mounting the NFS directory directly from inside the container, but I later found out that doesn’t work. Instead, I mounted the NFS directory on every host and added it to the containers as a disk device. This works, except that the directory appears as read-only inside the containers.
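For anyone trying to reproduce the read-only symptom, a small check like the one below can be run inside a container against the shared directory. It is only a sketch, not something from either guide: the helper name `describe_mount` and the default path `/mnt/nfs` are placeholders I made up for illustration. It reports who owns the directory as seen from inside the container, which UID/GID the process runs as, and whether a file can actually be created there.

```python
import os
import sys
import tempfile


def describe_mount(path):
    """Return ownership and writability facts for `path` as seen by this process.

    `describe_mount` is a hypothetical helper name, not part of LXD or MPICH.
    """
    st = os.stat(path)
    info = {
        "path": path,
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        "writable": False,
        "error": None,
    }
    try:
        # Create (and automatically delete) a real file: permission bits
        # alone can be misleading on an NFS-backed directory.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        info["writable"] = True
    except OSError as exc:
        info["error"] = str(exc)
    return info


if __name__ == "__main__":
    # "/mnt/nfs" is a placeholder; pass the real mount point as an argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "/mnt/nfs"
    for key, value in describe_mount(target).items():
        print(f"{key}: {value}")
```

If the owner is reported as 65534 (the usual overflow UID, normally shown as `nobody`), that would suggest the host UID owning the files is not mapped into the container, which would be consistent with the directory looking read-only.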
I found this guide on how to remap the UIDs, which allows read/write access. This approach works… to a point. If I add the idmap parameter to all the containers, the MPI application hangs, and it segfaults when I force it to close. If the parameter is set on only a subset of the containers, the application runs fine.
Any idea what might cause this? I’m using LXD 2.0.9 and MPICH on Ubuntu 16.04.