
LINUX – WRITING A SIMPLE CONTAINER APP WITH NAMESPACES

Posted: 2020-12-11 12:32:59

Tags: man, git, coding, containe, hot, ifreq, chroot, put, diff

Linux namespaces are one of the building blocks for implementing containers. Namespaces control what a process can see: process IDs, mount points, network adapters and more.

To use namespaces we call the clone(2) system call.

Creating a child process – fork vs clone

To create a new process in Linux, we can use the fork(2) or clone(2) system calls. fork(2) creates a child process with its own memory mapping (using copy-on-write), while clone(2) can create a child process that shares resources with its parent. One use of clone(2) is to implement multithreading; another is to create namespaces.
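The difference can be seen in a minimal sketch, which is not the article's original code. It assumes a Linux system with glibc; the helper names counter_after_fork and counter_after_clone_vm are hypothetical. A child increments a global counter: after fork(2) the parent's copy is unchanged, while with clone(2) and CLONE_VM the parent sees the change.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Counter that the child increments. */
static volatile int counter = 0;

/* Entry point for the clone(2) child. */
static int bump_counter(void *arg)
{
    (void)arg;
    counter++;
    return 0;
}

/* fork(2): the child works on its own copy-on-write copy of memory.
 * Returns the counter as seen by the parent afterwards, or -1 on error. */
int counter_after_fork(void)
{
    counter = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        counter++;          /* modifies the child's private copy only */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return counter;
}

/* clone(2) with CLONE_VM: parent and child share one address space.
 * Returns the counter as seen by the parent afterwards, or -1 on error. */
int counter_after_clone_vm(void)
{
    counter = 0;
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;
    /* The stack grows downwards on most architectures, so pass its top. */
    pid_t pid = clone(bump_counter, stack + STACK_SIZE,
                      CLONE_VM | SIGCHLD, NULL);
    if (pid < 0) {
        free(stack);
        return -1;
    }
    int result = waitpid(pid, NULL, 0) < 0 ? -1 : counter;
    free(stack);
    return result;
}
```

Calling counter_after_fork() should yield 0 and counter_after_clone_vm() should yield 1. Neither call needs root, because no namespace flag is involved.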

Namespaces with clone(2)

To create a child process with isolated resources in new namespaces, we pass one or more of the following flags to clone(2):

  • CLONE_NEWNET  – isolate network devices
  • CLONE_NEWUTS – host and domain names (UNIX Timesharing System)
  • CLONE_NEWIPC – IPC objects
  • CLONE_NEWPID – PIDs
  • CLONE_NEWNS – mount points (file systems)
  • CLONE_NEWUSER – users and groups

Simple Example – NEWPID

To create a child process with PID 1 (the root of a new process tree), call clone(2) with the CLONE_NEWPID flag.

In the child process, getpid() returns 1 and getppid() returns 0. If the child creates another child, that process gets its ID from the new tree.
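A minimal sketch of this call follows. It is an illustration rather than the article's original listing, and the names pid_ns_child and run_in_new_pid_ns are hypothetical. Creating a PID namespace normally requires root (CAP_SYS_ADMIN), so the helper returns -1 when clone(2) is refused.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Runs inside the new PID namespace. */
static int pid_ns_child(void *arg)
{
    (void)arg;
    printf("child: pid=%ld ppid=%ld\n", (long)getpid(), (long)getppid());
    fflush(stdout);
    /* Exit status 0 means the child saw itself as PID 1 with parent 0. */
    return (getpid() == 1 && getppid() == 0) ? 0 : 1;
}

/* Returns the child's exit status, or -1 if the child could not be
 * created (for example when not running as root). */
int run_in_new_pid_ns(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    fflush(stdout);   /* avoid duplicating buffered output in the child */
    pid_t pid = clone(pid_ns_child, stack + STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, NULL);
    if (pid < 0) {
        perror("clone");
        free(stack);
        return -1;
    }
    printf("parent: child has pid %ld in the parent's namespace\n",
           (long)pid);

    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    free(stack);
    return result;
}
```

Note that the parent still sees the child under an ordinary PID from its own namespace; only the child's view changes.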

Full example:

The main function creates a child process in a new PID namespace and passes its own PID to the child. The child then creates three children of its own.

If the child process tries to kill the parent (which is outside its namespace), nothing happens, but it can kill a process in its own namespace (in this case its first child).
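The behaviour described above can be sketched as follows. This is a simplified illustration, not the article's original example: it creates a single grandchild instead of three, and the names ns_init and run_kill_demo are hypothetical. It assumes root privileges; without them clone(2) fails and the helper returns -1.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Runs as PID 1 of the new namespace; arg points at the outer parent's PID. */
static int ns_init(void *arg)
{
    pid_t outer_parent = *(pid_t *)arg;

    pid_t victim = fork();          /* a process inside the new namespace */
    if (victim < 0)
        return 1;
    if (victim == 0) {
        pause();                    /* sleep until a signal arrives */
        _exit(0);
    }

    /* The parent's PID number belongs to another namespace, so this call
     * normally fails with ESRCH and the real parent is unaffected. */
    if (kill(outer_parent, SIGKILL) < 0)
        printf("kill(%ld): %s\n", (long)outer_parent, strerror(errno));

    /* A process in our own namespace can be signalled as usual. */
    kill(victim, SIGKILL);
    int status = 0;
    if (waitpid(victim, &status, 0) != victim)
        return 1;
    printf("pid %ld was killed inside the namespace\n", (long)victim);
    fflush(stdout);
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : 1;
}

/* Returns 0 if the in-namespace kill worked, -1 if clone(2) failed. */
int run_kill_demo(void)
{
    pid_t self = getpid();
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    fflush(stdout);
    pid_t pid = clone(ns_init, stack + STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, &self);
    int status = 0;
    int result = -1;
    if (pid < 0)
        perror("clone");
    else if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    free(stack);
    return result;
}
```

Because the child is created without CLONE_VM, it receives its own copy of the variable holding the parent's PID, so passing a pointer to a local variable is safe here.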

If you build this sample and run it with sudo, the output shows that the PIDs in the new namespace are 1 to 4 and that the first child never finishes, because it was killed with SIGKILL.

Isolating Network Interfaces

To create a child process with its own set of network interfaces, use the CLONE_NEWNET flag.
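As a rough sketch (not the article's original code), the child below lists the interfaces it can see using the POSIX if_nameindex() function. The names list_interfaces and run_in_new_net_ns are hypothetical. In a fresh network namespace the list typically contains only the loopback device, although some kernels add default tunnel devices. Root privileges are assumed; otherwise the helper returns -1.

```c
#define _GNU_SOURCE
#include <net/if.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)

/* Runs inside the new network namespace and prints every interface.
 * The number of interfaces found becomes the exit status. */
static int list_interfaces(void *arg)
{
    (void)arg;
    struct if_nameindex *ifs = if_nameindex();
    if (ifs == NULL) {
        perror("if_nameindex");
        return 0;
    }
    int count = 0;
    /* The array ends with an entry whose if_index is 0. */
    for (struct if_nameindex *i = ifs; i->if_index != 0; i++) {
        printf("child sees interface %u: %s\n", i->if_index, i->if_name);
        count++;
    }
    if_freenameindex(ifs);
    fflush(stdout);
    return count;
}

/* Returns how many interfaces the child saw, or -1 if clone(2) failed. */
int run_in_new_net_ns(void)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    fflush(stdout);
    pid_t pid = clone(list_interfaces, stack + STACK_SIZE,
                      CLONE_NEWNET | SIGCHLD, NULL);
    int status = 0;
    int result = -1;
    if (pid < 0)
        perror("clone");
    else if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    free(stack);
    return result;
}
```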

To create a virtual network adapter, the parent runs the ip command, and the child then runs ip as well to configure its end of the link.

We could code all of these operations directly, but for simplicity let's use the system(3) library function.
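One way to drive ip through system(3) is sketched below. The interface names (veth0, veth1), the addresses and all function names are assumptions chosen for illustration, not values taken from the article. The commands need root and the iproute2 package, and the child must not run its part until the parent has moved veth1 into the child's namespace, so some synchronization (for example a pipe) is assumed around these helpers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* All interface names and addresses below are hypothetical placeholders. */
#define HOST_IF    "veth0"
#define CHILD_IF   "veth1"
#define HOST_ADDR  "10.0.0.1/24"
#define CHILD_ADDR "10.0.0.2/24"

/* Formats the command that moves CHILD_IF into the child's namespace.
 * Returns the command length, or -1 if buf is too small. */
int format_move_cmd(char *buf, size_t len, pid_t child_pid)
{
    int n = snprintf(buf, len, "ip link set " CHILD_IF " netns %ld",
                     (long)child_pid);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Runs one shell command; returns 0 only if it exited successfully. */
static int run(const char *cmd)
{
    return system(cmd) == 0 ? 0 : -1;
}

/* Parent side: create the veth pair, hand one end to the child's
 * namespace (identified by the child's PID), configure the other end. */
int setup_parent_side(pid_t child_pid)
{
    char move_cmd[64];
    if (format_move_cmd(move_cmd, sizeof move_cmd, child_pid) < 0)
        return -1;
    if (run("ip link add " HOST_IF " type veth peer name " CHILD_IF) < 0 ||
        run(move_cmd) < 0 ||
        run("ip addr add " HOST_ADDR " dev " HOST_IF) < 0 ||
        run("ip link set " HOST_IF " up") < 0)
        return -1;
    return 0;
}

/* Child side: bring up loopback and the end of the pair it was given. */
int setup_child_side(void)
{
    if (run("ip link set lo up") < 0 ||
        run("ip addr add " CHILD_ADDR " dev " CHILD_IF) < 0 ||
        run("ip link set " CHILD_IF " up") < 0)
        return -1;
    return 0;
}
```

With this addressing, the child could then reach the parent with a command such as ping 10.0.0.1.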

Full Example:

When you run this test, the output shows that the child sees only the virtual adapter and can ping the parent through it.

Mount Points and file system

To implement a container we also need to isolate the file system. This can be done with CLONE_NEWNS. Before coding, let's build a simple file system using BusyBox or Buildroot.

The simplest way is to use Buildroot, which is based on BusyBox.

Download and extract the package, run make menuconfig to enter the configuration menu, exit and save the default selection, then run make.

The build takes a few minutes. When it finishes, you will find the file system in buildroot-2017.11.2/output/target.

Copy the content to another folder (fs in my example) and add some device files to its /dev directory using mknod commands (Buildroot can't do that itself because it doesn't run with sudo).

Full Example 

We create a child process in new namespaces (PID, network, mount, IPC and UTS). The parent configures the virtual adapters (using ip link) and sets its own IP address.

The child changes the hostname, changes the root directory to our Buildroot output, changes the current directory to '/', mounts proc so that ps and other tools work, and sets an IP address.
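The child's file system and hostname steps might look like the sketch below. It is an approximation under stated assumptions, not the article's original listing: the hostname "container" and the function names enter_rootfs and container_child are hypothetical, the network configuration is left out, and the exec of /bin/sh is included so the sequence is complete. It is meant to be passed to clone(2) together with the namespace flags, running as root.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <unistd.h>

/* Switches the root directory to rootfs and mounts proc inside it.
 * Returns 0 on success, -1 on the first failing step. */
int enter_rootfs(const char *rootfs)
{
    if (chroot(rootfs) < 0) {
        perror("chroot");
        return -1;
    }
    if (chdir("/") < 0) {
        perror("chdir");
        return -1;
    }
    /* ps and similar tools read process information from /proc. */
    if (mount("proc", "/proc", "proc", 0, NULL) < 0) {
        perror("mount proc");
        return -1;
    }
    return 0;
}

/* Entry point for clone(2); arg is the path of the root file system,
 * for example the fs folder copied from the Buildroot output. */
int container_child(void *arg)
{
    const char *rootfs = arg;
    const char *hostname = "container";   /* hypothetical name */

    /* Affects only this process's UTS namespace when CLONE_NEWUTS is set. */
    if (sethostname(hostname, strlen(hostname)) < 0) {
        perror("sethostname");
        return 1;
    }
    if (enter_rootfs(rootfs) < 0)
        return 1;

    /* Replace this process with the shell from the new root. */
    execl("/bin/sh", "sh", (char *)NULL);
    perror("execl");
    return 1;
}
```

A caller would pass container_child to clone(2) with CLONE_NEWPID | CLONE_NEWNET | CLONE_NEWNS | CLONE_NEWIPC | CLONE_NEWUTS | SIGCHLD and the rootfs path as the argument. On systems where mounts are shared by default, the proc mount may propagate back to the host; marking the mount tree private first with mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) is a common precaution.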

As its last step, the child executes the BusyBox shell (/bin/sh).

Run this program with sudo and you will get a different shell, file system and network.

That's it!

You can find the code with the full Buildroot image here

This is just a simple example to illustrate the concept. To implement a full container you also need to add capabilities, control groups and more.


Original source: https://www.cnblogs.com/dream397/p/14098442.html
