What is a process?

Muhammad Abdul Aleem
7 min read · Aug 23, 2022

A process is an instance of an executing program. When a program is executed, the kernel loads the code of the program into virtual memory, allocates space for program variables, and sets up kernel bookkeeping data structures to record various information (such as process ID, termination status, user IDs, and group IDs) about the process.

Kernel point of view:
From a kernel point of view, processes are the entities among which the kernel must share the various resources of the computer. For resources that are limited, such as memory, the kernel initially allocates some amount of the resource to the process, and adjusts this allocation over the lifetime of the process in response to the demands of the process and the overall system demand for that resource. When the process terminates, all such resources are released for reuse by other processes. Other resources, such as the CPU and network bandwidth, are renewable, but must be shared equitably among all processes.

Process Memory Layout:

A process is logically divided into the following parts, known as segments:
Text: the instructions of the program.
Data: the static variables used by the program.
Heap: an area from which programs can dynamically allocate extra memory.
Stack: a piece of memory that grows and shrinks as functions are called and return, and that is used to allocate storage for local variables and function call linkage information.
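
On Linux, these segments can be observed in the /proc/PID/maps file of a running process. The following is a minimal sketch, assuming a Linux system with /proc mounted. The helper names summarize_maps and summarize_self are hypothetical, and the classification is a simplification: unnamed anonymous mappings and shared libraries are ignored, and the [heap] entry may be missing if the allocator has not grown the heap.

```python
import os


def summarize_maps(maps_text, exe_path):
    """Group the lines of a /proc/PID/maps listing into the four logical segments."""
    segments = {"text": [], "data": [], "heap": [], "stack": []}
    for line in maps_text.splitlines():
        # Fields: address range, permissions, offset, device, inode, pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # unnamed anonymous mapping: not classified in this sketch
        addr_range, perms, path = fields[0], fields[1], fields[5].strip()
        if path == "[heap]":
            segments["heap"].append(addr_range)
        elif path == "[stack]":
            segments["stack"].append(addr_range)
        elif path == exe_path and "x" in perms:
            segments["text"].append(addr_range)  # executable program code
        elif path == exe_path and "w" in perms:
            segments["data"].append(addr_range)  # writable static variables
    return segments


def summarize_self():
    """Summarize the calling process's own mappings (Linux only)."""
    with open("/proc/self/maps") as f:
        maps_text = f.read()
    return summarize_maps(maps_text, os.readlink("/proc/self/exe"))
```

Calling summarize_self() from a Python program reports the mappings of the Python interpreter itself, since that is the program the process is executing.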

Process creation and program execution:

A process can create a new process using the fork() system call. The process that calls fork() is referred to as the parent process, and the new process is referred to as the child process. The kernel creates the child process by making a duplicate of the parent process. The child inherits copies of the parent’s data, stack, and heap segments, which it may then modify independently of the parent’s copies. (The program text, which is placed in memory marked as read-only, is shared by the two processes.)
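
To illustrate that the child works on its own copy of the parent's memory, here is a minimal sketch using Python's os.fork(), which wraps the fork() system call. The function name fork_and_modify and the trick of reporting the child's change through its exit status are choices made for this illustration only.

```python
import os


def fork_and_modify():
    """Fork a child that modifies its copy of a variable.

    Returns (value seen by the parent afterwards, child's exit status).
    """
    counter = [100]  # exists in the parent's memory before the fork
    pid = os.fork()
    if pid == 0:
        # Child: this change affects only the child's copy.
        try:
            counter[0] += 1
        finally:
            os._exit(counter[0] - 100)  # report the change via the exit status
    # Parent: wait for the child, then look at our own copy.
    _, status = os.waitpid(pid, 0)
    return counter[0], os.WEXITSTATUS(status)
```

Here fork_and_modify() returns (100, 1): the child incremented its copy, while the parent's value is unchanged.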
The child process goes on either to execute a different set of functions in the same code as the parent or, frequently, to use the execve() system call to load and execute an entirely new program. An execve() call destroys the existing text, data, stack, and heap segments, replacing them with new segments based on the code of the new program.
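
The fork-then-exec pattern can be sketched as follows, using Python's os.fork() and os.execve() wrappers. The function name spawn_and_wait, the exit status 127 for a failed execve(), and the negative return value for a signal are conventions assumed for this sketch, not something the system calls prescribe.

```python
import os


def spawn_and_wait(path, argv, env=None):
    """Run the program at `path` in a child process and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the new program.
        try:
            os.execve(path, argv, dict(os.environ) if env is None else env)
        finally:
            os._exit(127)  # reached only if execve() failed
    # Parent: wait for the child to terminate.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # killed by a signal
```

For example, spawn_and_wait("/bin/sh", ["sh", "-c", "exit 3"]) returns 3, assuming /bin/sh exists on the system.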

Several related C library functions are layered on top of execve(), each providing a slightly different interface to the same functionality. All of these functions have names starting with the string exec, and where the differences don’t matter, we’ll use the notation exec() to refer generally to these functions. Be aware, however, that there is no actual function with the name exec().

Process ID and parent process ID:

Each process has a unique integer process identifier (PID). Each process also has a parent process identifier (PPID) attribute, which identifies the process that requested the kernel to create this process.
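
As a small illustration, the sketch below forks a child that reports its own PID and PPID back to the parent through a pipe. It uses Python's os.getpid() and os.getppid(), which wrap the corresponding system calls; the function name child_reports_ids is hypothetical.

```python
import os


def child_reports_ids():
    """Return (parent_pid, pid_returned_by_fork, child_pid, child_ppid)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: send our PID and our parent's PID through the pipe.
        try:
            os.close(read_fd)
            os.write(write_fd, f"{os.getpid()} {os.getppid()}".encode())
        finally:
            os._exit(0)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    data = os.read(read_fd, 64).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    child_pid, child_ppid = (int(x) for x in data.split())
    return os.getpid(), pid, child_pid, child_ppid
```

The PID that fork() returns to the parent matches the PID the child sees for itself, and the child's PPID matches the parent's PID.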

Process termination and termination status:

A process can terminate in one of two ways: by requesting its own termination using the _exit() system call (or the related exit() library function), or by being killed by the delivery of a signal. In either case, the process yields a termination status, a small nonnegative integer value that is available for inspection by the parent process using the wait() system call. In the case of a call to _exit(), the process explicitly specifies its own termination status. If a process is killed by a signal, the termination status is set according to the type of signal that caused the death of the process.
(Sometimes, we’ll refer to the argument passed to _exit() as the exit status of the process, as distinct from the termination status, which is either the value passed to _exit() or an indication of the signal that killed the process.)
By convention, a termination status of 0 indicates that the process succeeded, and a nonzero status indicates that some error occurred. Most shells make the termination status of the last executed program available via a shell variable named $?.
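
The two ways of terminating can be told apart by decoding the status that wait() returns. The following is a minimal sketch using Python's wrappers around waitpid() and the W* status macros; the helper names describe and run_child are hypothetical.

```python
import os
import signal


def describe(status):
    """Decode a termination status as returned by wait()/waitpid()."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))  # value passed to _exit()
    if os.WIFSIGNALED(status):
        return ("killed", os.WTERMSIG(status))     # number of the fatal signal
    return ("other", status)


def run_child(action):
    """Fork a child that runs `action`, and describe how it terminated."""
    pid = os.fork()
    if pid == 0:
        try:
            action()
        finally:
            os._exit(0)  # default if `action` returns normally
    _, status = os.waitpid(pid, 0)
    return describe(status)
```

For example, run_child(lambda: os._exit(7)) yields ("exited", 7), while run_child(lambda: os.kill(os.getpid(), signal.SIGKILL)) yields ("killed", 9), since SIGKILL is signal number 9.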

Process user and group identifiers (credentials):

Each process has a number of associated user IDs (UIDs) and group IDs (GIDs). These include:
Real user ID and real group ID: These identify the user and group to which the process belongs. A new process inherits these IDs from its parent. A login shell gets its real user ID and real group ID from the corresponding fields in the system password file.
Effective user ID and effective group ID: These two IDs (in conjunction with the supplementary group IDs discussed in a moment) are used in determining the permissions that the process has when accessing protected resources such as files and inter-process communication objects. Typically, the process’s effective IDs have the same values as the corresponding real IDs. Changing the effective IDs is a mechanism that allows a process to assume the privileges of another user or group, as described in a moment.
Supplementary group IDs: These IDs identify additional groups to which a process belongs. A new process inherits its supplementary group IDs from its parent. A login shell gets its supplementary group IDs from the system group file.
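
A process can query all of these identifiers for itself. The sketch below collects them using Python's os module, which wraps getuid(), geteuid(), getgid(), getegid(), and getgroups(); the function name credentials and the dictionary keys are chosen for this illustration.

```python
import os


def credentials():
    """Return the calling process's user and group identifiers."""
    return {
        "real_uid": os.getuid(),
        "effective_uid": os.geteuid(),
        "real_gid": os.getgid(),
        "effective_gid": os.getegid(),
        "supplementary_gids": sorted(os.getgroups()),
    }
```

For an ordinary program started from a login shell, the real and effective values in this dictionary are typically equal.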

Privileged Processes:

Traditionally, on UNIX systems, a privileged process is one whose effective user ID is 0 (superuser). Such a process bypasses the permission restrictions normally applied by the kernel. By contrast, the term unprivileged (or nonprivileged) is applied to processes run by other users. Such processes have a nonzero effective user ID and must abide by the permission rules enforced by the kernel.
A process may be privileged because it was created by another privileged process, for example, by a login shell started by root (superuser). Another way a process may become privileged is via the set-user-ID mechanism, which allows a process to assume an effective user ID that is the same as the user ID of the program file that it is executing.
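
Both ideas can be checked from a program. The following is a minimal sketch: is_privileged() applies the traditional effective-user-ID test, and the other two helpers check whether a program file has the set-user-ID permission bit. All three function names are hypothetical; stat.S_ISUID is the standard constant for that bit.

```python
import os
import stat


def is_privileged():
    """Traditional UNIX test: is the effective user ID 0 (superuser)?"""
    return os.geteuid() == 0


def mode_has_set_user_id(mode):
    """Return True if a file mode has the set-user-ID bit set."""
    return bool(mode & stat.S_ISUID)


def program_is_set_user_id(path):
    """Return True if the file at `path` is a set-user-ID program."""
    return mode_has_set_user_id(os.stat(path).st_mode)
```

On many Linux distributions, a program such as /usr/bin/passwd is installed set-user-ID and owned by root, so program_is_set_user_id() would return True for it; the exact path and permissions vary between systems.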

Capabilities:

Since kernel 2.2, Linux divides the privileges traditionally accorded to the superuser into a set of distinct units called capabilities. Each privileged operation is associated with a particular capability, and a process can perform an operation only if it has the corresponding capability. A traditional superuser process (effective user ID of 0) corresponds to a process with all capabilities enabled. Granting a subset of capabilities to a process allows it to perform some of the operations normally permitted to the superuser, while preventing it from performing others.
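
On Linux, the effective capability set of a process is exposed as a hexadecimal bit mask in the CapEff field of /proc/PID/status. The sketch below parses that field and tests individual bits. It assumes a Linux system with /proc mounted; the function names are hypothetical, and the three capability numbers are taken from the Linux capability header (linux/capability.h).

```python
# Capability numbers as defined in <linux/capability.h>.
CAP_CHOWN = 0
CAP_NET_BIND_SERVICE = 10
CAP_SYS_ADMIN = 21


def parse_effective_caps(status_text):
    """Extract the effective capability bit mask from /proc/PID/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff field found")


def has_capability(mask, cap):
    """Return True if capability number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def effective_caps_of_self():
    """Read the calling process's effective capabilities (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_effective_caps(f.read())
```

An unprivileged process normally has an empty effective set, so has_capability(effective_caps_of_self(), CAP_SYS_ADMIN) would usually be False unless the process runs as root or was granted that capability.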

The init process:

When booting the system, the kernel creates a special process called init, the “parent of all processes,” which is derived from the program file /sbin/init. All processes on the system are created (using fork()) either by init or by one of its descendants. The init process always has the process ID 1 and runs with superuser privileges. The init process can’t be killed (not even by the superuser), and it terminates only when the system is shut down. The main task of init is to create and monitor a range of processes required by a running system.

Daemon process:

A daemon is a special-purpose process that is created and handled by the system in the same way as other processes, but which is distinguished by the following characteristics:
It is long-lived. A daemon process is often started at system boot and remains in existence until the system is shut down.
It runs in the background, and has no controlling terminal from which it can read input or to which it can write output.

Examples of daemon processes include syslogd, which records messages in the system log, and httpd, which serves web pages via the Hypertext Transfer Protocol (HTTP).
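
A common way for a program to detach itself from its controlling terminal and run in the background is to fork twice with a setsid() call in between. The following is a minimal sketch of that technique, not a complete daemon: the function name spawn_daemon is hypothetical, and steps such as resetting the umask or closing other inherited file descriptors are left out.

```python
import os


def spawn_daemon(task):
    """Run `task` in a background process that has no controlling terminal."""
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)  # reap the short-lived intermediate child
        return
    # Intermediate child: start a new session, which has no controlling terminal.
    try:
        os.setsid()
        if os.fork() > 0:
            os._exit(0)  # session leader exits; the grandchild keeps running
        # Grandchild: not a session leader, so it cannot acquire a terminal.
        os.chdir("/")
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)  # detach standard input, output, and error
        if devnull > 2:
            os.close(devnull)
        task()
    finally:
        os._exit(0)
```

The caller returns as soon as the intermediate child has exited, while task continues to run in the background process.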

Environment List:

Each process has an environment list, which is a set of environment variables that are maintained within the user-space memory of the process. Each element of this list consists of a name and an associated value. When a new process is created via fork(), it inherits a copy of its parent’s environment. Thus, the environment provides a mechanism for a parent process to communicate information to a child process. When a process replaces the program that it is running using exec(), the new program either inherits the environment used by the old program or receives a new environment specified as part of the exec() call. Environment variables are created with the export command in most shells (or the setenv command in the C shell), as in the following example:
$ export MYVAR='Hello world'
C programs can access the environment using an external variable (char **environ), and various library functions allow a process to retrieve and modify values in its environment.
Environment variables are used for a variety of purposes. For example, the shell defines and uses a range of variables that can be accessed by scripts and programs executed from the shell. These include the variable HOME, which specifies the pathname of the user’s login directory, and the variable PATH, which specifies a list of directories that the shell should search when looking for programs corresponding to commands entered by the user.
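
To illustrate how an environment reaches a new program, the sketch below runs /bin/sh in a child process and reads back the value the shell sees for one variable. The environment is passed explicitly as the third argument of execve(). The function name read_var_in_child is hypothetical, and the sketch assumes that /bin/sh exists and provides printf.

```python
import os


def read_var_in_child(name, env):
    """Run /bin/sh with environment `env` and return the value it sees for `name`."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)  # send the child's standard output into the pipe
            os.execve("/bin/sh", ["sh", "-c", 'printf %s "$' + name + '"'], env)
        finally:
            os._exit(127)  # reached only if execve() failed
    # Parent: read everything the child printed, then reap it.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode()
```

Passing os.environ as env makes the new program inherit the caller's environment, while passing a fresh dictionary such as {"MYVAR": "Hello world"} gives it a new one containing only that variable.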

Resource Limits:

Each process consumes resources, such as open files, memory, and CPU time. Using the setrlimit() system call, a process can establish upper limits on its consumption of various resources. Each such resource limit has two associated values: a soft limit, which limits the amount of the resource that the process may consume; and a hard limit, which is a ceiling on the value to which the soft limit may be adjusted. An unprivileged process may change its soft limit for a particular resource to any value in the range from zero up to the corresponding hard limit, but can only lower its hard limit.
When a new process is created with fork(), it inherits copies of its parent’s resource limit settings.
The resource limits of the shell can be adjusted using the ulimit command (limit in the C shell). These limit settings are inherited by the child processes that the shell creates to execute commands.
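
The soft and hard limits can be read and changed from a program as well. The following is a minimal sketch using Python's resource module, which wraps getrlimit() and setrlimit(); the function name set_soft_limit and the up-front ValueError check are additions made for this illustration.

```python
import resource


def set_soft_limit(res, new_soft):
    """Change only the soft limit of resource `res`, leaving the hard limit alone.

    Returns the (soft, hard) pair in effect afterwards.
    """
    _, hard = resource.getrlimit(res)
    unlimited = resource.RLIM_INFINITY
    if hard != unlimited and (new_soft == unlimited or new_soft > hard):
        raise ValueError("soft limit may not exceed the hard limit")
    resource.setrlimit(res, (new_soft, hard))
    return resource.getrlimit(res)
```

For example, set_soft_limit(resource.RLIMIT_NOFILE, 256) restricts the process, and any children it later creates, to file descriptor numbers below 256, provided the hard limit is at least 256.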
