You can find the book authored by Richard Hartley and Andrew Zisserman in .pdf format here on GitHub
The main points of interest are Camera Pose Estimations (Chapters 6.1, 6.2) and Coordinate Frames and Homogeneous Transformations (Chapter 7.2).
A camera is a mapping between the 3D world (object space) and a 2D image.
The model studied here is the pinhole camera model.
Central projection using homogeneous coordinates.
If the world and image points are represented by homogeneous vectors, then central projection is very simply expressed as a linear mapping between their homogeneous coordinates.
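So the pinhole mapping $(X, Y, Z)^T \mapsto (fX/Z, fY/Z)^T$ becomes:

$$
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} \mapsto \begin{pmatrix} fX \\ fY \\ Z \end{pmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}
$$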
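As a minimal sketch of central projection as a linear map on homogeneous coordinates, the snippet below applies the 3x4 matrix $\mathrm{diag}(f, f, 1)[I \mid 0]$ to one point. The focal length and the point are made-up values, not taken from the book or from my setup.

```python
import numpy as np

f = 2.0  # hypothetical focal length

# 3x4 central-projection matrix diag(f, f, 1) [I | 0]
P0 = np.array([[f, 0.0, 0.0, 0.0],
               [0.0, f, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])

X = np.array([1.0, 2.0, 4.0, 1.0])  # homogeneous world point (X, Y, Z, 1)

x_h = P0 @ X          # homogeneous image point (fX, fY, Z)
x = x_h[:2] / x_h[2]  # dehomogenise to (fX/Z, fY/Z)

print(x_h, x)
```

Dividing by the third coordinate recovers the usual pinhole formula, so the nonlinearity is deferred to the very last step.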
Camera Rotation and Translation
In general, points in space will be expressed in terms of a different Euclidean coordinate frame, known as the world coordinate frame. The two coordinate frames are related via a rotation and a translation.

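Skipping some formulas, we define the camera matrix as:

$$
P = K R \, [I \mid -\tilde{C}]
$$

where $K$ is the camera calibration matrix, $R$ is the rotation between the world and camera frames, and $\tilde{C}$ represents the coordinates of the camera centre in the world coordinate frame (see figure above). Thus, to map a world point $\mathbf{X}$ to an image point $\mathbf{x}$, we use $\mathbf{x} = P\mathbf{X}$. In my case, the inputs for this come from aruco.py.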
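A small sketch of assembling $P = KR[I \mid -\tilde{C}]$ and projecting one world point with it. The intrinsics `K`, the rotation `R`, the centre `C_tilde` and the point are all hypothetical values chosen for illustration; nothing here is read from aruco.py.

```python
import numpy as np

# Hypothetical calibration matrix (focal length 800 px, principal point (320, 240))
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical rotation: 30 degrees about the Y axis
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])

# Hypothetical camera centre in world coordinates
C_tilde = np.array([1.0, 2.0, -5.0])

# P = K R [I | -C_tilde]
P = K @ R @ np.hstack([np.eye(3), -C_tilde.reshape(3, 1)])

# Equivalent form P = K [R | t] with t = -R C_tilde
t = -R @ C_tilde
P_alt = K @ np.hstack([R, t.reshape(3, 1)])

X = np.array([0.5, -0.2, 3.0, 1.0])  # homogeneous world point
x_h = P @ X                          # homogeneous image point
x = x_h[:2] / x_h[2]                 # pixel coordinates

print(x)
```

The two forms differ only in whether the translation is stored as the camera centre or as $t = -R\tilde{C}$.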
Let $P = [M \mid p_4]$, where $M$ is the left 3x3 block of $P$ and $p_4$ is its fourth column.
Finding the camera center
The camera center is the point $C$ for which $PC = 0$. Numerically this right null-vector may be obtained from the SVD (Singular Value Decomposition) of $P$.
The camera centre is the 1-dimensional right null-space $C$ of $P$, i.e. $PC = 0$.
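- Finite camera ($M$ is not singular): $C = \begin{pmatrix} -M^{-1} p_4 \\ 1 \end{pmatrix}$
- Camera at infinity ($M$ is singular): $C = \begin{pmatrix} d \\ 0 \end{pmatrix}$, where $d$ is the null 3-vector of $M$, i.e. $Md = 0$.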
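The camera centre $C$ satisfies $PC = 0$, which can be sketched numerically in two ways for a finite camera: as the right null-vector from the SVD, and through the closed form $-M^{-1}p_4$. The matrix `P` below is a hypothetical example, not one from my calibration.

```python
import numpy as np

# Hypothetical finite camera matrix P = [M | p4]
P = np.array([[800.0, 0.0, 320.0, 100.0],
              [0.0, 800.0, 240.0, -50.0],
              [0.0, 0.0, 1.0, 2.0]])

# Right null-vector: last row of Vt from the full SVD of the 3x4 matrix
_, _, Vt = np.linalg.svd(P)
C_svd = Vt[-1]
C_svd = C_svd / C_svd[3]  # scale so the last coordinate is 1 (finite camera only)

# Closed form for a finite camera: C = (-M^{-1} p4, 1)
M, p4 = P[:, :3], P[:, 3]
C_closed = np.append(-np.linalg.solve(M, p4), 1.0)

print(C_svd, C_closed)
```

The SVD route also works when $M$ is singular, as long as the result is not divided by its last coordinate, which is zero for a camera at infinity.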
Column points
For $i = 1, \ldots, 3$, the column vectors $p_i$ are vanishing points in the image corresponding to the $X$, $Y$, and $Z$ axes respectively.
Column $p_4$ is the image of the coordinate origin.
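To see why the columns of $P$ have this meaning, one can project the points at infinity along the world axes and the world origin. This is only a sketch with a hypothetical `P`.

```python
import numpy as np

# Hypothetical finite camera matrix, for illustration only
P = np.array([[800.0, 0.0, 320.0, 100.0],
              [0.0, 800.0, 240.0, -50.0],
              [0.0, 0.0, 1.0, 2.0]])

# Points at infinity along the world X, Y, Z axes: (1,0,0,0), (0,1,0,0), (0,0,1,0)
axis_dirs = np.eye(4)[:3]
# World coordinate origin in homogeneous form
origin = np.array([0.0, 0.0, 0.0, 1.0])

vanishing_points = [P @ d for d in axis_dirs]  # equal to columns p1, p2, p3
origin_image = P @ origin                      # equal to column p4

print(vanishing_points, origin_image)
```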
Principal plane
The principal plane of the camera is $P^3$, the last row of $P$.
Axis planes
The planes $P^1$ and $P^2$ (the first and second rows of $P$) represent planes in space through the camera centre, corresponding to points that map to the image lines $x = 0$ and $y = 0$ respectively.
Principal point
The image point $x_0 = M m^3$ is the principal point of the camera,
where $m^{3T}$ is the third row of $M$.
Principal ray
The principal ray (axis) of the camera is the ray passing through the camera centre $C$
with direction vector $m^{3T}$.
The principal axis vector is:

$$
v = \det(M)\, m^3
$$
It is directed towards the front of the camera.
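A short sketch of the principal point $x_0 = M m^3$ and the principal axis vector $v = \det(M)\, m^3$, where $m^3$ is the third row of $M$. The camera is built from hypothetical `K`, `R` and `C_tilde`, so the principal point should come out as the one stored in `K`.

```python
import numpy as np

# Hypothetical intrinsics with principal point (320, 240)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical rotation: 30 degrees about the Y axis
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])

C_tilde = np.array([1.0, 2.0, -5.0])  # hypothetical camera centre
P = K @ R @ np.hstack([np.eye(3), -C_tilde.reshape(3, 1)])

M = P[:, :3]
m3 = M[2]                  # third row of M

x0_h = M @ m3              # principal point, homogeneous
x0 = x0_h[:2] / x0_h[2]    # principal point in pixels

v = np.linalg.det(M) * m3  # principal axis vector, pointing to the front

# The det(M) factor makes v insensitive to the sign of P
v_flipped = np.linalg.det(-M) * (-M[2])

print(x0, v / np.linalg.norm(v))
```

Since $P$ is only defined up to scale, $m^3$ alone could point either way; multiplying by $\det(M)$ removes that ambiguity.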
Depth of Points
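Let $\mathbf{X} = (X, Y, Z, T)^T$ be a 3D point and $P = [M \mid p_4]$ be a camera matrix for a finite camera.
Suppose:

$$
P (X, Y, Z, T)^T = w (x, y, 1)^T
$$

Then:

$$
\operatorname{depth}(\mathbf{X}; P) = \frac{\operatorname{sign}(\det M)\, w}{T \, \lVert m^3 \rVert}
$$

is the depth of the point $\mathbf{X}$ in front of the principal plane of the camera.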
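The depth formula $\operatorname{sign}(\det M)\, w / (T \lVert m^3 \rVert)$ can be sketched as a small helper. The function name `depth` and the example camera are my own illustrative choices: the camera looks along the world Z axis from the hypothetical centre $(0, 0, -5)$, so a point at $Z = 3$ should have depth 8.

```python
import numpy as np

def depth(X, P):
    """Depth of homogeneous point X = (X, Y, Z, T) for a finite camera P = [M | p4]."""
    M = P[:, :3]
    w = (P @ X)[2]  # third coordinate of the projected point
    T = X[3]        # homogeneous scale of the 3D point
    return np.sign(np.linalg.det(M)) * w / (T * np.linalg.norm(M[2]))

# Hypothetical camera: identity rotation, centre at (0, 0, -5)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
C_tilde = np.array([0.0, 0.0, -5.0])
P = K @ np.hstack([np.eye(3), -C_tilde.reshape(3, 1)])

X_front = np.array([0.0, 0.0, 3.0, 1.0])    # 8 units in front of the camera
X_behind = np.array([0.0, 0.0, -7.0, 1.0])  # 2 units behind the camera

print(depth(X_front, P), depth(X_behind, P))
```

The result does not change if `X` or `P` is multiplied by a nonzero scalar, which is the reason for the $\operatorname{sign}(\det M)$, $T$ and $\lVert m^3 \rVert$ terms.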