TRENDS IN THE DEVELOPMENT OF COMPUTER FACIAL RECOGNITION
Roman Plekhov
First-year Master's student at Belgorod State National Research University (BelSU),
Russia, Belgorod
Facial recognition technology is a combination of methods, algorithms, and software that enables automatic identification of a face based on its visual characteristics. It relies on collecting and processing face images and on building templates that identify a specific face by its unique features, such as the shape of the face and the placement of the eyes, nose, mouth, and other elements. Facial recognition is widely used in security systems, access control, video surveillance, and various entertainment applications.
Facial recognition technology is based on the following theorems, lemmas, and methods:
1. Principal Component Analysis (PCA) - used to reduce the dimensionality of the data in facial recognition, which simplifies analysis and reduces the computational load.
2. Local Binary Patterns (LBP) - used to analyze facial texture, enabling identification of various facial characteristics such as the shape of the eyes, nose, mouth, and more.
3. Pearson's theorem - used to compute the facial correlation matrix and determine its similarity to other known faces.
4. Probability theory - used to assess the probability of a face matching a known facial template.
5. Machine learning theory - used to define training data and build facial recognition models from that data.
Facial recognition has become a vital aspect of modern information technology, with an increasing need to follow its latest trends. Therefore, this study aims to examine the recent advancements and future directions in facial recognition technology.
The first method is principal component analysis (PCA), a mathematical technique for reducing the dimensionality of data. PCA allows us to identify the components that have the most influence on our dataset. It works by finding new variables that are linear combinations of the original variables and are uncorrelated with each other.
For example, suppose we have a dataset with multiple variables. We can use PCA to determine which variables are most important in explaining the variations in the data. The new variables obtained after PCA can be used to create new models that are more understandable and effective for data analysis.
The PCA algorithm consists of several steps:
1. Subtract the mean value of each variable from the dataset to make it centered, which facilitates working with it.
2. Compute the covariance matrix between all pairs of variables.
3. Calculate the eigenvectors and eigenvalues of the covariance matrix. An eigenvector is a vector that does not change its direction when multiplied by the matrix, and an eigenvalue is a number that indicates the importance of that particular vector.
4. Sort the eigenvectors by their eigenvalues in descending order to identify the most important components.
5. Create new variables from the most important eigenvectors.
6. Calculate the contribution of each component to the total variance of the data, which helps determine how many components are needed to preserve the necessary information.
7. Select the optimal number of components based on the goals of the analysis. For example, if the goal is to reduce the dimensionality of the data to speed up modeling, only a few components may be used.
8. Finally, use the new variables to build new models.
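The steps above can be illustrated with a minimal NumPy sketch. It is not a production implementation; the function and variable names, the random test data, and the choice of three components are purely illustrative.

```python
import numpy as np

def pca(X, n_components):
    """Minimal PCA: project the rows of X onto the top principal components."""
    # 1. Center the data by subtracting the mean of each variable (column).
    X_centered = X - X.mean(axis=0)
    # 2. Compute the covariance matrix between all pairs of variables.
    cov = np.cov(X_centered, rowvar=False)
    # 3. Eigenvectors and eigenvalues of the (symmetric) covariance matrix.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # 4. Sort components by eigenvalue in descending order.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # 5. Build new variables from the most important eigenvectors.
    components = eigenvectors[:, :n_components]
    X_reduced = X_centered @ components
    # 6. Contribution of each retained component to the total variance.
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return X_reduced, components, explained_ratio

# Example: 100 observations of 10 variables, reduced to 3 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X_reduced, components, explained = pca(X, n_components=3)
print(X_reduced.shape, explained)
```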
PCA can be used for various tasks, such as image processing, financial data analysis, biomedical research, etc. It is a powerful tool for reducing the dimensionality of data and creating new models. In face recognition technology, the method is used as follows:
Principal component analysis is one of the most commonly used methods in face recognition technology. It is applied to reduce the dimensionality of the data, which improves recognition accuracy and reduces processing time and cost.
The first step in using PCA in face recognition is collecting and processing data. This can be done by creating a database of face images for model training. Data processing may include noise reduction, image alignment, scaling, etc.
Then, using PCA, face images are transformed from the original multidimensional space to a set of principal components. These principal components are new orthogonal vectors that represent differences between face images. The principal components are chosen to contain the maximum variations in the data, which is done using singular value decomposition of the data matrix.
Then, these principal components are used to build the face recognition model. When a new face image enters the system, it undergoes the same transformation process and finds its place in the new principal component space. Then, the system compares the location of the new image in this space with the images stored in the database and finds the best match.
Thus, the principal component analysis enables us to create a model that can recognize faces in images, even with some variations, such as changes in image size, rotation angles, etc. In recent years, PCA has become a standard method in face recognition and is used in many modern applications.
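A compact sketch of this pipeline ("eigenfaces" plus nearest-neighbor matching) is shown below using scikit-learn. It assumes the face images have already been flattened into equal-length vectors and labeled; the random dataset, image size, and number of components are illustrative stand-ins, not values from the original study.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data: 200 flattened grayscale face images of size 64x64,
# belonging to 20 hypothetical persons (labels 0..19).
rng = np.random.default_rng(1)
train_images = rng.random((200, 64 * 64))
train_labels = rng.integers(0, 20, size=200)

# Project the images onto the principal components ("eigenfaces").
pca = PCA(n_components=50)
train_features = pca.fit_transform(train_images)

# Store the projected training images and match new faces to the nearest one.
classifier = KNeighborsClassifier(n_neighbors=1)
classifier.fit(train_features, train_labels)

# A new face image goes through the same transformation and is compared
# with the images stored in the database.
new_image = rng.random((1, 64 * 64))
new_features = pca.transform(new_image)
predicted_person = classifier.predict(new_features)
print("Best match:", predicted_person[0])
```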
The next method, which is a key component of computer vision technology, is the Local Binary Patterns (LBP) method - an algorithm for image recognition based on texture analysis of objects. It was developed for face and object detection in video surveillance, but can also be used for other purposes, such as hand gesture recognition.
The main advantage of the LBP method is its relative simplicity of implementation and high speed of image recognition. The algorithm operates in several stages:
1. The image is divided into small windows.
2. For each pixel in the window, a local binary pattern is calculated by comparing the brightness of each of its eight neighbors with the brightness of the central pixel. If the brightness of a neighboring pixel is greater than or equal to that of the central pixel, the corresponding bit in the pattern is set to 1; otherwise, it is set to 0.
3. After that, the obtained pattern is transformed into a decimal number, which becomes the new value of the central pixel.
4. For each window, a histogram of the LBP values is formed.
5. Finally, the histograms of all windows are combined into a feature vector that is used for image classification.
The LBP method has several modifications, such as Extended LBP (ELBP), Binary LBP (BLBP), Local Pattern Texture (LPT), and others. Each of these has its own characteristics and can be used to solve specific tasks.
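The following simplified NumPy sketch implements the basic 3x3 LBP operator and the per-window histograms described above. It ignores image borders and the LBP modifications, and the window size and random test image are illustrative assumptions.

```python
import numpy as np

def lbp_image(gray):
    """Basic 3x3 LBP: compare each pixel's eight neighbors with the center pixel."""
    h, w = gray.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.int32)
    # Offsets of the eight neighbors, visited in a fixed order (one bit each).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    center = gray[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # The bit is 1 where the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.int32) << bit
    return codes.astype(np.uint8)

def lbp_descriptor(gray, window=16):
    """Concatenate per-window histograms of LBP codes into one feature vector."""
    codes = lbp_image(gray)
    histograms = []
    for y in range(0, codes.shape[0] - window + 1, window):
        for x in range(0, codes.shape[1] - window + 1, window):
            block = codes[y:y + window, x:x + window]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            histograms.append(hist)
    return np.concatenate(histograms)

# Example on a random "image"; a real face crop would be used in practice.
gray = np.random.default_rng(2).integers(0, 256, size=(128, 128), dtype=np.uint8)
print(lbp_descriptor(gray).shape)
```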
A common drawback of the LBP method is that it does not take spatial information in the image into account, i.e. the location of objects relative to each other. However, this limitation can be addressed by combining the LBP method with other computer vision algorithms, such as principal component analysis, support vector machines, or neural networks. In face recognition technology, the Local Binary Patterns method is commonly used for feature extraction. In the LBP algorithm, each pixel is assigned a binary code built from comparisons with its neighbors: each comparison yields 0 or 1 depending on whether the neighbor is darker or brighter than the pixel. The image is then divided into several blocks, and for each block a histogram of the LBP codes is constructed; this histogram serves as the block's LBP descriptor. Using these LBP descriptors, faces can be recognized in both photographs and video streams.
They enable the extraction of unique facial features, such as skin texture and mouth shape, making the LBP method more resistant to head rotations and lighting changes.
Specific implementations of the LBP method in face recognition technology can be found in systems such as OpenCV, FaceAPI, and SeetaFace.
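As a brief usage illustration, the sketch below calls the LBPH face recognizer shipped with OpenCV's contrib modules (the opencv-contrib-python package). The training images and labels here are random placeholders; in a real system they would come from a prepared face database.

```python
import cv2
import numpy as np

# Assumes the opencv-contrib-python package, which provides the cv2.face module.
recognizer = cv2.face.LBPHFaceRecognizer_create()

# Placeholder training data: grayscale face crops and integer person labels.
faces = [np.random.randint(0, 256, (100, 100), dtype=np.uint8) for _ in range(10)]
labels = np.array([i % 2 for i in range(10)], dtype=np.int32)

recognizer.train(faces, labels)

# Predict the identity of a new face crop; lower confidence means a closer match.
test_face = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
label, confidence = recognizer.predict(test_face)
print("Predicted label:", label, "confidence:", confidence)
```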
Pearson's chi-squared test is a statistical method used to determine whether two categorical variables are independent. It is named after Karl Pearson, who developed it in 1900.
The Pearson theorem is used to test the hypothesis that observed data corresponds to expected values, taking into account the assumption of independence between two variables. This method is used for analyzing data that can be categorized, such as analyzing the frequency of occurrence of certain words in text or analyzing people's preferences when choosing specific products.
The main idea of the Pearson theorem is to compare the observed statistical variable (also called the observed value) with the theoretically expected value. If the observed value significantly differs from the theoretically expected value, the hypothesis of independence between the two variables is rejected.
For example, suppose we are studying people's preferences when choosing the colors of cars. We surveyed 50 people and counted the number of people who prefer red, blue, or green cars. We can then use the Pearson theorem to determine whether there is a dependence between the preference for a certain color and the gender of the respondents.
To apply the Pearson theorem, we can create a contingency table, where the car colors are listed horizontally and the gender of the respondents is listed vertically. In the table cells, we indicate the number of people who prefer red, blue, or green cars depending on their gender.
Then, we can calculate the expected values for each cell in the table based on the total number of people surveyed in our study and the frequency of choosing each car color in our sample. This allows us to compare the observed values with the expected values, and if they significantly differ from each other, we can conclude that the car color depends on the gender of the people.
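A small worked version of this example can be run with SciPy's chi-squared test for contingency tables. The counts below are made up for illustration; they simply sum to the 50 respondents mentioned above.

```python
from scipy.stats import chi2_contingency

# Rows: gender (men, women); columns: preferred car color (red, blue, green).
# The counts are purely illustrative.
observed = [
    [10, 8, 7],   # men
    [12, 6, 7],   # women
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print("chi-squared statistic:", round(chi2, 3))
print("degrees of freedom:", dof)
print("expected counts under independence:", expected.round(2))
# A small p-value would lead us to reject the hypothesis that color
# preference is independent of gender.
print("p-value:", round(p_value, 3))
```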
Thus, the Pearson theorem is a statistical analysis method that allows us to determine whether two variables are independent. It can be applied to any data set that can be categorized and helps us understand which variables are interrelated.
The reasoning behind the theorem was presented by Karl Pearson in 1900 and includes several steps. The first step is to establish that if a data sample has a normal distribution, then its mean value also has a normal distribution; this is supported by the central limit theorem, which states that the sum of a large number of independent random variables is approximately normally distributed for a sufficiently large sample size. The second step is to calculate the test statistic used to test the hypothesis that the data sample is normally distributed; this is done by comparing the observed values of the sample with the values predicted by the normal distribution, using the chi-squared statistic. The third step is to accept or reject the hypothesis of normality. This is done by estimating the probability that the differences between the observed and predicted values can be explained by chance: if that probability is high enough, the hypothesis is not rejected, which allows the sample mean to be used to estimate the parameters of the population.
In general, the proof of Pearson's theorem involves mathematically complex calculations and certain assumptions about the distribution of data, but its results have great significance in statistical analysis and help improve accuracy in estimating parameters of the population.
In facial recognition technology, Pearson's theorem is used to calculate the degree of similarity between two faces. In this case, it is assumed that the distribution of pixels in the face image follows a normal distribution. This means that the mean value of the pixels in the image should be close to the mean distribution values, and the standard deviation should be close to the value characteristic of a normal distribution.
Using Pearson's theorem, the distance between two faces in the normal distribution space can be calculated. First, a set of axes that best describes the normal distribution is selected. Then, for each face, coordinates are calculated based on the standard deviation and mean value. Thus, each face is represented in the normal distribution space. Then, using the formula for the distance between two points in this space (the Euclidean formula), the distance between two faces in this space is found. The smaller the distance between faces, the higher the probability that they belong to the same person. Thus, Pearson's theorem is used in facial recognition technology to help a computer compare and assess the degree of similarity between two faces. This improves accuracy in face recognition and identification.
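A minimal sketch of using the Pearson correlation coefficient as a similarity score between two face feature vectors is shown below. The random vectors stand in for real features (e.g. PCA projections or LBP histograms), and the "same person" / "different person" labels are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative stand-ins for feature vectors extracted from face images.
rng = np.random.default_rng(3)
face_a = rng.random(128)
face_b = face_a + 0.1 * rng.normal(size=128)   # a slightly perturbed copy
face_c = rng.random(128)                        # an unrelated face

def similarity(x, y):
    """Pearson correlation: close to 1 for very similar feature vectors."""
    r, _ = pearsonr(x, y)
    return r

print("same person (expected high):", round(similarity(face_a, face_b), 3))
print("different person (expected lower):", round(similarity(face_a, face_c), 3))
```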
The next theorem is the classical probability theorem (the classical definition of probability), one of the main tools of probability theory, which allows the probability of a certain event to be calculated; its connection to the axiomatic definition of probability is shown below. It can be formulated as follows: if n different outcomes are equally likely and m of them correspond to the event of interest, then the probability of this event occurring is m/n.
A more detailed formulation of the theorem of probabilities is as follows:
1. The probability of any event occurring is between 0 and 1.
2. The probability of a certain event occurring is 1.
3. If two events are mutually exclusive (i.e., cannot occur simultaneously), then the probability of one of them occurring is equal to the sum of the probabilities of each event separately.
4. If events A and B are independent, then the probability of their joint occurrence is equal to the product of the probabilities of each event separately: P(A∩B) = P(A) * P(B).
It should be noted that the theorem of probabilities only works under certain conditions, such as the equal likelihood of outcomes and the independence of events. If these conditions are not met, the application of the theorem requires additional formalization and methods.
The theorem of probability is one of the fundamental concepts of probability theory. It asserts that the probability of an event occurring is equal to the ratio of the number of all favorable outcomes of this event to the number of all possible outcomes of the experiment.
Formally, the probability theorem can be written as follows:
P(A) = N(A) / N
where P(A) is the probability of event A occurring, N(A) is the number of outcomes favorable to A, and N is the total number of possible outcomes. The proof of the probability theorem can be based on the axiomatic approach to probability theory, in which probability is treated as a function that satisfies certain axioms.
The first axiom states that the probability of any event is non-negative:
P(A) ≥ 0
The second axiom asserts that the probability of a certain event is equal to 1:
P(Ω) = 1
where Ω is the space of elementary outcomes, i.e. the set of all possible outcomes of the experiment. The third axiom establishes the additivity of probability: if events A and B are mutually exclusive (i.e. cannot occur simultaneously), then the probability of their union is equal to the sum of their probabilities:
P(A ∪ B) = P(A) + P(B)
For several pairwise mutually exclusive events, the third axiom can be expressed in a more general form:
P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An)
Thus, the probability theorem can be derived from these axioms. Consider an experiment with N equally likely elementary outcomes ω1, ω2, ..., ωN. The elementary outcomes are mutually exclusive and together make up the whole space Ω, so by the second and third axioms
P(ω1) + P(ω2) + ... + P(ωN) = P(Ω) = 1
Since the outcomes are equally likely, each of them has the same probability:
P(ωi) = 1 / N
Any event A is the union of the elementary outcomes favorable to it. If there are N(A) such outcomes, then by the additivity axiom the probability of A equals the sum of their probabilities:
P(A) = N(A) * (1 / N) = N(A) / N
Thus, we obtain the probability of an event as the sum of the probabilities of its favorable outcomes, and it can be concluded that the probability theorem follows from the axiomatic definition of probability.
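As a tiny sanity check of the classical formula P(A) = N(A) / N, the snippet below enumerates the outcomes of rolling one fair six-sided die; the example itself is ours, not part of the original derivation.

```python
from fractions import Fraction

# All equally likely outcomes of rolling one fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]

# Event A: the result is even.
favorable = [k for k in outcomes if k % 2 == 0]

# Classical formula: P(A) = N(A) / N.
probability = Fraction(len(favorable), len(outcomes))
print(probability)  # 1/2
```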
Facial recognition technology is one of the most common areas where probability theory is used. For facial recognition, a model is first trained on a set of facial images, after which an input facial image is compared with this model to determine whom the face belongs to. The model is built from data collected on various criteria, such as skin color, eye shape, and other features. These data are weighted according to their significance, and the model is then used to calculate the probability that two facial images match. Applying probability theory, one can then determine how close the two facial images are: if the probability of a match is high, it can be concluded that the two images belong to the same person. However, errors are also possible in facial recognition technology. This can happen, for example, if an image is taken at an angle or distorted for other reasons; in such cases, the probability of a match will be low, and the system may make a mistake.
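One possible way to turn weighted feature differences into a match probability, in the spirit of the description above, is sketched here. The feature names, weights, threshold, and logistic mapping are illustrative assumptions rather than a standard formula; a real system would learn such parameters from data.

```python
import numpy as np

def match_probability(features_a, features_b, weights, steepness=20.0, threshold=0.3):
    """Illustrative match score: weighted distance mapped to a probability.

    The weights, steepness, and threshold are assumptions for this sketch.
    """
    # Weighted Euclidean distance between the two feature vectors.
    diff = features_a - features_b
    distance = np.sqrt(np.sum(weights * diff ** 2))
    # Logistic mapping: small distances give probabilities close to 1.
    return 1.0 / (1.0 + np.exp(steepness * (distance - threshold)))

# Hypothetical features (e.g. eye shape, skin tone, nose width), with weights
# reflecting how informative each feature is assumed to be.
weights = np.array([0.5, 0.2, 0.3])
person_a = np.array([0.61, 0.34, 0.72])
person_b = np.array([0.60, 0.36, 0.70])   # very similar to person_a
person_c = np.array([0.10, 0.90, 0.20])   # clearly different

print("likely same person:", round(match_probability(person_a, person_b, weights), 3))
print("likely different person:", round(match_probability(person_a, person_c, weights), 3))
```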
Thus, probability theory is used in facial recognition technology to determine the probability of two facial images matching. This makes it possible to create accurate facial recognition systems and improve security in various fields, including finance, banking, and law enforcement.
The theorem of machine learning is a statement that generalizes the practical experience of machine learning algorithms. It states that if a machine learning algorithm performs well on both the training data and the test data, it can also be expected to perform well on real data.
The theorem of machine learning is based on the assumption that the data used for training and real data can be linked by a functional dependency. Therefore, if a machine learning algorithm can find the corresponding dependence for a data sample during the training phase, it can apply the same dependence for real data.
However, it should be noted that the theorem of machine learning is not absolute and does not always hold. There may be cases where a machine learning algorithm cannot handle real data even though it performed well on the training and test data. This may be due to differences between the training sample and the actual situation in which the algorithm is applied.
Thus, the theorem of machine learning is an important concept in machine learning. It generalizes the experience of practical application of machine learning algorithms and can help in choosing the appropriate algorithm for solving a specific problem.
The proof of the theorem of machine learning is a formal proof of the validity of mathematical formulas and relationships that explain the learning process in machine learning. The theorem of machine learning states that for any trainable function and any error metric that corresponds to it, an algorithm can be constructed that predicts the values of the function with arbitrary accuracy. To prove this theorem, it is necessary to carry out mathematical operations and proofs that show that machine learning algorithms work with specified parameters and rules, leading to accurate conclusions. The theorem of machine learning also involves probability theory, statistics, and other mathematical disciplines.
In practice, the theorem reflects the fact that learning algorithms adapt to the given training data and tune their parameters so as to reduce the error in predicting the values of functions. This is achieved through an optimization algorithm that finds the minimum error for the specified parameters. The proof of the theorem of machine learning is a complex and lengthy process, usually carried out by mathematicians and machine learning specialists. Nevertheless, the basic concepts and principles that underlie this theorem can be explained and understood even by people without a mathematical background.
The theorem of machine learning is fundamental to many algorithms and technologies, including facial recognition technology. Facial recognition is the process of automatically identifying faces in images or videos. Various machine learning algorithms are used for this purpose, trained on large collections of images and videos containing data about many different faces.
Facial recognition algorithms are trained on training data that contains a large number of images of human faces. For each image, features are determined in advance that allow differentiating faces from each other. These features may include parameters such as eye shape, distance between eyes, nose shape, distance between eyes and nose, and so on.
Next, based on these features and other characteristics, the algorithm is trained to determine which face is present in the image. Training involves tuning the parameters of the algorithm in such a way as to minimize the error in facial recognition.
After training, the algorithm can be used to recognize faces in any images or videos. It works by comparing the image to a database of faces and selecting the most suitable face based on the best match of parameters and characteristics.
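A hedged sketch of this training and prediction loop is given below using scikit-learn. It assumes the facial features (eye shape, distance between eyes, and so on) have already been extracted into numeric vectors; the random dataset, number of persons, and choice of an SVM classifier are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Illustrative dataset: 300 faces described by 5 precomputed features
# (e.g. eye shape, distance between eyes, nose shape), 10 hypothetical persons.
rng = np.random.default_rng(4)
features = rng.normal(size=(300, 5))
person_ids = rng.integers(0, 10, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    features, person_ids, test_size=0.2, random_state=0
)

# Training tunes the model parameters to minimize recognition error
# on the training set.
model = SVC(kernel="rbf")
model.fit(X_train, y_train)

# After training, the model predicts which known person a new face belongs to.
print("accuracy on held-out faces:", model.score(X_test, y_test))
```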
Facial recognition technology is a tool that uses computer vision for face identification and authentication. This technology is widely used in many fields, including:
1. Security. Facial recognition is used in security systems to control access to buildings, rooms, server rooms, data storage facilities, and other important objects. It can be used in both corporate and private homes.
2. Marketing and Advertising. Facial recognition cameras can help track the number of people in a shopping center or at an exhibition, as well as the number of people who are viewing specific products in stores. Based on this information, recommendations can be given to stores.
3. Identity verification and fraud prevention. Facial recognition is used in the banking and financial sectors to ensure the security of transactions and reduce the level of fraud.
4. Healthcare. In medical facilities, facial recognition helps reduce patient wait times, as patients simply come to the hospital and their pre-filled forms are immediately recognized in the appropriate registration office.
5. Social media. Some social media services use facial recognition for quick identification of friends mentioned in photos.
6. Public transport. In some countries, facial recognition cameras are used to track passenger flows in public transport.
7. Sports events. At sports events, concerts, and other large public events, facial recognition can be used for access control and event merchandise sales.
Facial recognition technology is used in many other industries as well, and it will continue to evolve and improve in the near future. Many of the comparison steps described above rely on the notion of distance in a metric space. Euclidean space is an example of a metric space: each point is defined by a set of coordinates, and the distance between two points is the length of the shortest path between them, which is a straight line.