Thanks to advances in face recognition technology, the OpenCV library helps us build high-quality face detection algorithms.
OpenCV (Open Source Computer Vision) is used for computer vision, machine learning, and image processing, and it plays a crucial role in the real-time operation that today’s systems depend on. In this project, we developed a program for face detection, labeling, and data logging.

Images input (DATABASE)
The photos of the people are added to a folder, and the path of that folder is set in the code so the images can be imported. Each image file must be named after the person it shows. The code can adjust the name, for example by converting its case, but it is best practice to use the proper name from the start.
Face encoding generation
To us it is a face. To our algorithm it is only an array of RGB values that matches a pattern it has learnt from the data samples we provided. For face recognition, the algorithm notes certain important measurements on the face, such as the color, size, and slant of the eyes and the gap between the eyebrows. All of these put together define the face encoding, the information extracted from the image that is used to identify a particular face. The code runs until all the input images are encoded.

To give you a feel for it, below is the 128-dimensional face encoding generated for one input image.

array([-0.10213576, 0.05088161, -0.03425048, -0.09622347, -0.12966095, 0.04867411, -0.00511892, -0.03418527, 0.2254715 , -0.07892745, 0.21497472, -0.0245543 , -0.2127848 , -0.08542262, -0.00298059, 0.13224372, -0.21870363, -0.09271716, -0.03727289, -0.1250658 , 0.09436664, 0.03037129, -0.02634972, 0.02594662, -0.1627259 , -0.29416466, -0.12254384, -0.15237436, 0.14907973, -0.09940194, 0.02000656, 0.04662619, -0.1266906 , -0.11484023, 0.04613583, 0.1228286 , -0.03202137, -0.0715076 , 0.18478717, -0.01387333, -0.11409076, 0.07516225, 0.08549548, 0.31538364, 0.1297821 , 0.04055009, 0.0346106 , -0.04874525, 0.17533901, -0.22634712, 0.14879328, 0.09331974, 0.17943285, 0.02707857, 0.22914577, -0.20668915, 0.03964197, 0.17524502, -0.20210043, 0.07155308, 0.04467429, 0.02973968, 0.00257265, -0.00049853, 0.18866715, 0.08767469, -0.06483966, -0.13107982, 0.21610288, -0.04506358, -0.02243116, 0.05963502, -0.14988004, -0.11296406, -0.30011353, 0.07316103, 0.38660526, 0.07268623, -0.14636359, 0.08436179, 0.01005938, -0.00661338, 0.09306039, 0.03271955, -0.11528577, -0.0524189 , -0.11697718, 0.07356471, 0.10350288, -0.03610475, 0.00390615, 0.17884226, 0.04291092, -0.02914601, 0.06112404, 0.05315027, -0.14561613, -0.01887275, -0.13125736, -0.0362937 , 0.16490118, -0.09027836, -0.00981111, 0.1363602 , -0.23134531, 0.0788044 , -0.00604869, -0.05569676, -0.07010217, -0.0408107 , -0.10358225, 0.08519378, 0.16833456, -0.30366772, 0.17561394, 0.14421709, -0.05016343, 0.13464174, 0.0646335 , -0.0262765 , 0.02722404, -0.06028951, -0.19448066, -0.07304715, 0.0204969 , -0.03045784, -0.02818791, 0.06679841])
Cam input
In this project, we use a webcam to capture the faces for comparison. The faces are first detected in the live camera feed. Before encoding, the input is converted to RGB format. Once converted, each face is encoded into a 128-dimensional face encoding based on its facial features.
Once the faces are encoded, the next step is to compare them. A comparison algorithm checks the live encoding against the stored encodings. In the program, we compare using the face distance, which tells us how close the live face encoding is to each image in the database. The smaller the face distance, the more likely the match is the right face. If faces are labeled incorrectly, we can tweak the face distance threshold to minimize the errors.
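As a rough illustration of what the comparison step computes: the face distance used by the face_recognition library is the Euclidean distance between two encodings. The sketch below reproduces that idea with the standard library only. The function names and the three-value toy vectors are made up for this example, and it assumes at least one stored encoding exists.

```python
import math


def face_distances(known_encodings, live_encoding):
    """Euclidean distance from the live encoding to each stored encoding."""
    return [math.dist(known, live_encoding) for known in known_encodings]


def closest_match(known_encodings, live_encoding):
    """Return (index, distance) of the stored encoding nearest to the live one."""
    distances = face_distances(known_encodings, live_encoding)
    index = min(range(len(distances)), key=distances.__getitem__)
    return index, distances[index]


# Toy three-value "encodings" standing in for the real 128-value ones.
known = [[0.0, 0.0, 0.0], [0.3, 0.4, 0.0]]
live = [0.3, 0.4, 0.1]

index, distance = closest_match(known, live)
print(index, round(distance, 3))
```

The index returned here can be used to look up the name stored alongside the matching encoding.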
Once the face is compared, the name of the person is displayed for visual confirmation. We draw a rectangular box around the face with the person’s name at the bottom. While there are other ways to confirm a match, this method is more intuitive and interactive.
Now comes the most crucial part of the project. Once the face is recognized, we need to log the data to a CSV file. The file can also be opened in MS Excel for further edits. In this project, we log the person’s name along with the time and date.

The next iteration of the project will feature automatic CSV file creation for each date, with the date as the file name. A script is written to generate a new CSV file each day. Further, we are looking for ways to replace the existing CSV file name with the new name.

data logged
1. Blurred images are sometimes misrecognized and mislabeled, and a wrong entry is sent to the .csv file.

2. It takes a while for the right name to settle, which leads to wrong entries.

Solutions identified - 20/4

1. To remove the misrepresentation of the images, the data set needs to be increased. By data set I mean the number of input images per person. The images are stored in a folder named after the person, and the data is encoded together with the person’s name. It was challenging to come up with code that gets both of these at once. I am thinking of a method that saves the name of the person (taken from the file name) and the corresponding encoding, with another code snippet for each additional folder. That way we end up with less complicated code.

2. The second thing I am considering is setting the face distance for each system to avoid misrepresentation. (The face distance measures the difference between the encoding of the stored image and that of the input image; the smaller it is, the better the match.)

3. Another method I found is from the blog by Adrian at PyImageSearch, who counted the votes for each of the images in the database to come up with the right output.

4. I am also considering appending the results to a list with a maximum number of entries. The name with the maximum occurrence is then saved as the output.

5. Since in the end we only want to identify the face, we can crop the image to make the task easier for the computer and reduce the computation time needed.

So now the aim is to implement all this in my improved code.
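The idea in point 4 (a bounded list of recent results, with the most frequent name winning) could look roughly like the sketch below. It is only an illustration under my own assumptions: the class name, the window size, and the sample names are hypothetical and not taken from the project code.

```python
from collections import Counter, deque


class NameVoter:
    """Keep the most recent recognition results and report the most common name."""

    def __init__(self, max_votes=10):
        # A deque with maxlen silently drops the oldest vote when full.
        self.votes = deque(maxlen=max_votes)

    def add(self, name):
        self.votes.append(name)

    def winner(self):
        """Return the name seen most often, or None if there are no votes yet."""
        if not self.votes:
            return None
        return Counter(self.votes).most_common(1)[0][0]


# Hypothetical per-frame results for one person standing at the camera.
voter = NameVoter(max_votes=5)
for name in ["Asha", "Ravi", "Asha", "Asha", "Ravi"]:
    voter.add(name)

print(voter.winner())
```

Only the winning name would then be written to the CSV file, so a single mislabeled frame no longer produces a wrong entry.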


Before coming up with a timeline, it is necessary to list the tasks. Below are the identified tasks.

1. Implementing the improved code and noting down the feedback (might have to think about tweaking individual values). (3/5)

2. Final code iteration, if we need to do any. (5/5)

3. Implementing the code on the Raspberry Pi and setting up the system for use. (7/5)

4. Finally, coming up with an enclosure for the system so the project can be put to use. (10/5)

Starting point

For implementing and modifying the project, the GitHub repository is a good starting point. It includes a text file with detailed information on installing the IDE, libraries, and dependencies. A previous blog post covers both the installation process and the code explanation.

Below are some of the resources referred to.

  1. Background for face_recognition –
  2. The face recognition library used –
  3. The computer vision library –  
  4. Working with CSV format –
  5. Pycharm IDE installation – 
  6. Useful for installation of libraries if done in command prompt –
  7. An attendance project for reference –
The UPDATE - 20/3

After all the challenges with recognition accuracy, I realized that the face distance tolerance had been left at 0.6, the module’s default. The advantage of the default is that face detection is fast: the bounding box moves dynamically with our movement. On the other hand, it is prone to assigning a wrong name, and instead of leaving a doubtful face unlabeled it assigns the closest match. I wanted to avoid any peripheral device or external feedback that could make the project complicated, so I made the recognition stricter. You can think of the face distance as a tolerance: the stricter the tolerance, the better the accuracy. In our case, we set the tolerance to 0.4. Any downsides? Yes, the footage looks as if it were running at a lower frame rate. Since we don’t need active tracking with face recognition, a lag of 2-3 seconds should not matter unless it hinders our accuracy.
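To make the effect of the stricter tolerance concrete, here is a minimal sketch. In the face_recognition library the tolerance is the `tolerance` argument of `compare_faces` (default 0.6), and a face counts as a match when its distance is at or below that value. The helper name `label_for` and the sample distance of 0.52 are made up for this illustration.

```python
def label_for(distance, name, tolerance=0.6):
    """Return the name only if the match is within tolerance, else 'Unknown'."""
    return name if distance <= tolerance else "Unknown"


# A borderline match: accepted by the default, rejected by the stricter setting.
best_distance = 0.52

loose = label_for(best_distance, "Name Surname")
strict = label_for(best_distance, "Name Surname", tolerance=0.4)
print(loose, "|", strict)
```

With the stricter value, a doubtful face is left unlabeled instead of being given the name of its closest match.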

The second change that helped solve our problem was using multiple photos of the same person. But how do we assign the same name to several images in the database? The code is written so that once a name is entered in the .csv file it is not saved again, so each person must map to a single name. We handle this by naming the files ‘Name Surname.01.jpg’ and ‘Name Surname.02.jpg’. The code splits the file name, so two distinct encodings are assigned the same name. As a result, whenever an unknown face needs to be identified, it is compared against all of them and the closest match is used. Even if the first image fails to match, the second-best is also acceptable.
You might be puzzled about how to decide the number of images per person. Ideally, a single image should be sufficient, but in our case we can’t afford to miss a single chance. So start with the maximum number of images, say five per person, then compare the results while reducing the number down to one. In most cases, two images are sufficient to get the desired results.



The code is systematically explained below but if you wish to study the actual code I will share the GitHub link.

Import libraries

The first step in the program is to import the required libraries. The libraries used are:

cv2 – OpenCV is a large open-source library for computer vision, machine learning, and image processing, and it plays a major role in real-time operation, which is very important in today’s systems.

NumPy – NumPy (Numerical Python) is a Python library for working with arrays. It also has functions for linear algebra, Fourier transforms, and matrices.

face_recognition – The face_recognition library lets you recognize faces in a photograph or a folder full of photographs. Its command-line tool prints one line of output for each face, with the filename and the name of the person found separated by a comma.

os – The os module in Python provides functions for interacting with the operating system. It is one of Python’s standard utility modules.

datetime – The datetime module supplies classes for manipulating dates and times. While date and time arithmetic is supported, the focus of the implementation is on efficient attribute extraction for output formatting and manipulation.

Pyshine – A collection of simple yet high-level utilities for Python. The library helps us create quality text for our output.

The path of the folder containing the images is defined using the os module.

To avoid printing the name with the extension, the file name is split into a list with the split() method. The code does two things at once: it removes the extension (for example ‘.jpg’) as well as the ‘.01’ suffix, which lets us use multiple images with the same name.
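A minimal sketch of that split, assuming the file names follow the ‘Name Surname.01.jpg’ pattern described earlier. The function name `person_name` is my own and not taken from the project code.

```python
import os


def person_name(filename):
    """Return the text before the first dot of the file name.

    'Name Surname.01.jpg' and 'Name Surname.jpg' both give 'Name Surname'.
    """
    base = os.path.basename(filename)
    return base.split(".")[0]


print(person_name("Name Surname.01.jpg"))
print(person_name("Name Surname.jpg"))
```

One consequence of splitting on the first dot is that a name containing a dot (such as an initial) would be cut short, so the image names should avoid dots inside the name itself.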

The images are converted into a computer-readable format by generating face encodings. The face_recognition API generates a face encoding for each face found in the images. A face encoding is basically a way to represent the face using a set of 128 computer-generated measurements. Two different pictures of the same person have similar encodings, while two different people have clearly different encodings.
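The encoding step for the image folder might be sketched as follows. This is not the project’s actual code: the function names, the accepted extensions, and the choice to skip images with no detectable face are my assumptions. The face_recognition import is done inside the helper so the rest of the sketch can be read and tried without the library installed.

```python
import os


def encode_with_face_recognition(path):
    """Return the first face encoding found in the image, or None if no face."""
    import face_recognition  # third-party library, imported only when used

    image = face_recognition.load_image_file(path)  # loaded as an RGB array
    encodings = face_recognition.face_encodings(image)
    return encodings[0] if encodings else None


def build_known_faces(image_dir, encode=encode_with_face_recognition):
    """Walk the image folder and return parallel lists of names and encodings."""
    names, encodings = [], []
    for filename in sorted(os.listdir(image_dir)):
        if not filename.lower().endswith((".jpg", ".jpeg", ".png")):
            continue  # ignore anything that is not an image
        encoding = encode(os.path.join(image_dir, filename))
        if encoding is None:
            continue  # assumption: images with no detectable face are skipped
        names.append(filename.split(".")[0])  # 'Name Surname.01.jpg' -> name
        encodings.append(encoding)
    return names, encodings
```

Calling `build_known_faces("path/to/images")` (a placeholder path) would return the two lists used later for comparison.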


According to the requirements of the program, a name cannot be written again once it has been added. The file is checked for the name before saving, so a name that is already present is not saved again. But we need to record attendance on a daily basis. The solution is to create a new file every day: a new file is generated automatically each day to hold the data for that date.
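One way to get a fresh file per day is to build the file name from the date, as in the sketch below. The ISO date format in the file name, the header row, and the function name `daily_log_path` are assumptions made for illustration.

```python
import csv
import os
from datetime import date


def daily_log_path(log_dir=".", day=None):
    """Return the CSV path for the given day, creating the file if needed."""
    day = day or date.today()
    path = os.path.join(log_dir, day.isoformat() + ".csv")  # e.g. 2021-05-03.csv
    if not os.path.exists(path):
        with open(path, "w", newline="") as handle:
            # Assumed header, matching the Name, Time, and Date fields logged.
            csv.writer(handle).writerow(["Name", "Time", "Date"])
    return path
```

Because the name depends only on the date, calling the function again on the same day returns the existing file instead of overwriting it.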

A comma-separated values (CSV) file is a delimited text file that uses commas to separate values. Each line of the file is a data record, and each record consists of one or more fields separated by commas. In our program the data is saved in such a file, which can be opened in Excel for data retrieval. The code checks whether the name is already present and saves the data only if it is not. If someone who has already been recorded comes in front of the camera, the data is not saved again.
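The check-then-save behaviour can be sketched like this. It is an illustration rather than the project’s own function: the name `mark_attendance`, the time and date formats, and the boolean return value are assumptions.

```python
import csv
import os
from datetime import datetime


def mark_attendance(name, csv_path, now=None):
    """Append name, time, and date unless the name is already in the file.

    Returns True if a new row was written, False if the name was present.
    """
    now = now or datetime.now()
    recorded = set()
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as handle:
            # The first field of every non-empty row is the person's name.
            recorded = {row[0] for row in csv.reader(handle) if row}
    if name in recorded:
        return False
    with open(csv_path, "a", newline="") as handle:
        csv.writer(handle).writerow(
            [name, now.strftime("%H:%M:%S"), now.strftime("%d/%m/%Y")]
        )
    return True
```

Reading the whole file on every call is fine for a small daily log; a larger deployment would probably keep the set of recorded names in memory.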

The frame size is changed by changing the frameWidth and frameHeight variables.

The data from the webcam is read and converted to RGB format. The face is detected in the frame and encoded. The encoding is sent for comparison with the saved encodings of the images of the people. We set a sleep of one second because we had noticed earlier that blurred faces were also being encoded. The sleep discourages this, so that by the next frame the person is in a stable position.
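A rough outline of such a capture loop is given below, assuming OpenCV and face_recognition are installed and the webcam is device 0. The function name `run_camera`, the `on_face` callback, the default frame size, and the ‘q’ key to quit are all my own choices for this sketch, not details from the project.

```python
import time


def run_camera(on_face, frame_width=640, frame_height=480, pause_seconds=1.0):
    """Read webcam frames, encode each detected face, and hand it to on_face."""
    import cv2  # third-party libraries, imported only when the loop is run
    import face_recognition

    capture = cv2.VideoCapture(0)  # assumption: the default webcam
    capture.set(cv2.CAP_PROP_FRAME_WIDTH, frame_width)
    capture.set(cv2.CAP_PROP_FRAME_HEIGHT, frame_height)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # camera unavailable or stream ended
            # OpenCV delivers BGR frames; face_recognition expects RGB.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            locations = face_recognition.face_locations(rgb)
            encodings = face_recognition.face_encodings(rgb, locations)
            for location, encoding in zip(locations, encodings):
                on_face(frame, location, encoding)
            cv2.imshow("Webcam", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break  # press q to stop
            # Pause so the person has settled before the next frame is read.
            time.sleep(pause_seconds)
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

The callback receives the original frame, the face location, and the encoding, which is enough to compare the face, draw the label, and log the name.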

After comparison, the face distance is checked against the saved names. If the match is better than 60%, the name is displayed. The name is also passed to the markAttendance function, where it is checked and saved.

To give visual feedback, we draw a bounding box around the face using cv2.rectangle. Here we can select the font for the text and the color of the bounding box. This snippet helps make the program look polished, since the display is the only point of contact between the user and the program.
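Drawing the box and the name could be sketched as below. Note that this uses cv2.putText for the label rather than the Pyshine text helper used in the project, and the function names, the green color, and the 6-pixel text offset are assumptions. Face locations from face_recognition come in (top, right, bottom, left) order, while OpenCV wants (x, y) corner points, so a small helper does the conversion.

```python
def box_and_label_points(location):
    """Convert (top, right, bottom, left) into rectangle corners and a text origin."""
    top, right, bottom, left = location
    top_left = (left, top)
    bottom_right = (right, bottom)
    # Place the text just inside the bottom-left corner of the box.
    text_origin = (left + 6, bottom - 6)
    return top_left, bottom_right, text_origin


def draw_label(frame, location, name, color=(0, 255, 0)):
    """Draw a bounding box with the person's name at the bottom of it."""
    import cv2  # third-party library, imported only when drawing

    top_left, bottom_right, text_origin = box_and_label_points(location)
    cv2.rectangle(frame, top_left, bottom_right, color, 2)
    cv2.putText(frame, name, text_origin, cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)
    return frame
```

The color is given in BGR order because that is how OpenCV stores frames.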


The project was tested to see if there were any flaws in the system. Often flaws are only discovered during end-user testing. Luckily, the flaws had been taken into consideration before the actual testing, so there weren’t many technical issues. The main issue was the hardware of the computer. The logged data was shared and matched against manually logged data.


The project was definitely among my most exciting projects. It was the first time I ever learned a programming language and simultaneously started working on this project. I had a fair share of challenges comprehending the jargon of an esoteric language. But I can now say that this project helped me build a solid foundation which I will surely improve on. But I am happy I can successfully deliver the project as planned.

If I hadn’t had a time crunch, I would have worked on a way to save the encodings instead of generating them every time. This trick would reduce the processing time. The next thing is starting and stopping the program automatically. I tried it using Task Scheduler for shutdown and the boot RTC settings, but it didn’t work out on the installed system.
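Saving the encodings was not implemented in this project, but one possible approach is to cache them with Python’s pickle module, as sketched here. The function names and the dictionary layout are hypothetical.

```python
import os
import pickle


def save_encodings(path, names, encodings):
    """Write the names and their encodings to a cache file."""
    with open(path, "wb") as handle:
        pickle.dump({"names": names, "encodings": encodings}, handle)


def load_encodings(path):
    """Return (names, encodings) from the cache, or None if there is no cache."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    return data["names"], data["encodings"]
```

At startup the program would try `load_encodings` first and only regenerate the encodings when it returns None. The cache has to be deleted or rebuilt whenever photos are added to the folder.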

I also run into an issue when the number of faces goes beyond a certain limit. Solving these minor issues will surely add value to the project.