WGU Capstone User Guide

Before you can run this project, you will need to clone the git repository with the following command:

git clone https://git.nickiel.net/Nickiel/WGU-Capstone

Project Structure

Top Level

In the top level of the cloned repository, you will find most of the files required for the core functionality.

.gitignore

This file excludes files we don’t want to check into git, such as the training data. These files continue to exist on your machine, but they are not uploaded to the remote git repository. This keeps clone sizes down and upload/download speeds up.

Main.py

The main file of interest for this project. This file contains all of the code for the finished product. As long as this file is in the same folder as the ./cascades folder, it can be copied and run anywhere with the prerequisites installed.

README.md

The file you are reading, in a format that most git hosting servers automatically render as the repository’s home page.

WGU-Capstone-User-Guide.html

The HTML version of the README.md file, bundled with CSS and hyperlinks.

requirements.txt

The file that lists all of the Python pip requirements needed to run the project. The packages in this file can either be installed by hand (e.g. pip install opencv-python), or all at once with pip install -r requirements.txt, which installs every module this project needs that is not included in the standard library.

shell.nix

A file that can be used on Nix and NixOS systems to create a reproducible environment with all of the requirements to run the Main.py file.

./Cascades

This folder contains the final trained models created by this project in the model training step. For more information on how they were created, see Training Your Own Haar File below.

This folder needs to be in the same directory as the Main.py file for the Main.py file to be able to run.
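To make the "same directory" requirement concrete, here is a minimal sketch of how a script could locate the cascades folder relative to its own location and fail early if it is missing. The function name find_cascades_dir and the default folder name are assumptions for illustration; this is not necessarily how Main.py does it.

```python
from pathlib import Path


def find_cascades_dir(script_path, folder_name="cascades"):
    """Return the cascades folder that sits next to the given script.

    Raises FileNotFoundError when the folder is not beside the script.
    folder_name is an assumed default; adjust it to match your checkout.
    """
    candidate = Path(script_path).resolve().parent / folder_name
    if not candidate.is_dir():
        raise FileNotFoundError(
            f"Expected a '{folder_name}' folder next to {script_path}"
        )
    return candidate


if __name__ == "__main__":
    # Look for the folder next to this file and report what was found.
    try:
        print(find_cascades_dir(__file__))
    except FileNotFoundError as error:
        print(error)
```

Resolving the path from the script location, rather than from the current working directory, means the check gives the same answer no matter which folder the command is run from.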

./Training_data

This folder contains everything required to create a new model from a few uncategorized positive images and a large dataset of negatives.

./Training_data_setup.py

This Python file takes a large dataset of negative images from the ./training_data/negatives folder and creates .vec files that can be passed as an argument to the utility that trains the final Haar file.

./Positives

This folder contains the 10 images that were used to create the cascade files included in this project. These files were included because the 10 images are a very small dataset in comparison to the required negatives.

./Validation

This folder contains all of the scripts and files used to measure the performance and accuracy of the generated models.

TestVideo.mp4

Compare_to_gt.py

This file compares a validation file, generated by running Main.py with the --validate flag, against the provided ground_truth.txt file. Its output is a .csv file that describes the average deviation from the boxes described by the ground_truth.txt file. See Validation and Testing for more information on this process.

Create_ground_truth.py

This is the file used to create the ground_truth.txt file from the provided TestVideo.mp4.

Step 1 - Prerequisites

Before you can run this project, you need a Python environment with the required packages installed.

If you are using Nix or NixOS, simply run nix-shell in the WGU-Capstone folder, and all of the packages required to run Main.py will be installed for that shell session.

The steps below detail how to set up a virtual environment that can be used to run this project, but a system-wide install of python with the packages detailed in requirements.txt installed will also suffice.

Set up virtual environment

This project was created with Python 3.11, and other versions are not guaranteed to work. To ensure the project works as designed, install Python 3.11 from the official Python download page.

Once you have Python 3.11 installed on your system, navigate to the cloned repository’s root directory, and run the following command to create a new virtual environment:

python -m venv ./venv

You can now run the following commands to enter the virtual environment, and any python commands will be run inside the virtual environment instead of your system-wide installation.

If you are on a Linux-based operating system, enter the virtual environment with:

source ./venv/bin/activate

Install requirements

Now that you have activated the virtual environment, install the non-standard-library requirements with the command below:

pip install -r requirements.txt

Step 2 - Running the project

Now that the prerequisites have been installed, you can run the project. For a full list of command-line arguments, run python Main.py --help.

Run the project with the dashboard enabled with the following command from the root of the project directory:

You should see your computer’s webcam turn on, and a window appear showing the webcam’s view, with boxes around any detected faces.

To display the calculated adjustment amounts generated by this project, enable the print-to-stdout feature with the -o flag:

This command will print the calculated adjustment commands for every detected face, and also show the summary statistics.

Additional flags

This section will describe, in greater depth, the available feature flags shown by the --help screen.
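As a rough picture of how flags like these are usually wired up, the sketch below builds an argparse parser containing only the three flags this guide names explicitly (-o, -f, and --validate). The dest names, help texts, and whether each flag takes a value are assumptions; run python Main.py --help for the real definitions.

```python
import argparse


def build_parser():
    """Build a parser with the flags named in this guide (details assumed)."""
    parser = argparse.ArgumentParser(
        prog="Main.py",
        description="Sketch of a face-tracking command-line interface.",
    )
    # -o: print the calculated adjustment instructions to stdout.
    parser.add_argument(
        "-o",
        dest="print_adjustments",  # hypothetical attribute name
        action="store_true",
        help="print adjustment instructions to stdout",
    )
    # -f: read frames from a video file instead of the webcam.
    parser.add_argument(
        "-f",
        dest="video_file",  # hypothetical attribute name
        default=None,
        help="path to a video file to use instead of the webcam",
    )
    # --validate: write the detected boxes out for later comparison.
    parser.add_argument(
        "--validate",
        action="store_true",
        help="write a validation file describing detected boxes",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

argparse adds the --help flag automatically, which is why it does not appear in the sketch.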

Help

Version

Show Dashboard

Output Adjustment Instructions

Print the calculated adjustment instructions generated by the program. This output demonstrates the generated values that will be sent to the motorized camera platform.
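To give a feel for what an adjustment instruction could look like, the sketch below measures how far the center of a detected face box is from the center of the frame. The function name, the sign convention, and the use of raw pixel offsets are assumptions for illustration, not the calculation Main.py performs; the (x, y, w, h) box layout is the one OpenCV's detectMultiScale returns.

```python
def adjustment_for_box(box, frame_width, frame_height):
    """Return (dx, dy): the pixel offset of the box center from the frame center.

    box is (x, y, w, h) with (x, y) as the top-left corner.
    Positive dx means the face is right of center; positive dy means below center.
    """
    x, y, w, h = box
    box_center_x = x + w / 2
    box_center_y = y + h / 2
    return box_center_x - frame_width / 2, box_center_y - frame_height / 2


if __name__ == "__main__":
    # A 100x100 face box with its top-left corner at (400, 150) in a 640x480 frame.
    print(adjustment_for_box((400, 150, 100, 100), 640, 480))  # (130.0, -40.0)
```

A camera platform would aim to drive both values toward zero, which keeps the face centered in the frame.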

Use Video File

Use a video file (such as ./validation/TestVideo.mp4) instead of the computer’s webcam. Useful for generating validation files and on machines without a working webcam.

Headless Mode

Save Frames for Training Data

Save frames where faces were found to ./output as .jpg files, and save the located face’s location to a .csv file. This feature will be used to generate positive images automatically for training future models.
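The bookkeeping side of this feature can be sketched with the standard csv module: one row per detected face, recording which saved frame it belongs to and where the box is. The column order and the function name append_face_rows are assumptions; the actual .csv layout written by Main.py may differ.

```python
import csv


def append_face_rows(csv_path, frame_filename, boxes):
    """Append one row per face: frame file name, then x, y, w, h of the box."""
    # newline="" lets the csv module control line endings on every platform.
    with open(csv_path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for (x, y, w, h) in boxes:
            writer.writerow([frame_filename, x, y, w, h])


if __name__ == "__main__":
    # Record two faces found in one saved frame (hypothetical file names).
    append_face_rows("faces.csv", "frame_0001.jpg", [(10, 20, 30, 40), (200, 80, 64, 64)])
```

Opening the file in append mode means each new frame adds to the existing file instead of overwriting it.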

Generate Validation File

Outputs all discovered boxes, the frame they were found on, and the box coordinates so the model can be validated against the ground truth. See Validation and Testing for more information on this process.

Training Your Own Haar File

This project contains the scripts required to train your own Haar cascade files, but it does not contain several of the dependencies.

Prerequisites

The first requirement for training your own Haar file is a large number of negative images. For this project, I used this Kaggle dataset of landscape images as my source of negatives. After downloading this file, unzip it and place all of the raw images into the ./training_data/negatives folder, creating it if needed.

Next we need to download the Windows OpenCV binary distributable and put it in our training_data folder.
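OpenCV's training tools read negatives from a plain text file that lists one image path per line. training_data_setup.py may already take care of this, but as a minimal sketch of the idea, the function below writes such a list for a folder of images. The function name, the output file name, and the set of accepted extensions are assumptions.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, not a full list).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_negatives_list(negatives_dir, list_path):
    """Write one image path per line and return how many images were listed."""
    negatives_dir = Path(negatives_dir)
    images = sorted(
        path
        for path in negatives_dir.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text("".join(f"{path.as_posix()}\n" for path in images))
    return len(images)


if __name__ == "__main__":
    folder = Path("negatives")
    if folder.is_dir():
        print(write_negatives_list(folder, "negatives.txt"), "negative images listed")
    else:
        print("No negatives folder found in the current directory")
```

The paths are written exactly as given, so a relative folder produces relative paths; those only resolve correctly when the training tool is run from the same directory.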

You can download the 3.4.15 binary executable here. (You can also go here and find the 3.4.15 release and choose “windows” to get to the same page).

After the .exe file has downloaded, open it and go through the steps to unzip it. After it has been unzipped, copy the folder to ./training_data/opencv. So you should be able to run this from the training_data directory:

If you do not get an error running the above command, then it was installed correctly.

Generating positive images

Now that we have the create_samples utility provided by OpenCV (they stopped distributing executables of it after 3.4.15) and the negatives folder full of negative images, we can use the training_data_setup.py file to create several different sized datasets ready for training Haar cascade files on.

The Python file will run the create_samples tool for every positive image in ./positives, creating many positive images. The script will do all of the steps up through creating the .vec files that the train_cascade executable requires.

Before exiting, training_data_setup outputs the commands that need to be run to train the models. Run these commands from the training_data folder, and after they have finished training, you can use the generated Haar cascades instead of the ones provided.
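For orientation, the commands printed by the script are invocations of OpenCV's opencv_traincascade tool. The sketch below assembles one such command line from its usual options (-data, -vec, -bg, -numPos, -numNeg, -numStages, -w, -h). The file names, sample counts, stage count, and window size are placeholder assumptions, not the values training_data_setup.py prints.

```python
def train_cascade_command(vec_file, negatives_list, output_dir,
                          num_pos, num_neg, num_stages=10, width=24, height=24):
    """Build an opencv_traincascade command line as a single string.

    All values are placeholders; use the commands printed by
    training_data_setup.py for real training runs.
    """
    args = [
        "opencv_traincascade",
        "-data", output_dir,        # folder that receives the trained cascade
        "-vec", vec_file,           # positive samples produced by create_samples
        "-bg", negatives_list,      # text file listing the negative images
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-w", str(width),           # must match the size used for the .vec file
        "-h", str(height),
    ]
    # Plain join: paths containing spaces would need quoting for your shell.
    return " ".join(args)


if __name__ == "__main__":
    print(train_cascade_command("positives.vec", "negatives.txt", "cascade_out", 900, 1800))
```

Keeping -numPos somewhat below the number of samples in the .vec file is a common precaution, since later stages can consume extra positives.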

Validation and Testing

The following describes the process I used to test the precision and accuracy of the generated cascade files.

Generate the Ground Truth file

I have included a generated ground_truth.txt file, so you don’t need to do this step. But if you would like to generate the ground truth file from the provided test video, navigate to the root of the project, and run the create ground truth script:

A window will open and display the process as it creates the file. This script does not use Haar files; it uses the MIL tracking algorithm, which gives much more accurate results at the cost of slower video processing.

All of these settings have been hard-coded so it will always output the same ground truth file.

Getting the model validation file

Now that we have the ground truth for our Test Video, we need to generate the same file with our trained model.

To do this, edit the Main.py file so that it uses the new cascade, then run the python file with the --validate option set, and the test video passed to the -f flag. The command used to generate the statistics with the test video provided is this:

(Notice that we can still display the dashboard while it outputs validation info)

This will create a new file in the ./validation folder describing the faces and locations found in each frame.

Comparing it to the ground truth

I have created a script to automatically compare a validation file with a ground truth file, and output the average absolute deviation in adjustment instructions. It requires two arguments, and has one optional output. You can see the options with the --help flag, but I will demonstrate all of the options below.

This script takes the generated validation file, works out what the adjustment output would be for each detection, and takes the absolute difference between that and the ground truth. It then adds together all of the results for each frame; this last step penalizes false positives. We can then open the generated output file in Excel and take its average to see the average deviation from the ground truth. The generated faces_count_output file contains the number of faces found in each frame, and can be used to measure the number of false positives.
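The per-frame arithmetic described above can be sketched in a few lines. This is a simplified illustration rather than the code in the comparison script: the in-memory data layout (frame number mapped to adjustment pairs) and the function names are assumptions, and file parsing is left out.

```python
def frame_deviations(ground_truth, detections):
    """Map each frame to the summed absolute deviation of its detections.

    ground_truth: {frame: (dx, dy)}, the reference adjustment for each frame.
    detections:   {frame: [(dx, dy), ...]}, one entry per detected face.
    Every extra (false positive) detection adds its own deviation to the frame.
    """
    result = {}
    for frame, (gt_dx, gt_dy) in ground_truth.items():
        total = 0.0
        for dx, dy in detections.get(frame, []):
            total += abs(dx - gt_dx) + abs(dy - gt_dy)
        result[frame] = total
    return result


def average_deviation(deviations):
    """Average the per-frame totals (the step done in Excel in this guide)."""
    return sum(deviations.values()) / len(deviations) if deviations else 0.0


if __name__ == "__main__":
    truth = {0: (10, 0), 1: (20, 5)}
    found = {0: [(12, -1)], 1: [(20, 5), (100, 50)]}  # frame 1 has a false positive
    per_frame = frame_deviations(truth, found)
    print(per_frame)                     # {0: 3.0, 1: 125.0}
    print(average_deviation(per_frame))  # 64.0
```

Note that in this sketch a frame with no detections scores 0, so missed faces are not penalized; that is one reason the separate per-frame face count is useful alongside the deviation.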
