See [Project Structure](#project-structure) for more information on the repository you just cloned.
See [Step 1 - Prerequisites](#step-1---prerequisites) on what is required before you can run this project.
# Project Structure
Below you can find the default project folder structure after cloning it:
```
WGU-Capstone
├.gitignore
├Main.py
├README.md
├WGU-Capstone-User-Guide.html
├requirements.txt
├shell.nix
├cascades
│ ├ cascade_1.xml
│ ├ cascade_2.xml
│ ├ cascade_5.xml
│ └ cascade_10.xml
├training_data
│ ├ positives
│ └ training_data_setup.py
└validation
  ├ TestVideo.mp4
  ├ compare_to_gt.py
  ├ create_ground_truth.py
  └ ground_truth.txt
```
[Click here to skip the detailed file structure explanation](#step-1---prerequisites)
## Top Level
At the top level of the cloned repository, you will find most of the files required for the core functionality.
#### .gitignore
This file excludes files we don't want to check into git - such as the training data.
These files continue to exist on your machine, but they are not uploaded to the remote git repository. This keeps clone sizes down and upload/download speeds up.
#### Main.py
The main file of interest for this project. This file contains all of the code for the finished product.
As long as this file is in the same folder as the `./cascades` folder, it can be copied and run anywhere with the prerequisites installed.
#### README.md
The file you are reading, in a format that most git hosting servers automatically render as the repository's home page.
#### WGU-Capstone-User-Guide.html
The HTML version of the README.md file, bundled with CSS and hyperlinks.
#### requirements.txt
The file that lists all of the python pip requirements needed to run the project. The packages in this file can either be installed by hand (e.g. `pip install opencv-python`), or all at once with `pip install -r requirements.txt`, which installs every module this project needs that is not included in the standard library.
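If you want to confirm that the packages from `requirements.txt` are present before running the project, a small check along these lines may help. This is only a sketch: the helper names `requirement_names` and `missing_packages` are hypothetical and not part of this repository, and the parsing only handles simple `name==version` style lines.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract bare package names from simple requirements.txt lines."""
    names = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # Cut the name off at the first version specifier, marker, or extra.
        names.append(re.split(r"[<>=!~;\[ ]", line, maxsplit=1)[0])
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        print(missing_packages(requirement_names(handle)))
```

An empty list means every listed package was found.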
#### shell.nix
A file that can be used on [Nix and NixOS](https://nixos.org/) systems to create a reproducible environment with all of the requirements to run the `Main.py` file.
## ./Cascades
This folder contains the final trained models created by this project in the model training step.
For more information on how they were created, see [Training your own Haar file](#training-your-own-haar-file) below.
This folder needs to be in the same directory as the `Main.py` file for the `Main.py` file to be able to run.
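The layout requirement can be checked with a few lines of Python. The sketch below is not taken from `Main.py`; the function name `find_cascades` is hypothetical, and it only assumes that the cascade files are `.xml` files inside a `cascades` folder next to the script.

```python
from pathlib import Path


def find_cascades(script_path: str) -> list[Path]:
    """Return the cascade XML files that sit next to the given script."""
    cascade_dir = Path(script_path).resolve().parent / "cascades"
    if not cascade_dir.is_dir():
        raise FileNotFoundError(f"Expected a cascades folder at {cascade_dir}")
    # Sorted alphabetically by path, so cascade_10.xml sorts before cascade_2.xml.
    return sorted(cascade_dir.glob("*.xml"))


if __name__ == "__main__":
    for cascade in find_cascades("Main.py"):
        print(cascade.name)
```

If the folder is missing, the error message shows where it was expected.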
## ./Training_data
This folder contains all of the requirements for creating a new model from a few uncategorized positive images and a large dataset of negatives.
NOTE: Before anything in this folder can be run, please see [the section on training the haar files](#training-your-own-haar-file) for several prerequisites.
#### ./Training_data_setup.py
This python file takes a large dataset of negative images from the `./training_data/negatives` folder and creates .vec files that can be passed as an argument to the utility that trains the final Haar file.
#### ./Positives
This folder contains the 10 images that were used to create the cascade files included in this project. These files were included because the 10 images are a very small dataset in comparison to the required negatives.
## ./Validation
This folder contains all of the scripts and files used to measure the performance and accuracy of the generated models.
#### TestVideo.mp4
This minute-long video was used to test the trained models.
#### Compare_to_gt.py
This file compares a validation file, generated by running `Main.py` with the `--validate` flag, with the provided `ground_truth.txt` file.
The output of this file is a .csv file that describes the average deviation from the boxes described by the `ground_truth.txt` file. See [Validation and Testing](#validation-and-testing) for more information on this process.
#### Create_ground_truth.py
This is the file used to create the `ground_truth.txt` file from the provided `TestVideo.mp4`.
# Step 1 - Prerequisites
Before you can run this project, you need a python environment with the required packages installed.
If you are using Nix or NixOS, simply run `nix-shell` in the `WGU-Capstone` folder, and all of the packages required to run `Main.py` will be installed for that shell session.
However, if you are not on a Nix system, continue reading.
The steps below detail how to set up a virtual environment that can be used to run this project, but a system-wide install of python with the packages detailed in `requirements.txt` installed will also suffice.
### Set up virtual environment
This project was created with python 3.11, and other versions are not guaranteed to work. To ensure the project works as designed, install python 3.11 from the official python download page.
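To confirm which interpreter you are about to use, a quick check like the following may be handy. It is a minimal sketch; the function name `is_supported` is hypothetical and not part of this repository.

```python
import sys


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is a python 3.11 release."""
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    if is_supported():
        print("python 3.11 detected")
    else:
        print(f"python {sys.version_info.major}.{sys.version_info.minor} detected; 3.11 is recommended")
```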
Once you have python 3.11 installed on your system, navigate to the cloned repository's root directory, and run the following command to create a new virtual environment:
```shell
python -m venv ./venv
```
You can now run the following commands to enter the virtual environment, and any python commands will be run inside the virtual environment instead of your system-wide installation.
On Windows, run the following if you are using a Command Prompt:
```shell
.\venv\Scripts\activate.bat
```
On Windows in PowerShell:
```shell
.\venv\Scripts\Activate.ps1
```
If you are on a Linux-based operating system, enter the virtual environment with:
```shell
source ./venv/bin/activate
```
```
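If you are unsure whether the activation worked, python itself can tell you. Inside a virtual environment created with `venv`, `sys.prefix` points at the environment while `sys.base_prefix` points at the base installation. The helper name below is hypothetical.

```python
import sys


def in_virtual_environment(prefix=sys.prefix, base_prefix=sys.base_prefix) -> bool:
    """Return True when the interpreter is running inside a venv."""
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```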
### Install requirements
Now that you have activated the virtual environment, install the requirements that are not part of the standard library with the command below:
```shell
pip install -r ./requirements.txt
```
# Step 2 - Running the project
Now that the prerequisites have been installed, you can run the project. For a full list of command-line arguments, run `python Main.py --help`.
Run the project with the dashboard enabled with the following command from the root of the project directory:
```shell
python Main.py -d
```
You should see your computer's webcam turn on, and a window appear showing the view of the webcam, with boxes around any detected faces.
To display the calculated adjustment amounts generated by this project, enable the print-to-stdout feature with the `-o` flag:
```shell
python Main.py -d -o
```
This command prints the calculated adjustment instructions for every detected face, and also shows the summary statistics.
# Additional flags
This section will describe, in greater depth, the available feature flags shown by the `--help` screen.
## Help
`-h` or `--help`
Displays all of the available parameters with a quick description of each.
## Version
`-v` or `--version`
Prints the version of the program and exits.
## Show Dashboard
`-d` or `--dashboard`
Displays the run-summary statistics. These are off by default.
## Output Adjustment Instructions
`-o` or `--output`
Print the calculated adjustment instructions generated by the program. This output demonstrates the generated values that will be sent to the motorized camera platform.
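To give a feel for what an adjustment instruction could look like, here is a minimal sketch. It assumes the adjustment is simply the pixel offset between the center of a detected face box and the center of the frame; the calculation in `Main.py` may differ. The function name is hypothetical, and the box is assumed to use the `(x, y, width, height)` layout that OpenCV cascade detectors return.

```python
def adjustment_for_box(box, frame_size):
    """Return the (dx, dy) pixel offset from the frame center to the box center.

    Assumption: positive dx means the face is right of center and positive dy
    means it is below center, following image coordinates.
    """
    x, y, w, h = box
    frame_w, frame_h = frame_size
    dx = (x + w / 2) - frame_w / 2
    dy = (y + h / 2) - frame_h / 2
    return dx, dy


if __name__ == "__main__":
    # A 100x100 face box in the top-left corner of a 640x480 frame.
    print(adjustment_for_box((0, 0, 100, 100), (640, 480)))
```

A face that is already centered yields `(0.0, 0.0)`, meaning no adjustment would be needed under this convention.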
## Use Video File
`-f <file_path>` or `--file <file_path>`
Use a video file (such as ./validation/TestVideo.mp4) instead of the computer's webcam. Useful for generating validation files and on machines without a working webcam.
## Headless Mode
`-s` or `--no-screen`
Run the program without the window displaying processed video frames.
## Save Frames for Training Data
`-t` or `--training-data`
Save frames where faces were found to `./output` as .jpg files, and save the located face's location to a .csv file.
This feature will be used to generate positive images automatically for training future models.
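The .csv side of this feature could be written with python's standard `csv` module. The sketch below is illustrative only: the column names, the file naming, and the function name `write_face_locations` are assumptions, not the format `Main.py` actually writes.

```python
import csv
import io


def write_face_locations(rows, handle):
    """Write one CSV row per saved frame: file name plus the face box."""
    writer = csv.writer(handle)
    # Hypothetical header; the real file may use different columns.
    writer.writerow(["frame_file", "x", "y", "w", "h"])
    for frame_file, (x, y, w, h) in rows:
        writer.writerow([frame_file, x, y, w, h])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_face_locations([("frame_0001.jpg", (10, 20, 30, 40))], buffer)
    print(buffer.getvalue())
```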
## Generate Validation File
`--validate`
Outputs all discovered boxes, the frame they were found on, and the box coordinates so the model can be validated against the ground truth. See [validation and testing](#validation-and-testing) for more information on this process.
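For readers who want to see how the flags in this section fit together, they could be declared with `argparse` roughly as follows. This is a sketch based only on the flag names listed above, not a copy of the parser in `Main.py`; the help strings and the version number are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags documented in this section."""
    # -h / --help is added automatically by argparse.
    parser = argparse.ArgumentParser(prog="Main.py")
    parser.add_argument("-v", "--version", action="version",
                        version="%(prog)s 0.0.0")  # placeholder version
    parser.add_argument("-d", "--dashboard", action="store_true",
                        help="display the run-summary statistics")
    parser.add_argument("-o", "--output", action="store_true",
                        help="print the calculated adjustment instructions")
    parser.add_argument("-f", "--file", metavar="file_path", default=None,
                        help="use a video file instead of the webcam")
    parser.add_argument("-s", "--no-screen", action="store_true",
                        help="run without the video window")
    parser.add_argument("-t", "--training-data", action="store_true",
                        help="save frames where faces were found")
    parser.add_argument("--validate", action="store_true",
                        help="write a validation file")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this layout, `args.file` is `None` when the webcam should be used.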
# Training Your Own Haar File
This project contains the scripts required to train your own Haar cascade files, but it does not contain several of the dependencies.
NOTE: These steps only apply to Windows devices.
## Prerequisites
The first requirement for training your own Haar file is a large number of negative images. For this project, I used [this Kaggle dataset of landscape images](https://www.kaggle.com/datasets/arnaud58/landscape-pictures/) as my negatives datasource. After downloading this file, unzip it and deposit all of the raw images into the `./training_data/negatives` folder, creating it if needed.
Next, download the Windows OpenCV binary distributable and put it in the training_data folder.
You can download the 3.4.15 binary executable [here](https://sourceforge.net/projects/opencvlibrary/files/3.4.15/opencv-3.4.15-vc14_vc15.exe/download). (You can also go [here](https://opencv.org/releases/) and find the 3.4.15 release and choose "windows" to get to the same page).
After the .exe file has downloaded, open it and go through the steps to unzip it. After it has been unzipped, copy the folder to `./training_data/opencv`. So you should be able to run this from the training_data directory:
If you do not get an error running the above command, then it was installed correctly.
## Generating positive images
Now that we have the create_samples utility provided by OpenCV (they stopped distributing executables of it after 3.4.15) and the negatives folder full of negative images, we can use the `training_data_setup.py` file to create several datasets of different sizes, ready for training Haar cascade files.
The python file will run the create_samples tool for every positive image in `./positives`, creating many positive images.
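The per-image loop can be pictured with the sketch below, which only builds the command lines and does not run them. It is not the code from `training_data_setup.py`. The executable location, the background list file name `bg.txt`, the output folder, and the sample count are all assumptions; `-img`, `-bg`, `-info`, and `-num` are options of OpenCV's `opencv_createsamples` tool.

```python
from pathlib import Path


def createsamples_commands(positives_dir, executable,
                           background_file="bg.txt", samples_per_image=100):
    """Build one opencv_createsamples command per positive image."""
    commands = []
    for image in sorted(Path(positives_dir).iterdir()):
        # Ignore anything that is not an image file.
        if image.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        # Hypothetical output location: one folder per positive image.
        out_dir = Path("generated") / image.stem
        commands.append([
            str(executable),
            "-img", str(image),
            "-bg", background_file,
            "-info", str(out_dir / "info.lst"),
            "-num", str(samples_per_image),
        ])
    return commands


if __name__ == "__main__":
    # Only runs when ./positives exists, e.g. from the training_data folder.
    if Path("./positives").is_dir():
        for command in createsamples_commands("./positives", "opencv_createsamples.exe"):
            print(" ".join(command))
```

Each command list could then be passed to `subprocess.run` once the executable path is known.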
The script performs every step up through creating the .vec files that the train_cascade executable requires.
Before exiting, training_data_setup outputs the commands that need to be run to train the models. Run these commands from the training_data folder, and after they have finished training, you can use the generated Haar cascades instead of the ones provided.
# Validation and Testing
The following describes the process I used to test the precision and accuracy of the generated cascade files.
## Generate the Ground Truth file
I have included a generated `ground_truth.txt` file, so you don't need to do this step. If you would like to generate the ground truth file from the provided test video, navigate to the root of the project, and run the create ground truth script:
```shell
python create_ground_truth.py
```
A window will open and display the process as it creates the file. This script does not use Haar files. It uses the MIL tracking algorithm instead, which produces much more accurate boxes but processes the video more slowly.
All of these settings have been hard-coded so it will always output the same ground truth file.
## Getting the model validation file
Now that we have the ground truth for our Test Video, we need to generate the same file with our trained model.
To do this, edit the `Main.py` file so that it uses the new cascade, then run the python file with the `--validate` option set, and the test video passed to the `-f` flag. The command used to generate the statistics with the test video provided is this:
(Notice that we can still display the dashboard while it outputs validation info)
This will create a new file in the `./validation` folder describing the faces and locations found in each frame.
## Comparing it to the ground truth
I have created a script to automatically compare a validation file with a ground truth file, and output the average absolute deviation in adjustment instructions.
It requires two arguments, and has one optional output. You can see the options with the `--help` flag, but I will demonstrate all of the options below.
You can use `./validation/compare_to_gt.py` like this:
This script takes the generated validation file, computes what the adjustment output would be for each detected box, and takes the absolute difference between that output and the ground truth. It then adds together all of the results for each frame, which penalizes false positives. You can open the generated output file in Excel and average it to see the average deviation from the ground truth. The generated faces_count_output file contains the number of faces found in each frame, and can be used to measure the number of false positives.
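The arithmetic described above can be sketched in a few lines of python. This is an illustration, not the code in `compare_to_gt.py`: it assumes each adjustment is an `(x, y)` pair, that the x and y differences are simply added, and that frames are keyed by frame number. Both function names are hypothetical.

```python
from statistics import mean


def frame_deviation(predicted, truth):
    """Sum the absolute differences between every predicted adjustment and the truth.

    Extra (false positive) predictions add more terms, so they raise the total.
    """
    tx, ty = truth
    return sum(abs(px - tx) + abs(py - ty) for px, py in predicted)


def average_deviation(predictions_by_frame, truth_by_frame):
    """Average the per-frame deviation over every frame in the ground truth."""
    per_frame = [
        frame_deviation(predictions_by_frame.get(frame, []), truth)
        for frame, truth in sorted(truth_by_frame.items())
    ]
    return mean(per_frame)


if __name__ == "__main__":
    predictions = {0: [(10, 5)], 1: [(0, 0), (20, 20)]}  # frame 1 has a false positive
    truth = {0: (8, 5), 1: (0, 0)}
    print(average_deviation(predictions, truth))
```

In this example, frame 0 deviates by 2 and frame 1 by 40 because of the extra box, so the average is 21. Note that under these assumptions a frame with no detections contributes 0, so missed faces would need to be counted separately.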