The objective of this workshop is to explore the capabilities of a Raspberry Pi and utilize computer vision to detect objects by color.
A Raspberry Pi is a small, single-board computer that is used for electronics projects or as a component in larger systems (Figure 1). The default operating system is Raspberry Pi OS, which is based on Debian. A Raspberry Pi can also run other Linux distributions or Android. Internally, a Raspberry Pi has a central processing unit (CPU), a graphics processing unit (GPU), and random-access memory (RAM). Compared to an Arduino, it can perform more advanced tasks, such as character recognition, object recognition, and machine learning.
The base hardware of a Raspberry Pi 4 features USB 2.0/3.0 ports for peripherals, wireless LAN, an Ethernet port, Bluetooth, USB-C power, general-purpose input/output (GPIO) pins (Figure 2), an audio jack, two micro HDMI ports, and a camera module port.
A Raspberry Pi has 28 GPIO pins, two 5V pins, two 3.3V pins, and eight ground pins. The 28 GPIO pins are analogous to Arduino digital I/O pins (the Raspberry Pi has no built-in analog inputs) and allow connections to sensors and additional components, such as an Arduino board.
A Raspberry Pi uses an SD card for data storage. Many programming languages can be used on the Raspberry Pi, including Python, Scratch, Java, and C/C++. Python is the default programming language for a Raspberry Pi, and the Thonny IDE is the environment used when programming in Python.
The capabilities of a Raspberry Pi can be extended with additional hardware, such as a camera module or microcontrollers, and with programming libraries, such as the Open Source Computer Vision Library (OpenCV).
Python is a text-based programming language typically used for web development, software development, mathematics, and system scripting. Python is functional across several platforms and has syntax similar to the English language, which minimizes the amount of code needed to perform certain tasks compared to other languages.
Unlike many other programming languages, Python does not require a terminating character, such as a semicolon, at the end of each line of code. A statement ends where its line ends.
In Figure 3, the program will print “Hello World”, make a variable, x, equal 3, make a variable, y, equal 2, add the two variables together, and store the result in a variable named result, and print the result.
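Figure 3 is not reproduced here. A minimal sketch of a program matching that description, which may differ in detail from the figure, is:

```python
# Print a greeting, then add two numbers and print the sum.
print("Hello World")
x = 3
y = 2
result = x + y
print(result)
```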
For any compound statement (if/elif/else statements, while loops, for loops), the lines following the initial declaration of the statement must be indented (Figure 4). A lack of indentation will cause a syntax error. In Figure 4, the program determines whether 5 is greater than 2 and, if true, prints “Five is greater than two!”.
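A short sketch of the indentation rule, written to match the description of Figure 4 (the variable name is_greater is introduced only for this example):

```python
is_greater = 5 > 2
if is_greater:
    # This indented line belongs to the if statement and runs only when the condition is true.
    print("Five is greater than two!")
```

Removing the indentation before print would stop the program from running.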
A while loop executes the code within the loop so long as the condition is true. In Figure 5, the while loop will continuously print the result on a new line and add 1 to the result so long as the result is less than or equal to 10.
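A sketch of the loop described for Figure 5 follows. The starting value of 1 is an assumption, since the description does not state it.

```python
result = 1  # assumed starting value
while result <= 10:
    print(result)        # each value is printed on a new line
    result = result + 1  # without this line the loop would never end
```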
A break statement is used to break a loop regardless of the condition (Figure 6).
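As an illustration only (Figure 6 may use different values), the sketch below leaves an otherwise infinite loop with a break statement once a counter reaches 3:

```python
count = 0
while True:  # the condition alone would never stop this loop
    count = count + 1
    if count == 3:
        break  # leave the loop immediately
print(count)
```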
A for loop is used for iterating over a set sequence. The sequence can be a list, tuple, dictionary, set, or string. In Figure 7, the for loop will print each character in the string random_word with a space in between each character. Similar to the while loop, a for loop can be broken early with a break statement.
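A sketch of the for loop described for Figure 7 is given below. The value assigned to random_word is an assumption made for this example.

```python
random_word = "python"  # example value, assumed for this sketch
for letter in random_word:
    # end=" " prints a space after each character instead of a new line
    print(letter, end=" ")
print()  # finish the line once the loop is done
```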
Comments within code are typically used to explain the code and make the code more readable and easier to troubleshoot when errors arise. Line comments begin with a # and can only span a single line. Block comments begin and end with ''' (three single quotes) and can span multiple lines (Figure 8).
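Both comment styles are shown in the sketch below, which is an illustration and not a copy of Figure 8. Strictly speaking, the block comment is an unused multi-line string that Python evaluates and discards.

```python
# A line comment starts with # and ends at the end of the line.
total = 2 + 3  # a line comment can also follow code on the same line

'''
A block comment is written between two sets of three single quotes.
It can span multiple lines.
'''
print(total)
```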
Computer Vision (CV)
Computer vision is an area of artificial intelligence (AI) that allows computers and systems to measure data from visual inputs and produce outputs based on that data. The OpenCV library is an extensive open source library used for computer vision. OpenCV contains interfaces for multiple programming languages.
OpenCV allows image/video processing and display, object detection, geometry-based monocular or stereo computer vision, computational photography, machine learning and clustering, and CUDA acceleration. With this functionality, many applications are possible including face recognition, street view image stitching, and driverless car navigation.
In OpenCV, color images are represented as three-dimensional arrays. An image consists of rows of pixels, and each pixel is represented by an array of values representing its color in BGR (Blue, Green, Red) format. The array can be converted to the hue, saturation, value (HSV) color model, which describes colors in a way similar to how the human eye tends to perceive them. This format is more suitable for color detection purposes.
cv2.cvtColor(frame, conversion code) is a method used to convert an image from one color space to another. In the OpenCV Library, there are over 150 different color-space conversion codes. The one used in this lab is cv2.COLOR_BGR2HSV, which converts an image from the default BGR format to the desired HSV format.
cv2.inRange(frame, lower bound, upper bound) is a method used to identify pixels in the frame that fall within the range of values defined by the lower and upper boundaries. This command outputs a binary mask where white pixels represent areas that fall within the range, and black pixels represent areas that do not fall within the range. The right panel in Figure 9 shows the output for the cv2.inRange() method using the command orange_mask = cv2.inRange(nemo,lower,upper). The bright orange pixels fall within the range specified and have a corresponding white value in the output.
cv2.bitwise_and(frame, frame, mask = mask) is a bitwise AND operation that makes the frame retain only the pixels selected by the given mask. Only pixels in the frame with a corresponding white value in the mask are preserved. The rightmost panel in Figure 10 shows the output of the command cv2.bitwise_and(nemo, nemo, mask=orange_mask).
cv2.bitwise_or(frame1, frame2) is a bitwise OR operation that combines two frames so that the output retains the non-zero pixels from either frame.
cv2.imshow(window name, image) is used to display a given image in a window. The name of the window is given by the first parameter in the command. The window automatically displays in the same size as the image.
Numerical Python (NumPy) is a Python library used primarily for working with arrays. The library also contains functions for mathematical purposes, such as linear algebra, Fourier transforms, and matrices. In this lab, the NumPy command used is np.array.
np.array() creates an N-dimensional NumPy array object, which has the class numpy.ndarray. A list, a tuple, or a number can be entered in the parentheses to convert that data into an array.
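For example, the following sketch converts a list, a tuple, and a single number into arrays. The variable names and values are chosen for illustration.

```python
import numpy as np

from_list = np.array([1, 2, 3])
from_tuple = np.array((40, 40, 40))
from_number = np.array(5)  # a single number becomes a zero-dimensional array

print(type(from_list))
print(from_list.shape, from_tuple.shape, from_number.ndim)
```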
Raspberry Pi Camera
The Raspberry Pi Camera Module is a camera that can be used to capture high resolution and high definition images and video. The module’s library features commands to use the camera within code and modify how the camera outputs the frame rate, resolution, and timing of the image or video it is capturing. The camera module’s coding library (picamera) is automatically installed onto a Raspberry Pi once the camera is enabled and the Raspberry Pi is rebooted.
PiCamera() creates the camera object used to control the camera module within the code. Typically, PiCamera() is assigned to a variable to make calling the camera throughout the code easier.
PiRGBArray(camera, options) produces a three-dimensional array from a BGR capture. The first argument is the camera object declared earlier, and the options, such as size, can be used to set the size of the image captured.
camera.resolution = (width, height) sets a specific resolution for the image captured by the module. The default image resolution is set to the resolution of the monitor being used. The maximum resolution for still photos is 2592x1944 pixels and 1920x1080 pixels for video recording. The minimum resolution is 64x64 pixels.
camera.framerate = frame rate sets a specific frame rate, which is the rate at which frames are captured. The default frame rate and maximum frame rate are dependent on the resolution at which the video or image is being captured.
camera.capture() enables the camera module to capture an image. The output file name, format type, video port usage, resizing, splitter port, bayer, and other parameters can be set.
camera.capture_continuous() captures images continuously until a defined end. For this lab, the format of the output is set to bgr, which means the image data is written to a file in BGR color space format. The output is a file-like object and each image is written to this object sequentially. The output must be cleared in preparation for the next image in the stream.
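The camera commands above can be combined as in the sketch below. It is a minimal outline and not the lab script: the function name stream_frames and its max_frames parameter are hypothetical, use_video_port=True is an assumed setting, and the code only produces frames on a Raspberry Pi with the camera module enabled.

```python
def stream_frames(max_frames=10):
    """Yield up to max_frames BGR frames from the Pi camera as NumPy arrays."""
    # Imported here so the function can be defined on machines without picamera.
    from picamera import PiCamera
    from picamera.array import PiRGBArray

    camera = PiCamera()
    camera.resolution = (640, 480)
    camera.framerate = 30
    raw_capture = PiRGBArray(camera, size=(640, 480))
    try:
        stream = camera.capture_continuous(
            raw_capture, format="bgr", use_video_port=True  # assumed setting
        )
        for count, frame in enumerate(stream, start=1):
            yield frame.array        # the current image as a three-dimensional array
            raw_capture.truncate(0)  # clear the output before the next image
            if count >= max_frames:
                break
    finally:
        camera.close()
```

A caller on the Raspberry Pi could then loop over the frames with for image in stream_frames(): and process each image with OpenCV.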
Raspberry Pi Setup
Before using a Raspberry Pi, the Raspberry Pi must be prepared with a formatted SD card with Raspberry Pi OS and the correct programming library packages installed. For this lab, these steps have been done. A more detailed explanation of the Raspberry Pi Setup can be found in the Appendix.
Materials and Equipment
- Raspberry Pi with SD card
- Raspberry Pi Camera Module
- Monitor with micro HDMI to HDMI cable
- USB keyboard
- USB mouse
- Power supply
Ensure that the Raspberry Pi is connected to a power supply and a monitor. Verify that the camera module and the USB mouse and keyboard are plugged into their corresponding ports on the Raspberry Pi.
Part 1: Testing the camera module
In this part, capture a still image using the camera module to verify that the camera has been connected and enabled correctly.
- Open a terminal window by clicking the black monitor icon on the taskbar (Figure 11).
- To take a still picture, type in the following command: raspistill -o testpic.jpg
The camera should take a still picture and save it as testpic.jpg on the Raspberry Pi. A preview should be visible on the screen for a few seconds. If this command returns any errors, please notify a lab TA.
Part 2: Color Detection using Pi Camera and Python-OpenCV
In this part, a Python script is developed that isolates red and green colors from live video captured by the camera module. The output is then used to create a prototype of a traffic light detection system.
- Download the Python script for color detection using OpenCV. Open the script with the Thonny Python Editor installed on the Raspberry Pi.
- There are missing components in the program. Complete the code following the instructions outlined in the next steps.
- The code is divided into four sections. Figure 12 shows the workflow for this lab.
- Section 1 includes commands for importing all necessary packages.
- Section 2 sets up the camera module so it captures a continuous video.
- Section 3 contains code for color detection and isolation for red and green colors.
- Section 4 interprets the color detection output to create a traffic light detection system.
- Type the code in Figure 13 to import the required packages into the Python module.
- OpenCV represents images as NumPy arrays. Lines 1-3 import OpenCV, NumPy, and PiCamera libraries. The images obtained from the camera module must be in an array format so Line 4 is included. This completes section 1 of the code.
- Section 2 of the code sets up the camera module. The while(True): condition creates an infinite loop so that the code re-runs automatically to perform color detection on a set of objects. In the body of the while loop, type out a command for each of the following tasks using the background information of the camera module commands as reference.
- Initialize the camera object by setting a variable camera equal to PiCamera()
- Set camera resolution to (640,480)
- Set camera frame rate to 30
- Create a variable called rawCapture and set it equal to PiRGBArray(camera, size=(640,480)). This line sets the mode so that capture from the camera object is in a three-dimensional array format and the size of the window is the same as the size of the frame being captured (Figure 14).
- Get the code approved by a TA and move to the next step.
- Once the camera setup is done, capture a continuous video with the camera. This is achieved by using the camera.capture_continuous(..) command with arguments as shown on Line 14 in Figure 15. Line 15 converts every frame captured by the camera module to a NumPy array, ready to be manipulated with OpenCV to perform color detection.
- This concludes Section 2 of the code.
- Section 3 will perform color detection for red and green colors. Since it is easier to identify these colors in HSV, convert each frame from the default BGR format to the HSV format. Type out a single line of code to perform this task in the body of the for loop using the background information of the OpenCV commands as reference. Use the cv2.cvtColor command to convert the frame to HSV, and create a new variable called hsv to store the converted frame. The desired colorspace conversion code is cv2.COLOR_BGR2HSV.
- The segment of code in Figure 16 will isolate red from the video. Lines 20-21 define the lower and upper boundaries of the range of HSV values within which the color would be classified as red. These values might have to be adjusted based on lighting and exposure conditions.
- For Line 22, identify all the pixels in the video that fall within the specified range. Create a variable red_mask, and use the cv2.inRange method to perform this task. Use the newly created hsv frame as input image, and red_lower and red_upper as the boundaries. This line outputs a binary mask where white pixels represent areas that fall within the range, and black pixels represent areas that do not.
- The next line applies the mask over the video to isolate red. Create a variable result_red. Use the cv2.bitwise_and method, taking frame as the input and red_mask as the mask. This line outputs the frame so that only pixels that have a corresponding white value in the mask are shown.
- Insert code for isolating green from the video. Use the lower boundary values [40,40,40] and upper boundary values [102, 255, 255] to define the range.
- To isolate both red and green from the video at once, combine result_red and result_green. Think about how this could be achieved and type out the correct command in Line 32 (Figure 17). Which bitwise operation should be used to show pixels where either red or green is present?
- To display the final result (Line 34), type a line of code that creates a new window named Red and Green Detection in Real-Time and displays the final output. Use the cv2.imshow method.
- Note that Line 33 is required because it clears the rawCapture stream in preparation for the next frame.
- This concludes Section 3 of the program. Get the code approved by a TA and move to Section 4.
- Lines 36-43 represent the final section of the program (Figure 18). The objective of this section is to interpret the color detection output to build an early prototype of a traffic light detection system.
- Lines 36-37 dictate that when the Q key is pressed, the camera closes and stops capturing new frames. The final frame is checked for the presence of red and green using if statements (Lines 38-42). Type out the print statements that should be displayed in the body of the corresponding if statement when each color is detected. The program is now complete.
- Run the Python script and place the red and green objects provided by the TAs in front of the camera module. Observe how the specified colors are isolated from the video output. Press the Q key on the keyboard to pause the video at a specific frame and observe the output print statements. Figure 19 displays what the final output would look like when a red object is held up to the camera module.
This concludes the procedure for this workshop.
Individual Lab Report
There is no lab report for Lab 4.
Team PowerPoint Presentation
There is no team presentation for Lab 4.
Appendix
Installing Raspberry Pi OS on an SD card
This section outlines the procedure for installing an operating system on the Raspberry Pi using the Raspberry Pi Imager. A formatted SD card and a computer with an SD card reader are required to follow this procedure.
- Download the latest version of Raspberry Pi Imager and install it on a computer.
- Connect an SD card to the computer.
- Before this step, the SD card must be formatted to FAT32.
- Open Raspberry Pi Imager and choose the operating system. For most general uses of a Raspberry Pi, Raspberry Pi OS (32-bit) should suffice.
- Click on Storage and choose the SD card to write the image to.
- Click on Write to install the operating system image on the SD card.
Setting up the Raspberry Pi
Before getting started with this procedure, ensure that an SD card with Raspberry Pi OS is inserted into the designated slot on the Raspberry Pi.
- Connect a USB mouse and keyboard to the Pi.
- Using a micro HDMI cable, connect the board to a monitor.
- If required, connect an audio output device to the board using the audio jack.
- Connect the board to a power supply using the Raspberry Pi charger.
The Raspberry Pi will boot up and the monitor will display the desktop. This concludes the Pi setup.
Setting up the Camera Module
The following steps outline the procedure for connecting and enabling the camera module.
- Locate the Camera Module port, as shown in Figure 20.
- Gently pull up on the edges of the port’s plastic clip, as illustrated in Figure 21. Insert the camera module ribbon cable into the port, ensuring that the connectors at the bottom of the cable are facing the contacts in the port.
- Push the plastic clip back into place.
- Start the Raspberry Pi and navigate to Main Menu > Preferences > Raspberry Pi Configuration. The main menu can be accessed from the Pi icon on the taskbar.
- Select the Interfaces tab and ensure that the camera module is enabled, as shown in Figure 22.
- Reboot the Raspberry Pi. The camera module should be ready for use.
Installing OpenCV & Other Packages
This section outlines the procedure for installing the Python packages used in this lab: OpenCV, NumPy, and PiCamera with the [array] submodule.
- Open a terminal window by clicking the black monitor icon on the taskbar (Figure 23).
- Before installing the required packages, install PIP for Python 3, update the system, and install Python 3 using the commands sudo apt-get install python3-pip, sudo apt-get update && sudo apt-get upgrade, and sudo apt-get install python3.
- To install OpenCV, NumPy, and PiCamera packages, type out the following commands sudo pip3 install opencv-python, sudo pip3 install numpy, and sudo pip3 install picamera[array].
- Optionally, to utilize the GPIO pins on the Raspberry Pi, install the GPIO library using the following command: sudo pip3 install RPi.GPIO