The very last part covers the software.
The software is divided into four pieces: a C++ program running on the Arduino, and three programs running on the Raspberry Pi.
Since Python programming on the Raspberry Pi was rather new to me, I made the somewhat unusual decision to have the Arduino be in control of everything. The Raspberry Pi merely reports events to the Arduino, which decides on Gerty's response, and the Raspberry Pi then just does as told (displaying the requested images and playing the requested sound clips and videos).
The Arduino and Raspberry Pi are communicating via a bidirectional serial connection (through the USB cable).
Arduino Code
The Arduino code takes care of reading the inputs (PIR motion sensor and push button) and operating the outputs (push button LED, eye LED, and the two lamps). In addition, it reads the data arriving from the Raspberry Pi over the serial connection. Based on all the input data, the Arduino decides which images to display on Gerty's screen and which audio clips and videos to play.
Raspberry Pi Code
All code is running on a Raspberry Pi 3B+ with the Lite version of Raspberry Pi OS Buster. The "Lite" version does not have the desktop (and my software does not require "X"). The Raspberry Pi has ssh enabled, which was extremely helpful during the development.
The code is made of three pieces which are all running in parallel in the background.
- An image viewer which displays Gerty's faces in an infinite slideshow (fbi).
- A two-level hotword recognition, based on the "snowboy" software (in Python2).
- The main logic (in a Python3 program) that communicates with the Arduino and, based on the Arduino instructions, changes Gerty's face images and plays Gerty's audio voice clips. It also receives input from the hotword recognition which it then transmits to the Arduino.
All three are started automatically from the /home/pi/.bashrc file once the RPi has booted. Some details of the software are described in the following.
Slide show with the fbi image viewer
The faces on Gerty's screen are the centerpiece of the prop, and to get the real effect it is essential to have smooth, continuous transitions between the screens. This cannot be achieved by stopping and restarting an image viewer, as the transitions would always be interrupted by a short black flicker between images. But I found a simple solution: operating the "fbi" image viewer as an infinite slide show, which makes perfectly smooth transitions.
Code:
sudo fbi -T 1 -noverbose -t 1 -cachemem 0 img1.jpg img2.jpg img3.jpg img4.jpg &
The parameters have the following meaning:
- -T 1 [start on virtual console #1]
- -noverbose [don't show the status line at the bottom of the screen]
- -t 1 [change images in time intervals of 1 s]
- -cachemem 0 [image cache size in MB; set this to zero, so that the new images are always read]
- img1.jpg img2.jpg img3.jpg img4.jpg [four images; fbi will continuously cycle through these]
- & [run this process in the background, so the other processes can run in parallel]
Setting the cachemem to zero means that fbi will not cache the images, but always read the current version from the disk. This is also why I am using four images: with only two images, fbi always caches them, irrespective of the cachemem setting.
The little trick is that img1.jpg, img2.jpg, img3.jpg, and img4.jpg are not actual image files, but symbolic links. The links img3.jpg and img4.jpg simply point to img1.jpg and img2.jpg, respectively. The two links img1.jpg and img2.jpg in turn point to the actual image files. These two links are changed in the Python program (see below) using the commands
Code:
link = subprocess.Popen(["ln", "-s", "-f", file1, "img1.jpg"])
link = subprocess.Popen(["ln", "-s", "-f", file2, "img2.jpg"])
where "file1" and "file2" are variables that have the actual filenames assigned.
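As an aside, the same link swap could also be done without spawning an "ln" process. The following is only a sketch of one possible alternative, not the code used in the project: the helper name point_link_at and the ".tmp" suffix are made up for illustration. It creates the new link under a temporary name and then renames it over the old one, so that a reader such as fbi should never find the link missing.

```python
import os


def point_link_at(target, link_name):
    """Make link_name a symbolic link to target, replacing an existing link.

    Hypothetical helper (not from the original project).
    """
    tmp_name = link_name + ".tmp"  # assumed scratch name next to the link
    # Clean up a stale temporary link left over from an interrupted run.
    if os.path.lexists(tmp_name):
        os.remove(tmp_name)
    # Create the new link under the temporary name first ...
    os.symlink(target, tmp_name)
    # ... then rename it over the old link. The rename acts on the link
    # itself (it does not follow it) and is atomic on POSIX file systems.
    os.replace(tmp_name, link_name)
```

A call such as point_link_at(file1, "img1.jpg") would then take the place of the first Popen line above.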
Two-Level Hotword Recognition with "Snowboy"
Based on the official snowboy example code, and inspired by code snippets that I found in web searches, I built a two-level hotword recognition. The first level is an exact copy of the example code: snowboy continues listening to the microphone (connected through a cheap USB audio interface) until it recognizes the word "Gerty". If "Gerty" was recognized then it starts another copy of the code (the second level) which is then trying to recognize any of the 13 phrases introduced above within the next six seconds. If none was recognized after six seconds, it leaves the code and restarts the first-level code, waiting for "Gerty".
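The timing logic of the two levels can be sketched independently of snowboy. The class below is a simplified illustration in Python 3 and not the actual detector code: the class and method names are invented, the real implementation starts a second copy of the snowboy example code instead, and I am assuming here that a recognized phrase also returns the listener to the first level.

```python
import time

PHRASE_WINDOW_S = 6.0  # from the text: six seconds to say a phrase


class TwoLevelListener:
    """Tracks whether we wait for "Gerty" (level 1) or a phrase (level 2)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock    # injectable so the logic can be tested
        self._deadline = None  # None means level 1 (waiting for "Gerty")

    def level(self):
        # Fall back to level 1 once the six-second window has expired.
        if self._deadline is not None and self._clock() >= self._deadline:
            self._deadline = None
        return 1 if self._deadline is None else 2

    def on_wake_word(self):
        # "Gerty" was heard: open the window for a follow-up phrase.
        self._deadline = self._clock() + PHRASE_WINDOW_S

    def on_phrase(self, phrase_id):
        """Return phrase_id if it arrived in time, otherwise None."""
        if self.level() != 2:
            return None
        self._deadline = None  # assumption: back to level 1 after a hit
        return phrase_id
```

In a real setup, on_wake_word and on_phrase would be called from the detection callbacks of the two detector instances.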
Whenever "Gerty" is recognized (in the first level), the code writes a byte to a Fifo, which is read by the main Python program (see below), from where the information is transferred to the Arduino. The Arduino then sends a request back to the main code to reply with the audio snippet "Yes, Sam!". Likewise, when one of the second-level phrases is recognized, the code writes corresponding bytes to the Fifo, which are then interpreted depending on the phrase.
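For readers who have not used a Fifo (named pipe) between two processes before, here is a minimal sketch of the idea. The function names, the Fifo path, and the byte values are placeholders chosen for illustration and not the ones used in the project. The sketch is written in Python 3 on both sides, whereas the hotword code runs in Python 2, where the byte would be built differently (for example with chr()).

```python
import os


def open_fifo_for_polling(path):
    """Create the Fifo if needed and open it for non-blocking reads."""
    if not os.path.exists(path):
        os.mkfifo(path)
    # O_NONBLOCK makes open() return at once even if no writer exists yet.
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def poll_fifo(fd):
    """Return the byte values currently waiting in the Fifo (maybe none)."""
    try:
        data = os.read(fd, 64)
    except BlockingIOError:
        return []  # a writer is connected but has not written anything
    return list(data)  # an empty read (no writer) gives an empty list


def send_code(path, code):
    """Writer side: push a single byte with the given value (0..255)."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, bytes([code]))
    finally:
        os.close(fd)
```

The main program would call poll_fifo once per pass through its loop, so a quiet microphone never blocks the rest of the logic.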
The Main Python3 Program
This is the central element of the Raspberry Pi software. After the RPi has booted, the code shows the image sequence pictured above as a fake boot sequence before it starts regular operation. During the regular operation, it
- checks if there is new data in the Fifo (from the hotword recognition),
- reads the current day of the week, the hour and minutes and checks if this corresponds to one of the defined alarm times (for recurring events),
- reads the Google calendar (every 15 minutes) to see if any events are scheduled within the next 20 minutes,
- checks if a full hour is reached,
- checks if the shutdown/reboot button was pressed.
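The time-based checks in this list could look roughly like the sketch below. It is a simplified illustration and not the original code: the alarm table, the function names, and the once-per-minute guard are assumptions, and fetching the Google calendar itself is left out.

```python
from datetime import datetime, timedelta

# Hypothetical alarm table: (weekday, hour, minute), with Monday = 0.
ALARMS = {(0, 7, 30), (4, 17, 0)}


def is_alarm_time(now, alarms=ALARMS):
    """True if the current weekday/hour/minute matches a defined alarm."""
    return (now.weekday(), now.hour, now.minute) in alarms


def is_full_hour(now):
    """True during the first minute of every hour."""
    return now.minute == 0


def calendar_poll_due(now, last_poll, interval=timedelta(minutes=15)):
    """True if the calendar has never been read or 15 minutes have passed."""
    return last_poll is None or now - last_poll >= interval


def event_is_upcoming(now, event_start, horizon=timedelta(minutes=20)):
    """True if the event starts within the next 20 minutes."""
    return timedelta(0) <= event_start - now <= horizon


class OncePerMinute:
    """Guard so a fast loop reports a matching minute only once."""

    def __init__(self):
        self._last = None

    def should_fire(self, now):
        key = now.replace(second=0, microsecond=0)
        if key == self._last:
            return False
        self._last = key
        return True
```

Since the main loop runs many times per minute, a guard of this kind (or something equivalent) is needed so that an alarm or the full-hour notification is not triggered over and over within the same minute.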
If the shutdown/reboot button was pressed, it starts the corresponding operation:
Code:
subprocess.call(['poweroff'], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
subprocess.call(['reboot'], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
If any of the other conditions is true, it sends the corresponding information (as a single byte) to the Arduino, which then decides how to respond. If the data from the hotword detection contains a request for time, news, weather, or calendar data, the code starts a corresponding shell script to fetch these data from the web and insert the text results into an image, like the ones below (the cat-clock image is displayed every full hour - this is a nod to Doc Brown's clock from "Back to the Future").
After the input is processed and (if applicable) the corresponding data has been sent to the Arduino via the serial connection, the code reads the data sent by the Arduino (encoded in a single byte). Byte values below 30 indicate images to be displayed on the screen, while values above 30 correspond to audio clips of Gerty's voice, or one of the four "alarm" sounds (which are played as notifications at the full hour, when the tea timer finishes, or when a calendar event was found).
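The split of the byte range can be illustrated with a few lines of Python. This is a sketch under assumptions rather than the actual program: the function names are invented, the port argument stands for any object with a read(n) method that returns bytes (a pyserial Serial object behaves this way, an in-memory stream works for testing), and the value 30 itself is treated as unspecified because the description above only covers values below and above it.

```python
import io

IMAGE_LIMIT = 30  # from the text: below 30 = image, above 30 = audio


def classify_command(value):
    """Map a byte value received from the Arduino to a kind of action."""
    if not 0 <= value <= 255:
        raise ValueError("not a byte value: %r" % (value,))
    if value < IMAGE_LIMIT:
        return ("image", value)
    if value > IMAGE_LIMIT:
        return ("audio", value)
    return ("unspecified", value)  # exactly 30 is not described above


def read_command(port):
    """Read one byte from the port; return None if nothing arrived."""
    raw = port.read(1)
    if not raw:
        return None
    return classify_command(raw[0])


if __name__ == "__main__":
    # Stand-in for the serial port, holding three example bytes.
    fake_port = io.BytesIO(bytes([5, 42, 30]))
    print(read_command(fake_port))
    print(read_command(fake_port))
    print(read_command(fake_port))
```

The returned pair could then be used to look up the image file or the sound clip in a table.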
That's it! That's my GERTY 3000. A wonderful companion for life - and it nicely goes together with my HAL 9000.