Docker at the Edge: How Machine Learning Transformed a Fowl Task
It’s well known that endangered birds and bats are sometimes killed by wind turbines. What’s unknown is how often this happens, particularly offshore. A pilot study by Western EcoSystems Technology (WEST) shows how machine learning (ML) algorithms can be leveraged at the edge with Docker to solve this problem.
The challenge is significant. Offshore wind turbines are remote and have no internet connectivity, and the hardware for any solution must be able to weather extreme conditions, explained Lewis Hein, a programming analyst with WEST. He presented the project at this year’s DockerCon in Los Angeles. Although WEST ran the project, it was funded by a grant from the U.S. Department of Energy.
So far, bird and bat collisions have been counted manually, by having someone walk around the wind turbine and count the bodies. The counts can inform how wind farms are managed, he said, but walking the field is not possible for offshore wind turbines.
“In general, wind energy has emerged as an energy source with more upsides and fewer downsides than many of our other energy sources,” Hein said. “That said, wind energy is not perfect, and we need ways to manage these imperfections. It needs to be carefully monitored to mitigate danger to flying animals…” that are protected by the Endangered Species Act or state laws.
The Problem at the Edge
The first challenge was connectivity. Wind turbines aren’t connected to the internet in part because they’re remote. But they’re also part of the energy infrastructure, so there are many regulatory agencies anxious to tell you that you cannot connect wind turbines to the internet, Hein said.
“This means that we may have no connectivity, and not having connectivity means that we need to process all of our data in the same place that we collect all of our data,” he said.
Monitoring for collisions requires many cameras per turbine to get good coverage of the blades as they move. The video feed alone can mean a few hundred gigabytes of data per day, which leads to the second challenge: There’s nowhere to send that data.
“If you were to try to do the naive thing, and simply store all that data on hard drives in the turbine, you would run into another problem: that you can’t necessarily get there for months at a time,” Hein explained. “And enough one-terabyte hard drives to store this amount of data over a few months would be sort of a ridiculously large pile of hard drives.”
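A quick back-of-envelope calculation gives a feel for the size of that pile. The figures below are assumptions chosen for illustration, not numbers from the talk: 300 GB per day stands in for "a few hundred gigabytes," and eight 30-day months stands in for the unattended period.

```python
import math


def drives_needed(gb_per_day: float, days: int, drive_capacity_gb: float = 1000.0) -> int:
    """Return how many whole drives are needed to hold `days` of raw video."""
    total_gb = gb_per_day * days
    return math.ceil(total_gb / drive_capacity_gb)


# Assumed figures, for illustration only (not from the talk).
ASSUMED_GB_PER_DAY = 300
ASSUMED_DAYS = 8 * 30

print(drives_needed(ASSUMED_GB_PER_DAY, ASSUMED_DAYS))  # prints 72
```

Under those assumptions, a single turbine would need about 72 one-terabyte drives before anyone could come to collect them.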
That leaves one option: a sort of compression algorithm, using a computer vision system that looks at the data feeds in real time. To achieve this, the hardware has to meet some constraints: It must work reliably and without supervision for up to eight months, because workers aren’t able to check it more often. It also needs to run in real time and be fairly easy to set up, because turbine workers would be the ones installing it, given the training required to even set foot on a turbine site.
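The idea of using a vision model as a "compression algorithm" can be sketched as a filter that keeps only clips the model scores as relevant. This is a minimal illustration, not WEST's implementation: the Clip fields, the score callback (a stand-in for a real inference call) and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Clip:
    """A short video segment from one camera (hypothetical fields)."""
    camera_id: str
    start_s: float
    duration_s: float
    size_mb: float


def keep_relevant(
    clips: Iterable[Clip],
    score: Callable[[Clip], float],
    threshold: float = 0.5,
) -> List[Clip]:
    """Keep only the clips whose relevance score meets the threshold."""
    return [clip for clip in clips if score(clip) >= threshold]


def compression_factor(all_clips: List[Clip], kept: List[Clip]) -> float:
    """Ratio of total data seen to data retained."""
    total = sum(clip.size_mb for clip in all_clips)
    retained = sum(clip.size_mb for clip in kept)
    return float("inf") if retained == 0 else total / retained


# Ten one-minute clips; the stand-in scorer flags only the clip starting at 180 s.
clips = [Clip("cam1", i * 60.0, 60.0, 100.0) for i in range(10)]
kept = keep_relevant(clips, lambda clip: 0.9 if clip.start_s == 180.0 else 0.1)
print(len(kept), compression_factor(clips, kept))
```

Everything the scorer rejects is discarded on the device, so storage grows with the number of interesting events rather than with recording time.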
“Now these people do not necessarily understand our system nor should they be expected to,” he said. “That meant that we could take our computer and pre-image it with all of the files and the Docker containers and images we needed, and hand it to these people.”
Microservices Architecture for the Win
The team opted for a single edge device, which they would harden as much as possible to make it as reliable as possible. They then deployed a microservices architecture with each microservice in a separate Docker container.
“This allowed us to develop all of these microservices with different teams who are experts in those specific domains. Like, our machine learning team could develop the service that did inference, and our development [team could] develop services with things like fulfillment data management, through interacting with the cameras and data acquisition,” he said. “This was one of our first hints that Docker was going to be really good for this.”
The project team realized it only had to agree on the interface contracts at the microservice boundaries. Once each team knew what the others expected of it, each could work in its own containerized world.
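An interface contract at a service boundary can be as small as an agreed message shape that both sides validate. The sketch below is an assumption for illustration: the talk does not describe WEST's message format, so the Detection fields and the JSON encoding are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class Detection:
    """Hypothetical message passed from an inference service to a storage service."""
    camera_id: str
    timestamp_s: float
    label: str
    confidence: float

    def __post_init__(self) -> None:
        # Reject values the consuming service is not prepared to handle.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


def encode(detection: Detection) -> str:
    """Serialize a detection to the agreed JSON form."""
    return json.dumps(asdict(detection), sort_keys=True)


def decode(payload: str) -> Detection:
    """Parse a JSON payload, rejecting missing or unexpected fields."""
    data = json.loads(payload)
    expected = {field.name for field in fields(Detection)}
    if set(data) != expected:
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    return Detection(**data)


message = encode(Detection("cam1", 12.5, "bird", 0.87))
print(decode(message))
```

With a shared definition like this, each team can test its own container against recorded messages without running the other team's service.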
“This also allowed us to test our own services and have some confidence that those tests would be meaningful for reliability in the production environment,” he said. “Also, for installation, because we could host on Docker Hub, this made installing to new hardware or to testing hardware just as simple as running a Docker call.”
They coordinated all the services in a Docker Compose file that was pre-installed on the machine and set up to run when the hardware was plugged in.
“The installation process, once we had our containers built and hosted, was as easy as just pulling the container images and getting test data and doing a quick run of the Docker Compose to validate the installation,” he said.
Docker also helped the team keep up with the firehose of data the solution created by allowing them to use battle-tested solutions such as Triton and the NVIDIA stack, he added. While doing it without Docker would’ve been possible, it would have been very painful, he said, especially if the dependencies of one part began to conflict with the dependencies of another.
“Additionally, the nature of these sorts of projects is that especially at testing time, you’re adding sensors and subtracting sensors and changing your workload,” he said. “Having a microservices architecture enabled us to scale to meet demand by just adding new Docker containers or turning off Docker containers we didn’t use.”
The edge device needed to boot up as soon as it was plugged in, without additional supervision, but the cameras and network switches also had to be connected and set up. So the team registered their Docker Compose setup as a systemd service and used systemd to start it at a specified time.
“Additionally, we wanted to do health checks to make sure that no service started before its associated piece of hardware was ready, and again, Docker was incredibly powerful for this,” he told the audience. “Because of the health checks available in Docker Compose, we could build a Docker container whose exclusive job was to check on a dedicated piece of hardware and report healthy when that hardware became ready.”
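A hardware-watching container of that kind needs only a small probe whose exit code Docker can read: a health check command reports healthy by exiting with 0 and unhealthy by exiting with 1. The sketch below assumes the hardware exposes a TCP port (for example, a camera's RTSP port); the talk does not say how WEST probed its devices, so the function names and the probing method are hypothetical.

```python
import socket


def hardware_ready(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to the device succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def exit_code(host: str, port: int, timeout_s: float = 2.0) -> int:
    """Map readiness to a health check exit code: 0 is healthy, 1 is unhealthy."""
    return 0 if hardware_ready(host, port, timeout_s) else 1
```

A container entrypoint would call sys.exit(exit_code(host, port)) from the command named in the service's healthcheck entry in the Compose file, and services that depend on the hardware could then wait on it through depends_on with the service_healthy condition.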
Docker Solves Pilot Problems
The pilot study was run on land, partly because the team had access to a land-based wind turbine at the University of Minnesota, and partly because they needed to be able to troubleshoot if the system encountered problems. They also wanted to be able to check the results against the old manual walking method.
“Another big win for Docker on this project came about a week after our initial deployment [when] we discovered that some of our cameras, which were supposed to be saving data with a schedule, actually were not saving data at all,” he said. “What were [the] reasons? We never found out, because it was easy to take a Dockerfile, spin up some RTSP streaming, which is a very standard protocol used by security cameras, and post-deployment, it was easy just to package up this new microservice, send it off and integrate it with our system.”
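A recording microservice like the one described can be little more than a wrapper around a stream-capture tool. The talk does not name the tool WEST used, so the sketch below assumes ffmpeg and only builds the command line. The camera URL, output path and ten-minute segment length are hypothetical.

```python
from typing import List


def rtsp_record_command(rtsp_url: str, out_pattern: str, segment_s: int = 600) -> List[str]:
    """Build an ffmpeg command that records an RTSP stream into timed segment files."""
    return [
        "ffmpeg",
        "-rtsp_transport", "tcp",          # input option: carry RTSP over TCP
        "-i", rtsp_url,
        "-c", "copy",                      # store the camera's encoding, no re-encode
        "-f", "segment",                   # split output into separate files
        "-segment_time", str(segment_s),   # target length of each file, in seconds
        "-reset_timestamps", "1",          # each file starts at timestamp zero
        "-strftime", "1",                  # expand date/time codes in the file name
        out_pattern,
    ]


# Hypothetical camera address and output location.
command = rtsp_record_command(
    "rtsp://camera.example/stream",
    "/data/cam1_%Y%m%d_%H%M%S.mp4",
)
print(" ".join(command))
```

Inside a container, the service would pass this list to subprocess.run(command, check=True), so a new camera becomes one more container with a different URL.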
The project also ran into power problems. The edge device’s power consumption began to skyrocket at random, to the point that the device would crash.
“We said at the time it had joined the IoT, that being the internet of toast,” Hein joked. “So, spoiler alert, [replacing it] was much easier than we thought it was going to be, thanks again to Docker. Because we put so much effort into a containerized workflow and an easy deployment, it was actually pretty easy to set up our new edge device.”
They got new hardware, installed some GPU drivers, which was probably the hardest part, he added, then copied over the systemd and Docker Compose files and ran docker compose up over an SSH connection.
“While this was certainly unwelcome in the middle of a deployment, I really would like to highlight that this is about as good as a surprise redeploying can go, and that is very much thanks to the fact that all our applications were containerized,” he said. “Those containers contained all the libraries needed, all the dependencies, without us even having to think about it. Well, we had to think about it, but we didn’t have to manually get all of those ducks in a row. They were just automatically in a row thanks to Docker.”
The Pilot Results
The system ran unattended for approximately six weeks before the study ended. The only thing they did during that time was look at the system logs to ensure the system was running.
“If it had been in an offshore turbine with no connectivity, it would have done its job flawlessly during that time,” Hein said. “This study analyzed in real time more than 6,000 hours of video, and in real time, sorted relevant video from non-relevant video, achieving about a factor of 10 data compression rate. And our algorithms did detect several collisions.”
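Those figures can be turned into rough working numbers. The sketch below rests on two assumptions that the talk does not state: that the factor of 10 applies evenly to hours of footage, and that every camera recorded around the clock for the full six weeks.

```python
def retained_hours(total_hours: float, compression_factor: float) -> float:
    """Hours of video kept, assuming the factor applies evenly to footage length."""
    if compression_factor <= 0:
        raise ValueError("compression_factor must be positive")
    return total_hours / compression_factor


def implied_streams(total_hours: float, weeks: float) -> float:
    """Number of continuous camera streams implied by the total hours analyzed."""
    if weeks <= 0:
        raise ValueError("weeks must be positive")
    return total_hours / (weeks * 7 * 24)


print(retained_hours(6000, 10))             # prints 600.0
print(round(implied_streams(6000, 6), 1))   # prints 6.0
```

Under those assumptions, the system would have kept roughly 600 hours of footage from about six continuous camera streams.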
In fact, it may have detected one more collision than the manual counters did, he added. Western EcoSystems Technology is now looking for wind turbine companies willing to deploy the solution offshore.
“Machine learning models… in order to make a difference and have a positive impact in the world, they need to interact with the world, and that interaction needs to happen in places that may not be convenient for a developer; they may not have connectivity,” Hein said. “We can take these technologies to new places and solve new problems, and with a level of quality and a level of deployment speed and accuracy that was previously unachievable, thanks to the power of combining ML with Docker.”