Drivable Path Segmentation Test 1

A couple of weeks ago I was collecting and labeling driving images to teach #SEDRAD how to recognize the surface to drive on using semantic segmentation.

The first DeepLabv3+ model running fully on Oak-D cameras is ready, and we took it for a spin. It is not perfect, but it is a first step towards improving the safety of our #SelfDriving #Ads #Robot.

#Oak2021 #OpenCV #robotics #AI #MachineLearning #SuperAnnotate #autonomousdriving

Our Face Analysis Now Powered by Oak-D

An old Soul Hackers Labs trick, now powered by Oak-D. People tracking and face detection happen on the edge device. The detected face is fed to SHL’s face analyzer (on the host) to determine the age, gender, emotions, attention level, and viewing time of the tracked person. Metadata is produced and stored to generate useful reports for advertisers.

Tracking and detecting faces was the most resource-consuming part, and the host computer has now been freed from this burden. Thanks, Oak-D!

#SEDRAD, the Self-Driving Ads Robot coming soon! #Oak2021 #OpenCV #AI #machinelearning #robotics #retailanalytics #digitalsignage #digitalsignagesolutions

Annotation Process Under Way

One of the steps in teaching the Self-Driving Ads Robot (SEDRAD) how to navigate its environment is to teach it which surfaces are good for it to drive on. We began the tedious task of collecting images of the surroundings and segmenting them, to later “teach” SEDRAD to recognize the surface to follow and stay on. This process comprises image acquisition, image annotation (segmentation, in our case), and segmentation validation. It is very time-consuming. We also ran into a problem: due to technical issues, the real SEDRAD was not available over the weekend when we collected the data. Instead, we mounted the cameras on a wagon, at the same height and the same separation from each other as on the robot. Annotation, here we go!

ROS+Oak-D: Turning a depth image into a 2-D laser scan for obstacle avoidance.

When we applied to the #OpenCV Spatial AI Competition #Oak2021, the very first issue we told the organizers we were going to solve using an Oak-D stereo camera was the inability of our robot to avoid obstacles located lower than the range of its 2D lidar. Back then we had no idea how we were going to do this, but we knew a stereo camera could help. In this video we present our solution. The video does not yet show it in action during autonomous navigation, but it explains how we will turn depth images from two front-facing Oak-D cameras into 2 virtual 2D lidars that let the robot avoid obstacles located near the floor.
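For readers curious about the general idea, here is a minimal sketch of how a band of depth-image rows can be collapsed into a 2D scan. It is not the code running on SEDRAD. It assumes a pinhole camera model, depth values in meters measured along the optical axis, and hypothetical names (`depth_to_scan`, `fx`, `cx`, `row_start`, `row_end`) chosen only for illustration. Angles here are positive to the right of the optical axis, which is also an assumption. Plain Python lists are used to keep the example self-contained.

```python
import math


def depth_to_scan(depth_m, fx, cx, row_start, row_end,
                  range_min=0.2, range_max=10.0):
    """Collapse rows [row_start, row_end) of a depth image into one
    (angle_rad, range_m) reading per image column.

    depth_m: list of rows, each a list of depths in meters along the
             optical axis; 0, negative, or NaN values count as invalid.
    fx, cx:  horizontal focal length and principal point, in pixels
             (assumed known from camera calibration).
    """
    width = len(depth_m[0])
    scan = []
    for u in range(width):
        # Bearing of the ray through column u, positive to the right.
        angle = math.atan2(u - cx, fx)

        # Keep the nearest valid depth seen in this column of the band.
        nearest = math.inf
        for v in range(row_start, row_end):
            z = depth_m[v][u]
            if not (z > 0.0):  # rejects 0, negatives, and NaN
                continue
            nearest = min(nearest, z)

        if math.isinf(nearest):
            scan.append((angle, math.inf))  # nothing seen in this column
            continue

        # Convert depth along the optical axis into range along the ray.
        x = (u - cx) * nearest / fx
        rng = math.hypot(x, nearest)
        if not (range_min <= rng <= range_max):
            rng = math.inf  # outside the range we trust
        scan.append((angle, rng))
    return scan


# Tiny made-up 4x5 depth image: a wall at 2 m and a low obstacle at 0.5 m.
depth = [[2.0] * 5 for _ in range(4)]
depth[3][2] = 0.5
scan = depth_to_scan(depth, fx=2.0, cx=2.0, row_start=2, row_end=4)
```

Choosing the row band is what makes the virtual lidar look near the floor: only the rows where low obstacles appear in the image are scanned, and the nearest reading in each column wins.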

The Robotics division of SHL (SHL Robotics) joins the OpenCV AI Competition #Oak2021

We are pleased to announce that we are officially part of the second phase of the OpenCV AI Competition #Oak2021. Our team joins over 200 teams selected worldwide from among hundreds of participants. As a prize, OpenCV and Luxonis have awarded us a certificate and a free Oak-D camera (to join the 3 others we already owned) to help us develop our self-driving ads robot. Stay tuned for more updates.

Soul Hackers Labs has joined AppWorks Batch #13

When it comes to Taiwan and Southeast Asia, no accelerator is bigger or more impactful than AppWorks. With 275 startups accelerated and about US$222M raised, plus a vast network of human resources, wanting to join them is a no-brainer.

We are happy to announce that starting July 29, 2016, we will be joining this prestigious institution as part of their batch #13. Soul Hackers Labs, together with over 30 other teams from Taiwan, Hong Kong, Macau, Malaysia, Singapore, and New York, will be spending the next 6 months building a sustainable business.

AppWorks will be providing us with office space, mentorship, connections, and access to computing resources (through partnerships with Amazon AWS and Microsoft BizSpark) to help speed up our path to success. We are thrilled to see all the great things we will be building during this time. Please stay tuned for updates as the awesomeness takes place.



The Internet of Things Needs a New Kind of Sensor

Left: Inside the sensor. Right: A demo app that uses the sensor data.

“I have been looking for some time for a camera to complement my smart home and I came to the conclusion that there is no product in the market that provides a decent solution for the user,” reads the introduction to a blog post I read the other day. This is particularly true of the Smart Home market, and of any other market that deals with humans. Now, making a camera that meets every need (human tracking, face recognition, people counting, etc.) is not straightforward, but that does not mean companies should not try to cover at least a small set of such desirable features. I believe there can be a demand for such things, but most people don’t know they need one yet.

The problem, it seems to me, is one of perception. While most people consider a temperature or light sensor to be simple plug-and-play hardware that can be easily connected to a Smart Hub, camera solutions tend to be thought of as complex projects developed for a specific task and product. Why should this be the case? Many people, from Internet of Things (IoT) companies to makers, could benefit from an advanced plug-and-play “camera sensor”: one that could easily be plugged into your home hub, car, or any other Internet-enabled device and give access to its rich data (people identified, objects recognized, etc.) through its application programming interface (API).

Because I believe such things should exist, and because I believe many will benefit from them once they are available, I decided to create one such smart camera sensor. I am happy to introduce an early prototype of Project Jammin’s Face Sensor for the IoT, whose primary goal is to sense human emotions in real time and without the need for cloud-based services. This sensor will offer the following functionalities and advantages out of the box:

  • Facial emotion analysis
  • Face recognition
  • Attention tracking
  • Offline and real-time processing
  • Small and affordable
  • Protect your privacy, no need to send video data to the cloud
  • Convenient API to collect detected emotions, faces, attention
  • Ability to build your own apps and systems with emotion sensing

This product, once it goes into production, will be suitable for retail, where it can be used to detect people’s reactions and attention to products. In education, such a sensor can be used to monitor kids and determine the best study times, preferred topics, etc. Smart homes could benefit by adding emotion-based automation: just imagine if your home could adjust the lights, temperature, and music based on how you feel. Healthcare is another area in which this sensor could be useful: by placing it in front of sick patients or the elderly, one could monitor their recovery based on their emotions. The limit is your imagination. While emotion detection is not a new thing, I have not yet found an offering that just works, like the proximity, temperature, and other sensors that now proliferate.

Please find in the following video a demo of Project Jammin’s Face Sensor for the IoT:


Let Me Hear Your Voice and I Will Tell You How You Feel

Creating mood sensing technology has become very popular in recent years. There is a wide range of companies trying to detect your emotions from what you write, the tone of your voice, or from the expressions on your face. All of these companies offer their technology online through cloud-based programming interfaces (APIs).

As part of my offline emotion sensing hardware (Project Jammin), I have already built early prototypes of facial expression and speech content recognition for emotion detection. In this short article I describe the missing part, a voice tone analyzer.

To build a tone analyzer, it is necessary to study the properties of the speech waveform (a two-dimensional representation of a sound). Waveforms are also known as time-domain representations of sound, as they represent changes in intensity over time. For more details about the waveform you can refer to this interesting page.

The waveforms of four different letters of the English alphabet

Using software specifically designed to analyze speech, the idea is to extract certain characteristics of the waveform that can be used as features to train a machine learning classifier. Given a collection of speech recordings, manually labelled with the emotion expressed, we can construct vector representations of each recording using the extracted features.

The features used in emotion detection from speech vary from work to work, and sometimes even depend on the language analyzed. In general, many research and applied works use a combination of pitch, Mel Frequency Cepstral Coefficients (MFCC), and formants of speech.
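To make the idea of "extracting features from the waveform" more concrete, here is a minimal sketch that computes two of the simpler ones for a single short frame of audio: intensity (as RMS amplitude) and pitch (estimated with plain autocorrelation). This is not the speech-analysis software mentioned above, only an illustration. The function name `frame_features` and the 75 to 400 Hz pitch search range are assumptions made for this example, and MFCCs and formants are left out because they need considerably more machinery.

```python
import math


def frame_features(samples, sample_rate, fmin=75.0, fmax=400.0):
    """Return RMS intensity and an autocorrelation pitch estimate (Hz)
    for one short frame of mono audio samples."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove any DC offset

    # Intensity: root-mean-square amplitude of the frame.
    rms = math.sqrt(sum(v * v for v in x) / n)

    # Pitch: the lag (in samples) at which the frame best matches a
    # shifted copy of itself, searched only over plausible voice pitches.
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), n - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        corr = sum(x[i] * x[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    pitch_hz = sample_rate / best_lag if best_lag else 0.0
    return {"rms": rms, "pitch_hz": pitch_hz}


# Synthetic test tone (not real speech): 50 ms of a 200 Hz sine at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(400)]
features = frame_features(tone, rate)
```

In practice a recording would be cut into many overlapping frames, and statistics of these per-frame values (mean, range, variance) would go into the feature vector for that recording.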

Above: the waveform of a speech sample expressing surprise. Below: the pitch in blue, intensity in yellow, and formants in red.

Once the features are extracted and the vector representations of speech constructed, a classifier is trained to detect emotions. Several types of classifiers have been used in previous works. Among the most popular are Support Vector Machines (SVM), Logistic Regression (Logit), Hidden Markov Models (HMM), and Neural Networks (NN).

As an early prototype I have implemented a simplified version of an emotion detection classifier. Instead of detecting several emotions like joy, sadness, anger, etc., my tone analyzer performs a binary classification to detect the level of arousal of a user. A high level of arousal is associated with emotions like joy, surprise, and anger whereas a low level of arousal is associated with emotions like sadness and boredom. The video below shows my tone analyzer running on a Raspberry Pi. Enjoy!
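For the curious, here is a minimal sketch of what a binary arousal classifier can look like. It is not the code running in the prototype. It assumes logistic regression (one of the classifier types listed above) trained with plain gradient descent, a two-value feature vector of pitch and intensity per recording, and a handful of made-up training values that exist only to make the example runnable. All function names are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_scaler(rows):
    """Per-feature mean and standard deviation, used for standardizing."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return means, stds


def scale(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def train(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(row, w, b, means, stds):
    """Return 1 for high arousal, 0 for low arousal."""
    x = scale(row, means, stds)
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)


# MADE-UP toy data: [mean pitch in Hz, RMS intensity]; 1 = high arousal.
raw = [[260, 0.30], [240, 0.25], [280, 0.35], [250, 0.28],
       [130, 0.08], [150, 0.10], [120, 0.06], [140, 0.09]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

means, stds = fit_scaler(raw)
w, b = train([scale(r, means, stds) for r in raw], labels)
```

With this toy data, `predict([270, 0.32], w, b, means, stds)` lands on the high-arousal side and `predict([125, 0.07], w, b, means, stds)` on the low-arousal side. Real recordings are far messier, so a real model would need many labelled samples and a richer feature vector.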