3D Object Detection with Open3D-ML and PyTorch Backend

3D Object Detection on the Kitti Dataset, photo provided by Open3D

In previous articles, I described how I used Open3D-ML to do Semantic Segmentation on the SemanticKITTI dataset and on my own dataset. Now it is time to move to another important aspect of the Perception Stack for Autonomous Vehicles and Robots, which is Object Detection from Point Clouds. Make sure to install Open3D-ML with PyTorch support if you want to run the code described in this article.

Visualizing the Kitti Dataset

The first thing to do is to download the popular Kitti dataset and visualize it. We are interested only in the data containing 3D bounding box annotations of different objects (pedestrians, cars, etc) seen by an autonomous vehicle. The steps are described below:

Step 1: Download the Kitti dataset

cd path-to-Open3D-ML/scripts/download_datasets
./download_kitti.sh path/to/save/dataset

Make sure to replace path-to-Open3D-ML with the path to the Open3D-ML installation and path/to/save/dataset with the path where you wish to save the dataset.

Step 2: Clone or update my repository

If you have not done it yet, clone my repository containing the Python code.

git clone https://github.com/carlos-argueta/open3d_experiments.git

If you previously cloned it, then just pull the new code.

git pull

Step 3: Navigate to the repository and activate the Conda environment

conda activate myenv
cd open3d_experiments

Make sure to replace myenv with your actual environment’s name.

Step 4: Run the script to visualize the first 400 point clouds with their bounding boxes

python3 view_detection_torch.py

The code is simple, first, a dataset is built passing the path to where Kitti was saved, so make sure to replace /path/to/save/dataset/Kitti with the path to your Kitti dataset. Some attributes are printed for a sanity check and you can ignore that part, finally, a visualizer is created passing as parameters the dataset and some indices referring to the point clouds that we want to visualize, in this case, we use range(400) to visualize the first 400 frames, but you can change this to other values to view other parts of the dataset.

import open3d.ml.torch as ml3d  # or open3d.ml.tf as ml3d
# construct a dataset by specifying dataset_path
dataset = ml3d.datasets.KITTI(dataset_path='/path/to/save/dataset/Kitti')

# get the 'all' split that combines training, validation and test set
all_split = dataset.get_split('all')

# print the attributes of the first datum
print(all_split.get_attr(0))

# print the shape of the first point cloud
print(all_split.get_data(0)['point'].shape)

# show the first 400 frames using the visualizer
vis = ml3d.vis.Visualizer()
vis.visualize_dataset(dataset, "training", indices=range(400))
Visualizing the Kitti Dataset with Open3d-ML

As you can see from the previous video, a window will open where you can select different point clouds and view the different bounding boxes included. These boxes were manually created and are part of the training set portion of Kitti.

Object Detection on the Kitti Testing Set and on Custom Data

Let’s now use a pre-trained object detection model on unannotated data. We will use both the testing portion of the Kitti dataset, as well as my own custom data, and we will see how the model performs on these two different datasets.

Assuming you have cloned or updated the repository as described above, in order to run the object detection script, do the following:

conda activate myenv
cd open3d_experiments
python3 detection_torch.py

The code first loads a pipeline configuration file for the Point Pillars model followed by creating the model with it. Make sure to replace /path/to/Open3D/ with the path where you cloned the Open3D repository when installing it. Next, the paths to both the Kitti dataset and a small part of my personal dataset (provided with the repo) are added to the configuration. The Kitti dataset is then loaded with the utilities provided by Open3D-ML and the custom dataset with the custom function provided within the script.

# Load an ML configuration file
cfg_file = "/path/to/Open3D/build/Open3D-ML/ml3d/configs/pointpillars_kitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

# Load the PointPillars model
model = ml3d.models.PointPillars(**cfg.model)

# Add path to the Kitti dataset and your own custom dataset
cfg.dataset['dataset_path'] = '/path/to/save/dataset/Kitti'
cfg.dataset['custom_dataset_path'] = './pcds'

# Load the datasets
dataset = ml3d.datasets.KITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
custom_dataset = load_custom_dataset(cfg.dataset.pop('custom_dataset_path', None))

Next, the object detection pipeline is created using the model and configuration object as parameters, followed by loading the model parameters which will be downloaded if not yet present locally. The test split of the Kitti dataset will be selected and the Open3D visualizer will be created.

# Create the ML pipeline
pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the weights.
ckpt_folder = "./logs/"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "pointpillars_kitti_202012221652utc.pth"
pointpillar_url = "https://storage.googleapis.com/open3d-releases/model-zoo/pointpillars_kitti_202012221652utc.pth"
if not os.path.exists(ckpt_path):
 cmd = "wget {} -O {}".format(pointpillar_url, ckpt_path)
 os.system(cmd)

# load the parameters of the model
pipeline.load_ckpt(ckpt_path=ckpt_path)

# Select the test split of the Kitti dataset
test_split = dataset.get_split("test")

# Prepare the visualizer 
vis = Visualizer()

Finally, an empty list is created to store the point clouds with the detections for later visualization. The first loop in the following code gets the first 10 data frames from the Kitti test set, runs the inference, uses a custom function to filter out detections with low scores, and creates a dictionary pred with the format expected by the visualizer. The dictionary is then added to the data list.

The second loop obtains frames from the custom dataset, pre-processes them to make them compatible with the pipeline using the provided prepare_point_cloud_for_inference method, and then runs the inference, adding the results to the list just like in the first loop. The last line runs the visualizer so that we can inspect the results. For more details about how the provided custom methods described in the paragraphs above work, please consult the source code included in the repository.

# Variable to accumulate the predictions
data_list = []

# Let's detect objects in the first few point clouds of the Kitti set
for idx in tqdm(range(10)):
    # Get one test point cloud from the SemanticKitti dataset
    data = test_split.get_data(idx)
    
    # Run the inference
    result = pipeline.run_inference(data)[0]
    
    # Filter out results with low confidence
    result = filter_detections(result)
    
    # Prepare a dictionary usable by the visulization tool
    pred = {
    "name": 'KITTI' + '_' + str(idx),
    'points': data['point'],
    'bounding_boxes': result
    }
    
    # Append the data to the list    
    data_list.append(pred)
   
    
# Let's detect objects in the first few point clouds of the custom set
for idx in tqdm(range(len(custom_dataset))):
    # Get one point cloud and format it for inference
    data, pcd = prepare_point_cloud_for_inference(custom_dataset[idx])
 
    # Run the inference
    result = pipeline.run_inference(data)[0]
    # Filter out results with low confidence
    result = filter_detections(result, min_conf = 0.3)
    
    # Prepare a dictionary usable by the visulization tool
    pred = {
    "name": 'Custom' + '_' + str(idx),
    'points': data['point'],
    'bounding_boxes': result
    }
    # Append the data to the list  
    data_list.append(pred)

# Visualize the results
vis.visualize(data_list, None, bounding_boxes=None)
3D Object Detection with a Point Pillars Model on the Kitti and Custom Datasets

As you can see in the video above, with the Kitti test set, the model can detect pedestrians and cars without much issue. With my custom dataset, pedestrians seem to be detected but no cars were. This probably means that re-training is needed with my own data. This is often the case when Machine Learning models encounter datasets that differ from the ones they were trained with. In future articles, I will explore training models using Open3D-ML with my own data.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: