Sibo Zhu (朱思博)

sibozhu AT

Massachusetts Institute of Technology

Welcome to my personal website!

My name is Sibo Zhu, and I am currently a research assistant at the MIT Department of Electrical Engineering and Computer Science (EECS). I am excited to be working with Prof. Song Han in the HAN (Hardware, Accelerators, and Neural Networks) Lab on robotic perception and efficient deep learning.

I am also the perception lead at MIT Driverless, a student-led high-speed autonomous racing team that develops full-scale vehicles and autonomous software to compete in driverless racing competitions.

Before coming to MIT, I received my M.S. in Computer Science from Brandeis University, where I was fortunate to work with Prof. Hongfu Liu. Prior to that, I received a B.A. in Computer Science and a B.A. in Pure & Applied Mathematics from Boston University, where I worked closely with Prof. Sang (“Peter”) Chin.

Outside of research, I enjoy snowboarding, skydiving, running, hiking, working out, cooking, movies, and music, and I especially love traveling to balance work and life.


  • Robotics
  • Perception
  • Efficient Deep Learning
  • Data Mining


  • M.S. in Computer Science, 2020

    Brandeis University

  • B.A. in Computer Science, 2018

    Boston University

  • B.A. in Pure & Applied Mathematics, 2018

    Boston University



Perception Lead

MIT Driverless

May 2020 – Present Cambridge, MA


  • Objective: Design, implement, and deploy an efficient and robust perception system for high-speed autonomous racing at the Indy Autonomous Challenge and Roborace.
  • Own the entire perception system (LiDAR and camera) for a full-scale on-track autonomous racing vehicle, leading 10 student engineers and managing the entire perception production process.
  • Responsible for data collection, annotation, NN customization, integrating NNs into ROS, and running inference with TensorRT (C++).
  • QA and testing by running the developed perception system on our 25%-scale testbed vehicle and the full-scale vehicle.
  • With my team, replicated the state-of-the-art sensor-fusion perception model “PointPainting” in PyTorch.
  • With my team, proposed and developed a framework that uses both camera and LiDAR history to predict future LiDAR frames.

Research Assistant

MIT HAN Lab

Jan 2020 – Present Cambridge, MA

End-To-End Camera and LiDAR Extrinsic Calibration (PyTorch)


  • Objective: Use deep neural networks to perform target-less camera-to-LiDAR extrinsic calibration in real time and end to end.
  • Designed an online calibration algorithm that employs 2D and 3D semantic segmentation networks as the backbone and SGD as the optimizer.
  • Designed an offline PyTorch-based end-to-end extrinsic calibration network that requires no annotation.
  • By generating the calibration transform directly from sensory input, our proposed framework outperforms state-of-the-art target-less extrinsic calibration methods.
  • Reduced calibration time from more than 3 hours of conventional manual checkerboard calibration to under 1 second end to end.
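The optimization loop behind the online variant can be illustrated on a toy problem. The sketch below is my own simplified stand-in, not the lab's code: it recovers a 2D rigid transform by gradient descent on a mean-squared alignment loss. The real system optimizes a 6-DoF camera-LiDAR extrinsic against semantic-segmentation alignment, but the SGD structure is the same.

```python
import numpy as np

def fit_rigid_2d(src, dst, lr=0.1, steps=500):
    """Recover a rotation angle and translation aligning src -> dst by plain
    gradient descent on a mean squared alignment loss. A 2D toy stand-in for
    the 6-DoF extrinsic optimized in the calibration work above."""
    theta, t = 0.0, np.zeros(2)
    for _ in range(steps):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        dR = np.array([[-s, -c], [c, -s]])   # derivative of R w.r.t. theta
        err = src @ R.T + t - dst            # (N, 2) residuals
        theta -= lr * 2.0 * np.mean(np.sum(err * (src @ dR.T), axis=1))
        t -= lr * 2.0 * err.mean(axis=0)
    return theta, t

# demo: recover a known transform from synthetic correspondences
rng = np.random.default_rng(0)
src = rng.uniform(-1.0, 1.0, (50, 2))
R_true = np.array([[np.cos(0.3), -np.sin(0.3)], [np.sin(0.3), np.cos(0.3)]])
dst = src @ R_true.T + np.array([1.0, 2.0])
theta, t = fit_rigid_2d(src, dst)   # theta -> ~0.3, t -> ~(1, 2)
```

In the real pipeline the loss compares projected LiDAR semantics against the camera segmentation mask rather than known point correspondences, but the same gradient loop drives the transform to alignment.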

Efficient and Robust LiDAR-Based End-to-End Navigation (PyTorch, ROS)


  • Objective: Using only LiDAR and navigation-map information as input, achieve end-to-end autonomous navigation from raw 3D point clouds to vehicle control on a full-scale vehicle.
  • Visualized network attention for the end-to-end autonomous driving framework.
  • Collected 50+ km of driving data from the CARLA simulator, which further helps the model generalize.

Deploy PVCNN on Real-World Self-Driving Car (PyTorch, ROS)


  • Objective: Deploy the state-of-the-art 3D deep learning framework PVCNN for LiDAR-based traffic-landmark detection on an autonomous racing vehicle.
  • Re-designed the Point-Voxel CNN architecture into a classification network that performs geometry and color classification based on the x, y, z, and intensity information provided by the Velodyne 32-channel LiDAR mounted on the MIT Driverless racing vehicle.
  • Decreased the classification error rate by 5x (accuracy increased from 95% to 99.99%+).
  • Improved LiDAR detection pipeline latency by 1.5x (from 5 ms to 3.4 ms).
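The segmentation-to-classification re-design amounts to pooling per-voxel features into a single label per cluster. A minimal numpy sketch of that idea follows; the `voxelize`/`classify` names and the 0.5 intensity-threshold "head" are my own illustration, not the deployed code, which uses a learned classifier over cone geometry and color.

```python
import numpy as np

def voxelize(points, grid=8, bound=1.0):
    """Average (x, y, z, intensity) points into a coarse voxel grid -- the
    voxel branch of Point-Voxel CNN, reduced to a numpy illustration."""
    vox_sum = np.zeros((grid, grid, grid))
    vox_cnt = np.zeros((grid, grid, grid))
    idx = np.clip(((points[:, :3] + bound) / (2 * bound) * grid).astype(int),
                  0, grid - 1)
    for (i, j, k), inten in zip(idx, points[:, 3]):
        vox_sum[i, j, k] += inten
        vox_cnt[i, j, k] += 1
    return vox_sum / np.maximum(vox_cnt, 1)

def classify(points, grid=8):
    """Global pooling over occupied voxels turns per-point features into one
    label per cluster -- the segmentation-to-classification change, with a
    hypothetical intensity threshold standing in for the learned head."""
    feats = voxelize(points, grid)
    occupied = feats[feats != 0]
    pooled = occupied.mean() if occupied.size else 0.0
    return int(pooled > 0.5)
```

Under this toy rule a cluster of high-intensity returns classifies as 1 and a low-intensity cluster as 0; swapping the threshold for a trained head recovers the structure of the deployed network.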

Machine Learning Lead

MIT Driverless

Sep 2019 – May 2020 Cambridge, MA

LiDAR Perception System (PyTorch, C++, ROS)


  • Objective: A machine learning pipeline that detects and localizes landmarks on racing tracks through the LiDAR sensor.
  • Designed the LiDAR-based perception pipeline and led a team of 4 engineers.
  • Deployed the state-of-the-art “Point-Voxel CNN” research with non-trivial customization, improving the LiDAR detection system’s accuracy from 94% to 99.99%+ with a 1.5x speed-up.

Perception Core Engineer

MIT Driverless

Jan 2019 – Sep 2019 Cambridge, MA

Camera Perception System (PyTorch, C++, ROS)


  • Objective: A machine learning system that detects and localizes landmarks on racing tracks through cameras.
  • Customized a SOTA object detection network (YOLOv3) for autonomous racing with custom preprocessing, NN pruning, and quantization; improved mAP from 66.97% to 89.35% and inference time from 120 ms to 30 ms.
  • Implemented a ResNet-inspired network for racetrack landmark keypoint detection and localization with 93% accuracy.
  • Deployed all networks on a ROS-based real autonomous vehicle with a whole-stack latency of 200 ms.
  • Open-sourced the codebase and dataset, potentially used by 100+ Formula Student teams from all over the world.

Research Assistant

Brandeis University Hongfu Liu’s Lab

Sep 2018 – Jan 2020 Waltham, MA

iPOF: An Extremely and Excitingly Simple Outlier Detector via Infinite Propagation (Python)


  • Objective: Enhance state-of-the-art outlier detectors through a post-detection ensemble.
  • Developed an outlier detection algorithm with direction awareness of each data point’s K nearest neighbors.
  • Achieved improvements ranging from 2% to 46% on average; in some cases, iPOF boosts performance by over 3000% relative to the original outlier detection algorithm.
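The "direction awareness" intuition can be sketched with a toy score (my own illustration of the idea, not the published iPOF algorithm): average the unit vectors from each point to its K nearest neighbors. Interior points see neighbors on all sides, so the directions cancel; outliers see all their neighbors on one side, so the mean direction stays close to unit length.

```python
import numpy as np

def direction_score(X, k=5):
    """Direction-aware KNN outlier score: norm of the mean unit vector toward
    each point's k nearest neighbors. Near 0 for surrounded inliers, near 1
    for isolated points whose neighbors all lie in one direction."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    scores = np.empty(len(X))
    for i in range(len(X)):
        vecs = X[np.argsort(d[i])[:k]] - X[i]
        units = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
        scores[i] = np.linalg.norm(units.mean(axis=0))
    return scores
```

On a 5x5 grid of inliers plus one far-away point, the far point receives the highest score, since every neighbor direction points back toward the grid.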

Research Assistant

Boston University LISP Lab

Jun 2017 – May 2018 Boston, MA

High-Speed Camera with Custom Exposure (TensorFlow, Keras)


  • Objective: Detect and segment motion-blurred areas in camera images.
  • Developed a CNN-based neural network for motion-blur detection, achieving 92% accuracy on blurry-patch detection.
  • Open-sourced the project, which received over 100 GitHub stars.

Director of Tech Department

Boston University Chinese Students and Scholars Association

Sep 2014 – May 2018 Boston, MA


  • Founded the Tech Department and secured over $10,000 in sponsorship for BUCSSA.
  • Led a team of over 20 engineers to develop a JavaScript and React Native based mobile application for BU students on both iOS and Android.
  • Acquired over 1,000 student users within BU.

Faster LiDAR Based on Predictive Deep Learning Model

A typical LiDAR on the market runs at 10 Hz, which is sufficient for a state-of-the-art autonomous road vehicle but not for an autonomous racing vehicle running at 180 mph. To build a “faster LiDAR,” inspired by the fact that cameras (30+ Hz) and LiDAR (10 Hz) operate at different frequencies, we propose a method that uses both camera and LiDAR history to predict future LiDAR frames.

Replication of the “PointPainting”

By replicating the state-of-the-art sensor-fusion detection model “PointPainting,” we use it as a tool to test and evaluate our 3D point cloud predictive model and our end-to-end extrinsic sensor calibration model.

“Point-Voxel CNN” Deployment on a Full-Scale Autonomous Racing Vehicle

We deployed the state-of-the-art LiDAR perception model “Point-Voxel CNN” on MIT Driverless’s full-scale autonomous racing vehicle. The deployment included converting the model’s task from segmentation to classification, ROS integration, and full-scale vehicle testing. Find the full story in the video provided here.

Grad-CAM with Object Detection (YOLOv3)

For my first project at MIT Driverless, my task was to find a visual explanation of the CNN-based object detection model the perception team uses, YOLOv3. After reviewing the results, we concluded that the network pays most of its attention to the bottom part of the object (the traffic cone) and, in some cases, to the margin between the cone and the ground.
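The Grad-CAM step itself is compact. Below is a minimal numpy sketch of the core computation (channel weights from globally pooled gradients, weighted sum of activation maps, ReLU), using synthetic (C, H, W) arrays in place of tensors hooked out of the detector's last convolutional layer:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM for one conv layer: global-average-pool the gradients into one
    weight per channel, take the weighted sum of the activation maps, and keep
    only positive evidence (ReLU). Inputs are (C, H, W) arrays."""
    weights = gradients.mean(axis=(1, 2))             # alpha_c, one per channel
    cam = np.tensordot(weights, activations, axes=1)  # (H, W) weighted sum
    cam = np.maximum(cam, 0.0)                        # ReLU: positive influence only
    return cam / cam.max() if cam.max() > 0 else cam  # normalize to [0, 1]
```

Upsampling the resulting map to image resolution and overlaying it on the frame is what produces attention visualizations like the bottom-of-cone pattern we observed.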

Motion Blur Detection

To build custom high-speed cameras that can deal with small patches of motion blur, I proposed a custom convolutional model that detects motion-blurred patches within images, achieving two-sigma accuracy.
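As a point of reference for what the CNN learns, the classic non-learned baseline for blurry-patch detection is the variance of the Laplacian response. A small numpy sketch follows; the threshold of 10 is an arbitrary illustration, not a tuned value from the project.

```python
import numpy as np

LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def laplacian_variance(patch):
    """Variance of the Laplacian response over a grayscale patch: blurred
    patches lack high-frequency content, so the variance collapses."""
    h, w = patch.shape
    out = np.empty((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(patch[i:i + 3, j:j + 3] * LAPLACIAN)
    return out.var()

def is_blurry(patch, threshold=10.0):
    return laplacian_variance(patch) < threshold
```

A sharp checkerboard patch produces large alternating Laplacian responses (high variance), while a flat or smoothly blurred patch scores near zero; the learned model improves on this heuristic for real motion blur.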


Graduate Research Award in Computer Science

Only one graduate student is awarded each academic year, in recognition of the recipient’s research excellence

Merit Scholarship

$15,000, awarded to the top 5% of students

3rd Place of Overall, Driverless Division

Overall ranking in an autonomous racing contest among 20+ worldwide student teams (such as ETH and TUM) at one of the most challenging student autonomous racing competitions in the world

1st Place of Cost & Manufacturing, Driverless Division

Cost and manufacturing design contest among 20+ worldwide student teams (such as ETH and TUM) at one of the most challenging student autonomous racing competitions in the world

3rd Place of Engineering Design, Driverless Division

Software and hardware design contest among 20+ worldwide student teams (such as ETH and TUM) at one of the most challenging student autonomous racing competitions in the world

1st Place of Engineering Design, Driverless Division

Software and hardware vehicle design contest among 10+ student autonomous racing teams (such as TUM) from all over the world

2nd Place of Overall, Driverless Division

Overall ranking in an autonomous racing contest among 10+ student autonomous racing teams (such as TUM) from all over the world

Merit Scholarship

$15,000, awarded to the top 5% of students

UROP Student Research Award

$3,000, awarded to top student researchers


Maybe a coffee chat?