Hi,

This week we have a blog post from Devjyoti Chakraborty.

Among the plethora of superheroes created by Marvel Comics, one of my personal favorites is the blind crimefighter, Daredevil. The inclusion of a disabled superhero in a world of limitless creativity was well received, and Daredevil came to be considered an inspirational figure.

The main selling point behind this character (apart from the stories he was featured in) was his superhuman senses. Despite his blindness, he had enhanced senses which could help him feel everything that was going on around him, something like this:

Daredevil's sensory depiction (source).

Now you might wonder what Daredevil has got to do with today's tutorial. You see, today's tutorial is about MiDaS, a groundbreaking approach to estimating the inverse depth of images. Before we get to the analogy, it's important to have a basic picture of depth estimation in your mind.

In very simple terms, depth estimation means predicting the distance of the objects in an image from the plane of the camera. It's a tool for inferring scene geometry from the 2D images themselves.

Of course, MiDaS and Daredevil's senses are very different at their core. But if you think really hard about this, Daredevil honed his senses to make his body an impeccable sensor. At the same time, the authors of MiDaS have integrated various ingenious concepts to create a robust system capable of providing flawless real-time depth estimation. After all, depth estimation is a crucial cornerstone for autonomous vehicles. 

MiDaS in action.

The big picture: Since autonomous driving is a real-life, real-time problem, it's extremely important to have a system robust enough to deal with any scenario thrown at it. The authors of MiDaS took that into account and trained the model on MULTIPLE datasets.

How it works: Of course, using multiple datasets means dealing with different ground truth representations. To handle that, Ranftl et al. (2020) computed a common output space for all the datasets.
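
To make the idea of a common output space a little more concrete, here is a minimal sketch in plain Python. It assumes ground truths are compared in inverse depth (disparity) space and that two representations of the same scene may differ by an unknown scale and shift, which is the general idea behind the scale- and shift-invariant comparison in the paper. The function names `align_scale_shift` and `aligned_mse` are hypothetical, and the ordinary least-squares fit is a simplification for illustration, not the authors' exact implementation.

```python
def align_scale_shift(pred, target):
    """Least-squares scale s and shift t so that s * pred + t is close to target.

    Hypothetical helper for illustration; a simplified stand-in for
    scale- and shift-invariant alignment.
    """
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant; scale is undefined")
    cov = sum((p - mean_p) * (g - mean_t) for p, g in zip(pred, target))
    s = cov / var_p
    t = mean_t - s * mean_p
    return s, t


def aligned_mse(pred, target):
    """Mean squared error after removing the best-fit scale and shift."""
    s, t = align_scale_shift(pred, target)
    return sum((s * p + t - g) ** 2 for p, g in zip(pred, target)) / len(pred)


# Metric depths (in meters) for three pixels, converted to inverse depth.
depth_m = [2.0, 4.0, 8.0]
inverse_depth = [1.0 / d for d in depth_m]

# Assumption: a second dataset labels the same scene up to an unknown
# scale (3.0) and shift (0.1) in inverse depth space.
other_labels = [3.0 * d + 0.1 for d in inverse_depth]

scale, shift = align_scale_shift(inverse_depth, other_labels)
print(scale, shift)
print(aligned_mse(inverse_depth, other_labels))
```

Once scale and shift are factored out, the two label sets agree, so datasets with incompatible depth conventions can still supervise the same model.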

Our thoughts: A single dataset cannot capture the variety of the real world, so we applaud the ingenuity of training the model on several different datasets.

Yes, but: This also means the paper is heavy on mathematics, which makes it harder to fully understand.

Stay smart: Before you deep dive into MiDaS, get smart on MiDaS in this week's blog.

Click here to read the full tutorial

PyImageSearch University

This lesson is part of PyImageSearch University, our flagship program to help you master computer vision, deep learning, and OpenCV.  PyImageSearch University is updated each week with new lessons.

Don't know Python?  No problem, we've got you covered with a short and sweet Python course to get you going.

Having problems with your local development environment or IDE?  Our pre-configured Colab Notebooks allow you to run code the moment you join PyImageSearch University.  After all, you don't want to be a sysadmin, so don't waste time messing with your development environment.

You can find the current lesson under Torch Hub 101 — Practical Applications of Torch Hub and the direct link here.

Want to master computer vision and deep learning?

Do you think mastering computer vision and deep learning has to be time-consuming, overwhelming, and complicated? Or has to involve complex mathematics and equations? Or requires a degree in computer science?

That's not the case. All you need to master computer vision and deep learning is for someone to explain things to you in simple, intuitive terms. And that's exactly what we do. Our mission is to change education and how complex Artificial Intelligence topics are taught.

Inside PyImageSearch University, you'll find:

  • 30 courses on the hottest computer vision, deep learning, and OpenCV topics
  • 30 Certificates of Completion (one for each course)
  • 39+ hours of on-demand video
  • Pre-configured Jupyter Notebooks running in Google Colab
  • Run all code examples in your web browser — works on Windows, macOS, and Linux (no dev environment configuration required!)
  • Access to centralized code repos for all 500+ tutorials on the PyImageSearch blog
  • Easy one-click downloads for code, datasets, pre-trained models, etc.
  • Access on mobile, laptop, desktop, etc.
  • New courses released regularly and lessons weekly, ensuring you can keep up with state-of-the-art techniques

Click here to join PyImageSearch University


PyImageSearch Team

P.S. If you're interested in learning how to successfully apply deep learning to your own projects, I would recommend reading my book, Deep Learning for Computer Vision with Python.

I crafted my book so that it perfectly balances theory with implementation, ensuring you properly master:

 

  • Deep learning fundamentals and theory without unnecessary mathematical fluff. I present the basic equations and back them up with code walkthroughs that you can implement and easily understand. You don't need a degree in advanced mathematics to understand this book
  • How to implement your own custom neural network architectures. Not only will you learn how to implement state-of-the-art architectures, including ResNet, SqueezeNet, etc., but you'll also learn how to create your own custom CNNs
  • How to train CNNs on your own datasets. Most deep learning tutorials don't teach you how to work with your own custom datasets. Mine do. You'll be training CNNs on your own datasets in no time
  • Object detection (Faster R-CNNs, Single Shot Detectors, and RetinaNet) and instance segmentation (Mask R-CNN). Use these chapters to create your own custom object detectors and segmentation networks

You'll also find answers and proven code recipes to:

  • Create and prepare your own custom image datasets for image classification, object detection, and segmentation
  • Work through hands-on tutorials (with lots of code) that show you not only the algorithms behind deep learning for computer vision but their implementations as well
  • Put my tips, suggestions, and best practices into action, ensuring you maximize the accuracy of your models

If you're interested in learning more about my deep learning book, I'd be happy to send you a free PDF containing the Table of Contents and a few sample chapters:

Click here to download your table of contents and sample chapters PDF

After clicking the link above, the PDF will land in your inbox in a few short minutes.