Sparse Convolution explained with code

When I interview people about their basic understanding of convolutional neural networks, they tend to simplify convolution into a single kernel sliding over a window. However, few of them can really recall what is going on inside the actual machine. Here is a tutorial to recap that crash course, and then we will dive into sparse convolution.
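To make the "single kernel sliding over a window" picture concrete before the sparse case, here is a minimal dense convolution written as explicit loops (my own NumPy sketch, not the code from the post; the function name is arbitrary):

import numpy as np

def conv2d_naive(image, kernel):
    # Dense 2D convolution (cross-correlation, strictly speaking) via a sliding window.
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # multiply the kernel with the local window and sum the result
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(5, 5)
kernel = np.ones((3, 3)) / 9.0   # simple box filter
print(conv2d_naive(image, kernel).shape)  # -> (3, 3)

Sparse convolution keeps the same kernel idea but only computes where the input actually has active (non-empty) sites instead of over the whole dense grid, which is what the post goes on to explain.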

Read More

Tutorial for Torch Points3D library

Torch_Points3D is a modern library for 3D vision learning on point cloud data. It includes built-in implementations of many tasks (e.g. semantic segmentation, panoptic segmentation, 3D object detection, and scene classification). Since it is still in early development, there are many bugs, so please submit issues when you run into any problem. This is an unofficial tutorial about how to use the library on some datasets. I will update it as soon as possible to cover all the tasks and datasets. I hope it helps you.

Read More

Build MKL FFT and other pip wheel libraries

Since pip only releases MKL wheels for Python 3.7+, we need to build these wheel files ourselves; here are simple steps to build them from source. You can insert those command-line steps into your Dockerfile.

Read More

Explore the Convexity of Photometric Loss

As we can see from my last post, BA with PyTorch, the pixel-intensity or small-patch comparison used by direct methods is extremely non-convex, which becomes a huge obstacle for second-order optimizers trying to converge to a global or even a local minimum. So, in this post we explore ways to constrain the photometric error to be as convex as possible. In fact, with a good initialization (achieved by deep methods), the depth estimation can be very close to the ground truth.
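As a rough illustration of what taming the photometric error can look like (my own sketch, not necessarily the method in the post; the function name and the delta threshold are assumptions), a robust kernel such as Huber keeps the residual quadratic near the minimum and linear in the tails, which is friendlier to second-order solvers:

import torch

def huber_photometric(intensity_ref, intensity_tgt, delta=0.1):
    # Per-pixel photometric error wrapped in a Huber kernel:
    # quadratic for small residuals, linear for large ones.
    r = intensity_ref - intensity_tgt
    abs_r = r.abs()
    quadratic = 0.5 * r ** 2
    linear = delta * (abs_r - 0.5 * delta)
    return torch.where(abs_r <= delta, quadratic, linear)

ref = torch.rand(8)    # sampled reference intensities (toy data)
tgt = torch.rand(8)    # warped target intensities (toy data)
print(huber_photometric(ref, tgt).sum())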

Read More

solve BA with PyTorch continued

In the last post, I started trying to solve the Bundle Adjustment problem with PyTorch. Since PyTorch's dynamic computation graph and customizable gradient functions are very well suited to this large optimization problem, we can easily encode it in a learning framework and then push the optimization results into updating the depth and pose estimations in an unsupervised fashion. The true beauty of this method is that it introduces more elegant mathematics to constrain the dimensionality of the manifold, compared to direct photometric warping over the whole image. Plus, the sparse structure is extremely fast even on a CPU with limited hardware (tested on an NVIDIA TX2, CPU only).
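As a minimal sketch of what encoding the problem in a learning framework can look like (my own toy example, not the post's code; the variable names and the toy residual are assumptions), autograd lets us treat inverse depths and pose parameters as learnable tensors and simply backpropagate through the residual:

import torch

# Toy setup: refine per-point inverse depths and a camera translation
# by minimizing a simple alignment residual with autograd.
inv_depth = torch.rand(100, requires_grad=True)      # per-point inverse depth
translation = torch.zeros(3, requires_grad=True)     # camera translation
bearings = torch.randn(100, 3)                       # per-point bearing vectors (toy data)
targets = torch.randn(100, 3)                        # "observed" 3D points (toy data)

optimizer = torch.optim.Adam([inv_depth, translation], lr=1e-2)
for step in range(200):
    optimizer.zero_grad()
    points = bearings / inv_depth.clamp(min=1e-3).unsqueeze(1) + translation
    loss = (points - targets).pow(2).sum()
    loss.backward()       # autograd differentiates through the whole residual
    optimizer.step()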

Read More

Solve BA with PyTorch

Since Bundle Adjustment depends heavily on the optimization backend, and because of the large scale of the Hessian matrix, solving Gauss-Newton directly is extremely challenging, especially when computing the Hessian matrix and its inverse. The size of the Hessian depends on the number of map points, so we can first estimate the point inverse depths and then update the poses in a two-stage manner. However, this requires a very good initialization for the depth estimation, as well as a convex local photometric loss to ensure the positive-definiteness of the Hessian.
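For reference, this is the Gauss-Newton step the post refers to, in standard notation (my own recap; the equations are not taken from the post):

\[
E(\mathbf{x}) = \tfrac{1}{2}\,\mathbf{r}(\mathbf{x})^{\top}\mathbf{r}(\mathbf{x}), \qquad
H \approx J^{\top} J, \qquad
H\,\Delta\mathbf{x} = -J^{\top}\mathbf{r}, \qquad
\mathbf{x} \leftarrow \mathbf{x} + \Delta\mathbf{x},
\]

where \(J\) is the Jacobian of the residual \(\mathbf{r}\) with respect to the state \(\mathbf{x}\) (poses and point inverse depths). Since the size of \(H\) grows with the number of map points, solving this linear system, let alone inverting \(H\), is what makes full BA expensive.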

Read More

gradient pixel selector in DSO

Most of the implementation details are easy to miss if you only read the paper. In this post, I'll introduce the way DSO picks its candidate initialization anchor point Hessians and explain why it chooses this stochastic, gradient-based initialization policy.
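To give a flavour of the idea (my own simplified NumPy sketch, not DSO's actual selector, which additionally uses per-region adaptive thresholds, several pyramid levels, and random dithering), candidate points are picked where the gradient magnitude stands out locally:

import numpy as np

def select_candidates(img, grid=32, rel_threshold=1.5):
    # Mark pixels whose gradient magnitude exceeds a multiple of the
    # median gradient magnitude inside their grid cell.
    gy, gx = np.gradient(img.astype(np.float32))
    mag = np.sqrt(gx ** 2 + gy ** 2)
    mask = np.zeros_like(mag, dtype=bool)
    for y in range(0, img.shape[0], grid):
        for x in range(0, img.shape[1], grid):
            cell = mag[y:y + grid, x:x + grid]
            mask[y:y + grid, x:x + grid] = cell > rel_threshold * np.median(cell)
    return mask

img = np.random.rand(128, 128)
print(select_candidates(img).sum(), "candidate pixels")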

Read More

forgetting model continued

In the last post, I proposed a way to slow down the conjugate gradient update on each dimension; instead, updating only part of the network can lead to faster convergence and better generalizability. Here's the proof.

Read More

Deep Mono VO

Let's recall the last post about the Schur complement in estimating the Hessian matrix:

Read More

schur complement for GN Optimization

Direct methods normally rely on the photometric consistency assumption, and the depths estimated by direct methods are jointly estimated with the camera poses, which together compose a huge \(H\) matrix.
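In standard notation (my own recap, not copied from the post), the joint system over poses \(\boldsymbol{\xi}\) and points \(\mathbf{p}\) has the familiar block structure, and the Schur complement marginalizes the points so that only the much smaller pose block has to be solved densely:

\[
\begin{bmatrix} B & E \\ E^{\top} & C \end{bmatrix}
\begin{bmatrix} \Delta\boldsymbol{\xi} \\ \Delta\mathbf{p} \end{bmatrix}
=
\begin{bmatrix} \mathbf{v} \\ \mathbf{w} \end{bmatrix}
\quad\Longrightarrow\quad
\bigl(B - E\,C^{-1}E^{\top}\bigr)\,\Delta\boldsymbol{\xi} = \mathbf{v} - E\,C^{-1}\mathbf{w},
\]

and since \(C\) is block-diagonal (one small block per point), \(C^{-1}\) is cheap to compute; \(\Delta\mathbf{p}\) is then recovered by back-substitution.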

Read More

PDF to text using OCR and Deep Learning (Conv, RNN)

(This post focuses on converting PDFs in English with normal fonts.) OCR is a problem that has been very thoroughly explored, especially for printed documents. This post shows how to convert a PDF to text and even extract a caption from it to describe the PDF document.
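For orientation, here is a minimal off-the-shelf baseline (my own sketch, not the Conv/RNN pipeline the post describes; 'document.pdf' is a placeholder path) that rasterizes each page with pdf2image and runs Tesseract on it via pytesseract:

# Requires poppler (for pdf2image) and the tesseract binary to be installed.
from pdf2image import convert_from_path
import pytesseract

pages = convert_from_path('document.pdf', dpi=300)
text = '\n'.join(pytesseract.image_to_string(page, lang='eng') for page in pages)
print(text[:500])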

Read More

BA in DSO

As we all know, DSO performs bundle adjustment in a sliding window, since BA over the whole frame history would enlarge the H matrix that Gauss-Newton must solve to find the optimal pose estimation. Here I'm going to explore the basics of BA and its relationship to DSO.

Read More

Geometrical Meaning of Hessian in Image

You may have heard of tons of corner detectors, such as the Förstner corner detector, the Harris method, the Laplacian of Gaussian, etc. Most of them leverage image gradients and image Hessians to capture local features, at least until the recent deep learning techniques that use convolutional methods.
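For reference, the image Hessian these detectors build on is just the matrix of second derivatives of the intensity (standard notation, my own recap):

\[
H(x, y) =
\begin{bmatrix}
I_{xx}(x, y) & I_{xy}(x, y) \\
I_{xy}(x, y) & I_{yy}(x, y)
\end{bmatrix},
\]

whose eigenvalues are the principal curvatures of the intensity surface: the determinant of the Hessian is used as a blob measure (as in SURF), and the Laplacian of Gaussian mentioned above is simply its trace after Gaussian smoothing.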

Read More

Towards Deep Learning augmented Visual Odometry

Visual SLAM is already a well-defined problem and has been explored extensively. Many powerful algorithms using traditional geometric methods can achieve very high precision while keeping a reasonable speed. However, there are still some corners left to explore…

Read More

CPP Aligned Allocator for SSE

Allocators normally hand out a chunk of memory that is immediately available on the heap, yet if you want to apply Intel SSE optimizations to speed up your computation pipeline, allocating space without alignment will introduce endless trouble.

Read More

Github Speed Up

GitHub is slow in some regions; this post helps speed up your GitHub access by setting the global URL configs:

Read More

DSO Reviews and Future Extensions.

The biggest challenges for DSO in this experiment are the low-contrast problem (e.g. white sand with the same brightness everywhere) and in-place rotation. We have made several modifications for these two cases to make it more robust in the underwater environment.

Read More

cuda version check

If you installed CUDA yet cannot find nvcc in your terminal, just navigate to the CUDA libraries and execute nvcc there to check the version:

Read More

old thoughts

The match happens in one shot, which means the network in the brain works in a broadcast manner: the input is broadcast to every corner and bounces around inside the brain; eventually, the matched patterns retain a high gain and activate the path.

Read More

bash extglob

In bash you can use extglob:

$ shopt -s extglob  # to enable extglob
$ cp !(b*) new_dir/

where !(b*) excludes all files matching b*.

Read More

cmake library build g2o

If you happen to be using a CMake-built library and CMake cannot find it in your CMakeLists, don't worry: most .cmake files are stored in /usr/share/cmake-X.X/Modules/. Just manually copy the xxx.cmake file into that Modules directory, and CMake will then locate the library path for you.

Read More