Computer Vision

Computer vision as an area of study covers a broad range of problems. Here, we outline the main tasks that people in various industries and areas of research are trying to solve.

One problem in computer vision is image segmentation. At the most basic level, given some visual data, we want the model to learn to differentiate between objects in the foreground and the background. A model indicates its regions of interest with a mask, which is essentially a classification of every pixel in the image as either a pixel of interest or not. Going further, semantic segmentation identifies the regions of the visual data that belong to the same category. Further still, instance-level segmentation identifies a separate region for each individual object.
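To make the three kinds of mask concrete, here is a minimal sketch in plain Python. The tiny image, the instance ids, the class ids and the helper names (`to_semantic`, `to_binary`) are all made up for illustration and are not part of any particular library or of Datature's platform.

```python
# A tiny 4x8 "image" described only by its labels. 0 always means background.
# Instance-level segmentation: every individual object gets its own id.
instance_mask = [
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 2, 0, 3],
    [0, 0, 0, 0, 2, 2, 0, 3],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

# Hypothetical lookup from instance id to class id:
# objects 1 and 2 share class 1, object 3 belongs to class 2.
instance_to_class = {1: 1, 2: 1, 3: 2}


def to_semantic(mask, id_to_class):
    """Replace each instance id with its class id (background stays 0)."""
    return [[id_to_class.get(value, 0) for value in row] for row in mask]


def to_binary(mask):
    """Mark every labelled pixel as 1 (of interest) and the rest as 0."""
    return [[1 if value != 0 else 0 for value in row] for row in mask]


# Semantic segmentation: pixels of the same category share one label.
semantic_mask = to_semantic(instance_mask, instance_to_class)

# Foreground/background segmentation: pixel of interest or not.
binary_mask = to_binary(instance_mask)
```

Each step discards information: the instance mask can be reduced to the semantic mask, and the semantic mask to the binary mask, but not the other way around.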

A different approach to locating objects is object detection with bounding boxes. Here, the model outputs the smallest box that encloses each object in the visual data.
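As an illustration of what "the smallest box that encloses the object" means, the sketch below derives such a box from a binary mask. It assumes an axis-aligned box written as `(x_min, y_min, x_max, y_max)` in inclusive pixel indices. That convention and the function name `bounding_box` are choices made for this example, and other formats are also in common use.

```python
def bounding_box(mask):
    """Return the smallest axis-aligned box around the nonzero pixels.

    The box is (x_min, y_min, x_max, y_max) with inclusive pixel indices,
    or None if the mask contains no object pixels.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# An L-shaped object on a 4x5 grid (1 = object pixel, 0 = background).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
]

box = bounding_box(mask)  # columns 1..3 and rows 1..3
```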

A model can be asked for a further level of understanding by also requiring it to classify each highlighted region, that is, to predict which category the region belongs to.

Object tracking extends object detection to video data: each object is followed from frame to frame, so that the same object keeps the same identity throughout the video.
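One simple way to picture frame-by-frame tracking is to link each detected box to the most-overlapping box from the previous frame. The sketch below does this greedily using intersection over union (IoU). It is only an illustrative baseline under stated assumptions: boxes are `(x_min, y_min, x_max, y_max)` corner coordinates, and the function names and the `iou_threshold` value are hypothetical. Practical trackers are usually more elaborate.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max).

    Coordinates are treated as continuous corners, so width = x_max - x_min.
    """
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def track(frames, iou_threshold=0.3):
    """Give every box a track id by greedy IoU matching to the previous frame.

    frames: list of frames, each a list of boxes.
    Returns one dict {track_id: box} per frame.
    A track that finds no match in a frame is dropped (a simplification).
    """
    next_id = 0
    previous = {}  # track id -> box in the previous frame
    results = []
    for boxes in frames:
        current = {}
        unclaimed = dict(previous)  # previous tracks not yet matched
        for box in boxes:
            best_id, best_score = None, iou_threshold
            for track_id, prev_box in unclaimed.items():
                score = iou(box, prev_box)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                # No sufficiently overlapping track: start a new one.
                best_id = next_id
                next_id += 1
            else:
                del unclaimed[best_id]
            current[best_id] = box
        results.append(current)
        previous = current
    return results


# Three frames of made-up detections: two objects drift to the right,
# and a third object appears in the last frame.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(2, 0, 12, 10), (51, 50, 61, 60)],
    [(52, 50, 62, 60), (4, 0, 14, 10), (100, 100, 110, 110)],
]
tracks = track(frames)
```

In this toy input the two drifting objects keep ids 0 and 1 across all three frames, even though the detections in the last frame arrive in a different order, and the newly appearing object receives id 2.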

Compared with a mask, a bounding box is a less precise way of denoting the region where an object is, because the box usually also contains pixels that do not belong to the object. It is therefore generally easier for a model to predict accurate bounding boxes than accurate masks.
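A small worked example shows how loose a box can be. The sketch below measures what fraction of the pixels inside a box actually belong to the object. The function name `box_fill_ratio`, the inclusive `(x_min, y_min, x_max, y_max)` convention and the diagonal test object are assumptions chosen for illustration.

```python
def box_fill_ratio(mask, box):
    """Fraction of the pixels inside the box that are object pixels.

    box is (x_min, y_min, x_max, y_max) with inclusive pixel indices.
    """
    x_min, y_min, x_max, y_max = box
    inside = [
        mask[y][x]
        for y in range(y_min, y_max + 1)
        for x in range(x_min, x_max + 1)
    ]
    return sum(1 for value in inside if value) / len(inside)


# A thin diagonal object: its tightest box has to cover the whole grid.
mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
box = (0, 0, 3, 3)

ratio = box_fill_ratio(mask, box)  # 4 object pixels out of 16 box pixels
```

Here only a quarter of the box is object, whereas the mask marks exactly the four object pixels.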

The tasks above are described roughly in order of increasing complexity and difficulty. Datature's platform lets users perform all of these tasks easily with their own custom datasets. Go to Quickstart to get started with your own project, or to Use Cases to see specific examples in action.