This paper strives for pixel-level segmentation of actors and their actions in video content. Different from existing works, which all learn to segme…
Human actions distort WiFi signals, an effect that is widely explored for action recognition, such as elderly fall detection, hand sign …
In this paper, we propose a zoom-out-and-in network for generating object proposals. We utilize different resolutions of feature maps in the network …
In this paper, we address the task of segmenting an object given a natural language expression that references it, \textit{i.e.} a referring expressi…
In this paper, we address referring expression comprehension: localizing an image region described by a natural language expression. While most recen…
Reasoning over plots by question answering (QA) is a challenging machine learning task at the intersection of vision, language processing, and reason…
In this paper, we proposed a bounding box (BB) generator and a positive RoI generator. We showed that the generated RoIs can be used both as an analysis tool (owing to their controllable nature) and as a training method for two-stage object detectors.
Two-stage deep object detectors generate a set of regions-of-interest (RoI) in the first stage, then, in the second stage, identify objects among the…
In this work, we have shown the benefits of using fine-tuning from a close domain in the context of table detection. In addition to the accuracy improvement, this approach avoids overfitting and solves the problem of having a small dataset. Moreover…
A correct localisation of tables in a document is instrumental for determining their structure and extracting their contents; therefore, table detect…
To avoid the exhaustive search over locations and scales, current state-of-the-art object detection systems usually involve a crucial component gener…
Detecting objects in a video is a compute-intensive task. In this paper we propose CaTDet, a system to speedup object detection by leveraging the tem…
We consider the problem of referring image segmentation. Given an input image and a natural language expression, the goal is to segment the object re…
The formation of hypernuclei in hadron-induced reactions and in heavy-ion collisions within a combination of a covariant transport model and a statis…