About Me
Software Engineer with a master’s degree in Deep Learning. I’m currently working as a Software Engineer at NotCo.
I have experience developing with Node.js (Express, Koa, React, React Native, Vue), Python (Django, Flask, PyTorch), Ruby (Ruby on Rails), and C.
I enjoy playing video games, board games and music.
Object attention and contextualization for vision and language navigation
Published in SIBUC UC, 2022
Vision-Language Navigation is a task where an agent must navigate different environments following natural language instructions. This demanding task is usually approached via machine learning methods, training the agent to learn navigation strategies that follow what is said in the instruction and ground it in what is seen in its environment. However, there is still a gap between human performance and current Vision-Language Navigation models. These instructions usually refer to objects present in the agent’s scene, so a proper understanding of what’s around the agent is necessary to know where to go and when to stop. This understanding is left to be learned implicitly from the global features of its vision, which are not designed to do object detection. In this work, we propose methods to include and attend to objects during navigation in recurrent and transformer-based architectures. We achieve a 1.6% improvement over the base models in unseen environments. However, we also see that these models take advantage of the objects to overfit on seen environments, increasing the gap between the validation seen and unseen splits.
Recommended citation: Earle. (2022). Object attention and contextualization for vision and language navigation. https://buscador.bibliotecas.uc.cl/permalink/56PUC_INST/bf8vpj/alma997397024403396
Bridging the Visual Semantic Gap in VLN via Semantically Richer Instructions
Published in ECCV, 2022
The Visual-and-Language Navigation (VLN) task requires understanding a textual instruction to navigate a natural indoor environment using only visual information. While this is a trivial task for most humans, it is still an open problem for AI models. In this work, we hypothesize that poor use of the visual information available is at the core of the low performance of current models. To support this hypothesis, we provide experimental evidence showing that state-of-the-art models are not severely affected when they receive just limited or even no visual data, indicating a strong overfitting to the textual instructions. To encourage a more suitable use of the visual information, we propose a new data augmentation method that fosters the inclusion of more explicit visual information in the generation of textual navigational instructions. Our main intuition is that current VLN datasets include textual instructions that are intended to inform an expert navigator, such as a human, but not a beginner visual navigational agent, such as a randomly initialized DL model. Specifically, to bridge the visual semantic gap of current VLN datasets, we take advantage of metadata available for the Matterport3D dataset that, among other things, includes information about object labels that are present in the scenes. Training a state-of-the-art model with the new set of instructions increases its performance by 8% in terms of success rate on unseen environments, demonstrating the advantages of the proposed data augmentation method.
Flapjack: Data management and analysis for genetic circuit characterization
Published in American Chemical Society, 2021
Characterization is fundamental to the design, build, test, learn (DBTL) cycle for engineering synthetic genetic circuits. Components must be described in such a way as to account for their behavior in a range of contexts. Measurements and associated metadata, including part composition, constitute the test phase of the DBTL cycle. These data may consist of measurements of thousands of circuits, measured in hundreds of conditions, in multiple assays potentially performed in different laboratories and using different techniques. In order to inform the learn phase, this large volume of data must be filtered, collated, and analyzed. Characterization consists of using these data to parametrize models of component function in different contexts, and combining them to predict behaviors of novel circuits. Tools to store, organize, share, and analyze large volumes of measurements and metadata are therefore essential to linking the test phase to the build and learn phases, closing the loop of the DBTL cycle. Here we present such a system, implemented as a web app with a backend data registry and analysis engine. An interactive frontend provides powerful querying, plotting, and analysis tools, and we provide a REST API and Python package for full integration with external build and learn software. All measurements are associated with circuit part composition via SBOL (Synthetic Biology Open Language). We demonstrate our tool by characterizing a range of genetic components and circuits according to composition and context.