I'm an incoming MS ECE student at UCLA. Previously, I was a Predoctoral Researcher at Google Research, India, where I worked on improving image understanding and making image/video generation more efficient under the guidance of Dr. Prateek Jain and Dr. Sujoy Paul. I am broadly interested in multimodal learning, and excited about domain adaptation and few-shot learning.

I'm a B.Tech Computer Science graduate from IIIT-Sri City. Before joining Google, I was a Machine Learning Engineer at Tata Consultancy Services (TCS), Hyderabad, where I built ML models with AutoML toolkits for output explainability. I interned for a semester at LimeChat as an AI software developer, where I helped design contextual chatbots using Level 3 AI. In the summer of 2020, I worked on unsupervised segmentation of fish in challenging underwater scenarios under the guidance of Dr. Brejesh Lall at IIT Delhi. In my sophomore year, I got an opportunity to explore a little bioinformatics by working on RNA secondary structure prediction under the guidance of Dr. S. Satapathy at Tezpur University.

I was the Intel AI Student Ambassador for my institute. As a student ambassador, I helped juniors carry out AI projects using Intel's AI toolkits and conducted sessions on those toolkits. I also worked on a project that reduced the frame rate of videos and then reconstructed the dropped frames, for more efficient (internet) data usage.

When I'm not in front of a computer screen, I am mostly playing my guitar and singing. I am also an avid table tennis player and enjoy reading books in my leisure time.
Currently reading:


Here is my CV [Updated Nov 2023].

Email:
GitHub  /  Google Scholar  /  Twitter  /  LinkedIn


Publications

Bitions@DravidianLangTech-EACL2021 - Ensemble of Multilingual Language Models with Pseudo Labeling for Offence Detection in Dravidian Languages.
Debapriya Tula, Prathyush Potluri, Shreyas MS, Sumanth Doddapaneni, Pranjal Sahu, Rohan Sukumaran, Parth Patwa. Proceedings of the 1st Workshop on Speech and Language Technologies for Dravidian Languages, EACL 2021.
[Paper]   [Code]
European Chapter of the Association for Computational Linguistics (EACL) Workshop, 2021.

We use a soft voting ensemble of multilingual models, viz. Distil-mBERT and ULMFiT, for this shared task hosted at EACL 2021. Our solution ranked 1st on the Malayalam dataset, and 4th and 5th on Tamil and Kannada, respectively.

Estimating RNA Secondary Structure by Maximizing Stacking Regions.
Sen P., Tula D., Ray S.K., Satapathy S.S.
[Paper]   [Code]
International Conference on Computer Communication and Internet of Things (ICCCIoT 2020).

We predict the most stable secondary structure(s) of an RNA sequence using concepts from graph theory to maximize base pairs, leading to minimum entropy structures. Awarded the best paper at ICCCIoT 2020.

Offense Detection in Dravidian Languages using Code-Mixing Index based Focal Loss and Cosine Normalization.
Debapriya Tula, Shreyas MS, Viswanatha Reddy, Pranjal Sahu, Sumanth Doddapaneni, Prathyush Potluri, Rohan Sukumaran, Parth Patwa.
[Paper]   [Code]
SN Computer Science (Journal), 2022.

We introduce a novel code-mixing index (CMI) based focal loss that circumvents two challenges in offence detection for Dravidian languages: code-mixing and class imbalance.

Incorporation of transition to transversion ratio and nonsense mutations, improves the estimation of the number of synonymous and non-synonymous sites in codons.
Suvendra K Ray, Ruksana Aziz, Piyali Sen, Pratyush Kumar Beura, Saurav Das, Debapriya Tula, Madhusmita Dash, Nima Dondu Namsa, Ramesh Chandra Deka, Edward J Feil, Siddhartha Sankar Satapathy.
[Paper]  [Code]
DNA Research (Journal), 2022.

Experience

Aug 2022 - Present
Predoctoral Researcher
  • Working on improving OCR understanding.
  • Worked on improving metrics of the SOTA OCR system for better adaptation to new handwriting styles.
Aug 2021 - July 2022
Machine Learning Engineer
  • Worked on prediction problems concerning user data in the healthcare domain using AutoML libraries.
  • Automated read-write operations on Teradata tables with Bash scripts and exposed them as Django apps.
Jan 2021 - June 2021
NLP Software Development Intern
  • Developed and redesigned some of LimeChat's core chat subsystems.
  • Was involved in the end-to-end setup of the chatbot for Nissan, LimeChat's biggest undertaking to date.
May 2020 - July 2020
Computer Vision Research Intern
  • Designed a deep-learning-based pipeline for segmentation of fish in underwater scenarios.
May 2019 - June 2019
Research Intern
  • Maximized stacking regions to find the most stable secondary structure(s) of RNA using concepts from graph theory.
  • Awarded the best paper at ICCCIoT, 2020.

Teaching Experience

  • Teaching Assistant (Advanced Data Structures & Algorithms) - (Sept 2019 to Dec 2019)
    Assisted 3rd-year undergraduates in solving assignment problems during lab sessions. Framed questions for tutorial sessions and resolved doubts about class lectures.

  • Teaching Assistant (Data Structures & Algorithms) - (Jan 2020 to Apr 2020)
    Assisted 2nd-year undergraduates in solving assignment problems during lab sessions. Conducted tutorials for doubt clarification.

Projects

  • Content Based Image Retrieval
    Applies deep-learning-based computer vision techniques to search for digital images in large databases.

  • Gringotts
    Provides a vault for securely storing secrets such as passwords and keys (GPG/SSH), and for securely transferring data between people.
    [Code]   [Medium]

  • StackOverFlow API-recommender
    Recommends Java APIs for questions asked on Stack Overflow.
    [Code]

  • Speech Dereverberation
    A system that removes reverb (echo) from sound signals by predicting the reverb's contribution to the present signal.
    [Code]

  • Reads For You
    A book recommendation system using user-based collaborative filtering.
    [Code]

This website template is from Jon Barron.