I believe every individual on earth has some amazing ideas and thoughts, which they sometimes pen down. I also believe that the strength of humanity lies in being able to conveniently express and access these ideas. I just work on the convenience part. Things I’ve worked on or am working on are:
I’m really excited about the recent attention-based retrieval methods that demonstrate amazing expressiveness with compositionality (Chen et al. 2020) and the transformer-based language models which can learn to use them (Sukhbaatar et al. 2019). I believe these papers are the first steps in moving towards a new era of ML where we explicitly model an external neural memory, as suggested by Prof Yejin Choi. Here are my slides on what it is, what it is not, why it works, and how it is useful!
Have you ever read statements like “Microsoft’s net worth is 1.02 Trillion dollars …” and struggled to grasp how much a trillion dollars really is? Perhaps if someone told you that Bill Gates’ worth is 100 Billion dollars or that the US GDP is 21.5 Trillion dollars, you’d be able to better comprehend the original statement. The triple code theory for numerical cognition states that besides the verbal and visual cognition systems, we also possess a number line in our mind which lets us reason about new numbers based on other known facts over similar numbers and units. I wished to build a browser extension that could read a webpage and replace these intimidating numbers with simpler comparisons (e.g. replace 1.02 Trillion dollars with 5% of US GDP), but it seems someone already wrote a blog about this. I’m now interested in reconciling language models with numbers, extending upon Spithourakis and Riedel 2019. Here’s a short video (link removed temporarily) I presented at MLSS 2020, a poster (link removed temporarily) I presented at GSS 2020, and a 1-pg abstract (link removed temporarily) accepted at West Coast NLP 2020. Our survey on number representations in NLP was accepted to NAACL 2021. Here’s a preprint link and a short Twitter thread describing it!
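To make the comparison idea concrete, here is a minimal sketch of rewriting a dollar amount relative to a known reference quantity. It is not the browser extension itself: the `REFERENCES` table and the `describe` function are hypothetical names introduced for illustration, and the two reference values are simply the figures quoted above.

```python
# Hypothetical lookup table of reference quantities in US dollars.
# The values are the ones quoted in the paragraph above.
REFERENCES = {
    "US GDP": 21.5e12,
    "Bill Gates' net worth": 100e9,
}


def describe(amount, reference, references=REFERENCES):
    """Express `amount` relative to a named reference quantity.

    Amounts smaller than the reference are shown as a percentage,
    larger ones as a multiple.
    """
    value = references[reference]
    ratio = amount / value
    if ratio < 1:
        return f"{ratio * 100:.0f}% of {reference}"
    return f"{ratio:.1f}x {reference}"


print(describe(1.02e12, "US GDP"))
print(describe(1.02e12, "Bill Gates' net worth"))
```

Choosing which reference to show for a given number (and which unit it belongs to) is the harder part of the problem, and this sketch leaves it to the caller.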
This project is based on a simple realization (simple if you’ve read how transformer language models work): since the only way we add positional information to the tokens in a seq2seq model is through positional embeddings, why do we still stick to assumptions like ‘One position can accommodate only One token’ or ‘Tokens need to be contiguous substrings’? I’ve written about this idea in greater depth on Twitter and as a 1-pg abstract. Update: I’m fortunate to be assisting Deepesh Kumar and both of us are fortunate to be assisted by TG on this project.
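The realization above can be illustrated with a toy embedding layer. This is only a sketch under stated assumptions, not the project's actual model: the table sizes, the random initialisation, and the names `make_table` and `embed` are all hypothetical. The point is that the input embedding is just a token vector plus a position vector, so nothing in the arithmetic forces position ids to be unique or consecutive.

```python
import random

random.seed(0)
DIM = 4  # toy embedding size (assumption)


def make_table(rows, dim=DIM):
    """Build a randomly initialised embedding table as a list of vectors."""
    return [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(rows)]


token_table = make_table(10)    # toy vocabulary of 10 tokens
position_table = make_table(8)  # up to 8 positions


def embed(token_ids, position_ids):
    """Add each token's embedding to the embedding of its assigned position."""
    if len(token_ids) != len(position_ids):
        raise ValueError("each token needs exactly one position id")
    return [
        [t + p for t, p in zip(token_table[tok], position_table[pos])]
        for tok, pos in zip(token_ids, position_ids)
    ]


# Conventional setup: one token per position, positions in order.
standard = embed([3, 5, 7], [0, 1, 2])

# Relaxed setup: tokens 3 and 5 share position 0.
shared = embed([3, 5, 7], [0, 0, 1])
```

In the relaxed call, the first two rows differ only by their token vectors, since both receive the same positional vector.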
An overwhelming majority of information today is in the form of relational tables, while the world has already begun a transition towards Graph Databases (Forbes article). In the Summer of 2019, we participated (and won the third prize) in the IBM-sponsored ISWC challenge of matching Tabular Data to Knowledge Graphs. Here are the summary slides and paper that I presented at ISWC 2019, as well as a blog I wrote about my experience.
Based on my discussion with Dr. Puneet Bindlish (my course instructor for Integrative Intelligence) and active mentorship by Dr. Biplav Srivastava (Researcher and Inventor at IBM New York), I played around with a novel method of evaluating pretrained word vectors with the help of massive word association datasets like the SWOW (Small World of Words). Our work was accepted as a poster at the third RepEval workshop, co-located with the NAACL 2019 conference. Here’s the Link to Paper, a blog and a poster about the idea behind it, as well as a nice GitHub repo to get you started!
Thanks to the culture of competitive programming that takes over Indian universities before the campus placements (which coincided with my PhD application season), I was forced to be simultaneously thinking about Algorithms in the day and Machine Learning in the night. This led me to shamelessly take up another side project in the midst of all the other deadlines I had. I tried to study the tools that Neural Networks possess with respect to those that competitive programmers use as simple programming constructs. Here’s the GitHub repo of my Jupyter notebooks, and a blog about Neural Turing Machines which I learned is an active field of relevant research, as well as another blog on the insights from my experiments.
Have you ever had to undergo the daunting task of making out what people think about a product based on customer reviews? What about movie reviews? Opinion summarization from user-generated content has crucial implications in today’s world. Think of the social media biases that people develop and how such propaganda can easily act as a Trump card in political campaigns. As part of my undergraduate and master’s thesis, I’ve helped form the biggest dataset of labeled opinions for Amazon product reviews, with help from Anubhav and Mayank. Thanks to the efforts of Shreyansh Singh, Avi Chawla and Ayush Sharma, we were also able to develop a bunch of baseline methods to solve the problem statement, e.g. Document Vectors and Implicit Feature Mining.
I, with my supervisor Dr. Anil K Singh and Dr. Julian McAuley, hosted a shared task at IJCNLP 2017 (Taiwan) and a workshop at ACM Hypertext 2018 (Baltimore). As of today, I have successfully defended my Master’s thesis on using embedding-based methods to tackle opinion mining and summarization.
I interned with Dr. Byron C. Wallace at Northeastern University in the summer of 2018, working on analyzing online physician reviews from RateMDs.com. Our work has been accepted at the Machine Learning for Healthcare Conference. We ran into an interesting problem of disentangling topics in word embeddings, which I continue to work on, as a possible solution to automatic tagging of documents for future retrieval. Meta Search is a startup that provides similar solutions as a searchbar for all your files.
I happened to watch a few movies by Richard Linklater - particularly Before Sunrise, which struck a chord with me. I took up filmmaking as a medium of storytelling - to ask my viewers to get out there and talk their heart out, share the beautiful ideas they have, and hear out others’ - that is the essence of living. Since I had no prior experience, I had to teach myself filmmaking over the course of two whole years. Along the way, I scripted and directed a documentary on a warplane on my campus, a comedy sketch video, and a romance short film (unreleased). In the spring of 2018, we released the end goal: the 20-minute short film called Stopping by Woods.
In an otherwise technical workflow, my detour towards films allowed me the good fortune of working with several phenomenally talented and passionate individuals like Mrigank Gaur, Varshan Raj, Sakshi Patil, Harsh Agarwal, Jaseel Muhammed Keloth, Ankur Goel, Alok Priyadarshi, Shubham Shekhar Jha, Visharad Jalan, Aanshi Mehta, and Mayank.
The Moleskine Smart Writing Assistant is a digital pen product that flawlessly converts handwriting to text and images, without any sort of obstructive interference with your intuitive outflow of thoughts. Here is a cool video showing how it works. Trust me, I’ve been looking for such a solution for a few years now and no other product even comes close. I bought it (for a not-so-modest $220) the moment I saw it in action!
I was previously exploring ways to integrate such technology into a more AI-enabled note-taking system, and also to expand its capabilities of digitizing stuff. For instance, imagine sketching mind-maps in your notebook and being able to edit them later in a Slideshow!
Harsh and I set about developing an Augmented Reality smartphone application where people could visualize their designs and let them interact with the real world. Imagine a fashion designer who could see how her new design looks on a mannequin without ever printing the fabrics in the first place! We set several milestones before us and cleared them one by one:
Harsh went on to work on SLAM, a computer vision method we used to implement our Augmented Reality product. Eventually he did an internship at University of Adelaide with Prof. Ian Reid, one of the pioneering researchers in this field, whose very works we were attempting to recreate and deploy. I, on the other hand, shifted gears towards IoT (Internet of Things) and eventually NLP and IR (Natural Language Processing and Information Retrieval).
As a course project with Dr. Hari Prabhat Gupta, I worked on an intuitive way to write digitally. Our Android application allowed one to hold the smartphone like a pen and write characters (in the air) with it, which are then recognized and transcribed to letters of the English alphabet. Imagine taking the most conveniently available personal digital assistant you have (a smartphone) and writing as intuitively with it as you write with a pen.
I used Machine Learning to map sensor readings (accelerometer and gyroscope) to letters of the English alphabet and handled the training of the model, while Robin - my teammate - helped with the Android implementation of it.
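As a rough illustration of mapping sensor readings to letters, here is a minimal nearest-centroid sketch. The model actually used in the project is not described here, so the classifier choice, the per-channel mean features, the function names, and the toy readings below are all assumptions made up for the example.

```python
import math


def features(window):
    """Summarise a window of (ax, ay, az, gx, gy, gz) samples as per-channel means."""
    n = len(window)
    channels = len(window[0])
    return [sum(sample[i] for sample in window) / n for i in range(channels)]


def train(examples):
    """Compute one centroid per letter from (label, window) training pairs."""
    sums, counts = {}, {}
    for label, window in examples:
        feats = features(window)
        if label not in sums:
            sums[label] = [0.0] * len(feats)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], feats)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, window):
    """Return the letter whose centroid is closest to the window's features."""
    feats = features(window)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# Made-up toy readings, not real sensor data.
toy_examples = [
    ("A", [(1, 0, 0, 0, 0, 0), (1, 0, 0, 0, 0, 0)]),
    ("B", [(0, 1, 0, 0, 0, 0), (0, 1, 0, 0, 0, 0)]),
]
centroids = train(toy_examples)
guess = predict(centroids, [(0.9, 0.1, 0, 0, 0, 0)])
```

Real air-writing traces vary in length and speed, so a practical system would likely need richer features than channel means; the sketch only shows the overall shape of the pipeline.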