I’m Avijit Thawani, a Computer Science PhD student at USC. Friends (as if I have any) call me Avi. I work on Representation Learning within Natural Language Processing, with Pedro Szekely and Jay Pujara at the Information Sciences Institute (ISI). I did my undergrad and master’s in Computer Science at the Indian Institute of Technology (IIT BHU), Varanasi. My past mentors include Biplav Srivastava (when at IBM Research, NY) and Byron Wallace (Northeastern University, Boston), as well as the Bixby voice assistant team at Samsung Research. In Summer 2021, I interned at AI2 with Ashwin Kalyan.
Aug 2021: Our short paper was accepted to EMNLP 2021. We showed that Numeracy enhances Literacy in Language Models (or is it Foundation Models now)! TL;DR: Simple changes to number tokenization help models predict words better.
July 2021: Wrapped up my internship with AI2 and wrote a short story about AGI/Blockchain. I’m also learning how to make Chrome browser extensions - starting with https://blocksite.co/, which would otherwise have cost me $11 per month! Here’s a free version for anyone: https://github.com/avi-jit/blocker.
June 2021: I’ll be attending NAACL 2021 and presenting our survey on Number Representations in NLP. I’m also excited to hear more about other awesome papers, such as those described in Sebastian Ruder’s NLP newsletter!
May 2021: We submitted two papers to EMNLP: one’s a revision of an ACL rejection and the other’s a side project with Dipesh Kumar from IIT BHU. I’ve also begun my AI2 internship with Ashwin Kalyan as my mentor. Here’s my intro slide!
Apr 2021: Tragic month in India. In between arranging oxygen for dying relatives and recovering from Covid-19 myself, I tried to visualize the scale of the Indian crisis so that Americans could better comprehend it.
Feb 2021: Volunteered to write a layperson article on human-AI trust for the ISI Communications team.
Jan 2021: Submitted a paper (link removed temporarily) to ACL 2021 on number representations in NLP.
Nov 2020: Submitted a paper (link removed temporarily) to NAACL 2021 on number representations in NLP.
Oct 2020: My (ongoing) work on number representations was accepted at West Coast NLP 2020. Here is the one-page abstract (link removed temporarily). Looking forward to presenting on 30th October 2020.
Sept 2020: We raised funds to cover the registration fees for four Indian undergrads attending EMNLP 2020. In other news, TG, Harsh, and I submitted a proposal to the government of India on identifying Indian vernacular NLP as an emerging technology. Update: Our proposal was unfortunately not selected, but we’d love to hear your feedback, so here’s the link.
June 2020: I’ll be attending MLSS 2020 and ACL 2020. I’ll present my (ongoing) work on number representations (video link removed temporarily) at the former. EDIT: Here’s a conference report by Dr Vered Shwartz on the latter.
April 2020: I’ve been selected to attend MLSS Tübingen: Machine Learning Summer School along with 179 other students (out of 1300+ applicants).
Oct 2019: We ranked third in the IBM-sponsored Table-to-KG matching challenge at the International Semantic Web Conference (ISWC 2019). Here’s the system description paper we wrote, and here are the slides. I also wrote a blog post about my trip to ISWC.
Oct 2019: Selected as a volunteer for TechCrunch Disrupt SF 2019!
Sept 2019: Attended SoCalNLP 2019.
Sept 2019: I won a travel grant to attend WeCNLP 2019 at Facebook HQ, Menlo Park. The view up there is pretty amazing!
July 2019: Attending SIGGRAPH 2019, Los Angeles.
June 2019: Attending ICML 2019, Long Beach.
June 2019: Joined the University of Southern California as a PhD student. I’ll be working with Pedro Szekely and Jay Pujara at the Center on Knowledge Graphs, Information Sciences Institute, Los Angeles. Looking forward to the DARPA Machine Commonsense project. I will be supported by the Annenberg Fellowship!
May 2019: Defended my Master’s thesis on Opinion Mining with word and contextualized embeddings. Bidding adieu to a great five years at IIT BHU :)
Jan 2019: Accepted into the Computer Science PhD programs at the University of Southern California, Los Angeles and Northeastern University, Boston.
Dec 2018: Three amazing job offers from Samsung, Myntra, and Headout.
21st April 2018: My long short film Stopping by Woods is now on YouTube (EDIT: over 50,000 views). Do watch and hit like if you like!
March 2018: In the summer of 2018, I’ll be heading to Northeastern University for an internship under Dr. Byron Wallace’s guidance. See you in Boston!
Feb 2018: We’re done shooting my upcoming short film, (tentatively) titled Stopping by Woods. So excited to begin editing as soon as my mid-semesters end!
Dec 2017: We’re organising the 2nd workshop on Review Opinion Diversification at ACM Hypertext (9-12 July, 2018). See you in Baltimore!