Yucen Li (Lily)
About Me
I am a software engineer at Facebook on the Probabilistic Programming Languages team, where we measure uncertainty and improve interpretability through Bayesian inference. I am passionate about developing algorithms and understanding their behaviors, and I'm currently focused on Markov Chain Monte Carlo (MCMC) methods and Variational Inference. In my free time, I sometimes write poetry.
Research Projects
Facebook
Bean Machine
Facebook Probabilistic Programming Languages
Oct 2019–Present
We developed a probabilistic programming language named Bean Machine, whose declarative style supports programmable inference techniques such as compositional inference, block inference, and custom proposers.
PGM 2020 [pdf]
Facebook
Newtonian Monte Carlo
Facebook Probabilistic Programming Languages
Oct 2019–Present
Newtonian Monte Carlo is a single-site Markov Chain Monte Carlo proposal algorithm that uses both the first- and second-order gradients of the target density to determine a suitable proposal density at each point.
StarAI 2020 [pdf]
Signals for Contribution in Open Source Projects
Socio-Technical Research Using Data Excavation Lab
Fall 2018–Spring 2019
What makes an open source project attractive to newcomers who want to contribute? To answer this question using data-driven experiments, I converted a variety of possible signals into measurable proxies. I then used GitHub repository data to model the number of newcomers as a function of these signals.
CSCW 2019 [pdf]
Multi-Word Expressions in Word Embeddings
Linguistics Lab
Fall 2018–Spring 2019
Does including multi-word expressions in word embeddings improve the cross-lingual mapping? We used a variety of techniques to identify these expressions, tokenized each one as a single word, and compared the resulting multi-word translations.
EMNLP 2020 [pdf]
Languages for Cross-lingual Dependency Parsing
Independent Study
Spring 2018
Given a specific target language, we wanted to use typological features to identify the optimal combination of languages to train on. Through my linguistic analysis, I determined that the most relevant predictor of cross-lingual success is the adherence of the training dataset to the universally transferable standards.
[pdf]
Domain-Specific Language for NP-Complete Problems
15-418: Parallel Computer Architecture and Programming
Fall 2018
NP-complete problems can be difficult to implement in parallel, so we propose a language for solving and parallelizing them. The language provides a simple form for representing NP-complete problems, automatically determines the full solution space, and manages the search across different threads.
[pdf]
Automatic Arithmetic Word Problem Solving
10-701: Introduction to Machine Learning
Spring 2018
The best-performing method, which matched numbers to equation templates, performed poorly on problems that included irrelevant information. To combat this, I developed a classifier to find the irrelevant text, which I then removed from the problem.
[pdf]
Software Engineering
Facebook
Menlo Park, CA
Aug 2019–Present
I am one of the main developers of Bean Machine, our probabilistic programming language, and I have implemented many of its core MCMC inference algorithms, such as the No-U-Turn Sampler. I also help develop our own MCMC method, Newtonian Monte Carlo, which estimates a Gaussian distribution to match the curvature around the current sample.
Instagram
New York, NY
Summer 2018
I worked on the Explore product team to add full support for tagging users in videos. This included designing and implementing the tagging workflow as well as modifying server-side configurations. I also worked on the release and user testing, which required managing privacy issues.
Facebook
Facebook
Menlo Park, CA
Summer 2017
I worked on the Mobile Interface Health team to classify every HTTP request from the Android app and detail the volume of different categories of network traffic. I also streamlined the data logging process by eliminating 3000+ lines of server-side boilerplate code through the automatic population of table columns.
Hyland
Cleveland, OH
Summer 2016
I was part of the placeholder team on the OnBase content management system, where I optimized the Microsoft Word plugin for placeholders. I worked on a complete redesign of placeholder creation, which included support for bulk creation.