I am Hong Jin, a PhD candidate at Singapore Management University in the SOAR group. I am very fortunate to be advised by Prof. David Lo and to collaborate with and learn from many brilliant and talented researchers.

Software is eating the world, but there is an incoming software apocalypse. We desperately need better ways of understanding software systems. My research aims to develop approaches for designing and mining abstractions that address important Software Engineering problems. During my PhD, my research has focused on bugs and vulnerabilities related to APIs and third-party components, e.g., libraries.

I am on the job market (my PhD journey will end this year)!

[Resume] [Google Scholar] [DBLP]

Mining abstractions. I am interested in mining task-specific abstractions (e.g. specifications, rules, features) of programs. We mined behavioral models from traces generated using search-based test generation [TOSEM 2020] and inferred sandbox rules (specified as Alloy models) for an IoT system [ICST 2021].

Related to the abstractions of programs used in Machine Learning for Software Engineering, we analyzed the generalizability of representative token-based embeddings [ASE 2019]. For filtering false alarms from static analysis, we addressed methodological issues [ICSE 2022 (Poster)] and proposed a new method for call graph pruning [FSE 2022].

Library and API usage. Modern software development involves the heavy use of third-party systems. We are interested in improving the use of third-party systems (libraries and APIs). For managing library vulnerabilities, we have:

  1. extracted features using pre-trained transformer models to automatically identify the libraries described in NVD entries (to be deployed by our industrial partner, Veracode) [ICPC 2022],
  2. combined information from multiple sources to identify possible vulnerabilities based on the commit history of open-source projects [SANER 2022], and
  3. generated test cases for programs to assess the exploitability of library vulnerabilities [ISSTA 2022].

For managing usage of APIs, we developed an API misuse detector that uses active learning to learn subgraph patterns from programs across GitHub [TSE 2022].

Program transformation. We developed Coccinelle4J [ECOOP 2019] for program transformation of Java programs, extending the Coccinelle program matching and transformation tool. Coccinelle is widely used in the Linux kernel and other C systems software. There are many lessons we can learn from Coccinelle for building better tools.

Building on Coccinelle4J, we have worked on the migration of deprecated Android API methods [ICSME 2019, ICPC ERA, EMSE 2022]. Other work includes machine learning on code changes/commits [ICSE 2020, SANER 2022].

Better research on program transformation will help automate more programming.