I am a PhD student at the University of Exeter, specializing in Large Language Models, Genomic Foundation Models, and Aspect-Based Sentiment Analysis. My research focuses on developing computational methods for biological sequence modeling and sentiment analysis. I lead open-source initiatives with over 1.5 million downloads and have published in top-tier venues such as Nature Machine Intelligence.
Exploring the frontiers of AI, Genomics, and Software Engineering
Innovating with LLMs by creating frameworks like InstOptima for multi-objective instruction evolution and RNADesign-GRPO for reinforcement learning-based RNA sequence design.
Pioneering AI4Science research by developing Genomic Foundation Models (e.g., OmniGenome) to tackle challenges like sequence sparsity and SNVs, significantly improving RNA structure prediction and sequence design.
Creator of PyABSA, a widely used open-source toolkit for fine-grained sentiment analysis, supporting over 30 models and datasets and simplifying the research-to-application pipeline.
Investigating the intersection of adversarial robustness and model fairness, revealing that attacks can mitigate bias and that adversarial training can form a Pareto front between accuracy and fairness.
Developing the LMDP framework, which leverages Pre-trained Language Models for more accurate, line-level software defect prediction, outperforming traditional AST/GNN-based methods.
Author of BoostAug, a novel text augmentation technique that uses global feature distribution and instance filtering to consistently improve performance across various NLP tasks.
Contributing to top-tier conferences and journals
Neurocomputing (CCF-C)
State-of-the-art models and interactive demos
A top-performing, lightweight model for fine-grained sentiment analysis. Featured in the Stanford AI Index Report 2022 as the leading open-source ABSA model.
Revolutionary sequence-structure alignment model that dramatically improved RNA design success rates from 3% to 84%, setting new standards in computational biology.
An interpretable foundation model for discovering functional RNA motifs in plants, advancing agricultural biotechnology and crop improvement research. Published in Nature Machine Intelligence.
Advanced multi-species RNA foundation model with calibrated secondary structure prediction capabilities, enabling cross-species RNA analysis.
A comprehensive, interactive demo hub for ABSA. Featured as an official demo by Gradio-Blocks.
The official online leaderboard, allowing researchers to easily trial the benchmark framework.
An interactive demo for textual adversarial attack and defense.
An interactive demo for anime image super-resolution using diffusion models.
Open-source tools making AI accessible
Modularized framework for reproducible aspect-based sentiment analysis with pre-trained models and comprehensive benchmarks.
Automated large-scale benchmarking framework for genomic foundation models, enabling comprehensive evaluation across multiple tasks and datasets.
Intelligent file searching tool with advanced filtering and pattern matching capabilities.
Comprehensive visualization toolkit for machine learning metrics and model performance analysis.
Data augmentation library for boosting machine learning model performance with intelligent augmentation strategies.
Interpretable RNA foundation model for exploring functional RNA motifs in plants, published in Nature Machine Intelligence.
Evolutionary multi-objective instruction optimization via LLM-based instruction operators.
A framework to investigate the effects of adversarial attacks on alleviating model bias.
Leveraging language models for code defect prediction at the line level.
Evolutionary multi-task injection testing on Web Application Firewalls (WAFs).
Open to collaborations and discussions
University of Exeter
Computer Science Department
Academic: hy345@exeter.ac.uk
Personal: yangheng2021@gmail.com
Innovation Centre Phase 1
Exeter, EX4 4RN, United Kingdom