Resume
I am passionate about developing LLM/NLP systems that are practical, robust, efficient, and applicable in industry. I have both research and industry experience in NLP and Data Science.
⬇️ Download my Resume here
📚 Education
University of Technology Sydney
Ph.D. student in School of Computer Science, UTS NLP Group ⏲️ Mar 2021 - Present
- Natural language processing / Knowledge updating in Large Language Models (LLMs)
- Supervisor: Prof. Ling Chen
University of Melbourne
Master in Software Engineering ⏲️ Jul 2018 - Dec 2020
- GPA: 83/100 (Top-5%, First Class Honours / High Distinction)
- Awards:
- Dean’s Honour List, 2019 & 2020
- Liz Haywood Awards for Best Software Engineering Team, 2020
- Activities:
- Member of Computing & Information Systems Students Association (CISSA)
University of British Columbia
Summer Exchange Program in Electrical and Computer Engineering (ECE) ⏲️ Jul 2017 - Aug 2017
- Algorithms and data structures
- Web design and programming
China Pharmaceutical University
Bachelor in Information System and Information Management ⏲️ Sep 2014 - Jun 2018
- GPA: 82/100 (Ranking: 25/143)
- Awards:
- Excellent Volunteer (2015-2016)
- Second-Class Scholarship (2016-2017)
- Activities:
- Chairman of the Faculty of Science Student Union (2016-2017)
📑 Publications
MedINST: Meta Dataset of Biomedical Instructions
Wenhan Han, Meng Fang, Zihan Zhang, Yu Yin, Zirui Song, Ling Chen, Mykola Pechenizkiy, Qingyu Chen
Conference on Empirical Methods in Natural Language Processing (EMNLP, Findings), 2024
RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering
Zihan Zhang, Meng Fang, Ling Chen
Annual Meeting of the Association for Computational Linguistics (ACL, Findings), 2024
How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances
Zihan Zhang*, Meng Fang*, Ling Chen, Mohammad-Reza Namazi-Rad, Jun Wang
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Turn-Level Active Learning for Dialogue State Tracking
Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
CITB: A Benchmark for Continual Instruction Tuning
Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad
Conference on Empirical Methods in Natural Language Processing (EMNLP, Findings), 2023
Is Neural Topic Modelling Better than Clustering? An Empirical Study on Clustering with Contextual Embeddings for Topics
Zihan Zhang, Meng Fang, Ling Chen, Mohammad-Reza Namazi-Rad
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022
👩🏻‍💻 Work Experience
Student Researcher
TPG Telecom Sydney, Australia ⏲️ Mar 2021 - Dec 2023
As part of the Data and Analytics Centre of Excellence (CoE) team, I transformed large volumes of data into actionable insights.
- NPS topic modelling - modelled customers’ feedback/comments and derived insights to improve customer service satisfaction
- Market offer text extraction - automatically extracted and analysed competitors’ offers, transforming raw unstructured data into structured data that the market team could use
- Webchat & Call Centre dialogue analysis - preliminary study on raw dialogue data between agents and customers
- Postpay/Prepay/FWA customer insights analysis - modelled and analysed customer churn and upgrades
- Cloud experience - worked with cloud services; familiar with data ETL and DevOps on AWS and Databricks
Front-End Software Developer Intern
RESORTer Melbourne, Australia ⏲️ Nov 2019 - Mar 2020
I was responsible for refactoring and developing the Lesson Section in the Web application.
- Refactored the Lesson Section using React + Hooks + Material-UI. Used the Grid layout and Card component to render different kinds of lessons, which simplified the rendering logic that had been a serious issue under the Tabs system.
- Managed the global state using Redux and created default lessons for users based on the form they filled in. I also worked with my team on middleware that catches and handles certain actions so that the generated lessons are always consistent with the global state, thereby improving the user experience.
- Used CSS Modules to avoid class name collisions and global style pollution. Used lazy loading to dynamically import required components, thereby improving performance.
🏗️ Projects
🔧 Algorithms in Action
March 2020 - October 2020
Try AiA demo here: https://algorithms-in-action.github.io/
Github: https://github.com/algorithms-in-action/algorithms-in-action.github.io
An algorithm visualization Web application for first-year Computer Science students. I was responsible for implementing the pseudocode and algorithm animation.
- Used JavaScript function closures to store all the visualization API functions and their corresponding variables in an array so they can be executed later. This solved the problem we had with ES6 Generators, where function executions cannot be reversed, so the animation can step backward as well.
- To map the algorithm pseudocode to the actual code, I parsed the pseudocode and added a bookmark to each line, then inserted the bookmarks at the corresponding positions in the actual code, so the pseudocode and animation stay synchronized.
- Implemented a custom useInterval hook so that the auto-play function can read fresh state between renders. The hook also detects speed changes and resets the setInterval function, which makes the playback speed adjustable.
- Based on the visualization APIs provided by Tracer.js, I implemented some common components and functions and expanded the library as well.
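The closure-based stepping described above can be illustrated with a minimal sketch. It is not the AiA implementation: the names StepPlayer, record, and buildTraversalDemo are hypothetical, and stepping backward is done here by replaying the stored closures from the initial state, which is one simple way to get reversibility.

```typescript
// Hypothetical sketch: none of these names come from the AiA codebase.
// A step is a closure that maps the current view state to the next one.
type Step<T> = (state: T) => T;

class StepPlayer<T> {
  private readonly initial: T;
  private readonly steps: Step<T>[] = [];
  private cursor = 0; // number of steps currently applied

  constructor(initial: T) {
    this.initial = initial;
  }

  // Store the closure; it is not executed until the player reaches it.
  record(step: Step<T>): void {
    this.steps.push(step);
  }

  // Rebuild the state by replaying the first `count` stored closures.
  private stateAt(count: number): T {
    let state = this.initial;
    for (const step of this.steps.slice(0, count)) {
      state = step(state);
    }
    return state;
  }

  next(): T {
    this.cursor = Math.min(this.cursor + 1, this.steps.length);
    return this.stateAt(this.cursor);
  }

  // Unlike a generator, the stored closures can be replayed, so we can go back.
  prev(): T {
    this.cursor = Math.max(this.cursor - 1, 0);
    return this.stateAt(this.cursor);
  }
}

// Example: record one "visit" step per node; each closure captures its own `node`.
function buildTraversalDemo(): StepPlayer<string[]> {
  const player = new StepPlayer<string[]>([]);
  for (const node of ["A", "B", "C"]) {
    player.record((visited) => [...visited, node]);
  }
  return player;
}
```

Calling next() twice on the demo player and then prev() once returns to the state after the first step, which is the behaviour a "step backward" button needs.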
✏️ Distributed Shared Whiteboard
August 2019 - October 2019
Github: https://github.com/ZhangzihanGit/Distributed-Shared-Whiteboard-Application
A shared whiteboard desktop application that allows multiple users to draw shapes and chat at the same time. I was responsible for developing the client and server GUI.
- The project used Java 8 as the backend language and JavaFX as the frontend framework, with a three-tier Client/Server architecture that separated the client, the whiteboard server, and the data server.
- Java RMI was used for communication between the whiteboard server and the data server; requests sent from the client were invoked remotely on the whiteboard server. To synchronize the clients, MQTT was used to provide a publish/subscribe protocol. The whiteboard server acted as an intermediary that accepted messages from each client and published them to all other subscribers.
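The relay role of the whiteboard server can be sketched as follows. This is an in-memory stand-in for illustration only: it does not use MQTT or Java RMI, and the names WhiteboardBroker, DrawEvent, subscribe, and publish are hypothetical rather than taken from the project.

```typescript
// Hypothetical in-memory sketch of the publish/subscribe relay; no MQTT library is used.
type DrawEvent = { clientId: string; shape: string };
type Handler = (event: DrawEvent) => void;

class WhiteboardBroker {
  private readonly subscribers: { clientId: string; handler: Handler }[] = [];

  // Register a client that wants to receive drawing updates.
  subscribe(clientId: string, handler: Handler): void {
    this.subscribers.push({ clientId, handler });
  }

  // Forward an event to every subscriber except the client that sent it.
  publish(event: DrawEvent): void {
    for (const subscriber of this.subscribers) {
      if (subscriber.clientId !== event.clientId) {
        subscriber.handler(event);
      }
    }
  }
}
```

Skipping the sender keeps a client from redrawing a shape it has already drawn locally; whether the original application filtered on the server or on the client is not stated in the description above.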
📐 Guttman Chart Analysis System
Aug 2019 - Nov 2019
Github: https://github.com/ZhangzihanGit/Guttman-Chart-Analysis-System
A Guttman-chart-based student assessment analysis system. It can be used to help educators find students’ Zones of Personal Development (ZPD) and adjust future teaching plans.
- The project provided support for the 🔗 research in the Assessment Research Centre, Melbourne Graduate School of Education.
- I was responsible for developing the frontend pages and integrating them with the backend in collaboration with the backend developers. The project used Python as the backend programming language, adopted the Client/Server model, and used a RESTful API for HTTP communication.
☀️ Academic Services
Peer Reviewer:
- ACL 2023, EMNLP 2022-2023, EACL 2023
- ACL Rolling Review 2023-2024
- NeurIPS 2024, ICLR 2025
⭐ Skills & Certificates
Languages: Mandarin (native), English (fluent)
Programming: Python > SQL == JavaScript > Spark == Java > C == C++ == Haskell
Libraries & Services: PyTorch, HuggingFace, Scikit-learn and Pandas, AWS (S3, SageMaker, Redshift), Databricks
Software & Tools & Management: Git, Linux, Docker, Agile, Scrum, Confluence, Jira, LaTeX
Web Dev: React, Ant Design, Material-UI, HTML, CSS
Computer Network: basic HTTP, TCP/IP, cryptography, web security
Certificates: