The Cognitive Vision in Robotic Surgery Lab is developing computer vision and AI techniques for intraoperative navigation and real-time tissue characterisation.

Head of Group

Dr Stamatia (Matina) Giannarou

411 Bessemer Building
South Kensington Campus

+44 (0) 20 7594 8904

What we do

Surgery is undergoing rapid change, driven by recent technological advances and the ongoing pursuit of early intervention and personalised treatment. We are developing computer vision and artificial intelligence (AI) techniques for intraoperative navigation and real-time tissue characterisation during minimally invasive and robot-assisted operations, to improve both the efficacy and safety of surgical procedures. Our work aims to revolutionise the treatment of cancers and pave the way for autonomous robot-assisted interventions.

Why is it important?

With recent advances in medical imaging, sensing, and robotics, surgical oncology is entering a new era of early intervention, personalised treatment, and faster patient recovery. The main goal is to remove cancerous tissue completely while minimising damage to the surrounding healthy tissue. Achieving this balance is difficult: resections are often imprecise, re-excision rates remain high, and unintended injury to healthy tissue reduces quality of life. Technologies that enhance cancer detection and enable more precise surgery can therefore improve patient outcomes.

How can it benefit patients?

Our methods aim to ensure that patients receive accurate and timely surgical treatment while reducing surgeons' mental workload, overcoming human limitations, and minimising errors. By improving tumour excision, our hybrid diagnostic and therapeutic tools aim to lower recurrence rates and improve survival. More complete tumour removal will also reduce the need for repeat procedures, improving patients' quality of life and life expectancy and benefiting society and the economy.

Citation

BibTeX format

@article{Lo:2024:10.1109/JBHI.2024.3417280,
author = {Lo, FP-W and Qiu, J and Wang, Z and Chen, J and Xiao, B and Yuan, W and Giannarou, S and Frost, G and Lo, B},
doi = {10.1109/JBHI.2024.3417280},
journal = {IEEE J Biomed Health Inform},
title = {Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis.},
url = {http://dx.doi.org/10.1109/JBHI.2024.3417280},
volume = {PP},
year = {2024}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB - Conventional approaches to dietary assessment are primarily grounded in self-reporting methods or structured interviews conducted under the supervision of dietitians. These methods, however, are often subjective, potentially inaccurate, and time-intensive. Although artificial intelligence (AI)-based solutions have been devised to automate the dietary assessment process, prior AI methodologies tackle dietary assessment in a fragmented landscape (e.g., merely recognizing food types or estimating portion size), and encounter challenges in their ability to generalize across a diverse range of food categories, dietary behaviors, and cultural contexts. Recently, the emergence of multimodal foundation models, such as GPT-4V, has exhibited transformative potential across a wide range of tasks (e.g., scene understanding and image captioning) in various research domains. These models have demonstrated remarkable generalist intelligence and accuracy, owing to their large-scale pre-training on broad datasets and substantially scaled model size. In this study, we explore the application of GPT-4V powering multimodal ChatGPT for dietary assessment, along with prompt engineering and passive monitoring techniques. We evaluated the proposed pipeline using a self-collected, semi free-living dietary intake dataset comprising 16 real-life eating episodes, captured through wearable cameras. Our findings reveal that GPT-4V excels in food detection under challenging conditions without any fine-tuning or adaptation using food-specific datasets. By guiding the model with specific language prompts (e.g., African cuisine), it shifts from recognizing common staples like rice and bread to accurately identifying regional dishes like banku and ugali. Another standout feature of GPT-4V is its contextual awareness. GPT-4V can leverage surrounding objects as scale references to deduce the portion sizes of food items, further facilitating the process of dietary assessment.
AU - Lo,FP-W
AU - Qiu,J
AU - Wang,Z
AU - Chen,J
AU - Xiao,B
AU - Yuan,W
AU - Giannarou,S
AU - Frost,G
AU - Lo,B
DO - 10.1109/JBHI.2024.3417280
PY - 2024///
TI - Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis.
T2 - IEEE J Biomed Health Inform
UR - http://dx.doi.org/10.1109/JBHI.2024.3417280
UR - https://www.ncbi.nlm.nih.gov/pubmed/38900623
VL - PP
ER -

Contact Us

The Hamlyn Centre
Bessemer Building
South Kensington Campus
Imperial College London
London SW7 2AZ