In this digital age, advancements in artificial intelligence (AI) have brought about both great opportunities and significant challenges. One of these challenges is the protection of personal data, particularly digital images, which can be exploited by AI technologies. This proposal addresses the issue by developing solutions that safeguard the digital rights of individuals and protect their creations from potential misuse by AI. It offers a 'cloak of invisibility' for digital images, rendering them unexploitable by AI while retaining their visual appeal for human observers. The project aims to return control to individuals, ensuring the protection of their art and their privacy in the digital world.
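Conceptually, such a 'cloak' is an imperceptible, bounded perturbation that shifts an image's machine-readable features while leaving its appearance essentially unchanged to a person. The following is only a minimal sketch of that general idea, not the project's method; it assumes a generic PyTorch feature extractor (a randomly initialised ResNet-18 stands in here) and a simple gradient-ascent loop under an epsilon pixel budget:

```python
# Minimal sketch of image "cloaking": add an imperceptible perturbation that
# pushes an image's AI feature representation away from the original while
# keeping every pixel within an epsilon budget. Illustrative only; the
# project's actual method is not described here.
import torch
import torchvision.models as models

feature_extractor = models.resnet18(weights=None)  # stand-in for any AI model
feature_extractor.eval()

def cloak(image, epsilon=4 / 255, steps=50, lr=1 / 255):
    """Return a visually similar image whose features differ from the original."""
    with torch.no_grad():
        original_features = feature_extractor(image)
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        cloaked_features = feature_extractor(image + delta)
        # Maximise the feature-space distance from the original image.
        loss = torch.nn.functional.mse_loss(cloaked_features, original_features)
        loss.backward()
        with torch.no_grad():
            delta += lr * delta.grad.sign()                   # gradient-ascent step
            delta.clamp_(-epsilon, epsilon)                   # keep change imperceptible
            delta.copy_((image + delta).clamp(0, 1) - image)  # stay a valid image
        delta.grad.zero_()
    return (image + delta).detach()

cloaked = cloak(torch.rand(1, 3, 224, 224))  # dummy image in [0, 1]
```

The small epsilon bound is what keeps the cloaked image visually indistinguishable to humans while disrupting the features an AI model extracts.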
This research/project is supported by the National Research Foundation, Singapore under the AI Singapore Programme (AISG Award No: AISG3-GV-2023-011).
ZEASN Technology has been a global leader in smart TV solutions since 2011 and is headquartered in Singapore with a strong global presence. ZEASN's flagship product, Whale OS, powers 90 million devices globally for over 300 brands. The collaborative research between SMU and ZEASN Technology Pte Ltd is dedicated to developing an advanced Web 3.0 creative media content ecosystem. Emphasizing critical aspects such as tokenomics, incentive design, and privacy-enhancing computation, the project's primary goal is to construct a future-proof digital framework that is user-friendly and secure, and that maximizes user participation, privacy, and profit. Anticipated outcomes include a robust, efficient, and scalable Web 3.0 creative media content ecosystem that maintains user privacy while fostering a dynamic, tokenomics-driven creative space. This comprehensive approach seeks to revolutionize how creative media is created, shared, and monetized, empowering users and content creators in the digital era. Leveraging combined expertise from economics, computer science, and digital media, the team aims to design an ecosystem aligned with the values of the Web 3.0 vision: decentralized, user-centric, and privacy-preserving. An early harvest of this collaboration is addressing key challenges in the century-old film industry, with plans for a Web3-powered virtual cinema on ZEASN's worldwide Whale OS CTVs, aiming to decentralize film distribution and monetization in a transparent and rewarding fashion.
The global fintech landscape is undergoing a fundamental shift, driven in part by advanced AI techniques. This project aims to: (i) understand the inner workings of diverse investment systems to assess their transaction patterns; (ii) create algorithms that decode fintech data, offering insights and aiding in market behavior predictions; and (iii) leverage optimization and AI methods to enhance trading and transaction systems.
This project, led by A/Prof Iris Rawtaer (SKH), aims to utilise multimodal sensor networks for early detection of cognitive decline. Under this project, the SKH and NUS team will oversee project operations, screening and recruitment, psychometric evaluation, data analysis, data interpretation, reporting, and the answering of the clinical research hypotheses. The SMU team will collaborate with SKH and NUS to provide technical expertise for this study by ensuring safe implementation and maintenance of the sensors in the homes of the participants, providing the sensor-obtained data to the clinical team, and applying artificial intelligence methods for predictive modelling.
This project is set to advance the security landscape of emerging Web3 technologies, including pattern- and model-based fraud detection and knowledge graph-based reasoning, in order to address the various issues and irregularities in the Web3 domain and establish a comprehensive set of compliance standards.
This is a project under the AI Singapore 100 Experiments Programme. The project focuses on resource management in the healthcare industry, where there are complex relationships not just among the various manpower types (doctors, nurses) but also with patient lifecycle lead times, geo-location, and the medical equipment and facilities needed for surgeries and patient care. Manpower shortages have given rise to conservative, static long-term planning solutions that do not consider these upstream data flows. In today's post-COVID world, this project could offer new solutions to the manpower allocation and development problem, especially when demand changes acutely. The project sponsor, BIPO Service (Singapore) Pte Ltd, believes that an AI-driven, short-input-to-output-cycle HR system streaming in "demand"-pulled patient lifecycle data can inform allocation and skills development not only for the full-time but also the part-time workforce.
This research/project is supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG2-100E-2023-118).
Most conversational systems today are not very good at adapting to new or unexpected situations when serving end users in a dynamic environment. Models trained on fixed training datasets often fail in practical application scenarios. Existing methods for the fundamental task of conversation understanding rely heavily on training slot-filling models with a predefined ontology. For example, given an utterance such as “book a table for two persons in Blu Kouzina,” the models classify it into one of the predetermined intents, such as book-table, and predict specific values such as “two persons” and “Blu Kouzina” to fill the predefined slots number_of_people and restaurant_name, respectively. The agent’s inherent conversation ontology comprises these intents, slots, and corresponding values. When end users say things outside of the predefined ontology, the agent tends to misunderstand the utterance, which can cause critical errors. The aim of this project is to investigate how conversational agents can proactively detect new intents, slots, and values, and expand their conversation ontology on the fly to better handle unseen situations during deployment.
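To make the closed-ontology setting concrete, the sketch below shows what a predefined ontology and a slot-filling prediction for the example utterance might look like; the extra intents, slot values, and the is_in_ontology helper are illustrative placeholders, not part of the project:

```python
# Illustrative sketch (not the project's implementation) of a predefined
# conversation ontology and the kind of output a slot-filling model produces
# for the utterance "book a table for two persons in Blu Kouzina".
ontology = {
    "intents": ["book-table", "cancel-booking", "ask-opening-hours"],
    "slots": {
        "number_of_people": ["one person", "two persons", "three persons"],
        "restaurant_name": ["Blu Kouzina", "Candlenut", "Odette"],
    },
}

prediction = {
    "intent": "book-table",
    "slots": {
        "number_of_people": "two persons",
        "restaurant_name": "Blu Kouzina",
    },
}

def is_in_ontology(pred, onto):
    """Check whether a prediction falls inside the closed ontology.

    Utterances outside this closed set (a new intent, slot, or value) are
    where a statically trained model tends to fail; detecting such items and
    adding them on the fly is what the project investigates.
    """
    return pred["intent"] in onto["intents"] and all(
        slot in onto["slots"] and value in onto["slots"][slot]
        for slot, value in pred["slots"].items()
    )

print(is_in_ontology(prediction, ontology))  # True: covered by the ontology
```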
State-of-the-art visual perception models in autonomous vehicles (AVs) fail in the physical world when they encounter adversarially designed physical objects or environmental conditions. The main reason is that they are trained on discretely sampled data and can hardly cover all possibilities in the real world. Although effective, existing physical attacks consider only one or two physical factors and cannot jointly simulate dynamic entities (e.g., moving cars or persons, street structures) and environmental factors (e.g., weather and light variation). Meanwhile, most defense methods, such as denoising or adversarial training (AT), rely mainly on single-view or single-modal information, neglecting the multi-view cameras and different modality sensors on the AV, which contain rich complementary information. These challenges in both attacks and defenses stem from the lack of a continuous and unified scene representation for AV scenarios. Motivated by these limitations, this project first aims to develop a unified AV scene representation based on neural implicit representations to generate realistic new scenes. With this representation, we will develop extensive physical attacks, multi-view and multi-modal defenses, as well as a more complete evaluation framework. Specifically, the project will build a unified physical attack framework against AV perception models, which can adversarially optimize physical-related parameters and generate more threatening examples that could occur in the real world. Furthermore, the project will build multi-view and multi-modal defensive methods, including a data reconstruction framework that reconstructs clean inputs and a novel 'adversarial training' method, i.e., adversarial repairing, which enhances the robustness of deep models with the guidance of collected adversarial scenes. Finally, a robustness-oriented explainable method will be developed to understand the behaviors of visual perception models under physical adversarial attacks and robustness enhancement.
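As a rough illustration of the attack side only, the sketch below optimises a vector of physical scene parameters (e.g., lighting or object pose) to maximise a perception model's loss on the rendered scene. It assumes a differentiable renderer; both the renderer and the model here are toy stand-ins, and this is not the project's actual framework:

```python
# Hedged sketch of adversarially optimising physical scene parameters against
# a perception model, assuming a differentiable renderer. The `render` and
# `model` objects below are hypothetical stand-ins for illustration only.
import torch

def physical_attack(render, model, params, labels, steps=100, lr=0.01):
    """Gradient-based search over physical parameters that maximises the
    perception model's classification loss on the rendered scene."""
    params = params.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([params], lr=lr)
    for _ in range(steps):
        scene_image = render(params)  # differentiable rendering of the scene
        # Minimising the negative cross-entropy ascends the model's loss.
        loss = -torch.nn.functional.cross_entropy(model(scene_image), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return params.detach()

# Toy stand-ins so the sketch runs end to end.
def render(p):
    """Toy differentiable 'renderer' mapping parameters to a 3x32x32 image."""
    return torch.sigmoid(p).reshape(1, 3, 32, 32)

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
adv_params = physical_attack(render, model, torch.zeros(3 * 32 * 32), torch.tensor([0]))
```

Because the parameters being optimised describe the physical scene rather than raw pixels, the resulting perturbations correspond to conditions that could plausibly occur in the real world.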
This project will pioneer approaches that realize trusted automation bots acting as concierges and interactive advisors to software engineers to improve their productivity as well as software quality. TrustedSEERs will realize such automation by effectively learning from domain-specific, loosely linked, multi-modal, multi-source, and evolving software artefacts (e.g., source code, version history, bug reports, blogs, documentation, Q&A posts, and videos). These artefacts can come from the organization deploying the automation bots, from a group of collaborating yet privacy-aware organizations, and from freely available yet possibly licensed (e.g., GPL v2, GPL v3, MIT) data contributed by many, including untrusted entities, on the internet. TrustedSEERs will bring about the next generation of Software Analytics (SA) – a rapidly growing research area in the Software Engineering field that turns data into automation – by establishing two initiatives. First, data-centric SA, through the design and development of methods that can systematically engineer (link, select, transform, synthesize, and label) the data needed to learn more effective SA bots from diverse software artefacts, many of which are domain-specific and unique. Second, trustworthy SA, through the design and development of mechanisms that can engender software engineers' trust in SA bots, considering both intrinsic factors (explainability) and extrinsic ones (compliance with privacy and copyright laws and robustness to external attacks). In addition, TrustedSEERs will apply its core technologies to synergistic applications to improve engineer productivity and software security.
Consumers have widely adopted conversational AI systems such as Siri, Google Assistant, and now ChatGPT. The next generation of conversational AI systems will have visual understanding capabilities, communicating with users through both language and visual data. A core technology that enables such multimodal, human-like AI systems is visual question answering: the ability to answer questions based on information found in images and videos. This project focuses on visual question answering and aims to develop new visual question answering technologies based on large-scale pre-trained vision-language models. Pre-trained models developed by tech giants, particularly OpenAI, have made headlines in recent years, e.g., ChatGPT, which can converse with users in human language, and DALL-E 2, which can generate realistic images. This project aims to study how to best utilise large-scale pre-trained vision-language models for visual question answering. The project will systematically analyse these pre-trained models in terms of their capabilities and limitations in visual question answering, and design technical solutions to bridge the gap between what pre-trained models can accomplish and what visual question answering systems require. The end result of the project will be a new framework for building visual question answering systems on top of existing pre-trained models with minimal additional training.
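As an illustration of the starting point, the sketch below answers a question about an image with one publicly available pre-trained vision-language model (BLIP) via the Hugging Face transformers library, with no task-specific training; the image path and question are placeholders, and this is not the project's framework:

```python
# Minimal sketch: zero-shot visual question answering with a pre-trained
# vision-language model (BLIP). "photo.jpg" and the question are placeholders.
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

image = Image.open("photo.jpg").convert("RGB")
question = "How many people are in the picture?"

# Encode the image-question pair, then generate a free-form answer.
inputs = processor(image, question, return_tensors="pt")
answer_ids = model.generate(**inputs)
print(processor.decode(answer_ids[0], skip_special_tokens=True))
```

Analysing where such off-the-shelf models succeed or fail on questions like this is the kind of capability-and-limitation study the project describes, before designing solutions that close the remaining gap.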