Robert McMenemy, Independent Researcher, UK
This paper presents a comprehensive framework for integrating event-based neuromorphic processing with hyperdimensional computing and tropical algebra for cognitive ontology networks. Using the Iris dataset, I construct a virtual ontology network to simulate cognitive computing processes. Event-based neuromorphic processing models with spike activities and stochastic synapses dynamically adapt the network’s topology. Hyperdimensional vectors represent the entities and relations, while tropical algebra operations encode the complex relationships between them. A Multi-Layer Perceptron (MLP) with adaptive dropout and learning rates, influenced by the neuromorphic spike activities, performs clustering and classification tasks. The framework demonstrates improved clustering accuracy and adaptive learning capabilities, highlighting the potential of combining neuromorphic and hyperdimensional computing for advanced cognitive applications.
Neuromorphic Processing, Hyperdimensional Computing, Tropical Algebra, Ontology Networks, Cognitive Computing.
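To make the tropical algebra mentioned in the abstract above concrete, here is a minimal sketch. It assumes the max-plus semiring (the abstract does not say which tropical semiring is used) and a small hypothetical relation-weight matrix `W`. In max-plus algebra, tropical addition is `max` and tropical multiplication is ordinary `+`, so a tropical matrix product gives the strongest two-hop relation weight between entities. This is an illustration only, not the author's implementation.

```python
NEG_INF = float("-inf")  # tropical "zero": the identity element for max


def tropical_matmul(A, B):
    """Max-plus matrix product: C[i][j] = max over k of (A[i][k] + B[k][j])."""
    inner = len(B)
    cols = len(B[0])
    return [
        [max(row[k] + B[k][j] for k in range(inner)) for j in range(cols)]
        for row in A
    ]


# Hypothetical relation weights between three entities.
# NEG_INF means "no direct relation".
W = [
    [NEG_INF, 2.0, NEG_INF],      # entity 0 -> entity 1 with weight 2
    [NEG_INF, NEG_INF, 5.0],      # entity 1 -> entity 2 with weight 5
    [NEG_INF, NEG_INF, NEG_INF],  # entity 2 has no outgoing relations
]

# Strongest two-hop relation weights; entity 0 reaches entity 2 via entity 1.
two_hop = tropical_matmul(W, W)
print(two_hop[0][2])  # 2.0 + 5.0
```

Pairs with no two-hop path stay at `NEG_INF`, which plays the role that zero plays in ordinary matrix multiplication.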
Othmane Belmoukadam, Jiri De Jonghe, Sofyan Ajridi, Amir Krifa, AI Lab, EY FSO Belgium
Large Language Models (LLMs) have become ubiquitous in various applications, revolutionizing and accelerating AI transformation across industries. However, their widespread adoption has also exposed organizations to new and complex security threats such as prompt injections and data poisoning. In response to this escalating threat, there is an urgent need for organizations to enhance their readiness and resilience against LLM risks. We introduce AdversLLM, a framework designed to support organizations in evaluating their governance and adoption maturity of LLM-based applications (e.g., chatbots, few-shot learning classifiers), as well as in identifying and addressing associated risks. At the heart of our framework is an assessment form that includes reviewing governance practices, gauging maturity levels, and auditing specific strategies for mitigation. To ground the latter in real life, we provide a set of real-world scenarios and situational cases illustrating best practices that strengthen AI governance. On the technical side, AdversLLM presents a prompt injection testing ground equipped with a benchmark dataset for stress-testing both commercial and open-source LLM implementations against various malicious prompts, thereby enhancing situational awareness and resilience. Furthermore, we discuss the ethical implications of security risks such as prompt injections and propose: (1) a zero-shot learning approach that serves as a first line of defense, filtering harmful content in real time, and (2) a RAG-based LLM safety tutor that fosters awareness of LLM security risks, shielding techniques, and red teaming practices. Overall, AdversLLM offers a focused, actionable solution that equips organizations with the tools and insights to promote responsible AI adoption.
Large Language Models, Natural Language Processing, Prompt Injections, Data poisoning, Responsible AI, Zero-shot learning, AI guardrails, Retrieval Augmented Generation.
Irene Mathayo¹ and Kondoro Alfred Malengo², ¹Department of Computer Science, University of Dar es Salaam, Dar es Salaam, Tanzania, ²Department of Data Science, Hanyang University, Seoul, South Korea
This paper introduces a comprehensive dataset of Swahili verb conjugations, designed to address the linguistic challenges posed by Swahili's agglutinative morphology, a key feature that has made it difficult for Natural Language Processing (NLP) models to effectively process this low-resource language. The dataset includes over 56,812 verb forms across five tenses, three grammatical persons, and both singular and plural forms, offering a rich resource for tasks such as tokenization, lemmatization, and morphological analysis. By systematically capturing the complex verb structures of Swahili, this dataset enables researchers and practitioners to improve model performance and build more accurate NLP tools for Swahili. This resource represents a significant step forward for the development of language models tailored to Swahili, with broader implications for processing other agglutinative languages in the Bantu family.
Linguistic resources, Verb morphology, Computational linguistics, Natural language processing.
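The agglutinative structure described in the abstract above can be illustrated with a small sketch. It assumes the basic affirmative template subject prefix + tense marker + verb stem and covers only four common tense markers; the abstract does not list the dataset's five tenses, so the `TENSE_MARKER` table, the function names, and the example verb are illustrative assumptions rather than the authors' generation procedure.

```python
# Subject prefixes for three persons, singular and plural
# (third person shown for the human noun classes).
SUBJECT_PREFIX = {
    (1, "sg"): "ni", (2, "sg"): "u", (3, "sg"): "a",
    (1, "pl"): "tu", (2, "pl"): "m", (3, "pl"): "wa",
}

# Four common affirmative tense markers (an assumed subset).
TENSE_MARKER = {"present": "na", "past": "li", "future": "ta", "perfect": "me"}


def conjugate(stem, person, number, tense):
    """Build one verb form as subject prefix + tense marker + stem."""
    return SUBJECT_PREFIX[(person, number)] + TENSE_MARKER[tense] + stem


def paradigm(stem):
    """Generate every person/number/tense combination for one stem."""
    return {
        (person, number, tense): conjugate(stem, person, number, tense)
        for (person, number) in SUBJECT_PREFIX
        for tense in TENSE_MARKER
    }


forms = paradigm("soma")  # "soma" = read
print(forms[(1, "sg", "present")])  # ninasoma: "I am reading"
print(forms[(3, "pl", "past")])     # walisoma: "they read"
```

Even this reduced template yields 24 forms per stem, which suggests how a few thousand verbs can expand into tens of thousands of surface forms.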
Kondoro Alfred Malengo, Department of Data Science, Hanyang University, Seoul, South Korea
This paper proposes the creation of a Swahili Question Answering (QA) benchmark dataset, aimed at addressing the underrepresentation of Swahili in natural language processing (NLP). Drawing from established benchmarks like SQuAD, GLUE, KenSwQuAD, and KLUE, the dataset will focus on providing high-quality, annotated question-answer pairs that capture the linguistic diversity and complexity of Swahili. The dataset is designed to support a variety of applications, including machine translation, information retrieval, and social services like healthcare chatbots. Ethical considerations, such as data privacy, bias mitigation, and inclusivity, are central to the dataset’s development. Additionally, the paper outlines future expansion plans to include domain-specific content, multimodal integration, and broader crowdsourcing efforts. The Swahili QA dataset aims to foster technological innovation in East Africa and provide an essential resource for NLP research and applications in low-resource languages.
Linguistic resources, Question Answering (QA), Bias mitigation, Natural language processing.
Dikshant Bikram Thapa, Shalin Shakya and Anish Subedi, Department of Computer Science and Engineering, Kathmandu University, Dhulikhel, Kavre, Nepal
This paper presents the design and development of a vision-based license plate recognition (LPR) system tailored specifically for Nepal’s diverse license plate formats. The study aims to create an efficient solution for automatic vehicle identification by leveraging computer vision techniques, convolutional neural networks (CNN), and YOLO-based object detection. This LPR system is designed to capture and process real-time images from camera feeds, localize and extract license plate information, and convert it into machine-readable text for smart traffic management applications. Methodologically, a comprehensive dataset of Nepali license plates with varied fonts, colors, and backgrounds was collected to train the recognition model. Preprocessing techniques are applied to enhance image quality, followed by the CNN model’s feature extraction for precise character recognition. Stored images, timestamped entries, and license plate data are organized in a database for tracking and analysis. The system holds substantial potential for applications in law enforcement, toll collection, and parking management, contributing to automated, efficient traffic monitoring solutions in Nepal.
License plate recognition, computer vision, convolutional neural networks, YOLO, optical character recognition, smart traffic management.
Antony Seabra, Claudio Cavalcante, João Nepomuceno, Lucas Lago, Nicolaas Ruberg, and Sérgio Lifschitz, PUC-Rio - Departamento de Informática, Rio de Janeiro, Brazil
We propose a methodology that combines several advanced techniques in Large Language Model (LLM) retrieval to support the development of robust, multi-source question-answer systems. This methodology is designed to integrate information from diverse data sources, including unstructured documents (PDFs) and structured databases, through a coordinated multi-agent orchestration and dynamic retrieval approach. Our methodology leverages specialized agents, such as SQL agents, Retrieval-Augmented Generation (RAG) agents, and router agents, that dynamically select the most appropriate retrieval strategy based on the nature of each query. To further improve accuracy and contextual relevance, we employ dynamic prompt engineering, which adapts in real time to query-specific contexts. The methodology’s effectiveness is demonstrated within the domain of Contract Management, where complex queries often require seamless interaction between unstructured and structured data. Our results indicate that this approach enhances response accuracy and relevance, offering a versatile and scalable framework for developing question-answer systems that can operate across various domains and data sources.
Information Retrieval, Question Answer, Large Language Models, Documents, Databases, Prompt Engineering, Retrieval Augmented Generation, Text-to-SQL.
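The routing idea in the abstract above, in which a router chooses between a SQL agent and a RAG agent for each query, can be sketched as follows. This is a toy under stated assumptions: the keyword heuristic in `route` stands in for whatever LLM-based decision the authors use, and `sql_agent`, `rag_agent`, and `answer` are hypothetical stubs, not part of any real library or of the authors' system.

```python
import re

# Assumed heuristic: aggregate-style wording suggests a structured (SQL) lookup.
SQL_HINTS = re.compile(
    r"\b(how many|total|sum|average|count|highest|lowest)\b", re.IGNORECASE
)


def route(query):
    """Pick a retrieval strategy name for the query."""
    return "sql" if SQL_HINTS.search(query) else "rag"


def sql_agent(query):
    # Hypothetical stub: a real agent would generate and run SQL here.
    return f"[sql] structured lookup for: {query}"


def rag_agent(query):
    # Hypothetical stub: a real agent would retrieve document chunks here.
    return f"[rag] document retrieval for: {query}"


AGENTS = {"sql": sql_agent, "rag": rag_agent}


def answer(query):
    """Dispatch the query to the agent chosen by the router."""
    return AGENTS[route(query)](query)


print(answer("How many contracts expire in 2025?"))
print(answer("What does the termination clause say?"))
```

Keeping the router separate from the agents means the heuristic can later be swapped for a model-based classifier without touching the dispatch code.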