|1.||Anna Quaglieri||Mass Dynamics 2.0: A modular web-based platform for accelerated insight generation and decision-making for proteomics||"Proteins are essential biological molecules that play critical roles in all biological processes, and they are the target of most pharmaceutical drugs. Proteomics, the large-scale study of proteins and their functions, plays a vital role in several aspects of the biopharmaceutical industry, enabling the discovery of new biomarkers, the understanding of the molecular mechanisms behind diseases, and the development of personalized therapies upon identification of specific protein signatures in individuals. With the field of proteomics becoming more accessible to a broader scientific community, the number of recorded quantitative proteomics analyses has steadily risen and data is being acquired at unprecedented rates. Despite this explosion of data, biological knowledge from proteomics studies remains largely underutilized because the complexity of the data demands complex analysis pipelines and access to specialized expertise spanning mass spectrometry, statistics, and informatics. Most laboratories still adopt piecemeal analysis solutions, which compartmentalize each expert's contributions. These limitations have proven to be an obstacle to effective collaboration, slowing down efficient biological insight generation and its subsequent translation into actionable products. Here, we introduce Mass Dynamics 2.0, a cloud- and web-based platform for the analysis and interpretation of quantitative proteomics data, designed to bring the right people together in one platform. The service implements a novel analysis workspace where standardized quality control, statistical analyses, visualizations, external knowledge integration and effortless comparison of differential expression results against any past experiments are integrated in a modular fashion.
The modularity gives researchers the flexibility to test different hypotheses and to customize and template complex proteomics analyses. This ability, coupled with a human-centered interface design, reduces the barrier to proteomics data analysis without compromising quality or depth and accelerates insight generation for complex datasets. The extensible MD 2.0 environment has been built on a scalable architecture to allow rapid development of future analysis modules, and web browser access for multiple collaborators enables real-time collaboration between proteomics experts and biologists, thereby expediting decision-making."|
|2.||Bence Szalai||The EFFECT benchmark suite: measuring cancer sensitivity prediction performance - without the bias||"Accurate benchmarking of computational models is vital for the identification of the best-performing ones. However, these benchmarks, especially in biology, are rather ad hoc and seldom pre-defined and standardized. Here, we developed the Evaluation Framework For predicting Efficiency of Cancer Treatment (EFFECT) benchmark suite based on the DepMap and GDSC data sets to facilitate comparison of ML models predicting gene essentiality and/or drug sensitivity of in vitro cancer cell lines. We show that standard evaluation metrics like Pearson correlation are easily misled by inherent biases in the data. Thus, to assess the performance of models properly, we propose the use of cell line- / perturbation-exclusive data splits, perturbation-wise evaluation, and the application of our Bias Detector framework, which can identify model predictions not explicable by data bias alone. Testing the EFFECT suite on a few popular ML models showed that while library-standard non-linear models have measurable performance in both precision-medicine (cell-exclusive) and target-identification (perturbation-exclusive) splits, the actual corrected correlations are rather low, showing that even simple KO/drug sensitivity prediction is a yet-unsolved task. For this reason, we intend our proposed framework to serve as a unified test and evaluation pipeline for ML models predicting cancer sensitivity data, facilitating unbiased benchmarking and supporting teams in improving on the state of the art."|
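The cell line- and perturbation-exclusive splits described above can be sketched as a group-exclusive partition of the data. This is a minimal illustration only: the function name, record fields, and cell line/KO names are made up and are not part of the EFFECT suite.

```python
import random

def group_exclusive_split(records, group_key, test_frac=0.2, seed=0):
    """Split records so that no group (e.g. cell line or perturbation)
    appears in both train and test -- a 'group-exclusive' split."""
    groups = sorted({r[group_key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, int(len(groups) * test_frac))
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test

# Toy sensitivity records: (cell line, perturbation, response)
data = [{"cell": c, "pert": p, "y": 0.0}
        for c in ["A549", "HELA", "MCF7", "K562"]
        for p in ["KO1", "KO2", "KO3"]]

# Precision-medicine setting: held-out cell lines
tr, te = group_exclusive_split(data, "cell")
assert not ({r["cell"] for r in tr} & {r["cell"] for r in te})

# Target-identification setting: held-out perturbations
tr, te = group_exclusive_split(data, "pert")
assert not ({r["pert"] for r in tr} & {r["pert"] for r in te})
```

A naive random split would let the same cell line (or perturbation) leak into both halves, which is exactly the bias the exclusive splits avoid.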
|3.||Noah Konig||Precision Neuroscience: Identifying Effective Therapeutic Strategies with Patient Stratification Biomarkers in Alzheimer's Disease||"Alzheimer's disease (AD), like other complex diseases, is characterised by a high degree of heterogeneity in the patient population. GWAS have identified several disease-associated genes, but these findings have not translated into progress in clinical trials. This likely reflects the limitation of GWAS to identifying single variants, while the key to understanding complex diseases influenced by multiple genetic loci is to find combinations of variants that distinguish one patient subgroup from another. The PrecisionLife platform utilises a hypothesis-free method (combinatorial analysis) for the detection of combinations of features that together are strongly associated with variations in disease risk, symptoms, progression rates and therapy response. To explore the substructures within AD, we analysed a genomic dataset from the UK Biobank, including 882 patients, compared against healthy controls. Our analysis identified combinations of genetic variants mapping to 113 genes that are significantly associated with AD pathology, including both known and novel genes. Clustering these combinations based upon the degree of patient overlap revealed six distinct subgroups of patients. Each patient subgroup reflected a specific biological function - lipid metabolism, neuroinflammation, autophagy, serotonin signaling, metal ion homeostasis, and metabolic dysfunction. Several of these functions are also identified in patient subgroups in other neurological diseases, including FTD, schizophrenia and long COVID, indicating shared genetic aetiologies and underlying dysregulated processes common between them. Among the genes identified within each AD subgroup, 32 are targeted by drugs in clinical development for other indications.
We have developed a pipeline to systematically evaluate the potential of repurposing these drugs to accelerate the implementation of safe and effective therapies for AD patients. The results demonstrate that combinatorial analysis can stratify heterogeneous patient populations with complex pathologies to identify effective therapeutic strategies with accompanying biomarkers and enrich clinical trial design, improving the probability of success in AD drug development."|
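The clustering-by-patient-overlap step described above can be sketched as grouping variant combinations whose supporting patient sets exceed a Jaccard-overlap threshold. The combination names, patient IDs, and threshold below are hypothetical and only illustrate the idea, not the PrecisionLife implementation.

```python
from itertools import combinations

def jaccard(a, b):
    """Patient-overlap similarity between two variant combinations."""
    return len(a & b) / len(a | b)

def cluster_by_overlap(combos, threshold=0.5):
    """Greedy single-linkage grouping: combinations whose patient sets
    overlap above `threshold` end up in the same subgroup (union-find)."""
    parent = {c: c for c in combos}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in combinations(combos, 2):
        if jaccard(combos[a], combos[b]) >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for c in combos:
        groups.setdefault(find(c), set()).add(c)
    return list(groups.values())

# Hypothetical variant combinations -> supporting patient IDs
combos = {
    "SNPs_1": {1, 2, 3, 4},
    "SNPs_2": {2, 3, 4, 5},   # overlaps SNPs_1 heavily
    "SNPs_3": {8, 9, 10},     # disjoint patient set
}
subgroups = cluster_by_overlap(combos, threshold=0.5)
assert len(subgroups) == 2    # SNPs_1 and SNPs_2 merge; SNPs_3 stands alone
```

Each resulting subgroup of combinations then corresponds to a candidate patient stratum, which is what the abstract annotates with biological functions.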
|4.||Manuela Pausan||canSERV – providing cutting edge cancer research services across Europe||"canSERV is a collaborative €14 million project funded by the European Commission with a core mission to make cutting-edge and customised research services available to the cancer research community in the EU, enable innovative R&D projects and foster precision medicine for patients' benefit across Europe. canSERV involves a consortium of 19 leading institutions and organizations across Europe, consisting of Life Science Research Infrastructures and ERICs, key organisations and experts in the field of oncology. canSERV's four main objectives are to: (i) offer at least 200 unique, Personalised Oncology-relevant and valuable cutting-edge services for life science research in Europe and beyond over the next three years; (ii) establish a single, unified, transnational access platform to request services and trainings; (iii) ensure that the oncology-related data provided will be fully compliant with the FAIR principles and will complement and synergise with other relevant EU initiatives such as EOSC4Cancer and UNCAN.eu; and (iv) ensure the long-term sustainability of the network and unified resources of oncology-related service provision beyond the duration of the project. To accomplish its mission, canSERV offers a series of open competitive calls and challenge-specific calls for access to services. The calls are designed to support researchers in developing innovative research projects that explore cutting-edge methodologies and target critical gaps in cancer research and care by providing funding for resources/services valued at approximately €9 million. By encouraging the submission of collaborative proposals, canSERV aims to foster transnational cooperation and support a vibrant scientific community. Ultimately, canSERV presents an unparalleled opportunity to accelerate cancer research, drive innovation, and improve patient outcomes."|
|5.||Hernando M Vergara||Automating image analysis for lysosomal profiling and integrating with multiomics to phenotype neurodegeneration||"Background: Neurodegeneration is a multifaceted process with diverse manifestations that stems from intricate cellular pathways. The fusion of biotechnology, image analysis, and multiomics has catalysed a revolutionary shift in comprehending neurodegenerative disorders and tailoring personalized interventions. For example, while biomarker development in Alzheimer's disease (AD) has traditionally focused on neuroimaging and proteomics, emerging evidence underscores the pivotal role of lysosomal dysfunction in regulating Aβ dynamics (van Weering & Scheper, 2019). Building upon this paradigm shift, we present a method to quantitate the ionic composition of individual lysosomes within patient-derived cell lines. Our approach harnesses a computational pipeline to automate microscopy image analysis and integrate ionic profiling with enzymatic and transcriptomic data, with the aim of generating a multiomic landscape that can guide precise medical strategies. Methods: 1) We use proprietary DNA probes to achieve ratiometric ion measurement (e.g., pH and Ca2+) at single lysosome resolution in patient-derived samples. 2) A computational image analysis pipeline automatically preprocesses, analyses, and extracts insights from high-resolution microscopy images. 3) Generated outcomes are intuitively navigated through a user-friendly GUI. 4) Results are integrated with enzymatic activity data and gene expression profiles from the same samples. The multidimensional data space is analysed via dimensionality reduction techniques and machine learning models to categorise distinct diseases. Results: To illustrate its diagnostic potential, the platform is applied to fibroblasts sourced from a clinically diverse cohort including healthy individuals and patients afflicted with AD, frontotemporal dementia, Parkinson's, and Huntington's disease. 
Our implementation of automatic cell and lysosome detection from microscopy data results in a tenfold increase in data points collected and surpasses the pace of semiautomated annotation by over 50-fold. This expanded data collection enables comprehensive measurement and stratification of distinct lysosomal subsets and their correlations with specific disorders. Lysosomal attributes (pH, Ca2+, populations), enzymatic activity, and gene expression profiles exhibit noteworthy disparities across diseases. When examining single modalities, RNA features perform best at discriminating AD from the other disorders, yet the combination of all modalities significantly enhances the capability to differentiate AD from other conditions. Using the lysosomal features as an example, we showcase the potential of our platform for drug discovery and personalised medicine. Conclusions and Future Prospects: The confluence of image analysis and multiomics holds great promise for comprehending the underlying cellular malfunctions that result in neurodegeneration, expediting the development of innovative therapeutics. As the molecular complexity inherent to these conditions is unravelled, insights into pivotal mechanistic nodes and potential intervention sites become possible. We expect to build upon this platform for drug discovery and personalized medicine, offering a pathway to finely tailored interventions through comprehensive evaluation of drug effects on the endolysosomal pathway at the individual patient level."|
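A minimal sketch of the per-lysosome ratiometric readout step: for each detected lysosome, the sensing/reference channel ratio is mapped to an ion estimate through a calibration curve. The field names and the linear calibration below are placeholders; the actual probe chemistry and calibration are not specified in the abstract.

```python
def lysosome_ratios(detections, calibrate=lambda r: 4.0 + 3.0 * r):
    """For each detected lysosome, compute the ratiometric readout
    (sensing / reference channel) and map it to an ion estimate via a
    calibration curve. The linear curve here is a stand-in; real
    calibrations are fit from buffer standards."""
    results = []
    for d in detections:
        ratio = d["ch_sense"] / d["ch_ref"]
        results.append({"id": d["id"], "ratio": ratio, "pH": calibrate(ratio)})
    return results

# Two hypothetical single-lysosome detections (background-subtracted means)
dets = [{"id": 1, "ch_sense": 120.0, "ch_ref": 240.0},
        {"id": 2, "ch_sense": 200.0, "ch_ref": 200.0}]
out = lysosome_ratios(dets)
assert abs(out[0]["ratio"] - 0.5) < 1e-9
assert abs(out[0]["pH"] - 5.5) < 1e-9   # 4.0 + 3.0 * 0.5
```

The ratiometric form is what makes the measurement robust to probe concentration and illumination differences between lysosomes: both channels scale together, so their ratio does not.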
|6.||Valentina Marchini||Scaling-up enzyme immobilization: efficiency and productivity of two model systems||Enzymatic reactions have received increased attention over the years, owing to the incontrovertible necessity for sustainable processes. Indeed, biocatalysts fulfil the demand for greener reactions thanks to their valuable properties, while providing simpler synthetic routes with higher selectivity than traditional hazardous methods. In addition, enzyme immobilization onto solid supports is an essential tool to allow their reusability and to preserve their stability over several operational cycles. Nevertheless, immobilization of biocatalysts is usually considered a small-scale process at the research level, while larger scales are required in order to achieve considerable amounts of the desired product. Moreover, this technique is still very unpredictable, meaning that significant experimental effort is involved during the screening process, since it relies on a trial-and-error approach. To overcome this issue, bioinformatic tools, like the package CapiPy, allow scientists to rationalize the experimental design for more time- and cost-effective studies [3,4]. In this work, it has been demonstrated that scaling up is possible and efficient, without any major loss in terms of immobilization and productivity of the enzymatic reactions performed in batch and in continuous-flow systems. For this proof of concept, two different enzymes were chosen together with two well-established reactions previously reported in scientific publications. Consequently, two high-value compounds were obtained starting from relatively cheap molecules. As a further outcome, costs and profits were calculated to prove the efficiency of biocatalytic scalability not only from the productivity perspective but also from a financial one.|
|7.||Tanya Tolomeo||Causal Modelling of COVID-19 Immunopathogenesis in Patients||"Background: To date, there have been 6.5 million deaths from COVID-19, with current active infections exceeding 21 million as of August 19, 2023. While there have been significant gains made through robust vaccine efforts as well as therapeutics such as monoclonal antibodies and antivirals, an unresolved challenge is the broad range of outcomes experienced upon infection. These range from outpatient illness and fatal acute respiratory distress to short- and long-term systemic impacts that are observed in patients of varying severity, even in the absence of well-defined pre-existing conditions. Of note, given the global impact of the pandemic, COVID-19 sparked an explosion of publicly available data, presenting an opportunity to better understand the host-pathogen interplay that drives these variable outcomes. With the advent and increasing use of cutting-edge technologies, robust multi-omic profiles can be generated from individual patients to facilitate an in-depth analysis of outcomes from the population level down to the individual level. Here, we leverage one such dataset to explore network analysis that can provide greater context than a protein set enrichment analysis. Network analysis offers a systematic approach to discern the interconnectivity and interactions among proteins, rather than merely cataloging differentially expressed proteins across conditions. This in turn facilitates a more causal understanding. Method: The dataset (doi: 10.1016/j.ebiom.2021.103723) profiles 1463 unique plasma proteins from a cohort of 50 patients after a positive COVID-19 PCR test (disease condition), and following a negative test result 14 days later (recovery condition). Using machine learning, we generated protein network graphs for each patient and performed frequent subgraph mining to identify 340 motifs within the data.
The motifs are sets of three protein-expression level pairs with a known connection in the protein-protein network. The presence or absence of each motif was used to create a motif profile for each sample. We performed hierarchical clustering using motif profiles and protein expression in order to assess each feature's ability to stratify the patient groups. Results & Conclusion: Motif profile-based clustering correctly stratified a larger proportion of patients compared to simple unidimensional protein expression alone. Furthermore, while the majority of patients' proteomic profiles were most closely correlated with themselves (i.e., disease and recovery condition), most motif profiles showed highest similarity with other patients from the same condition. As well, the motif analysis uniquely identified proteins which have been associated with COVID-19 prognosis in the literature, including CEACAM5, HEBP1 and Galectin-1. By leveraging network analysis, we dissect these protein-protein interaction landscapes, defining cohesive modules, or protein clusters, that adhere more closely to real world outcomes while elucidating affected biological pathways or systems. Altogether, this is an important finding considering the challenge of applying research findings in a translational setting."|
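The motif-profile idea above can be sketched as a binary presence/absence vector over the mined motif vocabulary, compared between samples with a set-overlap similarity. The motif names and sample profiles here are toy values, not the study's data.

```python
def motif_profile(sample_motifs, vocabulary):
    """Binary presence/absence vector over the mined motif vocabulary."""
    return [1 if m in sample_motifs else 0 for m in vocabulary]

def jaccard(u, v):
    """Overlap similarity between two binary motif profiles."""
    inter = sum(1 for a, b in zip(u, v) if a and b)
    union = sum(1 for a, b in zip(u, v) if a or b)
    return inter / union if union else 0.0

vocab = ["m1", "m2", "m3", "m4"]
disease_a  = motif_profile({"m1", "m2", "m3"}, vocab)
disease_b  = motif_profile({"m1", "m2"}, vocab)
recovery_a = motif_profile({"m4"}, vocab)

# Disease-condition samples resemble each other more than a recovery sample,
# mirroring the clustering behaviour reported in the abstract
assert jaccard(disease_a, disease_b) > jaccard(disease_a, recovery_a)
```

Hierarchical clustering over a pairwise similarity (or distance) matrix built this way is what stratifies samples by condition rather than by patient identity.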
|8.||Steen Manniche||Driving Biotech and Pharma Success: The Power of Collaborative Analytical Platforms||"The successful implementation of data-driven initiatives hinges upon bridging the gap between data scientists and business users, researchers and scientists. We explain the mechanisms and principles of a self-serve, business-centric platform that facilitates collaboration and knowledge exchange across these distinct domains, thereby unlocking the full potential of data-driven endeavors. With the right platform, data experts not only master complex processes but also develop a genuine understanding of the business. This kind of understanding is often missing when people work separately and in isolation, and its absence is what stops valuable data ideas from becoming real projects. A well-tailored collaborative platform catalyzes transformative change. By pushing for self-service and transparent operations in delivering digital assets, companies accelerate deliveries and gain the ability to seamlessly adopt novel technologies. This newfound agility enables rapid adaptation to industry shifts, propelling research initiatives forward. With this poster, we explore the guiding architectural principles, the necessary components, and the interfaces needed to build a collaborative analytics platform suitable for knowledge-intensive work as done in the pharma and biotech space."|
|9.||Stephen Kearney||Healthcare Providers Knowledge, Confidence and Practice Regarding Oncology Genomic Testing - The Role of Clinical Informatics and Ancillary Services to Support Implementation||"Background and Purpose: We measured healthcare providers’ (HCPs) subjective self-appraisal of their knowledge of oncology molecular pathology tests and techniques, combined with an objective assessment of the same. Self-confidence around ordering, interpreting and explaining such investigations was measured. We also measured HCPs’ attitudes and opinions regarding clinical informatics tools and ancillary supports for their oncology genomics and molecular pathology workflows. Finally, we enquired as to whether HCPs viewed genomic and molecular pathology testing as a mainly research-orientated activity or as clinically essential. Methods: The survey was conducted electronically, utilising Google Forms, and was distributed to a wide variety of HCPs. Thirty HCPs from a large academic teaching hospital in Ireland replied, including nurses, doctors in training, qualified consultants and general practitioners. Results: Almost two-thirds of respondents (65.6%) thought they knew less than enough about molecular pathology to do their jobs. Regarding confidence in the steps of ordering, interpreting and explaining such tests, only a minority were fully or somewhat confident. Most HCPs were positively predisposed to utilising clinical informatics tools, genetic counsellors, second-opinion services and molecular tumour boards (MTBs) in their molecular pathology workflows. Finally, the majority (83.9%) agreed or strongly agreed that molecular pathology testing was important to their clinical practice rather than being mainly a research activity. Conclusion: Despite a small subset of respondents having high self-appraisals of their clinical molecular pathology knowledge, when objectively measured most Irish HCPs lack knowledge and confidence in oncological genomics and molecular pathology.
This is important, given the growing prominence of genomics and molecular pathology in mainstream clinical and oncological practice, as reported by respondents."|
|10.||Ioannis Liabotis||Generating evidence and insights from real world data, internal and external collaboration platforms.||"In the context of healthcare, real world data (RWD) denotes data relating to patient health status and/or the delivery of health care routinely collected from a variety of sources, as opposed to data generated through conventional interventional clinical trials and studies in dedicated research settings. RWD is seen as a potentially rich and underutilised source of insights into how approved diagnostic systems, data-enabled services and products affect outcomes for patients under real world conditions. In this poster we present the approach we take at Roche for generating evidence and insights from real world data via the development and deployment of internal and external collaboration platforms. We present the main principles of Roche’s commitment to advancing the field of RWD: translating RWD into actionable and clinically relevant insights and evidence; building trust by being transparent about the motives and systematic efforts in RWD; working with various stakeholders to shape the RWD landscape and to foster acceptance of RWD for research, regulatory and access decision-making; and growing expertise in secondary data use. To foster those developments we are building internal and external collaboration solutions for multi-modal data science with capabilities such as data management, advanced analytics, collaboration, data sharing, and project and challenge execution. The main characteristics of such platforms are: * secure management of the RWD pipeline and data assets (from ingestion to analysis-ready data) and delivery of FAIR data products; * enablement of collaboration and of internal and external data challenges; * analysis of real world data by providing the most appropriate analytics tools and environments for each situation; * full compliance with accessibility and patient data processing requirements."|
|11.||Lars Halvorsen||LAISDAR - hospital EHR harmonization in Rwanda through mapping to OMOP CDM||"Background: In response to the COVID-19 pandemic, a federated data network (FDN) of 15 hospitals was established in Rwanda, entitled “Leveraging Artificial Intelligence and Data Science Techniques in Harmonizing, Sharing, Accessing and Analysing SARS-COV-2/COVID-19 Data in Rwanda (LAISDAR)”. The LAISDAR project combines EHR data, national COVID-19 testing data, and survey metrics, includes multiple Rwandan and Belgian institutions, is managed by the University of Rwanda (UR), and received funding from Canada’s International Development Research Centre (IDRC) as part of the Global South AI4COVID program. The objective was to leverage the federated hospital data sets, extended with centralised COVID-19 test results and survey data, to support Rwandan government needs in monitoring and predicting the COVID-19 burden, including hospital admissions and overall infection rates. Although the project was originally focused on COVID-19 research, the possible research topics have since widened to other disease areas. Methods: Two different EHR systems are used across Rwanda, openClinic GA and openMRS, for which logic to transform to OMOP CDM (https://www.ohdsi.org/data-standardization/) was defined and implemented. An ETL was designed which can transform both source systems to the target format and incorporate data from the national COVID-19 testing and COVID-19 survey results as part of the ETL process. Ares was installed on the central server for quality control across the network. On the central server, OMOP CDM versions of the national COVID-19 testing and survey results are hosted and made available to the hospital ETLs through a secure access point. Results: As of April 2023, the ETL to transform the hospital EHR data to OMOP CDM has been run at least once at 14 of the hospitals.
The deployment of Ares (https://ohdsi.github.io/Ares/) has allowed a centralized view of domain and mapping coverage across the network, which has aided the planning of the next steps for concept mapping and ETL improvement. Some of the data quality issues encountered were related to inconsistencies in how birth dates were filled in, and gender-specific clinical events inconsistent with the patient’s gender. Initially, the deployment and setup of hospital nodes and the central server were supported remotely, which was not always optimal. Onsite visits helped finalise the node setups and solve remaining challenges. Finally, a proof-of-concept for a reporting solution has been developed, through which the mandatory monthly reports from the hospitals to the Ministry of Health can be, at least partially, automated based on OMOP CDM. Conclusion: LAISDAR concluded in June 2023; however, efforts are underway to continue the work and build further on its accomplishments. Throughout the project, 14 hospital nodes with EHR data were transformed to OMOP CDM, with a total of ~3.5M patients represented. The national COVID-19 test results have been converted to OMOP CDM, as have the results of a COVID-19-related survey from 2022 that included 10,000 participants. The LAISDAR project has also provided important lessons and experiences that will enable further expansion of OMOP CDM in Rwanda and the rest of Africa."|
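Checks like the birth-date and gender-consistency issues mentioned above can be sketched as simple rules over OMOP-style rows. The field layout, concept names, and thresholds below are simplified stand-ins for the actual OMOP CDM tables and the checks run through Ares, not the project's implementation.

```python
# Minimal OMOP-style rows: person and condition_occurrence fragments
persons = {1: {"gender": "F", "birth_year": 1985},
           2: {"gender": "M", "birth_year": 2090},   # implausible birth year
           3: {"gender": "M", "birth_year": 1970}}

conditions = [{"person_id": 3, "concept": "prostate_ca"},
              {"person_id": 1, "concept": "prostate_ca"}]  # sex mismatch

# Hypothetical list of male-only clinical concepts
MALE_ONLY = {"prostate_ca"}

def quality_flags(persons, conditions, max_year=2023):
    """Flag implausible birth years and gender-specific clinical events
    that contradict the recorded patient gender."""
    flags = []
    for pid, p in persons.items():
        if not (1900 <= p["birth_year"] <= max_year):
            flags.append(("implausible_birth_year", pid))
    for c in conditions:
        if c["concept"] in MALE_ONLY and persons[c["person_id"]]["gender"] != "M":
            flags.append(("sex_mismatch", c["person_id"]))
    return flags

flags = quality_flags(persons, conditions)
assert ("implausible_birth_year", 2) in flags
assert ("sex_mismatch", 1) in flags
```

In a federated setup, each hospital node runs such checks locally and only the aggregate flag counts travel to the central server, which is the pattern Ares dashboards surface.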
|12.||Raphael Sergent||FAIR Enterprise Data Registry||"This poster explains how to implement the FAIR principles along the IDMP use cases. With the digitalization of the life-science industry and increasing requests from authorities, the FAIRification of data is more important than ever. The F1 principle is arguably the most foundational, because it is hard to achieve the other aspects of FAIR (Findable, Accessible, Interoperable, Reusable) without globally unique and persistent identifiers (PIDs). Accelerate the adoption of reference and master data across your organization with reliable, long-lasting persistent identifiers that are easy to govern. Accurids is the first commercial off-the-shelf software that allows you to become FAIR F1 compliant at enterprise scale across functions and data domains. Join us to discover how this will change your approach to data and increase your business efficiency!"|
|13.||Arne-Christian Faisst||Opening new horizons in humanizing preclinical multi-organoid disease models with the 3D CoSeedis in chip communication technology™||Experience with conventional cellular and animal disease models has shown that their ability to predict patient response to treatment is severely limited. Great efforts have been made to humanize mouse models to better reproduce certain aspects of human physiology and immunology. The abc biopply team has now made a significant breakthrough in humanizing upstream 3D cell models through the revolutionary and proprietary 3D CoSeedis™ multi-organoid in chip communication technology. By providing optimized physiological growth conditions and unique modes of intercellular communication, our models are freed from non-human components wherever possible. They allow us to mimic and maintain specific organs and tissues in culture for long periods of time. Thus, we successfully bridge the translational gap with unparalleled physiological responses and unique statistical predictive power. Here, we present how the innovative 3D CoSeedis in chip communication technology™ specifically enables the humanization of 3D multi-organoid models and consequently improves the prediction of patients' responses.|
|14.||Sune Askjaer||Collaborative Analytics: The Rise||"The rapidly evolving landscape of the pharmaceutical and biotech industries demands more data-driven decision-making than ever before. While specialized data experts play a crucial role, there's a rising tide of domain professionals ready to step in and contribute: the citizen data scientists. Their emergence signifies a shift towards a more collaborative and inclusive data science environment. In these sectors, data is abundant. Sifting through this vast pool and extracting meaningful insights requires more hands on deck. Citizen data scientists, equipped with intuitive platforms, are increasingly filling this gap. Tools that were once the domain of experts are now accessible to a broader audience. While Excel has been a long-standing tool for many professionals, its limitations in a big-data world are evident. The rise of advanced tools, including large language models like ChatGPT, makes advanced analytics and content creation more attainable for scientists. For this shift to be successful, a few elements are essential. First, a unified platform that is both secure and user-friendly. This platform must encourage experimentation, learning, and iteration. It should act as a common ground where traditional and citizen data scientists can share insights and methodologies and learn from each other. By aligning with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles, this platform also ensures that data is not only available but also usable and standardized. Open-source tools are at the forefront of this, simplifying the challenges tied to software updates and dependencies. Here we present how organizations can create an environment where data science is a team sport and where domain experts seamlessly transition into the role of citizen data scientists. The future is collaborative, and with the right tools and mindset, the pharmaceutical and biotech sectors are poised to lead the way."|
|15.||Prateek Arora||Investigating the Impact of Injury Types and Intercellular Communication on Zebrafish Heart Regeneration||Myocardial infarction, a leading cause of human death according to World Health Organization data, leaves a permanent fibrotic scar in the heart. In contrast, zebrafish hearts have the ability to regenerate throughout their lifetime, regardless of the injury method used. Previous studies have identified the cell types involved in this process, but the effect of different injury methods and how these cell types interact with each other is not fully understood. To address this gap in knowledge, we used transcriptomics data from 36 samples spanning seven studies and bioinformatics tools to compare three injury types - genetic ablation, resection, and cryoinjury - in zebrafish heart regeneration. Our analysis revealed a core regeneration pathway that is conserved across the injury types and suggests that fibroblasts may play a key role in the regeneration process. To further understand intercellular communication, we reanalysed data from three studies, focusing on key cell types, and developed a ligand-receptor network based on differentially expressed genes. Using network theory and literature-based natural language processing, we identified important nodes in the network and are currently confirming our results using CRISPR-Cas9-based knockout lines. Finally, we present our findings in a user-friendly web app interface, making them accessible to the wider community.|
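The ligand-receptor network construction can be sketched as filtering annotated ligand-receptor pairs down to differentially expressed genes and ranking nodes by degree. The gene names and pair list below are illustrative, and plain degree is only a stand-in for the richer importance measures (network theory plus literature-based NLP) used in the study.

```python
from collections import defaultdict

def build_lr_network(pairs, degs):
    """Keep only ligand-receptor pairs where both genes are
    differentially expressed, then rank genes by node degree."""
    edges = [(l, r) for l, r in pairs if l in degs and r in degs]
    degree = defaultdict(int)
    for l, r in edges:
        degree[l] += 1
        degree[r] += 1
    ranked = sorted(degree, key=degree.get, reverse=True)
    return edges, ranked

# Hypothetical ligand-receptor annotations and DE gene set
pairs = [("tgfb1", "tgfbr2"), ("tgfb1", "tgfbr1"), ("wnt5a", "fzd2")]
degs = {"tgfb1", "tgfbr2", "tgfbr1"}

edges, ranked = build_lr_network(pairs, degs)
assert len(edges) == 2          # wnt5a pair dropped (fzd2 not DE)
assert ranked[0] == "tgfb1"     # highest-degree node
```

High-degree nodes in such a network are natural first candidates for knockout validation, which mirrors the CRISPR-based follow-up described in the abstract.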
|16.||Alena Fedarovich||Single Cell Data Science Consortium Enables Rapid Analysis of High Value Public Datasets||"Due to their enormous potential for advancing drug discovery, there continues to be exponential growth in the use of single cell sequencing methods, and a corresponding increase in datasets in publicly available repositories. While these datasets are freely available, they come with hidden costs that hinder the ability of companies to exploit them to their maximum potential. These costs typically result from a lack of metadata standards and significant variation in processing approaches. The Single Cell Data Science (SCDS) Consortium was formed in 2022 with four charter members (three large pharma companies and one biotech) as a multi-year effort to harmonize single cell experiments more quickly and cost-effectively. This pre-competitive organization is led by Rancho BioSciences, with expertise in single cell data curation, processing, and analysis. To date, SCDS has successfully delivered 192 high-quality human and mouse datasets with metadata harmonized to a 4-entity, 75-attribute data model. By the end of 2023 we expect to have delivered around 230 datasets. The consortium is now supported by six member companies and has added several defined functions to its scope. Updates to the ingestion pipeline to adapt to these changing needs are currently in progress and seek to increase both the processing capacity and the features provided to analysts. In addition to dataset additions, we are building tissue-, disease- and organ-specific reference atlases. Curated datasets delivered as part of this consortium are already accelerating reproducible science, rapid discovery, and joint analysis of valuable public data."|
|17.||Alena Fedarovich||Using Artificial Intelligence to Enhance Terminology Mapping Workflow from Data Collection to Standardization||"For data to be findable, integrable and reusable, it first needs to be normalized (so that data from different sources can be aligned) and, most importantly, it needs to be cleaned up so that it is free from original human and machine errors. For both tasks, it is standard practice to align data to well-established standard ontologies and controlled vocabularies and to curate it, both manually and digitally. While there is no automated solution that can guarantee clean and well-aligned data, an efficient semi-automated solution can do the preliminary work, leaving curators with fewer, more complex cases. Furthermore, the resulting data dictionaries often need to be classified and tested for heterogeneity, so that they can be used in a more structured, domain-specific, targeted, and harmonized fashion. Annotation and mapping services can be used to streamline data that comes from public resources and is often presented in a variety of formats and flavors. In this project, we investigate different approaches to terminology mapping that use AI-assisted semantic and phonetic mapping, with the goal of developing a one-stop shop for data collection, harmonization, alignment, and mapping."|
|18.||Artür Manukyan||VoltRon: A Spatial Omic Analysis Toolbox for Multiomic Integration using Image Registration||"The introduction of “Spatial Transcriptomics” (ST) in 2016 and the large-scale utilisation of fluorescence in situ hybridisation (FISH) techniques have motivated many to analyse omics data in a more spatially resolved manner. This has been followed by a rapid increase in the number of commercially available spatial omics instruments, such as Visium (10x), GeoMx (NanoString), PhenoCycler (Akoya Biosciences) etc., that introduced fine-tuned workflows capturing spatial omics profiles at diverse levels of resolution (regions of interest, single cells, molecules etc.). Hence, there is a need for downstream computational analysis tools capable of investigating spatial datasets with multiple modalities and resolutions. Such platforms should provide workflows for spatially aware multiomic integration across tissue sections as well as data modalities. To this end, we have developed VoltRon (bioinformatics.mdc-berlin.de/VoltRon), a novel R package for spatial omics analysis with a unique data structure that accommodates spatial data readouts across many levels of resolution and modality, including regions of interest (ROIs), spots, and even single cells. VoltRon accounts for the spatial organisation of tissue blocks (samples), layers (sections) and assays given a collection of spatial readouts and provides spatial data integration between these assays. An easy-to-use computer vision toolbox, OpenCV, is fully embedded in VoltRon, allowing users to seamlessly register spatial coordinates across layers for data/label transfer without the need to configure external tools such as Python or ImageJ/Fiji. VoltRon tutorials include (i) integration workflows across serial tissue sections, (ii) integration between spatial transcriptomics and scRNA datasets as well as (iii) multiple end-to-end analysis workflows for all spatial data modalities using the same API."|
|19.||Evan Béal||MSIdRNA: a novel method for determining MSI status using transcriptomic data||"Introduction: Microsatellite instability (MSI) is a hypermutable phenotype caused by the loss of DNA mismatch repair activity. This phenomenon, frequently observed in colorectal, gastric and endometrial cancer cells, is directly linked to the efficacy of some treatments (chemotherapy, immunotherapy, LMW). MSI was traditionally identified by PCR of five microsatellites and IHC staining of mismatch repair proteins. Then, with the advent of next generation sequencing (NGS), new tools were developed to infer the MSI status of samples using whole exome sequencing (WES) data. In this work, we sought to extend the use of NGS data to infer the MSI status of cells by building a model capable of predicting the number of deletions in microsatellite regions (a proxy of MSI status) using only gene expression data (RNA-Seq). This method has the advantage of being able to infer MSI status at both sample and single-cell level. Material and Methods: Using RNA-seq data of 281 cell lines from the Cancer Cell Line Encyclopedia (CCLE), we constructed a generalized linear model that takes as input the expression of 80 genes, collected from 12 published studies that identified genes acting as markers of MSI status. After a feature selection step using lasso penalized regression, 69 genes were selected to predict the number of deletions in microsatellite regions, a value positively correlated with the probability that a sample/cell is MSI. The expected number of deletions was collected from Ghandi et al., Nature 2019. The model's ability to infer MSI status using the predicted values as a proxy was evaluated at bulk level on 67 CCLE cell lines (15 MSI and 52 MSS) and at the single-cell level on over 53'000 single cells from 22 cancer types (198 cell lines). 
Results: The results show that the MSI status of cells can be inferred using gene expression data at both bulk and single-cell level. In the bulk data, an accuracy of 73.1% (sensitivity of 1; specificity of 0.65) is obtained for the classification of the cell lines as MSI or MSS when using the predicted number of deletions as a proxy for the MSI status, with a threshold set at 150 deletions in microsatellite regions (a threshold previously identified in Ghandi et al., Nature 2019). At the single-cell level, a significant difference is observed in the distribution of the predicted number of deletions for MSI versus MSS single cells in colorectal, endometrial, gastric, lung and ovarian cancer. As expected, a greater number of deletions in microsatellite regions is predicted in MSI single cells, highlighting the model's ability to transfer from bulk to single-cell predictions. Conclusion: Our model provides an alternative method for inferring the MSI status of cells using only gene expression data, with promising results obtained at both bulk and single-cell level."|
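The classification step described in this abstract, thresholding a predicted microsatellite-deletion count to call MSI versus MSS, can be sketched in a few lines. This is only a minimal illustration: the 150-deletion threshold comes from the abstract, while the example counts, labels, and helper names are hypothetical.

```python
# Classify samples as MSI or MSS from predicted microsatellite-deletion
# counts, using the 150-deletion threshold cited in the abstract.
# All sample values below are hypothetical.

THRESHOLD = 150  # predicted deletions in microsatellite regions

def call_msi(predicted_deletions, threshold=THRESHOLD):
    """Return 'MSI' if the predicted deletion count exceeds the threshold."""
    return "MSI" if predicted_deletions > threshold else "MSS"

def sensitivity_specificity(truth, predicted_counts, threshold=THRESHOLD):
    """Compute sensitivity (MSI recall) and specificity (MSS recall)."""
    calls = [call_msi(c, threshold) for c in predicted_counts]
    tp = sum(1 for t, c in zip(truth, calls) if t == "MSI" and c == "MSI")
    tn = sum(1 for t, c in zip(truth, calls) if t == "MSS" and c == "MSS")
    return tp / truth.count("MSI"), tn / truth.count("MSS")

truth = ["MSI", "MSI", "MSS", "MSS", "MSS"]
counts = [310, 205, 90, 160, 40]  # hypothetical model outputs
sens, spec = sensitivity_specificity(truth, counts)
```

In this toy example one MSS sample (count 160) crosses the threshold, mirroring how the reported specificity of 0.65 can fall below the perfect sensitivity of 1.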
|20.||Laura Frolich||Bridging the gap between data scientists and business users: Toward an Asynchronous, Collaborative Framework||In today's dynamic business landscape, swiftly translating data science insights into actionable business strategies is paramount. While more and more platforms and solutions for no/low-code data exploration and analysis are appearing, the question remains of how the final models and visualizations are kept running and at the fingertips of business users when they need them. Addressing this challenge not only streamlines the deployment of these tools but also enriches the development and feedback loop experience for both data scientists and business users. In a typical scenario, a data scientist will receive a copy of the data and start running analyses, building models, and creating visualizations. This iterative process, punctuated by periodic meetings with business users, ensures feedback integration and guides future analytical explorations. However, this approach often creates lag, making insights available only when the data scientist can allocate time, leading to potential delays in strategic decision-making. We argue that a better workflow allows for more spontaneous and asynchronous interaction. We recently ran a project in which we used such a workflow to enable quick prototyping of data exploration apps, running on a constantly available platform. This allowed the business user who was part of the project to experiment with and test the apps when it suited their schedule, and to provide feedback via e-mail and ad-hoc online meetings. In this poster, we describe the footing needed to establish such a workflow. Through this, we hope to inspire organizations to embrace this paradigm, ensuring that data-driven insights seamlessly translate into impactful business actions.|
|21.||Sarah Ateaque and Maddie Durnall||Target Identification in NSCLC||"Lung cancer is the leading cause of cancer death in the U.S., with 81% of deaths attributed to smoking. Among the various subtypes of lung cancer, non-small-cell lung cancer (NSCLC) is the most prevalent, accounting for more than 80% of all lung cancers. Although survival rates are higher when the disease is diagnosed at earlier stages, the 5-year survival rate of patients with NSCLC across all stages is approximately 15%. The therapeutic landscape for NSCLC is marked by significant challenges, with current treatments often encountering drug resistance, adverse side effects, and diminished efficacy, especially in advanced disease stages. A primary contributor to these challenges is poor target selection, with 50% of trials falling short due to lack of efficacy. The identification of genes and protein mutations associated with NSCLC is critical for the development of more effective therapeutics and for improving patient outcomes. Using Causaly, drug discovery scientists can focus their discovery programs by streamlining the identification and prioritization of drug targets and exploring novel avenues. Using PubMed, over 90,000 documents relating to NSCLC were uncovered. Facing this data overload, users are confronted with the challenge of navigating an immense volume of literature, which is not only time-consuming but subject to selection bias. In contrast, by machine-reading all available literature, Causaly finds evidence, not documents, extracting relevant insights from millions of documents without bias. By leveraging Causaly’s Intelligent Search, over 8,000 targets for NSCLC (supported by more than 30,000 documents) were instantaneously identified. To provide greater confidence in the viability of a target, targets can be narrowed down to those investigated in preclinical studies. 
Utilizing Causaly’s advanced filtering capabilities, targets can be prioritized by novelty, the number of publications and the strength of evidence in the pathophysiology of NSCLC."|
|22.||Guliafshan Tariq||Unlocking the Future of Molecular Biology: Arxspan BioDrive and Seamless Data Integration||In the dynamic landscape of molecular biology research, the demand for cutting-edge tools that seamlessly integrate with laboratory workflows has never been greater. Arxspan BioDrive is the only molecular biology tool available as both desktop and enterprise solutions integrated with the Arxspan ELN system. It allows users to visualize, simulate, and document molecular biology procedures. The newest features of BioDrive include RNA support (including DNA/RNA conversion), automatic primer design, primer analysis, a custom oligo library and pair-wise sequence comparison. Our new biologic registration module leverages the power of BioDrive to effortlessly connect your data across all streams within our enterprise suite. Now you can seamlessly design, update, and register your molecular biology data in the Arxspan suite of products.|
|23.||Raul Mora||Leveraging NLP to identify unstructured content reusability paths in submission dossiers||"Clinical trial regulatory submissions require the authoring of tens of documents aimed at fulfilling efficacy and safety requirements for approval. This exercise requires thousands of hours of drafting, authoring, review, and approval, and the resulting documents often contain repeated or overlapping content. With NLP methods, several documents can be mapped for structural and semantic similarity; using graph networks, reusability paths can be visualized and opportunities for efficiency can be identified and implemented through structured content management and automated content generation. This approach could significantly reduce the time to finalize the dossier and the effort of scientists and medical writers. Furthermore, it could be expanded to insights development, where the semantic analysis and paths can point to new leads for clinical development."|
|24.||Rim Khazhin||Triaging retinal diseases using AI||A case study of using artificial intelligence retinal diagnostic software at a busy ophthalmic polyclinic seeing 1,500 patients per day. AI alone was used to triage 500 consultancy referral patients, without the need for an appointment with an ophthalmologist. Additionally, early-stage diabetic retinopathy patients were followed up at the polyclinic without being referred to the retina clinic.|
|25.||Jens K. Habermann||Advancing Biobanking in Europe||"BBMRI-ERIC is the largest European research infrastructure for biobanking, currently including 24 countries and one international organization. BBMRI-ERIC’s mission is to establish, operate and develop a pan-European distributed research infrastructure of biobanks and biomolecular resources to facilitate access to resources and facilities and to support high-quality biomolecular and medical research. BBMRI-ERIC brings together all the main players from the biobanking field – researchers, biobankers, industry, and patients – to boost biomedical research. To that end, BBMRI-ERIC offers quality management services, support with ethical, legal, and societal issues, and several online tools and software solutions. One of these tools is the BBMRI-ERIC Directory, which collects and makes available information about biobanks, samples, and associated data to researchers in the biomedical field. It encompasses more than 400 biobanks hosting over 100 million samples. Precision medicine is an important topic to which BBMRI-ERIC contributes with a focused research agenda addressing foremost the disease areas of cancer, rare diseases, COVID-19, and infectious diseases as well as paediatric diseases, while leveraging the potential of novel technologies. BBMRI-ERIC is engaged in more than 21 active EU projects, which include projects making genomic data available and accessible across borders through a federated cloud infrastructure. In terms of digitalization and tool solutions, BBMRI-ERIC continues to strengthen its portfolio through (i) the initiated federated search and analysis platform for sample-level and patient-/donor-level data, (ii) data quality and certification, (iii) expedited access procedures for samples & data, (iv) data pooling, and (v) big data analysis."|
|26.||Mihail Jekov||Analytics Platform for Real-World Data to Enhance Clinical Trial Performance||"Advancements in digital health technology have opened new opportunities for optimizing clinical trials by leveraging real-world data (RWD). This poster presents the latest integrated application within Sqilline’s Danny Platform – ‘FIND MORE’ – designed to harness the power of RWD and significantly improve the process of finding clinical trials and enrolling eligible patients. Analytics Platform: Danny Platform is a big data healthcare analytics platform that integrates massive amounts of RWD, mainly from electronic health records (EHRs). Powered by SAP HANA and built with proprietary ML and NLP algorithms, Danny Platform extracts both structured and unstructured (free text) data in different languages to preprocess, normalize and ensure high data quality. Danny Platform provides comprehensive searches, in-depth analyses, predictions, treatment solutions, and decision support to physicians, researchers, and payers in the fields of oncology, cardiology, ophthalmology, rare diseases, and other specific disease areas. Currently, Danny Platform processes more than 1.6 million unique patient records from more than 50 leading hospitals in CEE. Integrated Application: The latest integrated application within Danny Platform – ‘FIND MORE’ – is an innovative tool aiming to bridge sponsors, leading investigators, and research sites to identify eligible patients for clinical trials. The ‘FIND MORE’ application seamlessly integrates with two major registries, ClinicalTrials.gov and the EU Clinical Trials Register, providing investigators with real-time access to a comprehensive list of all active research studies worldwide. No manual searches: investigators can effortlessly navigate through a wealth of opportunities in the global landscape of medical research. 
The ‘FIND MORE’ application automatically integrates data from electronic health records (EHRs) at each hospital within a user-friendly dashboard. It acts as a centralized hub of information, allowing investigators to seamlessly navigate through patient data, medical records, and relevant demographics. Finally, investigators receive a list of patients eligible for the clinical trial they are considering. Key Features: • Automatic integration with the two major registries, ClinicalTrials.gov and the EU Clinical Trials Register • Direct search for eligible patients within the site against the study protocol; the search is executed by a trained machine learning algorithm that structures inclusion/exclusion criteria and recognizes eligible patients • A dynamic map showcasing the geographical distribution of all ongoing studies. Key Benefits: • 20% improvement in the accuracy of selecting the right site for sponsors • Enhanced efficiency: 40% reduction in investigators' time spent searching for eligible patients for clinical trials • Unlocking the potential for an increased number of successfully executed clinical trials at every site. Sqilline’s ‘FIND MORE’ application represents a significant step forward in medical research and patient care. By facilitating the discovery of clinical trials and enabling seamless patient enrollment, Sqilline empowers the healthcare community to advance medical knowledge and improve patient outcomes."|
|27.||Alex Ritschel||Using Neuroscience to Assess and Prevent Stress at the Workplace||"Stress in the workplace costs organizations millions in productivity, sick days and drop-out. In fact, a recent Swiss government health report shows that about 30% of Swiss workers suffer from “health-critical stress”. To help private and public organizations avoid these huge costs, MGME Neurotech, a UZH startup label, has spent over 10 years developing a preventive technology that increases individual stress resilience. The 3-stage neuroscience-based approach includes (1) an individual resilience assessment, (2) app-based monitoring and (3) personalized training based on the individual's neurobiology. This novel approach has the potential to increase retention, motivation, and productivity amongst employees and executives alike. Using objective biological markers (mobile-device pupil measurement) allows stress vulnerability to be identified early and personalized, app-based digital interventions to be tailored to the individual's biology. People become more stress resilient, experience less anxiety, depression and burnout, take fewer sick days, make fewer decision-making errors and thus enhance productivity and innovation."|
|28.||Shuguang Yuan||Advancing GPCR drug discovery via computational methods in an ultimately efficient way||"Modern drug discovery is a long and tedious process which costs at least 10 years and 2 billion USD on average. How to speed up this expensive process has become one of the most essential topics in the pharmaceutical industry. With progress in both artificial intelligence and computational biology, advancing modern drug discovery via computational pharmacy plays an increasingly important role. In this work, Dr. Yuan will illustrate the application of computational biology and artificial intelligence to answer fundamental questions in life science, especially in the area of G protein-coupled receptors (GPCRs), which are the most important drug targets in the pharmaceutical industry. He will also discuss how to speed up modern drug discovery in an ultimately efficient way. Finally, Dr. Yuan will present a success story on developing a “first-in-class” clinical drug within a few months."|
|29.||Katerina Krinitsyna||Analysis of scientific literature and clinical trial data on the efficacy of treatment options for pulmonary arterial hypertension (PAH)||"In 2022, DM RWE performed an analysis of scientific literature and clinical trial data on the efficacy of treatment options for pulmonary arterial hypertension (PAH). The results of this project allowed us to summarize data on PAH treatment options from different sources in one place, as well as to compare the quality of various biomedical language representation models. The main part of the project was parsing the data from relevant articles. However, due to the complexity of relations between medical terms and their specificity, we faced numerous issues: 1) Much of the data was presented in tables; parsing this kind of data was hard to automate and required a large amount of manual work. 2) Clinical events and improvements were nearly indistinguishable entities for the models. 3) Rare parameters caused significant issues during model training. The project can be divided into the following steps: 1) Pre-selection of relevant publications on PubMed, Elsevier, Google Scholar and ClinicalTrials.gov based on the target nosology; here we opted for all kinds of publications related to pulmonary arterial hypertension. 2) Filtering the publications by study type, population and treatment; at this step, the initial set of 1,195 publications was reduced to 478, which were passed on for parsing with the models. 3) Marking up entities and their relations in the training set, which formed the basis for fine-tuning the language models in the next step. 4) Training the models, subdivided into training for entity parsing and for relations. This was the heart of the project, because the results of this stage were the most influential and unpredictable. Overall, we tried about 10 language models, depending on the specific aim. Among them, BioBERT worked best for parsing entities, while the ClinicalTransformer model worked best for determining common relations, and a fine-tuned SynSpert for relations unique to the PAH therapeutic area. 5) Model inference on the complete dataset. 6) Database cleaning and aggregation with data from the tables, which were processed separately. As the outcome of the above steps, we obtained a ready-to-use database of valuable data on PAH treatment efficacy, which can be used by medical professionals and pharmaceutical companies. We also identified the most helpful language representation models for the medical domain."|
|30.||Philippe Lenzen||Self-driven Microfluidic Diffusion Sizing platform for measurement of binding affinity||Microfluidic technologies have found various applications in drug discovery and development. Among these, microfluidic diffusion sizing (MDS) has proven to be a powerful technique that relies on the micron-scale mass transport of biomolecules under native conditions to monitor protein-protein interactions and the colloidal stability of biologics. However, practical applications typically require bulky peripheral actuators and experienced operators, which may confine the use of these microfluidic tools to dedicated laboratories. Here we present a self-driven microfluidic diffusion sizing platform optimized for the rapid characterisation of protein-protein interactions and of nanoparticles. Requiring only four pipetting steps for successful operation, the platform encompasses all liquid handling capabilities in its design. We demonstrate the ability to perform rapid biomolecule sizing and measure the binding affinity of Protein A to an immunoglobulin G. By encoding all fluidic operations in its geometry, the self-powered MDS platform is easily multiplexed. Moreover, the combination of MDS with multi-wavelength fluorescence detection can enable the multi-dimensional characterization of extracellular vesicles (EVs) and virus-like particles.|
|31.||Hannah Oevermans||Single-cell repertoire sequencing of gamma delta T-cells||"γδ T lymphocytes are a subset of T cells that make up 1-5% of human peripheral blood mononuclear cells. Uniquely, γδ T cells display characteristics of both innate and adaptive immunity and constitute an active area of research due to their role in antitumor immunity and promising potential for anticancer therapies. Assessing the rate of tumor-infiltrating γδ T cells in a tumor biopsy is regarded as a highly relevant measurement for cancer therapy. Determining the composition of γδ TCR clonotypes and functional γδ T cell subsets in humans can help us understand individual responses to neoplastic and infectious challenges and thus pave the way for novel therapies. In our research, we combined 10x Genomics 5' immune profiling with our custom TCR amplification strategy to simultaneously analyse gene expression and TCR diversity within individual immune cells, enabling a comprehensive understanding of immune cell populations and their functional states. Our optimized protocol now unlocks valuable insights from γδ T cells, enabling research into their crucial role in the antitumor immune response."|
|32.||René Böttcher||CBDD – A systems biology toolkit for complex data processing and interpretation||"Technological advances are leading to an ever-increasing pace of innovation for both computational and experimental scientists. One negative outcome of this development is that, under publishing pressure, popular algorithmic solutions are often released without long-term support, or are at best available as platform-specific implementations with limited use in generalized analysis workflows. To address this issue, Clarivate launched the “Computational Biology Methods for Drug Discovery” (CBDD) program, focused on implementing state-of-the-art network analysis approaches for various types of molecular data. CBDD is implemented as a platform-agnostic and modular R package that can be easily used to speed up the development of data interpretation workflows and to generate actionable insights from a diverse set of inputs. CBDD’s modules are extensively tested, documented, and optimized to ensure robust performance across a variety of platforms via a unified interface. Currently, CBDD comprises 78 algorithms that enable various types of analyses, including: • Identification of causal regulators and reconstruction of underlying mechanisms from expression signatures or GWAS signals. • Stratification of patients and identification of relevant biomarkers based on subnetwork and multi-OMICs integration analyses. • Assessment of the impact of drug combinations and identification of candidates for drug repurposing. • Construction of cell-cell communication networks, cell type deconvolution, and more. CBDD focuses on implementing a wide range of general-purpose tools rather than algorithms specifically tied to certain experimental designs or data types. 
As a practical example of this, Clarivate has been utilizing an ensemble of methods to prioritize targets and identify relevant drugs for repurposing across various indications and toxic chemicals using the curated data from our MetaBase™ and Cortellis Drug Discovery Intelligence™ platforms. Our approach combines different types of evidence that can broadly be categorized by the underlying scientific rationale into: 1) algorithms using network-based measures to determine node similarity, 2) algorithms leveraging established knowledge bases such as pathway maps or ontologies, and 3) algorithms assessing disease similarity via various measures. Using COVID-19 as an example, we show that a combination of these different measures outperforms the individual approaches while prioritizing known COVID-19 drug targets over other established drug targets by a significant margin."|
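The ensemble idea described in this abstract, combining evidence from network-similarity, knowledge-base, and disease-similarity algorithms, can be illustrated with a simple rank-aggregation sketch. The scoring scheme, scores, and target names here are hypothetical; the abstract does not specify how CBDD actually combines its evidence layers.

```python
# Aggregate target rankings from several evidence classes by average rank.
# A lower aggregate rank means stronger combined evidence. All scores and
# target names are hypothetical.

def ranks(scores):
    """Map each target to its rank within one evidence layer (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {target: i + 1 for i, target in enumerate(ordered)}

def aggregate(evidence_layers):
    """Average each target's rank across all evidence layers."""
    layer_ranks = [ranks(layer) for layer in evidence_layers]
    return {target: sum(r[target] for r in layer_ranks) / len(layer_ranks)
            for target in evidence_layers[0]}

# Three hypothetical evidence layers scoring the same candidate targets.
network_sim = {"ACE2": 0.9, "TMPRSS2": 0.7, "EGFR": 0.2}
pathway_kb  = {"ACE2": 0.6, "TMPRSS2": 0.8, "EGFR": 0.3}
disease_sim = {"ACE2": 0.8, "TMPRSS2": 0.5, "EGFR": 0.4}

combined = aggregate([network_sim, pathway_kb, disease_sim])
top = min(combined, key=combined.get)  # best (lowest) average rank
```

A target that ranks highly in several independent evidence classes rises to the top even if no single layer ranks it first, which is one plausible reading of why the combined measure outperforms the individual approaches.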
|33.||Lucas Schuetz||Simple Spatial biology analysis: Registration, Segmentation, Cell-typing and Statistical analysis with just a few clicks.||In recent years, automated microscopy has improved in great strides, and microfluidic devices have been fully integrated, allowing complex spatial omics experiments without the associated hassle of earlier days. However, after the experiment, researchers are left with large amounts of image data that must be processed, registered, analyzed and quantified. Available tools either require extensive programming knowledge or lack important functionalities. To overcome this issue, we developed an automated and scalable deep learning pipeline for the processing and analysis of spatial biology data. Our pipeline performs all steps, from registration through segmentation and cell-typing to statistical analysis. We provide this pipeline as a cloud solution, easily accessible without high-performance hardware.|
|34.||Damir Zhakparov||GeneSelectR: A Machine Learning-Based R Package for Enhanced Feature Selection and Biological Assessment in RNAseq Analysis of Complex Biological Datasets||"Introduction: RNAseq datasets are complex, high-dimensional, and challenging to analyze. Differential gene expression analysis, the standard approach, has limitations, such as a high false-positive rate and limited gene coverage. Machine learning (ML) techniques can overcome these limitations by identifying hidden patterns and interactions of independent variables. However, selecting the best approach to feature selection remains a challenge. To address this challenge, we developed the GeneSelectR package, an R tool for feature selection and biological assessment of gene lists for high-dimensional RNAseq datasets. We tested the GeneSelectR package on a dataset of atopic dermatitis (AD), a complex and heterogeneous disease. Methods: The GeneSelectR package uses four ML methods for feature selection: random forest, Boruta, lasso regression, and univariate filtering. To assess biological relevance, the GeneSelectR package performs gene ontology (GO) enrichment. It applies GO term distance calculation and clustering for semantic similarity analysis of the GO lists using the simplifyEnrichment R package in a wrapper function, and then selects the best list based on mean cross-validation metric scores. We applied the GeneSelectR package to a dataset of PBMCs from 149 African children stratified into groups based on their location and diagnosis. We also performed differential gene expression analysis on these data using edgeR. Results: Lasso regression was the most effective feature selection method in the AD dataset. The final gene list contained multiple genes related to immunological processes in the context of inflammation and atopic disorders. The differentially expressed gene list performed differently from the ML methods used in the GeneSelectR package. 
Following the GeneSelectR package workflow, we chose the objectively best-performing list. Conclusion: The GeneSelectR package addresses the challenges of feature selection for RNAseq datasets, particularly for complex diseases such as AD. It provides a reliable tool for extracting relevant genes and identifying potential biomarkers for diagnosis and treatment. By utilizing ML techniques and GO analysis, the GeneSelectR package also offers insights into the pathogenesis of complex diseases such as AD. Its approach can be applied to similar high-dimensional RNAseq datasets, making it a valuable resource for the analysis of gene expression data in biological research."|
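The list-selection step described in this abstract, comparing the gene lists produced by different feature-selection methods via their mean cross-validation scores and keeping the best one, can be sketched as follows. GeneSelectR itself is an R package that performs this internally; this is only a minimal illustration in Python, and the fold scores are hypothetical (the method names match the abstract).

```python
# Pick the best-performing feature-selection method by its mean
# cross-validation score. Method names follow the abstract; the
# per-fold scores below are hypothetical.
from statistics import mean

cv_scores = {
    "random_forest":     [0.78, 0.80, 0.76, 0.79, 0.81],
    "boruta":            [0.74, 0.75, 0.73, 0.76, 0.74],
    "lasso":             [0.83, 0.85, 0.82, 0.84, 0.86],
    "univariate_filter": [0.70, 0.72, 0.69, 0.71, 0.70],
}

# Mean CV score per method, then select the method with the highest mean.
mean_scores = {method: mean(scores) for method, scores in cv_scores.items()}
best_method = max(mean_scores, key=mean_scores.get)
```

With these illustrative numbers the lasso list wins, mirroring the abstract's finding that lasso regression was the most effective method on the AD dataset.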
|35.||Mei Shan Krishnan & Berenice Wulbrecht||Accelerating drug development with advanced semantic technologies: a case study in clinical trial design||"The life sciences industry is facing a tsunami of data with the emergence of precision medicine, biomarkers, omics, electronic health records, monitoring data, and more. These large amounts of data are continually being generated and will continue to grow. However, too often, data is only utilized for its initial purpose. Unlocking the full potential of data requires more than just collection, especially for heterogeneous data like clinical or biological data. For effective re-use, data must be refined, integrated, and accessible. Life sciences organizations have increasingly turned to semantic technologies and knowledge graphs (KGs), which offer a range of benefits for integrating and harmonizing scientific knowledge. By consolidating data into a KG, life sciences organizations gain a powerful tool for research and analysis. By providing a robust reasoning framework and advanced analytics capabilities, KGs enable data-driven decisions and improve search and refinement efficiency. Further, KGs can be leveraged with NLP to extract information from unstructured text and with AI to support predictive and generative analysis. Such associations enable the rapid exploration of hypotheses and refinement of insights, particularly in the drug development process. ONTOFORCE will highlight the benefits of knowledge graphs through a compelling use case with a top 10 pharma company, demonstrating actual business value for accelerated clinical trial design and improved patient outcomes."|
|36.||Gisela Fontaine||Determination of Extractable Silicone Traces in Pharmaceutical Devices Using Inductively Coupled Plasma-Optical Emission Spectrometry (ICP-OES)||"Silicones (poly(organo)siloxanes) are used in pharmaceutical applications as part of formulations, as well as during manufacturing and in packaging, most commonly as trimethylsilyloxy-terminated polydimethylsiloxanes (PDMS). They are used as active ingredients and antifoaming agents as well as excipients in numerous formulations and cosmetics. Furthermore, silicone oils are used as lubricants, reducing the break-loose and gliding forces within medical devices like syringes and metered-dose inhalers. Due to increasing concerns about silicone impurities, effective monitoring of the extractable quantities is critical for medical device development and quality control. The development of suitable analytical methods is challenged by the hydrophobicity of silicone oil and the often very small concentrations to be analysed, while Si background levels from inorganic Si are sometimes significant. One approach to the determination of silicone impurities is the analysis of total elemental silicon by inductively coupled plasma-optical emission spectrometry (ICP-OES). Without prior species-separation steps, the method differentiates neither between organic (e.g. silicones) and inorganic (e.g. SiO2) Si species nor between different organic Si species. However, extraction and determination of the total organic silicon emitted by a medical device during the administration of a drug can serve as a worst-case scenario for possible PDMS contamination. This study aimed to develop a method for the trace determination of extractable organic silicone from medical devices by ICP-OES. Due to the hydrophobic nature of silicone oils, quantitative extraction of the analytes is most feasible using organic solvents like methyl isobutyl ketone (MIBK). 
To avoid additional dilution steps of the organic extract and minimize analyte loss, a direct organic-solvent ICP-OES method was developed using oxygenated argon as auxiliary gas. To allow for determination of trace amounts of silicone oil from medical devices, a low limit of quantification (LOQ) was a main target. In addition to the LOQ, analytical parameters independent of the extraction technique, such as linearity, range and stability of solutions, were assessed. As a next step, the feasibility of the combined extraction and measurement method was tested by performing spike recovery experiments. For this, known amounts of silicone oil were spiked onto prefillable glass syringes, since they are one of the most widespread siliconized devices in pharmaceutical applications, and the extraction efficiency of up to six extraction steps was evaluated. In addition, an alternative solvent was successfully tested."|
|37.||James Longden||HepNet - a computational biology platform for RNAi target discovery||The major challenge for mature treatment modalities like RNAi is the identification of novel targets. Typically, target identification approaches involve the search for individual proteins in well-characterised pathways. However, such reductionist approaches are prone to failure because they do not reflect the reality that human biology is incredibly complex. Proteins are highly interlinked in inter- and intracellular signaling networks, and diseases are most often caused by abnormalities within these networks rather than by aberrations in single genes. At e-therapeutics we aim to understand disease using our suite of proprietary computational models to build and interrogate protein-protein interaction networks. To support this work we are developing a centralized interface that will give biologists access to large-scale data currently amenable only to computational interrogation, as well as the ability to participate in target selection in concert with artificial intelligence algorithms.|
|38.||Ellen Gordon||From FASTQ to Visualization: Experience the transformative power of g.nome||"The rapid expansion of biological data, precipitated by innovations like next-generation sequencing and single cell omics technologies, has ushered in a new era of scientific discovery. While promising, these advancements bring forth many challenges, especially when reanalyzing large and complex data from groundbreaking publications in the context of one’s own research. To address these challenges, we present the g.nome™ platform, which builds a collaborative space for bioinformaticians and biologists by delivering a cloud-based, no-code/low-code solution. g.nome enables scientists to analyze their NGS datasets and reanalyze public data without delving deep into the complexities of data science. Scientists can select a proven method from our curated collection or build their own pipeline to copy a published workflow. Highlighting the transformative power of g.nome, we spotlight how g.nome can be used for single-cell RNAseq analysis in exploring and re-analyzing publicly available datasets. Here we show how to reanalyze datasets from two different publications in order to recapitulate their findings and generate new insights. The platform's intuitive design ensures that raw data is seamlessly transformed into actionable insights. To promote collaborative and reproducible research, g.nome furnishes fully modular and customizable workflows and cutting-edge toolkits. With g.nome, long-time barriers linked to workflow language, process flow visibility, and quality control are removed. All that’s left are streamlined, scalable, and interoperable genomic workflows — leaving research teams to do what they do best: focus on the science."|
|39.||Elad Katz||Scaling up of disease relevant human kidney models for drug discovery||"Kidney organoids, miniaturized and simplified organ structures, have emerged as important models for drug screening, developmental biology and disease modelling. The use of kidney organoids is a pivotal move in the fight against chronic kidney disease (CKD), which affects 10% of the global population and causes millions of deaths worldwide. We developed a novel 384-well plate, NaviPlate, allowing us to seed, generate and incubate 3D cell structures such as organoids with ease. Here we demonstrate the use of the NaviPlate to create induced pluripotent stem cell (iPSC)-derived kidney organoids without manual intervention, for modelling tissue development and drug screening. Using an F-tractin-mKate-tagged iPS cell line, we can monitor and evaluate our fully automated system throughout the protocol. Using 3D fluorescence imaging and qPCR, we targeted a variety of markers to characterise our kidney organoids and compared them against adult kidney tissue. In our organoids, we detected the presence of various crucial structures of the nephron: nephrin- and PODXL-positive cells of the renal corpuscle, LTL+ cells of the proximal tubules and E-cadherin+ cells of the distal tubules. We used the same markers on frozen fixed organoids and adult kidney slices to compare differences in morphology. We discovered interesting variations, such as the absence of AQP1 (a crucial water transporter present in the proximal tubules) in the kidney organoids. Using qPCR, we processed individual organoids with as few as 2000 cells per organoid. We screened them for crucial transporters present in the proximal tubules such as URAT1, BCRP, OAT1, OAT3, OCT2, OCT3, MATE2K and MATE1, and discovered insightful differences in the expression of URAT1, OCT2 and MATE2K. These differences might prove critical in evaluating the maturity of our kidney organoid models. 
Understanding the full transporter picture is important for interpreting the outcome of any drug screen and its possible use for evaluating toxicity in vitro. A variety of perturbations were possible using the NaviPlate technology to mimic diabetic nephropathy, proteinuria (elevated albumin protein levels) and hyperlipidemia (elevated free fatty acid levels) in vitro. KIM1 and Clusterin, FDA-approved biomarkers of kidney injury, were elevated in these disease models. Our system and research facilitate the automated generation of reproducible kidney organoids with functional glomeruli and tubules and provide key markers that can be used to evaluate organoid maturity or for drug screening."|
|40.||Ulrich Goldmann||Building a multi-omics knowledge base for the human solute carrier family||"The RESOLUTE consortium has worked on 446 human transmembrane transporters, generating reagents (plasmids, cell lines, etc.), tools (binders, assays, etc.), and data (transcriptomics, proteomics, metabolomics, etc.). Using systematic knock-out and overexpression perturbations in cancer cell lines, we created rich omics data sets, which we released to the public in line with the FAIR principles. To manage, analyze and share this effort, we built a transporter-focused data and knowledge base compiling all RESOLUTE data, metadata and analysis results, integrated with publicly available annotations and data sets on genes, proteins, variants and compounds. Employing controlled vocabularies and ontologies turned out to be essential given the redundancy and ambiguity of SLC function. A state-of-the-art web portal (https://re-solute.eu) enables browsing and querying this unique transporter-focused resource. Several interactive dashboards allow the generation of hypotheses, based on visual exploration of individual data sets as well as more integrative analyses to find similarities between transporters across data modalities. Based on all the available data and integration, a biochemical/biological function is assigned to each human SLC, making this one of the largest systematic annotations of gene function to date and a real milestone for transporter biology."|
|41.||Lluna Gallego||A highly versatile drug delivery platform||"Nanoparticle drug delivery systems (DDS) are attracting growing attention for their potential to drastically improve the efficacy and safety of many cancer therapeutics. Key challenges DDSs address include cargos that are insoluble, have poor stability or have significant off-target effects. With current DDS technologies, only a small fraction of the administered dose reaches the target site to have its intended effect. In addition, current technologies provide non-complementary and highly fragmented solutions, with individual approaches able to address only a subset of these challenges with a subset of potential cargos. First developed 20 years ago, metal-organic frameworks (MOFs) are one of the most exciting areas in recent materials science. These porous hybrid solids are largely payload-agnostic, and the modifications used to control the carrier’s behaviour transfer well between different MOFs. This enables them to house virtually any therapeutic payload and to use industry-standard technologies to modify the delivery profile, offering a technology platform with the potential to become the near-universal solution the drug delivery industry needs. Vector Bioscience Cambridge was founded in 2021, based on 15+ years of research at the University of Cambridge, with the objective of becoming the first company to take this highly promising DDS platform to market, focusing on macromolecule delivery. Macromolecules such as siRNA are potentially the most powerful anti-cancer drugs in existence, but there is currently no efficient way to deliver them specifically to the tumour. The Vector platform is a new delivery method with enormous potential to improve the safety and effectiveness of RNA-based therapies. Our first use case is the deployment of siRNA for the treatment of hard-to-treat cancer types, starting with pancreatic cancer."|
|42.||Tania Meneses||Enabling patient registries in healthcare organizations||"Patient registries can unlock valuable scientific insights. They can close gaps left by clinical studies and provide scientific evidence on treatments, diseases, guidelines, and budget planning in real-world settings with a broad patient population. However, it is not always simple for a healthcare organisation to create and securely collect such registries. The Clarum vision is to help you harness the power of registries and real-world evidence through an intuitive digital solution that enables you to create and collect registries in real-life settings both rapidly and securely."|
|43.||Alina Kagermazova||Depletion of AAV neutralising antibodies from the bloodstream to enable wider use of AAV Gene Therapies||
Adeno-associated virus (AAV) gene therapies are a promising approach to treating genetic disorders. AAVs are small, non-pathogenic viruses that have been engineered to deliver therapeutic genes to target cells. Once inside the cell, the AAV vector delivers the therapeutic gene, which can then produce the protein that is missing or malfunctioning in the genetic disorder. One challenge with AAV gene therapy is the presence of pre-existing antibodies to AAV in some patients. These antibodies can neutralize the AAV vector before it has a chance to deliver the therapeutic gene to the target cells. This can limit the effectiveness of the therapy, and today those patients are considered ineligible for treatment. This problem also prevents the redosing of patients: those treated with AAV therapies mount a strong immunogenic response and therefore have high AAV antibody levels that will persist for the remainder of their lives. We have developed a technology platform that can specifically deplete substances from the bloodstream of patients. The platform is currently being adapted to deplete AAV antibodies from prospective patients, potentially providing a window of opportunity for treatment. In this presentation, we report on the latest data and development of this technology, and the next steps required to bring it to the clinic.
|44.||Razieh Azari||Using Crowdsourcing to Optimize Lay-friendliness of Patient Information Leaflets||
Background: The Patient Information Leaflet (PIL), which must accompany all medication and informs patients about dosage, side effects, etc., is the most important source of information about a medication. In the European Union (EU), PILs "must be written and designed to be clear, understandable and enable the users to act appropriately" (Article 63 (2) of EU Directive 2001/83/EC, European Parliament and Council, 2001), here termed lay-friendly. To ensure the production and translation of lay-friendly PILs, the European Medicines Agency (EMA) and the European Commission have introduced initiatives including the Readability Guideline, templates in all EU languages, and user-testing of PILs. However, studies have shown that PILs are not linguistically lay-friendly, and the PIL templates have been criticized because they were produced without having been tested. The present study investigated the lay-friendliness of the English PIL template. Methods: An unpaid crowdsourcing initiative called Easy PIL has been running on the Citizen Science Center Zurich platform since April 2021. The crowd was invited to participate through social media. An English PIL template freely available on the official EMA website was selected. Each sentence of the template was presented as a separate task, resulting in 73 tasks. As PILs are produced for the general public, anyone with a good command of English could submit an evaluation of the lay-friendliness of each sentence on a defined scale. Results: A total of 47 people participated in the Easy PIL initiative from its inception until May 16, 2023. 76% of the crowd could completely understand all of the sentences of the English PIL template; 15% found them easy; 6% found them of average difficulty; 2% found them difficult; and 0.5% could not understand them. The majority of the crowd evaluated the sentences in the English PIL template as lay-friendly. 
Conclusion: The English PIL template was evaluated as lay-friendly. Despite limitations such as the digital divide, the lack of sample representativeness, and data-quality concerns, crowdsourcing can be a potential alternative and supplement to traditional methods for improving the lay-friendliness of PILs. This study could be expanded further to use crowdsourcing to develop a lay dataset or parallel complex-lay corpora of PILs in different languages and settings.
|45.||Bence Keömley-Horváth||Cytocast Simulated Cell: Unlocking Drug Effect and Side Effect Prediction||
In drug discovery and development, the significance of predictive tools that simulate cellular responses to drug-induced changes has grown substantially. Cytocast introduces an innovative approach using virtual cell models to forecast cellular behavior in response to drug perturbations. By integrating bioinformatics databases, Cytocast constructs a powerful cell simulation, the Cytocast Simulated Cell, capable of assessing the impacts and potential adverse reactions of drug treatments across 17 tissue types. The process involves generating virtual cell models, selecting drugs for evaluation, and executing parallelized simulations that take protein-protein interactions into account. Quantitative insights are generated, showcasing affected proteins, interactions, and cellular phenotype changes, while high-performance computing ensures efficient simulation execution. The technology empowers informed decisions during R&D by predicting drug efficacy and potential side effects. The Cytocast Simulated Cell enhances efficiency from initial discovery to market-ready drugs, reducing costs and clinical trial resources, with potential applications across pharmaceuticals and personalized medicine.
|46.||Stefano Piotto||SoftMining drug discovery platform||
The SoftMining platform leverages state-of-the-art artificial intelligence techniques, deploying advanced algorithms and machine learning methodologies to streamline and expedite the drug development pipeline. By virtue of its predictive prowess, the platform facilitates the identification of prospective drug candidates with unparalleled precision, thereby optimizing resource allocation and timelines. Additionally, it provides a robust framework for assessing compound safety through predictive toxicity profiling. The molecular similarity analysis aids in the discovery of structurally analogous compounds, enriching the lead identification process. SoftMining's unwavering commitment to scientific excellence underscores its pivotal role in advancing the frontiers of AI-driven drug discovery. This presentation underscores the company's dedication to contributing substantively to the field and catalyzing the expedited development of therapeutics with profound clinical implications.
|47.||Ewelina Grudzien, Eelke van der Horst, Frank van den Bergh, Duncan Ng, Falk Hildebrand, Chris T. Evelo, Susan L. M. Coort, Duygu Dede Şener, Elisa Cirillo, Maria H. Traka||Managing food and microbiome studies data using Fairspace, a flexible and FAIR data management platform||
Fairspace is an open-source research data management platform that adheres to the FAIR principles. The Hyve created Fairspace in 2016 and has developed it ever since, customizing it for several organizations’ use cases. We present the implementation of this tool within the FNS-Cloud consortium, where Fairspace serves as the user-facing browser that allows exploration of microbiome and food data within public resources mapped to a common (meta)data model and shared vocabularies/ontologies.
|48.||Sofia Bazakou, Maxim Moinat, Alessia Peviani, Anne van Winzum, Stefan Payralbe, Vaclav Papez, Spiros Denaxas||Mapping UK Biobank to the OMOP CDM: challenges and solutions using the Delphyne ETL framework||
The UK Biobank (UKB) is a comprehensive registry housing medical and genetic data from 500,000 participants. Collaborating with University College London (UCL), we mapped UKB data to the OMOP CDM v5.3 within the European Health Data Evidence Network (EHDEN) framework, facilitating COVID-19-related research. Challenges included converting wide-format tables to long format and harmonizing diverse source terms. Synthetic data aided script development. We employed OHDSI tools and our internal ETL pipeline, Delphyne, for high-quality mapping. Quality assessments yielded a 99% pass rate. Our work enhances the utility of UKB data for EHDEN studies on COVID-19, showcasing effective collaboration and community engagement.
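The wide-to-long conversion mentioned above is the central reshaping step when mapping registry extracts to event-style OMOP CDM tables. A minimal sketch of the idea, assuming a hypothetical field-naming pattern and synthetic values rather than the actual UKB data dictionary or the Delphyne API, can be expressed with a pandas melt:

```python
import pandas as pd

# Hypothetical wide-format extract: one row per participant, with repeated
# measurements stored as columns named "<field>-<instance>.<array>".
wide = pd.DataFrame({
    "eid": [1000001, 1000002],
    "systolic_bp-0.0": [120, 135],
    "systolic_bp-1.0": [118, 140],
})

# Melt to long format: one row per participant per measurement, the shape
# required by event tables such as OMOP's MEASUREMENT.
long = wide.melt(id_vars="eid", var_name="source_field", value_name="value")

# Split the column name back into field and instance for term harmonization.
long[["field", "instance"]] = long["source_field"].str.extract(r"(.+)-(\d+)\.\d+")
```

In a real ETL, each resulting `field` value would then be mapped to a standard concept via a harmonized source-term dictionary; this sketch only illustrates the table reshaping itself.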
|49.||Jessica Singh, Mirella Kalafati||cBioPortal @ TheHyve: Bridging Genomics and Clinical Insight through Comprehensive Integration, Intuitive Visualization, and Tailored Solutions||
cBioPortal is a pivotal resource in the landscape of cancer genomics, orchestrating a harmonious convergence of multidimensional genomic datasets. This integrative capacity enables researchers and clinicians to uncover recurrent genetic anomalies, shedding light on their roles in disease progression. The platform's robust toolkit encompasses a range of user-friendly visualization tools, including panoramic study summaries, hierarchical-clustering-based OncoPrint heatmaps, lollipop plots, survival charts, and a group comparison tab that effectively distills complex genomic landscapes into comprehensible insights.
Significantly, cBioPortal's impact extends beyond its intrinsic functionalities as it integrates with numerous external resources. The inclusion of the OncoKB, Cancer Hotspots, COSMIC and Clinical Interpretation of Variants in Cancer (CIViC) databases augments the platform's annotations with clinical context, enriching the interpretation of genomic aberrations.
cBioPortal's long-standing collaboration with The Hyve has allowed pharmaceutical companies and research organizations worldwide to deploy private cBioPortal instances with proprietary data. The Hyve's expertise results in tailored solutions for diverse research objectives, making cBioPortal adaptable to various study designs. This collaboration accelerates the translation of genomics data into actionable insights.
In summary, cBioPortal emerges as a transformative force at the intersection of genomics and clinical application. Its comprehensive integration, coupled with an array of intuitive visualization tools, uncovers the genetic intricacies that cause cancer. Integrating external resources elevates its capabilities, while the collaborative partnership with The Hyve ensures customization and adaptability. As genomics continues to shape precision medicine, cBioPortal's multifaceted approach empowers researchers and clinicians to traverse the complex terrain of cancer genomics with confidence and clarity.