{"id":780,"date":"2023-09-24T15:27:17","date_gmt":"2023-09-24T15:27:17","guid":{"rendered":"https:\/\/nickbrown.online\/?page_id=780"},"modified":"2026-03-29T19:14:14","modified_gmt":"2026-03-29T19:14:14","slug":"research","status":"publish","type":"page","link":"https:\/\/nickbrown.online\/?page_id=780","title":{"rendered":"Research"},"content":{"rendered":"\n<p>This page provides a summary of the different research areas that I am interested in and summarises some of my activities. The results of these projects can be seen in my <a href=\"https:\/\/nickbrown.online\/?page_id=12\" data-type=\"URL\" data-id=\"https:\/\/nickbrown.online\/?page_id=12\">publications<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Emerging architectures for HPC<\/strong><\/h2>\n\n\n\n<p style=\"text-align:justify\">A major component of my research is exploring the role that new, upcoming, hardware architectures can play in accelerating high performance computing codes. Often orders of magnitude more energy efficient than existing technologies, the challenge is often how to exploit these most effectively. Whether it is processing workloads centrally on supercomputers in a data centre, or out at the edge, requirements are being driven by scientific and societal ambition that uniquely suit these new types of hardware.<\/p>\n\n<p style=\"text-align:justify\">I am a Co-I on the \u00a31.7M EPSRC funded HPC-R project which is, in part, exploring a co-design approach to leveraging novel architectures for energy efficient HPC. This is currently focusing on the following technologies.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><em>RISC-V<\/em> <em>for HPC<\/em> <\/h4>\n\n\n\n<p style=\"text-align:justify\"><img decoding=\"async\" src=\"https:\/\/nickbrown.online\/wp-content\/uploads\/2023\/09\/Tall_2.png\" width=\"200\" align=\"left\" style=\"margin-right:10px;\">RISC-V is an open ISA and with over 20 billion RISC-V devices in existence, it has enjoyed phenomenal growth since it was developed over a decade ago. I was PI of the EPSRC funded <a href=\"https:\/\/riscv.epcc.ed.ac.uk\" target=\"_blank\">RISC-V testbed<\/a> which aimed to explore and popularise RISC-V in the context of HPC. This system has been endorsed as a <a href=\"https:\/\/riscv.org\/developers\/labs\/\" target=\"_blank\">RISC-V ecosystem lab<\/a> by RISC-V International who are the standards body.<\/p>\n\n<p style=\"text-align:justify\">I have undertaken extensive benchmarking of RISC-V CPUs, including the first to benchmark the <a href=\"https:\/\/arxiv.org\/pdf\/2309.00381\" target=\"_blank\">SG2042<\/a> and <a href=\"https:\/\/arxiv.org\/pdf\/2508.13840\" target=\"_blank\">SG2044<\/a> for HPC, as well as being the first to port scientific computing workload to a <a href=\"https:\/\/arxiv.org\/pdf\/2409.18835\" target=\"_blank\">RISC-V accelerator<\/a>. In addition to my research activities I am chair of the RISC-V International HPC SIG, am a RISC-V Ambassador and lead organiser for the RISC-V HPC workshop series which has been at a wide range of top HPC conferences including ISC, SC, HPC Asia, and HIPC.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><em>FPGAs and CGRAs<\/em><\/h4>\n\n\n\n<p style=\"text-align:justify\"><img decoding=\"async\" src=\"https:\/\/nickbrown.online\/wp-content\/uploads\/2026\/03\/versal_architecture.png\" width=\"300\" align=\"right\" style=\"margin-left:10px;margin-bottom:5px;\">More capable hardware make FPGAs a serious proposition for HPC like never before. 
<p style="text-align:justify">CGRAs, which provide coarse grained reconfigurability and typically comprise processing cores within a flexible interconnect, are becoming more popular. I was PI of an EPSRC funded CGRA project, which explored accelerating HPC codes on the <a href="https://www.cerebras.net/product-system/" target="_blank">Cerebras CS-2</a> and <a href="https://www.xilinx.com/products/technology/ai-engine.html" target="_blank">AMD Xilinx AI engines</a>. As part of this research I undertook the first study leveraging AI engines for HPC workloads, which was <a href="https://nickbrown.online/wp-content/uploads/2023/01/AIE_PW_advection.pdf" target="_blank">published at ISFPGA</a>.</p>

<p style="text-align:justify;"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2023/09/RSE-Primary-Black-Logo-square.png" width="150" align="right" style="margin-left:10px;">I was awarded a Royal Society of Edinburgh personal research fellowship, which ran in 2024 and 2025 and explored greener weather forecasting on supercomputers. Focusing on Met Office workloads, the first major area was leveraging FPGAs <i>in the network</i> to undertake in-situ data post-processing and reduction, a task that is currently done by CPUs.</p>

<h2 class="wp-block-heading"><strong>HPC compilers and programming models</strong></h2>

<p style="text-align:justify">A major motivation of my research is the grand challenge of how we can enable scientists and engineers to most effectively program current and future generation HPC machines. Given the large degree of parallelism and the heterogeneous nature of our supercomputers, this currently requires deep expertise. Put simply, scientists want to worry about their problem rather than the tricky, low-level details of parallelism, and if we do not solve this problem then the benefit we gain from exascale supercomputers will likely be limited.</p>

<p style="text-align:justify">I am a Co-I on the £1.5M EPSRC funded CONTINENTS centre-to-centre collaboration with the US National Centre for Atmospheric Research (NCAR), where one area of focus is compiler technologies for HPC.</p>
<h4 class="wp-block-heading"><em>xDSL: The cross domain DSL ecosystem</em></h4>

<p style="text-align:justify"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2023/09/xdsl_long_logo.png" width="300" align="left" style="margin-right:10px;">I was a Co-I on, and knowledge exchange coordinator of, the £1M EPSRC funded <a href="https://xdsl.dev">xDSL project</a>. The fundamental issue we looked to address is that, whilst Domain Specific Languages (DSLs) are one way in which we can solve the HPC programming challenge, the underlying DSL compiler ecosystems tend to be heavily siloed, which heightens user risk and results in maintenance burdens. To this end we developed a common ecosystem for DSLs based around MLIR/LLVM, enabling DSLs to become a thin abstraction layer atop this existing, well supported technology. xDSL continues to be developed, with over 1 million downloads.</p>

<p style="text-align:justify">Using two DSLs to drive our experiments, we not only developed a <a href="https://github.com/xdslproject/xdsl">Python-based MLIR compiler framework</a> but also numerous HPC focused dialects, including the MPI dialect which has now been upstreamed into MLIR. We also found that by leveraging MLIR it is often possible to consolidate many of these domain specific compilers with general purpose ones, which resulted in an <a href="https://arxiv.org/pdf/2404.02218.pdf" target="_blank">ASPLOS paper</a>. In addition, we developed a generic flow from Fortran, in Flang, to the upstream MLIR dialects. Whilst the purpose of this was to demonstrate performance improvements from core MLIR, it has also opened up a range of additional flexibility that is useful for driving novel architectures.</p>

<p style="text-align:justify">I am a Co-I on an ARCHER2 funded eCSE project which is using MLIR and xDSL to develop a Domain Specific Language (DSL) for the ExaHyPE hyperbolic PDE solver framework, raising programmer productivity and enabling automatic parallelism optimisation across CPUs and GPUs.</p>
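<p style="text-align:justify">A central idea in MLIR-style frameworks such as xDSL is that programs are represented as IR operations which small, composable rewrite patterns progressively transform and lower. The toy Python sketch below illustrates that concept with a constant-folding pass over a made-up expression IR; it deliberately does not use the real xDSL API, for which see the repository linked above.</p>

<pre class="wp-block-code"><code>​# Toy illustration of MLIR/xDSL-style rewrite patterns: a tiny
# expression IR and a pass that applies patterns bottom-up.
# This is NOT the real xDSL API.
from dataclasses import dataclass

@dataclass
class Const:
    value: float

@dataclass
class Add:
    lhs: object
    rhs: object

@dataclass
class Mul:
    lhs: object
    rhs: object

def fold(op):
    # Recursively rewrite children first, then try patterns on op
    if isinstance(op, (Add, Mul)):
        op = type(op)(fold(op.lhs), fold(op.rhs))
    # Pattern: add(const, const) -&gt; const
    if isinstance(op, Add) and isinstance(op.lhs, Const) and isinstance(op.rhs, Const):
        return Const(op.lhs.value + op.rhs.value)
    # Pattern: mul(x, const 1) -&gt; x
    if isinstance(op, Mul) and isinstance(op.rhs, Const) and op.rhs.value == 1.0:
        return op.lhs
    return op

expr = Mul(Add(Const(2.0), Const(3.0)), Const(1.0))
print(fold(expr))  # Const(value=5.0)
</code></pre>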
<h4 class="wp-block-heading"><em>Programmer productivity and performance on emerging architectures</em></h4>

<p style="text-align:justify"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2026/03/CS-2_xDSL.png" width="600" align="right" style="margin-left:10px;">The Cerebras Wafer Scale Engine (WSE), AMD&#8217;s AI Engines (AIEs) and Tenstorrent&#8217;s Tensix architecture are just three examples of the wide range of specialised accelerators we are seeing for AI workloads (typically focused on inference). At their heart these technologies provide specialised hardware for optimising linear algebra operations and data movement, and that same specialisation can also benefit general purpose scientific computing workloads. However, whilst this hardware has demonstrated performance and energy benefits, the devil is in the detail, especially in enabling programmers to write code for these architectures effectively enough to unlock this potential. Leveraging MLIR and xDSL, a major part of my current research is to explore not only how the compiler can undertake automatic algorithmic transformation for this type of hardware, but also what infrastructure can be shared between these specialised compilers. The ultimate objective is for scientific computing programmers to be able to deploy their codes effectively and unchanged across a wide range of architectures. This has been the second area of focus for my Royal Society of Edinburgh personal research fellowship and has led to <a href="https://arxiv.org/pdf/2601.17754" target="_blank">ASPLOS</a> and CCGrid papers amongst others.</p>

<h2 class="wp-block-heading"><strong>ML benchmarking</strong></h2>

<p style="text-align:justify">I am a Co-I on the £2M ARIA funded <a href="https://www.teasbench.com/">Tracking Evolving AI and Systems (TEAS)</a> benchmarking project. A collaboration between the School of Informatics, EPCC and Imperial College London, we are developing an ML inference benchmarking suite that comprises the latest ML workloads. This suite can then be used by vendors and researchers as a basis for comparing next-generation ML hardware and other technologies against the current state of the art. As well as publishing a scoreboard across a range of current architectures, a major part of this project is to port these models to emerging ML architectures such as the Cerebras CS-3. In EPCC we bring significant expertise in benchmarking techniques, and one key outcome we aim for is to bring some of these from the HPC domain to the ML domain.</p>
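<p style="text-align:justify">Much of sound inference benchmarking comes down to measurement methodology: warming up before timing, running many repetitions, and reporting percentiles rather than a single number. The sketch below shows a minimal harness of this kind; the model callable and batch are placeholders, and this is a generic illustration rather than the TEAS suite itself.</p>

<pre class="wp-block-code"><code>​# Minimal sketch of an inference latency harness: warm-up runs,
# repeated timed runs, and percentile statistics. The 'model'
# callable and batch are placeholders for a real workload.
import time
import statistics

def benchmark(model, batch, warmup=10, iterations=100):
    for _ in range(warmup):
        model(batch)            # warm caches, JITs, accelerator queues
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        model(batch)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "mean_s": statistics.mean(samples),
        "p50_s": samples[len(samples) // 2],
        # Simple percentile approximation over the sorted samples
        "p99_s": samples[int(0.99 * (len(samples) - 1))],
        "throughput_per_s": len(batch) / statistics.mean(samples),
    }

# Example with a dummy 'model' standing in for real inference
print(benchmark(lambda xs: [x * 2 for x in xs], list(range(1024))))
</code></pre>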
<h2 class="wp-block-heading"><strong>Urgent supercomputing</strong></h2>

<p style="text-align:justify"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2023/09/vestec-logo.png" align="left" width="300" style="margin-right:10px;">I led the <i>interactive supercomputing</i> work package on the €4M H2020 FETHPC <a href="https://cordis.europa.eu/project/id/800904">VESTEC</a> project and was responsible for ten deliverables. VESTEC ran between 2018 and 2022, and explored the fusion of real-time data with HPC for running urgent workloads to better inform disaster response. Our case studies included forest fires and mosquito-borne diseases, and my work package, which comprised around 100 person-months of effort across 7 partners, developed a technology providing federation of these interactive workloads across a range of supercomputers. A summary was <a href="https://ieeexplore.ieee.org/document/10201878" target="_blank">published in IEEE Access</a>.</p>

<p style="text-align:justify">As part of this project I began the UrgentHPC workshop series, which ran at SC19, SC20, SC21, and SC22. At the end of the project this initiative merged with the <a href="https://www.interactivehpc.com/">interactive HPC workshop</a> series, and since then I have been involved in organising further workshops at ISC22 and ISC23, as well as a BoF at SC23.</p>

<h2 class="wp-block-heading"><strong>HPC code acceleration</strong></h2>

<p style="text-align:justify"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2020/09/monc_logo.png" align="left" width="200" style="margin-right:10px;">I have worked with numerous organisations, developing and optimising their HPC codes, with a particular interest in developing new techniques for improving performance. For instance, between 2014 and 2015 I was the main developer of the Met Office NERC Cloud model (MONC), the Met Office&#8217;s high resolution atmospheric model used to explore weather phenomena at small scales and to develop new parameterisations for their main weather forecasting model.</p>

<p style="text-align:justify">The previous model was capable of modelling only around 10 million grid points, whereas this work increased capability to support the simulation of tens of billions of grid points over many thousands of CPU cores. In addition to the computation, a major challenge was the refinement of raw data to generate the higher level information in which scientists are interested. I developed a novel in-situ approach, where CPU cores are shared between computation and data analysis, enabling an order of magnitude increase in the data processing that was possible. Since the initial development I have been a Co-I on several projects that have further enhanced the model, and am PI on an ARCHER2 eCSE-GPU project, running between 2025 and 2027, which is porting MONC to GPUs.</p>

<p style="text-align:justify"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2023/09/bgs.png" align="right" width="200" style="margin-left:10px;">I have been a Co-I on a couple of projects with the British Geological Survey, optimising their geomagnetic models for modern supercomputers. For example, their Model of the Earth&#8217;s Magnetic Environment (MEME) code predicts the changing geomagnetic environment, but the challenge was that it was only capable of handling a tiny fraction of the raw data available from Swarm satellites and ground stations. In order to improve the accuracy of the model, and enable exploration of the more challenging polar latitudes, it was desirable to increase the size of the input data set and the resolution. At the core of this optimisation work was a novel technique for the parallel assembly of the matrix of equations, which had been extremely costly in the existing code base. Ultimately this work resulted in around a ten-fold increase in the size of data that could be handled by the code, along with a more than hundred-fold reduction in runtime, which we <a href="https://arxiv.org/pdf/2010.00283.pdf" target="_blank">published in CCPE</a>.</p>
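<p style="text-align:justify">The technique itself is detailed in the CCPE paper above; as a generic illustration of the underlying idea of assembling a matrix of equations in parallel, the mpi4py sketch below has each rank independently build a block of rows, with the blocks then gathered on rank 0. All names and sizes here are illustrative.</p>

<pre class="wp-block-code"><code>​# Generic sketch of parallel matrix assembly with MPI: each rank
# computes a contiguous block of rows independently, and the blocks
# are gathered on rank 0. Illustrative only; the MEME technique
# itself is described in the CCPE paper referenced above.
from mpi4py import MPI
import numpy as np

def assemble_row(i, n):
    # Placeholder for the (expensive) per-row equation assembly
    return np.array([1.0 / (1.0 + i + j) for j in range(n)])

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

n = 1000  # global matrix dimension (illustrative)
# Split the row indices (almost) evenly across the ranks
my_rows = np.array_split(np.arange(n), comm.Get_size())[rank]

# Each rank assembles its block of rows independently, in parallel
local_block = np.stack([assemble_row(i, n) for i in my_rows])

# Gather the blocks on rank 0 to form the full matrix
blocks = comm.gather(local_block, root=0)
if rank == 0:
    matrix = np.vstack(blocks)
    print(matrix.shape)  # (1000, 1000)
</code></pre>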
<h2 class="wp-block-heading"><strong>Machine learning</strong></h2>

<p style="text-align:justify"><img decoding="async" src="https://nickbrown.online/wp-content/uploads/2020/09/swoop_clay.png" width="230" align="left" style="margin-right:10px;">Whilst most of my interest in machine learning is in how best we can leverage novel architectures at the edge for ML workloads, I was PI of a project which explored the use of ML to optimise petrophysical workflows over well log data. The objective was to make better use of the expert&#8217;s time, because manual interpretation of each well took around 7 days and there were thousands of wells that required processing. The petrophysicist was using their knowledge and expertise to identify and extract specific patterns in the raw data, and the hypothesis was that, given a large enough training set, it would be possible to train an ML model to identify these patterns. We leveraged boosted trees and deep neural networks, ultimately reducing the overall well processing time to around two days, work that we <a href="https://arxiv.org/pdf/2010.02087.pdf" target="_blank">published in CCPE</a>. After the project concluded the IP was sold to PGS, which resulted in a follow-on project where we explored tuning the ML algorithms for their workloads.</p>
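<p style="text-align:justify">To give a flavour of the boosted-trees approach, the sketch below trains a gradient boosted classifier with scikit-learn on synthetic stand-in data. The features, labels and hyperparameters are purely illustrative and bear no relation to the actual well log models, which are described in the CCPE paper above.</p>

<pre class="wp-block-code"><code>​# Illustrative only: a gradient boosted tree classifier on synthetic
# stand-in data, mimicking the shape of a pattern-labelling task over
# well log measurements. Features, labels and hyperparameters here
# bear no relation to the actual production models.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(seed=0)
n_samples, n_features = 5000, 8   # e.g. gamma ray, density, porosity, ...
X = rng.normal(size=(n_samples, n_features))
# Synthetic 'interpretation' label derived from two of the features
y = (X[:, 0] + 0.5 * X[:, 1] &gt; 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(n_estimators=200, max_depth=3)
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
</code></pre>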