Summary

Context

In an era of rapid electrification, driven by the rise of electric vehicles and renewable energy, the demand for innovative solutions to manage energy grids has never been higher. The impact of the energy transition on the electrical grid is massive, and Distribution System Operators (DSOs) – the operators of the electric power distribution systems that deliver electricity to most end users – need to react quickly to avoid congestion of their low-voltage networks.

The project and the stakeholders

We worked together with Resa, a Distribution System Operator (DSO) in the Walloon Region, to develop a tool that monitors Resa's low-voltage network and simulates different scenarios to study their impact on each component of the grid.

However, the digital representation of the feeders that deliver electricity to end customers was missing from the tool. A fine-grained understanding of this "last-mile topology" is crucial to understand network dynamics at a very local level and to react to problems arising, among others, from the dense adoption of solar panels in certain areas.

In this joint project, Haulogy and Jetpack are working on creating and improving this digital grid, taking two different angles.

  • On the one hand, Haulogy uses optimization and business rules to create topological maps based on the currently existing digital data sources.

  • On the other hand, Jetpack.AI uses so-far-unexploited analog map material to create digital maps.

Our joint goal is to build the Resa digital grid and create the best possible topological maps based on the currently existing data sources: addresses of the substations and the EANs, GIS data such as the geographical positions of assets (substations, lines, cables, …) and customers, and technical data such as the direct link between customer and substation.

From paper maps to data: the Jetpack.AI approach

Extracting the data from paper maps is a technical challenge for which Jetpack.AI employed an innovative dual-pronged approach:

1. Using Vision-Language Models (VLMs)

The first vision-language models (VLMs) reached the market in 2023 and were further refined in 2024. These models are multimodal: they can understand text and, as a new capability, images at the same time.

In this approach, we explored how much information can be extracted from the given map material using such novel vision-language models, which combine the strengths of LLMs and computer vision. Throughout this project, we used GPT-4o, deployed on the Azure OpenAI Service for data privacy reasons.

We analyzed scanned paper maps and technical sheets to extract critical information such as substation names, feeder counts, and topology. The models performed quite poorly on the topology, despite various methods that improved the initial accuracy. However, extracting information from the tables associated with the maps, which give technical details about the cables, feeders, etc., proved successful once we optimized the prompts and the surrounding pipeline (fuzzy matching of street names, among others).
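The exact prompts are project-specific, but as an illustration, a table-extraction request for a chat-completions endpoint with an image attachment can be assembled as follows. The function name, JSON schema, and deployment name are hypothetical; only the message shape follows the standard vision chat format:

```python
import base64

def build_table_extraction_request(image_bytes: bytes, deployment: str = "gpt-4o") -> dict:
    """Build a chat-completion payload asking a VLM to extract the
    feeder table from a scanned technical sheet as structured JSON."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": deployment,
        "response_format": {"type": "json_object"},
        "messages": [
            {
                "role": "system",
                "content": (
                    "You extract tables from scanned grid documents. "
                    "Return JSON with keys 'substation' and 'feeders' "
                    "(a list of {'id', 'streets'})."
                ),
            },
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Extract the feeder table from this sheet."},
                    {
                        "type": "image_url",
                        # Scanned sheet passed inline as a base64 data URL.
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            },
        ],
    }
```

Constraining the response to JSON makes the downstream matching and merging steps far easier than parsing free-form model output.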

Pipeline description
To validate the feeder data extracted from Haulogy, we implemented a multi-step pipeline as shown in the figure below:

  • The technical sheets, provided on paper, contain tables listing the feeders and the streets they pass through. We employed Optical Character Recognition (OCR) with Document Intelligence (DI) to extract this information.

  • As handwritten text can be challenging to extract at a satisfying quality, we used a vision-language model (VLM) to proofread and correct the extracted data.

  • We used fuzzy matching to align the extracted street names with the actual street names.

  • We reconstructed feeder IDs from the cartouche information, based on established business rules, and merged them with Haulogy's.

  • GPS coordinates from the Haulogy data were reverse geocoded to obtain street names.
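The fuzzy-matching step above can be sketched with Python's standard `difflib`; the real pipeline and the reference street list are Resa-specific, so the streets and cutoff below are illustrative:

```python
import difflib

def match_street(ocr_name: str, reference_streets: list[str], cutoff: float = 0.7):
    """Align an OCR-extracted street name with the closest official
    street name, or return None when no candidate is close enough."""
    candidates = difflib.get_close_matches(
        ocr_name.strip().lower(),
        [s.lower() for s in reference_streets],
        n=1,
        cutoff=cutoff,
    )
    if not candidates:
        return None
    # Map the lowercase match back to its original spelling.
    return next(s for s in reference_streets if s.lower() == candidates[0])

streets = ["Rue de la Station", "Avenue des Tilleuls", "Quai de Meuse"]
match_street("rue de la statio", streets)   # OCR typo, close enough to match
match_street("boulevard inconnu", streets)  # no plausible candidate -> None
```

Normalizing case before matching and mapping back afterwards keeps the output aligned with the official GIS spelling.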

2. Advanced Computer Vision Techniques

We developed a robust image-processing pipeline to clean, extract, and georeference grid features from paper maps, identifying feeder paths, substations, and connections together with their geographical coordinates. The approach was inspired by https://github.com/koszullab/chromosight (see https://www.nature.com/articles/s41467-020-19562-7 for more details), a package for detecting 3D chromatin loops that was developed in the lab of one of the team's data scientists during his PhD.
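As a simplified illustration of the color-based extraction step (the real maps additionally require cleaning and georeferencing; the colors and tolerance here are made up), features drawn in a known color can be isolated and counted with NumPy and SciPy:

```python
import numpy as np
from scipy import ndimage

def extract_colored_features(image: np.ndarray, target_rgb, tolerance: int = 30):
    """Isolate map features drawn in a given color and label connected
    components (e.g. individual feeder strokes on a scanned map).

    image: H x W x 3 uint8 array. Returns (mask, labels, n_features).
    """
    # Per-pixel distance to the target color (max over channels).
    diff = np.abs(image.astype(int) - np.array(target_rgb)).max(axis=-1)
    mask = diff <= tolerance                  # pixels close to the target color
    labels, n_features = ndimage.label(mask)  # connected components = candidates
    return mask, labels, n_features

# Tiny synthetic "map": white background with two red feeder strokes.
img = np.full((10, 10, 3), 255, dtype=np.uint8)
img[2, 1:5] = (200, 30, 30)   # first stroke
img[7, 4:9] = (210, 25, 35)   # second stroke, slightly different shade
mask, labels, n = extract_colored_features(img, target_rgb=(205, 30, 30))
# n == 2: both strokes match despite scanning noise in the shades
```

The tolerance absorbs scanner noise and print fading; each labeled component can then be vectorized and georeferenced against the GIS layer.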

By overlaying digitized data with GIS records from Haulogy, discrepancies such as missing feeders or misaligned substations were identified and corrected.
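A sketch of that overlay comparison, assuming substation coordinates in a metric projection (the IDs, coordinates, and tolerance below are illustrative, not Resa data):

```python
from shapely.geometry import Point

def compare_substations(digitized: dict, gis_records: dict, max_offset_m: float = 15.0):
    """Flag GIS substations that are missing from the digitized layer or
    misaligned beyond a tolerance.

    digitized / gis_records: dicts mapping substation id -> (x, y) in meters.
    """
    report = {}
    for sid, (x, y) in gis_records.items():
        if sid not in digitized:
            report[sid] = "missing"
            continue
        offset = Point(x, y).distance(Point(*digitized[sid]))
        report[sid] = "ok" if offset <= max_offset_m else "misaligned"
    return report

gis = {"SUB-01": (0.0, 0.0), "SUB-02": (100.0, 50.0), "SUB-03": (200.0, 80.0)}
dig = {"SUB-01": (3.0, 4.0), "SUB-02": (160.0, 50.0)}
compare_substations(dig, gis)
# SUB-01: 5 m offset -> ok; SUB-02: 60 m -> misaligned; SUB-03 -> missing
```

Each flagged discrepancy becomes a correction candidate, reviewed before being written back to the digital grid.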

Main takeaways

VLMs excel at broad, well-defined tasks but struggle with detailed, complex extractions. They perform best with straightforward prompts, but potential hallucinations and inconsistent handling of detailed map features limited their usefulness in our case.

However, the second approach, combining document intelligence, color-based feature extraction, and computer vision, has yielded promising outcomes. While there are still areas for improvement, such as scaling to non-colored maps and handling extremely noisy data, the current solution is a significant step forward in digitizing paper information. Future developments will focus on refining scaling techniques, improving the handling of edge cases, and expanding the solution's capabilities to other types of maps and documents.

This solution sets the groundwork for more advanced applications in infrastructure management, potentially enabling real-time updates and analysis of digitized topology for smarter, more responsive networks.

Why It Matters

Digitizing electrical grids is more than just an operational upgrade—it’s about future-proofing our energy systems for a more sustainable world. By combining AI innovation with human expertise, Jetpack.AI is empowering DSOs like Resa to navigate the energy transition with confidence.

Tech stack

Throughout this project, we used GPT-4o, deployed on the Azure OpenAI Service for data privacy reasons.

LangChain is a composable framework that we use to build with LLMs.

On the computer vision side, we used common Python libraries combined with SciPy and Shapely. For the visualisation, we used QGIS.