Balancing the Nordic Power System with AI – key lessons learned
In 2017, we took on the challenge of developing an AI for real-time balancing of a large-scale power system. We started in the context of the Nordic power system, with its main System Operators (those responsible for balancing the system) – Statnett and Svenska Kraftnät – as pilot customers. The AI – named Impala – is now on course to become operational software that helps keep the lights on, efficiently and reliably.
The purpose of this post is to share three high-level learning points from developing an operational AI for a real-world problem, particularly in the context of the Nordic power system. The target readers are professionals interested in understanding key aspects of developing operational AI that solves real-world problems. The reader is not required to have deep knowledge of Machine Learning or the Nordic power system.
I've divided the post into three parts:
The Nordic power system and upcoming challenges: In order to explain the lessons learned, I need to describe the problem and its context. Therefore, a brief explanation of the Nordic power system and the challenges it is facing is provided.
AI terminology: As this post targets a broader audience than Data Scientists, some basic AI terminology is in order.
Three lessons learned: The main lessons learned from developing the AI.
Operating the Nordic power system
Electric power is the ultimate perishable good – it must be produced exactly when it is consumed and transported from the power plants to the socket you are using. Solving this task requires a complex system which, for simplicity, we can divide into two separate layers: a coordination layer (solving the problem of ensuring instantaneous balance between production and consumption) and a physical layer (transporting the electric power). In this post, I will focus on the coordination layer – which is where we at Optimeering mostly work.
Figure 1 – The "coordination layer" (in pink) and the "physical layer" (in blue) of the Nordic power system.
In the liberalized Nordic power system, the coordination problems are first solved by Adam Smith's invisible hand of the market. This is in essence done through day-ahead auctions on Nord Pool, where consumers (represented by retailers) and producers agree on tomorrow's consumption and production at hourly resolution. The market, however, has two serious limitations.
Tomorrow's consumption and production are based on today's best estimates of tomorrow. For well-regulated hydropower, these estimates match the outcome closely, but for most consumers and for intermittent power sources such as wind power, solar power and run-of-river hydro plants, the estimates and the actual values can differ substantially.
The market has an hourly time resolution, but power consumption and production vary substantially within the hour.
For these reasons, the responsibility for instantaneous balancing is given to the System Operator(s) (Statnett in Norway). As the Nordic power system is one big synchronous power system, this balancing is concerned with matching the total sum of Nordic production with the sum of Nordic consumption, hence the Nordic System Operators must cooperate tightly. To complicate matters further, despite the Nordics being one synchronous area, there are internal transmission bottlenecks in the grid. For this reason, the system must be balanced within each of its 11 bottleneck areas (often referred to as market areas within the Nordic synchronous power system).
In order to balance the market, the System Operators buy the right to activate or deactivate the power output of large power plants (reserves) through different balancing markets. These reserves are then used either reactively (when an imbalance is observed, the balance is restored by reserve activations) or proactively, by ordering reserves before the imbalance takes place. The latter is significantly cheaper as it can make use of reserves with a somewhat larger response time (2-15 minutes), but to do so the operators need to know what will happen in the near future. This is exactly what Impala provides.
Impala provides real-time predictions of upcoming system imbalance – area by area in the Nordic power system. To be exact, Impala provides a continuous two-hour prediction window consisting of 24 imbalance predictions at 5-minute resolution. This prediction window is refreshed every 5 minutes as new data becomes available. These predictions are then coupled with reserves available in the market to produce suggested (area-specific) activations that balance the market proactively.
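To make the rolling prediction window concrete, here is a minimal sketch of its structure (the timestamps and function name are illustrative, not taken from Impala's actual interface):

```python
from datetime import datetime, timedelta

# Sketch of the prediction window described above: 24 imbalance predictions
# at 5-minute resolution, i.e. a rolling two-hour horizon that is refreshed
# every 5 minutes as new data arrives.
def prediction_window(now, steps=24, resolution_minutes=5):
    """Return the 24 target timestamps of one two-hour prediction window."""
    return [now + timedelta(minutes=resolution_minutes * (i + 1))
            for i in range(steps)]

window = prediction_window(datetime(2018, 1, 1, 12, 0))
print(len(window))   # 24
print(window[-1])    # 2018-01-01 14:00:00, i.e. two hours ahead
```

Five minutes later, the whole window shifts forward one step and all 24 predictions are recomputed with the freshest data.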
The total cost of balancing the Nordic power system is roughly 1 billion NOK per annum. This does not include the cost of the energy activated, just the cost associated with buying the rights (capacity) to activate/deactivate. Additionally, the System Operators have for the past couple of years failed to reach their balancing KPIs (key performance indicators). Making use of Impala could help the System Operators reach their KPIs and reduce the associated cost by utilizing cheaper reserves. In a report published by the Nordic TSOs in early 2018, they stated:
“…IMPALA ... can forecast the imbalances in real-time with the help of artificial intelligence and large quantities of data, making it easier to proactively handle future imbalances. The imbalances are foreseen to be reduced by 25 per cent and will hence reduce the frequency deviations significantly…”
(“The Way Forward - Solutions for a Changing Nordic Power System”, March 2018.)
Challenges ahead
System balancing is already a big problem, but it is likely to become an even bigger problem in the future. The main driver for this is the increasing amount of renewable energy capacity installed in the Nordics and elsewhere on the continent. Simply put, most of the new renewables being built (wind and solar) are hard to estimate, hard to use as reserves, and have a power output that varies within the hour. This increases the consequences of the market deficiencies mentioned earlier. End-user flexibility might help the system somewhat, but exactly how is still a work in progress.
The arriving wind power is increasing the operational challenges of the Nordic power system
AI terminology
I often find it useful to describe different kinds of AI depending on how explicitly an algorithm is programmed. In my view, even explicitly formulated (or rule-based) programs can be viewed as AI, but we most often talk about AI as Machine Learning and Deep Learning. These types of AI enable us to build software that learns from experience rather than being explicitly or rule-based programmed by humans. Figure 2 below visualizes this. The continuous axes drawn in figure 3 are partly related and extend my view on a common (but simplified) trade-off one encounters when developing AI – namely the pros and cons of rule-based versus learning-based algorithms. The more learning-based (or abstract) the AI algorithm is, the more training data it requires, the less transparent it is, and to some extent the less domain expertise it requires. Put differently: if you are to build a chess computer, you should either have a lot of knowledge about chess, or you need many training examples, knowledge about deep learning, and few constraints on the transparency and consistency provided by the solution.
Figure 2 – Common AI-terminology
Figure 3 - Rule-based versus Learning based AI
Key Lessons learned
Building successful AI requires widespread domain understanding
Developing AI, and particularly Machine Learning applications that are actually useful for real-world problems, relies much more than is generally admitted on the developer's understanding of how to combine available AI methods and tweaks to meet the real-world problem. For this reason, the developer needs in-depth knowledge of the problem and its domain. This strictly opposes the view that developing AI and Machine Learning is just a matter of using existing open-source algorithms and getting data.
I offer three main reasons for this:
Understanding the problem: When starting, chances are that neither the target metric, the solution constraints, nor the possible input data are clearly identified. It is therefore up to the AI developer to specify these based on input from the end users and various domain experts. Understanding the business problem and translating it into an AI problem – by identifying the requirements and limitations and formulating the performance metrics – requires asking many stakeholders the right questions. This is hard unless you have properly understood the domain.
Identifying relevant data and selecting appropriate AI methods: Real-world problems often have real-world data limitations and constraints. Domain knowledge and problem understanding are essential for identifying relevant data, and highly important for selecting the best-suited AI methodologies, algorithms and workarounds for solving the problem.
Accomplishing and improving AI performance: There are literally thousands of algorithms, methods, tweaks and hyper-parameters you can adjust to check if the AI improves. Unfortunately, you can never try out everything, so you can view the development of successful AI as an optimization problem where you want to maximize the performance gained per time spent. When developing Impala, we initially tried a bit of everything and eventually came to a point where we achieved (by far) the biggest performance improvement per time spent by better incorporating domain knowledge into the AI. Furthermore, closely examining the incidents where the AI performs worst is often best practice for improving AI systems. These incidents are inherently hard to grasp unless you have a broad domain understanding and are able to hypothesize about the causes.
Some might say that you could rely on external domain experts, but there are practical problems with this. First, the domain knowledge required is likely to be distributed across many experts, who are all busy with their own work. Secondly, these experts have little knowledge about AI and thus lack the ability to formulate their expertise in a way that is straightforward for the developer to incorporate into the AI development. Someone needs to connect the dots and see the big picture – and this someone is the AI developer.
Starting out with Impala, we had a visualization of the target model interface and a good description of the variables we were to predict, but little specification of how a good model differed from a bad one and no prioritization of user requirements. Most importantly, little was known about the drivers (or predictors) of system imbalance and the data we should be using. We started by improving our own understanding and then iterated it with the end users. We came up with important criteria such as:
Small prediction errors have a small cost, whereas large prediction errors (above a certain threshold) have a significantly higher cost. The evaluation metric should reflect this.
Transparency and understandability are very important. A complete "black box" system is not acceptable; we should be able to understand why the system has provided certain predictions.
Robustness and consistency. Predictions shall be within a reasonable and realistic range at all times, and we should be able to give an overall explanation for given predictions if required.
Of the two-hour prediction window (consisting of 24 predictions), the three predictions ranging from t+15 to t+30 minutes are more important than the later part of the prediction window.
The prediction system must be retrainable in case of changes in the underlying power system (altered area borders, new production mix, changed consumer behavior, new export cables, etc.).
These criteria are very important to take into consideration, but they would not have been uncovered unless we had really dug into understanding the problem domain.
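The first criterion – small errors are cheap, errors above a threshold are expensive – can be captured in a custom evaluation metric. Here is a minimal sketch; the threshold and weights are hypothetical, not the values used for Impala:

```python
# Sketch of a threshold-aware evaluation metric: errors below the threshold
# incur a small per-unit cost, errors above it a much higher one. All
# numbers here are illustrative.
def imbalance_cost(errors, threshold=100.0, small_weight=1.0, large_weight=10.0):
    """Mean cost of a list of prediction errors (e.g. in MW)."""
    total = 0.0
    for e in errors:
        e = abs(e)
        total += small_weight * e if e <= threshold else large_weight * e
    return total / len(errors)

print(imbalance_cost([50, -80]))   # 65.0   (both errors below the threshold)
print(imbalance_cost([50, -200]))  # 1025.0 (one large error dominates)
```

A metric like this ranks candidate models the way the operators actually experience them: a model with many small errors beats one with a few large excursions, even if their mean absolute errors are similar.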
Following the problem specification, we started mapping underlying drivers and relevant data. We realized that no single expert could explain the complete process causing the system imbalance, so we interviewed several domain experts, ranging from meteorologists, wind forecasters, power retailers, IT staff and production planners to market makers and accountants. This resulted in a broad understanding of the main drivers and the key aspects of each of them. This insight is partly visualized in figure 4 below.
Figure 4 – The underlying drivers of system imbalance
During these expert meetings, we carefully tried to understand how the sequence of events contributes to the system imbalance. Reformulating these insights into precise features – thereby making the AI able to learn from experience – is probably the main reason why Impala works so well. The process of using domain understanding to improve AI solutions brings us to the next lesson learned.
Human domain expertise is gold
In our experience, the biggest contributor to AI performance is the incorporation of human domain expertise through feature engineering. For real-world problems, data seldom come in ready-to-use matrices, hence we need to engineer the features we think are relevant. Metaphorically speaking, feature engineering is about developing an effective curriculum for the learner (the AI), enabling it to learn the associations it relies on when making predictions. The AI developer has overall responsibility for the curriculum, but they are strongly advised to ask domain experts for help on how best to develop the curriculum for each of the sub-domains.
To give an example: figure 4 illustrates relevant data domains that all play a role in generating the system imbalance we want to predict. Our task is to enable Impala to see the patterns we believe are there. In the lower left corner there is a calendar icon, illustrating that many domain experts tell us the day of the week might be relevant for the upcoming imbalance. We therefore want to provide this day signal to the AI. One way of doing so is to add a column with a weekday number, but the problem is that the numbers 0-6 (Sunday-Saturday) signal to the AI that days 4 and 5 are quite similar, whereas days 6 and 0 are very different – which in our case is quite wrong. The operators tell us that Saturdays and Sundays are in fact quite similar. We might even add that common holidays also resemble Sundays more than ordinary weekdays. To account for this, we could either transform the day number through sine and cosine, or build our own custom translator. The latter would also enable us to take into account the concept of "trapped" Fridays ("inneklemt fredag" – a common Swedish and Norwegian practice of taking off a Friday trapped between a Thursday holiday and the weekend) in Norway and Sweden, but not in Finland and Denmark. Similar concerns apply to almost all calendar-related data, and indeed to all the data domains shown in figure 4.
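The sine/cosine transform mentioned above can be sketched in a few lines. This is a generic illustration of cyclical encoding, not Impala's actual feature code:

```python
import math

# Sketch: encode day-of-week cyclically so that Saturday (6) and Sunday (0)
# end up close together, instead of at opposite ends of a 0-6 scale.
def encode_day(day_number):
    """Map a day number 0-6 onto a point on the unit circle."""
    angle = 2 * math.pi * day_number / 7
    return (math.sin(angle), math.cos(angle))

def distance(a, b):
    """Euclidean distance between two encoded days."""
    return math.dist(a, b)

sat, sun, wed = encode_day(6), encode_day(0), encode_day(3)
# Saturday is now closer to Sunday than Wednesday is:
print(distance(sat, sun) < distance(wed, sun))  # True
```

The custom-translator alternative would instead map each date to a hand-crafted "day type" (weekday, weekend-like, trapped Friday, holiday, and so on), trading generality for the ability to encode country-specific calendar knowledge.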
Understand how the algorithms work
As mentioned earlier, I believe that the developer's ability to figure out the most appropriate AI methods and tweaks is essential for developing a well-working AI. Picking out the relevant approaches and filtering out dead ends is an essential part of reaching a good solution in time. Doing so necessitates an in-depth understanding of how the algorithms actually work, their strengths and weaknesses, and how these can be compensated for through ensembling and system design. Only then can we find good candidate approaches that take the real-world limitations and user constraints into account.
A last example. The Impala problem has the following key aspects:
It needs to provide good imbalance predictions, significantly better than the benchmark.
It is a sequence-to-sequence machine-learning problem, implying that the order of past observations leading up to the time of prediction matters.
It is a non-linear machine learning problem.
It requires the algorithm to have a "long memory" of past events, as there are recurring patterns at multiple time scales. For instance, we have recurring patterns every hour, every day, every week and every year.
The underlying dynamics of the system are constantly changing. New wind power plants are installed, consumption patterns change, and we are exporting power to new countries. For these reasons, the AI must be able to adapt to a changing environment and cannot rely on a "large" amount of historical data.
It needs to provide realistic prediction intervals and provide almost instantaneous predictions.
It is subject to noise, varying data quality and "incomplete data".
The AI needs to be understandable and provide robust predictions – the reason being that we are dealing with critical infrastructure and security of supply.
Identifying appropriate approaches to meet these demands is not straightforward. Consider, for example, the latter two. Providing an understandable and robust solution brings us to the concepts of extrapolation and interpolation. When presented with out-of-sample data (e.g. extreme real-world observations not seen historically), some machine learning algorithms will rely on the learned functional relationships and use them to extrapolate their predictions – thus potentially predicting imbalances well outside historically observed values. To illustrate, say we observe fairly normal power system and market behavior, but a high-pressure system with cold weather not seen in decades (and therefore not present in the training data). Under such conditions, letting Impala rely on functional relationships and extrapolation feels risky, because Impala knows very little about the relationship between temperature and imbalance during extreme cold, and we risk it predicting something very wrong. Rather, given the aspects of our real-world problem, we want to limit out-of-sample predictions to historically seen outputs and let the human operators apply their wider knowledge on top of Impala (an interesting sidetrack would be to discuss why the human operators might be able to predict what would happen under such "out-of-sample" conditions).
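The contrast can be shown with two toy predictors – a least-squares line, which extrapolates freely, and a nearest-neighbour lookup, which can only ever return a historically observed output. This is a deliberately simplified sketch with made-up numbers, not Impala's actual model:

```python
# Sketch: an "extrapolating" model vs an "interpolating" one on toy data.
def linear_fit(xs, ys):
    """Least-squares line y = a*x + b; extrapolates beyond the training range."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return lambda x: a * x + (my - a * mx)

def nearest_neighbour(xs, ys):
    """1-NN predictor: always returns a historically observed output."""
    return lambda x: min(zip(xs, ys), key=lambda p: abs(p[0] - x))[1]

# Hypothetical training data: driver (x) vs imbalance (y), observed range only
xs = [0, 2, 4, 6, 8, 10]
ys = [0, 4, 8, 12, 16, 20]

lin = linear_fit(xs, ys)
knn = nearest_neighbour(xs, ys)

# An extreme input never seen in training ("cold snap not seen in decades")
print(lin(25))  # 50.0 -> extrapolates far beyond any observed imbalance
print(knn(25))  # 20   -> stays within the historically seen output range
```

Tree-based methods behave much like the nearest-neighbour sketch here (predictions are bounded by training outputs), while linear and many neural models behave like the fitted line – one reason algorithm choice matters for robustness.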
I should add that interpolation vs extrapolation is not a common categorization of machine learning algorithms and that I have used this mostly to underscore my point – namely that the developer needs to understand the algorithms in order to pick the appropriate ones.
I could go on to describe at least two more key lessons learned (namely "complexity is a cost" and "define a precise metric and a relevant benchmark"), but I will save them for a later post. If you remember only one thing from reading this post, it should be that developing AI for real-world problems is not just a matter of getting access to data and plugging it into an algorithm. If you encounter companies selling turnkey AI for all your prediction problems, it is probably too good to be true – at least for now.