ECEN615 Fall2020 Lect15

ECEN 615
Methods of Electric Power Systems Analysis
Lecture 15: Least Squares, State Estimation

Prof. Tom Overbye
Dept. of Electrical and Computer Engineering
Texas A&M University
[email protected]

Announcements
• Start reading Chapter 9
• Homework 4 is due on Thursday, October 15.

UTC Revisited

• We can now revisit the uncommitted transfer capability (UTC) calculation using PTDFs and LODFs
• Recall that we are trying to determine the maximum transfer between two areas (or buses in our example)
• For the base case, the maximums are quickly determined with PTDFs

$$u_{m,n}^{(0)} = \min_{\varphi_\ell^{(w)} > 0} \left\{ \frac{f_\ell^{\max} - f_\ell^{(0)}}{\varphi_\ell^{(w)}} \right\}$$

Here $\varphi_\ell^{(w)}$ is the PTDF of line $\ell$ for the transfer $w$. Note we are ignoring zero (or small) PTDFs; we would also need to consider flow reversal.

UTC Revisited

• For the contingencies we use

$$u_{m,n}^{(1)} = \min_{(\varphi_\ell^{(w)})^{k} > 0} \left\{ \frac{f_\ell^{\max} - f_\ell^{(0)} - d_\ell^{k} f_k^{(0)}}{(\varphi_\ell^{(w)})^{k}} \right\}$$

• Then as before $u_{m,n} = \min\left\{ u_{m,n}^{(0)},\ u_{m,n}^{(1)} \right\}$

We would need to check all contingencies! Also, this is just a linear estimate and does not consider voltage violations.

Five Bus Example

$$w = \{2, 3, \Delta t\}$$
$$f^{(0)} = [42,\ 34,\ 67,\ 118,\ 33,\ 100]^T$$
$$f^{\max} = [150,\ 400,\ 150,\ 150,\ 150,\ 1{,}000]^T$$

[Figure: one-line diagram of the five-bus system (buses One through Five connected by Lines 1 through 6, with a slack bus), showing the base case MW flows and per unit bus voltages]

Five Bus Example

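Therefore, for the base case

$$u_{2,3}^{(0)} = \min_{\varphi_\ell^{(w)} > 0} \left\{ \frac{f_\ell^{\max} - f_\ell^{(0)}}{\varphi_\ell^{(w)}} \right\}$$

$$= \min\left\{ \frac{150 - 42}{0.2727},\ \frac{400 - 34}{0.1818},\ \frac{150 - 67}{0.0909},\ \frac{150 - 118}{0.7273},\ \frac{150 - 33}{0.0909} \right\} = 44.0$$

The minimum is set by the fourth term, that is, by line 4.
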
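The base case calculation above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `base_case_utc` and the tolerance `tol` used to skip zero (or small) PTDFs are illustrative choices, not part of the lecture. Only the five lines that appear in the slide's calculation are included.

```python
import numpy as np

# Base case data for lines 1-5 of the five bus example (flows and limits in MW).
f_max = np.array([150.0, 400.0, 150.0, 150.0, 150.0])
f_0 = np.array([42.0, 34.0, 67.0, 118.0, 33.0])
# PTDFs for the bus 2 to bus 3 transfer, as they appear in the slide.
ptdf = np.array([0.2727, 0.1818, 0.0909, 0.7273, 0.0909])


def base_case_utc(f_max, f_0, ptdf, tol=1e-6):
    """Return (limit, zero-based line index) of the smallest margin/PTDF ratio.

    Lines with PTDF <= tol are ignored, mirroring the note about skipping
    zero (or small) PTDFs. Flow reversal is not considered.
    """
    candidates = np.flatnonzero(ptdf > tol)
    ratios = (f_max[candidates] - f_0[candidates]) / ptdf[candidates]
    k = int(np.argmin(ratios))
    return float(ratios[k]), int(candidates[k])


u0, limiting = base_case_utc(f_max, f_0, ptdf)
print(f"u0 = {u0:.1f}, limited by line {limiting + 1}")
```

The limiting element is found by index so the same routine could be reused for other transfer directions, given their PTDFs.
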
Five Bus Example

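• For the contingency case corresponding to the outage of line 2

$$u_{2,3}^{(1)} = \min_{(\varphi_\ell^{(w)})^{2} > 0} \left\{ \frac{f_\ell^{\max} - f_\ell^{(0)} - d_\ell^{2} f_2^{(0)}}{(\varphi_\ell^{(w)})^{2}} \right\}$$

(the superscript 2 denotes the outage of line 2, not a power)

The limiting value is line 4

$$\frac{f_4^{\max} - f_4^{(0)} - d_4^{2} f_2^{(0)}}{(\varphi_4^{(w)})^{2}} = \frac{150 - 118 - 0.4 \times 34}{0.8} = 23.0$$

Hence the UTC is limited by the contingency to 23.0
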
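The contingency limit for line 4 can be reproduced with a few lines of arithmetic. This is a sketch only: the function name `contingency_margin` and its argument names are hypothetical, and just the single limiting element quoted on the slide is evaluated, since the slide does not list the LODFs and post-outage PTDFs of the other lines.

```python
def contingency_margin(f_max, f_0, lodf, f_outaged, otdf):
    """Transfer allowed on one monitored line after a single line outage.

    f_max, f_0 : limit and base case flow of the monitored line
    lodf       : LODF of the monitored line for the outaged line
    f_outaged  : base case flow on the outaged line
    otdf       : PTDF of the monitored line with the outage in place
    This is a linear estimate; voltage violations are not considered.
    """
    return (f_max - f_0 - lodf * f_outaged) / otdf


# Line 4 monitored, line 2 outaged; all numbers are taken from the slide.
u1 = contingency_margin(f_max=150.0, f_0=118.0, lodf=0.4, f_outaged=34.0, otdf=0.8)
u0 = 44.0  # base case limit from the previous slide
utc = min(u0, u1)
print(f"u1 = {u1:.1f}, UTC = {utc:.1f}")
```
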
Additional Comments

• Distribution factors are defined as small signal sensitivities, but in practice, they are also used for simulating large signal cases
• Distribution factors are widely used in the operation of the electricity markets where the rapid evaluation of the impacts of each transaction on the line flows is required
• Applications to actual systems show that the distribution factors provide satisfactory results in terms of accuracy
• For multiple applications that require fast turnaround time, distribution factors are used very widely, particularly in the market environment
• They do not work well with reactive power!

Least Squares

• So far we have considered the solution of Ax = b in which A is a square matrix; as long as A is nonsingular there is a single solution
– That is, we have the same number of equations (m) as unknowns (n)
• Many problems are overdetermined, in which there are more equations than unknowns (m > n)
– Overdetermined systems are usually inconsistent, in which no value of x exactly solves all the equations
• Underdetermined systems have more unknowns than equations (m < n); they never have a unique solution but are usually consistent

Method of Least Squares

• The least squares method is a solution approach for determining an approximate solution for an overdetermined system
• If the system is inconsistent, then not all of the equations can be exactly satisfied
• The difference for each equation between its exact solution and the estimated solution is known as the error
• Least squares seeks to minimize the sum of the squares of the errors
• Weighted least squares allows different weights for the equations

Least Squares Solution History
• The method of least squares developed from trying to
estimate actual values from a number of measurements
• Several persons in the 1700's, starting with Roger
Cotes in 1722, presented methods for trying to decrease
model errors from using multiple measurements
• Legendre presented a formal description of the method
in 1805; evidently Gauss claimed he did it in 1795
• Method is widely used in power systems, with state
estimation the best known application, dating from
Fred Schweppe's work in 1970

Least Squares and Sparsity
• In many contexts least squares is applied to problems
that are not sparse. For example, using a number of
measurements to optimally determine a few values
– Regression analysis is a common example, in which a line or other curve is fit to potentially many points
– Each measurement impacts each model value
• In the classic power system application of state
estimation the system is sparse, with measurements
only directly influencing a few states
– Power system analysis classes have tended to focus on
solution methods aimed at sparse systems; we'll consider both
sparse and nonsparse solution methods
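
Least Squares Problem
• Consider $Ax = b$ with $A \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^{n}$, $b \in \mathbb{R}^{m}$

or

$$\begin{bmatrix} (a^1)^T \\ (a^2)^T \\ \vdots \\ (a^m)^T \end{bmatrix} x = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$$
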
Least Squares Solution

• We write $(a^i)^T$ for row i of A, where $a^i$ is a column vector
• Here, m ≥ n and the solution we are seeking is that which minimizes $\|Ax - b\|_p$, where p denotes some norm
• Since an overdetermined system usually has no exact solution, the best we can do is determine an x that minimizes the desired norm.

Choice of p

• We discuss the choice of p in terms of a specific example
• Consider the equation Ax = b with

$$A = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \quad \text{with } b_1 \ge b_2 \ge b_3 \ge 0$$

(hence three equations and one unknown)
• We consider three possible choices for p:

Choice of p
(i) p = 1: $\|Ax - b\|_1$ is minimized by $x^* = b_2$

(ii) p = 2: $\|Ax - b\|_2$ is minimized by $x^* = \dfrac{b_1 + b_2 + b_3}{3}$

(iii) p = ∞: $\|Ax - b\|_\infty$ is minimized by $x^* = \dfrac{b_1 + b_3}{2}$

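The three minimizers can be compared numerically. The sketch below assumes NumPy and uses hypothetical values b = [5, 2, 1] (any values with b1 ≥ b2 ≥ b3 ≥ 0 would do); it evaluates each norm on a grid of candidate x values and compares the grid minimizer with the closed-form answers above.

```python
import numpy as np

b = np.array([5.0, 2.0, 1.0])  # hypothetical values with b1 >= b2 >= b3 >= 0

# Closed-form minimizers quoted on the slide.
x_p1 = b[1]                    # p = 1: the middle value b2
x_p2 = b.sum() / 3.0           # p = 2: the average of the three values
x_pinf = (b[0] + b[2]) / 2.0   # p = infinity: midpoint of b1 and b3

# Brute-force check: evaluate ||Ax - b||_p on a grid of scalar x values.
grid = np.linspace(0.0, 6.0, 6001)
residuals = grid[:, None] - b[None, :]   # row k holds A*x_k - b, with A a column of ones
norm_1 = np.abs(residuals).sum(axis=1)
norm_2 = np.sqrt((residuals ** 2).sum(axis=1))
norm_inf = np.abs(residuals).max(axis=1)

grid_p1 = grid[np.argmin(norm_1)]
grid_p2 = grid[np.argmin(norm_2)]
grid_pinf = grid[np.argmin(norm_inf)]

print("p = 1  :", x_p1, grid_p1)
print("p = 2  :", x_p2, grid_p2)
print("p = inf:", x_pinf, grid_pinf)
```

With these values the three norms give three different answers, which is the point of the example: the choice of p changes the "best" solution.
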
The Least Squares Problem

• In general, $\|Ax - b\|_p$ is not differentiable for p = 1 or p = ∞
• The choice of p = 2 (Euclidean norm) has become well established given its least-squares fit interpretation
• The problem $\min_{x \in \mathbb{R}^n} \|Ax - b\|_2$ is tractable for 2 major reasons
– First, the function is differentiable

$$\phi(x) = \frac{1}{2}\|Ax - b\|_2^2 = \frac{1}{2}\sum_{i=1}^{m}\left[ (a^i)^T x - b_i \right]^2$$

The Least Squares Problem, cont.
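– Second, the Euclidean norm is preserved under orthogonal transformations:

$$\left\| (Q^T A)x - Q^T b \right\|_2 = \|Ax - b\|_2$$

with Q an arbitrary orthogonal matrix; that is, Q satisfies

$$QQ^T = Q^TQ = I, \quad Q \in \mathbb{R}^{m \times m}$$

(Q must be m by m for the product $Q^T A$ to be defined, since A is m by n.)
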
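The invariance of the Euclidean norm under orthogonal transformations is easy to confirm numerically. This is a minimal sketch assuming NumPy; the matrix sizes and random data are arbitrary, and Q is taken as m by m so that the product Q^T A is defined. The orthogonal matrix is obtained from a QR factorization of a random square matrix, which is just one convenient way to generate one.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)      # any x, not necessarily the minimizer

# Orthogonal m-by-m matrix from the QR factorization of a random square matrix.
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))

transformed = np.linalg.norm(Q.T @ A @ x - Q.T @ b)
original = np.linalg.norm(A @ x - b)
print(transformed, original)
```

The two printed norms should agree to rounding error, so minimizing the transformed residual gives the same x as minimizing the original one.
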
The Least Squares Problem, cont.

• We introduce next the basic underlying assumption: A is full rank, i.e., the columns of A constitute a set of linearly independent vectors
• This assumption implies that the rank of A is n because n ≤ m since we are dealing with an overdetermined system
• Fact: The least squares solution x* satisfies

$$A^T A x^* = A^T b$$

Proof of Fact

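• Since by definition the least squares solution x* minimizes $\phi(\cdot)$, at the optimum the derivative of this function is zero:

$$\phi(x) = \frac{1}{2}\|Ax - b\|_2^2 = \frac{1}{2}\left( x^T A^T A x - x^T A^T b - b^T A x + b^T b \right)$$

$$0 = \left. \frac{\partial \phi(x)}{\partial x} \right|_{x^*} = \left. \frac{\partial}{\partial x}\left[ \frac{1}{2}\left( x^T A^T A x - x^T A^T b - b^T A x + b^T b \right) \right] \right|_{x^*}$$

$$= \left. \frac{\partial}{\partial x}\left[ \frac{1}{2}\left( x^T A^T A x - 2 x^T A^T b + b^T b \right) \right] \right|_{x^*} = A^T A x^* - A^T b$$
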
Implications

• This underlying assumption implies that

$$A \text{ is full rank} \iff \left( x \ne 0 \implies Ax \ne 0 \right)$$

• Therefore, the fact that $A^TA$ is positive definite (p.d.) follows from considering any x ≠ 0 and evaluating

$$x^T A^T A x = \|Ax\|_2^2 > 0,$$

which is the definition of a p.d. matrix
• We use the shorthand $A^TA > 0$ for $A^TA$ being a symmetric, positive definite matrix

Implications

• The underlying assumption that A is full rank and therefore $A^TA$ is p.d. implies that there exists a unique least squares solution

$$x^* = (A^T A)^{-1} A^T b$$

• Note: we use the inverse in a conceptual, rather than a computational, sense
• The formulation below is known as the normal equations, with the solution conceptually straightforward

$$(A^T A)\, x^* = A^T b$$

Example: Curve Fitting

• Say we wish to fit five points to a polynomial curve of the form

$$f(t, x) = x_1 + x_2 t + x_3 t^2$$

• This can be written as

$$Ax = y: \quad \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ 1 & t_3 & t_3^2 \\ 1 & t_4 & t_4^2 \\ 1 & t_5 & t_5^2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}$$

Example: Curve Fitting

• Say the points are t = [0, 1, 2, 3, 4] and y = [0, 2, 4, 5, 4]. Then

$$Ax = y: \quad \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \\ 4 \\ 5 \\ 4 \end{bmatrix}$$

$$x = (A^T A)^{-1} A^T y = \begin{bmatrix} 0.886 & 0.257 & -0.086 & -0.143 & 0.086 \\ -0.771 & 0.186 & 0.571 & 0.386 & -0.371 \\ 0.143 & -0.071 & -0.143 & -0.071 & 0.143 \end{bmatrix} \begin{bmatrix} 0 \\ 2 \\ 4 \\ 5 \\ 4 \end{bmatrix} = \begin{bmatrix} -0.2 \\ 3.1 \\ -0.5 \end{bmatrix}$$

Implications

• An important implication of positive definiteness is that we can factor $A^TA$ since $A^TA > 0$

$$A^T A = U^T D U = U^T D^{1/2} D^{1/2} U = G^T G$$

• The expression $A^TA = G^TG$ is called the Cholesky factorization of the symmetric positive definite matrix $A^TA$

A Least Squares Solution Algorithm
Step 1: Compute the lower triangular part of $A^TA$
Step 2: Obtain the Cholesky factorization $A^TA = G^TG$
Step 3: Compute $A^Tb = \hat{b}$
Step 4: Solve for y using forward substitution in $G^T y = \hat{b}$ and for x using backward substitution in $Gx = y$

Note, our standard LU factorization approach would work; we can just solve it twice as fast by taking advantage of it being a symmetric matrix

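The four steps can be sketched in Python and applied to the earlier curve fitting data (t = [0, 1, 2, 3, 4], y = [0, 2, 4, 5, 4]). This is an illustrative dense implementation, assuming NumPy; the helper names are hypothetical. Note that `np.linalg.cholesky` returns the lower triangular factor L with A^T A = L L^T, so the G of the slides corresponds to L^T. For simplicity the sketch forms the full A^T A rather than only its lower triangle, and it does not exploit sparsity.

```python
import numpy as np


def forward_substitution(L, rhs):
    """Solve L y = rhs for a lower triangular matrix L."""
    y = np.zeros_like(rhs, dtype=float)
    for i in range(len(rhs)):
        y[i] = (rhs[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y


def backward_substitution(U, rhs):
    """Solve U x = rhs for an upper triangular matrix U."""
    x = np.zeros_like(rhs, dtype=float)
    for i in reversed(range(len(rhs))):
        x[i] = (rhs[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x


def least_squares_cholesky(A, b):
    """Minimize ||Ax - b||_2 via the normal equations and a Cholesky factorization."""
    AtA = A.T @ A                          # Step 1 (full matrix formed for simplicity)
    L = np.linalg.cholesky(AtA)            # Step 2: AtA = L L^T, so G = L^T
    b_hat = A.T @ b                        # Step 3
    y = forward_substitution(L, b_hat)     # Step 4a: G^T y = b_hat
    return backward_substitution(L.T, y)   # Step 4b: G x = y


# Curve fitting example: f(t, x) = x1 + x2*t + x3*t^2
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_meas = np.array([0.0, 2.0, 4.0, 5.0, 4.0])
A = np.column_stack([np.ones_like(t), t, t ** 2])
x = least_squares_cholesky(A, y_meas)
print(np.round(x, 3))
```

The result should match the x = [−0.2, 3.1, −0.5] obtained in the curve fitting example. In practice a library routine such as `np.linalg.lstsq` would normally be used instead of hand-written substitution loops.
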
Practical Considerations

• The two key problems that arise in practice with the triangularization procedure are:
– First, while A may be sparse, $A^TA$ is much less sparse and consequently requires more computing resources for the solution
• In particular, with $A^TA$ second neighbors are now connected! Large networks are still sparse, just not as sparse
– Second, $A^TA$ may actually be numerically less well-conditioned than A

Loss of Sparsity Example

• Assume the B matrix for a network is

$$B = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 1 & -2 & 1 & 0 \\ 0 & 1 & -2 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix}$$

• Then $B^TB$ is

$$B^T B = \begin{bmatrix} 2 & -3 & 1 & 0 \\ -3 & 6 & -4 & 1 \\ 1 & -4 & 6 & -3 \\ 0 & 1 & -3 & 2 \end{bmatrix}$$

• Second neighbors are now connected!

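The fill-in can be counted directly. A minimal sketch, assuming NumPy, that forms B^T B for the four bus chain above and compares the number of nonzero entries:

```python
import numpy as np

B = np.array([[-1, 1, 0, 0],
              [1, -2, 1, 0],
              [0, 1, -2, 1],
              [0, 0, 1, -1]])

BtB = B.T @ B

nnz_B = np.count_nonzero(B)       # first neighbors only
nnz_BtB = np.count_nonzero(BtB)   # first and second neighbors

print(BtB)
print("nonzeros in B:", nnz_B, " nonzeros in B^T B:", nnz_BtB)
# Buses 1 and 3 are not directly connected in B, but they are coupled in B^T B.
print("B[0, 2] =", B[0, 2], " (B^T B)[0, 2] =", BtB[0, 2])
```

Only the first and last buses, which are three branches apart, remain uncoupled in the product.
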
Numerical Conditioning

• To understand the point on numerical ill-conditioning, we need to introduce terminology
• We define the norm of a matrix $B \in \mathbb{R}^{m \times n}$ to be

$$\|B\| = \max_{x \ne 0} \frac{\|Bx\|}{\|x\|} = \text{maximum stretching of the matrix } B$$

• This is the maximum singular value of B

Numerical Conditioning Example

• Say we have the matrix

$$B = \begin{bmatrix} 10 & 0 \\ 0 & 0.1 \end{bmatrix}$$

• What value of x with a norm of 1 maximizes $\|Bx\|$?
• What value of x with a norm of 1 minimizes $\|Bx\|$?

$$\|B\| = \max_{x \ne 0} \frac{\|Bx\|}{\|x\|} = \text{maximum stretching of the matrix } B$$

Numerical Conditioning

$$\|B\|_2 = \max_i \left\{ \sqrt{\lambda_i} : \lambda_i \text{ is an eigenvalue of } B^T B \right\},$$

i.e., $\lambda_i$ is a root of the polynomial

$$p(\lambda) = \det\left[ B^T B - \lambda I \right]$$

Keep in mind the eigenvalues of a p.d. matrix are positive
• In other words, the 2 norm of B is the square root of the largest eigenvalue of $B^TB$

Numerical Conditioning
• The condition number of a matrix B is defined as

$$\kappa(B) = \|B\|\, \|B^{-1}\| = \frac{\sigma_{\max}(B)}{\sigma_{\min}(B)} = \text{the max/min stretching ratio of the matrix } B$$

• A well-conditioned matrix has a small value of $\kappa(B)$, close to 1; the larger the value of $\kappa(B)$, the more pronounced is the ill-conditioning

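These definitions can be checked on the diagonal matrix from the conditioning example, B = diag(10, 0.1). The sketch below assumes NumPy and also evaluates the condition number of B^T B, which illustrates the earlier remark that forming A^T A can make conditioning worse (for the 2 norm the condition number is squared).

```python
import numpy as np

B = np.array([[10.0, 0.0],
              [0.0, 0.1]])

sigma = np.linalg.svd(B, compute_uv=False)   # singular values, largest first
norm_B = np.linalg.norm(B, 2)                # 2 norm = largest singular value
kappa_B = np.linalg.cond(B)                  # sigma_max / sigma_min
kappa_BtB = np.linalg.cond(B.T @ B)

# Unit vectors giving the maximum and minimum stretching for this diagonal B.
stretch_max = np.linalg.norm(B @ np.array([1.0, 0.0]))
stretch_min = np.linalg.norm(B @ np.array([0.0, 1.0]))

print("singular values:", sigma)
print("||B||_2 =", norm_B, " cond(B) =", kappa_B, " cond(B^T B) =", kappa_BtB)
print("max stretch:", stretch_max, " min stretch:", stretch_min)
```

For this matrix x = [1, 0] is stretched by 10 and x = [0, 1] by 0.1, which answers the two questions posed in the example.
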
Power System State Estimation (SE)

• The need for SE arises because in power system operations there is a desire to do "what if" studies based upon the actual "state" of the electric grid
– An example is an online power flow or contingency analysis
• The overall goal of SE is to come up with a power flow model for the present "state" of the power system based on the actual system measurements
• SE assumes the topology and parameters of the transmission network are mostly known
• Measurements come from SCADA and, increasingly, PMUs
• An overview is given in ECEN 615; more details are in ECEN 614

Power System State Estimation

• The problem can be formulated in a nonlinear, weighted least squares form as

$$\min J(x) = \sum_{i=1}^{m} \frac{\left[ z_i - f_i(x) \right]^2}{\sigma_i^2}$$

where J(x) is the scalar cost function, x are the state variables (primarily bus voltage magnitudes and angles), $z_i$ are the m measurements, f(x) relates the states to the measurements and $\sigma_i$ is the assumed standard deviation for each measurement

Assumed Error

• Hence the goal is to decrease the error between the measurements and the assumed model states x
• The $\sigma_i$ term weighs the various measurements, recognizing that they can have vastly different assumed errors

$$\min J(x) = \sum_{i=1}^{m} \frac{\left[ z_i - f_i(x) \right]^2}{\sigma_i^2}$$

• Measurement error is assumed Gaussian (whether it is or not is another question); outliers (bad measurements) are often removed

State Estimation for Linear Functions

• First we'll consider the linear problem. That is, where

$$z^{\text{meas}} - f(x) = z^{\text{meas}} - Hx$$

• Let R be defined as the diagonal matrix of the variances (square of the standard deviations) for each of the measurements

$$R = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_m^2 \end{bmatrix}$$

State Estimation for Linear Functions

• We then differentiate J(x) w.r.t. x to determine the value of x that minimizes this function

$$J(x) = \left[ z^{\text{meas}} - Hx \right]^T R^{-1} \left[ z^{\text{meas}} - Hx \right]$$

$$\nabla J(x) = -2H^T R^{-1} z^{\text{meas}} + 2H^T R^{-1} H x$$

At the minimum we have $\nabla J(x) = 0$. So solving for x gives

$$x = \left[ H^T R^{-1} H \right]^{-1} H^T R^{-1} z^{\text{meas}}$$

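The effect of the weights can be seen on a deliberately tiny case. The sketch below assumes NumPy; the function name `wls_estimate` and the two-meter data are hypothetical. A single state is measured twice, once with σ = 0.01 and once with σ = 0.10, and the estimate is obtained by solving the linear system rather than forming the inverse explicitly.

```python
import numpy as np


def wls_estimate(H, z, sigma):
    """Linear weighted least squares estimate.

    Solves (H^T R^-1 H) x = H^T R^-1 z with R = diag(sigma^2).
    """
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # diagonal entries of R^-1
    gain = H.T @ (w[:, None] * H)                    # H^T R^-1 H
    rhs = H.T @ (w * z)                              # H^T R^-1 z
    return np.linalg.solve(gain, rhs)


# Hypothetical case: one state measured by an accurate and an inaccurate meter.
H = np.array([[1.0], [1.0]])
z = np.array([1.00, 1.10])
x_hat = wls_estimate(H, z, sigma=[0.01, 0.10])
print(x_hat)
```

The estimate lands very close to the 1.00 reading because that measurement carries 100 times the weight of the other; with equal weights the result would simply be the average, 1.05.
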
Simple DC System Example

• Say we have a two bus power system that we are solving using the dc approximation. Say the line's per unit reactance is j0.1. Say we have power measurements at both ends of the line. For simplicity assume R = I. We would then like to estimate the bus angles. Then

$$z_1 = P_{12} = \frac{\theta_1 - \theta_2}{0.1} = 2.2, \quad z_2 = P_{21} = \frac{\theta_2 - \theta_1}{0.1} = -2.0$$

$$x = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}, \quad H = \begin{bmatrix} 10 & -10 \\ -10 & 10 \end{bmatrix}, \quad H^T H = \begin{bmatrix} 200 & -200 \\ -200 & 200 \end{bmatrix}$$

We have a problem since $H^TH$ is singular. This is because of the lack of an angle reference.

Simple DC System Example, cont.

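• Say we directly measure $\theta_1$ (with a PMU) to be zero; set this as the third measurement. Then

$$z_1 = P_{12} = \frac{\theta_1 - \theta_2}{0.1} = 2.2, \quad z_2 = P_{21} = \frac{\theta_2 - \theta_1}{0.1} = -2.0, \quad z_3 = \theta_1 = 0$$

$$x = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}, \quad z = \begin{bmatrix} 2.2 \\ -2 \\ 0 \end{bmatrix}, \quad H = \begin{bmatrix} 10 & -10 \\ -10 & 10 \\ 1 & 0 \end{bmatrix}, \quad H^T H = \begin{bmatrix} 201 & -200 \\ -200 & 200 \end{bmatrix}$$

$$x = \left[ H^T R^{-1} H \right]^{-1} H^T R^{-1} z^{\text{meas}}$$

$$x = \begin{bmatrix} 201 & -200 \\ -200 & 200 \end{bmatrix}^{-1} \begin{bmatrix} 10 & -10 & 1 \\ -10 & 10 & 0 \end{bmatrix} \begin{bmatrix} 2.2 \\ -2 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ -0.21 \end{bmatrix}$$

Note that the angles are in radians
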
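Both versions of the dc example can be reproduced in a few lines. This is a minimal sketch assuming NumPy, with R = I as in the slides; the variable names are illustrative. It first confirms that H^T H is rank deficient without the angle measurement, then solves the three-measurement case.

```python
import numpy as np

# Two flow measurements only: no angle reference.
H_flows = np.array([[10.0, -10.0],
                    [-10.0, 10.0]])
rank_without_ref = np.linalg.matrix_rank(H_flows.T @ H_flows)
print("rank of H^T H without a reference:", rank_without_ref)

# Add the PMU measurement theta_1 = 0 as a third row.
H = np.array([[10.0, -10.0],
              [-10.0, 10.0],
              [1.0, 0.0]])
z = np.array([2.2, -2.0, 0.0])

# With R = I the estimate solves (H^T H) x = H^T z.
theta = np.linalg.solve(H.T @ H, H.T @ z)
print("estimated angles (radians):", theta)
```

The estimated θ2 of about −0.21 rad corresponds to a flow of 2.1 pu, the average of the two flow measurement magnitudes, which is what equal weighting would suggest.
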
Nonlinear Formulation

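• A regular ac power system is nonlinear, so we need to use an iterative solution approach. This is similar to the Newton power flow. Here assume m measurements and n state variables (usually bus voltage magnitudes and angles). Then the Jacobian is the H matrix

$$H(x) = \frac{\partial f(x)}{\partial x} = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}$$
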
Measurement Example

• Assume we measure the real and reactive power flowing into one end of a transmission line; then the $z_i - f_i(x)$ functions for these two are

$$P_{ij}^{\text{meas}} - \left( -V_i^2 G_{ij} + V_i V_j \left( G_{ij}\cos(\theta_i - \theta_j) + B_{ij}\sin(\theta_i - \theta_j) \right) \right)$$

$$Q_{ij}^{\text{meas}} - \left( V_i^2 \left( B_{ij} + \frac{B_{cap}}{2} \right) + V_i V_j \left( G_{ij}\sin(\theta_i - \theta_j) - B_{ij}\cos(\theta_i - \theta_j) \right) \right)$$

– Two measurements for four unknowns
• Other measurements, such as the flow at the other end, and voltage magnitudes, add redundancy

SE Iterative Solution Algorithm

• We then make an initial guess of x, $x^{(0)}$, and iterate, calculating $\Delta x$ each iteration

$$\Delta x = \left[ H^T R^{-1} H \right]^{-1} H^T R^{-1} \begin{bmatrix} z_1 - f_1(x) \\ \vdots \\ z_m - f_m(x) \end{bmatrix}$$

$$x^{(k+1)} = x^{(k)} + \Delta x$$

This is exactly the least squares form developed earlier with $H^TR^{-1}H$ an n by n matrix. This could be solved with Gaussian elimination, but this isn't preferred because the problem is often ill-conditioned

Keep in mind that H is no longer constant, but varies as x changes.

Nonlinear SE Solution Algorithm,
Book Figure 9.11
Example: Two Bus Case

• Assume a two bus case with a generator supplying a load through a single line with x = 0.1 pu. Assume measurements of the p/q flow on both ends of the line (into line positive), and the voltage magnitude at both the generator and the load end. So $B_{12} = B_{21} = 10.0$

$$P_{ij}^{\text{meas}} - \left( V_i V_j B_{ij} \sin(\theta_i - \theta_j) \right)$$

$$Q_{ij}^{\text{meas}} - \left( V_i^2 B_{ij} + V_i V_j \left( -B_{ij}\cos(\theta_i - \theta_j) \right) \right)$$

$$V_i^{\text{meas}} - V_i = 0$$

We need to assume a reference angle unless we are directly measuring phase

Example: Two Bus Case
• Let

$$Z^{\text{meas}} = \begin{bmatrix} P_{12} \\ Q_{12} \\ P_{21} \\ Q_{21} \\ V_1 \\ V_2 \end{bmatrix} = \begin{bmatrix} 2.02 \\ 1.5 \\ -1.98 \\ -1 \\ 1.01 \\ 0.87 \end{bmatrix}, \quad x^0 = \begin{bmatrix} V_1 \\ \theta_2 \\ V_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \quad \sigma_i = 0.01$$

We assume an angle reference of $\theta_1 = 0$

$$H(x) = \begin{bmatrix} 10 V_2 \sin(-\theta_2) & -10 V_1 V_2 \cos(-\theta_2) & 10 V_1 \sin(-\theta_2) \\ 20 V_1 - 10 V_2 \cos(-\theta_2) & -10 V_1 V_2 \sin(-\theta_2) & -10 V_1 \cos(-\theta_2) \\ 10 V_2 \sin(\theta_2) & 10 V_1 V_2 \cos(\theta_2) & 10 V_1 \sin(\theta_2) \\ -10 V_2 \cos(\theta_2) & 10 V_1 V_2 \sin(\theta_2) & 20 V_2 - 10 V_1 \cos(\theta_2) \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Example: Two Bus Case

• With a flat start guess we get

$$H(x^0) = \begin{bmatrix} 0 & -10 & 0 \\ 10 & 0 & -10 \\ 0 & 10 & 0 \\ -10 & 0 & 10 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad z - f(x^0) = \begin{bmatrix} 2.02 \\ 1.5 \\ -1.98 \\ -1 \\ 0.01 \\ -0.13 \end{bmatrix}$$

$$R = \operatorname{diag}(0.0001,\ 0.0001,\ 0.0001,\ 0.0001,\ 0.0001,\ 0.0001)$$

Example: Two Bus Case

$$H^T R^{-1} H = 10^6 \begin{bmatrix} 2.01 & 0 & -2 \\ 0 & 2 & 0 \\ -2 & 0 & 2.01 \end{bmatrix}$$

$$x^1 = x^0 + \left[ H^T R^{-1} H \right]^{-1} H^T R^{-1} \begin{bmatrix} 2.02 \\ 1.5 \\ -1.98 \\ -1 \\ 0.01 \\ -0.13 \end{bmatrix} = \begin{bmatrix} 1.003 \\ -0.2 \\ 0.8775 \end{bmatrix}$$

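The first iteration of the two bus example can be reproduced with a short script. This is a sketch under the assumptions of the example (lossless line, B12 = B21 = 10, θ1 = 0 as the reference, state ordered as [V1, θ2, V2], all σ = 0.01) and assumes NumPy; the function names `f`, `jacobian` and `se_step` are illustrative. The Jacobian entries use sin(−θ2) = −sin(θ2) and cos(−θ2) = cos(θ2) to simplify the expressions for H(x).

```python
import numpy as np

B12 = 10.0  # line susceptance for x = 0.1 pu


def f(x):
    """Measurement functions [P12, Q12, P21, Q21, V1, V2] for x = [V1, theta2, V2]."""
    v1, th2, v2 = x
    return np.array([
        v1 * v2 * B12 * np.sin(-th2),
        v1 ** 2 * B12 - v1 * v2 * B12 * np.cos(-th2),
        v1 * v2 * B12 * np.sin(th2),
        v2 ** 2 * B12 - v1 * v2 * B12 * np.cos(th2),
        v1,
        v2,
    ])


def jacobian(x):
    """H(x): partial derivatives of f with respect to [V1, theta2, V2]."""
    v1, th2, v2 = x
    s, c = np.sin(th2), np.cos(th2)
    return np.array([
        [-B12 * v2 * s, -B12 * v1 * v2 * c, -B12 * v1 * s],
        [2 * B12 * v1 - B12 * v2 * c, B12 * v1 * v2 * s, -B12 * v1 * c],
        [B12 * v2 * s, B12 * v1 * v2 * c, B12 * v1 * s],
        [-B12 * v2 * c, B12 * v1 * v2 * s, 2 * B12 * v2 - B12 * v1 * c],
        [1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0],
    ])


def se_step(x, z, sigma):
    """One weighted least squares update: x + [H^T R^-1 H]^-1 H^T R^-1 (z - f(x))."""
    H = jacobian(x)
    w = 1.0 / sigma ** 2                  # diagonal of R^-1
    gain = H.T @ (w[:, None] * H)         # H^T R^-1 H
    rhs = H.T @ (w * (z - f(x)))          # H^T R^-1 (z - f(x))
    return x + np.linalg.solve(gain, rhs)


z = np.array([2.02, 1.5, -1.98, -1.0, 1.01, 0.87])
sigma = np.full(6, 0.01)
x0 = np.array([1.0, 0.0, 1.0])            # flat start
x1 = se_step(x0, z, sigma)
print(np.round(x1, 4))
```

The first update should agree with the slide values to the precision shown there. Calling `se_step` again on the result continues the iteration, with H re-evaluated at each new x.
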
Assumed SE Measurement Accuracy

• The assumed measurement standard deviations can have a significant impact on the resultant solution, or even whether the SE converges
• The assumption is a Gaussian (normal) distribution of the error with no bias

SE Observability

• In order to estimate all n states we need at least n measurements. However, where the measurements are located is also important, a topic known as observability
– In order for a power system to be fully observable we usually need to have a measurement available no more than one bus away
– At buses we need to have measurements on at least all but one of the injections into the bus (including loads and gens)
– Loads are usually flows on feeders, or the flow into a transmission to distribution transformer
– Generators are usually just injections from the GSU

Pseudo Measurements

• Pseudo measurements are used at buses in which there is no load or generation; that is, the net injection into the bus is known with high accuracy to be zero
– In order to enforce the net power balance at a bus we need to include an explicit net injection measurement
• To increase observability sometimes estimated values are used for loads, shunts and generator outputs
– These "measurements" are represented as having a much higher standard deviation

SE Observability Example
