The Geometry of Multidimensional Quadratic Utility in Models of Parliamentary Roll Call Voting

 

 

by

Keith T. Poole

Graduate School of Industrial Administration

Carnegie Mellon University

 

 

 

 

24 April 2000

 

 

 

 

Abstract

The purpose of this paper is to show how the geometry of the quadratic utility function in the standard spatial model of choice can be exploited to estimate a model of Parliamentary roll call voting. In a standard spatial model of Parliamentary roll call voting, the legislator votes for the policy outcome corresponding to Yea if her utility for Yea is greater than her utility for Nay. The voting decision of the legislator is modeled as a function of the difference between these two utilities. With quadratic utility, this difference has a simple geometric interpretation that can be exploited to estimate legislator ideal points and roll call parameters in a standard framework where the stochastic portion of the utility function is normally distributed. The geometry is almost identical to that used in Poole (2000) to develop a non-parametric unfolding of binary choice data and the algorithms developed by Poole (2000) can be easily modified to implement the standard maximum likelihood model.

 

 

 

 

1. Introduction

The purpose of this paper is to show how the geometry of the quadratic utility function in the standard spatial model of choice can be exploited to estimate a model of parliamentary roll call voting. The quadratic utility function has a long history. Beginning with the earliest papers of Davis and his colleagues (Davis and Hinich, 1966, 1967; Davis, Hinich, and Ordeshook, 1970), it has played an important role in the spatial theory of voting and elections. The quadratic utility function is analytically simple and has a number of mathematical properties that make it easy to work with for modeling purposes. For example, it is symmetric around, and has a unique maximum at, the individual’s ideal point.

In a standard spatial model of Parliamentary roll call voting, the legislator votes for the policy outcome corresponding to Yea if her utility for Yea is greater than her utility for Nay. The voting decision of the legislator is modeled as a function of the difference between these two utilities. The difference between two quadratic utilities has a simple geometric interpretation that can be exploited to estimate legislator ideal points and roll call parameters in a standard framework where the stochastic portion of the utility function is normally distributed. In particular, the geometry is almost identical to that used in Poole (2000) to develop a non-parametric unfolding of binary choice data. The algorithms developed by Poole (2000) can be easily modified to implement a standard maximum likelihood model where the deterministic portion of the utility function of the legislators is quadratic and the stochastic portion is normally distributed.

Section 2 defines the problem and explains the notation used in the paper. Section 3 shows the geometry of the quadratic utility model and discusses the plausibility of the three major error distributions – normal, logit, and uniform – that have been used by researchers to estimate the parameters of spatial models. Section 4 briefly discusses the scaling method developed by Poole (2000). The geometry of this scaling method is essentially the same as that shown in section 3. Section 5 shows how the algorithms developed in Poole (2000) can be used to estimate the parameters of a standard maximum likelihood model where the deterministic portion of the utility function of the legislators is quadratic and the stochastic portion is normally distributed. A quadratic utility scaling of the 90th House of Representatives shows that the algorithm is stable and produces sensible results. Section 6 concludes.

2. Notation and Definitions

Assume that legislators have Euclidean preferences defined over some multidimensional ideological/policy space and that they vote sincerely for the alternative closest to their ideal point. Let p be the number of legislators (i=1,…,p) and s be the number of dimensions (k=1,…,s). Let xik denote the ith legislator’s ideal point on the kth dimension, and let X be the p by s matrix of legislator ideal points. Each roll call vote has two policy points in the space corresponding to the policy consequences of a Yea or Nay vote on the roll call. Let q be the number of roll calls (j=1,…,q), and let the coordinates of the Yea and Nay outcomes be denoted by zjky and zjkn respectively. Let "c" indicate the outcome (Yea or Nay) chosen by legislator i, and let "b" indicate the outcome not chosen by legislator i. This notation will considerably simplify the exposition below.

If there were no voting error, a plane could be placed in the space such that it separates all the legislators voting Yea from all the legislators voting Nay. Geometrically, this cutting plane is perpendicular to the line joining the Yea and Nay policy points and passes through their midpoint. Because the normal vector to a plane is perpendicular to the plane, the normal vector to this cutting plane is, by definition, parallel to the line joining the Yea and Nay policy points. Specifically, let nj be the s by 1 normal vector for the jth roll call and let N be the q by s matrix of normal vectors for the q cutting planes. A plane is defined by the vector equation z′nj = v′nj, where z, nj, and v are s by 1 vectors, v is a specific point in the plane, and the plane consists of all points z such that (z − v) is perpendicular to the normal vector nj. Note that if v1 and v2 are both points in the plane, then v1′nj = v2′nj = mj, where mj is a scalar constant. Geometrically, every point in the plane projects onto the same point on the line defined by the normal vector nj and its reflection −nj. Because the midpoint of the Yea and Nay policy points is on the cutting plane, it too projects to the point mj.

Technically, the general equation for a line is:

Y(t) = A + t(B – A)

where A and B are points in the space and t is a scalar. In this instance, A is placed at the origin of the space, so the equation for the line defined by the normal vector nj and its reflection −nj is simply

Y(t) = tnj (1)

where −1 ≤ t ≤ +1.

To set the scale of the voting space, let the legislator coordinates lie within the s dimensional unit hypersphere and let the origin of the space be placed at the centroid of the legislator coordinates; that is, let

$$\sum_{k=1}^{s} x_{ik}^{2} \le 1, \quad i=1,\dots,p \qquad \text{and} \qquad \sum_{i=1}^{p} x_{ik} = 0, \quad k=1,\dots,s$$

In addition, without loss of generality, the normal vector, nj, can be constrained to be of unit length; i.e., nj′nj = 1.

Let the projections of the legislator points onto the line defined by equation (1) be:

Xnj = w (2)

Note that the elements in the p-length vector, w, range from -1 to +1. The elements in w all lie on the line defined by equation (1) that passes through the origin of the s-dimensional unit hypersphere in the direction of the normal vector with exit points -nj and +nj respectively. This line will hereafter be referred to as the projection line.
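The normalization and projection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ideal points in `X_raw` and the direction passed to `unit_vector` are hypothetical, and the helper names are not taken from the paper.

```python
import math

def center_and_scale(X):
    """Center the rows of X at their centroid, then shrink them (if needed)
    so that every row lies inside the unit hypersphere."""
    p, s = len(X), len(X[0])
    centroid = [sum(row[k] for row in X) / p for k in range(s)]
    centered = [[row[k] - centroid[k] for k in range(s)] for row in X]
    max_norm = max(math.sqrt(sum(v * v for v in row)) for row in centered)
    scale = max_norm if max_norm > 1.0 else 1.0  # only shrink, never stretch
    return [[v / scale for v in row] for row in centered]

def unit_vector(n):
    """Rescale n so that n'n = 1."""
    length = math.sqrt(sum(v * v for v in n))
    return [v / length for v in n]

def project(X, n):
    """w = X n: position of each ideal point on the projection line."""
    return [sum(x_k * n_k for x_k, n_k in zip(row, n)) for row in X]

# Hypothetical ideal points for four legislators in two dimensions.
X_raw = [[2.0, 1.0], [-1.0, 3.0], [0.5, -2.5], [-1.5, -1.5]]
X = center_and_scale(X_raw)
n_j = unit_vector([1.0, 1.0])
w = project(X, n_j)
print(w)
```

Because every row of `X` has length at most 1 and `n_j` has unit length, each element of `w` necessarily falls between -1 and +1, as stated for equation (2).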

3. The Quadratic Utility Model

Given these definitions, legislator i’s utility for her chosen outcome, c, on roll call j is:

$$U_{ijc} = u_{ijc} + \varepsilon_{ijc} = -\sum_{k=1}^{s}(x_{ik} - z_{jkc})^{2} + \varepsilon_{ijc} \tag{3}$$

where uijc is the deterministic portion of the utility function and εijc is the stochastic portion.

The probability that legislator i votes for her chosen outcome, c, is

$$P(U_{ijc} > U_{ijb}) = P(\varepsilon_{ijb} - \varepsilon_{ijc} < u_{ijc} - u_{ijb}) \tag{4}$$

The Deterministic Portion of the Utility Function

The difference between the deterministic utilities can be simplified as follows:

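$$u_{ijc} - u_{ijb} = -\sum_{k=1}^{s}(x_{ik} - z_{jkc})^{2} + \sum_{k=1}^{s}(x_{ik} - z_{jkb})^{2}$$

$$= 2\sum_{k=1}^{s} x_{ik}(z_{jkc} - z_{jkb}) - \sum_{k=1}^{s}(z_{jkc} - z_{jkb})(z_{jkc} + z_{jkb})$$

$$= 2\sum_{k=1}^{s}(z_{jkc} - z_{jkb})\left(x_{ik} - \frac{z_{jkc} + z_{jkb}}{2}\right) \tag{5}$$

Now, note that the s by 1 vector:

$$z_{jc} - z_{jb} = (z_{j1c} - z_{j1b},\; \dots,\; z_{jsc} - z_{jsb})'$$

is equal to a constant times the normal vector, nj (see Figure 1). Namely,

$$\gamma_j n_j = z_{jc} - z_{jb} \tag{6}$$

where

$$\gamma_j = +\lVert z_{jc} - z_{jb} \rVert \;\text{ if } z_{jc}'n_j > z_{jb}'n_j, \qquad \gamma_j = -\lVert z_{jc} - z_{jb} \rVert \;\text{ if } z_{jc}'n_j < z_{jb}'n_j$$

γj is the directional distance between the Yea and Nay outcomes in the space. The difference is written here as zjc − zjb so that equation (6) agrees in sign with equation (5) and with the one-dimensional case discussed below, where γj = zjc − zjb.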

__________________

Figure 1 about Here

__________________

The s by 1 vector

$$z_{jb} + z_{jc} = (z_{j1b} + z_{j1c},\; \dots,\; z_{jsb} + z_{jsc})'$$

divided by 2 is simply the s by 1 vector of midpoints for the Yea and Nay outcomes for roll call j. That is:

$$z_{mj} = \frac{z_{jb} + z_{jc}}{2}$$

This allows equation (5) to be rewritten as the vector equation:

$$u_{ijc} - u_{ijb} = 2\gamma_j(x_i'n_j - z_{mj}'n_j) = 2\gamma_j(w_i - m_j) \tag{7}$$

where wi is the projection of the ith legislator’s ideal point onto the projection line as defined by equation (1), and mj is the projection of the midpoint of the roll call outcomes onto the projection line. Equation (7) shows that:

if γj > 0 and wi > mj, or

if γj < 0 and wi < mj, then uijc > uijb
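Equation (7) can be checked numerically. The sketch below is illustrative only: the ideal point and the two outcome locations are hypothetical values, and γj is taken as the signed length of zjc − zjb along nj, the orientation under which equation (7) holds as written.

```python
import math

def quad_utility(x, z):
    """Deterministic quadratic utility: negative squared distance from x to z."""
    return -sum((x_k - z_k) ** 2 for x_k, z_k in zip(x, z))

def dot(a, b):
    return sum(a_k * b_k for a_k, b_k in zip(a, b))

x_i = [0.3, -0.2]    # hypothetical legislator ideal point
z_c = [0.6, 0.1]     # hypothetical chosen outcome
z_b = [-0.2, -0.5]   # hypothetical outcome not chosen

diff = [c - b for c, b in zip(z_c, z_b)]
dist = math.sqrt(dot(diff, diff))
n_j = [d / dist for d in diff]          # unit normal vector
gamma_j = dot(diff, n_j)                # signed distance along n_j
z_m = [(c + b) / 2 for c, b in zip(z_c, z_b)]
w_i, m_j = dot(x_i, n_j), dot(z_m, n_j)

lhs = quad_utility(x_i, z_c) - quad_utility(x_i, z_b)
rhs = 2 * gamma_j * (w_i - m_j)

# Reversing the normal vector flips the sign of both gamma_j and (w_i - m_j),
# so the product is unchanged.
n_flip = [-v for v in n_j]
rhs_flip = 2 * dot(diff, n_flip) * (dot(x_i, n_flip) - dot(z_m, n_flip))
print(lhs, rhs, rhs_flip)
```

The three printed values agree up to rounding, which shows that the utility difference depends on the ideal point only through its projection onto the projection line.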

In one dimension, nj is equal to 1 and γj = zjc − zjb. Hence, equation (7) becomes simply 2(zjc − zjb)(xi − zmj) = 2γj(xi − mj). Except for an added "valence" dimension, this is identical to the one-dimensional model developed by Londregan (2000, pp. 40-41). Specifically, in Londregan’s notation, g = (zjc − zjb), xv = xi, and m = zmj. Now, note that

if zjc > zmj and xi > zmj or

if zjc < zmj and xi < zmj, then uijc > uijb

If voting is sincere and without error then 2γj(wi − mj) > 0 for all i and j and, in one dimension, the legislator ideal points and the roll call midpoints are only identified up to a joint rank ordering. With "perfect" voting in more than one dimension, if a variety of voting coalitions form amongst the legislators, then the cutting planes will intersect one another in a myriad of directions, creating a maximum of $\sum_{k=0}^{s}\binom{q}{k}$ regions in the policy space (Coombs, 1964, p. 262). Each region corresponds to a unique voting pattern on the q roll calls, e.g., YYYNNYNYNYYY…. Hence, a legislator’s ideal point is identified up to a region in the space. Note, however, that as q gets large the number of regions explodes so that the volume of these regions is extremely small. For example, with 500 roll calls, there are a maximum of 125,251 regions in two dimensions and a maximum of 20,833,751 in three dimensions. Most of these regions are so small that a typical legislator’s point is very precisely pinned down (Poole, 2000). Similarly, with a large number of legislators, the cutting plane, defined by the normal vector nj and the midpoint of the roll call mj, is also precisely pinned down (Poole, 2000).
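The two region counts quoted above match the standard bound on the number of regions formed by q hyperplanes in general position in s dimensions, namely the sum of the binomial coefficients C(q, k) for k = 0,…,s. A short check, with `max_regions` as an illustrative helper name:

```python
from math import comb

def max_regions(q, s):
    """Maximum number of regions created by q cutting planes in s dimensions."""
    return sum(comb(q, k) for k in range(s + 1))

print(max_regions(500, 2))  # 125251
print(max_regions(500, 3))  # 20833751
```

In one dimension the same function gives q + 1, the number of intervals created by q cutpoints on a line.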

To reiterate, if voting is perfect, that is, sincere and without error, then in one dimension the legislator ideal points and the roll call midpoints are only identified up to a joint rank ordering. In more than one dimension, legislators are identified up to regions in the space (polytopes) and roll calls are identified up to cone shaped regions containing the normal vectors and line segments on the normal vectors for the midpoints. These limits on identification arise because the data are simply Yea and Nay. If legislators could report "thermometer scores" for the alternatives then the perfect case would have an exact solution.

The Stochastic Portion of the Utility Function

Superficially, it is the stochastic portion of the utility function that allows for more precise solutions for the legislator ideal points and roll call parameters. However, as Londregan (2000) proves, this precision is, to an extent, an illusion. Londregan shows that consistency in its usual statistical sense does not hold in the roll call voting problem outlined above. With nominal choices, standard maximum likelihood estimators that attempt to simultaneously recover legislators’ ideal points and roll call parameters inherit the "granularity" of the choice data and so cannot recapture the underlying continuous parameter space. However, when the number of roll calls and legislators is large, the bias in the estimated parameters is not severe (Londregan, 2000).

Turning to the stochastic portion of the utility function stated in equation (3) above, three probability distributions have been used to model the error: the normal (Ladha, 1991; Londregan, 2000), the uniform (Heckman and Snyder, 1997), and the logit (Poole and Rosenthal, 1997). The normal is clearly the best from both a theoretical and a behavioral standpoint.

From a statistical standpoint, given the difference between the two random errors, εijb − εijc, the standard assumptions are that εijb and εijc are a random sample (independent and identically distributed random variables) from a known distribution. It is therefore easy to write down the probability distribution of the difference εijb − εijc. From a behavioral standpoint, it seems sensible to assume that the distributions of εijb and εijc are symmetric and unimodal and that εijb and εijc are uncorrelated. The normal distribution is the only one of the three distributions to satisfy all these criteria. To illustrate, assume that εijb and εijc are drawn (a random sample of size two) from a normal distribution with mean zero and variance one-half. The difference between the two errors then has a standard normal distribution; that is,

$$\varepsilon_{ijb} - \varepsilon_{ijc} \sim N(0, 1)$$

Hence, the probability that legislator i votes for her chosen outcome, c, can be rewritten as:

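$$P_{ijc} = P(U_{ijc} > U_{ijb}) = P(\varepsilon_{ijb} - \varepsilon_{ijc} < u_{ijc} - u_{ijb}) = \Phi[2\gamma_j(x_i'n_j - z_{mj}'n_j)] = \Phi[2\gamma_j(w_i - m_j)] \tag{8}$$

where Φ is the standard normal distribution function.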

Heckman and Snyder (1997) assume that εijb − εijc has a uniform distribution. This is an extremely problematic assumption because εijb and εijc cannot be a random sample! Assuming that εijb − εijc has a uniform distribution enables Heckman and Snyder to develop a linear probability model, but the price for this simplicity is that they have no intuitive basis for a behavioral model.
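One way to see why a uniformly distributed difference rules out a random sample is through characteristic functions. The following is a standard textbook argument offered as a sketch, not a derivation taken from the paper; φ(t) denotes the characteristic function of a single error term, and a > 0 is the half-width of the uniform distribution, which must be centered at zero because the difference of two i.i.d. errors is symmetric about zero.

```latex
% Difference of two i.i.d. errors: the characteristic function is nonnegative.
\varphi_{\varepsilon_{ijb}-\varepsilon_{ijc}}(t)
  = \varphi(t)\,\overline{\varphi(t)}
  = |\varphi(t)|^{2} \;\ge\; 0 \quad \text{for all } t,
% Uniform distribution on [-a, a]: the characteristic function changes sign.
\qquad \text{whereas} \qquad
\varphi_{U[-a,a]}(t) = \frac{\sin(at)}{at} \;<\; 0
  \quad \text{for } \pi/a < t < 2\pi/a .
```

Since the two characteristic functions cannot coincide, no distribution for the individual errors produces a uniform difference when the two errors are independent and identically distributed.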

Poole and Rosenthal (1985, 1991, 1997) assume that εijb and εijc are a random sample from the log of the inverse exponential distribution. Consequently, εijb − εijc has the logit distribution. The log of the inverse exponential distribution is unimodal but not symmetric, although it is not too skewed. The logit distribution derived from it is symmetric, and its distribution function is reasonably close to the normal distribution function.

A further difficulty with the approaches taken by both Heckman and Snyder and Poole and Rosenthal is that they assume that the error variance is homoskedastic. A more realistic assumption is that the error variance varies across the roll call votes and across the legislators. For the roll calls, it is impossible to distinguish between the underlying unknown error variance and the distance between the Yea and Nay alternatives (Ladha, 1991; Londregan, 2000). The intuition behind this is straightforward. As the distance between the Yea and Nay alternatives increases, it becomes easier for legislators to distinguish between the two policy outcomes and less likely that they make an error. Conversely, if the Yea and Nay alternatives are very close together, then the utility difference is small and it is more likely that voting errors occur. Increasing/decreasing the distance is equivalent to decreasing/increasing the variance of the underlying error.

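Because γj is picking up the roll call specific variance, the difference between the two errors for legislator i on roll call j can be modeled as:

$$\varepsilon_{ijb} - \varepsilon_{ijc} \sim N(0, \sigma_i^{2})$$

With heteroskedastic error, equation (8) becomes:

$$P_{ijc} = \Phi\left[\frac{2\gamma_j(w_i - m_j)}{\sigma_i}\right] \tag{9}$$

The corresponding likelihood function is therefore:

$$L = \prod_{i=1}^{p}\prod_{j=1}^{q} P_{ijc} \tag{10}$$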
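To make the likelihood concrete, the following sketch evaluates the log of the likelihood and the geometric mean probability (GMP) on a tiny made-up data set. Several things here are assumptions for illustration rather than the paper's code: the choice probability is taken to be Φ[2γj(wi − mj)/σi]; votes are coded +1 (Yea), −1 (Nay), and 0 (missing); and γj is measured from the Nay outcome to the Yea outcome along nj, so the vote code supplies the sign that the "chosen outcome" notation carries implicitly.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_likelihood(votes, w, m, gamma, sigma):
    """Sum of log choice probabilities over all observed votes.
    w[i][j] is legislator i's projection on roll call j's projection line."""
    total, n_choices = 0.0, 0
    for i, row in enumerate(votes):
        for j, v in enumerate(row):
            if v == 0:                      # missing vote: contributes nothing
                continue
            p_ij = norm_cdf(v * 2.0 * gamma[j] * (w[i][j] - m[j]) / sigma[i])
            total += math.log(max(p_ij, 1e-300))  # guard against log(0)
            n_choices += 1
    return total, n_choices

# Hypothetical data: 3 legislators, 2 roll calls.
votes = [[1, -1], [-1, -1], [1, 1]]
w = [[0.5, -0.3], [-0.4, -0.6], [0.1, 0.7]]
m = [0.0, 0.2]
gamma = [2.0, 1.5]
sigma = [1.0, 1.0, 1.0]

ll, n = log_likelihood(votes, w, m, gamma, sigma)
gmp = math.exp(ll / n)
print(ll, gmp)
```

Working with the sum of logs rather than the product itself avoids numerical underflow when the number of choices is large.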

The approach developed in section 5 below allows the error to be heteroskedastic. The algorithms developed by Poole (2000) can be used to obtain excellent estimates for the legislator points, the xi’s, the roll call normal vectors, the nj’s, and the cutpoints, the mj’s. With these fixed, the γj and σi can be estimated.

In sum, the normal distribution is the most sensible model of error both from a mathematical standpoint as well as a behavioral standpoint. Consequently, it will be the focus of the model developed in this paper.

4. The Classification Algorithm

Poole (2000) develops a new scaling method for analyzing parliamentary roll call data. The scaling method uses almost exactly the same geometry as that shown for the difference between two quadratic utilities shown above. Given the legislator coordinates, the scaling method estimates cutting planes for each roll call; and given the cutting planes, the method finds the region in the space that best matches the legislator’s roll call choices. The scaling method is non-parametric because no assumptions are made about the probability distribution of the legislators’ errors in making choices. The only assumptions made are that the choice space is Euclidean and that individuals making choices behave as if they utilize symmetric, single-peaked preferences.

Strictly speaking, the scaling method developed in Poole (2000) is not a statistical model. However, as shown in the next section, the algorithms developed by Poole (2000) can be easily modified to implement a standard maximum likelihood model where the deterministic portion of the utility function of the legislators is quadratic and the stochastic portion is normally distributed.

The classification algorithm uses the geometry outlined above along with the assumption that preferences are symmetric and single peaked to find estimates of X and N. The rule for correct classification is:

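If legislator i votes c:

$$\delta_{ij} = \begin{cases} 1 & \text{if } w_i \ge m_j \text{ and } z_{jc}'n_j > m_j, \text{ or } w_i < m_j \text{ and } z_{jc}'n_j < m_j \\ 0 & \text{if } w_i < m_j \text{ and } z_{jc}'n_j > m_j, \text{ or } w_i > m_j \text{ and } z_{jc}'n_j < m_j \end{cases}$$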

In other words if the legislator votes "Yea"/"Nay" and her ideal point is on the Yea/Nay side of the plane, the legislator’s vote is correctly classified. Note that the assumption of symmetric single-peaked preferences means that if a legislator votes "Yea" and her ideal point is anywhere on the Yea side of the plane then that counts as a correct classification. If preferences are not symmetric then this might not be true.

The total correct classification is therefore:

$$\delta(X, N) = \sum_{i=1}^{p}\sum_{j=1}^{q} \delta_{ij} \tag{11}$$

In sum, given the number of dimensions, s, the classification problem consists of finding estimates of X and N that maximize equation (11).
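The classification count that is being maximized can be written as a short function. This is a minimal sketch under assumed encodings, not the optimal classification algorithm itself: votes are coded +1 (Yea), −1 (Nay), and 0 (missing), and the hypothetical `yea_side[j]` is +1 when the Yea outcome projects above mj and −1 when it projects below.

```python
def correct_classifications(votes, w, m, yea_side):
    """Count votes for which the ideal point and the chosen outcome fall on
    the same side of the cutting plane. w[i][j] is legislator i's projection
    on roll call j's projection line; a tie (w == m) is treated as the high
    side, following the w >= m convention in the classification rule."""
    total = 0
    for i, row in enumerate(votes):
        for j, v in enumerate(row):
            if v == 0:                      # skip missing votes
                continue
            side = 1 if w[i][j] >= m[j] else -1
            if side * yea_side[j] == v:     # predicted vote equals actual vote
                total += 1
    return total

# Hypothetical data: 3 legislators, 2 roll calls, one classification error.
votes = [[1, -1], [-1, -1], [-1, 1]]
w = [[0.5, -0.3], [-0.4, -0.6], [0.1, 0.7]]
m = [0.0, 0.2]
yea_side = [1, 1]

n_correct = correct_classifications(votes, w, m, yea_side)
print(n_correct)  # 5 of the 6 votes are correctly classified
```

The third legislator's Nay vote on the first roll call is the single error, because her projection (0.1) lies on the Yea side of the cutpoint (0.0).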

5. An Algorithm to Estimate the Multidimensional Quadratic Utility Model

Note that if δij = 1, then (2γj/σi)(wi − mj) > 0 and Φ[(2γj/σi)(wi − mj)] > .5; and

if δij = 0, then (2γj/σi)(wi − mj) < 0 and Φ[(2γj/σi)(wi − mj)] < .5

In other words, the classification algorithm, which is intended to maximize equation (11), will also tend to maximize equation (10). Given estimates of X, N, and the mj’s from the classification algorithm (denoted as X*, N*, and mj* respectively), it is a simple matter to estimate the γj and σi terms because the likelihood function is convex if the roll call cannot be classified without error. With a finite number of legislators, there will be roll calls where |γj| will be very large because the roll call is so important or the information is so complete that no legislator makes an error. However, if the roll call can be classified without error then |γj| → +∞; that is, the probabilities assigned to the choices of the p legislators will go to 1 on a "perfect" roll call. This does not present a problem since Pijc can be set equal to 1 and its corresponding log-likelihood can be set equal to 0.

The multidimensional quadratic utility model can be efficiently estimated in four steps. First, using X*, N*, and the mj* from the classification algorithm, set all the σi equal to 1 and estimate the γj’s using a simple grid search. Given these γj’s, estimate the σi’s using a simple grid search. Repeat this process until there is no significant improvement in the log-likelihood. In practice, this takes no more than three repetitions. Second, given N* and the mj* from the classification algorithm and the estimated γj’s and σi’s from step 1, estimate new legislator coordinates, X. This is easily accomplished using standard gradient techniques. Third, given N* from the classification algorithm, the estimated γj’s and σi’s from step 1, and the estimated legislator coordinates, X, from step 2, estimate new projected midpoints, the mj, using a simple grid search. Fourth, given the estimated γj’s and σi’s from step 1, the estimated legislator coordinates, X, from step 2, and the estimated projected midpoints, the mj, from step 3, estimate new normal vectors, the nj. This is easily accomplished using standard gradient techniques with the constraints that nj′nj = 1 and that zmj = mjnj. In other words, the point defined by the end of the normal vector is moved along the surface of the unit hypersphere with the position of the projected midpoint held fixed on the normal vector as it is moved. Geometrically, this is equivalent to moving the cutting plane rigidly through the space as its normal vector is moved.

In summary, the algorithm is:

Step 1a: Estimate the γj

Step 1b: Estimate the σi

Step 1c: Repeat a and b until convergence

Step 2: Estimate the xi

Step 3: Estimate the mj

Step 4: Estimate the nj

Go to 1a
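Step 1 of the algorithm can be sketched as an alternating grid search. Everything below is illustrative rather than the author's program: the data are simulated in one dimension (so wi = xi), the Yea outcome is assumed to lie on the high side of every cutpoint, votes are coded +1/−1, and the grids, seed, and function names are hypothetical choices.

```python
import math
import random

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_p(vote, gamma_j, sigma_i, w_i, m_j):
    """Log-probability of one observed vote (+1 Yea, -1 Nay)."""
    p = norm_cdf(vote * 2.0 * gamma_j * (w_i - m_j) / sigma_i)
    return math.log(max(p, 1e-300))   # guard against log(0) on perfect votes

def total_ll(votes, w, m, gamma, sigma):
    return sum(log_p(votes[i][j], gamma[j], sigma[i], w[i], m[j])
               for i in range(len(votes)) for j in range(len(m)))

def step1(votes, w, m, gamma_grid, sigma_grid, max_iter=10, tol=1e-6):
    """Alternate grid searches over the gamma_j (1a) and sigma_i (1b) until
    the log-likelihood stops improving (1c); w and m are held fixed."""
    p, q = len(votes), len(m)
    sigma = [1.0] * p                  # start with all sigma_i equal to 1
    gamma = [gamma_grid[0]] * q
    history = []
    for _ in range(max_iter):
        for j in range(q):             # step 1a: one roll call at a time
            gamma[j] = max(gamma_grid, key=lambda g: sum(
                log_p(votes[i][j], g, sigma[i], w[i], m[j]) for i in range(p)))
        for i in range(p):             # step 1b: one legislator at a time
            sigma[i] = max(sigma_grid, key=lambda s: sum(
                log_p(votes[i][j], gamma[j], s, w[i], m[j]) for j in range(q)))
        history.append(total_ll(votes, w, m, gamma, sigma))
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
    return gamma, sigma, history

# Simulated one-dimensional data (all values hypothetical).
random.seed(1)
p, q = 40, 25
w = [random.uniform(-1, 1) for _ in range(p)]
m = [random.uniform(-0.5, 0.5) for _ in range(q)]
true_gamma = [random.uniform(0.5, 3.0) for _ in range(q)]
votes = [[1 if 2 * true_gamma[j] * (w[i] - m[j]) + random.gauss(0, 1) > 0 else -1
          for j in range(q)] for i in range(p)]

gamma_grid = [0.25 * k for k in range(1, 41)]   # 0.25, 0.50, ..., 10.0
sigma_grid = [0.25 * k for k in range(1, 13)]   # 0.25, ..., 3.0 (contains 1.0)
gamma_hat, sigma_hat, history = step1(votes, w, m, gamma_grid, sigma_grid)
print(len(history), history[-1])
```

Because each grid contains the current value of the parameter being updated, the log-likelihood cannot decrease from one pass to the next. Note also that the choice probability depends only on the ratio γj/σi, so in this sketch the grid bounds are what keep the joint scale of the two sets of parameters from drifting; a roll call that is classified without error will simply hit the top of the γ grid.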

In one dimension, given a joint rank ordering of the legislators and roll call midpoints from the classification algorithm, step 4 is not necessary and the legislator coordinates in step 2 can be found through a simple grid search. In practice only 3 overall passes through steps 1 to 4 are required for the program to converge.

Table 1 and Figures 2 to 6 show an application of the quadratic utility algorithm to the 90th House of Representatives. Table 1 shows the scaling results for the quadratic algorithm for 1 to 10 dimensions. The corresponding correct classifications from the optimal classification algorithm are also shown for comparative purposes. Figure 2 graphs the increase in fit from adding dimensions. The increase in geometric mean probability (GMP) from adding a dimension was multiplied by 100 so it could be graphed on the same scale as the increase in classification. After the 2nd dimension the incremental increase from adding a dimension is quite small. This is a classic "elbow" indicating the correct dimensionality is at most three and almost certainly two.

__________________________

Table 1 and Figure 2 about Here

__________________________

Figure 3 shows a plot of the legislator ideal points for the 90th House. The "d", "s", and "r" tokens indicate Northern (Non-Southern) Democrats, Southern Democrats, and Republicans respectively. The two dimensions are liberal-conservative (government intervention in the economy) and Race (North vs. South). The configuration is the same as that recovered by NOMINATE, Heckman-Snyder, and the optimal classification algorithm. An analysis of the structure of voting during this period of American history can be found in Poole and Rosenthal (1997) and McCarty, Poole, and Rosenthal (1997).

__________________

Figure 3 about Here

__________________

Figure 4 shows a histogram of the estimated normal vectors in terms of their angles from −90 to +90 degrees. A normal vector at an angle of −45 degrees produces a cutting plane at +45 degrees parallel to the "channel" between the two political parties. A normal vector at an angle between +45 degrees and about +20 degrees produces cutting planes that are potential "conservative coalition" votes, that is, cutting planes that run between the two wings of the Democratic Party with a majority of Republicans on the side of the Southern Democrats.
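The relationship between the angle of a normal vector and the angle of its cutting line in two dimensions is a 90 degree rotation folded back into the −90 to +90 range. A small helper makes the conversion explicit; the function name is hypothetical.

```python
def cutting_line_angle(normal_angle_deg):
    """Angle (in degrees, within (-90, 90]) of the cutting line that is
    perpendicular to a normal vector at the given angle."""
    angle = (normal_angle_deg + 90.0) % 180.0   # rotate, fold into [0, 180)
    return angle - 180.0 if angle > 90.0 else angle

print(cutting_line_angle(-45.0))  # 45.0
print(cutting_line_angle(45.0))   # -45.0
print(cutting_line_angle(20.0))   # -70.0
```

Under this conversion, normal vectors between +45 and about +20 degrees correspond to cutting lines between −45 and about −70 degrees.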

_________________

Figure 4 about Here

_________________

Figures 5 and 6 show histograms of the σi’s and γj’s respectively. The mean and standard deviation of the σi’s are .995 and .40 respectively. The mean and standard deviation of the γj’s are 4.46 and 2.55 respectively. Both are weakly related to their corresponding correct classifications. The Pearson r between the σi’s and the correct classification percentages for the legislators is -.59. The Pearson r between the γj’s and the correct classification percentages for the roll calls is .66. Although the distributions of the σi’s and γj’s appear to be quite reasonable, without a comprehensive Monte-Carlo study of the quadratic procedure it is not possible to make definitive statements about them at this time.

______________________

Figures 5 and 6 about Here

______________________

 

6. Conclusion

The purpose of this paper was to show how the geometry of the multidimensional quadratic utility function could be exploited to estimate legislator ideal points and roll call normal vectors and cutpoints in a standard framework where the stochastic portion of the utility function is normally distributed. The algorithm shown in section 5 appears to be quite stable and produces sensible results. However, a comprehensive Monte-Carlo study is needed to pin down all the properties of the scaling algorithm.

References

Best, Alvin M., Forrest W. Young, and Robert G. Hall. 1979. "On the Precision of a Euclidean Structure." Psychometrika, 44:395-408.

Coombs, Clyde. 1964. A Theory of Data. New York: Wiley.

Davis, Otto A. and Melvin J. Hinich. 1966. "A Mathematical Model of Policy Formation in a Democratic Society." In Mathematical Applications in Political Science II, edited by J. Bernd. Dallas Texas: Southern Methodist University Press.

Davis, Otto A. and Melvin J. Hinich. 1967. "Some Results Related to a Mathematical Model of Policy Formation in a Democratic Society." In Mathematical Applications in Political Science III, edited by J. Bernd. Charlottesville, VA: University of Virginia Press.

Davis, Otto A., Melvin J. Hinich, and Peter C. Ordeshook. 1970. "An Expository Development of a Mathematical Model of the Electoral Process." American Political Science Review, 64:426-448.

Dhrymes, Phoebus J. 1978. Introductory Econometrics. New York: Springer-Verlag.

Heckman, James J. and James M. Snyder. 1997. "Linear Probability Models of the Demand for Attributes With an Empirical Application to Estimating the Preferences of Legislators." Rand Journal of Economics, 28:142-189.

Ladha, Krishna K. 1991. "A Spatial Model of Legislative Voting With Perceptual Error." Public Choice, 68:151-174.

Londregan, John B. 2000. "Estimating Legislators’ Preferred Points." Political Analysis, 8:35-56.

Lord, F. M. 1983. "Unbiased Estimates of Ability Parameters, of their Variance, and of their Parallel Forms Reliability." Psychometrika, 48:477-482.

MacRae, Duncan, Jr. 1958. Dimensions of Congressional Voting. Berkeley: University of California Press.

McCarty, Nolan, Keith T. Poole, and Howard Rosenthal. 1997. Income Redistribution and the Realignment of American Politics. Washington, D.C.: AEI Press.

Poole, Keith T. and Howard Rosenthal. 1985. "A Spatial Model for Legislative Roll Call Analysis." American Journal of Political Science, 29:357-384.

Poole, Keith T. and Howard Rosenthal. 1991. "Patterns of Congressional Voting." American Journal of Political Science, 35:228-278.

Poole, Keith T. and Howard Rosenthal. 1997. Congress: A Political-Economic History of Roll Call Voting. New York: Oxford University Press.

Poole, Keith T. 2000. "Non-Parametric Unfolding of Binary Choice Data." Political Analysis (forthcoming).

Rasch, G. 1961. "On General Laws and the Meaning of Measurement in Psychology." Proceedings of the IV Berkeley Symposium on Mathematical Statistics and Probability, 4:321-333.

Ross, John and Norman Cliff. 1964. "A Generalization of the Interpoint Distance Model." Psychometrika, 29:167-176.

Schonemann, Peter H. 1966. "A Generalized Solution of the Orthogonal Procrustes Problem." Psychometrika, 31:1-10.

Young, Gale and A. S. Householder. 1938. "Discussion of a Set of Points in Terms of their Mutual Distances." Psychometrika, 3:19-22.

 

Table 1

Scaling Results for the 90th House:

389 Roll Calls, 438 Legislators, 147,199 Total Choices

 

             Optimal Classification       Quadratic Utility Scaling
             ------------------------     ----------------------------------
Dimension    Percent Correct  APRE(a)     Percent Correct  APRE     GMP(b)
                 Class.                       Class.

    1            87.85         .573           85.42        .488     .728
    2            90.34         .661           88.53        .597     .772
    3            91.09         .687           89.31        .624     .783
    4            91.50         .701           89.63        .636     .787
    5            91.97         .718           90.05        .651     .795
    6            92.33         .731           90.42        .664     .801
    7            92.66         .742           90.78        .676     .807
    8            92.98         .753           91.10        .687     .813
    9            93.26         .763           91.34        .696     .817
   10            93.52         .773           91.65        .707     .823


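a Aggregate Proportional Reduction in Error: the total number of minority votes minus the total number of classification errors, divided by the total number of minority votes, with both totals taken over all roll calls; that is,

$$\text{APRE} = \frac{\sum_{j=1}^{q}(\text{minority vote} - \text{classification errors})_j}{\sum_{j=1}^{q}(\text{minority vote})_j}$$

b Geometric Mean Probability: the exponential of the average log-likelihood; that is, GMP = exp[(log-likelihood of all observed choices)/N], where N is the total number of choices.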

 

Endnotes