Correlation: What It Means in Finance and the Formula for Calculating It – Investopedia
Adam Hayes, Ph.D., CFA, is a financial writer with 15+ years Wall Street experience as a derivatives trader. Besides his extensive derivative trading expertise, Adam is an expert in economics and behavioral finance. Adam received his master's in economics from The New School for Social Research and his Ph.D. from the University of Wisconsin-Madison in sociology. He is a CFA charterholder as well as holding FINRA Series 7, 55 & 63 licenses. He currently researches and teaches economic sociology and the social studies of finance at the Hebrew University in Jerusalem.
Correlation, in the finance and investment industries, is a statistic that measures the degree to which two securities move in relation to each other. Correlations are used in advanced portfolio management, computed as the correlation coefficient, which has a value that must fall between -1.0 and +1.0.
Correlation shows the strength of a relationship between two variables and is expressed numerically by the correlation coefficient. The correlation coefficient's values range between -1.0 and 1.0.
A perfect positive correlation means that the correlation coefficient is exactly 1. This implies that as one security moves, either up or down, the other security moves in lockstep, in the same direction. A perfect negative correlation means that two assets move in opposite directions, while a zero correlation implies no linear relationship at all.
For example, large-cap mutual funds generally have a high positive correlation with the Standard and Poor’s (S&P) 500 Index, close to 1. Small-cap stocks also tend to be positively correlated with the S&P 500, but less strongly, at approximately 0.8.
However, put option prices and their underlying stock prices will tend to have a negative correlation. A put option gives the owner the right but not the obligation to sell a specific amount of an underlying security at a pre-determined price within a specified time frame.
Put option contracts become more profitable when the underlying stock price decreases. In other words, as the stock price increases, put option prices go down, which produces a strong negative correlation.
There are several methods of calculating correlation. The most common method, the Pearson product-moment correlation, is discussed further in this article. The Pearson product-moment correlation measures the linear relationship between two variables. It can be used for any data set that has a finite covariance matrix. The steps to calculate it are worked through in the example later in this article.
To avoid the complex manual calculation, consider using the CORREL function in Excel.
Using the Pearson product-moment correlation method, the following formula can be used to find the correlation coefficient, r:
$$
r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\left(n \sum X^2 - \left(\sum X\right)^2\right)\left(n \sum Y^2 - \left(\sum Y\right)^2\right)}}
$$
where:
r = Correlation coefficient
n = Number of observations
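The same coefficient is often written as the covariance of the two variables divided by the product of their standard deviations, which may make the link to the covariance matrix mentioned earlier easier to see. The notation below (sample means \bar{X} and \bar{Y}, standard deviations \sigma_X and \sigma_Y, and the index i) is introduced here for illustration and does not appear in the article; multiplying out the deviations from the mean recovers the sums-based formula.

```latex
r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}
  = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
         {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2} \, \sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}
```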
Investment managers, traders, and analysts find it very important to calculate correlation because the risk reduction benefits of diversification rely on this statistic. Financial spreadsheets and software can calculate the value of correlation quickly.
As a hypothetical example, assume that an analyst needs to calculate the correlation for the following two data sets:
X: (41, 19, 23, 40, 55, 57, 33)
Y: (94, 60, 74, 71, 82, 76, 61)
There are three steps involved in finding the correlation. The first is to add up all the X values to find SUM(X), add up all the Y values to find SUM(Y), and multiply each X value by its corresponding Y value and sum the products to find SUM(X,Y):
SUM(X) = (41 + 19 + 23 + 40 + 55 + 57 + 33) = 268
SUM(Y) = (94 + 60 + 74 + 71 + 82 + 76 + 61) = 518
SUM(X,Y) = (41 x 94) + (19 x 60) + (23 x 74) + … (33 x 61) = 20,391
The next step is to take each X value, square it, and sum up all these values to find SUM(X^2). The same must be done for the Y values:
SUM(X^2) = (41^2) + (19^2) + (23^2) + … (33^2) = 11,534
SUM(Y^2) = (94^2) + (60^2) + (74^2) + … (61^2) = 39,174
Noting that there are seven observations, n, the following formula can be used to find the correlation coefficient, r:
$$
r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\left(n \sum X^2 - \left(\sum X\right)^2\right)\left(n \sum Y^2 - \left(\sum Y\right)^2\right)}}
$$
where:
r = Correlation coefficient
n = Number of observations
In this example, the correlation would be:
r = (7 x 20,391 - (268 x 518)) / SquareRoot((7 x 11,534 - 268^2) x (7 x 39,174 - 518^2)) = 3,913 / 7,248.4 = 0.54
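The calculation above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function name `pearson_r` and its variable names are illustrative choices, not something the article prescribes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from the raw sums."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)                               # SUM(X)
    sum_y = sum(ys)                               # SUM(Y)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # SUM(X,Y)
    sum_x2 = sum(x * x for x in xs)               # SUM(X^2)
    sum_y2 = sum(y * y for y in ys)               # SUM(Y^2)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# The two data sets from the hypothetical example.
X = [41, 19, 23, 40, 55, 57, 33]
Y = [94, 60, 74, 71, 82, 76, 61]

r = pearson_r(X, Y)
print(round(r, 2))  # 0.54
```

Working from the raw sums mirrors the manual steps; a spreadsheet function such as CORREL should return the same value for these inputs.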
In investing, correlation is most important in relation to a diversified portfolio. Investors who wish to mitigate risk can do so by investing in non-correlated assets. For example, consider an investor who owns airline stock. If the airline industry is found to have a low correlation to the social media industry, the investor may choose to invest in a social media stock, understanding that a negative impact on one industry may not affect the other.
This is often the approach when considering investing across asset classes. Stocks, bonds, precious metals, real estate, cryptocurrency, commodities, and other types of investments each have different relationships to each other. While some may be heavily correlated, others may act as a hedge to diversify risk if they are not correlated.
Risk that can be diversified away is called unsystematic risk. This type of risk is specific to a company, industry, or asset class. Investing in different assets can reduce your portfolio's correlation and reduce your exposure to unsystematic risk.
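To see why lower correlation reduces risk, consider the standard two-asset portfolio variance formula, in which the correlation coefficient scales the cross term. The sketch below is illustrative only: the function name `two_asset_volatility`, the 50/50 weights, and the 20% volatilities are assumptions chosen for the example, not figures from the article.

```python
from math import sqrt

def two_asset_volatility(w1, sigma1, sigma2, rho):
    """Volatility of a two-asset portfolio with weight w1 in asset 1.

    variance = (w1*s1)^2 + (w2*s2)^2 + 2*w1*w2*rho*s1*s2
    """
    w2 = 1.0 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2 * w1 * w2 * rho * sigma1 * sigma2
    )
    # Guard against a tiny negative value caused by floating-point rounding.
    return sqrt(max(variance, 0.0))

# Hypothetical inputs: equal weights, each asset with 20% volatility.
for rho in (1.0, 0.5, 0.0, -0.5, -1.0):
    vol = two_asset_volatility(0.5, 0.20, 0.20, rho)
    print(f"rho={rho:+.1f}  portfolio volatility={vol:.4f}")
```

With these inputs the portfolio is as volatile as either asset when the correlation is +1 and becomes steadily less volatile as the correlation falls.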
Correlation is often reported alongside, and interpreted in light of, other statistical measures. It is common to see correlation cited whenever statistics is used to analyze variables.
In statistics, a p-value is used to indicate whether the findings are statistically significant. It is possible to determine that two variables are correlated, but there may not be enough supporting evidence to state this as a strong claim. A low p-value (conventionally below 0.05) indicates there is enough evidence to meaningfully conclude that the population correlation coefficient is different from zero.
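The article does not say how the p-value is obtained. One approach that needs no statistical tables is a permutation test: shuffle one variable in every possible order and count how often the shuffled data produce a correlation at least as strong as the observed one. The sketch below assumes this technique (a t-test on r is the more common textbook choice), and the names `pearson_r` and `permutation_p_value` are hypothetical helpers. Enumerating every ordering is only practical for small samples such as the seven-point example.

```python
from itertools import permutations
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def permutation_p_value(xs, ys):
    """Exact two-sided permutation p-value for 'no association'."""
    observed = abs(pearson_r(xs, ys))
    total = 0
    at_least_as_extreme = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so floating-point ties count as "as extreme".
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total

X = [41, 19, 23, 40, 55, 57, 33]
Y = [94, 60, 74, 71, 82, 76, 61]

p = permutation_p_value(X, Y)  # loops over 7! = 5,040 orderings
print(f"r = {pearson_r(X, Y):.2f}, permutation p-value = {p:.3f}")
```

A small p-value here would mean that few random pairings of the data match the observed correlation, which is the kind of evidence the paragraph above describes.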
The easiest way to visualize whether two variables are correlated is to graphically depict them using a scatterplot. Each point on a scatterplot represents one sample item. The x-axis of the scatterplot represents one of the variables being tested, while the y-axis of the scatter plot represents the other.
The correlation between the two variables is often depicted graphically as a straight line drawn through the points to show their relationship. If the two variables are positively correlated, an upward-sloping line may be drawn on the scatterplot. If they are negatively correlated, a downward-sloping line may be drawn. The stronger the relationship, the closer each data point will lie to this line.
Scatterplots may be more useful when analyzing more complex data that might have changing relationships. For example, two variables may be positively correlated to a certain point, then their relationship becomes negatively correlated. This non-linear relationship may be more difficult to identify using formulas but can be easier to spot when graphed on a scatterplot.
Last, scatterplots can easily depict correlation when they incorporate density shading. A density shade or density ellipse is a shaded area on a scatterplot that visually shows the densest region of data points on a scatterplot. The density ellipses will often mirror the direction of a linear correlation line if variables are related. Otherwise, density ellipses that are more circular with no defined direction indicate lower correlation.
Another inherent difficulty in statistics is determining whether relationships between two variables are caused by those variables. Consider the following statement:
"Most basketball players are tall. Therefore if you play basketball, you will become tall."
It's clear that the statement above is not true. Individuals who are tall and understand this advantage may gravitate to basketball because their natural physical abilities suit them for the sport. However, because height and playing basketball may be positively correlated, statisticians and data scientists must be aware that a strong relationship between two variables may or may not be caused by either of the variables.
Like other aspects of statistical analysis, correlation can be misinterpreted. Small sample sizes may yield unreliable results, even if it appears as though correlation between two variables is strong. Alternatively, a small sample size may yield uncorrelated findings when the two variables are in fact linked.
Correlation is often skewed when an outlier is present. Correlation only shows how one variable is connected to another and may not clearly identify how a single instance or outcome can impact the correlation coefficient.
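A small, made-up data set can illustrate how much a single outlier can move the coefficient. In the sketch below, four points arranged in a square have zero correlation, and adding one extreme point pushes the coefficient close to 1. The data and the helper name `pearson_r` are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Four hypothetical points on the corners of a square: no linear relationship.
base_x = [1, 1, 2, 2]
base_y = [1, 2, 1, 2]
r_without = pearson_r(base_x, base_y)

# The same points plus a single extreme observation at (10, 10).
r_with = pearson_r(base_x + [10], base_y + [10])

print(round(r_without, 3))  # 0.0
print(round(r_with, 3))     # 0.983
```

Plotting the data, or recomputing the coefficient with suspect points removed, is one way to check whether a result depends on a single observation.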
Correlation may also be misinterpreted if the relationship between two variables is nonlinear. It is much easier to identify two variables with a linear positive or negative correlation. However, two variables may still be related in a more complex way that the correlation coefficient understates or misses entirely.
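As a rough illustration of the nonlinear case, the sketch below uses a hypothetical data set in which Y is exactly the square of X. The two variables are perfectly related, yet the Pearson coefficient is zero because the relationship is not linear. The data and the helper name `pearson_r` are assumptions for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: Y depends on X exactly, but through a U-shaped curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_nonlinear = pearson_r(xs, ys)
print(r_nonlinear)  # 0.0
```

The falling left half of the curve cancels the rising right half, which is why a scatterplot can reveal a relationship that the coefficient alone hides.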
Correlation is a statistical term describing the degree to which two variables move in coordination with one another. If the two variables move in the same direction, then those variables are said to have a positive correlation. If they move in opposite directions, then they have a negative correlation.
Correlations play an important role in finance because they are used to forecast future trends and to manage the risks within a portfolio. These days, the correlations between assets can be easily calculated using various software programs and online services. Correlations, along with other statistical concepts, play an important role in the creation and pricing of derivatives and other complex financial instruments.
Correlation is a widely used concept in modern finance. For example, a trader might use historical correlations to predict whether a company’s shares will rise or fall in response to a change in interest rates or commodity prices. Similarly, a portfolio manager might aim to reduce risk by ensuring that the individual assets within the portfolio are not overly correlated with one another.
Investors may have a preference for the level of correlation within their portfolios. In general, most investors prefer lower correlation, because it reduces the risk that different assets or securities in the portfolio are all hit by the same market conditions. However, risk-seeking investors, or those who want to concentrate on a specific sector or company, may accept higher correlation within their portfolio in exchange for greater potential returns.