The Pearson correlation coefficient, often written as Pearson's r, is a statistical measure of the strength and direction of the linear relationship between two variables. Its value ranges from -1.0 to 1.0. A value of -1.0 indicates a perfect negative linear correlation, while a value of 1.0 indicates a perfect positive linear correlation. A value of 0.0 indicates no linear relationship between the two variables, although a nonlinear relationship may still exist.
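As a rough illustration of the definition, r can be computed as the sum of products of the two variables' deviations from their means, divided by the square root of the product of their summed squared deviations. The sketch below uses only the Python standard library. The helper name `pearson_r` and the toy data are hypothetical, chosen only to reproduce the three reference values described above.

```python
import math

def pearson_r(x, y):
    """Pearson's r from its definition (hypothetical helper, not a library API)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (numerator) and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Toy data (assumed for illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect_positive = pearson_r(x, [2.0 * v + 1.0 for v in x])    # y rises linearly with x
perfect_negative = pearson_r(x, [-3.0 * v + 7.0 for v in x])   # y falls linearly with x
no_linear = pearson_r(x, [(v - 3.0) ** 2 for v in x])          # y depends on x, but not linearly

print(perfect_positive, perfect_negative, no_linear)
```

The last case is worth noting: y is fully determined by x, yet r is zero because the dependence is symmetric around the mean rather than linear.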
In the example below, the Pearson correlation coefficient between salt intake and systolic blood pressure is approximately 0.971, indicating a strong positive linear correlation: as salt intake increases, systolic blood pressure also tends to increase. The associated p-value is extremely small (about 2.995 × 10^-6), so the correlation is statistically significant; a value this strong would be very unlikely to arise by chance if the two variables were actually uncorrelated.
import numpy as np
from scipy.stats import pearsonr
# Given data for salt intake (in mEq) and systolic blood pressure (in mmHg)
salt_intake = np.array([106.0960, 194.7779, 275.2025, 397.4523, 497.3065, 574.1339, 705.6480,
                        801.5520, 881.2873, 999.4862])
blood_pressure = np.array([100.99, 105.58, 114.04, 114.79, 115.99, 117.13, 122.20, 124.84, 126.01, 129.70])
# Calculate the Pearson correlation coefficient
correlation_coefficient, p_value = pearsonr(salt_intake, blood_pressure)
# Print the results, separated by blank lines for readability
print()
print(f"The Pearson correlation coefficient is approximately {correlation_coefficient:.3f}, which indicates\n"
      "a strong positive correlation between salt intake and systolic blood pressure.")
print()
print(f"The p-value is extremely small ({p_value:.3e}), suggesting that the correlation is statistically significant.")
print()
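To build some intuition for what "unlikely to be due to chance" means, one option is a permutation test: shuffle the blood pressure values many times, recompute r for each shuffle, and count how often a shuffled pairing produces a correlation at least as strong as the observed one. This is a different technique from the analytic p-value that `pearsonr` reports, so the two numbers are not expected to match exactly. The sketch below is a minimal standard-library version; the helper `pearson_r`, the seed, and the number of shuffles are assumptions made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r from its definition (hypothetical helper, not a library API)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Same data as in the example above
salt_intake = [106.0960, 194.7779, 275.2025, 397.4523, 497.3065,
               574.1339, 705.6480, 801.5520, 881.2873, 999.4862]
blood_pressure = [100.99, 105.58, 114.04, 114.79, 115.99,
                  117.13, 122.20, 124.84, 126.01, 129.70]

observed_r = pearson_r(salt_intake, blood_pressure)

rng = random.Random(0)   # fixed seed (assumed) so the run is repeatable
n_shuffles = 10_000      # assumed; more shuffles give a finer estimate
shuffled = list(blood_pressure)
extreme = 0
for _ in range(n_shuffles):
    rng.shuffle(shuffled)  # break any real pairing between x and y
    if abs(pearson_r(salt_intake, shuffled)) >= abs(observed_r):
        extreme += 1

# Add-one correction keeps the estimate strictly above zero
p_perm = (extreme + 1) / (n_shuffles + 1)
print(f"observed r = {observed_r:.3f}, permutation p-value estimate = {p_perm:.4f}")
```

With only 10,000 shuffles the estimate cannot go below roughly 1 in 10,000, so it can only confirm that the p-value is small, not reproduce a figure on the order of 10^-6.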