Crime and Policing Expenditures Descriptive Analysis
In this exercise we’ll be examining the relationship between crime and policing expenditures using county-level data from Massachusetts.
Begin by downloading the data for this exercise from https://github.com/nickeubank/MIDS_Data/blob/master/descriptive_exercise/crime_expend_MA.csv (if you don’t want to type all that, just go to github.com/nickeubank/MIDS_Data, then go to descriptive_exercise and get crime_expend_MA.csv).
(Reminder: R and pandas can pull data directly from the web, but only if you point them at the raw representation of the data.)
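As a minimal sketch of what "the raw representation" means for pandas: GitHub usually serves the plain file from raw.githubusercontent.com with the /blob/ segment dropped. The helper names to_raw_url and load_crime_data are hypothetical, and the URL rewrite is an assumption about GitHub's usual layout, so check the result against the "Raw" button on the file's page.

```python
import pandas as pd

# The "blob" page URL given in the exercise text.
BLOB_URL = (
    "https://github.com/nickeubank/MIDS_Data/blob/master/"
    "descriptive_exercise/crime_expend_MA.csv"
)


def to_raw_url(blob_url: str) -> str:
    """Rewrite a GitHub 'blob' page URL into its raw-file form.

    Assumption: the usual GitHub pattern, where the host becomes
    raw.githubusercontent.com and the '/blob/' path segment is dropped.
    """
    return (
        blob_url
        .replace("https://github.com/", "https://raw.githubusercontent.com/", 1)
        .replace("/blob/", "/", 1)
    )


def load_crime_data(blob_url: str = BLOB_URL) -> pd.DataFrame:
    """Read the CSV straight from the web (needs network access)."""
    return pd.read_csv(to_raw_url(blob_url))
```

Calling load_crime_data() should then return the exercise data as a DataFrame, provided the file is still at that location.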
This dataset includes monthly observations of each county’s policing expenditures (policeexpenditures, as a share of the county budget) and an index of crime (crimeindex, scaled 0-100) from 1990 to late 2001.
In these exercises, we’ll be focusing on just two counties: county_code 4 and 10.
First, for each of these two counties, calculate the mean expenditure level and mean crimeindex score (i.e. calculate both means separately for each county).
Just to make sure we’re practicing applied skills – do it by looping over the two counties you are analyzing and have the loop calculate your means and print your results nicely! So you should get output like this (though obviously with different numbers – I’m not gonna give you the answer!):
for county 4, average policing expenditure is 23.7 and average crime index is 75.83
for county 10, average policing expenditure is 62.15 and average crime index is 55.88
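If the loop-and-print pattern is unfamiliar, here is one possible sketch of its shape. It runs on a tiny made-up DataFrame (toy county codes 1 and 2, invented numbers), not the exercise data, so it does not give away any answers. The column names match the exercise; the helper name describe_means is hypothetical.

```python
import pandas as pd

# Made-up toy data, NOT the exercise data.
toy = pd.DataFrame({
    "county_code": [1, 1, 1, 2, 2, 2],
    "policeexpenditures": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "crimeindex": [80.0, 70.0, 60.0, 30.0, 20.0, 10.0],
})


def describe_means(df, counties):
    """Return one formatted summary line per county."""
    lines = []
    for county in counties:
        # Keep only the rows for the county currently being summarized.
        subset = df[df["county_code"] == county]
        mean_exp = subset["policeexpenditures"].mean()
        mean_crime = subset["crimeindex"].mean()
        lines.append(
            f"for county {county}, average policing expenditure is "
            f"{mean_exp:.2f} and average crime index is {mean_crime:.2f}"
        )
    return lines


for line in describe_means(toy, [1, 2]):
    print(line)
```

Swapping toy for the real DataFrame and [1, 2] for [4, 10] gives the output the exercise asks for.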
Now calculate the standard deviation of both expenditures and crime for these two counties.
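One detail worth knowing before you compute standard deviations: pandas and NumPy use different defaults. The sketch below uses made-up numbers (not the exercise data) to show that Series.std() divides by n - 1 (ddof=1), while numpy.std divides by n (ddof=0).

```python
import numpy as np
import pandas as pd

# Made-up numbers, chosen so the population standard deviation is exactly 2.
values = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

sample_sd = values.std()                   # pandas default: ddof=1 (sample SD)
population_sd = np.std(values.to_numpy())  # NumPy default: ddof=0 (population SD)
matched_sd = values.std(ddof=0)            # pandas told to match NumPy's default

print(f"sample SD: {sample_sd:.3f}, population SD: {population_sd:.3f}")
```

Either convention is defensible for this exercise, as long as you use the same one for both counties.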
Now calculate the correlation between policeexpenditures and crimeindex for each of these counties (again with a loop and nice printed output!).
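A per-county correlation can follow the same loop pattern. This sketch again uses made-up toy data (toy county codes 1 and 2), not the exercise data, and the helper name county_correlations is hypothetical. Series.corr computes the Pearson correlation by default.

```python
import pandas as pd

# Made-up toy data, NOT the exercise data: county 1 falls perfectly,
# county 2 rises perfectly, so the correlations are -1 and +1.
toy = pd.DataFrame({
    "county_code": [1, 1, 1, 2, 2, 2],
    "policeexpenditures": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "crimeindex": [80.0, 70.0, 60.0, 10.0, 20.0, 30.0],
})


def county_correlations(df, counties):
    """Map each county code to the Pearson correlation of the two columns."""
    results = {}
    for county in counties:
        subset = df[df["county_code"] == county]
        results[county] = subset["policeexpenditures"].corr(subset["crimeindex"])
    return results


for county, corr in county_correlations(toy, [1, 2]).items():
    print(f"for county {county}, the correlation between policing "
          f"expenditure and crime index is {corr:.2f}")
```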
Based on your results up to this point, what would you guess about whether policing reduces crime? (I know: these are just descriptive statistics, and correlation does not imply causality. But what would you infer if this was all you knew?)
Given what you’ve seen up till now, would you infer that county 4 and county 10 have a similar relationship between crime and police expenditures?
Now plot histograms of policeexpenditures for both county 4 and county 10. Do the results change your impression of the similarity of county 4 and county 10?
Finally, create a scatter plot of the relationship between crime and police expenditures for each county (e.g. crime on one axis, police expenditures on the other). Does this change your sense of how similar these are?
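For the two plotting steps above, one possible layout is a grid with a histogram on top and a scatter plot underneath for each county. The sketch assumes matplotlib is installed and uses made-up toy data (toy county codes 1 and 2), not the exercise data. The helper name plot_county_panels is hypothetical, and any plotting library would work just as well.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up toy data, NOT the exercise data.
toy = pd.DataFrame({
    "county_code": [1, 1, 1, 1, 2, 2, 2, 2],
    "policeexpenditures": [10.0, 20.0, 30.0, 40.0, 35.0, 45.0, 55.0, 65.0],
    "crimeindex": [80.0, 70.0, 65.0, 50.0, 30.0, 35.0, 20.0, 10.0],
})


def plot_county_panels(df, counties):
    """Draw a histogram (top row) and a scatter plot (bottom row) per county."""
    fig, axes = plt.subplots(
        2, len(counties), figsize=(4 * len(counties), 7), squeeze=False
    )
    for col, county in enumerate(counties):
        subset = df[df["county_code"] == county]
        # Top row: distribution of police expenditures.
        axes[0, col].hist(subset["policeexpenditures"], bins=10)
        axes[0, col].set_title(f"county {county}")
        axes[0, col].set_xlabel("policeexpenditures")
        # Bottom row: expenditures against the crime index.
        axes[1, col].scatter(subset["policeexpenditures"], subset["crimeindex"])
        axes[1, col].set_xlabel("policeexpenditures")
        axes[1, col].set_ylabel("crimeindex")
    fig.tight_layout()
    return fig


fig = plot_county_panels(toy, [1, 2])
```

Putting both counties side by side makes it easier to compare their shapes directly; whether to share axis ranges between the panels is a judgment call.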