Description
The business data contains postal code information that we can use to aggregate the ratings over regions of the city. Let’s examine and clean the postal code field. The postal code (sometimes also called a ZIP code) partitions the city into regions.
Question 5a
Let’s look at the distribution of inspection scores. As we saw before when we called head on this data frame, inspection scores appear to be integer values. Because this variable is discrete, we can use a bar plot to visualize its distribution. Make a bar plot of the number of inspections receiving each score.
It should look like the image below. It does not need to look exactly the same (e.g., no grid), but make sure that all labels and axes are correct.
[Figure: bar plot titled "Distribution of Inspection Scores"; x-axis labeled Score, y-axis ticks from 250 to 2000.]
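As a rough illustration of the bar plot described above, here is a minimal sketch. The data frame ins, its score column, the sample values, and the "Count" y-axis label are all assumptions made for the example; the real data frame and the expected axis label may differ.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the inspections data frame; the real column
# name and values may differ.
ins = pd.DataFrame({"score": [100, 96, 96, 90, 100, 88, 96, 100, 90, 100]})

# Count how many inspections received each distinct score, ordered by score.
score_counts = ins["score"].value_counts().sort_index()

fig, ax = plt.subplots()
ax.bar(score_counts.index, score_counts.values)
ax.set_xlabel("Score")
ax.set_ylabel("Count")  # assumed label; not visible in the sample figure
ax.set_title("Distribution of Inspection Scores")
```

Sorting the counts by index keeps the bars in score order rather than in order of frequency.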
Now, create your scatter plot in the cell below. It does not need to look exactly the same (e.g., no grid) as the sample below, but make sure that all labels, axes, and the data itself are correct.
[Figure: scatter plot titled "First Inspection Score vs. Second Inspection Score"; x-axis labeled First Score, with ticks from 55 to 100.]
Key pieces of syntax you’ll need:
plt.scatter plots a set of points. Use facecolors='none' and edgecolors='b' to make circle markers with blue borders.
plt.plot for the reference line.
plt.xlabel, plt.ylabel, plt.axis, and plt.title.
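The pieces of syntax listed above might fit together roughly as follows. This is only a sketch: the paired scores are made-up values, the "Second Score" y-axis label is assumed, and the reference line is assumed to have slope 1 (y = x) and to be drawn in red, none of which is stated in the text.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical paired scores: first and second inspection of the same business.
first_score = np.array([92, 85, 100, 70, 96, 64])
second_score = np.array([94, 80, 98, 82, 96, 75])

fig = plt.figure()
# Hollow circle markers with blue borders.
plt.scatter(first_score, second_score, facecolors="none", edgecolors="b")
# Assumed reference line y = x: points above it scored higher the second time.
plt.plot([55, 100], [55, 100], color="r")
plt.xlabel("First Score")
plt.ylabel("Second Score")  # assumed label
plt.axis([55, 100, 55, 100])  # [xmin, xmax, ymin, ymax]
plt.title("First Inspection Score vs. Second Inspection Score")
```

Giving both axes the same limits makes the reference line run corner to corner, so it is easy to see which businesses improved.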
Question 6d
The boxplot should look similar to the sample below. Make sure the boxes are in the correct order!
[Figure: box plot with risk categories including Moderate Risk and High Risk.]
Hint: Use sns.boxplot(). Try taking a look at the first several parameters. The documentation is linked here!
Hint: Use plt.figure() to adjust the figure size of your plot.
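One way the two hints could combine is sketched below. The data frame, its column names (risk_category, score), the "Low Risk" category, the box order, and the figure size are all assumptions for illustration; the orientation and order of the boxes in the expected plot may differ.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: one row per violation, with the inspection score
# and a risk category. Column names and values are assumptions.
df = pd.DataFrame({
    "risk_category": ["Low Risk", "High Risk", "Moderate Risk",
                      "Low Risk", "High Risk", "Moderate Risk",
                      "Low Risk", "High Risk", "Moderate Risk"],
    "score": [96, 72, 88, 92, 80, 85, 98, 68, 90],
})

# Assumed box order; pass it explicitly so it does not depend on row order.
risk_order = ["Low Risk", "Moderate Risk", "High Risk"]

# Create the figure first so the box plot is drawn at the chosen size.
plt.figure(figsize=(10, 6))
ax = sns.boxplot(data=df, x="risk_category", y="score", order=risk_order)
```

Without the order argument, seaborn arranges the boxes in the order the categories first appear in the data, which is often not the intended one.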



