BIS634 – Homework1 Solved


0.1 Question 1:

[ ]: human_tester(42) # False — this would be a severe fever for a human
[ ]: False

[ ]: False

[ ]: False
[ ]: human_tester(98.6) # False: normal in degrees F, but our reference temp was in degrees C
[ ]: False
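The cell that defines `human_tester` is not shown above. A minimal sketch that is consistent with the two commented calls, assuming a reference body temperature of 37 degrees C and a tolerance of 1 degree (both values are assumptions, not taken from the original notebook), could look like this:

```python
def human_tester(temp_c, reference_c=37.0, tolerance_c=1.0):
    """Return True if temp_c is within tolerance_c of reference_c (degrees C)."""
    # reference_c and tolerance_c are assumed values chosen for illustration.
    return abs(temp_c - reference_c) <= tolerance_c


print(human_tester(37.2))  # close to the reference temperature
print(human_tester(42))    # far above the reference temperature
print(human_tester(98.6))  # a Fahrenheit value, compared as if it were Celsius
```

The last call shows why the units matter: the function has no way of knowing that 98.6 was meant in degrees F, so it compares it directly against a Celsius reference.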
0.2 Question 2:

0.2.1 Examine data. What columns does it have? (2 points) How many rows (think: people) does it have? (2 points)
The dataset contains 4 columns and 152361 rows.

[ ]: name age weight eyecolor
0 Edna Phelps 88.895690 67.122450 brown
1 Cara Yasso 9.274597 29.251244 brown
2 Gail Rave 18.345613 55.347903 brown
3 Richard Adams 16.367545 70.352184 brown
4 Krista Slater 49.971604 70.563859 brown
… … … … …
152356 John Fowler 23.930833 71.532569 blue
152357 Diana Shuffler 21.884819 67.936753 brown
152358 Kevin Cuningham 87.705907 60.074646 brown
152359 James Libengood 21.727666 81.774985 brown
152360 Cathleen Ballance 10.062236 34.327767 brown
[152361 rows x 4 columns]
0.2.2 Examine the distribution of the ages in the dataset. In particular, be sure to have your code report the mean, standard deviation, minimum, maximum. (2 points) Plot a histogram of the distribution with an appropriate number of bins for the size of the dataset (describe in your readme the role of the number of bins). (3 points) Comment on any outliers or patterns you notice in the distribution of ages. (1 point)

[ ]: count 152361.000000
mean 39.510528
std 24.152760
min 0.000748
25% 19.296458
50% 38.468955
75% 57.623245
max 99.991547
Name: age, dtype: float64
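The summary above has the shape of pandas `describe()` output. As a rough cross-check that needs no third-party library, the same kinds of statistics can be computed with the standard `statistics` module. The sketch below uses only the five ages visible in the first rows of the table, so its numbers will not match the full-dataset values.

```python
import statistics

# Ages of the first five people shown in the table (a tiny sample only).
ages = [88.895690, 9.274597, 18.345613, 16.367545, 49.971604]

summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "std": statistics.stdev(ages),  # sample standard deviation (divides by n - 1)
    "min": min(ages),
    "max": max(ages),
}

for key, value in summary.items():
    print(f"{key:>5}: {value:.6f}")
```

`statistics.stdev` divides by n - 1, which is also the default used by pandas, so the two are comparable on the same data.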

In the histogram, the number of participants declines between roughly 70 and 100 years of age. No significant outliers appear in the dataset. Apart from that decline, the distribution is closer to uniform than to normal, which indicates a sufficient number of participants in each age class. I chose 25 bins because the ages range from 0 to 100: 25 divides 100 evenly, so each bin spans 4 years, which is narrow enough to show changes in the distribution at a relatively small scale.
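To make the role of the bin count concrete, here is a small sketch of equal-width binning written in plain Python. The function name `bin_counts` and its parameters are hypothetical, and the input is just the five ages from the first rows of the table.

```python
def bin_counts(values, low=0.0, high=100.0, n_bins=25):
    """Count values into n_bins equal-width bins covering [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not (low <= v <= high):
            continue  # ignore values outside the plotted range
        # A value exactly equal to `high` is placed in the last bin.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return width, counts


ages = [88.895690, 9.274597, 18.345613, 16.367545, 49.971604]
width, counts = bin_counts(ages)
print(width)      # width of each bin in years
print(counts[4])  # number of ages in the bin covering 16 to 20
```

With fewer bins, each bar averages over a wider age span and hides local structure; with many more bins, each bar holds fewer people and the plot becomes noisy.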
0.2.3 Repeat the above for the distribution of weights. (3 points)

[ ]: count 152361.000000
mean 60.884134
std 18.411824
min 3.382084
25% 58.300135
50% 68.000000
75% 71.529860
max 100.435793
Name: weight, dtype: float64
0.2.4 Make a scatterplot of the weights vs the ages. (3 points) Describe the general relationship between the two variables (3 points). You should notice at least one outlier that does not follow the general relationship. What is the name of the person? (3 points) Be sure to explain your process for identifying the person whose values don’t follow the usual relationship in the readme. (3 points)

[ ]: name age weight eyecolor
2487 Charles Portillo 22.28086 44.340342 brown
The scatterplot shows that the relationship between weight and age differs across age intervals. For participants younger than 40, weight increases with age in a positive linear relationship. For participants between 40 and 100 years old, weight falls within a range of about 40 to 100, with a lower bound (the minimum weight) that follows the linear relationship seen in the under-40 interval.
One outlier is the point near the middle of the scatterplot, which clearly falls outside the general trend. Reading approximate values off the graph, I subset the data with a few conditions (weight between 0 and 45, age between 20 and 23) and found that the person is Charles Portillo.
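The subsetting step can be illustrated without the full dataset. The sketch below applies the same two conditions to a handful of rows copied from the tables above; the helper name `in_window` is hypothetical.

```python
# A few rows copied from the tables above, including the row the query returned.
people = [
    {"name": "Cara Yasso", "age": 9.274597, "weight": 29.251244},
    {"name": "Gail Rave", "age": 18.345613, "weight": 55.347903},
    {"name": "Charles Portillo", "age": 22.28086, "weight": 44.340342},
    {"name": "Diana Shuffler", "age": 21.884819, "weight": 67.936753},
    {"name": "John Fowler", "age": 23.930833, "weight": 71.532569},
]


def in_window(person, age_range=(20, 23), weight_range=(0, 45)):
    """True if the person falls inside both the age and the weight window."""
    return (age_range[0] <= person["age"] <= age_range[1]
            and weight_range[0] <= person["weight"] <= weight_range[1])


outliers = [p["name"] for p in people if in_window(p)]
print(outliers)
```

Both conditions are needed: Diana Shuffler is in the age window but has a typical weight, and Cara Yasso is light but is a child, so neither is flagged.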
0.3 Question 3
Test the above function and provide examples of it in use. (4 points)

[ ]: from datetime import datetime

     def get_difference(date1, date2):
         delta = date2 - date1
         return delta.days
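A short usage example for the date helper, with made-up dates chosen only for illustration:

```python
from datetime import datetime


def get_difference(date1, date2):
    """Return the number of whole days from date1 to date2."""
    delta = date2 - date1
    return delta.days


start = datetime(2021, 1, 1)
end = datetime(2021, 1, 9)
print(get_difference(start, end))  # 8
print(get_difference(end, start))  # -8

# Dates stored as text need to be parsed before they can be subtracted.
parsed = datetime.strptime("2021-01-09", "%Y-%m-%d")
print(get_difference(start, parsed))  # 8
```

The sign depends on argument order, so a caller that only cares about the gap can wrap the result in `abs()`.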
[ ]: ### third function
     def find_peaks(list_of_two):
         for name in list_of_two:
             find_subset_create_newcase(name)
         subset1 = data2.loc[data2['state'] == list_of_two[0]]
         subset2 = data2.loc[data2['state'] == list_of_two[1]]
         if subset1['new_case'].max() > subset2['new_case'].max():
             print(list_of_two[0] + ' has the larger peak daily increase in covid cases than ' + list_of_two[1])
         elif subset1['new_case'].max() < subset2['new_case'].max():
             print(list_of_two[1] + ' has the larger peak daily increase in covid cases than ' + list_of_two[0])
         else:
             print('The two states have the same maximum daily increase in covid cases')
         # report difference in dates
         d1 = list(subset1.loc[subset1['new_case'] == (subset1['new_case'].
         d2 = list(subset2.loc[subset2['new_case'] == (subset2['new_case'].
[ ]: find_peaks(['Washington', 'California'])
California has the larger peak daily increase in covid cases than Washington
The difference in peak time between the two states is 8 days
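The helper `find_subset_create_newcase` and the end of `find_peaks` are not fully visible above. The following self-contained sketch shows one way the same comparison could be done, assuming the input is a list of (date, cumulative case count) pairs per state. The function names and all of the numbers are invented for illustration.

```python
from datetime import date


def daily_new_cases(cumulative):
    """Turn (date, cumulative_cases) pairs into (date, new_cases) pairs."""
    ordered = sorted(cumulative)
    return [(d2, c2 - c1) for (d1, c1), (d2, c2) in zip(ordered, ordered[1:])]


def peak(cumulative):
    """Return (date, new_cases) for the day with the largest daily increase."""
    return max(daily_new_cases(cumulative), key=lambda pair: pair[1])


# Made-up cumulative counts for two hypothetical states.
state_a = [(date(2021, 1, 1), 100), (date(2021, 1, 2), 150),
           (date(2021, 1, 3), 400), (date(2021, 1, 4), 450)]
state_b = [(date(2021, 1, 1), 10), (date(2021, 1, 2), 20),
           (date(2021, 1, 3), 40), (date(2021, 1, 4), 140)]

peak_a = peak(state_a)
peak_b = peak(state_b)
gap_days = abs((peak_a[0] - peak_b[0]).days)
print(peak_a, peak_b, gap_days)
```

If two days tie for the largest increase, `max` returns the earliest one, because the pairs are sorted by date first.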
0.4 Question 4
Write Python code that reads the XML and reports:
the DescriptorName associated with DescriptorUI D007154 (the text of the name is nested inside a String tag) (5 points);
the DescriptorUI (MeSH Unique ID) associated with DescriptorName “Nervous System Diseases” (5 points);
the DescriptorNames of items in the MeSH hierarchy that are children of both “Nervous System Diseases” and D007154. (That is, each item is a subtype of both, as defined by its TreeNumber(s).) (5 points)
Explain briefly in terms of biology/medicine what the above search has found. (5 points)
Do these tasks using functions (e.g. write a generic function that returns DescriptorName given a DescriptorUI) instead of writing single use code. (5 points)

Immune System Diseases
[ ]: def find_descriptor_UI_given_name(des_name):
         for record in root.iter('DescriptorRecord'):
             if record.find('DescriptorName/String').text == des_name:
                 return record.find('DescriptorUI').text
[ ]: find_descriptor_UI_given_name('Nervous System Diseases')
D009422
The DescriptorName associated with DescriptorUI D007154 is 'Immune System Diseases'.
The DescriptorUI associated with DescriptorName “Nervous System Diseases” is 'D009422'.
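Each lookup above scans every record in the file. When many lookups are needed, one alternative is to build two dictionaries in a single pass. The sketch below does this on a hand-written miniature of the XML layout that contains only the two descriptors named above; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written miniature of the MeSH layout with the two descriptors named above.
SAMPLE = """
<DescriptorRecordSet>
  <DescriptorRecord>
    <DescriptorUI>D007154</DescriptorUI>
    <DescriptorName><String>Immune System Diseases</String></DescriptorName>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorUI>D009422</DescriptorUI>
    <DescriptorName><String>Nervous System Diseases</String></DescriptorName>
  </DescriptorRecord>
</DescriptorRecordSet>
"""

root = ET.fromstring(SAMPLE)

# One pass over the records fills both lookup directions.
ui_to_name = {}
name_to_ui = {}
for record in root.iter("DescriptorRecord"):
    ui = record.findtext("DescriptorUI")
    name = record.findtext("DescriptorName/String")
    ui_to_name[ui] = name
    name_to_ui[name] = ui

print(ui_to_name["D007154"])
print(name_to_ui["Nervous System Diseases"])
```

After the index is built, each lookup is a dictionary access rather than a scan, and `dict.get` can be used to return `None` for an unknown ID instead of raising an error.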
[ ]: def find_descriptor_name_given_UI_name(des_name, des_UI):
         # find the tree numbers of the two parent descriptors
         for record in root.iter('DescriptorRecord'):
             if record.find('DescriptorUI').text == des_UI:
                 num1 = record.find('TreeNumberList/TreeNumber').text
         for record in root.iter('DescriptorRecord'):
             if record.find('DescriptorName/String').text == des_name:
                 num2 = record.find('TreeNumberList/TreeNumber').text
         # then store the descriptor names found under each tree number
         names_under_ui = []
         for record in root.iter('DescriptorRecord'):
             for tree_num_list in record.iter('TreeNumberList'):
                 for tree_num in tree_num_list.iter('TreeNumber'):
                     if tree_num.text.startswith(num1):
                         names_under_ui.append(record.find('DescriptorName/String').text)
         names_under_name = []
         for record in root.iter('DescriptorRecord'):
             for tree_num_list in record.iter('TreeNumberList'):
                 for tree_num in tree_num_list.iter('TreeNumber'):
                     if tree_num.text.startswith(num2):
                         names_under_name.append(record.find('DescriptorName/String').text)
         # find overlapping names
         return set(names_under_ui) & set(names_under_name)
[ ]: result = find_descriptor_name_given_UI_name('Nervous System Diseases', 'D007154')
[ ]: result
[ ]: {'AIDS Arteritis, Central Nervous System',
'AIDS Dementia Complex',
'Anti-N-Methyl-D-Aspartate Receptor Encephalitis',
'Ataxia Telangiectasia',
'Autoimmune Diseases of the Nervous System',
'Autoimmune Hypophysitis',
'Demyelinating Autoimmune Diseases, CNS',
'Diffuse Cerebral Sclerosis of Schilder',
'Encephalomyelitis, Acute Disseminated',
'Encephalomyelitis, Autoimmune, Experimental',
'Giant Cell Arteritis',
'Guillain-Barre Syndrome',
'Kernicterus',
'Lambert-Eaton Myasthenic Syndrome',
'Leukoencephalitis, Acute Hemorrhagic',
'Lupus Vasculitis, Central Nervous System',
'Mevalonate Kinase Deficiency',
'Microscopic Polyangiitis',
'Miller Fisher Syndrome',
'Multiple Sclerosis',
'Multiple Sclerosis, Chronic Progressive',
'Multiple Sclerosis, Relapsing-Remitting',
'Myasthenia Gravis',
'Myasthenia Gravis, Autoimmune, Experimental',
'Myasthenia Gravis, Neonatal',
'Myelitis, Transverse',
'Nervous System Autoimmune Disease, Experimental',
'Neuritis, Autoimmune, Experimental',
'Neuromyelitis Optica',
'POEMS Syndrome',
'Polyradiculoneuropathy',
'Polyradiculoneuropathy, Chronic Inflammatory Demyelinating',
'Stiff-Person Syndrome',
'Uveomeningoencephalitic Syndrome',
'Vasculitis, Central Nervous System'}
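The search for shared descendants can be sketched on a miniature XML document. In the example below, the two parent records use the IDs reported above, while the three "Example" records, their UIs, and the tree numbers beneath C10 and C20 are invented for illustration. Unlike the function above, this sketch considers every TreeNumber of a parent and requires a dot after the parent prefix, so a parent is not counted as its own child.

```python
import xml.etree.ElementTree as ET

# Hand-written miniature; the "Example" records and their numbers are invented.
SAMPLE = """
<DescriptorRecordSet>
  <DescriptorRecord>
    <DescriptorUI>D009422</DescriptorUI>
    <DescriptorName><String>Nervous System Diseases</String></DescriptorName>
    <TreeNumberList><TreeNumber>C10</TreeNumber></TreeNumberList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorUI>D007154</DescriptorUI>
    <DescriptorName><String>Immune System Diseases</String></DescriptorName>
    <TreeNumberList><TreeNumber>C20</TreeNumber></TreeNumberList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorUI>DX00001</DescriptorUI>
    <DescriptorName><String>Example Neuroimmune Disorder</String></DescriptorName>
    <TreeNumberList>
      <TreeNumber>C10.114.375</TreeNumber>
      <TreeNumber>C20.111.258</TreeNumber>
    </TreeNumberList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorUI>DX00002</DescriptorUI>
    <DescriptorName><String>Example Neuropathy</String></DescriptorName>
    <TreeNumberList><TreeNumber>C10.668</TreeNumber></TreeNumberList>
  </DescriptorRecord>
  <DescriptorRecord>
    <DescriptorUI>DX00003</DescriptorUI>
    <DescriptorName><String>Example Immune Deficiency</String></DescriptorName>
    <TreeNumberList><TreeNumber>C20.673</TreeNumber></TreeNumberList>
  </DescriptorRecord>
</DescriptorRecordSet>
"""

root = ET.fromstring(SAMPLE)


def tree_numbers(record):
    """All TreeNumber strings attached to one DescriptorRecord."""
    return [node.text for node in record.iter("TreeNumber")]


def numbers_for(root, ui=None, name=None):
    """Tree numbers of the record matching a DescriptorUI or a DescriptorName."""
    for record in root.iter("DescriptorRecord"):
        if ui is not None and record.findtext("DescriptorUI") == ui:
            return tree_numbers(record)
        if name is not None and record.findtext("DescriptorName/String") == name:
            return tree_numbers(record)
    return []


def descendants(root, parent_numbers):
    """Names of records with a tree number strictly below any parent number."""
    names = set()
    for record in root.iter("DescriptorRecord"):
        if any(number.startswith(parent + ".")
               for number in tree_numbers(record)
               for parent in parent_numbers):
            names.add(record.findtext("DescriptorName/String"))
    return names


common = (descendants(root, numbers_for(root, name="Nervous System Diseases"))
          & descendants(root, numbers_for(root, ui="D007154")))
print(common)
```

Appending the dot to the parent prefix also guards against accidental matches between tree numbers that merely share leading characters.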
0.4.1 Appendix(Reference)
