Summarise hrp2 and hrp3 Deletions
Contents
Summarise hrp2 and hrp3 Deletions¶
Introduction¶
This notebook will recreate two supplementary display items from the Pf7 paper relating to summarising deletions in hrp2 and hrp3:
Supplementary Table 6: frequency of hrp2 and hrp3 deletions by country
Supplementary Table 7: summary of hrp2 and hrp3 deletion breakpoints
hrp2 and hrp3 are genes located in subtelomeric regions of the genome with very high levels of natural variation. Deletion in those genes can cause failure of rapid diagnostic tests and is therefore important to monitor.
Deletion is a genetic event in which a segment of DNA is entirely removed or missing. In this context, ‘breakpoints’ denote specific locations on the chromosome where such deletions take place.
This notebook should take approximately 1 minute to run.
Setup¶
Install and import the malariagen Python package:
!pip install -q --no-warn-conflicts malariagen_data
import malariagen_data
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 148.5/148.5 kB 3.1 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 13.1 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 52.8/52.8 MB 7.0 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 43.8 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.0/4.0 MB 28.2 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 302.5/302.5 kB 14.3 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 138.7/138.7 kB 2.6 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.9/20.9 MB 8.6 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.1/8.1 MB 65.3 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 206.9/206.9 kB 14.9 MB/s eta 0:00:00
?25h Preparing metadata (setup.py) ... ?25l?25hdone
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 22.8 MB/s eta 0:00:00
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 67.2 MB/s eta 0:00:00
?25h Building wheel for asciitree (setup.py) ... ?25l?25hdone
Import required python libraries that are installed at colab by default.
import numpy as np
import pandas as pd
import collections
from google.colab import drive
Access Pf7 Data¶
We use the malariagen data package to load the release data.
release_data = malariagen_data.Pf7()
df_samples = release_data.sample_metadata()
hrp2 and hrp3 Deletions¶
We also need to access Pf7 samples hrp2 and hrp3 deletion status (including breakpoint and deletion type).
# Fetch the details of hrp calls from this MalariaGEN.net file
hrp_calls_fn = pd.read_csv('https://www.malariagen.net/wp-content/uploads/2024/01/hrp_calls_pf7.tsv', sep='\t')
Let’s get a quick overview of this dataset.
# print the shape
print(hrp_calls_fn.shape)
# display first 5 rows of the dataset
hrp_calls_fn.head(5)
(16203, 7)
Sample | HRP2 | HRP3 | HRP2_breakpoint | HRP3_breakpoint | HRP2_deletion_type | HRP3_deletion_type | |
---|---|---|---|---|---|---|---|
0 | FP0008-C | nodel | nodel | - | - | NaN | NaN |
1 | FP0009-C | nodel | nodel | - | - | NaN | NaN |
2 | FP0010-CW | uncallable | uncallable | - | - | NaN | NaN |
3 | FP0011-CW | uncallable | uncallable | - | - | NaN | NaN |
4 | FP0012-CW | nodel | nodel | - | - | NaN | NaN |
We will continue to explore the dataset in the next section.
Dataset Exploration¶
We begin by merging the deletion status information with sample metadata which will allow us to perform geospatial analyses, such as identifying countries or specific regions that exhibit a higher susceptibility to hrp2 and hrp3 deletions.
One thing to note is that deletion data is exclusively curated for samples that have passed the quality control (QC) criteria. Consequently, the resulting merged dataset will exclusively contain these high-quality samples with deletion information.
# Merge df_samples with hrp_calls_fn
df_samples_hrp = df_samples.merge(hrp_calls_fn, on ='Sample')
print(df_samples_hrp.shape)
df_samples_hrp.head(3)
(16203, 23)
Sample | Study | Country | Admin level 1 | Country latitude | Country longitude | Admin level 1 latitude | Admin level 1 longitude | Year | ENA | ... | QC pass | Exclusion reason | Sample type | Sample was in Pf6 | HRP2 | HRP3 | HRP2_breakpoint | HRP3_breakpoint | HRP2_deletion_type | HRP3_deletion_type | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | FP0008-C | 1147-PF-MR-CONWAY | Mauritania | Hodh el Gharbi | 20.265149 | -10.337093 | 16.565426 | -9.832345 | 2014.0 | ERR1081237 | ... | True | Analysis_set | gDNA | True | nodel | nodel | - | - | NaN | NaN |
1 | FP0009-C | 1147-PF-MR-CONWAY | Mauritania | Hodh el Gharbi | 20.265149 | -10.337093 | 16.565426 | -9.832345 | 2014.0 | ERR1081238 | ... | True | Analysis_set | gDNA | True | nodel | nodel | - | - | NaN | NaN |
2 | FP0010-CW | 1147-PF-MR-CONWAY | Mauritania | Hodh el Gharbi | 20.265149 | -10.337093 | 16.565426 | -9.832345 | 2014.0 | ERR2889621 | ... | True | Analysis_set | sWGA | False | uncallable | uncallable | - | - | NaN | NaN |
3 rows × 23 columns
What are the deletion types and breakpoints in the hrp2 gene, and how many samples exhibit these deletions?
# Count and sort deletion types and breakpoints in highest to lowest occurrence
df_samples_hrp[
df_samples_hrp['HRP2'] == 'del'
].groupby(['HRP2_deletion_type', 'HRP2_breakpoint']).size().sort_values(ascending=False)
HRP2_deletion_type HRP2_breakpoint
Telomere healing Pf3D7_08_v3:1374986 6
Pf3D7_08_v3:1374462 5
Pf3D7_08_v3:1374932 2
Pf3D7_08_v3:1373732 1
Pf3D7_08_v3:1374280 1
dtype: int64
What are the deletion types and breakpoints in the hrp3 gene, and how many samples exhibit these deletions?
# Count and sort deletion types and breakpoints in highest to lowest occurrence
pd.DataFrame(
df_samples_hrp[
df_samples_hrp['HRP3'] == 'del'
].groupby(['HRP3_deletion_type', 'HRP3_breakpoint']).size().sort_values(ascending=False).rename("Samples_with_deletion")
)
Samples_with_deletion | ||
---|---|---|
HRP3_deletion_type | HRP3_breakpoint | |
Chromosome 11 recombination | Pf3D7_13_v3:2800004-2807159 | 151 |
Chromosome 5 recombination | Pf3D7_13_v3:2835587-2835612 | 21 |
Telomere healing | Pf3D7_13_v3:2830952 | 7 |
Pf3D7_13_v3:2837145 | 7 | |
Pf3D7_13_v3:2837392 | 3 | |
Pf3D7_13_v3:2838654 | 2 | |
Pf3D7_13_v3:2811525 | 1 | |
Pf3D7_13_v3:2812344 | 1 | |
Pf3D7_13_v3:2815249 | 1 | |
Pf3D7_13_v3:2822480 | 1 | |
Pf3D7_13_v3:2823645 | 1 | |
Pf3D7_13_v3:2832080 | 1 | |
Pf3D7_13_v3:2834604 | 1 | |
Pf3D7_13_v3:2835532 | 1 | |
Pf3D7_13_v3:2841024 | 1 | |
Pf3D7_13_v3:2841120 | 1 |
Note that we can expand the previous table with more categories (population, country and admin-level 1).
# Group the HRP3 deletions by breakpoint, population, country, and admin level 1.
pd.DataFrame(
df_samples_hrp[
df_samples_hrp['HRP3'] == 'del'
].fillna('').groupby(['HRP3_deletion_type', 'HRP3_breakpoint', 'Population', 'Country', 'Admin level 1']).size().rename("Samples_with_deletion")
)
Samples_with_deletion | |||||
---|---|---|---|---|---|
HRP3_deletion_type | HRP3_breakpoint | Population | Country | Admin level 1 | |
Chromosome 11 recombination | Pf3D7_13_v3:2800004-2807159 | AF-NE | Ethiopia | Amhara | 6 |
Oromia | 3 | ||||
Sudan | Kassala | 5 | |||
AF-W | Gambia | Upper River | 1 | ||
Western | 2 | ||||
Ghana | Upper East | 1 | |||
Mali | Kayes | 1 | |||
Senegal | Dakar | 6 | |||
AS-S-FE | Bangladesh | Chittagong | 1 | ||
AS-SE-E | Cambodia | Ratanakiri | 3 | ||
Laos | Champasak | 1 | |||
Salavan | 13 | ||||
Vietnam | Binh Phuoc | 2 | |||
AS-SE-W | Thailand | Tak | 1 | ||
OC-NG | Indonesia | Papua | 40 | ||
SA | Colombia | Cauca | 50 | ||
Peru | Loreto | 15 | |||
Chromosome 5 recombination | Pf3D7_13_v3:2835587-2835612 | AS-SE-E | Cambodia | Battambang | 4 |
Pailin | 3 | ||||
Pursat | 13 | ||||
Vietnam | Binh Phuoc | 1 | |||
Telomere healing | Pf3D7_13_v3:2811525 | AS-S-E | India | West Bengal | 1 |
Pf3D7_13_v3:2812344 | AF-NE | Sudan | Kassala | 1 | |
Pf3D7_13_v3:2815249 | AF-E | Tanzania | Tanga | 1 | |
Pf3D7_13_v3:2822480 | AF-W | Ghana | Brong Ahafo | 1 | |
Pf3D7_13_v3:2823645 | AF-E | Kenya | Kilifi | 1 | |
Pf3D7_13_v3:2830952 | AS-SE-E | Cambodia | Battambang | 7 | |
Pf3D7_13_v3:2832080 | AF-C | Democratic Republic of the Congo | Kinshasa | 1 | |
Pf3D7_13_v3:2834604 | AS-SE-E | Vietnam | Binh Phuoc | 1 | |
Pf3D7_13_v3:2835532 | AS-SE-W | Thailand | Tak | 1 | |
Pf3D7_13_v3:2837145 | AS-SE-E | Vietnam | Binh Phuoc | 7 | |
Pf3D7_13_v3:2837392 | AS-SE-E | Cambodia | Ratanakiri | 2 | |
Laos | Attapeu | 1 | |||
Pf3D7_13_v3:2838654 | OC-NG | Indonesia | Papua | 2 | |
Pf3D7_13_v3:2841024 | AS-SE-W | Thailand | Tak | 1 | |
Pf3D7_13_v3:2841120 | OC-NG | Indonesia | Papua | 1 |
We can note that breakpoints in Chromosome 11 and 5 recombination have been observed in multiple in-country sampling locations.
In the following sections, we will create various summary tables.
Summary by deletion types table¶
The initial summary table will show the number of distinct breakpoints, countries and samples for distinct deletion types within the hrp2 and hrp3 genes.
To achieve this, we need a function to count number of distinct unique breakpoints, countries and samples for each deletion type.
def hrp2_agg(x):
"""
This function counts number of distinct unique breakpoints, countries and samples for each deletion type in hrp2.
"""
# Create an ordered dictionary to store the information.
summary = collections.OrderedDict()
# Calculate the number of distinct breakpoints.
summary['Distinct Breakpoints'] = len(x['HRP2_breakpoint'].unique())
# Calculate the number of unique countries.
summary['Countries'] = len(x['Country'].unique())
# Calculate the total number of samples with deletions.
summary['Samples with Deletion'] = len(x)
# Return the summary information as a Pandas Series.
return pd.Series(summary)
The hrp2_agg
function can be adapted for hrp3 gene deletions as well.
def hrp3_agg(x):
"""
This function counts number of distinct unique breakpoints, countries and samples for each deletion type in hrp3.
"""
# Create an ordered dictionary to store the information.
summary = collections.OrderedDict()
# Calculate the number of distinct breakpoints.
summary['Distinct Breakpoints'] = len(x['HRP3_breakpoint'].unique())
# Calculate the number of unique countries.
summary['Countries'] = len(x['Country'].unique())
# Calculate the total number of samples with deletions.
summary['Samples with Deletion'] = len(x)
# Return the summary information as a Pandas Series.
return pd.Series(summary)
# Create a summary of deletion types for hrp2 and hrp3 genes
# Filter samples with hrp2 deletions
# Group by deletion type, also include gene name
# Apply hrp2_agg function to find numbers
hrp2_agg_df = (
df_samples_hrp[df_samples_hrp['HRP2'] == 'del']
.assign(Gene='hrp2')
.rename(columns={'HRP2_deletion_type': 'Deletion type'})
.groupby(['Gene', 'Deletion type'])
.apply(hrp2_agg)
)
# Filter samples with hrp3 deletions
# Group by deletion type, also include gene name
# Apply hrp2_agg function to find numbers
hrp3_agg_df = (
df_samples_hrp[
df_samples_hrp['QC pass'] & (df_samples_hrp['HRP3'] == 'del')
]
.assign(Gene='hrp3')
.rename(columns={'HRP3_deletion_type': 'Deletion type'})
.groupby(['Gene', 'Deletion type'])
.apply(hrp3_agg)
)
# Concatenate the results to create a summary of deletion types
df_deletion_types_summary = pd.concat([hrp2_agg_df, hrp3_agg_df])
# Display the summary
df_deletion_types_summary
Distinct Breakpoints | Countries | Samples with Deletion | ||
---|---|---|---|---|
Gene | Deletion type | |||
hrp2 | Telomere healing | 5 | 4 | 15 |
hrp3 | Chromosome 11 recombination | 1 | 14 | 151 |
Chromosome 5 recombination | 1 | 2 | 21 | |
Telomere healing | 14 | 11 | 29 |
Save the table¶
You will need to authorise Google Colab access to Google Drive.
drive.mount('/content/drive')
Mounted at /content/drive
# This will send the file to your Google Drive, where you can download it from if needed
# Change the file path if you wish to send the file to a specific location
# Change the file name if you wish to call it something else
df_deletion_types_summary.to_excel('/content/drive/My Drive/HRP_deletion_types_summary_unformatted.xlsx')
Summary by breakpoints¶
The second summary table will feature breakpoints, associated countries, and sample counts for distinct deletion types within the hrp2 and hrp3 genes.
This requires a function to aggregate countries and sample counts for each unique deletion type at specific breakpoints.
def breakpoint_agg(x):
"""
This function aggregates countries and sample counts for each unique deletion type at specific breakpoints.
"""
names = collections.OrderedDict()
names['Countries'] = ', '.join((x['Country'].unique()))
names['Samples with deletion'] = len(x)
return pd.Series(names)
# Calculate summary statistics for hrp2
# Group samples with hrp3 deletions by gene name, deletion type, and breakpoint
# Apply breakpoint_agg function
hrp2_breakpoints_summary = (
df_samples_hrp[df_samples_hrp['HRP2'] == 'del']
.assign(Gene='hrp2')
.rename(columns={'HRP2_deletion_type': 'Deletion type'})
.groupby(['Gene', 'Deletion type', 'HRP2_breakpoint'])
.apply(breakpoint_agg)
)
# Calculate summary statistics for hrp3
# Group samples with hrp3 deletions by gene name, deletion type, and breakpoint
# Apply breakpoint_agg function
hrp3_breakpoints_summary = (
df_samples_hrp[df_samples_hrp['HRP3'] == 'del']
.assign(Gene='hrp3')
.rename(columns={'HRP3_deletion_type': 'Deletion type'})
.groupby(['Gene', 'Deletion type', 'HRP3_breakpoint'])
.apply(breakpoint_agg)
)
# Combine the summary statistics for HRP2 and HRP3 genes
df_breakpoints_summary = pd.concat([hrp2_breakpoints_summary, hrp3_breakpoints_summary])
# Display the combined summary
df_breakpoints_summary
Countries | Samples with deletion | |||
---|---|---|---|---|
Gene | Deletion type | HRP2_breakpoint | ||
hrp2 | Telomere healing | Pf3D7_08_v3:1373732 | Cambodia | 1 |
Pf3D7_08_v3:1374280 | Sudan | 1 | ||
Pf3D7_08_v3:1374462 | Indonesia | 5 | ||
Pf3D7_08_v3:1374932 | Peru | 2 | ||
Pf3D7_08_v3:1374986 | Peru | 6 | ||
hrp3 | Chromosome 11 recombination | Pf3D7_13_v3:2800004-2807159 | Thailand, Ghana, Indonesia, Peru, Bangladesh, ... | 151 |
Chromosome 5 recombination | Pf3D7_13_v3:2835587-2835612 | Cambodia, Vietnam | 21 | |
Telomere healing | Pf3D7_13_v3:2811525 | India | 1 | |
Pf3D7_13_v3:2812344 | Sudan | 1 | ||
Pf3D7_13_v3:2815249 | Tanzania | 1 | ||
Pf3D7_13_v3:2822480 | Ghana | 1 | ||
Pf3D7_13_v3:2823645 | Kenya | 1 | ||
Pf3D7_13_v3:2830952 | Cambodia | 7 | ||
Pf3D7_13_v3:2832080 | Democratic Republic of the Congo | 1 | ||
Pf3D7_13_v3:2834604 | Vietnam | 1 | ||
Pf3D7_13_v3:2835532 | Thailand | 1 | ||
Pf3D7_13_v3:2837145 | Vietnam | 7 | ||
Pf3D7_13_v3:2837392 | Cambodia, Laos | 3 | ||
Pf3D7_13_v3:2838654 | Indonesia | 2 | ||
Pf3D7_13_v3:2841024 | Thailand | 1 | ||
Pf3D7_13_v3:2841120 | Indonesia | 1 |
Table Legend. Summary of hrp2 and hrp3 deletion breakpoints. Telomere healing refers to the process whereby the end of a chromosome is deleted and a telomere repeat sequence attached to the breakpoint. Chromosome 11 recombination refers to a new hybrid chromosome being created by a recombination between chromosome 13 and 11 at a cluster of rRNA genes that appear to have orthologous copies on both chromosomes. Chromosome 5 recombination refers to a recombination between chromosome 13 and an inverted section of the middle of chromosome 5 containing the gene mdr1. For telomere healing an exact breakpoint position is given but for recombination events it is only possible to give a region in which the recombination has occurred.
Save the table¶
# This will send the file to your Google Drive, where you can download it from if needed
# Change the file path if you wish to send the file to a specific location
# Change the file name if you wish to call it something else
df_breakpoints_summary.to_excel('/content/drive/My Drive/HRP_breakpoints_summary_unformatted.xlsx')
Summary by country¶
The final summary table will display the exact number and incidence of deletions within hrp2 and hrp3 in each country.
# Define an aggregation function to calculate proportions and counts for HRP2 and HRP3 genes.
def proportion_agg(x):
"""
This function counts the number of deletions, the incidence of deletions, and more for both genes.
"""
names = collections.OrderedDict()
# Calculate the number of HRP2 deletions
names['hrp2 calls'] = np.count_nonzero(x['HRP2'] != 'uncallable')
# Calculate the incidence of HRP2 deletions
names['% hrp2 deletions'] = round((np.count_nonzero(x['HRP2'] == 'del') / (np.count_nonzero(x['HRP2'] != 'uncallable'))*100))
# Calculate the number of HRP3 deletions
names['hrp3 calls'] = np.count_nonzero(x['HRP3'] != 'uncallable')
# Calculate the percentage of HRP3 deletions
names['% hrp3 deletions'] = round(np.count_nonzero(x['HRP3'] == 'del') / (np.count_nonzero(x['HRP3'] != 'uncallable'))*100)
# Calculate the number of deletions for both HRP2 and HRP3
names['hrp2 and hrp3 calls'] = np.count_nonzero((x['HRP2'] != 'uncallable') & (x['HRP3'] != 'uncallable'))
# Calculate the incidence of deletions for both HRP2 and HRP3
names['% hrp2 and hrp3 deletions'] = round(np.count_nonzero(
(x['HRP2'] == 'del') & (x['HRP3'] == 'del')
) / (np.count_nonzero(
(x['HRP2'] != 'uncallable') & (x['HRP3'] != 'uncallable')
))*100)
# Convert the values to integers
names = {key: int(value) for key, value in names.items()}
return pd.Series(names)
# Create a table to summarize HRP2 and HRP3 gene information by country
# Calculate proportions and counts for HRP2 and HRP3 genes by country
df_hrp_by_country_table = (
df_samples_hrp.groupby('Country') # Group the data by country
.apply(proportion_agg) # Apply the proportion aggregation function to calculate statistics
.transpose() # Transpose the table for better readability
)
# Calculate sample counts for each country
hrp_by_country_n = (
df_samples_hrp.groupby('Country') # Group the data by country
.size() # Calculate the number of samples in each country
)
# Rename columns to include sample counts next to country names
for country in df_samples_hrp.loc[df_samples_hrp['QC pass'], 'Country'].unique():
new_column_name = f'{country} (n={hrp_by_country_n[country]:,})'
df_hrp_by_country_table.rename(columns={country: new_column_name}, inplace=True)
# Transpose the table to have countries as rows and statistics as columns
df_hrp_by_country_table = df_hrp_by_country_table.T
# Display the table with summarized information
df_hrp_by_country_table
hrp2 calls | % hrp2 deletions | hrp3 calls | % hrp3 deletions | hrp2 and hrp3 calls | % hrp2 and hrp3 deletions | |
---|---|---|---|---|---|---|
Country | ||||||
Bangladesh (n=1,310) | 939 | 0 | 850 | 0 | 819 | 0 |
Benin (n=150) | 110 | 0 | 104 | 0 | 100 | 0 |
Burkina Faso (n=57) | 43 | 0 | 32 | 0 | 32 | 0 |
Cambodia (n=1,267) | 1109 | 0 | 1091 | 3 | 1064 | 0 |
Cameroon (n=264) | 244 | 0 | 240 | 0 | 235 | 0 |
Colombia (n=135) | 124 | 0 | 123 | 41 | 118 | 0 |
Côte d'Ivoire (n=71) | 70 | 0 | 71 | 0 | 70 | 0 |
Democratic Republic of the Congo (n=520) | 413 | 0 | 392 | 0 | 385 | 0 |
Ethiopia (n=21) | 20 | 0 | 20 | 45 | 20 | 0 |
Gabon (n=55) | 34 | 0 | 38 | 0 | 32 | 0 |
Gambia (n=863) | 517 | 0 | 467 | 1 | 460 | 0 |
Ghana (n=3,131) | 1529 | 0 | 1448 | 0 | 1343 | 0 |
Guinea (n=151) | 121 | 0 | 119 | 0 | 119 | 0 |
India (n=300) | 75 | 0 | 70 | 1 | 68 | 0 |
Indonesia (n=121) | 117 | 4 | 117 | 37 | 115 | 2 |
Kenya (n=690) | 660 | 0 | 647 | 0 | 645 | 0 |
Laos (n=991) | 773 | 0 | 717 | 2 | 669 | 0 |
Madagascar (n=24) | 24 | 0 | 22 | 0 | 22 | 0 |
Malawi (n=265) | 265 | 0 | 264 | 0 | 264 | 0 |
Mali (n=1,167) | 709 | 0 | 691 | 0 | 669 | 0 |
Mauritania (n=92) | 79 | 0 | 81 | 0 | 79 | 0 |
Mozambique (n=34) | 10 | 0 | 10 | 0 | 6 | 0 |
Myanmar (n=985) | 645 | 0 | 606 | 0 | 585 | 0 |
Nigeria (n=110) | 34 | 0 | 30 | 0 | 30 | 0 |
Papua New Guinea (n=221) | 118 | 0 | 106 | 0 | 106 | 0 |
Peru (n=21) | 21 | 38 | 20 | 75 | 20 | 30 |
Senegal (n=150) | 142 | 0 | 141 | 4 | 138 | 0 |
Sudan (n=76) | 7 | 14 | 7 | 86 | 7 | 14 |
Tanzania (n=589) | 452 | 0 | 470 | 0 | 440 | 0 |
Thailand (n=954) | 855 | 0 | 846 | 0 | 823 | 0 |
Uganda (n=12) | 12 | 0 | 12 | 0 | 12 | 0 |
Venezuela (n=2) | 2 | 0 | 2 | 0 | 2 | 0 |
Vietnam (n=1,404) | 762 | 0 | 740 | 1 | 670 | 0 |
Table Legend. Frequency of HRP2 and HRP3 deletions by country. n=number of QC pass samples. Calls columns show number of samples for which an unambiguous deletion genotype (deleted or non-deleted) could be assigned.
Save the table¶
# This will send the file to your Google Drive, where you can download it from if needed
# Change the file path if you wish to send the file to a specific location
# Change the file name if you wish to call it something else
df_hrp_by_country_table.to_excel('/content/drive/My Drive/HRP_country_summary_unformatted.xlsx')