Mapping ensembl_gene_id to hgnc_id via old bionty API
This mapping can be obtained from two sources: HGNC and Ensembl.
Always use the reference table from the HGNC website when converting hgnc_id to other ids, as it is carefully curated to ensure a unique mapping.
If you access the id mapping through Ensembl, in many cases multiple Ensembl ids map to the same hgnc_id.
However, the REST API provided by HGNC is not working, even though the database is up to date.
import bionty as bt

gn = bt.Gene(species="human")
# reference table from Ensembl
ens = gn.reference
# reference table from HGNC
hgnc = gn.hgnc()
# Ensembl gene ids present in the Ensembl table but absent from the HGNC table
diff_set = set(ens["ensembl_gene_id"]).difference(hgnc["ensembl_gene_id"])
# Ensembl rows for those ids
ref_diff = ens[ens["ensembl_gene_id"].isin(diff_set)]
Here you can already see that two Ensembl ids (index 40 and 42) map to HGNC:6338:
ref_diff.head(10)
index | ensembl_gene_id | entrezgene_id | hgnc_id | hgnc_symbol |
---|---|---|---|---|
37 | ENSG00000278704 | NaN | NaN | NaN |
38 | ENSG00000262826 | 65123 | HGNC:26153 | INTS3 |
39 | ENSG00000275151 | NaN | NaN | NaN |
40 | ENSG00000275717 | 3811 | HGNC:6338 | KIR3DL1 |
41 | ENSG00000274714 | 3809 | HGNC:6336 | KIR2DS4 |
42 | ENSG00000276379 | 3811 | HGNC:6338 | KIR3DL1 |
43 | ENSG00000280538 | NaN | NaN | NaN |
44 | ENSG00000274324 | 3809 | HGNC:6336 | KIR2DS4 |
45 | ENSG00000271254 | 102724250 | NaN | NaN |
46 | ENSG00000275047 | 3810 | HGNC:6337 | KIR2DS5 |
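The many-to-one pattern in the table above can also be checked programmatically. The following is a minimal sketch in plain Python that uses only the rows shown above, copied by hand, with `None` standing in for NaN. The helper name `hgnc_ids_with_multiple_ensembl_ids` is hypothetical and not part of bionty.

```python
from collections import defaultdict

# (ensembl_gene_id, hgnc_id) pairs copied from the rows shown above;
# None stands in for NaN.
ROWS = [
    ("ENSG00000278704", None),
    ("ENSG00000262826", "HGNC:26153"),
    ("ENSG00000275151", None),
    ("ENSG00000275717", "HGNC:6338"),
    ("ENSG00000274714", "HGNC:6336"),
    ("ENSG00000276379", "HGNC:6338"),
    ("ENSG00000280538", None),
    ("ENSG00000274324", "HGNC:6336"),
    ("ENSG00000271254", None),
    ("ENSG00000275047", "HGNC:6337"),
]


def hgnc_ids_with_multiple_ensembl_ids(rows):
    """Return {hgnc_id: sorted Ensembl ids} for every hgnc_id hit more than once."""
    grouped = defaultdict(set)
    for ensembl_id, hgnc_id in rows:
        if hgnc_id is not None:  # skip Ensembl ids without an HGNC annotation
            grouped[hgnc_id].add(ensembl_id)
    return {h: sorted(ids) for h, ids in grouped.items() if len(ids) > 1}


ambiguous = hgnc_ids_with_multiple_ensembl_ids(ROWS)
for hgnc_id, ensembl_ids in sorted(ambiguous.items()):
    print(hgnc_id, ensembl_ids)
```

On a pandas DataFrame such as `ref_diff`, an equivalent check would be `ref_diff.groupby("hgnc_id")["ensembl_gene_id"].nunique()`, keeping the entries greater than 1.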
If you search for the hgnc_id HGNC:6338 in the HGNC table, you get a different Ensembl id, ENSG00000167633:
hgnc[hgnc.hgnc_id == "HGNC:6338"]["ensembl_gene_id"]
13835 ENSG00000167633
Name: ensembl_gene_id, dtype: object
Check whether ENSG00000167633 also maps to HGNC:6338 in the Ensembl table. It does:
ens[ens.ensembl_gene_id == "ENSG00000167633"]
index | ensembl_gene_id | entrezgene_id | hgnc_id | hgnc_symbol |
---|---|---|---|---|
62030 | ENSG00000167633 | 3811 | HGNC:6338 | KIR3DL1 |
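Taken together, the lookups above suggest treating the HGNC table as the authority and the Ensembl table as a cross-check. The sketch below illustrates that idea in plain Python with only the ids that appear in this section. The dictionaries and the `resolve` helper are hypothetical illustrations, not bionty functionality.

```python
# HGNC side: one curated Ensembl id per hgnc_id (value taken from the lookup above).
HGNC_TO_ENSEMBL = {"HGNC:6338": "ENSG00000167633"}

# Ensembl side: several Ensembl ids may point at the same hgnc_id.
ENSEMBL_TO_HGNC = {
    "ENSG00000167633": "HGNC:6338",
    "ENSG00000275717": "HGNC:6338",
    "ENSG00000276379": "HGNC:6338",
}


def resolve(hgnc_id):
    """Return (canonical id from HGNC, other Ensembl ids, whether Ensembl agrees)."""
    canonical = HGNC_TO_ENSEMBL.get(hgnc_id)
    # Every Ensembl id that the Ensembl table maps to this hgnc_id.
    candidates = sorted(e for e, h in ENSEMBL_TO_HGNC.items() if h == hgnc_id)
    alternates = [e for e in candidates if e != canonical]
    return canonical, alternates, canonical in candidates


canonical, alternates, consistent = resolve("HGNC:6338")
print(canonical, alternates, consistent)
```

The third return value flags cases where the id curated by HGNC is missing from the Ensembl table, which would need manual review.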
HGNC REST server is not accessible
from bionty._rest import fetch_endpoint
fetch_endpoint(
"http://rest.genenames.org/", "search/ensembl_gene_id/ENSG00000157764", "text/xml"
)
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
/Users/sunnysun/Documents/repos/bionty/docs/tasks/2022-05-29-ensembl-gene-ids.ipynb Cell 14' in <cell line: 1>()
----> 1 fetch_endpoint("http://rest.genenames.org/", "search/ensembl_gene_id/ENSG00000157764", "text/xml")
File /opt/miniconda3/envs/py39/lib/python3.9/site-packages/bionty/_rest.py:15, in fetch_endpoint(server, request, content_type)
     12 r = requests.get(server + request, headers={"Accept": content_type})
     14 if not r.ok:
---> 15 r.raise_for_status()
     16 sys.exit()
     18 if content_type == "application/json":
File /opt/miniconda3/envs/py39/lib/python3.9/site-packages/requests/models.py:960, in Response.raise_for_status(self)
    957 http_error_msg = u'%s Server Error: %s for url: %s' % (self.status_code, reason, self.url)
    959 if http_error_msg:
--> 960 raise HTTPError(http_error_msg, response=self)
HTTPError: 500 Server Error: Internal Server Error for url: http://rest.genenames.org/search/ensembl_gene_id/ENSG00000157764
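Because the request above raises instead of returning data, calling code may want to catch the failure and fall back to the local HGNC table. The sketch below is one possible way to do that. It uses the standard library `urllib` rather than `bionty._rest.fetch_endpoint`, and the function name `fetch_text` and its `opener` parameter are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def fetch_text(url, opener=urllib.request.urlopen, timeout=10):
    """Return the response body as text, or None if the request fails.

    `opener` is injectable so the function can be exercised without network access.
    """
    request = urllib.request.Request(url, headers={"Accept": "text/xml"})
    try:
        with opener(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        print(f"HGNC REST request failed: {err.code} {err.reason}")
    except urllib.error.URLError as err:
        # The server could not be reached at all.
        print(f"HGNC REST server unreachable: {err.reason}")
    return None
```

If `fetch_text("http://rest.genenames.org/search/ensembl_gene_id/ENSG00000157764")` returns `None`, the table from `gn.hgnc()` used earlier remains available as a local source.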