This is the talk page for discussing improvements to the Lists of human protein-coding genes.
This page serves to centralize discussions about the human protein-coding genes list articles. Please add new posts or edit requests for those pages below. Seppi333 ( Insert 2¢) 02:58, 24 November 2019 (UTC)
Create redirects from the redlinked gene symbols and parenthetically disambiguated variants to the corresponding proteins in the Wikipedia:WikiProject Molecular Biology/Genetics/Gene Wiki list: [1]
There are ~11,500 blue links in the tables and ~12,400 articles with {{Infobox gene}}. I can find the protein pages that aren't linked in these tables by converting the pages listed in the templatetransclusioncheck tool results for infobox gene into a Python list, then generating another Python list of the current link targets in the tables the next time I run User:Seppi333/GeneListNLP, and finally returning a list of pages that are in {{Infobox gene}} but not in the table list. I can then identify the corresponding gene symbol for a given protein using pyUniProt, pyHGNC, and pyGtoP: pyUniProt to obtain the corresponding HGNC ID for a matched protein name/alias and pyHGNC to find the corresponding HGNC-approved gene symbol with the HGNC ID; pyGtoP to obtain the HGNC-approved gene symbol for a matched receptor, transporter, or enzyme name/alias. Would have to manually identify the corresponding HGNC-approved gene symbol for any unmatched links that remain.
Would then need to get a bot approved to add the missing redirects from any redlinked gene symbol and all redlinked parenthetically disambiguated gene symbols to the corresponding protein articles. Seppi333 ( Insert 2¢) 05:26, 28 November 2019 (UTC)
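The comparison described above amounts to a set difference between two lists of page titles. A minimal sketch, assuming both lists have already been collected as plain strings; the function name, the title normalization, and the sample data are hypothetical and are not taken from User:Seppi333/GeneListNLP:

```python
def pages_missing_from_tables(infobox_pages, table_link_targets):
    """Return titles that use Infobox gene but are not link targets in the tables."""

    def normalize(title):
        # MediaWiki treats underscores as spaces and ignores the case of the first letter.
        title = title.replace("_", " ").strip()
        return title[:1].upper() + title[1:]

    linked = {normalize(t) for t in table_link_targets}
    candidates = {normalize(p) for p in infobox_pages}
    # Set difference: pages with the infobox that no table row links to.
    return sorted(candidates - linked)


# Hypothetical sample data standing in for the two real lists.
infobox_pages = ["Asporin", "Vitronectin", "CTSL1", "Fructose-bisphosphatase 2"]
table_link_targets = ["asporin", "CTSL1", "LTB_(gene)"]

missing = pages_missing_from_tables(infobox_pages, table_link_targets)
print(missing)
```

The titles that come back would still need to be mapped to HGNC-approved gene symbols before any redirects could be created; that lookup step is not shown here.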
Most of the tables are simply unnecessary or don't belong on Wikipedia. Per WP:ELLIST, we aren't here to be a directory of links to other websites. Then there are these incredibly redundant values added to the table as well. What's the point in saying that each individual entry is "approved", when they all are? This is hardly a list that needs to be spread across three pages and more than 1 million bytes. Onetwothreeip ( talk) 23:46, 30 November 2019 (UTC)
(Collapsed example: an embedded list with externally linked gene entries)
List of human protein-coding genes 3 currently has 443,692 bytes of markup, with pages 1 & 2 not far behind. They are far too big. What's the best way to divide them up? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:44, 8 December 2019 (UTC)
Posting here in light of the large flashing sign at the top of the article - PIN4 is a DAB page; the link should be piped to Peptidyl-prolyl cis-trans isomerase NIMA-interacting 4. Narky Blert ( talk) 09:02, 4 November 2020 (UTC)
LTB is now a dab page - the link on this page should be changed to LTB (gene). Tevildo ( talk) 10:51, 31 May 2021 (UTC)
@ Narky Blert, Lennart97, and Tevildo: Done - I updated the algorithm per your requests and ran the newer version on PAWS. Let me know if you see any other dablinks or issues. Also, thanks for notifying me about these. I unfortunately don't have any spare time to check for DABlinks in these lists myself - or even edit Wikipedia for that matter - anymore because I'm perpetually slammed with work.
These are dablinks that weren't reported here, but were added to the lists at some point in the interim and reported in the dablinks tool following my bot's first set of page revisions (i.e., the initial run after I updated the algorithm per above):
Seppi333 ( Insert 2¢) 07:44, 25 November 2021 (UTC)
Similar to the previous section: on line 1155, ASPN is a disambiguation page; the link should be changed to Asporin (or, in full, [[Asporin|ASPN]]). -- R'n'B ( call me Russ) 01:30, 14 December 2021 (UTC)
Seppi333, CTSL is now a disambiguation page, your bot should probably pipe it to CTSL1. Also please pipe ASPN to ASPN (gene) as R'n'B requested above. Streded ( talk) 01:10, 1 February 2022 (UTC)
I have turned BCAM into a disambiguation page, ideally the bot could pipe it to basal cell adhesion molecule. Thanks! CapitalSasha ~ talk 10:38, 30 June 2022 (UTC)
Since I last ran the bot script ~3 months ago, 2 new dablinks were fixed in the bot's source code:
GH2 wasn't reported and COQ5 was an existing dablink, so please be sure to report new dablinks; otherwise, I usually have to go back and rerun the bot code again after fixing the dablinks it adds back to these pages. Seppi333 ( Insert 2¢) 19:13, 12 December 2022 (UTC)
Updated today. Seppi333 ( Insert 2¢) 19:38, 9 January 2023 (UTC)
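The dablink fixes requested in this thread can be pictured as an override table consulted whenever a gene symbol is turned into a wikilink. A minimal sketch, assuming the bot builds each link from the gene symbol alone; `DAB_OVERRIDES` and `gene_wikilink` are hypothetical names, not the bot's actual source, and the targets are the ones requested on this page:

```python
# Symbols whose plain title is a disambiguation page, mapped to the
# article that the link should be piped to (targets as requested above).
DAB_OVERRIDES = {
    "PIN4": "Peptidyl-prolyl cis-trans isomerase NIMA-interacting 4",
    "LTB": "LTB (gene)",
    "ASPN": "Asporin",
    "CTSL": "CTSL1",
    "BCAM": "Basal cell adhesion molecule",
}


def gene_wikilink(symbol, overrides=DAB_OVERRIDES):
    """Return wikitext for a gene symbol, piping around known dab pages."""
    target = overrides.get(symbol)
    if target is None or target == symbol:
        # No override: link straight to the symbol's own title.
        return f"[[{symbol}]]"
    # Piped link: the reader sees the symbol, the link goes to the article.
    return f"[[{target}|{symbol}]]"


print(gene_wikilink("ASPN"))
print(gene_wikilink("TP53"))
```

With this kind of table, a newly reported dablink becomes a one-line addition, which is why unreported dablinks reappear the next time the lists are regenerated.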
Please Wikilink FBP2 to Fructose-bisphosphatase 2. Note: FBP2 acronym appears to be used for "far upstream element binding protein 2" as well.
Also, should the HGNC ID and UniProt ID be added to Fructose-bisphosphatase 2 WikiData? I am unsure how to do that.
Should EPHX3 be linked to Epoxide hydrolase 3? Thank you! Adakiko ( talk) 22:04, 5 March 2023 (UTC)
Is the plan to add new columns with the actual proteins that each piece of genetic code corresponds to, as well as where to find them in the body? Did anything come from the discussion on the list of human proteins ( Draft:List of proteins in the human body), or was it just deleted and forgotten? Is there anywhere one can see how the deletion discussion on the article ended? And if none of the ideas will be implemented here, why not? Claes Lindhardt ( talk) 17:49, 14 August 2023 (UTC)
Column name | Explanation |
---|---|
HGNCsymbol | approved HUGO gene symbol |
proteinLabel | recommended UniProt name |
wd_gene_item_article_link | Wikipedia article name if linked to gene |
wd_protein_item_article_link | Wikipedia article name if linked to protein |
Are we sure that there are no proteins used in the body that we cannot make ourselves? Or any proteins that can be very effective in the body (maybe in a pathogenic way) that we cannot make ourselves? These would also be relevant to the proteins of the human body but would slip past this list because they do not have a gene.
It seems drugs like insulin, sargramostim (Leukine) or (rGM-CSF), β-glucocerebrosidase for Gaucher's disease, dornase alfa for cystic fibrosis, interferons for autoimmune disorders and viral infections, granulocyte macrophage colony-stimulating factor for immunostimulation, granulocyte colony-stimulating factor also for immunostimulation, factor VIII for hemophilia, tissue plasminogen activator for strokes, and GM-CSF are examples of proteins being introduced without being produced in the human body. (However, these are examples of a protein being introduced as a drug when the body is incapable of producing the protein naturally. So usually these can also be created by the body and thus are in the genetic code.)
Is there a mechanism or human biological phenomenon that ensures we only respond to/use proteins that we can also produce ourselves and that we have in our genetic code? Claes Lindhardt ( talk) 06:48, 24 August 2023 (UTC)
|type=mab, |mab_type=, |source=, |type= parameters, but this is not linked to wiki data. As for exogenous proteins produced, for example, by human pathogens, there is {{infobox nonhuman protein}} (see for example Diphtheria toxin), also not linked to wiki data. Viruses that infect humans are constantly mutating, producing an enormous number of new protein variants every day while older variants disappear. It is simply not possible to track all of these. Boghog ( talk) 09:34, 26 August 2023 (UTC)
Please disambiguate VTN to Vitronectin. -- ShelfSkewed Talk 03:28, 24 March 2024 (UTC)