scPRINT use case on BPH (part 2, GN analysis)¶
In this use-case, which some of the results are presented in Figure 5 of our manuscript, we perform an extensive analysis of gene networks generated by scPRINT in our previous notebook for fibroblasts of the prostate in both normal and benign prostatic hyperplasia states.
We can look at many patterns from the gene networks such as hub genes, gene modules, and gene-gene interactions. Some of them are related to highly variable and differentially expressed genes while others can only be infered by using the gene networks.
Table of content:
- Hub genes
- Network similarity
- Network communities
- Cancer version
- Normal version
- Differential Gene - Gene connections
- Using classification
from bengrn import BenGRN
import scanpy as sc
from bengrn.base import train_classifier
from anndata.utils import make_index_unique
from grnndata import utils as grnutils
from grnndata import read_h5ad
from matplotlib import pyplot as plt
from scdataloader import utils as data_utils
import numpy as np
from pyvis import network as pnx
import networkx as nx
import scipy.sparse
import pandas as pd
import gseapy as gp
from gseapy import dotplot
%load_ext autoreload
%autoreload 2
💡 connected lamindb: jkobject/scprint
prostate_combined = sc.read_h5ad("../data/prostate_combined_o2uniqsx.h5ad")
# load the generated gene networks from the previous notebook
grn = read_h5ad("../../data/prostate_fibro_grn.h5ad")
grn_c = read_h5ad("../../data/prostate_cancer_fibro_grn.h5ad")
#remove duplicates
grn.var.symbol = make_index_unique(grn.var.symbol.astype(str))
grn_c.var.symbol = make_index_unique(grn_c.var.symbol.astype(str))
# convert gene ids to symbols
grn.var['ensembl_id'] = grn.var.index
grn.var.index = grn.var.symbol
grn_c.var['ensembl_id'] = grn_c.var.index
grn_c.var.index = grn_c.var.symbol
grn.obs.cleaned_pred_disease_ontology_term_id.value_counts(),grn_c.obs.cleaned_pred_disease_ontology_term_id.value_counts()
(cleaned_pred_disease_ontology_term_id normal 300 Name: count, dtype: int64, cleaned_pred_disease_ontology_term_id benign prostatic hyperplasia 300 Name: count, dtype: int64)
1. Hub genes¶
We will use 2 different definition of centrality to compare the hubs of the networks in both states. The first one is the degree centrality, which is the number of connections of a node. The second one is the eigenvector centrality, which is the sum of the centrality of the neighbors of a node.
For each gene we mostly interogate their meaning with genecards, e.g. https://www.genecards.org/cgi-bin/carddisp.pl?gene=CD99
# between the two networks we have ~3000 genes in common over their 4000 genes
common = set(grn_c.var.symbol) & set(grn.var.symbol)
len(common)
2881
# hub genes based on edge centrality (number of connections / strength of connections)
grn.grn.sum(0).sort_values(ascending=False).head(20)
symbol S100A6 58.131020 TGIF2-RAB5IF 51.177635 MIF 44.534096 DNAJB9 40.953262 IGFBP7 38.629910 APOD 38.003906 BRME1 37.598698 SPARCL1 37.160854 TIMP1 36.671650 DCN 34.621376 C1S 32.746254 MGP 30.496748 nan-270 29.884596 SLC25A6 29.308729 BLOC1S5-TXNDC5 28.986681 CETN1 28.746281 FTL 28.266533 AKR1C1 27.641245 FBLN1 26.828733 MT2A 25.590197 dtype: float32
# without taking in account genes that are not present in the cancer network
grn.grn.sum(0).loc[list(common)].sort_values(ascending=False).head(20)
symbol TGIF2-RAB5IF 51.177635 IGFBP7 38.629910 APOD 38.003906 BRME1 37.598698 SPARCL1 37.160854 TIMP1 36.671650 DCN 34.621376 C1S 32.746254 MGP 30.496748 BLOC1S5-TXNDC5 28.986681 AKR1C1 27.641245 FBLN1 26.828733 MT2A 25.590197 MYOC 25.428144 FAM205A 25.296257 YBX3 24.937267 CST3 24.676405 C11orf96 18.731924 TFPI 18.429214 ATP6V0C 17.695705 dtype: float32
# cancer hub genes
grn_c.grn.sum(0).sort_values(ascending=False).head(20)
symbol HSPA1A 75.300415 MT2A 57.938267 CREM 42.478115 TGIF2-RAB5IF 39.980556 HSPE1 38.607624 CALD1 36.512047 SPOCK3 35.564945 HLA-A 33.446110 SPARCL1 33.403152 RBP1 32.482758 C1S 32.037743 BRME1-1 31.879770 FABP4 31.604069 nan-99 31.555256 LUM 30.523245 EIF4A1 29.690474 ATP6V0C 29.475002 PDZD2 27.333025 DEFA1 26.979454 CTSG 25.070097 dtype: float32
# without taking in account genes not present in the normal network
grn_c.grn.sum(0).loc[list(common)].sort_values(ascending=False).head(20)
symbol HSPA1A 75.300415 MT2A 57.938267 TGIF2-RAB5IF 39.980556 SPOCK3 35.564945 HLA-A 33.446110 SPARCL1 33.403152 C1S 32.037743 nan-99 31.555256 LUM 30.523245 EIF4A1 29.690474 ATP6V0C 29.475002 DEFA1 26.979454 IGFBP7 23.982243 DCN 23.304916 MGP 22.596451 CD99 21.624668 BLOC1S5-TXNDC5 20.914230 SERPING1 20.824636 COL6A2 18.076559 THBS1 17.826540 dtype: float32
# top differential hubs
TOP = 10
(set(grn_c.grn.sum(0).sort_values(ascending=False).head(TOP).index) - set(grn.grn.sum(0).sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'HLA-A', 'HSPA1A', 'MT2A', 'SPOCK3'}
TOP = 20
(set(grn_c.grn.sum(0).sort_values(ascending=False).head(TOP).index) - set(grn.grn.sum(0).sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'ATP6V0C', 'DEFA1', 'EIF4A1', 'HLA-A', 'HSPA1A', 'LUM', 'SPOCK3', 'nan-99'}
TOP = 50
(set(grn_c.grn.sum(0).sort_values(ascending=False).head(TOP).index) - set(grn.grn.sum(0).sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'CD99', 'CPE', 'DEFA1', 'EIF4A1', 'HLA-A', 'HSPA1A', 'LGALS1', 'LUM', 'PYDC2', 'SERPING1', 'SPOCK3', 'THBS1', 'nan-191', 'nan-99'}
overall we see that both are somewhat similar in the way the networks classifies main nodes but at least around half of the top 200 nodes are different between the two datasets and half of those are not due to diffent gene being used by the networks (50)
genes preferentially given as top elements in the tumor version only:
- CD99
- CPE
- DEFA1
- EIF4A1
- HLA-A
- HSPA1A
- LGALS1
- LUM
- PYDC2
- SERPING1
- SPOCK3
- THBS1
# we now compute eigen centrality creating a sparse network by only keeping the top 20 neighbors for each gene in the network
TOP = 20
grnutils.get_centrality(grn, TOP, top_k_to_disp=0)
grnutils.get_centrality(grn_c, TOP, top_k_to_disp=0)
Top central genes: [] Top central genes: []
[]
grn_c.var.centrality.sort_values(ascending=False).head(10)
symbol HSPA1A 0.271584 MT2A 0.271584 CREM 0.271574 CD99 0.271517 NR4A3 0.271385 RBP1 0.271130 SOX4 0.264214 ATP6V0C 0.263235 HLA-A 0.232913 LUM 0.202007 Name: centrality, dtype: float64
grn_c.var.centrality.sort_values(ascending=False).head(10)
symbol HSPA1A 0.271584 MT2A 0.271584 CREM 0.271574 CD99 0.271517 NR4A3 0.271385 RBP1 0.271130 SOX4 0.264214 ATP6V0C 0.263235 HLA-A 0.232913 LUM 0.202007 Name: centrality, dtype: float64
grn.var.centrality.sort_values(ascending=False).head(10)
symbol MT2A 0.276470 SLC25A6 0.276183 CXCL8 0.275859 FTH1 0.275757 MIF 0.267968 DNAJB9 0.265635 SOX4 0.241950 RELB 0.234482 YBX3 0.216567 CST3 0.213585 Name: centrality, dtype: float64
TOP = 10
(set(grn_c.var.centrality.sort_values(ascending=False).head(TOP).index) - set(grn.var.centrality.sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'ATP6V0C', 'CD99', 'HLA-A', 'HSPA1A', 'LUM'}
TOP = 20
(set(grn_c.var.centrality.sort_values(ascending=False).head(TOP).index) - set(grn.var.centrality.sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'ATP6V0C', 'CD99', 'EIF4A1', 'HLA-A', 'HSPA1A', 'LUM', 'PAGE4', 'RYR2', 'SERPINF1'}
TOP = 50
(set(grn_c.var.centrality.sort_values(ascending=False).head(TOP).index) - set(grn.var.centrality.sort_values(ascending=False).head(TOP).index)) & (set(grn_c.var.symbol) & set(grn.var.symbol))
{'C1R', 'CD99', 'COL6A2', 'HLA-A', 'HNRNPA0', 'HSPA1A', 'LUM', 'PAGE4', 'SERPINA3', 'SERPING1', 'SPOCK3', 'THBS1'}
it seems the networks are even more similar when comparing them at the level of the centrality of their nodes. this means that some are highly central while maybe not
2. Network similarity¶
We now look at the similarity of the two networks based on general overlap of their top K edges across their common nodes
grn.var.index = grn.var.ensembl_id
grn_c.var.index = grn_c.var.ensembl_id
overlap = list(set(grn.var.index) & set(grn_c.var.index))
K = 20
subgrn_c = grn_c.get(overlap).grn
subgrn_c = subgrn_c.apply(lambda row: row >= row.nlargest(K).min(), axis=1)
subgrn = grn.get(overlap).grn
subgrn = subgrn.apply(lambda row: row >= row.nlargest(K).min(), axis=1)
(subgrn & subgrn_c).sum(1).mean() / K
0.5055735930735931
when looking at the top 20 connection for each genes in the network we see that on average only 50% of them agree between the two networks
3. Network communities¶
Again looking at the top 20 connections for each gene in the network we use the leiden (or louvain) algorithm to find communities in both networks.
We then study the hub of these communities as well as their enrichment using enrichr and ontology databases. We will only look at communities between 20 and 200 genes
TOP = 20
grnutils.get_centrality(grn, TOP, top_k_to_disp=0)
grnutils.get_centrality(grn_c, TOP, top_k_to_disp=0)
Top central genes: [] Top central genes: []
[]
grn_c = grnutils.compute_cluster(grn_c, 1.5, use="leiden", n_iterations=10, max_comm_size=100)
grn = grnutils.compute_cluster(grn, 1.5)
grn_c = grnutils.compute_cluster(grn_c, 1.5)
3.1 Cancer version¶
# our communities
grn.var['cluster_1.5'].value_counts().head(10)
cluster_1.5 0 1620 1 1483 2 732 3 111 4 23 5 9 6 4 16 1 23 1 22 1 Name: count, dtype: int64
G = grn_c.plot_subgraph(grn_c.var[grn_c.var['cluster_1.5']=='4'].index.tolist(), only=200, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['PRDM6', 'MSANTD3', 'STAG3', 'TRHDE', 'TBX5', 'HEPH', 'GALNT16', 'NFATC4', 'SYNDIG1', 'HSF4', 'ZNF423', 'PLA2G4C', 'PLPPR2', 'SCN1B', 'ASPA', 'SNX15', 'BAG2', 'PTK7', 'HRH2', 'PLCL1', 'PTCH2', 'TGFB3', 'RASL11A', 'TOX2', 'CPNE5', 'GLIS2', 'HIVEP3', 'FGF13', 'ACAP3', 'DUSP26', 'CTIF', 'CAMK1', 'SLC9A5', 'FBN2', 'LRIG1', 'LIMD1', 'TRPC1', 'SLC2A13', 'ADAMTS12', 'FZD7', 'AFF2', 'CACNA1D', 'MRAS', 'ST6GALNAC6', 'FGFR4', 'PTGER1', 'PKDCC', 'SLC9B2', 'NPY1R', 'ANKRD33B', 'DCLK2', 'PURG', 'GXYLT2', 'LRRN4CL', 'CA8', 'EXOC3L1', 'BHLHE22', 'AATK', 'CMC4', 'BOLA2', 'CCR10', 'TCEAL2', 'SLC24A3', 'ROR1', 'SLC51B', 'EMID1', 'AGMO', 'CENPP', 'OPTC', 'SPOCK3', 'COL27A1', 'DACT3', 'STMN3', 'DLL1', 'SMOC1', 'COLGALT2', 'ZDBF2', 'DOK6', 'HNRNPUL2-BSCL2', 'C3orf84', 'C2orf74', 'ARHGEF25', 'NPIPB5', 'TNFSF12-TNFSF13', 'ZFP91-CNTF', 'nan-41', 'nan-81', 'nan-96', 'nan-98', 'HERC3', 'nan-116', 'nan-117'], dtype='object', name='symbol_2')
# if you want to draw better looking networks
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_c_4.html")
# Assuming 'node_names' contains the list of gene names.
# Note: it seems with enrichr, one can only use one gene set at a time, choose accordingly
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn_c.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:31:53,534 [WARNING] Input library not found: Reactome. Skip
Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
---|---|---|---|---|---|---|---|---|---|
0 | GO_Molecular_Function_2023 | Solute:Potassium Antiporter Activity (GO:0022821) | 0.000523 | 0.050652 | 0 | 0 | inf | inf | SLC24A3;SLC9A5 |
1 | GO_Molecular_Function_2023 | Coreceptor Activity Involved In Wnt Signaling ... | 0.001547 | 0.050652 | 0 | 0 | 86.822222 | 561.890021 | PTK7;ROR1 |
2 | GO_Molecular_Function_2023 | Sodium Ion Transmembrane Transporter Activity ... | 0.001547 | 0.050652 | 0 | 0 | 86.822222 | 561.890021 | SLC9A5;SLC9B2 |
3 | GO_Molecular_Function_2023 | Calcium Ion Transmembrane Transporter Activity... | 0.001700 | 0.050652 | 0 | 0 | 16.432584 | 104.795452 | SLC24A3;TRPC1;CACNA1D |
4 | GO_Molecular_Function_2023 | Calcium-Dependent Phospholipid Binding (GO:000... | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | CPNE5;PLA2G4C |
5 | GO_Molecular_Function_2023 | Coreceptor Activity Involved In Wnt Signaling ... | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | PTK7;ROR1 |
6 | GO_Molecular_Function_2023 | Metal Cation:Proton Antiporter Activity (GO:00... | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | SLC9A5;SLC9B2 |
7 | GO_Molecular_Function_2023 | Sodium:Proton Antiporter Activity (GO:0015385) | 0.003047 | 0.050652 | 0 | 0 | 43.400000 | 251.445478 | SLC9A5;SLC9B2 |
8 | GO_Molecular_Function_2023 | Iron Ion Binding (GO:0005506) | 0.005002 | 0.070586 | 0 | 0 | 28.925926 | 153.247105 | HEPH;AGMO |
9 | GO_Molecular_Function_2023 | Calcium Channel Activity (GO:0005262) | 0.005307 | 0.070586 | 0 | 0 | 10.099395 | 52.907606 | SLC24A3;TRPC1;CACNA1D |
10 | GO_Molecular_Function_2023 | Wnt Receptor Activity (GO:0042813) | 0.007391 | 0.089364 | 0 | 0 | 21.688889 | 106.438022 | FZD7;ROR1 |
11 | GO_Molecular_Function_2023 | RNA Polymerase II-specific DNA-binding Transcr... | 0.010110 | 0.104284 | 0 | 0 | 7.715135 | 35.445265 | TBX5;DUSP26;DLL1 |
12 | GO_Molecular_Function_2023 | Phospholipase Activity (GO:0004620) | 0.010193 | 0.104284 | 0 | 0 | 17.346667 | 79.552387 | PLCL1;PLA2G4C |
13 | GO_Molecular_Function_2023 | Monoatomic Cation Channel Activity (GO:0005261) | 0.011601 | 0.110210 | 0 | 0 | 7.284644 | 32.465172 | SLC24A3;TRPC1;CACNA1D |
14 | GO_Molecular_Function_2023 | Sodium Channel Regulator Activity (GO:0017080) | 0.013389 | 0.118714 | 0 | 0 | 14.451852 | 62.335739 | FGF13;SCN1B |
15 | GO_Molecular_Function_2023 | RNA Polymerase II Cis-Regulatory Region Sequen... | 0.022992 | 0.139045 | 0 | 0 | 2.668376 | 10.066740 | HSF4;HIVEP3;BHLHE22;TBX5;ZNF423;NFATC4;GLIS2 |
16 | GO_Molecular_Function_2023 | Calcium:Sodium Antiporter Activity (GO:0005432) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | SLC24A3 |
17 | GO_Molecular_Function_2023 | Potassium:Proton Antiporter Activity (GO:0015386) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | SLC9A5 |
18 | GO_Molecular_Function_2023 | Xylosyltransferase Activity (GO:0042285) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | GXYLT2 |
19 | GO_Molecular_Function_2023 | Tat Protein Binding (GO:0030957) | 0.023000 | 0.139045 | 0 | 0 | inf | inf | DLL1 |
# we make a plot of the enrichment
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
G = grn_c.plot_subgraph(grn_c.var[grn_c.var['cluster_1.5']=='5'].index.tolist(), only=80, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['ABCF2', 'ITIH4', 'OLFM2', 'ICAM4', 'RIGI', 'KAZALD1', 'BST1', 'PANX1', 'C9', 'FBXO2', 'ZNF410', 'ACVR1C', 'VAMP7', 'TOP2A', 'TAF4B', 'KIAA1755', 'EDA', 'YPEL4', 'ENTPD3', 'GP9', 'NLGN2', 'CLCF1', 'RPRM', 'ODF3B', 'MSC', 'NTF3', 'COL13A1', 'GP1BB', 'HSPA1A', 'TMEM158', 'nan-82', 'NBPF19', 'SCO2', 'SMIM41'], dtype='object', name='symbol_2')
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_c_5.html")
grn_c.var.loc['nan-82'] # par of the Glycoside Hydrolase Family
uid 4giyQrh7tkXI symbol nan-82 ncbi_gene_ids biotype protein_coding description novel protein synonyms organism_id 2 public_source_id 9.0 created_by_id 1 mt False ribo False hb False organism NCBITaxon:9606 TFs False ensembl_id ENSG00000269590 centrality 0.0 cluster_1.5 5 Name: nan-82, dtype: object
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn_c.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:33:16,565 [WARNING] Input library not found: Reactome. Skip
Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
---|---|---|---|---|---|---|---|---|---|
0 | GO_Molecular_Function_2023 | Death Receptor Binding (GO:0005123) | 0.000416 | 0.028729 | 0 | 0 | 123.875000 | 964.236668 | EDA;NTF3 |
1 | GO_Molecular_Function_2023 | Purine Ribonucleoside Triphosphate Binding (GO... | 0.007050 | 0.073312 | 0 | 0 | 8.626100 | 42.739637 | ABCF2;RIGI;HSPA1A |
2 | GO_Molecular_Function_2023 | Receptor Ligand Activity (GO:0048018) | 0.007732 | 0.073312 | 0 | 0 | 5.742222 | 27.920829 | EDA;CLCF1;NTF3;HSPA1A |
3 | GO_Molecular_Function_2023 | Activin Receptor Activity (GO:0017002) | 0.008500 | 0.073312 | 0 | 0 | inf | inf | ACVR1C |
4 | GO_Molecular_Function_2023 | Gap Junction Hemi-Channel Activity (GO:0055077) | 0.008500 | 0.073312 | 0 | 0 | inf | inf | PANX1 |
5 | GO_Molecular_Function_2023 | NF-kappaB Binding (GO:0051059) | 0.008500 | 0.073312 | 0 | 0 | inf | inf | TAF4B |
6 | GO_Molecular_Function_2023 | Hydrolase Activity, Hydrolyzing N-glycosyl Com... | 0.008500 | 0.073312 | 0 | 0 | inf | inf | BST1 |
7 | GO_Molecular_Function_2023 | Ubiquitin-Protein Transferase Activator Activi... | 0.008500 | 0.073312 | 0 | 0 | inf | inf | RIGI |
8 | GO_Molecular_Function_2023 | DNA Binding, Bending (GO:0008301) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | TOP2A |
9 | GO_Molecular_Function_2023 | Disordered Domain Specific Binding (GO:0097718) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | HSPA1A |
10 | GO_Molecular_Function_2023 | UDP Phosphatase Activity (GO:0045134) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | ENTPD3 |
11 | GO_Molecular_Function_2023 | GDP Phosphatase Activity (GO:0004382) | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | ENTPD3 |
12 | GO_Molecular_Function_2023 | Nucleoside Diphosphate Phosphatase Activity (G... | 0.016930 | 0.083096 | 0 | 0 | 120.151515 | 490.060134 | ENTPD3 |
13 | GO_Molecular_Function_2023 | ATP Binding (GO:0005524) | 0.017220 | 0.083096 | 0 | 0 | 11.204545 | 45.509349 | ABCF2;HSPA1A |
14 | GO_Molecular_Function_2023 | Ubiquitin Protein Ligase Binding (GO:0031625) | 0.020064 | 0.083096 | 0 | 0 | 10.265625 | 40.126707 | RIGI;HSPA1A |
15 | GO_Molecular_Function_2023 | Ubiquitin-Like Protein Ligase Binding (GO:0044... | 0.021555 | 0.083096 | 0 | 0 | 9.852500 | 37.805688 | RIGI;HSPA1A |
16 | GO_Molecular_Function_2023 | Growth Factor Activity (GO:0008083) | 0.023090 | 0.083096 | 0 | 0 | 9.471154 | 35.690549 | CLCF1;NTF3 |
17 | GO_Molecular_Function_2023 | Adenyl Ribonucleotide Binding (GO:0032559) | 0.024670 | 0.083096 | 0 | 0 | 9.118056 | 33.756535 | ABCF2;HSPA1A |
18 | GO_Molecular_Function_2023 | C3HC4-type RING Finger Domain Binding (GO:0055... | 0.025290 | 0.083096 | 0 | 0 | 60.060606 | 220.863728 | HSPA1A |
19 | GO_Molecular_Function_2023 | Ubiquitin Binding (GO:0043130) | 0.025290 | 0.083096 | 0 | 0 | 60.060606 | 220.863728 | TOP2A |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
similar story to cluster above. this time with PTN often found in the litterature with 6 other m Studies on the role of STEAP1, OR51E2, PTN, ABCC4, NDRG2 have been relatively in-depth. For instance, STEAP1 is all-called six-transmembrane epithelial antigen of the prostate 1, which belongs to a family of metalloproteinases involved in iron and copper homeostasis and other cellular processes
has been listed with 8 other members as predictors of CAF-related gene prognostic index (CRGPI) in prostate cancer. https://www.nature.com/articles/s41598-023-36125-0#Sec11
(NDRG2, TSPAN1, PTN, APOE, OR51E2, P4HB, STEAP1 and ABCC4 were used to construct molecular subtypes and)
3.2 Normal version¶
grn.var['cluster_1.5'].value_counts().head(10)
cluster_1.5 0 1620 1 1483 2 732 3 111 4 23 5 9 6 4 16 1 23 1 22 1 Name: count, dtype: int64
G = grn.plot_subgraph(grn.var[grn.var['cluster_1.5']=='3'].index.tolist(), only=200, interactive=False)
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['TMEM132A', 'ADGRA2', 'AGPAT4', 'THAP3', 'DNAJC25', 'TRAM2', 'NTN1', 'STAG3', 'COL4A4', 'HSD17B14', ... 'nan-75', 'TRABD2B', 'nan-77', 'PCGF2', 'nan-107', 'nan-182', 'nan-192', 'nan-195', 'nan-239', 'nan-259'], dtype='object', name='symbol_2', length=111)
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_3.html")
grn.var.loc['nan-259'] # par of the Glycoside Hydrolase Family
uid 62JAXxQNqtOd symbol nan-259 ncbi_gene_ids 83737 biotype protein_coding description novel protein synonyms organism_id 2 public_source_id 9.0 created_by_id 1 mt False ribo False hb False organism NCBITaxon:9606 TFs False ensembl_id ENSG00000289720 centrality 0.0 cluster_1.5 3 Name: nan-259, dtype: object
grn.get(grn.var[grn.var['cluster_1.5']=='3'].index.tolist()).grn.sum(0).sort_values(ascending=False).head(3)##.grn # selenom
symbol RNASEK 0.942102 SELENOM 0.292933 nan-259 0.256140 dtype: float32
RNASEK acidifying the internal vesicle important in endocytosis# https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10547866/ RNA phosphodiester bond hydrolysis --> hydrolisis pathway in the model and RNA transcription regulation
# Assuming 'node_names' contains the list of gene names
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:33:49,291 [WARNING] Input library not found: Reactome. Skip /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/enrichr.py:643: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation. self.results = pd.concat(self.results, ignore_index=True)
Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
---|---|---|---|---|---|---|---|---|---|
0 | GO_Molecular_Function_2023 | Sulfotransferase Activity (GO:0008146) | 0.007252 | 0.123369 | 0 | 0 | 20.138462 | 99.212171 | CHST7;SULT1C4 |
1 | GO_Molecular_Function_2023 | RNA Polymerase II Transcription Regulatory Reg... | 0.007713 | 0.123369 | 0 | 0 | 3.386005 | 16.472449 | ZNF483;HAND2;NTN1;NFATC4;GLIS2;PURG;ZNF286A |
2 | GO_Molecular_Function_2023 | ATPase Binding (GO:0051117) | 0.013790 | 0.123369 | 0 | 0 | 13.415385 | 57.468682 | SLC2A13;NKD2 |
3 | GO_Molecular_Function_2023 | Chondroitin Sulfotransferase Activity (GO:0034... | 0.016750 | 0.123369 | 0 | 0 | inf | inf | CHST7 |
4 | GO_Molecular_Function_2023 | Peptidyl-Proline 4-Dioxygenase Activity (GO:00... | 0.016750 | 0.123369 | 0 | 0 | inf | inf | P4HA3 |
5 | GO_Molecular_Function_2023 | Aryl Sulfotransferase Activity (GO:0004062) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | SULT1C4 |
6 | GO_Molecular_Function_2023 | Histone Reader Activity (GO:0140566) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | TONSL |
7 | GO_Molecular_Function_2023 | Hydrolase Activity, Hydrolyzing N-glycosyl Com... | 0.016750 | 0.123369 | 0 | 0 | inf | inf | BST1 |
8 | GO_Molecular_Function_2023 | Tat Protein Binding (GO:0030957) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | DLL1 |
9 | GO_Molecular_Function_2023 | WW Domain Binding (GO:0050699) | 0.016750 | 0.123369 | 0 | 0 | inf | inf | ENTREP3 |
10 | GO_Molecular_Function_2023 | RNA Polymerase II Cis-Regulatory Region Sequen... | 0.016754 | 0.123369 | 0 | 0 | 3.180050 | 13.003634 | ZNF483;HAND2;NTN1;NFATC4;GLIS2;ZNF286A |
11 | GO_Molecular_Function_2023 | DNA-binding Transcription Activator Activity, ... | 0.026719 | 0.134555 | 0 | 0 | 5.074219 | 18.380719 | HAND2;INSL3;GLIS2 |
12 | GO_Molecular_Function_2023 | Procollagen-Proline Dioxygenase Activity (GO:0... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | P4HA3 |
13 | GO_Molecular_Function_2023 | N-acetylglucosamine 6-O-sulfotransferase Activ... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | CHST7 |
14 | GO_Molecular_Function_2023 | ABC-type Xenobiotic Transporter Activity (GO:0... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | ABCC1 |
15 | GO_Molecular_Function_2023 | alpha-N-acetylgalactosaminide Alpha-2,6-Sialyl... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | ST6GALNAC6 |
16 | GO_Molecular_Function_2023 | miRNA Binding (GO:0035198) | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | DND1 |
17 | GO_Molecular_Function_2023 | Myosin Heavy Chain Binding (GO:0032036) | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | NKD2 |
18 | GO_Molecular_Function_2023 | Estradiol 17-Beta-Dehydrogenase [NAD(P)] Activ... | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | HSD17B14 |
19 | GO_Molecular_Function_2023 | Solute:Proton Symporter Activity (GO:0015295) | 0.033223 | 0.134555 | 0 | 0 | 59.575758 | 202.825781 | SLC2A13 |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0].str[:50]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
G = grn.plot_subgraph(grn.var[grn.var['cluster_1.5']=='4'].index.tolist(), only=80, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4'] Index(['TRHDE', 'PGR', 'ESR1', 'RASD2', 'SYNDIG1', 'OLFM2', 'FZD10', 'BHMT2', 'DUSP26', 'FOXF2', 'ADAM33', 'ADAMTS12', 'HS3ST3A1', 'CHODL', 'FTCD', 'FNDC1', 'HSPB3', 'FOXL1', 'AGMO', 'ADAMTSL2', 'S100A6', 'AKR1B10', 'SCGB1C2'], dtype='object', name='symbol_2')
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
#first_node = list(G.nodes)[-1]
#pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_4.html")
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['GO_Molecular_Function_2023', 'MSigDB_Hallmark_2020', 'Reactome_2022', 'Tabula_Sapiens', 'WikiPathway_2023_Human', 'TF_Perturbations_Followed_by_Expression', 'Reactome', 'PPI_Hub_Proteins', 'OMIM_Disease', 'GO_Molecular_Function_2023'],
organism='Human', # change accordingly
#description='pathway',
#cutoff=0.08, # test dataset, use lower value for real case
background=grn.var.symbol.tolist(),
)
enr.res2d.head(20)
2024-07-25 18:34:27,578 [WARNING] Input library not found: Reactome. Skip
Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
---|---|---|---|---|---|---|---|---|---|
0 | GO_Molecular_Function_2023 | RNA Polymerase II General Transcription Initia... | 0.000032 | 0.000933 | 0 | 0 | inf | inf | FOXF2;ESR1 |
1 | GO_Molecular_Function_2023 | Estrogen Response Element Binding (GO:0034056) | 0.000032 | 0.000933 | 0 | 0 | inf | inf | PGR;ESR1 |
2 | GO_Molecular_Function_2023 | General Transcription Initiation Factor Bindin... | 0.000095 | 0.001860 | 0 | 0 | 378.666667 | 3508.819727 | FOXF2;ESR1 |
3 | GO_Molecular_Function_2023 | Transcription Coactivator Binding (GO:0001223) | 0.000188 | 0.002780 | 0 | 0 | 189.285714 | 1623.428489 | PGR;ESR1 |
4 | GO_Molecular_Function_2023 | Transition Metal Ion Binding (GO:0046914) | 0.000446 | 0.005263 | 0 | 0 | 13.515099 | 104.272240 | BHMT2;AGMO;ADAM33;TRHDE |
5 | GO_Molecular_Function_2023 | Metalloendopeptidase Activity (GO:0004222) | 0.000551 | 0.005414 | 0 | 0 | 22.794231 | 171.058857 | ADAMTSL2;ADAM33;ADAMTS12 |
6 | GO_Molecular_Function_2023 | Metallopeptidase Activity (GO:0008237) | 0.000672 | 0.005666 | 0 | 0 | 21.155357 | 154.536591 | ADAMTSL2;ADAM33;ADAMTS12 |
7 | GO_Molecular_Function_2023 | Transcription Coregulator Binding (GO:0001221) | 0.001111 | 0.008195 | 0 | 0 | 54.013605 | 367.419603 | PGR;ESR1 |
8 | GO_Molecular_Function_2023 | DNA-binding Transcription Activator Activity, ... | 0.001429 | 0.009365 | 0 | 0 | 15.972973 | 104.640056 | FOXF2;PGR;ESR1 |
9 | GO_Molecular_Function_2023 | ATPase Binding (GO:0051117) | 0.002016 | 0.011893 | 0 | 0 | 37.780952 | 234.495632 | PGR;ESR1 |
10 | GO_Molecular_Function_2023 | Zinc Ion Binding (GO:0008270) | 0.002284 | 0.012250 | 0 | 0 | 13.407955 | 81.545835 | BHMT2;ADAM33;TRHDE |
11 | GO_Molecular_Function_2023 | Cis-Regulatory Region Sequence-Specific DNA Bi... | 0.004517 | 0.021203 | 0 | 0 | 6.945569 | 37.504924 | FOXF2;PGR;FOXL1;ESR1 |
12 | GO_Molecular_Function_2023 | RNA Polymerase II Cis-Regulatory Region Sequen... | 0.005523 | 0.021203 | 0 | 0 | 6.541596 | 34.009122 | FOXF2;PGR;FOXL1;ESR1 |
13 | GO_Molecular_Function_2023 | DNA Binding, Bending (GO:0008301) | 0.005750 | 0.021203 | 0 | 0 | inf | inf | FOXL1 |
14 | GO_Molecular_Function_2023 | TBP-class Protein Binding (GO:0017025) | 0.005750 | 0.021203 | 0 | 0 | inf | inf | ESR1 |
15 | GO_Molecular_Function_2023 | [Heparan Sulfate]-Glucosamine 3-Sulfotransfera... | 0.005750 | 0.021203 | 0 | 0 | inf | inf | HS3ST3A1 |
16 | GO_Molecular_Function_2023 | RNA Polymerase II Transcription Regulatory Reg... | 0.008164 | 0.028333 | 0 | 0 | 5.812950 | 27.949028 | FOXF2;PGR;FOXL1;ESR1 |
17 | GO_Molecular_Function_2023 | Endopeptidase Activity (GO:0004175) | 0.008882 | 0.028593 | 0 | 0 | 8.021918 | 37.893119 | ADAMTSL2;ADAM33;ADAMTS12 |
18 | GO_Molecular_Function_2023 | DNA Binding (GO:0003677) | 0.009208 | 0.028593 | 0 | 0 | 7.911486 | 37.086673 | FOXF2;PGR;FOXL1 |
19 | GO_Molecular_Function_2023 | Heparan Sulfate Sulfotransferase Activity (GO:... | 0.011468 | 0.030756 | 0 | 0 | 180.727273 | 807.520606 | HS3ST3A1 |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split(',').str[0].str[:50]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
#title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
4. Differential Gene - Gene connections¶
- look at the connections of some key TFs in here
- do GSEA over these connections
- compare the connections of these same TFs for both conditions
G = grn_c.plot_subgraph("PAGE4", only=100, max_genes=40, interactive=False)#
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#ff7f0e'] Index(['MT2A', 'HSPA1A', 'CREM', 'PTN', 'HLA-A', 'SPARCL1', 'SPOCK3', 'NR4A3', 'APOD', 'CALD1', 'MGP', 'HSPE1', 'EIF4A1', 'CD99', 'EMP1', 'FABP4', 'NAMPT', 'NNMT', 'RBP1', 'LUM', 'IGFBP7', 'THBS1', 'YBX3', 'NAF1', 'DCN', 'A2M', 'CPE', 'C1R', 'ATP6V0C', 'MATN2', 'COL6A2', 'UBE2S', 'CST3', 'RHOQ', 'UBE2C', 'C1S', 'H2AZ1', 'GPRC5A', 'HLA-C', 'LGALS1', 'PAGE4'], dtype='object', name='symbol_2')
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
first_node = list(G.nodes)[-1]
pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_c_PAGE4.html")
# Assuming 'node_names' contains the list of gene names
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=[
#'KEGG_2021_Human',
#'MSigDB_Hallmark_2020',
#'Reactome_2022',
#'Tabula_Sapiens',
'WikiPathway_2023_Human', #'TF_Perturbations_Followed_by_Expression',
#'PPI_Hub_Proteins',
#'OMIM_Disease',
#'GO_Molecular_Function_2023'
],
organism='Human', # change accordingly
#description='pathway',
# cutoff=0.02, # test dataset, use lower value for real case
background=grn.var.symbol.tolist(),
)
enr.res2d.head(10)
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0].str.split('R-HSA').str[0].str.split(' WP').str[0].str[:50]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
top_term=20,
figsize=(4,10), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
figsize=(4,5), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
G = grn.plot_subgraph("PAGE4", only=100, max_genes=40, interactive=False)
['#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#1f77b4', '#ff7f0e'] Index(['nan-270', 'S100A6', 'MT2A', 'SPARCL1', 'MIF', 'SLC25A6', 'IGFBP7', 'APOD', 'MGP', 'DCN', 'A2M', 'PTN', 'RELB', 'DNAJB9', 'AKR1C1', 'C1S', 'CST3', 'CXCL8', 'FTH1', 'BRME1', 'FAM205A', 'TIMP1', 'YBX3', 'NPPB', 'TGIF2-RAB5IF', 'FBLN1', 'MTDH', 'IGFBP5', 'TFPI', 'MAP1B', 'COL6A2', 'HNRNPA0', 'EIF4A1', 'CCL21', 'FTL', 'CETN1', 'NME2', 'LSAMP', 'ACR', 'ATP6V0C', 'PAGE4'], dtype='object', name='symbol_2')
pnet = pnx.Network(notebook=True, cdn_resources="remote", directed=True)
pnet.from_nx(G)
first_node = list(G.nodes)[-1]
pnet.get_node(first_node)['color'] = "red"
pnet.save_graph("../figures/pyvis/grn_PAGE4.html")
# Assuming 'node_names' contains the list of gene names
enr = gp.enrichr(gene_list=list(G.nodes),
gene_sets=['KEGG_2021_Human',
#'MSigDB_Hallmark_2020',
#'Reactome_2022',
#'Tabula_Sapiens',
#'WikiPathway_2023_Human',
# 'TF_Perturbations_Followed_by_Expression',
# 'Reactome',
#'PPI_Hub_Proteins',
#'OMIM_Disease',
'GO_Molecular_Function_2023'
],
organism='Human', # change accordingly
#description='pathway',
# cutoff=0.02, # test dataset, use lower value for real case
background=grn.var.symbol.tolist(),
)
enr.res2d.head(20)
Gene_set | Term | P-value | Adjusted P-value | Old P-value | Old adjusted P-value | Odds Ratio | Combined Score | Genes | |
---|---|---|---|---|---|---|---|---|---|
0 | GO_Molecular_Function_2023 | Peptidase Inhibitor Activity (GO:0030414) | 0.000020 | 0.001566 | 0 | 0 | 104.105263 | 1128.623016 | CST3;TIMP1;TFPI |
1 | GO_Molecular_Function_2023 | Transition Metal Ion Binding (GO:0046914) | 0.000463 | 0.016191 | 0 | 0 | 9.025463 | 69.301557 | ACR;MT2A;FTH1;TIMP1;FTL |
2 | GO_Molecular_Function_2023 | Double-Stranded RNA Binding (GO:0003725) | 0.000607 | 0.016191 | 0 | 0 | 101.461538 | 751.496153 | EIF4A1;MTDH |
3 | GO_Molecular_Function_2023 | Endopeptidase Inhibitor Activity (GO:0004866) | 0.001010 | 0.019977 | 0 | 0 | 18.306502 | 126.278211 | CST3;TIMP1;TFPI |
4 | GO_Molecular_Function_2023 | Endopeptidase Regulator Activity (GO:0061135) | 0.001498 | 0.019977 | 0 | 0 | 50.705128 | 329.757829 | CST3;TFPI |
5 | GO_Molecular_Function_2023 | Ferrous Iron Binding (GO:0008198) | 0.001498 | 0.019977 | 0 | 0 | 50.705128 | 329.757829 | FTH1;FTL |
6 | GO_Molecular_Function_2023 | Protease Binding (GO:0002020) | 0.002465 | 0.027606 | 0 | 0 | 12.944079 | 77.736001 | CST3;ACR;TIMP1 |
7 | GO_Molecular_Function_2023 | Iron Ion Binding (GO:0005506) | 0.002761 | 0.027606 | 0 | 0 | 33.786325 | 199.079000 | FTH1;FTL |
8 | GO_Molecular_Function_2023 | Calcium Ion Binding (GO:0005509) | 0.003252 | 0.028910 | 0 | 0 | 7.400664 | 42.393777 | CETN1;S100A6;SPARCL1;FBLN1 |
9 | GO_Molecular_Function_2023 | Kinase Binding (GO:0019900) | 0.005211 | 0.038377 | 0 | 0 | 9.688322 | 50.930509 | PTN;RELB;HNRNPA0 |
10 | GO_Molecular_Function_2023 | RNA Binding (GO:0003723) | 0.005277 | 0.038377 | 0 | 0 | 5.000000 | 26.222182 | EIF4A1;YBX3;DCN;MTDH;HNRNPA0 |
11 | GO_Molecular_Function_2023 | Chemokine Activity (GO:0008009) | 0.007445 | 0.039047 | 0 | 0 | 18.405594 | 90.191859 | CXCL8;CCL21 |
12 | GO_Molecular_Function_2023 | Chemokine Receptor Binding (GO:0042379) | 0.007445 | 0.039047 | 0 | 0 | 18.405594 | 90.191859 | CXCL8;CCL21 |
13 | GO_Molecular_Function_2023 | Metal Ion Binding (GO:0046872) | 0.008569 | 0.039047 | 0 | 0 | 5.523471 | 26.289761 | CETN1;S100A6;SPARCL1;FBLN1 |
14 | GO_Molecular_Function_2023 | Protein Kinase Binding (GO:0019901) | 0.009285 | 0.039047 | 0 | 0 | 7.734868 | 36.193807 | PTN;RELB;HNRNPA0 |
15 | GO_Molecular_Function_2023 | mRNA 3'-UTR Binding (GO:0003730) | 0.009893 | 0.039047 | 0 | 0 | 15.566075 | 71.852090 | YBX3;HNRNPA0 |
16 | GO_Molecular_Function_2023 | NF-kappaB Binding (GO:0051059) | 0.010250 | 0.039047 | 0 | 0 | inf | inf | MTDH |
17 | GO_Molecular_Function_2023 | Phosphatase Inhibitor Activity (GO:0019212) | 0.010250 | 0.039047 | 0 | 0 | inf | inf | PTN |
18 | GO_Molecular_Function_2023 | Phosphotransferase Activity, Phosphate Group A... | 0.010250 | 0.039047 | 0 | 0 | inf | inf | NME2 |
19 | GO_Molecular_Function_2023 | Nucleobase-Containing Compound Kinase Activity... | 0.010250 | 0.039047 | 0 | 0 | inf | inf | NME2 |
enr.res2d.Term = enr.res2d.Term.str.split("(").str[0]
ax = dotplot(enr.res2d.replace([np.inf, -np.inf], 0).dropna(),
column="Adjusted P-value",
title='normal fibro PAGE4',
cmap=plt.cm.viridis,
size=4, # adjust dot size
figsize=(4,5), cutoff=0.25)
/home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: The 'method' keyword in Series.replace is deprecated and will be removed in a future version. df[self.colname].replace( /home/ml4ig1/miniconda3/envs/scprint/lib/python3.10/site-packages/gseapy/plot.py:689: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method. The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy. For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object. df[self.colname].replace(
5. Using classification¶
Here we used the network given by the mean of all heads in our scPRINT model. We could also use other scPRINT versions to see if the results are consistent. but we could also use the classification mechanism presented in our manuscript and other notebooks to get to possibly more specialized networks.
grn_normal = grn_inferer(layer=list(range(model.nlayers))[:], cell_type="naive B cell")
grn_normal, m, omni_cls = train_classifier(grn_normal, C=0.5, train_size=0.9, class_weight={
1: 200, 0: 1}, shuffle=True)
number of expressed genes in this cell type: 16801
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Predicting: 0it [00:00, ?it/s]
true elem 2115 ... doing regression.... metrics {'used_heads': 25, 'precision': 0.0, 'random_precision': 0.0006433550820644942, 'recall': 0.0, 'predicted_true': 177.0, 'number_of_true': 219.0, 'epr': 0.0}
# highlight differential links on genes that are expressed in both
grn_normal.varp['all'] = grn_normal.varp['GRN'].copy()
grn_normal.varp['GRN'] = grn_normal.varp['classified']
Notes on findings and their relation to the literature¶
CAV1-expressing CAFs contribute to invasion and metastasis in breast cancer24 loss of caveolin1 (CAV1) is found in metabolically reprogrammed CAFs that promote tumorigenesis
discoidin domain-containing receptor 2 (DDR2)21 and integrin α11β122, have emerged to identify CAFs in the context of a specific TME https://www.nature.com/articles/s12276-023-01013-0
normal fibroblast populations marked by the expression of Gli1 and Hoxb6
CAFs, remain perpetually activated with a high capacity for ECM synthesis and microenvironmental remodeling, leading to stromal desmoplasia, a phenomenon characterized by increased deposition of ECM components in tumors. In this regard, CAFs share many basic characteristics, such as a secretory phenotype and capacity to synthesize ECM components, with fibroblasts found in nonmalignant tissue fibrosis. Therefore, the classic markers found to be expressed in fibroblasts, including α-SMA, vimentin, desmin, fibroblast-specific protein 1 (FSP1; also known as S100A4)
Fibrogenesis is part of a normal protective response to tissue injury that can become irreversible and progressive, leading to fatal diseases. Senescent cells are a main driver of fibrotic diseases through their secretome, known as senescence-associated secretory phenotype (SASP) CAFs (vCAFs), cycling CAFs (cCAFs), and developmental CAFs (dCAFs)
The stem-like ‘universal’ type of fibroblast cell, marked by expression of peptidase inhibitor 16 (Pi16) and Col15a, found in the steady state across tissues, activated fibroblasts, as observed by the development of LRRC15+ CAFs in PDAC.
. SOCS1: Suppressor of cytokine signaling; STAT3: Signal transducer and activator of transcription 3; LDH: Lactate dehydrogenase; PYCR1: pyrroline-5-carboxylate reductase
senescence, and multiple types of fibrotic diseases in mice and humans are characterized by the accumulation of iron. https://www.nature.com/articles/s42255-023-00928-2
Current evidence indicate that cancer-associated fibroblasts (CAFs) play an important role in prostate cancer (PCa) development and progression. playing a role in invasion.
NDRG2, TSPAN1, PTN, APOE, OR51E2, P4HB, STEAP1 and ABCC4 were used to construct molecular subtypes and CAF-related gene prognostic index (CRGPI)
Studies on the role of STEAP1, OR51E2, PTN, ABCC4, NDRG2 have been relatively in-depth. For instance, STEAP1 is all-called six-transmembrane epithelial antigen of the prostate 1, which belongs to a family of metalloproteinases involved in iron and copper homeostasis and other cellular processes37. STEAP1 is overexpressed on the plasma membrane of PCa cells and is associated with PCa invasiveness and metastasis50. The possible mechanism of STEAP1 promoting tumor proliferation and metastasis is by acting as a channel for small molecules that are involved in intercellular communication OR51E2 perhaps could be used as one of the biomarkers for PCa53,54,55,56 https://www.nature.com/articles/s41598-023-36125-0#Sec11
The protein encoded by ABCC4 is a member of the superfamily of ATP-binding cassette (ABC) transporters, which is also called multidrug resistance protein 4 (MRP4)64. MRP4 was reported to be associated with drug resistance of PCa in many studies65,66,67
PRRX1, OSR1, FOXD1, HOXC4,LHX9, TBX3 and TWIST2, targeted five or more TFs inthe fibroblast TRN. This influential-set explained 62.5% ofthe total number of regulatory edges and they collectivelytargeted 15 out of 18 TFs represented in the network. Upona closer examination of the fibroblastic network, the inter-actions among OSR1–PRRX1–TWIST2 were notably co-ordinated, regulating one another in both directions https://www.researchgate.net/publication/264633205_A_transient_disruption_of_fibroblastic_transcriptional_regulatory_network_facilitates_trans-differentiation
FGF receptors are important in PCa initiation and progression. These endocrine FGFs require alpha-Klotho (KL) and/or beta-Klotho (KLB), two related single-pass transmembrane proteins restricted in their tissue distribution. FGF19 is expressed in primary and metastatic PCa tissues where it functions as an autocrine growth factor. https://www.e-cancer.fr/Professionnels-de-sante/Veille-bibliographique/Nota-Bene-Cancer/NBC-174/The-endocrine-fibroblast-growth-factor-FGF19-promotes-prostate-cancer-progression
• Prostate cancer stroma contains diverse populations of fibroblasts. • Carcinoma associated fibroblasts can promote prostate carcinogenesis. • Interaction among fibroblast subsets modulate stromal-epithelial paracrine signaling. The stroma in the normal prostate is predominantly composed of smooth muscle cells while in cancer this is partially replaced by fibroblasts and myofibroblasts6. These fibroblastic cells are responsible for collagen and extracellular matrix (ECM) deposition, changing local organ stiffness
iCAF, myCAF, apCAF
It has been shown that SDF-1/CXCL12, secreted by CAF, acts via TGFβ–mediated upregulation of CXCR4 in epithelial cells to activate Akt signaling associated with tumorigenesis. Overexpression of SDF-1/CXCL12 and TGFβ1 in benign human prostate fibroblasts has been shown to induce malignant transformation of benign prostate epithelial cells and formation of highly invasive tumors in vivo
increased cadherin 2 (CDH2) mRNA expression in LNCaP cells
Upregulation of DNA methyltransferases (DNMT) consistent with aberrant promoter hypermethylation of tumor-suppressor genes in multiple cancer types including PCa has been reported
CD10 and GPR77 can define a human CAF subset that sustains cancer stemness
Fibroblasts are also fundamental for inducing angiogenesis by secreting angiogenic factors such as vascular endothelial growth factor (VEGF) and matrix proteins Epithelial cells release TGFβ ligands, Kallikrein-related peptidase-4 (KLK4), CAFs are characterized by molecular markers that are upregulated compared to normal fibroblasts, such as fibroblast activation protein (FAP), PDGFR-β, fibroblast-specific protein-1 (FSP-1) (also known as S100A4)
ACTA2 α-smooth actin (α-SMA) Upregulated RT-qPCR, IHC on tissue microarrays [42,43] ASPN Asporin Upregulated Tag-based RNA profiling, microarray profiling, IHC, RT-qPCR [38,44,45] CAV1 Caveolin-1 Downregulated Tag profiling, IHC, RT-qPCR [38] COL1A1 Collagen Type-I Upregulated RT-qPCR, IHC [42,43] CXCL12 Stromal cell-derived factor 1 (SDF1)/ (C-X-C motif chemokine ligand 12 (CXCL12) Upregulated Tag-profiling, RT-qPCR, ELISA [38,46] FAP Fibroblast activation protein Upregulated IHC [42] FGF2, FGF7, FGF10 Fibroblast growth factor-2/-7/-10 Upregulated RT-qPCR, Western Blot, IHC [43,47,48] FN1 Fibronectin Upregulated Tag profiling, IHC, RT-qPCR [38] ITGA1 Integrin-α1 (CD49a) Upregulated IHC [49] OGN Osteoglycin Upregulated Tag-profiling, IHC, RT-qPCR [38] PDGFRB Platelet-derived growth factor receptor β Upregulated Microarray profiling [44,50] POSTN Periostin Upregulated Microarray profiling, IHC [44] S100A4 Fibroblast-specific protein 1 (FSP1)/S100 Calcium Binding Protein A4 (S100A4) Upregulated Immunofluorescence, RT-qPCR, Western Blot [51,52,53] S100A6 S100 Calcium Binding Protein A6 Downregulated Tag profiling, IHC, RT-qPCR [38] SPARC Secreted Protein Acidic and Cysteine Rich Up/downregulated Microarray profiling, Tag-profiling [38,44] STC1 Stanniocalcin 1 Downregulated Tag profiling, IHC, RT-qPCR [38] THY1 Cluster of differentiation 90 (CD90) antigen Upregulated IHC [54] TNC Tenascin C Upregulated RT-qPCR, IHC [42,43] VIM Vimentin Upregulated IHC
Benign prostatic hyperplasia (BPH) is a non-malignant growth of the prostate, typically occurring in older men with an occurrence of 80–90% of men in their 70s BPH develops in the transition zone differently from PCa foci, which usually develop in the peripheral zone
ESR1= . Here we studied the role of CAF estrogen receptor alpha (ERα) and found that it could protect against PCa invasion. ERα could function through a CAF–epithelial interaction via selectively upregulating thrombospondin 2 (Thbs2) and downregulating matrix metalloproteinase 3 (MMP3) at the protein and messenger RNA levels https://academic.oup.com/carcin/article/35/6/1301/449417
ACTA2: we found that “reactive CAFs” induce shear resistance to prostate tumor cells via intercellular contact and soluble derived factors. The reactive CAFs showed higher expression of α-smooth muscle actin (α-SMA) and fibroblast activation protein (FAP) compared to differentiated CAFs https://www.oncotarget.com/article/27510/
Proteins that exhibited a significant increase in CAF versus NPF were enriched for the functional categories “cell adhesion” and the “extracellular matrix.” The CAF phosphoproteome exhibited enhanced phosphorylation of proteins associated with the “spliceosome” and “actin binding.” COL1A1/2 and COL5A1; the receptor tyrosine kinase discoidin domain-containing receptor 2 (DDR2), a receptor for fibrillar collagens; and lysyl oxidase-like 2 (LOXL2) https://www.mcponline.org/article/S1535-9476(20)31548-6/fulltext
•Cancer-associated fibroblasts (CAFs) exhibit a highly aligned cytoskeleton and extracellular matrix. •CAFs are stiffer than normal fibroblasts from the prostate. •Benign prostate epithelial cells are more compliant after co-culture with CAFs. •Benign epithelial cells are more invasive and proliferative in presence of CAFs. https://www.sciencedirect.com/science/article/pii/S2590006420300338
These findings suggest that CAFs exosomes drive PCa metastasis via the miR-500a-3p/FBXW7/HSF1 axis in a hypoxic microenvironmen https://www.nature.com/articles/s41417-024-00742-2
https://www.genecards.org/cgi-bin/carddisp.pl?gene=CCT8L2 CCT8L2 (Chaperonin Containing TCP1 Subunit 8 Like 2) is a Protein Coding gene. Gene Ontology (GO) annotations related to this gene include calcium-activated potassium channel activity and monoatomic anion channel activity. An important paralog of this gene is CCT8. related to the meta transmembrane activity via
Possible molecular chaperone; assists the folding of proteins upon ATP hydrolysis. The TRiC complex mediates the folding of WRAP53/TCAB1, thereby regulating telomere maintenance (PubMed:25467444). As part of the TRiC complex may play a role in the assembly of BBSome, a complex involved in ciliogenesis regulating transports vesicles to the cilia
https://www.genecards.org/cgi-bin/carddisp.pl?gene=TIMP1 TIMP Metallopeptidase Inhibitor 1 inhibitors of the matrix metalloproteinases (MMPs), a group of peptidases involved in degradation of the extracellular matrix. In addition to its inhibitory role against most of the known MMPs, the encoded protein is able to promote cell proliferation
connected to TLL2: zinc-dependent metalloprotease part of the metzincin protein complex which also is Among its related pathways are Collagen chain trimerization and Extracellular matrix organization.
PDXP, SEPTIN5 cytoskeletal organization and regulation
PRAME inhibits the signaling of retinoic acid and mediates the regulation of selenoproteins
many genes involved in cancer. indeed, it is known that 80% adult male above 70 years old will have such hyperplastic prostate. This is due to scenesence. and that Iron accumulation drives fibrosis, senescence and the senescence-associated secretory phenotype. (of which we have some proteins present here.) A mechanism that this clustered is one of genes that strongly regulates each other in this fibroblasts from pre-cancerous lesions via fibrotic iron accumulation and a secretory mechanism GJC1 (exchange of things between cells (here likely metals?))
Unknown genes like SMIM40. integral membrane protein could be linked to this phenotype