VenusX: Unlocking Fine-Grained Functional Understanding of Proteins
Yang Tan, Wenrui Gou, Bozitao Zhong, Huiqun Yu, Liang Hong, Bingxin Zhou
Residue-level Binary Classification
Cross-family
Performance on cross-family (out-of-distribution) splits. Higher is better for every metric.
Act: Active Sites | BindI: Binding Sites | Evo: Conserved Sites | Motif: Functional Motif | Dom: Functional Domain
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| SaProt (AF_650M) | Seq-Structure | 0.185 | 0.241 | 0.072 | 0.110 | 0.538 |
| Ankh (Base) | Sequence-only | 0.166 | 0.190 | 0.025 | 0.045 | 0.507 |
| ProtSSN (k20_h512) | Seq-Structure | 0.156 | 0.241 | 0.014 | 0.026 | 0.498 |
| ESM2 (t30) | Sequence-only | 0.143 | 0.278 | 0.060 | 0.098 | 0.533 |
| ProtBert | Sequence-only | 0.131 | 0.131 | 0.020 | 0.035 | 0.501 |
| ESM2 (t33) | Sequence-only | 0.143 | 0.126 | 0.031 | 0.050 | 0.507 |
| SaProt (AF_35M) | Seq-Structure | 0.114 | 0.132 | 0.036 | 0.056 | 0.510 |
| GVP-GNN | Structure-only | 0.101 | 0.019 | 0.001 | 0.002 | 0.485 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| SaProt (AF_35M) | Seq-Structure | 0.230 | 0.634 | 0.135 | 0.223 | 0.599 |
| SaProt (AF_650M) | Seq-Structure | 0.182 | 0.661 | 0.135 | 0.224 | 0.600 |
| ProtSSN (k20_h512) | Seq-Structure | 0.095 | 0.379 | 0.029 | 0.053 | 0.514 |
| ProtBert | Sequence-only | 0.112 | 0.416 | 0.048 | 0.086 | 0.530 |
| Ankh (Base) | Sequence-only | 0.145 | 0.437 | 0.086 | 0.144 | 0.559 |
| ESM2 (t30) | Sequence-only | 0.133 | 0.525 | 0.078 | 0.136 | 0.556 |
| ESM2 (t33) | Sequence-only | 0.159 | 0.581 | 0.108 | 0.181 | 0.579 |
| GVP-GNN | Structure-only | 0.040 | 0.000 | 0.000 | 0.000 | 0.488 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| Ankh (Base) | Sequence-only | 0.275 | 0.387 | 0.169 | 0.235 | 0.595 |
| SaProt (AF_650M) | Seq-Structure | 0.274 | 0.456 | 0.111 | 0.178 | 0.568 |
| SaProt (AF_35M) | Seq-Structure | 0.272 | 0.382 | 0.172 | 0.238 | 0.596 |
| ProtBert | Sequence-only | 0.243 | 0.482 | 0.009 | 0.017 | 0.489 |
| ESM2 (t33) | Sequence-only | 0.262 | 0.403 | 0.122 | 0.187 | 0.572 |
| ESM2 (t30) | Sequence-only | 0.235 | 0.374 | 0.097 | 0.154 | 0.555 |
| ProtSSN (k20_h512) | Seq-Structure | 0.227 | 0.452 | 0.034 | 0.062 | 0.511 |
| GVP-GNN | Structure-only | 0.101 | 0.176 | 0.035 | 0.058 | 0.506 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| ProtBert | Sequence-only | 0.348 | 0.472 | 0.231 | 0.310 | 0.628 |
| ESM2 (t33) | Sequence-only | 0.456 | 0.566 | 0.384 | 0.457 | 0.704 |
| SaProt (AF_650M) | Seq-Structure | 0.441 | 0.504 | 0.350 | 0.414 | 0.680 |
| ESM2 (t30) | Sequence-only | 0.433 | 0.510 | 0.432 | 0.467 | 0.707 |
| SaProt (AF_35M) | Seq-Structure | 0.408 | 0.485 | 0.411 | 0.445 | 0.695 |
| Ankh (Base) | Sequence-only | 0.394 | 0.499 | 0.303 | 0.377 | 0.662 |
| ProtSSN (k20_h512) | Seq-Structure | 0.390 | 0.390 | 0.365 | 0.412 | 0.678 |
| GVP-GNN | Structure-only | 0.329 | 0.329 | 0.453 | 0.399 | 0.661 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| SaProt (AF_650M) | Seq-Structure | 0.564 | 0.572 | 0.444 | 0.500 | 0.632 |
| SaProt (AF_35M) | Seq-Structure | 0.525 | 0.548 | 0.349 | 0.427 | 0.594 |
| ProtBert | Sequence-only | 0.508 | 0.588 | 0.138 | 0.223 | 0.501 |
| ESM2 (t33) | Sequence-only | 0.506 | 0.530 | 0.367 | 0.433 | 0.593 |
| ESM2 (t30) | Sequence-only | 0.470 | 0.496 | 0.360 | 0.417 | 0.578 |
| GVP-GNN | Structure-only | 0.468 | 0.519 | 0.087 | 0.149 | 0.462 |
| Ankh (Base) | Sequence-only | 0.449 | 0.494 | 0.280 | 0.357 | 0.552 |
| ProtSSN (k20_h512) | Seq-Structure | - | - | - | - | - |
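
The F1-Positive and Macro-F1 columns in the tables above are easier to compare with a small worked example. The sketch below assumes the standard definitions: F1-Positive is the F1 of the positive (functional) residue class, and Macro-F1 is the unweighted mean of the positive-class and negative-class F1. The benchmark's own evaluation code is not shown on this page, so `binary_metrics` and `f1_from_counts` are hypothetical helper names, and AUPR is left out because it needs ranked scores rather than hard labels.

```python
def f1_from_counts(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def binary_metrics(y_true, y_pred):
    """Residue-level metrics for 0/1 labels (assumed definitions, see text)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision, recall, f1_pos = f1_from_counts(tp, fp, fn)
    # For the negative class the roles swap: true negatives are its hits,
    # false negatives are its false alarms, false positives are its misses.
    _, _, f1_neg = f1_from_counts(tn, fn, fp)

    return {
        "precision": precision,
        "recall": recall,
        "f1_positive": f1_pos,
        "macro_f1": (f1_pos + f1_neg) / 2,
    }


# Toy protein with 10 residues, 3 of them functional (label 1).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
# {'precision': 0.5, 'recall': 0.333, 'f1_positive': 0.4, 'macro_f1': 0.6}

# A model that never predicts a positive residue still gets a non-trivial Macro-F1.
all_negative = binary_metrics(y_true, [0] * len(y_true))
```

Under these assumptions, a model that labels almost no residue as positive still reaches a Macro-F1 near 0.5, because the negative-class F1 stays close to 1 on imbalanced data. That is consistent with the GVP-GNN rows above, where F1-Positive is at or near zero while Macro-F1 is about 0.49.
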
Mixed-family
Performance on mixed-family (in-distribution) splits. Higher is better for every metric.
Act: Active Sites | BindI/BindB: Binding Sites | Evo: Conserved Sites | Motif: Functional Motif | Dom: Functional Domain | Epi: Epitope Sites
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| Ankh (Base) | Sequence-only | 0.873 | 0.862 | 0.700 | 0.773 | 0.883 |
| ESM2 (t30) | Sequence-only | 0.855 | 0.826 | 0.676 | 0.744 | 0.868 |
| ESM2 (t33) | Sequence-only | 0.852 | 0.845 | 0.682 | 0.755 | 0.874 |
| ProtBert | Sequence-only | 0.764 | 0.791 | 0.565 | 0.659 | 0.825 |
| SaProt (AF_650M) | Seq-Structure | 0.745 | 0.812 | 0.511 | 0.627 | 0.808 |
| SaProt (AF_35M) | Seq-Structure | 0.688 | 0.818 | 0.408 | 0.544 | 0.767 |
| GVP-GNN | Structure-only | 0.523 | 0.735 | 0.362 | 0.485 | 0.736 |
| ProtSSN (k20_h512) | Seq-Structure | 0.465 | 0.523 | 0.209 | 0.329 | 0.658 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| ESM2 (t30) | Sequence-only | 0.912 | 0.859 | 0.859 | 0.859 | 0.926 |
| Ankh (Base) | Sequence-only | 0.907 | 0.849 | 0.866 | 0.857 | 0.925 |
| ESM2 (t33) | Sequence-only | 0.904 | 0.869 | 0.830 | 0.849 | 0.921 |
| ProtBert | Sequence-only | 0.857 | 0.855 | 0.694 | 0.766 | 0.878 |
| SaProt (AF_650M) | Seq-Structure | 0.838 | 0.827 | 0.768 | 0.796 | 0.893 |
| SaProt (AF_35M) | Seq-Structure | 0.807 | 0.813 | 0.705 | 0.755 | 0.871 |
| ProtSSN (k20_h512) | Seq-Structure | 0.801 | 0.818 | 0.705 | 0.757 | 0.873 |
| GVP-GNN | Structure-only | 0.611 | 0.730 | 0.519 | 0.607 | 0.795 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| ESM2 (t33) | Sequence-only | 0.899 | 0.856 | 0.806 | 0.831 | 0.910 |
| Ankh (Base) | Sequence-only | 0.895 | 0.882 | 0.735 | 0.802 | 0.896 |
| ESM2 (t30) | Sequence-only | 0.862 | 0.816 | 0.783 | 0.799 | 0.894 |
| ProtBert | Sequence-only | 0.771 | 0.805 | 0.610 | 0.694 | 0.839 |
| SaProt (AF_650M) | Seq-Structure | 0.734 | 0.809 | 0.554 | 0.658 | 0.820 |
| SaProt (AF_35M) | Seq-Structure | 0.724 | 0.819 | 0.520 | 0.636 | 0.809 |
| ProtSSN (k20_h512) | Seq-Structure | 0.715 | 0.790 | 0.507 | 0.618 | 0.800 |
| GVP-GNN | Structure-only | 0.342 | 0.810 | 0.091 | 0.164 | 0.569 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| Ankh (Base) | Sequence-only | 0.884 | 0.846 | 0.789 | 0.817 | 0.895 |
| ESM2 (t33) | Sequence-only | 0.874 | 0.851 | 0.748 | 0.796 | 0.884 |
| ESM2 (t30) | Sequence-only | 0.855 | 0.824 | 0.775 | 0.799 | 0.885 |
| SaProt (AF_650M) | Seq-Structure | 0.802 | 0.841 | 0.615 | 0.710 | 0.837 |
| ProtBert | Sequence-only | 0.779 | 0.784 | 0.678 | 0.727 | 0.845 |
| SaProt (AF_35M) | Seq-Structure | 0.767 | 0.821 | 0.582 | 0.681 | 0.821 |
| ProtSSN (k20_h512) | Seq-Structure | 0.716 | 0.772 | 0.550 | 0.642 | 0.799 |
| GVP-GNN | Structure-only | 0.661 | 0.748 | 0.525 | 0.618 | 0.786 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| Ankh (Base) | Sequence-only | 0.673 | 0.674 | 0.467 | 0.552 | 0.700 |
| ESM2 (t33) | Sequence-only | 0.666 | 0.661 | 0.467 | 0.547 | 0.696 |
| SaProt (AF_650M) | Seq-Structure | 0.642 | 0.635 | 0.472 | 0.542 | 0.689 |
| ESM2 (t30) | Sequence-only | 0.634 | 0.648 | 0.433 | 0.519 | 0.679 |
| ProtBert | Sequence-only | 0.591 | 0.636 | 0.353 | 0.454 | 0.644 |
| SaProt (AF_35M) | Seq-Structure | 0.574 | 0.632 | 0.322 | 0.427 | 0.629 |
| GVP-GNN | Structure-only | 0.560 | 0.591 | 0.344 | 0.435 | 0.636 |
| ProtSSN (k20_h512) | Seq-Structure | - | - | - | - | - |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| ESM2 (t33) | Sequence-only | 0.446 | 0.605 | 0.329 | 0.427 | 0.707 |
| Ankh (Base) | Sequence-only | 0.421 | 0.634 | 0.260 | 0.369 | 0.678 |
| ESM2 (t30) | Sequence-only | 0.408 | 0.598 | 0.289 | 0.390 | 0.689 |
| ProtBert | Sequence-only | 0.340 | 0.547 | 0.238 | 0.332 | 0.659 |
| Model | Type | AUPR | Precision | Recall | F1-Positive | Macro-F1 |
|---|---|---|---|---|---|---|
| ESM2 (t30) | Sequence-only | 0.186 | 1.000 | 0.001 | 0.002 | 0.480 |
| ESM2 (t33) | Sequence-only | 0.174 | 0.000 | 0.000 | 0.000 | 0.479 |
| ProtBert | Sequence-only | 0.169 | 1.000 | 0.001 | 0.002 | 0.480 |
| Ankh (Base) | Sequence-only | 0.167 | 0.000 | 0.000 | 0.000 | 0.479 |
Fragment-level Multi-class Classification
MF50
Performance on InterPro datasets (MF50 split). Higher is better.
| Model | Type | Accuracy | Precision | Recall | Macro-F1 | MCC |
|---|---|---|---|---|---|---|
| SaProt (AF_650M) | Seq-Structure | 0.928 | 0.830 | 0.830 | 0.825 | 0.926 |
| GVP-GNN | Structure-only | 0.907 | 0.826 | 0.833 | 0.822 | 0.906 |
| SaProt (AF_35M) | Seq-Structure | 0.928 | 0.810 | 0.823 | 0.807 | 0.926 |
| ProtSSN (k20_h512) | Seq-Structure | 0.891 | 0.773 | 0.774 | 0.764 | 0.889 |
| Ankh (Base) | Sequence-only | 0.824 | 0.661 | 0.665 | 0.647 | 0.821 |
| ESM2 (t30) | Sequence-only | 0.819 | 0.659 | 0.670 | 0.647 | 0.815 |
| ESM2 (t33) | Sequence-only | 0.814 | 0.603 | 0.634 | 0.605 | 0.810 |
| ProtBert | Sequence-only | 0.736 | 0.618 | 0.636 | 0.609 | 0.731 |
| Model | Type | Accuracy | Precision | Recall | Macro-F1 | MCC |
|---|---|---|---|---|---|---|
| SaProt (AF_650M) | Seq-Structure | 0.986 | 0.968 | 0.956 | 0.957 | 0.984 |
| SaProt (AF_35M) | Seq-Structure | 0.976 | 0.943 | 0.929 | 0.931 | 0.971 |
| GVP-GNN | Structure-only | 0.972 | 0.901 | 0.882 | 0.884 | 0.967 |
| ProtSSN (k20_h512) | Seq-Structure | 0.972 | 0.940 | 0.948 | 0.931 | 0.967 |
| ESM2 (t30) | Sequence-only | 0.937 | 0.834 | 0.819 | 0.809 | 0.926 |
| ESM2 (t33) | Sequence-only | 0.934 | 0.755 | 0.775 | 0.753 | 0.922 |
| ProtBert | Sequence-only | 0.927 | 0.838 | 0.794 | 0.790 | 0.914 |
| Ankh (Base) | Sequence-only | 0.920 | 0.733 | 0.732 | 0.718 | 0.906 |
| Model | Type | Accuracy | Precision | Recall | Macro-F1 | MCC |
|---|---|---|---|---|---|---|
| SaProt (AF_650M) | Seq-Structure | 0.950 | 0.868 | 0.875 | 0.863 | 0.950 |
| SaProt (AF_35M) | Seq-Structure | 0.939 | 0.857 | 0.858 | 0.849 | 0.938 |
| ProtSSN (k20_h512) | Seq-Structure | 0.915 | 0.804 | 0.807 | 0.793 | 0.915 |
| GVP-GNN | Structure-only | 0.914 | 0.763 | 0.768 | 0.757 | 0.913 |
| Ankh (Base) | Sequence-only | 0.866 | 0.727 | 0.729 | 0.716 | 0.865 |
| ESM2 (t30) | Sequence-only | 0.853 | 0.681 | 0.684 | 0.667 | 0.852 |
| ESM2 (t33) | Sequence-only | 0.841 | 0.682 | 0.682 | 0.669 | 0.840 |
| ProtBert | Sequence-only | 0.828 | 0.644 | 0.646 | 0.627 | 0.827 |
| Model | Type | Accuracy | Precision | Recall | Macro-F1 | MCC |
|---|---|---|---|---|---|---|
| ProtSSN (k20_h512) | Seq-Structure | 0.914 | 0.564 | 0.556 | 0.556 | 0.907 |
| SaProt (AF_650M) | Seq-Structure | 0.927 | 0.546 | 0.562 | 0.552 | 0.921 |
| ESM2 (t33) | Sequence-only | 0.906 | 0.547 | 0.543 | 0.542 | 0.898 |
| SaProt (AF_35M) | Seq-Structure | 0.901 | 0.509 | 0.505 | 0.504 | 0.892 |
| Ankh (Base) | Sequence-only | 0.901 | 0.508 | 0.501 | 0.499 | 0.892 |
| ESM2 (t30) | Sequence-only | 0.884 | 0.458 | 0.461 | 0.457 | 0.875 |
| ProtBert | Sequence-only | 0.884 | 0.455 | 0.458 | 0.452 | 0.875 |
| GVP-GNN | Structure-only | 0.807 | 0.387 | 0.371 | 0.370 | 0.791 |
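
Accuracy, Macro-F1, and MCC can disagree sharply on multi-class data, as in the last MF50 table where accuracy is around 0.9 while Macro-F1 is around 0.5. The sketch below is one way to compute the three metrics, assuming the standard definitions (Macro-F1 as the unweighted mean of per-class F1, and the multi-class generalisation of the Matthews correlation coefficient). `multiclass_metrics` is a hypothetical helper, not the benchmark's evaluation code.

```python
import math
from collections import Counter


def multiclass_metrics(y_true, y_pred):
    """Accuracy, macro-F1 and multi-class MCC from integer class labels."""
    n = len(y_true)
    classes = sorted(set(y_true) | set(y_pred))
    true_counts = Counter(y_true)   # samples per true class
    pred_counts = Counter(y_pred)   # samples per predicted class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    correct = sum(hits.values())

    # Macro-F1: every class contributes equally, however rare it is.
    f1_scores = []
    for c in classes:
        precision = hits[c] / pred_counts[c] if pred_counts[c] else 0.0
        recall = hits[c] / true_counts[c] if true_counts[c] else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / len(f1_scores)

    # Multi-class MCC expressed through the confusion-matrix marginals.
    numerator = correct * n - sum(pred_counts[c] * true_counts[c] for c in classes)
    spread_pred = n * n - sum(pred_counts[c] ** 2 for c in classes)
    spread_true = n * n - sum(true_counts[c] ** 2 for c in classes)
    denom = math.sqrt(spread_pred * spread_true)
    mcc = numerator / denom if denom else 0.0

    return {"accuracy": correct / n, "macro_f1": macro_f1, "mcc": mcc}


# Toy fragment set: class 0 dominates, class 1 is rare and always missed.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 8 + [0, 2]
result = multiclass_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in result.items()})
# {'accuracy': 0.9, 'macro_f1': 0.647, 'mcc': 0.687}
```

In this toy case a single missed rare class leaves accuracy at 0.9 but pulls Macro-F1 down to about 0.65. If the MF50 datasets contain many small classes, the same effect would explain a large gap between the Accuracy and Macro-F1 columns, though the page itself does not state the class distribution.
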
Pairwise Functional Similarity Scoring
F50
Performance on fragment-level splits (F50). Metric is AUC (%). Higher is better.
| Model | Type | AUC (%) |
|---|---|---|
| ESM-IF | Seq-Structure | 96.5 |
| Foldseek (3Di-AA) | Alignment | 96.1 |
| Foldseek (3Di) | Alignment | 96.0 |
| SaProt (AF2_35M) | Seq-Structure | 95.8 |
| TM-align | Alignment | 94.6 |
| TM-VEC | Seq-Structure | 93.6 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 91.8 |
| ProstT5 | Seq-Structure | 90.8 |
| SaProt (PDB_650M) | Seq-Structure | 82.8 |
| ProtSSN (k20_h512) | Seq-Structure | 79.1 |
| ProtBert | Sequence-only | 71.4 |
| Ankh (base) | Seq-Enc-Dec | 69.6 |
| ESM2 (t30) | Sequence-only | 69.4 |
| ESM-1B | Sequence-only | 67.6 |
| MIF-ST | Seq-Structure | 65.9 |
| ESM2 (t36) | Sequence-only | 65.8 |
| BLAST | Alignment | 52.9 |
| ESM2 (t33) | Sequence-only | 50.2 |
| Model | Type | AUC (%) |
|---|---|---|
| ProstT5 | Seq-Structure | 99.5 |
| TM-VEC | Seq-Structure | 98.6 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 98.5 |
| SaProt (PDB_650M) | Seq-Structure | 98.1 |
| ESM-IF | Seq-Structure | 95.0 |
| SaProt (AF2_35M) | Seq-Structure | 94.3 |
| Foldseek (3Di) | Alignment | 92.6 |
| Foldseek (3Di-AA) | Alignment | 92.6 |
| TM-align | Alignment | 90.1 |
| Ankh (base) | Seq-Enc-Dec | 88.9 |
| ProtSSN (k20_h512) | Seq-Structure | 88.4 |
| MIF-ST | Seq-Structure | 86.1 |
| ProtBert | Sequence-only | 84.9 |
| ESM-1B | Sequence-only | 84.5 |
| ESM2 (t30) | Sequence-only | 77.6 |
| ESM2 (t33) | Sequence-only | 73.0 |
| ESM2 (t36) | Sequence-only | 71.3 |
| BLAST | Alignment | 52.4 |
| Model | Type | AUC (%) |
|---|---|---|
| Foldseek (3Di-AA) | Alignment | 88.4 |
| Foldseek (3Di) | Alignment | 88.3 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 71.0 |
| TM-align | Alignment | 67.7 |
| TM-VEC | Seq-Structure | 67.4 |
| ESM2 (t36) | Sequence-only | 63.9 |
| Ankh (base) | Seq-Enc-Dec | 63.9 |
| SaProt (PDB_650M) | Seq-Structure | 62.6 |
| SaProt (AF2_35M) | Seq-Structure | 61.9 |
| ESM-IF | Seq-Structure | 61.3 |
| MIF-ST | Seq-Structure | 61.3 |
| ProtSSN (k20_h512) | Seq-Structure | 60.9 |
| ESM-1B | Sequence-only | 57.0 |
| ProstT5 | Seq-Structure | 55.6 |
| ProtBert | Sequence-only | 54.6 |
| BLAST | Alignment | 54.0 |
| ESM2 (t30) | Sequence-only | 52.4 |
| ESM2 (t33) | Sequence-only | 49.3 |
| Model | Type | AUC (%) |
|---|---|---|
| TM-VEC | Seq-Structure | 99.4 |
| SaProt (PDB_650M) | Seq-Structure | 98.9 |
| ProstT5 | Seq-Structure | 98.5 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 98.2 |
| ESM2 (t33) | Sequence-only | 92.1 |
| ESM2 (t36) | Sequence-only | 90.1 |
| ESM-1B | Sequence-only | 87.2 |
| Ankh (base) | Seq-Enc-Dec | 86.7 |
| SaProt (AF2_35M) | Seq-Structure | 85.3 |
| ProtBert | Sequence-only | 85.1 |
| ESM2 (t30) | Sequence-only | 84.3 |
| ESM-IF | Seq-Structure | 80.4 |
| TM-align | Alignment | 76.6 |
| Foldseek (3Di) | Alignment | 74.8 |
| Foldseek (3Di-AA) | Alignment | 74.7 |
| ProtSSN (k20_h512) | Seq-Structure | 72.4 |
| MIF-ST | Seq-Structure | 50.2 |
| BLAST | Alignment | 49.9 |
| Model | Type | AUC (%) |
|---|---|---|
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 98.5 |
| ProstT5 | Seq-Structure | 98.5 |
| TM-VEC | Seq-Structure | 98.2 |
| Ankh (base) | Seq-Enc-Dec | 97.6 |
| ESM-IF | Seq-Structure | 97.1 |
| SaProt (AF2_35M) | Seq-Structure | 96.0 |
| SaProt (PDB_650M) | Seq-Structure | 91.7 |
| ESM-1B | Sequence-only | 89.2 |
| ProtBert | Sequence-only | 85.3 |
| ProtSSN (k20_h512) | Seq-Structure | 82.9 |
| MIF-ST | Seq-Structure | 78.6 |
| ESM2 (t30) | Sequence-only | 78.0 |
| ESM2 (t36) | Sequence-only | 66.5 |
| ESM2 (t33) | Sequence-only | 62.2 |
P50
Performance on protein-level splits (P50). Metric is AUC (%). Higher is better.
| Model | Type | AUC (%) |
|---|---|---|
| Foldseek (3Di) | Alignment | 96.5 |
| Foldseek (3Di-AA) | Alignment | 96.5 |
| Ankh (base) | Seq-Enc-Dec | 90.4 |
| TM-VEC | Seq-Structure | 89.9 |
| ProstT5 | Seq-Structure | 80.7 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 78.1 |
| SaProt (AF2_35M) | Seq-Structure | 74.6 |
| ESM-1B | Sequence-only | 73.8 |
| ESM2 (t36) | Sequence-only | 72.9 |
| BLAST | Alignment | 71.7 |
| ESM-IF | Seq-Structure | 70.2 |
| ESM2 (t33) | Sequence-only | 70.0 |
| ESM2 (t30) | Sequence-only | 69.2 |
| ProtBert | Sequence-only | 68.7 |
| SaProt (PDB_650M) | Seq-Structure | 68.2 |
| MIF-ST | Seq-Structure | 65.9 |
| ProtSSN (k20_h512) | Seq-Structure | 64.8 |
| Model | Type | AUC (%) |
|---|---|---|
| Ankh (base) | Seq-Enc-Dec | 91.8 |
| TM-VEC | Seq-Structure | 82.4 |
| Foldseek (3Di) | Alignment | 80.6 |
| Foldseek (3Di-AA) | Alignment | 80.1 |
| ProstT5 | Seq-Structure | 79.2 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 77.1 |
| SaProt (AF2_35M) | Seq-Structure | 71.9 |
| SaProt (PDB_650M) | Seq-Structure | 71.1 |
| ESM-1B | Sequence-only | 69.8 |
| ESM2 (t36) | Sequence-only | 67.6 |
| ProtBert | Sequence-only | 66.8 |
| ESM-IF | Seq-Structure | 65.6 |
| ESM2 (t30) | Sequence-only | 65.5 |
| ESM2 (t33) | Sequence-only | 62.3 |
| ProtSSN (k20_h512) | Seq-Structure | 61.2 |
| MIF-ST | Seq-Structure | 59.2 |
| BLAST | Alignment | 51.1 |
| Model | Type | AUC (%) |
|---|---|---|
| Foldseek (3Di) | Alignment | 99.0 |
| Foldseek (3Di-AA) | Alignment | 99.0 |
| Ankh (base) | Seq-Enc-Dec | 98.9 |
| ProstT5 | Seq-Structure | 98.2 |
| TM-VEC | Seq-Structure | 96.2 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 95.6 |
| SaProt (PDB_650M) | Seq-Structure | 93.8 |
| SaProt (AF2_35M) | Seq-Structure | 92.7 |
| ESM2 (t36) | Sequence-only | 92.1 |
| ESM-IF | Seq-Structure | 90.6 |
| ESM2 (t33) | Sequence-only | 89.0 |
| ESM-1B | Sequence-only | 88.4 |
| ESM2 (t30) | Sequence-only | 87.5 |
| ProtSSN (k20_h512) | Seq-Structure | 86.2 |
| ProtBert | Sequence-only | 84.2 |
| MIF-ST | Seq-Structure | 80.3 |
| Model | Type | AUC (%) |
|---|---|---|
| TM-VEC | Seq-Structure | 71.7 |
| ESM2 (t36) | Sequence-only | 70.0 |
| ProstT5 | Seq-Structure | 69.8 |
| Ankh (base) | Seq-Enc-Dec | 69.7 |
| SaProt (PDB_650M) | Seq-Structure | 68.3 |
| ProtBert | Sequence-only | 68.2 |
| ESM2 (t30) | Sequence-only | 68.2 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 67.6 |
| SaProt (AF2_35M) | Seq-Structure | 66.6 |
| MIF-ST | Seq-Structure | 66.3 |
| ESM2 (t33) | Sequence-only | 66.1 |
| ESM-IF | Seq-Structure | 66.0 |
| Foldseek (3Di) | Alignment | 64.9 |
| Foldseek (3Di-AA) | Alignment | 64.7 |
| ProtSSN (k20_h512) | Seq-Structure | 64.0 |
| ESM-1B | Sequence-only | 58.4 |
| BLAST | Alignment | 56.2 |
| Model | Type | AUC (%) |
|---|---|---|
| Ankh (base) | Seq-Enc-Dec | 88.5 |
| ProtT5 (xl-uniref50) | Seq-Enc-Dec | 85.1 |
| ProstT5 | Seq-Structure | 79.3 |
| SaProt (AF2_35M) | Seq-Structure | 78.8 |
| ProtBert | Sequence-only | 77.9 |
| ESM2 (t30) | Sequence-only | 77.4 |
| SaProt (PDB_650M) | Seq-Structure | 76.1 |
| ESM-1B | Sequence-only | 74.7 |
| ESM-IF | Seq-Structure | 70.5 |
| ProtSSN (k20_h512) | Seq-Structure | 69.4 |
| ESM2 (t36) | Sequence-only | 66.7 |
| MIF-ST | Seq-Structure | 66.7 |
| ESM2 (t33) | Sequence-only | 66.4 |
| TM-VEC | Seq-Structure | 59.9 |
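
The AUC (%) values in the F50 and P50 tables can be read as the probability, in percent, that a functionally matching pair receives a higher similarity score than a non-matching pair. The sketch below illustrates that reading. It assumes pairs are labelled 1 (same function) or 0 (different function) and that embedding-based models score a pair by cosine similarity; both are assumptions, since this page does not describe the scoring procedure, and alignment tools would supply their own scores. `cosine_similarity` and `pairwise_auc` are hypothetical helper names.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_auc(scores, labels):
    """AUC in percent: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative pair")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return 100.0 * wins / (len(pos) * len(neg))


# Toy similarity scores for 3 matching and 4 non-matching pairs.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.4]
labels = [1, 1, 1, 0, 0, 0, 0]
auc = pairwise_auc(scores, labels)
print(auc)  # 87.5
```

With this definition, a score that carries no information about functional similarity gives an AUC of about 50, which is the level of the lowest rows in several tables above. The quadratic loop is fine for a small illustration; a rank-based computation would be the usual choice for large pair sets.
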