The webserver provides an API for advanced users, as described on this page.
The base URL for API calls is https://gmgc.embl.de/api/v1.0/ and API calls return JSON (except where noted).
Also, note that the resources can all be downloaded for local processing. For large-scale analyses, that will be more efficient than repeatedly calling the API.
version - returns the version of the resource.
curl https://gmgc.embl.de/api/v1.0/version
{
    "gmgc-version": "1.0.0",
    "last-updated": "Jun 1 2020"
}
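The same kind of call can be made from a script. What follows is a minimal Python sketch using only the standard library; the helper names `api_url` and `get_json` are illustrative assumptions and are not part of the GMGC service.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://gmgc.embl.de/api/v1.0/"  # base URL given above


def api_url(*parts):
    """Join path components onto the base URL, percent-encoding each one."""
    return BASE_URL + "/".join(quote(str(part), safe="") for part in parts)


def get_json(*parts, timeout=30):
    """GET an endpoint and decode its JSON reply (performs a network request)."""
    with urllib.request.urlopen(api_url(*parts), timeout=timeout) as response:
        return json.load(response)
```

With these helpers, `get_json("version")` would request the same address as the curl command above.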
A lookup whose <identifier> matches a single unigene returns the information documented below. Alternatively, if the provided <identifier> matches more than one unigene, the reply will include a list of matches.
# Using an eggNOG identifier
curl https://gmgc.embl.de/api/v1.0/unigene/4PQJ6
{
  "matches": [
    {
      "source": "SAMN06172490",
      "biome": [
        "dog gut",
        "human skin",
        "cat gut"
      ],
      "taxonomy": "1262977",
      "id": "GMGC10.000_000_027.PEPT"
    },
    {
      "source": "SAMN06172460",
      "biome": [
        "dog gut"
      ],
      "taxonomy": "742823",
      "id": "GMGC10.000_186_864.PEPT"
    },
(...)
  ]
}
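Because a lookup may return either a single record or a list of matches, client code usually needs to handle both shapes. The sketch below assumes, based only on the examples on this page, that a multi-match reply carries a "matches" list whose entries have an "id" key and that a single-match reply has a "name" key; the function name `unigene_ids` is hypothetical.

```python
def unigene_ids(reply):
    """Return the unigene identifiers found in a decoded lookup reply.

    Assumption: multi-match replies contain a "matches" list of objects
    with an "id" key; single-match replies contain a "name" key.
    """
    if "matches" in reply:
        return [match["id"] for match in reply["matches"]]
    return [reply["name"]]
```

Normalizing to a list in this way lets the rest of a script iterate over the result without checking which kind of reply was received.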
unigene/<identifier> - returns information about the given unigene, including the number of samples where it was identified, its length, and its habitats.
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.054_598_380.SCLAV_5304
{
  "gene_family": "GMGC10.205_457_183.UNKNOWN",
  "cluster": "GMGC10.146_435_694.SCLAV_5304",
  "query": "GMGC10.054_598_380.SCLAV_5304",
  "name": "GMGC10.054_598_380.SCLAV_5304",
  "taxonomy": [
    {
      "name": "environmental samples",
      "id": "59619",
      "rank": "genus"
    }
  ],
  "samples": 573,
  "length": 88062,
  "habitat": [
    "human skin",
    "human gut"
  ],
  "genome_bins": 25,
  "strand": "+",
  "complete": 1
}
unigene/<identifier>/dna_sequence - returns the DNA sequence of the requested unigene
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.054_598_380.SCLAV_5304/dna_sequence
{
  "query": "GMGC10.054_598_380.SCLAV_5304",
  "name": "GMGC10.054_598_380.SCLAV_5304",
  "dna_sequence": "ATGAAGTTAGGGGAGAAAATAA..."
}
unigene/<identifier>/protein_sequence - returns the translated (amino acid) sequence of the requested unigene
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.054_598_380.SCLAV_5304/protein_sequence
{
  "query": "GMGC10.054_598_380.SCLAV_5304",
  "name": "GMGC10.054_598_380.SCLAV_5304",
  "protein_sequence": "MKLGEKIMRLGKKTSRAISIALL..."
}
unigene/<identifier>/features - returns intrinsic sequence features as well as annotations from Pfam, SMART and eggNOG
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.054_598_380.SCLAV_5304/features
{
  "features": {
    "intrinsic": [
      {
        "feature": "COIL",
        "end": 351,
        "start": 321
      }
    ],
    "pfam": [
      {
        "domain": "Pfam:RCC1",
        "evalue": 3.4e-06,
        "bitscore": 24.2,
        "end": 891,
        "start": 834
      },
      (...)
    ],
    "smart": [
      {
        "domain": "PbH1",
        "evalue": 4989.07360092172,
        "bitscore": 2.3,
        "end": 782,
        "start": 755
      },
      (...)
    ],
    "eggnog": {
      "cog_functional_category": "M",
      "eggnog_ogs": [
        "2IDH9@201174",
        "4D0CR@85004",
        "COG5184@1",
        "COG5184@2"
      ],
      "seed_ortholog_score": 340.5,
      "go_terms": [],
      "kegg_pathway": [],
      "bigg_reaction": [],
      "ec_number": "-",
      "cazy": "-",
      "kegg_reaction": [],
      "seed_eggnog_ortholog": "1394175.AWUN01000002_gene844",
      "predicted_protein_name": "-",
      "seed_ortholog_evalue": 7.8e-89,
      "kegg_ko": [],
      "eggnog_free_text_description": "Listeria-Bacteroides repeat domain (List_Bact_rpt)",
      "brite": [],
      "kegg_module": []
    }
  },
  "query": "GMGC10.054_598_380.SCLAV_5304",
  "name": "GMGC10.054_598_380.SCLAV_5304"
}
unigene/<identifier>/samples - returns the list of samples where the given unigene was identified.
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.054_598_380.SCLAV_5304/samples
{
  "samples": [
    "SAMEA1906425",
    "SAMEA1906421",
    "SAMEA1906417",
    (...)
  ],
  "query": "GMGC10.054_598_380.SCLAV_5304",
  "name": "GMGC10.054_598_380.SCLAV_5304"
}
unigene/<identifier>/genome_bins - returns the list of genome bins where the given unigene was identified.
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.054_598_380.SCLAV_5304/genome_bins
{
  "genome_bins": [
    "GMBC10.101_017",
    "GMBC10.160_410",
    "GMBC10.133_210",
    (...)
  ],
  "query": "GMGC10.054_598_380.SCLAV_5304",
  "name": "GMGC10.054_598_380.SCLAV_5304"
}
unigene/<identifier>/antibiotics - returns the list of antibiotics associated with the specified unigene
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.000_001_095.UGD/antibiotics
{
  "antiobiotics": [
    "actinomycin",
    "actinomycind",
    "arylomycin",
    (...)
  ],
  "query": "GMGC10.000_001_095.UGD",
  "name": "GMGC10.000_001_095.UGD"
}
unigene/<identifier>/aro_terms - returns the list of associated ARO/CARD identifiers
curl https://gmgc.embl.de/api/v1.0/unigene/GMGC10.000_001_095.UGD/aro_terms
{
  "aro_terms": [
    "ARO:1000001",
    "ARO:3000000",
    "ARO:3002984",
    "ARO:3003577",
    "ARO:3003580",
    "ARO:3004112",
    "ARO:3004269"
  ],
  "query": "GMGC10.000_001_095.UGD",
  "name": "GMGC10.000_001_095.UGD"
}
genome_bin/<genome_bin_id> - returns information about the requested genome_bin identifier
curl https://gmgc.embl.de/api/v1.0/genome_bin/GMBC10.001_023
{
  "min_contig_size": 3323,
  "name": "GMBC10.001_023",
  "contamination": 0,
  "genome": "SAMEA3708885.bin.14",
  "N50": 2611971,
  "total_bp_size": 2611971,
  "nr_contigs": 72,
  "GTDB_tk": "d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Lachnospirales;f__Lachnospiraceae;g__Lachnospira;s__Lachnospira rogosae",
  "category": "high-quality",
  "quality": 99.33,
  "completeness": 99.33,
  "max_contig_size": 243991
}
sample/<sample_id> - returns information about the specified sample
curl https://gmgc.embl.de/api/v1.0/sample/SAMEA3708885
{
  "ena_link": "https://www.ebi.ac.uk/ena/data/view/SAMEA3708885",
  "longitude": 12.568337,
  "habitat": "human gut",
  "latitude": 55.676097,
  "name": "SAMEA3708885"
}
There are also plural versions of the lookups above that work with POST:
unigenes
unigenes/features
unigenes/samples
unigenes/genome_bins
unigenes/dna_sequence
unigenes/protein_sequence
samples
genome_bins
These correspond to the calls above, except that they work for multiple inputs, passed in as JSON (with the correct content type), in the format: {"names": ["gene-A", "gene-B"]} (or sample A, or genome_bin A…).
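From a script, such a request could be prepared roughly as follows. This is a sketch using Python's standard library; the helper names `build_batch_request` and `post_batch` are illustrative assumptions, and the shape of the reply is not documented here, so it is returned as decoded without further processing.

```python
import json
import urllib.request

BASE_URL = "https://gmgc.embl.de/api/v1.0/"


def build_batch_request(endpoint, names):
    """Prepare (without sending) a POST request for one of the plural endpoints."""
    body = json.dumps({"names": list(names)}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_batch(endpoint, names, timeout=60):
    """Send the batch request and decode the JSON reply (network call)."""
    request = build_batch_request(endpoint, names)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending makes the payload and headers easy to inspect before any network traffic happens.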
E.g:
curl --header 'Content-Type: application/json' \
     --request POST \
     --data '{"names": ["GMGC10.003_873_867.PHOA", "GMGC10.016_471_114.PHOA"]}' \
     'https://gmgc.embl.de/api/v1.0/unigenes/genome_bins'
habitat/<habitat_name> - returns the list of samples and the subcatalog download URLs for the specified habitat
curl https://gmgc.embl.de/api/v1.0/habitat/human_gut
{
  "samples": [
    "SAMEA1906426",
    "SAMEA1906424",
    "SAMEA1906422",
    (...)
  ],
  "subcatalog_url": "https://gmgc.embl.de/downloads/v1.0/GMGC10.human-gut.95nr.fna.gz",
  "name": "human gut",
  "subcatalog_no_rare_url": "https://gmgc.embl.de/downloads/v1.0/GMGC10.human-gut.no-rare.95nr.fna.gz"
}
query/sequence - the call is a POST request with up to 50 sequences as an attached FASTA file. The attachment should be called fasta.
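Since a request accepts at most 50 sequences, it can be worth checking a FASTA file locally before submitting it. The sketch below counts header lines (those starting with ">"); the function names are hypothetical, and how the server itself reacts to oversized input is not described here.

```python
MAX_SEQUENCES = 50  # limit stated above


def count_fasta_records(text):
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def check_fasta(text):
    """Return the record count, or raise ValueError if it is 0 or above the limit."""
    n = count_fasta_records(text)
    if n == 0:
        raise ValueError("no FASTA records found")
    if n > MAX_SEQUENCES:
        raise ValueError(
            f"{n} sequences given, but at most {MAX_SEQUENCES} are accepted per request"
        )
    return n
```

Larger inputs would need to be split into batches of at most 50 records, or processed locally using the downloadable resources.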
For example, create test.fasta containing:
>MySeq
AALAMSALMALSJLAJLACAOSIJDAOSIJDALAASKJDASLKJALCEMALWPQRODASLKJALCKMALWPQRODASLKJ
ALCCKMALWPQRODASLKJALCKMALWPQROQUPJALSFAASLUFPASUFASFJA
and submit it with curl:
curl -X POST \
     -F 'fasta=@test.fasta' \
     -F 'mode=all' \
     -F 'return_seqs=true' \
     -F 'return_bins=true' \
     'https://gmgc.embl.de/api/v1.0/query/sequence'
Note that the algorithm will always return its best matches as hits, and it is the user’s responsibility to filter them appropriately (i.e., if no good matches exist in the catalog, the algorithm will still return something, but it will be returned with a high e-value).
Parameters:
mode: "all" or "besthit"
return_seqs: boolean
return_bins: boolean
The response has the following structure:
    {
        "results":
            [{
                "query_name": "Q1",
                "hits":
                    [ # hits is always a list, but if the request included `mode=besthit`, this will be a list of size one.
                    {"unigene_id": "GMGC10.000_000_000.NAME",
                     "evalue": 12e-23,
                     "bitscore": 232.2,
                        # sequences are only provided if the request included `return_seqs=true`
                     "dna_sequence": "ATTATACAA...",
                     "protein_sequence": "MEPATA...",
                        # Genome bins are only provided if the request included `return_bins=true`
                     "genome_bins":
                        [ "GMBC10.001_023"
                        , "GMBC10.202_232"
                        ]
                    },
                    (...)
                    ]
              }, {
                  "query_name": "Q2",
                  "hits": (...)
              }]
    }
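Because weak matches are still returned as hits, a client will typically filter the decoded response by e-value. The following sketch assumes the response structure shown above; the function name `filter_hits` and the default threshold of 1e-5 are illustrative choices, not values recommended by the GMGC documentation.

```python
def filter_hits(response, max_evalue=1e-5):
    """Map each query name to the hits whose e-value is at most max_evalue.

    The default threshold is an arbitrary illustration; pick one that
    suits the analysis at hand.
    """
    return {
        result["query_name"]: [
            hit for hit in result["hits"] if hit["evalue"] <= max_evalue
        ]
        for result in response["results"]
    }
```

Queries with no acceptable hit map to an empty list, which makes it easy to tell "no good match in the catalog" apart from a genuine hit.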