io.github.kuhumcst/dannet

DanNet - Danish WordNet with rich lexical relationships and SPARQL access.


![DanNet logo](/resources/public/images/dannet-logo-colour.svg)

[DanNet](https://cst.ku.dk/projekter/dannet/) is a [WordNet](https://en.wikipedia.org/wiki/WordNet) for the Danish language. DanNet uses [RDF](https://www.w3.org/RDF/) as its native representation at the database level, in the application space, and as its primary serialisation format.

- Browse the data at [wordnet.dk](https://wordnet.dk)
- Query the data by [integrating with AI](#llm-integration)
- Download datasets from the [releases page](https://github.com/kuhumcst/DanNet/releases)

## Table of Contents
- [Dataset Formats](#dataset-formats)
- [Companion Datasets](#companion-datasets)
- [Standards](#standards)
- [LLM Integration](#llm-integration) (AI)
- [Implementation](#implementation)
- [Setup](#setup)
- [Deployment](#deployment)
- [Database Release Workflow](#database-release-workflow)

## Dataset Formats

DanNet is available in multiple formats to maximise compatibility:

| Format | Description                                                                                                                                 |
|--------|---------------------------------------------------------------------------------------------------------------------------------------------|
| **RDF (Turtle)** | Native representation. Load into any RDF graph database (such as Apache Jena) and query with [SPARQL](https://en.wikipedia.org/wiki/SPARQL). |
| **CSV** | Published with column metadata as [CSVW](https://csvw.org/).                                                                                |
| **WN-LMF** | [XML format](https://globalwordnet.github.io/schemas/#xml) compatible with Python libraries like [wn](https://github.com/goodmami/wn).      |

### Example: Using DanNet with Python

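```python
import wn

# Add the downloaded WN-LMF release file to wn's local database.
wn.add("dannet-wn-lmf.xml.gz")

# Print the lexicographer file and definition of each synset for the word "kage".
for synset in wn.synsets('kage'):
    print((synset.lexfile() or "?") + ": " + (synset.definition() or "?"))
```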
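
The CSV release is described by CSVW metadata, which is plain JSON. As a rough sketch (not taken from the DanNet tooling), the column names of each table can be listed with the standard library alone. The `csvw_columns` helper and the sample metadata are hypothetical; the actual file names and columns should be taken from the release itself.

```python
import json

def csvw_columns(metadata: dict) -> dict:
    """Map each table URL in a CSVW metadata document to its column names."""
    # A CSVW document describes either a group of tables ("tables") or a single table.
    tables = metadata.get("tables", [metadata])
    result = {}
    for table in tables:
        schema = table.get("tableSchema")
        # "tableSchema" may also be a reference to an external file; skip it in that case.
        columns = schema.get("columns", []) if isinstance(schema, dict) else []
        # "name" is optional in CSVW, so fall back to "titles" or "?" when it is absent.
        result[table.get("url", "?")] = [
            str(col.get("name") or col.get("titles") or "?") for col in columns
        ]
    return result

# Hypothetical metadata for illustration only; not copied from a DanNet release.
example = json.loads("""
{
  "@context": "http://www.w3.org/ns/csvw",
  "tables": [
    {"url": "words.csv",
     "tableSchema": {"columns": [{"name": "id"}, {"name": "written_form"}]}}
  ]
}
""")

print(csvw_columns(example))
```

With a real release, the metadata file would be loaded with `json.load` instead of the inline sample.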

### Differences Between Formats

While every format includes all synsets/senses/words, the CSV and WN-LMF variants do *not* include every data point:
- **CSV**: Some data is lost when converting from an open graph to fixed tables.
- **WN-LMF**: Only official GWA relations are included per the standard (proprietary DanNet relations from the [DanNet schema](/resources/schemas/internal/dannet-schema.ttl) are excluded).

For the complete dataset, use the RDF format or browse at [wordnet.dk](https://wordnet.dk).

## Companion Datasets

Several companion datasets expand the RDF graph with additional data:

| Dataset | Description |
|---------|-------------|
| **COR** | Links DanNet resources to IDs from the COR project. |
| **DDS** | Adds sentiment data to DanNet resources. |
| **OEWN extension** | Provides DanNet-style labels for the [Open English WordNet](https://en-word.net/) to facilitate browsing connections between the two datasets. |

### Inferred Data

Additional data is implicitly inferred from the base dataset, companion datasets, and ontological metadata. These inferences can be browsed at [wordnet.dk](https://wordnet.dk). Releases containing fully inferred graphs are specifically marked as such.

## Standards

DanNet is based on the [Ontolex-lemon](https://www.w3.org/2016/05/ontolex/) standard combined with [relations](https://globalwordnet.github.io/gwadoc/) defined by the Global Wordnet Association as used in the official [GWA RDF standard](https://globalwordnet.github.io/schemas/#rdf).

| Ontolex-lemon class | Represents |
|---------------------|------------|
| `ontolex:LexicalConcept` | Synsets |
| `ontolex:LexicalSense` | Word senses |
| `ontolex:LexicalEntry` | Words |
| `ontolex:Form` | Forms |

![Ontolex-lemon representation](doc/ontolex.png "The Ontolex-lemon representation of a WordNet")
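
To illustrate how these classes surface in queries, the sketch below builds one SPARQL query per class that counts its instances, then encodes each query as a SPARQL 1.1 Protocol GET URL. It assumes the RDF dataset has already been loaded into a SPARQL-capable store. The endpoint address and the helper names `count_query` and `query_url` are assumptions made for this example, not part of DanNet.

```python
from urllib.parse import urlencode

# Namespace of the Ontolex-lemon core vocabulary.
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"

def count_query(class_name: str) -> str:
    """Build a SPARQL query counting the instances of one Ontolex-lemon class."""
    return (
        f"PREFIX ontolex: <{ONTOLEX}>\n"
        "SELECT (COUNT(?resource) AS ?total)\n"
        f"WHERE {{ ?resource a ontolex:{class_name} }}"
    )

def query_url(endpoint: str, query: str) -> str:
    """Encode a query as a SPARQL 1.1 Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})

# Hypothetical local endpoint, e.g. an Apache Jena Fuseki dataset named "dannet".
ENDPOINT = "http://localhost:3030/dannet/sparql"

for name in ("LexicalConcept", "LexicalSense", "LexicalEntry", "Form"):
    print(query_url(ENDPOINT, count_query(name)))
```

Each printed URL can be fetched with any HTTP client once a store is actually running at the assumed address.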

### URI Prefixes

| Prefix | URI | Purpose |
|--------|-----|---------|
| `dn` | https://wordnet.dk/dannet/data/ | Dataset instances |
| `dnc` | https://wordnet.dk/dannet/concepts/ | Ontological type members |
| `dns` | https://wordnet.dk/dannet/schema/ | Schema definitions |

All DanNet URIs resolve to HTTP resources. Accessing one of these URIs via a GET request returns the data for that resource.
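
As a minimal sketch of how the prefixes map to resolvable URIs, the code below expands a prefixed name and prepares a GET request for it. The helpers `expand` and `build_get`, the local name `synset-1`, and the `Accept: text/turtle` header are assumptions for illustration; whether the server honours that header is not stated here.

```python
from urllib.request import Request

# Prefix table copied from the section above.
PREFIXES = {
    "dn": "https://wordnet.dk/dannet/data/",
    "dnc": "https://wordnet.dk/dannet/concepts/",
    "dns": "https://wordnet.dk/dannet/schema/",
}

def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dn:some-id' into a full DanNet URI."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local

def build_get(curie: str, accept: str = "text/turtle") -> Request:
    """Prepare (but do not send) a GET request for the resource behind a prefixed name."""
    # The Accept header is an assumption; drop it to get the server's default response.
    return Request(expand(curie), headers={"Accept": accept}, method="GET")

# "synset-1" is a hypothetical local name, used only to show the expansion.
request = build_get("dn:synset-1")
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup.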

### Schemas

DanNet has proprietary relations defined in the [DanNet schema](/resources/schemas/internal/dannet-schema.ttl) in an Ontolex-compatible way. There is also a schema for [EuroWordNet concepts](/resources/schemas/internal/dannet-concepts.ttl). Both schemas follow the [RDF conventions](http://www-sop.inria.fr/acacia/personnel/phmartin/RDF/conventions.html#reversingRelations) listed by Philippe Martin.

## LLM Integration

DanNet can be connected to AI tools like Claude via MCP (Model Context Protocol).

- **MCP server URL**: `https://wordnet.dk/mcp`
- **Registry ID**: `io.github.kuhumcst/dannet`
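
For clients without a connector UI, the handshake can be sketched by hand. MCP messages are JSON-RPC 2.0, and a session starts with an `initialize` request. The example below only builds that request. It assumes the server speaks MCP's Streamable HTTP transport at the URL above, and the protocol version, client name, and client version are placeholder values.

```python
import json
from urllib.request import Request

MCP_URL = "https://wordnet.dk/mcp"

def initialize_request(client_name: str, protocol_version: str = "2025-03-26") -> Request:
    """Prepare (but do not send) an MCP 'initialize' request as a JSON-RPC 2.0 POST."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; the server may negotiate a different one.
            "protocolVersion": protocol_version,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.1.0"},
        },
    }
    return Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )

request = initialize_request("example-client")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

In practice an MCP client library or a desktop connector handles this exchange, so the sketch is mainly useful for checking connectivity.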

To connect in Claude Desktop, for example, go to `Settings > Connectors > Browse Connectors`, click "add a custom one", then enter a name (e.g. "DanNet") and the MCP server URL.

![Claude Desktop setup](reso