{ "cells": [ { "cell_type": "markdown", "id": "9bdebd80-43ef-452e-a27c-7b38f9847d0e", "metadata": {}, "source": [ "# Create a JSON Knowledge Graph representing a Text-Fabric dataset (N1904-TF)" ] }, { "cell_type": "markdown", "id": "462dc028-7cbc-43f8-b0ab-b857dc1afb1f", "metadata": {}, "source": [ "## Table of Contents (ToC)\n", "* 1 - Introduction\n", "* 2 - Load the TF dataset\n", "* 3 - Run part of the Doc4TF code\n", "* 4 - Run the extra code\n", "* 5 - The result: a JSON Knowledge Graph\n", "* 6 - Notebook version details" ] }, { "cell_type": "markdown", "id": "6d15763f-4f24-4828-8e5b-c806e471fbf9", "metadata": {}, "source": [ "# 1 - Introduction \n", "##### [Back to ToC](#TOC)\n", "\n", "In this notebook we will create the bare (JSON) Knowledge Graph. To create the source dictionary we will reuse part of the code I created for [Doc4TF](https://github.com/tonyjurg/Doc4TF)." ] }, { "cell_type": "markdown", "id": "1ebfc77b-cc01-49f0-931e-15e2bd4179ef", "metadata": {}, "source": [ "# 2 - Load the TF dataset \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "code", "execution_count": 1, "id": "d1cae453-890b-49bf-90aa-082ca82c36d7", "metadata": {}, "outputs": [ { "data": { "text/markdown": [ "**Locating corpus resources ...**" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "app: ~/text-fabric-data/github/CenterBLC/N1904/app" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "data: ~/text-fabric-data/github/CenterBLC/N1904/tf/1.0.0" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " TF: TF API 12.6.1, CenterBLC/N1904/app v3, Search Reference
\n", " Data: CenterBLC - N1904 1.0.0, Character table, Feature docs
\n", "
Node types\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", "\n", "
Name# of nodes# slots / node% coverage
book275102.93100
chapter260529.92100
verse794417.34100
sentence801117.20100
group89457.0146
clause425068.36258
wg1068686.88533
phrase690071.9095
subphrase1161781.60135
word1377791.00100
\n", " Sets: no custom sets
\n", " Features:
\n", "
Nestle 1904 Greek New Testament\n", "
\n", "\n", "
\n", "
\n", "after\n", "
\n", "
str
\n", "\n", " material after the end of the word\n", "\n", "
\n", "\n", "
\n", " \n", "
int
\n", "\n", " 1 if it is an apposition container\n", "\n", "
\n", "\n", "
\n", "
\n", "articular\n", "
\n", "
int
\n", "\n", " 1 if the sentence, group, clause, phrase or wg has an article\n", "\n", "
\n", "\n", "
\n", "
\n", "before\n", "
\n", "
str
\n", "\n", " this is XML attribute before\n", "\n", "
\n", "\n", "
\n", "
\n", "book\n", "
\n", "
str
\n", "\n", " book name (full name)\n", "\n", "
\n", "\n", "
\n", "
\n", "bookshort\n", "
\n", "
str
\n", "\n", " book name (abbreviated) from ref attribute in xml\n", "\n", "
\n", "\n", "
\n", "
\n", "case\n", "
\n", "
str
\n", "\n", " grammatical case\n", "\n", "
\n", "\n", "
\n", "
\n", "chapter\n", "
\n", "
int
\n", "\n", " chapter number, from ref attribute in xml\n", "\n", "
\n", "\n", "
\n", "
\n", "clausetype\n", "
\n", "
str
\n", "\n", " clause type\n", "\n", "
\n", "\n", "
\n", "
\n", "cls\n", "
\n", "
str
\n", "\n", " this is XML attribute cls\n", "\n", "
\n", "\n", "
\n", "
\n", "cltype\n", "
\n", "
str
\n", "\n", " clause type\n", "\n", "
\n", "\n", "
\n", "
\n", "criticalsign\n", "
\n", "
str
\n", "\n", " this is XML attribute criticalsign\n", "\n", "
\n", "\n", "
\n", "
\n", "crule\n", "
\n", "
str
\n", "\n", " clause rule (from xml attribute Rule)\n", "\n", "
\n", "\n", "
\n", "
\n", "degree\n", "
\n", "
str
\n", "\n", " grammatical degree\n", "\n", "
\n", "\n", "
\n", "
\n", "discontinuous\n", "
\n", "
int
\n", "\n", " 1 if the word is out of sequence in the xml\n", "\n", "
\n", "\n", "
\n", "
\n", "domain\n", "
\n", "
str
\n", "\n", " domain\n", "\n", "
\n", "\n", "
\n", "
\n", "framespec\n", "
\n", "
str
\n", "\n", " this is XML attribute framespec\n", "\n", "
\n", "\n", "
\n", "
\n", "function\n", "
\n", "
str
\n", "\n", " this is XML attribute function\n", "\n", "
\n", "\n", "
\n", "
\n", "gender\n", "
\n", "
str
\n", "\n", " grammatical gender\n", "\n", "
\n", "\n", "
\n", "
\n", "gloss\n", "
\n", "
str
\n", "\n", " English gloss (BGVB)\n", "\n", "
\n", "\n", "
\n", "
\n", "id\n", "
\n", "
str
\n", "\n", " xml id\n", "\n", "
\n", "\n", "
\n", "
\n", "junction\n", "
\n", "
str
\n", "\n", " type of junction\n", "\n", "
\n", "\n", "
\n", "
\n", "lang\n", "
\n", "
str
\n", "\n", " language the text is in\n", "\n", "
\n", "\n", "
\n", "
\n", "lemma\n", "
\n", "
str
\n", "\n", " lexical lemma\n", "\n", "
\n", "\n", "
\n", "
\n", "lemmatranslit\n", "
\n", "
str
\n", "\n", " transliteration of the word lemma\n", "\n", "
\n", "\n", "
\n", "
\n", "ln\n", "
\n", "
str
\n", "\n", " ln\n", "\n", "
\n", "\n", "
\n", "
\n", "mood\n", "
\n", "
str
\n", "\n", " verbal mood\n", "\n", "
\n", "\n", "
\n", "
\n", "morph\n", "
\n", "
str
\n", "\n", " morphological code\n", "\n", "
\n", "\n", "
\n", "
\n", "nodeid\n", "
\n", "
str
\n", "\n", " node id (as in the XML source data)\n", "\n", "
\n", "\n", "
\n", "
\n", "normalized\n", "
\n", "
str
\n", "\n", " lemma normalized\n", "\n", "
\n", "\n", "
\n", "
\n", "note\n", "
\n", "
str
\n", "\n", " annotation of linguistic nature\n", "\n", "
\n", "\n", "
\n", "
\n", "num\n", "
\n", "
int
\n", "\n", " generated number (not in xml): book: (Matthew=1, Mark=2, ..., Revelation=27); sentence: numbered per chapter; word: numbered per verse.\n", "\n", "
\n", "\n", "
\n", "
\n", "number\n", "
\n", "
str
\n", "\n", " grammatical number\n", "\n", "
\n", "\n", "
\n", "
\n", "otype\n", "
\n", "
str
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "person\n", "
\n", "
str
\n", "\n", " grammatical person\n", "\n", "
\n", "\n", "
\n", "
\n", "punctuation\n", "
\n", "
str
\n", "\n", " punctuation found after a word\n", "\n", "
\n", "\n", "
\n", "
\n", "ref\n", "
\n", "
str
\n", "\n", " biblical reference with word counting\n", "\n", "
\n", "\n", "
\n", "
\n", "referent\n", "
\n", "
str
\n", "\n", " number of referent\n", "\n", "
\n", "\n", "
\n", "
\n", "rela\n", "
\n", "
str
\n", "\n", " this is XML attribute rela\n", "\n", "
\n", "\n", "
\n", "
\n", "role\n", "
\n", "
str
\n", "\n", " role\n", "\n", "
\n", "\n", "
\n", "
\n", "rule\n", "
\n", "
str
\n", "\n", " syntactical rule\n", "\n", "
\n", "\n", "
\n", "
\n", "sp\n", "
\n", "
str
\n", "\n", " part-of-speach\n", "\n", "
\n", "\n", "
\n", "
\n", "strong\n", "
\n", "
int
\n", "\n", " strong number\n", "\n", "
\n", "\n", "
\n", "
\n", "subjrefspec\n", "
\n", "
str
\n", "\n", " this is XML attribute subjrefspec\n", "\n", "
\n", "\n", "
\n", "
\n", "tense\n", "
\n", "
str
\n", "\n", " verbal tense\n", "\n", "
\n", "\n", "
\n", "
\n", "text\n", "
\n", "
str
\n", "\n", " the text of a word\n", "\n", "
\n", "\n", "
\n", "
\n", "trailer\n", "
\n", "
str
\n", "\n", " material after the end of the word (excluding critical signs)\n", "\n", "
\n", "\n", "
\n", "
\n", "trans\n", "
\n", "
str
\n", "\n", " translation of the word surface text according to the Berean Interlinear Bible\n", "\n", "
\n", "\n", "
\n", "
\n", "translit\n", "
\n", "
str
\n", "\n", " transliteration of the word surface text\n", "\n", "
\n", "\n", "
\n", "
\n", "typ\n", "
\n", "
str
\n", "\n", " syntactical type (on sentence, group, clause or phrase)\n", "\n", "
\n", "\n", "
\n", "
\n", "typems\n", "
\n", "
str
\n", "\n", " morphological type (on word), syntactical type (on sentence, group, clause, phrase or wg)\n", "\n", "
\n", "\n", "
\n", "
\n", "unaccent\n", "
\n", "
str
\n", "\n", " word in unicode characters without accents and diacritical markers\n", "\n", "
\n", "\n", "
\n", "
\n", "unicode\n", "
\n", "
str
\n", "\n", " word in unicode characters plus material after it\n", "\n", "
\n", "\n", "
\n", "
\n", "variant\n", "
\n", "
str
\n", "\n", " this is XML attribute variant\n", "\n", "
\n", "\n", "
\n", "
\n", "verse\n", "
\n", "
int
\n", "\n", " verse number, from ref attribute in xml\n", "\n", "
\n", "\n", "
\n", "
\n", "voice\n", "
\n", "
str
\n", "\n", " verbal voice\n", "\n", "
\n", "\n", "
\n", "
\n", "frame\n", "
\n", "
str
\n", "\n", " frame\n", "\n", "
\n", "\n", "
\n", "
\n", "oslots\n", "
\n", "
none
\n", "\n", " \n", "\n", "
\n", "\n", "
\n", "
\n", "parent\n", "
\n", "
none
\n", "\n", " parent relationship between words\n", "\n", "
\n", "\n", "
\n", "
\n", "sibling\n", "
\n", "
int
\n", "\n", " this is XML attribute sibling\n", "\n", "
\n", "\n", "
\n", "
\n", "subjref\n", "
\n", "
none
\n", "\n", " number of subject referent\n", "\n", "
\n", "\n", "
\n", "
\n", "\n", " Settings:
specified
  1. apiVersion: 3
  2. appName: CenterBLC/N1904
  3. appPath: C:/Users/tonyj/text-fabric-data/github/CenterBLC/N1904/app
  4. commit: gdb630837ae89b9468c9e50d13bda05cfd3de4f18
  5. css: ''
  6. dataDisplay:
    • excludedFeatures: []
    • noneValues:
      • none
      • unknown
      • no value
      • NA
    • sectionSep1:
    • sectionSep2: :
    • textFormat: text-orig-full
  7. docs:
    • docBase: https://github.com/CenterBLC/N1904/tree/main/docs
    • docPage: about
    • docRoot: https://github.com/CenterBLC/N1904
    • featureBase:https://github.com/CenterBLC/N1904/blob/main/docs/features/<feature>.md
    • featurePage: README
  8. interfaceDefaults: {fmt: text-orig-full}
  9. isCompatible: True
  10. local: local
  11. localDir:C:/Users/tonyj/text-fabric-data/github/CenterBLC/N1904/_temp
  12. provenanceSpec:
    • branch: main
    • corpus: Nestle 1904 Greek New Testament
    • doi: 10.5281/zenodo.13117910
    • moduleSpecs: []
    • org: CenterBLC
    • relative: /tf
    • repo: N1904
    • repro: N1904
    • version: 1.0.0
    • webBase: https://learner.bible/text/show_text/nestle1904/
    • webHint: Show this on the website
    • webLang: en
    • webUrl:https://learner.bible/text/show_text/nestle1904/<1>/<2>/<3>
    • webUrlLex: {webBase}/word?version={version}&id=<lid>
  13. release: 1.0.0
  14. typeDisplay:
    • clause:
      • condense: True
      • label: {typ} {function} {rela} \\\\ {cls} {role} {junction}
      • style: ''
    • group:
      • label: {typ} {function} {rela} \\\\ {typems} {role} {rule}
      • style: ''
    • phrase:
      • condense: True
      • label: {typ} {function} {rela} \\\\ {typems} {role} {rule}
      • style: ''
    • sentence:
      • label: {typ} {function} {rela} \\\\ {role} {rule}
      • style: ''
    • subphrase:
      • label: {typ} {function} {rela} \\\\ {typems} {role} {rule}
      • style: ''
    • verse:
      • condense: True
      • label: {book} {chapter}:{verse}
      • style: ''
    • wg:
      • condense: True
      • label: {typems} {role} {rule} {junction}
      • style: ''
    • word:
      • features:
        • lemma
        • sp
      • featuresBare: [gloss]
  15. writing: grc
\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "Display is setup for viewtype [syntax-view](https://github.com/CenterBLC/N1904/blob/main/docs/syntax-view.md#start)" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "See [here](https://github.com/CenterBLC/N1904/blob/main/docs/viewtypes.md#start) for more information on viewtypes" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from tf.app import use\n", "from collections import defaultdict\n", "import json\n", "\n", "# Load the N1904 Text-Fabric dataset\n", "A = use('CenterBLC/N1904', version='1.0.0', hoist=globals())" ] }, { "cell_type": "markdown", "id": "937845a6-75dc-427b-8a3c-14669eb205aa", "metadata": {}, "source": [ "# 3 - Run part of the Doc4TF code \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "code", "execution_count": 2, "id": "d51d7b9b-8c04-4272-a4d2-46e2ad94097e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Gathering generic details\n", "Analyzing Node Features: ........................................................\n", "Analyzing Edge Features: .....\n", "Finished in 19.82 seconds.\n" ] } ], "source": [ "verbose=False\n", "tableLimit=10\n", "\n", "# Initialize an empty dictionary to store feature data\n", "featureDict = {}\n", "import time\n", "overallTime = time.time()\n", "\n", "def getFeatureDescription(metaData):\n", " \"\"\"\n", " This function looks for the 'description' key in the metadata dictionary. If the key is found,\n", " it returns the corresponding description. 
If the key is not present, it returns a default \n", " message indicating that no description is available.\n", "\n", " Parameters:\n", " metaData (dict): A dictionary containing metadata about a feature.\n", "\n", " Returns:\n", " str: The description of the feature if available, otherwise a default message.\n", " \"\"\"\n", " return metaData.get('description', \"No feature description\")\n", "\n", "def setDataType(metaData):\n", " \"\"\"\n", " This function checks for the 'valueType' key in the metadata. If the key is present, it\n", " returns 'String' if the value is 'str', and 'Integer' for other types. If the 'valueType' key\n", " is not present, it returns 'Unknown'.\n", "\n", " Parameters:\n", " metaData (dict): A dictionary containing metadata, including the 'valueType' of a feature.\n", "\n", " Returns:\n", " str: A string indicating the determined data type ('String', 'Integer', or 'Unknown').\n", " \"\"\"\n", " if 'valueType' in metaData:\n", " return \"String\" if metaData[\"valueType\"] == 'str' else \"Integer\"\n", " return \"Unknown\"\n", "\n", "def processFeature(feature, featureType, featureMethod):\n", " \"\"\"\n", " Processes a given feature by extracting metadata, description, and data type, and then\n", " compiles frequency data for different node types in a feature dictionary. Certain features\n", " are skipped based on their type. 
The processed data is added to a global feature dictionary.\n", "\n", " Parameters:\n", " feature (str): The name of the feature to be processed.\n", " featureType (str): The type of the feature ('Node' or 'Edge').\n", " featureMethod (function): A function to obtain feature data.\n", "\n", " Returns:\n", " None: The function updates a global dictionary with processed feature data and does not return anything.\n", " \"\"\"\n", " \n", " # Obtain the meta data\n", " featureMetaData = featureMethod(feature).meta\n", " featureDescription = getFeatureDescription(featureMetaData)\n", " dataType = setDataType(featureMetaData)\n", "\n", " # Initialize dictionary to store feature frequency data\n", " featureFrequencyDict = {}\n", "\n", " # Skip for specific features based on type\n", " if not (featureType == 'Node' and feature == 'otype') and not (featureType == 'Edge' and feature == 'oslots'):\n", " for nodeType in F.otype.all:\n", " frequencyLists = featureMethod(feature).freqList(nodeType)\n", " \n", " # Calculate the total frequency\n", " if not isinstance(frequencyLists, int):\n", " frequencyTotal = sum(freq for _, freq in frequencyLists)\n", " else:\n", " frequencyTotal = frequencyLists\n", " \n", " # Calculate the number of entries\n", " if not isinstance(frequencyLists, int):\n", " numberOfEntries = len(frequencyLists)\n", " else:\n", " numberOfEntries = 1 if frequencyLists != 0 else 0\n", " # Check the length of the frequency table\n", " truncated = True if numberOfEntries > tableLimit else False\n", " \n", " if not isinstance(frequencyLists, int):\n", " if len(frequencyLists)!=0:\n", " featureFrequencyDict[nodeType] = {'nodetype': nodeType, 'freq': frequencyLists[:tableLimit], 'total': frequencyTotal, 'truncated': truncated}\n", " elif isinstance(frequencyLists, int):\n", " if frequencyLists != 0:\n", " featureFrequencyDict[nodeType] = {'nodetype': nodeType, 'freq': [(\"Link\", frequencyLists)], 'total': frequencyTotal, 'truncated': truncated}\n", "\n", " # Add 
processed feature data to the main dictionary\n", " featureDict[feature] = {'name': feature, 'descr': featureDescription, 'type': featureType, 'datatype': dataType, 'freqlist': featureFrequencyDict}\n", " \n", "########################################################\n", "# MAIN FUNCTION #\n", "########################################################\n", "\n", "########################################################\n", "# Gather general information #\n", "########################################################\n", "\n", "print('Gathering generic details')\n", "\n", "# Initialize default values\n", "corpusName = A.appName\n", "liveName = ''\n", "versionName = A.version\n", "\n", "# Trying to locate corpus information\n", "if A.provenance:\n", " for parts in A.provenance[0]: \n", " if isinstance(parts, tuple):\n", " key, value = parts[0], parts[1]\n", " if verbose: print (f'General info: {key}={value}')\n", " if key == 'corpus': corpusName = value\n", " if key == 'version': versionName = value\n", " # value for live is a tuple\n", " if key == 'live': liveName=value[1]\n", "if liveName is not None and len(liveName)>1:\n", " # an URL was found\n", " pageTitleMD = f'Doc4TF pages for [{corpusName}]({liveName}) (version {versionName})'\n", " pageTitleHTML = f'

Doc4TF pages for {corpusName} (version {versionName})

'\n", "else:\n", " # No URL found\n", " pageTitleMD = f'Doc4TF pages for {corpusName} (version {versionName})'\n", " pageTitleHTML = f'

Doc4TF pages for {corpusName} (version {versionName})

'\n", "\n", "# Overwrite in case user provided a title\n", "if 'customPageTitleMD' in globals():\n", " pageTitleMD = customPageTitleMD\n", "if 'customPageTitleHTML' in globals():\n", " pageTitleHTML = customPageTitleHTML\n", "\n", " \n", "########################################################\n", "# Processing node features #\n", "########################################################\n", "\n", "print('Analyzing Node Features: ', end='')\n", "for nodeFeature in Fall():\n", " if not verbose: print('.', end='') # Progress indicator\n", " processFeature(nodeFeature, 'Node', Fs)\n", " if verbose: print(f'\\nFeature {nodeFeature} = {featureDict[nodeFeature]}\\n') # Print feature data if verbose\n", "\n", "########################################################\n", "# Processing edge features #\n", "########################################################\n", "\n", "print('\\nAnalyzing Edge Features: ', end='')\n", "for edgeFeature in Eall():\n", " if not verbose: print('.', end='') # Progress indicator\n", " processFeature(edgeFeature, 'Edge', Es)\n", " if verbose: print(f'\\nFeature {edgeFeature} = {featureDict[edgeFeature]}\\n') # Print feature data if verbose\n", "\n", "########################################################\n", "# Sorting feature dictionary #\n", "########################################################\n", "\n", "# Sort the feature dictionary alphabetically by keys\n", "sortedFeatureDict = {k: featureDict[k] for k in sorted(featureDict)}\n", "\n", "# Print the sorted feature dictionary if verbose\n", "if verbose:\n", " print(\"\\nSorted Feature Dictionary:\")\n", " for key, value in sortedFeatureDict.items():\n", " print(f\"Feature {key} = {value}\")\n", " \n", "print(f'\\nFinished in {time.time() - overallTime:.2f} seconds.')" ] }, { "cell_type": "markdown", "id": "a781c10a-7e95-46f2-a9b9-f94e725b5b3a", "metadata": {}, "source": [ "# 4 - Run the extra code \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "code", "execution_count": null, "outputs": [], "id": 
"e3018ede-7247-4b94-8aba-649b6e41b6c9", "metadata": {}, "source": [ "import json\n", "\n", "knowledgeGraph = {\n", " \"nodes\": {},\n", " \"edges\": []\n", "}\n", "\n", "for featName, featInfo in featureDict.items():\n", " # Determine if \"Node\" or \"Edge\" feature\n", " featureKind = featInfo.get(\"type\", \"Node\") # \"Node\" or \"Edge\"\n", " if featureKind.lower() == \"edge\":\n", " featureType = \"edge_feature\"\n", " else:\n", " featureType = \"node_feature\"\n", "\n", " # Build a namespaced key for this feature\n", " featureKey = f\"feature::{featName}\"\n", "\n", " # Make sure the feature node is in the graph\n", " nodeEntry = knowledgeGraph[\"nodes\"].setdefault(featureKey, {\n", " \"type\": featureType,\n", " \"valid_on\": []\n", " })\n", "\n", " # Store more metadata about the feature\n", " nodeEntry[\"featureName\"] = featInfo.get(\"name\", featName) # e.g. \"after\"\n", " nodeEntry[\"description\"] = featInfo.get(\"descr\", \"\") # e.g. \"material after the end of ...\"\n", " nodeEntry[\"datatype\"] = featInfo.get(\"datatype\", \"\") # e.g. 
\"String\"\n", "\n", " # Collect node types from the freqlist\n", " freqInfo = featInfo.get(\"freqlist\", {})\n", " for freqKey, freqDict in freqInfo.items():\n", " # freqKey might be \"phrase\", \"word\", etc.\n", " # freqDict has \"nodetype\": \"phrase\" (or \"word\"), plus \"freq\", \"total\", ...\n", " nodeTypeName = freqDict.get(\"nodetype\", freqKey)\n", "\n", " # Build a namespaced key for this node type\n", " nodeTypeKey = f\"otype::{nodeTypeName}\"\n", "\n", " # Make sure that node type is declared\n", " if nodeTypeKey not in knowledgeGraph[\"nodes\"]:\n", " knowledgeGraph[\"nodes\"][nodeTypeKey] = {\n", " \"type\": \"node_type\",\n", " \"origName\": nodeTypeName\n", " }\n", "\n", " # Record that this feature is valid on this node type\n", " if nodeTypeKey not in nodeEntry[\"valid_on\"]:\n", " nodeEntry[\"valid_on\"].append(nodeTypeKey)\n", "\n", " # Add an edge with frequency detail\n", " knowledgeGraph[\"edges\"].append({\n", " \"from\": featureKey,\n", " \"to\": nodeTypeKey,\n", " \"relation\": \"valid on\",\n", " \"freqDetail\": freqDict\n", " })\n", "\n", "# Output the JSON\n", "outputPath = \"n1904_knowledge_graph.json\"\n", "with open(outputPath, \"w\", encoding=\"utf-8\") as f:\n", " json.dump(knowledgeGraph, f, indent=2)\n", "\n", "print(f\"Knowledge graph saved to {outputPath}\")\n", "\n", "# Summary\n", "numNodeTypes = sum(1 for n, d in knowledgeGraph[\"nodes\"].items() if d[\"type\"] == \"node_type\")\n", "numFeatures = sum(1 for n, d in knowledgeGraph[\"nodes\"].items() if d[\"type\"].endswith(\"_feature\"))\n", "numEdges = len(knowledgeGraph[\"edges\"])\n", "print(f\" - Node types: {numNodeTypes}\")\n", "print(f\" - Features: {numFeatures}\")\n", "print(f\" - Edges: {numEdges}\")" ] }, { "cell_type": "markdown", "id": "1fa83632-db4a-49f6-8740-418183b986f2", "metadata": {}, "source": [ "# 5 - The result: a JSON Knowledge Graph \n", "##### [Back to ToC](#TOC)" ] }, { "cell_type": "markdown", "id": "80a020d1-5a7e-46a0-9c78-83ddcfc348f4", 
"metadata": {}, "source": [ "The resulting JSON is the actual Knowledge Graph which will be used as input for the [other notebook](generate_cytoscape_html.ipynb)." ] }, { "cell_type": "markdown", "id": "0eba060e-a680-4da1-a545-546584aa6214", "metadata": { "jp-MarkdownHeadingCollapsed": true }, "source": [ "# 6 - Notebook version details\n", "##### [Back to ToC](#TOC)\n", "\n", "
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AuthorTony Jurg
Version1.1
Date3 April 2025
\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }