heritage package
Submodules
heritage.cli module
Console script for Heritage.
heritage.constants module
Constants
heritage.heritage module
Python Interface to The Sanskrit Heritage Site
The Sanskrit Heritage Platform can be used in two ways:
Web mirror - no installation required - makes HTTP requests
Local installation - faster - uses console - no HTTP requests required
Using Local Installation
Heritage_Platform/ML/ contains the scripts
export QUERY_STRING as a shell variable (referred to as OPTION_STRING in this code, along with the ‘&text=TEXT’ part)
execute various scripts, such as ./reader
still produces HTML output that needs to be parsed
Default input needs to be in Devanagari. The utils.devanagari_to_velthuis() function converts it to Velthuis (VH).
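The steps above can be sketched in Python as follows. This is a minimal illustration, not the package's implementation: the helper name build_query_string is hypothetical, and URL-encoding the values is an assumption. The option names t, lex and font come from HeritagePlatform.OPTIONS, and text from the ‘&text=TEXT’ part mentioned above.

```python
import os
from urllib.parse import urlencode


def build_query_string(options: dict, text: str) -> str:
    """Join script options and the input text into a CGI-style query string."""
    # Hypothetical helper: the text is appended last, as the '&text=TEXT' part.
    return urlencode({**options, "text": text})


# Velthuis input is assumed here; Devanagari input would be converted first.
query_string = build_query_string({"t": "VH", "lex": "MW", "font": "deva"}, "raama")

# Environment that a local script such as ./reader could be started with.
script_env = {**os.environ, "QUERY_STRING": query_string}
```

The script itself would then be started with script_env as its environment, and its HTML output parsed.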
- heritage.heritage.freezeargs(func)[source]
Transform mutable dictionary arguments into immutable frozen ones
Useful for compatibility with @cache. Should be added on top of @cache.
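A decorator of this kind might look like the sketch below. It is not the package's source: FrozenDict and freeze_dict_args are hypothetical names, and functools.lru_cache(maxsize=None) stands in for @cache. The point is that a plain dict is unhashable and so cannot be part of a cache key, whereas a hashable copy can.

```python
import functools


class FrozenDict(dict):
    """A dict that is hashable, so it can be part of a cache key."""

    def __hash__(self):
        # Assumes every value stored in the dict is itself hashable.
        return hash(frozenset(self.items()))


def freeze_dict_args(func):
    """Replace dict arguments with hashable FrozenDict copies."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = tuple(FrozenDict(a) if isinstance(a, dict) else a for a in args)
        kwargs = {
            k: FrozenDict(v) if isinstance(v, dict) else v
            for k, v in kwargs.items()
        }
        return func(*args, **kwargs)

    return wrapper


calls = []  # records how often the wrapped function body actually runs


@freeze_dict_args  # outermost, so dicts are frozen before the cache sees them
@functools.lru_cache(maxsize=None)
def fetch(action, options):
    calls.append(action)
    return f"{action}:{sorted(options.items())}"


fetch("reader", {"t": "VH"})
fetch("reader", {"t": "VH"})  # served from the cache; the body does not run again
```

Placing the freezing decorator above the cache matters: in the opposite order, the cache would receive the raw dict and raise a TypeError when hashing it.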
- class heritage.heritage.HeritageAnalysis(case: str = None, number: str = None, gender: str = None, tense: str = None)[source]
Bases: object
- case: str = None
- number: str = None
- gender: str = None
- tense: str = None
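The signature and attribute defaults above are consistent with a simple record type. The sketch below models it as a dataclass, which is an assumption about the implementation; AnalysisSketch is a hypothetical stand-in and the field values are placeholders rather than actual platform output.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AnalysisSketch:
    """Hypothetical stand-in with the same four optional fields."""

    case: Optional[str] = None
    number: Optional[str] = None
    gender: Optional[str] = None
    tense: Optional[str] = None


# Placeholder values: a nominal analysis leaves the tense field unset.
noun = AnalysisSketch(case="nom.", number="sg.", gender="m.")
noun_fields = asdict(noun)
```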
- class heritage.heritage.HeritageOutput(html: str)[source]
Bases: object
Heritage Output Parser
Parse output generated by various utilities from Heritage Platform
- CLASSES = {'footer': ['enpied']}
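As an illustration of what parsing this output can involve, the sketch below drops every element carrying a footer class and collects the remaining text. It uses only the standard library's html.parser; FooterStripper is a hypothetical name, the sample HTML is invented, and the assumption that ‘enpied’ appears as an HTML class attribute is inferred from the CLASSES mapping above rather than confirmed.

```python
from html.parser import HTMLParser

FOOTER_CLASSES = {"enpied"}  # taken from CLASSES['footer'] above
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # never have end tags


class FooterStripper(HTMLParser):
    """Collect text chunks, skipping anything nested inside a footer element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a footer element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.skip_depth or classes & FOOTER_CLASSES:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


# Invented sample; real platform output is considerably more involved.
sample_html = (
    "<body><p>raama.h</p>"
    '<div class="enpied"><a href="#">Top</a><br> | Footer</div></body>'
)
parser = FooterStripper()
parser.feed(sample_html)
parser.close()
```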
- class heritage.heritage.HeritagePlatform(base_dir: str = '', base_url: Optional[str] = None, method: str = 'shell', **kwargs)[source]
Bases: object
The Sanskrit Heritage Platform
Access various utilities from The Sanskrit Heritage Platform
Initialize Heritage Class
- Parameters:
base_dir (str) – Path to the Heritage_Platform repository. The directory should contain ‘ML’ sub-directory, which further contains the scripts
base_url (str, optional) – URL for the Heritage Platform Mirror. If None, the official INRIA website will be used. The default is None.
method (str, optional) –
Method used to obtain results. Results can be obtained either using the web installation or using UNIX shell.
Possible values are ‘shell’ and ‘web’. The default is ‘shell’.
- INRIA_URL = 'https://sanskrit.inria.fr/cgi-bin/SKT/'
- ACTIONS = {'conjugation': {'shell': 'conjugation', 'web': 'sktconjug.cgi'}, 'declension': {'shell': 'declension', 'web': 'sktdeclin.cgi'}, 'dictionary': {'shell': '../MW/', 'web': '../../MW/'}, 'interface': {'shell': 'interface', 'web': 'sktgraph.cgi'}, 'lemma': {'shell': 'lemmatizer', 'web': 'sktlemmatizer.cgi'}, 'parser': {'shell': 'parser', 'web': 'sktparser.cgi'}, 'reader': {'shell': 'reader', 'web': 'sktreader.cgi'}, 'sandhi': {'shell': 'sandhier', 'web': 'sktsandhier.cgi'}, 'search': {'shell': 'indexer', 'web': 'sktindex.cgi'}, 'search_easy': {'shell': 'indexerd', 'web': 'sktsearch.cgi'}, 'user': {'shell': 'user_aid', 'web': 'sktuser.cgi'}}
- OPTIONS = {'font': {'default': 'deva', 'description': 'Font for Sanskrit output', 'values': {'deva': 'Devanagari', 'roma': 'Roman (IAST)'}}, 'lex': {'default': 'MW', 'description': 'Lexicon', 'values': {'MW': 'Monier-Williams Dictionary (English)', 'SH': 'Sanskrit Heritage Dictionary (French)'}}, 't': {'default': 'VH', 'description': 'Internal Transliteration Scheme', 'values': {'VH': 'Velthuis'}}}
- METHODS = ['shell', 'web']
- DEFAULT_METHOD = 'shell'
- __init__(base_dir: str = '', base_url: Optional[str] = None, method: str = 'shell', **kwargs)[source]
Initialize Heritage Class
- Parameters:
base_dir (str) – Path to the Heritage_Platform repository. The directory should contain ‘ML’ sub-directory, which further contains the scripts
base_url (str, optional) – URL for the Heritage Platform Mirror. If None, the official INRIA website will be used. The default is None.
method (str, optional) –
Method used to obtain results. Results can be obtained either using the web installation or using UNIX shell.
Possible values are ‘shell’ and ‘web’. The default is ‘shell’.
- get_analysis(input_text: str, sentence: bool = True, unsandhied: bool = False, meta: bool = False)[source]
Obtain morphological analyses using The Sanskrit Reader Companion
- Parameters:
input_text (str) – Input text to analyse
sentence (bool, optional) – If True, the input is treated as a sentence; otherwise, it is treated as a word. The default is True.
unsandhied (bool, optional) – If True, the input text is assumed to not contain sandhi. The default is False.
meta (bool, optional) – The option is passed to HeritageOutput.extract_analysis(). The default is False.
- Returns:
Dictionary of valid morphological analyses with solution_id as keys
- Return type:
dict
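A possible way to call this method through the web mirror is sketched below, using only the constructor and method signatures documented on this page. The wrapper functions analyse_sentence and list_solution_ids are hypothetical, and the structure of each analysis value is not specified here, so the sketch only lists the solution IDs.

```python
def analyse_sentence(text: str) -> dict:
    """Analyse a Devanagari sentence through the web mirror."""
    # Imported lazily so that this sketch loads without the package installed.
    from heritage.heritage import HeritagePlatform

    # With base_url left as None, the official INRIA website is used.
    platform = HeritagePlatform(method="web")
    return platform.get_analysis(text, sentence=True, unsandhied=False)


def list_solution_ids(analyses: dict) -> list:
    """Return the solution IDs, which are the keys of the result."""
    return list(analyses)
```

Calling analyse_sentence() issues HTTP requests, so it needs network access.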
- get_parse(input_text: str, solution_id: Optional[int] = None, sentence: bool = True, unsandhied: bool = False)[source]
Obtain parse of a sentence using The Sanskrit Reader Companion
- Parameters:
input_text (str) – Input text to analyse
solution_id (int, optional) – Solution ID to parse. If None, the first solution ID is used. The default is None.
sentence (bool, optional) – If True, the input is treated as a sentence; otherwise, it is treated as a word. The option is passed to HeritagePlatform.get_analysis(). The default is True.
unsandhied (bool, optional) – If True, the input text is assumed to not contain sandhi. The option is passed to HeritagePlatform.get_analysis(). The default is False.
- Returns:
Parse of the sentence
- Return type:
dict
- sandhi(word_1: str, word_2: str, mode: str = 'internal')[source]
Join two words by forming a Sandhi
- Parameters:
word_1 (str) – The first (left) word in the Sandhi
word_2 (str) – The second (right) word in the Sandhi
mode (str, optional) –
Indicates whether the words join to form a single word or not. Possible values are,
internal
external
The default is ‘internal’.
- Returns:
sandhi – String obtained by forming the Sandhi
- Return type:
str
- search_inflected_form(word: str, category: str)[source]
Search an inflected form
- Parameters:
word (str) – Sanskrit Word to search (in Devanagari)
category (str) –
Type of the word
Noun: Noun
Pron: Pronoun
Part: Participle
Inde: Indeclinable
Absya, Abstvaa, Voca, Iic, Ifc, Iiv, Piic etc.
- Returns:
matches – List of matches.
- Return type:
list
- get_declensions(word: str, gender: str, headers: bool = True, lexicon: Optional[str] = None)[source]
- search_lexicon(word: str, lexicon: Optional[str] = None)[source]
Search a word in the dictionary
- Parameters:
word (str) – Sanskrit Word to search (in Devanagari)
lexicon (str, optional) –
Lexicon to search the word in. Possible values are,
MW: Monier-Williams Dictionary
SH: Heritage Dictionary
The default is ‘MW’.
- Returns:
matches – List of matches.
- Return type:
list
- get_result_from_web(url: str, options: dict, attempts: int = 3)[source]
Get results from the Heritage Platform web mirror. Exponential backoff is used in case there are network errors.
- Parameters:
url (str) – URL of the CGI script to call. HeritagePlatform.get_url() can be used to generate supported URLs.
options (dict) – Dictionary containing valid options for the script
attempts (int, optional) – Number of attempts for the exponential backoff. The default is 3.
- Returns:
Result (HTML) obtained
- Return type:
str
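Retrying with exponential backoff can be sketched as below. This is an illustration under stated assumptions, not the package's code: fetch_with_backoff, base_delay, opener and sleep are hypothetical names, the options are assumed to be sent as a GET query string, and the wait is assumed to double after each failed attempt. The opener and sleep parameters exist only so the retry logic can be exercised without a network.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def fetch_with_backoff(url, options, attempts=3, base_delay=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url with options as query string, retrying on network errors."""
    full_url = f"{url}?{urllib.parse.urlencode(options)}"
    for attempt in range(attempts):
        try:
            with opener(full_url, timeout=30) as response:
                return response.read().decode("utf-8")
        except (urllib.error.URLError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, a request that keeps failing waits 1 second and then 2 seconds before the error from the third attempt is raised.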
- get_result_from_shell(path: str, options: dict, timeout: int = 30)[source]
Get results from the Heritage Platform’s local installation via shell
- Parameters:
path (str) – Path to the executable script. HeritagePlatform.get_path() can be used to generate supported paths.
options (dict) – Valid options for the script
timeout (int, optional) – Timeout in seconds, after which the function will abort. The default is 30.
- Returns:
result – Result (HTML) obtained
- Return type:
str
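Running a script with QUERY_STRING in its environment and a timeout can be sketched with subprocess. The function name run_with_query_string is hypothetical, and a child Python process that echoes the variable stands in for a Heritage script so that the example runs without a local installation.

```python
import os
import subprocess
import sys


def run_with_query_string(command, query_string, timeout=30):
    """Run command with QUERY_STRING set and return its standard output."""
    env = {**os.environ, "QUERY_STRING": query_string}
    completed = subprocess.run(
        command,
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired when exceeded
        check=True,       # raises CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Stand-in for a Heritage script: it only echoes the variable it receives.
echo_command = [
    sys.executable,
    "-c",
    "import os; print(os.environ['QUERY_STRING'])",
]
output = run_with_query_string(echo_command, "t=VH&text=raama")
```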
- get_result(action: str, options: dict, *args, **kwargs)[source]
High-level function to obtain result for various actions
Avoids the hassle of generating the URL or PATH. Utilizes the HeritagePlatform.method attribute to determine whether to fetch through shell or web.
- Parameters:
action (str) – Action value corresponding to the utility to be used. Refer to HeritagePlatform.ACTIONS
options (dict) – Valid options for the specified action
- Returns:
Result (HTML) obtained
- Return type:
str
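The dispatch described above amounts to looking up the action in HeritagePlatform.ACTIONS and turning the entry into either a URL or a path. The sketch below reproduces that lookup with a two-entry excerpt of the table. The function resolve_target is hypothetical, as is the rule for joining the parts (URL join for the web method, base_dir/ML/script for the shell method).

```python
import posixpath
from urllib.parse import urljoin

INRIA_URL = "https://sanskrit.inria.fr/cgi-bin/SKT/"
METHODS = ["shell", "web"]
# Excerpt of HeritagePlatform.ACTIONS
ACTIONS = {
    "reader": {"shell": "reader", "web": "sktreader.cgi"},
    "sandhi": {"shell": "sandhier", "web": "sktsandhier.cgi"},
}


def resolve_target(action, method, base_dir="", base_url=None):
    """Return the URL (web) or script path (shell) for an action."""
    if method not in METHODS:
        raise ValueError(f"Unknown method: {method!r}")
    if action not in ACTIONS:
        raise ValueError(f"Unknown action: {action!r}")
    name = ACTIONS[action][method]
    if method == "web":
        # The base URL must end with '/' for urljoin to append the script name.
        return urljoin(base_url or INRIA_URL, name)
    # Assumed layout: the scripts live in the 'ML' sub-directory of base_dir.
    return posixpath.join(base_dir, "ML", name)


web_target = resolve_target("reader", "web")
# '/opt/Heritage_Platform' is a made-up installation path.
shell_target = resolve_target("sandhi", "shell", base_dir="/opt/Heritage_Platform")
```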
- set_method(method: str)[source]
Set method for fetching the output
Valid methods are listed in HeritagePlatform.METHODS
- set_option(opt_name: str, opt_value: str)[source]
Set global options
Any of these options, if expected by a particular utility from the Heritage Platform, will be directly used in the QUERY_STRING while fetching the output from that utility.
The class variable OPTIONS stores the default values for options.
Each option contains,
a ‘description’ of the option
‘values’ it can take (and descriptions of those values)
‘default’ value
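The structure of OPTIONS lends itself to deriving defaults and checking values. The sketch below does both on a copy of the table without the ‘description’ entries. The functions default_options and set_option_sketch are hypothetical, and whether the real set_option() rejects unknown names or values is an assumption.

```python
OPTIONS = {
    "font": {"default": "deva",
             "values": {"deva": "Devanagari", "roma": "Roman (IAST)"}},
    "lex": {"default": "MW",
            "values": {"MW": "Monier-Williams Dictionary (English)",
                       "SH": "Sanskrit Heritage Dictionary (French)"}},
    "t": {"default": "VH", "values": {"VH": "Velthuis"}},
}


def default_options() -> dict:
    """Collect the 'default' value of every option."""
    return {name: spec["default"] for name, spec in OPTIONS.items()}


def set_option_sketch(current: dict, opt_name: str, opt_value: str) -> dict:
    """Return a copy of current with one option changed, after validation."""
    if opt_name not in OPTIONS:
        raise ValueError(f"Unknown option: {opt_name!r}")
    if opt_value not in OPTIONS[opt_name]["values"]:
        raise ValueError(f"Invalid value {opt_value!r} for option {opt_name!r}")
    return {**current, opt_name: opt_value}


options = set_option_sketch(default_options(), "font", "roma")
```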
heritage.utils module
Utility Functions
- heritage.utils.devanagari_to_velthuis(text: str) → str[source]
Convert Devanagari text to Velthuis
The Heritage Platform uses its own Devanagari (DN) to Velthuis (VH) conversion, which deviates from the standard one (from Wiki or other sources). The following is a translation of the JS function convert() from the Heritage Platform. Source URL: https://sanskrit.inria.fr/DICO/utf82VH.js
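The general shape of such a conversion is sketched below with a deliberately tiny table of standard Velthuis values. It is not the platform's table, which deviates from the standard as noted above, and to_velthuis_sketch is a hypothetical name. The sketch shows the one non-trivial rule: a Devanagari consonant carries an inherent ‘a’ unless a vowel sign or a virama follows it.

```python
# Tiny illustrative subset of standard Velthuis, NOT the Heritage table.
CONSONANTS = {"क": "k", "ग": "g", "न": "n", "म": "m", "र": "r", "व": "v"}
VOWELS = {"अ": "a", "आ": "aa", "इ": "i", "उ": "u"}
VOWEL_SIGNS = {"ा": "aa", "ि": "i", "ु": "u"}
MARKS = {"ः": ".h", "ं": ".m"}  # visarga, anusvara
VIRAMA = "्"


def to_velthuis_sketch(text: str) -> str:
    """Convert the supported subset of Devanagari to Velthuis."""
    out = []
    for i, ch in enumerate(text):
        following = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # Inherent 'a' unless a vowel sign or virama follows.
            if following not in VOWEL_SIGNS and following != VIRAMA:
                out.append("a")
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in MARKS:
            out.append(MARKS[ch])
        elif ch != VIRAMA:
            out.append(ch)  # pass through anything outside the subset
    return "".join(out)


converted = to_velthuis_sketch("रामः वनम्")
```

For actual use with the platform, heritage.utils.devanagari_to_velthuis() should be preferred, since its output follows the platform's own scheme.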
Module contents
Heritage.py – Python Interface to The Sanskrit Heritage Platform