Products & Services

Via our GitHub, you can experiment with IAHLT's open-source annotated content and decide whether you would like to become an IAHLT member to access our large Hebrew & Arabic datasets and models.

Our products are available on the IAHLT GitHub.

Services and Tools for Hebrew & Arabic

Universal Dependencies (UD) is a framework for consistent annotation of grammar (parts of speech, morphological features, and syntactic dependencies) across different human languages. UD is an open community effort with over 300 contributors producing nearly 200 treebanks in over 100 languages.

IAHLT public contribution

The UD Hebrew-IAHLTwiki treebank consists of 5,000 contemporary Hebrew sentences drawn from a variety of Wikipedia entries:

https://github.com/UniversalDependencies/UD_Hebrew-IAHLTwiki
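UD treebanks are distributed in the CoNLL-U format: one token per line with ten tab-separated columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), and a blank line between sentences. The following is a minimal sketch of reading that format in Python, not an official IAHLT or UD tool. The sample sentence is an invented English example rather than a line from the treebank, and the names `Token`, `parse_feats`, and `parse_conllu` are hypothetical.

```python
from dataclasses import dataclass

# The ten CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


@dataclass
class Token:
    id: str
    form: str
    lemma: str
    upos: str      # universal part-of-speech tag
    feats: dict    # morphological features
    head: int      # ID of the syntactic head (0 = root)
    deprel: str    # dependency relation to the head


def parse_feats(raw):
    """Turn 'Gender=Masc|Number=Sing' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_conllu(text):
    """Yield one list of Token objects per sentence."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():          # blank line closes a sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):      # sentence-level comment
            continue
        row = dict(zip(FIELDS, line.split("\t")))
        # Skip multiword-token ranges ("2-3") and empty nodes ("1.1");
        # only the syntactic words carry POS, features, and a head.
        if "-" in row["id"] or "." in row["id"]:
            continue
        sentence.append(Token(row["id"], row["form"], row["lemma"],
                              row["upos"], parse_feats(row["feats"]),
                              int(row["head"]), row["deprel"]))
    if sentence:
        yield sentence


# Invented illustrative sentence (assumption, not treebank data).
ROWS = [
    ["1", "Dogs", "dog", "NOUN", "_", "Number=Plur", "4", "nsubj", "_", "_"],
    ["2-3", "cannot", "_", "_", "_", "_", "_", "_", "_", "_"],
    ["2", "can", "can", "AUX", "_", "VerbForm=Fin", "4", "aux", "_", "_"],
    ["3", "not", "not", "PART", "_", "Polarity=Neg", "4", "advmod", "_", "_"],
    ["4", "fly", "fly", "VERB", "_", "VerbForm=Inf", "0", "root", "_", "_"],
]
SAMPLE = "# text = Dogs cannot fly\n" + "\n".join("\t".join(r) for r in ROWS) + "\n"

sentences = list(parse_conllu(SAMPLE))
for tok in sentences[0]:
    print(tok.id, tok.form, tok.upos, tok.feats, tok.head, tok.deprel)
```

Skipping the range lines matters for Hebrew and Arabic in particular, where a single written word is often split into several syntactic words.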

Named-entity recognition (NER), also known as (named) entity identification, entity chunking, or entity extraction, is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc.
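To make "locate and classify" concrete, the sketch below collapses per-token tags into labeled entity spans. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity), which this page does not specify. The function name `bio_to_spans`, the labels PER/ORG/LOC, and the example sentence are illustrative assumptions, not the output of an IAHLT model.

```python
def bio_to_spans(tokens, tags):
    """Collapse per-token BIO tags into (label, start, end, text) spans.

    `start` and `end` are token indices; `end` is exclusive.
    """
    spans = []
    start, label = None, None
    # A trailing "O" sentinel flushes an entity that ends the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues and label is not None:
            spans.append((label, start, i, " ".join(tokens[start:i])))
            start, label = None, None
        # Open a new span on B-, or leniently on a stray I- with no open span.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    return spans


# Invented example (assumption): tags as a tagger might emit them.
tokens = ["Ada", "Lovelace", "joined", "Acme", "in", "London"]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]

for label, start, end, text in bio_to_spans(tokens, tags):
    print(f"{label}: {text} (tokens {start}..{end - 1})")
```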

IAHLT Automatic Annotation Demos

Automatic Hebrew NER demo
https://huggingface.co/spaces/iahlt/iahlt-span-marker-alephbert-small-nemo-mt-he

Automatic Arabic NER demo
https://huggingface.co/spaces/iahlt/iahlt-span-marker-xlm-roberta-base-ar

IAHLT Open Source Use

Name	Description	Use	URL
Sourcehut	Forge with hosting for git/mailing list/CI and more	Code hosting/continuous integration/mailing list/issue tracking	https://sourcehut.org
NECKar	Wikidata entity extractor	Entity linking and NE preannotation	https://event.ifi.uni-heidelberg.de/?page_id=532
sacr	Coreference annotation tool	Coreference annotation	http://boberle.com/projects/sacr
trankit	UD parser and NE recognizer	UD parsing and NE recognition	https://github.com/nlp-uoregon/trankit
udpipe	Classical UD parser	Sentence segmentation (HE + AR) and lemmatization	https://github.com/ufal/udpipe
arborator	Universal Dependencies annotation tool	Annotation for UD	https://github.com/Arborator
Doccano	Named entity annotation tool	Named entity annotation	https://github.com/doccano
Grew	Graph-based corpus search tool	Corpus search and validation for lemmatization and UD	https://grew.fr

Hebrew LLM Project

A joint project for a large, open, and powerful generative language model in Hebrew

The Israeli Association of Human Language Technologies

Intel / Dicta Center, in collaboration with MAFAT / the National Program

https://dicta.org.il/dicta-lm

NeoDictaBERT-bilingual: Pushing the Frontier of BERT models in Hebrew

https://huggingface.co/dicta-il/neodictabert-bilingual-embed

Please contact us for any content creation or annotation needs (text and audio).
